January 17, 2011

Risks of automated stock prediction engines

A celebrity tweeted a stock tip to his 3 million followers. By the end of the day the stock price had jumped over 290%.

"You can double your money right now. Just get what you can afford," Jackson tweeted about H&H Imports, a money-losing venture out of Clearwater, Fla., that owns TV Goods, a marketing firm recently founded by Kevin Harrington.
While all the online reports I read assume that his followers fell for it, I am not sold on that. I would like to know whether anyone has ruled out, as the root cause, automated systems that use Twitter data to predict stock price changes. Such systems often use follower count to estimate the probability that a piece of information is true. One of my news crawlers (scalebig.com) actually uses the same Twitter feed to rate technical posts on scalability.
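
To make the concern concrete, here is a minimal sketch of how a naive automated system might weight a tweet by follower count. Everything in it is an assumption for illustration: the function names, the log scaling, the saturation point and the threshold are hypothetical and do not describe any real trading system or my own crawler.

```python
import math


def credibility(followers: int, saturation: int = 1_000_000) -> float:
    """Map a follower count to a 0..1 score (hypothetical heuristic).

    Log scaling is assumed so that each tenfold increase in followers
    adds the same amount of credibility, capped at `saturation`.
    """
    if followers <= 0:
        return 0.0
    score = math.log10(followers + 1) / math.log10(saturation + 1)
    return min(1.0, score)


def should_act(followers: int, sentiment: float, threshold: float = 0.8) -> bool:
    """Naive signal: act when credibility-weighted sentiment clears the threshold.

    `sentiment` is assumed to be a 0..1 score produced elsewhere.
    """
    return credibility(followers) * sentiment >= threshold


# An account with 3 million followers is treated as fully credible,
# so a strongly positive tweet from it triggers the signal.
celebrity_signal = should_act(followers=3_000_000, sentiment=0.9)

# The same tweet from a small account does not.
small_account_signal = should_act(followers=500, sentiment=0.9)
```

The weakness is visible in the sketch: follower count measures reach, not accuracy, so a system built this way would treat a celebrity's stock tip as highly reliable information.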


Knowing how widely Twitter data is used in different kinds of automation, and how unpredictably that data can behave, I would recommend examining events like these under a microscope so that they do not happen again.

How Facebook ships code

Stumbled on a fascinating post about how Facebook ships code. This level of detail is rare from an organization this big. It is a very long piece, but here are a few lines from it to entice you to click through.

From framethink


  • as of June 2010, the company has nearly 2000 employees, up from roughly 1100 employees 10 months ago.  Nearly doubling staff in under a year!

  • the two largest teams are Engineering and Ops, with roughly 400-500 team members each.  Between the two they make up about 50% of the company.

  • product manager to engineer ratio is roughly 1-to-7 or 1-to-10

  • all engineers go through 4 to 6 week “Boot Camp” training where they learn the Facebook system by fixing bugs and listening to lectures given by more senior/tenured engineers.  estimate 10% of each boot camp’s trainee class don’t make it and are counseled out of the organization.

  • after boot camp, all engineers get access to live DB (comes with standard lecture about “with great power comes great responsibility” and a clear list of “fire-able offenses”, e.g., sharing private user data)

  • any engineer can modify any part of FB’s code base and check-in at-will

  • very engineering driven culture.  “product managers are essentially useless here.” is a quote from an engineer.  engineers can modify specs mid-process, re-order work projects, and inject new feature ideas anytime.

  • during monthly cross-team meetings, the engineers are the ones who present progress reports.  product marketing and product management attend these meetings, but if they are particularly outspoken, there is actually feedback to the leadership that “product spoke too much at the last meeting.”  they really want engineers to publicly own products and be the main point of contact for the things they built.

  • resourcing for projects is purely voluntary.

    • a PM lobbies group of engineers, tries to get them excited about their ideas.

    • Engineers decide which ones sound interesting to work on.

    • Engineer talks to their manager, says “I’d like to work on these 5 things this week.”

    • Engineering Manager mostly leaves engineers’ preferences alone, may sometimes ask that certain tasks get done first.



  • Engineers handle entire feature themselves — front end javascript, backend database code, and everything in between.  If they want help from a Designer (there are a limited staff of dedicated designers available), they need to get a Designer interested enough in their project to take it on.  Same for Architect help.  But in general, expectation is that engineers will handle everything they need themselves.


It's Logical - IAAS users will move to PAAS


Sysadmins love infrastructure control, and I have to say that there was a time when root access gave me a high. It wasn’t until I moved to the web operations team (and gave up my root access) that I realized I was more productive when I wasn’t dealing with day-to-day hardware and OS issues. After managing my own EC2/Rackspace instance for my blog for a few years, I came to another realization today: IAAS (infrastructure as a service) might be one of those fads that will give way to PAAS (platform as a service).
WordPress is an excellent blogging platform, and I manage multiple instances of it for my blogs (and one for my wife’s blog). I chose to run my own WordPress instance because I loved the same control I used to have when I was a sysadmin. I not only wanted to run my own plugins, configure my own features and play with different kinds of caching, I also wanted to choose my own Linux distribution (Ubuntu, of course) and make it work the way I always wanted my servers to work. But when it came to patching the OS, taking backups, and updating WordPress and its zillion plugins, I found it a little distracting, slightly frustrating and extremely time consuming.
Last week I moved one of my personal blogs to blogger.com, and it’s possible that it will not be the last one. What’s important here is not that I picked blogger.com over wordpress.com, but that I’m ready to give up control to be more productive. Amazon’s AWS started off as the first IAAS service provider, but today it provides a whole lot of other managed services, like Elastic MapReduce, Amazon Route 53, Amazon CloudFront and Amazon Relational Database Service, which are closer to PAAS than IAAS.
IAAS is a very powerful tool in the hands of a professional systems administrator. But I’m willing to bet that over the next few years fewer organizations will worry about kernel versions and Linux distributions, and will instead be happy with a simple API to upload “.war” files (if they are running Tomcat, for example) into some kind of cloud-managed Tomcat instance (like how Hadoop runs in Elastic MapReduce). Google App Engine (Java and Python) and Heroku (Ruby based, bought by Salesforce) are two examples of such services today, and I’ll be surprised if AWS doesn’t launch something (or buy someone out) within the next year to do the same.