Lots of interesting updates today.
But I would first like to mention the fantastic work the cloud computing group at UCSB is doing to make the App Engine framework more open. They have done significant work to make AppScale "work" with different kinds of data stores, including HBase, Cassandra, Voldemort, MongoDB, Hypertable, MySQL and MemcacheDB. AppScale is actively looking for folks interested in working with them to make this stable and production ready.
- GAE 1.3.1 released: I think the biggest news about this release is that the 1000-row limit has now been removed. You still have to deal with the 30-second processing limit per HTTP request, but at least the row limit is gone. They have also introduced automatic, transparent datastore API retries for most operations. This should dramatically increase the reliability of datastore queries and reduce the amount of work developers have to do to build auto-retry logic themselves.
- Elastic Search is a Lucene-based indexing product which seems to do what Solr used to do, with the exception that it can now scale across multiple servers. Very interesting product. I'm going to try this out soon.
- MemcacheDB: A distributed key-value store which is designed to be persistent. It uses the memcached protocol, but it's actually a datastore (using Berkeley DB) rather than a cache.
- Nasuni seems to have come up with NAS software which uses cloud storage as the persistent datastore. It can cache data locally for faster access to frequently used data.
- The guys at Flickr have two interesting posts you should glance over. "Using, Abusing and Scaling MySQL at Flickr" seems to be the first in a series of posts about how Flickr scales using MySQL. The next one in the series is "Ticket Servers: Distributed Unique Primary Keys on the Cheap".
- Finally a fireside chat by Mike Schroepfer, VP of Engineering, about Scaling Facebook.
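For anyone curious what the auto-retry logic mentioned in the GAE item looks like when you have to write it yourself, here is a minimal sketch. It is not App Engine's implementation: `TransientError`, `with_retries` and the backoff parameters are hypothetical names and values chosen for illustration.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for a temporary datastore failure (e.g. a timeout)."""


def with_retries(operation, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation(), retrying on TransientError with exponential backoff.

    The delay doubles after every failed attempt. Once max_attempts is
    reached, the last error is re-raised to the caller.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

You would wrap a datastore call as `with_retries(lambda: fetch_rows(query))`, where `fetch_rows` is whatever function talks to your datastore. Only retry operations that are safe to repeat.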
There have been a lot of interesting stories from last week for me to share. If you have interesting links you want to add to this post please forward them to me or post a comment to this post.
- Sun is planning to acquire a majority stake in “Cluster File Systems, Inc.” [ Talk on Lustre File System ]
- Sun intends to add support for Solaris Operating System (Solaris OS) on Lustre and plans to continue enhancing Lustre on Linux and Solaris OS across multi vendor hardware platforms. As previously announced in July 2007, Sun also plans to deliver Lustre servers on top of Sun’s industry-leading open source Solaris ZFS solution, which is one of the fastest growing storage virtualization technology in the marketplace.
- Making Facebook Apps scale on cheap: An interesting writeup by Surj Patel about the scalability issues that Facebook itself and the third-party apps on it have. It also discusses EC2 and S3 as a cost-effective way to scale.
- Welcome to Amazon and S3 and EC2: processing power (EC2) and storage (S3) on demand. These services let you access computational power and storage only when you need it and, better yet, pay only for what you use. The last time I checked, it was 10 cents an hour for the server, 10 cents for every gigabyte of data written and 18 cents per gigabyte read out, all for a virtual box with 1.7GHz x86 processor/1.75Gbytes of RAM/250Mbs of bandwidth. Nor are you limited to one usage; use as many as you need or want and can afford.
- Interesting blog post on Google Reader Numbers. They have made significant progress lately and thanks to the scalable architecture they now store 10 terabytes of raw feed data from 8 million feeds in their index.
- Todd Hoff has an interesting writeup on Scaling Twitter: Making Twitter 10000 Percent Faster. And an interview with Biz Stone (Co-Founder of Twitter) here.
- If you use MySQL and your app is not yet designed to handle a federated database architecture, you should take a look at a new product in development called “MySQL Proxy”.
The most powerful feature is Read/Write Splitting which allows you to scale an application which is unaware of replication automatically across several slaves without changes to your application. Instance Scale Out we say. The Proxy also became a 1st class citizen in the MySQL world with full docs, win32 support and easy to install.
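To make the read/write splitting idea concrete, here is a minimal sketch of the routing decision. It is not how MySQL Proxy implements the feature: the `ReadWriteSplitter` class and the server names are hypothetical, and the rule (plain SELECTs go to replicas, everything else to the master) deliberately ignores transactions and replication lag.

```python
import itertools


class ReadWriteSplitter:
    """Toy router: SELECT statements rotate across replicas, the rest hit the master."""

    def __init__(self, master, replicas):
        self.master = master
        # Round-robin over the replicas; None means there are no replicas.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql):
        """Return the name of the server that should run this statement."""
        words = sql.lstrip().split(None, 1)
        verb = words[0].upper() if words else ""
        if verb == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        # Writes, and anything we do not recognise, go to the master.
        return self.master


router = ReadWriteSplitter("master", ["replica1", "replica2"])
print(router.route("SELECT * FROM photos"))         # a replica
print(router.route("INSERT INTO photos VALUES (1)"))  # the master
```

The appeal of putting this in a proxy is that the application keeps a single connection string and never needs to know that replicas exist.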
For the latest set of links, go here.
This is a collection of various slides, PDFs and videos about designing scalable websites that I have collected over time. If you have something interesting which might go in here, please let me know.