Scaling updates for Feb 10, 2010

Lots of interesting updates today.

But I would first like to mention the fantastic work the cloud computing group at UCSB is doing to make the App Engine framework more open. They have done significant work to make AppScale “work” with different kinds of data stores, including HBase, Cassandra, Voldemort, MongoDB, Hypertable, MySQL and MemcacheDB. AppScale is actively looking for people interested in working with them to make it stable and production ready.

  • GAE 1.3.1 released: I think the biggest news about this release is that the 1000-row limit has been removed. You still have to deal with the 30-second processing limit per HTTP request, but at least the row limit is gone. They have also introduced automatic, transparent datastore API retries for most operations. This should dramatically increase the reliability of datastore queries and reduce the amount of work developers have to do to build auto-retry logic themselves.
  • ElasticSearch is a Lucene-based indexing product which seems to do what Solr does, except that it can scale across multiple servers. Very interesting product. I’m going to try this out soon.
  • MemcacheDB: A distributed key-value store which is designed to be persistent. It uses the memcached protocol, but it’s actually a datastore (using Berkeley DB) rather than a cache.
  • Nasuni seems to have come up with NAS software which uses cloud storage as the persistent datastore. It can cache data locally for faster access to frequently used data.
  • The guys at Flickr have two interesting posts you should glance over. “Using, Abusing and Scaling MySQL at Flickr” seems to be the first in a series of posts about how Flickr scales using MySQL. The next one in the series is “Ticket Servers: Distributed Unique Primary Keys on the Cheap”.
  • Finally, a fireside chat by Mike Schroepfer, VP of Engineering, about Scaling Facebook.
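
The auto-retry logic mentioned in the GAE 1.3.1 item is the kind of wrapper developers previously had to write by hand around datastore calls. Here is a minimal sketch of that idea in plain Python. It is not the App Engine API: `TransientError`, `with_retries` and the parameters are hypothetical names made up for illustration, and the backoff values are arbitrary.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a timeout-style datastore error."""


def with_retries(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation(); on TransientError wait and retry, up to `attempts` calls."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the error
            # Exponential backoff plus a little jitter so clients don't retry in lockstep.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
```

A caller would wrap a single read or write, for example `with_retries(lambda: fetch_user(42))`, where `fetch_user` is whatever function talks to the datastore. Retrying like this is only safe for operations that can be repeated without side effects.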

Scalability Stories (15th Sept): MySQL Proxy, Cluster File Systems, Facebook apps and Twitter

There have been a lot of interesting stories from last week for me to share. If you have interesting links you want to add to this post, please forward them to me or leave a comment.

  • Sun is planning to acquire a majority stake in “Cluster File Systems, Inc.” [ Talk on Lustre File System ]
    • Sun intends to add support for Solaris Operating System (Solaris OS) on Lustre and plans to continue enhancing Lustre on Linux and Solaris OS across multi vendor hardware platforms. As previously announced in July 2007, Sun also plans to deliver Lustre servers on top of Sun’s industry-leading open source Solaris ZFS solution, which is one of the fastest growing storage virtualization technology in the marketplace.
  • Making Facebook Apps scale on the cheap: An interesting writeup by Surj Patel about the scalability issues that Facebook itself and the third-party apps on it have. It also discusses EC2 and S3 as a cost-effective way to scale.
    • Welcome to Amazon and S3 and EC2 — processing power (EC2) and storage (S3) on demand. These services let you access computational power and storage only when you need it and, better yet, pay only for what you use. The last time I checked, it was 10 cents an hour for the server, 10 cents for every gigabyte of data written and 18 cents per gigabyte read out – all for a virtual box with 1.7Ghz x86 processor/1.75Gbytes of RAM/250Mbs of bandwidth. Nor are you limited to one usage; use as many as you need or want and can afford.
  • Interesting blog post on Google Reader Numbers. They have made significant progress lately and thanks to the scalable architecture they now store 10 terabytes of raw feed data from 8 million feeds in their index.
  • Todd Hoff has an interesting writeup on Scaling Twitter: Making Twitter 10000 Percent Faster. And an interview with Biz Stone (Co-Founder of Twitter) here.
  • If you use MySQL and your app is not yet designed to handle a federated database architecture, you should take a look at a new product in development called “MySQL Proxy”.
      The most powerful feature is Read/Write Splitting which allows you to scale a application which is unaware of replication automatically cross several slaves without changes to your application. Instance Scale Out we say. The Proxy also became a 1st class citizen in the MySQL world with full docs, win32 support and easy to install.
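
The read/write splitting idea quoted above can be sketched in a few lines. This is a toy illustration in Python, not how MySQL Proxy itself is implemented or configured: the class `ReadWriteSplitter`, its method names and the server labels are all hypothetical, and the statement detection is deliberately naive.

```python
import itertools


class ReadWriteSplitter:
    """Toy router: writes go to the master, reads rotate across the slaves."""

    READ_PREFIXES = ("SELECT", "SHOW")

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master when no slaves are configured.
        self._slaves = itertools.cycle(slaves or [master])
        self._in_transaction = False

    def route(self, sql):
        """Return the server that should receive this statement."""
        statement = sql.lstrip().upper()
        if statement.startswith(("BEGIN", "START TRANSACTION")):
            self._in_transaction = True
        elif statement.startswith(("COMMIT", "ROLLBACK")):
            self._in_transaction = False
            return self.master
        # Reads inside a transaction stay on the master so they see its own writes.
        if not self._in_transaction and statement.startswith(self.READ_PREFIXES):
            return next(self._slaves)
        return self.master
```

The application keeps sending every statement to one place and the router picks the backend, which is why the application does not need to know about replication. A real splitter also has to cope with replication lag and with statements such as `SELECT ... FOR UPDATE`, which this sketch would wrongly send to a slave.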

Scalability stories from Sept 6th 2007

  1. A 55-minute talk by Stewart Smith from MySQL AB about MySQL Cluster. He talks about the NDB storage engine and synchronous replication between storage nodes. He also covers new features in 5.1, including cluster-to-cluster replication, disk-based data and a bunch of other things. And another MySQL talk on Google about Performance Tuning Best Practices for MySQL.
  2. An interesting talk with Leah Culver about how Pownce was created. They use a LAMP (Python) stack with Perlbal, Memcached, Django and AIR, with Amazon S3 as the backend storage.
  3. Discussion about High-Availability Mongrel Packs using Seesaw.
  4. A blog about “Future of Data Center Computing” mentions Terracotta Sessions. If you read my previous post about “Session, state and scalability” and understood the problem, do look at this as a solution as well.
  5. EC2 and S3 are being used more than before. Unfortunately, because storage on EC2 doesn’t persist across reboots, creative ways of keeping the data alive have to be found. Here is a talk about Redundant MySQL Replication using EC2 and S3.
  6. Found an interesting post by IBM engineers on how to set up a web cluster in 5 easy steps.

Scalability Stories for Aug 30

  1. I found a very interesting story on how memcached was created. It’s an old story titled “Distributed caching with memcached”. I also found an interesting FAQ on memcached which some of you might like.
  2. Inside Myspace is another old story which follows Myspace’s growth over time. It’s a very long and interesting read which shouldn’t be ignored.
  3. Measuring Scalability tries to put numbers to the problem of scalability. If you have to justify the cost of scalability to anyone in your organization, you should at least skim through this page.
  4. I found a wonderful story on the humble architecture of Mailinator and how it grew over time on just one webserver. It receives approx 5 million emails a day and runs the whole operation pretty much in memory with no logs or database to leave traces behind. And here is another page from the creator of Mailinator about its stats from Feb.
  5. Finally, another very interesting presentation/slide deck on the topic of “Scalable web architecture” which focuses primarily on the LAMP architecture.

Links on scalability, performance and problems

8/19/2007 Big Bad PostgreSQL
8/19/2007 Scalable internet architectures
8/19/2007 Production troubleshooting (not related to scalability)
8/19/2007 Clustered Logging with mod_log_spread
8/19/2007 Understanding and Building HA/LB clusters
8/12/2007 Multi-Master Mysql Replication
8/12/2007 Large-Scale Methodologies for the World Wide Web
8/12/2007 Scaling gracefully
8/12/2007 Implementing Tag cloud – The nasty way
8/12/2007 Normalized Data is for sissies
8/12/2007 APC at facebook
8/6/2007 Plenty Of fish interview with its CEO
8/6/2007 PHP scalability myth
8/6/2007 High performance PHP
8/6/2007 Digg: PHP’s scalability and Performance

Talks and slides from various web architects

For the latest set of links go here.

This is a collection of various slides, PDFs and videos about designing scalable websites that I collected over time. If you have something interesting which might go in here, please let me know.

Date Type Title
6/23/2007 Blog Getting Started with Drupal
6/23/2007 Blog 4 Problems with Drupal
6/23/2007 Video Seattle Conference on Scalability: MapReduce Used on Large Data Sets
6/23/2007 Video Seattle Conference on Scalability: Scaling Google for Every User
6/23/2007 Video Seattle Conference on Scalability: VeriSign’s Global DNS Infrastructure
6/23/2007 Video Seattle Conference on Scalability: YouTube Scalability
6/23/2007 Video Seattle Conference on Scalability: Abstractions for Handling Large Datasets
6/23/2007 Video Seattle Conference on Scalability: Building a Scalable Resource Management
6/23/2007 Video Seattle Conference on Scalability: SCTP’s Reliability and Fault Tolerance
6/23/2007 Video Seattle Conference on Scalability: Lessons In Building Scalable Systems
6/23/2007 Video Seattle Conference on Scalability: Scalable Test Selection Using Source Code
6/23/2007 Video Seattle Conference on Scalability: Lustre File System
6/9/2007 Slides Technology at
6/9/2007 Blog Extreme Makeover: Database or MySQL@YouTube
4/26/2007 Blog Mysql at Google
4/1/2007 Slides Scaling Twitter
4/1/2007 Slides How we build Vox
4/1/2007 Slides High Performance websites
4/1/2007 Slides Beyond the file system design
4/1/2007 Slides Scalable web architectures
3/1/2007 Slides Scalability set Amazon’s servers on fire not yours
3/1/2007 Slides Hardware layouts for LAMP installations
3/1/2007 Video Mysql scaling and high availability architectures
3/1/2007 Audio Lessons from Building world’s largest social music platform
3/1/2007 PDF Lessons from Building world’s largest social music platform
3/1/2007 Slides Lessons from Building world’s largest social music platform
11/1/2006 PDF Livejournal’s backend: history of scaling
11/1/2006 Slides Livejournal’s backend: history of scaling
11/1/2006 Slides Scalable Web Architectures (w/ Ruby and Amazon S3)
10/26/2006 Blog Yahoo! bookmarks uses symfony
7/26/2006 Slides Getting Rich with PHP 5
7/26/2006 Audio Getting Rich with PHP 5
3/7/2006 Blog Scaling Fast and Cheap – How We Built Flickr
3/1/2005 News Open source helps Flickr share photos
  Slides Flickr and PHP
  Slides Wikipedia: Cheap and explosive scaling with LAMP
  Blog YouTube Scalability Talk
    High Order Bit: Architecture for Humanity
  PDF Mysql and Web2.0 companies
8/3/2007   Building Highly Scalable Web Applications
8/3/2007   Introduction to hadoop
8/3/2007 webpage The Hadoop Distributed File System: Architecture and Design
8/3/2007   Interpreting the Data: Parallel Analysis with Sawzall
8/3/2007 PDF ODISSEA: A Peer-to-Peer Architecture for Scalable Web Search and Information Retrieval
8/3/2007 PDF SEDA: An Architecture for well conditioned scalable internet services
8/3/2007 PDF A scalable architecture for Global web service hosting service
7/25/2007   Meet Hadoop
7/25/2007 Blog Yahoo’s Hadoop Support
7/18/2007 Blog Running Hadoop MapReduce on Amazon EC2 and Amazon S3
6/22/2007   LH*RSP2P : A Scalable Distributed Data Structure for P2P Environment
6/12/2007   Scaling the Internet routing table with Locator/ID Separation Protocol (LISP)
6/3/2007   Hadoop Map/Reduce
6/1/2007 Slides Hadoop distributed file system
4/20/2007 Video Brad Fitzpatrick – Behind the Scenes at LiveJournal: Scaling Storytime
2/1/2007 Slides Inside LiveJournal’s Backend (April 2004)
2/1/2007 Slides How to scale
1/23/2007   Testing Oracle 10g RAC Scalability
1/1/2007 Slides PHP & Performance
12/22/2006   SQL Performance Optimization
10/13/2006   Building a Scalable Software Security Practice
5/31/2006   Building Large Systems at Google
5/4/2006   Scalable computing with Hadoop
1/1/2006 Slides The Ebay architecture
1/1/2006 PDF Bigtable: A Distributed Storage System for Structured Data
1/1/2006 PDF Fault-Tolerant and scalable TCP splice and web server architecture
10/18/2005 Video BigTable: A Distributed Structured Storage System
1/1/2004 PDF MapReduce: Simplified Data Processing on Large Clusters
8/3/2003 PDF Google Cluster architecture
1/1/2003 PDF Google File System
11/1/2002 Doc Implementing a Scalable Architecture
10/30/2001 News How linux saved Millions for Amazon
    Yahoo experience with hadoop
  Slides Scalable web application using Mysql and Java
  Slides Friendster: scaling for 1 Billion Queries per day
  Blog Lightweight web servers
  PDF Mysql Scale out by application partitioning
  PDF Replication under scalable hashing: A family of algorithms for Scalable decentralized data distribution
  Product Clustered storage revolution
  Blog Early Amazon Series
  Web Wikimedia Server info
  Slides Wikimedia Architecture
  Slides MySpace presentation
  PDF A scalable and fault-tolerant architecture for distributed web resource discovery
8/4/2007 PDF The Chubby Lock Service for Loosely-Coupled Distributed Systems
8/5/2007 Slides Real world Mysql tuning
8/5/2007 Slides Real world Mysql performance tuning
8/5/2007 Slides Learning MogileFS: Building scalable storage system
8/5/2007 Slides Reverse Proxy and Webserver
8/5/2007 PDF Case for Shared Nothing
7/1/2007 Slides A scalable stateless proxy for DBI
1/1/2006 Slides Real world scalability web builder 2006
8/5/2005 Slides Real world web scalability