Archive for September, 2007

Scalable products: KFS released

Kosmix, a search startup, has released the source to a C++ implementation of something that looks like a clustered file system. It looks very similar to Hadoop/HDFS, but the C++ factor will be a big performance boost.
From Skrenta's blog:

Incremental scalability – New chunkserver nodes can be added as storage needs increase; the system automatically adapts to the [...]

What is scalability ?

When asked what they mean by scalability, a lot of people talk about improving performance, about implementing HA, or even talk about a particular technology or protocol. Unfortunately, scalability is none of that. Don’t get me wrong. You still need to know all about speed, performance, HA technology, application platform, network, etc. But that is [...]

Scaling Smugmug from startup to profitability

Smugmug.com, a 5-year-old company with just 23 employees, has 315,000 paying customers and 195 million photographs. CEO & “Chief Geek” Don MacAskill has a nice set of slides where he talks about its 5-year journey, during which it went from a small startup to a profitable business. The talk was given during Amazon’s [...]

Scalability Stories (15th Sept): MySQL Proxy, Cluster File System, Facebook apps and Twitter

There have been a lot of interesting stories from last week for me to share. If you have interesting links you want to add to this post, please forward them to me or leave a comment.

Sun is planning to acquire a majority stake in “Cluster File Systems, Inc”. [ Talk on Lustre [...]

Scaling Powerset using Amazon’s EC2 and S3

The first thing most dot-com companies do before going public is set up an infrastructure to provide the service. And though it might sound straightforward to most of you, it can be a very expensive affair. To come up with the right kind of infrastructure for any new service, a few key architectural decisions have [...]

P2P network scalability

YouTube is said to be pushing about 25 petabytes per month, which is a sustained data rate of about 77 Gbps on average. The bandwidth usage at the peaks would be even higher. Thanks to Limelight Networks, YouTube doesn’t really need to scale or provision for that kind of bandwidth and based on some reports [...]
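As a rough check on the 77 Gbps figure, here is a minimal sketch of the conversion. It assumes decimal petabytes (10^15 bytes) and a 30-day month, neither of which is stated in the post, and the function name is purely illustrative.

```python
# Assumptions (not from the post): 1 PB = 10**15 bytes, 1 month = 30 days.
PETABYTE_BYTES = 10**15
SECONDS_PER_MONTH = 30 * 24 * 60 * 60


def sustained_gbps(petabytes_per_month: float) -> float:
    """Average data rate in gigabits per second for a monthly transfer volume."""
    bits_per_month = petabytes_per_month * PETABYTE_BYTES * 8
    return bits_per_month / SECONDS_PER_MONTH / 10**9


print(round(sustained_gbps(25), 1))  # 77.2
```

Under those assumptions the average works out to a little over 77 Gbps, which matches the number quoted; a different month length or binary petabytes would shift it slightly.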

Sharding: Different from Partitioning and Federation ?

I’ve been hearing this word “sharding” more and more often, and it’s spreading like fire. Theo Schlossnagle, the author of “Scalable Internet Architectures”, argues that federation is a form of partitioning, and that sharding is nothing but a form of partitioning and federation. In fact, according to him, sharding has already been in use for [...]

Adventures of scaling eins.de

Patrick Lenz, founder and lead developer of freshmeat.net, was also responsible for the relaunch of another website, eins.de, which recently moved from PHP to Ruby. The eins.de site serves about 1.2 million dynamic pages a day. He wrote a series of articles describing how they redesigned the site to scale for growth. I found [...]

Scalability stories from Sept 6th 2007

A 55-minute talk by ‘Stewart Smith’ from MySQL AB about MySQL Cluster. He talks about the NDB storage engine and synchronous replication between storage nodes. He also talks about new features in 5.1, including cluster-to-cluster replication, disk-based data and a bunch of other things. And another MySQL talk on Google about Performance [...]

Session, state and scalability

In my other life I work with a medium-scale web application which has had many different kinds of growing pains over time. One of the most painful ones is the issue of “statelessness”. If I could give only one recommendation to anyone building a brand new web application, I’d say “go stateless”. [...]
