Archive for the ‘scalability’ Category

Netflix: Dev and Ops internals

I’ve seen a number of posts from Netflix folks talking about their architecture in the recent weeks. And part of that is due to an ongoing effort to expand their business for which they seem to be hiring like crazy. Here is the yet another interesting deck of slides which mentions stuff across both Dev [...]

Read the rest of this entry »

“Chrome instant” feature could break your webapp

The “Google instant” wasn’t a ground breaking idea by itself. We have all been using various forms of auto-completes for a while now. What makes it stand out is that unlike all the previous kinds of auto-completes, this one is able to search the entire web archive, at an amazing speed and still be able [...]

Read the rest of this entry »

Operations Dashboards – KaChing

I’ve mentioned this before, but like to do it again because I think these guys are awesome. If you have not listened to devopscafe’s podcasts, this might be the right time to take a look at it. Here is a video of one of their sessions with folks at KaChing who have been doing amazing [...]

Read the rest of this entry »

Scalability updates for Aug 27th 2010

My updates have been slow recently due to other things I’m involved in. If you need more updates around what I’m reading, please feel free to follow me on twitter or buzz. Here are some of the big ones I have mentioned on my twitter/buzz feeds. Tools: Real-time Relationship Analytics from large scale graph processing [...]

Read the rest of this entry »

Distributed systems and Unique IDs: Snowflake

Most of us who deal with traditional databases take auto-increments for granted. While auto-increments are simple on consistent clusters, it can become a challenge in a cluster of independent nodes which don’t use the same source for the unique-ids. Even bigger challenge is to do it in such a way so that they are roughly [...]

Read the rest of this entry »

@twitter annotations : What I learnt at the hackfest….

A few of us joined in at the new Twitter office in downtown SF (right next to Moscone Center) and were for the first time shown what Twitter is doing about  “Twitter Annotations”. We probably created the first set of 3rd party applications around this new API. During the Hackathon I spent some time to [...]

Read the rest of this entry »

Google Storage : What it really is…

Yesterday Google formally announced Google Storage to a few (5000?) of us at Google I/O. Here is the gist of this as I see it from the various discussions/talks I attended. To begin with, I have to point out that there is almost nothing new in what Google has proposed to provide. Amazon has been [...]

Read the rest of this entry »

Spanner: Google’s next Massive Storage and Computation infrastructure

MapReduce, Bigtable and Pregel have their origins in Google and they all deal with “large systems”. But all of them may be dwarfed in size and complexity by a new project Google is working on, which was mentioned briefly (may be un-intentionally) at an event last year. Instead of caching data closer to user, it [...]

Read the rest of this entry »

Talk on “database scalability”

This is a very interesting talk by Jonathan Ellis on database scalability. He designed and implemented multi-petabyte storage for Mozy and is currently the project chair for Apache Cassandra. What every developer should know about database scalability, PyCon 2010 View more presentations from jbellis. Scalability is not improving latency, but increasing throughput But overall performance [...]

Read the rest of this entry »

Brewers CAP Theorem on distributed systems

Large distributed systems run into a problem which smaller systems don’t usually have to worry about. “Brewers CAP Theorem” [ Ref 1] [ Ref 2] [ Ref 3]  defines this problem in a very simple way. It states, that though its desirable to have Consistency, High-Availability and Partition-tolerance in every system, unfortunately no system can [...]

Read the rest of this entry »