Archive for the ‘architecture’ Category

Continuous deployments may not be for everyone: Culture

If you have read this blog before, you know how much I admire those who use continuous deployments in production. Doing that at scale is even more impressive. But the message that sometimes gets lost is that continuous deployments may not be for everyone. Most continuous integration environments usually do all of their deployments from [...]

Thoughts on scalable web operations

Interesting observations and thoughts on web operations, collected from a few sessions at the Velocity 2010 conference [most are from a talk by Theo Schlossnagle, author of “Scalable Internet Architectures”]. Optimization: Don’t over-optimize. It could take precious resources away from critical functions. Don’t scale early. Planning for more than 10 times the load you currently [...]

@twitter annotations : What I learnt at the hackfest….

A few of us joined in at the new Twitter office in downtown SF (right next to Moscone Center) and were shown, for the first time, what Twitter is doing about “Twitter Annotations”. We probably created the first set of third-party applications around this new API. During the hackathon I spent some time to [...]

You don’t have to be Google to use NoSQL

Ted Dziuba has a post titled “I can’t wait for NoSQL to Die”. The basic argument he makes is that one has to be the size of Google to really benefit from NoSQL. I think he is missing the point. Here are my observations. This is similar to the argument the traditional DB vendors [...]

Automated, faster, repeatable, scalable deployments

While efficient automated deployment tools like Puppet and Capistrano are a big step in the right direction, they are not a complete solution for an automated deployment process. This post will explore some of the less discussed issues that are just as important for automated, fast, repeatable, scalable deployments. Rapid Build and Integration with tests: Use Source [...]

Disaster Recovery: Impressive RPO and RTO objectives set by Google Apps Operations

Unless you are running a fly-by-night shop, DR (disaster recovery) should be one of the top issues for your operations team. In a “scalable architecture” world, the complexity of DR can become a disaster in itself. Yesterday Google announced that it finally has a DR plan for Google Apps. While this is nice, [...]

The Reddit problem: Learning from mistakes

Reddit has a very interesting post about what not to do when trying to build a scalable system. While the error is tragic, I think it’s an excellent design mistake to learn from. Though the post lacked a detailed technical report, we might be able to recreate what happened. They mentioned they are using the MemcacheDB datastore, [...]

Brewer’s CAP Theorem on distributed systems

Large distributed systems run into a problem which smaller systems don’t usually have to worry about. “Brewer’s CAP Theorem” [Ref 1] [Ref 2] [Ref 3] defines this problem in a very simple way. It states that, though it’s desirable to have Consistency, High-Availability and Partition-tolerance in every system, unfortunately no system can [...]

Scaling deployments

Most of the newer, successful web startups have one thing in common: they release smaller changes more often. Being in operations, I am often surprised how these organizations manage such a feat without breaking their website. Here are some notes from someone at Flickr about how they do it. The two most important parts of [...]

Cloud : Agility vs Security

Networking devices on the edges have become smarter over time. So have the firewalls and switches used internally within the networks. Whether we like it or not, web applications have grown to depend on them over time. It’s impossible to build a flawless product, which is why it’s standard practice to disable all unused services [...]
