Posted in March 9, 2010 ¬ 12:58 amh.Royans
While efficient automated deployment tools like Puppet and Capistrano are a big step in the right direction, they are not a complete solution for an automated deployment process. This post explores some of the less-discussed issues that are just as important for automated, fast, repeatable, scalable deployments.
Rapid Build and Integration with tests
Use Source control [...]
Posted in March 5, 2010 ¬ 7:44 amh.Royans
Unless you are running a fly-by-night shop, disaster recovery (DR) should be one of the top issues for your operations team. In a “Scalable architecture” world, the complexity of DR can become a disaster in itself.
Yesterday Google announced that it finally has a DR plan for Google Apps. While this is nice, [...]
Posted in March 1, 2010 ¬ 9:30 pmh.Royans
Reddit has a very interesting post about what not to do when trying to build a scalable system. While the error is tragic, I think it's an excellent design mistake to learn from.
Though the post lacked a detailed technical report, we might be able to recreate what happened. They mentioned they are using the MemcacheDB datastore, with [...]
Posted in February 14, 2010 ¬ 3:33 pmh.Royans
Large distributed systems run into a problem that smaller systems don’t usually have to worry about. “Brewer’s CAP Theorem” [Ref 1] [Ref 2] [Ref 3] defines this problem in a very simple way.
It states that, though it's desirable to have Consistency, High Availability and Partition tolerance in every system, unfortunately no system can [...]
Posted in February 5, 2010 ¬ 1:44 amh.Royans
Most of the newer, successful web startups have one thing in common: they release smaller changes more often. Being in operations, I am often surprised at how these organizations manage such a feat without breaking their websites. Here are some notes from someone at Flickr about how they do it. The two most important parts of [...]
Posted in February 3, 2010 ¬ 1:27 amh.Royans
Networking devices on the edge have become smarter over time. So have the firewalls and switches used inside networks. Whether we like it or not, web applications have grown to depend on them.
It's impossible to build a flawless product, which is why it's standard practice to disable all unused services on [...]
Posted in February 1, 2010 ¬ 8:24 pmh.Royans
Windows Azure is an application platform provided by Microsoft that lets others run applications on Microsoft’s “cloud” infrastructure. It's finally open for business (as of Feb 1, 2010). Below are some links about Azure for those who are still catching up.
Wikipedia: Windows Azure has three core components: Compute, Storage and Fabric. As the names [...]
Posted in January 31, 2010 ¬ 5:29 pmh.Royans
While “private clouds may not be the future”, they are definitely needed today. Here are some of the top issues bothering organizations that have been thinking about moving into the cloud. Some of the issues are based on Craig Bolding’s talk, “Guide to cloud security”.
Unlike your own data center, you will never know [...]
Posted in January 25, 2010 ¬ 11:45 pmh.Royans
Hive is a data warehouse infrastructure built on top of Hadoop. It provides tools for easy data ETL, a mechanism to put structure on the data, and the capability to query and analyze large data sets stored in Hadoop files. Hive defines a simple SQL-like query language, called QL, that enables users familiar with SQL [...]
Posted in January 25, 2010 ¬ 12:23 amh.Royans
James Hamilton is one of the leaders in this industry and has written a very thought-provoking post about private clouds not being the future. This is what he said about private clouds compared with existing non-cloud solutions.
A fix, Not the future (reference to an InformationWeek post)
Runs at lower utilization levels
Consumes more [...]