Brewers CAP Theorem on distributed systems

Large distributed systems run into a problem which smaller systems don’t usually have to worry about. image“Brewers CAP Theorem” [ Ref 1] [ Ref 2] [ Ref 3]  defines this problem in a very simple way.

It states, that though its desirable to have Consistency, High-Availability and Partition-tolerance in every system, unfortunately no system can achieve all three at the same time.

Consistent: A fully Consistent system is one where the system can guarantee that once you store a state (lets say “x=y”) in the system, it will report the same state in every subsequent operation until the imagestate is explicitly changed by something outside the system. [Example 1] A single MySQL database instance is automatically fully consistent since there is only one node keeping the state. [Example 2] If two MySQL servers are involved, and if the system is designed in such a way that all keys starting “a” to “m” is kept on server 1, and keys “n” to “z” are kept on server 2”,  then system can still easily guarantee consistency.image Lets now setup the DBs as master-master replicas[Example 3] . If one of the database accepts get a “row insert” request, that information has to be committed to the second system before the operation is considered complete. To require 100% consistency in such a replicated environment, communication between nodes is paramount. The over all performance of such a system could drop as number of replica’s required goes up.

Available:  The database in [Example 1] or [Example 2] are not highly Available. In [Example 1] if the node goes down there would 100% data loss. In [Example 2] if one node goes down, you will have 50% data loss. [Example 3] is the only solution which solves that problem. A simple MySQL server replication [ Multi-master mode]  setup could provide 100% availability. Increasing the number of nodes with copies of the data directly increases the availability of the system. Availability is not just a protection from hardware failure. Replicas also help in loadbalancing concurrent operations, especially the read operations. “Slave” MySQL instances are a perfect example of such “replicas”.

Partition-tolerance: So you got Consistency and Availability by replicating data. Lets say you had these two MySQL servers in Example 3, in two different datacenters, and you loose the network connectivity between the two datacenters making both databases incapable of synchronizing state between the two. Would the two DBs be fully functional in such a scenario ? If you somehow do manage allow read/write operations on these two databases, it can be proved that the two servers won’t be consistent anymore. A banking application which keeps “state of your account” at all times is a perfect example where its bad to have inconsistent bank records. If I withdraw 1000 bucks from California, it should be instantly reflected in the NY branch so that the system accurately knows how much more I can withdraw at any given time. If the system fails to do this, it could potentially cause problems which could make some customers very unhappy. If the banks decide consistency is very important, and disable write operations during the outage, it will will loose “availability” of the cluster since all the bank accounts at both the branches will now be frozen until network comes up again.

This gets more interesting when you realize that C.A.P. rules don’t have to be applied in an “all or nothing” fashion. Different systems can choose various levels of consistency, availability or partition tolerance to meet their business objective. Increasing the number of replicas, for example, increases high availability but it could at the same time reduce partition tolerance or consistency.

When you are discussing distributed systems ( or distributed storage ), you should try to identify which of the three rules it is trying to achieve. BigTable, used by Google App engine, and HBase, which runs over Hadoop, claim to be always consistent and highly available. Amazon’s Dynamo, which is used by S3 service and datastores like Cassandra instead sacrifice consistency in favor of availability and partition tolerance.

CAP theorem doesn’t just apply to databases. Even simple web applications which store state in session objects have this problem. There are many “solutions” out there which allow you to replicate session objects and make sessions data “highly available”, but they all suffer from the same basic problem and its important for you to understand it.

Recognizing which of the “C.A.P.” rules your business really needs should be the first step in building any successful distributed, scalable, highly-available system.


Comments are closed.