Archive for the ‘highavailability’ Category

Why Membase Uses Erlang

It’s totally worth it. Erlang (Erlang/OTP really, which is what most people mean when they say “Erlang has X”) does out of the box a lot of things we would otherwise have had to either build from scratch or piece together from existing libraries. Its dynamic type system and pattern matching (à la Haskell and [...]

You don’t have to be Google to use NoSQL

Ted Dziuba has a post titled “I can’t wait for NoSQL to Die”. His basic argument is that one has to be at Google’s scale to really benefit from NoSQL. I think he is missing the point. Here are my observations. This is similar to the argument the traditional DB vendors [...]

Brewer’s CAP Theorem on distributed systems

Large distributed systems run into a problem that smaller systems don’t usually have to worry about. “Brewer’s CAP Theorem” [Ref 1] [Ref 2] [Ref 3] defines this problem in a very simple way. It states that, though it’s desirable to have Consistency, High-Availability and Partition-tolerance in every system, unfortunately no system can [...]

AppScale, an OpenSource GAE implementation

If you don’t like EC2, you have the option to move your app to a new vendor. But if you don’t like GAE (Google App Engine), there aren’t any solutions that can easily replace it. AppScale might change that. AppScale is an open-source implementation of the Google App Engine (GAE) cloud computing interface from the RACELab [...]

Hive @Facebook

Hive is a data warehouse infrastructure built over Hadoop. It provides tools to enable easy data ETL, a mechanism to put structures on the data, and the capability to query and analyze large data sets stored in Hadoop files. Hive defines a simple SQL-like query language, called QL, that enables users familiar with SQL [...]

HAProxy: Load balancing

Designing any scalable web architecture would be incomplete without investigating “load balancers”. There used to be a time when selecting and installing load balancers was an art in itself. Not anymore. A lot of organizations today use Apache web servers as a proxy server (and also as a load balancer) for the backend application clusters. [...]

New EC2 features: Elastic Load Balancing, Auto Scaling, and Monitoring

If you have not used EC2 for some reason, chances are that the reason doesn’t exist anymore. More information is available in the following places: AWS Blog, All Things Distributed, RightScale.

Experimenting with SimpleDB (Flagthis.com)

A few years ago I wrote a simple online bookmarking tool called Flagthis. The tool allowed one to bookmark sites using a JavaScript bookmarklet from the bookmark tab. The problem it was trying to solve is that most links people bookmark are never used again if they are not checked out within the next few [...]

MySQL Cluster

Link: “Introduction to MySQL Cluster. The NDB storage engine (MySQL Cluster) is a high-availability storage engine for MySQL. It provides synchronous replication between storage nodes and gives many MySQL servers a consistent view of the database. In 4.1 and 5.0 it’s a main-memory database, but in 5.1 non-indexed attributes can be stored on disk. [...]
