Showing posts from October, 2010

Why Membase Uses Erlang

It’s totally worth it. Erlang (Erlang/OTP really, which is what most people mean when they say “Erlang has X”) does out of the box a lot of things we would have had to either build from scratch or piece together from existing libraries. Its dynamic type system and pattern matching (à la Haskell and ML) tend to make Erlang code even more concise than Python and Ruby, two languages known for their ability to do a lot in few lines of code. The single largest advantage to us of using Erlang has got to be its built-in support for concurrency. Erlang models concurrent tasks as “processes” that can communicate with one another only via message-passing (which makes use of pattern matching!), in what is known as the actor model of concurrency. This alone makes an entire class of concurrency-related bugs completely impossible. While it doesn’t completely prevent deadlock, it turns out to be pretty difficult to miss a potential deadlock scenario when you wr
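The message-passing idea in the excerpt above can be sketched outside Erlang. The following is only a rough Python analogy, not Membase code and not how Erlang processes are implemented: one thread owns a private counter, and other threads can affect it only by putting tagged tuples on its mailbox. The names `counter_actor`, `run_demo` and the message tags are made up for illustration.

```python
import queue
import threading


def counter_actor(mailbox):
    """Own a private counter; it can only be reached through messages."""
    count = 0  # private state, never shared with other threads
    while True:
        message = mailbox.get()
        tag = message[0]  # crude stand-in for Erlang's pattern matching
        if tag == "add":
            count += message[1]
        elif tag == "get":
            message[1].put(count)  # reply on the queue the sender supplied
        elif tag == "stop":
            break


def send_adds(mailbox, times):
    """Send `times` increment messages to the actor's mailbox."""
    for _ in range(times):
        mailbox.put(("add", 1))


def run_demo(senders=4, adds_per_sender=100):
    """Run several concurrent senders and return the actor's final count."""
    mailbox = queue.Queue()
    actor = threading.Thread(target=counter_actor, args=(mailbox,))
    actor.start()

    threads = [
        threading.Thread(target=send_adds, args=(mailbox, adds_per_sender))
        for _ in range(senders)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    # All "add" messages are already queued, so the "get" is handled last.
    reply = queue.Queue()
    mailbox.put(("get", reply))
    total = reply.get()

    mailbox.put(("stop",))
    actor.join()
    return total


if __name__ == "__main__":
    print(run_demo())
```

Because the counter is touched by one thread only, no lock is needed around it, which is the class of bug the excerpt says the actor model rules out.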

Cassandra's future @facebook and links to other NoSQL slides

I heard an unconfirmed rumor that Facebook is moving away from Cassandra. Not sure why, or to what, but rumors like this are a concern regardless. After Twitter's backing off and Digg's troubles, which were indirectly linked to either Cassandra's maturity as a production solution or their understanding of Cassandra's capabilities, it might raise more eyebrows if Facebook really does abandon Cassandra. Cassandra was created at Facebook, which open-sourced it, but it's my understanding that most of the development on the open source Cassandra today happens outside its walls. Rackspace is a big sponsor (may not be the largest anymore) of the open source project, and Riptano, which has built a whole company around the open source project, has done a tremendous job of promoting it.

Scalability links for October 30th:
SD Forum Membase Talk Slides - Slides from "Membase" Talk on Oct 26th at SDForum.
NoSQL Database Architectures & Hypertable - NoSQL Data

Scalability links for October 28th

Scalability links for October 28th:
New Features in Cassandra 0.7 - The biggest change for me in 0.7 is the presence of secondary indexes (inverted indexes). I'll be switching Cfmap to 0.7 very soon...
Do you have an Elephant and Pig in your data center? Hadoop momentum continues - Hadoop is taking over the data analysis world in a way LAMP once did.
Pregel: Graph Processing at Large-Scale - Some slides on Pregel, "A System for large-scale graph processing".
High-End Varnish – 275 thousand requests per second. - Very impressive.
RESTful Cassandra - A REST API for Cassandra is the one thing that is missing. This might fill the gap.

Scalability links for October 21st

Scalability links for October 21st:
VMware and Google Launch 1st Series of Development Tools - VMware has collected some interesting assets. One of them was SpringSource. I'm slightly surprised to see Google and VMware working together. I guess both of them think there is something to gain from it.
OpenStack: An Open Cloud Initiative Makes its 1st Release - OpenStack has promised a lot. Let's see if it can deliver. I'm optimistic.
AWS Free Tier: 750 hours of EC2 for free - This is big. They are going after Google App Engine, I think. But it sucks for those who are already on AWS, since this is available only to new users.
SPOF (Single Point of Failure) Analysis - I can see why this can be a science by itself. SPOF detection and impact analysis are very important in production systems.
Data Center Automation Startup Puppet Labs Acquires Open Source Project The Marionette Collective - The Marionette Collective, a.k.a. mcollective, is a framework to build

Scalability links for October 18th

Scalability links for October 18th:
Foursquare MongoDB Outage Post Mortem - Detailed analysis of what caused the Foursquare (MongoDB) outages.
SURGE Recap - Interesting takeaways from a scalability conference. The one new takeaway I noticed is "Make use of academic literature".
Netflix Migration to the Cloud - Very interesting (technical) information about why Netflix moved to the cloud.
Phoebus: Erlang-based Implementation of Google's Pregel - Is Phoebus an attempt at an open source version of Pregel?
Why Riak Search Matters... - Didn't understand Riak until this post compared it with Lucene and Solr. Itching to try it out... one more NoSQL experiment isn't going to kill me.
Google at USENIX Symposium on Operating Systems Design and Implementation (OSDI ‘10) - I've mentioned one of the papers listed here already. The other two seem to be interesting as well.
OpenTSDB: A Distributed, Scalable Monitoring System on Top of HBase - I sa

Scaling Graphite by using Cfmap as the data transport

Graphite is an extremely promising system and resource graphing tool which tries to take RRD to the next level. Here are some of the features of Graphite I liked most:
Updates can happen to records in the past (RRD doesn’t allow this, I think)
Creation of new datasets is trivial with whisper/carbon (both are part of the Graphite framework)
Graphite allows federated storage (multiple servers across multiple datacenters, for example)

Monitoring and graphing resources across datacenters is tricky, however, especially because WAN links cannot be trusted. While losing some data may be OK for some folks, it may not be acceptable for others. Graphite tries to solve this problem by providing an option to federate data across multiple servers, each of which could be kept in a separate datacenter. Another way to solve this problem is to use a data transport which is resilient to network failures. Since Cfmap (thanks to Cassandra) is a distributed, eventuall
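To make the first two features above concrete, here is a minimal Python sketch of feeding datapoints to carbon over its plaintext protocol, where each line is `metric.path value timestamp`. It is not the Cfmap transport the post goes on to describe. The metric path and host are made-up examples, and port 2003 is assumed to be where the carbon plaintext listener runs (its usual default).

```python
import socket


def format_metric(path, value, timestamp):
    """Build one carbon plaintext line: '<path> <value> <unix timestamp>'."""
    return f"{path} {value} {int(timestamp)}\n"


def backfill_lines(path, samples):
    """Format (timestamp, value) pairs; the timestamps may lie in the past."""
    return [format_metric(path, value, ts) for ts, value in samples]


def send_metrics(lines, host="localhost", port=2003):
    """Send pre-formatted lines to a carbon listener (host/port are assumptions)."""
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)


if __name__ == "__main__":
    # Hypothetical metric name; two datapoints from October 2010.
    lines = backfill_lines(
        "servers.web01.load",
        [(1288000000, 0.42), (1288000060, 0.57)],
    )
    print("".join(lines), end="")
    # send_metrics(lines)  # uncomment when a carbon listener is reachable
```

A new metric path needs no setup on the sending side in this sketch; whether carbon creates the dataset on first write depends on how the server is configured.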

Cfmap: Publishing, discovering and dashboarding infrastructure state

Dynamic infrastructure can be challenging if apps and scripts can’t keep up with it. At Ingenuity we observed this problem when we started moving towards virtualization and SOA (service-oriented architecture). Remembering server names became impractical, and error-free manual configuration changes became impossible. While there are some tools which solve parts of this specific problem, we couldn’t find any open source tool which could be used to both publish and discover the state of a system in a distributed, scalable and fault-tolerant way. ZooKeeper, which comes pretty close to what we needed, is a fully consistent system which was not designed to be used across multiple datacenters over high-latency, unstable network connections. We wanted a system which could not only stay up during network outages, but also sync up the state from different datacenters once they are reconnected. We built a few different tools to solve our scalability problems, one of which is a tool called Cfmap

“Chrome instant” feature could break your webapp

“Google Instant” wasn’t a groundbreaking idea by itself. We have all been using various forms of auto-complete for a while now. What makes it stand out is that, unlike all the previous kinds of auto-complete, this one is able to search the entire web archive at an amazing speed and still serve personalized, hyper-local results. You can get more information about its backend here and here. It wasn't surprising that Google put this feature inside Chrome itself as well. Take a look at this demo from Lifehacker. This is where it gets interesting… At the beginning this looked very exciting. I was pleasantly surprised when Chrome brought up websites, in addition to auto-completing URLs, as I typed. The impact on the servers didn’t sink in until I was debugging a bug in my own application which required me to take a look at the Apache logs. Look at the following log snippet from Apache. Not surprisingly, I found 17 calls instead of just 1 made to my