Cassandra for service registry/discovery service

My last post was about my struggle to find a good distributed ESB/Service-discovery solution built over open source tools which was simple to use and maintain. Thanks to reader comments (Dan especially) and some other email exchanges, it seems like building a custom solution is unavoidable if I really want to keep things simple.

Dan suggested that I could use DNS to find seed locations for config store which would work very well in a distributed network. If security wasn’t a concern this seed location could have been on S3 or SimpleDB, but the requirement that it needs to be secured on internal infrastructure forced me to investigate simple replicated/eventually-consistent databases which could be hosted internally in different data centers with little or no long term administration cost.

My search lead me to investigate a few different NOSQL options

But the one I finally settled on as a possible candidate was Cassandra. Unlike some of the others, since our application platform was based on java, Cassandra was simple to install and setup. The fact that Facebook used it to store 50TB of data across 150 servers helped us convince it was stable as well.

The documentation on this project isn’t as much as I would have liked, but I did get it running pretty fast. Building a service registry/discovery service on top of this is whats next on my mind..

More on Cassandra

If you are interested in learning more about cassandra I’ll recommend you to listen to this talk by Avinash Lakshman (facebook) and read a few other posts listed here.

Cassandra: Articles

  • Cassandra — Getting Started: Cassandra data model from a Java perspective

  • Using Cassandra’s Thrift interface with Ruby

  • Cassandra and Thrift on OS X: one, two, three

  • Looking to the Future with Cassandra: how Digg migrated their friends+diggs data set to Cassandra from mysql

  • Building Scalable Databases: Denormalization, the NoSQL Movement and Digg

  • WTF is a SuperColumn? An Introduction to the Cassandra Data Model

  • Meet Scalandra: Scala wrapper for Cassandra

  • Cassandra and Ruby: A Love Affair? – Engine Yard’s walk-through of the Cassandra gem

  • Up and Running with Cassandra: featuring data model examples of a Twitter clone and a multi-user blog, and ruby client code

  • Facebook Engineering notes and Cassandra introduction and LADIS 2009 paper

  • ArchitectureInternals

  • ArchitectureGossip

  • Cassandra: Presentations

  • Cassandra in Production at Digg from NoSQL East 09

  • Introduction to Cassandra at OSCON 09

  • What Every Developer Should Know About Database Scalability: presentation on RDBMS vs. Dynamo, BigTable, and Cassandra

  • IBM Research’s scalable mail storage on Cassandra

  • NOSQL VideoNOSQL Slides: More on Cassandra internals from Avinash Lakshman.

  • Video of a presentation about Cassandra at Facebook: covers the data model of Facebook’s inbox search and a lot of implementation details. Prashant Malik and Avinash Lakshman presenting.

  • Cassandra presentation at sigmod: mostly the same slides as above

  • If any of you have worked on cassandra, please let me know how that has been working out for you.

    2 comments

    1. It’s worth noting the data stored by Facebook is transient. They use it for storing the index for inbox search. This means losing this kind of data isn’t a too big deal for them as it can easily be recreated.

      However, rumour has it that Facebook develops their own version aside from the open-source version.
      Perhaps they use Cassandra to store more sensitive data too but I haven’t seen that confirmed by any Facebooks devs yet (prove me wrong though).

      I’d say a much more interesting use case would be Digg’s deployment of Cassandra. (You linked the article).

    Leave a Reply

    Your email address will not be published. Required fields are marked *

    You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>