Cassandra for service registry/discovery service
My last post was about my struggle to find a good distributed ESB/service-discovery solution, built on open source tools, that is simple to use and maintain. After reading the comments (Dan's especially) and a few email exchanges, it looks like building a custom solution is unavoidable if I really want to keep things simple.
Dan suggested that I could use DNS to find seed locations for the config store, which would work very well in a distributed network. If security weren't a concern, this seed location could have been on S3 or SimpleDB. But it has to be secured on internal infrastructure, which forced me to investigate simple replicated, eventually consistent databases that could be hosted internally in different data centers with little or no long-term administration cost.
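To make the DNS idea a little more concrete, here is a minimal sketch in Python of how a client might turn one well-known DNS name into a list of seed addresses for the config store. It only assumes that the seed hosts are published as ordinary address records under a single name. The function name `find_seeds` and its `resolver` parameter are made up for illustration; they are not part of Cassandra or of anything Dan proposed.

```python
import socket


def find_seeds(seed_name, port, resolver=socket.getaddrinfo):
    """Return the unique (ip, port) pairs published under one DNS name.

    `resolver` defaults to the standard library lookup; it is a parameter
    only so that a stub can be swapped in for testing (an assumption of
    this sketch, not a requirement of the approach).
    """
    # Ask for TCP results only, so an address is not repeated per socket type.
    records = resolver(seed_name, port, type=socket.SOCK_STREAM)
    # Each record is (family, type, proto, canonname, sockaddr), and
    # sockaddr starts with (ip, port) for both IPv4 and IPv6.
    return sorted({(rec[4][0], rec[4][1]) for rec in records})
```

A client would call something like `find_seeds("seeds.config.example.internal", 9160)`, where both the host name and the port are placeholders, and then try the returned addresses in turn until one of them answers. Adding or retiring a seed then becomes a DNS change rather than a client configuration change.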
My search led me to investigate a few different NoSQL options.
But the one I finally settled on as a possible candidate was Cassandra. Because our application platform is based on Java, Cassandra, unlike some of the others, was simple to install and set up. The fact that Facebook used it to store 50TB of data across 150 servers helped convince us that it was stable as well.
The documentation for this project isn't as extensive as I would have liked, but I did get it running pretty fast. Building a service registry/discovery service on top of it is what's next on my mind.
More on Cassandra
If you are interested in learning more about Cassandra, I recommend listening to this talk by Avinash Lakshman (Facebook) and reading a few of the other posts listed here.
Cassandra: Articles
Cassandra -- Getting Started: Cassandra data model from a Java perspective
Looking to the Future with Cassandra: how Digg migrated their friends+diggs data set from MySQL to Cassandra
Building Scalable Databases: Denormalization, the NoSQL Movement and Digg
WTF is a SuperColumn? An Introduction to the Cassandra Data Model
Cassandra and Ruby: A Love Affair? - Engine Yard's walk-through of the Cassandra gem
Up and Running with Cassandra: featuring data model examples of a Twitter clone and a multi-user blog, and Ruby client code
Facebook Engineering notes and Cassandra introduction and LADIS 2009 paper
Cassandra: Presentations
Cassandra in Production at Digg from NoSQL East 09
What Every Developer Should Know About Database Scalability: presentation on RDBMS vs. Dynamo, BigTable, and Cassandra
NOSQL Video - NOSQL Slides: More on Cassandra internals from Avinash Lakshman.
Video of a presentation about Cassandra at Facebook: covers the data model of Facebook's inbox search and a lot of implementation details. Prashant Malik and Avinash Lakshman presenting.
Cassandra presentation at SIGMOD: mostly the same slides as above
If any of you have worked with Cassandra, please let me know how it has been working out for you.
Comments
However, rumour has it that Facebook develops their own version aside from the open-source version.
Perhaps they use Cassandra to store more sensitive data too, but I haven't seen that confirmed by any Facebook devs yet (prove me wrong though).
I'd say a much more interesting use case would be Digg's deployment of Cassandra. (You linked the article).