My last post was about my struggle to find a good distributed ESB/service-discovery solution, built on open source tools, that was simple to use and maintain. Thanks to reader comments (Dan especially) and some other email exchanges, it seems like building a custom solution is unavoidable if I really want to keep things simple.
Dan suggested that I could use DNS to find seed locations for the config store, which would work very well in a distributed network. If security wasn't a concern, this seed location could have been on S3 or SimpleDB. But the requirement that it be secured on internal infrastructure forced me to investigate simple replicated, eventually consistent databases that could be hosted internally in different data centers with little or no long-term administration cost.
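To make the DNS idea a bit more concrete, here is a minimal sketch of what the lookup could look like in Java. It is only an illustration under my own assumptions: the class name `SeedLocator`, the record name passed on the command line, and port 9160 are all made up for the example, and it assumes the seed name is published as plain A records, one per seed host.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: turns one DNS name into a list of config-store seeds.
final class SeedLocator {

    // Pair every resolved address with the port the config store listens on.
    static List<InetSocketAddress> toSeeds(InetAddress[] addresses, int port) {
        List<InetSocketAddress> seeds = new ArrayList<>();
        for (InetAddress address : addresses) {
            seeds.add(new InetSocketAddress(address, port));
        }
        return seeds;
    }

    // Ask DNS for all addresses behind the seed name (assumed to be A records).
    static List<InetSocketAddress> findSeeds(String seedName, int port)
            throws UnknownHostException {
        return toSeeds(InetAddress.getAllByName(seedName), port);
    }

    public static void main(String[] args) throws UnknownHostException {
        // The seed name is an assumption; pass your own internal record name.
        String seedName = args.length > 0 ? args[0] : "localhost";
        int port = 9160; // assumed port for the example
        for (InetSocketAddress seed : findSeeds(seedName, port)) {
            System.out.println(seed);
        }
    }
}
```

A client would try the returned seeds in turn until one answers, so adding or retiring a seed host only means updating the DNS record rather than redeploying configuration.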
My search led me to investigate a few different NoSQL options.
But the one I finally settled on as a possible candidate was Cassandra. Because our application platform is based on Java, Cassandra was simpler to install and set up than some of the others. The fact that Facebook used it to store 50TB of data across 150 servers helped convince us that it was stable as well.
There isn't as much documentation for this project as I would have liked, but I did get it running pretty fast. Building a service registry/discovery service on top of it is what's next on my mind.
More on Cassandra
If you are interested in learning more about Cassandra, I recommend listening to this talk by Avinash Lakshman (Facebook) and reading a few of the other posts listed here.
Cassandra — Getting Started: Cassandra data model from a Java perspective
Looking to the Future with Cassandra: how Digg migrated their friends+diggs data set from MySQL to Cassandra
Cassandra and Ruby: A Love Affair? – Engine Yard’s walk-through of the Cassandra gem
Up and Running with Cassandra: featuring data model examples of a Twitter clone and a multi-user blog, and Ruby client code
Cassandra in Production at Digg from NoSQL East 09
What Every Developer Should Know About Database Scalability: presentation on RDBMS vs. Dynamo, BigTable, and Cassandra
Video of a presentation about Cassandra at Facebook: covers the data model of Facebook’s inbox search and a lot of implementation details. Prashant Malik and Avinash Lakshman presenting.
Cassandra presentation at SIGMOD: mostly the same slides as above
If any of you have worked with Cassandra, please let me know how it has been working out for you.