Distributed systems and Unique IDs: Snowflake

Most of us who deal with traditional databases take auto-increments for granted. While auto-increment is simple on a consistent cluster, it becomes a challenge in a cluster of independent nodes that don't share a single source for unique ids. An even bigger challenge is doing it in such a way that the ids are roughly in sequence.

While this may be an old problem, I realized the importance of such a sequence only after using Cassandra in my own environment. Twitter, which has been using Cassandra in many interesting ways, has proposed a solution for it, which it is releasing as open source today.

Here are some interesting sections from their post announcing “Snowflake”.

The Problem

We currently use MySQL to store most of our online data. In the beginning, the data was in one small database instance which in turn became one large database instance and eventually many large database clusters. For various reasons, the details of which merit a whole blog post, we’re working to replace many of these systems with the Cassandra distributed database or horizontally sharded MySQL (using gizzard).

Unlike MySQL, Cassandra has no built-in way of generating unique ids – nor should it, since at the scale where Cassandra becomes interesting, it would be difficult to provide a one-size-fits-all solution for ids. Same goes for sharded MySQL.

Our requirements for this system were pretty simple, yet demanding:

We needed something that could generate tens of thousands of ids per second in a highly available manner. This naturally led us to choose an uncoordinated approach.

These ids need to be roughly sortable, meaning that if tweets A and B are posted around the same time, they should have ids in close proximity to one another since this is how we and most Twitter clients sort tweets.[1]

Additionally, these numbers have to fit into 64 bits. We’ve been through the painful process of growing the number of bits used to store tweet ids before. It’s unsurprisingly hard to do when you have over 100,000 different codebases involved.


Solution

To generate the roughly-sorted 64-bit ids in an uncoordinated manner, we settled on a composition of: timestamp, worker number and sequence number.

Sequence numbers are per-thread and worker numbers are chosen at startup via zookeeper (though that’s overridable via a config file).
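The composition described above can be sketched out in a few lines. The 41/10/12 bit split and the custom epoch below match Snowflake's published layout, but the locking and clock-wait details here are my own illustrative assumptions, not Twitter's actual implementation:

```python
import threading
import time

class SnowflakeSketch:
    """Minimal sketch of a timestamp + worker + sequence id generator."""
    TIMESTAMP_BITS = 41
    WORKER_BITS = 10
    SEQUENCE_BITS = 12
    EPOCH_MS = 1288834974657  # custom epoch (an assumption for this sketch)

    def __init__(self, worker_id):
        assert 0 <= worker_id < (1 << self.WORKER_BITS)
        self.worker_id = worker_id
        self.sequence = 0
        self.last_ms = -1
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            now = int(time.time() * 1000)
            if now == self.last_ms:
                # Same millisecond: bump the sequence number.
                self.sequence = (self.sequence + 1) & ((1 << self.SEQUENCE_BITS) - 1)
                if self.sequence == 0:
                    # Sequence exhausted; spin until the next millisecond.
                    while now <= self.last_ms:
                        now = int(time.time() * 1000)
            else:
                self.sequence = 0
            self.last_ms = now
            # High bits = time, middle = worker, low = sequence, so ids
            # from one worker are strictly increasing and ids across
            # workers are roughly time-ordered.
            return (((now - self.EPOCH_MS) << (self.WORKER_BITS + self.SEQUENCE_BITS))
                    | (self.worker_id << self.SEQUENCE_BITS)
                    | self.sequence)
```

Because the timestamp occupies the high bits, sorting ids numerically sorts them roughly by creation time, which is exactly the "roughly sortable" property the post asks for.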

We encourage you to peruse and play with the code: you’ll find it on github. Please remember, however, that it is currently alpha-quality software that we aren’t yet running in production and is very likely to change.

@twitter annotations : What I learnt at the hackfest…

A few of us joined in at the new Twitter office in downtown SF (right next to Moscone Center) and were for the first time shown what Twitter is doing about "Twitter Annotations". We probably created the first set of 3rd party applications around this new API. During the Hackathon I spent some time wearing my "Scalable web architecture" hat to think about what I could learn from this experience, which I've summarized below.

Twitter annotations, from a developer's viewpoint, are just an extension to existing APIs which now allows posting additional structured content along with a tweet. The content stays within the context of the tweet and will be retweeted/shared automatically with the main tweet. Twitter has some recommendations on how the annotations should be structured; for example, they were talking about "type", which sounded very much like Open Graph's "type/category" concept, with the difference that Twitter has left the field open for any kind of "type" users want. Facebook, if I remember right, had strongly recommended that users stick to a small set of "categories/types" which they published.

Twitter accepted these annotations in multiple formats, of which I tried the "simple" and "JSON" protocols. The "JSON" way was the most recommended/used medium of annotation during the whole hackathon. While the annotation structure (using JSON) did allow multiple "types" in the same tweet, there were a few limitations which were slightly constricting. The first big one was that the current implementation allowed only 512 bytes in the annotations field. The second was that the structure, while it's JSON, only supported a few levels of depth in the structured annotation. This was extremely restrictive for the use case I was trying to hack up.
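To make the constraints above concrete, here is what an annotation payload in the JSON form might look like. The field names ("review", "place", etc.) are my own illustrative assumptions, since annotations left the "type" open-ended; the 512-byte ceiling is the limit mentioned above:

```python
import json

# A hypothetical annotations payload: a list of {type: {attributes}} maps.
# Only shallow nesting is used, reflecting the depth limit noted above.
annotations = [
    {"review": {"title": "Great coffee", "rating": "5"}},
    {"place": {"name": "Moscone Center", "city": "San Francisco"}},
]

payload = json.dumps(annotations)

# The current implementation rejects anything over 512 bytes, so a client
# would want to check the serialized size before posting.
assert len(payload.encode("utf-8")) <= 512, "annotations field exceeds 512 bytes"
```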

There were a few things I learnt during the whole 32-hour experience. The first was that Twitter had actually hosted these half-baked APIs on http://api.twitter.com and http://www.twitter.com, which I'm glad to say are still accessible using my account from outside Twitter's buildings. Of course the hackers (we) had to be white-listed to get access, but from an operations viewpoint this is extremely gutsy, since one bad ACL code fragment could expose a lot of uncooked APIs to the whole world. This approach to testing is not new to Twitter and is frequently used for A/B testing in newer (more agile) organizations around the world.

The second was the fact that while Cassandra is in use at Twitter, they don't use it as the primary datastore for all the tweets (yet). They have other uses for it which they didn't elaborate on. The version of Cassandra they use is close to 0.6.2, which just got released. It also looked (from my discussions with one engineer) like Cassandra treated rack-awareness and datacenter-awareness in slightly different ways. In the previous documentation I had read, the two were the same for all practical purposes. In other words, I need to research this a little more, since optimizations in this area can boost Cassandra's performance across datacenters.

The third was that while Twitter uses cutting-edge tools for a lot of different things, they don't have service discovery nailed yet. They are playing with ZooKeeper, and I believe they will use it eventually, but it's not there yet. This by itself is amazing, because without service discovery the management and rollout of configuration changes becomes centralized, which has its own advantages and disadvantages. At the organization I work for, we are playing with Cassandra as a service publication/discovery tool for monitoring and consuming services. The short discussion I had with Twitter folks about using Cassandra in such a way validated the work I'm doing with it. But I'm still puzzled why others are not thinking about Cassandra (or another eventually-consistent datastore) for service discovery. It sounded like ZooKeeper might be overkill for my organization, but I should take a look at it again.
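The publication/discovery idea above boils down to a simple pattern: services write heartbeat records with a TTL, and consumers treat any unexpired record as a live endpoint. The toy model below captures that pattern in-memory; in Cassandra the publish step would be a column write with a TTL, and all the names here are my own illustrative assumptions:

```python
import time

class ServiceRegistry:
    """Toy model of TTL-based service publication and discovery."""

    def __init__(self, ttl_seconds=30):
        self.ttl = ttl_seconds
        self.records = {}  # service name -> {endpoint: last heartbeat time}

    def publish(self, service, endpoint):
        # Each running instance periodically re-publishes its endpoint;
        # with a real datastore this would be a TTL'd write.
        self.records.setdefault(service, {})[endpoint] = time.time()

    def discover(self, service):
        # Expire on read: anything that stopped heartbeating within the
        # TTL window is dropped, so dead nodes age out automatically.
        now = time.time()
        live = {ep: ts for ep, ts in self.records.get(service, {}).items()
                if now - ts < self.ttl}
        self.records[service] = live
        return sorted(live)
```

An eventually-consistent store fits this pattern because a slightly stale view of the endpoint list is usually acceptable, which is exactly the trade-off being weighed against ZooKeeper above.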

The fourth was that Twitter employs a lot of very smart, passionate people who are amazingly good with most of the network/application stack. They dove into things like browser/JavaScript/cookies and then switched to dissecting network traffic using sniffer tools to debug a possible OAuth implementation bug and other weird things. This just adds to the current popular wisdom that scalability/stability/security can't be done in small silos.

The fifth and final thing I'd like to write about is the hackathon itself. It's amazing how Twitter organized this hackathon, got a group of hackers to play with their new APIs, and gave them the ability to demo their hacks to the likes of Paul Graham and Ron Conway. In return they got very interesting product ideas and use cases for a feature which is still unpolished and unreleased. But more importantly, they also got a bunch of hackers to intentionally and unintentionally break the feature and discover some serious and some very annoying bugs. They also got feedback on what does and doesn't resonate with developers. In a way this is similar to what some other organizations (including Google) already do with their alpha/beta programs, but nothing beats the velocity of hacking up 10 to 20 almost-ready products around a brand-new feature in less than 32 hours.

P.S.: I’m terribly sorry for spamming my twitter followers who were bombarded with twitter test messages for two days. Next time I’ll pick a test account 🙂

NoSQL in the Twitter world

NoSQL solutions have one thing in common: they are generally designed for horizontal scalability. So it's no wonder that a lot of applications in the "twitter" world have picked NoSQL-based datastores for their persistence layer. Here is a collection of these apps from the MyNoSQL blog.

  1. Twitter uses Cassandra
  2. MusicTweets used Redis [ Ref ] – The site is dead, but you can still read about it
  3. Tstore uses CouchDB
  4. Retwis uses CouchDB
  5. Retwis-RB uses Redis and Sinatra — no idea what Sinatra is; will have to look into it. [ Update: Sinatra is a Ruby web framework, not a DB store ]
  6. Floxee uses MongoDB
  7. Twidoop uses Hadoop
  8. Swordfish, built on top of Tokyo Cabinet, comes with a twitter clone app
  9. Tweetarium uses Tokyo Cabinet

Do you know of any more?