Scaling Graphite by using Cfmap as the data transport

Graphite is an extremely promising system and resource graphing tool which tries to take RRD to the next level. Here are some of the most interesting features of graphite which I liked.

  • Updates can happen to records in the past (RRD doesn’t allow this I think)image
  • Creation of new datasets is trivial with whisper/carbon ( its part of  the graphite framework )
  • Graphite allows federated storage (multiple servers across multiple datacenters for example)

Monitoring and graphing resources across data-centers is tricky however. Especially because WAN links cannot be trusted. While loosing some data may be ok for some folks, it may not be acceptable for others. Graphite tries to solve this problem by providing an option to federate data across multiple servers which could each be kept in separate datacenters.

Another way to solve this problem is by using a data transport which is resilient to network failures.

Since Cfmap (thanks to Cassandra) is a distributed, eventually consistent, state repository, it could easily be extended to act as an eventually consistent data queue for tools like graphite. Take a look at an example of a server record here (on the right).  With some minor modifications, we were able to log and publish all changes to system attributes using an API like this. And with the right script running on the graphite server, the import of these stats into carbon (a component of graphite), became a trivial task.

image

With Cfmap’s easy REST interface, adding new stats to graphite becomes as simple as registering the stats to cfmap from anywhere in the network. [ sample script ]

Graphite is not in use in our corporate network today, but I’m extremely excited at the possibilities and would be actively looking at how else we could use it.

[ Take a look at RabbitMQ integration with Graphite for another interesting way of working with graphite ]

Comments

Popular posts from this blog

Chrome Frame - How to add command line parameters

Creating your first chrome app on a Chromebook

Brewers CAP Theorem on distributed systems