Posts

Showing posts from December, 2010

Switching roles: next stop Google

January 2011 will start a little differently for me after 10 years. I’ve accepted a position in the Google Apps Enterprise group and will be joining them early next month. Other than the fun stuff I do outside my regular job, I’ve been in IT-related roles for as long as I can remember. And while IT has been very challenging and is an exciting field to be in, I feel that it’s time for a little exploration. I will deeply miss all of my friends at Ingenuity, some of whom I’ve worked with for over 10 years, but I’m ready for my next challenge.

Switching roles: next stop Google

January 2011 will start a little differently for me after 10 long years. I’ve accepted a position in the Google Apps Enterprise group and will be joining them early next month. Other than the fun stuff I do outside my regular job, I’ve been doing IT-related work for as long as I can remember. And while IT has been a very challenging and exciting field to be in, I feel that it’s time for a little exploration. My scalable web architecture blog and this personal blog will stay up, but I’m not sure at this point how my new job will affect how often I post here.

S4: Distributed Stream Computing Platform

A few weeks ago I mentioned that Yahoo! Labs was working on something called S4 for real-time data analysis. Yesterday they released an 8-page paper with a detailed description of how and why they built it. It’s interesting to note that the authors compared S4 with MapReduce and explained that MapReduce is too optimized for batch processing to be the best place for real-time computation. They also made an architectural decision not to build a system that does both offline (batch) processing and real-time processing, since they feared such a system would end up being good at neither. Here is the abstract from the paper: S4 is a general-purpose, distributed, scalable, partially fault-tolerant, pluggable platform that allows programmers to easily develop applications for processing continuous unbounded streams of data. Keyed data events are routed with affinity to Processing Elements (PEs), which consume the events and do one or both of the following: (1) emit one or mo
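To make the "keyed events routed with affinity to Processing Elements" idea a bit more concrete, here is a minimal single-process sketch in Python. It is not S4's API: the names Event, CountPE, Router and run_word_count are hypothetical, and the choice of having each PE emit a running-total event is an assumption made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Event:
    stream: str    # hypothetical stream name
    key: str       # value used to pick the Processing Element
    value: int = 1


@dataclass
class CountPE:
    """Toy Processing Element: one instance per key, holding that key's state."""
    key: str
    count: int = 0

    def consume(self, event):
        # Update per-key state, then emit a follow-up event (assumed behaviour).
        self.count += event.value
        return [Event(stream="counts", key=self.key, value=self.count)]


class Router:
    """Routes every event with a given key to the same PE instance (affinity)."""

    def __init__(self, pe_factory):
        self.pe_factory = pe_factory
        self.pes = {}

    def dispatch(self, event):
        pe = self.pes.get(event.key)
        if pe is None:
            # First event for this key: create its PE lazily.
            pe = self.pes[event.key] = self.pe_factory(event.key)
        return pe.consume(event)


def run_word_count(words):
    router = Router(CountPE)
    emitted = []
    for word in words:
        emitted.extend(router.dispatch(Event(stream="words", key=word)))
    counts = {key: pe.count for key, pe in router.pes.items()}
    return counts, emitted
```

Calling run_word_count(["a", "b", "a"]) sends both "a" events to the same CountPE, which is the whole point of keyed routing: per-key state never has to be shared between PEs. A real deployment would spread the PEs across machines, which this sketch does not attempt.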

REST APIs for cloud management and the Database.com launch

I found the top two stories on scalebig last night interesting enough to dig a little deeper. The one which surprised me the most was William Vambenepe’s post about why he thinks that REST APIs don’t matter in the context of cloud management. While REST might be ideal for many different things, including web-based applications which are accessed mostly by browsers, Amazon chose to avoid REST for most of its infrastructure management APIs. Has this lack of RESTfulness stopped anyone from using it? Has it limited the scale of systems deployed on AWS? Does it limit the flexibility of the Cloud offering and somehow force people to consume more resources than they need? Has it made the Amazon Cloud less secure? Has it restricted the scope of platforms and languages from which the API can be invoked? Does it require more experienced engineers than competing solutions? I don’t see any sign that the answer is “yes” to any of these questions. Considering the

Providing Dynamic DNS over “Amazon Route 53” (a hackathon)

In hindsight, yesterday’s “Route 53” announcement was not completely unexpected. Amazon is an IaaS provider and it’s in their own interest to automate infrastructure as much as possible. After tackling monitoring and CloudFront features, DNS was one of the more obvious targets for improvement. So when I was trying to pick a challenge for this morning’s hackathon, I picked one around the “Amazon Route 53” service. At the end of the day I had an almost functional public dynamic DNS service using “Route 53” as the DNS service and Twitter’s OAuth service for authentication. The final hack is up here http://www.XXXXXXX.com/. You are most welcome to play with it and/or use it. After the initial creation of a user in the system (with a little help from Twitter’s OAuth), the end user is free to use the browser-based web application or the lynx- or curl-based REST interface to add/create/update host records. The current version only supports “A” records, but it would be expa

Amazon Route 53: Programmable DNS is finally here

Managing DNS has been considered an art by many. If you manage your own DNS records and run your own external DNS servers, I’m sure you have some stories to share. Unfortunately, unlike most other infrastructure on the internet, DNS screw-ups can get very costly, especially because caching policies tend to keep your mistakes alive long after you have rolled back your changes. The unforgiving nature of DNS has pushed most people, except a few hardcore sysadmins, to avoid DNS hell and choose a managed service to do it for them. Domain name registrars like Network Solutions, MyDomain and GoDaddy already provide these DNS services, but I can’t recall any of them providing APIs to make these changes automatically. DynDNS does provide an API to change DNS mappings, but it costs 15 bucks a year for a single host. There might be others which I’m not aware of, but the bottom line is that there is no standard, and it’s not cheap. Customers on AWS today unfortunately have t

Scalability links for December 4th

Scalability links for December 4th:
Presenting - La Brea - An interesting tool which could be used to understand how failures, latency and other annoying issues can impact an application. The tool allows one to insert system calls into an existing application without recompiling the original application.
What's new in Cassandra 0.7: Secondary indexes - I finally see an example of the promised land!! :) Can't wait to try this out.
NoCAP Part III GigaSpaces clustering explained..
Devops - The War Is Over - if You Want It
Great Introductory Video on Scalability from Harvard Computer Science
Strategy: Google Sends Canary Requests into the Data Mine - This is another way of testing code thrown out by continuous deployments. Very nice.
Very Low-Cost, Low-Power Servers
Better Workflow Management in CDH with Oozie 2
Facebook at 13 Million Queries Per Second Recommends: Minimize Request Variance
Keeping Customers Happy - Another New Elastic Load Balancer Feature -

AWS Cloudwatch is now really open for business

In a surprise move, Amazon today released a bunch of new features for its CloudWatch service, some of which, until now, were provided by third-party service providers.
Basic Monitoring of Amazon EC2 instances at 5-minute intervals at no additional charge.
Elastic Load Balancer Health Checks - Auto Scaling can now be instructed to automatically replace instances that have been deemed unhealthy by an Elastic Load Balancer.
Alarms - You can now monitor Amazon CloudWatch metrics, with notification to the Amazon SNS topic of your choice when the metric falls outside of a defined range.
Auto Scaling Suspend/Resume - You can now push a "big red button" in order to prevent scaling activities from being initiated.
Auto Scaling Follow the Line - You can now use scheduled actions to perform scaling operations at particular points in time, creating a time-based scaling plan.
Auto Scaling Policies - You now have more fine-grained control over the modifications

Kafka: A high-throughput distributed messaging system.

Found an interesting new open source project which I hadn’t heard about before. Kafka is a messaging system used by LinkedIn as the foundation of their activity stream processing. Kafka is a distributed publish-subscribe messaging system. It is designed to support the following:
Persistent messaging with O(1) disk structures that provide constant time performance even with many TB of stored messages.
High-throughput: even with very modest hardware Kafka can support hundreds of thousands of messages per second.
Explicit support for partitioning messages over Kafka servers and distributing consumption over a cluster of consumer machines while maintaining per-partition ordering semantics.
Support for parallel data load into Hadoop.
Kafka is aimed at providing a publish-subscribe solution that can handle all activity stream data and processing on a consumer-scale web site. This kind of activity (page views, searches, and other user actions) is a key ingredient in many of
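The partitioning and per-partition ordering mentioned in this excerpt can be illustrated with a small in-memory sketch in Python. This is not Kafka's client API or storage format: PartitionedLog and its methods are hypothetical names, and hashing the key with CRC32 is just one assumed way of choosing a partition.

```python
import zlib


class PartitionedLog:
    """Toy publish-subscribe log: append-only lists, one per partition."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # Stable hash so the same key always lands in the same partition.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def publish(self, key, message):
        # Append to the key's partition and return (partition, offset).
        partition = self.partition_for(key)
        self.partitions[partition].append((key, message))
        return partition, len(self.partitions[partition]) - 1

    def read(self, partition, offset):
        # A consumer tracks its own offset and reads forward from it.
        return self.partitions[partition][offset:]
```

Because all messages for one key go to one append-only partition, a consumer reading that partition sees them in publish order, while different partitions can be consumed in parallel by different machines. Ordering across partitions is deliberately not guaranteed in this sketch.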

The unbiased private vs AWS ROI worksheet

One of my problems with most cloud ROI worksheets is that they are heavily weighted toward use cases where resource usage is very bursty. But what if your resource requirements aren’t bursty? And what if you have a use case where you have to maintain a small IT team to manage some on-site resources due to compliance and other issues? In his latest post, Richard shares his worksheet for everyone to play with.
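To see why burstiness matters so much in these comparisons, here is a rough back-of-the-envelope sketch in Python. It is not Richard's worksheet: the function names and every price below are made-up placeholders, and the model assumes cloud capacity is billed per server-hour, private capacity must be sized for peak demand, and the compliance-driven IT team costs the same either way.

```python
def cloud_monthly_cost(hourly_demand, rate_per_server_hour, staff_cost):
    # Pay only for the server-hours actually used.
    return sum(hourly_demand) * rate_per_server_hour + staff_cost


def private_monthly_cost(hourly_demand, cost_per_server_month, staff_cost):
    # Own enough servers to cover the busiest hour, all month long.
    return max(hourly_demand) * cost_per_server_month + staff_cost


def compare(hourly_demand, rate_per_server_hour=0.10,
            cost_per_server_month=50.0, staff_cost=8000.0):
    # All default prices are hypothetical placeholders, not real quotes.
    return {
        "cloud": cloud_monthly_cost(hourly_demand, rate_per_server_hour, staff_cost),
        "private": private_monthly_cost(hourly_demand, cost_per_server_month, staff_cost),
    }


# Two hypothetical 730-hour months: flat demand vs. a short spike.
steady = compare([10] * 730)
bursty = compare([2] * 700 + [40] * 30)
```

With these placeholder numbers the steady profile comes out cheaper on private hardware and the bursty profile cheaper in the cloud, which is the bias the post is pointing at. Plugging in your own rates and demand curve can flip either result.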