It’s logical – IAAS users will move to PAAS

Sysadmins love infrastructure control, and I have to say that there was a time when root access gave me a high. It wasn’t until I moved to a web operations team (and gave up my root access) that I realized I was more productive when I wasn’t dealing with day-to-day hardware and OS issues. After managing my own EC2/Rackspace instance for my blog for a few years, I came to another realization today: IAAS (infrastructure as a service) might be one of those fads which will give way to PAAS (platform as a service).

WordPress is an excellent blogging platform, and I manage multiple instances of it for my blogs (and one for my wife’s blog). I chose to run my own WordPress instance because I loved having the same control I used to have as a sysadmin. I not only wanted to run my own plugins, configure my own features and play with different kinds of caching, I also wanted to choose my own Linux distribution (Ubuntu, of course) and make it work the way I always wanted my servers to work. But when it came to patching the OS, taking backups, and updating WordPress and the zillion plugins, I found it a little distracting, slightly frustrating and extremely time consuming.

Last week I moved one of my personal blogs to blogger.com, and it’s possible that it may not be the last one. What’s important here is not that I picked blogger.com over wordpress.com, but the fact that I’m ready to give up control to be more productive. Amazon’s AWS started off as the first IAAS service provider, but today they provide a whole lot of other managed services like Elastic MapReduce, Amazon Route 53, Amazon CloudFront and Amazon Relational Database Service, which are more PAAS than IAAS.

IAAS is a very powerful tool in the hands of a professional systems admin. But I’m willing to bet that over the next few years fewer organizations will worry about kernel versions and Linux distributions, and will instead be happy with a simple API to upload “.war” files (if they are running Tomcat, for example) into some kind of cloud-managed Tomcat instance (like how Hadoop runs in Elastic MapReduce). Google App Engine (Java and Python) and Heroku (Ruby based; Salesforce bought them) are two examples of such services today, and I’ll be surprised if AWS doesn’t launch something (or buy someone out) within the next year to do the same.

Google App Engine 1.4.0 pre-release is out

The complete announcement is here, but here are the changes for the Java SDK. The two big changes I liked are the new “always on” feature and the fact that the “tasks” feature has graduated out of beta/testing.

  • The Always On feature allows applications to pay and keep 3 instances of
    their application always running, which can significantly reduce application latency.
  • Developers can now enable Warmup Requests. By specifying a handler in an app’s appengine-web.xml, App Engine will attempt to send a Warmup Request to initialize new instances before a user interacts with it. This can reduce the latency an end-user sees for initializing your application.
  • The Channel API is now available for all users.
  • Task Queue has been officially released, and is no longer an experimental feature. The API import paths that use ‘labs’ have been deprecated. Task queue storage will count towards an application’s overall storage quota, and will thus be charged for.
  • The deadline for Task Queue and Cron requests has been raised to 10 minutes.  Datastore and API deadlines within those requests remain unchanged.
  • For the Task Queue, developers can specify task retry-parameters in their queue.xml.
  • Metadata Queries on the datastore for datastore kinds, namespaces, and entity  properties are available.
  • URL Fetch allowed response size has been increased, up to 32 MB. Request
    size is still limited to 1 MB.
  • The Admin Console Blacklist page lists the top blacklist rejected visitors.
  • The automatic image thumbnailing service supports arbitrary crop sizes up to 1600px.
  • Overall average instance latency in the Admin Console is now a weighted  average over QPS per instance.
  • Added a low-level AsyncDatastoreService for making calls to the datastore asynchronously.
  • Added a getBodyAsBytes() method to QueueStateInfo.TaskStateInfo, this returns the body of the task state as a pure byte-string.
  • The whitelist has been updated to include all classes from javax.xml.soap.
  • Fixed an issue sending email to multiple recipients. http://code.google.com/p/googleappengine/issues/detail?id=1623

Heroku platform for scalable web applications

I’m so locked up in my own Java world that I didn’t realize something as cool as this existed in the Ruby world.

Heroku is the instant ruby platform. Deploy any ruby app instantly with a simple and familiar git push. Take advantage of advanced features like HTTP caching, memcached, rack middleware, and instant scaling built into every app. Never think about hosting or servers again.

From a layman’s point of view, Heroku looks like a Ruby version of GAE (Google App Engine). It has some of the same features as GAE. But unlike GAE, Heroku actually talks about their architecture in great detail.

They use Nginx as the front-end HTTP reverse proxy server and Varnish for caching right behind Nginx. They wrote their own custom software to “route” requests between the web frontend and the backend services. The actual user code runs on the “Dyno Grid”, where each dyno looks like a self-contained Ruby instance with user code (compiled slugs).

There could be multiple “dynos” on the same server, and a user application could use multiple “dynos” on the same or different servers. Since each “dyno” comes preconfigured with the user’s database and cache connection information, there is absolutely nothing else (configuration wise) a “compiled slug” needs to do its job.

The “routing mesh” tracks detailed performance data for each of the apps and load balances as required. An unresponsive “dyno” is marked and replaced automatically. Based on the documentation, they can initialize new dynos in about 2 seconds.
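To make the idea concrete, here is a toy sketch of what "mark and replace an unresponsive dyno, then load balance" could look like. This is not Heroku's code: the `Dyno` and `RoutingMesh` classes, the `healthy` flag and the "fewest requests in flight" rule are all my own assumptions, picked only to illustrate the behavior described above.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Dyno:
    """Hypothetical stand-in for one dyno (a single process running user code)."""
    dyno_id: int
    healthy: bool = True
    in_flight: int = 0  # requests currently being served


class RoutingMesh:
    """Toy router: replaces unresponsive dynos and picks the least busy one."""

    def __init__(self, size):
        self._ids = count(1)
        self.dynos = [self._spawn() for _ in range(size)]

    def _spawn(self):
        # Assumption: a fresh dyno starts healthy with no load.
        return Dyno(dyno_id=next(self._ids))

    def replace_unresponsive(self):
        # Swap every dyno marked unhealthy for a newly spawned one.
        self.dynos = [d if d.healthy else self._spawn() for d in self.dynos]

    def route(self):
        # Replace dead dynos first, then send the request to the dyno
        # with the fewest requests in flight (an assumed balancing rule).
        self.replace_unresponsive()
        target = min(self.dynos, key=lambda d: d.in_flight)
        target.in_flight += 1
        return target


if __name__ == "__main__":
    mesh = RoutingMesh(size=3)
    mesh.dynos[0].healthy = False  # simulate an unresponsive dyno
    chosen = mesh.route()
    print("request routed to dyno", chosen.dyno_id)
```

The real routing mesh works from detailed performance data rather than a simple counter, so treat the balancing rule here as a placeholder.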

A dyno, in case you are curious, is a single process running your code, somewhat like a JRE container. And it looks like they put about 4 dynos on each core (CPU) they have on a server. The POSIX view of the system available to the Ruby VM is read-only, and though they don’t use OS virtualization to separate each dyno, they do separate them using “unix permissions”. I guess that means each dyno has its own unique userid/groupid pair. I don’t have much experience with Ruby, but for those who care, they use “plain-vanilla MRI ruby”.

Just like how GAE/Python uses a stripped-down version of Django and GAE/Java uses a stripped-down version of Jetty as their app server, a dyno uses a thin version of Mongrel. It also uses “Rack”/“Rack Middleware” for the app’s interaction with Mongrel/the webserver.

Now here is another interesting implementation choice they went with. To update your app, all you have to do is check in your changes using “Git”, and Heroku will take care of compiling your slug and deploying it for you. I wish GAE was like this.

The pricing looks slightly higher than raw EC2 cost, but you need to understand that Heroku is a platform (PAAS) and not infrastructure (IAAS). They take care of the stuff you would otherwise have to struggle with if you were on AWS.

They also have some pretty interesting “Add-ons”. The one I liked was “Websolr”, a custom implementation of the “Solr” full-text search engine, which is in turn based on Lucene.

I’m curious whether any of you have used Heroku; if so, please comment on what you think of it. The devil is in the details.

Related interesting Links:

  1. http://highscalability.com/heroku-simultaneously-develop-and-deploy-automatically-scalable-rails-applications-cloud
  2. http://sazbean.com/2008/05/29/interview-with-james-lindenbaum-ceo-of-heroku/
  3. http://ec2onrails.rubyforge.org/

Understanding Cloud computing efficiency

Picking a cloud service is, unfortunately, at times far more complex than picking a brand new car. I remember how torn I was between a Honda hybrid, which came with some tax rebates and a carpool sticker, and a non-hybrid one which was significantly cheaper. Understanding the short-term and long-term benefits is the key.

Today AWS is not the only game in town. There are lots of other reliable (or some flavor of) options. GoGrid, Joyent, Microsoft and Google App Engine are some.

Here are the key differences which one should understand before deciding which one to go for.

* IAAS (Infrastructure as a service) providers like AWS (EC2) and Rackspace provide virtual infrastructure which you can manage and control. In most cases you are billed by a time-unit and you would have control to increase or decrease resources available for your application. PAAS (Platform as a service) on the other hand only provides APIs for your application. PAAS based infrastructure is usually billed by number of requests or by the CPU cycles spent on supporting the requests. Microsoft’s Azure places itself somewhere in between these two paradigms which makes this even more interesting.

* If your application’s resource requirements fluctuate a lot on a daily basis and you don’t want to invest in building a scalable architecture and the logic to manage/monitor the process of scaling up and down, then a PAAS-based service might help you. But if you want higher performance, more control of your code and infrastructure (and the way it scales), then IAAS is the way to go.

* If you have consistent load throughout the year, you should think about reserving resources for a longer term if possible. It could turn out to be cheaper. But at the same time, the more servers/resources you reserve, the more expensive it gets for you. There is a point at which it might be cheaper to host the infrastructure yourself.

* If your application has short but high CPU resource peaks, you should look at a vendor which doesn’t impose a performance ceiling. “The BitSource” did some performance tests comparing Rackspace and Amazon EC2 which explain this problem very well.

* Finally, if you already have a large computing infrastructure within your organization and want more “long term” computing resources, then based on the studies I have seen, it’s cheaper to set up and manage new servers/storage within the organization than to outsource it to AWS/Rackspace.
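The billing difference in the first point (time-unit billing for IAAS, per-request billing for PAAS) is easier to reason about with a back-of-the-envelope calculation. The sketch below is only that: the function names are mine, and the prices in the example are made-up placeholders, not any vendor's actual rates. It also assumes flat pricing with no free tier or volume discount.

```python
HOURS_PER_MONTH = 730  # rough average, used only for estimation


def iaas_monthly_cost(instances, hourly_rate, hours=HOURS_PER_MONTH):
    """Time-unit billing: you pay for every instance-hour, busy or idle."""
    return instances * hourly_rate * hours


def paas_monthly_cost(requests, price_per_million):
    """Request-based billing: you pay only for the traffic you serve."""
    return requests / 1_000_000 * price_per_million


def break_even_requests(instances, hourly_rate, price_per_million,
                        hours=HOURS_PER_MONTH):
    """Monthly request volume at which both models cost the same."""
    monthly = iaas_monthly_cost(instances, hourly_rate, hours)
    return monthly / price_per_million * 1_000_000


if __name__ == "__main__":
    # Hypothetical numbers: 2 instances at $0.10/hour vs $0.50 per million requests.
    volume = break_even_requests(instances=2, hourly_rate=0.10,
                                 price_per_million=0.50)
    print(f"PAAS is cheaper below roughly {volume:,.0f} requests per month")
```

Below the break-even volume the per-request model wins on price; above it, paying for instances by the hour starts to look better, which is the same trade-off the reserved-versus-self-hosted point describes at a larger scale.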

At the end of the day, remember that vendors are there to make money as well. If you plan to make a significant long-term investment in cloud services, you should do some research to make sure that it is really the cheapest solution.