Private clouds: By Amazon

A few days ago I blogged about how VMware is going to make a huge push into “private clouds” around the VMworld 2009 conference. Little did we know that Amazon had something up its sleeve as well, and it announced it today.

AWS now supports the creation of a Virtual Private Cloud with a private address space (including RFC 1918 ranges), which can be locked down by a VPN connection so that only your organization can reach it. You still get most of the benefit of Amazon’s cheap hardware pricing, but you get to lock down the infrastructure for security reasons.
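To make the "private address space" part concrete, here is a minimal Python sketch that checks whether a candidate address block falls inside the RFC 1918 ranges. It uses only the standard library `ipaddress` module; the function name `is_rfc1918` and the sample blocks are my own illustration and are not part of any AWS API.

```python
import ipaddress

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(cidr):
    """Return True if the whole CIDR block lies inside one RFC 1918 range."""
    network = ipaddress.ip_network(cidr)
    if network.version != 4:
        # RFC 1918 only covers IPv4.
        return False
    return any(network.subnet_of(private) for private in RFC1918_RANGES)


if __name__ == "__main__":
    for block in ["10.0.0.0/16", "172.32.0.0/16", "192.168.1.0/24"]:
        print(block, is_rfc1918(block))
```

A check like this is handy before you pick an address range for the cloud side, since it should not collide with or fall outside what your own network already treats as private.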

Regardless of how you see it, this is huge for IT and the developer community. Some may love it, and I’m sure some will be pretty angry at Amazon for trying to commoditize security and making network security look as simple as that.

With VMware’s announcements next week, there is no doubt in my mind that there will be a significant push towards “private clouds” for at least the next year.

Steps to migrate your webapp to AWS

Most web applications need at least the following services to be self-sufficient: computational power, storage, web server/CDN, database, messaging, load balancer and monitoring.

Here are the tried and tested steps, as recommended by the AWS folks:

  1. Move static web content to S3 storage first. Images, CSS stylesheets, JavaScript files, HTML, etc. can all be moved to S3. It’s easier to move some static content than others, so there might be some work required to understand how to break up web content to move parts of it into the cloud.
  2. The content on S3 can be served by Amazon CloudFront, which is Amazon’s CDN (content delivery network) service. Once your data is persisted on S3 and served through CloudFront, your users will get those objects from the edge location closest to them.
  3. Move the applications and web server layer to the EC2 infrastructure. This step will require you to figure out how to automate deployments into cloud infrastructure.
  4. Once your apps are in the cloud, you can start spreading them across availability zones to make your infrastructure tolerant of Amazon datacenter failures. For example, if you have apps deployed across the US and Europe and the US datacenters have problems, the European datacenters would be able to absorb the shock and keep your services available.
  5. Start using Amazon’s auto-scaling functionality to add/remove infrastructure automatically depending on the load on the system.
  6. The most complicated part might be moving your databases to the AWS cloud. If you plan to keep your databases on an RDBMS (MySQL/PostgreSQL), then you should try EBS (Elastic Block Store) and figure out how to take snapshots to S3. You should also figure out how to do DB replication across availability zones to keep your site available during single datacenter failures.
  7. At this point, since most of your application components are in the cloud, you should be able to start using new Amazon services to make your service even better. One possible example is SQS, which allows frontend applications to queue requests for other parts of the application (or DB) for asynchronous processing.
  8. Investigate the possibility of moving more of the DB components to S3 and SimpleDB to reduce the need for an RDBMS as much as possible. S3 is ideal for storing large objects, while SimpleDB is ideal for small stubs of data. A lot of applications that use these services use them together.
  9. After your apps are all configured on AWS, this would be a good time to set up monitoring. Amazon provides the CloudWatch service, which allows you to monitor your applications.
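Step 1 mentions that some work is needed to figure out how to break up web content. As a rough first pass, something like the Python sketch below can partition a list of files into "static, candidate for S3" and "everything else". The extension list and the function name `split_static_content` are assumptions of mine for illustration; a real site will need its own rules.

```python
import os

# Hypothetical rule of thumb: files with these extensions count as static content.
STATIC_EXTENSIONS = {".png", ".jpg", ".gif", ".css", ".js", ".html"}


def split_static_content(paths):
    """Partition file paths into (static, dynamic) lists by file extension."""
    static, dynamic = [], []
    for path in paths:
        extension = os.path.splitext(path)[1].lower()
        if extension in STATIC_EXTENSIONS:
            static.append(path)
        else:
            dynamic.append(path)
    return static, dynamic


if __name__ == "__main__":
    files = ["img/logo.PNG", "css/site.css", "app/index.php", "js/app.js"]
    static, dynamic = split_static_content(files)
    print("move to S3 first:", static)
    print("needs a closer look:", dynamic)
```

Anything that lands in the second list (templates, generated pages, uploads) is where the "some work required" part of step 1 usually shows up.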

Issues to worry about

Moving to the cloud can be full of small potholes. If you understand and anticipate them, the move will be easier. Here are some you should be careful about:

  1. S3 is “eventually consistent”, which means that data saved to S3 may not be immediately available on read. It’s also possible that if the same content is updated on two different S3 servers at the same time, one of the writes will be lost. This is not always bad, and once you understand it you will realize that there are ways around it.
  2. The loadbalancer service Amazon provides doesn’t support SSL.
  3. SimpleDB has a per-row maximum size limit. This is why SimpleDB is better suited to keeping searchable metadata that references the complete data, which can be kept in S3.
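The SimpleDB-plus-S3 pattern from the last point can be sketched in a few lines of Python. Note that the two dictionaries below are in-memory stand-ins that I made up for illustration; they are not the AWS APIs, and the function names are hypothetical. The point is only the shape of the data: a small searchable record that points at a large object stored elsewhere.

```python
# In-memory stand-ins for the two services; these dicts are NOT the AWS APIs.
blob_store = {}      # plays the role of S3: key -> large object
metadata_store = {}  # plays the role of SimpleDB: item name -> small attributes


def save_document(doc_id, title, body):
    """Store the large body as a blob and keep a small searchable record beside it."""
    blob_key = "documents/" + doc_id
    blob_store[blob_key] = body
    metadata_store[doc_id] = {
        "title": title,
        "size": len(body),
        "blob_key": blob_key,  # pointer from the metadata to the full data
    }


def find_bodies_by_title(word):
    """Search the small metadata records, then follow the pointer to the full data."""
    return [
        blob_store[record["blob_key"]]
        for record in metadata_store.values()
        if word.lower() in record["title"].lower()
    ]


if __name__ == "__main__":
    save_document("42", "Cloud migration notes", b"x" * 5000)
    save_document("43", "Lunch menu", b"soup")
    print(len(find_bodies_by_title("cloud")[0]))
```

Keeping only the title, size and pointer in the metadata record is what keeps each row small, no matter how large the stored object gets.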

Parts of this post were summarized from Jinesh’s talk at the “AWS Start-up Tour 2009”.