July 28, 2009

VMware: internal + external “private” clouds

Last year at the VMworld 2008 conference, VMware discussed something called vCloud. Before VMworld 2009, they will be announcing external cloud providers around that platform, which will allow internal clouds to extend their infrastructure to external clouds.

What VMware is trying to do is allow organizations to build cloud networks with the possibility of moving a few services/components to external clouds.

To make this seamless, the VMware vSphere tool, which currently allows internal cloud management, will be enhanced so that it can manage instances on the external cloud almost as if they were part of the internal cloud. In fact, if the rumors are true, they will even support vMotion across to external cloud providers (restrictions apply).

VMware is getting on the cloud bandwagon in a big way… just take a look at the number of sessions they have mentioning cloud.

July 20, 2009

Scalability for dummies

Alex Barrera has a very interesting post about how frustrating it is to discover that you have a scalability problem, and how much trouble it is to fix it after the product is live.

I am there, I am suffering the redesign phase (twice now). It’s hard, it’s lonely, it’s discouraging and frustrating, but it needs to be done. I just wrote this post so that outsiders can get a glimpse of what is it to be there and how it affects the whole company, not just the tech department. Scalability problems aren’t something you can discard as being ONLY technical, it’s roots might be technical but its effects will shake the whole company.

The post actually reminded me of this post by Marton Trencseni which talks about the phases of improvement in scalability architecture a product goes through and digs a little deeper into what could have prevented it.

For startups, or for companies which are just prototyping new ideas, the goal can sometimes be just to “test the waters”, and the product owners don’t care much about allocating/reserving enough resources for engineering to build it the “right way”. And there is a good reason for that as well, since a lot of prototypes (or early products) die off soon after launch because of issues completely unrelated to scalability. It’s hard to decide whether to test the idea first or devote a lot of resources to getting it done the right way from day one.

July 17, 2009

Weekend reading material

 

Products/Ideas

  1. redis - http://code.google.com/p/redis/ : Redis is a key-value database. It is similar to memcached but the dataset is not volatile, and values can be strings, exactly like in memcached, but also lists and sets with atomic operations to push/pop elements.
  2. HBase - http://hadoop.apache.org/hbase/ : HBase is the Hadoop database. It's an open-source, distributed, column-oriented store modeled after the Google paper, Bigtable: A Distributed Storage System for Structured Data by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Hadoop.
  3. Sherpa - http://research.yahoo.com/node/2139
  4. BigTable - http://labs.google.com/papers/bigtable-osdi06.pdf
  5. voldemort - It is basically just a big, distributed, persistent, fault-tolerant hash table. For applications that can use an O/R mapper like active-record or hibernate this will provide horizontal scalability and much higher availability but at great loss of convenience. For large applications under internet-type scalability pressure, a system may likely consist of a number of functionally partitioned services or apis, which may manage storage resources across multiple data centers using storage systems which may themselves be horizontally partitioned. For applications in this space, arbitrary in-database joins are already impossible since all the data is not available in any single database. A typical pattern is to introduce a caching layer which will require hashtable semantics anyway. For these applications Voldemort offers a number of advantages
  6. Dynamo - A highly available key-value storage system that some of Amazon’s core services use to provide an “always-on” experience.  To achieve this level of availability, Dynamo sacrifices consistency under certain failure scenarios. It makes extensive use of object versioning and application-assisted conflict resolution in a manner that provides a novel interface for developers to use.
  7. Cassandra - Cassandra is a highly scalable, eventually consistent, distributed, structured key-value store. Cassandra brings together the distributed systems technologies from Dynamo and the data model from Google's BigTable. Like Dynamo, Cassandra is eventually consistent. Like BigTable, Cassandra provides a ColumnFamily-based data model richer than typical key/value systems.
  8. Hypertable - Hypertable is an open source project based on published best practices and our own experience in solving large-scale data-intensive tasks.
  9. HDFS - The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware. It has many similarities with existing distributed file systems. However, the differences from other distributed file systems are significant. HDFS is highly fault-tolerant and is designed to be deployed on low-cost hardware. HDFS provides high throughput access to application data and is suitable for applications that have large data sets.

Blog/Posts/Links

  1. Eventually Consistent 
  2. Bunch of Links at bytepawn
  3. Fallacies of Distributed Computing

Is Yahoo launching a cloud storage solution : MObStor

While the rest of the world is busy with Microsoft and Google, Yahoo might be preparing to launch MObStor, which they tout as the “Unstructured Storage for the Internet”.

While comparing MObStor to the various cloud computing storage solutions already available, Navneet Joneja, Sr. Product Manager, mentions Facebook’s Haystack to describe MObStor’s architectural design. He also points out that while Facebook’s Haystack was optimized to store photographs, MObStor was optimized for a diverse set of use cases.

It’s a REST-based, browser-accessible API with a simple security model and content-agnostic storage features. The focus of this service seems to be fast, reliable, secure storage with the option of allowing customers to layer additional services on top of the core service. It claims it will be optimized for high performance and high availability (who doesn’t?).

Here is more from the Yahoo Developer Network Blog

Facebook's Haystack is based on commodity storage. While MObStor does support commodity storage, it doesn't require it. Instead, we have a storage-layer abstraction we call the ObjectStore. The ObjectStore encapsulates the key storage operations we need to perform, and allows us to have many underlying physical object stores. This allows us to mix, for example, filer-based storage with commodity storage. The upper layers have the routing intelligence that determines which ObjectStore a given piece of data is stored in. However, like Haystack, we do support high request rates using our own optimized ObjectStore written to run on commodity hardware - with one important difference. While Haystack identifies every object using a 64-bit photo key, all objects in MObStor are accessible through logical (i.e., client-supplied) URLs, not object IDs.

In MObStor, the storage layer maintains the mapping between logical URLs and physical storage, and can use any means to do so - the implementation is encapsulated within the storage layer. Needless to say, this operation is a potential performance bottleneck, so we've carefully optimized the algorithms used and the hardware that they run on.
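To make the quoted description a little more concrete, here is a minimal Python sketch of a storage layer that maps client-supplied logical URLs onto several underlying object stores. This is not Yahoo's implementation: the class names (`InMemoryObjectStore`, `StorageLayer`), the prefix-based routing rule, and the example URLs are all assumptions made up for illustration.

```python
class InMemoryObjectStore:
    """Hypothetical stand-in for one physical store (filer, commodity disks, ...)."""

    def __init__(self, name):
        self.name = name
        self._blobs = {}

    def put(self, key, data):
        self._blobs[key] = data

    def get(self, key):
        return self._blobs[key]


class StorageLayer:
    """Keeps the logical URL -> (store, physical key) mapping private to this layer."""

    def __init__(self, default_store):
        self._default = default_store
        self._routes = []   # (url_prefix, store) pairs, checked in order
        self._index = {}    # logical URL -> (store, physical key)
        self._next_id = 0

    def add_route(self, url_prefix, store):
        self._routes.append((url_prefix, store))

    def _pick_store(self, url):
        # Assumed routing rule: the first matching prefix wins.
        for prefix, store in self._routes:
            if url.startswith(prefix):
                return store
        return self._default

    def put(self, url, data):
        store = self._pick_store(url)
        self._next_id += 1
        key = f"obj-{self._next_id}"   # physical key is never shown to clients
        store.put(key, data)
        self._index[url] = (store, key)
        return store.name

    def get(self, url):
        store, key = self._index[url]
        return store.get(key)


commodity = InMemoryObjectStore("commodity")
filer = InMemoryObjectStore("filer")
layer = StorageLayer(default_store=commodity)
layer.add_route("/video/", filer)

layer.put("/photos/cat.jpg", b"jpeg bytes")
layer.put("/video/talk.mp4", b"mp4 bytes")
print(layer.get("/photos/cat.jpg"))
```

Clients only ever see the logical URL; which store holds the bytes, and under what key, stays an internal detail that can change without breaking them.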

Now with Amazon, Google, Microsoft and Yahoo in the picture the last shoe might finally drop.

July 12, 2009

CouchDB scalability issues ? (updated)

Jonathan Ellis started a storm when he posted an entry about CouchDB about 6 months ago. He questioned some of CouchDB’s claims and made an attempt to warn users who don’t understand the practical issues around CouchDB very well.

After reading his post and some comments, it looked like he was specifically concerned about CouchDB’s ability to distribute/scale a growing database automatically.

It’s a good read if you are curious. He has stopped accepting comments on his blog, but that shouldn’t stop you from commenting here.

As Jan pointed out in the comments, Jonathan is assuming “distributed” means “auto-scaling”, which is not true.

Links from the blog: Cassandra, Dynomite, Sawzall, Pig

July 11, 2009

Cloud architecture: Notes from an Amazon talk

 

Some notes from a talk I was at. Didn’t get time to write it in detail. But hey, something is better than nothing… right ?

Design for failure

        - handle failure
            - use elastic ip addresses
            - use multiple amazon ec2 availability zones
            - create multiple database slaves across multiple zones
            - use real-time monitoring (amazon cloudwatch)
            - use amazon EBS for persistent file system
                - snapshot database to s3 (from ebs)
  

Loose coupling sets you free

        - independent components
        - design everything as a blackbox
        - de-coupling for hybrid models
        - loadbalance-clusters
        - use SQS as buffers to queue messages. Allows elasticity
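The "queue as a buffer" note can be sketched with nothing but the Python standard library. This is only an illustration of the idea, not SQS itself: `queue.Queue` stands in for the SQS queue, and the function name `run_pipeline` and the "double the number" job are made up for the example.

```python
import queue
import threading


def run_pipeline(jobs, worker_count):
    """Push jobs into a buffer and let a pool of workers drain it."""
    buffer = queue.Queue()            # stands in for an SQS queue
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            job = buffer.get()
            if job is None:           # sentinel: no more work for this worker
                buffer.task_done()
                return
            with results_lock:
                results.append(job * 2)   # placeholder for real processing
            buffer.task_done()

    workers = [threading.Thread(target=worker) for _ in range(worker_count)]
    for w in workers:
        w.start()

    for job in jobs:                  # the producer never waits on a worker
        buffer.put(job)
    for _ in workers:                 # one sentinel per worker
        buffer.put(None)

    buffer.join()
    for w in workers:
        w.join()
    return sorted(results)


print(run_pipeline(range(5), worker_count=1))
print(run_pipeline(range(5), worker_count=3))
```

The producer code is identical whether one worker or three are draining the buffer, which is the elasticity point: consumers can be added or removed without touching the component that generates the work.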
  

Design for dynamism


        - build for changes in infrastructure 
            - Don't assume health or fixed location of components
            - Use designs that are resilient to reboot and re-launch
            - Bootstrap your instances
            - Enable dynamic configuration
                - Enable Self discovery
                    (puppet, chef, ?)
            - Free auto-scaling features (by triggers)
            - Use Elastic loadbalancing on multiple layers
            - Use configurations in SimpleDB to bootstrap instances
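A rough sketch of what "bootstrap your instances" plus "self discovery" could look like. Everything here is hypothetical: plain dicts stand in for a SimpleDB domain and an instance registry, and the role names, package names, instance IDs and addresses are invented for the example.

```python
# A plain dict stands in for a SimpleDB domain; every key and value is made up.
CONFIG_STORE = {
    "role:web": {"packages": ["nginx"], "upstream_role": "app"},
    "role:app": {"packages": ["python"], "upstream_role": "db"},
    "role:db": {"packages": ["mysql"], "upstream_role": None},
}

# Hypothetical registry of running instances; instances add themselves on boot.
REGISTRY = {}


def bootstrap(instance_id, role, address):
    """Return the settings a freshly launched instance should apply."""
    settings = CONFIG_STORE[f"role:{role}"]
    REGISTRY.setdefault(role, {})[instance_id] = address   # self-registration

    # Discover the peers this role talks to instead of hard-coding addresses.
    upstream_role = settings["upstream_role"]
    if upstream_role is None:
        upstreams = []
    else:
        upstreams = sorted(REGISTRY.get(upstream_role, {}).values())
    return {"install": list(settings["packages"]), "upstreams": upstreams}


bootstrap("i-db1", "db", "10.0.0.5")
plan = bootstrap("i-app1", "app", "10.0.0.7")
print(plan)
```

Because the instance image carries only the lookup logic, a rebooted or relaunched machine picks up its role and its peers' current addresses at boot rather than relying on a fixed location.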

Build security in every layer


        - Physical is free
        - network is easy
            - Can configure app to talk to only web and db layer... etc. Everything can be automated.
        - The rest can be added
            - Create distinct Security Groups for each Amazon EC2 cluster
            - Use group-based rules for controlling access between layers
            - Restrict external access to specific IP ranges
            - Encrypt data "at-rest" in Amazon S3
            - Encrypt data "in-transit" (SSL)
            - Consider encrypted file systems in EC2 for sensitive data

Don't fear constraints

        - More RAM ?
            Distribute load across machines. Shared distributed cache
        - Better IOPS on my database ?
            Multiple read-only / sharding / DB clustering
        - Your server has better config ?
            Implement elasticity
        - Static IP ?
            Boot script for software reconfiguration from SimpleDB
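As an illustration of the "read-only / sharding" answer to the IOPS question, here is a small Python sketch that routes writes to a shard's primary and spreads reads over its replicas. It is a toy under stated assumptions: the class name `ShardedDatabase`, the host names, and the hash-modulo placement rule are invented for the example and were not part of the talk.

```python
import hashlib
import itertools


class ShardedDatabase:
    """Routes keys to shards by hash; reads rotate over each shard's replicas."""

    def __init__(self, shards):
        # shards: list of {"primary": name, "replicas": [names]}
        self._shards = shards
        self._replica_cycles = [itertools.cycle(s["replicas"]) for s in shards]

    def shard_index(self, key):
        # Use a stable hash (not Python's per-process hash()) so that
        # routing stays the same across restarts.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self._shards)

    def node_for_write(self, key):
        return self._shards[self.shard_index(key)]["primary"]

    def node_for_read(self, key):
        return next(self._replica_cycles[self.shard_index(key)])


# Hypothetical host names.
db = ShardedDatabase([
    {"primary": "db0-primary", "replicas": ["db0-ro-a", "db0-ro-b"]},
    {"primary": "db1-primary", "replicas": ["db1-ro-a", "db1-ro-b"]},
])

print(db.node_for_write("user:42"))
print(db.node_for_read("user:42"), db.node_for_read("user:42"))
```

One caveat of plain hash-modulo placement is that changing the number of shards moves most keys; schemes such as consistent hashing exist to soften that.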

  

Leverage aws storage solutions


        - Amazon S3: for large static objects (what's the maximum size per object ?)
        - Amazon CloudFront: content distribution
        - Amazon SimpleDB: simple data indexing/querying
        - Amazon EC2 local disk drive: transient data
        - Amazon EBS: RDBMS persistent storage + S3 Snapshots

Is Percentage of company Bloggers/Twitter_users inversely proportional to Company size ?

Small organizations often keep a very active online presence. For them, any news is good news. Larger organizations, however, try to do the opposite and control information.

What I’ve been trying to understand is how, in spite of all that, companies like Google and Microsoft still manage to have a huge online presence.

No.Of.TwitterAccounts= (Size.Of.Company)^(1/2)  ?

For example today, Google announced a list of all of its Twitter accounts in one page. 

How do they do it ?

General
twitter.com/Google - our central account
twitter.com/Blogger - for Blogger fans
twitter.com/GoogleCalendar - user tips & updates
twitter.com/GoogleImages - news, tips, tricks on our visual image search
twitter.com/GoogleNews - latest headlines via Google News
twitter.com/GoogleReader - from our feed reader team
twitter.com/iGoogle - news & notes from Google's personalized homepage
twitter.com/GoogleStudents - news of interest to students using Google
twitter.com/YouTube - for YouTube fans
twitter.com/YouTubeES - en Espanol
twitter.com/GoogleAtWork - solutions for IT and workplace productivity

Geo-related
twitter.com/SketchUp - Google SketchUp news
twitter.com/3DWH - SketchUp's 3D Warehouse
twitter.com/Modelyourtown - 3D modeling to build your favorite places
twitter.com/EarthOutreach - Earth & Maps tools for nonprofits & orgs
twitter.com/GoogleMaps - uses, tips, mashups
twitter.com/GoogleSkyMap - Android app for the night sky

Ads-related
twitter.com/AdSense - for online publishers
twitter.com/AdWordsHelper - looking out for AdWords questions and tech issues
twitter.com/AdWordsProSarah - Google Guide for AdWords Help Forum
twitter.com/GoogleAnalytics - insights for website effectiveness
twitter.com/GoogleAdBuilder - re building display ads
twitter.com/GoogleRetail - for retail advertisers
twitter.com/TechnologyUK - for U.K. tech advertisers
twitter.com/InsideAdWordsDE - for German AdWords customers
twitter.com/GoogleAgencyDE - for German ad agencies
twitter.com/AdSensePT - info for Portuguese-language publishers
twitter.com/AdWordsRussia - AdWords news & tips in Russian
twitter.com/DentroDeAdWords - Spanish updates from the Inside AdWords blog
twitter.com/AdWordsAPI - AdWords API tips

Developer & technical
twitter.com/GoogleResearch - from our research scientists
twitter.com/GoogleWMC - Google Webmaster Central
twitter.com/GoogleCode - latest updates for Google developer products
twitter.com/GoogleData - Data APIs provide a standard protocol for reading and writing web data
twitter.com/app_engine - web apps run on Google infrastructure
twitter.com/DataLiberation - our initiative for complete import/export of all data
twitter.com/GoogleMapsAPI - about using Google Maps embedded in websites
twitter.com/GoogleIO - Google's largest annual developer event

Culture, People
twitter.com/googletalks - notes from our @Google speaker series
twitter.com/googlejobs - the voice of Google recruiters

Country or Region
twitter.com/googledownunder - Google activities in Australia & New Zealand
twitter.com/GoogleDE - Google in Germany
twitter.com/GoogleLatAm - Latin America (en Espanol)
twitter.com/GooglePolicyIt - Notes on Google policy issues in Italy

July 04, 2009

Cell phone speeds, reliability in US

Novarum and PC World ran a nationwide test of cell phone service providers to compare the three big cell giants.

I was shocked at how crappy the AT&T wireless network’s reliability is in the city I live in. No wonder people have been constantly complaining about service problems.

I wish Apple had gone with Verizon for iPhone… I’ve used Verizon for years (before I switched to AT&T) and was pretty happy with them.

Novarum test results.