Amazon launches CloudFront

November 18th, 2008

Amazon has finally opened the doors of its new CDN (Content Delivery Network) called CloudFront. But instead of building a completely new product it has interestingly expanded its S3 network to include content replication for lower latency content delivery. By not reinventing a whole new way of uploading data to the CDN network, Amazon has seriously cut down the cost for end users to try out this technology.image Most of the CDNs I’ve investigated do very well with static content which needs to be periodically refreshed somehow.

There is at least one service from Akamai called WAA - Web application accelerator which seem to understand the importance of accelerating extremely dynamic content using intelligent routing and closer points of presence to end user. WAA doesn’t put the content closer to the end user, but provides an extremely efficient conduit for this traffic where Akamai controls both ends network by placing a POP in front of the client and the server. By doing this Akamai can take control of TCP/IP window sizes within its network and provide a low latency, higher bandwidth response to the customer. In addition to all this Akamai also provides an option to cache some data ( as defined in the HTTP headers, or WAA configuration ) to be cached for a longer duration.

Though Amazon might be doing replication as well, it may be closer to the Akamai’s WAA model than what you thought. Its kind of obvious that if the data is going to change all the time, there has to be some kind of master-slave concept, and its also clear that if many people are accessing that data around the world it has to be transported through a very efficient high bandwidth network to the various Amazon Points of presence around the world. And finally just like the Akamai’s WAA model, it probably does the cache content to serve the content directly from its local cache incase the object hasn’t changed on the master since the last time it was retrieved.

A month ago I went shopping, looking for alternatives to Akamai’s WAA and didn’t find anyone. I suspect CloudFront changes that a little bit. One significant difference between Amazon and most CDNs out there including CloudFront is that there is relatively very little work which needs to be done by the developer to integrate with WAA. This is not true with most CDNs, and certainly not true for CloudFront if you are not already on S3. But it does change the dynamics of this industry.

Popularity: 9%

Scaling Early: Feedjit

November 10th, 2007

Mark Maunder from Feedjit make an interesting presentation about scaling early. He focuses on some of the key operational issues related to the web server and server caching which I found very interesting.

SlideShare | View | Upload your own

Popularity: 50%

Mysql on HDFS

November 4th, 2007

A short thought provoking post by Mark Callaghan about running Mysql over HDFS. Its probably not ideal, but its an interesting thought regardless.

Popularity: 47%

Scaling technorati - 100 million blogs indexed everyday

October 25th, 2007

Indexing 100 million blogs with over 10 billion objects, and with a user base which is doubling every six months, technorati has an edge over most blog search engines. But they are much more than search, and any technorati user can explain you that. I recommend you read John Newton’s interview with David Sifry which I found fascinating. Here are the highlights from the interview if you don’t have time to read the whole thing

  • Current status of technorati
    • 1 terabyte a day added to its content storage
    • 100 million blogs
    • 10 billion objects
    • 0.5 billion photos and videos
    • Data doubling every six months
    • Users doubling every six months
  • The first version was supposed to be for tracking temporal information on low budget.
    • That version put everything in relational database which was fine since the index sizes were smaller then physical memory
    • It worked fine till about 20 million blogs
  • The next generation took advantage of parallelism.
    • Data was broken up into shards
    • Synced up frequently between servers
    • The database size reached largest known OLTP size.
      • Writing as much data as reading
      • Maintaining data integrity was important
        • This put a lot of pressure on the system
  • The third generation
    • Shards evolved
      • The shards were based on time instead of urls
      • They moved content to special purpose databases instead of relational database
    • Don’t delete anything
    • Just move shards around and use a new shard for latest stuff
  • Tools used
    • Green plum - enables enterprises to quickly access massive volumes of critical data for in-depth analysis. Purpose built for high performance, large scale BI, Greenplum’s family of database products comprises solutions suited to installations ranging from departmental data marts to multi-terabyte data warehouses.
  • Should have done sooner
    • Should have invested in click stream analysis software to analyze what clicks with the users
      • Can tell how much time users spend on a feature

Popularity: 86%

Scalability stories for Oct 22, 2007

October 22nd, 2007
  • Why most large-scale sites which scale are not written in java ?  -  ( What nine of the world’s largest websites are running on) -  A couple of very interesting blogs to read.
  • Slashdot’s setup Part 1 - Just in time for the 10 year anniversary.
  • Flexiscale - Looks like an amazon competitor in the hosting business.
  • Pownce - Lessons learned  - Lessons learned while developing Pownce, a social messaging web application
  • Amazon’s Dynamo - Dynamo is internal technology developed at Amazon to address the need for an incrementally scalable, highly-available key-value storage system. The technology is designed to give its users the ability to trade-off cost, consistency, durability and performance, while maintaining high-availability. [ PDF ]. This paper presents the design and implementation of Dynamo, a highly available key-value storage system that some of Amazon’s core services use to provide an “always-on” experience.  To achieve this level of availability, Dynamo sacrifices consistency under certain failure scenarios. It makes extensive use of object versioning and application-assisted conflict resolution in a manner that provides a novel interface for developers to use.
  • Client side loadbalancing - Not a good idea if you don’t know what you are getting into. But there are certain kinds of application which can benifit from this. BTW DNS,SMTP,NTP already have some kind of client side loadbalancing features implemented.

Popularity: 49%

Crawling sucks.

October 22nd, 2007

I wrote my first crawler in a few lines of perl code to spider a website recursively about 10 years ago. Two years ago I wrote another crawler in a few thousand lines using java+php and mysql. But this time I wasn’t really interested in competing with google, and instead crawled feeds (rss/atom). Google hadn’t released its blog search engine at that time. Using the little java knowledge I had and 3rd part packages like Rome and some HML parsers I hacked up my first crawler in a matter of days. The original website allowed users to train the bayesian based engine to teach it what kind of news items you like to read and automatically track millions of feeds to find the best content for you. After a few weeks of running that site, I started having renewed appreciation for the search engineers who breath this stuff day in and day out. That site eventually went down… mostly due to design flaws I made early which I’m listing here for those who love learning.

  • Seeding problem- DMOZ might have a good list of seed URLs for a traditional crawler, but there wasn’t a DMOZ like public data source for feeds which I could use. I ended up crawling sites providing OPMLs, and sites like weblogs.com for new feeds.
  • Spamming - It dawned on me pretty fast that sites like Weblogs.com was probably not the best place to look for quality feeds. Every tom dick and harry were pinging weblogs.com and so were all the spammers. The spam statistics blew me away. 60% of the blogs were spam according to a few analysts in 2005. I’m sure this number has gone up now.
  • The Size - I thought just crawling feeds would be easy to manage, since that doesn’t require images/css/etc to be archived. But I was so wrong. I crossed 40GB of storage within a week or two of crawling. I could always add new harddisk, but without the ability to detect spam nicely and without a scalable search platform this blog search engine was DOA.
  • I also underestimated number of posts I was collecting. I had to increase the byte size for some of the IDs in Java and Mysql.
  • Feed processing/Searching - Mysql is fast, but in the hands of a totally untrained professional like me, it can be a ticking time bomb. Though I made good start, I struggled to get a grip on how the indexing, inner/outer joins work. I had underestimated the complexities of databases.
  • Threading - I over-designed the threaded application without investing enough time to understand how threading works in java. That, with a few caching features I created became the memory leak I so much wanted to avoid. It was a mess :)
  • I liked PHP for its simplicity, and Java for its speed. Unfortunately my attempt to design the UI in PHP and leave the backend in Java didn’t work very well because of my lack of experience with PHP-java interoperability.
  • Feed update frequency - Some feeds update faster than others. To calculate when you need to crawl next is an interesting problem by itself. Especially because some feeds update more frequently in certain parts of the day than others. Apparently google reader’s backend checks for feed updates about every one hour. So if you have 10 million feeds to crawl thats about 2777 feed requests per second. There is no way I can do that from a single machine in my basement.
  • The worst annoying problem I had was not really my fault :) . I own a belkin wireless router which became extremely unstable when my crawlers ran. I had to resort to daily reboots of this device to solve the problem. And on busy days it required two.

The reason why I’m not embarrassed, blogging about my mistakes, is because I’m not a developer to begin with. And the second reason is that I’m about to take a second shot at it to see if I can do it better this time. The objective is not to build another search engine, but to understand and learn from your mistakes and do it better.

The first phase of this learning experience resulted in what you see currently on blogofy.com. The initial prototype displayed here does limited crawling to gather feed updates. The feed update algorithm is already a little smarter (it updates some more often than others). The quality of the feeds should also be better because of human behind the engine who manually adds the feeds to the database. I also realized soon after my first attempt that I should have investigated JSON to make Java and PHP talk to each other. In the current version of blogofy the core engine is in PHP except the indexing/storage engine which will soon move to Solr(which is java based). PHP has JSON based hooks to talk to Solr which seems to work very well. Solr incase you didn’t know is a very fast lucene based search engine which does much better than Mysql for the kind of search operations I would be doing. And yes, I replaced Belkin router with an apple wireless… so lets see how that works out to be.

Will send more updates if I make progress. If any of you have ideas on what else I should be doing please let me know.

Popularity: 73%

EC2 for everyone. And now includes 64bit with 15GB Ram too.

October 16th, 2007

 

Finally it happened. EC2 is available for everybody. And more than that they now provide servers with 7.5GB and 15GB of RAM per instance. Sweet.  

For a lot of companies EC2 was not viable due to high memory requirements of some of the applications. Splitting up such tasks to use less memory on multiple servers was possible, but not really cost and time efficient. The release of new types of instances removes that road block and would probably invoke significant curiosity from memory crunching application developers.

$0.10 - Small Instance (Default)

    1.7 GB of memory, 1 EC2 Compute Unit (1 virtual core with 1 EC2 Compute Unit), 160 GB of instance storage, 32-bit platform

$0.40 - Large Instance

    7.5 GB of memory, 4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each), 850 GB of instance storage, 64-bit platform

$0.80 - Extra Large Instance

    15 GB of memory, 8 EC2 Compute Units (4 virtual cores with 2 EC2 Compute Units each), 1690 GB of instance storage, 64-bit platform

      Data Transfer

      $0.10 per GB - all data transfer in

      $0.18 per GB - first 10 TB / month data transfer out
      $0.16 per GB - next 40 TB / month data transfer out
      $0.13 per GB - data transfer out / month over 50 TB

      Popularity: 56%

      Web Scalability dashboard

      October 16th, 2007

      [Blogofy: bringing feeds together ]

      I took a week’s break from blogging to work on one of my long overdue personal projects. Even though I use Google Reader as my feed aggregator I noticed a lot of folks still prefer a visual UI to track news and feeds. The result of my experimentation of designing such a Visual UI to track feeds lead me to create Blogofy

      If you have an interesting blog on Web Scalability, Availability or Performance which you want included here please let me know. The list of blogs on the page is in flux at the moment and I might move the feeds around a little depending on user feedback and blog activity.

      Popularity: 62%

      Scalable products: KFS released

      September 28th, 2007

      Kosmix, a search startup has released source to C++ implementation of something which looks like a clustered file system. This looks very similar to Hadoop/HDFS, but the C++ factor will be a big performance boost.Kosmic

      From Skrenta blog

        • Incremental scalability - New chunkserver nodes can be added as storage needs increase; the system automatically adapts to the new nodes.
        • Availability - Replication is used to provide availability due to chunk server failures.
        • Re-balancing - Periodically, the meta-server may rebalance the chunks amongst chunkservers. This is done to help with balancing disk space utilization amongst nodes.
        • Data integrity - To handle disk corruptions to data blocks, data blocks are checksummed. Checksum verification is done on each read; whenever there is a checksum mismatch, re-replication is used to recover the corrupted chunk.
        • Client side fail-over - During reads, if the client library determines that the chunkserver it is communicating with is unreachable, the client library will fail-over to another chunkserver and continue the read. This fail-over is transparent to the application.
        • Language support - KFS client library can be accessed from C++, Java, and Python.
        • FUSE support on Linux - By mounting KFS via FUSE, this support allows existing Linux utilities (such as, ls) to interface with KFS.
        • Leases - KFS client library uses caching to improve performance. Leases are used to support cache consistency.

      If anyone has experience with KFS, or has more information please leave a comment here.

      Popularity: 65%

      What is scalability ?

      September 22nd, 2007

      When asked what they mean by scalability, a lot of people talk about improving performance, about implementing HA, or even talk about a particular technology or protocol. Unfortunately, scalability is none of that. Don’t get me wrong. You still need to know all about speed, performance, HA technology, application platform, network, etc. But that is not the definition of scalability.

      Scalability, simply, is about doing what you do in a bigger way. Scaling a web application is all about allowing more people to use your application. If you can’t figure out how to improve performance while scaling out, its okay. And as long as you can scale to handle larger number of users its ok to have multiple single points of failures as well.

      There are two key primary ways of scaling web applications which is in practice today.

      • Vertical Scalability” - Adding resource within the same logical unit to increase capacity. An example of this would be to add CPUs to an existing server, or expanding storage by adding hard drive on an existing RAID/SAN storage.
      • Horizontal Scalability” - Adding multiple logical units of resources and making them work as a single unit. Most clustering solutions, distributed file systems, load-balancers help you with horizontal scalability.

      Every component, whether its processors, servers, storage drives or load-balancers have some kind of management/operational overhead. When you try to scale that, its important to understand what percentage of the resource is actually usable. This measurement is called “scalability factor“. If you loose 5% of a processor power every time you add a CPU to your system, then your “scalability factor” is 0.95. A scalability factor of 0.9 means you will only be able to use 90% of the resource.

      Scalability can be further sub-classified based on the “scalability factor”.

      • If the scalability factor stays constant as you scale. This is called “linear scalability“.
      • But chances are that some components may not scale as well as others. A scalability factor below 1.0 is called “sub-linear scalability“.
      • Though rare, its possible to get better performance (scalability factor) just by adding more components (i/o across multiple disk spindles in a RAID gets better with more spindles). This is called “supra-linear scalability“.
      • If the application is not designed for scalability, its possible that things can actually get worse as it scales. This is called “negative scalability“.

      If you need scalability, urgently, going vertical is probably going to be the easiest (provided you have the bank balance to go with it). In most cases, without a line of code change, you might be able to drop in your application on a super-expensive 64 CPU server from Sun or HP and storage from EMC, Hitachi or Netapp and everything will be fine. For a while at least. Unfortunately Vertical scaling, gets more and more expensive as you grow.

      Horizontal scalability, on the other hand doesn’t require you to buy more and more expensive servers. Its meant to be scaled using commodity storage and server solutions. But Horizontal scalability isn’t cheap either. The application has to be built ground up to run on multiple servers as a single application. Two interesting problems which most application in a horizontally scalable world have to worry about are “Split brain” and “hardware failure“.

      While infinite horizontal linear scalability is difficult to achieve, infinite vertical scalability is impossible. If you are building capacity for a pre-determined number of users, it might be wise to investigate vertical scalability. But if you are building a web application which could be used by millions, going vertical could be an expensive mistake.

      But scalability is not just about CPU (processing power). For a successful scalable web application, all layers have to scale in equally. Which includes the storage layer(Clustered file systems, s3,etc), the database layer (partitioning,federation), application layer(memcached,scaleout,terracota,tomcat clustering,etc), the web layer, loadbalancer , firewall, etc. For example if you don’t have a way to implement multiple load balancers to handle your future web traffic load, it doesn’t really matter how much money and effort you put into horizontal scalability of the web layer. Your traffic will be limited to only what your load balancer can push.

      Choosing the right kind of scalability depends on how much you want to scale and spend. In fact if someone says there is a “one size fits all” solution, don’t believe them. And if someone starts a “scalability” discussion in the next party you attend, please do ask them what they mean by scalability first.

      References

      1. Cost and Scalability
      2. My linear scalability is bigger than yours

      Popularity: 100%