Private clouds: By Amazon

A few days ago I blogged about how VMware is going to make a huge push into “private clouds” around the VMworld 2009 conference. Little did we know that Amazon had something up its sleeve as well, and it announced it today.

AWS now supports creating a Virtual Private Cloud with a private address space (including RFC 1918 ranges) that can be locked down, via a VPN connection, to your organization only. You still get most of the benefit of Amazon’s cheap hardware pricing, but you also get to lock down the infrastructure for security reasons.
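To make the “RFC 1918” part concrete, here is a minimal sketch in Python that checks whether an address block you might pick for a private cloud falls entirely inside one of the three private IPv4 ranges. The function name `is_rfc1918` is my own, and nothing here talks to AWS; it only uses the standard `ipaddress` module.

```python
import ipaddress

# The three private IPv4 blocks defined by RFC 1918.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(cidr: str) -> bool:
    """Return True if the whole CIDR block sits inside one RFC 1918 range.

    Hypothetical helper for illustration; not part of any AWS API.
    """
    network = ipaddress.ip_network(cidr, strict=True)
    if network.version != 4:
        # RFC 1918 only covers IPv4.
        return False
    return any(network.subnet_of(block) for block in RFC1918_BLOCKS)


if __name__ == "__main__":
    for candidate in ["10.0.0.0/16", "172.32.0.0/16", "192.168.1.0/24"]:
        print(candidate, is_rfc1918(candidate))
```

Note that 172.32.0.0/16 looks private at a glance but is outside the 172.16.0.0/12 block, which is exactly the kind of slip a check like this catches.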

Regardless of how you see it, this is huge for IT and the developer community. Some may love it, and I’m sure some will be pretty angry at Amazon for trying to commoditize security and making it look as if network security were as simple as that.

With VMware’s announcements next week, there is no doubt in my mind that there will be a significant push towards “private clouds” for at least the next year.

P2P network scalability

YouTube is said to be pushing about 25 petabytes per month, which works out to a sustained average of about 77 Gbps. Bandwidth usage at peak would be even higher. Thanks to Limelight Networks, YouTube doesn’t really need to scale or provision for that kind of bandwidth itself, and based on some reports from 2006 it was costing them close to 4 million a month back then. YouTube and services like it have to invest a lot in infrastructure before they can really launch, and though using a shared content delivery network is not ideal, it’s probably not a bad deal. In YouTube’s case, it helped them survive until Google bought them.
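The 77 Gbps figure is easy to sanity-check. A quick sketch of the arithmetic, assuming decimal petabytes (10^15 bytes) and a 30-day month; both are my assumptions, since the original reports don’t say which units they used:

```python
PETABYTE_BYTES = 10**15              # assumption: decimal petabytes
SECONDS_PER_MONTH = 30 * 24 * 3600   # assumption: 30-day month


def sustained_gbps(petabytes_per_month: float) -> float:
    """Average data rate in Gbps for a given monthly transfer volume."""
    bits_per_month = petabytes_per_month * PETABYTE_BYTES * 8
    return bits_per_month / SECONDS_PER_MONTH / 10**9


if __name__ == "__main__":
    print(f"{sustained_gbps(25):.1f} Gbps")
```

This is only the average over the month; it says nothing about the peaks, which are what you actually have to provision for.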

Newer Internet television service providers, however, need not build their services around the traditional CDN model. The Joost network architecture presentation from Colm MacCarthaigh is an interesting example to prove my point. Joost was founded by the same guys who founded Kazaa and Skype. Kazaa was a notorious P2P file-sharing application (it used the FastTrack protocol) which died after the RIAA revolt. Skype, as it happens, also has its roots in P2P networking [Skype protocol, Skype scalability problems] and has been doing pretty well over the years. So it’s no surprise that Joost chose the P2P model again to distribute part of its content to users. Joost has a cluster of servers which act as the “original seeders” for all content, and it relies on the P2P network to distribute the popular content. The number of Joost servers is not small, however, because they still have to serve the “long tail” of requests for content that is not popular.

Two of the most important network optimization ground rules I noticed in the talk were that they decided against using firewalls and load balancers in their network. That’s good, because firewalls and load balancers wouldn’t have kept up with the bandwidth anyway. But even more impressive is that they designed the entire P2P application and its network algorithm to intelligently find and peer with the nodes and supernodes closest to them. Joost tries to do this in two different ways. The first uses IP addresses as proximity sensors (prefix awareness): two IPs which start with the same set of numbers/octets are probably in the same network. The second uses network AS numbers, which works irrespective of what the IP addresses start with. [Colm also mentions AS proximity detection in the comment below.]

A comment on the blog @ ipdev.net by Colm himself:
We have many gigs of transit, and are adding more. I’m not sure who claimed it’s near HD quality, I like to think it’s about NTSC, sometimes better, never quite PAL. We have some efforts in the code to save transit costs, there is very very basic prefix awareness, and we’re adding AS-level awareness using live BGP data. I have looked at adding AS adjacency information, ie prefer AS-adjacent peers, but it’s a lot of work and the US internet is relatively poorly mapped, so I don’t think this will come soon.
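To illustrate the two proximity heuristics (prefix awareness and AS awareness), here is a rough sketch of how a client could rank candidate peers. This is my own toy version, not Joost’s code: the `Peer` type, the function names, and the rule “same AS first, then longest shared prefix” are all assumptions, and the AS numbers would have to come from some BGP-derived lookup that isn’t shown.

```python
import ipaddress
from typing import NamedTuple


class Peer(NamedTuple):
    ip: str   # IPv4 address of the peer
    asn: int  # AS number; assumed to come from a BGP-derived lookup


def shared_prefix_bits(ip_a: str, ip_b: str) -> int:
    """Number of leading bits two IPv4 addresses have in common (0 to 32)."""
    a = int(ipaddress.IPv4Address(ip_a))
    b = int(ipaddress.IPv4Address(ip_b))
    # XOR leaves a 1 at the first bit where the addresses differ.
    return 32 - (a ^ b).bit_length()


def rank_peers(me: Peer, candidates: list[Peer]) -> list[Peer]:
    """Sort candidates: same-AS peers first, then by longest shared prefix."""
    return sorted(
        candidates,
        key=lambda p: (p.asn != me.asn, -shared_prefix_bits(me.ip, p.ip)),
    )


if __name__ == "__main__":
    me = Peer("192.0.2.10", 64500)
    candidates = [
        Peer("198.51.100.7", 64501),
        Peer("203.0.113.5", 64500),
        Peer("192.0.2.99", 64501),
    ]
    for peer in rank_peers(me, candidates):
        print(peer)
```

The example shows why AS awareness matters: 203.0.113.5 shares almost no prefix with the client, but it sits in the same AS, so it ranks ahead of the peer that merely has a similar-looking address.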

It’s possible that Joost might still require CDNs to serve the long-tail content, but the work they have done to build the P2P infrastructure would not only save them a lot of moolah in the long run but also allow them to easily scale beyond any of the current CDNs if they do get that big.

Interestingly, companies like Microsoft are not sitting idle watching the world go by. Microsoft has been working on something called Avalanche, and I think they already have a prototype client out which you can download and try yourself.

Microsoft Secure Content Downloader

Some MSCD clients may be connected to each other via peer connections, forming a ‘cloud’ of clients. Pieces of the file you are downloading are sent through these peer connections between clients, as well as through connections with the file server. As a member of the cloud, your computer both serves as a client and server to other members of the cloud. Data destined for the cloud may be routed through your computer and sent to other cloud members. The other cloud members connected to you will be able to access only pieces of the file you are downloading via MSCD – they have no access to any other data on your computer.

You are only connected to other clients while you are downloading a file via MSCD. When the file has finished downloading – or when you pause or cancel the download, or exit the application – you disconnect from the cloud. Once you disconnect from the cloud, you will no longer have any connections to any other members in the cloud and no data will be routed through your computer. The Microsoft Secure Content Downloader (MSCD) is a peer-assisted download manager capable of securely downloading specific files. MSCD is intended for consumers who are downloading from a home PC, or business users whose computers are not behind a corporate firewall. If you use MSCD from behind a corporate firewall, you may be unable to download content, and may adversely affect other clients’ ability to download content.

Of course, there are also rumors that Apple is trying this out… but you know how these things go.

Anyway, the point is that in spite of occasional glitches, P2P is probably the way to go if you want to cut the long-term costs of a CDN. Personally, I believe that Skype had no other way out. I mean, can you imagine all the phone calls in the world going through the same first phone exchange in New Haven, Connecticut, where it all started? P2P models are still evolving and it’s hard to imagine there will be a one-size-fits-all solution. But if you know one, please let me know.

How the Skype network handles scalability

There was a major Skype outage last week, and though there is an “official explanation” and other discussions about it floating around, I found this comment from one of the GigaOm readers more interesting to think about. This particular description may not accurately describe the problem (it might be speculation as well), but it does describe, in a few words, how Skype’s P2P network scales out. You should also take a look at the detailed discussion of the Skype protocol.

Number of Skype Authentication servers:
Count == 50; // Clustered
Number of potential Skype clients:
Count = 220,000,000 // Mostly decentralized
Number of SuperNode clients to maintain network connectivity:
Count = N / 300 at any one time.

• If there are 3.0 million users online then the ratio is 3,000,000 / 300 = 10,000 == Supernodes available
• Supernodes are bootstraps into the network for normal first run clients (“and handle routing of children calls”).
• Supernodes maintain the network overlay via a DHT (“Distributed Hash Table”) “type” method. // This is normally very slow and done over UDP
• If a client cannot find a Supernode, regardless of authentication via central server then is NOT allowed on the Skype network.

Lack of Supernodes mean lack of network connectivity regardless of successful login via “central server”.
You CAN be a Supernode but not have full network connectivity because you have only a portion of the “Distributed Index Data aka DHT”.
MOST people that become Supernodes will bail out if they cannot keep a clear route (aka calls bail out, the client restarts and aborts Supernode status, thus booting its 300 – 500 Children and putting them into a “Connecting” mode).

Children that are trying to “Connect” are unable to do anything unless they have a “Supernode” as a parent. // No calls, No IM….

The overview of this is as follows:

Skype introduced a flaw into the network that dealt with “routing” and “fucked” the “decentralized data store aka DHT” this in turn ran clients on a RANDOM search of Supernodes which at this point were well booted off of the network.

In the End:
It is a huge cycle, no matter how many bugs they “fix” in the “central servers” it will take many days for N nodes to become Supernodes so they can route X data from peer A to peer B. This is NOT minor: even with a fix to the centralized server code base to relay data to N Supernodes, there is a lack thereof, resulting in a very segregated network. Right now there are approximately 10,000 sub Skype networks instead of 1 Single “in sync” network. When this “data store” (see DHT) is in sync globally then the Skype network will be again STABLE.

I know this is very broad but, unless magically all of said nodes can recreate the “single overlay (DHT)” then nothing will be in sync. You will see delayed messages, delayed or incorrect profiles and presence.

My take, in the end is give it 48 more hours and it may be semi-stable, but hey this is what you get with using end users as your own redundancy…

Yours…
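The supernode arithmetic in the comment above is simple enough to sketch. This only restates the commenter’s numbers (one supernode per 300 online users, and no network access without a supernode even after a successful login); the function names are mine, and none of this is confirmed by Skype.

```python
USERS_PER_SUPERNODE = 300  # ratio quoted in the comment, not an official figure


def supernodes_available(online_users: int) -> int:
    """Estimated number of supernodes for a given number of online users."""
    return online_users // USERS_PER_SUPERNODE


def can_use_network(authenticated: bool, has_supernode: bool) -> bool:
    """A client needs both a central-server login and a supernode parent."""
    return authenticated and has_supernode


if __name__ == "__main__":
    print(supernodes_available(3_000_000))
    # Logged in, but no supernode found: still locked out of calls and IM.
    print(can_use_network(authenticated=True, has_supernode=False))
```

The second function is the whole outage in one line: fixing the central servers restores the first condition, but clients stay stuck in “Connecting” until enough supernodes come back to satisfy the second.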