February 27, 2006

Will Microsoft take the VMware bait?

While I was listening to one of Mike Poor's SANS security talks, he mentioned the problem of untrusted third-party applications. In this age of trojans, even a security expert like him takes precautions before downloading and testing a new security tool. In fact, he said that even the uncompiled "source code" of trojans can have hidden trojans waiting for an unsuspecting security researcher to try it out.

I don't download trojans every day, but I do play around with tools which want to modify my registry. How many times have you yelled in frustration after finding out that the tool you've been trying (and failing) to work with for the last two days doesn't uninstall anymore? I have a dozen or two interesting but broken third-party tools deployed somewhere on my computer which I haven't uninstalled yet.

VMware is in a perfect position to fix this problem. With Microsoft right behind them, they released something called the VMware Player. It's basically a stripped-down VMware product which can "play" virtual machine images created by others. What makes it so interesting is that now I can download and run untrusted applications without thinking as hard as I used to. And all those uninstallable tools which I never got to work could have been removed easily if they had shipped in a VMware image. It's almost like deleting a compressed archive with all the files in it, except that in this case you don't have to worry about the tool messing with your operating system.

Coming from a unix admin background, I have to tell you how much "fun" it is to install an unstable version of an open-source tool. What's worse is that some of them have so many dependencies that by the time you get the tool working, you figure out that you've broken something else. A perfect example for me was when I was playing with the great network monitoring tool OpenNMS. If I were its maintainer, I would jump on the VMware bandwagon and release a ready-to-go OpenNMS image for people like me who want to try it out without wasting time. And with Oracle giving away its low-end database engine for free, this is the perfect way for some vendors to release their products inside a VMware image with Oracle/Linux preinstalled.

Now, Microsoft's Virtual PC is a very strong competitor. It came out of nowhere and literally forced VMware to give its product away for free. And though you might think VMware is on a losing streak, it has one of the best things going for it: the support of the Linux/open-source community.

The catch with VMware Player is that one can't distribute an image of an OS for free if the OS itself costs money. So if you think you are going to get a demo of Quicken 2006 installed on a Windows XP platform, it's probably not going to happen soon. But if you want to try out the Squid proxy server running on Linux, you can have it for free. In fact, the community virtual applications page at VMware's site lists quite a few applications running on open-source OSes like Linux and FreeBSD.

When PKZIP/WinZip started gaining popularity, Microsoft built compressed folders right into the OS. When RealPlayer started gaining popularity, they built Media Player. The IE browser, the Internet firewall and the MSN Toolbar are all part of Microsoft's effort to kill the competition. (They of course deny it.) So if VMware Player gains popularity, how will Microsoft compete? Will they change their licensing policy for distributing the Microsoft OS inside a VM image?

Or worse, will Microsoft officially (and openly) claim that Linux runs on Virtual PC?

February 26, 2006

Google CL2: Is Google Calendar finally ready?

It's been a month since we first heard the Google Calendar rumors. Just when everything had cooled down, Paul Stone dug up some links within Google's code to a "Google Links" page which had a whole bunch of Google services listed. What stood out on that page was a link to "Google CL2" which said "A calendar for you and the world". If you have a Google account, go to http://mail.google.com/mail/?view=barc and see for yourself.

Weekly updates 26 Feb 2006


  • Google Pages is here for everyone. I've heard people comparing it with what GeoCities (now owned by Yahoo) used to do a long time back. Google Pages allows users to create pages with the help of AJAX. What stood out, however, was the hint that Google Pages might become group-editable, making pages as easy to create and manage as a wiki server.

  • My Page Rank is an interesting service which lets you put your PageRank on your website. Nothing new about it, but it's there and it does the dirty work of converting stats into an image.

  • Zillow.com has gotten a lot of traction in the media lately. Zillow does one thing and does it very well: it tries to predict home prices based on past and present price trends, taking into account the amenities available to the home to predict its next sale price. With house prices tanking in some places, Zillow will be of great value for understanding trends.

  • Google Finance might be on the way. Search Engine Journal has some interesting observations.

  • Another interesting but useless site I found is Toogle.

February 25, 2006

The secret of Microsoft Origami is out

A lot of people were wondering what Origami is all about. Seems like the cat is out of the bag... and I'm a little disappointed.

The way it was hyped, it was made to look as if this would be cutting edge, something new which we haven't even thought of yet. But this device is larger than a PDA and has fewer features than a small laptop. The advertisement didn't answer why someone wouldn't just buy a 12-inch PowerBook or a Tablet PC. Will someone really walk around with that clunky piece of equipment? Maybe they will... but don't the success of the iPod and the failure of mobile CD players define the acceptable size of a mobile device which people are willing to carry around?
In my personal opinion, based on what I saw in the short clip, this product could be direct competition for Sony's PSP. Other than that, with its current feature set and size, it can't replace the laptop, cellphone, camera or iPod/music player.
If there is anything to blame for the launch disaster, it would be the hype itself. I wouldn't have blogged about Origami if I hadn't been waiting for it with so much anticipation.


Wireless Skype handsets (802.11)

If you never talk to anyone outside this (US) island you live on, chances are that you have never used Skype. The rest of us, who can't buy unlimited minutes to other parts of the world, can thank Skype for trying to change the world.

But the Skype world is not perfect yet. You still have to use your computer to make and receive calls. There are some Skype phone gateways available, but most of them are hacks at best. A few companies have big plans for Skype in the near future, and here are some interesting details I've gathered over the last few weeks.

To begin with, there are four classes of Skype devices currently out there.

  • Traditional headphone-and-microphone devices, using the speaker/mic-in connections on the computer

  • Intelligent wired USB devices which can interact with the Skype software on the computer and make/receive calls. Some even have an LCD display on the handset.

  • Intelligent wireless USB devices which can do whatever the wired devices can, but with the flexibility of moving around without wires. The catch is that there is a base module which stays attached to the USB port of the computer, and your computer has to be on for it to work.

  • Skype on handheld devices, which requires you to buy an expensive PDA to make free/cheap phone calls


Most of this hardware is available on eBay, Froogle and Skype's own site.

A few days ago, a friend showed me some interesting news about FON on Om Malik's blog. FON is an interesting community project which promotes wireless access sharing by promising connectivity to its large network of POPs around the world. The catch is that if you want to access their POPs, you have to set up a FON wireless gateway yourself. Aha... if you know how BitTorrent works, you will see some similarities here.

Anyway, I heard that Google and Skype are two of the investors in FON. Google, which has been investing heavily in last-mile connectivity (free wireless in cities like Mountain View), has a lot to gain by monitoring user activity. But what I couldn't understand was the reason for Skype getting into it.

That is, until I figured out that Skype depends heavily on internet availability for its customers to make phone calls. Without that network being available, every one of its customers will continue to depend on traditional means of wireless communication, which bites into Skype's revenue.

If Skype could provide internet connectivity over 802.11, Skype users might think about using just Skype for their calls. But who on earth wants to carry their seven-pound laptop around with them? I'm glad you asked.

There are a few vendors out there who have been very busy building 802.11-based Skype phones which don't need USB at all. There are a few others building 802.11 into regular cellular phones so that customers can choose which network to use when an 802.11 network is available. Interesting. So who are these guys?

I recently bought a free.1 phone, a wired USB device, to use with my Skype account on my PowerBook. Though it worked beautifully (as expected), I'd recommend that everyone wait for the 802.11 devices to come out if you can. Also, even though I mentioned FON, I personally would never set it up without putting it behind a firewall of some kind. So in other words, I'll probably end up paying FON/Skype instead of setting up a FON wireless device on my own network.

February 17, 2006

The pain of Load balancing applications

Introduction


Loadbalancing may mean many different things to different people, but it's all about distributing load. For me, it's an architecture in which network services are scaled by adding multiple servers that perform the same tasks.

If you had a popular website with static content and your server couldn't keep up with the requests, all you had to do was set up multiple web servers and use round-robin DNS entries to divide the load among them. For dynamic web applications like search engines this plays an even bigger role, because the number of users each node can support is much lower.
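
As a quick illustration, round-robin DNS is nothing more than multiple A records for the same name. Here is a minimal BIND-style zone fragment with placeholder IPs; resolvers rotate through the records, spreading clients across the three servers.

    ; three A records for one name; most resolvers rotate the order
    www   300   IN   A   192.0.2.10
    www   300   IN   A   192.0.2.11
    www   300   IN   A   192.0.2.12

The short TTL (300 seconds) is deliberate: it limits how long clients keep using a dead server's address.
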
Over time, as applications grew more complex and as web companies found customers outside the US, they discovered that the only way to optimize network performance was to go local. Loadbalancing across POPs (points of presence) around the world provides the snappy user experience which has been important in drawing more customers.

While static content on web servers can easily be replicated to servers around the world, some web applications are required to maintain the state of user actions. Loadbalancers have been trying to attack this particular problem for the last few years. Among the many odd ways of doing this, one was associating a source IP with a web server. Unfortunately, some ISPs switch source IPs in the middle of a session, which proved disastrous for some applications. Others used cookies and session identifiers in the URL to solve the problem.
Loadbalancing is not rocket science, but it's not for the faint of heart either. This article is a collection of my past and present thoughts on loadbalancing architectures which I've worked with or read about.

Under the hood


Though loadbalancers sound simple, under the hood they are complicated beasts. Today's loadbalancers have so many features that they sometimes overshadow the complexities of the application they are supposed to loadbalance. It's also important to note that loadbalancers are not just designed for web applications anymore. They are ideal hardware for loadbalancing databases, LDAP servers, terminal servers and other custom behind-the-scenes applications.

Firewalls and Application Gateways


The internal design and implementation of a modern loadbalancer is very close to that of a basic firewall. While a firewall is designed to block all illegal traffic, it also does limited network and port address translation. A good firewall is more than a packet filter in the sense that it actually keeps track of the state of what's going on between the client and the server. From the moment a session is initiated, assuming it's allowed by the ACLs (access control lists), the firewall creates a session record where it logs the traffic protocol, the source and destination addresses, and the port numbers. Subsequent packets are matched against these records and allowed through or rejected depending on whether they belong to a valid session.
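
To make the session-record idea concrete, here is a minimal sketch in Python of the bookkeeping described above. The field names are hypothetical, and a real firewall would also track TCP state transitions, sequence numbers and idle timeouts.

    from typing import NamedTuple

    class FiveTuple(NamedTuple):
        proto: str      # "tcp" or "udp"
        src_ip: str
        src_port: int
        dst_ip: str
        dst_port: int

    sessions = set()                              # established sessions
    acl = {("tcp", "192.0.2.10", 80)}             # allowed (proto, dst_ip, dst_port)

    def handle_packet(pkt, syn):
        """Return True if the packet should be allowed through."""
        if pkt in sessions:                       # belongs to an established session
            return True
        if syn and (pkt.proto, pkt.dst_ip, pkt.dst_port) in acl:
            sessions.add(pkt)                     # new session permitted by the ACL
            return True
        return False                              # everything else is dropped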

HTTP is a relatively trivial protocol compared to more complex protocols like FTP and SNMP. UDP and ICMP in particular are complicated beasts for a firewall because they were never designed to carry connection state, and being able to track a session is the basic requirement for firewalling traffic easily. UDP, ICMP and other connectionless protocols have forced firewall vendors to come up with custom hacks to deal with these problems.

Depending on whom you talk to, a firewall which can talk to two different networks and inspect and validate sessions using "deep packet inspection" could be called an "application gateway", because it has sufficient intelligence to understand requests and create responses for that application protocol. Most modern firewalls could be called HTTP gateways because they can understand and respond to HTTP requests.

TCP/IP basics


To understand what an "application gateway" does, it's important to understand how a TCP/IP connection is established. The steps are listed below, followed by a short sketch showing the client's side of it.

  • Resolve the address
    Address resolution is the first step of every successful TCP/IP connection. A client cannot communicate with just a name; it must first resolve the server's hostname to an IP address.

  • SYN
    The next step for the client is to send the first TCP packet with the "SYN" flag set. This is like a "hello" packet telling the server that the client is interested in talking. One more piece of information in this packet which the server needs is the port number on which the client wants to talk. For most web requests it's 80 or 443.

  • SYN-ACK
    If the server wants to talk on that port, and if it has the resources, it will reply with a packet in which both "SYN" and "ACK" are set. When the client gets this packet, it knows the server is alive and the service is running on that particular port.

  • ACK
    At this point the client "ACK"s the packet the server sent and can, if it wants, send data too.
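
Here is the client's side of this in a minimal Python sketch (the hostname is a placeholder): the resolution and the whole SYN / SYN-ACK / ACK exchange happen inside the connect call, before any application data flows.

    import socket

    # create_connection() resolves the name and performs the three-way
    # handshake; only after it returns can application data be sent
    with socket.create_connection(("www.example.com", 80), timeout=5) as s:
        s.sendall(b"GET / HTTP/1.0\r\nHost: www.example.com\r\n\r\n")
        print(s.recv(1024).decode("latin-1"))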


An "HTTP Gateway" does two important things with this knowledge. First it talks the the browser and does the TCP/IP handshake to understand where the user wants to go. This is important to understand, because even though the browser assumes it is connected to web server, its actually being terminated on the firewall. Once the gateway decodes the HTTP request and knows where it has to go to (and whether that request is allowed) it will initiate a second TCP/IP connection with the webserver at the backend server using a second set of handshake packets. Thats the point when the browser is really connected to the server.
A loadbalancing appliance, for the most part, works just like this. The only thing significantly different with the loadbalancer is its ability to send traffic to multiple servers without the user knowing about it.
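
A stripped-down sketch of that two-connection behavior might look like the following (Python, placeholder addresses and ports; error handling omitted). A real loadbalancer would pick the backend using its load distribution algorithm and keep far more state.

    import socket
    import threading

    BACKENDS = [("10.0.0.11", 8080), ("10.0.0.12", 8080)]  # placeholder servers
    counter = 0

    def pipe(src, dst):
        # copy bytes one way until the sender closes its side
        while data := src.recv(4096):
            dst.sendall(data)
        dst.close()

    listener = socket.socket()
    listener.bind(("0.0.0.0", 8080))
    listener.listen()

    while True:
        client, _ = listener.accept()            # handshake #1: client <-> gateway
        backend = socket.create_connection(BACKENDS[counter % len(BACKENDS)])
        counter += 1                             # handshake #2: gateway <-> server
        threading.Thread(target=pipe, args=(client, backend), daemon=True).start()
        threading.Thread(target=pipe, args=(backend, client), daemon=True).start()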

Basic Loadbalancing terminologies


The terminology I'll use in the rest of this document is based on my experience with the Cisco CSS and the Radware WSD/CT100. I've noticed that vendors take great liberties in creating new terminologies, which can easily confuse the admin.

  • Service endpoint
    A "service" in CSS is defined as an endpoint which can provide service. An example of such an endpoint would be a server with IP 192.168.1.4 running a TCP service on port 80. If you want to loadbalance a couple of read-only Oracle servers, you might have them providing service on port 1531 instead. In most cases a client won't ever connect directly to this endpoint. The only exception is when the loadbalancer is doing DNS-based loadbalancing, in which case the client will connect directly to the service endpoint.
    This terminology is a little fuzzy in the Radware WSD. By default the WSD assumes you want to loadbalance all available services on all ports of the servers and doesn't force you to select a port number on which the service is running. This might be a good thing when you have multiple servers providing multiple services, but I personally avoid it for reasons I'll explain later.

  • Content rule endpoint
    A "content rule" in CSS is defined as the endpoint to which an actual user connects. In the case of a TCP-based service, it would include the IP and port number that requests should come to. If this is a DNS-based loadbalancer, it would probably be running a DNS server on port 53 over UDP/TCP.

  • Session persistence
    The feature which allows the LB to track user sessions and direct them to the same server for subsequent requests is what I call "session persistence". Again, there are many different ways of doing this depending on which application server and loadbalancer you use; the sketch after this list shows one common approach.

  • Timeouts
    This is one of the most critical parameters and plays a big role in how your application behaves. While timeouts allow clients and application/server/networking components to know when to give up, they also free up critical resources which can otherwise slow down the application. But setting them too low, or setting different timeouts in different parts of the network and application stack, can break your application in unexpected ways.

  • Keepalives
    The word "keepalive" means different things in different contexts. If you are a networking guru, you know that keepalives can be used in some protocols to keep connections alive through firewalls which would otherwise shut them down due to inactivity. If you are a web guru, you'll think of HTTP keepalives, which allow the browser to send multiple requests to the server without renegotiating TCP all over again. Unfortunately, Cisco CSS also uses this term for its service availability checks.

  • Layer 4 loadbalancing
    Most of the early loadbalancers did loadbalancing and session persistence based on the source IP and port number. In a perfect world, where every user has his or her own IP which doesn't change over time, this is a perfect solution. But in our world, where ISPs like AOL switch proxy servers without telling the user and where hundreds or thousands of users can be NATed behind the same source IP, this solution doesn't work.

  • Layer 7 loadbalancing
    This is what most loadbalanced web applications use to persist and distribute sessions across multiple servers. It requires the loadbalancer to inspect the HTTP request and look at various HTTP headers to make a decision. Common parameters which get inspected are the Host header, the request URI and cookies (again, see the sketch after this list).

  • Load distribution algorithms
    One of the trickiest problems for a loadbalancer is finding the optimal server to send a new user to. Unlike round-robin DNS, which gives the same weight to each server, some algorithms can send more traffic to servers which are faster or newer than the others, or less traffic to nodes which are very busy or have a lot of active sessions. Some of the common algorithms I've come across are:



  • Round Robin

  • Weighted Round Robin

  • Least Users

  • Weighted Least Users

  • Least Traffic

  • Weighted Least Traffic


  • DNS-based loadbalancing
    The mechanism of distributing load at DNS query time is called DNS round robin. The loadbalancing appliance usually does some kind of check to see which web servers are available. Based on the load distribution algorithm, it sends a list of available nodes, in order of priority, as part of the DNS response to the customer.

  • Global loadbalancing
    This term is generally reserved for appliances which loadbalance customers across different points of presence around the world or the country. The appliance does some kind of polling to find out which POP is closest and most responsive to the customer before it sends the client to that POP. Implementations of global loadbalancing vary, but DNS is one of the most popular mechanisms for directing users.
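
To tie a couple of these terms together, here is a minimal sketch (Python, with hypothetical names such as the JSESSIONID cookie) of layer 7 session persistence combined with weighted round robin: a request carrying a known session cookie is pinned to its previous server, and everything else is distributed by weight.

    import itertools

    # hypothetical backend pool: server -> weight (bigger = more traffic)
    BACKENDS = {"10.0.0.11:80": 3, "10.0.0.12:80": 1}

    # session persistence table: session cookie value -> pinned server
    sticky = {}

    # weighted round robin: repeat each server by its weight and cycle forever
    wrr = itertools.cycle([srv for srv, w in BACKENDS.items() for _ in range(w)])

    def pick_server(cookies):
        """Layer 7 decision: honor an existing session, else weighted round robin."""
        session_id = cookies.get("JSESSIONID")   # cookie name is illustrative
        if session_id in sticky:
            return sticky[session_id]            # session persistence
        server = next(wrr)                       # new session: distribute by weight
        if session_id:
            sticky[session_id] = server
        return server

    print(pick_server({"JSESSIONID": "abc123"}))  # pins the session
    print(pick_server({"JSESSIONID": "abc123"}))  # same server again

A real appliance learns the cookie from the server's Set-Cookie response and ages these entries out with a session timeout, which is exactly where some of the bugs described below creep in.
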

Design Recommendations



    • N, N+1 or 2*N configuration
      Whenever resources are procured and deployed, always plan for one extra. This is the only way to provide continuous service without degrading quality. You don't have to keep the spare running; just have it available as a standby. Loadbalancing solutions which understand the significance of a standby server and know when to use it can reduce the number of annoying phone alerts at 3am on a Sunday morning.

    • Health Monitoring
      Almost all loadbalancers claim to have some mechanism for detecting web server failure. But if you have a complex web application which relies on a host of other components to service customer requests, make sure the health monitoring module can accurately poll node health. For example, there are times when requesting "/index.html" comes back with "200 OK" while "/login.aspx?username=xyz&pass=xyz" throws a stack trace because LDAP was not available; a small polling sketch appears at the end of this section. Also remember that the frequency of health checks can degrade your application's response time as well.

    • Maintaining State
      Applications which maintain state information in session memory are very picky about session persistence. Most loadbalancers can be configured to extract session identifiers from the URL or from cookies. If you know how your application sends session identifiers to the end user, make sure the loadbalancer supports it. Unfortunately, though cookies are simple to implement on the application server, they can sometimes become a complicated beast for networking devices. Here are the problems I've dealt with in the past.

      • Cookies need to be enabled
        Applications which maintain sessions require cookies to be enabled in the browser. URL rewriting is another way to send the session identifier, but it's considered less secure because most proxy servers log GET/POST requests, which will include the session identifier. If you are using SSL this is not a problem, though bookmarks can get ugly.

      • Cookie size is limited
        If you have a lot of cookies, or forget to delete cookies from users' browsers, they will add up to the point where they no longer fit in the HTTP header. What's more tricky is that some loadbalancers don't even read the complete Cookie header, which means that if the session cookie is at the end of a long list of cookies, some loadbalancers may actually ignore it.

      • Cookies + Java over SSL
        If your application uses HTTPS and has Java applets communicating over SSL, this is one bug to look out for. We have seen instances where Java applets insert the HTTP cookie headers into the HTTPS header section instead of the HTTP header. The workaround is to do the HTTP-to-HTTPS packet encapsulation yourself. If this bug does show up in your network, the responsibility of extracting the cookie from the HTTPS packet and inserting it into the HTTP packet belongs to the SSL engine you are using. For us, Radware seemed to do the trick, so we were never able to break the application in-house. However, some clients outside our company were using proxy servers which removed the extra information from the SSL header, and that broke our application.

      • Set-Cookie bug
        One of the earliest session persistence bugs I noticed, in a couple of loadbalancers I tested in late 2000, was one where the "Set-Cookie" HTTP header from the server was ignored by the loadbalancer. This meant there was a very good chance that the first HTTP request the client sent with the cookie set would land on a different server from the one which originally sent the Set-Cookie response to the client.

      • Keep-alive bug
        Keepalives are designed to optimize network throughput by allowing clients to send multiple HTTP requests over the same TCP channel. Unfortunately, some loadbalancers inspect only the cookies in the first HTTP request and ignore the rest. The logic of this implementation is simple: once a client is connected to a server, there is no reason to check the cookies anymore. The problem shows up, however, when the client is behind a proxy server. Some "intelligent" proxy servers can multiplex multiple clients' requests over the same keepalive channel, which can play havoc with sessions if the loadbalancer doesn't decode them.



    • Inactivity Timeouts
      The inactivity timeout of an established TCP/IP connection can be a problem if delays of over a minute are normal for your web application. We have faced a number of timeout-related issues in our network. The six most common components which can time out your TCP connection early are:

      • Proxy servers

      • Firewalls

      • Loadbalancers

      • SSL Accelerators

      • Web server

      • Application server



    • Session Timeouts
      Session timeouts are also important. In most cases, these are the only two components which actually worry about "sessions" spanning multiple TCP connections:

      • Loadbalancers

      • Application server



    • Recommended Optimizations

      • Use multiple domains
        If you have a site with a lot of images, CSS files or JavaScript embedded in its pages, I strongly recommend distributing the files over multiple "hosts". The reason is simple: both IE and Firefox limit the number of simultaneous connections they will open per host. If you spread your files over two hosts, the browser will open twice as many connections to download them. For most sites which don't have too many images this is not a problem, but a website heavy on AJAX should consider it.

      • Latency
        Every request has latency associated with it. If you have the option of setting up multiple datacenters, optimize for latency rather than distance from the customer's location. If buying a leased pipe from the customer's location to your datacenter is possible, that is about as close to the perfect solution as you can get. The only thing better is moving the datacenter to the customer's location.
        If you can't do either of these, think about using services like Akamai, which cache and serve objects from a server nearest to the customer.

      • Caching
        Caching is a great feature. If a customer already has an image file, there aren't many good reasons why that image should be requested again and again. Set up caching on your web server; on Apache it can be done using mod_expires (a sample configuration appears at the end of this section). If you have a dynamic web application, set it up so that dynamic content is not negatively affected by the caching.

      • Compression
        Many of you may not be aware that many websites (if not most) already compress data on the fly. If you have applications which are bandwidth-intensive, enabling compression can speed up the UE (user experience) and save you a bunch of money at the same time; the configuration sketch at the end of this section shows this too. However, remember that there is a computational expense at the server end to compress content on the fly. If the servers are very loaded, think about deploying a cluster of SSL accelerators which can take over that load.

      • Keepalives

      • Browser threads



    • SSL Accelerators

      • Compatibility
        If your application might require SSL acceleration at some point, design your architecture assuming you need one right away. SSL is a CPU-intensive process which is usually not handled by the loadbalancer, though there are a few which do it. The decision to buy a loadbalancer with or without SSL built in depends purely on the traffic you expect over time. Because the throughput of a loadbalancer is usually much higher than that of an SSL accelerator, a solution where the loadbalancer and SSL sit in the same box might be more expensive to scale than one where SSL and LB are separate components in the network. If you plan to separate your LB and SSL infrastructure, one additional issue you'll have to deal with is their compatibility. The devices we initially selected for LB and SSL worked together very well, until we switched on VRRP and all hell broke loose. Unless you have a lot of time and resources, you might be better off going with a combination which has been implemented before instead of picking a new pair of vendors.

      • One-arm or in-line configuration
        When you design the network diagram, another question you'll ask yourself is whether to deploy SSL in a "one-arm" or an "in-line" configuration. In-line means all requests pass through the SSL accelerator before they hit the loadbalancer. One-arm means all traffic hits the loadbalancer, which then decides whether to send it to the SSL box. If you are a financial site which does all of its work over SSL, you may want to investigate the in-line configuration; for the rest of us, one-arm might be more suitable.
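
As mentioned under Health Monitoring above, a useful check probes a request path that exercises the application's dependencies rather than a static page. Here is a minimal polling sketch (Python, standard library only; the URL, credentials and interval are placeholders).

    import time
    import urllib.request
    import urllib.error

    # placeholder deep-health URL: exercises the app and its LDAP dependency,
    # unlike a static /index.html which can return 200 OK even when logins fail
    HEALTH_URL = "http://10.0.0.11/login.aspx?username=probe&pass=probe"
    INTERVAL = 10   # seconds; overly frequent checks add load of their own

    def node_is_healthy():
        try:
            with urllib.request.urlopen(HEALTH_URL, timeout=5) as resp:
                return resp.status == 200
        except (urllib.error.URLError, OSError):
            return False   # refused connection, timeout, 4xx/5xx, etc.

    while True:
        print("healthy" if node_is_healthy() else "unhealthy -> pull from rotation")
        time.sleep(INTERVAL)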
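
And as promised under Caching and Compression, here is a minimal Apache sketch of both, assuming mod_expires and mod_deflate are loaded. The content types and lifetimes are illustrative; tune them so dynamic responses are not cached by mistake.

    # caching via mod_expires: let browsers reuse static objects
    ExpiresActive On
    ExpiresByType image/png "access plus 1 month"
    ExpiresByType image/jpeg "access plus 1 month"
    ExpiresByType text/css "access plus 1 week"

    # compression via mod_deflate: trade server CPU for bandwidth
    AddOutputFilterByType DEFLATE text/html text/plain text/css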





