What is scalability ?
When asked what they mean by scalability, a lot of people talk about improving performance, about implementing HA, or even talk about a particular technology or protocol. Unfortunately, scalability is none of that. Don’t get me wrong. You still need to know all about speed, performance, HA technology, application platform, network, etc. But that is not the definition of scalability.
Scalability, simply, is about doing what you do in a bigger way. Scaling a web application is all about allowing more people to use your application. If you can’t figure out how to improve performance while scaling out, its okay. And as long as you can scale to handle larger number of users its ok to have multiple single points of failures as well.
There are two key primary ways of scaling web applications which is in practice today.
- “Vertical Scalability” – Adding resource within the same logical unit to increase capacity. An example of this would be to add CPUs to an existing server, or expanding storage by adding hard drive on an existing RAID/SAN storage.
- “Horizontal Scalability” – Adding multiple logical units of resources and making them work as a single unit. Most clustering solutions, distributed file systems, load-balancers help you with horizontal scalability.
Every component, whether its processors, servers, storage drives or load-balancers have some kind of management/operational overhead. When you try to scale that, its important to understand what percentage of the resource is actually usable. This measurement is called “scalability factor“. If you loose 5% of a processor power every time you add a CPU to your system, then your “scalability factor” is 0.95. A scalability factor of 0.9 means you will only be able to use 90% of the resource.
Scalability can be further sub-classified based on the “scalability factor”.
- If the scalability factor stays constant as you scale. This is called “linear scalability“.
- But chances are that some components may not scale as well as others. A scalability factor below 1.0 is called “sub-linear scalability“.
- Though rare, its possible to get better performance (scalability factor) just by adding more components (i/o across multiple disk spindles in a RAID gets better with more spindles). This is called “supra-linear scalability“.
- If the application is not designed for scalability, its possible that things can actually get worse as it scales. This is called “negative scalability“.
If you need scalability, urgently, going vertical is probably going to be the easiest (provided you have the bank balance to go with it). In most cases, without a line of code change, you might be able to drop in your application on a super-expensive 64 CPU server from Sun or HP and storage from EMC, Hitachi or Netapp and everything will be fine. For a while at least. Unfortunately Vertical scaling, gets more and more expensive as you grow.
Horizontal scalability, on the other hand doesn’t require you to buy more and more expensive servers. Its meant to be scaled using commodity storage and server solutions. But Horizontal scalability isn’t cheap either. The application has to be built ground up to run on multiple servers as a single application. Two interesting problems which most application in a horizontally scalable world have to worry about are “Split brain” and “hardware failure“.
While infinite horizontal linear scalability is difficult to achieve, infinite vertical scalability is impossible. If you are building capacity for a pre-determined number of users, it might be wise to investigate vertical scalability. But if you are building a web application which could be used by millions, going vertical could be an expensive mistake.
But scalability is not just about CPU (processing power). For a successful scalable web application, all layers have to scale in equally. Which includes the storage layer(Clustered file systems, s3,etc), the database layer (partitioning,federation), application layer(memcached,scaleout,terracota,tomcat clustering,etc), the web layer, loadbalancer , firewall, etc. For example if you don’t have a way to implement multiple load balancers to handle your future web traffic load, it doesn’t really matter how much money and effort you put into horizontal scalability of the web layer. Your traffic will be limited to only what your load balancer can push.
Choosing the right kind of scalability depends on how much you want to scale and spend. In fact if someone says there is a “one size fits all” solution, don’t believe them. And if someone starts a “scalability” discussion in the next party you attend, please do ask them what they mean by scalability first.