HAProxy : Load balancing
Designing any scalable web architecture would be incomplete without investigating "load balancers". There used to be a time when selecting and installing load balancers was an art by itself. Not anymore.
A lot of organizations today use Apache web servers as a proxy server (and also as a load balancer) for their backend application clusters. Though Apache is the most popular web server in the world, it is also considered overweight if all you want to do is proxy a web application. The huge codebase Apache comes with, and the separate modules that need to be compiled and configured with it, could soon become a liability.
HAProxy is a tiny proxying engine which doesn't have all the bells and whistles of Apache, but is highly qualified to act as an HTTP/TCP proxy server. Here are some of the other wonderful things I liked about it:
- Extremely tiny codebase. Just two runtime files to worry about, the binary and the configuration file.
- Compiles in seconds. 10 seconds the last time I did it.
- Logs to syslog by default
- Can load balance HTTP as well as regular TCP connections. Can easily load balance most non-HTTP applications.
- Can do extremely detailed performance (and cookie capture) logging. It can differentiate backend processing time from the end-user request completion time. This is extremely helpful in monitoring performance of backend services.
- It can do sticky load balancing out of the box
- It can use application generated cookies instead of self-assigned cookies.
- It can do health monitoring of the nodes and automatically remove them when health checks fail
- And it has a beautiful web interface for application admins who care about numbers.
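To make the sticky balancing and health-based removal from the list above more concrete, here is a minimal Python sketch of the idea. It is a toy model under my own assumptions, not HAProxy's actual code: the class name `StickyRoundRobin`, its methods, and the cookie values are all hypothetical.

```python
class StickyRoundRobin:
    """Toy model of cookie-based sticky balancing (not HAProxy's implementation)."""

    def __init__(self, servers):
        # servers: mapping of cookie value -> backend address (hypothetical names)
        self.servers = dict(servers)
        self.healthy = set(self.servers)
        self._order = list(self.servers)
        self._next = 0

    def mark_down(self, cookie):
        """Called when a health check fails: stop sending traffic to the node."""
        self.healthy.discard(cookie)

    def mark_up(self, cookie):
        """Called when a health check passes again."""
        if cookie in self.servers:
            self.healthy.add(cookie)

    def pick(self, cookie=None):
        """Return (cookie_value, address) for one incoming request."""
        # Sticky path: the client already carries a cookie for a healthy node.
        if cookie in self.healthy:
            return cookie, self.servers[cookie]
        # Otherwise fall back to round robin over the healthy nodes.
        for _ in range(len(self._order)):
            candidate = self._order[self._next]
            self._next = (self._next + 1) % len(self._order)
            if candidate in self.healthy:
                return candidate, self.servers[candidate]
        raise RuntimeError("no healthy backend available")


lb = StickyRoundRobin({"s1": "server1:8080", "s2": "server2:8080"})
first = lb.pick()            # new client, gets the next node in rotation
again = lb.pick(first[0])    # same client returns with its cookie
```

A returning client keeps hitting the same node for as long as that node is healthy. Once the node is marked down, the client is quietly moved to another one.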
A few other notes
- HAProxy really doesn't serve any files locally. So it's definitely not a replacement for your Apache instance if you are using it to serve local files.
- It doesn't do SSL, so you still need an SSL engine in front of it if you need secure HTTP.
- HAProxy is not the only Apache replacement. Varnish is a strong candidate which can also do caching (with ESI). And while you are at it, do take a look at Perlbal, which looked interesting.
Some Interesting external links
- Live HAProxy stats page
- HAProxy Manual
- HAProxy Architecture guide
- Other HAProxy docs
- HA cluster using HAProxy
Finally, here is a sample configuration file with most of the features I mentioned above configured for use. This is the entire thing and should be good enough for a production deployment with minor changes.
global
    # syslog host, facility and level: replace loghost and logfac with your own
    log loghost logfac info
    maxconn 4096
    user webuser
    group webuser
    daemon

defaults
    log global
    stats enable
    mode http
    option httplog
    # do not log connections that carry no data
    option dontlognull
    option httpclose
    retries 3
    # if the sticky server is down, send the request to another one
    option redispatch
    maxconn 2000
    # timeouts are in milliseconds
    contimeout 5000
    clitimeout 300000
    srvtimeout 300000

listen http_proxy 0.0.0.0:8000
    # health check request sent to each backend
    option httpchk HEAD /app/health.jsp HTTP/1.0
    mode http
    # cookie inserted by HAProxy for sticky load balancing
    cookie SERVERID insert
    capture cookie JSESSIONID len 50
    capture request header Cookie len 200
    capture request header Host len 50
    capture request header Referer len 200
    capture request header User-Agent len 150
    capture request header Custom-Cookie len 15
    # stick on the application generated session cookie
    appsession JSESSIONID len 32 timeout 3600000
    balance roundrobin
    # health check every 60000 ms
    server server1_name server1:8080 weight 1 cookie server1_name_cookie check inter 60000
    server server2_name server2:8080 weight 1 cookie server2_name_cookie check inter 60000
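
If you want to see what a backend would answer to the `option httpchk` line before pointing HAProxy at it, a rough Python equivalent of that check could look like the sketch below. The function names are my own, and it assumes the usual rule that 2xx and 3xx responses count as healthy. Note that Python's `http.client` sends HTTP/1.1 rather than the HTTP/1.0 used in the configuration, so treat it as an approximation.

```python
import http.client


def status_is_healthy(status):
    """Treat 2xx and 3xx responses as a passing health check."""
    return 200 <= status < 400


def check_backend(host, port, path="/app/health.jsp", timeout=5.0):
    """Send one HEAD request and report whether the backend looks healthy."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)
        return status_is_healthy(conn.getresponse().status)
    except (OSError, http.client.HTTPException):
        # Refused connections, timeouts and malformed replies all count as down.
        return False
    finally:
        conn.close()
```

For example, `check_backend("server1", 8080)` would probe the first server from the configuration above, with `server1` being whatever hostname you used there.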