Service registry (ESB) for scalable web applications.
This blog post is the result of my futile attempts at understanding how others have solved the problem of automatic service discovery.
How do organizations with a huge collection of custom applications design scalable web applications without hardcoding server names and port numbers in configuration files?
I believe the terminology I’m hinting at is either called a “Service Registry” or an “Enterprise Service Bus”, which is part of the whole SOA (Service Oriented Architecture) world.
The organization I work for has a limited multicast-based service announcement/discovery infrastructure, but it is not widely used across all the applications. Besides the fact that multicast routing can become complicated (ACL management of yet another set of network addresses), it’s also not a solution when parts of an application reside in the cloud. Amazon’s EC2, for instance, doesn’t allow multicast traffic between its hosts.
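To make that concrete, here is roughly the kind of thing I mean by multicast announcement/discovery. This is only a sketch; the group address, port and JSON payload are made up for illustration and aren’t what we actually run:

```python
import json
import socket
import struct
import time

# Hypothetical multicast group and port, purely for illustration.
MCAST_GROUP = "239.255.42.99"
MCAST_PORT = 10999

def announce(service_name, host, port, interval=10):
    """Periodically announce a service's location on the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 2)
    payload = json.dumps({"service": service_name, "host": host, "port": port}).encode()
    while True:
        sock.sendto(payload, (MCAST_GROUP, MCAST_PORT))
        time.sleep(interval)

def listen():
    """Listen for announcements and build an in-memory view of available services."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", MCAST_PORT))
    mreq = struct.pack("4sl", socket.inet_aton(MCAST_GROUP), socket.INADDR_ANY)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    services = {}
    while True:
        data, _ = sock.recvfrom(4096)
        info = json.loads(data)
        services[info["service"]] = (info["host"], info["port"])
```

Every consumer keeps its own picture of the network, so there is no central registry to fail; but as noted, this breaks down as soon as any piece of the application lives somewhere multicast can’t reach, like EC2.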
Microsoft Azure .NET Services (part of the Azure platform) provides a service registry (or proxy) to which internal and external applications can connect to provide and consume services. The design doesn’t allow a direct connection between provider and consumer, which makes it a massive single point of failure. I agree that any kind of registry has this problem, but the fact that this service has to be up all the time for every single request makes it extremely risky.
There are at least two open source projects from the Apache Foundation that touch this topic: one is ServiceMix and the other is Synapse. I’ve also spoken to a few commercial entities who do this, and wasn’t really convinced that it’s worth spending big bucks on this.
The reason I’m puzzled is that I don’t see a single open source project being widely used in this area. I’ve been a REST kind of guy and hate the whole SOAP world… and was hoping there would be something simple that could be set up and used without pulling my hair out.
My contacts at some of the larger organizations lead me to believe that they all use proprietary solutions for their infrastructure. Is this really such a complex problem?
If you have an ESB in your network, please do drop a line about it.
Comments
I think the problem is that most places simply "manage" with configuration for the task of lookup. They just don't feel the pain enough to be bothered, so there is little incentive for most to build or try an open source solution.
I've personally been involved in building infrastructure like this several times, with varying approaches. It's not that complex; it's just that most don't see the value, not to mention it can mean they have to start thinking a little differently, which in many places is resisted quite strongly.
FWIW, I wouldn't use anything like an ESB for this task, noting at the same time that the foundations I would use are rarely well maintained in most organisations. Drop me a line if you want more details....
I figure if there is one node that acts as a master list, and the other nodes get their IP/port info from it on startup (refreshing at intervals), then even if that master node goes down, the other nodes can continue to operate using the cached copy. The master server (a static node, not doing much) is generally not going to go down, and if it does, the chances of the other nodes also going down and coming back up before the master does are slim.
This of course does not solve a lot of the more complex problems that could arise down the road, but I think keeping things as simple as possible until your infrastructure grows to justify the added complexity is generally the best way to go. Maybe you are already at that point, though.
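Something along these lines would do it; the registry URL and cache path are just placeholders, and the error handling is only a sketch:

```python
import json
import os
import urllib.request

# Placeholder locations; a real setup would configure these per environment.
MASTER_URL = "http://registry.internal:8500/services.json"
CACHE_PATH = "/var/cache/service-registry.json"

def fetch_services():
    """Fetch the service list from the master node, falling back to the cached copy."""
    try:
        with urllib.request.urlopen(MASTER_URL, timeout=5) as resp:
            services = json.load(resp)
        with open(CACHE_PATH, "w") as f:
            json.dump(services, f)  # refresh the local cache on every successful fetch
        return services
    except OSError:
        # Master is down or unreachable: keep operating from the cached copy.
        if os.path.exists(CACHE_PATH):
            with open(CACHE_PATH) as f:
                return json.load(f)
        raise

def lookup(service_name):
    """Resolve a service name to (host, port) using the fetched list."""
    services = fetch_services()
    entry = services[service_name]
    return entry["host"], entry["port"]
```

The only operational requirement is that the cache survives restarts of the consumer, which is why it sits on disk rather than in memory.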