XMPP protocol, specifications and scalability
XMPP scalability is a mix of pragmatic protocol design and the continuous improvement of software implementations
Scalability has always been a hot topic for the XMPP instant messaging protocol. In 2000, I was working for a company that was building a Jabber platform for a customer, with a target of 10,000 simultaneous connections. The platform had been built on top of jabberd, the only possible choice at that time. However, the implementation could sustain only 1,000 simultaneous users. Making the platform ten times faster was not feasible, so the company wrote a custom clustering layer around 10 jabberd machines to meet the client's goal.
Time has passed and XMPP (eXtensible Messaging and Presence Protocol) is now able to sustain much larger communities in a single domain (up to several million users is now possible). Both hardware and implementations have improved. But, attracted by success stories, large communities are now looking for more challenging deployments: they want more simultaneous users and more features. Real-time communication is ubiquitous.
XMPP is an interesting protocol because it is quite versatile. It can adapt to a broad range of different needs, from a relatively small business that wants an infrastructure packed with lots of features, to a large social networking organisation that wants to offer instant messaging to the wider community (and will soon start wanting to distribute more real-time events).
Correct analysis and experience
Sometimes, the technology is stretched to accommodate incorrectly modelled problems. Maths helps a lot when working with XMPP. Suppose you have to deal with a project where there is a requirement to send a message every two seconds to all connected users. Suppose that this project needs to deal with 200,000 connected users at peak. Each figure is realistic on its own, but if you do the maths (200,000 users divided by 2 seconds), you find that you have to design your platform to send 100,000 messages per second. This can lead to the deployment of very large platforms and considerably increase bandwidth consumption. At some scale and in some contexts, the problem may not even be solvable without an extremely costly platform. Analysis of the end user's need is key. In this example, we were able to reformulate the problem with a publish and subscribe mechanism and reduce the number of messages needed on the network by a factor of ten.
This is just a small and simple example, but in this matter, correct and innovative analysis of the problem, combined with experience, is key.
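The back-of-the-envelope calculation above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not a capacity planning tool: the function names are mine, and the 200-byte stanza size used for the bandwidth estimate is a purely hypothetical figure, not a measurement from the project.

```python
def messages_per_second(connected_users: int, interval_seconds: float) -> float:
    """Outbound message rate when every connected user gets one message per interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return connected_users / interval_seconds


def bandwidth_mbit_per_second(rate: float, stanza_bytes: int) -> float:
    """Rough outbound bandwidth in Mbit/s for a given message rate.

    stanza_bytes is an assumed average size on the wire (hypothetical value).
    """
    return rate * stanza_bytes * 8 / 1_000_000


# Figures from the example: 200,000 users at peak, one message every two seconds.
naive_rate = messages_per_second(200_000, 2)

# The article reports a tenfold reduction after remodelling the problem
# with publish and subscribe; the factor is taken from the text as-is.
reduced_rate = naive_rate / 10

# ASSUMPTION: 200 bytes per stanza, chosen only to make the estimate concrete.
ASSUMED_STANZA_BYTES = 200

print(f"naive:   {naive_rate:,.0f} msg/s, "
      f"{bandwidth_mbit_per_second(naive_rate, ASSUMED_STANZA_BYTES):.0f} Mbit/s")
print(f"reduced: {reduced_rate:,.0f} msg/s, "
      f"{bandwidth_mbit_per_second(reduced_rate, ASSUMED_STANZA_BYTES):.0f} Mbit/s")
```

Running this kind of estimate before any design work makes it obvious when two individually reasonable requirements multiply into an unreasonable one.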
Pragmatic protocol design
The XMPP protocol covers a very broad scope. It relies on a core set of Internet Engineering Task Force (IETF) specifications (RFCs), but also on an ever-evolving and growing set of extensions. Those extensions keep evolving because the goal is to promote consistent design of the protocol, as well as efficiency. Implementers of the protocol are always invited to suggest improvements that strengthen its consistency or simplify its implementation. This pragmatic approach has helped the protocol to achieve success and it is why deployments are currently spreading fast.
To get a good idea of how the protocol is developing in practice, take a look at a discussion on the XMPP standards mailing list: https://mail.jabber.org/pipermail/standards/2009-February/021145.html. This discussion is simply about adding a parameter to one fundamental part of the protocol. The purpose of this parameter is to make scalable implementations possible. This is a small detail that can make a huge difference when you have to deal with large-scale XMPP deployments (hundreds of thousands of simultaneous users and tens of millions of registered users).
Flexible implementation: the XMPP application server
I said previously that XMPP covers an unusually broad range of needs. It is often compared to Voice over IP protocols, but none of these protocols (except Skype) was devised to cover a scale ranging from a phone server (PBX) to Internet-scale communication. The Asterisk protocol, for example, is designed for corporate networks. It must cover, at most, a few thousand users. XMPP is much more ambitious. It wants to cover both the team scale and the “Internet” scale.
This ambition has a price. You cannot expect a single implementation to work across all those ranges per se, although some features required in the corporate world can also work at a larger scale. What we need is an implementation that does not try to cover the whole scope out of the box, but that is flexible enough to be adapted to the whole range of use cases. This is why ProcessOne has been working for several years to make ejabberd something much more than an XMPP server. Our goal since then has been to build an XMPP application server: a tool that can be used to forge an XMPP-enabled world. (See my previous article Introducing the XMPP application server). We are making good progress, based on feedback from the field and from the customer projects we work on. Yes, this is our most ambitious project, but it is the only way to fully embrace the versatility of XMPP itself.
You are warmly encouraged to take part in this ambitious task. Share your use case, and tell us about the API and what you have developed on top of ejabberd, or would like to develop but cannot easily build with the current API. Join our forum; it is probably the most obvious place to get started. We are listening 😊