
Elastic Comet Scalability

by Michael Carter, December 31st, 2007


We’ve been experimenting lately with methods of scaling Orbited horizontally. Orbited is a cross-platform/cross-language distributed Comet server written with libevent and Python. Some time ago we locked down how to do this for a statically-sized cluster: you use a distributed hash function that reliably assigns particular users to particular nodes. Then the application layer can always dispatch a Comet message to the correct user.
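To make this concrete, here is a minimal sketch of such a deterministic user-to-node hash. This is illustrative Python, not code from the Orbited codebase; the function name and the choice of SHA-1 are assumptions:

```python
import hashlib

def node_for_user(user_id: str, num_nodes: int) -> int:
    # Hash the user id and reduce it onto the cluster. Every dispatcher
    # that runs this function agrees on which node owns which user, so
    # the application layer can always route a Comet message correctly.
    digest = hashlib.sha1(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes
```

As long as `num_nodes` is fixed, the mapping is stable across processes and restarts, which is exactly what a statically-sized cluster needs.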

The next hurdle is dynamically scaling an Orbited cluster. The root of the problem is that the hashing algorithm becomes unreliable in the face of a dynamically resizing cluster. Every time a node joins or leaves the cluster, all users would have to reconnect. This is an unacceptable spike in traffic, as it amounts to a self-inflicted DDoS attack.

Some time ago I proposed a project called HaloD with the intent of solving this problem. The basic idea is to pre-allocate a cluster as having an inordinately large size, such as 1024 nodes. Then, using a custom, software-based router, dynamically assign real Orbited nodes to be responsible for whole ranges of the table. So if we have a single actual Orbited node, then whether a user attempts to connect to node 995 or node 2, both connections are sent to that one node. When a second node is added, half of those routes are re-assigned to the new node, and half the users are told to reconnect. This method is known as stuffing.

This works out such that roughly 1/N of the total user base must reconnect when a node joins or leaves the cluster, where N is the larger of the cluster sizes before and after the change. In theory this works quite well, except that in practice we need to build that custom router. Below is what a HaloD deployment would look like.

A HaloD Cluster
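The slot re-assignment step can be sketched as follows: 1024 virtual slots, and a join that steals an even share of slots from each existing node, so only the moved slots' users reconnect. The function name and structure are illustrative assumptions, not Orbited's actual implementation:

```python
SLOTS = 1024  # pre-allocated virtual table size

def add_node(routes, new_node):
    """Reassign roughly 1/N of the slots to a joining node (N = cluster
    size after the join), stealing evenly from the current owners."""
    owners = sorted(set(routes))
    take = SLOTS // (len(owners) + 1)   # slots the new node should own
    per_owner = take // len(owners)     # steal this many from each owner
    routes = routes[:]                  # work on a copy
    for owner in owners:
        stolen = 0
        for i, slot_owner in enumerate(routes):
            if slot_owner == owner and stolen < per_owner:
                routes[i] = new_node
                stolen += 1
    return routes
```

Starting from one node, adding a second moves 512 of the 1024 slots; adding a third moves about a third of them, and every moved slot goes to the new node only.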


Enter DynDNS, a service for providing DNS lookup with a very small TTL (time to live). Their service costs $27.50 a year for a single domain, a very reasonable price. With DynDNS I can set my TTL to literally one second, such that if I make a DNS request for my domain, change the DynDNS configuration, and then immediately make another DNS request, I'll get back two different IP addresses. This means we can create 1024 sub-domains that all point at our first node. As more nodes join, we can use the HaloD routing algorithm to change the DynDNS configuration so that users are always correctly located by the distributed hash function.
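As a sketch, the client-side mapping from a user to one of those 1024 sub-domains might look like this; the `example.com` domain and the `nodeK` naming scheme are placeholders, not a real deployment:

```python
import hashlib

SLOTS = 1024  # one stable hostname per virtual slot

def subdomain_for_user(user_id: str, domain: str = "example.com") -> str:
    # The browser always connects to the same per-slot hostname;
    # DynDNS decides which real node's IP that hostname resolves to.
    digest = hashlib.sha1(user_id.encode("utf-8")).digest()
    slot = int.from_bytes(digest[:8], "big") % SLOTS
    return f"node{slot}.{domain}"
```

The hostname never changes for a given user; only the DNS record behind it does, which is what lets the router re-map slots without touching any client code.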

Amazon EC2

This discussion wouldn’t be complete without thinking about executing this whole setup on Amazon’s EC2 service. This service allows you to create virtualized instances of a Linux disk image. In a minute you could start up a hundred Linux machines, and you pay only for what you use ($0.10 per instance-hour). Our goal with the Orbited project is to provide a clustering option that will run on pay-as-you-go hosting services such as EC2. With this kind of setup, a developer could start an Orbited cluster with a single node and never worry again about scaling. As demand increased, the cluster would dynamically enlarge. If demand slackened, the cluster would shut down nodes as needed to avoid unneeded cost.
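A toy sizing rule for such an elastic cluster might look like the following; the per-node capacity and headroom figures are invented for illustration:

```python
import math

def desired_node_count(connected_users: int,
                       capacity_per_node: int = 10_000,
                       headroom: float = 0.25) -> int:
    # Enough nodes to hold the current users plus some slack for growth,
    # but never fewer than one: an idle cluster still needs an entry point.
    needed = math.ceil(connected_users * (1 + headroom) / capacity_per_node)
    return max(1, needed)
```

A control loop would periodically compare this target against the running instance count and start or stop EC2 nodes to close the gap, using the HaloD re-assignment step to move slots.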


So this solution isn’t perfect yet. We still need to find a good way to move users from one Orbited node to another. The easiest solution would be to simply ask them to reconnect, which happens automatically with most transports anyway. But if the application layer switches to the new route first, messages will be dispatched to the new destination node before the user has moved. I think the ultimate solution is a Comet protocol that supports delivery guarantees, because then the client could simply re-request any messages it missed.
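One way to sketch such delivery guarantees is a per-user, sequence-numbered outbound buffer; this is an illustrative design, not an existing Orbited protocol:

```python
class ReliableChannel:
    """Per-user outbound queue with sequence numbers, so a client that
    reconnects to a new node can re-request anything it missed."""

    def __init__(self):
        self.next_seq = 0
        self.buffer = {}  # seq -> message, pruned when acknowledged

    def send(self, message):
        seq = self.next_seq
        self.buffer[seq] = message
        self.next_seq += 1
        return seq, message

    def ack(self, seq):
        # Client confirmed delivery through `seq`; drop those messages.
        for s in [s for s in self.buffer if s <= seq]:
            del self.buffer[s]

    def replay_from(self, seq):
        # On reconnect, resend everything after the last seq the client saw.
        return [self.buffer[s] for s in sorted(self.buffer) if s > seq]
```

On reconnect the client reports the last sequence number it received, and the new node (or a shared store behind it) replays the gap, so the racy hand-off window no longer loses messages.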

Overall, this setup is very compelling for developers who just want to program a Comet app and not worry about deployment or scalability. It would also lower the barrier to entry substantially for established applications that, because of their existing user bases, need to scale to enormous load right away. Without a setup like this it’s hard to imagine Facebook or Myspace adding real-time, Comet-based chat, and that needs to change.


11 Responses to “Elastic Comet Scalability”

  1. Simon Willison Says:

    I don’t think the DynDNS trick will be effective in this case thanks to DNS pinning, a browser feature whereby browsers aggressively cache the IP address for a specific IP address until the browser is restarted (to protect against attacks that attempt to subvert the same-origin policy using DNS tricks).

  2. Simon Willison Says:

    That should be “aggressively cache the IP address for a specific domain name”

  3. Michael Carter Says:

That’s a good point that I didn’t consider. Does IE contain these protections? One way around this issue is to update the DNS slightly differently. Instead of re-assigning node1.domain.com from one IP address to another, we start with node1.iteration1.domain.com pointing at the original node1 server. When we have to re-assign addresses, we set node1.iteration2.domain.com to the new address. We can then add another step to our Comet connect (and reconnect) logic: first ask any server which iteration we are in. Our application layer could, of course, ignore the iteration business and just use node1.domain.com to dispatch requests to users that should be on node 1.
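A sketch of the iteration scheme described in this comment, with hypothetical names (`example.com`, `connect_hostname`); the client would fetch the current iteration number from any server before connecting:

```python
import hashlib

SLOTS = 1024

def connect_hostname(user_id: str, current_iteration: int,
                     domain: str = "example.com") -> str:
    # A fresh iteration number means a fresh hostname, so a browser that
    # has pinned the old name's IP can't be sent to a stale node: the new
    # name has never been resolved before.
    digest = hashlib.sha1(user_id.encode("utf-8")).digest()
    slot = int.from_bytes(digest[:8], "big") % SLOTS
    return f"node{slot}.iteration{current_iteration}.{domain}"
```

The slot stays fixed per user across iterations; only the iteration label in the hostname advances when routes are re-assigned.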

  4. Claude Says:

    Would Adobe’s BlazeDS help in the area of guaranteed delivery after a user reconnects to a different node?

  5. Dan Says:

I recently experimented with low-TTL DNS to provide fail-over for some EC2 instances and found in practice that some DNS servers ignore TTLs shorter than 1-3 minutes. I’ve also seen what Simon describes, where browsers (and some OSes) cache DNS too.

You mentioned that ultimately we need a “Comet protocol that supports delivery guarantees”. Do you think this means we’ll be seeing a convergence between Comet and Message Queues? They seem to have a lot of the same problems and goals in mind, and it would be excellent if the same system I use to broadcast messages internally could also broadcast messages externally to clients.

  6. Michael Carter Says:

Dan: My solution above with iterations would address the 2-3 minute TTL problem. I think we will ultimately see good integration between Message Queues and Comet. Orbited right now works like a Message Queue with a Comet interface. I think that ultimately users will be able to connect as clients to the Message Queue of their choice, using a Comet server just as the means of connection. That is, in fact, one of our goals with the Orbited project.

  7. Neil Hodgson Says:

    The dynamic scaling sounds similar to consistent hashing.

  8. Michael Carter Says:

Claude: BlazeDS looks promising for the world of Flash. I don’t think it would be a good idea to use it as a basis for all Comet, as one of our goals, at least with Orbited, is to support applications that don’t use Flash. I’m sure it has concepts that would be good for us to look at, though.

Neil: I read up on consistent hashing and you’re right, it’s very similar to HaloD. Interestingly, we arrived at the solution of consistent hashing from two different problems. I think the particular flavor of CH that I propose differs from any of the resources I’ve found so far, because it’s crucial that a Comet user is only ever hashed/connected to a single node, yet we don’t expect our clients, in this case JavaScript running in browsers, to have full knowledge of the network. We solve this by pre-defining a maximum number of nodes, basically. But the idea of adding extra “virtual nodes” to keep the distribution uniform is quite similar.
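For comparison, here is a minimal consistent-hash ring with virtual nodes in the textbook style Neil refers to; this is illustrative code, not HaloD:

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring with virtual nodes, for comparison
    with the fixed-slot (HaloD-style) scheme discussed above."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas
        self.ring = []  # sorted (hash, node) points on the ring
        for node in nodes:
            self.add(node)

    def _hash(self, key):
        return int.from_bytes(hashlib.sha1(key.encode()).digest()[:8], "big")

    def add(self, node):
        # Each node gets `replicas` virtual points, which keeps the
        # key distribution roughly uniform across nodes.
        for i in range(self.replicas):
            bisect.insort(self.ring, (self._hash(f"{node}:{i}"), node))

    def node_for(self, key):
        # A key belongs to the first ring point at or after its hash,
        # wrapping around at the end of the ring.
        idx = bisect.bisect(self.ring, (self._hash(key), ""))
        return self.ring[idx % len(self.ring)][1]
```

The key difference from HaloD is that here the client needs the full ring to route itself, whereas HaloD fixes the slot count up front precisely so a browser can route with no knowledge of the cluster.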

  9. Jacob Rus Says:

Claude: I don’t claim any deep knowledge of BlazeDS: I’ve only read through a few pages of publicly accessible documents. But it looks to me like it might help with some of that:

    The destination must be able to correlate messages and, in some cases, persist messages so that they can be reliably routed to consumers. Any time a message is sent, it must be acknowledged for the client to know that it was properly received. The remote service sends an AcknowledgeMessage to the client to satisfy this requirement. The client also dispatches an event for application code when these acknowledgments occur.

    But the application itself might have to do the work:

    Messaging components provide configurable connection and subscription retries when the connection to the server is lost. Producers provide reconnectAttempts and reconnectInterval properties to configure their reconnect behavior. Consumers provide resubscribeAttempts and resubscribeInterval properties that attempt to reconnect and resubscribe if a connection is lost. Messages sent by Producers or calls made by RPC services are not automatically queued or resent. These operations are not idempotent, and the application determines whether it is safe to retry a message send or RPC invocation.

In any case, it seems like BlazeDS is mainly aimed at sending messages to Flash/Flex, while most of us Comet developers are more interested in using browser-native technologies.

  10. Comet Daily » Blog Archive » Scaling != Adoption Says:

    [...] imagine that a year down the road there will be more interest in advanced features like scalability and delivery guarantees. But there are so few Comet applications out there that work for even one [...]

  11. Comet Daily » Blog Archive » Comet Gazing: Maturity Says:

    [...] Orbited acts as a distributed hash table, allowing unlimited horizontal scaling with an O(1) peer-messaging dispatch algorithm. See HaloD for more information. [...]


Copyright 2014 Comet Daily, LLC. All Rights Reserved