Independence Day: HTML5 WebSocket Liberates Comet From Hacks

by Michael Carter, July 4th, 2008

A recent set of HTML5 discussions is changing the course of Comet. First, a recap of the last two years of Comet: With long-polling we set the bar at cross-browser push. With XHR streaming and ActiveXObject(’htmlfile’) we raised it to cross-browser streaming. With SSE we’ve been trying to raise the bar to native, cross-browser streaming. And there we’ve sat, hoping that browser vendors actually implement the latest SSE spec.

I say we’ve been selling ourselves short. All this time we have been pushing for a native server->client streaming transport, but we still lack client->server streaming, and anything resembling a standard transport for bi-directional communication. The Holy Grail of Comet development has always been native browser support of a full-duplex, single-connection communications channel, otherwise known as a TCP socket. But we’ve been mired in hacks for so long that we’ve lost the vision.

No longer. The HTML5 specification now offers WebSocket, a full-duplex communications channel that operates over a single socket. I have been listening closely, and in some cases contributing, to the process of ensuring that WebSocket will:

  • Seamlessly traverse firewalls and routers
  • Allow duly authorized cross-domain communication
  • Integrate well with cookie-based authentication
  • Integrate with existing HTTP load balancers
  • Be compatible with binary data

The API of WebSocket is very straightforward. You create a WebSocket and point it at a URL:

var conn = new WebSocket("ws://")

Then you attach three callbacks:

conn.onopen = function(evt) { alert("Conn opened"); }
conn.onread = function(evt) { alert("Read: " + evt.data); }
conn.onclose = function(evt) { alert("Conn closed"); }

And finally, you can send upstream data with a simple function call:

conn.send("Hello World")

The browser will perform an HTTP handshake with the target web server to determine support, and then a direct stream will be exposed via the onread and send functions. The “ws://” URI scheme is used for WebSocket connections, and the “wss://” URI scheme is used for secure WebSocket connections.
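As a rough illustration of the scheme rule above, the sketch below derives a WebSocket URL from the URL of the page that wants to connect. The helper name webSocketUrlFor, the example.com host, and the paths are all hypothetical; the only thing taken from the text is that “ws://” is the plain scheme and “wss://” the secure one. Pairing “wss://” with HTTPS pages is a common convention, not something the article states.

```javascript
// Hypothetical helper (not part of any specification): build a WebSocket URL
// for the same host as the page, using "wss:" for HTTPS pages and "ws:" otherwise.
function webSocketUrlFor(pageUrl, path) {
  var page = new URL(pageUrl);
  var scheme = page.protocol === "https:" ? "wss:" : "ws:";
  // page.host keeps an explicit port (e.g. "example.com:8080") if one was given.
  return scheme + "//" + page.host + path;
}

console.log(webSocketUrlFor("https://example.com/chat.html", "/updates"));
console.log(webSocketUrlFor("http://example.com:8080/index.html", "/feed"));
```

The resulting string is what would be handed to the WebSocket constructor shown earlier.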

After the handshake, bi-directional framed communication ensues. Each frame can be either binary or text, thus allowing for swapping the encoding mid-stream. You can find more information about the protocol itself in the network section of the WHATWG HTML5 draft page.
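To make “framed” concrete, here is a minimal sketch of the work a browser would do on the application's behalf: buffering incoming bytes and handing over only complete text frames. It assumes a sentinel-delimited text frame (a 0x00 byte, UTF-8 text, then a 0xFF byte) of the kind used in early drafts; that wire format is an assumption for illustration, and the protocol later standardized as RFC 6455 uses a different, length-prefixed framing. The name createFrameDecoder is hypothetical.

```javascript
// Sketch of a text-frame decoder.
// ASSUMPTION: each text frame is 0x00, UTF-8 payload bytes, then 0xFF.
// 0xFF never occurs in valid UTF-8, so it is safe as an end marker.
function createFrameDecoder(onFrame) {
  var pending = [];     // payload bytes of the frame currently being read
  var inFrame = false;  // true between a 0x00 start byte and a 0xFF end byte
  var utf8 = new TextDecoder("utf-8");

  return function feed(chunk) {
    for (var i = 0; i < chunk.length; i++) {
      var byte = chunk[i];
      if (!inFrame) {
        // Bytes outside a frame are ignored in this sketch.
        if (byte === 0x00) { inFrame = true; pending = []; }
      } else if (byte === 0xff) {
        onFrame(utf8.decode(new Uint8Array(pending)));
        inFrame = false;
      } else {
        pending.push(byte);
      }
    }
  };
}

// Two network chunks that split a multi-byte character ("é" is 0xC3 0xA9).
var frames = [];
var feed = createFrameDecoder(function (text) { frames.push(text); });
feed(new Uint8Array([0x00, 0x68, 0xc3]));
feed(new Uint8Array([0xa9, 0xff, 0x00, 0x6f, 0x6b, 0xff]));
console.log(frames);
```

Because decoding happens only once the end marker arrives, the callback never sees half of a multi-byte character, however the bytes were split across chunks.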

While the HTML5 specification is not in a finalized stage, the first public draft was published by the W3C in January 2008, and browser vendors have already begun targeting features in the specification. The idea of putting a duplex channel into the spec is not a new one; the TCPConnection API and protocol were initially drafted more than two years ago. Unfortunately there were many significant problems with TCPConnection that held back browser adoption. Ian Hickson, the editor of the HTML5 specification, tackled these problems head on and under his guidance the standard has evolved to a usable state with WebSocket. About this new feature, Mr. Hickson comments:

“I’m looking forward to seeing Web Socket implemented in browsers, as I think it’s going to enable all kinds of realtime applications like chatting, remote controls, and the like, without the ridiculous hacks authors have to use today.”


WebSocket in a browser is terrific because it cuts the complexity of the Comet server by an order of magnitude. What’s more, it provides a straightforward, understandable API to JavaScript developers. The most important part of the specification is that developers can wrap their heads around the API in about five seconds. That’s because it looks so much like a socket.

If the future prospect of a native WebSocket isn’t enough good news, I am proud to announce that the Orbited project has implemented WebSocket for all major browsers, today. We do this by communicating over various Comet transports with the browsers, then performing the WebSocket handshake with the remote server, and proxying data in between. This means that today you can write a WebSocket server and application, start Orbited up, and be on your way. Tomorrow, you won’t need to change any of your server or client code whatsoever. Your application will fall forward to the native implementation of WebSocket for improved performance.


The single most voiced criticism of this specification has been that a WebSocket isn’t quite the same as a raw TCP socket, because a WebSocket server needs to understand a specific handshake in order for browsers to connect directly, and as such a WebSocket can’t connect to existing servers. If we did allow raw TCP sockets in the browser, a malicious site could cause any visitor to open up a TCP connection to an SMTP server, for instance, turning a casual web visitor into a spambot. There are many variations on this scenario, but the general problem is that raw socket connections in a browser would allow any site that a user visits to access network services as if it were the user, in the same network security context as the user. We therefore need to make this an opt-in process or we’ll catch existing servers off-guard. Furthermore, very few protocols have any kind of cross-domain authorization or security mechanisms built in. If we were to allow raw TCP, then we would be opening all manner of cross-site security holes. We could fix these by limiting TCP connections to the origin domain and port (meaning a direct socket back to the web server only), but that would limit whatever usefulness the TCP socket could provide.

I fully understand the criticism though; earlier this week I discussed exactly why having a raw socket in the browser is so desirable. You could, for instance, quickly prototype a Gmail clone using a raw socket, an IMAP client, and an XMPP client in the browser.

We have a clear problem then: Direct access to existing network servers could greatly simplify application architecture, but due to security restrictions it’s a non-starter; we absolutely must retrofit each network server with the new WebSocket protocol first. I hope that happens, but we can’t count on it, at least not right away. What we really need is a way to allow the server to opt in without putting it in the protocol, a way to seamlessly layer access control in front of the back-end server. It turns out that this problem has already been solved for traditional networks. That is, if we have two end-points communicating over TCP, and we need transparent access control in between, then we can use a well-known device: a firewall. The beauty of a firewall is that the server behind it requires no re-programming, or even re-configuration, yet gains all of the access control and security benefits. What we really need in the browser case is a custom firewall that can listen for WebSocket connections from the browser, enforce access control, and relay TCP to a back-end server.

That is why Orbited provides this feature under the API name TCPSocket. Orbited is the firewall that sits between the back-end server and the browser. It understands the WebSocket protocol for browser communication, and uses whitelist security to accept or reject requests to proxy TCP data to and from a back-end server. That’s right, you can fire up a stock XMPP server, and Orbited, and write the XMPP client entirely in JavaScript. This works cross-browser today. We also offer a binary mode that uses an intermediate encoding to allow the browser to read raw bytes (in the form of JavaScript integer arrays) from a remote server. Here is a diagram of the architecture:

[Figure: Orbited is a Web Firewall]
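The whitelist decision described above can be sketched in a few lines. This is not Orbited's actual code or configuration format; the whitelist entries, the host names, and the functions isAllowed and handleProxyRequest are hypothetical and only illustrate the accept-or-reject step that happens before any TCP data is relayed.

```javascript
// Hypothetical whitelist: the only back-end endpoints the proxy may reach.
var whitelist = [
  { host: "xmpp.example.com", port: 5222 },
  { host: "imap.example.com", port: 143 }
];

// True only if the requested destination matches a whitelist entry exactly.
function isAllowed(list, host, port) {
  return list.some(function (entry) {
    return entry.host === host.toLowerCase() && entry.port === port;
  });
}

// Decide whether a browser's request to proxy TCP data should be honoured.
function handleProxyRequest(list, request) {
  if (!isAllowed(list, request.host, request.port)) {
    return { accepted: false, reason: "destination not whitelisted" };
  }
  // A real proxy would open the TCP connection and start relaying here.
  return { accepted: true, target: request.host + ":" + request.port };
}

console.log(handleProxyRequest(whitelist, { host: "xmpp.example.com", port: 5222 }));
console.log(handleProxyRequest(whitelist, { host: "smtp.example.com", port: 25 }));
```

With a check like this in front of it, the SMTP scenario described earlier is refused at the proxy, while the back-end XMPP server needs no changes at all.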

The Future

Now it’s up to the browser developers to implement WebSocket. I expect some will be very quick on the uptake, while it may take years for others. I expect to see a common pattern emerge where application servers listen to the WebSocket protocol directly from new browsers, but fall back to Orbited’s emulation layer for legacy browsers. The key here is that we don’t have to wait on the browser vendors to get started. We can all develop these WebSocket applications now, and when browsers have native support, we’ll all get a performance boost.
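One way this pattern might look on the client is sketched below: use the native constructor when the environment has one, and otherwise an emulation that exposes the same interface. The function names pickSocketConstructor and connect and the FallbackSocket parameter are hypothetical stand-ins, not Orbited's API. The onread handler name follows this article; browsers that later shipped WebSocket named that callback onmessage.

```javascript
// Choose the native WebSocket when the environment provides one, otherwise
// a fallback constructor that is assumed to expose the same interface.
function pickSocketConstructor(env, FallbackSocket) {
  return typeof env.WebSocket === "function" ? env.WebSocket : FallbackSocket;
}

// Open a connection and attach the three callbacks described in the article.
function connect(env, FallbackSocket, url, handlers) {
  var Socket = pickSocketConstructor(env, FallbackSocket);
  var conn = new Socket(url);
  conn.onopen = handlers.onopen;
  conn.onread = handlers.onread;   // named onmessage in the API browsers later shipped
  conn.onclose = handlers.onclose;
  return conn;
}
```

In a page this could be called as connect(window, EmulatedSocket, url, handlers), where EmulatedSocket stands for whatever emulation layer is loaded. The application code stays the same whichever constructor is chosen.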

We will probably never get a native (raw) TCP socket in the browser, for the security reasons I outlined. It’s okay though — we can use the firewall pattern I outlined. For more information about installing and configuring Orbited, check out the documentation and the getting started section.

23 Responses to “Independence Day: HTML5 WebSocket Liberates Comet From Hacks”

  1. Martin Tyler Says:

    I’ve not had time yet to look at the spec for this, but you seem to be mainly talking about browsers implementing this. Surely every proxy out there needs to somehow support this.. and what does a server wanting to support it need to do? You, or maybe ‘they’, seem to be presenting it as a browser thing, but it needs support on the server too surely.

  2. Frank Salim Says:

    In the article, it says that WebSocket has a protocol designed to work through current proxies. Also, until there are servers supporting WebSocket, Orbited provides backwards compatibility. It has a WebSocket to TCP bridge.

  3. Martin Tyler Says:

    The point of this is to get a socket, not a socket wrapper over comet ‘hacks’ though.

    Ok, I have now read the spec. It uses the existing CONNECT command when going through a proxy, which is intended for SSL connections. That’s something I played around with years ago, but obviously only things like signed applets can make connections to the proxy directly like that.

    With no proxy configured it just makes a straight connection to the server.

    The communication isn’t a straight socket in that you couldn’t connect to an existing server and handle its protocol - the websocket protocol requires specific headers and encoding (although pretty simple).

    So this looks just like a much better way to implement a comet server, rather than being able to connect browsers to existing servers.

    So anyone know an ETA on browsers supporting this?

  4. Metal Hurlant Says:

    > We will probably never get a native (raw) TCP socket in the browser, for the security reasons I outlined.

    I question the validity of this statement. Every browser with Flash 9 installed has support for raw TCP sockets, and the security model it relies on doesn’t seem to have the security problems you allude to. (it uses socket-served crossdomain.xml files.)

    You mentioned how neat it would be to have an XMPP client in the browser.
    There’s an open-source project that does just that already:

    In truth, with so much of HTML 5’s spec aiming squarely at catching up with Flash functionality, I’m a little bit amazed to see several people behind the spec act in their html 5-related blog posts as if Flash doesn’t exist.

    It does exist, it has already implemented a lot of the new exciting stuff HTML 5 promises, and it seems peculiar not to learn from it before re-implementing your own flavor of it.

    (As a fun aside, the SVG 1.2 draft spec hints at some support for raw TCP socket support. It’s obviously not popular to acknowledge Flash, but their source of inspiration is fairly obvious to anyone who knows the flash APIs and bothers to read their spec )

  5. David Davis Says:

    I think It should use an observer pattern, so you don’t have to wrap it in a scope.

    conn.addListener( 'open', this.handleOpen, this );
    conn.addListener( 'read', this.handleRead, this );
    conn.addListener( 'close', this.handleClose, this );

  6. Jacob Rus Says:

    David: It does. Notice that the spec says “WebSocket objects must also implement the EventTarget interface.”

  7. RogerV Says:

    WebSocket as a much superior bi-directional, full-duplex Comet is superb just by itself. Not particularly concerned about trying to get JavaScript in a browser to consume legacy TCP services at the server side.

    I’ve been waiting about five years for the web to get a protocol like WebSocket so that web apps can be implemented to be like distributed rich client messaging-based apps.

    What will be way cool is to keep writing Flex apps today to Adobe’s BlazeDS remoting and messaging abstraction layer, and then down the road get an automatic upgrade in performance when BlazeDS and the FlashPlayer implement WebSocket support.

    Finally the W3C has something cooking that is very worthwhile. Otherwise I’d written W3C off as completely hopeless.

  8. David Davis Says:

    Jacob: Ah I see


  9. GregWilkins Says:

    I think this is a huge step backwards. Sockets are simply not the right level of abstraction that we want to expose in javascript. Asynchronous socket programming is hard, with one of my favorite examples being, what does a programmer do with 3 bytes of a 6-byte UTF-8 character?

    Currently, nice native code in the browser handles our dataframing and character conversion for us. Websockets is putting that difficult burden onto the javascript framework.

    HTTP gives us a lot of support with character encoding and data framing. Many of the so-called hacks take great advantage of this and use the efficient implementations in servers and browsers. Dropping the semantic of the conversation to raw bytes on the wire will just turn js framework developers into async IO programmers and introduce years of instability while they learn how hard that is!

  10. Frank Salim Says:

    The authors of the WebSocket spec anticipated this, so WebSocket differs from a TCP socket* in that it is text-based and framed.

    You will see that the browser hides buffering so that JavaScript only deals with complete frames (which are strings). This makes sending and receiving complete JSON objects or XML very easy. As a matter of fact, this sounds like _exactly_ what you want.

    *WebSocket is on top of TCP — it isn’t reinventing the wheel

  11. GregWilkins Says:

    OK, I stand corrected.

    They need to work on their names and descriptions. The name Socket is misleading and their text talks about bytes being available etc.

    So this is a messaging system then, where the messages are just strings that can be XML, Json, javascript or something else. Well that I like!

  12. Orbited Blog » Blog Archive » Talk at OSCON 2008 Says:

    [...] Independence Day: HTML5 WebSocket Liberates Comet From Hacks [...]

  13. Comet Daily » Blog Archive » Comet Gazing: WebSocket Says:

    [...] time, we’ve asked our contributors the question: “With the recent addition of WebSocket to the HTML 5 recommendation, what impact will this have on your Comet implementation in both the [...]

  14. Orbited Blog » Blog Archive » Our Ancestor’s Secrets: WebSocket Article and Panel Says:

    [...] on WebSocket for the Silicon Valley Web Builder blog. He talks about his experiences explaining WebSocket to developers, and how we can “recover our ancestor’s secrets” of good [...]

  15. Comet Daily » Blog Archive » Dojo and WebSocket Says:

    [...] HTML5 defined WebSocket has been gaining in popularity because of its efficient and intuitive approach to Comet. Orbited [...]

  16. John Bailo Says:

    I’ve been trying to avoid Comet altogether until someone came up with a true means to invoke callback functions remotely.

    This seems like the real thing — closer to RMI than a “long polling” hack like Comet.

  17. Protocols for the real-time web « Thoughts on technology and social web Says:

    [...] without the long poll overhead. The javascript API is also very simple. COMET can surely take advantage of this and things become more straight forward without hidden iframes and arcane protocols. So can [...]

  18. javamike Says:

    > Metal Hurlant : Every browser with Flash 9 installed has support for raw TCP sockets

    So does every browser with a java plugin. The focus is to support this functionality without relying on plugins.

  19. testm Says:

    Hi all,

    The problem is that I don’t think it will work thru “any” proxy :(

    For instance the Orbited demo is blocked here :( :( :(

    This whole story about tunneling (or “simulating”) a TCP socket into an HTTP is quite odd to me.

    If the whole thing is about using one single point out, then you will fall into priority handling. If the whole thing is on optimizing the stream, then you already have TCP.

    As javamike said, a TCP socket is accessible from either a Java applet (no need to sign as long as you call the “home” server) or a Flash file … this is possible right now. This spec looks like nothing but pushing the TCP API into the web browser but not having the full benefit of TCP: no straightforward compatibility = need an adapter.
    Applets or Flash are viable solutions right now. There will only be a benefit to using WebSocket when it is implemented in most major browsers. This will not happen. I don’t think MS will do that.

    The standard-compatible, full-featured browser is either a mirage or a myth. You can’t get tons of features and stay compatible without strong Test Compatibility Kits. There is none mandatory for a “Web Browser” to tout “Compatible Web 2.0”, for instance ;-)
    Hence the problem of getting something as simple as asynchronous data sending.

    If in any case browsers become compatible, be sure firewalls will block that.

    By adding another layer, you create the illusion of opening possibilities, but the basic problem is the security model of the whole thing. Chain-o-trust if you prefer …

    Don’t get me wrong, I would be glad if it succeeded, but I doubt it will.

  20. Comet Daily » Blog Archive » Is WebSocket Chat Simple? Says:

    [...] WebSocket protocol has been touted as a great leap forward for bidirectional web applications like chat, promising a new era of simple [...]

  21. Don Moir Says:

    Michael, in one of your articles you make the statement that ‘everything sucks’ and it surely does.

    About 6 years ago I started looking at browsers and realtime socket connections. At that time the only realistic approach was to use a java applet. I supported MS JVM and Sun Java 1.1 and up. I used HTML for display purposes as Java was and still is clunky as a display mechanism within a browser. This had to be same domain only as well. There were and still are version and user installation problems with Java. Java 1.6 now offers cross-domain socket connections but am I to ask a naive user to install it ? Sun absolutely blew it by not having a lightweight container for browser applets.

    HTML5 Websockets ? I wonder also how this is actually going to work in some proxy scenarios. We have corporations that use proxies that still only support HTTP 1.0. Corporations in general move slowly. It’s probably going to be years and in the meantime we can layer more junk to support this, that, and the other thing.

    These days I use Flash. It sucks too, but provides good support for cross-domain socket or HTTP connections. You can use binary or text data and it is widely available. Of course it is still a plugin and much of it seems to have been written by amateurs.

    My main thing is to allow cross-domain connections and wide support for it. This allows me to listen on port 80 for socket or HTTP connections quite easily.

    If I have to change my server to support websockets, big deal, it’s not like I don’t have to jump thru hoops already.

    I am getting older by the minute, faster than normal that is.

  22. Websocket Chat - Grails websocket Says:

    [...] websocket protocol has been touted as a great leap forward for bidirectional web applications like chat, promising a new era of simple [...]

  23. Calling Mr. Client, do you read me? WebSockets to the rescue! | Front end Blog Says:

    [...] Protocol: This entry was posted in front end, user experience and tagged comet, HTML5, websocket. Bookmark [...]

Copyright 2014 Comet Daily, LLC. All Rights Reserved