
What else is buried down in the depths of Google’s amazing JavaScript?

So the new GTalk interface in GMail is pretty rad. Congrats to Dan and the rest of the team that made it “go”.

The talk feature is cool not just from a UI perspective as the code is also chock full of little gems. I’m kind of a dork about low-latency data transport to the browser. HTTP wasn’t meant to be used this way…so of course I’m interested! Ever since Joyce got me involved in the rewrite of mod_pubsub I’ve had my eye on the various ways that servers can push data to browsers and the kinds of technology that will prevent a server that’s doing this from melting down (hellooooooooo Twisted). Using just what’s available to the browser, it’s possible to have the server push data encapsulated in <script> blocks and rely on a progressive rendering behavior that every modern browser implements to dispatch events in near real-time (compared to full page refresh or polling delay). There are a mountain of browser quirks that of course play into this process. The least desirable of these to the user are the “phantom click” and the “throbber of doom” that afflict IE users.

When a page (or an iframe it hosts) is loading content, your browser usually shows some sort of “I’m working” indicator. In the bottom “taskbar” there is usually some sort of progress meter. In the upper right (on IE) the “throbber” will continue to animate until the work is done. Of course in the scenario I’m describing the sent page is never done. The whole point is that the server keeps the connection open. Combine this with the IE behavior of producing a “click” like sound when an iframe is navigated to a different URL, and you’ve got a pretty poor user experience.

But couldn’t you do something with XMLHTTP? Short answer: yes, but not as portably and it won’t get you around IE’s 2-connection limit either so there’s not much of a win. For the long answer, see my talk at ETech or wait for me to post the slides. At the end of the day, the hidden <iframe> hack scales best and is the most portable. Especially if you can lick the UX problems.

Which Google has.

How? By cleverly abusing another safe-for-scripting ActiveX control in IE. Here’s the basic structure of the hack:

  // we were served from child.example.com but
  // have already set document.domain to example.com
  var currentDomain = "example.com";
  var dataStreamUrl = "http://"+currentDomain+"/path/to/server.cgi";
  var transferDoc = new ActiveXObject("htmlfile"); // !?!
  // make sure it's really scriptable
  transferDoc.open();
  transferDoc.write("<html>");
  transferDoc.write("<script>document.domain='"+currentDomain+"';</script>");
  transferDoc.write("</html>");
  transferDoc.close();
  // set the iframe up to call the server for data
  var ifrDiv = transferDoc.createElement("div");
  transferDoc.appendChild(ifrDiv);
  // start communicating
  ifrDiv.innerHTML = "<iframe src='"+dataStreamUrl+"'></iframe>";

This is the kind of fundamental technique that is critical to making the next generation of interactive experiences a reality. Server tools like mod_pubsub and LivePage (and perhaps even JMS buses) are starting to come into their own and the benefits of event-driven IO are starting to become well understood by server-side devs. It’s only a matter of time before server-push data hits an inflection point in the same way that background single-request/single-response data transfer did with Ajax. Dojo will, of course, have infrastructure to support this kind of thing when the broader developer community is ready (most of it is already in place).
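
For the server half of the picture, here is a minimal sketch of what "streaming script blocks over an open connection" can look like. It is written for Node.js purely as an illustration of an event-driven server; it is not how Gmail, mod_pubsub, or Twisted do it. The names `scriptEnvelope`, `startPushServer`, and `dispatchStreamEvent` are made up for this example, and the once-a-second "tick" stands in for whatever real events a server would push.

```javascript
// Sketch only: scriptEnvelope, startPushServer and dispatchStreamEvent are
// hypothetical names, not taken from any real implementation.
const http = require("http");

// Wrap one event in a <script> block that calls a dispatcher on the hosting page.
function scriptEnvelope(event) {
  // Escape "<" so the JSON payload can never close the script tag early.
  const json = JSON.stringify(event).replace(/</g, "\\u003c");
  return "<script>parent.dispatchStreamEvent(" + json + ");</script>\n";
}

// Keep each response open and write a new envelope whenever something happens.
function startPushServer(port) {
  const server = http.createServer((req, res) => {
    res.writeHead(200, {
      "Content-Type": "text/html",
      "Cache-Control": "no-cache"
    });
    res.write("<html><body>\n");
    // Placeholder event source: one "tick" per second.
    const timer = setInterval(() => {
      res.write(scriptEnvelope({ type: "tick", at: Date.now() }));
    }, 1000);
    // Stop writing once the client goes away.
    res.on("close", () => clearInterval(timer));
  });
  return server.listen(port);
}
```

The response is never ended, which is exactly why a process-per-connection server struggles with this pattern and an event loop does not.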

From long and painful experience and amazingly deep respect, I take my hat off and bow to whoever it was on the GMail/GTalk team that figured this out. It’s a hell of a hack. It’s no wonder that Google has been able to attract and develop the best DHTML hackers in the world.

Update: so just to be *very* clear, I worked on the rewrite of the mod_pubsub *client*. The server rewrite was handled by some folks who are much smarter than I am.

This entry was written by alex, posted on February 12, 2006 at 7:11 pm, filed under dhtml, javascript, programming, webdev and tagged vim.

67 Comments

  1. robert
    Posted February 13, 2006 at 2:17 am | Permalink

    So, how does this work in Firefox then? Another technique?

  2. alex
    Posted February 13, 2006 at 2:27 am | Permalink

    On FF (1.5), the communication iframe only makes the statusbar say “Transferring data from example.com…” while the throbber stops when a subsequent HTTP request has finished. It’s much less distracting. While not perfect, it sure beats having the top thinger spinning, and a solution for the 85%+ browser is *much* more important for the acceptance of the technique.

  3. Simon Willison
    Posted February 13, 2006 at 2:31 am | Permalink

    Is the htmlfile object documented anywhere? I can’t find it on MSDN (but then I can never find anything on MSDN).

  4. alex
    Posted February 13, 2006 at 3:01 am | Permalink

    Not sure. I think the above syntax gives you a document that implements IHTMLDocument2:

    msdn.microsoft.com/library/default.asp?url=/workshop/browser/mshtml/reference/ifaces/document2/document2.asp

    Regards

  5. Jeff Lindsay
    Posted February 13, 2006 at 3:32 am | Permalink

    How’s the rewrite of mod_pubsub going?

  6. alex
    Posted February 13, 2006 at 3:35 am | Permalink

    It’s done but not yet released as Open Source software. You might ping Joyce to see what the current price for a copy is.

  7. Sjors Pals
    Posted February 13, 2006 at 3:54 am | Permalink

    Nice, but this is still a “hack”; the problem is that HTML is just not suitable for RIAs. I believe that “ajax” can be used for small parts of a website, but it’s totally not suitable for complete web-based applications with rich interaction. One of the worst things about Ajax is that its techniques are based on hacks, and not on standards. Another problem is that HTML is just not suitable for the rich internet. For example: if you need an accordion element, tree structure, or tab interface, you still have to build it in HTML, while in Flash it’s just adding 1 tag and it’s completely rendered on the client.

  8. alex
    Posted February 13, 2006 at 4:03 am | Permalink

    Sjors,

    Of course it’s a hack. Welcome to the web. This is how real work gets done out here in the world of universal deployment.

    As for Flash and components, I invite you to check out the widgets we’re building in Dojo. They make building richer interfaces easier, to the extent that declaring rich components can be as little as a couple of tags. These components aren’t Flash, but that’s both a benefit and a liability.

    Regards

  9. Jake
    Posted February 13, 2006 at 6:25 am | Permalink

    Tables were hacks too, and look at how using them turned out. I have a feeling that we’re going to be repeating history with applications based on code like this.

  10. Scott
    Posted February 13, 2006 at 10:12 am | Permalink

    Forgive the denseness, but how is this different from programmatically or manually (via a click) changing the src of the iframe to a dynamic page and writing out the javascript data using some server side technology? (e.g. the “back in the old days” method) Or if you prefer, changing the location of a 0px w/h or hidden frame?

  11. alex
    Posted February 13, 2006 at 11:18 am | Permalink

    Scott,

    It’s unique in several ways. It builds on programmatically “moving” an iframe to a different URL, but with the cooperation of the server it streams events down the wire without closing the connection. Furthermore, unlike hosting the iframe directly under the spawning document, this technique avoids the background “click” noise and prevents the throbber from spinning. It’s a usability enhancement to a well-known technique (at least in the small community of people that care about low-latency data to the browser).

    Regards

  12. Dan Pupius
    Posted February 13, 2006 at 11:47 am | Permalink

    Cheers Alex. The team are glad to know people are noticing the technical achievements of Gmail Chat. As you know, I joined the team quite late on and was equally impressed when I found out how they were handling the persistent connection; it’s a stroke of genius.

  13. Dan Pupius
    Posted February 13, 2006 at 11:47 am | Permalink

    (For the record I only played a small role in this launch)

  14. Dimitri Glazkov
    Posted February 13, 2006 at 1:59 pm | Permalink

    That’s pretty cute. Indeed, the call will return IHTMLDocument2. I wonder if this leaks any?

  15. lescoste
    Posted February 14, 2006 at 4:48 am | Permalink

    Hi, Nice job going thru the js code.
    But did you find out how gmail talk detects when you are away?

  16. Tim
    Posted February 14, 2006 at 4:58 am | Permalink

    Thanks for the insight.

    It’s amazing how irritating that iframe click is on the sites that use that method.

  17. Hull
    Posted February 14, 2006 at 4:58 am | Permalink

    The problem I see is that common web hosts aren’t likely to be very happy, are they? As far as I can tell, using this technique with any common web host will tie up Apache workers, reducing the number of ‘ready to serve’ Apache workers, introducing various performance side effects, say from the number of requests going through per minute to how well you can withstand a slashdotting.

  18. Jonathan
    Posted February 14, 2006 at 7:14 am | Permalink

    I think this is pretty cool, I just find the inconsistencies that exist between AJAX implementations a bit of a pain. Until the standards grow I just don’t see the AJAX movement going any further; all we do is hack out some code for small little apps when developing a page using AJAX instead of using a standard built-in method. Anyway, this was a great read!

  19. Tim
    Posted February 14, 2006 at 8:05 am | Permalink

    What about the up and coming Windows Smart Client tools? I don’t know much about it but I had heard it will make the “Desktop over internet” experience even more of a reality. It sounds very promising.

  20. Filip de Waard
    Posted February 14, 2006 at 9:33 am | Permalink

    A potential downside of this approach is the need to send the complete HTML to the client, while with XmlHttpRequest you could send a more efficient XML format and let JavaScript reformat the output. This means you have to deal with more ‘live’ data and bandwidth usage. Of course, both approaches have their advantages and the ‘right tool for the job’ rule applies, but I’d like to mention this nevertheless…

  21. alex
    Posted February 14, 2006 at 9:48 am | Permalink

    Hull: this stuff won’t run on today’s Apache (hence the link to Twisted). The current worker setup is just too resource intensive for “zombie” connections.

    Tim: as I’ve said here before, you can develop richer interactions in whatever environment you like, but please don’t make any mistake that if it’s not HTML, CSS and JavaScript, it’s not the web.

    Filip: You’re not sending “the complete html”, you’re sending small datagrams encoded in a script tag envelope. The data on the wire is usually some sort of JSON data structure.

  22. Sandy McArthur
    Posted February 14, 2006 at 10:33 am | Permalink

    This isn’t really new, it just got associated with the shiny buzzword AJAX now. The concept of an open pending http connection is exactly what iTunes does when sharing between two computers. iTunes’ DAAP is really just an HTTP service between two clients. They keep an open HTTP request and if the server needs to tell the client something now it just sends a response back over that request.

  23. George
    Posted February 14, 2006 at 10:34 am | Permalink

    I have been doing an AJAX-like technology for several years now. Instead of using an ActiveX object, I simply define a number of script tags and then use javascript to assign them to send and retrieve data from the server.

    function loadCustomer(McustId) { document.all.general1.src = "…"; }


    The php returns executable javascript, which obviously works like regular javascript. Instead of using XML, I format it so that it renders the data in the form needed for each specific case.

    The issue of PUSH is interesting to me. There are many times that I would like my local application to know about events. I use polling currently. I have thought about using Flash, which has notification capabilities. Really, all that is needed is to say “Hey, its time to check the server.”

    I am unclear about the server aspect of this particular approach (Twisted).

  24. Tree
    Posted February 14, 2006 at 3:13 pm | Permalink

    The multipart http content type is what you are looking for. Netscape already invented that back in the 90s.
    wp.netscape.com/assist/net_sites/pushpull.html

    They were just too far ahead of their time. The web wasn’t mature enough to use it. Here is a link on how you can implement a “Serverside-Push” web application. Combined with an iframe and the XMLHTTPRequest object (AKA Ajax) you can build a realtime application. Although, given how most server-side scripting languages (PHP, Perl, Python and Ruby) are integrated into the web server (each process doesn’t know about the other processes), they are not suitable for such a task.

  25. alex
    Posted February 14, 2006 at 3:17 pm | Permalink

    Tree,

    Multipart is *not* what we’re looking for. We need something portable, and multipart isn’t it. While I would personally prefer it if Opera, Safari, and IE would agree on a multipart boundary and encoding syntax, it hasn’t happened. Until then, the iframe hack is the lowest latency option.

    Regards

  26. Peter Goodman
    Posted February 14, 2006 at 9:16 pm | Permalink

    I just started playing around with this, but it’s got me stumped: I’ve added an onload event to the iframe, but I don’t expect that to give me anything useful. Do I simply continuously check to see if the iframe’s innerHTML has changed?

  27. Tal Broda
    Posted February 14, 2006 at 9:55 pm | Permalink

    This is not new technology. It has been around for at least 6 years. The guy who invented it works for Oracle, and so Oracle now owns the patent on it.

    I came to Oracle when it acquired PeopleSoft, which acquired a startup I was the architect for called istante software.

    We use this technology in Oracle BAM to keep our business activity monitoring dashboard up to date in real time. In fact we guarantee that the time elapsed between our backend server getting a transaction committed and the time it shows up in all of the dashboard that are affected is not more than 2-5 behinds real time.

    We use 1 connection for all of the dashboards a user has on his desktop, and we “multiplex” the ActiveData through that connection (which is one of our pending patents).

    We send down XML documents that only have the change each view needs to get, and not complete HTML like someone was suggesting here…

    Using Oracle BAM users can build real time dashboards and alerts on any data model they define, and doing all of that happens also in 100% thin and rich web based applications (which are also using Ajax when it makes sense).

    The first version of our product with all of what I mentioned shipped in 2003.

  28. alex
    Posted February 14, 2006 at 11:35 pm | Permalink

    Hi Tal,

    I never claimed that the technique of streaming data down the wire was new. When I re-wrote the client for the Open Source’d version of mod_pubsub several years ago, the technique was old then. We even re-implemented the hack that allowed multiple browser windows to use the same connection (think IPC over cookies).

    What I *did* claim is that Google’s improvement on it with respect to IE *is* novel and that it removes a significant usability barrier. If Oracle implemented this particular browser-specific workaround, that would be good to know so credit can be given where it’s due.

    As for patents and timing, I suggest you contact KnowNow whose business started around this technique (I believe) prior to the date you mention. There is likely not only prior art, there is prior art for the multiplexing portion and all of the client techniques.

    I recommend that you do your homework on this one before claiming that your company has been wronged, that anyone is in breach of patent, or that I have stated something inaccurate. Well-informed corrections are welcome, however.

    Regards

  29. alex
    Posted February 14, 2006 at 11:38 pm | Permalink

    Peter: what I blogged is only the smallest portion of the overall technique for streaming data to the client over HTTP. An onload event won’t get you where you want to go. The script tags sent down the wire themselves need to call the dispatchEvent() method (or whatever your version will call it).
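
    A minimal sketch of the client side of that dispatch, assuming each streamed script block calls a function with a JSON event object that carries a `type` field. The names `subscribe` and `dispatchStreamEvent` are hypothetical (the latter is used instead of plain `dispatchEvent` so it doesn’t shadow the DOM method of the same name); none of this is taken from Gmail’s code.

```javascript
// Hypothetical event registry for streamed events; not from any real library.
var listeners = {};

// Register a handler for events of a given type.
function subscribe(type, handler) {
  (listeners[type] = listeners[type] || []).push(handler);
}

// Called by each streamed <script> block, e.g.
//   <script>parent.dispatchStreamEvent({"type":"msg","body":"hi"});</script>
// Returns the number of handlers that were invoked.
function dispatchStreamEvent(event) {
  var handlers = listeners[event.type] || [];
  for (var i = 0; i < handlers.length; i++) {
    handlers[i](event);
  }
  return handlers.length;
}
```

    The page subscribes once (`subscribe("msg", showMessage)`, say) and every script block that arrives over the open connection fires the matching handlers as soon as the browser renders it.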

  30. Filip de Waard
    Posted February 15, 2006 at 5:41 am | Permalink

    Oops, I misread the article. I concluded that the data going through is the final product: HTML. Of course, HTML is less compact than XML, so that would mean more bandwidth. Obviously, I was wrong because the format doesn’t have to be HTML.

    This approach actually seems to be quite a viable alternative to XmlHttpRequest in some cases, so I’m definitely going to check it out and learn more about it. Thanks for mentioning it.

  31. Andrew Collins
    Posted February 15, 2006 at 12:50 pm | Permalink

    Isn’t this technique an updated form of “Pushlets?”

    Pushlets: Send events from servlets to DHTML client browsers
    www.javaworld.com/javaworld/jw-03-2000/jw-03-pushlet.html

    Discover how pushlets, a servlet-based notification mechanism, enables server-side Java objects to call back JavaScript code within a client browser.

  32. alex
    Posted February 15, 2006 at 1:06 pm | Permalink

    Andrew,

    It has gone by many names over the years. Pushlets was just one. IIRC, however, it scaled like a lead brick. Newer event-driven server environments that eschew Servlets and Threads in favor of CPS (Twisted and POE) or a single “keepalive dispatcher” (Apache Event MPM) allow the technique to finally scale well. Kernel level improvements like epoll and kqueue (wrapped in libevent) have accelerated this.

    As for how it looks on the wire and what the client strategy is, there are many ways to skin this cat. I’m only covering a single improvement by Google to one of the better-performing approaches.

    Regards

  33. james
    Posted February 15, 2006 at 3:43 pm | Permalink

    yawn, cgiirc works (has worked) on all modern browsers using an open connection to the web server. this isn’t new.

  34. alex
    Posted February 15, 2006 at 3:47 pm | Permalink

    James: you clearly didn’t read this. I didn’t claim that the technique of streaming data to the client was new. I claimed that Google’s variant on it that solves significant usability issues *is* novel.

  35. Marco Casalaina
    Posted February 15, 2006 at 6:55 pm | Permalink

    A super-hidden iframe, fair enough. Does this work around the 2 max connections limit that exists in both browsers, though?

    If not, has _anyone_ come up with a clever workaround for the 2 connections limit, other than changing the registry for IE and prefs.js for Firefox (which can have other undesirable side effects)?

  36. Tal Broda
    Posted February 15, 2006 at 8:54 pm | Permalink

    Alex,

    I never claimed that our company has been wronged, or that anyone is in breach of patent.

    We have checked prior art (including KnowNow) before we submitted the patent applications.

    I agree that Google’s use of this technology is awesome, and I think that what we do with it in Oracle BAM is no less cool.

    Tal.

  37. Tal Broda
    Posted February 15, 2006 at 8:55 pm | Permalink

    check out our website (soon we will have some live demos on it): www.oracle.com/technology/products/integration/bam/index.html

    Tal.

  38. Martin Franz
    Posted February 17, 2006 at 3:25 am | Permalink

    Hey Alex,

    dont know if this is all obvious to you guys, but it took me some time to figure out:
    in order to access a function “outside” the htmlfile ActiveX Object you’ll have to set a reference under the “parentWindow” property of the ActiveX Object.
    e.g.

    function foo() {…}
    transferDoc = new ActiveXObject("htmlfile");
    transferDoc.parentWindow.foo = foo;

    // inside the iframe
    parent.foo();

    gmail does it the same way, so i guess there is no better solution.
    in order to avoid the “browser keeps loading” syndrome on Mozilla, gmail uses XMLHttpRequest, which supports, at least under Mozilla, the readyState “INTERACTIVE”. It allows access to the responseText while it’s still loading. Unfortunately there is no way to clear the responseText, so every time that readyState fires you’ll have to substr out the stream data that you received before in order to get the newly received data. So it might be a good idea to reestablish the stream connection at some stage so the browser can free that memory. Certainly, this also applies to the iframe technique. (again, gmail does that as well)

    greets from germany,

    martin
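
    The “substr out what you already received” step can be sketched like this. It is a minimal illustration, not Gmail’s code: `makeIncrementalReader` and `openStream` are hypothetical names, and the URL passed to `openStream` is whatever streaming endpoint your server exposes. It assumes a browser that exposes a partial `responseText` while readyState is 3 (INTERACTIVE), which the comment above reports for Mozilla.

```javascript
// Returns a function that, given the full responseText received so far,
// hands back only the part that was not seen on earlier calls.
function makeIncrementalReader() {
  var offset = 0;
  return function readNew(responseText) {
    var fresh = responseText.substring(offset);
    offset = responseText.length;
    return fresh;
  };
}

// Browser wiring (hypothetical helper): call onData with each new piece.
function openStream(url, onData) {
  var xhr = new XMLHttpRequest();
  var readNew = makeIncrementalReader();
  xhr.onreadystatechange = function () {
    // 3 = still receiving (INTERACTIVE), 4 = finished.
    if (xhr.readyState === 3 || xhr.readyState === 4) {
      var fresh = readNew(xhr.responseText);
      if (fresh) {
        onData(fresh);
      }
    }
  };
  xhr.open("GET", url, true);
  xhr.send(null);
  return xhr;
}
```

    Because `responseText` only ever grows, tearing the request down and calling `openStream` again every so often is what lets the browser release the accumulated text.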

  39. Martin Franz
    Posted February 17, 2006 at 3:30 am | Permalink

    oops, i didn’t know that a triple dash does some formatting stuff. would you please correct that and delete this post here?

    thanx!

    greets,
    martin

  40. Rui Pinheiro
    Posted March 13, 2006 at 2:28 pm | Permalink

    Amazing stuff. I wonder about the possibility of doing the reverse, i.e., making the file upload process much smarter.

    Imagine resume, upload in blocks, etc. Besides being useful in a P2P-like situation, would be great where clients have to upload LARGE files to the server.

    I know, I’m a dreamer.

  41. Willem Mulder
    Posted March 25, 2006 at 9:28 am | Permalink

    So… Why does Google not use this to check if there’s new mail… As far as I know, there’s still a click on the ‘inbox’ link required to find out if there’s new mail… (or maybe it does check, but not often enough?)

  42. Jon
    Posted March 29, 2006 at 9:05 pm | Permalink

    Sorry to join the conversation so late; you wrote waaay back at the start of all this:

    “On FF (1.5), the communication iframe only makes the statusbar say “Transferring data from example.com…” while the throbber stops when a subsequent HTTP request has finished.”

    I’ve been having a very hard time reproducing that result, and I wonder if you could clarify a bit. Is it any old HTTP request on the page (i.e. an image or something), or does something fancy need to be done to make the spinner stop for Firefox?

    Thanks very much!

  43. Jon
    Posted April 1, 2006 at 8:42 pm | Permalink

    Something fancy DOES need to happen. I’ve managed to reproduce it, but I’m not entirely sure where the magic is…

  44. sam
    Posted April 8, 2006 at 11:20 am | Permalink

    Has anyone experimented with this “htmlfile” object to try multiple synchronous ajax requests? Seems like you could just use this instead of XMLHttpRequest.

  45. Nutz
    Posted May 6, 2006 at 11:17 am | Permalink

    Can someone (Alex?) post a working code example so we can see this Comet stuff in action ?

    Would/could it work with Microsoft IIS ?

    Thanks

  46. Gordon
    Posted June 9, 2006 at 4:22 pm | Permalink

    Maybe this is a stupid question, but I’m wondering one more thing. When a user (let’s call them the sender) clicks on another user’s name (the receiver) to chat, the sender opens up a small iframe window to chat and all this code does its magic. But how does the receiver’s window know to likewise open an iframe?

    My best guess is that as soon as anyone logs into Gmail, a persistent connection is established and is constantly left open, even when no one is chatting. Do you know if this is the case?

  47. Craig
    Posted June 9, 2006 at 7:42 pm | Permalink

    This rules! I’m in the process of releasing an iframe-based AJAX implementation that is able to get around the browser “same origin” policy (
