« POST Considered Inconvenient
I Can Outrun a 767 »

Why Tim Berners-Lee is Wrong

The W3C is finally waking up and realizing they’ve got a problem with HTML. The browser vendors are once again abandoning them and going their own way (except for Microsoft, which is going in a different direction entirely). The W3C has wisely decided to start listening to Mozilla, Opera, and Apple and revisit classic HTML. Unfortunately, though they realize they have a problem, they haven’t yet realized what the problem is. Berners-Lee seems to think it’s about “quotes around attribute values and slashes in empty tags and namespaces”, and it’s not.

XHTML is not the problem. Well-formedness is certainly not the problem. Hell, even namespaces aren’t really the problem although they’re clunky and ugly and everyone hates them. The problem is that the W3C has abandoned HTML for years. HTML hasn’t moved forward since 1999. No wonder browser vendors are getting antsy.

XHTML (1.0 and 1.1) is nothing but a reformulation of HTML. It is a very good reformulation that offers real benefits to developers and authors. However, it doesn’t add any significant new functionality. It makes many tasks easier (especially ones that involve machine processing of HTML) but it doesn’t make anything new possible. Nonetheless, it’s an unalloyed good thing, and we should keep it. Berners-Lee complains that:

The attempt to get the world to switch to XML, including quotes around attribute values and slashes in empty tags and namespaces all at once didn’t work. The large HTML-generating public did not move, largely because the browsers didn’t complain. Some large communities did shift and are enjoying the fruits of well-formed systems, but not all.

The simple fact is that it’s hard to change direction on a moving train. It’s even harder to change direction when that train is made up of millions of independent authors and software vendors. It takes years, but guess what? The train is moving. XHTML is winning. More and more pages are being served in valid XHTML, and more and more tools are generating it. We may never get rid of classic HTML in my lifetime, but there’s no reason to give up on XHTML now.
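To make the machine-processing benefit concrete, here is a minimal sketch in Python. It is only an illustration, not anything the W3C or Berners-Lee proposed: the two markup snippets and the `is_well_formed` helper are made up for this example. It shows that a fragment with quoted attribute values and closed empty tags can be handed to any generic XML parser, while the equivalent tag soup cannot.

```python
import xml.etree.ElementTree as ET

# Hypothetical snippets: the same paragraph written as XHTML and as tag soup.
WELL_FORMED = '<p class="note">Line one<br/>Line two</p>'
TAG_SOUP = '<p class=note>Line one<br>Line two'


def is_well_formed(markup: str) -> bool:
    """Return True if a generic XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # Unquoted attribute values and unclosed tags end up here.
        return False


if __name__ == "__main__":
    for name, markup in (("XHTML", WELL_FORMED), ("tag soup", TAG_SOUP)):
        print(name, "->", "parses" if is_well_formed(markup) else "rejected")
```

A real browser obviously copes with the second snippet, but only by carrying a dedicated error-recovering HTML parser; the point here is that the first snippet needs no such special handling.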

The problem is not now and has never been XHTML or well-formedness. The problem is that the W3C lost interest in improvements to HTML and XHTML. Instead they’ve run off and started work on huge, complicated, monolithic plugin technologies like XForms, MathML, and SVG, but even these aren’t the problem themselves. Considered individually they’re each useful and practical. The problem is that the W3C stopped worrying about the smaller problems, like how to DELETE a URL with a web form, how to identify a date in a document, or how to log out of a site that uses HTTP authentication. There’s still a lot of room for improvement in classic HTML and XHTML. There are still elements and attributes and attribute values that are simply missing and glaring by their absence.
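The DELETE example is easy to illustrate. HTTP itself defines the method, and any programmatic client can use it, but the `method` attribute of an HTML 4 form only accepts `get` and `post`. The sketch below uses Python's standard library to build (not send) such a request; the URL and the `build_delete_request` helper are hypothetical, chosen only for this example.

```python
from urllib.request import Request


def build_delete_request(url: str) -> Request:
    """Build an HTTP DELETE request for the given URL without sending it."""
    return Request(url, method="DELETE")


# Hypothetical resource URL, used purely for illustration.
req = build_delete_request("http://example.com/articles/42")

if __name__ == "__main__":
    # Passing req to urllib.request.urlopen() would actually send it;
    # here we only inspect what would go on the wire.
    print(req.get_method(), req.full_url)
```

Nothing comparable can be written as a plain web form, which is exactly the kind of small gap described above.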

The W3C’s mistake was ignoring these little things while it worked on big problems like MathML and SVG. What’s needed now is not an abandonment of the good work the W3C has done in XForms, SVG, MathML and most especially XHTML. Instead what we need to do is tie up the loose ends. Finish what Tim Berners-Lee started way back in 1989, and make HTML a really solid language for the writing and reading of narrative content.

Then we can make it even more powerful by mixing in XForms, SVG, MathML, MusicXML, and other pieces. However, we can only do this if we keep well-formedness, keep XHTML, and keep namespaces. These are all critical to enabling HTML to expand beyond the narrow confines of newspapers, blogs, personal home pages, and online stores. Otherwise we’ll be condemned to a hell of tag soup and JavaScript for all eternity; and that is not a fate I wish to experience.
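As a rough sketch of why namespaces matter for this kind of mixing, the Python example below parses a small compound document and counts elements per vocabulary. The document content and the `count_by_namespace` helper are invented for illustration; the three namespace URIs are the standard ones for XHTML, SVG, and MathML.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical compound document mixing three vocabularies.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <p>A circle and a formula:</p>
    <svg xmlns="http://www.w3.org/2000/svg" width="20" height="20">
      <circle cx="10" cy="10" r="8"/>
    </svg>
    <math xmlns="http://www.w3.org/1998/Math/MathML">
      <mi>x</mi>
    </math>
  </body>
</html>"""


def count_by_namespace(markup: str) -> dict:
    """Count elements per namespace URI in a well-formed document."""
    counts = {}
    for element in ET.fromstring(markup).iter():
        # ElementTree spells qualified names as "{namespace-uri}localname".
        tag = element.tag
        ns = tag[1:].split("}")[0] if tag.startswith("{") else ""
        counts[ns] = counts.get(ns, 0) + 1
    return counts


if __name__ == "__main__":
    for ns, n in count_by_namespace(DOC).items():
        print(n, "element(s) in", ns)
```

Because every element carries its namespace, a processor can route the SVG subtree to a graphics renderer and the MathML subtree to a math renderer without guessing from tag names, and the whole thing depends on the document being well-formed in the first place.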


This entry was posted on Sunday, October 29th, 2006 at 8:44 am and is filed under XML, Web Development.

27 Responses to “Why Tim Berners-Lee is Wrong”

  1. Guy Says:
    October 29th, 2006 at 3:31 pm

    Yes! The w3c needs to hear this message loud & clear… for the sake of everything we hold dear on the www.

  2. John Cowan Says:
    October 29th, 2006 at 3:39 pm

    That sounds like exactly what XHTML 2.0 is, and fortunately there will be an XHTML 2.0 task force.

  3. SamFeltus Says:
    October 30th, 2006 at 2:42 am

    Maybe the problem is…

    HTML is a Web Display Technology

    HTML should have been a browser plugin, the same as Flash, Windows Media, Java, QuickTime, etc.

    While the W3C sleeps, Flash is leaving HTML in the dust.

  4. Matthew Wilson Says:
    October 30th, 2006 at 4:29 am

    “The problem is not now and has never been XHTML or well-formedness.”

    And yet people have felt the need to write liberal XML parsers for RSS, which people and tools can’t seem to get right on a regular basis.

    “XHTML is winning.”

    That seems a strong claim.

  5. Craig Says:
    October 30th, 2006 at 4:33 am

    I think quite a bit of work is going on with XHTML2 and WHATWG… for example, they are looking at a possible <nl> tag (navigation list) to replace the <ul> that web developers tend to use for navigation (will help search engine spiders identify page content and the navigation bar).

    Have a look at these…

    www.w3.org/TR/xhtml2/mod-list.html#edef_list_nl
    whatwg.org/specs/web-apps/current-work/#the-nav (the WHATWG equivalent)
    www.w3.org/TR/xhtml2/mod-roleAttribute.html (a great tool to identify elements like a site map).

    Originally I wrote a small article with some of the problems I was experiencing with (X)HTML, and my solution with CSS… then I found out about the above tools…

    www.krang.org.uk/searchEngineCSS/

  6. Elliotte Rusty Harold Says:
    October 30th, 2006 at 5:51 am

    The problem with RSS is that too many people involved with it never took the time to learn even the basics of XML. That one of its chief evangelists was XML-clueless didn’t help. When even the inventors don’t understand what they’re doing, what hope is there for the rest of us?

    However, RSS is being rapidly replaced by Atom; and Atom, unlike RSS, makes no compromises on well-formedness. As Fred Brooks wrote, “Plan to throw one away. You will anyway.” RSS is the one we’re going to throw away. It was a learning experience, but not what we’ll see moving forward. Perhaps the problem with XHTML is that people never planned to throw HTML away?

  7. len Says:
    October 30th, 2006 at 7:54 am

    Perhaps the problem is that the inventor of HTML was clueless about SGML, the inventor of RSS learned by example, and by the time an application whose dominant design characteristic is ease of typing tags over maintaining large distributed applications reaches out to millions of users it can no longer reliably scale to more complex applications.

    IOW, the fielding of the World Wide Web was witless both technically and socially.

    Tag soup was the imprimatur. As the twig is bent, so grows the tree. No amount of force or persuasion will change that. What might change in a few minds in a few places is an understanding of how design for public and very large systems has to be done to stay out of these traps.

    What are the requirements that make it necessary to take the chunks out of the soup and put them back into the can?

  8. Jim Says:
    October 30th, 2006 at 3:08 pm

    The problem with XHTML adoption has one clear factor that stands out head and shoulders above the rest: Internet Explorer doesn’t support it. Given that fact, only in corner cases can the average web developer ever possibly have anything to gain by using it.

    The thing is, this problem applies just as well to *any* improvement to HTML that the W3C is going to cook up. Saying that incremental improvement is going to help is just sticking your head in the sand. If Internet Explorer doesn’t implement the new stuff it might as well not exist for the majority of web developers, incremental or not.

  9. Matthew Says:
    October 30th, 2006 at 6:55 pm

    Since I worked on a browser several years ago, this is of interest to me. My opinion is that the w3c is and has always been the problem. HTML is not really a spec, even if you just consider the syntax. But the real problem is that HTML is a presentation language and that’s not addressed. I do remember an RFC on table layout being referenced, but the major browsers didn’t conform to it. HTML is not unique: every spec I’ve seen from the w3c is poor (XML, SOAP, XML Schema….). The w3c also doesn’t seem to know how to manage a standard.

    In my opinion correct standard management requires:
    1. Define a specification using a defined syntax (note how w3c specs don’t define the BNF grammar they use, note how the IETF specs do). People find it hard to write a parser based upon paragraphs of description.
    2. Define and provide conformance and validation tests for the specification. People should be able to test to see if their implementation conforms to the specification.
    3. Provide a reference implementation.
    4. Define and provide an interoperability suite.
    5. Define a method for subsetting and extending the specification.
    6. Enforce the use of the standard so that only conforming implementations exist.

    Instead the W3C put out a spec a month for years without providing all the necessary support to make them a success. Face it: Microsoft now defines the standard for HTML and the w3c is lost in a haze of the semantic web or whatever. Now people just need to acknowledge this and turn out the lights and lock the doors at the w3c.

  10. ben Says:
    October 30th, 2006 at 10:06 pm

    Excepting Nos. 4 and 6 (every software vendor wants to follow the Sinatra Doctrine, and enforceability only works in a certification context AFAIC), Matt’s list is great - and even the items I write off as pipe dreams are spot-on in spirit. Even just having a SOLID REFERENCE IMPLEMENTATION (whether written by W3C, or taken to market by a publisher and consequently certified as the reference implementation) would BY ITSELF be a huge step in the right direction for nearly any W3C technology you can name… I believe that a lot of standards-aware developers would latch onto that with a vengeance, especially if the title in question wasn’t Internet Explorer. I also believe that Microsoft’s influence is one of the reasons why that step has never been taken.

    As for the relevance of the W3C - regarding technologies in current use, they can’t really hope to influence the process beyond what individual participants might take back to their employers after dialogues at a W3C soiree - information that product managers are likely as not to ignore or forget. The better part of the influence comes from inertia and market forces, for better or worse.

    When it comes to the Semantic Web, well, if the Recommendation track work on it ever starts to achieve fruition, then the W3C has a snowball’s chance of having its way there. I hope.

  11. len Says:
    October 31st, 2006 at 8:53 am

    It’s not quite that simple. The WHATWG has buy-in from enough developers and vendors to produce changes in the implementations of their browsers. Those browsers are a weak but not insignificant force of competition on Microsoft. Microsoft will harvest the best changes they see in that domain and put them into IE. The distribution of features across browsers and browser users will be uneven but there will be an overall improvement.

    There is no such thing as a homogeneous technology ecosystem. Homogeneity is gray goo.

    As to standards, maybe some people are finally beginning to understand what the forty-somethings tried to tell them in 1993-4-5: it don’t come easy. Know the difference between specifications for a technology developed in an emergent space versus a standard for a technology made interoperable in a mature space. These are different kinds of processes for different kinds of products at different stages of a technology lifecycle.

    As for the Semantic Web: SGML (Sounds Good Maybe Later), meaning its time will come when the need for it outweighs the costs of implementing it such that the results are worth the trouble. When that happens economics and marketing will require it be rebranded and have new names on the specifications.

    As for the W3C: it’s sooooo over. You will one day look back at its ten years of dominance as the good ol’days. And so it goes.

  12. Reinventando HTML « Predeciblemente Impredecible Says:
    October 31st, 2006 at 9:31 am

    […] Tim Berners-Lee laments the poor adoption of XHTML as a standard on the web and the scant support from browser developers. Another interesting post on the subject is the one titled Why Tim Berners-Lee is wrong, which has made me think about the former’s words, and it really strikes me as a fairly accurate opinion. […]

  13. johnk Says:
    October 31st, 2006 at 4:18 pm

    How about the W3C come up with an xml language for creating menus, that can be embedded only into xhtml docs. That might spur more xhtml conformance. Nearly every popular page on the web has some kind of menu. A well designed one would provide some level of semantic information, in a form comprehensible to computers. It could be the RSS of documents that don’t change very much.

  14. Blackheart Says:
    November 2nd, 2006 at 12:22 am

    If browser vendors really have nothing to do, then they might, oh I don’t know, let me see, how about: fix the bugs in their CSS rendering?!? Sweet Jesus, how many years does it take? About the only renderer I feel I can trust is Firefox’s Gecko. I still regularly trip across bugs in Safari and Opera, and IE 7 (from what I hear) is still a joke. And rendering in other applications like Dreamweaver is even worse.

    I mean, I really have to say that browser rendering seems like an infinite regress. If it’s not one thing, it’s another. If operating systems were written by the people who write browsers, everyone would need a RAID array and we would be rebooting every half an hour. The word “standard” in this context is just a pretty word that everyone pays lip service to. In reality, there are ten different standards, named IE 5.0, IE 5.5, IE Mac, Netscape 4, Firefox, Safari, Opera, … If they had actually gotten it right in the first place, we could say “CSS 1” or “CSS 2”, but no.

    And if the vendors think they’ve ironed out all the rendering bugs, then how about supporting more of CSS2, or CSS3? I would give anything for cross-browser support of CSS2’s font-size-adjust… or, say, even one browser that supported it! It’s absolutely indispensable for matching different fonts in a heterogeneous environment such as, for example, oh I don’t know… the WWW?

    And then if they STILL have nothing to do, maybe they could add MathML support in something besides Firefox, so I can actually depend on it and publish things with it rather than just admire it in Firefox.

  15. junior Says:
    November 2nd, 2006 at 10:06 am

    Writing an HTML browser today is actually harder than writing an operating system. It’s not a trivial task to develop a parser that can process all that messy HTML code that is floating around in the wild. Parsing well-formed, valid XHTML is not a big deal, but every real-world browser will have to deal with legacy HTML as well.

    Reference implementations are fine as long as all other implementations conform to them. But as a matter of fact, Internet Explorer is still the dominant browser and I seriously doubt that Microsoft is really concerned about conforming to standards which they didn’t define themselves. They still haven’t even got the box model right.

    I’m really curious about the future of web applications. HTML is never going to cut it, no matter how many X’s you put in front of it. Flash is such a poor concept, I doubt it will survive the next couple of years. I imagine something along the lines of Display-PDF for advanced and reliable rendering and a virtual machine of Java’s fashion on which we can build the client-side code in a language of our choice. Higher level concepts like HTML could be built on top of this “Web-OS” to allow for a smooth transition.

    So much for my pipe dreams…

  16. Ragu Sivanmalai Says:
    November 5th, 2006 at 5:37 am

    While trying to assess the future of the internet, I read a survey conducted in 2005 about where the internet will go in the next 10 years. You can read about it in the following blog posting.

    ragusivanmalai.blogspot.com/2006/11/future-of-internet.html

  17. molly.com » Have Your Say about the Future of HTML Says:
    November 7th, 2006 at 9:00 am

    […] Some people asked for new features; others were wondering if formerly deprecated elements would return; some had comments and criticisms about the decision itself, the WHATWG or W3C process; and a few raised concerns about the WHATWG and W3C ignoring the needs of particular groups. The WHATWG, who are in the process of developing the next version of HTML (called HTML 5), feel that it’s important to not only listen to all of this feedback, but to actively seek it out and respond so that we can develop a language that meets your needs. […]

  18. Have Your Say about the Future of HTML - The Web Standards Project Says:
    November 7th, 2006 at 1:40 pm

    […] Some people asked for new features; others were wondering if formerly deprecated elements would return; some had comments and criticisms about the decision itself, the WHATWG or W3C process; and a few raised concerns about the WHATWG and W3C ignoring the needs of particular groups. The WHATWG, who are in the process of developing the next version of HTML (called HTML 5), feel that it’s important to not only listen to all of this feedback, but to actively seek it out and respond so that we can develop a language that meets your needs. […]

  19. have your say about the future of HTML · igoo Says:
    November 7th, 2006 at 2:16 pm

    […] Some people asked for new features; others were wondering if formerly deprecated elements would return; some had comments and criticisms about the decision itself, the WHATWG or W3C process; and a few raised concerns about the WHATWG and W3C ignoring the needs of particular groups. The WHATWG, who are in the process of developing the next version of HTML (called HTML 5), feel that it’s important to not only listen to all of this feedback, but to actively seek it out and respond so that we can develop a language that meets your needs. […]

  20. Webkrauts » Beteiligen Sie sich an der Zukunft von HTML Says:
    November 8th, 2006 at 3:28 am

    […] Some people asked for new features; others wondered whether deprecated elements would now return; some published comments and criticisms of the decision itself, the WHATWG, or the W3C’s processes; some raised concerns that the WHATWG and the W3C were ignoring the needs of particular groups. The WHATWG, which is in the middle of developing the next version of HTML (called HTML 5), believes it is important not only to listen to this feedback but also to actively seek it out and respond to it, so that we can develop a language that meets your needs. […]

  21. Tripix.net » Blog Archive » Iniciativa por el HTML 5: que alquien me lo explique Says:
    November 8th, 2006 at 2:22 pm

    […] This post discusses the stir that Tim’s initiative has caused and says that the opinions appearing in blogs, forums, and mailing lists contain many “false ideas” (Google’s translation of the term “misconceptions”). From there it explains how the community can contribute its ideas and suggestions. […]

  22. SitePoint Blogs » HTML’s Uncertain Future Says:
    November 9th, 2006 at 12:17 am

    […] Though these issues have been simmering for years, Tim Berners-Lee’s announcement has reignited the conversation. Some influential members of the community have begun to post their wish lists, others have questioned the W3C’s track record of selecting sensible additions to HTML, and still others have rallied against this perceived step backwards, calling for a renewed focus on XHTML. […]

  23. Juan R. Says:
    November 9th, 2006 at 9:28 am

    The problem with w3c specs is that several of them are fatally flawed: technically deficient, often incompatible with one another, and adding a lot of redundancy that slows implementations in browsers.

    Take the case of MathML. The whole spec is technically incorrect with the double core language, the element in c-numbers, the wrong orientation of items in , the basis problem in scripts, ultraverbosity (wait 15x size over a LaTeX or Mathematica file) etc, etc.

    If you want to implement MathML in a browser today, you need to implement ‘anything’ twice: XML parser + MathML special parser, CSS + , usual DOM + special MathML DOM… That is the reason only Mozilla implements half of the MathML spec (moreover, the implementation is deficient and so slow that it can take 10-20 minutes to render some MathML docs).

    Similar thoughts apply to SVG.

    XHTML fails because the whole concept is illogical. For instance, you define XML to be eXtensible, and that implies that tags are always closed. OK, this is understandable when tags are not defined a priori, , you cannot know if the second wee is inside the first or not, so you disambiguate by always using end tags, e.g. .

    In HTML, tags are predefined, and you can write and the computer knows exactly the structure because cannot be nested. You can write end tags; they are just unneeded there. This is also OK (note that this is not tag soup but an SGML-like minimization feature).

    Now you define an XHTML constrained by a DTD, but you are obligated to add the end tags: ! That is, you obtain a non-extensible system (if you want to use MathML, you have to change the DTD from XHTML alone to the mixed-document XHTML + MathML DTD), but you are forced to write empty tags even when they are not needed. You are not obtaining the best of both worlds, just the contrary.

    That was the theoretical part. On the implementation side, XHTML parsers are poorer than their HTML cousins. Mozilla even recommends that people who are not using MathML or any other XML application change from XHTML to HTML; for instance, XHTML rendering is not incremental.

  24. Filter for 1/11 2006 - Felt Says:
    November 10th, 2006 at 9:30 am

    […] The Cafes: Why Tim Berners-Lee is Wrong Like I Said. […]

  25. Yahia Says:
    November 11th, 2006 at 9:03 pm

    IMO, and after a good read of both the article and the precious comments, I think that the XHTML 2.0 improvements to the semantics of the markup should be also present in another HTML version? The NL tag, the role attribute, the section and headings(without numbering) are excellent [no need for DIVs, use section in conjunction with H like fieldset-legend. and for downlevel/highlevel headings, it will be computed by the UA, meaning an H tag which is a child of a SECTION tag which itself is a child of another SECTION, will be a level-2 heading]
    The paragraph improvement to include lists is also great.

    These are good improvements regarding HTML structure, not xHTML. Why didn’t HTML go this way with W3c ?

    As for HTML v. xHTML, I think that xHTML is really needed today and for the future, though it won’t become widespread just yet, because IE7 doesn’t support the application/xhtml+xml MIME type, and also because the websites that serve XHTML 1.0 or 1.1 as text/html don’t know what they’re doing.

  26. مدونة رضا البرازي » Blog Archive » شارك برأيك حول مستقبل الـ HTML Says:
    November 12th, 2006 at 9:23 am

    […] Some asked for new features while others wondered about the return of some removed elements; some had criticisms and comments about the decision itself and the working process of both the WHATWG and the W3C; and some raised concerns about the WHATWG and W3C ignoring the needs of particular groups. The WHATWG, which is responsible for developing the next version of HTML (called HTML 5), considers it necessary not merely to listen to this feedback but to seek out more of it and respond to it, so that we can develop a language that suits all of your needs. […]

  27. Blog Posible » Blog Archive » Qué dirías acerca del Futuro de HTML Says:
    November 13th, 2006 at 10:55 am

    […] Some people asked for new features; others wondered whether formerly deprecated elements would return; some have comments and criticisms about the decision itself, or the WHATWG or W3C process; and some were concerned that the WHATWG and the W3C were ignoring the needs of particular groups. The WHATWG, which is in the process of developing the next version of HTML (called HTML 5), feels that it is important not only to listen to all of this feedback but also to actively seek it out and respond, so that we can develop a language that meets your needs. […]

