HTML 5 is a mess. Now what?

A few days ago on this site, John Allsopp argued passionately that HTML 5 is a mess. In response to HTML 5 activity leader Ian Hickson’s comment here that, “We don’t need to predict the future. When the future comes, we can just fix HTML again,” Allsopp said “This is the only shot for a generation” to get the next version of markup right. Now Bruce Lawson explains just why HTML 5 is “several different kind of messes:”

  1. It’s a mess, Lawson says, because the process is a mess. The process is a mess, he claims, because “[s]pecifying HTML 5 is probably the most open process the W3C has ever had,” and when you throw open the windows and doors to let in the fresh air of community opinion, you also invite sub-groups with different agendas to create competing variant specs. Lawson lists and links to the various groups and their concerns.
  2. It’s a “spec mess,” Lawson continues, citing complaints by Allsopp and Matt Wilcox that many elements suffer from imprecise or ambiguous specification or from seemingly needless restrictions. (Methinks ambiguities can be resolved, and needless restrictions lifted, if the Working Group is open to honest, accurate community feedback. Lawson tells how to contact the Working Group to express your concerns.)
  3. Most importantly, Lawson explains, HTML 5 is a backward compatibility mess because it builds on HTML 4:

[I]f you were building a mark-up language from scratch you would include elements like footer, header and nav (actually, HTML 2 had a menu element for navigation that was deprecated in 4.01).

You probably wouldn’t have loads of computer science oriented elements like kbd,var, samp in preference to the structural elements that people “fake” with classes. Things like tabindex wouldn’t be there, as we all know that if you use properly structured code you don’t need to change the tab order, and accesskey wouldn’t make it because it’s undiscoverable to a user and may conflict with assistive technology. Accessibility would have been part of the design rather than bolted on.

But we know that now; we didn’t know that then. And HTML 5 aims to be compatible with legacy browsers and legacy pages. …

There was a cartoon in the ancient satirical magazine Punch showing a city slicker asking an old rural gentleman for directions to his destination. The rustic says “To get there, I wouldn’t start from here”. That’s where we are with HTML. If we were designing a spec from scratch, it would look much like XHTML 2, which I described elsewhere as “a beautiful specification of philosophical purity that had absolutely no resemblance to the real world”, and which was aborted by the W3C last week.

Damned if you do

The third point is Lawson’s key insight, for it illuminates the dilemma faced by HTML 5 or any other honest effort to move markup forward. Neither semantic purity nor fault-tolerance will do, and neither approach can hope to satisfy all of today’s developers.

A markup based on what we now know, and can now do thanks to CSS’s power to disconnect source order from viewing experience, will be semantic and accessible, but it will not be backward compatible. That was precisely the problem with XHTML 2, and it’s why most people who build websites for a living, if they knew enough to pay attention to XHTML 2, soon changed the channel.

XHTML 2 was conceived as an effort to start over and get it right. And this doomed it, because right-wing Nativists will speak Esperanto before developers adopt a markup language that breaks all existing websites. It didn’t take a Mark Pilgrim to see that XHTML 2 was a dead-end that would eventually terminate XHTML activity (although Mr Pilgrim was the first developer I know to raise this point, and he certainly looks prescient in hindsight).

It was in reaction to XHTML 2’s otherworldliness that the HTML 5 activity began, and if XHTML suffered from detachment from reality, HTML 5 is too real. It accepts sloppiness many of us have learned to do without (thereby indirectly and inadvertently encouraging those who don’t develop with standards and accessibility in mind not to learn about these things). It is a hodgepodge of semantics and tag soup, of good and bad markup practices. It embraces ideas that logically cancel each other out. It does this in the name of realism, and it is as admirable and logical for so doing as XHTML 2 was admirable and logical in its purity.

Neither ethereal purity nor benign tolerance seems right, so what’s a spec developer to do? They’re damned either way—which almost suggests that the web will be built with XHTML 1.0 and HTML 4.01 forever. Most importantly for our purposes, what are we to do?

Forward, compatibly

As the conversation about HTML 5 and XHTML has played out this week, I’ve felt like Regan in The Exorcist, my head snapping around in 360 degree arcs as one great comment cancels out another.

In a private Basecamp discussion a friend said,

Maybe I’m just confused by all the competing viewpoints, but the twisted knots of claim and counterclaim are getting borderline Lovecraftian in shape.

Another said,

[I] didn’t realize that WHATWG and the W3C’s HTML WG were in fact two separate bodies, working in parallel on what effectively amounts to two different specs [1, 2—the entire thread is actually worth reading]. So as far as I can tell, if Ian Hickson removes something from the WHATWG spec, the HTML WG can apparently reinsert it, and vice versa. [T]his… seems impossibly broken. (I originally used a different word here, but, well, propriety and all that.)

Such conversations are taking place in rooms and chatrooms everywhere. The man in charge of HTML 5 appears confident in its rightness. His adherents proclaim a new era of loaves and fishes before the oven has even finished preheating. His articulate critics convey a palpable feeling of crisis. All our hopes now hang on one little Hobbit. What do we do?

As confused as I have continually felt while surfing this whirlwind, I have never stopped being certain of two things:

  1. XHTML 1.0—and for that matter, HTML 4.01—will continue to work long after I and my websites are gone. For the web’s present and for any future you or I are likely to see, there is no reason to stop using these languages to craft lean, semantic markup. The combination of CSS, JavaScript, and XHTML 1.0/HTML 4.01 is here to stay, and while the web 10 years from now may offer features not supported by this combination of technologies, we need not fear that these technologies or sites built on them will go away in the decades to come.
  2. That said, the creation of a new markup language concerns us all, and an informed community will only help the framers of HTML 5 navigate the sharp rocks of tricky shoals. Whether we influence HTML 5 greatly or not at all, it behooves us to learn as much as we can, and to practice using it on real websites.

Read more

  • Web Fonts, HTML 5 Roundup: Worthwhile reading on the hot new web font proposals, and on HTML 5/CSS 3 basics, plus a demo of advanced HTML 5 trickery. — 20 July 2009
  • Web Standards Secret Sauce: Even though Firefox and Opera offered powerfully compelling visions of what could be accomplished with web standards back when IE6 offered a poor experience, Firefox and Opera, not unlike Linux and Mac OS, were platforms for the converted. Thanks largely to the success of the iPhone, Webkit, in the form of Safari, has been a surprising force for good on the web, raising people’s expectations about what a web browser can and should do, and what a web page should look like. — 12 July 2009
  • In Defense of Web Developers: Pushing back against the “XHTML is bullshit, man!” crowd’s using the cessation of XHTML 2.0 activity to condescend to—or even childishly glory in the “folly” of—web developers who build with XHTML 1.0, a stable W3C recommendation for nearly ten years, and one that will continue to work indefinitely. — 7 July 2009
  • XHTML DOA WTF: The web’s future isn’t what the web’s past cracked it up to be. — 2 July 2009

[tags]HTML5, HTML4, HTML, W3C, WHATWG, markup, webstandards[/tags]