Rebuilding The Web

Articles, advocacy, discussion and debate about the many problems of the Web and the challenges of rebuilding it.

The shortcomings of HTML5

Introduction

As professional Web designers and developers, we can use and support HTML5, but at the same time we need to be able to discuss the shortcomings and limitations of the technology. In this way, we can avoid pitfalls and provide invaluable feedback to the HTML5 team, to help improve HTML.

While considering the shortcomings of HTML5 in this article, you will probably wonder why a specific feature was implemented in a given way. To answer this, it's important to understand that features that go into HTML are rarely designed and evaluated purely on merit. More often than not, features that make it into the spec are negotiated by different implementers (mostly browser vendors) in a kind of standards bazaar.

But it is ultimately Web developers and designers like you and me who will work with the technology on a daily basis, and who will have the final say about the use and longevity of any HTML feature.

HTML5 does not fix HTML

Perhaps the biggest misconception about HTML5 is that it fixes the problems with HTML. The most serious problem is that HTML/JavaScript applications are inherently insecure and are vulnerable to mischief, attacks and data theft. One reason for this stems from the fact that HTML permits data to be intermixed with executable code (JavaScript). This makes HTML/JavaScript applications susceptible to cross-site scripting attacks. HTML5 adds new and more powerful DOM APIs that give JavaScript code greater access to network communication and cached data, but giving more power to executable code without fixing security design flaws is disturbing.
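The mixing of data and code described above can be illustrated with a minimal Python sketch. The two render functions and the attack string are hypothetical names invented for this illustration; html.escape comes from the Python standard library. It is only a sketch of the general idea, not a complete defence against cross-site scripting.

```python
import html


def render_comment_unsafe(comment):
    # Untrusted text is pasted straight into the markup, so any <script>
    # it contains becomes executable code in the reader's browser.
    return "<p>" + comment + "</p>"


def render_comment_escaped(comment):
    # Escaping turns markup characters into character references,
    # so the browser treats the text purely as data.
    return "<p>" + html.escape(comment) + "</p>"


attack = "<script>steal(document.cookie)</script>"
print(render_comment_unsafe(attack))
print(render_comment_escaped(attack))
```

The point of the sketch is that nothing in HTML itself separates the two cases; the separation has to be enforced by every piece of code that assembles a page.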

Making simple concepts complex

The HTML5 team has a way of making simple concepts complex. Take for example the idea of alternate text used in the img element. Alternate text stands in place of images when images cannot be seen. It's a simple concept, right? Yet, in addition to the 15 pages that describe the img element in the specification (meant for technical users), there is an accompanying 40-page document that describes how to author alternate text (meant for non-technical users). Are content authors going to read 40 pages on alternate text? Making HTML unnecessarily complex can deter users from using features correctly.

Headings have also been made more complex. Headings (h1 to h6) in HTML were ill conceived, but at least it was easy enough to train someone to use them correctly. Headings are important because they organize and group content, as well as providing navigation for users of assistive technologies. The following screen shot shows how IBM's aiBrowser enables navigation using headings, because the heading levels reflect the physical role of the headings in the structure of the document. For example, "Header Level 1" is represented by the h1 element, "Header Level 2" by the h2 element, and so on.

Menu with options to select next header level 1 to 6 and previous header level 1 to 6.

Now let's look at how HTML5 redefines the h1 to h6 elements. HTML5 ranks each heading according to where in the document it is used. In the following example, in the document outline, the first h3 element represents the highest ranking heading, the second h3 element represents a lower ranking heading (yet has the same rank as the h1 element), and the third h3 element has no ranking at all. Try to explain that to a non-technical user!

  <body>
    <h3>Movies</h3>
    <section>
      <h1>Romance</h1>
    </section>
    <section>
      <h3>Action</h3>
    </section>
    <section>
      <h1>Science Fiction</h1>
      <hgroup>
        <h2>Star Trek</h2>
        <h3>The Wrath of Khan</h3>
      </hgroup>
    </section>
  </body>
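To make the ranking in this example easier to follow, here is a simplified Python sketch of how an outline depth could be derived from the markup. It is not the full HTML5 outline algorithm: it assumes well-formed input, treats only section, article, aside and nav as sectioning elements, and handles hgroup by ranking only its highest heading. The class and function names are invented for this illustration.

```python
from html.parser import HTMLParser

SECTIONING = {"section", "article", "aside", "nav"}
HEADINGS = {"h1": 1, "h2": 2, "h3": 3, "h4": 4, "h5": 5, "h6": 6}


class OutlineSketch(HTMLParser):
    def __init__(self):
        super().__init__()
        # One entry per open explicit section: [base_depth, open_heading_ranks].
        self.sections = [[1, []]]  # the body counts as depth 1
        self.hgroup = None         # headings collected inside an open hgroup
        self.heading = None        # [tag, text] of the heading being read
        self.results = []          # (tag, text, depth or None)

    def _place(self, rank):
        # Close implied subsections of equal or lower rank, then open one.
        base, ranks = self.sections[-1]
        while ranks and ranks[-1] >= rank:
            ranks.pop()
        ranks.append(rank)
        return base + len(ranks) - 1

    def handle_starttag(self, tag, attrs):
        if tag in SECTIONING:
            base, ranks = self.sections[-1]
            self.sections.append([base + max(len(ranks) - 1, 0) + 1, []])
        elif tag == "hgroup":
            self.hgroup = []
        elif tag in HEADINGS:
            self.heading = [tag, ""]

    def handle_data(self, data):
        if self.heading is not None:
            self.heading[1] += data

    def handle_endtag(self, tag):
        if tag in SECTIONING and len(self.sections) > 1:
            self.sections.pop()
        elif tag in HEADINGS and self.heading is not None:
            name, text = self.heading
            self.heading = None
            if self.hgroup is not None:
                self.hgroup.append((name, text.strip()))
            else:
                depth = self._place(HEADINGS[name])
                self.results.append((name, text.strip(), depth))
        elif tag == "hgroup" and self.hgroup is not None:
            members, self.hgroup = self.hgroup, None
            if members:
                # Only the highest ranked heading of an hgroup enters the outline.
                top = min(members, key=lambda m: HEADINGS[m[0]])
                depth = self._place(HEADINGS[top[0]])
                for m in members:
                    self.results.append((m[0], m[1], depth if m is top else None))


def outline(markup):
    parser = OutlineSketch()
    parser.feed(markup)
    parser.close()
    return parser.results


EXAMPLE = """<body>
<h3>Movies</h3>
<section><h1>Romance</h1></section>
<section><h3>Action</h3></section>
<section>
<h1>Science Fiction</h1>
<hgroup><h2>Star Trek</h2><h3>The Wrath of Khan</h3></hgroup>
</section>
</body>"""

for tag, text, depth in outline(EXAMPLE):
    print(tag, text, depth)
```

Under these assumptions, "Movies" lands at depth 1, "Romance" and "Action" both land at depth 2 despite using different elements, and the h3 inside the hgroup gets no depth at all.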

The time element introduced in HTML5 is another example of an element that changes its meaning depending on where it is used. First, if it is used without the pubdate attribute, it provides a machine-readable date. For example:

  <time datetime="2010-11-24">November 24, 2010</time>

Alternatively, if used with the pubdate attribute that is not inside an article element, it indicates the publication date of the entire document. For example:

  <body>
    ...
    <time datetime="2010-11-24" pubdate>November 24, 2010</time>
    ...
  </body>

Or, if used with the pubdate attribute that is inside an article element, it indicates the publication date only of the article. For example:

  <article>
    ...
    <time datetime="2010-11-24" pubdate>November 24, 2010</time>
    ...
  </article>

In addition, the pubdate attribute is always optional, whereas the datetime attribute is sometimes optional and at other times it is required:

The datetime attribute is required when the pubdate attribute is used and when the element does not contain a string in the valid date time format. For example:

  <time datetime="2010-11-24" pubdate>Wednesday</time>

But the datetime attribute is optional when the time element does contain a valid date time format. For example:

  <time>2009-11-16</time>
  <time pubdate>2009-11-16</time>

The time element can also be empty. For example:

  <time datetime="2009-11-16"></time>
  <time datetime="2009-11-16" pubdate></time>

A further rule states that no more than one time element with the pubdate attribute is permitted directly within an article element, and no more than one outside any article element.
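The rules above can be summarised in a short Python sketch. The function names are invented for this illustration, and the validity check is a deliberate simplification that recognises only the YYYY-MM-DD form, whereas the specification allows more date and time formats.

```python
import re
from datetime import datetime


def is_valid_date_string(text):
    # Simplification: accept only real calendar dates written as YYYY-MM-DD.
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", text):
        return False
    try:
        datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        return False
    return True


def datetime_attr_required(content):
    # The attribute is needed whenever the element's own text
    # cannot be read as a date (including when the element is empty).
    return not is_valid_date_string(content.strip())


def time_meaning(has_pubdate, inside_article):
    # The same element means three different things depending on context.
    if not has_pubdate:
        return "machine-readable date"
    if inside_article:
        return "publication date of the article"
    return "publication date of the document"


print(datetime_attr_required("Wednesday"))
print(datetime_attr_required("2009-11-16"))
print(time_meaning(has_pubdate=True, inside_article=False))
```

Even in this reduced form, an author or an authoring tool has to track two attributes, the element's content and its ancestors to know what a single time element means.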

Incompatibility issues

HTML5 is also supposed to be compatible with existing browsers and tools. In reality, this is not the case. There is nothing wrong with breaking compatibility if it is intentional and all users are aware of and prepared for the technology shift. The problem is that much of the breakage is unintentional and, now that the error has been made, it is being hushed up.

Let's look at some incompatibility issues:

HTML Tidy is a tool that is used to fix HTML. Some use HTML Tidy directly before publishing a document. Some use HTML Tidy indirectly and don't even know they are using it, because the CMS or a WYSIWYG editor hides its use. Let's take the following valid HTML5 document as an example:

  <!DOCTYPE html>
  <title>Greetings</title>
  <a href="http://localhost">
    <div>Hello World!</div>
  </a>

HTML Tidy will distort the content by changing the DOCTYPE and creating a useless hyperlink:

  <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
  <html>
  <head>
  <meta name="generator" content="HTML Tidy for Windows (vers 25 March 2009), see www.w3.org">
  <title>Greetings</title>
  </head>
  <body>
  <a href="http://localhost"></a>
  <div>Hello World!</div>
  </body>
  </html>

The following document uses a new HTML5 element:

  <!DOCTYPE html>
  <title>Greetings</title>
  <header>Hello World!</header>

When HTML Tidy encounters an unknown element, it will throw the following error and will not be able to proceed:

  line 3 column 1 - Error: <header> is not recognized!
  line 3 column 1 - Warning: discarding unexpected <header>
  line 3 column 9 - Warning: plain text isn't allowed in <head> elements
  line 2 column 1 - Info: <head> previously mentioned
  line 3 column 9 - Warning: inserting implicit <body>
  line 3 column 21 - Warning: discarding unexpected </header>
  Info: Document content looks like HTML 3.2
  4 warnings, 1 error were found!
  This document has errors that must be fixed before using HTML Tidy to generate a tidied up version.
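The failure above is typical of any tool that checks markup against a fixed list of known elements. The following Python sketch illustrates the principle only; it is not how HTML Tidy is implemented, the element list is an abbreviated subset of HTML 4.01 chosen for this example, and the names are invented.

```python
from html.parser import HTMLParser

# Abbreviated, illustrative subset of HTML 4.01 element names.
HTML4_SUBSET = {
    "html", "head", "title", "meta", "link", "body", "div", "span", "p",
    "a", "img", "h1", "h2", "h3", "h4", "h5", "h6", "ul", "ol", "li",
    "table", "tr", "td", "th", "form", "input", "br", "hr",
}


class UnknownElementFinder(HTMLParser):
    def __init__(self, known):
        super().__init__()
        self.known = known
        self.unknown = []

    def handle_starttag(self, tag, attrs):
        # Any element missing from the fixed list is reported.
        if tag not in self.known:
            self.unknown.append(tag)


def unknown_elements(markup, known=HTML4_SUBSET):
    finder = UnknownElementFinder(known)
    finder.feed(markup)
    finder.close()
    return finder.unknown


DOCUMENT = "<!DOCTYPE html>\n<title>Greetings</title>\n<header>Hello World!</header>"
print(unknown_elements(DOCUMENT))
```

A tool built this way has two options when the list comes back non-empty: stop with an error, or pass the unknown element through untouched. Neither choice was made with HTML5 elements in mind.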

Other authoring tools have incompatibility issues with HTML5 as well. Let's start with the following markup:

  <a id="abc">
    <div>Hello World!</div>
  </a>

TinyMCE running in IE (all versions) will generate the following invalid HTML:

  <p><a id="abc">
  <div>Hello World!</div>
  </a></p>

XStandard will remove the div and enclose the a element in a paragraph:

  <p><a id="abc">Hello World!</a></p>

If you start with the following markup:

  <p>XML<wbr />Document</p>

CKEditor will create this:

  <p>XML<wbr>Document</wbr></p>
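The wbr result suggests that the editor does not know that wbr is an empty (void) element and so treats it as a container. A minimal Python sketch of a serializer that relies on a list of void elements shows why the list matters. The class and function names are invented for this illustration, and this is not CKEditor's actual code.

```python
import html
from html.parser import HTMLParser

# Elements that never have content or an end tag; wbr is one of them in HTML5.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "param", "source", "track", "wbr"}


class Reserializer(HTMLParser):
    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        rendered = "".join(
            f" {name}" if value is None else f' {name}="{html.escape(value)}"'
            for name, value in attrs
        )
        self.out.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        # A void element is written as a start tag only.
        if tag not in VOID:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(html.escape(data, quote=False))


def reserialize(markup):
    parser = Reserializer()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


print(reserialize("<p>XML<wbr />Document</p>"))
```

A tool whose list predates HTML5 has no entry for wbr, which is one plausible way to end up with a closing tag that should not exist.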

Many new HTML5 features cannot be implemented

It is impossible or impractical to build authoring tools (such as WYSIWYG editors) to support some of the new HTML5 features. Let's take headings (h1 to h6) as an example. As discussed, in HTML5 these elements change their ranking and/or semantic meaning depending on where in the document they are used. But authoring tools have a fixed label for each heading like this:

Drop-down list showing options for heading 1 to 6.

It would be confusing to users if authoring tools displayed "Heading 2" for an h3 element simply because it is used in a part of the document that gives it a ranking of 2.

Another feature that is impractical to implement is the new hyperlink behavior that permits hyperlinks to contain block level elements such as paragraphs. If a user selects all the content in a paragraph and presses the hyperlink button in an authoring tool, how is the authoring tool to know if the user wants this:

  <p><a href="page.htm">Some text.</a></p>

or this:

  <a href="page.htm">
    <p>Some text.</p>
  </a>

Making a user interface that will support an optional alt attribute for the img element is also impractical. The img element without an alt attribute is permitted if one of the following conditions is met:

  • The title attribute is present and has a non-empty value.
  • The img element is in a figure element that contains a figcaption element that contains content other than inter-element whitespace, and, ignoring the figcaption element and its descendants, the figure element has no text node descendants other than inter-element whitespace, and no embedded content descendant other than the img element.
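To show how much logic an authoring tool would need just to decide whether the alt attribute may be left out, here is a Python sketch of the two conditions above. It assumes the fragment is well-formed XML so that the standard xml.etree.ElementTree module can parse it, and the function names are invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Embedded content elements as listed in HTML5.
EMBEDDED = {"audio", "canvas", "embed", "iframe", "img", "math",
            "object", "svg", "video"}


def _clean_outside_caption(node, img, caption):
    # No text and no other embedded content may appear outside the figcaption.
    if (node.text or "").strip():
        return False
    for child in node:
        if (child.tail or "").strip():
            return False
        if child is caption:
            continue
        if child.tag in EMBEDDED and child is not img:
            return False
        if not _clean_outside_caption(child, img, caption):
            return False
    return True


def alt_may_be_omitted(img, figure=None):
    # Condition 1: a non-empty title attribute.
    if (img.get("title") or "").strip():
        return True
    # Condition 2: the img sits in a figure with a meaningful figcaption.
    if figure is None or all(el is not img for el in figure.iter()):
        return False
    caption = figure.find("figcaption")
    if caption is None:
        return False
    has_content = len(caption) > 0 or bool("".join(caption.itertext()).strip())
    return has_content and _clean_outside_caption(figure, img, caption)


figure = ET.fromstring(
    '<figure><img src="a.png"/><figcaption>Sales in 2010</figcaption></figure>'
)
print(alt_may_be_omitted(figure.find("img"), figure))
```

Note that the answer depends on markup far away from the img element itself, so a dialog box that edits a single image cannot decide it on its own.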

To support the three states of the alt attribute permitted by HTML5, an authoring interface might need to look something like this:

Image properties dialog box. First control is a text field with a label 'Image URL'. Second control is a set of 3 radio buttons with a label 'Image type'. First option is 'Decorative (alternate text blank)'. Second option is 'Non-decorative (alternate text required)'. Third option is 'Don't know (no alt attribute, title may be required)'. Third control is a text field labeled 'Alternate text'. Fourth control is a text field labeled 'Title'. The dialog box includes OK and Cancel buttons.

Lack of versioning

What does the 5 in HTML5 stand for? You might be surprised to know that it does not stand for version 5. I guess the best way to describe it is that 5 represents the fifth major effort of work on HTML. But the problem is not so much with how we label HTML. Instead, the problem is that there is no way for Web page creators to mark Web pages in a way that says that their document conforms to HTML5. There is a new DOCTYPE that is associated with HTML5, but it has no version number, and it will be the same DOCTYPE when HTML6 comes out. Lack of versioning is not a problem for Web browsers because they treat everything like it's Tag Soup. But Web page creators that use validators and markup correction tools like HTML Tidy may miss versioning because these tools will not be able to distinguish between HTML5 and HTML6 when processing documents.
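A short Python sketch makes the versioning problem concrete. It reads the DOCTYPE with the standard html.parser module and reports the public identifier, which is where earlier versions of HTML carried their version. The class and function names are invented for this illustration.

```python
import re
from html.parser import HTMLParser


class DoctypeReader(HTMLParser):
    def __init__(self):
        super().__init__()
        self.seen = False
        self.public_id = None

    def handle_decl(self, decl):
        # decl is the text between "<!" and ">".
        if decl.lower().startswith("doctype"):
            self.seen = True
            match = re.search(r'PUBLIC\s+"([^"]*)"', decl, re.IGNORECASE)
            self.public_id = match.group(1) if match else None


def declared_version(markup):
    reader = DoctypeReader()
    reader.feed(markup)
    reader.close()
    if not reader.seen:
        return "no DOCTYPE"
    return reader.public_id or "versionless"


OLD = '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">\n<title>x</title>'
NEW = "<!DOCTYPE html>\n<title>x</title>"
print(declared_version(OLD))
print(declared_version(NEW))
```

For the HTML 4.01 document, a tool can pick the matching rule set from the identifier. For the new DOCTYPE there is nothing to read, so the tool can only guess which revision of HTML the author had in mind.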

Lack of extensibility

HTML5 defines new semantic elements such as header, footer, nav, aside, article and section. More semantics are good, but the semantics offered by these elements are not enough. Also, these elements function in the same way as the generic div element. If an element is a derivative of a generic element such as div, then semantics should be added via an attribute like this:

  <div type="article">...</div>

And semantics could be grouped into classifications. For example:

  <div type="article.news.international">...</div>
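As a sketch of how a tool might consume such a dotted classification, the following Python functions expand a value into its chain of ever more specific classifications. The type attribute itself is the proposal made in this article, not an existing HTML feature, and the function names are invented for this illustration.

```python
def classification_chain(type_value):
    # "article.news.international" ->
    # ["article", "article.news", "article.news.international"]
    parts = [part for part in type_value.split(".") if part]
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


def matches(type_value, wanted):
    # True when the element belongs to the wanted classification
    # or to any more specific classification beneath it.
    return wanted in classification_chain(type_value)


print(classification_chain("article.news.international"))
print(matches("article.news.international", "article.news"))
```

A tool that only understands "article" can still handle more specific values, which is what makes this kind of scheme extensible without revising the HTML specification.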

These semantics could also be defined in a separate, smaller specification that could be updated more frequently than the HTML specification. You could also have online services that make these classifications available to authoring tools (and other apps) with friendly, localized labels which will empower non-technical content authors to apply semantic markup with ease, as shown in this mock screen shot:

Authoring tool contains a document regarding the discovery of water on Mars. A pop-up dialog box contains a tree structure. Top level is 'Classification'. Under this item is 'Articles'. Under this item are 'Events', 'How-to', 'Human interest', 'Interview', 'News' and 'Press releases'. Under 'Press releases' is 'Astronomy' which is selected.

Missing features

Making the Web into a better application platform is a good idea, but the Web is still, and for many years to come will be, about content. So where are the features to make content easier to publish and read? I am talking about features that you could use in desktop publishing applications 20 years ago but still cannot use on the Web today. How about a feature that will generate a dynamic table of contents from headings like this:

  <toc></toc>
  <h1>Movies</h1>
  <h2>Action</h2>
  ...
  <h2>Science Fiction</h2>
  ...
  <h2>Romance</h2>

Future browsers can then display the outline automatically hyperlinked like this:

List item 'Movies'. Sub-items are 'Action', 'Science Fiction' and 'Romance'.

And with page numbers when printed like this:

List item 'Movies' on page 1. Sub-items are 'Action' on page 1, 'Science Fiction' on page 2 and 'Romance' on page 3.
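Until browsers offer something like this, the hyperlinked outline has to be produced by other means. The following Python sketch shows roughly what that involves for the on-screen case: it collects the h1 to h6 headings and prints an indented list with anchor targets. The toc element is the proposal made in this article, the anchor naming scheme is an assumption, and page numbers are out of reach for this kind of script.

```python
import re
from html.parser import HTMLParser


class TocBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.entries = []   # (level, heading text)
        self._level = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if re.fullmatch(r"h[1-6]", tag):
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.entries.append((self._level, "".join(self._text).strip()))
            self._level = None


def render_toc(markup):
    builder = TocBuilder()
    builder.feed(markup)
    builder.close()
    lines = []
    for level, text in builder.entries:
        # Hypothetical anchor scheme: lowercase words joined by hyphens.
        slug = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
        lines.append("  " * (level - 1) + f"{text} -> #{slug}")
    return "\n".join(lines)


PAGE = ("<toc></toc><h1>Movies</h1><h2>Action</h2>"
        "<h2>Science Fiction</h2><h2>Romance</h2>")
print(render_toc(PAGE))
```

The sketch only lists the link targets. Someone still has to add matching id attributes to the headings, which is the manual work a built-in feature would remove.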

What about other features such as newspaper-like columns, or markup to support change tracking?

Stalled progress on Web accessibility

Perhaps the most disappointing shortcoming of HTML5 is the lack of any new features to make HTML more accessible, or to make existing accessibility features easier to use. HTML5 does nothing significant to make Web technology more accessible. In fact, some accessibility features have either been removed or weakened.

One of the removed features is the longdesc attribute for the img element. This feature is used to provide a description of the image. In the past, because this feature has been poorly defined in the HTML spec, it has been misunderstood and used incorrectly. However, longdesc does have the potential to make images accessible, were it to be defined correctly, and there is strong support in the accessibility community to make this feature work should the HTML5 team add this feature back into the HTML5 spec.

Another issue is that the alt attribute for the img element has been made optional under certain conditions. However, in the minds of many people, if a feature is optional in one situation, it is optional in all situations.

There are also features such as the canvas element, used as a drawing surface, that have been added to HTML5 without any consideration for accessibility. Although efforts are being made to make this feature somewhat accessible, it is unclear at this stage if they will be successful.

Headings are also important for users of assistive technologies, primarily to navigate content. Yet numbered headings (h1 to h6) are a fundamentally flawed construct. Headings have to be redesigned so that authoring them is foolproof. Since headings don't make sense on their own, they should be part of a section element to form a compound construct, similar to table, ol, ul and dl. For example:

  <section>
    <heading>...</heading>
    <content>...</content>
  </section>

And heading sections could be nested like this:

  <section>
    <heading>...</heading>
    <content>
      <section>
        <heading>...</heading>
        <content>...</content>
      </section>
      <section>
        <heading>...</heading>
        <content>...</content>
      </section>
    </content>
  </section>
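With such a construct, the level of every heading follows from nesting alone, so neither the author nor the tool has to pick a number. A minimal Python sketch, assuming the proposed section, heading and content elements (which are not part of HTML5) and well-formed XML input:

```python
import xml.etree.ElementTree as ET


def heading_levels(section, level=1):
    # Each section contributes its own heading at the current level,
    # then its nested sections one level deeper.
    found = []
    heading = section.find("heading")
    if heading is not None:
        found.append((level, "".join(heading.itertext()).strip()))
    content = section.find("content")
    if content is not None:
        for child in content.findall("section"):
            found.extend(heading_levels(child, level + 1))
    return found


DOC = ET.fromstring(
    "<section><heading>Movies</heading><content>"
    "<section><heading>Action</heading><content>...</content></section>"
    "<section><heading>Romance</heading><content>...</content></section>"
    "</content></section>"
)
print(heading_levels(DOC))
```

The headings in the sample are placeholders. The point is that the same few lines work at any depth, because the structure carries the ranking.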

Navigating between pages on a site is also a big challenge for users of assistive technologies. The nav element in HTML5 is insufficient to help navigation, because this element is essentially a div that can contain anything. What is required is a more structured construct similar to an ordered list. This construct could be used to build navigation menus and breadcrumbs.

Conclusion

Discussing the shortcomings of a technology is a way to improve it, and provides invaluable information to the HTML5 team:

  • HTML5 does not fix existing problems with HTML and it may also leave Web applications vulnerable to security breaches.
  • HTML5 makes simple concepts unnecessarily complex, which can discourage users from using features correctly.
  • There are numerous unanticipated incompatibility issues with HTML5.
  • Many HTML5 features cannot be implemented in authoring tools.
  • Users of validators and markup correction tools will lose functionality because HTML5 does not have a version identifier.
  • Semantics defined by the new HTML5 elements are not sufficient. HTML needs to have extensibility built in.
  • HTML still lacks features found in desktop publishing.
  • Perhaps the most grievous shortcoming of HTML5 is that it does nothing significant to make HTML more accessible.

Public comments

1. Posted by Fabrice
on Wednesday 2010-12-01 at 07:56:01 PST

Amen. I'm glad I quit my day job to experiment on my own. I can picture the discussions I would have had this year already "Why is this a HEADER? It should be a FOOTER". "FOOTER? No in this case after all it should be a SECTION". Religion 2.0.

2. Posted by Jens O. Meiert
on Wednesday 2010-12-01 at 09:54:19 PST

HTML 5 is not perfect and won’t be.

I’m not sure the article focuses on actual flaws however. The “incompatibility issues,” for example, bringing up problems with HTMLTidy, demonstrate problems with HTMLTidy, not HTML 5. HTMLTidy will need to be fixed.

3. Posted by Vlad Alexander
on Wednesday 2010-12-01 at 10:25:48 PST

Jens O. Meiert wrote "...bringing up problems with HTMLTidy, demonstrate problems with HTMLTidy, not HTML 5. HTMLTidy will need to be fixed."

Imagine a Web browser that did not render a Web page or rendered the entire page incorrectly because the page contained a new HTML5 feature. Do you think that feature would have ever made it into the spec? No, because the spec is authored primarily by browser vendors. Little or no consideration was given to other stakeholders in Web technology (such as authoring tool vendors) and this is a shortcoming of HTML5.

Jens, as I wrote in the article "there is nothing wrong with breaking compatibility if it is intentional", however I don't believe the HTML5 team knew that their modification of HTML would break HTML Tidy, TinyMCE, CKEditor, XStandard, etc. As a result, the flaw is not with HTML Tidy, but with the HTML5 spec.

4. Posted by adam
on Wednesday 2010-12-01 at 11:59:46 PST

You keep talking about this "HTML5 Team" like it's something that actually exists.

5. Posted by seutje
on Wednesday 2010-12-01 at 12:46:48 PST

They should fix the C language as well, my notepad doesn't colour-code it properly.
That's about how silly this entire post sounded to me.

People who need HTML Tidy probably shouldn't be allowed to write HTML to begin with and I'm pretty sure TinyMCE has always been broken.

I like the lack of versioning bit, it's brought up so often and it's so silly.
There is no need for versioning, if you really wanted to "validate" ur coads, wouldn't u want to "validate" against what work now, instead of what worked 10 years ago (and most likely still works now btw)? "Validating" against a spec is all honky dory, but it's essentially meaningless. Congrats, ur shizzle is according to spec X, but unfortunately browser Y doesn't support half the shit u used and u were too much of a douche to use progressive enhancement techniques. So it breaks in browser Y, but hey... u get to put that retarded W3C badge on there, kudos!

I also very much like your worthless <toc> element, not only is it pretty much exactly what the HTML5 outline thingy is all about, but u actually require some useless markup for it, while the whole idea of the HTML5 outline is that browsers handle this themselves without the need for dead, meaningless markup.

Your argument about headings also made me chuckle a bit, because I've experienced the exact opposite. If I give my clients the ability to use an H1 in a post or article, they'll surely use it. This usually ends up in at least 2 H1s being on the page: the sitename or article title and the one they used within the article (assuming they'll only use 1, coz u know... editors allow u to use as many as u like, surely not a flaw in those editors, right?). Fortunately, I'm not a total retard and thus don't give them an H1 to use. They might think they're using an H1, but this is swapped out in the backend or faked in the editor, allowing them to only start at H2 or even H3. Now, I don't have to worry about this anymore, or at least I don't have to do my comparison on a global page-level, which often requires some seriously backwards logic when using a modular CMS that builds a page inside-out because u don't have access to the global page-level when rendering an individual article. Also, the current way of doing headings (prior to HTML5 that is) simply kills aggregation, starting with a clean slate in every separate context fixes this quite effectively.

U should seriously revise some of your assumptions and perhaps u should quit using such crappy tools.

6. Posted by Laura
on Wednesday 2010-12-01 at 13:12:55 PST

Thanks for this post, Vlad.

Just an fyi...A Wiki Page with the Change Proposal Choices for The HTML WG alt ISSUE 31 are at:
http://www.w3.org/html/wg/wiki/ChangeProposals/ImgElementSurveyConformaceChoices

7. Posted by Laura
on Wednesday 2010-12-01 at 13:18:14 PST

Another reference:
Tidy5 aka the Future of HTML Tidy
By Lars Gunther.
http://itpastorn.blogspot.com/2010/11/tidy5-aka-future-of-html-tidy.html

8. Posted by Vlad Alexander
on Wednesday 2010-12-01 at 13:30:34 PST

seutje wrote: "People who need HTML Tidy probably shouldn't be allowed to write HTML to begin with"

Who gets to decide who is "allowed" to write HTML and how? FYI, HTML Tidy is used in many applications that you or the content author may not be aware of. For example Wikipedia uses HTML Tidy.

seutje wrote: "I also very much like your worthless <toc> element, not only is it pretty much exactly what the HTML5 outline thingy is all about"

Which "thingy" would that be? Right now a content author of a long document has to manually insert a list of hyperlinks at the beginning of the document so that readers can jump to different sections of the document. This is a lot of work and is often beyond the skill set of many content authors because they first have to make headings into anchors (assign an id to h1 to h6 elements). And when the document is printed, there are no page numbers in this table of contents. Does the "thingy" you mention provide equivalent functionality to the problem I described?

9. Posted by Rob Burns
on Wednesday 2010-12-01 at 14:26:32 PST

The article makes some excellent points. Many of the shortcomings of HTML5 are addressed in a burgeoning HTML 4.1 specification. HTML 4.1 (or maybe it should be HTML 5.1) accepts the user agent processing norms of HTML5 (though making some user agent processing norm additions) but fundamentally changes the authoring norms.

For example, the issue of HTML5’s h1-h6 headings could be solved through improved authoring norms. That is, HTML5 should define how the h1-h6 headers get processed by browsers and other user-agents, as it does. However, authoring should be greatly simplified by requiring authors to use only the h1 element for headings. In this way authors can adopt the new nested section approach to defining a heading level, where the heading level is determined by the depth of the section hierarchy (in other words, a body -> section -> section -> h1 is a third level heading). Authors could also follow the prior approach of avoiding the section element and using h1 - h6, but the forward looking author would instead use the new section element.

I agree that extensibility is an issue. That is why I think HTML5 should have adopted XML-namespaces-like extensibility in text/html parsing as Internet Explorer has already long supported. Specifying namespace extensibility that was compatible with IE and adopted by the other browsers would have ensured a robust extensibility mechanism (though speccing something that is compatible both with IE namespace mechanism and also much like if not identical to XML namespaces).

I think many of the new elements are poorly conceived: in particular aside, header, footer, and nav elements all try to accomplish too much. HTML 4.1 introduces a far better approach of optional namespaced class and id attribute values which serve much the same role in an extensible manner (using id="header" or class="header" for sections headers). Also I think the header concept further confuses two concepts needlessly: header and heading (as well as a third concept of head as a container for metadata). A header is a runner for a document that generally is visible at all time. A heading provides a title for a resource or sub-source. Other metadata is often contained in the head and is not always presented to the reader though it is optionally available for presentation through styling. By further confusing these distinctions it makes it harder to use these elements correctly and even harder to create authoring tools that keep these concepts straight (especially in a heterogenous authoring tool environment).

The HTML 4.1 is available here:
<http://html4all.org/HTMLDraft.html>

10. Posted by seutje
on Wednesday 2010-12-01 at 14:27:52 PST

HTML5 is used in many application that you or the content author may not be aware of. For example Wikipedia uses HTML5.

U know, I actually got to an HTML5 outliner by literally googling "HTML5 outline thingy". Drop the "thingy" and u'll get a crapload of resources.

Right now a content author is a moron if he doesn't demand the developer to automate the detection and implementation of a toc without any meaningless onload markup if this is a requirement. The thingy I mentioned would be handled by the browser, providing a customized solution for any user going way beyond your wildest assumptions. So the <toc> element would be implied.

11. Posted by Rob Burns
on Wednesday 2010-12-01 at 14:39:05 PST

Regarding HTML5 and table of content outline algorithm, my understanding is that it is supposed to solve the problem your example solves simply through proper use of h1-h6 and sectioning elements. Unfortunately without simpler authoring norms surrounding the use of these elements it will be a heroic task for any author to use those elements in just the right manner (hence that is why I suggest canonically authoring only with h1 and hierarchically arranged sectioning elements).

HTML 4.1 actually proposes several other auto generated lists to complement the HTML5 outline algorithm. HTML 4.1 supports semantic markup that allows the automatic algorithmic generation of a subject index, authority index, reference list, notes, and semantic presentation legend, all in addition to the table of contents / outline proposed by HTML5.

12. Posted by Francesco
on Wednesday 2010-12-01 at 14:56:10 PST

Vlad, you wrote »HTML5 does nothing significant to make Web technology more accessible.« What about the new form types and attributes? I think they make websites way more accessible and have got a really nice progressive enhancement approach.

13. Posted by Rob
on Wednesday 2010-12-01 at 15:21:26 PST

XML and XHTML are the best thing for the web for web programmers and web professionals. HTML may be good enough for all the rest but it's not good enough for the former. Unfortunately, the HTML people are garnering all the attention to the detriment of the former and the detriment to the web.

14. Posted by Shelley
on Wednesday 2010-12-01 at 15:40:05 PST

An interesting perspective, and some good points.

Francesco, one can use ARIA with existing forms and achieve as much or more. Many of the input elements are probably not going to have widespread acceptance with web developers and designers because too much is taken out of the hands of the designers and developers.

So Opera implements required one way, Safari, another. One provides a message, the other just doesn't submit the form. I don't know of any developer that would be comfortable with this state.

And there is so little implementation of most, we don't have a good idea of how acceptable the default implementations of these new form inputs, elements, and attributes will be.

The new form elements are actually the oldest part of HTML5 (from the original Web Forms 2.0), yet the aspect of the spec with the least implementation.

15. Posted by Lars Gunther
on Wednesday 2010-12-01 at 16:04:09 PST

My blog post about "Tidy5" has already been mentioned by Laura in comment 7.

Let me add a few notes.

@seutje:

Tidy is an excellent tool for processing user contributed HTML on the server. One common idiom is to author in MS Word and then cut and paste into a rich text field in one's CMS. That markup is ghastly beyond imagination. In fact, I did once write a cleaning function that ran the code through Tidy 3 times, with intermediate processing done by other means to strip out ugliness, while trying to preserve the allowed use cases.

In short: Tidy has many more uses than helping people hand author HTML. And even for that use case, it can certainly speed up the process by relieving the author of many mundane tasks, that one very well could do by hand, but since Tidy is available - perhaps even integrated into one's IDE - it's a speed bonus. Just like Zen coding, snippets and auto completion.

@Rob and Vlad:

When you argue about sectioning through h1-h6 and sectioning through section elements, like section, article, etc, you are missing one point. HTML5 does already encourage authors to use <h1> only. The example where arbitrary header tags are used in the article is how BROWSERS should treat the markup. It is not a recommendation for authors.

BTW, keeping a generic <h> from XHTML 2.0 was thoroughly discussed and dropped only after having had the issue turned inside out several times.

For the next five to ten years, until browsers and AT software has caught up, authors are encouraged to use a combination of the old and the new way. Like this:

  <section>
    <h1>Heading 1</h1>
    ...
    <section>
      <h2>Sub heading</h2>
      ...
    </section>
  </section>

Ergo: We will be transitioning by using both systems in sync. And doing so is highly encouraged by everyone I've read on both the WhatWG and W3C mailing lists, including Hixie himself.

I strongly suggest that a lint tool for HTML should catch any deviations from that structure and that all post processing or server side tools should rewrite the HTML to produce such a structure.

In the absence of such tools, users should not be allowed to put sectioning elements into their code anyway. Run strip_tags() on them (or whatever function is available in the language of your choice.) Sectioning elements are *mostly* of concern for template authors anyway.

And for the record, there is NO browser on the market today that implements the sectioning algorithm and makes it available to screen readers and other AT. When so called HTML5 support charts claim that browser X supports these elements, it's only at the level of them being visible in the DOM - as any unknown element should be - and therefore stylable with CSS. Perhaps they get a default style of display: block as well, but that's it.

Plus: The only browser to make such styling feasible is Firefox 4, thanks to the -moz-any() selector. Until other browsers catch up and every old browser that do not have a similar rule die out - which won't be until the year 2016 or something like that - we simply can not use the sectioning elements without aligning them to h1-h6 as explained above for this reason as well.

That gives you (Vlad) 5 years to implement this in XStandard.

And a short note about HTML5 allowing <a></a> to wrap block elements. That's a de facto standard of HTML that has existed in all browsers for more than 15 years. It is very much appreciated by many authors, including yours truly. I know for a fact that CSS guru Eric Meyer used that as his sole reason to use the new versionless doctype on a site he authored a little more than a year ago. (That was the single HTML5 feature he used...)

So while this change in the spec does break some tools of today, it is a feature that is in demand. And if someone else does not like it, well just don't use it!

16. Posted by Luke Desroches
on Wednesday 2010-12-01 at 16:29:50 PST

You seem to be missing the most important point, and that is HTML5 is still a long way from being "complete", or more so supported enough. Of course much of it cannot be implemented yet and is incompatible because much of it is still experimental. And if HTMLTidy and TinyMCE don't yet play nice with HTML5, then don't use HTML5 because again, much of it is new and still experimental.

Your argument about making simple concepts complex is just silly. Changes to elements are being made for progression, not to make things complex. Change is good. Without change, HTML would remain stagnant, and the Web would not move forward. If this means all developers have to learn about new changes and features, then so be it. Catch up or be left behind.

Your argument about missing features is all just your opinion. I'm sure every developer has a feature they would like to see implemented, but that's not realistic. You can't please everyone.

In regards to your argument about stalled progress on Web accessibility, specifically the point about the nav element, refer to this article - http://www.alistapart.com/articles/aria-and-progressive-enhancement/
Of course not all browsers and assistive technology support the new HTML5 elements and ARIA because, once again, much of it is still experimental.

To sum it all up: HTML5 is still very new. If you don't understand a certain new feature, or it is not widely supported, then don't use it yet.

17. Posted by bruce
on Wednesday 2010-12-01 at 17:21:27 PST

An interesting post, Vlad, but I disagree with some of it.

The HTML Tidy problem is a red herring. I have image editors so old that they can't handle PNG format. That's not a problem with PNG, that's a problem with my image editor. As far as I can tell, Dreamweaver CS5 can tidy my HTML5 just fine.

You also say "Many new HTML5 features cannot be implemented". But they are implemented in browsers. If you mean "they are difficult to implement in authoring tools" then, yes - perhaps they are. I know Daniel Glazman has said as much. But "cannot be implemented" is untrue.

"Another feature that is impractical to implement is the new hyperlink behavior that permits hyperlinks to contain block level elements such as paragraphs."

This behaviour has been implemented for ever in browsers. That's why it's in the HTML5 spec; there is an obvious use case and it works interoperably now.

"Alternate text stands in place of images when images cannot be seen. It's a simple concept, right?" I wish it were. Then we might see decent alt text widespread on the Web now. But we don't, so maybe it needs better explanation.

I agree "numbered headings (h1 to h6) are a fundamentally flawed construct", so HTML5 sets out to do as you suggest: "Since headings don't make sense on their own, they should be part of a section element to form a compound construct". This is from XHTML2, and originally suggested by Tim Berners-Lee in 1991. But if a new [heading] element were minted (or XHTML2's [h] element were adopted), there would be no backwards compatibility with older browsers or assistive technologies, leading to a worsening of accessibility.

As for [time] and pubdate, none of the developers I've spoken to in training workshops and conferences finds the concepts difficult to grasp.

I agree that [hgroup] is unnecessarily complex: http://www.brucelawson.co.uk/2010/on-the-hgroup-element/ and have submitted a proposal for its abolition.


18. Posted by Rob Burns
on Wednesday 2010-12-01 at 17:58:01 PST

bruce writes:
“But if a new [heading] element were minted (or XHTML2's [h] element were adopted), there would be no backwards compatibility with older browsers or assistive technologies, leading to a worsening of accessibility.”

We hear these arguments again and again yet so many elements have been introduced in HTML5 that are unnecessary and even cumbersome solutions to existing problems. The whole nav, aside, header, and footer elements are simply poor solutions to those problems. HTML5 pushed for video and audio elements despite lack of interoperability. It could have done the same with an h element. It could easily incorporate many of the far superior suggestions that have come through the WG over the years if it could propose these other elements. Doing so merely means authors must wait for the targeted browsers to support the feature (just as they must do for video, audio, aside, time, etc.).

As an example, namespace extensibility exists in all the major browsers right now. In IE namespaces are supported in text/html parsing. In the other browsers they are supported in XHTML XML parsing. So it would make perfect sense for HTML5 to include namespaces in text/html parsing and encourage the other two browsers to implement it (they already treat HTML elements within the HTML namespaces in XML namespace).

In short, HTML5’s strength is as a browser specification. The new features added never fulfilled a real use case or were clumsy solutions for the use cases they sought to fulfill.

19. Posted by Vlad Alexander
on Wednesday 2010-12-01 at 18:02:31 PST

Luke Desroches wrote: "Change is good."

Is all change good? What if the HTML5 team got some features wrong? What if there is a better solution or a solution that does not hurt other stakeholders in Web technology (like authoring tool vendors, content authors, etc.)?

bruce wrote: "The HTML Tidy problem is a red herring. I have image editors so old that they can't handle PNG format."

PNG is a new format. HTML5 is not a new format. In fact, according to Ian Hickson, there is no such thing as HTML5 - only HTML.

bruce wrote: "As far as I can tell, Dreamweaver CS5 can tidy my HTML5 just fine."

Can you use Dreamweaver CS5 as a library in your C++ app, as a COM object, as a PHP module, as a command line app, etc.?

bruce wrote: "'Alternate text stands in place of images when images cannot be seen. It's a simple concept, right?' I wish it were. Then we might see decent alt text widespread on the Web now. But we don't, so maybe it needs better explanation."

Maybe, but the one person on the HTML5 team decided that a better explanation is not needed.

bruce wrote: "I agree 'numbered headings (h1 to h6) are a fundamentally flawed construct', so HTML5 sets out to do as you suggest: 'Since headings don't make sense on their own, they should be part of a section element to form a compound construct.'"

Actually, they haven't. You can use section without h1. And you can use h1 without section. And you can use h1 anywhere within section, even at the end of content. So how is this a compound (structured) construct like table, ol, ul and dl?

bruce wrote: "But if a new [heading] element were minted (or XHTML2's [h] element were adopted), there would be no backwards compatibility with older browsers or assistive technologies, leading to a worsening of accessibility."

I bet if you were to ask the accessibility community if they are willing to temporarily tolerate breakage with backwards compatibility in exchange for a permanent fix to the heading problem, then I bet most would be willing to accept this trade-off. I know that we at XStandard would implement this feature right away and the community would press to get AT vendors to implement this feature in a timely manner. Why not ask the accessibility community?

bruce wrote: "As for [time] and pubdate, none of the developers I've spoken to in training workshops and conferences finds the concepts difficult to grasp."

When you train people in a workshop, they get it. But we cannot train everyone in a workshop setting. We need to develop features that are foolproof; features that can be implemented in authoring tools.

bruce wrote: "If you mean 'they are difficult to implement in authoring tools' then, yes - perhaps they are. I know Daniel Glazman has said as much. But 'cannot be implemented' is untrue."

I challenge you to design a user interface for an authoring tool that will let users author/edit the time element in all its possible forms.

20. Posted by Rob Burns
on Wednesday 2010-12-01 at 18:16:05 PST

Lars Gunther writes:

“Sectioning elements are *mostly* of concern for template authors anyway.”

I think this is backwards. Sections are only the concern of over-bearing control freak template authors. The author of the content is the one who must author the section hierarchy. The CMS can canonize that hierarchy in the old h1-h6 manner or a nested section and h1 manner, but the semantics must be provided by the content author. Many CMS developers are control freaks who take valuable semantic content and strip it away deciding that no one should use those HTML enabled semantics.

The problem is that HTML5 is written by someone who only cares about the processing end of HTML. Getting the HTML WG leadership to care about authors and reader/end-users/consumers has been like beating one’s head against a brick wall.

21. Posted by John Foliot
on Wednesday 2010-12-01 at 18:33:38 PST

Here's what I don't understand:

Vlad, as the creator of a very good WYSIWYG editing tool, is complaining that HTML5 is 'broken', as it is difficult for tool makers to meet the (yet to be finalized) specifications. He complains that:

"Little or no consideration was given to other stakeholders in Web technology (such as authoring tool vendors)"

Yet despite repeated suggestions that Vlad participate in the process of creating HTML5 (by joining the W3C HTML WG), he steadfastly refuses to do so, relegating himself to the sidelines where he complains that progress is happening without input from tool vendors.

There is an old saying, Vlad: you can't suck and blow at the same time.

Here's what I do understand:

Participation in the HTML5 Working Group at the W3C is easy to do. Visit http://www.w3.org/html/wg/, and then follow the very simple registration process, which involves getting an account and filling out a form for copyright, patent, etc. policies. More details can be found at the W3C site: http://www.w3.org/2007/04/html-ie-faq It takes about 5 minutes to complete.

Approval is generally very quick, usually within 24 hours. At that time you can participate in the discussions and dialog on the W3C Mailing List, and have the opportunity to be consulted.

If this level of participation makes you feel that you are overly committed, there is an even easier way of providing feedback: the W3C HTML Bug Tracker. Anyone can file a bug, no strings attached (although please, be serious when filing a bug: there are already a large number of bugs, and frivolous bug reports simply clog the system). To file a bug, go to: http://www.w3.org/Bugs/Public/ Please, be sure to search for a similar bug before filing a new one - thanks.

I think that some of the points that Vlad raises are worth discussing, others simply suggest to me that he doesn't completely understand the intent, and others again simply represent a challenge for him as a tool author that seems overwhelming... It may in fact require significant re-writes of his tool. I am sympathetic to that point of view, but trying to arrest progress is a no-win situation. Further, while he may disagree with various aspects of the emerging specification and standard, we must remember that the final outcome is consensus based (at least at the W3C) and represents a meeting of all stake-holders who bother to show up: again, standing on the sidelines simply ensures you are not in the game.

22. Posted by Leif Halvard Silli
on Wednesday 2010-12-01 at 18:42:38 PST

@Lars Gunther

The debate about going for <h> instead of <h1>-<h6> did not happen very much within the HTMLwg, I think. That must have happened in the WHATwg.

So what were the arguments? The arguments I have heard were about compatibility. In what way? Was it about IE6 compat? I have heard a lot about compat, in fact, w.r.t. the heading elements. But most of it was just ironic comments about XHTML2 rather than any actual explanation of the problem.

You also say that HTML5 says that we can use a single <h1> instead. Can you point me to the place in the spec where this is clearly - or unclearly - expressed? I have heard this claim before, but I have never understood where this idea is expressed.

The only new thing that has happened when it comes to heading elements is that we have got a <hgroup> element, which in fact is an element that brings <h> to mind. Because, as it stands, <hgroup><h1>Lorem.</h1></hgroup> is equivalent to <h1>Lorem.</h1>. And this, in my view, sets a pattern for how one could have introduced <h>: we could have allowed <h> to take <h1>-<h6>.

What I mean is that the following 2 constructs would be synonymous with each other:

  <h><h1>Lorem.</h1></h>

and

  <h>Lorem.</h>

We could then have ditched the <hgroup>.

Can you point to a place where this idea has been discussed in the past?

23. Posted by Leif H Silli
on Wednesday 2010-12-01 at 18:50:00 PST

@Bruce

«But if a new [heading] element were minted (or XHTML2's [h] element were adopted), there would be no backwards compatibility with older browsers or assistive technologies, leading to a worsening of accessibility.»

If a new <heading> or <h> element were allowed to take <h1>-<h6> elements, the same way <hgroup> currently allows the same thing, then it should be at least as backwards compatible as <hgroup> is.

Don't you think?

24. Posted by Leif Halvard Silli
on Wednesday 2010-12-01 at 19:02:49 PST

@John Foliot

HTML5 has an editor who does not limit himself only to those messages that appear inside the HTMLwg or the WHATwg. So where is the problem? <smile> But I am not solely ironic when I say that ...

25. Posted by Vlad Alexander
on Wednesday 2010-12-01 at 19:57:41 PST

John Foliot wrote: "Yet despite repeated suggestions that Vlad participate in the process of creating HTML5 (by joining the W3C HTML WG), he steadfastly refuses to do so"

John, let me quote Ian Hickson: "The reality is that the browser vendors have the ultimate veto on everything in the spec, since if they don't implement it, the spec is nothing but a work of fiction."

Why are the browser vendors first citizens of the Web? What about the millions of users of authoring tools, screen readers, etc. I will not lend my name to a process that does not represent the interests of all stakeholders in Web technology.

John Foliot wrote: "there is an even easier way of providing feedback: the W3C HTML Bug Tracker"

One of my peers submitted a bug based on an article I wrote. That bug was marked WONTFIX by one person without any discussion or debate. Okay, that's just one bug - right? Apparently, accessibility related bugs have been disproportionately rejected and marked WONTFIX or INVALID. I cannot find a link to the stats but perhaps you or Laura could post the link here. So providing feedback to the HTML5 team via a bug tracking system is a flawed process.

John Foliot wrote: "we must remember that the final outcome is consensus based (at least at the W3C) and represents a meeting of all stake-holders who bother to show up: again, standing on the sidelines simply ensures you are not in the game."

Am I the only tool vendor on the sidelines? Are the vendors of tools like CKEditor, TinyMCE, Window-Eyes, JAWS, in the game and active players? If not, why?

26. Posted by bruce
on Wednesday 2010-12-01 at 22:31:02 PST

Vlad, you asked "Am I the only tool vendor on the sidelines? Are the vendors of tools like CKEditor, TinyMCE, Window-Eyes, JAWS, in the game and active players? If not, why?"

In 2008 I wrote to the W3C to ask them to invite the screenreader vendors to participate, even though they could choose to at any time. In 2009, I asked Ian Hickson about their response:

Bruce: You wrote to ask screenreader vendors to participate in the specification process. Did they ever reply?

Hixie: A couple did, but only to say they had little time for the standards process, which was quite disappointing. Since then, though, Apple has ramped up their efforts on their built-in Mac OS X screen reader software, and we do get a lot of feedback from Apple. So at least one screen reader vendor is actively involved.

http://www.webstandards.org/2009/05/13/interview-with-ian-hickson-editor-of-the-html-5-specification/

As for challenging me to design a UI for [time], sorry! I'm not an authoring tool developer, and never claimed to be. But I'm sure it can be done.

27. Posted by Rossi
on Thursday 2010-12-02 at 00:12:25 PST

>A couple did, but only to say they had little time for the standards process
Is that a euphemism for "I don't want to get bent over and spanked in public"?

28. Posted by seutje
on Thursday 2010-12-02 at 06:18:19 PST

The more I look at this, the more I wonder how much you have researched these things.
You mention CKeditor breaking on it, but CKeditor is GPL and extensible out the wazoo, so why don't u take an example to Mr Neal and continue the work of others instead of tearing it down -> http://sandbox.thewikies.com/html5-ckeditor/
(btw, that's the first result when googling for "ckeditor html5", just saying...)

29. Posted by Christophe Strobbe
on Thursday 2010-12-02 at 08:20:44 PST

@Rossi
When the screen reader vendors said they had little time for the standards process, I doubt this had anything to do with HTML5 itself (or its editor). They were also absent while the Web Content Accessibility Guidelines (WCAG) 2.0 were being developed.

30. Posted by jck
on Thursday 2010-12-02 at 09:27:11 PST

Vlad said: "I bet if you were to ask the accessibility community if they are willing to temporarily tolerate breakage with backwards compatibility in exchange for a permanent fix to the heading problem, then I bet most would be willing to accept this trade-off."

Seems like an interesting idea. Does anyone have opinion on this?

31. Posted by John Foliot
on Thursday 2010-12-02 at 17:27:46 PST

Vlad wrote:

"John, let me quote Ian Hickson: "The reality is that the browser vendors have the ultimate veto on everything in the spec, since if they don't implement it, the spec is nothing but a work of fiction.""

Ah yes, the big evil browsers. Vlad, you have a very skewed understanding of how the marketplace works: browsers respond to what their clients want, it's as simple as that. Simply look at the history of the web.

Firefox rose from the ashes of Netscape, itself once the premier web browser that became bloated and flaky, while that other evil empire, Microsoft, actually employed smart and talented people like Tantek Çelik and Chris Wilson (and a slew of other smart engineers) to re-create Internet Explorer. We all scoff at IE 6 today, but when it landed (and IE 5.5 for Mac) they were the leading, *standards compliant* browsers in the market. But Microsoft got lazy; they re-deployed those smart engineers to other projects and presumed the browser issue closed. But progress doesn't work that way, and even though IE 6/5.5 managed to deal a death blow to Netscape, Firefox rose from those ashes to challenge IE in just a few years. Meanwhile, Safari, once a second-class web browser, continued to refine and improve (based on the Open Source WebKit engine) so that today it stands as one of the "Big 5" as well. And Opera, the little browser that could? It focused on an under-served market niche - mobile - so that today it dominates that space (as well as markets like Russia, due to their commitment to internationalization). The point here? The browsers follow the money, plain and simple. And Ian Hickson's comment, while offensively stated on the surface, is very true: without support and commitment from the browsers, anything any one person or group of people dream up will languish. It's the same in any industry, and not exclusive to the web.

So how do we get the browsers to actually move in a direction we want? Certainly not by standing outside their doors and hurling rocks at them; no, it is by working with them, looking to forge a partnership with them, and helping them to understand user needs. (When it comes to people with disabilities, this is doubly hard, but not impossible. Progress has been slow, slower than many want, but it *IS* being made).

The other way is to work toward a consensus-based Standards process, because to gain that consensus the browsers are part of the dialog, and support from them comes with that consensus. You can demand all you want, but demanding gets you nowhere; you need to request and show the value, because if there is value they will do it... they follow the money and what the users want - as Firefox, Opera and Safari/Chrome (WebKit) have amply proven.

...continued

32. Posted by John Foliot
on Thursday 2010-12-02 at 17:28:15 PST

Vlad wrote:
"Why are the browser vendors first citizens of the Web? What about the millions of users of authoring tools, screen readers, etc. I will not lend my name to a process that does not represent the interests of all stakeholders in Web technology."

It is you that characterizes them as first citizens, not I nor any other organization that is a signatory and dues-paying member of the W3C. I guess however that you know better than AT&T, the Australian Government Information Management Office (AGIMO), British Broadcasting Corporation, IBM, Israel Internet Association and the other 300+ members of the W3C (http://www.w3.org/Consortium/Member/List). They don't seem to have a problem lending *their* names to the W3C, but I guess you have either higher standards or better insight into how International Standards are created.

Vlad wrote:
"One of my peers submitted a bug based on an article I wrote. That bug was marked WONTFIX by one person without any discussion or debate."

Yes Vlad, I am aware of your "bug". But here's a thought: if you were actually a member of the Working Group you would have the ability to challenge that decision, and escalate it if you had enough support. Without broader support however, a bug such as that which you submitted is an 'opinion'. As such, to see it forward you need champions to advocate and build the case. It is much easier to do that when you are working inside an organization - any organization - than if you are standing outside lobbing in the occasional opinion. There is a politics to this that you seem blissfully unaware of.

Vlad wrote:
"Am I the only tool vendor on the sidelines? Are the vendors of tools like CKEditor, TinyMCE, Window-Eyes, JAWS, in the game and active players? If not, why?"

Ask them. Nobody is stopping them from participating. Adobe is a tool maker too (ever hear of Dreamweaver?), and they are participating. IBM makes tools too, and have embedded CKEditor into a lot of their offerings - they too participate, and specifically both companies have invested time and effort into topics of Accessibility (Rich Schwerdtfeger, CTO of IBM's Accessibility Software Group is also the Chair of the Canvas Accessibility Task Force). If you were a member of the W3C Working Group you would know that, but you aren't and choose not to be. I can't change that, but I do know that participation is a choice, one that you too can make.

Anyway, you've decided that perhaps you can effect change via your blog, and that's a fair decision on your part. I'm not sure how successful you will be, but good luck with it.

33. Posted by Laura
on Friday 2010-12-03 at 05:09:33 PST

John, I have escalated Vlad's bug as part of Issue 31 and written a change proposal for it. It is Proposal 6 at
http://www.w3.org/html/wg/wiki/ChangeProposals/ImgElementSurveyConformaceChoices

But I do agree with you. If Vlad joined the HTML Working Group he would be able to state his position, as only he can, in the upcoming survey that will decide the issue.

Vlad, please do consider joining.

I don't know for sure, but your first-hand rationale as a tool maker and accessibility advocate on the survey could make a big difference to the Chairs in their decision. You have a unique perspective that is missing from the working group.

I do know that if you don't join the group and respond to the survey, you won't know if you could have made a difference.

Like you Vlad, John once had great apprehension about joining the group. He stayed on the sidelines for a couple of years and wrote posts. Please consider his words. They are from the same type of experience as yours.

BTW the Bug Resolution Comparisons Chart is at:
http://www.d.umn.edu/~lcarlson/html5bugchart/20101009/

34. Posted by Vlad Alexander
on Friday 2010-12-03 at 10:25:40 PST

John Foliot wrote: "Certainly not by standing outside their doors and hurling rocks at them [browser vendors]"

How is it "hurling rocks" if I say that it is more useful to do this <div type="article.news.international"> instead of <article>? Or say that users can benefit from having <toc>? Or say that changing the meaning of numbered headings (h1 to h6) can be confusing to content authors? Or say that it is impractical to create a user interface for the time element? Or that removal of longdesc is the loss of a key feature? Where are the rocks (i.e. attacks)? All I am doing is bringing issues to the forefront that should be addressed for everyone's benefit, while HTML5 is still in development. And in all my articles, I invite the browser vendors to a dialog with other stakeholders in Web technology.

Laura wrote: "If Vlad joined the HTML Working Group he would be able to state his position, as only he can, in the upcoming survey that will decide the issue. Vlad, please do consider joining."

The HTML5 team is intractable when it comes to accessibility issues. The only way to bring significant change to make HTML more accessible is to make the public aware of these issues and to have a strong, united and active accessibility community.

Laura wrote: "Like you Vlad, John once had great apprehension about joining the group. He stayed on the sidelines for a couple of years and wrote posts."

Then he joined the Accessibility Task Force of the HTML WG and on his watch (and against the recommendation of this Task Force) the longdesc attribute was removed. Now instead of fighting for new accessibility features, he and other experts/advocates are kept busy by fighting to re-instate old accessibility features back into HTML.

35. Posted by John Foliot
on Friday 2010-12-03 at 19:19:09 PST

Vlad,

You make it seem that somehow the current situation with LONGDESC is of my doing. You have no idea.

The current status of LONGDESC is that there is one Formal Objection lodged with the Director (TBL), which means that no final decision has been reached. This is not optimum, to be sure, but it ain't over yet either. So please be careful what you state as fact, as mis-information is just as harmful here.

As for me, I am trying to work within the Task Force to advance this issue still, as well as working outside with others to prepare another Formal Objection as a companion to the first. However, in both cases I am working with a group of people, not trying to single-handedly change things. I don't always agree with them on every single point, nor they on my opinions, but that is the nature of consensus.

You keep saying that you are inviting the browser vendors to dialog with you, but you need to come to where they already are; I find it somewhat arrogant of you that you expect them to engage on your terms and your turf. Again, the W3C is as neutral a locale as any, and further the browser vendors are already there. Somebody once said to me "do you want to be right or do you want to be married?"

Think about it.

36. Posted by Vlad Alexander
on Friday 2010-12-03 at 20:48:03 PST

John Foliot wrote: "You make it seem that somehow the current situation with LONGDESC is of my doing."

Not at all! First, I only mentioned you because Laura compared me to you. Second, what I was trying to say is that there are many accessibility experts, including you, already on the HTML WG but they have no power. As a result, on their watch and against their recommendation, the longdesc attribute was removed. My participation in the WG would not have changed this.

John Foliot wrote: "The current status of LONGDESC is that there is one Formal Objection lodged with the Director (TBL), which means that no final decision has been reached."

That is not the point. The longdesc attribute should have never been removed in the first place because the Accessibility Task Force said it should not be removed. You should not have to waste your time fighting to keep existing accessibility features.

John Foliot wrote: "You keep saying that you are inviting the browser vendors to dialog with you, but you need to come to where they already are"

Why? Let's meet on neutral turf. I am willing to discuss any of the issues brought up in this article on another Web site or mailing list such as Web Standards Group list or WebAIM list.

37. Posted by AlastairC
on Saturday 2010-12-04 at 06:36:07 PST

I'm not sure what the problem with implementing the time tag is?

Quite a few of the scenarios of use (e.g. pubdate) should be output from a system, rather than authored. (E.g. a blog will produce the published date, regardless of what the authored content is). Things like event listings will generally be auto-generated as well for the microformats example (I do already: http://ukwindsurfing.com/events/2011/ ). Therefore an editing interface in a WYSIWYG environment could be focused.

As an author who knew that you could add a computer-readable version of the date, I'd want to treat it like a link:
- Type in the content, e.g. "The final count of the election was complete at 9pm."
- Select "9pm".
- Select a 'time' button in the toolbar.
- A dialogue opens with all optional fields: [year] [month] [day] [hour] [minute]
- Submit that dialogue, and the time element encloses the text, perhaps with a dotted line to indicate it's special, and perhaps a nearby icon to enable editing.

Am I missing something about the time element that makes it more difficult, or were you assuming a wider scope?
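The dialogue described in this comment could generate its markup along these lines. It is only a sketch under stated assumptions: the function name time_markup is invented here, the value is written to the element's datetime attribute, and the grouping rules (a date needs year, month and day; a minute needs an hour) are one possible answer to how the optional fields combine. It does not check that the values form a real date or time, and it ignores time zones.

```python
from html import escape


def time_markup(text, year=None, month=None, day=None, hour=None, minute=None):
    """Wrap the selected text in a time element built from dialogue fields."""
    date_part = None
    if year is not None or month is not None or day is not None:
        # Assumption: a partial date is rejected rather than guessed at.
        if None in (year, month, day):
            raise ValueError("year, month and day must be given together")
        date_part = "%04d-%02d-%02d" % (year, month, day)

    time_part = None
    if hour is not None or minute is not None:
        if hour is None:
            raise ValueError("minute given without hour")
        # A missing minute is taken to mean "on the hour".
        time_part = "%02d:%02d" % (hour, minute or 0)

    if date_part and time_part:
        value = date_part + "T" + time_part
    else:
        value = date_part or time_part

    if value is None:
        # Nothing was filled in, so the text is left unwrapped.
        return escape(text)
    return '<time datetime="%s">%s</time>' % (value, escape(text))
```

For the election example, selecting "9pm" and entering only hour 21 would produce a time element whose datetime value is 21:00.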

38. Posted by Vlad Alexander
on Saturday 2010-12-04 at 10:56:33 PST

AlastairC wrote: "As an author who knew that you could add a computer-readable version of the date, I'd want to treat it like a link"

Let's work through some scenarios.

1. Start with:

  <p>I usually have a snack at <time>16:00</time>.</p>

The user wants to change the time in a WYSIWYG interface. The user puts the cursor in front of "16:00" while still inside the time element, and presses the DELETE key 5 times. At this point, what should be displayed?

2. Would you agree that a user interface should validate input? For example, if you have a calendar control, it should prevent the user from entering a date like February 29, 2011. The following usage of the time element is identical in meaning.

  <time datetime="2011-02-29"></time>
  <time>2011-02-29</time>

You can build an interface to prevent the former from being entered. But how would you build an interface that prevented the latter from being entered?

3. Would you do anything to prevent this?

  <time datetime="2010-12-10">2011-05-12</time>

4. The time element with a pubdate attribute is only permitted once inside an article element. What would the interface look like to enforce this rule? This would also need to work in a copy and paste scenario.

5. What should be the label for the control that generates the pubdate attribute in the user interface? Remember, the time element with the pubdate attribute has a different meaning depending on whether the parent is body or article.

6. You wrote: "A dialogue opens with all optional fields: [year] [month] [day] [hour] [minute]". Which fields are required?
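
Scenarios 2 and 3 above can be made concrete with a small check that an editor might run on both the attribute value and the element's text. This is only a sketch: the function names `parse_iso_date` and `content_conflicts` are hypothetical, and it assumes dates are written as YYYY-MM-DD, as in the examples in this thread.

```python
import re
from datetime import date

# Strict YYYY-MM-DD pattern; anything else is treated as ordinary text.
_ISO_DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def parse_iso_date(text):
    """Return a date for a real calendar day in YYYY-MM-DD form, else None."""
    match = _ISO_DATE.fullmatch(text.strip())
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        return date(year, month, day)
    except ValueError:
        # Well-formed but impossible, e.g. 2011-02-29.
        return None


def content_conflicts(attr_value, content):
    """True when the attribute and the text are both dates but differ."""
    attr_date = parse_iso_date(attr_value)
    content_date = parse_iso_date(content)
    return (
        attr_date is not None
        and content_date is not None
        and attr_date != content_date
    )
```

Note that the check can only flag the text "2011-02-29" after it has been typed; unlike a calendar control, it cannot stop the user from typing it, which is the point of scenario 2.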

39. Posted by Sean
on Sunday 2010-12-05 at 10:19:26 PST

All interesting, but I think somewhat missing the point. Whether the boys in the WhatWG are aware of it or not, they are not designing the web you seem to think they are.

Today the top sites are using anywhere from 3 thousand to 20 thousand lines of JavaScript on a page. Browsers are starting to compile this to natively executed code, so this is likely to only increase. The canvas API allows that code to draw arbitrary graphics, all offloaded to the GPU. WebSockets will allow for grabbing arbitrary back-end data delivered as JSON, or possibly XML for the die-hards.

5-10 years from now I predict nobody is going to be much interested in the minutiae of declarative markup and fighting with the painful arcana of CSS; they'll be using graphics and widget libraries and taking it all into their own hands. The DOM is basically either going to be a single canvas, or a list of canvas sprites. Accessibility is unfortunately not going to get much of a look in that brave new world.

There’s a reason HTML5 is the last in the line, but I don’t think it’s the stated one. The fact is markup's days are numbered (until the wheel comes full circle again, as it inevitably does).

40. Posted by Patricia
on Sunday 2010-12-05 at 14:59:38 PST

>Somebody once said to me "do you want to be right or do you want to be married?"
That may be true if it's a marriage of equals. Do you feel like an equal partner to browser vendors?

>Accessibility is unfortunately not going to get much of a look in that brave new world.
People think that ARIA will magically make future applications accessible but its complexity will only put people off accessibility.

41. Posted by AlastairC
on Monday 2010-12-06 at 04:57:24 PST

On the time element:

1. As per links. It should be the same for any inline element that can include extra information. What does XStandard do for links?

2. Validation of input would be nice, but best done as you enter it. For example, if the user selects Feb 2011 from a drop-down, the day drop-down would only have 1-28 (except on leap years). However, that is a nice to have, you can't completely stop people putting in silly content.

3. A warning perhaps, but I wouldn't prevent it.

4 & 5. I wouldn't include pubdate functionality in a WYSIWYG editor, that's something the wider system & templates should handle.

6. All optional, the spec looks quite flexible about that. As a nice to have, if the user selects text with a programmatically determinable date, it could default to appropriate values. (Similar to how Apple's Mail.app picks up dates that could be used in iCal.)
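
The day drop-down described in point 2 can be sketched with the standard library's calendar module. The helper name `day_choices` is hypothetical, and the sketch assumes the user has already picked a year and a month.

```python
import calendar


def day_choices(year, month):
    """Days to offer in the day drop-down for the chosen year and month."""
    # monthrange returns (weekday of the 1st, number of days in the month).
    _, days_in_month = calendar.monthrange(year, month)
    return range(1, days_in_month + 1)
```

For February 2011 this offers days 1 to 28, and for February 2012 (a leap year) days 1 to 29.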

42. Posted by Vlad Alexander
on Monday 2010-12-06 at 11:04:07 PST

AlastairC wrote: "As per links. It should be the same for any inline element that can include extra information."

I don't believe that there are any other non-empty inline elements that have meaning when they contain no content. For example, the following markup is meaningless:

  <a href="page.htm"></a>
  <em id="abc"></em>
  <span class="important"></span>

Whereas time can be empty and have meaning like this:

  <time date="2011-02-28"></time>

AlastairC wrote: "What does XStandard do for links?"

XStandard removes non-empty inline elements when they contain no content.

AlastairC wrote: "However, that [validation] is a nice to have, you can't completely stop people putting in silly content."

That is true when you have non-structured content (data types) such as alternate text for an image. However, the problem here is that sometimes the time element contains a string data type:

  <time date="2011-02-28">Monday</time>

... and at other times it contains an ISO8601 date data type:

  <time>2011-02-28</time>

This is poor design. An element's content should have only one data type.

AlastairC wrote: "I wouldn't include pubdate functionality in a WYSIWYG editor, that's something the wider system & templates should handle."

1. Should content editors (XStandard, CKEditor, TinyMCE, etc.) be able to author content containing the article element? If so, they need to support the time element with the pubdate attribute. For example:

  <h1>The following is a list of articles I've published this week.</h1>
  <article>
  <h2>...</h2>
  <p>....</p>
  <p>Published on <time pubdate="pubdate">2011-01-15</time>.</p>
  </article>
  <article>
  <h2>...</h2>
  <p>....</p>
  <p>I first published this article on <time pubdate>2010-11-20</time> but later republished it on <time pubdate>2011-01-15</time>.</p>
  </article>

(FYI, the above markup contains an error - two time elements with a pubdate attribute inside an article element. How can you prevent this in a WYSIWYG environment?)

2. There are many WYSIWYG editors that generate a static HTML file with no template (server-side scripts). For example - KompoZer. Dreamweaver and Expression Web can also generate static HTML pages from WYSIWYG modes.

AlastairC wrote: "All optional, the spec looks quite flexible about that"

So what happens when the user selects only the Month and leaves all other fields blank? Would you generate markup like this:

  <time date="03">March</time>

Does "03" stand for the third month or the third day of the month?
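
One way a dialogue with all-optional fields could avoid the ambiguous "03" is to emit a value only when the filled-in fields run from the year downwards without a gap. The sketch below is an illustration of that rule, not something taken from the spec: the function name `partial_date_value` is hypothetical, and whether year-only or year-month values are acceptable to the time element is an assumption that this thread does not settle.

```python
from typing import Optional


def partial_date_value(
    year: Optional[int] = None,
    month: Optional[int] = None,
    day: Optional[int] = None,
) -> Optional[str]:
    """Build YYYY, YYYY-MM or YYYY-MM-DD; return None if the fields leave a gap."""
    if year is None:
        # A bare "03" could be read as a month or as a day, so emit nothing.
        return None
    if month is None:
        # A day without a month is also a gap.
        return None if day is not None else f"{year:04d}"
    if day is None:
        return f"{year:04d}-{month:02d}"
    return f"{year:04d}-{month:02d}-{day:02d}"
```

With this rule, selecting only the month yields no machine-readable value at all, so the editor would have to either ask for the year or leave "March" as plain text.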

43. Posted by Voice
on Wednesday 2010-12-08 at 10:19:32 PST

@Vladimir:42
1) You have your tool validate the markup that is generated to ensure it is, in fact, valid. If an article tag can only contain one time tag with a pubdate, then you check for it and throw up a warning or error message. What's so hard about that?
2) If you're writing a tool, you make sure it generates sensible markup. If you find the spec a bit too ambiguous, join in the process and make your point there. Otherwise, complaining that a tool could generate that particular markup isn't really any different from complaining that some other tool generated <time date="Fred">George</time> and blaming it on the spec rather than the tool that generated the senseless markup in the first place.
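
The check described in point 1 could look roughly like the sketch below, which uses Python's standard html.parser. The class and function names are hypothetical, and it assumes each time element is counted against its nearest enclosing article. It reports the problem after the fact; it does not answer the question of what the WYSIWYG interface should do to prevent it.

```python
from html.parser import HTMLParser


class PubdateChecker(HTMLParser):
    """Counts <time pubdate> elements per enclosing <article>."""

    def __init__(self):
        super().__init__()
        self.open_articles = []  # one pubdate counter per open <article>
        self.violations = 0      # articles with more than one pubdate

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self.open_articles.append(0)
        elif tag == "time" and self.open_articles:
            # Matches both <time pubdate> and <time pubdate="pubdate">.
            if any(name == "pubdate" for name, _ in attrs):
                self.open_articles[-1] += 1

    def handle_endtag(self, tag):
        if tag == "article" and self.open_articles:
            if self.open_articles.pop() > 1:
                self.violations += 1


def count_pubdate_violations(markup):
    """Return how many articles contain more than one <time pubdate>."""
    checker = PubdateChecker()
    checker.feed(markup)
    checker.close()
    return checker.violations
```

Run against the two-article example in comment 42, this reports one violation, for the second article.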

44. Posted by Phil
on Wednesday 2010-12-08 at 12:58:54 PST

Here are the results of the W3C HTML Tidy validator:

<code>
This document was successfully checked as HTML5!
Result : Passed, 3 warning(s)
Source :
<!DOCTYPE html>
<title>Greetings</title>
<header>Hello World!</header>

Congratulations

The uploaded document was successfully checked as HTML5. This means that the resource in question identified itself as "HTML5" and that we successfully performed a formal validation using an SGML, HTML5 and/or XML Parser(s) (depending on the markup language used).
</code>

You just have to specify that your document is an HTML5 one.

As HTML5 is only experimental for now, automatic detection of HTML5 is not operational.

This is the reason for your observation ("HTML Tidy will distort the content by changing the DOCTYPE and creating a useless hyperlink").

45. Posted by Dave Mason
on Wednesday 2010-12-08 at 13:26:50 PST

Well.. This is all well and good.. But out in the real world we'll continue coding what works in browsers, and avoiding bleeding edge protocol as far as possible. Stick to good old stuff you can make work.
I code websites for everyone from blue chips to African charities. 30%+ of aggregate visitors are still on IE6. Every site needs browser-specific tweaks to work with IE6, 7 and 8, FF 3 to 3.6 (and 4 soon), Chrome and Safari. So HTML5 seems a bit of an irrelevance for the next few years unless it becomes truly cross-browser standardised.
In reality, browser vendors are the standard setters, and they are all differently "non-compliant"...

46. Posted by Me
on Wednesday 2010-12-08 at 14:01:07 PST

There are some real retards posting here, most of which haven't a clue!
HTML is dead. Move on or get buried with it.

47. Posted by Lea Verou
on Wednesday 2010-12-08 at 14:06:31 PST

<blockquote>Try to explain that to a non-technical user!</blockquote>
Non technical users aren't supposed to be writing HTML anyway.

48. Posted by Rob A
on Wednesday 2010-12-08 at 14:16:25 PST

Good article and very topical.

I gave up web development 5 years ago when I realized web applications were basically just hacks using a mishmash of technologies which weren't originally designed for that purpose.

I admit I'm living in the unrealistic world of the technical purist, which bears no relation to the insatiable global demand for what the web provides. At least I have the luxury of living in this world ;-).

It's like someone invented the first car, which was a Skoda. They then added some racing tires to the Skoda and called it a sports car, joined three Skodas together and called it a bus, etc.

The language of the web & browsers needs to be completely re-designed from the ground up IMO. Your article adds weight to my belief that nothing has changed.

Have fun hacking up web apps lol.

49. Posted by Andrés Sanhueza
on Wednesday 2010-12-08 at 21:32:35 PST

I remember someone, while defending HTML5, saying that using the XHTML2 approach for headings (sections and h tags) would lead to "semantic abuse". No idea what that means though; I still prefer it.

Currently, HTML5 does allow nested sections and h1 tags, but the hx system is still allowed, which kind of baffles me. Managing the hx system via a UI can work in some way, like in recent versions of Microsoft Word (not HTML, but the same principle), yet it always becomes a mess when working with PHP page embeddings and such.

Comments are closed for this article.
