Rebuilding The Web

Articles, advocacy, discussion and debate about the many problems of the Web and the challenges of rebuilding it.

Is it irresponsible to advocate using HTML5 before it is ready?

A number of renowned HTML5 supporters are advocating the use of HTML5 to the general public while the spec is still in development, unproven, and subject to change. Is this in the best interest of Web site creators or simply irresponsible behaviour?

If HTML5 works in the browsers, does it mean it's ready to go?

The major browser vendors are developing HTML5, so it should come as no surprise that the only criterion for deciding whether the general public can use HTML5 is whether Web browsers render HTML5 documents gracefully. Web site creators are also eager for the new features HTML5 offers, so it would seem to make sense to give people what they want as soon as possible. But what if some widely used authoring tools mangle HTML5 documents?

The HTML5 manglers

Probably the most popular authoring tool, built into many HTML editors, is HTML Tidy. Some use HTML Tidy directly before publishing a document. Some use HTML Tidy indirectly and don't even know they are using it, because its use is hidden by the content management system. In some cases, HTML Tidy is used as a batch process on documents that have been published for some time.

So let's take a look at what HTML5 supporters advocate and see how running HTML Tidy on HTML5 documents produces mangled results.

The following advice regarding the use of the HTML5 DOCTYPE is paraphrased to protect the identity of the supporter of HTML5: "You can use HTML5 now by simply switching to the <!DOCTYPE html> document type identifier." On the basis of that advice, let's look at what HTML Tidy would do with the following valid HTML5 document:

  <!DOCTYPE html>
  <title>Test</title>
  <p>Hello World!</p>

HTML Tidy will convert the HTML5 document identifier into an HTML 3.2 identifier:

  <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
  <html>
  <head>
  <meta name="generator" content="HTML Tidy for Windows (vers 25 March 2009), see www.w3.org">
  <title>Test</title>
  </head>
  <body>
  <p>Hello World!</p>
  </body>
  </html>
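
A tool chain could check for the short HTML5 doctype before handing a document to a legacy tool, rather than letting the tool rewrite it silently. The following is only a minimal sketch of such a check, using Python's standard `html.parser` module. The names `DoctypeSniffer` and `is_html5_doctype` are made up for this illustration and are not part of HTML Tidy or any other tool mentioned here.

```python
from html.parser import HTMLParser


class DoctypeSniffer(HTMLParser):
    """Records the first <!DOCTYPE ...> declaration seen in a document."""

    def __init__(self):
        super().__init__()
        self.doctype = None

    def handle_decl(self, decl):
        # decl is the text between "<!" and ">", e.g. "DOCTYPE html".
        if self.doctype is None and decl.lower().startswith("doctype"):
            self.doctype = decl


def is_html5_doctype(markup):
    """Return True if the document declares the short HTML5 doctype."""
    sniffer = DoctypeSniffer()
    sniffer.feed(markup)
    sniffer.close()
    if sniffer.doctype is None:
        return False
    # Collapse whitespace and ignore case, so "<!doctype   HTML>" also counts.
    return " ".join(sniffer.doctype.split()).lower() == "doctype html"
```

A publishing script could call `is_html5_doctype()` on each document and skip, or at least flag, the clean-up step when it returns true. The sketch recognizes only the short form of the doctype shown in the example above.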

The following assurance about nesting block-level elements inside the <a> element is paraphrased to protect the identity of the supporter of HTML5: "if you need to put an <a> element around a <div> element, use HTML5 and move on". On the basis of that advice, let's see what HTML Tidy will do to the following valid HTML5 document:

  <!DOCTYPE html>
  <title>Greetings</title>
  <a id="abc">
  <div>Hello World!</div>
  </a>

HTML Tidy will move the <div> element outside the <a> element because according to the current rules of HTML, the <a> element cannot contain block level elements such as <div>. The result will be:

  <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
  <html>
  <head>
  <meta name="generator" content="HTML Tidy for Windows (vers 25 March 2009), see www.w3.org">
  <title>Greetings</title>
  </head>
  <body>
  <a id="abc" name="abc"></a>
  <div>Hello World!</div>
  </body>
  </html>

If the <a> element in the previous example is used as a hyperlink like this:

  <!DOCTYPE html>
  <title>Greetings</title>
  <a href="http://localhost">
  <div>Hello World!</div>
  </a>

HTML Tidy will move the <div> out of the hyperlink, creating an empty hyperlink that is unusable in Web browsers:

  <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
  <html>
  <head>
  <meta name="generator" content="HTML Tidy for Windows (vers 25 March 2009), see www.w3.org">
  <title>Greetings</title>
  </head>
  <body>
  <a href="http://localhost"></a>
  <div>Hello World!</div>
  </body>
  </html>
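
The rule behind both results is the pre-HTML5 content model: an <a> element may not contain block-level elements. A document can be scanned for that pattern before it ever reaches a tool that enforces the old rule. The sketch below is one hedged way to do so with Python's standard `html.parser` module. The class and function names are hypothetical, and `BLOCK_ELEMENTS` is a deliberately partial list chosen for illustration.

```python
from html.parser import HTMLParser

# A partial list of HTML 4 block-level elements, enough for illustration.
BLOCK_ELEMENTS = {"div", "p", "ul", "ol", "dl", "table", "blockquote",
                  "pre", "form", "h1", "h2", "h3", "h4", "h5", "h6"}


class BlockInAnchorFinder(HTMLParser):
    """Collects block-level start tags that appear inside an open <a>."""

    def __init__(self):
        super().__init__()
        self.anchor_depth = 0
        self.offenders = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.anchor_depth += 1
        elif tag in BLOCK_ELEMENTS and self.anchor_depth > 0:
            # getpos() returns (line, column); keep the 1-based line number.
            self.offenders.append((tag, self.getpos()[0]))

    def handle_endtag(self, tag):
        if tag == "a" and self.anchor_depth > 0:
            self.anchor_depth -= 1


def blocks_inside_anchors(markup):
    """Return (tag, line) pairs for block elements nested inside <a>."""
    finder = BlockInAnchorFinder()
    finder.feed(markup)
    finder.close()
    return finder.offenders
```

An empty result means the document contains none of the listed block elements inside a link, so at least this particular rewrite should not be triggered.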

The next assurance (also paraphrased to protect the identity of another HTML5 supporter) is as follows: "of course, HTML5 can be used now". S/he then goes on to provide examples of how to use the following new HTML5 elements: <header>, <footer>, <nav>, <aside>, <article>, <figure> and <section>. Again, let's start with the following valid HTML5 document:

  <!DOCTYPE html>
  <title>Greetings</title>
  <header>Hello World!</header>

This time, HTML Tidy will throw the following error and will not be able to proceed:

  line 3 column 1 - Error: <header> is not recognized!
  line 3 column 1 - Warning: discarding unexpected <header>
  line 3 column 9 - Warning: plain text isn't allowed in <head> elements
  line 2 column 1 - Info: <head> previously mentioned
  line 3 column 9 - Warning: inserting implicit <body>
  line 3 column 21 - Warning: discarding unexpected </header>
  Info: Document content looks like HTML 3.2
  4 warnings, 1 error were found!
  This document has errors that must be fixed before using HTML Tidy to generate a tidied up version.
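
Because a tool written for earlier versions of HTML has no way of knowing these element names, a publishing workflow could look for them up front and warn the author before the document is processed. Here is a minimal sketch in Python, again using only the standard `html.parser` module. The names are invented for this example, and `HTML5_ELEMENTS` contains only the elements named above, not every element HTML5 introduces.

```python
from html.parser import HTMLParser

# The new HTML5 elements named in the article; not a complete list.
HTML5_ELEMENTS = {"header", "footer", "nav", "aside",
                  "article", "figure", "section"}


class Html5ElementFinder(HTMLParser):
    """Collects the names of HTML5-only elements used in a document."""

    def __init__(self):
        super().__init__()
        self.found = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag in HTML5_ELEMENTS:
            self.found.add(tag)


def html5_elements_used(markup):
    """Return a sorted list of the HTML5-only elements found in markup."""
    finder = Html5ElementFinder()
    finder.feed(markup)
    finder.close()
    return sorted(finder.found)
```

If the returned list is not empty, the workflow can warn that a legacy clean-up step is likely to reject or alter the document.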

And it's not only HTML Tidy that mangles HTML5. Let's take the following valid HTML5 snippet and run it through three popular WYSIWYG editors:

  <a id="abc">
  <div>Hello World!</div>
  </a>

XStandard will remove the <div> and enclose the <a> element in a paragraph:

  <p><a id="abc">Hello World!</a></p>

CKEditor running in Firefox will create an empty paragraph and move the <a> element inside the <div>:

  <p></p>
  <div><a id="abc">Hello World!</a></div>

TinyMCE running in Internet Explorer will permit a <div> inside the <a> element but will produce markup that is invalid in every version of HTML, because it places the <div> element inside a paragraph:

  <p><a id="abc">
  <div>Hello World!</div>
  </a></p>

And there are plenty of other examples of HTML5 being mangled. For example, if you start with this:

  <header>
  <p>blah</p>
  </header>

One editor will produce:

  <p>
  <header></header>
  </p>
  <p>blah</p>

And another editor will produce:

  <p>blah</p>

The problem is not with authoring tools. Authoring tools were written to support the rules of HTML. The problem is that HTML5 has changed the rules of HTML, and this stems from the fact that HTML5 is developed by only one group of stakeholders in Web technology - browser vendors. As a result, the needs and concerns of other stakeholders of Web technology are not being addressed. Yet influential HTML5 supporters, for various reasons, are disregarding these shortcomings, and are encouraging public use of HTML5 even before it is ready.

Should HTML5 come with a warning label?

The way HTML5 is being sold/advocated reminds me of how the tobacco industry in its early days used to market cigarettes - by claiming that they are safe to smoke. Today cigarette packages come with safety warning labels. Perhaps HTML5 should come with a warning label similar to those found on cigarette packages today?

Image of a cigarette package. Top text reads: 'Warning: HTML5 is not ready for general use. Authoring tools could mangle HTML5 markup!'. Bottom text reads: 'HTML5, version free'.

Yes - advocating premature use of HTML5 is irresponsible

Web pages can start their lifecycle being written by hand and, some time later, be edited with authoring tools that can mangle HTML5 markup and possibly delete content. To the supporters of HTML5 who advocate premature use of HTML5, I say that even if you have good intentions, please take a broader view of how HTML is used and be mindful of the influence you have on novice Web site creators.

Public comments

1. Posted by Lachlan Hunt
on Monday 2010-01-18 at 15:23:07 PST

Of course you can't use a crappy, largely obsolete tool like HTML Tidy and expect it to work with HTML5. If you need to use HTML Tidy to clean up your own markup, then you need to improve your skills. The solution here is: Don't Use HTML Tidy!

You can use HTML5 features now if you're prepared to be an early adopter, and accept the consequences that go with that, like limited browser support and limited authoring tool support.

2. Posted by Vlad Alexander
on Monday 2010-01-18 at 18:52:57 PST

Lachlan Hunt wrote: "Of course you can't use a crappy, largely obsolete tool like HTML Tidy and expect it to work with HTML5"

Obsolete? Wikipedia uses HTML Tidy. Code editors like UltraEdit use HTML Tidy. WYSIWYG editors like XStandard use HTML Tidy. Server-side scripting environments like PHP offer an HTML Tidy extension. But let's put that aside. Is HTML5 only backwards compatible with Web browsers? Why is HTML5 not backwards compatible with widely used authoring tools?

Lachlan Hunt wrote: "The solution here is: Don't Use HTML Tidy!"

Does your advice apply to other authoring tools mentioned in the article? Also, as I stated in the article, many people don't know that they are using HTML Tidy because Tidy is built into many applications and is hidden from users.

Lachlan Hunt wrote: "You can use HTML5 features now if you're prepared to be an early adopter, and accept the consequences that go with that"

I support that! But that is not what this article is about. This article is about HTML5 supporters who advocate the use of HTML5 to the general public without disclosing the consequences.

3. Posted by mattur
on Monday 2010-01-18 at 18:55:08 PST

Using HTML5 will give you bad teeth? I'm British, I already have bad teeth. It was all that HTML3.2 I used to smoke.

4. Posted by Montoya
on Tuesday 2010-01-19 at 07:09:07 PST

HTML 5 supporters, like those who supported XHTML 2.0 and other markup languages in the past, have a mistaken belief that their own blind support for the spec will somehow help to get it finalized. They tell anyone and everyone to start using the spec, regardless of the risks, in the hopes that increased support from end users will push it along. The truth is that HTML 5, and any other new spec, will get finalized when companies like Microsoft, Adobe, Mozilla, Google, Opera, Apple, etc. are good and ready to do so. They don't care about you, or the open source community, or any of the myriad companies that are responsible for every tool under the sun aside from the browsers. When they decide to make it happen, it will happen. Until then, you are just wasting your own time!

5. Posted by Jonas
on Tuesday 2010-01-19 at 07:18:31 PST

Come on... HTML5 is really great. Just because your favorite editor does not yet support it, that is no reason not to make use of it. Last week I decided to put my first HTML5 website online - http://www.rield.com/ - and I've never developed so quickly and easily. No IE tricks, no nonsense. Just great.

6. Posted by Andy Hume
on Wednesday 2010-01-20 at 11:01:30 PST

I was fairly surprised by such a clearcut conclusion not to advocate HTML5.

The obvious conclusion to me is: don't use HTML5 with a load of authoring tools that don't support it. :)

Lachlan saying HTML Tidy is obsolete might be a little over the top. It's not obsolete; it just doesn't support HTML5. That doesn't mean you shouldn't advocate HTML5 to the 'general public'. Actually, you seem to have a very dim view of the intelligence of these people. If they have enough knowledge to be making a decision about HTML5, then they should be more than comfortable with understanding that some authoring tools might not support it.

7. Posted by Vlad Alexander
on Wednesday 2010-01-20 at 13:42:43 PST

Andy Hume wrote: "If they have enough knowledge to be making a decision about HTML5..."

Andy, the problem is that nobody has enough knowledge about the consequences of using HTML5 right now. The only consequences that have been discussed in public are the impact on Web browsers. As far as I know, nobody has discussed the impact on authoring tools, or screen readers, or HTML email, or search engines, etc.

Andy Hume wrote: "The obvious conclusion to me is: don't use HTML5 with a load of authoring tools that don't support it. :)"

Ah, but you have it backwards. According to the design principles of HTML5 to be forwards compatible, existing tools must gracefully accept HTML5. So it's not existing authoring tools that must support HTML5 (which is impossible), it's HTML5 that must support existing authoring tools!

Comments are closed for this article.
