Choosing the right doctype for your site


Which doctype should I use? This is one of the first questions people ask when they start using web standards. There are four main doctypes in use today. This article first defines what a doctype is and how it works, then explains the four types and helps you decide which one to use.

What is a doctype?

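A doctype (document type declaration) is a declaration at the very beginning of an HTML document that tells the browser what type of document it is. It usually refers to a DTD (document type definition), the formal description of the specification the document was written against. This way the browser knows which specification was used to code the document and, therefore, how to display it.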

A typical doctype looks like this:
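For instance, the HTML 4.01 Strict doctype, as published by the W3C, reads as follows (the Transitional variant differs only in its public identifier and DTD URL):

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<!-- "-//W3C//DTD HTML 4.01//EN" names the specification;
     the URL points to its DTD on the W3C website -->
```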

This says that this is an html document written under the html 4.01 specification, and it includes a link to that specification’s DTD on the W3C website. The format is complex and difficult to understand and remember. The W3C has recognized this and greatly simplified it for future specifications. The proposed html 5 doctype looks like this:
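In the html 5 drafts, the entire doctype is a single short line with no public identifier and no DTD URL:

```html
<!DOCTYPE html>
```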

Much better!

The four choices

Currently, four doctypes are commonly used:

    html 4.01 transitional

    html 4.01 strict

    xhtml 1.0 transitional

    xhtml 1.0 strict

There are two major differences between these doctypes: html vs. xhtml and strict vs. transitional. Your decision on these two points will determine which doctype you use.

html vs. xhtml

This is actually a trickier decision than you’d think. xhtml was developed as a bridge between html and the much stricter xml. Its syntax is stricter, which is good because it encourages better coding habits. The problem is that some browsers have never properly supported xhtml.

What is a mime type?

A mime type simply identifies the format of the page. A mime type declaration in a web page looks like this:
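For an xhtml page, such a declaration would typically be a meta element in the head of the document, along these lines (the utf-8 character set shown is an assumption, chosen to match the encoding advice given later in this section):

```html
<!-- Goes inside <head>. The content attribute holds the mime type,
     followed by the character encoding. -->
<meta http-equiv="Content-Type" content="application/xhtml+xml; charset=utf-8" />
```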

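This says that this is an xhtml document. To serve an xhtml document correctly, you would need to include this declaration, which tells the browser to treat the page as xhtml. Strictly speaking, browsers act first on the Content-Type header sent by the web server; the declaration inside the page is only a fallback.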

The mime type declaration also includes the character encoding, which tells the browser how to turn the bytes of the page into text. HTML documents always use the unicode character set, which is a universal standard for representing language characters on computers. The W3C’s Character sets & encodings in XHTML, HTML and CSS tutorial has more details on character encoding. All you really need to know is that UTF-8 is usually the best encoding to use for html documents. If certain characters show up as question marks or square boxes, the character encoding may be the problem.

Xhtml documents may also use an xml declaration at the top of the document (before the doctype) to set the character encoding:
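As an illustration, a standard xml declaration that specifies UTF-8 looks like this, followed by the doctype it would precede (XHTML 1.0 Strict is used here only as an example):

```html
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!-- The xml declaration must be the very first thing in the file,
     ahead of the doctype -->
```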

The only thing we need to know about this right now is that it puts Internet Explorer 6 into quirks mode (see below). Some html editors insert this declaration along with an xhtml doctype. If you’re having rendering problems in IE 6, check whether the declaration is present and, if it is, remove it. The W3C notes that:

    … if you decide to omit the XML declaration you should choose either UTF-8 or UTF-16 as the encoding for the page.

What does this mean?

There are two problems with the xhtml mime type. The first is that the specification says that the browser should break and return an error message whenever it encounters a problem in the xhtml code. This is the way xml works. So, for example, if you had forgotten to close an <li> tag, the browser would stop and return an error. This means that when serving a document as xhtml you have to make sure that it’s correct. Otherwise users will not be able to see the page at all; they will just see an error message.

The second problem is that Internet Explorer doesn’t support xml mime types and has no plans to do so in the future. That means that if you included the application/xhtml+xml content-type specification, Internet Explorer would give the user a download dialog since it doesn’t know how to display the page. There are ways that you can get around this (use scripting to serve xhtml only to browsers that support it) but that’s a bit complicated. The bottom line is that since IE doesn’t support the xhtml mime type, you really can’t serve pages as xhtml.

Since that’s the case, what is the benefit of using an xhtml doctype? If the page can’t be served with an xhtml mime type, it’s really the same as html. But xhtml was developed with the intention that it would also work as text/html. There is really nothing wrong with serving an xhtml document as html. According to a W3C background document:

    In addition, [XHTML1] defines a profile of use of XHTML which is compatible with HTML 4.01 and which may also be labeled as text/html.

Just keep in mind that if/when you do change the mime type of your page to xml, you will need to make sure that everything is valid. Since you’re unlikely to go back and do that on old pages, there really is no danger in using an xhtml doctype with a text/html mime type.

xhtml is stricter than html

The one caveat here is that xhtml does encourage better quality coding practices than html. In xhtml:

all tags and attributes must be lower case

all tags must be closed, including empty (non-enclosing) tags such as br and img

all attributes must be quoted

attribute minimization is not allowed (e.g. checked="checked" not just checked)
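The four rules above can be seen side by side in a short sketch. The list content and the field name "news" are made up for illustration:

```html
<!-- Valid HTML 4.01, but not well-formed XHTML: upper-case tags,
     unclosed LI and BR, unquoted values, minimized "checked" -->
<UL>
  <LI><INPUT type=checkbox name=news checked> Newsletter
  <LI>Line one<BR>Line two
</UL>

<!-- The same markup rewritten to follow the XHTML rules -->
<ul>
  <li><input type="checkbox" name="news" checked="checked" /> Newsletter</li>
  <li>Line one<br />Line two</li>
</ul>
```

The space before the closing slash in empty tags such as br is the form the XHTML 1.0 compatibility guidelines suggest for pages served as text/html.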

As the Web Standards Project notes:

    The margin for errors in HTML is much broader than in XHTML, where the rules are very clear. As a result, XHTML is easier to author and to maintain, since the structure is more apparent and problem syntax is easier to spot.

For this reason, it may be a good idea to use an xhtml doctype. There really is no harm in serving an xhtml document as html and it will help you to learn better coding practices.

What about xhtml 1.1?

The W3C guidelines require xhtml 1.1 documents to be served as xhtml with an xml mime type. This means that you shouldn’t serve xhtml 1.1 documents as text/html, and the previously discussed problems with xhtml mime types apply.

Transitional vs. strict

Transitional doctypes were invented to provide a way for webmasters to transition to the new, stricter specification (html 4.01 from html 3.2). They allow you to get away with older, deprecated tags and attributes. If you are working on moving to a new specification, this would be an appropriate doctype to use.

Strict doctypes don’t let you use these older tags. That doesn’t mean that browsers won’t display them: the page may look the same as it would with a transitional doctype. The difference is that when you run the validator, all of those deprecated tags and other errors will be reported. A strict doctype helps you to write better html by reporting all of these errors, including:

deprecated tags and attributes (see ‘Depr’ column)

improperly nested tags (e.g. inline elements must be contained by a block-level element)
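As a rough sketch of both kinds of error, the first fragment below is accepted by the HTML 4.01 Transitional DTD but reported by the validator under HTML 4.01 Strict. The second is one possible strict-friendly equivalent. The class name "welcome" is hypothetical, and the styling is assumed to live in a stylesheet:

```html
<!-- Fragment 1: fine under Transitional, errors under Strict
     (bgcolor, center and font are deprecated; "Hello" is inline
     content placed directly inside body) -->
<body bgcolor="#ffffff">
  <center><font color="red">Welcome</font></center>
  Hello
</body>

<!-- Fragment 2: a Strict-friendly version; text sits inside
     block-level elements and presentation is left to CSS -->
<body>
  <p class="welcome">Welcome</p>
  <p>Hello</p>
</body>
```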

What is quirks mode?

Quirks mode was created by browsers to ensure that old documents wouldn’t be broken by changes to their rendering engines. This was because older versions of browsers didn’t get CSS quite right. Many old documents that were designed for the old implementation would break if the implementation was corrected in a new version of the browser.

To ensure that those old documents would still be displayed as intended, browser makers decided to use doctypes to decide whether documents should be displayed in the old way or the new way. The table towards the bottom of this page shows you which doctypes trigger quirks mode in which browsers. In that table you’ll also notice an “almost standards mode” for some browsers. This is just what it says: almost like standards mode, but not quite. Either way, you can end up with an unreliable display if you use a “quirks mode” doctype.

One of the reasons to use a strict doctype is that you’ll always get standards-compliant rendering mode. This way your page will always render according to the specification. It will also be more likely to look the same in different browsers. Consistent rendering is one of the reasons to use a doctype in the first place.

Which doctype to choose?

When you’re just starting out with web standards you’ll probably want to start with an html 4.01 transitional doctype. Once you have your pages validating with that doctype, you should change to html 4.01 strict and continue to work on any errors that are detected. From there you may choose to move on to xhtml transitional and finally xhtml strict.

Doctype codes to copy & paste

Whether you choose html or xhtml, transitional or strict, below is the doctype code that you will need to copy and paste into your pages. Mime types with utf-8 character sets are included.

Html 4.01 Transitional
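The doctype below is the one published by the W3C for HTML 4.01 Transitional. The accompanying meta element is a sketch of the usual text/html declaration with utf-8:

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">
<!-- Place the meta element inside <head> -->
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
```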

Html 4.01 Strict
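For HTML 4.01 Strict, the W3C doctype drops the word Transitional and points to strict.dtd. The meta element is again an assumed text/html declaration with utf-8:

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<!-- Place the meta element inside <head> -->
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
```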

Xhtml 1.0 Transitional
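The W3C doctype for XHTML 1.0 Transitional follows. The meta element assumes the page is served as text/html, as recommended earlier in this article, and is written in xhtml syntax with a closing slash:

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!-- Place the meta element inside <head> -->
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
```

In xhtml the root element also carries a namespace attribute: xmlns="http://www.w3.org/1999/xhtml".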

Xhtml 1.0 Strict
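The XHTML 1.0 Strict doctype from the W3C is shown below. As with the transitional version, the text/html mime type in the meta element is an assumption based on the advice above; it is not the application/xhtml+xml type:

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!-- Place the meta element inside <head> -->
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
```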

Discussion

To discuss, ask questions or comment on this article please see the Webmaster Sun Forum discussion about this article.
