Writing HTML/XHTML
* Introduction
- HTML stands for Hypertext Markup Language.
- XML stands for Extensible Markup Language.
- XHTML stands for Extensible Hypertext Markup Language.
Today on the web XHTML has all but taken over. XHTML is essentially the newer version of HTML and was designed to replace it. It is almost identical to HTML, except that the syntax (the way it must be written) is stricter and, as a result, cleaner. HTML is not dead, however. Development of the language continues, and a new specification called HTML5 is taking shape that will likely be adopted quickly once enough browsers support it. It gleans a lot from the XHTML 1.0 world and adds many new features that should prove very useful. For now, XHTML is still king of the mountain and is the preferred syntax and document type for all new web pages.
So you may be wondering: why not just use good old HTML4 for my web pages? The answer comes down to this: HTML is very forgiving of poorly written and malformed documents. In fact, it is too forgiving. It has enabled a World Wide Web filled with incompatible and sloppily written code. Many web pages that use this poorly written code may appear just fine in one browser, but the web is increasingly becoming a multi-platform vehicle for viewing and sharing information across vastly different devices and languages, and it is being indexed in increasingly complex ways. In short, the web is becoming more of a web. It is no longer just a collection of destinations, but rather a vast pool of information that can be shaped, tooled, and remixed into a very specific and more personal experience. Web pages are becoming less and less static and more dynamic and custom-tailored to their users. It is the age of the "mash-up", where multiple bits of information can be compared, combined, and made into entirely new entities. XHTML is how this has begun to happen.
XML was designed to describe data or information. HTML was designed to display data or information. XHTML enables web pages not just to display data, but to be used as data themselves. This makes it easy to convert web pages into multiple types of data, from full-blown HTML documents to stripped-down raw feeds. The only thing you have to do differently from HTML is obey a few more rules that everyone else has to follow, and suddenly all of the information on the World Wide Web plays nicely together.
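To make the "page as data" idea a little more concrete, here is a minimal sketch in Python. It assumes the page is well-formed XHTML, so a generic XML parser from the standard library can read it. The sample markup and the `extract_headlines` helper are made up for illustration and are not part of any standard.

```python
import xml.etree.ElementTree as ET

# A tiny, well-formed XHTML fragment (hypothetical sample content).
PAGE = """<html>
  <head><title>News</title></head>
  <body>
    <h1>Top story</h1>
    <p>First paragraph.</p>
    <h1>Second story</h1>
  </body>
</html>"""


def extract_headlines(xhtml_text):
    """Return the text of every <h1> element, in document order."""
    root = ET.fromstring(xhtml_text)
    return [h1.text for h1 in root.iter("h1")]


print(extract_headlines(PAGE))
```

The same page that a browser displays can be turned into a stripped-down list of headlines, which is roughly what a raw feed is.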
* Syntax
XHTML is a language and thus it has a syntax, meaning it has rules about what counts as a word (a "tag") and how to make a well-formed sentence (an "element"). A tag is simply a way of telling the browser what to do with something. An element is the opening tag, its content, and the closing tag, together as one entity.
A tag is written as such
<tag>
An element would be
<tag>content</tag>
Notice the / in the closing tag. This is how you specify the end or close of that element.
HTML allows the closing tags of certain elements, such as paragraphs and list items, to be left out, meaning that both of these examples are acceptable.
<P> A paragraph
<p> A paragraph </p>
XHTML is far more strict, however, in that closing tags are required, so only the second example would be acceptable. Notice that the first example also uses a capital 'P'. In XHTML, all tags must be written in lower case, but this is not true of HTML.
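The closing-tag rule can be checked mechanically. The sketch below is only an approximation: it uses Python's standard XML parser, which tests whether markup is well-formed XML, not whether it is valid XHTML. The `is_well_formed` helper is a hypothetical name chosen for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


print(is_well_formed("<p>A paragraph</p>"))  # opened and closed
print(is_well_formed("<p>A paragraph"))      # never closed
print(is_well_formed("<P>A paragraph</p>"))  # opening and closing case differ
```

Note that a plain XML parser would accept `<P>A paragraph</P>`, because the two tags match each other. The lower-case requirement comes from XHTML itself, so catching it takes validation against an XHTML DTD rather than a well-formedness check.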
In XML tags may be whatever you wish them to be to describe the data. For instance, you could write:
<people>
<person1>sue</person1>
<person2>joe</person2>
</people>
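Because the tag names are free-form, a program reading this XML simply takes whatever names it finds. A minimal sketch, assuming the example above is stored in a Python string (the variable names are illustrative):

```python
import xml.etree.ElementTree as ET

PEOPLE = """<people>
<person1>sue</person1>
<person2>joe</person2>
</people>"""

root = ET.fromstring(PEOPLE)
# Map each child tag name to its text content.
names = {child.tag: child.text for child in root}
print(names)
```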
In XHTML, only specific tags can be used, all of which come directly from HTML. Not every HTML tag is available under every XHTML doctype, however: the Strict doctype leaves out the deprecated tags, which have been deemed either presentational or unnecessary.
* Formatting
The basic formatting for both HTML and XHTML documents consists of three tags.
- The <html> tag.
- The <head> tag.
- The <body> tag.
Every document must contain these tags formatted as such.
<html>
Everything in your document must be within the <html> tag.
<head>
Inside the <head> tag is where you place anything you want the browser to load before moving on to your content. Things like character encoding, JavaScript, and CSS (Cascading Style Sheets) should be placed here.
</head>
<body>
Inside the <body> tag is where all of the content of your page must go.
</body>
</html>
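As a rough sketch, the three-tag skeleton can be verified with a few lines of Python. This assumes a page with no doctype or namespace yet, and the page content and the `has_basic_structure` helper are invented for the example.

```python
import xml.etree.ElementTree as ET

MINIMAL_PAGE = """<html>
  <head>
    <title>My first page</title>
  </head>
  <body>
    <p>Hello, world.</p>
  </body>
</html>"""


def has_basic_structure(markup):
    """Check that the root is <html> and its children are <head> then <body>."""
    root = ET.fromstring(markup)
    children = [child.tag for child in root]
    return root.tag == "html" and children == ["head", "body"]


print(has_basic_structure(MINIMAL_PAGE))
```

A page that skips the <head> tag, for example, would fail this check even though many browsers would still display it.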
Doctype Declarations (DTD)
In order for a web browser to display your document the way you intend, you must tell it, at the very beginning of your page, what kind of document it is. We do this with the "<!DOCTYPE>" tag, which names the Document Type Definition (DTD), that is, the HTML or XHTML specification, that the document follows.
Note: The <!DOCTYPE> tag does not have an end tag and, in XHTML, is the only tag that can be capitalized.
The "<!DOCTYPE>" tag specifies whether your page is HTML or XHTML and how strictly it adheres to the specification by naming one of three document types.
- Strict
- Transitional
- Frameset
HTML
HTML 4.01 specifies three document types: Strict, Transitional, and Frameset.
- HTML Strict DTD
Use this when you want clean markup, free of presentational clutter. Use this together with Cascading Style Sheets (CSS):
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
- HTML Transitional DTD
The Transitional DTD includes presentation attributes and elements that W3C expects to move to a style sheet. Use this when you need to use HTML's presentational features because your readers don't have browsers that support Cascading Style Sheets (CSS):
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
- HTML Frameset DTD
The Frameset DTD should be used for documents with frames. The Frameset DTD is equal to the Transitional DTD except that the frameset element replaces the body element:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">
XHTML
XHTML 1.0 specifies three XML document types: Strict, Transitional, and Frameset.
- XHTML Strict DTD
Use this DTD when you want clean markup, free of presentational clutter. Use this together with Cascading Style Sheets (CSS):
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
- XHTML Transitional DTD
Use this DTD when you need to use XHTML's presentational features because your readers don't have browsers that support Cascading Style Sheets (CSS):
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
- XHTML Frameset DTD
Use this DTD when you want to use frames!
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
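The six declarations above differ mainly in their public identifier, so a program can tell them apart by reading that string. The following is a minimal sketch rather than a full doctype parser: the regular expression only covers the identifiers listed in this section, and `classify_doctype` is a hypothetical helper name.

```python
import re

# Matches the public identifier inside a doctype declaration, e.g.
# "-//W3C//DTD XHTML 1.0 Strict//EN".
PUBLIC_ID = re.compile(r'"-//W3C//DTD (X?HTML) ([\d.]+)(?: (\w+))?//EN"')


def classify_doctype(declaration):
    """Return (language, version, flavor), or None if not recognized."""
    match = PUBLIC_ID.search(declaration)
    if match is None:
        return None
    language, version, flavor = match.groups()
    # HTML 4.01 Strict carries no flavor word in its public identifier.
    return language, version, flavor or "Strict"


print(classify_doctype(
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">'
))
```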
* Tags
Below is a list of the available tags for both HTML 4.01 and XHTML 1.0 with their availability by doctype.
S = Strict, T = Transitional, F = Frameset.
| Tag | Description | Availability |
|---|---|---|
| <!--...--> | Defines a comment | STF |
| <!DOCTYPE> | Defines the document type | STF |
| <a> | Defines an anchor | STF |
| <abbr> | Defines an abbreviation | STF |
| <acronym> | Defines an acronym | STF |
| <address> | Defines an address element | STF |
| <applet> | Deprecated. Defines an applet | TF |
| <area> | Defines an area inside an image map | STF |
| <b> | Defines bold text | STF |
| <base> | Defines a base URL for all the links in a page | STF |
| <basefont> | Deprecated. Defines a base font | TF |
| <bdo> | Defines the direction of text display | STF |
| <big> | Defines big text | STF |
| <blockquote> | Defines a long quotation | STF |
| <body> | Defines the body element | STF |
| <br> | Inserts a single line break | STF |
| <button> | Defines a push button | STF |
| <caption> | Defines a table caption | STF |
| <center> | Deprecated. Defines centered text | TF |
| <cite> | Defines a citation | STF |
| <code> | Defines computer code text | STF |
| <col> | Defines attributes for table columns | STF |
| <colgroup> | Defines groups of table columns | STF |
| <dd> | Defines a definition description | STF |
| <del> | Defines deleted text | STF |
| <dir> | Deprecated. Defines a directory list | TF |
| <div> | Defines a section in a document | STF |
| <dfn> | Defines a definition term | STF |
| <dl> | Defines a definition list | STF |
| <dt> | Defines a definition term | STF |
| <em> | Defines emphasized text | STF |
| <fieldset> | Defines a fieldset | STF |
| <font> | Deprecated. Defines text font, size, and color | TF |
| <form> | Defines a form | STF |
| <frame> | Defines a sub window (a frame) | F |
| <frameset> | Defines a set of frames | F |
| <h1> to <h6> | Defines header 1 to header 6 | STF |
| <head> | Defines information about the document | STF |
| <hr> | Defines a horizontal rule | STF |
| <html> | Defines an html document | STF |
| <i> | Defines italic text | STF |
| <iframe> | Defines an inline sub window (frame) | TF |
| <img> | Defines an image | STF |
| <input> | Defines an input field | STF |
| <ins> | Defines inserted text | STF |
| <isindex> | Deprecated. Defines a single-line input field | TF |
| <kbd> | Defines keyboard text | STF |
| <label> | Defines a label for a form control | STF |
| <legend> | Defines a title in a fieldset | STF |
| <li> | Defines a list item | STF |
| <link> | Defines a resource reference | STF |
| <map> | Defines an image map | STF |
| <menu> | Deprecated. Defines a menu list | TF |
| <meta> | Defines meta information | STF |
| <noframes> | Defines a noframe section | TF |
| <noscript> | Defines a noscript section | STF |
| <object> | Defines an embedded object | STF |
| <ol> | Defines an ordered list | STF |
| <optgroup> | Defines an option group | STF |
| <option> | Defines an option in a drop-down list | STF |
| <p> | Defines a paragraph | STF |
| <param> | Defines a parameter for an object | STF |
| <pre> | Defines preformatted text | STF |
| <q> | Defines a short quotation | STF |
| <s> | Deprecated. Defines strikethrough text | TF |
| <samp> | Defines sample computer code | STF |
| <script> | Defines a script | STF |
| <select> | Defines a selectable list | STF |
| <small> | Defines small text | STF |
| <span> | Defines a section in a document | STF |
| <strike> | Deprecated. Defines strikethrough text | TF |
| <strong> | Defines strong text | STF |
| <style> | Defines a style definition | STF |
| <sub> | Defines subscripted text | STF |
| <sup> | Defines superscripted text | STF |
| <table> | Defines a table | STF |
| <tbody> | Defines a table body | STF |
| <td> | Defines a table cell | STF |
| <textarea> | Defines a text area | STF |
| <tfoot> | Defines a table footer | STF |
| <th> | Defines a table header cell | STF |
| <thead> | Defines a table header group | STF |
| <title> | Defines the document title | STF |
| <tr> | Defines a table row | STF |
| <tt> | Defines teletype text | STF |
| <u> | Deprecated. Defines underlined text | TF |
| <ul> | Defines an unordered list | STF |
| <var> | Defines a variable | STF |
| <xmp> | Deprecated. Defines preformatted text | |
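
The availability column can be read as a simple lookup: a tag is allowed under a doctype if that doctype's letter appears in its entry. Here is a minimal sketch using only a handful of rows copied from the table. The `AVAILABILITY` dictionary and `is_allowed` helper are names invented for this example.

```python
# A small excerpt of the table above: tag name -> doctypes that allow it.
# S = Strict, T = Transitional, F = Frameset.
AVAILABILITY = {
    "p": "STF",
    "strong": "STF",
    "font": "TF",
    "center": "TF",
    "iframe": "TF",
    "frameset": "F",
}


def is_allowed(tag, doctype):
    """Return True if `tag` may be used under `doctype` ('S', 'T' or 'F')."""
    return doctype in AVAILABILITY.get(tag.lower(), "")


print(is_allowed("font", "S"))
print(is_allowed("font", "T"))
print(is_allowed("frameset", "F"))
```

Tags missing from the excerpt are reported as not allowed, so a real checker would need the full table.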
* Semantics
There are many tags in the HTML/XHTML world that do much the same thing. For instance, the <b> tag and the <strong> tag are both rendered as bold text in most browsers. The <b> tag describes only how the text within it should be displayed (as bold text). The <strong> tag, however, describes the contained information as something that should be conveyed in a strong way. This difference may seem trivial at first, but choosing the more appropriate tag can really make a difference in how the data is interpreted by different devices or when the page is translated into another language.
For instance, if the page is being interpreted by a screen reader for a visually impaired person, there is no way for it to convey the concept of bolder text, but a screen reader can interpret the <strong> tag, for example by reading that passage a little louder.
Another example is the <i> tag versus the <em> tag. In most browsers these are both displayed as italic text, but again the emphasis tag is a much more descriptive way of expressing the meaning behind it.
Some written languages have no parallel for italicized text and use other methods to show emphasis, so if you use the <em> tag the browser is able to present that passage in the way most appropriate for the reader.
Essentially what this means for the author of the document is that you should always try to use the most descriptive tag available to you. This can sometimes mean employing tags that you otherwise wouldn't even consider.
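If you want to audit an existing page for this, a short script can point out where a presentational tag was used. The sketch below uses Python's standard `html.parser` module. The `SUGGESTIONS` mapping covers only the two pairs discussed above, and the class and function names are assumptions made for this example.

```python
from html.parser import HTMLParser

# Presentational tags and the semantic tags suggested in the text above.
SUGGESTIONS = {"b": "strong", "i": "em"}


class PresentationalTagFinder(HTMLParser):
    """Collect (found, suggested) pairs for presentational tags."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag names before calling this method.
        if tag in SUGGESTIONS:
            self.findings.append((tag, SUGGESTIONS[tag]))


def find_presentational_tags(markup):
    """Return a list of (found, suggested) tag pairs in document order."""
    finder = PresentationalTagFinder()
    finder.feed(markup)
    finder.close()
    return finder.findings


print(find_presentational_tags("<p><b>Warning:</b> read <I>this</I> first.</p>"))
```

A script like this can only flag candidates. Whether <strong> or <em> is really the better choice still depends on what the text means.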
Some Other Semantic Tags to Consider
- <abbr> - Defines an abbreviation.
Example: By using the title attribute with the <abbr> tag, you get a tool tip on hover that spells out the abbreviation, and screen readers can read out the full title.
<abbr title="Doctor">Dr.</abbr> Doolittle
Renders:
Dr. Doolittle
- <acronym> - Defines an acronym.
Example: Using the title attribute will render a tool tip that you can use to spell out the acronym. And again, screen readers can read out the full title.
<acronym title="Cascading Style Sheets">CSS</acronym>
Renders:
CSS
- <address> - Defines an address.
Example: You should use it to define addresses, a signature, or the authorship of a document.
<address>1234 E. Main St. Anytown, U.S.A.</address>
Renders:
1234 E. Main St. Anytown, U.S.A.
- <cite> - Defines a reference to another work such as a book, report, or web site.
Example: Essentially this tag defines a citation of another work, and it can include the date of publication, a link to the publication, and a description of the publication you are citing.
<cite>Written on January 25th, 2008 by Kris Hedges</cite>
Renders:
Written on January 25th, 2008 by Kris Hedges
- <code> - Defines computer code text.
Example: The <code> tag by default will monospace your font, but it also defines computer code and sets it apart from other written content.
<code>#header { margin: 0; font-size: 3em; }</code>
Renders:
#header { margin: 0; font-size: 3em; }
- <samp> - Defines sample computer code.
Example: Similar to the <code> tag, the <samp> tag monospaces your font and refers to example or sample code in your documents.
<samp>html, body { margin: 0; padding: 0; }</samp>
Renders:
html, body { margin: 0; padding: 0; }
- <del> - Defines text that was deleted.
Example: Most browsers will render the text with a strike-through. It is used to show when and how a document has been updated.
Semantics are <del>often</del> always important
Renders:
Semantics are often always important (with "often" struck through)
- <ins> - Defines text that was inserted (often used with <del>).
Example: Again, the <ins> tag shows where and how a document has been updated. Many browsers will underline the inserted content; however, some still render it as plain text.
Semantics are <del>often</del> <ins>always</ins> important
Renders:
Semantics are often always important (with "often" struck through and "always" underlined)
- <q> - Defines a short quotation.
Example: Similar to <blockquote>, the <q> tag defines a quotation, but a short one. Modern browsers will automatically render quotation marks around the enclosed content; however, IE6 and below do not. Some authors resort to DOM scripting to fix this hole in IE's rendering engine.
<q>Studying is learning.</q>
Renders:
Studying is learning.
So now you should have a pretty good understanding of how to create well-formed HTML/XHTML documents. Let's learn how to make them look great too and give them some style!