The Missing Part of HTML

This post collects things about HTML that I either met for the first time or had forgotten, noted while reading Getting started with HTML on MDN.

Element Categories: Block-level and Inline

There are two important categories of elements in HTML: block-level elements and inline elements. HTML5 redefined the element categories. The new definitions are more accurate and less ambiguous than the earlier ones, and they avoid the confusion between types of CSS boxes and element categories. For simplicity, though, we can still use the pre-HTML5 concepts of block-level and inline.

The terms “block” and “inline” should not be confused with the types of CSS boxes with the same names. While they correlate by default, changing the CSS display type doesn’t change the category of the element and doesn’t affect which elements it can contain and which elements it can be contained in. One of the reasons why HTML5 dropped these terms was to prevent this rather common confusion.
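To make the distinction concrete, here is a minimal sketch. The inline styles and the sample text are my own illustrative choices, not taken from MDN.

```html
<!-- A span stays an inline element even when CSS renders it as a block box. -->
<span style="display: block">I render on my own line, but I am still a span.</span>

<!-- A div stays a block-level element even when CSS renders it inline. -->
<div style="display: inline">I render inline, but I am still a div.</div>

<!-- Containment follows the element category, not the CSS box type:
     the parser closes the <p> as soon as it meets the <div> start tag. -->
<p>Some text <div style="display: inline">not actually inside the paragraph</div></p>
```

In a browser's DOM inspector, the last line should show the div as a sibling of the paragraph rather than its child, whatever display value the div has.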

Quotes around Attribute Values

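You can omit the quotes around an attribute value, but only if the value contains no whitespace (or other restricted characters such as quotes, =, < and >). Consider what happens otherwise: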

<a href=https://www.mozilla.org/ title=The Mozilla homepage>favorite website</a>

The unquoted title value ends at the first space, so title is just The, and Mozilla and homepage are parsed as separate boolean attributes.
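As a rough sketch of what happens with the unquoted title, the first line below spells out roughly how a parser reads that markup (attribute names are lowercased and valueless attributes get an empty value), and the second line is what was intended:

```html
<!-- How the unquoted version is effectively read -->
<a href="https://www.mozilla.org/" title="The" mozilla="" homepage="">favorite website</a>

<!-- What was intended: quote any value that contains a space -->
<a href="https://www.mozilla.org/" title="The Mozilla homepage">favorite website</a>
```

Always quoting attribute values avoids having to remember which characters are safe to leave unquoted.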

Single or double quotes? Both are fine. As in some programming languages, if you’ve used one type of quote to delimit an attribute value, you can include the other type inside the value without causing any problems:

<a href="http://www.example.com" title="Isn't this fun?">link</a>
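The same works the other way round. This is a minimal sketch with a made-up title text, using single quotes as delimiters so that double quotes can appear inside the value:

```html
<!-- Single quotes delimit the values, so double quotes are safe inside them -->
<a href='http://www.example.com' title='The "example" site'>link</a>
```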

HTML Entities

HTML entities are used to include special characters in HTML; a complete table is available on Wikipedia.

If you want to include a quote character inside an attribute value that is delimited by the same type of quote (single or double), you have to write it as an HTML entity. For example:

<a href='http://www.example.com' title='Isn&#39;t this fun?'>link</a>

In HTML, the characters <, >, ", ' and & are special: they are part of the HTML syntax itself. So how do you include one of them in your text, for example an ampersand or a less-than sign, without having it interpreted as code?

The answer is character references: special codes that represent characters and can be used in exactly these circumstances. Each character reference starts with an ampersand (&) and ends with a semicolon (;).

Literal character    Character reference equivalent
<                    &lt;
>                    &gt;
"                    &quot;
'                    &apos;
&                    &amp;

For example, to show a literal <p> in an element's content:

<p>In HTML, you define a paragraph using the &lt;p&gt; element.</p>
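A couple more cases, sketched with made-up sample text, show the other references in use, both in element content and inside an attribute value:

```html
<!-- Ampersand, less-than and greater-than in element content -->
<p>Tom &amp; Jerry know that 3 &lt; 5 and 5 &gt; 3.</p>

<!-- Double quotes inside a double-quoted attribute value -->
<a href="http://www.example.com" title="A &quot;quoted&quot; word">link</a>
```

Note that &apos; is defined in HTML5 but was not part of HTML 4, so the numeric form &#39; is the more conservative choice when very old browsers matter.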

<!DOCTYPE html>

<!DOCTYPE html>: The doctype. In the mists of time, when HTML was young (about 1991/2), doctypes were meant to act as links to a set of rules that the HTML page had to follow to be considered good HTML, which could mean automatic error checking and other useful things. They used to look something like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

However, these days no one really cares about them, and they are really just a historical artifact that needs to be included for everything to work right. <!DOCTYPE html> is the shortest string of characters that counts as a valid doctype; that’s all you really need to know and use.
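For reference, here is a minimal page skeleton showing where the doctype goes. The title and paragraph text are placeholders of my own:

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>My test page</title>
  </head>
  <body>
    <p>This is my page.</p>
  </body>
</html>
```

The doctype must come first in the file, before the opening html tag.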

Whitespace in HTML

The following two code snippets are equivalent:

<p>Dogs are silly.</p>

<p>Dogs        are
         silly.</p>

No matter how much whitespace you use (space characters, but also line breaks), the browser collapses each run of whitespace down to a single space when rendering the code. So why use so much whitespace? The answer is readability: it is much easier to understand what is going on in your code if it is nicely formatted, and not just bunched up together in a big mess.
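The readability argument is easiest to see with nested elements. In this small sketch (the list content is made up), both versions should render the same list, but the second is far easier to scan:

```html
<!-- Bunched up -->
<ul><li>Dogs</li><li>Cats</li></ul>

<!-- Formatted: same elements, one per line, indented by nesting level -->
<ul>
  <li>Dogs</li>
  <li>Cats</li>
</ul>
```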

Miscellaneous

Tags in HTML are case-insensitive, so <img> and <IMG> both work. Best practice, however, is to write all tags in lowercase.
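As a quick illustration (dog.png is a hypothetical file name), the two lines below are treated the same way, because attribute names are case-insensitive as well. Attribute values, such as the file name, are not:

```html
<!-- Works, but is harder to read and not recommended -->
<IMG SRC="dog.png" ALT="A dog">

<!-- Preferred: lowercase tag and attribute names -->
<img src="dog.png" alt="A dog">
```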