The i, b, em, & strong elements

Tuesday, March 9th, 2010 by Oli Studholme.

While many HTML4 elements have been brought into HTML5 essentially unchanged, several historically presentational ones have been given semantic meanings.

Let’s look at `<i>` and `<b>` and compare them to the semantic stalwarts `<em>` and `<strong>`. In summary:

`<i>`: was italic, now for text in an “alternate voice”, such as transliterated foreign words, technical terms, and typographically italicized text (W3C:Markup, WHATWG)
`<b>`: was bold, now for “stylistically offset” text, such as keywords and typographically emboldened text (W3C:Markup, WHATWG)
`<em>`: was emphasis, now for stress emphasis, i.e., something you’d pronounce differently (W3C:Markup, WHATWG)
`<strong>`: was for stronger emphasis, now for strong importance, basically the same thing (stronger emphasis or importance is now indicated by nesting) (W3C:Markup, WHATWG)

Giving presentational elements new semantic meanings

`<i>` and `<b>` were HTML4 font style elements and are still used presentationally where appropriate to follow typographic conventions. They now have semantic meaning, however, and their style can be changed via CSS, meaning they’re not only presentational: `<b>`, for example, doesn’t have to be bold. Because of this, it’s recommended to use classes to indicate meaning to make it easy to change the style later.

The `<i>` element

The i element represents a span of text in an alternate voice or mood, or otherwise offset from the normal prose in a manner indicating a different quality of text, such as a taxonomic designation, a technical term, an idiomatic phrase from another language, a thought, or a ship name in Western texts.

HTML — Living Standard, WHATWG

Other things that are typically italicised include transliterated foreign words (using the attribute lang=""), inline stage directions in a script, some musical notation, and when representing hand-written text inline:

Deckard: Move! Get out of the way!

Deckard fires. Kills Zhora in dramatic slow motion scene.

Deckard: The report would be routine retirement of a replicant which didn’t make me feel any better about shooting a woman in the back. There it was again. Feeling, in myself. For her, for Rachael.

Deckard: Deckard. B-263-54.

Using `<i>` to indicate a voiceover (alternate mood)
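The script excerpt above could be marked up roughly as follows. This is only a sketch, not the article’s original markup: the class names `speaker`, `stage-direction`, and `voiceover` are assumptions made up for illustration, and putting the stage direction in its own classed paragraph is one possible choice among several.

```html
<!-- Sketch only: all class names here are hypothetical -->
<p><b class="speaker">Deckard:</b> Move! Get out of the way!</p>

<!-- A whole paragraph of stage direction is styled via its class in CSS -->
<p class="stage-direction">Deckard fires. Kills Zhora in dramatic slow motion scene.</p>

<!-- The voiceover is an alternate mood, so it is wrapped in <i> -->
<p><b class="speaker">Deckard:</b> <i class="voiceover">The report would be routine retirement of a replicant which didn’t make me feel any better about shooting a woman in the back. There it was again. Feeling, in myself. For her, for Rachael.</i></p>

<p><b class="speaker">Deckard:</b> Deckard. B-263-54.</p>
```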

We ate unagi, aburi-zake, and tako sushi last night, but the toro sushi was all fished out.

Using `<i>` to indicate a transliterated word from a foreign language (with lang="ja-latn" indicating transliterated Japanese). To check lang="" values you can use the IANA Language Subtag Registry (ouch), or the excellent Language Subtag Lookup tool by Richard Ishida, W3C.
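As a sketch, the sushi sentence might be marked up like this. Which words were wrapped in the original is an assumption on my part; the `lang="ja-latn"` value is the one given in the caption.

```html
<!-- Each transliterated Japanese word gets <i> plus a lang attribute -->
<p>We ate <i lang="ja-latn">unagi</i>, <i lang="ja-latn">aburi-zake</i>,
and <i lang="ja-latn">tako</i> sushi last night, but the
<i lang="ja-latn">toro</i> sushi was all fished out.</p>
```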

Nanotyrannus (“dwarf tyrant”) is a genus of tyrannosaurid dinosaur, and is possibly a juvenile specimen of Tyrannosaurus. It is based on CMN 7541, a skull collected in 1942 and described in 1946 by Charles W. Gilmore, who assigned it to the new species Gorgosaurus lancensis.

Using `<i>` for taxonomic names
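A minimal sketch of the taxonomy example, assuming a hypothetical `taxonomy` class to record why each `<i>` is there:

```html
<!-- class="taxonomy" is an illustrative name, not something the spec defines -->
<p><i class="taxonomy">Nanotyrannus</i> (“dwarf tyrant”) is a genus of
tyrannosaurid dinosaur, and is possibly a juvenile specimen of
<i class="taxonomy">Tyrannosaurus</i>.</p>
```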

Only use `<i>` when nothing more suitable is available, e.g., `<em>` for text with stress emphasis, `<strong>` for text with semantic importance, `<cite>` for titles in a citation or bibliography, `<dfn>` for the defining instance of a word, and `<var>` for mathematical variables. Use CSS instead for italicizing blocks of text, such as asides, verse, and (as used here for W3C specification quotes) block quotations. Remember to use the class attribute to identify why the element is being used, making it easy to restyle a particular use. You can target lang in CSS using the attribute selector (e.g., [lang="ja-latn"]). Full sentences of foreign prose should generally be set in quotes in their own paragraph (or blockquote), and should not use `<i>` (add the lang attribute to the containing element).
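The styling advice in the paragraph above could look something like this in practice. It is a sketch under assumptions: the `taxonomy` and `voiceover` classes are hypothetical, and the colours and sample sentences are arbitrary.

```html
<style>
  /* Target transliterated Japanese through the lang attribute selector */
  i[lang="ja-latn"] { font-style: italic; }

  /* Hypothetical classes: each use of <i> can be restyled on its own */
  i.taxonomy  { font-style: italic; }
  i.voiceover { font-style: normal; color: #555; }

  /* Italicize a whole block with CSS instead of wrapping it in <i> */
  blockquote { font-style: italic; }
</style>

<p>The <i lang="ja-latn">toro</i> sushi was all fished out.</p>

<!-- A full sentence of foreign prose: lang goes on the containing element -->
<blockquote lang="fr">
  <p>Ceci n’est pas une pipe.</p>
</blockquote>
```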

The `<b>` element

The b element represents a span of text to which attention is being drawn for utilitarian purposes without conveying any extra importance and with no implication of an alternate voice or mood, such as key words in a document abstract, product names in a review, actionable words in interactive text-driven software, or an article lede.

HTML — Living Standard, WHATWG

For `<b>` text that should merely look different, there is no requirement to use font-weight: bold. Other styling could include a round-cornered background, larger font size, different color, or formatting such as small caps. For instance, in the script example above, `<b>` is used to indicate who’s speaking or narrating.
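To illustrate, here is a sketch of `<b>` restyled so that it is not bold at all. The class names `speaker` and `keyword` and the specific property values are assumptions chosen for the example.

```html
<style>
  /* Speaker names: small caps instead of bold */
  b.speaker {
    font-weight: normal;
    font-variant: small-caps;
  }

  /* Keywords: a round-cornered background instead of bold */
  b.keyword {
    font-weight: normal;
    padding: 0 0.3em;
    border-radius: 0.3em;
    background: #eee;
  }
</style>

<p><b class="speaker">Deckard:</b> Move! Get out of the way!</p>
<p>This abstract covers <b class="keyword">semantics</b> and <b class="keyword">typography</b>.</p>
```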

Text that is bold by typographic convention (and not because it’s more important) could include names in a Hollywood gossip column or the initial text on a complex or traditionally designed page:

Only use `<b>` when there are no other more suitable elements, e.g., `<strong>` for text with semantic importance, `<em>` for emphasized text (text with “stress emphasis”), `<h1>`–`<h6>` for titles, and `<mark>` for highlighted or marked text. Use classes on list items for a tag cloud. To recreate traditional typographic effects, use CSS pseudo-element selectors like :first-line and :first-letter where appropriate. Again, remember to use the class attribute to identify why the element is being used, making it easy to restyle a particular use.
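A sketch of the pseudo-element approach, which needs no extra markup inside the paragraph. The `opening` class and the sample sentence are hypothetical.

```html
<style>
  /* Embolden the first rendered line of an opening paragraph */
  p.opening:first-line { font-weight: bold; }

  /* Enlarge the initial letter, in the manner of a drop cap */
  p.opening:first-letter { font-size: 2em; }
</style>

<p class="opening">It should come as no surprise that typographic
conventions can often be recreated with CSS alone.</p>
```

Because `:first-line` follows the rendered line, the emboldened text changes as the paragraph reflows, which a fixed `<b>` span cannot do.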

…and for comparison, the `<em>` and `<strong>` elements

While `<em>` and `<strong>` have remained pretty much the same, there has been a slight realignment in their meanings. In HTML4 they meant ‘emphasis’ and ‘strong emphasis’. Now their meanings have been differentiated into `<em>` representing stress emphasis (i.e., something you’d pronounce differently), and `<strong>` representing importance.

The `<em>` element

The em element represents stress emphasis of its contents.

HTML — Living Standard, WHATWG

The ‘stress’ being referred to is linguistic. If spoken, this stress would be emphasised pronunciation on a word that can change the nuance of a sentence. For example, `Call a <em>doctor</em> now!` emphasises the importance of calling a doctor, perhaps in reply to someone asking “Should I get a nurse?” In contrast, `Call a doctor <em>now</em>!` emphasises the importance of calling immediately.

Use `<strong>` instead to indicate importance and `<i>` when you want italics without implying emphasis. The level of nesting represents the level of emphasis.
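A short sketch of stress emphasis and nesting. The `em em` rule is an assumption added for illustration: default browser styles typically render nested `<em>` the same as a single level, so a rule of this kind is needed if the extra level should be visible.

```html
<style>
  /* Hypothetical rule to make the second level of emphasis visible */
  em em { text-decoration: underline; }
</style>

<!-- Stress on "doctor": not a nurse -->
<p>Call a <em>doctor</em> now!</p>

<!-- Stress on "now": immediately -->
<p>Call a doctor <em>now</em>!</p>

<!-- Nesting raises the level of emphasis on the inner word -->
<p>Call a doctor <em>right <em>now</em></em>!</p>
```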

The `<strong>` element

The strong element represents strong importance for its contents.

HTML — Living Standard, WHATWG

Not much more to say really: it’s the `<strong>` we all know so well. Indicate relative importance by nesting `<strong>` elements, and use `<em>` for text with stress emphasis, or `<b>` for text that is “stylistically offset” or bold without being more important.
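As a sketch, relative importance through nesting, next to the two alternatives named above. The sentences and the `product` class are made up for the example.

```html
<!-- The inner <strong> is more important than the warning around it -->
<p><strong>Warning: this substance is corrosive.
<strong>Do not touch it with bare hands.</strong></strong></p>

<!-- Stress emphasis, not importance -->
<p>Turn the valve <em>clockwise</em>, not anticlockwise.</p>

<!-- Stylistically offset, with no extra importance -->
<p>We tested the <b class="product">Acme Valve 3000</b> for a week.</p>
```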

In summation…

A final thing to note: these elements (and almost all HTML5 elements) have also been made explicitly media-independent, meaning their semantics are not tied to how they look in a visual browser.

So there you have it — two stray dogs of presentational HTML4 have been transformed into meaningful HTML5 elements, ready to be adopted into your coding once again. Can you resist their semantically shiny puppy-dog eyes? Let us know!

Changes

2012-01-31 I’ve updated mentions of using `<i>` for foreign words to be specifically transliterated foreign words (what I was meaning), based on feedback in the comments. I’ve also updated spec quotes.

70 Responses on the article “The i, b, em, & strong elements”

mwiik says

March 9, 2010 at 4:40 pm

Argh, this seems a bad idea. We deal with lots of (usu. MS Word) content with bold and italicized text, and neither the provider nor me is likely to spend time figuring out if b and i tags conform to these semantics. For those page elements where we have control, we happily use strong and em (with their html 4.01 semantics) where such styling is appropriate, but we stick with b and i for provided content since the implied semantics may differ or there may not be any implied semantics at all.

Oli Studholme says

March 9, 2010 at 5:06 pm

Hi mwiik,
Thanks for your comment. Luckily in your case using `<b>` and `<i>` as-is (with no classes) means you’re just using them for typographic effect, with no semantics at all. This is the same as HTML4. I think of HTML as a best-effort game, and it sounds like you’re using the appropriate elements when you can.
peace – oli

jacobian says

March 9, 2010 at 5:46 pm

Very interesting info about HTML5. I’ll learn and apply it then. :-)

Vladislav says

March 9, 2010 at 6:51 pm

What a relief! I was puzzled how to write “R-isomer” or in vito in html5. Now it is clear. Thank you.

Daniel Baird says

March 9, 2010 at 8:24 pm

Nice article; welcome to my RSS feed :)

In the paragraph “Use `<strong>` instead to indicate importance and `<i>` when you want italics…” do you perhaps mean to say “use em to indicate emphasis and i when …”?

Jimmy says

March 9, 2010 at 8:25 pm

I think this change is going to need a hell of a lot of publicity if it’s going to be even remotely successful, nor do I think it’s a great idea in the first place. The new meanings of `<i>` and `<b>` are not intuitive at all. There’s nothing about the letter I that implies “alternative voice” and nothing about the letter B that implies “stylistically offset” to me. Add on years and years of history using those same tags to mean something entirely different and it’s going to be difficult to get people to remember this change, let alone use it as described. The benefit of semantic elements to me is that it’s pretty clear from looking at code what content is what. A lot of that value comes from the intuitive naming of elements. Hence, the new meanings of `<i>` and `<b>` cause confusion more than they help mark up content in a clear and useful way.

Andrew Vit says

March 9, 2010 at 10:09 pm

While it seems like this is just meant to clear up and refine what the true meanings of these tags should really be, I worry that it’s way over-specified now, even though most people probably found it confusing already.

When’s the last time you cared about whether or why something is “strong” or “emphasized”? Bold and italic are usually pretty clear in what they mean in their context, and when you need to specify further, there’s always the class attribute. I’m starting to wonder why the separate semantic elements? Are they really justified when we use classes on them anyway? Why not just  when such specificity is needed?

I think this is getting too academic for 90% of HTML authors, when class seems perfectly usable as an enhanced specifier. `<b>` or `<i>` provide the fallback user-agent rendering, and as for semantics, are importance or emphasis actually semantic things?

There are other elements that could use more spec love, I wonder why something this basic needs to be so baroque!

Andrew Vit says

March 9, 2010 at 11:14 pm

A few more gripes:

`<dfn>` and `<i>` also seem redundant. Or aren’t they? What’s the difference?

Why is :first-letter only applicable to block elements? Another arbitrary restriction?

these elements (and almost all HTML5 elements) have also been made explicitly media-independent, meaning their semantics are not tied to how they look in a visual browser.

Can you elaborate on what that implies? I’m reading this to mean that user agents are free to render `<i>`, `<b>`, `<em>` and `<strong>` as they choose, so we should start specifying these elements in our stylesheets?

Alohci says

March 10, 2010 at 12:55 am

@Oli

Luckily in your case using `<b>` and `<i>` as-is (with no classes) means you’re just using them for typographic effect, with no semantics at all

I’m sorry but I can see nothing in the spec to justify that assertion. It seems clear to me that valid use of `<i>` does not permit typographic effect on the whim of the author, only where the typical typographic effect is italics. It’s true that the spec encourages further differentiation by use of class name, but class names, in the absence of microformats, are private semantics, not public ones, and therefore are of limited use. In particular, class names are likely to be written in the language of the author.

In general, it’s hard to work out what use the new semantics can be put, since HTML5 documents are officially indistinguishable from HTML4 ones, (i.e. HTML5 does not specify any versioning mechanism). So any processor must assume that for any given document, either HTML4 or HTML5 semantics may apply, and that therefore for each element, the meaning can only be resolved to the union of the HTML4 and HTML5 meanings. Since for the elements of this article, the HTML5 semantics seem to be a narrowing of the HTML4 ones, the union of the two is the same as the HTML4 semantics. Note that this does not apply to the <cite> element, where HTML5 permits usage that HTML4 did not.

Oli Studholme says

March 10, 2010 at 1:46 am

Thank you all for your comments!

@Vladislav: thank you!

@Daniel Baird: Nope :) Those are notes on when elements other than `<em>` would be more appropriate.

@Jimmy: The “typographically italicized text” and “typographically emboldened text” meanings have not changed at all compared to HTML4. The additional semantics give us some preset ways of marking up content, and in general map to how `<i>` and `<b>` are used. The somewhat confusing terminology is part of making these media independent (what does “italic” sound like in a speech reader?).

Finally remember that intuitive naming only applies if you speak English: `<b>` has zero connection to 太字 (futoji, the Japanese equivalent). You’ll be relieved to know Japanese web developers are not proposing to replace `<b>` with the `<f>` element ;-)

@Andrew Vit: over-specified? [blink] Well I guess it does take all the fun out of semantic debates ;-) In reality many of the new HTML5 elements are like this, e.g., `<div>` vs `<nav>`.

The benefits of using specific elements over classes are greater uniformity (class is freeform and generally depends on the author’s language, validators will catch typos on elements but ignore classes), and because we can do things based on agreed semantics. It’s arguable whether we really need  in addition to , but that’s what history has given us.

By “there are other elements that could use more spec love” do you mean at HTML5Doctor? If so please let us know what you’d like us to cover! If you mean in the spec, I’d say everything has received a lot of love ;-)

`<dfn>` indicates the defining instance of a term, but none of the taxonomy terms are being defined (there’s no title on nanotyrannus) as I’m no palaeontologist.

Re: :first-line, I’m terribly sorry to mislead you but I mistook a browser bug for bad coding on my part (see below). I’ve removed that text from the article.

Finally by media-independent I mean they’re not just defined based on how they look, so that they still have meaning in non-visual user agents.

Oli Studholme says

March 10, 2010 at 2:07 am

@Alohci I’d say that text that has been italicised in MS Word is by default prose whose typographic presentation is italicised. I agree there are cases another element may be more appropriate, but it’s representing italic text.

I agree with you that the new semantics aren’t particularly useful for user agents. However the additional meaning provided by class names will help e.g. in restyling and in identifying all elements of a particular class (for someone inheriting the code etc). It would also be possible to pull data out of a site if this was used consistently (ref: @mpilgrim’s use of <cite> back in the day)

Andrew Vit says

March 10, 2010 at 5:34 am

@oli, thanks for the response. Au contraire ←(lang=”fr”), I’m not trying to dismiss the semantic debate, I’m all for it: I just think it’s very muddled in this area. Also, I’m not questioning your interpretation of the HTML5 spec — your article is outstanding — I’m really questioning the reasoning behind the spec itself.

I don’t think “over-specified” is quite the word I’m looking for, but it’s like these overlapping tags are begging to justify their existence by adding more specs to prop them up.

Here we have 4 generic tags that essentially mean bold or italic on the surface, but can mean any number of things underneath. Since these styles are conventionally applied to such a variety of content, the spec has to be burdened with arcane rules to explain when to use which one, and the author is burdened with unnecessary (academic or arbitrary) choices. Note that the “author” can be software, so these rules can never be adequate. It’s impossible to nail down something that’s wide open for interpretation when all we really need to distinguish is bold/italic, and whatever semantic meaning it has can be done by other means.

I understand and embrace the motives behind semantic elements. I just can’t imagine a scenario where your “agreed semantics” for a generic bold/strong or italic/emphasized element would have any useful meaning by itself.

Let’s say you want to collect the Japanese sushi names, or taxonomies from a science article. You can’t rely on `<i>` alone, you need those class/lang attributes for additional meaning. `<em>` and `<strong>` are no different in this regard.

As for the semantic meaning of `<em>` and `<strong>`, is there any practical boundary between them, independent of how they’re rendered? Whether you call something “important” or “emphasized” seems like hair-splitting. (Note how the spec authors themselves are just swapping words: “stronger emphasis” is now “strong importance”!?) The only practical consideration here is to choose the one that renders logically, whether in a visual or aural user agent, because otherwise they seem equivalent.

(In aural user agents, is there a difference between “alternate voice” and “stress emphasis”, which are the touted differences between `<i>` and `<em>`? Italics cover both of these in print, wouldn't the result be the same thing in speech as well?)

Also, any text, or block element, or group of elements could be considered "important". An inline element to mark importance doesn't really fit the pattern for anything except text. (A paragraph, a list, or a form element could be considered important; headings are block elements with implicit importance.) I would derive that importance is really an inherent attribute, not a separate element itself. And similar to the point above, is generic importance semantically meaningful by itself, outside of any scope? Why not just have a class for it instead?

many of the new HTML5 elements are like this, eg <div> vs <nav>

Why not <voiceover>, <taxo title="common house fly">, <idiom lang="jp"> in that case? Obviously, because there would be hundreds of these...

Yes, more of these elements are wonderful when there's an obvious need, and I appreciate them. But for bold/italic passages, the uses are too varied and generic that the argument is against em/strong as truly "semantic" tags.

John Faulds says

March 10, 2010 at 9:56 am

I’ve already been using in the way that it’s now been redefined by HTML5 so I’m happy that the usage has now been formalised. I’d never really found a good use for though so I think what’s being proposed in the spec is a good thing and will lead to me finding more uses for it in future.

UltraBob says

March 10, 2010 at 10:03 am

Great article, but a question unrelated to the core content:

:first-letter is restricted to being applied to block elements, but surely with a p:first-letter {} style and HTML like

It should come as no surprise that …

‘I’ would still be the first letter in the paragraph.

UltraBob says

March 10, 2010 at 10:05 am

Obviously the second p tag should have a /. Typing HTML entities on an iPhone sucks.

HrvojeKC says

March 10, 2010 at 11:00 am

I have a problem with this:
The level of nesting represents the level of emphasis.

Does this mean that for very strong emphasis you would write something like this?

Help!

If so, then it’s not a good specification.

HrvojeKC says

March 10, 2010 at 11:04 am

damn, I forgot to change the code, here it is:

Help!

HTML5 Doctor

Helping you implement HTML5 today
