The i, b, em, & strong elements
by Oli Studholme
While many HTML4 elements have been brought into HTML5 essentially unchanged, several historically presentational ones have been given semantic meanings.
Let’s look at <i> and <b> and compare them to the semantic stalwarts <em> and <strong>. In summary:
- <i>: was italic, now for text in an “alternate voice”, such as transliterated foreign words, technical terms, and typographically italicized text (W3C:Markup, WHATWG)
- <b>: was bold, now for “stylistically offset” text, such as keywords and typographically emboldened text (W3C:Markup, WHATWG)
- <em>: was emphasis, now for stress emphasis, i.e., something you’d pronounce differently (W3C:Markup, WHATWG)
- <strong>: was for stronger emphasis, now for strong importance, basically the same thing (stronger emphasis or importance is now indicated by nesting) (W3C:Markup, WHATWG)
Giving presentational elements new semantic meanings
<i> and <b> were HTML4 font style elements and are still used presentationally where appropriate to follow typographic conventions. They now have semantic meaning, however, and their style can be changed via CSS, meaning they’re not only presentational: <b>, for example, doesn’t have to be bold. Because of this, it’s recommended to use classes to indicate meaning to make it easy to change the style later.
The <i> element
The i element represents a span of text in an alternate voice or mood, or otherwise offset from the normal prose in a manner indicating a different quality of text, such as a taxonomic designation, a technical term, an idiomatic phrase from another language, a thought, or a ship name in Western texts.
Other things that are typically italicised include transliterated foreign words (using the attribute lang=""), inline stage directions in a script, some musical notation, and hand-written text represented inline:
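A minimal sketch of these uses follows. The class names and sample sentences are illustrative assumptions, not names defined by the spec:

```html
<!-- Taxonomic designation -->
<p>The skull was attributed to <i class="taxonomy">Nanotyrannus</i>.</p>

<!-- Transliterated foreign word: lang marks Japanese written in Latin script -->
<p>We ordered <i lang="ja-latn">unagi</i> and green tea.</p>

<!-- Inline stage direction in a script; <b> marks who is speaking -->
<p><b class="character">Narrator</b> <i class="stage-direction">(quietly)</i> The house was empty.</p>

<!-- A thought, offset from the surrounding prose -->
<p><i class="thought">This can't be happening</i>, she thought.</p>
```

The class on each element records why <i> was used, so each kind of use can be restyled separately later.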
Only use <i> when nothing more suitable is available, e.g., <em> for text with stress emphasis, <strong> for text with semantic importance, <cite> for titles in a citation or bibliography, <dfn> for the defining instance of a word, and <var> for mathematical variables. Use CSS instead for italicizing blocks of text, such as asides, verse, and (as used here for W3C specification quotes) block quotations. Remember to use the class attribute to identify why the element is being used, making it easy to restyle a particular use. You can target lang in CSS using the attribute selector (e.g., [lang="ja-latn"]). Full sentences of foreign prose should generally be set in quotes in their own paragraph (or blockquote), and should not use <i> (add the lang attribute to the containing element).
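As a rough sketch of the styling advice above (the stage-direction class and the colour value are assumptions made for illustration):

```html
<style>
  /* Target transliterated Japanese through the lang attribute selector */
  i[lang="ja-latn"] { font-style: italic; }

  /* Hypothetical class: restyle one use of <i> without affecting the others */
  i.stage-direction { font-style: normal; color: #666; }

  /* Italicize a whole block quotation with CSS rather than wrapping it in <i> */
  blockquote { font-style: italic; }
</style>

<!-- A full sentence of foreign prose: lang goes on the containing element, no <i> -->
<blockquote lang="fr">
  <p>Au contraire, mon ami.</p>
</blockquote>
```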
The <b> element
The b element represents a span of text to which attention is being drawn for utilitarian purposes without conveying any extra importance and with no implication of an alternate voice or mood, such as key words in a document abstract, product names in a review, actionable words in interactive text-driven software, or an article lede.
For <b> text that should merely look different, there is no requirement to use font-weight: bold. Other styling could include a round-cornered background, larger font size, different color, or formatting such as small caps. For instance, in the script example above, <b> is used to indicate who’s speaking or narrating.
Text that is bold by typographic convention (and not because it’s more important) could include names in a Hollywood gossip column or the initial text on a complex or traditionally designed page:
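A minimal sketch of both cases, using placeholder names and hypothetical class names:

```html
<!-- Names in a gossip column: offset by convention, not more important -->
<p><b class="name">Person One</b> was seen leaving the premiere with <b class="name">Person Two</b>.</p>

<!-- Initial text of an article set off in bold (an article lede) -->
<p><b class="lede">The opening sentence of the article is set off by convention.</b> The rest of the paragraph continues in normal type.</p>
```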
Only use <b> when there are no other more suitable elements, e.g., <strong> for text with semantic importance, <em> for emphasized text (text with “stress emphasis”), <h1>–<h6> for titles, and <mark> for highlighted or marked text. Use classes on list items for a tag cloud. To recreate traditional typographic effects, use CSS pseudo-element selectors like :first-line and :first-letter where appropriate. Again, remember to use the class attribute to identify why the element is being used, making it easy to restyle a particular use.
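One way this might look in a stylesheet is sketched below. The character and intro class names are assumptions, and the sizes are arbitrary:

```html
<style>
  /* Hypothetical class: speaker names in a script, offset with small caps instead of bold */
  b.character { font-weight: normal; font-variant: small-caps; }

  /* Traditional opening effects without any extra markup */
  p.intro:first-line { font-variant: small-caps; }
  p.intro:first-letter { font-size: 3em; float: left; }
</style>

<p class="intro">It should come as no surprise that the first line and first letter of this paragraph are styled without a single extra element.</p>
```

Because the effects live in CSS, the markup does not need <b> at all for the drop cap or the opening line.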
…and for comparison, the <em> and <strong> elements
While <em> and <strong> have remained pretty much the same, there has been a slight realignment in their meanings. In HTML4 they meant ‘emphasis’ and ‘strong emphasis’. Now their meanings have been differentiated into <em> representing stress emphasis (i.e., something you’d pronounce differently), and <strong> representing importance.
The <em> element
The em element represents stress emphasis of its contents.
The ‘stress’ being referred to is linguistic. If spoken, this stress would be emphasised pronunciation on a word that can change the nuance of a sentence. For example, “Call a <em>doctor</em> now!” emphasises the importance of calling a doctor, perhaps in reply to someone asking “Should I get a nurse?” In contrast, “Call a doctor <em>now</em>!” emphasises the importance of calling immediately.
Use <strong> instead to indicate importance and <i> when you want italics without implying emphasis. The level of nesting represents the level of emphasis.
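In markup, the doctor example and the nesting rule might be sketched like this:

```html
<!-- Stress on "doctor": a doctor rather than, say, a nurse -->
<p>Call a <em>doctor</em> now!</p>

<!-- Stress on "now": call immediately -->
<p>Call a doctor <em>now</em>!</p>

<!-- Nested em: the inner word carries a higher level of emphasis -->
<p><em>Call a doctor <em>now</em>!</em></p>
```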
The <strong> element
The strong element represents strong importance for its contents.
Not much more to say really: it’s the <strong> we all know so well. Indicate relative importance by nesting <strong> elements, and use <em> for text with stress emphasis, or <b> for text that is “stylistically offset” or bold without being more important.
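A short sketch of relative importance through nesting (the warning text is made up for illustration):

```html
<!-- A single level of importance -->
<p><strong>Warning:</strong> this area is restricted.</p>

<!-- Nested strong: the inner phrase is more important than the rest of the warning -->
<p><strong>Do not enter. <strong>High voltage.</strong></strong></p>
```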
In summation…
A final thing to note: these elements (and almost all HTML5 elements) have also been made explicitly media-independent, meaning their semantics are not tied to how they look in a visual browser.
So there you have it — two stray dogs of presentational HTML4 have been transformed into meaningful HTML5 elements, ready to be adopted into your coding once again. Can you resist their semantically shiny puppy-dog eyes? Let us know!
Changes
- I’ve updated mentions of using <i> for foreign words to be specifically transliterated foreign words (what I meant), based on feedback in the comments. I’ve also updated spec quotes.
70 Responses on the article “The i, b, em, & strong elements”
Argh, this seems a bad idea. We deal with lots of (usu. MS Word) content with bold and italicized text, and neither the provider nor me is likely to spend time figuring out if b and i tags conform to these semantics. For those page elements where we have control, we happily use strong and em (with their html 4.01 semantics) where such styling is appropriate, but we stick with b and i for provided content since the implied semantics may differ or there may not be any implied semantics at all.
Hi mwiik,
Thanks for your comment. Luckily in your case using <b> and <i> as-is (with no classes) means you’re just using them for typographic effect, with no semantics at all. This is the same as HTML4. I think of HTML as a best-effort game, and it sounds like you’re using the appropriate elements when you can.
peace – oli
Very interesting info about HTML5. I’ll learn and apply it then. :-)
What a relief! I was puzzled how to write “R-isomer” or in vito in html5. Now it is clear. Thank you.
Nice article; welcome to my RSS feed :)
In the paragraph “Use <strong> instead to indicate importance and <i> when you want italics…” do you perhaps mean to say “use em to indicate emphasis and i when …”?
I think this change is going to need a hell of a lot of publicity if it’s going to be even remotely successful, nor do I think it’s a great idea in the first place. The new meanings of <i> and <b> are not intuitive at all. There’s nothing about the letter I that implies “alternative voice” and nothing about the letter B that implies “stylistically offset” to me. Add on years and years of history using those same tags to mean something entirely different and it’s going to be difficult to get people to remember this change, let alone use it as described. The benefit of semantic elements to me is that it’s pretty clear from looking at code what content is what. A lot of that value comes from the intuitive naming of elements. Hence, the new meanings of <i> and <b> cause confusion more than they help mark up content in a clear and useful way.
While it seems like this is just meant to clear up and refine what the true meanings of these tags should really be, I worry that it’s way over-specified now, even though most people probably found it confusing already.
When’s the last time you cared about whether or why something is “strong” or “emphasized”? Bold and italic are usually pretty clear in what they mean in their context, and when you need to specify further, there’s always the class attribute. I’m starting to wonder why the separate semantic elements? Are they really justified when we use classes on them anyway? Why not just <b> when such specificity is needed?
I think this is getting too academic for 90% of HTML authors, when class seems perfectly usable as an enhanced specifier. <b> or <i> provide the fallback user-agent rendering, and as for semantics, are importance or emphasis actually semantic things?
There are other elements that could use more spec love, I wonder why something this basic needs to be so baroque!
A few more gripes:
<dfn> and <i> also seem redundant. Or aren’t they? What’s the difference?
Why is :first-letter only applicable to block elements? Another arbitrary restriction?
Can you elaborate on what that implies? I’m reading this to mean that user agents are free to render <strong>, <b>, <i> and <em> as they choose, so we should start specifying these elements in our stylesheets?
@Oli
I’m sorry but I can see nothing in the spec to justify that assertion. It seems clear to me that valid use of <i> does not permit typographic effect on the whim of the author, only where the typical typographic effect is italics. It’s true that the spec encourages further differentiation by use of class name, but class names, in the absence of microformats, are private semantics, not public ones, and therefore are of limited use. In particular, class names are likely to be written in the language of the author.
In general, it’s hard to work out to what use the new semantics can be put, since HTML5 documents are officially indistinguishable from HTML4 ones (i.e., HTML5 does not specify any versioning mechanism). So any processor must assume that for any given document, either HTML4 or HTML5 semantics may apply, and that therefore for each element, the meaning can only be resolved to the union of the HTML4 and HTML5 meanings. Since for the elements of this article, the HTML5 semantics seem to be a narrowing of the HTML4 ones, the union of the two is the same as the HTML4 semantics. Note that this does not apply to the <cite> element, where HTML5 permits usage that HTML4 did not.
Thank you all for your comments!
@Vladislav — thank you!
@Daniel Baird: Nope :) Those are notes on when elements other than <em> would be more appropriate.
@Jimmy: The “typographically italicized text” and “typographically emboldened text” meanings have not changed at all compared to HTML4. The additional semantics give us some preset ways of marking up content, and in general map to how <i> and <b> are used. The somewhat confusing terminology is part of making these media independent (what does “italic” sound like in a speech reader?).
Finally remember that intuitive naming only applies if you speak English: <b> has zero connection to 太字 (futoji, the Japanese equivalent). You’ll be relieved to know Japanese web developers are not proposing to replace <b> with the <f> element ;-)
@Andrew Vit: over-specified? [blink] Well I guess it does take all the fun out of semantic debates ;-) In reality many of the new HTML5 elements are like this, e.g., <div> vs <nav>.
The benefits of using specific elements over classes are greater uniformity (class is freeform and generally depends on the author’s language, validators will catch typos on elements but ignore classes), and because we can do things based on agreed semantics. It’s arguable whether we really need <strong> in addition to <b>, but that’s what history has given us.
By “other elements that could use more spec love” do you mean at HTML5Doctor? If so please let us know what you’d like us to cover! If you mean in the spec, I’d say everything has received a lot of love ;-)
<dfn> indicates the defining instance of a term, but none of the taxonomy terms are being defined (there’s no title on nanotyrannus) as I’m no palaeontologist.
Re: :first-line, I’m terribly sorry to mislead you but I mistook a browser bug for bad coding on my part (see below). I’ve removed that text from the article.
Finally by media-independent I mean they’re not just defined based on how they look, so that they still have meaning in non-visual user agents.
@Alohci I’d say that text that has been italicised in MS Word is by default prose whose typographic presentation is italicised. I agree there are cases another element may be more appropriate, but it’s representing italic text.
I agree with you that the new semantics aren’t particularly useful for user agents. However the additional meaning provided by class names will help e.g. in restyling and in identifying all elements of a particular class (for someone inheriting the code etc). It would also be possible to pull data out of a site if this was used consistently (ref: @mpilgrim’s use of <cite> back in the day)
@oli, thanks for the response. Au contraire ←(lang=”fr”), I’m not trying to dismiss the semantic debate, I’m all for it: I just think it’s very muddled in this area. Also, I’m not questioning your interpretation of the HTML5 spec (your article is outstanding); I’m really questioning the reasoning behind the spec itself.
I don’t think “over-specified” is quite the word I’m looking for, but it’s like these overlapping tags are begging to justify their existence by adding more specs to prop them up.
Here we have 4 generic tags that essentially mean bold or italic on the surface, but can mean any number of things underneath. Since these styles are conventionally applied to such a variety of content, the spec has to be burdened with arcane rules to explain when to use which one, and the author is burdened with unnecessary (academic or arbitrary) choices. Note that the “author” can be software, so these rules can never be adequate. It’s impossible to nail down something that’s wide open for interpretation when all we really need to distinguish is bold/italic, and whatever semantic meaning it has can be done by other means.
I understand and embrace the motives behind semantic elements. I just can’t imagine a scenario where your “agreed semantics” for a generic bold/strong or italic/emphasized element would have any useful meaning by itself.
Let’s say you want to collect the Japanese sushi names, or taxonomies from a science article. You can’t rely on <i> alone, you need those class/lang attributes for additional meaning. <em> and <strong> are no different in this regard.
As for the semantic meaning of <em> and <strong>, is there any practical boundary between them, independent of how they’re rendered? Whether you call something “important” or “emphasized” seems like hair-splitting. (Note how the spec authors themselves are just swapping words: “stronger emphasis” is now “strong importance”!?) The only practical consideration here is to choose the one that renders logically, whether in a visual or aural user agent, because otherwise they seem equivalent.
(In aural user agents, is there a difference between “alternate voice” and “stress emphasis”, which are the touted differences between <i> and <em>? Italics cover both of these in print, wouldn't the result be the same thing in speech as well?)
Also, any text, or block element, or group of elements could be considered "important". An inline element to mark importance doesn't really fit the pattern for anything except text. (A paragraph, a list, or a form element could be considered important; headings are block elements with implicit importance.) I would derive that importance is really an inherent attribute, not a separate element itself. And similar to the point above, is generic importance semantically meaningful by itself, outside of any scope? Why not just have a class for it instead?
Why not <voiceover>, <taxo title="common house fly">, <idiom lang="jp"> in that case? Obviously, because there would be hundreds of these...
Yes, more of these elements are wonderful when there's an obvious need, and I appreciate them. But for bold/italic passages, the uses are too varied and generic that the argument is against em/strong as truly "semantic" tags.
I’ve already been using <i> in the way that it’s now been redefined by HTML5 so I’m happy that the usage has now been formalised. I’d never really found a good use for <b> though so I think what’s being proposed in the spec is a good thing and will lead to me finding more uses for it in future.
Great article, but a question unrelated to the core content:
:first-letter is restricted to being applied to block elements, but surely with a p:first-letter {} style and HTML like
<p><b>It should come as no surprise</b> that …<p>
‘I’ would still be the first letter in the paragraph.
Obviously the second p tag should have a /. Typing HTML entities on an iPhone sucks.
I have a problem with this:
Does this mean that for very strong emphasis you would write something like this?
Help!
If so, then it’s not a good specification.
damn, I forgot to change the code, here it is:
<strong>
<strong>
<strong>Help!</strong>
</strong>
</strong>