Designing HTML for Inline XBRL 1.0

Working Group Note 19 April 2023

This version
https://www.xbrl.org/WGN/html-for-ixbrl-wgn/WGN-2023-04-19/html-for-ixbrl-wgn-2023-04-19.html
Editor
Paul Warren, XBRL International Inc. <pdw@xbrl.org>

Table of Contents

1 Overview

This document provides a number of recommendations for the construction of HTML for use in Inline XBRL reports. These recommendations are intended to improve compatibility and operation with Inline XBRL viewer software, ensure the quality of extracted XBRL data, and improve performance.

2 Reserved CSS class names

Inline XBRL viewer software may apply CSS classes to a rendered Inline XBRL report in order to enable fact highlighting and other features. In order to avoid collisions between CSS classes used by viewer software, and CSS classes used for styling the Inline XBRL report, class names starting with -ixv- should be considered reserved for use by Inline XBRL viewer software, and should not be used in an Inline XBRL report or any accompanying stylesheet.

If Inline XBRL viewer software needs to add CSS classes to an Inline XBRL report, it should ensure that all such classes are prefixed with -ixv-, for example, -ixv-selected-fact.

This document also proposes a CSS class that can be used to provide a hint to Inline XBRL viewer software (see Section 8. This uses a prefix of -ixh-. The prefix -ixh- should also be considered reserved, and CSS classes starting with this prefix should not be used in Inline XBRL reports or any accompanying stylesheet, other than as described in this or documents published by XBRL International.

3 HTML efficiency

The HTML in Inline XBRL reports can be very large, particularly when generated using a PDF-to-HTML conversion process, and this can lead to slow loading and rendering performance. This can be particularly problematic for Inline XBRL viewer software, as such software will typically require the report to be fully rendered before the software can be fully functional.

Improving rendering performance in Inline XBRL reports is discussed in more detail in the Inline XBRL Rendering Performance Working Group Note. This includes a recommendation to use the content-visibility: auto CSS property which can very substantially improve the rendering performance of such documents in some browsers.

4 HTML tag selection

The Inline XBRL specification does not impose any rules on the choice of HTML tags used to achieve a particular layout result. For example, there is no requirement to mark-up tabular data using <table>, <tr> and <td> tags, or for headings to use the HTML heading tags (<h1>, <h2>, <h3>, etc.). Such features can be tagged using generic tags such as <div> or <span> and applying appropriate styling, or by any other approach supported by HTML.

Use of more specific HTML tags may improve the usability and accessibility of HTML documents, and thus of Inline XBRL reports, but it is not required by the Inline XBRL specification.

5 Text-block tags

Inline XBRL provides a mechanism for including HTML markup from the source document in an XBRL fact value. This mechanism is enabled by setting the value of the escape attribute on an ix:nonNumeric element to true. When enabled, any HTML tags occurring within the Inline XBRL tag will be included in the resulting fact value. The resulting fact value will be a valid XHTML fragment.

If set to false, HTML tags are not included in the output; only text content is included in the output. The resulting fact value will be plain text.

For example, consider an ix:nonNumeric tag around the following content:

<b>Bold text</b>

With escape="false", the resulting fact value will be "Bold text".

With escape="true", the fact value will include the HTML tags, i.e. "<b>Bold text</b>".

How a fact value is interpreted is determined by its datatype. Facts that are intended to contain XHTML will typically use dtr-types:textBlockItemType. The value of a fact using the dtr-types:textBlockItemType MUST be valid XHTML, and so Inline XBRL tags for such facts should use escape="true"

If escape="false" is used, special characters occurring in the fact's text, such as <, > and & will not be XML-escaped, and as a result the result fact value may not be valid XHTML.

Use of escape="true" on facts which are not expected to contain XHTML, such as xbrli:stringItemType, will lead to unexpected markup in the resulting fact value, which is likely to lead to XHTML tags being displayed directly to the user when viewing such facts.

Therefore it is important that the value of the escape attribute matches the datatype of the fact:

5.1 Use of escape="false" on text block tags

If the text content of an Inline XBRL tag does not contain characters which must be escaped in XML (< and &), then it is possible to use escape="false" on text block tags, because the resulting plain text string is also valid XHTML. This is not generally recommended as any HTML formatting within the tag will be lost.

5.2 Escaping when reserialising

In this section, references to "fact value" refer to the semantic value of a fact in an XBRL report. If the report is re-serialised to another format, fact values must be escaped appropriately for that format. For example, where a report is serialised to XBRL v2.1's XML syntax (xBRL-XML), fact values must be XML-escaped. This means that a fact value of:

<b>Profit &amp; Loss</b>

Gets escaped to:

&lt;b&gt;Profit &amp;amp; Loss&lt;/b&gt;

This escaping is unrelated to the behaviour of the escape attribute, and is an artefact of the format being used. For example, if a fact were serialised in xBRL-JSON, it would instead undergo JSON-escaping (replacing \ with \\, and " with \").

6 Whitespace in text facts

When content in an iXBRL document is tagged using an ix:nonNumeric tag, care needs to be taken in order to ensure that the fact values extracted from the iXBRL document preserve whitespace, so that breaks between words, paragraphs and numbers are retained.

This section describes some of the common issues.

6.1 Block-level HTML tags

Where an ix:nonNumeric tag uses the default escape="false" attribute, the resulting fact value is the concatenation of all text nodes that are descendants of the tag and of any referenced ix:continuation elements. If care is not taken in the construction of the HTML, the extracted value may not contain whitespace in all places where space is visible in the rendered report.

One place where this can occur is if the text is split across HTML block-level tags, such as <p> or <div>:

<ix:nonNumeric name="eg:DescriptionOfPolicy" context="c1" id="f1" escape="false">
  <p>This is the first part of the description.</p><p>This is the second part of the description.</p>
</ix:nonNumeric>

This will be rendered as two separate paragraphs:

This is the first part of the description.

This is the second part of the description.

but the extracted fact value will not include any space between the two sentences:

This is the first part of the description.This is the second part of the description.

Including whitespace between the closing </p> and the next opening <p>, will ensure that a break between the sentences is preserved, and will not affect the rendering of the original document:

<ix:nonNumeric name="eg:DescriptionOfPolicy" context="c1" id="f1" escape="false">
  <p>This is the first part of the description.</p> <p>This is the second part of the description.</p>
</ix:nonNumeric>

Extracted fact value:

This is the first part of the description. This is the second part of the description.

Note that this issue also applies to other CSS display modes such as list-item and table-row, and also the <br> tag.

6.2 ix:continuation

A similar situation can occur when using ix:continuation. This can affect tags that use either escape="true" or escape="false".

<div class="page">
    <ix:nonNumeric 
        name="eg:DescriptionOfPolicy" context="c1" id="f1" 
        escape="false" continuedAt="cont1"
    >This is a description</ix:nonNumeric>
</div>
<div class="page">
    <ix:continuation id="cont1">of my policy.</ix:continuation>
</div>

This arrangement may occur when a sentence is split across a page or column break. The rendered view will show the sentence split across two pages or columns:

This is a description

of my policy.

but the extracted fact value will be:

This is a descriptionof my policy

This situation can be avoided by introducing additional space within the ix:continuation elements.

6.3 Use of CSS styling to create spaces

It is also possible to introduce spaces into the rendered output using CSS styling. For example:

<ix:nonNumeric 
    name="eg:DescriptionOfPolicy" context="c1" id="f1" 
    escape="false" 
><span style="padding-right: 10px">My</span><span>policy</span></ix:nonNumeric>

This will render as:

My policy

but the extracted fact value (using escape="false") will be:

Mypolicy

The behaviour when using escape="true" will depend on viewer software, and may also depend on whether the styling is applied using an inline style, or a CSS class.

It should be noted that this use of styling to simulate word breaks will also interfere with standard web browser features such as copy and paste, and text search, which will treat the text as if there is no space between the words. Similarly, where documents are published on the web, it is likely to break correct indexing by search engines.

Using styling to simulate word breaks is not recommended. Instead, the report should include whitespace in the HTML, and if necessary, adjust the width of the rendered space using other means.

6.4 Whitespace normalisation (escape="false")

When rendering HTML, browsers apply whitespace normalisation to most whitespace. This means that runs of spaces, and other whitespace characters such as tabs and newlines, are rendered as a single space. For example, the following two examples will render in the same way:

<p>My policy</p>
<p>My 
   policy</p>

When extracting XBRL fact values from an ix:nonNumeric tag with escape="false", all whitespace in the original document is preserved. This means that the resulting fact value for the above two examples will be different.

The Inline XBRL Specification does not prescribe how such fact values should be presented to an end user, and does not define whether whitespace normalisation should be applied.

7 HTML compatibility

Inline XBRL requires that documents are valid XHTML, the XML-based syntax for HTML. Although similar, browsers will treat XHTML and HTML differently. Whether a document is treated as XHTML or HTML depends on a number of factors, not all of which are within the document author's control. For example, where a report is placed on a website, the mode may be controlled by the HTTP Content-Type that it is served with, which will depend on the server configuration.

Similarly, where text-block facts are extracted from an iXBRL report, tools consuming the resulting fact may not always treat it as XHTML rather than HTML.

For this reason, it is recommended that iXBRL reports are constructed in a way that results in them be rendered correctly in both XHTML and HTML modes. There are three issues that should be considered when doing this:

  1. Use of self-closing tags;
  2. Escaping rules; and
  3. Browser rendering mode.

These are discussed in more detail below.

7.1 Use of self-closing tags

In XML, an empty tag may be "self-closed". <br /> and <br></br> are completely equivalent in XML, and thus XHTML.

In HTML, tags which can only be empty, such as br, cannot have a closing tag, and an HTML parser will treat a closing </br> tag as if it were another <br> tag. This means that <br></br> will be treated as two <br> tags, yielding a difference in rendering between HTML and XHTML. Fortunately, self-closed tags (<br />) are also treated as a single tag, so self-closing elements that can only be empty yields the same results in HTML and XHTML.

Conversely, self-closing tags that are expected to have content will also lead to differences between HTML and XHTML. For example, in XHTML, an empty span tag can be represented as <span />, but in HTML this will be interpretted as an opening tag (a closing tag will be inferred at some point later in the document). This yields a different DOM structure, with the result that CSS styling may be applied differently.

To ensure consistency between HTML and XHTML, tags that must be empty should use the self-closing tag syntax, whereas empty tags that are allowed to contain content should use the expanded notation.

7.2 Escaping rules

XHTML must be well-formed XML, which means that XML special characters (< and &) must be always be escaped. The > character is not usually required to be escaped in XML, but it frequently is for consistency with <.

HTML follows similar escaping rules, but applies different rules within different elements, most notably <style> and <script> elements.

Content with an <style> tag is not expected to be XML-escaped in HTML, and a CSS style rule such as:

div > p { color: red }

will not be understood if it is represented as:

div &gt; p { color: red }

Therefore, > should not be escaped within <style> tags.

< and & are not part of CSS syntax, but can appear within CSS comments and CSS strings. In the latter case, they can be represented using the unicode escape sequences, \00003C and \000026 respectively.

A similar issue occurs with the content of <script> tag, where XML special characters must be escaped in XHTML, but must not be in HTML. As script content is not generally permitted in iXBRL reporting environments, approaches to dealing with this are not discussed further.

In both cases, these escaping issues can be avoided entirely by placing the styling or script content in a separate file.

7.3 Rendering mode

In addition to the syntactic issues noted above, browsers will, by default, use different rendering modes for XHTML and HTML documents. XHTML documents will use "standards mode" whereas HTML documents will use "quirks mode".

For the most part these two modes yield the same results, but certain HTML and CSS constructs will be treated differently. A full discussion of the differences is outside the scope of this document, but in many cases, these differences can be resolved by using more explicit CSS constructs.

8 Highlighting hint container

Software that displays iXBRL reports will typically highlight facts within the report for different purposes. Determining exactly which region of the report to highlight is not trivial, as whatever approach is taken needs to cope with the wide variety of different HTML structures seen in real-world iXBRL reports.

The widespread use of absolutely position elements, driven largely by the use of PDF-to-HTML conversion software, adds additional complexity to this, and even where such HTML is handled correctly, the end result can be somewhat messy.

For example, where full paragraphs of text are constructed using <p></p> tag, and then tagged, an outline or background color can be applied to the paragraphs as a whole.

Figure 1: Highlighting applied to <p> tags

PDF-to-HTML conversion software will often use a separate, absolutely positioned element for each line within the paragraph. If a viewer copes with this at all, it is likely to add a border or background color to each line individually, as shown below:

Figure 2: Highlighting applied to individual lines

If there is a single enclosing element for the content, it will, by default, have zero width and height, so applying highlighting to that will not have the desired effect.

This document proposes a mechanism that can be used by creation software to provide a hint that will allow viewer software to provide cleaner highlighting of such content.

Where the nearest HTML ancestor of an ix:footnote, ix:nonNumeric, ix:nonFraction, ix:fraction, or ix:nonNumeric element is an HTML element with a class of -ixh-highlight-region, the HTML element should have position and dimensions that corresponds to the content within the tag. Viewer software may then choose to apply highlighting to that element, rather than attempting to determine the extent of the iXBRL tags rendered content.

This will yield an appearance that is similar to Figure 1.

Where multiple iXBRL elements tag the same content (nested tags) they can share the same highlighting hint by placing it around the outer most iXBRL element.

Appendix A Document history

Date Description
7th Dec 2022 Initial public release
Updated to include text block tags and HTML compatibility sections