Writing in DocBook

Atlas supports writing in DocBook (versions 4.4, 4.5, and 5.0), an OASIS standard for XML that is great for writing long, technical documents that have complex structure and cross-references. Files ending in .xml will be converted to HTMLBook when building. DocBook files in Atlas can be edited using the Code Editor.

DocBook XML Markup Reference

Following are some guidelines that you may find helpful for writing in DocBook with Atlas.

Using Elements Correctly

For XML to be valid, it must not only be well-formed (i.e., all the start and end tags match); it must also have all the tags in the proper hierarchy according to the associated DTD. The tag at the top of the hierarchy is called the root element (e.g., <chapter> or <part>) and contains various child elements (e.g., <sect1>). Logical rules apply: for example, a <sect3> cannot be nested directly within a <sect1>; it must be within a <sect2>. Improper nesting results in invalid DocBook.
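As a minimal sketch of the nesting rule (the ids and titles are made up for illustration), the first fragment below is invalid because a sect3 sits directly inside a sect1, while the second wraps it in a sect2:

```xml
<!-- Invalid: sect3 is a direct child of sect1 -->
<sect1 id="nesting_invalid">
  <title>Sect1 Title Here</title>
  <sect3 id="nesting_invalid_detail">
    <title>Sect3 Title Here</title>
    <para>Text goes here...</para>
  </sect3>
</sect1>

<!-- Valid: the sect3 is nested within a sect2 -->
<sect1 id="nesting_valid">
  <title>Sect1 Title Here</title>
  <sect2 id="nesting_valid_middle">
    <title>Sect2 Title Here</title>
    <sect3 id="nesting_valid_detail">
      <title>Sect3 Title Here</title>
      <para>Text goes here...</para>
    </sect3>
  </sect2>
</sect1>
```

Both fragments are well-formed XML; only the second follows the DTD hierarchy.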

Note

Many people prefer to write DocBook locally using a standalone XML editor. Appropriate editors can safeguard you from moving, adding, or deleting elements in ways that don’t follow the DTD hierarchy. To learn how to work with Atlas projects locally, see Atlas + Git.

The terms tag and element are sometimes used interchangeably, but there is a distinction. For example, <chapter> is a tag that indicates the start of a chapter element. For the XML document to be well-formed, it must contain an end tag, </chapter>. Some tags are self-contained and stand alone as complete elements, without the need for separate end tags. For example, <xref linkend="foo"/> is self-contained. If you’re familiar with HTML, the rules are pretty much the same.

Chapters and Sections

Each .xml file should be a complete DocBook document with its own DOCTYPE declaration.

Each chapter is made up of sections. We recommend using sect1, sect2, and sect3 elements rather than generic section elements to structure your chapter.

The barebones structure of a chapter is something like this:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" 
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
<chapter id="chapter_id">
  <title>Chapter Title Here</title>
  <sect1>
    <title>Sect1 Title Here</title>
    <para>Text goes here...</para>
    <sect2>
      <title>Sect2 Title Here</title>
      <para>Text goes here...</para>
      <sect3>
        <title>Sect3 title here</title>
        <para>Text goes here...</para>
      </sect3>
    </sect2>
  </sect1>
</chapter>

Chapter contributors

For content with multiple contributors, you may want an author name to appear with each chapter. Simply add the following markup above each chapter title:

<chapterinfo>
  <author>
    <firstname>Author</firstname>
    <surname>Name</surname>
  </author>
</chapterinfo>

You can also use this markup for forewords and prefaces (just use prefaceinfo instead of chapterinfo) as well as appendixes (use appendixinfo).
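For instance, a preface with a contributor name might look roughly like this (the id and the text are placeholders, not taken from a real project):

```xml
<preface id="preface_id">
  <!-- prefaceinfo goes above the title, just as chapterinfo does in a chapter -->
  <prefaceinfo>
    <author>
      <firstname>Author</firstname>
      <surname>Name</surname>
    </author>
  </prefaceinfo>
  <title>Preface</title>
  <para>Text goes here...</para>
</preface>
```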

Parts

Parts are files that group chapters together. Note that the chapters should be in their own files. Atlas will take care of the nesting based on the order specified in the Build List.

<?xml version="1.0"?>
<!DOCTYPE part PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
<part>
<title>Part 1</title>
</part>

Block Versus Inline Elements

There are two kinds of elements:

Block

Usually presented with a paragraph break before and after them, block elements may contain character data, inline elements, and possibly other block elements. Examples include paras, lists, sidebars, tables, and block quotes.

Inline

Usually distinguished by a font change rather than obvious breaks, inline elements may contain character data and possibly other inline elements, but never block elements. Examples include cross-references, filenames, commands, and URLs.
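The distinction can be sketched with a short fragment (the command and filename are invented for illustration):

```xml
<!-- A block element (para) containing character data and inline elements -->
<para>Run <command>make</command> to build the project; the output is
written to <filename>build/output.pdf</filename>.</para>

<!-- A block element (blockquote) containing another block element (para) -->
<blockquote>
  <para>A quotation set off from the surrounding text.</para>
</blockquote>
```

The reverse nesting, such as a para inside a filename, is not allowed.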

Inline Font Markup

Here are some commonly used inline elements:

<emphasis>

Provided for use where you would traditionally use italics to emphasize a word or phrase.

<emphasis role="bold">

A general-purpose tag provided for where you would use bold type to emphasize a word or phrase.

Warning

O'Reilly house style prefers italic rather than bold for emphasis.

<phrase role="roman">

Provided for use within italicized text where you would ordinarily use italics to emphasize a word or phrase.

<literal>

Any stretch of text that must appear in constant width font.

<replaceable>

Text that should be replaced with user-supplied values or by values determined by context. Appears in constant width italic.

<subscript>

A subscript character.

<superscript>

A superscript character.

<ulink url="ulink.org"/>

Adds a hyperlink. See Hyperlinks for more details. (For fake or example URLs, use <emphasis> instead.)

<userinput>

Data entered by the user, typically at a prompt line. Use with <replaceable> if needed: <userinput><replaceable>...</replaceable></userinput>
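To see how several of these elements combine in running text, here is a small sketch (the sentence content is illustrative only):

```xml
<para>Use <emphasis>italics</emphasis> for emphasis and
<literal>constant width</literal> for code in running text. At the prompt,
type <userinput>cd <replaceable>project_directory</replaceable></userinput>.
Water is H<subscript>2</subscript>O, and 2<superscript>10</superscript>
is 1,024.</para>
```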

Cross-References

All cross-references to titled elements—figures, tables, examples, sections, chapters, parts, etc.—should be marked up using xrefs, not written in plain text. xref elements will become live hyperlinks in online versions, and they will automatically update if you move the referenced elements around while editing. There is never any need to hardcode labels (e.g., “Chapter 1”, “Figure 1”) or page numbers, as these aspects of the rendered xref are autogenerated by Atlas.

To insert an xref, follow these steps:

  1. Note the id of the element you are referencing. If the element does not have an id, you will need to add one. For the project to be valid, id attributes must be unique across the entire project, have no spaces, not contain a colon, and not start with a number. Here’s an example of a figure id:

    <figure id="figure_titles_written_with_underscores_make_nice_ids">
  2. Once you have the id, you can insert an xref element that references it via a linkend attribute, like so:

    <xref linkend="figure_titles_written_with_underscores_make_nice_ids" />
Warning

You cannot use the word “inherit” as an id. It won’t render properly.
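Putting the two steps above together, a figure and a sentence that refers to it might look like the following sketch (the id, title, and image path are hypothetical):

```xml
<para>Click Next in the dialog shown in
<xref linkend="install_dialog_fig"/> to continue.</para>

<figure id="install_dialog_fig">
  <title>The installation dialog</title>
  <mediaobject>
    <imageobject>
      <imagedata fileref="images/install_dialog.png" format="PNG"/>
    </imageobject>
  </mediaobject>
</figure>
```

Note that the sentence contains no hardcoded label or number; the xref supplies it.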

The following table shows examples of xref markup and standard rendering for various elements.

Element to be referenced              xref markup                              xref rendering
<sect1 id="keep_it_simple">           <xref linkend="keep_it_simple"/>         "Inline Macros" on page 14
<chapter id="picking_an_xml_editor">  <xref linkend="picking_an_xml_editor"/>  Chapter 17
<figure id="docbook_duck_fig">        <xref linkend="docbook_duck_fig"/>       Figure 2-3
<example id="sample_example">         <xref linkend="sample_example"/>         Example 3-5
<table id="maximum_widths">           <xref linkend="maximum_widths"/>         Table 4-1

Figures

Figures generally have a title (aka caption) and an autogenerated number. You do not need to number the figure in the XML; Atlas will autogenerate the number in the figure label and in all xrefs to it. If you'd prefer that your image not be numbered or labeled, use an informalfigure element instead, which takes no title element (a figure element requires a title to be valid).

Here’s an example of formal figure markup:

<figure id="docbook_duck_fig">
<title>The DocBook duck</title>
<mediaobject>
  <imageobject>
    <imagedata fileref="images/docbook_duck.png" format="PNG" />
  </imageobject>
</mediaobject>
</figure>

Figure 8-1 shows how the above markup renders.

Figure 8-1. The DocBook duck

Make sure to add your image files to the project (typically in an images/ directory). Then set the fileref and format attributes in the XML markup so that they match the image file names and types exactly. For example, if an image is named battery.png in the images/ directory, it should be referenced in the XML as images/battery.png, not images/Battery.png, and the format should be PNG.

Inline graphics

If you need to add an inline graphic (e.g., a small icon that is part of the text), use an inlinemediaobject:

<inlinemediaobject>
   <imageobject>
      <imagedata fileref="images/icons_0501.png" width="0.12in"/>
   </imageobject>
</inlinemediaobject>

A width is required for an inlinemediaobject so that the processor knows how much space to allocate for it. The value 0.12in works well. You can also find the width of the graphic using a web browser, Adobe Acrobat, or any other program that shows you an image’s dimensions.
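In context, an inline graphic sits inside a paragraph, as in this sketch (the image filename is made up; the width is the value suggested above):

```xml
<para>Click the settings icon
<inlinemediaobject>
  <imageobject>
    <imagedata fileref="images/settings_icon.png" width="0.12in"/>
  </imageobject>
</inlinemediaobject>
to open the preferences panel.</para>
```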

Alt text for images

To improve the accessibility of your ebook files for visually impaired readers, please consider adding alt text to images.

By default, for figure elements, Atlas will use the contents of the title element as the alt text. However, you can supply your own custom alt text for a figure by adding a textobject element as a child of the figure’s mediaobject, and enclosing the alt text in a phrase element. Here’s an example of the markup to use:

<figure id="figure_with_custom_alt_text">
  <title>Figure image with custom alt text</title>

  <mediaobject>
    <imageobject>
      <imagedata fileref="images/universal_design_for_web_applications_cover.png"/>
    </imageobject>

    <textobject>
      <phrase>Universal Design for Web Applications Cover</phrase>
    </textobject>
  </mediaobject>
</figure>

For images in your book that do not have title elements (e.g., informalfigures and inlinemediaobjects), we highly encourage you to supply your own custom alt text in textobjects. (By default, Atlas uses the text “image with no caption” as the alt text for informalfigures and leaves alt attributes empty for inlinemediaobjects.) Example 8-1 and Example 8-2 show the markup for an informalfigure and an inlinemediaobject with custom alt text.

Example 8-1. informalfigure with textobject
<informalfigure id="informalfigure_with_custom_alt_text">
  <mediaobject>
    <imageobject>
      <imagedata fileref="images/universal_design_for_web_applications_cover.png"
                    width="2.4in"/>
    </imageobject>

    <textobject>
      <phrase>Universal Design for Web Applications Cover</phrase>
    </textobject>
 </mediaobject>
</informalfigure>
Example 8-2. inlinemediaobject with textobject
<inlinemediaobject>
  <imageobject>
    <imagedata fileref="images/oreilly_logo.png" width="0.12in"/>
  </imageobject>
  <textobject>
    <phrase>O’Reilly Media, Inc. logo</phrase>
  </textobject>
</inlinemediaobject>
Tip

For some tips on writing good alt text, O’Reilly’s Universal Design for Web Applications is a great resource. In particular, see the section, “Keys to Writing Good Text Alternatives.”

Tables

If your table requires a description, you expect to refer to it later elsewhere in the text, or it’s especially complex, you probably want to use a table element. Otherwise, you can consider using an informaltable.

Formal tables

Here’s the markup for a table with a header:

<table id="example_table">
<title>Example formal table</title>
  <tgroup cols="2">
    <thead>
      <row>
        <entry>Heading1</entry>
        <entry>Heading2</entry>
      </row>
    </thead>
    <tbody>
      <row>
        <entry>Text1</entry>
        <entry>Text2</entry>
      </row>
      <row>
        <entry>Text3</entry>
        <entry>Text4</entry>
      </row>
    </tbody>
  </tgroup>
</table>

Table 8-2 shows how it renders.

Table 8-2. Example formal table
Heading1  Heading2
Text1     Text2
Text3     Text4

Note that table headers are optional. Tables can get much more complex than this example; see http://www.docbook.org/tdg/en/html/table.html for more details.
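As one example of a more complex table, standard DocBook (CALS) markup lets a cell span columns by naming the columns with colspec and using namest and nameend on the entry. This is a sketch of the general DocBook technique with a made-up id and placeholder text; how Atlas renders spans is not covered here, so check a test build:

```xml
<table id="spanning_table_example">
  <title>Example table with a spanning header</title>
  <tgroup cols="3">
    <colspec colname="c1"/>
    <colspec colname="c2"/>
    <colspec colname="c3"/>
    <thead>
      <row>
        <entry>Heading1</entry>
        <!-- This header cell spans columns c2 through c3 -->
        <entry namest="c2" nameend="c3">Heading2</entry>
      </row>
    </thead>
    <tbody>
      <row>
        <entry>Text1</entry>
        <entry>Text2</entry>
        <entry>Text3</entry>
      </row>
    </tbody>
  </tgroup>
</table>
```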

Informal tables

The markup of an informaltable is similar to that of a table, but it does not have a title or need an id. Here’s an example.

<informaltable>
  <tgroup cols="2">
    <tbody>
      <row>
        <entry>Text1</entry>
        <entry>Text2</entry>
      </row>
      <row>
        <entry>Text3</entry>
        <entry>Text4</entry>
      </row>
    </tbody>
  </tgroup>
</informaltable>
Text1 Text2
Text3 Text4

This particular informal table doesn’t have a header (no thead), but it would be valid to add one.

Lists

There are four common types of lists. Here’s the markup and an example of each.

Simple lists

Markup:

<simplelist>
  <member>This is a list of several short items.</member>
  <member>Usually one or a few words each.</member>
</simplelist>

Rendering:

  • This is a list of several short items.
  • Usually one or a few words each.

Bulleted (aka itemized) lists

Markup:

<itemizedlist>
  <listitem><para>This is a list.</para></listitem>
  <listitem><para>With bullets.</para></listitem>
</itemizedlist>

Rendering:

  • This is a list.

  • With bullets.

Numbered (aka ordered) lists

Markup:

<orderedlist>
  <listitem><para>This list uses numbers.</para></listitem>
  <listitem><para>Instead of bullets.</para></listitem>
</orderedlist>

Rendering:

  1. This list uses numbers.

  2. Instead of bullets.

To continue the numbering of an orderedlist from a previous list, use a continuation attribute with a value of continues:

<orderedlist continuation="continues">

The default is continuation="restarts". This causes the numbering to begin at 1.

If an orderedlist has other lists nested within it, <orderedlist continuation="continues"> may cause the continued list to start at the wrong number. In these cases, you can add an override attribute to the incorrectly numbered listitem, set to the number at which you’d like it to start. Your continued orderedlist will then begin at that number.
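A rough sketch of a continued list follows; the list text is invented, and the second list shows the override attribute on its first listitem as the fallback for cases where the numbering comes out wrong:

```xml
<orderedlist>
  <listitem><para>Install the software.</para></listitem>
  <listitem><para>Open the project.</para></listitem>
</orderedlist>

<para>An intervening paragraph that interrupts the list.</para>

<!-- Intended to pick up where the previous list left off.
     override="3" forces the first item to be numbered 3; leave it out
     unless the continued numbering is incorrect. -->
<orderedlist continuation="continues">
  <listitem override="3"><para>Edit the files.</para></listitem>
  <listitem><para>Build the project.</para></listitem>
</orderedlist>
```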

Variable lists

A variable list is made up of pairs of items.

Markup:

<variablelist>
  <varlistentry>
    <term>The first part could be a term</term>
    <listitem><para>Followed by a definition.</para></listitem>
  </varlistentry>
  <varlistentry>
    <term>Or a name</term>
    <listitem><para>Followed by a description. Etc.</para></listitem>
  </varlistentry>
</variablelist>

Rendering:

The first part could be a term

Followed by a definition.

Or a name

Followed by a description. Etc.

By default, the list term will render in italics. To remove the italics, add a role attribute of plain:

<term role="plain">Variable list term</term>

Notes, Tips, Warnings, and Cautions

Admonitions are great for adding supplemental information or warnings to the reader.

The markup for a note looks like this:

<note>
  <para>Here's some text inside a note.</para>
</note>

It will render something like this:

Note

Here's some text inside a note.

You can create a tip like this:

<tip>
  <para>If you tie your shoelaces, you're less likely to trip and fall down.</para>
</tip>

Which looks something like this:

Tip

If you tie your shoelaces, you're less likely to trip and fall down.

You can also use a warning:

<warning>
  <para>Be warned of something important!</para>
</warning>

Which looks something like this:

Warning

Be warned of something important!

Notes, tips, and warnings may contain paras, code blocks, and lists. They should not contain figures, tables, or examples.
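The caution element is a standard DocBook admonition that follows the same pattern. The sketch below assumes Atlas handles it like the other admonitions (the source text here only shows notes, tips, and warnings) and uses placeholder text to show an admonition that contains a list:

```xml
<caution>
  <para>Before making large structural changes:</para>
  <itemizedlist>
    <listitem><para>Save a copy of your files.</para></listitem>
    <listitem><para>Check that all ids are still unique.</para></listitem>
  </itemizedlist>
</caution>
```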

Sidebars

Sidebar markup looks like this:

<sidebar>
  <title>When to Use a Sidebar?</title>
  <para>If a note, tip, or warning covers a lot of information or includes complex elements, consider using a sidebar instead. A sidebar can be much longer—even spanning several pages—and should have a title.</para>
</sidebar>

And renders something like this:

Footnotes

A footnote generates a superscript number wherever it is placed in the text, and the body of the footnote appears at the bottom of the page.1

<footnote><para>Like this.</para></footnote>

Table footnotes are lettered and appear directly after the table (not at the bottom of the page).

Note

Footnotes should generally be inserted after punctuation.
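Here is a short sketch of both placements: a footnote inserted directly after the period in running text, and a footnote inside a table entry (the table row uses placeholder text):

```xml
<para>DocBook is maintained by OASIS.<footnote><para>The Organization for
the Advancement of Structured Information Standards.</para></footnote> It
is widely used for technical books.</para>

<!-- Inside a table, the footnote goes in the entry it annotates -->
<row>
  <entry>Text1<footnote><para>A table footnote.</para></footnote></entry>
  <entry>Text2</entry>
</row>
```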

Indexing

Here’s the basic index entry markup:

<indexterm><primary>index entry syntax, level 1</primary></indexterm>

Note

During production, a freelance indexer will be hired to index your book, so you may not need to use this markup.

Secondary entry (subentry) markup:

<indexterm>
    <primary>index entry syntax</primary>
    <secondary>for a subentry</secondary>
</indexterm>

Tertiary entry (sub-subentry) markup:

<indexterm>
    <primary>index entry syntax</primary>
    <secondary>for a subentry</secondary>
    <tertiary>with a subentry</tertiary>
</indexterm>
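In practice, an indexterm is placed inline next to the text it indexes. A minimal sketch, with an illustrative sentence and entry:

```xml
<para>Files ending in .xml are converted to HTMLBook when
building.<indexterm><primary>DocBook</primary><secondary>conversion to HTMLBook</secondary></indexterm>
The conversion happens automatically.</para>
```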

Markup for an index entry that covers a range:

This book is full of geeky text with DocBook XML markup, which starts here:
<indexterm class="startofrange" id="geekytext">
<primary>geeky DocBook XML text</primary></indexterm>blah blah blah Ajax
blah blah blah Ruby on Rails
...
and ends here<indexterm class="endofrange" startref="geekytext"/>.
Note

The closing indexterm tag does not contain a primary or secondary entry, just a startref attribute that references the starting indexterm entry. Do not place the closing tag on its own line.

Code

Code blocks can be defined in DocBook with a programlisting element. Here’s a very simple one:

<programlisting>Hello World</programlisting>

If you have larger blocks of code that you want to give a title, a number, and a cross-reference, use an example element:

<example id="sample_example">
<title>Sample example</title>
  <programlisting>Hello World</programlisting>
</example>

programlisting is a verbatim environment, which means whitespace is preserved in rendered versions. You must escape the characters that have special meaning in XML, namely < and & (and, by convention, >); these come up quite a bit in code. The simplest way to do this is to replace each character with its entity: &lt; for <, &gt; for >, and &amp; for &.
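For example, a snippet of C that uses comparison and logical operators could be escaped like this (the code itself is only illustrative):

```xml
<programlisting>if (a &lt; b &amp;&amp; b &gt; c) {
    printf("&lt;ok&gt;\n");
}</programlisting>
```

In the rendered output, the first line reads if (a < b && b > c) { and the string reads "<ok>\n".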

External code files

If you want to manage your code in files separate from the manuscript, you can use <xi:include> tags to point to your code. You must include a parse="text" attribute so that the parser inserts the file as plain text rather than trying to interpret it as XML:

<programlisting>
<xi:include 
  xmlns:xi="http://www.w3.org/2001/XInclude" 
  parse="text" href="hello.c" />
</programlisting>
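An included file can also sit inside a titled example so that it gets a number and can be cross-referenced. In this sketch the id, title, and the code/hello.c path are hypothetical, and the tags are kept tight against the include to avoid stray blank lines in the listing:

```xml
<example id="hello_world_c_example">
  <title>Hello World in C</title>
  <programlisting><xi:include
    xmlns:xi="http://www.w3.org/2001/XInclude"
    parse="text" href="code/hello.c"/></programlisting>
</example>
```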

Caveats

Although inline markup and newlines within verbatim environments2 are valid DocBook, we ask that you follow these guidelines to prevent rendering problems downstream.

Tabs

Please don’t use tabs in code blocks, as tabs don’t necessarily translate to the same amount of space on different systems. To align or indent within your code, use spaces.

Inline markup on multiple lines

When using inline markup on multiple lines of code (e.g., <emphasis role="bold">), please close the tag at the end of each line and open a new one on the next line. For example, instead of this:

<programlisting><emphasis role="bold">GLuint m_gridTexture;
IResourceManager* m_resourceManager;
</emphasis>};</programlisting>

do this:

<programlisting><emphasis role="bold">GLuint m_gridTexture;</emphasis>
<emphasis role="bold">IResourceManager* m_resourceManager;</emphasis>
};</programlisting>

Failing to do this can cause delays in Production.

Newlines

Be careful not to add newlines to the beginning or end of code blocks. Because all line breaks are preserved in verbatim blocks, newlines can result in excess whitespace. For example, the following listing will render with unwanted blank lines at the top and bottom due to the line breaks after the opening <programlisting> tag and before the closing </programlisting> tag:

<programlisting>
CLLocationManager *locationManager = [[CLLocationManager alloc] init];
locationManager.delegate = self;
    [locationManager startUpdatingLocation];
} else {
    NSLog(@"Location services not enabled.");
}
</programlisting>

Do this instead:

<programlisting>CLLocationManager *locationManager = [[CLLocationManager alloc] init];
locationManager.delegate = self;
    [locationManager startUpdatingLocation];
} else {
    NSLog(@"Location services not enabled.");
}</programlisting>

Callouts

If you want to have cross-references to specific lines of code, you can use callouts. Just put a co element at the end of each line you want to reference—these will generate callout markers. Then create a calloutlist element after the code block. This list contains callout items that discuss or explain each referenced line.

Here is an example of the markup. For readability, the code being annotated is shown unescaped here; the literal markup, with the code's special characters escaped as entities, appears later in this section:

<programlisting> <co id="opening_tag_co"
          linkends="opening_tag" />
<xi:include <co id="xinclude_co" linkends="xinclude" />
  xmlns:xi="http://www.w3.org/2001/XInclude"
  parse="text" href="hello.c" />
</programlisting> <co id="closing_tag_co" linkends="closing_tag" />

<calloutlist>
  <callout arearefs="opening_tag_co" id="opening_tag">
    <para>The opening tag for a <literal>programlisting</literal> element.</para>
  </callout>

  <callout arearefs="xinclude_co" id="xinclude">
    <para>An <literal>XInclude</literal>.</para>
  </callout>

  <callout arearefs="closing_tag_co" id="closing_tag">
    <para>The closing tag for a <literal>programlisting</literal> element.</para>
  </callout>
</calloutlist>

And here's how it will render:

<programlisting> 1
<xi:include 2 
  xmlns:xi="http://www.w3.org/2001/XInclude" 
  parse="text" href="hello.c" />
</programlisting> 3
1

The opening tag for a programlisting element.

2

An XInclude.

3

The closing tag for a programlisting element.

Each co element in the code block includes an optional linkends attribute that points to the callout elements that refer to it, forming a link between the marker and the callout. Conversely, each callout element requires an arearefs attribute that points to co elements, forming a link between the callout and the marker. The markers will be rendered as clickable bidirectional cross-references if you use this markup.

The literal markup for the above, with the XML special characters in the annotated code escaped as entities, looks like this:

<programlisting>&lt;programlisting&gt; <co id="opening_tag_co" 
  linkends="opening_tag"/>
&lt;xi:include <co id="xinclude_co" linkends="xinclude"/> 
  xmlns:xi="http://www.w3.org/2001/XInclude" 
  parse="text" href="hello.c" /&gt;
&lt;/programlisting&gt; <co id="closing_tag_co" linkends="closing_tag"/>
</programlisting>

<calloutlist>
<callout arearefs="opening_tag_co" id="opening_tag">
<para>The opening tag for a <literal>programlisting</literal>
element.</para>
</callout>

<callout arearefs="xinclude_co" id="xinclude">
<para>An <literal>XInclude</literal>.</para>
</callout>

<callout arearefs="closing_tag_co" id="closing_tag">
<para>The closing tag for a <literal>programlisting</literal>
element.</para>
</callout>
</calloutlist>

For more information on DocBook callout markup, see http://www.sagehill.net/docbookxsl/AnnotateListing.html#Callouts. Please note that the Atlas toolchain does not support areaspec/area/areaset elements to specify callout regions.
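Because the example above annotates XML with XML, it can be hard to read. Here is a simpler sketch of the same technique applied to a few lines of Python (the ids and the code are made up for illustration):

```xml
<programlisting>import sys <co id="import_co" linkends="import_callout"/>

name = sys.argv[1] <co id="argv_co" linkends="argv_callout"/>
print("Hello, " + name)</programlisting>

<calloutlist>
  <callout arearefs="import_co" id="import_callout">
    <para>Imports the <literal>sys</literal> module.</para>
  </callout>

  <callout arearefs="argv_co" id="argv_callout">
    <para>Reads the first command-line argument.</para>
  </callout>
</calloutlist>
```

Each co id is matched by an arearefs value, and each callout id is matched by a linkends value, which is what makes the markers bidirectional.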

Note

Although DocBook has markup for adding line numbers and annotations directly to code, the Atlas toolchain doesn’t support these options. Line numbers don’t allow for good cross-referencing and can potentially cause problems if code is revised and line numbers change. If you want to cross-reference code blocks by number, we recommend using callouts instead; they are autonumbered and will adjust automatically if you shift code around.

Syntax highlighting

The Atlas toolchain supports syntax highlighting via Pygments, and we recommend that all authors use it when possible. All you need to do is add a language attribute to each code block that should include syntax highlighting, and specify the language of the code. For example:

<programlisting language="java">int radius = 40;
float x = 110;
float speed = 0.5f;
int direction = 1;</programlisting>

Here’s how it renders:

int radius = 40;
float x = 110;
float speed = 0.5f;
int direction = 1;

Pygments supports a wide variety of languages that can be used in the language attribute; see the full list at http://pygments.org/docs/lexers. Ebook readers that do not have color screens will still display the highlighting, but in more subtle shades of gray.

Note

The color scheme is consistent across books.
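The language attribute works the same way on a programlisting inside an example. A minimal sketch, assuming the Pygments lexer name python and a made-up id and title:

```xml
<example id="fibonacci_python_example">
  <title>Computing Fibonacci numbers in Python</title>
  <programlisting language="python">def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a</programlisting>
</example>
```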

Unicode for Special Characters

For nonstandard characters, use Unicode. The following table provides the values for some common characters; for all others, check out the Unicode Character Search.

To add a Unicode character directly to XML in a text editor, use the numeric character reference &#xCODEPOINT;, where CODEPOINT is the four-digit hexadecimal number after U+ (e.g., for U+20AC, enter &#x20AC;). Letters that are part of the codepoint may be entered in either upper- or lowercase (i.e., &#x03bb; is the same as &#x03BB;), but the x between the # symbol and the codepoint must be lowercase.

Character                                     Unicode value (hexadecimal codepoint)
— (Em Dash)                                   U+2014
– (En Dash)                                   U+2013
“ (Curly Left Double Quotation Mark)          U+201C
” (Curly Right Double Quotation Mark)         U+201D
‘ (Curly Left Single Quotation Mark)          U+2018
’ (Curly Right Single Quotation Mark)         U+2019
× (Multiplication Sign)                       U+00D7
→ (Rightwards Arrow, used as menu delimiter)  U+2192
€ (Euro Currency Symbol)                      U+20AC
✓ (Check Mark)                                U+2713
✗ (Ballot X)                                  U+2717
⌘ (Place Of Interest Sign)                    U+2318
↵ (Carriage Return Arrow)                     U+21B5
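Using codepoints from the table, character references in running text might look like this sketch (the sentences are illustrative):

```xml
<para>Press &#x2318;-S to save, or choose File &#x2192; Save.</para>

<para>The image is 640 &#x00D7; 480 pixels.</para>
```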

Comments and Remarks

You have two options for adding comments to your manuscript: standard XML comments (<!--foo-->) and remark elements.

XML comments are useful for commenting out large blocks of text—for example, text that is under review, or text that you don’t currently want to include in your manuscript. In the following example, the entire paragraph is commented out:

<!-- O’Reilly’s mission statement. 
<para>O’Reilly Media spreads the knowledge of innovators through its books, 
online services, magazines, research, and conferences. Since 1978, O’Reilly 
has been a chronicler and catalyst of leading-edge development, homing in 
on the technology trends that really matter and galvanizing their adoption 
by amplifying “faint signals” from the alpha geeks who are creating the future. 
An active participant in the technology community, the company has a long 
history of advocacy, meme-making, and evangelism.</para> -->

remark elements are better for directing specific comments to other collaborators. For example:

<remark>PRODUCTION: Please stet grammatical errors in the following</remark>

<para>I can haz cheezburger, plz?</para>

By default, remarks are not displayed in your PDF builds.

Quotes and Epigraphs

To add a quote anywhere in your book, use the blockquote element. Since it’ll be set apart from the text, there’s no need to put quotation marks around it. Here’s some example markup—a quote attributed to Benjamin Disraeli (by Wilfred Meynell, according to Frank Muir):

<blockquote>
  <attribution>Wilfred Meynell</attribution>

  <para>Many thanks; I shall lose no time in reading it.</para>
</blockquote>

Here’s how it renders:

Many thanks; I shall lose no time in reading it.

Wilfred Meynell

If you want to add a quote at the beginning of your chapters (or sections, parts, etc.), use the epigraph element. Here’s some example markup:

<epigraph>
  <attribution>Robert Benchley</attribution>

  <para>There are two kinds of people in the world: those who believe 
  there are two kinds of people in the world, and those who don't.</para>
</epigraph>

And here’s how it renders:

There are two kinds of people in the world: those who believe there are two kinds of people in the world, and those who don’t.

Robert Benchley
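To show where an epigraph sits at the start of a chapter, here is a sketch that places it directly after the chapter title (the chapter id and title are placeholders):

```xml
<chapter id="chapter_with_epigraph">
  <title>Chapter Title Here</title>
  <epigraph>
    <attribution>Robert Benchley</attribution>
    <para>There are two kinds of people in the world: those who believe
    there are two kinds of people in the world, and those who don't.</para>
  </epigraph>
  <para>Text goes here...</para>
</chapter>
```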

1. Like this.

2. E.g., programlistings, used for code blocks where line breaks and spaces need to be preserved.