2.4. Entities

An important SGML feature that we use extensively in our documentation is entities. Entity names appear in the text in this format: &word;. For example, a line from the Cascade Connect manual looks like this:

The &cserve; program sends and receives ...

The &cserve; entity name is replaced at processing time by the text: Connect Server.

An entity can be a symbol, one or more words, or any chunk of SGML code, such as a paragraph, section, index, or even a whole book. As soon as the processor comes across an entity name, it resolves its definition and inserts the entire SGML code as if it were part of the original document. This is the core of single-source documentation. We can put an entity in several places to produce identical output in different contexts.
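
As a minimal sketch of the single-source idea, assume &cserve; is declared as an internal entity with the replacement text given above. The declaration and the second paragraph are illustrative assumptions, not copied from the Cascade Connect source; in a real book the declaration would sit with the other entity declarations rather than next to the text.

```sgml
<!-- Assumed declaration, shown here only for illustration. -->
<!ENTITY cserve "Connect Server">

<!-- The same entity name used in two places in the text. -->
<para>The &cserve; program sends and receives ...</para>
<para>Start the &cserve; before any client programs.</para>
```

At processing time both paragraphs would read "Connect Server" in place of &cserve;, so a change to the product name would only have to be made in the declaration.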

For more information about entities, see Chapter 2 of DocBook: The Definitive Guide.

2.4.1. Entity declarations

In general, we use three different kinds of entity declarations:

<!ENTITY entityname   "string">
<!ENTITY entityname   SYSTEM "filename">
<!ENTITY % entityname SYSTEM "filename"> %entityname;

The first kind is called an internal entity, because its definition is written directly into the declaration. We use it to declare short strings, such as product names. The second kind is called an external entity, because it refers to a file other than the document where the declaration is made. The word "SYSTEM" tells the SGML validator that the file can be found on our local system. The third kind, which uses a percent sign (%), is called a parameter entity. It has several purposes in SGML, but we mainly use it as an external entity that includes other entity files.

Entity declarations are written between square brackets ( [ and ] ) in the document type declaration, the first part of the document. For example, the SGML source code for this document looks like this:

<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V4.1//EN" [

<!ENTITY % commonentities SYSTEM "../../i/common/entities.sgml"> %commonentities;

<!entity prepdocmanual    SYSTEM "../../i/prepdoc/pd_man.sgml">
<!entity index            SYSTEM "indexDocumentationManual.sgml">



There are three entity declarations here: one parameter entity and two external entities. All of them are SYSTEM entities.

The %commonentities parameter entity calls the doc/i/common/entities.sgml file, which holds entities for text that is common to all of our books (see the Entity declaration files appendix).

The two external entities point to SGML files to be included at the point where they are called in the text. The body of the manual, which is the entire contents of doc/i/prepdoc/pd_man.sgml, gets inserted at processing time where the &prepdocmanual; string appears. The index doesn't get inserted because it is commented out.
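
A minimal sketch of how a one-file book could call the entities declared above is shown here. It assumes that the declarations are closed with ]> and that the index call is wrapped in an SGML comment; the exact layout is an assumption, not a copy of the actual source.

```sgml
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V4.1//EN" [
<!ENTITY % commonentities SYSTEM "../../i/common/entities.sgml"> %commonentities;
<!ENTITY prepdocmanual    SYSTEM "../../i/prepdoc/pd_man.sgml">
<!ENTITY index            SYSTEM "indexDocumentationManual.sgml">
]>

<!-- The whole manual is pulled in here at processing time. -->
&prepdocmanual;

<!-- Wrapped in a comment, so the index file is not inserted. -->
<!-- &index; -->
```

Removing the comment markers around &index; would be enough to bring the index file back into the output.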

The entities for this book are relatively simple because it is a one-file book. Our multi-file books have a more complex structure, because each of their many files is declared as an entity. The entity declarations for each of those books are found in the book's entities.sgml file.

2.4.2. Types of entities in Cogent documents

Symbols

We don't have to declare symbols because they are declared as PUBLIC entities in the DocBook distribution, like this:

<!ENTITY % ISOnum PUBLIC "ISO 8879:1986//ENTITIES Numeric and Special Graphic//EN"> %ISOnum;

We mainly use symbol entities in Cogent documents to escape SGML markup symbols. Thus, to distinguish the mathematical symbols < and > from the beginning and end of an SGML markup tag in the text, the strings &lt; and &gt; are used. They will be rendered as the symbols < and > in the text output. Also, since the & symbol marks the start of an SGML entity, whenever we need a literal & symbol in the text, we have to use the string &amp;. Otherwise, the processor starts looking for an entity that doesn't exist.
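
As a short illustration, the paragraph below writes all three markup characters as entities. The sentence itself is made up for this example.

```sgml
<!-- Source: markup characters written as symbol entities. -->
<para>The test is true if a &lt; b &amp;&amp; b &gt; c.</para>
```

In the output, the paragraph reads: The test is true if a < b && b > c.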


You shouldn't use the < or > symbols to tag things as replaceable. Use the tags <replaceable> or <parameter>. For example, don't do this in the text of a document, not even within a <screen> tag:

myfunct (<parm1> <parm2>)

Instead, mark up for meaning, like this:

<function>myfunct</function> (<parameter>parm1</parameter> <parameter>parm2</parameter>)

The stylesheets will automatically render the "parm1" and "parm2" text in italics, like this:

myfunct (parm1 parm2)

Product names

To maintain consistency and allow for future changes in naming, we have defined many of our products as entities. For example, the word "Gamma" is written as &gamma; in an SGML source document. The Gamma prompt is written as &gamma.prompt;, and produces output like this: Gamma>. You can find a complete list of these in the common entity declaration file, doc/i/common/entities.sgml.
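
The sketch below shows how the two Gamma entities could be declared and used. The declarations are assumptions for illustration and may differ from the actual ones in the common entity file; the sample sentence is also made up.

```sgml
<!-- Assumed declarations, for illustration only. -->
<!ENTITY gamma        "Gamma">
<!ENTITY gamma.prompt "Gamma&gt;">

<!-- Usage in the text of a document. -->
<para>Start &gamma; and type the expression at the &gamma.prompt; prompt.</para>
```

With declarations like these, renaming the product would mean editing one file instead of searching every book for the old name.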


Please be sure to use the entity name and not the product name whenever there is an entity for a product name.

SGML files

When substituting a paragraph, section, or whole book, it's obviously not practical to put the whole text in the entity declaration, so the entity name refers to a file instead. For example, near the beginning of the Gamma wrapper (doc/i/gamma/wrapper.sgml) you find these two entity names:

&frontgam;
&titlestandard;

The first entity (&frontgam;) is declared in doc/i/gamma/entities.sgml. It calls the file doc/i/gamma/frontgam.sgml, which contains the Gamma manual front matter, i.e., the Gamma-specific information about the book, such as the abstract and edition number. The second entity (&titlestandard;) is declared in doc/i/common/entities.sgml. It refers to the file doc/i/common/titlestandard.sgml, which has all the standard material contained in every book, like the Cogent graphic, copyright information, and so on.

Wrappers and source text

Most of our books are set up to be used as SGML source text, which is included in various wrappers when generating output. For example, the Cogent Documentation bookset output file (doc/o/cogent-set/main.sgml) is a wrapper that contains entities that refer to most of our books. This is explained in more detail in the Bookset section of the Special Organization chapter.

Include/Ignore entities

The DocBook SGML DTD lets you make special entities whose contents can be included or ignored at processing time. (This code is ignored when SGML is converted to XML.) We have set up four of these entities:

Entity          Use                                                         Current Setting
include.gamma   Gamma-specific material.                                    INCLUDE
include.lisp    Lisp-specific material.                                     IGNORE
include.notes   Notes we don't want to see in the output. This is not in    IGNORE
                use much, as we use SGML comments (<!-- and -->) to do
                the same thing.
html            Text to ignore when processing HTML. This is necessary to   IGNORE
                compensate for a QNX Helpviewer bug that can't seem to
                handle an M-dash.

So far, we have been generating all output with these entities set at the current settings, but they can be changed if necessary.
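
As a rough sketch of how such an entity is typically used, assume the settings are parameter entities that control SGML marked sections, which is the usual SGML mechanism for this. The declaration and the paragraph are illustrative and may not match the actual Cogent files.

```sgml
<!-- In the declarations: the current setting for Lisp-specific material. -->
<!ENTITY % include.lisp "IGNORE">

<!-- In the text: a marked section controlled by that entity. -->
<![ %include.lisp; [
<para>This paragraph applies only to the Lisp version.</para>
]]>
```

Under these assumptions, changing "IGNORE" to "INCLUDE" in the declaration would bring every section marked with %include.lisp; into the output without touching the text itself.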