Modifying a NISO STS Tag Set
How to Start Using This Tag Library
If you want to learn about this Tag Set in order to write a new tag set (based on this Tag Set) or to modify the current Tag Set, follow the steps below.
- Skim two early chapters of this Tag Library, the How To Use (Read Me First) and the Tag Library General Introduction.
- If you do not know the symbols used in the Document Hierarchy diagrams, then read the “Key to the Near & Far® Diagrams”.
- Use the Document Hierarchy diagrams to give you a good sense of the top-level elements and their contents, that is, how a standards document is structured.
- Pick an element from one of the diagrams. Read the description of that element in the Element Section to find the full name of the element, its description, usage notes, content allowed, an attributes list, and at least one tagged example. (Look up one of the element’s attributes in the Attribute Section to find the attribute’s full name, usage notes, and potential values.)
Scan Representative Tag Set Modules
New Tag Sets are created by writing, at a minimum, a new DTD module and new customization modules, so you might want to read the existing modules in the following order:
- The DTD module that you wish to modify or emulate (note that most of each DTD module is taken up with calling in many other modules using external parameter entities):
- NISO-STS-interchange-1-mathml2.dtd,
- NISO-STS-interchange-1-mathml3.dtd,
- NISO-STS-extended-1-mathml2.dtd, or
- NISO-STS-extended-1-mathml3.dtd
- The modules that declare the modules the DTDs call in:
- The NISO STS module that declares all the NISO-STS-specific customization and element modules (NISO-STS-modules);
- The JATS module (used in NISO STS) that declares all the JATS Publishing-Tag-Set-specific customization and element modules (JATS-journalpubcustom-modules); and
- The module that declares all the other modules in the Suite (JATS-modules).
- Customization is accomplished using parameter entities. For each Tag Set, there are four potential levels of customization:
- those for the particular DTD, which override
- those for the base NISO STS Tag Set, which override
- those for the JATS Publishing Tag Set, which override
- those for the JATS base suite, which NISO STS uses unchanged.
This mechanism works because, in XML DTD syntax, the first parameter entity encountered that has name “YYY” overrides all other parameter entities with name “YYY”. So a Tag Set may set a parameter entity, thereby overriding anything set by the NISO STS Suite as a whole or by any of the JATS Suite modules that NISO STS calls in.
Each new Tag Set created has the potential for five new modules, only one of which is required:
- The module that defines the Tag Set as a DTD (required);
- A module that declares any new modules needed by the Tag Set;
- A module that declares any class overrides used;
- A module that declares any mix overrides used; and
- A module that declares any content model, attributes list, or other parameter entity overrides used.
- Familiarize yourself with the relationship between these “customization” modules and the “default” modules for classes, mixes, and models. In your DTD directory, scan the NISO STS classes module (NISO-STS-classes) next; it overrides the JATS Publishing DTD classes (JATS-journalpubcustom-classes), which in turn override the JATS base default classes (JATS-default-classes).
- As a last step, read any one of the many class-element-defining modules (for example, the “JATS-list” module, which defines all the list types) to see how the element declarations work.
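As a minimal sketch of the first-declaration-wins rule, suppose a Tag-Set-specific class override module is read before the default classes module. The override value "list" and the element example-wrapper are illustrative assumptions, not declarations taken from the Suite.

```xml
<!-- Tag-Set-specific class override module, called first.
     The value "list" is an illustrative assumption. -->
<!ENTITY % list.class "list">

<!-- Default classes module, called later. Because list.class
     already has a declaration, this one is ignored. -->
<!ENTITY % list.class "def-list | list">

<!-- Hypothetical element, for illustration only: its model
     resolves to (list)*, not (def-list | list)*. -->
<!ELEMENT example-wrapper (%list.class;)*>
```

This is why override modules must be called in before the modules whose values they replace: a declaration that arrives second has no effect.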
How To Make New Tag Sets
NISO STS was created to allow a multiplicity of tag sets to be created, based on the needs of the intended use. Many users will subset a tag set, to remove processing elements and alternatives they do not use from their operational tag set. Others will make a superset, adding organization-specific metadata and processing information that is not part of the base Suite. Some users will do both, restricting the options and adding their own metadata. Each new NISO STS Tag Set can override parameter entity values in the base tag sets and omit modules or add new Tag-Set-specific modules.
Guidance for Making a New Tag Set
Best Practice: When making a new NISO STS tag set, produce a JATS-Compatible tag set, as described in the JATS Compatibility Meta-Model (published as “JATS Compatibility Model Description Draft 0.7” at http://www.niso.org/apps/group_public/download.php/16764/JATS-Compatibility-Model-v0-7.pdf). The Meta-Model document provides guidance on how to create JATS-based tag sets that will interoperate in expected ways with other JATS-based documents in databases, display systems, and similar applications. Read this document first, before making a new NISO STS tag set. The most fundamental principle is to follow the existing semantics. In other words:
- If NISO STS/JATS has the element you need, use it.
- If NISO STS/JATS does not have it, create your own new element with a new name.
- Do not force new meaning into old elements.
Parameter Entities Customize and Change
Parameter entities are the major mechanism for customizing this Tag Set or creating a new tag set from the modules in the Suite. Individual tag sets will be constructed by:
- Establishing new content model, element combinations, attribute list, or attribute value combinations using parameter entities in one of the Tag-Set-specific customizing modules, and
- Using the DTD modules to choose appropriate modules from the Suite that declare the kinds of elements needed. For example, if a base tag set contained 6 kinds of lists and 2 table models, a sub-setted tag set might use its Customize Classes Module to redefine the List Class to name only 3 list types and redefine the Table Class to allow only one table element.
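The subsetting example above might look like the following in a Customize Classes Module. This is only a sketch: the base tag set with six list types and two table models is the hypothetical one from the example, so every element name below, and the table.class entity name, is an assumption rather than a declaration from the NISO STS or JATS modules.

```xml
<!-- Hypothetical base classes (six list types, two table models):
       list.class  = "bullet-list | check-list | def-list |
                      gloss-list | ordered-list | simple-list"
       table.class = "table | alt-table"                          -->

<!-- Subset overrides, placed in the Customize Classes Module so
     that they are read before the base declarations. -->
<!ENTITY % list.class  "bullet-list | def-list | ordered-list">
<!ENTITY % table.class "table">
```

The modules that declare the dropped elements can then be left out of the DTD, provided no other content model still names those elements.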
The typical modules that create a customized tag set are: the DTD itself (required), a module to declare the modules the DTD needs (optional, the new tag set may not define any new modules), as many override modules as necessary (class, mix, model), and any new element definition modules required. As an example, modules for a new tag set might be:
- DTD — The DTD module (.dtd) for the new tag set base DTD (At a minimum, this module declares the top-level element (such as guide, book, or report) and any other structural elements unique to the new document type.)
- Tag-Set-specific Module of Modules — Module to declare all the new modules created expressly for the new tag set. If there are no new modules defined, only the NISO STS, the JATS Journal Publishing, and the JATS suite-level modules that name modules will be needed.
- Class overrides — Tag-Set-specific overrides of the Suite default element classes. If there are no new class overrides, only the NISO-STS, the JATS Journal Publishing, and the JATS suite-level default classes modules will be needed.
- Mix overrides — Tag-Set-specific overrides of the Suite default class mixes. If there are no new mix overrides, only the NISO-STS, the JATS Journal Publishing, and the JATS suite-level default mixes modules will be needed.
- Model overrides — Tag-Set-specific content model, attribute list, and attribute value overrides for the content models in the modules of the NISO STS or JATS Suite. (The complete content model overrides will use parameter entities named “*-model”. The mix-with-#PCDATA overrides will use parameter entities named “*-elements”. The attribute list overrides will use parameter entities named “*-atts”.)
- New Modules for New Elements — Tag-Set-specific new elements (For example, a new Book Tag Set might add a module of book-specific metadata elements.)
Element Classes Concept
Many of the elements in the NISO STS and JATS have been grouped into loose element classes. There is no hard and fast rule for what constitutes a class; each one is a design decision, a matter of judgment. Classes of elements are used in content models to name a group of elements (or even a single element) whenever there is an OR grouping within the content model. Thus a model should never say “(def-list | list)*”; it should always contain “(%list.class;)*” to name all the lists.
The existing classes have been designed to make customization slightly easier and to meet the particular needs of new tag sets. Base classes for the JATS DTD Suite are defined in a separate module (JATS-default-classes). Some of these classes have been overridden in the JATS Journal Publishing classes customization module (JATS-journalpubcustom-classes). Some of these classes have been overridden in the NISO STS classes customization module (NISO-STS-classes.ent). Any classes the new DTD defines will override any or all of the existing classes.
Content models are built using sequences of elements, OR groups that mix with character data (#PCDATA), or sequences that are mixtures of elements and OR groups. In theory, all OR groups should be written as classes (typically) or as mixes. As an example, the content model for a <p> element is declared to be an OR group (that is, a choice) of data characters and any of the elements named in the %p-elements; mix parameter entity. The mix %p-elements; is declared to be a large OR group of many other element-defining classes: the %block-display.class;, the %math.class;, the %list.class;, the %citation.class;, and others.
Implementor’s Note: These element classes can be viewed as building blocks that will be used to build content models or to build larger parameter entities for element mixes. A mix describes a usage circumstance for a group of elements, such as all the paragraph-level elements, all the elements allowed inside a table cell, all the elements allowed inside a paragraph, or all the inline elements. For example, to add another block display item to the %block-display.class;, you would edit the block-display.class parameter entity (and probably also the block-display-noalt.class parameter entity) in your Tag-Set-specific Class Override Module to override the default parameter entity defined in the Suite’s default-classes.ent module and create a new module containing the Element Declaration of the new block display item.
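A minimal sketch of that edit follows. The element name my-callout, its content model, and the shortened class value are hypothetical; a real override would copy the complete default value of block-display.class from the Suite and append the new element name to it.

```xml
<!-- In the Tag-Set-specific Class Override Module (called before
     the Suite's default classes module).
     NOTE: "fig | table-wrap" stands in for the full default list,
     which is not reproduced here. -->
<!ENTITY % block-display.class
           "fig | table-wrap | my-callout">

<!-- In a new Tag-Set-specific element module: the declaration of
     the new block display item (hypothetical content model). -->
<!ELEMENT my-callout (title?, p+)>
```

The same addition would probably be repeated in an override of block-display-noalt.class, so that the new element is allowed wherever the no-alternatives variant of the class is used.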
Parameter Entity Names for Classes and Mixes
PARAMETER ENTITY: SAME FUNCTION, SAME NAME — The Suite modules and initial NISO STS Tag Sets have used a series of parameter entity naming conventions consistently. While parsing software cannot enforce these parameter entity naming or usage conventions, these conventions make it much easier for a person to know how the content models work and what must be modified to make a Tag Set change.
CLASSES
— Classes are functional groupings of elements used together in an OR group. Each class is named with a parameter entity, and all class parameter entity names end in the suffix “.class”:
<!ENTITY % list.class "def-list | list">
A class, by definition, should never be made empty; the class should be removed from all models where you do not want the class elements included.
MIXES
— Mixes are functional OR groups of classes; mixes should never contain element names directly. All mixes must be declared after all classes, since mixes are composed of classes. Mix names have no set suffix; for example, they may end in “-mix” or “-elements”. Content models and content model overrides should use mixes and classes for all OR groups. Only content model sequences are made up of element names directly.
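As a sketch of these rules, the fragment below declares two classes and then a mix built only from class references. The mix name example-block-mix and the element example-note are hypothetical, and the block-display.class value is abbreviated for illustration.

```xml
<!-- Classes are declared first (second value abbreviated). -->
<!ENTITY % list.class          "def-list | list">
<!ENTITY % block-display.class "fig | table-wrap">

<!-- A mix names classes only, never element names directly.
     "example-block-mix" is a hypothetical mix name. -->
<!ENTITY % example-block-mix
           "%list.class; | %block-display.class;">

<!-- A content model then uses the mix for its OR group.
     "example-note" is a hypothetical element. -->
<!ELEMENT example-note (%example-block-mix;)*>
```

The class references inside the mix value are expanded at the point where the mix is declared, which is why every class must already be declared by then.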
MODEL OVERRIDES
— Parameter entity mixes for overriding a content model are of two styles:
- Inline mixes, and
- Full content model replacements.
These two groupings have been defined and named separately to preserve the mixed-content or element-content nature of the models in tag sets derived from this Suite.
The override of a complete content model will be named with a suffix “-model” and should include the entire content model, including the enclosing parentheses:
<!ENTITY % kwd-group-model "(label?, title?, (%kwd.class;)+ )" >
<!ELEMENT kwd-group %kwd-group-model; >
The inline parameter entities to be intermingled with character data (#PCDATA) in a mixed content model are named with a suffix “-elements”. For example, “%institution-elements;” would be used in the content model for the element <institution>:
<!ENTITY % institution-elements "| %subsup.class;" >
<!ELEMENT institution (#PCDATA %institution-elements;)* >
All inline mixes begin with an OR bar, so that the mix can be removed by making its value the empty string (""), which leaves just the character data (#PCDATA) as the model of the element:
<!ENTITY % rendition-plus "| %emphasis.class; | %subsup.class; | %phrase-content.class;" >
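To illustrate removing an inline mix, the sketch below overrides %institution-elements; with the empty string. It assumes the override sits in a Tag-Set-specific models override module that is called before the module holding the default value.

```xml
<!-- Tag-Set-specific models override module, called first:
     remove every inline element from the mix. -->
<!ENTITY % institution-elements "">

<!-- Unchanged element declaration in the Suite module. With the
     empty mix, its model resolves to (#PCDATA)*. -->
<!ELEMENT institution (#PCDATA %institution-elements;)*>
```

Because the leading OR bar lives inside the mix rather than in the element declaration, emptying the mix leaves a syntactically valid mixed content model.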
How To Build a New Custom Tag Set
The Concept
The basic idea for a new Tag Set is that all lower-level elements (paragraphs, lists, figures, etc.) will be defined in modules — either the modules of the base Suite or in new Tag-Set-specific modules — rather than in the DTD itself. The new DTD will be fairly short and include only definitions of the topmost elements, at least the document element and perhaps its children.
Note: In this section, unlike in the rest of the Tag Library, the Suite modules are all given their full and complete filenames, including a version number and a file type.
Modules are declared (named) using external parameter entities in one of the defined Modules of Modules:
- The new-tag-set-specific one,
- The NISO STS one (NISO-STS-modules1.ent),
- The JATS Publishing Tag Set one (JATS-journalpubcustom-modules1.ent) [if the OASIS CALS tables are needed, use JATS-journalpub-oasis-custom-modules1.ent instead], and
- The base JATS one (JATS-modules1.ent).
Modules defined in these modules are referenced (called in) inside the DTD proper, in the order needed to define the parameter entity overrides in sequence.
The NISO STS Tag Set was written as an example of this Best Practice customization technique, as a customization of the JATS Journal Publishing Tag Set. A new variant tag set that follows this plan will probably consist of the following modules:
- A DTD module to define the top-level elements (as examples: “NISO-STS-interchange-1-mathml2.dtd” and “JATS-journalpublishing1.dtd”);
- A Tag-Set-specific Module of Modules to declare new non-Suite modules in this Tag Set (There are no NISO STS examples since all current NISO STS Tag Sets use the same Module of Modules, but, as an example: “JATS-journalpubcustom-modules1.ent”.);
- A Tag-Set-specific definition of element classes to add new classes and override the default classes (as examples: “NISO-STS-classes-extended1.ent” and “JATS-journalpubcustom-classes1.ent”);
- A Tag-Set-specific definition of element mixes to add new mixes and override the default mixes (There are no NISO STS examples since all current NISO STS Tag Sets use the same Mixes module, but, as an example: “JATS-journalpubcustom-mixes1.ent”.);
- A Tag-Set-specific module of content model, attribute list, and attribute value overrides (There are no NISO STS examples since all current NISO STS Tag Sets use the same Models module, but, as an example: “JATS-journalpubcustom-models1.ent”);
- Tag-Set-specific modules to hold any new element declarations that are not in the DTD file; and
- All or most of the modules from full NISO STS (which includes JATS modules, MathML modules, BITS modules, table modules, and more).
Illustrating Making a Variant Tag Set
To show the process, here is a series of instructions for making a new tag set, illustrated by showing how the JATS Journal Publishing Tag Set was created from the modules of the JATS base Suite.
- Modules
— Write a new Tag-Set-specific Module of Modules, which defines all new customization modules the tag set needs. As an example, the Journal Publishing Tag Set created the module JATS-journalpubcustom-modules1.ent. This Module of Modules contains the declarations of the JATS Publishing DTD class-override module (JATS-journalpubcustom-classes1.ent), the JATS Publishing DTD mix-override module (JATS-journalpubcustom-mixes1.ent), and the JATS Publishing DTD models-override module (JATS-journalpubcustom-models1.ent).
NISO STS Note: The current NISO STS Tag Sets all use the same Module of Modules (NISO-STS-modules1.ent), which contains the definitions of the single class, mix, and model overrides modules all the NISO STS Tag Sets use as well as definitions of all NISO-STS-specific modules.
- Class overrides
— Write a Tag-Set-specific class-override module, defining any overrides to the Suite classes. As an example, the Journal Publishing Tag Set created the module JATS-journalpubcustom-classes1.ent, in which the %date.class; and %rest-of-para.class; parameter entities were redefined from their values in the JATS default classes (JATS-default-classes1.ent).
NISO STS Note: The current NISO STS Tag Sets all use the same class overrides module (NISO-STS-classes1.ent), which contains the declarations of the class overrides that all the NISO STS Tag Sets use. In addition, the Extended Tag Sets use the module NISO-STS-classes-extended1.ent. So the full NISO STS class override hierarchy will be (in the order called):
- NISO STS extended custom class overrides (CALS only)
- NISO STS custom class overrides
- JATS Publishing custom class overrides
- JATS default classes
- Mix overrides
— Write a Tag-Set-specific mix-override module, defining any overrides to the Suite mixes. As an example, the Journal Publishing Tag Set created the module JATS-journalpubcustom-mixes1.ent.
NISO STS Note: The current NISO STS Tag Sets all use the same mix overrides module (NISO-STS-mixes1.ent), which contains the definitions of the mix overrides that all the NISO STS Tag Sets use. So the full NISO STS mix override hierarchy will be (in the order called):
- NISO STS custom mix overrides
- JATS Publishing custom mix overrides
- JATS default mixes
- Model overrides
— Write a Tag-Set-specific content-model-override module, defining any overrides to the content models, attribute lists, and attribute values for the NISO STS Suite. As an example, the Journal Publishing Tag Set created the module JATS-journalpubcustom-models1.ent, in which element collections (suffixed “*-elements”) that will be mixed with #PCDATA were redefined, full content model overrides (suffixed “*-model”) were redefined, and some new attributes and attribute lists were added.
NISO STS Note: The current NISO STS Tag Sets all use the same model and attribute overrides module (NISO-STS-models1.ent), which contains the definitions of the content model and attribute parameter entity overrides that all the NISO STS Tag Sets use to override JATS publishing DTD and JATS Suite models and attribute lists.
- New Elements
— Write any new element-defining modules needed. For the Journal Publishing Tag Set, there are no such modules, but, for example, the BITS Book Tag Set (also made from the JATS Suite) defines a book-specific metadata module and several structural index modules.
NISO STS Note: The current NISO STS Tag Sets created all the element-specific modules that start with “NISO-STS”, for example “NISO-STS-terms-n-def1.ent” and “NISO-STS-adoption1.ent”.
- DTD Module — Once these customization modules are in place, write a new DTD module. Within that DTD module:
- First declare and then use the Modules of Modules parameter entities (in this order).
- Use an external parameter entity Declaration to declare and then call any Tag-Set-specific Module of Modules.
- Use an external parameter entity Declaration to declare and then call the NISO STS shared Module of Modules (NISO-STS-modules1.ent).
- Use an external parameter entity Declaration to declare and then call the JATS Publishing Module of Modules, which names all the potential modules that are part of the JATS Publishing DTD (JATS-journalpub-oasis-custom-modules1.ent).
- Use an external parameter entity Declaration to declare and then call the JATS default Module of Modules, which names all the potential modules that are part of the JATS Suite (JATS-modules1.ent).
- The rest of the external parameter entities will just be references (calls), not definitions, since the various Modules of Modules declare all the needed modules.
- Use external parameter entity references to call in any needed namespace setup modules, the xi:include module, and the JATS common attributes (JATS-common-atts1.ent) module.
- Then use parameter entities to call in the class and class customization modules:
- Use an external parameter entity reference to call the Tag-Set-specific class override module, if there is one.
- To use the OASIS exchange CALS tables in a tag set, use an external parameter entity reference to call the extended classes used for CALS (NISO-STS-classes-extended1.ent)
- Use an external parameter entity reference to call the NISO STS class overrides (NISO-STS-classes1.ent).
- Use an external parameter entity reference to call the JATS Journal Publishing tag-set-specific class overrides module (JATS-journalpubcustom-classes1.ent).
- Use an external parameter entity reference to call the JATS base-suite default classes module (JATS-default-classes1.ent).
- Use external parameter entities to call in the mix and mix override modules:
- Use an external parameter entity reference to call the Tag-Set-specific mix override module, if there is one.
- Use an external parameter entity reference to call the NISO STS mix override module (NISO-STS-mixes1.ent).
- Use an external parameter entity reference to call the Journal Publishing Tag Set mix overrides module (JATS-journalpubcustom-mixes1.ent).
- Use an external parameter entity reference to call the JATS base-suite default mixes module (JATS-default-mixes1.ent).
- Use external parameter entities to call in the model override modules:
- Use an external parameter entity reference to call the module for the Tag-Set-specific content models and attribute list override modules, if any.
- Use an external parameter entity reference to call the NISO-STS-specific content models and attribute list override module (NISO-STS-models1.ent).
- Following that, use an external parameter entity reference to call the content models and attribute list override modules for the JATS Journal Publishing Tag Set (JATS-journalpubcustom-models1.ent).
- Then use an external parameter entity reference to call in the standard JATS Common Module (JATS-common1.ent) that defines elements and attributes shared by many other modules.
- Use many external parameter entity references to call any new Tag-Set-specific modules that define new block-level or phrase-level elements, the modules of the JATS base-suite used by NISO STS, and all the NISO STS base-suite modules. In this step, you select, from the various Modules of Modules, those modules which contain the elements needed for your Tag Set (for instance, selecting lists and not selecting terms and definition elements) and call in each of the modules needed. The NISO STS Interchange and Extended DTDs call these in alphabetical order, first JATS and then NISO STS, since the order does not matter.
- Now your element components have all been declared. In your DTD file, define the document (top-level) element and any other unique elements, attributes, or entities needed for your new specific Tag Set that JATS and NISO STS do not already define. For example, the NISO STS Interchange Tag Set with MathML 2.0 DTD declares only one element <standard> [the top-level element] and a few attribute lists.
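The call order described in the steps above can be sketched as a DTD module skeleton. Everything here is an assumption made for illustration: the file name my-tagset1.dtd, all "my-tagset" modules, the parameter entity names (the real names are whatever the Modules of Modules declare), the use of SYSTEM rather than PUBLIC identifiers, and the content model of the top-level element. It has not been validated against the actual Suite.

```xml
<!-- Hypothetical DTD module "my-tagset1.dtd" (sketch only) -->

<!-- 1. Declare, then call, each Module of Modules, most
        specific first. Entity names are assumed. -->
<!ENTITY % my-tagset-modules.ent SYSTEM "my-tagset-modules1.ent">
%my-tagset-modules.ent;
<!ENTITY % NISO-STS-modules.ent  SYSTEM "NISO-STS-modules1.ent">
%NISO-STS-modules.ent;
<!ENTITY % journalpubcustom-modules.ent
           SYSTEM "JATS-journalpub-oasis-custom-modules1.ent">
%journalpubcustom-modules.ent;
<!ENTITY % modules.ent           SYSTEM "JATS-modules1.ent">
%modules.ent;

<!-- 2. From here on, references only: the Modules of Modules
        have already declared these entities. -->
%JATS-common-atts.ent;

<!-- 3. Classes, overrides before defaults. -->
%my-tagset-classes.ent;
%NISO-STS-classes-extended.ent;  <!-- only if CALS tables are used -->
%NISO-STS-classes.ent;
%journalpubcustom-classes.ent;
%default-classes.ent;

<!-- 4. Mixes, overrides before defaults. -->
%my-tagset-mixes.ent;
%NISO-STS-mixes.ent;
%journalpubcustom-mixes.ent;
%default-mixes.ent;

<!-- 5. Content model and attribute list overrides. -->
%my-tagset-models.ent;
%NISO-STS-models.ent;
%journalpubcustom-models.ent;

<!-- 6. Common module, then the element modules this tag set
        needs (only two shown; order among them does not matter). -->
%common.ent;
%list.ent;
%my-tagset-metadata.ent;

<!-- 7. The top-level element for the new document type
        (hypothetical content model). -->
<!ELEMENT guide (front, body, back?)>
```

The overrides come first at every level because the first declaration of a parameter entity is the one that takes effect.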
Namespaces and MathML
When JATS was first designed, many software tools did not handle multiple redefinitions of the same namespace cleanly and correctly. Therefore, the following namespace prefixes, namespace URIs, and xmlns declarations are declared in the MathML DTD setup modules or in the MathML 2.0 and MathML 3.0 QName modules (and MathML 2.0 and MathML 3.0 schema modules for XSD and RNG):
- XLink
- The XLink prefix is set to “xlink”.
- The XLink namespace URI is set to “http://www.w3.org/1999/xlink”.
- The XLink xmlns pseudo-attribute is set as follows, for use in attribute lists: “xmlns:xlink CDATA #FIXED 'http://www.w3.org/1999/xlink'”.
- MathML
- The MathML namespace prefix is set to “mml”.
- The MathML namespace URI is set to “http://www.w3.org/1998/Math/MathML”.
- The MathML xmlns pseudo-attribute is set as follows, for use in attribute lists: “xmlns:mml CDATA #FIXED 'http://www.w3.org/1998/Math/MathML'”.
- W3C Schema Instance
- The W3C Schema namespace prefix is set to “xsi”.
- The W3C Schema namespace URI is set to “http://www.w3.org/2001/XMLSchema-instance”.
- The W3C schema xmlns pseudo-attribute is set as follows, for use in attribute lists: “xmlns:xsi CDATA #FIXED 'http://www.w3.org/2001/XMLSchema-instance'”.
This definition outside the ordinary JATS modules has annoying subsetting implications. It means that if you do not include the MathML setup modules and MathML modules in your tag set, you will not have those namespaces defined.
Thus, if you want to use the NISO STS and JATS modules to create a tag set that does not include MathML, there are two options open to you:
- Include the MathML setup modules and MathML DTD modules and ignore them in your tagging and in your documentation; or
- Write your own namespace setup module that declares the namespaces mentioned above.
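For the second option, a namespace setup module might look like the sketch below. The prefixes, URIs, and xmlns pseudo-attribute values are the ones listed above; the parameter entity names are assumptions and would need to match whatever names the Suite's attribute lists actually reference.

```xml
<!-- Hypothetical namespace setup module (sketch only).
     Parameter entity names are assumed, not taken from the Suite. -->

<!-- XLink -->
<!ENTITY % XLINK.prefix  "xlink">
<!ENTITY % XLINK.xmlns   "http://www.w3.org/1999/xlink">
<!ENTITY % XLINK.xmlns.attrib
           "xmlns:xlink CDATA #FIXED '%XLINK.xmlns;'">

<!-- MathML -->
<!ENTITY % MATHML.prefix "mml">
<!ENTITY % MATHML.xmlns  "http://www.w3.org/1998/Math/MathML">
<!ENTITY % MATHML.xmlns.attrib
           "xmlns:mml CDATA #FIXED '%MATHML.xmlns;'">

<!-- W3C Schema Instance -->
<!ENTITY % Schema.prefix "xsi">
<!ENTITY % Schema.xmlns  "http://www.w3.org/2001/XMLSchema-instance">
<!ENTITY % Schema.xmlns.attrib
           "xmlns:xsi CDATA #FIXED '%Schema.xmlns;'">
```

Such a module would be called in at the point where the DTD otherwise calls the MathML setup modules, before any attribute list that uses these entities.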
Attributes for Linked Data
NISO STS provides several tagging constructs that are useful in making a NISO STS document as RDF-friendly as is practical in an application specifically designed for full text document production:
- Every element in NISO STS has either an optional or a required attribute of type “ID” that can be used to make the standards document, or any portion of the standards document, directly addressable, at any level of specificity.
- Every element in NISO STS can take an @xml:base attribute. This attribute provides a base URI for identifiers within the XML document. While this mechanism provides an inward-facing linkability rather than a pointer to an external ontology, @xml:base can be used to support link-bases into the XML and external semantic interpretations layered over the XML.
- There is a very easy mechanism to add RDF-a attributes (or any other attributes) to every NISO STS element. The NISO STS DTD model (similarly the RNG model) provides two parameter entities (%jats-common-atts; and %jats-common-atts-id-required;) that can be used to add any attributes a user may prefer to all of the elements in the Tag Suite (except those out of our control, such as MathML elements). The two just-named parameter entities are used to give each NISO STS element an ID and an @xml:base. These parameter entities can be used to add one or more RDF-a attributes to each element in NISO STS.
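As a sketch of the last point, a Tag-Set-specific override could redeclare both parameter entities with extra attributes. The choice of RDFa attributes (about, property, typeof) is illustrative, and the id and xml:base lines are assumed from the description above rather than copied from the JATS common attributes module.

```xml
<!-- Tag-Set-specific override, read before the JATS common
     attributes module so that these declarations win.
     The RDFa attribute selection is an illustrative assumption. -->
<!ENTITY % jats-common-atts
           "id        ID     #IMPLIED
            xml:base  CDATA  #IMPLIED
            about     CDATA  #IMPLIED
            property  CDATA  #IMPLIED
            typeof    CDATA  #IMPLIED">

<!ENTITY % jats-common-atts-id-required
           "id        ID     #REQUIRED
            xml:base  CDATA  #IMPLIED
            about     CDATA  #IMPLIED
            property  CDATA  #IMPLIED
            typeof    CDATA  #IMPLIED">
```

Both entities are overridden together so that elements with a required ID receive the same added attributes as elements with an optional one.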