xml:lang Language of an Element

Names the language of the intellectual content of the element on which this attribute appears. The attribute may be set on many elements, including the two document elements (<standard> and <adoption>), and its value is inherited by all of the element’s children unless explicitly overridden.

Usage/Remarks

How to Tag the Language
In NISO STS, there are several ways to describe the natural language of the content of an element:
  • Content Language Element: The <content-language> element, in the metadata of a standards document (inside <std-ident>), identifies the official language(s) used in this standards document. The element appears once for each official language used in the document. As a Best Practice, the element content should be the two-letter ISO 639 code for the language, for example, “en” for English, “de” for German, or “es” for Spanish.
  • XML Lang Language Attribute: The @xml:lang attribute can be put on many elements, to indicate the language of the element. This is an inherited value, so that element and all of its children will be in the named language, unless specifically overridden with another @xml:lang attribute.
  • Language Element: For ISO-related processing only, the <language> element (inside the <doc-ident> element in the metadata of a standards document) identifies the official language(s) used in this standards document. This element allows more than one language value to be named. Best Practice Note: the <content-language> element should be used in addition to this element.
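As a small illustration of the first option, a document with two official languages would repeat <content-language> once per language. This is only a sketch: the surrounding metadata is elided, and the two language values are chosen for illustration.

```xml
<!-- surrounding metadata elided -->
<content-language>en</content-language>
<content-language>fr</content-language>
<!-- surrounding metadata elided -->
```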
Values of XML Language Attribute
The W3C defines this attribute as part of the XML specification, and its value must conform to IETF RFC 5646 (http://tools.ietf.org/html/rfc5646). For most uses, a primary-language subtag such as “fr” (French), “en” (English), “de” (German), or “zh” (Chinese) is sufficient. These values are NOT case sensitive, but current best practice writes the primary-language subtag in all lower case. In addition to the primary-language subtag, the value of this attribute may contain other subtags. Values for the various subtags (which can be used in certain combinations) can be obtained from the IANA Language Subtag Registry: http://www.iana.org/assignments/language-subtag-registry
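The fragment below sketches a few conforming values. The paragraph text is invented for illustration; the subtags used (“fr”, “de”, “en”, and the region subtag “GB”) are entries in the IANA Language Subtag Registry.

```xml
<!-- Primary-language subtag only, which is sufficient for most uses -->
<p xml:lang="fr">Domaine d’application</p>
<p xml:lang="de">Anwendungsbereich</p>
<!-- Primary-language subtag followed by one additional subtag -->
<p xml:lang="en-GB">Colour coding of conductors</p>
```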
Inheritance
The language value inherits down the tree, so an @xml:lang attribute names the language of the element and all its descendants, unless the descendant sets its own @xml:lang attribute. The default value of English (“en”) is set at the top-level element, thus it can be overridden there or anywhere lower in the document.
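A minimal sketch of inheritance and override follows. The section content is invented for illustration; only the placement of the @xml:lang attributes matters.

```xml
<sec xml:lang="en">
  <title>Scope</title>
  <!-- No attribute here: the paragraph inherits "en" from the enclosing <sec> -->
  <p>This document specifies requirements for ...</p>
  <!-- Overrides the inherited value for this paragraph and its descendants -->
  <p xml:lang="de">Dieses Dokument legt Anforderungen fest ...</p>
  <!-- No attribute here: the inherited "en" applies again -->
  <p>The requirements apply to ...</p>
</sec>
```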
Script and Language (in contexts other than TBX)
In some languages, script codes are also critically important; for example, in Japanese, determining sort keys requires knowing whether a name is written in Kanji or in Kana (Hiragana or Katakana). Best Practice is to use the full language-code-plus-script-code as the value for @xml:lang. In using both language and script tagging as values for @xml:lang, we follow the IETF (Internet Engineering Task Force) Best Practice guideline: Network Working Group Request for Comments: 5646 [Tags for Identifying Languages, A. Phillips and M. Davis, Editors, September 2009]. That document defines a language tag as composed of (in part):
  1. A language code (typically the shortest ISO 639 code)
  2. Potentially followed by a hyphen and a script code (using the ISO 15924 code)
  3. Potentially followed by a hyphen and a region code (typically the ISO 3166-1 code)
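The fragment below sketches how these parts combine into @xml:lang values. The name and the elided paragraph are invented for illustration, and how alternate forms of a name are grouped in a real document is not shown. “Hani” and “Kana” are the ISO 15924 codes for Han (Kanji) and Katakana, “Hant” is the code for Traditional Han, and “TW” is a region code.

```xml
<!-- language + script: the same name written in Kanji and in Katakana -->
<name xml:lang="ja-Hani">
  <surname>山田</surname>
  <given-names>太郎</given-names>
</name>
<name xml:lang="ja-Kana">
  <surname>ヤマダ</surname>
  <given-names>タロウ</given-names>
</name>
<!-- language + script + region: Chinese, Traditional script, Taiwan -->
<p xml:lang="zh-Hant-TW">...</p>
```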
Script and Language in TBX
For TBX elements, the @xml:lang attribute is most often associated with the <tbx:langSet> element, to specify the language of all the child elements in a language section. However, it can be associated with virtually any element to specify the language of the content of that element.
As discussed earlier, use the values found for “language” in the IANA language subtag registry; do not use deprecated values. Unlike other NISO STS elements, TBX elements do not combine script and region values with the @xml:lang attribute; they instead use the separate @script attribute, and the <tbx:geographicalUsage> element.
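A hedged sketch of the TBX convention follows: the language section carries only the language in @xml:lang, and regional usage is recorded separately in <tbx:geographicalUsage>. The term, its id, the content of <tbx:geographicalUsage>, and that element's position inside <tbx:tig> are assumptions made for illustration; the @script attribute is not shown.

```xml
<tbx:langSet xml:lang="en">
  <tbx:tig>
    <tbx:term id="example-term-1">colour</tbx:term>
    <tbx:partOfSpeech value="noun"/>
    <!-- Region is recorded here rather than as an xml:lang subtag such as "en-GB" -->
    <tbx:geographicalUsage>GB</tbx:geographicalUsage>
  </tbx:tig>
</tbx:langSet>
```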
OPTIONAL (defaults to en) on many elements
Value: An alphanumeric string, which may include hyphens
Meaning: An abbreviation for a natural language (such as “en” for English or “de” for German) or for a language and a script (“ja-Kana”).
Default value: en (English is only a default and may be changed in the XML document.)
REQUIRED on element: <tbx:langSet>
Value: Text, numbers, or special characters
Meaning: An abbreviation for a natural language (such as “en” for English or “de” for German) or for a language and a script (“ja-Kana”).
Restriction: This attribute is required; it must be provided if the element is used.
OPTIONAL on many elements

<abbrev>, <abstract>, <ack>, <addr-line>, <address>, <aff>, <alt-text>, <alt-title>, <annotation>, <anonymous>, <app>, <app-group>, <array>, <article-title>, <attrib>, <author-comment>, <award-id>, <bio>, <boxed-text>, <caption>, <chapter-title>, <chem-struct>, <chem-struct-wrap>, <city>, <code>, <collab>, <comment>, <compl>, <compl-title-wrap>, <conf-acronym>, <conf-date>, <conf-loc>, <conf-name>, <conf-sponsor>, <content-language>, <contrib-id>, <copyright-holder>, <copyright-statement>, <country>, <custom-meta>, <custom-meta-group>, <data-title>, <date-in-citation>, <day>, <def>, <def-item>, <def-list>, <degrees>, <disp-formula>, <disp-formula-group>, <disp-quote>, <doc-ident>, <edition>, <element-citation>, <email>, <era>, <etal>, <ext-link>, <fig>, <fig-group>, <fn>, <fn-group>, <fpage>, <full>, <funding-source>, <glossary>, <gov>, <graphic>, <ics-desc>, <index>, <index-div>, <index-entry>, <index-group>, <index-term>, <inline-code>, <inline-formula>, <inline-graphic>, <inline-media>, <inline-supplementary-material>, <institution>, <institution-id>, <intro>, <intro-title-wrap>, <issue>, <issue-id>, <issue-part>, <issue-title>, <journal-id>, <kwd-group>, <label>, <legend>, <license>, <list>, <list-item>, <long-desc>, <lpage>, <main>, <main-title-wrap>, <media>, <meta-note>, <milestone-end>, <milestone-start>, <mixed-citation>, <month>, <name>, <named-content>, <nav-pointer>, <nlm-citation>, <notes>, <on-behalf-of>, <open-access>, <p>, <page-range>, <part-title>, <patent>, <person-group>, <postal-code>, <prefix>, <preformat>, <price>, <pub-date>, <publisher-loc>, <publisher-name>, <rb>, <ref>, <ref-list>, <related-article>, <related-object>, <role>, <rt>, <season>, <sec>, <see>, <see-also>, <see-also-entry>, <see-entry>, <self-uri>, <series>, <sig>, <size>, <source>, <speaker>, <speech>, <state>, <statement>, <std>, <std-doc-meta>, <std-id>, <std-id-group>, <std-ident>, <std-meta>, <std-org>, <std-org-abbrev>, <std-org-group>, <std-org-name>, <string-date>, 
<string-name>, <styled-content>, <sub-part>, <subj-group>, <subtitle>, <suffix>, <supplement>, <supplementary-material>, <table-wrap>, <table-wrap-group>, <target>, <term>, <term-sec>, <textual-form>, <title-wrap>, <toc>, <toc-div>, <toc-entry>, <toc-group>, <trans-source>, <trans-subtitle>, <trans-title>, <trans-title-group>, <uri>, <verse-group>, <verse-line>, <version>, <volume>, <volume-id>, <volume-series>, <xref>, <year>

Value: An alphanumeric string, which may include hyphens
Meaning: An abbreviation for a natural language (such as “en” for English or “de” for German) or for a language and a script (“ja-Kana”).
Restriction: This is an optional attribute; there is no default.
OPTIONAL on many elements
Value: Text, numbers, or special characters
Meaning: An abbreviation for a natural language (such as “en” for English or “de” for German). (Use the @script attribute for any associated script.)
Restriction: This is an optional attribute; there is no default.
Tagged Samples
Languages for licenses
...
<permissions>
 <license xml:lang="en">
  <license-p>Apart from exceptions provided by the law, 
   nothing from this publication may be duplicated and/or 
   published ...</license-p>
  ...
 </license>
 <license xml:lang="nl">
  <license-p>Auteursrecht voorbehouden. Behoudens 
   uitzondering door de wet gesteld mag zonder schriftelijke 
   toestemming van het Nederlands Normalisatie-instituut 
   niets uit deze uitgave worden ...</license-p>
  ...
 </license>
</permissions>
...
Providing the language codes for language sets in different languages
<tbx:termEntry id="ISO10241-1.a23.311">
  <tbx:langSet xml:lang="en">
    <tbx:definition>numerical reference that indicates whether the flow will be laminar or turbulent
      for a given set of conditions</tbx:definition>
    <tbx:tig>
      <tbx:term id="a23.311-1">critical Reynolds number</tbx:term>
      <tbx:partOfSpeech value="noun"/>
    </tbx:tig>
  </tbx:langSet>
  <tbx:langSet xml:lang="fr">
    <tbx:definition>référence numérique indiquant si un écoulement est soit laminaire soit turbulent
      pour un ensemble de conditions données</tbx:definition>
    <tbx:tig>
      <tbx:term id="a23.311-2">nombre de Reynolds critique</tbx:term>
      <tbx:partOfSpeech value="noun"/>
      <tbx:grammaticalGender value="masculine"/>
    </tbx:tig>
  </tbx:langSet>
  <tbx:langSet xml:lang="de">
    <tbx:definition>numerische Bezugsgröße, die anzeigt, ob die Strömung unter definierten
      Bedingungen laminar oder turbulent ist</tbx:definition>
    <tbx:tig>
      <tbx:term id="a23.311-3">kritische Reynoldszahl</tbx:term>
      <tbx:partOfSpeech value="noun"/>
      <tbx:grammaticalGender value="feminine"/>
    </tbx:tig>
  </tbx:langSet>
</tbx:termEntry>