= metanorma-gb: Authoring Chinese standards (GuoBiao, sector...) in AsciiDoc
image:https://img.shields.io/gem/v/metanorma-gb.svg["Gem Version", link="https://rubygems.org/gems/metanorma-gb"]
image:https://img.shields.io/travis/riboseinc/metanorma-gb/master.svg["Build Status", link="https://travis-ci.org/riboseinc/metanorma-gb"]
image:https://codeclimate.com/github/riboseinc/metanorma-gb/badges/gpa.svg["Code Climate", link="https://codeclimate.com/github/riboseinc/metanorma-gb"]
== Functionality
This gem generates
https://en.wikipedia.org/wiki/Guobiao_standards[Guobiao standards]
(Chinese national standards), using AsciiDoc.
This gem implements the https://github.com/riboseinc/gbdoc[GbDoc] data model,
which inherits from
https://github.com/riboseinc/isodoc-models[IsoDoc and StandardDocument].
The code of this gem inherits from
https://github.com/riboseinc/metanorma-iso[metanorma-iso], a gem used to
generate ISO standards using AsciiDoc.
The two standards formats are closely aligned. Refer to the ISO gem
for guidance, including
https://github.com/riboseinc/metanorma-iso/wiki/Guidance-for-authoring[IsoDoc: Guidance for authoring].
The gem can also be used to generate Chinese local or sector standards, which
have the same format; the gem formats the title page to have the correct
metadata displayed.
The following outputs are generated.
* (Optional) An HTML preview generated directly from the Asciidoctor document,
using native Asciidoc formatting.
** http://asciimath.org[AsciiMathML] is to be used for mathematical formatting.
The gem uses the https://github.com/asciidoctor/asciimath[Ruby AsciiMath parser],
which is syntactically stricter than the common MathJax processor;
if you do not get the expected results, try bracketing terms in your AsciiMathML
expressions.
* An XML representation of the document, intended as a document model for GB
standards.
* The XML representation is processed in turn to generate the following outputs
as end deliverable GB standard drafts.
** HTML
** Word
The Word output of the gem is strictly
aligned to the GB/T 1.1 specification, including the fonts and font sizes
prescribed, and the measurements for element positioning on the page.
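
As a rough illustration of the bracketing advice in the AsciiMathML note above, the fragment below writes two expressions first with grouping left implicit and then with explicit brackets. The expressions are made up for illustration, and the `stem:[...]` macro is assumed to be how mathematics is entered. This README does not say which unbracketed forms the Ruby parser treats differently, so read the second form as the safer habit rather than a documented requirement.

[source,asciidoc]
----
// Grouping left to the parser (may not render as intended)
stem:[1/2n + x^n+1]

// Grouping stated explicitly with brackets
stem:[1/(2n) + x^(n+1)]
----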
== Usage
The preferred way to invoke this gem is via the `metanorma` script:
[source,console]
----
$ metanorma --type gb a.adoc # output HTML and DOC
$ metanorma --type gb --extensions html a.adoc # output just HTML
$ metanorma --type gb --extensions doc a.adoc # output just DOC
$ metanorma --type gb --extensions xml a.adoc # output GB XML
----
The gem translates the document into GB XML format, and then
validates its output against the GB XML document model; errors are
reported to console against the XML, and are intended for users to
check that they have provided all necessary components of the
document.
The gem then converts the XML into HTML and DOC.
The gem can also be invoked directly within asciidoctor, though this is deprecated:
[source,console]
----
$ asciidoctor -b gb -r 'metanorma-gb' a.adoc
----
=== Installation
If you are using a Mac, the https://github.com/riboseinc/metanorma-macos-setup
repository has instructions on setting up your machine to run Metanorma
scripts such as this one. You need only run the following in a Terminal console:
[source,console]
----
$ bash <(curl -s https://raw.githubusercontent.com/riboseinc/metanorma-macos-setup/master/metanorma-setup)
$ gem install metanorma-gb
----
== Differences from `metanorma-iso`
=== Document Attributes
In the following, "GB standard" should be read to refer to any Chinese
national, sector or local standard. Asterisked document attributes are
mandatory.
`:title-intro-zh:`, `:title-main-zh:`*, `:title-part-zh:`::
These are the title introduction, main title, and part title in Chinese.
The intro and part titles are optional.
(They replace their French counterparts in
https://github.com/riboseinc/metanorma-iso[metanorma-iso].)
`:title-intro-en:`, `:title-main-en:`*, `:title-part-en:`::
These are the title introduction, main title, and part title in English.
The intro and part titles are optional.
(They form the document subtitle, instead of the document title as in
https://github.com/riboseinc/metanorma-iso[metanorma-iso].)
`:technical-committee-type:`::
The type of the technical committee (`technical` or `provisional`).
`:iso-standard:`::
(optional) A corresponding ISO standard that the GB standard relates to. Format
is full document code, then optionally comma followed by document title;
e.g. `ISO/IEC 27001:2013, Information security management systems`
`:equivalence:`::
(optional, only valid if there is a corresponding `:iso-standard:`)
The relation of the GB standard to the corresponding ISO standard
(`equivalent`, `identical`, `nonequivalent`). Defaults to `equivalent`.
`:obsoletes:`::
(optional)
A corresponding GB standard that this GB standard obsoletes. Format is full
document code, then optionally comma followed by document title;
e.g. `GB/T 22080-2008`
`:obsoletes-parts:`::
A list of bibliographic localities in the corresponding GB standard that this
GB standard obsoletes. These are formatted the same way as the localities in
citations; e.g. `clause 7-9, clause 11`
`:scope:`::
The scope of the GB standard (`national`, `sector`, `professional`, `local`,
`enterprise`, `social-group`). Defaults to `national`.
`:mandate:`::
The mandate of the GB standard (`mandatory`, `recommended`, `guidelines`).
Defaults to `mandatory`.
`:topic:`::
The topic of the GB standard (`basic`, `health-and-safety`, `environment-protection`, `engineering-and-construction`, `product`, `method`, `management-techniques`, `other`). Defaults to `basic`.
`:prefix:`::
The prefix classifying the GB standard.
(Refer to
https://github.com/riboseinc/gbdoc/blob/master/models/gb-standard-national-prefix.adoc[GB National Standard Prefixes],
https://github.com/riboseinc/gbdoc/blob/master/models/gb-standard-sector-prefix.adoc[GB Sector Standard Prefixes],
https://github.com/riboseinc/gbdoc/blob/master/models/gb-standard-local-prefix.adoc[GB Local Standard Prefixes],
https://github.com/riboseinc/metanorma-gb/issues/54[GB Social and Enterprise Standard Prefixes].)
Any `/T` or `/Z` suffix (indicating "recommended" and "guidelines" mandate respectively) is
used only if the `:mandate:` attribute is not given. Any `T/` or `Q/` prefix (for social-group and enterprise
standards respectively) is used only if the `:scope:` attribute is not given.
`:issued-date:`::
The date on which the GB standard was issued (authorised for publication by the issuing authority).
`:published-date:`::
The date on which the GB standard was published (distributed by the publisher).
`:implemented-date:`::
The date on which the GB standard became active.
`:created-date:`::
The date on which the first version of the GB standard was created.
`:updated-date:`::
The date on which the current version of the GB standard was updated.
`:obsoleted-date:`::
The date on which the GB standard was obsoleted/revoked.
`:confirmed-date:`::
The date on which the GB standard was reviewed and approved by the issuing authority.
`:library-ics:`::
The ICS (International Classification for Standards) number for the GB standard. There may be more than one ICS for a document; if so, they should be comma-delimited. (Unlike the case for ISO, the ICS identifier is output to the front page of the GB standard.)
`:library-ccs:`::
The CCS (Chinese Classification for Standards) code for the GB standard. See https://github.com/riboseinc/cn-ccs-codes
`:plan-number:`::
The Plan Number (计划单号) for the GB standard.
`:issuer:`::
The issuer of the standard. This is the authority which authors, manages, and issues the standard. For social standards, this is the social group; for enterprise standards, this is the company. The issuer appears on the standard frontispiece. By default, the issuer is inferred from the prefix of the standard; this attribute overrides the value inferred from the prefix. It is required for social and enterprise standards.
`:publisher:`::
The publisher of the standard, which distributes the standard. This is distinct from the issuer, the authority which authors, manages, and issues the standard.
`:proposer:`::
The party which proposed the standard.
`:authority:`::
The authority which sponsored the standard.
`:author:`::
The individuals who drafted the standard.
`:author-committee:`::
The committees which drafted the standard.
`:title-font:`::
The font to use for the standard class and issuer on the (Word) cover page; described in GB/T 1.1 as
"custom font". If not provided, the font is inferred from the scope of the standard, aligning
with existing practice: SimSun for national scope, SimHei for all other scopes.
`:keep-boilerplate:`::
If absent (default), any paragraphs supplied at the start of the Terms and Definitions
section are deleted, and replaced with standard boilerplate. If present, any such
paragraphs in the text are retained.
`:standard-logo-img:`::
User-supplied graphic to overwrite the logo for the standard on the title page.
`:standard-class-img:`::
User-supplied graphic to overwrite the name of the standard class on the title page.
`:standard-issuer-img:`::
User-supplied graphic to overwrite the name of the standard issuer on the title page.
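
To show how the attributes above might sit together, here is a minimal sketch of a document header for a recommended national standard that adopts an ISO standard and obsoletes part of an earlier GB standard. All values are illustrative placeholders (the titles, ICS codes, and dates are not taken from a real document). The date format is an assumption, since this section does not specify one.

[source,asciidoc]
----
= Information security management systems
// Mandatory titles (asterisked in the list above)
:title-main-zh: 信息安全管理体系
:title-main-en: Information security management systems
// Optional relation to an ISO standard; :equivalence: is only valid alongside it
:iso-standard: ISO/IEC 27001:2013, Information security management systems
:equivalence: identical
// Earlier GB standard, and the localities in it, that this document obsoletes
:obsoletes: GB/T 22080-2008
:obsoletes-parts: clause 7-9, clause 11
// Classification; :mandate: is given explicitly, so no /T suffix is needed on the prefix
:scope: national
:mandate: recommended
:topic: management-techniques
:prefix: GB
// Several ICS codes are comma-delimited (placeholder values)
:library-ics: 35.030, 03.100.70
// Placeholder dates; the YYYY-MM-DD format is an assumption
:issued-date: 2018-01-01
:implemented-date: 2018-07-01
----

Attributes left out here, such as `:issuer:` and `:title-font:`, fall back to the defaults and inferences described in the list above.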
=== Language macros
In Terms and Definitions, preferred terms, alternate terms and deprecated terms
are expected to be given in both Chinese and English. By default, the gem does
this by detecting space-delimited runs of Han or Latin script text:
[source,asciidoc]
--
alt:[rough rice 糙米]
--
[source,xml]
--
糙米 rough rice
--
However if there is script mixing in a term -- if the Chinese term contains
a Latin script acronym or a mathematical expression, for example -- the
Chinese term will not be detected correctly. To address this, the formatting macros
`+[zh]#...#+` and `+[en]#...#+` are used. If they are present, then the content
of those macros is treated as the Chinese and English equivalents of the
parent node instead:
[source,asciidoc]
--
=== [en]#XYZ paddy# [zh]#水稻XYZ#
alt:[[en]#rough rice# [zh]#糙米#]
--
[source,xml]
--
XYZ paddy 水稻XYZ
糙米 rough rice
--
Unfortunately, Asciidoctor permits no further markup within the `+[zh]#...#+` and
`+[en]#...#+` macros, and it does not correctly nest
inline macros within other inline macros (so `+alt:[en:[_xyz_] zh:[xyz]]+`
would not give correct behaviour either).
Localisation strings can be used anywhere else in the document where the
grammar permits localised strings (notably in bibliographic data). For example,
a bibliographic title can be given in two languages as follows. (Note that formatting appears outside the language macros.)
[source,asciidoc]
--
[[[ISO7301,ISO 7301:2011]]], _[zh]#大米 - 规格# [en]#Rice -- Specification#_
--
[source,xml]
--
大米 - 规格 Rice‑Specification
ISO 7301
2011
International Organization for Standardization
ISO
--
The gem also supports `+[zh-Hant]#...#+` and `+[zh-Hans]#...#+` to
differentiate traditional and simplified script in ISOXML; `zh-Hant` is
provisionally supported through changing font in the output.
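
As a sketch of the script-specific macros, the bibliographic title from the earlier example could be tagged for either script as follows. The Traditional Chinese rendering of the title is supplied here purely for illustration, and the anchors `ISO7301a` and `ISO7301b` are hypothetical. How the gem renders each variant is not shown in this README.

[source,asciidoc]
----
// Simplified script, tagged explicitly
[[[ISO7301a,ISO 7301:2011]]], _[zh-Hans]#大米 - 规格# [en]#Rice -- Specification#_

// Traditional script (illustrative rendering of the same title)
[[[ISO7301b,ISO 7301:2011]]], _[zh-Hant]#大米 - 規格# [en]#Rice -- Specification#_
----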
== Caveats
=== Microsoft Word
The Word output is meticulously aligned to the GB/T 1.1 specification, which is highly
prescriptive on the positioning of elements on the page. This means that the Word output
uses http://www.addbalance.com/word/frames_textboxes.htm[frames] and
https://en.wikipedia.org/wiki/Vector_Markup_Language[VML] extensively, as the best mechanism
Word HTML has to ensure precise positioning of elements. However, the use of frames
makes Word documents more cumbersome to edit; it is envisaged that the bulk of document
editing should be happening in Asciidoctor, with Word treated as a write-only output format.
The use of VML and frames is mostly confined to the cover page, which is the most heavily
prescribed by GB/T 1.1. However, Word as of 2016 suppresses space before a paragraph
after a page break (though not after a section break). This means that the Foreword, Introduction,
Document Title, Annex and Bibliography titles would either all lose their mandated initial
space in Word, or else would all have to be treated as separate sections. For that reason,
those headings are instead treated by this gem as frames (inline with their following text),
which preserve their initial spacing.
=== GB/T 1.1 Compliance
GB/T 1.1-2009 prescribes the format of GB standards meticulously, and is based on ISO/IEC DIR 2-2004
(though it is not equivalent, and ISO/IEC DIR 2 is less prescriptive about layout).
GB issued a template program for generating compliant Word documents
in 2010; this program no longer executes on Windows. (This gem has extracted its stylesheet for
use in formatting output, but the stylesheet itself had to be modified in places to comply with
GB/T 1.1.)
Compliance of GB standards with GB/T 1.1 has been patchy. This has been exacerbated by the fact that
ISO/IEC DIR 2 was substantially revised in 2011 and again in 2016. Although GB/T 1.1 has not been
updated to align with ISO/IEC DIR 2-2016, published GB standards increasingly are formatted according
to ISO in most areas where ISO and GB now conflict.
This gem attempts to align with current best practice of GB standards, and does so in consultation with
GB. GB/T 19018-2017 has been used as the exemplar standard.
The following are the areas where the gem's Word output aligns with or deviates from GB/T 1.1-2009.
* https://github.com/riboseinc/metanorma-gb/issues/58[Measurements (GB/T 1.1 Annex I.)] The gem
scrupulously aligns with the measurements prescribed in GB/T, to a greater extent than the 2010
template tool. As already noted, it makes extensive use of frames to ensure correct vertical positioning
of headers, and of elements on the cover page.
* https://github.com/riboseinc/metanorma-gb/issues/56[Fonts (GB/T 1.1 Annex J.)] The gem aligns
with the fonts and font sizes prescribed in GB/T. (The only exception is the standard name, for which a
point size of 72 is quite unrealistic: 26pt is used instead, in compliance with the preexisting Word
template.) For Simplified Chinese script, the gem uses by default SimSun as its "serif" font, and SimHei
as its "sans-serif" font; this reflects practice in the
Word templates used for GB. For Latin script, it uses Cambria as its serif font, and Calibri as its
sans-serif font; this is to minimise disruption moving between scripts. (Note that the stylesheets
make minimal use of boldface and italics, as these are not well-matched with Chinese typography;
the sans-serif font occupies the niche that boldface occupies in ISO Latin-script documents.)
+
GB/T 1.1 prescribes a "custom font" for the standard class and standard issuer on the cover page.
By default, this is the serif font for standards with national scope, and the sans-serif font for
all other scopes. All font selections can be overridden in the document attributes (`:bodyfont:`,
`:headerfont:`, `:titlefont:`).
* https://github.com/riboseinc/metanorma-gb/issues/57[Layout (GB/T 1.1 Clause 9.)]. The gem complies
with GB/T 1.1, with the following exceptions where it follows ISO/IEC DIR 2-2016 practice instead:
** 9.3: There are no separate tables of figures, tables of tables, or tables of annexes. Table of Contents
indentation in the 2010 stylesheet did not comply with GB/T 1.1.
** 9.5.2: Normative references and Bibliography references are indented like normal paragraphs, instead of
having a hanging indent ("on overflow they should be indented to the top level"); in fact, GB/T 1.1
does not follow this in its own references list.
** 9.5.3: Terms and Definitions is aligned with ISO/IEC DIR 2: there is provision for alternate and
deprecated terms, and term sources are notated in brackets whether they are modified or direct citations
from the source document, instead of being treated as a note in the latter case.
(https://github.com/riboseinc/metanorma-gb/issues/67) Clause numbers are separated from the term
source reference by a dash. References to terms defined elsewhere in the Terms and Definitions clause
are accompanied with clause references.
** 9.9.3: Figure footnotes are no longer treated as footnotes, but are instead merged into the figure
key, as is done in ISO/IEC DIR 2. Footnote indentation and note indentation in the 2010 stylesheet
did not comply with GB/T 1.1.
** 9.9.4: Example labels do not appear on a separate line. Examples, like notes, have a hanging indent,
so that their content is left-aligned.
** 9.9.5: Formulas are centered in the page, but are not connected with the formula number with a
dotted tab.
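
The font overrides mentioned under Fonts above could be set in the document header along the following lines. This is a sketch only: the attribute names are those given in that item, while the font names are placeholders chosen for illustration and would need to be installed on the machine that opens the Word output.

[source,asciidoc]
----
// Placeholder font names; substitute fonts available on your system
:bodyfont: FangSong
:headerfont: SimHei
:titlefont: KaiTi
----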