=begin
# $Id: conformance.rd.src,v 1.1 2003/01/22 16:41:45 katsu Exp $

= Conformance of xmlscan to the specifications

This document describes how each parser included in xmlscan conforms
to the XML-related specifications.

== Abstract

XMLscan is a "non-validating XML processor" as defined by the XML 1.0
Specification ((<[XML]>)). It satisfies almost all of the conditions
required of a non-validating XML processor, but because of
implementation limitations the following main restrictions apply.
For details, see the description of each class below.

* An XML document encoded in UTF-16 cannot be parsed directly.
* By default, XMLscan does not check for illegal characters, that is,
  characters which must not appear in an XML document or in a
  particular context.
* XMLscan doesn't read any external entities. Well-formedness
  constraints for external entities are not checked.
* XMLscan skips the internal DTD subset (support is planned for a
  future version). Well-formedness constraints for the internal DTD
  subset are not checked.

== Conformance of XMLScan::XMLScanner

XMLScan::XMLScanner tokenizes an XML document and recognizes only the
following constructs: XML declarations, document type declarations,
processing instructions, comments, start tags, end tags, empty element
tags, CDATA sections, general entity references, and character
references. It is NOT an error for one of these constructs to appear in
a context where it is prohibited, except in the case described below.

A parse error is reported when an XML declaration, document type
declaration (except the internal DTD subset), processing instruction,
comment, start tag, end tag, empty element tag, CDATA section, general
entity reference, or character reference does not match its production
defined in the XML 1.0 Specification ((<[XML]>)).

For reasonable speed, if the `strict_char' option is not specified,
XMLScan::XMLScanner doesn't check whether a name or character data
includes illegal characters.
All characters except those recognized as delimiters in that context
are allowed. To be more precise, without the `strict_char' option, the
productions Char[2], Name[5], Nmtoken[7], EntityValue[9], AttValue[10],
SystemLiteral[11], PubidChar[13], CharData[14], VersionNum[26], and
EncName[81] are not checked strictly.

XMLScan::XMLScanner doesn't normalize line breaks.

Since Ruby does not support UTF-16, an XML document encoded in UTF-16
cannot be parsed as it is. You need to convert it to UTF-8 before
parsing.

`)), and ensure that an XML document is namespace-well-formed.
XMLScan::XMLNamespace inherits all limitations of XMLScan::XMLParser.

== References

: [XML]
    W3C (World Wide Web Consortium). Extensible Markup Language (XML)
    1.0 (Second Edition), October 2000. (())

: [Namespaces]
    W3C (World Wide Web Consortium). Namespaces in XML, January 1999.
    (()). Important corrections are found at (()).

=end