README in X12-0.1.0 vs README in X12-1.1.0
- old
+ new
@@ -1,8 +1,8 @@
= X12Parser - a library to manipulate X12 structures using native Ruby syntax
-$Id: README 40 2008-11-13 19:51:31Z ikk $
+$Id: README 92 2009-05-13 22:12:10Z ikk $
*WARNING* <tt>The project is in development. Contributors are welcome.</tt>
Project home is at http://rubyforge.org/projects/x12parser/. Please
note, this is a different project from {Chris Parker's port}[http://rubyforge.org/projects/x12-parser/] of
@@ -70,61 +70,60 @@
Each participant in EDI has to know the structure of the data coming
across the wire - X12 or no X12. The X12 structures are defined in
so-called Implementation Guides - thick books with all the data pieces
spelled out. There is no other choice, but to invent a
computer-readable definition language that will codify these
-books. For example, the X12/997 message can be defined as
+books. For familiarity's sake we'll use XML. For example, the X12/997
+message can be defined as
- loop 997 1:1
- {
- segment ST 1:1
- segment AK1 1:1
- loop L1000 0:999999
- {
- segment AK2 0:1
- loop L1010 0:999999
- {
- segment AK3 0:1
- segment AK4 0:99
- } # L1010
- segment AK5 1:1
- } # L1000
- segment AK9 1:1
- segment SE 1:1
- } # 997
+ <Definition>
+ <Loop name="997">
+ <Segment name="ST" min="1" max="1"/>
+ <Segment name="AK1" min="1" max="1"/>
+ <Loop name="L1000" max="999999" required="y">
+ <Segment name="AK2" max="1" required="n"/>
+ <Loop name="L1010" max="999999" required="n">
+ <Segment name="AK3" max="1" required="n"/>
+ <Segment name="AK4" max="99" required="n"/>
+ </Loop>
+ <Segment name="AK5" max="1" required="y"/>
+ </Loop>
+ <Segment name="AK9" max="1" required="y"/>
+ <Segment name="SE" max="1" required="y"/>
+ </Loop>
+ </Definition>
-Namely, the 997 is a 'loop' containing segments ST (only one - '1:1'),
-AK1 (also only one), another loop L1000 (zero or many repeats),
-segments AK9 and SE. The loop L1000 can contain a segment AK2
-(optional - '0:1') and another loop L1010 (zero or many), and so on.
+Namely, the 997 is a 'loop' containing segments ST (only one), AK1
+(also only one), another loop L1000 (zero or many repeats), segments
+AK9 and SE. The loop L1000 can contain a segment AK2 (optional) and
+another loop L1010 (zero or many), and so on.
The segments' structure can be further defined as, for example,
- segment AK2 {
- TransactionSetIdentifierCode S R 3-3 Tbl143
- TransactionSetControlNumber S R 4-9
- } # AK2
+ <Segment name="AK2">
+ <Field name="TransactionSetIdentifierCode" required="y" min="3" max="3" validation="T143"/>
+ <Field name="TransactionSetControlNumber" required="y" min="4" max="9"/>
+ </Segment>
-wihch defines a segment AK2 as having to fields:
+which defines a segment AK2 as having two fields:
TransactionSetIdentifierCode and TransactionSetControlNumber. The
field TransactionSetIdentifierCode is defined as having a type of
-string ('S'), begin required ('R'), having length of minimum 3 and
-maximum 3 characters ('3-3'), and being validated against a table
-Tbl143. The validation table is defined as
+string (default), being required, having length of minimum 3 and
+maximum 3 characters, and being validated against a table T143. The
+validation table is defined as
- table Tbl143 {
- 100 Insurance Plan Description
- 101 Name and Address Lists
+ <Table name="T143">
+ <Entry name="100" value="Insurance Plan Description"/>
+ <Entry name="101" value="Name and Address Lists"/>
...
- 997 Functional Acknowledgment
- 998 Set Cancellation
- } # Tbl143
+ <Entry name="997" value="Functional Acknowledgment"/>
+ <Entry name="998" value="Set Cancellation"/>
+ </Table>
-where required values are first tokens on each line, i.e., 100, 101,
-..., 997, 998.
+with entries having just names and values.
-This message is fully flashed out in an example 'misc/997.d12' file,
+This message is fully fleshed out in an example 'misc/997.xml' file,
copied from the ASC X12N 276/277 (004010X093) "Health Care
Claim Status Request and Response" National Electronic Data
Interchange Transaction Set Implementation Guide.
Now expressions like
@@ -137,26 +136,26 @@
inside the enclosing loop 'L1000'. The meaning of the value '66' found
in this field is still in the eye of the beholder, but, at least its
location is clearly identified in the message.
-=== X12 Structure Definition Language (d12)
+=== X12 Structure Definition Language
The syntax of the X12 structure definition language should be apparent
-from the '997.d12' file enclosed with the package. The strict definition
-is formalized in 'lib/X12/x12syntax.treetop' file.
+from the '997.xml' file enclosed with the package. A more detailed
+description follows in Appendix A.
=== Parsing
Here is how to parse an X12/997 message (the source is in
example/parse.rb):
require 'x12'
# Read message definition and create an actual parser
- # by compiling .d12 file
- parser = X12::Parser.new('misc/997.d12')
+ # by compiling the XML file
+ parser = X12::Parser.new('misc/997.xml')
# Define a test message to parse
m997='ST*997*2878~AK1*HS*293328532~AK2*270*307272179~'\
'AK3*NM1*8*L1010_0*8~AK4*0:0*66*1~AK4*0:1*66*1~AK4*0:2*'\
'66*1~AK3*NM1*8*L1010_1*8~AK4*1:0*66*1~AK4*1:1*66*1~AK3*'\
@@ -191,11 +190,11 @@
require 'x12'
# Read message definition and create an actual parser
# by compiling .d12 file
- parser = X12::Parser.new('misc/997.d12')
+ parser = X12::Parser.new('misc/997.xml')
# Make a new 997 message
r = parser.factory('997')
#
@@ -268,15 +267,10 @@
You can install X12 library with the following command.
% gem install X12
-If you install directly from the X12*.gem file, it requires these
-packages to be installed first:
-* {Treetop}[http://rubyforge.org/projects/treetop/]
-* {Polyglot}[http://rubyforge.org/projects/polyglot/]
-
== License
X12 library is released under the Lesser GPL license, see
http://www.gnu.org/licenses/lgpl.txt
@@ -284,16 +278,10 @@
* Validation is not implemented.
* Field types and sizes are ignored.
* No access methods for composites' fields.
-== Wish list
-
-* .d12 files should have an 'include' facility, so data definitions can be reused for different messages.
-
-* It would be nice to codify all popular X12 messages in .d12 format.
-
== Support
Please use the following:
* forums on Rubyforge for general discussions, http://rubyforge.org/forum/?group_id=7297
@@ -304,6 +292,173 @@
The authors of the project were inspired by the following works:
1. The Perl X12 parser by Prasad Poruporuthan, http://search.cpan.org/~prasad/X12-0.09/lib/X12/Parser.pm
2. The Ruby port of the above by Chris Parker, http://rubyforge.org/projects/x12-parser/
-3. Treetop Ruby parser, http://treetop.rubyforge.org
+
+== Appendix A. Structure definition language
+
+The structure definition language uses XML to describe X12 message
+format. A message definition can be in a single file or in several. If
+the definition parser encounters an element (segment, composite, or
+table) that has not been previously defined, it tries to load the
+definition from the file with the same name and in the same
+directory. For example, if a loop mentions a segment named 'ST' and
+this segment is not defined, the parser will try to load and parse a
+file called 'ST.xml'. This convention works for any name unless it
+conflicts with a Microsoft device name (see Appendix B).
+
+Each element in a structure definition (except 'Definition') must have
+an attribute called 'name'. It is used to set/get respective content
+from Ruby. If an element's 'name' attribute cannot be a valid Ruby
+identifier (for example, '270'), the parser will prepend the name with
+'_' (underscore). I.e., this expression is not valid:
+
+ @r.FG[1].270[1].ST.TransactionSetIdentifierCode
+
+but this one is
+
+ @r.FG[1]._270[1].ST.TransactionSetIdentifierCode
+
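The renaming rule above can be illustrated with a minimal sketch. The helper name 'ruby_safe_name' and the regular expression are assumptions made for this example; they are not taken from the X12 library, which may decide validity differently.

```ruby
# Hypothetical helper (not part of the X12 library): returns the name
# under which an element would be addressed from Ruby.
# Assumption: a name is usable as-is when it starts with a letter or
# underscore and contains only word characters.
def ruby_safe_name(name)
  name.match?(/\A[A-Za-z_]\w*\z/) ? name : "_#{name}"
end

puts ruby_safe_name('270')   # a name starting with a digit gets a '_' prefix
puts ruby_safe_name('ST')    # a name that is already usable is left alone
```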
+Each XML file has to have a single root element, one of the following:
+
+=== Definition
+
+The 'Definition' element can have nested loops, segments, composites,
+and tables. It is used to provide an 'artificial' root element for an XML
+document when several definitions are in one physical file. For
+example, this is 'misc/997single.xml' (edited for size):
+
+ <Definition>
+ <Segment name="ST">
+ <Field name="TransactionSetIdentifierCode" min="3" max="3" validation="T143"/>
+ <Field name="TransactionSetControlNumber" min="4" max="9"/>
+ <Field name="ImplementationConventionReference" required="y" min="1" max="35"/>
+ </Segment>
+
+ <Composite name="C030">
+ <Field name="ElementPositionInSegment" type="long" required="n" min="1" max="2"/>
+ <Field name="ComponentDataElementPositionInComposite" type="long" required="y" min="1" max="2"/>
+ <Field name="RepeatingDataElementPosition" type="long" required="y" min="1" max="4"/>
+ </Composite>
+
+ <Segment name="AK1">
+ <Field name="FunctionalIdentifierCode" min="2" max="2" validation="T479"/>
+ <Field name="GroupControlNumber" type="long" min="1" max="9"/>
+ </Segment>
+
+ <Table name="T723">
+ <Entry name="1" value="Mandatory data element missing"/>
+ <Entry name="2" value="Conditional required data element missing."/>
+ <!-- ... other entries -->
+ <Entry name="13" value="Too Many Components"/>
+ </Table>
+
+ <!-- ... other segments or composites or tables -->
+
+ <Loop name="997">
+ <Segment name="ST" min="1" max="1"/>
+ <Segment name="AK1" min="1" max="1"/>
+ <Loop name="L1000" max="999999" required="y">
+ <Segment name="AK2" max="1" required="n"/>
+ <Loop name="L1010" max="999999" required="n">
+ <Segment name="AK3" max="1" required="n"/>
+ <Segment name="AK4" max="99" required="n"/>
+ </Loop>
+ <Segment name="AK5" max="1" required="y"/>
+ </Loop>
+ <Segment name="AK9" max="1" required="y"/>
+ <Segment name="SE" max="1" required="y"/>
+ </Loop>
+
+ </Definition>
+
+This element does not have any attributes and cannot be addressed from Ruby's API.
+
+=== Loop
+
+The 'Loop' element is the main element used to define either loops or whole
+messages. Loops can have nested segments and other loops.
+
+Note that a segment defined inside a loop takes precedence over
+previously defined segments. This is convenient if a special version
+of a segment is required. For example, this is a general definition of
+an 'ST' segment stored in a 'ST.xml' file:
+
+ <Segment name="ST">
+ <Field name="TransactionSetIdentifierCode" min="3" max="3" validation="T143"/>
+ <Field name="TransactionSetControlNumber" min="4" max="9"/>
+ <Field name="ImplementationConventionReference" required="y" min="1" max="35"/>
+ </Segment>
+
+If you want the X12 parser to look for a particular message type, say '997', do this:
+
+ <Loop name="997">
+ <Segment name="ST" max="1">
+ <Field name="TransactionSetIdentifierCode" const="997"/>
+ <Field name="TransactionSetControlNumber" min="4" max="9"/>
+ </Segment>
+ <Segment name="AK1" max="1"/>
+ <!-- ... the rest of the 997 definition -->
+ </Loop>
+
+A loop can have the following attributes:
+* min - minimal number of repeats allowed, default is 0.
+* max - maximum number of repeats allowed, default is 'inf' (infinite).
+* required - if the loop is required (boolean), default is false. The true value implies min="1".
+* comment - ignored
+
+=== Segment
+
+Segments can only have nested fields. Attributes are as follows:
+* min - minimal number of repeats allowed, default is 0. Value min>0 implies required="y".
+* max - maximum number of repeats allowed, default is 'inf' (infinite).
+* required - if the segment is required (boolean), default is false. The true value implies min="1".
+* comment - ignored
+
+All attributes except 'name' are ignored in standalone definitions outside any loop.
+
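How 'min', 'max', and 'required' interact for segments can be sketched as follows. This is only an illustration of the rules listed above: the helper 'repeat_bounds' is hypothetical, and it assumes that attributes arrive as a hash of strings and that 'y' is the only value meaning true.

```ruby
# Hypothetical helper (not part of the X12 library): resolves the
# repeat attributes of a Segment into effective values.
# Assumptions: attrs is a Hash of String => String, 'y' means true.
def repeat_bounds(attrs)
  min = Integer(attrs.fetch('min', 0))            # default is 0
  raw_max = attrs.fetch('max', 'inf')             # default is 'inf'
  max = raw_max == 'inf' ? Float::INFINITY : Integer(raw_max)
  required = attrs['required'] == 'y' || min > 0  # min > 0 implies required
  min = 1 if required && min.zero?                # required implies min = 1
  { min: min, max: max, required: required }
end

p repeat_bounds('max' => '1', 'required' => 'y')
p repeat_bounds({})
```

For loops the README only states that a true 'required' implies min="1"; whether min>0 also implies 'required' for loops is not stated, so the sketch follows the segment rules.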
+=== Composite
+Same as a segment.
+
+=== Table
+
+Tables can only have entries defined as name-value pairs. Order is
+not important. The only required attribute is 'name'.
+
+=== Field
+
+A field cannot have any nested elements, but its attributes are very important:
+* min - minimal number of characters allowed, default is 0. Value min>0 DOES NOT imply required="y" - the field can be missing, but may require a minimum length if present.
+* max - maximum number of characters allowed, default is 'inf' (infinite).
+* required - if the field is required (boolean), default is false. The true value DOES NOT imply min="1".
+* type - one of 'string' (default), 'integer', 'long', or 'double'. These abbreviations are also valid: 'str', 'int'.
+* const - forces the field to have this value, if present.
+* validation - the name of a validation table, if any.
+* comment - ignored
+
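The field attributes can be read as a set of checks on a single value. The sketch below illustrates that reading only; the README lists validation as not implemented, so 'field_errors' is a hypothetical helper, and representing tables as plain hashes keyed by entry name is an assumption of this example.

```ruby
# Hypothetical checker (not part of the X12 library): returns a list of
# problems found in one field value, given the field's attributes.
# Assumptions: attrs is a Hash of String => String, tables maps a table
# name to a Hash of entry name => entry value.
def field_errors(value, attrs, tables = {})
  errors = []
  if value.nil? || value.empty?
    # min > 0 does not imply required, so a missing optional field is fine
    errors << 'required field is missing' if attrs['required'] == 'y'
    return errors
  end
  min = Integer(attrs.fetch('min', 0))
  max = attrs.fetch('max', 'inf')
  errors << "shorter than #{min} characters" if value.length < min
  errors << "longer than #{max} characters" if max != 'inf' && value.length > Integer(max)
  errors << "must equal #{attrs['const']}" if attrs['const'] && value != attrs['const']
  table = tables[attrs['validation']]
  errors << "not found in table #{attrs['validation']}" if table && !table.key?(value)
  errors
end

tables = { 'T143' => { '997' => 'Functional Acknowledgment' } }
attrs  = { 'min' => '3', 'max' => '3', 'validation' => 'T143' }
p field_errors('997', attrs, tables)
p field_errors('99', attrs, tables)
```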
+== Appendix B. Microsoft's device file names
+
+Apparently, in Microsoft's operating systems one cannot create a file
+named like 'device_name.whatever'; for example, 'CON.xml' is not
+allowed. For such cases, the X12 parser makes an exception to the
+naming convention and expects 'CON_.xml' instead.
+
+Here is the full device list according to Microsoft (see
+http://support.microsoft.com/kb/74496):
+
+ Name Function
+ ---- --------
+ CON Keyboard and display
+ PRN System list device, usually a parallel port
+ AUX Auxiliary device, usually a serial port
+ CLOCK$ System real-time clock
+ NUL Bit-bucket device
+ A:-Z: Drive letters
+ COM1 First serial communications port
+ LPT1 First parallel printer port
+ LPT2 Second parallel printer port
+ LPT3 Third parallel printer port
+ COM2 Second serial communications port
+ COM3 Third serial communications port
+ COM4 Fourth serial communications port
+
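Combined with the file-naming convention from Appendix A, the exception for device names can be sketched as below. The constant 'DEVICE_NAMES', the helper 'definition_file', and the case-insensitive comparison are assumptions of this example rather than the library's actual code; drive letters are left out because they cannot appear as element names.

```ruby
# Device names from the table above (drive letters omitted).
DEVICE_NAMES = %w[CON PRN AUX CLOCK$ NUL COM1 COM2 COM3 COM4 LPT1 LPT2 LPT3].freeze

# Hypothetical helper (not part of the X12 library): builds the path of
# the file expected to hold the definition of the element called name.
# Assumption: device names are matched regardless of letter case.
def definition_file(name, dir = '.')
  base = DEVICE_NAMES.include?(name.upcase) ? "#{name}_" : name
  File.join(dir, "#{base}.xml")
end

puts definition_file('ST', 'misc')
puts definition_file('CON', 'misc')
```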