README in bindata-0.11.1 vs README in bindata-1.0.0

- old
+ new
@@ -1,300 +1,1143 @@
-= BinData
+Title: BinData Reference Manual
 
+{:ruby: lang=ruby html_use_syntax=true}
+
+# BinData
+
 A declarative way to read and write structured binary data.
 
-== What is it for?
+## What is it for?
 
 Do you ever find yourself writing code like this?
 
-  io = File.open(...)
-  len = io.read(2).unpack("v")
-  name = io.read(len)
-  width, height = io.read(8).unpack("VV")
-  puts "Rectangle #{name} is #{width} x #{height}"
+    io = File.open(...)
+    len = io.read(2).unpack("v")[0]
+    name = io.read(len)
+    width, height = io.read(8).unpack("VV")
+    puts "Rectangle #{name} is #{width} x #{height}"
+{:ruby}
 
 It's ugly, violates DRY and feels like you're writing Perl, not Ruby.
+
 There is a better way.
 
-  class Rectangle < BinData::Record
-    uint16le :len
-    string   :name, :read_length => :len
-    uint32le :width
-    uint32le :height
-  end
+    class Rectangle < BinData::Record
+      endian :little
+      uint16 :len
+      string :name, :read_length => :len
+      uint32 :width
+      uint32 :height
+    end
 
-  io = File.open(...)
-  r = Rectangle.read(io)
-  puts "Rectangle #{r.name} is #{r.width} x #{r.height}"
+    io = File.open(...)
+    r = Rectangle.read(io)
+    puts "Rectangle #{r.name} is #{r.width} x #{r.height}"
+{:ruby}
 
 BinData makes it easy to specify the structure of the data you are
 manipulating.
 
 Read on for the tutorial, or go straight to the
-download[http://rubyforge.org/frs/?group_id=3252] page.
+[download](http://rubyforge.org/frs/?group_id=3252) page.
 
-== Syntax
+## License
 
+BinData is released under the same license as Ruby.
+
+Copyright &copy; 2007 - 2009 [Dion Mendel](mailto:dion@lostrealm.com)
+
+---------------------------------------------------------------------------
+
+# Overview
+
 BinData declarations are easy to read.  Here's an example.
 
-  class MyFancyFormat < BinData::Record
-    stringz :comment
-    uint8   :count, :check_value => lambda { (value % 2) == 0 }
-    array   :some_ints, :type => :int32be, :initial_length => :count
-  end
+    class MyFancyFormat < BinData::Record
+      stringz :comment
+      uint8   :num_ints, :check_value => lambda { value.even? }
+      array   :some_ints, :type => :int32be, :initial_length => :num_ints
+    end
+{:ruby}
 
-The structure of the data in this example is
-1. A zero terminated string
-2. An unsigned 8bit integer which must by even
-3. A sequence of unsigned 32bit integers in big endian form, the total
-   number of which is determined by the value of the 8bit integer.
+This fancy format describes the following collection of data:
 
-The BinData declaration matches the english description closely.  Just for
-fun, lets look at how we'd implement this using #pack and #unpack.  Here's
-the writing code, have a go at the reading code.
+1.  A zero terminated string
+2.  An unsigned 8bit integer which must by even
+3.  A sequence of unsigned 32bit integers in big endian form, the total
+    number of which is determined by the value of the 8bit integer.
 
-  comment = "this is a comment"
-  some_ints = [2, 3, 8, 9, 1, 8]
-  File.open(...) do |io|
-    io.write([comment, some_ints.size, *some_ints].pack("Z*CN*"))
-  end
+The BinData declaration matches the English description closely.
+Compare the above declaration with the equivalent `#unpack` code to read
+such a data record.
 
+    def read_fancy_format(io)
+      comment, num_ints, rest = io.read.unpack("Z*Ca*")
+      raise ArgumentError, "ints must be even" unless num_ints.even?
+      some_ints = rest.unpack("N#{num_ints}")
+      {:comment => comment, :num_ints => num_ints, :some_ints => *some_ints}
+    end
+{:ruby}
 
-The general format of a BinData declaration is a class containing one or more
-fields.
+The BinData declaration clearly shows the structure of the record.  The
+`#unpack` code makes this structure opaque.
 
-  class MyName < BinData::Record
-    type field_name, :param1 => "foo", :param2 => bar, ...
-    ...
-  end
+The general usage of BinData is to declare a structured collection of
+data as a user defined record.  This record can be instantiated, read,
+written and manipulated without the user having to be concerned with the
+underlying binary representation of the data.
 
-*type* is the name of a supplied type (e.g. <tt>uint32be</tt>,  +string+)
-or a user defined type.  For user defined types, convert the class name
-from CamelCase to lowercase underscore_style.
+---------------------------------------------------------------------------
 
-*field_name* is the name by which you can access the data.  Use either a
-String or a Symbol.
+# Common Operations
 
-Each field may have *parameters* for how to process the data.  The
-parameters are passed as a Hash using Symbols for keys.
+There are operations common to all BinData types, including user defined
+ones.  These are summarised here.
 
-== Handling dependencies between fields
+## Reading and writing
 
-A common occurance in binary file formats is one field depending upon the
-value of another.  e.g. A string preceded by it's length.
+`::read(io)`
 
-As an example, let's assume a Pascal style string where the byte preceding
-the string contains the string's length.
+:   Creates a BinData object and reads its value from the given string
+    or `IO`.  The newly created object is returned.
 
-  # reading
-  io = File.open(...)
-  len = io.getc
-  str = io.read(len)
-  puts "string is " + str
+        str = BinData::Stringz::read("string1\0string2")
+        str.snapshot #=> "string1"
+    {:ruby}
 
-  # writing
-  io = File.open(...)
-  str = "this is a string"
-  io.putc(str.length)
-  io.write(str)
+`#read(io)`
 
-Here's how we'd implement the same example with BinData.
+:   Reads and assigns binary data read from `io`.
 
-  class PascalString < BinData::Record
-    uint8  :len,  :value => lambda { data.length }
-    string :data, :read_length => :len
-  end
+        obj = BinData::Uint16be.new
+        obj.read("\022\064")
+        obj.value #=> 4660
+    {:ruby}
 
-  # reading
-  io = File.open(...)
-  ps = PascalString.new
-  ps.read(io)
-  puts "string is " + ps.data
+`#write(io)`
 
-  # writing
-  io = File.open(...)
-  ps = PascalString.new
-  ps.data = "this is a string"
-  ps.write(io)
+:   Writes the binary representation of the object to `io`.
 
-This syntax needs explaining.  Let's simplify by examining reading and
-writing separately.
+        File.open("...", "wb") do |io|
+          obj = BinData::Uint64be.new
+          obj.value = 568290145640170
+          obj.write(io)
+        end
+    {:ruby}
 
-  class PascalStringReader < BinData::Record
-    uint8  :len
-    string :data, :read_length => :len
-  end
+`#to_binary_s`
 
-This states that when reading the string, the initial length of the string
-(and hence the number of bytes to read) is determined by the value of the
-+len+ field.
+:   Returns the binary representation of this object as a string.
 
-Note that <tt>:read_length => :len</tt> is syntactic sugar for
-<tt>:read_length => lambda { len }</tt>, but more on that later.
+        obj = BinData::Uint16be.new
+        obj.assign(4660)
+        obj.to_binary_s #=> "\022\064"
+    {:ruby}
 
-  class PascalStringWriter < BinData::Record
-    uint8  :len, :value => lambda { data.length }
-    string :data
-  end
+## Manipulating
 
-This states that the value of +len+ is always equal to the length of +data+.
-+len+ may not be manually modified.
+`#assign(value)`
 
-Combining these two definitions gives the definition for +PascalString+ as
-previously defined.
+:   Assigns the given value to this object.  `value` can be of the same
+    format as produced by `#snapshot`, or it can be a compatible data
+    object.
+  
+        arr = BinData::Array.new(:type => :uint8)
+        arr.assign([1, 2, 3, 4])
+        arr.snapshot #=> [1, 2, 3, 4]
+    {:ruby}
 
-Once thing to note with dependencies, is that a field can only depend on one
-before it.  You can't have a string which has the characters first and the
-length afterwards.
+`#clear`
 
-== Predefined Types
+:   Resets this object to its initial state.
 
-These are the predefined types.  Custom types can be created by composing
-these types.
+        obj = BinData::Int32be.new(:initial_value => 42)
+        obj.assign(50)
+        obj.clear
+        obj.value #=> 42
+    {:ruby}
 
-BinData::String::   A sequence of bytes.
-BinData::Stringz::  A zero terminated sequence of bytes.
+`#clear?`
 
-BinData::Array::    A list of objects of the same type.
-BinData::Choice::   A choice between several objects.
-BinData::Struct::   An ordered collection of named objects.
+:   Returns whether this object is in its initial state.
 
-BinData::Int8::     Signed  8 bit integer.
-BinData::Int16le::  Signed 16 bit integer (little endian).
-BinData::Int16be::  Signed 16 bit integer (big endian).
-BinData::Int32le::  Signed 32 bit integer (little endian).
-BinData::Int32be::  Signed 32 bit integer (big endian).
-BinData::Int64le::  Signed 64 bit integer (little endian).
-BinData::Int64be::  Signed 64 bit integer (big endian).
+        arr = BinData::Array.new(:type => :uint16be, :initial_length => 5)
+        arr[3] = 42
+        arr.clear? #=> false
 
-BinData::Uint8::    Unsigned  8 bit integer.
-BinData::Uint16le:: Unsigned 16 bit integer (little endian).
-BinData::Uint16be:: Unsigned 16 bit integer (big endian).
-BinData::Uint32le:: Unsigned 32 bit integer (little endian).
-BinData::Uint32be:: Unsigned 32 bit integer (big endian).
-BinData::Uint64le:: Unsigned 64 bit integer (little endian).
-BinData::Uint64be:: Unsigned 64 bit integer (big endian).
+        arr[3].clear
+        arr.clear? #=> true
+    {:ruby}
 
-BinData::Bit1::     1 bit unsigned integer (big endian).
-BinData::Bit2::     2 bit unsigned integer (big endian).
-...
-BinData::Bit63::    63 bit unsigned integer (big endian).
+## Inspecting
 
-BinData::Bit1le::   1 bit unsigned integer (little endian).
-BinData::Bit2le::   2 bit unsigned integer (little endian).
-...
-BinData::Bit63le::  63 bit unsigned integer (little endian).
+`#num_bytes`
 
-BinData::FloatLe::  Single precision floating point number (little endian).
-BinData::FloatBe::  Single precision floating point number (big endian).
-BinData::DoubleLe:: Double precision floating point number (little endian).
-BinData::DoubleBe:: Double precision floating point number (big endian).
+:   Returns the number of bytes required for the binary representation
+    of this object.
 
-BinData::Rest::     Consumes the rest of the input stream.
+        arr = BinData::Array.new(:type => :uint16be, :initial_length => 5)
+        arr[0].num_bytes #=> 2
+        arr.num_bytes #=> 10
+    {:ruby}
 
-== Parameters
+`#snapshot`
 
-  class PascalStringWriter < BinData::Record
-    uint8  :len, :value => lambda { data.length }
-    string :data
-  end
+:   Returns the value of this object as primitive Ruby objects
+    (numerics, strings, arrays and hashs).  The output of `#snapshot`
+    may be useful for serialization or as a reduced memory usage
+    representation.
 
-Revisiting the Pascal string writer, we see that a field can take
-parameters.  Parameters are passed as a Hash, where the key is a symbol.
-It should be noted that parameters are designed to be lazily evaluated,
-possibly multiple times.  This means that any parameter value must not have
-side effects.
+        obj = BinData::Uint8.new
+        obj.assign(3)
+        obj + 3 #=> 6
 
+        obj.snapshot #=> 3
+        obj.snapshot.class #=> Fixnum
+    {:ruby}
+
+`#offset`
+
+:   Returns the offset of this object with respect to the most distant
+    ancestor structure it is contained within.  This is most likely to
+    be used with arrays and records.
+
+        class Tuple < BinData::Record
+          int8 :a
+          int8 :b
+        end
+
+        arr = BinData::Array.new(:type => :tuple, :initial_length => 3)
+        arr[2].b.offset #=> 5
+    {:ruby}
+
+`#rel_offset`
+
+:   Returns the offset of this object with respect to the parent
+    structure it is contained within.  Compare this to `#offset`.
+
+        class Tuple < BinData::Record
+          int8 :a
+          int8 :b
+        end
+
+        arr = BinData::Array.new(:type => :tuple, :initial_length => 3)
+        arr[2].b.rel_offset #=> 1
+    {:ruby}
+
+`#inspect`
+
+:   Returns a human readable representation of this object.  This is a
+    shortcut to #snapshot.inspect.
+
+---------------------------------------------------------------------------
+
+# Records
+
+The general format of a BinData record declaration is a class containing
+one or more fields.
+
+    class MyName < BinData::Record
+      type field_name, :param1 => "foo", :param2 => bar, ...
+      ...
+    end
+{:ruby}
+
+`type`
+:   is the name of a supplied type (e.g. `uint32be`, `string`, `array`)
+    or a user defined type.  For user defined types, the class name is
+    converted from `CamelCase` to lowercased `underscore_style`.
+
+`field_name`
+:   is the name by which you can access the data.  Use either a
+    `String` or a `Symbol`.
+
+Each field may have optional *parameters* for how to process the data.
+The parameters are passed as a `Hash` with `Symbols` for keys.
+Parameters are designed to be lazily evaluated, possibly multiple times.
+This means that any parameter value must not have side effects.
+
 Here are some examples of legal values for parameters.
 
-  * :param => 5
-  * :param => lambda { 5 + 2 }
-  * :param => lambda { foo + 2 }
-  * :param => :foo
+*   `:param => 5`
+*   `:param => lambda { 5 + 2 }`
+*   `:param => lambda { foo + 2 }`
+*   `:param => :foo`
 
-The simplest case is when the value is a literal value, such as 5.
+The simplest case is when the value is a literal value, such as `5`.
 
-If the value is not a literal, it is expected to be a lambda.  The lambda
-will be evaluated in the context of the parent, in this case the parent is
-an instance of +PascalStringWriter+.
+If the value is not a literal, it is expected to be a lambda.  The
+lambda will be evaluated in the context of the parent, in this case the
+parent is an instance of `MyName`.
 
 If the value is a symbol, it is taken as syntactic sugar for a lambda
 containing the value of the symbol.
-e.g <tt>:param => :foo</tt> is <tt>:param => lambda { foo }</tt>
+e.g `:param => :foo` is `:param => lambda { foo }`
 
-== Saving Typing
+## Specifying default endian
 
-The endianess of numeric types must be explicitly defined so that the code
-produced is independent of architecture.  Explicitly specifying the
-endianess of each numeric type can become tedious, so the following
-shortcut is provided.
+The endianess of numeric types must be explicitly defined so that the
+code produced is independent of architecture.  However, explicitly
+specifying the endian for each numeric field can result in a bloated
+declaration that can be difficult to read.
 
-  class A < BinData::Record
-    endian :little
+    class A < BinData::Record
+      int16be  :a
+      int32be  :b
+      int16le  :c  # <-- Note little endian!
+      int32be  :d
+      float_be :e
+      array    :f, :type => :uint32be
+    end
+{:ruby}
 
-    uint16   :a
-    uint32   :b
-    double   :c
-    uint32be :d
-    array    :e, :type => :int16
-  end
+The `endian` keyword can be used to set the default endian.  This makes
+the declaration easier to read.  Any numeric field that doesn't use the
+default endian can explicitly override it.
 
-is equivalent to:
+    class A < BinData::Record
+      endian :big
 
-  class A < BinData::Record
-    uint16le  :a
-    uint32le  :b
-    double_le :c
-    uint32be  :d
-    array     :e, :type => :int16le
-  end
+      int16   :a
+      int32   :b
+      int16le :c   # <-- Note how this little endian now stands out
+      int32   :d
+      float   :e
+      array   :f, :type => :uint32
+    end
+{:ruby}
 
-Using the endian keyword improves the readability of the declaration as well
-as reducing the amount of typing necessary.  Note that the endian keyword will
-cascade to nested types, as illustrated with the array in the above example.
+The increase in clarity can be seen with the above example.  The
+`endian` keyword will cascade to nested types, as illustrated with the
+array in the above example.
 
-== Creating custom types
+## Optional fields
 
-Custom types should be created by subclassing BinData::Record or
-BinData::Primitive.  Ocassionally it may be useful to subclass
-BinData::BasePrimitive.  Subclassing other classes may have unexpected results
-and is unsupported.
+A record may contain optional fields.  The optional state of a field is
+decided by the `:onlyif` parameter.  If the value of this parameter is
+`false`, then the field will be as if it didn't exist in the record.
 
+    class RecordWithOptionalField < BinData::Record
+      ...
+      uint8  :comment_flag
+      string :comment, :length => 20, :onlyif => :has_comment?
+
+      def has_comment?
+        comment_flag.nonzero?
+      end
+    end
+{:ruby}
+
+In the above example, the `comment` field is only included in the record
+if the value of the `comment_flag` field is non zero.
+
+## Handling dependencies between fields
+
+A common occurence in binary file formats is one field depending upon
+the value of another.  e.g. A string preceded by its length.
+
+As an example, let's assume a Pascal style string where the byte
+preceding the string contains the string's length.
+
+    # reading
+    io = File.open(...)
+    len = io.getc
+    str = io.read(len)
+    puts "string is " + str
+
+    # writing
+    io = File.open(...)
+    str = "this is a string"
+    io.putc(str.length)
+    io.write(str)
+{:ruby}
+
+Here's how we'd implement the same example with BinData.
+
+    class PascalString < BinData::Record
+      uint8  :len,  :value => lambda { data.length }
+      string :data, :read_length => :len
+    end
+
+    # reading
+    io = File.open(...)
+    ps = PascalString.new
+    ps.read(io)
+    puts "string is " + ps.data
+
+    # writing
+    io = File.open(...)
+    ps = PascalString.new
+    ps.data = "this is a string"
+    ps.write(io)
+{:ruby}
+
+This syntax needs explaining.  Let's simplify by examining reading and
+writing separately.
+
+    class PascalStringReader < BinData::Record
+      uint8  :len
+      string :data, :read_length => :len
+    end
+{:ruby}
+
+This states that when reading the string, the initial length of the
+string (and hence the number of bytes to read) is determined by the
+value of the `len` field.
+
+Note that `:read_length => :len` is syntactic sugar for
+`:read_length => lambda { len }`, as described previously.
+
+    class PascalStringWriter < BinData::Record
+      uint8  :len, :value => lambda { data.length }
+      string :data
+    end
+{:ruby}
+
+This states that the value of `len` is always equal to the length of
+`data`.  `len` may not be manually modified.
+
+Combining these two definitions gives the definition for `PascalString`
+as previously defined.
+
+It is important to note with dependencies, that a field can only depend
+on one before it.  You can't have a string which has the characters
+first and the length afterwards.
+
+---------------------------------------------------------------------------
+
+# Primitive Types
+
+BinData provides support for the most commonly used primitive types that
+are used when working with binary data.  Namely:
+
+*   fixed size strings
+*   zero terminated strings
+*   byte based integers - signed or unsigned, big or little endian and
+    of any size
+*   bit based integers - unsigned big or little endian integers of any
+    size
+*   floating point numbers - single or double precision floats in either
+    big or little endian
+
+Primitives may be manipulated individually, but is more common to work
+with them as part of a record.
+
+Examples of individual usage:
+
+    int16 = BinData::Int16be.new
+    int16.value = 941
+    int16.to_binary_s #=> "\003\255"
+
+    fl = BinData::FloatBe.read("\100\055\370\124") #=> 2.71828174591064
+    fl.num_bytes #=> 4
+
+    fl * int16 #=> 2557.90320057996
+{:ruby}
+
+There are several parameters that are specific to primitives.
+
+`:initial_value`
+
+:   This contains the initial value that the primitive will contain
+    after initialization.  This is useful for setting default values.
+
+        obj = BinData::String.new(:initial_value => "hello ")
+        obj + "world" #=> "hello world"
+
+        obj.assign("good-bye " )
+        obj + "world" #=> "good-bye world"
+    {:ruby}
+
+`:value`
+
+:   The primitive will always contain this value.  Reading or assigning
+    will not change the value.  This parameter is used to define
+    constants or dependent fields.
+
+        pi = BinData::FloatLe.new(:value => Math::PI)
+        pi.assign(3)
+        puts pi #=> 3.14159265358979
+    {:ruby}
+
+`:check_value`
+
+:   When reading, will raise a `ValidityError` if the value read does
+    not match the value of this parameter.
+
+        obj = BinData::String.new(:check_value => lambda { /aaa/ =~ value })
+        obj.read("baaa!") #=> "baaa!"
+        obj.read("bbb") #=> raises ValidityError
+
+        obj = BinData::String.new(:check_value => "foo")
+        obj.read("foo") #=> "foo"
+        obj.read("bar") #=> raises ValidityError
+    {:ruby}
+
+## Numerics
+
+There are three kinds of numeric types that are supported by BinData.
+
+### Byte based integers
+
+These are the common integers that are used in most low level
+programming languages (C, C++, Java etc).  These integers can be signed
+or unsigned.  The endian must be specified so that the conversion is
+independent of architecture.  The bit size of these integers must be a
+multiple of 8.  Examples of byte based integers are:
+
+`uint16be`
+:   unsigned 16 bit big endian integer
+
+`int8`
+:   signed 8 bit integer
+
+`int32le`
+:   signed 32 bit little endian integer
+
+`uint40be`
+:   unsigned 40 bit big endian integer
+
+The `be` | `le` suffix may be omitted if the `endian` keyword is in use.
+
+### Bit based integers
+
+These unsigned integers are used to define bitfields in records.
+Bitfields are big endian by default but little endian may be specified
+explicitly.  Little endian bitfields are rare, but do occur in older
+file formats (e.g.  The file allocation table for FAT12 filesystems is
+stored as an array of 12bit little endian integers).
+
+An array of bit based integers will be packed according to their endian.
+
+In a record, adjacent bitfields will be packed according to their
+endian.  All other fields are byte aligned.
+
+Examples of bit based integers are:
+
+`bit1`
+:   1 bit big endian integer (may be used as boolean)
+
+`bit4_le`
+:   4 bit little endian integer
+
+`bit32`
+:   32 bit big endian integer
+
+The difference between byte and bit base integers of the same number of
+bits (e.g. `uint8` vs `bit8`) is one of alignment.
+
+This example is packed as 3 bytes
+
+    class A < BinData::Record
+      bit4  :a
+      uint8 :b
+      bit4  :c
+    end
+
+    Data is stored as: AAAA0000 BBBBBBBB CCCC0000
+{:ruby}
+
+Whereas this example is packed into only 2 bytes
+
+    class B < BinData::Record
+      bit4 :a
+      bit8 :b
+      bit4 :c
+    end
+
+    Data is stored as: AAAABBBB BBBBCCCC
+{:ruby}
+
+### Floating point numbers
+
+BinData supports 32 and 64 bit floating point numbers, in both big and
+little endian format.  These types are:
+
+`float_le`
+:   single precision 32 bit little endian float
+
+`float_be`
+:   single precision 32 bit big endian float
+
+`double_le`
+:   double precision 64 bit little endian float
+
+`double_be`
+:   double precision 64 bit big endian float
+
+The `_be` | `_le` suffix may be omitted if the `endian` keyword is in use.
+
+### Example
+
+Here is an example declaration for an Internet Protocol network packet.
+
+    class IP_PDU < BinData::Record
+      endian :big
+
+      bit4   :version, :value => 4
+      bit4   :header_length
+      uint8  :tos
+      uint16 :total_length
+      uint16 :ident
+      bit3   :flags
+      bit13  :frag_offset
+      uint8  :ttl
+      uint8  :protocol
+      uint16 :checksum
+      uint32 :src_addr
+      uint32 :dest_addr
+      string :options, :read_length => :options_length_in_bytes
+      string :data, :read_length => lambda { total_length - header_length_in_bytes }
+
+      def header_length_in_bytes
+        header_length * 4
+      end
+
+      def options_length_in_bytes
+        header_length_in_bytes - 20
+      end
+    end
+{:ruby}
+
+Three of the fields have parameters.
+*   The version field always has the value 4, as per the standard.
+*   The options field is read as a raw string, but not processed.
+*   The data field contains the payload of the packet.  Its length is
+    calculated as the total length of the packet minus the length of
+    the header.
+
+## Strings
+
+BinData supports two types of strings - fixed size and zero terminated.
+Strings are treated as a sequence of 8bit bytes.  This is the same as
+strings in Ruby 1.8.  The issue of character encoding is ignored by
+BinData.
+
+### Fixed Sized Strings
+
+Fixed sized strings may have a set length.  If an assigned value is
+shorter than this length, it will be padded to this length.  If no
+length is set, the length is taken to be the length of the assigned
+value.
+
+There are several parameters that are specific to fixed sized strings.
+
+`:read_length`
+
+:   The length to use when reading a value.
+
+        obj = BinData::String.new(:read_length => 5)
+        obj.read("abcdefghij")
+        obj.value #=> "abcde"
+    {:ruby}
+
+`:length`
+
+:   The fixed length of the string.  If a shorter string is set, it
+    will be padded to this length.  Longer strings will be truncated.
+
+        obj = BinData::String.new(:length => 6)
+        obj.read("abcdefghij")
+        obj.value #=> "abcdef"
+
+        obj = BinData::String.new(:length => 6)
+        obj.value = "abcd"
+        obj.value #=> "abcd\000\000"
+
+        obj = BinData::String.new(:length => 6)
+        obj.value = "abcdefghij"
+        obj.value #=> "abcdef"
+    {:ruby}
+
+`:pad_char`
+
+:   The character to use when padding a string to a set length.  Valid
+    values are `Integers` and `Strings` of length 1.
+    `"\0"` is the default.
+
+        obj = BinData::String.new(:length => 6, :pad_char => 'A')
+        obj.value = "abcd"
+        obj.value #=> "abcdAA"
+        obj.to_binary_s #=> "abcdAA"
+    {:ruby}
+
+`:trim_padding`
+
+:   Boolean, default `false`.  If set, the value of this string will
+    have all pad_chars trimmed from the end of the string.  The value
+    will not be trimmed when writing.
+
+        obj = BinData::String.new(:length => 6, :trim_value => true)
+        obj.value = "abcd"
+        obj.value #=> "abcd"
+        obj.to_binary_s #=> "abcd\000\000"
+    {:ruby}
+
+### Zero Terminated Strings
+
+These strings are modelled on the C style of string - a sequence of
+bytes terminated by a null (`"\0"`) character.
+
+    obj = BinData::Stringz.new
+    obj.read("abcd\000efgh")
+    obj.value #=> "abcd"
+    obj.num_bytes #=> 5
+    obj.to_binary_s #=> "abcd\000"
+{:ruby}
+
+## User Defined Primitive Types
+
+Most user defined types will be Records, but occasionally we'd like to
+create a custom type of primitive.
+
 Let us revisit the Pascal String example.
 
-  class PascalString < BinData::Record
-    uint8  :len,  :value => lambda { data.length }
-    string :data, :read_length => :len
-  end
+    class PascalString < BinData::Record
+      uint8  :len,  :value => lambda { data.length }
+      string :data, :read_length => :len
+    end
+{:ruby}
 
-We'd like to make PascalString a custom type that behaves like a
-BinData::BasePrimitive object so we can use :initial_value etc.  Here's an
-example usage of what we'd like:
+We'd like to make `PascalString` a user defined type that behaves like a
+`BinData::BasePrimitive` object so we can use `:initial_value` etc.
+Here's an example usage of what we'd like:
 
-  class Favourites < BinData::Record
-    pascal_string :language, :initial_value => "ruby"
-    pascal_string :os,       :initial_value => "unix"
-  end
+    class Favourites < BinData::Record
+      pascal_string :language, :initial_value => "ruby"
+      pascal_string :os,       :initial_value => "unix"
+    end
 
-  f = Favourites.new
-  f.os = "freebsd"
-  f.to_binary_s #=> "\004ruby\007freebsd"
+    f = Favourites.new
+    f.os = "freebsd"
+    f.to_binary_s #=> "\004ruby\007freebsd"
+{:ruby}
 
-We create this type of custom string by inheriting from BinData::Primitive
-and implementing the #get and #set methods.
+We create this type of custom string by inheriting from
+`BinData::Primitive` (instead of `BinData::Record`) and implementing the
+`#get` and `#set` methods.
 
-  class PascalString < BinData::Primitive
-    uint8  :len,  :value => lambda { data.length }
-    string :data, :read_length => :len
+    class PascalString < BinData::Primitive
+      uint8  :len,  :value => lambda { data.length }
+      string :data, :read_length => :len
 
-    def get;   self.data; end
-    def set(v) self.data = v; end
-  end
+      def get;   self.data; end
+      def set(v) self.data = v; end
+    end
+{:ruby}
 
-If the type we are creating represents a primitive value then inherit from
-BinData::Primitive, otherwise inherit from BinData::Record.
+### Advanced User Defined Primitive Types
 
-== License
+Sometimes a user defined primitive type can not easily be declaratively
+defined.  In this case you should inherit from `BinData::BasePrimitive`
+and implement the following three methods:
 
-BinData is released under the same license as Ruby.
+*   `value_to_binary_string(value)`
+*   `read_and_return_value(io)`
+*   `sensible_default()`
 
-Copyright (c) 2007 - 2009 Dion Mendel
+Here is an example of a big integer implementation.
+
+    # A custom big integer format.  Binary format is:
+    #   1 byte  : 0 for positive, non zero for negative
+    #   x bytes : Little endian stream of 7 bit bytes representing the
+    #             positive form of the integer.  The upper bit of each byte
+    #             is set when there are more bytes in the stream.
+    class BigInteger < BinData::BasePrimitive
+      def value_to_binary_string(value)
+        negative = (value < 0) ? 1 : 0
+        value = value.abs
+        bytes = [negative]
+        loop do
+          seven_bit_byte = value & 0x7f
+          value >>= 7
+          has_more = value.nonzero? ? 0x80 : 0
+          byte = has_more | seven_bit_byte
+          bytes.push(byte)
+
+          break if has_more.zero?
+        end
+
+        bytes.collect { |b| b.chr }.join
+      end
+
+      def read_and_return_value(io)
+        negative = read_uint8(io).nonzero?
+        value = 0
+        bit_shift = 0
+        loop do
+          byte = read_uint8(io)
+          has_more = byte & 0x80
+          seven_bit_byte = byte & 0x7f
+          value |= seven_bit_byte << bit_shift
+          bit_shift += 7
+
+          break if has_more.zero?
+        end
+
+        negative ? -value : value
+      end
+
+      def sensible_default
+        0
+      end
+
+      def read_uint8(io)
+        io.readbytes(1).unpack("C").at(0)
+      end
+    end
+{:ruby}
+
+---------------------------------------------------------------------------
+
+# Arrays
+
+A BinData array is a list of data objects of the same type.  It behaves
+much the same as the standard Ruby array, supporting most of the common
+methods.
+
+When instantiating an array, the type of object it contains must be
+specified.
+
+    arr = BinData::Array.new(:type => :uint8)
+    arr[3] = 5
+    arr.snapshot #=> [0, 0, 0, 5]
+{:ruby}
+
+Parameters can be passed to this object with a slightly clumsy syntax.
+
+    arr = BinData::Array.new(:type => [:uint8, {:initial_value => :index}])
+    arr[3] = 5
+    arr.snapshot #=> [0, 1, 2, 5]
+{:ruby}
+
+There are two different parameters that specify the length of the array.
+
+`:initial_length`
+
+:    Specifies the initial length of a newly instantiated array.
+     The array may grow as elements are inserted.
+
+        obj = BinData::Array.new(:type => :int8, :initial_length => 4)
+        obj.read("\002\003\004\005\006\007")
+        obj.snapshot #=> [2, 3, 4, 5]
+    {:ruby}
+
+`:read_until`
+
+:   While reading, elements are read until this condition is true.  This
+    is typically used to read an array until a sentinel value is found.
+    The variables `index`, `element` and `array` are made available to
+    any lambda assigned to this parameter.  If the value of this
+    parameter is the symbol `:eof`, then the array will read as much
+    data from the stream as possible.
+  
+        obj = BinData::Array.new(:type => :int8,
+                                 :read_until => lambda { index == 1 })
+        obj.read("\002\003\004\005\006\007")
+        obj.snapshot #=> [2, 3]
+
+        obj = BinData::Array.new(:type => :int8,
+                                 :read_until => lambda { element >= 3.5 })
+        obj.read("\002\003\004\005\006\007")
+        obj.snapshot #=> [2, 3, 4]
+
+        obj = BinData::Array.new(:type => :int8,
+                :read_until => lambda { array[index] + array[index - 1] == 9 })
+        obj.read("\002\003\004\005\006\007")
+        obj.snapshot #=> [2, 3, 4, 5]
+
+        obj = BinData::Array.new(:type => :int8, :read_until => :eof)
+        obj.read("\002\003\004\005\006\007")
+        obj.snapshot #=> [2, 3, 4, 5, 6, 7]
+    {:ruby}
+
+---------------------------------------------------------------------------
+
+# Choices
+
+A Choice is a collection of data objects of which only one is active at
+any particular time.  Method calls will be delegated to the active
+choice.  The possible types of objects that a choice contains is
+controlled by the `:choices` parameter, while the `:selection` parameter
+specifies the active choice.
+
+`:choices`
+
+:   Either an array or a hash specifying the possible data objects.  The
+    format of the array/hash.values is a list of symbols representing
+    the data object type.  If a choice is to have params passed to it,
+    then it should be provided as `[type_symbol, hash_params]`.  An
+    implementation constraint is that the hash may not contain symbols
+    as keys.
+
+`:selection`
+
+:   An index/key into the `:choices` array/hash which specifies the
+    currently active choice.
+
+`:copy_on_change`
+
+:   If set to `true`, copy the value of the previous selection to the
+    current selection whenever the selection changes.  Default is
+    `false`.
+
+Examples
+
+    type1 = [:string, {:value => "Type1"}]
+    type2 = [:string, {:value => "Type2"}]
+    
+    choices = {5 => type1, 17 => type2}
+    obj = BinData::Choice.new(:choices => choices, :selection => 5)
+    obj.value # => "Type1"
+
+    choices = [ type1, type2 ]
+    obj = BinData::Choice.new(:choices => choices, :selection => 1)
+    obj.value # => "Type2"
+
+    choices = [ nil, nil, nil, type1, nil, type2 ]
+    obj = BinData::Choice.new(:choices => choices, :selection => 3)
+    obj.value # => "Type1"
+
+    class MyNumber < BinData::Record
+      int8 :is_big_endian
+      choice :data, :choices => { true => :int32be, false => :int32le },
+                    :selection => lambda { is_big_endian != 0 },
+                    :copy_on_change => true
+    end
+
+    obj = MyNumber.new
+    obj.is_big_endian = 1
+    obj.data = 5
+    obj.to_binary_s #=> "\001\000\000\000\005"
+
+    obj.is_big_endian = 0
+    obj.to_binary_s #=> "\000\005\000\000\000"
+{:ruby}
+
+---------------------------------------------------------------------------
+
+# Advanced Topics
+
+## Wrappers
+
+Sometimes you wish to create a new type that is simply an existing type
+with some predefined parameters.  Examples could be an array with a
+specified type, or an integer with an initial value.
+
+This can be achieved with a wrapper.  A wrapper creates a new type based
+on an existing type which has predefined parameters.  These parameters
+can of course be overridden at initialisation time.
+
+Here we define an array that contains big endian 16 bit integers.  The
+array has a preferred initial length.
+
+    class IntArray < BinData::Wrapper
+      endian :big
+      array :type => :uint16, :initial_length => 5
+    end
+
+    arr = IntArray.new
+    arr.size #=> 5
+{:ruby}
+
+The initial length can be overridden at initialisation time.
+
+    arr = IntArray.new(:initial_length => 8)
+    arr.size #=> 8
+{:ruby}
+
+## Parameterizing User Defined Types
+
+All BinData types have parameters that allow the behaviour of an object
+to be specified at initialization time.  User defined types may also
+specify parameters.  There are two types of parameters: mandatory and
+default.
+
+### Mandatory Parameters
+
+Mandatory parameters must be specified when creating an instance of the
+type.  The `:type` parameter of `Array` is an example of a mandatory
+type.
+
+    class IntArray < BinData::Wrapper
+      mandatory_parameter :half_count
+
+      array :type => :uint8, :initial_length => lambda { half_count * 2 }
+    end
+
+    arr = IntArray.new
+        #=> raises ArgumentError: parameter 'half_count' must be specified in IntArray
+
+    arr = IntArray.new(:half_count => lambda { 1 + 2 })
+    arr.snapshot #=> [0, 0, 0, 0, 0, 0]
+{:ruby}
+
+### Default Parameters
+
+Default parameters are optional.  These parameters have a default value
+that may be overridden when an instance of the type is created.
+
+    class Phrase < BinData::Primitive
+      default_parameter :number => "three"
+      default_parameter :adjective => "blind"
+      default_parameter :noun => "mice"
+
+      stringz :a, :initial_value => :number
+      stringz :b, :initial_value => :adjective
+      stringz :c, :initial_value => :noun
+
+      def get; "#{a} #{b} #{c}"; end
+      def set(v)
+        if /(.*) (.*) (.*)/ =~ v
+          self.a, self.b, self.c = $1, $2, $3
+        end
+      end
+    end
+
+    obj = Phrase.new(:number => "two", :adjective => "deaf")
+    obj.to_s #=> "two deaf mice"
+{:ruby}
+
+## Debugging
+
+BinData includes several features to make it easier to debug
+declarations.
+
+### Tracing
+
+BinData has the ability to trace the results of reading a data
+structure.
+
+    class A < BinData::Record
+      int8  :a
+      bit4  :b
+      bit2  :c
+      array :d, :initial_length => 6, :type => :bit1
+    end
+
+    BinData::trace_reading do
+      A.read("\373\225\220")
+    end
+{:ruby}
+
+Results in the following being written to `STDERR`.
+
+    obj.a => -5
+    obj.b => 9
+    obj.c => 1
+    obj.d[0] => 0
+    obj.d[1] => 1
+    obj.d[2] => 1
+    obj.d[3] => 0
+    obj.d[4] => 0
+    obj.d[5] => 1
+{:ruby}
+
+### Rest
+
+The rest keyword will consume the input stream from the current position
+to the end of the stream.
+
+    class A < BinData::Record
+      string :a, :read_length => 5
+      rest   :rest
+    end
+
+    obj = A.read("abcdefghij")
+    obj.a #=> "abcde"
+    obj.rest #=" "fghij"
+{:ruby}
+
+### Hidden fields
+
+The typical way to view the contents of a BinData record is to call
+`#snapshot` or `#inspect`.  This gives all fields and their values.  The
+`hide` keyword can be used to prevent certain fields from appearing in
+this output.  This removes clutter and allows the developer to focus on
+what they are currently interested in.
+
+    class Testing < BinData::Record
+      hide :a, :b
+      string :a, :read_length => 10
+      string :b, :read_length => 10
+      string :c, :read_length => 10
+    end
+
+    obj = Testing.read(("a" * 10) + ("b" * 10) + ("c" * 10))
+    obj.snapshot #=> {"c"=>"cccccccccc"}
+    obj.to_binary_s #=> "aaaaaaaaaabbbbbbbbbbcccccccccc"
+{:ruby}
+
+---------------------------------------------------------------------------
+
+# Alternatives
+
+There are several alternatives to BinData.  Below is a comparison
+between BinData and its alternatives.
+
+The short form is that BinData is the best choice for most cases.  If
+decoding / encoding speed is very important and the binary formats are
+simple then BitStruct may be a good choice.  (Though if speed is
+important, perhaps you should investigate a language other than Ruby.)
+
+### [BitStruct](http://rubyforge.org/projects/bit-struct)
+
+BitStruct is the most complete of all the alternatives.  It is
+declarative and supports all the same primitive types as BinData.  In
+addition it includes a self documenting feature to make it easy to write
+reports.
+
+The major limitation of BitStruct is that it does not support variable
+length fields and dependent fields.  The simple PascalString example
+used previously is not possible with BitStruct.  This limitation is due
+to the design choice to favour speed over flexibility.
+
+Most non trivial file formats rely on dependent and variable length
+fields.  It is difficult to use BitStruct with these formats as code
+must be written to explicitly handle the dependencies.
+
+BitStruct does not currently support little endian bit fields, or
+bitfields that span more than 2 bytes.  BitStruct is actively maintained
+so these limitations may be removed in a future release.
+
+If speed is important and you are only dealing with simple binary data
+types then BitStruct is a good choice.  For non trivial data types,
+BinData is the better choice.
+
+### [BinaryParse](http://rubyforge.org/projects/binaryparse)
+
+BinaryParse is a declarative style packer / unpacker.  It provides the
+same primitives as Ruby's `#pack`, with the addition of date and time.
+Like BitStruct, it doesn't provide dependent or variable length fields.
+
+### [BinStruct](http://rubyforge.org/projects/metafuzz)
+
+BinStruct is an imperative approach to unpacking binary data.  It does
+provide some declarative style syntax sugar.  It provides support for
+the most common primitive types, as well as arbitrary length bitfields.
+
+It's main focus is as a binary fuzzer, rather than as a generic decoding
+/ encoding library.
+
+### [Packable](http://github.com/marcandre/packable/tree/master)
+
+Packable makes it much nicer to use Ruby's `#pack` and `#unpack`
+methods.  Instead of having to remember that, for example `"n"` is the
+code to pack a 16 bit big endian integer, packable provides many
+convenient shortcuts.  In the case of `"n"`, `{:bytes => 2, :endian => :big}`
+may be used instead.
+
+Using Packable improves the readability of `#pack` and `#unpack`
+methods, but explicitly calls to `#pack` and `#unpack` aren't as
+readable as a declarative approach.
+
+### [Bitpack](http://rubyforge.org/projects/bitpack)
+
+Bitpack provides methods to extract big endian integers of arbitrary bit
+length from an octet stream.
+
+The extraction code is written in `C`, so if speed is important and bit
+manipulation is all the functionality you require then this may be an
+alternative.
+
+---------------------------------------------------------------------------