manual.md in bindata-1.3.1 vs manual.md in bindata-1.4.0

- old
+ new

@@ -38,10 +38,12 @@
 manipulating. It supports all the common datatypes that are found in
 structured binary data. Support for dependent and variable length
 fields is built in.

+Last updated: 2011-06-14
+
 ## License

 BinData is released under the same license as Ruby.

 Copyright &copy; 2007 - 2011 [Dion Mendel](mailto:dion@lostrealm.com)
@@ -696,13 +698,15 @@
 the header.

 ## Strings

 BinData supports two types of strings - fixed size and zero terminated.
-Strings are treated as a sequence of 8bit bytes. This is the same as
-strings in Ruby 1.8. The issue of character encoding is ignored by
-BinData.
+Strings are treated internally as a sequence of 8bit bytes. This is the
+same as strings in Ruby 1.8. BinData fully supports Ruby 1.9 string
+encodings. See this [FAQ
+entry](#im_using_ruby_19_how_do_i_use_string_encodings_with_bindata) for
+details.

 ### Fixed Sized Strings

 Fixed sized strings may have a set length. If an assigned value is
 shorter than this length, it will be padded to this length. If no
@@ -832,19 +836,21 @@
 `def sensible_default()`
 : The ruby value that a clear object should return.

+If you wish to access parameters from inside these methods, you can
+use `eval_parameter(key)`.
+
 Here is an example of a big integer implementation.

     # A custom big integer format. Binary format is:
     #   1 byte  : 0 for positive, non zero for negative
     #   x bytes : Little endian stream of 7 bit bytes representing the
     #             positive form of the integer. The upper bit of each byte
     #             is set when there are more bytes in the stream.
     class BigInteger < BinData::BasePrimitive
-      register_self

       def value_to_binary_string(value)
         negative = (value < 0) ? 1 : 0
         value = value.abs
         bytes = [negative]
@@ -1103,10 +1109,173 @@

 ---------------------------------------------------------------------------

 # Advanced Topics

+## Debugging
+
+BinData includes several features to make it easier to debug
+declarations.
+
+### Tracing
+
+BinData has the ability to trace the results of reading a data
+structure.
+
+    class A < BinData::Record
+      int8 :a
+      bit4 :b
+      bit2 :c
+      array :d, :initial_length => 6, :type => :bit1
+    end
+
+    BinData::trace_reading do
+      A.read("\373\225\220")
+    end
+{:ruby}
+
+Results in the following being written to `STDERR`.
+
+    obj.a => -5
+    obj.b => 9
+    obj.c => 1
+    obj.d[0] => 0
+    obj.d[1] => 1
+    obj.d[2] => 1
+    obj.d[3] => 0
+    obj.d[4] => 0
+    obj.d[5] => 1
+{:ruby}
+
+### Rest
+
+The rest keyword will consume the input stream from the current position
+to the end of the stream.
+
+    class A < BinData::Record
+      string :a, :read_length => 5
+      rest :rest
+    end
+
+    obj = A.read("abcdefghij")
+    obj.a #=> "abcde"
+    obj.rest #=> "fghij"
+{:ruby}
+
+### Hidden fields
+
+The typical way to view the contents of a BinData record is to call
+`#snapshot` or `#inspect`. This gives all fields and their values. The
+`hide` keyword can be used to prevent certain fields from appearing in
+this output. This removes clutter and allows the developer to focus on
+what they are currently interested in.
+
+    class Testing < BinData::Record
+      hide :a, :b
+      string :a, :read_length => 10
+      string :b, :read_length => 10
+      string :c, :read_length => 10
+    end
+
+    obj = Testing.read(("a" * 10) + ("b" * 10) + ("c" * 10))
+    obj.snapshot #=> {"c"=>"cccccccccc"}
+    obj.to_binary_s #=> "aaaaaaaaaabbbbbbbbbbcccccccccc"
+{:ruby}
+
+## Parameterizing User Defined Types
+
+All BinData types have parameters that allow the behaviour of an object
+to be specified at initialization time. User defined types may also
+specify parameters. There are two types of parameters: mandatory and
+default.
+
+### Mandatory Parameters
+
+Mandatory parameters must be specified when creating an instance of the
+type.
+
+    class Polygon < BinData::Record
+      mandatory_parameter :num_edges
+
+      uint8 :num, :value => lambda { vertices.length }
+      array :vertices, :initial_length => :num_edges do
+        int8 :x
+        int8 :y
+      end
+    end
+
+    triangle = Polygon.new
+      #=> raises ArgumentError: parameter 'num_edges' must be specified in Polygon
+
+    triangle = Polygon.new(:num_edges => 3)
+    triangle.snapshot #=> {"num" => 3, "vertices" =>
+                            [{"x"=>0, "y"=>0}, {"x"=>0, "y"=>0}, {"x"=>0, "y"=>0}]}
+{:ruby}
+
+### Default Parameters
+
+Default parameters are optional. These parameters have a default value
+that may be overridden when an instance of the type is created.
+
+    class Phrase < BinData::Primitive
+      default_parameter :number => "three"
+      default_parameter :adjective => "blind"
+      default_parameter :noun => "mice"
+
+      stringz :a, :initial_value => :number
+      stringz :b, :initial_value => :adjective
+      stringz :c, :initial_value => :noun
+
+      def get; "#{a} #{b} #{c}"; end
+      def set(v)
+        if /(.*) (.*) (.*)/ =~ v
+          self.a, self.b, self.c = $1, $2, $3
+        end
+      end
+    end
+
+    obj = Phrase.new(:number => "two", :adjective => "deaf")
+    obj.to_s #=> "two deaf mice"
+{:ruby}
+
+## Extending existing Types
+
+Sometimes you wish to create a new type that is simply an existing type
+with some predefined parameters. Examples could be an array with a
+specified type, or an integer with an initial value.
+
+This can be achieved by subclassing the existing type and providing
+default parameters. These parameters can of course be overridden at
+initialisation time.
+
+Here we define an array that contains big endian 16 bit integers. The
+array has a preferred initial length.
+
+    class IntArray < BinData::Array
+      default_parameters :type => :uint16be, :initial_length => 5
+    end
+
+    arr = IntArray.new
+    arr.size #=> 5
+{:ruby}
+
+The initial length can be overridden at initialisation time.
+
+    arr = IntArray.new(:initial_length => 8)
+    arr.size #=> 8
+{:ruby}
+
+We can also use the block form syntax:
+
+    class IntArray < BinData::Array
+      endian :big
+      default_parameter :initial_length => 5
+
+      uint16
+    end
+{:ruby}
+
 ## Skipping over unused data

 Some structures contain binary data that is irrelevant to your purposes.

 Say you are interested in 50 bytes of data located 10 megabytes into the
@@ -1174,158 +1343,89 @@

     c = CrazyAlignment.read("\xff" * 10)
     c.to_binary_s #=> "\377\377\377\377\377"
 {:ruby}

-## Wrappers
+---------------------------------------------------------------------------

-Sometimes you wish to create a new type that is simply an existing type
-with some predefined parameters. Examples could be an array with a
-specified type, or an integer with an initial value.
+# FAQ

-This can be achieved with a wrapper. A wrapper creates a new type based
-on an existing type which has predefined parameters. These parameters
-can of course be overridden at initialisation time.
+## I'm using Ruby 1.9. How do I use string encodings with BinData?

-Here we define an array that contains big endian 16 bit integers. The
-array has a preferred initial length.
+BinData will internally use 8bit binary strings to represent the data.
+You do not need to worry about converting between encodings.

-    class IntArray < BinData::Wrapper
-      endian :big
-      array :type => :uint16, :initial_length => 5
+If you wish BinData to present string data in a specific encoding, you
+can override `#snapshot` as illustrated below:
+
+    class UTF8String < BinData::String
+      def snapshot
+        super.force_encoding('UTF-8')
+      end
     end

-    arr = IntArray.new
-    arr.size #=> 5
+    str = UTF8String.new("\xC3\x85\xC3\x84\xC3\x96")
+    str #=> "ÅÄÖ"
+    str.to_binary_s #=> "\xC3\x85\xC3\x84\xC3\x96"
 {:ruby}

-The initial length can be overridden at initialisation time.
+## How do I speed up initialization?

-    arr = IntArray.new(:initial_length => 8)
-    arr.size #=> 8
-{:ruby}
+I'm doing this and it's slow.

-## Parameterizing User Defined Types
-
-All BinData types have parameters that allow the behaviour of an object
-to be specified at initialization time. User defined types may also
-specify parameters. There are two types of parameters: mandatory and
-default.
-
-### Mandatory Parameters
-
-Mandatory parameters must be specified when creating an instance of the
-type. The `:type` parameter of `Array` is an example of a mandatory
-type.
-
-    class IntArray < BinData::Wrapper
-      mandatory_parameter :byte_count
-
-      array :type => :uint16be, :initial_length => lambda { byte_count / 2 }
+    999.times do |i|
+      foo = Foo.new(:bar => "baz")
+      ...
     end
-
-    arr = IntArray.new
-      #=> raises ArgumentError: parameter 'byte_count' must be specified in IntArray
-
-    arr = IntArray.new(:byte_count => 12)
-    arr.snapshot #=> [0, 0, 0, 0, 0, 0]
 {:ruby}

-### Default Parameters
+BinData is optimized to be declarative. For imperative use, the
+above naïve approach will be slow. Below are faster alternatives.

-Default parameters are optional. These parameters have a default value
-that may be overridden when an instance of the type is created.
+The fastest approach is to reuse objects by calling `#clear` instead of
+instantiating more objects.

-    class Phrase < BinData::Primitive
-      default_parameter :number => "three"
-      default_parameter :adjective => "blind"
-      default_parameter :noun => "mice"
-
-      stringz :a, :initial_value => :number
-      stringz :b, :initial_value => :adjective
-      stringz :c, :initial_value => :noun
-
-      def get; "#{a} #{b} #{c}"; end
-      def set(v)
-        if /(.*) (.*) (.*)/ =~ v
-          self.a, self.b, self.c = $1, $2, $3
-        end
-      end
+    foo = Foo.new(:bar => "baz")
+    999.times do
+      foo.clear
+      ...
     end
-
-    obj = Phrase.new(:number => "two", :adjective => "deaf")
-    obj.to_s #=> "two deaf mice"
 {:ruby}

-## Debugging
+If you can't reuse objects, then consider the prototype pattern.

-BinData includes several features to make it easier to debug
-declarations.
-
-### Tracing
-
-BinData has the ability to trace the results of reading a data
-structure.
-
-    class A < BinData::Record
-      int8 :a
-      bit4 :b
-      bit2 :c
-      array :d, :initial_length => 6, :type => :bit1
+    prototype = Foo.new(:bar => "baz")
+    999.times do
+      foo = prototype.new
+      ...
     end
-
-    BinData::trace_reading do
-      A.read("\373\225\220")
-    end
 {:ruby}

-Results in the following being written to `STDERR`.
+The preferred approach is to be declarative.

-    obj.a => -5
-    obj.b => 9
-    obj.c => 1
-    obj.d[0] => 0
-    obj.d[1] => 1
-    obj.d[2] => 1
-    obj.d[3] => 0
-    obj.d[4] => 0
-    obj.d[5] => 1
-{:ruby}
+    class FooList < BinData::Array
+      default_parameter :initial_length => 999

-### Rest
-
-The rest keyword will consume the input stream from the current position
-to the end of the stream.
-
-    class A < BinData::Record
-      string :a, :read_length => 5
-      rest :rest
+      foo :bar => "baz"
     end

-    obj = A.read("abcdefghij")
-    obj.a #=> "abcde"
-    obj.rest #=> "fghij"
+    array = FooList.new
+    array.each { ... }
 {:ruby}

-### Hidden fields
+## How do I model this complex nested format?

-The typical way to view the contents of a BinData record is to call
-`#snapshot` or `#inspect`. This gives all fields and their values. The
-`hide` keyword can be used to prevent certain fields from appearing in
-this output. This removes clutter and allows the developer to focus on
-what they are currently interested in.
+A common pattern in file formats and network protocols is
+[type-length-value](http://en.wikipedia.org/wiki/Type-length-value). The
+`type` field specifies how to interpret the `value`. This gives a way to
+dynamically structure the data format. An example is the TCP/IP protocol
+suite. An IP datagram can contain a nested TCP, UDP or other packet type as
+decided by the `protocol` field.

-    class Testing < BinData::Record
-      hide :a, :b
-      string :a, :read_length => 10
-      string :b, :read_length => 10
-      string :c, :read_length => 10
-    end
-
-    obj = Testing.read(("a" * 10) + ("b" * 10) + ("c" * 10))
-    obj.snapshot #=> {"c"=>"cccccccccc"}
-    obj.to_binary_s #=> "aaaaaaaaaabbbbbbbbbbcccccccccc"
-{:ruby}
+Modelling this structure can be difficult when the nesting is recursive, e.g.
+IP tunneling. Here is an example of the simplest possible recursive TLV structure,
+a [list that can contain atoms or other
+lists](http://bindata.rubyforge.org/svn/trunk/examples/list.rb).

 ---------------------------------------------------------------------------

 # Alternatives