#Using Treetop Grammars in Ruby

##Using the Command Line Compiler
You can compile `.treetop` files into Ruby source code with the `tt` command line script. `tt` takes a list of files with a `.treetop` extension and compiles them into `.rb` files of the same name. You can then `require` these files like any other Ruby script. Alternatively, you can supply just one `.treetop` file and a `-o` flag to specify the name of the output file. Improvements to this compilation script are welcome.

    tt foo.treetop bar.treetop
    tt foo.treetop -o foogrammar.rb

##Loading A Grammar Directly
The Polyglot gem makes it possible to load `.treetop` or `.tt` files directly with `require`. This will invoke `Treetop.load`, which automatically compiles the grammar to Ruby and then evaluates the Ruby source. If you are getting errors in methods you define on the syntax tree, try using the command line compiler for better stack trace feedback. A better solution to this issue is in the works.

To use Polyglot's dynamic loading of `.treetop` or `.tt` files, you must require the Polyglot gem before you require the Treetop gem, because Treetop only creates its hooks into Polyglot if Polyglot is already loaded. So you need to use:

    require 'polyglot'
    require 'treetop'

in order to use Polyglot auto loading with Treetop in Ruby.

##Instantiating and Using Parsers
If a grammar by the name of `Foo` is defined, the compiled Ruby source will define a `FooParser` class. To parse input, create an instance and call its `parse` method with a string. The parser will return the syntax tree of the match, or `nil` if there is a failure. Note that by default, the parser will fail unless *all* input is consumed.

    Treetop.load "arithmetic"

    parser = ArithmeticParser.new
    if parser.parse('1+1')
      puts 'success'
    else
      puts 'failure'
    end

##Parser Options
A Treetop parser has several options you may set.
Some are settable permanently by methods on the parser, but all may be passed in as options to the `parse` method.

    parser = ArithmeticParser.new
    input = 'x = 2; y = x+3;'

    # Temporarily override an option:
    result1 = parser.parse(input, :consume_all_input => false)
    puts "consumed #{parser.index} characters"

    # Permanently change the same option:
    parser.consume_all_input = false
    result1 = parser.parse(input)
    puts "consumed #{parser.index} characters"

    # Continue the parse with the next character:
    result2 = parser.parse(input, :index => parser.index)

    # Parse, but match rule `variable` instead of the normal root rule:
    parser.parse(input, :root => :variable)
    parser.root = :variable # Permanent setting

If you have a statement-oriented language, you can save memory by parsing just one statement at a time, and discarding the parse tree after each statement.

##Learning From Failure
If a parse fails, it returns `nil`. In this case, you can ask the parser for an explanation. The failure reasons include the terminal nodes which were tried at the furthermost point the parse reached.

    parser = ArithmeticParser.new
    result = parser.parse('4+=3')

    if !result
      puts parser.failure_reason
      puts parser.failure_line
      puts parser.failure_column
    end

    =>
    Expected one of (, - at line 1, column 3 (byte 3) after +
    1
    3

##Using Parse Results
Please don't try to walk down the syntax tree yourself, and please don't use the tree as your own convenient data structure. It contains many more nodes than your application needs, often even more than one for every character of input.

    parser = ArithmeticParser.new
    p parser.parse('2+3')

    => SyntaxNode+Additive1 offset=0, "2+3" (multitive):
      SyntaxNode+Multitive1 offset=0, "2" (primary):
        SyntaxNode+Number0 offset=0, "2":
          SyntaxNode offset=0, ""
          SyntaxNode offset=0, "2"
          SyntaxNode offset=1, ""
        SyntaxNode offset=1, ""
      SyntaxNode offset=1, "+3":
        SyntaxNode+Additive0 offset=1, "+3" (multitive):
          SyntaxNode offset=1, "+"
          SyntaxNode+Multitive1 offset=2, "3" (primary):
            SyntaxNode+Number0 offset=2, "3":
              SyntaxNode offset=2, ""
              SyntaxNode offset=2, "3"
              SyntaxNode offset=3, ""
            SyntaxNode offset=3, ""

Instead, add methods to the root rule which return the information you require in a sensible form. Each rule can call its sub-rules, and this method of walking the syntax tree is much preferable to attempting to walk it from the outside.