README.md in yoga-0.2.0 vs README.md in yoga-0.2.1

- old
+ new
@@ -1,36 +1,258 @@
 # Yoga
+[![Build Status][build-status]][build-status-link] [![Coverage Status][coverage-status]][coverage-status-link]
 
-Welcome to your new gem! In this directory, you'll find the files you need to be able to package up your Ruby library into a gem. Put your Ruby code in the file `lib/yoga`. To experiment with that code, run `bin/console` for an interactive prompt.
+A helper for your Ruby parsers.  This adds helper methods to make parsing
+(and scanning!) easier and more structured.  If you're looking for an LALR
+parser generator, that isn't this.  This is designed to help you construct
+Recursive Descent parsers - which are solely LL(k).  If you want an LALR parser
+generator, see [_Antelope_](https://github.com/medcat/antelope) or
+[Bison](https://www.gnu.org/software/bison/).
 
-TODO: Delete this and the text above, and describe your gem
+Yoga requires [Mixture](https://github.com/medcat/mixture) for parser node
+attributes.  However, the use of the parser nodes included with Yoga is
+completely optional.
 
 ## Installation
 
 Add this line to your application's Gemfile:
 
 ```ruby
-gem 'yoga'
+gem "yoga"
 ```
 
 And then execute:
 
     $ bundle
 
-Or install it yourself as:
+## Usage
 
-    $ gem install yoga
+To begin your parser, you will first have to create a scanner.  A scanner
+takes the source text and generates "tokens."  These tokens are abstract
+representations of the source text of the document.  For example, for the
+text `class A do`, you could have the tokens `:class`, `:CNAME`, and `:do`.
+The actual names of the tokens are completely up to you.  These token names
+are later used in the parser to set up expectations - for example, for the
+definition of a class, you could expect a `:class`, `:CNAME`, and a `:do`
+token.
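As a rough illustration of the tokenizing step described above, here is a minimal stand-alone sketch that uses only Ruby's standard `StringScanner`. It does not use Yoga; the `Token` struct and the `tokenize` method are hypothetical names chosen for this example, and the token kinds simply mirror the `class A do` example.

```ruby
require "strscan"

# Hypothetical token type for illustration; not Yoga's Token class.
Token = Struct.new(:kind, :value)

# Splits the source into tokens, skipping whitespace.
def tokenize(source)
  scanner = StringScanner.new(source)
  tokens = []
  until scanner.eos?
    if scanner.skip(/\s+/)
      next
    elsif (text = scanner.scan(/class\b/))
      tokens << Token.new(:class, text)
    elsif (text = scanner.scan(/do\b/))
      tokens << Token.new(:do, text)
    elsif (text = scanner.scan(/[A-Z][A-Za-z0-9_]*/))
      tokens << Token.new(:CNAME, text)
    else
      raise ArgumentError, "unexpected input at offset #{scanner.pos}"
    end
  end
  tokens
end

kinds = tokenize("class A do").map(&:kind)
```

Here `kinds` holds the three token names from the example text, in order. A parser would then set its expectations against these names rather than against the raw characters.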
 
-## Usage
+Essentially, the scanner breaks up the text into usable, bite-sized pieces
+for the parser to chomp on.  Here's what a scanner may look like:
 
-TODO: Write usage instructions here
+```ruby
+module MyLanguage
+  class Scanner
+    # All of the behavior from Yoga for scanners.  This provides the
+    # `match/2` method, the `call/0` method, the `match_line/1` method,
+    # the `location/1` method, and the `emit/2` method.  The major ones that
+    # are used are the `match/2`, the `call/0`, and the `match_line/1`
+    # methods.
+    include Yoga::Scanner
 
-## Development
+    # This must be implemented.  This is called for the next token.  This
+    # should only return a Token, or true.
+    def scan
+      # Match with a string value escapes the string, then turns it into a
+      # regular expression.
+      match("[") || match("]") ||
+      # Match with a symbol escapes the symbol, and turns it into a regular
+      # expression, suffixing it with `symbol_negative_assertion`.  This is
+      # to prevent issues with identifiers and keywords.
+      match(:class) || match(:func) ||
+      # With a regular expression, it's matched exactly.  However, a token
+      # name is highly recommended.
+      match(/[a-z][a-zA-Z0-9_]*[!?=]?/, :IDENT)
+    end
+  end
+end
+```
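The comment on symbol matches mentions a negative assertion appended to keywords. The plain-Ruby sketch below shows why such an assertion helps. `keyword_pattern` is a hypothetical helper, and the character class in the lookahead is an assumption, not necessarily what Yoga's `symbol_negative_assertion` contains.

```ruby
# Hypothetical helper: turns a symbol into a keyword pattern that refuses
# to match when an identifier character follows immediately.
def keyword_pattern(name)
  /#{Regexp.escape(name.to_s)}(?![A-Za-z0-9_])/
end

naive = /class/
guarded = keyword_pattern(:class)

# Anchor each pattern at the start of the input, as a scanner would.
naive_hit = "classify".match?(/\A#{naive}/)      # keyword wrongly found
guarded_hit = "classify".match?(/\A#{guarded}/)  # identifier left alone
keyword_hit = "class alpha".match?(/\A#{guarded}/)
```

Without the lookahead, the identifier `classify` would be split into a `:class` keyword followed by `ify`. With it, the keyword only matches when it stands on its own.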
 
-After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
+And that's it!  You now have a fully functioning scanner.  In order to use it,
+all you have to do is this:
 
-To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
+```ruby
+source = "class alpha [func a []]"
+MyLanguage::Scanner.new(source).call # => #<Enumerable ...>
+```
 
+Note that `Scanner#call` returns an enumerable.  `#call` is aliased as `#each`.
+What this means is that tokens aren't generated until they're requested by the
+parser - each token is generated from the source incrementally.  If you want
+to retrieve all of the tokens immediately, you have to first convert it into
+an array, or perform some other operation on the enumerable (since it isn't
+lazy):
+
+```ruby
+MyLanguage::Scanner.new(source).call.to_a # => [...]
+```
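The incremental behaviour can be seen with a plain Ruby `Enumerator`. This sketch assumes the value returned by `Scanner#call` behaves like a standard enumerator, which is not confirmed here; the token names and the `produced` log are made up for the illustration.

```ruby
produced = []

# Hypothetical token source that records each token as it is generated.
tokens = Enumerator.new do |yielder|
  %i[class IDENT EOF].each do |kind|
    produced << kind
    yielder << kind
  end
end

# Requesting one token generates only that token.
tokens.next
after_next = produced.dup

# Converting to an array runs the whole enumeration from the start.
produced.clear
all = tokens.to_a
```

After the single `next` call only one token has been generated, while `to_a` forces every token to be produced at once.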
+
+The scanner also automatically adds location information to all of the tokens.
+This is handled automatically by `match/2` and `emit/2` - the only issue being
+that all regular expressions **must not** include a newline.  Newlines should
+be matched with `match_line/1`; if lines must be emitted as a token, you can
+pass the kind of token to emit to `match_line/1` using the `kind:` keyword.
+
+You may notice that all of the tokens have `<anon>` set as the location's file.
+This is the default location, which is provided to the initializer:
+
+```ruby
+MyLanguage::Scanner.new(source, "foo").call.first.location.to_s # => "foo:1.1-6"
+```
+
+Parsers are a little bit more complicated.  Before we can pull up the parser,
+let's define a grammar and some node classes.
+
+```
+; This is the grammar.
+<root> = *<statement>
+<statement> = <expression> ';'
+<expression> = <expression> <op> <expression>
+<expression> /= <int> ; here, <int> is defined by the scanner.
+<op> = '+' / '-' / '*' / '/' / '^' / '%' / '='
+```
+
+```ruby
+module MyLanguage
+  class Parser
+    class Root < Yoga::Node
+      # An attribute on the node.  This is required for Yoga nodes since the
+      # update syntax requires them.  The type for the attribute is optional.
+      attribute :statements, type: [Yoga::Node]
+    end
+
+    class Expression < Yoga::Node
+    end
+
+    class Operation < Expression
+      attribute :operator, type: ::Symbol
+      attribute :left, type: Expression
+      attribute :right, type: Expression
+    end
+
+    class Literal < Expression
+      attribute :value, type: ::Integer
+    end
+  end
+end
+```
+
+With those out of the way, let's take a look at the parser itself.
+
+```ruby
+module MyLanguage
+  class Parser
+    # This provides all of the parser helpers.  This is the same as adding
+    # `Yoga::Parser::Helpers` as an include statement as well.
+    include Yoga::Parser
+
+    # Like the `scan/0` method on the scanner, this must be implemented.  This
+    # is the entry point for the parser.  However, public usage should use the
+    # `call/0` method.  This should return a node of some sort.
+    def parse_root
+      # This "collects" a series of nodes in sequence.  It iterates until it
+      # reaches the `:EOF` token (in this case).  The first parameter to
+      # collect is the "terminating token," and can be any value that
+      # `expect/1` or `peek?/1` accepts.  The second, optional parameter to
+      # collect is the "joining token," and is required between each node.
+      # We're not using the semicolon as a joining token because that is
+      # required for _all_ statements.  The joining token can be used for
+      # things like argument lists.  The parameter can be any value that
+      # `expect/1` or `peek?/1` accepts.
+      children = collect(:EOF) { parse_statement }
+
+      # "Unions" the location of all of the statements in the list.
+      location = children.map(&:location).inject(:union)
+      Parser::Root.new(statements: children, location: location)
+    end
+
+    # Parses a statement.  This is the same as the <statement> rule as above.
+    def parse_statement
+      expression = parse_expression
+      # This says that the next token should be a semicolon.  If the next token
+      # isn't, it raises an error with a detailed error message, denoting
+      # what was expected (in this case, a semicolon), what was given, and
+      # where the error was located in the source file.
+      expect(:";")
+
+      expression
+    end
+
+
+    # A switch statement, essentially.  This is defined beforehand to make it
+    # _faster_ (not really; it's just useful).  The first parameter to the
+    # switch function is the name of the switch.  This is used later to
+    # actually perform the switch; it is also used to define a first set with
+    # the allowed tokens for the switch.  The second parameter is a hash of
+    # key-value pairs.  The keys are the tokens that are allowed; a symbol or an
+    # array of symbols can be used.  The value is the block or the method that
+    # is executed upon encountering that token.
+    switch(:Operation,
+      "=": proc { |left| parse_operation(:"=", left) },
+      "+": proc { |left| parse_operation(:"+", left) },
+      "-": proc { |left| parse_operation(:"-", left) },
+      "*": proc { |left| parse_operation(:"*", left) },
+      "/": proc { |left| parse_operation(:"/", left) },
+      "^": proc { |left| parse_operation(:"^", left) },
+      "%": proc { |left| parse_operation(:"%", left) })
+
+    def parse_expression
+      # Parse a literal.  All expressions must contain a literal of some sort;
+      # we're just going to use a numeric literal here.
+      left = parse_expression_literal
+
+      # Whenever the `.switch` function is called, it creates a
+      # "first set" that can be used like this.  The first set consists of
+      # a set of tokens that are allowed for the switch statement.  In this
+      # case, it just makes sure that the next token is an operator.  If it
+      # is, it parses it as an operation.
+      if peek?(first(:Operation))
+        # Uses the switch defined above.  If a token is found as a key, its
+        # block is executed; otherwise, it errors, giving a detailed error of
+        # what was expected.
+        switch(:Operation, left)
+      else
+        left
+      end
+    end
+
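+    def parse_operation(op, left)
+      # Consume the operator token; its location is part of the node's span.
+      token = expect(op)
+      right = parse_expression
+
+      # The attribute on Operation is named `operator`, and the location
+      # comes from the consumed token rather than from the symbol `op`.
+      Parser::Operation.new(left: left, operator: op, right: right, location:
+        left.location | token.location | right.location)
+    end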
+
+    def parse_expression_literal
+      token = expect(:NUMERIC)
+      Parser::Literal.new(value: token.value, location: token.location)
+    end
+  end
+end
+```
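The peek-then-expect shape of the parser above can be sketched without Yoga. Everything in this block is hypothetical and for illustration only: `TinyParser`, its `peek` and `expect` methods, and the `[kind, value]` token pairs are not Yoga's API, and nodes are represented as plain arrays.

```ruby
# Hypothetical stand-alone recursive descent parser; not Yoga's API.
class TinyParser
  # Plays the role of the "first set" for an operation.
  OPERATORS = %i[+ - * /].freeze

  def initialize(tokens)
    @tokens = tokens.dup
  end

  # <statement> = <expression> ';'
  def parse_statement
    expression = parse_expression
    expect(:";")
    expression
  end

  # A literal, optionally followed by an operator and another expression.
  def parse_expression
    left = expect(:NUMERIC).last
    return left unless OPERATORS.include?(peek)

    operator = expect(peek).first
    [operator, left, parse_expression]
  end

  private

  # Kind of the next token, or nil at the end of input.
  def peek
    token = @tokens.first
    token && token.first
  end

  # Consumes the next token, raising if it is not of the expected kind.
  def expect(kind)
    token = @tokens.shift
    return token if token && token.first == kind

    found = token && token.first
    raise ArgumentError, "expected #{kind.inspect}, got #{found.inspect}"
  end
end

tokens = [[:NUMERIC, 1], [:+, "+"], [:NUMERIC, 2], [:*, "*"],
          [:NUMERIC, 3], [:";", ";"]]
tree = TinyParser.new(tokens).parse_statement
```

Because the right-hand side is parsed by recursing into `parse_expression`, this shape nests operations to the right and applies no operator precedence. A grammar that needs precedence or left associativity would have to add that on top.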
+
+This parser can then be used as such:
+
+```ruby
+source = "a = 2;\nb = a + 2;\n"
+scanner = MyLanguage::Scanner.new(source).call
+MyLanguage::Parser.new(scanner).call # => #<MyLanguage::Parser::Root ...>
+```
+
+That's about it!  If you have any questions, you can email me at
+<jeremy.rodi@medcat.me>, open an issue, or do what you like.
+
+For more documentation, see [the Documentation][documentation] - Yoga has a
+requirement of 100% documentation.
+
 ## Contributing
 
-Bug reports and pull requests are welcome on GitHub at https://github.com/[USERNAME]/yoga. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
+Bug reports and pull requests are welcome on GitHub at
+<https://github.com/medcat/yoga>. This project is intended to be a safe,
+welcoming space for collaboration, and contributors are expected to adhere to
+the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
 
+[build-status]: https://travis-ci.org/medcat/yoga.svg?branch=master
+[documentation]: http://www.rubydoc.info/github/medcat/yoga/master
+[coverage-status]: https://coveralls.io/repos/github/medcat/yoga/badge.svg?branch=master
+[build-status-link]: https://travis-ci.org/medcat/yoga
+[coverage-status-link]: https://coveralls.io/github/medcat/yoga?branch=master