Lexical analysis is performed by obtaining a tokenizer of the appropriate class and calling @tokenize@ on it, passing the text to be tokenized. Each token is yielded to the associated block as it is discovered.

{{{lang=ruby,number=true,caption=Tokenizing a Ruby script
require 'syntax'

tokenizer = Syntax.load "ruby"
tokenizer.tokenize( File.read( "program.rb" ) ) do |token|
  puts token
  puts " group: #{token.group}"
  puts " instruction: #{token.instruction}"
end
}}}

If you need finer control over the process, you can use the lower-level API:

{{{lang=ruby,number=true,caption=Tokenizing a Ruby script via step
require 'syntax'

tokenizer = Syntax.load "ruby"
tokenizer.start( File.read( "program.rb" ) ) do |token|
  puts token
  puts " group: #{token.group}"
  puts " instruction: #{token.instruction}"
end

tokenizer.step
tokenizer.step
...
tokenizer.finish
}}}

In this case, each time @#step@ is invoked, tokens are consumed and yielded to the block. However, a single step may result in multiple tokens being detected and yielded; there is no way to guarantee a single token at a time, unless the corresponding syntax module was written to work that way. For efficiency, the existing modules will yield multiple tokens when processing (for instance) strings, regular expressions, and heredocs.
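To see this behavior for yourself, the sketch below counts how many tokens each call to @#step@ produces; it uses only the calls shown above (@Syntax.load@, @#start@, @#step@, @#finish@). The file name program.rb is carried over from the earlier examples, and the fixed count of three steps is an arbitrary choice for illustration, assuming the input has not been exhausted by then.

{{{lang=ruby,number=true,caption=Counting tokens yielded per step (illustrative sketch)
require 'syntax'

tokenizer = Syntax.load "ruby"

yielded = 0
tokenizer.start( File.read( "program.rb" ) ) do |token|
  yielded += 1
end

# Each call to #step consumes input and may yield several tokens to the
# block; comparing the counter before and after shows how many it produced.
3.times do |i|
  before = yielded
  tokenizer.step
  puts "step #{i + 1}: #{yielded - before} token(s)"
end

tokenizer.finish
}}}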