TaskJuggler::TextScanner

The TextScanner class is an abstract text scanner with support for nested include files and text macros. The tokenizer will operate on rules that must be provided by a derived class. The scanner is modal. Each mode operates only with the subset of token patterns that are assigned to the current mode. The current line is tracked accurately and can be used for error reporting. The scanner can operate on Strings or Files.

Public Class Methods

new(masterFile, messageHandler, tokenPatterns, defaultMode)

Create a new instance of TextScanner. masterFile must be a String that contains either the name of the file to start with or the text itself. messageHandler is a MessageHandler that is used for error messages. tokenPatterns is an Array of pattern descriptions, each an Array of the form [ type, regExp, mode, postProc ]; if mode is omitted, it defaults to :tjp. defaultMode is the scanner mode that is active when scanning starts.

    # File lib/TextScanner.rb, line 188
    def initialize(masterFile, messageHandler, tokenPatterns, defaultMode)
      @masterFile = masterFile
      @messageHandler = messageHandler
      # This table contains all macros that may be expanded when found in the
      # text.
      @macroTable = MacroTable.new(messageHandler)
      # The currently processed IO object.
      @cf = nil
      # This Array stores the currently processed nested files. It's an Array
      # of Arrays. The nested Array consists of 2 elements, the IO object and
      # the @tokenBuffer.
      @fileStack = []
      # This flag is set if we have reached the end of a file. Since we will
      # only know when the next new token is requested that the file is really
      # done now, we have to use this flag.
      @finishLastFile = false
      # True if the scanner operates on a buffer.
      @fileNameIsBuffer = false
      # A SourceFileInfo of the start of the currently processed token.
      @startOfToken = nil
      # Line number correction for error messages.
      @lineDelta = 0
      # Lists of regexps that describe the detectable tokens. The Arrays are
      # grouped by mode.
      @patternsByMode = { }
      # The currently active scanner mode.
      @scannerMode = nil
      # Points to the currently active pattern set as defined by the mode.
      @activePatterns = nil

      tokenPatterns.each do |pat|
        type = pat[0]
        regExp = pat[1]
        mode = pat[2] || :tjp
        postProc = pat[3]
        addPattern(type, regExp, mode, postProc)
      end
      self.mode = defaultMode
    end
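The layout of tokenPatterns can be illustrated with a small standalone sketch. It does not use TextScanner itself; the pattern names, the :macro mode and the to_integer lambda are hypothetical, and only the [ type, regExp, mode, postProc ] layout and the :tjp default are taken from the constructor above.

```ruby
# Hypothetical post-processing callable: receives type and matched String,
# returns both again (here with the String converted to an Integer).
to_integer = ->(type, match) { [type, match.to_i] }

# Hypothetical pattern table in the [ type, regExp, mode, postProc ] layout.
TOKEN_PATTERNS = [
  [nil, /\s+/],                          # ignored token, mode omitted
  [:INTEGER, /\d+/, :tjp, to_integer],   # token with post-processing
  [:ID, /[a-zA-Z_]\w*/, [:tjp, :macro]]  # active in two modes
].freeze

# Mirror the defaulting done by the constructor: a missing mode becomes :tjp.
NORMALIZED_PATTERNS = TOKEN_PATTERNS.map do |type, reg_exp, mode, post_proc|
  [type, reg_exp, mode || :tjp, post_proc]
end
```

Entries may be shorter than four elements because missing trailing elements are simply read as nil.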

Public Instance Methods

addMacro(macro)

Add a Macro to the macro translation table.

    # File lib/TextScanner.rb, line 440
    def addMacro(macro)
      @macroTable.add(macro)
    end

addPattern(type, regExp, mode, postProc = nil)

Add a new pattern to the scanner. type is either nil for tokens that will be ignored, or some identifier that will be returned with each token of this type. regExp is the Regexp that describes the token. mode identifies the scanner mode in which the pattern is active. For a single mode, mode specifies that mode directly. For multiple modes, it’s an Array of modes. postProc is an optional method reference that is called after the token has been detected. It receives the type and the matching String and returns them, possibly modified, as a two-element Array.

    # File lib/TextScanner.rb, line 236
    def addPattern(type, regExp, mode, postProc = nil)
      if mode.is_a?(Array)
        mode.each do |m|
          # The pattern is active in multiple modes
          @patternsByMode[m] = [] unless @patternsByMode.include?(m)
          @patternsByMode[m] << [ type, regExp, postProc ]
        end
      else
        # The pattern is only active in one specific mode.
        @patternsByMode[mode] = [] unless @patternsByMode.include?(mode)
        @patternsByMode[mode] << [ type, regExp, postProc ]
      end
    end
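The grouping of patterns by mode can be sketched outside the class as follows. This is an illustrative rewrite, not the library code: add_pattern, PATTERNS_BY_MODE and the :report mode are hypothetical names, and Array() is used here to treat a single mode and an Array of modes uniformly.

```ruby
# Register a pattern under one mode or several. Array() wraps a single mode
# in an Array and leaves an Array of modes unchanged.
def add_pattern(patterns_by_mode, type, reg_exp, mode, post_proc = nil)
  Array(mode).each do |m|
    (patterns_by_mode[m] ||= []) << [type, reg_exp, post_proc]
  end
  patterns_by_mode
end

PATTERNS_BY_MODE = {}
add_pattern(PATTERNS_BY_MODE, :ID, /[a-z]+/, [:tjp, :report])
add_pattern(PATTERNS_BY_MODE, nil, /\s+/, :tjp)
```

After these two calls the :tjp list holds both patterns in registration order, while :report holds only the :ID pattern. Registration order matters because the scanner tries the patterns of a mode in sequence.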

close()

Finish processing and reset all data structures.

    # File lib/TextScanner.rb, line 278
    def close
      unless @fileNameIsBuffer
        Log.startProgressMeter("Reading file #{@masterFile}")
        Log.stopProgressMeter
      end
      @fileStack = []
      @cf = @tokenBuffer = nil
    end

error(id, text, sfi = nil, data = nil)

Call this function to report any errors related to the parsed input.

    # File lib/TextScanner.rb, line 469
    def error(id, text, sfi = nil, data = nil)
      message(:error, id, text, sfi, data)
    end

expandMacro(prefix, args)

Expand a macro and inject it into the input stream. prefix is any String that was found right before the macro call; it is injected ahead of the expanded macro text. args is an Array of Strings. The first is the macro name, the rest are the parameters. A macro that expands to an empty String is ignored.

    # File lib/TextScanner.rb, line 453
    def expandMacro(prefix, args)
      # Get the expanded macro from the @macroTable.
      macro, text = @macroTable.resolve(args, sourceFileInfo)
      unless macro && text
        error('undefined_macro', "Undefined macro '#{args[0]}' called")
      end

      # If the expanded macro is empty, we can ignore it.
      return if text == ''

      unless @cf.injectMacro(macro, args, prefix + text)
        error('macro_stack_overflow', "Too many nested macro calls.")
      end
    end
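The idea of expanding a macro and injecting the result ahead of the remaining input can be sketched as below. Everything in this block is an assumption made for illustration: the MACROS Hash, the ${1}-style positional placeholders and the expand_macro helper are hypothetical and do not reproduce MacroTable#resolve or injectMacro.

```ruby
# Hypothetical macro table: name => template with positional placeholders.
MACROS = {
  'greet' => 'task ${1} "${2}"',
  'empty' => ''
}.freeze

# Expand args (macro name followed by parameters) and put prefix plus the
# expanded text in front of the input that still has to be scanned.
def expand_macro(macros, prefix, args, remaining_input)
  name, *params = args
  template = macros.fetch(name) do
    raise KeyError, "Undefined macro '#{name}' called"
  end
  text = template.gsub(/\$\{(\d+)\}/) do
    params[Regexp.last_match(1).to_i - 1].to_s
  end
  # An empty expansion is ignored, as in the method above.
  return remaining_input if text.empty?

  prefix + text + remaining_input
end

EXPANDED = expand_macro(MACROS, 'x ', %w[greet t1 Hello], ' rest')
```

The scanner then continues with the combined text, so tokens from the macro body are seen before the rest of the file.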

fileName()

Return the name of the currently processed file. If we are working on a text buffer, the text will be returned.

    # File lib/TextScanner.rb, line 334
    def fileName
      @cf ? @cf.fileName : @masterFile
    end

include(includeFileName, sfi)

Continue processing with a new file specified by includeFileName. When this file is finished, we will continue in the old file after the location where we started with the new file. A relative includeFileName is interpreted relative to the directory of the including file. The method returns the fully qualified name of the included file.

    # File lib/TextScanner.rb, line 291
    def include(includeFileName, sfi)
      if includeFileName[0] != '/'
        pathOfCallingFile = @fileStack.last[0].dirname
        path = pathOfCallingFile.empty? ? '' : pathOfCallingFile + '/'
        # If the included file is not an absolute name, we interpret the file
        # name relative to the including file.
        includeFileName = path + includeFileName
      end

      # Try to detect recursive inclusions. This will not work if files are
      # accessed via filesystem links.
      @fileStack.each do |entry|
        if includeFileName == entry[0].fileName
          error('include_recursion',
                "Recursive inclusion of #{includeFileName} detected", sfi)
        end
      end

      # Save @tokenBuffer in the record of the parent file.
      @fileStack.last[1] = @tokenBuffer unless @fileStack.empty?
      @tokenBuffer = nil
      @finishLastFile = false

      # Open the new file and push the handle on the @fileStack.
      begin
        @fileStack << [ (@cf = FileStreamHandle.new(includeFileName)), nil ]
        Log << "Parsing file #{includeFileName}"
      rescue StandardError
        error('bad_include', "Cannot open include file #{includeFileName}", sfi)
      end

      # Return the name of the included file.
      includeFileName
    end
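The path resolution and the recursion check can be tried in isolation with the sketch below. resolve_include and recursive_include? are hypothetical helpers, and File.dirname stands in for the dirname method of the stream handle, so details such as the handling of a file without a directory part are assumptions.

```ruby
# Resolve an include name relative to the directory of the including file.
# Absolute names are returned unchanged.
def resolve_include(include_name, including_file)
  return include_name if include_name.start_with?('/')

  dir = File.dirname(including_file)
  # File.dirname returns '.' for a bare file name; keep the name as it is.
  dir == '.' ? include_name : File.join(dir, include_name)
end

# True if the resolved name is already on the stack of open files. As noted
# above, a plain name comparison cannot see through filesystem links.
def recursive_include?(open_files, resolved_name)
  open_files.include?(resolved_name)
end

OPEN_FILES = ['plans/project.tjp', 'plans/resources.tji'].freeze
RESOLVED = resolve_include('resources.tji', OPEN_FILES.first)
```

Here RESOLVED names a file that is already open, so recursive_include?(OPEN_FILES, RESOLVED) reports a recursion.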

macroDefined?(name)

Return true if the Macro name has been added already.

    # File lib/TextScanner.rb, line 445
    def macroDefined?(name)
      @macroTable.include?(name)
    end

mode=(newMode)

Switch the scanner to another mode. The scanner will then only detect the patterns that were registered for newMode. A RuntimeError is raised if no patterns have been registered for newMode.

    # File lib/TextScanner.rb, line 252
    def mode=(newMode)
      #puts "**** New mode: #{newMode}"
      @activePatterns = @patternsByMode[newMode]
      raise "Undefined mode #{newMode}" unless @activePatterns
      @scannerMode = newMode
    end
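A minimal sketch of mode switching, assuming a Hash that maps each mode to its pattern list. ModeSwitch, the :string mode and the use of ArgumentError are hypothetical choices for this example. Unlike the method above, the sketch looks up the new pattern set before assigning it, so a failed switch leaves the previous mode and patterns untouched.

```ruby
class ModeSwitch
  attr_reader :mode, :active_patterns

  def initialize(patterns_by_mode, default_mode)
    @patterns_by_mode = patterns_by_mode
    self.mode = default_mode
  end

  # Activate the pattern set of new_mode, or raise if the mode is unknown.
  def mode=(new_mode)
    patterns = @patterns_by_mode[new_mode]
    raise ArgumentError, "Undefined mode #{new_mode}" unless patterns

    @active_patterns = patterns
    @mode = new_mode
  end
end

SWITCH = ModeSwitch.new(
  { tjp: [[:ID, /[a-z]+/]], string: [[:TEXT, /[^"]+/]] },
  :tjp
)
SWITCH.mode = :string
```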

nextToken()

Return the next token from the input stream. The result is an Array with 3 entries: the token type, the token String and the SourceFileInfo where the token started.

    # File lib/TextScanner.rb, line 353
    def nextToken
      # If we have a pushed-back token, return that first.
      unless @tokenBuffer.nil?
        res = @tokenBuffer
        @tokenBuffer = nil
        return res
      end

      if @finishLastFile
        # The previously processed file has now really been processed to
        # completion. Close it and remove the corresponding entry from the
        # @fileStack.
        @finishLastFile = false
        #Log << "Completed file #{@cf.fileName}"
        @cf.close if @cf
        @fileStack.pop

        if @fileStack.empty?
          # We are done with the top-level file now.
          @cf = @tokenBuffer = nil
          @finishLastFile = true
          return [ :endOfText, '<EOT>', @startOfToken ]
        else
          # Continue parsing the file that included the current file.
          @cf, tokenBuffer = @fileStack.last
          Log << "Parsing file #{@cf.fileName} ..."
          # If we have a left over token from previously processing this file,
          # return it now.
          if tokenBuffer
            @finishLastFile = true if tokenBuffer[0] == :eof
            return tokenBuffer
          end
        end
      end

      # Start processing characters from the input.
      @startOfToken = sourceFileInfo
      loop do
        match = nil
        begin
          @activePatterns.each do |type, re, postProc|
            if (match = @cf.scan(re))
              if match == :scannerEOF
                # We've found the end of an input file. Return a special token
                # that describes the end of a file.
                @finishLastFile = true
                return [ :eof, '<END>', @startOfToken ]
              end

              raise "#{re} matches empty string" if match.empty?
              # If we have a post processing method, call it now. It may modify
              # the type or the found token String.
              type, match = postProc.call(type, match) if postProc

              break if type.nil? # Ignore certain tokens with nil type.

              return [ type, match, @startOfToken ]
            end
          end
        rescue ArgumentError
          error('scan_encoding_error', $!.to_s)
        end

        if match.nil?
          if @cf.eof?
            error('unexpected_eof',
                  "Unexpected end of file found")
          else
            error('no_token_match',
                  "Unexpected characters found: '#{@cf.peek(10)}...'")
          end
        end
      end
    end
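The core matching loop can be approximated with Ruby's StringScanner, as in the sketch below. It leaves out file stacks, macros and SourceFileInfo. The next_token helper and the sample patterns are hypothetical; only the idea of trying the active patterns in order, skipping nil-typed tokens and applying postProc comes from the method above.

```ruby
require 'strscan'

# Try each pattern at the current position. Tokens with a nil type are
# skipped, and a post-processing callable may rewrite type and text.
def next_token(scanner, patterns)
  loop do
    return [:eof, '<END>'] if scanner.eos?

    matched = false
    patterns.each do |type, re, post_proc|
      text = scanner.scan(re)
      next unless text
      raise "#{re.inspect} matches empty string" if text.empty?

      matched = true
      type, text = post_proc.call(type, text) if post_proc
      break if type.nil? # ignored token, restart with the first pattern

      return [type, text]
    end
    unless matched
      raise "Unexpected characters found: '#{scanner.peek(10)}...'"
    end
  end
end

SAMPLE_PATTERNS = [
  [nil, /\s+/],
  [:INTEGER, /\d+/, ->(type, text) { [type, text.to_i] }],
  [:ID, /[a-z]+/]
].freeze

SAMPLE_SCANNER = StringScanner.new('task 42')
SAMPLE_TOKENS = Array.new(3) { next_token(SAMPLE_SCANNER, SAMPLE_PATTERNS) }
```

StringScanner#scan only matches at the current scan position, which is what makes the ordered pattern list behave like a tokenizer.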

open(fileNameIsBuffer = false)

Start the processing. If fileNameIsBuffer is true, we operate on a String, else on a File.

    # File lib/TextScanner.rb, line 262
    def open(fileNameIsBuffer = false)
      @fileNameIsBuffer = fileNameIsBuffer
      if fileNameIsBuffer
        @fileStack = [ [ @cf = BufferStreamHandle.new(@masterFile), nil ] ]
      else
        begin
          @fileStack = [ [ @cf = FileStreamHandle.new(@masterFile), nil ] ]
        rescue StandardError
          error('open_file', "Cannot open file #{@masterFile}")
        end
      end
      @masterPath = @cf.dirname + '/'
      @tokenBuffer = nil
    end

returnToken(token)

Push a token back so that the next nextToken() call returns it again. Only one token can be pushed back between two nextToken() calls; a second push-back raises an error.

    # File lib/TextScanner.rb, line 430
    def returnToken(token)
      #Log << "-> Returning Token: [#{token[0]}][#{token[1]}]"
      unless @tokenBuffer.nil?
        $stderr.puts @tokenBuffer
        raise "Fatal Error: Cannot return more than 1 token in a row"
      end
      @tokenBuffer = token
    end
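The single-slot push-back buffer can be sketched on its own as follows. TokenPushback is a hypothetical class that reads tokens from a plain Array rather than from a file, which is an assumption made to keep the example self-contained.

```ruby
class TokenPushback
  def initialize(tokens)
    @tokens = tokens.each # external enumerator over the token list
    @buffer = nil
  end

  # Return the pushed-back token if there is one, else the next fresh token.
  def next_token
    unless @buffer.nil?
      token = @buffer
      @buffer = nil
      return token
    end
    @tokens.next
  end

  # Push one token back. A second push-back without an intervening
  # next_token call is rejected.
  def return_token(token)
    raise 'Cannot return more than 1 token in a row' unless @buffer.nil?

    @buffer = token
  end
end

STREAM = TokenPushback.new([[:ID, 'a'], [:ID, 'b']])
PEEKED = STREAM.next_token
STREAM.return_token(PEEKED)
```

A parser can use this to look one token ahead: it reads a token, decides it belongs to the next rule, and pushes it back.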

sourceFileInfo()

Return the SourceFileInfo for the current processing position.

    # File lib/TextScanner.rb, line 327
    def sourceFileInfo
      @cf ? SourceFileInfo.new(fileName, @cf.lineNo - @lineDelta, 0) :
            SourceFileInfo.new(@masterFile, 0, 0)
    end

warning(id, text, sfi = nil, data = nil)

Call this function to report any warnings related to the parsed input.

    # File lib/TextScanner.rb, line 473
    def warning(id, text, sfi = nil, data = nil)
      message(:warning, id, text, sfi, data)
    end

Private Instance Methods

message(type, id, text, sfi, data)

    # File lib/TextScanner.rb, line 479
    def message(type, id, text, sfi, data)
      unless text.empty?
        line = @cf ? @cf.line : nil
        sfi ||= sourceFileInfo

        if @cf && !@cf.macroStack.empty?
          @messageHandler.info('macro_stack', 'Macro call history:', nil)

          @cf.macroStack.reverse_each do |entry|
            macro = entry.macro
            # All parameters that follow the macro name in args[0].
            args = entry.args[1..-1]
            args.collect! { |a| '"' + a + '"' }
            @messageHandler.info('macro_stack',
                                 "  ${#{macro.name} #{args.join(' ')}}",
                                 macro.sourceFileInfo)
          end
        end

        case type
        when :error
          @messageHandler.error(id, text, sfi, line, data)
        when :warning
          @messageHandler.warning(id, text, sfi, line, data)
        else
          raise "Unknown message type #{type}"
        end
      end
    end

Generated with the Darkfish Rdoc Generator 1.1.6.