The Scanner class is an abstract text scanner with support for nested include files and text macros. The tokenizer will operate on rules that must be provided by a derived class. The scanner is modal. Each mode operates only with the subset of token patterns that are assigned to the current mode. The current line is tracked accurately and can be used for error reporting. The scanner can operate on Strings or Files.
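Create a new instance of Scanner. masterFile must be a String that contains either the name of the file to start with or the text itself. messageHandler is a MessageHandler that is used for error messages. log is a Log to report progress and status. tokenPatterns is an Array of pattern definitions; each entry is an Array holding the token type, the regular expression, the mode (:tjp if omitted) and an optional post-processing method, and is registered with addPattern. defaultMode is the mode that the scanner is in at the start and end of a file.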
# File lib/taskjuggler/TextParser/Scanner.rb, line 203
def initialize(masterFile, messageHandler, log, tokenPatterns, defaultMode)
  @masterFile = masterFile
  @messageHandler = messageHandler
  @log = log
  # This table contains all macros that may be expanded when found in the
  # text.
  @macroTable = MacroTable.new
  # The currently processed IO object.
  @cf = nil
  # This Array stores the currently processed nested files. It's an Array
  # of Arrays. The nested Array consists of 2 elements, the IO object and
  # the @tokenBuffer.
  @fileStack = []
  # This flag is set if we have reached the end of a file. Since we will
  # only know when the next new token is requested that the file is really
  # done now, we have to use this flag.
  @finishLastFile = false
  # True if the scanner operates on a buffer.
  @fileNameIsBuffer = false
  # A SourceFileInfo of the start of the currently processed token.
  @startOfToken = nil
  # Line number correction for error messages.
  @lineDelta = 0
  # Lists of regexps that describe the detectable tokens. The Arrays are
  # grouped by mode.
  @patternsByMode = { }
  # The currently active scanner mode.
  @scannerMode = nil
  # The mode that the scanner is in at the start and end of file
  @defaultMode = defaultMode
  # Points to the currently active pattern set as defined by the mode.
  @activePatterns = nil

  tokenPatterns.each do |pat|
    type = pat[0]
    regExp = pat[1]
    mode = pat[2] || :tjp
    postProc = pat[3]
    addPattern(type, regExp, mode, postProc)
  end
  self.mode = defaultMode
end
Add a Macro to the macro translation table.
# File lib/taskjuggler/TextParser/Scanner.rb, line 474
def addMacro(macro)
  @macroTable.add(macro)
end
Add a new pattern to the scanner. type is either nil for tokens that will be ignored, or some identifier that will be returned with each token of this type. regExp is the RegExp that describes the token. mode identifies the scanner mode where the pattern is active. If it’s only a single mode, mode specifies the mode directly. For multiple modes, it’s an Array of modes. postProc is a method reference. This method is called after the token has been detected. The method gets the type and the matching String and returns them again in an Array.
# File lib/taskjuggler/TextParser/Scanner.rb, line 254
def addPattern(type, regExp, mode, postProc = nil)
  if mode.is_a?(Array)
    mode.each do |m|
      # The pattern is active in multiple modes
      @patternsByMode[m] = [] unless @patternsByMode.include?(m)
      @patternsByMode[m] << [ type, regExp, postProc ]
    end
  else
    # The pattern is only active in one specific mode.
    @patternsByMode[mode] = [] unless @patternsByMode.include?(mode)
    @patternsByMode[mode] << [ type, regExp, postProc ]
  end
end
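The way addPattern groups patterns by mode can be tried out in isolation. The following is only a sketch, not TaskJuggler code: the PatternTable class, its method names and the sample patterns are hypothetical, and Kernel#Array is used here to treat a single mode and an Array of modes uniformly.

```ruby
# Hypothetical stand-alone helper that mimics the per-mode pattern table.
class PatternTable
  attr_reader :patterns_by_mode

  def initialize
    @patterns_by_mode = {}
  end

  # mode may be a single mode or an Array of modes.
  def add(type, reg_exp, mode, post_proc = nil)
    Array(mode).each do |m|
      (@patterns_by_mode[m] ||= []) << [type, reg_exp, post_proc]
    end
  end
end

table = PatternTable.new
# Whitespace is ignored (nil type) in two modes.
table.add(nil, /\s+/, [:tjp, :macro])
# The post-processing lambda receives type and String and returns both again.
table.add(:INTEGER, /\d+/, :tjp, ->(type, str) { [type, str.to_i] })
```

Each mode ends up with its own ordered list of [type, regexp, postProc] triples, so the order in which patterns are added decides which pattern is tried first.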
Finish processing and reset all data structures.
# File lib/taskjuggler/TextParser/Scanner.rb, line 298
def close
  unless @fileNameIsBuffer
    @log.startProgressMeter("Reading file #{@masterFile}")
    @log.stopProgressMeter
  end
  @fileStack = []
  @cf = @tokenBuffer = nil
end
Call this function to report any errors related to the parsed input.
# File lib/taskjuggler/TextParser/Scanner.rb, line 504
def error(id, text, sfi = nil, data = nil)
  message(:error, id, text, sfi, data)
end
Expand a macro and inject it into the input stream. prefix is any string that was found right before the macro call. We have to inject it before the expanded macro. args is an Array of Strings. The first is the macro name, the rest are the parameters. callLength is the number of characters for the complete macro call “${…}”.
# File lib/taskjuggler/TextParser/Scanner.rb, line 488
def expandMacro(prefix, args, callLength)
  # Get the expanded macro from the @macroTable.
  macro, text = @macroTable.resolve(args, sourceFileInfo)
  unless macro && text
    error('undefined_macro', "Undefined macro '#{args[0]}' called")
  end

  # If the expanded macro is empty, we can ignore it.
  return if text == ''

  unless @cf.injectMacro(macro, args, prefix + text, callLength)
    error('macro_stack_overflow', "Too many nested macro calls.")
  end
end
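The role of prefix and args in expandMacro can be illustrated with a small stand-alone sketch. Everything in it is an assumption made for illustration: the macro table is a plain Hash, the placeholder syntax ${1}, ${2} for parameters is assumed rather than taken from this page, and the function returns the text instead of injecting it into a stream.

```ruby
# Hypothetical macro table: macro name => body with ${n} placeholders.
MACROS = { 'greet' => 'Hello ${1}!', 'empty' => '' }

# args[0] is the macro name, the remaining entries are the parameters.
# Returns prefix + expanded text, or nil if the macro expands to nothing.
def expand_macro(table, prefix, args)
  name, *params = args
  body = table[name]
  raise KeyError, "Undefined macro '#{name}' called" unless body

  # Replace ${1}, ${2}, ... with the matching parameter (assumed syntax).
  text = body.gsub(/\$\{(\d+)\}/) { params[Regexp.last_match(1).to_i - 1].to_s }

  # An empty expansion is ignored, as in expandMacro.
  return nil if text.empty?

  prefix + text
end

expanded = expand_macro(MACROS, 'say ', ['greet', 'World'])
```

The prefix is prepended because it was consumed from the input together with the macro call and has to reappear in front of the expanded text.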
Return the name of the currently processed file. If we are working on a text buffer, the text will be returned.
# File lib/taskjuggler/TextParser/Scanner.rb, line 355
def fileName
  @cf ? @cf.fileName : @masterFile
end
Continue processing with a new file specified by includeFileName. When this file is finished, we will continue in the old file after the location where we started with the new file. A relative includeFileName is interpreted relative to the including file. The method returns the fully qualified name of the included file.
# File lib/taskjuggler/TextParser/Scanner.rb, line 311
def include(includeFileName, sfi, &block)
  if includeFileName[0] != '/'
    pathOfCallingFile = @fileStack.last[0].dirname
    path = pathOfCallingFile.empty? ? '' : pathOfCallingFile + '/'
    # If the included file is not an absolute name, we interpret the file
    # name relative to the including file.
    includeFileName = path + includeFileName
  end

  # Try to detect recursive inclusions. This will not work if files are
  # accessed via filesystem links.
  @fileStack.each do |entry|
    if includeFileName == entry[0].fileName
      error('include_recursion',
            "Recursive inclusion of #{includeFileName} detected", sfi)
    end
  end

  # Save @tokenBuffer in the record of the parent file.
  @fileStack.last[1] = @tokenBuffer unless @fileStack.empty?
  @tokenBuffer = nil
  @finishLastFile = false

  # Open the new file and push the handle on the @fileStack.
  begin
    @fileStack << [ (@cf = FileStreamHandle.new(includeFileName, @log)),
                    nil, block ]
    @log << "Parsing file #{includeFileName}"
  rescue StandardError
    error('bad_include', "Cannot open include file #{includeFileName}", sfi)
  end

  # Return the name of the included file.
  includeFileName
end
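The name resolution and the recursion check performed by include can be sketched without touching the file system. The helper below is hypothetical: resolve_include and its parameters are not part of the Scanner API, File.dirname stands in for the stream handle's dirname, and an exception replaces the call to error.

```ruby
# Resolve include_name relative to including_file and reject names that are
# already on the stack of open files (a simple String comparison, so files
# reached through filesystem links are not detected).
def resolve_include(include_name, including_file, open_files)
  unless include_name.start_with?('/')
    dir = File.dirname(including_file)
    include_name = File.join(dir, include_name) unless dir == '.'
  end

  if open_files.include?(include_name)
    raise ArgumentError, "Recursive inclusion of #{include_name} detected"
  end

  include_name
end

stack = ['/home/u/proj/main.tjp']
resolved = resolve_include('tasks.tji', stack.last, stack)
absolute = resolve_include('/etc/shared.tji', stack.last, stack)
```

Comparing plain names keeps the check cheap, at the cost of missing cycles that go through differently spelled paths to the same file.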
Return true if the Macro name has been added already.
# File lib/taskjuggler/TextParser/Scanner.rb, line 479
def macroDefined?(name)
  @macroTable.include?(name)
end
Switch the parser to another mode. The scanner will then only detect patterns of that newMode.
# File lib/taskjuggler/TextParser/Scanner.rb, line 270
def mode=(newMode)
  #puts "**** New mode: #{newMode}"
  @activePatterns = @patternsByMode[newMode]
  raise "Undefined mode #{newMode}" unless @activePatterns
  @scannerMode = newMode
end
Return the next token from the input stream. The result is an Array with 3 entries: the token type, the token String and the SourceFileInfo where the token started.
# File lib/taskjuggler/TextParser/Scanner.rb, line 374
def nextToken
  # If we have a pushed-back token, return that first.
  unless @tokenBuffer.nil?
    res = @tokenBuffer
    @tokenBuffer = nil
    return res
  end

  if @finishLastFile
    # The previously processed file has now really been processed to
    # completion. Close it and remove the corresponding entry from the
    # @fileStack.
    @finishLastFile = false
    #@log << "Completed file #{@cf.fileName}"

    # If we have a block to be executed on EOF, we call it now.
    onEof = @fileStack.last[2]
    onEof.call if onEof

    @cf.close if @cf
    @fileStack.pop

    if @fileStack.empty?
      # We are done with the top-level file now.
      @cf = @tokenBuffer = nil
      @finishLastFile = true
      return [ :endOfText, '<EOT>', @startOfToken ]
    else
      # Continue parsing the file that included the current file.
      @cf, tokenBuffer = @fileStack.last
      @log << "Parsing file #{@cf.fileName} ..."
      # If we have a left over token from previously processing this file,
      # return it now.
      if tokenBuffer
        @finishLastFile = true if tokenBuffer[0] == :eof
        return tokenBuffer
      end
    end
  end

  # Start processing characters from the input.
  @startOfToken = sourceFileInfo
  loop do
    match = nil
    begin
      @activePatterns.each do |type, re, postProc|
        if (match = @cf.scan(re))
          if match == :scannerEOF
            if @scannerMode != @defaultMode
              # The stream resets the line number to 1. Since we still
              # know the start of the token, we setup @lineDelta so that
              # sourceFileInfo() returns the proper line number.
              @lineDelta = -(@startOfToken.lineNo - 1)
              error('runaway_token',
                    "Unterminated token starting at #{@startOfToken}")
            end
            # We've found the end of an input file. Return a special token
            # that describes the end of a file.
            @finishLastFile = true
            return [ :eof, '<END>', @startOfToken ]
          end

          raise "#{re} matches empty string" if match.empty?
          # If we have a post processing method, call it now. It may modify
          # the type or the found token String.
          type, match = postProc.call(type, match) if postProc

          break if type.nil? # Ignore certain tokens with nil type.

          return [ type, match, @startOfToken ]
        end
      end
    rescue ArgumentError
      error('scan_encoding_error', $!.to_s)
    end

    if match.nil?
      if @cf.eof?
        error('unexpected_eof',
              "Unexpected end of file found")
      else
        error('no_token_match',
              "Unexpected characters found: '#{@cf.peek(10)}...'")
      end
    end
  end
end
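The core matching loop of nextToken, stripped of file handling, macros and source positions, might look roughly like the sketch below. It is not the TaskJuggler implementation: MiniScanner and the sample patterns are hypothetical, Ruby's StringScanner replaces the stream handle, and plain exceptions replace the message handler.

```ruby
require 'strscan'

# Hypothetical modal tokenizer over a String.
class MiniScanner
  def initialize(text, patterns_by_mode, default_mode)
    @ss = StringScanner.new(text)
    @patterns_by_mode = patterns_by_mode
    self.mode = default_mode
  end

  # Only the patterns of the selected mode are tried from now on.
  def mode=(new_mode)
    @active = @patterns_by_mode.fetch(new_mode) { raise "Undefined mode #{new_mode}" }
  end

  # Returns [type, value]; tokens with a nil type are skipped.
  def next_token
    loop do
      return [:eof, '<END>'] if @ss.eos?

      matched = false
      @active.each do |type, re, post_proc|
        str = @ss.scan(re)
        next unless str
        raise "#{re} matches empty string" if str.empty?

        matched = true
        # The post processor may change the type or the token value.
        type, str = post_proc.call(type, str) if post_proc
        break if type.nil? # ignored token, restart with the first pattern
        return [type, str]
      end
      raise "Unexpected characters found: '#{@ss.peek(10)}...'" unless matched
    end
  end
end

patterns = {
  tjp: [
    [nil, /\s+/, nil],
    [:INTEGER, /\d+/, ->(type, str) { [type, str.to_i] }],
    [:ID, /[a-zA-Z_]\w*/, nil]
  ],
  str: [
    [:STRING, /[^"]+/, nil]
  ]
}

scanner = MiniScanner.new('task 42', patterns, :tjp)
tokens = []
until (tok = scanner.next_token).first == :eof
  tokens << tok
end

# In :str mode the same kind of input is seen as one STRING token.
str_scanner = MiniScanner.new('hello world', patterns, :tjp)
str_scanner.mode = :str
string_token = str_scanner.next_token
```

Patterns are tried in the order they were registered, so a more specific pattern has to be listed before a more general one that could match the same text.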
Start the processing. If fileNameIsBuffer is true, we operate on a String, else on a File.
# File lib/taskjuggler/TextParser/Scanner.rb, line 280
def open(fileNameIsBuffer = false)
  @fileNameIsBuffer = fileNameIsBuffer
  if fileNameIsBuffer
    @fileStack = [ [ @cf = BufferStreamHandle.new(@masterFile, @log),
                     nil, nil ] ]
  else
    begin
      @fileStack = [ [ @cf = FileStreamHandle.new(@masterFile, @log),
                       nil, nil ] ]
    rescue StandardError
      error('open_file', "Cannot open file #{@masterFile}")
    end
  end
  @masterPath = @cf.dirname + '/'
  @tokenBuffer = nil
end
Push a token back to the scanner so that the next nextToken() call returns it again. Only 1 token can be pushed back before the next nextToken() call.
# File lib/taskjuggler/TextParser/Scanner.rb, line 464
def returnToken(token)
  #@log << "-> Returning Token: [#{token[0]}][#{token[1]}]"
  unless @tokenBuffer.nil?
    $stderr.puts @tokenBuffer
    raise "Fatal Error: Cannot return more than 1 token in a row"
  end
  @tokenBuffer = token
end
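The one-token push-back shared by returnToken and nextToken is a common lookahead technique for parsers. The sketch below shows it on a pre-built token list; the PushbackTokens class and its method names are hypothetical and only mirror the buffer logic.

```ruby
# Hypothetical token source with a single-slot push-back buffer.
class PushbackTokens
  def initialize(tokens)
    @source = tokens.each
    @buffer = nil
  end

  # A pushed-back token is delivered first, then the regular stream.
  def next_token
    unless @buffer.nil?
      res = @buffer
      @buffer = nil
      return res
    end
    @source.next
  rescue StopIteration
    [:eof, '<END>']
  end

  # Only one token may be pushed back between two next_token calls.
  def return_token(token)
    raise 'Cannot return more than 1 token in a row' unless @buffer.nil?

    @buffer = token
  end
end

stream = PushbackTokens.new([[:ID, 'task'], [:INTEGER, '42']])
first = stream.next_token
stream.return_token(first) # peeked at it, hand it back
again = stream.next_token
second = stream.next_token
```

A single slot is enough for a parser that never needs more than one token of lookahead, and the exception makes any violation of that assumption visible immediately.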
Return SourceFileInfo for the current processing position.
# File lib/taskjuggler/TextParser/Scanner.rb, line 348
def sourceFileInfo
  @cf ? SourceFileInfo.new(fileName, @cf.lineNo - @lineDelta, 0) :
        SourceFileInfo.new(@masterFile, 0, 0)
end
# File lib/taskjuggler/TextParser/Scanner.rb, line 514
def message(type, id, text, sfi, data)
  unless text.empty?
    line = @cf ? @cf.line : nil
    sfi ||= sourceFileInfo

    if @cf && !@cf.macroStack.empty?
      @messageHandler.info('macro_stack', 'Macro call history:', nil)

      @cf.macroStack.reverse_each do |entry|
        macro = entry.macro
        args = entry.args[1..1]
        args.collect! { |a| '"' + a + '"' }
        @messageHandler.info('macro_stack',
                             " ${#{macro.name}#{args.empty? ? '' : ' '}" +
                             "#{args.join(' ')}}",
                             macro.sourceFileInfo)
      end
    end

    case type
    when :error
      @messageHandler.error(id, text, sfi, line, data)
    when :warning
      @messageHandler.warning(id, text, sfi, line, data)
    else
      raise "Unknown message type #{type}"
    end
  end
end