The Scanner class is an abstract text scanner with support for nested include files and text macros. The tokenizer operates on rules that a derived class must provide. The scanner is modal: in each mode it matches only the token patterns that are assigned to that mode. The current line is tracked so that it can be used for error reporting. The scanner can operate on Strings or Files.
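Create a new instance of Scanner. masterFile must be a String that either contains the name of the file to start with or the text itself. messageHandler is a MessageHandler that is used for error messages. log is a Log to report progress and status. tokenPatterns is an Array of pattern entries; each entry is an Array holding the token type, the Regexp, an optional mode (:tjp if omitted) and an optional post-processing method, and is registered with addPattern. defaultMode is the mode that the scanner is in at the start and end of a file.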
# File lib/taskjuggler/TextParser/Scanner.rb, line 194
def initialize(masterFile, messageHandler, log, tokenPatterns, defaultMode)
  @masterFile = masterFile
  @messageHandler = messageHandler
  @log = log
  # This table contains all macros that may be expanded when found in the
  # text.
  @macroTable = MacroTable.new
  # The currently processed IO object.
  @cf = nil
  # This Array stores the currently processed nested files. It's an Array
  # of Arrays. The nested Array consists of 2 elements, the IO object and
  # the @tokenBuffer.
  @fileStack = []
  # This flag is set if we have reached the end of a file. Since we will
  # only know when the next new token is requested that the file is really
  # done now, we have to use this flag.
  @finishLastFile = false
  # True if the scanner operates on a buffer.
  @fileNameIsBuffer = false
  # A SourceFileInfo of the start of the currently processed token.
  @startOfToken = nil
  # Line number correction for error messages.
  @lineDelta = 0
  # Lists of regexps that describe the detectable tokens. The Arrays are
  # grouped by mode.
  @patternsByMode = { }
  # The currently active scanner mode.
  @scannerMode = nil
  # The mode that the scanner is in at the start and end of file
  @defaultMode = defaultMode
  # Points to the currently active pattern set as defined by the mode.
  @activePatterns = nil

  tokenPatterns.each do |pat|
    type = pat[0]
    regExp = pat[1]
    mode = pat[2] || :tjp
    postProc = pat[3]
    addPattern(type, regExp, mode, postProc)
  end
  self.mode = defaultMode
end
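The shape of the tokenPatterns argument can be read off the loop at the end of the constructor. The following is a minimal, self-contained sketch of that shape, not TaskJuggler code: the helper normalize_patterns, the token types and the :macro mode are hypothetical names chosen for illustration. Only the :tjp fallback is taken from the listing above.

```ruby
# Hypothetical helper: fills in the defaults that initialize() applies to
# each tokenPatterns entry before handing it to addPattern.
def normalize_patterns(token_patterns)
  token_patterns.map do |type, reg_exp, mode, post_proc|
    [type, reg_exp, mode || :tjp, post_proc]
  end
end

# Illustrative pattern list as a derived class might provide it.
token_patterns = [
  # nil type: the token is matched but ignored.
  [nil, /\s+/],
  # A pattern bound to a single mode.
  [:ID, /[a-z]+/, :tjp],
  # A pattern active in two modes, with a post-processing lambda that
  # strips the surrounding quotes.
  [:STRING, /"[^"]*"/, [:tjp, :macro],
   ->(type, text) { [type, text[1..-2]] }]
]

normalized = normalize_patterns(token_patterns)
normalized.each { |type, _re, mode, _post| puts "#{type.inspect} in #{mode.inspect}" }
```

Entries that omit the mode end up in :tjp, which matches the `pat[2] || :tjp` expression in the constructor.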
Add a Macro to the macro translation table.
# File lib/taskjuggler/TextParser/Scanner.rb, line 465
def addMacro(macro)
  @macroTable.add(macro)
end
Add a new pattern to the scanner. type is either nil for tokens that will be ignored, or some identifier that will be returned with each token of this type. regExp is the Regexp that describes the token. mode identifies the scanner mode in which the pattern is active. For a single mode, mode specifies the mode directly. For multiple modes, it is an Array of modes. postProc is an optional method reference. This method is called after the token has been detected. It gets the type and the matching String and returns them again in an Array.
# File lib/taskjuggler/TextParser/Scanner.rb, line 245
def addPattern(type, regExp, mode, postProc = nil)
  if mode.is_a?(Array)
    mode.each do |m|
      # The pattern is active in multiple modes
      @patternsByMode[m] = [] unless @patternsByMode.include?(m)
      @patternsByMode[m] << [ type, regExp, postProc ]
    end
  else
    # The pattern is only active in one specific mode.
    @patternsByMode[mode] = [] unless @patternsByMode.include?(mode)
    @patternsByMode[mode] << [ type, regExp, postProc ]
  end
end
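The per-mode pattern table described above can be sketched in a few lines. This is a hypothetical stand-in that operates on a plain Hash, not the method from Scanner.rb; the function name add_pattern, the token types and the :macro mode are assumptions made for the example. It uses Kernel#Array so that a single mode and an Array of modes share one code path.

```ruby
# Hypothetical stand-in for the scanner's pattern table; not TaskJuggler code.
def add_pattern(patterns_by_mode, type, reg_exp, mode, post_proc = nil)
  # Array(mode) wraps a single Symbol in an Array and leaves an Array as is.
  Array(mode).each do |m|
    (patterns_by_mode[m] ||= []) << [type, reg_exp, post_proc]
  end
  patterns_by_mode
end

table = {}
# Active in one mode only.
add_pattern(table, :ID, /[a-zA-Z_]\w*/, :tjp)
# Whitespace is ignored (nil type) in two modes.
add_pattern(table, nil, /\s+/, [:tjp, :macro])

table.each { |mode, patterns| puts "#{mode}: #{patterns.length} pattern(s)" }
```

Patterns are appended in registration order. Since the scanner tries the active patterns in order, earlier registrations take precedence within a mode.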
Finish processing and reset all data structures.
# File lib/taskjuggler/TextParser/Scanner.rb, line 289
def close
  unless @fileNameIsBuffer
    @log.startProgressMeter("Reading file #{@masterFile}")
    @log.stopProgressMeter
  end
  @fileStack = []
  @cf = @tokenBuffer = nil
end
Call this function to report any errors related to the parsed input.
# File lib/taskjuggler/TextParser/Scanner.rb, line 494
def error(id, text, sfi = nil, data = nil)
  message(:error, id, text, sfi, data)
end
Expand a macro and inject it into the input stream. prefix is any string that was found right before the macro call. We have to inject it before the expanded macro. args is an Array of Strings. The first is the macro name, the rest are the parameters.
# File lib/taskjuggler/TextParser/Scanner.rb, line 478
def expandMacro(prefix, args)
  # Get the expanded macro from the @macroTable.
  macro, text = @macroTable.resolve(args, sourceFileInfo)
  unless macro && text
    error('undefined_macro', "Undefined macro '#{args[0]}' called")
  end

  # If the expanded macro is empty, we can ignore it.
  return if text == ''

  unless @cf.injectMacro(macro, args, prefix + text)
    error('macro_stack_overflow', "Too many nested macro calls.")
  end
end
Return the name of the currently processed file. If we are working on a text buffer, the text will be returned.
# File lib/taskjuggler/TextParser/Scanner.rb, line 346
def fileName
  @cf ? @cf.fileName : @masterFile
end
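Continue processing with a new file specified by includeFileName. When this file is finished, we will continue in the old file after the location where we started with the new file. A file name that does not start with '/' is interpreted relative to the directory of the including file. sfi is the SourceFileInfo that is used for error messages. If a block is given, it is called when the end of the included file has been reached. The method returns the fully qualified name of the included file.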
# File lib/taskjuggler/TextParser/Scanner.rb, line 302
def include(includeFileName, sfi, &block)
  if includeFileName[0] != '/'
    pathOfCallingFile = @fileStack.last[0].dirname
    path = pathOfCallingFile.empty? ? '' : pathOfCallingFile + '/'
    # If the included file is not an absolute name, we interpret the file
    # name relative to the including file.
    includeFileName = path + includeFileName
  end

  # Try to detect recursive inclusions. This will not work if files are
  # accessed via filesystem links.
  @fileStack.each do |entry|
    if includeFileName == entry[0].fileName
      error('include_recursion',
            "Recursive inclusion of #{includeFileName} detected", sfi)
    end
  end

  # Save @tokenBuffer in the record of the parent file.
  @fileStack.last[1] = @tokenBuffer unless @fileStack.empty?
  @tokenBuffer = nil
  @finishLastFile = false

  # Open the new file and push the handle on the @fileStack.
  begin
    @fileStack << [ (@cf = FileStreamHandle.new(includeFileName, @log)),
                    nil, block ]
    @log << "Parsing file #{includeFileName}"
  rescue StandardError
    error('bad_include', "Cannot open include file #{includeFileName}", sfi)
  end

  # Return the name of the included file.
  includeFileName
end
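The path handling and the recursion check of include() can be isolated into two small functions. The sketch below is a simplified illustration on plain Strings, assuming the directory of the including file and the names of the currently open files are already known. The helper names resolve_include and recursive_include? and the example file names are hypothetical.

```ruby
# Hypothetical helpers that mirror the path logic of include() on plain
# Strings; they do not open any files.
def resolve_include(include_file_name, dir_of_calling_file)
  # Absolute names are used as they are.
  return include_file_name if include_file_name.start_with?('/')

  # Relative names are interpreted relative to the including file.
  prefix = dir_of_calling_file.empty? ? '' : dir_of_calling_file + '/'
  prefix + include_file_name
end

# A file that is already on the stack of open files must not be included
# again. As in the original, the comparison is by name only, so files
# reached through filesystem links are not detected.
def recursive_include?(include_file_name, open_file_names)
  open_file_names.include?(include_file_name)
end

open_files = ['projects/acme/main.tjp']
name = resolve_include('resources.tji', 'projects/acme')
puts name
puts recursive_include?(name, open_files)
```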
Return true if a Macro with the given name has already been added.
# File lib/taskjuggler/TextParser/Scanner.rb, line 470
def macroDefined?(name)
  @macroTable.include?(name)
end
Switch the parser to another mode. The scanner will then only detect patterns of that newMode.
# File lib/taskjuggler/TextParser/Scanner.rb, line 261
def mode=(newMode)
  #puts "**** New mode: #{newMode}"
  @activePatterns = @patternsByMode[newMode]
  raise "Undefined mode #{newMode}" unless @activePatterns
  @scannerMode = newMode
end
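Mode switching amounts to selecting one pattern list from the per-mode table. The class below is a hypothetical, reduced model of that idea. ModeTable and its accessors are names invented for this sketch, and it deviates from the listing above in one respect: the new mode is validated before any state is changed, so a failed switch leaves the previous mode active.

```ruby
# Hypothetical reduced model of the scanner's mode handling.
class ModeTable
  attr_reader :mode, :active_patterns

  def initialize(patterns_by_mode)
    @patterns_by_mode = patterns_by_mode
    @mode = nil
    @active_patterns = nil
  end

  # Select the pattern set of new_mode. Unknown modes are rejected before
  # the current state is touched.
  def mode=(new_mode)
    patterns = @patterns_by_mode[new_mode]
    raise ArgumentError, "Undefined mode #{new_mode}" unless patterns

    @active_patterns = patterns
    @mode = new_mode
  end
end

modes = ModeTable.new(
  tjp: [[:ID, /[a-z]+/, nil]],
  macro: [[:TEXT, /[^}]+/, nil]]
)
modes.mode = :tjp
puts modes.active_patterns.length
modes.mode = :macro
puts modes.mode
```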
Return the next token from the input stream. The result is an Array with 3 entries: the token type, the token String and the SourceFileInfo where the token started.
# File lib/taskjuggler/TextParser/Scanner.rb, line 365
def nextToken
  # If we have a pushed-back token, return that first.
  unless @tokenBuffer.nil?
    res = @tokenBuffer
    @tokenBuffer = nil
    return res
  end

  if @finishLastFile
    # The previously processed file has now really been processed to
    # completion. Close it and remove the corresponding entry from the
    # @fileStack.
    @finishLastFile = false
    #@log << "Completed file #{@cf.fileName}"

    # If we have a block to be executed on EOF, we call it now.
    onEof = @fileStack.last[2]
    onEof.call if onEof

    @cf.close if @cf
    @fileStack.pop

    if @fileStack.empty?
      # We are done with the top-level file now.
      @cf = @tokenBuffer = nil
      @finishLastFile = true
      return [ :endOfText, '<EOT>', @startOfToken ]
    else
      # Continue parsing the file that included the current file.
      @cf, tokenBuffer = @fileStack.last
      @log << "Parsing file #{@cf.fileName} ..."
      # If we have a left over token from previously processing this file,
      # return it now.
      if tokenBuffer
        @finishLastFile = true if tokenBuffer[0] == :eof
        return tokenBuffer
      end
    end
  end

  # Start processing characters from the input.
  @startOfToken = sourceFileInfo
  loop do
    match = nil
    begin
      @activePatterns.each do |type, re, postProc|
        if (match = @cf.scan(re))
          if match == :scannerEOF
            if @scannerMode != @defaultMode
              # The stream resets the line number to 1. Since we still
              # know the start of the token, we setup @lineDelta so that
              # sourceFileInfo() returns the proper line number.
              @lineDelta = -(@startOfToken.lineNo - 1)
              error('runaway_token',
                    "Unterminated token starting at #{@startOfToken}")
            end
            # We've found the end of an input file. Return a special token
            # that describes the end of a file.
            @finishLastFile = true
            return [ :eof, '<END>', @startOfToken ]
          end

          raise "#{re} matches empty string" if match.empty?
          # If we have a post processing method, call it now. It may modify
          # the type or the found token String.
          type, match = postProc.call(type, match) if postProc

          break if type.nil? # Ignore certain tokens with nil type.

          return [ type, match, @startOfToken ]
        end
      end
    rescue ArgumentError
      error('scan_encoding_error', $!.to_s)
    end

    if match.nil?
      if @cf.eof?
        error('unexpected_eof',
              "Unexpected end of file found")
      else
        error('no_token_match',
              "Unexpected characters found: '#{@cf.peek(10)}...'")
      end
    end
  end
end
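The core of the matching loop (try the active patterns in order, apply the post-processing method, skip tokens with a nil type) can be reproduced with Ruby's standard StringScanner. The sketch below is a simplified illustration, not the TaskJuggler implementation: it has no include files, macros, modes or SourceFileInfo, and the function name next_token and the token types are assumptions.

```ruby
require 'strscan'

# Hypothetical, simplified tokenizer built on StringScanner.
# patterns is an Array of [type, regexp, post_proc] entries.
def next_token(stream, patterns)
  loop do
    return [:eof, '<END>'] if stream.eos?

    matched = false
    patterns.each do |type, re, post_proc|
      text = stream.scan(re)
      next unless text
      raise "#{re} matches empty string" if text.empty?

      matched = true
      # The post-processing method may change the type or the token value.
      type, text = post_proc.call(type, text) if post_proc
      # Tokens with nil type are ignored; start over with the first pattern.
      break if type.nil?

      return [type, text]
    end
    raise "Unexpected characters found: '#{stream.peek(10)}...'" unless matched
  end
end

patterns = [
  [nil, /\s+/, nil],
  [:INTEGER, /\d+/, ->(type, text) { [type, text.to_i] }],
  [:ID, /[a-z]+/, nil]
]
stream = StringScanner.new('task 42')
loop do
  token = next_token(stream, patterns)
  p token
  break if token[0] == :eof
end
```

As in the original, the first pattern that matches at the current position wins, so the order of the pattern list matters.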
Start the processing. If fileNameIsBuffer is true, we operate on a String, else on a File.
# File lib/taskjuggler/TextParser/Scanner.rb, line 271
def open(fileNameIsBuffer = false)
  @fileNameIsBuffer = fileNameIsBuffer
  if fileNameIsBuffer
    @fileStack = [ [ @cf = BufferStreamHandle.new(@masterFile, @log),
                     nil, nil ] ]
  else
    begin
      @fileStack = [ [ @cf = FileStreamHandle.new(@masterFile, @log),
                       nil, nil ] ]
    rescue StandardError
      error('open_file', "Cannot open file #{@masterFile}")
    end
  end
  @masterPath = @cf.dirname + '/'
  @tokenBuffer = nil
end
Push a token back to the scanner so that the next nextToken() call returns it again. Only 1 token can be returned before the next nextToken() call.
# File lib/taskjuggler/TextParser/Scanner.rb, line 455
def returnToken(token)
  #@log << "-> Returning Token: [#{token[0]}][#{token[1]}]"
  unless @tokenBuffer.nil?
    $stderr.puts @tokenBuffer
    raise "Fatal Error: Cannot return more than 1 token in a row"
  end
  @tokenBuffer = token
end
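The single-slot push-back buffer gives a parser one token of lookahead. The class below is a hypothetical, self-contained model of that mechanism which reads from a prepared Array of tokens instead of a file. PushbackTokens and its method names are invented for this sketch.

```ruby
# Hypothetical model of a token source with a one-token push-back buffer.
class PushbackTokens
  def initialize(tokens)
    @tokens = tokens.dup
    @token_buffer = nil
  end

  # Deliver a pushed-back token first, then continue with the input.
  def next_token
    unless @token_buffer.nil?
      res = @token_buffer
      @token_buffer = nil
      return res
    end
    @tokens.shift
  end

  # Only one token may be pushed back between two next_token calls.
  def return_token(token)
    raise 'Cannot return more than 1 token in a row' unless @token_buffer.nil?

    @token_buffer = token
  end
end

source = PushbackTokens.new([[:ID, 'task'], [:INTEGER, '42']])
token = source.next_token
# Look at the token, decide it belongs to another rule, and push it back.
source.return_token(token)
p source.next_token
p source.next_token
```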
Return SourceFileInfo for the current processing position.
# File lib/taskjuggler/TextParser/Scanner.rb, line 339
def sourceFileInfo
  @cf ? SourceFileInfo.new(fileName, @cf.lineNo - @lineDelta, 0) :
        SourceFileInfo.new(@masterFile, 0, 0)
end
# File lib/taskjuggler/TextParser/Scanner.rb, line 504
def message(type, id, text, sfi, data)
  unless text.empty?
    line = @cf ? @cf.line : nil
    sfi ||= sourceFileInfo

    if @cf && !@cf.macroStack.empty?
      @messageHandler.info('macro_stack', 'Macro call history:', nil)

      @cf.macroStack.reverse_each do |entry|
        macro = entry.macro
        # Skip args[0], the macro name; the rest are the parameters.
        args = entry.args[1..-1]
        args.collect! { |a| '"' + a + '"' }
        @messageHandler.info('macro_stack',
                             "  ${#{macro.name} #{args.join(' ')}}",
                             macro.sourceFileInfo)
      end
    end

    case type
    when :error
      @messageHandler.error(id, text, sfi, line, data)
    when :warning
      @messageHandler.warning(id, text, sfi, line, data)
    else
      raise "Unknown message type #{type}"
    end
  end
end