module RubyLabs

=begin rdoc

== ElizaLab

The ElizaLab module has definitions of classes and methods used in the projects for Chapter 10
of Explorations in Computing. The methods and classes in this module are a Ruby
implementation of Joseph Weizenbaum's ELIZA program. Users can "chat" with the Doctor
script, which mimics a Rogerian psychiatrist, and experiment by adding new rules to Doctor
or writing their own scripts.

Most methods used to install a script or carry on a conversation are in a class named
Eliza. To interact with Eliza, call one of the class methods, e.g. to load the "doctor"
script that comes with the ElizaLab module call Eliza.load(:doctor) and to start a
conversation call Eliza.run. See the documentation for the Eliza class for a complete
list of these top level methods.

=end

module ElizaLab

  require 'readline'
  include Readline

=begin rdoc

== Rule

A transformation rule is associated with a key word, and is triggered when that word
is found in an input sentence. Rules have integer priorities, and if more than one rule
is enabled Eliza applies the one with the highest priority.

Each rule has an ordered list of patterns, which control how Eliza will respond to
sentences containing the key word (see the Pattern class).

=end

  class Rule
    attr_accessor :key, :priority, :patterns

    # Create a new Rule object for sentences containing the word +key+. An
    # optional second argument specifies the rule's priority (the default is 1,
    # which is the lowest priority). The list of patterns is initially empty.
    def initialize(key, priority = 1)
      @key = key
      @priority = priority
      @patterns = Array.new
    end

    # Compare this Rule with another Rule object +x+ based on their priority attributes.
    # The rule comparison operator is used when a Rule is added to a priority queue.
    #--
    # The >= operator in the method body is important, in order to make sure the default
    # rule stays at the end of the queue (i.e. new rules will be inserted at the
    # front).
    def <(x)
      @priority >= x.priority
    end

    # Add a new sentence pattern (represented by a Pattern object) to the list of patterns
    # for this rule. +expr+ can be a reference to an existing Pattern object, a regular
    # expression, or a string of the form "/.../" (as read from a script), in which case
    # the enclosing slashes are stripped and a new Pattern is created.
    def addPattern(expr)
      if expr.class == Pattern
        @patterns << expr
      else
        if expr.class == String
          expr = Regexp.new(expr.slice(1..-2))
        end
        @patterns << Pattern.new(expr)
      end
    end

    # Return a reference to sentence pattern +n+ associated with this rule.
    def [](n)
      @patterns[n]
    end

    # Helper method called by methods that read scripts from a file -- add a response
    # string to sentence pattern +n+ (by default the most recently added pattern).
    def addReassembly(line, n = -1)
      @patterns[n].add_response(line)
    end

    # Apply this rule to a sentence +s+. Try the patterns in order, to see if any of them match +s+.
    # When +s+ matches a pattern, return the next reassembly for that pattern. Apply variable
    # substitutions to both the patterns and the reassemblies if they contain variables.
    # Return +nil+ if no patterns apply to +s+.
    #
    # The second argument, +opt+, is a symbol that is passed to Pattern#apply to control whether or not
    # it should do preprocessing. Possible values are :preprocess or :no_preprocess.
    def apply(s, opt)
      @patterns.each do |p|
        if @@verbose
          print "trying pattern "
          p p.regexp
        end
        res = p.apply(s, opt)
        return res if ! res.nil?
      end
      return nil
    end

    # Create a string that contains the rule name and priority, followed by the rule's patterns.
    def to_s
      # The key of the default rule is a symbol, so convert it before concatenating
      s = @key.to_s + " / " + @priority.to_s + "\n"
      @patterns.each { |r| s += " " + r.to_s + "\n" }
      return s
    end

    # Create a string that describes the attributes of this Rule object.
    def inspect
      # s = @key.inspect
      s = ""
      s += " [#{@priority}]" if @priority > 1
      s += " --> [\n" + @patterns.join("\n") + "]"
      return s
    end

  end # class Rule

=begin rdoc

== Pattern

A Pattern represents one way to transform an input sentence into a response.
A Pattern instance has a regular expression and a list of one or more reassembly
strings that can refer to groups in the expression.
There is also an index to record the last
reassembly string used, so the application can cycle through the strings.

=end

  class Pattern
    attr_accessor :regexp, :list, :index, :md

    # Create a new sentence pattern that will apply to input sentences that
    # match +expr+. The argument can be either a string or a regular expression.
    # If the argument is a string, it is converted to a regular expression that
    # matches exactly that string, e.g. "duck" is converted to /duck/.
    #
    # To make it easier for users to create patterns without knowing too many details
    # of regular expressions the constructor modifies the regular expression:
    # word breaks::  Insert word break anchors before the first word and after the last word in the expression
    # case insensitive::  Add an /i modifier to the regular expression
    # wildcards::  Insert parentheses around ".*"
    # variables::  Insert parentheses around variable names of the form "$n"
    # alternatives::  Insert parentheses around groups of words, e.g. "a|b|c"
    #
    # To see the real final regular expression stored with a rule call the
    # +regexp+ accessor method.
    #
    # Example:
    #   >> p = Pattern.new("duck")
    #   => duck: []
    #   >> p.regexp
    #   => /\bduck\b/i
    #
    #   >> p = Pattern.new("plane|train|automobile")
    #   => (plane|train|automobile): []
    #   >> p.regexp
    #   => /(plane|train|automobile)/i
    #
    #   >> p = Pattern.new("I don't like .*")
    #   => I don't like (.*): []
    #   >> p.regexp
    #   => /\bI don't like (.*)/i
    #--
    # Pattern.new called internally only from Rule#addPattern, which is called
    # to add /.*/ for default rule, or when reading /.../ line from script.
    #
    # In interactive experiments, users can call Pattern.new(s) or Pattern.new(s,a)
    # where s is a string or regexp, and a is an array of response strings.
    def initialize(expr, list = [])
      raise "Pattern#initialize: expr must be String or Regexp" unless (expr.class == String || expr.class == Regexp)
      # Work on a copy so the caller's string is not modified in place
      re = ((expr.class == String) ? expr : expr.source).dup
      add_parens(re, /\(?\.\*\)?/ )
      add_parens(re, /\(?[\w' ]+(\|[\w' ]+)+\)?/ )
      add_parens(re, /\(?\$\w+\)?/ )
      re.insert(0,'\b') if re =~ /^\w/
      re.insert(-1,'\b') if re =~ /\w$/
      @regexp = Regexp.new(re, Regexp::IGNORECASE)
      @list = list.nil? ? Array.new : list
      @index = 0
    end

    # Reset the internal counter in this pattern, so that the next response comes from
    # the first response string.
    def reset
      @index = 0
    end

    # Helper method called by the constructor -- add parentheses around every substring
    # of sentence pattern +s+ that matches the regular expression +r+. Checks to make
    # sure there aren't already parentheses there.
    def add_parens(s, r)
      s.gsub!(r) { |m| ( m[0] == ?( ) ? m : "(" + m + ")" }
    end

    # Add sentence +s+ to the set of response strings for this pattern.
    def add_response(s)
      @list << s
    end

    # Try to apply this pattern to input sentence +s+. If +s+ matches the regular
    # expression for this rule, extract the parts that match groups, insert them
    # into the next response string, and return the result. If +s+ does not match
    # the regular expression return +nil+.
    #
    # The second argument should be a symbol that controls whether or not the method
    # applies preprocessing rules. The default is to apply preprocessing, which is the
    # typical case when users call the method from an IRB session. But when Eliza is
    # running, preprocessing is done already, so this argument is set to :no_preprocess.
    def apply(s, opt = :preprocess)
      Eliza.preprocess(s) if opt == :preprocess
      @md = s.match(@regexp)
      return nil if @list.empty? || @md == nil
      res = @list[inc()].clone
      return res if res[0] == ?@    # indirection ("@x") is resolved by Eliza.apply
      puts "reassembling '#{res}'" if @@verbose
      res.gsub!(/\$\d+/) do |ns|
        n = ns.slice(1..-1).to_i    # strip leading $, convert to int
        if n && @md[n]
          puts "postprocess #{@md[n]}" if @@verbose
          @md[n].gsub(/[a-z\-$']+/i) do |w|
            (@@post.has_key?(w) && @@post[w][0] != ?$) ? @@post[w] : w
          end
        else
          warn "Pattern.apply: no match for #{ns} in '#{res}'"
          ""
        end
      end
      return res
    end

    # Helper method -- return +true+ if sentence +s+ matches the regular expression
    # for this pattern.
    def match(s)
      @md = s.match(@regexp)
      return @md != nil
    end

    # Helper method -- return an array of the parts of the input sentence that were
    # captured by wild cards or groups the last time the input was compared to the
    # regular expression, or +nil+ if there was no match.
    def parts
      return @md.nil? ? nil : @md.captures
    end

    # Create a string that summarizes the attributes of this pattern.
    def to_s
      s = " /" + cleanRegexp + "/\n"
      @list.each { |x| s += " \"" + x + "\"\n" }
      return s
    end

    # Create a more detailed string summarizing the pattern and its possible responses.
    def inspect
      return cleanRegexp + ": " + @list.inspect
    end

    # Helper method called by inspect and to_s -- remove the word boundary anchors from
    # the regular expression so it is easier to read.
    def cleanRegexp
      return @regexp.source.gsub(/\\b/,"")
    end

    private

    # Return the index of the next response string and advance the counter, wrapping
    # around at the end of the list.
    def inc
      n = @index
      @index = (@index + 1) % @list.length
      return n
    end

  end # class Pattern

=begin rdoc

== Dictionary

A Dictionary object is basically a Hash, but it overrides [] and []= to be
case-insensitive.

=end

  class Dictionary < Hash

    # Create a new empty dictionary.
    def initialize
      super
      @lc_keys = Hash.new
    end

    # Look up word +x+ in the dictionary, after converting all the letters in +x+ to lower case.
    def [](x)
      @lc_keys[x.downcase]
    end

    # Convert all letters in +x+ to lower case, then save item +y+ with the converted key.
    def []=(x,y)
      super
      @lc_keys[x.downcase] = y
    end

    # Convert +x+ to lower case, then see if there is an entry for the converted key in the dictionary.
    def has_key?(x)
      return @lc_keys.has_key?(x.downcase)
    end

    # Remove all entries, including the lower case keys used for lookups, so that
    # entries from a previously loaded script cannot be found after Eliza.clear.
    def clear
      @lc_keys.clear
      super
    end

  end # class Dictionary

=begin rdoc

== Eliza

This top-level class of the ElizaLab module defines a singleton object that has methods for
managing a chat with Eliza.
=end

  class Eliza

    # Initialize (or reinitialize) the module -- clear out any rules that have been
    # loaded from a script, and install the default script that simply echoes the
    # user input.
    def Eliza.clear
      @@script = nil
      @@aliases = Hash.new
      @@vars = Hash.new
      @@starts = Array.new
      @@stops = Array.new
      @@queue = PriorityQueue.new
      @@verbose = false
      @@pre.clear
      @@post.clear
      @@rules.clear
      @@default = Rule.new(:default)
      @@default.addPattern(/(.*)/)
      @@default.addReassembly("$1")
      return true
    end

    #
    # def Eliza.queue
    #   return @@queue
    # end
    #
    # def Eliza.aliases
    #   return @@aliases
    # end
    #
    # def Eliza.vars
    #   return @@vars
    # end
    #

    # These methods are useful for debugging Eliza, but not for end users...
    def Eliza.pre # :nodoc:
      return @@pre
    end

    def Eliza.post # :nodoc:
      return @@post
    end

    def Eliza.rules # :nodoc:
      return @@rules
    end

    # Turn on "verbose mode" to see a detailed trace of which rules and sentence
    # patterns are being applied as Eliza responds to an input sentence. Call
    # Eliza.quiet to return to normal mode.
    def Eliza.verbose
      @@verbose = true
    end

    # Turn off "verbose mode" to return to normal processing. See Eliza.verbose.
    def Eliza.quiet
      @@verbose = false
    end

    # Save a copy of a script that is distributed with RubyLabs; if no output file name
    # is specified, make a file name from the script name.
    def Eliza.checkout(script, filename = nil)
      scriptfilename = script.to_s + ".txt"
      scriptfilename = File.join(@@elizaDirectory, scriptfilename)
      if !File.exist?(scriptfilename)
        puts "Script not found: #{scriptfilename}"
        return nil
      end
      outfilename = filename.nil? ? (script.to_s + ".txt") : filename
      dest = File.open(outfilename, "w")
      File.open(scriptfilename).each do |line|
        dest.puts line.chomp
      end
      dest.close
      puts "Copy of #{script} saved in #{outfilename}"
    end

    # See if Eliza has a rule associated with keyword +w+. If so, return a reference
    # to that Rule object, otherwise return +nil+.
    def Eliza.rule_for(w)
      @@rules[w] || ((x = @@aliases[w]) && (r = @@rules[x]))
    end

    # Apply preprocessing rules to an input +s+. Makes sure the entire input is a single
    # line and words are separated by a single space, then applies pre-processing substitution
    # rules. The string is modified in place, so after this call the string +s+ has all
    # of the preprocessing substitutions.
    def Eliza.preprocess(s)
      s.gsub!( /\s+/, " " )
      s.gsub!(@@word) { |w| @@pre.has_key?(w) ? @@pre[w] : w }
      puts "preprocess: line = '#{s}'" if @@verbose
    end

    # The scan method implements the first step in the "Eliza algorithm" to determine the
    # response to an input sentence. Apply preprocessing substitutions, then break the line
    # into individual words, and for each word that is associated with a Rule object, add
    # the rule to the priority queue.
    #--
    # NOTE: this method does a destructive update to the input line....
    def Eliza.scan(line, queue)
      Eliza.preprocess(line)
      line.scan(@@word) do |w|
        w.downcase!
        if r = Eliza.rule_for(w)
          queue << r
          puts "add rule for '#{w}' to queue" if @@verbose
        end
      end
    end

    # The apply method implements the second step in the "Eliza algorithm" to determine the
    # response to an input sentence. It is called from the top level method (Eliza.transform)
    # to see if a rule applies to an input sentence. If so, return the string generated by
    # the rule object, otherwise return +nil+.
    #
    # This is the method that handles indirection in scripts. If a rule body has a line
    # of the form "@x" it means sentences that trigger this rule should be
    # handled by the rule for +x+. For example, suppose a script has this rule:
    #   duck
    #     /football/
    #       "I love my Ducks"
    #     /.*/
    #       @bird
    # If an input sentence contains the word "duck", this rule will be added to the queue.
    # If Eliza applies the rule (after first trying higher priority rules) it will
    # see if the sentence matches the pattern /football/, i.e.
    # if the word "football" appears
    # anywhere else in the sentence, and if so respond with the string "I love my Ducks". If not, the
    # next pattern succeeds (every input matches .*) and the response is generated by the
    # rules for "bird".
    def Eliza.apply(line, rule)
      puts "applying rule: key = '#{rule.key}'" if @@verbose
      if res = rule.apply(line, :no_preprocess)
        if res[0] == ?@
          rulename = res.slice(1..-1)
          if @@rules[rulename]
            return Eliza.apply( line, @@rules[rulename] )
          else
            warn "Eliza.apply: no rule for #{rulename}"
            return nil
          end
        else
          return res
        end
      else
        return nil
      end
    end

    # The transform method is called by the top level Eliza.run method to process
    # each sentence typed by the user. Initialize a priority queue, apply
    # preprocessing transformations, and add rules for each word to the queue. Then apply
    # the rules, in order, until a call to r.apply for some rule +r+ returns a
    # non-nil response. Note that the default rule should apply to any input string, so
    # it should never be the case that the queue empties out before some rule can apply.
    def Eliza.transform(s)
      s.sub!(/[\n\.\?!\-]*$/,"")      # strip trailing punctuation
      # s.downcase!
      @@queue = PriorityQueue.new
      @@queue << @@default            # initialize queue with default rule
      Eliza.scan(s, @@queue)          # add rules for recognized key words
      while @@queue.length > 0        # apply rules in order of priority
        if @@verbose
          print "queue: "
          p @@queue.collect { |r| r.key }
        end
        rule = @@queue.shift
        if result = Eliza.apply(s, rule)
          return result
        end
      end
      warn "No rules applied" if @@queue.empty?
      return nil
    end

    # Helper method -- Eliza.load calls this method to deal with directives (lines where the first
    # word begins with a colon)
    def Eliza.parseDirective(line) # :nodoc:
      word = Eliza.detachWord(line)
      case word
      when "alias"
        if line.empty? || line[0] != ?$
          warn "symbol after :alias must be a variable name; ignoring '#{word} #{line}'"
          return
        else
          sym = Eliza.detachWord(line)
          @@vars[sym] = Array.new
          line.split.each do |s|
            @@aliases[s] = sym
            @@vars[sym] << s
          end
        end
      when "start"
        @@starts << line.unquote
      when "stop"
        @@stops << line.unquote
      when "pre"
        sym = Eliza.detachWord(line)
        @@pre[sym] = line.unquote
      when "post"
        sym = Eliza.detachWord(line)
        @@post[sym] = line.unquote
      when "default"
        @@default = line[@@word]
      else
        warn "unknown directive: :#{word} (ignored)"
      end
    end

    # Helper method called by methods that read scripts -- remove a word from the front of a line
    def Eliza.detachWord(line)
      word = line[@@word]                   # pattern matches the first word
      if line.index(" ")
        line.slice!(0..line.index(" "))     # delete up to end of the word
        line.lstrip!                        # in case there are extra spaces after word
      else
        line.slice!(0..-1)                  # line just had the one word
      end
      return word
    end

    # Helper method called by Eliza.load.
    # Check each pattern's regular expression and replace var names by alternation
    # constructs. If the script specified a default rule name look up that
    # rule and save it as the default.
    def Eliza.compileRules
      @@rules.each do |key,val|
        a = val.patterns()
        a.each do |p|
          expr = p.regexp.inspect
          expr.gsub!(/\$\w+/) { |x| @@vars[x].join("|") }
          p.regexp = eval(expr)
        end
      end
      if @@default.class == String
        @@default = @@rules[@@default]
      end
    end

    # Parse rules in +filename+, store them in global arrays. If +filename+ is a symbol it
    # refers to a script file in the ElizaLab data directory; if it's a string it should
    # be the name of a file in the current directory.
    #--
    # Strategy: use a local var named 'rule', initially set to nil. New rules start with a single word
    # at the start of a line. When such a line is found in the input file, create a
    # new Rule object and store it in 'rule'. Subsequent lines that are part of the
    # current rule (lines that contain regular expressions or strings) are added to
    # current Rule object.
    # Directives indicate the end of a rule, so 'rule' is reset
    # to nil when a directive is seen.
    def Eliza.load(filename)
      begin
        Eliza.clear
        rule = nil
        if filename.class == Symbol
          filename = File.join(@@elizaDirectory, filename.to_s + ".txt")
        end
        File.open(filename).each do |line|
          line.strip!
          next if line.empty? || line[0] == ?#
          if line[0] == ?:
            Eliza.parseDirective(line)
            rule = nil
          else
            if line =~ @@iword
              rulename, priority = line.split
              rule = priority ? Rule.new(rulename, priority.to_i) : Rule.new(rulename)
              @@rules[rule.key] = rule
            elsif rule.nil?
              warn "missing rule name? unexpected input '#{line}'"
            elsif line[0] == ?/
              if line[-1] == ?/
                rule.addPattern(line)
              else
                warn "badly formed expression (missing /): '#{line}'"
              end
            elsif line[0] == ?"
              if line[-1] == ?"
                rule.addReassembly(line.unquote)
              else
                warn "badly formed string (missing \"): '#{line}'"
              end
            elsif line[0] == ?@
              rule.addReassembly(line)
            else
              warn "unexpected line in rule for #{rulename}: '#{line}'"
            end
          end
        end
        Eliza.compileRules
        @@script = filename
      rescue
        puts "Eliza: Error processing #{filename}: #{$!}"
        return false
      end
      return true
    end

    # Print a complete description of all the rules from the current script.
    def Eliza.dump
      Eliza.clear unless defined? @@default
      puts "Script: #{@@script}"
      print "Starts:\n "; p @@starts
      print "Stops:\n "; p @@stops
      print "Vars:\n "; p @@vars
      print "Aliases:\n "; p @@aliases
      print "Pre:\n "; p @@pre
      print "Post:\n "; p @@post
      print "Default:\n "; p @@default
      print "Queue:\n "; p @@queue.collect { |r| r.key }
      puts
      @@rules.each { |key,val| puts val }
      return nil
    end

    # Print a summary description of the current script, with the number of rules
    # and sentence patterns and a list of key words from all the rules.
    def Eliza.info
      Eliza.clear unless defined? @@default
      words = Hash.new
      npatterns = 0
      @@rules.each do |k,r|
        words[k] = 1 unless k[0] == ?$
        r.patterns.each do |p|
          npatterns += 1
          p.cleanRegexp.split.each do |w|
            Eliza.saveWords(w, words)
          end
        end
      end
      @@aliases.keys.each do |k|
        Eliza.saveWords(k, words)
      end
      puts "Script: #{@@script}"
      puts " #{@@rules.size} rules with #{npatterns} sentence patterns"
      puts " #{words.length} key words: #{words.keys.sort.join(', ')}"
    end

    # Helper method called by Eliza.info -- don't include common words like "the" or "a"
    # in list of key words, and clean up regular expression symbols. Put the remaining
    # items in the hash.
    def Eliza.saveWords(s, hash) # :nodoc:
      return if ["a","an","in","of","the"].include?(s)
      s.gsub! "(", ""
      s.gsub! ")", ""
      s.gsub! ".*", ""
      s.gsub! "?", ""
      return if s.length == 0
      s.split(/\|/).each { |w| hash[w.downcase] = 1 }
    end

    # Reset the response counter of every sentence pattern in the current script, so
    # each pattern starts again from its first response string. The script itself
    # stays loaded; call Eliza.clear to remove it.
    def Eliza.reset
      @@rules.each do |k, r|
        r.patterns.each { |p| p.reset }
      end
      return true
    end

    # Top level method to carry on a conversation. Starts a read-eval-print loop,
    # stopping when the user types "bye" or "quit". For each sentence, call
    # Eliza.transform to find a rule that applies to the sentence and print the
    # response.
    def Eliza.run
      Eliza.clear unless defined? @@default
      puts @@starts[rand(@@starts.length)] if ! @@starts.empty?
      loop do
        s = readline(" H: ", true)
        return if s.nil?
        s.chomp!
        next if s.empty?
        if s == "bye" || s == "quit"
          puts @@stops[rand(@@stops.length)] if ! @@stops.empty?
          return
        end
        # to_s guards against a nil result when no rule applies
        puts " C: " + Eliza.transform(s).to_s
      end
    end

  end # class Eliza

  # These state variables are accessible by any methods in a class defined inside
  # the ElizaLab module
  @@verbose = false
  @@elizaDirectory = File.join(File.dirname(__FILE__), '..', 'data', 'eliza')
  @@pre = Dictionary.new
  @@post = Dictionary.new
  @@rules = Dictionary.new
  @@word = /[a-z\-$']+/i      # pattern for a "word" in the input language
  @@iword = /^[a-z\-$']+/i    # same, but must be the first item on the line
  @@var = /\$\d+/             # variable name in reassembly string

end # module ElizaLab

end # module RubyLabs

=begin rdoc

== String

The code for the ELIZA lab (elizalab.rb) has the definition of a new method for strings that
removes quotes from the beginning and ending of a string.

=end

class String
  # Call +s.unquote+ to return a copy of string +s+ with double quotes removed from
  # the beginning and end. A string that is not enclosed in double quotes is returned unchanged.
  #
  # Example:
  #   >> s = '"Is it raining?"'
  #   => "\"Is it raining?\""
  #   >> s.unquote
  #   => "Is it raining?"
  def unquote
    if self[0] == ?" && self[-1] == ?"
      return self.slice(1..-2)
    else
      return self
    end
  end
end
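
=begin rdoc

The scan/apply cycle described for Eliza.scan, Eliza.apply and Eliza.transform can be
tried in isolation with the sketch below. It is only an illustration under simplifying
assumptions, not part of ElizaLab: MiniRule and mini_transform are hypothetical names,
a sorted array stands in for the RubyLabs PriorityQueue, each rule has a single pattern,
and the pre- and post-processing substitutions and "@x" indirection are left out.

```ruby
# Hypothetical stand-in for a Rule that owns exactly one Pattern.
# cursor plays the role of the Pattern index that cycles through the responses.
MiniRule = Struct.new(:key, :priority, :regexp, :responses, :cursor) do
  # Return the next response with each $n replaced by capture group n,
  # or nil if the sentence does not match this rule's expression.
  def apply(sentence)
    md = sentence.match(regexp)
    return nil if md.nil? || responses.empty?
    template = responses[cursor]
    self.cursor = (cursor + 1) % responses.length
    template.gsub(/\$\d+/) { |var| md[var[1..-1].to_i].to_s }
  end
end

# Collect the rules whose key word occurs in the sentence, try them from
# highest to lowest priority, and fall back to the default rule.
def mini_transform(sentence, rules, default)
  words = sentence.downcase.scan(/[a-z\-']+/)
  triggered = rules.select { |rule| words.include?(rule.key) }
  queue = triggered.sort_by { |rule| -rule.priority } + [default]
  queue.each do |rule|
    response = rule.apply(sentence)
    return response unless response.nil?
  end
  nil
end

rules = [
  MiniRule.new("duck", 1, /\bduck\b/i, ["Tell me more about ducks."], 0),
  MiniRule.new("like", 2, /\bI like (.*)/i,
               ["Why do you like $1?", "What else do you like?"], 0)
]
default = MiniRule.new("default", 0, /(.*)/, ["$1"], 0)

puts mini_transform("I like my duck", rules, default)
puts mini_transform("I like my duck", rules, default)
puts mini_transform("Hello there", rules, default)
```

Both "like" and "duck" are key words in the first sentence, so the higher priority "like"
rule answers it. Repeating the sentence moves that rule on to its next response string,
and a sentence with no key words falls through to the default rule, which echoes the input.

=end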