module RubyLabs
=begin rdoc
== ElizaLab
The ElizaLab module has definitions of classes and methods used in the projects for Chapter 10
of Explorations in Computing. The methods and classes in this module are a Ruby
implementation of Joseph Weizenbaum's ELIZA program. Users can "chat" with the Doctor script,
which mimics a Rogerian psychiatrist, and experiment by adding new rules to Doctor or
writing their own scripts.
Most methods used to install a script or carry on a conversation are class methods of a class named Eliza.
To interact with Eliza, call one of these methods: for example, call Eliza.load(:doctor) to load the "doctor" script that
comes with the ElizaLab module, and call Eliza.run to start a conversation.
See the documentation for the Eliza class for a complete list of
these top level methods.
=end
module ElizaLab
require 'readline'
include Readline
=begin rdoc
== Rule
A transformation rule is associated with a key word, and is triggered
when that word is found in an input sentence. Rules have integer
priorities, and if more than one rule is enabled Eliza applies the one
with the highest priority.
Each rule has an ordered list of patterns, which control how Eliza will
respond to sentences containing the key word (see the Pattern class).
=end
class Rule
attr_accessor :key, :priority, :patterns
# Create a new Rule object for sentences containing the word +key+. An
# optional second argument specifies the rule's priority (the default is 1,
# which is the lowest priority). The list of patterns is initially empty.
def initialize(key, priority = 1)
@key = key
@priority = priority
@patterns = Array.new
end
# Compare this Rule with another Rule object +x+ based on their priority attributes. The rule comparison operator
# is used when a Rule is added to a priority queue.
#--
# The >= operator in the method body is important, in order to make sure the default
# rule stays at the end of the queue (i.e. new rules will be inserted at the
# front).
def <(x)
@priority >= x.priority
end
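=begin
A minimal sketch of why the comparison uses >= rather than >. MiniRule and
insert_by_priority are hypothetical stand-ins (the real PriorityQueue class is
defined elsewhere in RubyLabs and may work differently); the sketch assumes the
queue inserts a new item in front of the first queued item it compares "less than".

```ruby
# Hypothetical stand-in for Rule: only a key and a priority.
MiniRule = Struct.new(:key, :priority) do
  # Same test as Rule#<: "comes before" means the priority is at least as high.
  def <(other)
    priority >= other.priority
  end
end

# Assumed queue behavior: insert +item+ in front of the first queued element
# that it "comes before", or at the end if there is no such element.
def insert_by_priority(queue, item)
  pos = queue.index { |queued| item < queued } || queue.length
  queue.insert(pos, item)
end

queue = []
insert_by_priority(queue, MiniRule.new("default", 1))
insert_by_priority(queue, MiniRule.new("mother", 3))
insert_by_priority(queue, MiniRule.new("duck", 1))
queue_keys = queue.map(&:key)
p queue_keys   # => ["mother", "duck", "default"]
```

Because ties count as "comes before", the rule for "duck" lands in front of the
default rule even though both have priority 1.
=end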
# Add a new sentence pattern (represented by a Pattern object) to the list of patterns
# for this rule. +expr+ can either be a reference to an existing Pattern object, or
# a string, in which case a new Pattern is created.
def addPattern(expr)
if expr.class == Pattern
@patterns << expr
else
if expr.class == String
expr = Regexp.new(expr.slice(1..-2))
end
@patterns << Pattern.new(expr)
end
end
# Return a reference to sentence pattern +n+ associated with this rule.
def [](n)
@patterns[n]
end
# Helper method called by methods that read scripts from a file -- add a response
# string to sentence pattern +n+.
def addReassembly(line, n = -1)
@patterns[n].add_response(line)
end
# Apply this rule to a sentence +s+. Try the patterns in order, to see if any of them match +s+.
# When +s+ matches a pattern, return the next reassembly for that pattern. Apply variable substitutions to both
# the patterns and the reassemblies if they contain variables. Return +nil+ if no patterns apply to +s+.
#
# The second argument, +opt+, is a symbol that is passed to Pattern#apply to control whether or not
# it should do preprocessing. Possible values are :preprocess or :no_preprocess.
def apply(s, opt)
@patterns.each do |p|
if @@verbose
print "trying pattern "
p p.regexp
end
res = p.apply(s, opt)
return res if ! res.nil?
end
return nil
end
# Create a string that contains the rule name and priority, followed by one entry per pattern.
def to_s
s = @key + " / " + @priority.to_s + "\n"
@patterns.each { |r| s += " " + r.to_s + "\n" }
return s
end
# Create a string that describes the attributes of this Rule object.
def inspect
# s = @key.inspect
s = ""
s += " [#{@priority}]" if @priority > 1
s += " --> [\n" + @patterns.join("\n") + "]"
return s
end
end # class Rule
=begin rdoc
== Pattern
A Pattern represents one way to transform an input sentence into a
response. A Pattern instance has a regular expression and a list of
one or more reassembly strings that can refer to groups in the expression.
There is also an index to record the last reassembly string used, so
the application can cycle through the strings.
=end
class Pattern
attr_accessor :regexp, :list, :index, :md
# Create a new sentence pattern that will apply to input sentences that
# match +expr+. The argument can be either a string or a regular expression.
# If the argument is a string, it is converted to a regular expression that
# matches exactly that string, e.g. "duck" is converted to /duck/.
#
# To make it easier for users to create patterns without knowing too many details
# of regular expressions the constructor modifies the regular expression:
# word breaks:: Insert word break anchors before the first word and after the last word in the expression
# case insensitive:: Add the +i+ (ignore case) modifier to the regular expression
# wildcards:: Insert parentheses around ".*"
# variables:: Insert parentheses around variable names of the form "$n"
# alternatives:: Insert parentheses around groups of words, e.g. "a|b|c"
#
# To see the real final regular expression stored with a rule call the
# +regexp+ accessor method.
#
# Example:
# >> p = Pattern.new("duck")
# => duck: []
# >> p.regexp
# => /\bduck\b/i
#
# >> p = Pattern.new("plane|train|automobile")
# => (plane|train|automobile): []
# >> p.regexp
# => /(plane|train|automobile)/i
#
# >> p = Pattern.new("I don't like .*")
# => I don't like (.*): []
# >> p.regexp
# => /\bI don't like (.*)/i
#--
# Pattern.new called internally only from Rule#addPattern, which is called
# to add /.*/ for default rule, or when reading /.../ line from script.
#
# In interactive experiments, users can call Pattern.new(s) or Pattern.new(s,a)
# where s is a string or regexp, and a is an array of response strings.
def initialize(expr, list = [])
raise "Pattern#initialize: expr must be String or Regexp" unless (expr.class == String || expr.class == Regexp)
# Work on a copy so the caller's string (which may be frozen) is never modified.
re = (expr.class == String) ? expr.dup : expr.source.dup
add_parens(re, /\(?\.\*\)?/ )
add_parens(re, /\(?[\w' ]+(\|[\w' ]+)+\)?/ )
add_parens(re, /\(?\$\w+\)?/ )
re.insert(0,'\b') if re =~ /^\w/
re.insert(-1,'\b') if re =~ /\w$/
@regexp = Regexp.new(re, Regexp::IGNORECASE)
@list = list.nil? ? Array.new : list
@index = 0
end
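=begin
The rewriting steps performed by the constructor can be sketched as a standalone
function. build_pattern_regexp and wrap_in_parens are hypothetical names used only
for this illustration; they mirror the steps above but are not part of ElizaLab.

```ruby
# Put parentheses around every match of +token+ unless they are already there.
def wrap_in_parens(source, token)
  source.gsub(token) { |m| m.start_with?("(") ? m : "(#{m})" }
end

# Turn a user-friendly pattern string into a case-insensitive Regexp.
def build_pattern_regexp(text)
  re = text.dup
  re = wrap_in_parens(re, /\(?\.\*\)?/)                   # wildcards
  re = wrap_in_parens(re, /\(?[\w' ]+(\|[\w' ]+)+\)?/)    # alternatives a|b|c
  re = wrap_in_parens(re, /\(?\$\w+\)?/)                  # variables such as $pet
  re = '\b' + re if re =~ /^\w/                           # word break at the front
  re += '\b' if re =~ /\w$/                               # word break at the end
  Regexp.new(re, Regexp::IGNORECASE)
end

duck_re = build_pattern_regexp("duck")
like_re = build_pattern_regexp("I don't like .*")
p duck_re   # => /\bduck\b/i
p like_re   # => /\bI don't like (.*)/i
```

The word break anchors are why "duck" matches "a DUCK!" but not "ducks".
=end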
# Reset the internal counter in this pattern, so that the next response comes from
# the first response string.
def reset
@index = 0
end
# Helper method called by the constructor -- add parentheses around every occurrence
# of the string +r+ in sentence pattern +s+. Checks to make sure there aren't already
# parentheses there.
def add_parens(s, r)
s.gsub!(r) { |m| ( m[0] == ?( ) ? m : "(" + m + ")" }
end
# Add sentence +s+ to the set of response strings for this pattern.
def add_response(s)
@list << s
end
# Try to apply this pattern to input sentence +s+. If +s+ matches the regular
# expression for this rule, extract the parts that match groups, insert them
# into the next response string, and return the result. If +s+ does not match
# the regular expression return +nil+.
#
# The second argument should be a symbol that controls whether or not the method
# applies preprocessing rules. The default is to apply preprocessing, which is the
# typical case when users call the method from an IRB session. But when Eliza is
# running, preprocessing is done already, so this argument is set to :no_preprocess.
def apply(s, opt = :preprocess)
Eliza.preprocess(s) if opt == :preprocess
@md = s.match(@regexp)
return nil if @list.empty? || @md == nil
res = @list[inc()].clone
return res if res[0] == ?@
puts "reassembling '#{res}'" if @@verbose
res.gsub!(/\$\d+/) do |ns|
n = ns.slice(1..-1).to_i # strip leading $, convert to int
if n && @md[n]
puts "postprocess #{@md[n]}" if @@verbose
@md[n].gsub(/[a-z\-$']+/i) do |w|
(@@post.has_key?(w) && @@post[w][0] != ?$) ? @@post[w] : w
end
else
warn "Pattern.apply: no match for #{ns} in '#{res}'"
""
end
end
return res
end
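=begin
A simplified sketch of the reassembly step, assuming a small postprocessing table.
The names post_table, responses and reassemble are hypothetical; the real method
also cycles through the responses, skips "@" redirections and prints trace output
in verbose mode.

```ruby
# Hypothetical postprocessing table, as :post directives in a script might define.
post_table = { "my" => "your", "i" => "you", "me" => "you" }
regexp = /\bI need (.*)/i
responses = ["Why do you need $1?", "Would it help to get $1?"]

# Replace each $n in +response+ with capture group n of +match+, applying the
# postprocessing substitutions to every word of the captured text.
def reassemble(response, match, post)
  response.gsub(/\$\d+/) do |var|
    group = match[var[1..-1].to_i]      # strip the leading $, convert to int
    next "" if group.nil?               # no such group: substitute nothing
    group.gsub(/[a-z\-$']+/i) { |w| post.fetch(w.downcase, w) }
  end
end

md = "I need my coffee".match(regexp)
first_reply  = reassemble(responses[0], md, post_table)
second_reply = reassemble(responses[1], md, post_table)
puts first_reply    # => Why do you need your coffee?
puts second_reply   # => Would it help to get your coffee?
```
=end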
# Helper method -- return +true+ if sentence +s+ matches the regular expression
# for this pattern.
def match(s)
@md = s.match(@regexp)
return @md != nil
end
# Helper method -- return an array holding the parts of the most recently matched
# input sentence that were captured by wild cards or groups in the regular
# expression. Return +nil+ if the most recent match failed.
def parts
return @md.nil? ? nil : @md.captures
end
# Create a string that summarizes the attributes of this pattern.
def to_s
s = " /" + cleanRegexp + "/\n"
@list.each { |x| s += " \"" + x + "\"\n" }
return s
end
# Create a more detailed string summarizing the pattern and its possible responses.
def inspect
return cleanRegexp + ": " + @list.inspect
end
# Helper method called by inspect and to_s -- remove the word boundary anchors from
# the regular expression so it is easier to read.
def cleanRegexp
res = @regexp.source
res.gsub!(/\\b/,"")
return res
end
private
def inc
n = @index
@index = (@index + 1) % @list.length
return n
end
end # class Pattern
=begin rdoc
== Dictionary
A Dictionary object is basically a Hash, but it overrides [] and []= to be case-insensitive.
=end
class Dictionary < Hash
# Create a new empty dictionary.
def initialize
super
@lc_keys = Hash.new
end
# Look up word +x+ in the dictionary, after converting all the letters in +x+ to lower case.
def [](x)
@lc_keys[x.downcase]
end
# Convert all letters in +x+ to lower case, then save item +y+ with the converted key.
def []=(x,y)
super
@lc_keys[x.downcase] = y
end
# Convert +x+ to lower case, then see if there is an entry for the converted key in the dictionary.
def has_key?(x)
return @lc_keys.has_key?(x.downcase)
end
# Remove all entries, including the lower-case keys used for lookups. Without this
# override, Hash#clear would leave stale entries that [] and has_key? could still find.
def clear
@lc_keys.clear
super
end
end # class Dictionary
=begin rdoc
== Eliza
Eliza is the top-level class of the ElizaLab module. It is never instantiated; all of
its methods are class methods for managing a chat with Eliza.
=end
class Eliza
# Initialize (or reinitialize) the module -- clear out any rules that have been
# loaded from a script, and install the default script that simply echoes the
# user input.
def Eliza.clear
@@script = nil
@@aliases = Hash.new
@@vars = Hash.new
@@starts = Array.new
@@stops = Array.new
@@queue = PriorityQueue.new
@@verbose = false
@@pre.clear
@@post.clear
@@rules.clear
@@default = Rule.new(:default)
@@default.addPattern(/(.*)/)
@@default.addReassembly("$1")
return true
end
#
# def Eliza.queue
# return @@queue
# end
#
# def Eliza.aliases
# return @@aliases
# end
#
# def Eliza.vars
# return @@vars
# end
#
# These methods are useful for debugging Eliza, but not for end users...
def Eliza.pre # :nodoc:
return @@pre
end
def Eliza.post # :nodoc:
return @@post
end
def Eliza.rules # :nodoc:
return @@rules
end
# Turn on "verbose mode" to see a detailed trace of which rules and sentence
# patterns are being applied as Eliza responds to an input sentence. Call
# Eliza.quiet to return to normal mode.
def Eliza.verbose
@@verbose = true
end
# Turn off "verbose mode" to return to normal processing. See Eliza.verbose.
def Eliza.quiet
@@verbose = false
end
# Save a copy of a script that is distributed with RubyLabs; if no output file name specified
# make a file name from the program name.
def Eliza.checkout(script, filename = nil)
scriptfilename = script.to_s + ".txt"
scriptfilename = File.join(@@elizaDirectory, scriptfilename)
if !File.exist?(scriptfilename)
puts "Script not found: #{scriptfilename}"
return nil
end
outfilename = filename.nil? ? (script.to_s + ".txt") : filename
# Use block forms so both files are closed even if an error occurs while copying.
File.open(outfilename, "w") do |dest|
File.foreach(scriptfilename) { |line| dest.puts line.chomp }
end
puts "Copy of #{script} saved in #{outfilename}"
end
# See if Eliza has a rule associated with keyword +w+. If so, return a reference
# to that Rule object, otherwise return +nil+.
def Eliza.rule_for(w)
@@rules[w] || ((x = @@aliases[w]) && @@rules[x])
end
# Apply preprocessing rules to an input +s+. Makes sure the entire input is a single
# line and words are separated by single space, then applies pre-processing substitution
# rules. The string is modified in place, so after this call the string +s+ has all
# of the preprocessing substitutions.
def Eliza.preprocess(s)
s.gsub!( /\s+/, " " )
s.gsub!(@@word) { |w| @@pre.has_key?(w) ? @@pre[w] : w }
puts "preprocess: line = '#{s}'" if @@verbose
end
# The scan method implements the first step in the "Eliza algorithm" to determine the response to an input sentence.
# Apply preprocessing substitutions, then break the line into individual words, and
# for each word that is associated with a Rule object, add the rule to the priority
# queue.
#--
# NOTE: this method does a destructive update to the input line....
def Eliza.scan(line, queue)
Eliza.preprocess(line)
line.scan(@@word) do |w|
w.downcase!
if r = Eliza.rule_for(w)
queue << r
puts "add rule for '#{w}' to queue" if @@verbose
end
end
end
# The apply method implements the second step in the "Eliza algorithm" to determine the response to an input sentence.
# It is called from the top level method (Eliza.transform) to see if a rule applies to an
# input sentence. If so, return the string generated by the rule object, otherwise
# return +nil+.
#
# This is the method that handles indirection in scripts. If a rule body has a line
# of the form "@x" it means sentences that trigger this rule should be
# handled by the rule for +x+. For example, suppose a script has this rule:
# duck
# /football/
# "I love my Ducks"
# /.*/
# @bird
# If an input sentence contains the word "duck", this rule will be added to the queue.
# If Eliza applies the rule (after first trying higher priority rules) it will
# see if the sentence matches the pattern /football/, i.e. if the word "football" appears
# anywhere else in the sentence, and if so respond with the string "I love my Ducks". If not, the
# next pattern succeeds (every input matches .*) and the response is generated by the
# rules for "bird".
def Eliza.apply(line, rule)
puts "applying rule: key = '#{rule.key}'" if @@verbose
if res = rule.apply(line, :no_preprocess)
if res[0] == ?@
rulename = res.slice(1..-1)
if @@rules[rulename]
return Eliza.apply( line, @@rules[rulename] )
else
warn "Eliza.apply: no rule for #{rulename}"
return nil
end
else
return res
end
else
return nil
end
end
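=begin
A minimal sketch of the "@x" indirection, using the duck/bird example from the
comment above. The rules table and the respond method are hypothetical and much
simpler than Rule and Pattern objects: each rule is just a list of
[regexp, response] pairs.

```ruby
# Hypothetical rule table: key word => ordered list of [regexp, response] pairs.
rules = {
  "duck" => [[/football/i, "I love my Ducks"], [/.*/, "@bird"]],
  "bird" => [[/.*/, "Tell me more about birds."]]
}

# Return the response for +line+ under the rule named +key+, following any
# "@x" redirection to the rule for x. Return nil if no rule or pattern applies.
def respond(line, key, rules)
  patterns = rules[key]
  return nil if patterns.nil?
  _, response = patterns.find { |regexp, _| line =~ regexp }
  return nil if response.nil?
  response.start_with?("@") ? respond(line, response[1..-1], rules) : response
end

football_reply = respond("The duck plays football", "duck", rules)
redirect_reply = respond("I saw a duck today", "duck", rules)
puts football_reply   # => I love my Ducks
puts redirect_reply   # => Tell me more about birds.
```
=end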
# The transform method is called by the top level Eliza.run method to process
# each sentence typed by the user. Initialize a priority queue, apply
# preprocessing transformations, and add rules for each word to the queue. Then apply
# the rules, in order, until a call to r.apply for some rule +r+ returns a
# non-nil response. Note that the default rule should apply to any input string, so
# it should never be the case that the queue empties out before some rule can apply.
def Eliza.transform(s)
s.sub!(/[\n\.\?!\-]*$/,"") # strip trailing punctuation
# s.downcase!
@@queue = PriorityQueue.new
@@queue << @@default # initialize queue with default rule
Eliza.scan(s, @@queue) # add rules for recognized key words
while @@queue.length > 0 # apply rules in order of priority
if @@verbose
print "queue: "
p @@queue.collect { |r| r.key }
end
rule = @@queue.shift
if result = Eliza.apply(s, rule)
return result
end
end
warn "No rules applied" if @@queue.empty?
return nil
end
# Helper method -- Eliza.load calls this method to deal with directives (lines where the first
# word begins with a colon)
def Eliza.parseDirective(line) # :nodoc:
word = Eliza.detachWord(line)
case word
when "alias"
if line.empty? || line[0] != ?$
warn "symbol after :alias must be a variable name; ignoring '#{word} #{line}'"
return
else
sym = Eliza.detachWord(line)
@@vars[sym] = Array.new
line.split.each do |s|
@@aliases[s] = sym
@@vars[sym] << s
end
end
when "start"
@@starts << line.unquote
when "stop"
@@stops << line.unquote
when "pre"
sym = Eliza.detachWord(line)
@@pre[sym] = line.unquote
when "post"
sym = Eliza.detachWord(line)
@@post[sym] = line.unquote
when "default"
@@default = line[@@word]
else
warn "unknown directive: :#{word} (ignored)"
end
end
# Helper method called by methods that read scripts -- remove a word from the front of a line
def Eliza.detachWord(line)
word = line[@@word] # pattern matches the first word
if line.index(" ")
line.slice!(0..line.index(" ")) # delete up to end of the word
line.lstrip! # in case there are extra spaces after word
else
line.slice!(0..-1) # line just had the one word
end
return word
end
# Helper method called by Eliza.load.
# Check each pattern's regular expression and replace var names by alternation
# constructs. If the script specified a default rule name look up that
# rule and save it as the default.
def Eliza.compileRules
@@rules.each do |key,val|
a = val.patterns()
a.each do |p|
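# Rebuild the expression from its source and options instead of calling eval,
# so text read from a script file is never executed as Ruby code.
source = p.regexp.source.gsub(/\$\w+/) { |x| @@vars[x].join("|") }
p.regexp = Regexp.new(source, p.regexp.options)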
end
end
if @@default.class == String
@@default = @@rules[@@default]
end
end
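=begin
A small sketch of how a variable name in a pattern becomes an alternation. The
vars table and expand_vars are hypothetical; the table stands in for what an
":alias $pet cat dog" directive would build. The sketch rebuilds the expression
with Regexp.new, keeping the original options.

```ruby
# Hypothetical variable table: variable name => list of words it stands for.
vars = { "$pet" => ["cat", "dog"] }

# A pattern whose source still contains the variable name, as it would
# after Pattern#initialize has wrapped "$pet" in parentheses.
pattern = Regexp.new('\bI like my ($pet)\b', Regexp::IGNORECASE)

# Replace each variable name in the source with "word1|word2|...".
def expand_vars(regexp, vars)
  source = regexp.source.gsub(/\$\w+/) { |name| vars.fetch(name).join("|") }
  Regexp.new(source, regexp.options)
end

expanded = expand_vars(pattern, vars)
p expanded                                # => /\bI like my (cat|dog)\b/i
p expanded.match("I like my Dog")[1]      # => "Dog"
```

Hash#fetch raises KeyError for a variable that was never defined, which makes a
typo in a script easier to spot than a nil error.
=end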
# Parse the rules in +filename+ and store them in the ElizaLab state variables (rules,
# aliases, and the pre- and postprocessing tables). If +filename+ is a symbol it
# refers to a script file in the ElizaLab data directory; if it's a string it should
# be the name of a file in the current directory.
#--
# Strategy: use a local var named 'rule', initially set to nil. New rules start with a single word
# at the start of a line. When such a line is found in the input file, create a
# new Rule object and store it in 'rule'. Subsequent lines that are part of the
# current rule (lines that contain regular expressions or strings) are added to
# current Rule object. Directives indicate the end of a rule, so 'rule' is reset
# to nil when a directive is seen.
def Eliza.load(filename)
begin
Eliza.clear
rule = nil
if filename.class == Symbol
filename = File.join(@@elizaDirectory, filename.to_s + ".txt")
end
File.open(filename).each do |line|
line.strip!
next if line.empty? || line[0] == ?#
if line[0] == ?:
Eliza.parseDirective(line)
rule = nil
else
if line =~ @@iword
rulename, priority = line.split
rule = priority ? Rule.new(rulename, priority.to_i) : Rule.new(rulename)
@@rules[rule.key] = rule
elsif rule.nil?
warn "missing rule name? unexpected input '#{line}'"
elsif line[0] == ?/
if line[-1] == ?/
rule.addPattern(line)
else
warn "badly formed expression (missing /): '#{line}'"
end
elsif line[0] == ?"
if line[-1] == ?"
rule.addReassembly(line.unquote)
else
warn "badly formed string (missing \"): '#{line}'"
end
elsif line[0] == ?@
rule.addReassembly(line)
else
warn "unexpected line in rule for #{rule.key}: '#{line}'"
end
end
end
Eliza.compileRules
@@script = filename
rescue
puts "Eliza: Error processing #{filename}: #{$!}"
return false
end
return true
end
# Print a complete description of all the rules from the current script.
def Eliza.dump
Eliza.clear unless defined? @@default
puts "Script: #{@@script}"
print "Starts:\n "; p @@starts
print "Stops:\n "; p @@stops
print "Vars:\n "; p @@vars
print "Aliases:\n "; p @@aliases
print "Pre:\n "; p @@pre
print "Post:\n "; p @@post
print "Default:\n "; p @@default
print "Queue:\n "; p @@queue.collect { |r| r.key }
puts
@@rules.each { |key,val| puts val }
return nil
end
# Print a summary description of the current script, with the number of rules
# and sentence patterns and a list of key words from all the rules.
def Eliza.info
Eliza.clear unless defined? @@default
words = Hash.new
npatterns = 0
@@rules.each do |k,r|
words[k] = 1 unless k[0] == ?$
r.patterns.each do |p|
npatterns += 1
p.cleanRegexp.split.each do |w|
Eliza.saveWords(w, words)
end
end
end
@@aliases.keys.each do |k|
Eliza.saveWords(k, words)
end
puts "Script: #{@@script}"
puts " #{@@rules.size} rules with #{npatterns} sentence patterns"
puts " #{words.length} key words: #{words.keys.sort.join(', ')}"
end
# Helper method called by Eliza.info -- don't include common words like "the" or "a"
# in list of key words, and clean up regular expression symbols. Put the remaining
# items in the hash.
def Eliza.saveWords(s, hash) # :nodoc:
return if ["a","an","in","of","the"].include?(s)
s.gsub! "(", ""
s.gsub! ")", ""
s.gsub! ".*", ""
s.gsub! "?", ""
return if s.length == 0
s.split(/\|/).each { |w| hash[w.downcase] = 1 }
end
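# Reset the response counter of every pattern in the current script, so each pattern
# starts again from its first response string. The script itself stays loaded; call
# Eliza.clear to remove it.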
def Eliza.reset
@@rules.each do |k, r|
r.patterns.each { |p| p.reset }
end
return true
end
# Top level method to carry on a conversation. Starts a read-eval-print loop,
# stopping when the user types "bye" or "quit". For each sentence, call
# Eliza.transform to find a rule that applies to the sentence and print the
# response.
def Eliza.run
Eliza.clear unless defined? @@default
puts @@starts[rand(@@starts.length)] if ! @@starts.empty?
loop do
s = Readline.readline(" H: ", true)
return if s.nil?
s.chomp!
next if s.empty?
if s == "bye" || s == "quit"
puts @@stops[rand(@@stops.length)] if ! @@stops.empty?
return
end
puts " C: " + Eliza.transform(s)
end
end
end # class Eliza
# These state variables are accessible by any methods in a class defined inside
# the ElizaLab module
@@verbose = false
@@elizaDirectory = File.join(File.dirname(__FILE__), '..', 'data', 'eliza')
@@pre = Dictionary.new
@@post = Dictionary.new
@@rules = Dictionary.new
@@word = /[a-z\-$']+/i # pattern for a "word" in the input language
@@iword = /^[a-z\-$']+/i # same, but must be the first item on the line
@@var = /\$\d+/ # variable name in reassembly string
end # module ElizaLab
end # module RubyLabs
=begin rdoc
== String
The code for the ELIZA lab (elizalab.rb) has the definition of a new method for strings
that removes quotes from the beginning and ending of a string.
=end
class String
# Call +s.unquote+ to return a copy of string +s+ with double quotes removed from
# the beginning and end.
#
# Example:
# >> s = '"Is it raining?"'
# => "\"Is it raining?\""
# >> s.unquote
# => "Is it raining?"
def unquote
if self[0] == ?" && self[-1] == ?"
return self.slice(1..-2)
else
return self
end
end
end