=begin rdoc
== HashLab
Methods used in experiments on hash tables.
NOTE: the +insert+ and +lookup+ methods defined here are "stubs" used to test the hash
functions. Load the file hashmethods.rb to overwrite the stubs with the actual
methods that create and search buckets.
=end
module RubyLabs
module HashLab
TableView = Struct.new(:table, :nrows)
=begin rdoc
Compute the sum of the letters in a string, where each letter is weighted
according to its position. Requires the +ord+ method added to Fixnum in the
RubyLabs module.
=end
# :begin :radix26
def radix26(s)
x = 0
s.each_byte do |b|
x = x * 26 + b.ord
end
return x
end
# :end :radix26
=begin rdoc
Trival hash function that uses only the first letter in the input string. +s+ is the
string to hash, +n+ is the size of the hash table.
=end
# :begin :h0
def h0(s, n)
return s[0].ord % n
end
# :end :h0
=begin rdoc
Slightly better hash function that uses the first two letters in the input string. +s+ is the
string to hash, +n+ is the size of the hash table.
=end
# :begin :h1
def h1(s, n)
return ((s[0].ord * 26) + s[1].ord) % n
end
# :end :h1
=begin rdoc
Hash function based on the radix-26 representation of the input string. +s+ is the
string to hash, +n+ is the size of the hash table.
=end
# :begin :hr
def hr(s, n)
return radix26(s) % n
end
# :end :hr
=begin rdoc
Simple hash function used in the first project, to introduce the idea of hashing.
Uses a default table size of 10 and the h0 method (which just looks at the first
letter in a string).
=end
# :begin :h
def h(s, n = 10)
return s[0].ord % n
end
# :end :h
=begin rdoc
Method used in the first project: if +s+ can be added to +t+ (i.e. if no
collision) return the location, otherwise return +nil+
=end
# :begin :insert
def insert(s, t)
i = h(s, t.length)
if t[i].nil?
t[i] = s
return i
else
return nil
end
end
# :end :insert
=begin rdoc
Method used in the first project: if +s+ is in array +t+ return the location,
otherwise return +nil+
=end
# :begin :lookup
def lookup(s, t)
i = h(s, t.length)
if t[i] == s
return i
else
return nil
end
end
# :end :lookup
=begin rdoc
Make a hash table with +n+ buckets. The optional parameter is a symbol specifying
an alternative hash function (:h0 or :h1); the default is the method +hr+, which
uses the radix26 function.
The object returned by this method is an array of the designated size. A bit of
Ruby magic defines three new methods just for the new object:
[h=(f)] tells the table to use method f for the hash function (:h0, :h1, or nil)
[h] returns the id of the hash function currently being used
[hash(s)] call the current hash function on string s
=end
class HashTable
@@hash_functions = [:h0, :h1, :hr]
attr_reader :table
def initialize(n, f = :hr)
raise "HashTable: hash function must be one of #{@@hash_functions.inspect}" unless @@hash_functions.include?(f)
@table = Array.new(n)
@hash = f
end
# def hash(s)
# return case @h
# when :h0 : h0(s, @table.length)
# when :h1 : h1(s, @table.length)
# else hr(s, @table.length)
# end
# end
def lookup(s)
i = send(@hash, s, @table.length)
return ( @table[i] && @table[i].include?(s) ) ? i : nil
end
def insert(s)
i = send(@hash, s, @table.length)
@table[i] = Array.new if @table[i].nil?
@table[i] << s
return i
end
def to_s
print_table(self)
end
def inspect
sprintf '#', @table.size, @hash.to_s
end
=begin rdoc
Print usage statistics for the table. Prints lengths of longest and shortest
buckets, number of empty buckets, and mean bucket length.
=end
def print_stats
max = 0
min = Float::MAX
nzero = 0
sum = 0
@table.each do |bucket|
n = bucket.nil? ? 0 : bucket.length
min = n if n < min
max = n if n > max
sum += n
nzero += 1 if n == 0
end
printf "shortest bucket: %d\n", min
printf "longest bucket: %d\n", max
printf "empty buckets: %d\n", nzero
if max > 0
printf "mean bucket length: %.2f\n", sum.to_f / (@table.length - nzero)
end
return nil
end
=begin rdoc
Return a list of indices for buckets that have more than +cutoff+ entries.
=end
def long_rows(cutoff = 5)
rows = Array.new
@table.each_with_index do |row, i|
rows << i if !row.nil? && row.length > cutoff
end
return rows
end
end # HashTable
=begin rdoc
Print a nicely formatted representation of hash table. The parameter +t+ can be
an array or a HashTable object. The optional parameter
is the number of rows to print, e.g. to print just the first 10 rows of a large table
+t+ call print_table(t,10). Skips rows that have empty buckets.
=end
def print_table(t, max = nil)
t = t.table if t.class == HashTable
max = t.length unless max
max.times { |i| print_row(i, t[i] ) unless t[i].nil? }
return nil
end
def print_row(n, row) # :nodoc:
printf "%4d: %s\n", n, row.inspect
end
=begin rdoc
Verification of numbers shown in the table for the birthday paradox. Call
birthday(n,m) to make a table with n rows and fill it with m random words.
Return true if any row has more than one item.
=end
def birthday(n, m)
t = HashTable.new(n)
TestArray.new(m, :words).each { |s| t.insert(s) }
# puts t
t.table.each { |row| return true if row && row.length > 1 }
return false
end
=begin rdoc
Initialize the canvas with a drawing of an hash table +t+ (a table is
just an array augmented with a hash method)
=end
def view_table(t, userOptions = {} )
options = @@tableOptions.merge(userOptions)
Canvas.init(300, 500, "HashLab")
tbl = t.table
x0 = options[:tableX]
x1 = x0 + options[:cellWidth]
# if nil, color = gray, else color = white and draw text
# todo -- arrow
# todo -- save text, update with call to insert
nrows = min(tbl.size, options[:maxRows])
nrows.times do |i|
y0 = options[:tableY] + i * (options[:cellHeight] + options[:cellYSpace])
y1 = y0 + options[:cellHeight]
Canvas.rectangle(x0, y0, x1, y1, :fill => options[:cellColor])
# Canvas.text(tbl[i].inspect, textx, y0)
end
@@drawing = TableView.new(t, nrows)
Canvas.sync
return true
end
@@tableOptions = {
:tableX => 20,
:tableY => 20,
:cellHeight => 15,
:cellYSpace => 2,
:cellWidth => 30,
:cellColor => :lightgray,
:textX => 70,
:maxRows => 25,
}
end # HashLab
end # RubyLabs