This class is a Ruby representation of a table. All data is captured as type String by default. Columns are referred to by their String headers, which are assumed to be in the first row of the input file. Output is written by default to tab-delimited files, with the first row serving as the header names.
The headers attribute contains the table headers used to reference columns in the Table. All headers are represented as String types.
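Instantiate a Table object using a tab-delimited file or an Array of rows. Per the listing below, an object that responds to headers (such as another Table) is also accepted, and omitting input creates an empty Table.
input
  OPTIONAL Array of rows or String to identify the name of the tab-delimited file to read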
    # File lib/tablestakes.rb, line 32
    def initialize(input=nil)
      @headers = []
      @table = {}
      @indices = {}

      if input.respond_to?(:fetch)
        if input[0].respond_to?(:fetch)
          #create +Table+ from rows
          add_rows(input)
        end
      elsif input.respond_to?(:upcase)
        # a string, then read_file
        read_file(input)
      elsif input.respond_to?(:headers)
        init(input)
      end
      # else create empty +Table+
    end
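The constructor picks its loading path by duck typing (respond_to?) rather than by class checks. A minimal sketch of that dispatch follows. The helper name input_kind and the returned symbols are assumptions made for illustration and are not part of the library; the helper only reports which branch would run and loads no data.

```ruby
# Hypothetical helper (not part of tablestakes): reports which branch of
# the constructor a given input would take.
def input_kind(input)
  if input.respond_to?(:fetch)
    # Array-like input; rows are only added when the first element is
    # itself Array-like, otherwise the input is silently ignored
    input[0].respond_to?(:fetch) ? :rows : :ignored
  elsif input.respond_to?(:upcase)
    :filename   # String-like input is treated as a file name
  elsif input.respond_to?(:headers)
    :table      # Table-like input is handed to init
  else
    :empty      # nil or anything else leaves the table empty
  end
end
```

Note that a flat Array of Strings falls into the ignored case, because a String does not respond to fetch.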
Returns a Table of the least frequent values in a column together with their counts, in ascending order of frequency. In the listing below the result headers are fixed as "State" and "Count" regardless of the column name, and a missing column is not checked for.
colname
  String to identify the column to count
num
  OPTIONAL Integer number of values to return (default 1)
    # File lib/tablestakes.rb, line 152
    def bottom(colname, num=1)
      freq = tally(colname).to_a[1..-1].sort_by {|k,v| v }
      return Table.new(freq[0..num-1].unshift(["State","Count"]))
    end
Return a copy of a column from the table, identified by column name. Returns nil if the column name is not found.
colname
  String to identify the name of the column
    # File lib/tablestakes.rb, line 55
    def column(colname)
      # check arguments
      return nil unless @table.has_key?(colname)

      Array(@table[colname])
    end
Counts the number of instances of a particular string, given a column name, and returns an integer >= 0. Returns nil if the column is not found. If colname or value is omitted, returns the number of rows in the table (nil for an empty table).
colname
  OPTIONAL String to identify the column to count
value
  OPTIONAL String value to count
    # File lib/tablestakes.rb, line 111
    def count(colname=nil, value=nil)
      if colname.nil? || value.nil?
        if @table.size > 0
          @table.each_key {|e| return @table.fetch(e).length }
        else
          return nil
        end
      end

      if @table[colname]
        result = 0
        @table[colname].each do |val|
          val == value.to_s ? result += 1 : nil
        end
        result
      else
        nil
      end
    end
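The counting rules can be tried without the library by modelling the table as a Hash of column Arrays, which is how @table appears to be used in the listing. The following is a sketch under that assumption; count_in is a hypothetical stand-alone function, not the library method.

```ruby
# Sketch only: `table` is a Hash mapping header => Array of String values.
def count_in(table, colname = nil, value = nil)
  if colname.nil? || value.nil?
    # Row count is the length of any column; nil for an empty table
    first = table.values.first
    return first ? first.length : nil
  end
  return nil unless table.key?(colname)

  # Cells are compared against value.to_s, so 2 matches the cell "2"
  table[colname].count { |cell| cell == value.to_s }
end
```

Because the comparison goes through to_s, passing the Integer 2 and the String "2" give the same result.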
Return the intersection of columns from different tables, eliminating duplicates. Return nil if a column is not found.
table2
  Table to identify the secondary table in the intersection
colname
  String to identify the column to intersect
col2name
  OPTIONAL String to identify the column in the second table to intersect (defaults to colname)
    # File lib/tablestakes.rb, line 316
    def intersect(table2, colname, col2name=nil)
      # check arguments
      raise ArgumentError, "Invalid table!" unless table2.is_a?(Table)
      return nil unless @table.has_key?(colname)
      if col2name.nil?
        # Assume colname applies for both tables
        col2name = colname
      end
      return nil unless table2.headers.include?(col2name)

      return self.column(colname) & table2.column(col2name)
    end
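The result comes from Ruby's Array#& operator, which keeps the order of the left-hand array and drops duplicates. A small stand-alone illustration on plain Arrays, where common_values is a hypothetical name used only here:

```ruby
# Array#& returns the elements present in both arrays, without
# duplicates, in the order they first appear in the left-hand array.
def common_values(col1, col2)
  col1 & col2
end

common_values(["NY", "CA", "NY", "TX"], ["TX", "NY", "WA"])  # => ["NY", "TX"]
```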
Given a second table to join against, and a field/column, return a Table which contains a join of the two tables. All columns from both tables are returned. In the listing below, headers of this table that also occur in the second table are renamed with a leading underscore before joining, so that no header appears twice. Returns nil if the column is not found or if no rows match.
table2
  Table to identify the secondary table in the join
colname
  String to identify the column to join on
col2name
  OPTIONAL String to identify the column in the second table to join on (defaults to colname)
    # File lib/tablestakes.rb, line 237
    def join(table2, colname, col2name=nil)
      # check arguments
      raise ArgumentError, "Invalid table!" unless table2.is_a?(Table)
      return nil unless @table.has_key?(colname)
      if col2name.nil?
        # Assume colname applies for both tables
        col2name = colname
      end
      t2_col_index = table2.headers.index(col2name)
      return nil unless t2_col_index # is not nil

      # ensure no duplication of header values
      table2.headers.each do |h|
        if @headers.include?(h)
          update_header(h, '_' << h )
          if h == colname
            colname = '_' << colname
          end
        end
      end

      result = [ Array(@headers) + Array(table2.headers) ]
      @table[colname].each_index do |index|
        t2_index = table2.column(col2name).find_index(@table[colname][index])
        unless t2_index.nil?
          result << self.row(index) + table2.row(t2_index)
        end
      end
      if result.length == 1 #no rows selected
        return nil
      else
        return Table.new(result)
      end
    end
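The row-matching step can be sketched on plain Arrays of rows, with the header row first. This is an illustration under stated assumptions, not the library code: join_rows is a hypothetical helper, and it leaves out the header renaming that the listing performs.

```ruby
# Sketch: each table is an Array of rows, and the first row holds headers.
def join_rows(left, right, left_col, right_col)
  l_idx = left.first.index(left_col)
  r_idx = right.first.index(right_col)
  return nil if l_idx.nil? || r_idx.nil?

  right_keys = right[1..-1].map { |row| row[r_idx] }
  result = [left.first + right.first]
  left[1..-1].each do |row|
    # find_index returns only the first match, so a key repeated in the
    # right-hand table is paired with its first occurrence only
    match = right_keys.find_index(row[l_idx])
    result << row + right[match + 1] unless match.nil?
  end
  result.length == 1 ? nil : result   # nil when no rows were selected
end
```

Rows of the first table without a match are dropped, which makes this an inner join. Because of find_index, each row of the first table produces at most one output row.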
Return a copy of a row from the table as an Array, given an index (i.e. row number). Returns an empty Array if the index is out of bounds.
index
  Fixnum indicating the index of the row
    # File lib/tablestakes.rb, line 66
    def row(index)
      Array(get_row(index))
    end
Select columns from the table, given one or more column names. Returns an instance of Table with the results. Returns nil if any column is not valid.
columns
  Variable String arguments to identify the columns to select
    # File lib/tablestakes.rb, line 178
    def select(*columns)
      # check arguments
      columns.each do |c|
        return nil unless @table.has_key?(c)
      end

      result = []
      result_headers = []
      columns.each { |col| @headers.include?(col) ? result_headers << col : nil }
      result << result_headers
      @table[@headers.first].length.times do |row|
        this_row = []
        result_headers.each do |col|
          this_row << @table[col][row]
        end
        result << this_row
      end
      unless result_headers.empty?
        return Table.new(result)
      else
        return nil
      end
    end
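The listing turns column-major storage into rows, keeping the columns in the order they were requested. A compact sketch of the same idea on a Hash of column Arrays follows; select_columns is a hypothetical helper, and it returns rows rather than a Table.

```ruby
# Sketch: `table` maps header => Array of values (column-major storage).
def select_columns(table, *columns)
  return nil if columns.empty? || columns.any? { |c| !table.key?(c) }

  row_count = table[columns.first].length
  # Header row first, then one Array per row in the requested column order
  [columns] + (0...row_count).map { |i| columns.map { |c| table[c][i] } }
end
```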
Given a field/column, a regular expression to match against, and a replacement string, update the table in place by substituting the first match in each value of the column with the replacement string. Returns the Table itself, or nil if the column is not found.
colname
  String to identify the column to update
re
  Regexp to match the value in the selected column
replace
  String to specify the replacement text for the given Regexp
    # File lib/tablestakes.rb, line 280
    def sub(colname, re, replace)
      # check arguments
      raise ArgumentError, "No regular expression to match against" unless re
      raise ArgumentError, "No replacement string specified" unless replace
      return nil unless @table.has_key?(colname)

      @table[colname].each do |item|
        item.sub!(re, replace)
      end
      return self
    end
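Two properties of String#sub! shape this method: it modifies each String in place, and it replaces only the first match in each value. A stand-alone sketch on a Hash of column Arrays, where sub_column is a hypothetical name:

```ruby
# Sketch: `table` maps header => Array of mutable Strings.
def sub_column(table, colname, re, replace)
  raise ArgumentError, "No regular expression to match against" unless re
  raise ArgumentError, "No replacement string specified" unless replace
  return nil unless table.key?(colname)

  # sub! edits each String in place and touches only its first match;
  # values without a match are left unchanged
  table[colname].each { |item| item.sub!(re, replace) }
  table
end
```

For example, replacing /-/ with "." in "555-12-34" gives "555.12-34". Replacing every match would need gsub! instead, which is not what the listing uses.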
Count instances in a particular field/column and return a Table of the results. Returns nil if the column is not found.
colname
  String to identify the column to tally
    # File lib/tablestakes.rb, line 163
    def tally(colname)
      # check arguments
      return nil unless @table.has_key?(colname)

      result = {}
      @table[colname].each do |val|
        result.has_key?(val) ? result[val] += 1 : result[val] = 1
      end
      return Table.new([[colname,"Count"]] + result.to_a)
    end
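The shape of the result is a header row followed by one [value, count] pair per distinct value, in order of first appearance. A minimal sketch on a plain Array of values; tally_rows is a hypothetical helper that returns the rows instead of wrapping them in a Table.

```ruby
# Sketch: count occurrences with a Hash that defaults missing keys to 0.
def tally_rows(values, colname)
  counts = Hash.new(0)
  values.each { |v| counts[v] += 1 }
  # Hash#to_a keeps insertion order, so values appear as first seen
  [[colname, "Count"]] + counts.to_a
end

tally_rows(["NY", "CA", "NY"], "State")
# => [["State", "Count"], ["NY", 2], ["CA", 1]]
```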
Converts a Table object to an array of arrays (each row).
none
    # File lib/tablestakes.rb, line 92
    def to_a
      result = [ Array(@headers) ]

      @table[@headers.first].length.times do |row|
        items = []
        @headers.each do |col|
          items << @table[col][row]
        end
        result << items
      end
      result
    end
Converts a Table object to a tab-delimited string.
none
    # File lib/tablestakes.rb, line 73
    def to_s
      result = @headers.join("\t") << "\n"

      @table[@headers.first].length.times do |row|
        @headers.each do |col|
          result << @table[col][row].to_s
          unless col == @headers.last
            result << "\t"
          else
            result << "\n"
          end
        end
      end
      result
    end
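The output format is one line per row, with cells separated by tabs and every line (including the last) terminated by a newline. A sketch of that format for an Array of rows follows; rows_to_tsv is a hypothetical helper that works on rows rather than on the column-major @table.

```ruby
# Sketch: serialise rows (header row first) as tab-delimited text.
def rows_to_tsv(rows)
  # Each cell goes through to_s, as in the listing, so non-String cells work
  rows.map { |row| row.map(&:to_s).join("\t") + "\n" }.join
end

rows_to_tsv([["Name", "Age"], ["Al", 30]])  # => "Name\tAge\nAl\t30\n"
```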
Returns a Table of the most frequent values in a column together with their counts, in descending order of frequency. In the listing below the result headers are fixed as "State" and "Count" regardless of the column name, and a missing column is not checked for.
colname
  String to identify the column to count
num
  OPTIONAL Integer number of values to return (default 1)
    # File lib/tablestakes.rb, line 140
    def top(colname, num=1)
      freq = tally(colname).to_a[1..-1].sort_by {|k,v| v }.reverse
      return Table.new(freq[0..num-1].unshift(["State","Count"]))
    end
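Both frequency rankings in this class follow the same recipe: tally the column, sort the pairs by count, and slice off the first num entries, with the most-frequent variant reversing the sorted list first. A sketch of that recipe on a plain Array follows. The name rank_by_frequency and the most_frequent flag are assumptions for illustration only.

```ruby
# Sketch: return the `num` most or least frequent values with their counts.
def rank_by_frequency(values, num = 1, most_frequent = true)
  counts = Hash.new(0)
  values.each { |v| counts[v] += 1 }

  ranked = counts.sort_by { |_value, n| n }   # ascending by count
  ranked.reverse! if most_frequent            # descending for "top"
  ranked.first(num)
end
```

Since sort_by is not guaranteed to be stable, the relative order of values with equal counts should not be relied on.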
Return the union of columns from different tables, eliminating duplicates. Return nil if a column is not found.
table2
  Table to identify the secondary table in the union
colname
  String to identify the column to union
col2name
  OPTIONAL String to identify the column in the second table to union (defaults to colname)
    # File lib/tablestakes.rb, line 298
    def union(table2, colname, col2name=nil)
      # check arguments
      raise ArgumentError, "Invalid table!" unless table2.is_a?(Table)
      return nil unless @table.has_key?(colname)
      if col2name.nil?
        # Assume colname applies for both tables
        col2name = colname
      end
      return nil unless table2.headers.include?(col2name)

      return self.column(colname) | table2.column(col2name)
    end
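The result comes from Ruby's Array#| operator, which concatenates the two columns and removes duplicates while keeping first-seen order. A small stand-alone illustration on plain Arrays, where all_values is a hypothetical name used only here:

```ruby
# Array#| returns every distinct element of both arrays, in the order
# each one is first encountered (left-hand array first).
def all_values(col1, col2)
  col1 | col2
end

all_values(["NY", "CA", "NY"], ["TX", "NY", "WA"])  # => ["NY", "CA", "TX", "WA"]
```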
Given a condition for a given field/column, return a subtable of the rows that match the condition. If no condition is given, a new Table is returned with all records. Returns nil if no row meets the condition or the column is not found. In the listing below the condition is appended to each single-quoted cell value and the result is passed to eval, so it must be a Ruby expression fragment such as == 'NY'.
colname
  String to identify the column to test
condition
  OPTIONAL String containing a ruby condition to evaluate
    # File lib/tablestakes.rb, line 211
    def where(colname, condition=nil)
      # check arguments
      return nil unless @table.has_key?(colname)

      result = []
      result << @headers
      @table[colname].each_index do |index|
        if condition
          eval("'#{@table[colname][index]}' #{condition}") ? result << get_row(index) : nil
        else
          result << get_row(index)
        end
      end
      result.length > 1 ? Table.new(result) : nil
    end
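It helps to see the expression that is actually evaluated. Each cell is wrapped in single quotes and the condition text is appended, so the cell is always a String on the left-hand side. The sketch below isolates that step; matching_indices is a hypothetical helper, not part of the library.

```ruby
# Sketch: return the indices of the values for which the condition holds.
def matching_indices(values, condition)
  values.each_index.select do |i|
    # Builds source text such as "'NY' == 'NY'" and evaluates it
    eval("'#{values[i]}' #{condition}")
  end
end

matching_indices(["NY", "CA", "NY"], "== 'NY'")      # => [0, 2]
matching_indices(["5", "12", "30"], ".to_i > 10")    # => [1, 2]
```

Numeric comparisons need an explicit conversion such as .to_i, because the cell is a String. A cell containing a single quote would produce invalid source text, and since eval runs arbitrary Ruby, the condition should never come from untrusted input.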
Write a representation of the Table object to a file (tab delimited).
filename
  String to identify the name of the file to write
    # File lib/tablestakes.rb, line 333
    def write_file(filename)
      file = File.open(filename, "w")
      file.print self.to_s
    end
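The listing opens the file but does not close it explicitly, so buffered output is only guaranteed to reach disk once the File object is garbage collected or the program exits. A common alternative is the block form of File.open, which closes the file when the block finishes. The sketch below shows that form with a hypothetical helper named write_text. It is a suggested variation, not the library's code.

```ruby
# Sketch: write text to a file, closing the handle automatically.
def write_text(filename, text)
  # The block form of File.open closes the file even if print raises
  File.open(filename, "w") { |file| file.print text }
end
```

A caller could pass the String returned by to_s as the text argument.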