class Table

This class is a Ruby representation of a table. All data is captured as type String by default. Columns are referred to by their String headers, which are assumed to be in the first row of the input file. By default, output is written to tab-delimited files with the first row holding the header names.

Attributes

headers[R]

The headers attribute contains the table headers used to reference columns in the Table. All headers are represented as String types.

Public Class Methods

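new(input=nil)

Instantiate a Table object from a tab-delimited file, an Array of rows, or an object that responds to headers (such as another Table). With no argument, an empty Table is created.

input

OPTIONAL String naming the tab-delimited file to read, Array of rows (each row itself an Array), or an existing Table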

# File lib/tablestakes.rb, line 32
def initialize(input=nil)
  @headers = []
  @table = {}
  @indices = {}
  
  if input.respond_to?(:fetch)
    if input[0].respond_to?(:fetch)
      #create +Table+ from rows
      add_rows(input)
    end
  elsif input.respond_to?(:upcase)
    # a string, then read_file
    read_file(input)
  elsif input.respond_to?(:headers)
    init(input)
  end
  # else create empty +Table+
end
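
The helpers add_rows and read_file are not shown on this page, so the following is only a stand-alone sketch of what the constructor's two main input paths might amount to, assuming the column-oriented storage suggested by @headers and @table. The names rows_to_columns and parse_tab_delimited are hypothetical and are not part of tablestakes.

```ruby
# Hypothetical helper (not part of tablestakes): turns an Array of rows,
# whose first row holds the headers, into a header list plus a Hash that
# maps each header to its column of String values.
def rows_to_columns(rows)
  headers = rows.first.map(&:to_s)
  columns = headers.each_with_object({}) { |h, cols| cols[h] = [] }
  rows.drop(1).each do |row|
    headers.each_with_index { |h, i| columns[h] << row[i].to_s }
  end
  [headers, columns]
end

# Hypothetical helper: splits tab-delimited text into rows of Strings.
def parse_tab_delimited(text)
  text.each_line.map { |line| line.chomp.split("\t", -1) }
end

headers, columns = rows_to_columns(parse_tab_delimited("Name\tState\nAda\tNY\nBob\tCA\n"))
p headers
p columns
```

Storing one Array per header makes column lookups a single Hash access, which matches how the methods below read @table[colname].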

Public Instance Methods

bottom(colname, num=1)

Returns a Table listing the num least frequent values in the given column along with their counts, ordered from least to most frequent. The result columns are headed "State" and "Count" regardless of colname.

colname

String to identify the column to count

num

OPTIONAL Integer number of values to return (default 1)

# File lib/tablestakes.rb, line 152
def bottom(colname, num=1)
  freq = tally(colname).to_a[1..-1].sort_by {|k,v| v }
  return Table.new(freq[0..num-1].unshift(["State","Count"]))
end
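
As a rough illustration of the least-frequent selection described above, here is a stand-alone sketch on a plain Array. The method name least_frequent is hypothetical, and the sketch does not use the Table class.

```ruby
# Hypothetical helper (not part of tablestakes): returns the num least
# frequent values as [value, count] pairs, least frequent first.
def least_frequent(values, num = 1)
  counts = values.each_with_object(Hash.new(0)) { |v, h| h[v] += 1 }
  counts.sort_by { |_value, count| count }.first(num)
end

p least_frequent(%w[NY CA NY TX NY CA], 2)
```

The order among values with equal counts is not guaranteed, because sort_by is not a stable sort.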
column(colname)

Return a copy of a column from the table, identified by column name. Returns nil if column name not found.

colname

String to identify the name of the column

# File lib/tablestakes.rb, line 55
def column(colname)
  # check arguments
  return nil unless @table.has_key?(colname)

  Array(@table[colname])
end
count(colname=nil, value=nil)

Counts the occurrences of a particular string in a given column and returns an Integer >= 0. Returns nil if the column is not found. If either parameter is omitted, returns the number of rows in the table, or nil if the table is empty.

colname

OPTIONAL String to identify the column to count

value

OPTIONAL String value to count

# File lib/tablestakes.rb, line 111
def count(colname=nil, value=nil)
  if colname.nil? || value.nil?
    if @table.size > 0
      @table.each_key {|e| return @table.fetch(e).length }
    else
      return nil
    end
  end
  
  if @table[colname]
    result = 0
    @table[colname].each do |val|
      val == value.to_s ? result += 1 : nil 
    end
    result
  else
    nil 
  end
end
Also aliased as: size, length
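
The two modes of count can be sketched on a plain Hash of columns. This is an illustration only: count_matches is a hypothetical name, and the Hash stands in for the Table's internal storage.

```ruby
# Hypothetical helper (not part of tablestakes). With no column or value it
# returns the row count; otherwise it counts cells equal to value.to_s.
def count_matches(columns, colname = nil, value = nil)
  if colname.nil? || value.nil?
    return nil if columns.empty?
    return columns.values.first.length
  end
  return nil unless columns.key?(colname)
  columns[colname].count(value.to_s)
end

columns = { "Name" => %w[Ada Bob Cy], "State" => %w[NY CA NY] }
p count_matches(columns)
p count_matches(columns, "State", "NY")
```

The value is converted with to_s before comparison because cells are stored as Strings.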
get_columns(*columns)
Alias for: select
get_rows(colname, condition=nil)
Alias for: where
intersect(table2, colname, col2name=nil)

Return the intersection of columns from different tables as an Array, eliminating duplicates. Return nil if a column is not found. Raises ArgumentError if table2 is not a Table.

table2

Table to identify the secondary table in the intersection

colname

String to identify the column to intersect

col2name

OPTIONAL String to identify the column in the second table to intersect

# File lib/tablestakes.rb, line 316
def intersect(table2, colname, col2name=nil)
  # check arguments
  raise ArgumentError, "Invalid table!" unless table2.is_a?(Table)
  return nil unless @table.has_key?(colname)
  if col2name.nil?   # Assume colname applies for both tables
    col2name = colname
  end
  return nil unless table2.headers.include?(col2name)

  return self.column(colname) & table2.column(col2name)
end
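
The intersection relies on Ruby's Array#& operator. A minimal stand-alone sketch, using plain Hashes of columns in place of Table objects and a hypothetical method name:

```ruby
# Hypothetical helper (not part of tablestakes): intersects one column from
# each of two column Hashes. col2name defaults to colname.
def intersect_columns(cols1, cols2, colname, col2name = nil)
  col2name ||= colname
  return nil unless cols1.key?(colname) && cols2.key?(col2name)
  cols1[colname] & cols2[col2name]
end

cols1 = { "State" => %w[NY CA NY TX] }
cols2 = { "Abbrev" => %w[TX NY WA] }
p intersect_columns(cols1, cols2, "State", "Abbrev")
```

Array#& keeps the order of the first array and drops repeated values, which is why duplicates disappear from the result.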
join(table2, colname, col2name=nil)

Given a second table to join against and a field/column, return a Table containing the join of the two tables. The common column is listed only once, under the column name from the first table (if it differs from the name in the second). All columns from both tables are returned. Returns nil if the column is not found or if no rows match.

table2

Table to identify the secondary table in the join

colname

String to identify the column to join on

col2name

OPTIONAL String to identify the column in the second table to join on

# File lib/tablestakes.rb, line 237
def join(table2, colname, col2name=nil)
  # check arguments
  raise ArgumentError, "Invalid table!" unless table2.is_a?(Table)
  return nil unless @table.has_key?(colname)
  if col2name.nil?   # Assume colname applies for both tables
    col2name = colname
  end
  t2_col_index = table2.headers.index(col2name)
  return nil unless t2_col_index # is not nil

  
  # ensure no duplication of header values
  table2.headers.each do |h|
    if @headers.include?(h)
      update_header(h, '_' << h )
      if h == colname
        colname = '_' << colname
      end
    end
  end

  result = [ Array(@headers) + Array(table2.headers) ]
  @table[colname].each_index do |index|
    t2_index = table2.column(col2name).find_index(@table[colname][index])
    unless t2_index.nil?
      result << self.row(index) + table2.row(t2_index)
    end
  end
  if result.length == 1 #no rows selected
    return nil
  else
    return Table.new(result) 
  end
end
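
The join behavior described above can be approximated on plain Arrays of rows. This sketch is an assumption-laden illustration, not the library's code: join_rows is a hypothetical name, it pairs each row of the first table with the first matching row of the second, and it does not handle clashing header names.

```ruby
# Hypothetical helper (not part of tablestakes): joins two row Arrays whose
# first rows are headers. The key column of the second table is dropped so
# that it appears only once in the result.
def join_rows(rows1, rows2, key1, key2 = key1)
  head1, *body1 = rows1
  head2, *body2 = rows2
  i1 = head1.index(key1)
  i2 = head2.index(key2)
  return nil if i1.nil? || i2.nil?

  # Removes the cell at position i2 from a row of the second table.
  strip = ->(row) { row.each_with_index.reject { |_, i| i == i2 }.map(&:first) }

  result = [head1 + strip.call(head2)]
  body1.each do |row|
    match = body2.find { |other| other[i2] == row[i1] }
    result << row + strip.call(match) if match
  end
  result.length > 1 ? result : nil
end

people = [%w[Name State], %w[Ada NY], %w[Bob CA], %w[Cy WA]]
states = [%w[Abbrev Capital], %w[NY Albany], %w[CA Sacramento]]
p join_rows(people, states, "State", "Abbrev")
```

Rows of the first table with no match in the second are left out, so this behaves like an inner join.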
length(colname=nil, value=nil)
Alias for: count
row(index)

Return a copy of a row from the table as an Array, given an index (i.e. row number). Returns an empty Array if the index is out of bounds.

index

Integer indicating the index of the row.

# File lib/tablestakes.rb, line 66
def row(index)    
  Array(get_row(index))
end
select(*columns)

Select columns from the table, given one or more column names. Returns an instance of Table with the results. Returns nil if any column is not valid.

columns

Variable String arguments to identify the columns to select

# File lib/tablestakes.rb, line 178
def select(*columns)
  # check arguments
  columns.each do |c|
    return nil unless @table.has_key?(c)
  end

  result = []
  result_headers = []
  columns.each { |col| @headers.include?(col) ? result_headers << col : nil }
  result << result_headers
  @table[@headers.first].length.times do |row|
    this_row = []
    result_headers.each do |col|
      this_row << @table[col][row]
    end
    result << this_row
  end
  unless result_headers.empty?
    return Table.new(result)
  else
    return nil
  end
end
Also aliased as: get_columns
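
Column selection can be pictured as rebuilding rows from only the requested columns. The sketch below works on a plain Hash of columns; select_columns is a hypothetical name, not part of the library.

```ruby
# Hypothetical helper (not part of tablestakes): returns rows (header row
# first) containing only the named columns, in the order requested.
def select_columns(columns, *names)
  return nil if names.empty? || !names.all? { |n| columns.key?(n) }
  row_count = columns.values.first.length
  rows = Array.new(row_count) { |r| names.map { |n| columns[n][r] } }
  [names] + rows
end

columns = { "Name" => %w[Ada Bob], "State" => %w[NY CA], "Zip" => %w[10001 94105] }
p select_columns(columns, "Zip", "Name")
```

The result follows the order of the arguments rather than the original header order.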
size(colname=nil, value=nil)
Alias for: count
sub(colname, re, replace)

Given a field/column, a regular expression to match against, and a replacement string, update the table in place by substituting the first match in each value of the column with the replacement string. Returns the Table itself, or nil if the column is not found.

colname

String to identify the column to substitute in

re

Regexp to match the value in the selected column

replace

String to specify the replacement text for the given Regexp

# File lib/tablestakes.rb, line 280
def sub(colname, re, replace)
  # check arguments
  raise ArgumentError, "No regular expression to match against" unless re
  raise ArgumentError, "No replacement string specified" unless replace
  return nil unless @table.has_key?(colname)
  
  @table[colname].each do |item|
    item.sub!(re, replace)
  end
  return self
end
Also aliased as: sub!
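
Because the substitution uses String#sub!, only the first match in each cell is replaced. A stand-alone sketch on a Hash of columns (sub_column! is a hypothetical name) makes this visible:

```ruby
# Hypothetical helper (not part of tablestakes): replaces the first match of
# re in every cell of the named column, modifying the Strings in place.
def sub_column!(columns, colname, re, replace)
  raise ArgumentError, "No regular expression to match against" unless re
  raise ArgumentError, "No replacement string specified" unless replace
  return nil unless columns.key?(colname)
  columns[colname].each { |item| item.sub!(re, replace) }
  columns
end

# dup gives mutable copies even if string literals are frozen.
columns = { "State" => ["N.Y.", "C.A."].map(&:dup) }
sub_column!(columns, "State", /\./, "")
p columns["State"]
```

Only the first period is removed from each value. Replacing every match would need gsub! instead, which is a different behavior from the one documented here.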
sub!(colname, re, replace)
Alias for: sub
tally(colname)

Count instances in a particular field/column and return a Table of the results. Returns nil if the column is not found.

colname

String to identify the column to tally

# File lib/tablestakes.rb, line 163
def tally(colname)
  # check arguments
  return nil unless @table.has_key?(colname)

  result = {}
  @table[colname].each do |val|
    result.has_key?(val) ? result[val] += 1 : result[val] = 1
  end
  return Table.new([[colname,"Count"]] + result.to_a)
end
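
The counting step can be sketched without the Table class. In this illustration tally_rows is a hypothetical name, and a Hash with a default of 0 replaces the has_key? check.

```ruby
# Hypothetical helper (not part of tablestakes): counts each distinct value
# in a column and returns rows with a [colname, "Count"] header.
def tally_rows(columns, colname)
  return nil unless columns.key?(colname)
  counts = Hash.new(0)
  columns[colname].each { |val| counts[val] += 1 }
  [[colname, "Count"]] + counts.to_a
end

p tally_rows({ "State" => %w[NY CA NY] }, "State")
```

Values appear in the order they are first seen, since Ruby Hashes preserve insertion order.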
to_a()

Converts a Table object to an array of arrays (each row).

none

# File lib/tablestakes.rb, line 92
def to_a
  result = [ Array(@headers) ]
  
  @table[@headers.first].length.times do |row|
    items = []
    @headers.each do |col|
      items << @table[col][row]
    end
    result << items
  end
  result
end
to_s()

Converts a Table object to a tab-delimited string.

none

# File lib/tablestakes.rb, line 73
def to_s
  result = @headers.join("\t") << "\n"
  
  @table[@headers.first].length.times do |row|
    @headers.each do |col|
      result << @table[col][row].to_s
      unless col == @headers.last
        result << "\t"
      else
        result << "\n"
      end
    end
  end
  result
end
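
The tab-delimited layout can also be built by joining each row, which avoids the last-column check. This is a stand-alone sketch with a hypothetical method name, taking the headers and a Hash of columns as arguments.

```ruby
# Hypothetical helper (not part of tablestakes): renders headers and columns
# as tab-delimited text, one line per row, each ending in a newline.
def to_tab_delimited(headers, columns)
  lines = [headers.join("\t")]
  row_count = headers.empty? ? 0 : columns[headers.first].length
  row_count.times do |r|
    lines << headers.map { |h| columns[h][r].to_s }.join("\t")
  end
  lines.join("\n") + "\n"
end

headers = %w[Name State]
columns = { "Name" => %w[Ada Bob], "State" => %w[NY CA] }
print to_tab_delimited(headers, columns)
```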
top(colname, num=1)

Returns a Table listing the num most frequent values in the given column along with their counts, ordered from most to least frequent. The result columns are headed "State" and "Count" regardless of colname.

colname

String to identify the column to count

num

OPTIONAL Integer number of values to return (default 1)

# File lib/tablestakes.rb, line 140
def top(colname, num=1)
  freq = tally(colname).to_a[1..-1].sort_by {|k,v| v }.reverse
  return Table.new(freq[0..num-1].unshift(["State","Count"]))
end
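
For comparison, the most-frequent selection can be written with Enumerable#max_by instead of a sort followed by a reverse. This swaps in a different technique from the source above, and most_frequent is a hypothetical name.

```ruby
# Hypothetical helper (not part of tablestakes): returns the num most
# frequent values as [value, count] pairs, most frequent first.
def most_frequent(values, num = 1)
  counts = values.each_with_object(Hash.new(0)) { |v, h| h[v] += 1 }
  counts.max_by(num) { |_value, count| count }
end

p most_frequent(%w[NY CA NY TX NY CA], 2)
```

As with any frequency ranking, the order among values with equal counts is not guaranteed.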
union(table2, colname, col2name=nil)

Return the union of columns from different tables as an Array, eliminating duplicates. Return nil if a column is not found. Raises ArgumentError if table2 is not a Table.

table2

Table to identify the secondary table in the union

colname

String to identify the column to union

col2name

OPTIONAL String to identify the column in the second table to union

# File lib/tablestakes.rb, line 298
def union(table2, colname, col2name=nil)
  # check arguments
  raise ArgumentError, "Invalid table!" unless table2.is_a?(Table)
  return nil unless @table.has_key?(colname)
  if col2name.nil?   # Assume colname applies for both tables
    col2name = colname
  end
  return nil unless table2.headers.include?(col2name)

  return self.column(colname) | table2.column(col2name)
end
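
The union relies on Ruby's Array#| operator. A brief stand-alone sketch with plain Hashes of columns and a hypothetical method name:

```ruby
# Hypothetical helper (not part of tablestakes): unions one column from each
# of two column Hashes. col2name defaults to colname.
def union_columns(cols1, cols2, colname, col2name = nil)
  col2name ||= colname
  return nil unless cols1.key?(colname) && cols2.key?(col2name)
  cols1[colname] | cols2[col2name]
end

cols1 = { "State" => %w[NY CA NY] }
cols2 = { "Abbrev" => %w[CA WA] }
p union_columns(cols1, cols2, "State", "Abbrev")
```

Values from the first column come first, followed by any new values from the second, each listed once.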
where(colname, condition=nil)

Given a condition on a column, return a new Table containing the rows that match. If no condition is given, a new Table is returned with all records. Returns nil if no row meets the condition or the column is not found.

colname

String to identify the column to test

condition

OPTIONAL String containing a Ruby expression fragment that is appended to the quoted cell value and evaluated

# File lib/tablestakes.rb, line 211
def where(colname, condition=nil)
  # check arguments
  return nil unless @table.has_key?(colname)

  result = []
  result << @headers
  @table[colname].each_index do |index|
    if condition
      eval("'#{@table[colname][index]}' #{condition}") ? result << get_row(index) : nil
    else
      result << get_row(index)
    end
  end
  result.length > 1 ? Table.new(result) : nil
end
Also aliased as: get_rows
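
The condition is a fragment of Ruby that is appended to each quoted cell value and passed to eval. The stand-alone sketch below follows the same idea on an Array of rows, with one deliberate difference: it quotes the value with String#inspect rather than single quotes, so that values containing an apostrophe still form valid Ruby. The name where_rows is hypothetical.

```ruby
# Hypothetical helper (not part of tablestakes): keeps the rows whose cell
# in the named column satisfies the condition fragment.
# WARNING: eval runs arbitrary code, so never pass an untrusted condition.
def where_rows(rows, colname, condition = nil)
  header, *body = rows
  idx = header.index(colname)
  return nil if idx.nil?
  matches = body.select do |row|
    condition.nil? || eval("#{row[idx].to_s.inspect} #{condition}")
  end
  matches.empty? ? nil : [header] + matches
end

rows = [%w[Name State], %w[Ada NY], %w[Bob CA], ["O'Neil", "NY"]]
p where_rows(rows, "State", "== 'NY'")
p where_rows(rows, "Name", "=~ /^O'/")
```

Any fragment that completes a boolean expression works, such as a comparison or a regular expression match.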
write_file(filename)

Write a representation of the Table object to a file (tab delimited).

filename

String to identify the name of the file to write

# File lib/tablestakes.rb, line 333
def write_file(filename)
  # Block form closes the file once the write completes
  File.open(filename, "w") do |file|
    file.print self.to_s
  end
end