Sha256: 764943d1af1477b9005fddaaf4434254dd2fc0363928e63e3e8b4f0d5d66a410

Contents?: true

Size: 1.22 KB

Versions: 3

Stored size: 1.22 KB

Contents

# The University of California, Irvine hosts a great collection of machine
# learning sample data sets.  Their data description pages list the field
# labels.  This class extracts those labels and returns a DataFrame built
# from them.

# Turns out, this isn't very useful.  So...oh well.
# By the way, the data sets I'm talking about are found here: http://archive.ics.uci.edu/ml/
# To use this class:
#   require 'lib/data_frame/labels_from_uci'
#   df = LabelsFromUCI.data_frame 'http://archive.ics.uci.edu/ml/machine-learning-databases/communities/communities.names'
#   df.import('http://archive.ics.uci.edu/ml/machine-learning-databases/communities/communities.data')

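require 'open-uri'

class LabelsFromUCI

  class << self
    # Returns the array of label names found in the description at url.
    def process(url)
      lfu = new(url)
      lfu.labels
    end

    # Returns a DataFrame built from the labels found in the description at url.
    def data_frame(url)
      lfu = new(url)
      DataFrame.new(lfu.labels)
    end
  end

  attr_reader :url, :contents, :labels

  def initialize(url)
    @url = url
    # Kernel#open no longer opens URLs as of Ruby 3.0; URI.open (from open-uri) does.
    @contents = URI.open(url) { |f| f.read }
    process_labels
  end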
  
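  protected
    # Collects the first label captured on each line of the description.
    def process_labels
      @labels = []
      @contents.each_line do |line|
        @labels << $1 if line =~ label_re
      end
    end

    # Matches lines such as "@attribute state numeric" and captures the
    # attribute name ("state").
    def label_re
      /@attribute (\w+)/
    end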
end
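
The label extraction above can be tried without a network connection. The following is a minimal sketch, not part of the gem: the sample text, the `LABEL_RE` constant, and the `extract_labels` helper are hypothetical names introduced for illustration, and the sample only imitates the `@attribute` lines the class looks for. It applies the same regular expression line by line, as `process_labels` does.

```ruby
# Illustrative sample in the "@attribute name type" style that the class
# scans for. This text is an assumption, not a copy of a real UCI file.
SAMPLE = <<~TEXT
  @relation crimepredict
  @attribute state numeric
  @attribute county numeric
  @attribute communityname string
  @data
TEXT

# Same pattern as LabelsFromUCI#label_re.
LABEL_RE = /@attribute (\w+)/

# Hypothetical helper: returns the first captured label of every matching
# line, skipping lines that do not match.
def extract_labels(text)
  text.each_line.map { |line| line[LABEL_RE, 1] }.compact
end

labels = extract_labels(SAMPLE)
puts labels.inspect  # => ["state", "county", "communityname"]
```

Lines without an `@attribute` marker (here `@relation` and `@data`) are ignored, so a description page that lists its fields in some other layout would yield an empty array.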

Version data entries

3 entries across 3 versions & 2 rubygems

Version Path
davidrichards-data_frame-0.0.19 lib/data_frame/labels_from_uci.rb
davidrichards-data_frame-0.0.20 lib/data_frame/labels_from_uci.rb
data_frame-0.1.8 lib/data_frame/labels_from_uci.rb