=data_miner

Mine remote data into your ActiveRecord models.

==Quick start

Put this in config/environment.rb:

  config.gem 'seamusabshere-data_miner', :lib => 'data_miner', :source => 'http://gems.github.com'

Put this in lib/tasks/data_miner_tasks.rake. (Unfortunately I don't know a way to automatically include gem tasks, so you have to do this manually for now.)

  namespace :data_miner do
    task :mine => :environment do
      # CLASSES is read as a comma-separated list of class names
      DataMiner.mine :class_names => ENV['CLASSES'].to_s.split(/\s*,\s*/).flatten.compact
    end
  end

You need to specify the order in which to mine data. For example, in config/initializers/data_miner_config.rb:

  DataMiner.enqueue do |queue|
    queue << Country # class whose data should be mined 1st
    queue << Airport # class whose data should be mined 2nd
    # etc
  end

You need to define mine_data blocks. For example, in app/models/country.rb:

  class Country < ActiveRecord::Base
    mine_data do |step|
      # import country names and country codes
      step.import :url => 'http://www.cs.princeton.edu/introcs/data/iso3166.csv' do |attr|
        attr.key :iso_3166, :name_in_source => 'country code'
        attr.store :iso_3166, :name_in_source => 'country code'
        attr.store :name, :name_in_source => 'country'
      end
    end
  end

To complete the example, in app/models/airport.rb:

  class Airport < ActiveRecord::Base
    belongs_to :country
    mine_data do |step|
      # import airport iata_code, name, etc.
      step.import(:url => 'http://openflights.svn.sourceforge.net/viewvc/openflights/openflights/data/airports.dat', :headers => false) do |attr|
        attr.key :iata_code, :field_number => 3
        attr.store :name, :field_number => 0
        attr.store :city, :field_number => 1
        attr.store :country, :field_number => 2, :foreign_key => :name # will use Country.find_by_name(X)
        attr.store :iata_code, :field_number => 3
        attr.store :latitude, :field_number => 5
        attr.store :longitude, :field_number => 6
      end
    end
  end

Once you have (1) set up the order of data mining and (2) defined mine_data blocks in your classes, you can run:

  $ rake data_miner:mine

==Complete example

  ~ $ rails testapp
  ~ $ cd testapp/
  ~/testapp $ ./script/generate model Airport iata_code:string name:string city:string country_id:integer latitude:float longitude:float
  ~/testapp $ ./script/generate model Country iso_3166:string name:string
  ~/testapp $ rake db:migrate
  ~/testapp $ touch lib/tasks/data_miner_tasks.rake
  [...edit per quick start...]
  ~/testapp $ touch config/initializers/data_miner_config.rb
  [...edit per quick start...]
  ~/testapp $ rake data_miner:mine

Now you should have:

  ~/testapp $ ./script/console
  Loading development environment (Rails 2.3.3)
  >> Airport.first.iata_code
  => "GKA"
  >> Airport.first.country.name
  => "Papua New Guinea"

==Copyright

Copyright (c) 2009 Brighter Planet. See LICENSE for details.