Sha256: 894372780ab8757161ed60b2cbe34111236b3102ea14a675e7b9a1128c0e2a48
Contents?: true
Size: 1.96 KB
Versions: 1
Compression:
Stored size: 1.96 KB
Contents
require 'dragonfly'

# When using this gem, you'll start by defining a {Scraper}, with methods for
# retrieving and processing data. The data will be stored in {DataStorage};
# this gem currently provides only a {DataStorage::FileDataStore FileDataStore}.
# You may enhance a datastore with {Decorators} and {Observers}: for example,
# a {Decorators::Timeout Timeout} decorator to retry on timeout with exponential
# backoff and a {Observers::Log Log} observer which logs retrieval progress.
# Of course, you must also define a {Processors::Transform Processor} to turn
# your raw data into machine-readable data.
#
# A skeleton scraper:
#
#     require 'unbreakable'
#
#     class MyScraper < Unbreakable::Scraper
#       def retrieve
#         # download all the documents
#       end
#       def processable
#         # return a list of documents to process
#       end
#     end
#
#     class MyProcessor < Unbreakable::Processors::Transform
#       def perform(temp_object)
#         # return the transformed record as a hash, array, etc.
#       end
#       def persist(temp_object, arg)
#         # store the hash/array/etc. in Mongo, MySQL, YAML, etc.
#       end
#     end
#
#     scraper = MyScraper.new
#     scraper.processor.register MyProcessor
#     scraper.configure do |c|
#       # configure the scraper
#     end
#     scraper.run(ARGV)
#
# Every scraper script can run as a command-line script. Try it!
#
#     ruby myscraper.rb
module Unbreakable
  autoload :Scraper, 'unbreakable/scraper'

  module Processors
    autoload :Transform, 'unbreakable/processors/transform'
  end

  module Observers
    autoload :Observer, 'unbreakable/observers/observer'
    autoload :Log, 'unbreakable/observers/log'
  end

  module Decorators
    autoload :Timeout, 'unbreakable/decorators/timeout'
  end

  module DataStorage
    autoload :FileDataStore, 'unbreakable/data_storage/file_data_store'
  end

  class UnbreakableError < StandardError; end
  class InvalidRemoteFile < UnbreakableError; end
end
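
The file's header comment mentions a Timeout decorator that retries on timeout with exponential backoff, but this file does not show how that works. The following is a minimal standalone sketch of the general technique, not the gem's actual `Unbreakable::Decorators::Timeout` implementation. The helper name `with_backoff` and its parameters (`max_attempts`, `base_delay`, `sleeper`) are hypothetical and chosen only for illustration.

```ruby
require 'timeout'

# Hypothetical helper, not part of Unbreakable: runs the block and, if it
# raises Timeout::Error, waits and tries again. The wait doubles after each
# failed attempt (base_delay, 2 * base_delay, 4 * base_delay, ...).
# The sleeper argument is injectable so the example can run without pausing.
def with_backoff(max_attempts: 4, base_delay: 0.5, sleeper: Kernel.method(:sleep))
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue Timeout::Error
    # Give up and re-raise once the attempt budget is spent.
    raise if attempt >= max_attempts
    sleeper.call(base_delay * (2**(attempt - 1)))
    retry
  end
end

# Usage: a simulated download that times out twice, then succeeds.
# Delays are recorded instead of slept.
delays = []
result = with_backoff(sleeper: ->(seconds) { delays << seconds }) do |attempt|
  raise Timeout::Error if attempt < 3
  :ok
end
```

In a real scraper the block would wrap the retrieval call, and the default `sleeper` would pause between attempts. Capping the number of attempts keeps a permanently unreachable host from stalling the run.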
Version data entries
1 entries across 1 versions & 1 rubygems
Version | Path |
---|---|
unbreakable-0.0.1 | lib/unbreakable.rb |