Sha256: 894372780ab8757161ed60b2cbe34111236b3102ea14a675e7b9a1128c0e2a48

Size: 1.96 KB

Versions: 1

Stored size: 1.96 KB

Contents

require 'dragonfly'

# When using this gem, you'll start by defining a {Scraper}, with methods for
# retrieving and processing data. The data will be stored in {DataStorage};
# this gem currently provides only a {DataStorage::FileDataStore FileDataStore}.
# You may enhance a datastore with {Decorators} and {Observers}: for example,
# a {Decorators::Timeout Timeout} decorator to retry on timeout with exponential
# backoff and an {Observers::Log Log} observer that logs retrieval progress.
# Of course, you must also define a {Processors::Transform Processor} to turn
# your raw data into machine-readable data.
#
# A skeleton scraper:
#
#     require 'unbreakable'
#
#     class MyScraper < Unbreakable::Scraper
#       def retrieve
#         # download all the documents
#       end
#       def processable
#         # return a list of documents to process
#       end
#     end
#
#     class MyProcessor < Unbreakable::Processors::Transform
#       def perform(temp_object)
#         # return the transformed record as a hash, array, etc.
#       end
#       def persist(temp_object, arg)
#         # store the hash/array/etc. in Mongo, MySQL, YAML, etc.
#       end
#     end
#
#     scraper = MyScraper.new
#     scraper.processor.register MyProcessor
#     scraper.configure do |c|
#       # configure the scraper
#     end
#     scraper.run(ARGV)
#
# Every scraper script can run as a command-line script. Try it!
#
#     ruby myscraper.rb
module Unbreakable
  autoload :Scraper, 'unbreakable/scraper'

  module Processors
    autoload :Transform, 'unbreakable/processors/transform'
  end

  module Observers
    autoload :Observer, 'unbreakable/observers/observer'
    autoload :Log, 'unbreakable/observers/log'
  end

  module Decorators
    autoload :Timeout, 'unbreakable/decorators/timeout'
  end

  module DataStorage
    autoload :FileDataStore, 'unbreakable/data_storage/file_data_store'
  end

  class UnbreakableError < StandardError; end
  class InvalidRemoteFile < UnbreakableError; end
end
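
The header comment above mentions a Timeout decorator that retries on timeout with exponential backoff. The sketch below illustrates that general technique only. It is not the gem's `Unbreakable::Decorators::Timeout` implementation, and `with_backoff`, `max_attempts`, `base_delay`, and `sleeper` are hypothetical names chosen for this example.

```ruby
require 'timeout'

# Hypothetical helper, NOT part of the unbreakable gem. It re-runs the block
# when it raises Timeout::Error, doubling the wait before each new attempt.
# `sleeper` is injectable so the delays can be observed without really waiting.
def with_backoff(max_attempts: 4, base_delay: 1.0, sleeper: Kernel.method(:sleep))
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue Timeout::Error
    # Give up and re-raise once the attempt budget is spent.
    raise if attempt >= max_attempts
    # Wait base_delay, then 2x, then 4x, and so on.
    sleeper.call(base_delay * (2**(attempt - 1)))
    retry
  end
end

# Usage: a simulated download that times out twice and then succeeds.
delays = []
result = with_backoff(base_delay: 0.5, sleeper: ->(seconds) { delays << seconds }) do |attempt|
  raise Timeout::Error, 'simulated timeout' if attempt < 3
  "fetched on attempt #{attempt}"
end

puts result
puts delays.inspect
```

Passing the sleep function as an argument keeps the retry logic easy to test. In a real scraper the default `Kernel.sleep` would apply, and the block would wrap the actual network call.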

Version data entries

1 entry across 1 version & 1 rubygem

Version Path
unbreakable-0.0.1 lib/unbreakable.rb