Sha256: 02f842d4eb2f2ef717e4367431b2bb2c64e42fd71b226cfc80db75a142cfac96

Size: 1.95 KB

Versions: 1

Contents

require 'dragonfly'

# When using this gem, you'll start by defining a {Scraper}, with methods for
# retrieving and processing data. The data will be stored in {DataStorage};
# this gem currently provides only a {DataStorage::FileDataStore FileDataStore}.
# You may enhance a datastore with {Decorators} and {Observers}: for example,
# a {Decorators::Timeout Timeout} decorator to retry on timeout with exponential
# backoff and a {Observers::Log Log} observer which logs retrieval progress.
# Of course, you must also define a {Processors::Transform Processor} to turn
# your raw data into machine-readable data.
#
# A skeleton scraper:
#
#     require 'unbreakable'
#
#     class MyScraper < Unbreakable::Scraper
#       def retrieve(*args)
#         # download all the documents
#       end
#       def processable
#         # return a list of documents to process
#       end
#     end
#
#     class MyProcessor < Unbreakable::Processors::Transform
#       def perform
#         # return the transformed record as a hash, array, etc.
#       end
#       def persist(arg)
#         # store the hash/array/etc. in Mongo, MySQL, YAML, etc.
#       end
#     end
#
#     scraper = MyScraper.new
#     scraper.processor.register MyProcessor
#     scraper.configure do |c|
#       # configure the scraper
#     end
#     scraper.run(ARGV)
#
# Every scraper script can run as a command-line script. Try it!
#
#     ruby myscraper.rb
module Unbreakable
  autoload :Scraper, 'unbreakable/scraper'

  module Processors
    autoload :Transform, 'unbreakable/processors/transform'
  end

  module Observers
    autoload :Observer, 'unbreakable/observers/observer'
    autoload :Log, 'unbreakable/observers/log'
  end

  module Decorators
    autoload :Timeout, 'unbreakable/decorators/timeout'
  end

  module DataStorage
    autoload :FileDataStore, 'unbreakable/data_storage/file_data_store'
  end

  class UnbreakableError < StandardError; end
  class InvalidRemoteFile < UnbreakableError; end
end
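
The two error classes above form a small hierarchy, so a caller can rescue `UnbreakableError` and catch `InvalidRemoteFile` as well. The sketch below illustrates that. It redefines stand-in copies of the two classes so it runs without the gem installed, and `fetch_document` and `safe_fetch` are hypothetical helpers, not part of the gem.

```ruby
# Stand-in definitions mirroring the two error classes declared in
# lib/unbreakable.rb, so this sketch runs without the gem installed.
module Unbreakable
  class UnbreakableError < StandardError; end
  class InvalidRemoteFile < UnbreakableError; end
end

# Hypothetical helper (an assumption, not a gem method): simulates a
# download whose content fails validation.
def fetch_document(url)
  raise Unbreakable::InvalidRemoteFile, "unexpected content at #{url}"
end

# Hypothetical helper: rescuing the base class also catches its subclasses.
def safe_fetch(url)
  fetch_document(url)
rescue Unbreakable::UnbreakableError => e
  "skipped: #{e.class.name}: #{e.message}"
end

puts safe_fetch('http://example.com/report.pdf')
```

Rescuing the base class keeps calling code short, while a narrower `rescue Unbreakable::InvalidRemoteFile` remains available when only bad downloads should be handled.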

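The doc comment mentions a Timeout decorator that retries on timeout with exponential backoff. The source of that decorator is not shown here, so the following is only a minimal sketch of the general technique. `with_backoff` and its `max_attempts`, `base_delay`, and `sleeper` parameters are assumptions made for illustration, not the gem's API.

```ruby
require 'timeout'

# Hypothetical helper, NOT the gem's Decorators::Timeout: retries the block
# when it raises Timeout::Error, doubling the wait before each new attempt.
# The sleeper is injectable so the delays can be observed without waiting.
def with_backoff(max_attempts: 4, base_delay: 0.5, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue Timeout::Error
    # Give up and re-raise once the attempt budget is spent.
    raise if attempt >= max_attempts
    # Waits grow as base_delay * 1, 2, 4, ...
    sleeper.call(base_delay * (2**(attempt - 1)))
    retry
  end
end

# Usage: simulate two timeouts followed by a success, recording the waits
# instead of actually sleeping.
delays = []
value = with_backoff(sleeper: ->(seconds) { delays << seconds }) do |attempt|
  raise Timeout::Error, 'simulated timeout' if attempt < 3
  "fetched on attempt #{attempt}"
end
puts value
puts delays.inspect
```

Doubling the delay gives a struggling server progressively more room to recover, and capping the attempts ensures a permanent failure still surfaces to the caller.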
Version data entries

1 entry across 1 version and 1 rubygem

Version Path
unbreakable-0.0.2 lib/unbreakable.rb