# HttpReader Read any document on internet and parse to your own format :D ## Installation Add this line to your application's Gemfile: gem 'http_reader' And then execute: $ bundle Or install it yourself as: $ gem install http_reader ## Usage engine = HttpReader::Engine.new(opts) engine.read('http://www.google.com') ### Available opts [Hash] - **parsers:** list of document parser Classes [ default: [] ] - **default_parser:** parser used when none parser was match for url [default: HashPageParser] - **http_client:** http_client for downloading pages sources, [default: HTTParty] - **browser:** browser_client to processing and download source, [default: Watir::Browser] - **logger:** default: Logger ## Examples ### Usage default_parser as HashPageParser engine = HttpReader::Engine.new read_opts = {title: 'h1', items: '.content li;array'} engine.read('http://example.org', read_opts) **Where page body is:**

Information

not importante

Items:
**Result should be:** {:title=>"Information", :items=>%w{A B C}} ### Usage own Parser class **Class body:** Class TestParser < BasePageParser @pattern = /^((http|https):\/\/www.google.com)$/ class << self def browse_actions_for_html(browser, opts = {}) div = browser.div(id: 'als') raise 'Cannot find div' unless div.exists? div.html end def parse(response, opts = {}) n_body = Nokogiri::HTML(response.body) { text: n_body.css('p').text } end def use_browser true end end end **initializtion:** engine = HttpReader::Engine.new(default_parser: TestParser) engine.read('http://www.google.com') **Or** engine = HttpReader::Engine.new(parsers: [TestParser]) engine.read('http://www.google.com') **Or** engine = HttpReader::Engine.new engine.read('http://www.google.com', parser: TestParser) ## More info about syntax - [watir-webdriver](https://github.com/watir/watir-webdriver) - [nokogiri](http://ruby.bastardsbook.com/chapters/html-parsing/) ## Dependecies ### Gems - nokogiri - httparty - headless - watir-webdriver ### System components - xvfb instalation on ubuntu: sudo apt-get install xvfb ## Contributing 1. Fork it 2. Create your feature branch (`git checkout -b my-new-feature`) 3. Commit your changes (`git commit -am 'Add some feature'`) 4. Push to the branch (`git push origin my-new-feature`) 5. Create new Pull Request