# DocParser

[![Gem Version](https://badge.fury.io/rb/docparser.png)](http://badge.fury.io/rb/docparser) [![Build Status](https://travis-ci.org/jurriaan/docparser.png?branch=master)](https://travis-ci.org/jurriaan/docparser) [![Dependency Status](https://gemnasium.com/jurriaan/docparser.png)](https://gemnasium.com/jurriaan/docparser) [![Coverage Status](https://coveralls.io/repos/jurriaan/docparser/badge.png?branch=master)](https://coveralls.io/r/jurriaan/docparser)


DocParser is a web scraping (screen scraping) tool for extracting information from HTML documents.

The gem is published on RubyGems as [docparser](http://rubygems.org/gems/docparser).
The API documentation is available on [rubydoc.info](http://rubydoc.info/github/jurriaan/docparser/).

## Features

- XPath and CSS support through Nokogiri
- Support for parallel processing of the documents
- 6 output formats, with more easy to add:
  * CSV
  * XLSX
  * HTML
  * YAML
  * JSON
  * Screen (for debugging and development)
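
To give a feel for what several output formats for the same scraped rows means, here is a minimal sketch that uses only Ruby's standard `csv`, `json` and `yaml` libraries. It does not use DocParser's own output classes; the `header` and `rows` values are made-up sample data.

```ruby
require 'csv'
require 'json'
require 'yaml'

# Hypothetical scraped data: one header plus one array per result row.
header = %w[title url]
rows = [
  ['First', '/a'],
  ['Second', '/b']
]

# CSV: the header line followed by one line per row.
csv_text = CSV.generate do |csv|
  csv << header
  rows.each { |row| csv << row }
end

# JSON and YAML: turn every row into a hash keyed by the header names.
records = rows.map { |row| header.zip(row).to_h }
json_text = JSON.pretty_generate(records)
yaml_text = records.to_yaml

puts csv_text
puts json_text
puts yaml_text
```

Keeping the rows as plain arrays until the last step is what makes an extra format cheap to add: each format only needs to know how to serialize a header and a list of rows.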

## Installation

Add this line to your application's Gemfile:

    gem 'docparser'

And then execute:

    bundle

Or install it yourself using:

    gem install docparser

## Usage

See [example.rb](https://github.com/jurriaan/docparser/blob/master/example.rb)

## Todo

- Better examples and documentation

## Contributing

1. Fork it
2. Create your feature branch (`git checkout -b my-new-feature`)
3. Commit your changes (`git commit -am 'Add some feature'`)
4. Push to the branch (`git push origin my-new-feature`)
5. Create a new Pull Request

## Contributors

- [Jurriaan Pruis](https://github.com/jurriaan)

## Thanks

- [randym](https://github.com/randym) - for providing the [axlsx](https://github.com/randym/axlsx) gem
