Sha256: 0be1eb0eb37a353e112288379ec4b11f229dd9534ff6f6f2afbaf9bc45f9a2e2

Contents?: true

Size: 1.55 KB

Versions: 1

Compression:

Stored size: 1.55 KB

Contents

# DocParser

[![Gem Version](http://img.shields.io/gem/v/docparser.svg)](http://badge.fury.io/rb/docparser) [![Build Status](http://img.shields.io/travis/jurriaan/docparser.svg)](https://travis-ci.org/jurriaan/docparser) [![Dependency Status](http://img.shields.io/gemnasium/jurriaan/docparser.svg)](https://gemnasium.com/jurriaan/docparser)


DocParser is a web scraping/screen scraping tool.

You can use it to easily scrape information out of HTML documents.

The gem is called [docparser](http://rubygems.org/gems/docparser).
You can find the documentation [here](http://rubydoc.info/github/jurriaan/docparser/).

## Features

- XPath and CSS support through Nokogiri
- Support for parallel processing of the documents
- 6 Output formats:
  * CSV
  * XLSX
  * HTML
  * YAML
  * JSON
  * Screen (for debugging and development)
  * And more! (easy to extend)

## Installation

Add this line to your application's Gemfile:

    gem 'docparser'

And then execute:

    bundle

Or install it yourself using:

    gem install docparser

## Usage

See [example.rb](https://github.com/jurriaan/docparser/blob/master/example.rb)

## Todo

- Better examples and documentation

## Contributing

1. Fork it
2. Create your feature branch (`git checkout -b my-new-feature`)
3. Commit your changes (`git commit -am 'Add some feature'`)
4. Push to the branch (`git push origin my-new-feature`)
5. Create new Pull Request

## Contributors

- [Jurriaan Pruis](https://github.com/jurriaan)

## Thanks

- [randym](https://github.com/randym) - for providing the [axlsx](https://github.com/randym/axlsx) gem

Version data entries

1 entries across 1 versions & 1 rubygems

Version Path
docparser-0.3.0 README.md