# PDF Table Data Extractor

A simple tool for extracting table data from PDF files.

## Presentation

This library recognizes table-like structures in PDF files:

 - Table Headers
 - Table Rows
 - Sub-Table Names (Partial tables)

The library also includes a set of filters that keep the output clean: free of false positives and of unusable or garbage data.

## Installation

### Gemfile
```ruby
gem 'pdftdx'
```

### Terminal
```bash
gem install -V pdftdx
```

## Usage example

Reading a PDF file:
```ruby
require 'pdftdx'

# Extract the table data from the PDF at the given path
tables = PDFTDX::extract_data 'path to your PDF file'
puts tables.inspect
```

Output:
```
=> [{
     head: ['trauma.eresse.net', 'durjaya.dooba.io', 'suessmost.eresse.net'],
     data: [{
       name: 'System',
       data: [
         ['Machine OS', 'Win32', 'Linux', 'MacOS'],
         ['IP Address', '10.0.232.48', '10.0.232.134', '10.0.232.108']
       ]
     }]
   }]
```
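
Working with the result:

The structure above can be walked with plain Ruby. The following is a minimal sketch, not part of the gem: `SAMPLE_TABLES` is the output shown above copied in by hand so the snippet runs without a PDF, and `index_by_column` is a hypothetical helper. It assumes that the first cell of each row is a row label and that the remaining cells line up, in order, with the entries of `:head`, which matches the sample but may not hold for every PDF.

```ruby
# Sample data copied from the output above; in real use this would be
# the value returned by PDFTDX::extract_data.
SAMPLE_TABLES = [
  {
    head: ['trauma.eresse.net', 'durjaya.dooba.io', 'suessmost.eresse.net'],
    data: [
      {
        name: 'System',
        data: [
          ['Machine OS', 'Win32', 'Linux', 'MacOS'],
          ['IP Address', '10.0.232.48', '10.0.232.134', '10.0.232.108']
        ]
      }
    ]
  }
].freeze

# Hypothetical helper (not provided by pdftdx). For each table, builds
#   { column header => { "sub-table name / row label" => cell value } }
# Assumption: row[0] is the row label and row[1..] aligns with :head.
def index_by_column(tables)
  tables.map do |table|
    table[:head].each_with_index.map do |column, i|
      cells = table[:data].flat_map do |sub|
        sub[:data].map { |label, *values| ["#{sub[:name]} / #{label}", values[i]] }
      end
      [column, cells.to_h]
    end.to_h
  end
end

indexed = index_by_column(SAMPLE_TABLES)
puts indexed.first['durjaya.dooba.io']['System / Machine OS']
```

Indexing by column header makes lookups such as "the IP address of a given host" a two-key hash access instead of a scan over rows.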

## License

The gem is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).
