# RTesseract
Ruby library for working with the Tesseract OCR.
## Installation
Check if tesseract ocr programs is installed:
$ tesseract --version
Add this line to your application's Gemfile:
```ruby
gem 'rtesseract'
```
And then execute:
$ bundle
Or install it yourself as:
$ gem install rtesseract
## Usage
It's very simple to use rtesseract.
### Convert image to string
```ruby
image = RTesseract.new("my_image.jpg")
image.to_s # Getting the value
```
### Convert image to searchable PDF
```ruby
image = RTesseract.new("my_image.jpg")
image.to_pdf # Getting open file of pdf
```
### Convert image to TSV
```ruby
image = RTesseract.new("my_image.jpg")
image.to_tsv # Getting open file of pdf
```
This will preserve the image colors, pictures and structure in the generated pdf.
## Options
### Language
```ruby
RTesseract.new('test.jpg', lang: 'deu')
```
* eng - English
* deu - German
* deu-f - German fraktur
* fra - French
* ita - Italian
* nld - Dutch
* por - Portuguese
* spa - Spanish
* vie - Vietnamese
* or any other supported by tesseract.
Note: Make sure you have installed the language to tesseract
### Other options
```ruby
RTesseract.new('test.jpg', config_file: :digits) # Only digit recognition
```
OR
```ruby
RTesseract.new('test.jpg', config_file: 'digits quiet')
```
### BOUNDING BOX: TO GET WORDS WITH THEIR POSITIONS
```ruby
RTesseract.new('test_words.png').to_box
```
# => [
# {:word => 'If', :x_start=>52, :y_start=>13, :x_end=>63, :y_end=>27},
# {:word => 'you', :x_start=>69, :y_start=>17, :x_end=>100, :y_end=>31},
# {:word => 'are', :x_start=>108, :y_start=>17, :x_end=>136, :y_end=>27},
# {:word => 'a', :x_start=>143, :y_start=>17, :x_end=>151, :y_end=>27},
# {:word => 'friend,', :x_start=>158, :y_start=>13, :x_end=>214, :y_end=>29},
# {:word => 'you', :x_start=>51, :y_start=>39, :x_end=>82, :y_end=>53},
# {:word => 'speak', :x_start=>90, :y_start=>35, :x_end=>140, :y_end=>53},
# {:word => 'the', :x_start=>146, :y_start=>35, :x_end=>174, :y_end=>49},
# {:word => 'password,', :x_start=>182, :y_start=>35, :x_end=>267, :y_end=>53},
# {:word => 'and', :x_start=>51, :y_start=>57, :x_end=>81, :y_end=>71},
# {:word => 'the', :x_start=>89, :y_start=>57, :x_end=>117, :y_end=>71},
# {:word => 'doors', :x_start=>124, :y_start=>57, :x_end=>172, :y_end=>71},
# {:word => 'will', :x_start=>180, :y_start=>57, :x_end=>208, :y_end=>71},
# {:word => 'open.', :x_start=>216, :y_start=>61, :x_end=>263, :y_end=>75}
# ]
## Development
After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
## Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/dannnylo/rtesseract. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
## License
The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
## Code of Conduct
Everyone interacting in the Rtesseract project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/dannnylo/rtesseract/blob/master/CODE_OF_CONDUCT.md).