Sha256: 8fde70602892862f4d841410c247c67aa6cd85be364aa9562c15c991aa57427b

Contents?: true

Size: 1.93 KB

Versions: 1

Compression:

Stored size: 1.93 KB

Contents

# Yomu 読む
[Yomu](http://github.com/Erol/yomu) is a library for extracting text and metadata using the [Apache TIKA](http://tika.apache.org/) content analysis toolkit.

Here are some of the formats supported:

- Microsoft Office OLE 2 and Office Open XML Formats (.doc, .docx, .xls, .xlsx,
  .ppt, .pptx)
- OpenOffice.org OpenDocument Formats (.odt, .ods, .odp)
- Apple iWorks Formats
- Rich Text Format (.rtf)
- Portable Document Format (.pdf)

For the complete list of supported formats, please visit the Apache Tika
[Supported Document Formats](http://tika.apache.org/0.9/formats.html) page.

## Installation and Dependencies

Add this line to your application's Gemfile:

    gem 'yomu'

And then execute:

    $ bundle

Or install it yourself as:

    $ gem install yomu

Yomu packages the Apache Tika application jar and thus requires a working JRE for it to work.

## Usage

If you're not using Bundler, you will need to require Yomu in your application:

    require 'yomu'

You can extract text by calling `Yomu.read` directly:

    data = File.read 'sample.pages'
    text = Yomu.read :text, data

##### Filename

You can also make a new instance of Yomu and pass a filename.

    yomu = Yomu.new 'sample.pages'
    text = yomu.text

##### URL

This is useful for reading remote files, like documents hosted on Amazon S3.

    yomu = Yomu.new 'http://svn.apache.org/repos/asf/poi/trunk/test-data/document/sample.docx'
    text = yomu.text

##### Stream

Yomu can also read from a stream or any object that responds to `read`, including Ruby on Rails' and Sinatra's file uploads:

    post '/:name/:filename' do
      yomu = Yomu.new params[:data]
      yomu.text
    end

## Contributing

1. Fork it
2. Create your feature branch (`git checkout -b my-new-feature`)
3. Create tests and make them pass (`rake test`)
4. Commit your changes (`git commit -am 'Added some feature'`)
5. Push to the branch (`git push origin my-new-feature`)
6. Create a new Pull Request

Version data entries

1 entries across 1 versions & 1 rubygems

Version Path
yomu-0.1.0 README.md