README.md in henkei-2.9.2.1 vs README.md in henkei-2.9.2.2
- old
+ new
@@ -26,11 +26,11 @@
Apache Tika v2.x brings with it some changes. One key change is that the Tika client and server applications have
been split up. To keep the gem size down, Henkei only includes the client app. That is to say, each time you
call Henkei, a new Java process is started, runs your command, then terminates.
Another change is the metadata keys. A lot of duplicate keys have been removed in favour of a more standards
-based approach. A list of the old vs new key names can be found [here](https://cwiki.apache.org/confluence/display/TIKA/Migrating+to+Tika+2.0.0#MigratingtoTika2.0.0-Metadata)
+based approach. A list of the old vs new key names can be found [here](https://cwiki.apache.org/confluence/display/TIKA/Migrating+to+Tika+2.0.0#MigratingtoTika2.0.0-Metadata)
## Usage
Text, metadata and MIME type information can be extracted by calling `Henkei.read` directly:
@@ -109,16 +109,30 @@
henkei = Henkei.new 'sample.docx'
henkei.mimetype.content_type #=> "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
henkei.mimetype.extensions #=> ['docx']
```
+### Output text in a specific character encoding
+
+You can specify the output character encoding by passing the optional `encoding` argument when calling the
+`text` or `html` instance methods, as well as the `read` class method.
+
+```ruby
+henkei = Henkei.new 'sample.pages'
+utf_8_text = henkei.text(encoding: 'UTF-8')
+utf_16_html = henkei.html(encoding: 'UTF-16')
+
+data = File.read 'sample.pages'
+utf_32_text = Henkei.read :text, data, encoding: 'UTF-32'
+```
+
## Installation and Dependencies
### Java Runtime
Henkei packages the Apache Tika application jar and requires a working JRE to run.
-Check that you either have the `JAVA_HOME` environment variable set, or that `java` is in your path.
+Check that you either have the `JAVA_HOME` environment variable set, or that `java` is in your path.
### Gem
Add this line to your application's Gemfile:
@@ -129,10 +143,10 @@
$ bundle
Or install it yourself as:
$ gem install henkei
-
+
### Heroku
Add the JVM Buildpack to your Heroku project:
$ heroku buildpacks:add heroku/jvm --index 1 -a YOUR_APP_NAME