README.md in blingfire-0.1.2 vs README.md in blingfire-0.1.3
- old
+ new
@@ -30,10 +30,22 @@
```ruby
model.text_to_sentences(text)
```
+Get offsets for words
+
+```ruby
+words, start_offsets, end_offsets = model.text_to_words_with_offsets(text)
+```
+
+Get offsets for sentences
+
+```ruby
+sentences, start_offsets, end_offsets = model.text_to_sentences_with_offsets(text)
+```
+
## Pre-trained Models
BlingFire comes with a default model that follows the tokenization logic of NLTK with a few changes. You can also download other models:
- [BERT Base](https://github.com/microsoft/BlingFire/blob/master/dist-pypi/blingfire/bert_base_tok.bin)
@@ -58,10 +70,16 @@
```ruby
model.text_to_ids(text)
```
+Get offsets for ids
+
+```ruby
+ids, start_offsets, end_offsets = model.text_to_ids_with_offsets(text)
+```
+
## History
View the [changelog](https://github.com/ankane/blingfire/blob/master/CHANGELOG.md)
## Contributing
@@ -77,8 +95,8 @@
```sh
git clone https://github.com/ankane/blingfire.git
cd blingfire
bundle install
-bundle exec rake vendor:all
+bundle exec rake vendor:all download:models
bundle exec rake test
```
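
The `*_with_offsets` methods added in this version each return three parallel arrays. Below is a minimal sketch of how those arrays might be mapped back onto the original text. The sample arrays are hypothetical stand-ins, not real library output, and `spans_from_offsets` is an illustrative helper that is not part of the gem. It assumes the offsets are character indexes with exclusive end offsets; the README does not state either, so verify this against actual output.

```ruby
# Hypothetical sample data shaped like the return value of
# model.text_to_words_with_offsets(text); not produced by the library.
text = "Hello world"
words = ["Hello", "world"]
start_offsets = [0, 6]
end_offsets = [5, 11] # assumed exclusive; verify against real output

# Illustrative helper (not part of the gem): slice the original text
# using each (start, end) pair. Assumes character-based offsets and
# exclusive end offsets.
def spans_from_offsets(text, start_offsets, end_offsets)
  start_offsets.zip(end_offsets).map { |s, e| text[s...e] }
end

spans = spans_from_offsets(text, start_offsets, end_offsets)
puts spans.inspect
```

If the offsets turn out to be byte-based rather than character-based, `String#byteslice` would be the slicing call to use instead, and an inclusive end offset would need `s..e` in place of `s...e`.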