README.md in blingfire-0.1.2 vs README.md in blingfire-0.1.3

- old
+ new

@@ -30,10 +30,22 @@
 
 ```ruby
 model.text_to_sentences(text)
 ```
 
+Get offsets for words
+
+```ruby
+words, start_offsets, end_offsets = model.text_to_words_with_offsets(text)
+```
+
+Get offsets for sentences
+
+```ruby
+sentences, start_offsets, end_offsets = model.text_to_sentences_with_offsets(text)
+```
+
 ## Pre-trained Models
 
 BlingFire comes with a default model that follows the tokenization logic of NLTK with a few changes. You can also download other models:
 
 - [BERT Base](https://github.com/microsoft/BlingFire/blob/master/dist-pypi/blingfire/bert_base_tok.bin)
@@ -58,10 +70,16 @@
 
 ```ruby
 model.text_to_ids(text)
 ```
 
+Get offsets for ids
+
+```ruby
+ids, start_offsets, end_offsets = model.text_to_ids_with_offsets(text)
+```
+
 ## History
 
 View the [changelog](https://github.com/ankane/blingfire/blob/master/CHANGELOG.md)
 
 ## Contributing
@@ -77,8 +95,8 @@
 
 ```sh
 git clone https://github.com/ankane/blingfire.git
 cd blingfire
 bundle install
-bundle exec rake vendor:all
+bundle exec rake vendor:all download:models
 bundle exec rake test
 ```
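
The new `*_with_offsets` methods return three parallel arrays, which suggests the offsets can be mapped back onto the input text. The snippet below is a minimal sketch of that idea, not the gem's own code. `spans_from_offsets` is a hypothetical helper, the offsets are hand-written rather than produced by a model, and it assumes offsets are character indices with an exclusive end unless `inclusive_end: true` is passed. The diff does not state either convention, so check them against real output before relying on this.

```ruby
# Hypothetical helper (not part of the blingfire gem): return the slice of
# the original text that each start/end offset pair points to.
# Assumption: offsets are character indices. End offsets are treated as
# exclusive unless inclusive_end: true is passed.
def spans_from_offsets(text, start_offsets, end_offsets, inclusive_end: false)
  start_offsets.zip(end_offsets).map do |first, last|
    inclusive_end ? text[first..last] : text[first...last]
  end
end

# Hand-written offsets stand in for the values a model call would return.
text = "Hello world"
start_offsets = [0, 6]
end_offsets = [5, 11]
spans = spans_from_offsets(text, start_offsets, end_offsets)
```

If the spans recovered this way match the returned words or sentences, the assumed offset convention is likely the right one for that model.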