Sha256: 53b8cd95032a93a9dbc5c333554f4c28ee77498b7c0c1c57408b9422885547b3
Contents?: true
Size: 1.61 KB
Versions: 1
Compression:
Stored size: 1.61 KB
Contents
# Ruby Splitta

* README: https://github.com/david-mccullars/ruby-splitta
* Documentation: http://www.rubydoc.info/github/david-mccullars/ruby-splitta
* Bug Reports: https://github.com/david-mccullars/ruby-splitta/issues

## Status

[Gem Version](https://badge.fury.io/rb/splitta)
[CI](https://github.com/david-mccullars/ruby-splitta/actions?workflow=CI)
[Code Climate](https://codeclimate.com/github/david-mccullars/ruby-splitta)
[Coverage](https://codeclimate.com/github/david-mccullars/ruby-splitta/coverage)
[License](LICENSE)

## Description

[Splitta](https://code.google.com/archive/p/splitta/) includes proper tokenization and models for very high accuracy sentence boundary detection (English only for now). The models are trained on Wall Street Journal news combined with the Brown Corpus, which is intended to be widely representative of written English. Error rates on test news data are near 0.25%.

## Installation

```
gem install splitta
```

## Requirements

* Ruby 2.5.1 or higher

## Usage

```ruby
require 'splitta'

Splitta.sentences("Some text goes here.")
```

## License

MIT. See the `LICENSE` file.

## References

> Dan Gillick, “Sentence Boundary Detection and the Problem with the U.S.” at NAACL 2009, http://dgillick.com/resource/sbd_naacl_2009.pdf
Version data entries
1 entry across 1 version & 1 rubygem
Version | Path |
---|---|
splitta-5.0.0 | README.md |