# Ruby Splitta

* README:         https://github.com/david-mccullars/ruby-splitta
* Documentation:  http://www.rubydoc.info/github/david-mccullars/ruby-splitta
* Bug Reports:    https://github.com/david-mccullars/ruby-splitta/issues


## Status

[![Gem Version](https://badge.fury.io/rb/splitta.svg)](https://badge.fury.io/rb/splitta)
[![Build Status](https://github.com/david-mccullars/ruby-splitta/workflows/CI/badge.svg)](https://github.com/david-mccullars/ruby-splitta/actions?workflow=CI)
[![Code Climate](https://codeclimate.com/github/david-mccullars/ruby-splitta/badges/gpa.svg)](https://codeclimate.com/github/david-mccullars/ruby-splitta)
[![Test Coverage](https://codeclimate.com/github/david-mccullars/ruby-splitta/badges/coverage.svg)](https://codeclimate.com/github/david-mccullars/ruby-splitta/coverage)
[![MIT License](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)

## Description

[Splitta](https://code.google.com/archive/p/splitta/) includes proper
tokenization and models for very high accuracy sentence boundary detection
(English only for now). The models are trained on Wall Street Journal news
combined with the Brown Corpus, which is intended to be widely representative
of written English. Error rates on test news data are near 0.25%.

## Installation

```
gem install splitta
```

## Requirements

* Ruby 2.5.1 or higher

## Usage

```ruby
require 'splitta'

Splitta.sentences("Some text goes here.")
```
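
The call above can be combined with ordinary Ruby iteration. The following is a minimal sketch, not part of the gem: `numbered_sentences` is a hypothetical helper, and it assumes that `Splitta.sentences` returns an enumerable of sentence strings, which this README does not state explicitly. The splitter is passed in as an argument so the helper can be exercised without the gem installed.

```ruby
# Hypothetical helper (not part of splitta): label each detected sentence
# with its 1-based position.
#
# Assumption: splitter.sentences(text) returns an enumerable of sentence
# strings. The README does not document the return type.
def numbered_sentences(text, splitter)
  splitter.sentences(text).each_with_index.map do |sentence, index|
    "#{index + 1}: #{sentence.to_s.strip}"
  end
end

# With the gem installed, usage would look like:
#
#   require 'splitta'
#   puts numbered_sentences("Mr. Smith moved to the U.S. He likes it.", Splitta)
```

Passing the splitter explicitly also makes it easy to swap in a stub in tests.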

## License

MIT. See the `LICENSE` file.

## References

> Dan Gillick, “Sentence Boundary Detection and the Problem with the U.S.” at NAACL 2009, http://dgillick.com/resource/sbd_naacl_2009.pdf
