README.md in pragmatic_segmenter-0.3.0 vs README.md in pragmatic_segmenter-0.3.1

- old
+ new

@@ -708,10 +708,11 @@
 * *Unsupervised Multilingual Sentence Boundary Detection* - Tibor Kiss and Jan Strunk (2005) [[pdf](http://www.linguistics.ruhr-uni-bochum.de/~strunk/ks2005FINAL.pdf) | [mirror](https://s3.amazonaws.com/tm-town-nlp-resources/ks2005FINAL.pdf)]
 * *An Analysis of Sentence Boundary Detection Systems for English and Portuguese Documents* - Carlos N. Silla Jr. and Celso A. A. Kaestner (2004) [[pdf](https://www.cs.kent.ac.uk/pubs/2004/2930/content.pdf) | [mirror](https://s3.amazonaws.com/tm-town-nlp-resources/An+Analysis+of+Sentence+Boundary+Detection+Systems+for+English+and+Portuguese+Documents.pdf)]
 * *Periods, Capitalized Words, etc.* - Andrei Mikheev (2002) [[pdf](https://s3.amazonaws.com/tm-town-nlp-resources/cl-prop.pdf)]
 * *Scaled log likelihood ratios for the detection of abbreviations in text corpora* - Tibor Kiss and Jan Strunk (2002) [[pdf](http://www.linguistics.ruhr-uni-bochum.de/~kiss/publications/abbrev.pdf) | [mirror](https://s3.amazonaws.com/tm-town-nlp-resources/abbrev.pdf)]
 * *Viewing sentence boundary detection as collocation identification* - Tibor Kiss and Jan Strunk (2002) [[pdf](http://www.linguistics.rub.de/~kiss/publications/07v-kiss.pdf) | [mirror](https://s3.amazonaws.com/tm-town-nlp-resources/07v-kiss.pdf)]
+* *Automatic Sentence Break Disambiguation for Thai* - Paisarn Charoenpornsawat and Virach Sornlertlamvanich (2001) [[pdf](http://www.cs.cmu.edu/~paisarn/papers/iccpol2001.pdf) | [mirror](https://s3.amazonaws.com/tm-town-nlp-resources/iccpol2001.pdf)]
 * *Sentence Boundary Detection: A Comparison of Paradigms for Improving MT Quality* - Daniel J. Walker, David E. Clements, Maki Darwin and Jan W. Amtrup (2001) [[pdf](https://www.cs.kent.ac.uk/pubs/2004/2930/content.pdf) | [mirror](https://s3.amazonaws.com/tm-town-nlp-resources/walker.pdf)]
 * *A Sentence Boundary Detection System* - Wendy Chen (2000) [[ppt](www.deg.byu.edu/presentations/SpResConf00.chen/SpResConf00.ppt) | [mirror](https://s3.amazonaws.com/tm-town-nlp-resources/SpResConf00.ppt)]
 * *Tagging Sentence Boundaries* - Andrei Mikheev (2000) [[pdf](http://www.aclweb.org/anthology/A00-2035) | [mirror](https://s3.amazonaws.com/tm-town-nlp-resources/A00-2035.pdf)]
 * *Automatic Extraction of Rules For Sentence Boundary Disambiguation* - E. Stamatatos, N. Fakotakis, AND G. Kokkinakis (1999) [[pdf](https://s3.amazonaws.com/tm-town-nlp-resources/Automatic+Extraction+of+Rules+For+Sentence+Boundary+Disambiguation.pdf)]
 * *A Maximum Entropy Approach to Identifying Sentence Boundaries* - Jeffrey C. Reynar and Adwait Ratnaparkhi (1997) [[pdf](https://www.aclweb.org/anthology/A/A97/A97-1004.pdf) | [mirror](https://s3.amazonaws.com/tm-town-nlp-resources/A97-1004.pdf)]
@@ -723,10 +724,11 @@
 ## TODO
 * Add additional language support
 * Add abbreviation lists for any languages that do not currently have one (only relevant for languages that have the concept of abbreviations with periods)
 * Get Golden Rule #18 passing - Handling of a.m. or p.m. followed by a capitalized non sentence starter (ex. "At 5 p.m. Mr. Smith went to the bank. He left the bank at 6 p.m. Next he went to the store." --> ["At 5 p.m. Mr. Smith went to the bank.", "He left the bank at 6 p.m.", "Next he went to the store."])
+* Support for Thai. This is a very challenging problem due to the absence of explicit sentence markers (i.e. like a period in English) and the ambiguity in Thai regarding what constitutes a sentence even among native speakers. For more information see the following research papers ([#1](http://www.cs.cmu.edu/~paisarn/papers/iccpol2001.pdf) | [#2](http://pioneer.chula.ac.th/~awirote/ling/snlp2007-wirote.pdf)).
 ## Change Log
 **Version 0.0.1**
 * Initial Release
@@ -801,10 +803,13 @@
 **Version 0.3.0**
 * Add support for square brackets
 * Add support for continuous exclamation points or questions marks or combinations of both
 * Fix Roman numeral support
-* Add English abbreviations
+* Add English abbreviations
+
+**Version 0.3.1**
+* Fix undefined method 'gsub!' for nil:NilClass issue
 ## Contributing
 If you find a text that is incorrectly segmented using this gem, please submit an issue.
\ No newline at end of file