Sha256: 7bf19b6d48e0a22c26f7a67957594349e0b9632d775b89c5ec15b3826be91944

Contents?: true

Size: 1.32 KB

Versions: 2

Compression:

Stored size: 1.32 KB

Contents

h1. Pedantic

Pedantic cleans strings of text - stripping out unimportant words and URLs, fixing typos, replacing symbols (like emoticons) with real words, and running the results through a stemmer.

In short - it gives you reliable text to process (but not read).

And if the name didn't give it away, yes this library is opinionated.

h2. Installation

Grab the gem.

<pre><code>gem install pedantic</code></pre>

h2. Usage

<pre><code>Pedantic.fix('my messy string ;)') #=> 'messi string joke'</code></pre>

Note that the stemmer generates imperfect words, but it is reasonably reliable and constant in the output, so you can work with those assumptions in the output.

Also - this library is a work in progress - currently I've aimed for a relatively useful but extremely basic implementation. If you look through the code, you'll see there's few typos and emoticons handled. It's easy enough to extend, though - so please, fork, patch and send a pull request.

h2. Contributing

Fork and patch as you see fit - and please send me a pull request if you think it's useful for others. Don't forget to write specs first, and don't mess with the version numbers please (or at least: only do so in a different branch).

h2. Copyright

Copyright (c) 2010 "Pat Allan":http://freelancing-gods.com, but released under an open licence. Go for your life.

Version data entries

2 entries across 2 versions & 1 rubygems

Version Path
pedantic-0.1.1 README.textile
pedantic-0.1.0 README.textile