Sha256: b5bb33e8dd7b3c533506dfd1076dd263ce9875c5d835b03afe599ddcae4cacf5
Contents?: true
Size: 902 Bytes
Versions: 2
Compression:
Stored size: 902 Bytes
Contents
module TextRank
  ##
  # Tokenizers are responsible for transforming a single String of text into an
  # array of potential keywords ("tokens"). There are no requirements of tokens
  # other than to be non-empty. When used in combination with token filters, it
  # may make sense for a tokenizer to temporarily create tokens which might seem
  # like ill-suited keywords. The token filter may use these "bad" keywords to
  # help inform its decision on which tokens to keep and which to drop. An example
  # of this is the part of speech token filter which uses punctuation tokens to
  # help guess the part of speech of each non-punctuation token.
  ##
  module Tokenizer
    autoload :Regex, 'text_rank/tokenizer/regex'
    autoload :Whitespace, 'text_rank/tokenizer/whitespace'
    autoload :WordsAndPunctuation, 'text_rank/tokenizer/words_and_punctuation'
  end
end
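The comment in the file above describes a tokenizer as something that turns one String into an array of non-empty tokens, some of which (such as punctuation) may exist only to help a later token filter. The following is a minimal sketch of that idea and is not the gem's actual implementation: `SketchTokenizer`, its two methods, and `drop_punctuation` are hypothetical names made up for illustration, and the regular expressions are assumptions rather than the patterns used by `TextRank::Tokenizer`.

```ruby
# Hypothetical sketch only; names and patterns are NOT taken from the text_rank gem.
module SketchTokenizer
  # Splits text into word tokens (keeping internal apostrophes) and
  # single-character punctuation tokens. Every token is non-empty.
  def self.words_and_punctuation(text)
    text.scan(/[[:alnum:]]+(?:'[[:alnum:]]+)*|[^[:alnum:][:space:]]/)
  end

  # Splits on runs of whitespace only, so punctuation stays attached to words.
  def self.whitespace(text)
    text.split
  end
end

# Hypothetical token filter: once punctuation tokens have served their purpose
# (for example, helping guess parts of speech), drop them from the keyword list.
def drop_punctuation(tokens)
  tokens.reject { |token| token.match?(/\A[^[:alnum:]]+\z/) }
end

tokens = SketchTokenizer.words_and_punctuation("Hello, world! It's fine.")
p tokens                   # ["Hello", ",", "world", "!", "It's", "fine", "."]
p drop_punctuation(tokens) # ["Hello", "world", "It's", "fine"]
```

The sketch keeps tokenizing and filtering as separate steps, which mirrors the division of labor the comment describes: the tokenizer may emit "bad" keywords, and a filter decides later which tokens to keep.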
Version data entries
2 entries across 2 versions & 1 rubygem
Version | Path |
---|---|
text_rank-1.1.1 | lib/text_rank/tokenizer.rb |
text_rank-1.1.0 | lib/text_rank/tokenizer.rb |