Sha256: 62249137d58c021b857680d4b4d9e4610d0dd202f6e630bd316176daa3b579bb

Contents?: true

Size: 532 Bytes

Versions: 5

Compression:

Stored size: 532 Bytes

Contents

#encoding: UTF-8
module TextRank
  module Tokenizer

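    ##
    # A tokenizer regex that preserves (optionally formatted) numbers as a single token.
    # It matches plain integers, comma-grouped thousands, and decimals with or
    # without a leading zero. The pattern is unanchored and has no sign handling.
    ##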
    Number = %r{
      (
        [1-9]\d{3,}       # 453231162
        (?:\.\d+)?        # 453231162.17

        |

        [1-9]\d{0,2}      # 453
        (?:,\d{3})*       # 453,231,162
        (?:\.\d+)?        # 453,231,162.17

        |

        0                 # 0
        (?:\.\d+)?        # 0.17

        |

        (?:\.\d+)         # .17
      )
    }x

  end
end
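
A minimal sketch of how a pattern like this can be applied, assuming it is used with `String#scan` to pull numeric tokens out of free text. The constant `NUMBER` and the helper `extract_numbers` are hypothetical names introduced for illustration; they are not part of the gem, and the pattern is copied locally only so the example runs on its own.

```ruby
# Local copy of the pattern shown above so this sketch is self-contained.
# NUMBER is a hypothetical name, not the gem's constant.
NUMBER = %r{
  (
    [1-9]\d{3,}       # four or more digits with no grouping
    (?:\.\d+)?        # optional fractional part

    |

    [1-9]\d{0,2}      # one to three leading digits
    (?:,\d{3})*       # zero or more comma-separated groups of three
    (?:\.\d+)?        # optional fractional part

    |

    0                 # a lone zero
    (?:\.\d+)?        # optional fractional part

    |

    (?:\.\d+)         # fraction with no leading digit
  )
}x

# Hypothetical helper: returns every numeric token found in the text.
# scan yields one-element arrays because the pattern has a single
# capture group, so the result is flattened into a list of strings.
def extract_numbers(text)
  text.scan(NUMBER).flatten
end

puts extract_numbers("Revenue rose to 453,231,162.17 in 2016, up 0.5 from .17").inspect
```

Because the pattern is unanchored, a malformed grouping such as `12,34` is not kept whole: the comma branch requires exactly three digits per group, so the scan yields `12` and `34` as separate tokens.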

Version data entries

5 entries across 5 versions and 1 rubygem

Version Path
text_rank-1.2.3 lib/text_rank/tokenizer/number.rb
text_rank-1.2.2 lib/text_rank/tokenizer/number.rb
text_rank-1.2.0 lib/text_rank/tokenizer/number.rb
text_rank-1.1.7 lib/text_rank/tokenizer/number.rb
text_rank-1.1.6 lib/text_rank/tokenizer/number.rb