Sha256: c10ab8a8c91702f7573f2a36ff947a734bb0745659df5606bf14501e4ca14e1a

Contents?: true

Size: 548 Bytes

Versions: 1

Compression:

Stored size: 548 Bytes

Contents

#encoding: UTF-8
module TextRank
  module Tokenizer

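    ##
    # A tokenizer regex that preserves (optionally formatted) numbers as a single token.
    # The alternatives cover comma-grouped thousands, plain digit runs, a leading
    # zero, and a bare fraction, each with at most two decimal digits.
    # Signs and exponents are not matched.
    ##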
    Number = %r{
      (
        [1-9]\d{0,2}        # 453
        (?:,\d{3})*         # 453,231,162
        (?:\.\d{0,2})?      # 453,231,162.17

        |

        [1-9]\d*            # 453231162
        (?:\.\d{0,2})?      # 453231162.17

        |

        0                   # 0
        (?:\.\d{0,2})?      # 0.17

        |

        (?:\.\d{1,2})       # .17
      )
    }x

  end
end
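
To make the comments inside the pattern concrete, here is a minimal sketch of how the regex behaves when applied on its own. It assumes nothing about how the gem combines this pattern with its other tokenizers. `NUMBER_PATTERN`, `number_tokens`, and `WHOLE_NUMBER` are hypothetical names introduced for illustration. The pattern is copied verbatim so that the sketch runs without the gem installed.

```ruby
# Verbatim copy of the pattern above, bound to a hypothetical top-level
# constant so that this sketch runs without the text_rank gem installed.
NUMBER_PATTERN = %r{
  (
    [1-9]\d{0,2}        # 453
    (?:,\d{3})*         # 453,231,162
    (?:\.\d{0,2})?      # 453,231,162.17
    |
    [1-9]\d*            # 453231162
    (?:\.\d{0,2})?      # 453231162.17
    |
    0                   # 0
    (?:\.\d{0,2})?      # 0.17
    |
    (?:\.\d{1,2})       # .17
  )
}x

# Hypothetical helper: every substring of +text+ that the pattern matches.
# String#scan returns one array per match when the regex has a capture
# group, so the result is flattened into a list of strings.
def number_tokens(text)
  text.scan(NUMBER_PATTERN).flatten
end

# Hypothetical anchored variant: the whole string must be a single number.
WHOLE_NUMBER = /\A#{NUMBER_PATTERN}\z/

number_tokens("453,231,162.17")  # => ["453,231,162.17"]
number_tokens("0.17")            # => ["0.17"]
number_tokens("pi is 3.14")      # => ["3.14"]
number_tokens(".17")             # => [".17"]

# Alternation is ordered, so an unanchored scan stops at the first
# alternative after three digits and splits a long unformatted run.
number_tokens("453231162.17")    # => ["453", "231", "162.17"]

# With anchors, the first alternative fails at \z and the engine falls
# through to the second one, which accepts the whole run.
WHOLE_NUMBER.match?("453231162.17")  # => true
WHOLE_NUMBER.match?("12,34")         # => false
```

In this standalone, unanchored use the second alternative is never reached, because any input it accepts is first matched, in shortened form, by the first alternative. Whether that matters in practice depends on how the surrounding tokenizer applies the pattern, which this file does not show.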

Version data entries

1 entry across 1 version & 1 rubygem

Version: text_rank-1.1.5

Path: lib/text_rank/tokenizer/number.rb