Sha256: 6dcc86159b6569211d7db0a5b410fd0ac01c644c67072f37dcf6b3c2383e07cd

Contents?: true

Size: 356 Bytes

Versions: 6

Compression:

Stored size: 356 Bytes

Contents

module Sastrawi
  module Stemmer
    module Filter
      class TextNormalizer
        def self.normalize_text(text)
          lowercase_text = text.downcase
          replaced_text = lowercase_text.gsub(/[^a-z0-9 -]/im, ' ')
          replaced_text = replaced_text.gsub(/( +)/im, ' ')

          replaced_text.strip
        end
      end
    end
  end
end
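
A minimal usage sketch of the normalizer above. The class body is repeated from the listing only so the snippet runs on its own without the gem installed. The sample strings and the local name `normalizer` are hypothetical and chosen for illustration. It shows the three steps in order: lowercasing, replacing every character outside `a-z`, `0-9`, space and hyphen with a space, then collapsing runs of spaces and trimming the ends.

```ruby
# Copy of the class from the listing, repeated so this snippet is self-contained.
module Sastrawi
  module Stemmer
    module Filter
      class TextNormalizer
        def self.normalize_text(text)
          lowercase_text = text.downcase
          replaced_text = lowercase_text.gsub(/[^a-z0-9 -]/im, ' ')
          replaced_text = replaced_text.gsub(/( +)/im, ' ')

          replaced_text.strip
        end
      end
    end
  end
end

# Hypothetical local alias, used only to keep the calls short.
normalizer = Sastrawi::Stemmer::Filter::TextNormalizer

# Punctuation and the trailing newline become spaces; the hyphen is kept.
puts normalizer.normalize_text("Buku-buku  itu, MAHAL!\n") # prints: buku-buku itu mahal

# The period is not in the allowed set, so it is replaced by a space.
puts normalizer.normalize_text("Harga Rp 10.000")          # prints: harga rp 10 000
```

Note that the hyphen is kept, so reduplicated forms such as "buku-buku" pass through intact, while digits separated by punctuation are split into separate tokens.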

Version data entries

6 entries across 6 versions & 1 rubygem

Version Path
sastrawi-0.1.4 lib/sastrawi/stemmer/filter/text_normalizer.rb
sastrawi-0.1.3 lib/sastrawi/stemmer/filter/text_normalizer.rb
sastrawi-0.1.2 lib/sastrawi/stemmer/filter/text_normalizer.rb
sastrawi-0.1.1 lib/sastrawi/stemmer/filter/text_normalizer.rb
sastrawi-0.1.0 lib/sastrawi/stemmer/filter/text_normalizer.rb
sastrawi-0.1.0.pre lib/sastrawi/stemmer/filter/text_normalizer.rb