Sha256: 191483a03a1a42763a46ff73f29dac2ee72dbfb9a8ddf95cb24ebe0e6d0b73fd

Contents?: true

Size: 641 Bytes

Versions: 3

Compression:

Stored size: 641 Bytes

Contents

require 'active_support/core_ext/string/multibyte'

module Fuzzily
  module String
    # Returns the unique three-character substrings of the normalized string.
    def trigrams
      normalized = self.normalize
      (0..(normalized.length - 3)).map { |index| normalized[index,3] }.uniq
    end

    protected

    # Strip accents and other non-ASCII characters, downcase, collapse each
    # run of non-word characters into a single '*', and prefix the result
    # with '**'. Returns the normalized string.
    def normalize
      # Iconv.iconv('ascii//translit//ignore', 'utf-8', self).first.
      ActiveSupport::Multibyte::Chars.new(self).
        mb_chars.normalize(:kd).gsub(/[^\x00-\x7F]/,'').downcase.to_s.
        gsub(/\W/,' ').
        gsub(/\s+/,'*').
        gsub(/^/,'**')
    end
  end
end
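
The normalization and windowing steps above can be tried without ActiveSupport. The following is a minimal sketch, not the gem's own code: `TrigramSketch` is a hypothetical name, and it assumes the standard library's `String#unicode_normalize(:nfkd)` is an acceptable stand-in for `mb_chars.normalize(:kd)`.

```ruby
# Standalone sketch of the pipeline above. TrigramSketch is a hypothetical
# name, and String#unicode_normalize stands in for ActiveSupport's
# normalize(:kd); this is an assumption, not the gem's implementation.
module TrigramSketch
  module_function

  # Mirror the normalization steps: decompose accents, drop non-ASCII,
  # downcase, turn runs of non-word characters into '*', prefix with '**'.
  def normalize(text)
    text.unicode_normalize(:nfkd)
        .gsub(/[^\x00-\x7F]/, '')
        .downcase
        .gsub(/\W/, ' ')
        .gsub(/\s+/, '*')
        .gsub(/^/, '**')
  end

  # Slide a three-character window over the normalized string.
  def trigrams(text)
    normalized = normalize(text)
    (0..(normalized.length - 3)).map { |index| normalized[index, 3] }.uniq
  end
end

p TrigramSketch.normalize("Caf\u00e9 Noir")  # "**cafe*noir"
p TrigramSketch.trigrams("Paris")            # ["**p", "*pa", "par", "ari", "ris"]
```

The leading `**` gives the first character of the input two trigrams of its own, and an empty input yields no trigrams because the normalized string is then only two characters long.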

Version data entries

3 entries across 3 versions & 1 rubygem

Version Path
fuzzily-0.1.0 lib/fuzzily/trigram.rb
fuzzily-0.0.3 lib/fuzzily/trigram.rb
fuzzily-0.0.2 lib/fuzzily/trigram.rb