Sha256: cc7106099c959eb37b5e7346d06b339e3d0adc1070cdfbd4adaffd0ebcbce446

Contents?: true

Size: 802 Bytes

Versions: 5

Compression:

Stored size: 802 Bytes

Contents

require 'active_support/core_ext/string/multibyte'
require 'delegate'
module Fuzzily
  class String < SimpleDelegator

    def trigrams
      return [] if __getobj__.nil?
      normalized = self.normalize
      number_of_trigrams = normalized.length - 3
      # Slide a 3-character window over the normalized string; drop duplicates
      (0..number_of_trigrams).map { |index| normalized[index, 3] }.uniq
    end

    def scored_trigrams
      trigrams.map { |t| [t, self.length] }
    end

    protected

    # Remove accents, downcase, and replace each run of non-letter
    # characters with '*'. The result is prefixed with '**' and suffixed
    # with '*'. Returns the normalized string (not a list of words).
    def normalize
      ActiveSupport::Multibyte::Chars.new(self).
        mb_chars.normalize(:kd).gsub(/[^\x00-\x7F]/,'').downcase.to_s.
        gsub(/[^a-z]/,' ').
        gsub(/\s+/,'*').
        gsub(/^/,'**').
        gsub(/$/,'*')
    end
  end
end
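
The listing above depends on ActiveSupport's multibyte helpers. To try the same normalize-then-window idea in isolation, here is a minimal sketch that assumes Ruby's built-in `String#unicode_normalize(:nfkd)` is an acceptable stand-in for `normalize(:kd)`. The names `sketch_normalize` and `sketch_trigrams` are hypothetical and are not part of the fuzzily gem.

```ruby
# Hypothetical standalone helpers; not the gem's API.
# Assumption: String#unicode_normalize(:nfkd) replaces ActiveSupport's
# normalize(:kd) for the purpose of stripping accents.

def sketch_normalize(text)
  text.unicode_normalize(:nfkd)   # split accented letters into base + mark
      .gsub(/[^\x00-\x7F]/, '')   # drop the non-ASCII combining marks
      .downcase
      .gsub(/[^a-z]/, ' ')        # anything that is not a letter becomes a space
      .gsub(/\s+/, '*')           # each run of spaces collapses to one '*'
      .gsub(/^/, '**')            # mark the start of the string
      .gsub(/$/, '*')             # mark the end of the string
end

def sketch_trigrams(text)
  return [] if text.nil?
  normalized = sketch_normalize(text)
  # Last index at which a full 3-character window still fits
  last_index = normalized.length - 3
  (0..last_index).map { |index| normalized[index, 3] }.uniq
end
```

For example, `sketch_normalize("New York")` returns `"**new*york*"`, and `sketch_trigrams("Paris")` returns `["**p", "*pa", "par", "ari", "ris", "is*"]`. The leading `**` and trailing `*` give word boundaries their own trigrams, so a match at the start of a word is distinguishable from one in the middle.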

Version data entries

5 entries across 5 versions & 1 rubygems

Version Path
fuzzily-0.3.3 lib/fuzzily/trigram.rb
fuzzily-0.3.2 lib/fuzzily/trigram.rb
fuzzily-0.3.1 lib/fuzzily/trigram.rb
fuzzily-0.3.0 lib/fuzzily/trigram.rb
fuzzily-0.2.4 lib/fuzzily/trigram.rb