Sha256: 833eb155c9c5adf3e5a2a1836ea200df9233250199c2050e18de4c7609707ed6

Contents?: true

Size: 836 Bytes

Versions: 2

Compression:

Stored size: 836 Bytes

Contents

require "active_support/core_ext/object/blank"
require "active_support/core_ext/string/multibyte"
require "delegate"
module Fuzzily
  class String < SimpleDelegator

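    # Returns the unique three-character substrings of the normalized string,
    # or an empty array when the wrapped string is blank.
    def trigrams
      return [] if __getobj__.blank?
      normalized = self.normalize
      last_start_index = normalized.length - 3
      (0..last_start_index).map { |index| normalized[index, 3] }.uniq
    end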

    # Pairs each trigram with the length of the wrapped string.
    def scored_trigrams
      trigrams.map { |t| [t, self.length] }
    end

    protected

    # Strip accents, downcase, and replace each run of non-alphanumeric
    # characters with "*". Returns a single normalized string, prefixed
    # with "**" and suffixed with "*" to mark its start and end.
    def normalize
      ActiveSupport::Multibyte::Chars.new(self.to_s)
        .mb_chars.unicode_normalize(:nfkd).to_s.downcase
        .gsub(/[^\x00-\x7F]/, "")
        .gsub(/[^a-z\d]/, " ")
        .gsub(/\s+/, "*")
        .gsub(/^/, "**")
        .gsub(/$/, "*")
    end
  end
end
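
The listing above can be tried without ActiveSupport. What follows is a minimal sketch, not the gem's own code: it assumes that core Ruby's `String#unicode_normalize` is an acceptable stand-in for the `mb_chars` call, and that `strip.empty?` is close enough to `blank?` for illustration. The names `sketch_normalize` and `sketch_trigrams` are hypothetical and do not exist in the gem.

```ruby
# Standalone approximation of Fuzzily::String#normalize using core Ruby only.
# Hypothetical helper; the gem itself goes through ActiveSupport's mb_chars.
def sketch_normalize(text)
  text.to_s
      .unicode_normalize(:nfkd)   # decompose accented letters into base + mark
      .downcase
      .gsub(/[^\x00-\x7F]/, "")   # drop the non-ASCII combining marks
      .gsub(/[^a-z\d]/, " ")      # anything that is not a letter or digit becomes a space
      .gsub(/\s+/, "*")           # collapse each run of spaces into one "*"
      .gsub(/^/, "**")            # mark the start of the string
      .gsub(/$/, "*")             # mark the end of the string
end

# Standalone approximation of Fuzzily::String#trigrams.
# Assumption: strip.empty? stands in for ActiveSupport's blank?.
def sketch_trigrams(text)
  return [] if text.to_s.strip.empty?
  normalized = sketch_normalize(text)
  (0..normalized.length - 3).map { |index| normalized[index, 3] }.uniq
end

p sketch_normalize("New York")  # => "**new*york*"
p sketch_trigrams("Paris")      # => ["**p", "*pa", "par", "ari", "ris", "is*"]
p sketch_trigrams("Caf\u00e9")  # => ["**c", "*ca", "caf", "afe", "fe*"]
```

The padding is what lets a trigram such as `**p` record that a word starts with "p", so matches at the beginning of a string are distinguishable from matches in the middle.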

Version data entries

2 entries across 2 versions & 1 rubygems

Version Path
fuzzily_reloaded-1.0.1 lib/fuzzily/trigram.rb
fuzzily_reloaded-1.0.0 lib/fuzzily/trigram.rb