Sha256: 2499cb7860207a7ed14580bc5558451fa974c1de932469459cb9ead0a37e93f2

Contents?: true

Size: 824 Bytes

Versions: 1

Compression:

Stored size: 824 Bytes

Contents

require 'pg_trgm/version'

require 'set'

module PgTrgm
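  # Returns the Set of lowercase trigrams in v, which is split into words
  # on runs of non-word characters and underscores.
  def self.trigrams(v)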
    memo = Set.new
    v.to_s.split(/[\W_]+/).each do |word|
      next if word.empty?
      # Following PostgreSQL's pg_trgm, each word is treated as having two
      # spaces prefixed and one space suffixed when extracting its trigrams.
      word = "  #{word.downcase} "
      word.chars.each_cons(3) do |cons|
        memo << cons.join
      end
    end
    memo
  end

  # inspired by https://gist.github.com/komasaru/41b0c93e264be75eabfa
  def self.similarity(v1, v2)
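    v1_trigrams = trigrams(v1)
    v2_trigrams = trigrams(v2)
    # Two inputs with no trigrams share nothing; this also avoids 0 / 0.0 (NaN).
    return 0.0 if v1_trigrams.empty? && v2_trigrams.empty?
    # Similarity is shared trigrams divided by distinct trigrams overall.
    count_dup = (v1_trigrams & v2_trigrams).length
    count_all = (v1_trigrams | v2_trigrams).length
    count_dup / count_all.to_f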
  end
end
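
As a quick check of the behaviour above, the following standalone sketch restates both methods in a hypothetical `TrigramDemo` module (not part of the gem, so it runs without `pg_trgm` installed). It works through two inputs: the trigrams of "cat", and the similarity of "word" and "two words", which share 4 of their 11 distinct trigrams.

```ruby
require 'set'

# Hypothetical stand-in for PgTrgm, restated here so the example is self-contained.
module TrigramDemo
  # Collect the distinct trigrams of every word in value.
  def self.trigrams(value)
    value.to_s.split(/[\W_]+/).each_with_object(Set.new) do |word, memo|
      next if word.empty?
      # Pad with two leading spaces and one trailing space, as the listing does.
      padded = "  #{word.downcase} "
      padded.chars.each_cons(3) { |cons| memo << cons.join }
    end
  end

  # Shared trigrams divided by the number of distinct trigrams in both inputs.
  def self.similarity(a, b)
    ta = trigrams(a)
    tb = trigrams(b)
    return 0.0 if ta.empty? && tb.empty?
    (ta & tb).length / (ta | tb).length.to_f
  end
end

p TrigramDemo.trigrams('cat').to_a
# => ["  c", " ca", "cat", "at "]
puts TrigramDemo.similarity('word', 'two words').round(4)
# => 0.3636
```

Because matching is done on lowercased words, inputs that differ only in case, such as "Hello" and "hello", score 1.0.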

Version data entries

1 entry across 1 version & 1 rubygem

Version Path
pg_trgm-0.0.1 lib/pg_trgm.rb