Sha256: e14c54e53ff85ad042ad1de9bc2a094c20d0356171a10f590da18bbe8ff290bc

Contents?: true

Size: 643 Bytes

Versions: 3

Compression:

Stored size: 643 Bytes

Contents

module GuessWho
  class Tokenizer
    def self.tokenize!(str)
      new(str).tokenize!
    end

    def initialize(str)
      @raw_str = str
    end

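    # Collects the unique splits of the string into a leading prefix
    # (a candidate first name) and the alphabetic runs that follow it.
    def tokenize!
      tokens = []

      @raw_str.size.times do |i|
        # Candidate first name: the first i + 1 characters, escaped so that
        # regex metacharacters in the input are matched literally.
        possible_firstname = Regexp.escape(@raw_str[0..i])

        (i + 1..@raw_str.length).each do |j|
          # The lookahead lets scan report every position where the prefix occurs.
          combination = @raw_str.scan(/(?=(#{possible_firstname})([a-zA-Z]{,#{j}})([a-zA-Z]*))/)
          combination = combination.flatten.reject(&:empty?)
          tokens << combination unless combination.empty?
        end
      end

      tokens.uniq
    end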
  end
end
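
The core step of the tokenizer, a single `scan` with a lookahead and three capture groups, can be sketched in isolation as follows. This is a minimal illustration, not the gem's documented usage: the input string, the prefix, and the length limit are hypothetical values chosen for the example.

```ruby
# Hypothetical input and prefix, chosen only for illustration.
name = "annasmith"
prefix = Regexp.escape("anna") # escape so the prefix is matched literally
max_len = 3                    # assumed upper bound for the second group

# Group 1 is the prefix, group 2 takes up to max_len letters,
# group 3 takes whatever letters remain.
pattern = /(?=(#{prefix})([a-zA-Z]{,#{max_len}})([a-zA-Z]*))/

# scan returns one array of groups per match; flatten and drop empty groups.
parts = name.scan(pattern).flatten.reject(&:empty?)

p parts
```

Because the whole pattern sits inside a lookahead, each match is zero-width, so `scan` checks every position in the string for the prefix rather than skipping past consumed characters.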

Version data entries

3 entries across 3 versions & 1 rubygems

Version Path
guess_who-0.1.2 lib/guess_who/tokenizer.rb
guess_who-0.1.1 lib/guess_who/tokenizer.rb
guess_who-0.1.0 lib/guess_who/tokenizer.rb