Sha256: aaebd41fa447124f58027bc520adfb7476aceb21fd7d9d78bc2ea6d80bff2bd1

Contents?: true

Size: 476 Bytes

Versions: 2

Compression:

Stored size: 476 Bytes

Contents

module OpenNlp
  class Tokenizer < Tool
    self.java_class = Java::opennlp.tools.tokenize.TokenizerME

    # Tokenizes a string
    #
    # @param [String] str string to tokenize
    # @return [Array] array of string tokens
    def tokenize(str)
      fail ArgumentError, 'str must be a String' unless str.is_a?(String)
      j_instance.tokenize(str).to_ary
    end

    private

    def get_last_probabilities
      j_instance.getTokenProbabilities.to_ary
    end
  end
end
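
The tokenize method above checks its argument and then delegates to the wrapped Java TokenizerME instance. The following is a minimal sketch of that validate-then-delegate pattern in plain Ruby, so it runs without JRuby or an OpenNLP model. FakeJavaTokenizer and SketchTokenizer are hypothetical names that are not part of the open_nlp gem, and the regex split only stands in for the real statistical tokenizer.

```ruby
# Hypothetical stand-in for the Java TokenizerME object (assumption:
# the real backend returns an array-like list of token strings).
class FakeJavaTokenizer
  # Splits on word runs and single punctuation characters.
  def tokenize(str)
    str.scan(/\w+|[^\w\s]/)
  end
end

# Hypothetical wrapper mirroring the shape of OpenNlp::Tokenizer#tokenize.
class SketchTokenizer
  def initialize(backend = FakeJavaTokenizer.new)
    @j_instance = backend
  end

  # Rejects non-String input, then delegates to the backend.
  def tokenize(str)
    fail ArgumentError, 'str must be a String' unless str.is_a?(String)
    @j_instance.tokenize(str).to_ary
  end
end

tokens = SketchTokenizer.new.tokenize('Hello, world!')
p tokens # => ["Hello", ",", "world", "!"]
```

Passing the backend into the constructor is a choice made for this sketch only; it lets the validation logic be exercised without a Java runtime.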

Version data entries

2 entries across 2 versions and 1 rubygem

Version Path
open_nlp-0.2.0-java lib/open_nlp/tokenizer.rb
open_nlp-0.1.0-java lib/open_nlp/tokenizer.rb