Sha256: d95ca894ca80bf9b37027ae1d00136da1e7ee7cfc57ab6ac5c3f1f137f24082c

Size: 1.18 KB

Versions: 1

Stored size: 1.18 KB

Contents

# frozen_string_literal: true

# Author::    Lucas Carlson  (mailto:lucas@rufy.com)
# Copyright:: Copyright (c) 2005 Lucas Carlson
# License::   LGPL

module ClassifierReborn
  module Tokenizer
    class Token < String
      # A token is created from one string plus optional attributes, e.g.:
      #      t = ClassifierReborn::Tokenizer::Token.new 'Tokenize', stemmable: true, maybe_stopword: false
      #
      # Available attributes (default value shown after the name):
      #   stemmable:        true  Whether the token may be stemmed. Set this to false for terms that must not be stemmed; otherwise leave it true.
      #   maybe_stopword:   true  Whether the token may be a stopword. Set this to false for terms that can never be stopwords; otherwise leave it true.
      def initialize(string, stemmable: true, maybe_stopword: true)
        super(string)
        @stemmable = stemmable
        @maybe_stopword = maybe_stopword
      end

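      # True when the token is allowed to be stemmed.
      def stemmable?
        @stemmable
      end

      # True when the token might be a stopword.
      def maybe_stopword?
        @maybe_stopword
      end

      # Stems the token and returns a new Token that keeps both attributes.
      # `super` resolves to String#stem, which String does not define by itself;
      # it has to be supplied elsewhere (for example by a stemmer library).
      def stem
        stemmed = super
        self.class.new(stemmed, stemmable: @stemmable, maybe_stopword: @maybe_stopword)
      end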
    end
  end
end
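
The listing above can be exercised without the gem using the sketch below. It is not part of classifier-reborn: `SketchStemmer` is a hypothetical toy stemmer standing in for whatever provides `String#stem` in a real setup, and `SketchToken` is a renamed copy of the class so it does not collide with the real `ClassifierReborn::Tokenizer::Token`. It only aims to show that `stem` returns a new token of the same class with both attributes preserved.

```ruby
# Hypothetical stand-in for a real stemmer: strips a trailing "ing" or "s".
# This is NOT a real stemming algorithm; it exists only so `super` has a target.
module SketchStemmer
  def stem
    sub(/(ing|s)\z/, '')
  end
end

# Assumption: some library mixes a `stem` method into String in a similar way.
class String
  include SketchStemmer
end

# Renamed copy of the Token class from the listing above.
class SketchToken < String
  def initialize(string, stemmable: true, maybe_stopword: true)
    super(string)
    @stemmable = stemmable
    @maybe_stopword = maybe_stopword
  end

  def stemmable?
    @stemmable
  end

  def maybe_stopword?
    @maybe_stopword
  end

  # Calls the mixed-in String#stem, then re-wraps the result so the
  # attributes survive the transformation.
  def stem
    stemmed = super
    self.class.new(stemmed, stemmable: @stemmable, maybe_stopword: @maybe_stopword)
  end
end

token = SketchToken.new('running', stemmable: true, maybe_stopword: false)
stemmed = token.stem

puts stemmed                 # prints "runn" with this toy stemmer
puts stemmed.class           # prints "SketchToken"
puts stemmed.maybe_stopword? # prints "false"
```

Re-wrapping the result matters because the stemmer itself only knows about strings; without it the `stemmable` and `maybe_stopword` flags would be lost after the first call to `stem`.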

Version data entries

1 entry across 1 version & 1 rubygem

Version Path
classifier-reborn-2.3.0 lib/classifier-reborn/extensions/tokenizer/token.rb