Sha256: 3ddd433255c828d7eadba5094825c686fa3d7deff969602da83683c740e56a6b

Contents?: true

Size: 809 Bytes

Versions: 3

Compression:

Stored size: 809 Bytes

Contents

module PragmaticSegmenter
  module Languages
    class Amharic
      class Process < PragmaticSegmenter::Process
        private

        def sentence_boundary_punctuation(txt)
          PragmaticSegmenter::Languages::Amharic::SentenceBoundaryPunctuation.new(text: txt).split
        end

        def punctuation_array
          PragmaticSegmenter::Languages::Amharic::Punctuation.new.punct
        end
      end

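      class SentenceBoundaryPunctuation < PragmaticSegmenter::SentenceBoundaryPunctuation
        # Lazily match up to the first terminator: Ethiopic question mark (፧),
        # Ethiopic full stop (።), '!' or '?'. Otherwise take the rest of the line.
        SENTENCE_BOUNDARY = /.*?[፧።!\?]|.*?$/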

        def split
          text.scan(SENTENCE_BOUNDARY)
        end
      end

      class Punctuation < PragmaticSegmenter::Punctuation
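        # Sentence-ending marks recognized for Amharic; frozen so the shared
        # constant cannot be mutated by callers.
        PUNCT = ['።', '፧', '?', '!'].freeze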

        def punct
          PUNCT
        end
      end
    end
  end
end
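
A minimal standalone sketch of how the SENTENCE_BOUNDARY pattern above behaves when used with String#scan, outside the gem. The pattern is copied from the source; the helper name split_amharic and the sample text are assumptions made for illustration and are not part of pragmatic_segmenter. The raw scan result can contain leading whitespace and empty strings, so this sketch strips and filters them, which the gem itself may handle differently elsewhere.

```ruby
# Pattern copied verbatim from SentenceBoundaryPunctuation above.
SENTENCE_BOUNDARY = /.*?[፧።!\?]|.*?$/

# Hypothetical helper (not part of the gem): scan for sentence-like
# chunks, then trim whitespace and drop empty matches.
def split_amharic(text)
  text.scan(SENTENCE_BOUNDARY)
      .map(&:strip)
      .reject(&:empty?)
end

# Sample text is an assumption: two short sentences, one ending in the
# Ethiopic full stop and one in the Ethiopic question mark.
split_amharic('ሰላም። እንዴት ነህ፧').each { |sentence| puts sentence }
```

Text after the last terminator is picked up by the second alternative of the pattern, so a trailing fragment without punctuation is still returned as its own chunk.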

Version data entries

3 entries across 3 versions and 1 gem

Version Path
pragmatic_segmenter-0.0.3 lib/pragmatic_segmenter/languages/amharic.rb
pragmatic_segmenter-0.0.2 lib/pragmatic_segmenter/languages/amharic.rb
pragmatic_segmenter-0.0.1 lib/pragmatic_segmenter/languages/amharic.rb