Sha256: 37f04b6f1d80436dd5f532259a4ba2a5f6f3c4e57f01940d2e6f10e21096abfe

Size: 961 Bytes

Versions: 4

Stored size: 961 Bytes

Contents

require 'test/unit'
require_relative '../src/language-detector'

class LanguageDetectorTests < Test::Unit::TestCase
  def setup
    # Train on vowel-heavy sentences (the majority language, 7 examples)
    # and consonant-heavy sentences (the minority language, 3 examples).
    @vowel_detector = LanguageDetector.new(:ngram_size => 2)
    vowel_examples = ["aeiou uoeia auoiao ai", "iouea eou eu eaiou", "ou oi oiea ieau", "eau au aou ia", "aei aae eaee iou aii iaa ooae oaiuuoouie", "aei iou iaou", "aeeeiioouuu uoeiae"]
    consonant_examples = ["bcbcbbccdd bcd cdbcbc dbdb", "cddccdbcbcdbd", "cdc bdc bdb cdc"]
    @vowel_detector.train(20, vowel_examples + consonant_examples)
  end
  
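  def test_classify
    # Mostly-vowel input, with a few stray consonants, is the majority language.
    assert_equal "majority", @vowel_detector.classify("iou eao oiee aie tee")
    assert_equal "majority", @vowel_detector.classify("aeou one cdf oeaoi ioeae")
    # Mostly-consonant input, with a few stray vowels, is the minority language.
    assert_equal "minority", @vowel_detector.classify("cdccdb")
    assert_equal "minority", @vowel_detector.classify("bcbbd cdcbdcb ae")
  end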
end

Version data entries

4 entries across 4 versions and 1 rubygem

Version Path
unsupervised-language-detection-0.0.4 test/test_language_detector.rb
unsupervised-language-detection-0.0.3 test/test_language_detector.rb
unsupervised-language-detection-0.0.2 test/test_language_detector.rb
unsupervised-language-detection-0.0.1 test/test_language_detector.rb