Sha256: ccf2f0f020ef7e9a01e3daff1bf409a442fa27b5757b187a64d3aed89b15e5f5
Contents?: true
Size: 431 Bytes
Versions: 1
Compression:
Stored size: 431 Bytes
Contents
require File.expand_path('../language-detector', __FILE__)

TWEETS_FILENAME = "datasets/tweets_5000.txt"

training_sentences = File.readlines(TWEETS_FILENAME).map{ |tweet| tweet.normalize }
detector = LanguageDetector.new(:ngram_size => 2)
detector.train(30, training_sentences)
detector.yamlize("detector.yaml")

puts detector.classifier.get_prior_category_probability(0)
puts detector.classifier.get_prior_category_probability(1)
Version data entries
1 entry across 1 version & 1 rubygem
Version | Path |
---|---|
unsupervised-language-detection-0.0.1 | lib/unsupervised-language-detection/train-english-tweet-detector.rb |