Sha256: 4db13e959bfeee8342ba0346bf63fe8d832aa86e87986c83f858cdcc759952ee
Contents?: true
Size: 1.08 KB
Versions: 3
Compression:
Stored size: 1.08 KB
Contents
module HttpSpell
  class SpellChecker
    def initialize(personal_dictionary_path = nil, tracing: false)
      @personal_dictionary_arg = "-p #{personal_dictionary_path}" if personal_dictionary_path
      @tracing = tracing
    end

    def check(doc, lang)
      commands = [
        'pandoc --from html --to plain',
        "hunspell -d #{translate(lang)} #{@personal_dictionary_arg} -i UTF-8 -l",
      ]

      if @tracing
        warn "Piping the HTML document into the following chain of commands:"
        warn commands
      end

      Open3.pipeline_rw(*commands) do |stdin, stdout, _wait_thrs|
        stdin.puts(doc)
        stdin.close
        stdout.read.split.uniq
      end
    end

    private

    # The W3C [recommends](https://www.w3.org/International/questions/qa-html-language-declarations)
    # to specify language using identifiers as per [RFC 5646](https://tools.ietf.org/html/rfc5646)
    # which uses dashes. Hunspell, however, uses underscores. This method translates RFC-style identifiers
    # to hunspell-style.
    def translate(lang)
      lang.tr('-', '_')
    end
  end
end
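The language translation and the hunspell command string built in `check` can be tried in isolation, without pandoc or hunspell installed. The following is a minimal sketch: `hunspell_command` is a hypothetical helper that is not part of httpspell, and it only mirrors the string interpolation shown above.

```ruby
# Hypothetical helper (not part of httpspell): builds the same hunspell
# command string that SpellChecker#check assembles, so the effect of the
# RFC 5646 to hunspell translation can be inspected on its own.
def hunspell_command(lang, personal_dictionary_path = nil)
  # Mirrors the constructor: the argument stays nil when no path is given.
  personal_dictionary_arg = "-p #{personal_dictionary_path}" if personal_dictionary_path

  # Mirrors #translate: RFC 5646 uses dashes, hunspell uses underscores.
  hunspell_lang = lang.tr('-', '_')

  # A nil argument interpolates as an empty string, which leaves two
  # consecutive spaces in the command; the shell ignores the extra space.
  "hunspell -d #{hunspell_lang} #{personal_dictionary_arg} -i UTF-8 -l"
end

puts hunspell_command('en-US')
puts hunspell_command('de-DE', 'personal.dic')
```

Because the language and the dictionary path are interpolated directly into a shell command, callers would presumably want to pass only trusted values for both.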
Version data entries
3 entries across 3 versions & 1 rubygems
Version | Path |
---|---|
httpspell-1.3.0 | lib/httpspell/spellchecker.rb |
httpspell-1.2.1 | lib/httpspell/spellchecker.rb |
httpspell-1.2.0 | lib/httpspell/spellchecker.rb |