Sha256: 882b21388be67c92cfca6f71791b23664067a982759db871ea89bb3e43697139
Contents?: true
Size: 320 Bytes
Versions: 1
Compression:
Stored size: 320 Bytes
Contents
# Keeps only those content blocks which contain at least k words.
module Boilerpipe::Filters
  class MinWordsFilter
    def self.process(min_words, doc)
      doc.text_blocks.each do |tb|
        next if tb.is_not_content?
        tb.content = false if tb.num_words < min_words
      end
      doc
    end
  end
end
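A minimal usage sketch may help show what the filter does. The `TextBlock` and `Document` structs below are hypothetical stand-ins, not the gem's real classes: they expose only the methods the filter calls (`text_blocks`, `is_not_content?`, `content=`, `num_words`), and word counting by whitespace is an assumption. The filter is restated with nested modules so the snippet runs on its own.

```ruby
# Hypothetical stand-in for a text block: only the interface the filter uses.
TextBlock = Struct.new(:text, :content) do
  # Assumption: words are counted by splitting on whitespace.
  def num_words
    text.split.size
  end

  def is_not_content?
    !content
  end
end

# Hypothetical stand-in for a document holding text blocks.
Document = Struct.new(:text_blocks)

module Boilerpipe
  module Filters
    class MinWordsFilter
      def self.process(min_words, doc)
        doc.text_blocks.each do |tb|
          # Blocks already marked as non-content are left untouched.
          next if tb.is_not_content?
          # Demote content blocks that have fewer than min_words words.
          tb.content = false if tb.num_words < min_words
        end
        doc
      end
    end
  end
end

doc = Document.new([
  TextBlock.new('Home About', true),
  TextBlock.new('This paragraph has clearly more than five words in it.', true),
  TextBlock.new('Short ad', false)
])

Boilerpipe::Filters::MinWordsFilter.process(5, doc)

# Collect the text of the blocks still marked as content.
kept = doc.text_blocks.select(&:content).map(&:text)
puts kept
```

Note that the filter only ever demotes blocks: a block with `content` already false is skipped, and a block with exactly `min_words` words is kept because the comparison is strict.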
Version data entries
1 entry across 1 version & 1 rubygem
Version | Path |
---|---|
boilerpipe-ruby-0.4.0 | lib/boilerpipe/filters/min_words_filter.rb |