Sha256: 01e5230c53c9fb3ad9bf85406b9b444696a1563904eef83d5fefb746f2657358
Contents?: true
Size: 478 Bytes
Versions: 5
Compression:
Stored size: 478 Bytes
Contents
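# Marks trailing headline TextBlocks that have the label :HEADING
# as boilerplate. Trailing means they are marked content and are
# below any other content block.

module Boilerpipe::Filters
  class TrailingHeadlineToBoilerplateFilter
    def self.process(doc)
      # Walk from the last block toward the first so that only
      # headlines at the end of the document are affected.
      doc.text_blocks.reverse_each do |tb|
        # Blocks already marked as boilerplate are skipped.
        next unless tb.is_content?

        if tb.has_label? :HEADING
          tb.content = false
        else
          # The first non-headline content block ends the scan.
          break
        end
      end

      doc
    end
  end
end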
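The behaviour described in the comment can be illustrated with a minimal sketch. `SketchBlock`, `SketchDoc`, and `demote_trailing_headlines` are hypothetical stand-ins written for this example, not classes from the gem; they model only the methods the filter calls (`text_blocks`, `is_content?`, `has_label?`, `content=`), and the sketch assumes the scan starts at the last block, as "trailing" implies.

```ruby
# Hypothetical stand-in for a text block; models only what the filter uses.
class SketchBlock
  attr_reader :text
  attr_accessor :content

  def initialize(text, content: true, labels: [])
    @text = text
    @content = content
    @labels = labels
  end

  def is_content?
    @content
  end

  def has_label?(label)
    @labels.include?(label)
  end
end

# Hypothetical stand-in for a document: just a list of text blocks.
SketchDoc = Struct.new(:text_blocks)

# Assumed logic: scan from the last block toward the first.
def demote_trailing_headlines(doc)
  doc.text_blocks.reverse_each do |tb|
    next unless tb.is_content?            # skip blocks already marked boilerplate
    break unless tb.has_label?(:HEADING)  # first non-headline content block ends the scan
    tb.content = false                    # a trailing headline becomes boilerplate
  end
  doc
end

doc = SketchDoc.new([
  SketchBlock.new('Intro heading', labels: [:HEADING]),
  SketchBlock.new('Body paragraph'),
  SketchBlock.new('Footer ad', content: false),
  SketchBlock.new('Related articles', labels: [:HEADING])
])

demote_trailing_headlines(doc)
kept = doc.text_blocks.select(&:is_content?).map(&:text)
puts kept.inspect
```

In this sketch the headline at the top of the document stays content because the scan stops at the body paragraph, while the headline after the last content block is demoted.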
Version data entries
5 entries across 5 versions & 1 rubygems