Sha256: 4897d5724f49b0e42330c7ee0ec42b6a559135e9c8ecf47f3acf91ac085a439b
Contents?: true
Size: 548 Bytes
Versions: 3
Compression:
Stored size: 548 Bytes
Contents
# Merges two subsequent blocks if their text densities are equal.
module Boilerpipe::Filters
  class SimpleBlockFusionProcessor
    def self.process(doc)
      tbs = doc.text_blocks
      return doc if tbs.size < 2

      blocks_to_remove = []
      tb1 = tbs.first
      tbs.drop(1).each do |tb|
        if tb1.text_density == tb.text_density
          tb1.merge_next(tb)
          blocks_to_remove << tb
        else
          tb1 = tb
        end
      end

      doc.replace_text_blocks!(tbs - blocks_to_remove)
      doc
    end
  end
end
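The fusion pass can be tried in isolation with a minimal sketch. The `Block` and `Doc` classes below are hypothetical stand-ins, not the gem's real document and text block classes; they expose only the four methods the processor calls (`text_blocks`, `text_density`, `merge_next`, `replace_text_blocks!`). The stand-in `merge_next` only joins text and leaves the density unchanged, which is an assumption and may differ from the real implementation. `fuse_equal_density` is likewise a hypothetical name that mirrors the loop in `process`.

```ruby
# Hypothetical stand-in for a text block: only the methods the filter uses.
class Block
  attr_reader :text, :text_density

  def initialize(text, text_density)
    @text = text
    @text_density = text_density
  end

  # Assumption: merging simply appends the next block's text.
  def merge_next(other)
    @text = "#{@text} #{other.text}"
  end
end

# Hypothetical stand-in for a document holding an ordered list of blocks.
class Doc
  attr_reader :text_blocks

  def initialize(blocks)
    @text_blocks = blocks
  end

  def replace_text_blocks!(blocks)
    @text_blocks = blocks
  end
end

# Same control flow as SimpleBlockFusionProcessor.process, written standalone.
def fuse_equal_density(doc)
  tbs = doc.text_blocks
  return doc if tbs.size < 2

  removed = []
  current = tbs.first
  tbs.drop(1).each do |tb|
    if current.text_density == tb.text_density
      current.merge_next(tb) # absorb the neighbour into the running block
      removed << tb
    else
      current = tb           # density changed: start a new run here
    end
  end

  doc.replace_text_blocks!(tbs - removed)
  doc
end

doc = Doc.new([
  Block.new("a", 1.0),
  Block.new("b", 1.0),
  Block.new("c", 2.5),
  Block.new("d", 1.0)
])
fuse_equal_density(doc)
puts doc.text_blocks.map(&:text).inspect # prints ["a b", "c", "d"]
```

Only adjacent blocks are fused: "d" has the same density as "a" and "b" but stays separate because "c" sits between them. The stand-ins are plain classes rather than Structs so that `tbs - removed` compares blocks by identity; a value-comparing type could also drop a kept block that happens to equal a removed one.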
Version data entries
3 entries across 3 versions & 1 rubygems