Sha256: aa76d28e75a74fcc0825d740ec41727f32299a1a180c7e9a1d5dbf7904b6462b

Contents?: true

Size: 408 Bytes

Versions: 5

Compression:

Stored size: 408 Bytes

Contents

# Marks all blocks as content.

module Boilerpipe::Extractors
  class KeepEverythingExtractor
    def self.text(contents)
      doc = ::Boilerpipe::SAX::BoilerpipeHTMLParser.parse(contents)
      ::Boilerpipe::Extractors::KeepEverythingExtractor.process doc
      doc.content
    end

    def self.process(doc)
      ::Boilerpipe::Filters::MarkEverythingContentFilter.process doc
      doc
    end
  end
end
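
The extractor above parses the input, runs a filter that marks every block as content, and returns `doc.content`. The following is a minimal, self-contained sketch of that idea only. `SketchBlock`, `SketchDocument`, and `SketchMarkEverything` are hypothetical stand-ins invented for illustration; they are not the boilerpipe-ruby classes. The sketch also assumes that a document's content is the joined text of its content-flagged blocks, which may differ from the real implementation.

```ruby
# Hypothetical stand-ins for illustration; NOT the boilerpipe-ruby classes.
SketchBlock = Struct.new(:text, :is_content)

class SketchDocument
  attr_reader :blocks

  def initialize(blocks)
    @blocks = blocks
  end

  # Assumption: content is the text of every block currently flagged
  # as content, joined with newlines.
  def content
    blocks.select(&:is_content).map(&:text).join("\n")
  end
end

# Mirrors the idea of a "mark everything" filter: flag each block as content
# and hand the same document back, like KeepEverythingExtractor.process does.
module SketchMarkEverything
  def self.process(doc)
    doc.blocks.each { |block| block.is_content = true }
    doc
  end
end

doc = SketchDocument.new([
  SketchBlock.new("Site navigation", false),
  SketchBlock.new("Article body", true),
  SketchBlock.new("Footer links", false)
])

before = doc.content                              # only the flagged block
after  = SketchMarkEverything.process(doc).content # every block

puts before
puts "---"
puts after
```

With the gem itself loaded, the corresponding entry point shown in the source is `Boilerpipe::Extractors::KeepEverythingExtractor.text(contents)`, which takes the HTML and returns the resulting content.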

Version data entries

5 entries across 5 versions and 1 rubygem

Version Path
boilerpipe-ruby-0.5.0 lib/boilerpipe/extractors/keep_everything_extractor.rb
boilerpipe-ruby-0.4.4 lib/boilerpipe/extractors/keep_everything_extractor.rb
boilerpipe-ruby-0.4.3 lib/boilerpipe/extractors/keep_everything_extractor.rb
boilerpipe-ruby-0.4.2 lib/boilerpipe/extractors/keep_everything_extractor.rb
boilerpipe-ruby-0.4.1 lib/boilerpipe/extractors/keep_everything_extractor.rb