Sha256: d0ed220860f355c8c9b5ec981ebf2266c105fc2592a9da58f479cbf73bcda923

Contents?: true

Size: 645 Bytes

Versions: 3

Compression:

Stored size: 645 Bytes

Contents

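require 'shellwords'

# Extracts plain text from HTML documents with html2text and converts
# the output to UTF-8 when the source file uses another encoding.
PlainText.extract {
  from :html, :htm
  as "text/html"
  aka "HyperText Markup Language document"
  with {|source|
    # File.encoding is not part of Ruby's standard library; it is
    # presumably a helper defined elsewhere in Picolena.
    encoding = File.encoding(source)
    # Escape the path so spaces, quotes or "$" cannot break the shell command.
    escaped_source = Shellwords.escape(source)
    if encoding.empty? || encoding.gsub(/[^\w]/, '').downcase == "utf8"
      # Unknown or UTF-8 encoding: use the html2text output as is.
      %x{html2text -nobs #{escaped_source}}
    else
      # Any other encoding: convert the html2text output to UTF-8.
      %x{html2text -nobs #{escaped_source} | iconv -f #{Shellwords.escape(encoding)} -t utf8}
    end
  }
  which_requires 'html2text', 'iconv'
  which_should_for_example_extract 'zentrum für angewandte forschung an fachhochschulen nachhaltige energietechnik Baden-Württemberg', :from => 'zafh.net.html'
  or_extract 'Málaga', :from => '7.html'
  or_extract 'le monde', :from => 'lemonde.htm'
}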
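
The command construction used by the filter above can be sketched on its own, which makes it testable without running html2text or iconv. This is only a minimal sketch: the helper names `utf8_or_unknown?` and `html_to_text_command` are hypothetical and not part of Picolena, and the encoding label is passed in as a plain string instead of being read with `File.encoding`. Escaping with `Shellwords.escape` is an added assumption about how paths should be protected.

```ruby
require 'shellwords'

# Hypothetical helper: true when the encoding label is empty
# or names UTF-8 in any spelling ("utf8", "UTF-8", "utf_8" is not matched).
def utf8_or_unknown?(encoding)
  encoding.empty? || encoding.gsub(/[^\w]/, '').downcase == 'utf8'
end

# Hypothetical helper: builds the shell command string without running it.
def html_to_text_command(source, encoding)
  command = "html2text -nobs #{Shellwords.escape(source)}"
  return command if utf8_or_unknown?(encoding)

  # Non-UTF-8 input: pipe the html2text output through iconv.
  "#{command} | iconv -f #{Shellwords.escape(encoding)} -t utf8"
end
```

Calling `html_to_text_command('lemonde.htm', 'iso-8859-1')` would build a pipeline through iconv, while an empty or UTF-8 label yields the plain html2text command.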

Version data entries

3 entries across 3 versions & 1 rubygem

Version          Path
picolena-0.0.99  app_generators/picolena/templates/lib/filters/html.rb
picolena-0.1.0   lib/picolena/templates/lib/filters/html.rb
picolena-0.1.1   lib/picolena/templates/lib/filters/html.rb