Sha256: a9b24a39266cff18d9fd381365d978d456d75efe5cbc5ffb81d54fe1d1718cc8
Contents?: true
Size: 453 Bytes
Versions: 19
Compression:
Stored size: 453 Bytes
Contents
require 'cgi/util'

htmlfile = 'nisendouka.html'
textfile = 'nisendouka.txt'

html = File.read(htmlfile)

File.open(textfile, 'w') do |f|
  in_header = true
  html.each_line do |line|
    if in_header && /<div class="main_text">/ !~ line
      next
    else
      in_header = false
    end
    break if /<div class="bibliographical_information">/ =~ line
    line.gsub!(/<[^>]+>/, '')
    esc_line = CGI.unescapeHTML(line)
    f.write esc_line
  end
end
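The script above reads a fixed HTML file, skips every line until the one containing the main_text div, stops at the bibliographical_information div, strips tags, and unescapes HTML entities. As a minimal sketch of the same filtering applied to an in-memory string, the following may help when trying the logic without the input file. The method name extract_main_text and the sample markup are hypothetical and are not part of the original file.

```ruby
require 'cgi/util'

# Hypothetical helper (not in the original script): the same filtering
# logic, applied to a string instead of fixed file names.
def extract_main_text(html)
  out = +''
  in_header = true
  html.each_line do |line|
    # Skip everything before the line that opens the main text block.
    if in_header
      next unless line.include?('<div class="main_text">')
      in_header = false
    end
    # Stop at the bibliographical footer.
    break if line.include?('<div class="bibliographical_information">')
    # Strip tags first, then decode entities such as &amp;.
    out << CGI.unescapeHTML(line.gsub(/<[^>]+>/, ''))
  end
  out
end

# Assumed sample input, made up for illustration.
sample = <<~HTML
  <html><head><title>t</title></head>
  <div class="main_text">First &amp; line<br />
  <ruby>second</ruby> line
  <div class="bibliographical_information">
  footer
HTML

puts extract_main_text(sample)
```

Like the original, this works line by line, so a tag that spans more than one line would not be removed by the regular expression.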
Version data entries
19 entries across 19 versions & 1 rubygems