Sha256: f786d4fcf627b861dc0d4fb0d2a3399a06fe60fb4ec01ea19ac935c4533911ef
Contents?: true
Size: 878 Bytes
Versions: 5
Compression:
Stored size: 878 Bytes
Contents
module Jkl
  module Text
    class << self
      def sanitize(text)
        remove_short_lines(strip_all_tags(remove_script_tags(text)))
      end
      alias :clean :sanitize

      def strip_all_tags(text)
        text.gsub(/<\/?[^>]*>/, "")
      end

      def remove_blank_lines(text)
        text.gsub(/\n\r|\r\n|\n|\r/, "")
      end

      def remove_html_comments(text)
        text.gsub(/<!--(.|\s)*?-->/, "")
      end

      def remove_script_tags(text)
        text = remove_html_comments(text)
        text.gsub(/((<[\s\/]*script\b[^>]*>)([^>]*)(<\/script>))/i, "")
      end

      def remove_short_lines(text)
        text = text.gsub(/\s\s/, "\n")
        str = ""
        # remove short lines - usually just navigation
        text.split("\n").each do |l|
          str << l unless l.count(" ") < 5
        end
        str
      end
    end
  end
end
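The `sanitize` pipeline above can be traced step by step on a small input. The sketch below is self-contained: it copies the regular expressions from the listing rather than loading the gem. The sample HTML string and the variable names (`html`, `no_scripts`, `no_tags`, `kept`, `result`) are made up for illustration and do not come from the gem.

```ruby
# Illustrative input (hypothetical): one script block, one short navigation
# label, and one longer sentence, separated by a double space.
html = "<html><script type=\"text/javascript\">var x = 1;</script>" \
       "<p>Home</p>  <p>The quick brown fox jumps over the lazy dog today</p></html>"

# Step 1: drop HTML comments, then <script>...</script> blocks
# (same patterns as remove_html_comments and remove_script_tags).
no_scripts = html.gsub(/<!--(.|\s)*?-->/, "")
                 .gsub(/((<[\s\/]*script\b[^>]*>)([^>]*)(<\/script>))/i, "")

# Step 2: strip every remaining tag (same pattern as strip_all_tags).
no_tags = no_scripts.gsub(/<\/?[^>]*>/, "")

# Step 3: turn each pair of whitespace characters into a line break, then
# keep only lines with at least five spaces (mirrors remove_short_lines,
# which concatenates the surviving lines with no separator).
kept = no_tags.gsub(/\s\s/, "\n").split("\n").reject { |l| l.count(" ") < 5 }
result = kept.join

puts result
```

Here the "Home" label has no spaces, so the short-line filter drops it, while the ten-word sentence survives. One thing to keep in mind when reading the script pattern: its body group is `[^>]*`, so a script whose body contains a `>` character (for example a comparison) would not match, and its text would remain after the tags are stripped.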
Version data entries
5 entries across 5 versions & 1 rubygem
Version | Path |
---|---|
jakal-0.1.92 | lib/jkl/text_client.rb |
jakal-0.1.91 | lib/jkl/text_client.rb |
jakal-0.1.9 | lib/jkl/text_client.rb |
jakal-0.1.8 | lib/jkl/text_client.rb |
jakal-0.1.7 | lib/jkl/text_client.rb |