Class: CodeZauker::Util
- Inherits:
- Object
- Hierarchy: Object > CodeZauker::Util
- Defined in:
- lib/code_zauker.rb
Overview
Basic utility class
Instance Method Summary
- (Object) ensureUTF8(untrusted_string)
Ensures that data is imported correctly by "guessing" the encoding: falls back to ISO-8859-1 if the string is not valid UTF-8. Background: blog.grayproductions.net/articles/ruby_19s_string
- (Object) mixCase(trigram)
Computes all possible case-mixed variants of a trigram. Works for strings of any length. TODO: very poor implementation, needs improvement.
Instance Method Details
- (Object) ensureUTF8(untrusted_string)
Ensures that data is imported correctly.
This code tries to "guess" the right encoding, switching to ISO-8859-1 if the string is not valid UTF-8. Typical use case: Italian source code wrongly interpreted as UTF-8 when it is actually in an ISO-8859 encoding, as is common for files written on Windows. Background: blog.grayproductions.net/articles/ruby_19s_string
# File 'lib/code_zauker.rb', line 56
def ensureUTF8(untrusted_string)
  if untrusted_string.valid_encoding?()==false
    #puts "DEBUG Trouble on #{untrusted_string}"
    # Reinterpret the bytes as ISO-8859-1, typical of Windows files.
    # Note: force_encoding changes the caller's string in place.
    untrusted_string.force_encoding("ISO-8859-1")
    begin
      # Options are passed as keywords so the call works across Ruby versions.
      valid_string=untrusted_string.encode("UTF-8", :undef => :replace, :invalid => :replace)
    rescue Encoding::InvalidByteSequenceError => e
      raise e
    end
    # if valid_string != untrusted_string
    #   puts "CONVERTED #{valid_string} Works?#{valid_string.valid_encoding?}"
    # end
    return valid_string
  else
    return untrusted_string
  end
end
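The fallback can be tried in isolation with a small standalone sketch. The helper name `to_utf8_sketch` is hypothetical and not part of CodeZauker::Util; it follows the same idea (reinterpret invalid UTF-8 as ISO-8859-1, then transcode) but works on a copy so the caller's string is left untouched. It assumes the source file itself is UTF-8, which is Ruby's default.

```ruby
# Hypothetical standalone helper, NOT part of CodeZauker::Util.
# Same idea as ensureUTF8, but it does not modify its argument.
def to_utf8_sketch(untrusted_string)
  # Already valid in its declared encoding: return it unchanged.
  return untrusted_string if untrusted_string.valid_encoding?

  # Work on a copy, relabel the bytes as ISO-8859-1, then transcode.
  untrusted_string.dup
                  .force_encoding("ISO-8859-1")
                  .encode("UTF-8", :undef => :replace, :invalid => :replace)
end

# The Italian word "perché" stored as ISO-8859-1 bytes (0xE9 is "é")
# but labeled as UTF-8, which makes the string invalid.
mislabeled = "perch\xE9".dup.force_encoding("UTF-8")

puts mislabeled.valid_encoding?   # false
fixed = to_utf8_sketch(mislabeled)
puts fixed.valid_encoding?        # true
puts fixed.encoding               # UTF-8
```

Since every byte sequence can be read as ISO-8859-1, this fallback always yields a valid string, but the result is only meaningful when the input really was in that encoding.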
- (Object) mixCase(trigram)
Computes all possible case-mixed variants of a trigram. Works for strings of any length. TODO: very poor implementation, needs improvement.
# File 'lib/code_zauker.rb', line 19
def mixCase(trigram)
  caseMixedElements=[]
  lx=trigram.length
  combos=2**lx
  startString=trigram.downcase
  #puts "Combos... 1..#{combos}... #{startString}"
  for c in 0..(combos-1) do
    # Make binary
    maskForStuff=c.to_s(2)
    p=0
    #puts maskForStuff
    currentMix=""
    # Pad it
    if maskForStuff.length < lx
      maskForStuff = ("0"*(lx-maskForStuff.length)) +maskForStuff
    end
    maskForStuff.each_char { | x |
      #putc x
      if x=="1"
        currentMix +=startString[p].upcase
      else
        currentMix +=startString[p].downcase
      end
      #puts currentMix
      p+=1
    }
    caseMixedElements.push(currentMix)
  end
  return caseMixedElements
end
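As one possible direction for the TODO, the same enumeration can be sketched with Array#product instead of binary masks. The name `mix_case_sketch` is hypothetical and this is not the library's implementation. It differs from mixCase in two ways, both assumptions of this sketch: characters without distinct cases (digits, punctuation) contribute a single choice, so the result has no duplicate entries, and an empty string yields `[""]`.

```ruby
# Hypothetical alternative to mixCase, NOT part of CodeZauker::Util.
def mix_case_sketch(text)
  # One list of case variants per character; uniq collapses
  # characters such as digits that have no distinct upper case.
  choices = text.each_char.map { |ch| [ch.downcase, ch.upcase].uniq }
  return [""] if choices.empty?

  # Cartesian product of the per-character choices, joined back into strings.
  first, *rest = choices
  first.product(*rest).map(&:join)
end

p mix_case_sketch("ab")   # ["ab", "aB", "Ab", "AB"]
p mix_case_sketch("a1")   # ["a1", "A1"]
```

For input made only of letters the result has 2**n entries, in the same order as the binary-mask approach (all lower case first, all upper case last).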