Class: CodeZauker::Util

Inherits:
Object
Defined in:
lib/code_zauker.rb

Overview

Basic utility class

Instance Method Summary

Instance Method Details

- (Object) ensureUTF8(untrusted_string)

Ensures that data is imported with a valid encoding.

See blog.grayproductions.net/articles/ruby_19s_string. This code tries to "guess" the right encoding, switching to ISO-8859-1 when the string is not valid in its current encoding (typically UTF-8). Typical use case: Italian source code wrongly interpreted as UTF-8 when it is actually ISO-8859-1 text from Windows.



# File 'lib/code_zauker.rb', line 56

def ensureUTF8(untrusted_string)
  # Already valid in its current encoding: nothing to do.
  return untrusted_string if untrusted_string.valid_encoding?

  # Otherwise reinterpret the bytes as ISO-8859-1, typical of Windows files.
  # Note: force_encoding changes the encoding of the caller's string in place.
  untrusted_string.force_encoding("ISO-8859-1")
  # Transcode to UTF-8, replacing anything that cannot be converted.
  # Options are passed as keywords so the call also works on Ruby 3.
  return untrusted_string.encode("UTF-8", :undef => :replace, :invalid => :replace)
end
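
The fallback described above can be tried in isolation. The following is a minimal sketch, not the library's code: `to_utf8_sketch` is a hypothetical standalone helper that follows the same idea but works on a copy, so the caller's string is left untouched. The sample bytes are an assumption chosen to mimic an Italian word saved as ISO-8859-1.

```ruby
# Hypothetical helper for illustration; not part of CodeZauker::Util.
def to_utf8_sketch(untrusted_string)
  # Valid strings are returned unchanged.
  return untrusted_string if untrusted_string.valid_encoding?

  # Work on a copy so the caller's string keeps its original encoding tag.
  latin1 = untrusted_string.dup
  latin1.force_encoding("ISO-8859-1")
  # Every ISO-8859-1 byte maps to a Unicode code point, so this transcodes cleanly.
  latin1.encode("UTF-8", invalid: :replace, undef: :replace)
end

# "perché" as saved by an ISO-8859-1 editor: the accented e is the single byte 0xE9,
# which is not a valid UTF-8 sequence on its own.
raw = "perch\xE9".b.force_encoding("UTF-8")
fixed = to_utf8_sketch(raw)

puts raw.valid_encoding?    # false
puts fixed.valid_encoding?  # true
puts fixed.encoding         # UTF-8
```

Because ISO-8859-1 assigns a character to every byte value, this fallback always yields a valid string, although text that was really in another encoding will be converted to the wrong characters.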

- (Object) mixCase(trigram)

Computes all possible case-mixed variants of a trigram. It works for strings of any length: a string of length n yields 2**n entries. TODO: very bad implementation, needs improvement.



# File 'lib/code_zauker.rb', line 19

def mixCase(trigram)
  caseMixedElements = []
  lx = trigram.length
  combos = 2**lx
  startString = trigram.downcase
  (0...combos).each do |c|
    # Binary form of c, left-padded with zeros to lx digits.
    # Each digit decides the case of the character at the same position.
    maskForStuff = c.to_s(2).rjust(lx, "0")
    currentMix = ""
    maskForStuff.each_char.with_index do |bit, pos|
      # "1" selects upper case, "0" selects lower case
      currentMix += bit == "1" ? startString[pos].upcase : startString[pos].downcase
    end
    caseMixedElements.push(currentMix)
  end
  return caseMixedElements
end
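
As one possible direction for the TODO above, the same list can be built without bit masks. This is a sketch under assumptions, not the library's implementation: `mix_case_sketch` is a hypothetical name, and it relies on `Array#product` to combine a lower/upper pair for each character.

```ruby
# Hypothetical alternative for illustration; not part of CodeZauker::Util.
def mix_case_sketch(text)
  # One [lower, upper] pair per character, in order.
  choices = text.downcase.each_char.map { |ch| [ch, ch.upcase] }
  # Assumption of this sketch: the empty string has exactly one variant, itself.
  return [""] if choices.empty?

  first, *rest = choices
  # The Cartesian product picks one case per position; join rebuilds each string.
  first.product(*rest).map(&:join)
end

p mix_case_sketch("ab")          # => ["ab", "aB", "Ab", "AB"]
p mix_case_sketch("abc").length  # => 8
```

Characters without a case distinction, such as digits, produce repeated entries here just as they do in the bit-mask version; calling `uniq` on the result would remove them if duplicates are unwanted.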