Sha256: 704a90807c81ce61d82ea1faa976225eff23ce79f0c29fa3e1caaaac3eca0fa3

Contents?: true

Size: 1.36 KB

Versions: 8

Compression:

Stored size: 1.36 KB

Contents

require "yaml"

module LuckySneaks
  module Unidecoder
    # Transliteration tables keyed by codepoint group name (e.g. "x00").
    # Each group's YAML file is loaded on first access and then cached in the hash.
    CODEPOINTS = Hash.new { |h, k|
      h[k] = YAML.load_file(File.join(File.dirname(__FILE__), "unidecoder_data", "#{k}.yml"))
    } unless defined?(CODEPOINTS)
  
    class << self
      # Returns string with its UTF-8 characters transliterated to ASCII ones
      # 
      # You're probably better off just using the added String#to_ascii
      def decode(string)
        string.gsub(/[^\x00-\x7f]/u) do |codepoint|
          begin
            CODEPOINTS[code_group(codepoint)][grouped_point(codepoint)]
          rescue
            # Fall back to "?" if the lookup raises (e.g. no data file exists
            # for this group). Hopefully this won't come up much
            "?"
          end
        end
      end
    
    private
      # Returns the name of the 256-codepoint group containing the given
      # character (its codepoint shifted right by 8 bits), e.g. "x28" for U+280B
      def code_group(character)
        "x%02x" % (character.unpack("U")[0] >> 8)
      end
    
      # Returns the index of the given character in the YAML file for its
      # codepoint group (the low byte of its codepoint)
      def grouped_point(character)
        character.unpack("U")[0] & 255
      end
    end
  end
end

module LuckySneaks
  module StringExtensions
    # Returns string with its UTF-8 characters transliterated to ASCII ones. Example: 
    # 
    #   "⠋⠗⠁⠝⠉⠑".to_ascii #=> "france"
    def to_ascii
      LuckySneaks::Unidecoder.decode(self)
    end
  end
end
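
The lookup scheme in the listing above can be tried without the YAML data files. The following is a minimal sketch, not the gem's own code: `SAMPLE_DATA`, `TABLES`, `LOADS` and `sketch_decode` are hypothetical names, and the tiny in-memory table stands in for the real `unidecoder_data/*.yml` files. It assumes Ruby 1.9 or later with UTF-8 strings.

```ruby
# Same arithmetic as the listing: the high bits of the codepoint name the
# group, the low byte is the index within that group's table.
def code_group(character)
  "x%02x" % (character.unpack("U")[0] >> 8)
end

def grouped_point(character)
  character.unpack("U")[0] & 255
end

# Hypothetical stand-in for the YAML files: one sparse table for group "x00".
SAMPLE_DATA = { "x00" => { 0xe9 => "e", 0xfc => "u" } }

# Records which groups were requested, to make the lazy loading visible.
LOADS = []

# Lazily populated cache, mirroring the Hash.new block used for CODEPOINTS.
TABLES = Hash.new do |cache, group|
  LOADS << group
  cache[group] = SAMPLE_DATA.fetch(group) # raises KeyError for unknown groups
end

def sketch_decode(string)
  string.gsub(/[^\x00-\x7f]/) do |char|
    begin
      TABLES[code_group(char)][grouped_point(char)]
    rescue KeyError
      "?" # no table available for this group
    end
  end
end

puts code_group("\u00e9")                  # x00
puts grouped_point("\u00e9")               # 233
puts sketch_decode("caf\u00e9 m\u00fcsli") # cafe musli
puts sketch_decode("\u280b")               # ?
p LOADS                                    # ["x00", "x28"]
```

Group "x00" is loaded once even though two characters fall into it, while the unknown group "x28" triggers the `"?"` fallback. The sketch rescues only `KeyError`; the original's bare `rescue` catches any `StandardError`, including a missing data file.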

Version data entries

8 entries across 8 versions & 2 rubygems

Version Path
jeffrafter-sms-0.8.0 lib/lucky_sneaks/unidecoder.rb
jeffrafter-sms-0.8.1 lib/lucky_sneaks/unidecoder.rb
jeffrafter-sms-0.8.2 lib/lucky_sneaks/unidecoder.rb
jeffrafter-sms-0.8.3 lib/lucky_sneaks/unidecoder.rb
rsl-stringex-0.9.0 lib/lucky_sneaks/unidecoder.rb
rsl-stringex-0.9.1 lib/lucky_sneaks/unidecoder.rb
rsl-stringex-0.9.2 lib/lucky_sneaks/unidecoder.rb
rsl-stringex-0.9.3 lib/lucky_sneaks/unidecoder.rb