Sha256: a8988702be96f7c4a737b81bfd1d5cea840ff7530ac6073d7a6b497e16d2eea9
Contents?: true
Size: 1.55 KB
Versions: 1
Compression:
Stored size: 1.55 KB
Contents
# A lightweight UTF8-aware String class meant for use with Ruby 1.8.x The scanning code is implemented in C (would love a Java version as well, if any of you want to help with that). At the moment, this gem is tested on 1.8.7, 1.9.2, 1.9.3 and Rubinius - it may work on others but ymmv. ## String::UTF8 Example The String::UTF8#[] API isn't fully supported yet, and may never support regular expressions. I'll update this readme as I make progress. ``` ruby # String (on 1.8) str = "你好世界" str.length # 12 str[0] # 228 str.chars.to_a.inspect # ["\344", "\275", "\240", "\345", "\245", "\275", "\344", "\270", "\226", "\347", "\225", "\214"] # String::UTF8 (on 1.8) utf8_str = "你好世界".as_utf8 utf8_str.length # 4 utf8.chars.to_a.inspect # => ["你", "好", "世", "界"] utf8[0] # => "你" utf8_str[0, 3] # => "你好世" utf8_str[0..3] # => "你好世界" ``` ## StringScanner::UTF8 Example The full StringScanner API hasn't been made UTF8-aware yet. I may try and finish it later, but the getch method is all I need at the moment. ``` ruby # StringScanner (on 1.8) scanner = StringScanner.new("你好世界") scanner.getch # => "\344" # StringScanner::UTF8 require 'utf8/string_scanner' # NOTE: we could have used scanner.as_utf8 instead utf8_scanner = StringScanner::UTF8.new("你好世界") utf8_scanner.getch # => "你" utf8_scanner.getch # => "好" utf8_scanner.getch # => "世" utf8_scanner.reset utf8_scanner.getch # => "你" ``` ## Disclaimer I'm only planning on making the methods I need UTF8-aware, if you need others I welcome pull requests :)
Version data entries
1 entries across 1 versions & 1 rubygems
Version | Path |
---|---|
utf8-0.1.8 | README.md |