The "google_hash" gem. Its goal. To boldly be faster than any ruby hash has before (cue star trek TNG theme...). Well, really the goal is a better hash for Ruby, either one that is faster or more space efficient than ruby's default. To attempt to accomplish this, this library wraps the google hash sparse and dense hashes [1], which seem to perform much better at least for the #each method. It also creates some "specialized" hashes, for instance, those that take an integer for their key, for even better performance. ruby 1.9.3p194 (2012-04-20 revision 35410) [i686-linux] inserting 400000 objects Ruby Standard Hash populate integer 0.324 #each 0.660 lookup int 0.083 GoogleHashDenseIntToRuby populate integer 0.114 #each 0.050 lookup int 0.080 GoogleHashSparseIntToInt # said to be more memory efficient populate integer 0.242 #each 0.046 lookup int 0.099 These also use less memory, because (if you specify IntToInt, it stores only 4 bytes per int, instead of Ruby's usual 20 bytes), and usually less overall RAM for the hash store. This also frees up Ruby so it doesn't hvae to garbage collect as much. Yea! For instance, with 1M ints, doing a GC takes this long, comparatively: GoogleHashDenseIntToInt "dense took 0.002" "ruby hash took 0.103" per GC. And those garbage collects happen all the time, so this is meant to speed those up. See also the results.txt file for more OS benchmark results. You can also run your own benchmarks to see how much faster it would be for you ("try before you buy"), see spec/benchmark.rb file. The best benchmark, of course, is to integrate and run it in your own app. Here is how it performs, if used as a "replacement" for the Ruby standard Hash: Ruby Standard Hash populate string 0.252 populate symbol 0.109 populate integer 0.324 #each 0.660 lookup int 0.083 lookup string 0.642 lookup symbol 0.082 GoogleHashDenseRubyToRuby populate string 0.312 # slower here populate symbol 0.136 populate integer 0.172 #each 0.077 lookup int 0.102 lookup string 0.271 lookup symbol 0.098 GoogleHashSparseRubyToRuby populate string 0.276 populate symbol 0.102 populate integer 0.428 #each 0.047 lookup int 0.098 lookup string 0.293 lookup symbol 0.099 basically slightly slower, except for an #each method that is *much* faster, and, as I said, RAM usage might be much better using this than the standard Ruby hash, esp. for the case of the Sparse version. == Installation == gem install google_hash (if on doze, you'll also first need the devkit installed). == usage == a = GoogleHashDenseRubyToRuby.new # like a normal Ruby hash, with slightly different performance characteristics, see results.txt b = GoogleHashDenseIntToRuby.new # :int => Ruby -- keys are stored as "native" ints, values are ruby objects c = GoogleHashSparseIntToInt.new # :int => :int Here's the full list of availables: >> Object.constants.each{|k k if k =~ /google/i } GoogleHashDenseLongToLong GoogleHashSparseLongToLong GoogleHashDenseLongToInt GoogleHashSparseLongToInt GoogleHashDenseLongToRuby GoogleHashSparseLongToRuby GoogleHashDenseDoubleToLong GoogleHashSparseDoubleToLong GoogleHashDenseDoubleToInt GoogleHashSparseDoubleToInt GoogleHashDenseDoubleToRuby GoogleHashSparseDoubleToRuby GoogleHashDenseIntToLong GoogleHashSparseIntToLong GoogleHashDenseIntToInt GoogleHashSparseIntToInt GoogleHashDenseIntToRuby GoogleHashSparseIntToRuby GoogleHashDenseRubyToLong GoogleHashSparseRubyToLong GoogleHashDenseRubyToInt GoogleHashSparseRubyToInt GoogleHashDenseRubyToRuby GoogleHashSparseRubyToRuby (long is "better" than int on 64 bit systems only) and how to use them: a[3] = 4 b[4] = 'abc' b['abc'] = 'some complex object' c[3] = 4 # all you can use are ints... Use them like "normal" hashes, the method names try to map well. == feedback == If you have a desired use case that's not covered, let me know and I might well be able to code it up for you and add it. ex: currently it uses longs internally instead of ints--if you want ints or chars or what not, let me know. if you want it to remember insertion order, I could do that, too, or native "store away" strings/bignums, whatever. Could also add vectors, vector(pairs), priority queues, floats, native bignums, other more complex types, if anybody wants me to. This is meant to be one more tool in the rubyists toolbelt when trying to optimize speed-wise, and plans to expand to more types, but at least with this release it has a #each method. Enjoy. -r [1] http://code.google.com/p/google-sparsehash If you want to see the code/hack on it, run extconf.rb within the ext directory, to create the code it actually uses (from a template). Related: judy http://groups.google.com/group/ruby-talk-google/browse_thread/thread/05ed587925526a7f/314375891d12b672?lnk=raot NArray gem : provides "native" type arrays (and X-Dimensional arrays--all in native memory, so also saves memory as this gem does)