* Basic commands
  Make sure to uninstall code zauker from your gems (gem uninstall code_zauker)
  before you start developing.
** Environment setup (Windows)
   Ensure you have the Dev Kit too:
   http://rubyinstaller.org/downloads
   https://github.com/oneclick/rubyinstaller/wiki/Development-Kit

   The Dev Kit is needed for hiredis. hiredis is not mandatory, but it is suggested.
#+begin_src sh
 gem install bundler
 # Dev kit installation...
 #ruby /c/rubyinstallkit/dk.rb init
 #ruby /c/rubyinstallkit/dk.rb install
 bundle install
 rake test
 # Ensure dev code is reachable
 export RUBYLIB=k:/code/code_zauker/lib
#+end_src

** To run tests
#+begin_src sh
 rake test
#+end_src

** To release a new version to RubyGems
#+begin_src sh
 rake release
#+end_src

** Dependency management
   Dependencies are managed with Ruby's "bundle" command. Run "bundle update"
   periodically to make sure you have the latest bug fixes of the dependent libraries.

* Notable facts
** DB size tradeoff
   If the n-gram size is greater than 3, the database becomes larger, because there
   are fewer collisions.
   czlist works better with 4-grams than with 3-grams (a lot fewer false positives),
   but the database can be 50% bigger.

   A 2-gram size gives a very small db, but the false positives are a nightmare:
   czlist gives 2188 files containing "for", but grep reports only 383 of them
   (a hit rate below 18%).

   Emacs Lisp files produce a very large number of trigrams.

* Future/Study
  To fulfill Google code options:
** Google code input
   Offers regular expression search like
   ^java/.*\.java$
   and also:
   - Package :: package:linux-2.6
   - Language :: lang:c++
   - File Path :: file:(code|[^or]g)search
   - Class :: class:HashMap
   - Function :: function:toString
   - License :: license:mozilla
   - Case Sensitive :: case:yes

** Reference
   http://www.rubyinside.com/21-ruby-tricks-902.html

* AWS tests
** Micro instance
   Without multiplexing, indexing code_zauker takes 4m39.599s.
   With
     time find . -type f -print0 | xargs -0 -P 10 -n 20 ./bin/czindexer -v --redis-server awsserver
   it takes about 0m31.284s.
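
   To put the two micro instance timings side by side, here is a minimal Ruby
   sketch that converts the "XmY.YYYs" strings printed by time into seconds and
   computes the ratio. The helper name time_to_seconds is hypothetical (it is
   not part of code_zauker), and the result only compares the two wall-clock
   figures quoted above; it says nothing about other machines or workloads.

#+begin_src ruby
 # Hypothetical helper: convert "4m39.599s" style output of the shell's
 # time builtin into a number of seconds.
 def time_to_seconds(text)
   match = /\A(\d+)m(\d+(?:\.\d+)?)s\z/.match(text)
   raise ArgumentError, "unexpected time format: #{text}" unless match
   match[1].to_i * 60 + match[2].to_f
 end

 # The two timings quoted in these notes.
 without_multiplexing = time_to_seconds("4m39.599s")
 with_xargs_parallel  = time_to_seconds("0m31.284s") # xargs -P 10 -n 20

 speedup = without_multiplexing / with_xargs_parallel
 puts format("%.3fs vs %.3fs, speedup about %.1fx",
             without_multiplexing, with_xargs_parallel, speedup)
#+end_src

   With these two figures the ratio comes out a little under 9x, so running
   ten czindexer processes in parallel gets close to, but not quite, a tenfold
   gain on this instance.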