Sha256: f77b3271a871f5189d5554a1ce5f4bcef810ac55baff52573a5b8c92a81a20f9
Contents?: true
Size: 1.73 KB
Versions: 5
Compression:
Stored size: 1.73 KB
Contents
= loose_tight_dictionary

Match things based on string similarity (using the Pair Distance algorithm) and regular expressions.

= Quickstart

  >> right_records = [ 'seamus', 'andy', 'ben' ]
  => [...]
  >> left_record = 'Shamus Heaney'
  => [...]
  >> d = LooseTightDictionary.new right_records
  => [...]
  >> puts d.left_to_right left_record
  => 'seamus'

Try running the included example file:

  $ ruby examples/first_name_matching.rb
  Left side (input)
  ====================
  Mr. Seamus
  Sr. Andy
  Master BenT

  Right side (output)
  ====================
  seamus
  andy
  ben

  Results
  ====================
  Left record (input)   Right record (output)   Prefix used (if any)   Score
  Mr. Seamus            seamus                  NULL                   0.666666666666667
  Sr. Andy              andy                    NULL                   0.5
  Master BenT           ben                     NULL                   0.2

= Improving dictionaries

Similarity matching will only get you so far.

TODO: regex usage

== Note on Patches/Pull Requests

* Fork the project.
* Make your feature addition or bug fix.
* Add tests for it. This is important so I don't break it in a future version unintentionally.
* Commit, do not mess with rakefile, version, or history.
  (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)
* Send me a pull request. Bonus points for topic branches.

== Copyright

Copyright (c) 2010 Seamus Abshere. See LICENSE for details.
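
To make the scores in the Results table easier to interpret, here is a minimal sketch of pair-distance scoring: a Dice coefficient over character bigrams. It is not the gem's own code. The helper names +letter_pairs+ and +pair_distance+ are hypothetical, and the sketch assumes that bigrams are taken per whitespace-separated word and compared case-sensitively. Those assumptions happen to be consistent with the three sample scores above, but the gem may compute its scores differently.

```ruby
# Hypothetical helper (not part of the gem's API): character bigrams of each
# whitespace-separated word, kept case-sensitive.
def letter_pairs(string)
  string.split(/\s+/).flat_map do |word|
    (0...word.length - 1).map { |i| word[i, 2] }
  end
end

# Hypothetical helper: Dice coefficient over the two bigram lists,
# i.e. 2 * (shared pairs) / (total pairs on both sides).
def pair_distance(left, right)
  left_pairs = letter_pairs(left)
  right_pairs = letter_pairs(right)
  total = left_pairs.length + right_pairs.length
  return 0.0 if total.zero?

  remaining = right_pairs.dup
  shared = left_pairs.count do |pair|
    index = remaining.index(pair)
    remaining.delete_at(index) if index # consume each match only once
  end
  2.0 * shared / total
end

# Pick the right-side record with the highest score for a left-side record.
right_records = ['seamus', 'andy', 'ben']
best = right_records.max_by { |record| pair_distance('Mr. Seamus', record) }
puts best
```

Under these assumptions, 'Mr. Seamus' yields seven bigrams and 'seamus' five, with four in common ('Se' and 'se' differ by case), giving 2 * 4 / 12, or roughly 0.667. Titles such as 'Mr.' dilute the score in the same way, which is one reason similarity matching alone only goes so far.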