= loose_tight_dictionary

Match strings based on similarity (using the Pair Distance algorithm) and on regular expressions.

== Quickstart

    >> d = LooseTightDictionary.new %w{seamus andy ben}
    => [...]
    >> d.find 'Shamus Heaney'
    => "seamus"

Try running the included example file:

    $ ruby examples/first_name_matching.rb 
    ######################################################################################################################################################
    # Match "Mr. Seamus" => "seamus"
    ######################################################################################################################################################

    Needle
    (needle_reader proc not defined, so downcasing everything)
    ------------------------------------------------------------------------------------------------------------------------------------------------------
    "mr. seamus"

    Haystack
    (haystack_reader proc not defined, so downcasing everything)
    ------------------------------------------------------------------------------------------------------------------------------------------------------
    "seamus"
    "andy"
    "ben"

    Tighteners
    ------------------------------------------------------------------------------------------------------------------------------------------------------
    (none)

    Comparisons
    Score                                             t_haystack [=> tightened/prefixed]                t_needle [=> tightened/prefixed]                  
    ------------------------------------------------------------------------------------------------------------------------------------------------------
    0.8333333333333334                                "seamus"                                          "mr. seamus"
    0.0                                               "andy"                                            "mr. seamus"
    0.0                                               "ben"                                             "mr. seamus"

    Match
    ------------------------------------------------------------------------------------------------------------------------------------------------------
    "seamus"

    # [... there's more output ...]
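The scores in the Comparisons table are consistent with a Dice-style similarity over adjacent character pairs, counted within each whitespace-delimited word. The following is a minimal sketch under that assumption, not the gem's actual implementation; +letter_pairs+ and +pair_distance+ are hypothetical helper names, and returning 0.0 when neither string has any pairs is an arbitrary choice.

```ruby
# Hypothetical helper: split a string into adjacent character pairs,
# one whitespace-delimited word at a time ("mr. seamus" => mr, r., se, ea, ...).
def letter_pairs(str)
  str.split.flat_map { |word| word.chars.each_cons(2).map(&:join) }
end

# Hypothetical helper: 2 * (shared pairs) / (total pairs in both strings).
# Each pair in b can be matched only once (multiset intersection).
def pair_distance(a, b)
  pairs_a = letter_pairs(a)
  pairs_b = letter_pairs(b)
  total = pairs_a.size + pairs_b.size
  return 0.0 if total.zero? # assumption: nothing to compare scores zero

  shared = 0
  pairs_a.each do |pair|
    index = pairs_b.index(pair)
    next unless index

    pairs_b.delete_at(index) # consume the matched pair
    shared += 1
  end
  2.0 * shared / total
end

pair_distance("seamus", "mr. seamus") # => 0.8333333333333334
pair_distance("andy", "mr. seamus")   # => 0.0
```

Here "seamus" contributes 5 pairs and "mr. seamus" contributes 7, of which 5 are shared, giving 2 * 5 / 12.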

== The Boeing example

From the tests:

    ######################################################################################################################################################
    # Match "BOEING 737100" => "BOEING BOEING 737-100/200"
    ######################################################################################################################################################

    Needle
    (needle_reader proc not defined, so downcasing everything)
    ------------------------------------------------------------------------------------------------------------------------------------------------------
    "boeing 737100"

    Haystack
    (haystack_reader proc not defined, so downcasing everything)
    ------------------------------------------------------------------------------------------------------------------------------------------------------
    "boeing boeing 737-100/200"
    "boeing boeing 737-900"

    Tighteners
    ------------------------------------------------------------------------------------------------------------------------------------------------------
    /(7\d)(7|0)-?(\d{1,3})/i

    Comparisons
    Score                                             t_haystack [=> tightened/prefixed]                t_needle [=> tightened/prefixed]                  
    ------------------------------------------------------------------------------------------------------------------------------------------------------
    1.0                                               "boeing boeing 737-100/200" => "737100"           "boeing 737100" => "737100"
    0.6666666666666666                                "boeing boeing 737-100/200" => "737100"           "boeing 737100"
    0.6153846153846154                                "boeing boeing 737-900"                           "boeing 737100"
    0.6                                               "boeing boeing 737-900" => "737900"               "boeing 737100" => "737100"
    0.6                                               "boeing boeing 737-100/200"                       "boeing 737100"
    0.4                                               "boeing boeing 737-900" => "737900"               "boeing 737100"
    0.32                                              "boeing boeing 737-100/200"                       "boeing 737100" => "737100"
    0.2857142857142857                                "boeing boeing 737-900"                           "boeing 737100" => "737100"

    Match
    ------------------------------------------------------------------------------------------------------------------------------------------------------
    "BOEING BOEING 737-100/200"
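The "=> tightened" columns above look like the tightener's capture groups joined together. Here is a minimal sketch of that reading in plain Ruby; +tighten+ is a hypothetical helper written for illustration, not a method of this library.

```ruby
# The tightener from the Boeing example above.
TIGHTENER = /(7\d)(7|0)-?(\d{1,3})/i

# Hypothetical helper: join the capture groups of the first match,
# or return nil when the tightener does not apply to the string.
def tighten(str, tightener)
  match = tightener.match(str)
  match && match.captures.join
end

tighten("boeing boeing 737-100/200", TIGHTENER) # => "737100"
tighten("boeing 737100", TIGHTENER)             # => "737100"
tighten("boeing boeing 737-900", TIGHTENER)     # => "737900"
tighten("andy", TIGHTENER)                      # => nil
```

Because the optional hyphen sits outside the capture groups, "737-100" and "737100" tighten to the same string, which is why that comparison scores 1.0.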

== Improving dictionaries

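Similarity matching will only get you so far. In the Boeing example above, comparing the raw strings scores the wrong record ("boeing boeing 737-900", about 0.615) slightly higher than the right one ("boeing boeing 737-100/200", 0.6); only the comparison of the tightened strings reaches 1.0.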

    TODO: regex usage

== Note on Patches/Pull Requests
 
* Fork the project.
* Make your feature addition or bug fix.
* Add tests for it. This is important so I don't break it in a
  future version unintentionally.
* Commit, but do not touch the Rakefile, version, or history.
  (If you want to have your own version, that is fine, but bump the version in a separate commit that I can ignore when I pull.)
* Send me a pull request. Bonus points for topic branches.

== Copyright

Copyright (c) 2011 Seamus Abshere. See LICENSE for details.