Path: | README.rdoc |
Last Update: | Tue May 12 11:20:38 -0400 2009 |
ofac is a ruby gem that tries to find a match of a person‘s name and address against the Office of Foreign Assets Control‘s Specially Designated Nationals list...the so called terrorist watch list.
This gem, like the ssn_validator gem, started as a need for the company I work for, Clarity Services Inc. We decided once again to create a gem out of it and share it with the community. Much thanks goes to the management at Clarity Services Inc. for allowing this code to be open sourced. Thanks also to Larry Berland at Clarity Services Inc. The matching logic in the ofac_match.rb file was derived from his work.
Creates a score, 1 - 100, based on how well the name, address and city match the data on the SDN list. Since we have to match on strings, the likely hood of an exact match are virtually nil. So we‘ve created an algorithm that creates a score. The better the match, the higher the score. A score of 100 would be a perfect match.
The score is calculated by adding up the weightings of each part that is matched. So if only name is matched, then the max score is the weight for :name which is 60
It‘s possible to get partial matches, which will add partial weight to the score. If there is not a match on the element as it is passed in, then each word element gets broken down and matches are tried on each partial element. The weighting is distrubuted equally for each partial that is matched.
If exact matches are not made, then a sounds like match is attempted. Any match made by sounds like is given 75% of it‘s weight to the score. Example:
If you are trying to match the name Kevin Tyll and there is a record for Smith, Kevin in the database, then we will try to match both Kevin and Tyll separately, with each element Smith and Kevin. Since only Kevin will find a match, and there were 2 elements in the searched name, the score will be added by half the weighting for :name. So since the weight for :name is 60, then we will add 30 to the score.
If you are trying to match the name Kevin Gregory Tyll and there is a record for Tyll, Kevin in the database, then we will try to match Kevin and Gregory and Tyll separately, with each element Tyll and Kevin. Since both Kevin and Tyll will find a match, and there were 3 elements in the searched name, the score will be added by 2/3 the weighting for :name. So since the weight for :name is 60, then we will add 40 to the score.
If you are trying to match the name Kevin Tyll and there is a record for Kevin Gregory Tyll in the database, then we will try to match Kevin and Tyll separately, with each element Tyll and Kevin and Gregory. Since both Kevin and Tyll will find a match, and there were 2 elements in the searched name, the score will be added by 2/2 the weighting for :name. So since the weight for :name is 60, then we will add 60 to the score.
If you are trying to match the name Kevin Tyll, and there is a record for Teel, Kevin in the database, then an exact match will be found for Kevin, and a sounds like match will be made for Tyll. Since there were 2 elements in the searched name, and the weight for :name is 60, then each element is worth 30. Since Kevin was an exact match, it will add 30, and since Tyll was a sounds like match, it will add 30 * .75. So the :name portion of the search will be worth 53.
If data is in the database for city and or address, and you pass data in for these elements, the score will be reduced by 10% of the weight if there is no match or sounds like match. So if you get a match on name, you‘ve already got a score of 60. So if you don‘t pass in an address or city, or if you do, but there is no city or address info in the database, then your final score will be 60. But if you do pass in a city, say Tampa, and the city in the Database is New York, then we will deduct 10% of the weight (30 * .1) = 3 from the score since 30 is the weight for :city. So the final score will be 57.
If were searching for New York, and the database had New Deli, then there would be a match on New, but not on Deli. Since there were 2 elements in the searched city, each hit is worth 15. So the match on New would add 15, but the non-match on York would subtract (15 * .1) = 1.5 from the score. So the score would be (60 + 15 - 1.5) = 74, due to rounding.
Only :city and :address subtract from the score, No match on name simply returns 0.
Matches for name are made for both the name and any aliases in the OFAC database.
Matches for :city and :address will only be added to the score if there is first a match on :name.
We consider a score of 60 to be reasonable as a hit.
Accepts a hash with the identity‘s demographic information
Ofac.new({:name => 'Oscar Hernandez', :city => 'Clearwater', :address => '123 somewhere ln'})
:name is required to get a score. If :name is missing, an error will not be thrown, but a score of 0 will be returned.
The more information provided, the higher the score could be. A score of 100 would mean all fields were passed in, and all fields were 100% matches. If only the name is passed in without an address, it will be impossible to get a score of 100, even if the name matches perfectly.
Acceptable hash keys and their weighting in score calculation:
ofac = Ofac.new(:name => 'Kevin Tyll', :city => 'Clearwater', :address => '123 Somewhere Ln.')
ofac.score => return the score 1 - 100
ofac.possible_hits => returns an array of hashes.
sudo gem install kevintyll-ofac
script/generate ofac_migration
config.gem 'kevintyll-ofac', :lib => 'ofac'
rake ofac:update_data * The OFAC data is not updated with any regularity, but you can sign up for email notifications when the data changes at http://www.treas.gov/offices/enforcement/ofac/sdn/index.shtml.
Copyright (c) 2009 Kevin Tyll. See LICENSE for details.