Geocoding Inventor Locations
Revision as of 19:05, 22 August 2009
The candidate reference string with the highest Ngram and LCS scores is then selected as the closest match, provided that these scores meet the decision threshold. If no candidate reference string meets the decision threshold, the source string is left unmatched. The decision thresholds are configured in the MatchLocations.pl script. Sets of Ngram/LCS matchings, using different character sets, gram lengths and decision thresholds, are performed sequentially, with the currently unmatched source strings used as input for each round.
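The round-by-round procedure described above can be illustrated with a minimal Python sketch. It is not the MatchLocations.pl code. The names <code>Round</code>, <code>clean</code>, <code>ngram_score</code>, <code>lcs_score</code> and <code>match_rounds</code> are hypothetical, and several details are assumptions made only for illustration: the Ngram score is taken to be a Dice coefficient over character n-grams, the LCS score is the longest common subsequence length divided by the longer string's length, and ties on the Ngram score are broken by the LCS score.

```python
from dataclasses import dataclass


@dataclass
class Round:
    """One matching round (hypothetical configuration record)."""
    keep: str         # character set retained before comparison
    n: int            # gram length
    min_ngram: float  # decision threshold on the n-gram score
    min_lcs: float    # decision threshold on the LCS score


def clean(s, keep):
    """Lower-case the string and drop characters outside the round's set."""
    return "".join(c for c in s.lower() if c in keep)


def ngram_score(a, b, n):
    """Dice coefficient over sets of character n-grams (assumed scoring)."""
    ga = {a[i:i + n] for i in range(len(a) - n + 1)}
    gb = {b[i:i + n] for i in range(len(b) - n + 1)}
    if not ga or not gb:
        return 0.0
    return 2 * len(ga & gb) / (len(ga) + len(gb))


def lcs_score(a, b):
    """Longest common subsequence length over the longer string's length."""
    if not a or not b:
        return 0.0
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if ca == cb else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1] / max(len(a), len(b))


def match_rounds(sources, references, rounds):
    """Run the rounds in order; each round sees only still-unmatched sources."""
    matched, unmatched = {}, list(sources)
    for r in rounds:
        still_unmatched = []
        for src in unmatched:
            s = clean(src, r.keep)
            best = None  # ((ngram, lcs), reference) of the best candidate so far
            for ref in references:
                t = clean(ref, r.keep)
                scores = (ngram_score(s, t, r.n), lcs_score(s, t))
                # Only candidates meeting both thresholds are eligible.
                if scores[0] >= r.min_ngram and scores[1] >= r.min_lcs:
                    if best is None or scores > best[0]:
                        best = (scores, ref)
            if best:
                matched[src] = best[1]
            else:
                still_unmatched.append(src)
        unmatched = still_unmatched
    return matched, unmatched
```

Calling <code>match_rounds</code> with a strict round followed by a looser one mirrors the sequential scheme: strings that fail the first thresholds are retried under the second configuration, and anything still unmatched after the last round is returned separately.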
===Reconciling Multiple Matches===
In a small number of cases, the source string may achieve more than one A (Area) or P (Place) match. For example, suppose the string "Glouchester Street Cambridge Cambridgeshire" were considered. This could conceivably produce two P matches and one A match with the token matching algorithm detailed above.