Geocoding Inventor Locations
Revision as of 14:58, 21 August 2009
==The Matching Process==
The matching process is carried out by [http://www.edegan.com/repository/MatchPatentLocations.pl MatchPatentLocations.pl], which has a standard POD-based command line interface. The -co option specifies the ISO 3166 country code to be matched. The script uses these modules: PatentLocations.pm, GNS.pm, CleanStrings.pm, GramMatch.pm, LCS.pm and PostalCodes.pm. In addition to the GNS reference files and patent data source files detailed above, the script also uses PatentLocations-Stopwords.txt.
Glossary of terms:
To reconcile multiple matches, the following process is undertaken:
*If there are both P and A matches, and more than one of either, then determine the P-A pair with the shortest distance between them, using a [http://en.wikipedia.org/wiki/Haversine_formula Haversine formula] distance calculation based on the GNS-reported longitudes and latitudes. (Note that the Haversine formula is implemented in the Match::GNS.pm module. It remains accurate over short distances, where other methods, such as the spherical law of cosines form of the great-circle calculation, suffer from rounding errors.)
*If there are multiple P matches but no A matches, take the one that was arrived at first.
*If there are multiple A matches but no P matches, take the one that was arrived at first.
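
The reconciliation rules above can be illustrated with a minimal sketch. This is written in Python for illustration only and is not the Perl code in Match::GNS.pm. The function names, the (name, latitude, longitude) tuple layout, and the mean Earth radius of 6371 km are all assumptions made for this example. Match lists are assumed to be in the order in which the matches were found.

```python
import math
from itertools import product

# Assumed mean Earth radius in kilometres (not taken from the original script).
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def reconcile(p_matches, a_matches):
    """Pick one (P, A) result from lists of hypothetical (name, lat, lon) tuples.

    Both lists are assumed to be ordered by when each match was found.
    """
    if p_matches and a_matches:
        # Both kinds present: choose the P-A pair that is closest together.
        return min(
            product(p_matches, a_matches),
            key=lambda pair: haversine_km(pair[0][1], pair[0][2],
                                          pair[1][1], pair[1][2]),
        )
    if p_matches:
        # Only P matches: take the one found first.
        return (p_matches[0], None)
    if a_matches:
        # Only A matches: take the one found first.
        return (None, a_matches[0])
    return (None, None)
```

For example, calling <code>reconcile</code> with two P candidates and two A candidates returns the single pair whose members lie closest to each other, while calling it with an empty A list simply returns the first P candidate.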