Then run [[Geocode.py]] on the output file, which should first be split into batches of up to 2500 address queries each:
python3 Geocode.py companyneedsgeo1-2499.txt
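The batching step can be sketched as follows. This is a minimal sketch, not part of Geocode.py: the function names are hypothetical, it assumes one address query per line with no header row, and the batch size of 2499 is inferred from the example file name above.

```python
from pathlib import Path


def split_into_batches(lines, batch_size=2499):
    """Yield (first, last, batch), where first/last are 1-based inclusive line numbers."""
    for start in range(0, len(lines), batch_size):
        batch = lines[start:start + batch_size]
        yield start + 1, start + len(batch), batch


def write_batches(input_path, prefix="companyneedsgeo", batch_size=2499):
    """Write each batch to <prefix><first>-<last>.txt and return the file names."""
    lines = Path(input_path).read_text(encoding="utf-8").splitlines()
    written = []
    for first, last, batch in split_into_batches(lines, batch_size):
        out = Path(f"{prefix}{first}-{last}.txt")
        out.write_text("\n".join(batch) + "\n", encoding="utf-8")
        written.append(out.name)
    return written
```

Each file produced this way can then be passed to Geocode.py in turn.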
===Matching===
Generally, each list of names should be matched against itself first. Matching should be done using mode=2:
perl .\Matcher.pl -file1="DistinctConame.txt" -file2="DistinctConame.txt" -mode=2
perl .\Matcher.pl -mode=2 -file1="DistinctTargetName.txt" -file2="DistinctTargetName.txt"
perl .\Matcher.pl -mode=2 -file1="IPODistinctIssuer.txt" -file2="IPODistinctIssuer.txt"
Then match between the datasets and review the results:
perl .\Matcher.pl -mode=2 -file1="PortCoMatchInput.txt" -file2="MAMatchInput.txt"
The M&A review proceeds as follows (record counts in parentheses; 10406 matches in total):
*Take a match when all three conditions hold: the match type is Hall (not Multi), datefirstinv < announceddate, and the two statecodes agree. (8064)
*Throw out a match when the statecodes differ or when announceddate < datefirstinv. (1879)
*Of the remaining matches (463), take the one with the minimum announced date.
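The three review rules can be sketched in code. This is only an illustration under stated assumptions: the dictionary keys (name, matchtype, port_state, ma_state, datefirstinv, announceddate) are hypothetical names for the matcher output columns, and "take the min date announced" is assumed to mean the earliest announced date per portfolio company (name plus state).

```python
from datetime import date


def review(rows):
    """Split matches into (take, drop, kept), following the three review rules."""
    take, drop, remaining = [], [], []
    for r in rows:
        same_state = r["port_state"] == r["ma_state"]
        if (r["matchtype"] == "Hall" and same_state
                and r["datefirstinv"] < r["announceddate"]):
            take.append(r)            # rule 1: all three conditions hold
        elif not same_state or r["announceddate"] < r["datefirstinv"]:
            drop.append(r)            # rule 2: wrong state or acquired before investment
        else:
            remaining.append(r)       # rule 3 candidates

    # Rule 3: per company (assumed key: name + state), keep the earliest announcement.
    earliest = {}
    for r in remaining:
        key = (r["name"], r["port_state"])
        if key not in earliest or r["announceddate"] < earliest[key]["announceddate"]:
            earliest[key] = r
    return take, drop, list(earliest.values())
```

Note that a match with equal dates fails both the first and second rule and therefore ends up in the remaining group.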
This can be done faster with SQL than in Excel. Be aware that the join back must include statecode, to keep apart similarly named companies in different states such as:
Mobile Technologies LLC Mobile Technologies PA 4/28/2011 8/12/2013
Mobile Technology Inc. Mobile Technology Inc CA 12/1/1985 6/28/1990
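The effect of joining with and without statecode can be seen in a small sketch using Python's built-in sqlite3 module. The table layout and column names are hypothetical, as is the shared matchkey value, which stands in for whatever name the matcher treats as equal. It is also assumed that the first date in each row above is datefirstinv and the second is the announced date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE portco (matchkey TEXT, coname TEXT, statecode TEXT, datefirstinv TEXT);
CREATE TABLE ma (matchkey TEXT, targetname TEXT, statecode TEXT, announceddate TEXT);
-- 'mobile tech' is a made-up key standing in for the matched name
INSERT INTO portco VALUES
  ('mobile tech', 'Mobile Technologies LLC', 'PA', '2011-04-28'),
  ('mobile tech', 'Mobile Technology Inc.', 'CA', '1985-12-01');
INSERT INTO ma VALUES
  ('mobile tech', 'Mobile Technologies', 'PA', '2013-08-12'),
  ('mobile tech', 'Mobile Technology Inc', 'CA', '1990-06-28');
""")

# Joining on the matched name alone pairs every PA row with every CA row.
name_only = conn.execute("""
    SELECT p.coname, m.targetname
    FROM portco p JOIN ma m ON p.matchkey = m.matchkey
""").fetchall()

# Adding statecode to the join keeps only the same-state pairs.
with_state = conn.execute("""
    SELECT p.coname, m.targetname
    FROM portco p JOIN ma m
      ON p.matchkey = m.matchkey AND p.statecode = m.statecode
    ORDER BY p.statecode
""").fetchall()

print(len(name_only), len(with_state))
```

With these four rows, the name-only join returns four pairs and the join that includes statecode returns two.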
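Also, there is an issue with multiple matches not being flagged as such. For example, both rows below are labelled Hall even though two different records (dated 3/1/2006 and 2/3/2003) match the same counterpart: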
ARCA BIOPHARMA Inc ARCA biopharma Inc Hall ARCA BIOPHARMA Inc ARCA BIOPHARMA Inc CO 3/1/2006 ARCA biopharma Inc ARCA biopharma Inc CO 9/25/2008 1 1 1 3 0 1
ARCA BIOPHARMA Inc ARCA biopharma Inc Hall ARCA BIOPHARMA Inc ARCA Biopharma Inc CO 2/3/2003 ARCA biopharma Inc ARCA biopharma Inc CO 9/25/2008 1 1 1 3 0 1
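One way to find such cases is to group the Hall rows by the record on one side and count the distinct records on the other. The following is a sketch only: the key names are hypothetical, and it assumes that the first name/state/date group in each row is the portfolio company and the second is the M&A target.

```python
from collections import defaultdict


def hidden_multiples(rows):
    """Return targets matched by more than one distinct portfolio record under a Hall label."""
    groups = defaultdict(set)
    for r in rows:
        if r["matchtype"] == "Hall":
            target = (r["ma_name"], r["ma_state"], r["announceddate"])
            groups[target].add((r["port_name"], r["port_state"], r["datefirstinv"]))
    # Only targets with two or more distinct portfolio records are hidden multiples.
    return {t: sorted(s) for t, s in groups.items() if len(s) > 1}


rows = [
    {"matchtype": "Hall", "port_name": "ARCA BIOPHARMA Inc", "port_state": "CO",
     "datefirstinv": "3/1/2006", "ma_name": "ARCA biopharma Inc",
     "ma_state": "CO", "announceddate": "9/25/2008"},
    {"matchtype": "Hall", "port_name": "ARCA Biopharma Inc", "port_state": "CO",
     "datefirstinv": "2/3/2003", "ma_name": "ARCA biopharma Inc",
     "ma_state": "CO", "announceddate": "9/25/2008"},
]
flagged = hidden_multiples(rows)
```

Rows flagged this way can then go through the same review as matches that are labelled Multi.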