400 bytes added ,  16:46, 8 July 2016
'''Output''':
Matched URL for each company in the CSV file.
 
==How to Use==
1) Set "path1" to the path of the input CSV file.
2) Set "out_path" to the location where all the downloaded JSON files should be saved.
3) Set "path2" to the path of the new output file.
4) Run the program.
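The setup steps above can be checked before a run with a small helper. This is only a sketch: the function name <code>prepare_paths</code> is hypothetical, and it assumes that "out_path" is a folder and that "path1" and "path2" are file paths.

```python
from pathlib import Path


def prepare_paths(path1: str, out_path: str, path2: str) -> None:
    """Validate the three configured locations before running the program.

    Assumption: out_path is a folder for JSON files; path1 and path2 are files.
    """
    input_csv = Path(path1)
    if not input_csv.is_file():
        raise FileNotFoundError(f"Input CSV not found: {input_csv}")

    # Create the folder for downloaded JSON files if it does not exist yet.
    Path(out_path).mkdir(parents=True, exist_ok=True)

    # Make sure the folder that will hold the output file exists.
    Path(path2).parent.mkdir(parents=True, exist_ok=True)
```

Calling the helper once at the top of the script makes a wrong input path fail immediately instead of partway through the downloads.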
==Development Notes==
*I use the <code>pandas</code> library to read and write CSV files, starting with the input CSV file. I then simplify the company names using several functions from the helper program, glink, which strip company identifiers such as "Co.", "INC." and "LLC." and put the names into a form that the Google Search API can handle.
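To illustrate the kind of name cleaning described above, here is a minimal sketch. It is not the code in glink: the function <code>strip_corp_suffix</code> and the suffix list are hypothetical stand-ins, and the real <code>glink.remCorp</code> may behave differently.

```python
import re

# Hypothetical suffix list; the real glink.remCorp may cover other identifiers.
CORP_SUFFIXES = ("co", "corp", "corporation", "inc", "incorporated", "llc", "ltd", "lp")

# A suffix only counts when it ends the name and follows a space or comma.
_SUFFIX_RE = re.compile(
    r"[\s,]+(?:" + "|".join(CORP_SUFFIXES) + r")\.?$",
    flags=re.IGNORECASE,
)


def strip_corp_suffix(name: str) -> str:
    """Remove trailing corporate identifiers and collapse extra whitespace."""
    cleaned = name.strip()
    # Repeat so that stacked suffixes such as "Acme Co., INC." are all removed.
    while True:
        shorter = _SUFFIX_RE.sub("", cleaned).rstrip(" ,")
        if shorter == cleaned or not shorter:
            break
        cleaned = shorter
    return re.sub(r"\s+", " ", cleaned)
```

A function like this could be applied to a column in the same way as the glink helpers, for example with <code>fec["newname"].map(strip_corp_suffix)</code>.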
 
*I then submit each company name to the Google Search API and collect a number of the URLs returned by the custom search. All of these URLs are saved to a JSON file.
<code>fec['name_clean'] = fec["newname"].map(glink.remCorp)
fec['download_status'] = fec['name_clean'].map(glink.gdownload)</code>
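For reference, a search request of this kind might be built as sketched below. This is not the code in <code>glink.gdownload</code>: the function name <code>build_search_url</code> is hypothetical, and the sketch assumes the Custom Search JSON API endpoint with its <code>key</code>, <code>cx</code>, <code>q</code> and <code>num</code> query parameters.

```python
from urllib.parse import urlencode

# Assumed endpoint of the Google Custom Search JSON API.
SEARCH_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(company: str, api_key: str, engine_id: str, num: int = 10) -> str:
    """Return the request URL for one company name (hypothetical helper)."""
    if not 1 <= num <= 10:
        # Assumption: the API returns at most 10 results per request.
        raise ValueError("num must be between 1 and 10")
    params = {"key": api_key, "cx": engine_id, "q": company, "num": num}
    # urlencode escapes spaces and punctuation in the company name.
    return SEARCH_ENDPOINT + "?" + urlencode(params)
```

Building the URL in one place keeps the API key and search engine ID out of the per-company logic; the values passed in here are placeholders to be supplied by the user.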
 
 
*I attempted to run the program on 1500 startup company names but ran into a KeyError with the JSON files. I am not able to access specific keys in each data
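One possible way to avoid a KeyError when reading the downloaded files is to look keys up defensively. The sketch below is an assumption, not a diagnosis of the problem above: it supposes that each JSON file holds a search response whose results sit in an <code>items</code> list with a <code>link</code> field per result, and that a response without results may omit <code>items</code> entirely. The function name <code>extract_links</code> is hypothetical.

```python
import json
from typing import List


def extract_links(raw_json: str) -> List[str]:
    """Return the result URLs from one saved response, or [] if there are none."""
    data = json.loads(raw_json)
    # dict.get avoids a KeyError when the response has no "items" key.
    items = data.get("items", [])
    # Skip any result that lacks a "link" field instead of failing on it.
    return [item["link"] for item in items if "link" in item]
```

With this approach a company with no search results yields an empty list, which can be recorded in the output file rather than stopping the run.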