Difference between revisions of "URL Finder (Tool)"

From edegan.com
Revision as of 15:15, 8 July 2016


McNair Project
URL Finder (Tool)
Copyright © 2016 edegan.com. All Rights Reserved.


Description

Notes: The URL Finder Tool is an automated program that locates, retrieves, and matches URLs to their corresponding startup companies using the Google API. It was developed in Python 2.7.

Input: CSV file containing a list of startup company names

Output: Matched URL for each company in the CSV file.

Development Notes

7/7: Project start

  • I am using the pandas library to read and write the input CSV files. From there, I simplify the company names using several functions from the helper program, glink, which strip company identifiers such as "Co.", "INC.", and "LLC." and put the names in a form that the Google Search API can accept.

fec['name_clean'] = fec["newname"].map(glink.remCorp)

fec['download_status'] = fec['name_clean'].map(glink.gdownload)
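The name-cleaning step can be sketched roughly as follows. The source does not show the body of glink.remCorp, so the function rem_corp and its identifier list below are hypothetical stand-ins, not the actual glink code. With pandas, a function like this would be passed to the column's map method, as in the two lines above.

```python
import re

# Hypothetical list of company identifiers; the real glink.remCorp
# may recognize a different set.
CORP_SUFFIXES = ["co", "corp", "inc", "llc", "ltd"]

# Match one trailing identifier, preceded by whitespace or a comma
# and optionally followed by a period.
_SUFFIX_RE = re.compile(
    r"[\s,]+(?:" + "|".join(CORP_SUFFIXES) + r")\.?\s*$",
    re.IGNORECASE,
)


def rem_corp(name):
    """Strip trailing company identifiers and tidy whitespace."""
    cleaned = name.strip()
    # Repeat until nothing changes, so "Acme Co., Inc." loses both identifiers.
    while True:
        stripped = _SUFFIX_RE.sub("", cleaned)
        if stripped == cleaned:
            break
        cleaned = stripped
    return " ".join(cleaned.split())
```

Because the pattern requires whitespace or a comma before the identifier, a name that merely ends in the same letters (for example "Costco") is left unchanged.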

  • Attempted to run the program on 1500 startup company names but ran into a KeyError with the JSON files. I am not able to access specific keys in each data

7/8:

  • Added conditionals that check for keys in the JSON dictionaries before accessing them. Successfully ran the tool on my 50 companies and then again on 1500 companies. Changed the ratio cutoff to .75 and higher to accept URLs that were close but not exact, which returned more results.
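The two changes in this entry can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool's actual code: the response layout (an "items" list whose entries carry a "link" key), the helper names extract_links, domain_label, and best_match, and the use of difflib.SequenceMatcher to compute the ratio are all hypothetical. The notes only say that key conditionals were added and that the ratio cutoff was set to .75.

```python
import difflib

try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2.7

RATIO_CUTOFF = 0.75  # cutoff mentioned in the notes


def extract_links(response):
    """Return result links, skipping entries that lack the expected keys."""
    links = []
    # Assumed layout: {"items": [{"link": "..."}, ...]}.
    # Checking for each key first avoids a KeyError when it is absent.
    if "items" in response:
        for item in response["items"]:
            if "link" in item:
                links.append(item["link"])
    return links


def domain_label(url):
    """Return the first label of the host, e.g. 'acme' for www.acme.com."""
    host = urlparse(url).netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host.split(".")[0]


def best_match(company, links, cutoff=RATIO_CUTOFF):
    """Return the link whose domain label is most similar to the name,
    or None if no link reaches the cutoff."""
    target = "".join(company.lower().split())
    best_url, best_ratio = None, 0.0
    for url in links:
        ratio = difflib.SequenceMatcher(None, target, domain_label(url)).ratio()
        if ratio >= cutoff and ratio > best_ratio:
            best_url, best_ratio = url, ratio
    return best_url
```

With a cutoff below 1.0, a name such as "Acme Widgets" can still match a domain like acmewidget.com, which is close but not exact. A response with no "items" key simply yields an empty list instead of raising a KeyError.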