==Retrieve Data from URLS Generated==
We wrote a web crawler that
# reads the csv file containing the URLs to scrape into a pandas dataframe
# rewrites each url by replacing ''?c=companyprofile&'' with ''companyprofile?'' and prepending the domain http://exchange.inbia.org/network/findacompany
# opens each url and extracts information using an element tree parser
# collects the information from each url and stores it in a tab-separated txt file
The crawler generates a tab-separated text file called INBIA_data.txt containing [company_name, street_address, city, state, zipcode, country, website, contact_person], populated with information from the 415 entries in the database.
The txt file and the Python script (inbia_scrape.py) are located in E:\projects\Kauffman Incubator Project\01 Classify entrepreneurship ecosystem organizations\INBIA
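For reference, below is a minimal sketch of the workflow described above. It is not the actual inbia_scrape.py: the input file name (urls.csv), its column name (url), the XPath expressions, and the use of the requests and lxml libraries are assumptions made for illustration.

<syntaxhighlight lang="python">
"""Sketch of the INBIA crawler workflow; details differ from inbia_scrape.py."""
import csv

import pandas as pd
import requests
from lxml import etree

DOMAIN = "http://exchange.inbia.org/network/findacompany/"
FIELDS = ["company_name", "street_address", "city", "state",
          "zipcode", "country", "website", "contact_person"]


def fix_url(raw):
    """Rewrite '?c=companyprofile&' as 'companyprofile?' and prepend the domain."""
    return DOMAIN + raw.replace("?c=companyprofile&", "companyprofile?")


def scrape(url):
    """Fetch one company profile page and pull out the fields with an element tree parser."""
    page = requests.get(url, timeout=30)
    tree = etree.HTML(page.content)

    def text(xpath):
        # Placeholder XPath queries; the real profile-page structure differs.
        hits = tree.xpath(xpath)
        return hits[0].strip() if hits else ""

    return {
        "company_name":   text("//h1/text()"),
        "street_address": text("//*[@class='street']/text()"),
        "city":           text("//*[@class='city']/text()"),
        "state":          text("//*[@class='state']/text()"),
        "zipcode":        text("//*[@class='zip']/text()"),
        "country":        text("//*[@class='country']/text()"),
        "website":        text("//a[@class='website']/@href"),
        "contact_person": text("//*[@class='contact']/text()"),
    }


def main():
    urls = pd.read_csv("urls.csv")  # step 1: read the URL list into a dataframe
    with open("INBIA_data.txt", "w", newline="", encoding="utf-8") as out:
        writer = csv.DictWriter(out, fieldnames=FIELDS, delimiter="\t")
        writer.writeheader()
        for raw in urls["url"]:     # steps 2-4: rewrite, fetch, parse, write
            writer.writerow(scrape(fix_url(raw)))


if __name__ == "__main__":
    main()
</syntaxhighlight>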