Changes

Anne Freeman

Revision as of 15:12, 29 May 2019

350 bytes added , 15:12, 29 May 2019


== AngelList Data ==

Using Selenium, I created two crawlers for the AngelList database. The first searches by the keyword "incubator" and the state; the second searches for companies with the type "incubator" and the state. Each crawler clicks the "more" button at the bottom of the page until all of the results are visible and then saves them to a tab-separated text file. I performed a diff on the two sets of results to create a masterFile containing only unique entries. I then used Selenium to open each incubator's URL on the AngelList website and download the page to a local folder. Finally, I used BeautifulSoup to parse the static HTML files for information on the company, the employees, and the portfolio. There is more information on this process on the [[AngelList Database]] page.
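The step that combines the two result files into a masterFile of unique entries could look roughly like the sketch below. It is not the original script: the function names `merge_unique` and `write_master`, and the choice of which column identifies an entry, are assumptions made for illustration.

```python
import csv


def merge_unique(paths, key_column=0):
    """Merge tab-separated result files, keeping the first row seen per key.

    key_column is an assumption: it should point at whichever column
    uniquely identifies an incubator (for example its AngelList URL).
    """
    seen = set()
    unique_rows = []
    for path in paths:
        with open(path, newline="", encoding="utf-8") as handle:
            for row in csv.reader(handle, delimiter="\t"):
                # Skip blank lines and rows too short to contain the key.
                if len(row) <= key_column:
                    continue
                key = row[key_column].strip().lower()
                if key not in seen:
                    seen.add(key)
                    unique_rows.append(row)
    return unique_rows


def write_master(rows, out_path):
    """Write the merged rows back out as a tab-separated file."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle, delimiter="\t").writerows(rows)
```

Calling `write_master(merge_unique([keyword_file, type_file], key_column=1), master_path)` with hypothetical file paths would produce one file in which each entry appears once, in the order it was first seen.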

== Things that still need work ==

The Selenium Google crawler loads the search URLs directly rather than typing the query into Google and hitting enter. It also collects the first page of results 10 times rather than advancing to the next page each time.

The AngelList Data script that parses the employees is not collecting information from all of the incubators. The script needs to be adjusted.
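One possible way to stop collecting the same page repeatedly is to request a different result offset for each page. The sketch below is only an illustration and not the existing crawler: the function name `google_result_urls` is hypothetical, and it assumes Google's search URL accepts a `start` offset alongside the `q` query parameter. It still loads URLs directly, so it addresses only the repeated-page problem.

```python
from urllib.parse import urlencode


def google_result_urls(query, pages=10, per_page=10):
    """Build one search URL per results page.

    Assumption: the `start` parameter is the zero-based index of the
    first result on the page, so page n starts at n * per_page.
    """
    urls = []
    for page in range(pages):
        params = {"q": query, "start": page * per_page}
        urls.append("https://www.google.com/search?" + urlencode(params))
    return urls
```

Each URL in the returned list could then be passed to the Selenium driver's `get` method in turn, so that every iteration fetches a different page of results.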

