Start Up Address Finder Algorithm (Tool)
Start Up Address Finder Algorithm (Tool) | |
---|---|
Project Information | |
Project Title | Start Up Address Finder Algorithm (Tool) |
Owner | Jake Floyd |
Start Date | Summer 2016 |
Deadline | |
Keywords | Tool |
Primary Billing | |
Notes | |
Has project status | Complete |
Copyright © 2016 edegan.com. All Rights Reserved. |
Description
Notes: The Start Up Address Finder Algorithm aims to provide street-level addresses for the startups contained in the Crunchbase database.
Input: Crunchbase company list (which includes a large amount of information about each company, including some funding and founding information)
Output: Street-level addresses for every known company.
Algorithm
To be filled once project is completed
Development Notes
7/7: Project development Notes: Beginning
- Company list downloaded from Crunchbase.
- The list provides key information, including company name, URL, country, state, region, and city.
- The data was analyzed to determine the total number of companies, as well as the percentage of records in which the country, state, region, and city fields were incomplete. These figures were recorded in the image below.
- Following this, the Seattle region was chosen as a test region.
- The Seattle region contained 1070 companies.
- Each company was assigned a random number using the RANDBETWEEN() Excel function.
- The companies were then sorted by this number (and assigned a new number, based on their new order, for identification purposes)
- The CORREL function was then applied to find the correlation between the random number and the assigned number (expected to be close to 1, since the assigned number follows the sort order of the random number)
- The same function was then used to determine the correlation between these numbers and the funding total and funding date. The results are displayed below:
- These values made us confident that the order of the list was randomized
- Following this, each company name was entered into Google to determine whether a street-level address could be found
- After a short test, it was noted that Crunchbase itself contained a significant number of street-level addresses; this observation was then tested to see whether it was a common trend
- A total of 160 companies were tested:
- 140 had addresses on Crunchbase
- Of the 20 that did not have addresses on Crunchbase, 4 had an address on both LinkedIn and Bloomberg, 1 had an address only on LinkedIn, and 3 had an address only on Bloomberg
- This meant that 8 of the 20 companies could be found using LinkedIn and Bloomberg
- The remaining 12 companies were checked to see if they could be found using a WHOIS search
- Of these 12, 11 had a URL
- Of these 11, 3 addresses were retrievable with a WHOIS search, while 8 were not
- Of the 8 that could not be retrieved, 4 were protected by godaddy.com and could possibly be retrieved from there, and 1 had a .io URL that could not be found
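The randomization check and the source-by-source tally above can be sketched in Python. This is a minimal illustration, not the spreadsheet actually used. The function names (pearson, shuffle_with_order, order_correlations, cumulative_coverage) and the field names (random_key, funding_total) are hypothetical, the company rows are synthetic stand-ins, and random.random() is substituted for Excel's RANDBETWEEN(). The only figures taken from the notes are the 1070-company list size and the 160-company test counts.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient (the statistic Excel's CORREL returns)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def shuffle_with_order(companies, seed=None):
    """Give each company a random key, sort by it, and number the new order."""
    rng = random.Random(seed)
    keyed = [(rng.random(), company) for company in companies]
    keyed.sort(key=lambda pair: pair[0])
    return [
        {**company, "random_key": key, "order": position}
        for position, (key, company) in enumerate(keyed, start=1)
    ]


def order_correlations(rows, fields):
    """Correlate the assigned order with the random key and with each field."""
    order = [row["order"] for row in rows]
    names = ["random_key", *fields]
    return {name: pearson(order, [row[name] for row in rows]) for name in names}


def cumulative_coverage(total, found_per_source):
    """Running share of companies with an address after each source is tried."""
    coverage, found = [], 0
    for source, count in found_per_source:
        found += count
        coverage.append((source, found / total))
    return coverage


if __name__ == "__main__":
    # Synthetic stand-in data (assumption): funding grows with the row index,
    # so the unshuffled list is perfectly correlated with funding.
    companies = [
        {"name": f"company_{i}", "funding_total": 1000 * i} for i in range(1070)
    ]
    rows = shuffle_with_order(companies, seed=2016)
    for name, value in order_correlations(rows, ["funding_total"]).items():
        print(f"order vs {name}: {value:+.3f}")

    # Counts taken from the 160-company test in the notes above.
    sources = [("Crunchbase", 140), ("LinkedIn/Bloomberg", 8), ("WHOIS", 3)]
    for source, share in cumulative_coverage(160, sources):
        print(f"after {source}: {share:.1%}")
```

If the shuffle worked, the correlation between the assigned order and the random key should be close to 1, while the correlation with funding should be close to 0. For the tally, the counts in the notes give 140/160 = 87.5% coverage from Crunchbase alone, 148/160 = 92.5% after adding LinkedIn and Bloomberg, and 151/160 (about 94.4%) after adding WHOIS.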
7/?: Project development Notes (cont'd)
- To be filled for next section