Difference between revisions of "Start Up Address Finder Algorithm (Tool)"

Revision as of 12:21, 7 July 2016


McNair Project
Start Up Address Finder Algorithm (Tool)
Copyright © 2016 edegan.com. All Rights Reserved.


Description

Notes: The Start Up Address Finder Algorithm aims to provide street-level addresses for the startups contained in the Crunchbase database.

Input: Crunchbase company list

Output: Street-level addresses for every known company.

Algorithm

To be filled once project is completed

Development Notes

7/7: Project development notes

The company list was downloaded from Crunchbase.
The important fields provided were company name, URL, country, state, region, and city.
The data was analyzed to determine the total number of companies and the percentage of records in which country, state, region, or city was missing. The results are recorded in the image below.
Count 7-7-16.png
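
The completeness check described above could be sketched roughly as follows. This is only an illustration, not the spreadsheet procedure actually used: the column names in LOCATION_FIELDS, the helper missing_percentages, and the sample rows are all assumptions made up for the example, and a real Crunchbase export may be laid out differently.

```python
import csv
import io

# Hypothetical column names; a real Crunchbase export may differ.
LOCATION_FIELDS = ["country", "state", "region", "city"]


def missing_percentages(rows, fields=LOCATION_FIELDS):
    """Return (total rows, {field: percent of rows where the field is blank})."""
    rows = list(rows)
    total = len(rows)
    result = {}
    for field in fields:
        # A field counts as missing if it is absent, empty, or whitespace only.
        blank = sum(1 for row in rows if not (row.get(field) or "").strip())
        result[field] = 100.0 * blank / total if total else 0.0
    return total, result


# Made-up sample data, for illustration only.
sample = io.StringIO(
    "name,url,country,state,region,city\n"
    "Acme,http://acme.example,USA,WA,Seattle,Seattle\n"
    "Beta,http://beta.example,USA,,Seattle,\n"
    "Gamma,,,,,\n"
    "Delta,http://delta.example,USA,WA,,Bellevue\n"
)

total, pct = missing_percentages(csv.DictReader(sample))
print(total, pct)
```

To run it on a downloaded list, the StringIO sample would be replaced by an opened CSV file.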
Following this, the Seattle region was chosen as a test region.
The Seattle region contained 1070 companies.
Each company was assigned a random number using the Excel RANDBETWEEN() function.
The companies were then sorted by this number and assigned a new number reflecting their sorted order, for identification purposes.
The CORREL function was then applied to find the correlation between the random number and the assigned order number (expected to be close to 1, since the order number was assigned from the sorted random numbers).
The same function was then used to determine the correlation between these numbers and both funding total and funding date. The results are displayed below:
Correl 7-7-16.png
These values confirmed that the order of the list was randomized with respect to funding total and funding date.
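
The randomization check above can be reproduced outside Excel. The sketch below is an assumption-laden illustration: the funding totals are synthetic, the seed and the range passed to randint are arbitrary, and the pearson helper is written by hand to compute the same Pearson coefficient that CORREL returns. It does not reproduce the values shown in the image.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient, the statistic Excel's CORREL returns."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(2016)  # fixed seed so the sketch is repeatable

# Synthetic funding totals standing in for the 1070 Seattle companies.
companies = [
    {"name": f"company_{i}", "funding_total": rng.lognormvariate(14, 1.5)}
    for i in range(1070)
]

# Step 1: assign each company a random number (the RANDBETWEEN step).
for company in companies:
    company["rand"] = rng.randint(1, 1_000_000)

# Step 2: sort by the random number and assign an order number.
companies.sort(key=lambda c: c["rand"])
for order, company in enumerate(companies, start=1):
    company["order"] = order

rands = [c["rand"] for c in companies]
orders = [c["order"] for c in companies]
funding = [c["funding_total"] for c in companies]

# Step 3: the order number should track the random number closely,
# but should be unrelated to funding if the shuffle worked.
r_rand_order = pearson(rands, orders)
r_order_funding = pearson(orders, funding)
print(round(r_rand_order, 3), round(r_order_funding, 3))
```

A correlation near 1 between the random and order numbers, together with a correlation near 0 between the order number and funding, is the pattern the notes rely on.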
Following this, each company name was entered into Google to determine whether a street-level address could be found.
After a short test it was noted that Crunchbase itself contained a significant number of street-level addresses, so a larger sample was tested to see whether this was a common trend.
A total of 160 companies were tested:
140 had an address on Crunchbase.
Of the 20 that did not have an address on Crunchbase, 4 had an address on both LinkedIn and Bloomberg, 1 had an address only on LinkedIn, and 3 had an address only on Bloomberg.
This meant that 8 of the 20 companies could be found using LinkedIn and Bloomberg.
The remaining 12 companies were checked to see if they could be found using a whois search.
Of these 12, 11 had a URL.
Of these 11, 3 addresses were retrievable with a whois search, while 8 were not.
Of the 8 that could not be retrieved, 4 were protected by godaddy.com and could possibly be retrieved from there.
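
The counts above can be cross-checked with a short tally. The numbers are taken directly from these notes; the variable names and the overall coverage figure are derived here for illustration and are not stated in the original.

```python
# Counts reported in the notes for the 160-company sample.
sample_size = 160
on_crunchbase = 140
not_on_crunchbase = sample_size - on_crunchbase  # 20

# LinkedIn / Bloomberg results for the 20 companies without a Crunchbase address.
both_sources = 4
linkedin_only = 1
bloomberg_only = 3
found_linkedin_bloomberg = both_sources + linkedin_only + bloomberg_only  # 8

# Whois results for the companies still unresolved.
remaining = not_on_crunchbase - found_linkedin_bloomberg  # 12
with_url = 11
whois_found = 3
whois_not_found = with_url - whois_found  # 8
godaddy_protected = 4  # subset of the 8 that might still be recoverable

# Derived figure (not in the original notes): addresses found so far.
found_total = on_crunchbase + found_linkedin_bloomberg + whois_found
coverage = found_total / sample_size

print(f"found {found_total} of {sample_size} ({coverage:.1%})")
```

On these counts, 151 of the 160 sampled companies (about 94%) had an address from Crunchbase, LinkedIn, Bloomberg, or whois, before considering the 4 godaddy.com-protected records.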