The most recent update to [[Accelerator Seed List (Data)]] was made on 05/21/2018. This update included the most recent '''master file''' of accelerator data, found at:
E:\McNair\Projects\Accelerators\Summer 2018\Accelerator Master Variable List - Revised by EdV2.xlsx
The Google Sheets Master Sheet (OUTDATED) is found here:
https://docs.google.com/spreadsheets/d/1ikuxYwp9JIRrjz4qQcbdwTpbHOne-q2PterYTjzofjw/edit?ts=5aa2f1f9#gid=0
The '''Unfound Founders''' file is accurate in that it codes a 0 for all companies '''''not listed''''' within the LinkedIn Founders file, and a 1 for those that do have founders listed.
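
The 0/1 coding described above can be reproduced with a simple membership check. This is a minimal sketch on made-up data; the column names (<code>company</code>, <code>founder</code>, <code>has_founders</code>) and the example rows are assumptions for illustration, not the actual layout of the Unfound Founders or LinkedIn Founders files.

```python
import pandas as pd

# Hypothetical stand-in for the full list of companies.
companies = pd.DataFrame({"company": ["Startup A", "Startup B", "Startup C"]})

# Hypothetical stand-in for the LinkedIn Founders file (one row per founder).
linkedin_founders = pd.DataFrame({
    "company": ["Startup A", "Startup A", "Startup C"],
    "founder": ["Founder 1", "Founder 2", "Founder 3"],
})

# 1 if the company has at least one founder listed, 0 otherwise.
companies["has_founders"] = (
    companies["company"].isin(linkedin_founders["company"]).astype(int)
)

print(companies)
```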
==Moving Forward==
Acquiring the necessary data to complete the Accelerator Master Variable List and the Cohort List will require the following (not necessarily in this order):
===Step One: LinkedIn Founders Data===
Find the names of accelerator founders using Crunchbase (reference [[Crunchbase Data]], [[Crunchbase Accelerator Data]], and [[Crunchbase Accelerator Equity]]). This will require data from [[Grace Tan]] and [[Maxine Tao]].
Given the founders' names, we will then be able to use the [[LinkedIn Crawler (Python)]] to find the relevant details of each accelerator founder (education, work experience, etc.). This data on founders will help us address the horse, jockey, racetrack question: which variables (the accelerator, the founders, or the environment/city) affect a startup's success.
===Step Two: Linking Accelerators to Cohorts Using Investments on Crunchbase===
In this step we focus on accelerators that take equity from the companies that participate in their programs. We do this to avoid looking at accelerators that also run funds or invest in various companies but do not take equity. Including them would give us misleading results and lead us to believe that some companies were in cohorts at accelerators when they were not.
Maxine will acquire the list of accelerators that take equity from companies from the following file:
\bulk\McNair\Projects\Accelerators\All Relevant Files\accelerator_data_noflag.txt
A look at the file, however, shows that very few accelerators are categorized well and that the equity variable is messy. Moving forward, we need to check, refine, and fix this classification.
This file has 266 rows. The most recent, actual version of our accelerator database (found at McNair\Projects\Accelerators\Summer 2018\Accelerator Master List - Revised by Ed V2.xlsx under the sheet Master Variable List) has only 167 rows, so accelerator_data_noflag.txt has 99 more rows than the master list.
We will need to do a left join of Accelerator Master List with accelerator_data_noflag.txt, keeping every row of Accelerator Master List, to get rid of the accelerator names that are in accelerator_data_noflag but NOT in Accelerator Master List.
Once this is finished, we should have an “Equity” classification variable for every accelerator in Accelerator Master List. The accelerators that have a Y (or maybe it’s a 1) are the ones that do take equity. These are the accelerators we’ll be able to do the Crunchbase work on to see when they take equity.
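
The left join described above could be sketched in pandas roughly as follows. This is only an illustration on made-up data: the column names (<code>accelerator</code>, <code>equity</code>, <code>key</code>), the example accelerator names, and the name-normalization step are all assumptions, since the real files may use different headers and may need more careful name matching.

```python
import pandas as pd

# Hypothetical stand-in for the Master Variable List sheet.
master = pd.DataFrame({"accelerator": ["Alpha Labs", "Beta Works", "Gamma Hub"]})

# Hypothetical stand-in for accelerator_data_noflag.txt.
noflag = pd.DataFrame({
    "accelerator": ["alpha labs ", "Gamma Hub", "Delta Fund"],
    "equity": ["Y", "N", "Y"],
})

def normalize(names: pd.Series) -> pd.Series:
    """Trim whitespace and lowercase names so trivial differences still match."""
    return names.str.strip().str.lower()

master["key"] = normalize(master["accelerator"])
noflag["key"] = normalize(noflag["accelerator"])

# Drop duplicate keys so the join cannot multiply master rows.
noflag_unique = noflag.drop_duplicates("key")[["key", "equity"]]

# Left join: every master row is kept; rows only in noflag are dropped.
merged = master.merge(noflag_unique, on="key", how="left", indicator=True)

# Accelerators flagged as taking equity (the flag may be coded Y or 1).
takes_equity = merged.loc[merged["equity"].isin(["Y", "1", 1]), "accelerator"].tolist()

# Master accelerators with no match, which still need an equity classification.
unmatched = merged.loc[merged["_merge"] == "left_only", "accelerator"].tolist()

print(takes_equity)
print(unmatched)
```

The <code>indicator=True</code> option adds a <code>_merge</code> column, which makes it easy to list the master accelerators that found no match and therefore have to be classified by hand.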
We then look at the accelerators' investments (or at companies and the entities that invested in them) and cross-reference the list of companies against the list of accelerators. Once we find a match, we know that a company went through an accelerator and in which year it went through a cohort.
From this, we get the following data:
*Accelerator a given company went through
*Year said company went through a cohort/Specific cohort company went through
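
The cross-referencing step could look something like the sketch below. The investment table, its column names (<code>investor</code>, <code>company</code>, <code>funded_on</code>), and the example rows are hypothetical and do not reflect the actual Crunchbase export; the sketch also assumes the year of the accelerator's investment is the cohort year.

```python
import pandas as pd

# Hypothetical set of equity-taking accelerators (normalized names).
equity_accelerators = {"alpha labs"}

# Hypothetical stand-in for Crunchbase investment records.
investments = pd.DataFrame({
    "investor": ["Alpha Labs", "Some VC", "Alpha Labs"],
    "company": ["Startup A", "Startup A", "Startup B"],
    "funded_on": ["2015-06-01", "2016-03-15", "2017-01-20"],
})

# Normalize investor names the same way as the accelerator list.
investments["investor_key"] = investments["investor"].str.strip().str.lower()

# Keep only investments made by an equity-taking accelerator.
matches = investments[investments["investor_key"].isin(equity_accelerators)].copy()

# Assumption: the investment year approximates the cohort year.
matches["cohort_year"] = pd.to_datetime(matches["funded_on"]).dt.year

cohorts = (
    matches[["company", "investor", "cohort_year"]]
    .rename(columns={"investor": "accelerator"})
    .reset_index(drop=True)
)

print(cohorts)
```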