|Has owner=Shrey Agarwal, Matthew Ringheanu, Veeral Shah, Connor Rothschild,
|Has start date=Fall 2016
|Has keywords=Accelerators,Data
|Has project status=ActiveSubsume
|Is dependent on=Industry Classifier
}}
=Current Work=
===As of 05/21/2018, the Google Sheet workbook has been downloaded to the E drive. The resulting Excel workbook is saved at E:\McNair\Projects\Accelerators\Summer 2018\Accelerator Master Variable List.xlsx and is now the master file.===
Google Master Sheet: https://docs.google.com/spreadsheets/d/1ikuxYwp9JIRrjz4qQcbdwTpbHOne-q2PterYTjzofjw/edit?ts=5aa2f1f9#gid=0
*Cross-reference sheet with data from Peter's old accelerator consolidation file ("accelerator_data_noflag" and "accelerator_data" in "All Relevant Files") and fill in missing data
A 0 means we don't have founder data for that accelerator.
Specs: A tab delimited text file with the following fields:
Accelerator First Name Last Name LinkedInURL(if possible)
Including the LinkedInURL will ensure accuracy, but the process will work without it.
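A minimal sketch of reading a file in the format specified above, assuming there is no header row and that the LinkedIn URL column may be missing or empty. The function name <code>read_founders</code>, the field keys, and the sample row are hypothetical and only for illustration.

```python
import csv
import io

# Column order taken from the spec above; the key names are assumptions.
FIELDS = ["accelerator", "first_name", "last_name", "linkedin_url"]

def read_founders(handle):
    """Parse tab-delimited founder rows; the LinkedIn URL is optional."""
    rows = []
    for record in csv.reader(handle, delimiter="\t"):
        if not any(cell.strip() for cell in record):
            continue  # skip blank lines
        # Pad short rows so a missing LinkedInURL becomes an empty string.
        padded = (record + [""] * len(FIELDS))[:len(FIELDS)]
        rows.append(dict(zip(FIELDS, (cell.strip() for cell in padded))))
    return rows

# Hypothetical sample data, for illustration only.
sample = io.StringIO("Example Accelerator\tJane\tDoe\n")
founders = read_founders(sample)
```

Padding short rows keeps records without a LinkedIn URL usable, which matches the note that the URL is optional.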
*Shrey: Find "demo day" keywords, so that we can search AcceleratorName Year Keyword and get back potential demo day pages
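The "AcceleratorName Year Keyword" search strings could be generated along these lines. This is a sketch only: the keyword list is a placeholder (the real demo day keywords are still to be found), and <code>build_queries</code> is a hypothetical helper name.

```python
from itertools import product

# Placeholder keywords; the actual list is still to be determined.
DEMO_DAY_KEYWORDS = ["demo day", "pitch day"]

def build_queries(accelerator, years, keywords=DEMO_DAY_KEYWORDS):
    """Return one 'AcceleratorName Year Keyword' query per year/keyword pair."""
    return [
        f"{accelerator} {year} {keyword}"
        for year, keyword in product(years, keywords)
    ]

queries = build_queries("Techstars", [2015, 2016])
```

Each resulting string can then be passed to whichever search tool is used to find candidate demo day pages.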
==Accelerator Type project==
File to edit is called "Accelerator type list". Located in the folder E:\McNair\Projects\Accelerators\Spring 2018\Grouping project of ListOfAccs. More systematic information and instructions are in "Instructions for Accelerator type project" in E:\McNair\Projects\Accelerators\Spring 2018\Grouping project of ListOfAccs.
NOTE: until we get through all 270 accelerators, we will just categorize each accelerator into the following three categories as quickly as possible, with short notes in the "other info" column for these; once we have this, we will go back through the ones that aren't categorized and add notes to the "other info" column.
Type list:
*Private
*Corporate
*Academic
Note: if DEAD, noted here.
Other info:
*nonprofit? (y/n)
*Subtype abbreviations:
**S: for if a social entrepreneurship initiative
**I: for if an incubator
**A: for an angel group
**F: for foreign
**C: for in coworking space/hub/etc
**V: for if part of venture fund
**G: for if government funded/partnered
**T: for international
Note: subtypes (from individual text files in E:\McNair\Projects\Accelerators\Spring 2017\Code+Final_Data) were only found for 23 of the 270 accelerators. These accelerators were initially intended to be removed from the master list. Remaining subtypes are currently being added.
other info:
international offices, founders, industries, org type, program duration, or other interesting, easily accessed variables. Additional information is especially important for accelerators that have no other subtype abbreviation listed.
===Steps to research an accelerator===
1. Copy/paste URL listed in Accelerator type list file into google. If website is insufficient, try googling:
the name of the accelerator
the name of the accelerator + "crunchbase"
the name of the accelerator + "nonprofit"
the above steps sometimes lead to other helpful databases/news articles
2. Note whether:
1) Academic/Corporate/Private
2) For Profit/Nonprofit. Sometimes this isn't directly stated but can be inferred from their description of, say, their investment process. If they don't address this at all, it's probably For Profit.
3) subtype (S, I, A, F, C, V, G, T).
4) Additional, easily-accessed info. Number 4 is really important if there's no subtype.
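The checks in step 2 can be sketched as a small validator for one row of the type list. The column layout (a type, a y/n nonprofit flag, comma-separated subtype codes, and free-text other info) is an assumption, and <code>check_entry</code> is a hypothetical helper, not an existing tool.

```python
# Types and subtype codes as listed on this page.
TYPES = {"Private", "Corporate", "Academic"}
SUBTYPES = {
    "S": "social entrepreneurship initiative",
    "I": "incubator",
    "A": "angel group",
    "F": "foreign",
    "C": "coworking space/hub",
    "V": "part of venture fund",
    "G": "government funded/partnered",
    "T": "international",
}

def check_entry(acc_type, nonprofit, subtypes="", other_info=""):
    """Return a list of problems with one row (an empty list means it looks fine)."""
    problems = []
    if acc_type not in TYPES:
        problems.append(f"unknown type: {acc_type!r}")
    if nonprofit not in ("y", "n"):
        problems.append(f"nonprofit must be 'y' or 'n', got {nonprofit!r}")
    # Assumes subtype codes are comma-separated, e.g. "I, V".
    codes = [c.strip().upper() for c in subtypes.split(",") if c.strip()]
    for code in codes:
        if code not in SUBTYPES:
            problems.append(f"unknown subtype: {code!r}")
    # Other info matters most when no subtype is given.
    if not codes and not other_info.strip():
        problems.append("no subtype given, so 'other info' should be filled in")
    return problems
```

For example, <code>check_entry("Private", "n", "I, V")</code> returns an empty list, while a row with no subtype and no other info is flagged.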
All 270 need to be done by the end of the semester.
Type list file saved as
"Accelerator type list" in E:\McNair\Projects\Accelerators\Spring 2017\Grouping project of ListOfAccs.
ListOfAccs, from which we drew the Accelerator type list, should have no matches with any of the flagged accelerators in E:\McNair\Projects\Accelerators\Spring 2017\Code+Final_Data. There are 23 matches, though, so all subtypes must be searched and entered manually. Nonprofit status for some accelerators is recorded in the file called "whether nonprofit..." in E:\McNair\Projects\Accelerators\Spring 2017\Grouping project of ListOfAccs. Accelerators with no nonprofit information there need to have it entered manually.
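A check for overlap between the two lists could look like the sketch below. It assumes both lists are available as plain lists of accelerator names and that matching on a normalized name is good enough; <code>normalize</code> and <code>find_overlap</code> are hypothetical helpers, and the names in the example are made up.

```python
import re

def normalize(name):
    """Lowercase a name and collapse punctuation and whitespace for comparison."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()

def find_overlap(list_of_accs, flagged):
    """Return names from list_of_accs whose normalized form also appears in flagged."""
    flagged_keys = {normalize(name) for name in flagged}
    return [name for name in list_of_accs if normalize(name) in flagged_keys]

# Made-up names, for illustration only.
overlap = find_overlap(["Alpha Labs", "Beta-Hub"], ["beta hub", "Gamma Works"])
```

Exact matching on normalized names will miss abbreviations and renamed accelerators, so any result is a lower bound on the true overlap.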
=Funded By Accelerators=
Reference the like-named portion in [[Crunchbase Data#Funded by Accelerators|Crunchbase Data]]
=End of Semester Report=
The end of semester report will focus on ranking accelerators and environments based on the variables we have gathered. Our primary form of categorization will be ranking individual accelerators based on their venture capital raise rate. We can probably generate information over time for accelerators and the amount of VC they raised to get a sense of what locations have developed in the past five years from the dates of transactions recorded by SDC. To obtain these rankings, we will identify which cohorts companies were trained in, as well as complete details of the accelerator and the details of cohort companies. We will focus only on accelerators because there are many other entities in each ecosystem. We will also utilize information on IPO or acquisition by companies, obtained through Crunchbase, to gain some sense of how successful startups emerging from a particular accelerator are. To obtain the data over time, we will need to fill out the cohort date information column in our cohort data, which will require the help of either Crunchbase or the Wayback machine for older accelerators. In ranking the accelerators across regions, we can also track industry-specific hotspots for accelerators such as medicine in Memphis or technology in San Francisco.
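As a rough sketch of the ranking step: the report does not define "venture capital raise rate" precisely, so the code below assumes it means the share of an accelerator's cohort companies with at least one recorded VC round. The input layout and the function name <code>rank_by_raise_rate</code> are hypothetical.

```python
from collections import defaultdict

def rank_by_raise_rate(cohort_rows):
    """Rank accelerators by the share of cohort companies that raised VC.

    cohort_rows: iterable of (accelerator, company, raised_vc) tuples,
    where raised_vc is True if a VC round was matched for the company.
    """
    totals = defaultdict(int)
    funded = defaultdict(int)
    for accelerator, _company, raised_vc in cohort_rows:
        totals[accelerator] += 1
        if raised_vc:
            funded[accelerator] += 1
    rates = {acc: funded[acc] / totals[acc] for acc in totals}
    # Highest rate first; ties are broken alphabetically by accelerator name.
    return sorted(rates.items(), key=lambda item: (-item[1], item[0]))

# Made-up rows, for illustration only.
rows = [("Acc A", "Co1", True), ("Acc A", "Co2", False), ("Acc B", "Co3", True)]
ranking = rank_by_raise_rate(rows)
```

Grouping the same rows by cohort year as well would give the over-time view described above, once the cohort date column is filled in.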
=End of Semester Notes=
*We have compiled a very long list of accelerators from many different databases. For the past couple of weeks, everyone in the center has been going through this list, 20 at a time, classifying each one as an accelerator or not an accelerator, and then proceeding to gather data on the accelerator using the process outlined below. This process went very smoothly. We have successfully gone through about 80% of the list. We are still missing information on the last hundred or so names. All of the collected data is located on the RDP, within the "Accelerators" folder under "Data" or on the [https://docs.google.com/spreadsheets/d/1ikuxYwp9JIRrjz4qQcbdwTpbHOne-q2PterYTjzofjw/edit?ts=5aa2f1f9#gid=1132417337 "Accelerator Master Variable List" Google sheet].
*We have listed all of the startups from the accelerators that have break out cohorts on their website on the [https://docs.google.com/spreadsheets/d/1ikuxYwp9JIRrjz4qQcbdwTpbHOne-q2PterYTjzofjw/edit?ts=5aa2f1f9#gid=1132417337 "Accelerator Master Variable List" Google sheet]. This contains the following information in the "Cohort List (new)" sheet: accelerator name, year, cohort name, company name, description, founders, category/sector, and location.
*Next steps include going through the demo day pages that have been downloaded and writing notes on the different types if possible (see [[Demo Day Page Google Classifier]]).
=Data Collection Notes=
==Link to Crunchbase API application==
*https://about.crunchbase.com/forms/research-access-apply/ (does not work anymore)
*https://data.crunchbase.com/v3/docs/using-the-api (has new instructions for application)
#Copied "Seed Accelerators" table to TextPad, data sorted itself into lines. Returned 235 results.
#Clicking on the accelerator name itself links to a page with all of its associated startups, up until 6/2016 cohort
*Overall very extensive data for accelerators that are included on the list, but after cross-referencing from other sources shows that seed-db is lacking many newer accelerators; list is not all-inclusive.
*Includes regional distributions for accelerator groups as well. For example, rather than just "Techstars", the group is broken into Austin, Berlin, Boston, Boulder, etc.
=Kauffman Foundation Incubator Proposal Information=
==Institutions==
Summary: F6S, Crunchbase, seed-db

Tools: Matcher - used to match lists of potential accelerators with our current list to identify duplicates/new matches (E:\McNair\Projects\Accelerators)
===F6S===
F6S WebCrawler and F6S Parser - E:\McNair\Projects\Accelerators\F6S Accelerator HTMLs
===CrunchBase===
CrunchBase 2013 Snapshot '''(All Organizations)''' - E:\McNair\Projects\Accelerators\organizations.xls

CrunchBase 2013 Snapshot '''(Potential Accelerators)''' - E:\McNair\Projects\Accelerators\organizations.accdb under "Potential Accelerators query"
*Obtained using keyword matches in the descriptions of the potential accelerators.
CrunchBase 2013 Snapshot '''(New Verified Accelerators)''' - E:\McNair\Projects\Accelerators\New CrunchBase Accelerators.xls

We have the Crunchbase 2013 Snapshot, which provided lots of new data on accelerators and incubators, but we would love to use the Crunchbase API to get a current database snapshot that we could use to cross-reference companies and add newly formed accelerator and incubator companies.
===AngelList===
===seed-db===
Obtained through www.seed.db/accelerators
===Global Accelerator Network (GAN)===
GAN Parser - E:\McNair\Projects\Accelerators\Web Scraping for Accelerators\scrapeaccel.py

GAN Data - E:\McNair\Projects\Accelerators\Web Scraping for Accelerators\GAN Accelerator Data
*Contains: Company Name, # of Companies Range, % of Companies Funded, Funding Raised by Companies, Employee Range, Exit Funding, Exit Date, Total Company Funding Raised, # of Mentors Range, % Equity, Location, Minimum Seed Capital Investment
==Cohorts==
*Cohorts obtained manually
*All Cohort txt files are saved under E:\McNair\Projects\Accelerators\Data
*cohort file name = (accelerator name).cohort
*Most updated Accelerator cohort data: E:\McNair\Projects\Accelerators\Cleaned Cohort Data.xls
Automation for obtaining cohorts??
==Other Information==
Summary: Whois Parser, Geocode, Tools to determine industry, etc.
===Whois Parser===
*Retrieves and parses Whois information. Specifically, takes a file with a column of domain names and populates the corresponding columns with information from the WhoIs API.
*Often used to obtain locations.
===Geocode===
Input: Company Address

Output: Directional Coordinates
*Used to obtain the locations of different Accelerators and Cohort companies.
===SDC Platinum Pull===
Used to obtain funding information and match companies that have gotten funding with companies that are Accelerator cohorts.
===Desired Information/Variables===
*Key People (founders, lead entrepreneurs, strategists, etc.)
*Total number of launched companies
*A FAQ for application details, accelerator vision, and
*Funds raised per company (average)
*Features offered by accelerator (perks, space, tools, etc)
==Desired Tools/Information==
===Automating the Process of Obtaining Cohorts===
*Automating this process would save a lot of time and really progress the project.
===Obtaining More Details on Accelerators===
*Having the kind of thorough information on industry, companies, funding, location, exits, mentors, leadership, that we got for the GAN companies would be fantastic.
===List of Alive/Dead Accelerators===
This is a dream but would be very helpful