Accelerator Seed List (Data)

From edegan.com


McNair Project
Copyright © 2016 edegan.com. All Rights Reserved.


This project aims to determine which accelerators are most effective at producing successful startups, and which characteristics those accelerators share. First, we need to gather as much data as possible about as many accelerators as possible, so that we can examine the factors that differentiate successful ventures from unsuccessful ones. Next, we need to build a web crawler that gathers information about accelerators across the world by accessing their websites and extracting information. The overall goal of this research is to gain insight into the methods of successful accelerators and to find out what differentiates very successful accelerators from dead ones.

Helpful Links: http://seedrankings.com/

Pre-existing Data

List of Accelerators

Sources

Summary: These are sources obtained from List of Accelerators and other Google searches. We evaluate each source by the number of accelerators it supplies (most of them are lists) and by the type of information it provides about each accelerator. Key data points are cohort-related data, startup-related data, and logistics of the accelerator. Better sources supply more information than the URL alone.

(Obtained from List of Accelerators and various Google searches)

(Obtained from Google search: "Accelerator Database")

Other ways used to find Accelerators (listed below "List of Sources Obtained from Various Google Searches"):

  • Searched for a generic location + "accelerators" (e.g. "Houston Accelerators")
  • Looked at roughly the first 20 results
  • Used three locations as examples of accelerators that pop up
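
The search procedure above can be sketched in a few lines of Python. This is only an illustration: the function name build_queries and the constant RESULTS_TO_REVIEW are assumptions introduced here, and the sketch builds the query strings without calling any search engine.

```python
def build_queries(locations, keyword="accelerators"):
    """Return one search string per location, e.g. 'Houston accelerators'."""
    return [f"{loc.strip()} {keyword}" for loc in locations if loc.strip()]

# The manual review looked at roughly the first 20 results per query.
RESULTS_TO_REVIEW = 20

# The three example locations used on this page.
queries = build_queries(["Houston", "Los Angeles", "New York"])
for q in queries:
    print(q)
```

A future crawler could pass each string to whatever search interface is eventually chosen and keep the first RESULTS_TO_REVIEW hits for review.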

Source Evaluations

Summary: These evaluations couple with each of the sources above. The evaluations provide instructions for obtaining the information listed, as well as a general review of how useful the data seems. The review serves to determine whether a crawler would be suitable for obtaining information from the source autonomously.

Source: http://www.acceleratorinfo.com/see-all.html

  1. Opened source website
  2. Copied Information under "All Accelerator Programs" to TextPad, already sorted. Returned 190 results
  3. Each link on parent list leads to individual home page url of accelerator
  • Used sample size of 20 links, determined 16 to be accelerators, 2 to be incubators, 2 to be inactive or broken links
  • Many accelerators do not include founding date, most recent accelerators from around 2013-2014 (as determined from home page)
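
As a rough worked example of what the sample implies, the share of valid accelerators can be scaled up to the full list. This is a sketch that assumes the 20 sampled links are representative of all 190, which the evaluation above does not establish; the function name estimate_valid is introduced here for illustration.

```python
def estimate_valid(total, sample_size, sample_valid):
    """Scale the valid share seen in a sample up to the full list."""
    share = sample_valid / sample_size
    return share, round(total * share)

# 190 links in total; 16 of the 20 sampled links were accelerators.
share, estimate = estimate_valid(190, 20, 16)
print(f"{share:.0%} of the sample were accelerators, roughly {estimate} of 190 if representative")
```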

Review

  • Reliable source for specific URLs to older accelerators, not very helpful for more specific information.
  • Web crawling seems impractical because the information is not readily available from the source itself. A crawler could potentially mine staff or contact information from the "about" page linked from each home URL


Source: http://www.seed-db.com/accelerators/all

  1. Copied "Seed Accelerators" table to TextPad, data sorted itself into lines. Returned 235 results.
  2. Clicking on the accelerator name itself links to a page with all of its associated startups, up until 6/2016 cohort
  • Startup table includes:
  1. "state"
  2. "company name"
  3. "website and CrunchBase links"
  4. "cohort date"
  5. "exit value"
  6. "funding".
Many entries for "exit value" are missing, some values for "funding" are missing
On original seed-db webpage, each accelerator has a link to its associated home page url
  • From the table, each listed entry was an accelerator, although 24 accelerators out of 235 were classified as "dead"
  • Along with the home url, each accelerator table includes the following:
  1. Status
  2. Program (name)
  3. Location
  4. Country
  5. Number of companies
  6. Cumulative exit values
  7. Cumulative funding
  8. Average funding for startups
  9. Median funding for startups
Many entries for "median funding" are left empty, as well as entries for all types of funding on the bottom half of the table

Review

  • Reliable source for accelerators, includes list of accelerators both dead and active, as well as their associated start-ups
  • Web crawling potential is promising; startup table is located within the source for each webpage. Can also mine any category from the accelerator table
  • Overall very extensive data for the accelerators that are included on the list, but cross-referencing with other sources shows that seed-db is missing many newer accelerators; the list is not all-inclusive.
  • Includes regional distributions for accelerator groups as well. For example, rather than just "Techstars", the group is broken into Austin, Berlin, Boston, Boulder, etc.
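
Since the tables sit directly in the page source, a crawler could read them with Python's standard html.parser module. The following is a minimal sketch, not the project's crawler: the HTML fragment, the program names, and the class name TableParser are hypothetical, and the real seed-db markup has not been checked here.

```python
from html.parser import HTMLParser

class TableParser(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

# Hypothetical fragment using some of the column names listed above.
SAMPLE_HTML = """
<table>
  <tr><th>Status</th><th>Program</th><th>Location</th><th>Number of companies</th></tr>
  <tr><td>Active</td><td>Example Accelerator A</td><td>Austin</td><td>12</td></tr>
  <tr><td>Dead</td><td>Example Accelerator B</td><td>Boston</td><td>4</td></tr>
</table>
"""

parser = TableParser()
parser.feed(SAMPLE_HTML)
header, *body = parser.rows
records = [dict(zip(header, row)) for row in body]
dead = [r["Program"] for r in records if r["Status"] == "Dead"]
print(len(records), dead)
```

Keeping each row as a dictionary keyed by column header makes it straightforward to pull out any single category, such as the accelerators marked "dead".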


Source: http://www.seed-db.com/accelerators

Very similar to "http://www.seed-db.com/accelerators/all", but contains large regional accelerators as groups, rather than individual accelerators. For example, Techstars appears only once.
  1. Copied "Seed Accelerators" table to TextPad, data sorted itself into lines. Returned 239 results.
  2. Clicking on the accelerator name itself links to a page with all of its associated startups, up until 6/2016 cohort
  • Startup table includes same information as previous source, "http://www.seed-db.com/accelerators/all". However, accelerators spanning across multiple regions have their startups located under one category on this webpage.
On original seed-db webpage, each accelerator has a link to its associated home page url
  • From the table, each listed entry was an accelerator, although 24 accelerators/groups out of 239 were classified as "dead"
  • Along with the home url, each accelerator table includes the same information as the "http://www.seed-db.com/accelerators/all" source

Review

  • Reliable source for accelerators, includes list of accelerators both dead and active, as well as their associated start-ups
  • Web crawling potential is promising; startup table is located within the source for each webpage. Can also mine any category from the accelerator table
  • Overall very extensive data for accelerators that are included on the list, includes large groups as well as individual accelerators. It seems that some accelerators missing from "http://www.seed-db.com/accelerators/all" are located here, since there are 239 returns rather than 235.


Source: https://www.f6s.com/programs?type

  1. On the webpage, set "Type" to "Accelerator/Program", set "Location" to "North America", and set "Invest in Country" to "United States" to return results
  2. Highlighted results and scrolled down until all results found; copied results to TextPad
  3. In TextPad, sorted out lines with "by", as well as miscellaneous categories such as dates and dollar signs through Regular Expressions
  4. Using the "More Info" line which held constant through the entire list, assigned a sequential number to the line (in order to determine the number of results)
  • Obtained a grand total of 1467 results from the list
  • Along with the name of the program/accelerator, the data included:
  1. Dollar value per team
  2. Equity
  3. Application Site
  4. Accelerator URL
  • Many entries are not accelerators. A quick glance through the results showed various conferences, 3-5 day events, and written literature pertaining to accelerators as well
  • From a sample size of the first 30 entries, determined 10 to be valid accelerators, 3 incubators, 6 conferences/weekends, and the rest to be miscellaneous entries such as startup events or "studios" (perhaps useful but not relevant to search)
  • As we go down the list, the number of accelerators proportionately decreases. Can comfortably say that overall accelerator turnout from this website is much less than 33%, probably closer to 10-15%.
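
The TextPad clean-up described above (dropping "by" lines, dates, and dollar amounts, then counting "More Info" lines) can be approximated in Python. This is a sketch under assumptions: the sample text and the exact patterns in NOISE are hypothetical, since the copied f6s text is not reproduced on this page.

```python
import re

# Hypothetical excerpt in the shape of the copied results.
SAMPLE = """Example Program One
by Example Organizer
$20,000 per team
Jan 5, 2017
More Info
Example Program Two
by Another Organizer
More Info
"""

# Lines starting with "by ", a dollar sign, or consisting of a date such as "Jan 5, 2017".
NOISE = re.compile(r"^(by\s|\$|[A-Z][a-z]{2} \d{1,2}, \d{4}$)")

def clean(text):
    """Return the remaining name lines and the number of 'More Info' markers."""
    names, count = [], 0
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line == "More Info":
            count += 1  # one marker per listed program
        elif not NOISE.match(line):
            names.append(line)
    return names, count

names, total = clean(SAMPLE)
print(total, names)
```

A program whose name itself begins with "by " would be dropped by this filter, so the patterns would need checking against the real data.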

Review

  • Potentially useful website if crawler could remove the clutter and target solely the accelerators; very useful for identifying new accelerators since data automatically sorted by date and location.
  • Large list of sources includes many irrelevant results, such as conferences or weekends which are difficult to identify. The name of the sorting category itself, "Accelerator/Program" suggests that many of the results fall under the "Program" section rather than being valid accelerators.
  • Potential site for identifying accelerators, but limited by in-site sorting; useful for URL and perhaps equity, but not very detailed information relating to the accelerator/program.


Source: http://gust.com/usa-canada-accelerator-report-2015/

  1. Selected region of US and Canada
  2. Scrolled down to the section labeled "Top 20 Active Accelerators" and selected "see the full list" near the bottom of the listed accelerators
  3. Copied resulting entries into TextPad and sorted out the numbers to leave only the name of the accelerator
  • Obtained 100 results for different accelerators
  • Accelerator lists included:
  1. Name and URL
  2. Number of Start-ups funded (2015 only)
  • Accelerator list limited to 2015

Review

  • Website provides its own evaluation of an accelerator's success based on various factors and provides data for larger trends.
  • Usefulness is questionable because website does not provide much except the URL, and all of the entries are based on success in 2015.
  • Other interesting data within website such as "Hot Markets", investment breakdowns by state, etc. All of this data is also limited to 2015.

Source: https://bostonstartupsguide.com/guide/every-boston-startup-accelerator-incubator/

  1. Scrolled down to the section labeled "Startup accelerators in Boston"
  2. Copied text beginning from "MassChallenge" (the first paragraph was just a general definition of startups) and continued to copy until "Startup Incubators in Boston"
  3. After pasting in TextPad, I sorted the data to delete any characters after the "-" and added a sequential number at the beginning of each line
  • Returned a total of 17 results for accelerators in Boston
  • Accelerator list included:
  1. Name and URL
  2. Capital requirements
  3. Application periods and requirements
  4. Paragraph describing accelerator and its goals

Review

  • Although the guide is dated, useful for identifying strong accelerator programs in Boston
  • Limitation: only focuses on Boston, but the description is helpful in identifying the role of the accelerator
  • Limited information on accelerator, not very useful by itself without information from the accelerator URL

Source: https://www.corporate-accelerators.net/database/

  1. Copied and pasted table into Microsoft Excel (Data was already sorted into categories so no need for TextPad)
  2. Table returned 72 references (but there was a link at the bottom to a larger database)
  • The table itself includes:
  1. Major Company
  2. Accelerator
  3. Funding
  4. Equity
  5. Website
  6. Details
  • The "Details" link led to a variety of other information including:
  1. Status (Active or Inactive)
  2. Locations
  3. Funding
  4. Equity
  5. Term
  6. Cohort Based? (Regular or Irregular)
  7. Pitch Day
  8. Office Space
  9. Powered by
  10. Support Offered?
  11. Launch year
  12. Focus Areas
  13. General Description
  • Also included a variety of data about the host company

Review

  • Solid list of corporate accelerators that also includes a variety of information about the accelerator, the cohorts, etc. Some of the entries are international accelerators, however, so they need to be filtered out
  • Only limited to 72 accelerators from major companies

Source: https://github.com/florianheinemann/www-corporate-accelerators-net/blob/master/_data/Accelerators.json

  1. This source is a .json file from the previous database
  2. After placing into TextPad, replaced each space with a ###, replaced each new line with a tab, and replaced each ### with a new line. Ultimately returned 80 results
  • From the file, the .json includes:
  1. NAICS and NAICS sector
  2. Classification
  3. Sector Description
  4. Term
  5. Goal
  6. Partner
  • Also includes most of the information from the previous source, since they are undoubtedly linked
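
As an alternative to the find-and-replace steps in TextPad, a .json file can be read directly with Python's json module. The snippet below is a sketch only: the key names ("accelerator", "status", "term") and the two entries are hypothetical placeholders, and the real structure of Accelerators.json has not been verified here.

```python
import json

# Hypothetical stand-in for the file contents; key names are assumptions.
SAMPLE_JSON = """
[
  {"accelerator": "Example Corporate Accelerator", "status": "Active", "term": "12 weeks"},
  {"accelerator": "Another Example Program", "status": "Inactive", "term": "3 months"}
]
"""

entries = json.loads(SAMPLE_JSON)
active = [e["accelerator"] for e in entries if e.get("status") == "Active"]
print(len(entries), active)
```

Counting the parsed entries gives the number of results without any manual line manipulation.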

Review

  • Another solid list for corporate accelerators with some more information, but ultimately very similar to the previous source.

Source: https://www.quora.com/Where-can-I-find-a-comprehensive-list-of-startup-incubators-and-accelerators-in-the-US

  1. Since we already looked at the first listed source (seed-db), I clicked on the second link "(by Robert Shedd) http://blog.shedd.us/321987608/" which took me to a page headed "Help for Startups! – A semi-complete list of startup accelerator programs" created by a blogger, Robert Shedd
  2. List included 102 entries by the blogger, each of which does appear to be an accelerator
  • Upon immediate overview, noticed many results from previous sources were missing. Immediately noticed lack of "OwlSpark", the accelerator from Rice.
  • Shedd only offers us the accelerator name plus its URL

Review

  • Nice list to cross-reference with other sources but does not offer much new insight compared to more powerful engines such as seed-db
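
Cross-referencing two lists of accelerator names, as done above, amounts to a set difference after light normalization. The sketch below assumes names only differ by case and spacing; the list contents are illustrative placeholders (apart from OwlSpark, which was noted as missing from the Shedd list), and the function names are introduced here.

```python
def normalize(name):
    """Lowercase and collapse whitespace so near-identical names compare equal."""
    return " ".join(name.lower().split())

def missing_from(reference, candidate):
    """Names in `reference` that do not appear in `candidate`."""
    seen = {normalize(n) for n in candidate}
    return sorted(n for n in reference if normalize(n) not in seen)

# Placeholder lists for illustration only.
other_sources = ["OwlSpark", "Example Accelerator A", "Example Accelerator B"]
shedd_list = ["example accelerator a", "Example  Accelerator B"]

print(missing_from(other_sources, shedd_list))
```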

List of Sources Obtained from Various Google Searches

Summary: These accelerators are taken from a specific Google search rather than a list. The idea is to compile a list of Google searches that return relevant results of accelerators. This will aid in the creation of a future web crawler.

From "Location + Accelerator" (only individual results, not lists)

Houston Accelerators

  • Examples of single accelerators found
  1. TMCx: http://www.tmc.edu/innovation/innovation-programs/tmcx/
  2. RED labs: http://redlabs.uh.edu/8
  3. SURGE accelerator: https://kirkcoburn.com/
  4. OwlSpark: http://owlspark.com/
  5. NextHIT: http://www.houstonhealthventures.com/nexthit-accelerator-program-application/

Los Angeles Accelerators

  1. Amplify: http://amplify.la/
  2. Y Combinator: https://www.ycombinator.com/
  3. Chicklabs: https://www.chicklabsllc.com/
  4. Disney Accelerator: https://disneyaccelerator.com/
  5. Launchpad: https://launchpad.la/

New York Accelerators

  1. DreamIT Ventures: http://www.dreamit.com/#meaningful-experience
  2. Women Innovate Mobile: http://www.wim.co/
  3. Techstars NYC: http://www.techstars.com/programs/nyc-program/
  4. Entrepreneurs Roundtable: http://eranyc.com/
  5. FirstGrowthVC: http://venturecrush.com/fg/
  6. New York Digital Health Accelerator: http://digitalhealthaccelerator.com/
  7. Grand Central Tech: http://www.grandcentraltech.com/
  8. Accelerator Corp: http://www.acceleratorcorp.com/
  9. New York Startup Lab: http://nystartuplab.com/

Review

  • Some locations return more viable results for a similar sample size. For example, out of the first 20 results, New York returned 9 valid accelerators, whereas Los Angeles and Houston each returned 5: New York returned 80% more. Some optimization may come from identifying which locations return more accelerators upon searching.
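
The comparison above can be made explicit with a short calculation. This sketch uses the counts reported in this section and assumes exactly 20 results were reviewed per location; the variable names are introduced here.

```python
REVIEWED = 20  # results examined per search
found = {"Houston": 5, "Los Angeles": 5, "New York": 9}

# Share of reviewed results that were individual accelerators.
rates = {city: n / REVIEWED for city, n in found.items()}

# Relative difference between New York and Houston.
relative = (found["New York"] - found["Houston"]) / found["Houston"]

for city, rate in rates.items():
    print(f"{city}: {rate:.0%}")
print(f"New York vs. Houston: {relative:.0%} more")
```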

Individual Accelerator Evaluations

Summary: The purpose of this section is to create instructions for each accelerator on how to find cohort information from their URLs. Along with specific instructions for obtaining the cohorts for each accelerator chosen, there should be a list of easy-to-obtain and relevant statistics regarding the accelerator, such as information about its team, location, etc. The variable statistics list is cumulative, whereas the cohort directions are unique per the accelerator.

Accelerators Chosen (Format = Name (source))

  1. Blue Startups (http://www.acceleratorinfo.com/see-all.html)
  2. Launchpad LA (http://www.acceleratorinfo.com/see-all.html)
  3. Y Combinator (http://www.seed-db.com/accelerators)
  4. FlashPoint (http://www.seed-db.com/accelerators/all)
  5. Prosper Accelerator (https://www.f6s.com/programs?type)
  6. Axel Springer Plug and Play (http://www.axelspringerplugandplay.com/)
  7. Techstars (http://www.seed-db.com/accelerators)
  8. Startmate (http://www.seed-db.com/accelerators)
  9. Capital Factory (http://blog.shedd.us/321987608/)
  10. OwlSpark (Google search: "Houston + accelerators")

Accelerator: Blue Startups (http://bluestartups.com/)

Finding the cohort:

  1. Navigated to "Track Record" page under the "Home" tab; found total number of graduated cohorts to be 7
  2. Navigated to "Portfolio" tab. Tab includes list of all seven graduated cohorts along with companies emerging from each one. Each cohort is listed under a separate page (ex. "Cohort 1", "Cohort 2", etc) and at the bottom of each cohort page, there is a link to the other 6. Each company has a short description along with its URL.
  3. An "Alumni News" page at the bottom of "Portfolio" includes articles pertinent to graduated startups.
  4. Unfortunately does not include the date and year of each cohort class, but perhaps could cross-reference with other sources.

Accelerator: Launchpad LA (http://launchpad.la/)

Finding the cohort:

  1. Navigated to "Companies" in the top of the homepage
  2. "Companies" returns all companies backed by Launchpad LA based on their class year and number (cohort)
    • Also sorted by active startups vs. inactive startups
  3. At the bottom of the "Companies" tab, there is a statistical layout returning values for the number of companies started by Launchpad during its time as an accelerator (2012-present), as well as the total funding funneled into the accelerator.

Accelerator: Y Combinator (http://www.ycombinator.com)

Finding the cohort:

  1. Scrolled down on the home page and clicked on a link entitled "See all companies".
  2. Navigated to a drop down menu named "All Batches", and clicked on it to expand the list.
  3. List is made up of dates ranging from 2005-2016, and these dates return lists of launched companies including most but not all of their URLs, as well as their launch year.

Accelerator: Flashpoint (http://flashpoint.gatech.edu/)

Finding the cohort:

  1. On upper right corner after animation, there is a tab sign which lets you navigate to a page labeled "Teams"
  2. The "Teams" page has each batch of companies emerging from Georgia Tech, although it does not include the dates or cohorts of these companies. For example, "Batch 1" at the top of the page just lists the companies in the batch without URLs or any additional information.
  3. On the "Application" page on the tab near the top, there is information regarding Batch 7, which begins early 2017. Suggests that batch 6 either ended spring 2016 or fall 2016.

Accelerator: Prosper Women Entrepreneurs (http://www.prosperstl.com)

Finding the cohort:

  1. Navigated to "Accelerator" tab and clicked "Companies" when prompted with the drop down menu.
  2. This tab returned all of the launched company logos which then redirected to the company's home page when clicked.
  3. No other relevant form of information such as date launched or cohort was included on this page.

Accelerator: Axel Springer Plug and Play (http://www.axelspringerplugandplay.com/)

Finding the cohort:

  1. Clicked on the "Companies" tab on the home page and was directed to the middle of the page which included a short list of current companies.
  2. Clicked on the "All Companies" link which returned a page filled with startup logos and brief descriptions of those startups. When clicked, each logo serves to redirect to that startup's home page.
  3. Companies were not sorted by cohort or in any other relevant way.

Accelerator: Techstars (http://www.techstars.com)

Finding the cohorts:

  1. Navigated to the "Accelerators" tab and clicked "Companies" on the drop down menu.
  2. This first returns a table comprised of a long list of different classes from different areas separated by years.
  3. Upon scrolling down further, each of these classes is broken down by the startups that graduated from them. It also includes information such as how much was invested in each startup, as well as whether or not the startup was acquired, is active, or failed.

Accelerator: Startmate (http://www.startmate.com.au)

Finding the cohorts:

  1. Navigated to the "Startups" tab, which returned a page of all startups that have graduated from Startmate.
  2. Startups are separated by year of graduation, and each company is linked on this page.
  3. It appears as if each year, 1 cohort is taken through the accelerator.

Accelerator: Capital Factory (https://capitalfactory.com/accelerate/)

Finding the cohorts:

  1. Navigated to the startups tab, which returned a long list of companies that were accelerated by Capital Factory.
  2. Each logo for the startups served as a link to their respective websites.
  3. There was no evidence or mention of any cohorts.

Accelerator: OwlSpark (http://entrepreneurship.rice.edu/accelerator/)

Finding the cohorts:

  1. Navigated to the "Startup Teams" tab, which returned a page that included links to 4 "Classes".
  2. Each class link (i.e. Class 1, Class 2, Class 3, Class 4) returned links to each startup that graduated from the program.
  3. These classes signify cohorts.

List of Promising Variables

  • Key People (founders, lead entrepreneurs, strategists, etc.)
  • Total number of launched companies
  • A FAQ for application details, accelerator vision, and
  • Funds raised per company (average)
  • Features offered by accelerator (perks, space, tools, etc)
  • General events hosted by the accelerator
  • (Success) stories for graduated start-ups

E-R Diagram (in list form) for Identifying Attributes to Pull from Accelerators

Summary: We look at different entities within the accelerator page (e.g. accelerators, cohorts, founders) and then find potential attributes that can be codified from those entities. Along with each attribute, we list a potential method for pulling that particular attribute.

Format:

Entity
  • Attribute - Possible sources/ways to get

Ed: "Be creative with finding new attributes to pull!"

List

Accelerators

  • Accelerator Name - Website, external database
  • Contact Form - General contact section in each website
  • Industry focus - can be pulled from description
  • Description - pulled from website itself
  • Takes equity? - Database or from "about" page
  • Non-profit? - Database
  • URL - Already have way of obtaining
  • DNS Registration Date - Already have way of obtaining
  • Address - Google Maps, maybe the website
  • Founding Date - Google Maps, website, server registration

Accelerators (1) has (n) Features

Features

  • Mentorship? - Description in website
  • Space Offered - Google Maps, Website description
  • Partnerships - Angel list, Same section as mentorship or events
  • Hosted Events - Calendar

Accelerators (1) has (n) Founders

Founders

  • Name - Founders or Team Page
  • Title - Directly underneath or next to name
  • PhD? - Biography, webpage under name
  • Serial - Biography
  • Link back to "Accelerator Name" in Accelerators

Founders (n) has (n) Ventures

Ventures

  • Other Companies - Biography, webpage
  • Previous Companies - Biography
  • Net Worth - Forbes, Biography
  • Link back to "Name" in Founders

Accelerators (1) has (n) Cohorts

Cohorts

  • Date + Accelerator = Cohort ID - Database or Website
  • Number of Startups - Website, count from Startups
  • Cohort Number - Categorization on website, external database
  • Link back to "Accelerator Name"

Cohorts (1) has (n) Startups

Startups

  • Names - Website, external database
  • State of Inc - Angel List
  • URL - Angel List, website
  • Founding Date - Registration database, Angel List
  • Industry - startup description
  • Founding Location - Angel List
  • Current Location - Angel List
  • VC Raised to Date - SDC Platinum
  • Angel Funds Raised to date - Angel List
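
Part of the E-R list above can be expressed as a small relational schema. The following is a sketch using Python's built-in sqlite3 module: it covers only the Accelerators, Cohorts, and Startups entities with a few of their attributes, and the table names, column names, sample rows, and the cohort_id format (accelerator name plus date) are assumptions for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE accelerators (
    name TEXT PRIMARY KEY,
    url TEXT,
    takes_equity INTEGER,
    founding_date TEXT
);
CREATE TABLE cohorts (
    cohort_id TEXT PRIMARY KEY,          -- Date + Accelerator
    accelerator_name TEXT NOT NULL REFERENCES accelerators(name),
    cohort_date TEXT NOT NULL,
    cohort_number INTEGER
);
CREATE TABLE startups (
    name TEXT NOT NULL,
    cohort_id TEXT NOT NULL REFERENCES cohorts(cohort_id),
    url TEXT
);
"""

def cohort_id(accelerator_name, cohort_date):
    """Build the cohort key from the accelerator name and the cohort date."""
    return f"{accelerator_name}:{cohort_date}"

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Placeholder rows, not real data.
conn.execute("INSERT INTO accelerators (name, url) VALUES (?, ?)",
             ("Example Accelerator", "http://example.com"))
cid = cohort_id("Example Accelerator", "2016-06")
conn.execute("INSERT INTO cohorts VALUES (?, ?, ?, ?)",
             (cid, "Example Accelerator", "2016-06", 1))
conn.executemany("INSERT INTO startups VALUES (?, ?, ?)",
                 [("Startup A", cid, None), ("Startup B", cid, None)])

# "Number of Startups" derived by counting rows in Startups.
count = conn.execute("SELECT COUNT(*) FROM startups WHERE cohort_id = ?",
                     (cid,)).fetchone()[0]
print(cid, count)
```

Deriving the number of startups from a count, rather than storing it, keeps the cohort record consistent with the startup rows it links to.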

Variables which Distinguish Accelerators

  • Cohorts, Portfolio, Class, or Companies
    • First potential variable that could link the websites of many different accelerators. The problem with this variable is that it is also used by numerous venture capital firms, which could potentially cause complications when attempting to pull only the sites of accelerators from a Google search.
  • The word "Accelerator"
    • This word appears at least one time on the home page of the vast majority of accelerator websites. The word "Accelerator" appears either as a link to another page on the website or in a title on the homepage of the website. Not many other websites contain this word on their homepage, especially not if one Googles something generic such as "Accelerators in the US".
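
These two variables could be checked on a page's text with a simple keyword test. This is a sketch, not a validated classifier: the term list COHORT_TERMS, the function name keyword_signals, and the sample sentences are assumptions, and the caveat about venture capital sites above still applies to the cohort-style terms.

```python
import re

COHORT_TERMS = ("cohort", "cohorts", "portfolio", "class", "companies")

def keyword_signals(page_text):
    """Report whether 'accelerator' appears and which cohort-style terms occur."""
    words = re.findall(r"[a-z]+", page_text.lower())
    return {
        "mentions_accelerator": "accelerator" in words or "accelerators" in words,
        "cohort_terms": sorted(set(words) & set(COHORT_TERMS)),
    }

# Hypothetical homepage snippets.
accelerator_page = "Example Accelerator: meet our portfolio companies from the latest cohort."
vc_page = "A venture capital firm and its portfolio companies."

print(keyword_signals(accelerator_page))
print(keyword_signals(vc_page))
```

In this sketch the second snippet still matches cohort-style terms but not the word "accelerator", which mirrors why the two variables are listed separately.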