Difference between revisions of "Accelerator Seed List (Data)"

From edegan.com

Revision as of 15:13, 11 October 2016


McNair Project
Accelerator Seed List (Data)
Copyright © 2016 edegan.com. All Rights Reserved.


Pre-existing Data

List of Accelerators

Sources

(Obtained from List of Accelerators)

Source Evaluations

http://www.acceleratorinfo.com/see-all.html

  1. Opened the source website.
  2. Copied the information under "All Accelerator Programs" to TextPad; the list was already sorted. Returned 190 results.
  3. Each link on the parent list leads to the home page URL of an individual accelerator.
  • From a sample of 20 links, determined 16 to be accelerators, 2 to be incubators, and 2 to be inactive or broken links.
  • Many accelerators do not list a founding date; the most recent accelerators date from around 2013-2014 (as determined from their home pages).
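
The sample counts above can be turned into a rough estimate for the full list. The sketch below is illustrative only: it assumes the 20 sampled links are representative of all 190 results (the notes do not say how the sample was drawn), and the Wilson score interval is a standard technique swapped in here, not part of the original evaluation.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a sample proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# Counts from the 20-link sample described in the notes.
sample = {"accelerator": 16, "incubator": 2, "inactive_or_broken": 2}
n = sum(sample.values())

# Share of sampled links that were accelerators.
share = sample["accelerator"] / n

# Uncertainty around that share (assumes a random sample).
low, high = wilson_interval(sample["accelerator"], n)

# Naive projection onto the 190 results returned by the source.
estimated_total = round(share * 190)

print(f"share={share:.0%}, interval=({low:.0%}, {high:.0%}), "
      f"projected accelerators={estimated_total}")
```

With only 20 links sampled the interval is wide, so the projection is a ballpark figure rather than a count.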

Review

  • Reliable source for URLs of older accelerators, but not very helpful for more specific information.
  • Web crawling seems unpromising because the information is not readily available from the source itself. Could potentially mine staff or contact information from the "about" page associated with each home URL.

http://www.seed-db.com/accelerators/all

  1. Copied the "Seed Accelerators" table to TextPad; the data sorted itself into lines. Returned 235 results.
  2. Clicking an accelerator's name opens a page listing all of its associated startups, up to the 6/2016 cohort.
  • Startup table includes:
  1. "state"
  2. "company name"
  3. "website and CrunchBase links"
  4. "cohort date"
  5. "exit value"
  6. "funding"
  • Many entries for "exit value" are missing, and some values for "funding" are missing.
  3. On the original seed-db webpage, each accelerator has a link to its associated home page URL.
  • Every entry listed in the table was an accelerator, although 24 of the 235 were classified as "dead".
  • Along with the home URL, the accelerator table lists the following for each accelerator:
  1. Status
  2. Program (name)
  3. Location
  4. Country
  5. Number of companies
  6. Cumulative exit values
  7. Cumulative funding
  8. Average funding for startups
  9. Median funding for startups
  • Many entries for "median funding" are empty, as are the entries for all types of funding in the bottom half of the table.

Review

  • Reliable source for accelerators; includes a list of accelerators, both dead and active, as well as their associated startups.
  • Web crawling potential is promising; the startup table is located within the page source of each webpage. Can also mine any category from the accelerator table.
  • Overall very extensive data for the accelerators included on the list, but cross-referencing with other sources shows that seed-db lacks many newer accelerators; the list is not all-inclusive.
  • Breaks accelerator groups down by region as well. For example, rather than a single "Techstars" entry, the group is split into Austin, Berlin, Boston, Boulder, etc.
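
Since the startup table sits in the page source, a crawler could read it with a plain HTML parser. The sketch below uses only Python's standard library; the `SAMPLE_HTML` markup, column headers, and company names are hypothetical stand-ins, because the actual seed-db page structure is not recorded in these notes.

```python
from html.parser import HTMLParser

class TableParser(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

# Hypothetical markup standing in for one accelerator's startup table.
SAMPLE_HTML = """
<table>
  <tr><th>State</th><th>Company</th><th>Cohort</th><th>Exit Value</th><th>Funding</th></tr>
  <tr><td>Live</td><td>ExampleCo</td><td>6/2016</td><td></td><td>$100,000</td></tr>
  <tr><td>Dead</td><td>SampleInc</td><td>1/2015</td><td></td><td></td></tr>
</table>
"""

parser = TableParser()
parser.feed(SAMPLE_HTML)

# First row is the header; the rest become one dict per startup.
header, *records = parser.rows
startups = [dict(zip(header, record)) for record in records]

# Empty cells come through as "", which makes the missing values easy to flag.
missing_funding = [s["Company"] for s in startups if not s["Funding"]]
print(missing_funding)
```

Keeping empty cells as empty strings preserves the column alignment, which matters here because many "exit value" and "funding" entries are missing.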

http://www.seed-db.com/accelerators

Very similar to "http://www.seed-db.com/accelerators/all", but lists large multi-region accelerators as single groups rather than as individual accelerators. For example, Techstars appears only once.
  1. Copied the "Seed Accelerators" table to TextPad; the data sorted itself into lines. Returned 239 results.
  2. Clicking an accelerator's name opens a page listing all of its associated startups, up to the 6/2016 cohort.
  • Startup table includes the same information as the previous source, "http://www.seed-db.com/accelerators/all". However, accelerators spanning multiple regions have their startups grouped under a single entry on this webpage.
  3. On the original seed-db webpage, each accelerator has a link to its associated home page URL.
  • Every entry listed in the table was an accelerator, although 24 accelerators/groups of the 239 were classified as "dead".
  • Along with the home URL, the accelerator table includes the same information as the "http://www.seed-db.com/accelerators/all" source.

Review

  • Reliable source for accelerators; includes a list of accelerators, both dead and active, as well as their associated startups.
  • Web crawling potential is promising; the startup table is located within the page source of each webpage. Can also mine any category from the accelerator table.
  • Overall very extensive data for the accelerators included on the list; includes large groups as well as individual accelerators. Some accelerators missing from "http://www.seed-db.com/accelerators/all" appear to be listed here: this page returns 239 results rather than 235, even though it collapses regional programs into single entries.
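
One way to check which entries differ between the two seed-db lists is a set comparison on normalized names. This is a minimal sketch: the names in `all_list` and `grouped_list` are hypothetical placeholders for the lines copied into TextPad, and the normalization rule (lowercase, collapsed whitespace) is an assumption.

```python
def normalize(name):
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(name.lower().split())

def compare_lists(all_names, grouped_names):
    """Return the names unique to each list after normalization."""
    all_set = {normalize(n) for n in all_names}
    grouped_set = {normalize(n) for n in grouped_names}
    return {
        "only_in_all": sorted(all_set - grouped_set),
        "only_in_grouped": sorted(grouped_set - all_set),
    }

# Hypothetical excerpts from the two copied tables.
all_list = ["Techstars Austin", "Techstars Boston", "Example Accelerator"]
grouped_list = ["Techstars", "Example  accelerator", "Another Program"]

diff = compare_lists(all_list, grouped_list)
print(diff["only_in_all"])      # regional entries that the grouped page collapses
print(diff["only_in_grouped"])  # group names plus accelerators absent from /all
```

Entries in `only_in_grouped` would still need a manual pass to separate group names such as "Techstars" from accelerators that are genuinely missing from the other list.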

https://www.f6s.com/programs?type

  1. On the webpage, set "Type" to "Accelerator/Program", "Location" to "North America", and "Invest in Country" to "United States" to return results.
  2. Highlighted the results and scrolled down until all results had loaded; copied the results to TextPad.
  3. In TextPad, used regular expressions to sort out lines containing "by", as well as miscellaneous categories such as dates and dollar signs.
  4. Using the "More Info" line, which recurs unchanged throughout the list, assigned a sequential number to each such line (in order to determine the number of results).
  • Obtained a grand total of 1467 results from the list.
  • Along with the name of the program/accelerator, the data included:
  1. Dollar value per team
  2. Equity
  3. Application Site
  4. Accelerator URL
  • Many entries are not accelerators; a quick glance through the results showed various conferences, 3-5 day events, and written literature pertaining to accelerators as well.
  • From a sample of the first 30 entries, determined 10 to be valid accelerators, 3 to be incubators, 6 to be conferences/weekends, and the remaining 11 to be miscellaneous entries such as startup events or "studios" (perhaps useful but not relevant to the search).
  • Going down the list, the proportion of accelerators decreases. Can comfortably say that the overall share of accelerators on this website is much less than the 33% found in the first 30 entries, probably closer to 10-15%.
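
The TextPad clean-up and counting steps above could be reproduced in a script along these lines. This is a sketch under stated assumptions: the `RAW` text is a made-up imitation of the copied results (the real layout may differ), the `DROP` pattern is one guess at the regular expressions used, and "sorted out" is read here as "removed".

```python
import re

# Hypothetical lines imitating the copied f6s results.
RAW = """\
Example Accelerator
by Example Ventures
$20,000 per team
Jan 5, 2017
More Info
Startup Weekend Somewhere
by Some Organizer
More Info
"""

# Drop "by ..." lines, lines starting with a dollar sign, and date-only lines.
DROP = re.compile(r"^(by\s|\$|[A-Z][a-z]{2} \d{1,2}, \d{4}$)")

kept = [line for line in RAW.splitlines()
        if line.strip() and not DROP.match(line)]

# "More Info" appears once per result, so numbering it counts the results.
numbered = []
result_count = 0
for line in kept:
    if line == "More Info":
        result_count += 1
        numbered.append(f"{result_count}. More Info")
    else:
        numbered.append(line)

print(result_count)
print("\n".join(numbered))
```

Counting a marker line that appears exactly once per result avoids having to parse the variable number of lines each entry occupies.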

Review

  • Potentially useful website if a crawler could remove the clutter and target solely the accelerators; very useful for identifying new accelerators since the data is automatically sorted by date and location.
  • The large list includes many irrelevant results, such as conferences or weekend events, which are difficult to identify. The name of the sorting category itself, "Accelerator/Program", suggests that many of the results fall under "Program" rather than being valid accelerators.
  • Potential site for identifying accelerators, but limited by in-site sorting; useful for URLs and perhaps equity, but does not offer very detailed information about the accelerator/program.
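
A first pass at removing the clutter could be a simple keyword screen on entry names. The sketch below is a crude heuristic, not a tested classifier: the keyword list is an assumption drawn from the kinds of irrelevant entries noted above, and the entry names are hypothetical.

```python
import re

# Assumed keywords for entries that are probably not accelerators.
NON_ACCELERATOR = re.compile(
    r"\b(conferences?|weekends?|studios?|events?)\b", re.IGNORECASE
)

def looks_like_accelerator(name):
    """Return True when the name contains none of the screening keywords."""
    return NON_ACCELERATOR.search(name) is None

# Hypothetical entry names standing in for the copied f6s results.
entries = [
    "Example Accelerator",
    "Startup Weekend Somewhere",
    "Founders Conference 2016",
    "Sample Studio",
    "Another Seed Program",
]

candidates = [e for e in entries if looks_like_accelerator(e)]
print(candidates)
```

A screen like this only narrows the list; entries that pass would still need manual review, since many "Program" results carry no telltale keyword.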