Specifically, our goals are to develop a system to:
# [[Ecosystem Organization Classifier|Classify entrepreneurship ecosystem organizations]], including high-growth technology incubators, startups, and venture capitalists, based on a short textual description;
# [[Listing Page Classifier|Identify the client listing page]] on an incubator's website;
# [[Listing Page Extractor|Automate the extraction of information]] about startups from an incubator's client listing page;
# Make this system available to the research community as open-source software.
'''By March 2019'''
# Determine at least [[Incubator Seed Data|4 primary data sources]], or secure licenses to extract ‘seed data’ from these sources, as measured by program records.
# Have a working prototype of an [[Listing Page Classifier|automated classifier]] to distinguish between incubators and other entities described in seed data, as measured by program records.
# Collect data in at least [[Incubators in Five Ecosystems|5 ecosystems]], as measured by availability of a dataset.
# Develop a [[LP Extractor Protocol|protocol for the tool]] to extract client company identity information from incubator websites, as measured by program records.
'''By June 2019'''
# Have a working [[Listing Page Extractor|prototype of a tool]] to identify client company listings from incubator websites, as measured by program records.
# Upload the collected data to GitHub, Dataverse, or another publicly accessible web platform for use by a set of academics, as measured by program records.
# Produce a summary of the [[Incubator Project Open Development|open development process]] for the prototype, as measured by program materials.
==Expected Outcomes==