Difference between revisions of "Start-Ups of Houston (Map)"

From edegan.com

Revision as of 10:27, 19 July 2016


McNair Project
Start-Ups of Houston (Map)
Copyright © 2016 edegan.com. All Rights Reserved.


This project falls under the Houston Entrepreneurship umbrella.

Linked in the Houston Accelerators and Incubators (Report) page.

Abstract

Using lists mined from websites, web lists, and databases, this map will precisely locate and diagram the startups of Houston, TX. Later additions will include corresponding wiki pages for individual companies as well as maps of startup resources (including accelerators, incubators, angels, and VC firms).

Report

From file File:HStartupMaster7.xlsx

1454 rows (company names), 11 data columns (including name).

Column names: normname, industry, website, descr, street, city, zip, totraised, founddate, accelerator, accelerator2

11 accelerators are represented; 6 companies list 2 accelerators.
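
The accelerator counts above could be recomputed from an export of the master file along the following lines. This is a minimal sketch, assuming the spreadsheet has been saved as CSV with the column names listed above. The sample rows and the `accelerator_summary` helper are hypothetical illustrations, not part of the project's own tooling.

```python
import csv
import io

# Column names as listed in the report.
COLUMNS = ["normname", "industry", "website", "descr", "street", "city",
           "zip", "totraised", "founddate", "accelerator", "accelerator2"]


def accelerator_summary(rows):
    """Return (distinct accelerators, companies that list two accelerators)."""
    accelerators = set()
    with_two = 0
    for row in rows:
        first = (row.get("accelerator") or "").strip()
        second = (row.get("accelerator2") or "").strip()
        accelerators.update(name for name in (first, second) if name)
        if first and second:
            with_two += 1
    return len(accelerators), with_two


# Hypothetical sample data, NOT taken from HStartupMaster7.xlsx.
SAMPLE = """\
normname,industry,website,descr,street,city,zip,totraised,founddate,accelerator,accelerator2
Alpha Co,Software,http://alpha.example,Demo,1 Main St,Houston,77002,100000,2014,Accel A,Accel B
Beta Co,Energy,http://beta.example,Demo,2 Main St,Houston,77003,,2015,Accel A,
Gamma Co,Health,http://gamma.example,Demo,3 Main St,Houston,77004,,,,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
print(accelerator_summary(rows))  # (2, 1)
```

Run against the real export, the same function would be expected to report the 11 accelerators and 6 two-accelerator companies stated above, provided the accelerator names were normalized consistently.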

Processes

Steps taken

  1. Mined websites like AngelList, Crunchbase, StartupBlink, Houston Startups List, etc.
  2. Cleaned data
    1. Columns align with headers all the way down
    2. Websites actually belong to the company (not YouTube or AngelList pages)
    3. There are no "new lines" in individual cells
    4. There are no open quotes (or really just no quotes in general is best)
  3. Uploaded into the Houston psql database
    1. Saved as UTF-8 encoding
  4. Used the Matcher to match the compiled names against themselves
    1. Used this matched file to standardize/normalize names for future data consolidation
  5. Made a distinct list of Houston startups using the file above
  6. Made a priority list for importing data into the Masterfile
  7. Using the priority list, populated empty columns in the Masterfile from each of the mined tables
    1. Had to go back and separate some things out, such as addresses or multiple accelerators
  8. Exported the Masterfile into Excel
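
The cleaning checks in step 2 and the priority-based fill in steps 6 and 7 can be sketched as follows. This is an illustrative sketch only, not the project's actual scripts: the function names, the in-memory data layout (dictionaries keyed by normalized name), and the list of profile hosts are all assumptions. The Matcher and the psql upload are not reproduced here.

```python
from urllib.parse import urlparse

# Hosts that indicate a profile page rather than the company's own site.
# These are the two examples named in step 2.2; the tuple is an assumption.
PROFILE_HOSTS = ("youtube.com", "angel.co")


def row_problems(row, n_columns):
    """Return the cleaning problems found in one row (a list of cell strings)."""
    problems = []
    if len(row) != n_columns:
        problems.append("column count does not match header")
    if any("\n" in cell or "\r" in cell for cell in row):
        problems.append("new line inside a cell")
    if any('"' in cell for cell in row):
        problems.append("quote character inside a cell")
    return problems


def is_company_site(url):
    """False when the URL points at a known profile host.

    Assumes the URL includes a scheme (http:// or https://); without one,
    urlparse finds no hostname and the URL is not flagged.
    """
    host = (urlparse(url).hostname or "").lower()
    return not any(host == h or host.endswith("." + h) for h in PROFILE_HOSTS)


def fill_by_priority(master, sources, columns):
    """Fill empty master cells from sources, highest-priority source first.

    master:  dict mapping normname -> dict of column -> value
    sources: list of dicts shaped like master, ordered by priority
    Existing master values are never overwritten.
    """
    for source in sources:
        for name, record in master.items():
            candidate = source.get(name, {})
            for column in columns:
                if not record.get(column) and candidate.get(column):
                    record[column] = candidate[column]
    return master
```

Because a cell is filled only when it is still empty, a value from a higher-priority source is never replaced by a later one, which is the point of fixing the import order before populating the Masterfile.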

Future Steps

  1. Use a WHOIS parser to fill gaps (addresses and founding dates)
  2. Upload individual startups into their own wiki pages
  3. Repeat these steps with venture firms, angels (and angel groups), accelerators, incubators, service firms, flex and co-working spaces, event spaces, etc.

References

https://angel.co/

http://www.startupblink.com/Houston-startups

http://houston.startups-list.com/

https://www.crunchbase.com/#/home/index

SDC Platinum

Ed Egan