Difference between revisions of "Talk:Start-Ups of Houston (Map)"

From edegan.com
 

Latest revision as of 16:51, 5 July 2016

Meeting Notes 2016/07/05

How do we look at addresses?

  1. Look through business registration data
  2. Crawl Google Maps
  3. Phonebooks are not going to work
  4. Track IP addresses with an email containing an image
  5. Use occupancy permits to check whether an address is residential or not
  6. Social media? Can we use Twitter or LinkedIn data?

Other Info to be Considered

  • Dillan has a list of 2000 companies (with minorities or women or something)
  • Co-working space lists. But how do we do this in a way that will scale?
  • Crunchbase can give us who-funded-whom information

Map & Angel notes

  • Angels are important but angels can't go on a map
  • VCs, accelerators (Acc), and incubators (Inc) can go on a map
  • Can use Angel Names to look at LinkedIn
  • Maybe look at SEC data on investors

Next Steps

  1. Use Viral's script to find out more websites (1007/1454 so far)
  2. Use Websites in "WhoIs Parser" to get registered addresses and potential founding dates
  3. Some Python script (unspecified)
  4. Use Google Maps to get Location Coordinates
  5. Make a Map
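
Step 2 of the list above could be sketched roughly as follows: a minimal parser that pulls a creation date and registrant address fields out of raw WHOIS text. This is not the "WhoIs Parser" mentioned in the notes. The field labels (`Creation Date`, `Registrant Street`, and so on) are an assumption based on a common registrar layout and vary between registries, and the sample record is invented for illustration.

```python
import re
from datetime import datetime

# Assumed field labels; real WHOIS output differs between registries.
FIELDS = {
    "created": "Creation Date",
    "street": "Registrant Street",
    "city": "Registrant City",
    "state": "Registrant State/Province",
    "postal_code": "Registrant Postal Code",
}


def parse_whois(text):
    """Return a dict of the fields found in raw WHOIS text (missing ones are None)."""
    record = {}
    for key, label in FIELDS.items():
        pattern = rf"^\s*{re.escape(label)}:\s*(.+?)\s*$"
        match = re.search(pattern, text, re.MULTILINE | re.IGNORECASE)
        record[key] = match.group(1) if match else None
    if record["created"]:
        try:
            # Keep only the date part, e.g. "2012-03-14T09:30:00Z" -> 2012-03-14.
            record["created"] = datetime.strptime(record["created"][:10], "%Y-%m-%d").date()
        except ValueError:
            record["created"] = None  # unrecognized date format
    return record


# Invented sample record, for illustration only.
SAMPLE = """\
Domain Name: EXAMPLE-STARTUP.COM
Creation Date: 2012-03-14T09:30:00Z
Registrant Street: 123 Main St
Registrant City: Houston
Registrant State/Province: TX
Registrant Postal Code: 77002
"""

if __name__ == "__main__":
    print(parse_whois(SAMPLE))
```

A domain's creation date is only a proxy for a founding date, and many registrations hide the registrant address behind a privacy service, so any field may come back as None.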

Startup Map

Total Unique Startup Names: 1454

Total Unique Accelerator Names: 13


Houston Startup Sources:

  • AngelList 500
    • Joined 393
    • Signal 394
    • Total Raised 204
      • Total Raised is made redundant by the other two AngelList pulls
  • HoustonStartupsList 283
  • StartupBlinkMap 379
  • Startups-Accelerators 292
  • SDC VC Houston Port Cos 493
  • CrunchBase 116
  • StartHouston 27
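
As a rough check on how much the sources overlap, the top-level counts above can be summed and compared with the unique-name total. This sketch assumes the seven top-level figures are record counts for the seven base files and that the 1454 unique names were derived from exactly these files; neither is stated explicitly in the notes.

```python
# Counts copied from the source list above (top-level sources only).
source_counts = {
    "AngelList": 500,
    "HoustonStartupsList": 283,
    "StartupBlinkMap": 379,
    "Startups-Accelerators": 292,
    "SDC VC Houston Port Cos": 493,
    "CrunchBase": 116,
    "StartHouston": 27,
}
unique_names = 1454  # "Total Unique Startup Names" reported above

raw_total = sum(source_counts.values())  # entries before de-duplication
overlap = raw_total - unique_names       # entries that duplicate another entry

print(f"raw entries: {raw_total}, duplicates removed: {overlap}")
```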

Towards unique names

Steps:

  1. Put all the names in one text file (done)
  2. Sort the file and remove exact duplicates using TextPad (done)
  3. Run the matcher on that file in mode 2 (rerun)
  4. Clean that match file manually for idiosyncratic issues (rerun - only 2 problems)
  5. Load all 7 base files into a database
  6. Load the matchfile into the database
  7. Use SQL to get the unique names for each entry in a base file (7 queries)
  8. Assemble all of the common variables in SQL, taking the best available value for each (somewhat subjective), and add the extra variables.
  9. Output the new master file to work with!
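
Steps 5 to 7 above might look something like the following sketch, which uses an in-memory SQLite database. The table layout, the two-column match-file format (raw name, canonical name), and the sample rows are all assumptions made for illustration; the actual matcher output and database used in the project are not described in these notes.

```python
import sqlite3


def build_master(base_rows, match_rows):
    """Map every base-file entry to its canonical name.

    base_rows:  iterable of (source, name) pairs from the base files.
    match_rows: iterable of (name, canonical_name) pairs from the match file.
    Returns {canonical_name: [sources that list it]}.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE base (source TEXT, name TEXT)")
    con.execute("CREATE TABLE matches (name TEXT PRIMARY KEY, canonical TEXT)")
    con.executemany("INSERT INTO base VALUES (?, ?)", base_rows)
    con.executemany("INSERT INTO matches VALUES (?, ?)", match_rows)

    # Names without a match-file entry fall back to themselves.
    rows = con.execute(
        """
        SELECT DISTINCT COALESCE(m.canonical, b.name) AS startup, b.source
        FROM base AS b
        LEFT JOIN matches AS m ON m.name = b.name
        ORDER BY startup, b.source
        """
    ).fetchall()
    con.close()

    master = {}
    for startup, source in rows:
        master.setdefault(startup, []).append(source)
    return master


if __name__ == "__main__":
    # Invented sample rows, for illustration only.
    base = [
        ("AngelList", "Acme Energy Inc"),
        ("CrunchBase", "Acme Energy"),
        ("StartHouston", "Bayou Labs"),
    ]
    matches = [("Acme Energy Inc", "Acme Energy")]
    print(build_master(base, matches))
```

Keeping the source list per canonical name makes it easier to decide later which file supplies the "best available" value for each variable.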


Necessary Categories to include in each individual Wiki page

  • Name
  • Location
  • Description
  • Accelerator (if available)

Optional Categories

  • Contact info
  • Cohort of accelerator
  • Industry

To Do

Next Steps:

  • Standardize names
  • Match up SQL tables
  • Use URLs to find missing addresses
    • Does it matter if a website now redirects to a new URL?
  • Remove non-Houston startups
  • Import into Individual Wiki Pages
  • Import into Map
  • Repeat Process with:
    • Accelerators
    • Angels
    • Incubators
    • Angel Groups
    • Venture Capital
    • Service Firms
    • Co-Working Spaces
    • Event Spaces
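
For the "Standardize names" item, one possible approach is sketched below: lowercase the name, replace punctuation with spaces, and drop trailing legal suffixes so that variants of the same company compare equal. The suffix list is an assumption and would need extending for the real data, and the example names are invented.

```python
import re

# Legal suffixes to drop; this list is an assumption, extend as needed.
SUFFIXES = {"inc", "incorporated", "llc", "ltd", "corp", "corporation", "co", "company"}


def standardize_name(name):
    """Lowercase, replace punctuation with spaces, and drop trailing legal suffixes."""
    cleaned = re.sub(r"[^\w\s]", " ", name.lower())
    tokens = cleaned.split()
    # Always keep at least one token so a name like "Inc" is not emptied.
    while len(tokens) > 1 and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


if __name__ == "__main__":
    # Invented example names, for illustration only.
    for raw in ["Acme Energy, Inc.", "ACME ENERGY LLC", "Bayou Labs Co., Ltd."]:
        print(repr(raw), "->", repr(standardize_name(raw)))
```

The standardized form is best stored alongside the original name rather than in place of it, since the wiki pages and map will presumably want the display name.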

Future

Possible Expansions:

  • Calendar that correlates with the map
  • Proximity measures & Microgeography
  • Weak/Strong Areas in Houston for Entrepreneurship
  • Comparing accelerators based on funding
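
A simple proximity measure could start from great-circle distances between geocoded startups. The sketch below uses the haversine formula and counts how many points lie within a radius of a chosen center. The coordinates are approximate and purely illustrative, and the 5 km radius is an arbitrary assumption.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def count_within(points, center, radius_km):
    """Count the (lat, lon) points lying within radius_km of center."""
    return sum(1 for lat, lon in points if haversine_km(lat, lon, *center) <= radius_km)


if __name__ == "__main__":
    # Approximate, illustrative coordinates only.
    downtown_houston = (29.76, -95.37)
    startups = [(29.76, -95.37), (29.77, -95.36), (30.27, -97.74)]
    print(count_within(startups, downtown_houston, radius_km=5))
```

Counts like this, computed per neighborhood or per accelerator location, would be one way to start identifying weak and strong areas for entrepreneurship.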