Talk:Start-Ups of Houston (Map)
Revision as of 16:50, 5 July 2016 by BenBaldazo
Meeting Notes 2016/07/05
How do we look at addresses?
- Look through business registration data
- Crawl Google Maps
- Phonebooks are not going to work
- Track IP addresses with an email containing an image
- Use occupancy permits to check whether an address is residential or not
- Social media? Can we use Twitter or LinkedIn data?
Other Info to be Considered
- Dillan has a list of 2000 companies (minority- or women-related; the exact criterion is unclear)
- Co-working space lists. But how do we do this in a way that will scale?
- Crunchbase can give us who-funded-whom information
Map & Angel notes
- Angels are important but angels can't go on a map
- VCs, accelerators (Acc), and incubators (Inc) can go on a map
- Can use angel names to look them up on LinkedIn
- Maybe look at SEC data on investors
Next Steps
- Use Viral's script to find more websites (1007/1453 so far)
- Run the websites through the "WhoIs Parser" to get registered addresses and potential founding dates
- Some Python script (unspecified)
- Use Google Maps to get Location Coordinates
- Make a Map
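As a rough illustration of the WhoIs step above, here is a minimal Python sketch that pulls a registered address and a creation date out of raw WHOIS text. It is not the "WhoIs Parser" mentioned in these notes. The sample record, the field labels ("Creation Date", "Registrant Street", and so on), and the function names are all assumptions; real WHOIS output varies by registrar and is often redacted.

```python
from datetime import datetime

# Hypothetical WHOIS response, for illustration only.
SAMPLE_WHOIS = """\
Domain Name: EXAMPLE-STARTUP.COM
Creation Date: 2014-03-18T15:04:05Z
Registrant Street: 123 Main St
Registrant City: Houston
Registrant State/Province: TX
Registrant Postal Code: 77002
"""

# Assumed field labels that make up a registrant address.
ADDRESS_KEYS = (
    "Registrant Street",
    "Registrant City",
    "Registrant State/Province",
    "Registrant Postal Code",
)


def parse_whois(text):
    """Collect 'Key: value' lines into a dict, keeping the first value per key."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields.setdefault(key.strip(), value.strip())
    return fields


def registered_address(fields):
    """Join whichever address parts are present into one string."""
    parts = [fields.get(key) for key in ADDRESS_KEYS]
    return ", ".join(part for part in parts if part)


def creation_date(fields):
    """Return the domain creation date, a possible proxy for founding date."""
    raw = fields.get("Creation Date")
    if raw is None:
        return None
    # Keep only the YYYY-MM-DD prefix of the timestamp.
    return datetime.strptime(raw[:10], "%Y-%m-%d").date()


fields = parse_whois(SAMPLE_WHOIS)
print(registered_address(fields))
print(creation_date(fields))
```

The address string produced here is the kind of input a geocoder would need for the coordinates step. A domain's creation date is only a potential founding date, since domains are often registered before or after the company is formed.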
Startup Map
Total Unique Startup Names: 1451
Total Unique Accelerator Names: 13
Houston Startup Sources:
- AngelList 500
  - Joined 393
  - Signal 394
  - Total Raised 204
    - Total Raised is actually made redundant by the other two AngelList pulls
- HoustonStartupsList 283
- StartupBlinkMap 379
- Startups-Accelerators 292
- SDC VC Houston Port Cos 493
- CrunchBase 116
- StartHouston 27
Towards unique names
Steps:
- Put all the names in one text file (done)
- Sort the file and remove exact duplicates using TextPad (done)
- Run the matcher on that file in mode 2 (rerun)
- Clean that match file manually for idiosyncratic issues (rerun - only 2 problems)
- Load all 7 base files into a database
- Load the match file into the database
- Use SQL to get the unique names for each entry in a base file (7 queries)
- Assemble all of the common variables together, taking the best available (somewhat subjective) in SQL, and add the extra variables.
- Output the new master file to work with!
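The database steps above could look roughly like the following sketch, which uses Python's built-in sqlite3 module with an in-memory database. Only two of the seven base files are shown. All table names, column names, and rows are hypothetical, and the match file is assumed to map each raw name to one unique name. "Best available" is approximated here with COALESCE, which simply takes the first non-null value; the actual choice was described as somewhat subjective.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Two hypothetical base files, one table each.
cur.execute("CREATE TABLE angellist (name TEXT, location TEXT)")
cur.execute("CREATE TABLE crunchbase (name TEXT, location TEXT, description TEXT)")
# Hypothetical match file: raw name -> unique (standardized) name.
cur.execute("CREATE TABLE matches (raw_name TEXT PRIMARY KEY, unique_name TEXT)")

cur.executemany("INSERT INTO angellist VALUES (?, ?)",
                [("Acme Inc.", None), ("Bayou Labs", "Houston, TX")])
cur.executemany("INSERT INTO crunchbase VALUES (?, ?, ?)",
                [("Acme", "Houston, TX", "Widgets"),
                 ("Cargo Co", "Katy, TX", "Logistics")])
cur.executemany("INSERT INTO matches VALUES (?, ?)",
                [("Acme Inc.", "Acme"), ("Acme", "Acme"),
                 ("Bayou Labs", "Bayou Labs"), ("Cargo Co", "Cargo Co")])

# One query per base file: attach the unique name to each entry.
cur.execute("""CREATE TABLE al_unique AS
               SELECT m.unique_name, a.location
               FROM angellist a JOIN matches m ON m.raw_name = a.name""")
cur.execute("""CREATE TABLE cb_unique AS
               SELECT m.unique_name, c.location, c.description
               FROM crunchbase c JOIN matches m ON m.raw_name = c.name""")

# Assemble the master file: one row per unique name, common variables
# merged with COALESCE, extra variables carried along.
master = cur.execute("""
    SELECT u.unique_name,
           COALESCE(al.location, cb.location) AS location,
           cb.description
    FROM (SELECT DISTINCT unique_name FROM matches) u
    LEFT JOIN al_unique al ON al.unique_name = u.unique_name
    LEFT JOIN cb_unique cb ON cb.unique_name = u.unique_name
    ORDER BY u.unique_name
""").fetchall()

for row in master:
    print(row)
```

This sketch assumes each unique name appears at most once per base file. If a source lists the same company twice under different raw names, the joins would produce duplicate rows, so a per-source GROUP BY would be needed first.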
Necessary Categories to include in each individual Wiki page
- Name
- Location
- Desc
- accelerator (if available)
Optional Categories
- Contact info
- Cohort of accelerator
- Industry
To Do
Next Steps:
- Standardize names
- Match up SQL tables
- Use URLs to find missing addresses
  - Does it matter if a website now redirects to a new URL?
- Remove non-Houston startups
- Import into Individual Wiki Pages
- Import into Map
- Repeat process with:
  - Accelerators
  - Angels
  - Incubators
  - Angel Groups
  - Venture Capital
  - Service Firms
  - Co-Working Spaces
  - Event Spaces
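For the "Standardize names" item, one possible starting point is the Python sketch below. It is not the matcher used earlier on this page. The suffix list and the function name standardize_name are assumptions for illustration, and the rules would need tuning against the actual name list.

```python
import re

# Assumed set of legal suffixes to drop; extend as needed.
SUFFIXES = {"inc", "llc", "ltd", "corp", "co", "lp", "llp"}


def standardize_name(name):
    """Lowercase, strip punctuation, and drop trailing legal suffixes."""
    # Replace anything that is not a word character, whitespace, or '&'.
    cleaned = re.sub(r"[^\w\s&]", " ", name.lower())
    tokens = cleaned.split()
    # Drop trailing suffixes, but never empty the name entirely.
    while len(tokens) > 1 and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


for raw in ["Acme, Inc.", "ACME LLC", "Bayou  Labs Co., Ltd."]:
    print(repr(raw), "->", repr(standardize_name(raw)))
```

Names that collapse to the same standardized form become candidates for merging. They should still be reviewed by hand, since two different companies can share a name once suffixes are removed.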
Future
Possible Expansions:
- Calendar that correlates with the map
- Proximity measures & Microgeography
- Weak/Strong Areas in Houston for Entrepreneurship
- Comparing accelerators based on funding
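As one way to make "proximity measures" concrete, the Python sketch below computes great-circle distance with the haversine formula and counts how many points fall within a radius. The coordinates are illustrative values near Houston, not data from the startup list, and the function names are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def count_within(origin, points, radius_km):
    """Count points no farther than radius_km from origin."""
    return sum(
        1 for lat, lon in points
        if haversine_km(origin[0], origin[1], lat, lon) <= radius_km
    )


# Illustrative coordinates only.
origin = (29.76, -95.37)
points = [(29.76, -95.37), (29.77, -95.37), (30.76, -95.37)]
print(count_within(origin, points, 5.0))
```

Counting startups within a fixed radius of each accelerator or co-working space is one simple density measure that could feed the weak/strong area comparison.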