→Primary Data Set
MSAPortCos: Count(CoName) As NoPortCosFunded, CoMSASuper, RoundYear
...
'''Notes on Creation of Primary Data Set'''
Raw tables
* companies (last investment, first investment, company name, MSA, MSA code, address, state, date founded, known funding, industry)
* funds (fund closing date, last investment, first investment, fund name, address, MSA, MSA code, average investment, number of companies invested in (NoCos), known investment)
* rounds (round date, company name, state, round number, stage 1, stage 2, stage 3)
* combined rounds (company name, round date, disclosed amount, investor)
* msalist (maps MSAs to CMSAs, i.e. combined MSAs)
* industry list (collapses 6 industry categories into 4: ICT, Life Sciences, Semiconductors, Other)
Process
* Cleaned tables to eliminate duplicates and undisclosed values
* Updated all original tables to include CMSA and industry codes (companyinfo3, fundinfocleanfinal, roundinfoclean)
* Matched funds to avoid any issues with name variants (e.g. Fund ABC L.P./Fund ABC LP/Fund ABC)
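The fund-name matching step can be illustrated with a minimal Python sketch. It is not the matcher actually used for this data set: the function names and the list of legal-form suffixes are hypothetical, and the only case taken from these notes is the Fund ABC example.

```python
import re

# Hypothetical list of legal-form suffixes to strip; extend as needed.
_SUFFIXES = {"lp", "llc", "ltd", "inc"}


def normalize_fund_name(name):
    """Return a comparison key: lowercase, no punctuation, no trailing legal suffix."""
    # Drop periods first so "L.P." collapses to "LP" instead of "L P".
    cleaned = name.replace(".", "")
    # Turn any remaining punctuation into spaces and lowercase the result.
    cleaned = re.sub(r"[^0-9A-Za-z]+", " ", cleaned).lower()
    tokens = cleaned.split()
    # Strip trailing legal-form tokens, but never empty the name entirely.
    while len(tokens) > 1 and tokens[-1] in _SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def unmatched_names(round_investors, fund_names):
    """List round investors whose normalized name matches no fund name."""
    known = {normalize_fund_name(n) for n in fund_names}
    return sorted({n for n in round_investors if normalize_fund_name(n) not in known})
```

Running <code>unmatched_names</code> on the investor column of roundinfoclean against the fund names in fundinfocleanfinal would give a list of names that still need manual review; an exact-key comparison like this will not catch misspellings.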
To Do
* Match roundinfoclean investors to fundinfocleanfinal investors (roundinfo.txt >> cleanfundfinal.txt); 771 errors found, fix before uploading
* Match roundinfoclean conname to companyinfo3 coname (companyinfonames.txt >> companyroundnames.txt)
** Try without the matcher first to see how accurate the naming system is before falling back on the matcher
* Join round-to-fund (CMSA) and round/fund to company - cmsa/industry
**see code for superinfo
* Create supportable by MSA by year
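As a rough illustration of the by-MSA-by-year aggregation, the Python sketch below counts portfolio companies per (CMSA, round year), in the spirit of the MSAPortCos query fields (NoPortCosFunded, CoMSASuper, RoundYear). The function name and the tuple layout of the input are assumptions. It also counts each company once per CMSA and year, which is an assumption: a plain Count(CoName) would count every round row.

```python
from collections import defaultdict


def count_portfolio_companies(rounds):
    """Count distinct companies funded per (CMSA, round year).

    rounds: iterable of (company_name, cmsa, round_year) tuples
    (hypothetical layout, e.g. the joined round/company records).
    """
    seen = defaultdict(set)
    for coname, cmsa, year in rounds:
        # A set keeps each company once even if it has several rounds that year.
        seen[(cmsa, year)].add(coname)
    return {key: len(names) for key, names in seen.items()}
```

Each key of the returned dictionary is a (CMSA, year) pair, so the result can be written out as one row per CMSA per year.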
===Hub Candidates Data Set===