The dataset was rebuilt using vcdb3 (see [[VentureXpert Data]]) and then fixed by Ed using RevisedDBaseCode.sql in E:\projects\vcdb3 and /bulk/vcdb3.
After that, Ed did the following:
# Rebuild the code so that only matched VCs are used as synthetics
# Add year and industry as variables
# Add variables:
## City rankings over time, lagged by 1 year and by 5 years
## Age
## Various serial measures
## Rebuild data dictionary!
## Definition of industry codes again
Specifically:
E:\projects\vcdb3\MatchingEntrepsV3.sql
Fix PortCoMatchMaster and copeopleaggsimple so that they create and pass through:
E.serialceopres, E.serialfounder, E.ceopres, E.singularceopres, E.founders, E.hasfounders, E.prevs, E.prevceopres, E.prevfounders
Also fix doctors!
E:\projects\vcdb3\RevisedDBaseCode.sql
Fix PortCoSuper and add:
C.serialceopres AS pcserialceopres,
C.serialfounder AS pcserialfounder,
C.ceopres AS pcceopres,
C.singularceopres AS pcsingularceopres,
C.founders AS pcfounders,
C.hasfounders AS pchasfounders,
C.prevs AS pcprevs,
C.prevceopres AS pcprevceopres,
C.prevfounders AS pcprevfounders,
C.serialceopres*C.singularceopres AS pcserialceopressingular,
C.serialfounder*C.hasfounders AS pcserialfounderhas,
C.prevceopres*C.singularceopres AS pcprevceopressingular,
C.prevfounders*C.hasfounders AS pcprevfoundershas
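The last four columns above are interaction terms: each multiplies a serial or previous-experience measure by a flag. A minimal sketch of what that product does, run through Python's built-in sqlite3 module, is given here. It assumes (from the names only) that singularceopres and hasfounders are 0/1 indicators; the toy table, the portco column, and all values are made up for illustration and are not taken from vcdb3.

```python
import sqlite3

# In-memory toy table standing in for the table aliased as C in the notes.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE C (portco TEXT, serialceopres INTEGER, singularceopres INTEGER, "
    "serialfounder INTEGER, hasfounders INTEGER)"
)

# Made-up rows: (portco, serialceopres, singularceopres, serialfounder, hasfounders)
rows = [
    ("A", 1, 1, 0, 1),
    ("B", 1, 0, 2, 1),
    ("C", 0, 1, 3, 0),
]
conn.executemany("INSERT INTO C VALUES (?, ?, ?, ?, ?)", rows)

# Same product expressions as in the notes: the measure survives only
# where the (assumed 0/1) flag equals 1, and is zeroed out otherwise.
query = """
SELECT portco,
       C.serialceopres*C.singularceopres AS pcserialceopressingular,
       C.serialfounder*C.hasfounders AS pcserialfounderhas
FROM C
ORDER BY portco
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

If either input can be NULL in the real tables, the product is NULL as well, so it may be worth checking whether missing values should be treated as 0 before multiplying.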
Fix E:\projects\vcdb3\OriginalSQL\Ranking.sql (note: originally fixed in E:\projects\vcdb3\Ranking.sql). Specifically, add the rankingfull queries: city, state, year, dollarsrank, dealsrank, aliverank, overallrank.
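As an illustration of the city rankings and their 1-year and 5-year lags, the following is a rough sketch in plain Python rather than the SQL in Ranking.sql. The function names, the tie-handling rule, and the toy figures are all assumptions made for the example; the notes do not say how dealsrank, aliverank, or overallrank are built, so only a dollars-based rank is shown.

```python
from collections import defaultdict

def rank_by_year(rows, measure):
    """Return {(city, state, year): rank}, where rank 1 is the largest
    value of `measure` within that year. Tied rows share a rank."""
    by_year = defaultdict(list)
    for r in rows:
        by_year[r["year"]].append(r)

    ranks = {}
    for year, group in by_year.items():
        ordered = sorted(group, key=lambda r: r[measure], reverse=True)
        for i, r in enumerate(ordered):
            prev = ordered[i - 1] if i > 0 else None
            if prev is not None and r[measure] == prev[measure]:
                # Tie: reuse the rank given to the previous row.
                rank = ranks[(prev["city"], prev["state"], year)]
            else:
                rank = i + 1
            ranks[(r["city"], r["state"], year)] = rank
    return ranks

def lagged_rank(ranks, city, state, year, lag):
    """Rank of the city `lag` years earlier, or None if it has no rank then."""
    return ranks.get((city, state, year - lag))

# Made-up data, for illustration only.
rows = [
    {"city": "Houston", "state": "TX", "year": 2000, "dollars": 50.0},
    {"city": "Austin", "state": "TX", "year": 2000, "dollars": 80.0},
    {"city": "Houston", "state": "TX", "year": 2001, "dollars": 90.0},
    {"city": "Austin", "state": "TX", "year": 2001, "dollars": 70.0},
]
dollarsrank = rank_by_year(rows, "dollars")
lag1 = lagged_rank(dollarsrank, "Houston", "TX", 2001, 1)
lag5 = lagged_rank(dollarsrank, "Houston", "TX", 2001, 5)
print(lag1, lag5)
```

Returning None for a missing lag keeps "no ranking available" distinct from a poor rank, which matters for cities that only enter the data in later years.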
Finally, in:
E:\projects\vcdb3\MatchingEntreps2VCs
Make:
*MatchingVCEntrepRevisions.sql
*MasterRealC20YearFullPlus.txt
*MasterRealC20YearFullPlus - DataDictionary.txt
===Current Plan===
New potential variables still being considered for inclusion:
*VC historic CAPUM?
*Industry-Year Measures? Needs input from Chenyu. Likely not useful.
Currently Chenyu is using:
*Distance
*Sub-sector specific expertise of VC - could broaden definition
**Currently: mostly small markets (10-15 matches) using pccode20
**Might end up with more large markets
*Startup specific experience
**Patent counts: mostly zeros (95%).
This will be revised once he has the new data set.
Chenyu is now going to do:
*Monte Carlo with data drawn from the empirical distribution
*Actual estimation - doesn't take long
*Reduced-form estimation: VC investment and outcomes? Logit on outcomes (exit measures)? Real match as the explanatory variable, plus match characteristics and controls
*Target: May
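For the Monte Carlo item, one common reading of "data from the empirical distribution" is resampling observed values with replacement. The sketch below shows only that step, under that assumption. The notes do not describe how Chenyu's simulation is actually set up, and the function name, sample sizes, and patent counts here are made up (the counts are chosen to be 95% zeros, echoing the share noted above).

```python
import random

def empirical_draws(observed, n_draws, n_reps, seed=0):
    """Draw n_reps samples of size n_draws from the empirical distribution
    of `observed`, i.e. sample observed values with replacement."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    return [rng.choices(observed, k=n_draws) for _ in range(n_reps)]

# Made-up patent counts: 19 zeros and one non-zero value (95% zeros).
observed_patents = [0] * 19 + [3]

samples = empirical_draws(observed_patents, n_draws=20, n_reps=100)

# Share of zeros in each simulated sample, then averaged across samples.
share_zero = [s.count(0) / len(s) for s in samples]
mean_share = sum(share_zero) / len(share_zero)
print(round(mean_share, 3))
```

Each simulated sample can then be fed to the estimator to see how well it recovers known parameters before running the actual estimation.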
==Reference Papers==