*Chenyu's code and datasets are in .\matlab
*.\linearmodel is the current STATA work
Chenyu's box (available until Oct 31st 2019) is here: https://rochester.app.box.com/s/nvtqgpmyygjykes3lcx9s53sfzmu8c27
It contains:
*working folder
*sample batch files
Both folders were cloned to E:\projects\unobservedcomplementarities\Chenyusbox on 17th July 2019.
Notes:
*data_import3.m uses MasterRealC20YearFullPlus.txt, which is the latest dataset
===Linear Model===
E:\projects\unobservedcomplementarities\linearmodel
====Rebuilding Marcos====
Marcos starts with a dataset of reals with a single synthetic, and then constructs a dataset of reals with all synthetics (in the same year and code20). Table 1 gives some LPMs using two sets of variables, with and without VC-yearmet fixed effects. These are replicated in the new do file.

In order to get something close to Marcos's reported numbers, I create a one-to-one variable so that each real match has only a single synthetic match. This gives about 60k observations, as compared with Marcos's 64k (and as opposed to 445k for the full sample). The key regression coefficients are very close to those in Table 1.

There are some caveats, however. Marcos is using:
*Amounts in billions (as am I) without taking logs (of 1+x)
*Firmid x year (which he refers to as VC x yearmet) fixed effects, as opposed to year (i.e., dealminroundyear) x pccode20 fixed effects, which correctly define a market
*No restrictions on timing

Table 5 gives some LPMs before and after a Lasso. The hqdist variable was first transformed so that hqdist = hqdist/1000. The labels in the pdf are somewhat misleading. The translation is as follows:

 PDF -> Marcos variable -> New Variable
 --------------------------------------
 hdqist -> c.hqdist##c.hqdist
 sumprevsameindu20 -> c.sumprevsameindu20##c.sumprevsameindu20
 serials -> c.serials##c.numprevportco
 numprevportcos -> c.patentsprevc##c.numprevportco
 firmtenure -> c.serials##c.firmtenure
 patentsprevc -> c.patentsprevc##c.firmtenure
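
For reference, the one-to-one restriction and the market fixed effects described in this section could be sketched in Stata roughly as below. This is an illustration only, not the actual do file. The names realid, synthetic, amount, u, onetoone, isreal, lnamount and market are hypothetical placeholders. The data layout (one row per real match plus one row per synthetic, sharing a realid), the random choice of which synthetic to keep, the seed, and the clustering are all assumptions, since these notes do not record how the one-to-one variable was built. Only hqdist, dealminroundyear and pccode20 are variable names taken from the notes.

<pre>
* Sketch only. realid, synthetic and amount are hypothetical names;
* hqdist, dealminroundyear and pccode20 are the variables named above.

* Flag the real row and one randomly chosen synthetic per real match
* (assumed selection rule; assumes one real row per realid)
set seed 12345
gen double u = runiform()
bysort realid synthetic (u): gen byte onetoone = (_n == 1)

* Outcome for the LPM: 1 for a real match, 0 for a synthetic one
gen byte isreal = (synthetic == 0)

* Rescale distance and take logs of (1 + amount in billions)
replace hqdist = hqdist/1000
gen double lnamount = ln(1 + amount)

* Market fixed effects: year x pccode20
egen long market = group(dealminroundyear pccode20)

* LPM on the one-to-one sample, absorbing the market fixed effects
areg isreal c.hqdist##c.hqdist lnamount if onetoone, absorb(market) vce(cluster market)
</pre>

Dropping the "if onetoone" condition would run the same specification on the full sample of reals with all synthetics.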
===Notes from Conference Call===