* matchprevportcos -- no nulls.
* pcnumperson -- this is a conceptually and operationally terrible variable! See below.
* pccitydollarsrankm1 -- matching on placename has issues: lots of missing!
* pcexp -- similar issues to pcnumperson. See below.
Dropping the entire market is surely way too extreme. We should just drop the offending portco and only drop the market if the number of real matches drops below our threshold (e.g., 5).
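A minimal sketch of that rule, assuming a hypothetical table portcomatch(market, portco, isbad) with one row per matched portco and a flag for the offending ones. The table and column names are illustrative only and are not taken from the actual build scripts. The threshold of 5 is the example value mentioned above.

```sql
-- Hypothetical input: portcomatch(market, portco, isbad)
WITH kept AS (
    -- Drop only the offending portcos, not their whole market
    SELECT market, portco
    FROM portcomatch
    WHERE NOT isbad
),
marketcounts AS (
    -- Count the matches that remain in each market
    SELECT market, COUNT(*) AS nummatches
    FROM kept
    GROUP BY market
)
-- Keep a market only if it still has at least 5 matches
SELECT k.market, k.portco
FROM kept AS k
JOIN marketcounts AS m ON m.market = k.market
WHERE m.nummatches >= 5;
```

With this shape, a market disappears only when removing its bad portcos leaves it under the threshold, and every other market keeps all of its good matches.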
=== pccitydollarsrankm1 ===
There are a number of possible explanations for why this variable has lots of missing values.
There do seem to be missing placenames. 4,855 of 69,882 PortCoSuper records don't join to PlaceYearRanking on placename and state (ignoring year), and 4,561 of these have valid zips. However, only 263 had growth VC and just 82 have non-null positive invested amounts, so this isn't the issue.
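The unmatched count above can be reproduced with an anti-join along the following lines. This is a sketch, not the query that was actually run. It assumes both tables carry columns named placename and state, which is a guess based on the join description.

```sql
-- Count PortCoSuper records with no PlaceYearRanking row for the same
-- placename and state, in any year (column names are assumed)
SELECT COUNT(*) AS numunmatched
FROM PortCoSuper AS p
WHERE NOT EXISTS (
    SELECT 1
    FROM PlaceYearRanking AS r
    WHERE r.placename = p.placename
      AND r.state = p.state
);
```

Records with a NULL placename or state can never satisfy the equality test, so they are counted as unmatched here.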
=== pcnumperson / pcexp ===
WHEN firmstageprefno IS NULL AND firmcat IN ('Ecosystem','SBIC','Angel','Gov') AND NOT (dealseed >= 1 OR dealearly >= 1) THEN 0::int
WHEN firmstageprefno IS NULL THEN 1::int
ELSE 0::int END AS matchinstagebroad,
=== V1 ===