Marcos Ki Hyung Lee (Work Log)

From edegan.com
Revision as of 17:30, 22 June 2018 by Marcoslee (talk | contribs)

Summer 2018

2018-06-22:

Plans for today: try to fix dataset.

Found more errors in the matched dataset. The synthetic-firm variables seem to be wrong: they contain negative numbers and many missing values.

Also, variables from the matched firms, such as number of people, doctors, etc., and city name, are missing seemingly at random.

Egan walked me through the SQL code that generates the matched dataset. We made a more precise count of coinvestors. Before, we were double counting funds. Now, if a PortCo had an investment from only one VC fund, numcoinvestor == 0.
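A minimal sketch of the corrected count, assuming it amounts to counting distinct funds per PortCo and subtracting one. The records, fund names, and the function count_coinvestors are hypothetical; the real count is done in the project's SQL code, which is not reproduced here.

```python
from collections import defaultdict

# Hypothetical investment records as (portco, fund) pairs.
# A fund that invests in several rounds appears more than once.
investments = [
    ("PortCoA", "Fund1"),
    ("PortCoA", "Fund1"),  # second round by the same fund
    ("PortCoA", "Fund2"),
    ("PortCoB", "Fund3"),
]

def count_coinvestors(records):
    """Count distinct funds per PortCo, minus one for the focal fund."""
    funds_by_portco = defaultdict(set)
    for portco, fund in records:
        # A set keeps each fund once, which avoids the double counting.
        funds_by_portco[portco].add(fund)
    return {portco: len(funds) - 1 for portco, funds in funds_by_portco.items()}

numcoinvestor = count_coinvestors(investments)
```

Here PortCoB has a single fund, so its count is 0, and the repeated Fund1 row for PortCoA is counted only once.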

Looking into the synthetic variables problem, the affected variables are synsumprevsameindu100, synsumprevsameindu20, synsumprevsameindu, synsumprevsamesector, synnumprevportcos, syntotsameindu100, syntotsameindu20, syntotsameindu and syntotsamesector.

They count the PortCos that a VC invested in with the same industry code, before meeting the current matched PortCo (synsum*) and over all time (syntot*). So they should be integers with tot >= sum. However, for the synthetic firms, the sum variables are mostly -1 and the tot variables are missing.
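The expected properties of each sum/tot pair can be written as a small check. This is only a sketch: the function check_count_pair is hypothetical, missing values are represented as None, and the pairing of variables (for example synsumprevsameindu100 with syntotsameindu100) is assumed from the variable names.

```python
def check_count_pair(sum_value, tot_value):
    """Return a list of problems for one (synsum*, syntot*) pair of values."""
    problems = []
    for name, value in (("sum", sum_value), ("tot", tot_value)):
        if value is None:
            problems.append(f"{name} is missing")
        elif value != int(value):
            problems.append(f"{name} is not an integer")
        elif value < 0:
            problems.append(f"{name} is negative")
    # The all-time count can never be smaller than the count before the match.
    if sum_value is not None and tot_value is not None and tot_value < sum_value:
        problems.append("tot < sum")
    return problems

# The pattern seen on the synthetic firms: -1 on sum, missing on tot.
synthetic_problems = check_count_pair(-1, None)
```

Running the check over every row of the matched dataset would show how many observations are affected for each pair.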

Looking at the code that generates the synthetic variables, there seems to be a problem in the join, where one is subtracted from the sum of dummies for conditions such as A.code100 = B.code100. Can't figure out how to correct it yet.
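As a toy illustration only, and not the project's SQL, the sketch below shows one way a join of this kind can produce -1 and missing values: when a firm has no rows on the other side of a LEFT JOIN, the sum of dummies is 0 (so subtracting one gives -1), while a plain SUM over the comparison is NULL. The tables a and b, their columns, and the join key are all hypothetical, and whether this is the actual cause is an open question.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE a (firm TEXT, code100 INTEGER);  -- hypothetical focal firms
CREATE TABLE b (firm TEXT, code100 INTEGER);  -- hypothetical PortCo rows per focal firm
INSERT INTO a VALUES ('real', 10), ('synthetic', 20);
INSERT INTO b VALUES ('real', 10), ('real', 10);  -- no rows for 'synthetic'
""")

rows = con.execute("""
SELECT a.firm,
       -- CASE turns a NULL comparison into 0, so an unmatched firm gets 0 - 1
       SUM(CASE WHEN a.code100 = b.code100 THEN 1 ELSE 0 END) - 1 AS sum_minus_one,
       -- the bare comparison stays NULL, and SUM over only NULLs is NULL
       SUM(a.code100 = b.code100) AS tot_raw
FROM a LEFT JOIN b ON a.firm = b.firm
GROUP BY a.firm
ORDER BY a.firm
""").fetchall()

result = {firm: (sum_minus_one, tot_raw) for firm, sum_minus_one, tot_raw in rows}
con.close()
```

In this toy data the 'synthetic' firm ends up with -1 on the first column and a missing value on the second, which is the same pattern observed in the dataset.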

2018-06-21:

Plans for today: get a full understanding of dataset and variables, start making some summary statistics.

Inspected the matched dataset and found inconsistencies in the investment amounts of VCs in PortCos. Talked to Egan about this; we will check the source SQL code carefully tomorrow.

Made summary statistics of the firm variables. There do not seem to be any inconsistencies in those.

2018-06-20: Created a folder at "E:\McNair\Projects\MatchingEntrepsToVC\Stata\", imported the files into Stata, and made a master do-file.