Difference between revisions of "Vcdb4"

From edegan.com

Revision as of 11:21, 22 September 2019


Project: Vcdb4

Project Information

  • Has title: vcdb4
  • Has owner: Ed Egan
  • Has start date: (blank)
  • Has deadline date: (blank)
  • Has project status: Active
Copyright © 2019 edegan.com. All Rights Reserved.


Source Files

Files are in:

E:\projects\vcdb4

The old files from the VentureXpert Database project are in the subfolder Student Work, and the students' latest work is in the subfolder Updated.

We need the following (according to E:\projects\vcdb3\OriginalSQL\LoadingScriptsV1.sql):

  • Roundbase: USVC1980-2018q2-Good.txt
  • IPOs: IPO1980-2018q2-NoFoot-normal.txt
  • Branchoffices: USVCFirmBranchOffices1980-2018q2-NoFoot-normal.txt
  • Roundline: USVCRound1980-2018q2-NoFoot-normal-normal.txt
  • FundBase: USVCFund1980-2018q2-NoFoot-normal.txt
  • CompanyBase: USPortCo1980-2018q2-NoFoot-normal.txt
  • MAs: MAUSTargetComp100pc1985-July2018-normal.txt
  • FirmBase: USVCFirms1980-2018q2-NoFoot-Normal.txt
  • LongDesc: PortCoLongDesc-Ready-normal-fixed.txt
  • CoPeople: Executives-NoFoot-normal.txt
  • FundPeople: Executives-Funds-NoFoot-normal.txt
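
Before running the loading script, it may help to confirm that every input file is actually present. The following is a minimal sketch, not part of the existing scripts: the table-to-file mapping is taken from the list above, while the `missing_inputs` helper and the assumption that all files sit together in a single folder are hypothetical.

```python
from pathlib import Path

# Table name -> source file, copied from the list above.
REQUIRED_FILES = {
    "Roundbase": "USVC1980-2018q2-Good.txt",
    "IPOs": "IPO1980-2018q2-NoFoot-normal.txt",
    "Branchoffices": "USVCFirmBranchOffices1980-2018q2-NoFoot-normal.txt",
    "Roundline": "USVCRound1980-2018q2-NoFoot-normal-normal.txt",
    "FundBase": "USVCFund1980-2018q2-NoFoot-normal.txt",
    "CompanyBase": "USPortCo1980-2018q2-NoFoot-normal.txt",
    "MAs": "MAUSTargetComp100pc1985-July2018-normal.txt",
    "FirmBase": "USVCFirms1980-2018q2-NoFoot-Normal.txt",
    "LongDesc": "PortCoLongDesc-Ready-normal-fixed.txt",
    "CoPeople": "Executives-NoFoot-normal.txt",
    "FundPeople": "Executives-Funds-NoFoot-normal.txt",
}


def missing_inputs(folder, required=REQUIRED_FILES):
    """Return {table: filename} for each required file not found in folder.

    Hypothetical helper: assumes all input files live directly in `folder`.
    """
    folder = Path(folder)
    return {
        table: name
        for table, name in required.items()
        if not (folder / name).is_file()
    }
```

Calling `missing_inputs(r"E:\projects\vcdb4")` (or whichever subfolder actually holds the files) returns an empty dictionary when everything is in place.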

We also need some lookup tables (the CPI table may need updating) and some joined tables (which would have to be updated separately) in MatchingEntrepsV3.sql:

  • PortCoSBIR: PortCoSBIR.txt
  • PortCoPatent: PortCoPatent.txt

And to update RevisedDBaseCode.sql, we'll need to:

  • Join in the Crunchbase data (which needs updating)
  • Update the Geocoordinates

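Note that this data could support new or updated versions of:

  • Urban Start-up Agglomeration and Venture Capital Investment
  • Estimating Unobserved Complementarities between Entrepreneurs and Venture Capitalists
  • How do Venture Backed Startups with Women in Charge Perform?
  • US Startup City Ranking
  • Measuring High-Growth High-Technology Entrepreneurship Ecosystems

and others.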

The build should be done as quickly and cleanly as possible: it is needed right away, but it will also likely need to be updated in January 2020 to reflect 2019's year end.

SDC Platinum Requests

All requests were updated to use 09/22/2019 as the final date. Some files were renamed for clarity. Each result is a triplet of .ssh, .rpt, and .txt files. The following scripts, reports, and their outputs are in E:\projects\vcdb4\SDC:

  • USVCRound1980
  • USVCPortCo1980
  • USVCRoundOnOneLine1980
  • USVCFund1980
  • USVCFirms1980
  • USPortCoLongDesc1980
  • USVCFirmBranchOffices1980
  • USIPOComp1980
  • USMAPrivate100pc1985
  • USVCPortCoExecs1980
  • USVCFundExecs1980
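
Since each request should produce a .ssh, .rpt, and .txt file, a short check can flag any incomplete triplet. This is only a sketch: it assumes each file is named exactly as the request name plus its extension, which the notes above do not confirm, and `incomplete_triplets` is a hypothetical helper.

```python
from pathlib import Path

# Request names copied from the list above.
REQUESTS = [
    "USVCRound1980",
    "USVCPortCo1980",
    "USVCRoundOnOneLine1980",
    "USVCFund1980",
    "USVCFirms1980",
    "USPortCoLongDesc1980",
    "USVCFirmBranchOffices1980",
    "USIPOComp1980",
    "USMAPrivate100pc1985",
    "USVCPortCoExecs1980",
    "USVCFundExecs1980",
]

# Script, report, and output extensions that make up one triplet.
EXTENSIONS = (".ssh", ".rpt", ".txt")


def incomplete_triplets(folder, requests=REQUESTS):
    """Return {request: [missing extensions]} for requests lacking any file.

    Assumes files are named <request><extension> directly inside `folder`.
    """
    folder = Path(folder)
    report = {}
    for name in requests:
        missing = [
            ext for ext in EXTENSIONS
            if not (folder / f"{name}{ext}").is_file()
        ]
        if missing:
            report[name] = missing
    return report
```

An empty result from `incomplete_triplets(r"E:\projects\vcdb4\SDC")` would mean every request has all three files under that naming assumption.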

USVCRound1980 was updated to remove fields that should have been in USVCPortCo1980 only. When normalizing, be sure to copy down only the key fields.

USMAPrivate100pc1985 was updated to reflect the MAs load in LoadingScriptsV1, as there wasn't a good original. We are using 1985 forward because data issues prevent download/extraction of the 1980-1984 data. Year completed was added as a check variable.

USIPOComp1980 was updated to allow all exchanges (not just NNA). I couldn't require completion in the search, so that will have to be done in the database.
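
Copying down only the key fields during normalization can be illustrated as a forward fill restricted to a chosen set of columns. The sketch below is an assumption about what that step looks like, not the project's actual normalizer: `copy_down` is a hypothetical helper, and the column names in the usage example are invented for illustration.

```python
def copy_down(rows, key_fields):
    """Fill blank key fields from the nearest non-blank value above.

    `rows` is a list of dicts (one per report line). Only the columns named
    in `key_fields` are filled; blanks in any other column are left alone.
    """
    last_seen = {}
    filled_rows = []
    for row in rows:
        filled = dict(row)  # copy so the input rows are not modified
        for field in key_fields:
            value = (filled.get(field) or "").strip()
            if value:
                last_seen[field] = filled[field]
            elif field in last_seen:
                filled[field] = last_seen[field]
        filled_rows.append(filled)
    return filled_rows


# Hypothetical column names, for illustration only.
example = [
    {"Company Name": "Acme", "Round Date": "01/02/1999", "Firm Name": "Fund A"},
    {"Company Name": "", "Round Date": "", "Firm Name": "Fund B"},
    {"Company Name": "", "Round Date": "03/04/2001", "Firm Name": ""},
]
normalized = copy_down(example, key_fields=["Company Name", "Round Date"])
```

In this example the blank company name and round date are carried down, while the blank firm name in the last row stays blank because it is not a key field.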