Difference between revisions of "Vcdb4"
Revision as of 20:04, 22 September 2019
Vcdb4 | |
---|---|
Project Information | |
Has title | vcdb4 |
Has owner | Ed Egan |
Has start date | |
Has deadline date | |
Has project status | Active |
Copyright © 2019 edegan.com. All Rights Reserved. |
Source Files
Files are in:
E:\projects\vcdb4
The old files from the VentureXpert Database are in the subfolder Student Work, and the students' latest work is in Updated.
We need a set of pulls (according to E:\projects\vcdb3\OriginalSQL\LoadingScriptsV1.sql), which are documented below, as well as some lookup tables (CPI may need updating) and some joined tables (which would have to be updated separately) in MatchingEntrepsV3.sql:
- PortCoSBIR: PortCoSBIR.txt
- PortCoPatent: PortCoPatent.txt
And to update RevisedDBaseCode.sql, we'll need to:
- Join in the Crunchbase (which needs updating)
- Update the Geocoordinates
Note that this data could support new or updated versions of:
- Urban Start-up Agglomeration and Venture Capital Investment
- Estimating Unobserved Complementarities between Entrepreneurs and Venture Capitalists
- How do Venture Backed Startups with Women in Charge Perform?
- US Startup City Ranking
- Measuring High-Growth High-Technology Entrepreneurship Ecosystems
and others.
The build should be done as quickly and cleanly as possible: it is needed right away, but it will also likely need to be updated in January 2020 to reflect 2019's year end.
SDC Platinum Requests
Everything was updated to 09/22/2019 as the final date. Some files were renamed for clarity. Each result is a triplet of .ssh, .rpt, and .txt files. The following scripts, reports and their outputs are in E:\projects\vcdb4\SDC:
- USVCRound1980
- USVCPortCo1980
- USVCRoundOnOneLine1980
- USVCFund1980
- USVCFirms1980
- USPortCoLongDesc1980
- USVCFirmBranchOffices1980
- USIPO1980
- USVCPortCoExecs1980
- USVCFundExecs1980
- USMAPrivate100pc1985
- USMAPrivate100pc2013
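Since each result should be a triplet of .ssh, .rpt, and .txt files, a quick completeness check over the SDC folder may be useful before loading. The following is a minimal sketch, assuming the three files of a result share the query name as their base name; the function name `missing_triplet_files` is hypothetical and not part of any existing script.

```python
from pathlib import Path

# Extensions that make up one complete SDC result, per the description above.
EXPECTED_EXTENSIONS = (".ssh", ".rpt", ".txt")


def missing_triplet_files(folder, query_names):
    """Return {query_name: [missing extensions]} for incomplete triplets.

    Assumption: each file is named <query_name><extension> inside `folder`.
    """
    folder = Path(folder)
    missing = {}
    for name in query_names:
        absent = [
            ext for ext in EXPECTED_EXTENSIONS
            if not (folder / f"{name}{ext}").is_file()
        ]
        if absent:
            missing[name] = absent
    return missing


if __name__ == "__main__":
    # Two of the query names listed above, checked against the SDC folder.
    queries = ["USVCRound1980", "USVCPortCo1980"]
    print(missing_triplet_files(r"E:\projects\vcdb4\SDC", queries))
```

An empty dictionary means every listed query has all three files.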
The two USMAPrivate100pc queries are different. The first pulls just date announced, date effective, target name, target state and tv. The second adds basic acquirer information from 2013 forward (to allow retroactive revision by Thomson for 5+ years) and can be combined with MAUSTargetComp100pc1985-July2018.txt (after adjusting the spacing) to make USMAPrivate100pc2013Full. For reasons that are not yet clear, the query always fails with an out-of-memory message when trying to pull the whole thing in one go.
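One way to combine the older extract with the 2013-forward pull is to keep the older rows before a cutoff and the newer rows from the cutoff onward, so that any revisions Thomson made take precedence. This is only a sketch under several assumptions: the rows have already been parsed into dictionaries, dates are formatted as MM/DD/YYYY, the cutoff is 1 January 2013, and the field name `date_announced` is hypothetical.

```python
from datetime import date, datetime

# Assumption: the newer pull supersedes the older extract from this date on.
CUTOFF = date(2013, 1, 1)


def parse_sdc_date(text, fmt="%m/%d/%Y"):
    """Parse a date string; the MM/DD/YYYY format is an assumption."""
    return datetime.strptime(text.strip(), fmt).date()


def combine_pulls(old_rows, new_rows, date_field="date_announced"):
    """Keep old rows before CUTOFF and new rows on or after CUTOFF."""
    kept_old = [r for r in old_rows if parse_sdc_date(r[date_field]) < CUTOFF]
    kept_new = [r for r in new_rows if parse_sdc_date(r[date_field]) >= CUTOFF]
    return kept_old + kept_new
```

Splitting on a single date avoids having to match deals across the two files, at the cost of ignoring any pre-2013 revisions in the newer data.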
Notes on individual pulls:
- USSDCRound1980 was updated to remove fields that should have been in USVCPortCos1980 only. When normalizing, be sure to copy down only the key fields.
- USMAPrivate100pc1985 was updated to reflect the MAs load in LoadingScriptsV1, as there wasn't a good original. We are using 1985 forward because data issues prevent download/extraction of the 1980-1984 data. Year completed was added as a check variable but might have been the source of issues and so was removed; Date Effective can be used instead.
- USIPOComp1980 was updated to allow all exchanges (not just NNA). I couldn't require completion in the search, so that will have to be done in the database.
- USVCFund1980 was updated because some variables (those concerned with the fund's name and fund address) had changed name.
- USPortCoLongDesc1980 needs to be processed separately.
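The advice to copy down only key fields when normalizing can be illustrated as follows. This is a minimal sketch, assuming the report has been parsed into one dictionary per line and that continuation lines leave the key fields blank; the field names in the usage example are hypothetical.

```python
def fill_down_keys(rows, key_fields):
    """Copy key fields down into continuation rows; leave other fields alone.

    rows: list of dicts, one per report line, in report order.
    key_fields: names of the columns that identify a record.
    """
    filled = []
    last_keys = {}
    for row in rows:
        new_row = dict(row)  # copy so the input rows are not modified
        if any(str(row.get(k, "")).strip() for k in key_fields):
            # A row carrying any key value starts a new record.
            last_keys = {k: row.get(k, "") for k in key_fields}
        else:
            # Continuation row: inherit only the key fields.
            new_row.update(last_keys)
        filled.append(new_row)
    return filled


if __name__ == "__main__":
    report_rows = [
        {"company": "Acme", "round_date": "01/01/2000", "amount": "5", "investor": "X"},
        {"company": "", "round_date": "", "amount": "", "investor": "Y"},
    ]
    for r in fill_down_keys(report_rows, ["company", "round_date"]):
        print(r)
```

Non-key fields such as the amount stay blank on continuation rows, which avoids double counting when the rows are later aggregated.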