Difference between revisions of "VCDB24"
Revision as of 18:40, 29 December 2023
VCDB24 is the 2024 and final iteration of my [VentureXpert]-based venture capital database. Thomson-Reuters discontinued access to VentureXpert through [SDC Platinum] on December 31st, 2023, and this iteration contains data up to that date. The previous build was VCDB23, but the best previous instructions are those for VCDB20.
Processing Steps
- Copy over the rpt, ssh, and pl files, and bulk edit the ssh files, now in E:\projects\vcdb24\SDC.
- Change 12/31/2020 (and one 07/20/2020) to 12/31/2022 and vcdb20 to vcdb23
- Run the ssh files against SDC Platinum. Note that SDC Platinum's service will be withdrawn on 31 December 2023.
- Run the SDC Normalizer script (one of the pl files) on each output file.
- Fix the header row in USFirms1980.txt before normalizing (the Capital Under Management column name is too long)
- Remove double quotes from USFund1980-normal.txt, USFundExecs1980-normal.txt, USPortCo1980-normal.txt, and USFirmBranchOffices1980.txt.
- The private and public M&A file sets must each be combined into a single file (two files in total) after they've been normalized. Then replace \tnp\t and \tnm\t with \t\t in each.
- For RoundOnOneLine, remove the footer, run NormalizeFixedWidth.pl first, then RoundOnOneLine.pl, and then fix the header.
- PortCoLongDescription must be pre-processed from the command line and then post-processed in Excel (see VCDB20H1 and Vcdb4#Long_Description). However, I didn't load it for this run.
- Create a new database on mother (createdb vcdb23) and set up a directory for the input files: E:\projects\vcdb23
- Copy over and edit Load.sql. Run it section-by-section.
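The text clean-up steps above (removing double quotes, combining a normalized M&A file set, and blanking the np/nm placeholders) could be scripted roughly as follows. This is a minimal Python sketch, not one of the pl scripts used for this build. The function names are hypothetical, and it assumes each normalized file in a set starts with the same header row, which should be checked against the actual files.

```python
import re

# A tab followed by "np" or "nm", but only when another tab comes next.
# The lookahead leaves that closing tab in place, so two placeholder
# fields in a row are both caught (a plain string replace of the
# three-character pattern would miss the second one).
_PLACEHOLDER = re.compile(r"\t(?:np|nm)(?=\t)")


def strip_double_quotes(text):
    # Remove every double-quote character from the text.
    return text.replace('"', "")


def blank_placeholders(text):
    # Empty any tab-delimited field whose entire value is "np" or "nm".
    return _PLACEHOLDER.sub("\t", text)


def combine(parts):
    # Concatenate the contents of several normalized files.
    # Assumption: every part begins with the same header row, so the
    # header is kept from the first part only.
    out = []
    for i, part in enumerate(parts):
        lines = part.splitlines()
        out.extend(lines if i == 0 else lines[1:])
    return "\n".join(out) + "\n" if out else ""
```

Each function takes and returns the full text of a file, so reading and writing the files (and choosing an encoding) is left to the caller. Like the manual replacement, blank_placeholders only touches fields that sit between two tabs, so a placeholder in the first or last column of a line is left alone.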