|Has paper status=In development
}}
==Summary==

==Existing Data==
===VC Code===

The old VC code is in E:\mcnair\Projects\MatchingEntrepsToVC\DataWorkMatchingEntrepsV2-2.sql

It uses vcdb2 and forks off of roundlinejoinerleanff, building the following sequence of tables:
*roundlineaggfirmsseq -> roundlineaggseqwexit (using roundlineaggfunds)
*RoundLineMasterSeqBase (from roundlineaggseqwexit and 10 LJ'd tables)
*RoundLineMasterSeq (RoundLineMasterSeqBase with FirmnameRoundInduTotal, FirmnameRoundInduHist)
*Build out by stage -- MatchMostNumerousSeed, MatchHighestRandomSeed, etc.
*RoundLineByStageKeys -> MasterByStageBase -> MasterByStage -> MasterByStageKeys -> MasterByStageBlownout

There is untested seq table code at the end of E:\projects\vcdb3\OriginalSQL\MatchingEntrepsV3.sql. It builds only roundlineaggfirmsseq.

===Accelerator Demo Day===

See the [[Accelerator Demo Day]] page for more information. We ran the code and posted several iterations to Turk, and completed at least one iteration by hand.

From [[Amazon Mechanical Turk for Analyzing Demo Day Classifier's Results]]:
*E:\mcnair\Projects\Accelerator Demo Day\Turk\batch_results_all_accs_excel.xlsx -- looks like it contains the results of a Turk run. 265 results, 160 usable.
*[[Accelerator_Demo_Day#Hand_Collecting_Data]] provides a link to a Google Sheet. This sheet was downloaded to E:\projects\accelerators\Demo Day Timing Info.xlsx - it contains 136 observations. Files of this format may have been processed by a script written by Grace (unconfirmed).

===Accelerator Code===

The last build was by Ed and Hira. Hira's notes are on the [[Seed Accelerator Data Assembly]] page, which needs completing. Claims:
*dbase is likely vcdb2
*All data files are in Z:/accelerator
*The SQL file that loads all data is: LoadAccData.sql.
*LoadAccData.sql is located in E:\McNair\Projects\Accelerators\Summer 2018
*Source data is E:\McNair\Projects\Accelerators\Summer 2018\The File To Rule Them All.xlsx
*timing_final - This project table is based entirely on the most up-to-date timing information compiled in development. Source file: Z:/accelerator/Formatted Timing Info.txt (by Grace)
*additional_timing_info - source file: "merging_work.xlxs" located in: E:\Projects\McNair\Seed DB
*8) additional_timing_info2 - source file: "formatted timing info2.txt" located in E:\Projects\McNair\Accelerators\Summer 2018. This was collected through MTurk.
*9) timing_combined - This table combines all the timing information we have by appending tables 4, 7 and 8.
*10) cohortcompanies_wtiming - merges data in tables cohortcompany and timing_combined
*See also Grace's code: E:/McNair/Projects/Accelerators/Summer 2018/format_timing.py. The last file it produced was TurkData2ndPush-FormattedTiming.txt

The last code written by Ed was likely: E:\mcnair\Projects\Accelerators\Summer 2018\FindTiming.sql

Best files with provenance:
*SmallBatchTimingInfo.txt
**Appears hand collected
**170 lines, conamestd accelerator date month year cohort quarter
*TurkData2ndPush-FormattedTimingWHeader.txt
**Processed by format_timing.py
**Comes from Final Turk Push.xlsx
**1515 lines, company name normalized
*Formatted Timing Info 2
**No header but: coname accelerator date pagetype
**1523 lines
**Seems to have come from GraceData.txt and been processed by an earlier version of format_timing.py
*Formatted Timing Info
**Header: coname acceleratorname keyword url webpage predicted gooddata page_details full_date month year cohort_name notes prog_duration_wks actual_date actual_month actual_year season
**1168 lines, company name normalized
**Seems to have come from Demo Day Timing Info - Good Data Only.txt
*Demo Day Timing Info Companies
**No header, but appears coname normalized
**1143 lines
**Might have come from Demo Day Timing Info - Good Data Only.txt
**Made obsolete by Formatted Timing Info?
Note that the most recent file is NewBatchForTimingInfo.txt, which contains (coname, accelerator) pairs. It's not clear whether this batch was ever run.
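
One way to tell whether NewBatchForTimingInfo.txt was ever run is to check its (coname, accelerator) pairs against the timing files listed above. The following is a minimal sketch of that check, not the logic in LoadAccData.sql or format_timing.py. It assumes the files are tab-delimited, that pairs match after lowercasing and collapsing whitespace, and that the first record seen for a pair wins when sources are appended. The function names (read_timing, pair_key, combine_timing, pairs_without_timing) are hypothetical, and the sample rows are made-up placeholders rather than real data.

```python
import csv
import io


def read_timing(handle, columns=None, delimiter="\t"):
    """Read a delimited timing file into a list of dicts.

    If `columns` is given, the file is treated as headerless and the names
    are applied in order; otherwise the first row is used as the header.
    """
    rows = [row for row in csv.reader(handle, delimiter=delimiter) if row]
    if columns is None:
        columns, rows = rows[0], rows[1:]
    return [dict(zip(columns, row)) for row in rows]


def pair_key(coname, accelerator):
    """Assumed matching rule: ignore case and extra whitespace."""
    def norm(text):
        return " ".join(text.split()).lower()
    return (norm(coname), norm(accelerator))


def combine_timing(sources):
    """Append several sources, keeping the first record seen per pair.

    `sources` is a list of (rows, coname_field, accelerator_field) tuples,
    because the files use different column names for the company.
    """
    combined = {}
    for rows, coname_field, accelerator_field in sources:
        for row in rows:
            key = pair_key(row[coname_field], row[accelerator_field])
            combined.setdefault(key, row)
    return combined


def pairs_without_timing(pairs, combined):
    """Return the (coname, accelerator) pairs that have no timing record."""
    return [pair for pair in pairs if pair_key(*pair) not in combined]


# Placeholder stand-ins for the real files (made-up rows, assumed layout).
small_batch = io.StringIO(
    "conamestd\taccelerator\tdate\tmonth\tyear\tcohort\tquarter\n"
    "Example Co\tAccelerator A\t2015-06-01\t6\t2015\tSummer 2015\t2\n"
)
timing_info_2 = io.StringIO(
    "Example Co\tAccelerator A\t2015-06-02\tdemo day\n"
    "Sample Inc\tAccelerator B\t2016-03-15\tdemo day\n"
)
new_batch = [("example co", "Accelerator A"), ("Third LLC", "Accelerator B")]

combined = combine_timing([
    (read_timing(small_batch), "conamestd", "accelerator"),
    (read_timing(timing_info_2,
                 columns=["coname", "accelerator", "date", "pagetype"]),
     "coname", "accelerator"),
])
missing = pairs_without_timing(new_batch, combined)
print(len(combined), missing)
```

To use it on the real data, replace the io.StringIO objects with open(path, newline="") handles for the files above. If most pairs from NewBatchForTimingInfo.txt come back as missing, the batch was probably never processed.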