Both of these projects (and as a corollary, this project) are dependent on the [[Demo Day Page Parser]], [[Industry Classifier]], and the [[Whois Parser]].
==Most Recent Update for Hira==
After our Skype call, I did the following:

===Fixing Google Sheet===
I first used our "recap" and "announced" classification to standardize and fix the dates. We created a new sheet with only the data we want to keep and cleaned it up. That sheet is called "Good Data Only", at the same link: https://docs.google.com/spreadsheets/d/16Suyp364lMkmUuUmK2dy_9MeSoS1X4DfFl3dYYDGPT4/edit?usp=sharing

*Columns N-R contain our new data. Please note that all of these columns are based on formulas and will break if edited.
*Column N is the number of weeks for an accelerator program, gathered via VLOOKUP from The File to Rule Them All.
*Column O is the actual date we want to record. It was computed by subtracting the number of weeks from the date '''if''' the listed page was a '''recap'''.
*Columns P and Q are the month and year extracted from Column O.
*Finally, Column R is the season variable, coded as Ed said it should be.

We have also gone through and removed all bad data, all duplicates, and all rows without timing info. This is the most complete list possible.

===Recode employee count===
I have added a new column in Cohorts Final (of the File to Rule Them All) but left the old column in case you would prefer to edit or classify differently. Column AB (emp_count_scale) contains a variable coded on a scale of 1 to 9, with each number corresponding to one of the employee_count classifications (1 the lowest, 9 the highest). The exact output can be modified (1 could instead be tiny, 2 small, ... 9 huge). The employee count column is standardized and can easily be edited with some modification of the Excel formula.

===Normalized investment amount===
I've been trying to fix the investment amount, but I think it's smart that we discuss this before I move forward.
I've done some tentative standardization (taking the average of a range if one is given), but so many accelerators "take up to __%" equity and "invest up to $___" that I think it's smartest we think hard about how to standardize first. Also notable is that some accelerators say they provide "$____ up front and another $___ in follow up funding for each stage." How do we deal with these? Message me if you'd like to talk more about this. I refrained from creating max and min investment columns to avoid spending time on the data before we discuss it.

===Remaining to do===
*Founders Experience: code job title
*Founders Education: remove unknowns, code degree and code major
*Multiple campus and cohorts

==Recent Work==
Here's a project update on the work that has been done since coming to McNair. The most recent file is