{{Project
|Has project output=Content
|Has sponsor=McNair Center
|Has title=Women in Entrepreneurship (Issue Brief)
|Topic Area=Social Factors in Entrepreneurship
|Has owner=Carlin Cherry
|Has start date=Spring 2016
|Has keywords=Women, Entrepreneurship
|Has project status=Tabled
}}
=Timeline and plan for summer=
*Note: I am using "hopethisworks8.txt" to make my graphs
*4-5 artifacts, 4-5 specifications, with the last artifact a regression table (tables, charts)
**Something about the data itself: a pie chart (how many companies do we have, how many do we have CEO information for, how many do we have founder information for) and, relatedly, how many can we classify as men or women
***This will influence all remaining graphs; use this info to guide the rest of your graphs
***Drs were classified using their first name matched to a common first name list; that's why we were able to classify some Drs as women, some as men, and some as Drs with gender unknown
**Graph with time on the x axis and % on the y axis: women founders, women CEOs, and women in management positions (a management position being anything that is VP and above) over time
***Will write up these results, i.e. most of the women in "women's management positions"
**Conditional on having a CEO identified: how many women, how many men, Drs, Dr men, Dr women, Dr unknown
**Bar graph with women in various industries; the y axis will have percentage
***Should also make note of the total percentage of VC dollars in each industry
**Regression table
***Variable names, with n and r^2, industry/year fixed effects Y/N
***Each one of these will correspond to a regression
***Variables: IPO, acquisition, exit (all coded 0/1)
***Rounds, dollars invested
=Spring 2017=
==Current plan to complete project==
#Lit review (which will be compiled here: [[Women in Entrepreneurship Lit Review]]).
#:Keep track of the things that have already been done in other research reports, so that we can do something different/original or so we can prove their data wrong. Done 3/28.
#Brainstorm ways to fix the Dr. problem in the data (assigning gender to those who are doctors) and recode them. Reran new tables with the newly coded doctors. Done 4/4.
#Think of new variables to add depending on the guidance from the lit review (might include adding variables other than solely womenceo/womenfounder, but will depend on the lit review).
##Add a control variable for the people in the data who have no gender
#I think from there I can do analysis and start to write the report!
==For analysis==
Note that:
*finaldatasetcode.sql (in 181/Women) has been updated
*hopethisworks5.txt (in both 181/Women and E/Women) has been created
**Use this as your dataset now
**It contains IPO, IPOyear, IPOAmount, MA, MAyear, MAAmount, and Exit
**IPO, MA, and Exit are 0/1 variables. Year is an int, and Amount is a real.
**Many MAs will not have amounts.

Build graphs of:
*Over time (last ten yrs)
**percent of co's with woman ceo (binary)
**percent of co's with woman founder (binary)
**percent of co's with woman clevel (binary)
**percent of co's with women in top management (vp and above)
**average fraction of women founders
**average fraction of women clevel
**average fraction of women in top management (vp and above)
*Some of the above by industry
*Some of the above by state
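The yearly percentages and average fractions in the graph list above could be computed with a grouped query before plotting. The following is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for the project's PostgreSQL database, and the table name <code>companies</code> and its columns (<code>firstyear</code>, <code>womanceo</code>, <code>numfounders</code>, <code>numwomenfounders</code>) are hypothetical names, not the actual columns of hopethisworks5.txt.

```python
import sqlite3

# Made-up company-level rows (hypothetical columns, not the real dataset):
# (firstyear, womanceo, numfounders, numwomenfounders)
ROWS = [
    (2014, 1, 2, 1),
    (2014, 0, 1, 0),
    (2015, 0, 2, 2),
    (2015, 0, 4, 1),
]


def yearly_shares(rows):
    """Return {year: (pct_woman_ceo, avg_fraction_women_founders)}."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE companies "
        "(firstyear INT, womanceo INT, numfounders INT, numwomenfounders INT)"
    )
    con.executemany("INSERT INTO companies VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT firstyear,
               -- binary measure: share of companies with a woman CEO
               100.0 * AVG(womanceo) AS pct_woman_ceo,
               -- fraction measure: NULLIF skips companies with no founders listed
               AVG(1.0 * numwomenfounders / NULLIF(numfounders, 0))
                   AS avg_frac_women_founders
        FROM companies
        GROUP BY firstyear
        ORDER BY firstyear
    """
    result = {year: (pct, frac) for year, pct, frac in con.execute(query)}
    con.close()
    return result


if __name__ == "__main__":
    for year, (pct, frac) in yearly_shares(ROWS).items():
        print(year, round(pct, 1), round(frac, 3))
```

Grouping by an industry or state code instead of <code>firstyear</code> would give the "by industry" and "by state" variants of the same measures.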
Regressions:
*rounds womanvar w/controls
*inv womanvar w/controls
*exit womanvar w/controls
*ipo womanvar w/controls
*ma womanvar w/controls
*ipoamount womanvar w/controls
*maamount womanvar w/controls
==Call with SDC Platinum to determine information about the data==
Questions:
#How does Thomson get their data?
##Why is coverage better for some firms than others?
##Is the data self-reported by the companies?
#How does Thomson upload executive data?
##How often – after last round of funding?
##After it’s uploaded, is the data updated continuously?
#Is there better coverage for companies that get more rounds of financing/IPO/M&A?

Answers:
#VC data is sourced from government filings, public press releases, and quarterly surveys of private equity firms. If a company does not participate in the survey, then SDC does not have the data, which is why coverage for some firms is better than others.
#The deals team is the one that uploads all the data. As soon as SDC gets an update, or has a source that is updated, they automatically upload that to the system as well. It typically takes 24-48 hours for new info to be reflected in the database.
#Yes, there is better coverage for companies that have an IPO/M&A or get more rounds of financing.
==TO DO after 2/28==
#Check cleancos and verify that it is actually clean (yes)
#Make Dr. a control variable (yes)
#Sum all c-level, etc.
#Aggregate to company level

We need to build:
*IPO 1/0
*M&A 1/0
*Number of rounds
*Total invested
*Is the CEO a woman 1/0
*Are any of the founders women 1/0
*Is CEO doctor 1/0
*Number of founders
*Number of founders that are doctors
*Number of women founders
*State - coded
*Industry - code them!
*Year of first investment - extracted

Ed is going to add IPO and Acq to rc1andcp3.

Here's the code to add zeros to cleanpeople:
 DROP TABLE cleanpeople2;
 CREATE TABLE cleanpeople2 AS
 SELECT prefix, firstname, lastname, jobtitle, fulltitle, cname,
 CASE WHEN genval IS NOT NULL THEN genval ELSE 0::int END AS women,
 CASE WHEN doctor IS NOT NULL THEN doctor ELSE 0::int END AS doctor,
 CASE WHEN charman IS NOT NULL THEN charman ELSE 0::int END AS charman,
 CASE WHEN ceo IS NOT NULL THEN ceo ELSE 0::int END AS ceo,
 CASE WHEN cfo IS NOT NULL THEN cfo ELSE 0::int END AS cfo,
 CASE WHEN coo IS NOT NULL THEN coo ELSE 0::int END AS coo,
 CASE WHEN cio IS NOT NULL THEN cio ELSE 0::int END AS cio,
 CASE WHEN cto IS NOT NULL THEN cto ELSE 0::int END AS cto,
 CASE WHEN otherclvl IS NOT NULL THEN otherclvl ELSE 0::int END AS otherclvl,
 CASE WHEN boardmember IS NOT NULL THEN boardmember ELSE 0::int END AS boardmember,
 CASE WHEN president IS NOT NULL THEN president ELSE 0::int END AS president,
 CASE WHEN vp IS NOT NULL THEN vp ELSE 0::int END AS vp,
 CASE WHEN founder IS NOT NULL THEN founder ELSE 0::int END AS founder,
 CASE WHEN director IS NOT NULL THEN director ELSE 0::int END AS director
 FROM cleanpeople
 WHERE jobtitle IS NOT NULL;
==TO DO up to 2/28==
#Load
##company
##people
##title lookup
##state lookup
#Refine people
##gender 0/1
##Join title lookup
##Dr.'s?
#Aggregate to company level (left join)
##Agg. People
##Join state lookup
#Export!

Done!
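The people-based variables in the "We need to build" list (woman CEO, any woman founder, CEO doctor, founder counts) could be derived from cleanpeople2 by collapsing its person rows to one row per company. The following is a sketch only, assuming the 0/1 flags produced by the cleanpeople2 code; it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the output variable names (<code>womanceo</code>, <code>numfounders</code>, and so on) are hypothetical.

```python
import sqlite3

# Made-up person-level rows shaped like a subset of cleanpeople2:
# (cname, women, doctor, ceo, founder). All flags are 0/1 with no NULLs,
# which is what the zero-filling step guarantees.
PEOPLE = [
    ("Acme", 1, 0, 1, 1),  # woman CEO who is also a founder
    ("Acme", 0, 1, 0, 1),  # founder flagged as a doctor
    ("Beta", 0, 0, 1, 0),  # CEO, not coded as a woman
    ("Beta", 1, 1, 0, 0),  # woman doctor, neither CEO nor founder
]

# Hypothetical names for the company-level variables.
COLUMNS = ["womanceo", "womanfounder", "doctorceo",
           "numfounders", "numdoctorfounders", "numwomenfounders"]


def aggregate_to_company(people):
    """Collapse person rows to one dict of variables per company name."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE cleanpeople2 "
        "(cname TEXT, women INT, doctor INT, ceo INT, founder INT)"
    )
    con.executemany("INSERT INTO cleanpeople2 VALUES (?, ?, ?, ?, ?)", people)
    query = """
        SELECT cname,
               MAX(ceo * women)      AS womanceo,          -- any woman CEO: 1/0
               MAX(founder * women)  AS womanfounder,      -- any woman founder: 1/0
               MAX(ceo * doctor)     AS doctorceo,         -- any CEO doctor: 1/0
               SUM(founder)          AS numfounders,
               SUM(founder * doctor) AS numdoctorfounders,
               SUM(founder * women)  AS numwomenfounders
        FROM cleanpeople2
        GROUP BY cname
        ORDER BY cname
    """
    out = {row[0]: dict(zip(COLUMNS, row[1:])) for row in con.execute(query)}
    con.close()
    return out


if __name__ == "__main__":
    for cname, variables in aggregate_to_company(PEOPLE).items():
        print(cname, variables)
```

Because cleanpeople2 sets <code>women</code> to 0 for anyone whose gender could not be coded, these counts treat unknown-gender people the same as men; a separate unknown-gender flag would be needed to control for that.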
Here is the code for the steps in the TO DO up to 2/28 list:
 DROP TABLE titlelookup;
 CREATE TABLE titlelookup(
   fulltitle varchar(150),
   charman int,
   ceo int,
   cfo int,
   coo int,
   cio int,
   cto int,
   otherclvl int,
   boardmember int,
   president int,
   vp int,
   founder int,
   director int
 );
 \COPY titlelookup FROM 'Important Titles in Women2017 dataset.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV
 --628

 DROP TABLE statelookup;
 CREATE TABLE statelookup(
   statename varchar(100),
   uniquecode int
 );
 \COPY statelookup FROM 'uniquestateval.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV
 --50

 DROP TABLE people;
 CREATE TABLE people(
   prefix varchar(5),
   firstname varchar(50),
   lastname varchar(50),
   jobtitle varchar(150),
   cname varchar(150)
 );
 \COPY people FROM 'pull5-normal.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV
 --186181

 DROP TABLE cleancos;
 CREATE TABLE cleancos AS
 SELECT pull3info.*
 FROM pull3info
 JOIN cleancosbase ON pull3info.companyname=cleancosbase.companyname AND minfirstdate=firstinvestdate;
 --43534

 DROP TABLE genvalpeople;
 CREATE TABLE genvalpeople AS
 SELECT *,
 CASE WHEN prefix='Ms' THEN 1::int
 WHEN prefix='Mrs' THEN 1::int
 WHEN prefix='Mr' THEN 0::int
 ELSE NULL::int END AS genval
 FROM people;
 --186181

 DROP TABLE cleanpeople;
 CREATE TABLE cleanpeople AS
 SELECT genvalpeople.*, titlelookup.*
 FROM genvalpeople
 LEFT JOIN titlelookup ON genvalpeople.jobtitle=fulltitle;
 --186181

 DROP TABLE uniquevalstate;
 CREATE TABLE uniquevalstate AS
 SELECT cleancos.*, statelookup.*
 FROM cleancos
 LEFT JOIN statelookup ON cleancos.companystate=statelookup.statename;
 --43534

 DROP TABLE dataset;
 CREATE TABLE dataset AS
 SELECT cleanpeople.*, uniquevalstate.*
 FROM cleanpeople
 LEFT JOIN uniquevalstate ON cleanpeople.cname=uniquevalstate.companyname;
 --186181
 \COPY dataset TO 'finaldataset.txt' WITH DELIMITER AS E'\t' HEADER NULL AS '' CSV
 --186181
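The prefix-based genval coding leaves anyone without a Mr/Ms/Mrs prefix, including doctors, with no gender. The first-name matching described in the summer plan could be sketched as below. This is not the project's actual recode: the <code>firstnames</code> lookup table, its contents, the <code>'Dr'</code> prefix spelling, and the case-insensitive match are all assumptions, and Python's built-in sqlite3 module stands in for PostgreSQL.

```python
import sqlite3

# Made-up people rows: (prefix, firstname).
PEOPLE = [
    ("Ms", "Ann"),
    ("Mr", "Bob"),
    ("Dr", "Carol"),
    ("Dr", "David"),
    ("Dr", "Sasha"),
]

# Hypothetical common-first-name list: (lowercase firstname, genval),
# with 1 = woman and 0 = man. Each name is assumed to appear only once.
FIRSTNAMES = [("carol", 1), ("david", 0)]


def code_gender(people, firstnames):
    """Return (prefix, firstname, genval, doctor) rows in input order."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE people (prefix TEXT, firstname TEXT)")
    con.execute("CREATE TABLE firstnames (firstname TEXT, genval INT)")
    con.executemany("INSERT INTO people VALUES (?, ?)", people)
    con.executemany("INSERT INTO firstnames VALUES (?, ?)", firstnames)
    query = """
        SELECT people.prefix,
               people.firstname,
               CASE WHEN people.prefix IN ('Ms', 'Mrs') THEN 1
                    WHEN people.prefix = 'Mr' THEN 0
                    -- doctors fall back to the first-name list; no match stays NULL
                    WHEN people.prefix = 'Dr' THEN firstnames.genval
                    ELSE NULL END AS genval,
               CASE WHEN people.prefix = 'Dr' THEN 1 ELSE 0 END AS doctor
        FROM people
        LEFT JOIN firstnames
               ON LOWER(people.firstname) = firstnames.firstname
        ORDER BY people.rowid
    """
    rows = con.execute(query).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in code_gender(PEOPLE, FIRSTNAMES):
        print(row)
```

Doctors whose first name is not on the list keep a NULL genval, which corresponds to the "Dr unknown" group and can be tracked with the separate doctor flag.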