Augi Liebster (Work Log)

From edegan.com
Revision as of 16:50, 16 July 2018 by Adliebster (talk | contribs)


McNair Center Staff
Staff Information
Status Active
McNairCenterⓂ


Summer 2018

VentureXpert Data


Augi Liebster Work Logs

2018-07-16: Spent the day matching portcos with IPOs and MAs, then cleaned the data using an Excel file. Almost finished the IPOs. Made a mistake in filtering the MAs, but will go back and finish cleaning both the MAs and IPOs by tomorrow. Slightly confused about cleaning the MA table, since I do not see a way other than equivalence of state to determine whether a company is matched to itself or to a different company with a similar name. I will either accept same state as an indicator or wait for Ed's response.
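A minimal sketch of the same-state rule described above, in case it ends up being the accepted indicator. The record layout, field names, and sample rows are all hypothetical, not the actual matcher output or database schema; it only shows keeping a name match when both states are present and equal.

```python
from typing import NamedTuple


class Match(NamedTuple):
    # Hypothetical fields; the real matched table may be laid out differently.
    portco_name: str
    portco_state: str
    target_name: str
    target_state: str


def same_state(match: Match) -> bool:
    """Accept a name match only if both states are non-empty and equal."""
    a = match.portco_state.strip().upper()
    b = match.target_state.strip().upper()
    return bool(a) and a == b


# Made-up example rows for illustration only.
matches = [
    Match("Acme Robotics", "TX", "Acme Robotics Inc", "TX"),
    Match("Acme Robotics", "TX", "Acme Robotics Inc", "CA"),
    Match("Beta Labs", "", "Beta Labs", ""),
]

kept = [m for m in matches if same_state(m)]
```

Rows with a missing state are dropped rather than accepted here, which is a conservative choice; those could instead be set aside for manual review.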

2018-07-13: Worked to standardize the company names using the matcher. Also uploaded the rest of the data that I could into the db.

2018-07-12: Spent the day struggling with the MA pull. Dylan figured out that the data will pull when pulled in text form rather than columnar form. Tomorrow I will try to learn RegEx so that I can manage this file. Still stuck on USLongDescription, as I have tried different ways of normalizing it and nothing has worked.

2018-07-11: Uploaded the rest of the tables that I was able to into the database. I am struggling with normalizing the USLongDescription and have tried the various ways given to solve the problem. I am stuck here and not sure how to proceed. I am similarly stuck with the MA table as I have still not been able to retrieve this data from SDC. I did update the Venture Xpert DataBase wiki page with information on loading the tables and the possible errors that could arise. For now, I am waiting on a response from Ed to see how I can continue to be productive.

2018-07-10: Struggled with pulling MA data from SDC for the majority of the morning. Tried creating a custom report from scratch and playing around with the variables, but eventually gave up because SDC kept crashing. Moved on to loading data into the database. Created the database and loaded in roundbase and the IPO table, both of which seem to be consistent with former projects. Then read around to figure out how to normalize the long description so that I could load it into the database, but couldn't figure out what the documentation was trying to say.

2018-07-09: Continued to repull data from SDC in order to have the first two full quarters of 2018. Pulled everything except for the MAs. While I was waiting on the data to pull, I continued to go through Minh's data and mark which websites had a list of startups and which didn't.

2018-06-29: Planned out the construction of the database and checked all rpt files to make sure that all variables I would need were present. Then updated the SDC Platinum and VentureXpert wikis to ensure that both were readable and thorough. Finally, helped Minh with creating his training data so that he would be able to create an accurate crawler. Sorted through previously pulled websites of accelerators with the keyword Demo Day and marked whether or not they had lists of the companies that had taken part in their cohorts. Marked about 500 websites.

2018-06-28: Organized my folder for the building of the database. Talked to Ed who suggested that I pull data from SDC for July so that we would have two full quarters of data to work with. Helped with RoundOnOneLine and gave tips for better organizing data.

2018-06-27: Finished extracting data from SDC. Have normalized everything that was normalized in the previous process. Am now waiting on Ed to discuss how he wants the new database designed so that I can begin to actually build it. The IPO and MA pulls took a while because of various errors, including an Out of Memory error that kept popping up. I slightly changed the rpt file and the pull then ran quickly and effectively.

2018-06-26: Pulled data from SDC. Successfully pulled USPortCo1980-Present, CompanyLongDescription, USVCFirms1980-Present, USVCFunds1980-Present, and VCFunds1980-Present. Had some trouble figuring out which rpt files to use but messaged Ed and he clarified.

2018-06-25: Started to pull data from SDC. Did a few practice runs and then started to pull real data. Today I pulled USVCPortCo1980-Present and USCompanyLongDescription. I am having some trouble formatting both of them and need to sort out foreign countries from the data. Once I get down the formatting I will pull down the other datasets as well.

2018-06-22: Read all pages relevant to my project. Understand the process behind the build and have identified the master tables that will have to be built. Mapped out multiple trees to represent the stacks of tables created in the process of making the master tables. Need help understanding the SDC Platinum interface and how to pull data from there so that I can start to construct the database. Would like to meet with Ed to discuss his vision for redesigning the database.

2018-06-21: Continued to read through and understand the VCDatabase Rebuild wiki page. Found a number of logical and mathematical errors and have quickly realized that in the process of building the db I will have to rewrite the wiki.

2018-06-20: Began to read the VCDatabase Rebuild wiki page. Found the page to be decently good at describing the process of building the db, but the process seems flawed. Confused about certain things where numbers seem to not add up or illogical statements are made. So far I have observed this in the roundname duplicate check (not present), the IPO distinct table (where there seem to be 10 entries missing), and the maskey announcedates (where the use of min seems illogical). Potentially I am misunderstanding due to being new to SQL.

2018-06-19: Set up work stations on the balcony. Ordered extra cables needed to set up monitors with a keyboard, a mouse, and my laptop. Got our projects; I will be redesigning the VC Database. Heard about other people's projects as the other interns were assigned theirs as well.

2018-06-18: Met all of the interns, Ed, and Anne. Was introduced to the database, learned basic SQL commands, and set up a wiki page. Also logged onto the RDP for the first time. Starting to learn the infrastructure.