===Summer 2017===
2017-06-20: Joined the center! Wrote my page. Started on [[Collecting SBIR Data]]. Finished collecting SBIR Data; saved in bulk(E:)--> McNair-->Projects-->SBIR. Began researching VC funds in the file Venture Funds in E:\McNair\Projects\Houston\VCData.
2017-06-21: Continued researching VC funds in the file Venture Funds in E:\McNair\Projects\Houston\VCData.
2017-06-22: Finished researching VC funds in the file Venture Funds in E:\McNair\Projects\Houston\VCData. Began grouping.
2017-06-23: Finished grouping VC funds in the file Venture Funds in E:\McNair\Projects\Houston\VCData. Researched whether based in Houston, and whether they should be considered alive.
2017-06-27: Sorted VC funds in E:\McNair\Projects\Houston\VCData; deleted non-operating ones; finalized groups. Began researching the relative size of different sectors in Houston's economy. Work saved in E:\McNair\Projects\Houston\Industries.
2017-06-28: Began adding cohorts to each new accelerator in E:\McNair\Projects\Accelerators, saving each accelerator's cohort in E:\McNair\Projects\Accelerators\Data as (acceleratorname).cohort, as a text file.
2017-06-29: Continued adding cohorts to each new accelerator in E:\McNair\Projects\Accelerators, saving each accelerator's cohort in E:\McNair\Projects\Accelerators\Data as (acceleratorname).cohort, as a text file. Searched through documents in E:\McNair\Projects\SimplerPatentData\data\extracts\applications, in the modern and vintage folders, for examples of patents of the following type: utility, plant, reissue, and design, in versions 1.5, 1.6, 4.0, 4.1, 4.2, 4.3, 4.4, and 4.5. Placed examples in the folder E:\McNair\Projects\SimplerPatentData\data\examples. As mentioned in the wiki page (and all but confirmed with regex searches of hundreds of the patent documents), we appear only to have data on utility patents, except for a few plant patents.
2017-06-30: Continued adding cohorts to each new accelerator in E:\McNair\Projects\Accelerators, saving each accelerator's cohort in E:\McNair\Projects\Accelerators\Data as (acceleratorname).cohort, either as an excel file or a text file. Added addresses to companies in E:\McNair\Projects\Houston\VCData.
2017-07-11: Continued adding addresses to companies in Venture Funds in E:\McNair\Projects\Houston\VCData. Note: The file "VC Data" is now called "Venture Funds".
2017-07-12: Added addresses/PO box locations to Venture Funds in E:\McNair\Projects\Houston\VCData to remaining VC firms. Organized list. Began organizing cohort data collected on 6/30 w/ regex to streamline searching and gathering of cohort names themselves. Compiled addresses, along with other categorized info, in an excel file called "VC firms with address, basic sector info", saved in E:\McNair\Projects\Houston\VCData. For each cohort in E:\McNair\Projects\Accelerators\Accelerator Match, added headers to the tabbed name, founder, description, etc. Continued searching for addresses of the firms with no addresses listed in "VC firms with address, basic sector info". Double-checked addresses.
Note: The Houston VC data write up is on [[Houston_Entrepreneurship_Ecosystem_Project#VC_Funds_in_Houston]].
2017-07-13: Helped Diana with researching proportion of Houston's city budget allocated towards startup funding/entrepreneurship (apparently none...). Gathered info on IT and Procurement sections of Budget. Researched proportion of Houston budget allocated towards IT and Procurement. Excel and text files saved in E:\McNair\Projects\Houston\Budget. Continued researching relative sizes of industries in Houston, gathered relevant info and links in "Industry breakdown..." in E:\McNair\Projects\Houston\Industries.
2017-07-14: Used data from links in "Industry breakdown by GDP..." in E:\McNair\Projects\Houston\Industries to create Excel charts of Houston employment & Gross Area Product broken down by industry. Saved charts in the same folder. Energy, health, and (for employment data) IT sectors were emphasized, in line with the goal of communicating the idea that, as substantial parts of the Houston economy, those industries will benefit from supporting local startups. Looked up example charts in the file 2017ReportV1 in E:\McNair\Projects\Houston\Houston Ecosystem Recommendations, both for ideas for future charts, and to get an idea of quantitative VC data in Houston. Fixed typos, improved incomplete keys. Cleaned the file Venture Funds in E:\McNair\Projects\Houston\VCData: added research on funds declared dead to make sure they actually are. Double-checked that all firms in the master list were accounted for in the grouped list of VC firms.
2017-07-18: For the file "hubs list" in Z:\Hubs\2017\hubs_data, researched whether organizations not listed as hubs (aka shaded red) in "Hubs Data v2_16" (located in the same folder) should be considered hubs, under the definition that a hub 1) has a coworking space, 2) provides mentorship, 3) offers coding classes/tech events for cohort companies. Whether the hub had an accelerator or was tech focused was also noted.
2017-07-19: Continued editing "hubs list" in Z:\Hubs\2017\hubs_data, researching organizations marked as questionably hubs. Used websites like Alexa and similarsites.com to find hubs with websites similar to hubs in the "hubs list" file. Only found 1 new hub. Began searching possible hubs in "Raw Program list" in E:\McNair\Projects\Hubs\summer 2016.
2017-07-20: Searched through the firms in "Raw Program list" in E:\McNair\Projects\Hubs\summer 2016 to determine if they could be considered hubs based on the definition listed above. If they were, they were added to the list of new hubs in "hubs list" in Z:\Hubs\2017\hubs_data.
2017-07-21: Confirmed whether each hub in last year's hub list (in "Hubs Data v2_16" Z:\Hubs\2017\hubs_data) is still operating.
2017-07-25: Noted hubs that met the new definition but were not considered hubs in hubs_list in the same file. Copied all hubs data from "hubs list", saved in Z:\Hubs\2017\hubs_data to "Joe hub list 2017" in the same folder. Searched hubs.txt and "Potential Hubs" in Z:\Hubs\2017\hubs_data for new hubs; added new ones to the "Joe hubs list 2017".
2017-07-26: Began [[Patent Schema Reconciliation]], creating a text document of xpaths for the following nodes: patent number, filing number, grant date, kind, type, application number, and filing date. Saved file in E:\McNair\Projects\SimplerPatentData\data\examples\Patent Schema Reconciliation.
Notes: I am assuming "application number" in the patent code means "filing number", because the word "filing" appears nowhere in the code, and there is already a different number, under the "publication reference" header, that seems to be referring to the patent number. It's likely that the number under which the patent is internally filed is called "application number", and appears under the header "application reference", and that the (publicized) patent number appears under the header "publication reference".
An example xpath for a certain block of code from granted, v4.5, plant:
<us-bibliographic-data-grant><publication-reference><document-id><country>US</country><doc-number>PP027502</doc-number><kind>P3</kind><date>20161227</date></document-id>
For the above code, I identified (what I think are accurate) xpaths for the nodes of patent number (//us-bibliographic-data-grant/publication-reference/document-id/doc-number), kind (//us-bibliographic-data-grant/publication-reference/document-id/kind), and grant date (//us-bibliographic-data-grant/publication-reference/document-id/date). I am adding the xpaths for these nodes, as well as the others mentioned above, for the 4 types of patents, for each version, for both granted and applications. Still have to do xpaths for granted version 2.5 for all types, and all applications. Waiting on Oliver about whether we need xpaths for more nodes other than the 6 example nodes.
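A minimal sketch for checking the three xpaths above against the sample fragment, using Python's standard xml.etree.ElementTree. Several things here are assumptions for illustration and not part of the project files: the closing tags and the us-patent-grant wrapper (added so the fragment parses as a full document), the names SAMPLE, XPATHS and extract_fields, and the "." prefix, which is needed because ElementTree only evaluates paths relative to an element.

```python
import xml.etree.ElementTree as ET

# Sample fragment from the log entry. The closing tags and the outer
# <us-patent-grant> wrapper are assumptions added so that it parses.
SAMPLE = (
    "<us-patent-grant>"
    "<us-bibliographic-data-grant><publication-reference><document-id>"
    "<country>US</country><doc-number>PP027502</doc-number>"
    "<kind>P3</kind><date>20161227</date>"
    "</document-id></publication-reference></us-bibliographic-data-grant>"
    "</us-patent-grant>"
)

# The xpaths exactly as written in the log (hypothetical dictionary keys).
XPATHS = {
    "patent_number": "//us-bibliographic-data-grant/publication-reference/document-id/doc-number",
    "kind": "//us-bibliographic-data-grant/publication-reference/document-id/kind",
    "grant_date": "//us-bibliographic-data-grant/publication-reference/document-id/date",
}


def extract_fields(xml_text, xpaths=XPATHS):
    """Return {field name: node text, or None if the xpath matches nothing}."""
    root = ET.fromstring(xml_text)
    fields = {}
    for name, xpath in xpaths.items():
        # ElementTree rejects absolute paths on an element, so make the
        # path relative to the root by prefixing ".".
        node = root.find("." + xpath)
        fields[name] = node.text if node is not None else None
    return fields


fields = extract_fields(SAMPLE)
print(fields)
```

Running the same function over one example file per type and version would show which xpaths return None and therefore need a version-specific variant.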
2017-07-27: Double-checked that the xpaths in http://mcnair.bakerinstitute.org/wiki/Equivalent_XPath_and_APS_Queries#Query_Equivalences were accurate for v4.0, v4.1, v4.2, and added to the page xpaths for the nodes listed on that wiki page for v<4.3. Began adding xpaths for other nodes Oliver noted would be helpful, like Invention Title. Went over new hubs definition with Hira; ensured no hubs on "Joe hub list 2017" (see above) were actually just incubators, and that they all had coding/tech events/programs with substance. Took 17 hubs off total. Saved new list in Z:\Hubs\2017\hubs_data, called Joe hub list 2017 w comments.
2017-07-28: Continued working on xpaths. For Top50_Table in E:\McNair\Projects\Ecosystem\Ranking, found and entered necessary data. Source for city population and area is in the same folder, titled "City area chart".
2017-08-01: For the xpath project: Found which patent versions for text files in E:\McNair\Projects\SimplerPatentData\data\examples\granted had PRIORITY_CLAIMS_DATE, PRIORITY_CLAIMS_COUNTRY, and PRIORITY_CLAIMS_PATENT_NUMBER; noted which ones did, and added their xpaths to the file of xpaths, Patent Schema Reconciliation.txt in E:\McNair\Projects\SimplerPatentData\data\examples\Patent Schema Reconciliation. Checked which types and versions had pct document numbers, updated xpaths in http://mcnair.bakerinstitute.org/wiki/Equivalent_XPath_and_APS_Queries#Query_Equivalences and Patent Schema Reconciliation.txt. Began the same process with IPCR_Subclass and the following xpaths on http://mcnair.bakerinstitute.org/wiki/Equivalent_XPath_and_APS_Queries#Query_Equivalences. Began listing examples for each xpath. Used data from roundplus.txt in Z:\VentureCapitalData\SDCVCData\vcdb to create charts, saved as New2017Report(Aug) in E:\McNair\Projects\Houston\2017Report.
2017-08-02: Edited excel charts from 08-01. Continued on xpath project, completed IPCR sections. For Copy of Rankingv3_Diana's_workingfile in E:\McNair\Projects\Ecosystem\Ranking, added data on population, political activity in the 2016 presidential election, whether it had a university (using a filter of organizations that gave out doctorate degrees in Carnegie Classifications 2015_cleaned in E:\McNair\Projects\University Patents).
2017-08-03: For the cities ranked 1-50 in Top50_Table in E:\McNair\Projects\Ecosystem\Ranking, found 1)City 2) State 3) Dollars invested 4) first-round deals 5) Active Startups 6) Density (Active Startups per Capita). Found percent of 3, 4, and 5 among the totals for all of the US, and for each of 8 metro areas as a proportion of the US' totals. Saved file as "Cleaned ranking table Aug 3", saved in E:\McNair\Projects\Ecosystem\Ranking. Continued on xpath project.
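The percentages and density figure in the 08-03 entry come down to two small formulas, sketched below in Python. The numbers are made up for illustration and are not taken from Top50_Table, and CityRow, percent_of_total and summarize are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class CityRow:
    # Columns mirror the fields listed in the 08-03 entry.
    city: str
    state: str
    dollars_invested: float
    first_round_deals: int
    active_startups: int
    population: int


def percent_of_total(value, total):
    """Share of a national total, expressed as a percentage."""
    if total == 0:
        raise ValueError("national total must be non-zero")
    return 100.0 * value / total


def summarize(row, us_totals):
    """Percent of US totals for items 3-5, plus density (startups per capita)."""
    return {
        "pct_dollars": percent_of_total(row.dollars_invested, us_totals["dollars_invested"]),
        "pct_first_round_deals": percent_of_total(row.first_round_deals, us_totals["first_round_deals"]),
        "pct_active_startups": percent_of_total(row.active_startups, us_totals["active_startups"]),
        "density": row.active_startups / row.population,
    }


# Made-up example values, not real ranking data.
example_row = CityRow("Exampleville", "TX", 50.0, 10, 40, 2_000_000)
example_us_totals = {
    "dollars_invested": 1000.0,
    "first_round_deals": 200,
    "active_startups": 1600,
}
summary = summarize(example_row, example_us_totals)
print(summary)
```

The same summarize call works for a metro area if its city rows are summed into a single CityRow first.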
2017-08-04: Continued on xpath project. See "Task Notes" in [[Patent Schema Reconciliation]].
[[Category:Work Log]]