PTLR Webcrawler
Christy
Monday: 3-5
Tuesday: 9-10:30, 4-5:45
Thursday: 2:15-3:45
Contents
Steps
Search on Google
Complete, query in command line to get results
Download BibTeX
Complete
Download PDFs
Incomplete, struggling to find links.
Christy's LOG
09/27
Created file FindKeyTerms.py in Software/Google_Scholar_Crawler, which takes in a text file and returns counts of the key terms from the codification page. Already included the SIS, DHCI and OP terms and am working on adding the others.
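A minimal sketch of what a key-term counter along these lines might look like. This is not the actual contents of FindKeyTerms.py: the function name `count_key_terms` and the short `KEY_TERMS` list are placeholders for illustration, and the real term lists come from the codification page.

```python
import re

# Placeholder terms for illustration only; the real SIS, DHCI and OP
# term lists are on the PTLR codification page.
KEY_TERMS = ["patent thicket", "patent pool", "cross-licensing"]


def count_key_terms(text, terms=KEY_TERMS):
    """Return a dict mapping each term to its case-insensitive count in text.

    A trailing "s" is allowed so simple plurals ("patent thickets")
    are counted together with the singular form.
    """
    lowered = text.lower()
    counts = {}
    for term in terms:
        pattern = r"\b" + re.escape(term.lower()) + r"s?\b"
        counts[term] = len(re.findall(pattern, lowered))
    return counts


def count_key_terms_in_file(path, terms=KEY_TERMS):
    """Read a converted .txt paper and count the key terms in it."""
    with open(path, encoding="utf-8", errors="ignore") as f:
        return count_key_terms(f.read(), terms)
```

Matching on word boundaries keeps a short term from being counted inside a longer, unrelated word. Irregular plurals and hyphenation variants would need their own entries in the term list.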
09/28
Thought that the PDF-to-text converter wasn't working, but realized that it does work, just sloooowly (70 papers converted overnight). Should be fine since we are still developing the rest of the code and we only need to convert them to txt once.
Continued to load PTLR codification terms into the word-finding code and got most of the way through (there are so many ahhh but I'm learning ways to do this more quickly). Once they're all loaded up, I will create some example files of the kind of output this program will produce for Lauren to review and start:
1) Seeking definitions of patent thicket (I think I'll start by pulling any sentence that patent thicket occurs in as well as the sentence before and after).
2) Classifying papers based on the matrix of term appearances that the current program builds.
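Step 1 above (pulling each sentence that mentions patent thicket along with its neighbors) could be sketched roughly as follows. The function name `sentences_with_context` and the `window` parameter are assumptions for illustration, and the sentence splitter is deliberately naive.

```python
import re


def sentences_with_context(text, phrase="patent thicket", window=1):
    """Return each sentence containing phrase, joined with up to
    `window` sentences before and after it.

    Sentences are split naively on '.', '!' or '?' followed by
    whitespace, so abbreviations such as "et al." will produce
    spurious breaks.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    hits = []
    for i, sentence in enumerate(sentences):
        if phrase.lower() in sentence.lower():
            start = max(0, i - window)
            end = min(len(sentences), i + window + 1)
            hits.append(" ".join(sentences[start:end]))
    return hits
```

If two adjacent sentences both mention the phrase, their context windows overlap and the shared sentences appear in both results. Whether to merge those is a choice to make once there is sample output to review.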
Lauren's LOG
09/27
Took a random sample from "Candidate Papers by LB" and am reading each paper, extracting the definitions, and coding the definitions by hand. This is expected to be a control group that will be tested for accuracy against computer-coded papers in the future. The random sample contains the following publications:
Entezarkheir (2016) - Patent Ownership Fragmentation and Market Value An Empirical Analysis.pdf
Herrera (2014) - Not Purely Wasteful Exploring a Potential Benefit to Weak Patents.pdf
Kumari et al. (2017) - Managing Intellectual Property in Collaborative Way to Meet the Agricultural Challenges in India.pdf
Pauly (2015) - The Role of Intellectual Property in Collaborative Research Crossing the 'Valley of Death' by Turning Discovery into Health.pdf
Lampe Moser (2013) - Patent Pools and Innovation in Substitute Technologies - Evidence From the 19th-Century Sewing Machine Industry.pdf
Phuc (2014) - Firm's Strategic Responses in Standardization.pdf
Reisinger Tarantino (2016) - Patent Pools in Vertically Related Markets.pdf
Miller Tabarrok (2014) - Ill-Conceived, Even If Competently Administered - Software Patents, Litigation, and Innovation--A Comment on Graham and Vishnubhakat.pdf
Llanes Poblete (2014) - Ex Ante Agreements in Standard Setting and Patent-Pool Formation.pdf
Utku (2014) The Near Certainty of Patent Assertion Entity Victory in Portfolio Patent Litigation.pdf
Trappey et al. (2016) - Computer Supported Comparative Analysis of Technology Portfolio for LTE-A Patent Pools.pdf
Delcamp Leiponen (2015) - Patent Acquisition Services - A Market Solution to a Legal Problem or Nuclear Warfare.pdf
Allison Lemley Schwartz (2015) - Our Divided Patent System.pdf
Cremers Schliessler (2014) - Patent Litigation Settlement in Germany - Why Parties Settle During Trial.pdf
09/28
I added a section to the PTLR Codification page titled "Individual Terms." Ed would like all downloaded papers to be searched for these terms and the frequency with which each term appears to be recorded.
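One possible way to record these frequencies is a table with one row per paper and one column per term, written out as a CSV. The sketch below assumes the papers have already been converted to .txt files in a single folder. The function names are hypothetical, and the terms themselves would come from the "Individual Terms" section rather than from this code.

```python
import csv
import re
from pathlib import Path


def term_frequency_rows(txt_dir, terms):
    """Build one row per .txt paper with a count for every term."""
    rows = []
    for path in sorted(Path(txt_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="ignore").lower()
        row = {"paper": path.name}
        for term in terms:
            # Whole-phrase, case-insensitive match on word boundaries.
            pattern = r"\b" + re.escape(term.lower()) + r"\b"
            row[term] = len(re.findall(pattern, text))
        rows.append(row)
    return rows


def write_frequency_csv(rows, terms, out_path):
    """Write the rows to a CSV with a 'paper' column plus one column per term."""
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["paper"] + list(terms))
        writer.writeheader()
        writer.writerows(rows)
```

A CSV keeps the output easy to open in a spreadsheet for review, and the same rows could later feed whatever classification step is built on the term matrix.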