Patent Thicket Literature Review

From edegan.com
Revision as of 19:17, 21 March 2013 by imported>Ed (→‎Process of the review)

This page and its sub pages provide the raw materials and derived 'data' for a literature review on patent thickets.

Process of the review

The review process consists of the following steps:

  1. An original sample of papers was retrieved from journal databases
  2. These papers were very quickly classified into core, up, and down groups. This classification is undoubtedly subject to errors.
  3. The Core group of papers was used to undertake a convergence process
  4. Additional core papers, identified in the convergence process, were added to the sample
  5. All papers were manually classified into groups and bibtex entry tagging was performed
    • Key concept tags were added to the Up group papers
  6. Complete reviews of all of the Core papers were conducted.
    • The literature review of each paper in the core was checked to ensure that all referenced core papers are included in the core.

We are currently conducting steps 5a, 6, and 6a of this process. For 5a and 6a, see the Workflow section below.

Original Sample

An original sample of papers was retrieved from journal databases using keyword searches.

Journal databases searched included Google Scholar, ProQuest, EBSCO, Web of Science, JSTOR, and others. Google Scholar provided by far the most papers and now appears to dominate as a journal search tool. Keywords searched included "patent thicket", "anticommons", "Herfindahl", "blocking patents", "infringing", "dense web", "patent network", and others, both individually and in combination with one another. In addition, papers that cite certain key papers, including Shapiro (2001), Ziedonis (2004), and Hall et al. (2012), were also searched for.

This process yielded 251 papers that spanned economics, management, public policy, law, computer science, and the physical sciences, as well as policy reports by governments or NGOs. Two papers were added to this list following a recommendation from Peter.

Core, Up, and Down Groups

Papers need to be classified into 'Core', 'Up', and 'Down' groups (these are defined below).

The classification in step 2 was performed using word frequency counts for the term "patent thickets" in conjunction with a manual review. The manual review was performed very quickly - papers were not even 'scan' read, just glanced over with a mean review time of around 1 minute per paper. As such there will likely be many classification errors.

Specifically, it is expected that some Core papers will turn out not to be core, but I anticipate less misclassification in the opposite direction. 50 papers were discarded during the step 2 classification as they were not sufficiently relevant. This left 203 papers. Many more papers may be deemed insufficiently relevant later, but a record of discards should be kept.
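
The word-frequency part of the step 2 classification could be sketched roughly as follows. This is a minimal illustration in Python, assuming each paper has already been converted to plain text; the function name count_term and the choice to match both singular and plural forms are assumptions, not a record of the tool actually used.

```python
import re

def count_term(text, term="patent thicket"):
    """Count case-insensitive occurrences of a term, allowing an optional
    plural 's' and any whitespace (including line breaks) between words."""
    words = [re.escape(w) for w in term.split()]
    pattern = r"\b" + r"\s+".join(words) + r"s?\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

# Toy text, invented for illustration only.
sample = "Patent thickets are dense. A patent\nthicket may block entry."
print(count_term(sample))
```

A high count only suggests that a paper is a Core candidate; the manual glance described above still decides the group.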

The Core Group

The core group consists of papers that explicitly discuss patent thickets. They might be theory papers that describe mechanisms for thickets, empirical papers that show the existence or lack of existence of thickets, or other papers that provide direct work on thickets. Papers in this group will generally use the term 'thicket' very frequently.

There were originally 59 papers classified as core in step 2 of the process. Once step 5 of the process is completed, each core paper will be reviewed in detail and get its own page on this site.

The Down Group

The down group consists of papers that underpin the thicket literature while not explicitly discussing thickets. This group includes theory models for complementary or substitute innovations, for sequential innovation, for 'probabilistic patents', or for patent races; discussions of the relative importance of various aspects of intellectual property (such as conferring rents vs. providing information); and econometric papers on the use of patent statistics.

It is not necessary for papers in the down group to even mention patent thickets. Much of the older work (pre-2000) won't because the term didn't exist. This makes classifying papers into the down group difficult without area-specific expertise. 47 papers were classified as being in the down group in step 2 of the process. However, the convergence process provided many other candidates.

The Up group

The up group consists of papers that use patent thickets in some fashion. This group essentially takes patent thickets as given and builds from there.

At a first take it appears that this group consists of papers concerning:

  • Mechanisms for addressing the consequences of thickets: SSOs and Standardization, Licensing arrangements, Trolls, Clearance Houses, Joint Ventures, and so forth.
  • IPR reform: advocating changes to patent policy, antitrust policy, the Bayh-Dole Act, etc.
  • Firm strategy: strategic responses to thickets, the effects of thickets on portfolio values, and other advocacy of firm-level responses
  • Industry specific commentary: This last category generally overlaps with the others. For example, there are papers written for genome researchers informing them of the consequences of thickets in their area, or arguing that patents on segments of the genome are either invalid or not blocking. A considerable number of papers fall into this category and they shouldn't be discarded. It is important to determine which industries have commented on patent thickets, and whether or not they regard them as a problem for industry practitioners.

There were 97 papers classified into the up group in step 2 of the process, and I expect this number to rise.
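
As a quick consistency check, the step 2 counts reported on this page add up. The short Python sketch below only restates figures given above; the variable names are illustrative.

```python
# Figures taken from the Original Sample and group sections above.
original_sample = 251
added_on_recommendation = 2
discarded_in_step_2 = 50
remaining = original_sample + added_on_recommendation - discarded_in_step_2

# Step 2 group sizes as reported for each group.
groups = {"core": 59, "down": 47, "up": 97}

assert remaining == 203
assert sum(groups.values()) == remaining
print(remaining, groups)
```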

Convergence Process

A convergence process was undertaken using the 59 papers that were deemed to be in the core group in step 2 of the process. Of the 59 Core papers, 45 could be 'ripped' to text files. References were then extracted from 41 of these files. The remaining 4 were law papers whose references were scattered in footnotes and could not be reliably extracted by software.

The references for these papers were then matched against one another to produce counts of the most cited papers within the core group. This process identified 313 papers that were cited by more than one paper in the core group. Of these, 239 were not in the original sample of 251. Each of these 239 papers was briefly checked to see whether it should be added to the core group, and 8 papers were added.
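
The matching step can be pictured with the following Python sketch. It assumes the references for each core paper have already been extracted into a list of strings. The normalization rule (lower-case, letters and digits only) is a deliberately crude assumption; matching real reference strings would need something more forgiving. The names normalize and shared_citations are hypothetical.

```python
from collections import Counter

def normalize(ref):
    """Crude matching key: lower-case, letters and digits only (an assumption)."""
    return "".join(ch for ch in ref.lower() if ch.isalnum())

def shared_citations(reference_lists, min_papers=2):
    """Map each normalized reference to the number of papers citing it,
    keeping only references cited by at least `min_papers` papers."""
    counts = Counter()
    for refs in reference_lists.values():
        # Using a set means each citing paper counts a reference once.
        counts.update({normalize(r) for r in refs})
    return {ref: n for ref, n in counts.items() if n >= min_papers}

# Toy input, invented for illustration; not the real extracted data.
toy = {
    "core_paper_1": ["Shapiro (2001)", "Ziedonis (2004)"],
    "core_paper_2": ["shapiro (2001).", "Hall et al. (2012)"],
}
print(shared_citations(toy))
```

With min_papers=2 this mirrors the "cited by more than one paper in the core group" threshold described above.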

BibTeX Entry Tagging

Beginning with candidates for the up group, we need to add custom BibTeX tags to each reference.

An example BibTeX reference is as follows:

@article{andrews2002genes,
  title={Genes and patent policy: rethinking intellectual property rights},
  author={Andrews, L.B.},
  journal={Nature Reviews Genetics},
  volume={3},
  number={10},
  pages={803--807},
  year={2002},
  publisher={Nature Publishing Group},
  filename={Andrews (2002) - Genes And Patent Policy Rethinking Intellectual Property Rights.pdf}
}

We are going to clean up this reference, remove redundant tags and add new ones. A BibTeX entry must have a unique key. In the example above the key is 'andrews2002genes'. This conforms to the standard format, which is to use the first author's name, the year and the first word of the title (or, if the first word is 'the', 'a', or a similar common word, either the first two words or the second word). The entire entry is encapsulated by @article{...}, though you will also see 'techreport' or 'inproceedings' or similar as the container type. Then within the entry there is a series of tags of the form tag={}. These are comma-separated, and each has a distinctive name.
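
The key convention can be sketched in Python as below. This is an illustration only: the function make_key and the STOP_WORDS list are assumptions, and the sketch implements just one of the variants described above (skip leading common words and use the first remaining word).

```python
import re

# Illustrative list of common words to skip; extend as needed (an assumption).
STOP_WORDS = {"the", "a", "an", "on", "of", "in"}

def make_key(first_author_last_name, year, title):
    """Build a key such as 'andrews2002genes' from author, year, and title."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    # Use the first word that is not a common word; fall back to the first word.
    fallback = words[0] if words else ""
    first = next((w for w in words if w not in STOP_WORDS), fallback)
    author = re.sub(r"[^a-z]", "", first_author_last_name.lower())
    return f"{author}{year}{first}"

print(make_key("Andrews", 2002,
               "Genes and patent policy: rethinking intellectual property rights"))
```

Keys produced this way still need a manual uniqueness check, since two papers by the same author in the same year can share a first title word.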

The example reference looks pretty clean. We might use sentence case for the title, but the author is in LastName, FirstName/Initials form, and the journal and other information all look fine. However, the publisher tag is redundant for an article (it isn't for a book), so it can be removed. For references with multiple authors, the authors' names should be separated by ' and '. When BibTeX references come from JSTOR they often need substantial cleaning.
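
Two of these clean-up rules are mechanical enough to sketch in Python. The sketch assumes an entry has already been parsed into an entry type plus a dict of tag names to values; the helpers clean_fields and join_authors and the REDUNDANT table are hypothetical, not part of any BibTeX tool.

```python
# Tags considered redundant for a given entry type (only the rule stated above).
REDUNDANT = {"article": {"publisher"}}

def clean_fields(entry_type, fields):
    """Return a copy of `fields` without tags redundant for this entry type."""
    drop = REDUNDANT.get(entry_type.lower(), set())
    return {name: value for name, value in fields.items() if name not in drop}

def join_authors(authors):
    """Join 'LastName, Initials' strings with ' and ', as BibTeX expects."""
    return " and ".join(authors)

fields = {"title": "Genes and patent policy", "publisher": "Nature Publishing Group"}
print(clean_fields("article", fields))
print(clean_fields("book", fields))
```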

We are then going to add tags. For the up group we are going to add the following tags:

  • abstract: Make sure it is all on one line with no carriage returns in it and is enclosed in the braces {}.
  • discipline: 'Policy Report', 'Law', 'Econ', 'Mgmt', 'Biology', 'Physics', etc. Keep a list of the disciplines used (and see below).
  • research_type: 'Theory' (if there are lots of equations or the paper is developing a written theory), 'Empirical' (if there are regressions), 'Discussion'. You can add two tags separated by a comma if needed (e.g., Theory,Empirical).
  • industry: The industry that the empirical results, theory, or discussion apply to, if there is one. Papers that talk about thickets in Nanotech should be classified as Nanotech, etc.
  • thicket_stance: Is the paper 'Pro' or 'Anti' the existence of patent thickets? Possible classifications might be: 'Pro', 'Assumed Pro' (for when the paper itself doesn't say that patent thickets exist, but does say that others say they exist, and proceeds on this assumption), 'Weakly Pro' (for when it says they might be a problem), 'Neutral', 'None', 'Weakly Anti' (for when it says that they probably aren't a problem), and 'Anti'.
  • thicket_stance_extract: An extracted section or sections of the text (as small as possible), as a single line, to provide evidence of the stance
  • thicket_def: The definition of what a patent thicket is explicitly or implicitly used in the paper. See below.
  • thicket_def_extract: An extracted section or sections of the text (as small as possible), as a single line, to provide evidence of the definition
  • tags: Comma-separated tags to describe what the paper is actually about or its key elements. Examples might be: Pools, Standards, SSOs, Oligopolies, Blocking Patents, etc.

For discipline, it isn't important to make a distinction between policy econ, business econ, and other types of econ. Likewise for management papers. For the thicket definition we are going to want to come up with categorizations of the definitions. I suggest that we keep a running list. Candidates might include 'Diversely-held Complementary Inputs', or 'One Firm With Blocking Patents', etc.
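
A few of the tagging rules above can be checked automatically. The Python sketch below assumes an entry has been parsed into a dict of tag names to values; the function check_tags is hypothetical, and the allowed values are simply those listed above for the up group, so the sets would need extending as the running lists grow.

```python
STANCES = {"Pro", "Assumed Pro", "Weakly Pro", "Neutral", "None",
           "Weakly Anti", "Anti"}
RESEARCH_TYPES = {"Theory", "Empirical", "Discussion"}
SINGLE_LINE = ("abstract", "thicket_stance_extract", "thicket_def_extract")

def check_tags(fields):
    """Return a list of problems found in a dict of tag name -> value."""
    problems = []
    for name in SINGLE_LINE:
        value = fields.get(name, "")
        if "\n" in value or "\r" in value:
            problems.append(f"{name} contains a line break")
    stance = fields.get("thicket_stance", "")
    if stance and stance not in STANCES:
        problems.append(f"unknown thicket_stance: {stance}")
    # research_type may hold several comma-separated values.
    for rtype in fields.get("research_type", "").split(","):
        rtype = rtype.strip()
        if rtype and rtype not in RESEARCH_TYPES:
            problems.append(f"unknown research_type: {rtype}")
    return problems

print(check_tags({"thicket_stance": "Maybe", "abstract": "line one\nline two"}))
```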

The corrected BibTeX reference for Andrews is then:

@article{andrews2002genes,
  title={Genes And Patent Policy: Rethinking Intellectual Property Rights},
  author={Andrews, L.B.},
  journal={Nature Reviews Genetics},
  volume={3},
  number={10},
  pages={803--807},
  year={2002},
  abstract={Concerns about human gene patents go beyond moral disquiet about creating a commodity from a part of the human body and also beyond legal questions about whether genes are unpatentable products of nature. New concerns are being raised about harm to public health and to research. In response to these concerns, various policy options, such as litigation, legislation, patent pools and compulsory licensing, are being explored to ensure that gene patents do not impede the practice of medicine and scientific progress.},
  discipline={Biology,Law},
  research_type={Discussion},
  industry={Genetics},
  thicket_stance={Neutral},
  thicket_stance_extract={Whatever policies society develops for gene patents, policymakers will be influenced by the fact that the ‘bio’ in biotechnology — the genes in the gene patents — comes from people. Researchers need the trust of those whom they study to get access to their tissue for research into diagnostics and cures. Using the biological resources of the public (and a substantial amount of public funding), genes have been discovered and patented. Now, policy makers are being asked to ensure that the public receives the benefits.},
  thicket_def={Diversely-held Complementary Inputs},
  thicket_def_extract={Economist Carl Shapiro elaborates on the problems created by a ‘patent thicket’. Using traditional economic analysis, he has shown how, when several monopolists exist that each control a different raw material needed for development of a product, the price of the resulting product is higher than if a single firm controlled trade in all of the raw materials or made the product itself. However, the combined profits of the producers are lower in the presence of complementary monopolies. So, if there are several patent holders whose permission is needed to create a gene therapy (and any one of them could block the production of the gene therapy), inefficiencies in the market are created, potentially harming both the patent holder and the patent users.},  
  tags={IPR Policy, Effects on Research},
  filename={Andrews (2002) - Genes And Patent Policy Rethinking Intellectual Property Rights.pdf}
}

An empty set of tags, ready for insertion above the filename tag, is below. However, some articles may already have an abstract tag.

  abstract={},
  discipline={},
  research_type={},
  industry={},
  thicket_stance={},
  thicket_stance_extract={},
  thicket_def={},
  thicket_def_extract={},  
  tags={},
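
Inserting the empty tags could also be scripted. The Python sketch below assumes one tag per line and a filename tag on its own line, as in the examples above; the function insert_empty_tags is hypothetical. Tags that already exist in the entry (such as an abstract) are not added again.

```python
TEMPLATE = [
    "  abstract={},",
    "  discipline={},",
    "  research_type={},",
    "  industry={},",
    "  thicket_stance={},",
    "  thicket_stance_extract={},",
    "  thicket_def={},",
    "  thicket_def_extract={},",
    "  tags={},",
]

def insert_empty_tags(entry_text):
    """Insert missing template tags immediately above the filename tag."""
    lines = entry_text.splitlines()
    # Tag names already present, taken from the text before the first '='.
    existing = {ln.split("=", 1)[0].strip() for ln in lines if "=" in ln}
    out = []
    for ln in lines:
        if ln.strip().startswith("filename="):
            out.extend(t for t in TEMPLATE
                       if t.split("=", 1)[0].strip() not in existing)
        out.append(ln)
    return "\n".join(out)

entry = "@article{key,\n  title={T},\n  abstract={A},\n  filename={f.pdf}\n}"
print(insert_empty_tags(entry))
```

The sketch relies on the line before the filename tag already ending in a comma, which is the case in the example entries on this page.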


Shortcuts

The easiest way to deal with BibTeX references is to have them in text files and edit them with TextPad. TextPad supports regular expressions in its search and replace. Enable POSIX regular expressions by going to:

  • Configure -> Preferences -> Editor and then ticking the box next to "Use Posix regular expression syntax".

The following regular expressions might be helpful (don't include the quotes):

Find    Replace    Action        
"^ "    ""         Removes spaces at the start of each line
"^"     " "        Put a space at the start of each line
"\n"    " "        Replace returns with a single space


Likewise, TextPad can view text either line wrapped or not (Ctrl-q w to toggle), change selected text to Sentence Case (Ctrl-Shift-u), lower case (Ctrl-l), etc. It also supports block select mode, which is sometimes helpful.
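
For anyone working outside a text editor, the three find-and-replace expressions in the table above have rough equivalents in Python's re module. This is a sketch under the assumption that the text uses "\n" line endings; the function names are illustrative.

```python
import re

def strip_leading_space(text):
    """Find '^ ', replace '': remove one space at the start of each line."""
    return re.sub(r"^ ", "", text, flags=re.MULTILINE)

def add_leading_space(text):
    """Find '^', replace ' ': put a space at the start of each line."""
    return re.sub(r"^", " ", text, flags=re.MULTILINE)

def join_lines(text):
    """Find '\\n', replace ' ': replace each return with a single space."""
    return re.sub(r"\n", " ", text)

print(join_lines(strip_leading_space(" first line\n second line")))
```

The last function is the one needed to get an abstract onto a single line.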

Workflow

The workflow is based on wiki pages. We can recover 'lost' data from these pages, but I suggest copying the pages into a text file at the start of work (use the edit button at the top of the relevant page and copy the text out of the edit window) and saving it on a local machine. At the end of the work session, submit the contents back to the wiki (again using a page edit).

So far we have 'processed' the following pages:

  1. PTLR Annotated BibTeX Master
  2. PTLR Up Group Candidate Filename
  3. PTLR Up Group Processed BibTeX
  4. PTLR Non-Up Group Filename
  5. PTLR Thicket Definition

The next set of papers is provided in the PTLR Unclassified Candidate Filename page. Hopefully this will mostly contain down group papers. We need to go through these papers and classify them by adding them to the following pages:

  1. PTLR Core Group Processed BibTeX
  2. PTLR Up Group Processed BibTeX
  3. PTLR Down Group Processed BibTeX
  4. PTLR Discard Filenames

For the core and down group papers we won't be filling out the following tags (not yet for the core group papers, and probably never for the down group papers):

 thicket_stance={},
 thicket_stance_extract={},
 thicket_def={},
 thicket_def_extract={},  

For the moment the:

 tags={},

tag can be left blank for the core group, but for the down group it would be very helpful to add enough comma-separated tags for us to work out roughly what the paper is about. We should also expand out the:

research_type={},

tag contents for the down group. Possible tags other than empirical, theory, and discussion might be econometric methods, stylized facts, etc. That is, we want to capture what the research is providing in meta terms with this tag.
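
The group-specific rules above amount to using a shorter blank template for core and down group papers. A minimal Python sketch, with hypothetical function names, might be:

```python
ALL_TAGS = ["abstract", "discipline", "research_type", "industry",
            "thicket_stance", "thicket_stance_extract",
            "thicket_def", "thicket_def_extract", "tags"]
THICKET_TAGS = {"thicket_stance", "thicket_stance_extract",
                "thicket_def", "thicket_def_extract"}

def tags_for_group(group):
    """Return the tag names to fill in for a 'core', 'up', or 'down' paper."""
    group = group.lower()
    if group not in {"core", "up", "down"}:
        raise ValueError(f"unknown group: {group}")
    if group in {"core", "down"}:
        # The thicket_* tags are skipped for core (for now) and down papers.
        return [t for t in ALL_TAGS if t not in THICKET_TAGS]
    return list(ALL_TAGS)

def blank_template(group):
    """Render the empty tags for a group, one 'name={},' per line."""
    return "\n".join(f"  {t}={{}}," for t in tags_for_group(group))

print(blank_template("down"))
```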