Changes

627 bytes added, 18:20, 18 March 2016
[https://dataverse.harvard.edu/dataset.xhtml?persistentId=hdl:1902.1/15705 Harvard Dataverse]
 
==New Notes==
 
The source files have moved from:
*https://www.google.com/googlebooks/uspto-patents-grants-text.html (No longer maintained)
To:
*https://bulkdata.uspto.gov/ (includes 2016 data)
 
The historic data is the same at both locations.
 
Each file contains, in order, sorted by document ID:
#Design patents (we will discard)
#Plant patents (we will discard)
#Reissues (we probably want them)
#Utility patents (we want them)
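
The keep/discard rule above could be sketched as a small filter on the document ID. This is only a sketch: it assumes that design, plant and reissue documents carry the ID prefixes <code>D</code>, <code>PP</code> and <code>RE</code>, and that utility patents have all-digit IDs. The function names <code>patent_kind</code> and <code>keep_patent</code> are hypothetical, and the prefix convention should be checked against the actual files.

```python
# Sketch only: the prefix convention (D, PP, RE, all digits) is an assumption
# and should be verified against the document IDs in the bulk files.

KEEP_KINDS = {"reissue", "utility"}


def patent_kind(doc_id: str) -> str:
    """Guess the patent type from the document ID prefix."""
    doc_id = doc_id.strip().upper()
    if doc_id.startswith("PP"):
        return "plant"
    if doc_id.startswith("RE"):
        return "reissue"
    if doc_id.startswith("D"):
        return "design"
    if doc_id.isdigit():
        return "utility"
    return "other"  # anything unrecognised is left for manual review


def keep_patent(doc_id: str) -> bool:
    """True for the types we want: reissues and utility patents."""
    return patent_kind(doc_id) in KEEP_KINDS
```

Unrecognised IDs are reported as <code>other</code> and are not kept, so they can be reviewed by hand before deciding.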
 
The classifications in the XML file are:
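*IPC - these are good, and we only need the main classification
*CPC - as for IPC: good, and we only need the main classification
*USPC - a single numeric string with no separator between class and subclass, so it is ambiguous. Is 22431 224/31 or 22/431, etc.?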
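
To make the USPC ambiguity concrete, the following sketch lists every way an unsplit code could be divided into class and subclass. It assumes the class part is at most three characters long, which is an assumption and not something confirmed by the files. The function name <code>candidate_splits</code> is hypothetical, and the code does not decide which split is correct.

```python
def candidate_splits(code: str, max_class_len: int = 3) -> list[tuple[str, str]]:
    """Return all (class, subclass) pairs an unsplit USPC code could mean.

    max_class_len is an assumed upper bound on the class length.
    Both parts must be non-empty.
    """
    code = code.strip()
    longest = min(max_class_len, len(code) - 1)
    return [(code[:i], code[i:]) for i in range(1, longest + 1)]


for uspc_class, subclass in candidate_splits("22431"):
    print(f"{uspc_class}/{subclass}")
```

For 22431 this yields three candidates, including the 224/31 and 22/431 readings mentioned above, so some other source of information is needed to choose between them.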