Bulk Patent Assignee Processing
USPTO Assignees Data
We would like to download data from this location on the USPTO website and load it into our tables. The objective is to determine whether this dataset is better than the current version of our patent data (a combination of the data in the patent_2015 and patentdata databases).
Steps Followed to Extract the Data
Extracting Data from XML Files
All the historical USPTO data is available as XML files. Here is the tree structure for the XML files:
<patent-assignment>
    <assignment-record>
    <patent-assignors>
    <patent-assignees>
    <patent-properties>
</patent-assignment>
Each of the four internal nodes above is mandatory and groups a set of related information fields. Each node has a corresponding table whose fields closely follow the XML elements.
The corresponding tables are:
assignment-record : assignment
patent-assignors : assignors
patent-assignees : assignees
patent-properties : properties
Additionally, each downloaded file has some associated file-level specs. All of these are stored in the PatentAssignment table.
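The node-to-table mapping described above can be sketched in Python as follows. This is a minimal illustration, not the loader actually used. It assumes that assignment-record yields a single row per assignment, and that each direct child of patent-assignors, patent-assignees and patent-properties yields one row in its table. The leaf element names in the sample (reel-no, name, doc-number and so on) are placeholders chosen for the example; the real field names should be taken from the DTD. The function name extract_rows and the helper leaf_fields are hypothetical.

```python
import xml.etree.ElementTree as ET

# XML node name -> table name, as listed in this section.
NODE_TO_TABLE = {
    "assignment-record": "assignment",
    "patent-assignors": "assignors",
    "patent-assignees": "assignees",
    "patent-properties": "properties",
}


def leaf_fields(elem):
    """Flatten an element into {tag: text} over its leaf descendants."""
    return {e.tag: (e.text or "").strip() for e in elem.iter() if len(e) == 0}


def extract_rows(xml_text):
    """Return {table_name: [row_dict, ...]} for every <patent-assignment>."""
    root = ET.fromstring(xml_text.strip())
    rows = {table: [] for table in NODE_TO_TABLE.values()}
    for assignment in root.iter("patent-assignment"):
        for node_name, table in NODE_TO_TABLE.items():
            node = assignment.find(node_name)
            if node is None:
                # All four nodes are mandatory, so a missing one is an error.
                raise ValueError(f"missing mandatory node <{node_name}>")
            if node_name == "assignment-record":
                # Assumption: one record row per assignment.
                rows[table].append(leaf_fields(node))
            else:
                # Assumption: one row per child of the grouping node.
                rows[table].extend(leaf_fields(child) for child in node)
    return rows


# Illustrative sample only; leaf element names are placeholders.
SAMPLE = """
<patent-assignment>
  <assignment-record><reel-no>1</reel-no><frame-no>2</frame-no></assignment-record>
  <patent-assignors>
    <patent-assignor><name>A</name></patent-assignor>
  </patent-assignors>
  <patent-assignees>
    <patent-assignee><name>B</name></patent-assignee>
    <patent-assignee><name>C</name></patent-assignee>
  </patent-assignees>
  <patent-properties>
    <patent-property><doc-number>123</doc-number></patent-property>
  </patent-properties>
</patent-assignment>
"""

if __name__ == "__main__":
    for table, table_rows in extract_rows(SAMPLE).items():
        print(table, table_rows)
```

The resulting dictionaries can then be passed to whatever insert routine populates the four tables. For the full-size USPTO files, a streaming parser such as ET.iterparse would likely be a better fit than loading each file into memory at once.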
DTD
Here is the DTD specified by the USPTO: