Difference between revisions of "Patent Schema Reconciliation"

From edegan.com

Revision as of 17:30, 26 July 2017

Example files

E:\McNair\Projects\SimplerPatentData\data\examples

There are two sets:

  • Granted
  • Applications

Applications contains only utility and some plant patents, whereas granted contains design, plant, reissue, and utility patents (i.e., all four types of patents). Both applications and granted come in multiple versions (e.g., v4.5, v4.4, v4.3).

The Task

For both sets (starting with granted), all types, and all versions, we need to identify the xpath (or APS equivalent, see below) for each node.

A node is something like:

  • patent number (it shows up as document-id)
  • filing number (it also shows up as a document-id in another place)
  • grant date
  • kind
  • type
  • applicationnumber
  • filingdate

Some nodes are lists of other nodes, for example the assignees node contains multiple assignment records.
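One way to keep track of the task is a lookup table keyed by set, patent type, and version, with one xpath per node. The sketch below is only an illustration: the names PATENT_TYPES, VERSIONS, NODES, xpath_table, and missing_entries are hypothetical, the version list holds only the three versions named above (it is not complete), and the single filled-in entry uses the granted, v4.5, plant xpaths recorded in the task notes.

```python
from itertools import product

# Patent types per set, as described in the notes above.
PATENT_TYPES = {
    "granted": ("design", "plant", "reissue", "utility"),
    "applications": ("utility", "plant"),
}

# Only the versions named in the notes; the real list is longer (assumption).
VERSIONS = ("v4.5", "v4.4", "v4.3")

# Node names as listed in the notes.
NODES = (
    "patent number",
    "filing number",
    "grant date",
    "kind",
    "type",
    "applicationnumber",
    "filingdate",
)

# (set, type, version) -> {node: xpath}
BASE = "//us-bibliographic-data-grant/publication-reference/document-id"
xpath_table = {
    ("granted", "plant", "v4.5"): {
        "patent number": BASE + "/doc-number",
        "kind": BASE + "/kind",
        "grant date": BASE + "/date",
    },
}


def missing_entries(table):
    """List every (set, type, version, node) that has no xpath yet."""
    missing = []
    for set_name, types in PATENT_TYPES.items():
        for ptype, version in product(types, VERSIONS):
            known = table.get((set_name, ptype, version), {})
            for node in NODES:
                if node not in known:
                    missing.append((set_name, ptype, version, node))
    return missing


todo = missing_entries(xpath_table)
print(len(todo), "xpaths still to identify")
```

A flat dictionary keyed by a tuple keeps each combination independent, so a node that is absent in one version (such as filing number) simply stays in the missing list instead of needing a placeholder.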

Task Notes

Details from Joe Reilly Work Logs (log page)

Created a text document of textpaths for the following nodes: patent number, filing number, grant date, kind, type, application number, and filing date. Saved file in E:\McNair\Projects\SimplerPatentData\data\examples\Patent Schema Reconciliation.


Notes:

There seems to be no filing number in the code for any of the patent types. There is also no example xpath for filing number in Equivalent_XPath_and_APS_Queries#Query_Equivalences. I'm leaving filing number blank for now.

An example block of code from granted, v4.5, plant:

<us-bibliographic-data-grant> <publication-reference> <document-id> <country>US</country> <doc-number>PP027502</doc-number> <kind>P3</kind> <date>20161227</date> </document-id>

For the above code, I identified what I think are accurate xpaths for three nodes:

  • patent number: //us-bibliographic-data-grant/publication-reference/document-id/doc-number
  • kind: //us-bibliographic-data-grant/publication-reference/document-id/kind
  • grant date: //us-bibliographic-data-grant/publication-reference/document-id/date

I am adding the xpaths for these nodes, as well as the other nodes mentioned above, for the four types of patents, for each version, for both granted and applications.
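The three xpaths can be checked against the example block with a few lines of Python. This is a minimal sketch, not the project's extraction script: it assumes the standard-library xml.etree.ElementTree module, closes the tags that the excerpt leaves open, and wraps the fragment in a root element assumed here to be <us-patent-grant>. ElementTree does not accept absolute paths on an element, so each xpath is written with a leading "." (".//..." instead of "//..."). The names SAMPLE, XPATHS, and extract_nodes are hypothetical.

```python
import xml.etree.ElementTree as ET

# The fragment from the notes, with closing tags added and wrapped in an
# assumed <us-patent-grant> root so that it parses as a whole document.
SAMPLE = """
<us-patent-grant>
  <us-bibliographic-data-grant>
    <publication-reference>
      <document-id>
        <country>US</country>
        <doc-number>PP027502</doc-number>
        <kind>P3</kind>
        <date>20161227</date>
      </document-id>
    </publication-reference>
  </us-bibliographic-data-grant>
</us-patent-grant>
"""

# Same xpaths as in the notes, made relative for ElementTree.
BASE = ".//us-bibliographic-data-grant/publication-reference/document-id"
XPATHS = {
    "patent_number": BASE + "/doc-number",
    "kind": BASE + "/kind",
    "grant_date": BASE + "/date",
}


def extract_nodes(xml_text, xpaths):
    """Return {node name: text of the first element matching its xpath}."""
    root = ET.fromstring(xml_text)
    # findtext returns None when nothing matches, which flags a wrong xpath.
    return {name: root.findtext(path) for name, path in xpaths.items()}


values = extract_nodes(SAMPLE, XPATHS)
print(values)
# {'patent_number': 'PP027502', 'kind': 'P3', 'grant_date': '20161227'}
```

Running the same function over one example file per type and version would show quickly which xpaths return None and therefore need a different path in that schema.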

Useful links

The Equivalent_XPath_and_APS_Queries#Query_Equivalences page has example XPath statements

The Reproducible_Patent_Data#Schema_Reconciliation page shows which schemas are associated with which year

The Patent_Data_Extraction_Scripts_(Tool)#Utility_patent_grants_fields page has examples of nodes and where to find them for utility patents (XML version 4.4, I think).