Difference between revisions of "Patent Assignment Data Restructure"
Revision as of 15:01, 2 March 2017
Restructuring the current patent dataset first requires rigorous cleaning of the data. The primary areas for improvement are:
- 1. Clean the ptoassignment table so that its keys are unique.
- 2. Clean ptoproperties to remove non-utility patents (including patent numbers, application numbers, and something else that we haven't matched yet).
- 3. Clean ptoassignee to extract the address components and clean them up.
- 4. Check that all patent numbers are accounted for in ptoassignee_currentusa.
- 5. Clean up the correspondence addresses.
- 6. Transform structure.
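
As a rough illustration of steps 1 and 2, the following Python sketch shows one way the deduplication and the non-utility filter might look. It is not the procedure actually used on these tables. The column names (`reelno`, `frameno`, `note`) and the helper names `dedupe` and `is_utility` are hypothetical, and the filter assumes that utility patent numbers are purely numeric while other patent types carry a letter prefix (for example D for design, PP for plant, RE for reissue).

```python
import re

# Assumption: after removing commas and whitespace, a utility patent number
# consists of digits only; prefixed numbers (D, PP, RE, H, T, ...) are treated
# as non-utility.
UTILITY_RE = re.compile(r"\d+")


def is_utility(patent_number):
    """Return True if the value looks like a utility patent number."""
    if patent_number is None:
        return False
    cleaned = patent_number.replace(",", "").strip()
    return UTILITY_RE.fullmatch(cleaned) is not None


def dedupe(rows, key_fields):
    """Keep the first row seen for each combination of key fields."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical sample rows; the real ptoassignment columns may differ.
assignments = [
    {"reelno": 1, "frameno": 10, "note": "a"},
    {"reelno": 1, "frameno": 10, "note": "b"},  # duplicate key, dropped
    {"reelno": 2, "frameno": 5, "note": "c"},
]
unique_assignments = dedupe(assignments, ["reelno", "frameno"])

patent_numbers = ["7,654,321", "D654321", "RE43210", "PP12345"]
utility_only = [n for n in patent_numbers if is_utility(n)]
```

Keeping the first row per key is only one possible policy. If duplicate rows disagree on other columns, they would need to be inspected before any are discarded.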