Difference between revisions of "Matching VentureOne (Data)"

Revision as of 14:32, 5 July 2016


McNair Project
Matching VentureOne (Data)
Copyright © 2016 edegan.com. All Rights Reserved.


Overview

In this matching process, we join patent data to venture-backed companies and count the number of patents affiliated with each venture-backed company.

We first obtain standardized entity names for the venture-backed companies from the source VentureOne data set, then match them against the standardized names of the companies that hold patents in our patent database. Patent information is then joined to the venture-backed companies on the standardized names common to both.


  • Variables used from source data set: EntityName
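The name-matching idea described above can be sketched in Python. This is a minimal illustration, not the Matcher script used in the project: the `standardize` rules (lowercasing, stripping punctuation, dropping trailing legal suffixes), the `SUFFIXES` list, and the function names are all assumptions made for the example.

```python
import re

# Hypothetical suffix list; the actual Matcher rules are not documented on this page.
SUFFIXES = {"inc", "corp", "corporation", "llc", "ltd", "co", "company"}


def standardize(name):
    """Lowercase, replace punctuation with spaces, and drop trailing legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]+", " ", name.lower()).split()
    while tokens and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def group_by_standard_name(names):
    """Map each standardized name to the list of raw names that produce it."""
    groups = {}
    for raw in names:
        groups.setdefault(standardize(raw), []).append(raw)
    return groups


def match_names(entity_names, assignee_names):
    """Return {standard name: (raw entity names, raw assignee names)}
    for standardized names that occur on both sides."""
    entities = group_by_standard_name(entity_names)
    assignees = group_by_standard_name(assignee_names)
    return {
        key: (entities[key], assignees[key])
        for key in entities
        if key and key in assignees
    }
```

For instance, `match_names(["Acme Widgets, Inc."], ["ACME WIDGETS CORP"])` pairs the two raw names under the shared key `acme widgets`. Exact matching on a normalized key is the simplest choice; a fuzzy matcher would catch more variants at the cost of false positives.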

Final Matched Table

  1. A summary table showing the number of patents owned and the minimum, maximum, and average grant year for each company
  2. A table containing all patent information for each company that has patents
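As a rough sketch of how the summary table could be derived from the fully joined table, the Python function below aggregates per company. The row layout `(company, patent_number, grant_year)` and the output field names are assumptions for illustration, and the sketch assumes one row per patent.

```python
from collections import defaultdict


def summarize(rows):
    """Aggregate (company, patent_number, grant_year) rows into per-company statistics.

    Assumes each patent appears once per company; duplicates would be counted twice.
    """
    years_by_company = defaultdict(list)
    for company, _patent_number, grant_year in rows:
        years_by_company[company].append(grant_year)

    return {
        company: {
            "patents": len(years),
            "min_year": min(years),
            "max_year": max(years),
            "avg_year": sum(years) / len(years),
        }
        for company, years in years_by_company.items()
    }
```

The same aggregation is a single `GROUP BY` query with `COUNT`, `MIN`, `MAX`, and `AVG` if it is done inside the database instead.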



Detailed Data Processing

  • Get the VentureOne data ready
  1. Source file for VentureOne data (original data source): E:\McNair\Projects\Venture One Data\Venture Data 1.xlsx
  2. Clean it up (extraneous symbols and words removed): E:\McNair\Software\Scripts\Matcher\Input\Venture Data 1.txt
  3. Match it against itself to get standardized entity names: E:\McNair\Projects\Venture One Data\Cleaned and Matched Data.xlsx
  • Get the patent data ready
  1. Draw the distinct assignees: Z:\allpatentsprocessed\DistinctAssignees2.txt
  2. Match them against themselves to get standardized org names for patent data: Z:\allpatentsprocessed\DistinctAssignees2matched.txt
  • Match standardized org names of patent data to standardized entity names of venture data: Z:\allpatentsprocessed\Venture Patent Matched.txt
  • Join patent data to venture data to get patent information of each venture-backed company
  1. Join patent data to assignee data, creating firstjoin_cleaned which matches assignees to patent numbers.
  2. Join firstjoin_cleaned data to matchassignee data, creating secondjoin_cleaned which matches standard org names to patent numbers
  3. Join secondjoin_cleaned data to venturepatentmatched data, creating fourthjoin_cleaned which matches standard venture company names to patent numbers
  • Final summary tables
  1. Summary table showing the number of patents owned and the minimum, maximum, and average grant year for each company: E:\McNair\Projects\Venture One Data\venturepatentreallyfinal.txt
  2. A table of all patent information for each company that has patents: E:\McNair\Projects\Venture One Data\venturepatentfullyjoined.txt
  • Notes
  1. All data is in the allpatentsprocessed database. Access it by logging on to researcher@McNair DBServ:/bulk/allpatentsprocessed
  2. A script with the detailed processing procedure can be found at E:\McNair\Projects\Venture One Data\patent data script.txt
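
The three-step join described above can be sketched as follows. This is not the project's script (that is the file referenced in the notes); it uses Python's built-in `sqlite3` module as a stand-in for the real database. The table names `firstjoin_cleaned`, `secondjoin_cleaned`, and `fourthjoin_cleaned` come from the steps above, while the input table layouts and every column name (`patentnumber`, `grantyear`, `assignee`, `stdorgname`, `stdentityname`) are assumptions made for illustration.

```python
import sqlite3


def build_joins(conn):
    """Create the three intermediate tables from four assumed input tables:
    patents(patentnumber, grantyear), assignees(patentnumber, assignee),
    matchassignee(assignee, stdorgname), venturepatentmatched(stdorgname, stdentityname).
    """
    conn.executescript(
        """
        -- Step 1: attach raw assignee names to patent numbers.
        CREATE TABLE firstjoin_cleaned AS
          SELECT p.patentnumber, p.grantyear, a.assignee
          FROM patents p
          JOIN assignees a ON a.patentnumber = p.patentnumber;

        -- Step 2: replace raw assignee names with standardized org names.
        CREATE TABLE secondjoin_cleaned AS
          SELECT f.patentnumber, f.grantyear, m.stdorgname
          FROM firstjoin_cleaned f
          JOIN matchassignee m ON m.assignee = f.assignee;

        -- Step 3: keep only patents whose org name matched a venture-backed company.
        CREATE TABLE fourthjoin_cleaned AS
          SELECT s.patentnumber, s.grantyear, v.stdentityname
          FROM secondjoin_cleaned s
          JOIN venturepatentmatched v ON v.stdorgname = s.stdorgname;
        """
    )
    conn.commit()
```

Because each step is an inner join, patents whose assignee has no standardized name, or whose standardized name matches no venture-backed company, drop out along the way. Only matched patents reach `fourthjoin_cleaned`, which is what the final summary tables are built from.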