Matching VentureOne (Data)

McNair Project
	Copyright © 2016 edegan.com. All Rights Reserved.

Overview

In this matching process, we join patent data to VentureOne companies and count the number of patents affiliated with each company.

We first get the standard company names for VentureOne companies from the source VentureOne data set. Then we standardize the names of the companies that have patents in our patent database. Finally, we join patent information to VentureOne companies on the standard company names that the two sets share.

Raw Data

The original data set of VentureOne companies can be found at: E:\McNair\Projects\Venture One Data\Venture Data 1.xlsx

All variables: EntityName, Employees, City, State, Zip, AreaCode, Business Status, IndustryGroup, etc.
Variables used for matching: EntityName

The original patent data is in our database: 128.42.44.181/bulk/Hubs

Final Matched Tables

A summary table displaying the number of patents owned, minimum grant year, maximum grant year, and average grant year for each company (including companies that own no patents) can be found at: E:\McNair\Projects\Venture One Data\venturepatentreallyfinal.txt
A table containing all patent information for the companies that have patents can be found at: E:\McNair\Projects\Venture One Data\venturepatentfullyjoined.txt

Detailed Data Processing

Get the VentureOne data ready

Source file for VentureOne data: E:\McNair\Projects\Venture One Data\Venture Data 1.xlsx (original data source)
Clean it up: E:\McNair\Software\Scripts\Matcher\Input\Venture Data 1.txt (extraneous symbols and words removed)
Match it against itself to get standardized entity names: E:\McNair\Projects\Venture One Data\Cleaned and Matched Data.xlsx
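The cleanup step above (removing extraneous symbols and words) can be illustrated with a minimal sketch. This is not the Matcher script itself: the function name standardize_name and the NOISE_WORDS list are hypothetical, since this page does not list the actual cleanup rules.

```python
import re

# Hypothetical list of words to drop; the real cleanup rules are not given on this page.
NOISE_WORDS = {"inc", "corp", "corporation", "llc", "ltd", "co", "company"}


def standardize_name(raw: str) -> str:
    """Lowercase a name, strip symbols, and drop common corporate suffixes."""
    # Replace every run of characters that is not a letter, digit, or space with a space.
    text = re.sub(r"[^a-z0-9 ]+", " ", raw.lower())
    # Drop noise words; joining on single spaces also collapses repeated whitespace.
    words = [w for w in text.split() if w not in NOISE_WORDS]
    return " ".join(words)


def standardize_all(names):
    """Map each raw name to its standardized form."""
    return {name: standardize_name(name) for name in names}
```

Under these assumed rules, variants such as "Acme Widgets, Inc." and "ACME WIDGETS CORP" reduce to the same standardized string, which is what makes a later join on standardized names possible.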

Get the patent data ready

Draw the distinct assignees: Z:\allpatentsprocessed\DistinctAssignees2.txt
Match them against themselves to get standardized org names for patent data: Z:\allpatentsprocessed\DistinctAssignees2matched.txt

Match standardized org names of patent data to standardized entity names of venture data: Z:\allpatentsprocessed\Venture Patent Matched.txt

Join patent data to venture data to get patent information of each venture-backed company

Join patent data to assignee data, creating firstjoin_cleaned, which matches assignees to patent numbers.
Join firstjoin_cleaned data to matchassignee data, creating secondjoin_cleaned, which matches standard org names to patent numbers.
Join secondjoin_cleaned data to venturepatentmatched data, creating fourthjoin_cleaned, which matches standard venture company names to patent numbers.
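The chain of joins above can be sketched against a small in-memory SQLite database. The table names firstjoin_cleaned, secondjoin_cleaned, fourthjoin_cleaned, matchassignee, and venturepatentmatched come from this page; the names patents and assignees, all column names, and the toy rows are assumptions made for illustration, and the real processing is in the project's own script.

```python
import sqlite3


def make_toy_db():
    """Create an in-memory database with hypothetical input tables and toy rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE patents (patent_number INTEGER, grant_year INTEGER);
        CREATE TABLE assignees (patent_number INTEGER, assignee TEXT);
        CREATE TABLE matchassignee (assignee TEXT, std_orgname TEXT);
        CREATE TABLE venturepatentmatched (std_orgname TEXT, std_entityname TEXT);
    """)
    conn.executemany("INSERT INTO patents VALUES (?, ?)",
                     [(1, 2001), (2, 2003), (3, 2005)])
    conn.executemany("INSERT INTO assignees VALUES (?, ?)",
                     [(1, "Acme Inc."), (2, "ACME INC"), (3, "Other Corp")])
    conn.executemany("INSERT INTO matchassignee VALUES (?, ?)",
                     [("Acme Inc.", "acme"), ("ACME INC", "acme"), ("Other Corp", "other")])
    conn.executemany("INSERT INTO venturepatentmatched VALUES (?, ?)",
                     [("acme", "acme widgets")])
    return conn


def build_joins(conn):
    """Run the three joins in order, materializing each result as a table."""
    # Step 1: attach the raw assignee name to each patent.
    conn.execute("""CREATE TABLE firstjoin_cleaned AS
        SELECT p.patent_number, p.grant_year, a.assignee
        FROM patents p JOIN assignees a ON a.patent_number = p.patent_number""")
    # Step 2: replace the raw assignee name with its standardized org name.
    conn.execute("""CREATE TABLE secondjoin_cleaned AS
        SELECT f.patent_number, f.grant_year, m.std_orgname
        FROM firstjoin_cleaned f JOIN matchassignee m ON m.assignee = f.assignee""")
    # Step 3: map standardized org names to standardized venture company names.
    conn.execute("""CREATE TABLE fourthjoin_cleaned AS
        SELECT s.patent_number, s.grant_year, v.std_entityname
        FROM secondjoin_cleaned s
        JOIN venturepatentmatched v ON v.std_orgname = s.std_orgname""")
    conn.commit()
```

Because each step is an inner join, patents whose assignee never matches a venture company (patent 3 in the toy data) drop out of fourthjoin_cleaned.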

Final summary tables

Summary table displaying number of patents owned, minimum grant year, maximum grant year, and average grant year for each company: E:\McNair\Projects\Venture One Data\venturepatentreallyfinal.txt
A table of all patent information for each company that has patents: E:\McNair\Projects\Venture One Data\venturepatentfullyjoined.txt
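The summary table can be sketched as a left join followed by a grouped aggregate, so that companies with no patents are kept with a count of zero. The table name venture_companies, the column names, and the toy rows are assumptions; only fourthjoin_cleaned is named on this page, and the sketch does not reproduce the project's actual script.

```python
import sqlite3


def make_summary_db():
    """Create an in-memory database with hypothetical tables and toy rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE venture_companies (std_entityname TEXT);
        CREATE TABLE fourthjoin_cleaned (
            patent_number INTEGER, grant_year INTEGER, std_entityname TEXT);
    """)
    conn.executemany("INSERT INTO venture_companies VALUES (?)",
                     [("acme widgets",), ("zeta labs",)])
    conn.executemany("INSERT INTO fourthjoin_cleaned VALUES (?, ?, ?)",
                     [(1, 2001, "acme widgets"), (2, 2003, "acme widgets")])
    return conn


def summarize(conn):
    """Return (company, patent count, min year, max year, avg year) per company."""
    # LEFT JOIN keeps companies without patents; COUNT over the patent column
    # ignores NULLs, so those companies get 0, and MIN/MAX/AVG come back as NULL.
    return conn.execute("""
        SELECT c.std_entityname,
               COUNT(DISTINCT f.patent_number) AS patent_count,
               MIN(f.grant_year) AS min_year,
               MAX(f.grant_year) AS max_year,
               AVG(f.grant_year) AS avg_year
        FROM venture_companies c
        LEFT JOIN fourthjoin_cleaned f ON f.std_entityname = c.std_entityname
        GROUP BY c.std_entityname
        ORDER BY c.std_entityname
    """).fetchall()
```

Counting f.patent_number rather than using COUNT(*) is the design choice that matters here: COUNT(*) would report 1 for a company whose only row is the NULL-padded row produced by the left join.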

Notes

All data are in the allpatentsprocessed database. Access it by logging on to researcher@McNair DBServ:/bulk/allpatentsprocessed
A script of the detailed processing procedure can be found at: E:\McNair\Projects\Venture One Data\patent data script.txt
