Difference between revisions of "Matching VentureOne (Data)"
Revision as of 11:45, 16 June 2016
Matching VentureOne (Data) | |
---|---|
Copyright © 2016 edegan.com. All Rights Reserved. |
Data Processing
- Get the VentureOne data ready
- Source file for VentureOne data (original data source)
E:\McNair\Projects\Venture One Data\Venture Data 1.xlsx
- Clean it up (extraneous symbols and words removed)
E:\McNair\Software\Scripts\Matcher\Input\Venture Data 1.txt
- Match it against itself to get standardized entity names
E:\McNair\Projects\Venture One Data\Cleaned and Matched Data.xlsx
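The clean-up and self-matching steps above can be illustrated with a minimal Python sketch. It is not the Matcher script itself, whose rules are not documented on this page. The word list `NOISE_WORDS`, the functions `clean_name` and `standardize`, and the sample names are all hypothetical, and picking the first spelling seen as the standard name is an arbitrary assumption.

```python
import re
from collections import defaultdict

# Hypothetical list of words to drop; the word list actually used is not given here.
NOISE_WORDS = {"inc", "incorporated", "corp", "corporation", "llc", "ltd", "co", "company"}


def clean_name(name):
    """Lowercase the name, turn symbols into spaces, and drop noise words."""
    text = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    words = [w for w in text.split() if w not in NOISE_WORDS]
    return " ".join(words)


def standardize(names):
    """Match a list against itself: names sharing a cleaned key get one standard name."""
    groups = defaultdict(list)
    for name in names:
        groups[clean_name(name)].append(name)
    # Assumption: the first spelling seen in each group becomes the standard name.
    return {name: members[0] for members in groups.values() for name in members}


raw = ["Acme, Inc.", "ACME Inc", "Beta Labs LLC", "Beta-Labs"]
standard = standardize(raw)
for original, std in standard.items():
    print(original, "->", std)
```

Exact matching on a cleaned key is the simplest possible rule; a real matcher may also use fuzzy comparison, which this sketch does not attempt.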
- Get the patent data ready
- Draw the distinct assignees
Z:\allpatentsprocessed\DistinctAssignees2.txt
- Match them against themselves to get standardized org names for patent data
Z:\allpatentsprocessed\DistinctAssignees2matched.txt
- Match standardized org names of patent data to standardized entity names of venture data
Z:\allpatentsprocessed\Venture Patent Matched.txt
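As a rough illustration of matching the two standardized lists against each other, the sketch below pairs names that agree after ignoring case and extra whitespace. The functions `normalize` and `match_names` and the sample data are hypothetical; the matching rule actually used to produce the file above is not described on this page.

```python
def normalize(name):
    """Comparison key: lowercase with runs of whitespace collapsed."""
    return " ".join(name.lower().split())


def match_names(patent_orgs, venture_entities):
    """Return (patent_org, venture_entity) pairs whose comparison keys are equal."""
    by_key = {}
    for entity in venture_entities:
        # Keep the first venture entity seen for each key.
        by_key.setdefault(normalize(entity), entity)
    return [
        (org, by_key[normalize(org)])
        for org in patent_orgs
        if normalize(org) in by_key
    ]


patent_orgs = ["ACME", "Gamma Systems", "beta  labs"]
venture_entities = ["Acme", "Beta Labs", "Delta Bio"]
pairs = match_names(patent_orgs, venture_entities)
print(pairs)
```

Names with no counterpart on the other side (here "Gamma Systems" and "Delta Bio") simply produce no pair.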
- Join patent data to venture data to get the patent information of each venture-backed company
- Join patent data to assignee data, creating firstjoin_cleaned
- Join firstjoin_cleaned data to matchassignee data, creating secondjoin_cleaned
- Join secondjoin_cleaned data to venturepatentmatched data, creating fourthjoin_cleaned
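The chain of joins above could look roughly like the following, shown here against an in-memory SQLite database so the example is self-contained. The table names come from the steps above, but every column name (`patentnumber`, `grantyear`, `orgname`, `stdorgname`, `stdentityname`), the join keys, and the sample rows are assumptions, and the real database engine may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schemas and sample rows; the real columns are not documented here.
CREATE TABLE patent (patentnumber INTEGER, grantyear INTEGER);
CREATE TABLE assignee (patentnumber INTEGER, orgname TEXT);
CREATE TABLE matchassignee (orgname TEXT, stdorgname TEXT);
CREATE TABLE venturepatentmatched (stdorgname TEXT, stdentityname TEXT);

INSERT INTO patent VALUES (1, 2001), (2, 2005), (3, 2010);
INSERT INTO assignee VALUES (1, 'ACME INC'), (2, 'Acme, Inc.'), (3, 'Gamma Systems');
INSERT INTO matchassignee VALUES
  ('ACME INC', 'Acme'), ('Acme, Inc.', 'Acme'), ('Gamma Systems', 'Gamma Systems');
INSERT INTO venturepatentmatched VALUES ('Acme', 'Acme');

-- Step 1: attach each patent to its raw assignee name.
CREATE TABLE firstjoin_cleaned AS
  SELECT p.patentnumber, p.grantyear, a.orgname
  FROM patent p JOIN assignee a ON a.patentnumber = p.patentnumber;

-- Step 2: attach the standardized org name.
CREATE TABLE secondjoin_cleaned AS
  SELECT f.*, m.stdorgname
  FROM firstjoin_cleaned f JOIN matchassignee m ON m.orgname = f.orgname;

-- Step 3: keep only patents whose org matched a venture-backed company.
CREATE TABLE fourthjoin_cleaned AS
  SELECT s.*, v.stdentityname
  FROM secondjoin_cleaned s
  JOIN venturepatentmatched v ON v.stdorgname = s.stdorgname;
""")

rows = conn.execute(
    "SELECT patentnumber, grantyear, stdentityname "
    "FROM fourthjoin_cleaned ORDER BY patentnumber"
).fetchall()
print(rows)
```

Inner joins are used throughout, so patents whose assignee has no venture match drop out at the last step; whether the original procedure used inner or outer joins is not stated.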
- Final summary tables
- Summary table displaying number of patents, minimum grant year, maximum grant year and average grant year for each company
E:\McNair\Projects\Venture One Data\venturepatentreallyfinal.txt
- A table of all patent information for each company that has a patent
E:\McNair\Projects\Venture One Data\venturepatentfullyjoined.txt
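The per-company summary (number of patents, minimum, maximum and average grant year) can be sketched in plain Python as below. The row layout `(company, patent_number, grant_year)`, the function `summarize`, and the sample rows are hypothetical, and counting each patent number once per company is an assumption about how duplicates should be handled.

```python
from collections import defaultdict


def summarize(rows):
    """Build per-company patent statistics from (company, patent_number, grant_year) rows."""
    patents = defaultdict(dict)
    for company, patent_number, grant_year in rows:
        # Keying by patent number keeps one entry per distinct patent.
        patents[company][patent_number] = grant_year

    summary = {}
    for company, by_number in patents.items():
        years = list(by_number.values())
        summary[company] = {
            "patents": len(years),
            "min_year": min(years),
            "max_year": max(years),
            "avg_year": sum(years) / len(years),
        }
    return summary


rows = [
    ("Acme", 1, 2001),
    ("Acme", 2, 2005),
    ("Acme", 2, 2005),  # duplicate row for the same patent
    ("Beta Labs", 3, 2010),
]
summary = summarize(rows)
for company, stats in summary.items():
    print(company, stats)
```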
- Notes
- All data is in the allpatentsprocessed database. Access it by logging on to researcher@McNair DBServ:/bulk/allpatentsprocessed
- A detailed processing procedure can be found at
E:\McNair\Projects\Venture One Data\patent data script.txt