Difference between revisions of "Shelby Bice (Research Plan)"

From edegan.com
 
 
== Log ==
 
[[Category:Work Log]]
'''2/16/2017''' - Talked over project with Ed, began reading existing wiki pages related to patent data and databases

'''2/21/2017''' - Brushed up on Perl, SQL, and the entity-relationship model of designing databases
* In the documentation, I want to briefly explain what the entity-relationship model is before including the diagram so that readers have a little bit of background
* Found a tool for creating a visual representation called ERDPlus.com - can create a standalone diagram instead of making an account, and can download it

Learning commands from Patent Data - SQL Steps
* The copy command is a PostgreSQL command that copies a SQL table to a text file
** DELIMITER sets what will separate columns in the text file
** HEADER specifies that there will be a header in the text file with the names of the columns
* Definitely need to include more detail about what these do in the documentation
* The insert into command inserts a new entry into the table
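
A minimal sketch of the two commands noted above, to show roughly what they look like in PostgreSQL. The table name <code>example_patents</code>, its columns, and the output path are all hypothetical placeholders for illustration; they are not taken from the existing patent database or its scripts.

```sql
-- Hypothetical table, used only for illustration; not the project's real schema.
CREATE TABLE example_patents (
    patent_number text PRIMARY KEY,
    title         text
);

-- INSERT INTO adds one new entry (row) to the table.
INSERT INTO example_patents (patent_number, title)
VALUES ('0000001', 'Example title');

-- COPY ... TO writes the contents of the table to a text file.
-- DELIMITER sets the character that separates columns (a tab here).
-- HEADER writes the column names as the first line of the file.
-- FORMAT csv is given because older PostgreSQL versions accept HEADER
-- only in CSV format. The output path is a placeholder.
COPY example_patents TO '/tmp/example_patents.txt'
    WITH (FORMAT csv, DELIMITER E'\t', HEADER);
```

Note that COPY with a file name writes the file on the database server, which normally requires elevated privileges; from the psql client, <code>\copy</code> takes the same options and writes the file on the client machine instead.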
 

Latest revision as of 16:17, 21 March 2017

Overview

Overall goals:

  • Create a better database that includes all the patent data to which the McNair Center has access.
  • More importantly, create documentation of the process so that it can be improved upon or replicated in the future.

General Outline - updated 2/21/2017

  • Familiarize myself with SQL, Perl, and database design
  • Familiarize myself with existing scripts and schema for existing database
  • Design a better representation for database
  • Fix scripts if necessary
  • Start moving data into new database by querying existing databases (using SQL)
  • Use scripts to query new data
  • Test database
  • Remove extraneous information from database (copies, patents that we're not interested in, etc.)

Documentation I need to include:

  • Schema of new database (with justification of design), would like to include a visual representation
  • SQL commands that were used to fill database with explanation of what they do
  • Clear instructions on where to find scripts in bulk drive and an explanation of what each script does
  • Visual representation of example table entries that isn't just copied and pasted from a CSV file

Project Pages: Redesigning Patent Database

Log