Revision as of 13:26, 8 December 2017

Overview

McNair Project
Scholar Crawler Main Program
Project Information
Project Title	Scholar Crawler Main Program
Owner	Christy Warden
Start Date	10/23/2017
Deadline
Keywords	Google Scholar, python
Primary Billing
Notes
Has project status	Active
	Copyright © 2016 edegan.com. All Rights Reserved.

This code is located at E:/McNair/Software/Google_Scholar_Crawler/mainProgram.py. It combines several other pieces of code into a single program for the patent thicket project. The program takes a search term and a number of pages, searches Google Scholar for that term, downloads as many papers as it can from the results, converts them to text, and searches the text for key terms and a definition of patent thicket. Each piece of code can also be used individually for other applications.
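The final step described above, searching the converted text for key terms and a definition, might look roughly like the sketch below. The function names, the list of key terms, and the "defining verb" heuristic are all assumptions made for illustration; this page does not say how mainProgram.py actually does it.

```python
import re

def find_key_terms(text, key_terms):
    """Count case-insensitive, whole-word occurrences of each key term."""
    counts = {}
    for term in key_terms:
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        counts[term] = len(pattern.findall(text))
    return counts

def find_definition_candidates(text, phrase="patent thicket"):
    """Return sentences where the phrase is followed by a defining verb.

    Assumed heuristic: "<phrase> is", "<phrase>s are", "<phrase> refers to".
    """
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    cue = re.compile(
        re.escape(phrase) + r"s?\s+(is|are|refers to)\b", re.IGNORECASE
    )
    return [s.strip() for s in sentences if cue.search(s)]
```

Calling `find_definition_candidates(paper_text)` on the text of one converted paper returns the sentences worth reviewing by hand; a heuristic this simple will miss definitions phrased in other ways.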

Stage 1

Sets up a series of directories for results to go in.
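A minimal sketch of this stage, assuming one subdirectory per later stage under a folder named after the search term. The folder names and the function `make_result_dirs` are hypothetical and may not match what mainProgram.py uses.

```python
import os

# Hypothetical subdirectory names, one per later stage; the real names may differ.
STAGE_DIRS = ["crawl", "pdfs", "txt", "results"]

def make_result_dirs(base_dir, search_term):
    """Create the results folders for one search term and return their paths."""
    # Replace spaces so the search term is safe to use as a folder name.
    root = os.path.join(base_dir, search_term.replace(" ", "_"))
    paths = {}
    for name in STAGE_DIRS:
        path = os.path.join(root, name)
        os.makedirs(path, exist_ok=True)  # no error if the folder already exists
        paths[name] = path
    return paths
```

Using `exist_ok=True` means the program can be rerun for the same search term without failing on folders left over from an earlier run.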

Stage 2

See the Google Scholar Crawler page, under the scholarcrawl.py heading.

Stage 3

PDF Downloader

Stage 4

PDF to Text Converter

Stage 5

