Difference between revisions of "Collecting SBIR Data"

From edegan.com
Jump to navigation Jump to search
Line 8: Line 8:
 
==Rough notes==
 
==Rough notes==
  
*Get the data from http://www.sbir.gov
+
*Get the data from https://www.sbir.gov/sbirsearch/award/all
 +
*Built a Selenium Web Driver which is stored in
 +
*Does not work because there is a captcha that must be entered after selecting xls download
 +
 
 +
Notes on build a Selenium Web Driver:
 +
*Make sure that you properly set the chromedriver path if you don't have it under root. For example: webdriver.Chrome("/Users/adriansmart/PycharmProjects/SeleniumTest/chromedriver")
 +
*Use driver.find_element_by_xpath to locate element on html page
 +
*To get xpath from html, first load the website
 +
*Right click on the page element you want the xpath and select inspect. This will launch the html inspector and highlight the relevant lines of code
 +
*Right click on what looks like the right piece of code and select "Copy xpath data"
 +
*Paste that stuff in your python script where it asks for a path, For example: driver.find_element_by_xpath("//*[@id='solr-print-dropdown-button']")

Revision as of 17:04, 6 June 2017


McNair Project
Collecting SBIR Data
Project logo 02.png
Project Information
Project Title Collecting SBIR Data
Owner Adrian Smart
Start Date June 6, 2017
Deadline
Keywords Data, Tool
Primary Billing
Notes
Has project status Active
Copyright © 2016 edegan.com. All Rights Reserved.


Rough notes

Notes on build a Selenium Web Driver:

  • Make sure that you properly set the chromedriver path if you don't have it under root. For example: webdriver.Chrome("/Users/adriansmart/PycharmProjects/SeleniumTest/chromedriver")
  • Use driver.find_element_by_xpath to locate element on html page
  • To get xpath from html, first load the website
  • Right click on the page element you want the xpath and select inspect. This will launch the html inspector and highlight the relevant lines of code
  • Right click on what looks like the right piece of code and select "Copy xpath data"
  • Paste that stuff in your python script where it asks for a path, For example: driver.find_element_by_xpath("//*[@id='solr-print-dropdown-button']")