Each file is a group of 1000 companies. Each group of 1000 is numbered sequentially.
==Rough notes==
*Get the data from https://www.sbir.gov/sbirsearch/award/all
*Built a Selenium Web Driver which is stored in E:\McNair\Software\Scripts\Selenium Web Drivers
*The driver does not work because a captcha must be entered after selecting the xls download
==Notes on building a Selenium Web Driver==
In your Python script:
*Make sure that you properly set the chromedriver path if you don't have it under root. For example: webdriver.Chrome("/Users/adriansmart/PycharmProjects/SeleniumTest/chromedriver")
*Use driver.find_element_by_xpath to select the element on the HTML page. This function takes the element's xpath from the HTML, so first load the website in a browser
*Next, right click on the page element whose xpath you want and select inspect. This will launch the HTML inspector and highlight the relevant lines of code
*Right click on what looks like the right piece of code and select "Copy xpath data"
*Paste the copied xpath into your Python script where the function expects a path. For example: driver.find_element_by_xpath("//*[@id='solr-print-dropdown-button']")
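The steps above can be sketched as a short Python script. This is only a minimal sketch, not the driver stored on the E: drive. It assumes Selenium 4, where find_element_by_xpath is replaced by find_element with a locator strategy and the chromedriver path is passed through a Service object. The helper names build_driver, click_by_xpath and main are hypothetical, and the sketch does nothing about the captcha mentioned in the rough notes.

```python
# Assumption: Selenium 4 is installed. The imports live inside build_driver
# so the click helper can be exercised without launching a browser.

# Same string value as selenium.webdriver.common.by.By.XPATH.
XPATH = "xpath"

SEARCH_URL = "https://www.sbir.gov/sbirsearch/award/all"
# XPath copied from the browser's HTML inspector (taken from the notes above).
DOWNLOAD_BUTTON_XPATH = "//*[@id='solr-print-dropdown-button']"


def build_driver(chromedriver_path=None):
    """Start Chrome, optionally with an explicit chromedriver location."""
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    if chromedriver_path is None:
        # chromedriver is expected to be discoverable on the PATH.
        return webdriver.Chrome()
    return webdriver.Chrome(service=Service(executable_path=chromedriver_path))


def click_by_xpath(driver, xpath):
    """Locate one element by its xpath, click it, and return it."""
    element = driver.find_element(XPATH, xpath)
    element.click()
    return element


def main():
    # Hypothetical entry point: open the search page and click the
    # download dropdown. The driver is always closed afterwards.
    driver = build_driver("/Users/adriansmart/PycharmProjects/SeleniumTest/chromedriver")
    try:
        driver.get(SEARCH_URL)
        click_by_xpath(driver, DOWNLOAD_BUTTON_XPATH)
    finally:
        driver.quit()
```

Calling main() runs the whole sequence. Keeping the click in its own function means the same helper can be reused for any other xpath copied from the inspector.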
= SBIR Concatenation =
==Objective==
The objective of this project was to concatenate 162 xlsx files into one large tab-delimited text file. <br>