Changes

165 bytes added, 13:44, 28 September 2017
no edit summary
I have also downloaded the PDFs from the website. These are the PDFs that are listed in the CSV file. Some of the PDFs could not be downloaded. The PDFs are here:
E:\McNair\Projects\USITC\pdf_copy
An example of a PDF that parses correctly: https://www.usitc.gov/secretary/fed_reg_notices/337/337_959_notice02062017sgl.pdf
The parsed text is here:
E:\McNair\Projects\USITC\Parsed_Texts\337_959_notice02062017sgl.txt
However, there will be PDFs where the parsing does not work completely and the text is scrambled.
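One rough way to flag parsed text files that came out scrambled is to measure how many tokens look like ordinary words. The sketch below is only an illustration, not the check used in this project: the function names `word_like_ratio` and `looks_scrambled` and the 0.6 threshold are assumptions, and the threshold would need tuning against real notices, which contain many citations and case numbers.

```python
import string

VOWELS = set("aeiouyAEIOUY")


def word_like_ratio(text):
    """Return the fraction of whitespace-separated tokens that look like words.

    A token counts as word-like when, after stripping surrounding
    punctuation, it has at least two characters, is purely alphabetic,
    and contains at least one vowel.
    """
    tokens = text.split()
    if not tokens:
        return 0.0
    good = 0
    for tok in tokens:
        core = tok.strip(string.punctuation)
        if len(core) >= 2 and core.isalpha() and VOWELS & set(core):
            good += 1
    return good / len(tokens)


def looks_scrambled(text, threshold=0.6):
    """Flag text whose word-like ratio falls below the (assumed) threshold."""
    return word_like_ratio(text) < threshold
```

Files flagged this way would still need a manual look, since a low ratio can also come from a notice that is mostly tables or numbers.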
==Status==
The next step is to parse the PDFs. A script is currently running to convert them to text.
Downloaded most of the PDFs. There were errors downloading some of the files.
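Since some downloads failed, it may help to record which URLs failed so they can be retried later. The following is a minimal Python sketch, not the shell script that was actually run. It assumes the CSV has a header row with a column named `url`; the function name `download_pdfs` and the optional `fetch` parameter are hypothetical and exist only for this illustration.

```python
import csv
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def download_pdfs(csv_path, out_dir, url_column="url", fetch=None):
    """Download every URL listed in url_column of the CSV.

    Returns (saved, failed): saved is a list of URLs written to out_dir,
    failed is a list of (url, error message) pairs.
    """
    if fetch is None:
        # Default fetcher: plain HTTP GET with a timeout.
        def fetch(url):
            with urllib.request.urlopen(url, timeout=30) as resp:
                return resp.read()

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved, failed = [], []
    with open(csv_path, newline="", encoding="utf-8") as fh:
        for row in csv.DictReader(fh):
            url = (row.get(url_column) or "").strip()
            if not url:
                continue
            # Use the last path component of the URL as the file name.
            name = Path(urlparse(url).path).name
            try:
                (out / name).write_bytes(fetch(url))
                saved.append(url)
            except (OSError, ValueError) as exc:
                # Network and file errors are recorded instead of aborting.
                failed.append((url, str(exc)))
    return saved, failed
```

The `failed` list can be written to a file and fed back into the same function once the cause of the errors is known.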
