==How to Use this Project==
Running the project is as simple as executing the scripts in the correct order. The files are named in the format "STEPX_name", where X is the order of execution. Specifically, run the following four commands:
python3 STEP1_crawl.py  # Crawl Google to get the demo day pages for the accelerators listed in ListOfAccsToCrawl.txt
python3 STEP2_preprocessing_feature_matrix_generator.py  # Preprocess the data using a bag-of-words approach: each page is characterized by the frequencies of the chosen keywords stored in words.txt. Creates a file called feature_matrix.txt
python3 STEP3_train_rf.py  # Train the random forest (RF) model
python3 STEP4_classify_rf.py  # Run the trained model to classify the crawled HTML pages
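
To make the preprocessing step more concrete, here is a minimal sketch of the bag-of-words idea described above. It is an illustration, not the actual STEP2 script: the crawled_pages/*.html location, the tab-separated output layout, and the tokenization rule are assumptions; only words.txt and feature_matrix.txt come from the pipeline itself.

<pre>
# Sketch of STEP2-style preprocessing (assumed file locations and output format).
import glob
import re

def load_keywords(path="words.txt"):
    # One keyword per line, lowercased.
    with open(path) as f:
        return [line.strip().lower() for line in f if line.strip()]

def page_features(text, keywords):
    # Count how often each chosen keyword appears in the page text.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = {k: 0 for k in keywords}
    for t in tokens:
        if t in counts:
            counts[t] += 1
    return [counts[k] for k in keywords]

def build_feature_matrix(pages_glob="crawled_pages/*.html",  # hypothetical location of crawled pages
                         out_path="feature_matrix.txt"):
    keywords = load_keywords()
    with open(out_path, "w") as out:
        for page in sorted(glob.glob(pages_glob)):
            with open(page, errors="ignore") as f:
                row = page_features(f.read(), keywords)
            # One page per row: page name followed by keyword frequencies.
            out.write(page + "\t" + "\t".join(map(str, row)) + "\n")

if __name__ == "__main__":
    build_feature_matrix()
</pre>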
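Likewise, STEP3 and STEP4 boil down to fitting a random forest on the feature matrix and predicting on the crawled pages. The sketch below assumes a scikit-learn RandomForestClassifier and a hypothetical 0/1 labeling (1 = demo day page); the real scripts may use a different library, label encoding, or way of saving the model.

<pre>
# Sketch of STEP3/STEP4-style training and classification (assumed scikit-learn, assumed labels).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def load_matrix(path="feature_matrix.txt"):
    # Read the tab-separated matrix written by the preprocessing step.
    names, rows = [], []
    with open(path) as f:
        for line in f:
            parts = line.rstrip("\n").split("\t")
            names.append(parts[0])
            rows.append([float(x) for x in parts[1:]])
    return names, np.array(rows)

def train_rf(X, y):
    # STEP3-style training; y is a hypothetical 0/1 label per page.
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X, y)
    return clf

def classify(clf, names, X):
    # STEP4-style prediction on the crawled pages.
    return dict(zip(names, clf.predict(X)))
</pre>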
==The Crawler Functionality==
To be updated