E:\McNair\Projects\Accelerators\Spring 2018\demo_day_classifier\DemoDayHTMLFull\demo_day_cohort_lists.xlsx
The task is to classify whether or not each HTML file contains a cohort list. The classification has many nuances, because we want a balanced and correct dataset. If you are ever unsure how to classify a page, leave it blank. Likewise, if an HTML page looks like it is missing important text that should be there, skip it.
Steps: