<onlyinclude>
[[Kyran Adams]] [[Work Logs]] [[Kyran Adams (Work Log)|(log page)]]
2018-04-04: Continued to classify examples, and tried using images as features. It didn't give great results, so I abandoned that. Currently the features are wordcounts in the headers and title. I might consider the number of "simple" links in the page, like "www.abc.com". Complicated links would be used a lot for many things, but simple links are often used to bring someone to a home page, as in a cohort company's page, so this might be a good indicator of a list of cohort companies.
2018-04-02: Mostly worked on increasing the size of the dataset. So far, with just 30ish more examples, there was a ton of increase in accuracy, so I'm just going to spend the rest of the time doing this. As seen in the graph, this is not entirely due to just the size of the data set, but probably the breadth of it. Also used scikit learns built in hyperparameter grid search, will run this overnight once the dataset is large enough. Another thing I'm thinking about is adding images as a feature, for example the number of images over 200 px wide, because some demo day pages use logos as the cohort company list.