Changes

Jump to navigation Jump to search
[[Kyran Adams]] [[Work Logs]] [[Kyran Adams (Work Log)|(log page)]]
2018-02-22: Subtracting the means helped very marginally. I'm going to try coming up with some new ideas for features to improve accuracy. Started working on a program web_demo_features.py that will parse URLs and HTML files and count word hits from a file words.txt. This will be a feature for the ML model. Also met with project team, somebody is going to look through the training output data and look for lists of cohort companies, as opposed to just demo day pages. I will train the model on this, and hopefully get more useful results.
2018-02-21: A lot of the inaccuracy was probably due to mismatched data (for some reason two entries, Mergelane & Dreamit, were missing from the hand-classified data, and some of the entries were alphabetized differently), cleaned up and matched full datasets. However, now, it seems like we might be overfitting the data. I think it might be necessary to input a few different features. Found the KeyTerms.py file (translated it to python3) to see if I could augment it with some other features. Yang also suggested normalizing the features by subtracting their means, so I will start implementing that.
226

edits

Navigation menu