Accelerator Demo Day

Revision as of 14:26, 13 July 2018


McNair Project
Accelerator Demo Day
Project Information
Project Title Accelerator Demo Day
Owner Minh Le
Start Date 06/18/2018
Deadline
Primary Billing
Notes
Has project status Active
Subsumes: Demo Day Page Parser, Demo Day Page Google Classifier
Copyright © 2016 edegan.com. All Rights Reserved.


Project

This project uses Selenium and machine learning to find good candidate web pages and to classify whether each page is a demo day page containing a list of cohort companies. The classifier currently uses scikit-learn's random forest model with a bag-of-words approach.
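As a rough illustration of the classification step, the sketch below trains a scikit-learn random forest on bag-of-words counts. The page texts, labels, and parameter values are made up for the example and are not taken from the project's data or code.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer

# Toy, made-up page texts (assumption); 1 = demo day page, 0 = other page.
pages = [
    "demo day cohort companies startups pitch",
    "meet our cohort companies at demo day",
    "demo day startups from the latest cohort",
    "contact us privacy policy terms",
    "blog post about office hours and parking",
    "careers page open positions apply",
]
labels = [1, 1, 1, 0, 0, 0]

# Bag of words: each page becomes a vector of word counts.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(pages)

# n_estimators and random_state are illustrative choices, not project settings.
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X, labels)

# Classify an unseen page using the same vocabulary.
new_page = ["our demo day cohort companies"]
prediction = clf.predict(vectorizer.transform(new_page))
```

In a real run the texts would come from the crawled pages rather than a hard-coded list.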

Code Location

The source code and relevant files for the project can be found here:

E:\McNair\Projects\Accelerator Demo Day\

Development Notes

The Crawler Functionality

To be updated

The Classifier

Input (Features)

The input (features) is currently the frequency with which each of X_NUMBER words appears in each document. The words are hand-selected.
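A minimal sketch of this kind of feature extraction is shown below. The word list and the helper name word_frequencies are hypothetical, since the project's actual hand-selected words are not listed here; the sketch only shows how a fixed word list turns a document into a count vector.

```python
import re
from collections import Counter

# Hypothetical hand-picked words (assumption); the project's real list is not given.
SELECTED_WORDS = ["demo", "day", "cohort", "companies", "startup"]

def word_frequencies(text, words=SELECTED_WORDS):
    """Return how often each selected word occurs in text, in list order."""
    # Lowercase the text and split it into alphanumeric tokens.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(tokens)
    # Counter returns 0 for words that never appear.
    return [counts[w] for w in words]

page_text = "Demo Day 2018: meet the cohort. Our cohort companies pitch on demo day."
features = word_frequencies(page_text)
print(features)  # [2, 2, 2, 1, 0]
```

Each document yields one such vector, and the vectors stacked together form the input matrix for the classifier.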

Reading resources

http://www.fit.vutbr.cz/research/groups/speech/publi/2010/mikolov_interspeech2010_IS100722.pdf