Difference between revisions of "Ecosystem Organization Classifier"
Revision as of 13:56, 30 March 2019
Ecosystem Organization Classifier | |
---|---|
Project Information | |
Has title | Ecosystem Organization Classifier |
Has start date | |
Has deadline date | |
Has project status | Active |
Is dependent on | Crunchbase Database, VentureXpert Database |
Does subsume | Defining Incubators, Incubator Seed Data, Incubators in Five Ecosystems |
Copyright © 2019 edegan.com. All Rights Reserved. |
Introduction
The purpose of this project is to build a classifier that takes the description of an ecosystem organization (e.g., a startup, a venture capitalist, or an incubator) and either classifies the organization's type or distinguishes incubators from non-incubators.
Text Processing
There are two obvious methods for processing the textual descriptions before classification. The first is a "Bag of Words" approach, which uses Term Frequency – Inverse Document Frequency (TF-IDF) to do basic natural language processing and select words or phrases that have discriminant capabilities. The second is a Word2Vec approach, which uses a shallow two-layer neural network to reduce descriptions to vectors with high discriminant potential. (See "Memo for Evan" in E:\mcnair\Projects\Incubators for further detail.) We will try both approaches.
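As an illustration of the "Bag of Words" approach, the following is a minimal sketch of TF-IDF weighting in plain Python. It is not the project's implementation: the function names `tokenize` and `tfidf_vectors` and the sample descriptions are hypothetical, and the weighting uses the plain form tf × log(N / df), whereas libraries often apply smoothed variants.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep alphabetic runs only. A real pipeline would
    # likely also remove stop words and consider multi-word phrases.
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(descriptions):
    """Return one {term: tf-idf weight} dict per description.

    tf  = term count / number of tokens in the description
    idf = log(number of descriptions / number of descriptions containing the term)
    """
    token_lists = [tokenize(d) for d in descriptions]
    n_docs = len(token_lists)
    # Document frequency: how many descriptions contain each term.
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty descriptions
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical descriptions, invented for illustration only.
descriptions = [
    "An incubator offering mentorship and office space to early stage startups",
    "A venture capital firm investing in early stage startups",
    "A startup building software for logistics",
]
vectors = tfidf_vectors(descriptions)
```

In this sketch, a term that appears in only one description (such as "incubator") receives a higher weight than a term shared across descriptions (such as "early"), and a term that appears in every description receives a weight of zero. Terms with consistently high weights within one class of organization are candidates for discriminant features.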
Related Projects
Subsumed Projects: Defining Incubators, Incubator Seed Data, Incubators in Five Ecosystems
This project is dependent on: Crunchbase Database, VentureXpert Database