Difference between revisions of "Deep Text Classifier"

From edegan.com

Revision as of 15:34, 10 October 2017

Deep Text Classifier

Problem Description

We want to build a classifier for text input. For example, we may want to classify a company's industry area based on its description, or classify a company's IPO status based on its description.

General Approach

We will build a deep neural network to solve this problem in a uniform way. The traditional approach is to hire a task-specific expert to manually design useful features, for example checking whether the text contains the words "Internet" and "High-tech" at the same time, and then to classify based on the observed features. A deep neural network instead extracts features automatically and, most importantly, can achieve high test accuracy. However, the features it learns are not human-interpretable.
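To make the traditional approach concrete, here is a minimal sketch of the hand-designed keyword feature mentioned above. The function names and the labels "technology" and "other" are hypothetical, chosen only for illustration; they are not part of the original code.

```python
def has_internet_hightech_feature(description: str) -> bool:
    """Hand-crafted feature: True if both keywords appear in the text."""
    text = description.lower()
    return "internet" in text and "high-tech" in text


def classify_by_rule(description: str) -> str:
    """Classify using only the single hand-crafted feature."""
    # Hypothetical labels, used here only to illustrate the idea.
    if has_internet_hightech_feature(description):
        return "technology"
    return "other"
```

Every such rule has to be written and maintained by hand for each task, which is the cost the deep neural network is meant to avoid.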

About the Deep Models

There are two broad categories of deep neural networks: convolutional neural networks (CNNs) and recurrent neural networks (RNNs). CNNs are generally better suited to image-based classification tasks. RNNs are generally used for classification tasks on sequential data (e.g. language, video).
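The following toy sketch shows why a recurrent network suits sequential input: it carries a hidden state from one step to the next, so the same values in a different order give a different result. This is a single-unit recurrence with arbitrary fixed weights (w_in, w_rec are assumptions, not trained values), far simpler than the LSTM used later in classification_LSTM.py.

```python
import math


def run_simple_rnn(inputs, w_in=1.0, w_rec=0.5, h0=0.0):
    """Run a one-unit recurrent network and return its final hidden state."""
    h = h0
    for x in inputs:
        # The new state depends on the current input and the previous state.
        h = math.tanh(w_in * x + w_rec * h)
    return h


# Same values, different order: the final states differ.
early_signal = run_simple_rnn([1.0, 0.0, 0.0])
late_signal = run_simple_rnn([0.0, 0.0, 1.0])
```

For text, each input would be a word representation rather than a single number, but the step-by-step update is the same idea.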

Major Package Dependencies

How to Run the Code

My code has been intentionally split into two parts: data preprocessing and model training/prediction.

The first part handles data preprocessing, which I will discuss later. In short, it transforms your single "XXX.txt" input file into a pickle file that the second part of the code uses for training and prediction. To run this part:

python preprocessing.py
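As a rough illustration of what a text-to-pickle step of this kind can look like, here is a minimal sketch. It is not the actual preprocessing.py: the input layout (one example per line, label and text separated by a tab), the function names, the dictionary keys, and the max_len parameter are all assumptions made for the example.

```python
import pickle
from collections import Counter


def build_vocab(texts, min_count=1):
    """Map each frequent-enough token to an integer id (0 = padding, 1 = unknown)."""
    counts = Counter(tok for t in texts for tok in t.lower().split())
    vocab = {"<pad>": 0, "<unk>": 1}
    for tok, n in counts.most_common():
        if n >= min_count:
            vocab[tok] = len(vocab)
    return vocab


def encode(text, vocab, max_len):
    """Convert text to a fixed-length list of token ids, truncating or padding."""
    ids = [vocab.get(tok, vocab["<unk>"]) for tok in text.lower().split()][:max_len]
    return ids + [vocab["<pad>"]] * (max_len - len(ids))


def preprocess(txt_path, pickle_path, max_len=50):
    """Read 'label<TAB>text' lines (assumed format) and save encoded data as a pickle."""
    labels, texts = [], []
    with open(txt_path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            label, text = line.split("\t", 1)
            labels.append(label)
            texts.append(text)
    vocab = build_vocab(texts)
    data = {
        "X": [encode(t, vocab, max_len) for t in texts],
        "y": labels,
        "vocab": vocab,
    }
    with open(pickle_path, "wb") as f:
        pickle.dump(data, f)
    return data
```

The training script can then read the result back with pickle.load, without repeating the tokenization and encoding work.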

The second part contains the deep neural network. It loads the pickle file generated in the previous step and trains the network. At the end, the trained network predicts on your test examples and prints the accuracy. To run this part:

python classification_LSTM.py

Note that the data preprocessing step usually needs to be run only once. The saved pickle file is a serialized, machine-friendly format that can be loaded very quickly.

Data Preprocessing

How to Modify the Code to Solve Your Own Problems

General Guidelines for Tuning the Hyper-parameters