The code contains two parts: Data Preprocessing and Model Training/Prediction.
"""Data Preprocessing (preprocessing.py)""" : this step converts a text-based "XXX.txt" input file into a pickle file of numerical values that the later part of the code can load and use for training and prediction.
* Step 1: modify the target file name in "main()"
* Step 2: run the script
python preprocessing.py
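To make the text-to-pickle conversion concrete, here is a minimal sketch of what such a preprocessing step might look like. It is not the actual content of preprocessing.py. The input layout (one example per line, "label<TAB>description"), the function names `build_vocab`, `encode` and `preprocess`, and the pickle layout (a dict with "X", "y" and "vocab") are all assumptions made for illustration.

```python
import pickle
from collections import Counter


def build_vocab(texts, min_count=1):
    """Map each token to an integer id; 0 is padding, 1 is unknown."""
    counts = Counter(tok for t in texts for tok in t.lower().split())
    vocab = {"<pad>": 0, "<unk>": 1}
    for tok, n in counts.most_common():
        if n >= min_count:
            vocab[tok] = len(vocab)
    return vocab


def encode(text, vocab, max_len):
    """Turn a text into a fixed-length list of token ids (truncate or pad with 0)."""
    ids = [vocab.get(tok, 1) for tok in text.lower().split()][:max_len]
    return ids + [0] * (max_len - len(ids))


def preprocess(txt_path, pkl_path, max_len=50):
    """Read the text file, encode every description, and write a pickle file."""
    # Assumed layout: one example per line, "label<TAB>description".
    labels, texts = [], []
    with open(txt_path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue
            label, text = line.split("\t", 1)
            labels.append(label)
            texts.append(text)
    vocab = build_vocab(texts)
    data = {
        "X": [encode(t, vocab, max_len) for t in texts],  # numerical inputs
        "y": labels,                                      # one label per example
        "vocab": vocab,                                   # kept for encoding new text
    }
    with open(pkl_path, "wb") as f:
        pickle.dump(data, f)
    return data
```

Padding every example to the same length is one common choice because convolutional and recurrent models usually expect fixed-size batches; the real script may handle this differently.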
"""Model Training/Prediction (classification_MMM_LLL.py)""" : this is where the deep neural network lives. "MMM" is the name of the model; currently there are "1DConvolution", "2DConvolution" and "LSTM". "LLL" is the name of the label. Note that the same text input and the same model can be used to predict different labels. For example, "classification_LSTM_indu.py" is an LSTM model that predicts the industry from the descriptions, and "classification_LSTM_ipo.py" is an LSTM model that predicts the IPO status from the same descriptions. Whatever the model, this Python file always loads the pickle file generated in the previous step and trains the neural network on it. At the end, the trained network predicts on the test examples (examples it did not see during training) and prints the accuracy. To run this part:
python classification_LSTM_indu.py
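The load, split and evaluate flow described above can be sketched without any deep learning library. This is only an illustration under assumptions: the pickle is assumed to hold a dict with "X" (encoded inputs) and "y" (labels), the helper names `load_split`, `accuracy` and `majority_baseline` are hypothetical, and the majority-class predictor is a stand-in for the neural network, not the model used in the classification scripts.

```python
import pickle
import random
from collections import Counter


def load_split(pkl_path, test_fraction=0.2, seed=0):
    """Load the pickle file and split it into (train, test) pairs of (X, y)."""
    # Assumed pickle layout: {"X": [...], "y": [...]}.
    with open(pkl_path, "rb") as f:
        data = pickle.load(f)
    idx = list(range(len(data["X"])))
    random.Random(seed).shuffle(idx)  # fixed seed keeps the split reproducible
    n_test = max(1, int(len(idx) * test_fraction))
    test_idx, train_idx = idx[:n_test], idx[n_test:]

    def pick(ids):
        return [data["X"][i] for i in ids], [data["y"][i] for i in ids]

    return pick(train_idx), pick(test_idx)


def accuracy(y_true, y_pred):
    """Fraction of test examples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline(y_train):
    """Stand-in for the network: always predict the most frequent training label."""
    label = Counter(y_train).most_common(1)[0][0]
    return lambda X: [label] * len(X)
```

A typical use would be `(X_train, y_train), (X_test, y_test) = load_split("data.pkl")`, then `predict = majority_baseline(y_train)` and `print(accuracy(y_test, predict(X_test)))`, where "data.pkl" is a placeholder file name. A trained network should beat this baseline; if it does not, the reported accuracy mostly reflects the class imbalance.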