For data preprocessing, we adopt the same standard as the [http://ai.stanford.edu/~amaas/data/sentiment/ IMDB] dataset.
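In that dataset, each split ("train", "test") holds one folder per label, and every example is stored in its own ".txt" file. The following is only a minimal Python sketch of producing such a layout from a single input file, not the actual script. The function name <code>split_into_folders</code>, the tab-separated "label, then text" row format, the shuffling, and the numeric file names are all assumptions made for illustration.

```python
import random
from pathlib import Path


def split_into_folders(input_path, out_dir, train_fraction=0.8, seed=0):
    """Write each row of input_path to <out_dir>/<split>/<label>/<index>.txt.

    Assumption (hypothetical format): every row looks like "label<TAB>text".
    Returns the number of (train, test) examples written.
    """
    lines = Path(input_path).read_text(encoding="utf-8").splitlines()
    # Keep only rows that contain a label and a text field.
    rows = [line.split("\t", 1) for line in lines if "\t" in line]

    # Shuffle reproducibly so both splits draw from the whole file.
    random.Random(seed).shuffle(rows)
    n_train = int(len(rows) * train_fraction)

    for index, (label, text) in enumerate(rows):
        split = "train" if index < n_train else "test"
        # The label becomes the folder name, as in the IMDB layout.
        target_dir = Path(out_dir) / split / label.strip()
        target_dir.mkdir(parents=True, exist_ok=True)
        (target_dir / f"{index}.txt").write_text(text, encoding="utf-8")

    return n_train, len(rows) - n_train
```

For example, <code>split_into_folders("reviews.txt", "data")</code> would create folders such as "data/train/pos" and "data/test/neg" if the labels in the (hypothetical) file "reviews.txt" were "pos" and "neg".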
'''To general users:''' your input file (usually a single ".txt" file containing many examples, one per row) will be split into a training set (80% by default) and a testing set (20% by default). The labels you want to predict become the folder names, and the content of each example (usually a block of text) is written to its own ".txt" file. To run the script, you need to specify the following:
1. "File Name" : without the ".txt" extension,