====Data Preprocessing (IN PROGRESS)====
This step automates combining the results generated by the Site Map Tool with their corresponding cohort indicators. The combined data is then split into two text files: train.txt and test.txt.
The Python script is saved at:
E:\projects\listing page identifier\generate_dataset.py
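The combine-and-split step might look roughly like the sketch below. It is not the contents of generate_dataset.py. It assumes the Site Map Tool output can be read as a list of URLs, the cohort indicators as a URL-to-label mapping, an 80/20 split, and tab-separated output lines. The function names combine and split_and_write, the parameters, and the file layout are all hypothetical.

```python
import random
from pathlib import Path


def combine(site_map_urls, cohort_by_url):
    """Pair each Site Map Tool URL with its cohort indicator (assumed schema)."""
    combined = []
    for url in site_map_urls:
        # Assumption: URLs without a matching cohort indicator are skipped.
        if url in cohort_by_url:
            combined.append((url, cohort_by_url[url]))
    return combined


def split_and_write(records, out_dir, test_ratio=0.2, seed=0):
    """Shuffle records, split them, and write train.txt and test.txt."""
    shuffled = list(records)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    test, train = shuffled[:n_test], shuffled[n_test:]

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, rows in (("train.txt", train), ("test.txt", test)):
        with open(out_dir / name, "w", encoding="utf-8") as f:
            for url, label in rows:
                # Assumed line format: URL, a tab, then the cohort label.
                f.write(f"{url}\t{label}\n")
    return train, test
```

Shuffling before the split avoids ordering effects from the crawl, and the seed parameter makes it possible to regenerate the same train.txt and test.txt later.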