**The sent_tokenize() function can split a document into sentences, which is easier than using regular expressions
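**Use pos_tag() to tag the words of each sentence with their part of speech (it expects a list of word tokens, e.g. from word_tokenize()). The tags can be used to extract proper nouns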
**Several NLTK data packages need to be downloaded before these functions will work. To do this:
***Open Python in the shell
****Run import nltk, then nltk.download()
****In the downloader that opens, download all packages
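
The notes above can be sketched roughly as follows. This is a minimal example, not a definitive recipe: it assumes NLTK is installed and the data packages have already been downloaded as described, and the helper names (proper_nouns, tag_document, proper_nouns_in_document) are made up for illustration. It also assumes the default English tagger, which uses Penn Treebank tags where NNP and NNPS mark proper nouns.

```python
def proper_nouns(tagged_tokens):
    """Keep the words whose tag is NNP or NNPS (Penn Treebank proper nouns).

    tagged_tokens is a list of (word, tag) pairs, the shape pos_tag() returns.
    """
    return [word for word, tag in tagged_tokens if tag in ("NNP", "NNPS")]


def tag_document(text):
    """Split text into sentences, then POS-tag the word tokens of each one."""
    # Imported inside the function so proper_nouns() works without NLTK.
    # Requires the NLTK data packages downloaded via nltk.download().
    from nltk import pos_tag, sent_tokenize, word_tokenize

    return [pos_tag(word_tokenize(sentence)) for sentence in sent_tokenize(text)]


def proper_nouns_in_document(text):
    """Collect proper nouns from every sentence of the document, in order."""
    names = []
    for tagged_sentence in tag_document(text):
        names.extend(proper_nouns(tagged_sentence))
    return names
```

Calling proper_nouns_in_document() on a string returns the words the tagger labelled as proper nouns. The exact result depends on the tagger model, so expect some misses and false positives, for example capitalised words at the start of a sentence.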