Things to go over with Ed:
#my work on the doctor problem so far: run against the 500 most common names and see how many results that leaves unmatched. Then, if the number of unassigned doctors is > 500, bring in some data sets that we can easily access (IMDb, Olympic athletes).
#new variables we want to build
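
A rough sketch of the unmatched-count check from the first item, to make the discussion concrete. It assumes "matching" means a case-insensitive lookup of a doctor's name in the common-names list, which may not be how the real matching works. The function names and the sample data are hypothetical; only the 500 threshold comes from the notes.

```python
# Hypothetical sketch: count doctors whose name is not in the common-names list,
# then decide whether extra data sets (e.g. IMDb, Olympic athletes) are needed.
# Assumption: a "match" is a case-insensitive exact lookup.

UNASSIGNED_THRESHOLD = 500  # from the notes: act if unassigned doctors > 500


def count_unmatched(doctor_names, common_names):
    """Return how many doctor names do not appear in common_names."""
    common = {name.strip().lower() for name in common_names}
    return sum(1 for name in doctor_names if name.strip().lower() not in common)


def needs_extra_sources(doctor_names, common_names, threshold=UNASSIGNED_THRESHOLD):
    """True if the unmatched count is strictly greater than the threshold."""
    return count_unmatched(doctor_names, common_names) > threshold


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample_doctors = ["Ann", "bob", "Zed"]
    sample_common = ["ann", "Bob"]
    print(count_unmatched(sample_doctors, sample_common))
    print(needs_extra_sources(sample_doctors, sample_common))
```

If the real matching is fuzzier than an exact lookup, only the membership test inside `count_unmatched` would need to change.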
==For analysis==