a. Create a master table with the original features plus the summarized postcode, city, state, and country. Use my own judgment.
b. Finish by the end of the week (asap); inform Catherina when finished.
c. Postcode is the priority
=====Postcode=====
E:/McNair/Projects/PatentAddress/PostcodeClean.sql
Postcodes extracted from addrline1, addrline2 and city are stored in ptoassigneend_postcode.
For records whose addrline1, addrline2 and city contain no postcode info, clean the original postcode feature and store the result as postcode_cleaned.
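The extract-then-fall-back rule above can be sketched roughly as follows. This is only an illustration in Python, not the logic in PostcodeClean.sql: the pattern covers U.S. ZIP codes only (5 digits with an optional +4), and the names `ZIP_RE`, `extract_zip` and `clean_postcode` are hypothetical.

```python
import re

# Assumption: only U.S. ZIP codes (12345 or 12345-6789) are recognised here.
ZIP_RE = re.compile(r"\b(\d{5})(?:-(\d{4}))?\b")


def extract_zip(text):
    """Return the last ZIP-like token in text, or None if there is none."""
    if not text:
        return None
    matches = ZIP_RE.findall(text)
    if not matches:
        return None
    # Take the last match: a 5-digit street number may appear earlier in the line.
    zip5, plus4 = matches[-1]
    return f"{zip5}-{plus4}" if plus4 else zip5


def clean_postcode(addrline1, addrline2, city, postcode):
    """Prefer a postcode found in the address fields; otherwise clean the postcode feature."""
    for field in (addrline1, addrline2, city):
        found = extract_zip(field)
        if found:
            return found
    return extract_zip(postcode)
```

Taking the last match in a field is a design guess; real address lines may need country-specific patterns.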
=====State=====
E:/McNair/Projects/PatentAddress/StateClean.sql
States extracted from addrline1, addrline2 and city are stored in ptoassigneend_state.
All the cleaned states for U.S. patents are stored in ptoassigneend_us_statecleaned. (# 3572605)
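One way to picture the state extraction is a scan of the three address fields for a two-letter token that is a valid U.S. state code. The sketch below assumes that approach; it is not taken from StateClean.sql, and `US_STATES`, `TOKEN_RE` and `extract_state` are hypothetical names.

```python
import re

# 50 states plus DC, as two-letter postal abbreviations.
US_STATES = set("""
AL AK AZ AR CA CO CT DE FL GA HI ID IL IN IA KS KY LA ME MD MA MI MN MS MO
MT NE NV NH NJ NM NY NC ND OH OK OR PA RI SC SD TN TX UT VT VA WA WV WI WY DC
""".split())

TOKEN_RE = re.compile(r"\b[A-Z]{2}\b")


def extract_state(*fields):
    """Return the first valid state code found, scanning each field right to left."""
    for field in fields:
        if not field:
            continue
        # Assumption: the state usually sits near the end of an address line.
        for token in reversed(TOKEN_RE.findall(field.upper())):
            if token in US_STATES:
                return token
    return None
```

A bare token match will also pick up ordinary words such as "IN" or "OR", so a real cleaning pass would need extra context checks.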
=====City=====
Because city_city is extracted from the city feature and is already cleaned, city_city takes precedence over city.
For records whose addrline1, addrline2 and city contain no city info, keep the original city feature and clean it in the following way: [To do]: city_cleaned.
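The precedence rule above amounts to a coalesce: use city_city when it is present, otherwise fall back to city. A minimal sketch, assuming empty strings count as missing. The function name `pick_city` is hypothetical, and the whitespace collapse in the fallback is only a placeholder, since the actual cleaning step is still marked as to do.

```python
def pick_city(city_city, city):
    """Return city_city when present, otherwise a lightly tidied city."""
    if city_city and city_city.strip():
        # city_city is already cleaned, so it is only trimmed here.
        return city_city.strip()
    if city and city.strip():
        # Placeholder only: collapse repeated whitespace. The real cleaning
        # rule for this fallback has not been decided yet.
        return " ".join(city.split())
    return None
```

For example, `pick_city("HOUSTON", "HOUSTON TX 77005")` keeps the extracted value, while `pick_city(None, "  New   York ")` falls back to the tidied city feature.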
* Output
E:/McNair/Projects/PatentAddress/CityClean.sql
Cities extracted from addrline1, addrline2 and city are stored in ptoassigneend_city.
All the cleaned cities for U.S. patents are stored in ptoassigneend_us_citycleaned. (# 3572605)