Since the city list is not long, I briefly cleaned the list by hand, and stored it in ptoassigneend_us_citylist2.
Then, similar to Section 4.2, identify clean data that meets all the requirements: postcode_extracted with 5-4 or 5 digits, state not null or not spaces, and city_extracted in ptoasigneend_us_citylist2.
SQL Code:
CREATE TABLE ptoassigneend_us_identify3 AS
SELECT *
FROM ptoassigneend_us_postex2
WHERE city_extracted IN (
SELECT citylist
FROM ptoassigneend_us_citylist2) AND
state IS NOT NULL AND state != '' AND
postcode_extracted ~* '\d{5}';
SELECT 664524