SQL code is in:
E:/McNair/Projects/PatentAddress/RxPostcode.sql
Simplified code is in:
E:/McNair/Projects/PatentAddress/Functions.sql
=====3. State (U.S.)=====
SQL code is in:
E:/McNair/Projects/PatentAddress/RxPostcode.sql
Simplified code is in:
E:/McNair/Projects/PatentAddress/Functions.sql
=====4. City (U.S.)=====
E:/McNair/Projects/PatentAddress/RxCity.sql
=====5. Output (Tables)===== *'''ptoassigneend_allus''' This table contains all the U.S. patents extracted from ptoassigneend table. The rule to generate this table is 'country = 'UNITED STATES' ' or any of the features contains U.S. postcode. Noise exists because of postcode errors. The ptoassigneend_allus table may miss some U.S. patents which lose postcode records. SQL Simplified code: CREATE TABLE ptoassigneend_allus AS SELECT * FROM ptoassigneend WHERE city ~* '.*\d{5}[-]\d{4}.*' OR addrline1 ~* '.*\d{5}[-]\d{4}.*' OR addrline2 ~* '.*\d{5}[-]\d{4}.*' OR postcode ~* '.*\d{5}[-]\d{4}.*' OR country ~* 'UNITED STATES'; SELECT 3572682 Table "public.ptoassigneend_allus" Column | Type | Modifiers reelno | integer | frameno | integer | name | character varying(500) | addrline1 | character varying(500) | addrline2 | character varying(500) | city | character varying(500) | state | character varying(500) | country | character varying(500) | postcode | character varying(80) | *'''ptoassigneend_missus_final''' State and postcode information extracted from addrline1, addrline2 and city columns are stored in this table. See section 2 and 3. This table is a subset of ptoassigneend_allus table. Table "public.ptoassigneend_missus_final" Column | Type | Modifiers reelno | integer | frameno | integer | name | character varying(500) | addrline1 | character varying(500) | addrline2 | character varying(500) | city | character varying(500) | state | character varying(500) | country | character varying(500) | postcode | character varying(80) | postcode_city | text | postcode_addr1 | text | postcode_addr2 | text | state_city | text | state_addr1 | text | state_addr2 | text | postcode_city is the postcode extracted from 'city'; postcode_addr1 is the postcode extracted from 'addrline1'; postcode_addr2 is the postcode extracted from 'addrline2'. state_city is the state name extracted from 'city'; state_addr1 is the state name extracted from 'addrline1'; state_addr2 is the state name extracted from 'addrline2'. *'''ptoassigneend_missus_city_final''' City information are extracted from addrline1, addrline2 and city columns and are stored in ptoassigneend_city table. See section 4. This table is a subset of ptoassigneend_allus table. : Table "publicE:/McNair/Projects/PatentAddress/Functions.ptoassigneend_missus_city_final" Column | Type | Modifiers reelno | integer | frameno | integer | name | character varying(500) | addrline1 | character varying(500) | addrline2 | character varying(500) | city | character varying(500) | state | character varying(500) | country | character varying(500) | postcode | character varying(80) | city_addr1 | text | city_addr2 | text | city_city | text | city_city is the city name extracted from 'city'; city_addr1 is the city name extracted from 'addrline1'; city_addr2 is the city name extracted from 'addrline2'. *'''ptoassigneend_us_extracted''' Contain all the original features as well as city, state and postcode info extracted from features addrline1, addrline2 and city. Table "public.ptoassigneend_us_extracted" Column | Type | Modifiers reelno | integer | frameno | integer | name | character varying(500) | addrline1 | character varying(500) | addrline2 | character varying(500) | city | character varying(500) | state | character varying(500) | country | character varying(500) | postcode | character varying(80) | postqcode_city | text | postcode_addr1 | text | postcode_addr2 | text | state_city | text | state_addr1 | text | state_addr2 | text | city_addr1 | text | city_addr2 | text | city_city | text |sql
====Master Table====