*Spain ([http://en.wikipedia.org/wiki/List_of_postal_codes_in_Spain Sourced from Wikipedia]): Post-1976 Spanish postcodes are five digits of the format NNMMM, where NN indicates the province (01-52) or a reserved code (e.g. 80 for P.O. boxes). In the patent data Spanish postcodes are comparatively well behaved, with the following standard variants appearing: NNNNN, NNN NN, NNNN, NN NNNNN, NN- NNNNN, -NNNNN, NNNNN-, "NN, NNNNN", NNN, NN, N NNNNN-, NN-NN, NN-NN NNNNN, NNNNN-IBI, E-NNNNN, E-NNNN, E - NNNNN, E--NNNNN, ES-NNNNN.
**Simple Regex: <tt>(E|ES|)\d{0,2},?\s?-{0,2}\s?\d{2,5}-?(IBI|)</tt>
*Switzerland ([http://en.wikipedia.org/wiki/Postal_codes_in_Switzerland_and_Liechtenstein Sourced from Wikipedia]): Swiss (and Liechtenstein) postcodes are hierarchical four-digit numbers of the form District+Area+Route+PONumber, where districts are numbered West to East (would you expect less from the Swiss?). In the patent data Swiss postcodes are immaculately behaved by comparison, with the following formats appearing: NNNN, NNNN-, CH-NNNN, CH - NNNN, CH NNNN, CHNNN, CH- NNNN, though the "H" may sometimes be lowercase.
**Simple Regex: <tt>(CH|Ch|)\s?-?\s?\d{3,4}-?</tt>
*Australia ([http://en.wikipedia.org/wiki/Postcodes_in_australia Sourced from Wikipedia]): Australian postcodes have the format NNNN, where N is a digit. They should appear at the end of addresses, and are frequently preceded by the acronym for the territory/state (specifically: NSW, ACT, VIC, QLD, SA, WA, TAS, and NT). In the patent data variations include: NNNN, AU-NNNN, XXX NNNN, Xxx. NNNN, X.X.X. NNNN, XXXNNNN, where XXX indicates the two or three characters of the acronym.
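The simple regexes listed above can be checked against the format templates with a few lines of code. The following is a minimal Python sketch, not part of the original tooling. It assumes that each pattern is meant to match a whole postcode field (the list does not say whether the patterns are anchored), and the helper names <tt>to_sample</tt> and <tt>covers</tt> are made up for this illustration.

```python
import re

# Simple regexes copied verbatim from the list above, keyed by ISO 3166 code.
SIMPLE_REGEX = {
    "ES": r"(E|ES|)\d{0,2},?\s?-{0,2}\s?\d{2,5}-?(IBI|)",
    "CH": r"(CH|Ch|)\s?-?\s?\d{3,4}-?",
}


def to_sample(template: str) -> str:
    """Turn a format template such as 'E-NNNNN' into a sample string
    by replacing every placeholder N with the digit 1."""
    return template.replace("N", "1")


def covers(country: str, template: str) -> bool:
    """Return True if the country's simple regex matches the whole sample.

    Assumption: full-string matching (re.fullmatch) is what is intended.
    """
    return re.fullmatch(SIMPLE_REGEX[country], to_sample(template)) is not None


# A few of the templates listed above (not the complete lists).
TEMPLATES = {
    "ES": ["NNNNN", "NNN NN", "NN- NNNNN", "NNNNN-IBI", "E - NNNNN", "ES-NNNNN"],
    "CH": ["NNNN", "NNNN-", "CH-NNNN", "CH - NNNN", "CHNNN"],
}

for country, templates in TEMPLATES.items():
    for template in templates:
        status = "covered" if covers(country, template) else "NOT covered"
        print(f"{country} {template!r}: {status}")
```

Under the full-match assumption most templates are covered, but a variant such as NNN NN is only matched in part, which is one reason to treat these patterns as a starting point rather than a complete solution.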
The Match::PostalCodes.pm Perl module provides a method to extract a postcode from a text string for a given ISO 3166 country code. The simple regular expressions listed above are not used verbatim, as more sophisticated techniques can be employed on a per-country basis.
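To illustrate what per-country extraction might look like, here is a hedged Python sketch. It is not the Perl module's code or interface: the function <tt>extract_postcode</tt>, the <tt>EXTRACTORS</tt> table and the individual rules are assumptions chosen for this example, based only on the format descriptions above (Spanish province range 01-52, Australian postcodes at the end of the address).

```python
import re
from typing import Callable, Dict, Optional

# Runs of exactly four or five digits, not embedded in a longer number.
_DIGITS4 = re.compile(r"(?<!\d)\d{4}(?!\d)")
_DIGITS5 = re.compile(r"(?<!\d)\d{5}(?!\d)")


def _extract_es(text: str) -> Optional[str]:
    # Spain: first five-digit run whose leading two digits are a province (01-52).
    # Simplification: reserved codes (e.g. 80 for P.O. boxes) are skipped here.
    for match in _DIGITS5.finditer(text):
        if 1 <= int(match.group()[:2]) <= 52:
            return match.group()
    return None


def _extract_ch(text: str) -> Optional[str]:
    # Switzerland: first four-digit run; any "CH-" style prefix is ignored.
    match = _DIGITS4.search(text)
    return match.group() if match else None


def _extract_au(text: str) -> Optional[str]:
    # Australia: postcodes should come at the end of the address,
    # so take the last four-digit run rather than the first.
    runs = _DIGITS4.findall(text)
    return runs[-1] if runs else None


# Hypothetical dispatch table keyed by ISO 3166 alpha-2 code.
EXTRACTORS: Dict[str, Callable[[str], Optional[str]]] = {
    "ES": _extract_es,
    "CH": _extract_ch,
    "AU": _extract_au,
}


def extract_postcode(text: str, iso_code: str) -> Optional[str]:
    """Return the postcode found in text for the given country, or None."""
    extractor = EXTRACTORS.get(iso_code.upper())
    return extractor(text) if extractor else None


print(extract_postcode("Calle Mayor 1, E-28013 Madrid", "ES"))
print(extract_postcode("1200 Hay St, Perth WA 6005", "AU"))
```

Dispatching on the country code keeps each rule small and lets a country use information that a single generic pattern cannot, such as position in the address or a valid prefix range.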
==The Matching Process==