|Phone Book (Hardcopy) || Last Name || First Name || Middle Initial || ||
|}
The phone book format is most commonly encountered as "Surname, Firstname I." (surname, then a comma, then the first name and middle initial). We refer to a name written this way as a comma-format name.
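As an illustration of the comma format, the following minimal Python sketch splits such a name into its parts. The function name <code>parse_comma_name</code> and the returned keys are hypothetical and are not part of the normalization script. The sketch assumes a single comma and treats a trailing one-letter token as the middle initial.

```python
def parse_comma_name(raw):
    """Split 'Surname, Firstname I.' into parts (hypothetical helper)."""
    # Everything before the first comma is taken to be the surname.
    surname, _, rest = raw.partition(",")
    tokens = rest.split()
    first = tokens[0] if tokens else ""
    # Assumption: a trailing single letter (optionally followed by a
    # period) is the middle initial.
    middle = ""
    if len(tokens) > 1 and len(tokens[-1].rstrip(".")) == 1:
        middle = tokens[-1].rstrip(".")
    return {"last": surname.strip(), "first": first, "middle_initial": middle}
```

For example, <code>parse_comma_name("Smith, John Q.")</code> yields the last name "Smith", the first name "John" and the middle initial "Q".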
==The Normalization Script==
# Force the encoding to Latin
# Remove Stop Words (default uses: [http://www.edegan.com/repository/Names-Stopwords.txt Names-Stopwords.txt])
# Remove or concatenate (default) Tussenvoegsel (default uses: [http://www.edegan.com/repository/Names-Tussenvoegsel.txt Names-Tussenvoegsel.txt]) - Note that with comma formatted names this does not apply.
# Remove first barrel (default) or concatenate double-barrelled names
# Mark discards
# Extract "Surname"
# Extract "Firstname Surname" pair
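The steps above can be sketched in Python as follows. This is an illustrative approximation, not the actual script, and it rests on several assumptions:

* "Latin" is read as Latin-1.
* The two word sets are short placeholders standing in for Names-Stopwords.txt and Names-Tussenvoegsel.txt.
* "Concatenate" is read as joining a Tussenvoegsel onto the token that follows it.
* A name is marked as a discard when fewer than two tokens remain.

The function and parameter names are hypothetical. The sketch handles "Firstname Surname" order only, not comma-format names.

```python
# Placeholder lists; the real script reads its defaults from text files.
STOP_WORDS = {"dr", "mr", "mrs", "ms", "jr", "sr", "phd"}
TUSSENVOEGSEL = {"van", "von", "de", "der", "den", "ter", "ten"}

def normalize_name(raw, concat_tussenvoegsel=True, drop_first_barrel=True):
    # 1. Force the encoding to Latin (assumed Latin-1); drop other characters.
    text = raw.encode("latin-1", "ignore").decode("latin-1")
    tokens = [t.strip(".,") for t in text.split()]
    # 2. Remove stop words (and tokens emptied by punctuation stripping).
    tokens = [t for t in tokens if t and t.lower() not in STOP_WORDS]
    # 3. Concatenate (default) or remove Tussenvoegsel.
    merged, prefix = [], ""
    for t in tokens:
        if t.lower() in TUSSENVOEGSEL:
            if concat_tussenvoegsel:
                prefix += t
            continue
        merged.append(prefix + t)
        prefix = ""
    if prefix:  # a Tussenvoegsel with nothing after it
        merged.append(prefix)
    # 4. Remove the first barrel (default) or concatenate the barrels.
    out = []
    for t in merged:
        if "-" in t:
            parts = [p for p in t.split("-") if p]
            kept = (parts[1:] or parts) if drop_first_barrel else parts
            t = "".join(kept)
        if t:
            out.append(t)
    # 5. Mark discards (assumed rule: fewer than two usable tokens).
    discard = len(out) < 2
    # 6. and 7. Extract the surname and the "Firstname Surname" pair.
    surname = out[-1] if out else ""
    pair = "" if discard else out[0] + " " + out[-1]
    return {"surname": surname, "pair": pair, "discard": discard}
```

With the defaults, <code>normalize_name("Dr. Jan van der Berg")</code> gives the surname "vanderBerg" and the pair "Jan vanderBerg", while <code>normalize_name("Mary Smith-Jones")</code> gives the surname "Jones".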