Extracting Features from Surnames
Revision as of 20:35, 9 July 2009
Extracting features from surnames entails encoding the frequency of [http://en.wikipedia.org/wiki/Ngram n-grams] together with other features such as the string length. Recall that 1-grams, also called unigrams, are single letters or characters; 2-grams are called bigrams or digraphs; and 3-grams are called trigrams. In some applications entire words, sentences, or other tokens are used as the grams.
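As a minimal sketch of this encoding, the following Python counts character n-grams for n = 1 to 3 and records the string length. The function names <code>ngrams</code> and <code>surname_features</code>, the lowercasing step, and the <code>__length__</code> feature key are illustrative assumptions, not something this page prescribes.

```python
from collections import Counter


def ngrams(text, n):
    """Return the list of contiguous character n-grams in text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def surname_features(surname, max_n=3):
    """Map each n-gram (n = 1..max_n) to its count, plus the string length.

    Lowercasing and the "__length__" key are assumptions made for this sketch.
    """
    name = surname.lower()
    features = Counter()
    for n in range(1, max_n + 1):
        features.update(ngrams(name, n))
    features["__length__"] = len(name)
    return dict(features)
```

For example, <code>surname_features("Anna")</code> counts the unigrams <code>a</code> and <code>n</code> twice each, the bigrams <code>an</code>, <code>nn</code>, <code>na</code> once each, the trigrams <code>ann</code> and <code>nna</code> once each, and records a length of 4.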
==Assumption of Independence of Features==