Changes

Extracting Features from Surnames

Revision as of 18:30, 10 July 2009


==Extracting the Features==

Feature extraction is performed by a dedicated script ([http://www.edegan.com/repository/SurnameFeatures.pl SurnameFeatures.pl]).

An example command line is:

<tt>perl SurnameFeatures.pl -i=sourcefile.txt -ncol=0 -dcol=5 -sp=1 -gram=2 -minfq=1 -diag=0</tt>

Here <tt>sp=1</tt> forces the inclusion of spaces in the character set (which is otherwise a-z) and also adds a space before and after the string, <tt>minfq</tt> sets the minimum global frequency of occurrence that an n-gram needs in order to be included in the output, and <tt>diag=1</tt> produces an additional frequency-of-occurrence diagnostic file.
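
The effect of the space and minimum-frequency options can be illustrated with a minimal Python sketch. It is not the code of SurnameFeatures.pl: the function names <tt>ngrams</tt> and <tt>extract_features</tt> are hypothetical, and the lower-casing, the collapsing of repeated spaces, and the reading of <tt>gram</tt> as the n-gram length are assumptions made for illustration only.

```python
import re
from collections import Counter


def ngrams(name, n=2, include_spaces=True):
    """Return the character n-grams of one name (illustrative only)."""
    text = name.lower()  # assumption: case is ignored
    if include_spaces:
        # Keep a-z and spaces, collapse runs of spaces (assumption),
        # then add one space before and one after the string.
        text = re.sub(r"[^a-z ]", "", text)
        text = " " + " ".join(text.split()) + " "
    else:
        # Character set is a-z only.
        text = re.sub(r"[^a-z]", "", text)
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def extract_features(names, n=2, include_spaces=True, min_freq=1):
    """Count n-grams per name, keeping those with global count >= min_freq."""
    per_name = [Counter(ngrams(name, n, include_spaces)) for name in names]

    # Global frequency of each n-gram across all names.
    global_counts = Counter()
    for counts in per_name:
        global_counts.update(counts)

    kept = sorted(g for g, c in global_counts.items() if c >= min_freq)
    rows = [[counts.get(g, 0) for g in kept] for counts in per_name]
    return kept, rows


if __name__ == "__main__":
    columns, rows = extract_features(["Lee", "Lea"], n=2, min_freq=2)
    print(columns)
    print(rows)
```

With spaces included, a name such as "Lee" is treated as " lee ", so the n-grams at the start and end of the name become features distinct from the same letters occurring in the middle of a name. Raising the minimum frequency drops n-grams that are rare across the whole input.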


imported>Ed
