Levenshtein name comparisons #7

BrJohan · 2014-03-18T11:33:57Z

I would like to suggest the possibility of comparing people's names using the Levenshtein distance algorithm. See http://en.wikipedia.org/wiki/Levenshtein_distance

My genealogical 'research' is primarily related to Sweden. Very often, a person's name is spelled slightly differently across various source documents.

Example: Kristina - Cristina - Christina - Chrestina - Christine

Using this suggested algorithm and allowing some (fairly small) maximum distance would be most helpful when trying to find duplicate persons in my database.
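As a rough illustration of the idea, here is a minimal sketch in plain Python. The function names `levenshtein` and `similar_names` and the default `max_distance=2` are hypothetical choices for this example, not part of any existing tool.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Return the edit distance (insertions, deletions, substitutions)."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def similar_names(names, max_distance=2):
    """List (name_a, name_b, distance) for pairs within max_distance.

    Comparison is case-insensitive; max_distance is an assumed threshold
    that would need tuning against real data.
    """
    pairs = []
    for x, y in combinations(names, 2):
        d = levenshtein(x.casefold(), y.casefold())
        if d <= max_distance:
            pairs.append((x, y, d))
    return pairs


if __name__ == "__main__":
    variants = ["Kristina", "Cristina", "Christina", "Chrestina", "Christine"]
    for x, y, d in similar_names(variants):
        print(x, y, d)
```

Comparing every pair is quadratic in the number of names, so for a large database it would probably make sense to compare only within groups that already share something, such as a birth year or parish.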

MinchinWeb · 2015-01-19T17:38:33Z

I think you might be better off with Soundex or something similar. Soundex assigns a value to a word, such that words that are pronounced the same are assigned the same value.

The fuzzy library ( https://pypi.python.org/pypi/Fuzzy ) might be a good place to start.
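To show what such a value looks like, here is a minimal hand-written sketch of classic American Soundex. It does not use the Fuzzy library's API, and the function name `soundex` is just a hypothetical choice for this example.

```python
def soundex(name: str) -> str:
    """Return a 4-character classic Soundex code, or "" if there are no letters."""
    # Consonant groups of classic Soundex; vowels, h, w and y have no code.
    codes = {}
    for letters, digit in (
        ("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
        ("l", "4"), ("mn", "5"), ("r", "6"),
    ):
        for ch in letters:
            codes[ch] = digit

    chars = [c for c in name.lower() if c.isalpha()]
    if not chars:
        return ""

    result = chars[0].upper()
    prev = codes.get(chars[0], "")
    for ch in chars[1:]:
        digit = codes.get(ch, "")
        # Skip letters whose code repeats the previous coded letter.
        if digit and digit != prev:
            result += digit
        # h and w do not separate equal codes; vowels do.
        if ch not in "hw":
            prev = digit
    # Pad with zeros and truncate to one letter plus three digits.
    return (result + "000")[:4]


if __name__ == "__main__":
    for n in ["Kristina", "Cristina", "Christina", "Chrestina", "Christine"]:
        print(n, soundex(n))
```

One caveat with this sketch: classic Soundex keeps the first letter as-is, so Kristina comes out as K623 while Cristina, Christina, Chrestina and Christine all come out as C623. Names that differ in their first letter would need extra handling or a different phonetic algorithm.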
