tumarkin edited this page Sep 14, 2015 · 8 revisions

Basic use

yente FROM-FILE TO-FILE -o OUTPUT-FILE

Getting help

yente --help

Filtering matches based on score

Requiring a minimum match score M

yente FROM-FILE TO-FILE -o OUTPUT-FILE --minimum-match-score=M

Getting multiple matches

The best N matches with ties

yente FROM-FILE TO-FILE -o OUTPUT-FILE --number-of-results=N --include-ties

or

yente FROM-FILE TO-FILE -o OUTPUT-FILE -nN -i

Getting the best N matches with ties, where every match must exceed a minimum score M (this returns fewer than N matches if too few exceed M)

yente FROM-FILE TO-FILE -o OUTPUT-FILE --minimum-match-score=M --number-of-results=N --include-ties
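As a rough illustration, the command above can be filled in with concrete values. The file names and the values chosen for M and N below are hypothetical placeholders, not taken from the yente documentation; in particular, 0.5 is an arbitrary choice for M, so check the score scale of your own data before relying on it. The sketch only prints the assembled command so it can be reviewed before running.

```shell
#!/bin/sh
# Hypothetical placeholders: substitute your own files and thresholds.
FROM_FILE="from.txt"
TO_FILE="to.txt"
OUTPUT_FILE="matches.out"
MIN_SCORE="0.5"   # M: arbitrary example value (assumption)
NUM_RESULTS="3"   # N: arbitrary example value (assumption)

# Assemble the command using only the flags shown on this page.
yente_cmd() {
    printf '%s\n' "yente $FROM_FILE $TO_FILE -o $OUTPUT_FILE --minimum-match-score=$MIN_SCORE --number-of-results=$NUM_RESULTS --include-ties"
}

# Print the command for review rather than executing it.
yente_cmd
```

Once the printed command looks right, it can be run by piping the output to `sh`, for example `yente_cmd | sh`, assuming yente is installed and on the `PATH`.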

Sound-based matching

For example, using the SoundEx algorithm with a maximum token length of 3 characters:

yente FROM-FILE TO-FILE -o OUTPUT-FILE --phonetic-algorithm=SoundEx --max-token-length=3

Matching with misspellings

The misspelling penalty factor is described in Advanced use. It is a positive floating-point number. Low values allow for misspellings, while high values are relatively intolerant of them.

yente FROM-FILE TO-FILE -o OUTPUT-FILE --misspelling-penalty=2
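Because the right penalty depends on the data, it can help to try several values and compare the outputs. The following is a minimal sketch of that idea: the `sweep_penalties` helper, the file names, and the penalty values 1, 2, and 4 are all assumptions made for illustration. It prints one command per penalty, each writing to its own output file, without executing anything.

```shell
#!/bin/sh
# Hypothetical helper: prints one yente command per penalty value.
# File names are placeholders; only --misspelling-penalty comes from this page.
sweep_penalties() {
    for p in "$@"; do
        printf '%s\n' "yente from.txt to.txt -o matches-penalty-$p.out --misspelling-penalty=$p"
    done
}

# Example penalties (assumed values): lower is more tolerant of misspellings.
sweep_penalties 1 2 4
```

Keeping a separate output file per penalty makes it straightforward to compare how many matches survive as the penalty increases.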

Approximate sound-based matching

An approximate sound-based algorithm (SoundEx preprocessing with a misspelling allowance):

yente FROM-FILE TO-FILE -o OUTPUT-FILE --phonetic-algorithm=SoundEx --max-token-length=3 --misspelling-penalty=2
