Scripts and data for finding optimal thresholds for string matching. The full docs can be found here.
The code uses hatch as its project manager, configured through the pyproject.toml file. Four command-line scripts are defined to run the analysis:
hatch run simple
: Solves the simple hit, payment or combined models
hatch run relaxed
: Solves the relaxed hit or payment model
hatch run combination
: Solves the algorithm combination model
hatch run compare
: Compares the different solvers
To get started, install hatch and run one of the commands, e.g.:
hatch run simple
This executes the run_simple_model.py script, which picks a model and a dataset and solves the model with the default settings. To specify the details, pass them as arguments, e.g.:
hatch run simple --model_type='hit' --full_dataset=False --to_file=True
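If you prefer to launch runs from Python rather than from a shell, the same command line can be assembled programmatically. The sketch below is illustrative only: build_command and run are hypothetical helper names that are not part of this repository, and it assumes hatch is installed and available on your PATH. The flag names are the ones shown in the command above.

```python
import subprocess


def build_command(script, **options):
    """Build the argument list for `hatch run <script> --name=value ...`.

    Hypothetical helper for illustration; not part of this repository.
    """
    args = ["hatch", "run", script]
    for name, value in options.items():
        # Render each option in the same --name=value form used above.
        # No shell quoting is needed because the arguments are passed as a list.
        args.append(f"--{name}={value}")
    return args


def run(script, **options):
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(build_command(script, **options), check=True)
```

For example, run("simple", model_type="hit", full_dataset=False, to_file=True) would issue the same command as the shell line above.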
For more information, please check the help, e.g.:
hatch run simple --help
If you would like to run all the experiments as they appear in the Results section of the docs, you can run:
runs.sh
Contributions are very welcome. Please open a pull request and we will review it as quickly as possible.