Evaluation on MiniF2F and ProofNet

Here we evaluate models trained on LeanDojo Benchmark on MiniF2F or ProofNet. We use MiniF2F as an example, but the same procedure applies to ProofNet.

First, use LeanDojo to extract data from MiniF2F. See the end of this Jupyter notebook. Save the dataset to data/leandojo_minif2f.

For models without retrieval, run:

python prover/evaluate.py --data-path data/leandojo_minif2f/default/  --ckpt_path PATH_TO_MODEL_CHECKPOINT --split test --num-cpus 8 --with-gpus

For models with retrieval, first use the retriever to index the corpus (pre-computing the embeddings of all premises).

python retrieval/index.py --ckpt_path PATH_TO_RETRIEVER_CHECKPOINT --corpus-path data/leandojo_minif2f/corpus.jsonl --output-path PATH_TO_INDEXED_CORPUS

Then, run:

python prover/evaluate.py --data-path data/leandojo_minif2f/default/  --ckpt_path PATH_TO_REPROVER_CHECKPOINT --indexed-corpus-path PATH_TO_INDEXED_CORPUS --split test --num-cpus 8 --with-gpus

⚠️ PATH_TO_RETRIEVER_CHECKPOINT must be the same as the --model.ret_ckpt_path argument when training PATH_TO_REPROVER_CHECKPOINT.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

eval_MiniF2F_ProofNet.md

eval_MiniF2F_ProofNet.md

Evaluation on MiniF2F and ProofNet

Files

eval_MiniF2F_ProofNet.md

Latest commit

History

eval_MiniF2F_ProofNet.md

File metadata and controls

Evaluation on MiniF2F and ProofNet