Questions about extracted benchmark dataset in Lean 3 #115

wiio12 · 2023-12-19T07:46:01Z

Hi @yangky11, I have been looking into the LeanDojo repository for a while. It's marvelous work, considering the amount of engineering effort it takes. Here are some questions I came across. Since I am not a pro in Lean, I may have asked some dumb questions.

TracedTheorem.theorem.fullname contains duplicates in extracted premises. There are 130283 premises extracted but only 129558 are distinct in their fullname. The code to reproduce is as follows:

from lean_dojo import LeanGitRepo, trace

URL = "https://github.com/leanprover-community/mathlib"
COMMIT = "19c869efa56bbb8b500f2724c0b77261edbfa28c"

# Trace the repository at the given commit.
repo = LeanGitRepo(URL, COMMIT)
traced_repo = trace(repo)

# Collect the premise definitions from every traced file.
all_premises = []
for tf in traced_repo.traced_files:
    all_premises.extend(tf.get_premise_definitions())
all_premises_full_name = [p['full_name'] for p in all_premises]
print(len(all_premises_full_name))       # 130283
print(len(set(all_premises_full_name)))  # 129558
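To see which names collide, rather than only how many, one could count occurrences per name. This is a minimal sketch that assumes each premise definition is a dict with a 'full_name' key, as in the snippet above; the sample entries are made up and stand in for the real output of get_premise_definitions().

```python
from collections import Counter


def find_duplicate_names(premises):
    """Return {full_name: count} for every name that occurs more than once."""
    counts = Counter(p["full_name"] for p in premises)
    return {name: n for name, n in counts.items() if n > 1}


# Hypothetical stand-in for the extracted premise definitions.
sample_premises = [
    {"full_name": "foo.bar"},
    {"full_name": "foo.bar"},
    {"full_name": "foo.baz"},
]

duplicates = find_duplicate_names(sample_premises)
print(duplicates)
```

Passing all_premises instead of sample_premises should list the names behind the gap between the two counts.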

Extra premises' name when calculating the theorem usage. The code to reproduce is as follows:

from collections import defaultdict

traced_theorems = traced_repo.get_traced_theorems()

# Map each premise to a list of theorems using it.
theorems_by_premises = defaultdict(list)
for t in traced_theorems:
    for p in t.get_premise_full_names():
        theorems_by_premises[p].append(t)

for premise_name in theorems_by_premises.keys():
    # It turns out that 3312 premises fail this assert.
    assert premise_name in all_premises_full_name
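Because the assert stops at the first failure, a set difference may be a more convenient way to collect every unmatched name in one pass. The sketch below uses made-up names; used_names stands in for theorems_by_premises.keys() and defined_names for all_premises_full_name.

```python
def unmatched_premises(used_names, defined_names):
    """Names used in proofs that have no extracted definition of the same name."""
    return sorted(set(used_names) - set(defined_names))


# Hypothetical data for illustration only.
used_names = ["mul_comm", "add_comm", "mul_one"]
defined_names = ["mul_comm", "mul_one", "mul_one"]

missing = unmatched_premises(used_names, defined_names)
print(missing)
```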

What exactly are these premises, and can this be fixed? (My guess is that this comes from Lean's internal method of building the AST, so there is no quick fix?)

Empty tactic list in benchmark data. The benchmark data extracted by LeanDojo does not always contain a ground-truth tactic list. (I guess these are theorems proved in term mode?)

    {
        "url": "https://github.com/leanprover-community/mathlib",
        "commit": "19c869efa56bbb8b500f2724c0b77261edbfa28c",
        "file_path": "src/data/matrix/block.lean",
        "full_name": "matrix.to_block_one_self",
        "start": [
            282,
            9
        ],
        "end": [
            283,
            27
        ],
        "traced_tactics": []
    },
    {
        "url": "https://github.com/leanprover-community/mathlib",
        "commit": "19c869efa56bbb8b500f2724c0b77261edbfa28c",
        "file_path": "src/ring_theory/subsemiring/basic.lean",
        "full_name": "subsemiring.map_id",
        "start": [
            445,
            9
        ],
        "end": [
            446,
            40
        ],
        "traced_tactics": []
    },

One more observation: these theorems make up nearly half of the theorems in the val and test sets of the random split, but only 3 of them appear in the test set of the novel_premises split. Why does this happen? Will it cause problems when evaluating model performance?

yangky11 · 2023-12-21T15:24:09Z

yangky11
Dec 21, 2023
Maintainer

Hi Haiming,

Thank you for your questions!

TracedTheorem.theorem.fullname contains duplicates in extracted premises.

This is expected. There are various reasons different premises may share the same full name. For example, the theorem here and the alias after it are both named linear_independent_subtype_range.

Extra premises' name when calculating the theorem usage.

This is also expected. Lean has some elaboration tricks that can generate additional lemmas/definitions (with different names) from a given lemma/definition. A common use case is when you state a theorem for multiplicative groups and want to automatically generate the version for additive groups. For examples, please search for to_additive in this page. As a result, you should locate premises using locations instead of names. Please see the locate_premises function in ReProver for an example.
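As a rough illustration of keying premises by location rather than by name, consider the sketch below. It is not ReProver's locate_premises; the field names (path, start, def_path, def_pos), the file path, and the sample data are all assumptions made for this example.

```python
def build_location_index(premise_defs):
    """Index premise definitions by (file path, start position)."""
    return {(d["path"], tuple(d["start"])): d for d in premise_defs}


def locate(index, path, pos):
    """Look up the definition at a location; returns None if nothing is there."""
    return index.get((path, tuple(pos)))


# Hypothetical definitions: two share a name but live at different positions.
premise_defs = [
    {"full_name": "lemma_a", "path": "src/example.lean", "start": [10, 1]},
    {"full_name": "lemma_a", "path": "src/example.lean", "start": [14, 1]},
    {"full_name": "mul_lemma", "path": "src/example.lean", "start": [20, 1]},
]
index = build_location_index(premise_defs)

# A usage whose name has no definition entry of its own, but whose
# recorded location (an assumed field) points at an existing definition.
usage = {"full_name": "add_lemma", "def_path": "src/example.lean", "def_pos": [20, 1]}
hit = locate(index, usage["def_path"], usage["def_pos"])
print(len(index), hit["full_name"])
```

With this keying, the two definitions named lemma_a stay distinct, and a generated name can still be resolved provided its recorded location matches the definition it was derived from.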

Empty tactic list in benchmark data.

That's because many term-style proofs do not have any tactics. Such proofs are very common. They are not common in the testing set of the novel_premises split because we require those proofs to use at least one premise that has never appeared in training. And we only consider premises that are used in tactics (not including term-style proofs). Actually, I don't know why there are still 3 such proofs. I thought it would be zero.
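The selection rule described above can be sketched as follows. This is an illustration under simplifying assumptions, not the actual split code: each theorem is a dict with full_name and traced_tactics as in the benchmark excerpt, and each traced tactic is reduced to a hypothetical premises list of names.

```python
def tactic_premises(theorem):
    """All premise names used in the theorem's traced tactics."""
    return {p for t in theorem["traced_tactics"] for p in t["premises"]}


def uses_novel_premise(theorem, train_premises):
    """True if at least one tactic premise never appears in training."""
    return bool(tactic_premises(theorem) - train_premises)


# Hypothetical data for illustration only.
train = [
    {"full_name": "thm_a", "traced_tactics": [{"premises": ["p1", "p2"]}]},
    {"full_name": "thm_b", "traced_tactics": []},  # term-style proof
]
candidates = [
    {"full_name": "thm_c", "traced_tactics": [{"premises": ["p1", "p3"]}]},
    {"full_name": "thm_d", "traced_tactics": [{"premises": ["p2"]}]},
    {"full_name": "thm_e", "traced_tactics": []},  # term-style proof
]

train_premises = set().union(*(tactic_premises(t) for t in train))
test_names = [
    t["full_name"] for t in candidates if uses_novel_premise(t, train_premises)
]
print(test_names)
```

A theorem with an empty traced_tactics list has no tactic premises, so under this rule it can never qualify, which is consistent with the expectation of zero.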
