RFC: change variable inclusion mechanism #2452
Comments
Note that this is also a problem for proof parallelism, as it is necessary to compile the body of the proof to determine the statement of the theorem, which may be needed for the following theorem. This is not directly a factor in the decision, but we should consider whether we really mean to make proof parallelism impossible. (This is similarly relevant for finding things that can be skipped to be able to compile a file in "outline mode" when full checking is not required, e.g. if we only want to know the AST of the file or what declarations are in it.) |
After some internal discussion within the Lean FRO:
This RFC remains on our radar, we hope to be able to get to it, but not yet. |
Here's an example of how the current mechanism can bite:
-- I'm going to prove stuff both about `Nat` and `Fin` in this file
variable (n : Nat) (m : Fin n)
theorem ohno (h : n ≠ 0) : n - 1 ≠ n := by
  cases n with
  | zero => exact (h rfl).elim
  | succ n =>
    intro h
    cases h
-- Help, this lemma needs `m`!
#check ohno -- ohno (n : Nat) (m : Fin n) (h : n ≠ 0) : n - 1 ≠ n |
Recently we have been finding a lot of lemmas in Mathlib that pick up |
Can you say whether these would have been addressed by the inclusion heuristics suggested so far?
|
Here's another example of leaky variables causing problems:
import Mathlib.Algebra.Algebra.Defs
count_heartbeats in
def foo1 : ℕ := 0 -- 7 heartbeats
variable (R S M : Type) [CommRing R] [CommRing S] [Algebra R S]
  [AddCommGroup M] [Module R M] [Module S M] [IsScalarTower R S M]
count_heartbeats in
def foo2 : ℕ := 0 -- 1468 heartbeats
set_option trace.profiler true
def foo3 : ℕ := 0
/-
[Elab.command] [0.093040] def foo3 : ℕ :=
      0 ▼
  [step] [0.012494] expected type: Type, term
        Module R M
  [step] [0.012286] expected type: Type, term
        Module S M
  [step] [0.060222] expected type: Prop, term
        IsScalarTower R S M ▶
-/
It seems that those variables (which are a very standard set-up in the theory of Kaehler differentials, coincidentally the slowest file in mathlib) are adding 0.1 seconds to every declaration in the file. The explanation seems to be this message. Gouezel's summary: "My understanding is the following: when you define foo, then the variables are around, and might be used in the definition, so Lean has to make sense of these variables even before starting to process your definition (because it might involve these variables). So it has to understand what Algebra R S means, which means finding some typeclasses on R and S." |
The current inclusion mechanism is also an obstacle to using |
The current Lean 4 mechanism for including variables is the following: when starting the proof of a theorem (or the body of a definition, it doesn't make a difference), all variables in the context are included and may be used in the proof. When the proof is over, only those variables that have actually been used are kept for the statement of the theorem (i.e., all the other ones are discarded). This looks nice, but we have faced several issues in mathlib 4 because of this:
- Given `variable {n : ℕ} (hn : 1 < n) (h'n : 2 < n)` and a theorem using `hn` but not `h'n`, `induction n` will also pull `h'n` into the assumptions of the theorem. So will `subst`. The solution is to `clear h'n` before the induction, even though `h'n` should not be in sight anywhere.
- If a proof needs `classA α`, and in the context there are assumptions `classB α` and `classC α`, both of them implying `classA α`, then the above mechanism will only keep `classB α` or `classC α`, and the one which will be kept depends on the details of the path chosen by typeclass inference. Therefore, adjusting instance priorities somewhere else may change which one will appear in the assumptions of the theorem.

No linter can guard against this lack of robustness. Therefore, most mathlib folks have been convinced that a less clever but more predictable behavior would be an improvement over the current situation. After several discussions on Zulip, the following scheme has been suggested:
- The `include / omit` mechanism in Lean 3 is felt by many people as a major source of non-readability. Assume that `hf : ...` and `hg : ...` are variables in the context.
  (a) Write `include hf hg in` before the statement of a theorem, to force inclusion of `hf` and `hg` (with the binders they have in the variables list, and at their position in the variables list). This could also be used to adjust binder types, with `include (hf) {hg} in` for instance. Optional: also allow `omit hf in` to remove a named typeclass assumption `hf` that would otherwise be included.
  (b) Write `theorem foo ... (hf) ... {hg} ...` to force inclusion of `hf` and `hg`, adjusting their binders and their position in the variables list. The interplay of this syntax with autoparams is not completely clear, and what one should do if other variables depend on `hf` and `hg` and appear before them in the list should also be clarified.

A poll was organized on Zulip (https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/automatic.20inclusion.20of.20variables.20in.20mathlib.204/near/386620103), proposing either to keep the Lean 4 behavior, or go back to the Lean 3 behavior, or use the above scheme with one of the two suggestions (a) or (b). No one opted for either the current Lean 4 behavior or the Lean 3 behavior. 12 people voted only for (a), 4 only for (b), and 4 for (a) or (b).
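
To make the typeclass issue and the `include hf hg in` suggestion concrete, here is a minimal sketch in Lean. All names (`InclusionSketch`, `A`, `B`, `C`, `getA`, `pos_of_one_lt`) are hypothetical and chosen only for illustration; none of them come from Mathlib. The `include ... in` line uses the syntax proposed in this RFC, so whether a given Lean toolchain accepts it is an assumption.

```lean
-- Illustrative sketch only: every name below is hypothetical.
namespace InclusionSketch

-- Typeclass issue: `B α` and `C α` both imply `A α`.
class A (α : Type) where
  a : α
class B (α : Type) extends A α
class C (α : Type) extends A α

section Diamond
variable {α : Type} [B α] [C α]

-- The body needs an `A α` instance, which typeclass inference can reach
-- through either `[B α]` or `[C α]`. Under the usage-based mechanism
-- described above, only the one inference happened to pick would be kept.
def getA : α := A.a
end Diamond

section Include
variable {n : Nat} (hn : 1 < n) (h'n : 2 < n)

-- Proposed syntax (assumed): `hn` is requested explicitly, so the
-- statement does not depend on which hypotheses `cases` touches.
include hn in
theorem pos_of_one_lt : 0 < n := by
  cases n with
  | zero => exact absurd hn (Nat.not_lt_zero 1)
  | succ k => exact Nat.succ_pos k
end Include

end InclusionSketch
```

The point of the second half is that the hypotheses of `pos_of_one_lt` can be read off from the lines immediately above its statement, without inspecting the proof; `h'n` is never mentioned and so should not appear in the final signature.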