Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

doc: clarify and expand docstrings for the instantiate functions #3183

Merged
merged 5 commits into from
Jan 22, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
70 changes: 61 additions & 9 deletions src/Lean/Expr.lean
Original file line number Diff line number Diff line change
Expand Up @@ -1228,32 +1228,84 @@ def inferImplicit : Expr → Nat → Bool → Expr
| e, _, _ => e

/--
Instantiate the loose bound variables in `e` using `subst`.
That is, a loose `Expr.bvar i` is replaced with `subst[i]`.
Instantiates the loose bound variables in `e` using the `subst` array,
where a loose `Expr.bvar i` at "binding depth" `d` is instantiated with `subst[i - d]` if `0 <= i - d < subst.size`,
and otherwise it is replaced with `Expr.bvar (i - subst.size)`; non-loose bound variables are not touched.

If we imagine all expressions as being able to refer to the infinite list of loose bound variables ..., 3, 2, 1, 0 in that order,
then conceptually `instantiate` is instantiating the last `n` of these and reindexing the remaining ones.
Warning: `instantiate` uses the de Bruijn indexing to index the `subst` array, which might be the reverse order from what you might expect.
See also `Lean.Expr.instantiateRev`.

**Terminology.** The "binding depth" of a subexpression is the number of bound variables available to that subexpression
by virtue of being in the bodies of `Expr.forallE`, `Expr.lam`, and `Expr.letE` expressions.
A bound variable `Expr.bvar i` is "loose" if its de Bruijn index `i` is not less than its binding depth.)

**About instantiation.** Instantiation isn't mere substitution.
When an expression from `subst` is being instantiated, its internal loose bound variables have their de Bruijn indices incremented
by the binding depth of the replaced loose bound variable.
Comment on lines +1245 to +1246
Copy link
Collaborator

@digama0 digama0 Jan 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this true? My impression was that lean's instantiate only works on closed terms for subst, which means it doesn't need to lift them when inserting them into e.

EDIT: it is true, although closed terms have metadata which avoids the need to actually do anything to them when lifting.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's certainly true. I did some tests and checked the source code. Didn't make it up 😉

I had the same impression as you, and this revelation is what led me to feel the need to update this documentation. I saw Tomas Skriven use Core.transform with instantiate1 and thought that that couldn't be correct, but it was.

(Re your edit: yeah, there's an optimization that if the bvar range of the expression is 0 it can just substitute.)

Here's a test:

import Lean

open Lean

#eval
  let e := Expr.lam `a (.const `Nat []) (.bvar 1) .default
  Expr.instantiate1 e (.bvar 0)
/-
Lean.Expr.lam `a (Lean.Expr.const `Nat []) (Lean.Expr.bvar 1) (Lean.BinderInfo.default)
-/

If the de Bruijn index lifting weren't present, we'd expect Lean.Expr.bvar 0 in the output.

This is necessary for the substituted expression to still refer to the correct binders after instantiation.
Similarly, the reason loose bound variables not instantiated using `subst` have their de Bruijn indices decremented like `Expr.bvar (i - subst.size)`
is that `instantiate` can be used to eliminate binding expressions internal to a larger expression,
and this adjustment keeps these bound variables referring to the same binders.
-/
@[extern "lean_expr_instantiate"]
opaque instantiate (e : @& Expr) (subst : @& Array Expr) : Expr

/--
Instantiates loose bound variable `0` in `e` using the expression `subst`,
where in particular a loose `Expr.bvar i` at binding depth `d` is instantiated with `subst` if `i = d`,
and otherwise it is replaced with `Expr.bvar (i - 1)`; non-loose bound variables are not touched.

If we imagine all expressions as being able to refer to the infinite list of loose bound variables ..., 3, 2, 1, 0 in that order,
then conceptually `instantiate1` is instantiating the last one of these and reindexing the remaining ones.

This function is equivalent to `instantiate e #[subst]`, but it avoids allocating an array.

See the documentation for `Lean.Expr.instantiate` for a description of instantiation.
In short, during instantiation the loose bound variables in `subst` have their own de Bruijn indices updated to account
for the binding depth of the replaced loose bound variable.
-/
@[extern "lean_expr_instantiate1"]
opaque instantiate1 (e : @& Expr) (subst : @& Expr) : Expr

/-- Similar to instantiate, but `Expr.bvar i` is replaced with `subst[subst.size - i - 1]` -/
/--
Instantiates the loose bound variables in `e` using the `subst` array.
This is equivalent to `Lean.Expr.instantiate e subst.reverse`, but it avoids reversing the array.
In particular, rather than instantiating `Expr.bvar i` with `subst[i - d]` it instantiates with `subst[subst.size - 1 - (i - d)]`,
where `d` is the binding depth.

This function instantiates with the "forwards" indexing scheme.
For example, if `e` represents the expression `fun x y => x + y`,
then `instantiateRev e.bindingBody!.bindingBody! #[a, b]` yields `a + b`.
The `instantiate` function on the other hand would yield `b + a`, since de Bruijn indices count outwards.
-/
@[extern "lean_expr_instantiate_rev"]
opaque instantiateRev (e : @& Expr) (subst : @& Array Expr) : Expr

/--
Similar to `instantiate`, but consider only the variables `xs` in the range `[beginIdx, endIdx)`.
Function panics if `beginIdx <= endIdx <= xs.size` does not hold.
Similar to `Lean.Expr.instantiate`, but considers only the substitutions `subst` in the range `[beginIdx, endIdx)`.
Function panics if `beginIdx <= endIdx <= subst.size` does not hold.

This function is equivalent to `instantiate e (subst.extract beginIdx endIdx)`, but it does not allocate a new array.

This instantiates with the "backwards" indexing scheme.
See also `Lean.Expr.instantiateRevRange`, which instantiates with the "forwards" indexing scheme.
-/
@[extern "lean_expr_instantiate_range"]
opaque instantiateRange (e : @& Expr) (beginIdx endIdx : @& Nat) (xs : @& Array Expr) : Expr
opaque instantiateRange (e : @& Expr) (beginIdx endIdx : @& Nat) (subst : @& Array Expr) : Expr

/--
Similar to `instantiateRev`, but consider only the variables `xs` in the range `[beginIdx, endIdx)`.
Function panics if `beginIdx <= endIdx <= xs.size` does not hold.
Similar to `Lean.Expr.instantiateRev`, but considers only the substitutions `subst` in the range `[beginIdx, endIdx)`.
Function panics if `beginIdx <= endIdx <= subst.size` does not hold.

This function is equivalent to `instantiateRev e (subst.extract beginIdx endIdx)`, but it does not allocate a new array.

This instantiates with the "forwards" indexing scheme (see the docstring for `Lean.Expr.instantiateRev` for an example).
See also `Lean.Expr.instantiateRange`, which instantiates with the "backwards" indexing scheme.
-/
@[extern "lean_expr_instantiate_rev_range"]
opaque instantiateRevRange (e : @& Expr) (beginIdx endIdx : @& Nat) (xs : @& Array Expr) : Expr
opaque instantiateRevRange (e : @& Expr) (beginIdx endIdx : @& Nat) (subst : @& Array Expr) : Expr

/-- Replace free (or meta) variables `xs` with loose bound variables. -/
@[extern "lean_expr_abstract"]
Expand Down