RFD(esign): Representations of proofs and proof obligations #49273

Closed
Keno opened this issue Apr 6, 2023 · 8 comments
Labels
help wanted - Indicates that a maintainer wants help on an issue or pull request

Comments

@Keno
Member

Keno commented Apr 6, 2023

The problem statement

We're starting to see a bit of a proliferation of uses of @assume_effects. This isn't really a problem in general,
but the semantics here are subtle and getting them wrong is undefined behavior. As such, it would behoove us to think
about tooling to verify these assumptions wherever possible. Of course, you might say, "well, if the compiler could
verify the effects, we wouldn't need to assume them". That is the correct position of course, but not all
@assume_effects are due to limitations that we ever expect to lift. For example, the FMA consistency annotation:

julia/base/math.jl

Lines 49 to 55 in 1bf65b9

@assume_effects :consistent @inline function two_mul(x::Float64, y::Float64)
    if Core.Intrinsics.have_fma(Float64)
        xy = x*y
        return xy, fma(x, y, -xy)
    end
    return Base.twomul(x,y)
end

requires a fairly complete, symbolic model of IEEE floating point semantics that is beyond the scope of what I'd expect the compiler to ever have in the default compilation pipeline - it is more in the domain of proof assistants and model checkers. Ideally, we'd be able to have these kinds of tools as packages that can then be used to either find or check proofs of our assumptions (presumably on CI, since these kinds of checks can be expensive).

Prior art / the research question

I've taken a look at something like https://github.com/xldenis/creusot, which (as far as I understand) compiles Rust code to Why3's mlcfg representation (which can then generate proof obligations for a whole host of other tooling). It seems relatively straightforward to do the same thing for Julia by taking advantage of the usual IR-agnostic compilation trick that we do. But I think that's only half of the story: it addresses the mechanism of how something like this might work, but not really the semantics, so that's kind of where this issue comes in.

I would like to pose the question to the larger julia/PL community: what - if anything - should the semantics of proofs and proof obligations be in the core Julia system? I have some initial thoughts, which I will post below, but I'm assuming there are a lot of standard approaches and tricks here that I'm just not familiar with, so I'd like to leave this fairly broad for suggestions.

My own thoughts

I think it would be prudent to have some sort of primitive representation of proofs within the core system. This doesn't mean that this system would actually be used for proof checking, but it seems necessary to at least have some object that represents true statements. That way, if you can construct such an object (or prove that the function that constructs it is :total and thus would construct the object if it were run), you can assert the truth of something. For example, could we have something like:

abstract type ProofObligation end

struct NoThrowObligation <: ProofObligation
    mi::MethodInstance
    world::UInt
end

struct Proof
    what::ProofObligation
    # Proof construction using Julia compiler builtin
    function NoThrowProof(obl::NoThrowObligation)
        Base.infer_effects(obl.mi, obl.world).nothrow || error()
        new(obl)
    end
    # Proof construction from composition
    function NoThrowImplies(obl::NoThrowObligation)
        # Compiler primitive that does `infer_effects`, but also generates a
        # `NoThrowObligation` for every call it couldn't prove `:nothrow` for
        obls = Base.infer_effect_obligations(obl.mi, obl.world).nothrow
        obl, proofs->begin
            all(((p, o),) -> p.what === o, zip(proofs, obls)) || error()
            new(obl)
        end
    end
    # Other constructors monkey patched by external packages, e.g. the why3 construction.
end

These objects could then be exchanged for checking/assumptions, etc.
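As a purely illustrative usage sketch of such an exchange (hypothetical: it reuses the strawman types above, and `Base.method_instances`/`Base.get_world_counter` are just one way to obtain the inputs):

# Obtain a proof that +(::Int, ::Int) cannot throw, then hand it to
# anything that wants to rely on that fact.
mi = only(Base.method_instances(+, (Int, Int)))
obl = NoThrowObligation(mi, Base.get_world_counter())
proof = NoThrowProof(obl)   # errors unless inference establishes :nothrow
@assert proof.what === obl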

This design is obviously bad, but that's because I don't really know what the good designs in this world are. I assume there's a bunch of standard tricks to lift primitive notions of equality and composition into the proof domain, so I'd love to get some input on the core principles of doing this kind of thing, keeping in mind that this would be an entirely optional part of the language.

@Keno Keno added the help wanted label Apr 6, 2023
@Keno
Member Author

Keno commented Apr 6, 2023

I've gotten some requests to expand upon what I'm asking for people who may not be versed in the latest Julia features, so let me try to provide some background. If there's anything else people would like me to elaborate on, please let me know.

  1. What is @assume_effects?

The Julia compiler needs to, at various stages, prove certain properties of the functions it is analyzing. For example, it can be helpful to know that a function does not throw, is guaranteed to terminate, and has no other observable side effects, because that means that if the result of the function is unused, we may immediately delete the call without further analysis. This kind of thing is similarly achieved with attributes in LLVM, and it seems similar to the effect system in Verse, although I only saw a passing reference to that recently and have not dug into it in detail. To achieve this, we interprocedurally converge the effects for a function along with regular type inference.
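For example (a tiny illustration of the dead-code-elimination point):

# Because `x + 1` on Int infers as total (consistent, effect-free, nothrow,
# terminating), the unused computation below can simply be deleted.
g(x::Int) = (x + 1; nothing)   # compiles down to just `return nothing`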

The full semantics of these effects are listed in the documentation: https://docs.julialang.org/en/v1/base/base/#Base.@assume_effects

They can be queried with infer_effects:

julia> Base.infer_effects(+, Tuple{Int, Int})
(+c,+e,+n,+t,+s,+m)

julia> Base.infer_effects(println, Tuple{String})
(!c,!e,!n,!t,!s,!m)′

(As you might imagine, this means that + is the most pure a function can be, println the least - again, see docs for specific semantics).

However, while the compiler is reasonably sophisticated at inferring these (based on regular abstract interpretation with relatively sophisticated transfer functions for our primitives), it is of course not perfect. As a result, there is @assume_effects, which allows overriding the compiler's judgement of the effects of a particular function.
Consider, for example, the function I had above:

@assume_effects :consistent @inline function two_mul(x::Float64, y::Float64)
    if Core.Intrinsics.have_fma(Float64)
        xy = x*y
        return xy, fma(x, y, -xy)
    end
    return Base.twomul(x,y)
end

The :consistent effect requires, among other things, that the result be independent of the execution environment. Here, have_fma reads the processor capabilities, which is an execution environment query. Thus, this function is consistent if and only if ∀x,y: (xy, fma(x, y, -xy)) === Base.twomul(x,y) (where xy = x*y, and === is egality, the strongest of our equality predicates, which in this case requires bitwise equality of the two floating point numbers). This is not something our compiler can determine itself, because it does not have a sufficiently strong formal floating point model. However, this is certainly within the capabilities of other tools (e.g. I'm sure Alive2 could prove it if we gave it the LLVM version of this), so it would be nice to have a mechanism/semantics for the compiler to keep track of its assumptions and offload them to external tools if possible (probably as a CI step).
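As a (non-proof) illustration of what an external tool would need to establish, the condition can at least be sampled. A randomized spot-check along these lines only tests finitely many inputs, so it is no substitute for a symbolic IEEE-754 model:

# Sample the claimed identity (xy, fma(x, y, -xy)) === Base.twomul(x, y)
# on random bit patterns; returns a counterexample pair if one is found.
function spotcheck_two_mul(n::Int)
    for _ in 1:n
        x = reinterpret(Float64, rand(UInt64))
        y = reinterpret(Float64, rand(UInt64))
        xy = x * y
        (xy, fma(x, y, -xy)) === Base.twomul(x, y) || return (x, y)
    end
    return nothing   # no counterexample among the sampled inputs
end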

Getting the effects wrong (e.g. annotating something as :consistent when it actually isn't) is undefined behavior, and since we exploit these effects aggressively, it can cause quite a bit of havoc.

  2. Are you looking for a formalization of Julia?

Not really. It is certainly true that in general, you'd need to have some sort of formalization of julia semantics in order to prove e.g. the absence of the possibility of method errors. However, the Julia compiler already does a lot of this, and the goal of this effort is not to verify the julia compiler. The goal is to give users additional tooling to verify their own code so they can use things like @assume_effects more safely (because the semantics are subtle and very easy to get wrong, even for experienced developers). For example, the two_mul case above is entirely monomorphized by the julia compiler, and the (remaining) required proof is entirely in the floating point domain. Of course, if we build some interesting infrastructure here, formalizing more things about Julia could be an interesting future direction.

  3. Are you trying to add a static type system to julia?

Depends on what you mean, but not really. In particular, I'm not looking for syntax additions, etc., that might have semantic impact on what Julia code means. I'm looking for ways to make it easier to apply formal methods to existing Julia code. I think the lines get a bit blurry when you get to really sophisticated type systems like you find in theorem provers, where types are mostly written implicitly by automation - which is probably similar to what would happen here - but the key point is that I'm not proposing any changes to language semantics.

  4. Are you trying to build an interactive theorem prover?

Again, not really. I do think that in order to do this well, you need to have some basic primitives in the language itself that can then get translated out to the actual proving environments. That probably does include a basic proof checker, so you can do basic compositional stuff mechanically without the overhead of going out into another environment. However, turning Julia into a theorem prover is not my goal - I want to leverage what already exists. If somebody wanted to take whatever comes out of this as a base and build a full theorem proving environment that is more integrated with Julia, I'd welcome that, but that's not my goal.

@Seelengrab
Contributor

Seelengrab commented Apr 6, 2023

Love to see this! In a sense, any function that the compiler can statically infer the type for is already a proof in and of itself; our compiler "just" isn't smart enough to gather the invariants as part of the inferred type. For example, this function:

f(x::Int) = x - (x % 2)

Only infers as Int, when it could infer as (pseudo) EvenInt (this could be represented by metadata as well), with postcondition "The LSB is 0". This program is then a proof of the function Int -> EvenInt, since it typechecks for that signature. Conversely, this function:

function myfunc(x::Int)
    iseven(x) || throw(ArgumentError("x not even!"))
    x*2
end

Only requires Int and does a runtime check for its precondition that x should be an even number. In order for the function to terminate normally, the actual type of the input argument could be inferred as EvenInt (a kind of reverse inference: what properties must an incoming object have such that the function is guaranteed to terminate in a regular manner?). The function itself is then a proof with postcondition "for all x from EvenInt: myfunc(x)::EvenInt" (you could also write this as an entailment EvenInt => EvenInt, meaning iff the input is an EvenInt instead of just an Int, you're guaranteed to get out another EvenInt; if it isn't, anything might happen).
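To make the pseudo-type concrete, one (purely illustrative) way to reify it is as an ordinary wrapper whose constructor enforces the invariant - all names here are made up:

struct EvenInt
    x::Int
    # The invariant ("LSB is 0") is established once, at construction.
    EvenInt(x::Int) = iseven(x) ? new(x) : throw(ArgumentError("x not even!"))
end

# Int -> EvenInt: the postcondition is carried by the return type.
f(x::Int) = EvenInt(x - (x % 2))

# EvenInt => EvenInt: the precondition is carried by the argument type,
# so the body needs no runtime iseven check of its own.
myfunc(e::EvenInt) = EvenInt(e.x * 2)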

So - are you primarily asking about datastructure/algorithmic design and representing that on the frontend, or asking about prior art on how to do proof extraction (i.e., extracting pre-/postconditions and/or invariants) and checking them against a provided spec?

There's also the question of what kinds of proofs/specs ought to be representable. I think for a lot of things, abstract interpretation itself can help here - not by using it to find effects, but to, for example, aggregate memory allocations to give a lower bound on how much memory is needed. This gets messy quickly, since the same kinds of things that make regular type inference hard also blow up any kind of bounded value inference, like the memory lower bound question. Post-/preconditions that involve multiple execution environments/tasks are going to be tricky - this is more or less equivalent to safety and liveness.

You mention "proofs by composition", which is an active research question (the main issue is unintentionally violating internal invariants of other specs when composing). Hillel Wayne has lots of interesting publications about that and about using TLA+ to model programming environments. Currently though, most formal modeling I know of takes place as an additional, extra thing next to the program you're modeling.

Finally - as far as I'm aware, only the model checking part is really somewhat well researched. Extracting a model of any kind from an existing language is untrodden ground, to say the least - the closest/most generic thing I can come up with off the top of my head is Hoare logic, which deals with pre-/postconditions (and seems to be what the interface in creusot is based on), but most research about that and similar logics that I know of is more concerned with the model checking part than the "how did we end up with this model in the first place" part. We already do kind of model this, with input argument types being preconditions and return types (and some effects) being postconditions, as alluded to above.

@Keno
Member Author

Keno commented Apr 6, 2023

So - are you primarily asking about datastructure/algorithmic design and representing that on the frontend, or asking about prior art on how to do proof extraction (i.e., extracting pre-/postconditions and/or invariants) and checking them against a provided spec?

Mostly the former, but I'm not sure "frontend" is the correct way to say it. As you said, the compiler already proves various things by making typing judgements over the extended inference+effects lattice, but there isn't really any first-class representation of this that could serve as the central interchange between the compiler and external tooling. The question is basically what that interchange point looks like. Is it something like what I wrote above, or one could imagine just directly saying something like:

struct Judgement
    mi::MethodInstance
    world::UInt
    typ # some type in the extended type lattice
end

taking advantage of the recent support we have for pluggable type systems in the abstract interpreter and, as in my proposal above, restricting construction of these things to only allow the construction of true judgements. I very much think of this as a sort of "bootstrapping" question, i.e. how do you turn the primitive notions of equality and code that we have into something that you can start to reason over? I have a very good sense of how to do that sort of thing for plain execution, but since I've never worked on proof system implementations, I just don't have any good design intuition here.
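To illustrate the "only true judgements are constructible" idea, a rough strawman (here `infer_return_type` is a stand-in for whatever trusted inference entry point the real design would use, and a proper lattice order would replace `<:`):

struct Judgement
    mi::MethodInstance
    world::UInt
    typ   # some type in the extended type lattice
    function Judgement(mi::MethodInstance, world::UInt, typ)
        # Re-derive the claimed judgement before permitting construction.
        infer_return_type(mi, world) <: typ || error("not a valid judgement")
        new(mi, world, typ)
    end
end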

@timholy
Member

timholy commented Apr 6, 2023

If the point is to check whether packages are using @assume_effects properly, can you explain in greater detail why this can't be in a package?

@MasonProtter
Contributor

MasonProtter commented Apr 6, 2023

@timholy I think the whole point here is to come up with an interface precisely so that this can be done in a package without being overly entangled with the nitty gritty compiler internals.

@Seelengrab
Contributor

Seelengrab commented Jun 28, 2023

I just remembered that this issue exists, so if you're still looking into this @Keno, the Dafny language is probably the closest to a complete description of the sorts of things & challenges that arise in this domain. They also have a, to my eyes at least, good interface.

As for whether this can be done in a package or not - it's possible, but should (after A LOT of maturing) be transitioned to Base.

@Keno
Member Author

Keno commented Jun 28, 2023

Yes, I'm quite familiar with Dafny and agree that it's a good design point for the interface side of things.

Keno added a commit that referenced this issue Jul 23, 2023
This proposal is an attempt to tie together some of the recent discussion around
the future of `@inbounds`, formal interface checking (along the lines of
my proposal in https://github.com/Keno/InterfaceSpecs.jl), and `--check-bounds`.

# Reviewing `@inbounds`

## History
The `@inbounds` annotation has been in the language for a decade at this point [1].
It is a critical tool for achieving machine performance in high-performance applications
where the overhead of boundschecking is prohibitive [2]. At the same time, as Julia
has matured, the dangers of this macro have become more clear. It is essentially
impossible to use correctly in generic code, and instances where code (including
code in Base from before we were culturally more careful about inbounds)
uses `@inbounds` have become a not-uncommon complaint point about the language
as a whole [3]. In current practice, at least in Base, we have become extremely
hesitant about the introduction of new `@inbounds` annotations (see e.g.
discussion in #41200). At the same time, the ability to disable bounds checking
remains a critical capability for several of our user communities (c.f. #50239
and related discussions). So how do we move forward here? I'm hoping we can find
a way forward that achieves the same benefits as `@inbounds`, but in a way where
it is possible to re-establish soundness if desired.

## When is inbounds still required?

### LLVM's current capabilities

Let's look at a couple of examples. First:
```
function sum1(A::Vector{Int64})
	a = zero(eltype(A))
	for i in eachindex(A)
		a += A[i]
	end
	return a
end
```

Is inbounds required here in modern julia? The answer is no - LLVM's range analysis
proves that the bounds check is dead and eliminates it.
(The same holds true for the `1:length(A)` version, although this was not always true).

What about the following:
```
function sum2(A::Vector{Int64}, B::Vector{Int64})
	a = zero(eltype(A))
	for i in eachindex(A, B)
		a += A[i] + B[i]
	end
	return a
end
```

Here, LLVM is again able to eliminate the boundscheck, though of course the
`eachindex` introduces an additional check that the array shapes are compatible.
Even this is still ok-ish:

```
# Noinline here just for illustration, think some big internal function that is
# not inlineable.
@noinline function sum_partial(A::Vector{Int64}, upto::Int)
	a = zero(eltype(A))
	for i in 1:upto
		a += A[i]
	end
	return a
end
sum3(A::Vector{Int64}) = sum_partial(A, length(A))
```

The generated code here runs through the loop. LLVM's vectorizer has support for bounds-like
checks and can remove them from the vector body. However, in scalar code (e.g. Float64
without fastmath), LLVM does still perform the bounds check in every loop iteration rather
than hoisting it outside. This is probably a bug somewhere. The IRCE pass is supposed to
take care of this, so for the present purposes, let's assume that this will also go through
eventually.

That said, the takeaway here is that, for simple cases, where everything is inlined, LLVM
is reasonably powerful at eliminating these checks.

### The effect-separation trick

Let's consider a case like this:
```
function repeat_outer(a::AbstractMatrix, (m,n)::NTuple{2, Any})
    o, p = size(a,1), size(a,2)
    b = similar(a, o*m, p*n)
    for j=1:n
        d = (j-1)*p+1
        R = d:d+p-1
        for i=1:m
            c = (i-1)*o+1
            b[c:c+o-1, R] = a
        end
    end
    return b
end
```

The setindex! call in here goes through a few layers of dispatch, but eventually
ends up at:

```
function _setindex!(l::IndexStyle, A::AbstractArray, x, I::Union{Real, AbstractArray}...)
    @inline
    @boundscheck checkbounds(A, I...)
    _unsafe_setindex!(l, _maybe_reshape(l, A, I...), x, I...)
    A
end
```

This is a very common pattern in our code, where we have an `@inline` function that
first checks for the in-boundedness and then performs an unsafe operation that assumes
everything is inbounds.

This pattern is quite good. By allowing the boundscheck itself to be inlined, LLVM can
eliminate the boundscheck using local context from the calling function (in theory
at least - in practice we still have an `@inbounds` there because LLVM isn't strong
enough currently).

However, the pattern is somewhat hard to use consistently and correctly, so we generally
only use it in the low level indexing code.

# Effect-precondition synthesis

## `@inbounds` motivations

In the previous section I said that the effect-separation pattern was good, but hard
to do consistently and correctly. So can we have the compiler implement it
for us? Suppose we had a macro `@hoist_boundschecks` that worked something like
the following:

```
function _setindex!(l::IndexStyle, A::AbstractArray, x, I::Union{Real, AbstractArray}...)
    @inline
    @hoist_boundschecks _safe_setindex!(l, _maybe_reshape(l, A, I...), x, I...)
    A
end
#= Expands to =#
function _setindex!(l::IndexStyle, A::AbstractArray, x, I::Union{Real, AbstractArray}...)
    @inline
    #= All the boundschecks from _safe_setindex! hoisted =# checkbounds(A, I...)
    #= Switched to unsafe_setindex, but now proven correct =#
    _unsafe_setindex!(l, _maybe_reshape(l, A, I...), x, I...)
    A
end
```

Essentially, the macro would go through the `_safe_setindex!` call
and synthesize a check that ensures that all calls are inbounds. Of course,
such a synthesis is not always possible in general, but it's not that hard for simple
cases (a trivial way to do it is to duplicate the function, delete all the side
effects other than throwing BoundsErrors and then let LLVM go to town on it).
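For the indexing case above, the check that such a macro would synthesize ideally collapses to the standard bounds predicate (hand-written here for illustration; `synthesized_check` is not a real function):

```
# Illustration: the synthesized "all calls are inbounds" check for
# _safe_setindex! is just ordinary inboundsness of the requested indices.
synthesized_check(A::AbstractArray, I...) = checkbounds(Bool, A, I...)
```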

## Generalizing

There are a few problems with the macro as suggested in the previous section. In particular,
it changes the ordering of side effects and errors in the function, which may be desirable,
but which I would like to have explicit control over. My proposal is a slight generalization
of it that works roughly as follows:

```
function foo(A, I)
	@split_effects :nothrow bar(A, I)
end
#= Expands to =#
function foo(A, I)
	if precondition(Val{:nothrow}(), bar, A, I)
		@assume_effects :nothrow bar(A, I)
	else
		bar(A, I)
	end
end
```

Where `precondition`, like the proposed macro above, is some synthesized predicate
that ensures that the resulting function does not throw. Of course, this brings up
many questions, such as the implementation of `precondition` and the statement-position
`@assume_effects`, which we currently don't have, but let's explore the implications
of this a bit further.

## Precondition inference and composition

Coming up with these kinds of preconditions is a well-known problem in the
compiler literature. The term to look for is "weakest precondition synthesis". How best
to do this is somewhat outside the scope of this writeup, but I do want to highlight
the primary property that suggests the idea: functional composition of functions is boolean
composition of preconditions:

```
function foo()
	bar()
	baz()
end

precondition(n::Val{:nothrow}, foo) = precondition(n, bar) && precondition(n, baz)
```

Also, if a function uses the `@split_effects` macro internally, then an outside
`@assume_effects` can cause the precondition to be assumed to be true. This mirrors
the `@inbounds/@boundscheck` interaction we have right now, but with an alternative
that is provably safe.
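In sketch form, the interaction looks like this (hypothetical, reusing the names from above):

```
# An outer effects assumption discharges the inner synthesized check,
# mirroring the current @inbounds/@boundscheck interaction, but with a
# precondition that has a precise, checkable meaning.
@assume_effects :nothrow function outer(A, I)
	# Inside a :nothrow-assumed specialization, `precondition(...)` is
	# assumed true, so this lowers directly to the assumed variant.
	@split_effects :nothrow bar(A, I)
end
```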

# Extensions to `@assume_effects`

So far, we have assumed that these preconditions are synthesized automatically, but
getting this to work well of course depends on the synthesis strength of the compiler.
To still allow users to take advantage of this mechanism, even if the compiler's synthesis
is not strong enough, we can extend @assume_effects to allow conditional effects:

```
@assume_effects (checkbounds(Bool, A, I) && :nothrow) function setindex!(A, I)
	_safe_setindex!(A, I)
end
```

The idea here is that the precondition is overridden by `checkbounds(Bool, A, I)`, so
any `@split_effects` of this function would use the `checkbounds` function for its
check, and if this returned true, could `@assume_effects :nothrow` this function,
which, as described above, would allow the use of the unsafe function in the interior.

## Call-site `@assume_effects` and effects-assumption specialization

In the foregoing we have, in a few places, used a call-site `@assume_effects` without
defining what it does. The idea here is pretty simple: we add a new specialization
axis based on effects assumptions. For example, if a function is `@assume_effects :nothrow`,
then at codegen time, any path that throws an error (in particular bounds checks)
becomes undefined behavior and LLVM is allowed to assume that it does not happen.
Of course, this macro is extremely dangerous (as the existing @assume_effects
and @inbounds are).

However, one significant benefit of this design is that there is a very clear notion of
what the assertion promises. This is not the case for `@inbounds`, which we
never gave semantics beyond its effect on the `@boundscheck` macro. As a result,
even an infinitely-powerful prover could not check the correctness or non-correctness
of `@inbounds` (as currently used - we could of course consider an alternative @inbounds
design with stronger semantics). In contrast, the formal meaning of a conditional
`@assume_effects` is well defined and could in principle be checked by a tool (#49273).

# Implications for `--check-bounds`

In the whole `--check-bounds` removal discussion, we had assumed that we did not want
to keep two copies of all code just for the purposes of `--check-bounds`, which thus
required us to disable constant propagation. However, in this proposal, the `@assume_effects`
specialization is exactly such a copy set and could thus be used for this purpose.
That said, this proposal would also hopefully provide a much stronger system for boundscheck
removal that would make `--check-bounds` much less necessary.

# Other uses

There are a few other places where domain checks can interfere with optimization.
For example, consider the following situation:

```
function sin_loop(n)
	for i = 1:n
		# Imagine there was a whole bunch of other code here that used this value,
		# but all that got DCE'd, but we can't in general DCE `sin`, because it
		# may throw.
		sin(Float64(i))
	end
end
```
```
julia> @time sin_loop(1_000_000_000)
 20.383584 seconds
```

With the support in this PR, we can:
```
# The actual condition here is `!isinf`, but we're allowed to overapproximate and
# LLVM can handle `isfinite` better.
# TODO: In a more complete version of this PR, this precondition would be synthesized
@Base.assume_effects (isfinite(x) && :nothrow) @noinline function mysin(x::Float64)
	sin(x)
end

function sin_loop_split(n)
	for i = 1:n
		Core.invoke_split_effects(:nothrow, mysin, Float64(i))
	end
end
```
```
julia> @code_llvm sin_loop_split(1_000_000_000)
;  @ REPL[19]:1 within `sin_loop_split`
define void @julia_sin_loop_split_428(i64 signext %"n::Int64") #0 {
top:
;  @ REPL[19]:4 within `sin_loop_split`
  ret void
}

julia> @time sin_loop_split(1_000_000_000)
  0.000000 seconds
```

# Current status of this PR
This PR contains a complete sketch of this mechanism, including inference support
for the new intrinsic, as well as codegen and runtime support for effect-assumption
specialization. It also includes an (extremely incomplete) sketch of the supporting
macros. It does not implement any precondition synthesis logic. My plan is to support
synthesis for some of the simple `@inbounds` cases in Base, but then delegate fancier
synthesis support to packages, since that can get arbitrarily complicated.

# Implementation Plan

This PR itself is not suitable for merging, but if people like this direction, I would
like to get the basic pieces in in relatively short order. To that end, I would suggest
the following order of implementation as separate PRs once we've agreed on the overall plan:

1. New intrinsics, Method(Instance) fields, inference support
2. @assume_effects extensions
3. Codegen and specialization support, Code(Instance) fields
4. Basic Synthesis and `@inbounds` removal

[1] Introduced in 66ab577 for julia 0.2
[2] Note that it's usually not the boundschecking itself that is the problem,
    but rather that the presence of the boundschecking inhibits other optimizations.
[3] E.g. https://yuri.is/not-julia/
@Keno
Member Author

Keno commented Nov 26, 2023

This issue served its purpose as having a place to point people to when asking questions. Closing in favor of more focused discussion on concrete proposals in the future.

@Keno Keno closed this as not planned Nov 26, 2023