Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lowering: Revert undefined global outlining #56714

Closed
wants to merge 3 commits into from
Closed

Conversation

Keno
Copy link
Member

@Keno Keno commented Nov 29, 2024

This partially reverts #51970, once again allowing throwing GlobalRefs in value position for CodeInfo, (but not IRCode). The primary motivation for this reversal is that after binding partition the throwiness of a global is world-age dependent and we do not want to have lowering be world-age dependent, because that would require us to rewrite the lowered code upon invalidation (#56649).

With respect to the original motivation for this change (being able to use the statement flag in inference), we retain the property that the statement flags apply only to the optimizer version of the statement (i.e. with the GlobalRefs moved out). The only place in inference where we were not doing this, we can rely on exct instead. This does assume that exct is precise, but we've done a lot of work on that in the past year, so hopefully we can rely on that now.

The revert is partial, because we go in the opposite direction in a few places: For GotoIfNot cond and EnterNode scope operands, we just forbid any GlobalRef entirely. Constants are rare in these, so no need to burden inference with the possibility of handling these. We do however, keep (and add appropriate handling code) for GlobalRef arguments to ReturnNode to make sure that trivial code remains one statement.

Keno added 3 commits November 29, 2024 07:35
When we declare inner methods, e.g. the `f` in

```
function fs()
   f(lhs::Integer) = 1
   f(lhs::Integer, rhs::(local x=Integer; x)) = 2
   return f
end
```

we must hoist the definition of the (appropriately mangled) generic
function `f` to top-level, including all variables that were used
in the signature definition of `f`. This situation is a bit
unique in the language because it uses inner function scope, but gets
executed in toplevel scope. For example, you're not allowed to use a local
of the inner function in the signature definition:

```
julia> function fs()
          local x=Integer
          f(lhs::Integer, rhs::x) = 2
          return f
       end
ERROR: syntax: local variable x cannot be used in closure declaration
Stacktrace:
 [1] top-level scope
   @ REPL[3]:1
```

In particular, the restriction is signature-local:
```
julia> function fs()
          f(rhs::(local x=Integer; x)) = 1
          f(lhs::Integer, rhs::x) = 2
          return f
       end
ERROR: syntax: local variable x cannot be used in closure declaration
Stacktrace:
 [1] top-level scope
   @ REPL[4]:1
```

There's a special intermediate form `moved-local` that gets generated
for this definition. In c6c3d72, this form
stopped getting generated for certain inner methods. I suspect this
happened because of the incorrect assumption that the set of moved
locals is being computed over all signatures, rather than being a
per-signature property.

The result of all of this was that this is one of the few places
where lowering still generated a symbol as the lhs of an assignment
for a global (instead of globalref), because the code that generates
the assignment assumes it's a local, but the later pass doesn't know this.
Because we still retain the code for this from before we started using globalref
consistently, this wasn't generally causing a problems, except possibly leaking
a global (or potentially assigning to a global when this wasn't intended).
However, in follow on work, I want to make use of knowing whether the LHS is a
global or local in lowering, so this was causing me trouble.

Fix all of this by putting back the `moved-local` where it was dropped.

Fixes #56711
This adjusts lowering to emit `setglobal!` for assignment to globals,
thus making the `=` expr head used only for slots in `CodeInfo` and
entirely absent in `IRCode`. The primary reason for this is just
to reduce the number of special cases that compiler passes have
to reason about. In IRCode, `=` was already essentially equivalent
to `setglobal!`, so there's no good reason not to canonicalize.

Finally, the `=` syntax form for globals already gets recognized
specially to insert `convert` calls to their declared binding
type, so this doesn't impose any additional requirements on
lowering to distinguish local from global assignments. In general,
I'd also like to separate syntax and intermediate forms as much
as possible where their semantics differ, which this accomplises
by just using the builtin.

This change is mostly semantically invisible, except that spliced-in
GlobalRefs now declare their binding because they are indistinguishable
from ordinary assignments at the stage where I inserted the lowering.
If we want to, we can preserve the difference, but it'd be a bit
more annoying for not much benefit, because:
1. The spliced in version was only recently made to work anyway, and
2. The semantics of when exactly bindings are declared is still messy
   on master anyway and will get tweaked shortly in further binding
   partitions work.
This partially reverts #51970, once again allowing throwing GlobalRefs
in value position for `CodeInfo`, (but not `IRCode`). The primary motivation
for this reversal is that after binding partition the throwiness of
a global is world-age dependent and we do not want to have lowering
be world-age dependent, because that would require us to rewrite
the lowered code upon invalidation (#56649).

With respect to the original motivation for this change (being able
to use the statement flag in inference), we retain the property
that the statement flags apply only to the optimizer version
of the statement (i.e. with the GlobalRefs moved out). The only
place in inference where we were not doing this, we can rely
on `exct` instead. This does assume that `exct` is precise, but
we've done a lot of work on that in the past year, so hopefully
we can rely on that now.

The revert is partial, because we go in the opposite direction in
a few places: For `GotoIfNot` cond and `EnterNode` scope operands, we
just forbid any `GlobalRef` entirely. Constants are rare in these,
so no need to burden inference with the possibility of handling
these. We do however, keep (and add appropriate handling code)
for GlobalRef arguments to `ReturnNode` to make sure that trivial
code remains one statement.
@vtjnash
Copy link
Member

vtjnash commented Nov 29, 2024

What is the reason not to just ban them entirely except if the optimizer proved them no throw and added an optimizer edge?

@Keno
Copy link
Member Author

Keno commented Nov 29, 2024

If GlobalRef were entirely disallowed in value position, we would blow up the size of lowered code by a significant factor. At 2-3x.

@vtjnash
Copy link
Member

vtjnash commented Nov 29, 2024

That seems to imply that most statements contain 2-3 globals (and probably more), which seems unlikely. I tried deleting this lowering complication entirely, and the system image changed by an insignificant amount (1MB or 0.6%), which seems easily recoverable by making the compression design a lot better once only SSAValue is allowed as immediate values.

vtjnash@deepsea4:~/julia$ git diff
diff --git a/src/ast.c b/src/ast.c
index 474c0661f52..5ae5ba0af79 100644
--- a/src/ast.c
+++ b/src/ast.c
@@ -288,7 +288,7 @@ static jl_value_t *scm_to_julia_(fl_context_t *fl_ctx, value_t e, jl_module_t *m
 
 static const builtinspec_t julia_flisp_ast_ext[] = {
     { "defined-julia-global", fl_defined_julia_global }, // TODO: can we kill this safepoint
-    { "nothrow-julia-global", fl_nothrow_julia_global },
+    //{ "nothrow-julia-global", fl_nothrow_julia_global },
     { "current-julia-module-counter", fl_module_unique_name },
     { "julia-scalar?", fl_julia_scalar },
     { NULL, NULL }
$ size -A usr/lib/julia/sys.so | grep ldata
.ldata                158870928 => 159737216
$ du -sh usr/share/julia/compiled/v1.12/
254M => 271M

@Keno
Copy link
Member Author

Keno commented Nov 30, 2024

Alright, if we're happy to bite the size increase, I can put up that PR instead.

@Keno Keno force-pushed the kf/canonicalequal branch from 18205fb to 5de6d3b Compare December 3, 2024 00:57
Base automatically changed from kf/canonicalequal to master December 3, 2024 06:19
Keno added a commit that referenced this pull request Dec 3, 2024
This is an alternative to #56714 that goes in the opposite direction -
just outline all GlobalRefs during lowering. It is a lot simpler
that #56714 at the cost of some size increase. Numbers:

sys.so .ldata size:
This PR: 159.8 MB
Master: 158.9 MB

I don't have numbers of #56714, because it's not fully complete.
Additionally, it's possible that restricting GlobalRefs from arguments
position would let us use a more efficient encoding in the future.
Keno added a commit that referenced this pull request Dec 4, 2024
This is an alternative to #56714 that goes in the opposite direction -
just outline all GlobalRefs during lowering. It is a lot simpler
that #56714 at the cost of some size increase. Numbers:

sys.so .ldata size:
This PR: 159.8 MB
Master: 158.9 MB

I don't have numbers of #56714, because it's not fully complete.
Additionally, it's possible that restricting GlobalRefs from arguments
position would let us use a more efficient encoding in the future.
Keno added a commit that referenced this pull request Dec 5, 2024
This is an alternative to #56714 that goes in the opposite direction -
just outline all GlobalRefs during lowering. It is a lot simpler
that #56714 at the cost of some size increase. Numbers:

sys.so .ldata size:
This PR: 159.8 MB
Master: 158.9 MB

I don't have numbers of #56714, because it's not fully complete.
Additionally, it's possible that restricting GlobalRefs from arguments
position would let us use a more efficient encoding in the future.
@Keno
Copy link
Member Author

Keno commented Dec 5, 2024

Closing in favor of #56746

@Keno Keno closed this Dec 5, 2024
Keno added a commit that referenced this pull request Dec 5, 2024
This is an alternative to #56714 that goes in the opposite direction -
just outline all GlobalRefs during lowering. It is a lot simpler
that #56714 at the cost of some size increase. Numbers:

sys.so .ldata size:
This PR: 159.8 MB
Master: 158.9 MB

I don't have numbers of #56714, because it's not fully complete.
Additionally, it's possible that restricting GlobalRefs from arguments
position would let us use a more efficient encoding in the future.
Keno added a commit that referenced this pull request Dec 6, 2024
This is an alternative to #56714 that goes in the opposite direction -
just outline all GlobalRefs during lowering. It is a lot simpler
that #56714 at the cost of some size increase. Numbers:

sys.so .ldata size:
This PR: 159.8 MB
Master: 158.9 MB

I don't have numbers of #56714, because it's not fully complete.
Additionally, it's possible that restricting GlobalRefs from arguments
position would let us use a more efficient encoding in the future.
Keno added a commit that referenced this pull request Dec 7, 2024
This is an alternative to #56714 that goes in the opposite direction -
just outline all GlobalRefs during lowering. It is a lot simpler that
#56714 at the cost of some size increase. Numbers:

sys.so .ldata size:
This PR: 159.8 MB
Master: 158.9 MB

I don't have numbers of #56714, because it's not fully complete.
Additionally, it's possible that restricting GlobalRefs from arguments
position would let us use a more efficient encoding in the future.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants