lowering: Revert undefined global outlining #56714

Keno · 2024-11-29T10:12:44Z

This partially reverts #51970, once again allowing throwing GlobalRefs in value position for CodeInfo, (but not IRCode). The primary motivation for this reversal is that after binding partition the throwiness of a global is world-age dependent and we do not want to have lowering be world-age dependent, because that would require us to rewrite the lowered code upon invalidation (#56649).

With respect to the original motivation for this change (being able to use the statement flag in inference), we retain the property that the statement flags apply only to the optimizer version of the statement (i.e. with the GlobalRefs moved out). The only place in inference where we were not doing this, we can rely on exct instead. This does assume that exct is precise, but we've done a lot of work on that in the past year, so hopefully we can rely on that now.

The revert is partial, because we go in the opposite direction in a few places: For GotoIfNot cond and EnterNode scope operands, we just forbid any GlobalRef entirely. Constants are rare in these, so no need to burden inference with the possibility of handling these. We do however, keep (and add appropriate handling code) for GlobalRef arguments to ReturnNode to make sure that trivial code remains one statement.

When we declare inner methods, e.g. the `f` in ``` function fs() f(lhs::Integer) = 1 f(lhs::Integer, rhs::(local x=Integer; x)) = 2 return f end ``` we must hoist the definition of the (appropriately mangled) generic function `f` to top-level, including all variables that were used in the signature definition of `f`. This situation is a bit unique in the language because it uses inner function scope, but gets executed in toplevel scope. For example, you're not allowed to use a local of the inner function in the signature definition: ``` julia> function fs() local x=Integer f(lhs::Integer, rhs::x) = 2 return f end ERROR: syntax: local variable x cannot be used in closure declaration Stacktrace: [1] top-level scope @ REPL[3]:1 ``` In particular, the restriction is signature-local: ``` julia> function fs() f(rhs::(local x=Integer; x)) = 1 f(lhs::Integer, rhs::x) = 2 return f end ERROR: syntax: local variable x cannot be used in closure declaration Stacktrace: [1] top-level scope @ REPL[4]:1 ``` There's a special intermediate form `moved-local` that gets generated for this definition. In c6c3d72, this form stopped getting generated for certain inner methods. I suspect this happened because of the incorrect assumption that the set of moved locals is being computed over all signatures, rather than being a per-signature property. The result of all of this was that this is one of the few places where lowering still generated a symbol as the lhs of an assignment for a global (instead of globalref), because the code that generates the assignment assumes it's a local, but the later pass doesn't know this. Because we still retain the code for this from before we started using globalref consistently, this wasn't generally causing a problems, except possibly leaking a global (or potentially assigning to a global when this wasn't intended). However, in follow on work, I want to make use of knowing whether the LHS is a global or local in lowering, so this was causing me trouble. Fix all of this by putting back the `moved-local` where it was dropped. Fixes #56711

This adjusts lowering to emit `setglobal!` for assignment to globals, thus making the `=` expr head used only for slots in `CodeInfo` and entirely absent in `IRCode`. The primary reason for this is just to reduce the number of special cases that compiler passes have to reason about. In IRCode, `=` was already essentially equivalent to `setglobal!`, so there's no good reason not to canonicalize. Finally, the `=` syntax form for globals already gets recognized specially to insert `convert` calls to their declared binding type, so this doesn't impose any additional requirements on lowering to distinguish local from global assignments. In general, I'd also like to separate syntax and intermediate forms as much as possible where their semantics differ, which this accomplises by just using the builtin. This change is mostly semantically invisible, except that spliced-in GlobalRefs now declare their binding because they are indistinguishable from ordinary assignments at the stage where I inserted the lowering. If we want to, we can preserve the difference, but it'd be a bit more annoying for not much benefit, because: 1. The spliced in version was only recently made to work anyway, and 2. The semantics of when exactly bindings are declared is still messy on master anyway and will get tweaked shortly in further binding partitions work.

This partially reverts #51970, once again allowing throwing GlobalRefs in value position for `CodeInfo`, (but not `IRCode`). The primary motivation for this reversal is that after binding partition the throwiness of a global is world-age dependent and we do not want to have lowering be world-age dependent, because that would require us to rewrite the lowered code upon invalidation (#56649). With respect to the original motivation for this change (being able to use the statement flag in inference), we retain the property that the statement flags apply only to the optimizer version of the statement (i.e. with the GlobalRefs moved out). The only place in inference where we were not doing this, we can rely on `exct` instead. This does assume that `exct` is precise, but we've done a lot of work on that in the past year, so hopefully we can rely on that now. The revert is partial, because we go in the opposite direction in a few places: For `GotoIfNot` cond and `EnterNode` scope operands, we just forbid any `GlobalRef` entirely. Constants are rare in these, so no need to burden inference with the possibility of handling these. We do however, keep (and add appropriate handling code) for GlobalRef arguments to `ReturnNode` to make sure that trivial code remains one statement.

vtjnash · 2024-11-29T12:00:48Z

What is the reason not to just ban them entirely except if the optimizer proved them no throw and added an optimizer edge?

Keno · 2024-11-29T12:05:40Z

If GlobalRef were entirely disallowed in value position, we would blow up the size of lowered code by a significant factor. At 2-3x.

vtjnash · 2024-11-29T19:36:07Z

That seems to imply that most statements contain 2-3 globals (and probably more), which seems unlikely. I tried deleting this lowering complication entirely, and the system image changed by an insignificant amount (1MB or 0.6%), which seems easily recoverable by making the compression design a lot better once only SSAValue is allowed as immediate values.

vtjnash@deepsea4:~/julia$ git diff
diff --git a/src/ast.c b/src/ast.c
index 474c0661f52..5ae5ba0af79 100644
--- a/src/ast.c
+++ b/src/ast.c
@@ -288,7 +288,7 @@ static jl_value_t *scm_to_julia_(fl_context_t *fl_ctx, value_t e, jl_module_t *m
 
 static const builtinspec_t julia_flisp_ast_ext[] = {
     { "defined-julia-global", fl_defined_julia_global }, // TODO: can we kill this safepoint
-    { "nothrow-julia-global", fl_nothrow_julia_global },
+    //{ "nothrow-julia-global", fl_nothrow_julia_global },
     { "current-julia-module-counter", fl_module_unique_name },
     { "julia-scalar?", fl_julia_scalar },
     { NULL, NULL }

$ size -A usr/lib/julia/sys.so | grep ldata
.ldata                158870928 => 159737216
$ du -sh usr/share/julia/compiled/v1.12/
254M => 271M

Keno · 2024-11-30T02:48:25Z

Alright, if we're happy to bite the size increase, I can put up that PR instead.

This is an alternative to #56714 that goes in the opposite direction - just outline all GlobalRefs during lowering. It is a lot simpler that #56714 at the cost of some size increase. Numbers: sys.so .ldata size: This PR: 159.8 MB Master: 158.9 MB I don't have numbers of #56714, because it's not fully complete. Additionally, it's possible that restricting GlobalRefs from arguments position would let us use a more efficient encoding in the future.

Keno · 2024-12-05T07:35:32Z

Closing in favor of #56746

This is an alternative to #56714 that goes in the opposite direction - just outline all GlobalRefs during lowering. It is a lot simpler that #56714 at the cost of some size increase. Numbers: sys.so .ldata size: This PR: 159.8 MB Master: 158.9 MB I don't have numbers of #56714, because it's not fully complete. Additionally, it's possible that restricting GlobalRefs from arguments position would let us use a more efficient encoding in the future.

Keno added 3 commits November 29, 2024 07:35

Keno mentioned this pull request Nov 29, 2024

Add basic code for binding partition revalidation #56649

Draft

Keno force-pushed the kf/canonicalequal branch from 18205fb to 5de6d3b Compare December 3, 2024 00:57

Base automatically changed from kf/canonicalequal to master December 3, 2024 06:19

Keno mentioned this pull request Dec 3, 2024

Fully outline all GlobalRefs #56746

Merged

Keno closed this Dec 5, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

lowering: Revert undefined global outlining #56714

lowering: Revert undefined global outlining #56714

Keno commented Nov 29, 2024

vtjnash commented Nov 29, 2024

Keno commented Nov 29, 2024

vtjnash commented Nov 29, 2024

Keno commented Nov 30, 2024

Keno commented Dec 5, 2024

lowering: Revert undefined global outlining #56714

lowering: Revert undefined global outlining #56714

Conversation

Keno commented Nov 29, 2024

vtjnash commented Nov 29, 2024

Keno commented Nov 29, 2024

vtjnash commented Nov 29, 2024

Keno commented Nov 30, 2024

Keno commented Dec 5, 2024