enable/improve constant propagation through varargs methods #26826

Merged
merged 1 commit into from
May 4, 2018

@jrevels jrevels commented Apr 16, 2018

This should allow constant propagation through varargs functions. I was just a keyboard monkey following @Keno's/@vtjnash's direction for this PR, so they can probably answer any questions better than I can.

EDIT: We've now added some other stuff that expands the scope here a bit (mainly so we don't forget some of these hacks), but this PR can be broken up into several others later if need be.
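
To make the goal concrete, here is a minimal sketch of the kind of call site this PR is concerned with. The names `pick` and `caller` are hypothetical and do not come from the PR. If the constant `2` cannot be propagated into the varargs method, inference only knows that `i` is an `Int` and cannot tell which trailing element is selected. What a given Julia version actually infers here is something to inspect rather than assume.

```julia
# Hypothetical helper: `i` selects one of the trailing (varargs) arguments.
pick(i, rest...) = rest[i]

# At this call site every argument is a literal constant.
caller() = pick(2, 1, 2.0, "three")

# The runtime answer is unambiguous.
@assert caller() === 2.0

# What inference concludes depends on the Julia version, so print it
# instead of hard-coding an expectation.
println(Base.return_types(caller, Tuple{}))
```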

@jrevels jrevels changed the title grab varargs type info earlier than we did before WIP the long march towards inferring Cassette Apr 17, 2018
if isvarargtype(va)
    # assumes that we should never see Vararg{T, x}, where x is a constant (should be guaranteed by construction)
    va = rewrap_unionall(va, linfo.specTypes)
    vararg_type_vec = Any[va]
Member

This vec type is wrong - there’s basically nothing we can do in this case (just keep the empty Tuple)

Member

actually, looks like the inference code will handle this, so this is OK

Member Author
@jrevels jrevels Apr 18, 2018

So to make sure I follow, you're saying that we can't do anything if we don't actually have the types of the "trailing" arguments in atypes, so this should just be:

if nargs > laty
    vararg_type_vec = Any[]
    vararg_type = Tuple{}
else
    # the stuff that's already here, but fixed wrt your other comment
end

EDIT: oops, I made this comment before refreshing the page, so I didn't see your other comment

cache_match = true
# verify that the trailing args (va) aren't Const
Member

Still need to verify that the vargs list matches

Member Author
@jrevels jrevels Apr 18, 2018

Verifying this correctly would essentially require fully recomputing the vararg_type_vec part of get_argtypes, right?

Should I actually do that, or is there a more lightweight check I should be doing?

EDIT: This would also be necessary for checking the last element of cache_args, which is the vararg_type that we computed in get_argtypes.

    end
    cache_match || continue
-   for i in 1:nargs
+   for i in 1:length(argtypes)
        a = argtypes[i]
        ca = cache_args[i]
Member

Only nargs items long

for cache_code in cache
    # try to search cache first
    cache_args = cache_code.args
-   if cache_code.linfo === code && length(cache_args) >= nargs
+   if cache_code.linfo === code && length(cache_args) == length(argtypes)
Member

length(cache_args) === nargs

This change would filter out any varargs method from being found in the cache (slow, but not otherwise broken)
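
The review comments above turn on how a cached result is matched against a call site: the cached `args` list is only `nargs` entries long, while the call site's `argtypes` lists every trailing argument separately. Below is a toy sketch of a lookup that respects both facts. `ToyResult` and `toy_lookup` are hypothetical stand-ins, not the compiler's `InferenceResult` or its real lookup code, and `Const` handling is ignored entirely.

```julia
# Toy stand-in for a cached inference result (hypothetical, for illustration only).
struct ToyResult
    args::Vector{Any}    # one entry per declared slot; the last is the varargs tuple type
    vargs::Vector{Any}   # element-wise types of the trailing arguments
end

# Return the cached entry matching `argtypes` (call-site form), or `nothing`.
function toy_lookup(cache::Vector{ToyResult}, nargs::Int, argtypes::Vector{Any})
    fixed = nargs - 1                      # slots before the varargs slot
    for entry in cache
        length(entry.args) == nargs || continue
        length(argtypes) >= fixed || continue
        # Compare the fixed positional slots one by one.
        all(entry.args[i] === argtypes[i] for i in 1:fixed) || continue
        # Compare the trailing arguments against the stored vararg list.
        trailing = argtypes[nargs:end]
        length(trailing) == length(entry.vargs) || continue
        all(entry.vargs[i] === trailing[i] for i in eachindex(trailing)) || continue
        return entry
    end
    return nothing
end

cache = ToyResult[ToyResult(Any[Int, Tuple{Float64,String}], Any[Float64, String])]
hit = toy_lookup(cache, 2, Any[Int, Float64, String])
miss = toy_lookup(cache, 2, Any[Int, Float64])
println(hit === cache[1], " ", miss === nothing)
```

Requiring `length(entry.args) == nargs` mirrors the suggestion in the comment above, whereas comparing against `length(argtypes)` would reject every varargs method whose call site passes more than one trailing argument.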

vararg_type_vec = Any[rewrap_unionall(p, linfo.specTypes) for p in atypes[nargs:laty]]
vararg_type = tuple_tfunc(Tuple{vararg_type_vec...})
for i in 1:length(vararg_type_vec)
atyp = vararg_type_vec[i]
Member

Actually, this seems wrong too: atypes can be a Varargs here, and I don’t recall seeing that being checked for in the downstream users.

Member Author

(Just talked to Jameson in person, turns out this is actually okay)

end
end
end
result.vargs = vararg_type_vec
end
args[nargs] = vararg_type
nargs -= 1
Member

If this got put back, aren’t we missing detection of the Varargs argument as Const then?

Member

Ah, wait, no. I think tuple_tfunc took care of this
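
For readers following the `vararg_type_vec` / `vararg_type` pair in these hunks, the sketch below shows only the plain type-level step of collapsing a list of trailing argument types into one tuple type. The element types are illustrative values, not taken from the PR. Per the comments above, the compiler's `tuple_tfunc` apparently also takes care of `Const` information, which this sketch does not model.

```julia
# Element-wise types of the trailing arguments at a call site (illustrative values).
vararg_type_vec = Any[Int, Float64, String]

# Collapse them into the single tuple type seen by the varargs slot.
vararg_type = Tuple{vararg_type_vec...}
@assert vararg_type === Tuple{Int, Float64, String}

# The collapsed type still exposes per-element types,
# but a plain type cannot record that an element is a known constant.
@assert fieldtype(vararg_type, 2) === Float64
```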

iblislin commented Apr 17, 2018

The BSD CI got an unkillable process from this PR, and it ate 100% CPU on one core.
Here is the backtrace from LLDB; hope this helps.
https://gist.github.com/iblis17/df720508da46677c3f5d73318a136362

and this

(lldb) p jl_lineno                  
(int) $0 = 1251                     
(lldb) p jl_filename                
(const char *) $1 = 0x000000080fcee158 "/usr/home/julia/julia-fbsd-buildbot/worker/11rel-amd64/build/test/compiler/compiler.jl"

jrevels commented Apr 21, 2018

Looks like this passes all tests except for compiler/compiler.jl, where it hangs.

jrevels commented Apr 21, 2018

Okay, I've tracked down the offending test here:

# issue #13183

Interrupting that test while it hangs yields:

^CWARNING: Force throwing a SIGINT
Internal error: encountered unexpected error in runtime:
InterruptException()
jl_egal at /Users/jarrettrevels/data/repos/julia-dev/src/builtins.c:171
is_derived_type at ./compiler/typelimits.jl:66
jfptr_is_derived_type_2599 at /Users/jarrettrevels/data/repos/julia-dev/usr/lib/julia/sys.dylib (unknown line)
is_derived_type at ./compiler/typelimits.jl:74
jfptr_is_derived_type_2599 at /Users/jarrettrevels/data/repos/julia-dev/usr/lib/julia/sys.dylib (unknown line)
⋮
jfptr_is_derived_type_2599 at /Users/jarrettrevels/data/repos/julia-dev/usr/lib/julia/sys.dylib (unknown line)
is_derived_type at ./compiler/typelimits.jl:74
jfptr_is_derived_type_2599 at /Users/jarrettrevels/data/repos/julia-dev/usr/lib/julia/sys.dylib (unknown line)
is_derived_type at ./compiler/typelimits.jl:74
jfptr_is_derived_type_2599 at /Users/jarrettrevels/data/repos/julia-dev/usr/lib/julia/sys.dylib (unknown line)
is_derived_type at ./compiler/typelimits.jl:74
jfptr_is_derived_type_2599 at /Users/jarrettrevels/data/repos/julia-dev/usr/lib/julia/sys.dylib (unknown line)
is_derived_type at ./compiler/typelimits.jl:55
jfptr_is_derived_type_2599 at /Users/jarrettrevels/data/repos/julia-dev/usr/lib/julia/sys.dylib (unknown line)
is_derived_type at ./compiler/typelimits.jl:74
jfptr_is_derived_type_2599 at /Users/jarrettrevels/data/repos/julia-dev/usr/lib/julia/sys.dylib (unknown line)
is_derived_type_from_any at ./compiler/typelimits.jl:94
jfptr_is_derived_type_from_any_2596 at /Users/jarrettrevels/data/repos/julia-dev/usr/lib/julia/sys.dylib (unknown line)
type_more_complex at ./compiler/typelimits.jl:228
jfptr_type_more_complex_2588 at /Users/jarrettrevels/data/repos/julia-dev/usr/lib/julia/sys.dylib (unknown line)
⋮

jrevels commented Apr 30, 2018

Woohoo, tests here now pass locally.

@jrevels jrevels mentioned this pull request Apr 30, 2018

jrevels commented Apr 30, 2018

Only things left here now are adding tests and investigating the potential cache problems pointed out at JuliaLabs/Cassette.jl#41 (comment)

@jrevels jrevels force-pushed the jr/pleasegodmakeitstop branch 2 times, most recently from d863bba to 3424b31 Compare May 2, 2018 21:34
@jrevels jrevels changed the title WIP the long march towards inferring Cassette enable/improve constant propagation through varargs methods May 2, 2018

jrevels commented May 2, 2018

Okay, tests added and caching problem fixed (thanks to @vtjnash). I rebased and squashed so that this doesn't leave a bunch of intermediate broken commits. Only thing left now is to make sure CI passes and then this is good to go.

jrevels commented May 2, 2018

From running tests locally, it looks like everything passes except a number of tests in SparseArrays/test/higherorderfns.jl - including the tests that are marked @test_broken on master which this PR had previously fixed. Looks like we'll have to get those passing again before merging this...

EDIT: Here's a gist of the failures logged by the test suite: https://gist.github.com/jrevels/197ed8b8bf71d72acd8bebe246116655

EDIT 2: ...and here's a gist that reproduces the failures in a more digestible format (ripped from the failing test code): https://gist.github.com/jrevels/20f5e1d7ac64c0b51099960ded3e4687

mbauman commented May 2, 2018

Good luck. I've lost many hours to failures in there. Typically it's been due to a failure to inline. Check @code_typed and search for invokes. There should be very few and none in the non-error path.
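
One way to automate the "search for invokes" advice is sketched below. It walks the optimized typed code returned by `code_typed` and counts expressions whose head is `:invoke`. The helper names `count_invokes` and `count_invokes_in` are hypothetical, and the exact shape of optimized code varies between Julia versions, so treat the count as a diagnostic hint rather than a guarantee.

```julia
# Count `:invoke` expressions inside a single statement (recursing into nested Exprs).
count_invokes_in(x) = 0
function count_invokes_in(ex::Expr)
    n = ex.head === :invoke ? 1 : 0
    for arg in ex.args
        n += count_invokes_in(arg)
    end
    return n
end

# Count `:invoke` expressions in the optimized typed code of `f` for `argtypes`.
function count_invokes(f, argtypes)
    n = 0
    for (ci, _) in code_typed(f, argtypes; optimize=true)
        for stmt in ci.code
            n += count_invokes_in(stmt)
        end
    end
    return n
end

# Integer addition lowers to an intrinsic, so no call should be left un-inlined.
println(count_invokes(+, (Int, Int)))
```

A nonzero count on a hot, non-error path is the signal to look for; which callee failed to inline can then be read off the `@code_typed` output directly.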

jrevels commented May 3, 2018

So, on master (see https://gist.github.com/jrevels/20f5e1d7ac64c0b51099960ded3e4687 for test setup):

julia> @test (@allocated broadcast!(+, Q, X, Y, Z)) == 0
Test Passed

julia> @test (@allocated broadcast!(*, Q, X, Y, Z)) == 0
Test Passed

julia> @test (@allocated broadcast!(f, Q, X, Y, Z)) == 0
Test Failed at REPL[10]:1
  Expression: #= REPL[10]:1 =# @allocated(broadcast!(f, Q, X, Y, Z)) == 0
   Evaluated: 16 == 0
ERROR: There was an error during testing

On this PR currently:

julia> @test (@allocated broadcast!(+, Q, X, Y, Z)) == 0
Test Failed at REPL[8]:1
  Expression: #= REPL[8]:1 =# @allocated(broadcast!(+, Q, X, Y, Z)) == 0
   Evaluated: 6688 == 0
ERROR: There was an error during testing

julia> @test (@allocated broadcast!(*, Q, X, Y, Z)) == 0
Test Failed at REPL[9]:1
  Expression: #= REPL[9]:1 =# @allocated(broadcast!(*, Q, X, Y, Z)) == 0
   Evaluated: 6688 == 0
ERROR: There was an error during testing

julia> @test (@allocated broadcast!(f, Q, X, Y, Z)) == 0
Test Failed at REPL[10]:1
  Expression: #= REPL[10]:1 =# @allocated(broadcast!(f, Q, X, Y, Z)) == 0
   Evaluated: 1744 == 0
ERROR: There was an error during testing

EDIT: I was dumb, see comment below

jrevels commented May 3, 2018

D'oh, I was dumb and didn't rebuild the sysimg. It turns out the culprit is entirely the change to inlineable that reduced InferenceResult cache misses, which makes sense.

Here's this PR on the MWE tests with the offending change reverted:

julia> @test (@allocated broadcast!(+, Q, X, Y, Z)) == 0
Test Passed

julia> @test (@allocated broadcast!(*, Q, X, Y, Z)) == 0
Test Passed

julia> @test (@allocated broadcast!(f, Q, X, Y, Z)) == 0
Test Passed

So I just need to figure out how to fix those cache misses without affecting inlining...
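
For anyone wanting to try this style of check without the gist, here is a minimal sketch of an allocation measurement for an in-place broadcast. It uses small dense vectors as a simplifying assumption, whereas the tests discussed above use the sparse setup from the linked gist, so the numbers are not comparable. The names `add3!` and `measure_add3` are hypothetical. The measurement runs inside a function after a warm-up call so that compilation and non-constant globals do not inflate the count.

```julia
# In-place three-argument broadcast, wrapped so it can be compiled ahead of measuring.
add3!(Q, X, Y, Z) = broadcast!(+, Q, X, Y, Z)

function measure_add3()
    X = [1.0, 2.0, 3.0]
    Y = [0.5, 0.5, 0.5]
    Z = [10.0, 20.0, 30.0]
    Q = similar(X)
    add3!(Q, X, Y, Z)                      # warm-up call: compile before measuring
    bytes = @allocated add3!(Q, X, Y, Z)   # bytes allocated by the second call
    return Q, bytes
end

Q, bytes = measure_add3()
@assert Q == [11.5, 22.5, 33.5]
println("allocated bytes: ", bytes)
```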

@jrevels jrevels force-pushed the jr/pleasegodmakeitstop branch from 3424b31 to 26911c1 Compare May 3, 2018 17:46

jrevels commented May 3, 2018

Okay, I just pushed a rewrite of the cache lookup fix which passes both the new compiler tests and the MWE above.

@vtjnash points out that the previous cache lookup fix also fixed another problem that is currently on master: the argexprs passed to inline_as_constant(linfo.inferred_const, argexprs, sv, invoke_data) should be pre-varargs-to-tuple-type rewrite (what is currently called argexprs0 in the linked snippet). However, this fix apparently exposed a different issue in the optimizer, which is what caused the sparse broadcast test breakage.

I'm not going to consider fixing this issue necessary for merging this PR, since the issue already exists on master, but it might be worth looking into later.

jrevels commented May 3, 2018

Okay, tests all pass locally now. This PR is basically done, so I'm going to merge once CI turns up green unless there are any objections.

@KristofferC

Worth running Nanosoldier?

jrevels commented May 3, 2018

@nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@@ -5,6 +5,7 @@ const EMPTY_VECTOR = Vector{Any}()
 mutable struct InferenceResult
     linfo::MethodInstance
     args::Vector{Any}
+    vargs::Vector{Any}
Member

This field could use an explanatory comment.

- Store varargs type information in the InferenceResult object, such that the info can be used during inference/optimization

- Hack in a more precise return type for getfield of a vararg tuple. Ideally, we would handle this by teaching inference to track the types of the individual fields of a Tuple, which would make this unnecessary, but until then, this hack is helpful.

- Spoof parents as well as children during recursion limiting, so that higher degree cycles are appropriately spoofed

- A broadcast test marked as broken is now no longer broken, presumably due to the optimizations in this commit

- Fix relationship between depth/mindepth in limit_type_size/is_derived_type. The relationship should have been inverse over the domain in which they overlap, but was not maintained consistently. An example of a problematic case was:

t = Tuple{X,X} where X<:Tuple{Tuple{Int64,Vararg{Int64,N} where N},Tuple{Int64,Vararg{Int64,N} where N}}
c = Tuple{X,X} where X<:Tuple{Int64,Vararg{Int64,N} where N}

because is_derived_type was computing the depth of usage rather than the depth of definition. This change thus makes the depth/mindepth calculations more consistent, and causes the limiting heuristic to return strictly wider types than it did before.

- Move the optimizer's "varargs types to tuple type" rewrite to after cache lookup. Inference is populating the InferenceResult cache using the varargs form, so the optimizer needs to do the lookup before rewriting the atypes in order to avoid cache misses.

Co-authored-by: Jameson Nash <vtjnash@users.noreply.github.com>
Co-authored-by: Keno Fischer <keno@alumni.harvard.edu>
@jrevels jrevels force-pushed the jr/pleasegodmakeitstop branch from 1a9f8e8 to b4b4d21 Compare May 4, 2018 14:25

jrevels commented May 4, 2018

Added the requested comment. Investigated the benchmark regressions, none seemed significant. Even the 3.65x one showed no regression locally, and had the same @code_typed output here vs. master. I guess nanosoldier is really noisy these days...

CI was totally green (besides an unrelated Windows timeout) before I added the requested comment, so I'll go ahead and merge (I did rebuild and run tests locally just to be absolutely sure, but of course adding a comment didn't break anything).

@jrevels jrevels merged commit 429a885 into master May 4, 2018
@jrevels jrevels deleted the jr/pleasegodmakeitstop branch May 4, 2018 14:50
Keno pushed a commit that referenced this pull request May 6, 2018
* fix InferenceResult cache lookup for new optimizer

* utilize cached vararg type info in new inlining pass

* fix test to work with new optimizer

* fix predicate for exploiting cached varargs type info

* atypes no longer needs to be modified to match cached types

* don't require varargs to be in last position to exploit cached type info

* restore missing isva boolean
aviatesk added a commit that referenced this pull request Oct 1, 2021
- inlining.jl: `UnionSplitSignature` (and its `SimpleCartesian` helper)
  is no longer used
- abstractinterpretation.jl: the use of `precise_container_type` seems
  to be introduced in #26826, but I think `getfield_tfunc` at this
  moment is precise enough to propagate elements of constant tuples.
aviatesk added a commit that referenced this pull request Oct 1, 2021
- inferenceresult.jl: just dead allocations
- inlining.jl: `UnionSplitSignature` (and its `SimpleCartesian` helper)
  is no longer used
- abstractinterpretation.jl: the use of `precise_container_type` seems
  to be introduced in #26826, but I think `getfield_tfunc` at this
  moment is precise enough to propagate elements of constant tuples.
aviatesk added a commit that referenced this pull request Oct 2, 2021
- inferenceresult.jl: just dead allocations
- inlining.jl: `UnionSplitSignature` (and its `SimpleCartesian` helper)
  is no longer used
- abstractinterpretation.jl: the use of `precise_container_type` seems
  to be introduced in #26826, but `getfield_tfunc` at this
  moment is precise enough to propagate elements of constant tuples.
LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Feb 22, 2022
- inferenceresult.jl: just dead allocations
- inlining.jl: `UnionSplitSignature` (and its `SimpleCartesian` helper)
  is no longer used
- abstractinterpretation.jl: the use of `precise_container_type` seems
  to be introduced in JuliaLang#26826, but `getfield_tfunc` at this
  moment is precise enough to propagate elements of constant tuples.
LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Mar 8, 2022
- inferenceresult.jl: just dead allocations
- inlining.jl: `UnionSplitSignature` (and its `SimpleCartesian` helper)
  is no longer used
- abstractinterpretation.jl: the use of `precise_container_type` seems
  to be introduced in JuliaLang#26826, but `getfield_tfunc` at this
  moment is precise enough to propagate elements of constant tuples.