
gf: improve ordering of operations based on performance estimates #36436

Merged
vtjnash merged 1 commit into master on Jun 26, 2020

Conversation

@vtjnash commented Jun 25, 2020

In the last commit (#36166), I didn't expect NamedTuple to hammer our
performance, but I was starting to notice performance problems when
calling constructors. It somewhat violates our best practices in two
common cases (both the constructor and the structdiff function dispatch
on non-leaf types instead of values). That was putting a lot of strain
on the cache, so it makes a good test case. Keeping those cases out of
the cache, and searching the cache in a more suitable order (significant
mainly for constructors, because there are always so many of them),
offsets that, and it probably makes us faster overall as a bonus because
of that effect!

Here's what I was measuring that highlighted this issue:

julia> @btime methods(==, (Int64, Int64)) 
   1.248 μs (18 allocations: 784 bytes) # before PR
  32.553 μs (18 allocations: 784 bytes) # master now
   1.310 μs (18 allocations: 784 bytes) # this PR
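
For reference, here's a minimal hypothetical sketch (not Base's actual structdiff or constructor code) of the dispatch pattern in question, where a method is keyed on a non-leaf NamedTuple{names} type rather than on a value:

# Hypothetical sketch: the second argument is the non-leaf type NamedTuple{bn},
# so every distinct set of names produces a distinct signature for the method cache.
function my_structdiff(a::NamedTuple{an}, ::Type{NamedTuple{bn}}) where {an, bn}
    kept = Tuple(setdiff(an, bn))
    return NamedTuple{kept}(a)
end

my_structdiff((a = 1, b = 2, c = 3), NamedTuple{(:b,)})  # (a = 1, c = 3)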

@JeffBezanson

Yeah, I see structdiff show up a lot in various profiles. I've thought about other ways to write it. One possibility is a getfields builtin. We could also pass in Val{names} instead of NamedTuple{names} if you think that would help. Any other ideas?
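
For comparison, a hedged sketch of what the Val{names} variant could look like (hypothetical, just to make the alternative concrete): the dropped names are passed as a concrete singleton value-type instead of a non-leaf NamedTuple type:

# Hypothetical sketch: the names to drop arrive as Val{bn}, a concrete singleton
# type, rather than as a NamedTuple{bn} UnionAll.
function my_structdiff(a::NamedTuple{an}, ::Val{bn}) where {an, bn}
    kept = Tuple(setdiff(an, bn))
    return NamedTuple{kept}(a)
end

my_structdiff((a = 1, b = 2, c = 3), Val((:b,)))  # (a = 1, c = 3)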

@vtjnash (author) commented Jun 25, 2020

Possibly: that'd help avoid the awkward <:Tuple constraint, though perhaps at the risk of introducing even more DataType objects. Regardless, this PR is an improvement, since I don't want other user types written like this (there are already a couple) slowing down other constructor calls.
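
To illustrate the <:Tuple constraint mentioned here (an assumed aside, runnable at the REPL): a bare NamedTuple{names} is still a UnionAll, NamedTuple{names, T} where T<:Tuple, while Val{names} is already a concrete singleton type:

# A bare NamedTuple{names} still has a free T<:Tuple parameter,
# so it is a UnionAll and not a concrete type.
@assert NamedTuple{(:a, :b)} isa UnionAll
@assert !isconcretetype(NamedTuple{(:a, :b)})
# Val{names} is fully specified, hence concrete.
@assert isconcretetype(Val{(:a, :b)})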

@vtjnash mentioned this pull request Jun 26, 2020
@vtjnash merged commit d762e8c into master Jun 26, 2020
@vtjnash deleted the jn/ml-matches-leaf-cache3 branch June 26, 2020 20:17
@JeffBezanson added the compiler:latency (Compiler latency) label Jul 30, 2020
simeonschaub pushed a commit to simeonschaub/julia that referenced this pull request Aug 11, 2020