RFC: Fix #31392, unaliasing of broadcast arguments against destinations with repeated indices #31407
Nice. I assume if we merge this, we probably don't need/want #31391, since this PR more accurately detects when a copy is needed.
Yes, that's exactly right (edit: with the caveat that other broadcast implementations in packages need to be similarly careful about aliasing behaviors, but they need to be careful about aliasing in any case).
Triage is in favor here given the following observations:
Let's see if I missed any obvious cases where a copy is now made: @nanosoldier
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan
Force-pushed from 1ade0fb to d64f74c.
Is this ready to merge as-is? If so, we might want to consider squeezing it into 1.2.
I'm not terribly thrilled with this because it is making lots of unnecessary preventative copies for custom abstract array types. Perhaps it should only make the preventative copy for the known array types that can self-alias (e.g., just |
…th repeated indices When introducing the new broadcasting implementation (which came alongside greater semantic guarantees with regards to aliasing source and destination), I found that we needed a performance optimization for the common case where the destination was `===` to an argument. This is quite common due to the `.op=` syntax and indeed is safe in most cases as the iteration order between the destination and the source is matched. This performance optimization is not safe, however, in the case that the destination has multiple locations that refer to the same memory.
but still provide the hook for arrays to improve
Force-pushed from d64f74c to 55e10b2.
Ok, we've been letting perfect be the enemy of the good here for far too long. Let's be less conservative (and default to not unaliasing self broadcasts) but still allow SubArrays and other custom array types to catch and fix this situation. I think this is a good compromise for now.
"""
potentially_self_aliased(::DenseArray) = false
potentially_self_aliased(A::StridedArray) = any(==(0), strides(A))
potentially_self_aliased(A::SubArray) = any(!allunique, A.indices)
potentially_self_aliased(A::SubArray) = any(map(!allunique, A.indices))
Just a note for posterity: `any(map(!allunique, A.indices))` does better compile-time evaluation with heterogeneous tuples and constant-folds to `return 0` more frequently.
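For readers skimming the thread, here is a minimal sketch of what the `SubArray` check looks for. The name `has_repeated_indices` is hypothetical (it is not part of Base or of this PR), and the sketch uses the public `parentindices` accessor instead of the `indices` field.

```julia
# Hypothetical helper mirroring the SubArray check discussed above.
# `parentindices(A)` returns the tuple of indices the view was built with.
has_repeated_indices(A::SubArray) = any(map(!allunique, parentindices(A)))

x = [10, 20, 30]
has_repeated_indices(view(x, [1, 1, 2]))  # true: two view elements share x[1]
has_repeated_indices(view(x, 1:2))        # false: every index is distinct
```

Mapping over the index tuple first and then calling `any` on the resulting tuple of `Bool`s is the form the review comment prefers for heterogeneous index tuples.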
GTG then?
Sorry to bump, but were there reasons preventing merging a fix for this?
bumpity
Marking for triage to discuss if there's actually anything preventing this from being merged.
broadcast_unalias(::Nothing, src) = src

"""
potentially_self_aliased(A)
~~A year~~ Two years later, I'm not a big fan of this name. I'm guessing I used `potentially` as an attempt to say that `true`s are conclusive but `false`s are not... but I don't think it actually reads that way.
`mightalias` documents itself as a "conservative" test for sharing the same memory, but I don't even know what that means on its face. Is it more likely to have a false positive or a false negative?
In both cases, I used a different word than my usual go-to "maybe_foo" as an attempt to do some work in one direction or the other, but I don't think it was successful because I'm having a hard time piecing together what they mean right now. In short, here's what we have between these two systems:
| function | returns | wrong if... |
|---|---|---|
| `mightalias` | `true` | the arrays reference distinct subsets of the same memory region in a complicated manner |
| `mightalias` | `false` | a custom array has not tracked its internal field(s) with `dataids` and one of those aliases |
| `potentially_self_aliased` | `true` | an array goofed its implementation (the default definitions should not have false positives) |
| `potentially_self_aliased` | `false` | this is the default answer; a self-aliasing array didn't define a method on this obscure function |
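The `mightalias` rows can be illustrated with a small sketch. `Base.mightalias` and `Base.dataids` are unexported Base internals, so their behavior may differ between Julia versions; the `Wrapped` type is hypothetical and exists only to show the false-negative case, assuming the default `dataids` fallback is used for it.

```julia
x = [1, 2, 3]

# `Base.mightalias` is an unexported Base internal.
Base.mightalias(x, view(x, 1:2))   # true: the view shares x's memory
Base.mightalias(x, copy(x))        # false: the copy owns separate memory

# Hypothetical wrapper that does not extend `Base.dataids`, so the
# check cannot see that it holds a reference to `x`.
struct Wrapped{T} <: AbstractVector{T}
    data::Vector{T}
end
Base.size(w::Wrapped) = size(w.data)
Base.getindex(w::Wrapped, i::Int) = w.data[i]

Base.mightalias(x, Wrapped(x))     # false, even though both share memory
```

Defining `Base.dataids(w::Wrapped) = Base.dataids(w.data)` is the usual way for such a wrapper to opt in to alias detection.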
NumPy calls this `may_have_internal_overlap` with tri-state output (known to not overlap, known to overlap, and unknown).
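As a rough sketch of how a tri-state answer could look in Julia: the enum, its values, and the `internal_overlap` function below are all hypothetical names chosen for illustration. They are not part of Base, of this PR, or of NumPy.

```julia
# Hypothetical tri-state result for a self-aliasing query.
@enum InternalOverlap NoOverlap HasOverlap UnknownOverlap

internal_overlap(::AbstractArray) = UnknownOverlap   # conservative default
internal_overlap(::Array) = NoOverlap                # each element has its own memory
function internal_overlap(A::SubArray)
    # Repeated indices definitely alias; otherwise defer to the parent.
    any(map(!allunique, parentindices(A))) ? HasOverlap : internal_overlap(parent(A))
end

internal_overlap(view([1, 2, 3], [1, 1]))   # HasOverlap
internal_overlap(view([1, 2, 3], 1:2))      # NoOverlap
```

With three states, a caller can tell "definitely safe" apart from "nobody implemented this", which is the ambiguity the table above points out for a `false` result.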
@mbauman - I think given this change we should add |
Something that might be unrelated:
Returns true if multiple locations in `A` reference the same memory
"""
potentially_self_aliased(::DenseArray) = false
potentially_self_aliased(A::StridedArray) = any(==(0), strides(A))
We can do better than this, but doing better is NP-hard. Ref:
When introducing the new broadcasting implementation (which came alongside greater semantic guarantees with regards to aliasing source and destination), I found that we needed a performance optimization for the common case where the destination was `===` to an argument. This is quite common due to the `.op=` syntax and indeed is safe in most cases as the iteration order between the destination and the source is matched. This performance optimization is not safe, however, in the case that the destination has multiple locations that refer to the same memory.
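To see why matched iteration order stops being enough once destination slots share memory, here is a minimal sketch that uses plain loops as a stand-in for the broadcast kernel. It is not the Base implementation, and the variable names are illustrative only.

```julia
idx = [1, 1, 1]          # three destination slots that all refer to element 1

# Reading the source through the destination's memory: each iteration
# observes the previous iteration's write.
x = [1]
for i in idx
    x[i] = x[i] + 1
end
x                        # [4]

# Unaliasing first: copy the source values, then write.
y = [1]
src = y[idx]             # non-scalar indexing allocates a copy: [1, 1, 1]
for (k, i) in enumerate(idx)
    y[i] = src[k] + 1
end
y                        # [2]
```

The second loop still assigns into the same location three times, but every read sees the original value, which is the behavior the unaliasing pass is meant to guarantee.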
This just ensures our unaliasing pass is indeed doing what we want it to. You can still broadcast into these sorts of "self-aliasing" arrays (which will indeed repeatedly assign into the same location), but regardless of how you reference those destinations on the RHS (be it implicitly via `.op=`, with or without `@views`, explicitly on the RHS, etc.), this now ensures the RHS arguments are unaliased from the destination before performing the operation. I believe the following example from #31392 is a strong argument in favor of this fix:

Previously:
Clearly, this is unsustainable. With this PR the answer is 6 in all cases. I believe this is the correct answer because it matches how things behave with other views:
Marking this as both a bugfix and a minor change for triage as apparently some packages may have been depending upon this in certain circumstances.