Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AArch64] Add lowering for
@llvm.experimental.vector.compress
#101015[AArch64] Add lowering for
@llvm.experimental.vector.compress
#101015Changes from 6 commits
24780e8
a01b778
7c11026
76a15b2
fb3759f
e27fcd1
f1daac6
119e98d
78f8ad7
c9af6f2
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
VP_MERGE are not really supported or encouraged by the AArch64 backend. Is there an alternative we can emit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what would be the AArch64-way to express this logic? I copied it from a RICSV example that came up in the original discussion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does 'corresponding lane' mean in this case? If the mask is
<1, 0, 0, 1>
, would the passthru for the zero'ed lanes expected to be<_, _, p, p>
or<_, p, p, _>
(where 'p' means the passthru value and '_' for don't care)If the former, then I guess you could do a popcount of the predicate, create a mask from that, and then do a vector select?
This also makes me wonder, would it be better to define the intrinsic to make the other lanes undefined, rather than adding a passthru parameter to the intrinsic? That would make the operation easier to codegen, and we can use existing intrinsics to implement the passthru behaviour.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The passthru exists because it's useful for some combinations of target/passthru value. For SVE in particular, for a non-zero passthru, we need to explicitly construct a mask, but other targets support it directly. This was discussed in #92289.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sdesmalen-arm The "corresponding lanes" are the remainder, used to fill up empty slots in the output. So
vec=<a, b, c, d>, mask=<1, 0, 0, 1>, passthru=<w, x, y, z>
would result in<a, d, y, z>
. In your example, it would be<_, _, p, p>
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@davemgreen using VP_MERGE results in the following assembly (in one of the tests)
So it is doing the right thing. I could manually add the instructions instead of using VP_MERGE, but I'm not sure that makes sense. What would you suggest to do here?