[X86][AVX] lowerShuffleAsLanePermuteAndPermute - incomplete lane shuffle mask #40076
assigned to @RKSimon
The issue looks like it first appeared in rL344446, so it is a regression in the 8.0 branch.
https://gcc.godbolt.org/z/9BAQca contains the generic shuffle as well as the bugged constant-folding version.
Test case added at rL354034.
That should've been: .LCPI0_0:
Fixed in trunk at rL354117. @Hans - please give it a while and then cherry-pick r354034 + r354117.
Thanks! Merged them together in r354260. Please let me know if there are any follow-ups.
Extended Description
https://gcc.godbolt.org/z/tLJJE0
define <8 x i32> @shuffle_v8i32_0dcd3f14(<8 x i32> %a, <8 x i32> %b) {
%shuffle = shufflevector <8 x i32> %a, <8 x i32> %b, <8 x i32> <i32 0, i32 13, i32 12, i32 13, i32 3, i32 15, i32 1, i32 4>
ret <8 x i32> %shuffle
}
define <8 x i32> @shuffle_v8i32_0dcd3f14_constant(<8 x i32> %a0) {
%res = shufflevector <8 x i32> %a0, <8 x i32> <i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 16>, <8 x i32> <i32 0, i32 13, i32 12, i32 13, i32 3, i32 15, i32 1, i32 4>
ret <8 x i32> %res
}
When the shuffle gets lowered, the constant argument is incorrectly folded. This appears to be because the shuffle mask that (correctly) lowers to vperm2f128 contains undef elements in the wrong places, allowing undef propagation that leads to incorrect constant folding.
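For reference, the shufflevector semantics for the constant test case can be sketched as follows (this is an illustrative Python model, not LLVM code): mask elements 0-7 select from %a0 and 8-15 select from the constant second operand, so it shows exactly which constants a correct fold must produce.

```python
# Model of shufflevector for shuffle_v8i32_0dcd3f14_constant.
# Lanes < 8 come from the unknown %a0; lanes >= 8 from the constant vector.
b = [9, 10, 11, 12, 13, 14, 15, 16]      # constant second operand
mask = [0, 13, 12, 13, 3, 15, 1, 4]      # shuffle mask from the IR

def lane(i):
    # Symbolic placeholder for %a0 lanes, concrete value for constant lanes.
    return f"a0[{i}]" if i < 8 else b[i - 8]

result = [lane(i) for i in mask]
print(result)  # ['a0[0]', 14, 13, 14, 'a0[3]', 16, 'a0[1]', 'a0[4]']
```

In particular, result lane 5 must be the constant 16 (mask index 15 selects element 7 of the second operand), which is the lane the miscompiled constant pool gets wrong below.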
shuffle_v8i32_0dcd3f14:
vextractf128 $1, %ymm0, %xmm2
vblendps $1, %xmm2, %xmm0, %xmm2 # xmm2 = xmm2[0],xmm0[1,2,3]
vpermilps $23, %xmm2, %xmm2 # xmm2 = xmm2[3,1,1,0]
vinsertf128 $1, %xmm2, %ymm0, %ymm0
vperm2f128 $17, %ymm0, %ymm1, %ymm1 # ymm1 = ymm1[2,3,2,3]
vpermilpd $4, %ymm1, %ymm1 # ymm1 = ymm1[0,0,3,2]
vblendps $209, %ymm0, %ymm1, %ymm0 # ymm0 = ymm0[0],ymm1[1,2,3],ymm0[4],ymm1[5],ymm0[6,7]
retq
.LCPI0_0:
.quad 60129542157 # 0x0000000E0000000D
.quad 60129542157 # 0x0000000E0000000D
.zero 8 <-- INCORRECT - should be 0x0000000F00000000
.quad 60129542157 # 0x0000000E0000000D
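To see the miscompile concretely, the constant pool can be decoded into 32-bit lanes and checked against the blend mask (again a hedged Python sketch, not LLVM code): vblendps $46 takes lanes 1, 2, 3 and 5 from the memory operand, and lane 5 lands in the ".zero 8" quad, so the result gets 0 where the original shufflevector requires the constant 16.

```python
# Decode the folded .LCPI0_0 constant pool (little-endian: the low dword
# of each quad is the even lane) and apply the vblendps $46 lane selection.
quads = [0x0000000E0000000D, 0x0000000E0000000D, 0x0, 0x0000000E0000000D]
lanes = []
for q in quads:
    lanes += [q & 0xFFFFFFFF, q >> 32]
print(lanes)        # [13, 14, 13, 14, 0, 0, 13, 14]

imm = 46            # 0b00101110 -> blend takes lanes 1, 2, 3, 5 from memory
mem_lanes = [i for i in range(8) if imm & (1 << i)]
print(mem_lanes)    # [1, 2, 3, 5]
print(lanes[5])     # 0 -- but the IR requires 16 in this lane
```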
shuffle_v8i32_0dcd3f14_constant:
vextractf128 $1, %ymm0, %xmm1
vblendps $1, %xmm1, %xmm0, %xmm1 # xmm1 = xmm1[0],xmm0[1,2,3]
vpermilps $23, %xmm1, %xmm1 # xmm1 = xmm1[3,1,1,0]
vinsertf128 $1, %xmm1, %ymm0, %ymm0
vblendps $46, .LCPI0_0(%rip), %ymm0, %ymm0 # ymm0 = ymm0[0],mem[1,2,3],ymm0[4],mem[5],ymm0[6,7]
retq
Reduced from an internal fuzz test.