rlibc's memcpy get miscompiled #31505

Closed
Zoxc opened this issue Feb 9, 2016 · 7 comments · Fixed by #31791
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.

Comments

@Zoxc
Contributor

Zoxc commented Feb 9, 2016

rlibc's memcpy gets miscompiled with -O --target=x86_64-sun-solaris -C target-feature=-mmx,-sse,-sse2
At 0x18, lea esi, [rsi+0] zeroes the upper 32 bits of rsi, which holds a 64-bit pointer.
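
To illustrate why that instruction is fatal, here is a minimal sketch that models the register write in plain Rust. On x86-64, writing a 32-bit register such as esi zero-extends the result into the full 64-bit register. The function names below are hypothetical and exist only for this illustration; they are not part of rlibc or LLVM.

```rust
/// Models an x86-64 write to a 32-bit register: the value is truncated
/// to 32 bits and the upper half of the 64-bit register becomes zero.
pub fn write_32bit_register(value: u64) -> u64 {
    (value as u32) as u64
}

/// Models `lea esi, [rsi+0]` (hypothetical helper name): the address
/// rsi + 0 is computed, then stored through the 32-bit register esi.
pub fn lea_esi_rsi_plus_0(rsi: u64) -> u64 {
    write_32bit_register(rsi.wrapping_add(0))
}
```

A source pointer below 4 GiB survives this unchanged, which is presumably why the bug is hard to spot; any pointer with bits set above bit 31 is silently replaced by a different address.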

Curiously, some targets have the loop unrolled. I don't see any reason for this.

Assembler output of various targets:

x86_64-rumprun-netbsd:
x86_64-sun-solaris:

.text.memcpy:0000000000000010 memcpy          proc near
.text.memcpy:0000000000000010                 test    rdx, rdx
.text.memcpy:0000000000000013                 jz      short loc_2F
.text.memcpy:0000000000000015                 mov     rax, rdi
.text.memcpy:0000000000000018                 lea     esi, [rsi+0]
.text.memcpy:000000000000001F                 nop
.text.memcpy:0000000000000020
.text.memcpy:0000000000000020 loc_20:                                 ; CODE XREF: memcpy+1D↓j
.text.memcpy:0000000000000020                 mov     cl, [rsi]
.text.memcpy:0000000000000022                 mov     [rax], cl
.text.memcpy:0000000000000024                 inc     rax
.text.memcpy:0000000000000027                 inc     rsi
.text.memcpy:000000000000002A                 dec     rdx
.text.memcpy:000000000000002D                 jnz     short loc_20
.text.memcpy:000000000000002F
.text.memcpy:000000000000002F loc_2F:                                 ; CODE XREF: memcpy+3↑j
.text.memcpy:000000000000002F                 mov     rax, rdi
.text.memcpy:0000000000000032                 retn
.text.memcpy:0000000000000032 memcpy          endp
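
For orientation, the loop at loc_20 corresponds to a plain byte-by-byte copy. The following is a rough sketch of such a routine, not rlibc's actual source; the name bytewise_copy is hypothetical and was chosen to avoid clashing with the real memcpy symbol.

```rust
/// Byte-wise copy in the spirit of a freestanding memcpy (illustrative only).
///
/// # Safety
/// `src` must be valid for reads of `n` bytes, `dest` must be valid for
/// writes of `n` bytes, and the two regions must not overlap.
pub unsafe fn bytewise_copy(dest: *mut u8, src: *const u8, n: usize) -> *mut u8 {
    let mut i = 0;
    while i < n {
        // One load and one store per iteration, like `mov cl, [rsi]`
        // followed by `mov [rax], cl` in the listing.
        unsafe {
            *dest.add(i) = *src.add(i);
        }
        i += 1;
    }
    // memcpy returns the destination pointer (`mov rax, rdi` at loc_2F).
    dest
}
```

In the miscompiled listing the loop itself is fine; the damage is done before it starts, when the source pointer in rsi is truncated.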

x86_64-unknown-linux-gnu:
x86_64-unknown-linux-musl:
x86_64-unknown-freebsd:

.text.memcpy:0000000000000010                 public memcpy
.text.memcpy:0000000000000010 memcpy          proc near
.text.memcpy:0000000000000010                 test    rdx, rdx
.text.memcpy:0000000000000013                 jz      loc_A4
.text.memcpy:0000000000000019                 lea     r8, [rdx-1]
.text.memcpy:000000000000001D                 xor     ecx, ecx
.text.memcpy:000000000000001F                 test    dl, 7
.text.memcpy:0000000000000022                 jz      short loc_3E
.text.memcpy:0000000000000024                 mov     r9d, edx
.text.memcpy:0000000000000027                 and     r9d, 7
.text.memcpy:000000000000002B                 xor     ecx, ecx
.text.memcpy:000000000000002D                 nop     dword ptr [rax]
.text.memcpy:0000000000000030
.text.memcpy:0000000000000030 loc_30:                                 ; CODE XREF: memcpy+2C↓j
.text.memcpy:0000000000000030                 mov     al, [rsi+rcx]
.text.memcpy:0000000000000033                 mov     [rdi+rcx], al
.text.memcpy:0000000000000036                 inc     rcx
.text.memcpy:0000000000000039                 cmp     r9, rcx
.text.memcpy:000000000000003C                 jnz     short loc_30
.text.memcpy:000000000000003E
.text.memcpy:000000000000003E loc_3E:                                 ; CODE XREF: memcpy+12↑j
.text.memcpy:000000000000003E                 cmp     r8, 7
.text.memcpy:0000000000000042                 jb      short loc_A4
.text.memcpy:0000000000000044                 sub     rdx, rcx
.text.memcpy:0000000000000047                 lea     r8, [rdi+rcx+7]
.text.memcpy:000000000000004C                 lea     rcx, [rsi+rcx+7]
.text.memcpy:0000000000000051                 db      66h, 66h, 66h, 66h, 66h, 66h, 2Eh
.text.memcpy:0000000000000051                 nop     dword ptr [rax+rax+00000000h]
.text.memcpy:0000000000000060
.text.memcpy:0000000000000060 loc_60:                                 ; CODE XREF: memcpy+92↓j
.text.memcpy:0000000000000060                 mov     al, [rcx-7]
.text.memcpy:0000000000000063                 mov     [r8-7], al
.text.memcpy:0000000000000067                 mov     al, [rcx-6]
.text.memcpy:000000000000006A                 mov     [r8-6], al
.text.memcpy:000000000000006E                 mov     al, [rcx-5]
.text.memcpy:0000000000000071                 mov     [r8-5], al
.text.memcpy:0000000000000075                 mov     al, [rcx-4]
.text.memcpy:0000000000000078                 mov     [r8-4], al
.text.memcpy:000000000000007C                 mov     al, [rcx-3]
.text.memcpy:000000000000007F                 mov     [r8-3], al
.text.memcpy:0000000000000083                 mov     al, [rcx-2]
.text.memcpy:0000000000000086                 mov     [r8-2], al
.text.memcpy:000000000000008A                 mov     al, [rcx-1]
.text.memcpy:000000000000008D                 mov     [r8-1], al
.text.memcpy:0000000000000091                 mov     al, [rcx]
.text.memcpy:0000000000000093                 mov     [r8], al
.text.memcpy:0000000000000096                 add     r8, 8
.text.memcpy:000000000000009A                 add     rcx, 8
.text.memcpy:000000000000009E                 add     rdx, 0FFFFFFFFFFFFFFF8h
.text.memcpy:00000000000000A2                 jnz     short loc_60
.text.memcpy:00000000000000A4
.text.memcpy:00000000000000A4 loc_A4:                                 ; CODE XREF: memcpy+3↑j
.text.memcpy:00000000000000A4                                         ; memcpy+32↑j
.text.memcpy:00000000000000A4                 mov     rax, rdi
.text.memcpy:00000000000000A7                 retn
.text.memcpy:00000000000000A7 memcpy          endp
.text.memcpy:00000000000000A7
.text.memcpy:00000000000000A7 _text_memcpy    ends
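
The shape of the unrolled listing can be sketched in Rust as follows: the n mod 8 leftover bytes are copied first, one at a time (loc_30), and the rest is copied eight bytes per iteration (loc_60). This is a hand-written approximation over slices for readability, with a hypothetical function name; it is not what the compiler was given as input.

```rust
/// Copies `src` into `dest` using the same structure as the unrolled
/// listing: a byte-wise prologue for the remainder, then 8 bytes per pass.
/// Panics if the slices differ in length.
pub fn copy_unrolled_by_8(dest: &mut [u8], src: &[u8]) {
    assert_eq!(dest.len(), src.len());
    let n = src.len();

    // Prologue (loc_30): copy the n & 7 leftover bytes one at a time.
    let head = n & 7;
    let mut i = 0;
    while i < head {
        dest[i] = src[i];
        i += 1;
    }

    // Main loop (loc_60): the remaining count is a multiple of 8.
    while i < n {
        dest[i] = src[i];
        dest[i + 1] = src[i + 1];
        dest[i + 2] = src[i + 2];
        dest[i + 3] = src[i + 3];
        dest[i + 4] = src[i + 4];
        dest[i + 5] = src[i + 5];
        dest[i + 6] = src[i + 6];
        dest[i + 7] = src[i + 7];
        i += 8;
    }
}
```

Both variants move one byte per load/store pair; the unrolled form only reduces loop overhead per byte.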
@Zoxc
Contributor Author

Zoxc commented Feb 9, 2016

I've filed an LLVM bug for this: https://llvm.org/bugs/show_bug.cgi?id=26554

Aatch added the A-LLVM label Feb 9, 2016
@bluss
Member

bluss commented Feb 10, 2016

Good job! What a nightmare of a bug to identify.

@semarie
Contributor

semarie commented Feb 17, 2016

Any feedback from LLVM upstream? This bug completely breaks the rustc build under OpenBSD.

@Zoxc
Contributor Author

Zoxc commented Feb 20, 2016

This is fixed only in the 3.8 release branch: llvm-mirror/llvm@fdf40be
We should update Rust's LLVM to that version.

alexcrichton added a commit to alexcrichton/rust that referenced this issue Feb 20, 2016
Looks like they picked up a bunch of fixes, one of which is even quite relevant
to us!

Closes rust-lang#31505
@MagaTailor

I wonder how you managed to disable sse and sse2 in LLVM on x86_64 - is there a patch for that?
I'd expect an error like this one:
LLVM ERROR: SSE2 register return with SSE2 disabled

@Zoxc
Contributor Author

Zoxc commented Feb 20, 2016

@petevine Try to avoid passing floating-point values to or from functions.

@MagaTailor

Ah, thanks - does soft-float work on x86_64 presently?

bors added a commit that referenced this issue Feb 21, 2016
Looks like they picked up a bunch of fixes, one of which is even quite relevant
to us!

Closes #31505