Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compiler-rt does not define aarch64 outline atomics #10086

Closed
nitsky opened this issue Nov 1, 2021 · 12 comments · Fixed by #11828
Closed

compiler-rt does not define aarch64 outline atomics #10086

nitsky opened this issue Nov 1, 2021 · 12 comments · Fixed by #11828
Labels
arch-aarch64 64-bit ARM bug Observed behavior contradicts documented or intended behavior enhancement Solving this issue will likely involve adding new logic or components to the codebase.
Milestone

Comments

@nitsky
Copy link

nitsky commented Nov 1, 2021

Zig Version

0.9.0-dev.1551+8346e011c

Steps to Reproduce

On aarch64 linux:

#include <stdatomic.h>
#include <stdio.h>

int main() {
  atomic_int atomic_counter = 0;
  atomic_counter++;
  printf("%u\n", atomic_counter);
}
  1. gcc main.c
  2. zig cc main.c
  3. gcc -moutline-atomics main.c
  4. zig cc -moutline-atomics main.c

Expected Behavior

I expect all variations to compile and run correctly.

Actual Behavior

zig cc -moutline-atomics main.c fails to compile:

ld.lld: error: undefined symbol: __aarch64_ldadd4_acq_rel

I discovered this issue while attempting to cross compile Rust to aarch64-linux with zig cc. Rust enables outline atomics by default on aarch64-linux and currently prohibits disabling them.

@nitsky nitsky added the bug Observed behavior contradicts documented or intended behavior label Nov 1, 2021
@andrewrk andrewrk added arch-aarch64 64-bit ARM zig cc Zig as a drop-in C compiler feature labels Nov 20, 2021
@andrewrk andrewrk added this to the 0.9.0 milestone Nov 20, 2021
@andrewrk
Copy link
Member

Thanks for the report. How is the __aarch64_ldadd4_acq_rel symbol expected to be provided? It does not appear to be included in LLVM's compiler-rt. Is that a gcc extension?

@andrewrk andrewrk added the enhancement Solving this issue will likely involve adding new logic or components to the codebase. label Nov 27, 2021
@andrewrk andrewrk modified the milestones: 0.9.0, 0.11.0 Nov 27, 2021
@nitsky
Copy link
Author

nitsky commented Nov 29, 2021

@andrewrk I see that outline atomics were added to compiler-rt a year ago: https://reviews.llvm.org/D91156.

@andrewrk
Copy link
Member

Ah, I see the symbol names are constructed with string concatenation. Missed that in my grepping. Thanks! So the next step to resolve this issue is to port more of LLVM's compiler-rt into zig's.

@andrewrk andrewrk modified the milestones: 0.11.0, 0.10.0 Nov 29, 2021
@nitsky
Copy link
Author

nitsky commented Nov 29, 2021

Great, thanks. For those running into this issue with Rust, rust-lang/rust#90623 was recently merged which allows you to disable outline atomics with -C target-feature=-outline-atomics.

@irishgreencitrus
Copy link

I have this issue compiling https://github.com/mattnite/gyro on aarch64 Raspberry Pi OS, with just normal zig build

@PiotrSikora
Copy link
Contributor

zig c++ seems to be completely broken when cross-compiling to aarch64-linux-*, because libc++ relies on atomics.

$ cat hello.cpp
#include <iostream>

int main() {
    std::cout << "Hello, World!" << std::endl;
    return 0;
}
$ zig c++ -target aarch64-linux-musl hello.cpp
ld.lld: error: undefined symbol: __aarch64_ldadd8_acq_rel
>>> referenced by cxa_exception.cpp
[...]
ld.lld: error: undefined symbol: __aarch64_swp8_acq_rel
>>> referenced by cxa_handlers.cpp
[...]
ld.lld: error: undefined symbol: __aarch64_ldadd4_acq_rel
>>> referenced by ios.cpp
[...]
ld.lld: error: undefined symbol: __aarch64_ldadd8_relax
>>> referenced by memory.cpp
[...]
ld.lld: error: undefined symbol: __aarch64_cas8_acq_rel
>>> referenced by memory.cpp

@PiotrSikora
Copy link
Contributor

My issue above results in the same errors, but it's actually unrelated. I filed a new issue: #10777.

@andrewrk andrewrk modified the milestones: 0.10.0, 0.11.0 Apr 16, 2022
@prenex
Copy link

prenex commented May 25, 2022

I tried to use zig to cross-compile my codebase easier so that I can develop on my pinephone while walking but have parts of the project pre-cross-compiled on a stronger machine.

Sadly these kind of errors still crop up for me when the code uses std::shared_ptr. Are there any workarounds for a real c++ codebase or some things that I can turn off so some simpler code is generated or anything?

messense added a commit to rust-cross/cargo-zigbuild that referenced this issue Sep 15, 2022
@andrewrk andrewrk removed the zig cc Zig as a drop-in C compiler feature label Nov 20, 2022
@andrewrk andrewrk changed the title zig cc does not define aarch64 outline atomics compiler-rt does not define aarch64 outline atomics Nov 20, 2022
@andrewrk
Copy link
Member

I suspect that many people running into this issue are running into it because native CPU detection for aarch64 is failing to correctly identify the CPU features available in some cases. Here I have a Neoverse-N1 CPU:

andy@Debian-bullseye-latest-arm64-base:~/zig/build-release$ lscpu
Architecture:                    aarch64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
CPU(s):                          80
On-line CPU(s) list:             0-79
Thread(s) per core:              1
Core(s) per socket:              80
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       ARM
Model:                           1
Model name:                      Neoverse-N1
Stepping:                        r3p1
Frequency boost:                 disabled
CPU max MHz:                     3000.0000
CPU min MHz:                     1000.0000
BogoMIPS:                        50.00
L1d cache:                       5 MiB
L1i cache:                       5 MiB
L2 cache:                        80 MiB
NUMA node0 CPU(s):               0-79
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Mmio stale data:   Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:        Mitigation; __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; CSV2, BHB
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags:                           fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid 
                                 asimdrdm lrcpc dcpop asimddp ssbs

You can see that it is Neoverse-N1, and that it has the atomics flag which I believe corresponds to lse. Zig does have awareness of this cpu:

https://github.com/ziglang/zig/blob/master/lib/std/target/aarch64.zig#L2091-L2106

However, zig build-exe --show-builtin does not pick it up:

pub const cpu: std.Target.Cpu = .{
    .arch = .aarch64,
    .model = &std.Target.aarch64.cpu.generic,
    .features = std.Target.aarch64.featureSet(&[_]std.Target.aarch64.Feature{
        .ete,
        .fp_armv8,
        .fuse_adrp_add,
        .fuse_aes,
        .neon,
        .trbe,
        .use_postra_scheduler,
    }),
};

It is precisely because it sees the CPU as "generic" that it does not enable atomics, causing this issue.

To be clear, compiler_rt still needs to be augmented with these functions. However, there is a second, perhaps more important issue, which is that native CPU detection is not working.

Another clue here is that /proc/cpuinfo is missing the model name, even though lscpu is producing it:

processor	: 76
BogoMIPS	: 50.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x3
CPU part	: 0xd0c
CPU revision	: 1

This is perhaps an indicator that the implementation of lscpu does something different than looking at /proc/cpuinfo and we should read its source code to see what that logic is.

@LemonBoy
Copy link
Contributor

Add .m64 = &A64.neoverse_n1 here, checking if there are any other missing entries is left as an exercise for the reader.

@andrewrk
Copy link
Member

andrewrk commented Nov 20, 2022

I'm confused why zig-bootstrap is able to produce aarch64-linux binaries without running into this issue. In this case, it targets the generic cpu, which, as noted in #10086 (comment), is missing the lse feature. I would therefore expect to see undefined symbol errors when linking.

Edit: I understand everything now, thanks to this comment:

Outline atomics were added with gcc 9.3.1 and turned on by default in gcc 10.1. Consequently most of distributions had libgcc with outline atomics already.
Besides Linux, Android will utilize them as well. Don't know about iOs and MacOs, I guess they are compiling with LSE enabled, so outline atomics should not affect them.
As for benchmarks, I rely on investigations completed during gcc outline atomics enablement work.
Choice of solution ../gcc/libgcc/config/aarch64/lse.S:

The problem that we are trying to solve is operating system deployment
of ARMv8.1-Atomics, also known as Large System Exensions (LSE).

There are a number of potential solutions for this problem which have
been proposed and rejected for various reasons.  To recap:

(1) Multiple builds.  The dynamic linker will examine /lib64/atomics/
if HWCAP_ATOMICS is set, allowing entire libraries to be overwritten.
However, not all Linux distributions are happy with multiple builds,
and anyway it has no effect on main applications.
(2) IFUNC.  We could put these functions into libgcc_s.so, and have
a single copy of each function for all DSOs.  However, ARM is concerned
that the branch-to-indirect-branch that is implied by using a PLT,
as required by IFUNC, is too much overhead for smaller cpus.
(3) Statically predicted direct branches.  This is the approach that
is taken here.  These functions are linked into every DSO that uses them.
All of the symbols are hidden, so that the functions are called via a
direct branch.  The choice of LSE vs non-LSE is done via one byte load
followed by a well-predicted direct branch.  The functions are compiled
separately to minimize code size.

Performance impact:

Various members in the Arm ecosystem have measured the performance impact of this indirection on a diverse set of systems and we were happy to find out that it was minimal compared to the benefit of using the LSE instructions for better scalability at large core counts.

@tmm1
Copy link

tmm1 commented Feb 6, 2023

Thanks for implementing these!

When cross compiling (with zig cc -target aarch64-linux-gnu), how can I force the wrappers (such as __aarch64_ldadd4_acq_rel) to be included in my binary regardless of if they're being used? My program uses dlopen on a .so file during runtime and that shared library expects these symbols to be present.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arch-aarch64 64-bit ARM bug Observed behavior contradicts documented or intended behavior enhancement Solving this issue will likely involve adding new logic or components to the codebase.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants