[Bug]: paho.mqtt.cpp crashes with SEGV_ACCERR in make_shared when built with optimizations #2094

Open
RankoR opened this issue Oct 21, 2024 · 7 comments
@RankoR

RankoR commented Oct 21, 2024

Description

With any optimization level other than -O0, paho.mqtt.cpp crashes in shared_ptr-related code, for example here, or in message:

10-21 15:35:50.265  1624  1778 F libc    : Fatal signal 11 (SIGSEGV), code 2 (SEGV_ACCERR), fault addr 0x78c40b354cd0 in tid 1778 (com.mypackage), pid 1624 (com.mypackage)
10-21 12:35:51.630     0     0 E audit   : rate limit exceeded
10-21 15:35:50.469  2223  2223 F DEBUG   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
10-21 15:35:50.469  2223  2223 F DEBUG   : Build fingerprint: 'Android/sdk_phone_x86_64/emulator_x86_64:12/SP2A.220505.008/eng.user.20241021.101248:eng/test-keys'
10-21 15:35:50.469  2223  2223 F DEBUG   : Revision: '0'
10-21 15:35:50.469  2223  2223 F DEBUG   : ABI: 'x86_64'
10-21 15:35:50.469  2223  2223 F DEBUG   : Timestamp: 2024-10-21 15:35:50.297193046+0300
10-21 15:35:50.469  2223  2223 F DEBUG   : Process uptime: 3s
10-21 15:35:50.469  2223  2223 F DEBUG   : Cmdline: com.mypackage
10-21 15:35:50.469  2223  2223 F DEBUG   : pid: 1624, tid: 1778, name: com.mypackage  >>> com.mypackage <<<
10-21 15:35:50.469  2223  2223 F DEBUG   : uid: 10106
10-21 15:35:50.469  2223  2223 F DEBUG   : signal 11 (SIGSEGV), code 2 (SEGV_ACCERR), fault addr 0x78c40b354cd0
10-21 15:35:50.469  2223  2223 F DEBUG   :     rax 0000000000000001  rbx 000078c46661b458  rcx 0000000000000001  rdx 000078c52468c400
10-21 15:35:50.469  2223  2223 F DEBUG   :     r8  0000000000000000  r9  000078c40d88594d  r10 000078c508030df8  r11 0000000000000246
10-21 15:35:50.469  2223  2223 F DEBUG   :     r12 000078c46661b440  r13 000078c52468c400  r14 000078c404411518  r15 000078c52468c400
10-21 15:35:50.469  2223  2223 F DEBUG   :     rdi 000078c40d96d1b8  rsi 000078c40b354cd0
10-21 15:35:50.469  2223  2223 F DEBUG   :     rbp 000078c46661b458  rsp 000078c40afa8870  rip 000078c40b50e412
10-21 15:35:50.469  2223  2223 F DEBUG   : backtrace:
10-21 15:35:50.469  2223  2223 F DEBUG   :       #00 pc 000000000025e412  /data/local/lib64/libmylib.so (void std::__ndk1::allocator<mqtt::delivery_token>::construct[abi:ne180000]<mqtt::delivery_token, mqtt::iasync_client&, std::__ndk1::shared_ptr<mqtt::message const>&>(mqtt::delivery_token*, mqtt::iasync_client&, std::__ndk1::shared_ptr<mqtt::message const>&)+178) (BuildId: ecc1ffc5da396c51af7f96a5a98c83d75a6b812d)
10-21 15:35:50.469  2223  2223 F DEBUG   :       #01 pc 000000000025ae5c  /data/local/lib64/libmylib.so (mqtt::async_client::publish(std::__ndk1::shared_ptr<mqtt::message const>)+140) (BuildId: ecc1ffc5da396c51af7f96a5a98c83d75a6b812d)
10-21 15:35:50.469  2223  2223 F DEBUG   :       #02 pc 000000000025aa6f  /data/local/lib64/libmylib.so (mqtt::async_client::publish(mqtt::buffer_ref<char>, mqtt::buffer_ref<char>, int, bool)+319) (BuildId: ecc1ffc5da396c51af7f96a5a98c83d75a6b812d)
10-21 15:35:50.469  2223  2223 F DEBUG   :       #03 pc 0000000000120735  /data/local/lib64/libmylib.so (MqttClient::publish(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, int, bool)+405) (BuildId: ecc1ffc5da396c51af7f96a5a98c83d75a6b812d)
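
For readers mapping the frames to code, here is a rough sketch of the call shape the backtrace suggests. Everything in the `sketch` namespace is a stand-in written for illustration, not the Paho classes; it only mirrors the argument types visible in frames #00 to #02 and assumes the token is created through std::make_shared.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

namespace sketch {  // stand-in types for illustration, NOT the Paho classes

struct message {
    std::string topic;
    std::string payload;
    int qos;
    bool retained;
};
using const_message_ptr = std::shared_ptr<const message>;

struct client;  // plays the role of mqtt::iasync_client

struct delivery_token {
    client* cli;
    const_message_ptr msg;
    delivery_token(client& c, const_message_ptr m) : cli(&c), msg(std::move(m)) {}
};

struct client {
    // Shape of frame #01: publish(shared_ptr<const message>).
    std::shared_ptr<delivery_token> publish(const_message_ptr msg) {
        // Shape of frame #00: make_shared builds the token in place, going
        // through allocator<delivery_token>::construct(p, client&, shared_ptr&).
        return std::make_shared<delivery_token>(*this, msg);
    }

    // Shape of frame #02: publish(topic, payload, qos, retained).
    std::shared_ptr<delivery_token> publish(std::string topic, std::string payload,
                                            int qos, bool retained) {
        const_message_ptr msg = std::make_shared<message>(
            message{std::move(topic), std::move(payload), qos, retained});
        return publish(msg);
    }
};

}  // namespace sketch
```

Calling `sketch::client{}.publish("topic", "payload", 1, false)` walks the same three levels as frames #02, #01 and #00. This stand-in is not expected to reproduce the crash by itself.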

CMake options:

-DCMAKE_CXX_FLAGS="-fPIE -fPIC -lstdc++ -O1"
-DCMAKE_AR="${ANDROID_NDK}/toolchains/llvm/prebuilt/linux-x86_64/bin/llvm-ar"
-DCMAKE_RANLIB="${ANDROID_NDK}/toolchains/llvm/prebuilt/linux-x86_64/bin/llvm-ranlib"
-DCMAKE_CXX_COMPILER_CLANG_SCAN_DEPS="${ANDROID_NDK}/toolchains/llvm/prebuilt/linux-x86_64/bin/clang-scan-deps"
-DCMAKE_TOOLCHAIN_FILE="${ANDROID_NDK}/build/cmake/android.toolchain.cmake"
-DANDROID_ABI=x86_64
-DANDROID_PLATFORM=android-32

Here is the minimal project to reproduce: https://github.com/RankoR/paho-mqtt-crash-demo

It happens only on Android; I couldn't reproduce it on Linux even with -O3, so I assume this is an NDK-related issue.

Upstream bug

No response

Commit to cherry-pick

No response

Affected versions

r27

Canary version

No response

Host OS

Linux

Host OS version

Arch

Affected ABIs

arm64-v8a, x86_64

@RankoR RankoR added the bug label Oct 21, 2024
@github-project-automation github-project-automation bot moved this to Unconfirmed in NDK r27d Oct 21, 2024
@github-project-automation github-project-automation bot moved this to Unconfirmed in NDK r28 Oct 21, 2024
@DanAlbert DanAlbert moved this from Unconfirmed to Triaged in NDK r27d Oct 21, 2024
@DanAlbert
Member

@RankoR do you know if this worked with r26? We can check for that as well, but if you happen to already know it'd save us the effort :)

@RankoR
Author

RankoR commented Oct 21, 2024

@DanAlbert I don't know, unfortunately. Checked only with r27.1 and r27.2.

@DanAlbert
Member

np, we'll check it

@DanAlbert
Member

The readme in the repro case says

Build a release build

Could you elaborate so I don't draw the wrong rest of the owl? :) cmake -DCMAKE_TOOLCHAIN_FILE=$NDK/build/cmake/android.toolchain.cmake -DANDROID_ABI=arm64-v8a -DANDROID_PLATFORM=21 -DCMAKE_RELEASE_TYPE=Release? Something else?

@RankoR
Author

RankoR commented Nov 1, 2024

My CMake options in CLion:

-DCMAKE_CXX_FLAGS="-fPIE -fPIC -lstdc++ -O3 -DNDEBUG -ftree-vectorize -ffast-math -march=x86-64-v3 -fomit-frame-pointer -ffunction-sections -fdata-sections"
-DCMAKE_AR="${ANDROID_NDK}/toolchains/llvm/prebuilt/linux-x86_64/bin/llvm-ar"
-DCMAKE_RANLIB="${ANDROID_NDK}/toolchains/llvm/prebuilt/linux-x86_64/bin/llvm-ranlib"
-DCMAKE_CXX_COMPILER_CLANG_SCAN_DEPS="${ANDROID_NDK}/toolchains/llvm/prebuilt/linux-x86_64/bin/clang-scan-deps"
-DCMAKE_TOOLCHAIN_FILE="${ANDROID_NDK}/build/cmake/android.toolchain.cmake"
-DANDROID_ABI=x86_64 # Or arm64-v8a, crashes on both architectures
-DANDROID_PLATFORM=android-32
-DANDROID_API=32

Some of them may be redundant, especially those related to optimization.

In your options, -DCMAKE_RELEASE_TYPE should probably be replaced with -DCMAKE_BUILD_TYPE.

@DanAlbert
Member

Oh, that was in the original post. Sorry, I was looking at the other page :)

@DanAlbert
Member

Okay, confirmed that it is a regression from r26. Sort of, anyway: the project doesn't build with r26 because it relies on clang-scan-deps, which wasn't added until r27, but using r27's clang-scan-deps with r26's compiler works and doesn't crash. If a fix becomes available and it's safe to cherry-pick, we'll backport to r27 when we can.

Minimized instructions that work outside CLion:

$ cmake \
    -DCMAKE_TOOLCHAIN_FILE="${ANDROID_NDK}/build/cmake/android.toolchain.cmake" \
    -DCMAKE_CXX_FLAGS="-O1" \
    -DANDROID_ABI=arm64-v8a \
    -DANDROID_PLATFORM=android-32 \
    -S . -B build -G Ninja
$ cmake --build build
$ adb push build/paho_mqtt_crash /data/local/tmp/
$ adb push $ANDROID_NDK/toolchains/llvm/prebuilt/darwin-x86_64/sysroot/usr/lib/aarch64-linux-android/libc++_shared.so /data/local/tmp
$ adb shell "LD_LIBRARY_PATH=/data/local/tmp /data/local/tmp/paho_mqtt_crash"

It seems to also be broken with the r28 compiler, but I did have to make some modifications to the source to get that to build. The project uses std::basic_string<unsigned char>, which is not part of the standard and apparently no longer exists in libc++. You'll either need to switch to std::vector<unsigned char> (probably), use one of the character types listed here, or provide your own char_traits specialization for unsigned char.
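
As a rough illustration of the std::vector<unsigned char> option, here is a minimal sketch. The alias and helper names (`byte_buffer`, `make_buffer`, `to_payload`) are hypothetical and not taken from the repro project, whose actual code may need different changes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical alias standing in for a former std::basic_string<unsigned char>.
using byte_buffer = std::vector<unsigned char>;

// Copy raw bytes into an owning buffer, as a (pointer, length) string
// constructor would have done.
byte_buffer make_buffer(const unsigned char* data, std::size_t len) {
    return byte_buffer(data, data + len);
}

// Convert the bytes to std::string for APIs that take char-based payloads.
// Each unsigned char is converted to char; the length is preserved, including
// embedded zero bytes.
std::string to_payload(const byte_buffer& buf) {
    return std::string(buf.begin(), buf.end());
}
```

std::vector has no string-specific members such as find or substr, so code that relied on those would need <algorithm> equivalents or a custom char_traits specialization instead.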

btw, it's not the problem here, but using CMAKE_CXX_FLAGS for things like optimization can have surprising behavior in CMake. See https://developer.android.com/ndk/guides/cmake#manage_compiler_flags (our toolchain file has protected you from that, but that's a deviation from the typical CMake behavior)
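
If it's ever unclear which flags actually took effect, one option is to have the binary report it. A small sketch, assuming Clang or GCC (both define `__OPTIMIZE__` when optimizing at -O1 or higher); the function names are made up for illustration.

```cpp
#include <cassert>
#include <string>

// True when the compiler was optimizing this translation unit.
// Relies on the __OPTIMIZE__ macro, which Clang and GCC predefine at -O1 and up.
constexpr bool built_with_optimization() {
#ifdef __OPTIMIZE__
    return true;
#else
    return false;
#endif
}

// True when NDEBUG is defined, i.e. assert() is compiled out.
constexpr bool built_with_ndebug() {
#ifdef NDEBUG
    return true;
#else
    return false;
#endif
}

// One-line summary suitable for logging at startup.
std::string build_summary() {
    return std::string("optimized=") + (built_with_optimization() ? "yes" : "no") +
           " ndebug=" + (built_with_ndebug() ? "yes" : "no");
}
```

Logging `build_summary()` from the repro's main is a cheap way to confirm that the -O1 in CMAKE_CXX_FLAGS was not overridden by a later flag on the command line.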

@DanAlbert DanAlbert moved this from Unconfirmed to Triaged in NDK r28 Nov 1, 2024