Random crash of test_signal on macos sonoma #110017

Closed
sobolevn opened this issue Sep 28, 2023 · 8 comments
Labels
OS-mac; tests (Tests in the Lib/test dir); type-crash (A hard crash of the interpreter, possibly with a core dump)

Comments

@sobolevn
Member

sobolevn commented Sep 28, 2023

Crash report

I experienced a random crash while fixing #109981.
Why random? Because I have not been able to reproduce it since.

My command:

» ./python.exe -m test -f tests.txt
0:00:00 load avg: 4.78 Run 31 tests sequentially
0:00:00 load avg: 4.78 [ 1/31] test_shelve
0:00:00 load avg: 4.78 [ 2/31] test_shlex
0:00:00 load avg: 4.78 [ 3/31] test_signal
Fatal Python error: Bus error

Current thread 0x000000016c947000 (most recent call first):
  File "/Users/sobolev/Desktop/cpython/Lib/test/test_signal.py", line 1338 in set_interrupts
  File "/Users/sobolev/Desktop/cpython/Lib/threading.py", line 1003 in run
  File "/Users/sobolev/Desktop/cpython/Lib/threading.py", line 1066 in _bootstrap_inner
  File "/Users/sobolev/Desktop/cpython/Lib/threading.py", line 1023 in _bootstrap

Thread 0x00000001de555300 (most recent call first):
  File "/Users/sobolev/Desktop/cpython/Lib/signal.py", line 29 in _int_to_enum
  File "/Users/sobolev/Desktop/cpython/Lib/signal.py", line 57 in signal
  File "/Users/sobolev/Desktop/cpython/Lib/test/test_signal.py", line 1346 in cycle_handlers
  File "/Users/sobolev/Desktop/cpython/Lib/test/test_signal.py", line 1356 in test_stress_modifying_handlers
  File "/Users/sobolev/Desktop/cpython/Lib/unittest/case.py", line 589 in _callTestMethod
  File "/Users/sobolev/Desktop/cpython/Lib/unittest/case.py", line 636 in run
  File "/Users/sobolev/Desktop/cpython/Lib/unittest/case.py", line 692 in __call__
  File "/Users/sobolev/Desktop/cpython/Lib/unittest/suite.py", line 122 in run
  File "/Users/sobolev/Desktop/cpython/Lib/unittest/suite.py", line 84 in __call__
  File "/Users/sobolev/Desktop/cpython/Lib/unittest/suite.py", line 122 in run
  File "/Users/sobolev/Desktop/cpython/Lib/unittest/suite.py", line 84 in __call__
  File "/Users/sobolev/Desktop/cpython/Lib/unittest/suite.py", line 122 in run
  File "/Users/sobolev/Desktop/cpython/Lib/unittest/suite.py", line 84 in __call__
  File "/Users/sobolev/Desktop/cpython/Lib/test/support/testresult.py", line 146 in run
  File "/Users/sobolev/Desktop/cpython/Lib/test/support/__init__.py", line 1152 in _run_suite
  File "/Users/sobolev/Desktop/cpython/Lib/test/support/__init__.py", line 1279 in run_unittest
  File "/Users/sobolev/Desktop/cpython/Lib/test/libregrtest/single.py", line 36 in run_unittest
  File "/Users/sobolev/Desktop/cpython/Lib/test/libregrtest/single.py", line 92 in test_func
  File "/Users/sobolev/Desktop/cpython/Lib/test/libregrtest/single.py", line 48 in regrtest_runner
  File "/Users/sobolev/Desktop/cpython/Lib/test/libregrtest/single.py", line 95 in _load_run_test
  File "/Users/sobolev/Desktop/cpython/Lib/test/libregrtest/single.py", line 138 in _runtest_env_changed_exc
  File "/Users/sobolev/Desktop/cpython/Lib/test/libregrtest/single.py", line 238 in _runtest
  File "/Users/sobolev/Desktop/cpython/Lib/test/libregrtest/single.py", line 266 in run_single_test
  File "/Users/sobolev/Desktop/cpython/Lib/test/libregrtest/main.py", line 290 in run_test
  File "/Users/sobolev/Desktop/cpython/Lib/test/libregrtest/main.py", line 325 in run_tests_sequentially
  File "/Users/sobolev/Desktop/cpython/Lib/test/libregrtest/main.py", line 459 in _run_tests
  File "/Users/sobolev/Desktop/cpython/Lib/test/libregrtest/main.py", line 490 in run_tests
  File "/Users/sobolev/Desktop/cpython/Lib/test/libregrtest/main.py", line 576 in main
  File "/Users/sobolev/Desktop/cpython/Lib/test/libregrtest/main.py", line 584 in main
  File "/Users/sobolev/Desktop/cpython/Lib/test/__main__.py", line 2 in <module>
  File "/Users/sobolev/Desktop/cpython/Lib/runpy.py", line 88 in _run_code
  File "/Users/sobolev/Desktop/cpython/Lib/runpy.py", line 198 in _run_module_as_main

Extension modules: _testcapi (total: 1)
[1]    3657 bus error  ./python.exe -m test -f tests.txt

Contents of tests.txt:

test_shelve
test_shlex
test_signal
test_site
test_slice
test_smtplib
test_smtpnet
test_socket
test_socketserver
test_sort
test_source_encoding
test_sqlite3
test_ssl
test_stable_abi_ctypes
test_startfile
test_stat
test_statistics
test_str
test_strftime
test_string
test_string_literals
test_stringprep
test_strptime
test_strtod
test_struct
test_structseq
test_subclassinit
test_subprocess
test_sundry
test_super
test_support

Note, however, that only 3 tests had started before the crash: test_shelve, test_shlex, and test_signal.

I've also tried running ./python.exe -m test -j4 --forever test_shelve test_shlex test_signal, but no luck so far.

Env:

  • 98c0c1d (main)
  • macOS Sonoma, Apple M2

@sobolevn
Member Author

I got it again from --forever:

0:05:31 load avg: 1.86 [172/1] test_signal process crashed (Exit code -10)
Fatal Python error: Bus error

Current thread 0x000000016c56f000 (most recent call first):
  File "/Users/sobolev/Desktop/cpython/Lib/test/test_signal.py", line 1338 in set_interrupts
  File "/Users/sobolev/Desktop/cpython/Lib/threading.py", line 1003 in run
  File "/Users/sobolev/Desktop/cpython/Lib/threading.py", line 1066 in _bootstrap_inner
  File "/Users/sobolev/Desktop/cpython/Lib/threading.py", line 1023 in _bootstrap

Thread 0x00000001de555300 (most recent call first):
  File "/Users/sobolev/Desktop/cpython/Lib/enum.py", line 1128 in __new__
  File "/Users/sobolev/Desktop/cpython/Lib/enum.py", line 729 in __call__
  File "/Users/sobolev/Desktop/cpython/Lib/signal.py", line 29 in _int_to_enum
  File "/Users/sobolev/Desktop/cpython/Lib/signal.py", line 57 in signal
  File "/Users/sobolev/Desktop/cpython/Lib/test/test_signal.py", line 1346 in cycle_handlers
  File "/Users/sobolev/Desktop/cpython/Lib/test/test_signal.py", line 1356 in test_stress_modifying_handlers
  File "/Users/sobolev/Desktop/cpython/Lib/unittest/case.py", line 589 in _callTestMethod
  File "/Users/sobolev/Desktop/cpython/Lib/unittest/case.py", line 636 in run
  File "/Users/sobolev/Desktop/cpython/Lib/unittest/case.py", line 692 in __call__
  File "/Users/sobolev/Desktop/cpython/Lib/unittest/suite.py", line 122 in run
  File "/Users/sobolev/Desktop/cpython/Lib/unittest/suite.py", line 84 in __call__
  File "/Users/sobolev/Desktop/cpython/Lib/unittest/suite.py", line 122 in run
  File "/Users/sobolev/Desktop/cpython/Lib/unittest/suite.py", line 84 in __call__
  File "/Users/sobolev/Desktop/cpython/Lib/unittest/suite.py", line 122 in run
  File "/Users/sobolev/Desktop/cpython/Lib/unittest/suite.py", line 84 in __call__
  File "/Users/sobolev/Desktop/cpython/Lib/test/support/testresult.py", line 146 in run
  File "/Users/sobolev/Desktop/cpython/Lib/test/support/__init__.py", line 1152 in _run_suite
  File "/Users/sobolev/Desktop/cpython/Lib/test/support/__init__.py", line 1279 in run_unittest
  File "/Users/sobolev/Desktop/cpython/Lib/test/libregrtest/single.py", line 36 in run_unittest
  File "/Users/sobolev/Desktop/cpython/Lib/test/libregrtest/single.py", line 92 in test_func
  File "/Users/sobolev/Desktop/cpython/Lib/test/libregrtest/single.py", line 48 in regrtest_runner
  File "/Users/sobolev/Desktop/cpython/Lib/test/libregrtest/single.py", line 95 in _load_run_test
  File "/Users/sobolev/Desktop/cpython/Lib/test/libregrtest/single.py", line 138 in _runtest_env_changed_exc
  File "/Users/sobolev/Desktop/cpython/Lib/test/libregrtest/single.py", line 238 in _runtest
  File "/Users/sobolev/Desktop/cpython/Lib/test/libregrtest/single.py", line 266 in run_single_test
  File "/Users/sobolev/Desktop/cpython/Lib/test/libregrtest/worker.py", line 86 in worker_process
  File "/Users/sobolev/Desktop/cpython/Lib/test/libregrtest/worker.py", line 109 in main
  File "/Users/sobolev/Desktop/cpython/Lib/test/libregrtest/worker.py", line 113 in <module>
  File "/Users/sobolev/Desktop/cpython/Lib/runpy.py", line 88 in _run_code
  File "/Users/sobolev/Desktop/cpython/Lib/runpy.py", line 198 in _run_module_as_main

Extension modules: _testcapi (total: 1)
Kill <WorkerThread #1 running test=test_signal pid=6276 time=6.6 sec> process group
Kill <WorkerThread #2 running test=test_signal pid=6170 time=23.1 sec> process group
Kill <WorkerThread #3 running test=test_signal pid=6217 time=12.4 sec> process group
0:05:32 load avg: 1.86 Waiting for <WorkerThread #1 running test=test_signal pid=6276 time=7.6 sec> thread for 1.0 sec
0:05:33 load avg: 1.86 Waiting for <WorkerThread #1 running test=test_signal pid=6276 time=8.6 sec> thread for 2.0 sec
0:05:34 load avg: 1.86 Waiting for <WorkerThread #1 running test=test_signal pid=6276 time=9.6 sec> thread for 3.0 sec
0:05:35 load avg: 1.79 Waiting for <WorkerThread #1 running test=test_signal pid=6276 time=10.7 sec> thread for 4.0 sec
0:05:36 load avg: 1.79 Waiting for <WorkerThread #1 running test=test_signal pid=6276 time=11.7 sec> thread for 5.0 sec
0:05:37 load avg: 1.79 Waiting for <WorkerThread #1 running test=test_signal pid=6276 time=12.7 sec> thread for 6.0 sec
0:05:38 load avg: 1.79 Waiting for <WorkerThread #1 running test=test_signal pid=6276 time=13.7 sec> thread for 7.0 sec
0:05:39 load avg: 1.79 Waiting for <WorkerThread #1 running test=test_signal pid=6276 time=14.7 sec> thread for 8.0 sec
0:05:40 load avg: 1.65 Waiting for <WorkerThread #1 running test=test_signal pid=6276 time=15.7 sec> thread for 9.0 sec
0:05:41 load avg: 1.65 Waiting for <WorkerThread #1 running test=test_signal pid=6276 time=16.7 sec> thread for 10.0 sec
0:05:42 load avg: 1.65 Waiting for <WorkerThread #1 running test=test_signal pid=6276 time=17.7 sec> thread for 11.0 sec
0:05:43 load avg: 1.65 Waiting for <WorkerThread #1 running test=test_signal pid=6276 time=18.7 sec> thread for 12.0 sec
0:05:44 load avg: 1.65 Waiting for <WorkerThread #1 running test=test_signal pid=6276 time=19.7 sec> thread for 13.0 sec
0:05:45 load avg: 1.60 Waiting for <WorkerThread #1 running test=test_signal pid=6276 time=20.7 sec> thread for 14.1 sec
0:05:46 load avg: 1.60 Waiting for <WorkerThread #1 running test=test_signal pid=6276 time=21.7 sec> thread for 15.1 sec
0:05:47 load avg: 1.60 Waiting for <WorkerThread #1 running test=test_signal pid=6276 time=22.7 sec> thread for 16.1 sec
0:05:48 load avg: 1.60 Waiting for <WorkerThread #1 running test=test_signal pid=6276 time=23.7 sec> thread for 17.1 sec
0:05:49 load avg: 1.60 Waiting for <WorkerThread #1 running test=test_signal pid=6276 time=24.7 sec> thread for 18.1 sec
0:05:50 load avg: 1.47 Waiting for <WorkerThread #1 running test=test_signal pid=6276 time=25.7 sec> thread for 19.1 sec

== Tests result: FAILURE ==

1 test failed:
    test_signal

171 tests OK.

Total duration: 5 min 50 sec
Total tests: run=15,586 skipped=569
Total test files: run=172 failed=1
Result: FAILURE

@vstinner
Member

> Random crash of test_signal on macos sonoma

Maybe the test_signal.test_stress_modifying_handlers() test was already unstable before macOS 14: I saw a crash on macOS 12.7 (GHA macOS), see issue gh-110083. Maybe it was simply missed before I deployed --rerun-fail on the Python CIs.

@erlend-aasland
Contributor

I got a failure-then-success for test_signal recently: https://github.com/python/cpython/actions/runs/6415769021/job/17418272051?pr=110374

@Eclips4
Member

Eclips4 commented Oct 13, 2023

> Random crash of test_signal on macos sonoma
>
> Maybe the test_signal.test_stress_modifying_handlers() was already unstable before macOS 14: I saw a crash on macOS 12.7 (GHA macOS), see issue gh-110083.

I came to the same conclusion. It would be great if someone with macOS 13 (or older) could try to reproduce it.
Reproducer: ./python.exe -m test test_signal -m test_stress_modifying_handlers --forever
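
For readers without a CPython checkout, the pattern visible in the tracebacks above (one thread in set_interrupts raising the signal, the main thread in cycle_handlers calling signal.signal()) can be sketched roughly as follows. This is a minimal sketch and not the code of test_stress_modifying_handlers itself: the function name, the min_sent and allow_macos parameters, and the returned counters are assumptions made for illustration. It assumes Python 3.8+ and a platform that has SIGUSR1, and it refuses to run on macOS by default because that is where this issue reports interpreter crashes.

```python
import signal
import sys
import threading


def stress_modifying_handlers(min_sent=100, allow_macos=False):
    """Raise SIGUSR1 from a worker thread while the main thread swaps handlers.

    Hypothetical helper for illustration. Call it from the main thread only,
    because signal.signal() is not allowed elsewhere.
    Returns a (sent, received, ignored) tuple of counters.
    """
    if sys.platform == "darwin" and not allow_macos:
        raise RuntimeError("this pattern can crash the interpreter on macOS")

    signum = signal.SIGUSR1
    sent = 0
    received = 0
    ignored = 0
    stop = threading.Event()

    def custom_handler(signum, frame):
        nonlocal received
        received += 1

    def set_interrupts():
        # Worker thread: raise the signal in a tight loop.
        nonlocal sent
        while not stop.is_set():
            signal.raise_signal(signum)
            sent += 1

    def count_unraisable(unraisable):
        # A signal that arrives while the handler is being swapped may be
        # reported as an "unraisable" exception; count it instead of printing.
        nonlocal ignored
        ignored += 1

    old_hook = sys.unraisablehook
    sys.unraisablehook = count_unraisable
    old_handler = signal.signal(signum, custom_handler)
    worker = threading.Thread(target=set_interrupts)
    try:
        worker.start()
        # Main thread: alternate between a Python handler and SIG_IGN.
        while sent < min_sent:
            for handler in (custom_handler, signal.SIG_IGN):
                signal.signal(signum, handler)
    finally:
        stop.set()
        worker.join()
        signal.signal(signum, old_handler)
        sys.unraisablehook = old_hook
    return sent, received, ignored
```

The handler is only ever a Python function or SIG_IGN while the worker is running, so the default action of SIGUSR1 (terminating the process) is never in effect during the loop.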

@ronaldoussoren
Contributor

I caught the error in a debugger:

* thread #12, stop reason = EXC_BAD_ACCESS (code=257, address=0x1)
    frame #0: 0x0000000000000001
error: memory read failed for 0x1
Target 0: (python.exe) stopped.
(lldb) thread backtrace
* thread #12, stop reason = EXC_BAD_ACCESS (code=257, address=0x1)
  * frame #0: 0x0000000000000001
    frame #1: 0x0000000189ba3a24 libsystem_platform.dylib`_sigtramp + 56
    frame #2: 0x0000000189b74cc0 libsystem_pthread.dylib`pthread_kill + 288
    frame #3: 0x0000000189a4d540 libsystem_c.dylib`raise + 32
    frame #4: 0x00000001001f1748 python.exe`signal_raise_signal_impl(module=<unavailable>, signalnum=<unavailable>) at signalmodule.c:445:11 [opt]
    frame #5: 0x00000001001528c0 python.exe`_PyEval_EvalFrameDefault(tstate=0x000000014e6e39c0, frame=<unavailable>, throwflag=<unavailable>) at generated_cases.c.h:1134:19 [opt]
    frame #6: 0x0000000100053a98 python.exe`method_vectorcall [inlined] _PyObject_VectorcallTstate(tstate=0x000000014e6e39c0, callable=0x00000001008904a0, args=0x0000000170e06ef8, nargsf=1, kwnames=0x0000000000000000) at pycore_call.h:168:11 [opt]
    frame #7: 0x0000000100053a6c python.exe`method_vectorcall(method=<unavailable>, args=0x0000000100400450, nargsf=<unavailable>, kwnames=0x0000000000000000) at classobject.c:70:20 [opt]
    frame #8: 0x0000000100229588 python.exe`thread_run(boot_raw=0x0000600001790800) at _threadmodule.c:1204:21 [opt]
    frame #9: 0x00000001001c9ed4 python.exe`pythread_wrapper(arg=<unavailable>) at thread_pthread.h:236:5 [opt]
    frame #10: 0x0000000189b75034 libsystem_pthread.dylib`_pthread_start + 136

Note the SEGV at address 0x1, triggered from within libc. A second thread is in the sigaction implementation at the same time:

(lldb) thread list
Process 18586 stopped
  thread #1: tid = 0xe789b2, 0x0000000189b3aebc libsystem_kernel.dylib`__sigaction + 8, queue = 'com.apple.main-thread'
* thread #12: tid = 0xe7996d, 0x0000000000000001, stop reason = EXC_BAD_ACCESS (code=257, address=0x1)

This looks more and more like a bug in macOS. I'll see if I can whip up a reproducer in C.
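
One possible reading of the backtrace, offered as a hypothesis and not something confirmed in this thread: the faulting address 0x1 has the same numeric value that SIG_IGN conventionally has, and the stress test alternates between a real handler and SIG_IGN. That would be consistent with the signal trampoline jumping to a "handler" that had just been replaced by SIG_IGN. The snippet below only checks the numeric coincidence from Python; the fault_address name is introduced here for illustration.

```python
import signal

# Handler sentinels as exposed by Python; their integer values mirror the
# C-level SIG_DFL / SIG_IGN constants of the platform.
sig_dfl = int(signal.SIG_DFL)
sig_ign = int(signal.SIG_IGN)

# Taken from "EXC_BAD_ACCESS (code=257, address=0x1)" in the lldb output.
fault_address = 0x1

print(f"SIG_DFL = {sig_dfl:#x}, SIG_IGN = {sig_ign:#x}")
print("fault address equals SIG_IGN:", fault_address == sig_ign)
```

This does not show where the stale value comes from; it only narrows down what to look for in a C reproducer.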

@ronaldoussoren
Contributor

ronaldoussoren commented Dec 7, 2023

This is a reproducer in C that crashes almost immediately for me on a macOS 14.1 system. The crash is similar to the one above:

* thread #2, stop reason = EXC_BAD_ACCESS (code=257, address=0x1)
    frame #0: 0x0000000000000001
error: memory read failed for 0x1
Target 1: (repro) stopped.
(lldb) thread backtrace
* thread #2, stop reason = EXC_BAD_ACCESS (code=257, address=0x1)
  * frame #0: 0x0000000000000001
    frame #1: 0x0000000189ba3a24 libsystem_platform.dylib`_sigtramp + 56
    frame #2: 0x0000000189b74cc0 libsystem_pthread.dylib`pthread_kill + 288
    frame #3: 0x0000000189a4d540 libsystem_c.dylib`raise + 32
    frame #4: 0x0000000100003e18 repro`raising_thread(arg=0x0000000000000000) at repro.c:17:9
    frame #5: 0x0000000189b75034 libsystem_pthread.dylib`_pthread_start + 136
(lldb) thread select 1
* thread #1, queue = 'com.apple.main-thread'
    frame #0: 0x0000000189b3aebc libsystem_kernel.dylib`__sigaction + 8
libsystem_kernel.dylib`:
->  0x189b3aebc <+8>:  b.lo   0x189b3aedc               ; <+40>
    0x189b3aec0 <+12>: pacibsp 
    0x189b3aec4 <+16>: stp    x29, x30, [sp, #-0x10]!
    0x189b3aec8 <+20>: mov    x29, sp
(lldb) thread backtrace
* thread #1, queue = 'com.apple.main-thread'
  * frame #0: 0x0000000189b3aebc libsystem_kernel.dylib`__sigaction + 8
    frame #1: 0x0000000189ba2d28 libsystem_platform.dylib`__platform_sigaction + 108
    frame #2: 0x0000000100003ea0 repro`update_signal at repro.c:34:9
    frame #3: 0x0000000100003f38 repro`main at repro.c:54:9
    frame #4: 0x00000001897f90e0 dyld`start + 2360
(lldb) 

Could one of you check my C code to make sure it doesn't contain a glaring issue? I intend to file an issue with Apple about this. I will test on older macOS versions over the weekend.

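#include <signal.h>
#include <unistd.h>
#include <pthread.h>
#include <stdio.h>

static int sigcount = 0;     /* signals raised; written only by raising_thread */
static int update_count = 0; /* handler swaps; written only by the main thread */

void sighandler(int sig)     /* no-op handler; alternates with SIG_IGN */
{
    return;
}

void* raising_thread(void* arg)
{
    for (;;) {
        raise(SIGUSR1);      /* send SIGUSR1 to this thread as fast as possible */
        sigcount += 1;
    }
    return NULL;
}

void update_signal(void)
{
    struct sigaction context, ocontext;

    if (update_count++ % 2 == 0) {  /* alternate the disposition on every call */
        context.sa_handler = sighandler;
    } else {
        context.sa_handler = SIG_IGN;
    }
    sigemptyset(&context.sa_mask);
    context.sa_flags = SA_ONSTACK;
    if (sigaction(SIGUSR1, &context, &ocontext) == -1) {
        perror("sigaction");
        _exit(1);
    }
}

int main(void)
{
    alarm(30);               /* SIGALRM's default action ends the run after 30 seconds */
    update_signal();         /* install sighandler before the raising thread starts */

    pthread_t thread;


    if (pthread_create(&thread, NULL, &raising_thread, NULL) !=  0) {
        perror("pthread_create");
        return 1;
    }

    for (;;) {
        update_signal();     /* race sigaction() against raise() in the other thread */
    }
}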

ronaldoussoren added a commit to ronaldoussoren/cpython that referenced this issue Dec 7, 2023
…n macOS

Test test_stress_modifying_handlers in test_signal can crash
the interpreter due to a bug in macOS. Filed as FB13453490
with Apple.
@ronaldoussoren
Contributor

I've filed an issue with Apple about this: FB13453490

@ned-deily
Member

FWIW, your C test also crashed on macOS 13.5.1 and 12.6.1 (both Apple Silicon).

ronaldoussoren added a commit that referenced this issue Dec 7, 2023
#112834)

Test test_stress_modifying_handlers in test_signal can crash
the interpreter due to a bug in macOS. Filed as FB13453490
with Apple.
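
The diff of that commit is not quoted in this thread. As a rough sketch of what disabling a test on macOS can look like with the standard unittest API, assuming a skipIf decorator keyed on sys.platform (the class name, the reason string, and the run_sketch helper are hypothetical and chosen for illustration):

```python
import sys
import unittest

APPLE_FEEDBACK_ID = "FB13453490"  # feedback number mentioned in this issue


class StressTests(unittest.TestCase):
    # Hypothetical stand-in for the real test class in Lib/test/test_signal.py.

    @unittest.skipIf(sys.platform == "darwin",
                     f"crashes due to a macOS bug ({APPLE_FEEDBACK_ID})")
    def test_stress_modifying_handlers(self):
        # Placeholder body; the real test stresses signal.signal().
        self.assertTrue(True)


def run_sketch():
    """Run the sketch test case and return the unittest.TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(StressTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

On macOS, run_sketch().skipped would contain the test together with the reason string; on other platforms the placeholder body runs normally.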
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Dec 7, 2023
…n macOS (pythonGH-112834)

Test test_stress_modifying_handlers in test_signal can crash
the interpreter due to a bug in macOS. Filed as FB13453490
with Apple.
(cherry picked from commit bf0beae)

Co-authored-by: Ronald Oussoren <ronaldoussoren@mac.com>
ronaldoussoren added a commit that referenced this issue Dec 8, 2023
…on macOS (GH-112834) (#112851)

gh-110017: Disable test_signal.test_stress_modifying_handlers on macOS (GH-112834)

Test test_stress_modifying_handlers in test_signal can crash
the interpreter due to a bug in macOS. Filed as FB13453490
with Apple.
(cherry picked from commit bf0beae)

Co-authored-by: Ronald Oussoren <ronaldoussoren@mac.com>
ronaldoussoren added a commit that referenced this issue Dec 8, 2023
…on macOS (GH-112834) (#112852)

gh-110017: Disable test_signal.test_stress_modifying_handlers on macOS (GH-112834)

Test test_stress_modifying_handlers in test_signal can crash
the interpreter due to a bug in macOS. Filed as FB13453490
with Apple.
(cherry picked from commit bf0beae)

Co-authored-by: Ronald Oussoren <ronaldoussoren@mac.com>
aisk pushed a commit to aisk/cpython that referenced this issue Feb 11, 2024
…n macOS (python#112834)

Test test_stress_modifying_handlers in test_signal can crash
the interpreter due to a bug in macOS. Filed as FB13453490
with Apple.
Glyphack pushed a commit to Glyphack/cpython that referenced this issue Sep 2, 2024
…n macOS (python#112834)

Test test_stress_modifying_handlers in test_signal can crash
the interpreter due to a bug in macOS. Filed as FB13453490
with Apple.