posix: pthread: race condition between pthread_create() and pthread_join() #56163

hakehuang · 2023-03-23T17:11:50Z

Previous Title
posix_common derived test fails on some qemu platforms when the CONFIG_TICKLESS_KERNEL=n is added

Describe the bug
posix_common derived test fails on some qemu platforms when the CONFIG_TICKLESS_KERNEL=n is added

Please also mention any information which could help others to understand
the problem you're facing:

What target platform are you using?
qemu_cortex_r5 and qemu_cortex_a53
What have you tried to diagnose or workaround this issue?
see discussion test_posix_common: fix test_clock_gettime_rollover failure #55208

add a delay in the test_pthread_descriptor_leak loop can help
use tickless mode can work as well

Is this a regression? If yes, have you been able to "git bisect" it to a
specific commit?
No.

To Reproduce
Steps to reproduce the behavior:

add CONFIG_TICKLESS_KERNEL=n to prj.conf
do below

west build -b qemu_cortex_r5 -- -DCONFIG_NEWLIB_LIBC=y
west build -t run

Expected behavior
PASS

Impact
some potential race condition in pthread handling

Logs and console output

*** Booting Zephyr OS build zephyr-v3.3.0-1516-g1dce3c3ee2e9 ***
...
===================================================================
START - test_pthread_descriptor_leak

    Assertion failed at WEST_TOPDIR/zephyr/tests/posix/common/src/pthread.c:598: posix_apis_test_pthread_descriptor_leak: (pthread_join(pthread1, &unused) is non-zero)
unable to join thread 31
 FAIL - test_pthread_descriptor_leak in 0.032 seconds
===================================================================
START - test_sleep
ASSERTION FAIL [0] @ WEST_TOPDIR/zephyr/kernel/sched.c:1797
        aborted _current back from dead
E: r0/a1:  0x00000004  r1/a2:  0x00000705  r2/a3:  0xff00002c
E: r3/a4:  0x000089b9 r12/ip:  0x000242e8 r14/lr:  0x0000b2a1
E:  xpsr:  0x0000013f
E: Faulting instruction address (r15/pc): 0x0000c934
E: >>> ZEPHYR FATAL ERROR 4: Kernel panic on CPU 0
E: Current thread: 0x20038 (unknown)
E: Halting system
...

Environment (please complete the following information):

OS: (e.g. Linux, )
Toolchain (e.g Zephyr SDK, ...)
Commit SHA or Version used: zephyr-v3.3.0-1516-g1dce3c3ee2e9

The text was updated successfully, but these errors were encountered:

cfriedt · 2023-04-07T19:20:36Z

On commit 336803e, I'm unable to reproduce this on main with the command below (also tried CONFIG_TICKLESS_KERNEL=n in prj.conf).

west build -p always -b qemu_cortex_a53 -t run tests/posix/common -- \
  -DCONFIG_TICKLESS_KERNEL=n -DCONFIG_NEWLIB_LIBC=y

Oddly, what I have found with qemu_cortex_r5 is that if I disable other tests within the same testsuite, the test passes. E.g.

for i in tests/posix/common/src/*.c; do
  j=$(basename $i)
  if [ "$j" = "main.c" -o $j = "pthread.c" ]; then
    continue
  fi
  k=$i.ignore
  mv $i $k
done
west build -p always -b qemu_cortex_r5 -t run tests/posix/common -- \
  -DCONFIG_TICKLESS_KERNEL=n -DCONFIG_NEWLIB_LIBC=y

I am able to reproduce this on qemu_cortex_r5 without disabling other parts of the testsuite though.

That leads me to believe we are looking at possible stack corruption (stack overflow, ..). Another alternative is that there may be some uninitialized variable.

cfriedt · 2023-04-07T20:02:03Z

I tried this [1], which sets stack sizes to a minimum size, but the problem persisted. That makes me think it's more of an uninitialized variable issue or maybe an alignment issue with the architecture.

If it's an uninitialized variable, it could take some time to pinpoint, unfortunately.

That's very odd that this issue only presents itself with qemu_cortex_r5 as well. @hakehuang - this happens with NXP hardware too, right?

@microbuilder, do you know if Cortex-R5 has any special requirements w.r.t. thread stack alignment?

[1]

git checkout .
west build -p always -b qemu_cortex_r5 tests/posix/common -- \
    -DCONFIG_TICKLESS_KERNEL=n -DCONFIG_NEWLIB_LIBC=y
  grep STACK build/zephyr/.config | \
  grep -v "^#" | \
  grep -v "=\(y\|=n\)$" | \
  while read -r LINE; do \
    VAL="$(echo "$LINE" | awk '{split($0,a,"="); print a[2];}')" \
    if [ $((2*VAL)) -lt 4096 ]; then \
      NEWVAL=8192 \
    else \
      NEWVAL=$((2*VAL)) \
    fi \
    CONFIG="$(echo "$LINE" | awk '{split($0,a,"="); print a[1];}')" \
    echo "$CONFIG=$NEWVAL" \
  done | \
  tee -a tests/posix/common/prj.conf
west build -p always -b qemu_cortex_r5 -t run tests/posix/common -- \
    -DCONFIG_TICKLESS_KERNEL=n -DCONFIG_NEWLIB_LIBC=y

hakehuang · 2023-04-11T02:35:20Z

That's very odd that this issue only presents itself with qemu_cortex_r5 as well. @hakehuang - this happens with NXP hardware too, right?

no, only in QEMU or renode platform

hakehuang · 2023-04-11T02:38:28Z

Oddly, what I have found with qemu_cortex_r5 is that if I disable other tests within the same testsuite, the test passes. E.g.

if the pthread test is removed, all other will not fail, and per my testing it is only the last test case in the pthread which will cause problem for itself or for following test, a simple way to check is add a sleep right after the test. which will trigger the problem.

cfriedt · 2023-04-29T12:05:05Z

@hakehuang @dleach02 - please take a look at #57403 - it seems the solution was a lot easier than we anticipated.

cfriedt · 2023-05-02T01:46:31Z

Re-opening because it still is not fixed.
I was able to exacerbate the problem by greatly increasing the number of pthread_create() / pthread_join() requests up to 100 times the maximum number of pthreads.

It should not be necessary to re-initialize the same attribute over and over again as well.

Just to be clear, we are not actually leaking descriptors, but we are seeing the effects of some kind of race condition between pthread_create(), and pthread_join().

ZTEST(posix_apis, test_pthread_descriptor_leak)
{
	pthread_t pthread1;
	pthread_attr_t attr;

	zassert_ok(pthread_attr_init(&attr));
	zassert_ok(pthread_attr_setstack(&attr, &stack_e[0][0], STACKS));

	/* If we are leaking descriptors, then this loop will never complete */
	for (size_t i = 0; i < CONFIG_MAX_PTHREAD_COUNT * 100; ++i) {
		zassert_ok(pthread_create(&pthread1, &attr, create_thread1, NULL),
			   "unable to create thread %zu", i);
		zassert_ok(pthread_join(pthread1, NULL), "unable to join thread %zu", i);
	}
}

The same thing definitely does not occur with k_thread_create() and k_thread_join(), so there is definitely a bug here, and it exists in main as well as v2.7-branch.

It will likely help to create a reliable stress test where we're not executing the entire tests/posix/common suite. Something like the above code snippet, even repeated 1000 times.

cfriedt · 2023-05-02T13:12:10Z

Setting to priority medium as it affects multiple platforms.

cfriedt · 2023-05-02T13:12:44Z

On second thought - affects multiple platforms and is breaking CI - setting to high priority.

cfriedt · 2023-05-07T20:24:56Z

#57637 can be used to more reliably reproduce the race condition. Still working on a fix.

The pthread_join() implementation was done using a condition variable and not k_thread_join(), and that's actually fatal. It's not possible in Zephyr to guarantee a thread has exited in a race-free way without calling either k_thread_abort() or k_thread_join(). This opened a hole where an app could call pthread_join(), which would go to sleep waiting on the condition variable, which would be signaled from the exit path of the thread, and then return to call pthread_create() (and thus k_thread_create()) on the same pthread_t object (and thus the same struct k_thread) BEFORE THE EARLIER THREAD HAD ACTUALLY EXITED. This makes the scheduler blow up, at least on SMP (I wasn't personally able to make it happen on uniprocessor configs, though the race is present always). Rework to call k_thread_join(). There's no other correct way to do this. Fixes zephyrproject-rtos#56163 Signed-off-by: Andy Ross <andyross@google.com>

andyross · 2023-05-09T23:37:48Z

See commit message in linked PR. The root cause was a hand-rolled attempt at pthread_join() when it needed to be using the underlying k_thread_join() instead.

The pthread_join() implementation was done using a condition variable and not k_thread_join(), and that's actually fatal. It's not possible in Zephyr to guarantee a thread has exited in a race-free way without calling either k_thread_abort() or k_thread_join(). This opened a hole where an app could call pthread_join(), which would go to sleep waiting on the condition variable, which would be signaled from the exit path of the thread, and then return to call pthread_create() (and thus k_thread_create()) on the same pthread_t object (and thus the same struct k_thread) BEFORE THE EARLIER THREAD HAD ACTUALLY EXITED. This makes the scheduler blow up, at least on SMP (I wasn't personally able to make it happen on uniprocessor configs, though the race is present always). Rework to call k_thread_join(). There's no other correct way to do this. Fixes zephyrproject-rtos#56163 Signed-off-by: Andy Ross <andyross@google.com>

cfriedt · 2023-05-21T12:56:58Z

As it turns out, there is also a race between k_thread_create() and k_thread_join()

andyross · 2023-05-21T15:25:14Z

Can you file that as a bug? There really shouldn't be as I'm reading the code.

cfriedt · 2023-05-22T01:18:53Z

Can you file that as a bug? There really shouldn't be as I'm reading the code.

@andyross - created #58116 for you

andyross · 2023-05-23T15:01:04Z

Comment left in #58116 for anyone interested. FWIW: it seems not to be a kernel race in thread lifecycle, more like a arm64-specific race in context handling (and maybe x86_64, but with 1000x lower frequency).

FWIW: I'd treat this as two separate issues and merge the pthread fixes. If you need to tune the test[1] on arm64 for CI, that seems acceptable as long as we have a bug tracking the instability.

[1] SMP tests are already retried in CI because of unavoidable timer skew problems with scheduling "CPUs" onto host threads that can be preempted for 10ms+ in busy CI runs. (Which isn't what we're seeing here -- no timer interaction in this test!). The frequency I'm seeing probably wouldn't even show up above the noise floor.

cfriedt · 2023-06-05T13:18:30Z

Just wanted to note here that although this issue has highlighted an underlying issue with k_thread_create() and k_thread_join() which has been mitigated (at least made so that the probabilities are quite negligible in parts per billion), the race still very much exists in pthreads.

Working on a fix for this which I hope will be accepted into v3.4.0

cfriedt · 2023-06-06T00:16:06Z

Probably should not have been closed

hakehuang added the bug The issue is a bug, or the PR is fixing a bug label Mar 23, 2023

hakehuang assigned cfriedt Mar 23, 2023

hakehuang mentioned this issue Mar 23, 2023

test_posix_common: fix test_clock_gettime_rollover failure #55208

Closed

henrikbrixandersen added the area: POSIX POSIX API Library label Mar 28, 2023

jgl-meta added the priority: low Low impact/importance bug label Mar 28, 2023

cfriedt mentioned this issue Apr 29, 2023

tests: posix: pthread: init pthread_attr_t on each iteration #57403

Merged

jgl-meta closed this as completed in #57403 Apr 29, 2023

cfriedt added backport v2.7-branch Request backport to the v2.7-branch and removed backport v2.7-branch Request backport to the v2.7-branch labels Apr 29, 2023

cfriedt mentioned this issue Apr 30, 2023

[Backport v2.7] tests: posix: pthread: init pthread_attr_t on each iteration #57413

Closed

cfriedt reopened this May 2, 2023

cfriedt added priority: medium Medium impact/importance bug and removed priority: low Low impact/importance bug labels May 2, 2023

cfriedt added priority: high High impact/importance bug and removed priority: medium Medium impact/importance bug labels May 2, 2023

cfriedt mentioned this issue May 2, 2023

[Backport v2.7] posix: sched: ensure min and max priority are schedulable #57414

Merged

andyross mentioned this issue May 9, 2023

lib/pthread: Fix create/join race in thread lifecycle #57721

Closed

cfriedt mentioned this issue May 11, 2023

posix: pthread: correct race condition between pthread_create and pthread_join #57637

Closed

cfriedt mentioned this issue May 22, 2023

kernel: thread: race condition between create and join #58116

Closed

cfriedt mentioned this issue May 24, 2023

tests: posix: stress test for pthread_create and pthread_join #58218

Merged

cfriedt added priority: medium Medium impact/importance bug and removed priority: high High impact/importance bug labels May 27, 2023

cfriedt mentioned this issue May 27, 2023

[v2.7] kernel: thread: race condition between create and join #58362

Closed

cfriedt mentioned this issue Jun 5, 2023

POSIX Roadmap for LTSv3 #51211

Closed

cfriedt changed the title ~~posix_common derived test fails on some qemu platforms when the CONFIG_TICKLESS_KERNEL=n is added~~ posix: pthread: race condition between pthread_create() and pthread_join() Jun 5, 2023

cfriedt mentioned this issue Jun 5, 2023

posix: pthread: barrier functions should not directly invoke ksched.h API #58856

Closed

nashif closed this as completed in #58218 Jun 5, 2023

cfriedt reopened this Jun 6, 2023

cfriedt mentioned this issue Jun 7, 2023

posix: pthread: leverage zephyr's synchronization and threading api #58924

Merged

nashif closed this as completed in #58924 Jun 9, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

posix: pthread: race condition between pthread_create() and pthread_join() #56163

posix: pthread: race condition between pthread_create() and pthread_join() #56163

hakehuang commented Mar 23, 2023 •

edited by cfriedt

Loading

cfriedt commented Apr 7, 2023 •

edited

Loading

cfriedt commented Apr 7, 2023 •

edited

Loading

hakehuang commented Apr 11, 2023

hakehuang commented Apr 11, 2023

cfriedt commented Apr 29, 2023

cfriedt commented May 2, 2023 •

edited

Loading

cfriedt commented May 2, 2023

cfriedt commented May 2, 2023

cfriedt commented May 7, 2023

andyross commented May 9, 2023

cfriedt commented May 21, 2023

andyross commented May 21, 2023

cfriedt commented May 22, 2023

andyross commented May 23, 2023

cfriedt commented Jun 5, 2023

cfriedt commented Jun 6, 2023

posix: pthread: race condition between pthread_create() and pthread_join() #56163

posix: pthread: race condition between pthread_create() and pthread_join() #56163

Comments

hakehuang commented Mar 23, 2023 • edited by cfriedt Loading

cfriedt commented Apr 7, 2023 • edited Loading

cfriedt commented Apr 7, 2023 • edited Loading

hakehuang commented Apr 11, 2023

hakehuang commented Apr 11, 2023

cfriedt commented Apr 29, 2023

cfriedt commented May 2, 2023 • edited Loading

cfriedt commented May 2, 2023

cfriedt commented May 2, 2023

cfriedt commented May 7, 2023

andyross commented May 9, 2023

cfriedt commented May 21, 2023

andyross commented May 21, 2023

cfriedt commented May 22, 2023

andyross commented May 23, 2023

cfriedt commented Jun 5, 2023

cfriedt commented Jun 6, 2023

hakehuang commented Mar 23, 2023 •

edited by cfriedt

Loading

cfriedt commented Apr 7, 2023 •

edited

Loading

cfriedt commented Apr 7, 2023 •

edited

Loading

cfriedt commented May 2, 2023 •

edited

Loading