Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

jdk23 ReentrantLockTest::testAwaitUninterruptibly timed out waiting for thread to enter thread state WAITING #19304

Closed
pshipton opened this issue Apr 11, 2024 · 13 comments · Fixed by ibmruntimes/openj9-openjdk-jdk#812 or ibmruntimes/openj9-openjdk-jdk23#11

Comments

@pshipton
Copy link
Member

jdk23 testing on amac

17:09:02  openjdk version "23-internal" 2024-09-17
17:09:02  OpenJDK Runtime Environment (build 23-internal-adhoc.****.BuildJDKnextaarch64macPersonal)
17:09:02  Eclipse OpenJ9 VM (build master-b2db2045aa4, JRE 23 Mac OS X aarch64-64-Bit 20240411_113 (JIT enabled, AOT enabled)
17:09:02  OpenJ9   - b2db2045aa4
17:09:02  OMR      - 0e07ad19c1f
17:09:02  JCL      - 7ac6a9b5b39 based on jdk-23+17)

Requires more triage to determine if this is jdk23 specific or an intermittent failure.

https://openj9-jenkins.osuosl.org/job/Test_openjdknext_j9_sanity.openjdk_aarch64_mac_Personal_testList_0/4
jdk_util_1
java/util/concurrent/tck/JSR166TestCase.java#others

17:24:56  STARTED    ReentrantLockTest::testAwaitUninterruptibly_fair 'testAwaitUninterruptibly_fair'
17:24:56  junit.framework.AssertionFailedError: timed out waiting for thread to enter thread state WAITING
17:24:56  	at junit.framework.Assert.fail(Assert.java:57)
17:24:56  	at junit.framework.TestCase.fail(TestCase.java:223)
17:24:56  	at JSR166TestCase.assertThreadBlocks(JSR166TestCase.java:1253)
17:24:56  	at ReentrantLockTest.testAwaitUninterruptibly(ReentrantLockTest.java:954)
17:24:56  	at ReentrantLockTest.testAwaitUninterruptibly_fair(ReentrantLockTest.java:922)
17:24:56  	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
17:24:56  	at java.base/java.lang.reflect.Method.invoke(Method.java:586)
17:24:56  	at junit.framework.TestCase.runTest(TestCase.java:177)
17:24:56  	at JSR166TestCase.runTest(JSR166TestCase.java:406)
17:24:56  	at junit.framework.TestCase.runBare(TestCase.java:142)
17:24:56  	at JSR166TestCase.runBare(JSR166TestCase.java:397)
17:24:56  	at junit.framework.TestResult$1.protect(TestResult.java:122)
17:24:56  	at junit.framework.TestResult.runProtected(TestResult.java:142)
17:24:56  	at junit.framework.TestResult.run(TestResult.java:125)
17:24:56  	at junit.framework.TestCase.run(TestCase.java:130)
17:24:56  	at junit.framework.TestSuite.runTest(TestSuite.java:241)
17:24:56  	at junit.framework.TestSuite.run(TestSuite.java:236)
17:24:56  	at junit.framework.TestSuite.runTest(TestSuite.java:241)
17:24:56  	at junit.framework.TestSuite.run(TestSuite.java:236)
17:24:56  	at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:90)
17:24:56  	at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
17:24:56  	at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
17:24:56  	at org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:42)
17:24:56  	at org.junit.vintage.engine.VintageTestEngine.executeAllChildren(VintageTestEngine.java:80)
17:24:56  	at org.junit.vintage.engine.VintageTestEngine.execute(VintageTestEngine.java:72)
17:24:56  	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:147)
17:24:56  	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:127)
17:24:56  	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:90)
17:24:56  	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:55)
17:24:56  	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:102)
17:24:56  	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:54)
17:24:56  	at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:114)
17:24:56  	at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:86)
17:24:56  	at org.junit.platform.launcher.core.DefaultLauncherSession$DelegatingLauncher.execute(DefaultLauncherSession.java:86)
17:24:56  	at com.sun.javatest.regtest.agent.JUnitRunner.runWithJUnitPlatform(JUnitRunner.java:142)
17:24:56  	at com.sun.javatest.regtest.agent.JUnitRunner.main(JUnitRunner.java:95)
17:24:56  	at com.sun.javatest.regtest.agent.JUnitRunner.main(JUnitRunner.java:61)
17:24:56  	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
17:24:56  	at java.base/java.lang.reflect.Method.invoke(Method.java:586)
17:24:56  	at com.sun.javatest.regtest.agent.MainWrapper$MainTask.run(MainWrapper.java:138)
17:24:56  	at java.base/java.lang.Thread.run(Thread.java:1587)
17:24:56  FAILED     ReentrantLockTest::testAwaitUninterruptibly_fair 'testAwaitUninterruptibly_fair'

17:24:56  [ JUnit Tests: found 4089, started 4068, succeeded 4067, failed 1, aborted 0, skipped 0]
@pshipton
Copy link
Member Author

pshipton commented Apr 11, 2024

3x grinder on the single test https://openj9-jenkins.osuosl.org/job/Grinder/3470/ - passed
2x x 2 grinder jdk_util_1 https://openj9-jenkins.osuosl.org/job/Grinder/3471/ - passed

@pshipton pshipton removed this from the Java 23 milestone Apr 11, 2024
@pshipton pshipton removed the jdk23 label Apr 11, 2024
@pshipton pshipton changed the title jdk23 ReentrantLockTest::testAwaitUninterruptibly_fair timed out waiting for thread to enter thread state WAITING ReentrantLockTest::testAwaitUninterruptibly_fair timed out waiting for thread to enter thread state WAITING Apr 11, 2024
@pshipton pshipton added this to the Java 23 milestone Apr 23, 2024
@pshipton
Copy link
Member Author

pshipton commented Apr 23, 2024

https://openj9-jenkins.osuosl.org/job/Test_openjdknext_j9_sanity.openjdk_aarch64_mac_Nightly_testList_0/2 - mac11-aarch64-4
jdk_util_1
java/util/concurrent/tck/JSR166TestCase.java#forkjoinpool-common-parallelism

21:03:36  STARTED    ClassSource [className = 'ReentrantLockTest', filePosition = null] 'testAwaitUninterruptibly'
21:03:36  junit.framework.AssertionFailedError: timed out waiting for thread to enter thread state WAITING
21:03:36  	at junit.framework.Assert.fail(Assert.java:57)
21:03:36  	at junit.framework.TestCase.fail(TestCase.java:223)
21:03:36  	at JSR166TestCase.assertThreadBlocks(JSR166TestCase.java:1253)
21:03:36  	at ReentrantLockTest.testAwaitUninterruptibly(ReentrantLockTest.java:954)
21:03:36  	at ReentrantLockTest.testAwaitUninterruptibly(ReentrantLockTest.java:921)

https://openj9-jenkins.osuosl.org/job/Test_openjdknext_j9_sanity.openjdk_x86-64_linux_Nightly_testList_1/2/ - cent7-x64-1
jdk_util_0
java/util/concurrent/tck/JSR166TestCase.java#security-manager

01:43:28  STARTED    ReentrantReadWriteLockTest::testAwaitUninterruptibly_fair 'testAwaitUninterruptibly_fair'
01:43:28  junit.framework.AssertionFailedError: timed out waiting for thread to enter thread state WAITING
01:43:28  	at junit.framework.Assert.fail(Assert.java:57)
01:43:28  	at junit.framework.TestCase.fail(TestCase.java:223)
01:43:28  	at JSR166TestCase.assertThreadBlocks(JSR166TestCase.java:1253)
01:43:28  	at ReentrantReadWriteLockTest.testAwaitUninterruptibly(ReentrantReadWriteLockTest.java:1070)
01:43:28  	at ReentrantReadWriteLockTest.testAwaitUninterruptibly_fair(ReentrantReadWriteLockTest.java:1038)

@pshipton
Copy link
Member Author

pshipton commented Apr 24, 2024

https://openj9-jenkins.osuosl.org/job/Test_openjdknext_j9_sanity.openjdk_x86-64_linux_Nightly_testList_1/3 - ub18-x86-1
jdk_util_0
java/util/concurrent/tck/JSR166TestCase.java#security-manager

03:44:50  STARTED    ReentrantLockTest::testAwaitUninterruptibly_fair 'testAwaitUninterruptibly_fair'
03:44:50  junit.framework.AssertionFailedError: timed out waiting for thread to enter thread state WAITING
03:44:50  	at junit.framework.Assert.fail(Assert.java:57)
03:44:50  	at junit.framework.TestCase.fail(TestCase.java:223)
03:44:50  	at JSR166TestCase.assertThreadBlocks(JSR166TestCase.java:1253)
03:44:50  	at ReentrantLockTest.testAwaitUninterruptibly(ReentrantLockTest.java:954)
03:44:50  	at ReentrantLockTest.testAwaitUninterruptibly_fair(ReentrantLockTest.java:922)

https://openj9-jenkins.osuosl.org/job/Test_openjdknext_j9_sanity.openjdk_x86-64_mac_Nightly_testList_1/2 - mac12-x86-1
jdk_util_0
java/util/concurrent/tck/JSR166TestCase.java#others

00:49:30  STARTED    ClassSource [className = 'ReentrantLockTest', filePosition = null] 'testAwaitUninterruptibly'
00:49:30  junit.framework.AssertionFailedError: timed out waiting for thread to enter thread state WAITING
00:49:30  	at junit.framework.Assert.fail(Assert.java:57)
00:49:30  	at junit.framework.TestCase.fail(TestCase.java:223)
00:49:30  	at JSR166TestCase.assertThreadBlocks(JSR166TestCase.java:1253)
00:49:30  	at ReentrantLockTest.testAwaitUninterruptibly(ReentrantLockTest.java:954)
00:49:30  	at ReentrantLockTest.testAwaitUninterruptibly(ReentrantLockTest.java:921)

https://openj9-jenkins.osuosl.org/job/Test_openjdknext_j9_sanity.openjdk_aarch64_linux_Nightly_testList_0/3/ - ub20-aarch64-osu-2
jdk_util_0 java/util/concurrent/tck/JSR166TestCase.java#forkjoinpool-common-parallelism

22:36:13  STARTED    ClassSource [className = 'ReentrantReadWriteLockTest', filePosition = null] 'testAwaitUninterruptibly'
22:36:13  junit.framework.AssertionFailedError: timed out waiting for thread to enter thread state WAITING
22:36:13  	at junit.framework.Assert.fail(Assert.java:57)
22:36:13  	at junit.framework.TestCase.fail(TestCase.java:223)
22:36:13  	at JSR166TestCase.assertThreadBlocks(JSR166TestCase.java:1253)
22:36:13  	at ReentrantReadWriteLockTest.testAwaitUninterruptibly(ReentrantReadWriteLockTest.java:1070)
22:36:13  	at ReentrantReadWriteLockTest.testAwaitUninterruptibly(ReentrantReadWriteLockTest.java:1037)

jdk_util_1 java/util/concurrent/tck/JSR166TestCase.java#default

23:03:01  STARTED    ClassSource [className = 'ReentrantLockTest', filePosition = null] 'testAwaitUninterruptibly'
23:03:01  junit.framework.AssertionFailedError: timed out waiting for thread to enter thread state WAITING
23:03:01  	at junit.framework.Assert.fail(Assert.java:57)
23:03:01  	at junit.framework.TestCase.fail(TestCase.java:223)
23:03:01  	at JSR166TestCase.assertThreadBlocks(JSR166TestCase.java:1253)
23:03:01  	at ReentrantLockTest.testAwaitUninterruptibly(ReentrantLockTest.java:954)
23:03:01  	at ReentrantLockTest.testAwaitUninterruptibly(ReentrantLockTest.java:921)

https://openj9-jenkins.osuosl.org/job/Test_openjdknext_j9_sanity.openjdk_s390x_linux_Nightly_testList_0/3/ - ub20-390-2
jdk_util_1
java/util/concurrent/tck/JSR166TestCase.java#default

02:34:20  STARTED    ReentrantReadWriteLockTest::testAwaitUninterruptibly_fair 'testAwaitUninterruptibly_fair'
02:34:20  junit.framework.AssertionFailedError: timed out waiting for thread to enter thread state WAITING
02:34:20  	at junit.framework.Assert.fail(Assert.java:57)
02:34:20  	at junit.framework.TestCase.fail(TestCase.java:223)
02:34:20  	at JSR166TestCase.assertThreadBlocks(JSR166TestCase.java:1253)
02:34:20  	at ReentrantReadWriteLockTest.testAwaitUninterruptibly(ReentrantReadWriteLockTest.java:1070)
02:34:20  	at ReentrantReadWriteLockTest.testAwaitUninterruptibly_fair(ReentrantReadWriteLockTest.java:1038)

@pshipton pshipton added the jdk23 label Apr 24, 2024
@pshipton pshipton changed the title ReentrantLockTest::testAwaitUninterruptibly_fair timed out waiting for thread to enter thread state WAITING jdk23 ReentrantLockTest::testAwaitUninterruptibly_fair timed out waiting for thread to enter thread state WAITING Apr 24, 2024
pshipton added a commit to pshipton/openjdk-tests that referenced this issue Apr 24, 2024
Issue eclipse-openj9/openj9#19304

Signed-off-by: Peter Shipton <Peter_Shipton@ca.ibm.com>
@pshipton
Copy link
Member Author

@tajila fyi I've set this as a blocker for jdk23 and I'm excluding the test for now.

adoptium/aqa-tests#5262

@pshipton pshipton changed the title jdk23 ReentrantLockTest::testAwaitUninterruptibly_fair timed out waiting for thread to enter thread state WAITING jdk23 ReentrantLockTest::testAwaitUninterruptibly timed out waiting for thread to enter thread state WAITING Apr 24, 2024
llxia pushed a commit to adoptium/aqa-tests that referenced this issue Apr 24, 2024
Issue eclipse-openj9/openj9#19304

Signed-off-by: Peter Shipton <Peter_Shipton@ca.ibm.com>
@tajila
Copy link
Contributor

tajila commented May 6, 2024

@babsingh Please take a look

@ThanHenderson
Copy link
Contributor

I set up a version of the test locally, the failure rate is ~4 to 6 failures per 1000 runs, and I've only seen it on Java 23 and next.

The test spawns two threads (t1 and t2) that both call this awaitUninterruptibly().

The javacore file's identify that t2 is waiting to be notified about the publicFlagsMutex (which is stated to be unowned). The native thread's vmstate is CW (sleeping) and the j9vmthread state is R (runnable); I'd expect the calls to park to put the thread in a P state but this isn't the case.

Tried on RI too for JDK23 and there is no failure there. Still looking into our code.

@ThanHenderson
Copy link
Contributor

ThanHenderson commented Jul 10, 2024

There is a data race introduced here.

Now that the volatile interrupted member is outside of a synchronized(interruptLock) block, the call to Thread.interrupted() (which checks for and clears an interrupt), and t2.interrupt() (which sets interrupted to true) race for the value of interrupted. The failing case gets caught in this loop because, on the first iteration, during the Thread.interrupted() call, interrupted is false and thus doesn't execute the path through clearInterruptEvent() -> interruptedImpl() that clears the J9THREAD_FLAG_INTERRUPTED. This leads to a situation where Thread::isInterrupted() evaluates to false but the thread still has J9THREAD_FLAG_INTERRUPTED set. Then, in this call, omrthread_park is short circuited because J9THREAD_FLAG_INTERRUPTED is set and thus J9THREAD_FLAG_PARKED never gets set. The loop condition then gets stuck in the state where status == 3 and Thread.currentThread().isInterrupted() == false.

The solution is just to include the interrupted = true in the synchronization block that was enclosed in interrupt0 in ibmruntimes/openj9-openjdk-jdk#803, that is, remove the synchronized block from within interrupt0 and do this:

synchronized(interruptLock) {
    interrupted = true;
    interrupt0;
}

in Thread::interrupt. On my local testing, this passes on 20K runs for JDK next. This will also need to be backported to JDK 21, 22, and 23 extension repo.

edit: 21 and 22 don't need to be changed, and the synchronized block in interrupt0 introduced in ibmruntimes/openj9-openjdk-jdk#803 doesn't need to be changed since the synchronized blocks for the same lock for the same thread are reentrant.

@babsingh
Copy link
Contributor

do all locations, where Thread.interrupted is written, need to be synchronized?

@ThanHenderson
Copy link
Contributor

No. None do.

@ThanHenderson
Copy link
Contributor

Fixed in:
JDKnext: ibmruntimes/openj9-openjdk-jdk#812
JDK23: ibmruntimes/openj9-openjdk-jdk23#11

@babsingh
Copy link
Contributor

babsingh commented Jul 11, 2024

@ThanHenderson Reopening, the test needs to be enabled before the issue is closed.

@babsingh babsingh reopened this Jul 11, 2024
@ThanHenderson
Copy link
Contributor

aqa-test PR to re-enable: adoptium/aqa-tests#5431

@ThanHenderson
Copy link
Contributor

The PR un-excluding the tests is now merged: adoptium/aqa-tests#5431. Closing this as completed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment