Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows LambdaLoadTest hang at 3% #8214

Closed
pshipton opened this issue Jan 6, 2020 · 32 comments
Closed

Windows LambdaLoadTest hang at 3% #8214

pshipton opened this issue Jan 6, 2020 · 32 comments

Comments

@pshipton
Copy link
Member

pshipton commented Jan 6, 2020

https://ci.eclipse.org/openj9/job/Test_openjdk8_j9_special.system_x86-32_windows_Personal/33
LambdaLoadTest_OpenJ9_NonLinux_special_24
variation: Mode107-OSRG
JVM_OPTIONS: -Xgcpolicy:optthruput -Xdebug -Xrunjdwp:transport=dt_socket,address=8888,server=y,onthrow=no.pkg.foo,launch=echo -Xjit:enableOSR,enableOSROnGuardFailure,count=1,disableAsyncCompilation

No diagnostic files generated.

10:22:03.554 - Completed 3.0%. Number of tests started=6
10:22:23.882 - Completed 3.0%. Number of tests started=6 (+0)
10:22:43.460 - Completed 3.0%. Number of tests started=6 (+0)
10:23:03.476 - Completed 3.0%. Number of tests started=6 (+0)
10:23:23.492 - Completed 3.0%. Number of tests started=6 (+0)
10:23:43.398 - Completed 3.0%. Number of tests started=6 (+0)
10:24:03.413 - Completed 3.0%. Number of tests started=6 (+0)
10:24:23.429 - Completed 3.0%. Number of tests started=6 (+0)
10:24:43.445 - Completed 3.0%. Number of tests started=6 (+0)
10:25:03.460 - Completed 3.0%. Number of tests started=6 (+0)
10:25:23.476 - Completed 3.0%. Number of tests started=6 (+0)
10:25:43.492 - Completed 3.0%. Number of tests started=6 (+0)
10:26:03.398 - Completed 3.0%. Number of tests started=6 (+0)
10:26:23.414 - Completed 3.0%. Number of tests started=6 (+0)
10:26:43.429 - Completed 3.0%. Number of tests started=6 (+0)
10:27:03.445 - Completed 3.0%. Number of tests started=6 (+0)
10:27:23.460 - Completed 3.0%. Number of tests started=6 (+0)
10:27:43.476 - Completed 3.0%. Number of tests started=6 (+0)
10:28:03.492 - Completed 3.0%. Number of tests started=6 (+0)
10:28:23.398 - Completed 3.0%. Number of tests started=6 (+0)
10:28:43.414 - Completed 3.0%. Number of tests started=6 (+0)
10:29:03.429 - Completed 3.0%. Number of tests started=6 (+0)
10:29:23.445 - Completed 3.0%. Number of tests started=6 (+0)
10:29:43.461 - Completed 3.0%. Number of tests started=6 (+0)
10:30:03.476 - Completed 3.0%. Number of tests started=6 (+0)
10:30:23.492 - Completed 3.0%. Number of tests started=6 (+0)
10:30:43.398 - Completed 3.0%. Number of tests started=6 (+0)
10:31:03.414 - Completed 3.0%. Number of tests started=6 (+0)
10:31:23.429 - Completed 3.0%. Number of tests started=6 (+0)
10:31:43.445 - Completed 3.0%. Number of tests started=6 (+0)
10:32:03.461 - Completed 3.0%. Number of tests started=6 (+0)
10:32:23.414 - Completed 3.0%. Number of tests started=6 (+0)
10:32:43.398 - Completed 3.0%. Number of tests started=6 (+0)
10:33:03.429 - Completed 3.0%. Number of tests started=6 (+0)
10:33:23.445 - Completed 3.0%. Number of tests started=6 (+0)
10:33:43.461 - Completed 3.0%. Number of tests started=6 (+0)
10:34:03.476 - Completed 3.0%. Number of tests started=6 (+0)
10:34:23.492 - Completed 3.0%. Number of tests started=6 (+0)
10:34:43.398 - Completed 3.0%. Number of tests started=6 (+0)
10:35:03.414 - Completed 3.0%. Number of tests started=6 (+0)
10:35:23.429 - Completed 3.0%. Number of tests started=6 (+0)
10:35:43.445 - Completed 3.0%. Number of tests started=6 (+0)
10:36:03.461 - Completed 3.0%. Number of tests started=6 (+0)
10:36:23.476 - Completed 3.0%. Number of tests started=6 (+0)
10:36:43.492 - Completed 3.0%. Number of tests started=6 (+0)
10:37:03.398 - Completed 3.0%. Number of tests started=6 (+0)
10:37:23.414 - Completed 3.0%. Number of tests started=6 (+0)
10:37:43.430 - Completed 3.0%. Number of tests started=6 (+0)
10:38:03.445 - Completed 3.0%. Number of tests started=6 (+0)
10:38:23.461 - Completed 3.0%. Number of tests started=6 (+0)
10:38:43.476 - Completed 3.0%. Number of tests started=6 (+0)
10:39:03.492 - Completed 3.0%. Number of tests started=6 (+0)
10:39:23.398 - Completed 3.0%. Number of tests started=6 (+0)
10:39:43.414 - Completed 3.0%. Number of tests started=6 (+0)
10:40:03.430 - Completed 3.0%. Number of tests started=6 (+0)
10:40:23.445 - Completed 3.0%. Number of tests started=6 (+0)
10:40:43.461 - Completed 3.0%. Number of tests started=6 (+0)
10:41:03.476 - Completed 3.0%. Number of tests started=6 (+0)
10:41:23.492 - Completed 3.0%. Number of tests started=6 (+0)
10:41:43.398 - Completed 3.0%. Number of tests started=6 (+0)
10:42:03.414 - Completed 3.0%. Number of tests started=6 (+0)
10:42:23.430 - Completed 3.0%. Number of tests started=6 (+0)
10:42:43.445 - Completed 3.0%. Number of tests started=6 (+0)
10:43:03.461 - Completed 3.0%. Number of tests started=6 (+0)
10:43:23.477 - Completed 3.0%. Number of tests started=6 (+0)
10:43:43.492 - Completed 3.0%. Number of tests started=6 (+0)
10:44:03.398 - Completed 3.0%. Number of tests started=6 (+0)
10:44:23.414 - Completed 3.0%. Number of tests started=6 (+0)
10:44:43.430 - Completed 3.0%. Number of tests started=6 (+0)
10:45:03.445 - Completed 3.0%. Number of tests started=6 (+0)
10:45:23.461 - Completed 3.0%. Number of tests started=6 (+0)
10:45:43.477 - Completed 3.0%. Number of tests started=6 (+0)
10:46:03.492 - Completed 3.0%. Number of tests started=6 (+0)
10:46:23.398 - Completed 3.0%. Number of tests started=6 (+0)
10:46:43.414 - Completed 3.0%. Number of tests started=6 (+0)
10:47:03.430 - Completed 3.0%. Number of tests started=6 (+0)
10:47:23.508 - Completed 3.0%. Number of tests started=6 (+0)
10:47:43.430 - Completed 3.0%. Number of tests started=6 (+0)
10:48:03.461 - Completed 3.0%. Number of tests started=6 (+0)
10:48:23.399 - Completed 3.0%. Number of tests started=6 (+0)
10:48:43.414 - Completed 3.0%. Number of tests started=6 (+0)
10:49:03.430 - Completed 3.0%. Number of tests started=6 (+0)
10:49:23.446 - Completed 3.0%. Number of tests started=6 (+0)
10:49:43.461 - Completed 3.0%. Number of tests started=6 (+0)
10:50:03.477 - Completed 3.0%. Number of tests started=6 (+0)
10:50:23.492 - Completed 3.0%. Number of tests started=6 (+0)
10:50:43.399 - Completed 3.0%. Number of tests started=6 (+0)
10:51:03.414 - Completed 3.0%. Number of tests started=6 (+0)
10:51:23.430 - Completed 3.0%. Number of tests started=6 (+0)
10:51:43.446 - Completed 3.0%. Number of tests started=6 (+0)
10:52:03.461 - Completed 3.0%. Number of tests started=6 (+0)
10:52:23.477 - Completed 3.0%. Number of tests started=6 (+0)
10:52:43.492 - Completed 3.0%. Number of tests started=6 (+0)
10:53:03.399 - Completed 3.0%. Number of tests started=6 (+0)
10:53:23.414 - Completed 3.0%. Number of tests started=6 (+0)
10:53:43.430 - Completed 3.0%. Number of tests started=6 (+0)
10:54:03.446 - Completed 3.0%. Number of tests started=6 (+0)
10:54:23.461 - Completed 3.0%. Number of tests started=6 (+0)
10:54:43.477 - Completed 3.0%. Number of tests started=6 (+0)
10:55:03.493 - Completed 3.0%. Number of tests started=6 (+0)
10:55:23.399 - Completed 3.0%. Number of tests started=6 (+0)
10:55:43.414 - Completed 3.0%. Number of tests started=6 (+0)
10:56:03.430 - Completed 3.0%. Number of tests started=6 (+0)
10:56:23.446 - Completed 3.0%. Number of tests started=6 (+0)
10:56:43.461 - Completed 3.0%. Number of tests started=6 (+0)
10:57:03.477 - Completed 3.0%. Number of tests started=6 (+0)
10:57:23.493 - Completed 3.0%. Number of tests started=6 (+0)
10:57:43.399 - Completed 3.0%. Number of tests started=6 (+0)
10:58:03.414 - Completed 3.0%. Number of tests started=6 (+0)
10:58:23.430 - Completed 3.0%. Number of tests started=6 (+0)
10:58:43.446 - Completed 3.0%. Number of tests started=6 (+0)
10:59:03.461 - Completed 3.0%. Number of tests started=6 (+0)
10:59:23.477 - Completed 3.0%. Number of tests started=6 (+0)
10:59:43.493 - Completed 3.0%. Number of tests started=6 (+0)
11:00:03.399 - Completed 3.0%. Number of tests started=6 (+0)
11:00:23.415 - Completed 3.0%. Number of tests started=6 (+0)
11:00:43.430 - Completed 3.0%. Number of tests started=6 (+0)
11:01:03.446 - Completed 3.0%. Number of tests started=6 (+0)
11:01:23.461 - Completed 3.0%. Number of tests started=6 (+0)
11:01:43.477 - Completed 3.0%. Number of tests started=6 (+0)
11:02:03.493 - Completed 3.0%. Number of tests started=6 (+0)
11:02:23.399 - Completed 3.0%. Number of tests started=6 (+0)
11:02:43.415 - Completed 3.0%. Number of tests started=6 (+0)
11:03:03.430 - Completed 3.0%. Number of tests started=6 (+0)
11:03:23.446 - Completed 3.0%. Number of tests started=6 (+0)
11:03:43.462 - Completed 3.0%. Number of tests started=6 (+0)
11:04:03.477 - Completed 3.0%. Number of tests started=6 (+0)
11:04:23.493 - Completed 3.0%. Number of tests started=6 (+0)
11:04:43.399 - Completed 3.0%. Number of tests started=6 (+0)
11:05:03.415 - Completed 3.0%. Number of tests started=6 (+0)
11:05:23.430 - Completed 3.0%. Number of tests started=6 (+0)
11:05:43.446 - Completed 3.0%. Number of tests started=6 (+0)
11:06:03.462 - Completed 3.0%. Number of tests started=6 (+0)
11:06:23.477 - Completed 3.0%. Number of tests started=6 (+0)
11:06:43.493 - Completed 3.0%. Number of tests started=6 (+0)
11:07:03.399 - Completed 3.0%. Number of tests started=6 (+0)
11:07:23.415 - Completed 3.0%. Number of tests started=6 (+0)
11:07:43.430 - Completed 3.0%. Number of tests started=6 (+0)
11:08:03.446 - Completed 3.0%. Number of tests started=6 (+0)
11:08:23.462 - Completed 3.0%. Number of tests started=6 (+0)
11:08:43.477 - Completed 3.0%. Number of tests started=6 (+0)
11:09:03.493 - Completed 3.0%. Number of tests started=6 (+0)
11:09:23.399 - Completed 3.0%. Number of tests started=6 (+0)
11:09:43.415 - Completed 3.0%. Number of tests started=6 (+0)
11:10:03.430 - Completed 3.0%. Number of tests started=6 (+0)
11:10:23.446 - Completed 3.0%. Number of tests started=6 (+0)
11:10:43.462 - Completed 3.0%. Number of tests started=6 (+0)
11:11:03.477 - Completed 3.0%. Number of tests started=6 (+0)
11:11:23.493 - Completed 3.0%. Number of tests started=6 (+0)
11:11:43.399 - Completed 3.0%. Number of tests started=6 (+0)
11:12:03.415 - Completed 3.0%. Number of tests started=6 (+0)
11:12:23.431 - Completed 3.0%. Number of tests started=6 (+0)
11:12:43.446 - Completed 3.0%. Number of tests started=6 (+0)
11:13:03.462 - Completed 3.0%. Number of tests started=6 (+0)
11:13:23.477 - Completed 3.0%. Number of tests started=6 (+0)
11:13:43.493 - Completed 3.0%. Number of tests started=6 (+0)
11:14:03.399 - Completed 3.0%. Number of tests started=6 (+0)
11:14:23.415 - Completed 3.0%. Number of tests started=6 (+0)
11:14:43.431 - Completed 3.0%. Number of tests started=6 (+0)
11:15:03.446 - Completed 3.0%. Number of tests started=6 (+0)
11:15:23.462 - Completed 3.0%. Number of tests started=6 (+0)
11:15:43.478 - Completed 3.0%. Number of tests started=6 (+0)
11:16:03.493 - Completed 3.0%. Number of tests started=6 (+0)
11:16:23.399 - Completed 3.0%. Number of tests started=6 (+0)
11:16:43.415 - Completed 3.0%. Number of tests started=6 (+0)
11:17:03.431 - Completed 3.0%. Number of tests started=6 (+0)
11:17:23.446 - Completed 3.0%. Number of tests started=6 (+0)
11:17:43.462 - Completed 3.0%. Number of tests started=6 (+0)
11:18:03.478 - Completed 3.0%. Number of tests started=6 (+0)
11:18:23.493 - Completed 3.0%. Number of tests started=6 (+0)
11:18:43.400 - Completed 3.0%. Number of tests started=6 (+0)
11:19:03.415 - Completed 3.0%. Number of tests started=6 (+0)
11:19:23.431 - Completed 3.0%. Number of tests started=6 (+0)
11:19:43.446 - Completed 3.0%. Number of tests started=6 (+0)
11:20:03.462 - Completed 3.0%. Number of tests started=6 (+0)
11:20:23.478 - Completed 3.0%. Number of tests started=6 (+0)
11:20:43.493 - Completed 3.0%. Number of tests started=6 (+0)
11:21:03.400 - Completed 3.0%. Number of tests started=6 (+0)
11:21:23.415 - Completed 3.0%. Number of tests started=6 (+0)
11:21:43.431 - Completed 3.0%. Number of tests started=6 (+0)
11:22:03.446 - Completed 3.0%. Number of tests started=6 (+0)
@pshipton
Copy link
Member Author

@pshipton
Copy link
Member Author

@pshipton
Copy link
Member Author

@pshipton
Copy link
Member Author

https://ci.eclipse.org/openj9/job/Test_openjdk11_j9_sanity.system_x86-64_windows_Nightly/320
LambdaLoadTest_ConcurrentScavenge_0
looks the same

LT  00:31:59.732 - Completed 3.0%. Number of tests started=6 (+0)
LT  00:32:19.742 - Completed 3.0%. Number of tests started=6 (+0)
LT  00:32:39.743 - Completed 3.0%. Number of tests started=6 (+0)
STF 00:32:57.026 - Heartbeat: Process LT  is still running
STF 00:32:58.030 - **FAILED** Process LT  has timed out

@pshipton pshipton changed the title special.system windows 32-bit LambdaLoadTest_OpenJ9_NonLinux_special_24 hang Windows LambdaLoadTest hang Mar 26, 2020
@pshipton
Copy link
Member Author

@pshipton
Copy link
Member Author

@pshipton
Copy link
Member Author

@pshipton
Copy link
Member Author

pshipton commented Jun 14, 2020

@pshipton
Copy link
Member Author

@pshipton
Copy link
Member Author

@pshipton
Copy link
Member Author

@pshipton
Copy link
Member Author

@pshipton
Copy link
Member Author

@Mesbah-Alam we've been ignoring this for while, but it would be worth taking a look. procdump is used to create diagnotics, but it seems not working, the stdout shows the following No process matching the specified PID can be found..

STF 11:32:46.269 - Using procdump.exe to generate .DMP files
STF 11:32:46.269 - Running command: F:\Users\jenkins\workspace\Test_openjdk8_j9_special.system_x86-32_windows_Personal_lambdaLoadTest\jvmtest\system\systemtest_prereqs\windows_sysinternals\procdump.exe -accepteula -ma 240

@Mesbah-Alam
Copy link
Contributor

Created STF issue for the above: adoptium/STF#86

@Mesbah-Alam
Copy link
Contributor

Mesbah-Alam commented Nov 5, 2020

@pshipton FYI : Simon (@lumpfish) has created a fix (adoptium/STF#89) which should resolve the procdump hang issue.

We may close this issue at this point and re-open if the problem somehow resurfaces in any future build.

@pshipton pshipton closed this as completed Nov 5, 2020
@pshipton
Copy link
Member Author

pshipton commented Dec 7, 2020

https://ci.eclipse.org/openj9/job/Test_openjdk8_j9_special.system_x86-32_windows_Personal_testList_1/1/
LambdaLoadTest_OpenJ9_NonLinux_special_26
variation: Mode107-OSRG
JVM_OPTIONS: -Xgcpolicy:optthruput -Xdebug -Xrunjdwp:transport=dt_socket,address=8888,server=y,onthrow=no.pkg.foo,launch=echo -Xjit:enableOSR,enableOSROnGuardFailure,count=1,disableAsyncCompilation

Diagnostics https://140-211-168-230-openstack.osuosl.org/artifactory/ci-eclipse-openj9/Test/Test_openjdk8_j9_special.system_x86-32_windows_Personal_testList_1/1/system_test_output.tar.gz

@pshipton pshipton reopened this Dec 7, 2020
@pshipton
Copy link
Member Author

There are 3 core files, which I'll refer to as 125, 155, 225.

125:

        !stack 0x25f3e600       !j9vmthread 0x25f3e600  !j9thread 0x27394644    tid 0x139c (5020) // (load-3)
        !stack 0x25f3ed00       !j9vmthread 0x25f3ed00  !j9thread 0x273948b8    tid 0x33d8 (13272) // (load-4)

0x139c:

ntdll!NtQueryPerformanceCounter+0xc
j9prt29!omrtime_hires_clock+0xd
j9trc29!traceV+0x13c
j9trc29!logTracePoint+0x3a
j9trc29!doTracePoint+0xd6
j9trc29!javaTrace+0x3f
j9jit29!initializeOSRFrame+0xbe
j9jit29!initializeOSRBuffer+0x1ea
j9jit29!induceOSROnCurrentThread+0x2cd
j9jit29!old_slow_jitInduceOSRAtCurrentPC+0x80
j9jit29!jitInduceOSRAtCurrentPC+0x62
j9vm29!sidecarInvokeReflectMethodImpl+0x2f4
j9vm29!sidecarInvokeReflectMethod+0x38
jclse29!JVM_InvokeMethod_Impl+0x54
java_705a0000!Java_sun_reflect_NativeMethodAccessorImpl_invoke0+0x15
j9vm29!ffi_call_win32+0x48
j9vm29!ffi_raw_call+0x63
j9vm29!VM_DebugBytecodeInterpreterFull::run+0x1344c
j9vm29!debugBytecodeLoopFull+0x7d
j9vm29!runJavaThread+0x1b5

0x33d8:

WARNING: Stack unwind information not available. Following frames may be wrong.
ntdll!NtQueryPerformanceCounter+0xc
j9prt29!omrtime_hires_clock+0xd
j9trc29!traceV+0x13c
j9trc29!logTracePoint+0x3a
j9trc29!doTracePoint+0xd6
j9trc29!javaTrace+0x3f
j9jit29!decompileMethodFrameIterator+0xd3
j9vm29!walkFrame+0x181
j9vm29!walkStackFrames+0x225
j9jit29!jitDecompileMethod+0x68
j9jit29!c_jitDecompileAtCurrentPC+0x22
j9jit29!jitDecompileAtCurrentPC+0x1a
j9vm29!sidecarInvokeReflectMethodImpl+0x2f4
j9vm29!sidecarInvokeReflectMethod+0x38
jclse29!JVM_InvokeMethod_Impl+0x54
java_705a0000!Java_sun_reflect_NativeMethodAccessorImpl_invoke0+0x15
j9vm29!ffi_call_win32+0x48
j9vm29!ffi_raw_call+0x63
j9vm29!VM_DebugBytecodeInterpreterFull::run+0x1344c
j9vm29!debugBytecodeLoopFull+0x7d

225: same threads

        !stack 0x25f3e600       !j9vmthread 0x25f3e600  !j9thread 0x27394644    tid 0x139c (5020) // (load-3)
        !stack 0x25f3ed00       !j9vmthread 0x25f3ed00  !j9thread 0x273948b8    tid 0x33d8 (13272) // (load-4)

0x139c:

WARNING: Stack unwind information not available. Following frames may be wrong.
ntdll!NtQueryPerformanceCounter+0xc
j9prt29!omrtime_hires_clock+0xd
j9trc29!traceV+0x13c
j9trc29!logTracePoint+0x3a
j9trc29!doTracePoint+0xd6
j9trc29!javaTrace+0x3f
j9jit29!decompileOuterFrame+0x1b1
j9jit29!buildInlineStackFrames+0x30
j9jit29!performDecompile+0x213
j9jit29!jitDecompileMethod+0x7c
j9jit29!c_jitDecompileAtCurrentPC+0x22
j9jit29!jitDecompileAtCurrentPC+0x1a
j9vm29!sidecarInvokeReflectMethodImpl+0x2f4
j9vm29!sidecarInvokeReflectMethod+0x38
jclse29!JVM_InvokeMethod_Impl+0x54
java_705a0000!Java_sun_reflect_NativeMethodAccessorImpl_invoke0+0x15
j9vm29!ffi_call_win32+0x48
j9vm29!ffi_raw_call+0x63
j9vm29!VM_DebugBytecodeInterpreterFull::run+0x1344c
j9vm29!debugBytecodeLoopFull+0x7d

0x33d8:

WARNING: Stack unwind information not available. Following frames may be wrong.
ntdll!NtQueryPerformanceCounter+0xc
j9prt29!omrtime_hires_clock+0xd
j9trc29!traceV+0x13c
j9trc29!logTracePoint+0x3a
j9trc29!doTracePoint+0xd6
j9trc29!javaTrace+0x3f
j9jit29!fixStackForNewDecompilation+0x14c
j9jit29!induceOSROnCurrentThread+0x39d
j9jit29!old_slow_jitInduceOSRAtCurrentPC+0x80
j9jit29!jitInduceOSRAtCurrentPC+0x62
j9vm29!sidecarInvokeReflectMethodImpl+0x2f4
j9vm29!sidecarInvokeReflectMethod+0x38
jclse29!JVM_InvokeMethod_Impl+0x54
java_705a0000!Java_sun_reflect_NativeMethodAccessorImpl_invoke0+0x15
j9vm29!ffi_call_win32+0x48
j9vm29!ffi_raw_call+0x63
j9vm29!VM_DebugBytecodeInterpreterFull::run+0x1344c
j9vm29!debugBytecodeLoopFull+0x7d
j9vm29!runJavaThread+0x1b5
j9vm29!javaProtectedThreadProc+0xa1

Running !stack on jdmpview results in:

Dec 14, 2020 1:51:02 PM com.ibm.j9ddr.vm29.events.DefaultEventListener corruptData
WARNING: CorruptDataException thrown walking stack. walkThread = 0x25F3E600
com.ibm.j9ddr.CorruptDataException: Induced OSR resolve not handled yet
        at com.ibm.j9ddr.vm29.j9.stackwalker.JITStackWalker$JITStackWalker_29_V0.jitWalkResolveMethodFrame(JITStackWalker.java:471)
        at com.ibm.j9ddr.vm29.j9.stackwalker.JITStackWalker$JITStackWalker_29_V0.walkTransitionFrame(JITStackWalker.java:883)
        at com.ibm.j9ddr.vm29.j9.stackwalker.JITStackWalker$JITStackWalker_29_V0.jitWalkStackFrames(JITStackWalker.java:198)
        at com.ibm.j9ddr.vm29.j9.stackwalker.JITStackWalker.jitWalkStackFrames(JITStackWalker.java:101)
        at com.ibm.j9ddr.vm29.j9.stackwalker.StackWalker$StackWalker_29_V0.walkStackFrames(StackWalker.java:488)
        at com.ibm.j9ddr.vm29.j9.stackwalker.StackWalker.walkStackFrames(StackWalker.java:103)
        at com.ibm.j9ddr.vm29.tools.ddrinteractive.commands.StackWalkCommand.run(StackWalkCommand.java:139)
        at com.ibm.j9ddr.tools.ddrinteractive.Context.tryCommand(Context.java:229)
        at com.ibm.j9ddr.tools.ddrinteractive.Context.execute(Context.java:202)
        at com.ibm.j9ddr.tools.ddrinteractive.DDRInteractive.execute(DDRInteractive.java:356)
        at com.ibm.j9ddr.command.CommandReader.processLine(CommandReader.java:79)
        at com.ibm.j9ddr.tools.ddrinteractive.DDRInteractive.processLine(DDRInteractive.java:331)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at com.ibm.jvm.dtfjview.CombinedContext.executeDDRInteractiveCommand(CombinedContext.java:258)
        at com.ibm.jvm.dtfjview.CombinedContext.execute(CombinedContext.java:169)
        at com.ibm.jvm.dtfjview.Session.execute(Session.java:811)
        at com.ibm.jvm.dtfjview.Session.execute(Session.java:765)
        at com.ibm.jvm.dtfjview.tools.ToolsRegistry.executeJdmpviewCommand(ToolsRegistry.java:183)
        at com.ibm.jvm.dtfjview.tools.ToolsRegistry.execute(ToolsRegistry.java:107)
        at com.ibm.jvm.dtfjview.tools.ToolsRegistry.execute(ToolsRegistry.java:88)
        at com.ibm.jvm.dtfjview.tools.ToolsRegistry.recordAndExecute(ToolsRegistry.java:72)
        at com.ibm.jvm.dtfjview.Session.runInteractive(Session.java:739)
        at com.ibm.jvm.dtfjview.Session.run(Session.java:663)
        at com.ibm.jvm.dtfjview.DTFJView.launch(DTFJView.java:52)
        at com.ibm.jvm.dtfjview.DTFJView.main(DTFJView.java:46)

Stack walk result: STACK_CORRUPT

@pshipton
Copy link
Member Author

It seems it could be a decompilation endless loop.

@0xdaryl fyi. Somebody could grab the diagnostics before they expire and get deleted. This problem has been occurring for a while, we've only "recently" been able to get core files.

@pshipton
Copy link
Member Author

fwiw, I'm copying the sdk, debug image, and core files to /team/triage/openj9-8214.

@pshipton
Copy link
Member Author

Also copied them to Box, pls let me know if you need access.

@0xdaryl
Copy link
Contributor

0xdaryl commented Dec 14, 2020

@nbhuiyan : could you take an initial look at this please? I realize you're about to vanish this week on vacation, but perhaps a quick triage from the crash dump file could be done. Thanks.

@nbhuiyan
Copy link
Member

@0xdaryl Looking at the core dumps briefly, I also suspect the hang is happening during decompilation. However, further work is needed to determine the cause and coming up with a fix. Requesting that this issue is moved to the jdk16 release. I have kept all the files needed to continue this investigation upon my return.

@0xdaryl
Copy link
Contributor

0xdaryl commented Dec 17, 2020

Moving forward for the justification above.

@pshipton
Copy link
Member Author

@pshipton
Copy link
Member Author

@pshipton
Copy link
Member Author

@pshipton
Copy link
Member Author

@pshipton pshipton changed the title Windows LambdaLoadTest hang Windows LambdaLoadTest hang at 3% Jan 25, 2021
@nbhuiyan
Copy link
Member

nbhuiyan commented Feb 5, 2021

@0xdaryl, I have determined a possible cause for this hang from what I could gather from the diagnostics files over the last month. I am still working on the fix and will need more time to test and verify the issue has been resolved, which will take time as this hang happens intermittently. Requesting that this issue is moved forward to the next milestone.

@pshipton
Copy link
Member Author

pshipton commented Feb 5, 2021

Moved it. This is not a new issue, I'm glad we're getting to the bottom of it.

@0xdaryl
Copy link
Contributor

0xdaryl commented Mar 9, 2021

@nbhuiyan is working on a fix and studying its behaviour in local testing before contributing. If it is committed, he requested more time to allow it to "soak" in the head stream. I will push this out to 0.27.

@pshipton
Copy link
Member Author

pshipton commented Jun 9, 2021

I haven't seen this in some time, did a fix get merged?

@nbhuiyan
Copy link
Member

nbhuiyan commented Jun 9, 2021

I have not been able to verify that the possible fix I had would address this hang, since over the last month in my local testing using a recent build, I did not experience this hang without the fix. It is possible that this got fixed as a side effect of some other work. I suggest we close this issue now. If we see this hang again, we can revisit this issue.

@pshipton pshipton closed this as completed Jun 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants