Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Jtreg/FFI]Fix the OOM issue with thunk allocation #16260

Conversation

ChengJin01
Copy link

The changes aim to address the OOM issue with thunk allocation on the platforms with a small page size (e.g. 4KB) by maintaining a circular thunk heap list in which case a new thunk heap is created only when there is no free thunk heap available in the list to allocate thunk.

Signed-off-by: Cheng Jin jincheng@ca.ibm.com

@ChengJin01
Copy link
Author

ChengJin01 commented Nov 4, 2022

An intermitant OOM issue is detected in Jtreg FFI test suite at https://github.com/ibmruntimes/openj9-openjdk-jdk19/blob/openj9/test/jdk/java/foreign/TestUpcallAsync.java on zLinux and CentOS/x86_64 with pretty small page size (4KB), in which case the allocated thunks are held by native threads (created in native test code) for a while prior to termination. When the allocated thunks(to be released but delayed) accumulates, OOM is randomly triggered in upcall due to the temporary out-of-memory situation when allocating a new thunk for upcall.

This PR is to resolve this issue by replacing a single thunk heap with the fixed page size in the existing implementation with a circular linked list of thunk heaps so as to make the best of heaps for thunk allocation. The idea is straightforward as follows:

  1. Initially there is only one heap(the head by default) in the list for thunk allocation, supposing it is called heap 1.
  2. When heap 1 is out-of-memory, create a new heap 2 and insert it to the next one of heap 1 and set heap 2 as the list head.
  3. When heap 2 is out-of-memory, move forward to heap 1 in the circular list, assuming some of allocated thunks are already released in heap 1. If heap 1 is free for allocation, set heap 1 as the list head to keep allocating thunks.
  4. When there is no heap with free memory available in the list, create a new heap 3 and insert it to the next one of head to allocate thunks.

That being said, the list head always points to the free heap to be allocated till it is out-of-memory in which case we move to the next heap with free memory or create a new heap if there is no free memory for thunk in the list.

@ChengJin01
Copy link
Author

Reviewer: @gacholio, @tajila
FYI: @pshipton, @knn-k

@ChengJin01
Copy link
Author

Hi @knn-k, please help double-check to ensure pthread_jit_write_protect_np is still at the right place for Aarch64 in this PR.

@ChengJin01 ChengJin01 force-pushed the fix_ffi_oom_thunk_alloc_with_heap_list branch from fab545a to e31c3dd Compare November 4, 2022 16:26
@knn-k
Copy link
Contributor

knn-k commented Nov 5, 2022

@ChengJin01 pthread_jit_write_protect_np for AArch64 macOS looks OK.

Copy link
Contributor

@gacholio gacholio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable to me.

@gacholio gacholio requested a review from keithc-ca November 8, 2022 14:07
@gacholio
Copy link
Contributor

gacholio commented Nov 8, 2022

This is complex enough that someone else should also look it over (suggesting @keithc-ca).

runtime/vm/UpcallThunkMem.cpp Outdated Show resolved Hide resolved
runtime/vm/UpcallThunkMem.cpp Outdated Show resolved Hide resolved
runtime/vm/UpcallThunkMem.cpp Outdated Show resolved Hide resolved
runtime/vm/jvminit.c Outdated Show resolved Hide resolved
@ChengJin01 ChengJin01 force-pushed the fix_ffi_oom_thunk_alloc_with_heap_list branch 3 times, most recently from 70a2f0f to 5022aa5 Compare November 9, 2022 22:27
@ChengJin01 ChengJin01 force-pushed the fix_ffi_oom_thunk_alloc_with_heap_list branch 2 times, most recently from d7987c2 to 51acf4a Compare November 10, 2022 21:52
@ChengJin01 ChengJin01 force-pushed the fix_ffi_oom_thunk_alloc_with_heap_list branch 10 times, most recently from 3d8d465 to 9a27fd3 Compare November 25, 2022 17:28
@ChengJin01 ChengJin01 force-pushed the fix_ffi_oom_thunk_alloc_with_heap_list branch from 9a27fd3 to aa110d5 Compare November 25, 2022 18:12
@keithc-ca keithc-ca requested a review from gacholio November 28, 2022 15:31
@keithc-ca
Copy link
Contributor

@gacholio Can you please have another look at this? Thanks.

@ChengJin01 ChengJin01 force-pushed the fix_ffi_oom_thunk_alloc_with_heap_list branch from aa110d5 to 678076b Compare November 28, 2022 17:07
The changes aim to address the OOM issue with thunk
allocation on the platforms with a small page size
(e.g. 4KB) by maintaining a thunk heap list in which
case a new thunk heap is created only when there is
no free thunk heap available in the list to allocate
thunk.

Signed-off-by: Cheng Jin <jincheng@ca.ibm.com>
@ChengJin01 ChengJin01 force-pushed the fix_ffi_oom_thunk_alloc_with_heap_list branch from 678076b to 5e55aca Compare November 28, 2022 17:09
@gacholio
Copy link
Contributor

What JDK do we need to test for these changes?

@ChengJin01
Copy link
Author

ChengJin01 commented Nov 28, 2022

What JDK do we need to test for these changes?

JDK17 & 19 (no difference in terms of native code in these changes).

@gacholio
Copy link
Contributor

jenkins test sanity all jdk19

@gacholio
Copy link
Contributor

jenkins compile win jdk8

@gacholio gacholio merged commit 7a844d3 into eclipse-openj9:master Nov 29, 2022
@tajila tajila added the backport:candidate Possible candidate for a backport to a `0.x.y+1` release label Nov 29, 2022
@pshipton pshipton removed the backport:candidate Possible candidate for a backport to a `0.x.y+1` release label Dec 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants