Fix fake memory leaks in some test cases [databricks] #5955
Conversation
Signed-off-by: Chong Gao <res_life@163.com>
Depends on rapidsai/cudf#11161. Tested, and it works.
val REF_COUNT_DEBUG_STR = System.getProperty(MemoryCleaner.REF_COUNT_DEBUG_KEY, "false")
if (REF_COUNT_DEBUG_STR.equalsIgnoreCase("true")) {
nit:
- val REF_COUNT_DEBUG_STR = System.getProperty(MemoryCleaner.REF_COUNT_DEBUG_KEY, "false")
- if (REF_COUNT_DEBUG_STR.equalsIgnoreCase("true")) {
+ if (java.lang.Boolean.getBoolean(MemoryCleaner.REF_COUNT_DEBUG_KEY)) {
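For context (not part of the review thread): java.lang.Boolean.getBoolean(name) returns true only when the system property named by name exists and equals "true" ignoring case, which is exactly what the two-line version computes. A minimal standalone sketch, using a made-up property key rather than the real MemoryCleaner one:

object BooleanPropertyDemo {
  def main(args: Array[String]): Unit = {
    val key = "demo.refcount.debug" // hypothetical key, only for this sketch
    System.setProperty(key, "TRUE")

    // Two-line form: read the property with a default, then compare ignoring case.
    val viaGetProperty = System.getProperty(key, "false").equalsIgnoreCase("true")
    // One-line form: Boolean.getBoolean parses the same property the same way.
    val viaGetBoolean = java.lang.Boolean.getBoolean(key)

    assert(viaGetProperty && viaGetBoolean) // both are true for "TRUE"
  }
}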
Done.
MemoryCleaner.removeDefaultShutdownHook()
// Shutdown hooks are executed concurrently in JVM, and there is no execution order guarantee.
// See the doc of `Runtime.addShutdownHook`.
// Some resources are closed in Spark hooks.
// Here we should wait Spark hooks to be done, or a false leak will be detected.
// See issue: https://github.com/NVIDIA/spark-rapids/issues/5854
//
// `Spark ShutdownHookManager` leverages `Hadoop ShutdownHookManager` to manage hooks with
// priority. The priority parameter will guarantee the execution order.
//
// Here also use `Hadoop ShutdownHookManager` to add a lower priority hook.
// 20 priority is small enough, will run after Spark hooks.
// Note: `ShutdownHookManager.get()` is a singleton
org.apache.hadoop.util.ShutdownHookManager.get().addShutdownHook(
  MemoryCleaner.DEFAULT_SHUTDOWN_RUNNABLE, 20)
nit: if you follow the suggestion on the cudf PR you could do
- MemoryCleaner.removeDefaultShutdownHook()
  // Shutdown hooks are executed concurrently in JVM, and there is no execution order guarantee.
  // See the doc of `Runtime.addShutdownHook`.
  // Some resources are closed in Spark hooks.
  // Here we should wait Spark hooks to be done, or a false leak will be detected.
  // See issue: https://github.com/NVIDIA/spark-rapids/issues/5854
  //
  // `Spark ShutdownHookManager` leverages `Hadoop ShutdownHookManager` to manage hooks with
  // priority. The priority parameter will guarantee the execution order.
  //
  // Here also use `Hadoop ShutdownHookManager` to add a lower priority hook.
  // 20 priority is small enough, will run after Spark hooks.
  // Note: `ShutdownHookManager.get()` is a singleton
  org.apache.hadoop.util.ShutdownHookManager.get().addShutdownHook(
-   MemoryCleaner.DEFAULT_SHUTDOWN_RUNNABLE, 20)
+   MemoryCleaner.removeDefaultShutdownHook(), 20)
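For illustration only (not from this PR): Hadoop's ShutdownHookManager runs registered hooks in descending priority order, so a hook added with priority 20 fires after hooks registered with higher priorities, such as the ones Spark registers to close resources. A minimal sketch of that ordering, with made-up hook bodies:

import org.apache.hadoop.util.ShutdownHookManager

object HookOrderSketch {
  def main(args: Array[String]): Unit = {
    val manager = ShutdownHookManager.get() // process-wide singleton

    // Stands in for a Spark-managed hook that releases resources on shutdown.
    manager.addShutdownHook(new Runnable {
      override def run(): Unit = println("priority 50: release resources")
    }, 50)

    // Stands in for the leak-checking hook; lower priority, so it runs later.
    manager.addShutdownHook(new Runnable {
      override def run(): Unit = println("priority 20: check for leaks")
    }, 20)

    // On JVM exit the manager runs hooks from highest to lowest priority,
    // so the priority-50 hook prints before the priority-20 hook.
  }
}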
Done.
/**
 * Re-register leaks checking hook if configured.
 */
private def ReRegisterCheckLeakHook: Unit = {
nit: this is a method (so the name should start with a lowercase letter), and since it has side effects it should use an empty parameter list in parens.
- private def ReRegisterCheckLeakHook: Unit = {
+ private def reRegisterCheckLeakHook(): Unit = {
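As a general illustration of the convention being suggested (not code from this PR): in Scala a side-effecting, parameterless method is conventionally named in lowerCamelCase and declared with empty parens, while a parenless def reads like a pure accessor. A small made-up sketch:

object NamingConventionSketch {
  private var registered = false

  // Side-effecting: declared with () so call sites make the effect visible.
  private def reRegisterHook(): Unit = {
    registered = true
  }

  // Pure accessor: no parens, reads like a field.
  private def isRegistered: Boolean = registered

  def main(args: Array[String]): Unit = {
    reRegisterHook()        // called with parens
    println(isRegistered)   // called without parens
  }
}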
Done.
LGTM
build
build
build
build
Contributes to #5854
Problem
A RapidsHostMemoryStore.pool leaked error is logged when running Rapids Accelerator test cases.
Root cause
RapidsHostMemoryStore.pool is not closed before MemoryCleaner checks for leaks.
It is not actually a leak; it is caused by the shutdown hook execution order.
RapidsHostMemoryStore.pool is closed in the Spark executor plugin shutdown hook.
plugins.foreach(_.shutdown()) // this line will eventually close the RapidsHostMemoryStore.pool
The close path is:
Solution
First, remove the default hook in MemoryCleaner, then re-add it with a lower priority by leveraging the Spark/Hadoop ShutdownHookManager so that it runs after the Spark shutdown hooks.
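Putting the review snippets together, the fix roughly takes this shape. This is a sketch assembled from the suggestions above (assuming removeDefaultShutdownHook() returns the removed runnable, per the cuDF change), not necessarily the exact final code:

import ai.rapids.cudf.MemoryCleaner
import org.apache.hadoop.util.ShutdownHookManager

object LeakHookReRegistration {
  /**
   * Re-register the leak-checking hook if ref-count debugging is enabled,
   * so that it runs only after the Spark shutdown hooks that close the pools.
   */
  def reRegisterCheckLeakHook(): Unit = {
    if (java.lang.Boolean.getBoolean(MemoryCleaner.REF_COUNT_DEBUG_KEY)) {
      // removeDefaultShutdownHook() detaches cuDF's own JVM shutdown hook and
      // (per rapidsai/cudf#11161) returns its runnable; re-adding it with
      // priority 20 makes it run after Spark's higher-priority hooks.
      ShutdownHookManager.get().addShutdownHook(
        MemoryCleaner.removeDefaultShutdownHook(), 20)
    }
  }
}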
See the cuDF side change: rapidsai/cudf#11161
Signed-off-by: Chong Gao res_life@163.com