ZstdOutputStream leaking thread handles #203
Comments
Hmm, interesting. Does it still leak threads if you add My hypothesis is that the |
Thanks for the response! I'm back to work today. Yes, the issue still happens if I add an explicit call to close at the end of the try block, which is expected since try-with-resources does the same thing implicitly. I also thought the leak may have to do with GC, but now I'm not convinced. Note that although the sample code I supplied is a short lived program, our production code runs continuously as a service process for days/weeks/months and the thread handle count is not reduced until the process is restarted. We've had customers report millions of handles held by our process after multiple days of repeated compression tasks. I also tried triggering GC manually via jconsole and was able to see the in-use memory drop, but the open thread handle count seems to be unaffected by the cleanup. I also want to re-emphasize that we're not leaking actual threads - just thread handles, which are essentially pointers to threads that no longer exist. |
My bad, yes, it's not related to GC - I didn't notice it was in BTW, noticed you may be using Windows. Unfortunately I don't have access to Windows to check. Can you reproduce it with the zstd binaries provided by https://github.com/facebook/zstd? For example you can use something like:
to run the benchmark for levels 1-9, 10 seconds per level, using 16 threads for compression. |
That's a good thought - I was also thinking they might be native thread handles and not related to the zstd-jni java code. It also seems possible that the java code is not properly closing out some native objects, in which case it would be a bug in the java code. I can try a repro with the native zstd binaries, and I can also see if the issue reproduces on Linux too. It will take a little time to get back to you. |
No rush. The Java code calls I found the Zstd threading implementation for win32 https://github.com/facebook/zstd/blob/12c045f74d922dc934c168f6e1581d72df983388/lib/common/threading.c#L23-L77 and it looks like it does not close the handles ( |
I completed a couple of tests on Windows using the native zstd executable (version 1.5.2).
For the first test I generated a list of 323 files to be compressed (total size 51GB) and then compressed those files individually within a single instance of the zstd process using the following command:
zstd -T16 --filelist .\FILE_LIST.TXT --output-dir-flat .\COMPRESSED_FILES
This resulted in the zstd executable opening about 80 handles and staying constant at that level for the entire duration. There was no evidence of thread handles leaking over time, and the count does not seem to vary with the number or size of the files.
For the second test I ran the command suggested by @luben above on a single 24GB file:
zstd -b1 -e9 -i10 -T16 BIG_FILE
This resulted in a total of about 250 handles by the time the process finished. The number of open handles did increase over time as it proceeded through the benchmarks, but not as dramatically as it does with the zstd-jni code sample I shared in my original post.
I am not sure why the results differ slightly between these two test scenarios, but the second test suggests that the number of open handles can increase over time in the native implementation. This is not a critical issue for my organization at the moment so I don't intend to pursue this further right now, but figured I would share these findings. |
Thanks. I will share this with upstream to see if they have any idea how to fix it. |
The leak was fixed upstream with facebook/zstd#3147, so the next release will include the fix. |
Yes, that's great! Looks like they put out 2-3 releases per year on average. |
I think this should be fixed in 1.5.5. Please reopen if you still see the problem. |
We have a long-lived service process on Windows that periodically uses zstd-jni to create compressed files. We've noticed that this process accumulates thread handles over time that are not released, even after streams are closed and Java garbage collection is triggered. The accumulation of leaked handles is more drastic with higher numbers of concurrent compression threads, but still occurs when configured for a single compression thread. Note that the number of actual threads at any given time is constant, but the number of HANDLES does increase over time as repeated compression tasks occur.
This issue is present in the newest release of zstd-jni as of today, which is 1.5.1-1.
How we're measuring handle accumulation
We're using Process Explorer on Windows to view the accumulation of thread handles in our Java process over time. The count of handles for a given process can be viewed by right-clicking the process, going to Properties, and then clicking the "environment" tab in the window that opens. The individual handles for a process are enumerated in the lower pane of the main Process Explorer window: select the process and then choose View --> Lower Pane View --> Handles. From this we can see that the accumulating handles are thread handles and not file handles or some other type.
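As a complement to the Process Explorer view, the claim that the number of actual threads stays constant can be checked from inside the JVM. The following is a minimal sketch, not part of the original report: the class name `ThreadCountProbe` and the stand-in task are hypothetical, and the standard `ThreadMXBean` only reports live JVM threads. It does not expose Windows handle counts, so it can confirm "no thread leak" but cannot show the handle growth itself.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.Arrays;

// Hypothetical helper for this sketch; not part of zstd-jni or the original report.
final class ThreadCountProbe {

    // Runs the task `iterations` times and records the live JVM thread count after each run.
    static int[] sampleLiveThreads(Runnable task, int iterations) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        int[] counts = new int[iterations];
        for (int i = 0; i < iterations; i++) {
            task.run();
            counts[i] = threads.getThreadCount();
        }
        return counts;
    }

    public static void main(String[] args) {
        // Stand-in task (an assumption): start a short-lived worker thread and wait for it.
        // In the scenario described here, this would be one compression task.
        Runnable task = () -> {
            Thread worker = new Thread(() -> { });
            worker.start();
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        System.out.println(Arrays.toString(sampleLiveThreads(task, 5)));
    }
}
```

If the sampled counts stay roughly flat while Process Explorer shows the handle count climbing, that matches the situation described: handles to finished threads are accumulating, not the threads themselves.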
Sample Code
Below is the simplest example we could come up with to illustrate the problem. If you run the following code against a relatively small file and examine the handles via the Process Explorer method described above, you should be able to see the growth. It may help to place a breakpoint at the end of the main method to keep the process running while you inspect it in Process Explorer. Note that we are properly closing the streams via try-with-resources.
For reference, when we run this code against a 10MB input file with 16 compression threads, we see an accumulation of about 2000 handles, and this count increases as you increase the number of for loop iterations. If you remove the ZstdOutputStream, the count of handles generally remains constant.
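The original sample code did not survive in this copy of the issue. The following is only a sketch of the loop shape the description implies (repeated compression in a for loop, each stream closed via try-with-resources), not the reporter's code. The `Wrapper` interface and the `RepeatedCompression` class are hypothetical names, the data is compressed in memory instead of from a file, and `GZIPOutputStream` from the JDK is used as a stand-in so the sketch runs without extra dependencies.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical harness for this sketch; not the reporter's original sample.
final class RepeatedCompression {

    // Hypothetical factory type: wraps a sink in a compressing stream.
    interface Wrapper {
        OutputStream wrap(OutputStream sink) throws IOException;
    }

    // Compresses `input` once per iteration, closing each stream via try-with-resources.
    // Returns the total number of bytes written to the sinks.
    static long compressRepeatedly(byte[] input, int iterations, Wrapper wrapper)
            throws IOException {
        long total = 0;
        for (int i = 0; i < iterations; i++) {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            try (OutputStream out = wrapper.wrap(sink)) {
                out.write(input);
            }
            total += sink.size();
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] input = new byte[1 << 20]; // 1 MiB of zeros as placeholder data
        // Stand-in codec (an assumption). With zstd-jni on the classpath, the wrapper
        // would instead construct a ZstdOutputStream configured with worker threads.
        long total = compressRepeatedly(input, 100, GZIPOutputStream::new);
        System.out.println("compressed bytes written: " + total);
    }
}
```

To observe the behavior reported here, the stand-in wrapper would be replaced with one that builds a `ZstdOutputStream` with multiple workers, and the process would be kept alive after the loop (for example at a breakpoint) while the handle count is inspected.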