[BUG] PreparedDictionaryImpl data gets removed by garbage collection #189

Closed
bwollmer opened this issue Nov 26, 2024 · 0 comments · Fixed by #190

@bwollmer
Contributor

Describe the bug
PreparedDictionaries are losing their content because the garbage collector reclaims it. This can happen "silently": the code does not crash, but the compression ratio is poor because no dictionary data is actually used. The problem seems to be that the PreparedDictionaryImpl class does not hold a reference to rawData.

To Reproduce
I would love to show a test, but since garbage collection is involved and can run at any time, the test would be flaky.
But the general idea:

  1. Call Encoder.prepareDictionary
  2. Use the dictionary for compression
  3. Let the garbage collector run
  4. Repeat step 2 and compare the results
  5. The output of the second run should be much larger than that of the first run, since an empty dictionary was used
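
The mechanism behind these steps can be illustrated without the compression library. The following is a minimal sketch, not the library's code: `WeakHolder` and `StrongHolder` are hypothetical stand-ins, and a `WeakReference` is used only as an analogy for a wrapper that does not keep its raw data reachable. `System.gc()` is just a hint to the JVM, so the first printed value may differ between runs, which is also why a real regression test would be flaky.

```java
import java.lang.ref.WeakReference;
import java.nio.ByteBuffer;

public class ReachabilityDemo {

    /** Hypothetical wrapper that does NOT keep the buffer strongly reachable. */
    static final class WeakHolder {
        private final WeakReference<ByteBuffer> data;

        WeakHolder(ByteBuffer data) {
            this.data = new WeakReference<>(data);
        }

        /** May become false after a GC cycle, since nothing else references the buffer. */
        boolean stillHasData() {
            return data.get() != null;
        }
    }

    /** Hypothetical wrapper that keeps a strong reference to the buffer. */
    static final class StrongHolder {
        private final ByteBuffer data;

        StrongHolder(ByteBuffer data) {
            this.data = data;
        }

        /** Stays true while this holder is reachable, because the field is a strong reference. */
        boolean stillHasData() {
            return data != null;
        }
    }

    public static void main(String[] args) {
        WeakHolder weak = new WeakHolder(ByteBuffer.allocateDirect(1024));
        StrongHolder strong = new StrongHolder(ByteBuffer.allocateDirect(1024));

        System.gc(); // only a request; the JVM may or may not collect here

        System.out.println("weak holder has data:   " + weak.stillHasData());
        System.out.println("strong holder has data: " + strong.stillHasData());
    }
}
```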

Expected behavior
Custom dictionaries should be safe from garbage collection for as long as they are still in use.

Platform (please complete the following information):
I saw the behavior on Linux and macOS, but this should be platform independent.

Additional context
We have seen this in production, where we load a dictionary once at startup and use it multiple times. As a workaround, we keep a reference to the ByteBuffer passed to Encoder.prepareDictionary, which solved the problem for us.
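
One way to structure that workaround is sketched below. This is an assumption-laden illustration, not the reporter's actual code: `PinnedDictionary` is a hypothetical helper, and it is generic over the prepared dictionary type so that the sketch compiles without the compression library on the classpath. In a real application the `prepare` function would wrap the call to Encoder.prepareDictionary.

```java
import java.nio.ByteBuffer;
import java.util.Objects;
import java.util.function.Function;

/**
 * Hypothetical helper that stores a prepared dictionary together with the
 * ByteBuffer it was built from, so the buffer stays strongly reachable for
 * as long as this object is reachable.
 */
public final class PinnedDictionary<D> {
    private final ByteBuffer rawData; // strong reference keeps the buffer alive
    private final D prepared;

    private PinnedDictionary(ByteBuffer rawData, D prepared) {
        this.rawData = rawData;
        this.prepared = prepared;
    }

    /** Builds the prepared dictionary from rawData and pins rawData next to it. */
    public static <D> PinnedDictionary<D> of(ByteBuffer rawData, Function<ByteBuffer, D> prepare) {
        Objects.requireNonNull(rawData, "rawData");
        Objects.requireNonNull(prepare, "prepare");
        return new PinnedDictionary<>(rawData, prepare.apply(rawData));
    }

    public D prepared() {
        return prepared;
    }

    public ByteBuffer rawData() {
        return rawData;
    }
}
```

The holder only helps while it is itself reachable, for example from a long-lived field. If a caller extracts `prepared()` and then drops the `PinnedDictionary`, the buffer can become unreachable again.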

hyperxpro added a commit that referenced this issue Nov 27, 2024
…Impl (#190)

Motivation:

As described in #189, the garbage collector currently reclaims the data
of the dictionary, resulting in poor compression ratios, since no
dictionary is actually used.
The code to finalize the rawData already exists.

Modification:
Hold a reference to rawData in PreparedDictionaryImpl.

Result:
Fixes #189.

---------

Co-authored-by: Benjamin Wollmer <benni@wollmer.dev>
Co-authored-by: Aayush Atharva <aayush@shieldblaze.com>