Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugfix for huge dictionaries #3157

Merged
merged 1 commit into from
Jun 9, 2022
Merged

Bugfix for huge dictionaries #3157

merged 1 commit into from
Jun 9, 2022

Conversation

embg
Copy link
Contributor

@embg embg commented Jun 9, 2022

To trigger the bug from CLI, disable the protection in fileio.c. This file is outside of lib/, so it's possible to trigger the bug without any modifications through the API.

diff --git a/programs/fileio.c b/programs/fileio.c
index 3c47e3f5..d23ef038 100644
--- a/programs/fileio.c
+++ b/programs/fileio.c
@@ -672,7 +672,7 @@ static size_t FIO_createDictBuffer(void** bufferPtr, const char* fileName, FIO_p
 
     fileSize = UTIL_getFileSizeStat(&statbuf);
     {
-        size_t const dictSizeMax = prefs->patchFromMode ? prefs->memLimit : DICTSIZE_MAX;
+        size_t const dictSizeMax = 1lu << 40; //prefs->patchFromMode ? prefs->memLimit : DICTSIZE_MAX;
         if (fileSize >  dictSizeMax) {
             EXM_THROW(34, "Dictionary file %s is too large (> %u bytes)",
                             fileName,  (unsigned)dictSizeMax);   /* avoid extreme cases */

Then make with DEBUGLEVEL >= 2 and trigger the bug as follows:

embg@embg-mbp zstd % ls -l bad_dict1
-rw-r--r--  1 embg  staff  3758096383 Jun  9 11:18 bad_dict1
embg@embg-mbp zstd % ./zstd -1 -D bad_dict1 README.md -o /dev/null
Assertion failed: (curr > newCurrent), function ZSTD_window_correctOverflow, file zstd_compress_internal.h, line 1016.
zsh: abort      ./zstd -1 -D bad_dict1 README.md -o /dev/null

With the change from this PR applied, the bug goes away:

embg@embg-mbp zstd % ./zstd -1 -D bad_dict1 README.md -o /dev/null
README.md            : 46.11%   (  9.62 KiB =>   4.44 KiB, /dev/null)   

Copy link
Contributor

@felixhandte felixhandte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice find! For a long time, the start index was 1, so it doesn't surprise me that we have lingering inconsistencies.

@embg embg added the bug label Jun 9, 2022
@embg embg merged commit f313a77 into facebook:dev Jun 9, 2022
@embg embg deleted the huge_dict_bugfix branch June 9, 2022 19:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants