Optimal huff depth speed improvements #3302

Merged
27 changes: 17 additions & 10 deletions lib/compress/huf_compress.c
@@ -1261,28 +1261,35 @@ unsigned HUF_optimalTableLog(unsigned maxTableLog, size_t srcSize, unsigned maxS
     if (depthMode == HUF_depth_optimal) { /** Test valid depths and return optimal **/
         BYTE* dst = (BYTE*)workSpace + sizeof(HUF_WriteCTableWksp);
         size_t dstSize = wkspSize - sizeof(HUF_WriteCTableWksp);
-        size_t optSize = ((size_t) ~0);
-        unsigned huffLog;
         size_t maxBits, hSize, newSize;
         const unsigned symbolCardinality = HUF_cardinality(count, maxSymbolValue);
+        const unsigned minTableLog = HUF_minTableLog(symbolCardinality);
+        size_t optSize = ((size_t) ~0) - 1;
+        unsigned optLogGuess;

-        if (wkspSize < sizeof(HUF_buildCTable_wksp_tables)) return optLog;
+        if (wkspSize < sizeof(HUF_buildCTable_wksp_tables)) return optLog; /** Assert workspace is large enough **/

+        /* Search until size increases */
+        for (optLogGuess = minTableLog; optLogGuess <= maxTableLog; optLogGuess++) {
+            maxBits = HUF_buildCTable_wksp(table, count, maxSymbolValue, optLogGuess, workSpace, wkspSize);

-        for (huffLog = HUF_minTableLog(symbolCardinality); huffLog <= maxTableLog; huffLog++) {
-            maxBits = HUF_buildCTable_wksp(table, count,
-                                            maxSymbolValue, huffLog,
-                                            workSpace, wkspSize);
             if (ERR_isError(maxBits)) continue;

Cyan4973 (Contributor), Dec 20, 2022:

Minor optimization:

if (maxBits < optLogGuess) break;

If the tree builder cannot use all the allowed bits, the optimal Huffman distribution was already reached on the previous attempt. The loop can stop immediately, since further iterations bring no benefit.

Tested on silesia.tar with 1 KB blocks: this seems to improve compression speed by about 1%, with no impact on compression ratio.
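
The early exit suggested above can be illustrated with a toy model. This is only a sketch under stated assumptions: `toy_build`, `toy_cost`, `toy_search`, and `TOY_NATURAL_DEPTH` are hypothetical stand-ins, not zstd functions, and the cost numbers are invented. The builder reports the depth it actually used, and the search stops once that depth falls below the allowed limit. The sketch also includes the `optLogGuess > minTableLog` guard that appears in the merged line of this diff, presumably so that the first attempt is always evaluated.

```c
#include <assert.h>

/* Hypothetical: the unconstrained Huffman depth of this toy distribution. */
#define TOY_NATURAL_DEPTH 7u

/* Hypothetical stand-in for the tree builder: returns the depth actually used,
 * which never exceeds the natural depth even when more bits are allowed. */
static unsigned toy_build(unsigned maxAllowedBits)
{
    return maxAllowedBits < TOY_NATURAL_DEPTH ? maxAllowedBits : TOY_NATURAL_DEPTH;
}

/* Hypothetical cost model with invented numbers: deeper trees cost less here. */
static unsigned toy_cost(unsigned usedBits)
{
    return 1000u - 10u * usedBits;
}

/* Tries each depth limit in [minLog, maxLog]; counts builder calls in *trials. */
static unsigned toy_search(unsigned minLog, unsigned maxLog, unsigned* trials)
{
    unsigned bestLog = maxLog;
    unsigned bestCost = ~0u;
    unsigned guess;
    assert(minLog <= maxLog && trials != 0);
    *trials = 0;
    for (guess = minLog; guess <= maxLog; guess++) {
        unsigned const used = toy_build(guess);
        unsigned cost;
        (*trials)++;
        /* Builder left bits unused: the tree matches the previous attempt, stop. */
        if (used < guess && guess > minLog) break;
        cost = toy_cost(used);
        if (cost < bestCost) { bestCost = cost; bestLog = guess; }
    }
    return bestLog;
}
```

With these toy numbers, `toy_search(5, 11, &trials)` returns 7 after 4 builder calls. Without the early exit it would make 7 calls and reach the same answer.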

-            hSize = HUF_writeCTable_wksp(dst, dstSize, table, maxSymbolValue, (U32)maxBits,
-                                            workSpace, wkspSize);
+            if (maxBits < optLogGuess && optLogGuess > minTableLog) break;

+            hSize = HUF_writeCTable_wksp(dst, dstSize, table, maxSymbolValue, (U32)maxBits, workSpace, wkspSize);

             if (ERR_isError(hSize)) continue;

             newSize = HUF_estimateCompressedSize(table, count, maxSymbolValue) + hSize;

+            if (newSize > optSize + 1) {
+                break;
+            }

             if (newSize < optSize) {
                 optSize = newSize;
-                optLog = huffLog;
+                optLog = optLogGuess;
             }
         }
     }
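
One detail in this hunk is the change of the initial `optSize` from `((size_t) ~0)` to `((size_t) ~0) - 1`. A plausible reason, inferred from the code rather than stated in the PR, is the new test `newSize > optSize + 1`: with the all-ones sentinel, `optSize + 1` wraps to 0 in unsigned arithmetic, so the very first candidate would trigger the break. The sketch below uses a hypothetical helper, `toy_pick`, and invented sizes to show the difference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper mirroring the two size tests in the loop above.
 * Returns the index of the chosen candidate, or n if none was accepted. */
static size_t toy_pick(const size_t* sizes, size_t n, size_t initialOpt)
{
    size_t optSize = initialOpt;
    size_t best = n;
    size_t i;
    assert(sizes != NULL);
    for (i = 0; i < n; i++) {
        /* Stop once the size grows by more than 1; unsigned addition wraps on overflow. */
        if (sizes[i] > optSize + 1) break;
        if (sizes[i] < optSize) { optSize = sizes[i]; best = i; }
    }
    return best;
}
```

For the invented sizes `{120, 110, 111, 130, 90}`, starting from `((size_t) ~0) - 1` selects index 1: the 1-byte increase at index 2 is tolerated and the search stops at index 3. Starting from `((size_t) ~0)` makes `optSize + 1` wrap to 0, so the loop breaks on the first candidate and nothing is selected.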
20 changes: 10 additions & 10 deletions tests/regression/results.csv
@@ -12,9 +12,9 @@ silesia.tar, level 7, compress
 silesia.tar, level 9, compress simple, 4552899
 silesia.tar, level 13, compress simple, 4502956
 silesia.tar, level 16, compress simple, 4360527
-silesia.tar, level 19, compress simple, 4266970
+silesia.tar, level 19, compress simple, 4267021
 silesia.tar, uncompressed literals, compress simple, 4854086
-silesia.tar, uncompressed literals optimal, compress simple, 4266970
+silesia.tar, uncompressed literals optimal, compress simple, 4267021
 silesia.tar, huffman literals, compress simple, 6179047
 github.tar, level -5, compress simple, 52115
 github.tar, level -3, compress simple, 45678
@@ -135,7 +135,7 @@ silesia.tar, level 7, zstdcli,
 silesia.tar, level 9, zstdcli, 4552903
 silesia.tar, level 13, zstdcli, 4502960
 silesia.tar, level 16, zstdcli, 4360531
-silesia.tar, level 19, zstdcli, 4266974
+silesia.tar, level 19, zstdcli, 4267025
 silesia.tar, no source size, zstdcli, 4854160
 silesia.tar, long distance mode, zstdcli, 4845745
 silesia.tar, multithreaded, zstdcli, 4854164
@@ -283,7 +283,7 @@ silesia.tar, level 12 row 1, advanced
 silesia.tar, level 12 row 2, advanced one pass, 4513797
 silesia.tar, level 13, advanced one pass, 4502956
 silesia.tar, level 16, advanced one pass, 4360527
-silesia.tar, level 19, advanced one pass, 4266970
+silesia.tar, level 19, advanced one pass, 4267021
 silesia.tar, no source size, advanced one pass, 4854086
 silesia.tar, long distance mode, advanced one pass, 4840452
 silesia.tar, multithreaded, advanced one pass, 4854160
@@ -601,7 +601,7 @@ silesia.tar, level 12 row 1, advanced
 silesia.tar, level 12 row 2, advanced one pass small out, 4513797
 silesia.tar, level 13, advanced one pass small out, 4502956
 silesia.tar, level 16, advanced one pass small out, 4360527
-silesia.tar, level 19, advanced one pass small out, 4266970
+silesia.tar, level 19, advanced one pass small out, 4267021
 silesia.tar, no source size, advanced one pass small out, 4854086
 silesia.tar, long distance mode, advanced one pass small out, 4840452
 silesia.tar, multithreaded, advanced one pass small out, 4854160
@@ -919,7 +919,7 @@ silesia.tar, level 12 row 1, advanced
 silesia.tar, level 12 row 2, advanced streaming, 4513797
 silesia.tar, level 13, advanced streaming, 4502956
 silesia.tar, level 16, advanced streaming, 4360527
-silesia.tar, level 19, advanced streaming, 4266970
+silesia.tar, level 19, advanced streaming, 4267021
 silesia.tar, no source size, advanced streaming, 4859267
 silesia.tar, long distance mode, advanced streaming, 4840452
 silesia.tar, multithreaded, advanced streaming, 4854160
@@ -1213,10 +1213,10 @@ silesia.tar, level 7, old stre
 silesia.tar, level 9, old streaming, 4552900
 silesia.tar, level 13, old streaming, 4502956
 silesia.tar, level 16, old streaming, 4360527
-silesia.tar, level 19, old streaming, 4266970
+silesia.tar, level 19, old streaming, 4267021
 silesia.tar, no source size, old streaming, 4859267
 silesia.tar, uncompressed literals, old streaming, 4859271
-silesia.tar, uncompressed literals optimal, old streaming, 4266970
+silesia.tar, uncompressed literals optimal, old streaming, 4267021
 silesia.tar, huffman literals, old streaming, 6179056
 github, level -5, old streaming, 204407
 github, level -5 with dict, old streaming, 46718
@@ -1323,7 +1323,7 @@ silesia.tar, level 7, old stre
 silesia.tar, level 9, old streaming advanced, 4552900
 silesia.tar, level 13, old streaming advanced, 4502956
 silesia.tar, level 16, old streaming advanced, 4360527
-silesia.tar, level 19, old streaming advanced, 4266970
+silesia.tar, level 19, old streaming advanced, 4267021
 silesia.tar, no source size, old streaming advanced, 4859267
 silesia.tar, long distance mode, old streaming advanced, 4859271
 silesia.tar, multithreaded, old streaming advanced, 4859271
@@ -1333,7 +1333,7 @@ silesia.tar, small hash log, old stre
 silesia.tar, small chain log, old streaming advanced, 4917021
 silesia.tar, explicit params, old streaming advanced, 4806873
 silesia.tar, uncompressed literals, old streaming advanced, 4859271
-silesia.tar, uncompressed literals optimal, old streaming advanced, 4266970
+silesia.tar, uncompressed literals optimal, old streaming advanced, 4267021
 silesia.tar, huffman literals, old streaming advanced, 6179056
 silesia.tar, multithreaded with advanced params, old streaming advanced, 4859271
 github, level -5, old streaming advanced, 213265