miniz fixes #227
I left some comments; I recommend using the mz_*Bound API to handle the output length.
I'm not sure what the correct solution is for choosing the correct inflate buffer size. In lua-zlib, they create a one-time-use inflator that grows the buffer in the same way that |
If we do allow an input value, we need upper and lower bounds.
local data = miniz.uncompress(compressed, 2^32) -- overflow
EDIT: make reverse of mz_deflateBound
I don't think that will work. According to "Maximum Expansion Factor" in https://zlib.net/zlib_tech.html, deflateBound is an estimate of how much space would be needed for compressed data plus any overhead. For data that is difficult to compress, the output can easily be larger than the input.
Indeed, miniz's implementation always produces a number that is larger than the input size, even though the output is expected to be smaller due to compression. (I'm not sure why miniz calls this conservative.) If we invert that function, the output buffer will always be smaller than the input buffer, which would fail during typical use.
local function deflateBound(x)
  local a = 128 + (x * 110) / 100
  local b = 128 + x + ((x / (31 * 1024)) + 1) * 5
  return math.floor(math.max(a, b))
end
for i = 0, 16 do
  local in_len = 2^i
  local out_len = deflateBound(in_len)
  print(i, in_len, out_len, out_len / in_len)
end
The "Maximum Compression Factor" explains what we are looking for. There is a theoretical compression ratio as high as 1000:1 with typical ratios of 2:1 to 5:1. So, I think that an expanding buffer is a good idea. Python does this.
We just need to rely on something other than MZ_BUF_ERROR to know when to stop expanding. Maybe impose an upper limit somewhere between 2^16 and 2^32.
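To make the expanding-buffer idea concrete, here is a rough sketch in plain Lua. It is not the binding's actual code: `try_inflate` is a hypothetical callback standing in for one decompression attempt into a buffer of `size` bytes, returning the output string or nil when the buffer was too small. The default limit of 256 MiB is only an assumption picked from the 2^16 to 2^32 range mentioned above.

```lua
-- Sketch only: try_inflate(data, size) is a hypothetical stand-in for a
-- single inflate attempt. It returns the output string on success, or nil
-- when `size` bytes were not enough.
local DEFAULT_LIMIT = 256 * 1024 * 1024 -- assumed cap, between 2^16 and 2^32

local function inflate_with_growth(try_inflate, data, initial, limit)
  limit = limit or DEFAULT_LIMIT
  local size = math.min(initial or math.max(#data * 2, 64), limit)
  while true do
    local out = try_inflate(data, size)
    if out then
      return out
    end
    -- Stop on the cap instead of retrying forever on a buffer error.
    if size >= limit then
      return nil, string.format("output exceeds limit of %d bytes", limit)
    end
    size = math.min(size * 2, limit) -- double, but never past the cap
  end
end
```

Doubling keeps the number of retries logarithmic in the final output size, and the cap means corrupt or hostile input cannot force unbounded allocation.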
See the recent commit to |
I finally got a chance to use the miniz zlib API (#163 and #165), but I ran into some issues.
- Byte streams were not fully processed in inflator:inflate/deflator:deflate when they were larger than the output buffer. This was missed by the tests because they never reached the buffer limit. With inspiration from the lua-zlib bindings, I added a loop to handle the overflow and changed the test to parse a larger stream. I also removed the "finish" flush mode from the inflator test, which lua-zlib also seems to omit. We may have to see whether we're properly implementing that.
- miniz.compress was using the wrong stack index for the compression level selection. I factored this out into a helper function, changed the index from 1 to 2, and added a test.
- miniz.uncompress attempted to use an output buffer of the same size as the input buffer. I changed this to allow an optional setting, with input * 10 as the default, and added a test.
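For illustration, the optional size setting could be validated along these lines. This is a sketch in plain Lua, not the C code in this PR: `choose_output_size` and the `MAX_OUT` cap are hypothetical names and values, chosen only to show a lower and upper bound around the `input * 10` default.

```lua
-- Sketch only: MAX_OUT is an assumed upper bound, not a value from miniz.
local MAX_OUT = 2^31 - 1

-- Hypothetical helper: pick an output buffer size for uncompress.
-- Returns the size, or nil plus a message when the request is invalid.
local function choose_output_size(input_len, requested)
  if requested == nil then
    -- Default: ten times the input, clamped to [1, MAX_OUT].
    return math.min(math.max(input_len * 10, 1), MAX_OUT)
  end
  if type(requested) ~= "number" or requested ~= math.floor(requested) then
    return nil, "size must be an integer"
  end
  if requested < 1 or requested > MAX_OUT then
    return nil, "size out of range"
  end
  return requested
end
```

With these bounds, a call such as `choose_output_size(#compressed, 2^32)` is rejected up front instead of overflowing.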