s2: Add Dictionary support. #685
Merged
Compression Improvement
github_users_sample_set
From https://github.com/facebook/zstd/releases/tag/v1.1.3
With 64K dictionary trained with zstd:
9114 files, 7484607 bytes input:
Default Compression: 3362023 (44.92%) -> 921524 (12.31%)
Better: 3083163 (41.19%) -> 873154 (11.67%)
Best: 3057944 (40.86%) -> 785503 bytes (10.49%)
Go Sources
8912 files, 51253563 bytes input:
Default: 22955767 (44.79%) -> 19654568 (38.35%)
Better: 20189613 (39.39%) -> 16289357 (31.78%)
Best: 19482828 (38.01%) -> 15184589 (29.63%)
Status:
Non-goals
There will be no assembly for the initial release. Some compression may also still be left on the table.
There will be no Snappy implementation, since it will be incompatible anyway.
DOCUMENTATION
Note: S2 dictionary compression is currently at an early implementation stage, with no assembly for
either encoding or decoding. Performance improvements can be expected in the future.
Adding dictionaries allows providing a custom dictionary that will serve as a lookup at the beginning of blocks.
The same dictionary must be used for both encoding and decoding.
S2 does not keep track of whether the same dictionary is used,
and using the wrong dictionary will most often not result in an error when decompressing.
Blocks encoded without dictionaries can be decompressed seamlessly with a dictionary.
This means it is possible to switch from an encoding without dictionaries to an encoding with dictionaries
and treat the blocks similarly.
The usage scenarios for Zstandard dictionaries also apply to S2 dictionaries.
S2 additionally restricts dictionary use to the first 64KB of a block.
This will remove any negative (speed) impacts of the dictionaries on bigger blocks.
Compression
Using the github_users_sample_set and a 64KB dictionary trained with Zstandard, the sizes listed under Compression Improvement above can be achieved.
So for highly repetitive content, this case provides a 3.5x to 3.9x reduction in size.
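The size reduction can be checked directly from the sizes reported under Compression Improvement. A small sketch of that arithmetic follows; the type and function names are made up for illustration and are not part of the s2 package.

```go
package main

import "fmt"

// result holds the compressed sizes reported above for one compression level.
type result struct {
	level    string
	noDict   int // compressed bytes without a dictionary
	withDict int // compressed bytes with the 64KB dictionary
}

// reduction returns how many times smaller the output is with a dictionary.
func reduction(r result) float64 {
	return float64(r.noDict) / float64(r.withDict)
}

// percentOfInput returns size as a percentage of the uncompressed input.
func percentOfInput(size, input int) float64 {
	return 100 * float64(size) / float64(input)
}

func main() {
	const input = 7484607 // total uncompressed bytes across 9114 files
	results := []result{
		{"Default", 3362023, 921524},
		{"Better", 3083163, 873154},
		{"Best", 3057944, 785503},
	}
	for _, r := range results {
		fmt.Printf("%-8s %.2f%% -> %.2f%% (%.2fx smaller)\n",
			r.level,
			percentOfInput(r.noDict, input),
			percentOfInput(r.withDict, input),
			reduction(r))
	}
}
```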
For less uniform data we will use the Go source code tree.
Compressing the first 64KB of all .go files in go/src, Go 1.19.5, 8912 files, 51253563 bytes input, gives the sizes listed under Go Sources above.
Creating Dictionaries
There are no tools to create dictionaries in S2.
However, there are multiple ways to create a useful dictionary:
Using a Sample File
If your input is very uniform, you can just use a sample file as the dictionary.
For example, in the github_users_sample_set above, the average compressed size only goes up from 10.49% to 11.48% when using the first file as the dictionary compared to using a dedicated dictionary.
Using Zstandard
Zstandard dictionaries can easily be converted to S2 dictionaries.
This can be helpful to generate dictionaries for files that don't have a fixed structure.
Example, with training set files placed in ./training-set:
λ zstd -r --train-fastcover training-set/* --maxdict=65536 -o name.dict
This will create a 64KB dictionary that can then be converted to an S2 dictionary using s2.MakeDict.
It is recommended to save the dictionary returned by b := dict.Bytes(), since that will contain only the S2 dictionary.
This dictionary can later be loaded using s2.NewDict(b). The dictionary then no longer requires zstd to be initialized.
Also note how s2.MakeDict allows you to search for a common starting sequence of your files.
This can be omitted, at the expense of a few bytes.
Dictionary Encoding
Adding dictionaries allows providing a custom dictionary that will serve as a lookup at the beginning of blocks.
A dictionary provides an initial repeat value that can be used to point to a common header.
Other than that, the dictionary contains values that can be used as back-references.
Frequently used data should be placed at the end of the dictionary, since offsets < 2048 bytes will be smaller.
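To see why placement matters: the dictionary is treated as if it sat directly before the decoded output, so the back-reference distance to a dictionary byte grows the closer that byte is to the start of the dictionary. The following is only a sketch of that arithmetic; backrefOffset and shortOffsetLimit are illustrative names, not part of the s2 API.

```go
package main

import "fmt"

// shortOffsetLimit is the threshold quoted above: offsets below it encode smaller.
const shortOffsetLimit = 2048

// backrefOffset returns the back-reference distance from output position outPos
// to dictionary position dictPos, with the dictionary laid out directly before
// the decoded output.
func backrefOffset(dictLen, dictPos, outPos int) int {
	return (dictLen - dictPos) + outPos
}

func main() {
	const dictLen = 65536
	// The same 16-byte entry referenced from the very start of a block,
	// once placed at the front of the dictionary and once at the end.
	front := backrefOffset(dictLen, 0, 0)
	back := backrefOffset(dictLen, dictLen-16, 0)
	fmt.Println("front offset:", front, "short:", front < shortOffsetLimit)
	fmt.Println("back offset:", back, "short:", back < shortOffsetLimit)
}
```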
Format
Dictionary content must be at least 16 bytes and at most 64KiB (65536 bytes).
Encoding:
[repeat value (uvarint)][dictionary content...]
The dictionary content is preceded by an unsigned base-128 (uvarint) encoded value specifying the initial repeat offset.
This value is an offset into the dictionary content and not a back-reference offset,
so setting this to 0 will make the repeat value point to the first value of the dictionary.
The value must be less than the dictionary length minus 8.
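The layout above can be read and written with Go's standard encoding/binary package alone. The sketch below follows the rules exactly as stated in this section; the function names are hypothetical, and the checks are not claimed to match what s2.NewDict does internally.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	minDictSize = 16    // minimum dictionary content size in bytes
	maxDictSize = 65536 // maximum dictionary content size in bytes
)

// validate applies the size and repeat-value rules from the format description.
func validate(repeat uint64, content []byte) error {
	if len(content) < minDictSize || len(content) > maxDictSize {
		return fmt.Errorf("content must be %d to %d bytes, got %d",
			minDictSize, maxDictSize, len(content))
	}
	if repeat >= uint64(len(content)-8) {
		return fmt.Errorf("repeat value %d must be less than %d", repeat, len(content)-8)
	}
	return nil
}

// marshalDict serializes [repeat value (uvarint)][dictionary content...].
func marshalDict(repeat uint64, content []byte) ([]byte, error) {
	if err := validate(repeat, content); err != nil {
		return nil, err
	}
	buf := make([]byte, binary.MaxVarintLen64+len(content))
	n := binary.PutUvarint(buf, repeat)
	n += copy(buf[n:], content)
	return buf[:n], nil
}

// unmarshalDict splits a serialized dictionary into repeat value and content.
func unmarshalDict(b []byte) (uint64, []byte, error) {
	repeat, n := binary.Uvarint(b)
	if n <= 0 {
		return 0, nil, errors.New("invalid repeat value")
	}
	content := b[n:]
	if err := validate(repeat, content); err != nil {
		return 0, nil, err
	}
	return repeat, content, nil
}

func main() {
	content := []byte("// Copyright (c) 2023 example header bytes")
	enc, err := marshalDict(0, content)
	if err != nil {
		panic(err)
	}
	repeat, got, err := unmarshalDict(enc)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(enc), repeat, len(got))
}
```

A repeat value of 0 points at the first byte of the content, which suits dictionaries that begin with a header shared by most inputs.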
Encoding
From the decoder's point of view, the dictionary content is seen as preceding the decoded output.
[dictionary content][decoded output]
Backreferences to the dictionary are encoded as ordinary backreferences that have an offset before the start of the decoded block.
Matches copying from the dictionary are not allowed to cross from the dictionary into the decoded data.
However, if a copy ends at the end of the dictionary the next repeat will point to the start of the decoded buffer, which is allowed.
The first match can be a repeat value, which will use the repeat offset stored in the dictionary.
When 64KB (65536 bytes) has been en/decoded, the dictionary may no longer be referenced,
either by a copy or by a repeat operation.
If the boundary is crossed while copying from the dictionary, the operation should complete,
but the next instruction is not allowed to reference the dictionary.
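The copy rules in this section can be summarized as one function that resolves a single copy operation. This is a simplified sketch written from the description above, not the decoder in the s2 package; applyCopy and dictLimit are illustrative names, and literals and repeat-offset bookkeeping are left out.

```go
package main

import (
	"errors"
	"fmt"
)

// dictLimit is the number of decoded bytes after which the dictionary
// may no longer be referenced.
const dictLimit = 65536

// applyCopy appends length bytes to out, copied from offset bytes back,
// where the dictionary is treated as sitting directly before out.
func applyCopy(dict, out []byte, offset, length int) ([]byte, error) {
	if offset <= 0 || length <= 0 {
		return nil, errors.New("invalid copy")
	}
	if offset <= len(out) {
		// Ordinary back-reference inside the decoded output.
		// Copy byte by byte so overlapping copies repeat data.
		start := len(out) - offset
		for i := 0; i < length; i++ {
			out = append(out, out[start+i])
		}
		return out, nil
	}
	// The offset starts before the decoded block, so it points into the dictionary.
	// The limit is checked only at the start of the operation, so a copy that
	// begins below the limit is allowed to complete.
	if len(out) >= dictLimit {
		return nil, errors.New("dictionary referenced after 64KB of output")
	}
	dictPos := len(dict) - (offset - len(out))
	if dictPos < 0 {
		return nil, errors.New("offset reaches before start of dictionary")
	}
	if dictPos+length > len(dict) {
		return nil, errors.New("copy crosses from dictionary into decoded data")
	}
	return append(out, dict[dictPos:dictPos+length]...), nil
}

func main() {
	dict := []byte("hello, dictionary world!")
	// Copy the last 6 dictionary bytes into an empty output.
	out, err := applyCopy(dict, nil, 6, 6)
	if err != nil {
		panic(err)
	}
	// Reusing the same offset now points at the start of the decoded buffer.
	out, err = applyCopy(dict, out, 6, 3)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s\n", out)
}
```

The second call in main mirrors the case described above: the first copy ends exactly at the end of the dictionary, so repeating the same offset reads from the start of the decoded buffer.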
Valid blocks encoded without a dictionary can be decoded with any dictionary.
There are no checks on whether the supplied dictionary is the correct one for a block.
Because of this, there is no overhead from using a dictionary.
Streams
For streams, each block can use the dictionary.
The dictionary is not provided on the stream.