Compressed Materialization #7644
Conversation
…gress on CompressedMaterialization
…ns and make them not callable from catalog
So there's still a regression, but it passes our regression test threshold!
I think this PR is finally good to go when CI passes.
Somehow, changes meant for a different branch ended up in this one. I think I've removed all of it now.
Apologies, I somehow contaminated my own
I think this is ready to go!
Thanks!
Happy to see the mention of 'big-endian machines' 👍 Please let me know if the gang needs access to one, running either Linux on Z or z/OS (reference here, here, & here). GH Actions currently doesn't have a native runner for s390x, but VM(s) can be made available (for direct use or indirectly reaching them via some other GH Action) in case it's helpful.
This PR implements the `CompressedMaterializationOptimizer`, which compresses and decompresses data on-the-fly during execution if there is a materializing operator like sort/join/aggregate/etc. This is useful when we, for example, have the following table:

Here the `id` column has type `int32`, which has a width of 4 bytes. The maximum `id` we can store in this type is 2,147,483,647, but we only store 300 through 304. We keep statistics on each column in the catalog, and using these statistics, we can convert this column to a `uint8` at runtime by subtracting the minimum value (300), bringing it to a range of 0 to 4. This reduces the width of the column down to 1 byte. This is only useful when there could be memory pressure, which is sometimes the case for materializing operators.
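To make the idea concrete, here is a minimal C++ sketch of this min-based integer compression. It is purely illustrative, not DuckDB's actual implementation; the names (`CanCompressToUint8`, `CompressColumn`, `Decompress`) are made up for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical sketch: using min/max column statistics, check whether an
// int32 column fits in a single byte once the minimum is subtracted.
static bool CanCompressToUint8(int32_t min_val, int32_t max_val) {
    return static_cast<int64_t>(max_val) - static_cast<int64_t>(min_val) <= UINT8_MAX;
}

// Compress by subtracting the column minimum, e.g. 300..304 becomes 0..4,
// shrinking each value from 4 bytes to 1 byte.
static std::vector<uint8_t> CompressColumn(const std::vector<int32_t> &col,
                                           int32_t min_val) {
    std::vector<uint8_t> result;
    result.reserve(col.size());
    for (int32_t v : col) {
        result.push_back(static_cast<uint8_t>(v - min_val));
    }
    return result;
}

// Decompression is the inverse: add the minimum back.
static int32_t Decompress(uint8_t compressed, int32_t min_val) {
    return static_cast<int32_t>(compressed) + min_val;
}
```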
We can also compress down the `name` column. The maximum string length is 7, but our `string_t` type that is used for strings during execution is 16 bytes wide. We can compress these to a `uint64`, which is only 8 bytes wide, by encoding names like so: the length of the name is stored within a single byte in the `uint64`. Then, we flip the bytes around so that they are properly comparable and sortable as `uint64` on big-endian machines.

Currently, this is only implemented for sorting and aggregates, as these operators have only one input, which makes compression more straightforward. If materializing operators are chained, the optimizer removes redundant subsequent compressions, allowing the compressed data to flow from one operator to the next.
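As a rough illustration of this string encoding, here is a hypothetical C++ sketch. It assumes a little-endian host, where the byte flip is what makes the integer comparison match lexicographic string order; the names and exact byte layout here are made up for the example and are not DuckDB's actual `string_t` compression.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Reverse the byte order of a 64-bit integer.
static uint64_t ByteSwap64(uint64_t v) {
    uint64_t result = 0;
    for (int i = 0; i < 8; i++) {
        result = (result << 8) | ((v >> (i * 8)) & 0xFF);
    }
    return result;
}

// Hypothetical sketch: pack a string of up to 7 characters plus its length
// into a uint64_t so that comparing the integers gives the same order as
// comparing the strings.
static uint64_t EncodeShortString(const std::string &s) {
    assert(s.size() <= 7);
    uint8_t bytes[8] = {0};
    std::memcpy(bytes, s.data(), s.size());    // characters first...
    bytes[7] = static_cast<uint8_t>(s.size()); // ...length in the last byte
    uint64_t packed;
    std::memcpy(&packed, bytes, sizeof(packed));
    // On a little-endian machine the first character lands in the
    // least-significant byte, so we flip the bytes to make the integer
    // comparison dominated by the first character, then the second, etc.
    return ByteSwap64(packed);
}
```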
This can also be applied to joins, but this is harder to get right, and heuristics are likely needed to not negatively impact performance when the build side fits in memory. I've left this for a future PR.
Performance
Sorting:
The `lineitem` table benefits a lot from this compression. The query, which is an aggregate, shows a similar performance improvement.
TPC-H Q1 performance is also improved, by ~30%, as we can now group by integers rather than strings.
Other changes
- Fixes to the `ColumnLifetimeAnalyzer` optimizer.
- Duplicate groups could be created by `Deliminator` and `RemoveUnusedColumns`. I've added a `RemoveDuplicateGroups` optimizer to remove these.
- Rewrote the `Deliminator` optimizer, which is now much more readable/maintainable.
- When joining a small table of `id`s with a table with a very large range of `id`s, we can create a filter from the statistics that we have and push it into the scan of the large table. This can greatly speed up specific queries (requested in "Doing HASH_JOIN instead of SEQ_SCAN even when driving table has single record" #4974).

Happy to receive feedback! No rush though, this is a pretty big PR.