Change to suggested schema? #64
Comments
I also got similar results when experimenting with compression and LowCardinality, but I don't think it's universal enough to be recommended as a generic solution. |
I'd have thought that, other than partitioning, the suggested changes above should benefit everyone, as Path is repeated a lot and Time/Date/Timestamp are all usually increasing with lots of duplicates. |
I mean, I think it's not possible to remove Date, it's used by graphite-clickhouse. You can zero Timestamp and get mostly the same effect, which is the documented and recommended way. You can't remove Timestamp because it's required (maybe not anymore?) by the GraphiteMergeTree engine. |
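(The thread doesn't show the exact mechanism for zeroing Timestamp; as a minimal plain-SQL sketch, assuming the graphite_data table queried below, one could make the zero default explicit and simply omit the column on insert:)

ALTER TABLE graphite_data
    MODIFY COLUMN Timestamp UInt32 DEFAULT 0

-- inserts that omit Timestamp now store zeros, which compress to almost
-- nothing; existing parts are unaffected until they are rewritten by merges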
See this extract from one of my servers:

SELECT
column,
any(type),
formatReadableSize(sum(column_data_compressed_bytes)) AS compressed,
formatReadableSize(sum(column_data_uncompressed_bytes)) AS uncompressed,
sum(rows)
FROM system.parts_columns
WHERE (table = 'graphite_data') AND active
GROUP BY column
ORDER BY column ASC
┌─column────┬─any(type)─┬─compressed─┬─uncompressed─┬──sum(rows)─┐
│ Date │ Date │ 69.84 MiB │ 12.24 GiB │ 6569285540 │
│ Path │ String │ 4.60 GiB │ 1.04 TiB │ 6569285540 │
│ Time │ UInt32 │ 17.00 GiB │ 24.47 GiB │ 6569285540 │
│ Timestamp │ UInt32 │ 19.38 GiB │ 24.47 GiB │ 6569285540 │
│ Value │ Float64 │ 6.40 GiB │ 48.94 GiB │ 6569285540 │
└───────────┴───────────┴────────────┴──────────────┴────────────┘

The … I just found out the setting for … |
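(For reference, a minimal sketch of the LowCardinality experiment discussed in this thread, assuming the graphite_data table above; the exact statement is not shown anywhere in the thread:)

ALTER TABLE graphite_data
    MODIFY COLUMN Path LowCardinality(String)

-- re-running the system.parts_columns query above on newly merged parts
-- shows the effect on compressed size; note the marks-size caveat below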
Be aware that LowCardinality impacts the size of the marks. Since it doubles them for Path, I've had a lot of OOMs during experiments with it. In the ClickHouse Telegram chat, people suggested I use ZSTD, and it works quite well too. Here's my own experiments' final state:

ATTACH TABLE data_lr
(
`Path` String CODEC(ZSTD(3)),
`Value` Float64 CODEC(Gorilla, LZ4),
`Time` UInt32 CODEC(DoubleDelta, LZ4),
`Date` Date CODEC(DoubleDelta, LZ4),
`Timestamp` UInt32 CODEC(DoubleDelta, LZ4)
)
ENGINE = ReplicatedGraphiteMergeTree('/clickhouse/tables/graphite.data_lr/{shard}', '{replica}', 'graphite_rollup')
PARTITION BY toYYYYMMDD(Date)
ORDER BY (Path, Time)
SETTINGS index_granularity = 256

And here's the resulting data on the host:

SELECT
column,
any(type),
formatReadableSize(sum(column_data_compressed_bytes)) AS compressed,
formatReadableSize(sum(column_data_uncompressed_bytes)) AS uncompressed,
sum(rows)
FROM system.parts_columns
WHERE (table = 'data_lr') AND active
GROUP BY column
ORDER BY column ASC
┌─column────┬─any(type)─┬─compressed─┬─uncompressed─┬───sum(rows)─┐
│ Date │ Date │ 63.59 MiB │ 48.97 GiB │ 26289415098 │
│ Path │ String │ 9.01 GiB │ 1.54 TiB │ 26289415098 │
│ Time │ UInt32 │ 980.84 MiB │ 97.94 GiB │ 26289415098 │
│ Timestamp │ UInt32 │ 4.29 GiB │ 97.94 GiB │ 26289415098 │
│ Value │ Float64 │ 60.51 GiB │ 195.87 GiB │ 26289415098 │
└───────────┴───────────┴────────────┴──────────────┴─────────────┘

@Hipska what do you use for the Value column? |
4 months later, and I have seen that the compression ratio for the Timestamp column … So again, why is it needed? @Felixoid I didn't do anything special to the data model, I only added a TTL based on the … |
The Timestamp column is used internally by GraphiteMergeTree as the Version column during the rollup process. I'm afraid I no longer remember the details of how it's used, but the doc tells: …
Here's a comment from the code: https://github.com/ClickHouse/ClickHouse/fc42851/master/src/Processors/Merges/Algorithms/GraphiteRollupSortedAlgorithm.h#L78-L90

/* | path | time | rounded_time | version | value | unmodified |
* -----------------------------------------------------------------------------------
* | A | 11 | 10 | 1 | 1 | a | |
* | A | 11 | 10 | 3 | 2 | b |> subgroup(A, 11) |
* | A | 11 | 10 | 2 | 3 | c | |> group(A, 10)
* ----------------------------------------------------------------------------------|>
* | A | 12 | 10 | 0 | 4 | d | |> Outputs (A, 10, avg(2, 5), a)
* | A | 12 | 10 | 1 | 5 | e |> subgroup(A, 12) |
* -----------------------------------------------------------------------------------
* | A | 21 | 20 | 1 | 6 | f |
* | B | 11 | 10 | 1 | 7 | g |
* ...
 */

I don't think setting … :

SELECT
column,
any(type),
formatReadableSize(sum(column_data_compressed_bytes)) AS compressed,
formatReadableSize(sum(column_data_uncompressed_bytes)) AS uncompressed,
sum(column_data_uncompressed_bytes) / sum(column_data_compressed_bytes) AS ratio,
sum(rows)
FROM system.parts_columns
WHERE ((table = 'data_lr') AND (database = 'graphite')) AND active
GROUP BY column
ORDER BY column ASC
┌─column────┬─any(type)─┬─compressed─┬─uncompressed─┬──────────────ratio─┬───sum(rows)─┐
│ Date │ Date │ 59.17 MiB │ 45.58 GiB │ 788.813125428121 │ 24468984334 │
│ Path │ String │ 9.42 GiB │ 1.41 TiB │ 153.15145708165463 │ 24468984334 │
│ Time │ UInt32 │ 961.97 MiB │ 91.15 GiB │ 97.0315116787169 │ 24468984334 │
│ Timestamp │ UInt32 │ 4.33 GiB │ 91.15 GiB │ 21.03714874982845 │ 24468984334 │
│ Value │ Float64 │ 55.59 GiB │ 182.31 GiB │ 3.279419424878342 │ 24468984334 │
└───────────┴───────────┴────────────┴──────────────┴────────────────────┴─────────────┘ |
And one additional note: when I've tested the … |
So, I still don't see a use case to have anything other than … |
Because neither I nor you know the internals of the GraphiteMergeTree engine, I'd say. And I feel uncomfortable making a decision for everybody by changing the default behavior. Besides that, with a proper codec, it doesn't hurt so much. Plus, it just came to my mind that it would make sense to add a custom TTL to the Timestamp column. Something pretty big, to be sure that rollup is done at least once. Then it would be pretty insignificant. |
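(A minimal sketch of that TTL idea, assuming the data_lr schema above; the 6-month interval is an arbitrary placeholder, chosen only to be comfortably longer than the rollup cadence. A column TTL resets expired values to the column default, i.e. zero here:)

ALTER TABLE data_lr
    MODIFY COLUMN Timestamp UInt32 CODEC(DoubleDelta, LZ4) TTL Date + INTERVAL 6 MONTH

-- the codec is restated because MODIFY COLUMN replaces the whole column
-- definition; after expiry, runs of zeros compress to almost nothing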
But instead of changing the table schema, is there an issue with using toYYYYMMDD() for partitioning? Optimizing >1.5 TB partitions down to ~100 GB just needs a useless amount of empty space. (Does anybody know why ClickHouse requires the 100% overhead?) |
It's something I can answer.
It depends more on the amount of data. Partitions bring overhead, and if you have broad historical inserts (like over a year), the memory consumption for such batches would be huge. On the other hand, the optimizing overhead is minimized, so everyone should find the balance for themselves. I personally have used partitions of 3 days, but half a year ago I migrated to …
It's inherited from the MergeTree engine. The 100% space allocation is required to be safe while merging all parts together. The server can't know in advance how much space will really be needed or how aggressive the aggregation will be, so it can't predict the necessary allocation more precisely than just taking the existing size as a rule of thumb. |
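(To gauge that balance before choosing a partitioning key, a query sketch against the built-in system.parts table; graphite_data is the assumed table name:)

SELECT
    partition,
    count() AS parts,
    formatReadableSize(sum(bytes_on_disk)) AS on_disk
FROM system.parts
WHERE (table = 'graphite_data') AND active
GROUP BY partition
ORDER BY partition ASC

The on-disk size of the biggest partition is roughly the free space to budget for optimizing it, per the 100% overhead described above.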
Basically I'm inserting ~400k rows/s with data from the last 10..300 seconds. Old data is rarely inserted.
Thanks, appreciated. |
Hi guys, would graphite-clickhouse benefit from any projections? https://www.youtube.com/watch?v=jJ5VuLr2k5k |
Yes, it definitely looks interesting. It looks like it brings the ability to define only one table for both direct and reversed points at once. Thank you for bringing it up! 👍 |
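(A hypothetical sketch of what such a projection could look like; neither the reversed-path expression nor the projection name comes from this thread, and whether projections are supported on GraphiteMergeTree tables would need verifying:)

ALTER TABLE data_lr
    ADD PROJECTION reversed
    (
        -- keep a copy of the data ordered by the segment-reversed path,
        -- so direct and reversed lookups can be served from one table
        SELECT *
        ORDER BY (arrayStringConcat(reverse(splitByChar('.', Path)), '.'), Time)
    );

ALTER TABLE data_lr MATERIALIZE PROJECTION reversed;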
For the graphite table, switching to the following moves us from 5 bytes/row to 0.6 bytes/row for a large dataset: …

- toYYYYMMDD() might be a better partitioning key for large volumes of data
- ClickHouse/ClickHouse#12144 (comment) also suggests removing the Date col - perhaps this could be an option (although when compressed as above it only takes a very small amount of space)