Initial update to have attributes use the ordered map so the order is maintained into clickhouse #34598
Awesome to see this. Will review thoroughly once marked as ready, but so far things look good. Especially interested in seeing this for log/trace attributes.
Excellent! Glad to see this applies to logs/traces now. Can you confirm the trace test file doesn't require any changes?
Thanks! LGTM.
I added field mapping coverage for the attribute conversion to OrderedMap. FYI, I noticed that Tracing Scope().Attributes() does not have a column defined in the ClickHouse schema, so those values are not persisted. Was this intentional? For example, Logs has it: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/exporter/clickhouseexporter/exporter_logs.go#L159 Tracing does not: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/exporter/clickhouseexporter/exporter_traces.go#L176 I can create an issue for this; it should be independent of this PR.
Not sure, I'll have to check the OTel spec to see what's in there. Let me know your thoughts on adding it and I'll add it in my existing PR #34245.
Looking at the spec, it looks like the attributes are optional for tracing, so it's probably a good idea to add them in case they are being used in an upstream component.
I wrote a similar IterableOrderedMap a while ago, and came here to report the same problem, only to see it's already being taken care of :D Thanks!
@earwin That looks like a good idea for Go 1.23. Can you contribute it to the ClickHouse Go SDK?
I've been meaning to say it's perfectly usable for pre-1.23 as well, because everything that uses iter.Seq is non-core functionality which can be deleted. I'll ask and see if they are open to having an impl out-of-the-box. ... there
This PR was marked stale due to lack of activity. It will be closed in 14 days.
@dmitryax Any feedback on this PR? It will go stale soon.
Here's my pull request: ClickHouse/clickhouse-go#1417. I modified my impl so it compiles on pre-1.23 Go, while still being forward compatible with 1.23 iterators.
    mapIterator := orderedMap.Iterator()

    for mapIterator.Next() {
        attrs.Put(mapIterator.Key(), mapIterator.Value())
It looks like you're trying to copy `orderedMap`. Why?
But you're also reusing the `attrs` var defined outside of the loop, so exemplars will get attributes that are a union of all their attributes. Is this intended?
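The concern about reusing one destination across loop iterations can be illustrated with a minimal sketch. Everything here is hypothetical and written only for illustration: `kv`, `orderedAttrs`, `convertShared`, and `convertFresh` are stand-ins, not the PR's code or the clickhouse-go API.

```go
package main

import "fmt"

// kv is a hypothetical key/value pair standing in for one attribute.
type kv struct{ Key, Value string }

// orderedAttrs is a hypothetical, minimal insertion-ordered map.
type orderedAttrs struct {
	pairs []kv
	index map[string]int
}

func newOrderedAttrs() *orderedAttrs {
	return &orderedAttrs{index: map[string]int{}}
}

// Put overwrites an existing key in place or appends a new one.
func (o *orderedAttrs) Put(k, v string) {
	if i, ok := o.index[k]; ok {
		o.pairs[i].Value = v
		return
	}
	o.index[k] = len(o.pairs)
	o.pairs = append(o.pairs, kv{k, v})
}

// Keys returns the keys in insertion order.
func (o *orderedAttrs) Keys() []string {
	keys := make([]string, 0, len(o.pairs))
	for _, p := range o.pairs {
		keys = append(keys, p.Key)
	}
	return keys
}

// convertShared declares the destination once, outside the loop, so every
// exemplar ends up pointing at the same accumulated union of attributes.
func convertShared(exemplars [][]kv) []*orderedAttrs {
	attrs := newOrderedAttrs()
	out := make([]*orderedAttrs, 0, len(exemplars))
	for _, ex := range exemplars {
		for _, p := range ex {
			attrs.Put(p.Key, p.Value)
		}
		out = append(out, attrs)
	}
	return out
}

// convertFresh allocates a new destination per exemplar, so each one keeps
// only its own attributes.
func convertFresh(exemplars [][]kv) []*orderedAttrs {
	out := make([]*orderedAttrs, 0, len(exemplars))
	for _, ex := range exemplars {
		attrs := newOrderedAttrs()
		for _, p := range ex {
			attrs.Put(p.Key, p.Value)
		}
		out = append(out, attrs)
	}
	return out
}

func main() {
	exemplars := [][]kv{{{"trace_id", "t1"}}, {{"span_id", "s1"}}}
	shared := convertShared(exemplars)
	fresh := convertFresh(exemplars)
	fmt.Println(shared[0].Keys(), shared[1].Keys()) // [trace_id span_id] [trace_id span_id]
	fmt.Println(fresh[0].Keys(), fresh[1].Keys())   // [trace_id] [span_id]
}
```

If the union is not intended, moving the allocation inside the loop (as in `convertFresh`) is one way to avoid it.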
Heads up: I've been working on my own reimplementation of this pull request, To reproduce: reenable and run integration_test.go:TestIntegration
Seems to be triggered by arrays holding pointers to instances of OrderedMap/IterableOrderedMap. Somewhere in append() chain elements of the array get dereferenced, and then
Closing due to parallel effort. @earwin has a better take on the problem and is following up with PRs against the ClickHouse repo. Thanks for helping address this.
…3634 (#35725)

#### Description

Our attributes are stored as Map(String, String) in CH. By default the order of keys is undefined and, as described in #33634, leads to worse compression and duplicates in `group by` (unless carefully accounted for).

This PR uses the `column.IterableOrderedMap` facility from clickhouse-go to ensure fixed attribute key order. It is a reimplementation of #34598 that uses fewer allocations and is (arguably) somewhat more straightforward.

I'm **opening this as a draft**, because this PR (and #34598) are blocked by ClickHouse/clickhouse-go#1365 (fixed in ClickHouse/clickhouse-go#1418).

In addition, I'm trying to add the implementation of `column.IterableOrderedMap` used to clickhouse-go upstream: ClickHouse/clickhouse-go#1417. If it is accepted, I will amend this PR accordingly.

#### Link to tracking issue

Fixes #33634

#### Testing

The IOM implementation was used in production independently. I'm planning to build otelcollector with this PR and cut over my production to it in the next few days.
Description: Update to maintain attribute order for records created in ClickHouse
OTel records did not consistently maintain attribute order, because Go does not maintain order for maps. This caused records to be duplicated, since ClickHouse treats a different attribute order as a distinct record. It also affected compression, because records that differed only in attribute order were considered different.
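As a rough illustration of the ordering problem, the sketch below renders attributes in ascending key order so that two maps with the same contents always produce the same sequence. `encodeSorted` and the sample attribute names are hypothetical and not part of the exporter; the point is only that ranging over a Go map directly gives no such guarantee.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// encodeSorted renders attributes as "k=v" pairs in ascending key order.
// Ranging over the map directly would visit keys in an unspecified order.
func encodeSorted(attrs map[string]string) string {
	keys := make([]string, 0, len(attrs))
	for k := range attrs {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, k+"="+attrs[k])
	}
	return strings.Join(parts, ",")
}

func main() {
	a := map[string]string{"service.name": "api", "host.name": "node-1"}
	b := map[string]string{"host.name": "node-1", "service.name": "api"}

	fmt.Println(encodeSorted(a) == encodeSorted(b)) // true
	fmt.Println(encodeSorted(a))                    // host.name=node-1,service.name=api
}
```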
Link to tracking Issue: #33634
Implemented the `column.MapIterator` and changed the use of attributes from `map[string]string` to `column.MapIterator` (ClickHouse/clickhouse-go#1152).
Testing:
Updated unit tests and added coverage for the OrderedMap type.
Basic integration testing was done manually using a ClickHouse server in a local environment.
Documentation:
Pre-existing uses of the attribute columns were switched over to the MapIterator. The exporter explicitly sorts the attributes before inserting. The MapIterator maintains the original order, so it is not lost on record creation.
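To show how an iterator can preserve a pre-sorted order, here is a minimal slice-backed sketch. Its Next/Key/Value shape mirrors the calls in the review snippet above, but `pair`, `sliceIterator`, `newSliceIterator`, and `collectKeys` are hypothetical names for illustration, not the clickhouse-go types.

```go
package main

import "fmt"

// pair is a hypothetical attribute entry.
type pair struct{ key, value string }

// sliceIterator is a hypothetical iterator that walks pairs in slice order,
// so whatever order the caller established (for example, sorted by key)
// is exactly the order in which the pairs are read back.
type sliceIterator struct {
	pairs []pair
	pos   int // index of the current pair; -1 before the first Next
}

func newSliceIterator(pairs []pair) *sliceIterator {
	return &sliceIterator{pairs: pairs, pos: -1}
}

// Next advances to the following pair and reports whether one exists.
func (it *sliceIterator) Next() bool {
	it.pos++
	return it.pos < len(it.pairs)
}

// Key and Value return the current pair; call them only after Next returns true.
func (it *sliceIterator) Key() string   { return it.pairs[it.pos].key }
func (it *sliceIterator) Value() string { return it.pairs[it.pos].value }

// collectKeys drains the iterator and returns the keys in visit order.
func collectKeys(it *sliceIterator) []string {
	var keys []string
	for it.Next() {
		keys = append(keys, it.Key())
	}
	return keys
}

func main() {
	it := newSliceIterator([]pair{
		{"host.name", "node-1"},
		{"service.name", "api"},
	})
	for it.Next() {
		fmt.Printf("%s=%s\n", it.Key(), it.Value())
	}
}
```

Because the backing store is a slice rather than a Go map, reading the pairs back cannot reorder them.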