Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster json parser #33785

Merged
merged 2 commits into from
Jun 28, 2024
Merged

Faster json parser #33785

merged 2 commits into from
Jun 28, 2024

Conversation

atoulme
Copy link
Contributor

@atoulme atoulme commented Jun 27, 2024

Description:
Switch libraries to reduce the footprint of the JSON parser.

Link to tracking Issue:
Fixes #33784

Testing:

Before:

goos: darwin
goarch: arm64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/parser/json
BenchmarkProcess-10        49047             23951 ns/op           26277 B/op        177 allocs/op
BenchmarkProcess-10        50248             24002 ns/op           26275 B/op        177 allocs/op
BenchmarkProcess-10        50258             24517 ns/op           26276 B/op        177 allocs/op
BenchmarkProcess-10        50505             24731 ns/op           26276 B/op        177 allocs/op
BenchmarkProcess-10        45708             24730 ns/op           26276 B/op        177 allocs/op
BenchmarkProcess-10        50022             25021 ns/op           26277 B/op        177 allocs/op
BenchmarkProcess-10        47204             24794 ns/op           26275 B/op        177 allocs/op
BenchmarkProcess-10        47742             25335 ns/op           26274 B/op        177 allocs/op
BenchmarkProcess-10        46252             25205 ns/op           26276 B/op        177 allocs/op
BenchmarkProcess-10        47916             24379 ns/op           26277 B/op        177 allocs/op
PASS
ok      github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/parser/json       15.559s

After:

goos: darwin
goarch: arm64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/parser/json
BenchmarkProcess-10        95613             12168 ns/op           14702 B/op        115 allocs/op
BenchmarkProcess-10        97904             12343 ns/op           14704 B/op        115 allocs/op
BenchmarkProcess-10        99006             12187 ns/op           14702 B/op        115 allocs/op
BenchmarkProcess-10        96964             12310 ns/op           14703 B/op        115 allocs/op
BenchmarkProcess-10        98661             12285 ns/op           14703 B/op        115 allocs/op
BenchmarkProcess-10        96896             12356 ns/op           14703 B/op        115 allocs/op
BenchmarkProcess-10        94118             12367 ns/op           14703 B/op        115 allocs/op
BenchmarkProcess-10        96123             12349 ns/op           14702 B/op        115 allocs/op
BenchmarkProcess-10        96948             12305 ns/op           14702 B/op        115 allocs/op
BenchmarkProcess-10        96626             12225 ns/op           14702 B/op        115 allocs/op
PASS
ok      github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/parser/json       13.619s

Benchstat:

goos: darwin
goarch: arm64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/pkg/stanza/operator/parser/json
           │ before.txt  │              after.txt              │
           │   sec/op    │   sec/op     vs base                │
Process-10   24.73µ ± 3%   12.31µ ± 1%  -50.23% (p=0.000 n=10)

           │  before.txt  │              after.txt               │
           │     B/op     │     B/op      vs base                │
Process-10   25.66Ki ± 0%   14.36Ki ± 0%  -44.05% (p=0.000 n=10)

           │ before.txt │             after.txt              │
           │ allocs/op  │ allocs/op   vs base                │
Process-10   177.0 ± 0%   115.0 ± 0%  -35.03% (p=0.000 n=10)

@atoulme atoulme force-pushed the faster_json_parser branch from a649883 to ce895d3 Compare June 27, 2024 01:12
Copy link
Member

@songy23 songy23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrzej-stencel
Copy link
Member

Oh, it looks like the https://github.com/json-iterator/go that is currently used has been unmaintained for about 2 years now?

@atoulme are you aware of other major open source projects using the proposed library https://github.com/goccy/go-json?

@atoulme atoulme force-pushed the faster_json_parser branch from ce895d3 to 81e1d52 Compare June 27, 2024 15:57
@atoulme
Copy link
Contributor Author

atoulme commented Jun 27, 2024

@atoulme are you aware of other major open source projects using the proposed library https://github.com/goccy/go-json?
I took the first google result. I applied it to our code, and I saw it is used by quite a few other components.

 git grep -l go-json | xargs dirname | sort | uniq
cmd/otelcontribcol
connector/datadogconnector
exporter/datadogexporter
exporter/datadogexporter/integrationtest
exporter/otelarrowexporter
internal/sqlquery
processor/k8sattributesprocessor
receiver/googlecloudspannerreceiver
receiver/influxdbreceiver
receiver/otelarrowreceiver
receiver/snowflakereceiver
receiver/sqlqueryreceiver
receiver/sqlserverreceiver

Following a few of those paths, it seems to be used by datadog libraries and the Go arrow client.

Given that json-iterator/go is not seeing any updates for 2 years, jumping to another lib with minimal change to the functional set seems like the right idea.

@djaglowski djaglowski merged commit e004ecd into open-telemetry:main Jun 28, 2024
154 checks passed
@github-actions github-actions bot added this to the next release milestone Jun 28, 2024
djaglowski pushed a commit that referenced this pull request Jul 8, 2024
…33929)

**Description:** <Describe what has changed.>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->

#33785
switched the library for the `json` parser. Based on this, we should
switch the library for the `container` parser as well.
(No benchmarks were added since we can rely on those of
#33785
I guess. Happy to add some if it's suggested though)

**Link to tracking Issue:** <Issue number if applicable>

**Testing:** <Describe what testing was performed and which tests were
added.>

**Documentation:** <Describe the documentation added.>

Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
codeboten pushed a commit that referenced this pull request Sep 19, 2024
This follows
#33785
by switching to a faster json library.
jriguera pushed a commit to springernature/opentelemetry-collector-contrib that referenced this pull request Oct 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[pkg/stanza] json_parser is a GC hot spot
5 participants