
Improve performance for schemas with many fields #140

Closed
wants to merge 3 commits into graphql-go:master from nubo:performance

Conversation

sfriedel
Contributor

I did some performance testing and noticed that there is noticeable overhead once the schema grows.

Commit ccd7f68 introduces an invariantf function that formats the error message with fmt.Sprintf. This was done because Printf regularly showed up near the top of CPU profiles.

(pprof) top10
1000ms of 2000ms total (50.00%)
Showing top 10 nodes out of 166 (cum >= 500ms)
      flat  flat%   sum%        cum   cum%
     200ms 10.00% 10.00%      460ms 23.00%  runtime.mallocgc
     180ms  9.00% 19.00%      710ms 35.50%  fmt.(*pp).doPrintf

I traced the source of the Printf calls to defineFieldMap and started replacing them there, but later extended the change to all occurrences I found. It almost halved query processing time for a schema with 20 fields and also brought allocations down from ~2800 to ~1800.
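The win here comes from moving the formatting work onto the failure path. A minimal sketch of the pattern (invariantEager is a hypothetical stand-in for the old eager call sites; the real invariantf in the commit may differ in signature):

```go
package main

import "fmt"

// Old pattern: callers built the message up front, so fmt.Sprintf ran on
// every call even when the condition held (hypothetical stand-in).
func invariantEager(cond bool, msg string) error {
	if !cond {
		return fmt.Errorf("%s", msg)
	}
	return nil
}

// invariantf defers formatting until the invariant actually fails, so the
// hot path does no fmt work and no allocations for the message.
func invariantf(cond bool, format string, args ...interface{}) error {
	if !cond {
		return fmt.Errorf(format, args...)
	}
	return nil
}

func main() {
	// Hot path: condition holds, nothing is formatted.
	fmt.Println(invariantf(true, "field %q must not be empty", "name"))
	// Failure path: the message is built only now.
	fmt.Println(invariantf(false, "field %q must not be empty", "name"))
}
```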

In fdb793c I tried to eliminate the next biggest source of CPU usage, the regex in assertValidName. Surprisingly, this only resulted in a ~26% decrease in query execution time.

(pprof) top10
640ms of 1360ms total (47.06%)
Showing top 10 nodes out of 145 (cum >= 40ms)
      flat  flat%   sum%        cum   cum%
     140ms 10.29% 10.29%      350ms 25.74%  runtime.mallocgc
      90ms  6.62% 16.91%       90ms  6.62%  regexp/syntax.(*Inst).MatchRunePos
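A hand-rolled check equivalent to the `^[_a-zA-Z][_a-zA-Z0-9]*$` pattern avoids the regexp engine entirely. A sketch of that kind of replacement (the function name here is illustrative, not necessarily what the commit uses):

```go
package main

import "fmt"

// isValidName reports whether s matches ^[_a-zA-Z][_a-zA-Z0-9]*$ without
// invoking the regexp engine: first byte must be a letter or underscore,
// the rest may also be digits.
func isValidName(s string) bool {
	if len(s) == 0 {
		return false
	}
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case c == '_',
			c >= 'a' && c <= 'z',
			c >= 'A' && c <= 'Z':
			// letters and underscore are allowed anywhere
		case c >= '0' && c <= '9':
			if i == 0 {
				return false // a digit may not lead the name
			}
		default:
			return false
		}
	}
	return true
}

func main() {
	for _, s := range []string{"_fooBar9", "9foo", "foo-bar", ""} {
		fmt.Printf("%q -> %v\n", s, isValidName(s))
	}
}
```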

58481fc just avoids some slice reallocations in places where the final slice length is known. This didn't increase performance much (~1%), but seeing that the garbage collector is already the biggest source of CPU usage I left it in anyway.
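The preallocation in question is the standard make-with-capacity idiom: allocating once up front avoids the grow-and-copy cycles that append triggers on a nil slice. A small sketch (fieldNames is a hypothetical helper, not code from the PR):

```go
package main

import "fmt"

// fieldNames collects map keys. Because the final length is known, the
// slice is allocated once with make; appends then never reallocate.
func fieldNames(fields map[string]int) []string {
	names := make([]string, 0, len(fields)) // single allocation
	for name := range fields {
		names = append(names, name)
	}
	return names
}

func main() {
	fields := map[string]int{"id": 0, "name": 1, "email": 2}
	names := fieldNames(fields)
	fmt.Println(len(names), cap(names))
}
```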

I did some benchmarks comparing the changes to master with varying schema sizes.

schema with 1 field:

Before:
BenchmarkHelloQuery-4       5000            307082 ns/op           55207 B/op       1261 allocs/op

After:
BenchmarkHelloQuery-4       5000            250414 ns/op           49166 B/op       1103 allocs/op

schema with 20 fields:

Before:
BenchmarkHelloQuery-4       2000            736567 ns/op          116243 B/op       2867 allocs/op

After:
BenchmarkHelloQuery-4       5000            331042 ns/op           72469 B/op       1645 allocs/op

schema with 100 fields:

Before:
BenchmarkHelloQuery-4        500           2474401 ns/op          373010 B/op       9532 allocs/op

After:
BenchmarkHelloQuery-4       2000            609578 ns/op          169809 B/op       3885 allocs/op

The source for the benchmarks can be found at: https://gist.github.com/sfriedel/81793288bcc373988f6a000062d051ea

Both the test suite and my application run flawlessly with these changes, but it would still be good if someone else took a look.

@coveralls

Coverage Status

Coverage increased (+0.0004%) to 90.755% when pulling 58481fc on nubo:performance into 491504a on graphql-go:master.

@@ -1285,9 +1286,26 @@ func (gl *NonNull) Error() error {

var NameRegExp, _ = regexp.Compile("^[_a-zA-Z][_a-zA-Z0-9]*$")


Is this variable superfluous now?

Contributor Author


Yes and no. It's not used in the lib anymore, but it's exposed as part of the public API, so I left it in to avoid breaking backwards compatibility.


Good point, though I'm not sure exporting this variable was deliberate. Maybe it could be marked as deprecated or simply removed anyway (the current version is sub-1.0, after all).
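If deprecation is the route taken, Go's convention is a `Deprecated:` paragraph in the doc comment, which godoc and common linters pick up. A sketch of what that could look like (keeping the variable but switching to regexp.MustCompile is an assumption here, not part of the PR):

```go
package main

import (
	"fmt"
	"regexp"
)

// NameRegExp matches valid GraphQL names.
//
// Deprecated: the library now validates names without the regexp engine;
// this variable remains exported only for backwards compatibility.
var NameRegExp = regexp.MustCompile("^[_a-zA-Z][_a-zA-Z0-9]*$")

func main() {
	fmt.Println(NameRegExp.MatchString("_fooBar9"), NameRegExp.MatchString("9foo"))
}
```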

@motiejus
Contributor

Dear maintainers,

We profiled graphql-go in our environment and saw some performance issues associated with a growing schema.

What needs to happen to bring this over the finish line? Would it be easier for you to merge if the commits were split into several smaller pull requests? Or is there something else we could do?

@fd
Contributor

fd commented Jan 8, 2018

This can be closed now?

See #267

@sfriedel
Contributor Author

Closing this one since #267 addressed the major issue I saw (invariant/Printf overhead).

@sfriedel sfriedel closed this Jan 24, 2018