-
Notifications
You must be signed in to change notification settings - Fork 719
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PCP: /api/v1/regions optimization #1970
Conversation
Thanks for your contribution. If your PR get merged, you will be rewarded 900 points. |
Thanks, @ekalinin, Greatly work. |
Sure, will take a look on it today. |
- reduced the amount of allocation in convertToAPIRegions by making a slice of RegionInfo. - added a new function - InitRegion. Used it in a loop to init a slice of pointers to the RegionInfo structs without additional allocations. - added method HexRegionKeyStr. It returns string (and not []byte). StartKey/EndKey are string, and we do not need one more convert (and allocation) from []byte to string. - added benchmarks: - BenchmarkConvertToAPIRegions - BenchmarkHexRegionKey - BenchmarkHexRegionKeyStr Result: ➜ benchcmp old.api.txt new.api.txt benchmark old ns/op new ns/op delta BenchmarkConvertToAPIRegions-12 2880085 2208890 -23.30% benchmark old allocs new allocs delta BenchmarkConvertToAPIRegions-12 82182 52183 -36.50% benchmark old bytes new bytes delta BenchmarkConvertToAPIRegions-12 2162007 1947634 -9.92% ➜ GO111MODULE=on go test github.com/pingcap/pd/server/api -run=Benchmark -bench=BenchmarkHexRegionKey -benchmem goos: linux goarch: amd64 pkg: github.com/pingcap/pd/server/api BenchmarkHexRegionKey-12 1229766 1239 ns/op 240 B/op 5 allocs/op BenchmarkHexRegionKeyStr-12 1226398 1018 ns/op 144 B/op 3 allocs/op
Codecov Report
@@ Coverage Diff @@
## master #1970 +/- ##
========================================
Coverage ? 77.9%
========================================
Files ? 174
Lines ? 17626
Branches ? 0
========================================
Hits ? 13732
Misses ? 2834
Partials ? 1060
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, thanks @ekalinin, you have done several significant optimizations, but may we can still try more for :
- Conversion between
[]byte
andstring
may have memory allocation, you have added a functionHexRegionKeyStr
to reduce one conversion, but there still have some conversions. maybe we can try this method to do something. strings.Upper()
will do a copy, we can improve it tozero-copy
with doing it in slice[]byte
.hex.EncodingToString
also have a conversion.
- tRegions: removed args, used predefined args - added Slice/String functions for zero-cost convertations - added ToUpperASCIIInplace function (zero cost ToUppper) - added EncodeToString. It overrides hex.EncodeToString. Difference: returns []byte, not string
Changes:
Results: # GO111MODULE=on go test github.com/pingcap/pd/server/api -run=Benchmark -bench=BenchmarkHexRegionKey -benchmem
➜ benchcmp hex.old.txt hex.new.txt
benchmark old ns/op new ns/op delta
BenchmarkHexRegionKey-12 532 137 -74.25%
BenchmarkHexRegionKeyStr-12 470 140 -70.21%
benchmark old allocs new allocs delta
BenchmarkHexRegionKey-12 5 1 -80.00%
BenchmarkHexRegionKeyStr-12 3 1 -66.67%
benchmark old bytes new bytes delta
BenchmarkHexRegionKey-12 400 48 -88.00%
BenchmarkHexRegionKeyStr-12 240 48 -80.00% and in general: ➜ benchcmp old.api.txt new.api.txt
benchmark old ns/op new ns/op delta
BenchmarkConvertToAPIRegions-12 2880085 1550980 -46.15%
benchmark old allocs new allocs delta
BenchmarkConvertToAPIRegions-12 82182 20003 -75.66%
benchmark old bytes new bytes delta
BenchmarkConvertToAPIRegions-12 2162007 2002431 -7.38% |
/run-all-tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do the test same as the original issue, with 1,000,000 regions:
before
[2019/10/23 12:19:55.653 +00:00] [INFO] [region.go:154] ["[DEBUG] got all regions from core"] [cost=149.863003ms]
[2019/10/23 12:19:59.083 +00:00] [INFO] [region.go:157] ["[DEBUG] convert all regions to API regions"] [cost=3.430113877s]
[2019/10/23 12:20:17.575 +00:00] [INFO] [region.go:160] ["[DEBUG] send all regions to client"] [cost=18.491626948s]
after
[2019/11/27 15:15:14.985 +00:00] [INFO] [region.go:155] ["[DEBUG] got all regions from core"] [cost=31.286284ms]
[2019/11/27 15:15:16.834 +00:00] [INFO] [region.go:158] ["[DEBUG] convert all regions to API regions"] [cost=1.849013297s]
[2019/11/27 15:15:22.415 +00:00] [INFO] [region.go:161] ["[DEBUG] send all regions to client"] [cost=5.580648785s]
It's almost 3.5x faster now. but 5s may still too long, can we have more method to optimize it?
and I have got the profile from my test:
cpu.new.pprof.zip
mem.new.pprof.zip
maybe we need to focus on reducing CPU usage now:
- how about have a try
https://github.com/json-iterator/go
to getting all regions?* - gzip the response may speed up transmission*
What is your opinion?
Could you add a benchmark which would be close to that test with 1M regions?
Thanks! That's great!
Sure, will try to dig into that direction.
Unfortunately, Why just not use external tool? Like: $ >> ./bin/pd-ctl -u http://0.0.0.0:32833 config show all | python -m json.tool or $ >> ./bin/pd-ctl -u http://0.0.0.0:32833 config show all | jq . |
I have shared my data in here, you can use it to construct the regions in your local benchmark.
|
Done. |
Just tried it. Looks not well: # current branch
➜ GO111MODULE=on go test github.com/pingcap/pd/server/api -run=Benchmark -bench=BenchmarkRenderJSON
goos: linux
goarch: amd64
pkg: github.com/pingcap/pd/server/api
BenchmarkRenderJSON-12 96 15344540 ns/op 2429223 B/op 5 allocs/op
PASS
ok github.com/pingcap/pd/server/api 1.732s
# current branch with github.com/json-iterator/go
GO111MODULE=on go test github.com/pingcap/pd/server/api -run=Benchmark -bench=BenchmarkRenderJSON
goos: linux
goarch: amd64
pkg: github.com/pingcap/pd/server/api
BenchmarkRenderJSON-12 27 38831651 ns/op 6267826 B/op 18 allocs/op
PASS
ok github.com/pingcap/pd/server/api 1.328s patch: func (ri *RegionsInfo) MarshalJSON() ([]byte, error) {
var json = jsoniter.ConfigFastest
return json.Marshal(*ri)
} |
Tried
patch: var enc = jingo.NewStructEncoder(RegionInfo{})
func (ri *RegionInfo) MarshalJSON() ([]byte, error) {
buf := jingo.NewBufferFromPool()
enc.Marshal(ri, buf)
b := make([]byte, len(buf.Bytes))
copy(b, buf.Bytes)
buf.ReturnToPool()
return b, nil
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rest lgtm. Thank you for several optimization attempts.
Current results of the json render optimization: # GO111MODULE=on go test github.com/pingcap/pd/server/api -run=Benchmark -bench=BenchmarkRenderJSON -benchmem
➜ benchcmp old.txt new.txt
benchmark old ns/op new ns/op delta
BenchmarkRenderJSON-12 662659672 17829821 -97.31%
benchmark old allocs new allocs delta
BenchmarkRenderJSON-12 50 4 -92.00%
benchmark old bytes new bytes delta
BenchmarkRenderJSON-12 414771916 2339140 -99.44% |
PTAL @rleungx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.what format when we get all regions? would you like to provide examples?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/run-all-tests |
Signed-off-by: Ryan Leung <rleungx@gmail.com>
Signed-off-by: Ryan Leung <rleungx@gmail.com>
What problem does this PR solve?
PCP #1837
What is changed and how it works?
convertToAPIRegions
optimizationconvertToAPIRegions
by making a slice ofRegionInfo
.InitRegion
. Used it in a loop to init a sliceof pointers to the
api.RegionInfo
structs without additional allocations.HexRegionKeyStr
. It returns a string (and not a[]byte
).StartKey/EndKey
fields arestring
in theapi.RegionInfo
, and we do not need one more convertation (and allocation) from []byte to string.BenchmarkConvertToAPIRegions
BenchmarkHexRegionKey
BenchmarkHexRegionKeyStr
Result:
# GO111MODULE=on go test github.com/pingcap/pd/server/api -run=None -bench=BenchmarkConvertToAPIRegions -benchmem > new.api.txt ➜ benchcmp old.api.txt new.api.txt benchmark old ns/op new ns/op delta BenchmarkConvertToAPIRegions-12 2880085 2208890 -23.30% benchmark old allocs new allocs delta BenchmarkConvertToAPIRegions-12 82182 52183 -36.50% benchmark old bytes new bytes delta BenchmarkConvertToAPIRegions-12 2162007 1947634 -9.92%
h.rd.JSON
optimizationserver/api/router.go#createRender
function for creating*render.Render
IndentJSON
from therender.New
callStreamingJSON
to therender.New
callserver/api/region_test.go#BenchmarkRenderJSON
for testing the performance of the last part of theregionsHandler.GetAll
function (which takes 90% of the total time for the/pd/api/v1/regions
endpoint in PCP PCP-24: Improve the performance of the HTTP API for getting all regions #1837)# GO111MODULE=on go test github.com/pingcap/pd/server/api -run=Benchmark -bench=BenchmarkRenderJSON -benchmem > new.txt ➜ benchcmp old.txt new.txt benchmark old ns/op new ns/op delta BenchmarkRenderJSON-12 662659672 147632381 -77.72% benchmark old allocs new allocs delta BenchmarkRenderJSON-12 50 12 -76.00% benchmark old bytes new bytes delta BenchmarkRenderJSON-12 414771916 51753928 -87.52%
Check List
Tests