
memberlist: get rid of exclusive mutex on get worker channel method #171

Closed
ortuman wants to merge 2 commits

Conversation


@ortuman ortuman commented May 24, 2022

What this PR does:

This PR adapts the memberlist KV store to use a non-exclusive (read-write) mutex to protect access to the worker channels map.
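
As a rough sketch of the idea (simplified, not the actual dskit code: workersMu and getKeyWorkerChannel are the names discussed in this PR, while the KV fields and the channel element type here are assumptions):

package memberlist

import "sync"

type KV struct {
    workersMu       sync.RWMutex // previously a sync.Mutex
    workersChannels map[string]chan string
}

func (kv *KV) getKeyWorkerChannel(key string) chan string {
    // Fast path: most calls look up an existing channel, so a shared
    // read lock suffices and readers no longer serialize each other.
    kv.workersMu.RLock()
    ch, ok := kv.workersChannels[key]
    kv.workersMu.RUnlock()
    if ok {
        return ch
    }

    // Slow path: take the exclusive lock to create the channel, and
    // re-check the map in case another goroutine won the race.
    kv.workersMu.Lock()
    defer kv.workersMu.Unlock()
    if ch, ok := kv.workersChannels[key]; ok {
        return ch
    }
    ch = make(chan string, 1)
    kv.workersChannels[key] = ch
    return ch
}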

Below are the numbers from running the included benchmark against the old and new code.

Old version:

$ go test -cpu 1024 -count=10 -run=XXX -benchmem -bench=BenchmarkKV_GetWorkerChannel ./kv/memberlist | tee old.txt
goos: darwin
goarch: arm64
pkg: github.com/grafana/dskit/kv/memberlist
BenchmarkKV_GetWorkerChannel-1024      	 3501267	       365.5 ns/op	      48 B/op	       2 allocs/op
BenchmarkKV_GetWorkerChannel-1024      	 3099487	       373.7 ns/op	      53 B/op	       2 allocs/op
BenchmarkKV_GetWorkerChannel-1024      	 3151992	       371.8 ns/op	      49 B/op	       2 allocs/op
BenchmarkKV_GetWorkerChannel-1024      	 3045904	       330.6 ns/op	      48 B/op	       2 allocs/op
BenchmarkKV_GetWorkerChannel-1024      	 3089200	       371.4 ns/op	      49 B/op	       2 allocs/op
BenchmarkKV_GetWorkerChannel-1024      	 3653978	       332.3 ns/op	      58 B/op	       2 allocs/op
BenchmarkKV_GetWorkerChannel-1024      	 3079264	       358.1 ns/op	      49 B/op	       2 allocs/op
BenchmarkKV_GetWorkerChannel-1024      	 3284900	       402.8 ns/op	      50 B/op	       2 allocs/op
BenchmarkKV_GetWorkerChannel-1024      	 3429358	       423.3 ns/op	      53 B/op	       2 allocs/op
BenchmarkKV_GetWorkerChannel-1024      	 3431317	       362.8 ns/op	      48 B/op	       2 allocs/op
PASS
ok  	github.com/grafana/dskit/kv/memberlist	16.741s

New version:

$ go test -cpu 1024 -count=10 -run=XXX -benchmem -bench=BenchmarkKV_GetWorkerChannel ./kv/memberlist | tee new.txt
goos: darwin
goarch: arm64
pkg: github.com/grafana/dskit/kv/memberlist
BenchmarkKV_GetWorkerChannel-1024      	 6367411	       214.2 ns/op	      48 B/op	       2 allocs/op
BenchmarkKV_GetWorkerChannel-1024      	 5560459	       227.1 ns/op	      48 B/op	       2 allocs/op
BenchmarkKV_GetWorkerChannel-1024      	 6267093	       216.9 ns/op	      48 B/op	       2 allocs/op
BenchmarkKV_GetWorkerChannel-1024      	 6772287	       195.9 ns/op	      48 B/op	       2 allocs/op
BenchmarkKV_GetWorkerChannel-1024      	 5128867	       203.3 ns/op	      48 B/op	       2 allocs/op
BenchmarkKV_GetWorkerChannel-1024      	 5811045	       184.2 ns/op	      48 B/op	       2 allocs/op
BenchmarkKV_GetWorkerChannel-1024      	 5568841	       191.1 ns/op	      48 B/op	       2 allocs/op
BenchmarkKV_GetWorkerChannel-1024      	 5977153	       193.8 ns/op	      48 B/op	       2 allocs/op
BenchmarkKV_GetWorkerChannel-1024      	 6543156	       196.6 ns/op	      48 B/op	       2 allocs/op
BenchmarkKV_GetWorkerChannel-1024      	 6678436	       199.9 ns/op	      48 B/op	       2 allocs/op
PASS
ok  	github.com/grafana/dskit/kv/memberlist	15.843s

Comparison:

$ benchstat old.txt new.txt
name                      old time/op    new time/op    delta
KV_GetWorkerChannel-1024     363ns ±11%     202ns ±12%  -44.30%  (p=0.000 n=9+10)

name                      old alloc/op   new alloc/op   delta
KV_GetWorkerChannel-1024     50.5B ±15%     48.0B ± 0%   -4.95%  (p=0.003 n=10+10)

name                      old allocs/op  new allocs/op  delta
KV_GetWorkerChannel-1024      2.00 ± 0%      2.00 ± 0%     ~     (all equal)

Which issue(s) this PR fixes:

Fixes N/A

Checklist

  • Tests updated
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

ortuman added 2 commits May 24, 2022 16:50
Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>
Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>
@ortuman ortuman added the enhancement New feature or request label May 24, 2022
@ortuman ortuman requested a review from pstibrany May 24, 2022 15:20
Comment on lines +1261 to +1264
go func(idx int) {
    _ = kv.getKeyWorkerChannel(strconv.Itoa(idx % totalNumberOfKeys))
    wg.Done()
}(i)

The benchmark should not spawn a new goroutine in each iteration.


We can have a constant number of goroutines, each doing N iterations.
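
One way to apply that suggestion, as a sketch: testing.B's RunParallel runs a fixed pool of goroutines (GOMAXPROCS-sized by default, so 1024 under -cpu 1024) and distributes b.N iterations among them. Here setupKV is a hypothetical helper standing in for the PR's setup code, and totalNumberOfKeys is the constant from the snippet above.

import (
    "strconv"
    "sync/atomic"
    "testing"
)

func BenchmarkKV_GetWorkerChannel(b *testing.B) {
    kv := setupKV(b) // hypothetical helper standing in for the PR's setup code

    var idx int64
    b.RunParallel(func(pb *testing.PB) {
        for pb.Next() {
            // Rotate over the same key space as before; the atomic
            // counter replaces the per-iteration goroutine index.
            i := atomic.AddInt64(&idx, 1)
            _ = kv.getKeyWorkerChannel(strconv.Itoa(int(i) % totalNumberOfKeys))
        }
    })
}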


pstibrany commented May 24, 2022

Given that memberlist processes incoming messages in a single goroutine, I don't think the additional overhead of an RW lock is useful here. (A correct benchmark should show the RWMutex being slower than a classic Mutex when used from a single goroutine.)
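
That claim is easy to check in isolation; a sketch (not part of this PR) comparing the uncontended, single-goroutine cost of the two lock types:

package memberlist

import (
    "sync"
    "testing"
)

// Uncontended Lock/Unlock from a single goroutine: the RWMutex variant
// carries extra bookkeeping for readers, so it should come out slower.
func BenchmarkMutex(b *testing.B) {
    var mu sync.Mutex
    for i := 0; i < b.N; i++ {
        mu.Lock()
        mu.Unlock()
    }
}

func BenchmarkRWMutex(b *testing.B) {
    var mu sync.RWMutex
    for i := 0; i < b.N; i++ {
        mu.Lock()
        mu.Unlock()
    }
}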


ortuman commented May 25, 2022

Given that memberlist processes incoming messages in a single goroutine, I don't think the additional overhead of an RW lock is useful here.

In that case, since there's no concurrent access, we could simply do without the workersMu mutex. 🥳

PS: and no need for the benchmark either... the fastest code is the code that never runs.

@pstibrany

In that case, since there's no concurrent access, we could simply do without the workersMu mutex. 🥳

That would be nice :) However, I'm not 100% sure about that: I see two paths leading to the NotifyMsg method, and it's not clear whether they run from a single goroutine or not. I would err on the safe side and keep the lock, although the majority of calls do come from a single goroutine loop.


ortuman commented May 25, 2022

I see two paths leading to the NotifyMsg method, and it's not clear whether they run from a single goroutine or not.

here & here.

One runs in the packetHandler loop and the other in a handleConn loop. So in essence this function can be accessed concurrently from these two goroutines, and thus workersMu is required. 🫤

On the other hand, I agree that if the worker channels map is accessed sequentially most of the time, there's no need to switch to a non-exclusive mutex; a model of that access pattern is sketched below. 👍
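
As a self-contained model of that access pattern (all names here are illustrative, not dskit's actual types): two long-lived goroutines standing in for the packetHandler and handleConn loops both reach the map, so the exclusive lock stays.

package main

import "sync"

type kvMock struct {
    workersMu       sync.Mutex // the plain, exclusive lock that is kept
    workersChannels map[string]chan []byte
}

func (kv *kvMock) notifyMsg(key string) {
    kv.workersMu.Lock()
    defer kv.workersMu.Unlock()
    if _, ok := kv.workersChannels[key]; !ok {
        kv.workersChannels[key] = make(chan []byte, 1)
    }
}

func main() {
    kv := &kvMock{workersChannels: map[string]chan []byte{}}
    var wg sync.WaitGroup
    // Two concurrent callers, standing in for the packetHandler and
    // handleConn loops; without workersMu, `go run -race` reports a race.
    for i := 0; i < 2; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for j := 0; j < 1000; j++ {
                kv.notifyMsg("ring")
            }
        }()
    }
    wg.Wait()
}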

Closing, as this solution does not add real value.

@ortuman ortuman closed this May 25, 2022