Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

balancer/weightedroundrobin: add load balancing policy (A58) #6241

Merged
merged 13 commits into from
May 8, 2023

Conversation

dfawley
Copy link
Member

@dfawley dfawley commented May 2, 2023

Implements https://github.com/grpc/proposal/blob/master/A58-client-side-weighted-round-robin-lb-policy.md

RELEASE NOTES:

  • balancer/weightedroundrobin: add new LB policy for balancing between backends based on their load reports

@dfawley dfawley added the Type: Feature New features or improvements in behavior label May 2, 2023
@dfawley dfawley added this to the 1.56 Release milestone May 2, 2023
@dfawley dfawley requested a review from zasweq May 2, 2023 01:09
@zasweq
Copy link
Contributor

zasweq commented May 2, 2023

I lgtmed the pr this is based off. Could you please rebase and fix vet?

@zasweq
Copy link
Contributor

zasweq commented May 2, 2023

balancer/weightedroundrobin/config.go:38:6: exported type LBConfigForTesting should have comment or be unexported

@dfawley dfawley changed the title balancer/weightedroundrobin: add load balancing policy balancer/weightedroundrobin: add load balancing policy (A58) May 2, 2023
Copy link
Contributor

@zasweq zasweq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a first pass on the implementation focused on style/readability with a light correctness focus (but still tried to break correctness). Future passes I will go heavier on breaking correctness. I haven't looked at the tests yet.

balancer/weightedroundrobin/balancer.go Outdated Show resolved Hide resolved
balancer/weightedroundrobin/balancer.go Outdated Show resolved Hide resolved
balancer/weightedroundrobin/balancer.go Outdated Show resolved Hide resolved
balancer/weightedroundrobin/balancer.go Outdated Show resolved Hide resolved
balancer/weightedroundrobin/config.go Outdated Show resolved Hide resolved
balancer/weightedroundrobin/balancer.go Show resolved Hide resolved
balancer/weightedroundrobin/balancer.go Show resolved Hide resolved
balancer/weightedroundrobin/balancer.go Show resolved Hide resolved
balancer/weightedroundrobin/balancer.go Show resolved Hide resolved
balancer/weightedroundrobin/scheduler.go Show resolved Hide resolved
@zasweq zasweq assigned dfawley and unassigned zasweq May 3, 2023
@dfawley dfawley assigned zasweq and unassigned dfawley May 3, 2023
Copy link
Member Author

@dfawley dfawley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated based on additional comments, thanks!

balancer/weightedroundrobin/balancer.go Show resolved Hide resolved
balancer/weightedroundrobin/config.go Outdated Show resolved Hide resolved
balancer/weightedroundrobin/scheduler.go Show resolved Hide resolved
Copy link
Contributor

@zasweq zasweq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took a first pass at tests. Sending them out now so you can see them, beginning implementation second pass now.

balancer/weightedroundrobin/balancer_test.go Outdated Show resolved Hide resolved
balancer/weightedroundrobin/balancer_test.go Outdated Show resolved Hide resolved
balancer/weightedroundrobin/balancer_test.go Show resolved Hide resolved
balancer/weightedroundrobin/balancer_test.go Outdated Show resolved Hide resolved
balancer/weightedroundrobin/balancer_test.go Show resolved Hide resolved
Comment on lines 404 to 405
// srv1 starts loaded and srv2 starts without load; ensure RPCs are routed
// disproportionately to srv2 (10:1). Errors are set (but ignored
// initially) such that RPCs will be routed 50/50.
srv1.oobMetrics.SetQPS(10.0)
srv1.oobMetrics.SetCPUUtilization(1.0)
srv1.oobMetrics.SetEPS(0)
// srv1 weight before: 10.0 / 1.0 = 10.0
// srv1 weight after: 10.0 / 1.0 = 10.0

srv2.oobMetrics.SetQPS(10.0)
srv2.oobMetrics.SetCPUUtilization(.1)
srv2.oobMetrics.SetEPS(10.0)
// srv2 weight before: 10.0 / 0.1 = 100.0
// srv2 weight after: 10.0 / 1.0 = 10.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get this. Can you please specify where the errors are set, and why the RPCs once errors are set are routed 50/50 rather than 10/1.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Errors are set via SetEPS, and I explain here with "weight before/after" what the effective weights will be (10:1 before and 1:1 after). The comment above goes with these to explain this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you specify what 0 and 10 mean wrt EPS and errors.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done I think

balancer/weightedroundrobin/balancer_test.go Show resolved Hide resolved
Comment on lines +524 to +546
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()

var mu sync.Mutex
start := time.Now()
now := start
setNow := func(t time.Time) {
mu.Lock()
defer mu.Unlock()
now = t
}
iwrr.TimeNow = func() time.Time {
mu.Lock()
defer mu.Unlock()
return now
}
t.Cleanup(func() { iwrr.TimeNow = time.Now })

srv1 := startServer(t, reportBoth)
srv2 := startServer(t, reportBoth)

// srv1 starts loaded and srv2 starts without load; ensure RPCs are routed
// disproportionately to srv2 (10:1). Because the OOB reporting interval
// is 1 minute but the weights expire in 1 second, routing will go to 50/50
// after the weights expire.
srv1.oobMetrics.SetQPS(10.0)
srv1.oobMetrics.SetCPUUtilization(1.0)

srv2.oobMetrics.SetQPS(10.0)
srv2.oobMetrics.SetCPUUtilization(.1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole codeblock is shared with test above (except reportBoth, which is fine since it's a no-op anyway if it just gets ignored). Refactor into helper?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not have helpers that do too much.. The goal should be to have ~1 line for each type of operation. With too many helpers the code can end up harder to debug when there is a test error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line 514 - 547 are shared. Sorry, the comment didn't include the lines. I think due to the thirty lines of shared code repeated 3 times, you can pull into helper.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

startServer and the setting of metrics are already factored out enough.

Setting the context can't be shared.

The bit that I'd be willing to rework is 517-530, but it seems fine to me this way.

balancer/weightedroundrobin/balancer_test.go Outdated Show resolved Hide resolved
balancer/weightedroundrobin/balancer_test.go Show resolved Hide resolved
Copy link
Contributor

@zasweq zasweq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation looks solid. Some minor comments (alongside a test I want).

}

// A simple RR scheduler to use for fallback when all weights are zero or the
// same or when only one subconn exists.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also mention if < 2 subconns have weights. (i.e. all weights are zero or only one subconn has weight, can perhaps replace the language of "all weights are zero" to both of these cases)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

balancer/weightedroundrobin/scheduler.go Outdated Show resolved Hide resolved
balancer/weightedroundrobin/balancer.go Show resolved Hide resolved
balancer/weightedroundrobin/balancer.go Outdated Show resolved Hide resolved
Comment on lines 440 to 441
// By default we set load reports to off, because they are not running
// upon initial weightedSubConn creation.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the real root cause for this decision?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason this code is here is: 1. we need a non-nil oldCfg, and 2. we want the wsc to know there isn't a running OOB listener.

Comment on lines 442 to 473
oldCfg = &lbConfig{EnableOOBLoadReport: false}
}
w.cfg = cfg
newPeriod := cfg.OOBReportingPeriod
if cfg.EnableOOBLoadReport == oldCfg.EnableOOBLoadReport &&
newPeriod == oldCfg.OOBReportingPeriod {
// Load reporting wasn't enabled before or after, or load reporting was
// enabled before and after, and had the same period. (Note that with
// load reporting disabled, OOBReportingPeriod is always 0.)
return
}
// (Optionally stop and) start the listener to use the new config's
// settings for OOB reporting.
if w.stopORCAListener != nil {
w.stopORCAListener()
}
if !cfg.EnableOOBLoadReport {
w.stopORCAListener = nil
return
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the first config update ok to be OOB? Then oldCfg would hit the nil check on 439, and this function work proceed as normal. It feels weird though that the default is off, but the first config has the bool set, so why not just use the bool in the first config since a config received is what triggers this call in the first place?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By default also means zero value I guess, and your code makes it work correctly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oldCfg = the "current state of the wsc". But on the first call, there is no current state. Not really, anyway.

I have moved the initialization to happen at initialization time, which is better.

balancer/weightedroundrobin/balancer.go Outdated Show resolved Hide resolved
balancer/weightedroundrobin/balancer.go Outdated Show resolved Hide resolved
balancer/weightedroundrobin/balancer.go Outdated Show resolved Hide resolved
@zasweq zasweq assigned dfawley and unassigned zasweq May 4, 2023
Copy link
Member Author

@dfawley dfawley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All comments addressed!

balancer/weightedroundrobin/balancer.go Show resolved Hide resolved
balancer/weightedroundrobin/balancer.go Outdated Show resolved Hide resolved
balancer/weightedroundrobin/balancer.go Outdated Show resolved Hide resolved
balancer/weightedroundrobin/balancer.go Show resolved Hide resolved
balancer/weightedroundrobin/balancer.go Outdated Show resolved Hide resolved
balancer/weightedroundrobin/balancer_test.go Show resolved Hide resolved
balancer/weightedroundrobin/scheduler.go Outdated Show resolved Hide resolved
}

// A simple RR scheduler to use for fallback when all weights are zero or the
// same or when only one subconn exists.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

balancer/weightedroundrobin/balancer_test.go Show resolved Hide resolved
Copy link
Contributor

@zasweq zasweq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All comments minor. LGTM otherwise though.

balancer/weightedroundrobin/balancer.go Outdated Show resolved Hide resolved
balancer/weightedroundrobin/balancer_test.go Outdated Show resolved Hide resolved
balancer/weightedroundrobin/balancer_test.go Show resolved Hide resolved
balancer/weightedroundrobin/balancer_test.go Show resolved Hide resolved
Comment on lines +375 to +376
// and if the scheduler is replaced during this usage, we want to use the
// scheduler that was live when the pick started.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is replaced after reading and usage, continue to use the scheduler that was read. or something like that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't see what's wrong with what I've written.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"We want to" is not strong enough language. I don't really care about this though minor nit.

balancer/weightedroundrobin/balancer_test.go Show resolved Hide resolved
Comment on lines +524 to +546
ctx, cancel := context.WithTimeout(context.Background(), defaultTestTimeout)
defer cancel()

var mu sync.Mutex
start := time.Now()
now := start
setNow := func(t time.Time) {
mu.Lock()
defer mu.Unlock()
now = t
}
iwrr.TimeNow = func() time.Time {
mu.Lock()
defer mu.Unlock()
return now
}
t.Cleanup(func() { iwrr.TimeNow = time.Now })

srv1 := startServer(t, reportBoth)
srv2 := startServer(t, reportBoth)

// srv1 starts loaded and srv2 starts without load; ensure RPCs are routed
// disproportionately to srv2 (10:1). Because the OOB reporting interval
// is 1 minute but the weights expire in 1 second, routing will go to 50/50
// after the weights expire.
srv1.oobMetrics.SetQPS(10.0)
srv1.oobMetrics.SetCPUUtilization(1.0)

srv2.oobMetrics.SetQPS(10.0)
srv2.oobMetrics.SetCPUUtilization(.1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line 514 - 547 are shared. Sorry, the comment didn't include the lines. I think due to the thirty lines of shared code repeated 3 times, you can pull into helper.

Comment on lines 404 to 405
// srv1 starts loaded and srv2 starts without load; ensure RPCs are routed
// disproportionately to srv2 (10:1). Errors are set (but ignored
// initially) such that RPCs will be routed 50/50.
srv1.oobMetrics.SetQPS(10.0)
srv1.oobMetrics.SetCPUUtilization(1.0)
srv1.oobMetrics.SetEPS(0)
// srv1 weight before: 10.0 / 1.0 = 10.0
// srv1 weight after: 10.0 / 1.0 = 10.0

srv2.oobMetrics.SetQPS(10.0)
srv2.oobMetrics.SetCPUUtilization(.1)
srv2.oobMetrics.SetEPS(10.0)
// srv2 weight before: 10.0 / 0.1 = 100.0
// srv2 weight after: 10.0 / 1.0 = 10.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you specify what 0 and 10 mean wrt EPS and errors.

Copy link
Contributor

@zasweq zasweq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All comments minor. LGTM otherwise though.

Comment on lines +411 to +412
// or when registering a new listener, as those calls require the ORCA
// producer mu which is held when calling the listener, and the listener
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does holding the producer mu have anything to do with this mu? Are you saying that that's what guarantees mutual exclusion on accesses? I get the last bit about the listener callback also grabs mu so can't use this mu.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ORCA producer holds a mutex when calling the listeners. If the listeners do something that requires that mutex (basically, register/unregister listeners), we get a deadlock. Transitively, if the listeners share a mutex with another part of the code that holds that mutex while registering/unregistering listeners, we get a deadlock.

Copy link
Contributor

@zasweq zasweq May 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get that wrt deadlock scenarios. However, I'm asking why you mention the second producerMu. Is it to get around this deadlock that would happen if you grab the same mutex? (and can you clarify this in comment)

Copy link
Contributor

@zasweq zasweq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@zasweq zasweq removed their assignment May 5, 2023
@dfawley
Copy link
Member Author

dfawley commented May 5, 2023

Flake in the first run is #6258. I'll run a few more times on GA and see if I can reproduce the failure with the scheduler pointer. I was never able to reproduce it locally, but found and fixed a handful of other things, so maybe it's gone now?

@dfawley
Copy link
Member Author

dfawley commented May 5, 2023

Forced push to pull in 6258 as it was hitting that race a lot.

@dfawley
Copy link
Member Author

dfawley commented May 5, 2023

Ohhhh... I think the problem with the bug I was seeing might be that an unsafe.Pointer is 64 bits, and the scheduler was not the first entry in the struct. I was only seeing problems on 386 and ARM, so I'm 99% sure that was it.. Debugging code removed & I think this is done! (hopefully??)

@dfawley
Copy link
Member Author

dfawley commented May 5, 2023

That wasn't it, either.. But the other instrumentation I put in found it... The rrScheduler was being used and the cast to int from inc() was the problem.

Why does go use the signed int type for slice accesses and length?! Slices can't ever be negative...

@dfawley dfawley merged commit 5c4bee5 into grpc:master May 8, 2023
1 check passed
@dfawley dfawley deleted the wrr branch May 8, 2023 17:01
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 5, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Type: Feature New features or improvements in behavior
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants