xds: add weighted round robin LB policy support #9873

Merged
merged 30 commits into from
Feb 27, 2023

YifeiZhuang
Contributor

implementation design doc: go/grpc-java-wrr

1. to merge orca api remove listener
2. to merge metric report api rqs
3. to settle the package
RoundRobinLB picker issue, e.g.:
size = 5, index = 4
pick1: i = index = 5
pick2: i = index = 6
pick1: oldi = 5, i = 0, index = 0
pick2: oldi = 6, i = 1, index not updated (still 0)
pick1 returns subchannel[0], pick2 returns subchannel[1].
The next pick still returns subchannel[1], so it gets picked more often.
@YifeiZhuang YifeiZhuang changed the title from "implement weighted round robin LB policy" to "xds: add weighted round robin LB policy support" Feb 9, 2023
@YifeiZhuang YifeiZhuang marked this pull request as ready for review February 10, 2023 00:19
double newWeight = subchannel.getWeight();
scheduler.add(i, newWeight > 0 ? newWeight : avgWeight);
}
schedulerRef.set(scheduler);
Contributor

Because you are completely replacing the scheduler each time you update weights, if something has a large weight and something else has a minimum weight the second one might never get used.
One way of dealing with this would be for ObjectState to have a flag indicating whether this was added from a pick, then for all of the ones that weren't added from a pick you could use the old deadline (or the smaller of the old deadline and the newly calculated one). You could have the flag passed to scheduler.add() and all of the work done in the add method.
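
A minimal sketch of the carry-over idea, under stated assumptions. It is a simplified variation that skips the picked/not-picked flag and always takes the smaller of the wait an entry had left in the previous scheduler and the freshly computed 1/weight. `CarryOverEdfScheduler`, its three-argument `add()`, and the field names are hypothetical and are not the scheduler in this PR.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

/** Hypothetical EDF scheduler sketch; not the code under review. */
final class CarryOverEdfScheduler {
  /** One scheduled entry; the entry with the lowest deadline is picked next. */
  static final class ObjectState {
    final int index;
    final double weight;
    double deadline;

    ObjectState(int index, double weight, double deadline) {
      this.index = index;
      this.weight = weight;
      this.deadline = deadline;
    }
  }

  private final PriorityQueue<ObjectState> queue =
      new PriorityQueue<>((a, b) -> Double.compare(a.deadline, b.deadline));
  private final Map<Integer, ObjectState> byIndex = new HashMap<>();
  private double currentTime; // virtual time, advanced by pick()

  /**
   * Adds an entry. If {@code previous} (may be null) already tracked this index,
   * the entry waits no longer than it still had to wait there.
   */
  void add(int index, double weight, CarryOverEdfScheduler previous) {
    double wait = 1.0 / weight;
    if (previous != null) {
      ObjectState old = previous.byIndex.get(index);
      if (old != null) {
        double remaining = Math.max(0.0, old.deadline - previous.currentTime);
        wait = Math.min(wait, remaining);
      }
    }
    ObjectState state = new ObjectState(index, weight, currentTime + wait);
    byIndex.put(index, state);
    queue.add(state);
  }

  /** Picks the entry with the earliest deadline and reschedules it. */
  int pick() {
    ObjectState state = queue.remove();
    currentTime = state.deadline;
    state.deadline = currentTime + 1.0 / state.weight;
    queue.add(state);
    return state.index;
  }
}
```

With weights 100 and 1 and a rebuild every 10 picks, a scheduler built from scratch never reaches index 1, while this version shrinks the remaining wait of index 1 on each rebuild until it gets picked.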

Contributor Author

Absolutely true.
Note that Stubby has a similar "completion ratio" mechanism that gives credit to subchannels from the previous state when updating to the next state with the new weights. This way, the minimum-weight subchannel can still be picked.
The current implementation is very simplified. I'll treat this as a future improvement and capture it in the design doc.

Member

> Because you are completely replacing the scheduler each time you update weights, if something has a large weight and something else has a minimum weight the second one might never get used.

Randomizing the scheduler on each creation should prevent that. In the worst case, if a schedule is used for only a single pick after being created, that is the same as a WRR implementation that has weighted ranges for each choice and uses a random number to choose (the same approach as weighted_target).

The code right now does not seem to randomize the initial scheduler state, but it will need to before de-experimentalizing.
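
One possible way to randomize the initial state, sketched under assumptions: each entry starts at a random fraction of its period (1/weight) instead of at the full period. `RandomizedEdfScheduler` and its members are hypothetical names, and this is only an illustration of the idea, not what grpc-java ends up doing.

```java
import java.util.PriorityQueue;
import java.util.Random;

/** Hypothetical sketch: an EDF scheduler whose first deadlines are randomized. */
final class RandomizedEdfScheduler {
  private static final class Entry {
    final int index;
    final double period; // 1 / weight
    double deadline;

    Entry(int index, double period, double deadline) {
      this.index = index;
      this.period = period;
      this.deadline = deadline;
    }
  }

  private final PriorityQueue<Entry> queue =
      new PriorityQueue<>((a, b) -> Double.compare(a.deadline, b.deadline));
  private final Random random;

  RandomizedEdfScheduler(Random random) {
    this.random = random;
  }

  /** Adds an entry with a first deadline drawn uniformly from [0, 1/weight). */
  void add(int index, double weight) {
    double period = 1.0 / weight;
    queue.add(new Entry(index, period, random.nextDouble() * period));
  }

  /** Picks the entry with the earliest deadline and pushes it back by one period. */
  int pick() {
    Entry entry = queue.remove();
    entry.deadline += entry.period;
    queue.add(entry);
    return entry.index;
  }
}
```

Even if every scheduler is thrown away after a single pick, the low-weight entry now has a small but nonzero chance of drawing the earliest deadline, so it is no longer starved.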

@ejona86
Member

ejona86 commented Feb 14, 2023

Can you rebase this on master now that #9875 is merged? I would have looked at just the commits after that other PR, but there were so many; in the future you could squash commits that were only used during development before creating the PR.

Member

@ejona86 ejona86 left a comment

Still haven't found enough time to get through it all :-/. Sending what I have.

Member

@ejona86 ejona86 left a comment

Happened to go a bit further, but this is still incomplete. All of these comments are minor.

}

WeightedRoundRobinLoadBalancerConfig build() {
return new WeightedRoundRobinLoadBalancerConfig(blackoutPeriodNanos,
Member

FYI: passing `this` (the builder) to the Config constructor is convenient, and Config will have access to the builder's fields even if they are private. (No need to change, though.)
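
For illustration, a small sketch of that pattern. `ExampleConfig` and its setters are hypothetical stand-ins, not the classes in this PR; only the two field names are borrowed from the review.

```java
/** Hypothetical config class; the builder hands itself to the constructor. */
final class ExampleConfig {
  final long blackoutPeriodNanos;
  final boolean enableOobLoadReport;

  // The outer class can read the nested builder's private fields directly.
  private ExampleConfig(Builder builder) {
    this.blackoutPeriodNanos = builder.blackoutPeriodNanos;
    this.enableOobLoadReport = builder.enableOobLoadReport;
  }

  static Builder newBuilder() {
    return new Builder();
  }

  static final class Builder {
    private long blackoutPeriodNanos;
    private boolean enableOobLoadReport;

    Builder setBlackoutPeriodNanos(long blackoutPeriodNanos) {
      this.blackoutPeriodNanos = blackoutPeriodNanos;
      return this;
    }

    Builder setEnableOobLoadReport(boolean enableOobLoadReport) {
      this.enableOobLoadReport = enableOobLoadReport;
      return this;
    }

    // Passing `this` keeps build() unchanged when new fields are added.
    ExampleConfig build() {
      return new ExampleConfig(this);
    }
  }
}
```

The benefit is mostly maintenance: adding a field touches the builder and the constructor body, but not a long positional argument list in build().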

private volatile long lastUpdated;
private volatile long nonEmptySince;
private volatile double weight;
private volatile WeightedRoundRobinLoadBalancerConfig config;
Member

Why is this volatile? It appears to only be accessed in the synchronization context.

Contributor Author

@YifeiZhuang YifeiZhuang Feb 24, 2023

Oh, it appears so.
It can be used by pick() from SubchannelStateListener from the transport thread, or here in the LB thread in weightUpdateTask or acceptResolvedAddresses, all in the sync context.

@YifeiZhuang YifeiZhuang requested a review from ejona86 February 24, 2023 21:59
Member

@ejona86 ejona86 left a comment

There are two remaining non-FYI comments from the previous part of the review that I'd like to see done. They are related; passing config.enableOobLoadReport to the picker's constructor would allow making config non-volatile.

After those two previous comments and the comments requiring change in this review, things look good. I didn't look at tests, but Larry did previously.
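
A rough sketch of the constructor suggestion above, assuming the picker only needs that one flag. `ExamplePicker` and `ExampleBalancer` are hypothetical stand-ins, and how the picker is handed to the channel is omitted.

```java
/** Hypothetical picker: copies the one flag it needs when it is constructed. */
final class ExamplePicker {
  // A final field set in the constructor can be read from any thread
  // once the picker has been published.
  private final boolean enableOobLoadReport;

  ExamplePicker(boolean enableOobLoadReport) {
    this.enableOobLoadReport = enableOobLoadReport;
  }

  /** Stand-in for pick(); only reports which flag value this picker captured. */
  boolean usesOobLoadReport() {
    return enableOobLoadReport;
  }
}

/** Hypothetical balancer whose fields are only touched in the sync context. */
final class ExampleBalancer {
  // Plain field: the picker never reads it, so it does not need to be volatile.
  private boolean enableOobLoadReport;

  void acceptConfig(boolean enableOobLoadReport) {
    this.enableOobLoadReport = enableOobLoadReport;
  }

  /** Creates a picker holding a snapshot of the current flag. */
  ExamplePicker createPicker() {
    return new ExamplePicker(enableOobLoadReport);
  }
}
```

Each picker keeps the value it was created with, so a config change takes effect through the next picker rather than through a shared mutable field.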

@YifeiZhuang YifeiZhuang merged commit 8d12baa into grpc:master Feb 27, 2023
@YifeiZhuang YifeiZhuang deleted the wrr-impl branch February 27, 2023 18:39
@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 29, 2023
4 participants