With min-scale > 1 and activator only the first pod receive traffic #12593

Closed
SharpEdgeMarshall opened this issue Feb 3, 2022 · 7 comments · Fixed by #14028
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. triage/accepted Issues which should be fixed (post-triage)
Milestone

Comments

@SharpEdgeMarshall

SharpEdgeMarshall commented Feb 3, 2022

What version of Knative?

0.26.0

Expected Behavior

/kind bug

With min-scale=2, all pods should receive traffic.

Actual Behavior

  • Only the first pod receives traffic when the activator is proxying requests

Steps to Reproduce the Problem

Istio Version: 1.11.4
Knative Version: 0.26.0
Kubernetes version: (use kubectl version): 1.21
OS (e.g. from /etc/os-release): bottlerocket

@SharpEdgeMarshall SharpEdgeMarshall added the kind/bug Categorizes issue or PR as related to a bug. label Feb 3, 2022
@jwcesign
Member

jwcesign commented Feb 7, 2022

Which gateway are you using? Istio?

@SharpEdgeMarshall
Author

SharpEdgeMarshall commented Feb 7, 2022 via email

@SharpEdgeMarshall SharpEdgeMarshall changed the title With min-scale > 1 only the first pod receive traffic With min-scale > 1 and activator only the first pod receive traffic Feb 11, 2022
@github-actions

This issue is stale because it has been open for 90 days with no
activity. It will automatically close after 30 more days of
inactivity. Reopen the issue with /reopen. Mark the issue as
fresh by adding the comment /remove-lifecycle stale.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 13, 2022
@SharpEdgeMarshall
Author

This should be reopened; it's an actual bug hitting production workloads.

@dprotaso
Member

dprotaso commented Mar 29, 2023

/triage accepted
/reopen
/milestone v1.11.0

I want to prioritize an investigation in the v1.11.0 timeframe.

@dprotaso dprotaso reopened this Mar 29, 2023
@knative-prow knative-prow bot added the triage/accepted Issues which should be fixed (post-triage) label Mar 29, 2023
@dprotaso dprotaso added this to the v1.11.0 milestone Mar 29, 2023
@github-actions github-actions bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 30, 2023
@dprotaso
Member

/assign @dprotaso

Found the issue. It seems that when we have exactly 2 pods with equal weight, we always pick the first one:

    pick, alt := targets[r1], targets[r2]
    // Possible race here, but this policy is for CC=0,
    // so fine.
    if pick.getWeight() > alt.getWeight() {
        pick = alt
    }

If you kick off a bunch of requests in parallel, you'll get better spreading, since the weights won't be even.
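To make the tie behaviour concrete, here is a minimal, self-contained sketch. It is not the activator code: `target`, `pickFirstOnTie`, `pickRandomOnTie`, and `simulate` are hypothetical names, the two-pod case is hard-coded, and the coin-flip tie-break is only one illustrative way to spread equal-weight picks, not necessarily what the eventual fix does.

```go
package main

import (
	"fmt"
	"math/rand"
)

// target is a hypothetical stand-in for the activator's per-pod tracker.
type target struct {
	name   string
	weight int // number of in-flight requests
}

// pickFirstOnTie mirrors the comparison quoted above: alt wins only when it
// is strictly lighter, so a tie always resolves to targets[0].
func pickFirstOnTie(targets []*target) *target {
	pick, alt := targets[0], targets[1]
	if pick.weight > alt.weight {
		pick = alt
	}
	return pick
}

// pickRandomOnTie is an illustrative variant (an assumption, not the merged
// fix) that flips a coin when both weights are equal.
func pickRandomOnTie(targets []*target) *target {
	pick, alt := targets[0], targets[1]
	if pick.weight > alt.weight || (pick.weight == alt.weight && rand.Intn(2) == 0) {
		pick = alt
	}
	return pick
}

// simulate sends n sequential requests. Each request completes before the
// next one starts, so both weights are zero again at every pick.
func simulate(policy func([]*target) *target, n int) map[string]int {
	targets := []*target{{name: "pod-0"}, {name: "pod-1"}}
	hits := map[string]int{}
	for i := 0; i < n; i++ {
		p := policy(targets)
		p.weight++ // request starts
		hits[p.name]++
		p.weight-- // request completes
	}
	return hits
}

func main() {
	// With sequential requests, every pick is a tie, so pod-1 is never chosen.
	fmt.Println("first on tie: ", simulate(pickFirstOnTie, 1000))
	// With a random tie-break, both pods receive a share of the requests.
	fmt.Println("random on tie:", simulate(pickRandomOnTie, 1000))
}
```

Under strictly sequential load the first policy sends all 1000 requests to `pod-0`, which matches the behaviour reported in this issue; overlapping requests would break the ties and hide the problem.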

@dprotaso
Member

> This should be reopened it's an actual bug hitting production workloads

@SharpEdgeMarshall do you have a way to reproduce your production issue, i.e. something similar to #14011?

It would be good to know if there are other scenarios where this happens.
