Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert #3070, wait on networking a different way #3107

Merged
merged 2 commits into from
Apr 18, 2023

Conversation

zmerlynn
Copy link
Collaborator

The problem addressed by #3070 is that on an indeterminate basis, we are seeing containers start without networking fully available. Once networking seems to work, it works fine.

However, the fix in #3070 introduced a downside: heavy watch traffic, because I didn't quite understand that it would also block the hanging GET of the watch. See #3106.

Instead of timing out the whole client, let's use an initial-probe approach and instead block on a successful GET (with a reasonable timeout) before we try to start informers.

Fixes #3106

@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: 3d2515c9-5ba7-4e10-939b-bb884989f7de

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@zmerlynn zmerlynn force-pushed the wait-on-networking branch 2 times, most recently from e111eb9 to c40fe48 Compare April 17, 2023 23:46
@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: b590d5ee-5671-49f9-bc59-fe6a1c1d04b6

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@zmerlynn zmerlynn force-pushed the wait-on-networking branch from c40fe48 to e90ce40 Compare April 18, 2023 00:33
@zmerlynn zmerlynn requested review from gongmax and removed request for EricFortin and aLekSer April 18, 2023 00:33
@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: 6b954725-5eb8-470a-9dba-3e2ed011367e

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/3107/head:pr_3107 && git checkout pr_3107
  • helm install agones ./install/helm/agones --namespace agones-system --agones.image.release=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.32.0-e90ce40-amd64

@@ -263,6 +267,38 @@ func (s *SDKServer) Run(ctx context.Context) error {
return nil
}

// waitForConnection attempts a GameServer GET every 3s until the client responds.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amusingly this matches the exact pattern we use on SDK clients that don't support gRPC.

pkg/sdkserver/sdkserver.go Outdated Show resolved Hide resolved
@zmerlynn zmerlynn force-pushed the wait-on-networking branch 2 times, most recently from 46efa67 to bcb9aab Compare April 18, 2023 17:56
@zmerlynn zmerlynn enabled auto-merge (squash) April 18, 2023 17:58
@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: 653aecf4-f80e-4ce4-aa57-a414683dfe0f

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: ea6d2094-0933-46a2-9f70-7a9d793851ac

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@zmerlynn zmerlynn force-pushed the wait-on-networking branch from bcb9aab to eef0152 Compare April 18, 2023 20:02
@zmerlynn zmerlynn disabled auto-merge April 18, 2023 20:03
@zmerlynn
Copy link
Collaborator Author

@markmandel I just realized, I thought since you left a non-blocking comment that you can approved it. If you get a sec, PTAL. Or @gongmax or @roberthbailey if you get a chance - thanks!

@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: a2768079-7e87-43cd-9932-7c62e57cefff

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: 9ac8d75a-1e85-4ef9-bd0c-454069ee989f

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@zmerlynn zmerlynn force-pushed the wait-on-networking branch from eef0152 to c4c46e8 Compare April 18, 2023 21:29
The problem addressed by googleforgames#3070 is that on an indeterminate basis, we
are seeing containers start without networking fully available. Once
networking seems to work, it works fine.

However, the fix in googleforgames#3070 introduced a downside: heavy watch traffic,
because I didn't quite understand that it would also block the hanging
GET of the watch. See googleforgames#3106.

Instead of timing out the whole client, let's use an initial-probe
approach and instead block on a successful GET (with a reasonable
timeout) before we try to start informers.

Fixes googleforgames#3106
@zmerlynn zmerlynn force-pushed the wait-on-networking branch from c4c46e8 to d35a366 Compare April 18, 2023 21:47
@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: 81ea0032-0eff-4b29-a6eb-8026272da9d3

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/3107/head:pr_3107 && git checkout pr_3107
  • helm install agones ./install/helm/agones --namespace agones-system --agones.image.release=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.32.0-d35a366-amd64

@google-oss-prow
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: gongmax, zmerlynn
Once this PR has been reviewed and has the lgtm label, please assign alekser for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

1 similar comment
@google-oss-prow
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: gongmax, zmerlynn
Once this PR has been reviewed and has the lgtm label, please assign alekser for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@zmerlynn zmerlynn merged commit e61be47 into googleforgames:main Apr 18, 2023
@zmerlynn zmerlynn deleted the wait-on-networking branch April 18, 2023 23:29
@Kalaiselvi84 Kalaiselvi84 added the kind/bug These are bugs. label May 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Excessive WATCH load from Agones 1.31
5 participants