-
Notifications
You must be signed in to change notification settings - Fork 823
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updating UpdateList to update the values on a list #3899
Conversation
Build Succeeded 👏 Build Id: ac2c0d9e-1a0b-400a-972d-b7b1e11075ce The following development artifacts have been built, and will exist for the next 30 days:
A preview of the website (the last 30 builds are retained): To install this version:
|
Build Succeeded 👏 Build Id: b956a631-428e-4fd6-af05-7cf4b8867f62 The following development artifacts have been built, and will exist for the next 30 days:
A preview of the website (the last 30 builds are retained): To install this version:
|
@igooch This is a duplicate of the previous pull request that I made and fixes the CLA issue. What is the process from here to get this approved? |
I pulled down your PR and installed on a cluster. It doesn't look like the code is working as intended.
The UpdateList should make use of the field masks for specifying which field(s) are being updated. More info on that in the PR that was closed #3897 (comment) . Please also include tests for the new behavior. |
Hmm that is strange as it worked locally for me. To test this I am using the following for my gameserver.yaml:
One thing that is important to note is that when running kubectl describe gs I see the following:
And:
I think this is due to there already being a predefined list named Anyways, to test I copied your commands into my shell and ran them to get the following output.
@igooch Would you be able to provide more detailed steps to reproduce the issue you are seeing? If I can start replicating this on my end it will be a lot easier to diagnose the issue.
I am a bit confused by that. Looking at both AddListValue and RemoveListvalue in sdkserver.go, neither of them use the field masks. They both queue the changes for being processed later in updateList. Are field masks specific to UpdateList? I went with that method because it matched the pattern that was already used. I can change this but I am just trying to understand why one way over the other. |
@igooch I just wanted to follow up to see if you had a chance to check this out. I was unable to repro your issue so if you can provide a bit more detailed repro steps that would be great, and then I'll hopefully have a better idea of what is going wrong! |
The seeing two I'm also not sure what you mean by not being able to replicate? From what I can tell the code in the above comment is replicated exactly. The only difference being that there's an additional The reason the existing SDK Server doesn't currently use the protobuf field mask is because it's only changing one field, the capacity. Or in the case of Add or Remove it's only changing the values field. If multiple values can be changed, then that's when the field mask comes into use. Using the field mask allows for updates that only change specific fields, and leave the other fields untouched. So the command |
@igooch I have updated the PR to use the field mask system as requested. I have not gotten a chance to test this out yet and plan to do so in the next day or so. However, I wanted to see if I could get some feedback on this as I ran into some issues with the unit tests. When running Is this expected for unit tests and if so is there something I need to change to allow the test case to pass correctly? In the mean time I added a line to make it directly update the game server to pass the tests, however, I don't think this should be submitted and would like to find the correct solution. Is this something you are familiar with? I have also converted this PR to a draft until this is resolved. |
Build Failed 😱 Build Id: ac7d8248-4a33-4bc2-8c28-aed8a44f8330 To get permission to view the Cloud Build view, join the agones-discuss Google Group. |
pkg/sdkserver/sdkserver.go
Outdated
list.Capacity = tmpList.Capacity | ||
list.Values = tmpList.Values | ||
gsCopy.Status.Lists[name] = list | ||
s.patchGameServer(ctx, gs, gsCopy) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't want to call this here, all the calls are batched, and then the actual change to the gameserver happens in updatelist
agones/pkg/sdkserver/sdkserver.go
Line 1218 in e35d73f
gs, err = s.patchGameServer(ctx, gs, gsCopy) |
@markmandel do you want to weigh in here? It does seem a bit odd to batch an "overwrite" function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to batching for a few reasons:
- It puts work items in the workerqueue - which for an "overwrite" operation isn't necessary for conflicts, but is necessary if the KCP goes down for any reason -- so we stay self-healing.
- The experience across the SDK stays the same regardless of operation.
Build Failed 😱 Build Id: 3b9b1854-9f92-4f6a-bf5f-3a04849188e9 To get permission to view the Cloud Build view, join the agones-discuss Google Group. |
Build Failed 😱 Build Id: 858c9972-5509-4d5f-83d6-9dde95825719 To get permission to view the Cloud Build view, join the agones-discuss Google Group. |
Build Failed 😭 Build Id: 73a0e49f-c29b-438f-b086-2261905ba356 Status: FAILURE To get permission to view the Cloud Build view, join the agones-discuss Google Group. |
Build Succeeded 🥳 Build Id: 094d510f-5e3c-4429-87bc-552f22a1dbda The following development artifacts have been built, and will exist for the next 30 days:
A preview of the website (the last 30 builds are retained): To install this version:
|
Build Failed 😭 Build Id: a82fbd60-8bcd-49b7-bb0e-91485829ca96 Status: FAILURE To get permission to view the Cloud Build view, join the agones-discuss Google Group. |
Build Failed 😭 Build Id: 38b0b88c-0113-46c3-9243-294b6d3b2b84 Status: FAILURE To get permission to view the Cloud Build view, join the agones-discuss Google Group. |
Build Failed 😭 Build Id: 6af3ddb0-ba08-4478-beda-3c6961d654bf Status: FAILURE To get permission to view the Cloud Build view, join the agones-discuss Google Group. |
@igooch I appear to be failing on the
I am not sure what this step is, would you be able to clarify this for me? |
Builds are currently broken -- @zmerlynn is investigating #3939. I'll try and take a look later today as well. |
Ah I see, thank you for the heads up! In the meantime, I do have some questions about some of the codebase as it relates to the batching/queueing system. I may be mistaken but it seems that when a request to change a list comes in (Add, Remove, and now Update), the current value in the queue gets overwritten.
In this case that value for how to update the list is getting overwritten. My worry is that this will create a scenario where sending too many requests at the same time will cause the system to overwrite requests. In fact, this was the initial reason I started looking into using |
The agones/pkg/sdkserver/sdkserver.go Line 1116 in ff4c222
It then appends the new value to the list, so that all values are written back. agones/pkg/sdkserver/sdkserver.go Line 1129 in ff4c222
So batching complicates the
This is interesting, have you seen any error like |
Build Succeeded 🥳 Build Id: a3bf987c-88a1-4ed5-9b9e-6e0e13965c0d The following development artifacts have been built, and will exist for the next 30 days:
A preview of the website (the last 30 builds are retained): To install this version:
|
@igooch I added a mutex lock and made minor adjustments to clean the code up a bit. Does this look correct? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update, one quick adjustment to the mutex. Just to confirm on this, the UpdateList
will overwrite everything, although any add or remove or other update requests that come in after the initial update list request but before the batch would still modify the batch. Will this behavior work for your use case?
pkg/sdkserver/sdkserver.go
Outdated
s.gsUpdateMutex.RLock() | ||
defer s.gsUpdateMutex.RUnlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're writing to s.gsListUpdates[list.Name]
this should be a Lock()
mutex, and will need to be moved to below the s.GetList
since that method uses a RLock()
.
Yes at least for us, we were expecting the list to be overwritten by UpdateList calls which this should so. This allows us to completely change the list in a single call if needed. I will update the code shortly. Thank you! |
Build Succeeded 🥳 Build Id: 0d4f09de-bcd7-4bd9-a791-8961e515e011 The following development artifacts have been built, and will exist for the next 30 days:
A preview of the website (the last 30 builds are retained): To install this version:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank for you for your work and patience on this!
/kind bug
What this PR does / Why we need it:
This PR updates the UpdateList functionality to update the list values along side the given capacity. This is very important because without it managing game server keys gets very complicated and often causes unwanted race conditions when sending multiple additions or removals to lists in quick succession.
Which issue(s) this PR fixes:
Closes #3870