Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Beta: UpdateList using Agones 1.41 rest API doesn't update the data in the selected list #3870

Closed
Tsm012 opened this issue Jun 13, 2024 · 18 comments · Fixed by #3899
Closed
Labels
kind/bug These are bugs.

Comments

@Tsm012
Copy link

Tsm012 commented Jun 13, 2024

What happened:
Attempting to use the Lists features in Agones 1.41. I have a fleet deployed in agones and am attempting to use the:

curl -d '{"capacity": "120", "values": ["player3", "player4"]}' -H "Content-Type: application/json" -X PATCH http://localhost:${AGONES_SDK_HTTP_PORT}/v1beta1/lists/players

call as mentioned in the rest documentation. The list is declared on our "kind: Fleet" resource and declares a list players

lists:
  players:
    capacity: 20
    values:
    - player1
    - player2
    - player3

Once the fleet is deployed, I can run the command:

curl -H "Content-Type: application/json" -X GET http://localhost:${AGONES_SDK_HTTP_PORT}/v1beta1/lists/players

and will get the correct response.

{"name":"players","capacity":"20","values":["player1","player2","player3"]}

When I go to update that same list using the command:

curl -d '{"capacity": "20", "values": ["player3", "player4"]}' -H "Content-Type: application/json" -X PATCH http://localhost:${AGONES_SDK_HTTP_PORT}/v1beta1/lists/players

I get the response:

{"name":"","capacity":"0","values":[]}

I then check again to see if my list was updated:

curl -H "Content-Type: application/json" -X GET http://localhost:${AGONES_SDK_HTTP_PORT}/v1beta1/lists/players

but the reply seems to indicate nothing has changed.

{"name":"players","capacity":"20","values":["player1","player2","player3"]}

What you expected to happen:

I expected the list up update to reflect the data payload of

{"capacity": "20", "values": ["player3", "player4"]}

How to reproduce it (as minimally and precisely as possible):

Create a Fleet resource in agones and deploy it with a list declared as:

lists:
  players:
    capacity: 20
    values:
    - player1
    - player2
    - player3

shell into a game server in the fleet just deployed and verify that your initial values are in the list

curl -H "Content-Type: application/json" -X GET http://localhost:${AGONES_SDK_HTTP_PORT}/v1beta1/lists/players

{"name":"players","capacity":"20","values":["player1","player2","player3"]}

Attempt to update the list with new values

curl -d '{"capacity": "20", "values": ["player3", "player4"]}' -H "Content-Type: application/json" -X PATCH http://localhost:${AGONES_SDK_HTTP_PORT}/v1beta1/lists/players

after running the update list call, confirm that the list has not changed

curl -H "Content-Type: application/json" -X GET http://localhost:${AGONES_SDK_HTTP_PORT}/v1beta1/lists/players

{"name":"players","capacity":"20","values":["player1","player2","player3"]}

Anything else we need to know?:

Environment:

  • Agones version: 1.41
  • Kubernetes version (use kubectl version): 1.29
  • Cloud provider or hardware configuration: AWS EKS
  • Install method (yaml/helm): Helm
  • Troubleshooting guide log(s):
  • Others:
@Tsm012 Tsm012 added the kind/bug These are bugs. label Jun 13, 2024
@markmandel
Copy link
Member

Can you share the logs from the sdk-server sidecar?

If you have a gameserver or fleet.yaml you could share with a setup that easy to replicate this, would also be handy.

@markmandel
Copy link
Member

FYI: @igooch just to make sure you see this.

@Tsm012
Copy link
Author

Tsm012 commented Jun 13, 2024

Here is a simplified version of a fleet resource that you can deploy. This is happening in both AWS EKS and locally in minikube. If you shell into the alpine container, you can run the curl commands to replicate.

apiVersion: "agones.dev/v1"
kind: Fleet
metadata:
  name: simple-game-server-fleet
spec:
  replicas: 1
  template:
    spec:
      sdkServer:
        logLevel: Debug
      lists:
        players:
          capacity: 3
          values:
          - player1
          - player2
          - player3
      container: simple-game-server
      template:
        spec:
          containers:
            - name: simple-game-server
              image: us-docker.pkg.dev/agones-images/examples/simple-game-server:0.31
            - name: alpine   
              image: alpine/curl 
              command: ["sleep"]         # Override the container entrypoint with a simple sleep command
              args: ["1000000"]  

here is the tail from the agones sidecar, setting the log info to debug doesn't seem to add any additional info even though doing a describe on the pod seems to say the LOG_LEVEL environment variable is set to Debug

{"error":"not a valid logrus Level: \"\"","message":"Invalid LOG_LEVEL value. Defaulting to 'info'.","severity":"warning","time":"2024-06-13T23:37:24.349150707Z"}

{"ctlConf":{"GameServerName":"simple-game-server-fleet-kb5lx-lwd84","PodNamespace":"default","Address":"localhost","IsLocal":false,"LocalFile":"","Delay":0,"Timeout":0,"Test":"","TestSdkName":"","KubeConfig":"","GracefulTermination":true,"GRPCPort":9357,"HTTPPort":9358,"LogLevel":""},"featureGates":"AutopilotPassthroughPort=false\u0026CountsAndLists=true\u0026DisableResyncOnSDKServer=true\u0026Example=false\u0026GKEAutopilotExtendedDurationPods=false\u0026PlayerAllocationFilter=false\u0026PlayerTracking=false\u0026PortPolicyNone=false\u0026PortRanges=false\u0026RollingUpdateFix=false","message":"Starting sdk sidecar","severity":"info","source":"main","time":"2024-06-13T23:37:24.349276227Z","version":"1.41.0"}

{"gsKey":"default/simple-game-server-fleet-kb5lx-lwd84","message":"Created GameServer sidecar","severity":"info","source":"*sdkserver.SDKServer","time":"2024-06-13T23:37:24.350165213Z"}

{"gsKey":"default/simple-game-server-fleet-kb5lx-lwd84","message":"Connection to Kubernetes service established","severity":"info","source":"*sdkserver.SDKServer","time":"2024-06-13T23:37:24.359130873Z","try":0}             

{"grpcEndpoint":"localhost:9357","message":"Starting SDKServer grpc service...","severity":"info","source":"main","time":"2024-06-13T23:37:24.359724909Z"}

{"gsKey":"default/simple-game-server-fleet-kb5lx-lwd84","message":"Starting workers...","queue":"agones.dev.default.simple-game-server-fleet-kb5lx-lwd84","severity":"info","source":"*sdkserver.SDKServer","time":"2024-06-13T23:37:24.460172777Z","workers":1}

{"httpEndpoint":"localhost:9358","message":"Starting SDKServer grpc-gateway...","severity":"info","source":"main","time":"2024-06-13T23:37:25.361356656Z"}

@igooch
Copy link
Collaborator

igooch commented Jun 14, 2024

Looks like I implemented the UpdateList to only update the capacity, since all the other SDKs only use it for changing the capacity. The List values can be changed by addValue or removeValue. @Tsm012 do the addValue and removeValue work for your use case, or do you need to overwrite the value list?

@markmandel we should update the documentation to reflect that as-is it only updates the capacity.

// UpdateList collapses all update capacity requests for a given List into a single UpdateList request.
// This function currently only updates the Capacity of a List.
// Returns error if the List does not exist (name cannot be updated).
// Returns error if the List update capacity is out of range [0,1000].
// [Stage:Beta]
// [FeatureFlag:CountsAndLists]
func (s *SDKServer) UpdateList(ctx context.Context, in *beta.UpdateListRequest) (*beta.List, error) {
if !runtime.FeatureEnabled(runtime.FeatureCountsAndLists) {
return nil, errors.Errorf("%s not enabled", runtime.FeatureCountsAndLists)
}
if in == nil {
return nil, errors.Errorf("UpdateListRequest cannot be nil")
}
name := in.List.Name
s.logger.WithField("name", name).Debug("Update List -- Currently only used for Updating Capacity")
gs, err := s.gameServer()
if err != nil {
return nil, err
}
s.gsUpdateMutex.Lock()
defer s.gsUpdateMutex.Unlock()
if in.List.Capacity < 0 || in.List.Capacity > apiserver.ListMaxCapacity {
return nil, errors.Errorf("out of range. Capacity must be within range [0,1000]. Found Capacity: %d", in.List.Capacity)
}
if _, ok := gs.Status.Lists[name]; ok {
batchList := s.gsListUpdates[name]
batchList.capacitySet = &in.List.Capacity
s.gsListUpdates[name] = batchList
// Queue up the Update for later batch processing by updateLists.
s.workerqueue.Enqueue(cache.ExplicitKey(updateLists))
return &beta.List{}, nil
}
return nil, errors.Errorf("not found. %s List not found", name)
}

@Tsm012
Copy link
Author

Tsm012 commented Jun 14, 2024

We could make it work using the add/remove value calls. The source of truth for this data is on the gameserver so one call to update all the data in the list based on data from the game server would be really nice.

@markmandel
Copy link
Member

We could make it work using the add/remove value calls. The source of truth for this data is on the gameserver so one call to update all the data in the list based on data from the game server would be really nice.

I'm curious your use case for overwriting a total list? Usually we see people wanting to add/remove, but not overwrite. Do you have one in mind.

@igooch - 100% agree the docs need to be updated.

@Tsm012
Copy link
Author

Tsm012 commented Jun 14, 2024

Our game server maintains a list of data needed to connect to it that is used by our match maker. This data is constantly changing as players join and leave different servers and changes in the data can happen very frequently. This list of data is always authoritative. Instead of having to constantly check what has changed in the list maintained by the server and what was the last state of the list in Agones and adding and removing entries one at a time using multiple rest calls, It would be nice to be able to take a snapshot of what the current authoritative list is on the server and replace the list in Agones with one call.

@markmandel
Copy link
Member

Our game server maintains a list of data needed to connect to it that is used by our match maker.

More for curiosities sake - is the matchmaker communicating this list somehow? or is the game server process maintaining this list within itself? or something else?

@Tsm012
Copy link
Author

Tsm012 commented Jun 14, 2024

is the matchmaker communicating this list somehow?

Yes, our match maker is using the agones sdk to pull list information for game servers to use in the match making process.

@markmandel
Copy link
Member

Yes, our match maker is using the agones sdk

... okay now I'm really curious. The Agones SDK can only talk to a single GameServer - unless your match maker talks to the game server binary (somehow?) and the server does the work against it's own GameServer record through the SDK?

@igooch
Copy link
Collaborator

igooch commented Jun 17, 2024

@markmandel I took a look at the docs, and they were tested against the LocalSDKServer which does overwrite both the capacity and the values. I'll make a separate issue for the documentation updates. (An aside that UpdateCounter is working as expected, although it also returns an empty object.)

Do we want to update the SDK Server to also overwrite the values?

@appleturnoverload
Copy link

Yes, our match maker is using the agones sdk

... okay now I'm really curious. The Agones SDK can only talk to a single GameServer - unless your match maker talks to the game server binary (somehow?) and the server does the work against it's own GameServer record through the SDK?

Just to clarify this, we are currently using the k8s API here https://agones.dev/site/docs/guides/access-api/ to get all the info from the game servers in one go rather than using the SDK itself to get data about individual game servers.

So the way it's working is that the server is responsible for maintaining some of the lists, which the matchmaking system reads from using the k8s API to make informed decisions (things like current players, player capacity). Then on allocate, the matchmaking system will add some values to another list to record the fact that players are expecting to join the game server and the server will receive this change by listening to changes through the SDK.

We are expecting to have a slightly less heavy handed way of getting server state into the matchmaking system by streaming changes through the k8s API rather than just getting all the game servers in bulk.

@markmandel
Copy link
Member

Sounds like this is actually a K8s API thing you are doing, and not part of the SDK -- so we can close this issue now I believe.

@chrisfoster121
Copy link
Contributor

Hello, I have been following this and would like to try to update the UpdateList SDK code to support changing list values. I have some changes locally but am unable to follow the Minikube testing procedures. Specifically, I am getting stuck during the make minikube-shell step when running kubectl get pods. I am seeing the following output:

root@docker-desktop:/go/src/agones.dev/agones# kubectl get pods
E0705 23:20:59.463501      36 memcache.go:265] couldn't get current server API group list: Get "https://127.0.0.1:52935/api?timeout=32s": dial tcp 127.0.0.1:52935: connect: connection refused
E0705 23:20:59.463762      36 memcache.go:265] couldn't get current server API group list: Get "https://127.0.0.1:52935/api?timeout=32s": dial tcp 127.0.0.1:52935: connect: connection refused
E0705 23:20:59.465400      36 memcache.go:265] couldn't get current server API group list: Get "https://127.0.0.1:52935/api?timeout=32s": dial tcp 127.0.0.1:52935: connect: connection refused
E0705 23:20:59.465761      36 memcache.go:265] couldn't get current server API group list: Get "https://127.0.0.1:52935/api?timeout=32s": dial tcp 127.0.0.1:52935: connect: connection refused
E0705 23:20:59.467207      36 memcache.go:265] couldn't get current server API group list: Get "https://127.0.0.1:52935/api?timeout=32s": dial tcp 127.0.0.1:52935: connect: connection refused
The connection to the server 127.0.0.1:52935 was refused - did you specify the right host or port?
root@docker-desktop:/go/src/agones.dev/agones#

It seems as though the docker container is starting in the Docker Desktop WSL distro that was created when installing docker desktop. Is this the expected behavior? Additionally, I noticed that I was not immediately switched to agones context when running make minikube-test-cluster.

My development environment is:

  • Windows 11
  • WSL 2
  • Ubuntu 22.04.4 LTS
  • Docker Desktop with the WSL integration active
  • Minikube, kubectl, and Helm installed on WSL Ubuntu

Should this be a separate issue? Am I missing something here?

@igooch
Copy link
Collaborator

igooch commented Jul 9, 2024

@chrisfoster121 could you repost on the Slack channel to see if anyone there has run into this issue that could help?

@chrisfoster121
Copy link
Contributor

@igooch That link looks to have expired. It says "This link is no longer active". Additionally, I have been able to work around this by doing all of my work in a Ubuntu VM on Hyper-V. How would I be able to get my SDK code running as the Agones sidecar? I have been able to push and install the controller, extensions, and ping images but that does not seem to impact the sidecar that gets run.

@chrisfoster121
Copy link
Contributor

@Tsm012 @markmandel @igooch I believe I have a fix for this that I have tested. I am going to look at cleaning this up and see about getting this in (following the contribution guidelines). It would be incredible if we could get this into the next stable release coming next Tuesday.

@igooch
Copy link
Collaborator

igooch commented Jul 9, 2024

Ah so it is, updating the link in #3896.

Assuming you have the Agones repo cloned locally and are making changes there, go into the agones/build directory and run make gen-sdk-grpc SDK_FOLDER=restapi or make gen-all-sdk-grpc. From there the running-a-test-minikube-cluster instructions on building, pushing and installing for minikube should work, although noted in that same readme it hasn't been testing on minikube on Windows.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug These are bugs.
Projects
None yet
5 participants