Running windows game server #54
Comments
It looks like 1.9 support for Windows is much better than previous - I would suggest only trying it in a 1.9 install. |
http://blog.kubernetes.io/2017/09/windows-networking-at-parity-with-linux.html this was shared by @alexandrem, and if you look at the table at the bottom you'll see that only 1709 supports multiple containers in a pod. |
Sorry, I wasn't clear - my thought was that the blog post is pre-1.9, so it may be out of date. Looking at the Known Limitations Section, which is the 1.9 docs, this is not listed as an issue - where I figure it should be. Hence I figured it makes sense to check, either because it's a bad documentation bug (and we could file a PR), or work has been done to resolve the issue on Windows Server 2016. |
You're right, and we need to test. From my understanding, though, this is independent of Kubernetes development and more an issue of Windows Server networking, which has been fixed in version 1709. So as I see it, it won't work unless you use an updated version. The real problem will be testing: so far I don't see any other way than having a real cluster. |
Before testing - would be faster to just drop an email to https://groups.google.com/forum/#!forum/kubernetes-sig-windows 😄 and see if it's a documentation bug or not. The other part I'm curious about is the state of hostPort support. |
Some more content released today: |
https://blog.docker.com/2017/11/docker-for-windows-17-11/
Possibly no need to change anything in the code, as the sidecar would run on Linux and the gameserver on Windows. This would also be interesting to explore. |
Today - there is an update! |
Kind issue to look at: kubernetes-sigs/kind#410 |
Big news!!! https://cloud.google.com/kubernetes-engine/docs/release-notes?hl=en
|
I have taken a stab at this in the past few days. I somewhat succeeded and thought I would recap my work in this issue so that we may be able to make it happen. Disclaimer: I was mainly interested in seeing if this could even work. I hacked a few things and worked around others.
Building sidecar and simple udp images for Windows
SDK
Here's an example of my final
Windows container images need to be built on the same OS that they are targeting. When trying to build on my machine (1903) with a
simple-udp
I used simple-udp to test as it is simple and can be cross-compiled. I used this command from the Agones clone:
I then wrote a similar
Cluster creation
I followed GKE's documentation to create a cluster to host Agones with a Windows node. It seems we should be able to run on 1.15, but when I tried that version I couldn't add the Windows node pool, so I reverted to 1.16, which at the time of writing gave me 1.16.8-gke15. I used these commands to create the cluster:
Agones installation
We need to replace the SDK image that the default install uses, since it is built as a Linux-based container. Helm has settings to override it. Unfortunately, Agones uses a single repo URL, and only the name and tags can be replaced. We can then either host all images in an external repo and replace the image repo setting, or we can cheat a little (which I did, but it'd be better not to). Here's how I installed Agones:
This means the controller will insert a sidecar with the image at gcr.io/agones-images/agones-sdk-win:1909. On my side, since I built the image on the actual cluster's Windows node, I simply had to tag it like so. It worked in my favor since Kubernetes' default
Deploying game servers
I used Agones' example and added a node selector to target Windows machines. Here's the content of my
apiVersion: "agones.dev/v1"
kind: Fleet
metadata:
  name: simple-udp
spec:
  replicas: 2
  template:
    spec:
      ports:
        - name: default
          containerPort: 7654
      template:
        spec:
          nodeSelector:
            kubernetes.io/os: windows
          containers:
            - name: simple-udp
              image: <redacted>
              resources:
                requests:
                  memory: "64Mi"
                  cpu: "100m"
                limits:
                  memory: "64Mi"
                  cpu: "200m"
After that, I had 2 game servers running in my cluster. Allocation works too.
End of good news
I was not able to contact my game server, so I did some investigation to diagnose the problem. Firewall rules are correctly set up: when running the SDK in local mode and simple-udp directly on the Windows VM, I was able to contact the server. To run in GKE with a Windows node, we have to enable IP alias, which means each pod has its own IP. Kubernetes then needs to set up routing when using host ports. So I tested from within the cluster whether I was able to reach the game servers. Here are some results:
That means the game server hosted in a Windows container is working, but there are issues with the networking setup. After snooping around on the Windows VM, I found out that GKE uses
I tested a standard cluster (only Container-Optimized OS VMs) and it worked, so this points to a shortcoming in the Windows CNI implementation. To be continued ... |
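The in-cluster reachability check described above can be scripted. What follows is only a minimal sketch, not the commands actually used in this comment: `probe_udp` is a hypothetical helper, and any addresses passed to it are placeholders. Port 7654 is the containerPort from the Fleet above. It sends one UDP datagram and reports whether anything came back.

```python
import socket


def probe_udp(host: str, port: int, payload: bytes = b"ping", timeout: float = 2.0):
    """Send one UDP datagram to (host, port) and return the reply bytes.

    Returns None if nothing comes back before the timeout, or if the OS
    reports the port as unreachable. Hypothetical helper for illustration.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(payload, (host, port))
        try:
            reply, _addr = sock.recvfrom(4096)
        except (socket.timeout, ConnectionError):
            # No answer: the packet was dropped somewhere on the path,
            # or nothing is listening on that port.
            return None
        return reply
```

Running it from a pod inside the cluster against both the pod IP with the container port (for example `probe_udp("<pod-ip>", 7654)`) and the node IP with the allocated host port would show which of the two paths is broken: a reply on the first but `None` on the second points at the host port routing rather than at the game server itself.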
It would be great to be able to specify the SDK image via an annotation; that way you don't need to reinstall Agones and you can run different SDK versions. Nice experiment. |
On GKE (I think this is also true for other providers) you have to have at least one node pool running Linux VMs for system containers, even if you are running your workloads on Windows. So you could just run the Linux Agones controllers next to the other Linux Kubernetes controllers. edit: I just re-read this and realized you are talking about the sidecar..... For that, I think we need to add some flexibility into Agones so that it can launch both Windows and Linux sidecars based on some sort of tag on the gameserver / fleet. |
I had been wondering about hostPort support. I read through the GKE and Kubernetes docs and didn't see a clear answer as to whether (or how well) they would work. Maybe someone on the GKE windows or GKE networking team can help. We will reach out internally. |
Hello, |
I ran almost all commands present in https://github.com/microsoft/SDN/tree/master/Kubernetes/windows/debug/ scripts |
Same issue without agones, unable to connect to IIS container launched with the following command: |
Started a thread as well in #sig-windows on K8s slack: |
Based on the conversation in the #sig-windows thread, hostPort is not currently supported. I've filed/updated two tickets (see links above) and been told to check back in a week, to see if they can find some resources to work on hostPort support:
|
Possible good news!
has been added to the CNI config. |
Looks like we do have it enabled on GCE: But the CNI version is not yet up to 0.8.6 (which came out 23 days ago, so no surprise) |
Hello Mark,
Thank you |
Hello again. Thank you again, Mark, for pointing to the new CNI plugin portMappings attributes.
Note: The windows node is registered with its flannel IP instead of its primary IP.
Test from an Alpine VM on the same subnet:
I built the following images on my windows node:
|
If the Agones SDK sidecar container image is pushed to different tags for different platforms/architectures, then a Manifest List can be pushed that references them, and the container runtime will pull the right image for its platform. Then |
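To illustrate how a manifest list lets one image name serve several platforms, here is a simplified model of the selection step. It is a sketch only: the digests are made up, `select_manifest` is a hypothetical function, and real container runtimes apply more detailed matching rules than this. The `os`, `architecture` and `os.version` fields are the platform fields a manifest list entry carries.

```python
from typing import Optional

# Simplified stand-in for a manifest list. Digests are placeholders.
MANIFEST_LIST = [
    {"digest": "sha256:aaa",
     "platform": {"os": "linux", "architecture": "amd64"}},
    {"digest": "sha256:bbb",
     "platform": {"os": "windows", "architecture": "amd64",
                  "os.version": "10.0.17763.1697"}},
    {"digest": "sha256:ccc",
     "platform": {"os": "windows", "architecture": "amd64",
                  "os.version": "10.0.18363.1316"}},
]


def select_manifest(manifests, os_name: str, arch: str,
                    os_build: Optional[str] = None) -> Optional[str]:
    """Return the digest of the first entry matching the node's platform.

    os_build is a prefix such as "10.0.17763"; when given, the entry's
    os.version must start with it. Returns None if nothing matches.
    """
    for entry in manifests:
        platform = entry["platform"]
        if platform["os"] != os_name or platform["architecture"] != arch:
            continue
        if os_build is not None:
            version = platform.get("os.version", "")
            if not version.startswith(os_build + "."):
                continue
        return entry["digest"]
    return None
```

In this model a Windows node on build 10.0.17763 gets `sha256:bbb`, a Linux amd64 node gets `sha256:aaa`, and a Windows node on a build with no matching entry gets `None`, which corresponds to an image pull failure on that node.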
#1894 does the manifest magic already; it just needs to be turned on in CI. At the time I cautioned against enabling it by default because it significantly increases the build time and I didn't want to complicate an Agones release. Since there's been a release cut after the PR went in, it's appropriate to turn on
Beyond that, like @josephbmanley said, all you pretty much need is some changes to Terraform to provision a Windows Node Pool via a conditional, and the Helm templates need to be updated to have a parameter that basically inserts the nodeSelector for Windows. /cc @markmandel
There's 1 big caveat in that only Windows LTSC 2019 is supported (default for Kubernetes). Using it on 2004 and 20H2 will break in weird ways (this is likely going to be the version your dev box is on unless you run with Windows Server). Agones will need to update the CI image to install Docker Client 20.10 which comes with the os.version fix to |
Thanks for digging into this work!
I think we have this already? See this documentation. Will that suffice?
I don't think this is necessary (I'm fairly sure this is covered above), as with the way the manifest/registry operates, it can select the appropriate OS image as needed without having to be specific about it. One thing to also add to the list - make sure the windows images are built and pushed as part of the release: We should probably also add a section for documentation as well - likely a "Windows containers" page would be useful - maybe under
How significantly are we talking here? minutes, hours, days? 😄
We use whatever version comes with Cloud Build, also whatever we have on our work machines - sounds like we'll need to wait on supporting 2004 and 20H2 until the newer Docker version propagates out to those systems (if they haven't already). Does that all make sense? |
I'm not sure what your link is referring to (the
in the Users can also do this themselves using Helm values, but if the container os/arch is known statically, it seems nicer to make it part of the chart, than make it part of the config where it might be forgotten or overwritten. Another option, the "legacy" option from the Kubernetes docs is to taint all your Windows nodes, and include tolerations on the Fleet's PodTemplates. |
Yeah exactly, the section that reads:
But I didn't realise you could do:
Which makes total sense - and should be backward compatible 👍 Do we need to do something similar for the GameServer Windows images? I assume we do? |
Ideally, Sadly, k8s has been practically mono-os and mono-arch long enough that most people and Helm charts aren't including appropriate
For illustration, once
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          # Option 1: Linux && AMD64
          - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values: [ linux ]
              - key: kubernetes.io/arch
                operator: In
                values: [ amd64 ]
          # Option 2: Windows && AMD64 && [ ltsc2019 ]
          - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values: [ windows ]
              - key: kubernetes.io/arch
                operator: In
                values: [ amd64 ]
              - key: node.kubernetes.io/windows-build
                operator: In
                values: [ "10.0.17763" ]
I expect that the above would never actually be used as-is. The main container in the
I think in the end it'd be easier on everyone to simply build a wider variety of sdk-server image arch/os/version targets, so that GameServers are always constrained by the user, not the sdk-server container availability. ^_^ |
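The two options in the affinity example combine as "any term may match, and within a term every expression must match". A small sketch of that evaluation, using hypothetical helper names and covering only the `In`, `NotIn`, `Exists` and `DoesNotExist` operators (`Gt` and `Lt` are left out), looks like this:

```python
def expression_matches(labels: dict, expr: dict) -> bool:
    """Evaluate one matchExpressions entry against a node's labels."""
    key, operator = expr["key"], expr["operator"]
    values = expr.get("values", [])
    if operator == "In":
        return labels.get(key) in values
    if operator == "NotIn":
        # A node without the label also satisfies NotIn.
        return labels.get(key) not in values
    if operator == "Exists":
        return key in labels
    if operator == "DoesNotExist":
        return key not in labels
    raise ValueError(f"unsupported operator: {operator}")


def node_matches(labels: dict, terms: list) -> bool:
    """nodeSelectorTerms are ORed; expressions inside a term are ANDed."""
    return any(
        all(expression_matches(labels, expr) for expr in term["matchExpressions"])
        for term in terms
    )


# The two options from the affinity example, as plain data.
TERMS = [
    {"matchExpressions": [
        {"key": "kubernetes.io/os", "operator": "In", "values": ["linux"]},
        {"key": "kubernetes.io/arch", "operator": "In", "values": ["amd64"]},
    ]},
    {"matchExpressions": [
        {"key": "kubernetes.io/os", "operator": "In", "values": ["windows"]},
        {"key": "kubernetes.io/arch", "operator": "In", "values": ["amd64"]},
        {"key": "node.kubernetes.io/windows-build", "operator": "In",
         "values": ["10.0.17763"]},
    ]},
]
```

With these terms, a Linux amd64 node and a Windows amd64 node on build 10.0.17763 are both eligible, while a Windows node on a different build or a Linux arm64 node is not.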
If it's true that it already selects the proper image based on the manifest, the default tags should be documented, as I had to specify the tag manually when using my own registry.
Azure also only supports the current version as well. It seems all cloud providers are a bit behind on this. |
Sounds like what we're saying here is: put OS-specific constraints on the Linux bits, and leave it as an exercise for the user how to restrict game servers to their appropriate OS (in this case Windows), basically so they can just do what they want (although this sounds like something we should definitely document) and we don't get in their way. Makes sense to me if that's the general consensus. This multi-arch stuff makes my head hurt 🤕 |
The way the Docker Client 20.10 is compatible with Docker Server 19.03. I've verified I can build like this in a different project.
+10-30 minutes based on machine type, hdd vs ssd, of the machine. If you parallelize you'll need to create a separate buildx context for each thread. If there's only a sidecar I don't think there's any specific changes necessary. It'll run on Windows and Linux because the manifest makes that work. The pod container spec simply needs the Lastly, Windows requires at least 1 Linux node, https://kubernetes.io/docs/setup/production-environment/windows/intro-windows-in-kubernetes/#windows-containers-in-kubernetes. @TBBle @josephbmanley What issues are you seeing in Windows? Beyond opting into running a Windows container on Windows you shouldn't need to use node selectors. Windows nodes have a taint applied to them so default pod specs shouldn't schedule on those nodes. Has anyone tried to deploy the Xonotic example running on Windows in a cluster? The Xonotic example requires a Windows Server 2019 machine to build. It's possible to remove this dependency but it's intentional to have it like so since that's a more likely path Windows customers would use. |
The issues I'm referring to (pods being assigned to Windows nodes that don't have Windows container images available) are prevented by the example taint in the docs, but I consider tainting Windows nodes to be a short-term workaround until all the things running on the cluster have appropriate nodeSelectors to ensure their container image manifests match the node arch/os/buildver. Otherwise we'll go through this all again as ARM CPUs become more popular in k8s deployments. I do kind-of wish k8s supported looking at the containers in a Pod and extracting their OS requirements from the manifests/manifest lists, but I can't see how that would ever make sense given the k8s system architecture. |
Quick thought I had on e2e testing: Figured we could add some windows nodes to our e2e cluster, and then adjust some of our e2e tests to run both on Linux and Windows, by running the same test just with the windows node selector in the Pod template for the windows test. |
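One way to picture this is a helper that derives a per-OS variant of the same test GameServer by adding only the node selector. This is a rough sketch, not the Agones e2e framework: `with_node_os` and `e2e_variants` are hypothetical names, and the image is a placeholder assumed to be published as a manifest list covering both platforms.

```python
import copy

# Minimal GameServer used as the shared test fixture. The image name is a
# placeholder and is assumed to resolve on both Linux and Windows nodes.
BASE_GAMESERVER = {
    "apiVersion": "agones.dev/v1",
    "kind": "GameServer",
    "metadata": {"generateName": "e2e-"},
    "spec": {
        "ports": [{"name": "default", "containerPort": 7654}],
        "template": {
            "spec": {
                "containers": [
                    {"name": "game-server",
                     "image": "example.com/simple-game-server:latest"},
                ],
            },
        },
    },
}


def with_node_os(gameserver: dict, os_name: str) -> dict:
    """Return a copy of the GameServer pinned to nodes running os_name."""
    variant = copy.deepcopy(gameserver)
    pod_spec = variant["spec"]["template"]["spec"]
    pod_spec.setdefault("nodeSelector", {})["kubernetes.io/os"] = os_name
    return variant


def e2e_variants(gameserver: dict, os_names=("linux", "windows")) -> dict:
    """Build one GameServer per OS so the same test body can run on each."""
    return {os_name: with_node_os(gameserver, os_name) for os_name in os_names}
```

A test would then loop over `e2e_variants(BASE_GAMESERVER).items()` and run the identical assertions against each variant, so the only difference between the Linux and Windows runs is the `kubernetes.io/os` node selector.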
one last request - publish |
If you pull |
@roberthbailey I see that this works for simple-gs, but not for xonotic. I see xonotic has Windows support, but pulling it directly or with the windows suffix fails on a Windows pod. I want xonotic as a test tool for our gs programmers and me, so we can tell whether issues are in our gs or in the Agones setup I did. |
Looking at https://console.cloud.google.com/gcr/images/agones-images/GLOBAL/xonotic-example?gcrImageListsize=30, it doesn't look like a new Xonotic manifest has been pushed since early 2020, and the Windows support was added in December 2020 in #1894. So I don't think a Windows build of Xonotic has been pushed to the Agones repo; you probably have to build it yourself. The xonotic Makefile used by the cloudbuild scripts doesn't build a Windows image, so even if it had been built since Windows support was added, it wouldn't have been pushed. You can contrast this with simple-game-server's Makefile to see how much is missing, both building Windows images, and pushing the manifest list that lets you use one repository name from multiple platforms. #1894 (comment) suggests that the xonotic Windows image isn't cross-buildable from Linux, which might have been (or still be) a blocker for Agones's build pipeline. I'm not sure why else it didn't get the Windows support added to its Makefile when it was added to the Dockerfile. So either an oversight, or a difficult problem to solve, I guess. |
Since I was looking at the simple-game-server's Makefile, this TODO (and a couple of others in this file) are now completable, as Docker 20.10 has been available on Cloud Build since May 2021. I guess those TODOs might have been replicated in other Makefiles, but haven't looked. |
@jeremyje - do you have any cycles to look into making our xonotic example run on windows servers? |
@TBBle Yes, GCB has supported Docker 20.10 for a while. I'd strongly recommend upgrading to it because 19.03 is going end of life later this month, https://endoflife.software/applications/virtualization/docker-daemon. @roberthbailey I unfortunately do not have cycles to work on this. I looked at it briefly; here's the state. The |
'This issue is marked as Stale due to inactivity for more than 30 days. To avoid being marked as 'stale' please add 'awaiting-maintainer' label or add a comment. Thank you for your contributions ' |
@zmerlynn , @gongmax 🤔 I think the only thing left here is to set up some automated testing of a (?) Windows cluster, probably on at least a single version of supported GKE? Looks like there is some cleanup to do here as well with the build system, and hopefully there isn't much else? But I also say that as someone who isn't familiar with the Windows container ecosystem either. Marking as |
Game production usually targets Windows first and is then ported to Linux towards the end of development. Having Windows support would help the adoption rate.
What does it take to run a Windows game server?
Testing will be difficult, as Windows support has been in beta since k8s 1.5, but it has apparently greatly improved in 1.9.
Documentation :
https://kubernetes.io/docs/getting-started-guides/windows/
http://blog.kubernetes.io/2017/09/windows-networking-at-parity-with-linux.html
https://github.com/kubernetes/community/tree/master/sig-windows
https://docs.microsoft.com/en-us/windows-server/get-started/whats-new-in-windows-server-1709