Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Stages Documentation #1080

Merged
merged 2 commits into from
Dec 13, 2019

Conversation

markmandel
Copy link
Member

@markmandel markmandel commented Sep 26, 2019

Feedback on below very much warranted and appreciated - from where it sits in the documentation, to the contents itself.


Definitions of feature stages, how they are enabled/disabled, the support to be expected at each stage, and how each stage can be identified.

Updates to the contributing guide as well, as this needs to be kept in mind for new features and design work.

Credit / Reference:
https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/#feature-stages
https://istio.io/about/feature-stages/#feature-phase-definitions

@markmandel markmandel added kind/feature New features for Agones kind/documentation Documentation for Agones area/meta Organisational matters. e.g. Governance, release cycles, etc. labels Sep 26, 2019
@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: 3186bdc2-d8ba-4a6e-98d3-d3644951d968

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/GoogleCloudPlatform/agones.git pull/1080/head:pr_1080 && git checkout pr_1080
  • helm install install/helm/agones --namespace agones-system --name agones --set agones.image.tag=1.1.0-a96d2ff

Copy link
Contributor

@pooneh-m pooneh-m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good. Thanks for putting it together. I just have a concern about versioning the SDK.

site/content/en/docs/Guides/feature-stages.md Outdated Show resolved Hide resolved
site/content/en/docs/Guides/feature-stages.md Outdated Show resolved Hide resolved
site/content/en/docs/Guides/feature-stages.md Outdated Show resolved Hide resolved
Copy link
Member

@roberthbailey roberthbailey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kubernetes lists the specific features inline (on the page you linked to). Do you think it would be better to do that than just say "look through the helm configuration" or "read the feature gate documentation for the feature you care about"? Having a central place to see the various feature gates seems like it would be really nice.

CONTRIBUTING.md Show resolved Hide resolved
site/content/en/docs/Guides/feature-stages.md Show resolved Hide resolved
site/content/en/docs/Guides/feature-stages.md Show resolved Hide resolved
site/content/en/docs/Guides/feature-stages.md Show resolved Hide resolved
@roberthbailey
Copy link
Member

/assign

Copy link
Member Author

@markmandel markmandel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kubernetes lists the specific features inline (on the page you linked to). Do you think it would be better to do that than just say "look through the helm configuration" or "read the feature gate documentation for the feature you care about"? Having a central place to see the various feature gates seems like it would be really nice.

Yeah, I agree. I think I was just hesitant to maintain it. My thought was once something goes to stable, the gate gets removed. Whereas Kubernetes maintains the feature gate. Do you think we should do the same?

I'm updating the doc to include a section for the list of feature gates.

@markmandel markmandel force-pushed the meta/feature-stages branch from a96d2ff to 6ff2597 Compare October 1, 2019 23:47
@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: 303fd64e-17af-47e1-b027-f58f6ad53e11

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/GoogleCloudPlatform/agones.git pull/1080/head:pr_1080 && git checkout pr_1080
  • helm install install/helm/agones --namespace agones-system --name agones --set agones.image.tag=1.1.0-6ff2597

@roberthbailey
Copy link
Member

I'm happy removing the gates (and their documentation) once they hit stable. I think having a list of things you can try right now makes sense and you should be able to tell which feature gates apply to your version of Agones using the drop down on the website that takes you to the version specific docs.

Copy link
Member Author

@markmandel markmandel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made some updates. Also moving this out of draft, since I'm not sure what benefit that gives.

site/content/en/docs/Guides/feature-stages.md Outdated Show resolved Hide resolved
site/content/en/docs/Guides/feature-stages.md Outdated Show resolved Hide resolved
site/content/en/docs/Guides/feature-stages.md Outdated Show resolved Hide resolved
@markmandel markmandel marked this pull request as ready for review October 24, 2019 00:34
@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: 1e4de8ec-6777-4036-a28b-b6cafdf58198

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: e2b00070-9420-4208-9c0d-a6b1a19c4ad9

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/GoogleCloudPlatform/agones.git pull/1080/head:pr_1080 && git checkout pr_1080
  • helm install install/helm/agones --namespace agones-system --name agones --set agones.image.tag=1.1.0-3da76b4

@markmandel
Copy link
Member Author

@roberthbailey @pooneh-m I think this is pretty close - I updated the gRPC section with a resolution from the research in #1043. PTAL, let me know what you think.

@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: 34417e1f-02b1-4f53-bec9-901790c92497

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/GoogleCloudPlatform/agones.git pull/1080/head:pr_1080 && git checkout pr_1080
  • helm install install/helm/agones --namespace agones-system --name agones --set agones.image.tag=1.2.0-3a537cb

For example, a REST API with a prefix of `v1alpha1` is an `alpha` stage feature:
`http://api.example.com/v1alpha1/exampleaction`.

Similar to the SDK, Any `alpha` or `beta` gRPC functionality will be a subpackage of the main API package.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about if there is a breaking change to the v1 API?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good question. Ideally, we never want this to happen - since the SDK is so important -- but I may be being optimistic that it won't ever happen - or at least to the degree that we want a whole new implementation of the SDK.

I'm almost tempted to say "handle it if/when we need to?, and hopefully we won't have to?"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait, wrong section - this isn't the SDK section, this is the API section.

So my only concern here, is that if we do a v1/v2 type API surface, it's completely different from the SDK. Maybe that is fine though? I'd like to have consistency however.

WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optimistically, I also expect major versioning would not happen However, it wouldn't hurt to mention that if there is any required incompatible change to the stable version there should be an extensive communication and planning.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds like a great addition as well 👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How strongly do you feed about having a v1/v2 interface/package for the gRPC/REST endpoints.

I'm thinking I might be backtracking on my position of not doing this - just because it's going to be really hard to go back and slot this in if we need it (SDK comes to mind).

I'm not a huge fan of packages called v1 (hello K8s), but I can't think of a better way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Version in the package name for gRPC for v1 is optional.
For REST, I have always seen APIs to be versioned, but couldn't find a recommendation specific to v1 versioning. I would recommend designing REST API with version in the path to avoid URL changes between versions.

site/content/en/docs/Guides/feature-stages.md Outdated Show resolved Hide resolved
For example, a REST API with a prefix of `v1alpha1` is an `alpha` stage feature:
`http://api.example.com/v1alpha1/exampleaction`.

Similar to the SDK, Any `alpha` or `beta` gRPC functionality will be a subpackage of the main API package.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should perhaps mention that alpha/beta features are a super-set of the main package?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you are saying all stable features should be available in the alpha/beta sdk subpackage?

So I should be able to do sdk.alphav1.Ready()?

If so, I'm not sure what the benefit would be?

Copy link
Contributor

@pooneh-m pooneh-m Nov 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you are saying all stable features should be available in the alpha/beta sdk subpackage?

I think now I understand why you wanted to run two SDK packages side by side because one supports a none stable API and the other supports a stable API. If this is what you are recommending, please make it clear. Otherwise, I assumed an alpha API would be a super set of the stable API and there is only one SDK running. For example, if the SDK is going to support an alpha feature SelfNominate(), I thought I create a new snapshot of the SDK and version it as sdk.v1alpha1 which is a super set of sdk[.v1] package and expose it for experimentation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aaah! Yes! That is the disconnect. Yes, that is exactly the way I figured it would work, but I wasn't clear. Let me write words to that effect (it sounds like it will be similar to what we have in CRDs as well).

We can also do something like keep alpha/beta SDKs in place for a period after they move to stable, so that people have time to move over as well 👍

@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: 1499d2dd-6b1e-4416-b7cb-18ab1750e6cf

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/GoogleCloudPlatform/agones.git pull/1080/head:pr_1080 && git checkout pr_1080
  • helm install install/helm/agones --namespace agones-system --name agones --set agones.image.tag=1.2.0-41d4862

@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: 16d0f70d-61f3-488f-9eea-2e35a101735b

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/GoogleCloudPlatform/agones.git pull/1080/head:pr_1080 && git checkout pr_1080
  • helm install install/helm/agones --namespace agones-system --name agones --set agones.image.tag=1.2.0-9ecbecb

@@ -48,6 +48,16 @@ When submitting pull requests, make sure to do the following:
- Remove trailing whitespace. Many editors will do this automatically.
- Ensure any new files have [a trailing newline](https://stackoverflow.com/questions/5813311/no-newline-at-end-of-file)

## Feature Stages
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would a breaking change in the controller look like?

* The API may change in incompatible ways in a later software release without notice.
* Recommended for use only in short-lived testing clusters, due to increased risk of bugs and lack of long-term support.

> Note: Please do try `Alpha` features and give feedback on them. This is important to ensure less breaking changes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (and below) should be switched to the new note style.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. Good catch.


As SDK features move to through feature stages towards `stable`, the previous version of the SDK API
will remain for at least one release to enable easy migration to the more stable feature stage (i.e. from `alpha
` -> `beta`, `beta` -> `alpha`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this supposed to be beta -> stable?

Any `alpha` or `beta` Game Server SDK functionality will be a subpackage of the `sdk` package. For example
, functionality found in a `sdk.alphav1` package should be considered at the `alpha` feature stage.

Only experimental functionality will be found in any `alpha` and `beta` SDK packages, and as such may change as
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implies that you would import both the alpha and stable SDKs to start using the alpha sdk functions. I'm not actually sure how that would work in practice (do you have to call Connect twice?). Does the alpha package piggyback on the stable one somehow?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you would need to explicitly import the alpha package to use it. But I feel like from an implementation detail we would hide the gRPC details from the user.

From a gRPC technical details, I posted some details on how to do this technically on #1043

Aha, so I found the right stackoverflow post:
https://stackoverflow.com/questions/52634217/access-multiple-grpc-services-over-the-same-connection

This shows we can both register multiple services with the same endpoint, and also configure a singular connection with multiple clients - at least with Go.

Even if we end up having 2 connections to the same local endpoint, I don't think it's a drastic issue?

If we had alpha/betabe a superset ofstable`, would we demark a section of sdk.proto for alpha/beta functionality, and add/remove from there as needed? (and basically hide this from the user via our published SDK? This would have some complications for anyone generating a SDK, I would assume?)

Not to say I'm 100% against a superset, it was just my original viewpoint. I'm not completely married to the idea - maybe a superset is easier for us/the end user.


### REST & gRPC APIs

REST and gRPC API will have versioned paths were appropriate to indicate their feature stage.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/were/where

Definitions of feature stages, how they are enabled/disbaled, the
support to be expected at each stage, and how each stage can be
identified.

Updates to the contributing guide as well, as this needs to be kept in
mind for new features and design work.

Credit / Reference:
https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/#feature-stages
https://istio.io/about/feature-stages/#feature-phase-definitions
@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: 6ecd9b16-5d67-48cf-926b-13ce6b7a08bc

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: 532eed90-545e-42dc-b008-238d94c3e21d

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/GoogleCloudPlatform/agones.git pull/1080/head:pr_1080 && git checkout pr_1080
  • helm install install/helm/agones --namespace agones-system --name agones --set agones.image.tag=1.2.0-7f74cea

@markmandel
Copy link
Member Author

@pooneh-m @roberthbailey I'm almost wondering if we want to coordinate a video call and resolve this PR?

@pooneh-m
Copy link
Contributor

pooneh-m commented Nov 7, 2019

@pooneh-m @roberthbailey I'm almost wondering if we want to coordinate a video call and resolve this PR?

It seems like the outstanding issue is the gRPC versioning. Should we consult with gRPC experts then to help making the decision? Other than that, LGTM.

@markmandel
Copy link
Member Author

We have an internal gRPC office hours, I'm going to go tomorrow morning, and will get a resolution.

@markmandel
Copy link
Member Author

Had a chat with the gRPC team:

  1. Confirmed: On the server side, yes, you can register multiple proto defined services through a single gRPC ip and port (multiple registrations on a single server)
  2. Confirmed: Client side, you can clients from multiple protos reuse the connection to that above single server from the same multiple protos. Some languages will share "channels", in others, if the client works out you are writing code to connect to the exact same place that the client already has a connection to - it will automatically reuse the TCP connection.

So I think we are good to go on the currently documented approach?

@markmandel
Copy link
Member Author

Anything outstanding on this feature stage doc?

Copy link
Contributor

@pooneh-m pooneh-m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for double checking with gRPC experts.

@google-oss-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: markmandel, pooneh-m

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [markmandel,pooneh-m]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-robot
Copy link

New changes are detected. LGTM label has been removed.

@markmandel
Copy link
Member Author

Oooh! Now we have to actually build the feature gate functionality 😄

@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: 9825ce46-ee39-4da1-99c3-6d16cf323ad9

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@markmandel
Copy link
Member Author

--- FAIL: TestMultiClusterAllocationOnLocalCluster (0.44s)
    --- FAIL: TestMultiClusterAllocationOnLocalCluster/Packed (44.51s)
        gameserverallocation_test.go:80: Namespace gsa-multicluster-local-8a7d0c9d-1d3d-11ea-b2b9-0242c0a80a03 is created
        gameserverallocation_test.go:80: ServiceAccount gsa-multicluster-local-8a7d0c9d-1d3d-11ea-b2b9-0242c0a80a03/agones-sdk is created
        gameserverallocation_test.go:80: RoleBinding gsa-multicluster-local-8a7d0c9d-1d3d-11ea-b2b9-0242c0a80a03/agones-sdk-access is created
        gameserverallocation_test.go:182: 
            	Error Trace:	gameserverallocation_test.go:182
            	Error:      	Expected nil, but got: &errors.StatusError{ErrStatus:v1.Status{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ListMeta:v1.ListMeta{SelfLink:"", ResourceVersion:"", Continue:""}, Status:"Failure", Message:"an error on the server (\"no multi-cluster allocation policy is specified\") has prevented the request from succeeding (post gameserverallocations.allocation.agones.dev)", Reason:"InternalError", Details:(*v1.StatusDetails)(0xc0008ac1e0), Code:500}}
            	Test:       	TestMultiClusterAllocationOnLocalCluster/Packed

That's a new one. @pooneh-m in case you know what's up here.

@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: 5a1d114a-cc82-4f26-8e5b-41050d2a05c0

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/GoogleCloudPlatform/agones.git pull/1080/head:pr_1080 && git checkout pr_1080
  • helm install ./install/helm/agones --namespace agones-system --name agones --set agones.image.tag=1.3.0-6b0a07c

@markmandel markmandel merged commit eac79fa into googleforgames:master Dec 13, 2019
@markmandel markmandel deleted the meta/feature-stages branch December 13, 2019 00:46
@pooneh-m
Copy link
Contributor

--- FAIL: TestMultiClusterAllocationOnLocalCluster (0.44s)
    --- FAIL: TestMultiClusterAllocationOnLocalCluster/Packed (44.51s)
        gameserverallocation_test.go:80: Namespace gsa-multicluster-local-8a7d0c9d-1d3d-11ea-b2b9-0242c0a80a03 is created
        gameserverallocation_test.go:80: ServiceAccount gsa-multicluster-local-8a7d0c9d-1d3d-11ea-b2b9-0242c0a80a03/agones-sdk is created
        gameserverallocation_test.go:80: RoleBinding gsa-multicluster-local-8a7d0c9d-1d3d-11ea-b2b9-0242c0a80a03/agones-sdk-access is created
        gameserverallocation_test.go:182: 
            	Error Trace:	gameserverallocation_test.go:182
            	Error:      	Expected nil, but got: &errors.StatusError{ErrStatus:v1.Status{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ListMeta:v1.ListMeta{SelfLink:"", ResourceVersion:"", Continue:""}, Status:"Failure", Message:"an error on the server (\"no multi-cluster allocation policy is specified\") has prevented the request from succeeding (post gameserverallocations.allocation.agones.dev)", Reason:"InternalError", Details:(*v1.StatusDetails)(0xc0008ac1e0), Code:500}}
            	Test:       	TestMultiClusterAllocationOnLocalCluster/Packed

That's a new one. @pooneh-m in case you know what's up here.

It's a flake. I opened #1114 while ago, but it didn't happen again. I'll reopen it to fix it. I think a retry will fix it.

@markmandel markmandel added this to the 1.3.0 milestone Jan 10, 2020
ilkercelikyilmaz pushed a commit to ilkercelikyilmaz/agones that referenced this pull request Oct 23, 2020
Definitions of feature stages, how they are enabled/disbaled, the
support to be expected at each stage, and how each stage can be
identified.

Updates to the contributing guide as well, as this needs to be kept in
mind for new features and design work.

Credit / Reference:
https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/#feature-stages
https://istio.io/about/feature-stages/#feature-phase-definitions
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved area/meta Organisational matters. e.g. Governance, release cycles, etc. kind/documentation Documentation for Agones kind/feature New features for Agones size/L
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants