
Starting test plan section of Windows KEP #685

Merged Jan 24, 2019 (17 commits)

Conversation

@PatrickLang (Contributor) commented Jan 12, 2019

I added the test plan into the outline and added the current tests that have been merged and are running in testgrid. This isn't a complete list (see the TODOs), but it's a start.

@astrieanna and @benmoss - any feedback?

Can you take a look @adelina-t and @BCLAU ? Do you think this format makes sense for the tests?

/sig windows

This is continuing to add content to the draft merged in #676

@k8s-ci-robot k8s-ci-robot added sig/windows Categorizes an issue or PR as relevant to SIG Windows. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jan 12, 2019
@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. approved Indicates a PR has been approved by an approver from all required OWNERS files. sig/pm size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 12, 2019
@astrieanna (Contributor):

@PatrickLang It seems like a very long list of tests. Is it useful to list all of them in the KEP when presumably tagging in the code will be the source of truth on that?

@PatrickLang (Contributor, Author):

@astrieanna yeah, I know the list is long but I was getting questions and feedback on "does X work" last time. Let me see if I can get someone from SIG-Testing to clarify what level of detail they'd like to see here.

@PatrickLang (Contributor, Author):

/milestone v1.14

@k8s-ci-robot k8s-ci-robot added this to the v1.14 milestone Jan 14, 2019
@benmoss (Member) commented Jan 14, 2019

@PatrickLang do we have any plan for keeping track of & making progress on non-conformance e2e tests? We were just looking at the tests for port-forwarding, and with our change some of them pass, and the images were already part of the set ported to Windows, but they aren't being run in Testgrid right now.

@jdumars (Member) commented Jan 14, 2019

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 14, 2019
@jdumars (Member) commented Jan 14, 2019

/lgtm
SIG stakeholders, feel free to remove hold at your discretion.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 14, 2019
@spiffxp (Member) commented Jan 14, 2019

So, I'm torn since I'm coming from multiple perspectives.

A) This does feel like a lot to enumerate and I’m not sure it’s sig-testing’s job to validate the exhaustive enumeration, but I suspect we have a say in whether the plan seems reasonable. It does seem reasonable to me.

B) As a consumer of this information from the Release Team, I think I'm just asking for a dashboard, or list of dashboards to check. So I would be fine if your test plan was just: "we're going to call this done when all tests at [sig-windows/the-blocking-tests] are green", or "[sig-windows/existing-tests], [sig-windows/substitute-tests], [sig-windows/windows-only-tests]" or something.

To me that then raises the question of whether getting to green by PR'ing failing tests out of those dashboards is OK, or whether the list of tests in those dashboards is OK.

C) As someone who watched SIG Arch flail during 1.13, I feel like there is a desire for exhaustive enumeration of some sort, to figure out whether these tests, or whether the grouping of tests is sufficient to meet user expectations for "how a kubernetes behaves". Reviewing that is something that, to me, seems like it's in SIG Arch's purview (I am super open to suggestions on which SIGs should be involved/federated to instead)

In the interest of an initial pass at this, I'd be in favor of A) & B) where you just describe the test plan and dashboards. I would run by sig-arch whether they need more, or would be comfortable using the dashboards and evaluating at some checkpoint.

@PatrickLang (Contributor, Author):

Thanks @spiffxp . I'll change it to a summary level of detail now for A/B, and if other reviewers want to review it test case by test case - git to the rescue!

@mattfarina (Contributor):

Something I'm considering is the perspective of those looking in. When we reviewed the tests in the SIG Arch conversation on Windows GA, we looked at a dashboard of failing tests. I would suggest having a dashboard of Windows tests, and that they be generally as green as those for Linux nodes.

I like the plan of updating test cases to handle multi-OS, having Windows-specific tests, and having substitute test cases. With my SIG Apps hat on, seeing the workloads APIs passing on Windows would be fantastic.

1. [Adapting](#Adapting-existing-tests) some of the existing conformance tests to be able to pass on multiple node OSes
2. Adding [substitute](#Substitute-test-cases) test cases where the first approach isn't feasible or would change the tests in a way that is not approved by the owner. These will be tagged with `[SIG-Windows]`
3. Last, gaps will be filled with [Windows specific tests](#Windows-specific-tests). These will also be tagged with `[SIG-Windows]`

Review comment (Contributor):

I think it is important to note that any test that is tagged with [SIG-Windows] should always pass on Windows. We should specify this in the text.

In addition, is it possible to tag the tests that are not applicable to Windows with a dedicated tag? That way it is easy to differentiate which of the existing conformance tests are not applicable to Windows (beyond the ones that were substituted).

@michmike (Contributor) left a review comment:

Approving this PR to get it merged with the rest of the KEP so that we can continue iterating on it.

@michmike (Contributor):

/approve
/lgtm

@michmike (Contributor):

> [Quotes @spiffxp's comment above in full.]

I think for this initial iteration of the KEP, we do need (a), (b), and (c) to satisfy curious minds. Ultimately, the dashboard with green checkboxes across all tests applicable to Windows is what most folks will be looking for, but the approvers of the KEP will likely need to dig in.

@PatrickLang (Contributor, Author):

@justaugustus removed KEP number & renamed

@PatrickLang (Contributor, Author):

@benmoss, @astrieanna, @michmike, @adelina-t or @yujuhong - can you review and LGTM? I'd like to get this next update merged. It has a list of the remaining test work that I'm aware of.

More updates coming:

  • @craiglpeters is working on a PR updating the release criteria and final approver list after some discussions this week
  • I'm going to do a separate PR bringing in an API support list


- [ ] [sig-network] [sig-windows] Networking Granular Checks: Pods should function for intra-pod communication: http - [PR#71468](https://github.com/kubernetes/kubernetes/pull/71468)
- [ ] [sig-network] [sig-windows] Networking Granular Checks: Pods should function for intra-pod communication: udp - [PR#71468](https://github.com/kubernetes/kubernetes/pull/71468)
- [ ] [sig-network] [sig-windows] Networking Granular Checks: Pods should function for node-pod communication: udp - [PR#71468](https://github.com/kubernetes/kubernetes/pull/71468)

Review comment:

For the sake of completeness, you should also include: [sig-network] [sig-windows] Networking Granular Checks: Pods should function for node-pod communication: http, same PR.

@PatrickLang (Contributor, Author) replied:

Oops, my mistake. Adding now.

@claudiubelu:

As far as the tests and the required patches on the testing side are concerned, other than the comment above, LGTM.

@adelina-t:

LGTM

@michmike (Contributor):

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 24, 2019
@k8s-ci-robot (Contributor):

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jdumars, michmike, PatrickLang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@michmike (Contributor):

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 24, 2019
@k8s-ci-robot k8s-ci-robot merged commit 77ae66a into kubernetes:master Jan 24, 2019

### Non-Goals

- Adding Windows support to all projects in the Kubernetes ecosystem (Cluster Lifecycle, etc)
Review comment (Contributor):

It's probably pretty obvious to some, but for clarity, I'd suggest adding

- Running Kubernetes master control-plane on Windows nodes.

Reply (Contributor):

PR #729 already had this change as "Enable the Kubernetes master components to run on Windows".


The testing for Windows nodes will include multiple approaches:

1. [Adapting](#Adapting-existing-tests) some of the existing conformance tests to be able to pass on multiple node OSes. Tests that won't work will be [excluded](https://github.com/kubernetes/test-infra/blob/master/config/jobs/kubernetes-sigs/sig-windows/sig-windows-config.yaml#L69).
Review comment (Contributor):

Can we create a file in "windows-testing" listing the excluded tests? The config is hard to parse, and the line number is likely to change unless you use a permanent link.

Reply (Member):

I believe those tests are all being tagged [LinuxOnly].
kubernetes/kubernetes#73204
kubernetes/community#3140
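The exclusion mechanism being discussed (a skip regex in the job config, moving toward a `[LinuxOnly]` tag on the tests themselves) can be sketched as follows. This is an illustrative, self-contained sketch, not the actual e2e framework code: the spec names and the `shouldRunOnWindows` helper are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// linuxOnly mirrors the idea of a --ginkgo.skip regex in the prow job
// config: any spec name tagged [LinuxOnly] is excluded from Windows runs.
var linuxOnly = regexp.MustCompile(`\[LinuxOnly\]`)

// shouldRunOnWindows is a hypothetical helper, not real framework code.
func shouldRunOnWindows(testName string) bool {
	return !linuxOnly.MatchString(testName)
}

func main() {
	// Hypothetical spec names in the style of the e2e suite.
	tests := []string{
		"[sig-network] Networking Granular Checks: Pods should function for intra-pod communication: http",
		"[sig-storage] Subpath should support existing directories [LinuxOnly]",
		"[sig-windows] DNS should support configurable pod DNS servers",
	}
	for _, name := range tests {
		if shouldRunOnWindows(name) {
			fmt.Println("RUN: ", name)
		} else {
			fmt.Println("SKIP:", name)
		}
	}
}
```

Tagging the specs rather than maintaining a list in the job config keeps the exclusion next to the test code, which is what the linked PRs move toward.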


#### Adapting existing tests

Over the course of v1.12/1.13, many conformance tests were adapted to be able to pass on either Linux or Windows nodes, as long as matching OS containers are run. This was done by creating Windows-equivalent containers from [kubernetes/test/images](https://github.com/kubernetes/kubernetes/tree/master/test/images). An additional parameter is needed for e2e.test/kubetest to point the container repos at ones containing Windows versions, since they're not part of the Kubernetes build process yet.
Review comment (Contributor):

We'd need an intermediate solution for managing images soon. IIUC, you can only substitute registry/repos for the test images. It'd be hard to keep the "version/tag" of the windows images in sync with the linux ones. Maybe we can extend the e2e framework to consume images (registry, repository, tag/version) directly from a yaml file? I'd suggest opening an issue in k8s.io/kubernetes and carry on the discussion over there. This KEP can link to that issue.
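The repo substitution described above can be sketched as a simple registry-prefix swap. This is a minimal illustration under stated assumptions: the `substituteRegistry` function and both registry names are hypothetical placeholders, not the actual e2e.test flag handling.

```go
package main

import (
	"fmt"
	"strings"
)

// substituteRegistry replaces the default test-image registry prefix with
// one hosting Windows-compatible versions of the same images. Images from
// other registries are left untouched.
func substituteRegistry(image, defaultReg, windowsReg string) string {
	if strings.HasPrefix(image, defaultReg+"/") {
		return windowsReg + "/" + strings.TrimPrefix(image, defaultReg+"/")
	}
	return image
}

func main() {
	img := "gcr.io/kubernetes-e2e-test-images/nettest:1.0"
	fmt.Println(substituteRegistry(img,
		"gcr.io/kubernetes-e2e-test-images", // assumed default registry
		"example.io/windows-e2e-images"))    // hypothetical Windows registry
}
```

Note that a prefix swap only changes where images are pulled from; keeping the tags/versions of the Windows images in sync with the Linux ones is the harder problem, which is why a yaml manifest of (registry, repository, tag) tuples is suggested above.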

@bgrant0607 (Member):

In the future, please do such file renames as separate commits or PRs. It's hard to see what changed through the UI.

Additional dashboard pages will be added over time as we run the same test cases with additional CRI, CNI and cloud providers. They reflect work that may be stabilized in v1.15 or later and is not strictly required for v1.14.

- Windows Server 2019 on GCP - this is [in progress](https://k8s-testgrid.appspot.com/google-windows#windows-prototype), but not required for v1.14
- Windows Server 2019 with OVN+OVS & Dockershim
Review comment (Member):

Are these additional dashboards? Are they all required other than GCP? What configuration will be shown on the main dashboard?

Reply (Contributor, Author):

Will clarify in the next update on #732


All of the test cases will be maintained within the kubernetes/kubernetes repo. SIG-Windows specific tests for approaches 2 and 3 will be in [test/e2e/windows](https://github.com/kubernetes/kubernetes/tree/master/test/e2e/windows)

Additional Windows test setup scripts, container image source code, and documentation will be kept in the [kubernetes-sigs/windows-testing](https://github.com/kubernetes-sigs/windows-testing) repo. One example: the prow jobs need a list of repos to use for the test containers, and that list will be maintained there - see [windows-testing#1](https://github.com/kubernetes-sigs/windows-testing/issues/1).
Reply (Member):

Right now they're in https://hub.docker.com/u/e2eteam but ideally we can consolidate these in the same registry as the other Kubernetes images

Reply (Contributor, Author):

These could be pushed to GCR if someone gives us the right access.

We can't build Windows containers in the main build today, but that's the eventual goal. Building these at the same cadence will require adding Windows nodes into the build pool, or moving all images to languages that can be cross-compiled and put into containers with only offline operations (no RUN commands in Dockerfile).
