Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug 1703885: test/extended/router: Fix curl timeout, log dumping, template parameters, and router readiness probes #23614

Conversation

Miciah
Copy link
Contributor

@Miciah Miciah commented Aug 15, 2019

test/extended/router: Fix curl timeout logic

Fix waitForRouteToRespond, waitForRouterOKResponseExec, and expectRouteStatusCodeRepeatedExec to honor their timeout parameter values and to specify a timeout for each curl invocation.

Before this change, each function used its timeout parameter value as the number of times to retry the curl command and did not specify a timeout value to that command. As a result, each command invocation could keep trying indefinitely, and the function could keep retrying the command for far longer than the timeout period that the caller indicated.

Follow-up to #12346 and #19073.

  • test/extended/router/config_manager.go (waitForRouteToRespond):
  • test/extended/router/scoped.go (waitForRouterOKResponseExec)
    (expectRouteStatusCodeRepeatedExec): Keep retrying until the amount of time that is indicated by the timeout value has elapsed. Specify a timeout value of 5 seconds to curl using the -m option.

test/extended/router/unprivileged: Fix log dump

Fix the prefix for the pod name used in the unprivileged router test so that the pod's logs will be dumped on test failures.

Follow-up to #21557.

  • test/extended/router/unprivileged.go: Call DumpPodLogsStartingWith with the "router-" prefix instead of the "unprivileged-router" prefix.

test/extended/router: Delete SCOPE template param

Delete the SCOPE template parameter in the various router template fixtures, and add ROUTER_NAME (default value "test-scoped") and UPDATE_STATUS (default value "true") parameters to the router-scoped.yaml template fixture.

Most tests were not using the SCOPE parameter, and in the one test that did, its use evoked the following warning from oc new-app:

warning: --param no longer accepts comma-separated lists of values.
  • test/extended/router/unprivileged.go:
  • test/extended/testdata/router/router-scoped.yaml: Replace the SCOPE template parameter with ROUTER_NAME and UPDATE_STATUS parameters.
  • test/extended/testdata/router/router-override-domains.yaml:
  • test/extended/testdata/router/router-override.yaml: Delete the SCOPE parameter.
  • test/extended/testdata/bindata.go: Regenerate.

test/extended/router: Enable metrics, fix readiness

Fix the router template test fixtures that use the router's /healthz/ready endpoint for their readiness probes to enable HAProxy metrics, which are required for the endpoint to return OK.

Without HAProxy metrics enabled, the router pods that tests created using these fixtures never became ready.

Follow-up to #21592.

  • test/extended/testdata/router/router-override-domains.yaml:
  • test/extended/testdata/router/router-override.yaml:
  • test/extended/testdata/router/router-scoped.yaml: Enable HAProxy metrics.
  • test/extended/testdata/bindata.go: Regenerate.

Miciah added 4 commits August 15, 2019 16:10
Fix waitForRouteToRespond, waitForRouterOKResponseExec, and
expectRouteStatusCodeRepeatedExec to honor their timeout parameter values
and to specify a timeout for each curl invocation.

Before this change, each function used its timeout parameter value as the
number of times to retry the curl command and did not specify a timeout
value to that command.  As a result, each command invocation could keep
trying indefinitely, and the function could keep retrying the command for
far longer than the timeout period that the caller indicated.

Follow-up to commit 6593d0c
and commit 9ddee4f.

* test/extended/router/config_manager.go (waitForRouteToRespond):
* test/extended/router/scoped.go (waitForRouterOKResponseExec)
(expectRouteStatusCodeRepeatedExec): Keep retrying until the amount of time
that is indicated by the timeout value has elapsed.  Specify a timeout
value of 5 seconds to curl using the "-m" option.
Fix the prefix for the pod name used in the unprivileged router test so
that the pod's logs will be dumped on test failures.

Follow-up to commit 409ac8b.

* test/extended/router/unprivileged.go: Call DumpPodLogsStartingWith with
the "router-" prefix instead of the "unprivileged-router" prefix.
Delete the SCOPE template parameter in the various router template
fixtures, and add ROUTER_NAME (default value "test-scoped") and
UPDATE_STATUS (default value "true") parameters to the router-scoped.yaml
template fixture.

Most tests were not using the SCOPE parameter, and in the one test that
did, its use evoked the following warning from oc new-app:

    warning: --param no longer accepts comma-separated lists of values.

* test/extended/router/unprivileged.go:
* test/extended/testdata/router/router-scoped.yaml: Replace the SCOPE
template parameter with ROUTER_NAME and UPDATE_STATUS parameters.
* test/extended/testdata/router/router-override-domains.yaml:
* test/extended/testdata/router/router-override.yaml: Delete the SCOPE
parameter.
* test/extended/testdata/bindata.go: Regenerate.
Fix the router template test fixtures that use the router's /healthz/ready
endpoint for their readiness probes to enable HAProxy metrics, which are
required for the endpoint to return OK.

Without HAProxy metrics enabled, the router pods that tests created using
these fixtures never became ready.

Follow-up to commit b343fb7.

* test/extended/testdata/router/router-override-domains.yaml:
* test/extended/testdata/router/router-override.yaml:
* test/extended/testdata/router/router-scoped.yaml: Enable HAProxy metrics.
* test/extended/testdata/bindata.go: Regenerate.
@openshift-ci-robot
Copy link

@Miciah: This pull request references a valid Bugzilla bug. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

Bug 1703885: test//extended/router: Fix curl timeout, log dumping, template parameters, and router readiness probes

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 15, 2019
@Miciah
Copy link
Contributor Author

Miciah commented Aug 15, 2019

[Suite:openshift/test-cmd][Serial][Disruptive] test-cmd: test/cmd/policy-storage-admin.sh [Suite:openshift] expand_less	18s
fail [github.com/openshift/origin/test/extended/cmd/cmd.go:117]: Expected
    <[]error | len:1, cap:1>: [
        {
            s: "error waiting for the pod 'test-cmd' to complete: the pod errored trying to run the command",
        },
    ]
to have length 0

/test e2e-cmd

@Miciah Miciah changed the title Bug 1703885: test//extended/router: Fix curl timeout, log dumping, template parameters, and router readiness probes Bug 1703885: test/extended/router: Fix curl timeout, log dumping, template parameters, and router readiness probes Aug 16, 2019
@ironcladlou
Copy link
Contributor

/approve
/lgtm

Thanks, this looks helpful

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Aug 16, 2019
@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ironcladlou, Miciah

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 16, 2019
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

1 similar comment
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@Miciah
Copy link
Contributor Author

Miciah commented Aug 16, 2019

I'm hoping #23622 will fix one of the test failures.

@openshift-merge-robot openshift-merge-robot merged commit 5f82df0 into openshift:master Aug 16, 2019
@openshift-ci-robot
Copy link

@Miciah: All pull requests linked via external trackers have merged. The Bugzilla bug has been moved to the MODIFIED state.

In response to this:

Bug 1703885: test/extended/router: Fix curl timeout, log dumping, template parameters, and router readiness probes

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot
Copy link

@Miciah: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
ci/prow/e2e-cmd 01d02db link /test e2e-cmd

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@Miciah
Copy link
Contributor Author

Miciah commented Feb 12, 2020

/cherrypick release-4.1

@openshift-cherrypick-robot

@Miciah: #23614 failed to apply on top of branch "release-4.1":

Applying: test/extended/router/unprivileged: Fix log dump
Applying: test/extended/router: Delete SCOPE template param
error: Failed to merge in the changes.
Using index info to reconstruct a base tree...
M	test/extended/router/unprivileged.go
M	test/extended/testdata/bindata.go
A	test/extended/testdata/router/router-override-domains.yaml
A	test/extended/testdata/router/router-override.yaml
A	test/extended/testdata/router/router-scoped.yaml
Falling back to patching base and 3-way merge...
Auto-merging test/extended/testdata/router-scoped.yaml
Auto-merging test/extended/testdata/router-override.yaml
Auto-merging test/extended/testdata/router-override-domains.yaml
Auto-merging test/extended/testdata/bindata.go
CONFLICT (content): Merge conflict in test/extended/testdata/bindata.go
Auto-merging test/extended/router/unprivileged.go
Patch failed at 0003 test/extended/router: Delete SCOPE template param

In response to this:

/cherrypick release-4.1

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ironcladlou
Copy link
Contributor

Since it doesn't apply cleanly and since 4.1 EOL is approaching, it would be nice to understand 4.1's CI status going forward to help decide whether this is worth manually backporting...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants