-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ref(*): Refactor the whole thing #157
Conversation
I haven't not read the code yet but based on the description a few cert + domain tests were removed, I'm not very sure that is wise. @slack and I came up with 5 uses cases for when I implemented this in the Controller and the same use case were applied to the router. They are complicated but this is the only point in our testing cycle where we test all the complexities of wildcard and SAN and other fun fun things. |
The one issue I see (and I think you alluded to) is that these tests use |
-v ${HOME}/.kube:/root/.kube \ | ||
-v ${CURDIR}:${SRC_PATH} -w ${SRC_PATH} ${IMAGE} | ||
|
||
.PHONY: check-controller-url |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can all these at the bottom of the makefile:
.PHONY: check-controller-url dev-env ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be happy to move them. They are where they are now because that's where they were when I started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to the bottom.
@arschles @bacongobbler Since a you both have asked about it, I'll defend my rationale for use of the DEIS_CONTROLLER_URL variable over DEIS_ROUTER_SERVICE_HOST and DEIS_ROUTER_SERVICE_PORT. When the router's doing its job, it's transparent. It's awkward to ask for the host and port of the router (and expecting that the host is actually the subdomain for the controller). One would never tell a real user to carry out |
.PHONY: docker-build | ||
docker-build: | ||
docker build -t ${IMAGE} ${CURDIR} | ||
docker tag -f ${IMAGE} ${MUTABLE_IMAGE} | ||
docker tag ${IMAGE} ${MUTABLE_IMAGE} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needs that -f
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That -f
is deprecated... Docker warned me about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it'll be deprecated in future releases, but if you're using an older version of Docker (like 1.9.1), this will fail when running the tests a second time. Let's keep this in for now and we can come back to it in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok
@helgi we can put more cert tests back. What I was trying to express was that an arbitrarily complex test case like "test an app with two certs and three domains, where one domain is a wildcard and so is one cert" is an epic... it doesn't exercise any specific conditions that cannot be exercised by smaller, more focused specs. BDD is about story telling and a spec such as that is like Batman v Superman... too many superflous scenes that don't advance the story. I'd be very happy to work with you to come up with smaller, more focused specs that exercise any cert-related use cases that you feel are not being addressed. |
FYI, I've manually* kicked off the CI pipeline on this commit, so we should get e2e test job result back using the *manually as the commit status updates from our CI pipeline are currently 'under construction' |
@krancour if you want to go ahead with DEIS_CONTROLLER_URL then you're going to have to figure out a way to get this to work with the workflow-e2e chart, as with this PR it breaks because it cannot discover where the controller lives inside the cluster. |
@bacongobbler, I don't quite understand this:
Can you elaborate? |
@bacongobbler @krancour it appears communications are working fine with current commit changes: https://ci.deis.io/job/workflow-test-pr/469/console (aside from actual test error*, the tests are running properly) *test error alluded to at time of writing is known issue deis/controller#601 |
Because $DEIS_CONTROLLER_URL is not present in the pod, it will fail. |
@bacongobbler... we can fix that in the chart if need be. I thought you were trying to express that even after such a change there would be technical difficulties with the approach. fwiw, I have always run these remotely and have not tried running them from the chart yet. |
@vdice... do the Jenkins jobs deploy e2e tests to a cluster using the chart? Or do the jobs run the tests remotely against a cluster? |
@krancour the former; the job runs tests using the tests chart. |
@vdice, then I suspect complimentary chart changes actually are in order. |
@bacongobbler @vdice @arschles The concerns about finding the correct URL for the controller (which only happens to be a nip.io URL involving the router's service address) have been addressed by adding the following bit of code to the
The end result is that a user executing the tests remotely can specify the controller URL (which feels natural) via the I don't both checking the port because the router's service always listens on port 80. There is no scenario where it doesn't. Even when HTTPS is enforced, the router's service still forward traffic from 80 to 8080 within the container and a redirect to HTTPS is enforced there. Long story short-- 80 is always available. As @bacongobbler had pointed out, the chart wasn't calling That is also changed here in the Dockerfile:
And a complementary change in the chart: This setup has the added benefit that all details of how tests are executed are now consolidated in deis/workflow-e2e, which hopefully insulates the chart from any need for further changes if / as that changes. |
@krancour I still don't see the technical rationale for using a new env variable, regardless of where it points. The existing ones are provided by Kubernetes, which allows someone to run these tests in isolation, without a chart, as long as they're in the same namespace as a working deis install. The reasons you gave are philosophically valid, but (1) they suggest leaving out the router entirely, which would make these tests not end-to-end anymore (maybe I misunderstood you here) and (2) practically speaking, they will add another step (set the correct env variable) before we can run tests (2) is the technical reason for not making the env var name change. |
@arschles if you'll bear with me, I do think you're misunderstanding the change a bit. I'll try to respond to your concerns-- which I appreciate you raising.
This won't change, as I'll demonstrate.
Thank you. At the risk of being pedantic, I believe that how things are named is always of tremendous consequence. The first time I ran this test suite (remotely) and was asked to provide the router's IP and port, I couldn't at all understand how that information was relevant. The tests interact with the controller. The fact that traffic flows through the router on its way to the controller is largely inconsequential. A real world analogy: If you want to send me a letter, you use my address; not the address of the nearest post office.
I believe this is a misunderstanding. The router is not bypassed in these tests. We'll come back to this...
It doesn't. As of Friday afternoon, I amended this PR to include the following code in the Makefile:
If If Ultimately, the only variable that anyone executing tests remotely cares about is the only variable that the test suite itself cares about-- the one that makes sense-- Coming back to the following:
In the case of the user executing the test suite remotely, the router is the single point of ingress into their cluster. The value they would set for In the case of a user executing the test suite within a Kubernetes cluster, the automatic construction of the |
@krancour that clears things up completely, thanks! I'll test these locally now. |
@krancour I was able to run these tests against my personal testing cluster. Since it was a relatively smaller cluster, I had to run with only 2 nodes. Only 2 errors occurred, and as we talked about privately, they may be DNS related (possibly ENG000656:workflow-e2e aaronschlesinger$ GINKGO_NODES=2 make test-integration
ginkgo -slowSpecThreshold=120.00 -noisyPendings=false -nodes=2 tests/
Running Suite: Deis Workflow
============================
Random Seed: 1460404263
Will run 134 of 137 specs
Running in parallel across 2 nodes
••••••••••••••P
------------------------------
• [SLOW TEST:190.383 seconds]
deis limits
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/tests/limits_test.go:104
with an existing user
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/tests/limits_test.go:102
who owns an existing app that has already been deployed
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/tests/limits_test.go:100
that user can set a CPU limit on that application
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/vendor/github.com/onsi/ginkgo/ginkgo_dsl.go:354
------------------------------
$ deis auth:register http://deis.130.211.202.191.nip.io --username=test-13336 --password=asdf1234 --email=test-13336@deis.io
http://deis.130.211.202.191.nip.io/v2/ does not appear to be a valid Deis controller.
Make sure that the Controller URI is correct, the server is running and
your client version is correct.
Error: Get http://deis.130.211.202.191.nip.io/v2/: dial tcp: lookup deis.130.211.202.191.nip.io on 192.168.1.254:53: read udp 192.168.1.66:49619->192.168.1.254:53: i/o timeout
$ deis auth:cancel --username=test-97929 --password=asdf1234 --yes
Error: Not logged in. Use 'deis login' or 'deis register' to get started.
• Failure in Spec Setup (BeforeEach) [10.125 seconds]
deis limits
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/tests/limits_test.go:104
with an existing user
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/tests/limits_test.go:102
who owns an existing app that has already been deployed [BeforeEach]
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/tests/limits_test.go:100
that user can unset a memory limit on that application
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/vendor/github.com/onsi/ginkgo/ginkgo_dsl.go:354
No future change is possible. Bailing out early after 10.099s.
Expected
<int>: 1
to match exit code:
<int>: 0
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/tests/cmd/auth/commands.go:49
------------------------------
••••••••••••••••••s•^R
•
------------------------------
• [SLOW TEST:137.505 seconds]
deis ps
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/tests/ps_test.go:177
with an existing user
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/tests/ps_test.go:175
who owns an existing app that has already been deployed
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/tests/ps_test.go:173
that user can restart that app's processes
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/vendor/github.com/onsi/ginkgo/extensions/table/table.go:96
restarts all of 3
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/vendor/github.com/onsi/ginkgo/extensions/table/table_entry.go:46
------------------------------
•••••••••••
------------------------------
• [SLOW TEST:139.905 seconds]
deis ps
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/tests/ps_test.go:177
with an existing user
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/tests/ps_test.go:175
who owns an existing app that has already been deployed
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/tests/ps_test.go:173
that user can restart that app's processes
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/vendor/github.com/onsi/ginkgo/extensions/table/table.go:96
restarts all of 3 by type
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/vendor/github.com/onsi/ginkgo/extensions/table/table_entry.go:46
------------------------------
••••••••
------------------------------
$ deis auth:register http://deis.130.211.202.191.nip.io --username=test-43494 --password=asdf1234 --email=test-43494@deis.io
Registered test-43494
Error: Post http://deis.130.211.202.191.nip.io/v2/auth/login/: dial tcp: lookup deis.130.211.202.191.nip.io on 192.168.1.254:53: read udp 192.168.1.66:59254->192.168.1.254:53: i/o timeout
$ deis auth:cancel --username=test-54860 --password=asdf1234 --yes
Error: Not logged in. Use 'deis login' or 'deis register' to get started.
• Failure in Spec Setup (BeforeEach) [12.743 seconds]
deis tags
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/tests/tags_test.go:177
with an existing user
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/tests/tags_test.go:142
who owns an existing app that has already been deployed [BeforeEach]
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/tests/tags_test.go:140
and a tag has already been added to the app
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/tests/tags_test.go:138
that user can unset that tag from that app
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/vendor/github.com/onsi/ginkgo/ginkgo_dsl.go:354
No future change is possible. Bailing out early after 12.567s.
Expected
<int>: 1
to match exit code:
<int>: 0
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/tests/cmd/auth/commands.go:49
------------------------------
••••••••••••••••••••P•••••••••••••
------------------------------
• [SLOW TEST:120.912 seconds]
deis releases
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/tests/releases_test.go:94
with an existing user
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/tests/releases_test.go:92
who owns an existing app that has already been deployed
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/tests/releases_test.go:90
and that app has three releases
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/tests/releases_test.go:88
that user can roll the application back to the second release
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/vendor/github.com/onsi/ginkgo/ginkgo_dsl.go:354
------------------------------
••••••••••••••••••••••••••••••••••••••••P••
Summarizing 2 Failures:
[Fail] deis limits with an existing user [BeforeEach] who owns an existing app that has already been deployed that user can unset a memory limit on that application
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/tests/cmd/auth/commands.go:49
[Fail] deis tags with an existing user [BeforeEach] who owns an existing app that has already been deployed and a tag has already been added to the app that user can unset that tag from that app
/Users/aaronschlesinger/gocode/src/github.com/deis/workflow-e2e/tests/cmd/auth/commands.go:49
Ran 134 of 137 Specs in 2443.886 seconds
FAIL! -- 132 Passed | 2 Failed | 3 Pending | 0 Skipped
Ginkgo ran 1 suite in 40m46.05314335s
Test Suite Failed
make: *** [test-integration] Error 1 |
Note the |
I had a run with only two failed specs (both of which seem to be timeout related). I think we need to get this merged ASAP |
Oops, removing my LGTM in favour of @vdice's comment. I always forget to update the docs... |
Per @vdice's suggestion, I want to update the README before merging this. Not updating it was an oversight. |
This represents the necessary update to go along with changes in deis/workflow-e2e#157 and deis/charts#218
ifdef DEIS_ROUTER_SERVICE_HOST | ||
export DEIS_CONTROLLER_URL=http://deis.${DEIS_ROUTER_SERVICE_HOST}.nip.io | ||
endif | ||
endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@krancour do you think it's worth putting this logic into the test code itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had done it this way because I felt it was more transparent. Speaking personally, if I don't know how to run something (like a test suite), the Makefile is where I look first. That said, I don't mind relocating this to the test code itself if you think there's a strong argument to be made for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The program should at least be self documenting. How about a middle ground, where the test code at least prints out an explanation if it doesn't find DEIS_CONTROLLER_URL
(unless it does already?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems reasonable. I still want to fix this before merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@arschles as we discussed offline, I've attempted this and there's a problem that arises from Ginkgo's architecture. When executing tests in parallel, each Ginkgo node runs as its own process. If the DEIS_CONTROLLER_URL
environment variable is not set and is also not derivable from DEIS_ROUTER_SERVICE_HOST
, you are warned about this once per node... and there isn't anything immediately obvious we can do about that.
We decided together to defer fixing this until beta3.
[kent@mbp workflow-e2e]$ make test-integration
ginkgo -slowSpecThreshold=120.00 -noisyPendings=false -nodes=15 tests/
-------------------------------------------------------------------
| |
| Ginkgo timed out waiting for all parallel nodes to report back! |
| |
-------------------------------------------------------------------
[1] 2016/04/12 15:59:19 Environment variable DEIS_CONTROLLER_URL must be exported to proceed!
[2] 2016/04/12 15:59:19 Environment variable DEIS_CONTROLLER_URL must be exported to proceed!
[3] 2016/04/12 15:59:19 Environment variable DEIS_CONTROLLER_URL must be exported to proceed!
[4] 2016/04/12 15:59:19 Environment variable DEIS_CONTROLLER_URL must be exported to proceed!
[5] 2016/04/12 15:59:19 Environment variable DEIS_CONTROLLER_URL must be exported to proceed!
[6] 2016/04/12 15:59:19 Environment variable DEIS_CONTROLLER_URL must be exported to proceed!
[7] 2016/04/12 15:59:19 Environment variable DEIS_CONTROLLER_URL must be exported to proceed!
[8] 2016/04/12 15:59:19 Environment variable DEIS_CONTROLLER_URL must be exported to proceed!
[9] 2016/04/12 15:59:19 Environment variable DEIS_CONTROLLER_URL must be exported to proceed!
[10] 2016/04/12 15:59:19 Environment variable DEIS_CONTROLLER_URL must be exported to proceed!
[11] 2016/04/12 15:59:19 Environment variable DEIS_CONTROLLER_URL must be exported to proceed!
[12] 2016/04/12 15:59:19 Environment variable DEIS_CONTROLLER_URL must be exported to proceed!
[13] 2016/04/12 15:59:19 Environment variable DEIS_CONTROLLER_URL must be exported to proceed!
[14] 2016/04/12 15:59:19 Environment variable DEIS_CONTROLLER_URL must be exported to proceed!
[15] 2016/04/12 15:59:19 Environment variable DEIS_CONTROLLER_URL must be exported to proceed!
Ginkgo ran 1 suite in 3.560606126s
Test Suite Failed
make: *** [test-integration] Error 1
[kent@mbp workflow-e2e]$
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's file an issue and merge this.
deis/controller#552 will probably be fixed by this |
This represents the necessary update to go along with changes in deis/workflow-e2e#157 and deis/charts#218
return gexec.Start(execCmd, ginkgo.GinkgoWriter, ginkgo.GinkgoWriter) | ||
} | ||
|
||
// cmdWithRetry runs the provided <cmd> repeatedly, once a second up to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should say Retry runs the provided...
@krancour code LGTM. I have 2 more godoc related comments (#157 (comment) and #157 (comment)), so I'm LGTMing this assuming those will be fixed |
@arschles fixing now. |
This represents the necessary update to go along with changes in deis/workflow-e2e#157 and deis/charts#218
This represents the necessary update to go along with changes in deis/workflow-e2e#157 and deis/charts#218
This is refactor of the whole shooting match. It's stable... I get all tests passing well in excess of 95% of the time.
Note to reviewers: Rather than looking at the diffs, I'd just suggest evaluating the branch as a whole. Consider running the tests locally to assert they work as advertised.
Highlights:
make test-integration
defaults to parallel executionmake test-integration-serial
SynchronizedBeforeSuite
to manage one-time setup for the parallelized test suite. This cut down on repetitive, unnecessary, and doomed (but not fatal) attempts to register the admin user.settings
package to avoid polluting thetests
package and to keep test code expressive.deis builds:create
(akadeis pull
)! Why? For many tests that have a condition / context that specifies an application exists and has already been deployed, this is a faster way to satisfy that prerequisite than by having the builder build pushed code. (Don't worry. There is still a test that specifically exercises that functionality, but we just avoid that anywhere where there's a faster means of satisfying "an app exists and is already deployed.") It would seem these faster, lighter deploys have been good for overall stability.deis builds:create
. I know, I know... sleeps in the tests are evil, right? Or are they? In 132/134 tests we're not testingdeis builds:create
itself, but as I said, we're using it to satisfy the condition "an app exists and has already been deployed." I think it's more than fair to assume an "existing app" has been around for at least 10 seconds (in real life, an "existing app" has likely existed for much longer). So the code reflects this... and this also had a drastic effect on improving the stability of the tests.tests
package.admin
andhealthcheck
since, unlike the rest of the*_test.go
files, they didn't neatly align with an describe adeis
subcommand. The specs within those files were not removed. They were transplanted to other files where they seemed to fit more appropriately.Things I didn't change or fix:
*_test.go
files.One thing I removed, in part because it didn't make sense to me: I did remove some specs having to do with arbitrarily complex usage of multiple domains and certificates. Smaller, more focused specs would be better than an epic. I do, however, feel that the important functionality is still tested and that I didn't actually sacrifice coverage by eliminating that big spec.
Bonus: Containerized e2e tests work now. (It seems to me they didn't quite work before.)
To run:
Caveat: Any credentials referenced by your kubectl configuration must live somewhere within
~/.kube
Other notes: Some of the environment variables that drive all of this have changed, so this PR isn't going to pass in Jenkins with the job as is.
This is all far from perfect, but I'll let the stability speak for itself. Didn't want to delay PR'ing it any longer since I know people are eagerly awaiting a suite that will pass the PRs we've all got queued up.
If I am on PTO Friday, I do not mind if someone else takes this over.