
Test suite discussion (Friday June 23) #2333

Closed · mdbenjam opened this issue Jun 13, 2017 · 16 comments

@mdbenjam (Contributor) commented Jun 13, 2017

It seems every couple of months we have a discussion about our test suite. There are two main complaints:

  1. It takes too long.
  2. Flaky tests make the CI process, and testing in general, frustrating.

Proposal

Capybara tests have proved to be difficult to maintain. I suggest that we move away from Capybara feature tests by doing the following.

  1. Beef up our unit tests of React components and Redux actions.
  2. Try to break our feature tests into pairs of tests:
    • A backend RSpec test that hits the endpoint and makes sure we return what is expected.
    • A frontend mocha/enzyme test that uses the expected value from the backend test to respond to any API requests.
    • The expected responses from the API can be stored in a file. This same file can be used to mock out the frontend API requests. That way, if the API contract ever changes, we can easily catch it. (A sketch of the backend half follows this list.)
  3. Only test one or two happy paths through the app using Capybara.
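
To make item 2 concrete, the backend half could look roughly like the request spec below. This is only a sketch: the endpoint, fixture path, and response shape are placeholders, not actual Caseflow code.

```ruby
# spec/requests/appeals_spec.rb -- illustrative only; endpoint and paths are placeholders
require "rails_helper"

describe "GET /api/v1/appeals", type: :request do
  it "returns the expected appeals payload and records the shared contract fixture" do
    get "/api/v1/appeals"

    expect(response).to have_http_status(:ok)
    body = JSON.parse(response.body)
    expect(body["appeals"]).to be_an(Array)

    # The same file is loaded by the mocha/enzyme tests to stub their API
    # calls, so a change to the contract fails on both sides of the split.
    File.write(Rails.root.join("spec/fixtures/api/appeals.json"), response.body)
  end
end
```

The frontend test would then require the same appeals.json file and feed it to whatever fetch mock it uses, instead of duplicating the expected response by hand.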

Feel free to add new ideas, or push back against what I've posted. Let's have a conversation about this on Monday June 19.

mdbenjam self-assigned this on Jun 13, 2017
mdbenjam changed the title from "Test suite discussion" to "Test suite discussion (Monday June 19)" on Jun 13, 2017
@ghost commented Jun 13, 2017

@NickHeiner (Contributor)

See also #2293.

@joofsh (Contributor) commented Jun 13, 2017

Some initial thoughts:

  1. Before we discuss this, I think it's really important that everyone read this long article about JS-heavy applications and Capybara. It goes into great detail about a setup identical to ours, the similar problems they ran into, and the way they went about solving them.

  2. One key insight I took from that article is to audit our gems and see if any need upgrading (the existing version may have a bug with test concurrency). I don't believe we've upgraded any of our core gems in quite some time now. This would be a great task to get on the roadmap.

  3. I agree with Mark that moving to a split API backend test & frontend UI test approach is probably wise. By creating a clearly defined contract between the front and back end, we can in the future split up the monolith into separate repos. This will allow us to only run a section of the test suite before merging. It also has several benefits beyond testing, such as a clearer path to migrating off VACOLS (other contracting teams working on sections of the UI), the ability to deploy our Caseflow products on different timelines (being able to roll back only Reader if there is a bug), faster hotfixing, etc. We're starting to feel some of the "burden" of the monolith approach.

  4. If we do decide to move towards more frontend tests, we should first work on improving that setup. I think the current karma setup is not yet conducive to a happy developer workflow due to its own slowness & poor test output. (Note: I haven't done any frontend development recently, so this may no longer be accurate.)

  5. We may want to consider adding code coverage to the JavaScript tests. This is something that was on my personal TODO list, but then we moved away from JS testing. Ideally the goal would be to have unit tests for all Redux reducers, computed actions, and selectors, and "unit UI tests" for all reusable components.

mdbenjam removed their assignment on Jun 14, 2017
@ghost commented Jun 15, 2017

I was able to spin up a Travis-like VM where these tests can be debugged and tested...
All the flaky tests have repeatedly failed there with no particular consistency. However, this allows for clear debugging and screenshot access for anyone willing.

@shanear (Contributor) commented Jun 16, 2017

I know I'm on vacay but I couldn't help but add some thoughts :)

  1. There are a couple really big advantages of our Capybara tests that we would be leaving behind:

    • They test almost every feature of our application with the front and back end integrated. This is extremely valuable and has prevented countless bugs. The fact that we've only had 1 regression bug in almost a year and a half of rapid development and refactor cycles is a pretty incredible testament to the value they are providing.
    • They have automated accessibility testing (we could do this with JS tests too, it'd just take some work to set up).
    • They are very easy to write and set up (extending on JD's point 4).
    • They deal with timing issues very well compared to any other UI testing framework (JavaScript included) I've ever used.
  2. To my knowledge, we haven't put much effort into lower-cost solutions to the problems we are having with the Capybara tests / long builds. Off the top of my head, there are a couple of low-cost ways we might be able to mitigate the problem:

    • Upgrade our gems (maybe to Rails 5?)
    • Try using the new headless Chrome browser that just came out
    • Assess the reason for the failing tests, and fix the root cause
    • Investigate another CI server that would take more advantage of parallelization.
    • Put a retry on flaky tests (a rough config sketch for this and headless Chrome follows this list).
  3. We don't actually know whether JS UI tests are the better option. Flaky tests will be a problem no matter what framework we use. We may be trading one set of problems for another, and spending a lot of effort to do so.

    I think piloting these new JS tests on Reader (or another app?) without committing to that direction is a better way to test whether JS UI tests are a suitable alternative. That would give us time to make the JS testing setup more developer friendly, and create some norms around how to write those sorts of tests.

  4. It's possible that we may be leaning too much on our feature tests, and they could afford to be peeled back a little bit. I don't think we really have a pyramid (https://martinfowler.com/bliki/TestPyramid.html). If there is functionality we could pull from a feature test to a JS UI test without losing any coverage, I think we should try to do that where we can.
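
For what it's worth, the headless Chrome and retry ideas in point 2 are both small config changes. Something roughly like the snippet below should be all it takes; this is an untested sketch that assumes the selenium-webdriver and rspec-retry gems, and the file name is just a placeholder.

```ruby
# spec/support/headless_chrome_and_retry.rb -- untested sketch, not actual Caseflow config
require "rspec/retry"

Capybara.register_driver :headless_chrome do |app|
  options = Selenium::WebDriver::Chrome::Options.new(
    args: %w[headless disable-gpu window-size=1200,1500]
  )
  Capybara::Selenium::Driver.new(app, browser: :chrome, options: options)
end
Capybara.javascript_driver = :headless_chrome

RSpec.configure do |config|
  config.verbose_retry = true                 # log each retry so flakes stay visible
  config.display_try_failure_messages = true
  config.around(:each, type: :feature) do |example|
    example.run_with_retry retry: 3           # only feature specs get retried
  end
end
```

Retries don't fix the root cause, but they would at least keep flaky failures from blocking CI while we investigate.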

@NickHeiner (Contributor)

@mdbenjam when is this conversation actually happening? Is it the time that would be right after standup today if there were standup?

mdbenjam changed the title from "Test suite discussion (Monday June 19)" to "Test suite discussion (Thursday June 22)" on Jun 19, 2017
mdbenjam changed the title from "Test suite discussion (Thursday June 22)" to "Test suite discussion (Friday June 23)" on Jun 19, 2017
@ghost commented Jun 23, 2017

Action items:

  • SPIKE: other CI tools: Jenkins, CircleCI? vets.gov swarm
  • New testing style: contract frontend vs. backend tests, with fewer feature tests
  • Address current pending tests per app
  • Explore a way to set up parallel tests: feature tests in parallel? (sketch below)
  • Document "How we do testing!" - ensure_stable helper.

@NickHeiner (Contributor)

This is the ensure_stable helper.

@askldjd (Contributor) commented Jun 23, 2017

Just an update on the shared CI pipeline with vets.gov.

@CyberKoz has a prototype Jenkins k8s cluster ready, and @mogrenAtWork will start developing against it. Here are the steps:

  1. Create a base container
  2. Migrate Travis steps into shell scripts
  3. Add the jobs to the Jenkins prototype for testing and performance evaluation.
  4. Deploy to production

There are three key advantages for this setup.

  1. We (Appeals) are not provisioning this Jenkins cluster. We are piggybacking on the devOps infrastructure of vets.gov. So as long as we do the migration, provisioning cost is very low.
  2. The Jenkins k8s cluster allows close to unlimited scalability for our team. If @joofsh can repair the parallel rake task, I expect our CI will complete in 5 minutes (vs. ~20 minutes today) from vertical scaling. There will be little to no wait queue because of horizontal scaling.
  3. This Jenkins will be public facing. Build status will be public, and Build Logs will be protected behind passwords.

For timeline reference, @CyberKoz expects the k8s setup to go to production in 2-3 weeks. For Appeals, it depends on how fast we can get the container and scripts ready.

@NickHeiner (Contributor)

> and Build Logs will be protected behind passwords.

Why?

> We (Appeals) are not provisioning this Jenkins cluster. We are piggybacking on the devOps infrastructure of vets.gov. So as long as we do the migration, provisioning cost is very low.

Regardless, this does increase the devops cost on DSVA overall. I am still not sold that we shouldn't use a SaaS solution to decrease our devops burden. @askldjd what are the benefits of Jenkins over CircleCI? I agree that Jenkins costs less money, but the budget may not be the biggest concern here.

@askldjd (Contributor) commented Jun 23, 2017

> Build Logs will be protected behind passwords.

Arbitrary. We can discuss this.

> Regardless, this does increase the devops cost on DSVA overall. I am still not sold that we shouldn't use a SaaS solution to decrease our devops burden.

Maybe, maybe not. I think once the infrastructure is up, adding Appeals to the CI pipeline will fit well within the economy of scale. Personally, I would much prefer to have our infrastructure stack line up with the vets.gov team's. Having two CI solutions for DSVA seems to make no sense when our software stacks are nearly identical.

> What are the benefits of Jenkins over CircleCI?

Performance. I haven't seen performance numbers from Circle CI. However, they are unlikely to be able to match the performance of a custom Jenkins swarm. If Circle CI isn't fast enough, we will have to speed up our tests. With Jenkins, we can just throw more hardware at it.

The vets.gov team has had the Jenkins CI Swarm for over 6 months now, and they are happy customers. Like us, they outgrew Travis and evaluated all the SaaS solutions. Jenkins CI Swarm was their solution, and they have been happy since.

Jenkins isn't going away no matter what. Our CD pipeline can only be done by Jenkins. I think it is more advantageous to invest time in Jenkins competency instead of investing in Circle CI competency.

@NickHeiner (Contributor)

> Jenkins isn't going away no matter what. Our CD pipeline can only be done by Jenkins. I think it is more advantageous to invest time in Jenkins competency instead of investing in Circle CI competency.

I agree with this. Unless CircleCI competency is trivial compared to the work of maintaining more Jenkins environments.

> Personally, I would much prefer to have our infrastructure stack line up with the vets.gov team's. Having two CI solutions for DSVA seems to make no sense when our software stacks are nearly identical.

I agree, but I wonder why DSVA doesn't use a SaaS solution instead of Jenkins for CI as well! 😄

> The vets.gov team has had the Jenkins CI Swarm for over 6 months now, and they are happy customers. Like us, they outgrew Travis and evaluated all the SaaS solutions. Jenkins CI Swarm was their solution, and they have been happy since.

This is a good data point. I would love to chat with them more about why they chose Jenkins.

> Performance. I haven't seen performance numbers from Circle CI. However, they are unlikely to be able to match the performance of a custom Jenkins swarm. If Circle CI isn't fast enough, we will have to speed up our tests. With Jenkins, we can just throw more hardware at it.

I agree that we still need to investigate further, but I wouldn't be surprised if CircleCI could have comparable performance to Jenkins. Can we not also "throw more hardware" at CircleCI by giving them more money?

@mdbenjam (Contributor, Author)

I'm closing this discussion ticket, with the following issues opened to track the action items:

#2449
#2293
#2455
#2457
#2456
#2454
#2458

@NickHeiner (Contributor)

As a follow-up to my earlier comments: I chatted with @CyberKoz today. Vets.gov went to Jenkins instead of a SaaS CI because they were frustrated with Travis being too slow even at the max subscription level and were afraid that other SaaS products would have the same problem. He thinks that the additional overhead of Appeals running PR builds on Vets.gov Jenkins would be negligible. Extra work to generally improve their CI setup is minimal as well.

Strangely, the pricing pages for both Travis and CircleCI only talk in terms of horizontal scaling, not vertical.

I'm going to reach out to CircleCI and ask about their ability to do vertical scaling. Until then, I'm fine moving forward with Jenkins. I am also curious to try CircleCI when we get our dockerfile ready and see how the two compare.

NickHeiner added a commit that referenced this issue Jun 27, 2017
Adding norms based on the discussion from #2333.
@NickHeiner (Contributor)

For what it's worth, a few other SaaS CI providers:

@NickHeiner (Contributor)

Update from talking to @CyberKoz: the problem that Vets.gov had with Travis was horizontal scaling, not vertical. CircleCI offers up to 62 concurrent builders. I think we'd have a hard time hitting that.

For me, the biggest unknown will be how fast a single build runs. Let's try both CircleCI and custom Jenkins side-by-side and see how they compare.
