Skip to content
This repository has been archived by the owner on Aug 8, 2023. It is now read-only.

Calibrate target code coverage #13324

Closed
jmkiley opened this issue Nov 8, 2018 · 8 comments
Closed

Calibrate target code coverage #13324

jmkiley opened this issue Nov 8, 2018 · 8 comments
Labels

Comments

@jmkiley
Copy link
Contributor

jmkiley commented Nov 8, 2018

We should keep an eye on coverage fluctuations in order to see if we want to add custom coverage targets, as well as if we want to not fail PRs or add notifications when code coverage drops below a certain amount. This would help us track coverage and flag less stable tests. It seems that with Coveralls, we posted coverage changes to Slack.

Currently, code coverage is only enabled for core GL, but this is something to keep in mind as enable code coverage for other platforms (tracked in #7609 for iOS). We can configure different coverage targets for different flags.

@jmkiley
Copy link
Contributor Author

jmkiley commented Nov 14, 2018

We have noticed that code coverage is occasionally dropping by 0.01% because a line in line_bucket.cpp is not being hit. This has popped up in several PRs, including #13357, #13352, abd #13337.

@springmeyer is going to adjust the threshold for CodeCov failure until we can pinpoint what is causing this line to be skipped. Investigating on the ds-jk-line_bucket branch.

cc @kkaefer @friedbunny @zugaldia

@springmeyer
Copy link
Contributor

springmeyer commented Nov 14, 2018

We have noticed that code coverage is occasionally dropping by 0.01% because a line in line_bucket.cpp is not being hit. This has popped up in several PRs, including #13357, #13352, abd #13337.

Looks like there are many different places in the codebase that are flapping. So, fundamentally the unit tests are non-determinative, which is very concerning. It may be easily explained by how threading works or some other feature of GL. But worst case this means there is a possibility that our unit tests are not actually catching the bugs they intend to prevent from regressing and that there is a potential for latent bugs to be lurking, despite tests.

Based on voice with @jmkiley the next action we see on coverage tracking is to document all of the lines of code that we've seen flap in the coverage results in the last few days on this ticket. This will be good data to start getting a broad sense of scope of the non-deterministic test problem, which should be handed and prioritized separately from this coverage ticket.

@springmeyer is going to adjust the threshold for CodeCov failure until we can pinpoint what is causing this line to be skipped. Investigating on the ds-jk-line_bucket branch.

Yes, as a stopgap until the non-determinism is understood and solved I've put up #13364 in the hope that this will avoid confusing failures. But this is only a bandaid and not a real fix.

@jmkiley
Copy link
Contributor Author

jmkiley commented Nov 14, 2018

Tests that seem to have flapped in the last few days. The links below point to the lines CodeCov shows as changing.

@langsmith
Copy link
Contributor

@osana also experienced the weird flapping on #13433

@friedbunny
Copy link
Contributor

Seems that codecov not being reachable can also fail otherwise successful CI builds:

$ #!/bin/bash -eo pipefail
curl -sSfL -o codecov https://codecov.io/bash
chmod +x codecov
./codecov -c

curl: (22) The requested URL returned error: 503 Service Temporarily Unavailable
Exited with code 22

@springmeyer
Copy link
Contributor

That makes sense given they just experienced a major outage: https://status.codecov.io/incidents/623n3mpxf6t1. But looks like it is resolved.

@jmkiley
Copy link
Contributor Author

jmkiley commented Nov 30, 2018

Once #13364 lands, the threshold for gl-native will be at 1%. We should keep an eye out for any PRs where code coverage unexpectedly changes by a greater amount, and reevaluate the threshold in 2019 Q2.

@springmeyer
Copy link
Contributor

springmeyer commented Dec 2, 2018

#13364 has landed, so hopefully that avoids failed jobs due to slight coverage changes in the near term. Because even slight changes in coverage could also be explained by legitimate failures to add new tests to cover new code, we should aim for reverting #13364 asap. I've ticketed #13494 to track that.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

4 participants