How to motivate people to fix flaky tests #21
I hope you don't mind me putting my two cents here, but I've fixed a few flaky tests and lately I've been looking into fixing more. My takeaway is:
I for one am interested in helping fix flaky tests, but some more information on how to reproduce results from particular platforms would be welcome.
I'm noticing an uptick in tagging @nodejs/testing when a flaky test fails. In and of itself, that's OK, but it seems more often than not to be the extent of what is done, and that may be the opposite of OK. How do we feel about that?

I can see different ways of looking at it, but I'm finding myself feeling this way: this makes it easy for people to ignore flaky tests and also to ignore actual failures by assuming they are flaky. Tag @nodejs/testing and move on! This is bad and we should stop it. Additionally, it's not scalable, and it wrongly puts the onus of fixing flaky tests on the testing working group when that responsibility should lie with all collaborators.

What to do about it? I'm not sure. Here's an example: nodejs/node#5314 (comment) (Not picking on you @jasnell, it's just the latest I've come across. And you're more likely to do this sort of thing because you work to push along an unusually large number of PRs, so if anyone's going to do it and have it be OK, it's you.)

Labeling this issue for the agenda for the next meeting.
One possible thing to do is have a short, easy-to-understand bullet point or checklist of things to do when you find a possibly-flaky test and refer to it when this happens. Off the top of my head:

- Confirm the failure appears unrelated to the changes in the current pull request.
- Open an issue titled "Investigate flaky TESTNAME". Include a link to the failure in CI.
- Open a pull request marking the test as flaky.
One thing we may need to codify is who is responsible for doing those last two things. Like, in the example above, is it James? Is it the person who opened the PR? I mean, it's everybody's responsibility. But if it's everybody's responsibility, it's easy for nobody to actually do it, you know? But we don't want to punish people for noting a flaky test by making them do a bunch of other stuff. Maybe we could ease the opening of the issue with some automation or something? I don't know what the answer is here, or even if there is a good one.
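To make the "ease the opening of the issue with some automation" idea concrete, here is a minimal sketch (not an existing tool) of a helper that files an "Investigate flaky TESTNAME" issue through GitHub's create-issue REST endpoint. The label name, environment variable, and example URLs are illustrative assumptions:

```typescript
// Sketch only: open an "Investigate flaky TESTNAME" issue via the GitHub REST API.
// Assumes a GITHUB_TOKEN with permission to create issues; the label is hypothetical.
async function openFlakyTestIssue(testName: string, ciFailureUrl: string): Promise<void> {
  const response = await fetch("https://api.github.com/repos/nodejs/node/issues", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${process.env.GITHUB_TOKEN}`,
      "Accept": "application/vnd.github+json",
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      title: `Investigate flaky ${testName}`,
      body: `\`${testName}\` failed in CI on a change that appears unrelated.\n\nCI failure: ${ciFailureUrl}`,
      labels: ["flaky-test"],
    }),
  });
  if (!response.ok) {
    throw new Error(`GitHub API returned ${response.status}`);
  }
}

// Example usage (URLs are placeholders):
// openFlakyTestIssue("test-some-flaky-thing", "https://ci.nodejs.org/job/node-test-commit/1234/");
```

Something like this could be wired to a comment command or a CI button so that noting a flaky test costs one click rather than a manual write-up.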
Long time since the last discussion, but I think we can ask for "Open an issue titled "Investigate flaky TESTNAME". Include a link to the failure in CI." as the minimal responsibility we all have as collaborators. In terms of the PR to mark the test as flaky, I'm thinking that can depend on how flaky it is. In some cases it may be better not to have it marked, as that makes it more invisible and possibly less likely for somebody to decide to try and fix it. I'm not sure if in general flaky tests are a good first contribution, but marking those that are may be a good way to get more attention on them.
It seems like perhaps this should be closed. Feel free to re-open (or leave a comment requesting that it be re-opened) if you disagree. I'm just tidying up and not acting on a super-strong opinion or anything like that.
(continued from nodejs/build#248)
The current documented policy for flaky tests (https://github.com/nodejs/node/wiki/Flaky-tests#what-to-do-when-you-encounter-a-new-flaky-test) calls for opening an issue to track them when you mark the test as flaky, and assigning the issue to the next release milestone.
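For readers unfamiliar with the mechanism, marking a test as flaky in nodejs/node means adding an entry to the relevant status file under test/ (for example test/parallel/parallel.status). The snippet below is illustrative only; the test name, platform section, and issue number are placeholders:

```
# test/parallel/parallel.status (illustrative entry)
prefix parallel

[$system==win32]
# https://github.com/nodejs/node/issues/NNNN
test-some-flaky-thing: PASS,FLAKY
```

With an entry like this, the CI treats a failure of that test as flaky rather than as a hard failure, which is exactly why the accompanying tracking issue matters: the entry hides the problem unless someone follows up.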
One part that I think could use some improvement is clarifying who is going to take responsibility for fixing the flaky test, and how to motivate people to do it. The person who marks the test as flaky is usually the collaborator who is making the determination that the test is not failing because of the current pull request's changes. They are not motivated to fix the test and are not necessarily the most qualified to work on the particular test that is failing.
In a dev team working for one company, you could probably just assign the issue to the test author/owner. I am not sure that this would work in an open source project.
So how do we motivate collaborators to investigate and fix these failures? Here are some options we could consider: