Running shell tests with `bazel run` intermittently omits outputs
#17754
Comments
Can you please try this with Bazel from head? We recently fixed issues with `bazel run` that sound similar.
I do not currently have an environment set up where I can build the latest head, but I have tentatively tested the following:
This still fails as described in the issue. While
I have now tried this with a Bazel built from head (8a23169) and can verify that the problem still exists.
Fixes #17754.
Prior to this change, the output of quick tests was sometimes swallowed. After a lot of poking, it became clear that the culprit is the use of a subshell and `tee`: if you remove `tee` completely from the picture, the behavior never shows up. The issue is that with a fast test, `tee` (or its parent subshell) seems to be killed before printing the output to stdout.
This change reduces the number of subshells and processes to set up, which reduces the chance of the race condition but does not remove it. For practical purposes, however, the race condition is gone. With the reproduction steps in #17754 and this command
```
for i in {1..10000}; do /tmp/bazel run :foo &> /tmp/log ; grep -q "useful echo" /tmp/log ; if [ $? -eq 0 ]; then echo -n '+'; else echo -n '-'; fi; done
```
a bazel from head fails ~3900 out of 10000 times. After this commit, it never failed.
Closes #17846.
PiperOrigin-RevId: 518794237
Change-Id: I8c1862d3a274799b864f0f5f42b85d6df5af78c7
Co-authored-by: Tobias Werth <twerth@google.com>
Description of the bug:
We have a use case in which we want to pipe the output of a Bazel test target, implemented as a shell script, to an external tool. Certain parts of the output are essential to pass to the external tool. We also want to be able to run this target with `bazel run` as well as `bazel test`.
Assuming that the target is named `:foo` and that the external tool is named `my_tool`, we can do this with the command:
$ bazel run :foo | my_tool
This usually works, but `my_tool` will intermittently not receive the full intended output from the `:foo` run. The failures seem to be caused by a race condition that sometimes results in subprocesses being killed before they have had the opportunity to write to stdout. This problem seems to have been introduced by the commit 9051faa.
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
Create a new workspace with the files:
Then, try to grep "useful echo" from the script output repeatedly until failure. This can for example be done with:
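A minimal sketch of such a loop, modeled on the command quoted in the fix's commit message. The helper name `count_missing` and the temporary log file are assumptions for illustration and are not taken from the issue.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command N times and count the runs whose
# combined stdout/stderr does not contain the expected marker string.
count_missing() {
  local iterations="$1" marker="$2"
  shift 2
  local log missing=0 i
  log="$(mktemp)"
  for ((i = 0; i < iterations; i++)); do
    # Capture stdout and stderr together, like `&> /tmp/log` in the commit message.
    "$@" > "$log" 2>&1
    if ! grep -q -- "$marker" "$log"; then
      missing=$((missing + 1))
    fi
  done
  rm -f "$log"
  echo "$missing"
}

# Usage (assumes a workspace with the :foo target described in the issue):
#   count_missing 1000 "useful echo" bazel run :foo
```

A non-zero result means at least one run lost the marker line. The count depends on machine load, so it will vary between runs.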
Alternatively, it's faster to use the `--script_path` option to create a separate script for running the test.
We have found that this usually fails within 1000 iterations, but under heavy load from other processes on the machine it has been observed to go below 50.
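The `--script_path` variant can be sketched along the same lines. The helper name `first_missing_iteration` and the script location `/tmp/run_foo.sh` are assumptions chosen for illustration. The helper pipes the output directly, as in `bazel run :foo | my_tool`, and reports the first iteration at which the marker is missing.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the 1-based iteration at which the marker first
# goes missing from the piped output, or 0 if it is present in every run.
first_missing_iteration() {
  local max="$1" marker="$2"
  shift 2
  local i
  for ((i = 1; i <= max; i++)); do
    # grep without -q reads all input, so the command never gets SIGPIPE.
    if ! "$@" 2>&1 | grep -- "$marker" > /dev/null; then
      echo "$i"
      return 0
    fi
  done
  echo 0
}

# Usage (assumes the :foo target from the issue; the path is arbitrary):
#   bazel run --script_path=/tmp/run_foo.sh :foo
#   first_missing_iteration 1000 "useful echo" /tmp/run_foo.sh
```

Generating the script once avoids a full Bazel invocation on every iteration, which is why this approach is faster.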
Which operating system are you running Bazel on?
Red Hat Enterprise Linux Server 7.9
What is the output of `bazel info release`?
No response
If `bazel info release` returns `development version` or `(@non-git)`, tell us how you built Bazel.
No response
What's the output of `git remote get-url origin; git rev-parse master; git rev-parse HEAD`?
No response
Have you found anything relevant by searching the web?
No response
Any other information, logs, or outputs that you want to share?
No response