Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bazel coverage fails on existing rules_go test #6293

Closed
jayconrod opened this issue Oct 2, 2018 · 12 comments
Closed

Bazel coverage fails on existing rules_go test #6293

jayconrod opened this issue Oct 2, 2018 · 12 comments
Labels
coverage P3 We're not considering working on this, but happy to review a PR. (No assignee) stale Issues or PRs that are stale (no activity for 30 days) team-Rules-Server Issues for serverside rules included with Bazel type: bug

Comments

@jayconrod
Copy link
Contributor

Description of the problem / feature request:

In Bazel 0.18.0rc8, the rules_go target //tests/core/coverage:coverage_test fails. It passed in 0.17.1. It appears the collect_coverage.sh script is failing because the LCOV_MERGER environment variable is not set.

This may affect all coverage runs for Starlark runs. coverage_test is a trivial go_test used to collect coverage data; there's nothing special about it.

Snippet of the output:

$ bazel coverage tests/core/coverage:coverage_test
...
+ cd /usr/local/google/home/jayconrod/.cache/bazel/_bazel_jayconrod/7e5a5a4544b57e311691c3a03ea8d383/sandbox/linux-sandbox/1/execroot/io_bazel_rules_go/bazel-out/k8-fastbuild/bin/tests/core/coverage/linux_amd64_stripped/coverage_test.runfiles/io_bazel_rules_go
+ /usr/local/google/home/jayconrod/.cache/bazel/_bazel_jayconrod/7e5a5a4544b57e311691c3a03ea8d383/sandbox/linux-sandbox/1/execroot/io_bazel_rules_go/bazel-out/k8-fastbuild/bin/tests/core/coverage/linux_amd64_stripped/coverage_test.runfiles/io_bazel_rules_go/tests/core/coverage/linux_amd64_stripped/coverage_test
PASS
coverage: 50.0% of statements
+ TEST_STATUS=0
+ touch /usr/local/google/home/jayconrod/.cache/bazel/_bazel_jayconrod/7e5a5a4544b57e311691c3a03ea8d383/sandbox/linux-sandbox/1/execroot/io_bazel_rules_go/bazel-out/k8-fastbuild/testlogs/tests/core/coverage/coverage_test/coverage.dat
+ [[ 0 -ne 0 ]]
+ cd /usr/local/google/home/jayconrod/.cache/bazel/_bazel_jayconrod/7e5a5a4544b57e311691c3a03ea8d383/sandbox/linux-sandbox/1/execroot/io_bazel_rules_go
+ [[ -n '' ]]
+ export 'LCOV_MERGER_CMD= --coverage_dir=/usr/local/google/home/jayconrod/.cache/bazel/_bazel_jayconrod/7e5a5a4544b57e311691c3a03ea8d383/sandbox/linux-sandbox/1/execroot/io_bazel_rules_go/_coverage/tests/core/coverage/coverage_test/test --output_file=/usr/local/google/home/jayconrod/.cache/bazel/_bazel_jayconrod/7e5a5a4544b57e311691c3a03ea8d383/sandbox/linux-sandbox/1/execroot/io_bazel_rules_go/bazel-out/k8-fastbuild/testlogs/tests/core/coverage/coverage_test/coverage.dat'
+ LCOV_MERGER_CMD=' --coverage_dir=/usr/local/google/home/jayconrod/.cache/bazel/_bazel_jayconrod/7e5a5a4544b57e311691c3a03ea8d383/sandbox/linux-sandbox/1/execroot/io_bazel_rules_go/_coverage/tests/core/coverage/coverage_test/test --output_file=/usr/local/google/home/jayconrod/.cache/bazel/_bazel_jayconrod/7e5a5a4544b57e311691c3a03ea8d383/sandbox/linux-sandbox/1/execroot/io_bazel_rules_go/bazel-out/k8-fastbuild/testlogs/tests/core/coverage/coverage_test/coverage.dat'
+ [[ -n '' ]]
+ JAVA_RUNFILES=
+ exec --coverage_dir=/usr/local/google/home/jayconrod/.cache/bazel/_bazel_jayconrod/7e5a5a4544b57e311691c3a03ea8d383/sandbox/linux-sandbox/1/execroot/io_bazel_rules_go/_coverage/tests/core/coverage/coverage_test/test --output_file=/usr/local/google/home/jayconrod/.cache/bazel/_bazel_jayconrod/7e5a5a4544b57e311691c3a03ea8d383/sandbox/linux-sandbox/1/execroot/io_bazel_rules_go/bazel-out/k8-fastbuild/testlogs/tests/core/coverage/coverage_test/coverage.dat
external/bazel_tools/tools/test/collect_coverage.sh: line 132: exec: --: invalid option
exec: usage: exec [-cl] [-a name] [command [arguments ...]] [redirection ...]
================================================================================
bazel: Leaving directory `/usr/local/google/home/jayconrod/.cache/bazel/_bazel_jayconrod/7e5a5a4544b57e311691c3a03ea8d383/execroot/io_bazel_rules_go/'
Target //tests/core/coverage:coverage_test up-to-date:
  bazel-bin/tests/core/coverage/linux_amd64_stripped/coverage_test
INFO: Elapsed time: 4.579s, Critical Path: 0.25s
INFO: 1 process: 1 linux-sandbox.
INFO: Build completed, 1 test FAILED, 2 total actions
//tests/core/coverage:coverage_test                                      FAILED in 0.1s

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

What operating system are you running Bazel on?

Linux 4.17.0-3rodete2-amd64 #1 SMP Debian 4.17.17-1rodete2 (2018-08-28) x86_64 GNU/Linux

What's the output of bazel info release?

release 0.18.0rc8

Any other information, logs, or outputs that you want to share?

I have lcov installed at /usr/bin/lcov. Not sure if that makes a difference now -- it did in the past.

@jin jin added team-Rules-Server Issues for serverside rules included with Bazel untriaged labels Oct 3, 2018
@jayconrod
Copy link
Contributor Author

This is failing in 0.18.0 now. :(

@benjaminp
Copy link
Collaborator

I believe the workaround is throw a dummy implicit lcov_merger attribute in your test rule attributes like this:

'$lcov_merger': attr.label(executable=True, default=Label('//any/target/at:all'), cfg="target"),

jayconrod pushed a commit to jayconrod/rules_go that referenced this issue Oct 15, 2018
When collecting coverage, Bazel now expects test rules to have an
implicit "$lcov_merger" or "_lcov_merger" attribute that points to a
binary to be run by the test driver script.

This change adds a dummy _lcov_merger attribute to go_test. The script
does nothing for now.

Related bazelbuild/bazel#6293
jayconrod pushed a commit to jayconrod/rules_go that referenced this issue Oct 15, 2018
When collecting coverage, Bazel now expects test rules to have an
implicit "$lcov_merger" or "_lcov_merger" attribute that points to a
binary to be run by the test driver script.

This change adds a dummy _lcov_merger attribute to go_test. The script
does nothing for now.

Related bazelbuild/bazel#6293
@jayconrod
Copy link
Contributor Author

@benjaminp Thanks, I think that will get rules_go's CI to be green again.

I'll leave this issue open, since it's still a regression.

jayconrod added a commit to bazelbuild/rules_go that referenced this issue Oct 15, 2018
When collecting coverage, Bazel now expects test rules to have an
implicit "$lcov_merger" or "_lcov_merger" attribute that points to a
binary to be run by the test driver script.

This change adds a dummy _lcov_merger attribute to go_test. The script
does nothing for now.

Related bazelbuild/bazel#6293
jayconrod added a commit to bazelbuild/rules_go that referenced this issue Oct 16, 2018
When collecting coverage, Bazel now expects test rules to have an
implicit "$lcov_merger" or "_lcov_merger" attribute that points to a
binary to be run by the test driver script.

This change adds a dummy _lcov_merger attribute to go_test. The script
does nothing for now.

Related bazelbuild/bazel#6293
jayconrod added a commit to bazelbuild/rules_go that referenced this issue Oct 16, 2018
When collecting coverage, Bazel now expects test rules to have an
implicit "$lcov_merger" or "_lcov_merger" attribute that points to a
binary to be run by the test driver script.

This change adds a dummy _lcov_merger attribute to go_test. The script
does nothing for now.

Related bazelbuild/bazel#6293
jayconrod added a commit to bazelbuild/rules_go that referenced this issue Oct 16, 2018
When collecting coverage, Bazel now expects test rules to have an
implicit "$lcov_merger" or "_lcov_merger" attribute that points to a
binary to be run by the test driver script.

This change adds a dummy _lcov_merger attribute to go_test. The script
does nothing for now.

Related bazelbuild/bazel#6293
jayconrod added a commit to bazelbuild/rules_go that referenced this issue Oct 16, 2018
When collecting coverage, Bazel now expects test rules to have an
implicit "$lcov_merger" or "_lcov_merger" attribute that points to a
binary to be run by the test driver script.

This change adds a dummy _lcov_merger attribute to go_test. The script
does nothing for now.

Related bazelbuild/bazel#6293
grailbot pushed a commit to grailbio/rules_r that referenced this issue Oct 30, 2018
Summary:
From bazel 0.18.0, an lcov_merger tool is needed for every test rule
that collects coverage.

See bazelbuild/bazel#6293 for more details.

Reviewers: ysaito

Reviewed By: ysaito

Differential Revision: https://phabricator.grailbio.com/D20750

fbshipit-source-id: f480a5e
@lberki lberki added P3 We're not considering working on this, but happy to review a PR. (No assignee) coverage and removed untriaged labels Mar 13, 2019
@iirina
Copy link
Contributor

iirina commented Jun 25, 2019

@jayconrod @benjaminp Addressing the questions from #8670.

[...] what rules_go + Bazel + CI should do here? I understand that Bazel needs a tool to merge lcov files from multiple tests with coverage enabled, and that's /usr/bin/lcov by default.

Bazel does need a tool to merge lcov files from multiple tests, but that is not /usr/bin/lcov. Bazel uses an embedded tool called CoverageOutputGenerator which is faster than lcov. the _lcov_merger attribute in go_test can also point to this tool, although I'm not sure what go_test generates in coverage mode.

However, /usr/bin/lcov isn't installed on most systems (including images used on BuildKite). It looks like there are some lcov utilities in @bazel_tools. Will /usr/bin/lcov not be needed in the future?

lcov is only needed for C++ tests and is not required for the other languages. For C++ lcov will be needed in the future.

Should test rules still define the _lcov_merger attribute, or will there be another default behavior in the future?

Currently bazel assumes that all test runs generate an lcov file in coverage mode, so they should have an _lcov_merger attribute pointing to CoverageOutputGenerator in order for bazel to merge all these intermediate reports. There are plans to have a general mechanism for coverage in Starlark rules (#5885) but until then we have to rely on the _lcov_merger attribute.

Yeah, can we fix coverage to not try to invoke lcov if the rule doesn't have the lcov attribute?

I wouldn't say this is a thing that needs to be fixed. Reiterating, lcov_merger is a tool that merges lcov files, which is bazel's standard coverage format. Ideally all tests generate a coverage report in the lcov format and bazel takes care of merging them and generating a combined report.

How does go coverage work with go_test? What does it generate and what role does bazel play here?

@jayconrod
Copy link
Contributor Author

go_test currently emits coverage data in the same format that the external Go tool expects. This was implemented before Bazel had a standard coverage format, so users are expected to analyze the data manually with go tool cover -html=coverage.dat. We declare an _lcov_merger attribute on go_test as a workaround for this issue (the coverage wrapper script fails without it if /usr/bin/lcov was not installed). However, this attribute just points to a trivial script that exits 0.

So I think the long term solution for this is for go_test to emit lcov data, instead of go tool cover data. Does it makes sense for us to point _lcov_merger to a program that translates from the Go format to lcov, or should we do that internally and point _lcov_merger to CoverageOutputGenerator? Or will _lcov_merger need to be set at all?

I'm not sure what we should do in the short term. Is there a way we can keep writing non-lcov coverage data to COVERAGE_OUTPUT_FILE for users to analyze with go tool cover if we don't expect it to be merged with non-Go coverage data?

@benjaminp
Copy link
Collaborator

My problem is that if you have a coverage-enabled test and don't define a _lcov_merger attribute, Bazel will still try to invoke the lcov merger in collect_coverage.sh. It would be nice to completely opt of that.

@iirina
Copy link
Contributor

iirina commented Jun 27, 2019

Thanks @jayconrod @benjaminp for your explanations! Answers inline below.

So I think the long term solution for this is for go_test to emit lcov data, instead of go tool cover data.

I agree this should be the long term solution and fits well with #5885.

Does it makes sense for us to point _lcov_merger to a program that translates from the Go format to lcov, or should we do that internally and point _lcov_merger to CoverageOutputGenerator? Or will _lcov_merger need to be set at all?

This depends on the design of the Starlark mechanism for coverage. My initial idea was for every language to define a tool that converts from their format to lcov so that we can have consistency across the rules sets. _lcov_merger can point to CoverageOutputGenerator but since right now it doesn't merge any data for rules_go there is no point in starting it up.

I'm not sure what we should do in the short term. Is there a way we can keep writing non-lcov coverage data to COVERAGE_OUTPUT_FILE for users to analyze with go tool cover if we don't expect it to be merged with non-Go coverage data?

You can still do that for the foreseeing future.

My problem is that if you have a coverage-enabled test and don't define a _lcov_merger attribute, Bazel will still try to invoke the lcov merger in collect_coverage.sh. It would be nice to completely opt of that.

That would go against standardizing bazel's coverage output. Even if it's possible right now to output different formats in COVERAGE_OUTPUT_FILE we'd still like to keep moving towards a standard.

@benjaminp
Copy link
Collaborator

My problem is that if you have a coverage-enabled test and don't define a _lcov_merger attribute, Bazel will still try to invoke the lcov merger in collect_coverage.sh. It would be nice to completely opt of that.

That would go against standardizing bazel's coverage output. Even if it's possible right now to output different formats in COVERAGE_OUTPUT_FILE we'd still like to keep moving towards a standard.

What if the rule can produce lcov output natively without the help of lcov merger? The current situation is that Bazel tries to invoke a tool that it purposely didn't add as an input to the test. That's not a useful behavior.

@iirina
Copy link
Contributor

iirina commented Jul 2, 2019

Especially if your rule produces lcov natively you should specify the lcov merger tool. If your test generates multiple lcov reports then lcov_merger can find and merge them.

The current situation is that Bazel tries to invoke a tool that it purposely didn't add as an input to the test.

Bazel fails the coverage run if it doesn't find lcov-merger. The behaviour when bazel was invoking lcov (the tool) when LCOV_MERGER was not defined is no longer there. It was removed in 190d4f8. Bazel doesn't invoke lcov at all anymore now, it uses only gcov for C++ coverage.

@benjaminp
Copy link
Collaborator

Bazel fails the coverage run if it doesn't find lcov-merger. The behaviour when bazel was invoking lcov (the tool) when LCOV_MERGER was not defined is no longer there. It was removed in 190d4f8. Bazel doesn't invoke lcov at all anymore now, it uses only gcov for C++ coverage.

What the see is that if coverage collection is enabled for a target, Bazel preprends tools/test/collect_coverage.sh to the test command. This script fails if $LCOV_MERGER doesn't exist. If the target does not have an implicit $lcov_merger attribute, $LCOV_MERGER will not be set, so the test action will be set up to fail.

@fmeum
Copy link
Collaborator

fmeum commented Sep 10, 2022

@lberki I think this can be closed. Starlark rules can now add the required dependency in the following way without incurring any overhead for bazel test:

"_lcov_merger": attr.label(
    default = configuration_field(fragment = "coverage", name = "output_generator"),
    cfg = "exec",
)

rules_go emits lcov that is then picked up by the merger tool.

Copy link

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 90 days unless any other activity occurs or one of the following labels is added: "not stale", "awaiting-bazeler". Please reach out to the triage team (@bazelbuild/triage) if you think this issue is still relevant or you are interested in getting the issue resolved.

@github-actions github-actions bot added the stale Issues or PRs that are stale (no activity for 30 days) label Nov 15, 2023
@fmeum fmeum closed this as completed Nov 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
coverage P3 We're not considering working on this, but happy to review a PR. (No assignee) stale Issues or PRs that are stale (no activity for 30 days) team-Rules-Server Issues for serverside rules included with Bazel type: bug
Projects
None yet
Development

No branches or pull requests

6 participants