feat(coverage): Register coverage.py to hermetic toolchains #977
Conversation
This is mostly building on the support added in bazelbuild/bazel@9d01630.
python/private/coverage_deps.bzl
Outdated
http_archive,
    name = name,
    build_file_content = """
py_library(
I'm not sure this can be a py_library. Doesn't that lead to a circular dependency when you add it as a dependency for the python_runtime? Or is the python toolchain not actually a dependency of py_library targets (as opposed to py_binary)?
coverage_tool needs to be a single file from the toolchain's point of view, and that is why there is only a single src in there. I was following the official docs from https://bazel.build/configure/coverage#python, and since the manual tests with bazel coverage worked in some cases, I thought that was fine?
I guess so! I must have been thinking about the prohibition on using py_binary there (because py_binary does depend on the python toolchain to assemble the wrapper script, and that of course depends on the coverage tool).
TIL, thanks for explaining. Should we have an extra line of docs somewhere?
I would suggest not using py_library for now -- it implies a set of functionality that simply isn't respected (e.g. imports attribute won't be respected, anything in its PyInfo is ignored, etc).
It's currently possible to put a py_library there, but is a bit risky. This is because py_library doesn't, today, depend on the toolchain. However, I don't see how it can avoid that forever (in particular, I'm thinking of for when pyc precompilation is added).
I do think a py_library-like-equivalent dependency is the correct conceptual type of dependency that should be there, though. Running with code coverage is, essentially, adding an extra dependency to the binary and wrapping the regular entry point in another entry point. Unfortunately, it's not immediately clear to me how to make that work cleanly via (or with support from) the toolchain (I have a few ideas, but haven't thought them through).
To use the coverage settings of the toolchain, you need to:
- Point coverage_tool to a single file. Only this file is consumed. Any additional files, runfiles, dependencies, etc aren't used.
- Put any files coverage_tool needs in coverage_files, which will be added to the binary's runfiles.
I mean filegroup is clearly documented such that srcs is in the files output for the rule (which we need) and data is in the runfiles. I don't think we need to make it more complicated than that (it is also true that bazel will treat such a filegroup as an executable target, however we don't think we actually depend on that behavior).
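A minimal sketch of what that filegroup could look like — the target name, paths, and visibility label below are illustrative placeholders, not the actual targets in this PR:

```starlark
# Hypothetical BUILD sketch: srcs land in the rule's default files output
# (the single entry point the toolchain consumes), while data lands in
# the runfiles that travel alongside it.
filegroup(
    name = "coverage",
    srcs = ["coverage/__main__.py"],
    data = glob(["coverage/**/*.py", "coverage/**/*.so"]),
    # Restrict visibility so only the toolchain can reference this
    # bag of platform-specific files.
    visibility = ["@rules_python//python:__subpackages__"],
)
```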
Er, sorry, I mixed up the py_runtime and PyRuntimeInfo apis. Disregard the "you have to specify two attributes" part. The splitting of the target into (entry point, files) is handled by py_runtime.
Yes, +1 to filegroup.
I don't think we need to make it more complicated than that
I think you're referring to my "it should be a py library-like-equivalent dependency" comment?
There's a big distinction between "bundle of pre-built, platform-specific files" and a "dependency" (for lack of a better term). The former carries no information other than "here's a bag of files". A dependency can carry additional information, such as modifications to make to the import path, additional linker flags, propagating a proper runfiles object (we currently omit symlinks, for example), or other necessary metadata.
But, let's not derail this PR with an extended discussion of that? Happy to discuss in more detail on Slack or in an issue.
it is also true that bazel will treat such a filegroup as an executable target, however we don't think we actually depend on that behavior
"depend" is a strong word, but we do allow the "executable" of the target to be an indicator as to what the entry point is. The basic logic for converting the Target to the (entry point, files) is:
- If there is one output, use that.
- Otherwise, use whatever the executable file is.
- Otherwise, error.
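The fallback order above can be sketched in plain Python. This is a simplified stand-in, not the actual implementation: the real logic operates on Bazel Target objects, and the function and parameter names here are hypothetical.

```python
def pick_entry_point(outputs, executable=None):
    """Resolve a target to its coverage entry point file.

    Mirrors the fallback order described above:
    1. a single output wins,
    2. otherwise fall back to the target's executable file,
    3. otherwise it is an error.
    """
    if len(outputs) == 1:
        return outputs[0]
    if executable is not None:
        return executable
    raise ValueError("coverage_tool must have a single file or an executable")
```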
I don't think we need to make it more complicated than that
I think you're referring to my "it should be a py library-like-equivalent dependency" comment?
There's a big distinction between "bundle of pre-built, platform-specific files" and a "dependency" (for lack of a better term). The former carries no information other than "here's a bag of files". A dependency can carry additional information, such as modifications to make to the import path, additional linker flags, propagating a proper runfiles object (we currently omit symlinks, for example), or other necessary metadata.
Yes, and in this particular case, so long as the target has its visibility restricted to the toolchain, a bag of pre-built platform-specific files is exactly what this is: the py_runtime rule won't pay any attention to additional providers from the target, and if it did honor e.g. imports from this input, I can't see a situation where that wouldn't do more harm than good. But this is a very good reason to make sure the visibility is appropriately restricted.
"depend" is a strong word, but we do allow the "executable" of the target to be an indicator as to what the entry point is. The basic logic for converting the Target to the (entry point, files) is:
- If there is one output, use that
- otherwise, use whatever the executable file is
- Otherwise, error
Yes, I do remember how I implemented that. But in this case I was referring to the fact that a filegroup with a single file in srcs will have a FilesToRun provider supplying that file as an executable, which would make the second option you list above work; but we don't depend on that behavior (which is, admittedly, a little weird) because it'll stop at the first option in this case.

At any rate, if a filegroup is what we want to use here, then the upstream docs in the bazel repo should maybe be updated? Except that there might be use cases for having it as a py_library in the general case; just in this particular case we don't need it to be, and by using restricted visibility and a filegroup we can afford to not think about all of the ways that could go wrong.
I changed it to use filegroup, so it should hopefully be an improvement over py_library, and we can figure out something better in a separate PR.
This is awesome! I was going to work on it this week. I did a pass on your code so far and will do a more thorough review today.
python/private/coverage_deps.bzl
Outdated
py_library(
    name = "coverage",
    srcs = ["coverage/__main__.py"],
    data = glob(["coverage/*", "coverage/**/*.py", "coverage/*.so"]),
This is moot after removing py_library, but:
- .py files don't belong in data. Instead, a separate py_library should be factored out with proper srcs.
- The "coverage/*" glob is probably too general and overlaps with "*.so". In general, a "*" or "**" glob should only be done when you need every file in the source tree (e.g. some sort of source distribution zip), or if you have a data directory and you know it only contains files that should be included.
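A hedged sketch of the suggested factoring — the target name and glob patterns below are illustrative, not what the PR ended up using:

```starlark
# Hypothetical BUILD sketch: Python sources go into the py_library's
# srcs, and only the non-Python extension modules remain in data,
# avoiding overlapping globs.
py_library(
    name = "coverage_lib",
    srcs = glob(["coverage/**/*.py"]),
    data = glob(["coverage/**/*.so"]),
)
```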
Done, please take a look again.
python/private/coverage_deps.bzl
Outdated
srcs = ["coverage/__main__.py"],
    data = glob(["coverage/*", "coverage/**/*.py", "coverage/*.so"]),
    exec_compatible_with = {compatible_with},
    visibility = ["//visibility:public"],
Should this be public visibility? It's really only the toolchain that should be referencing it. It'd be rather hard to use it correctly because it's just a group of platform-specific files.
python/repositories.bzl
Outdated
config_setting(
    name = "coverage_enabled",
    values = {{"collect_code_coverage": "true"}},
)
oh awesome, thank you for adding this!
Can you restrict visibility of this? It's not the py rule's responsibility to tell if coverage is enabled or not, so let's not expose a way for people to use it and think it is.
Done.
python/repositories.bzl
Outdated
@@ -255,13 +267,15 @@ py_runtime_pair(
    python_path = python_bin,
    python_version = python_short_version,
    python_version_nodot = python_short_version.replace(".", ""),
    coverage_tool = rctx.attr.coverage_tool if rctx.attr.coverage_tool == None and "windows" not in rctx.os.name else "\"{}\"".format(rctx.attr.coverage_tool),
I think this logic has a bug? If tool=None and os=windows, then it'll do format(None), resulting in "None", which isn't a valid value. Is this supposed to be "or" instead of "and"?
Thanks, there was a bug; fixed it now.
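A plain-Python sketch of the corrected condition — the function and argument names are hypothetical simplifications of the repository rule context, not the actual code. The tool is dropped entirely on Windows or when unset, and quoted otherwise, so format(None) can never be emitted:

```python
def render_coverage_tool(coverage_tool, os_name):
    """Render the coverage_tool attribute for the generated BUILD content.

    Returns None (no coverage tool) on Windows or when no tool was
    configured; otherwise returns the label wrapped in quotes.
    """
    if coverage_tool is None or "windows" in os_name:
        return None
    return '"{}"'.format(coverage_tool)
```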
python/private/coverage_deps.bzl
Outdated
http_archive, | ||
name = name, | ||
build_file_content = """ | ||
py_library( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Er, sorry, I mixed up the py_runtime and PyRuntimeInfo apis. Disregard the "you have to specify two attributes" part. The splitting of the target into (entry point, files) is handled by py_runtime.
@@ -1 +1,3 @@
common --experimental_enable_bzlmod

coverage --java_runtime_version=remotejdk_11
Why is this needed?
My guess would be because they don't have java installed on their dev machine, or at least not in a way that bazel is able to find it, and including it in the PR was accidental.
Should I remove it? I installed bazel via bazelisk and it does not pull in any java dependencies by default. What is the official way to do this?
So according to this: https://bazel.build/docs/bazel-and-java#hermetic-testing
What this basically means is instead of using a locally installed JDK, it uses one it downloads (it sounds similar to how we're doing things).
So this is probably a net positive for getting started at the cost of having to download a JDK. I think that's worth it, so go ahead and leave it in.
# Update with './tools/update_coverage_deps.py <version>'
#START: managed by update_coverage_deps.py script
_coverage_deps = {
Ah hm. Something I just realized is we run the risk of conflicting with a user-installed coverage library. e.g.:
- the toolchain uses coverage X
- the user, in their requirements, uses coverage Y

At runtime, we'll have 2 different coverage libraries available, and it's not well defined which will be used.
I'm not sure what we can do about this, though. I guess add an arg to toolchain creation and dependency setup that allows the user to specify the coverage target to use, and skip having one automatically installed? Or...IDK. The key attribute we're looking for is that the toolchain and user binary reference the same target (or equiv; that we only end up with 1 coverage library in runfiles).
That would be an argument in favor of using py_library as the target type in the remote for coverage_dep, so things other than the toolchain could use it.

I don't think there's a way around having the coverage tool used by the toolchain be potentially different from what's declared in requirements, because you can't actually do a pip install until the toolchain is configured. I don't think it's a dealbreaker, though; it might be a little surprising to find out that coverage is running with a version of coveragepy distinct from what was in your requirements.txt, but at the same time I don't think it'll actually interfere with anything.
For now, can you default register_coverage_deps=False? Otherwise we'll potentially cause issues with people who have coverage in their deps already, and we don't really have any solution or workaround available to them.

Yeah, we were kicking around ideas on how to best address this. I'm not sure there is much to be done for this in this PR. The only thing I can think of would be to change the register_coverage_tool arg from a bool to something that allows the caller to pass in their own target.
you can't actually do a pip install until the toolchain is configured.
Hmmm, well, what if you could? If we had a separate "pip tools" or "requirements resolution" toolchain, then maybe we'd be able to do that. Just thinking out loud.
The two more appealing ideas we came up with were:
- Making sure the toolchain's coverage was last in the import order (probably by modifying the bootstrap, when the bootstrap_template arg becomes available). That would at least give a well-defined order for which version gets used and lets the user have more control over the version used.
- Have the toolchain only ship the coverage tool entry point, not the coverage library itself; each test is responsible for bringing the necessary deps for the coverage tool (and try to figure out how to make that as easy/transparent to test writers as possible).
@unittest.skipIf(%is_windows%, "not supported on Windows")
def test_coverage_toolchain(self):
    subprocess.run("bazel coverage //...", shell=True, check=True)
Is there another way to test this? I was actually looking to remove this run_acceptance_test thing because it prevents upstream bazel from running our tests with upcoming bazel versions.
I am not sure about that. I was using what was available, but if you have a suggestion on how to configure bazelci in order to run coverage, I would be happy to remove the changes in this file.
I think we just have to add coverage_targets: <targets> to the presubmit.yml? But eh, I'm ok with doing that in a separate PR.
Reverted modifications to this file and relying on the coverage_targets in the examples for now.
@@ -53,6 +53,15 @@ class TestPythonVersion(unittest.TestCase):
    subprocess.run("bazel test //...", shell=True, check=True)

    stream = os.popen("bazel info execution_root")
Please use the subprocess module instead. Right now, this is leaking an opened file descriptor, which spams the logs with a warning.
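A sketch of the suggested replacement, assuming the test only needs the command's stdout (the helper name is hypothetical). subprocess.run manages the pipe and closes it for us, so no file descriptor is leaked:

```python
import subprocess

def run_and_capture(cmd):
    """Run cmd and return its stripped stdout; pipes are closed automatically."""
    result = subprocess.run(
        cmd,
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,            # decode output to str
        check=True,           # raise CalledProcessError on non-zero exit
    )
    return result.stdout.strip()

# e.g. execution_root = run_and_capture(["bazel", "info", "execution_root"])
```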
Done.
- Only set the tooling if it is actually needed.
- By default it is set to None.

The user can toggle the coverage_tool setting by passing `register_coverage_tool=(True|False)` to the `python_register_toolchains` or `python_register_multi_toolchains` call, or by specifying the `coverage_tool` label as described in the `versions.bzl` file.

Update tests to:
- ensure that we can still use the toolchain as previously.
- ensure that we are not downloading extra deps if they are not needed.
This hopefully makes everything more watertight and will help avoid bugs with misconfiguration.
It seems that the support is missing upstream [3]. [3]: bazelbuild/bazel#15835
The `__init__.py` files in the root of the WORKSPACE in `bzlmod` break when running under `bazel coverage //:test`. However, it started working when I renamed `__init__.py` to `lib.py`. I suspect this has to do with the layer of indirection that `coverage` introduces.
The `multi_python_versions` example cannot show coverage for the more complex tests that are using `subprocess`. I am wondering if this is related to the fact that we are including `coverage.py` via the toolchain and not through other mechanisms [2]. [2]: https://bazel.build/configure/coverage
I am not sure how I can translate the config constraint to a build environment here.
The logic should be that we set the `coverage_tool` to None if it is Windows or the tool has not been set.
This changes how we install the coverage dependencies and allows us to set the visibility for the coverage tool correctly by passing the name of the toolchain repo during the repository function execution. bzlmod does not have this due to the fact that the dependency registering is happening in a different way/order.
Now we store everything in a nested dict of tuples, which makes the lookup of supported versions slightly cleaner.
We rely now on the bazelci config instead
btw, re: coverage of subprocesses not working: in the bootstrap, there's some code that unsets some coverage variables. I think keeping them set is necessary to propagate coverage things to subprocesses.
…ld#977)

This allows including the coverage package as part of the toolchain dependencies, which is mixed into a test's dependencies when `bazel coverage` is run (if coverage is not enabled, no extra dependency is added).

For now, it's disabled by default because enabling it poses the risk of having two versions of coverage installed (one from the toolchain, one from the user's dependencies). The user can toggle the coverage_tool setting by passing `register_coverage_tool=(True|False)` to the `python_register_toolchains` or `python_register_multi_toolchains` call, or by specifying the `coverage_tool` label as described in the `versions.bzl` file.

Use coverage.py v6.5.0 because the latest has `types.py` in the package directory, which imports from Python's stdlib `types` [1]. Somehow the Python interpreter thinks that `from types import FrameType` refers to the currently interpreted file and everything breaks. I would have expected the package to use absolute imports and only attempt to import from `coverage.types` if we use `coverage.types` and not just a plain `types` import.

NOTE: Coverage is only for non-windows platforms.

Update tests to:
- ensure that we can still use the toolchain as previously.
- ensure that we are not downloading extra deps if they are not needed.

* Also changes the project's bazelrc to use a remotejdk, which makes it easier for contributors because they don't have to locally install a jdk to get going.

[1]: https://github.com/nedbat/coveragepy/blob/master/coverage/types.py
[3]: bazelbuild/bazel#15835
PR Checklist
Please check if your PR fulfills the following requirements:
`bazel coverage`.

PR Type
What kind of change does this PR introduce?
What is the current behavior?
The `coverage_tool` cannot be included via the toolchain.

Issue Number: #43
What is the new behavior?
Summary:
- `coverage_tool` attribute in `python_repository`.
- `coverage.py` when running the `pip_install_dependencies` function.
- `use_repo` includes the coverage tool, so that `bzlmod` users can use it too.
- `coverage_tool` in the toolchain code.

Does not work/bad/ugly:
- `types.py` in the package directory, which imports from Python's stdlib `types`. Somehow the Python interpreter is thinking that the `from types import FrameType` is referring to the currently interpreted file and everything breaks. I would have expected the package to use absolute imports and only attempt to import from `coverage.types` if we use `coverage.types` and not just a plain `types` import.
- The `multi_python_versions` example cannot show coverage for the more complex tests that are using `subprocess`. I am wondering if this is related to the fact that we are including `coverage.py` via the toolchain and not through other mechanisms.
- The `__init__.py` files in the root of the WORKSPACE in `bzlmod` break when running under `bazel coverage //:test`. However, it started working when I renamed `__init__.py` to `lib.py`. I suspect this has to do with the layer of indirection that `coverage` introduces.

Work towards #43.
Does this PR introduce a breaking change?
We may potentially break setups where coverage tooling is already included via the `COVERAGE_PYTHON` variable setting. In that case the users should use the flag to disable coverage_tool inclusion in the toolchain. Setting `register_coverage_tool = False` when registering toolchains would be sufficient.

Other information