Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bazel 6.X.X - runfiles output-directories does not respect workspace-name anymore #18128

Open
PieterBunnenberg opened this issue Apr 18, 2023 · 7 comments
Assignees
Labels
P2 We'll consider working on this in future. (Assignee optional) team-Documentation Documentation improvements that cannot be directly linked to other team labels team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: bug

Comments

@PieterBunnenberg
Copy link

Description of the bug:

Starting with Bazel 6, the output directory for runfiles does not use the worskpace-name for the sub-directory anymore.

From the documentation:

foo/bar/baz.runfiles/ <== The runfiles symlink farm for the //foo/bar:baz executable.
MANIFEST
<workspace-name>/

With Bazel 5.X.X:
foo/bar/baz.runfiles/<workspace-name>

With Bazel 6.X.X:
foo/bar/baz.runfiles/_main

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

No response

Which operating system are you running Bazel on?

Linux

What is the output of bazel info release?

release 6.1.0

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

@fmeum
Copy link
Collaborator

fmeum commented Apr 18, 2023

This should only be the case with --enable_bzlmod, where it is expected behavior: The WORKSPACE file is destined to go away eventually and the way Bzlmod uses repository mappings leads to ambiguities if users could specify the name of this directory directly.

If you use the runfiles libraries (e.g. @bazel_tools//tools/cpp/runfiles), you can still use your workspace name in paths and the library will resolve that to the correct path.

@Wyverald Should we update the docs?

@Pavank1992 Pavank1992 added the team-Documentation Documentation improvements that cannot be directly linked to other team labels label Apr 18, 2023
@PieterBunnenberg
Copy link
Author

@fmeum Thanks for clarification. The flag --enable_bzlmod is used within the project. So, updating the docs might be a good idea. :)

@keertk keertk added the team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. label May 1, 2023
@meteorcloudy meteorcloudy added P2 We'll consider working on this in future. (Assignee optional) and removed untriaged labels May 2, 2023
@aiuto
Copy link
Contributor

aiuto commented Oct 27, 2023

This is a fairly severe breaking change. It becomes impossible to write some code that is backwards compatible to bazel versions before bzlmod.

The basic difference is where files are based. Let's say I have a WORKSPACE named "my_project". Under it I have tests/BUILD, which produces the target //tests/data.txt, which is consumed by //tests/data_gen_test.

Under the old scheme, that file would be in
/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/tests/data_gen_test.runfiles/my_project/tests/data.txt.
Under the bzlmod, that file would appear in.
/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/tests/data_gen_test.runfiles/_main/tests/data.txt.

Problem 1: python runfiles

@fmeum pointed out that you need to use the runfiles module for python. While that is true, it is not clear if you should use @bazel_tools//tools/python/runfiles or if you should use @rules_python//python/runfiles. There are some differences. I assert that users should prefer to use runfiles from rules_python. That allows you to write code using runfiles that is not only used within bazel biulds and tests. You could tar up build tree and provide a binary that works outside of the bazel environment. Otherwise, code that reads data files might need distinct versions for test harness vs. production. So, we have.

BUILD

py_test("data_gen_test", srcs=[...], data = ["data.txt"],  deps = ["@rules_python//python/runfiles"])

data_gen_test.py

 from rules_python.python.runfiles import runfiles

This worked until I tried --enable_bzlmod on a bazel 7.x build. The import must change to
from python.runfiles import runfiles. This is a big loss. We lost the namespacing of where the library comes from. The potential for clashes across modules increases because the module name is no longer part of the namespace.

Problem 2: rules_python runfiles might not be doing the trick to find _main.

The previous correct usage to find a file path was runfiles.Create().Rlocation('my_project/tests/data.txt').
Even when I can load the right runfiles the paths are still wrong. Not clear what is right. Should it be "_main/tests/data.txt" (obviously not, that is ugly). It seems, however, that is what I must to with @bazel_tools//python/runfiles. Not sure about rules_python's one yet.

Problem 3: What happens when you vendor in modules?

OK, this might be a google only problem, we take my_package and vendor it in. runfiles will be wrong when we do that. That is understandable. But let's make it easy to fix on import. One can do that with a copybara transform of something like "rules_pkg/" => "third_party/rules/rules_pkg/". That is harder when we don't have the package name to be the key of the regex replacement.

@aiuto
Copy link
Contributor

aiuto commented Oct 27, 2023

@rickeylev This is what has been bitting me all this week. I would love to hear your opinion of what Python runfiles one should use.

@fmeum
Copy link
Collaborator

fmeum commented Oct 27, 2023

I would also be interested in @rickeylev's opinion. I would personally say we should:

  1. Adopt @rules_python//python/runfiles as the recommended runfiles library, which also ensures that runfiles paths that worked before Bzlmod (using the workspace name) will continue to work.
  2. Terminally deprecate the library in bazel_tools.
  3. Officially recommend that all Python projects should have their sources under a descriptively named top-level directory. Since WORKSPACE names are ultimately user-specified and only aligned via conventions, I wouldn't consider them ideal identifiers for import paths anyway.

@rickeylev
Copy link
Contributor

what Python runfiles one should use.

The rules_python runfiles library should be used. I'm actually not sure if the one in bazel_tools is as functional; I don't exactly remember when the code was copied over or what changes were put into only rules_python and not bazel_tools. In any case, the one in bazel_tools shouldn't be used.

should it be _main or my_project

Either should work. The runfiles library takes the first component and looks it up in the repo mapping file ($runfiles/_repo_mapping, iirc).

The import must change to from python.runfiles import runfiles. This is a big loss. We lost the namespacing of where the library comes from.

This "namespacing" was always a mistake. Implicitly adding the build system's concept of the "location" of a library to the language's runtime identity of a library is a bad thing. The equivalent in java would be e.g. import rules_java.com.google.whatever.Foo -- clearly not a good thing.

Repo names are not stable. Under workspace, they are user selected. Under bzlmod they are munged, plus there is the repo-mapping, plus multi-version-override stuff.

Officially recommend that all Python projects should have their sources under a descriptively named top-level directory.

Yes, that'd be one option, though that also has issues -- it relies on the repo root directory being added to sys.path, which can become exceedingly long (hence why we will eventually disable that by default). The suggested layout in the larger Python community is to have a top-level "src" directory, with everything below that, e.g. src/foo. Regardless, the key point is to more deliberately choose the top-level import name and the sys.path entry that is added via py_library.imports.

@rickeylev
Copy link
Contributor

Sorry, forgot to mention, you can use multiple imports if you need to support multiple versions of bazel and/or rules_python. Something like:

try:
  from python.runfiles import runfiles  # Case 1
except ImportError:
    from rules_python.python.runfiles import runfiles # Case 2
  • Case 1 covers rules_python version 0.18.0 or higher (I think).
  • Case 2 covers before 0.18.0

If you want to use the published pypi package, you can do that, too: import runfiles and depend on bazel-runfiles from pypi. Depending via pypi isn't necessarily better than depending via bazel.

fzakaria added a commit to fzakaria/stablehlo that referenced this issue Feb 13, 2024
Please see bazelbuild/bazel#18128
Workspace names are going away.
fzakaria added a commit to fzakaria/stablehlo that referenced this issue Feb 13, 2024
Please see bazelbuild/bazel#18128
Workspace names are going away.
fzakaria added a commit to openxla/stablehlo that referenced this issue Feb 13, 2024
Major change: Bump Bazel to the new LTS at 7.X

* Included a default Python toolchain 3.10

  There is a problem at the moment running lit using Python311 where it
  cannot find the module.

  Our CI system already enforces Python310 to get around it but we should
  set the toolchain in Bazel itself to avoid this problem.
  I had already filed llvm/llvm-project#75963

* I moved skylib to MODULE.bazel to test the MODULE portion of Bazel is
working. A remaining action item is to move LLVM itself to MODULE.bazel
which would preclude needing to define skylib since its a dependency of
LLVM.

* Workspace names are no longer named in Bazel 7+
  
  I had to change stablehlo to _main to fix the runfiles directory.
  Please see bazelbuild/bazel#18128

We should consider using
https://github.com/bazelbuild/rules_python/blob/main/python/runfiles/runfiles.py#L262
instead to get the runfiles directory for future proofing.

Fixes #1878

---------

Co-authored-by: mlevesquedion <mlevesquedion@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P2 We'll consider working on this in future. (Assignee optional) team-Documentation Documentation improvements that cannot be directly linked to other team labels team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: bug
Projects
None yet
Development

No branches or pull requests

10 participants