Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: make bootstrap_impl=script compute correct directory when RUNFILES_MANIFEST_FILE set #2177

Conversation

scasagrande
Copy link
Contributor

@scasagrande scasagrande commented Sep 3, 2024

The script-based bootstrap wasn't computing the correct runfiles directory when
RUNFILES_MANIFEST_FILE was set. The path it computed stripped off the manifest
file name, but didn't re-add the .runfiles suffix to point to the runfiles
directory.

To fix, just re-append the .runfiles suffix after it removes the manifest file
name portion of the path.

Reproducing this is a bit tricky and it's difficult to reproduce the necessary build
flags in a test; all of the following must be met:

  • --enable_runfiles=false, but this cannot be set by transitions, only via command line
  • --build_runfile_manifests=true (this can be set in a transition, but see below)
  • Due to RUNFILES_MANIFEST_FILE is unset inside the sandbox bazel#7994, even if a manifest is created,
    the RUNFILES_MANIFEST_FILE env var won't be set unless the test strategy is local
    (i.e. not sandboxed, which is the default).

To work around those issues, the test just recreates the necessary envvar state and
invokes the binary. The underlying files may not exist, but that's OK for the code
paths were testing.

Fixes #2186

@aignas
Copy link
Collaborator

aignas commented Sep 4, 2024

You can add a test in https://github.com/bazelbuild/rules_python/tree/main/tests/base_rules

@scasagrande
Copy link
Contributor Author

scasagrande commented Sep 4, 2024

Okay I've been looking through the existing tests and I must admit I am a little lost here.

I took a look at tests/bootstrap_impls and I see the we already have a test for a target having a data dependency on a py_binary target: https://github.com/bazelbuild/rules_python/blob/main/tests/bootstrap_impls/BUILD.bazel#L47

However my change is handling the case where RUNFILES_DIR is not defined. Instead the bootstrap script determines the runfiles dir based on RUNFILES_MANIFEST_FILE, which should be a sibling to the .runfiles folder.

Can you help me out with what you'd want to see for testing this?

@scasagrande
Copy link
Contributor Author

Maybe my first step is just to make a separate minimal reproduction of the issue, and then we can boil that down into a test.

I'll see what I can do there, but I also just fixed my original issue another way. My scenario is as follows:

bazel run rust_binary target -> dep on rust_library -> data dep on py_binary

py_binary is called essentially like this

    let rf = runfiles::Runfiles::create()?;

    let path = runfiles::rlocation!(
        rf,
        [
            "_main",
            "path",
            "to",
            "target"
        ]
        .iter()
        .collect::<PathBuf>()
    );

    let output = std::process::Command::new(path)
        .env_remove("PYTHONPATH")
        .output()?;

Which results in the error. If I use my patch, then it works. Or if I modify my process call as follows, then it also works:

    let output = std::process::Command::new(path)
        .env_remove("PYTHONPATH")
        .env_remove("RUNFILES_MANIFEST_FILE")
        .output()?;

@aignas
Copy link
Collaborator

aignas commented Sep 4, 2024

I think the second code snippet is the correct fix and no changes are needed to rules_python.

@scasagrande
Copy link
Contributor Author

okay, no worries :)

although perhaps some tests are needed in this area anyways. If I can make this change and not impact any existing tests it makes me believe that there is a coverage gap

@scasagrande scasagrande closed this Sep 4, 2024
@rickeylev
Copy link
Contributor

My initial read here is that there is a legit bug. Does this only happen with bootstrap=script and not bootstrap=system_python, or does it happen with both? The part that catches my eye is that there's an absolute local file system path to something in runfiles, but it doesn't have the binary.runfiles path component -- that seems clearly wrong.

For binaries-nested-in-binaries its hard to tell whether the problem is on the caller or callee side. In general you shouldn't have to modify the environment when calling the binary in order for it to work.

The expectation is, given e.g. a test with a binary in data, the test can invoke the binary and have it Just Work for both bazel test :outer and bazel build :outer; bazel-bin/outer. (I use a test--data-->bin as an example, any executable--data-->executable should behave similarly)

Under the hood, it's assumed that the two binary's runfile trees are merged. Hence the inner binary will find the runfiles manifest created by the outer binary, which is OK, because that manifest is the union of both binaries. Similarly, the inner binary is going to find the outer binary's runfiles directory (e.g. bazel-bin/outer.runfiles) and identify that as the runfiles root to use (because there is only one runfiles root).

@scasagrande
Copy link
Contributor Author

Does this only happen with bootstrap=script and not bootstrap=system_python, or does it happen with both?

only with script

For binaries-nested-in-binaries its hard to tell whether the problem is on the caller or callee side. In general you shouldn't have to modify the environment when calling the binary in order for it to work.

agreed all around.

The expectation is, given e.g. a test with a binary in data, the test can invoke the binary and have it Just Work

agreed

@aignas
Copy link
Collaborator

aignas commented Sep 5, 2024

Thank you @rickeylev for raising #2186.

Looking at this again with a fresh set of eyes, it does look like this is a
correct fix, but it would be also good to add a similar fix on line 50. It
does seem that all of the other code has .runfiles suffix in the echo
statements and these two were missing.

@scasagrande scasagrande reopened this Sep 5, 2024
Copy link
Contributor

@rickeylev rickeylev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I also added a test.

@rickeylev rickeylev changed the title fix: bootstrap template runfiles dir fix: make bootstrap_impl=script compute correct directory when RUNFILES_MANIFEST_FILE set Sep 5, 2024
@scasagrande
Copy link
Contributor Author

double checked your test locally, and verified that when I remove the fix from this PR that the test correctly errors with the same "unable to find python3" error:

==================== Test output for //tests/bootstrap_impls:run_binary_bootstrap_script_zip_no_test:
env: /private/var/tmp/_bazel_steven/e3e538e5f92c867c7a03c82bec7d4cfa/sandbox/darwin-sandbox/51/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/tests/bootstrap_impls/run_binary_bootstrap_script_zip_no_test/_main~python~python_3_11_aarch64-apple-darwin/bin/python3: No such file or directory
Test case failed: using RUNFILES_MANIFEST_FILE with output manifest
expected output to match: Hello
but got:\n

thank you for adding a test to cover this!

@rickeylev rickeylev added this pull request to the merge queue Sep 5, 2024
@rickeylev rickeylev removed this pull request from the merge queue due to a manual request Sep 5, 2024
@rickeylev rickeylev added this pull request to the merge queue Sep 5, 2024
Merged via the queue into bazelbuild:main with commit 65d1326 Sep 5, 2024
4 checks passed
@scasagrande scasagrande deleted the fix/boostramp-template-runfiles-py-binary branch September 5, 2024 17:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bootstrap_impl=script doesn't handle RUNFILES_MANIFEST_FILE
3 participants