Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bazel's own integration tests fail locally on Linux #20753

Open
fmeum opened this issue Jan 5, 2024 · 25 comments
Open

Bazel's own integration tests fail locally on Linux #20753

fmeum opened this issue Jan 5, 2024 · 25 comments
Assignees
Labels
P2 We'll consider working on this in future. (Assignee optional) team-Local-Exec Issues and PRs for the Execution (Local) team type: bug

Comments

@fmeum
Copy link
Collaborator

fmeum commented Jan 5, 2024

Description of the bug:

Many of Bazel's own integration tests fail locally on Linux with an error such as the following:

src/main/tools/linux-sandbox-pid1.cc:280: "The sandbox working directory cannot be below a path where we mount tmpfs (you requested mounting /tmp/bazel-working-directory/_main/_tmp/74701f16efd616fedbad9bb8b36df803/root/0a1489a1de031466aaaf200f31c5b29b/sandbox/linux-sandbox/1/execroot/_main in /tmp). Is your --output_base= below one of your --sandbox_tmpfs_path values?": Invalid argument

This appears to be due to a combination of the outer Bazel process now running with --incompatible_sandbox_hermetic_tmp and the inner Bazel process running with --sandbox_tmpfs_path=/tmp due to

build --sandbox_tmpfs_path=/tmp

Which category does this issue belong to?

No response

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

bazel test //src/test/shell/bazel:bazel_rules_cc_test --test_output=errors

Which operating system are you running Bazel on?

Linux

What is the output of bazel info release?

No response

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

No response

Is this a regression? If yes, please try to identify the Bazel commit where the bug was introduced.

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

@fmeum
Copy link
Collaborator Author

fmeum commented Jan 5, 2024

CC @lberki

@sgowroji sgowroji added the team-Local-Exec Issues and PRs for the Execution (Local) team label Jan 5, 2024
@lberki
Copy link
Contributor

lberki commented Jan 5, 2024

Let me see if I can just remove that flag from the test bazelrc. Now that we have hermetic /tmp and a new JVM, it should not be necessary, but ISTR that there were some test breakages when I tried removing it on the side.

@lberki
Copy link
Contributor

lberki commented Jan 5, 2024

Note to self: bazel test --test_output=streamed //src/test/shell/bazel:cc_integration_test --nocache_test_results --test_filter=test_env_inherit_cc_test seems to be an easy way to debug what's going on (so far, I have no clue)

@lberki
Copy link
Contributor

lberki commented Jan 5, 2024

I bet it's the recent reshuffling of /tmp (620d617 or one of the others). What I see is:

linux-sandbox \
  ... \
  -w /tmp/bazel-working-directory/_main/_tmp/b3cdc2524f2b7d337eecbf6bc2c53cb7/root/bca1883186b7bd452a3b6f6da322260c/sandbox/linux-sandbox/4/execroot/_main/_tmp/12935b5e0c672f7d902613a432271f4c \
  ...
  -M /tmp/bazel-working-directory/_main/_tmp/b3cdc2524f2b7d337eecbf6bc2c53cb7/root/bca1883186b7bd452a3b6f6da322260c/sandbox/linux-sandbox/4/_hermetic_tmp -m /tmp   ...
...
src/main/tools/linux-sandbox-pid1.cc:311: writable: /tmp/bazel-working-directory/_main/_tmp/b3cdc2524f2b7d337eecbf6bc2c53cb7/root/bca1883186b7bd452a3b6f6da322260c/sandbox/linux-sandbox/4/execroot/_main/_tmp/12935b5e0c672f7d902613a432271f4c

And my guess is that this is because $TEST_TMPDIR gets its pre-remapping value.

@lberki
Copy link
Contributor

lberki commented Jan 5, 2024

Bingo! There is a sinister comment here:

// We add this path even though it is below sandboxExecRoot (and thus already writable as a

which indicates that 'll probably have to do more digging, sigh.

@lberki
Copy link
Contributor

lberki commented Jan 5, 2024

That comment probably references this code:

and apparently that's how it creates $TEST_TMPDIR.

Arguably, $TEST_TMPDIR is set to a completely wrong value because it's set to something that starts with the execroot (outside the sandbox) and the execroot inside the sandbox is something very different. But we also have --test_tmpdir and $TEST_TMPDIR is computed in StandaloneTestStrategy which doesn't know what sandbox (if any) is in use, so can't trivially make $TEST_TMPDIR something hard-wired under /tmp, at least not easily.

Grump squared.

@lberki
Copy link
Contributor

lberki commented Jan 5, 2024

Update: I have a fix for this, but I still see widespread breakages when I remove --sandbox_tmpfs_path from the test bazelrc, mostly in Android.

There are still a few test cases in cc_integration_test so I'll start with those because I would profoundly enjoy not having to debug Android builds.

@fmeum
Copy link
Collaborator Author

fmeum commented Jan 5, 2024

The test failure in the coverage integration test looks similar to this as of now unresolved Bazel 7.0.0 regression: #20556

@lberki
Copy link
Contributor

lberki commented Jan 5, 2024

I had some scraps of time to track down test_execroot_subdir_layout_fails_for_external_subpackages.

One lesson is that the presence of --sandbox_tmpfs_path=/tmp negated the flag flip for --incompatible_sandbox_hermetic_tmp in our test battery, so the functionality was untested the whole time. Go me.

For that test case in particular, one could argue that it's WAI: it tests that builds requiring files under the external/ directory don't work (because they are overridden by external repositories). The reason why the test case fails with hermetic /tmp is that the new layout of the sandbox actually makes it possible to access those files from within the action. Of course, it's a namespace clash (external/foo/bar.cc and @foo//:bar.cc have the same execpath) which is disgusting, but one could argue that this is in fact an improvement.

There are a number of other test cases that I think fail for the same reason and Androd builds are hopelessly broken. haven't been able to figure out how because the little scraps of time I had didn't allow me to put together and Android SDK and NDK good enough for Bazel today..

@lberki
Copy link
Contributor

lberki commented Jan 7, 2024

In addition to test_execroot_subdir_layout_fails_for_external_subpackages I found at least three (no joke) different issues:

  1. $TEST_TMPDIR is not handled appropriately. Its value is set to something under the outside-of-the-sandbox execroot, which is, while available inside the sandbox, is probably wrong. It also fails spectacularly if the output base happens to be under /tmp . This is pretty easy (if moderately hacky) to fix.

  2. The Linux sandbox now relies on the parent tree artifacts of any tree file artifacts in the inputs of the action to be present in the InputMetadataProvider. Turns out, this is not guaranteed and there is one very special case where a TreeFileArtifact in fact goes without its parent tree artifact: action templates. This is also pretty easy to fix and although a call site of SkyframeActionExecutor.requiresTreeMetadataWhenTreeFileIsInput() has a warning about worse performance when it's true, I think it's only a minor performance loss so the fix is actually a one-liner.

  3. Hermetic /tmp does not work if the source tree or an external repository contains symlinks to /tmp. This is almost a tautology, but unfortunately, it is also the case for our integration tests since external repositories that are "bridged" from the outer to the inner Bazel invocation (e.g. the Android SDK) are just like that: the repository is a separate source root in both Bazel invocations, thus, the outer Bazel invocation makes it visible to the test under /tmp/bazel-source-roots/$i and $OUTPUT_BASE/external/$repo contains symlnks to that.

I'm not sure if the right approach here is to special-case /tmp/bazel-source-roots, to flip off the flag in our tests (this time, for a good reason) or to fix it in our integration testing framework.

copybara-service bot pushed a commit that referenced this issue Jan 8, 2024
This is necessary because that paths of those directories are different when seen by Bazel and by the processes within the sandbox and the sandbox interprets paths to writable directories as within the sandbox.

This is notably the case for $TEST_TMPDIR. The reason why this worked at all is that the $TEST_TMPDIR that Bazel passes to the test is relative to the working directory (it's absolutized in the test wrapper script)

Progress on #20753.

RELNOTES: None.
PiperOrigin-RevId: 596566851
Change-Id: Ifb56a3016a521b6a0cd4b5700172951d6feabddf
@lberki lberki added this to the 7.0.1 release blockers milestone Jan 8, 2024
copybara-service bot pushed a commit that referenced this issue Jan 8, 2024
…ata map.

This happens for action templates.

The reason is that the Linux sandbox needs to know where the tree file artifact is materialized to, which information is stored in its parent.

The reason for making this a flag was performance, but it's only at most one tree artifact per action, it's a constant time operation per tree artifact and it only happens rarely (on templated actions) so I think this time, simplicity is better than performance.

The surprising change to ActionInputMapHelper was needed because before this change, the tree artifact was a discovered input and therefore was processed by the code path in ActionExecutionFunction.addDiscoveredInputs(), which populated archivedTreeArtifacts but addToMap() did not. It was presumably the bug.

The depOwner argument of putTreeArtifact may also be wrong there. It's at least inconsistent with the other two call sites, but I don't want to add more polish at some risk to this change so I decided to not add it to addArchivedTreeArtifactMaybe().

Progress towards #20753.

RELNOTES: None.
PiperOrigin-RevId: 596586625
Change-Id: Ib690a84508da07560ec376d4e87ed8fb2211979f
@lberki
Copy link
Contributor

lberki commented Jan 8, 2024

wrt. Bazel 7.0.1, I propose we do the following:

  1. We cherry-pick c48392c and bc1d9d3
  2. poke at our integration test battery to see if there are any more issues that are currently hidden by the fact that we in fact use /tmp in such a way that is specifically contraindicated with --incompatible_sandbox_hermetic_tmp

If (2) doesn't make any more skeletons fall out from the closet, we call the fire drill done. If it does, well, that's unfortunate.

@meteorcloudy WDYT?

@iancha1992
Copy link
Member

@bazel-io fork 7.0.1

@iancha1992
Copy link
Member

@bazel-io fork 7.1.0

@lberki
Copy link
Contributor

lberki commented Jan 9, 2024

Yet Another issue: it looks like deploy jar actions don't quite work. The symptom is:

ERROR: /tmp/bazel-working-directory/_main/_tmp/f6553c75b5e5f9d32addcdb402698239/workspace/java/com/a/BUILD:1:12: Building deploy jar java/com/a/a_deploy.jar failed: (Exit 1): singlejar_local failed: error executing JavaDeployJar command (from target //java/com/a:a_deployjars_internal_rule) external/rules_java~7.3.2~toolchains~remote_java_tools_linux/java_tools/src/tools/singlejar/singlejar_local @bazel-out/k8-fastbuild/bin/java/com/a/a_deploy.jar-0.params

Quick debugging shows that the action input bazel-out/k8-fastbuild/bin/external/bazel_tools/tools/build_defs/build_info/redacted_file.properties is a symlink to /tmp/bazel-working-directory/_main/_tmp/f6553c75b5e5f9d32addcdb402698239/root/install/5c1a48ea43805eea854857f7fc475a0d/embedded_tools/tools/build_defs/build_info/templates/redacted_file.properties.template, which is not available in the sandbox of the inner Bazel instance. I haven't checked yet why. Repro is:

  mkdir -p java/com/a
  cat > java/com/a/BUILD <<'EOF'
java_binary(name="a", srcs=["A.java"])
EOF

  cat > java/com/a/A.java <<'EOF'
package a;

class A {
  public static void main(String[] args) {
    System.out.println("hello");
  }
}
EOF
  bazel build //java/com/a:a_deploy.jar

After adding --incompatible_sandbox_hermetic_tmp to and removing --sandbox_tmpfs_path=/tmp from the test bazelrc.

My guess is that this is an instance of the problem that the integration tests of Bazel do the very thing which --incompatible_sandbox_hermetic_tmp discourages, which is to use files in /tmp in the sandbox (by design, but unfortunate)

bazel-io pushed a commit to bazel-io/bazel that referenced this issue Jan 9, 2024
This makes --experimental_split_coverage_postprocessing work in the wake of bazelbuild@fb6658c: before, it was enough to add the metadata of the tree file artifacts to the metadata provider of the post-processing action, but that change made the metadata of the tree artifact necessary, too.

Progress towards bazelbuild#20753.

RELNOTES: None.
PiperOrigin-RevId: 596929659
Change-Id: I481ef36328de7f7ab07f2ec7a0ac83d5fd508c36
@iancha1992
Copy link
Member

iancha1992 commented Jan 9, 2024

@lberki
#20812
#20813

Oops. I didn't realize I had to cherry-pick all three. I'll get back to you with new PR's

bazel-io pushed a commit to bazel-io/bazel that referenced this issue Jan 9, 2024
This makes --experimental_split_coverage_postprocessing work in the wake of bazelbuild@fb6658c: before, it was enough to add the metadata of the tree file artifacts to the metadata provider of the post-processing action, but that change made the metadata of the tree artifact necessary, too.

Progress towards bazelbuild#20753.

RELNOTES: None.
PiperOrigin-RevId: 596929659
Change-Id: I481ef36328de7f7ab07f2ec7a0ac83d5fd508c36
bazel-io pushed a commit to bazel-io/bazel that referenced this issue Jan 9, 2024
This makes --experimental_split_coverage_postprocessing work in the wake of bazelbuild@fb6658c: before, it was enough to add the metadata of the tree file artifacts to the metadata provider of the post-processing action, but that change made the metadata of the tree artifact necessary, too.

Progress towards bazelbuild#20753.

RELNOTES: None.
PiperOrigin-RevId: 596929659
Change-Id: I481ef36328de7f7ab07f2ec7a0ac83d5fd508c36
iancha1992 pushed a commit to iancha1992/bazel that referenced this issue Jan 9, 2024
…ata map.

This happens for action templates.

The reason is that the Linux sandbox needs to know where the tree file artifact is materialized to, which information is stored in its parent.

The reason for making this a flag was performance, but it's only at most one tree artifact per action, it's a constant time operation per tree artifact and it only happens rarely (on templated actions) so I think this time, simplicity is better than performance.

The surprising change to ActionInputMapHelper was needed because before this change, the tree artifact was a discovered input and therefore was processed by the code path in ActionExecutionFunction.addDiscoveredInputs(), which populated archivedTreeArtifacts but addToMap() did not. It was presumably the bug.

The depOwner argument of putTreeArtifact may also be wrong there. It's at least inconsistent with the other two call sites, but I don't want to add more polish at some risk to this change so I decided to not add it to addArchivedTreeArtifactMaybe().

Progress towards bazelbuild#20753.

RELNOTES: None.
PiperOrigin-RevId: 596586625
Change-Id: Ib690a84508da07560ec376d4e87ed8fb2211979f
iancha1992 pushed a commit to iancha1992/bazel that referenced this issue Jan 9, 2024
This is necessary because that paths of those directories are different when seen by Bazel and by the processes within the sandbox and the sandbox interprets paths to writable directories as within the sandbox.

This is notably the case for $TEST_TMPDIR. The reason why this worked at all is that the $TEST_TMPDIR that Bazel passes to the test is relative to the working directory (it's absolutized in the test wrapper script)

Progress on bazelbuild#20753.

RELNOTES: None.
PiperOrigin-RevId: 596566851
Change-Id: Ifb56a3016a521b6a0cd4b5700172951d6feabddf
iancha1992 pushed a commit to iancha1992/bazel that referenced this issue Jan 9, 2024
This makes --experimental_split_coverage_postprocessing work in the wake of bazelbuild@fb6658c: before, it was enough to add the metadata of the tree file artifacts to the metadata provider of the post-processing action, but that change made the metadata of the tree artifact necessary, too.

Progress towards bazelbuild#20753.

RELNOTES: None.
PiperOrigin-RevId: 596929659
Change-Id: I481ef36328de7f7ab07f2ec7a0ac83d5fd508c36
iancha1992 pushed a commit to iancha1992/bazel that referenced this issue Jan 9, 2024
…ata map.

This happens for action templates.

The reason is that the Linux sandbox needs to know where the tree file artifact is materialized to, which information is stored in its parent.

The reason for making this a flag was performance, but it's only at most one tree artifact per action, it's a constant time operation per tree artifact and it only happens rarely (on templated actions) so I think this time, simplicity is better than performance.

The surprising change to ActionInputMapHelper was needed because before this change, the tree artifact was a discovered input and therefore was processed by the code path in ActionExecutionFunction.addDiscoveredInputs(), which populated archivedTreeArtifacts but addToMap() did not. It was presumably the bug.

The depOwner argument of putTreeArtifact may also be wrong there. It's at least inconsistent with the other two call sites, but I don't want to add more polish at some risk to this change so I decided to not add it to addArchivedTreeArtifactMaybe().

Progress towards bazelbuild#20753.

RELNOTES: None.
PiperOrigin-RevId: 596586625
Change-Id: Ib690a84508da07560ec376d4e87ed8fb2211979f
iancha1992 pushed a commit to iancha1992/bazel that referenced this issue Jan 9, 2024
This is necessary because that paths of those directories are different when seen by Bazel and by the processes within the sandbox and the sandbox interprets paths to writable directories as within the sandbox.

This is notably the case for $TEST_TMPDIR. The reason why this worked at all is that the $TEST_TMPDIR that Bazel passes to the test is relative to the working directory (it's absolutized in the test wrapper script)

Progress on bazelbuild#20753.

RELNOTES: None.
PiperOrigin-RevId: 596566851
Change-Id: Ifb56a3016a521b6a0cd4b5700172951d6feabddf
iancha1992 pushed a commit to iancha1992/bazel that referenced this issue Jan 9, 2024
This makes --experimental_split_coverage_postprocessing work in the wake of bazelbuild@fb6658c: before, it was enough to add the metadata of the tree file artifacts to the metadata provider of the post-processing action, but that change made the metadata of the tree artifact necessary, too.

Progress towards bazelbuild#20753.

RELNOTES: None.
PiperOrigin-RevId: 596929659
Change-Id: I481ef36328de7f7ab07f2ec7a0ac83d5fd508c36
@iancha1992
Copy link
Member

@lberki Here you go!:
#20821
#20822

@lberki
Copy link
Contributor

lberki commented Jan 9, 2024

A thousand thanks and a bhattleship :)

github-merge-queue bot pushed a commit that referenced this issue Jan 9, 2024
…#20822)

Includes 3 commits.

c48392c,
bc1d9d3,
b0db044

Progress towards #20753

---------

Co-authored-by: Googler <lberki@google.com>
iancha1992 added a commit that referenced this issue Jan 9, 2024
…#20821)

Includes 3 commits.

c48392c,
bc1d9d3,
b0db044

Progress towards #20753

---------

Co-authored-by: Googler <lberki@google.com>
@fmeum
Copy link
Collaborator Author

fmeum commented Jan 20, 2024

@lberki I just ran bazel test //src/test/shell/bazel:bazel_java_test locally on addf41b, which uses Bazel 7.0.1, and still see the failure. Is anything missing?

@lberki
Copy link
Contributor

lberki commented Jan 22, 2024

I know that there are still failures, that's why this bug is still open, but AFAICT they are not due to bugs in Bazel, but due to bugs in the test battery. That particular test is fixed by replacing an ln -s with a cp (otherwise, the final target of the symlink is under /tmp/bazel-source-roots and the inner sandbox overwrites that directory)

@daivinhtran
Copy link
Contributor

daivinhtran commented Feb 3, 2024

Related bug: #21190. Disabling --incompatible_sandbox_hermetic_tmp in 7.0.2 doesn't help either.

copybara-service bot pushed a commit that referenced this issue Feb 5, 2024
Tempoarily workaround #20753 to unblock bazelbuild/java_tools#87

Setting `--sandbox_tmpfs_path=/tmp` for linux matches [the current behavior of bazelci.py](https://github.com/bazelbuild/continuous-integration/blob/7a8d90d15520b81e0f330a85772c5416a04d0061/buildkite/bazelci.py#L1976)

PiperOrigin-RevId: 604341263
Change-Id: I37fe324afe4328d861b06fc64a03e82cc55de38f
fzakaria added a commit to fzakaria/bazel that referenced this issue Apr 27, 2024
fzakaria added a commit to fzakaria/bazel that referenced this issue Apr 30, 2024
copybara-service bot pushed a commit that referenced this issue May 16, 2024
The bind mounting scheme used with the Linux sandbox' hermetic `/tmp` feature is modified to preserve all paths as they are outside the sandbox, which removes the need to rewrite paths when staging inputs into and, crucially, moving outputs out of the sandbox.

Source roots and output base paths under `/tmp` are now treated just like any user-specified bind mount under `/tmp`: They are mounted under the hermetic tmp directory with their path relativized against `/tmp` before the hermetic tmp directory is mounted as `/tmp` as the final step.

There is one caveat compared to user-specified mounts: Source roots, which may themselves not lie under `/tmp`, can be symlinks to directories under `/tmp` (e.g., when they arise from a `local_repository`). To handle this situation in the common case, all parent directories of package path entries (up to direct children of `/tmp`) are mounted into the sandbox. If users use `local_repository`s with fixed target paths under `/tmp`, they will need to specify `--sandbox_add_mount_pair`.

Overlayfs has been considered as an alternative to this approach, but ultimately doesn't seem to work for this use case since its `lowerpath`, which would be `/tmp`, is not allowed to have child mounts from a different user namespace (see https://unix.stackexchange.com/questions/776030/mounting-overlayfs-in-a-user-namespace-with-child-mounts). However, this is exactly the situation created by a Bazel-in-Bazel test and can also arise if the user has existing mounts under `/tmp` when using Bazel (e.g. the JetBrains toolbox on Linux uses such mounts).

This replaces and mostly reverts the following commits, but keeps their tests:
* bf6ebe9
* fb6658c
* bc1d9d3
* 1829883
* 70691f2
* a556969
* 8e32f44 (had its test lost in an incorrect merge conflict resolution, this PR adds it back)

Fixes #20533
Work towards #20753
Fixes #21215
Fixes #22117
Fixes #22226
Fixes #22290

RELNOTES: Paths in the Linux sandbox are now again identical to those outside the sandbox, even with `--incompatible_sandbox_hermetic_tmp`.

Closes #22001.

PiperOrigin-RevId: 634381503
Change-Id: I9f7f3948c705be120c55c9b0c51204e5bea45f61
fmeum added a commit to fmeum/bazel that referenced this issue May 16, 2024
The bind mounting scheme used with the Linux sandbox' hermetic `/tmp` feature is modified to preserve all paths as they are outside the sandbox, which removes the need to rewrite paths when staging inputs into and, crucially, moving outputs out of the sandbox.

Source roots and output base paths under `/tmp` are now treated just like any user-specified bind mount under `/tmp`: They are mounted under the hermetic tmp directory with their path relativized against `/tmp` before the hermetic tmp directory is mounted as `/tmp` as the final step.

There is one caveat compared to user-specified mounts: Source roots, which may themselves not lie under `/tmp`, can be symlinks to directories under `/tmp` (e.g., when they arise from a `local_repository`). To handle this situation in the common case, all parent directories of package path entries (up to direct children of `/tmp`) are mounted into the sandbox. If users use `local_repository`s with fixed target paths under `/tmp`, they will need to specify `--sandbox_add_mount_pair`.

Overlayfs has been considered as an alternative to this approach, but ultimately doesn't seem to work for this use case since its `lowerpath`, which would be `/tmp`, is not allowed to have child mounts from a different user namespace (see https://unix.stackexchange.com/questions/776030/mounting-overlayfs-in-a-user-namespace-with-child-mounts). However, this is exactly the situation created by a Bazel-in-Bazel test and can also arise if the user has existing mounts under `/tmp` when using Bazel (e.g. the JetBrains toolbox on Linux uses such mounts).

This replaces and mostly reverts the following commits, but keeps their tests:
* bazelbuild@bf6ebe9
* bazelbuild@fb6658c
* bazelbuild@bc1d9d3
* bazelbuild@1829883
* bazelbuild@70691f2
* bazelbuild@a556969
* bazelbuild@8e32f44 (had its test lost in an incorrect merge conflict resolution, this PR adds it back)

Fixes bazelbuild#20533
Work towards bazelbuild#20753
Fixes bazelbuild#21215
Fixes bazelbuild#22117
Fixes bazelbuild#22226
Fixes bazelbuild#22290

RELNOTES: Paths in the Linux sandbox are now again identical to those outside the sandbox, even with `--incompatible_sandbox_hermetic_tmp`.

Closes bazelbuild#22001.

PiperOrigin-RevId: 634381503
Change-Id: I9f7f3948c705be120c55c9b0c51204e5bea45f61
github-merge-queue bot pushed a commit that referenced this issue May 16, 2024
The bind mounting scheme used with the Linux sandbox' hermetic `/tmp`
feature is modified to preserve all paths as they are outside the
sandbox, which removes the need to rewrite paths when staging inputs
into and, crucially, moving outputs out of the sandbox.

Source roots and output base paths under `/tmp` are now treated just
like any user-specified bind mount under `/tmp`: They are mounted under
the hermetic tmp directory with their path relativized against `/tmp`
before the hermetic tmp directory is mounted as `/tmp` as the final
step.

There is one caveat compared to user-specified mounts: Source roots,
which may themselves not lie under `/tmp`, can be symlinks to
directories under `/tmp` (e.g., when they arise from a
`local_repository`). To handle this situation in the common case, all
parent directories of package path entries (up to direct children of
`/tmp`) are mounted into the sandbox. If users use `local_repository`s
with fixed target paths under `/tmp`, they will need to specify
`--sandbox_add_mount_pair`.

Overlayfs has been considered as an alternative to this approach, but
ultimately doesn't seem to work for this use case since its `lowerpath`,
which would be `/tmp`, is not allowed to have child mounts from a
different user namespace (see
https://unix.stackexchange.com/questions/776030/mounting-overlayfs-in-a-user-namespace-with-child-mounts).
However, this is exactly the situation created by a Bazel-in-Bazel test
and can also arise if the user has existing mounts under `/tmp` when
using Bazel (e.g. the JetBrains toolbox on Linux uses such mounts).

This replaces and mostly reverts the following commits, but keeps their
tests:
*
bf6ebe9
*
fb6658c
*
bc1d9d3
*
1829883
*
70691f2
*
a556969
*
8e32f44
(had its test lost in an incorrect merge conflict resolution, this PR
adds it back)

Fixes #20533
Work towards #20753
Fixes #21215
Fixes #22117
Fixes #22226
Fixes #22290

RELNOTES: Paths in the Linux sandbox are now again identical to those
outside the sandbox, even with `--incompatible_sandbox_hermetic_tmp`.

Closes #22001.

PiperOrigin-RevId: 634381503
Change-Id: I9f7f3948c705be120c55c9b0c51204e5bea45f61

Fixes #22291
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P2 We'll consider working on this in future. (Assignee optional) team-Local-Exec Issues and PRs for the Execution (Local) team type: bug
Projects
None yet
Development

No branches or pull requests

8 participants