Allow the exec root to be placed outside the output base #12558

EdSchouten
Contributor

In an attempt to achieve 'Builds without the Bytes' without losing
access to build outputs, I am experimenting with a FUSE file system that
gives access to objects stored in the CAS. In PR #11703, I added a
command line flag to let Bazel emit symbolic links pointing into this
FUSE file system, as opposed to downloading files into the exec root.

Though this change has allowed me to get quite a lot of stuff working,
there are also many build actions that break. For example, Python calls
realpath(argv[0]) to figure out its installation path. Because the FUSE
file system does not mimic the execroot, Python won't be able to find
its site packages. Similar problems hold with shared library resolution
in general.

This is why I think the only proper way we can get this to work is by
using hard links instead of symbolic links. That way the usual file
hierarchy remains intact. This, however, requires that the exec root
itself is placed on a FUSE file system. It is already possible to
achieve this by setting --output_base, but that has the downside of also
placing many other files on FUSE (e.g., the sandbox directories), which
is detrimental to performance.
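
To illustrate the difference described above, here is a minimal sketch (not Bazel code) that reaches one file through a symbolic link and through a hard link and then resolves both with `os.path.realpath`. The directory names `cas` and `execroot/bin` and the file names are hypothetical stand-ins; the sketch assumes a POSIX system where both directories live on the same file system, which is exactly the constraint that forces the exec root onto FUSE in the hard link approach.

```python
import os
import tempfile


def resolve_via_links(root):
    """Create one file and resolve it through a symlink and a hard link."""
    cas_dir = os.path.join(root, "cas")               # hypothetical stand-in for the FUSE mount
    exec_bin = os.path.join(root, "execroot", "bin")  # hypothetical stand-in for the exec root
    os.makedirs(cas_dir)
    os.makedirs(exec_bin)

    blob = os.path.join(cas_dir, "blob-1234")
    with open(blob, "w") as f:
        f.write("#!/bin/sh\n")

    symlinked = os.path.join(exec_bin, "tool-symlink")
    hardlinked = os.path.join(exec_bin, "tool-hardlink")
    os.symlink(blob, symlinked)  # path indirection: realpath follows it
    os.link(blob, hardlinked)    # second name for the same inode: nothing to follow

    return {
        "blob": os.path.realpath(blob),
        "symlink": os.path.realpath(symlinked),
        "hardlink": os.path.realpath(hardlinked),
        "hardlink_expected": os.path.join(os.path.realpath(exec_bin), "tool-hardlink"),
    }


with tempfile.TemporaryDirectory() as tmp:
    paths = resolve_via_links(tmp)
    # The symlink resolves into the "cas" directory, away from the exec root.
    print(paths["symlink"] == paths["blob"])
    # The hard link still resolves to its own location inside the exec root.
    print(paths["hardlink"] == paths["hardlink_expected"])
```

A program that locates its resources relative to `realpath(argv[0])` would therefore look next to the blob in the symlink case, but next to the exec root entry in the hard link case.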

This change adds a new command line flag, --exec_root_base, which can be
used to leave the output base at the regular place, but host the
exec root directory on a FUSE file system.

This change originally seemed to work all right with Bazel 3.4-3.7. In
order to make this work with Bazel master, I had to make a slight tweak
to the changes in 0c249d5. That code added the assumption that
"${output_base}/external" is always placed at
"${exec_root_base}/../external". I suspect that already causes a
regression in case a BlazeModule overrides the exec root base. While
there, rename 'execRootParent' to 'execRootBase', as it corresponds to
the exec root itself, not its parent directory.
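
The path assumption mentioned above can be sketched with plain path arithmetic. This is an illustration only, not the Bazel implementation: the helper names, the example output base, and the FUSE mount point are all hypothetical, and the default layout of `${output_base}/execroot` is stated here as an assumption.

```python
import posixpath


def external_via_exec_root(exec_root_base):
    """Derive the external directory as "${exec_root_base}/../external"."""
    return posixpath.normpath(posixpath.join(exec_root_base, "..", "external"))


def external_via_output_base(output_base):
    """Derive the external directory as "${output_base}/external"."""
    return posixpath.join(output_base, "external")


# Hypothetical locations, chosen only for illustration.
output_base = "/home/user/.cache/bazel/_bazel_user/abc123"
default_exec_root_base = posixpath.join(output_base, "execroot")  # assumed default layout
fuse_exec_root_base = "/mnt/fuse/execroot"                        # exec root base moved elsewhere

# With the assumed default layout both derivations agree.
print(external_via_exec_root(default_exec_root_base) == external_via_output_base(output_base))

# Once the exec root base is moved, the "../external" derivation points
# next to the new location instead of into the output base.
print(external_via_exec_root(fuse_exec_root_base))
print(external_via_output_base(output_base))
```

The two derivations only coincide while the exec root base sits directly under the output base, which is why moving the exec root base requires deriving the external directory from the output base instead.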

@google-cla google-cla bot added the cla: yes label Nov 25, 2020
@EdSchouten EdSchouten force-pushed the eschouten/20201125-exec-root-base branch 5 times, most recently from 8829c5c to 7f87987 Compare November 25, 2020 19:39
@EdSchouten EdSchouten force-pushed the eschouten/20201125-exec-root-base branch from 7f87987 to 0f543a2 Compare November 25, 2020 19:49
@EdSchouten
Contributor Author

Cc @coeuvre @janakdr

@coeuvre
Member

coeuvre commented Nov 26, 2020

Thanks for your PR!

I am not familiar with this part so might not be the one to review the code. But overall, the changes LGTM.

EdSchouten added a commit to EdSchouten/bazel that referenced this pull request Nov 26, 2020
In an attempt to achieve 'Builds without the Bytes' without losing
access to build outputs, I am experimenting with a FUSE file system that
gives access to objects stored in the CAS. In PR bazelbuild#11703, I added a
command line flag to let Bazel emit symbolic links pointing into this
FUSE file system, as opposed to downloading files into the exec root.

Though this change has allowed me to get quite a lot of stuff working,
there are also many build actions that break. For example, Python calls
realpath(argv[0]) to figure out its installation path. Because the FUSE
file system does not mimic the execroot, Python won't be able to find
its site packages. Similar problems hold with shared library resolution
in general.

This is why I think the only proper way we can get this to work is by
using hard links instead of symbolic links. That way the usual file
hierarchy remains intact. This change renames the
--remote_download_symlink_template flag to
--remote_download_hard_link_template and changes the code to create hard
links instead.

When used in combination with --exec_root_base (bazelbuild#12558), it's now
possible to let Bazel construct an exec root that does not have any
additional indirection through symbolic links, thereby keeping programs
that do symlink expansion happy.
@janakdr
Contributor

janakdr commented Nov 28, 2020

cc reviewer @lberki for the issue with /external, since I think the author is not part of our github org.

@janakdr
Contributor

janakdr commented Nov 28, 2020

Internally, the "bazel-out" directory is a symlink to the FUSE filesystem. It's created inside the "startBuild" method.

I think it might be a smaller change and more consistent to have just the bazel-out directory be on the FUSE filesystem, and make it a symbolic link. Does that sound ok? It's conceivable that we could open-source the abstract FUSE logic as an OutputService, configured via command-line flags, that would allow sharing of this code (although it's not very complex code).
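
As a rough illustration of the layout suggested in this comment (not the internal implementation, which is not shown in this thread), the sketch below makes a `bazel-out` entry inside an exec root a symbolic link to a directory elsewhere. The function name and both directory names are hypothetical, and an ordinary temporary directory stands in for the FUSE mount.

```python
import os
import tempfile


def link_bazel_out(exec_root, fuse_dir):
    """Make <exec_root>/bazel-out a symlink to fuse_dir and return the link path."""
    link = os.path.join(exec_root, "bazel-out")
    os.symlink(fuse_dir, link, target_is_directory=True)
    return link


with tempfile.TemporaryDirectory() as tmp:
    exec_root = os.path.join(tmp, "execroot", "workspace")  # hypothetical exec root
    fuse_dir = os.path.join(tmp, "fuse", "outputs")         # hypothetical stand-in for the FUSE mount
    os.makedirs(exec_root)
    os.makedirs(fuse_dir)

    link = link_bazel_out(exec_root, fuse_dir)

    # A file written through the exec root path lands in the linked directory.
    with open(os.path.join(link, "out.txt"), "w") as f:
        f.write("output\n")

    print(os.path.islink(link))
    print(os.path.exists(os.path.join(fuse_dir, "out.txt")))
```

With this layout only the output tree sits behind the link; the rest of the exec root and the output base stay on the regular file system.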

@EdSchouten
Contributor Author

> Internally, the "bazel-out" directory is a symlink to the FUSE filesystem. It's created inside the "startBuild" method. I think it might be a smaller change and more consistent to have just the bazel-out directory be on the FUSE filesystem, and make it a symbolic link. Does that sound ok?

Ah, nice. The difference between putting the exec root or just bazel-out on FUSE should be fairly small, so that would work for me. Thanks for pointing me to that part of the code!

> It's conceivable that we could open-source the abstract FUSE logic as an OutputService, configured via command-line flags, that would allow sharing of this code (although it's not very complex code).

That would be awesome! Anything I could do to help out with that (review, test, etc.)?

@janakdr
Contributor

janakdr commented Nov 28, 2020

@alexjski separating out an abstract FuseOutputService might be one entry point to work we discussed around rethinking the output tree, although it kind of goes in the opposite direction!

@aiuto aiuto added the team-Remote-Exec Issues and PRs for the Execution (Remote) team label Nov 30, 2020
@EdSchouten
Contributor Author

Hey @janakdr,

Thanks again for pointing me to OutputService a couple of days ago. The last couple of days I've been experimenting with it, and at the same time doing some beard scratching. Maybe for Bazel (not Blaze) it would make sense to have a completely separate implementation of OutputService, specifically aimed at REv2.

Just as an experiment, I'm working on adding an implementation of OutputService to lib/remote that transforms many of the operations into gRPC calls, to be sent to a FUSE daemon. At the same time, I'm trying to see whether this gRPC protocol can incorporate the features that I (attempted to) add in #11703, #11662 and #12566 into something cohesive. More concretely: we could let Bazel just do gRPC calls against the FUSE daemon to do getDigest(), and to create CAS-backed pseudo-files.

I'll keep you posted on how that works out.

@janakdr
Contributor

janakdr commented Dec 5, 2020 via email

@EdSchouten
Contributor Author

Superseded by #12823. Thanks for sharing your insights, @janakdr. Hope to get your opinions on #12823!

@EdSchouten EdSchouten closed this Jan 13, 2021