Lack of description on what the permissions on files in the input root are #43

EdSchouten · 2018-12-23T12:05:07Z

(Note: This is a continuation of a remark I made in #40)

It seems to be the case that the protocol doesn't document what the permissions of files and directories in the input root are in relation to the credentials of the build action.

Are build actions permitted to overwrite input files? When using Sandboxfs, this would be easy to achieve. Without using Sandboxfs, it's also possible, but has the downside that optimizations built around caching input files and hardlinking them is out of the question.
Related to the previous question, is it permitted to hardlink input files to some other location? Recent versions of Linux have fs.protected_hardlinks enabled by default. This implies that if input files are read-only (due to the use of hardlinking caches), we cannot guarantee that the kernel will allow the creation of hardlinks to input files.
In the general sense: is there even any guarantee that hardlinks can be created between two distinct directories in the input root? For example, may a worker place the output directories on a separate file system (tmpfs)? If so, this means you can't hardlink input files into an output directory.
What are permissions on directories? Where may the build action create temporary files? In any directory in the input root, or only inside of a directory containing one or more outputs?

Buildbarn's workers don't use anything like FUSE (yet!), for the reason that I initially aimed at using Buildbarn on Kubernetes, where you can't simply make mounts inside of containers. After giving it enough tweaks, I eventually concluded that:

All directories in the input root need to be writable to appease the build rules out there. Build rules should be allowed to rename and remove any file in the input root, and to create files in any directory in the input root.
Input files may be read-only. They may be replaced by removing them first and creating a new one with the old name.
Disallowing input files to be hardlinked (fs.protected_hardlinks == 1) causes a very small number of build rules to break, but those may be easy to fix.

Do we want to document this in the .proto file somewhere?

The text was updated successfully, but these errors were encountered:

EdSchouten · 2019-01-19T21:26:28Z

Friendly ping? :-)

At the build event at Bloomberg, several users of REv2 (Bazel, Buildbarn, BuildGrid) discussed what file permissions of the input root ought to be. Basically, there were two camps: - Camp A: reduce the number of writable files and directories to the absolute minumum (Bazel and Buildbarn). - Camp B: keep files read-only to allow caching based on hardlinks, while making all directories writable (BuildGrid). This is desired to support the execution of very large build actions that invoke traditional build systems that tend to clobber the input root all over the place. It was eventually decided to support the latter. In the long run a separate effort could be started to lock this down further, if it turns out this causes significant performance overhead. Until recently, Buildbarn also followed the second approach. A recent change caused Buildbarn to only make directories with one or more outputs to be writable and all of the parent directories. This change will be reverted to match the semantics proposed by this commit. Fixes: bazelbuild#43

ola-rozenfeld · 2019-10-22T14:33:42Z

Does #90 fully address this, or is there anything remaining?

juergbi · 2019-10-22T14:39:05Z

Does #90 fully address this, or is there anything remaining?

We still need to document the default permissions, i.e., when UnixMode is not specified as node property.

EricBurnett · 2019-11-05T20:22:17Z

I don't think we can document the default permissions just yet, for the simple reason that there isn't currently an agreed-upon default. The tradeoffs you outline above are a good example: I agree that directories need to be writeable, but I'm not sure what will/won't break if we make files readonly (today RBE marks all input files as writeable; no actions modify their inputs today, but I don't know if any are opening them readwrite and so would be broken by readonly - wouldn't put it past random tools/scripts tools to do so). At the same time, to document that files must be writeable precludes you from some reasonable optimizations, so I'm not going to push for it.

If we do document a default, my vote is for '777' for the simple reason that that's what we use today, and I don't have a viable way to validate whether changing it would break anyone. If you (and other implementations) are interested in standardizing on the same, then I certainly won't object, but otherwise I think it's probably a case where we should declare it to be "implementation-defined" and leave it to tests of cross-compatibility to flag if the divergence actually matters to anything.

For #90 there's an independent concern that defaults may be context-specific: the default permission for inputs should probably be different from the default permission of outputs?

juergbi · 2019-11-06T07:47:32Z

If we do document a default, my vote is for '777' for the simple reason that that's what we use today, and I don't have a viable way to validate whether changing it would break anyone. If you (and other implementations) are interested in standardizing on the same, then I certainly won't object, but otherwise I think it's probably a case where we should declare it to be "implementation-defined" and leave it to tests of cross-compatibility to flag if the divergence actually matters to anything.

If these are the only two options, we have to keep it implementation-defined for now. BuildBox (and as far as I know also Buildbarn) exposes cached files to action commands via hardlinks by default. To prevent corruption of the cached files, these hardlinks have to be read-only to the process/user executing the action commands.

To my knowledge, this is the only cross-platform approach that avoids having to copy each input file for every action. Some platforms provide more flexible options such as FUSE or reflinks but these options are not universally available.

For #90 there's an independent concern that defaults may be context-specific: the default permission for inputs should probably be different from the default permission of outputs?

Do you mean for the special case of output paths that already exist in the input tree? As I normally expect outputs to be created by the action command, in which case the permissions are defined by the action command (+ possibly umask) and not by the RE server/worker. Or am I misunderstanding your question?

EdSchouten mentioned this issue Feb 1, 2019

Document what the intended file permissions of the input root is. #54

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Lack of description on what the permissions on files in the input root are #43

Lack of description on what the permissions on files in the input root are #43

EdSchouten commented Dec 23, 2018

EdSchouten commented Jan 19, 2019

ola-rozenfeld commented Oct 22, 2019

juergbi commented Oct 22, 2019

EricBurnett commented Nov 5, 2019

juergbi commented Nov 6, 2019 •

edited

Loading

Lack of description on what the permissions on files in the input root are #43

Lack of description on what the permissions on files in the input root are #43

Comments

EdSchouten commented Dec 23, 2018

EdSchouten commented Jan 19, 2019

ola-rozenfeld commented Oct 22, 2019

juergbi commented Oct 22, 2019

EricBurnett commented Nov 5, 2019

juergbi commented Nov 6, 2019 • edited Loading

juergbi commented Nov 6, 2019 •

edited

Loading