Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hermetic build.rs #490

Closed
UebelAndre opened this issue Nov 11, 2020 · 24 comments · Fixed by google/cargo-raze#325
Closed

Hermetic build.rs #490

UebelAndre opened this issue Nov 11, 2020 · 24 comments · Fixed by google/cargo-raze#325

Comments

@UebelAndre
Copy link
Collaborator

The Rust Bazel rules should have a way to handle build.rs in a hermetic way. My current hypothesis is that this could be handled by removing the following:

    # Start with the default shell env, which contains any --action_env
    # settings passed in on the command line.
    env = dict(ctx.configuration.default_shell_env)

and by adding a way for users to optionally enable default_shell_env.

@dae
Copy link
Contributor

dae commented Nov 11, 2020

(For reference, that line was introduced by #447)

I have no objections to making this off by default if necessary, though I don't think it will completely solve the issues of hermeticity - build scripts can still access arbitrary paths on the filesystem, call programs in the standard PATH, and make network requests for example.

@UebelAndre
Copy link
Collaborator Author

I'll try to do some investigating myself but it'd be a huge help if someone could enumerate all the things required to make build.rs hermetic in these rules.

In my experience, 80% of the build.rs files I come across can be done in a hermetic way (they check for features or attributes of the current rustc executable). I think it'd be a huge benefit to be able to default build.rs to be hermetic and have users explicitly add attributes to support unique cases or to explicitly opt out of hermetic builds.

@UebelAndre
Copy link
Collaborator Author

Actually, the snippet I had questioned earlier doesn't seem problematic. @dfreese is there something else I'm missing? cargo_build_script seems hermetic (assuming no --action_env has been set to something broad) to me.

@PiotrSikora
Copy link
Contributor

@UebelAndre some build.rs scripts are used to build C libraries or generate input artifacts (e.g. ring's build.rs).

I don't think default_shell_env breaks hermeticity, since it should be an empty set unless --action_env is passed to Bazel, but that has potential to break hermeticity in most of things in Bazel anyway.

@UebelAndre
Copy link
Collaborator Author

I don't think default_shell_env breaks hermeticity, since it should be an empty set unless --action_env is passed to Bazel, but that has potential to break hermeticity in most of things in Bazel anyway.

I came to the same conclusion about default_shell_env. I'm looking for any more input as to why cargo_build_script might not currently be hermetic (assuming no --action_env). If there isn't then I'd say Cargo Raze should default default_gen_buildrs to true.

@PiotrSikora
Copy link
Contributor

The reason for not changing default_gen_buildrs to true is that we know that for some creates it's going to result in non-hermetic builds (e.g. the aforementioned ring crate), and because you're suddenly executing third party build scripts on your build machine, which might be against various security policies, etc.

Hence my suggestion to maintain a list of select well-known / trusted crates that would have gen_buildrs enabled by default (google/cargo-raze#289).

@UebelAndre
Copy link
Collaborator Author

How would it result in non hermetic builds? Won't that. Crate simply fail to build because build.rs doesn't have the resources it needs to run the script?

@UebelAndre
Copy link
Collaborator Author

Maybe bazelbuild/bazel#7026 is why?

@PiotrSikora
Copy link
Contributor

Will it, though? cargo_build_script has data = glob(["**"]), so it should have access to all the input files, and if there is C/C++ compiler in the default system path, then it might be able to compile those. I'm not familiar with cargo_build_script, and I don't see it having outs, but I imagine that artifacts it produces are passed as input (via deps) to rust_library.

@UebelAndre
Copy link
Collaborator Author

The outputs would go into OUT_DIR:
https://github.com/bazelbuild/rules_rust/blob/master/cargo/cargo_build_script.bzl#L75

This is quite the tricky situation. Maybe the output directory is something that should be explicitly requested when defining targets?

@UebelAndre
Copy link
Collaborator Author

Maybe not, that would probably lead to obscure build errors downstream. And work would be wasted since it wouldn't stop build.rs from compiling some C++ source

@UebelAndre
Copy link
Collaborator Author

The thing I'm wondering now is, is there way to identify whether or not a build.rs script is trying to access other things on the system (compilers or other tools) to do some larger amount of work outside of the Bazel ecosystem?

@PiotrSikora
Copy link
Contributor

I think naive way would be looking at build.rs imports.

We could be either permissive and allow crates with build.rs files that don't use std::fs, std::path, std::process, etc., or we could be strict and allow crates with build.rs files that use only std::env.

@UebelAndre
Copy link
Collaborator Author

Is there an example of something that enumerates imports? Like a built in rustc command or something?

@PiotrSikora
Copy link
Contributor

I have no idea, sorry. You'd probably need to use something like rust-analyzer.

@UebelAndre
Copy link
Collaborator Author

That might not actually work. Given https://github.com/dtolnay/syn/blob/master/build.rs

Is there a way to restrict what binaries an action is allowed to use?

@PiotrSikora
Copy link
Contributor

Ops, you're right, libc's build.rs does the same, not sure how I've missed that.

@illicitonion
Copy link
Collaborator

I believe that cargo_build_script is reasonably hermetic by default; like any action, like genrule, it can reach out to absolute paths, but by only if the code is written to do so. We go out of our way to put things like toolchain-selected C++ compilers:

if cc_toolchain:
toolchain_tools.append(cc_toolchain.all_files)
cc_executable = cc_toolchain.compiler_executable
if cc_executable:
env["CC"] = cc_executable
ar_executable = cc_toolchain.ar_executable
if ar_executable:
env["AR"] = ar_executable

and rustc:
"RUSTC": toolchain.rustc.path,

into relevant env vars so that these things work hermetically by default.

We even go as far as to try to redact absolute paths in the outputs that build scripts propagate to their dependees (see the callers of

fn redact_exec_root(value: &str, exec_root: &str) -> String {
value.replace(exec_root, "${pwd}")
}
in the same file as the function).

Just like any action, you can do things like try to enumerate /usr/bin or similar by specifying absolute paths. But this should be addressed by fixing the code (e.g. by providing env var overrides, and specifying them in the BUILD file), or by action sandboxing, rather than not running them at all.

FWIW I would be strongly in favour of defaulting default_gen_buildrs to true in cargo-raze. My personal view is that the more raze can Just Work out of the box, the more valuable it is, and that folks who really care about untrusted code running in build actions are probably in the minority (certainly, Bazel makes it fairly hard to guarantee that only trusted toolchains are used, or even that only explicitly enumerated rulesets are used), and can use opt-outs to enable stricter vetting if they want. The introduction of cargo_build_script (instead of the genrules we used to have) which automatically propagates information from the build script to its dependers was a huge step forwards here, no longer needing to manually specify things like --cfg flags - let's try to get as close as we can to needing no extra metadata in a Cargo.toml file to use it from bazel.

@UebelAndre
Copy link
Collaborator Author

@dfreese do you concur with what @illicitonion is saying?

@UebelAndre
Copy link
Collaborator Author

Also curious to hear from @damienmg on this.

@damienmg
Copy link
Collaborator

default_shell_env can be forced with --action_env and was the scheduled way to run those rule in an hermetic fashion while still allowing for the ease of use of using the default environment. I don't think this should be removed.

@UebelAndre
Copy link
Collaborator Author

@damienmg Yeah, I think the conversation has moved away from suggesting that be removed. Given #490 (comment) I think it'd be reasonable to default default_gen_buildrs to true in Cargo-raze. What're your thoughts there and on that comment?

@damienmg
Copy link
Collaborator

I also think that default_gen_buildrs to true is sensible default.

@UebelAndre
Copy link
Collaborator Author

The next release of cargo-raze will now default default_gen_buildrs to true.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants