Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Security: avoid storing binary artifacts for protoc binary #562

Closed
tmfink opened this issue Nov 5, 2021 · 3 comments
Closed

Security: avoid storing binary artifacts for protoc binary #562

tmfink opened this issue Nov 5, 2021 · 3 comments

Comments

@tmfink
Copy link

tmfink commented Nov 5, 2021

Thanks for your great work with this crate.

Situation

The prost-build crate seems to use cached, pre-compiled binaries for the protoc by default:
https://github.com/tokio-rs/prost/tree/master/prost-build/third-party/protobuf

Users can opt out of this behavior by setting the PROTOC environment variable:

prost/prost-build/build.rs

Lines 102 to 108 in c3b7037

let protoc = env_protoc()
.or_else(bundled_protoc)
.or_else(path_protoc)
.expect(
"Failed to find the protoc binary. The PROTOC environment variable is not set, \
there is no bundled protoc for this platform, and protoc is not in the PATH",
);

Problem

It's surprising that this crate would use a pre-compiled binary artifact by default.
Those concerned with supply-chain security don't want to unwittingly rely on binary artifacts

In fact, in the software security space, the Software Bill of Materials (SBOM) movement has gained traction, where vendors show where all code comes from. In the United States, there was an executive order mandating that vendors providing software to the government provide an SBOM. It would not look good if a vendor unwittingly relied on a binary artifact.

The Open Source Security Foundation (OpenSSF) classifies storing binary artifacts in the repo as a high-risk security concern. I don't want to re-iterate every point, but the high-level point is that while not impossible to review, binaries are much more difficult to validate compared to source code. This makes it easier for an attacker to insert malicious code.

For example, in a recent breach of the JavaScript NPM package ua-parser-js, attackers inserted malware into the package. This is especially bad since (as of writing) that package has about 6 million weekly downloads.

As of writing, crates.io shows that prost has 368 public reverse dependencies. That does not include unpublished/proprietary code or transitive reverse dependencies. There are probably many users of prost who don't know they are users of prost (due the prost existing somewhere in the dependency graph) 😉.

Solutions

Note that either of these is a backwards-incompatible change and would require a version bump.

Minimum Fix

At a minimum, using the binary artifacts should not be default. Using the pre-compiled binaries should be opt-in instead of opt-out.

For example, users could set the PROTOC_USE_PRECOMPILED_BINARY to use the pre-built binary. Otherwise, the PROTOC variable would be queried (possible fallback to PATH). If no protoc can be found, the build.rs would panic.

Recommended Fix

Ideally, no pre-compiled binaries should be used at all. Instead, vendor the protobuf sources in this repo. I suggest avoiding git submodules and copying the sources directly into this repo1.

This also has the benefit of making it easier to support more platforms. For example, I hit an issue trying to compile code that depended on prost on a FreeBSD machine2. On such platforms, protobuf could just be compiled without needing to manually compile/install protoc.

Footnotes

  1. I maintain the capstone crate, which are Rust bindings to a C library. I vendor the C library and have a script to update the vendored sources.

  2. That's how I actually discovered this issue

@benesch
Copy link

benesch commented Dec 23, 2021

Hah, I just wrote up the exact same thing in #575, and put together a proposed fix in #576. Would love your feedback on that.

@tmfink
Copy link
Author

tmfink commented Dec 24, 2021

Hah, I just wrote up the exact same thing in #575, and put together a proposed fix in #576. Would love your feedback on that.

ACK--I put comments in #576 😄

@tmfink
Copy link
Author

tmfink commented Mar 28, 2022

handled with #610

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants