Replace Bazel's custom unix_cc_configure with a wrapper script. #2631

Merged
htuch merged 5 commits into envoyproxy:master from the cc-wrapper-script branch on Feb 20, 2018

jmillikin-stripe
Contributor

Description:
The script checks for -static-libstdc++ in argv, and if found
(1) switches to the $CXX compiler and (2) drops -lstdc++ from argv.

This should let the main Envoy binary build normally, without
interfering with other build systems that treat C and C++ as different
languages.
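
The behaviour described above can be sketched as a pure function. This is a minimal illustration, not the script from this PR: `choose_compiler_and_args` and the example paths are hypothetical, and a real wrapper would finish by calling `os.execv` on the result.

```python
def choose_compiler_and_args(argv, real_cc, real_cxx):
  """Return (compiler, args) for one compiler invocation.

  Hypothetical helper: `real_cc` and `real_cxx` stand in for the paths
  that the build would substitute for $CC and $CXX.
  """
  if real_cxx is not None and "-static-libstdc++" in argv:
    # Link through the C++ driver, which pulls in libstdc++ on its own,
    # so the explicit -lstdc++ is dropped.
    return real_cxx, [arg for arg in argv if arg != "-lstdc++"]
  # Any other invocation is passed through to the C compiler untouched.
  return real_cc, list(argv)


# Example: a link line that asks for a static libstdc++.
compiler, args = choose_compiler_and_args(
    ["-o", "envoy", "main.o", "-static-libstdc++", "-lstdc++"],
    "/usr/bin/gcc", "/usr/bin/g++")
```

Keeping the decision separate from the `exec` call makes it easy to check the argv rewriting without launching a compiler.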

Risk Level: Medium

Testing:
I tested this locally (Works On My Machine) but will rely on CI to verify it on the more exotic build configurations.

Signed-off-by: John Millikin <jmillikin@stripe.com>

@jmillikin-stripe
Contributor Author

cc @snowp because I think this will unblock your rules_go upgrade, @htuch because you'll be excited to see the crosstool patches go away, and @mattklein123 because I know how much you love Bazel.

envoy_real_cc = {ENVOY_REAL_CC}
envoy_real_cxx = {ENVOY_REAL_CXX}
compiler = envoy_real_cc
if envoy_real_cxx is not None and "-static-libstdc++" in sys.argv[1:]:
Member

How does this play with libc++ requirements that exist for OSS-Fuzz?

@snowp
Contributor

snowp commented Feb 16, 2018

This seems reasonable to me (still hacky, but I prefer hacky that works over hacky that doesn't) and should solve the rules_go issue as well as the general problem of building C files (#839)

@jmillikin-stripe jmillikin-stripe force-pushed the cc-wrapper-script branch 2 times, most recently from 6e79f45 to da4b0b7 Compare February 17, 2018 00:48
@jmillikin-stripe
Contributor Author

asan/tsan is stuck on a clang/bazel interaction. Will look at it more on the weekend or holiday. Notes:

$ CC=clang-5.0 bazel build @com_github_cyan4973_xxhash//:xxhash
INFO: Analysed target @com_github_cyan4973_xxhash//:xxhash (0 packages loaded).
INFO: Found 1 target...
ERROR: /root/.cache/bazel/_bazel_root/f8087e59fd95af1ae29e8fcb7ff1a3dc/external/com_github_cyan4973_xxhash/BUILD.bazel:1:1: undeclared inclusion(s) in rule '@com_github_cyan4973_xxhash//:xxhash':
this rule is missing dependency declarations for the following files included by 'external/com_github_cyan4973_xxhash/xxhash.c':
  '/usr/lib/clang/5.0.1/include/stddef.h'
  '/usr/lib/clang/5.0.1/include/__stddef_max_align_t.h'
  '/usr/lib/clang/5.0.1/include/stdint.h'
[...]

$ grep cxx_builtin_include_directory bazel-src/external/local_config_cc/CROSSTOOL 
  cxx_builtin_include_directory: "/usr/include/c++/5.4.0"
  cxx_builtin_include_directory: "/usr/include/x86_64-linux-gnu/c++/5.4.0"
  cxx_builtin_include_directory: "/usr/include/c++/5.4.0/backward"
  cxx_builtin_include_directory: "/usr/include/clang/5.0.1/include"
  cxx_builtin_include_directory: "/usr/local/include"
  cxx_builtin_include_directory: "/usr/include/x86_64-linux-gnu"
  cxx_builtin_include_directory: "/usr/include"

$ ls -ld /usr/lib/clang/5.0.1/include
lrwxrwxrwx 1 root root 38 Feb  2 01:57 /usr/lib/clang/5.0.1/include -> ../../llvm-5.0/lib/clang/5.0.1/include
$ ls -ld /usr/include/clang/5.0.1/include
lrwxrwxrwx 1 root root 45 Feb  2 01:57 /usr/include/clang/5.0.1/include -> ../../../lib/llvm-5.0/lib/clang/5.0.1/include

OK, so the clang include path is having a symlink party. Bazel is expecting /usr/include/clang/5.0.1/include, but Clang is outputting /usr/lib/clang/5.0.1/include at compile time. How's that got anything to do with my change?

Ran clang -MD on a hello world, looks like Clang returns different include paths for C vs C++.

$ cat hello.cc      # hello.c is the same file
#include <stddef.h>
int main() { return 1; }
$ cat hello-c-deps.d 
hello.o: hello.c /usr/lib/clang/5.0.1/include/stddef.h
$ cat hello-cc-deps.d 
hello.o: hello.cc /usr/include/clang/5.0.1/include/stddef.h
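
To see why this mismatch matters, here is a rough sketch of the kind of check that could produce the `undeclared inclusion(s)` error. It assumes the comparison is a plain path-prefix match with no symlink resolution, which is a guess at Bazel's behaviour rather than a description of its implementation; `parse_depfile` and `undeclared` are hypothetical helpers.

```python
def parse_depfile(text):
  """Return the dependency paths listed in a Makefile-style .d file.

  Simplified: handles backslash line continuations but not escaped spaces.
  """
  joined = text.replace("\\\n", " ")
  _, _, deps = joined.partition(":")
  return deps.split()


def undeclared(deps, builtin_dirs):
  """Return absolute deps that sit under none of the declared directories."""
  prefixes = tuple(d.rstrip("/") + "/" for d in builtin_dirs)
  return [d for d in deps if d.startswith("/") and not d.startswith(prefixes)]


# Directories copied from the CROSSTOOL output earlier in this comment.
builtin_dirs = [
    "/usr/include/c++/5.4.0",
    "/usr/include/x86_64-linux-gnu/c++/5.4.0",
    "/usr/include/c++/5.4.0/backward",
    "/usr/include/clang/5.0.1/include",
    "/usr/local/include",
    "/usr/include/x86_64-linux-gnu",
    "/usr/include",
]

c_deps = parse_depfile(
    "hello.o: hello.c /usr/lib/clang/5.0.1/include/stddef.h\n")
cc_deps = parse_depfile(
    "hello.o: hello.cc /usr/include/clang/5.0.1/include/stddef.h\n")
```

Under that assumption the C dependency file trips the check, because `/usr/lib/clang/...` matches no declared prefix, while the C++ one passes.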

Bazel is running $CC -E -xc++ - -v to get the CROSSTOOL data. What does Clang return from that command (or similar)?

$ echo '' | clang-5.0 -E - -v 2>&1 | grep -A 5 'search starts here'
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/include
 /usr/lib/llvm-5.0/lib/clang/5.0.1/include
 /usr/include/x86_64-linux-gnu
 /usr/include
End of search list.
$ echo '' | clang++-5.0 -E - -v 2>&1 | grep -A 5 'search starts here'
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/include
 /usr/lib/llvm-5.0/lib/clang/5.0.1/include
 /usr/include/x86_64-linux-gnu
 /usr/include
End of search list.
$ echo '' | clang-5.0 -E -xc++ - -v 2>&1 | grep -A 5 'search starts here'
#include "..." search starts here:
#include <...> search starts here:
 /usr/bin/../lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/c++/5.4.0
 /usr/bin/../lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/x86_64-linux-gnu/c++/5.4.0
 /usr/bin/../lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/c++/5.4.0/backward
 /usr/include/clang/5.0.1/include
 /usr/local/include

clang and clang++ output is identical, but clang -xc++ is different!
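
The `/usr/bin/../lib/gcc/...` entries collapse to the same paths that ended up in CROSSTOOL once the `..` components are removed lexically. A small sketch, assuming the `-v` output format shown above; `parse_search_dirs` is a hypothetical helper, and `posixpath.normpath` works on the string only (it does not follow symlinks).

```python
import posixpath


def parse_search_dirs(verbose_output):
  """Extract the `#include <...>` search directories from `cc -E -v` output."""
  dirs = []
  in_list = False
  for line in verbose_output.splitlines():
    if line.startswith("#include <...> search starts here:"):
      in_list = True
    elif line.startswith("End of search list."):
      break
    elif in_list and line.startswith(" "):
      # Collapse "a/b/../c" to "a/c" without touching the filesystem.
      dirs.append(posixpath.normpath(line.strip()))
  return dirs


# Abbreviated from the `clang-5.0 -E -xc++ - -v` output above.
sample = """\
#include "..." search starts here:
#include <...> search starts here:
 /usr/bin/../lib/gcc/x86_64-linux-gnu/5.4.0/../../../../include/c++/5.4.0
 /usr/include/clang/5.0.1/include
 /usr/local/include
End of search list.
"""
print(parse_search_dirs(sample))
```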

@jmillikin-stripe
Contributor Author

And we're green!

Based on the time to run tests, wrapping the compiler like this invalidates the build cache. We'll want to do a new build image after merging.

Member

@htuch htuch left a comment

This new approach is neat, I like it; it will be much more robust to Bazel version churn, in addition to the rules_go stuff.

# See the License for the specific language governing permissions and
# limitations under the License.
"""Rules for configuring the C++ toolchain (experimental)."""
def _quiet_fake_which(program):
Member

Can you add comments here and below providing details on why we're faking?

Contributor Author

Done.

os = fake_os,
), {})
repository_ctx.template("envoy_cc_wrapper", repository_ctx.attr._envoy_cc_wrapper, {
"{ENVOY_REAL_CC}": repr(str(real_cc)),
Member

Yep, definitely could do with a narrative explaining what's happening here.

Contributor Author

Done.

configure_unix_toolchain(repository_ctx, cpu_value, overriden_tools)
overriden_tools = {}
if cpu_value not in ["freebsd", "x64_windows", "darwin"]:
  overriden_tools["gcc"] = _build_envoy_cc_wrapper(repository_ctx)
Member

Why is gcc hardcoded here? Is this generic $CC effectively? If so, why not on Darwin as well?

Contributor Author

Added a comment. To Bazel, all CCs are GCC.

compiler = envoy_real_cxx
argv = []
for arg in sys.argv[1:]:
  if arg == "-lstdc++":
Member

I'd like to ensure that this is going to work with Envoy OSS-Fuzz, which uses libc++; see https://github.com/google/oss-fuzz/tree/master/projects/envoy.

Can you checkout OSS-Fuzz (https://github.com/google/oss-fuzz) and run:

python infra/helper.py build_image envoy
python infra/helper.py build_fuzzers --sanitizer=address envoy <path to your Envoy source tree>
python infra/helper.py run_fuzzer envoy base64_fuzz_test

See https://github.com/google/oss-fuzz/blob/master/docs/new_project_guide.md#testing-locally for the official docs on this flow if you have issues, or reach out to me on Slack.

I think we should eventually put this in a CI slot to ensure we don't regress as we muck with the build, but for now we need to do this manually.

Contributor Author

Running now. I've only got a laptop to build with so it's taking a pretty good amount of time.

Contributor Author

oss-fuzz works with the new script, but we need to add python to the Debian package list in oss-fuzz/projects/envoy/Dockerfile.

Contributor Author

Correction: I ran the oss-fuzz script correctly this time (on my branch instead of envoy upstream) and it failed with some C++ stdlib linker problem. I think a -lstdc++ was getting through the wrapper somehow. Rebuilding now to see if more filtering fixed it.

Contributor Author

Verified it works once we add -stdlib=libc++ to the list of flags that need -lstdc++ to be pruned first.
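
With that fix, more than one flag triggers the pruning. A minimal sketch of the idea, assuming the trigger set is exactly the two flags mentioned in this thread; the names `CXX_LINK_FLAGS` and `prune_libstdcxx` are hypothetical, and the actual script may organize this differently.

```python
# Flags that, per this thread, mean an explicit -lstdc++ must be removed.
# (Assumed set; the real script may list others.)
CXX_LINK_FLAGS = frozenset(["-static-libstdc++", "-stdlib=libc++"])


def prune_libstdcxx(argv):
  """Drop -lstdc++ when any trigger flag is present; otherwise copy argv."""
  if CXX_LINK_FLAGS.intersection(argv):
    return [arg for arg in argv if arg != "-lstdc++"]
  return list(argv)
```

A libc++ link line such as `["-stdlib=libc++", "-lstdc++", "a.o"]` comes back without `-lstdc++`, while a line with neither trigger flag is returned unchanged.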

    argv.append(arg)
else:
  argv = sys.argv[1:]
os.execv(compiler, [compiler] + argv)
Member

This is a pretty cool approach actually. It seems this gives us a lot of freedom to fix Bazel issues that aren't addressed upstream via the wrapper. Neat.

# interchangeable, but `gcc` will ignore the `-static-libstdc++` flag.
# This check lets Envoy statically link against libstdc++ to be more
# portable between intalled glibc versions.
if "-static-libstdc++" in sys.argv[1:]:
Member

Have you verified with ldd that we're still getting the desired static/dynamic link mix?

Contributor Author

Yep, libstdc++.so doesn't show up in the output of ldd bazel-bin/source/exe/envoy when using this wrapper.
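
That check can be scripted. The sketch below only inspects text, so it can be fed the captured output of `ldd`; `dynamic_libs` is a hypothetical helper and the sample lines are illustrative, not output from an actual Envoy build.

```python
def dynamic_libs(ldd_output):
  """Return the library names from the first column of `ldd` output."""
  names = []
  for line in ldd_output.splitlines():
    fields = line.split()
    if fields:
      names.append(fields[0])
  return names


# Illustrative sample only; the addresses and library set are made up.
sample = """\
\tlinux-vdso.so.1 (0x00007ffc0a5f2000)
\tlibpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f1c2a3d0000)
\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1c2a000000)
"""
has_dynamic_libstdcxx = any(
    name.startswith("libstdc++.so") for name in dynamic_libs(sample))
```

For a statically linked libstdc++, `has_dynamic_libstdcxx` should be false while libc and friends still appear in the list.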

Signed-off-by: John Millikin <jmillikin@stripe.com>
…_deps`

Signed-off-by: John Millikin <jmillikin@stripe.com>
Signed-off-by: John Millikin <jmillikin@stripe.com>
Member

@htuch htuch left a comment

Rad, thanks for verifying sanity.

# `g++` and `gcc -lstdc++` have similar behavior and Bazel treats them as
# interchangeable, but `gcc` will ignore the `-static-libstdc++` flag.
# This check lets Envoy statically link against libstdc++ to be more
# portable between intalled glibc versions.
Member

Nit: s/intalled/installed/

Contributor Author

Fixed.

# unless the user has explicitly set environment variables
# before starting Bazel. But here in $PWD is the Bazel sandbox,
# which will be deleted automatically after the compiler exits.
(flagfile_fd, flagfile_path) = tempfile.mkstemp(dir='./', suffix=".linker-params")
Member

Can we use a context manager here, i.e. with?

Contributor Author

Done.
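
For reference, one way to pair `tempfile.mkstemp` with `with` is to hand the returned descriptor to `os.fdopen`, which closes it on exit. The sketch below assumes the flagfile holds one argument per line and that only `-lstdc++` needs removing; the body of `sanitize_flagfile` here is a guess, not the code under review, and `rewrite_flagfile` is a hypothetical caller.

```python
import os
import tempfile


def sanitize_flagfile(in_path, out_fd):
  """Copy a linker flagfile to `out_fd`, dropping `-lstdc++` lines."""
  with open(in_path, "rb") as in_fp, os.fdopen(out_fd, "wb") as out_fp:
    for line in in_fp:
      if line.strip() != b"-lstdc++":
        out_fp.write(line)


def rewrite_flagfile(in_path, directory="."):
  """Write a sanitized copy into `directory` and return its path."""
  out_fd, out_path = tempfile.mkstemp(dir=directory, suffix=".linker-params")
  sanitize_flagfile(in_path, out_fd)
  return out_path
```

The temporary file is not deleted here; as the code comment above notes, the Bazel sandbox is expected to clean it up after the compiler exits.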

envoy_real_cxx = {ENVOY_REAL_CXX}

def sanitize_flagfile(in_path, out_fd):
    with open(in_path, "rb") as in_fp:
Member

2 space indent, standard Google style for Python.

Contributor Author

Done with yapf -i --style '{based_on_style: google, indent_width: 2}' cc_wrapper.py

Signed-off-by: John Millikin <jmillikin@stripe.com>
# unless the user has explicitly set environment variables
# before starting Bazel. But here in $PWD is the Bazel sandbox,
# which will be deleted automatically after the compiler exits.
(flagfile_fd, flagfile_path) = tempfile.mkstemp(
Member

Sigh, hopefully abseil-py (nee pyglib) gets a proper context manager wrapped mktemp someday.

Member

@htuch htuch left a comment

Thanks, this is an impressive cleanup.

@htuch htuch merged commit 5d14d66 into envoyproxy:master Feb 20, 2018
@jmillikin-stripe jmillikin-stripe deleted the cc-wrapper-script branch February 27, 2018 17:57
clnperez pushed a commit to clnperez/envoy that referenced this pull request Mar 28, 2018
…yproxy#2631)
clnperez pushed a commit to clnperez/envoy that referenced this pull request Apr 3, 2018
…yproxy#2631)
clnperez pushed a commit to clnperez/envoy that referenced this pull request Apr 3, 2018
…yproxy#2631)
Shikugawa pushed a commit to Shikugawa/envoy that referenced this pull request Mar 28, 2020
* Merge envoy-wasm for v1.13.0.

Signed-off-by: John Plevyak <jplevyak@gmail.com>

* Update .wasm files.

Signed-off-by: John Plevyak <jplevyak@gmail.com>

* WASM SDK v2 -> v3.

Signed-off-by: John Plevyak <jplevyak@gmail.com>

* Address comments.

Signed-off-by: John Plevyak <jplevyak@gmail.com>

* Bump bazelversion.

Signed-off-by: John Plevyak <jplevyak@gmail.com>