Bazel 5.1.0 toolchain resolution fails on my ARM64 Mac #15175

Closed
nicholasjng opened this issue Apr 4, 2022 · 17 comments

Comments

@nicholasjng
Contributor

Description of the problem / feature request:

Crosspost from this JAX issue on building jaxlib from source.

I have been using Bazel 5.0.0 to (successfully) build jaxlib from source on my machine (Apple M1 Pro, macOS 12.3.1) in the past. My last successful build was about two weeks ago.

When pulling in JAX main today at HEAD and trying to build from source, the build script downloaded Bazel version 5.1.0 for macOS ARM64 from GitHub (this is important), started the build, and failed.
Setting the toolchain debug flag --toolchain_resolution_debug=@bazel_tools//tools/cpp:toolchain_type reveals that the problem is a toolchain resolution failure (see the error below).
Bazel 5.1.0 is now required because of changes in TensorFlow's BUILD definitions; since JAX depends on TensorFlow for XLA, JAX's build process changed as well.

I already talked to @hawkinsp, a JAX core developer who typically answers build-related questions on the project. He suspects that it is a Bazel issue, based on the following observations:

  1. In [5.1] Correct cpu and os values of local_config_cc_toolchains targets #14995, the aarch64 macOS CPU constraint value was renamed to arm64.
  2. In my local Bazel cache, the host platform is identified as aarch64, as evidenced by the following autogenerated platform config file:
# found inside my local Bazel cache, file: local_config_platform/constraints.bzl

# DO NOT EDIT: automatically generated constraints list for local_config_platform
# Auto-detected host platform constraints.
HOST_CONSTRAINTS = [
  '@platforms//cpu:aarch64',
  '@platforms//os:osx',
]

The logic for this is apparently found here (quoting directly from the discussion thread linked above):

case AARCH64:
return "@platforms//cpu:aarch64";

So, at first glance, it looks like Bazel identifies and saves the platform CPU name as aarch64, and later fails to match any macOS toolchains to it, because all of them were renamed as a result of the above pull request. For more information, see the build log attached at the end of this message.
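The suspected mismatch can be illustrated with a toy model of constraint matching. This is only a sketch, not Bazel's actual resolution code: it assumes a toolchain is accepted only when each of its target constraints appears verbatim in the platform's constraint set. The function name `mismatching_values` and the toolchain's constraint set are hypothetical, modelled on the debug log below.

```python
# Toy model of toolchain constraint matching (not Bazel's real implementation).
# Labels are compared as plain strings, so "aarch64" and "arm64" are distinct.

# Taken from the autogenerated local_config_platform/constraints.bzl above.
HOST_CONSTRAINTS = {
    "@platforms//cpu:aarch64",
    "@platforms//os:osx",
}

# Hypothetical target constraints, modelled on the renamed darwin_arm64 toolchain.
DARWIN_ARM64_TOOLCHAIN = {
    "@platforms//cpu:arm64",
    "@platforms//os:osx",
}


def mismatching_values(toolchain_constraints, platform_constraints):
    """Return the toolchain constraints that the platform does not provide."""
    return sorted(toolchain_constraints - platform_constraints)


missing = mismatching_values(DARWIN_ARM64_TOOLCHAIN, HOST_CONSTRAINTS)
if missing:
    print("Rejected toolchain; mismatching values:", ", ".join(missing))
else:
    print("Toolchain accepted")
```

Under this model the only mismatching value is the CPU label, which is consistent with the "mismatching values: arm64" lines in the log.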

I would appreciate feedback or guidance on resolving this issue. Please let me know if there is more information I can provide that would help.

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Prerequisites:

  1. Have an M1-based Mac (macOS version should not be relevant here).
  2. Have a system Python 3 installation of version 3.8 or above (comes with the Xcode build tools, or can be installed through package managers like Homebrew).

Steps to reproduce:

  1. Run git clone https://github.com/google/jax, then cd into the resulting jax directory.
  2. Create a virtual environment (python3 -m venv venv --system-site-packages --upgrade-deps is what I like to use; this works on Python >=3.9, but YMMV).
  3. Run source venv/bin/activate followed by python -m pip install -e . to install the package itself (and its dependencies) in developer mode.
  4. Run python build/build.py. This script downloads Bazel 5.1.0 for the macOS arm64 architecture directly from the Bazel GitHub release and invokes it to build a Python wheel of jaxlib, a companion package of JAX.

What operating system are you running Bazel on?

macOS 12.3.1, Apple M1 Pro Macbook Pro 14".

What's the output of bazel info release?

jax/build on  main [⇡$!?]
➜ ./bazel-5.1.0-darwin-arm64 info release
INFO: Options provided by the client:
  Inherited 'common' options: --isatty=1 --terminal_columns=110
INFO: Reading rc options for 'info' from /Users/nicholasjunge/Workspaces/python/jax/.bazelrc:
  Inherited 'common' options: --experimental_repo_remote_exec
INFO: Reading rc options for 'info' from /Users/nicholasjunge/Workspaces/python/jax/.bazelrc:
  Inherited 'build' options: --apple_platform_type=macos --macos_minimum_os=10.9 --announce_rc --define open_source_build=true --spawn_strategy=standalone --enable_platform_specific_config --experimental_cc_shared_library --define=no_aws_support=true --define=no_gcp_support=true --define=no_hdfs_support=true --define=no_kafka_support=true --define=no_ignite_support=true --define=grpc_no_ares=true -c opt --config=short_logs --copt=-DMLIR_PYTHON_PACKAGE_PREFIX=jaxlib.mlir.
INFO: Reading rc options for 'info' from /Users/nicholasjunge/Workspaces/python/jax/.jax_configure.bazelrc:
  Inherited 'build' options: --strategy=Genrule=standalone --repo_env PYTHON_BIN_PATH=/Users/nicholasjunge/Workspaces/python/jax/venv/bin/python --action_env=PYENV_ROOT --python_path=/Users/nicholasjunge/Workspaces/python/jax/venv/bin/python --distinct_host_configuration=false
INFO: Found applicable config definition build:short_logs in file /Users/nicholasjunge/Workspaces/python/jax/.bazelrc: --output_filter=DONT_MATCH_ANYTHING
INFO: Found applicable config definition build:macos in file /Users/nicholasjunge/Workspaces/python/jax/.bazelrc: --config=posix
INFO: Found applicable config definition build:posix in file /Users/nicholasjunge/Workspaces/python/jax/.bazelrc: --copt=-fvisibility=hidden --copt=-Wno-sign-compare --cxxopt=-std=c++14 --host_cxxopt=-std=c++14
release 5.1.0

If bazel info release returns "development version" or "(@non-git)", tell us how you built Bazel.

(Not applicable, as the Bazel binary was downloaded directly from this GitHub repo's 5.1.0 release.)

What's the output of git remote get-url origin ; git rev-parse master ; git rev-parse HEAD ?

jax/build on  main [⇡$!?]
➜ git remote get-url origin
git@github.com:nicholasjng/jax.git
jax/build on  main [⇡$!?]
➜ git rev-parse main
e1bbbf55cd28abe60175a5cf106c4c2c9c9e044f
jax/build on  main [⇡$!?]
➜ git rev-parse HEAD
e1bbbf55cd28abe60175a5cf106c4c2c9c9e044f

Have you found anything relevant by searching the web?

No.

Any other information, logs, or outputs that you want to share?

jax on  main [⇡$!] via jax 
➜ python build/build.py                            

     _   _  __  __
    | | / \ \ \/ /
 _  | |/ _ \ \  /
| |_| / ___ \/  \
 \___/_/   \/_/\_\


b'\x1b[31mERROR: The project you\'re trying to build requires Bazel 5.1.0 (specified in /Users/nicholasjunge/Workspaces/python/jax/.bazelversion), but it wasn\'t found in /opt/homebrew/Cellar/bazel/5.0.0/libexec/bin.\x1b[0m\n\nBazel binaries for all official releases can be downloaded from here:\n  https://github.com/bazelbuild/bazel/releases\n\nYou can download the required version directly using this command:\n  (cd "/opt/homebrew/Cellar/bazel/5.0.0/libexec/bin" && curl -fLO https://releases.bazel.build/5.1.0/release/bazel-5.1.0-darwin-arm64 && chmod +x bazel-5.1.0-darwin-arm64)\n'
Bazel binary path: ./bazel-5.1.0-darwin-arm64
Bazel version: 5.1.0
Python binary path: /Users/nicholasjunge/Workspaces/python/jax/venv/bin/python
Python version: 3.9
NumPy version: 1.22.3
MKL-DNN enabled: yes
Target CPU: arm64
Target CPU features: release
CUDA enabled: no
TPU enabled: no
ROCm enabled: no

Building XLA and installing it in the jaxlib source tree...
./bazel-5.1.0-darwin-arm64 run --verbose_failures=true --toolchain_resolution_debug=@bazel_tools//tools/cpp:toolchain_type --config=mkl_open_source_only :build_wheel -- --output_path=/Users/nicholasjunge/Workspaces/python/jax/dist --cpu=arm64
INFO: Options provided by the client:
  Inherited 'common' options: --isatty=0 --terminal_columns=80
INFO: Reading rc options for 'run' from /Users/nicholasjunge/Workspaces/python/jax/.bazelrc:
  Inherited 'common' options: --experimental_repo_remote_exec
INFO: Reading rc options for 'run' from /Users/nicholasjunge/Workspaces/python/jax/.bazelrc:
  Inherited 'build' options: --apple_platform_type=macos --macos_minimum_os=10.9 --announce_rc --define open_source_build=true --spawn_strategy=standalone --enable_platform_specific_config --experimental_cc_shared_library --define=no_aws_support=true --define=no_gcp_support=true --define=no_hdfs_support=true --define=no_kafka_support=true --define=no_ignite_support=true --define=grpc_no_ares=true -c opt --config=short_logs --copt=-DMLIR_PYTHON_PACKAGE_PREFIX=jaxlib.mlir.
INFO: Reading rc options for 'run' from /Users/nicholasjunge/Workspaces/python/jax/.jax_configure.bazelrc:
  Inherited 'build' options: --strategy=Genrule=standalone --repo_env PYTHON_BIN_PATH=/Users/nicholasjunge/Workspaces/python/jax/venv/bin/python --action_env=PYENV_ROOT --python_path=/Users/nicholasjunge/Workspaces/python/jax/venv/bin/python --distinct_host_configuration=false
INFO: Found applicable config definition build:short_logs in file /Users/nicholasjunge/Workspaces/python/jax/.bazelrc: --output_filter=DONT_MATCH_ANYTHING
INFO: Found applicable config definition build:mkl_open_source_only in file /Users/nicholasjunge/Workspaces/python/jax/.bazelrc: --define=tensorflow_mkldnn_contraction_kernel=1
INFO: Found applicable config definition build:macos in file /Users/nicholasjunge/Workspaces/python/jax/.bazelrc: --config=posix
INFO: Found applicable config definition build:posix in file /Users/nicholasjunge/Workspaces/python/jax/.bazelrc: --copt=-fvisibility=hidden --copt=-Wno-sign-compare --cxxopt=-std=c++14 --host_cxxopt=-std=c++14
Loading: 
Loading: 0 packages loaded
INFO: Build option --distinct_host_configuration has changed, discarding analysis cache.
Analyzing: target //build:build_wheel (0 packages loaded, 0 targets configured)
DEBUG: Rule 'io_bazel_rules_docker' indicated that a canonical reproducible form can be obtained by modifying arguments shallow_since = "1596824487 -0400"
DEBUG: Repository io_bazel_rules_docker instantiated at:
  /Users/nicholasjunge/Workspaces/python/jax/WORKSPACE:37:14: in <toplevel>
  /private/var/tmp/_bazel_nicholasjunge/270a4a78734ae0f3124fa7265b8a65ef/external/org_tensorflow/tensorflow/workspace0.bzl:107:34: in workspace
  /private/var/tmp/_bazel_nicholasjunge/270a4a78734ae0f3124fa7265b8a65ef/external/bazel_toolchains/repositories/repositories.bzl:35:23: in repositories
Repository rule git_repository defined at:
  /private/var/tmp/_bazel_nicholasjunge/270a4a78734ae0f3124fa7265b8a65ef/external/bazel_tools/tools/build_defs/repo/git.bzl:199:33: in <toplevel>
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-armeabi-v7a; mismatching values: arm
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-armeabi-v7a; mismatching values: arm
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-darwin_arm64; mismatching values: arm64
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-darwin_arm64; mismatching values: arm64
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-darwin_arm64e; mismatching values: arm64
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-darwin_arm64e; mismatching values: arm64
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-darwin_x86_64; mismatching values: x86_64
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-darwin_x86_64; mismatching values: x86_64
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-ios_arm64; mismatching values: ios, arm64
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-ios_arm64; mismatching values: ios, arm64
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-ios_arm64e; mismatching values: ios, arm64
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-ios_arm64e; mismatching values: ios, arm64
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-ios_armv7; mismatching values: ios, armv7
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-ios_armv7; mismatching values: ios, armv7
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-ios_i386; mismatching values: ios, i386
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-ios_i386; mismatching values: ios, i386
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-ios_sim_arm64; mismatching values: ios, arm64
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-ios_sim_arm64; mismatching values: ios, arm64
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-ios_x86_64; mismatching values: ios, x86_64
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-ios_x86_64; mismatching values: ios, x86_64
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-tvos_arm64; mismatching values: tvos, arm64
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-tvos_arm64; mismatching values: tvos, arm64
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-tvos_sim_arm64; mismatching values: tvos, arm64
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-tvos_sim_arm64; mismatching values: tvos, arm64
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-tvos_x86_64; mismatching values: tvos, x86_64
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-tvos_x86_64; mismatching values: tvos, x86_64
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-watchos_arm64; mismatching values: watchos, arm64
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-watchos_arm64; mismatching values: watchos, arm64
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-watchos_arm64_32; mismatching values: watchos, arm64_32
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-watchos_arm64_32; mismatching values: watchos, arm64_32
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-watchos_armv7k; mismatching values: watchos, armv7k
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-watchos_armv7k; mismatching values: watchos, armv7k
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-watchos_i386; mismatching values: watchos, i386
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-watchos_i386; mismatching values: watchos, i386
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-watchos_x86_64; mismatching values: watchos, x86_64
INFO: ToolchainResolution:     Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: Rejected toolchain @local_config_cc//:cc-compiler-watchos_x86_64; mismatching values: watchos, x86_64
INFO: ToolchainResolution:   Type @bazel_tools//tools/cpp:toolchain_type: target platform @local_config_platform//:host: No toolchains found.
ERROR: /Users/nicholasjunge/Workspaces/python/jax/build/BUILD.bazel:25:10: While resolving toolchains for target //build:build_wheel: No matching toolchains found for types @bazel_tools//tools/cpp:toolchain_type. Maybe --incompatible_use_cc_configure_from_rules_cc has been flipped and there is no default C++ toolchain added in the WORKSPACE file? See https://github.com/bazelbuild/bazel/issues/10134 for details and migration instructions.
ERROR: Analysis of target '//build:build_wheel' failed; build aborted: 
INFO: Elapsed time: 0.093s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded, 253 targets configured)
ERROR: Build failed. Not running target
FAILED: Build did NOT complete successfully (0 packages loaded, 253 targets configured)
b''
Traceback (most recent call last):
  File "/Users/nicholasjunge/Workspaces/python/jax/build/build.py", line 527, in <module>
    main()
  File "/Users/nicholasjunge/Workspaces/python/jax/build/build.py", line 522, in main
    shell(command)
  File "/Users/nicholasjunge/Workspaces/python/jax/build/build.py", line 53, in shell
    output = subprocess.check_output(cmd)
  File "/opt/homebrew/Cellar/python@3.9/3.9.12/Frameworks/Python.framework/Versions/3.9/lib/python3.9/subprocess.py", line 424, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/opt/homebrew/Cellar/python@3.9/3.9.12/Frameworks/Python.framework/Versions/3.9/lib/python3.9/subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['./bazel-5.1.0-darwin-arm64', 'run', '--verbose_failures=true', '--toolchain_resolution_debug=@bazel_tools//tools/cpp:toolchain_type', '--config=mkl_open_source_only', ':build_wheel', '--', '--output_path=/Users/nicholasjunge/Workspaces/python/jax/dist', '--cpu=arm64']' returned non-zero exit status 1.
@keith
Member

keith commented Apr 4, 2022

You likely need to upgrade your reference to the platforms repo, which might currently be pulled in transitively somewhere in your project.

@hawkinsp

hawkinsp commented Apr 4, 2022

We also verified that this problem occurs with TensorFlow by itself, not just JAX. I'm honestly not sure how TF gets the platforms repo.

@keith
Member

keith commented Apr 4, 2022

Can you test with Bazel HEAD? The platforms are vendored in Bazel if nothing else includes them, and they were updated recently in https://github.com/bazelbuild/bazel/commit/676a0c8dea0e7782e47a386396e386a51566087f (probably not in 5.x). Either way, the fastest fix for now is likely to pin the newer version.

@nicholasjng
Contributor Author

Some news:

Bazel built from source at HEAD on my machine with the updated platforms repo and #14995 fails with the same toolchain mismatch errors.

Bazel built from source at HEAD on my machine with the updated platforms repo but with #14995 reverted builds jaxlib properly.

I did not change anything else, just branched off of master and did git revert b858ec.

The gist of this issue is, in my opinion, the following:

This was even mentioned in the original PR thread: #14844 (review), but that line of inquiry was abandoned.

Happy to hear your thoughts.

@hawkinsp

hawkinsp commented Apr 5, 2022

@Wyverald Could we look into addressing this for 5.1.1, please?

@Wyverald
Member

Wyverald commented Apr 5, 2022

@bazel-io fork 5.1.1

@Wyverald
Member

Wyverald commented Apr 5, 2022

@katre @keith what do you recommend we should do here? Following @nicholasjng's analysis above, do we just change LocalConfigPlatformFunction to return @platforms//cpu:arm64 instead?

@keith
Member

keith commented Apr 5, 2022

@nicholasjng how did you update the platforms repo? I can reproduce the issue with jax HEAD, but if I add this to the .bazelrc:

build --override_repository=platforms=/Users/ksmiley/dev/platforms

pointing to a local checkout of platforms at HEAD, it works as expected. Is it possible that the way you updated it didn't "stick"? For example, if you put it at the bottom of the WORKSPACE instead of the top (potentially above the TF loading).
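Why placement in the WORKSPACE can matter is easiest to see with a small model of the `maybe` helper from @bazel_tools//tools/build_defs/repo:utils.bzl, which declares a repository only if no repository of that name exists yet. The sketch below is a simplified illustration, not Bazel's evaluation logic: it assumes both declarations go through `maybe`, and the `evaluate` helper, the "newer" version string, and the `declared_by` attribute are hypothetical.

```python
# Simplified model of `maybe`: a repository is declared only if the name is
# still free, so the first declaration in evaluation order wins.


def maybe(registry, name, **attrs):
    """Declare `name` in `registry` unless it is already declared."""
    if name not in registry:
        registry[name] = attrs


def evaluate(declarations):
    """Apply (name, attrs) declarations in WORKSPACE order; return the registry."""
    registry = {}
    for name, attrs in declarations:
        maybe(registry, name, **attrs)
    return registry


# Hypothetical declarations; "newer" stands in for any release with the alias.
old_pin = ("platforms", {"version": "0.0.2", "declared_by": "transitive dependency"})
new_pin = ("platforms", {"version": "newer", "declared_by": "project WORKSPACE"})

print(evaluate([old_pin, new_pin])["platforms"]["version"])  # 0.0.2
print(evaluate([new_pin, old_pin])["platforms"]["version"])  # newer
```

In this model, a pin that is evaluated after the transitive declaration has no effect, which is one way an update could fail to "stick".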

@keith
Member

keith commented Apr 5, 2022

Since aarch64 is defined as an alias of arm64 (https://github.com/bazelbuild/platforms/blob/fbd0d188dac49fbcab3d2876a2113507e6fc68e9/cpu/BUILD#L16-L20), I would expect them to be virtually interchangeable. If that's not the case, maybe that is what we should attempt to fix instead? Otherwise you could always hit this mismatch, just potentially in the opposite direction.
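The expectation that the two labels are interchangeable can be sketched as alias resolution before comparison. This is a toy model rather than Bazel's implementation: the `resolve` and `is_compatible` helpers are hypothetical, and the empty alias table for an older platforms checkout is an assumption about a version in which aarch64 is a plain constraint_value.

```python
# Toy model of label matching with alias resolution (not Bazel's real code).

# Newer @platforms: aarch64 is an alias of arm64, per the BUILD file linked above.
ALIASES_NEW = {"@platforms//cpu:aarch64": "@platforms//cpu:arm64"}
# Assumption for an older @platforms: no alias, so both labels are distinct values.
ALIASES_OLD = {}


def resolve(label, aliases):
    """Follow alias links until a non-alias label is reached."""
    seen = set()
    while label in aliases:
        if label in seen:
            raise ValueError(f"alias cycle at {label}")
        seen.add(label)
        label = aliases[label]
    return label


def is_compatible(toolchain, platform, aliases):
    """True if every resolved toolchain constraint is provided by the platform."""
    provided = {resolve(label, aliases) for label in platform}
    return all(resolve(label, aliases) in provided for label in toolchain)


host = ["@platforms//cpu:aarch64", "@platforms//os:osx"]
toolchain = ["@platforms//cpu:arm64", "@platforms//os:osx"]

print(is_compatible(toolchain, host, ALIASES_OLD))  # False
print(is_compatible(toolchain, host, ALIASES_NEW))  # True
```

With the alias in place, the host's aarch64 label and the toolchain's arm64 label resolve to the same value, so the direction of the rename no longer matters.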

@keith
Member

keith commented Apr 5, 2022

In an entirely empty project, I can see that the version of platforms used by Bazel has contained this fix since 5.x:

$ USE_BAZEL_VERSION=5.0.0 bazelisk query @platforms//cpu:all --output=build | head -10
Extracting Bazel installation...
Starting local Bazel server and connecting to it...
Loading: 0 packages loaded
# /private/var/tmp/_bazel_ksmiley/f5d0820468e3a59ee7b1f838e9951ebb/external/platforms/cpu/BUILD:17:6
alias(
  name = "aarch64",
  actual = "@platforms//cpu:arm64",
)

which makes me think that somehow TF, and transitively jax, are just pulling in older versions through some other transitive dependency.

@keith
Member

keith commented Apr 5, 2022

With this branch of jax where I bump platforms the issue goes away https://github.com/keith/jax/tree/platforms-update

@sgowroji sgowroji added team-Configurability platforms, toolchains, cquery, select(), config transitions untriaged labels Apr 6, 2022
@nicholasjng
Contributor Author

I get a slightly different output, using Bazel 5.1.0 from GitHub as downloaded during the build:

jax on  main [⇡$!] via jax 
➜ build/bazel-5.1.0-darwin-arm64 query @platforms//cpu:all --output=build | head -10
Loading: 0 packages loaded
# /private/var/tmp/_bazel_nicholasjunge/270a4a78734ae0f3124fa7265b8a65ef/external/platforms/cpu/BUILD:17:17
constraint_value(
  name = "aarch64",
  constraint_setting = "@platforms//cpu:cpu",
)
# Rule aarch64 instantiated at (most recent call last):
#   /private/var/tmp/_bazel_nicholasjunge/270a4a78734ae0f3124fa7265b8a65ef/external/platforms/cpu/BUILD:17:17 in <toplevel>

# /private/var/tmp/_bazel_nicholasjunge/270a4a78734ae0f3124fa7265b8a65ef/external/platforms/cpu/BUILD:23:17
constraint_value(
Loading: 0 packages loaded
Loading: 0 packages loaded

Nevertheless, your patch works for me as well. jaxlib builds with a downloaded Bazel, which is what we want.

Does this, in turn, mean that if one target pins the platforms repo to some version, that version is also set globally?

Good job on finding that. I had no idea about the implications of different versions of platforms being pulled in, and thought Bazel would only use its vendored version if nothing else was specified.
Thank you for clearing that up, I guess that means this is a configuration problem instead of a Bazel one (though it is a little confusing).

@hawkinsp

hawkinsp commented Apr 6, 2022

I believe the old platforms repository is being included via https://github.com/tensorflow/runtime/blob/ed92908bf93f09db579f4be41e8f4ae567bce0e1/third_party/rules_cuda/cuda/dependencies.bzl#L61

I'll manually override it in JAX for now. Thanks!

@Wyverald
Member

Wyverald commented Apr 6, 2022

Nice. So to confirm, is this no longer a 5.1.1 blocker?

@hawkinsp

hawkinsp commented Apr 6, 2022

I sent jax-ml/jax#10164 to apply the @platforms workaround; @nicholasjng needs to confirm it works for him!

@nicholasjng
Contributor Author

Can confirm, it works with a normal Bazel release download from GitHub now, no patches required. Thanks for the help!

@Wyverald Wyverald closed this as completed Apr 6, 2022
copybara-service bot pushed a commit to tensorflow/runtime that referenced this issue Apr 6, 2022
The goal of this change is to fix build failures on Mac ARM apparently caused by an outdated copy of @platforms.
See bazelbuild/bazel#15175

PiperOrigin-RevId: 439858709
@keith
Member

keith commented Apr 6, 2022

For future reference for readers, you can see where platforms is coming from by doing a query like this:

% bazel query //external:platforms --output=build
# /Users/ksmiley/dev/tensorflow/WORKSPACE:19:14
http_archive(
  name = "platforms",
  generator_name = "platforms",
  generator_function = "workspace",
  urls = ["https://mirror.bazel.build/github.com/bazelbuild/platforms/releases/download/0.0.2/platforms-0.0.2.tar.gz", "https://github.com/bazelbuild/platforms/releases/download/0.0.2/platforms-0.0.2.tar.gz"],
  sha256 = "48a2d8d343863989c232843e01afc8a986eb8738766bfd8611420a7db8f6f0c3",
)
# Rule platforms instantiated at (most recent call last):
#   /Users/ksmiley/dev/tensorflow/WORKSPACE:19:14                                                                                in <toplevel>
#   /Users/ksmiley/dev/tensorflow/tensorflow/workspace1.bzl:11:28                                                                in workspace
#   /private/var/tmp/_bazel_ksmiley/98191c5afed6ae517e237dc0ba08559e/external/rules_cuda/cuda/dependencies.bzl:61:10             in rules_cuda_dependencies
#   /private/var/tmp/_bazel_ksmiley/98191c5afed6ae517e237dc0ba08559e/external/bazel_tools/tools/build_defs/repo/utils.bzl:233:18 in maybe
# Rule http_archive defined at (most recent call last):
#   /private/var/tmp/_bazel_ksmiley/98191c5afed6ae517e237dc0ba08559e/external/bazel_tools/tools/build_defs/repo/http.bzl:364:31 in <toplevel>

Loading: 0 packages loaded

There you can see the stack trace that led to including it, and the version being used. You can also query the CPU definition directly to see if you have the old one:

% bazel query @platforms//cpu:aarch64 --output=build
# /private/var/tmp/_bazel_ksmiley/98191c5afed6ae517e237dc0ba08559e/external/platforms/cpu/BUILD:17:17
constraint_value(
  name = "aarch64",
  constraint_setting = "@platforms//cpu:cpu",
)
# Rule aarch64 instantiated at (most recent call last):
#   /private/var/tmp/_bazel_ksmiley/98191c5afed6ae517e237dc0ba08559e/external/platforms/cpu/BUILD:17:17 in <toplevel>

Loading: 0 packages loaded

Or the new one:

% bazel query @platforms//cpu:aarch64 --output=build --override_repository=platforms=/Users/ksmiley/dev/platforms
# /private/var/tmp/_bazel_ksmiley/98191c5afed6ae517e237dc0ba08559e/external/platforms/cpu/BUILD:17:6
alias(
  name = "aarch64",
  actual = "@platforms//cpu:arm64",
)
# Rule aarch64 instantiated at (most recent call last):
#   /private/var/tmp/_bazel_ksmiley/98191c5afed6ae517e237dc0ba08559e/external/platforms/cpu/BUILD:17:6 in <toplevel>

Loading: 1 packages loaded
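The old-versus-new check above can also be scripted. The following is a minimal sketch that classifies the text printed by bazel query @platforms//cpu:aarch64 --output=build; the function name `classify_aarch64` is hypothetical, and the parsing assumes the output layout shown in the two examples above.

```python
import re

# Matches a rule header such as "alias(" or "constraint_value(" whose first
# attribute is name = "aarch64". Comment lines in the output are skipped
# because the pattern must start at the beginning of a line with a rule kind.
_RULE_PATTERN = re.compile(r'^(\w+)\(\s*name = "aarch64",', re.MULTILINE)


def classify_aarch64(query_output):
    """Return the rule kind that defines aarch64, or None if it is not found."""
    match = _RULE_PATTERN.search(query_output)
    return match.group(1) if match else None


# Sample inputs copied from the query results shown above.
OLD_OUTPUT = '''# .../external/platforms/cpu/BUILD:17:17
constraint_value(
  name = "aarch64",
  constraint_setting = "@platforms//cpu:cpu",
)
'''

NEW_OUTPUT = '''# .../external/platforms/cpu/BUILD:17:6
alias(
  name = "aarch64",
  actual = "@platforms//cpu:arm64",
)
'''

print(classify_aarch64(OLD_OUTPUT))  # constraint_value
print(classify_aarch64(NEW_OUTPUT))  # alias
```

A result of constraint_value suggests that an older platforms repository is in use, while alias suggests a version in which aarch64 and arm64 refer to the same constraint.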

sluongng added a commit to sluongng/copybara that referenced this issue Sep 5, 2022
In Bazel versions from 5.1.0 and older, there was a change [1] which
prevents copybara from compiling on Apple Silicon by default.

The solution recommended in [2] was to pin `platforms` repository to a
newer version where the constraints value for CPP toolchain could be
correctly resolved.

Without this change, we would need to use Bazel 5.0.0 or older to
compile copybara successfully on Apple Silicon.

[1]: bazelbuild/bazel#14844
[2]: bazelbuild/bazel#15175
copybara-staging bot pushed a commit to google/copybara that referenced this issue Sep 7, 2022
In Bazel versions from 5.1.0 and older, there was a change [1] which
prevents copybara from compiling on Apple Silicon by default.

The solution recommended in [2] was to pin `platforms` repository to a
newer version where the constraints value for CPP toolchain could be
correctly resolved.

Without this change, we would need to use Bazel 5.0.0 or older to
compile copybara successfully on Apple Silicon.

[1]: bazelbuild/bazel#14844
[2]: bazelbuild/bazel#15175

Fixes #207

Change-Id: I8f71518f3c569de794fd60acb899d835323fccc9

6 participants