Skip to content

Commit

Permalink
feat(pip_parse): support patching 'whl_library'
Browse files Browse the repository at this point in the history
Before that the users had to rely on patching the actual wheel files and
uploading them as different versions to internal artifact stores if they
needed to modify the wheel dependencies. This is very common when
breaking dependency cycles in `pytorch` or `apache-airflow` packages.
With this feature we can support patching external PyPI dependencies via
unified patches passed into the `pip.whl_mods` extension and the legacy
`package_annotation` macro.

Fixes bazelbuild#1076.

Add a non-empty patch and show that there can be multiple patches

exp: A different design, that does not require us to put patches to annotations.json

Simplify the design and add extra notes in the implementation

fix: make the legacy WORKSPACE patching compatible with bazel 5.4

chore: update docs

refactor: s/module_override/whl_override

doc: update changelog

doc: improve documentation and code comments on the new features

s/whl_override/whl_library_override/g

refactor wheel installer

feat: support whl_overriding before extraction

Add a note

doc: update changelog

fixup: set better default values for patches

chore: add wheel_repackager.py to the list of pysrcs

fix rebase conflicts

fix: use better defaults for annotations

fixup: update docs

fixup: minor tidy up

add a comment on annotation support for bzlmod

refactor: whl patching to a separate function

finish cleaning up handling of whl_patches for python annotations

Add an empty patch to the pip_repository_annotations example

Move patch argument processing to a single place, next to annotations

Improve the script and add logging

feat!: remove legacy bzlmod patching example

doc: remove changelog entry for legacy patching

feat!: remove the patching support via pip annotations

feat!: remove support for whl_library patching for now
  • Loading branch information
aignas committed Sep 25, 2023
1 parent 0d0e183 commit 2d3eedc
Show file tree
Hide file tree
Showing 13 changed files with 300 additions and 44 deletions.
4 changes: 2 additions & 2 deletions .bazelrc
Original file line number Diff line number Diff line change
Expand Up @@ -3,8 +3,8 @@
# This lets us glob() up all the files inside the examples to make them inputs to tests
# (Note, we cannot use `common --deleted_packages` because the bazel version command doesn't support it)
# To update these lines, run tools/bazel_integration_test/update_deleted_packages.sh
build --deleted_packages=examples/build_file_generation,examples/build_file_generation/random_number_generator,examples/bzlmod,examples/bzlmod/entry_points,examples/bzlmod/entry_points/tests,examples/bzlmod/libs/my_lib,examples/bzlmod/other_module,examples/bzlmod/other_module/other_module/pkg,examples/bzlmod/runfiles,examples/bzlmod/tests,examples/bzlmod/tests/other_module,examples/bzlmod/whl_mods,examples/bzlmod_build_file_generation,examples/bzlmod_build_file_generation/other_module/other_module/pkg,examples/bzlmod_build_file_generation/runfiles,examples/multi_python_versions/libs/my_lib,examples/multi_python_versions/requirements,examples/multi_python_versions/tests,examples/pip_install,examples/pip_parse,examples/pip_parse_vendored,examples/pip_repository_annotations,examples/py_proto_library,tests/compile_pip_requirements,tests/compile_pip_requirements_test_from_external_workspace,tests/ignore_root_user_error,tests/pip_repository_entry_points
query --deleted_packages=examples/build_file_generation,examples/build_file_generation/random_number_generator,examples/bzlmod,examples/bzlmod/entry_points,examples/bzlmod/entry_points/tests,examples/bzlmod/libs/my_lib,examples/bzlmod/other_module,examples/bzlmod/other_module/other_module/pkg,examples/bzlmod/runfiles,examples/bzlmod/tests,examples/bzlmod/tests/other_module,examples/bzlmod/whl_mods,examples/bzlmod_build_file_generation,examples/bzlmod_build_file_generation/other_module/other_module/pkg,examples/bzlmod_build_file_generation/runfiles,examples/multi_python_versions/libs/my_lib,examples/multi_python_versions/requirements,examples/multi_python_versions/tests,examples/pip_install,examples/pip_parse,examples/pip_parse_vendored,examples/pip_repository_annotations,examples/py_proto_library,tests/compile_pip_requirements,tests/compile_pip_requirements_test_from_external_workspace,tests/ignore_root_user_error,tests/pip_repository_entry_points
build --deleted_packages=examples/build_file_generation,examples/build_file_generation/random_number_generator,examples/bzlmod,examples/bzlmod/entry_points,examples/bzlmod/entry_points/tests,examples/bzlmod/libs/my_lib,examples/bzlmod/other_module,examples/bzlmod/other_module/other_module/pkg,examples/bzlmod/patches,examples/bzlmod/runfiles,examples/bzlmod/tests,examples/bzlmod/tests/other_module,examples/bzlmod/whl_mods,examples/bzlmod_build_file_generation,examples/bzlmod_build_file_generation/other_module/other_module/pkg,examples/bzlmod_build_file_generation/runfiles,examples/multi_python_versions/libs/my_lib,examples/multi_python_versions/requirements,examples/multi_python_versions/tests,examples/pip_install,examples/pip_parse,examples/pip_parse_vendored,examples/pip_repository_annotations,examples/pip_repository_annotations/patches,examples/py_proto_library,tests/compile_pip_requirements,tests/compile_pip_requirements_test_from_external_workspace,tests/ignore_root_user_error,tests/pip_repository_entry_points
query --deleted_packages=examples/build_file_generation,examples/build_file_generation/random_number_generator,examples/bzlmod,examples/bzlmod/entry_points,examples/bzlmod/entry_points/tests,examples/bzlmod/libs/my_lib,examples/bzlmod/other_module,examples/bzlmod/other_module/other_module/pkg,examples/bzlmod/patches,examples/bzlmod/runfiles,examples/bzlmod/tests,examples/bzlmod/tests/other_module,examples/bzlmod/whl_mods,examples/bzlmod_build_file_generation,examples/bzlmod_build_file_generation/other_module/other_module/pkg,examples/bzlmod_build_file_generation/runfiles,examples/multi_python_versions/libs/my_lib,examples/multi_python_versions/requirements,examples/multi_python_versions/tests,examples/pip_install,examples/pip_parse,examples/pip_parse_vendored,examples/pip_repository_annotations,examples/pip_repository_annotations/patches,examples/py_proto_library,tests/compile_pip_requirements,tests/compile_pip_requirements_test_from_external_workspace,tests/ignore_root_user_error,tests/pip_repository_entry_points

test --test_output=errors

Expand Down
3 changes: 3 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,9 @@ A brief description of the categories of changes:
* (python_repository) Support `netrc` and `auth_patterns` attributes to enable
authentication against private HTTP hosts serving Python toolchain binaries.

* (bzlmod) Added patching support via `patches` and `patch_strip` arguments to
the new `pip.whl_override` tag class.

### Removed

* (bzlmod) The `entry_point` macro is no longer supported and has been removed
Expand Down
5 changes: 4 additions & 1 deletion docs/pip_repository.md

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

9 changes: 9 additions & 0 deletions examples/bzlmod/MODULE.bazel
Original file line number Diff line number Diff line change
Expand Up @@ -113,6 +113,15 @@ pip.parse(
"@whl_mods_hub//:wheel.json": "wheel",
},
)

# You can add patches that will be applied on the extracted whl contents
pip.whl_override(
patch_strip = 1,
patches = [
"@//patches:empty.patch",
],
whl_name = "requests",
)
use_repo(pip, "pip")

bazel_dep(name = "other_module", version = "", repo_name = "our_other_module")
Expand Down
4 changes: 4 additions & 0 deletions examples/bzlmod/patches/BUILD.bazel
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
exports_files(
srcs = glob(["*.patch"]),
visibility = ["//visibility:public"],
)
Empty file.
65 changes: 62 additions & 3 deletions python/extensions/pip.bzl
Original file line number Diff line number Diff line change
Expand Up @@ -78,7 +78,7 @@ You cannot use both the additive_build_content and additive_build_content_file a
whl_mods = whl_mods,
)

def _create_versioned_pip_and_whl_repos(module_ctx, pip_attr, whl_map):
def _create_versioned_pip_and_whl_repos(module_ctx, pip_attr, whl_map, whl_overrides):
python_interpreter_target = pip_attr.python_interpreter_target

# if we do not have the python_interpreter set in the attributes
Expand All @@ -97,9 +97,10 @@ def _create_versioned_pip_and_whl_repos(module_ctx, pip_attr, whl_map):
))
python_interpreter_target = INTERPRETER_LABELS[python_name]

python_version = version_label(pip_attr.python_version)
pip_name = "{}_{}".format(
hub_name,
version_label(pip_attr.python_version),
python_version,
)
requrements_lock = locked_requirements_label(module_ctx, pip_attr)

Expand Down Expand Up @@ -129,19 +130,26 @@ def _create_versioned_pip_and_whl_repos(module_ctx, pip_attr, whl_map):
for mod, whl_name in pip_attr.whl_modifications.items():
whl_modifications[whl_name] = mod

no_patches = struct(patches = None, patch_strip = None)

# Create a new wheel library for each of the different whls
for whl_name, requirement_line in requirements:
# We are not using the "sanitized name" because the user
# would need to guess what name we modified the whl name
# to.
annotation = whl_modifications.get(whl_name)
whl_name = normalize_name(whl_name)

whl_patch = whl_overrides.get(whl_name, no_patches)

whl_library(
name = "%s_%s" % (pip_name, whl_name),
requirement = requirement_line,
repo = pip_name,
repo_prefix = pip_name + "_",
annotation = annotation,
whl_patches = whl_patch.patches,
whl_patch_strip = whl_patch.patch_strip,
python_interpreter = pip_attr.python_interpreter,
python_interpreter_target = python_interpreter_target,
quiet = pip_attr.quiet,
Expand Down Expand Up @@ -247,6 +255,19 @@ def _pip_impl(module_ctx):
# Build all of the wheel modifications if the tag class is called.
_whl_mods_impl(module_ctx)

whl_overrides = {}

for module in module_ctx.modules:
for attr in module.tags.whl_override:
whl_name = normalize_name(attr.whl_name)
if whl_name in whl_overrides:
fail("Duplicate module overrides for '{}'".format(attr))

whl_overrides[whl_name] = struct(
patches = attr.patches,
patch_strip = attr.patch_strip,
)

# Used to track all the different pip hubs and the spoke pip Python
# versions.
pip_hub_map = {}
Expand Down Expand Up @@ -292,7 +313,12 @@ def _pip_impl(module_ctx):
python_versions = [pip_attr.python_version],
)

_create_versioned_pip_and_whl_repos(module_ctx, pip_attr, hub_whl_map)
_create_versioned_pip_and_whl_repos(
module_ctx,
pip_attr,
hub_whl_map,
whl_overrides,
)

for hub_name, whl_map in hub_whl_map.items():
for whl_name, version_map in whl_map.items():
Expand Down Expand Up @@ -428,6 +454,38 @@ cannot have a child module that uses the same `hub_name`.
}
return attrs

_whl_override_tag = tag_class(
attrs = {
"patch_strip": attr.int(
default = 0,
doc = "The number of leading path segments to be stripped from the file name in the patches.",
),
"patches": attr.label_list(
doc = "A list of patches to apply to the repository *after* 'whl_library' is extracted and BUILD.bazel file is generated.",
mandatory = True,
),
# FIXME @aignas 2023-09-08: This is not strictly connected to the hub repo and we can override any whl
# for a particular package, however, any platform specific patching would not work. I am not sure
# about how to handle this properly - asking the user to input the full wheel name may be too verbose,
# but more correct but then the platform tag zoo is huge.
#
# We could make something like:
# patches_by_platform {
# "manylinux_2014": [...],
# }
#
# and we could default to patches_by_platform = {"any": patches},
"whl_name": attr.string(
doc = """The Python wheel name for the repository to be overridden.
This wheel name must be defined by other tags in this
extension within this Bazel module.""",
mandatory = True,
),
},
doc = "Apply patches to a given Python wheel library defined by other tags in this extension.",
)

pip = module_extension(
doc = """\
This extension is used to make dependencies from pip available.
Expand Down Expand Up @@ -469,6 +527,7 @@ JSON files where referred to as annotations, and were renamed to whl_modificatio
extension.
""",
),
"whl_override": _whl_override_tag,
},
)

Expand Down
74 changes: 73 additions & 1 deletion python/pip_install/pip_repository.bzl
Original file line number Diff line number Diff line change
Expand Up @@ -612,6 +612,47 @@ py_binary(
environ = common_env,
)

def _patch_whl_file(rctx, *, python_interpreter, name, args, patches, patch_strip, quiet, timeout):
"""Patch a whl file and repack it to ensure that the RECORD metadata stays correct.
"""
no_extract_args = [arg for arg in args] + ["--no-extract"]
result = rctx.execute(
no_extract_args,
# Manually construct the PYTHONPATH since we cannot use the toolchain here
environment = _create_repository_execution_environment(rctx, python_interpreter),
quiet = quiet,
timeout = timeout,
)
if result.return_code:
fail("whl_library %s failed: %s (%s) error code: '%s'" % (name, result.stdout, result.stderr, result.return_code))

whl_file = json.decode(rctx.read("whl_file.json"))["whl_file"]
if not rctx.delete("whl_file.json"):
fail("failed to delete the whl_file.json file")

# extract files into the current directory for patching as rctx.patch
# does not support patching in another directory.
rctx.extract(whl_file)

for patch in patches:
rctx.patch(patch, strip = patch_strip)

result = rctx.execute(
[
python_interpreter,
"-m",
"python.pip_install.tools.wheel_installer.wheel_repackager",
whl_file,
],
environment = _create_repository_execution_environment(rctx, python_interpreter),
quiet = quiet,
timeout = timeout,
)
if result.return_code:
fail("repackaging .whl %s failed: %s (%s) error code: '%s'" % (name, result.stdout, result.stderr, result.return_code))

return whl_file.replace(".whl.orig.zip", ".whl")

def _whl_library_impl(rctx):
python_interpreter = _resolve_python_interpreter(rctx)
args = [
Expand All @@ -624,6 +665,24 @@ def _whl_library_impl(rctx):

args = _parse_optional_attrs(rctx, args)

annotation = None
if rctx.attr.annotation:
json_contents = json.decode(rctx.read(rctx.attr.annotation))
annotation = struct(**json_contents)

if rctx.attr.whl_patches:
whl_file = _patch_whl_file(
rctx,
name = rctx.attr.name,
args = args,
patch_strip = rctx.attr.whl_patch_strip,
patches = rctx.attr.whl_patches,
python_interpreter = python_interpreter,
quiet = rctx.attr.quiet,
timeout = rctx.attr.timeout,
)
args += ["--whl-file", whl_file]

result = rctx.execute(
args,
# Manually construct the PYTHONPATH since we cannot use the toolchain here
Expand Down Expand Up @@ -658,6 +717,11 @@ def _whl_library_impl(rctx):
)
entry_points[entry_point_without_py] = entry_point_script_name

annotation = None
if rctx.attr.annotation:
json_contents = json.decode(rctx.read(rctx.attr.annotation))
annotation = struct(**json_contents)

build_file_contents = generate_whl_library_build_bazel(
repo_prefix = rctx.attr.repo_prefix,
dependencies = metadata["deps"],
Expand All @@ -667,7 +731,7 @@ def _whl_library_impl(rctx):
"pypi_version=" + metadata["version"],
],
entry_points = entry_points,
annotation = None if not rctx.attr.annotation else struct(**json.decode(rctx.read(rctx.attr.annotation))),
annotation = annotation,
)
rctx.file("BUILD.bazel", build_file_contents)

Expand Down Expand Up @@ -717,6 +781,14 @@ whl_library_attrs = {
mandatory = True,
doc = "Python requirement string describing the package to make available",
),
"whl_patch_strip": attr.int(
doc = "Strip the specified number of leading components from file names. Applies to all whl_patches.",
default = 0,
),
"whl_patches": attr.label_list(
doc = "Patches to be applied after building/downloading the '.whl' file before generating BUILD.bazel files and extracting it.",
allow_files = True,
),
"_python_path_entries": attr.label_list(
# Get the root directory of these rules and keep them as a default attribute
# in order to avoid unnecessary repository fetching restarts.
Expand Down
1 change: 1 addition & 0 deletions python/pip_install/private/srcs.bzl
Original file line number Diff line number Diff line change
Expand Up @@ -13,4 +13,5 @@ PIP_INSTALL_PY_SRCS = [
"@rules_python//python/pip_install/tools/wheel_installer:namespace_pkgs.py",
"@rules_python//python/pip_install/tools/wheel_installer:wheel.py",
"@rules_python//python/pip_install/tools/wheel_installer:wheel_installer.py",
"@rules_python//python/pip_install/tools/wheel_installer:wheel_repackager.py",
]
7 changes: 7 additions & 0 deletions python/pip_install/tools/wheel_installer/BUILD.bazel
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,13 @@ py_library(
],
)

py_binary(
name = "wheel_repackager",
srcs = [
"wheel_repackager.py",
],
)

py_binary(
name = "wheel_installer",
srcs = [
Expand Down
9 changes: 9 additions & 0 deletions python/pip_install/tools/wheel_installer/arguments.py
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@

import argparse
import json
import pathlib
from typing import Any


Expand Down Expand Up @@ -59,6 +60,14 @@ def parser(**kwargs: Any) -> argparse.ArgumentParser:
help="Use 'pip download' instead of 'pip wheel'. Disables building wheels from source, but allows use of "
"--platform, --python-version, --implementation, and --abi in --extra_pip_args.",
)
parser.add_argument(
"--whl-file", type=pathlib.Path, help="The file to be used for extraction."
)
parser.add_argument(
"--no-extract",
action="store_true",
help="Wether to extract the downloaded file.",
)
return parser


Expand Down
Loading

0 comments on commit 2d3eedc

Please sign in to comment.