Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Labels in repo rule impls disregard repo_mapping #12963

Closed
Wyverald opened this issue Feb 4, 2021 · 4 comments
Closed

Labels in repo rule impls disregard repo_mapping #12963

Wyverald opened this issue Feb 4, 2021 · 4 comments
Labels
P4 This is either out of scope or we don't have bandwidth to review a PR. (No assignee) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: bug

Comments

@Wyverald
Copy link
Member

Wyverald commented Feb 4, 2021

Not entirely sure whether this is a bug. This happens when you call the Label("@some_repo//something") function to construct a label in the impl of a repository rule -- if the repo rule is defined in repo X that maps "some_repo" to "some_other_repo", you'd expect that the label would end up point to "some_other_repo"; but no, "some_repo" is used.

This can be a bit surprising, since it differs from the behavior of the load statement (understandably since these are resolved at different times). But it can also be surprising if we introduce strict deps for repos.

The repro case is a bit hard to construct, but bear with me:

WORKSPACE:

local_repository(name="data", path="data")
local_repository(name="otherdata", path="otherdata")
local_repository(name="rule_definer", path="rule_definer", repo_mapping={"@data": "@otherdata"})
load("@rule_definer//:defs.bzl", "myrule")
myrule(name="myrepo")
load("@myrepo//:defs.bzl", "thing")
print(thing)

rule_definer/defs.bzl:

load("@data//:defs.bzl", "foo")  # loads from "otherdata/defs.bzl", not "data/defs.bzl"
def _impl(ctx):
  line1=ctx.read(Label("@data//:foo.txt"))  # reads the file "data/foo.txt", not "otherdata/foo.txt"
  line2=ctx.read(Label("//:foo.txt"))  # reads the file "rule_definer/foo.txt", not "($root/)foo.txt"
myrule=repository_rule(implementation=_impl)

Now imagine we had strict deps for repos (as is expected for the external deps overhaul), the repo @rule_definer doesn't expect visibility into the repo @data at all; it just wants @otherdata (which it unfortunately calls @data).

On the other hand, it's arguable that this is WAI, since we are constructing a repo from the perspective of the root repo (where WORKSPACE lives); but that's inconsistent with line2. (Also imagine if either label was not hardcoded in rule_definer/defs.bzl, but provided as an attribute written in WORKSPACE itself -- even more inconsistencies!) Maybe the solution is just to say "when you hardcode labels as is being done here, don't ever hardcode a label with a @repo part".

cc @meteorcloudy @brandjon

@Wyverald Wyverald added team-Starlark team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. labels Feb 4, 2021
@brandjon
Copy link
Member

brandjon commented Feb 5, 2021

Yep, this sounds like the repo remapping not being propagated to all contexts where we construct Labels. I expect this mode of bug to be present in some other circumstances too, and it remains open work to nail down the concept of repo remapping and to what extent we wish to continue to support it as a feature. (If we do support it, we should make it harder to misapply it.)

On the other hand, it's arguable that this is WAI, since we are constructing a repo from the perspective of the root repo (where WORKSPACE lives); but that's inconsistent with line2.

Right. Calls to Label() should be consistent with how we resolve repos in load() statements. The rationale is that it's not the responsibility of the author of @rule_definer to know how the main repo will remap them.

(Also imagine if either label was not hardcoded in rule_definer/defs.bzl, but provided as an attribute written in WORKSPACE itself -- even more inconsistencies!)

The same sort of issue exists for BUILD files: Imagine that you call a macro belonging to another repo, passing in a string label that references a repo remapped within the macro's repo.

I believe the conceptual solution in that case is supposed to be to convert the string to a Label at the point where you want to lock in the repo mapping. This would mean calling your macro with Label("//pkg:foo") instead of "//pkg:foo". This falls apart for two reasons: first, it requires the string be absolute, not relative, and second, BUILD files don't have a Label predeclared symbol. What's more, if you load() the Label builtin from a .bzl file (re-exported as a different name so it doesn't collide within the .bzl), it crashes Bazel with an NPE due to the needed contextual information not being available in BUILD files. Filed #12971 to track that separately.

Maybe the solution is just to say "when you hardcode labels as is being done here, don't ever hardcode a label with a @repo part".

No, repo rules need to be able to resolve things from other repos. One case of this is when you generate BUILD/bzl content for your repo that embeds labels of other repos. If you want to resolve them correctly in the face of repo remapping, you need to do that resolution within your own repo rule, then convert it to a string and substitute it into the text that you're generating.

Needless to say, it's easy to get this wrong, hence why we want to reevaluate the whole repo renaming model.

@brandjon brandjon added P3 We're not considering working on this, but happy to review a PR. (No assignee) type: bug labels Feb 5, 2021
@brandjon
Copy link
Member

brandjon commented Feb 5, 2021

My reading of the code is that the repo mapping is set here:

rule.getPackage().getRepositoryMapping(),

But rule.getPackage() would be the WORKSPACE file (WORKSPACE is analogous to BUILD, repository rule instantiations are analogous to rule instantiations i.e. targets, and "Rule" in Bazel's source code really means "rule target" or simply "target"). So it sounds to me like we're using the repo mapping of the main repo, not the one whose fetch we're actually evaluating.

Curiously enough, I see the same logic in normal BUILD rules' analysis, so I wonder if we're giving the wrong Label values in rule implementation functions.

I don't intend to follow up on this any further until we come to a decision on whether to keep remapping around in its existing form.

@Wyverald
Copy link
Member Author

Wyverald commented Feb 8, 2021

I don't intend to follow up on this any further until we come to a decision on whether to keep remapping around in its existing form.

What do you mean? Is there a plan to remove repo mapping? bzlmod makes extensive use of repo mapping to support repo renaming (bazel_dep.repo_name), and having dependencies on multiple versions of the same module (override_dep.allow_multiple_versions). Or did you mean that repo renaming needs to be changed to some other form?

@brandjon
Copy link
Member

We discussed offline, but for everyone else: I don't have any particular plan regarding repo remapping other than to study it further to see whether it needs to exist in its current form. If it does, we need to document it better (both to users and within the codebase).

@brandjon brandjon added P4 This is either out of scope or we don't have bandwidth to review a PR. (No assignee) team-Build-Language and removed P3 We're not considering working on this, but happy to review a PR. (No assignee) team-Starlark labels Feb 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P4 This is either out of scope or we don't have bandwidth to review a PR. (No assignee) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: bug
Projects
None yet
Development

No branches or pull requests

2 participants