Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Add java_current_repository rule #16281

Closed
wants to merge 1 commit into from

Conversation

fmeum
Copy link
Collaborator

@fmeum fmeum commented Sep 15, 2022

Work towards #16124

@fmeum fmeum force-pushed the 16124-java-current-repository branch 15 times, most recently from 92e8453 to 7dea6a3 Compare September 15, 2022 16:54
@fmeum fmeum force-pushed the 16124-java-current-repository branch 4 times, most recently from 9e03e88 to 0b88306 Compare September 26, 2022 06:02
@fmeum fmeum force-pushed the 16124-java-current-repository branch from 0b88306 to e92f0de Compare September 26, 2022 07:02
@fmeum
Copy link
Collaborator Author

fmeum commented Sep 27, 2022

@comius Could you review the overall design of this PR? The context is at bazelbuild/proposals#274 (comment) (and following comments). The general idea is to inject a static final constant with the current repository name into all Java targets without prohibitive overhead. This is achieved by depending on a generated target through a transition that sets a Starlark build setting to the name of the current repository. Most of the complexity of the PR comes from the need to do this for both Starlarkified and native Java rules.

@gtech-bazel-bot gtech-bazel-bot bot added the awaiting-review PR is awaiting review from an assigned reviewer label Sep 28, 2022
Copy link
Contributor

@comius comius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick and incomplete review, but I hope it gives you some inputs where you can continue on this PR.

)

def _java_current_repository_impl(ctx):
java_file = ctx.actions.declare_file("_java_current_repository/RunfilesConstants.java")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered creating a .properties file and reading it as a resource? This wouldn't need to run a compiler. But I don't know if it's regenerated at the right time, that is switching repository.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be nicer, but I can't see it work: A given java_binary potentially bundles java_library targets from different repositories with different canonical repo names. How would each library look up its own repo name in a shared manifest?

Comment on lines +182 to +192
# Consumed by _current_repository_transition.
"current_repository": attr.string(),
"_runfiles_constants": attr.label(
default = "@bazel_tools//tools/java/runfiles:java_current_repository",
providers = [JavaInfo],
cfg = _current_repository_transition,
),
"_java_toolchain_type": attr.label(default = semantics.JAVA_TOOLCHAIN_TYPE),
"_allowlist_function_transition": attr.label(
default = "@bazel_tools//tools/allowlists/function_transition_allowlist",
),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would also be pulled into Google java_library. Monorepo probably doesn't need it and there might be an observable memory regression involved (1 new public attribute will use more memory for each target).

Keep JAVA_LIBRARY_ATTRS intact and do a BAZEL_JAVA_LIBRARY_ATTRS which merges former and the new attributes into _java_library rule.

},
)

java_library = rule(
_java_library = rule(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't change the name here, tooling uses it. Solution is to put java_library macro into a new file.

def java_library(**kwargs):
_java_library(
# Skip over the leading '@'.
current_repository = repository_name()[1:],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

repository_name()[1:] creates a new string and will probably cause a memory regression, repository_name() is probably ok. If possible process it, that is remove @ in an action.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can definitely change that, but don't we get the regression either way? repository_name is backed by getNameWithAt, so the call will return a new string each time. The returned value isn't retained if we strip off the @.

@@ -53,7 +54,7 @@ public static class LicenseParsingException extends Exception {
* <p>Note that the order is important for the purposes of finding the least
* restrictive license.
*/
public enum LicenseType {
public enum LicenseType implements StarlarkValue {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this and other StarlarkValue needed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This as well as the conversion of singleton sets are necessary to allow attaching a Starlark transition to java_binary. That triggers logic that converts all attributes to Starlark values and these legacy types as well as the Java standard library class for singleton sets failed during conversion. Do you see a better way to handle this?

@@ -206,6 +206,8 @@ public static Object fromJava(Object x, @Nullable Mutability mutability) {
return StarlarkList.copyOf(mutability, (List<?>) x);
} else if (x instanceof Map) {
return Dict.copyOf(mutability, (Map<?, ?>) x);
} else if (x instanceof Set && ((Set<?>) x).size() == 1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed? We are usually more thorough on reviews of Starlark changes. If you need it, perhaps it's better to submit it in a separate PR.

@fmeum
Copy link
Collaborator Author

fmeum commented Oct 4, 2022

@cushon As @comius has pointed out, the transition-based approach to generating the repo name Java constant currently has at least the following issues:

  1. Starlark transitions aren't really made to be applied to native rules, but not all Java rules will be starlarkified in time for Bazel 6.
  2. The Starlark API for getting the repository name essentially builds in a memory regression as it prepends an @.
  3. The transition and additional attributes on all Java targets could introduce memory regressions even if 2. is resolved.

Doing this in JavaBuilder seems much more efficient and much less complex (see https://github.com/fmeum/bazel/pull/new/16124-java-builder for a prototype). Do you think that could still be a viable alternative?

@cushon
Copy link
Contributor

cushon commented Oct 12, 2022

Doing this in JavaBuilder seems much more efficient and much less complex. Do you think that could still be a viable alternative?

I'm still not enthusiastic about doing it in JavaBuilder.

Has the explicit rule per repository approach been ruled out? I know the ergonomics aren't anyone's first choice, but it's efficient and even less complex, which seems like an appealing combination if there's time pressure to get something into 6.0.0. Doing that doesn't rule out pursing one of the more complicated but easier to use options later.

I'm not asking you to spend more time on the JavaBuilder prototype before there's agreement on the approach, but there were some details I wondered about:

  • are list and inferBinaryName really going to be the only methods that would need to know about the new custom file objects, or will this break other assumptions?
  • does the prototype ensure the generated source isn't part of the output jar?
  • did you consider writing a file to a tmp dir instead of a custom file object implementation, similar to the source jar handling in setUpSourceJars?
  • did you consider using an annotation processor for this? If Blaze passed the workspace name to the compilation as an -A javacopt, I think it would be possible to do the code generation with an annotation processor. For example, we could write class Foo { @AutoCurrentRepository private static final String CURRENT_REPOSITORY = AutoCurrentRepository_Foo.CURRENT_REPOSITORY; ... }. The ergonomics are somewhere in between the explicit rule per repository and magically generating a file in JavaBuilder, and maintenance-wise it uses a standard, already supported mechanism to do the codegen.

@fmeum
Copy link
Collaborator Author

fmeum commented Oct 24, 2022

Going to close this for now as #16534 seems to strike a better balance of complexity vs. ergonomics.

@fmeum fmeum closed this Oct 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting-review PR is awaiting review from an assigned reviewer
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants