Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

go_proto_library: support multiple proto dependencies #1890

Merged
merged 14 commits into from
Jan 11, 2019

Conversation

vam-google
Copy link
Contributor

Add protos argument to go_proto_library (same as the existing proto argument, but supports multiple protos).

go_proto_compile changes one of the arguments from proto to protos, which is considered Ok (even though this is a technically public API there is no detected usages of the method anywhere else on github).

Implements #1856

Add `protos` argument to `go_proto_library` (same as the existing `proto` argument, but supports multiple protos).
Copy link
Contributor

@jayconrod jayconrod left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few small nits about some Starlark subtleties, but otherwise this looks very good. Thanks for taking this on.

go_srcs = []
outpath = None
for src in proto.check_deps_sources:
dep_sources = depset(direct = [], transitive = [proto.check_deps_sources for proto in protos])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we need to iterate the depset anyway, I think it would be clearer to just have a doubly nested loop, and build proto_paths at the same time we're declaring output files.

Constructing a depset has a surprising amount of overhead (just in the Java implementation, not anything fundamental), and we should try to avoid it unless either a) we're passing it along a dependency graph and we can avoid iterating it at every step or b) we can pass it to args.add_all and avoid iterating it unless the action is executed.

Copy link
Contributor Author

@vam-google vam-google Jan 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used depset here mainly to ensure that there will be no duplicates. Since these are technically "transitive sources" Bazel cannot ensure that there are no duplicates (if those were direct sources, bazel would simply fail with "duplicate something" error). Dataset skips duplicates during iteration by default. My impression (after reading bazel documentation) was a dataset is a proper data structure for this. Another way to avoid duplicates can be Args.add_all(uniquify=True).
Should we not care about duplicates here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should definitely avoid duplicates, but we're not able to take advantage of depset performance here because we're iterating through check_deps_sources below, so the cost isn't worth it.

I was thinking a change like this:

go_srcs = []
outpath = None
proto_paths = {}
for proto in protos:
    for src in proto.check_deps_sources.to_list():
        path = proto_path(src, proto)
        if path in proto_paths:
            continue
        proto_paths[path] = True
        out = go.declare_file(go, path = importpath + "/" + src.basename[:-len(".proto")], ext = compiler.suffix)
        go_srcs.append(out)
        if outpath == None:
            outpath = out.dirname[:-len(importpath)]

WDYT?

The construction of transitive_descriptor_sets looks good though. The depset makes sense there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with this. One potential concern is that in the suggested code snipped the deduplicaiton happens based on proto_path value, which is a calculated value (and logic is quite sophisticated). In contrast with depset it is done by bazel itself (so is very reliable). At the same time it seems that proto_paths and the output files are in 1:1 relation to each other and if any of them breaks the whole thing breaks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a valid concern. Let's make it an error.

Ultimately, the proto_path is what gets passed to protoc, and that's what we use to find the right descriptor to generate code for. We have the full set of descriptors for transitive dependencies, and we don't have another way to identify direct dependencies. It's okay to have the same .proto file repeated (possibly seen through more than one proto_library), but it should be an error to have different files with the same effective import path.

So let's do this: proto_paths should be a map from proto_path values to Bazel Files. The code code be changed to something like this:

path = proto_path(src, proto)
if path in proto_paths:
    if proto_paths[path] != src:
        fail("proto files {} and {} have the same import path, {}".format(src.path, proto_paths[path].path, path))
    continue
proto_paths[path] = src

transitive = [proto.transitive_descriptor_sets for proto in protos],
)

for src in dep_sources:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dep_sources.to_list()

Bazel will eventually make it an error to iterate a depset. There is a flag --incompatible_depset_is_not_iterable, which detects this. (My previous code here is wrong; I didn't realize proto.check_deps_sources was a depset).

Copy link
Contributor Author

@vam-google vam-google Jan 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, for some reason I thought it was another way around (a new feature, not an old, deprecated one). I'll change it (once we come to a conclusion if we want to have dep_sources variable at all)

@@ -83,14 +83,23 @@ def _go_proto_library_impl(ctx):
compilers = ctx.attr.compilers
go_srcs = []
valid_archive = False

if ctx.attr.proto:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be an error to specify both proto and protos: one or the other should be required. This should have been the case for compilers as well, but let's leave that aside for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, I just tried to follow similar pattern as compilers. I'm not sure if I understood your comment though: do you suggest to keep it as is or change (make specifying both arguments an error)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just change this one to report an error if proto and protos are specified.

Changing compilers would be a breaking change (one I'd like to make eventually, but later).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -104,6 +104,34 @@ proto_library(
srcs = ["proxy_b.proto"],
)


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a couple cosmetic errors here. Could you run buildifier on this file? It should clean them up. You can install it with go get -u github.com/bazelbuild/buildtools/buildifier.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sorry, I was removing some of the "experimental" arguments and forgot to reformat the file.
I'm using IntelliJ's formatter (with Bazel plugin enabled). I hope it produces same result, doesn't it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good now. I assume they're using buildifier for formatting under the hood.

2) Add support for `protos` argument in get_imports() function as well.
@vam-google
Copy link
Contributor Author

@jayconrod PTAL. Also please pay attention to the change in get_imports(), which was not included in initial commit.

proto/def.bzl Outdated

direct = dict()
for proto in protos:
for src in proto.check_deps_sources:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think .to_list() is needed here (previous code was incorrect, too). Maybe run tests with --incompatible_depset_is_not_iterable to double-check there aren't any other issues.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

@jayconrod
Copy link
Contributor

Added two small comments, but this is almost ready to go.

@jayconrod
Copy link
Contributor

Test failure is due to a CI configuration which will be fixed in #1902, so submitting.

@jayconrod jayconrod merged commit d473396 into bazel-contrib:master Jan 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants