Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add framework for writing lang_proto_library #6

Closed
Tracked by #10005
Yannic opened this issue Jul 25, 2019 · 9 comments
Closed
Tracked by #10005

Add framework for writing lang_proto_library #6

Yannic opened this issue Jul 25, 2019 · 9 comments
Assignees
Labels
enhancement New feature or request

Comments

@Yannic
Copy link
Collaborator

Yannic commented Jul 25, 2019

It would be very helpful to have a "framework" for writing lang_proto_library- or lang_grpc_library-rules.

Design goals:

I already have an idea how this framework could look like, but I hadn't had time to create a design doc for it. In the meantime, leave a comment if you feel like anything is missing.

@Yannic
Copy link
Collaborator Author

Yannic commented Oct 22, 2019

The generator API in the design doc required that all binding generators are protoc-plugins.

The main reasons for this decision were:

  • proto_lang_toolchain alone is insufficient for some languages. E.g., for C++, there are additional flags that influence the name of generated files (i.e. foo.proto -> foo.pb.h or foo.proto.h). To make it easier for users to use alternative generators, everything should be configurable by one toolchain (per language). Desicion: Replace proto_lang_toolchain with <lang>_proto_toolchain, and provide a "default implementation" that languages can use if they don't need extra info (i.e. one could write something like closure_proto_toolchain = proto_common.create_default_lang_proto_toolchain()).
  • Authors of <lang>_proto_library should not assume that their generator is built into protoc. Most languages have alternatives to the "official" generators. Ideally, this should be completely transparent for <lang>_proto_library (of course, <lang>_proto_toolchain needs to be aware of this).

We should investigate whether there are alternatives to this drastical step.

ProtoGenerator = provider(
    fields = {
        "display_name": "Optional. The name that's used in the progress message (e.g. C++).",
        "language": "String. As in `--<language>_out` (only for generators built into protoc).",
        "plugin": "Executable. The generator plugin. Required iff `language` is not set.",
        "options": "Optional. Sequence of strings.".
        "extra_inputs": "Optional. depset() of extra inputs for protoc (e.g. configuration files).",
    },
)

def _generate(
        actions,
        proto_toolchain,
        generator,
        descriptor_sets,
        protos,
        output_root,
        outputs = [],
        extra_inputs = []):
    """Generates source files for `.proto` files.

    Args:
      actions: Required. Instance of `ctx.actions`.
      proto_toolchain: Required. Toolchain, defined by `proto_toolchain`.
      generator: Required. Instance of `ProtoGenerator`.
      descriptor_sets: # Required: A set of `FileDescriptorSet` files.
      protos: Required. Sequence of strings `[<import_path>, …]` of protos to
              generate, e.g. coming from `proto_common.protos_to_generate()`.
      output_root: Required. File or string. Output root for the generator.
      outputs: Sequence of output files. Required iff `output_root` is a string.
      extra_inputs: Optional. Optional. depset() of extra inputs for protoc
                    (e.g. configuration files).

    Returns:
      Nothing
    """
    pass

@Yannic Yannic self-assigned this Oct 22, 2019
@Yannic Yannic added the enhancement New feature or request label Oct 22, 2019
@ribrdb
Copy link
Contributor

ribrdb commented Nov 15, 2019

I'm not sure I understand how this would work.
Is ProtoGenerator the provider that a lang_proto_toolchain would return? Or just a helper for implementing lang_proto_toolchain based on protoc?
Also what is _generate? It's private, is it intended to be a field on a provider similar to the compile field on go_proto_compiler? Is this intended to be called by lang_proto_library? How are you supposed to calculate the output_root?

@ribrdb
Copy link
Contributor

ribrdb commented Nov 15, 2019

I don't think non-protoc based generators typically accept FileDescriptorSet as input. If you wan to transparently support both, don't you need to pass both the FileDescriptorSets and the .proto files for dependencies?

@Yannic
Copy link
Collaborator Author

Yannic commented Nov 15, 2019

Yes, ProtoGenerator would be returned by lang_proto_toolchain.
_generate (exported as proto_common.generate) is a helper function for lang_proto_library rules that encapsulates calling protoc so lang_proto_library rules don‘t have to worry about all that we may use a generator that is built into protoc, but our users may also want to use a custom protoc plugin to generate e.g. C++, which at least some existing rules currently do not care about. The value of output_root depends on the generator. For C++, it‘s ProtoInfo.proto_source_root, for JS (closure), I think it can be current directory (.), for Java, it‘s a .jar file, for PHP, I think it has to be a tree artifact (aka directory) because the name of the files depend on options defined in the .proto file.

You are right, for now, this only works for protoc-based generators. Non-protoc based generators may name command-line flags differently, at which point it gets way to complicated (what if there are non-protoc based generators that actually support FileDescriptorSets, but expect them to be passed as —fds=file1.bin —fds=file2.bin?). I believe that protoc based plugins are the blessed and common way to write generators. If you want to go with something else, I’m sorry, you‘re on your own. You‘ll still be able to write lang_proto_library rules without the proposed API. Technically, this doesn‘t even restrict the choice of language to write the generator in. protoc plugins communicate through stdin and stdout, and the protocol protoc and the pluging use is well specified.

@ribrdb
Copy link
Contributor

ribrdb commented Nov 15, 2019

Oh, I misread "built into protoc" above. I thought you meant that foo_proto_library shouldn't assume that foo_proto_toolchain uses protoc. But you're saying that cc_proto_library shouldn't assume that it's using the cc generator compiled into protoc. Sounds good.

@Yannic
Copy link
Collaborator Author

Yannic commented Dec 9, 2019

(just thinking out loud here)

It seems like protoc has the nice --<gen>_opt options I wanted to use in the alternative design proposed in #6 (comment) only for plugins and not for built-in generators :(. I guess that's an argument for the "built-in-generator-less" API design from the original design doc. Will need to confirm this with the Protobuf team though (and it may be an option to modify protoc to allow --<gen>_opt for all generators).

@ribrdb
Copy link
Contributor

ribrdb commented Dec 9, 2019

I just ran into this issue. I've been working on something similar to what you proposed above, and it was working for custom languages. But I just tried to use it for generating js and discovered that there's no --js_opt flag.

@Yannic
Copy link
Collaborator Author

Yannic commented Dec 9, 2019

Coincidently, I was also trying it with --js_opt🤣.

Looking at the code, I see that some generators (e.g. C++) already allow --cpp_opt,
while some others don't (e.g. JS). Submitted protocolbuffers/protobuf#6999 to make the remaining generators match what plugins can do, so discard my last comment.

@comius
Copy link
Collaborator

comius commented Aug 31, 2022

I believe proto_common in Bazel is addressing this.

@comius comius closed this as completed Aug 31, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants