Add support for building C extensions #824
Which readme are you referring to, this repo's readme? This repo, pybind11, and tensorflow have different owners. There's some owner-adjacency between this repo and pybind11 (Ralf and I are on the same team within Google, but we don't overlap much in our work), but they're different projects. I think grpc also has a custom solution that mostly-works-except-when-it-doesn't. Pybind11 is a higher-level way to write C extensions. Tensorflow's purpose is an ML library -- if it provides a py_extension, then it's almost certainly just incidental. I've talked to various TF folk and they, too, want an official py_extension rule they can use. I'm pretty certain all of these solutions are incomplete and/or unsound in some way -- I don't think the toolchain provides enough information to implement them entirely correctly.
Yes, that's the plan. There aren't really any details beyond that yet, though I'm happy to discuss design and implementation ideas about it.
Hi Richard, Tensorflow is only involved because they had to build those rules for themselves before others. As such, their files are used as examples people copy rules from. Pybind11 is indeed another project but closely related ("Bazel + Python + C++"). It probably deserves to be coordinated with this one? I spoke with Ralf a while ago (years ago) and my understanding is that the priority was elsewhere.
Exactly. (Do you believe this is a great state of affairs?) Right now I don't even know which one to model mine from.
Here's what's happening now: AFAICT from looking at OSS projects, people - including Googlers working on Google open source projects - are copying two types of artifacts:
Mutating and evolving copies of these two files are proliferating. At a minimum, it would be wise to place a generic version of each here, serving either as a directly usable version or as an authoritative source for these definitions - one that, if it is not adequately parameterizable, may still need to be copied and modified by some projects. At least having a single, latest, best, central version of these two files would allow one to track them and merge their changes into local copies. Not ideal, but that would already be an improvement over the current situation.
I'm on the outside now, but I don't remember what the particular baggage involved was. It would be useful to know what is needed to make these rules usable outside of google3, what the Bazel folks are thinking, and what the current status of supporting those is. For example, is the reason they're not supported out of the box due to a particular technical issue? If so, can it be listed / documented somewhere? Or is it that their definition always involves a particular custom specification? (This problem is emblematic of the issue with Google open source. Why isn't the company making an effort to make most of its open source projects easily importable together using Bazel? IMHO a worthy goal would be for the core OSS projects to be made usable with a one-liner in a WORKSPACE file, without having to clone files and spend hours figuring out incompatibilities and patches. That's a decent project for a promo IMO, requiring coordination across teams all over the org, a deep understanding of how Bazel works, and some understanding of all the core projects. Right now it's a bit of a mess; even Google's own OSS projects duplicate these files, and it makes it difficult to cobble a set of them together easily.)
To build a DSO extension, we basically need 2 + 1 pieces.

First, we need a toolchain to give us the Python headers (really, the CcInfo for them). We have that in rules_python, sort of. The hermetic toolchain has a cc_library with the headers, but I don't think you can actually get to it. I don't think a system toolchain has anything here. A catch here is I think this has to be a separate toolchain (or maybe just toolchain type) for bootstrapping reasons (see below).

Second, we need APIs to invoke the cc tools. We have that, sort of. Well, maybe fully now. The key piece is being able to tell the cc APIs that the output library name needs to be e.g. "foo.so" instead of "libfoo.so"[1]. I think this can be worked around by creating a symlink with the desired name (there's a Java-only cc API for this; not sure if it's available to Starlark yet). But there's also the issue of what goes into the DSO. The full transitive dependency closure? Just the srcs? Some subset of the transitive closure? This is where cc_shared_library comes in; it basically lets you control all that. I'm not sure if it's considered generally available yet; it was only mentioned to me a couple months ago internally (I think they've been working closely with Tensorflow).

The +1 part is that py_binary needs some way to consume the DSO.

The bootstrapping issue I'm not 100% sure about. The particular example I have in mind is coverage support. This was recently added to the Python toolchain, so to use it, your Python toolchain has to depend on it. Coverage has its own C extension (ctrace, iirc). If you want to build coverage from source, then how do you get the Python headers? You have to ask the toolchain. But you're in the middle of constructing that toolchain, so how can you ask that toolchain for information? (Again, not 100% sure here, but this is how I understand things work.) I realize most of you are fans of e.g. pre-built wheels (they definitely have their advantages), but supporting building from source is also a major win.

Internally, the baggage our py_extension rule carries is that it's very tightly coupled with cc_binary; as in, at the Java level, it's a subclass of it and calls various internal APIs. It has some other quirks, but that's the main thing. This is a case where starting fresh with the modern CC Starlark APIs is probably going to work out better. A lot more thought has gone into the CC rules and how to do this since when our rule came about. I think this might be entirely doable in Starlark today.

Hope this answers part of your question!

re: DSO infix, pybind11, and wheel building from #1113: if the extension-building rule also carried information about its ABI and python-tag needs, then this could help inform what extension a DSO can or should use.

[1] It should have the CPython infixes, too, but you get the gist.
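To make those pieces concrete, here is a rough, hedged sketch of the shape such a setup could take. It assumes a hypothetical `@python//:python_headers` cc_library for the headers and a Bazel version where `cc_shared_library` (with `shared_lib_name`) is usable; none of this is an official rules_python API.

```starlark
# Hedged sketch only -- not rules_python's API. "@python//:python_headers" is a
# hypothetical headers target; a real rule would get this via a toolchain.
cc_library(
    name = "fastmath_impl",
    srcs = ["fastmath.cc"],
    deps = ["@python//:python_headers"],
)

# cc_shared_library controls which part of the transitive closure is linked
# into the DSO -- the "what goes into the DSO" question above.
cc_shared_library(
    name = "fastmath_so",
    deps = [":fastmath_impl"],
    # Works around the "libfoo.so vs foo.so" naming problem; if shared_lib_name
    # isn't available in your Bazel version, a symlink rule can rename the output.
    shared_lib_name = "fastmath.so",
)

# The "+1" piece: some way for Python targets to consume the DSO.
py_library(
    name = "fastmath",
    data = [":fastmath_so"],
    imports = ["."],
)
```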
Thanks Richard, this is a great insight into the depth of the issues involved in making this work. |
@rwgk @jblespiau @michaelreneer @laramiel |
We also have rules similar to the above in tensorstore: https://github.com/google/tensorstore/blob/master/third_party/python/python_configure.bzl |
Oh, another part I should call out is static vs dynamic linking. My understanding of C++ linking is fairly shallow, but in any case: IMHO, this detail should be transparent to users. E.g., as a Python C extension library author, I should be able to change from one to the other (or provide both and let consumers choose) without anyone's Python code caring. This sort of functionality requires metadata propagation so that the py_binary level can put things together appropriately.
@rickeylev I recently experimented with using bazel for compiling c++ extensions. I thought I could share some confusion I had when attempting to use bazel from a library-development perspective. Bazel has been used for building C extensions for quite a number of ML-related projects at Google (also DeepMind) -- see envlogger, reverb, launchpad, and also tensorstore from @laramiel above for examples.

I think, at least for me personally, besides the lack of support for bundling extensions, (1) would be closest to the existing implementation of py_wheel, but it would probably not be very convenient for end users, as they now have to learn a different workflow for installing dependencies built with bazel from source. I personally think (2) would ideally be the way to go, but it would probably take more effort to get there.

Thinking about (2), I realize there are at least a few things that weren't clear to me at the moment. (2) suggests that a build frontend will create an isolated environment with the necessary build-time dependencies. I suppose in that case bazel should respect the installed interpreter dependencies and use those for building? Using installed dependencies is also convenient, for example, for libraries that depend on TensorFlow's libtensorflowframework.so, so that you can avoid building TF from source.

Right now, the packages I mentioned above seem to adopt the python_configure approach. In that case, most of the libraries would require the users to pre-install the dependencies. The entire process is quite brittle overall, with all sorts of quirks and hacks. Personally, I think I would like a hermetic environment for development but still find an easy way to test the built distribution on a range of python versions.
(2) is basically a superset of (1). In order for something like pep517 to use bazel as a backend, it has to configure the Python toolchains to use whatever is appropriate. In the end, it has to do e.g. Note that:
It interacts through toolchains. Toolchains are the abstraction rules use to let someone else configure what some set of inputs to the rule are, e.g. the location of the Python header files.
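For illustration, here's a hedged sketch of what toolchain-based lookup looks like from a rule implementation; the toolchain type label and the provider field names are assumptions, not the final rules_python API.

```starlark
# Hedged sketch of toolchain lookup in a rule implementation.
def _py_extension_impl(ctx):
    # Toolchain resolution supplies whatever toolchain was registered for the
    # current configuration; the rule never hard-codes header locations.
    toolchain_info = ctx.toolchains["@rules_python//python/cc:toolchain_type"]

    # Exact provider/field names are assumptions; conceptually this yields a
    # CcInfo carrying the Python include directories.
    headers_info = toolchain_info.headers

    # ... feed headers_info into cc_common compile/link actions ...
    return []

py_extension = rule(
    implementation = _py_extension_impl,
    attrs = {"srcs": attr.label_list(allow_files = [".c", ".cc"])},
    toolchains = ["@rules_python//python/cc:toolchain_type"],
)
```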
Essentially, yes. This is handled by transitions and constraints. As an example, you can define a rule that performs a split transition, which allows building the same target for multiple different configurations (e.g. different abis). Using constraints, targets and toolchains can control what their inputs are depending on mostly arbitrary criteria; this is how the multi-version python rules work today (the transition.bzl file you mentioned).
Yes, correct on all counts. The main missing piece here is a toolchain to carry the C dependency information (e.g. the headers). Note that all this toolchain stuff is not specific to the rules_python hermetic python stuff. You can, for example, implement a repository rule that tries to locate Python on the local system and define a toolchain using that information. That's basically what those local_config_python things try to do. Downstream rules are largely unaware of whether a local or hermetic toolchain is used.
It's in an adjacent area, yes. That code is an example of how to make multiple Python versions co-exist in one build.
@rickeylev Thanks for the detailed reply!
One thing I wasn't sure how to do is to configure a pybind_extension from https://github.com/pybind/pybind11_bazel/tree/master/py to simultaneously work with multiple toolchains. For the local_config_python approach, the workflow mostly asks the user to call a python_configure-style repository rule first.

Another thing I wasn't sure about is the right way to handle requirements when bazel is configured to produce wheels. In the case of making bazel a PEP517 backend, I would imagine the dependencies get installed into the build environment via the frontend.

On an unrelated note, I encountered all of this as I was working on a PR here which uses bazel to compile native extensions. I managed to get things to work, but the entire process right now is pretty brittle and many things feel very unnatural.
Well, I think I got the toolchain half working. Fiddling a bit more, I was able to get a DSO built. There's a fair amount of boilerplate to create the DSO, so we'll definitely want a rule to hide some of that. Also note that cc_shared_library is still experimental; I'm not sure how (or if) a regular cc rule can consume it. But, a few questions others better versed in C might know:
Anything using pybind11 will be written in C++. ;)
FWIW, this bit of Windows documentation says:
I don't know whether it's possible (and reasonable) to "hide" that detail at the Bazel level such that users depend on a single target regardless of the platform. |
Yeah, this was a hand-rolled Python C API extension, though -- basically a no-op extension. See foo.cc and its build file for how I was building it.
Thanks for the doc link; it explains a little bit, I think. The interface lib is already hidden, I think -- it's a dependency of the target that holds the DSO files.
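For illustration only, a hedged sketch (with hypothetical file names) of how a prebuilt extension DSO and its Windows import library could be wrapped behind one target for C/C++ consumers:

```starlark
# Hedged sketch: expose a prebuilt extension DSO and its Windows import
# library behind a single target. File names are hypothetical.
cc_import(
    name = "fastmath_import",
    shared_library = "fastmath.pyd",     # what Python loads at runtime
    interface_library = "fastmath.lib",  # what the Windows linker needs at link time
)
```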
This allows getting a build's `cc_library` of Python headers through toolchain resolution instead of having to use the underlying toolchain's repository `:python_headers` target directly. Without this feature, it's not possible to reliably and correctly get the C information about the runtime a build is going to use. Existing solutions require carefully setting up repo names, external state, and/or using specific build rules. In comparison, with this feature, consumers are able to simply ask for the current headers via a special target or manually lookup the toolchain and pull the relevant information; toolchain resolution handles finding the correct headers. The basic way this works is by registering a second toolchain to carry C/C++ related information; as such, it is named `py_cc_toolchain`. The py cc toolchain has the same constraint settings as the regular py toolchain; an expected invariant is that there is a 1:1 correspondence between the two. This base functionality allows a consuming rule implementation to use toolchain resolution to find the Python C toolchain information. Usually what downstream consumers need are the headers to feed into another `cc_library` (or equivalent) target, so, rather than have every project reimplement the same "lookup and forward cc_library info" logic, this is provided by the `//python/cc:current_py_cc_headers` target. Targets that need the headers can then depend on that target as if it was a `cc_library` target. Work towards bazelbuild#824
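As a usage sketch (hedged; the only label taken from the commit message above is `//python/cc:current_py_cc_headers`), a C extension's implementation library would then just depend on that target:

```starlark
# Hedged sketch: get Python.h via toolchain resolution instead of a
# hard-coded repository target.
cc_library(
    name = "my_ext_impl",
    srcs = ["my_ext.cc"],
    deps = ["@rules_python//python/cc:current_py_cc_headers"],
)
```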
I have draft #1287 and am working through CI tests. If there are particular features or behaviors that need to be exposed, now is a good time to let me know. Right now only the headers are made available; I'm not sure what to do with the other libs. The basic way it works is defining another toolchain to carry and forward the C information. Right now, it only works with the hermetic runtimes. There's no particular reason it couldn't work with a system install, but someone else will have to contribute a repo rule to do that (the previously mentioned local_config_python stuff is basically that -- happy to review a PR adding that to rules_python). Tagging people who may not be following the convo closely: @rwgk @jblespiau @michaelreneer @laramiel
Weirdly, when I try to use the new py_cc_toolchain to configure a non-hermetic python, I get some weird errors. I have something like this:

```starlark
load("@rules_python//python/cc:py_cc_toolchain.bzl", "py_cc_toolchain")

licenses(["restricted"])

cc_library(
    name = "python_headers",
    hdrs = [],
)

py_cc_toolchain(
    name = "local_py_cc_toolchain",
    headers = ":python_headers",
    python_version = "3.10",
)
```

and on the command line I use something similar to select it. I am mostly just mimicking the python repo BUILD file configured by rules_python, but that one works fine. I am not entirely sure what is going on.
Nope, not intentional. Thanks for reporting. Are you using a WORKSPACE setup, where you have to call e.g. native.register_toolchains() somewhere? I may have overlooked updating the WORKSPACE logic, come to think of it.
I think you're just missing the
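A hedged sketch of what the registration side of such a setup might need, per the replies above: the py_cc_toolchain target gets wrapped in a native toolchain() and registered. The toolchain_type label here is an assumption.

```starlark
# Hedged sketch: declare the toolchain so toolchain resolution can find it.
toolchain(
    name = "local_py_cc_toolchain_def",
    toolchain = ":local_py_cc_toolchain",
    toolchain_type = "@rules_python//python/cc:toolchain_type",  # assumed label
)
```

That toolchain then has to be registered, e.g. via `register_toolchains("//:local_py_cc_toolchain_def")` in WORKSPACE (or the MODULE.bazel equivalent).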
Although I can't speak authoritatively/officially for pybind11_bazel, as I said on pybind/pybind11_bazel#55, it would be nice to use |
A data point from pyO3 (a rust-Python interop thing): https://github.com/PyO3/pyo3/blob/2d3dc654285eb8c9b29bb65b1d93323f551a8314/pyo3-build-config/src/impl_.rs#L223 (this came via a slack thread) Basically, it runs a small python program to find a variety of pieces of information:
Running a sub-process to get that is non-ideal; it'd be better done through toolchain resolution, or most ideally, something available during the analysis phase.
Yes, this is a case I want to support, too. A tricky part of the support is that a Python interpreter isn't necessarily required to make it work -- having an interpreter you can run is just a means to the end of getting the location of e.g. the headers.
Just making sure -- is libpython.so part of what it's trying to get? That's what it looks like. I saw that DSO sitting around but wasn't sure what to do with it. It should be easy to add onto the py_cc_toolchain. It sounds like that would be helpful?
What does this part look like? Internally, there's a variety of copts and linkopts we pass around to things, but what we do is unique in a few ways, so I'm not sure which parts are specific to our setup or just general things one has to set when building python stuff.
This info would live on either PyCcToolchainInfo, or some CcInfo it exposes (à la how it currently exposes headers). I think something that would be illustrative to me is knowing what else around python/repositories.bzl#_python_repository_impl should be doing to expose the necessary information.
In this case, it does appear to want to get that, yes.
Quoting pybind/pybind11_bazel#65 (comment):
I have no idea whether this makes things easier, but pybind11_bazel doesn't have any Starlark rules – only Starlark macros. :) |
I don't think it changes the situation too much. The two have different capabilities. A combination of macros and rules is quite plausible. I'll post a more detailed response on pybind/pybind11_bazel#64
Thanks. This is interesting and new 🙃. So from what I can figure out, there's a python-config that reports this sort of information. Since we're talking about embedding, what this means is that a cc_binary wants the information about the Python runtime -- its headers, copts, linkopts, and shared libraries? It's mostly feasible, as part of setting up the downloaded runtime, to run python-config and parse its output into something to stick into the generated build files. The two problems I see are:
Oh hm I had an idea of how this might be doable: I seem to recall that |
Digging into this has been quite "enlightening". There are two versions of python-config.
Indeed, Bazel seems to prefer doing so when linking, due to the length of the command line.
Nice! Those look very promising. They look easy enough to parse so that e.g. a linux host could parse them to generate the settings for mac. Do those give all necessary info? They seem much shorter than the shell script.
Yeah, I noticed that too. Something that isn't clear to me is what re-using a cc_library for both cases looks like, e.g.:
I think the python_runtime_lib would only be useful for embedding, so maybe it should be named accordingly?
I don't know for sure, TBH, but it would seem like a bug to report upstream (i.e. to the Python project itself) if embedding has missing flags.
I don't think so...
That's my understanding. |
It looks like standalone python might help a bit: https://gregoryszorc.com/docs/python-build-standalone/main/distributions.html#python-json-file It has... a lot of info. I think I saw link flags in there? And since it's json, we can just directly load it in starlark without having to write a special parser.
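Since Starlark has built-in JSON support, a repository rule could read that metadata along these lines (hedged sketch; the file path and any key names are assumptions, not verified against real distributions):

```starlark
# Hedged sketch: parse python-build-standalone metadata inside a repository rule.
def _read_standalone_metadata(rctx):
    raw = rctx.read("PYTHON.json")  # assumed location inside the extracted archive
    metadata = json.decode(raw)     # Starlark's native JSON decoder
    # Pull out whichever fields turn out to be useful (e.g. link flags), if present.
    return metadata
```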
Whoa, awesome! I don't see any flags except for |
py_cc_toolchain needs to allow users to use a non-hermetic host python. This is to allow pybind11_bazel to move to rules_python first...
FYI gRPC has Bazel Cython support: https://github.com/grpc/grpc/blob/master/bazel/cython_library.bzl |
This exposes the runtime's C libraries through the py cc toolchain. This allows tools to embed the Python runtime or otherwise link against it. It follows the same pattern as with the headers: the toolchain consumes the cc_library, exports CcInfo, and a `:current_py_cc_libs` target makes it easily accessible to users. Work towards #824 * Also upgrades to rules_testing 0.5.0 to make use of rules_testing's DefaultInfoSubject
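For illustration, a hedged sketch of the embedding use case this enables (both labels come from the commit message above; the rest is a placeholder, and as a later test commit in this thread notes, linking succeeds but actually running an embedded interpreter needs more configuration):

```starlark
# Hedged sketch: link a C program against the hermetic runtime's headers and
# C libraries so it can embed Python.
cc_binary(
    name = "embed_python",
    srcs = ["embed.c"],
    deps = [
        "@rules_python//python/cc:current_py_cc_headers",
        "@rules_python//python/cc:current_py_cc_libs",
    ],
)
```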
One update, one response. The update: there's a pybind11_bazel PR to move onto rules_python, and I put together a small test in #1749 to verify that linking against the runtime's C libraries works. The response, re: @cloudhan:
This is possible today, but it requires setting up your own toolchain. This basically translates to writing a repo rule to symlink/copy things on the local machine into a repo to fool various bazel things elsewhere. I have a basic impl to make using a local runtime easy at https://github.com/rickeylev/rules_python/tree/local.py.toolchains, and it appears to work; however, (1) it's hung up on integrating it into bzlmod (just tedious refactoring work) and (2) correctly finding the various library paths is surprisingly non-trivial (some comments in my branch link to an SO post that talks about it; I think this will just require some iterations and bug reports from different systems).
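The repo-rule idea described above could look roughly like this (hedged sketch; every name here is hypothetical, and it glosses over the library-path problems mentioned):

```starlark
# Hedged sketch: locate a local CPython install and expose its headers as a
# cc_library that a py_cc_toolchain (or anything else) could consume.
def _local_python_repo_impl(rctx):
    python = rctx.which("python3")
    if not python:
        fail("no python3 found on PATH")
    result = rctx.execute([python, "-c",
        "import sysconfig; print(sysconfig.get_paths()['include'])"])
    include_dir = result.stdout.strip()
    rctx.symlink(include_dir, "include")
    rctx.file("BUILD.bazel", """\
cc_library(
    name = "python_headers",
    hdrs = glob(["include/**/*.h"]),
    includes = ["include"],
    visibility = ["//visibility:public"],
)
""")

local_python_repo = repository_rule(implementation = _local_python_repo_impl)
```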
This is just a basic test to ensure the `@rules_python//python/cc:current_py_cc_libs` target is making the Python C libraries available; that the compiler and linker are both happy enough to resolve symbols. The test program itself isn't actually runnable, as that requires more configuration than I can figure out in telling Python where its embedded runtime information is. For now, the test is restricted to Linux. The Mac and Windows CC building isn't happy with it, but I'm out of my depth about how to make those happy. Work towards #824
I just went through the process of building pynacl and cffi using rules_python. First, here is cffi.BUILD:
Finally we can create
Nice!
cc_binary will always link all transitive dependencies, while cc_shared_library gives you some control over what is linked in. You almost certainly want to use cc_shared_library (part of its origin is tensorflow wanting to build python C extensions).
I think this is WAI and you have it right. IIUC, mac linkers by default want to verify things at link time, while linux ones don't, hence the extra flag to disable that. I found this blog post that explains a bit. Based on that post, you also don't want current_py_cc_libs, because you're not embedding python (i.e. the whole interpreter). Instead, the Python symbols will be found at runtime as part of loading the C extension into the Python process.
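In BUILD terms, that platform difference is usually expressed with a select() on the link flags, roughly like this (hedged sketch; target names are placeholders and `user_link_flags` accepting a select is assumed):

```starlark
# Hedged sketch: only macOS needs to be told to leave the Python C API symbols
# unresolved at link time; the interpreter process that loads the extension
# provides them.
cc_shared_library(
    name = "my_ext_shared",
    deps = [":my_ext_impl"],
    user_link_flags = select({
        "@platforms//os:macos": ["-Wl,-undefined,dynamic_lookup"],
        "//conditions:default": [],
    }),
)
```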
🚀 feature request
- `@rules_python//python/cc:current_py_cc_headers`
- `@rules_python//python/cc:current_py_cc_libs`
- A `PyInfo.cc_info` field to allow C/C++ info (CcInfo) to propagate (directly propagating CcInfo isn't done, to prevent a Python target from being accidentally accepted by a plain CC target)
Describe alternatives you've considered
I've cobbled together support for these things by hand in the past; I was hoping building extension modules would be officially supported at some point.
Some projects, such as tensorflow and pybind11_bazel, have a python_configure.bzl file and/or py_extension rule that attempts to implement this.
(Note: this issue was originally about updating the readme to address the lack of this feature; it's been edited to be a feature request for this feature directly - rickeylev)