Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong ABI used for small C++ classes on Linux. #778

Open
vadimcn opened this issue Jun 23, 2017 · 32 comments
Open

Wrong ABI used for small C++ classes on Linux. #778

vadimcn opened this issue Jun 23, 2017 · 32 comments

Comments

@vadimcn
Copy link

vadimcn commented Jun 23, 2017

Input C/C++ Header

#include <cstdint>

struct Foo {
    size_t data;
};

struct Bar {
    size_t data;
    ~Bar() { data = 0; }
};

Foo MakeFoo(); // { return Foo(); }

Bar MakeBar(); // { return Bar(); }

Bindgen Invocation

bindgen::Builder::default()
    .clang_args(&["-x","c++", "-std=c++11"])
    .header("input.h")
    .generate()
    .unwrap()

Actual Results

#[repr(C)]
#[derive(Debug, Copy)]
pub struct Foo {
    pub data: usize,
}
impl Clone for Foo {
    fn clone(&self) -> Self { *self }
}
#[repr(C)]
#[derive(Debug)]
pub struct Bar {
    pub data: usize,
}
extern "C" {
    #[link_name = "_ZN3BarD1Ev"]
    pub fn Bar_Bar_destructor(this: *mut Bar);
}
impl Bar {
    #[inline]
    pub unsafe fn destruct(&mut self) { Bar_Bar_destructor(self) }
}
extern "C" {
    #[link_name = "_Z7MakeFoov"]
    pub fn MakeFoo() -> Foo;
}
extern "C" {
    #[link_name = "_Z7MakeBarv"]
    pub fn MakeBar() -> Bar;
}

Expected Results

Not sure...

Discussion

MakeFoo and MakeBar look very similar, however in C++ they use different ABIs, at least on Linux.
Here's a quote from System V AMD64 ABI spec (page 20):

If a C++ object has either a non-trivial copy constructor or a non-trivial
destructor 11, it is passed by invisible reference (the object is replaced in the
parameter list by a pointer that has class POINTER) 12

This means that while Foo get returned by-value in the rax register, Bar must be returned by-pointer to a caller allocated memory. To Rust compiler, however, these look identical, so it expects both to be returned in rax.

@emilio
Copy link
Contributor

emilio commented Jun 23, 2017

Yeah, this is somewhat annoying, but passing non-trivial arguments by value needs rustc support, and last time I checked with them they really weren't interested in implementing some kind of extern "C++".

@emilio
Copy link
Contributor

emilio commented Jun 23, 2017

(The same happens for arguments, btw, we've run into this before, see https://bugzilla.mozilla.org/show_bug.cgi?id=1366247)

@emilio emilio added the bug label Jun 23, 2017
@vadimcn
Copy link
Author

vadimcn commented Jun 23, 2017

In my specific case, the struct was requested to be opaque, so maybe bindgen could use some other "filler" type? Looking at cabi_x86_64.rs, it seems that some kind on weird enum (one that ends up with StructWrappedNullablePointer layout) might do the trick.

Or, maybe there should be a sub-option of #[repr(C)] to specify this explicitly? E.g. #[repr(C, by_pointer)].

Overall, I think it would be preferable to emit an error rather than generate bindings that are known produce incorrect code.

@vadimcn
Copy link
Author

vadimcn commented Jun 23, 2017

Looks like this does the trick (i.e. some struct member must be misaligned):

#[repr(C, packed)]
struct Bar {
    data1: u8,
    data2: u16,
    data3: [u8; 5],
}

Would only be useful for opaque structs, of course...

@vadimcn
Copy link
Author

vadimcn commented Jun 24, 2017

On second thought, this

An object with either a non-trivial copy constructor or a non-trivial destructor cannot be
passed by value because such objects must have well defined addresses. Similar issues apply
when returning an object from a function.

is pretty much the distinction between Copy and !Copy types. Perhaps the rule should be that all non-Copy types are passed by-pointer?

@comex
Copy link

comex commented Jun 24, 2017

Huh, seems very similar to an MSVC ABI issue. And here I thought the madness of changing ABI based on the presence of methods was Windows-specific...

@retep998
Copy link
Member

Perhaps the rule should be that all non-Copy types are passed by-pointer?

Please no. This would wreak so much havoc when ABIs mysteriously change due to the addition or removal of Copy impls. I'd much rather this be explicitly handled via a new repr to specify that a type is not POD.

@vadimcn
Copy link
Author

vadimcn commented Jul 1, 2017

As I've said in IRLO thread, I think there are two ways we can go here:

  • Add support for "significant-address" c++ types in rustc via an attribute.
  • Have bindgen change such arguments/return values to be explicitly by-pointer. Of course, I may turn out that different targets require slightly different transforms (for example, I think some targets require function to return address of memory location of the return value, while others don't), but I guess bindgen is already in that boat because it needs to know about vtable layout and mangling.

@emilio, do you have an opinion on the best way to proceed?

@retep998
Copy link
Member

retep998 commented Jul 1, 2017

Right now the instant solution which works on stable is to transform the bindings to pass certain types by pointer and handle certain return values correctly. In the future Rust definitely should add more calling conventions and attributes however so that Rust can perform these transformations internally.

For example in windows API there is a COM method that looks like this in C++:

D3D12_HEAP_PROPERTIES GetCustomHeapProperties(
    UINT nodeMask,
    D3D12_HEAP_TYPE heapType
);

In order to get the ABI correct I have to make it look like this in Rust. Note that this already works in stable Rust, so it is a solution people can implement immediately in their projects.

fn GetCustomHeapProperties(
    nodeMask: UINT,
    heapType: D3D12_HEAP_TYPE,
    __ret_val: *mut D3D12_HEAP_PROPERTIES,
) -> *mut D3D12_HEAP_PROPERTIES

However, in the future I'd really love for Rust to have calling conventions for C++ methods as well as attributes to indicate that a type is not POD, so in some future version I could simplify the signature down to:

fn GetCustomHeapProperties(
    nodeMask: UINT,
    heapType: D3D12_HEAP_TYPE,
) -> D3D12_HEAP_PROPERTIES

So basically, do the bindgen transform regardless because it is needed for correctness now while in the future Rust really should add attributes and calling conventions to facilitate this and bindgen can be updated later to take advantage of those new language features if and when they do get implemented and stabilized.

@vadimcn
Copy link
Author

vadimcn commented Jul 1, 2017

@retep998: It seems to me that you are describing a different problem. Is there already an open issue for it? Let's take discussion there.

@retep998
Copy link
Member

retep998 commented Jul 1, 2017

@vadimcn The two Rust issues are:

  1. Types which are not POD are sometimes handled differently in function ABIs and need a way to mark them as such. issues with ABIs and struct return types rust#38258
  2. C++ methods on windows have a slightly different calling convention than C and C++ functions and need a way to mark them as such. Need calling convention for stdcall methods (different from functions!) rfcs#1342

While this issue here is specifically for bindgen and what it should do about these functions whose ABI is changed slightly by things that Rust is currently incapable of representing.

@vadimcn
Copy link
Author

vadimcn commented Jul 1, 2017

bindgen could also fix up function signatures if it generated a small wrapper function. For example:

extern "C" { 
    #[link_name = "..."]
    fn raw_ClassMethod(this: *mut Class, ret_value: *mut SomeStruct) -> *mut SomeStruct; 
}

unsafe fn ClassMethod(this: *mut Class) -> SomeStruct {
    let s: SomeStruct = mem::uninitialized();
    raw_ClassMethod(this, &s);
    s
}

@emilio
Copy link
Contributor

emilio commented Jul 2, 2017

That'd yield also similar problems as #607, though.

@vadimcn
Copy link
Author

vadimcn commented Jul 2, 2017

In this case SomeStruct is a POD.

For true C++ classes, you might consider some sort of "assume_trivially_movable" annotation, because in practice, 95% of them are. ...On the other hand, one can always do that in the safe idiomatic-Rust wrapper, so maybe not worth it...

@emilio
Copy link
Contributor

emilio commented Aug 7, 2017

Other user found this today wrapping a very simple C++ file: https://github.com/ssloy/tinyrenderer/blob/master/tgaimage.cpp

(The call to TGAImage::set passes TGAColor by value, and that segfaults)

@doppioandante
Copy link

Actually the tgaimage.cpp that I was using is here (didn't notice that this version was different from the one in the repo, that actually passes TGAColor as a reference).

@fitzgen
Copy link
Member

fitzgen commented Aug 8, 2017

Another option that we discussed with some of the rust dev tools team folks was to emit C++ extern "C" trampoline functions. For users, that would mean an extra non-virtual, out of line function call, and a dependency on a C++ compiler.

@vadimcn
Copy link
Author

vadimcn commented Aug 9, 2017

clang certainly does have the logic to figure out which types must be passed by-reference. Unfortunately, this is not exposed in libclang.

IMO, the options we have here are:

  1. What @fitzgen suggests.
  2. Expose the required API in libclang (of course, bindgen would then require at least that version of libclang).
  3. Try to compute the same info ourselves (could be very labor-intensive or even impossible, as libclang's API seems to omit many of the finer c++ details).

@emilio
Copy link
Contributor

emilio commented Aug 10, 2017 via email

@vadimcn
Copy link
Author

vadimcn commented Aug 10, 2017

I added the CXTargetInfo API to eventually get target triple and pointer
width from clang, and we already load the functions dynamically (thus
can fallback), so it shouldn't be terribly hard to add an API that does
this.

I gave it a try, but the information we need seems to be locked in the CodeGen layer. I could not see a way to get it out without instantiating a CodeGenModule and all that...

@fitzgen
Copy link
Member

fitzgen commented Aug 10, 2017

Even if we get the information from libclang, we don't have a way to convey it to rustc yet. Did I miss that development?

@fitzgen
Copy link
Member

fitzgen commented Aug 10, 2017

It's not clear to me how easy would this be, and whether it's acceptable
for every use-case, though it sounds fine to me.

It's "just" another pass over our IR, this time emitting C++ instead of Rust. Not terribly difficult, but not exactly trivial either ;)

As for acceptability, I suspect it is probably fine for most cases. I expect the cases where performance really matters enough that the function call overhead is unacceptable would already be porting the C++ methods by hand to Rust so that they can be inlined (we do this with JS::Value for example).

@vadimcn
Copy link
Author

vadimcn commented Aug 10, 2017

Even if we get the information from libclang, we don't have a way to convey it to rustc yet. Did I miss that development?

bindgen would just explicitly emit this parameter/return value as by-reference in the signature of "C" stub.

One wrinkle in this fine plan is, of course, that by-value/by-reference rules differ by platform (e.g. Linux vs Windows). Not sure what's the best way to deal with that...

Can we do this? Of course, this goes against the very idea that the address of such objects is significant.

...Or should we do the opposite, i.e. convert such parameters to by-ref, as long as any one platform would pass them by-ref?

...Or just tell wrapper writers to use #[cfg($platform)] when calling such functions?

@vadimcn
Copy link
Author

vadimcn commented Aug 10, 2017

It's "just" another pass over our IR, this time emitting C++ instead of Rust. Not terribly difficult, but not exactly trivial either ;)

This requires having a working c++ compiler as well as adding a c++ build step into your build script. On the other hand, this would nicely solve the issue with inline functions...

@fitzgen
Copy link
Member

fitzgen commented Aug 10, 2017

bindgen would just explicitly emit this parameter/return value as by-reference in the signature of "C" stub.

Ah ok, understood.

One wrinkle in this fine plan is, of course, that by-value/by-reference rules differ by platform (e.g. Linux vs Windows). Not sure what's the best way to deal with that...

We would rely on libclang giving us the correct information for the configured target platform. This is the same as what we do for size/align/field offset/etc.

I agree with what @emilio said: maintaining our own reimplementation of different ABI parameter passing rules would be hairy and is something I'd also like to avoid unless we have no other options. Even in a scenario where we were without other options, we would need someone to step up and really own this bit of code and very thoroughly testing it for me to be comfortable landing it. Really would like to avoid this situation.

Can we do this? Of course, this goes against the very idea that the address of such objects is significant.

...Or should we do the opposite, i.e. convert such parameters to by-ref, as long as any one platform would pass them by-ref?

...Or just tell wrapper writers to use #[cfg($platform)] when calling such functions?

None of these options are great. If only we had placement new... :-/

Given the world we live in, however, I think the second option is the best choice available.

@vadimcn
Copy link
Author

vadimcn commented Aug 10, 2017

We would rely on libclang giving us the correct information for the configured target platform. This is the same as what we do for size/align/field offset/etc.

It's easy to have sizeof(T) as a platform-dependent variable in an expression. Not so much with a function signature, which you'd have to use a different syntax to invoke.

Given the world we live in, however, I think the second option is the best choice available.

This would mean, among other things, having to re-run clang on headers for all possible targets :( Well, ok, we can probably figure out which are the ones that cause trouble and run only those.

Hmm... the more I think about this, the more appealing the idea of using a C++ compiler to generate "C" wrappers arounda C++ api sounds. I actually quite like the approach rust-cpp takes on that.

@DemiMarie
Copy link

@vadimcn I agree. Just invoke the C++ compiler, create an extern "C" wrapper, and call the C functions from Rust.

@DemiMarie
Copy link

Since we already link to clang, another solution arises: generate C++ code and have clang emit LLVM IR for it. Since rustc currently generates LLVM IR too, we can perform cross-language link-time optimization to remove all of the overhead.

@dtolnay
Copy link
Member

dtolnay commented May 6, 2021

Intel website moved the System V AMD64 ABI pdf so the link in the issue is broken -- here is an updated link:
https://software.intel.com/content/dam/develop/external/us/en/documents/mpx-linux64-abi.pdf

@aapoalas
Copy link
Contributor

aapoalas commented Dec 5, 2021

FWIW, I just ran into this issue with a complex C++ shared library object. A particular static class (no constructor, no destructor) method returning an instance of another class wouldn't work with the bindgen generated bindings. The reason was RVO causing an extra argument to be expected.

Now, I don't have anything smart to say about the issue itself. Better minds have obviously worked on the issue (and perhaps partially given up on it as well). However, given that I spent 5 days or so trying to figure out what on earth is happening with this, I'd suggest that a note about ABI issues especially with RVO is added to the tutorial's Generating Bindings to C++ page. I expect it will be a great help to future adventurers into .so / .dll calling while costing very little developer time to add.

@ctrlcctrlv
Copy link

Since we already link to clang, another solution arises: generate C++ code and have clang emit LLVM IR for it. Since rustc currently generates LLVM IR too, we can perform cross-language link-time optimization to remove all of the overhead.

Wow, this is a great idea from @DemiMarie. Surprised it got no attention.

@DemiMarie
Copy link

@ctrlcctrlv thanks! If performance isn’t as big a deal, another approach would be to simply generate C++ code and link to it normally.

barcharcraz added a commit to barcharcraz/rust-bindgen that referenced this issue Oct 30, 2022
The original commit adding this note referenced rust-lang#778, but then went on to talk about how the problem was RVO and that the problem is not possible to solve in bindgen because bindgen can't know if RVO happened or not. However this is incorrect in several ways. The problem in rust-lang#778 has nothing whatsoever to do with RVO but rather with bindgen simply not understanding the calling convention for passing or returning types that are "non trivial for the purposes of calls". This is completely consistent and does not depend on what the optimizer decided to do (after all, if callsite side calling-convention did depend on what the optimizer happened to do inside the function it would be impossible for C++ compilers to emit correct calls to external C++ functions!).

You can see this quite clearly [here](https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEEgDspAAOqAqETgwe3r7%2ByanpAiFhkSwxcVyJdpgOGUIETMQEWT5%2BAVU1AnUNBEUR0bEJtvWNzTltwz2hfaUDFQCUtqhexMjsHAD06wDUACoAnkmYW3tLxFtoWFsIsZikWyRbtKhM6FuGW5iqrEn0AHQmGgAggoCMQvA4tgRMCCokwlFsTPErPEACImADMViBILBEKhIIRSIBgK2pMh0IIEDmGKxJLJ%2BMpaAYBIZ5gAbNTMcSyVsAG6oPDoABUkJpxMRaK5QIZWwUAEcvA1MFTCbSeTKCIjkZK1WTiJgCMsGKKpYCJeKgaECFsWExQiqtdz6RTIQj0SjZQqlVSxUCnaT9YbiMa0gAvTCoKgQBmc2kSjgLWicACsvD8HC0pFQnDc1msstOqwRZnRPFIBE0CYWAGsQOj0b9603my22fpOJI05Ws5xeAoQBpy5WFnBYEg0CwknRYuRKBOp/Q4sAFMwkgoEKgCKQsLy8KsAGp4TAAdwA8od02WaLQocR%2BxAot2oqEGntOGWJ2xBKeGLQ3xneCwW0jHEADtzwfUal5aFu0%2BaovChd9eCtTAkzA2g8CiYhXw8LBu1BPAWCQhYqAMZdDxPc9GCQmRBBEMR2CkWj5CUNRu10Lh9EMYw80sfRMP7SAFlQJJHAEfsOD7VDqjEvwIFcUY/E44IphKMo9BSNJZMUjT8lk3o1NmWxpI6BguhGTwWj0dpZPMyZin6cohm6HTOJBboDMciQFgUQtGMTFMuzA7MOC2VQAA42QAWjZSQtmAZBkC2aMwQYas5mS3BCAecxSzmXgKwAuYazrBsW3Kps2zQztSHTTMQr7AchyK0hR0QFBUEnacyAoCB526kBl1XddN23TBdwPI8zwvGjr1ve9HzA59mGIf8P06r8CB/P9uyA7jQMzfBIMcaCJMzODkAQtYyxQtDMwwrCcIwNZMwIojuATPgyIUCjpuoj7mPo8QmP4QRFBUdQwN0AIDCMFBeJsB7BKpLNRIyCSoqoBhUCi1DBUwKKmRxcEiDvKT7FklwGHcSycmU6nPJmJzNIKTJaaUvItIyRn1LckzbImVzjIp2oJh5oz3Is7IOcl%2Bzpl5ny/O89sOFTWruxCsLIpiuKEqSlKvDSjKICy0nizygrhxK%2BtGwq8qVZqureAa2wmsKrRipVswgvq3tmo9hZoLvDIQEkIA) Notice how the body of `main` is identical no matter if copy elision is enabled or not. To get `test` passed in a register you must remove the copy constructor.

I spent a few hours being really confused by this note before I tracked down the original PR and realized what it was trying to talk about. Hopefully this saves the next person to come across it the same trouble.

As an aside I suspect the clang c bindings can already give you information on special member functions and base classes, making it fairly easy to do this correctly. AFAICT bindgen already does not rely on llvm's code generation facilities to figure out the ABI of calls for C (except perhaps indirectly, via rustc), so this seems reasonable to just do in bindgen, explicitly.
emilio pushed a commit that referenced this issue Oct 31, 2022
The original commit adding this note referenced #778, but then went on to talk about how the problem was RVO and that the problem is not possible to solve in bindgen because bindgen can't know if RVO happened or not. However this is incorrect in several ways. The problem in #778 has nothing whatsoever to do with RVO but rather with bindgen simply not understanding the calling convention for passing or returning types that are "non trivial for the purposes of calls". This is completely consistent and does not depend on what the optimizer decided to do (after all, if callsite side calling-convention did depend on what the optimizer happened to do inside the function it would be impossible for C++ compilers to emit correct calls to external C++ functions!).

You can see this quite clearly [here](https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1DIApACYAQuYukl9ZATwDKjdAGFUtAK4sGIM6SuADJ4DJgAcj4ARpjEEgDspAAOqAqETgwe3r7%2ByanpAiFhkSwxcVyJdpgOGUIETMQEWT5%2BAVU1AnUNBEUR0bEJtvWNzTltwz2hfaUDFQCUtqhexMjsHAD06wDUACoAnkmYW3tLxFtoWFsIsZikWyRbtKhM6FuGW5iqrEn0AHQmGgAggoCMQvA4tgRMCCokwlFsTPErPEACImADMViBILBEKhIIRSIBgK2pMh0IIEDmGKxJLJ%2BMpaAYBIZ5gAbNTMcSyVsAG6oPDoABUkJpxMRaK5QIZWwUAEcvA1MFTCbSeTKCIjkZK1WTiJgCMsGKKpYCJeKgaECFsWExQiqtdz6RTIQj0SjZQqlVSxUCnaT9YbiMa0gAvTCoKgQBmc2kSjgLWicACsvD8HC0pFQnDc1msstOqwRZnRPFIBE0CYWAGsQOj0b9603my22fpOJI05Ws5xeAoQBpy5WFnBYEg0CwknRYuRKBOp/Q4sAFMwkgoEKgCKQsLy8KsAGp4TAAdwA8od02WaLQocR%2BxAot2oqEGntOGWJ2xBKeGLQ3xneCwW0jHEADtzwfUal5aFu0%2BaovChd9eCtTAkzA2g8CiYhXw8LBu1BPAWCQhYqAMZdDxPc9GCQmRBBEMR2CkWj5CUNRu10Lh9EMYw80sfRMP7SAFlQJJHAEfsOD7VDqjEvwIFcUY/E44IphKMo9BSNJZMUjT8lk3o1NmWxpI6BguhGTwWj0dpZPMyZin6cohm6HTOJBboDMciQFgUQtGMTFMuzA7MOC2VQAA42QAWjZSQtmAZBkC2aMwQYas5mS3BCAecxSzmXgKwAuYazrBsW3Kps2zQztSHTTMQr7AchyK0hR0QFBUEnacyAoCB526kBl1XddN23TBdwPI8zwvGjr1ve9HzA59mGIf8P06r8CB/P9uyA7jQMzfBIMcaCJMzODkAQtYyxQtDMwwrCcIwNZMwIojuATPgyIUCjpuoj7mPo8QmP4QRFBUdQwN0AIDCMFBeJsB7BKpLNRIyCSoqoBhUCi1DBUwKKmRxcEiDvKT7FklwGHcSycmU6nPJmJzNIKTJaaUvItIyRn1LckzbImVzjIp2oJh5oz3Is7IOcl%2Bzpl5ny/O89sOFTWruxCsLIpiuKEqSlKvDSjKICy0nizygrhxK%2BtGwq8qVZqureAa2wmsKrRipVswgvq3tmo9hZoLvDIQEkIA) Notice how the body of `main` is identical no matter if copy elision is enabled or not. To get `test` passed in a register you must remove the copy constructor.

I spent a few hours being really confused by this note before I tracked down the original PR and realized what it was trying to talk about. Hopefully this saves the next person to come across it the same trouble.

As an aside I suspect the clang c bindings can already give you information on special member functions and base classes, making it fairly easy to do this correctly. AFAICT bindgen already does not rely on llvm's code generation facilities to figure out the ABI of calls for C (except perhaps indirectly, via rustc), so this seems reasonable to just do in bindgen, explicitly.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

10 participants