Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#[ffi_returns_twice] #2633

Closed
wants to merge 5 commits into from
Closed

#[ffi_returns_twice] #2633

wants to merge 5 commits into from

Conversation

gnzlbg
Copy link
Contributor

@gnzlbg gnzlbg commented Feb 9, 2019

@mark-i-m
Copy link
Member

mark-i-m commented Feb 9, 2019

My initial thought is that I’m not sure from the RFC what the soundness and safety implications are, e.g.:

  • I feel like you should only be able to apply this attribute to unsafe fn and maybe only externs.
  • Is it sound to allow this on a function that has parameters with lifetimes?
  • Is it safe to invoke drop in a function with this attribute?

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Feb 9, 2019

  • I feel like you should only be able to apply this attribute to unsafe fn and maybe only externs.

This attribute, as proposed in this RFC and as implemented in the PR, can only be applied to extern functions. These are unsafe to call.

  • Is it sound to allow this on a function that has parameters with lifetimes?

C FFI is already unsound.

  • Is it safe to invoke drop in a function with this attribute?

This RFC does not extend Rust with the ability to write bare functions that return multiple times.


Stable Rust already allows you to just add an extern function that returns multiple times and call it. That's already possible. This RFC adds an attribute, #[ffi_returns_twice] that allows users to tell the compiler that these extern functions might return multiple times - that's it.

This RFC doesn't allow you to write your own in Rust nor tells you what these unsafe functions are / aren't allowed to do. That would be part of the UCG. The RFC mentions some types of UB that are easily introduced when writing code using these types of functions (use after move, deallocating memory without running destructors, etc.), but these things are always UB.

@mark-i-m
Copy link
Member

mark-i-m commented Feb 9, 2019

The RFC only mentions that it can be added to extern functions, but it does not explicitly prohibit use anywhere else (in fact, it doesn’t mention what happens for other types of functions at all). Perhaps that can be clarified?

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Feb 9, 2019

The RFC only mentions that it can be added to extern functions, but it does not explicitly prohibit use anywhere else (in fact, it doesn’t mention what happens for other types of functions at all). Perhaps that can be clarified?

Uh, indeed. I meant functions in an extern { ... } block not fn "extern" .... I've replaced all uses of "extern" with "foreign", does that make it clear?

@jonas-schievink jonas-schievink added T-lang Relevant to the language team, which will review and decide on the RFC. T-compiler Relevant to the compiler team, which will review and decide on the RFC. A-ffi FFI related proposals. labels Feb 10, 2019
@mark-i-m
Copy link
Member

I thinks that’s a big improvement, but I think it would also be good to have a statement like “Adding this attribute to any other type of item is an error”.

@Centril Centril added A-attributes Proposals relating to attributes A-machine Proposals relating to Rust's abstract machine. and removed T-compiler Relevant to the compiler team, which will review and decide on the RFC. labels Feb 11, 2019
@pyfisch
Copy link
Contributor

pyfisch commented Feb 12, 2019

This RFC adds a new function attribute, #[ffi_returns_twice], which indicates that an foreign function can return multiple times.

I find it confusing that the RFC always talks about returning multiple times but the attribute is called "twice".

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Feb 12, 2019

I find it confusing that the RFC always talks about returning multiple times but the attribute is called "twice".

I do so too.

This attribute is called #[ffi_returns_twice] and only works in Rust FFI function declarations. Those need to match the C function declaration, and in C this attribute is called returns_twice for whatever reason (maybe returns_multiple_times was too long).

The only people that will need this attribute is those reading C code on one side, and writing the Rust FFI wrapper on the other. So keeping the name the same, even if its unfortunate, makes the lives of the only users that this attribute will have easier.

EDIT: the docs do say that it returns multiple times do, maybe it is worth it to call out why it is called returns_twice in the docs (C legacy).

@scottmcm
Copy link
Member

Since you have a couple of these ffi_* attributes, another alternative I'd like considered is having it be one attribute with different parameters #[ffi(returns_twice)], #[ffi(nounwind, readnone)], etc.

(I don't know if it's better, but if they interact with each other -- like IIRC you mentioned const and pure would -- it might be easier to check and read one attribute than multiple.)

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Feb 14, 2019

@scottmcm see #2637

@eddyb
Copy link
Member

eddyb commented Feb 14, 2019

cc @RalfJung

@RalfJung
Copy link
Member

Basically all I ask is that you make sure this

In the presence of types that implement Drop, usage of APIs that return multiple times requires extreme care to avoid deallocating memory without invoking constructors.

is known to everyone using setjmp/longjmp.

Though shall not jump over stack frames though do not own.

undefined behavior of the type "use-after-move".

In the presence of types that implement `Drop`, usage of APIs that return
multiple times requires extreme care to avoid deallocating memory without
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth saying "deallocating pinned memory" here. At least to my knowledge that is the only case where we really care in terms of type system guarantees.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't Pin just a normal Rust library ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is. So what?

Unsound thread spawning and Rc are also both just normal Rust libraries, and yet having one makes the other unsound. The same goes for Pin and functions that let you deallocate memory without running drop.

@eddyb
Copy link
Member

eddyb commented Mar 13, 2019

@kornelski Do you mean with some #[unwind] attributes on the Rust side and compiling the C code with... -fexceptions I guess?

Otherwise unwinding is UB.

@kornelski
Copy link
Contributor

@eddyb Yes, that's what I mean. I realize it's not fully baked in Rust yet.

@jeff-davis
Copy link

I see that this is merged and available in nightly (behind a feature gate). What is the path toward stabilization? Is more input/discussion needed, or just bake time?

If it helps, this addresses my use case.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Apr 4, 2019

This RFC specifies what the attribute does, and there are some issues that have been raised about the RFC but have not been resolved yet, so that would need to be done. After that, the corresponding teams would need to vote on whether this should be merged or not (just normal process).

This RFC leaves the syntax as an unresolved question, so even if this RFC is merged, stabilization would be blocked on the syntax issue being resolved. We will probably need a different RFC proposing a more holistic approach to FFI attributes in general. There is an issue open about that: #2637 .

@mtak-
Copy link

mtak- commented Apr 13, 2019

Interesting factoid: there are functions that might return twice for which this attribute would be incorrect. _xbegin or generally anything that starts a HTM region does not require this attribute.

The naming consistency with C, along with ffi or other similar prefix, helps to make it clear that rust is not doing anything different from clang, and porting of the rtmintrin.h more straightforward.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Apr 14, 2019

Interesting factoid: there are functions that might return twice for which this attribute would be incorrect. _xbegin or generally anything that starts a HTM region does not require this attribute.

@mtak- why would adding returns_twice to _xbegin be incorrect ?

@Amanieu
Copy link
Member

Amanieu commented Apr 14, 2019

@gnzlbg Because on the second return of setjmp and vfork, they are able to observe data on the stack and in caller-saved registers that has been modified by the first return. This is not the case with _xbeing since the CPU will revert any changes to memory and registers that may have occurred in a transaction, thus effectively "rewinding" execution to the _xbegin call perfectly.

@RalfJung
Copy link
Member

That doesn't make it incorrect though, just unnecessary, right?
Or maybe that is always true for all functions, that in the worst case the attribute is unnecessary?

@Amanieu
Copy link
Member

Amanieu commented Apr 14, 2019

I guess you could say that about all functions. It is only necessary when the first return has visible side-effects, which is not the case with _xbegin because the hardware reverts any changes in the aborted transaction.

Essentially, consider the following code:

fn foo() -> i32 {
    let mut x = 0;
    let second_return = setjmp(env) != 0;
    x += 1;
    if second_return {
       return x;
    } else {
        longjmp(env);
    }
}

If the compiler is allowed to perform its normal optimizations, it will see that the address of x never escapes the function, and the return value can therefore be constant-folded to 1. However this is incorrect since the x += 1 line is executed twice because of the longjmp. The returns_twice attribute effectively tells the compiler to assume that all local variables may be clobbered by the setjmp call.

However if you used _xbegin and _xabort instead of setjmp and longjmp then foo can only ever return 1 since the full state is reverted by _xabort. The compiler does not need to treat _xbegin specially and can just const-propagate x to 1.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Apr 14, 2019

@Amanieu note that returns_twice does not even mark x as clobbered there. Reading / writing to it without using volatile is UB even if setjmp is returns_twice. @comex wrote a great summary of what returns_twice does here: #2625 (comment)

This does not invalidate your point about _xbegin.

@mtak-
Copy link

mtak- commented Apr 14, 2019

Everything @Amanieu said.

The CPU is really taking care to undo everything up until the point of the _xbegin abort handler. With setjmp/longmp, the software/libc has the responsibility of undoing the parts it wants undone. For _xbegin a compiler barrier is about all that's needed.

On powerpc you can suspend and resume transactions, which might mean power's tbegin needs the returns_twice attribute, since side effects can occur in the middle of a transaction. However, I'm not sure what from @comex's list would be necessary for tbegin.

EDIT: The compiler barrier is not really for correctness, it's just so that the optimizer doesn't move more memory accesses into the critical section than necessary, and also so that instructions which automatically trigger aborts such as cpuid are not moved into the critical section.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented May 23, 2019

FYI, I am now of the opinion that I think I would prefer a #[returns_twice] attribute that we would be able to commit to for Rust "in general", and not only for FFI, whatever that might mean, and even if one cannot write functions in Rust itself that return more than once. One might not be able to do this today, but maybe someday, with inline assembly, one will be able to. This RFC does not attempt to solve that problem, and it might be worth to explore that, before committing to this forever.

In the meantime, I think it would be better to just require code that needs this to live on nightly, and provide, for example, a general #[llvm_attribute(...)] mechanism to apply LLVM attributes to function, arguments, etc. so that one can just write #[llvm_attribute(returns_twice)] and call it a day here.

I think that for the time being, it is more important for us to be able to gain experience with this on nightly, than to ship this as a stable feature. We can always come back to this later.

Thoughts?


EDIT: I know this landed on nightly already, but we can just move that to a more general feature to apply LLVM attributes in a subsequent PR.

@Centril
Copy link
Contributor

Centril commented May 23, 2019

Seems like a step backwards from not having LLVM be the only backend?

@gnzlbg
Copy link
Contributor Author

gnzlbg commented May 23, 2019

@Centril #[llvm_attribute(..)] would be perma unstable, with no intend to ever stabilize it. It would allow people to experiment on nightly with these, and other attributes, like the ones implemented in rust-lang/rust#58327 . I think that for those attributes as well, I would prefer a solution that works for Rust in general, more than something that only works in C FFI.

@jeff-davis
Copy link

@gnzlbg I appreciate that you are taking this all the way back to first principles, but it has left the problem quite open ended, and I hope there is some way we can work from both sides of the problem at once. While experimenting with a first-class #[returns_twice] attribute that works with FFI, inline asm, etc., we can also try to support the specific case of setjmp in libc.

The people who need it (like me) are probably better off with a supported setjmp that might be buggy than relying on undefined behavior, and whatever problems we do find can then be reported. It's likely to get a lot more use if there's a short- or medium-term path to stabilization.

We will need documentation around setjmp, just like C, to tell you how you can and can't use it, and suggest things like marking the caller #[inline(never)]. I don't think that kind of ugliness is avoidable anyway (at least not all of it).

My use case, by the way, is about writing complex extensions for postgres in pure rust. I just got an extension working that creates a background worker, listens on a port, and handles requests with tokio by passing queries to SPI (postgres internal query processing API), and returns the results to the client. I've been super happy with rust in every way, and I'm presenting the work at PGCon next week. There are only two holes in this story:

  1. Rust and C strings don't align very well and require extra allocations (perhaps there are ways to mitigate this).
  2. I'm relying on platform-specific code and undefined behavior to use setjmp for error handling.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented May 24, 2019

@jeff-davis right now there is no setjmp, but we could expose it on nightly as an unstable feature. Would that work for you? Or do you need it on stable Rust ASAP ?

@jeff-davis
Copy link

@gnzlbg I would like setjmp to be on a path to stable libc in the short to medium term. I would be very happy with that.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented May 24, 2019

Note that this RFC does not stabilize setjmp, it just provides an attribute that allows calling setjmp via C FFI with less issues than what we have now.

@jeff-davis
Copy link

@gnzlbg Right, setjmp is #2625 which is blocked on #2633 (this issue) which is blocked on #2637. If the latter two are still far out, I'll move the discussion back to #2625 to see if we can find a near-term solution.

@nikomatsakis
Copy link
Contributor

I'm going to go ahead and close this; this exact area of work (longjmp, in partcular) is being worked on in the ffi-unwind project group (in which @gnzlbg has been active from time to time). This seems like it should be discussed in that context.

@nikomatsakis
Copy link
Contributor

If you're like to learn about the ffi-unwind project group, check out this repository. =)

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Feb 6, 2024
…, r=compiler-errors

Remove `ffi_returns_twice` feature

The [tracking issue](rust-lang#58314) and [RFC](rust-lang/rfcs#2633) have been closed for a couple of years.

There is also an attribute gate in R-A which should be removed if this lands.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Feb 6, 2024
…, r=compiler-errors

Remove `ffi_returns_twice` feature

The [tracking issue](rust-lang#58314) and [RFC](rust-lang/rfcs#2633) have been closed for a couple of years.

There is also an attribute gate in R-A which should be removed if this lands.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Feb 6, 2024
…, r=compiler-errors

Remove `ffi_returns_twice` feature

The [tracking issue](rust-lang#58314) and [RFC](rust-lang/rfcs#2633) have been closed for a couple of years.

There is also an attribute gate in R-A which should be removed if this lands.
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Feb 7, 2024
Rollup merge of rust-lang#120502 - clubby789:remove-ffi-returns-twice, r=compiler-errors

Remove `ffi_returns_twice` feature

The [tracking issue](rust-lang#58314) and [RFC](rust-lang/rfcs#2633) have been closed for a couple of years.

There is also an attribute gate in R-A which should be removed if this lands.
GuillaumeGomez pushed a commit to GuillaumeGomez/rustc_codegen_gcc that referenced this pull request Feb 15, 2024
…ler-errors

Remove `ffi_returns_twice` feature

The [tracking issue](rust-lang/rust#58314) and [RFC](rust-lang/rfcs#2633) have been closed for a couple of years.

There is also an attribute gate in R-A which should be removed if this lands.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-attributes Proposals relating to attributes A-ffi FFI related proposals. A-machine Proposals relating to Rust's abstract machine. A-unsafe Unsafe related proposals & ideas T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support for setjmp / longjmp