Skip to content

Commit

Permalink
Auto merge of #97461 - eddyb:proc-macro-less-payload, r=bjorn3
Browse files Browse the repository at this point in the history
proc_macro: don't pass a client-side function pointer through the server.

Before this PR, `proc_macro::bridge::Client<F>` contained both:
* the C ABI entry-point `run`, that the server can call to start the client
* some "payload" `f: F` passed to that entry-point
  * in practice, this was always a (client-side Rust ABI) `fn` pointer to the actual function the proc macro author wrote, i.e. `#[proc_macro] fn foo(input: TokenStream) -> TokenStream`

In other words, the client was passing one of its (Rust) `fn` pointers to the server, which was passing it back to the client, for the client to call (see later below for why that was ever needed).

I was inspired by `@nnethercote's` attempt to remove the `get_handle_counters` field from `Client` (see #97004 (comment)), which combined with removing the `f` ("payload") field, could theoretically allow for a `#[repr(transparent)]` `Client` that mostly just newtypes the C ABI entry-point `fn` pointer <sub>(and in the context of e.g. wasm isolation, that's *all* you want, since you can reason about it from outside the wasm VM, as just a 32-bit "function table index", that you can pass to the wasm VM to call that function)</sub>.

<hr/>

So this PR removes that "payload". But it's not a simple refactor: the reason the field existed in the first place is because monomorphizing over a function type doesn't let you call the function without having a value of that type, because function types don't implement anything like `Default`, i.e.:
```rust
extern "C" fn ffi_wrapper<A, R, F: Fn(A) -> R>(arg: A) -> R {
    let f: F = ???; // no way to get a value of `F`
    f(arg)
}
```
That could be solved with something like this, if it was allowed:
```rust
extern "C" fn ffi_wrapper<
    A, R,
    F: Fn(A) -> R,
    const f: F // not allowed because the type is a generic param
>(arg: A) -> R {
    f(arg)
}
```

Instead, this PR contains a workaround in `proc_macro::bridge::selfless_reify` (see its module-level comment for more details) that can provide something similar to the `ffi_wrapper` example above, but limited to `F` being `Copy` and ZST (and requiring an `F` value to prove the caller actually can create values of `F` and it's not uninhabited or some other unsound situation).

<hr/>

Hopefully this time we don't have a performance regression, and this has a chance to land.

cc `@mystor` `@bjorn3`
  • Loading branch information
bors committed May 28, 2022
2 parents 4f39fb1 + 78a83b0 commit 116201e
Show file tree
Hide file tree
Showing 8 changed files with 217 additions and 134 deletions.
71 changes: 39 additions & 32 deletions compiler/rustc_builtin_macros/src/proc_macro_harness.rs
Original file line number Diff line number Diff line change
Expand Up @@ -294,50 +294,57 @@ fn mk_decls(cx: &mut ExtCtxt<'_>, macros: &[ProcMacro]) -> P<ast::Item> {
// that we generate expressions. The position of each NodeId
// in the 'proc_macros' Vec corresponds to its position
// in the static array that will be generated
let decls = {
let local_path = |cx: &ExtCtxt<'_>, sp: Span, name| {
cx.expr_path(cx.path(sp.with_ctxt(span.ctxt()), vec![name]))
};
let proc_macro_ty_method_path = |cx: &ExtCtxt<'_>, method| {
cx.expr_path(cx.path(span, vec![proc_macro, bridge, client, proc_macro_ty, method]))
};
let attr_or_bang = |cx: &mut ExtCtxt<'_>, ca: &ProcMacroDef, ident| {
cx.resolver.declare_proc_macro(ca.id);
cx.expr_call(
span,
proc_macro_ty_method_path(cx, ident),
vec![
cx.expr_str(ca.span, ca.function_name.name),
local_path(cx, ca.span, ca.function_name),
],
)
};
macros
.iter()
.map(|m| match m {
let decls = macros
.iter()
.map(|m| {
let harness_span = span;
let span = match m {
ProcMacro::Derive(m) => m.span,
ProcMacro::Attr(m) | ProcMacro::Bang(m) => m.span,
};
let local_path = |cx: &ExtCtxt<'_>, name| cx.expr_path(cx.path(span, vec![name]));
let proc_macro_ty_method_path = |cx: &ExtCtxt<'_>, method| {
cx.expr_path(cx.path(
span.with_ctxt(harness_span.ctxt()),
vec![proc_macro, bridge, client, proc_macro_ty, method],
))
};
match m {
ProcMacro::Derive(cd) => {
cx.resolver.declare_proc_macro(cd.id);
cx.expr_call(
span,
proc_macro_ty_method_path(cx, custom_derive),
vec![
cx.expr_str(cd.span, cd.trait_name),
cx.expr_str(span, cd.trait_name),
cx.expr_vec_slice(
span,
cd.attrs
.iter()
.map(|&s| cx.expr_str(cd.span, s))
.collect::<Vec<_>>(),
cd.attrs.iter().map(|&s| cx.expr_str(span, s)).collect::<Vec<_>>(),
),
local_path(cx, cd.span, cd.function_name),
local_path(cx, cd.function_name),
],
)
}
ProcMacro::Attr(ca) => attr_or_bang(cx, &ca, attr),
ProcMacro::Bang(ca) => attr_or_bang(cx, &ca, bang),
})
.collect()
};
ProcMacro::Attr(ca) | ProcMacro::Bang(ca) => {
cx.resolver.declare_proc_macro(ca.id);
let ident = match m {
ProcMacro::Attr(_) => attr,
ProcMacro::Bang(_) => bang,
ProcMacro::Derive(_) => unreachable!(),
};

cx.expr_call(
span,
proc_macro_ty_method_path(cx, ident),
vec![
cx.expr_str(span, ca.function_name.name),
local_path(cx, ca.function_name),
],
)
}
}
})
.collect();

let decls_static = cx
.item_static(
Expand Down
6 changes: 3 additions & 3 deletions compiler/rustc_expand/src/proc_macro.rs
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@ use rustc_span::{Span, DUMMY_SP};
const EXEC_STRATEGY: pm::bridge::server::SameThread = pm::bridge::server::SameThread;

pub struct BangProcMacro {
pub client: pm::bridge::client::Client<fn(pm::TokenStream) -> pm::TokenStream>,
pub client: pm::bridge::client::Client<pm::TokenStream, pm::TokenStream>,
}

impl base::BangProcMacro for BangProcMacro {
Expand Down Expand Up @@ -42,7 +42,7 @@ impl base::BangProcMacro for BangProcMacro {
}

pub struct AttrProcMacro {
pub client: pm::bridge::client::Client<fn(pm::TokenStream, pm::TokenStream) -> pm::TokenStream>,
pub client: pm::bridge::client::Client<(pm::TokenStream, pm::TokenStream), pm::TokenStream>,
}

impl base::AttrProcMacro for AttrProcMacro {
Expand Down Expand Up @@ -73,7 +73,7 @@ impl base::AttrProcMacro for AttrProcMacro {
}

pub struct DeriveProcMacro {
pub client: pm::bridge::client::Client<fn(pm::TokenStream) -> pm::TokenStream>,
pub client: pm::bridge::client::Client<pm::TokenStream, pm::TokenStream>,
}

impl MultiItemModifier for DeriveProcMacro {
Expand Down
81 changes: 46 additions & 35 deletions library/proc_macro/src/bridge/client.rs
Original file line number Diff line number Diff line change
Expand Up @@ -357,22 +357,33 @@ impl Bridge<'_> {
}
}

/// A client-side "global object" (usually a function pointer),
/// which may be using a different `proc_macro` from the one
/// used by the server, but can be interacted with compatibly.
/// A client-side RPC entry-point, which may be using a different `proc_macro`
/// from the one used by the server, but can be invoked compatibly.
///
/// N.B., `F` must have FFI-friendly memory layout (e.g., a pointer).
/// The call ABI of function pointers used for `F` doesn't
/// need to match between server and client, since it's only
/// passed between them and (eventually) called by the client.
/// Note that the (phantom) `I` ("input") and `O` ("output") type parameters
/// decorate the `Client<I, O>` with the RPC "interface" of the entry-point, but
/// do not themselves participate in ABI, at all, only facilitate type-checking.
///
/// E.g. `Client<TokenStream, TokenStream>` is the common proc macro interface,
/// used for `#[proc_macro] fn foo(input: TokenStream) -> TokenStream`,
/// indicating that the RPC input and output will be serialized token streams,
/// and forcing the use of APIs that take/return `S::TokenStream`, server-side.
#[repr(C)]
#[derive(Copy, Clone)]
pub struct Client<F> {
pub struct Client<I, O> {
// FIXME(eddyb) use a reference to the `static COUNTERS`, instead of
// a wrapper `fn` pointer, once `const fn` can reference `static`s.
pub(super) get_handle_counters: extern "C" fn() -> &'static HandleCounters,
pub(super) run: extern "C" fn(Bridge<'_>, F) -> Buffer,
pub(super) f: F,

pub(super) run: extern "C" fn(Bridge<'_>) -> Buffer,

pub(super) _marker: PhantomData<fn(I) -> O>,
}

impl<I, O> Copy for Client<I, O> {}
impl<I, O> Clone for Client<I, O> {
fn clone(&self) -> Self {
*self
}
}

/// Client-side helper for handling client panics, entering the bridge,
Expand Down Expand Up @@ -419,31 +430,31 @@ fn run_client<A: for<'a, 's> DecodeMut<'a, 's, ()>, R: Encode<()>>(
buf
}

impl Client<fn(crate::TokenStream) -> crate::TokenStream> {
pub const fn expand1(f: fn(crate::TokenStream) -> crate::TokenStream) -> Self {
extern "C" fn run(
bridge: Bridge<'_>,
f: impl FnOnce(crate::TokenStream) -> crate::TokenStream,
) -> Buffer {
run_client(bridge, |input| f(crate::TokenStream(input)).0)
impl Client<crate::TokenStream, crate::TokenStream> {
pub const fn expand1(f: impl Fn(crate::TokenStream) -> crate::TokenStream + Copy) -> Self {
Client {
get_handle_counters: HandleCounters::get,
run: super::selfless_reify::reify_to_extern_c_fn_hrt_bridge(move |bridge| {
run_client(bridge, |input| f(crate::TokenStream(input)).0)
}),
_marker: PhantomData,
}
Client { get_handle_counters: HandleCounters::get, run, f }
}
}

impl Client<fn(crate::TokenStream, crate::TokenStream) -> crate::TokenStream> {
impl Client<(crate::TokenStream, crate::TokenStream), crate::TokenStream> {
pub const fn expand2(
f: fn(crate::TokenStream, crate::TokenStream) -> crate::TokenStream,
f: impl Fn(crate::TokenStream, crate::TokenStream) -> crate::TokenStream + Copy,
) -> Self {
extern "C" fn run(
bridge: Bridge<'_>,
f: impl FnOnce(crate::TokenStream, crate::TokenStream) -> crate::TokenStream,
) -> Buffer {
run_client(bridge, |(input, input2)| {
f(crate::TokenStream(input), crate::TokenStream(input2)).0
})
Client {
get_handle_counters: HandleCounters::get,
run: super::selfless_reify::reify_to_extern_c_fn_hrt_bridge(move |bridge| {
run_client(bridge, |(input, input2)| {
f(crate::TokenStream(input), crate::TokenStream(input2)).0
})
}),
_marker: PhantomData,
}
Client { get_handle_counters: HandleCounters::get, run, f }
}
}

Expand All @@ -453,17 +464,17 @@ pub enum ProcMacro {
CustomDerive {
trait_name: &'static str,
attributes: &'static [&'static str],
client: Client<fn(crate::TokenStream) -> crate::TokenStream>,
client: Client<crate::TokenStream, crate::TokenStream>,
},

Attr {
name: &'static str,
client: Client<fn(crate::TokenStream, crate::TokenStream) -> crate::TokenStream>,
client: Client<(crate::TokenStream, crate::TokenStream), crate::TokenStream>,
},

Bang {
name: &'static str,
client: Client<fn(crate::TokenStream) -> crate::TokenStream>,
client: Client<crate::TokenStream, crate::TokenStream>,
},
}

Expand All @@ -479,21 +490,21 @@ impl ProcMacro {
pub const fn custom_derive(
trait_name: &'static str,
attributes: &'static [&'static str],
expand: fn(crate::TokenStream) -> crate::TokenStream,
expand: impl Fn(crate::TokenStream) -> crate::TokenStream + Copy,
) -> Self {
ProcMacro::CustomDerive { trait_name, attributes, client: Client::expand1(expand) }
}

pub const fn attr(
name: &'static str,
expand: fn(crate::TokenStream, crate::TokenStream) -> crate::TokenStream,
expand: impl Fn(crate::TokenStream, crate::TokenStream) -> crate::TokenStream + Copy,
) -> Self {
ProcMacro::Attr { name, client: Client::expand2(expand) }
}

pub const fn bang(
name: &'static str,
expand: fn(crate::TokenStream) -> crate::TokenStream,
expand: impl Fn(crate::TokenStream) -> crate::TokenStream + Copy,
) -> Self {
ProcMacro::Bang { name, client: Client::expand1(expand) }
}
Expand Down
2 changes: 2 additions & 0 deletions library/proc_macro/src/bridge/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -208,6 +208,8 @@ mod handle;
mod rpc;
#[allow(unsafe_code)]
mod scoped_cell;
#[allow(unsafe_code)]
mod selfless_reify;
#[forbid(unsafe_code)]
pub mod server;

Expand Down
83 changes: 83 additions & 0 deletions library/proc_macro/src/bridge/selfless_reify.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,83 @@
//! Abstraction for creating `fn` pointers from any callable that *effectively*
//! has the equivalent of implementing `Default`, even if the compiler neither
//! provides `Default` nor allows reifying closures (i.e. creating `fn` pointers)
//! other than those with absolutely no captures.
//!
//! More specifically, for a closure-like type to be "effectively `Default`":
//! * it must be a ZST (zero-sized type): no information contained within, so
//! that `Default`'s return value (if it were implemented) is unambiguous
//! * it must be `Copy`: no captured "unique ZST tokens" or any other similar
//! types that would make duplicating values at will unsound
//! * combined with the ZST requirement, this confers a kind of "telecopy"
//! ability: similar to `Copy`, but without keeping the value around, and
//! instead "reconstructing" it (a noop given it's a ZST) when needed
//! * it must be *provably* inhabited: no captured uninhabited types or any
//! other types that cannot be constructed by the user of this abstraction
//! * the proof is a value of the closure-like type itself, in a sense the
//! "seed" for the "telecopy" process made possible by ZST + `Copy`
//! * this requirement is the only reason an abstraction limited to a specific
//! usecase is required: ZST + `Copy` can be checked with *at worst* a panic
//! at the "attempted `::default()` call" time, but that doesn't guarantee
//! that the value can be soundly created, and attempting to use the typical
//! "proof ZST token" approach leads yet again to having a ZST + `Copy` type
//! that is not proof of anything without a value (i.e. isomorphic to a
//! newtype of the type it's trying to prove the inhabitation of)
//!
//! A more flexible (and safer) solution to the general problem could exist once
//! `const`-generic parameters can have type parameters in their types:
//!
//! ```rust,ignore (needs future const-generics)
//! extern "C" fn ffi_wrapper<
//! A, R,
//! F: Fn(A) -> R,
//! const f: F, // <-- this `const`-generic is not yet allowed
//! >(arg: A) -> R {
//! f(arg)
//! }
//! ```

use std::mem;

// FIXME(eddyb) this could be `trait` impls except for the `const fn` requirement.
macro_rules! define_reify_functions {
($(
fn $name:ident $(<$($param:ident),*>)?
for $(extern $abi:tt)? fn($($arg:ident: $arg_ty:ty),*) -> $ret_ty:ty;
)+) => {
$(pub const fn $name<
$($($param,)*)?
F: Fn($($arg_ty),*) -> $ret_ty + Copy
>(f: F) -> $(extern $abi)? fn($($arg_ty),*) -> $ret_ty {
// FIXME(eddyb) describe the `F` type (e.g. via `type_name::<F>`) once panic
// formatting becomes possible in `const fn`.
assert!(mem::size_of::<F>() == 0, "selfless_reify: closure must be zero-sized");

$(extern $abi)? fn wrapper<
$($($param,)*)?
F: Fn($($arg_ty),*) -> $ret_ty + Copy
>($($arg: $arg_ty),*) -> $ret_ty {
let f = unsafe {
// SAFETY: `F` satisfies all criteria for "out of thin air"
// reconstructability (see module-level doc comment).
mem::MaybeUninit::<F>::uninit().assume_init()
};
f($($arg),*)
}
let _f_proof = f;
wrapper::<
$($($param,)*)?
F
>
})+
}
}

define_reify_functions! {
fn _reify_to_extern_c_fn_unary<A, R> for extern "C" fn(arg: A) -> R;

// HACK(eddyb) this abstraction is used with `for<'a> fn(Bridge<'a>) -> T`
// but that doesn't work with just `reify_to_extern_c_fn_unary` because of
// the `fn` pointer type being "higher-ranked" (i.e. the `for<'a>` binder).
// FIXME(eddyb) try to remove the lifetime from `Bridge`, that'd help.
fn reify_to_extern_c_fn_hrt_bridge<R> for extern "C" fn(bridge: super::Bridge<'_>) -> R;
}
Loading

0 comments on commit 116201e

Please sign in to comment.