-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
#[distributed_slice]
aka a method to enumerate tests
#3
Comments
Summary of Past WorkPast discussions
Use cases
Prior art
Potential RFC collaborators
Design ideaspartial const ALL_THINGS: &[Thing] = &[..];
// "[..]" means this is a definition of a partial slice constant.
// It's a compilation error to have more than one definition.
partial const ALL_THINGS = &[.., Thing(1), ..];
// "[.., x, ..]" means it's an extension or a partial slice constant.
// It has to be defined elsewhere.
partial const ALL_THINGS = &[.., Thing(2), ..];
// Another extension of the same constant. // crate `foo`
// extendable static should be public and have type `&[T]`
#[distributed_slice]
pub static ERROR_MSGS: &'static [(u32, &'static str)] = &[
(ERROR_CODE_1, ERROR_MSG_1),
(ERROR_CODE_2, ERROR_MSG_2),
];
// crate `bar` which depends on crate `foo`
use foo::ERROR_MSGS;
// if extended static has type `&[T]`, then this const should have type `T`
#[distributed_slice(ERROR_MSGS)]
const CUSTOM_ERROR: (u32, &'static str) = (0x123456, "custom err message");
|
I spent my evening doing a tiny bit of research, expanding on the points you'd already mentioned. Mostly as an overview for myself, but I thought making those thoughts public wouldn't hurt. Use cases
Prior artBoth ctor and linkme use linker magic. It's kind of amazing this works as well as it does.
Rust's Why implement this in the compiler?Since there are already libraries (like
Using the linker or the compiler?Prior art shows that this feature could be implemented in two ways. The linker route means possible limitations in platform support.
However, with "linker magic", rust could support such distributed slices, possibly through dynamically loaded libraries. Dynamic linking in Rust has always been kind of hard, and usually goes through Note that Ed Page noted during RustNL that this might not be so big of an issue, saying that rust crates span the entire spectrum from c's single header libraries to gigantic dynamically libraries like openssl for example. He thinks that the kind of applications you would use distributed slices for are not likely to be these gigantic libraries you might want to dynamically link. (correct me if I'm now misquoting you @epage) Personally, I think going through the compiler, not the linker is much less hacky and provides a more natural extension of Complications
|
Regarding dylibs, there are
|
One idea that has me more and more convinced is to not implement this feature as a slice at all. Distributed slice is maybe not even the right name. We cannot easily make the order stable so keeping indices to elements in the slice is nonsensical. Instead implementing as a type that implements Iterator (a bit like iterators over hashmaps) with a randomized order (that's seeded by something in the source code, to keep reproducible builds) we can do a lot better. Actually, the way this iterator works can then also be changed internally. If a shared library is loaded the iterators could in theory essentially be |
I also just talked to @m-ou-se who is working on an RFC to implement functions in a different library than their definition, which turns out to be a related issue. Essentially, they're "single item distributed slices". There, dynamic library loading is even more of an issue, since there can be only one implementation of a function. Thus, a dynamic library that has a conflicting definition would never be able to be loaded. |
Here's the RFC for that: rust-lang/rfcs#3632 |
@epage got an ok from Scott. The plan:
|
Problem Description
The following are just a few examples of some large crates using alternative collection systems (
Additionally, one can imagine webservers registring routes this way, although I found nobody doing that at the time of writing. In almost all the examples above, doing global registration is is an opt-in feature, behind a cargo feature flag. Especially `inventory`, based on `ctor`, which most crates mentioned above use, is only regularly tested on windows, macos and linux, and use on embedded targets is complicated.On embedded targets you must manually ensure that all global constructors are called, or a runtime like [`cortex-m-rt`](https://crates.io/crates/cortex-m-rt) must do so.It seems, authors of libraries are wary including registration systems in their library. Specifically for the testing-devex team, working on libtest-next. Custom test frameworks are useful for all kinds of purposes, like test fixtures.
Existing solutionsLinkmeBecause this pattern is so useful,
InventoryAn alternative approach, also written by David Tolnay is This is wildly unsafe, as
|
There we go, that's a kind of proposal. I hope it makes some sense. Any thoughts? |
To add weight, this was a plan agreed to in conjunction with libs-api with an explicit endorsement to implement a language feature and not to use ctor or linkme from dtolany
An important part of prior art in this is iirc the plan was to stabilize the compiler's test collection for use by users.
imo the biggest argument for an opaque type is that we are able to stabilize the least amount of the API and then go in one of several directions in the future, depending on how things evolve.
We should call out that traits will be implemented with being compatible with
This doesn't describe how we would be defining and registering items, but the name we'd use. This is where an opaque type might be difficult and require alternative design work instead. |
Thanks Ed! I'll see if I can incorporate some of your comments. I just don't quite get your last one?
|
I've personally implemented this linker magic for embedded code before, and I think that performing that work in the compiler instead is a fantastic idea. However, I'm quite worried about how this will interact with dynamic library loading, and I'd like to see a resilient API that deals with that use case properly. The Use CasesThe primary concern with loading a library at runtime is that new items get registered, and the various "distributed slices" / registries need to be intelligently updated and notified about this. Based on the use cases @jdonszelmann outlined, these registries need the following actions in the face of dynamic library loading:
I see a few patterns here. For use cases which may permit dynamic library loading, some kind of complex data structure needs to be updated every time new items in the "distributed slice" are added. All but one of the above use cases ( The important concept to note here is that the user of each "distributed slice" needs to run custom code every time a library is loaded at runtime. In general, they will take this opportunity to lock and update their internal data structures with the newly available information. If support for this feature were implemented into the compiler, and we want to take the time to properly support dynamically loaded libraries, then we can't just expose the "distributed slice" or an iterator of items in unspecified order. Some kind of hook mechanism is necessary. If a consumer of this feature does not care about dynamic library loading, then all the elements of their "distributed slices" are known to the compiler, and they should be able to access them in An Alternative DesignWe should implement two things: Let's take the example of // The current code, using `inventory`:
// The "distributed slice" object as wrapped in 'inventory'.
impl<...> inventory::Collect for ServerFnTraitObj<...> {
#[inline]
fn registry() -> &'static inventory::Registry {
static REGISTRY: inventory::Registry = inventory::Registry::new();
®ISTRY
}
}
// The actual data structure 'server_fn' stores information in.
static REGISTERED_SERVER_FUNCTIONS: LazyServerFnMap<...> = {
// Inside the 'initialize_server_fn_map' macro:
once_cell::sync::Lazy::new(|| {
inventory::iter::<ServerFnTraitObj<...>>
.into_iter()
.map(|obj| (obj.path(), obj.clone()))
.collect()
})
};
// A function to explicitly update that structure.
pub fn register_explicit<T>()
where T: ServerFn<...> {
REGISTERED_SERVER_FUNCTIONS.insert(...);
}
// An item being added to the "distributed slice".
#[linker_constructor_magic]
fn __ctor() {
<ServerFnTraitObj<...>>::registry()
.submit(ServerFnTraitObj::new(...));
}
// The code using @jdonszelmann's proposal:
// The "distributed slice" object, bikeshedding aside.
declare_distributed_slice!(REGISTERED_SERVER_FUNCTIONS_INTERNAL: ServerFnTraitObj<...>);
// The actual data structure 'server_fn' stores information in.
static REGISTERED_SERVER_FUNCTIONS: LazyServerFnMap<...> = {
// Inside the 'initialize_server_fn_map' macro:
once_cell::sync::Lazy::new(|| {
REGISTERED_SERVER_FUNCTIONS_INTERNAL
.into_iter()
.map(|obj| (obj.path(), obj.clone()))
.collect()
})
};
// A function to explicitly update that structure.
pub fn register_explicit<T>()
where T: ServerFn<...> {
REGISTERED_SERVER_FUNCTIONS.insert(...);
}
// An item being added to the "distributed slice".
add_to_distributed_slice!(ServerFnTraitObj::new(...));
// The code using my proposal:
// The actual data structure 'server_fn' stores information in.
static REGISTERED_SERVER_FUNCTIONS: ServerFnMap<...> = {
// Eagerly initialize it to empty.
Default::default()
};
// An alternative definition if dynamic loading was disallowed.
const REGISTERED_SERVER_FUNCTIONS_CONST: LazyServerFnMap<...> = {
let mut map = Default::default();
distributed_slice_const!(|obj| {
map.insert(obj.path(), obj.clone());
});
map
};
// A function to explicitly update that structure.
pub fn register_explicit<T>()
where T: ServerFn<...> {
REGISTERED_SERVER_FUNCTIONS.insert(...);
}
// The hook function called when a new item is detected.
#[distributed_slice_hook]
pub fn server_functions<T>()
where T: ServerFn<...> {
register_explicit::<T>();
}
// An item being added to the distributed slice.
add_to_distributed_slice!(ServerFnTraitObj::new(...)); A Comparison@jdonszelmann's proposal lines up nicely with how this feature is currently implemented in My proposal handles the use cases we've seen just as well. It omits the opaque registry object and instead relies on the consumer to bring their own data structure for storage. It is able to support consumers which do not want dynamic library loading as well as those that do. By only requiring a single definition in terms of a hook function, consumers don't need to worry about how dynamic library loading will work: if their hook function is well-defined, then they have a mutable static data structure they are adding to, and Rust's type system will already ensure that it is used safely. There are also additional optimizations that could be used with my proposal. Since every item is only processed by the hook function once, the compiler may be able to const-eval the execution of the hook function on items defined in the current crate / linking unit (LLVM already does this for some In terms of actually implementing this, we can start with There's still a lot to bikeshed about the API. I don't know how a distributed slice would be publicly declared or how it should appear in documentation. There are still some questions to answer about multi-threading and generics. But I believe that we need to better understand how dynamic library loading is going to look before experimenting with this feature. I'd love to hear others' suggestions on this! |
I strongly feel like we should not introduce anything related to global constructors for this, and that we should keep the design as simple as possible for now while being just flexible enough that we can change the underlying implementation any time we want. Distributed slice also is not the only feature that might theoretically want updates of dynamic library loads (panic handler, allocator, Mara's RFC). For now, I propose just to experiment with a compiler generated slices + iterator and see if that design even works so we can learn something new. As long as the feature is not stabilized, we can always argue about alternative implementations. |
There's a lot of discussion here, especially with what libraries could benefit from a
|
For the purpose of rustc, using |
@bjorn3 Might still work to have one slice for the whole dylib, and a separate slice for things that aren't in the dylib. |
Chiming in from Bevy: a mechanism like this that worked across platforms would be incredibly useful for making reflection way more user-friendly and open up further improvements in the ECS itself. Manually registering types for reflection is tedious and error-prone currently. |
Sorry if I've missed this approach being mentioned, but I'd consider rust-lang/rust#66113 to be prior art worth considering. Briefly the ideas is providing an iterator over vtables in the binary. e.g. #[test]
fn my_test() { ... } can be transformed to roughly: #[used]
static _: &dyn Test = {
struct my_test;
impl Test for my_test {
fn name(&self) -> &'static str { "my_test" }
fn run(&self) { ... }
}
&my_test
}; and execution is roughly: use std::ptr;
fn main() {
for vtable in ptr::vtables::<dyn Test>() {
let test: &dyn Test = ptr::from_raw_parts(NonNull::dangling(), vtable).as_ref_unchecked();
println!("running {}", test.name());
test.run();
}
} This notably also potentially resolves rust-lang/rfcs#668 and rust-lang/rfcs#1022 which were indeed the initial motivation. I'm not abreast of how this interacts with the pointer metadata apis rust-lang/rust#81513. |
I think an intrinsic to get all vtables is a bad idea. Vtables get duplicated into each cgu that uses them and the intrinsic likely would prevent unused vtables from getting optimized away. And if it didn't, it would have pretty unreliable optimization dependent behavior. |
Yes this may be an issue, but note per my testing at the time it was only a cost if it was used:
In that PR I also included |
We need a method for custom test harnesses to enumerate tests. dtolnay (maintainer of
linkme
andctor
) was pushing for an in-language#[distributed_slice]
The text was updated successfully, but these errors were encountered: