Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce some macros for soundness #11

Merged
merged 29 commits into from
Mar 16, 2024
Merged

Introduce some macros for soundness #11

merged 29 commits into from
Mar 16, 2024

Conversation

dureuill
Copy link
Owner

@dureuill dureuill commented Mar 3, 2024

Fixes #8

steffahn added 6 commits March 6, 2024 12:27
For a few Rust versions now, sealed traits no longer need to (ab)use
public-in-private tricks, since private supertraits are allowed.

With a unsafe fn new_scope(...) -> impl Scope<...>, the type Wrapper
no longer needs to be public either.
…sed-in expressions.

Especially unsafe is important, as the macro would otherwise be unsound, allowing unsafe code
in e.g. a scop!() without explicit usage of `unsafe` from the user.
…tion

I believe rustfmt support is important enough that we should encourage this encourage
this style, and even force the inner braces to help with that. Also, this makes
the macro just take a simple $b:block, which makes the rendered docs
slightly more clear what is expected syntactically, compared to
the status quo of {$($t:tt)*}.
@steffahn
Copy link
Contributor

steffahn commented Mar 6, 2024

I've collected some suggestions / improvements from my first review of this PR into a PR to your PR branch ;-)

For more information look at the commit messages.

@steffahn steffahn mentioned this pull request Mar 6, 2024
…ases involving lifetimes.

For example if we have something like
```rs
fn try_zip_files<'scope, 'a: 'scope, R: Read + Seek + 'scope>(
    mut archive: ZipArchive<R>,
    s: &'a str,
) -> impl nolife::Scope<Family = ZipReaderFamily, Output = ZipResult<nolife::Never>> + 'scope {
    scope!({
        /*use s */ s;
        // ...
    }
}
```
this code fails before this commit.

(The general pattern is to add a `+ 'scope` lifetime and then
add `…: 'scope` bounds for all captured lifetimes and type parameters.)

This still leaves adding an actual test case for such a use case (and documentation for it) as TODO.
@steffahn
Copy link
Contributor

steffahn commented Mar 6, 2024

it is now difficult to fix all generic parameters of a BoxScope, so that the containing struct can be without generic parameters (that are related to the BoxScope). I'm not sure how to fix that, except by boxing both the dyn Scope and its Future associated type

That’s a good concern. I’ll look into that, too.

@dureuill
Copy link
Owner Author

dureuill commented Mar 6, 2024

Theoretically, since the BoxScope already owns the moral equivalent of a Box<RawScope>, there should be no need to actually box the dyn Future inside the RawScope.

In practice, Rust is making this super difficult, as anything containing a DST is super difficult to handle.

I tried to mark the associated Future type inside of Scope as ?Sized, but then the RawScope is storing the Future in an Option, so it can't be !Sized. MaybeUninit or any manual union or enum are out too, and I really need to start the RawScope without the Future so that its state can be morally pinned before passing it to the time_capsule

I thought that maybe we could erase a RawScope<S> to a RawScope<ErasedScope> after the fact, but short of implementing something similar to what anyhow is doing I don't think we can do this easily.

@steffahn
Copy link
Contributor

steffahn commented Mar 6, 2024

I’ve been thinking (not too deeply yet) of about creating an approach of

struct GeneralScope<S: Scope<Output = Never>> (GeneralScopeInner<S>);
enum GeneralScopeInner<S: Scope<Output = Never>> {
    Initial(S),
    Running(MaybeUninit<S::Future>, State<S::Family>, PhantomPinned),
    Poison,
}

and giving the enter API on that like

pub fn enter<'borrow, Output, G>(self: Pin<&'borrow mut Self>, f: G) -> Output
where
    G: for<'a> FnOnce(&'borrow mut <S::Family as Family<'a>>::Family) -> Output,

which would already help decouple the boxing from the logic. So instead of BoxScope<S> one would use Pin<Box<GeneralScope<S>>>.

Now, I think that one could then place the essential parts of enter of polling + returning a &'borrow mut <S::Family as Family<'some_lifetime>>::Family into a trait-object safe, unsafe API1, and to implement enter accepting a general impl FnOnce in terms of that. Then we just need to put that behind some trait, (I’m running out of names here… let’s say GeneralScopeTrait { type Family } for now) and then the user can just convert their

Pin<Box<GeneralScope<S>>>

into

Pin<Box<dyn GeneralScopeTrait<Family = S::Family>>>

via coercion.

Footnotes

  1. With the use of private supertraits, this could perhaps even be private API, I haven’t checked if that’s usable in default implementations of methods then; the wrapping enter method could furthermore make due with Pin<&mut Self> and use a Self: Sized + Unpin bound instead, for better convenience, and using it would be powered by a

    impl GeneralScopeTrait for Pin<P>
    where
        P: DerefMut,
        <P as Deref>::Target: GeneralScopeTrait
    

    implementation. Edit: no that's not a full replacement, as one would want to be able to get the full lifetime &'borrow mut <F as Family<'a>>::Family in a callback for Pin<&'borrow mut dyn GeneralScopeTrait<Family = F>>, so – just in case someone needs it – perhaps having both API variants available, e.g. enter and enter_pinned, could be useful.

@dureuill
Copy link
Owner Author

dureuill commented Mar 6, 2024

Hey,

thanks for the suggestions, it is true that a trait like GeneralScopeTrait would be convenient.

However I remember specifically avoiding a direct Box (it is instead reimplemented using raw pointers) because it is self-referential, and I wanted to avoid the effects of noalias on Box.

More generally I tried to be very cautious to the paths that allow accessing the self-referential part (the one in State, not the async scope, that one I leave to the compiler) to avoid any pointer invalidation. Of course there could be mistakes already, but I'm afraid I could introduce new ones if I fiddle a bit too much with that code.

I'll try and see if I can come up with another solution that doesn't involve modifying the RawScope too much, and otherwise, I'll box all the things until naming opaque types becomes possible in Rust.

@dureuill
Copy link
Owner Author

dureuill commented Mar 6, 2024

I merged your commits from #12. Thank you they were very useful ❤️

@steffahn
Copy link
Contributor

steffahn commented Mar 6, 2024

However I remember specifically avoiding a direct Box (it is instead reimplemented using raw pointers) because it is self-referential, and I wanted to avoid the effects of noalias on Box.

Sure, that makes a lot of sense; leaving a more generalized approach probably to primitives like this to arrive some time in the future.

@dureuill
Copy link
Owner Author

dureuill commented Mar 6, 2024

I pushed an update that adds an ErasedScope<Family, Output> struct and an erased trait method to Scope that returns the ErasedScope corresponding to the Scope.

ErasedScope is not super optimized (it is a Wrapper in disguise, with both the producer and the future as trait objects), but it will work for now I think, unless we find a more efficient solution.

I updated my gist to offer an alternative DynZipReader struct that uses the new ErasedScope to be without generic parameters.

@steffahn
Copy link
Contributor

steffahn commented Mar 6, 2024

I'm seeing even the currently published approach of DynBoxScope<T> = BoxScope<T, Pin<Box<dyn Future<Output = Never>>>> came with two levels of boxing (though only one level of dyn calls, and also not three levels of boxing). I'm wondering if that couldn't be flattened, though.


Regarding ErasedScope, it seems unfortunate that BoxScope<S> takes the whole S: Scope but only uses S::Family and S::Future; by re-parametrizing to BoxScope<Family, Future>, one could avoid the need for Box<dyn FnOnce(…)…> wrapping the producer, too.


Also, I do think I have a relatively complete version in my head, of the alternative approach of wrapping the async { ... } block, not the whole |time_capsule| async { ... } closure, in an unsafe Wrapper type, enforcing no mismatch of the TimeStamp with non-corresponding scopes via invariant unique lifetimes.

I should try to write that down tomorrow, for comparison to see which approach seems "nicer" overall.

@steffahn
Copy link
Contributor

steffahn commented Mar 7, 2024

I should try to write that down tomorrow, for comparison to see which approach seems "nicer" overall.

Made an experiment of trying to actually write it down and it looks like I missed something that it makes it so it doesn't work after all: The problem comes from tagging TimeCapsule with a lifetime, TimeCapsule<'lifetime, T>, which gets captured by the async block future, and ultimately runs into the issue that Rust doesn't really support anything like impl for<'lifetime> FnOnce(TImeCapsule<'lifetime, T>) -> impl Trait<'lifetime> yet. I.e. Rust doesn't really do well with HRTB-closures when the return type of the closure is both an anonymous/opaque type with just an impl Trait... signature, and that type (not even necessarily the trait bound itself) captures the HRTB-lifetime.

So it's probably best to rather work with the approach of this PR for not. Of course, my other two sections of my previous comment are unrelated to this experiment, and thus still relevant as potential ways to improve the code in this PR (or in future PRs).

@dureuill
Copy link
Owner Author

dureuill commented Mar 8, 2024

Thank you for your attempt, it is good to know what works and what doesn't.

I'm seeing even the currently published approach of DynBoxScope = BoxScope<T, Pin<Box<dyn Future<Output = Never>>>> came with two levels of boxing (though only one level of dyn calls, and also not three levels of boxing). I'm wondering if that couldn't be flattened, though.

I'll try an approach reliant on manual type erasure rather than dyn trait, similar to anyhow's approach. So that you can build a BoxScope<NoFuture> with BoxScope<NoFuture>::new_erased<impl Future>, with NoFuture a type erased Future that implements Future itself.

If that works, I'm wondering if I should make that approach the only option, because most of the time if you're using a BoxScope it'll be to store it in a struct and very frequently you don't want that struct to be generic because of that 🤔

Regarding ErasedScope, it seems unfortunate that BoxScope<S> takes the whole S: Scope but only uses S::Family and S::Future; by re-parametrizing to BoxScope<Family, Future>, one could avoid the need for Box<dyn FnOnce(…)…> wrapping the producer, too.

I think you're right I'll try to revert that part of the changes

@dureuill
Copy link
Owner Author

dureuill commented Mar 8, 2024

fixed the tests and now I have an actual miri failure related to nofuture.

I'll take a look later.

@dureuill
Copy link
Owner Author

dureuill commented Mar 9, 2024

Ah might be the NonNull. It internally uses a *const. I remember avoiding it for RawScope, possibly for this reason. I'll try after transforming them into raw *mut

Testing non-compiling examples in miri is not very interesting, and
there is a bug where a test does not compile under normal nightly but
**does** compile under miri
- Drop erased future with the original layout of the future
- Never construct a reference to the erased future
- Access the erased future on a chain where no reference is ever
  created. This is important for stacked borrows that check the size of
the allocation. Creating any reference to a StackScope<T,
NoFuture<Erased>> creates an access only for the size of that object,
eschewing the actual future size.
@dureuill
Copy link
Owner Author

dureuill commented Mar 10, 2024

Fixed the NoFuture part, there were multiple soundness issue.

  1. I was unduly creating references to the erased future. Fixed mostly by introducing a RawFuture trait that does all the operations by pointer, rather than calling drop and poll on &mut NoFuture
  2. I was dropping the RawScope with the layout of the erased future, not the layout of the original future. Fixed by saving the layout of the RawScope in the NoFuture and deallocating that.
  3. I was creating references to the erased RawScope, which while not unsound, didn't borrow enough of the allocation to cover the future part. Fixed by carefully only creating pointers paths.

That certainly was a journey 😅 but now miri is happy in both stacked borrow and tree borrows

@steffahn, do you think there will remain any soundness issues once I merge this and #9?

@steffahn
Copy link
Contributor

I still want to try usage of ordinary dyn Future, I think that's possible, and if possible, it would seem more in line of "using language features for unsafe stuff as much as possible" that seems to be a motto of this crate.

Regarding other soundness issues, that's hard to know. The more subtle soundness issues I find in crates, the less confidently I'd ever say there none left. There are none I'm currently aware of once these issues are fixed, but a finalized version of the code that fixes it is the very basis of even being able to start really reviewing the whole thing more in-depth.

@dureuill
Copy link
Owner Author

dureuill commented Mar 10, 2024

"using language features for unsafe stuff as much as possible"

Sure that would be great if we could make it work. I don't see how right now, due to the tension between:

  1. The need to 2-steps initialize the RawScope, since it is itself sef-referential:
    1. we need it to have a stable address for the State part of the RawScope
    2. that State part is used to produce the Future, so we have to store an uninitialized future in the first step. In nolife 3.x, the uninitialized Future was behind an option, currently it is behind a raw MaybeUninit.
  2. The fact that Rust doesn't know how to store a !Sized that is maybe-initialized. An enum (including Option) or a union (including MaybeUninit) require a T: Sized.

Now, there might be another approach I'm not seeing here, and the limitation of (2) is not fundamental, it just appears to be a limitation of the Rust language at the time, which is why we can fake it by reimplementing an erased future ourselves.

I agree, as much as I loved writing the code in nofuture.rs, it goes a bit over the unsafe budget that this crate is comfortable with, and I wouldn't like to introduce soundness issues in user code for something that is less integral to what nolife is doing.

To me the best resolution would be for some kind of named opaque types to land in Rust, so the NoFuture option is kind of temporary. I wonder if I shouldn't settle for Box<dyn Future> as a more reasonable if less performant option in the meantime.

@dureuill
Copy link
Owner Author

a finalized version of the code that fixes it is the very basis of even being able to start really reviewing the whole thing more in-depth.

Sure, I'll re-read the whole thing to check the SAFETY comments, then I'll land these two PRs.

@steffahn
Copy link
Contributor

With repr(C) you can turn a MaybeUninit-containing struct into one that contains the value directly in the field, behind the pointer indirection (pointer cast).

I'm currently on a hike, so I can't write this down too much in detail - as mentioned I'll gladly try an implementation myself, perhaps later today, if I find the time for it.

@dureuill
Copy link
Owner Author

Sure, enjoy your hike and take your time, I can wait 😊

Went for a two hours walk myself (does it still qualifies as a "walk" after two hours?)

Hope your afternoon is sunny ☀️

@steffahn
Copy link
Contributor

steffahn commented Mar 10, 2024

Looks like this works just fine. Here's the code: https://github.com/steffahn/nolife/commits/use_trait_objects/

I've worked not from the latest commit, but from 22448c8 so you can't merge this but would need to git reset --hard to the thing, or throw in some reverts first, or something like that.

Also, I haven't worked through documenting all the unsafe yet (though I did leave a few comments in places). Tests seem to pass fine, at least, and IMO the code is a lot simpler this way.

@steffahn
Copy link
Contributor

Actually, I've just added a revert commit myself now, the code is in #13.

@dureuill
Copy link
Owner Author

dureuill commented Mar 10, 2024

oh wow, so your solution was to have the MaybeUninit<F> only at construction time, and not as a permanent field of RawScope at all.

That's very smart, I can guarantee that before today I would never have thought of that one. TIL.

I like the fields function and the denies you added btw

I feel like my mind isn't sharp enough tonight to properly review this (pretty much the entirety of RawScope changed), but I'll definitely give it a go sometimes this week.

Thank you again for your contribution, you are making this library better 👍

@dureuill
Copy link
Owner Author

dureuill commented Mar 10, 2024

(btw, I updated my gist to use your branch, and it looks nice 👍)

@steffahn
Copy link
Contributor

Also, I think my changes might have eliminated a case of memory leak if the scope.run call panics. Not 100% sure what the code did there previously; if there was a leak, then maybe we're missing a test for that case, because miri should detect memory leaks.

@dureuill
Copy link
Owner Author

v0.3.x used a separate ClosedBoxScope type that was forgotten only after the open would be successful, so there was no leak in v0.3.x. I introduced it when I removed the ClosedBoxScope, and re-reading the code it is kind of obvious 😅

It is true that we are missing a panicking_producer test. That's not theoretically attainable by a user using the scope! macro, but since they can technically use the unsafe new_scope...

I added a test to my branch. I checked that it fails miri with a leak on issue_8 and passes with your rebased branch.

@steffahn
Copy link
Contributor

Ah, of course – the things that aren’t possible with scope!.. I guess it doesn’t hurt to be defensively correctly handling such cases nonetheless.

steffahn and others added 4 commits March 10, 2024 23:24
* "Multiple soundness fixes for erased futures"
* "NoFuture: use a RawFuture trait rather than implementing Drop and Future"

This reverts commits fbd2f05 and 6900cf7.
@dureuill dureuill merged commit 8f218e0 into main Mar 16, 2024
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants