
Add asynchronous versions of most embedded-hal traits using GATs #285

Closed
wants to merge 32 commits

Conversation

@lachlansneff
Contributor

lachlansneff commented Jun 21, 2021

GATs (generic associated types), among other things, let you return futures from trait methods.

The lifetimes can be a bit tricky but it results in a pretty clean trait.

For example:

use core::future::Future;

/// Async transfer in place.
pub trait TransferInPlace<Word: 'static> {
    /// Error type
    type Error;

    /// Associated future for the `transfer_inplace` method.
    type TransferInPlaceFuture<'a>: Future<Output = Result<(), Self::Error>> + 'a
    where
        Self: 'a;

    /// Writes `words` to the slave from the `readwrite` buffer and reads words into the same buffer.
    /// This method uses a single `readwrite` buffer.
    fn transfer_inplace<'a>(&'a mut self, readwrite: &'a mut [Word]) -> Self::TransferInPlaceFuture<'a>;
}
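
For a sense of how this reads at the call site, here's a minimal usage sketch (the `read_id` helper and the `0x9F` command byte are hypothetical, purely for illustration):

async fn read_id<S: TransferInPlace<u8>>(spi: &mut S) -> Result<[u8; 4], S::Error> {
    // Hypothetical command byte followed by dummy words; the reply is
    // read back into the same buffer.
    let mut buf = [0x9F, 0, 0, 0];
    spi.transfer_inplace(&mut buf).await?;
    Ok(buf)
}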

It's important to note that GATs are only available on nightly at the moment, so this uses feature attributes. I'd like to discuss how this should be exposed (likely through a feature until GATs are stabilized).

Also, some traits that don't use futures are duplicated across the nb, blocking, and futures modules. Those should probably be merged.

@rust-highfive

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @ryankurte (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@Dirbaio
Member

Dirbaio commented Jun 22, 2021

Nice to see more interest in async! :)

Some general feedback:

1. Can't use DMA

I see the proposed traits are modeled after the existing nb traits, which read/write one word at a time. This rules out implementing them with DMA: the CPU needs to get involved for every single word, waking the async task and having it start the next word's write. Waking an async task is not free (it can take a few µs), and for fast IO (eg multi-MHz SPI) the time to wake the task will likely be similar to the time to send one word. In that case, the behavior degrades to "blocking, but crappier": the CPU has no time to run other tasks anyway, and it's slower and clunkier than real blocking.

Note that nb traits suffer from this same problem. However, this limitation was "okay-ish", since it's impossible to impl nb traits with DMA anyway (they can't borrow a buffer for the entire duration of the operation, only for the instant when polled). With futures it's possible to allow DMA impls, so it'd be a mistake not to.

embassy-traits reads/writes entire slices (example) so that impls can set up DMA to read/write the entire slice in one go and only wake the task when it's fully done.
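
Roughly, a slice-based async write trait has this shape (names here are illustrative, not embassy's exact API):

use core::future::Future;

pub trait Write<Word: 'static> {
    type Error;

    type WriteFuture<'a>: Future<Output = Result<(), Self::Error>> + 'a
    where
        Self: 'a;

    /// Completes only once the entire `words` slice has been written,
    /// so an impl can hand the whole slice to DMA in one go.
    fn write<'a>(&'a mut self, words: &'a [Word]) -> Self::WriteFuture<'a>;
}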

However, doing DMA with "naked" slices is unsound, but only if futures are leaked. Since DMA reads/writes directly from the slice, it has to be stopped if the future is dropped mid-operation. This is easy; embassy does that no problem. However, in Rust you can leak things (not run Drop) without unsafe. If you do that, DMA will not be stopped and can keep writing to the buffer after it's deallocated. Detailed explanation of the problem here.
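
As a sketch of the drop side (the `DmaChannel` type is a hypothetical stand-in, not a real HAL API):

/// Hypothetical handle standing in for a real DMA channel.
struct DmaChannel;

impl DmaChannel {
    fn stop(&mut self) { /* abort the in-flight transfer in a real impl */ }
}

/// Future for an in-progress transfer; conceptually it also borrows
/// the buffer the hardware is writing into.
struct TransferFuture<'a> {
    dma: &'a mut DmaChannel,
}

impl Drop for TransferFuture<'_> {
    fn drop(&mut self) {
        // Cancelling here keeps a *dropped* future sound: the hardware
        // stops before the buffer borrow ends. mem::forget skips Drop
        // entirely, which is exactly the leak problem described above.
        self.dma.stop();
    }
}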

To fully solve this, the traits would have to use embedded-dma, but then the ergonomics heavily suffer, especially without alloc.

Embassy "solves" it by forbidding the user to leak futures. Dropping them is fine, leaking them is not. This means embassy's API is not fully safe, but in practice it's really really easy to use it correctly, since there's only a handful of ways to leak things (mem::forget, ManuallyDrop, Box::leak, Rc/Arc cycles), and they're all really easy to avoid. IMO this is the right tradeoff: super efficient DMA use, almost-safe API, super ergonomic. I expect this to be very controversial for adoption in embedded-hal though, as many people are of the "must be fully safe at any cost" opinion.

The third solution would be to fix the Leak Problem in Rust itself. IMO the Leak Problem is a Rust design issue. There have been some discussions; it might be fixable with a Leak auto trait that works similarly to Sized. However, it seems no one feels this pain the way we (embedded) do, since we don't want alloc. For example, io-uring has the same problem (io-uring is essentially DMA done by the kernel instead of the hardware) and they solve it by owning the buffers with Box/Vec.

And no, Pin doesn't prevent the leak problem. Pinning the buffer doesn't. Pinning the future doesn't.

2. Maybe not add all traits

I'd wait until there's clear demand for traits before adding them.

  • Async Uart, Spi and I2c are definitely wanted.
  • Async ADC is likely to have no benefit, since usually taking an ADC sample is fast enough that the core won't have time to do useful work on other tasks during it.
  • I'm not sure about Capture; its usage is quite niche.

3. Lifetimes

Lifetimes are indeed complicated. embassy-traits has where Self: 'a (I guess this is where you took it from?) because embassy drivers can borrow pins/peripherals instead of owning them, so eg an Uart driver is Uart<'a, UART1> instead of the more common Uart<UART1>.

If we drop this requirement then it can be simplified (I think that bound can be simply removed), but I would like to keep it. It's nice that drivers can be non-'static.

(I just wanted to point out the reason behind these bounds, since it's not very obvious).

4. Iterator traits

I think iterator-based traits like WriteIter shouldn't exist in async flavor because 1. it forces CPU involvement for each word and 2. utility is arguably limited since the iter itself is not async (should be a Stream, which is a whole other can of worms).

@lachlansneff
Contributor Author

@Dirbaio

With futures it's possible to allow DMA impls, so it'd be a mistake not to.

I agree with this. And I'm personally fine with an unsafe API to improve the ergonomics, as long as it's easy to use safely.

I'm unsure if FullDuplex should be copied over to the futures module because reading and writing single words to SPI is probably going to be nearly instant.

@ryankurte
Contributor

ryankurte commented Jun 23, 2021

hey thanks for the PR! looks generally good, and async is going to be great fun when we get it worked out ^_^

i'm in agreement with @eldruin here (#172 (comment)), the GAT approach seems workable to me, definitely need to demonstrate / prove each of them in a couple of use cases (linux-embedded-hal may be one good HAL candidate as there's already some async in the sub-projects).

@Dirbaio also makes some great points, perhaps we could focus on SPI and I2C (and UART?) to reduce the scope of the initial implementation?

embassy-traits reads/writes entire slices (example) so that impls can set up DMA to read/write the entire slice in one go and only wake the task when it's fully done.

this approach is what i would expect / seems reasonable to me in that it should be implementable with or without DMA; i don't think it'd be useful to expose embedded-dma or other strange constraints in this interface.

the one caveat would seem to be UART; to me the UART read/write would ideally be represented as a stream/sink, though i understand these aren't yet in core. i would be okay with a futures dependency for now if there's a no_std and alloc-free way of getting it in.

i think UART differs from the other peripherals because the transactions are unbounded; circular buffering / DMA in this case seems like a configuration / implementation detail that shouldn't need to be exposed to HAL consumers. i can see an argument for a slice-based write to avoid the byte-wise copy/flush (and improve the ergonomics of the internal buffer), but it doesn't seem like there's an unsurprising way to implement a serial read via slice without pushing complexity up to users.

it also doesn't seem like there's an obvious/ergonomic way to address wakers at the moment (#172 (comment)), but the proposed wake-every-poll approach would seem to do the job for now.

For reference, couple of prior works on the topic (that i think y'all have already mentioned, just useful to have a list):

@Dirbaio
Member

Dirbaio commented Jun 24, 2021

UART read/write would ideally be represented as a stream/sink

These have the same problem as per-word traits: you need CPU involvement for each word, so you can't eg write a big slice with DMA. Maybe it could be done with a stream/sink of slices, but that gets ugly fast.

it doesn't seem like there's an unsurprising way to implement a serial read via slice without pushing complexity up to users.

"Read a slice of words" is the lowest common denominator of what most hardware out there can do, while not sacrificing much:

  • It allows efficient DMA use if you know how much data you're going to receive upfront.
  • It still allows stuff like "read until newline" by reading 1-word slices in a loop (see the sketch below).
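
Something like this (assuming a slice-based read trait shaped like what's discussed above; all names illustrative):

use core::future::Future;

/// Hypothetical slice-based async read trait; `read` completes once
/// the whole `buf` has been filled.
pub trait Read<Word: 'static> {
    type Error;

    type ReadFuture<'a>: Future<Output = Result<(), Self::Error>> + 'a
    where
        Self: 'a;

    fn read<'a>(&'a mut self, buf: &'a mut [Word]) -> Self::ReadFuture<'a>;
}

/// Reads bytes one at a time until a newline, returning how many bytes
/// were written into `out`.
async fn read_line<R: Read<u8>>(uart: &mut R, out: &mut [u8]) -> Result<usize, R::Error> {
    let mut n = 0;
    while n < out.len() {
        let mut byte = [0u8; 1];
        uart.read(&mut byte).await?; // 1-word slice: completes per byte
        out[n] = byte[0];
        n += 1;
        if byte[0] == b'\n' {
            break;
        }
    }
    Ok(n)
}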

@ryankurte
Contributor

These have the same problem as per-word traits: you need CPU involvement for each word, so you can't eg write a big slice with DMA. Maybe it could be done with a stream/sink of slices, but that gets ugly fast.

yeah, though they'd work well implemented on top of IRQs or DMA and a ring-buffer. i think the ideals differ a bit depending on whether you look at this as an abstraction for physical UART peripherals vs. a UART abstraction for drivers to use.

"Read a slice of words" is the lowest common denominator of what most hardware out there can do, while not sacrificing much:

seems alright so long as we're clear that any call might return a partially filled slice (say you had a [u8; 1024] buffer: you'd expect to RX data roughly as it came in, not to fill the buffer first, i think. the opposite case would mean you'd almost always have to use byte-wise reads for responsiveness.)

@eldruin
Member

eldruin commented Jun 27, 2021

I would like to note that if we only partially fill the receive array, we need to return the number of words read. Alternatively, we could return a slice into the input array, although that would result in an interface like #286.

@Dirbaio
Member

Dirbaio commented Jun 29, 2021

seems alright so long as we're clear that any call might return a partially filled slice (say you had a [u8; 1024] buffer: you'd expect to RX data roughly as it came in, not to fill the buffer first, i think. the opposite case would mean you'd almost always have to use byte-wise reads for responsiveness.)

In practice that would degenerate to byte-by-byte read anyway. The future would have to complete as soon as 1 byte has been received, and while the CPU is processing that byte no more bytes would be read. The hardware FIFO for UARTs is typically tiny, usually just 1..4 bytes.

The "allow short writes" solution makes a lot of sense when the IO is backed by a big buffer, and read calls can copy big chunks off it. That's what std::io::Read does. However I don't think forcing implementations to buffer is a good idea. With DMA hardware it's nice that the hardware can read/write directly to the slices provided by the user.

embassy-nrf has two uart drivers: Uarte, which implements uart::Read and uart::Write, waits for the entire slice to be read/written, and can therefore do zerocopy IO; and BufferedUarte, which has ring buffers that the hardware reads/writes to, and impls io::AsyncRead and io::AsyncWrite. These copy to/from the ring buffers, so it's not zerocopy IO, but it can be more performant since the hardware can keep reading/writing to the ring buffer even when the CPU is busy with other things. AsyncRead allows "short reads" like what you're proposing.

Embassy also has a ReadUntilIdle trait, although I think that one is maybe too specific: https://github.com/embassy-rs/embassy/blob/d49adc98beebe5b7982204150ebdc0a485fefab2/embassy-traits/src/uart.rs#L23-L25

It really depends on the use case. I don't think there's One Uart Trait To Rule Them All.

  • Terminal for a human to type in? use uart::Read with 1-byte slices. Humans are slow, you won't miss keypresses.
  • Custom protocol where you reasonably control the timing and frame sizes? maybe you can get away with uart::Read or uart::ReadUntilIdle, which is nice because you get zerocopy IO.
  • Complex protocol, hard to predict timings/sizes, want max performance? use BufferedUarte with AsyncRead/AsyncWrite.

Also, having generic io::AsyncRead, io::AsyncWrite representing "a pipe of bytes" is nicer because they're more general. For example, a PPP driver could require them, which would allow doing PPP over serial, PPP over TCP, PPP over Unix pipe...

TLDR:

  • IMO uart traits should be "wait until entire slice has been read" to allow zerocopy
  • For more advanced usecases generic io traits work better
  • io traits can be implemented for uart with ringbuffers.

@lachlansneff
Contributor Author

Now that GATs are no longer an incomplete feature, it's looking likely that they'll land before the end of the year! Maybe it's time to get this in as an unproven feature?

@burrbull
Member

burrbull commented Aug 4, 2021

Maybe just an async feature?

@eldruin
Member

eldruin commented Aug 4, 2021

We have decided against an unproven feature. See here.
The best way would be to demonstrate that these traits work well by creating HAL implementations for them and also demonstrating their use in a driver.
If only some of the traits are proven, we can always merge those and add the others as they are shown to work. I assume once a couple async traits are merged, community interest will drive implementations for the rest fairly quickly.

@burrbull
Member

burrbull commented Aug 4, 2021

I propose to hide them under an async feature anyway and document that it is unstable.
Before stabilization, the feature gate keeps the use of the unstable feature opt-in.
After stabilization, it simply enables the "additional" functionality.
Unlike the unproven feature, we never need to delete it.
Also, we won't need to increase the MSRV.

@lulf

lulf commented Nov 3, 2021

@lachlansneff I see there is one unresolved comment from @Dirbaio. I volunteer to continue this work if you don't have time for it.

@Dirbaio From embassy POV, do you think these traits look sufficiently complete that we can give a go at making the embassy implementations support them? (I can make a PR for it)

@Dirbaio
Member

Dirbaio commented Nov 3, 2021

@lulf Yes, the traits here look great and I don't think there will be any issue switching embassy to them. It'd be great to do it, it'll help test the traits :)

I have one major concern about merging this PR: async fn in traits is coming. It'll allow using actual async fns, which will desugar to GAT + regular fn like the traits in this PR. It is very likely that async fn traits will become the idiomatic way, and switching from GATs to async fn will be a breaking change.
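
For example, an async fn in a trait like this (future syntax, shown as a comment since it doesn't compile today):

// trait Write {
//     type Error;
//     async fn write(&mut self, buf: &[u8]) -> Result<(), Self::Error>;
// }

would desugar to approximately the GAT form this PR uses:

use core::future::Future;

trait Write {
    type Error;

    type WriteFuture<'a>: Future<Output = Result<(), Self::Error>> + 'a
    where
        Self: 'a;

    fn write<'a>(&'a mut self, buf: &'a [u8]) -> Self::WriteFuture<'a>;
}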

IMO we should release this in a separate crate, for example async-embedded-hal. This crate can get breaking changes as needed, releasing v0.1, v0.2, v0.3... When we consider it "proven", instead of releasing a v1.0 we can merge it into the main embedded-hal as originally planned.

Gating the async traits behind an async feature won't work if we plan to make breaking changes to them, even if we declare "if you enable the async feature you're not covered by semver". There will be cases where e.g. async traits got broken in v1.1 -> v1.2 so you have to pin embedded-hal = "=1.1", but v1.2 added some unrelated blocking trait that your HAL wants to implement so your HAL requires v1.2 and now you can't build at all.

@lachlansneff
Contributor Author

lachlansneff commented Nov 3, 2021

@lulf

@lachlansneff I see there is one unresolved comment from @Dirbaio. I volunteer continuing this work if you don't have time for it.

I'll finish making those changes today or tomorrow.

I'm with @Dirbaio on this: a separate crate async-embedded-hal or embedded-hal-async would be the right move until async trait is in nightly. I'll go ahead and make the crate in the next day or so.

If anyone has an opinion about the name, I'm fine with either.

@burrbull
Member

burrbull commented Nov 3, 2021

+1 for embedded-hal-async. Easier to find on crates.io.

@derekdreery

derekdreery commented Nov 22, 2021

@Dirbaio I have one major concern about merging this PR: async fn in traits is coming. It'll allow using actual async fns, which will desugar to GAT + regular fn like the traits in this PR. It is very likely that async fn traits will become the idiomatic way, and switching from GATs to async fn will be a breaking change.

I've had the following issue with SPI using embassy: I want to write out bits to a display as quickly as possible (since there are a lot to write and my SPI is not that fast). To do this, I want to use 2 buffers for my DMA: one buffer being DMA'd out to the peripheral while I fill the other.

Now, a difference between async fn and futures with GATs is that with GATs the futures are a concrete type. That type could have a method like start_dma or similar that starts the operation immediately. For the async fn version, remember that futures do nothing until polled, so we have to poll the future (await the async fn). This complicates how you implement the parallel write mentioned above.

So basically what I'm saying is that I'd like a concrete type for the future, plus a method like start_dma that eagerly starts writing to the peripheral without waiting for the future to be polled. Maybe there is an elegant solution I've missed; if so, please tell me. Otherwise, I feel like we're losing ergonomics if we use async fn, at least for my use case.
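
Concretely, the pattern I'm after looks something like this (every name here is hypothetical, assuming a slice-write trait shaped like the ones discussed above, with the eager-start capability expressed as a bound on the future type):

use core::future::Future;

/// Hypothetical extension for concrete future types that can start the
/// hardware transfer eagerly, before being polled.
trait StartDma {
    fn start_dma(&mut self);
}

/// Hypothetical slice-write trait whose future also supports StartDma.
trait Write<Word: 'static> {
    type Error;

    type WriteFuture<'a>: Future<Output = Result<(), Self::Error>> + StartDma + 'a
    where
        Self: 'a;

    fn write<'a>(&'a mut self, words: &'a [Word]) -> Self::WriteFuture<'a>;
}

/// Double-buffered write: start DMA on one buffer while filling the other.
async fn render<S: Write<u8>>(
    spi: &mut S,
    frame_a: &[u8],
    frame_b: &mut [u8],
) -> Result<(), S::Error> {
    let mut tx = spi.write(frame_a);
    tx.start_dma();     // eagerly kick off the transfer for frame_a
    frame_b.fill(0x00); // stand-in for the work of filling the other buffer
    tx.await            // completes once the frame_a transfer is done
}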

@lulf

lulf commented Dec 16, 2021

@lachlansneff I think starting with embedded-hal-async is nice. What is the preferred way to go about this? Merge this PR and then reorganize e-h into two crates, or do we want to see that reorg happen in this PR?

@lachlansneff
Contributor Author

@lachlansneff I think starting with embedded-hal-async is nice. What is the preferred way to go about this? Merge this PR and then reorganize e-h into two crates, or do we want to see that reorg happen in this PR?

Sorry, the semester just finished, so I didn't have time to set this up. I have it half set up on my laptop. I was thinking of just creating the crate and dropping this PR.

@ryankurte
Contributor

ryankurte commented Dec 19, 2021

given the fundamentals (generic type bounds for now, async trait one day) are pretty clear now, i wonder whether it might be useful for us (@rust-embedded/hal) to create the embedded-hal-async crate here with the usual tooling and a bunch of warnings about its alpha-ness? then we could take a similar approach to e-h and collaboratively build it up component by component.

Dirbaio mentioned this pull request Dec 20, 2021
@Dirbaio
Member

Dirbaio commented Dec 20, 2021

I've opened a PR adding embedded-hal-async. #334

@Dirbaio
Member

Dirbaio commented May 4, 2022

This can probably be closed, now that we have embedded-hal-async.

@eldruin
Member

eldruin commented May 4, 2022

Indeed. Thank you everybody!

eldruin closed this May 4, 2022