Implement DMA based on embedded-dma traits #153
I've just pushed a proof-of-concept example using the SAI peripheral with DMA: https://github.com/antoinevg/stm32h7xx-hal/blob/sai-dma/examples/sai-dma-passthru.rs I coded this up before you pushed this branch so I'll probably need to go back and update it for your changes! |
Please excuse me if I'm a little off-track here but I'm still very new to embedded-hal implementation!
re: Soundness of taking a mutable reference to the target peripheral, rather than consuming it completely
Are we assuming that a given peripheral will only ever have a single DMA stream associated with it?
The SAI peripheral, in particular, has two channels: A & B. Each of these channels has its own DMA stream associated with it. These are usually configured as one stream each for transmitting and receiving audio data.
So there are two related issues here I think:
Would it make sense to modify i.e. something like:
And also modify the
|
My expectation is that it would work like I'm not familiar with the SAI peripheral but for example if you need to ensure the peripheral settings aren't changed during the DMA transfer, do it like If you don't need to ensure that peripheral settings aren't changed during the DMA transfer you can just have the peripheral store an If you need to ensure that only some peripheral settings aren't changed during the DMA transfer you can do something like |
I've had some DMA code using the serial peripheral: https://github.com/mattico/stm32h7xx-hal/tree/add-dma-example |
@richardeoin Re: Do the compiler fences also need matching synchronisation barriers for the Cortex-M7? https://community.st.com/s/article/FAQ-DMA-is-not-working-on-STM32H7-devices has good information about this if you haven't seen it. From what I remember, I couldn't get DMA to work with the serial port without disabling the dcache on the DMA buffers. No combination of |
With DMA in a cached region you'd have to use the cache control operations to specifically invalidate (on DMA writes) or clean (on DMA reads) the memory in use for DMA. |
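To illustrate the buffer side of this, here is a minimal host-runnable sketch, assuming the Cortex-M7's 32-byte data cache line. Cache maintenance operates on whole lines, so a DMA buffer is usually aligned and padded to that size so that cleaning or invalidating it does not touch unrelated neighbouring data. `DmaBuffer` and `covers_whole_lines` are hypothetical names, not part of this PR, and the actual clean/invalidate calls are only indicated in comments because they need the target hardware.

```rust
use core::mem::{align_of, size_of};

/// Data cache line size on the Cortex-M7, in bytes.
const CACHE_LINE: usize = 32;

/// Hypothetical wrapper: the alignment forces the buffer onto its own cache
/// lines, and the compiler pads the size up to a multiple of the alignment.
#[repr(C, align(32))]
pub struct DmaBuffer<const N: usize>(pub [u8; N]);

/// True if a memory region starts and ends on cache-line boundaries.
pub fn covers_whole_lines(addr: usize, len: usize) -> bool {
    addr % CACHE_LINE == 0 && len % CACHE_LINE == 0
}

fn main() {
    let buf = DmaBuffer::<64>([0; 64]);
    let addr = buf.0.as_ptr() as usize;

    // On hardware, with the buffer in a cached region:
    // - before the DMA reads this memory, clean the lines (write back CPU data);
    // - after the DMA writes this memory, invalidate the lines (drop stale data).
    assert!(covers_whole_lines(addr, size_of::<DmaBuffer<64>>()));
    println!("buffer at {addr:#x}, alignment {}", align_of::<DmaBuffer<64>>());
}
```

The alternative mentioned in the thread, placing DMA buffers in a region with the dcache disabled, avoids these operations altogether.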
Thank you for taking the time to answer. I think I understood about 1% of what you were saying there, I really meant it when I said I'm new to embedded-hal development! 🤣 Could you show what you mean in a few lines of hypothetical code? Alternatively, is there a tutorial or documentation I could read that can help bridge the gaps between the working SAI/DMA implementation code I posted earlier in this thread, my two questions and your reply? I could probably piece things together on my own from the sources but time is expensive and it would be cool if I could put the little I have towards contributing functionality rather than reverse engineering! |
Hi @antoinevg @mattico ! Those examples are great, thanks for sharing them. Also nice that you're trying to update them to work with this PR. Indeed DMA with the SAI isn't well supported without several changes, so right now you're running a bit ahead of where this PR is @antoinevg
Yes, the current instances of Indeed the approach taken by the serial HAL would be a good option. Here's the function where the serial HAL (which owns the serial PAC) is split into two ZSTs. Here we're using ZSTs to represent ownership, whilst not consuming resources at runtime. You'd then implement the DMA like so:
Potentially this approach would work for the SAI channels too. |
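A rough sketch of how the ZST split might look for the two SAI channels, runnable on a host. Everything here is a simplified stand-in: `Sai`, `ChannelA`, `ChannelB` and `start_transfer` are hypothetical names, the `TargetAddress` trait is reduced to a single method, and the register addresses are placeholders rather than values from the reference manual.

```rust
/// Stand-in for the HAL type that owns the PAC peripheral.
pub struct Sai {
    _private: (),
}

/// Zero-sized handles: each one represents ownership of a single SAI channel.
pub struct ChannelA {
    _private: (),
}
pub struct ChannelB {
    _private: (),
}

/// Simplified stand-in for the DMA target trait discussed in this thread.
pub trait TargetAddress {
    /// Address of the data register the DMA stream should target.
    fn address(&self) -> usize;
}

// Placeholder addresses for illustration only.
const CH_A_DATA_REG: usize = 0x1020;
const CH_B_DATA_REG: usize = 0x1040;

impl Sai {
    pub fn new() -> Self {
        Sai { _private: () }
    }

    /// Consumes the peripheral, so the split can only happen once.
    pub fn split(self) -> (ChannelA, ChannelB) {
        (ChannelA { _private: () }, ChannelB { _private: () })
    }
}

impl TargetAddress for ChannelA {
    fn address(&self) -> usize {
        CH_A_DATA_REG
    }
}

impl TargetAddress for ChannelB {
    fn address(&self) -> usize {
        CH_B_DATA_REG
    }
}

/// Stand-in for `Transfer::init`: takes exclusive access to one target.
pub fn start_transfer<T: TargetAddress>(target: &mut T) -> usize {
    target.address()
}

fn main() {
    let (mut a, mut b) = Sai::new().split();
    // Two mutable borrows coexist because they refer to different values.
    let rx = start_transfer(&mut a);
    let tx = start_transfer(&mut b);
    println!("rx target {rx:#x}, tx target {tx:#x}");
}
```

Because the handles are zero-sized, the ownership tracking exists only at compile time and costs nothing at runtime.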
@mattico The link about memory layout and dcache is useful, although it's a separate issue (I think). This PR doesn't consider the dcache at all yet, it just assumes it's off. |
@antoinevg I added an extra option to the
If you have a neater way of modifying the macro that's also great, I didn't spend too much time thinking about it. Of course it doesn't solve the problem with actually initialising both streams at the same time, since they both want a mutable reference to pac::SAI1 |
Awesome, working nicely thank you!
We also need access to pac::SAI1 after DMA has started up. The SAI peripheral is a bit weird in that it needs DMA to be up & running before you can set its I see that That said, wouldn't it be better to rather have EDIT: It gets worse… 😅 It looks like the SAI peripheral needs the DMA to be fully configured before it can be successfully configured otherwise DMA stalls after the first interrupt is fired and the flag registers are corrupted. Which of course we can't do because the |
Yup, this closure allows access to whatever memory the Transfer has exclusive mutable access to. That would be the place to set the
Ah, you mean that the DMA must be configured before the call to If you're willing to use
let mut sai1_a_for_dma = unsafe { pac::Peripherals::steal().SAI1 };
let mut sai1_b_for_dma = unsafe { pac::Peripherals::steal().SAI1 };
let mut transfer_ch0: Transfer<_, _, PeripheralToMemory, _> = Transfer::init(
    streams.0,
    &mut sai1_a_for_dma,
    ...);
let mut transfer_ch1: Transfer<_, _, MemoryToPeripheral, _> = Transfer::init(
    streams.1,
    &mut sai1_b_for_dma,
    ...);
The usefulness of this construction is an argument that TargetAddress should be implemented directly on the PAC/HAL struct itself, rather than on a mutable reference to it. Then you could write:
let mut transfer_ch0: Transfer<_, _, PeripheralToMemory, _> = Transfer::init(
    streams.0,
    unsafe { pac::Peripherals::steal().SAI1 },
    ...); |
FWIW when I was working on SAI I was thinking the SAI would be handed off to the DMA. |
Thank you for the suggestions @richardeoin, they helped a lot. We've now got a first iteration of the There are still a few things which are not ideal, but hopefully this gets us a step closer. Notes:
|
Great! The example looks reasonable already, and it's very useful to have to play around with. Some quick responses to your notes:
|
Even though some debug probes might load it, we certainly shouldn't assume that in the examples.
The BDMA does not support all of the functionality in Stream. For now unsupported methods are just left as no-ops.
As well as the PAC structure. Also add the option to include an extra layer of indirection for the target register
Allow use of mem::transmute to elide the intermediate types
These are useful for DMA.
Without barrier instructions, the Cortex-M7 core can reorder the execution of transfers to normal and device memory with respect to each other. This does not correspond to the compiler_fence calls, which are concerned with generating the correct _program order_. The Cortex-M7 core does not reorder the transfers to device memory (the DMA configuration) with respect to each other.
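The distinction between the two kinds of fence can be sketched on a host as follows. This is only an illustration under stated assumptions: `fill_and_start` is a hypothetical function, the "enable register" is an ordinary variable written with a volatile store, and `core::sync::atomic::fence` stands in for whatever barrier intrinsic the HAL actually uses on the Cortex-M7.

```rust
use core::ptr;
use core::sync::atomic::{compiler_fence, fence, Ordering};

/// Fills `buffer`, then "starts" a transfer by writing a simulated enable register.
pub fn fill_and_start(buffer: &mut [u8], enable_reg: &mut u32) {
    // Writes to normal memory: the data the DMA is meant to read.
    for (i, byte) in buffer.iter_mut().enumerate() {
        *byte = i as u8;
    }

    // Program order only: stops the compiler from moving the buffer writes
    // below this point. It emits no instruction.
    compiler_fence(Ordering::SeqCst);

    // Hardware ordering: emits a barrier instruction on the target, so the
    // buffer writes are observable before the register write that follows.
    fence(Ordering::SeqCst);

    // Volatile write, standing in for setting the stream's enable bit
    // (a write to device memory on the real hardware).
    unsafe { ptr::write_volatile(enable_reg, 1) };
}

fn main() {
    let mut buffer = [0u8; 8];
    let mut enable_reg = 0u32;
    fill_and_start(&mut buffer, &mut enable_reg);
    println!("buffer = {buffer:?}, enable = {enable_reg}");
}
```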
The remaining core Stream trait should apply to MDMA streams also, although implementing this is future work
[DMA] Next transfer returns number of remaining data
…into dma This merge involved re-working much of #173
@ryan-summers Could you take a look at the merge above? I had to re-work much of #173 to merge #171, in the end it was quite simple but I hope it still works (On hardware I only checked that |
I gave it a review and did some hardware testing with the |
Move methods that we can't implement for the MDMA from the `Stream` trait to the `DoubleBufferedStream` trait (the MDMA is not double-buffered).
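A minimal sketch of the shape of that trait split, assuming hypothetical method names and stand-in stream types (`DmaStream`, `MdmaStream`); the real traits in the HAL have many more methods.

```rust
/// Operations every stream type can support.
pub trait Stream {
    fn enable(&mut self);
    fn is_enabled(&self) -> bool;
}

/// Operations that only make sense for streams with two memory buffers.
pub trait DoubleBufferedStream: Stream {
    fn set_double_buffer(&mut self, enabled: bool);
    fn is_double_buffered(&self) -> bool;
}

/// Stand-in for a DMA stream that supports double buffering.
#[derive(Default)]
pub struct DmaStream {
    enabled: bool,
    double_buffer: bool,
}

/// Stand-in for an MDMA stream, which is not double-buffered.
#[derive(Default)]
pub struct MdmaStream {
    enabled: bool,
}

impl Stream for DmaStream {
    fn enable(&mut self) {
        self.enabled = true;
    }
    fn is_enabled(&self) -> bool {
        self.enabled
    }
}

impl DoubleBufferedStream for DmaStream {
    fn set_double_buffer(&mut self, enabled: bool) {
        self.double_buffer = enabled;
    }
    fn is_double_buffered(&self) -> bool {
        self.double_buffer
    }
}

// Only the core trait is implemented; there are no no-op methods to stub out.
impl Stream for MdmaStream {
    fn enable(&mut self) {
        self.enabled = true;
    }
    fn is_enabled(&self) -> bool {
        self.enabled
    }
}

/// Code that needs double buffering states so in its bound.
pub fn start_double_buffered<S: DoubleBufferedStream>(stream: &mut S) {
    stream.set_double_buffer(true);
    stream.enable();
}

fn main() {
    let mut dma = DmaStream::default();
    start_double_buffered(&mut dma);

    let mut mdma = MdmaStream::default();
    mdma.enable();
    // `start_double_buffered(&mut mdma)` would fail to compile.
    println!("dma enabled: {}, mdma enabled: {}", dma.is_enabled(), mdma.is_enabled());
}
```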
The lack of support for constant source buffers was first raised by @ryan-summers here #153 (comment) This solution adds a 5th generic type parameter to `Transfer`, and uses this to implement transfer for two different type constraints on `BUF`. The additional generic type parameter will also be useful for MDMA support, which will also need its own versions of `init`, `next_transfer`, `next_transfer_with` and so on. The downside is the additional complexity of another type parameter
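The idea of selecting between two `BUF` constraints with an extra type parameter can be sketched as below. This is a reduced illustration, not the HAL's definition: `Transfer` here has only two parameters instead of five, and the marker types `ConstSource` and `MutableBuffer` as well as the methods `len`, `fill` and `free` are hypothetical.

```rust
use core::marker::PhantomData;

/// Marker: the buffer is only ever read (for example a table in flash).
pub struct ConstSource;
/// Marker: the buffer may be written.
pub struct MutableBuffer;

/// Reduced transfer type; `KIND` plays the role of the extra type parameter.
pub struct Transfer<BUF, KIND> {
    buf: BUF,
    _kind: PhantomData<KIND>,
}

// Each marker gets its own `init` with its own bound on `BUF`. The two
// inherent impls do not conflict because they apply to different types.
impl<BUF: AsRef<[u8]>> Transfer<BUF, ConstSource> {
    pub fn init(buf: BUF) -> Self {
        Transfer { buf, _kind: PhantomData }
    }
    pub fn len(&self) -> usize {
        self.buf.as_ref().len()
    }
}

impl<BUF: AsMut<[u8]>> Transfer<BUF, MutableBuffer> {
    pub fn init(buf: BUF) -> Self {
        Transfer { buf, _kind: PhantomData }
    }
    /// Stands in for the DMA writing into the buffer.
    pub fn fill(&mut self, value: u8) {
        for byte in self.buf.as_mut().iter_mut() {
            *byte = value;
        }
    }
    pub fn free(self) -> BUF {
        self.buf
    }
}

/// A constant source buffer; a shared reference to it is enough.
pub static TABLE: [u8; 4] = [1, 2, 3, 4];

fn main() {
    let tx = Transfer::<_, ConstSource>::init(&TABLE);
    let mut rx = Transfer::<_, MutableBuffer>::init([0u8; 4]);
    rx.fill(0xAA);
    println!("source length {}, received {:?}", tx.len(), rx.free());
}
```

The cost noted in the commit message shows up at the call site, where the caller has to name or infer one more type parameter.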
Implement DMA for constant source buffers
Breaking change: Added additional generic type on |
Updating smoltcp on DMA branch
Reopened after it was closed by an errant bors |
LGTM! We can always implement future changes if we want to make some things cleaner (like supporting full duplex DMA), but this is a wonderful start
bors r+ |
DMA implementation based on the embedded-dma traits and the existing implementation in the stm32f4xx-hal.
Currently some parts are very similar: the Stream trait and its implementation are similar, and much of dma/mod.rs is similar. However some changes are required since we'd like to support the BDMA and MDMA also.
Implementation / examples for:
TODO:
Memory barriers for the Cortex-M7
Contributions to this branch are very welcome, especially implementations and examples for other peripherals.
Ref #80