Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] Make SOF with Zephyr run on i.MX93's A55 core with Jailhouse #7192

Closed
LaurentiuM1234 opened this issue Mar 1, 2023 · 22 comments
Closed
Labels
enhancement New feature or request
Milestone

Comments

@LaurentiuM1234
Copy link
Contributor

LaurentiuM1234 commented Mar 1, 2023

PURPOSE
I've decided to open this issue in order to track the current progress and interact with the community. Our plan is to make SOF with Zephyr run on i.MX93's cortex-A55 core with the help of Jailhouse. Jailhouse will be used to split the hardware resources between Linux and Zephyr which will be running independently on different cores.

Feedback and suggestions would be very much appreciated.

ISSUES

  1. Currently, using scripts/xtensa-build-zephyr, I have managed to obtain the zephyr.elf. The script fails when trying to run smex in order to build the .ldc file. My current understanding is that smex isn't able to properly parse the zephyr.elf file because of the fact that there's currently no support for parsing ELF-64 files. My question regarding this would be: would it be a good idea to add support in smex for parsing such files? Also, would it be necessary since, for now, my plan for now is to use Zephyr's logging system (no trace, the logs will be printed via the UART)?
  2. Regarding zephyr.elf's sections: are there any restrictions imposed by SOF (e.g: alignment, order)? For now, I was able to get away by simply adding the .static_uuids, .static_log and .trace_ctx to the already present sections added for ARM64 platforms on Zephyr.
  3. Would going through smex and the signing process still be required if the plan is to load the firmware image through Jailhouse? Please correct me if I'm wrong but my current understanding is that the .ri file resulting from the SOF build process contains some sort of header which is "stripped" during the load process in the linux kernel. Because of this, since Jailhouse loads the firmware image as-is, the A55 core would probably not be able to work properly. Would it be acceptable to bypass the .ri file creation process and simply utilize the generated zephyr.bin? (which can be generated by enabling CONFIG_BUILD_OUTPUT_BIN)

RELATED PRs
TODO

CC: @dbaluta @iuliana-prodan @paulstelian97 @lgirdwood

@LaurentiuM1234 LaurentiuM1234 added the enhancement New feature or request label Mar 1, 2023
@lgirdwood
Copy link
Member

@LaurentiuM1234 what is jailhouse ? any links ?
Adding @nashif @mwasko @andyross

@lgirdwood lgirdwood added this to the TBD milestone Mar 2, 2023
@LaurentiuM1234
Copy link
Contributor Author

@lgirdwood Sure, for a more detailed overview please see 1 and 2. For a more general overview please see 3. Also, please see 4 for github repo.

TL; DR: Jailhouse is a hypervisor used to partition the system's resources. In our case, we'll be using Jailhouse to split the A55 cores between Linux and Zephyr, meaning they will run independently on different cores.

@lgirdwood
Copy link
Member

@LaurentiuM1234 thanks, had a read through of the jailhouse docs. Looks like a great feature. Re questions above.

  1. Re the Zephyr logging / dictionary support - I think this is mostly done now, although there could be minor work still WIP. Best to check with @kv2019i and @marc-hb
  2. There is some WIP to update the Zephyr sections, but I dont think we have any ordering restrictions in general (maybe an xtensa thing for literals only). @andyross would know.
  3. No need to use smex/rimage, I think you could just use teh default Zephyr Jailhouse methods for image building and running the code. i.e. use elf or bin files if that is what jailhouse needs.

@kv2019i
Copy link
Collaborator

kv2019i commented Mar 8, 2023

@LaurentiuM1234 Ack, no need to use smex/rimage. Zephyr dictionary support can be used and we have tested it to work. Only reason why we still don't have dictionary enabled by default for Intel targets is that we have many platforms stuck to old compiler versions and Zephyr dictionary require a somewhat modern compiler. We just switched one Intel platform to clang toolchain and that will allow us to us dictionaries. With that, no need for smex anymore.

@andyross
Copy link
Contributor

andyross commented Mar 8, 2023

Regarding linker behavior: that gets pretty hairy internal to zephyr, but for the most part apps don't need to care. SOF has a few special sections (e.g. rimage manifest data) that you'll need to implement in the Zephyr SOC layer (probably just clone from the intel_adsp linker scripts, though it's possible your requirements are mild if you aren't using rimage itself). Definitely ping me if you have trouble.

@marc-hb
Copy link
Collaborator

marc-hb commented Mar 9, 2023

rimage should be removed from xtensa-build-zephyr.py in the next couple months, please look at:

It should soon be part of west build instead for intel_adsp. For i.MX it will be all up to you!

@marc-hb
Copy link
Collaborator

marc-hb commented Mar 12, 2023

@LaurentiuM1234 Ack, no need to use smex/rimage.

rimage is definitely used by i.MX right now, thanks @dbaluta for explaining that to me. rimage is not just signing but also "stitching"

For smex, you could easily add some new SMEX_BUILD = False property after #7255 is merged. Or you could just add ELF64 support to smex, the smex code is surprisingly small and it looks like it has some ELF64 headers already.

EDIT: C re-implementation of elfutils submitted in

@marc-hb
Copy link
Collaborator

marc-hb commented Mar 12, 2023

Would it be acceptable to bypass the .ri file creation process and simply utilize the generated zephyr.bin? (which can be generated by enabling CONFIG_BUILD_OUTPUT_BIN)

You don't need to disable rimage to use zephyr.bin.

If you need neither rimage nor smex you probably don't want to use xtensa-build-zephyr.py at all. Look at the west commands invoked by xtensa-build-zephyr.py for "inspiration", then invoke west directly.

Everything is possible :-)

@LaurentiuM1234
Copy link
Contributor Author

Thank you very much for all the useful comments!

Would it be acceptable to bypass the .ri file creation process and simply utilize the generated zephyr.bin? (which can be generated by enabling CONFIG_BUILD_OUTPUT_BIN)

You don't need to disable rimage to use zephyr.bin.

If you need neither rimage nor smex you probably don't want to use xtensa-build-zephyr.py at all. Look at the west commands invoked by xtensa-build-zephyr.py for "inspiration", then invoke west directly.

Everything is possible :-)

Hm, for now I've used xtensa-build-zephyr.py for the sake of convenience since it also takes care of cloning the Zephyr repo. Although smex and rimage are not required for i.MX93 with Jailhouse (at least as far as I'm aware right now) I think we can still use xtensa-build-zephyr.py with something like this:

For smex, you could easily add some new SMEX_BUILD = False property after #7255 is merged. Or you could just add ELF64 support to smex, the smex code is surprisingly small and it looks like it has some ELF64 headers already.

and maybe RIMAGE_BUILD = False just to be somewhat consistent with the other platforms and have the convenience of xtensa-build-zephyr.py also cloning Zephyr for us (although this can be obtained by calling the script with the -u argument and then manually doing the west commands as @marc-hb suggested).

An alternative to this would be to have something like sof-build-jailhouse.py that would create the zephyr.bin and also do the Jailhouse setup required (in case there's more people interested in adding support for their platforms to run SOF with Jailhouse). But at this point this is just an idea which I haven't tested yet so we'll have to see if it's worth introducing such a script.

PROGRESS UPDATE

On Zephyr and SOF sides of things everything seems to be working fine, managed to get some logs on the console. Currently, I'm working on the Linux side of things. Hopefully, will soon have something decent enough to submit a draft PR.

@marc-hb
Copy link
Collaborator

marc-hb commented Mar 12, 2023

Hm, for now I've used xtensa-build-zephyr.py for the sake of convenience since it also takes care of cloning the Zephyr repo.

Cloning is just two west commands: west init and west update, look at the xtensa-build-zephyr.py logs. If that's too inconvenient then west is not doing its number 1 job!?

Don't get me wrong: xtensa-build-zephyr.py does a number of things that west does not and it has been very useful to tame the typical configuration divergence between developers and CIs. But at the end of the day it's still a wrapper on top of wrapper (west) on top of a wrapper (cmake) on top of wrapper (ninja) on top of the compiler = best avoided if you don't need most of it. In the middle/long term we want to reduce the roles of xtensa-build-zephyr.py, not increase them or its complexity.

Just for fun, look at one-line commit 9a23191 and see how many layers -Werror goes through. It's literally insane.

Current xtensa-build-zephyr.py jobs:

I think SMEX_BUILD = False and RIMAGE_BUILD = False options would be small enough changes hence OK but it would be a pity to add this extra code and small step towards combinatorial explosion if all you use the script for is west cloning...

Less is more!

Also like this one: "A little copying (= your own, tiny and simple west script) is better than a little dependency (using only 10% of the much bigger xtensa-build-zephyr.py script)"

@LaurentiuM1234
Copy link
Contributor Author

LaurentiuM1234 commented Apr 10, 2023

STATUS UPDATE
Created PR that introduces support for running SOF with Zephyr and Jailhouse on i.MX93's A55 core. The SOF PR can be found here: #7425. The Zephyr PR can be found here: zephyrproject-rtos/zephyr#56713

ARCHITECTURE OVERVIEW
The image below describes the general architecture of our solution. All hardware resources relevant to SOF have been included.

Screenshot from 2023-04-10 16-00-17

Note: all the hardware resources in the picture are either managed by the root or the inmate (also applies to the cores). For more information regarding Jailhouse and how it operates please see 1, 2, 3. Github repo can be found at 4.

HOW DOES IT WORK?
The following diagrams provided a step-by-step overview of the solution starting from Linux booting and ending with Linux platform driver communicating with the FW:

Step 1
Linux is booted normally and user space becomes available. At this point, Jailhouse driver is inserted using modprobe and the hypervisor is enabled through the jailhouse enable command given from the user space. All the management is done by the Jailhouse driver. The effect of the hypervisor enablement can be seen in the diagram from the right side.

Screenshot from 2023-04-10 16-27-29

Note: when the hypervisor is enabled, the root cell is also created. At this point all the hardware is still managed by Linux.

Step 2
Inmate is created. The Zephyr binary is loaded and started. Core 1 will start executing code from a given address. At this point, Linux loses control over the hardware resources assigned to the inmate (by this I mean Linux won't be able to access said resources as the hypervisor will restrict it's read/write access to the registers of said hardware). The effect of this step can be seen in the right figure.

Screenshot from 2023-04-10 16-32-58

Note: technically speaking, Linux doesn't actually lose control over the EDMA2, SAI3 and MUA since their nodes are disabled in the DTS but since the hypervisor will restrict its access to their registers we'll consider that Linux loses control over them.

Step 3
During this step, Linux will send a SOF_IPC_FW_READY to SOF. This sequence has to be initiated by the Linux platform driver because FW is up before the Linux platform driver due to how Jailhouse is designed (it needs user space to become available to enable the hypervisor). In this case, we'll consider that the platform driver is running in master mode as it initiates the IPC communication with the FW. The content of the reply, alongside the content of struct sof_ipc_fw_ready and struct sof_ipc_window is written to the hostbox. The Linux IPC reply handler will make sure to parse all of these. After this, the flow remains unaffected.

Screenshot from 2023-04-10 16-55-10

ISSUES

  1. Due to how the Jailhouse driver operates, the platform driver needs to be blacklisted as the FW can't possibly be up when the platform driver would normally be probed. This was solved by changing the initial protocol as described in Step 3.
  2. The DMA buffer passed from Linux to SOF in which the stream data is written has a physical address which belongs to the address space of the root. This means that the inmate will require access to root's address space (which is not safe at all). Also, we'll need to map the physical address provided by Linux as all addresses used by the FW will go through the MMU (for now, this was quickly solved by creating a mapping for the whole root address space)
  3. Described in [DNM]: drivers: imx: Move SAI MCLK enablement to sai_config() #7410 (review). Fixed by imx: sai: enable transmitter to output MCLK #7427
  4. Described in schedule: zephyr_dma_domain: Make thread priority higher #7367
  5. Buzzing noise for 0.x seconds in the beginning of a recorded audio.
  6. PM operations are currently not supported at all. They are quite problematic because of the following reasons:
    1) We can't cut off a core's clock.
    2) The resume operation seems to expect SOF to be restarted which is somewhat hard to do (can be solved by hypercalls but to me this will be a last resort as this approach seems kinda forced. Jailhouse doesn't export its symbols so this hypercall will have to be implemented)
  7. Warnings during build process such as:
    warning: format '%d' expects argument of type 'int', but argument 2 has type 'size_t' {aka 'long unsigned int'} [-Wformat=]
    and
    warning: taking address of packed member of 'struct coherent' may result in an unaligned pointer value [-Waddress-of-packed-member]

STATUS
With the Zephyr and Linux patches that I'm going to upstream as soon as I'm done cleaning up, I was able to test a simple playback, simple capture and simple playback with PAUSE/RESUME. All of them worked fine. Other cases have not been tested as I considered it to be too early.

TODO

  • Send Linux and Zephyr patches for review.
  • The PR which introduces support for i.MX93 in SOF will remain as a draft until said patches are sent. Mainly this PR will depend on the Zephyr PR to be able to build the binary.

@lgirdwood What do you think about adding the Jailhouse configurations for inmate/root in the SOF repository? They are kinda SOF specific so I doubt they'll be accepted in the main Jailhouse repository. Also, they might help people interested in running SOF on an arm core alongside Linux.

EDIT:

  1. Included 2 more issues.
  2. Added link to Zephyr PR.

@lgirdwood
Copy link
Member

@mmaka1 @kv2019i @plbossart fyi - good technical description here.

@marc-hb
Copy link
Collaborator

marc-hb commented Jun 2, 2023

Re the Zephyr logging / dictionary support - I think this is mostly done now, although there could be minor work still WIP. Best to check with @kv2019i and @marc-hb

Latest dictionary news from @kv2019i:

west sign should be removed from xtensa-build-zephyr.py in the next couple months, west sign should soon be part of west build instead.

This is now done and the corresponding Zephyr documentation updates have just been merged:

This makes it much easier to skip rimage entirely.

I'm curious whether you're still using xtensa-build-zephyr.py? Only for cloning/initialisation convenience or also for building? Again: not using this script at all would be perfectly fine; it should always stay optional.

there's currently no support for parsing ELF-64 files
[...]
Warnings during build process such as:
warning: format '%d' expects argument of type 'int', but argument 2 has type 'size_t' {aka 'long unsigned int'} [-Wformat=]
and
warning: taking address of packed member of 'struct coherent' may result in an unaligned pointer value [-Waddress-of-packed-member]

I just saw that the A55 has a ILP32 mode, I'm curious why you didn't try that first?4GB looks like a lot of RAM for an audio DSP :-)

Lack of hybrid 32/64 support in Jailhouse? Or because of the way you implemented IPC?

I'm aware 64 bits support is now old news in Zephyr but I doubt anyone else has been running SOF in LP64 mode yet... thanks for charting this new territory for the rest of us! :-)

For smex, you could easily add some new SMEX_BUILD = False property after #7255 is merged.

Actually, that was probably a bad idea because I guess some users may want SMEX and other users not for the same xtensa-build-zephyr.py configuration. So a new --smex=False CLI option would probably be better.

@LaurentiuM1234
Copy link
Contributor Author

LaurentiuM1234 commented Jun 6, 2023

I'm curious whether you're still using xtensa-build-zephyr.py? Only for cloning/initialisation convenience or also for building? Again: not using this script at all would be perfectly fine; it should always stay optional.

Currently, I'm using a modified version of xtensa-build-zephyr.py for cloning + building (by modified I mean it's basically the same thing with some lines of code commented out). Lately, I haven't really had the time to look more into the build process and decide on/find a proper way to build the binary :(.

I just saw that the A55 has a ILP32 mode, I'm curious why you didn't try that first?4GB looks like a lot of RAM for an audio DSP :-)

Lack of hybrid 32/64 support in Jailhouse? Or because of the way you implemented IPC?

Frankly, I wasn't aware of the existence of this ABI. Thanks for pointing it out ! Might be worth investigating it a bit for the sake of curiosity at least.

@marc-hb
Copy link
Collaborator

marc-hb commented Jun 6, 2023

Currently, I'm using a modified version of xtensa-build-zephyr.py for cloning + building (by modified I mean it's basically the same thing with some lines of code commented out).

Thanks for letting us know.

Lately, I haven't really had the time to look more into the build process and decide on/find a proper way to build the binary :(.

Understood; the perfect is the enemy of the good.

Frankly, I wasn't aware of the existence of this (IPL32) ABI.

So you're indirectly confirming that the SOF code base is mostly 64bits-clean besides the few warnings you mentioned above? That sounds really good! Can you share a very rough description of the kind of tests that are passing?

So there is some recipe that compiles in LP64 mode with the Zephyr SDK? If it does not require too many changes then we should definitely fix these warnings and add a Github Action for it to make sure the codebase stays 64bits clean.

@LaurentiuM1234
Copy link
Contributor Author

So you're indirectly confirming that the SOF code base is mostly 64bits-clean besides the few warnings you mentioned above?

Hm, I guess so. It seems to work fine apart from the build warnings.

Can you share a very rough description of the kind of tests that are passing?

Nothing too fancy. In fact, there's no formal testing atm as the support is still not fully upstream (the linux patches are still in review). So far, all I've done is build it and test some aplay/arecords + pause/resume during aplay.

So there is some recipe that compiles in LP64 mode with the Zephyr SDK?

Yup, seems like the case. Took a look at cmake/compiler/gcc/target_arm64.cmake and it seems like Zephyr is always compiled with the -mabi=lp64 option (in the case of gcc). This also seems to be the case for armclang.

If it does not require too many changes then we should definitely fix these warnings and add a Github Action for it to make sure the codebase stays 64bits clean.

I was planning to eventually have a look at the warnings and find a way to fix them. Also, I'm all up for adding a github action. It shouldn't require too much effort (hopefully!).

@marc-hb
Copy link
Collaborator

marc-hb commented Jun 8, 2023

I tried these but quickly realized there's a LOT missing in SOF for them:

west build -p --board mimx8mm_evk_a53 modules/audio/sof/app
west build -p --board mimx93_evk_a55  modules/audio/sof/app

What Zephyr --board have you been using?

@LaurentiuM1234
Copy link
Contributor Author

west build -p --board mimx8mm_evk_a53 modules/audio/sof/app

This shouldn't work as i.MX8MM with ARM64 is not supported in SOF.

west build -p --board mimx93_evk_a55 modules/audio/sof/app

The name of the board to use is mimx93_evk_a55_sof.

Btw, were these command issued from Zephyr? If so, the reason the build fails is because the sof module is not updated, hence it doesn't contain the i.MX93 support.

If the command was issued from SOF side, then please make sure to update west.yml to also fetch the NXP HAL. Currently, i.MX93 support depends on it.

If any of the above doesn't work would it be possible to share the build errors?

@dbaluta
Copy link
Collaborator

dbaluta commented Jun 8, 2023

If the command was issued from SOF side, then please make sure to update west.yml to also fetch the NXP HAL. Currently, i.MX93 support depends on it.

I wonder how do we conditionally pull in a HAL for a specific platform family? (e.g IMX)

Currently, this works in SOF source tree root:

--- a/west.yml
+++ b/west.yml
@@ -60,6 +60,7 @@ manifest:
           - mipi-sys-t
           - lz4
           - tinycrypt
+          - hal_nxp

But there must be another way - to avoid unnecessary pulling hal_nxp for non-nxp platforms.

marc-hb added a commit to marc-hb/sof that referenced this issue Jun 8, 2023
A ton of (valid!) warnings but this will make sure the code at least
compiles.

For more context see thesofproject#7192

Signed-off-by: Marc Herbert <marc.herbert@intel.com>
@marc-hb
Copy link
Collaborator

marc-hb commented Jun 8, 2023

... and add a Github Action for it to make sure the codebase stays 64bits clean.

Done:

If the command was issued from SOF side

I assumed you meant: "when using the sof/west.yml manifest".

The name of the board to use is mimx93_evk_a55_sof. [...] please make sure to update west.yml to also fetch the NXP HAL.

Thanks, that recipe finally worked out of the box! I tried a number of other configurations (from both manifest "sides") but none had worked so far.

I was planning to eventually have a look at the warnings and find a way to fix them.

Looks like warnings sometimes point at real, hard problems! At least one in this LP64 case: the old sof-logger is still hardcoded to four 32bits parameters. I don't see how to solve this.

But there must be another way - to avoid unnecessary pulling hal_nxp for non-nxp platforms.

It's very complicated actually and has been changing in west very recently:

In #7772 I kept it simple and I just added hal_nxp. It's not that big and the --fetch-opt=--filter=tree:0 git optimization makes it smaller and faster in CI. This seems to add about 20s in https://github.com/thesofproject/sof/actions/runs/5215791991, I think it's worth it.

Once the west dust settles down we can revisit and try to optimize.

marc-hb added a commit to marc-hb/sof that referenced this issue Jun 8, 2023
There's really no reason to fall back on a default "platform" and pick
up a bunch of other, totally random settings.

Now the following command fails immediately with a useful `platform not
defined` error message instead of stopping with a cryptic
`platform/trace/trace.h: No such file or directory` after successfully
compiling half the .c files in a Frankenstein CONFIG_uration.

```
  west build -p --board mimx8mm_evk_a53 modules/audio/sof/app
```

More context in thesofproject#7192.

Signed-off-by: Marc Herbert <marc.herbert@intel.com>
kv2019i pushed a commit that referenced this issue Jun 13, 2023
There's really no reason to fall back on a default "platform" and pick
up a bunch of other, totally random settings.

Now the following command fails immediately with a useful `platform not
defined` error message instead of stopping with a cryptic
`platform/trace/trace.h: No such file or directory` after successfully
compiling half the .c files in a Frankenstein CONFIG_uration.

```
  west build -p --board mimx8mm_evk_a53 modules/audio/sof/app
```

More context in #7192.

Signed-off-by: Marc Herbert <marc.herbert@intel.com>
kv2019i pushed a commit that referenced this issue Jun 14, 2023
A ton of (valid!) warnings but this will make sure the code at least
compiles.

For more context see #7192

Signed-off-by: Marc Herbert <marc.herbert@intel.com>
@lgirdwood
Copy link
Member

@LaurentiuM1234 looks like most / all PRs merged now ? if so we close ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

6 participants