Support for *flat mapped* coherent secure RAM #1533

Closed
wants to merge 5 commits into from

etienne-lms
Contributor

This change enables secure coherent memory support and allows platform vexpress-qemu_virt to easily define a coherent memory at the top of the secure RAM.

* Start coherent RAM with a platform specific structure.
* Generic data to be place in coherent RAM can use attribute "coherent_ram".
*/
#define DECLARE_COHERENT_RAM_SECTION \
Contributor

Why do we need a macro for this?

Contributor Author

To add some flexibility regarding the location of the coherent RAM, in case one wants it at a dedicated place. I proposed two possible locations: before the load address or after the core reserved vaddr range (virtual, not physical only, to fit with pager constraints). Since there were those 2 places, I preferred to use a single macro. Do you feel it brings complexity?

Contributor

Aha, I didn't notice that. This is good then.
The \ at the end of each line seems to have ended up at column 81 which obviously is beyond 80.

Contributor Author

Yep, I've noticed that from the Travis feedback. Bad luck.


#if defined(CFG_TEE_COHERENT_START)
ASSERT((__coherent_start & (4096 - 1)) == 0, "Coherent RAM start is not 4Kbyte aligned")
ASSERT((__coherent_end & (4096 - 1)) == 0, "Coherent RAM end is not 4Kbyte aligned")
Contributor

Testing that . = ALIGN(4096); works?

Contributor Author

These are extra checks. Actually __coherent_end is already aligned, and static mapping will assert that CFG_TEE_COHERENT_START (hence __coherent_start) is already 4kB page aligned.
Those checks aim at detecting issues at build time.

Contributor Author

Note: I will remove those extra checks.

Contributor

jenswi-linaro, May 16, 2017

The assert that checks that . = ALIGN(4096); works is just silly, and other asserts could be done inside the DECLARE_COHERENT_RAM_SECTION macro instead.

Contributor Author

etienne-lms, May 16, 2017

Adding several ASSERT() inside the macro DECLARE_COHERENT_RAM_SECTION, combined with the 80 chars max line size, makes the macro not very easy to read. I will move all assertions to the end of the file. Nevertheless, feel free to comment back...
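
For readers less familiar with the mask idiom in the ASSERT() lines above, here is a minimal C sketch of the same test, together with the round-up that `. = ALIGN(4096);` performs. The names SMALL_PAGE_SIZE, is_page_aligned() and page_align_up() are illustrative assumptions, not identifiers taken from this p-r.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Page size assumed by the ASSERT() lines under review (hypothetical name). */
#define SMALL_PAGE_SIZE 4096u

/* True when addr is a multiple of SMALL_PAGE_SIZE (which is a power of two). */
static bool is_page_aligned(uintptr_t addr)
{
	return (addr & (SMALL_PAGE_SIZE - 1)) == 0;
}

/* Round addr up to the next page boundary, like `. = ALIGN(4096);`. */
static uintptr_t page_align_up(uintptr_t addr)
{
	/* Guard against wrap-around near the top of the address space. */
	assert(addr <= UINTPTR_MAX - (SMALL_PAGE_SIZE - 1));
	return (addr + SMALL_PAGE_SIZE - 1) & ~(uintptr_t)(SMALL_PAGE_SIZE - 1);
}
```

An address produced by page_align_up() always passes is_page_aligned(), which is why an assert placed right after the ALIGN directive only re-tests the linker.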

* the secure RAM.
*/
#ifndef CFG_TEE_COHERENT_SIZE
#define CFG_TEE_COHERENT_SIZE 0
Contributor

Why is this needed?

Contributor Author

Default: no coherent RAM => null sized.

Contributor

May I suggest CFG_TEE_COHERENT_SIZE ?= 0 in core/arch/arm/arm.mk instead?

Contributor Author

Good idea. Thanks.

@@ -54,6 +54,7 @@
#define __rodata_unpaged __section(".rodata.__unpaged")
#define __early_bss __section(".early_bss")
#define __noprof __attribute__((no_instrument_function))
#define __coherent __attribute__((coherent_ram))
Contributor

My google karma is failing me, do you have a link to that attribute?

Contributor Author

hmmm, will fix to __section(".coherent_ram"). Thanks.
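
A minimal sketch of what the corrected definition could look like, assuming GCC or Clang on an ELF target. The __section() helper mirrors the one used by the neighbouring defines in the hunk; mailbox_flag, mailbox_post() and mailbox_peek() are hypothetical names used only to show a datum landing in the section.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as the __section() helper used by the neighbouring defines. */
#ifndef __section
#define __section(x) __attribute__((section(x)))
#endif

/* Hypothetical corrected define: a section attribute, not "coherent_ram". */
#define __coherent __section(".coherent_ram")

/* Hypothetical datum placed in .coherent_ram; 0 means "no message". */
static uint32_t mailbox_flag __coherent;

static void mailbox_post(uint32_t msg)
{
	assert(msg != 0);
	mailbox_flag = msg;
}

static uint32_t mailbox_peek(void)
{
	return mailbox_flag;
}
```

The linker script then has to collect the input section under the same dotted name, for instance with KEEP(*(.coherent_ram)) inside the output section.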

. = CFG_TEE_COHERENT_START; \
.coherent (NOLOAD) : { \
__coherent_start = .; \
KEEP(*(coherent_ram)); \
Contributor

All the other sections start with a "."; can't we have the same here?

Contributor Author

My fault. I'll fix (see also comments on attribute __coherent_ram).

@@ -53,9 +53,27 @@
OUTPUT_FORMAT(CFG_KERN_LINKER_FORMAT)
OUTPUT_ARCH(CFG_KERN_LINKER_ARCH)

/*
* Start coherent RAM with a platform specific structure.
Contributor Author

Oops, this comment is related to the libpsci integration. I will remove it.


#if CFG_TEE_COHERENT_SIZE
ASSERT(!(__coherent_start & (4096 - 1)),
"Coherent memory start alignment");
ASSERT(!(CFG_TEE_COHERENT_SIZE & (4096 - 1)),
Contributor

This and the following assert are useless, they are just checking that . = ALIGN(4096) works.

Contributor Author

The test above explicitly notifies the developer that the coherent size follows the mapping constraint. Otherwise, this will be detected only at runtime. I found this explicit trace more convenient, but I agree it is redundant with the runtime assertions/panics.

The test below does not check the alignment but the size. In case one defines CFG_TEE_COHERENT_SIZE to a value below the effective coherent section size computed at link stage, the coherent section might be only partially mapped, and only a runtime data abort could detect it.
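
The two conditions described in this reply can be sketched in C as follows. This is only an illustration of the checks, not the linker-script ASSERT() itself; coherent_size_is_valid() and its parameters are hypothetical names, with start and end standing for __coherent_start and __coherent_end, and cfg_size for CFG_TEE_COHERENT_SIZE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SMALL_PAGE_SIZE 4096u

/*
 * Hypothetical helper: true when the linked coherent section [start, end)
 * is fully covered by a small-page mapping of cfg_size bytes at start.
 */
static bool coherent_size_is_valid(uintptr_t start, uintptr_t end,
				   size_t cfg_size)
{
	assert(start <= end);

	/* The configured size must follow the small-page mapping constraint. */
	if (cfg_size & (SMALL_PAGE_SIZE - 1))
		return false;

	/* A section larger than cfg_size would be only partially mapped. */
	return (end - start) <= cfg_size;
}
```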

ASSERT((__coherent_start & (4096 - 1)) == 0, "Coherent RAM start is not 4Kbyte aligned")
ASSERT((__coherent_end & (4096 - 1)) == 0, "Coherent RAM end is not 4Kbyte aligned")
#if CFG_TEE_COHERENT_SIZE
ASSERT(!(__coherent_start & (4096 - 1)),
Contributor

I don't see the problem with wrapping this inside the DECLARE_COHERENT_RAM_SECTION macros as:

                ASSERT(!(__coherent_start & (4096 - 1)),    \
                       "Coherent memory start alignment");     \

Contributor Author

An assertion in the macro could be ok, but the 3 make the macro a bit ugly (in my opinion).
But ok, I will move them (at least those really required) inside the macro.

@@ -222,7 +222,6 @@
#define GICD_OFFSET 0
#define GICC_OFFSET 0x10000


Contributor Author

minor: useless change, to be discarded.

@etienne-lms
Contributor Author

Since the overall changes are quite small, I will rebase and squash the proposed commit to get a (hopefully) nice travis status and ease later reviews.

@etienne-lms
Contributor Author

Thanks Travis! I missed the generic cases: must fix platforms not defining any CFG_TEE_COHERENT_START.
I also found I forgot to push a change and left some crappy stuff on the default location of the coherent memory. Fixes on-going...

@etienne-lms
Contributor Author

I will put this change on hold.
My initial implementation was based on #1459 and I see now that it would imply too many conflicting changes to try to push it aside from this reference PR. Hence, I'll wait for #1459 to mature and be merged before pushing back support for coherent memory.

@etienne-lms
Contributor Author

etienne-lms commented May 19, 2017

Rebased and adapted (on top of recently merged #1459).
Updated content: coherent memory is now always located before the core load address.

(edited) another info on updated content: coherent memory must be inside the physical TEE_RAM. This eases the flat mapping layout.

@etienne-lms
Contributor Author

rebased.

@jforissier
Contributor

Looks like this needs some rework?

@etienne-lms
Contributor Author

Yes, this deserves a rebase. Latest optee_os already defines coherent memory to map rwx RAM.
This p-r aimed at defining coherent mem (at least rw) and providing generic linking and mapping of the coherent memory. The current p-r locates coherent RAM inside the tee_ram flat map constraint, for simplicity. Obviously #1729 locates coherent RAM outside.

What needs to be rebased:

  • linking: declare a single coherent RAM in kern.lds.S (which can be above or below the flat mapped tee_ram) and define a .coherent_ram section attribute to allow linking code with the section.
  • mapping: register_phys_mem(COHERENT_MEM, ...); should get the coherent mem to be mapped. This implies changes in init_mem_map()#L845 to build a core vmem layout around coherent mem, CFG_TEE_RAM_VA_SIZE mapped and userland constraints.

@etienne-lms
Contributor Author

Rebased.

  • init_mem_map() sequence that defines the virtual memory layout is modified.
  • register_phys_mem(TEE_MEM_COHERENT) maps the coherent memory.
  • Defining CFG_TEE_COHERENT_START/_SIZE gets a __coherent section linked.
    (used by libpsci)

* To speed up scans of the memory_map table, it is sorted, moving
* all small-page mapped area before the pgdir mapped areas.
*/
qsort(memory_map, last, sizeof(struct tee_mmap_region),
Contributor

I doubt this makes much difference: average performance of quicksort is O(n * log(n)) (worst case O(n * n)), while doing without sorting here is worst case O(2 * n) or O(3 * n).

I'd categorize this as a doubtful micro-optimization.

Contributor Author

I see your point. I will remove this sorting.
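
The alternative the reviewer alludes to, walking the table a couple of times instead of sorting it, could look like the sketch below. It is an assumption about what "doing without sorting" means here; struct region is a hypothetical, reduced stand-in for struct tee_mmap_region and two_pass_order() is not a function from this p-r.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for struct tee_mmap_region. */
struct region {
	bool small_page;	/* true: small-page mapped, false: pgdir mapped */
};

/*
 * Fill order[] with the indices of map[], small-page areas first and
 * pgdir areas second, using two linear passes and no sort.
 * Returns the number of indices written (always n).
 */
static size_t two_pass_order(const struct region *map, size_t n,
			     size_t *order)
{
	size_t out = 0;

	assert(map && order);

	for (size_t i = 0; i < n; i++)
		if (map[i].small_page)
			order[out++] = i;

	for (size_t i = 0; i < n; i++)
		if (!map[i].small_page)
			order[out++] = i;

	return out;
}
```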


#ifdef CFG_WITH_LPAE
/* LPAE: 1 pgdir for protection, user memory located above the core */
va = CORE_MMU_PGDIR_SIZE;
Contributor

So we start to map the rest at 2 MiB regardless of where flat mapped range is (it could even conflict with this)?

More efficient use of level 2 tables would be to keep in same level 2 table as used above, it's 1 GiB so there should be plenty of space.

Contributor Author

> So we start to map the rest at 2 MiB regardless of where flat mapped range is (it could even conflict with this)?

The loop right after this init skips the virtual range already used when assigning virtual locations. This virtual area is expected to fully cover the current value of vaspace_start/vaspace_size.

> More efficient use of level 2 tables would be to keep in same level 2 table as used above, it's 1 GiB so there should be plenty of space.

Agreed. I will use the full 1GB around the flat map area to reuse the level 2 xlat table.
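
A rough sketch of the skipping behaviour described in this reply: virtual addresses are handed out in increasing order, and any candidate that would overlap the already assigned window is pushed past it. The struct and function names are hypothetical, and the real loop in init_mem_map() deals with more constraints than this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region descriptor; only the fields needed for the sketch. */
struct region {
	uintptr_t va;
	size_t size;
};

/*
 * Assign increasing virtual addresses starting at va, jumping over the
 * already assigned window [res_start, res_start + res_size).
 * Returns the first free address after the last region.
 */
static uintptr_t assign_va_skipping(struct region *map, size_t n,
				    uintptr_t va, uintptr_t res_start,
				    size_t res_size)
{
	uintptr_t res_end = res_start + res_size;

	assert(res_end >= res_start);	/* window must not wrap around */

	for (size_t i = 0; i < n; i++) {
		/* Would [va, va + size) overlap the reserved window? */
		if (va < res_end && va + map[i].size > res_start)
			va = res_end;
		map[i].va = va;
		va += map[i].size;
	}
	return va;
}
```

With three 2 MiB regions, a start at 2 MiB and a reserved window at [4 MiB, 6 MiB), the regions land at 2 MiB, 6 MiB and 8 MiB.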

Contributor Author

etienne-lms left a comment

Comment addressed

  • Change implementation comments (simplify)
  • Found/fix issue (see commit "[fix] ...").
  • Cleanup: remove unused functions

#ifdef CFG_TEE_COHERENT_START
register_phys_mem(MEM_AREA_TEE_COHERENT, CFG_TEE_COHERENT_START,
CFG_TEE_COHERENT_SIZE);
#endif
Contributor Author

This should not be in commit "core: map registered coherent memory".
It should be in "core: linkable coherent memory section from CFG_TEE_COHERENT_START" which introduces CFG_TEE_COHERENT_START/_SIZE.

init_smallpage_map(memory_map, vstart, coherent_start, false);
vstart = coherent_start + coherent_size;
}

Contributor Author

Looks like this one and the one below could be merged...

@etienne-lms
Contributor Author

Rebased (conflict to fix).
Tests in progress. qemu_virt ok, qemu_armv8 in progress... (in case ARMv8 would need coherent map support)

@jenswi-linaro
Contributor

Please squash the commits as you'd like to have them merged and I'll take a final look.

@etienne-lms
Contributor Author

CFG_TEE_COHERENT_START/_SIZE may not be a good label candidate.

  • Any platform can get a coherent memory nicely mapped, using register_phys_mem(MEM_AREA_TEE_COHERENT, ..., ...).
  • Defining CFG_TEE_COHERENT_START/_SIZE makes the area a linked section inside the core implementation (allows static allocations inside coherent memory).

Thus we may rather call this CFG_LINKED_COHERENT_START/_SIZE.

@jenswi-linaro
Contributor

I think the define is good as it is.
Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org>

@etienne-lms
Contributor Author

@MrVan, this p-r proposes a generic mapping support for coherent RAM.
From the about-to-be-merged #1729, your coherent memory does not need flat mapping support. My change may conflict with your use of coherent memory.

@MrVan
Contributor

MrVan commented Sep 18, 2017

@etienne-lms I did not follow this p-r. I'll give a check and test with #1729.

@etienne-lms
Contributor Author

Thanks

@MrVan
Contributor

MrVan commented Sep 19, 2017

@etienne-lms
It failed to boot correctly on i.mx7d-sdb

DEBUG:   [0x0] TEE-CORE:add_phys_mem:532: CFG_SHMEM_START type NSEC_SHM 0xbfe00000 size 0x00200000
DEBUG:   [0x0] TEE-CORE:add_phys_mem:532: CFG_TA_RAM_START type TA_RAM 0xbe100000 size 0x01d00000
DEBUG:   [0x0] TEE-CORE:add_phys_mem:532: VCORE_UNPG_RW_PA type TEE_RAM_RW 0xbe03d000 size 0x000c3000
DEBUG:   [0x0] TEE-CORE:add_phys_mem:532: VCORE_UNPG_RX_PA type TEE_RAM_RX 0xbe000000 size 0x0003d000
DEBUG:   [0x0] TEE-CORE:add_phys_mem:532: ROUNDDOWN(IRAM_S_BASE, CORE_MMU_DEVICE_SIZE) type TEE_COHERENT 0x00100000 size 0x00100000
DEBUG:   [0x0] TEE-CORE:add_phys_mem:532: ROUNDDOWN(IRAM_BASE, CORE_MMU_DEVICE_SIZE) type TEE_COHERENT 0x00900000 size 0x00100000
DEBUG:   [0x0] TEE-CORE:add_phys_mem:532: AIPS3_BASE type IO_SEC 0x30800000 size 0x00400000
DEBUG:   [0x0] TEE-CORE:add_phys_mem:532: AIPS2_BASE type IO_SEC 0x30400000 size 0x00400000
DEBUG:   [0x0] TEE-CORE:add_phys_mem:532: AIPS1_BASE type IO_SEC 0x30000000 size 0x00400000
DEBUG:   [0x0] TEE-CORE:add_phys_mem:532: ANATOP_BASE type IO_SEC 0x30300000 size 0x00200000
DEBUG:   [0x0] TEE-CORE:add_phys_mem:545: Physical mem map overlaps 0x30300000
DEBUG:   [0x0] TEE-CORE:add_phys_mem:532: GIC_BASE type IO_SEC 0x31000000 size 0x00100000
DEBUG:   [0x0] TEE-CORE:add_phys_mem:532: CONSOLE_UART_BASE type IO_NSEC 0x30800000 size 0x00200000
DEBUG:   [0x0] TEE-CORE:verify_special_mem_areas:470: No NSEC DDR memory area defined
DEBUG:   [0x0] TEE-CORE:add_va_space:571: type RES_VASPACE size 0x00a00000
DEBUG:   [0x0] TEE-CORE:add_va_space:571: type SHM_VASPACE size 0x02000000
ERROR:   [0x0] TEE-CORE: assertion '!coherent_start' failed at core/arch/arm/mm/core_mmu.c:858 <init_mem_map>
ERROR:   [0x0] TEE-CORE: Panic at core/kernel/assert.c:50 <_assert_break>
ERROR:   [0x0] TEE-CORE: Call stack:
ERROR:   [0x0] TEE-CORE:  0xbe004ec5
ERROR:   [0x0] TEE-CORE:  0xbe00a907
ERROR:   [0x0] TEE-CORE:  0xbe009eab
ERROR:   [0x0] TEE-CORE:  0xbe0055d1
ERROR:   [0x0] TEE-CORE:  0xbe0000c4

@etienne-lms
Contributor Author

Thanks for the test. Indeed, this p-r expects only 1 coherent memory. This is because the 'coherent' memory handled by this p-r is 'flat-mapped coherent memory' and each one adds a constraint on the virtual mapping layout (discontinuous TEE_RAM and coherent memory location), hence only 1 coherent mem is supported.

I think this p-r should not use COHERENT labelling, rather something like FLATMAP_COHERENT.

Or maybe: we can rename the current COHERENT into UNCACHED_RWX; platforms needing RAM mapped uncached can use the IO_RAM type; and we define COHERENT as flat mapped uncached read/write memory.
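
To illustrate the "only 1 coherent memory" constraint behind the failed assertion in the log above, here is a hedged sketch of a scan that accepts zero or one flat mapped coherent area and rejects a second one. The enum, the struct and find_single_coherent() are hypothetical simplifications, not code from core_mmu.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced memory area types and map entry. */
enum area_type { AREA_IO_SEC, AREA_TEE_RAM, AREA_COHERENT_FLAT };

struct region {
	enum area_type type;
	uintptr_t pa;
	size_t size;
};

/*
 * Look for the flat mapped coherent area. Returns false as soon as a
 * second one is met; *found stays NULL when none is registered.
 */
static bool find_single_coherent(const struct region *map, size_t n,
				 const struct region **found)
{
	assert(found);
	*found = NULL;

	for (size_t i = 0; i < n; i++) {
		if (map[i].type != AREA_COHERENT_FLAT)
			continue;
		if (*found)
			return false;	/* second area: layout cannot be built */
		*found = &map[i];
	}
	return true;
}
```

Fed with the two TEE_COHERENT entries of the i.mx7d-sdb log (0x00100000 and 0x00900000), such a scan reports failure, which corresponds to the case the p-r does not support.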

@jenswi-linaro
Contributor

+1 for FLATMAP_COHERENT

With this change, the small-page mapped areas are mapped around the
flat map areas to reuse the translation tables. If optee is memory
constrained and runs with the pager enabled, the small-page tables
used to map the flat map area will also be used for other small-page
mapped entries.

With this change, the virtual memory layout is designed in 4 steps:
- define virtual location of flat mapped areas.
- define virtual location of small-page mapped areas around flat
  mapped areas (likely to already allocate xlat tables).
- define virtual location of pgdir mapped areas from bottom range
  to top range, skipping the already assigned virtual locations.

List of the constraints on the virtual memory layout:
- optee_os includes some flat mapped sections.
- always leave the first pgdir entry unmapped for protection.
- LPAE mappings will locate userland virtual memory above core
  virtual memory.
- Non-LPAE mappings must locate userland virtual memory below core
  virtual memory (ttbr0/ttbr1 referencing).

This change prepares mapping support of a coherent memory which will
add new constraints of the virtual memory layout.

Function init_smallpage_map() is used to assign virtual location
for small page regions inside a given virtual address range. This
clarifies sequence building the virtual mapping layout.

Signed-off-by: Etienne Carriere <etienne.carriere@linaro.org>

Physical memory areas registered with register_phys_mem() with the
type attribute MEM_AREA_TEE_COHERENT_FLAT are mapped by the generic
code with the virtual address set to the physical address value.

Generic code expects at most one such coherent flat mapped memory area.

Signed-off-by: Etienne Carriere <etienne.carriere@linaro.org>

A platform can define COHERENT_FLATMAP_BASE/SIZE to request that
generic code define a flat mapped coherent memory area that gets linked
in optee_os core. One may use the __coherent attribute to locate
data inside the coherent memory.

CFG_TEE_COHERENT_START/_SIZE can be located before or after the optee_os
reserved range that covers flat mapped areas and pager vaspace.

The coherent memory defined by CFG_TEE_COHERENT_START/_SIZE is
automatically added to the core static mapping directive.

Update plat-sunxi/kernel.ld.S accordingly.

Signed-off-by: Etienne Carriere <etienne.carriere@linaro.org>

Fix string size to match "COHERENT_FLAT" and "PAGER_VASPACE".

Signed-off-by: Etienne Carriere <etienne.carriere@linaro.org>
Signed-off-by: Etienne Carriere <etienne.carriere@linaro.org>
@etienne-lms changed the title from "Support for coherent secure RAM" to "Support for *flat mapped* coherent secure RAM" on Jan 15, 2018
@etienne-lms
Contributor Author

Closing deprecated p-r.

@etienne-lms etienne-lms deleted the coherent branch December 15, 2018 15:41
4 participants