Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cpu/stm32-common: slightly rework flashpage driver and fix iotlab-m3 #9068

Merged
merged 4 commits into from
May 30, 2018

Conversation

aabadie
Copy link
Contributor

@aabadie aabadie commented May 3, 2018

Contribution description

This PR is an attempt to fix the #9065 issue that was introduced by #8768. Regarding the differences, the calls to lock/unlock functions were moved in different places in #8768, wrapping each calls to erase and write_raw internal functions.
This change seems to have introduced timing issues on iotlab-m3 but not on the other boards I tested (nucleo-l073, l152 and f070).

Reverting this change and adding extra checks fixes the issue. One last thing it still not working (but it was already the case before #8768): flashpage_raw. I tested different things but none worked on iotlab-m3. I can disable this feature for STM32F1 family if requested.

Other CPUs (l0, l1, f0) are still working for both write and write_raw functions.

Issues/PRs references

Fixes #9065

@aabadie aabadie added Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) Platform: ARM Platform: This PR/issue effects ARM-based platforms Area: drivers Area: Device drivers CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Area: boards Area: Board ports labels May 3, 2018
@aabadie aabadie requested a review from cladmi May 3, 2018 15:53
@cladmi
Copy link
Contributor

cladmi commented May 3, 2018

The test I mention in the #9065 issue works now on both iotlab-m3 and iotlab-a8-m3.

I did not have time to review the code for the moment.

@aabadie aabadie force-pushed the pr/stm32-common/fix_flashpage_m3 branch from dd7c3cc to ed6791a Compare May 4, 2018 13:48
@kaspar030
Copy link
Contributor

@kYc0o can you take a look at this?

@kYc0o
Copy link
Contributor

kYc0o commented May 8, 2018

On it.

However I'd need quite a bit of time to understand all the changes. AFAIR flashpage worked integrally for this CPU, and with all the recent modifications ended in this state. Thus I need to compare all the version to catch what's missing.

@aabadie
Copy link
Contributor Author

aabadie commented May 8, 2018

AFAIR flashpage worked integrally for this CPU

Before #8768 flashpage_raw was not supported by stm32f1, only flashpage was. That's why the last commit of this PR removes it.
The idea of this PR is basically to restore the same behavior of the flashpage driver as it was before #8768 but for CPUs that support flashpage_raw (F0, L0, L1), I had to call lock/unlock functions from within the flashpage_write_raw function, since this function can be called directly (and in this case, the flash has to be unlocked).

@kYc0o kYc0o self-assigned this May 14, 2018
@@ -140,6 +136,11 @@ void flashpage_write_raw(void *target_addr, const void *data, size_t len)
assert(((unsigned)target_addr + len) <
(CPU_FLASH_BASE + (FLASHPAGE_SIZE * FLASHPAGE_NUMOF)) + 1);

#ifndef CPU_FAM_STM32F1
DEBUG("[flashpage_raw] unlocking the flash module\n");
_unlock();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it really true that for all the following boards, it is not necessary to _unlock for flashpage_write_raw ?

cpu/sam0_common/Makefile.features:FEATURES_PROVIDED += periph_flashpage_raw
cpu/stm32f0/Makefile.features:  FEATURES_PROVIDED += periph_flashpage_raw
cpu/stm32l0/Makefile.features:FEATURES_PROVIDED += periph_flashpage_raw
cpu/stm32l1/Makefile.features:FEATURES_PROVIDED += periph_flashpage_raw

If yes is it necessary to do _unlock for _erase_page for them ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry misread the ifndef.

@cladmi
Copy link
Contributor

cladmi commented May 15, 2018

I tried understanding what was happening and played a version of this one looking more like master and also with master version.
Trying to find which of the DEBUG converted to a printf made it work and also I add delay in various places.

And at the point I am now, I think that maybe the changes in this PR are not required as it does not look like it is related to timing.
I only tested the test 255 from tests/periph_flashpage currently but could be also the same problem for flashpage_write_raw.

It works on master with both the two following diffs:

diff --git a/cpu/stm32_common/periph/Makefile b/cpu/stm32_common/periph/Makefile
index bcc3fe2..7ae2d98 100644
--- a/cpu/stm32_common/periph/Makefile
+++ b/cpu/stm32_common/periph/Makefile
@@ -1,3 +1,4 @@
 MODULE = stm32_common_periph
 
+CFLAGS += -O0
 include $(RIOTMAKE)/periph.mk

Or also this one

diff --git a/cpu/stm32_common/periph/flashpage.c b/cpu/stm32_common/periph/flashpage.c
index 1884579..1de1d8e 100644
--- a/cpu/stm32_common/periph/flashpage.c
+++ b/cpu/stm32_common/periph/flashpage.c
@@ -95,6 +95,7 @@ static void _erase_page(void *page_addr)
     /* make sure no flash operation is ongoing */
     DEBUG("[flashpage] erase: waiting for any operation to finish\n");
     while (FLASH->SR & FLASH_SR_BSY) {}
+    __asm__ __volatile__("":::"memory");
     /* set page erase bit and program page address */
     DEBUG("[flashpage] erase: setting the erase bit\n");
     CNTRL_REG |= FLASH_CR_PER;

And putting this __asm__ __volatile__("":::"memory"); on any line between the two while (FLASH->SR & FLASH_SR_BSY) {} makes it work. (I did not tested flashpage_write_raw yet).
https://stackoverflow.com/questions/22106843/gccs-reordering-of-read-write-instructions

So maybe we are just in a case of compiler re-ordering instructions so requires checking the compiled output and fix it properly.

@aabadie
Copy link
Contributor Author

aabadie commented May 15, 2018

putting this asm volatile("":::"memory"); on any line between the two while (FLASH->SR & FLASH_SR_BSY) {} makes it work. (I did not tested flashpage_write_raw yet)

Thanks @cladmi ! Your analysis seems very good. Without the compiler optimization option and with just adding the __asm__ __volatile__("":::"memory"); after all while (FLASH->SR & FLASH_SR_BSY) {}, everything is working on iotlab-m3, including flashpage_raw.

@aabadie
Copy link
Contributor Author

aabadie commented May 15, 2018

Here is the disassembled code (using gdb). Just showing the loop in flashpage_write_raw function.

  • Without __asm__ __volatile__("":::"memory");:
167	    for (size_t i = 0; i < (len / FLASHPAGE_DIV); i++) {
   0x080014d2 <+62>:	mov.w	r2, r9, lsr #1
   0x080014dc <+72>:	cmp	r2, r4
   0x080014de <+74>:	bne.n	0x80014f6 <flashpage_write_raw+98>
   0x08001504 <+112>:	adds	r4, #1
   0x08001506 <+114>:	b.n	0x80014dc <flashpage_write_raw+72>
   0x08001508 <+116>:	cmp	r3, #21
   0x0800150a <+118>:	lsrs	r0, r0, #32
   0x0800150c <+120>:	movs	r0, r0
   0x0800150e <+122>:	lsrs	r0, r1, #32
   0x08001510 <+124>:	asrs	r0, r0, #32
   0x08001512 <+126>:	ands	r2, r0
   0x08001514 <+128>:	movs	r0, #0
   0x08001516 <+130>:	ands	r2, r0

168	        DEBUG("[flashpage_raw] writing %c to %p\n", (char)data_addr[i], dst);
169	        *dst++ = data_addr[i];
   0x080014f6 <+98>:	ldrh.w	r1, [r8, r4, lsl #1]
   0x080014fa <+102>:	strh.w	r1, [r7, r4, lsl #1]

170	        while (FLASH->SR & FLASH_SR_BSY) {}
   0x080014fe <+106>:	ldr	r1, [r3, #12]
   0x08001500 <+108>:	lsls	r1, r1, #31
   0x08001502 <+110>:	bmi.n	0x80014fe <flashpage_write_raw+106>

171	        // __asm__ __volatile__("":::"memory");
172	    }
  • With __asm__ __volatile__("":::"memory");
167	    for (size_t i = 0; i < (len / FLASHPAGE_DIV); i++) {
   0x080014d2 <+62>:	mov.w	r2, r9, lsr #1
   0x080014dc <+72>:	cmp	r2, r4
   0x080014de <+74>:	bne.n	0x80014f6 <flashpage_write_raw+98>

168	        DEBUG("[flashpage_raw] writing %c to %p\n", (char)data_addr[i], dst);
169	        *dst++ = data_addr[i];
   0x080014f6 <+98>:	ldrh.w	r1, [r8, r4, lsl #1]
   0x080014fa <+102>:	strh.w	r1, [r7, r4, lsl #1]

170	        while (FLASH->SR & FLASH_SR_BSY) {}
   0x080014fe <+106>:	ldr	r1, [r3, #12]
   0x08001500 <+108>:	lsls	r1, r1, #31
   0x08001502 <+110>:	bmi.n	0x80014fe <flashpage_write_raw+106>

171	        __asm__ __volatile__("":::"memory");
   0x08001504 <+112>:	adds	r4, #1
   0x08001506 <+114>:	b.n	0x80014dc <flashpage_write_raw+72>
   0x08001508 <+116>:	cmp	r3, #21
   0x0800150a <+118>:	lsrs	r0, r0, #32
   0x0800150c <+120>:	movs	r0, r0
   0x0800150e <+122>:	lsrs	r0, r1, #32
   0x08001510 <+124>:	asrs	r0, r0, #32
   0x08001512 <+126>:	ands	r2, r0
   0x08001514 <+128>:	movs	r0, #0
   0x08001516 <+130>:	ands	r2, r0

172	    }

The block of instructions from <+112> to <+130> are obviously moved, so there's indeed some reordering.

@aabadie
Copy link
Contributor Author

aabadie commented May 15, 2018

I had a look at the STM32 cube code and there the 'end of operation' bit (EOP) in status register is cleared after BSY bit goes back to 0. I tried this instead of the assembly hack and it also works. Will push the change, so you can have a look.

The disassemble code is the following in this case:

171	    for (size_t i = 0; i < (len / FLASHPAGE_DIV); i++) {
   0x080014d2 <+62>:	mov.w	r2, r8, lsr #1
   0x080014dc <+72>:	cmp	r2, r4
   0x080014de <+74>:	bne.n	0x80014f6 <flashpage_write_raw+98>
   0x08001506 <+114>:	adds	r4, #1

172	        DEBUG("[flashpage_raw] writing %c to %p\n", (char)data_addr[i], dst);
173	        *dst++ = data_addr[i];
   0x080014f6 <+98>:	ldrh.w	r1, [r7, r4, lsl #1]
   0x080014fa <+102>:	strh.w	r1, [r6, r4, lsl #1]

174	        while (FLASH->SR & FLASH_SR_BSY) {}
   0x080014fe <+106>:	ldr	r1, [r3, #12]
   0x08001500 <+108>:	lsls	r0, r1, #31
   0x08001502 <+110>:	bmi.n	0x80014fe <flashpage_write_raw+106>

175	        if (FLASH->SR & FLASH_SR_EOP) {
   0x08001504 <+112>:	ldr	r1, [r3, #12]
   0x08001508 <+116>:	lsls	r1, r1, #26

176	            FLASH->SR &= ~(FLASH_SR_EOP);
   0x0800150a <+118>:	ittt	mi
   0x0800150c <+120>:	ldrmi	r0, [r3, #12]
   0x0800150e <+122>:	bicmi.w	r0, r0, #32
   0x08001512 <+126>:	strmi	r0, [r3, #12]
   0x08001514 <+128>:	b.n	0x80014dc <flashpage_write_raw+72>
   0x08001516 <+130>:	nop
   0x08001518 <+132>:	cmp	r3, #69	; 0x45
   0x0800151a <+134>:	lsrs	r0, r0, #32
   0x0800151c <+136>:	movs	r0, r0
   0x0800151e <+138>:	lsrs	r0, r1, #32
   0x08001520 <+140>:	asrs	r0, r0, #32
   0x08001522 <+142>:	ands	r2, r0
   0x08001524 <+144>:	movs	r0, #0
   0x08001526 <+146>:	ands	r2, r0

177	        }
178	    }

@aabadie aabadie force-pushed the pr/stm32-common/fix_flashpage_m3 branch 2 times, most recently from 44f20e3 to 7f39a4f Compare May 15, 2018 20:20
@aabadie
Copy link
Contributor Author

aabadie commented May 16, 2018

I re-tested this PR on iotlab-m3, nucleo-l073, nucleo-f070 and nucleo-l152 and everything (full page write + raw bytes write) is working now.

@cladmi
Copy link
Contributor

cladmi commented May 16, 2018

@aabadie can you squash and write the explanations in the commit messages ? It would be easier for me to review with the summarized details.

@aabadie
Copy link
Contributor Author

aabadie commented May 16, 2018

can you squash and write the explanations in the commit messages ?

Do you want that the change related to __asm__ __volatile__("":::"memory"); still appears in the history ? If yes, I think I'll completely rewrite the history of this PR:

  1. Apply the cosmetic changes introduced by this PR (debug messages, typos, etc), will still fail on iotlab-m3
  2. Apply the change related to assembly call. Will work on iotlab-m3.
  3. Replace the previous change with the check and update of EOP bit in status register. Will also work on iotlab-m3.
  4. Another pass of refactor :introduce the _wait_for_pending_operations function to avoid code duplication.

Is it ok for your @cladmi ?

@aabadie aabadie force-pushed the pr/stm32-common/fix_flashpage_m3 branch 2 times, most recently from 901aa8f to b80691f Compare May 16, 2018 20:10
@aabadie
Copy link
Contributor Author

aabadie commented May 16, 2018

@cladmi, I reworked the commit history of this PR. The last one contains the real fix for iotlab-m3, the others are cosmetic.

@aabadie
Copy link
Contributor Author

aabadie commented May 22, 2018

@cladmi, is it ok for you here ?

@aabadie aabadie force-pushed the pr/stm32-common/fix_flashpage_m3 branch from b80691f to f120107 Compare May 27, 2018 13:12
@aabadie
Copy link
Contributor Author

aabadie commented May 28, 2018

@kYc0o, can you check that this PR actually fixes the flashpage feature on iotlab-m3 ?

@aabadie aabadie force-pushed the pr/stm32-common/fix_flashpage_m3 branch from f120107 to 988c8b2 Compare May 29, 2018 14:21
@aabadie
Copy link
Contributor Author

aabadie commented May 29, 2018

@cladmi maybe you want to have a look at this one during hack'n'ack ?

{
DEBUG("[flashpage] waiting for any pending operation to finish\n");
while (FLASH->SR & FLASH_SR_BSY) {}
if (FLASH->SR & FLASH_SR_EOP) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The if section is not necessary to make it work on M3 nodes. I tested it.
So if needed it should go in its own commit with justification.

What makes it work is that now the while is in a function so it is not getting re-ordered by gcc.
I think a comment on the function and a message in the commit is necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what is done in the STM32 Cube code, I just did the same. I have no idea if/when this may happen.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, then it should not be in a "fix M3" commit but in a "make it look more like stm32 cube code just in case".

KEY_REG = FLASH_KEY1;
KEY_REG = FLASH_KEY2;
}
}

void _lock(void)
{
DEBUG("[flash-common] locking the flash module\n");
CNTRL_REG |= CNTRL_REG_LOCK;
if (!(CNTRL_REG & CNTRL_REG_LOCK)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there still a case where this is necessary ?
But maybe this would also be needed to ask it for _unlock.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least 'test 255' and 'write_raw 0x807f808 abcdefgh' work on M3 nodes without both protections.

But there may be good reasons for it just asking.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is the same check in the _unlock function, I wanted it to be consistent. The idea is "if already locked, do nothing".

aabadie added 4 commits May 30, 2018 14:09
- improve debug messages
- fix missing space before comment
- use a comment instead of debug message (the same message is displayed by the function called after)
    Moving the while loop in a separate function ensures no ordering
    optimizations is applied silently by gcc.
    This commit fixes the flashpage not working on iotlab-m3.
This was taken from STM32 Cube generated code
@aabadie aabadie force-pushed the pr/stm32-common/fix_flashpage_m3 branch from 988c8b2 to 6007274 Compare May 30, 2018 12:15
@aabadie
Copy link
Contributor Author

aabadie commented May 30, 2018

@cladmi, I reworked the commit history. Let me know if this is ok for you.

Copy link
Contributor

@cladmi cladmi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it now. The changes are now also documented appropriately for me.

tests/periph_flashpage works on:

  • iotlab-m3 with test 255 and write_raw 0x807f808 abcdefgh
  • samr21-xpro with test 255 and write_raw 0xff10 abcdefgh12345678.

@kYc0o: can you also take a look and test your boards ?

@kYc0o
Copy link
Contributor

kYc0o commented May 30, 2018

Unfortunately I have no boards working now. I trust you with this.

@kYc0o
Copy link
Contributor

kYc0o commented May 30, 2018

Tested on nucleo-l152re and b-l072z-lrwan1. Everything alright so go!

@kYc0o kYc0o merged commit 7529133 into RIOT-OS:master May 30, 2018
@aabadie aabadie deleted the pr/stm32-common/fix_flashpage_m3 branch June 1, 2018 19:03
@cladmi cladmi added this to the Release 2018.07 milestone Jul 10, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: boards Area: Board ports Area: drivers Area: Device drivers CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Platform: ARM Platform: This PR/issue effects ARM-based platforms Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

flashpage_write broken on iotlab-m3
4 participants