Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sam0 flashpage_write issue #7667

Closed
travisgriggs opened this issue Oct 3, 2017 · 13 comments
Closed

sam0 flashpage_write issue #7667

travisgriggs opened this issue Oct 3, 2017 · 13 comments
Assignees
Labels
Platform: ARM Platform: This PR/issue effects ARM-based platforms Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)

Comments

@travisgriggs
Copy link
Contributor

travisgriggs commented Oct 3, 2017

This took a while to chase this down. On the samd21 issue, the flash page_write function casts the void* data argument to a uint32_t* pointer. This works fine when the compiler places the memory for your argument aligned. But I've found that it doesn't always. I was declaring a

static u8 Row[NVMCTRL_ROW_SIZE];

The compiler was alternately placing this memory at x20002402 or x20002400. At the latter, the cast to a uint32_t* was fine, but at the former, chip level exception occured. This kind of bug is no fun to figure out.

I have three proposals I'd like to consider for a fix. With some direction, I'll work up the appropriate PR.

  1. Require a client side fix. In this case, we put the onus of getting the data argument 32bit aligned. One can use an expression like:

    static __attribute__((aligned(4))) u8 Row[NVMCTRL_ROW_SIZE];

to force the alignment. If this is the preferred solution, then the change that I believe should be made is the documentation of flashpage_write() should be modified to make this clear to those who decide to use this function.

This is not my personally preferred solution.

  1. Use uint16_t instead of uint32_t. I think it's OK to assume that memory is 2 byte aligned (see Interrupt-Handling on msp430 is buggy #3 otherwise). page_addr and data_addr become uint16_t* and the loop looks like:

     for (unsigned i = 0; i < (FLASHPAGE_SIZE / 2); i++) {
         *page_addr++ = data_addr[i];
     }
    
  2. There is a pathological case where someone does not have the argument short aligned, e.g.:

   #pragma pack(1)
    struct foo {
        u8 singleByte;
        u8 data[FLASHPAGE_SIZE];
    };
    #pragma pack(8)

In this case, it's again not safe to cast data as a uint16_t*. So data_addr must become a uint8_t* (page_addr remains uint16_t*). And the loop must do the work directly:

  for (unsigned i = 0; i < (FLASHPAGE_SIZE - 1); i+= 2) {
        uint16_t pair = ((uint16_t)data_addr[i + 1] << 8) | (uint16_t)(data_addr[i]);
        *page_addr++ = pair;
    }

This is my preferred solution, I think. It's the most robust and safe. It also matches the ASF implementation most closely.

(aside: I had hoped to use memcpy or wmemcpy, since they often dynamically deal with aligned/unaligned data nicely, but I could not get them to work--the data_addr is "special" enough to require explicit code apparently)

@jnohlgard
Copy link
Member

For a quick workaround, use uint32_t buf[] for your client side source buffer and cast a pointer to uint8_t* if you need to do bytewise manipulations. It is safe to cast a pointer from a wider alignment to a different type with a lower alignment requirement, but not the other way around: uint8_t *ptr8 = (uint8_t *)ptr32 is safe, but uint32_t *ptr32 = (uint32_t *)ptr8 is not.

@travisgriggs
Copy link
Contributor Author

Your workaround actually points to yet-another solution:

  1. Just declare the data argument type as uint32_t* rather than void*.

Is there an appetite for a PR for any of these solutions? I really do think one of them should be done, even it's just adjusting the documentation. I spent a good day of time pulling hair out with intermittent stack corruptions caused by this.

@jnohlgard
Copy link
Member

Changing the argument type is not possible. That means an API change for the general periph flashpage interface, but this API is designed to be as platform agnostic as possible, and forcing the argument to uint32_t makes assumptions about the underlying hardware platform.

@travisgriggs
Copy link
Contributor Author

Back to the original 3 then :)

@cladmi
Copy link
Contributor

cladmi commented Oct 10, 2017

I agree this is an issue. It requires either an API change, with a per board alignment requirement, or a safe implementation.

@cladmi
Copy link
Contributor

cladmi commented Oct 13, 2017

A non breaking API change is to add a flashpage_write_aligned(int page, page_write_input_buf_t *data) function, with alignment type defined per cpu. And make the current "flashpage_write" implementations work with unaligned *data.
The aligned version could be faster, or just point to the safe version.

@travisgriggs
Copy link
Contributor Author

I will work up a PR for proposed solution #3 if I had some assurances it wasn't going to rot on the review pile... just say go.

@miri64 miri64 added Platform: ARM Platform: This PR/issue effects ARM-based platforms Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) labels Oct 25, 2017
@haukepetersen
Copy link
Contributor

@kaspar030 @kYc0o you have a grip on this issue, right?

@cladmi
Copy link
Contributor

cladmi commented Oct 25, 2017

I can look into it after this week.

@kYc0o
Copy link
Contributor

kYc0o commented Jan 16, 2018

@travisgriggs can you check if #7970 solves this issue?

@PeterKietzmann
Copy link
Member

@fedepell can you confirm that #10069 solved this issue?

@fedepell
Copy link
Contributor

@PeterKietzmann : I believe that, at the time of this issue, there were a bunch of problems in the driver. One was the byte alignment issue (solved by #7970), one was the fact that a write has to be done in more steps (solved by #10069) and last but not least we weren't waiting also for chip to be ready (which was seemingly not a problem but the datasheet says we should, and maybe happens very seldom, solved in #10880).
I believe, at least for the usage I was doing with it in the last few months on the SAML21 (both shorter writes for saving a datalog and long intensive writes to do remote updates), that now the flash driver here is pretty much stable and possibly this issue should be closed.
Also an automated test has been added in #10106 which should also hopefully give us more confidence future changes.

@PeterKietzmann
Copy link
Member

@fedepell thanks for the detailed report. Sounds like great improvement – so I'm gonna consider this issue as solved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Platform: ARM Platform: This PR/issue effects ARM-based platforms Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)
Projects
None yet
Development

No branches or pull requests

8 participants