Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

samd21:uart TXC test backwards #7617

Closed
travisgriggs opened this issue Sep 19, 2017 · 9 comments · Fixed by #7638
Closed

samd21:uart TXC test backwards #7617

travisgriggs opened this issue Sep 19, 2017 · 9 comments · Fixed by #7638
Assignees
Labels
Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)

Comments

@travisgriggs
Copy link
Contributor

travisgriggs commented Sep 19, 2017

In cpu/sam0_common/periph/uart.c, uart_write() has a spin wait on TXC.

    while (!(dev(uart)->INTFLAG.reg & SERCOM_USART_INTFLAG_DRE)) {}
    dev(uart)->DATA.reg = data[i];
    while (dev(uart)->INTFLAG.reg & SERCOM_USART_INTFLAG_TXC) {}

Based on my reading of the TXC bit (for samd21 at least), it's backwards and should be NOT'ed in the spin test. As written, it's going to be set to 0 after the DATA.reg, so the while test is going to immediately fail since it won't be set UNTIL it's done shifting it out. To achieve the effect desired, I think a ! needs to be wrapped around the test.

I'm curious why we test for this at all though. The samd21 doc says:

"This flag is set when the entire frame in the transmit shift register has been shifted out and there are no new data in DATA."

The second half of the and clause implies it's ok to keep feeding the thing bytes (respecting the DRE flag). The TXC really seems to be there for sensing (via poll or interrupt) when characters are no longer being streamed at it.

So... I can do a PR. Or someone who owns it more can grab it and go. I'd want some direction on the second point there. I personally think the test should just be removed entirely, unless the other sam21 flavors have differing behaviors.

@smlng
Copy link
Member

smlng commented Sep 21, 2017

Reviewing the code, I came to same conclusion that the TXC check should be not'ed - but I'm unsure if it can be removed as an alternative solution. Btw. the UART driver is generic for sam*21 (and even all sam0 based) CPUs, so it should apply for all.

@smlng smlng self-assigned this Sep 21, 2017
@smlng smlng added the Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) label Sep 21, 2017
@smlng
Copy link
Member

smlng commented Sep 21, 2017

Btw. nice catch! Did you have any issues with uart, or was it just by chance?

@photonthunder
Copy link

My two cents would be to pull it out and make it a separate function. Not really needed with every byte sent, but would be useful to do a check to make sure the UART is empty after sending a bunch of bytes. Something I might check before sleeping...

@travisgriggs
Copy link
Contributor Author

Did you have any issues with uart, or was it just by chance?

I'm not sure yet. Originally I thought it was just an idle observation as I was trying to debug a nasty deadlock issue we're trying to fix. I'm running some tests right now that have me suspicious it may actually be more than just benign. I should have a better answer on that in about 24 hours.

@travisgriggs
Copy link
Contributor Author

I'm unsure if it can be removed as an alternative solution

Given that it does not do what was intended, it was already effectively removed. But I take your point, that if its needed, we should implement it. But so far, the installed base has anecdotally indicated its not needed, by getting by without it implemented properly.

I did look at the L21's data sheet, it uses the same verbiage, which leads me to believe that the sole condition for writing to the UART is checking DRE first. I'm not sure how many data sheets to troll through before feeling the understood behavior is ubiquituous.

@travisgriggs
Copy link
Contributor Author

The SAMR21 data sheet says the same thing.

@photonthunder
Copy link

Atmel Software Framework (ASF) does do the TX check after every byte.

@travisgriggs
Copy link
Contributor Author

I'm pretty confident now that not only is the current version ineffectual, but has been contributing to some nasty bugs we've been having. It's a case of a lot (5-ish) of different priority threads running asynchronously mixed with powering that chip down frequently, leads to hangs. I know multithreaded printf can lead to garbled output, but I don't think it should hang stuff.

My current app with the wrong code hangs. With the not added in as suggested, it works fine. I have not yet tested without the check at all.

@travisgriggs
Copy link
Contributor Author

Why it hangs:

In a normal single threaded, non interrupted scenario, what happens is this:

  1. poll DRE until set
  2. load the byte
  3. poll TRX until clear

With this coded upside down like this, TRX should be clear immediately following #2. So the first probe evaluates true and the code falls through. So it was basically a NOP.

However, if this thread gets suspended (either because of ISR or thread of higher priority), the uart hardware finishes its job and sets TXC to 1. When control finally returns to step 3, it will spin forever waiting for TXC to clear. So that thread is now hung. If another thread comes along and manages to write, that will cause TXC to go 0, which will finally clear it up and allow it to proceed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants