WiFi.begin() causes exception 3 #7373

Closed
5 of 6 tasks
xsrf opened this issue Jun 13, 2020 · 8 comments

Comments

@xsrf
Contributor

xsrf commented Jun 13, 2020

Basic Infos

  • This issue complies with the issue POLICY doc.
  • I have read the documentation at readthedocs and the issue is not addressed there.
  • I have tested that the issue is present in current master branch (aka latest git).
  • I have searched the issue tracker for a similar issue.
  • If there is a stack dump, I have decoded it.
  • I have filled out all fields below.

Platform

  • Hardware: ESP8266 on WeMos D1 Mini R2
  • Core Version: 89d0c78
  • Development Env: Platformio
  • Operating System: Windows

Settings in IDE

platformio.ini

[env:d1_mini]
platform = espressif8266
platform_packages =
  ; use upstream Git version
  framework-arduinoespressif8266 @ https://github.com/esp8266/Arduino.git#89d0c78703e7a4bb627f9c69e237618add0f8de3
board = d1_mini
framework = arduino
monitor_speed = 115200
upload_speed = 921600

Problem Description

WiFi.begin() causes exception 3 (LoadStoreErrorCause: Processor internal physical address or data error during load or store)

MCVE Sketch

#include <Arduino.h>
#include <ESP8266WiFi.h>

void setup() {
  Serial.begin(115200);
  Serial.println("Before begin()");
  delay(500);
  WiFi.begin("test","testtesttest");
  delay(500);
  Serial.println("After begin()");
}

void loop() {
  Serial.println(WiFi.status());
  Serial.println(WiFi.localIP());
  delay(500);
}

Debug Messages


 ets Jan  8 2013,rst cause:2, boot mode:(3,7)

load 0x4010f000, len 3664, room 16 
tail 0
chksum 0xee
csum 0xee
v89d0c787
~ld
Before begin()

--------------- CUT HERE FOR EXCEPTION DECODER ---------------

Exception (3):
epc1=0x40100718 epc2=0x00000000 epc3=0x00000000 excvaddr=0x400081e9 depc=0x00000000

>>>stack>>>

ctx: cont
sp: 3ffffba0 end: 3fffffc0 offset: 0190
3ffffd30:  feefeffe feefeffe feefeffe feefeffe
3ffffd40:  feefeffe feefeffe feefeffe feefeffe
3ffffd50:  feefeffe feefeffe feefeffe 3fffff00  
3ffffd60:  0000049c 0000049c 00000020 40100902
3ffffd70:  feefeffe feefeffe feefeffe feefeffe  
3ffffd80:  00000000 400042db 000003fd 40100b50
3ffffd90:  40004b31 00001000 000003fd 4010027c  
3ffffda0:  40105b1c feefeffe feefeffe 4022d1e5
3ffffdb0:  40100c1d 4022d2cf 3ffef05c 0000049c  
3ffffdc0:  000003fd 3fffff00 3ffef05c 4022d2b2
3ffffdd0:  ffffff00 55aa55aa 0000009a 00000020  
3ffffde0:  00000020 0000008b 0000008a aa55aa55  
3ffffdf0:  000003ff 4022d7b2 3ffef05c 3ffef05c
3ffffe00:  000000ff 000000db 000000db 40100647  
3ffffe10:  40100c1d 00000001 3ffef06c 4022d9d2
3ffffe20:  00000008 3ffef05c 000000ff 3fffff00  
3ffffe30:  3fffff20 3ffef093 0000009a 00000020
3ffffe40:  3ffef11c 3fffff61 00000001 4022da82
3ffffe50:  3fffff00 40239bd0 00000000 0000000c  
3ffffe60:  3ffef45c 3fffff20 3fff5394 4022da51  
3ffffe70:  3ffef05c 4022dab8 3ffe84cc 3ffe8623
3ffffe80:  4020194a 3ffe8623 3ffe861b 4020189f
3ffffe90:  3ffec400 3fff4cd4 00000020 401009bf  
3ffffea0:  00000020 000000ca 000000cb aa55aa55
3ffffeb0:  00000300 4024984b 00000000 40100378  
3ffffec0:  3ffec4d4 00000001 3fff2634 40249866
3ffffed0:  40249896 00000001 00000001 00000001  
3ffffee0:  40202cf2 3fff2634 0000000a feefeffe
3ffffef0:  d5103800 fe0c515a feefeffe 00000100  
3fffff00:  74736574 3ffeef00 00000020 40100902
3fffff10:  402033a5 feefeffe feefeffe feefeffe  
3fffff20:  74736574 74736574 74736574 40203300
3fffff30:  3ffe863e 00000000 74002928 4020330d  
3fffff40:  007a1200 0696e6f1 ffffff00 3ffee3c4
3fffff50:  401050a1 00028e34 3ffee420 00000000  
3fffff60:  3ffedd00 3ffee420 00000181 00000002
3fffff70:  00000004 00000000 3ffee318 00000001  
3fffff80:  40202a55 000001f4 3ffee3c4 3ffee3c4
3fffff90:  3fffdad0 00000000 3ffee34c 4020106a
3fffffa0:  feefeffe feefeffe 3ffee384 40202560  
3fffffb0:  feefeffe feefeffe 3ffe84e4 40100bc5
<<<stack<<<

--------------- CUT HERE FOR EXCEPTION DECODER ---------------

I haven't found a way to decode the stacktrace using PlatformIO, sorry.

Using this works:

#include <Arduino.h>
#include <ESP8266WiFi.h>

void setup() {
  Serial.begin(115200);
  Serial.println("Before begin()");
  delay(500);
  //WiFi.begin("test","testtesttest");
  delay(500);
  Serial.println("After begin()");
}

void loop() {
  Serial.println(WiFi.status());
  Serial.println(WiFi.localIP());
  delay(500);
}

Output:

Before begin()
After begin()
0
(IP unset)
0
(IP unset)
0
(IP unset)

This is not the first time I have encountered this issue. I rarely use WiFi.begin() in my code because all my ESPs are already configured correctly for my WiFi. But today I accidentally flashed code with WiFi.begin() and wrong credentials, and now I'm unable to recover. I had the same issue a few weeks ago, and after hours of flashing different stuff using different software (NodeMCU flasher etc.) and different flash layouts it finally recovered, but I don't know what exactly helped.

I know that there are similar issues like #1997, but they are either old, closed, or refer to much more complex code using WiFiManager etc.

@d-a-v
Collaborator

d-a-v commented Jun 13, 2020

Try adding WiFi.mode(WIFI_STA); before begin().
Just so you know, in version 3.0.0 WiFi autoconnect, which you seem to be using, will probably be deprecated (check #7303).

@xsrf
Contributor Author

xsrf commented Jun 13, 2020

Try adding WiFi.mode(WIFI_STA); before begin().

Doesn't help. But thx for the info about 3.0.0.

@d-a-v
Collaborator

d-a-v commented Jun 13, 2020

Try to use the "erase all flash" option, and one of the WiFi shipped examples.
Maybe you'll have to use the Arduino IDE for that one time (if you download it, you'll get the erase option, the examples, and the stack decoder).
Then you'll be able to use PIO again.
The "erase all flash" operation can also be done manually with esptool.py erase_flash.
You'll need to use begin(ssid, password) at least once after that command.

@xsrf
Contributor Author

xsrf commented Jun 13, 2020

Thx very much! PIO has an erase flash task (didn't know that until now) and it solved my issue. I won't close this issue though, since I think begin() still has problems and also created the flash corruption in the first place. I'll let you decide...

@devyte
Collaborator

devyte commented Jun 13, 2020

When flashing a new sketch to a device, you must always erase the flash.
Having said that, there is a known and extremely complex issue with erasing the wifi config area when using OTA. There is a proof of concept PR in #6965 by @mhightower83 which is meant to address that case, because you can't do a full erase there.
And now that you recovered, I'm going to guess you won't be able to reproduce the problem. If by chance you do have a deterministic way to reproduce the problem, please retest with #6965 and report back. It would help to confirm the fix.

Closing, because the issue is already being covered in #6965.

@devyte devyte closed this as completed Jun 13, 2020
@xsrf
Contributor Author

xsrf commented Jun 14, 2020

When flashing a new sketch to a device, you must always erase the flash.

Okay, I'm curious now. I'm pretty sure this is wrong and you meant when using a different SDK version or a different flash layout (SPIFFS), right?
I can understand that there are data structures on the flash that hold wifi config and are not erased during regular programming. As long as the SDK stays the same, there should be no issue, right?

In my case, since I rarely update the WiFi config, it may actually have been written by an older SDK version, and saving new credentials into the old struct with new code (and a potentially newer SDK) may have caused the issue. But this is just speculation.
Is it possible that newer SDKs can use the configuration written by older SDKs but cannot update it properly? This would explain my experience so far that actually writing new credentials bricks my devices, not just using different code.

Having said that, there is a known and extremely complex issue with erasing the wifi config area when using OTA. There is a proof of concept PR in #6965 by @mhightower83 which is meant to address that case, because you can't do a full erase there.

I'm not using OTA in this case, but I'm curious what the consequences are. I do have devices I have to update via OTA because their headers aren't easily accessible.
Does this mean that I should stick to the same SDK version whenever I use OTA? Also, my devices using OTA may already have a newer SDK with older configuration data.
I guess they may brick the moment someone tries to save new credentials to them? 🤔

And now that you recovered, I'm going to guess you won't be able to reproduce the problem. If by chance you do have a deterministic way to reproduce the problem, please retest with #6965 and report back. It would help to confirm the fix.

I'll have a deeper look into that and report back. However, as I said, this is not an OTA-only problem, so I don't know whether it would help in every case.

Is there any documentation of this configuration section? Is there a way to dump it?

@devyte
Collaborator

devyte commented Jun 14, 2020

I long ago lost count of how often users open issues reporting crashes or connection issues, and then an erase flash fixes the problem for them. The golden rule is: if you update a sketch (e.g. bug fixes), it should be OK not to erase. If you flash a new sketch, i.e. one different from the previous, or if you change the core version (which implies a potential SDK version change), then you have to erase the flash.
When a sketch starts up, it has no way of knowing whether the wifi config is valid or if it contains garbage, and so has no way of knowing if it should read it, or rebuild it. I think that it is assumed that it is valid, because rebuilding it means writing to flash every boot, and that would wear out that flash area very quickly.
So, erase like explained above to make sure.
The thing is, the wifi config area isn't well understood, nor is it well understood how/when the closed source sdk makes use of it. That is in part the main problem for the OTA case (which you said is not your case), because there you can't erase the whole flash, and the wifi config area could move, or change size.

@xsrf
Contributor Author

xsrf commented Jun 14, 2020

I don't know what Espressif has done there, but the usual way is to add a checksum and a version to the data, so the SDK can actually know whether the data is valid and otherwise rebuild it (once) 🙄
The fact that, as you say, so many people have issues with this shows that this is a problem...
I guess I'll have to come up with my own way of persisting wifi credentials in the future... 😒


3 participants