Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LFS flashing alters core firmware behavior resulting in panic #3109

Closed
pnkowl opened this issue May 15, 2020 · 3 comments
Closed

LFS flashing alters core firmware behavior resulting in panic #3109

pnkowl opened this issue May 15, 2020 · 3 comments

Comments

@pnkowl
Copy link

pnkowl commented May 15, 2020

Expected behavior

The act of node.flashreload() should not change the behavior of the underlying firmware if nothing is ever called from LFS.

Actual behavior

Adding bits of code to LFS, even it never executed, causes string.find() to take 5.5x longer than if those bits of code did not exist in LFS. For more time consuming string.find requests, this causes a watchdog panic.

Scope

  • LFS images created with "A Lua Cross-Compile Web Service"
  • This issue has existed in the LFS master branch since I first started using these code bits (circa March 2020).
  • This issue has been verified to exist in the LFS dev branch

Sample Results

notes
4a==4c==offending bits are not present in LFS
4b==offending bits in LFS
numbers are tmr.now() differences (micro seconds elapsed)
full listings attached with LFS zips and img files
[ticket 200515a.zip](https://github.com/nodemcu/nodemcu-firmware/files/4635297/ticket.200515a.zip)

                                         4a      4b       4c
f(p1..p2,"p1,2")             p1,2:       57564   310904   57706  
f(p1..p2..p3,"p1,2,3")       p1,2,3:    176993   970681  177020  
f(p1..p2..p3..p4,"p1,2,3,4") p1,2,3,4:  477748  2647403  477772  
f(p1..p4..p4,"p1,p4,p4")     p1,p4,p4:  715512  panic    715502  

Mitigation

  • encapsulate each offending bit of code in its own global function and initialize it outside of LFS.
  • all other code can stay in LFS

Test code

Provide a Minimal, Complete, and Verifiable example which will reproduce the problem.

-- add code here
function f(payload,text)
  t1=tmr.now()
  local a,b,c,d,e=string.find(payload, "([A-Z]+) /([^?]*)%??(.*) HTTP")
  t2=tmr.now()
  print(text..": "..t2-t1.."\n")
end

dev branch
execute one of 
node.flashreload("LFS_hello_dev.img")
node.flashreload("LFS_hello_bug_dev.img")

node.flashindex("hello")()  -- verify LFS lives
p1="POST /a_small.lua HTTP/1.1 Host: 192.168.1.16 User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:73.0) "
p2="Gecko/20100101 Firefox/73.0 Accept: */* Accept-Language: en-US,en;q=0.5 Accept-Encoding: gzip, "
p3="deflate Content-Type: text/plain Content-Length: 5 Origin: http://192.168.1.16 Connection: keep-alive Referer: http://192.168.1.16/a_small.lua?edit "
p4=string.rep("--abc ",48)
f(p1..p2,"p1,2")
f(p1..p2..p3,"p1,2,3")
f(p1..p2..p3..p4,"p1,2,3,4")  -- sometimes panics if LFS, close to the watchdog limit
f(p1..p4..p4,"p1,p4,p4")  -- panics if LFS

master branch (non minimal examples)
node.flashreload("LFS200316b_ok.img")
node.flashreload("LFS200316b_bad2.img")         -- contains ide_receive1.lua

NodeMCU startup banner, master branch

You MUST include the firmware startup banner to describe the version you are using. We reserve the right to immediately close any bug that doesn't.

NodeMCU 3.0.0.0 built on nodemcu-build.com provided by frightanic.com
branch: master
commit: 3d91785
release:
release DTS: 202002231746
SSL: false
build type: float
LFS: 0x40000
modules: adc,bit,cron,dht,enduser_setup,file,gpio,mqtt,net,node,ow,pwm2,rtcmem,rtctime,sntp,struct,tmr,uart,wifi
build 2020-05-01 12:54 powered by Lua 5.1.4 on SDK 3.0.1-dev(fce080e)
lua: cannot open init.lua

NodeMCU startup banner, dev branch

You MUST include the firmware startup banner to describe the version you are using. We reserve the right to immediately close any bug that doesn't.

NodeMCU 3.0.0.0 built on nodemcu-build.com provided by frightanic.com
branch: dev
commit: 5e2ea5a
release:
release DTS: 202005120049
SSL: false
build type: float
LFS: 0x40000 bytes total capacity
modules: adc,bit,cron,dht,file,gpio,mqtt,net,node,ow,pwm,rtcmem,rtctime,sntp,softuart,struct,tmr,uart,wifi
build 2020-05-12 14:43 powered by Lua 5.1.4 on SDK 3.0.1-dev(fce080e)
cannot open init.lua:

Hardware

Describe which ESP8266/ESP32 device you use and document any special hardware setup
required to reproduce the problem.

  • NodeMCU ver 0.9 (ESP-12, 4M byte ESP8266)
  • no external connections besides usb
@nwf
Copy link
Member

nwf commented May 15, 2020

The existence of a flashed LFS implies the existence of a LFS string table, which will be loaded even if no code from LFS is ever fetched explicitly; AIUI, this is an artefact of Lua's desire to intern strings for pointer equality testing.

Now why that increases the runtime of string.find I can only leave to @TerryE's excellent debugging skills. I suspect some accidental quadratic behaviour.

@TerryE
Copy link
Collaborator

TerryE commented May 15, 2020

The reason for the increases runtime is an easy one to answer and is discussed in some of my performance papers. Moving string.find(payload, "([A-Z]+) /([^?]*)%??(.*) HTTP") into LFS also moves the string constant ([A-Z]+) /([^?]*)%??(.*) HTTP")into the LFS string table. This is passes as a const char * into the workhorse regexp function match()in lstrlib.c which itself uses character access to climb up and down this string, and each byte access triggers an unaligned exception which uses a software access function instead of a single hardware instruction to grab each byte of the string. I've used access macros to remove 95%+ of such unaligned exceptions, but this is one example of where we've left the handler to do the work and this exposes a cost of using LFS.

If you replace the search string by "([A-Z]+) /([^?])%??(.)".." HTTP" then this will be created in the RAM string table and so will execute at H/W speed.

BTW: Your response header doesn't include the \r\n line delimiters as the RFC requires, and even if it does you are doing a very inefficient search, so let me pick it apart:

  • (A-Z]+) / one or more uppercase letters followed by " /". Note that is always best to anchor searches that start at char 1 using the ^ anchor
  • ([^?]*)%? a greedy string of not ? followed by and optional ?. The greedy string followed by the optional ? is a bad combo.
  • (.*) HTTP a greedy string any characters followed by HTTP

Lua patterns are not regexps and the greedy matches where they aren't needed will always require the match to scan to then end of the string. The URI cannot have embedded space characters so by far the easiest pattern to use is

  • ^(%w-) +([^ ]-) +(%w*) which splits out the first 3 fields using a space delimiter. Check the first. Also check third or don' bother and hardwire the match
  • ^(%w-) +([^ ]-) +HTTP

If you need to split out field 2 into 2A and 2B depending on the parameter separator '?'

The panic might be an entirely valid response as the search using a ROM string will be so inefficient.

You also need to use the match() method to return the captures, BTW. So doing something like

function f(payload,text)
  local t0,a,b=tmr.now(), payload:match('^(%w-) +([^ ]-) +HTTP')
  print(text,tmr.now()-t0,a,b)
end

drops the find execution time from the 50-700 mSec range down by over a factor of a 1,000 to a few hundred uSec.

Also note that in this case the find engine isn't doing a triple do loop across the whole string backtracking inside the pattern so the overhead of the extra unaligned exceptions is small and can be ignored.

@TerryE TerryE closed this as completed May 15, 2020
@TerryE
Copy link
Collaborator

TerryE commented May 15, 2020

BTW, good work for the feedback so thank-you , even if this one was a false alarm. 😄 Post back if you want some extra explanation. I've just closed this because it isn't really an open issue, IMO.

@pnkowl pnkowl mentioned this issue May 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants