Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lua 5.1 to 5.3 Realignment Phase 2 and baseline 5.3 #2912

Closed
wants to merge 11 commits into from

Conversation

TerryE
Copy link
Collaborator

@TerryE TerryE commented Sep 12, 2019

Addresses: #2808, #2839 (partial), #2858, #2890

To be addressed before merge: #2783

  • This PR is for th e dev branch rather than for master.
  • This PR is compliant with the other contributing guidelines as well.
  • I have thoroughly tested my contribution.
  • The code changes are reflected in the documentation at docs/*.

Overview

This is a huge PR. Sorry for this, but by nature, I can't practically release this in chunks. As per #2890 (comment), I have reorganised this into a series of commits:

  1. Updates to app/lua to realign APIs. This is a reasonably large path to the Lua 5.1 codebase, implementing the Lua 5.3 version improved ROTables and the introduction of Function subtypes so that the modules can use the same API for building within both 5.1 and 5.3.

  2. Updates to app/modules and supporting directories to realign APIs. A general cleanup of the modules to use the stricter Lua API implemented in (1)

  3. Add the standard Lua 5.3.5 as a reference baseline. This is just the standard Lua 5.3.5 source without any changes reorganised into our current directory hierarchy. This is purely for baselining purposes. Note that the Lua Test Suite is included.

  4. First pass of Lua53 changes to make runable Lua53. This includes the core changes needed to be able to build LUA=53 and download a runnable Lua 5.3 firmware image onto the ESP, albeit with some limitations as discussed below. This has been tested against the Lua test suite within the luac.cross -e environment.

In this form, the build works for the default LUA=51 target, and as a minimum prerequisite for merge, we should have solidly tested this against both the firmware and luac.cross variants.

We should also have a stable and usable (albeit with agreed deficiencies) LUA=53 target.

I will create the initial PR with only (1) above and then add (2) - (4).

Still outstanding to be added before merge

Special Request. Please review holistically at the first instance. If necessary make changes and email me your patch. Given the 1000s of individual line changes here, if we use the Github review API and commenting system, this PR will become rapidly unworkable.

@TerryE
Copy link
Collaborator Author

TerryE commented Sep 13, 2019

Well, the baseline Lua53 code and the overall shared build system is available for peer review. The first (4) commits are as per my original post. The last two are small tweaks to work around PR #2793.

In terms of reviewing these patches, I am sorry to say that they are large.

  • 9300249 is the largest (some 38K lines) but this is just introducing the standard Lua 5.3 source and test scripts into the repository.
  • 1056ad8 and 2a50243 are 3,700 and 4,000 line patches respectively though most of these are the context wrapper lines. I suggest that you copy the diff format to a local program editor which does decent diff C highlighting to review the changes. (Just add .diff to the patch URI)
  • 1f75eaa is 6,500 lines so far and covers the changes to app/lua53. I've still got some stuff to add such as LFS support which I will do in the next few days.

(I've had to cancel my holiday in Greece, because I can't travel because of a ME/CFS relapse. So I will have more time to kill on this.)

The Travis fails are due to some merge gremlin with Gregor's enhanced BuildInfo patch. I will sweep this resolution up in the next commit.

@TerryE
Copy link
Collaborator Author

TerryE commented Sep 16, 2019

Gosh this hx711.c is hassle 😄

Anyway what I've done in this last commit is to go through all of the Lua Test Suite in detail removing any temporary changes which I did to get lua53 to compile and run the tests, debugging issues as I went along. The tests run clean with minimal and explained differences. CD to app/lua53/host and make LUA=53 DEBUG=1 TEST=1 to build luac.cross. cd tests; $ROOT/luac.cross -e all.lua will run the entire suite. You can also gdb this. If you want a sort of lua interface on the host then create a file containing debug.debug() and luac.cross -e 'debug.lua' gives you a command prompt, and you can drop in and out of gdb.

At the top level change user_config.h and user_modules.h as you require then do a make LUA=53 then this will build the firmware. If you enable gdbstub and rm -R app/lua53/.output then make LUA=53 DEBUG=1, this will give you a remote debuggable version.

There were some genuine bugs, and also some test assumptions that were invalid: e.g. the various base library functions and modules are in ROM and not directly in _G. So looking good overall. Lua 5.3 now includes almost complete math and debug libraries.

I have been focusing on luac.cross testing and used this to hammer out all of the shared environment issues. The firmware version seems to run OK, but I still need to stress test this. I will do a lot of this after I've added LFS.

Still got to port LFS -- my next commit.

@TerryE
Copy link
Collaborator Author

TerryE commented Oct 13, 2019

This latest commit includes the LFS functionality and various cascade fixes to allow the Lua test suite to run will only known variations from standard to accommodate LFS running and in particular running the test suite as LFS images so that we hammer the core against execution from LFS. This all runs within the tests.c wrapper which does paranoid dynamic memory tracking (e.g. checking overrun marker fields and failing if any mallocs are not paired with frees.

The luac.cross -e environment essentially runs the same Lua core VM as the NodeMCU firmware on the ESP with a subset of 'firmware' libraries.

This madatory malloc/free paring feature makes LFS on luac.cross possible since we can lua_newstate(), load LFS, lua_close(), lua_newstate() to run Lua code on the host with LFS loaded. Bouncing the Lua runtime like this also saves a restart when loading a new LFS on the ESP.

The next tranche of work is hammering the on-ESP firmware build.

@TerryE
Copy link
Collaborator Author

TerryE commented Oct 25, 2019

OK, this latest push largely adds LFS support. But in more detail

  • luac.cross -o name -f <lua file list> generates an LFS image file. The -a option isn't yet implemented.
  • LFS images can be loaded into luac.cross for execution in the -e script environment by using the -F LFSimage option. They can also be loaded into the ESP from SPIFFS using node.flashreload('imagename') which will reboot the ESP with the new LFS loaded.
  • The LFS file format is different from Lua 5.1 but is of comparable size. The compiled image (.lc) format is also different but about ½ the size.
  • Lua 5.3 is faster and uses less RAM than Lua 5.1.
  • I've added an extra pbrk() method to gdbstub. Calling gdbstub.init() is optional, and this method also enables UART output capture. The gdbstub documentation has been updated. @pjsg Philip could you take a look at this please.
  • I've added the % operator to strings. This works in a similar fashion to Python.

Yet TODO:

  • Documentation
  • Review SPIFFS loading.
  • Update LFS image loading in nodemcu-parttion.py
  • ESP-side LFS image building
  • Lua GC tuning. The old eLua GC was very aggressive but ran like a dog. The 5.3 engine has modified GC which is a lot faster but the default setting are a little too lazy so tends to under-report free heap(). Needs some tuning.
  • Lua test suite. I'll raise a separate issue to discuss this one. I've done a couple but this is quite a job. Note that all of the failures so far have been issues with the scripts working in a NodeMCU environment, rather than the runtime itself failing. (e.g. references to io library or needing more than 44Kb RAM to run.)

You must include LUA=53 in your make to build a Lua 5.3 runtime. Omitting this will get you a Lua 5.1 build.

At the moment I am bootstrapping the ESP by using the ftpserver module (and using nodemcu-tool upload ftp.img) to upload an LFS image with this in.

I am getting to the point where I desperately need other testers to get involved, and in particular regression test the Lua 5.1 option. There are quite a few changes here to align the Lua 5.1 and Lua 5.3 runtimes so that the modules work with both. Once we are happy that at least Lua 5.1 is stable, then we can merge this into dev, since at the moment you need to do an explicit LUA=53 make to include the Lua 5.3 runtime.

@fikin
Copy link
Contributor

fikin commented Oct 29, 2019

compiling this branch using marcel's docker nodemcu-build img is getting me to ld error :

HOSTCC /opt/nodemcu-firmware/app/lua/luac_cross/../lstrlib.c
../lstrlib.c: In function 'str_format2':
../lstrlib.c:828:5: warning: implicit declaration of function 'lua_rawlen' [-Wimplicit-function-declaration]
     int i,n=lua_rawlen(L,2);
     ^
HOSTCC /opt/nodemcu-firmware/app/lua/luac_cross/../ltable.c
...
HOSTLD ../../../luac.cross
.output/host/release/obj/lstrlib.o: In function `str_format2':
lstrlib.c:(.text+0x10b3): undefined reference to `lua_rawlen'
collect2: error: ld returned 1 exit status

what am i missing here?

@TerryE
Copy link
Collaborator Author

TerryE commented Oct 29, 2019

what am i missing here?

An offline PM:

Sorry guys

If you want to do a LUA=51 (default) build then patch

sed -i -e '828s/lua_rawlen/lua_objlen/' app/lua/lstrlib.c

This was one of the backported features to lua 5.1. I copied across the code forgetting that lua_rawlen() in 5.3 is lua_objlen() in 5.1. BTW this adds

print('%s =%s' % {k,v})

support to the string library

//Terry

If you are getting this message then you are testing the LUA=51 variant. Try make LUA=53 DEBUG=1 if you want to look at Lua 5.3

@fikin
Copy link
Contributor

fikin commented Oct 29, 2019

ok, no worries. using docker run .. -e LUA=53 ... does the trick, compilation is ok ;)

@fikin
Copy link
Contributor

fikin commented Nov 3, 2019

test code is:

file.remove("test.lua")
assert(file.open("test.lua","w+"), "file open failed")
file.writeline("print(1)")
file.close()
require("test")

first problem: running following as esplorer block-command results in reset:

> file.remove("test.lua")
> assert(file.open("test.lua","w+"), "file open failed")

 ets Jan  8 2013,rst cause:1, boot mode:(3,7)
load 0x40100000, len 28484, room 16 
tail 4
chksum 0xb7
load 0x3ffe8000, len 2468, room 4 
tail 0
chksum 0x1b
load 0x3ffe89a4, len 8, room 8 
tail 0
chksum 0xf6
csum 0xf6

second problem: running these line per line results in require error:

> file.remove("test.lua")
> assert(file.open("test.lua","w+"), "file open failed")
> file.writeline("print(1)")
> file.close()
> require("test")
error loading module 'test' from file 'test.lc':
	cannot open test.lc: 
stack traceback:
	[C]: in ?
	[C]: in function 'require'
	stdin:1: in main chunk
	[C]: in ?
	[C]: in ?
>

running as block-command all file operations is fine. reset comes only when require is being added.

compiling test.lua to test.lc is still failure:

> require("test")
stdin:1: module 'test' not found:
	no field package.preload['test']
	no file 'test.lc'
	no file 'test.lua'
stack traceback:
	[C]: in function 'require'
	stdin:1: in main chunk
	[C]: in ?
	[C]: in ?

btw i'm using d1.

so far haven't encountered other problems with this branch.

@TerryE
Copy link
Collaborator Author

TerryE commented Nov 3, 2019

@fikin Nikolay, thanks; that is brilliant. I will track these down tomorrow.

@HHHartmann
Copy link
Member

When building with

#define LUA_USE_MODULES_ENCODER

the build breaks

CC app/modules/crypto.c
crypto.c: In function 'call_encoder':
crypto.c:53:3: error: implicit declaration of function 'luaL_checktable' [-Werror=implicit-function-declaration]
   luaL_checktable(L, -1);
   ^

@HHHartmann
Copy link
Member

Hi again Terry,
this is strange behaviour:

NodeMCU 3.0.0.0 
	branch: pr_2912
	commit: 630c2903bf7561e3a9b8b539bff70b50bdf6bc98
	release: 2.0.0-master_20170202 +455
	release DTS: 
	SSL: false
	build type: float
	LFS: 0x0
	modules: adc,bit,dht,file,gpio,i2c,mqtt,net,node,ow,spi,tmr,uart,wifi
 build 2019-11-05 21:02 powered by Lua 5.3.5 on SDK 3.0.1-dev(fce080e)
cannot open init.lua: 
> 
----------------------------
mispec.lc       : 3543 bytes
mispec.lua      : 4762 bytes
mispec_file.lua : 5283 bytes
mispec_ws2812.lua : 5741 bytes
mispec_ws2812_2.lua : 4781 bytes
mispec_ws2812_effects.lua : 808 bytes
----------------------------
Total file(s)   : 6
Total size      : 24918 bytes

Total : 3486139 bytes
Used  : 27359 bytes
Remain: 3458780 bytes

> require ("mispec")
stdin:1: module 'mispec' not found:
	no field package.preload['mispec']
	no file 'mispec.lc'
	no file 'mispec.lua'
stack traceback:
	[C]: in function 'require'
	stdin:1: in main chunk
	[C]: in ?
	[C]: in ?
> =node.heap()
41184
> require ("mispec")
error loading module 'mispec' from file 'mispec.lc':
	cannot open mispec.lc: 
stack traceback:
	[C]: in ?
	[C]: in function 'require'
	stdin:1: in main chunk
	[C]: in ?
	[C]: in ?
> 

The file mispec.lua and mispec.lc exist on SPIFFS.
Note the different error messages between the two require calls.
For the content of mispec.lua please see #2953
The lc file is compiled on chip.

@TerryE
Copy link
Collaborator Author

TerryE commented Nov 8, 2019

@fikin @HHHartmann sorry for not updating on this. Job for tomorrow. I've been adding another feature requested by @jmattsson: support for the luac.cross -a option so now as well as

luac.cross -F myLFS.img -e init.lua

which will load a compiled LFS image into the luac.cross execution environment and make it available to the init.lua script (this is how I do a lot of my testing); you can now also do:

luac.cross -F myLFS.img -o myLFS-a8000.bin -a 0x402a8000

will create an ESP binary LFS based at 0x402a8000 that you can use esptool to write directly to flash or to integrate it into a provision image. I just need to add this to nodemcu-partition.

@TerryE
Copy link
Collaborator Author

TerryE commented Nov 9, 2019

I've tracked down and fixed a couple of these.

  • The require bug, is because an error in vfs/stdio abstraction. I missed a ! test. The require handling has subtly changed 5.1 to 5.3, I need to do a bit more digging. Fixing this makes require work as expected.
  • luaL_checktable() is a macro in 5.1 lauxlib.h for luaL_checktype(L, (n), LUA_TTABLE) that was dropped in 5.3. I've just substituted the check type expression in crypto.c.

Will post another update tomorrow.

@HHHartmann
Copy link
Member

You also need to replace the link in nodemcu-firmware\sdk-overrides\include\espconn.h with a file like in dev to not break windows checkouts

#ifndef _SDK_OVERRIDE_ESPCONN_H_
#define _SDK_OVERRIDE_ESPCONN_H_
// Pull in the correct lwIP header
#include "../../app/include/lwip/app/espconn.h"
#endif

@TerryE TerryE mentioned this pull request Nov 17, 2019
@HHHartmann
Copy link
Member

@TerryE Not sure about this, but did you address the lua_rawlen issue in lstrlib.c when building LUA51?

@TerryE
Copy link
Collaborator Author

TerryE commented Nov 17, 2019

did you address the lua_rawlen issue in lstrlib.c when building LUA51

Yes, I am using the Lua 5.1 variant at this moment on this ESPlorer issue.

This was referenced Nov 19, 2019
@TerryE
Copy link
Collaborator Author

TerryE commented Dec 10, 2019

Changes in 32e161b

Interface consistency across the Lua 5.1 and 5.3 versions

  • As discussed previously all firmware components other than the Lua directories are shared, so these components must use a common API interface to the two different Lua runtimes.

  • The startup and interactive interface for Lua is primarily maintained in the lua.c file. The standard distribution version has had quite a bit of restructuring moving from Lua 5.1 to 5.3. The NodeMCU version is driven by the needs of LFS startup and that the runtime executes within a non-preemptive event driven task-oriented architecture, so we can't have an interactive poll stdin loop, so our two version were similar without good reasons for the differences, so I have decided to converge our NodeMCU lua.c variants completely, so that they will now use the same source file (app/lua/lua.c will actually be a symlink to app/lua53/lua.c) with #if LUA_VERSION_NUM conditionals used where a few source changes are needed to reflect the internal runtime API introduced in 5.2 and 5.3.

  • I've had to do a little realignment to lauxlib.c etc, but with a common lua.c main module, keeping the Lua 5.1 and 5.3 versions in sync and ditto with the module libraries should be a lot more straightforward.

UART fixes

  • The bugs breaking the uploaders have been fixed. My gist test-upload.lua should be a help for @kmpm to sort out any remaining issues with nodemcu-uploader

ADC module

  • Now acquires the parameter page from the Partition table.

@HHHartmann
Copy link
Member

Thanks Terry for this commit, I hope it will fix several issues.
Let me remind you that symlinks will break checkouts under windows however. This is ugly but could we use short includes instead like

#ifndef _SDK_OVERRIDE_ESPCONN_H_
#define _SDK_OVERRIDE_ESPCONN_H_
// Pull in the correct lwIP header
#include "../../app/include/lwip/app/espconn.h"
#endif

I don't even know if the include guard is needed here. Maybe a oneliner is sufficient here.

@jmattsson
Copy link
Member

@HHHartmann You're right, for this use case we don't even need the include guard. I'd be "happy" with adopting the one-line-#include as a poor man's symlink.

@HHHartmann
Copy link
Member

@TerryE Terry I am having problems loading an LFS image.
I am using docker,
changed the build script to build LUS=53,
flashed with complete clean,
built LFS image with new luac.cross
uploaded the image (yes it works now)
Executed flashreload
which gave me this

=node.flashreload("LFS_float_20191215-2052.img")
=node.flashreload("LFS_float_20191215-2052.img")
true
> 
 ets Jan  8 2013,rst cause:2, boot mode:(3,6)

load 0x40100000, len 28820, room 16 
tail 4
chksum 0x07
load 0x3ffe8000, len 2480, room 4 
tail 12
chksum 0xf5
ho 0 tail 12 room 4
load 0x3ffe89b0, len 8, room 12 
tail 8
chksum 0xf0
csum 0xf0
‡�;ƒ›’nì�’d„˜Œ <...>

Erasing LFS from flash addr 0x08e000 to 0x0cdfff
(error object is a boolean value)
stack traceback:
	[C]: in ?
	[C]: in ?

The device hangs here until reset
I configured LFS as

#define LUA_FLASH_STORE                   0x40000

and added several modules I commonly use

@HHHartmann
Copy link
Member

HHHartmann commented Dec 18, 2019

@TerryE Terry I built from a clean clone only changing #define LUA_FLASH_STORE 0x40000 and using LUS=53.

I get the following error building luac.cross

HOSTCC /opt/nodemcu-firmware/app/lua53/host/../lundump.c
../lundump.c: In function 'addTS':
../lundump.c:462:10: warning: unused variable 'totl' [-Wunused-variable]
   size_t totl     = sizelstring(l);
          ^

There is also a symbolic link in sdk-overrides/include/espconn.h which I resolved by copying the target.

Loading an LFS image built with this build yields the same results as mentioned above.

Going back to pre 53 firmware for now.

@TerryE
Copy link
Collaborator Author

TerryE commented Dec 19, 2019

IIRC, total is used in a lua_assert() which generates the unused warning if debug isn't enabled. I can change the define for the non-debug path to generate UNUSED(e) which will suppress the warning.

The symbolic link issue is really an interoperability problem of docker on WinX using samba SMB which doesn't support symlinks across mount points. You are the expert on docker over Windows. What is the best practice workaround for this flaw in docker?

I can't debug your LFS reload because I am not psychic; I don't have your test case. Email me a Zip of your LFS code and your config so I can reproduce it. Thanks.

@HHHartmann
Copy link
Member

For the symlink please see #2912 (comment) and @jmattsson s comment thereafter.
There is also another symlink in app/lua/lua.c.
I will send an email this afternoon when I am back home.

@TerryE
Copy link
Collaborator Author

TerryE commented Dec 19, 2019

Gregor, I understand what the symlink issue is. My point is that the why is nothing directly to do with NodeMCU but rather this is a bug or functional limitation of how docker operates over Windows: if you want your source root to be a winX file tree in NTFS, then Docker uses Samba to expose this to the container. NTFS and Linux FS all support some form of symbolic linking. Samba doesn't, because the two implementations aren't fully consistent and it is difficult to map this functionality.

@HHHartmann
Copy link
Member

Terry as I recall this is an issue of git under Windows. Marcel pointed out somewhere that a plugin or an extension (not sure of the terminology) for git world have to be installed on the client machine.

See #2856 for more info.

@TerryE
Copy link
Collaborator Author

TerryE commented Dec 20, 2019

AFAIK, using git on Windows works with fixes as per the referenced issue; ditto git within a container so long as the root is within a container FS. The symlink problem arises when using git within the container with the root referencing a Samba mapped mount point. This is annoying, but as you point out we can use a #include as a workaround to avoid duplicating files unnecessarily.

@fikin
Copy link
Contributor

fikin commented Jan 13, 2020

rebase to latest encounters one compile problem, namely gregor's findings about unused "size_t totl = sizelstring(l);".
commenting it out makes compilation ok.
i haven't got any softlink problems as mentioned above (using linux docker -e LUA=53).

i'll begin experimenting with it now.

@marcelstoer , @TerryE how do you see merging it to dev branch? is there something to be helped with? for 51 vs 53 building?
imho it would be totally awesome to get it mainstream ;)

@TerryE
Copy link
Collaborator Author

TerryE commented Jan 31, 2020

@TerryE Terry I am having problems loading an LFS image.
I am using docker,
changed the build script to build LUS=53,
...
The device hangs here until reset

Gregor sorry for the illness related sabbatical. Back in the loop now. The load seems to be working, but the issue is do with an optimisation to remove a reboot. If you externally reset the ESP then so should find that the LFS has actually loaded. I will do a fix for this.

@HHHartmann
Copy link
Member

@TerryE Terry, so glad to hear that you are back in the saddle and please don't excuse yourself for being ill.
So I will give it another try this weekend. I suspect that my current project will not run with lua 5.3. So I will see what I can do about that.
Btw I am HHHartmann. "gregor" belongs to Gregor Herdmann. Guess he is wondering why he was mentioned ;-)

@TerryE
Copy link
Collaborator Author

TerryE commented Jan 31, 2020

Gregor, I tried loading ./luac.cross -f -o /tmp/lfs.img $(find lua_examples -name \*.lua) which loaded fine, other than (1) the above sticking reboot bug, and (2) I had to exclude one file which has lua 5.1 only syntax (see #3028)

This second triggers EM: errors when doing an LFS._config, until I removed the second _init module. This LFS contained some 40 modules compiling to ~130Kb LFS space.

@TerryE
Copy link
Collaborator Author

TerryE commented Feb 9, 2020

@HHHartmann, Sorry about the delay. I had a mind fart during merging the 51 and 53 variants of lua.c. The RTS uses slightly different methods of signalling LFS reload to lua.c needs to have variant code. The 5.3 restart was hanging but I've fixed this so here is an example where I load pretty much all of the Lua files in lua_examples:

>  node.flashreload'lfs.img'
> 
 ets Jan  8 2013,rst cause:2, boot mode:(3,6)
 ...
Erasing LFS from flash addr 0x096000................................. to 0x0d5fff
LFS image loaded

NodeMCU 3.0.0.0 
	branch: dev-lua53-2
	commit: 32e161b131400a0cd215fddd50733d26dd98ff93
	release: 2.0.0-master_20170202 +457
	release DTS: 201912102127
	SSL: false
	build type: float
	LFS: 0x0
	modules: adc,bit,file,gdbstub,gpio,net,node,tmr,uart,wifi
 build 2020-02-09 17:29 powered by Lua 5.3.5 on SDK 3.0.1-dev(fce080e)
cannot open init.lua: 
> =table.concat(LFS._list,' ')
dummy_strings HTTP_OTA _init lfs_fragments graphics_test sjson-streaming send_text_message play_file play_network irsend GT_cross GT_graphics_test GT_clip GT_text GT_gradient GT_box GT_pixel_and_lines GT_fonts GraphicsTest GT_color_test UcgLogo GT_triangle HelloWorld telnet_pipe telnet_fifosock simple_telnet telnet default luaOTAserver _provision _doTick check adc_rgb send_email_smtp myfile tz mqtt_file mqtt2cloud somfy mcp23008_leds mcp23008_buttons webap_toggle_pin tcp2uart make_phone_call
> =node.heap()
42536
> 

I just want to check out LFS reload for 5.1 and rebaseline against current dev so we can do the merge into dev.

@marcelstoer
Copy link
Member

Superseded by #3075 -> closing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants