Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Impacts of migrating to Lua 5.3 #2808

Closed
TerryE opened this issue Jun 21, 2019 · 13 comments
Closed

Impacts of migrating to Lua 5.3 #2808

TerryE opened this issue Jun 21, 2019 · 13 comments

Comments

@TerryE
Copy link
Collaborator

TerryE commented Jun 21, 2019

I have done a roll up of the changes in moving from Lua 5.1 (referred to as Lua51) to Lua 5.3 (Lua53). (This combines any changes introduced in Lua 5.2 and also removes the points that aren't relevant to NodeMCU implementation (e.g. os and io stuff and unimplemented math and debug components.) Broadly these changes fall into three broad categories:

  • Changes in the Lua language itself. The biggy here is the introduction of an integer subtype for numbers which makes the need to separate integer and float builds redundant.

  • Changes in the Lua libraries. These might break existing examples and user applications, but for the majority of users these won't impact their applications. In general they are pretty easy to code around. For example loadstring() functionality has been moved into load() so replacing loadstring by (loadstring or load) means the Lua code works under 5.1 and 5.3.

  • Changes in the C API. These might break existing C library modules. An example here is that all of the different signed and unsigned integer variants of the API calls have been dropped in favour of the core set which return lua_Integer. Type casting and conversion is left to the application program. So no more luaL_checkint() or luaL_checklong() only luaL_checkinteger() which returns a lua_Integer.

Clearly I will deal with all of the core stuff (the app/lua to app/lua53 directories and core NodeMCU modules such as node, file, etc.) as part of the PR.

Having done a scan of the lua_examples and lua_modules directories there are very few cases where moving to Lua5.3 will break the application, and I can put through a bulk edit so that the source is Lua53 compatible (such as the above loadstring use). However, I can't commit to testing all of these changes, but they should just work out of the box.

I can also do a similar bulk edit so that the app/modules directly compiles cleanly on both build variants, but again I just don't have the bandwidth to test every module.

So as I see it, given the reality that the other committers will spend little if any effort in supporting this upgrade, we have two alternatives:

  1. Give up now because it involves just too much effort which we can't resource.
  2. Take an optimistic migration strategy. In this one, we just make the changes after a review window, accepting the risk that module X might just exhibit a bug that we need to fix downstream. (As an example we had a previous example where the integer 0x80000000 generated different code for luac.cross and node.compile() because of these integer edge effects. We might get similar issues moving from 5.1 to 5.3.)

Before I go any further, I would like to reach a quorum consensus from the committers . I personally can't think of any in-between options and that (2) is preferable to (1).


The detailed stuff

Major Changes in the Language

  • The main difference with Lua53 is the introduction of an integer subtype for numbers. Although this change should not affect "normal" computations, some computations (mainly those that involve some kind of overflow) can give different results. You can fix these differences by forcing a number to be a float (in Lua51 all numbers were float), in particular writing constants with an ending .0 or using x = x + 0.0 to convert a variable. (This recommendation is only for a quick fix for an occasional incompatibility, as it is not a general guideline for good programming; for this, use floats where you need floats and integers where you need integers.)

  • The conversion of a float to a string now adds a .0 suffix to the result if it looks like an integer. (For instance, the float 2.0 will be printed as 2.0, not as 2.) You should always use an explicit format when you need a specific format for numbers.(Formally this is not an incompatibility, because Lua does not specify how numbers are formatted as strings, but some programs assumed a specific format.)

Minor Changes in the Language

These will directly impact few if any NodeMCU applications:

  • The concept of environment changed. Only Lua functions have environments. To set the environment of a Lua function, use the variable _ENV or the function load. C functions no longer have environments. Use an upvalue with a shared table if you need to keep shared state among several C functions. (You may use luaL_setfuncs to open a C library with all functions sharing a common upvalue.) To manipulate the "environment" of a userdata (which is now called user value), use the new functions lua_getuservalue and lua_setuservalue.

  • Lua identifiers cannot use locale-dependent letters.

  • Doing a step or a full collection in the garbage collector does not restart the collector if it has been stopped.

  • Weak tables with weak keys now perform like ephemeron tables.

  • Equality between function values has changed. Now, a function definition may not create a new value;
    it may reuse some previous value if there is no observable difference to the new function.

Changes in the Lua libraries (which might impact lua_examples, lua_modules and user applications)

  • Function module is deprecated. It is easy to set up a module with regular Lua code. Modules are not expected to set global variables.

  • Functions setfenv and getfenv were removed, because of the changes in environments.

  • Function loadstring is deprecated. Use load instead; it now accepts string arguments
    and are exactly equivalent to loadstring.

  • Function table.maxn is deprecated. Write it in Lua if you really need it.

  • Function unpack was moved into the table library and therefore must be called as table.unpack.

  • Character class %z in patterns is deprecated, as now patterns may contain '\0' as a regular character.

  • The table package.loaders was renamed package.searchers.

  • Lua does not have bytecode verification anymore. So, all functions that load code (load and loadfile) are potentially insecure when loading untrusted binary data. (Actually, those functions were already insecure because of flaws in the verification algorithm.) When in doubt, use the mode argument of those functions to restrict them to loading textual chunks.

  • The bit32 library has been deprecated. It is easy to require a compatible external library or, better yet, to replace its functions with appropriate bitwise operations. (Keep in mind that bit32 operates on 32-bit integers, while the bitwise operators in Lua5.3 operate on Lua integers, which can have 32 or 64 bits, depending on the build configuration)

  • The Table library now respects metamethods for setting and getting elements.

  • The ipairs iterator now respects metamethods and its __ipairs metamethod has been deprecated.

  • The pow functions is deprecated in the mathematical library. You can replace math.pow(x,y) with x^y.

  • The call collectgarbage("count") now returns only one result. (You can compute that second result from the fractional part of the first result.)

Changes in the API which impact NodeMCU C modules

  • Pseudoindex LUA_GLOBALSINDEX was removed. You must get the global environment from the registry.

  • Function luaL_register is deprecated. Use luaL_setfuncs so that your module does not create globals. (Modules are not expected to set global variables anymore.)

  • The osize argument to the allocation function may not be zero when creating a new block, that is, when ptr is NULL (see lua_Alloc). Use only the test ptr == NULL to check whether the block is new.

  • luaL_typerror was removed. Write your own version if you need it.

  • Functions lua_equal and lua_lessthan are deprecated. Use the new lua_compare with appropriate options instead.

  • Function lua_objlen was renamed lua_rawlen.

  • Macros to inject/project nn-default integer and unsigned types were deprecated, only the versions which return lua_Integer are supported. Any type-casting or sign checking should be done in the C implementation. (This list of functions is luaL_checkint, luaL_optint, luaL_checklong, luaL_optlong``lua_pushunsigned, lua_tounsigned, lua_tounsignedx, luaL_checkunsigned, luaL_optunsigned)

Changes in the API which currently are not used in the app/modules files.

  • Pseudoindex LUA_ENVIRONINDEX and functions lua_getfenv/lua_setfenv were removed, as C functions no longer have environments.

  • Function lua_cpcall is deprecated. You can simply push the function with lua_pushcfunction and call it with lua_pcall.

  • Function lua_dump has an extra parameter, strip. Use 0 as the value of this parameter to get the old behavior.

  • Continuation functions now receive as parameters what they needed to get through lua_getctx, so lua_getctx has been removed. Adapt your code accordingly.

  • Function lua_load has an extra parameter, mode. Pass NULL to simulate the old behavior.

  • Function lua_resume has an extra parameter, from. Pass NULL or the thread doing the call.

  • Finalizers (__gc metamethods) for userdata are called in the reverse order that they were marked for finalization, not that they were created. (Most userdata are marked immediately after they are created.) Moreover, if the metatable does not have a __gc field when set, the finalizer will not be called, even if it is set later.

  • Functions to inject/project unsigned integers () were deprecated. Use their signed equivalents with a type cast.

@TerryE
Copy link
Collaborator Author

TerryE commented Jun 21, 2019

As a codicil, the above "Detailed Stuff" is extracted from the official Lua project documentation. There are more impacts because Lua53 subsumes many of the eLua patches and so we need to make other changes to mask these. For example eLua has two function types: one for Lua and standard C functions and a new one for lightweight C functions. Lua53 just treats them all as functions though you can probe the subtype if you really need to, so I need to change common expressions like (lua_isfunction(L, n) || lua_islightfunction(L, n)) to (lua_isanyfunction(L, n)) where the "any" variety is a compatibility macro to generate the variant code needed for the 51 and 53 builds. It is this type of source code change that dominates the edits needed, just so a common module source file will compile to both runtimes.

@jmattsson
Copy link
Member

+2 :D

Thanks heaps for all your looking into and summarising of this, Terry! There's no doubt that this is a significant upgrade, but I do think it's something we should do. My recommendation would be to leave a lua-5.1-final branch around, like we did for the early SDK 1.5.4 branch, so if any users for whatever reason can't get on board with 5.3, they have a branch they can keep using, with the caveat that mainline development is on 5.3 only.

I do have one question on the above - you say that "modules are not expected to set global variables anymore". But how do we register our modules into the global scope then? I mean, node.heap() should still be node.heap(), not _G.node.heap(), right?

@TerryE
Copy link
Collaborator Author

TerryE commented Jun 22, 2019

@jmattsson, I've done a series of related issues labelled Lua 5.3, covering various aspects of this.

  • With the early releases of Lua53, I am adding this as a build variant, and the top level app/Makefile will select one of app/lua or app/lua53 to compile in depending on a make parameter. All of the other directories will be common to both variants with define macros and the occasional #ifdef logic handling any compilation differences. So we won't have separate branches for the two Lua versions. At some point in the future we might deprecate then later drop 5.1 support and the app/lua directory. We can use the same process with the ESP branch.

  • The point on globals is the Lua project's words, not mine. They are trying to discourage models setting globals as a side-effect, but they can't really stop it as the global table is still accessible indirectly. Our use of a ROM table and its being mapped in through the global table's __index is still unchanged.

@devsaurus
Copy link
Member

Certainly 👍 if the Lua experts are aligned :)

Do you envision a gradual migration of our modules to the Lua53 concepts or one single big change?

@TerryE
Copy link
Collaborator Author

TerryE commented Jun 22, 2019

@devsaurus, we will need to have all the modules patched before I release the first Lua53 PR, but the encapsulation will make them backwards compatible. I think it sensible to do this in 2 or 3 batches:

  • The core set: node, file, etc. We can do the normal committer review and a couple of weeks dev soak to through up any issues.
  • The bulk of the remainder as a batch change.
  • Possibly having a 3rd small group where we need to had extra conditional logic because of internal funnies. This last group might not be needed.

When this is all complete, all directories other than the app/lua and app/lua53 directories will be common to both variants.

One other goal that I would like to consider and implement if possible is making the module source ESP32 / ESP 8266 agnostic, except of course where the are architectural reasons why the module won't v work on one variant.

@marcelstoer
Copy link
Member

Thank you Terry for all the details and for devising a plan!

that (2) is preferable to (1)

👍

My only request is that none of this lands on dev until the next "release" is out the door.

@galjonsfigur
Copy link
Member

galjonsfigur commented Jun 22, 2019

Big 👍

About lua_modules and lua_examples - I'm slowly working on them to make them pass the luacheck test + testing the examples themselves + adding more clear comments etc. and making them compatible with both Lua 5.1 and 5.3 should be easy. There is a rather small list of things to consider (at least when looking at existing Lua code in this repository):

  • unpack and table.unpack: this can be solved using something like local unpack = table.unpack or unpack
  • bit operators: NodeMCU used own bit module so there is no worries about that - it should be easy to port it to Lua 5.3 like any other module
  • patterns: when I was checking Lua examples I stumbled upon read_email_imap.lua file that uses pattern incompatible with Lua 5.3 but compatible with 5.1 - I suppose that shouldn't be a big issue because in all Lua code this is the only file with this problem.
  • and probably something else that @TerryE mentioned in first post

Also - I think that it would be a good idea to only make setting for precision for floating numbers (double or single) in builds and support that instead of changing both floating numbers and integers to 64 bit. If someone wants to use 64bit integers another setting in user_config.h could be added for more advanced users.

@TerryE
Copy link
Collaborator Author

TerryE commented Jun 22, 2019

My only request is that none of this lands on dev until the next "release" is out the door.

Don't worry Marcel, this one is sort of a biggy so I agree entirely that it needs to be coordinated carefully will dev -> master milestones.

@TerryE
Copy link
Collaborator Author

TerryE commented Jun 22, 2019

Hi @galjonsfigur, I think that you've earned enough karma to have a say here. 😄

I am going to take a more relaxed view on compatibility issues when it comes to our Lua IoT developers, simply because we are potentially going to bugger up a lot more production code out there if we don't compared to the libraries where (in the end) I can do the bulk edit myself with less effort than trying to engage the original contributors to get up to speed and do the heavy lifting.

Lua 5.3 has various compatibility settings which enable thing like the top level unpack, and I just don't think that we should go anal and bugger around our developer community without good reason,

Thanks for the heads-up on patterns, but this is one where I really don't want to hybridise 5.3, so we should just accept the hit and appologise to our users.

As you correctly mention the issue of precision is going to prove a PITA, IMO. I think that we are going to have to support both 32-bit and 64-bit build variants. That is with 32 bit integers and floats or 64 bit integers and floats. The ESP32 xtensa cell has 32 bit floating point in the hardware, but the ESP8266 has a lot of the standard double library routines in the bootROM code, so ironically 64-bit FP will probably run faster on the ESP8266 than 32-bit. The simplest from my PoV would be 32-bit integers and 64-bit FP, but this is really the worst of both worlds.

Anyway please track all of my issues, PRs and work on this and I would really value your feedback.

@galjonsfigur
Copy link
Member

Thanks for opinion!

I just checked performance impact when using double over single precision floating numbers and the results: (Just simple modification to Arduino sketch so this is not really a great example)

float:
time for 1 mio. plus calculations in s: 0.65
time for 1 mio. minus calculations in s: 0.76
time for 1 mio. multiplications in s: 1.25
time for 1 mio. divisions in s: 3.44

double:
time for 1 mio. plus calculations in s: 0.89
time for 1 mio. minus calculations in s: 1.01
time for 1 mio. multiplications in s: 2.45
time for 1 mio. divisions in s: 9.83

Those are more indicators than benchmark but I think there will be a performance penalty when using 64bit floating point numbers, at least on ESP8266. My point it that I don't think end user would have to worry about what to set as it's now with float or integer build. And AFAIK Lua 5.3 allows to make a combinations of 32bit int and float, 64bit int and float, 32bit int and double or 64bit int and double. Those settings can be exposed somewhere in user_config.h, but supporting all of the combinations could be an issue. For floating point there is only 1 module that would have to use lua_Number (or preporcessor logic for single or double precision): hdc1080 - maybe there is more and I overlooked those.

I just don't see a reason to bother with 64 vs 32 bit integer values. I think I would be a good idea to go for 32 bit by default and just leave option for floating point numbers when some application would need better precision in Lua script, because as for C side of things, sensor readings are processed using double precision floating point numbers (for example bme280).

I will try to write something constructive on other isuses related to Lua 5.3, but I'm tracking them and it's a wonderful work!

@TerryE
Copy link
Collaborator Author

TerryE commented Jun 22, 2019

In my experience -- for what its worth -- apps spend so little time in true floating point computation that the runtime double vs single is pretty irrelevant. (We spend for more time in the Lua GC for example.) The main issues relate to getting the built to compile and execute cleanly. RAM use is an issue and the fundamental storage element, the TValue, is 8 bytes long for 32-bit builds and 12 bytes for 64 bit builds.

@TerryE
Copy link
Collaborator Author

TerryE commented Jul 29, 2019

I now have a baseline for app/modules compiling clean against app/lua53 so I think that we have broken the back of this issue. The aim of this issue was to give the committers a heads-up and get committer buy-in. I feel that we have achieved these objectives.

lessons learnt from the integration of eLua features into core Lua

I don't think unfair to describe the the eLua patch as a hack. The eLua Project had a set of goals to achieve within limited resource and objectives; they achieved their goals and the initial versions NodeMCU depended heavily on this patch. However, the coding style and techniques used were quite different to the approach of the core Lua team when it backported a lot of this functionality into Lua 5.3.

An example of this is in the approach to add support for readonly tables and light function references.

  • The eLua approach minimised the patch complexity by adding new C and header files, and exposing new Lua types for these; varisnt code was embedded in macros that result in fairly complex runtime conditionals being generated at all levels including the application coding, for example as in the ~50 tests of the following form in our C modules:
if (lua_type(L, 2) == LUA_TFUNCTION || lua_type(L, 2) == LUA_TLIGHTFUNCTION) {
  • The Lua approach is to add subtypes to handle such complexities and move the handling of subtypes as far down the module hierarchy as possible; the subtypes aren't even exposed at an application API level. So the Lua way of coding this test is simply:
if ( lua_isfunction(L,2)) {

On reflecting how the core team implemented these changes, I feel that their approach is more sound and better aligned to the rest of the Lua architecture: the code is simpler and runs faster.

So for example I have to ask myself "why do we have or need light functions and expose them at the C API level?" And the answer is that there is no good reason, IMO. Getting rid of them in our Lua code base will simplify code modules in a lua version independent way. See #2866 for further discussion.

@TerryE
Copy link
Collaborator Author

TerryE commented Jul 4, 2020

I think that I've addressed all of the open issues here through the recent PRs, so I am closing this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

8 participants