Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AP_Scripting: allow expansion of memory for scripting at runtime #26517

Open
wants to merge 21 commits into
base: master
Choose a base branch
from

Conversation

tridge
Copy link
Contributor

@tridge tridge commented Mar 15, 2024

This makes it possible to expand the MultiHeap used by scripting at runtime. This means if the user underestimates the right value for SCR_HEAP_SIZE and they have free memory (which they usually do on a H7) then it will allocate more memory with a warning
Note that:

  • it will only expand the heaps if armed. If disarmed the user must fix their SCR_HEAP_SIZE
  • expanding can be disabled with a SCR_DEBUG_OPTS option

This PR also cleans up the handling of scripting debug options

The motivation for this PR is there is really no way to predict in advance how much memory a set of scripts will need, so users may under-estimate and the scripts fail at a critical time

This PR also fixes an issue where realloc may fail if there is no room in the current sub-heap. We now always alloc/copy/ree. ChibiOS was doing this within a heap already, so this doesn't cost any more, but means it fixes a path for script failure

This was test flown on a Pixhawk6X

update: moved the allocation code to a separate AP_MultiHeap library, removed the hal.util functions. Tested in sim-on-hw and SITL

@IamPete1
Copy link
Member

My only real concern is that the user won't notice. I think we should call the error handler in scripting so it will make a fuss and print the warning lots and will trigger a pre-arm. If we do that I think its fine to allow it to run when disarmed too. Currently just running out of memory only tells you that there is not enough, I think this method would be able to tell how much is needed. Ideally the users gets this error then puts the value into the heap size param and then never sees it again.

@tridge
Copy link
Contributor Author

tridge commented Mar 18, 2024

I think we should call the error handler in scripting so it will make a fuss and print the warning lots and will trigger a pre-arm

that would make their script fail, which loses the main reason for this PR which is to handle cases where a less common path in a script (eg. a failsafe handling) takes you over your memory limit and a critical script fails
we could make it an InternalError, but I suspect @peterbarker would object as it is really not an internal error

@IamPete1
Copy link
Member

that would make their script fail, which loses the main reason for this PR which is to handle cases where a less common path in a script (eg. a failsafe handling) takes you over your memory limit and a critical script fails

Sorry, bad wording, I didn't mean you actually error out, just call the set_and_print_new_error_message method. That is what will reprint the error a lot and cause a pre-arm. Ideally with a message something like "need SCR_HEAP_SIZE > x"

@Hwurzburg Hwurzburg added the WikiNeeded needs wiki update label Mar 18, 2024
@timtuxworth
Copy link
Contributor

When this says "makes it possible", what does that mean? From the dev call discussion it sounded like the RAM will be automatically allocated if needed when armed, which actually makes sense, but that's a bit more proactive than "makes it possible". Could you make it a bit clearer what's actually happening?

Copy link
Contributor

@tpwrules tpwrules left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some notes to make things cleaner.

libraries/AP_Common/MultiHeap.h Outdated Show resolved Hide resolved
libraries/AP_Scripting/lua_scripts.cpp Show resolved Hide resolved
libraries/AP_Common/MultiHeap.cpp Outdated Show resolved Hide resolved
libraries/AP_Common/MultiHeap.cpp Outdated Show resolved Hide resolved
libraries/AP_Common/MultiHeap.cpp Outdated Show resolved Hide resolved
libraries/AP_Common/MultiHeap.cpp Outdated Show resolved Hide resolved
@tridge tridge mentioned this pull request Nov 16, 2024
7 tasks
@tridge tridge force-pushed the pr-lua-malloc-expand branch 2 times, most recently from 020e939 to fc769f8 Compare November 17, 2024 05:26
@tridge
Copy link
Contributor Author

tridge commented Nov 17, 2024

Sorry, bad wording, I didn't mean you actually error out, just call the set_and_print_new_error_message method.

I've now added that in, message like this:

Required SCR_HEAP_SIZE over 34300

@tridge tridge force-pushed the pr-lua-malloc-expand branch 4 times, most recently from 11f2a13 to 265df41 Compare November 17, 2024 07:33
Copy link
Contributor

@tpwrules tpwrules left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested only by inspection, but this looks like a good set of changes to correct the problem for 4.6. I don't think it's worth digging into the Lua interpreter for now but fixing the HAL now will let us clean up and iterate during the 4.7 dev cycle while still fixing the problem for 4.6.

Also don't forget the test suite!

libraries/AP_MultiHeap/MultiHeap_chibios.cpp Show resolved Hide resolved
libraries/AP_MultiHeap/AP_MultiHeap.cpp Outdated Show resolved Hide resolved
libraries/AP_MultiHeap/MultiHeap_malloc.cpp Show resolved Hide resolved
libraries/AP_MultiHeap/MultiHeap_malloc.cpp Show resolved Hide resolved
libraries/AP_MultiHeap/AP_MultiHeap.cpp Show resolved Hide resolved
libraries/AP_MultiHeap/AP_MultiHeap.cpp Outdated Show resolved Hide resolved
@tpwrules
Copy link
Contributor

tpwrules commented Nov 18, 2024

Tested the aerobatics script on SITL and SimOnHW and this does seem to work as advertised.

Unfortunately bumping the heap size up to what the warning reports is not a guarantee that you won't get a bigger warning next time presumably due to allocation patterns but we aren't running into flight controller memory issues yet so it should be good for now until scripts get even more complex.

I think the warning user experience is decent, though having it repeat so many times in flight for a less critical thing is annoying. Maybe it should be downgraded to info? The point is that you don't have to stop the flight, but it will automatically become a prearm fail and an error upon landing no matter the in-flight severity IIUC.

I don't take any position on if this deserves a backport, but I think this is a good set of changes for doing such if desired. Simplifying/reducing it more will make backporting awful.

@tpwrules
Copy link
Contributor

tpwrules commented Nov 18, 2024

This makes things worse on ESP32S3, I can't get scripting to allocate at all though I'm not really sure why and it (barely) works on master... It's not a huge deal but disappointing.

Have you given it a try on QURT?

make functions for lua heap allocation suitable for use in all
non-ChibiOS HALs
this allows for new heaps to be added at runtime for lua scripting if
you run out of memory while armed
Copy link
Contributor

@tpwrules tpwrules left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, ESP32 can be fixed later.

Still need buy-in on the backport.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

6 participants