
Memory leak in Valetudo? #39

Closed
dugite-code opened this issue Oct 30, 2018 · 15 comments
Labels: bug Something isn't working

Comments

@dugite-code commented Oct 30, 2018

I discovered a small issue when using Home-Assistant to poll the map too often (every 10 seconds, to get a sort of live map). The /tmp directory filled up and the vacuum locked up. A reboot obviously fixes the issue.

I had done other things on the vacuum in the past so I might have had less space available than normal.

A quick work-around was to throw this into cron (it keeps only the newest map file in /tmp/maps):
*/1 * * * * cd /tmp/maps && ls -t /tmp/maps | tail -n +2 | xargs rm --

@dugite-code (Author) commented Nov 5, 2018

Actually, after looking into the matter further, it looks like it might be a low-memory issue rather than the /tmp partition filling up. I've added a swap file on /mnt/data to see if that solves my issues.

As you can see Valetudo is taking up 76% of the memory:
[screenshot: process list showing Valetudo at 76% memory]

After restart:
[screenshot: memory usage after restart]

I have also added */30 * * * * /usr/sbin/service valetudo restart to my crontab

dugite-code changed the title from "Over-polling map" to "Memory leak in Valetudo?" on Nov 5, 2018
@Hypfer (Owner) commented Nov 5, 2018

After looking at the code for 20 minutes, I still have no idea what could be responsible for the memory leak.

However, I've noticed that on every request the robot + charger images are read from flash for no reason, so I guess that whole part will be rewritten at some point.
Could you provide a specification for this interface? What endpoints are there? What does Home-Assistant expect?
I'm not using those home automation frameworks, nor did I write said code, so I don't have the slightest idea. 🙃

Hypfer added the bug label on Nov 5, 2018
@dugite-code (Author) commented Nov 6, 2018

Essentially, all HA is doing is calling YOUR.VACUUM.ROBOT.IP/api/remote/map and then pulling the image via a GET request on the mapsrc from the JSON response (using the Python requests library).

The only real difference between HA and using curl is the rate of the requests: normally HA will do this every 30 seconds, and I tuned it up to every 10 seconds. I wouldn't think that would cause an issue.

Setting drawRobot and drawCharger to false by calling YOUR.VACUUM.ROBOT.IP/api/remote/map?drawRobot=false&drawCharger=false&scale=5&border=3&doCropping=true&drawPath=true looks like it might fix the issue. When drawing the robot and charger, memory was rising above 25% after 3 minutes of run time; when not drawing them, it's bouncing between 16% and 20% after 5 minutes of run time.

I'll run this for a while without restarting and let you know if the memory usage stays at the 20% mark. Update: after an hour it's back at the 56% mark.

As a side note: having a further look into the HA settings, I've enabled limit_refetch_to_url_change: True. This should reduce the fetching of the map image a little.
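
For illustration, a minimal Python sketch of that polling pattern (this is not HA's actual code; the host placeholder and the assumption that mapsrc is a path relative to the robot's web server are mine):

```python
# Minimal sketch of the polling loop described above.
import time
import requests

ROBOT = "http://YOUR.VACUUM.ROBOT.IP"  # placeholder host

def fetch_map():
    # Trigger map generation and read the JSON metadata
    meta = requests.get(ROBOT + "/api/remote/map", timeout=10).json()
    # Pull the rendered map image referenced by "mapsrc" (assumed relative path)
    return requests.get(ROBOT + meta["mapsrc"], timeout=10).content

while True:
    png = fetch_map()   # PNG bytes, e.g. for a camera entity
    time.sleep(10)      # HA's default interval is 30 s; 10 s was used here
```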

@dugite-code (Author)

After looking into it further, it's strange that I am seeing the issue; after all, the upstart script has an OOM score of 1000, so it should in theory handle itself. I am using an older firmware (v11_003194), so perhaps that is the issue.

@Hypfer (Owner) commented Dec 23, 2018

Is this still happening? Did you try remote debugging and taking a heap dump?

@dugite-code (Author) commented Dec 25, 2018 via email

@axel-kah

I just rooted my rockrobo v1 with the latest firmwarebuilder (FW v11_003194, --disable-xiaomi), dummycloud_0.1, and Valetudo 0.9 via rrcc. I really like the newly gained power over my vacuum, but I observed the same behaviour as @dugite-code. Basically, every map-related call increases the memory footprint of the Valetudo process.

I did some measurements in a notebook to stress Valetudo by doing the same API call 1000 times in a row. requests was used to interact with the Valetudo API, and the output of ps (via paramiko) for the process info. When triggering a new map with api/remote/map and requesting the PNG, the memory footprint steadily increases to up to 60% before the process terminates.
[graph: measurements_trigger_and_read_n_times_2018_12_29_144840]

Simply fetching the same PNG 1000 times also steadily increases the footprint, but only by 0.5 percentage points.
[graph: measurements_trigger_once_read_n_times_2018_12_29_145514]

Doing the same for api/map/latest also increases the footprint somewhat.
[graph: measurements_fetch_latest_n_times_2018_12_29_155024]
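
A rough sketch of the kind of stress loop used for these measurements (the IP, SSH credentials, exact ps invocation, and mapsrc handling are assumptions, not the original notebook):

```python
# Hit the map endpoint N times and sample Valetudo's memory share via `ps` over SSH.
import requests
import paramiko

ROBOT_IP = "192.168.1.50"            # placeholder
BASE = "http://" + ROBOT_IP
N = 1000

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect(ROBOT_IP, username="root")   # credentials are an assumption

mem_percent = []
for _ in range(N):
    meta = requests.get(BASE + "/api/remote/map", timeout=30).json()  # trigger a new map
    requests.get(BASE + meta["mapsrc"], timeout=30)                   # fetch the PNG

    # %MEM of the valetudo process as reported by ps (flags may differ on the robot)
    _, stdout, _ = ssh.exec_command("ps -eo pmem,comm | grep valetudo")
    line = stdout.read().decode().strip()
    if line:
        mem_percent.append(float(line.split()[0]))

ssh.close()
# mem_percent can then be plotted against the request count, as in the graphs above.
```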

I don't have any experience with node.js apps, but I'll try to use the techniques described here to get a clue as to why garbage collection fails: https://www.nearform.com/blog/how-to-self-detect-a-memory-leak-in-node/

@axel-kah

Tracing the memory leak is a lot harder than I thought. Stumbling blocks:

Debugging on the robot:

  • the binary output of pkg does not support the --inspect flag
  • passing the --inspect flag to pkg --options does not work
  • using memwatch-sigusr2 (depends on memwatch-next) requires cross-compilation (didn't want to get into that)

Debugging on the host:

  • usage of the VAC_MAP_TEST env variable was not self-explanatory and I could not find more documentation. What kind of file do you have to provide? I only found out it's not a rendered PNG :)
  • Webserver.js needed slight adaptation to use custom paths for the rrlog files (copied from the robot) and the temp folder. This effectively mocked the robot's filesystem and allows using the regular code paths that create PNG maps from log files (win!).

Using --inspect was finally possible! Using Chromium's debugger for stepping/breaking worked, and taking heap snapshots did as well. I took several snapshots, requesting 100 maps in between. Unfortunately, comparing those snapshots is also not as straightforward as I thought. I can see that the heap grows and creates arrays almost linearly with the request count, but when trying to attribute that to object class names, function names, or anything recognizable from the source, I failed, because it's all just very generic names in the heap detail view.

Do the devs have any input for me on how to debug this properly, and what to look for?

@Hypfer (Owner) commented Dec 30, 2018

Thanks for looking into that and providing these graphs @axel-kah!
Sadly, I can't help much with that. I'd still love to just drop local map generation altogether, which should definitely fix this issue as well :^)

This depends on #66

@Hypfer (Owner) commented Mar 23, 2019

I guess this should be fixed with https://github.com/Hypfer/Valetudo/releases/tag/0.2.2

@Hypfer (Owner) commented Mar 23, 2019

Nope. Still an issue.

Although it seems like Valetudo is not the only software experiencing it
jimp-dev/jimp#153

@Hypfer (Owner) commented Mar 23, 2019

Doesn't happen with Node 8 it seems.

@Hypfer (Owner) commented Mar 23, 2019

Testing Node 11.12 shows no signs of the leak, so I guess nodejs/node#23862 is the culprit.

Fixed in Node 11.10
nodejs/node#25993

Hypfer closed this as completed Mar 23, 2019
@dugite-code (Author) commented Mar 25, 2019

Awesome, with the map via MQTT it looks like it's now working as expected.

Update: frustratingly, after re-enabling Valetudo it again caused the vacuum to unprovision at the 4 am reboot. I am certain it's not the memory leak, though: unlike before, where the vacuum locked up due to low memory, it was just unprovisioned. After testing this I might set up the private provisioning from dustcloud, as with MQTT I don't really need miio any more.

@lance36 commented Mar 25, 2019

Great work!

github-actions bot locked as resolved and limited conversation to collaborators Jan 20, 2022