High CPU load on Pixhawk 4 (fmu-v5) #11604
Comments
I did a bisect on master, and the commit that causes this issue is b40f8d5, which disabled the d-cache on the F7.
@davids5 is there a way to revert the patch if we add additional safeguards?
@DanielePettenuzzo - can you provide more information: parameters, top running while armed, etc.? @LorenzMeier - IIRC the original testing indicated a 10-15% increase, so we should look for the root cause, which may be something we can mitigate.
@davids5 yes I usually run |
My bench Pixhawk is powered with the power module and only connected to a companion computer with an FTDI on TELEM2. Nothing else is plugged into the Pixhawk.
What is the data rate on TELEM2?
The following is the mavlink configuration on TELEM2: Moreover, we not only have the regular onboard stream but also enable log streaming to the companion computer on this serial port. This telemetry port alone gives us a ~10% CPU increase.
I am assuming there is very little data going to the FMU. Is that a correct assumption? The baud rate translates to 6.67 us per char, and the data rate is 10 us/char. I measured the F4 in the past and it was a ~2 us code path to send a char, but SPI interrupts would delay that and limit the bandwidth. The F7 differs from the F4 because of the d-cache on the rx buffers, so I am not clear how this would be affected when it is off. I am wondering if flow control is activated and the load increase is related to the data that is accessed to manage flow control. To see if this is affecting it, we should measure the load under 3 conditions. Please try the following: Break out the CTS line into the FMU. Leave it connected as normal then once the load has peaked.
Is there any effect on the load? Do a measurement at 1 Mbps and see what the load is.
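The per-character timing quoted in this comment can be reproduced with a short calculation. This is a minimal sketch, assuming 8N1 framing (10 bits on the wire per character). The 1.5 Mbaud value is inferred from the 6.67 us figure and is not stated in the thread, and the function names are hypothetical.

```python
# Assumption: 8N1 framing, i.e. 1 start bit + 8 data bits + 1 stop bit.
BITS_PER_CHAR_8N1 = 10


def char_time_us(baud: int, bits_per_char: int = BITS_PER_CHAR_8N1) -> float:
    """Time one character occupies on the wire, in microseconds."""
    return bits_per_char / baud * 1e6


def link_utilization(char_interval_us: float, baud: int) -> float:
    """Fraction of line capacity used when one character is sent
    every char_interval_us microseconds on average."""
    return char_time_us(baud) / char_interval_us


if __name__ == "__main__":
    # 1.5 Mbaud is an inferred value; 1 Mbaud is the rate suggested for the test.
    for baud in (1_000_000, 1_500_000):
        print(
            f"{baud} baud: {char_time_us(baud):.2f} us/char, "
            f"utilization at 10 us/char: {link_utilization(10.0, baud):.0%}"
        )
```

Under these assumptions, a stream averaging 10 us/char leaves some headroom at 1.5 Mbaud but would fully occupy the line at 1 Mbaud, which may be why a measurement at that rate is informative.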
@DanielePettenuzzo are all 3 mavlink instances being used? Convincing ourselves the d-cache is safe to re-enable might take some time. As a quick workaround, simply dropping unneeded streams from each mavlink instance will help. If you're able to fully replicate this on the bench capturing
According to the log flow control wasn't active for the first mavlink instance ( Here's a plot of cpuload (green), mavlink 0 tx rate (red), and navigation state (light blue). We should take a closer look at what the position controller (and/or navigator) are doing in the DESCEND nav state that requires nearly 20% of the cpu. |
I'm not suggesting this as a direct fix to the issue, but #11176 will help reduce the mavlink cpu load. |
@davids5 we are not using flow control on the FTDI link to the companion computer. I tried what you asked for anyway: when the vehicle is disarmed the load is about 46%, when I arm it the load goes up to 90%, and when I connect CTS to either GND or 3.3V it goes up to 98%.
Yesterday we had a crash because all motors shut down. The following is a log of the crash: https://review.px4.io/plot_app?log=16f466ec-f02f-4e11-8133-92a7571c7efa I don't know if it's related to the cpu issue or not |
@DanielePettenuzzo this log is much worse with several oddities. Can you share more logs from this vehicle? Missing sensor messages (not dropped) and estimator time slip throughout the entire log. Only a preflight top was captured, but the cpu usage is quite unusual. In particular cpu usage of these high priority processes is 2-3x worse than normal.
I2C incredibly slow (not the culprit, but should be fixed)
I have reviewed the system's ISR timing. I did not have the Lidarlite on I2C to test. There are no issues with the driver reentering (interrupt storms). -common_isr (arm) The effects of the d-cache on the tx portion of the ISR are significant. The time in the ISR increases to ~2.5 times with the cache off. This starves the CPU.
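To see why a 2.5x longer ISR can starve the CPU, a rough back-of-the-envelope estimate helps. This is only a sketch under stated assumptions: one tx interrupt per character, the ~2 us per-char code path and 10 us/char data rate quoted earlier in the thread (the 2 us figure was measured on the F4, not the F7), and a hypothetical helper name. It is not a measurement of this system.

```python
def isr_cpu_fraction(isr_time_us: float, char_interval_us: float) -> float:
    """Fraction of CPU time spent in the ISR, assuming one interrupt
    per character and one character every char_interval_us microseconds."""
    return isr_time_us / char_interval_us


if __name__ == "__main__":
    base_isr_us = 2.0        # assumption: F4 figure quoted in the thread
    slowdown = 2.5           # ISR time ratio reported with the d-cache off
    char_interval_us = 10.0  # data rate quoted in the thread

    cache_on = isr_cpu_fraction(base_isr_us, char_interval_us)
    cache_off = isr_cpu_fraction(base_isr_us * slowdown, char_interval_us)
    print(f"estimated ISR load, cache on:  {cache_on:.0%}")
    print(f"estimated ISR load, cache off: {cache_off:.0%}")
```

With these illustrative inputs the serial ISR alone would go from roughly a fifth of the CPU to about half, which is the same order of magnitude as the load increase reported in this issue.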
I tried opening MAVLink on TELEM2 for RC telemetry, and the CPU load is around 60%. Here is the log:
This will be fixed by #11769 once it is merged.
After flashing master I noticed very high CPU loads when arming the vehicle. Depending on my parameters, the CPU load ranges between 85 and 100%. Even when disarmed, the CPU load is much higher than usual.
Log: https://review.px4.io/plot_app?log=6d8f3085-a624-4cbb-8974-5ae992ba51d7
This log is from my bench setup with no external SPI or I2C sensors.
FYI @LorenzMeier @dagar @davids5 @bkueng