
mavlink send helpers refactor to prevent writing partial messages #12967

Merged
merged 2 commits into master from pr-mavlink_write
Apr 15, 2020

Conversation

dagar
Member

@dagar dagar commented Sep 16, 2019

Possible quick work around for #12957.

This pull request rearranges the mavlink helpers slightly to shift the buffer, should_transmit, and error count updates into the single final mavlink_end_uart_send call rather than each mavlink_send_uart_bytes call (invoked 3 times per mavlink message send).

This is accomplished by unifying some of the uart/udp logic differences. Each write (mavlink_send_uart_bytes) goes into a buffer, then that buffer is written out once (mavlink_end_uart_send). This should prevent broken mavlink messages from going out over the wire, and will also be slightly more efficient by minimizing work for each send.

TODO:

  • move the get_free_tx_buf() check to mavlink_start_uart_send, store the result and skip mavlink_send_uart_bytes if there's not enough buffer space left.
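The buffered approach described above can be sketched roughly as follows. This is a minimal stand-in, not the actual PX4 code: the real helpers write to a UART/UDP device and also update flow control and error counters, while here the "wire" is a std::vector so the behaviour is easy to check in isolation. The Channel struct is invented for the sketch; MAVLINK_MAX_PACKET_LEN is the real MAVLink v2 maximum.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Illustrative sketch only, not the actual PX4 implementation.
static constexpr size_t MAVLINK_MAX_PACKET_LEN = 280;

struct Channel {
	uint8_t buf[MAVLINK_MAX_PACKET_LEN];
	size_t buf_fill = 0;
	std::vector<uint8_t> wire; // stand-in for the UART/UDP device
};

// Each partial write (header, payload, checksum) only appends to the
// per-instance buffer; nothing touches the wire yet.
void mavlink_send_uart_bytes(Channel &ch, const uint8_t *bytes, size_t len)
{
	if (ch.buf_fill + len <= sizeof(ch.buf)) {
		std::memcpy(&ch.buf[ch.buf_fill], bytes, len);
		ch.buf_fill += len;
	}
}

// The single final call flushes the complete message in one write, so a
// partially assembled message can never go out over the wire.
void mavlink_end_uart_send(Channel &ch)
{
	ch.wire.insert(ch.wire.end(), ch.buf, ch.buf + ch.buf_fill);
	ch.buf_fill = 0; // ready for the next message
}
```

With this shape the three per-message _mavlink_send_uart calls become three memcpys into the staging buffer, and exactly one device write happens per message.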

@dagar dagar requested review from jkflying, MaEtUgR and a team September 16, 2019 19:14
@dagar dagar added this to the Release v1.10.0 milestone Sep 16, 2019
@dagar dagar changed the title [WIP]: mavlink write hacks mavlink send helpers refactor to prevent writing partial messages Sep 17, 2019
@dagar dagar marked this pull request as ready for review September 17, 2019 00:02
@dagar dagar mentioned this pull request Sep 17, 2019
Contributor

@julianoes julianoes left a comment

Can you also please copy the PR description in to the commit? (Or squash it so it goes into the git history.)

src/modules/mavlink/mavlink_main.cpp (resolved)
src/modules/mavlink/mavlink_main.cpp (outdated, resolved)
src/modules/mavlink/mavlink_main.cpp (outdated, resolved)
src/modules/mavlink/mavlink_main.cpp (outdated, resolved)
src/modules/mavlink/mavlink_main.h (outdated, resolved)
@dagar
Member Author

dagar commented Sep 17, 2019

For context, this is the imposed mavlink library helper structure we're dancing around.

https://github.com/mavlink/c_library_v2/blob/ac40c0329e88b70ae5db4c1467ed5853d305af54/mavlink_helpers.h#L352-L359

	MAVLINK_START_UART_SEND(chan, header_len + 3 + (uint16_t)length + (uint16_t)signature_len);
	_mavlink_send_uart(chan, (const char *)buf, header_len+1);
	_mavlink_send_uart(chan, packet, length);
	_mavlink_send_uart(chan, (const char *)ck, 2);
	if (signature_len != 0) {
		_mavlink_send_uart(chan, (const char *)signature, signature_len);
	}
	MAVLINK_END_UART_SEND(chan, header_len + 3 + (uint16_t)length + (uint16_t)signature_len);

@dagar
Member Author

dagar commented Sep 17, 2019

Can you also please copy the PR description in to the commit? (Or squash it so it goes into the git history.)

One of the reasons I like "Squash and merge" is we get one last chance to clean it up.

@dagar dagar requested a review from julianoes September 17, 2019 15:06
@jorge789

jorge789 commented Sep 17, 2019

Tested on CUAV+ V5:
Modes Tested

Position Mode: Good.
Altitude Mode: Good.
Stabilized Mode: Good.
Mission Plan Mode (Automated): Good.
RTL: Good.

- Procedure
Arm and take off in position mode. After flying for approximately one minute, switch to altitude and then stabilized mode, then proceed to mission plan mode and make sure the vehicle follows all waypoints as shown in QGC. Once all waypoints are completed, activate RTL and observe the landing behaviour.

Notes:
No issues noted, good flight in general.

Log:
https://review.px4.io/plot_app?log=25cc3296-2428-41dc-a6a4-04e1c6251f7c

@dannyfpv

dannyfpv commented Sep 17, 2019

Tested on Pixhawk 4 v5 f-450

Mode tested:
mission mode
altitude mode
no issues, smooth flight

log:
https://review.px4.io/plot_app?log=77ed5ecb-c1fc-4b66-be47-4007dc83640d

Tested on Pixhawk4 mini v5 f-450

Mode tested:
mission mode
altitude mode
no issues, smooth flight

log:
https://review.px4.io/plot_app?log=2f1378df-d18e-4de6-88f2-c45cab4ca07b

@dannyfpv

@dagar we tried to test on PixRacer V4, but got no telemetry connection in QGC. We tried stable and it worked fine.

@dagar
Member Author

dagar commented Sep 19, 2019

Looks like I screwed up the rebase when extracting the other PRs. Attempting to fix.

@dagar dagar force-pushed the pr-mavlink_write branch 2 times, most recently from ae262b5 to 7cbe9c6 on September 19, 2019 03:13
julianoes
julianoes previously approved these changes Sep 19, 2019
Contributor

@julianoes julianoes left a comment

Looks correct now.

Member

@bkueng bkueng left a comment

and will also be slightly more efficient by minimizing work for each send.

It is not more efficient since you have another memcpy. It also requires considerably more memory (I see almost 1KB on pixracer) due to the additional buffer.
Can we please find a better solution? I know the mavlink API is not great for that.

@jkflying
Contributor

The only way I can think of to get around the extra memory for each message, while still using the helper functions, is to allocate a buffer bigger than just the payload data, use the mavlink_msg_XXXXX_out_dynamic_send_buf helper instead with an offset into the buffer, and then let it copy the data back onto itself. 🤢
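The "copy the data back onto itself" trick reads roughly like this. Purely illustrative: send_in_place and HEADER_LEN are hypothetical names, not the real MAVLink helper API. The key point is that the copy's source and destination overlap within one allocation, so it must use memmove rather than memcpy.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical sketch: the caller builds the payload at an offset into
// one oversized buffer, and the send step shifts it back onto the front
// of the same buffer so the framing fits in front, avoiding a second
// buffer entirely.
static constexpr size_t HEADER_LEN = 10; // assumed MAVLink v2 header size

// payload was written by the caller at buf + HEADER_LEN
void send_in_place(uint8_t *buf, size_t payload_len)
{
	// source and destination overlap, so memmove (not memcpy) is required
	std::memmove(buf, buf + HEADER_LEN, payload_len);
	// ... real helpers would now frame and transmit buf ...
}
```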

@dagar
Member Author

dagar commented Sep 19, 2019

and will also be slightly more efficient by minimizing work for each send.

It is not more efficient since you have another memcpy.

Yes, but overall it's still slightly faster, although not enough to get excited about.

master (top)

PID COMMAND                   CPU(ms) CPU(%)  USED/STACK PRIO(BASE) STATE FD
167 mavlink_if0                  3900  6.137  1824/ 2532 100 (100)  w:sig  4
168 mavlink_rcv_if0               255  0.327  2424/ 3916 175 (175)  w:sem  4

PR (top)

PID COMMAND                   CPU(ms) CPU(%)  USED/STACK PRIO(BASE) STATE FD
167 mavlink_if0                  2153  5.400  1816/ 2532 100 (100)  w:sig  4
168 mavlink_rcv_if0               178  0.409  2608/ 3916 175 (175)  w:sem  4

master (mavlink status)

instance #0:
        GCS heartbeat:  394441 us ago
        mavlink chan: #0
        type:           USB CDC
        flow control: OFF
        rates:
          tx: 23.249 kB/s
          txerr: 0.000 kB/s
          tx rate mult: 1.000
          tx rate max: 800000 B/s
          rx: 0.527 kB/s
        FTP enabled: YES, TX enabled: YES
        mode: Config
        MAVLink version: 2
        transport protocol: serial (/dev/ttyACM0 @2000000)
        ping statistics:
          last: 0.81 ms
          mean: 0.92 ms
          max: 103.86 ms
          min: 0.27 ms
          dropped packets: 0

PR (mavlink status)

instance #0:
        GCS heartbeat:  8932 us ago
        mavlink chan: #0
        type:           USB CDC
        flow control: OFF
        rates:
          tx: 23.367 kB/s
          txerr: 0.000 kB/s
          tx rate mult: 1.000
          tx rate max: 800000 B/s
          rx: 0.527 kB/s
        FTP enabled: YES, TX enabled: YES
        mode: Config
        MAVLink version: 2
        transport protocol: serial (/dev/ttyACM0 @2000000)
        ping statistics:
          last: 0.51 ms
          mean: 0.94 ms
          max: 27.88 ms
          min: 0.23 ms
          dropped packets: 0

@dagar
Member Author

dagar commented Sep 19, 2019

It also requires considerably more memory (I see almost 1KB on pixracer) due to the additional buffer.
Can we please find a better solution? I know the mavlink API is not great for that.

Unless you have another idea I don't think the memory is that much of a concern. This buffer is 280 bytes (MAVLINK_MAX_PACKET_LEN) per mavlink instance. I'd still much rather fix this in mavlink, and I even considered pulling more of that code PX4 side. We can come back to this later if the mavlink api situation changes/improves.

This doesn't really excuse the waste, but we're still going to be somewhere around 5-10 K ahead of v1.9.2 on baseline configurations, and the cost of each additional module is lower. If we consider the system overall and the annoying constraints like the mavlink api I think I'd rather invest the time in other areas of the system where we can easily save quite a bit more than 1 KB (finish uORB updates, further reduce FDs, drop NuttX HPWORK/LPWORK, move more tasks to WQs, minimize logger subscriptions array, etc).

@dagar dagar force-pushed the pr-mavlink_write branch from f700f81 to 6484ec8 on April 14, 2020 14:53
@stale stale bot removed the stale label Apr 14, 2020
@dagar dagar changed the title [WIP]: mavlink send helpers refactor to prevent writing partial messages mavlink send helpers refactor to prevent writing partial messages Apr 14, 2020
@PX4 PX4 deleted a comment from stale bot Apr 14, 2020
@julianoes
Contributor

It turns out that this PR can prevent PARAM_VALUE messages from getting dropped when the ground station requests them.

I tested this using just FTDI without flow control connected to Telem1 using a small tester based on MAVSDK: mavlink/MAVSDK#1038

For the results, see: https://docs.google.com/spreadsheets/d/1aAT58x1zVd6_5bL9NjCOnc87Eoy4Aa3QMd-cmwZXI8A/edit?usp=sharing

@dagar dagar requested a review from julianoes April 14, 2020 15:41
jkflying
jkflying previously approved these changes Apr 14, 2020
@jkflying
Contributor

@julianoes very cool test you did there.

Just out of curiosity, does anyone know why the loading time doesn't decrease further moving from 115200 to 460800? Is there some kind of rate limiter?

@dagar
Member Author

dagar commented Apr 14, 2020

Just out of curiosity, does anyone know why the loading time doesn't decrease further moving from 115200 to 460800? Is there some kind of rate limiter?

I had the same question. I'm guessing it's the rate limiting here.

https://github.com/PX4/Firmware/blob/e2b8fd7b118a232d5ffa406041acb58cde1c2543/src/modules/mavlink/mavlink_receiver.cpp#L2916

Plenty of room for future optimization...
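A byte-budget limiter of the kind that would produce this plateau can be sketched as follows. This is hypothetical and simplified, not the linked PX4 code; it only illustrates why a fixed byte-per-second cap on replies makes the load time stop improving once the link is faster than the cap.

```cpp
#include <cstdint>

// Hypothetical token-bucket-style limiter: replies are capped at a
// fixed byte rate regardless of the physical baudrate.
struct RateLimiter {
	uint64_t last_us = 0; // timestamp of the last budget update
	int budget = 0;       // bytes currently allowed to be sent
	int max_rate_bps;     // cap in bytes per second

	explicit RateLimiter(int rate) : max_rate_bps(rate) {}

	// returns true if `len` bytes may be sent at time `now_us`
	bool try_send(uint64_t now_us, int len)
	{
		// replenish the budget for the elapsed time
		budget += static_cast<int>(((now_us - last_us)
			* static_cast<uint64_t>(max_rate_bps)) / 1000000);
		// cap the burst size at 100 ms worth of bytes
		if (budget > max_rate_bps / 10) { budget = max_rate_bps / 10; }
		last_us = now_us;

		if (budget >= len) {
			budget -= len;
			return true;
		}
		return false;
	}
};
```

Under such a cap, raising the baudrate from 115200 to 460800 only shortens the time each reply spends on the wire; the limiter still spaces them out at the same byte rate.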

@julianoes
Contributor

julianoes commented Apr 15, 2020

@bkueng some measurements regarding CPU usage, on Pixhawk/fmu-v2:

master : baudrate:  57600, rate: 5000, mode: 0: 4.5%, 0.37%
master : baudrate:  57600, rate: 5000, mode: 2: 6.3%, 0.37%
master : baudrate:  57600, rate: 5000, mode: 8: 6.8%, 0.37%

this PR: baudrate:  57600, rate: 5000, mode: 0: 4.6%, 0.37%
this PR: baudrate:  57600, rate: 5000, mode: 2: 6.1%, 0.37%
this PR: baudrate:  57600, rate: 5000, mode: 8: 6.3%, 0.37%

master : baudrate: 460800, rate: 40000, mode: 0: 4.5%, 0.37%
master : baudrate: 460800, rate: 40000, mode: 2: 6.6%, 0.37%
master : baudrate: 460800, rate: 40000, mode: 8: 8.2%, 0.37%

this PR: baudrate: 460800, rate: 40000, mode: 0: 4.5%, 0.37%
this PR: baudrate: 460800, rate: 40000, mode: 2: 6.3%, 0.37%
this PR: baudrate: 460800, rate: 40000, mode: 8: 7.5%, 0.37%

It actually looks like the PR improves CPU...

Edit: and it's believable because we do 3x memcpy and 1x write instead of 3x write.

Contributor

@julianoes julianoes left a comment

I vote to merge this.

@bkueng
Member

bkueng commented Apr 15, 2020

@bkueng some measurements regarding CPU usage, on Pixhawk/fmu-v2:

It's the additional memory usage I'm concerned with.

@LorenzMeier
Member

@bkueng We can recoup this with smaller UART TX buffers on some boards. I think we have the ~1K free RAM and if not this can be squeezed out where needed. This has been a pretty significant concern overall and a constant issue for a long time.

@LorenzMeier LorenzMeier merged commit b761060 into master Apr 15, 2020
@LorenzMeier LorenzMeier deleted the pr-mavlink_write branch April 15, 2020 14:22
@LorenzMeier
Member

LorenzMeier commented Apr 15, 2020

@dagar I reviewed the diff and I saw the last commit message. I hadn't reviewed the first commit message until now, not expecting anything other than the usual Daniel commit messages. I'm on the fence about whether commit messages should be reviewed, or whether we would rather just make sure that individual commit messages for PRs not flagged as WIP or draft are always clean in wording. I personally don't think that reviewing commit messages adds value; it should be on the proposer to make sure those are in good shape when a PR is not marked as WIP.

Given that I've written a good share of the original code I could tell that your changes were sound.

@dagar
Member Author

dagar commented Apr 15, 2020

@dagar I reviewed the diff and I saw the last commit message. I haven't reviewed the first commit message up until now, not expecting anything other than the usual Daniel commit messages. I'm on the fence if they should be reviewed or if we rather would like to just make sure that individual commit messages for PRs that are not flagged as WIP or draft should always be clean in wording.

My preference is to squash & merge by default unless the pull request is a sprawling change that touches many parts of the system and each incremental change was actually tested. That way we can keep the ugly history ("how the sausage was made") including review and test history and follow up changes. Then with squash & merge we get one last chance to verify the commit message actually reflects what the overall change evolved into and decide if it should reference the PR # or has enough information on its own.

@julianoes
Contributor

@bkueng some measurements regarding CPU usage, on Pixhawk/fmu-v2:

It's the additional memory usage I'm concerned with.

Oops, I remembered that wrongly. But I do agree that this optimization potentially allows making the tx buffers smaller.

9 participants