Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Obstacle Avoidance prearm checks and log health status #11638

Merged
merged 10 commits into from
Mar 27, 2019
Merged

Obstacle Avoidance prearm checks and log health status #11638

merged 10 commits into from
Mar 27, 2019

Conversation

cmic0
Copy link
Contributor

@cmic0 cmic0 commented Mar 13, 2019

This PR is an incremental piece of #11550 and substitutes #11456. Adds Obstacle Avoidance to the prearm checks.

If Obstacle Avoidance system is enabled (COM_OBS_AVOID set to 1) then arming will be temporary denied as long as OA is not up and running.
Moreover the status of OA is now logged in the vehicle_status_flags message:

  • Topic avoidance_system_required: is set to 1 if avoidance system is enabled, 0 if not.
  • Topic avoidance_system_valid: is set to 1 when avoidance system is up and running, 0 when avoidance system dies.

Eventually another parameter has been added:

  • COM_OA_BOOT_T: The avoidance system running on the companion computer is expected to boot within this time and start providing trajectory points. If after this time no avoidance has been detected a MAVLink message will be sent.

Test data / coverage
Tested in SITL with upstream avoidance and bench tested on a vehicle.

Example of the logged signal when avoidance is lost and regained:
image

@bkueng @mrivi FYI

cmic0 added 5 commits March 12, 2019 14:08
This allows to perform pre-arm checks and prevent arming if obstacle avoidance is enabled but not yet running.
Added a print once flag to prevent excessive message spamming in QGC.

Signed-off-by: Claudio Micheli <claudio@auterion.com>
Signed-off-by: Claudio Micheli <claudio@auterion.com>
Signed-off-by: Claudio Micheli <claudio@auterion.com>
Signed-off-by: Claudio Micheli <claudio@auterion.com>
Since onboard controllers bootup times are hardware dependent, it makes sense to have the possibility to adapt timeout time according to the specific HW.

Signed-off-by: Claudio Micheli <claudio@auterion.com>
Copy link
Contributor

@julianoes julianoes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs either some renaming to reflect what the code is actually doing, or it needs to be changed do something else.

I can see two cases:

  1. A avoidance check interval (only a couple of times, or forever?)
  2. A fixed timeout (e.g. 60s). If no avoidance is available then, give up.

The pre-arm check seems correct.

* This parameter defines the bootup timeout.
* After the timeout a mavlink message to warn the user that the system
* is still booting up is triggered.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose of this param is not really clear to me. From what I read, the only change is the time of when the mavlink statustext warning is sent. However, it's not clear if this warning is now always triggered even if no onboard computer is used at all.

Also, I would probably call it an onboard computer or companion computer, and not an onboard controller because it's not really controlling anything, right?

And it probably requires a reboot. 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I am losing something here:
why should "the warning always be triggered even if no onboard computer is not used at all"?

That message is only printed if OA is enabled (at the current time onboard controller is only used for OA), AFAIK.

@@ -802,6 +802,21 @@ PARAM_DEFINE_INT32(NAV_RCL_ACT, 2);
* Temporary Parameter to enable interface testing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still temporary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch 😄, guess I can get rid of that comment.

avoidance_waiting_count++;

} else if (avoidance_waiting_count == AVOIDANCE_MAX_TRIALS) {
mavlink_log_critical(&mavlink_log_pub, "Avoidance not responding. Try reboot vehicle.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So avoidance is a thing that responds? What about Could not detect avoidance system. Try to reboot the vehicle.?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, following up my previous comment: this is a suggestion to help the user unblocking the situation. Actually instead of reboot vehicle should be power cycle vehicle to even restart the onboard computer.

@@ -3975,6 +3992,8 @@ void Commander::data_link_check(bool &status_changed)

if (_datalink_last_status_avoidance_system == telemetry_status_s::MAV_STATE_ACTIVE) {
mavlink_log_info(&mavlink_log_pub, "Avoidance system healthy");
status_flags.avoidance_system_valid = true;
avoidance_waiting_count = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't reset avoidance_waiting_count here, so you never expect to get it back?

@@ -123,6 +123,7 @@ static constexpr uint64_t HOTPLUG_SENS_TIMEOUT = 8_s; /**< wait for hotplug sens
static constexpr uint64_t PRINT_MODE_REJECT_INTERVAL = 500_ms;
static constexpr uint64_t INAIR_RESTART_HOLDOFF_INTERVAL = 500_ms;

static constexpr uint8_t AVOIDANCE_MAX_TRIALS = 4; /* Maximum number of trials for avoidance system to start */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why there are multiple trials. As far as I can see you're just waiting for heartbeats coming from "avoidance", so you just check after the bootup timeout, right? And then you reset the time and check after the bootup timeout again. Aha, I see what's going on, the bootup timeout is not a timeout, it's actually a "check interval".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes you are right, maybe calling it "check interval" is more appropriate. The reasoning behind this implementation is that we would like to have a message that periodically shows that the system is still booting up, not just wait for a single timeout. Let's say that you have an onboard computer (not controller 😄 ) boots in 60s:

  • You define the COM_ONB_BOOT_T to 25s
  • When you power the system, you will then have a periodic message shows the system is still booting up (as you may now on some companion computers with docker the booting time is still quite high, even more than 2minutes).
  • If after three messages (elapsed time will be > than average boot time) the FC has still not received an heartbeat then you alert the user that something is wrong.

Thus printing a message gives the user the impression that the system is actually doing something.

What do you think?
Maybe some parameter renaming could help understanding.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if you only cover these two cases with messages:

Successful case:
(booting, no message, pre-arm check would fail if you try to arm)
When avoidance connects: Send status message "Avoidance system connected" -> Now you know you can fly.

Unsuccessful case:
(booting, no message, pre-arm check would fail if you try to arm)
After timeout (e.g. 2 mins): Send error message "Error: Avoidance system not available!".

This way you set the timeout that you expect and you have a message that it is ready before that if it's faster than expected.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the messaging logic to the one you suggested.
@julianoes have a look here e004f77

@@ -997,6 +999,15 @@ bool prearm_check(orb_advert_t *mavlink_log_pub, const vehicle_status_flags_s &s
}
}

if (status_flags.avoidance_system_required && !status_flags.avoidance_system_valid) {
if (prearm_ok && reportFailures) {
mavlink_log_critical(mavlink_log_pub, "ARMING DENIED: Avoidance system not ready");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right so this makes sense.

@@ -802,6 +802,21 @@ PARAM_DEFINE_INT32(NAV_RCL_ACT, 2);
* Temporary Parameter to enable interface testing
*
* @boolean
* @reboot_required true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is the reboot required?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh, that is actually a good point:
I was thinking in terms of having added OA at prearm checks, thus if you enable OA you are required to reboot to force the user to redo prearm checks.

But actually on other platforms (..) OA is a feature which can be enabled/disabled inflight.
I think the answer to your question depends on how OA is meant to be in the future. @jkflying can maybe clarify

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrivi after a quick offline chat with @jkflying we decided to have the parameter as it was.
Thanks for raising it.
See here e004f77

cmic0 added 2 commits March 14, 2019 17:48
With this commit the use cases will be:

Success case:
- booting, no messages about OA, pre-arm check would fail if you try to arm and OA is not yet running

Fail case:
- if OA takes longer than timeout time defined in COM_ONB_BOOT_T, then an error message is triggered.


Signed-off-by: Claudio Micheli <claudio@auterion.com>
Signed-off-by: Claudio Micheli <claudio@auterion.com>
Copy link
Contributor

@julianoes julianoes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would try to improve the parameter description.


/**
* Set onboard controller bootup timeout
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would call it onboard computer or companion computer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strongly agree. FWIW IMO terminology for onboard/offboard is unhelpful because it is all from perspective of FC, but people think in terms of vehicles.

Is this a general companion computer timeout or an avoidance system running on companion computer timeout?

Assuming the later I'd do something like:

Set avoidance system bootup timeout.

The avoidance system running on the companion computer is expected to boot within this time and start providing trajectory points. If no avoidance system is detected a MAVLink warning message is sent.

Note, I have tried to explain how it detects the companion computer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I've lost that comment line during my changes.
@hamishwillee thanks for your suggestion, makes sense.

To answer your question: Yes, it is exactly the time within the avoidance system is expected to be running (formally defined as companion computer boot time + avoidance system startup time).

To this aim i thing that makes sense to change the name parameter to a more consistent too, i.e COM_OA_BOOT_T or COM_OBS_AVOID_T.

I think that the second one would be more consistent with COM_OBS_AVOID.
What do you think? @hamishwillee @julianoes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would probably vote COM_OA_BOOT_T because it's more self explanatory.

* Set onboard controller bootup timeout
*
* This parameter defines the bootup timeout.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about:

The avoidance system is expected to boot within this bootup timeout. If no avoidance system has been detected by this time, a MAVLink warning message is sent.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much better than what was here. See above for my version - it is much the same, but you don't need to mention time or timeout quite so much.

*
* This parameter defines the bootup timeout.
* After the timeout, a mavlink message that tells the user that the avoidance system
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, MAVLink, not mavlink by preference.

Since the time out is only Obstacle-Avoidance related, the new naming is more self explanatory.

Signed-off-by: Claudio Micheli <claudio@auterion.com>
* After the timeout, a mavlink message that tells the user that the avoidance system
* is not available is sent.
* The avoidance system running on the companion computer is expected to boot
* within this time and start providing trajectory points.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this true? Is it about trajectory points being sent? Or is it about a heartbeat?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is true. In the actual implementation, as soon as the avoidance node is running sends both heartbeats and trajectory points. On the long term, this behavior will probably be changed.

If we want to maintain a more general statement we could go for:
The avoidance system running on the companion computer is expected to boot within this time, after which its functionalities are fully operative.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No if it's true it's fine.

julianoes
julianoes previously approved these changes Mar 20, 2019
@@ -3825,6 +3834,7 @@ bool Commander::preflight_check(bool report)
void Commander::data_link_check(bool &status_changed)
{
bool updated = false;
static bool print_msg_once = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't use static, but a class member instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checkout here 99bd6ef

Signed-off-by: Claudio Micheli <claudio@auterion.com>
@cmic0
Copy link
Contributor Author

cmic0 commented Mar 22, 2019

Fixed CI failing due to unused variable here 99bd6ef.
FYI @julianoes

@@ -204,6 +203,8 @@ class Commander : public ModuleBase<Commander>, public ModuleParams
systemlib::Hysteresis _auto_disarm_landed{false};
systemlib::Hysteresis _auto_disarm_killed{false};

bool _print_msg_once{false};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rename it to _print_avoidance_msg_once otherwise it's not clear what this is referring to.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 384cfc3

…ry one.

Signed-off-by: Claudio Micheli <claudio@auterion.com>
@julianoes julianoes merged commit 0eb4942 into PX4:master Mar 27, 2019
@julianoes julianoes deleted the pr-oa-prearm_checks branch March 27, 2019 08:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants