Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix idle delay race-condition #11206

Merged
merged 2 commits into from
Jan 29, 2019
Merged

Fix idle delay race-condition #11206

merged 2 commits into from
Jan 29, 2019

Conversation

MaEtUgR
Copy link
Member

@MaEtUgR MaEtUgR commented Jan 13, 2019

Context
I think I found the root cause of the problem addressed in #11167

The first commit is the fix the second is refactoring only.

Describe problem solved by the proposed pull request
The idle time configured by MPC_IDLE_TKO does not always get executed. There's a race-condition between arming, when the flight task gets started and the takeoff. This inconsistently results in not awaiting the configured idle time and if awaited still too abrupt takeoffs without a proper takeoff ramp. See the excellent description of @bresch in #11167.

Describe your solution
The required flight task should get started at the exact time we want to take off. It was started when arming and the takeoff happened later. This change is necessary because otherwise the task produces setpoints, the land detector detects a takeoff but the takeoff did not happen yet because it's delayed by the arm_hysteresis (which I refactored to the name takeoff_delay_hysteresis because I think it better describes what it does).

Test data / coverage
A lot of SITL testing

@Stifael
Copy link
Contributor

Stifael commented Jan 13, 2019

I don't think it solves the root cause of the problem

This change is necessary because otherwise the task produces setpoints, the land detector detects a takeoff but the takeoff did not happen yet because it's delayed by the arm_hysteresis

why exactly does the landdetector detect a takeoff?

You only can enter the smooth-takeoff part if the vehicle was armed long enough (https://github.com/PX4/Firmware/blob/master/src/modules/mc_pos_control/mc_pos_control_main.cpp#L711-L714). If not in smooth-takeoff but still landed, then you enter this if-statement, where the setpoint is reset and hence the land-detector should not see any takeoff-setpoint.
However, this logic breaks if for whatever reason there is a valid thrust-septoint received, which happens in the scenario described here.

(which I refactored to the name takeoff_delay_hysteresis because I think it better describes what it does).

It really depends what you expect from that hysteresis. When I introduced it, the goal was to delay the arming state such that propeller have enough time to reach idle speed. In theory, the right solution would be to set the arming-state-flag to true only if the propeller reached idle speed. This would ensure that the entire system receives a true arming-state flag at the same time.
However, the goal of this hysteresis is not really to delay the takeoff, although in most scenarios it ends up to be the same.

@MaEtUgR
Copy link
Member Author

MaEtUgR commented Jan 13, 2019

I don't think it solves the root cause of the problem

I'm open to suggestions but #11167 looks more like a symptom of this cause to me.

why exactly does the landdetector detect a takeoff?

Sometimes it does, sometimes it doesn't and I'm honestly not sure why. But this pr fixes the problem. And the fix makes sense to me because you don't need to start the task when you're anyways delaying the takeoff. That defeats the reason to not start the task already before arming.

In theory, the right solution would be to set the arming-state-flag to true only if the propeller reached idle speed.

I have to object for safety reasons. The armed_flag is telling everyone that the system is now dangerous and the propellers should only rotate when it's true.


the goal of this hysteresis is not really to delay the takeoff

To ensure that the propellers reach idle speed before initiating a takeoff

It's not to force a takeoff after the delay but to force a delay before a possible takeoff. Or what did you mean?

EDIT: force-push to hopefully make CI being able to checkout again

@Stifael
Copy link
Contributor

Stifael commented Jan 13, 2019

It's not to force a takeoff after the delay but to force a delay before a possible takeoff. Or what did you mean?

The issue to solve is / was the following:
The arming-flag state indicates to the entire system that the vehicle is armed now. However, that is not the same has having idle speed. For most vehicles there is no difference between being armed and having idle speed, because the props can accelerate quick enough. However, for vehicles with larger props, that makes a difference. For instance, if QGC sends an arming command followed by takeoff, the time it takes between arming and takeoff is not long enough to reach idle speed before starting the takeoff ramp, which results in not having a ramp at all. Consequently, there needs to be a difference between being armed and having reached idle speed. The first thing that should happen after the arming command is to throttle up to idle speed, and then move on with takeoff.
Hence, instead of delaying the takeoff, the intention was to give the vehicle time to reach idle speed, independent of having a takoff command or not.

@Stifael
Copy link
Contributor

Stifael commented Jan 13, 2019

by the way, the entire logic also depends on what "armed" means within the autopilot.
Does armed=true mean that the propellers started to rotate (which is basically what I would expect), or does it mean that the vehicle is now ready for take off (that is how it is implemented without arming- hysteresis).

@MaEtUgR
Copy link
Member Author

MaEtUgR commented Jan 14, 2019

We're talking about the exact same intention just with different words 😉

arm -> wait for the motors to reach idle speed -> possible takeoff
arm -> enforce a takeoff delay (to let the motors spin up) -> possible takeoff

Is there now anything wrong with the fix (first commit) or with the names (second commit)? We can discuss that separately.

bresch
bresch previously approved these changes Jan 14, 2019
Copy link
Member

@bresch bresch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems correct to me to start the flight task only when the drone is fully ready for takeoff.
However, maybe MPC_SPOOLUP_TIME might be a more precise name. What do you think?

@@ -159,7 +159,7 @@ class MulticopterPositionControl : public ModuleBase<MulticopterPositionControl>
(ParamInt<px4::params::MPC_POS_MODE>) MPC_POS_MODE,
(ParamInt<px4::params::MPC_AUTO_MODE>) MPC_AUTO_MODE,
(ParamInt<px4::params::MPC_ALT_MODE>) MPC_ALT_MODE,
(ParamFloat<px4::params::MPC_IDLE_TKO>) MPC_IDLE_TKO, /**< time constant for smooth takeoff ramp */
(ParamFloat<px4::params::MPC_TKO_DELAY>) MPC_TKO_DELAY, /**< time between arming and takeoff */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this name is better. I agree with @bresch, `MPC_SPOOL_UP_TIME' or something similar would be better

@Stifael
Copy link
Contributor

Stifael commented Jan 14, 2019

I would really appreciate to find first the actual bug. In principle, there should not be any smooth-takeoff until the arming-hysteresis turns true. If this is not the case, then that is a bug that needs to be found first

@bresch
Copy link
Member

bresch commented Jan 14, 2019

@Stifael This part gets skipped if the land detector fails: https://github.com/PX4/Firmware/blob/766f911005dc4949925a46507236c62582d54d59/src/modules/mc_pos_control/mc_pos_control_main.cpp#L739
Then you're not in smooth takeoff but the drone can still take off...
That's why I tried to robustify the land detector.

@Stifael
Copy link
Contributor

Stifael commented Jan 16, 2019

@bresch

This part gets skipped if the land detector fails

That is true. However, but it also fails if there is a valid thrust-sp in z-direction. this error happens here

@julianoes
Copy link
Contributor

This does not pass the Dronecode SDK integration test, not sure if it is supposed to or if this is orthogonal to it.

@LorenzMeier
Copy link
Member

What is the current status here? @MaEtUgR any insights? @julianoes Can we set up your test in CI?

julianoes
julianoes previously approved these changes Jan 21, 2019
Copy link
Contributor

@julianoes julianoes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works with the Dronecode SDK integration hover tests, both with MPC_TKO_DELAY set to 0 and MPC_TKO_DELAY set to 5. However, if the delay is set too high, it will auto-disarm after arming before receiving the takeoff command.

@MaEtUgR
Copy link
Member Author

MaEtUgR commented Jan 21, 2019

@julianoes

However, if the delay is set too high, it will auto-disarm after arming before receiving the takeoff command.

That's something I'm aware of. This was the case before this pr and stays unchanged. The question for that is: Does someone have a motor "spool up time" of >10 seconds: https://github.com/PX4/Firmware/blob/c2e48f45ab21101a86217f6b7a515c54ac0d468f/src/modules/commander/Commander.cpp#L1747-L1748

@Stifael
Copy link
Contributor

Stifael commented Jan 21, 2019

is this PR still relevant, given that it does not solve the root cause? The root cause was solved here.
In addition, I still think the name-change is not really an improvement. That said, the previous name was probably also not good. If you really want to change that name, then I would go with @bresch suggestion, because that is basically what I was intending to implement

@julianoes
Copy link
Contributor

julianoes commented Jan 21, 2019

Does someone have a motor "spool up time" of >10 seconds

Probably not 😄

@bresch
Copy link
Member

bresch commented Jan 28, 2019

@MaEtUgR @Stifael I think this PR is still a good improvement as we don't need to run the flight task when the vehicle is not armed. I would like to merge it if we change the name of the parameter to the one we suggested or if we leave it as it was originally.

@Stifael
Copy link
Contributor

Stifael commented Jan 28, 2019

@bresch,
agreed. I prefer to change the name to what you suggested (MPC_SPOOL_UP_TIME) and change the comments

@MaEtUgR
Copy link
Member Author

MaEtUgR commented Jan 28, 2019

Yes, we already discussed. I'll rebase and change the name.

@MaEtUgR MaEtUgR dismissed stale reviews from julianoes and bresch via 030d6a9 January 28, 2019 14:02
@MaEtUgR
Copy link
Member Author

MaEtUgR commented Jan 28, 2019

I rebased and changed naming and description inside the second commit only. Review?

Copy link
Member

@bresch bresch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect, thanks.

@Stifael Stifael dismissed their stale review January 29, 2019 09:02

Comments are addressed

@Stifael Stifael merged commit cdb6aca into master Jan 29, 2019
@MaEtUgR MaEtUgR deleted the fix-arm-idle-race branch January 29, 2019 09:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants