Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flight - Altitude calculation errors caused by looptime fluctuations #569

Closed
digitalentity opened this issue Feb 28, 2015 · 61 comments
Closed

Comments

@digitalentity
Copy link
Contributor

I have a BMP180 sensor connected to my CC3D and I noticed that it produces strange spikes every few seconds. This is likely not a hardware issue as I have connected a BMP085 breakout board and seen these spikes as well. Configurator does not show any I2C errors. Copter is sitting still on my table.
2015-03-01 09 49 09

@SlyMike
Copy link

SlyMike commented Mar 1, 2015

Do you have an i2c oled display connected? I noticed similar behaviour with my Naze32 the other day. I was getting an occasional spike but also noticed a stepping pattern imerge as the pages changed on the oled display.

@thenickdude
Copy link
Contributor

thenickdude commented Mar 1, 2015 via email

@SlyMike
Copy link

SlyMike commented Mar 1, 2015

Just tried to recreate my trace for you and ended up with this instead...

screenshot 2015-03-01 11 29 48

Not what I expected!
Currently running Cleanflight/NAZE 1.7.2 Feb 23 2015 / 14:21:30 (afe44d1). I got the stepping pattern on 1.7.0.

@SlyMike
Copy link

SlyMike commented Mar 1, 2015

Flashed back to 1.7.1 to see if the original steps were there...

screenshot 2015-03-01 11 42 07

screenshot 2015-03-01 11 43 05

screenshot 2015-03-01 11 43 38

^^^ that's what I'm talking about ^^^

screenshot 2015-03-01 11 44 08

screenshot 2015-03-01 11 44 32

screenshot 2015-03-01 11 49 53

^^^ Back to normal after a few minutes ^^^

@digitalentity Sorry for hijacking your thread!

Just ordered a new baro :_( Irrespective of the stepping pattern, something else is wrong here.

@digitalentity
Copy link
Contributor Author

@SlyMike, no problem, we seem to have a common problem here.

I've noticed alot of spikes when motors are running at high throttle, this is probably a noise in powerlines issue. Will to some more testing in a few days.
Also ordered a new baro, this time a MS5611, to see if the issue is indeed hardware related.

@thenickdude
Copy link
Contributor

thenickdude commented Mar 2, 2015 via email

@digitalentity
Copy link
Contributor Author

@thenickdude Spikes due to prop wash are expected and present, but they are not a 25 meters variance. Moreover, I have my baro covered by piece of HEPA to prevent high speed air flow from messing with baro readings.

@hydra hydra changed the title [issue] Strange barometer altitude spikes Strange barometer altitude spikes Mar 3, 2015
@hydra
Copy link
Contributor

hydra commented Mar 5, 2015

I confirm that when disarmed you will get baro spikes when an OLED display is connected, especially when changing pages.

The baro code is coupled to the looptime, it needs decoupling.

This isn't an issue when armed since the display does not update.

@hydra hydra changed the title Strange barometer altitude spikes Barometer sensor errors caused by looptime fluctuations. Mar 5, 2015
@hydra
Copy link
Contributor

hydra commented Mar 5, 2015

Regardless of OLED display, this still needs fixing.

@digitalentity
Copy link
Contributor Author

I don't have an OLED connected, just usb vcp, mag, baro and gps.
On Mar 6, 2015 3:07 AM, "Dominic Clifton" notifications@github.com wrote:

I confirm that when disarmed you will get baro spikes when an OLED display
is connected, especially when changing pages.

The baro code is coupled to the looptime, it needs decoupling.

This isn't an issue when armed since the display does not update.


Reply to this email directly or view it on GitHub
#569 (comment)
.

@thenickdude
Copy link
Contributor

I couldn't reproduce this on my Naze32 rev 5 (which I believe is the MS5611 barometer), even when I added this code to mw.c to add crazy high random variability to my looptime (i.e. add between 150ms and 0ms delay to looptime)

EDIT: I was only varying looptime and it turns out that the timing of the outer loop is the important part, read my later comment

@digitalentity
Copy link
Contributor Author

Just tested a newly received BMP180 breakout board, same random spikes, even when disarmed. I was unable to reproduce these spikes on Flip32 with MS5611 board, so this issue is unrelated to looptime. It is also not related to a specific sample of BMP085/BMP180 sensor as I am seeing these spikes on 3 different boards. I think this is likely software-related. I will look into the code and see if I can come up with a fix for this.

@digitalentity digitalentity changed the title Barometer sensor errors caused by looptime fluctuations. BMP085/BMP180 barometer sensor errors Apr 1, 2015
@ledvinap
Copy link
Contributor

ledvinap commented Apr 1, 2015

I have probably same problem on sparky. USB usage is probable cause here, looptime goes to ~20ms range ...

@ledvinap
Copy link
Contributor

ledvinap commented Apr 1, 2015

@thenickdude : The problem is probably not directly related to looptime difference, but to stalling of whole main loop. Adding random delay_ms() to main loop() may simulate it ...

@thenickdude
Copy link
Contributor

Ohh right, the rest of the loop contents outside the looptime controlled section, I forgot about those.

EDIT: Yeah, it is absolutely an error caused by the timing of the main loop. I added the random delay as you suggested and this is the result, this board is completely motionless:

screen shot 2015-04-02 at 12 49 33 am

@ledvinap
Copy link
Contributor

ledvinap commented Apr 1, 2015

Yes ...
delay(301 * (currentTime % 511)); should do the trick ...

@hydra
Copy link
Contributor

hydra commented Apr 1, 2015

@thenickdude yup, that's the kind of thing I see.

@digitalentity
Copy link
Contributor Author

I've just tested the same - added the random delay and got very similar picure. Seems this is not related to baro driver after all.

@hydra
Copy link
Contributor

hydra commented Apr 1, 2015

@digitalentity yes, that's what I thought when I commented on the issue a month a go. Good to have confirmation.

@digitalentity
Copy link
Contributor Author

It seems the root of the problem is somewhere in calculateEstimatedAltitude(), I'll further investigate this tomorrow if I have some time to spare.

@hydra hydra changed the title BMP085/BMP180 barometer sensor errors Altitude calculation errors caused by looptime fluctuations Apr 1, 2015
@digitalentity
Copy link
Contributor Author

Did a quick hack to bmp085 driver to fix preassure to a predefined value and introduce some random variance (19ab926) and was very surprised - the spikes were all gone! This is a bmp085-specific problem after all.

@digitalentity
Copy link
Contributor Author

Looks like bmp085/bmp180 does not always follow the datasheet when it comes to conversion delays. I've increased ut_delay and up_delay to much higher values (da9fdd4) and spikes are not present any more - just normal altitude fluctuations. It is a hack, but it seem to do the trick for me.

@thenickdude
Copy link
Contributor

It's not bmp085 specific as the timing variation I saw was on the Naze32, which uses MS5611. They could both have the same bug in their drivers though. Also adding delays to the loop shouldn't ever cause the barometer to be read faster, so it should be eliminating that if it was the cause.

I'm confused by what you're trying with adding random values to the read pressure?

@digitalentity
Copy link
Contributor Author

@thenickdude by adding random variance I was trying to simulate preassure changes. If I don't add random variance altiude is zero and completely constant without any spikes or changes regardless of looptime.
Then I altered the driver itself and increased the timeout between 'read pressure' command and actual reading raw value from the sensor. Spikes I was seing in my first comment are gone. I think that bmp085 hardware does not always behave exactly by datasheet, sometimes it need more time than 25.5ms (as in datasheet) to finish the pressure conversion.

@thenickdude
Copy link
Contributor

Those spikes are probably due to the bug in the driver that tries to read the pressure/temperature value anyway even though the code knows they aren't available yet (by waiting for the flag to be set). That check can probably be fixed without increasing the delay:

    // wait in case of cockup
    if (!convDone)
        convOverrun++;

    i2cRead(BMP085_I2C_ADDR, BMP085_ADC_OUT_MSB_REG, 2, data);
    bmp085_ut = (data[0] << 8) | data[1];

(notice how when convDone is false it still reads it anyway and updates the value!)

However we still have the problem where computed altitude varies dramatically with the loop timing, which it shouldn't do.

@digitalentity
Copy link
Contributor Author

convDone is used only if we have an additional wire from EOC pin on BMP085 to some EXTI line. External sensors on I2C does not have that, so this code should not be even compiled when BARO_EOC_GPIO is undefined.

EDIT: BMP085 also does not have a status register and no flag to detect end of conversion. We have to rely on delays for external sensors on i2c bus.

@digitalentity
Copy link
Contributor Author

@ledvinap I've added a generic int32_t implementation of 3- to 9-point median filter and modified generic barometer code to use a 3-point version, should be independent from actual barometer hardware driver. Please see commits 963e8f3, 7c1b41b. Median filter code can also be reused for sonar etc.
Didn't test the code though, don't have a cc3d board at home.

EDIT: found a bug, used a wrong variable.

@ledvinap
Copy link
Contributor

ledvinap commented Apr 3, 2015

Well .. the baro problem is probably caused by wrong timing:

but this variable is used in executePeriodicTasks() for baroUpdate(currentTime); (on next loop and possibly skipping once because of shouldProcessRx(). The delay between sampling time and executing baroUpdate may be quite long and affected by other tasks. If this delay is long when start-conversion command is send and short when reading data, baro value won't be updated ...

Correct action is probably to call micros() again after switch(state), so that wait is guarantied to be longer than requested.

@ledvinap
Copy link
Contributor

ledvinap commented Apr 3, 2015

Fix in #720, fixes problem on sparky.

Other functions called with currentTime should be checked too ..

@digitalentity
Copy link
Contributor Author

Your fix is better than mine, it eliminates the cause. Modified #718 to solve end-of-conversion flag not being used when available.

@digitalentity
Copy link
Contributor Author

Other functions called with 'currentTime' and affected are probably these:

rxUpdateAt = currentTime + DELAY_50_HZ;

nextUpdateAt = currentTime + COMPASS_UPDATE_FREQUENCY_10HZ;

warningLedTimer = currentTime + 500000;

GPSLEDTime = currentTime + 150000;

@ledvinap
Copy link
Contributor

ledvinap commented Apr 3, 2015

Only compass is possibly affected and both compass chips now in use should simply return previous value/wait for ready bit.

@digitalentity
Copy link
Contributor Author

Than the only piece of code that should be changed is baro-related.

@ledvinap
Copy link
Contributor

ledvinap commented Apr 3, 2015

It is probably enough to fix problem with baro now, and maybe fixing compass code to avoid surprise in future.

@digitalentity
Copy link
Contributor Author

I've tested the 3-point median filter on buggy bmp085 code, spikes are all gone. This does not fix the cause of spikes (bad timing) but effetively filters them out.

@digitalentity
Copy link
Contributor Author

@ledvinap as a test, I've merged your's #720 and mine #718 and #725 into current master and bench-tested the code. All seem to work fine - baro provides sane readings with no spikes.

@ledvinap
Copy link
Contributor

ledvinap commented Apr 3, 2015

IMO the median filter is good idea even with fixed code - single failed I2C read may cause big problems otherwise ...

@digitalentity
Copy link
Contributor Author

I think the same way. Probably the same median filter should be applied to sonar readings as well.

@digitalentity
Copy link
Contributor Author

Did a flight test today. Baro mode works really well now. Quad was hovering in BARO + GPSHOLD mode for about 10 minutes and I didn't notice any sudden altitude changes, everything was within normal 0.5-1 meter variance.

@hydra
Copy link
Contributor

hydra commented Apr 7, 2015

after fixes and filter

So, after merging the the PR's #720, #718 and #725 I observe the altitude calculation is still incorrect when an OLED display is connected when disarmed.

Granted, an OLED display doesn't change pages or refresh when armed however, if i'm understanding things correctly looptime fluctuations cased by other long running tasks that happen when armed will still cause issues, e.g. processing a new GPS frame.

What other fixes can be made?

@hydra
Copy link
Contributor

hydra commented Apr 7, 2015

Also, thanks for the great investigation work, fixes and testing so far!

@digitalentity
Copy link
Contributor Author

Don't have an OLED display to investigate further. When I receive my brand new display from China, I'll dig into this issue again. So far I am very happy with the results I am getting.

@hydra
Copy link
Contributor

hydra commented Apr 8, 2015

@digitalentity You can replicate the behaviour it by adding a variable delay in exectutePeriodicTasks() that changes every 5 seconds and is executed at the same frequency that the display would normally execute (5hz).

@hydra
Copy link
Contributor

hydra commented Apr 8, 2015

@ledvinap @digitalentity I'm thinking that perhaps the call to executePeriodicTasks() should be performed on every loop iteration, even iterations that would call processRx(). Comments?

@hydra
Copy link
Contributor

hydra commented Apr 8, 2015

Perhaps this commit would help too: eb1617c

@ledvinap
Copy link
Contributor

ledvinap commented Apr 8, 2015

@hydra: You are talking about 1m fluctuation?
Try disabling all attitude filters. The baro noise about 10m on my Sparky, the problem you see may be some aliasing problem. Or voltage fluctuation. Or black magic ...

@ledvinap
Copy link
Contributor

ledvinap commented Apr 8, 2015

What about implementing priority queue for timed task? Just register periodic invocation or invocation delay. Then just fetch first callback in loop() and call it.

With some care it may handle CPU overload much better that current scheme.
It may be also possible to include 'expected runtime' in callback descriptions and reorder tasks to execute high-priority tasks with better accuracy (Well ... much work was probably done in this field in eighties ...).

This should work without introducing much thread synchronization problems that would come with full RTOS. And maybe it could be slowly migrated to cooperative multitasking (GPS code is great candidate) ...

@hydra
Copy link
Contributor

hydra commented Apr 8, 2015

@ledvinap yes, ~1m fluctuation on the SPRacingF3 with OLED enabled. ~10m fluctuation on Sparky with all settings at default.

While I agree that better task scheduling is required at some point, is there nothing else we could do now instead?

@ledvinap
Copy link
Contributor

ledvinap commented Apr 8, 2015

The 10m on sparky is after default filtering? Can't test right now, but that would be terrible.

Using better filtering may help a bit, but noise should be reduced as sqrt(averaging), so overaging 100 samples may result in ~1m fluctuation.

Randomizing baro sampling times may help if main problem is caused by aliasing.

Did you try to disable all filters and use blackbox to record baro data? Analyzing the data may reveal something interesting ...

@hydra
Copy link
Contributor

hydra commented Apr 11, 2015

@ledvinap My bad - On the sparky i get about 3m fluctuations using all default settings - when configurator is viewing the sensors tab and connected via the USB port.

@ledvinap
Copy link
Contributor

Can you disable all filtering and record blackbox log (just sitting on table)?

BTW: My sparky (from rtfquads) is +-0.5m with default filtering, USB powered ...

@hydra hydra changed the title Altitude calculation errors caused by looptime fluctuations Flight - Altitude calculation errors caused by looptime fluctuations May 15, 2015
martinbudden added a commit to martinbudden/cleanflight that referenced this issue Jun 23, 2016
Fixed fake sensors initialisation warnings.
@hydra hydra closed this as completed Feb 25, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants