Skip to content
This repository has been archived by the owner on Jan 16, 2020. It is now read-only.

MPU9250 Audit #209

Closed
nicolaerosia opened this issue Jun 23, 2017 · 15 comments
Closed

MPU9250 Audit #209

nicolaerosia opened this issue Jun 23, 2017 · 15 comments

Comments

@nicolaerosia
Copy link
Contributor

Hello,

I have noticed the following issues with the current driver:

@julianoes @AD-Aleksandrov @mhkabir @crossa @dagar could you please take a look?

[0] https://www.invensense.com/wp-content/uploads/2015/02/RM-MPU-9250A-00-v1.6.pdf

@julianoes
Copy link
Contributor

julianoes commented Jun 24, 2017

@nicolaerosia Good analysis!

The goal about a year ago was not to use the internal MPU9250 filtering and to sample the accel (and gyro) at 4 (respective 8) kHz, then read out the values from the FIFO buffer at 1 kHz. Therefore _packets_per_cycle_filtered would be 8.
The goal was to integrate the samples and use that in the ekf2.

Later, we found that the performance was not great (for whatever reason), and decided to match the PX4/Firmware driver by switching to the internal filtering with 41 Hz bandwidth (87de5cb).

If my observation above is correct, the following comments/computations are wrong:
https://github.com/PX4/DriverFramework/blob/master/drivers/mpu9250/MPU9250.hpp#L245-L249

Correct, I missed that in my review back then. Fortunately, it looks like this is corrected for over time in https://github.com/PX4/DriverFramework/blob/master/drivers/mpu9250/MPU9250.cpp#L423.

This affects fifo_sample_interval_us,
https://github.com/PX4/DriverFramework/blob/master/drivers/mpu9250/MPU9250.cpp#L568-L570

I think this is not required anymore (I had added it in 583b606)
because every sample is timestamped and integrated in the wrapper: https://github.com/PX4/Firmware/blob/master/src/platforms/posix/drivers/df_mpu9250_wrapper/df_mpu9250_wrapper.cpp#L626.

Why is a bandwidth of 184Hz chosen for Edison and RPI_SINGLE and 41 Hz for the others?

The fast sampling was not possible or used too much CPU on the Edison and Raspberry Pi, therefore a lower bandwidth setting (which comes with a lower sampling rate) was chosen.

FYI: @ChristophTobler

@nicolaerosia
Copy link
Contributor Author

@julianoes Just a quick comment, a bandwidth of 184Hz has the same sampling rate as 41 Hz, take a look at page 13, [0].

@nicolaerosia
Copy link
Contributor Author

Another problem is related to timestamp, given that we query the FIFO at 1 KHz, and FIFO samples internally at 8 KHz, we need to report the correctly adjusted timestamp and push it to df_mpu9250_wrapper.
The existing struct imu_sensor_data from ImuSensor.hpp is limited because it implies that you're publishing accel & gyro & mag.
In our setup, the mag sampling rate is limited to 100Hz and we are publishing duplicate values.
My approach would be to introduce three structures,

struct accel_data {
	uint64_t timestamp;
	float accel_m_s2_x;
	float accel_m_s2_y;
	float accel_m_s2_z;
};

struct gyro_data {
	uint64_t timestamp;
	float gyro_rad_s_x;
	float gyro_rad_s_y;
	float gyro_rad_s_z;
};

struct mag_data {
	uint64_t timestamp;
	float mag_ga_x;
	float mag_ga_y;
	float mag_ga_z;
};

We can get the timestamp by forcing driver users to implement a function callback _timestamp, which would return the current timestamp.

	virtual uint64_t _timestamp() = 0;

and in the case of df_mpu9250_wrapper.cpp,

	virtual uint64_t _timestamp() override {
		return hrt_absolute_time();
	}

Immediately after querying the size of the FIFO, I capture the timestamp.
And for each packet I read from FIFO, I substract the internal FIFO sample rate interval.
E.g.
If for some reason, we read a single packet, there's not substraction.
For two packets, the older one will have current_timestamp - FIFO sample rate interval, while the freshest one will have the current_timestamp.
And so on.

@julianoes @mhkabir what do you think?

@julianoes
Copy link
Contributor

@julianoes Just a quick comment, a bandwidth of 184Hz has the same sampling rate as 41 Hz, take a look at page 13, [0].

Yes correct. But we got there historically because we wanted to use the highest sampling possible, and lesat internal filtering. Then it needed to be slowed down for Edison and Raspberry Pi. Later, only the default (for Snapdragon) was changed back to 41 Hz.

@julianoes
Copy link
Contributor

@nicolaerosia I think I see your timestamp concerns but couldn't we use fifo_sample_interval_us as well to account for it? And, don't the duplicate values disappear anyway when you integrate 4 samples into 1 sample from 1 kHz to 250 Hz?

@nicolaerosia
Copy link
Contributor Author

@julianoes the idea is to drop the rate limiting from df_mpu9250_wrapper and publish at whatever frequency the driver publishes, please check PX4/PX4-Autopilot#7449 .
The duplicate values won't disappear even with rate limiting to 250 Hz since the mag is sampled at 100 Hz.

@nicolaerosia
Copy link
Contributor Author

@julianoes I still don't get why default (Snapdragon) changed to 41 Hz because it has the same sampling rate (1 KHz) only different filtering parameters to limit the bandwidth to 41 Hz.

@julianoes
Copy link
Contributor

@julianoes the idea is to drop the rate limiting from df_mpu9250_wrapper and publish at whatever frequency the driver publishes, please check PX4/PX4-Autopilot#7449 .
The duplicate values won't disappear even with rate limiting to 250 Hz since the mag is sampled at 100 Hz.

Ok sure. It makes sense then I think.

@julianoes
Copy link
Contributor

@julianoes I still don't get why default (Snapdragon) changed to 41 Hz because it has the same sampling rate (1 KHz) only different filtering parameters to limit the bandwidth to 41 Hz.

This was just to match the Pixhawk/Pixracer PX4/Firmware driver and pipeline.

@karliss
Copy link

karliss commented Dec 11, 2018

Any progress on this? Current settings for snapdragon are unusable due to empty FIFO being considered an error. When sampling 1k samples at 1kHz it is not uncommon to read either 0 or 2 samples, this quickly triggers ERROR_FLAG_ERRCOUNT. Running minimal amount of modules results in an error every 2-5 seconds, but using more typical set of modules causes 20-30 EMPTY FIFO per second reaching NORETURN_ERRCOUNT in ~5 minutes.

@julianoes
Copy link
Contributor

@karliss I don't think anyone is working actively on this. I think we're open for a pull request to fix the fifo error that you're describing.

@dagar
Copy link
Member

dagar commented Dec 11, 2018

This isn't a short term solution, but the new mpu9250 driver in the Firmware tree (PX4/PX4-Autopilot#10602) is going to work everywhere (nuttx, linux, and qurt).

@davidaroyer
Copy link
Contributor

davidaroyer commented Dec 12, 2018

@julianoes @dagar I've been looking at the empty FIFO issue that @karliss just brought up. I never noticed this issue until recently when I started seeing the px4 shell spammed with the "FIFO Count: 0" message.

After quite a bit of bench testing, I have found that if I add a simple 10 microsecond sleep when the fifo is empty, the driver can march on with data and it avoids quickly triggering the ERROR_FLAG_ERRCOUNT. In other words, instead of the current fifo_cnt check, I simply tried this:

if (fifo_cnt < 0) {
    DF_LOG_INFO("MPU9250::_measureData: FIFO Count: %d", fifo_cnt);
} else if (fifo_cnt == 0) {
    usleep(10);
    fifo_cnt = get_fifo_count();
}

Note, this is just my dumb way of trying out something like this. It at least halfway fixes the problem, but I don't really like the idea of inserting a random usleep in the sensor driver.

Another potential solution would be to maintain current behavior, only report the "FIFO Count:" message for (fifo_cnt < 0), and no longer increase the error_counter for this particular case. Maybe that looks like so:

if (fifo_cnt < 0) {
    DF_LOG_INFO("MPU9250::_measureData: FIFO Count: %d", fifo_cnt);
} else if (fifo_cnt == 0) {
    return;
}

Downstream from this there's another check that will increase the error counter based on a bytes_to_read check, but as far as I can tell, that check will only fail if fifo_cnt is <= 0. So that second check seems a bit redundant. Would it make more sense to take that second check out and implement below?

if (fifo_cnt < 0) {
    DF_LOG_INFO("MPU9250::_measureData: FIFO Count: %d", fifo_cnt);
    ++m_sensor_data.error_counter;
    return;
} else if (fifo_cnt == 0) {
    return;
}

I'm more than happy to submit a PR for this particular issue, but I'd prefer to get some feedback on how you'd like to fix it. I've flown with essentially the last suggestion with no issues during the flight.

Thoughts?

@hamishwillee
Copy link

@julianoes @dagar Can you respond to the part from #209 (comment)

I'm more than happy to submit a PR for this particular issue, but I'd prefer to get some feedback on how you'd like to fix it. I've flown with essentially the last suggestion with no issues during the flight.

Aerotenna have been waiting on this before updating docs etc for v1.9 - the opoc doesn't work out of the box because of this. Can you provide direction?

@julianoes
Copy link
Contributor

I haven't looked at this in a long time and have no opinion. I suggest that you go ahead and do what makes sense to you and works. Thanks.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants