interpolate over nans of pld sensors #140
We can make the interpolate kwarg True by default, or control this behavior from the yaml.
Codecov Report
Base: 78.48% // Head: 79.47% // Increases project coverage by +0.98%
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #140      +/-   ##
==========================================
+ Coverage   78.48%   79.47%   +0.98%
==========================================
  Files           7        7
  Lines        1292     1364      +72
==========================================
+ Hits         1014     1084      +70
- Misses        278      280       +2
Hi @callumrollo, I'd recommend testing on this delayed mode dataset: https://cproof.uvic.ca/gliderdata/deployments/dfo-bb046/dfo-bb046-20200908/delayed_raw/
Thanks @hvdosser. Can you send me a yaml file for the dataset you linked? I will then incorporate the nan interpolation with control from the yaml file. Sorry for the delay on this, just back from vacation.
Hi @callumrollo, no worries, thanks for figuring this out! Here's the link to the yaml: https://cproof.uvic.ca/gliderdata/deployments/dfo-bb046/dfo-bb046-20200908/deployment.yml
@hvdosser do these tests look sufficient? I've added some of the data you provided and tested it in default mode and with nan interpolation enabled. We're using GPCTD as the timebase. The nan interpolation adds non-nan values for the other sensors (oxygen and cdom) where they did not sample at the same time as the CTD. Getting a nice increase in test coverage from this too :)
Ping @hvdosser for a review on this
@hvdosser if you had a chance to make sure this works it would be appreciated...
@hvdosser do you have a timeframe for reviewing this? It's quite a substantial change, so I'd like to get it merged before addressing other Issues
Hi @callumrollo and @jklymak, sorry for the delay on this. I'll have time on Monday Feb 13th to review and merge.
Looks good overall, and the new tests are great! I added a few very minor comments. As a broader comment for @jklymak, we'll likely want to update our example yaml files and documentation with some information about including interpolate for the SeaExplorers, since this is a pretty major correction to the delayed-mode processing. Thanks for your efforts on this @callumrollo, and apologies again for the delay in review.
I've merged in the latest changes from main. Is this good to go @hvdosser?
@callumrollo and @jklymak, I'll merge this pull request, but I do think we need to consider what happens when we have large gaps in the data and we interpolate over them. Particularly for sensors like the ECOpuck, which only samples a small portion of the water column. Am I correct in thinking that setting
OK, I think xarray provides this functionality: note the max_gap parameter, which should be perfect for this. Just set it to some reasonable value (that probably changes depending on the sampling frequency) and big gaps would remain NaN...
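To illustrate the max_gap idea, here is a minimal library-free sketch, not the code in this pull request and not xarray's implementation. The function name fill_nans_max_gap and the sample numbers are made up for illustration. Gap length is measured here as the time between the valid samples on either side of a run of NaNs; in practice the call being suggested would be xarray's DataArray.interpolate_na with its max_gap argument.

```python
import math


def fill_nans_max_gap(times, values, max_gap):
    """Linearly fill runs of NaN, but only where the valid samples on either
    side are at most `max_gap` apart in time. Longer gaps stay NaN.

    Hypothetical helper for illustration; assumes `times` is sorted.
    """
    out = list(values)
    # Indices of samples that hold a real measurement.
    valid = [i for i, v in enumerate(values) if not math.isnan(v)]
    for a, b in zip(valid, valid[1:]):
        gap = times[b] - times[a]
        if b - a < 2 or gap > max_gap:
            continue  # nothing to fill, or the gap is too long to trust
        for i in range(a + 1, b):
            frac = (times[i] - times[a]) / gap
            out[i] = values[a] + (values[b] - values[a]) * frac
    return out


nan = math.nan
times = [0, 1, 2, 3, 4, 5, 6, 7]             # seconds, made-up
values = [1.0, nan, 3.0, nan, nan, nan, nan, 8.0]
filled = fill_nans_max_gap(times, values, max_gap=2)
# The 2 s gap between t=0 and t=2 is filled; the 5 s gap after t=2 is left as NaN.
```

A sensible max_gap would depend on each sensor's sampling interval, as noted above, so a per-variable setting is probably more useful than one global value.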
Hi @hvdosser. The potential problem with slow sampling sensors was raised in the original discussion of Issue #128. Linear interpolation will result in an apparent measurement at every timestamp of the specified timebase. using
This will fix Issue #128
This is a first draft. If interpolate is set to True in raw_to_timeseries, nans in each variable are interpolated over as the sensors are aligned with the timestamps of the variable designated as the timebase. This should be tested with some delayed mode data, because the interpolation has no effect on nrt data, which typically have no nans. @hvdosser @jklymak, do you have a delayed mode dataset you would like to use to test this method?
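As a rough picture of what aligning a sensor to the timebase with interpolation means, here is a library-free sketch. It is not the code in raw_to_timeseries: the function name interp_to_timebase and the sample values are hypothetical. It assumes sensor timestamps are sorted, and it leaves timebase stamps outside the sensor's sampled range as nan rather than extrapolating, which is a choice made for this example only.

```python
import math
from bisect import bisect_left


def interp_to_timebase(timebase, sensor_times, sensor_values):
    """Linearly interpolate a sensor's non-nan samples onto timebase stamps.

    Hypothetical helper for illustration; assumes sorted `sensor_times`.
    """
    # Keep only the rows where the sensor actually reported a value.
    pts = [(t, v) for t, v in zip(sensor_times, sensor_values)
           if not math.isnan(v)]
    if not pts:
        return [math.nan] * len(timebase)
    ts = [t for t, _ in pts]
    vs = [v for _, v in pts]
    out = []
    for t in timebase:
        if t < ts[0] or t > ts[-1]:
            out.append(math.nan)  # no extrapolation beyond sampled range
            continue
        i = bisect_left(ts, t)    # first sample at or after t
        if ts[i] == t:
            out.append(vs[i])
        else:
            frac = (t - ts[i - 1]) / (ts[i] - ts[i - 1])
            out.append(vs[i - 1] + (vs[i] - vs[i - 1]) * frac)
    return out


# Made-up example: CTD timebase at whole seconds, oxygen sampled in between.
ctd_time = [0, 1, 2, 3, 4]
oxy_time = [0.5, 1.5, 2.5, 4.5]
oxy_vals = [200.0, math.nan, 204.0, 208.0]
oxy_on_ctd = interp_to_timebase(ctd_time, oxy_time, oxy_vals)
```

Without interpolation, the oxygen values at the CTD timestamps would all be nan in this example, because the two sensors never sample at the same instant.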