Duplicate time values in pld1 break oxygen correction #95

callumrollo · 2022-06-20T09:14:07Z

If fed a pld1 file with duplicate timestamps, the reindex call in utils.oxygen_concentration_correction will fail:

ds_temp = data.potential_temperature[~np.isnan(data.potential_temperature)].reindex(time=ds_oxy.time, method="nearest")

Duplicate timestamps occur very rarely. In this snippet of a pld file, the timestamp 11:12:28.730 is repeated:

   18/11/2021 11:12:28.730;116;1604.485;5521.078;;;;;;;;;;;;;;9.0473;9.7355;1.6952;7.3451;9.7404;
   18/11/2021 11:12:28.761;116;1604.485;5521.078;;;;;;;;;;;;;;;;;;;
   18/11/2021 11:12:28.792;116;1604.485;5521.078;;;;;;;;;;;;;;9.0473;9.7354;1.7010;7.3450;9.6665;
   18/11/2021 11:12:28.823;116;1604.485;5521.078;;;;;;;;;;;;;;;;;;;
   18/11/2021 11:12:28.838;116;1604.485;5521.078;;;;;;;;;;;;;;;;;;;
   18/11/2021 11:12:28.871;116;1604.485;5521.078;;;;;;;;;;;;;;9.0467;9.7354;1.6963;7.3445;9.7035;
   18/11/2021 11:12:28.886;116;1604.485;5521.078;;;;;;;;;;;;;;;;;;;
   18/11/2021 11:12:28.901;116;1604.485;5521.078;;;;;;;;;;;;;;;;;;;
   18/11/2021 11:12:28.935;116;1604.485;5521.078;;;;;;;;;;;;;;9.0453;9.7354;1.6883;7.3433;9.6665;
   18/11/2021 11:12:28.952;116;1604.485;5521.078;;;;;;;;;;;;;;;;;;;
   18/11/2021 11:12:28.970;116;1604.485;5521.078;;;;;;;;;;;;;;;;;;;
   18/11/2021 11:12:28.664;116;1604.485;5521.079;-0.026;;;;;;;;;;;;;9.0462;9.7353;1.6874;7.3441;9.6665;
   18/11/2021 11:12:28.680;116;1604.485;5521.079;-0.026;;;;;;;;;;;;;;;;;;
   18/11/2021 11:12:28.699;116;1604.485;5521.079;-0.026;;;;;;;;;;;;;;;;;;
   18/11/2021 11:12:28.730;116;1604.485;5521.079;-0.026;;;;;;;;;;;;;9.0462;9.7353;1.6639;7.3441;9.6295;
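For anyone wanting to check their own files, here is a minimal standard-library sketch for spotting repeated timestamps in rows like the ones above. It assumes the timestamp is the first semicolon-separated field, in day/month/year order; the rows are abridged from the snippet, and `parse_time` is a hypothetical helper, not part of pyglider. It also flags the backwards step in time at 28.664.

```python
from collections import Counter
from datetime import datetime

# Rows abridged from the snippet above (trailing fields trimmed for brevity).
rows = [
    "18/11/2021 11:12:28.730;116;1604.485;5521.078",
    "18/11/2021 11:12:28.761;116;1604.485;5521.078",
    "18/11/2021 11:12:28.664;116;1604.485;5521.079",
    "18/11/2021 11:12:28.730;116;1604.485;5521.079",
]


def parse_time(row):
    """Parse the leading timestamp field of one row (hypothetical helper)."""
    return datetime.strptime(row.split(";")[0].strip(), "%d/%m/%Y %H:%M:%S.%f")


times = [parse_time(r) for r in rows]

# Timestamps that occur more than once.
duplicates = sorted(t for t, n in Counter(times).items() if n > 1)

# Row indices where time steps backwards relative to the previous row.
out_of_order = [i for i in range(1, len(times)) if times[i] < times[i - 1]]

print(duplicates, out_of_order)
```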

This is easily fixed using xarray Dataset.drop_duplicates. Should this be performed as a catching step in the oxygen correction function? Or earlier in processing? Or should some other workaround be used?
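A rough sketch of the drop_duplicates approach, on a small synthetic dataset rather than a real pld1 file. The variable values and the reindex target are made up for illustration; only `sortby`, `drop_duplicates` and `reindex` are real xarray calls.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a pld1 dataset: 28.730 appears twice on the time axis.
time = np.array(
    [
        "2021-11-18T11:12:28.730",
        "2021-11-18T11:12:28.792",
        "2021-11-18T11:12:28.871",
        "2021-11-18T11:12:28.730",
    ],
    dtype="datetime64[ns]",
)
ds = xr.Dataset(
    {"potential_temperature": ("time", [9.0473, 9.0473, 9.0467, 9.0462])},
    coords={"time": time},
)

# Sort by time, then keep the first sample for each repeated timestamp.
deduped = ds.sortby("time").drop_duplicates(dim="time", keep="first")

# With a unique time index, a nearest-neighbour reindex like the one above works.
target = np.array(["2021-11-18T11:12:28.800"], dtype="datetime64[ns]")
ds_temp = deduped.potential_temperature.reindex(time=target, method="nearest")
```

Keeping the first sample is an arbitrary choice here; `keep="last"` would be equally defensible.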

callumrollo · 2022-06-20T09:15:31Z

Just after applying the oxygen correction in seaexplorer.py, a call is made to ds.sortby("time"). Perhaps this could be moved forward a few lines and a call to drop_duplicates added?

jklymak · 2022-06-20T09:23:15Z

Seems reasonable. OTOH we should probably compare the length of the time series before and after the drop, and error or warn if the difference is too big?
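One way the before/after check could look, as a sketch only. The function name `drop_duplicate_times` and the 1 % threshold `max_fraction` are assumptions for illustration, not what was merged.

```python
import warnings

import numpy as np
import xarray as xr


def drop_duplicate_times(ds, max_fraction=0.01):
    """Drop repeated timestamps; warn if any are dropped, raise if too many are.

    Hypothetical helper: the name and the default threshold are illustrative.
    """
    n_before = ds.sizes["time"]
    ds = ds.sortby("time").drop_duplicates(dim="time", keep="first")
    n_dropped = n_before - ds.sizes["time"]

    # Too many duplicates suggests a real problem with the input file.
    if n_dropped > max_fraction * n_before:
        raise ValueError(
            f"{n_dropped} of {n_before} samples have duplicate timestamps"
        )
    # A handful of duplicates is expected occasionally, so only warn.
    if n_dropped:
        warnings.warn(f"Dropped {n_dropped} samples with duplicate timestamps")
    return ds
```

Whether the large-drop case should raise or just warn more loudly is a judgment call; raising is the more conservative option.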

callumrollo mentioned this issue Jun 20, 2022

deduplicate timestamps with check #96

Merged

callumrollo closed this as completed in #96 Jun 22, 2022
