era5 encoding is lost with preprocess_ERA5 #781

Closed
veenstrajelmer opened this issue Feb 12, 2024 · 1 comment

veenstrajelmer commented Feb 12, 2024

When reducing a non-decoded dataset (opened with decode_cf=False), keep_attrs=True preserves the encoding attrs:

ds_reduced = ds_raw.max(dim="expver", keep_attrs=True)

However, in preprocess_ERA5 the input dataset is already decoded, so keep_attrs=True does not help: the encoding is dropped by the reduction. This means the reduction in this function already prevents int dtypes in a file written with to_netcdf. Therefore, prevent_dtype_int is of no value in this function if the expver dimension is present. These files will then be written unzipped with float32 dtypes. It would be good to make this more robust and transparent.
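
The difference between the two cases can be reproduced with a small in-memory dataset. This is a minimal sketch, not the dfm_tools implementation: the variable name `t2m`, the packing values and the array contents are made up for illustration, and `xr.decode_cf` stands in for opening a file with the default decoding. Copying the encoding back at the end is only one possible workaround, shown as an assumption.

```python
import numpy as np
import xarray as xr

# Hypothetical packed ERA5-like variable: int16 values with CF packing attrs
packing = {"scale_factor": 0.01, "add_offset": 273.15}
values = np.array([[1, 2], [3, 4]], dtype="int16")
ds_raw = xr.Dataset({"t2m": (("time", "expver"), values, packing)})

# Non-decoded case (as with decode_cf=False): the packing parameters are
# plain attrs, so keep_attrs=True carries them through the reduction
ds_raw_reduced = ds_raw.max(dim="expver", keep_attrs=True)
raw_attrs = dict(ds_raw_reduced["t2m"].attrs)

# Decoded case: decoding moves the packing parameters from attrs to encoding
ds_decoded = xr.decode_cf(ds_raw)
encoding_before = dict(ds_decoded["t2m"].encoding)

# The reduction returns new variables without the original encoding,
# regardless of keep_attrs
ds_decoded_reduced = ds_decoded.max(dim="expver", keep_attrs=True)
encoding_after = dict(ds_decoded_reduced["t2m"].encoding)

# Possible workaround (an assumption, not what dfm_tools does):
# copy the original encoding back onto the reduced variable
ds_decoded_reduced["t2m"].encoding.update(encoding_before)

print("attrs after raw reduction:", raw_attrs)
print("encoding before reduction:", encoding_before)
print("encoding after reduction: ", encoding_after)
```

Without the restored encoding, to_netcdf has no scale_factor, add_offset or integer dtype to apply, which matches the float output described above.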

@veenstrajelmer

This is no longer an issue for datasets downloaded from the new CDS-beta, as the variable dtypes are now zipped float32 instead of scaled int. The issue therefore only affects old datasets.

Furthermore, the prevent_dtype_int() function was simplified and merged into dfmt.preprocess_ERA5(), which makes this issue even less relevant. More info in #942.

veenstrajelmer closed this as not planned on Aug 9, 2024