Support for N axis dimensions mapped to N coordinates #343

Merged 13 commits on Nov 8, 2022

Commits on Nov 7, 2022

  1. Add support for datasets with N dims mapped to N coord vars

    Overview:
      - `xcdat` APIs previously expected a 1:1 map between axis dimensions and coordinates. For example, the `time` axis dimension would be mapped to the `time` coord var. However, datasets can have N dimensions mapped to N coordinate variables.
      - E3SM datasets are one example: a "Z" axis might have the dimensions and coords "ilev" and "lev".
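To make the N-to-N mapping concrete, here is a minimal stdlib-only sketch. It is not `xcdat`'s implementation: the `coords` dictionary and the `group_coords_by_axis()` helper are hypothetical, and the sketch assumes each coordinate variable carries a CF-style `axis` attribute.

```python
from collections import defaultdict

# Hypothetical stand-in for a dataset's coordinate metadata: each coordinate
# variable name maps to its attributes (assumed to include a CF "axis" key).
coords = {
    "time": {"axis": "T"},
    "lat": {"axis": "Y"},
    "lon": {"axis": "X"},
    "lev": {"axis": "Z"},
    "ilev": {"axis": "Z"},
}


def group_coords_by_axis(coord_attrs):
    """Return {axis: [coord names]} so that one axis can own several coords."""
    grouped = defaultdict(list)
    for name, attrs in coord_attrs.items():
        axis = attrs.get("axis")
        if axis is not None:
            grouped[axis].append(name)
    # Sort names so the result does not depend on insertion order.
    return {axis: sorted(names) for axis, names in grouped.items()}


axis_map = group_coords_by_axis(coords)
print(axis_map["Z"])  # ['ilev', 'lev']
print(axis_map["T"])  # ['time']
```

Code that assumes a 1:1 map would pick only one of `ilev` and `lev` for the "Z" axis; returning a list per axis lets callers loop over every coordinate.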
    
    Feature Updates:
      - Update bounds accessor methods to operate on the dataset or a variable within the dataset using the new `var_key` kwarg
      - Update `add_missing_bounds()` to loop over all axes and their coordinate vars and attempt to add bounds for each coordinate var if they do not exist
      - Update decoding function (`decode_time()`) to decode CF and non-CF compliant time coordinates as `cftime` objects
      - Update dataset longitude swapping function (`swap_lon_axis()`) to loop over longitude coords and their bounds to swap them
      - Update spatial and temporal accessor class methods to refer to the dimension coordinate variable on the `data_var` being operated on, rather than the parent dataset
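The longitude swap over several coords and their bounds can be sketched roughly as follows. This is an illustration only, not the code in `swap_lon_axis()`: the helper names and the plain-dict data layout are assumptions, and the arithmetic is the standard modulo conversion between the two longitude conventions.

```python
def to_0_360(lon):
    """Map a longitude in degrees onto [0, 360)."""
    return lon % 360


def to_180(lon):
    """Map a longitude in degrees onto [-180, 180)."""
    return (lon + 180) % 360 - 180


def swap_all(coords, bounds, convert):
    """Apply `convert` to every longitude coord and to each of its bounds."""
    new_coords = {
        name: [convert(v) for v in values] for name, values in coords.items()
    }
    new_bounds = {
        name: [[convert(lo), convert(hi)] for lo, hi in pairs]
        for name, pairs in bounds.items()
    }
    return new_coords, new_bounds


# Hypothetical data: one longitude coord and its cell bounds.
coords = {"lon": [-90.0, 0.0, 90.0]}
bounds = {"lon": [[-135.0, -45.0], [-45.0, 45.0], [45.0, 135.0]]}

new_coords, new_bounds = swap_all(coords, bounds, to_0_360)
print(new_coords["lon"])  # [270.0, 0.0, 90.0]
print(new_bounds["lon"])  # [[225.0, 315.0], [315.0, 45.0], [45.0, 135.0]]
```

Note that the cell straddling the prime meridian ends up with a lower bound greater than its upper bound, and the swapped coordinate is no longer sorted. This sketch handles neither case.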
    
    Bug Fixes:
      - Fix `add_bounds()` so that it ignores 0-dim singleton coords in addition to 1-dim singleton coords
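One way to see why singleton coords are skipped: bounds derived from midpoints need at least two points to estimate a cell width. The following is a hedged, stdlib-only sketch of that idea. `midpoint_bounds()` is a hypothetical helper, not the `add_bounds()` implementation, and it represents a 0-dim coord as a bare scalar.

```python
def midpoint_bounds(points):
    """Return [[lower, upper], ...] for a 1-D coordinate, or None if singleton.

    Interior edges are midpoints between neighbouring points; the two outer
    edges are extrapolated by mirroring the nearest interior edge.
    """
    # A 0-dim coord (scalar) or a 1-dim coord of length 1 has no neighbour to
    # measure a cell width against, so no bounds can be generated.
    if not isinstance(points, (list, tuple)) or len(points) < 2:
        return None

    mids = [(a + b) / 2 for a, b in zip(points, points[1:])]
    first = points[0] - (mids[0] - points[0])
    last = points[-1] + (points[-1] - mids[-1])
    edges = [first] + mids + [last]
    return [[lo, hi] for lo, hi in zip(edges, edges[1:])]


print(midpoint_bounds([0.0, 1.0, 3.0]))  # [[-0.5, 0.5], [0.5, 2.0], [2.0, 4.0]]
print(midpoint_bounds([5.0]))            # None (1-dim singleton)
print(midpoint_bounds(5.0))              # None (0-dim singleton)
```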
    
    Refactor:
      - Refactor `open_dataset()` and `open_mfdataset()` to remove unnecessary preprocessing functions
      - Rename `get_axis_coord()` to `get_dim_coords()` and `get_axis_dim()` to `get_dim_keys()`
      - Update fixtures in `fixtures.py` to use `cftime` objects and add `decode_times` to `generate_dataset()`
      - Extract utility function `_if_multidim_dask_array_then_load()` and reuse in several places in the codebase, usually when manipulating bounds
    tomvothecoder committed Nov 7, 2022
    8a30534
  2. Refactor redundant calls to .values

    - This improves runtime performance, since `.values` performs type conversions to `np.array()`
    - Refactor `_get_cftime_coords()` to use `xr.coding.times.decode_cf_datetime()` for decoding CF compliant time units
    tomvothecoder committed Nov 7, 2022
    23bcd2d
  3. Refactor decoding times to a lazy operation

    - Update `_is_decoded()` to check `.encoding` attributes rather than the object types in the array for performance purposes, since accessing `.values` can involve type conversion
    tomvothecoder committed Nov 7, 2022
    2928e6a
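The idea behind checking `.encoding` instead of array contents can be sketched as below. The `Coord` class and `is_decoded()` function are hypothetical stand-ins, not the `_is_decoded()` code from this PR. The sketch assumes that decoding moves the time `units` from `attrs` into `encoding`, which is what xarray does when it decodes times.

```python
from dataclasses import dataclass, field


@dataclass
class Coord:
    """Hypothetical stand-in for a coordinate variable with metadata dicts."""

    values: list
    attrs: dict = field(default_factory=dict)
    encoding: dict = field(default_factory=dict)


def is_decoded(coord):
    """Decide from metadata alone, without touching `coord.values`."""
    # Assumption: a decoded time coord keeps its original units in `encoding`,
    # while an undecoded one still has them in `attrs`.
    return "units" in coord.encoding


raw = Coord([0, 31], attrs={"units": "days since 2000-01-01"})
decoded = Coord(
    ["2000-01-01", "2000-02-01"],
    encoding={"units": "days since 2000-01-01"},
)

print(is_decoded(raw))      # False
print(is_decoded(decoded))  # True
```

A dictionary lookup costs the same regardless of array size, whereas inspecting element types would require materializing the values first.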
  4. Add test for decoding with default calendar attr

    - Refactor imports and fix docstrings
    tomvothecoder committed Nov 7, 2022
    c7f0e99
  5. Decode times in preprocess for open_mfdataset()

    - This is required for cases where time units are different between files in multi-file datasets, but the offsets are the same. More info in the `_preprocess()` docstring
    - Raise `ValueError` when decoding datasets whose time coords do not have CF attrs set
    tomvothecoder committed Nov 7, 2022
    7c8846e
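The multi-file case above can be illustrated with a small stdlib-only sketch. It is not the `_preprocess()` implementation: the file dictionaries and `decode_days_since()` are hypothetical, and the sketch supports only "days since YYYY-MM-DD" units in the standard calendar (`cftime` would be needed for other calendars).

```python
from datetime import datetime, timedelta


def decode_days_since(offsets, units):
    """Decode numeric offsets into datetimes using a 'days since' units string."""
    if units is None:
        raise ValueError("time coordinate has no 'units' attribute to decode with")
    prefix = "days since "
    if not units.startswith(prefix):
        raise ValueError(f"unsupported units: {units!r}")
    reference = datetime.strptime(units[len(prefix):], "%Y-%m-%d")
    return [reference + timedelta(days=offset) for offset in offsets]


# Two hypothetical files with identical offsets but different reference dates.
file_a = {"units": "days since 2000-01-01", "time": [0, 31]}
file_b = {"units": "days since 2001-01-01", "time": [0, 31]}

# Concatenating the raw offsets first loses the per-file reference date.
raw = file_a["time"] + file_b["time"]
print(raw)  # [0, 31, 0, 31]

# Decoding each file before concatenation keeps the time axis unambiguous.
decoded = []
for f in (file_a, file_b):
    decoded.extend(decode_days_since(f["time"], f["units"]))
print([d.strftime("%Y-%m-%d") for d in decoded])
# ['2000-01-01', '2000-02-01', '2001-01-01', '2001-02-01']
```

Decoding inside a per-file preprocessing step means each file's own units are applied before the files are combined, and a file without units fails early with a `ValueError` instead of producing a misleading time axis.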
  6. Fix incorrect units arg passed to decode_cf_datetime()

    - Add `decode_time()` test for units in days
    tomvothecoder committed Nov 7, 2022
    1e57bc9
  7. 3a5d4fd
  8. Update _get_all_coord_keys() to interpret common var names

    - Update tests
    tomvothecoder committed Nov 7, 2022
    2920ffc
  9. 5431b0e
  10. 6591295

Commits on Nov 8, 2022

  1. d5e5ccd
  2. ad587d4
  3. Fix test warnings

    tomvothecoder committed Nov 8, 2022
    8575a09