Case insensitive lenient special name support? #4453

bouweandela · 2021-12-07T10:46:57Z

✨ Feature Request

Motivation

It was recently reported to me by @axel-lauer that he experienced frustration with using the iris library via ESMValTool because it failed to merge some cubes because some had a longitude coordinate with long_name "Longitude" (the CMIP6 standard), while others had a longitude coordinate with long_name "longitude" (the CMIP5 standard).

Would this already be solved by #4446 or would some additional changes be needed, such as changing special lenient name behaviour such that is case insensitive (if that doesn't conflict with the CF Conventions)?

The text was updated successfully, but these errors were encountered:

rcomer · 2021-12-07T12:52:59Z

Hi @bouweandela, would those coordinates also have standard_names? If so, I think the lenient comparison would ignore the long_name clash.

bouweandela · 2021-12-07T13:42:58Z

Yes, they would have standard_names, so that sounds promising!

rcomer · 2021-12-07T21:59:04Z

Sorry, finger slipped!

rcomer · 2021-12-15T11:17:56Z

It seems I had misremembered in the above. The mismatched long_names do cause an error for lenient arithmetic (with Iris3.1):

import iris
import iris.cube
import iris.coords

coord1 = iris.coords.DimCoord(range(5), standard_name="longitude", long_name="Longitude")
coord2 = iris.coords.DimCoord(range(5), standard_name="longitude", long_name="longitude")

cube1 = iris.cube.Cube(range(5), dim_coords_and_dims=[(coord1, 0)])
cube2 = iris.cube.Cube(range(5), dim_coords_and_dims=[(coord2, 0)])

cube1 - cube2

ValueError: Insufficient matching coordinate metadata to resolve cubes, cannot map dimension (0,) of the RHS cube 'unknown' to the LHS cube 'unknown'.

You can have mismatched var_names though:

coord1.var_name = "foo"
coord3 = coord1.copy()
coord3.var_name = "bar"

cube3 = iris.cube.Cube(range(5), dim_coords_and_dims=[(coord3, 0)])

print(cube1 - cube3)

unknown / (unknown)                 (longitude: 5)
    Dimension coordinates:
        longitude                             x

bjlittle · 2021-12-15T13:40:08Z

For equality, lenient behaviour is really only your friend in the cases outlined here i.e., when you're comparing two things, and one thing has a value and the other not (i.e., "something" vs None)

The only other special treatment is with regards to var_name (currently).

That all said, for strict/lenient equality a long_name="latitude" compared with long_name="Latitude" will always be False, as you're comparing "something" vs "something".

Yes, we could say, well for lenient behaviours make str equality case insensitive, which is ideal to adhere to the CMIP6 standard, but on the flip side it then breaks someone else's workflow where they do actually care about long_name="longitude" vs long_name="LONGITUDE" being different (don't know why, but they might)... unfortunately, there is just no way for us to know this; it's incredibly local and specific to users, their data and their workflow i.e., different people care about different things (and that's totally okay).

IMHO adding baked in local exceptions to iris really is a slippy slope from a maintenance perspective, and I'd be keen to avoid it.

In summary, without thinking too deeply about this, the obvious options are:

either iris opens up the ability for users to define generic behavious to be applied during lenient operations at the API level,
users align metadata values beforehand in the knowledge of how iris strict and lenient operations behave, or
allow users that want "special" lenient treatment, a hook to monkey-patch the local behaviour that they want and need for their workflow

Personally, knowing what I know, I'm attracted to option [3] from the perspective that it's completely achievable with very little development effort. This is compared to option [1] which is a HUGE development undertaking, and option [2] which is a HUGE inconvenience for users.

Also, going back to your question @bouweandela, as it stands, no #4446 wouldn't help you alone here... however, it certainly would help you along with option [3].

bjlittle · 2021-12-15T13:42:01Z

Okay, I'm going to proof-of-concept option [3] (above) in the context of your use case and see how it hangs together.

This'll hopefully give us a concrete strawman (and hopefully proposal) of how we might be able to move forwards with this issue.

rcomer · 2021-12-15T13:53:34Z

I like the idea of option 3 from the user POV too. Could it work similarly to the FUTURE flag? I.e. I can set it once at the top of my script so it modifies all behaviour in my script, or I can choose to use it as a context manager to just modify the behaviour in this one section.

bjlittle · 2021-12-15T13:58:39Z

I like the idea of option 3 from the user POV too. Could it work similarly to the FUTURE flag? I.e. I can set it once at the top of my script so it modifies all behaviour in my script, or I can choose to use it as a context manager to just modify the behaviour in this one section.

That would be lovely, right? 🤔

I think there are at least two threads here:

does the option [3] mechanism work, and is it palatable to both developers and users alike
how will it be used, and how do we best exercise control of when these "special" behaviours kick-in, so to speak

bouweandela · 2022-02-14T20:05:13Z

Thanks for looking into this @bjlittle and @rcomer and my apologies for being slow to reply. It sounds like option 3 is as good as it gets. I personally feel that being super strict on things is what scares scientists off from using iris, but maybe the iris library is not the place to fix that and we need we need libraries like esmvalcore to carry out all the homogenising of metadata before passing things on to iris. I do try to report the more generic things back to iris, that's why i opened this issue.

Another example of super strict behaviour that only makes sense to programmers and not to climate scientists is refusing to merge floating point coordinates that only differ by 1 ULP.

bouweandela · 2022-02-14T20:10:17Z

Thinking since more about option 3: does having this option simplify the user's code in practice? It is fairly straightforward to write code to homogenise the capitalisation of coordinate names, it's just a bit unexpected that this is needed. Do the CF conventions have anything to say on whether or not variable names are case sensitive?

ESadek-MO · 2023-05-24T09:33:28Z

We've discussed this as a team @SciTools/peloton and feel this is a case for user workaround. This is beyond the intended scope of lenient operations.

bouweandela added the New: Feature label Dec 7, 2021

rcomer closed this as completed Dec 7, 2021

rcomer reopened this Dec 7, 2021

bjlittle self-assigned this Dec 15, 2021

bjlittle pinned this issue Dec 15, 2021

bouweandela mentioned this issue Jan 17, 2022

Possible improvements in cube concatenation to be discussed with iris developers ESMValGroup/ESMValCore#1423

Open

rcomer unpinned this issue Feb 13, 2022

trexfeathers removed the New: Feature label Jun 15, 2022

pp-mo added the Feature: ESMValTool label Sep 6, 2022

valeriupredoi mentioned this issue Sep 30, 2022

Ordering issues labelled Feature: ESMValTool from iris ESMValGroup/ESMValCore#1738

Open

trexfeathers added the Feature: NetCDF + CF-conventions label Jan 25, 2023

trexfeathers mentioned this issue Apr 12, 2023

Common agreement on loading CF non-compliant NetCDF files #5165

Open

ESadek-MO closed this as not planned Won't fix, can't repro, duplicate, stale May 24, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Case insensitive lenient special name support? #4453

Case insensitive lenient special name support? #4453

bouweandela commented Dec 7, 2021

rcomer commented Dec 7, 2021

bouweandela commented Dec 7, 2021

rcomer commented Dec 7, 2021

rcomer commented Dec 15, 2021 •

edited

Loading

bjlittle commented Dec 15, 2021 •

edited

Loading

bjlittle commented Dec 15, 2021 •

edited

Loading

rcomer commented Dec 15, 2021

bjlittle commented Dec 15, 2021

bouweandela commented Feb 14, 2022 •

edited

Loading

bouweandela commented Feb 14, 2022 •

edited

Loading

ESadek-MO commented May 24, 2023

Case insensitive lenient special name support? #4453

Case insensitive lenient special name support? #4453

Comments

bouweandela commented Dec 7, 2021

✨ Feature Request

Motivation

rcomer commented Dec 7, 2021

bouweandela commented Dec 7, 2021

rcomer commented Dec 7, 2021

rcomer commented Dec 15, 2021 • edited Loading

bjlittle commented Dec 15, 2021 • edited Loading

bjlittle commented Dec 15, 2021 • edited Loading

rcomer commented Dec 15, 2021

bjlittle commented Dec 15, 2021

bouweandela commented Feb 14, 2022 • edited Loading

bouweandela commented Feb 14, 2022 • edited Loading

ESadek-MO commented May 24, 2023

rcomer commented Dec 15, 2021 •

edited

Loading

bjlittle commented Dec 15, 2021 •

edited

Loading

bjlittle commented Dec 15, 2021 •

edited

Loading

bouweandela commented Feb 14, 2022 •

edited

Loading

bouweandela commented Feb 14, 2022 •

edited

Loading