Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CellMethods now printed in CF-compliant manner #4279

Closed
wants to merge 17 commits into from

Conversation

MoseleyS
Copy link
Contributor

@MoseleyS MoseleyS commented Aug 11, 2021

🚀 Pull Request

Description

Fixes #3592

This simple PR rearranges the order in which CellMethod information is formatted by the str method. The grouping of this information (where multiple coords refer to the same method) has therefore been removed.

Consult Iris pull request check list

@trexfeathers
Copy link
Contributor

Not being a CF expert, I'm not certain on the specifics here. However I am certain that it breaks the assumptions of the latest cube summary code, having just worked on that (#4206), so some of the doctests will likely fail at the very least.

Have a look at this commit for details. Perhaps you'll have a good idea how to adapt for CF-compliance: 4d8859c (specifically the changes in cube_summary.py).

@trexfeathers
Copy link
Contributor

@MoseleyS you'll also need to be working from a much more up to date branch, since MoseleyS:3592_cellmethods_cfcompliant contains none of the latest CI checks. You can merge main into this branch, or rebase it.

@MoseleyS
Copy link
Contributor Author

Looks like I forgot to update my fork first...

@MoseleyS MoseleyS force-pushed the 3592_cellmethods_cfcompliant branch from e1f2eea to 977a0f2 Compare August 11, 2021 09:59
@MoseleyS
Copy link
Contributor Author

This should work with the cube_summary changes as the format is changing from "{method}: {coord}" to "{coord}: {method}" (see #3592). I expect I will need to modify a few more unit tests though to accept the corrected ordering of information.

The block @trexfeathers pointed me to contains an oddity: content looks like it will always be identical to str(method) and ought to be refactored as such.

@trexfeathers
Copy link
Contributor

@MoseleyS just to note: by all means don't treat the current setup as a straight jacket.

E.g: it may be that cube summary shouldn't be reading CellMethod's string representation at all, but rather working with specialised attributes.

Whatever seems best practice to you, in light of what would be truly CF-compliant.

@MoseleyS MoseleyS marked this pull request as ready for review August 13, 2021 11:27
Copy link
Contributor

@trexfeathers trexfeathers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for slogging through this @MoseleyS, I think it's a great idea! I'd also appreciate an opinion from @pp-mo about the general concept here, since @pp-mo has more experience with CF.

I have a few comments for you. The main issue I have is the downstream parsing of CellMethod.__str__ in the cube summary code. I already struggled to de-risk this in my previous implementation, and now with your improved implementation that issue has become that extra bit problematic. See what you think of my suggestion.

Once all other changes are made, you will need to find existing examples in the documentation and change them to the new world before the doctests will pass. If it helps, all the failures are listed in the CI details (here for latest run at time of writing).

Finally: this needs a "What's New" entry.

Attributes:
time mean within years
time mean over years
Attributes:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Attributes:
Attributes:

Comment on lines +226 to +228
names = content.split(": ")[:-1]
# Remove "coord: " or "coord1: coord2: " from the front of the string, leaving the value.
value = content.split(": ")[-1]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems risky to assume that : won't also appear in other places (especially if some of my other comments get actioned). Could you create CellMethod.method_str() for creating the 'right-hand-side' part of the string? Then the code could be:

names = method.coord_names
value = method.method_str()


for coord_name, interval, comment in cell_components:
coords.append(coord_name)
other_info = ", ".join(filter(None, chain((interval, comment))))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at one of the examples, this currently generates:

6 minutes, This is a test comment

If we're aligning strictly to the conventions, should we actually be aiming for the following?

interval: 6 minutes comment: This is a test comment

return cm_summary + ", ".join(collection_summaries)
result = f"{''.join([f'{c}: ' for c in coords])}{self.method}"
if any(other_infos):
result += f" ({', '.join(other_infos)})"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
result += f" ({', '.join(other_infos)})"
result += f" ({' '.join(other_infos)})"

If we're aligning strictly to the conventions, should we actually be separating with a space only?

@@ -2816,25 +2816,24 @@ def __init__(self, method, coords=None, intervals=None, comments=None):
self._init(method, tuple(_coords), tuple(_intervals), tuple(_comments))

def __str__(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in my cube_summary.py comment: could you abstract out the code that generates the method + interval + comment string (e.g. mean (6 minutes, This is a test comment, 12 minutes)) into its own method: self.method_str?

Then just have self.__str__ stitch together self.coord_names and self.method_str()? That would avoid the need for downstream string parsing, which can be risky and difficult to follow.

Comment on lines +264 to +265
for title, value in zip(sect.names, sect.values):
add_scalar_row(", ".join(title), value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for title, value in zip(sect.names, sect.values):
add_scalar_row(", ".join(title), value)
for coords, method in zip(sect.names, sect.values):
add_scalar_row(", ".join(coords), method)

Specialised behaviour will benefit from specialised language to help understanding of what's going on.

Attributes:
time mean within years
time mean over years
Attributes:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Attributes:
Attributes:

@MoseleyS
Copy link
Contributor Author

Thanks for the suggestions. I'm a bit busy with other work, so this is going to have to wait for a week or so, but in principle, I agree with your suggestions.

@trexfeathers
Copy link
Contributor

@MoseleyS does this still matter to you? Can we get it over the line? Or should I close it?

Could perhaps bring to a UKMO 'Surgery'?

@MoseleyS
Copy link
Contributor Author

MoseleyS commented Oct 5, 2022

@MoseleyS does this still matter to you? Can we get it over the line? Or should I close it?

Could perhaps bring to a UKMO 'Surgery'?

I'd forgotten about this. I'd still like it to happen as I like consistency. I'm not sure when I'm going to have time to look at this again though, perhaps next month. Feel free to take over if you'd like to.

@trexfeathers
Copy link
Contributor

Thanks @MoseleyS. @pp-mo has pointed out there are arguments for the current implementation too, although we're not certain in either direction. I'll mark this for team discussion just to make sure everyone is happy.

@trexfeathers
Copy link
Contributor

Summary from team discussion today (ping @pp-mo):

We still agree that the CellMethod.__str__() should print something CF compliant.

More complex for a Cube. The left hand text uses the names of objects wherever possible. This dimension is called foo, this coordinate is called bar, etcetera. That is where the 'controversy' lies - there is no clear answer for cell methods, since they do not have names but are instead provided as a sequence of operations that were performed on this data. So we'd like to go with a THIRD WAY 😁 :

    Cell methods:
        0                < my_cube.cell_methods[0].__str__() >
        1                < my_cube.cell_methods[1].__str__() >
        2                < my_cube.cell_methods[2].__str__() >

I still intend to pick this up, but likely not before December. @MoseleyS is free to pursue before then if they wish.

@MoseleyS
Copy link
Contributor Author

    Cell methods:
        0                < my_cube.cell_methods[0].__str__() >
        1                < my_cube.cell_methods[1].__str__() >
        2                < my_cube.cell_methods[2].__str__() >

I still intend to pick this up, but likely not before December. @MoseleyS is free to pursue before then if they wish.

That's fine by me. Ideally, the CellMethod.__str__ method will return the CF-compliant format.
I doubt I'll get to this before December, but anything is possible I guess.

@pp-mo
Copy link
Member

pp-mo commented Oct 20, 2022

If everyone is saying "December", evidently this won't make it into Iris 3.4.
So I'll label it for 3.5

@trexfeathers
Copy link
Contributor

Replaced by #5224

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Correct CellMethod.__str__ method to be CF-compliant
4 participants