Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Particle reader cleanup #3526

Merged
merged 36 commits into from
Jan 18, 2022

Conversation

matthewturk
Copy link
Member

This is supposed to be a no-op set of changes to the particle readers. It is in support of #3511 , so that we can keep moving forward with it with minimal changes in the medium term and to lighten the review.

Ideally, if the tests pass, this should be able to go in -- we've been testing here locally but there may be unforeseen consequences. The type annotations should be compatible with currently supported versions of python.

Future work will happen in another, manageable pull request, and will include moving some of the IO handling for particles into the ParticleFile subclasses, but largely keeping the API for the io handler the same, while eliminating duplicated code.

@matthewturk matthewturk added index: particle api-consistency naming conventions, code deduplication, informative error messages, code smells... refactor improve readability, maintainability, modularity labels Sep 24, 2021
Copy link
Member

@neutrinoceros neutrinoceros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a first batch of comments and questions.
Mostly my two points are:

  • I'm not confortable with calling this PR a "no-op" change in that it seems to be breaking any existing third party frontend that supports particles. The change is probably acceptable nonetheless, but we should keep this in mind. I could also be convinced that the change isn't breaking, but the fact that particle generators' signature changes is noteworthy.
  • most custom types here feel uncalled for, but the essence of them is relevant. We can simplify this. Also please use from typing import ... rather than import typing. I makes for longer import statements but cleaner code that can be automatically upgraded to more modern annotation style later with pyupgrade

yt/data_objects/static_output.py Outdated Show resolved Hide resolved
@@ -50,7 +50,7 @@ def _read_particle_coords(self, chunks, ptf):
if pcount == 0:
continue
pos = self._get_particle_positions()
yield ptype, [pos[:, i] for i in range(3)]
yield ptype, [pos[:, i] for i in range(3)], 0.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will break any third party particle frontend, I think. I don't know that there are any, and in any case it is trivial to mitigate but it needs to be mentioned in our next release notes, I think.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

going to add it to a draft release note for yt 4.1

@@ -85,7 +85,7 @@ def _read_particle_coords(self, chunks, ptf):
x = self._get_field((ptype, "particle_position_x"))
y = self._get_field((ptype, "particle_position_y"))
z = self._get_field((ptype, "particle_position_z"))
yield ptype, (x, y, z)
yield ptype, (x, y, z), 0.0
Copy link
Member

@neutrinoceros neutrinoceros Sep 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you haven't changed the Base class here, but still the behaviour changes ?
Does this mean that every third party frontend that implements _read_particle_fields is broken by this change ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, however, I did a quick check and we have extremely few if any that implement this, unless they are hidden somewhere. Because this was introduced in yt-4.0, it should also be mitigated. Would a wrapper around the functions via metaclass work for you?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we haven't been promoting (yet) the recipe for third party frontends, it's indeed unlikely that this actually breaks anyone, so I don't think we need to change the way you're doing it, and a quick public announcement (an email to yt-dev) should be more than enough.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was in a seminar today where @gvwilson talked about the "contracts" we have with software from a design perspective, and how we should only ever relax our specifications on what is input to a function and strengthen what is allowed to be output by a function. This violates that principle, and I am thinking about it in a new way. So let me think a bit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@matthewturk any further thoughts on this?

I think if we didn't want to add the 0 hsml to all the _read_particle_coords methods, we could instead check the length of the returned tuple where needed? For example in BaseParticleIOHanlder._count_particles_chunks we have:

            for ptype, (x, y, z), hsml in self._read_particle_coords(chunks, ptf):
                psize[ptype] += selector.count_points(x, y, z, hsml) 

but could do something along the lines of

            for coord_tuple in self._read_particle_coords(chunks, ptf):
                ptype = coord_tuple[0]
                x, y, z = coord_tuple[1]
                if len(coord_tuple) == 3:
                    hsml = coord_tuple[2]
                else:
                    hsml = 0.
                psize[ptype] += selector.count_points(x, y, z, hsml)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: we decided to keep the hsml in all the _read_particle_coords (the base class now reflects this change). Having the different behavior with variable outputs proved difficult with the type checking and overall more complicated, so I think making this change while there are few if any third party plugins to break is the way to go.

yt/frontends/gadget/data_structures.py Show resolved Hide resolved
yt/frontends/swift/data_structures.py Show resolved Hide resolved
yt/utilities/io_handler.py Outdated Show resolved Hide resolved
yt/utilities/io_handler.py Outdated Show resolved Hide resolved
yt/utilities/io_handler.py Outdated Show resolved Hide resolved
yt/utilities/io_handler.py Outdated Show resolved Hide resolved
@matthewturk
Copy link
Member Author

Here's a first batch of comments and questions.
Mostly my two points are:

  • I'm not confortable with calling this PR a "no-op" change in that it seems to be breaking any existing third party frontend that supports particles. The change is probably acceptable nonetheless, but we should keep this in mind. I could also be convinced that the change isn't breaking, but the fact that particle generators' signature changes is noteworthy.

OK, it's noteworthy. Would it be sufficient to insert an unpacking for conditional creation of the default argument?

  • most custom types here feel uncalled for, but the essence of them is relevant. We can simplify this. Also please use from typing import ... rather than import typing. I makes for longer import statements but cleaner code that can be automatically upgraded to more modern annotation style later with pyupgrade

Ugh, I really don't like that notation, though. Because it's all typing and typing-related stuff I much prefer having it explicitly namespaced -- especially since some will conflict with names from traitlets. But, that's still a theoretical concern, so I suppose I can update it.

@neutrinoceros
Copy link
Member

Ugh, I really don't like that notation, though.

Here are two arguments to back up my concern:

  • https://docs.python.org/3/library/typing.html has 0 occurrences of "import typing" and 12 occurrences of "from typing import" so it's pretty one-sided as far as the standard library is concerned
  • unfortunately there isn't a single namespace that contains every relevant types we might want to import. typing is an important player but there are also types and collections.abc. I don't think it would make type hints any clearer to use (and mix) namespaces everywhere.

@matthewturk
Copy link
Member Author

matthewturk commented Nov 12, 2021 via email

@chrishavlin
Copy link
Contributor

We might be able to do star unpacking to accomplish this, but I'm not 100% sure. It might be just what you're looking at here, anyway.

ya, I've been playing with a couple different ways, will see if star unpacking works. But I was curious on if you came down on either side of the philosophical fence in terms of do we make all _read_particle_coords return the same tuple or do we handle different outputs?

@matthewturk
Copy link
Member Author

I think if we put the function to sanitize the output somewhere else (can be anywhere, just not in the body of the loop) it should be fine to manage the different types of output. I think.

@Xarthisius Xarthisius removed their request for review January 11, 2022 16:31
Co-authored-by: Clément Robert <cr52@protonmail.com>
@neutrinoceros
Copy link
Member

pre-commit.ci autofix

Copy link
Member

@neutrinoceros neutrinoceros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all my concerns were addressed. Please consider my approval as sticky even if further change is needed.

@neutrinoceros
Copy link
Member

I went through all remaining threads and ended up closing them all since they were actually addressed already. Since we have 2 approvals I'll merge this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-consistency naming conventions, code deduplication, informative error messages, code smells... index: particle refactor improve readability, maintainability, modularity
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants