reduce code duplication in IOHandler _read_particle_coords and _read_particle_fields #4597

chrishavlin · 2023-07-24T21:00:14Z

This PR reduces duplicated code related to iterating over data_file chunks in _read_particle_coords and _read_particle_fields across IOHandlers. Shouldn't affect any behavior...

matthewturk · 2023-07-24T21:50:27Z

yt/utilities/io_handler.py

+        return data_files
+
+    def _sorted_chunk_iterator(self, chunks):
+        data_files = self._get_data_files(chunks)


I think we should explicitly turn chunks into a list here, unless that has already been done higher up the stack; recently we exhausted an iterator and ran into issues. (Maybe with nc_cm1?) Also, we sort by x.filename and x.start, but not all frontends did that -- does x.start have a default value of 0 or something constant that we can rely on?

Ya, _get_data_files turns chunks into a list.
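For context, the iterator-exhaustion concern raised above can be reproduced in a few lines. This is only a minimal sketch and not yt code: `collect_twice` and `chunk_generator` are hypothetical stand-ins, and the chunks are plain strings.

```python
def collect_twice(chunks):
    """Walk `chunks` twice, as two separate loops over the same argument would."""
    first = [c for c in chunks]
    second = [c for c in chunks]  # empty if `chunks` was a one-shot iterator
    return first, second


def chunk_generator():
    # Hypothetical stand-in for a chunk iterator; yields plain strings.
    yield from ("chunk_a", "chunk_b")


# A generator is consumed by the first pass, so the second pass sees nothing.
gen_first, gen_second = collect_twice(chunk_generator())

# Materializing with list() up front lets both passes see every chunk.
list_first, list_second = collect_twice(list(chunk_generator()))
```

Calling `list(chunks)` once at the top of a helper is the usual way to make it safe for callers that pass a generator.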

Will check on how reliable the start attribute is.

Oh, and I left the frontends that do not sort by both filename and start intact. They override this method so that their behavior is unchanged.

OK, so ParticleFile.start does initialize to 0 if it's not provided (within ParticleFile.__init__; link to relevant code).

But in any case, the changes in this PR do not change any behavior -- only the frontends that were sorting by filename and start are using this function. Those that were not already doing that either override this method or don't use this method at all.
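To illustrate why a default `start` of 0 makes the `(filename, start)` key safe, here is a small sketch. `FakeDataFile` is a hypothetical stand-in, not yt's `ParticleFile`; it only mirrors the two attributes used for sorting and the default of 0 described above.

```python
from dataclasses import dataclass
from operator import attrgetter


@dataclass(frozen=True)
class FakeDataFile:
    # Hypothetical stand-in: only the attributes used as sort keys.
    filename: str
    start: int = 0  # assumed default, per the discussion above


def by_filename_and_start(data_files):
    """Sort by filename first, then by start offset within a file."""
    return sorted(data_files, key=lambda x: (x.filename, x.start))


# When every file keeps the default start, the tuple key reduces to a
# filename-only sort.
defaults_only = [FakeDataFile("c"), FakeDataFile("a"), FakeDataFile("b")]
tuple_order = by_filename_and_start(defaults_only)
name_order = sorted(defaults_only, key=attrgetter("filename"))

# When one file is split into pieces, start breaks the tie deterministically.
mixed = [FakeDataFile("a", 100), FakeDataFile("b"), FakeDataFile("a")]
mixed_order = by_filename_and_start(mixed)
```

The second case is where the two keys can differ: with a filename-only key, the relative order of the two `"a"` entries depends on the input order.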

yt/frontends/ahf/io.py

chrishavlin · 2023-07-25T14:04:46Z

yt/frontends/adaptahop/io.py

@@ -194,6 +185,10 @@ def members(self, ihalo):
            members = fpu.read_attrs(todo.pop(0))["particle_identities"]
        return members

+    def _sorted_chunk_iterator(self, chunks):
+        data_files = self._get_data_files(chunks)
+        yield from sorted(data_files, key=attrgetter("filename"))


@matthewturk just pointing out an example where I added an override of _sorted_chunk_iterator so that the frontend continues only sorting by filename. The change in ahf is similar.
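The override pattern described here can be sketched end to end as follows. This is an illustrative reconstruction under assumptions, not the merged yt code: the class names, the `DataFile`, `Obj`, and `Chunk` tuples, and `make_chunks` are hypothetical, and the body of `_get_data_files` is inferred from the loop shown in the sdf diff and the comments in this thread.

```python
from collections import namedtuple
from operator import attrgetter

# Hypothetical stand-ins for yt's data file, data object, and chunk types.
DataFile = namedtuple("DataFile", ["filename", "start"], defaults=[0])
Obj = namedtuple("Obj", ["data_files"])
Chunk = namedtuple("Chunk", ["objs"])


class BaseIOHandlerSketch:
    def _get_data_files(self, chunks):
        chunks = list(chunks)  # materialize so a generator is not exhausted
        data_files = set()
        for chunk in chunks:
            for obj in chunk.objs:
                data_files.update(obj.data_files)
        return data_files

    def _sorted_chunk_iterator(self, chunks):
        # Default ordering: filename first, then start offset.
        data_files = self._get_data_files(chunks)
        yield from sorted(data_files, key=lambda x: (x.filename, x.start))


class FilenameOnlyIOHandlerSketch(BaseIOHandlerSketch):
    def _sorted_chunk_iterator(self, chunks):
        # Override: keep the frontend's previous filename-only ordering.
        data_files = self._get_data_files(chunks)
        yield from sorted(data_files, key=attrgetter("filename"))


def make_chunks():
    files = [DataFile("b.dat"), DataFile("a.dat", 5), DataFile("a.dat")]
    yield Chunk(objs=[Obj(data_files=files)])


default_order = list(BaseIOHandlerSketch()._sorted_chunk_iterator(make_chunks()))
filename_order = list(
    FilenameOnlyIOHandlerSketch()._sorted_chunk_iterator(make_chunks())
)
```

Keeping the override in the subclass means the shared helper can change its default key later without silently changing the ordering in frontends that never sorted by start.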

chrishavlin · 2023-07-25T14:07:39Z

yt/frontends/sdf/io.py

-        for chunk in chunks:
-            for obj in chunk.objs:
-                data_files.update(obj.data_files)
+        data_files = self._get_data_files(chunks)
        assert len(data_files) == 1
        for _data_file in sorted(data_files, key=lambda x: (x.filename, x.start)):


just a note: I did not use the _sorted_chunk_iterator here because I wanted to keep the assert len(data_files) == 1 line above.

neutrinoceros

Glad to see this merged if @matthewturk is still happy with it.

use _sorted_chunk_iterator to reduce code dupe

7a67fe3

missed one fix ahf modification

chrishavlin added the "index: particle" and "refactor" (improve readability, maintainability, modularity) labels Jul 24, 2023

matthewturk previously approved these changes Jul 24, 2023

neutrinoceros reviewed Jul 25, 2023

chrishavlin commented Jul 25, 2023

move ptf halo validation to functions

c3784b0

chrishavlin dismissed matthewturk’s stale review via c3784b0 July 25, 2023 14:13

neutrinoceros approved these changes Jul 26, 2023

matthewturk approved these changes Jul 27, 2023

matthewturk merged commit 28defca into yt-project:main Jul 27, 2023

neutrinoceros added this to the 4.3.0 milestone Jul 27, 2023
