Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alter the default behavior for organizing outputs #367

Merged
merged 8 commits into from
Feb 2, 2024

Conversation

mabruzzo
Copy link
Collaborator

Overview

This PR changes the way that Cholla organizes output files.

Previously all outputs were written into a single output directory.

  • This is fine when you have a small number of outputs.
  • However, when you have a lot of outputs, it can be hard to read (and identify the name of the most recent output).
  • Plus its a little tedious to selectively delete files

This PR changes the default output behavior. Now we group all outputs written at a single simulation cycle into a single directory. To use the older behavior you can set the new legacy_flat_outdir runtime parameter to 1 (in the future, I think it would be good to remove the older behavior completely).

More details

Output file paths traditionally followed the following template:

    "{outdir}{nfile}{pre_extension_suffix}{extension}[.{proc_id}]"

where each curly-braced token represents a different variable. In detail:
- {outdir} is the parameter from the parameter file. The historical behavior (that we currently maintain), if this is non-empty, then all characters following the last '/' are treated as a prefix to the output file name (if there aren't any '/' characters, then the whole string is effectively a prefix.
- {nfile} is the current file-output count.
- {pre_extension_suffix} is the pre-hdf5-extension suffix. It's the suffix that precedes the file extension (or {extension})
- {extension} is the filename extension. Examples include ".h5" or ".bin" or ".txt".
- {proc_id} represents the process-id that held the data that will be written to this file. In non-MPI runs, this will be omitted.

The new default behavior is to create

    "{outdir}/{nfile}/{nfile}{pre_extension_suffix}{extension}[.{proc_id}]"

where the the significance of each curly-braced token is largely unchanged. There are 2 things worth noting:
- all files written at a single simulation-cycle are now grouped in a single directory
- {outdir} never specifies a file prefix. When {outdir} is empty, it is treated as "./". Otherwise, we effectively append '/' to the end of {outdir}

Implementation Details

While doing all of this, I took the opportunity to consolidate the logic for formatting output filenames. Previously the logic was scattered throughout the codebase. Now, the logic is encapsulated by a class called FnameTemplate.

I acknowledge that it looks a little funny that we construct an FnameTemplate instance everywhere we want to determine an output filename (rather than just defining a function). I mostly did this in anticipation of some future refactoring of the io section of the codebase that I plan to do (basically it involves storing the output-related parameters outside of the global Parameter struct). In the short term, I think it would actually be nice to pass around a single FnameTemplate instance to all of the functions that use it, but I'm trying not to change function interfaces too much in the PR (I need to backport this change to the particle-feedback branch I have, which branched near the start of last summer). If you strongly dislike this design choice, I can change it.

Other Thoughts:

Currently the output filenames differ between a simulation compiled with MPI run with a single process compared to a simulation compiled without MPI. All of the filenames written by the MPI version will end in .h5.0. The non-MPI version writes filenames terminating in .h5. Going forward, I actually think it would be better to terminate the filenames with .h5.0 in the non-MPI case (for the sake of consistency). But I think that's a topic for another day

@brantr
Copy link
Collaborator

brantr commented Jan 26, 2024

This PR affects all downstream analysis. I agree it's "better", but it will have substantial downstream impact (either analysis scripts will be changed to reflect the output structure, or scripts will need to be used to move files around).

I think the change to serial file naming is terrific, if that matters.

@evaneschneider
Copy link
Collaborator

This PR affects all downstream analysis. I agree it's "better", but it will have substantial downstream impact (either analysis scripts will be changed to reflect the output structure, or scripts will need to be used to move files around).

I think the change to serial file naming is terrific, if that matters.

Yes for sure! This was on the CAAR to-do list for a long time and we never got around to it. But since @mabruzzo added a flag to maintain the legacy behavior for now, I think the impact should be minimal for current users. This is a good reminder though that we should double-check the concatenation and plotting examples in the python scripts directory to make them consistent.

@bcaddy
Copy link
Collaborator

bcaddy commented Jan 26, 2024

I think that this will break the concatenation scripts. I'm happy to help/advise on how to update them for this but I think it should be part of the same PR or at least those changes should be in flight before this is merged.

@mabruzzo
Copy link
Collaborator Author

If it would be less controversial, we could make the default-behavior the default choice. With that said, it would probably hinder efforts gradually phase out the old system.

It's unclear why the continuous integration failed. There doesn't seem to be any useful error messages when I click details

@bcaddy
Copy link
Collaborator

bcaddy commented Jan 26, 2024

If the CI stuff fails randomly try just rerunning it. Sometimes errors like that are just the file system or jenkins having issues.

@evaneschneider
Copy link
Collaborator

I think that this will break the concatenation scripts. I'm happy to help/advise on how to update them for this but I think it should be part of the same PR or at least those changes should be in flight before this is merged.

It should just be a matter of changing a single line of code in each of them, right? If so, I'm fine with just doing that as part of this PR. Although it's actually not clear to me from a quick glance that the concatenation scripts themselves would break, since they take the source directory as an input. It seems like it's the dask templates that would need to be edited.

@mabruzzo
Copy link
Collaborator Author

I think that this will break the concatenation scripts. I'm happy to help/advise on how to update them for this but I think it should be part of the same PR or at least those changes should be in flight before this is merged.

It should just be a matter of changing a single line of code in each of them, right? If so, I'm fine with just doing that as part of this PR. Although it's actually not clear to me from a quick glance that the concatenation scripts themselves would break, since they take the source directory as an input. It seems like it's the dask templates that would need to be edited.

Things will break since they try to iterate over all of the outputs. But fixing that is easy enough. I'll make of point of doing that

@mabruzzo
Copy link
Collaborator Author

If the CI stuff fails randomly try just rerunning it. Sometimes errors like that are just the file system or jenkins having issues.

Is there any way to do that without pushing another commit?

@evaneschneider
Copy link
Collaborator

I think that this will break the concatenation scripts. I'm happy to help/advise on how to update them for this but I think it should be part of the same PR or at least those changes should be in flight before this is merged.

It should just be a matter of changing a single line of code in each of them, right? If so, I'm fine with just doing that as part of this PR. Although it's actually not clear to me from a quick glance that the concatenation scripts themselves would break, since they take the source directory as an input. It seems like it's the dask templates that would need to be edited.

Things will break since they try to iterate over all of the outputs. But fixing that is easy enough. I'll make of point of doing that

Ah yes good point, I'm not as familiar with the newer behavior. Thanks!

@bcaddy bcaddy self-requested a review February 1, 2024 18:17
Copy link
Collaborator

@bcaddy bcaddy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good to me. Excellent use of std::filesystem.

@evaneschneider evaneschneider merged commit 3db8f10 into cholla-hydro:dev Feb 2, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants