-
Notifications
You must be signed in to change notification settings - Fork 279
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Time series enhancements #2690
Time series enhancements #2690
Conversation
8529390
to
e7ec520
Compare
So atm it looks like the failing tests are exactly the two I just added. This is because they are written for pytest and our CI is still running nose. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I left a few inline comments. Somebody who knows the time series stuff may be able to offer some insight here.
The deprecation of from_filenames
seems reasonable given the existence of __new__
, especially because the docstring on DatasetSeries has the same kwargs available as from_filenames
, so it seems like there's no added flexibility from the latter. I am curious why it is still around (since from_filenames
is from ~2013 and __new__
is from ~2014).
yt/data_objects/time_series.py
Outdated
file_list = td_filenames | ||
else: | ||
raise YTOutputNotIdentified(filenames, {}) | ||
def get_filenames_from_glob_pattern(pattern): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the docstring you name this arg outputs
, not pattern
. They should be consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH I don't really understand why this needed to be changed from filenames? They should be output files after all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well this function's name indicate that it's meant to be passed a pattern. It's true that, in practice, it receives the "outputs" param from DatasetSeries.__new__()
, but only in as a special-case handling where the parameter doesn't actually hold "outputs" yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you’re not convinced I can change it, it’s not that crucial. Let me know !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, but the way I'm reading this there's no separation in functionality right? outputs will always be sent to this function. There's no separation if it's a list or a pattern. In new we see outputs = get_filenames_from_glob_pattern(outputs)
, so it isn't always going to be sent a pattern and not a list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, do we have a test verifying that a list works?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't show up in the lines I've touched but there is a separation in DatasetSeries.__new__
yt/yt/data_objects/time_series.py
Line 141 in 6befef2
if isinstance(outputs, str): |
I have no idea if we have a test for lists. I considered adding one but I don't know of any sample data that'd be appropriate for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just verified that I could use a list with enzo_tiny_cosmology
.
That said, I know this is nitpicky but I think it's good to be consistent with how we send args in defintions vs. in the place they're used in our code. I would prefer if the definition also used outputs
so we're consistent, even if there's an implication it will be a pattern of some sort.
This can now viewed as a companion PR to #2695 |
…kens, raise FileNotFoundError if no match is found. Make new tests pass.
Co-authored-by: Madicken Munk <madicken.munk@gmail.com>
5c058ef
to
5c74be0
Compare
@munkm I also don't remember why we had both. I think there may have been a time where we were anticipating making a This looks good to me. |
I'm happy to update this if needed once we resolve the above conversations @munkm 😄 |
Wait, I'm confused. The thing you just reverted was ok based on our discussions? |
yes ! Did I misinterpret the following
? |
yt/data_objects/time_series.py
Outdated
def get_filenames_from_glob_pattern(pattern): | ||
epattern = os.path.expanduser(pattern) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def get_filenames_from_glob_pattern(pattern): | |
epattern = os.path.expanduser(pattern) | |
def get_filenames_from_glob_pattern(outputs): | |
epattern = os.path.expanduser(outputs) |
I think maybe (or maybe I did?). I was trying to say I was ok with what you were saying. Eta: sorry, my bad! You didn't misinterpret. I did! I thought the reversion had already happened and you were going back to pre-pre functionality! |
… add a docstring for clarity
Looks like we settled on something satisfying for everyone now ➡️ :partyparot: |
PR Summary
This is part of an effort to harmonise and simplify the data loading api.
This PR includes the following improvements to
DatasetSeries
:DatasetSeries.from_filenames()
because it's redundant sinceDatasetSeries.__new__
was added, and I suspect it is not used'~'
token expansion, to improve consistency betweenDatasetSeries()
andyt.load()
)I changed the error raised in the pattern parsing function to
FileNotFoundError
to better reflect the issue, this could potentially break downstream code that relies on this function throwingYTOutputNotIdentified
, though it seems very unlikely.PR Checklist