-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Modify Workflow to Allow IterableDataset Inputs #8263
base: dev
Are you sure you want to change the base?
Conversation
Signed-off-by: Eric Kerfoot <eric.kerfoot@kcl.ac.uk>
for more information, see https://pre-commit.ci
Signed-off-by: Eric Kerfoot <eric.kerfoot@kcl.ac.uk>
…gine_stream_fix Signed-off-by: Eric Kerfoot <eric.kerfoot@kcl.ac.uk>
Signed-off-by: Eric Kerfoot <eric.kerfoot@kcl.ac.uk>
/build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the pr, looks good to me. One minor comments inline.
raise ValueError("If data_loader is not PyTorch DataLoader, must specify the epoch_length.") | ||
try: | ||
epoch_length = len(data_loader) | ||
except TypeError: # raised when data_loader is given an iterable dataset which has no length |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think we should add a warning here or validate the dataset instance in the data_loader
?
isinstance(data_loader.dataset, Iterable)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The data_loader
argument itself isn't always a DataLoader
instance so may not have a dataset
member, it can be a Iterable
itself which might have a length. It think there's a use case where you'd pass in just a list of items you already have a so not use a DataLoader
at all and so this is the more robust way of attempting to get a length.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it looks like if epoch_length is None:
this if should be moved to the outside level, right?
Description
This modifies the behaviour of
Workflow
to permitIterableDataset
to be used correctly. A check against theepoch_length
value is removed, to allow that value to beNone
, and a test is added to verify this. The length of a data loader is not defined when using iterable datasets, so try/raise is added to allow that to be queried safely. This is related to my work on the streaming support, in my prototype gist I had to provide a bogus epoch length value in the then change it toNone
later once the evaluator object was created. This PR will remove the need for this hack.Types of changes
./runtests.sh -f -u --net --coverage
../runtests.sh --quick --unittests --disttests
.make html
command in thedocs/
folder.