Sampling random generator #2309

Merged: 10 commits merged into microsoft:main on Sep 23, 2024
Conversation

sfalkena (Contributor):

Following #2307, this adds the possibility to set the seed of the random samplers by supplying a torch.Generator. As far as I could tell, this was the only place where a random generator was not yet used, apart from the Kornia transforms (would we want that? They do not support a generator argument).
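
For context, a minimal sketch of the pattern this PR adds: a torch.Generator is seeded once and handed to a random sampler so that the index order is reproducible. The runnable part uses PyTorch's built-in RandomSampler, which already accepts a generator; the commented-out torchgeo call at the end is only an assumption about how the new argument would be used, not the PR's exact signature.

import torch
from torch.utils.data import RandomSampler

# Two samplers built from generators seeded with the same value yield the
# same index sequence, which is the reproducibility this PR enables for the
# torchgeo samplers.
g1 = torch.Generator().manual_seed(0)
g2 = torch.Generator().manual_seed(0)
data = list(range(10))
assert list(RandomSampler(data, generator=g1)) == list(RandomSampler(data, generator=g2))

# Assumed torchgeo usage after this PR (argument name and other parameters
# are illustrative; ``dataset`` would be any GeoDataset):
# sampler = RandomGeoSampler(dataset, size=256, length=1000, generator=g1)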

github-actions bot added the samplers (Samplers for indexing datasets) and datamodules (PyTorch Lightning datamodules) labels on Sep 20, 2024.
github-actions bot added the testing (Continuous integration testing) label on Sep 20, 2024.
calebrob6 previously approved these changes on Sep 21, 2024.
adamjstewart added this to the 0.7.0 milestone on Sep 23, 2024.
adamjstewart (Collaborator) left a review comment:

Looks great! Just some minor comments on consistency and simplicity.

We should probably do the same for all datamodules, but let's do that in a separate PR.

(Resolved review threads on tests/samplers/test_batch.py and torchgeo/samplers/batch.py.)
sfalkena (Contributor, Author):

What is going on with codecov failing? When I check, the dashboard states 100%.

adamjstewart (Collaborator):

I'm trying to figure out the codecov thing now. I'm 99.9% sure it's a glitch, not your fault. Let's rerun the tests just to make sure.

adamjstewart (Collaborator), commenting on the line "default: None":

This is actually a bit tricky. For many uses of torch.rand*, None is fine. However, in some cases, like datamodule splitting, we actually have to use some kind of generator. If we don't, we end up with overlapping train/val/test splits. So to clarify, I think the default value to use in the function signature is None, but for some cases we need to use:

if generator is None:
    generator = torch.default_generator  # PyTorch's global default generator

or something like that.
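
To make the overlap concern concrete, here is a small sketch using torch.utils.data.random_split (the 100-item list and the 80/20 lengths are illustrative, not torchgeo's actual splitting code): two calls that rely on the advancing global default generator produce different partitions, so indices can cross between one call's train set and another call's val set, whereas an explicitly seeded generator makes every call reproduce the same split.

import torch
from torch.utils.data import random_split

data = list(range(100))

# Splitting with the global default generator: its state advances between
# calls, so a split computed in one setup() call need not match a split
# computed in another, and train/val indices can overlap across calls.
train_a, val_a = random_split(data, [80, 20])
train_b, val_b = random_split(data, [80, 20])
print(len(set(val_b.indices) & set(train_a.indices)))  # almost certainly > 0

# Splitting with an explicitly seeded generator: every call reproduces the
# same partition, so the splits stay consistent.
train_c, val_c = random_split(data, [80, 20], generator=torch.Generator().manual_seed(0))
train_d, val_d = random_split(data, [80, 20], generator=torch.Generator().manual_seed(0))
assert val_c.indices == val_d.indices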

adamjstewart enabled auto-merge (squash) on September 23, 2024 at 14:40.
adamjstewart merged commit dfebdae into microsoft:main on Sep 23, 2024.
20 of 21 checks passed.
adamjstewart (Collaborator):

Do you want to open a PR to harmonize the earlier places where we used a generator?

Labels: datamodules (PyTorch Lightning datamodules), samplers (Samplers for indexing datasets), testing (Continuous integration testing)
Projects: None yet
3 participants