
USAVars Augmentation maps to 0 #1432

Open
nilsleh opened this issue Jun 20, 2023 · 4 comments · Fixed by #1433 · May be fixed by #2147
Labels
datamodules (PyTorch Lightning datamodules), transforms (Data augmentation transforms)

Comments

@nilsleh
Collaborator

nilsleh commented Jun 20, 2023

Description

In the USAVars datamodule, the default augmentation from NonGeoDataModule is used. However, the dataset returns uint8 data, and the data comes out of the augmentation still as uint8: the augmentation normalizes the image into float values, and casting the result back to uint8 truncates almost everything to zero. This means you get an error when trying to train, and your input images are all zeros.

Steps to reproduce

dm = USAVarsDataModule(root="path/to/usa_vars", batch_size=16)
dm.setup("fit")
dl = dm.train_dataloader()
batch = next(iter(dl))
aug_batch = dm.aug(batch)
# Expected: a nonzero float maximum; actual: a zero uint8 tensor
print(aug_batch["image"].max())
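The failure mode can be illustrated without torchgeo at all. The following is a hypothetical minimal sketch of the suspected mechanism (the exact normalization inside the datamodule is an assumption on my part): normalizing a uint8 image in float and casting back to the original dtype truncates everything below 1.0 to zero.

```python
import torch

# Minimal sketch of the suspected mechanism (plain PyTorch, no torchgeo):
# normalize a uint8 image in float, then cast back to the original dtype.
img = torch.full((1, 3, 4, 4), 128, dtype=torch.uint8)
normalized = img.float() / 255.0     # values now in [0, 1]
restored = normalized.to(img.dtype)  # 0.502... truncates to 0
print(restored.max())                # tensor(0, dtype=torch.uint8)
```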

Version

'0.5.0.dev0'
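One possible workaround (a sketch of my own, not necessarily the fix adopted in #1433) is to cast images to float before the augmentation runs, so there is no lossy cast back to uint8; `cast_to_float` is a hypothetical helper name:

```python
import torch
from torch import Tensor

def cast_to_float(sample: dict[str, Tensor]) -> dict[str, Tensor]:
    # Hypothetical helper: convert the image to float32 so the
    # augmentation's normalized output is not truncated back to uint8.
    sample["image"] = sample["image"].float()
    return sample

batch = {"image": torch.randint(0, 256, (16, 4, 28, 28), dtype=torch.uint8)}
batch = cast_to_float(batch)
print(batch["image"].dtype)  # torch.float32
```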

@nilsleh nilsleh mentioned this issue Jun 20, 2023
@nilsleh
Collaborator Author

nilsleh commented Jun 20, 2023

I'm also not sure to what extent other current datasets have the same issue, because it's a silent bug.

@adamjstewart adamjstewart added this to the 0.4.2 milestone Jun 20, 2023
@adamjstewart
Collaborator

adamjstewart commented Jun 20, 2023

This is another reason I want #985. Although I don't think rasterio can statically infer the type of the file, so maybe that wouldn't help. We'd have to carefully annotate each read.

@nilsleh
Collaborator Author

nilsleh commented Jun 22, 2023

I am reopening this because two other people have told me the same thing happened with their custom datasets. Is there maybe a way we could implement a warning at some stage, for example around

# Convert all inputs back to their previous dtype

or some other check? It's quite a sneaky phenomenon that takes some digging to understand.
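One shape such a check could take (a hypothetical sketch; `check_dtype_roundtrip` and its placement are assumptions, not existing torchgeo API): warn when casting the augmented output back to the input dtype would collapse a nonzero image to all zeros.

```python
import warnings
import torch

def check_dtype_roundtrip(original: torch.Tensor, augmented: torch.Tensor) -> None:
    # Hypothetical check: warn when casting the augmented output back to
    # the input dtype would collapse a nonzero image to all zeros.
    restored = augmented.to(original.dtype)
    if original.any() and not restored.any():
        warnings.warn(
            f"Augmentation output is all zeros after casting back to "
            f"{original.dtype}; consider converting inputs to float first."
        )

img = torch.full((1, 3, 4, 4), 128, dtype=torch.uint8)
check_dtype_roundtrip(img, img.float() / 255.0)  # emits the warning
```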

@nilsleh nilsleh reopened this Jun 22, 2023
@adamjstewart
Collaborator

We could certainly warn, but I want to make sure warnings are uncommon. I also really want to get rid of our AugmentationSequential wrapper and instead use the Kornia one. The biggest thing holding us back is kornia/kornia#2119. I started working on this but quickly found it more difficult than I expected.

I think one of the following solutions would be fine:

These are roughly sorted in order from "this is what I want long-term" to "this would be tolerable short-term".

@adamjstewart adamjstewart modified the milestones: 0.4.2, 0.5.0 Sep 28, 2023
@adamjstewart adamjstewart added transforms Data augmentation transforms datamodules PyTorch Lightning datamodules labels Oct 11, 2024