I'm having some issues understanding SaveImage and SaveImaged output path and filename configuration #6188
I'm trying to set up a simple dataset/model management suite to simplify running experiments with supervised and unsupervised training techniques. A) I want to know where the B) I want to specify with more granularity where this sample ends up. Solving either problem will make my life easier, because I can use the solution to one to work around the other. The dataset I'm testing my implementation with is the BraTS2021 data, as provided on the challenge page:
I've written some code to manage this dataset, where
@dataclass(init=True, repr=True, eq=False, kw_only=True, slots=True)
class Sample(_SerializableClass):
    """
    Represents a single sample in our dataset.

    Can be supervised or unsupervised, depending on the presence of a label path.

    :param image: The path to the image file
    :param label: The path to the label file, optional
    :param name: The name of this sample, optional.
        If not given, will default to the name of the image.
    :param is_preprocessed: Whether this sample has been pre-processed or not
    """
    image: Path
    label: Optional[Path] = field(default=None)
    name: Optional[str] = field(default=None)
    is_preprocessed: bool = field(default=False)
    ...

@dataclass(init=True, repr=True, eq=False, kw_only=True, slots=True)
class Dataset(_SerializableClass):
    """
    A representation of an entire dataset.

    A representation of an entire dataset, with all accompanying parameters.
    This object manages samples, pre-processing and general dataset information.

    :param name: The name of this dataset
    :param authors: The authors of this dataset
    :param license: The license of this dataset.
    :param creation_date: The moment when this dataset was first instantiated
    :param modified_date: The last time this dataset was altered
    :param channels: The channels of the input images, and what they represent
    :param classes: The classes of the labels, and what they represent
    :param samples: A list of the samples of this dataset.
    :param spacing_override: If set, will override median spacing of this dataset for pre-processing.
    """
    name: str = field(repr=True)
    path: Path = field(init=False, repr=False)
    authors: list[str] = field(default_factory=list, repr=True)
    license: str = field(default="All rights reserved", repr=True)
    creation_date: Arrow = field(default_factory=now, repr=True)
    modified_date: Arrow = field(default_factory=now, repr=True)
    channels: dict[int, str] = field(default_factory=dict, init=True, repr=False)
    classes: dict[int, str] = field(default_factory=dict, init=True, repr=False)
    samples: list[Sample] = field(default_factory=list, init=True, repr=False)
    spacing_override: Optional[tuple[float, float, float]] = field(
        init=True, default=None
    )
    ...
The JSON representation of such objects looks like:
{
    "name": "BraTS2021",
    "authors": [],
    "license": "All rights reserved",
    "creation_date": "2023-03-20T17:21:39.470708+01:00",
    "modified_date": "2023-03-20T17:21:58.437291+01:00",
    "channels": {},
    "classes": {},
    "spacing_override": null,
    "samples": [
        {
            "name": "BraTS2021_00000",
            "is_preprocessed": false,
            "image": "E:/BraTS2021/RSNA_ASNR_MICCAI_BraTS2021_TrainingData_16July2021/BraTS2021_00000/BraTS2021_00000_t1ce.nii.gz",
            "label": "E:/BraTS2021/RSNA_ASNR_MICCAI_BraTS2021_TrainingData_16July2021/BraTS2021_00000/BraTS2021_00000_seg.nii.gz"
        },
        ...
        {
            "name": "BraTS2021_01767",
            "is_preprocessed": false,
            "image": "E:/BraTS2021/RSNA_ASNR_MICCAI_BraTS2021_ValidationData/BraTS2021_01767/BraTS2021_01767_t1ce.nii.gz"
        },
        ...
Now that I have the relevant samples, I want to be able to take these and feed them through a preprocessing pipeline:
from pprint import pprint
from typing import Optional

import monai.transforms as mt

# ---default_transforms.py---
def init_base_transforms(
    image_key: str = "image",
    label_key: Optional[str] = None,
    spacing: tuple[float, float, float] = (0.5, 0.5, 0.5),
    allow_missing: bool = False,
) -> mt.Compose:
    keys = [image_key]
    interp_mode = ["bilinear"]
    if label_key is not None:
        keys.append(label_key)
        # Labels are resampled with nearest-neighbour to keep class values intact
        interp_mode.append("nearest")
    transforms = [
        mt.LoadImaged(keys=keys, allow_missing_keys=allow_missing),
        mt.EnsureChannelFirstd(keys=keys, allow_missing_keys=allow_missing),
        mt.EnsureTyped(keys=keys, allow_missing_keys=allow_missing),
        mt.Orientationd(keys=keys, axcodes="LPS", allow_missing_keys=allow_missing),
        mt.Spacingd(
            keys=keys,
            pixdim=spacing,
            mode=interp_mode,
            allow_missing_keys=allow_missing,
        ),
        mt.NormalizeIntensityd(
            keys=[image_key],
            nonzero=True,
            channel_wise=True,
            allow_missing_keys=allow_missing,
        ),
    ]
    return mt.Compose(transforms)
# ---------------------------

preprocessing_transforms = init_base_transforms(
    image_key="image",
    label_key="label",
    spacing=dataset.dataset_spacing,
    allow_missing=True,
)
# Helper function to create a list[dict[str, Path]] for use in dictionary transforms
samples = dataset.as_monai_list(get_supervised=True, get_unsupervised=True)
save_transform = mt.SaveImaged(
    keys=["image", "label"],
    allow_missing_keys=True,
    # dataset.json location -> ./Preprocessed
    output_dir=dataset.processed_samples_path,
    output_postfix="",
    resample=False,
    separate_folder=False,
    print_log=True,
)
for sample in samples:
    output = preprocessing_transforms(sample)
    pprint(save_transform(output))
Running this code gives me:
If I set
I've been digging into the
I hope I haven't overwhelmed you with logs and code, and that I have pared down my issue to a minimal example.
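The `as_monai_list` helper called in the snippet above is not shown in the post. The following is a minimal sketch of what such a helper might look like, assuming it only needs to turn each sample into a dictionary keyed by `"image"` and `"label"`; the simplified `Sample` class and the function body are illustrative assumptions, not the author's implementation.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class Sample:
    # Simplified stand-in for the Sample dataclass in the post (assumption)
    image: Path
    label: Optional[Path] = None
    name: Optional[str] = None


def as_monai_list(
    samples: list[Sample],
    get_supervised: bool = True,
    get_unsupervised: bool = True,
) -> list[dict[str, Path]]:
    """Hypothetical helper: build one dict per sample for dictionary transforms."""
    items: list[dict[str, Path]] = []
    for s in samples:
        supervised = s.label is not None
        if supervised and not get_supervised:
            continue
        if not supervised and not get_unsupervised:
            continue
        item = {"image": s.image}
        if supervised:
            # Only add "label" when it exists, so transforms created with
            # allow_missing_keys=True can skip it for unsupervised samples.
            item["label"] = s.label
        items.append(item)
    return items
```

Leaving the `"label"` key out entirely, rather than setting it to `None`, is a design choice in this sketch: it lets `allow_missing_keys=True` do the skipping for unsupervised samples.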
Replies: 1 comment 3 replies
I've found a solution, but it feels very convoluted and prone to breaking with any future MONAI update. In the current version, V1.1.0, I can do the following. First, I make a CustomFolderLayout, based on monai.data.folder_layout.FolderLayout:
from pathlib import Path

class CustomFolderLayout:
    def __init__(
        self,
        output_dir: Path,
        extension: str = "",
        makedirs: bool = True,
        **kwargs,
    ):
        self.output_dir: Path = output_dir
        # Normalise the extension so it is either empty or starts with a dot
        if not extension:
            self.ext = ""
        elif extension.startswith("."):
            self.ext = extension
        else:
            self.ext = f".{extension}"
        self.makedirs = makedirs

    def filename(self, subdirectory: str, filename: str = "subject", **kwargs):
        path = self.output_dir / subdirectory
        # Any extra keyword arguments are appended as _key-value pairs
        for k, v in kwargs.items():
            filename += f"_{k}-{v}"
        if self.ext is not None:
            filename += self.ext
        return path / filename
In here I need the Now that I have my
import monai.transforms as mt
from tqdm import tqdm

def _name_formatter(metadict: dict, saver: mt.SaveImage) -> dict:
    # The returned keys become the keyword arguments of CustomFolderLayout.filename
    return {
        "subdirectory": metadict["sample_name"],
        "filename": metadict["dict_key"]
    }
...
save_transform = mt.SaveImaged(
    keys=["image", "label"],
    allow_missing_keys=True,
    output_dir=dataset.processed_samples_path,
    output_postfix="",
    resample=False,
    separate_folder=False,
    print_log=True,
    output_name_formatter=_name_formatter,
)
# Please forgive me for this absolute hack to get path naming to work as I want
save_transform.saver.folder_layout = CustomFolderLayout(
    output_dir=save_transform.saver.folder_layout.output_dir,
    extension=save_transform.saver.output_ext,
    makedirs=True
)
sample: Sample
for sample in tqdm(dataset.samples, desc="Processing samples", leave=False):
    output: dict = preprocessing_transforms(sample.to_monai())
    # And this one, I beg of you
    for k in save_transform.keys:
        output[f"{k}_meta_data"]["dict_key"] = k
        output[f"{k}_meta_data"]["sample_name"] = sample.name
    save_transform(output)
But this gave me problems, as the Idea is the same, we're modifying the
...
save_transform = mt.SaveImaged(
    keys=["image", "label"],
    allow_missing_keys=True,
    output_dir=dataset.processed_samples_path,
    output_postfix="",
    resample=False,
    separate_folder=False,
    print_log=True,
    output_name_formatter=_name_formatter,
)
# Please forgive me for this absolute hack to get path naming to work as I want
save_transform.saver.folder_layout = CustomFolderLayout(
    output_dir=save_transform.saver.folder_layout.output_dir,
    extension=save_transform.saver.output_ext,
    makedirs=True
)
sample: Sample
for sample in tqdm(dataset.samples, desc="Processing samples", leave=False):
    output: dict = preprocessing_transforms(sample.to_monai())
    # And this one, I beg of you
    for k in save_transform.keys:
        output[k].meta["dict_key"] = k
        output[k].meta["sample_name"] = sample.name
    save_transform(output)
This gives me the output I wanted:
Because the file names are now standardized and I know the subdirectory, I can go back and update my dataset records to point to the pre-processed data. My question remains, however: I've managed to solve it this way, but there's no guarantee that this will remain a workable solution or won't cause conflicts in the future. Did I just completely misunderstand the
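To make the naming scheme above concrete, here is a minimal, MONAI-free sketch of how the formatter's return value feeds the layout's `filename` method. `LayoutSketch` and `name_formatter` are stand-ins written for illustration (assumptions, not MONAI classes), and the `Preprocessed` directory and `nii.gz` extension are example values.

```python
from pathlib import PurePosixPath


class LayoutSketch:
    """Stand-in for CustomFolderLayout (illustrative; does not import MONAI)."""

    def __init__(self, output_dir, extension=""):
        self.output_dir = PurePosixPath(output_dir)
        # Normalise the extension so it is either empty or starts with a dot
        if extension and not extension.startswith("."):
            extension = f".{extension}"
        self.ext = extension

    def filename(self, subdirectory, filename="subject", **kwargs):
        # Extra keyword arguments are appended as _key-value pairs
        for k, v in kwargs.items():
            filename += f"_{k}-{v}"
        return self.output_dir / subdirectory / (filename + self.ext)


def name_formatter(metadict):
    # Maps the two injected metadata entries onto filename() arguments
    return {
        "subdirectory": metadict["sample_name"],
        "filename": metadict["dict_key"],
    }


layout = LayoutSketch("Preprocessed", extension="nii.gz")
meta = {"sample_name": "BraTS2021_00000", "dict_key": "image"}
print(layout.filename(**name_formatter(meta)))
# prints Preprocessed/BraTS2021_00000/image.nii.gz
```

Because the result depends only on the sample name and the dictionary key, the same two values are enough to recompute the saved path later, for instance when pointing a dataset record at its pre-processed file.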