How to de-identify a pydicom.Dataset? - addition of example to docs #210

Closed
fcossio opened this issue Aug 1, 2022 · 11 comments

@fcossio
Contributor

fcossio commented Aug 1, 2022

Hi,
I have implemented my own logic to load a pydicom.Dataset instance from a database. I would like to de-identify the instance without having to write it as a file and then read it with deid.

Is there anything similar to

def replace_identifiers(recipe, dataset: pydicom.dataset.Dataset) -> pydicom.dataset.Dataset:
    """de-identify a single pydicom.dataset.Dataset instance"""
    ...

?

@vsoch
Member

vsoch commented Aug 1, 2022

Aside from adding typing to deid here, you should be able to do:

if isinstance(dataset, pydicom.dataset.Dataset):
    replace_identifiers(...)

@vsoch
Member

vsoch commented Aug 1, 2022

Also, typing in and of itself doesn't prevent you from providing the wrong type! E.g.:

In [1]: def func(name: str):
   ...:     print(name)
   ...: 

In [2]: func(1)
1
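
To make the point above concrete, here is a minimal sketch (the function names are made up for illustration): annotations are not checked at runtime, so a wrong type only fails if the function guards against it explicitly.

```python
def func(name: str):
    # The annotation is only a hint; nothing is checked at call time.
    return name


def strict_func(name: str):
    # An explicit guard turns a wrong type into an error.
    if not isinstance(name, str):
        raise TypeError(f"expected str, got {type(name).__name__}")
    return name


print(func(1))  # the int passes straight through
try:
    strict_func(1)
except TypeError as err:
    print(err)
```

A static checker such as mypy would flag `func(1)` before the code runs, but that is a separate step from running it.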

@fcossio
Contributor Author

fcossio commented Aug 2, 2022

After some more digging through the documentation, I solved my problem with the following:

class DeidDataset:
    def __init__(self, recipe_path: str = None):
        """Deidentify datasets according to vaib recipe

        :param recipe_path: path to the deid recipe
        """
        if recipe_path is None:
            logging.warning(f"DeidDataset using default recipe {default_recipe_path}")
            recipe_path = default_recipe_path
        self.recipe = DeidRecipe(recipe_path)

    def anonymize(self, dataset: pydicom.Dataset) -> pydicom.Dataset:
        """Anonymize a single dicom dataset

        :param dataset: dataset that will be anonymized
        :returns: anonymized dataset
        """
        parser = DicomParser(dataset, self.recipe)
        parser.define('remove_day', self.remove_day)
        parser.define('round_AS_to_nearest_5y', self.round_AS_to_nearest_5y)
        parser.define('round_DS_to_nearest_5', self.round_DS_to_nearest_5)
        parser.define('round_DS_to_nearest_0_05', self.round_DS_to_nearest_0_05)
        parser.parse(strip_sequences=True, remove_private=True)
        return parser.dicom
    ...
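
The helper methods registered with parser.define are elided above. Purely as an illustration, two of them might boil down to string transformations like the ones below. These are assumptions, not the author's implementations: the names carry a `_sketch` suffix to mark them as hypothetical, "remove day" is read here as "set the day to 01", and how such helpers are wired into deid is left out. The only facts relied on are the DICOM value formats: DA is YYYYMMDD and AS is nnn followed by D, W, M or Y.

```python
import re


def remove_day_sketch(date_value: str) -> str:
    """Set the day of a DICOM DA value (YYYYMMDD) to 01. Hypothetical helper."""
    if not re.fullmatch(r"\d{8}", date_value):
        return date_value  # unexpected format: returned unchanged
    return date_value[:6] + "01"


def round_as_to_nearest_5y_sketch(age_value: str) -> str:
    """Round a DICOM AS value in years (nnnY) to the nearest 5 years. Hypothetical helper."""
    match = re.fullmatch(r"(\d{3})Y", age_value)
    if match is None:
        return age_value  # days, weeks, months or malformed: returned unchanged
    years = int(match.group(1))
    # Integer arithmetic for nearest multiple of 5, capped so the result stays three digits.
    rounded = min(5 * ((years + 2) // 5), 995)
    return f"{rounded:03d}Y"


print(remove_day_sketch("19870623"))
print(round_as_to_nearest_5y_sketch("043Y"))
```

Returning unexpected values unchanged keeps the sketch short; a real recipe may prefer to blank them instead so that nothing identifying slips through.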

Thanks for making this tool available.

@fcossio fcossio closed this as completed Aug 2, 2022
@vsoch
Member

vsoch commented Aug 2, 2022

Oh that's fantastic! Do you mind if I include it somewhere in our docs as an example? Even if we create a gist and then link to it, I think it might be super helpful for future users.

@fcossio
Contributor Author

fcossio commented Aug 3, 2022

Of course! I will be OOO for the next two weeks. If you can wait until then, I will make a proper PR afterwards adding the example to the docs.

@fcossio
Contributor Author

fcossio commented Aug 3, 2022

Actually, I just found that this only works for files. Two lines must be silenced for it to work with a dataset that doesn't come from a file:

self.dicom_file = os.path.abspath(self.dicom.filename)

self.dicom_name = os.path.basename(self.dicom_file)

and the file meta here:

datasets = [dicom, dicom.file_meta]

@vsoch
Member

vsoch commented Aug 3, 2022

Yes of course! When you are back, ping me if you have questions or want any help.

@vsoch vsoch reopened this Aug 3, 2022
@vsoch vsoch changed the title from "How to de-identify a pydicom.Dataset?" to "How to de-identify a pydicom.Dataset? - addition of example to docs" Aug 3, 2022
@fcossio
Contributor Author

fcossio commented Aug 15, 2022

I'm back 😄

I will need to expose an argument to be able to silence these two lines, which only work for a dataset that comes from a file:

self.dicom_file = os.path.abspath(self.dicom.filename)

self.dicom_name = os.path.basename(self.dicom_file)

I propose adding a boolean from_file argument to the __init__ method of DicomParser and then using if-else statements to silence the lines accordingly.

For the DicomField part that needs to be silenced, which is the file meta here:

datasets = [dicom, dicom.file_meta]

I can add the same argument to this method and skip the dicom.file_meta part accordingly.

def get_fields(dicom, skip=None, expand_sequences=True, seen=None):

To keep the interface intact, the proposed arguments would default to True, and the user would set them to False only when needed.

Does this sound good to you @vsoch ?

@vsoch
Member

vsoch commented Aug 15, 2022

A dataset that doesn’t come from a file - what would it be?

@fcossio
Contributor Author

fcossio commented Aug 15, 2022

I am loading a dataset that was stored in a database as JSON, so it has no file path or file_meta.

@vsoch
Member

vsoch commented Aug 15, 2022

Gotcha, ok. Just make functions to derive both of those items then, set them to None if they cannot be derived, and make sure the places that use them also respond appropriately.
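
One way this suggestion could look is sketched below. Everything here is an assumption for illustration: the function names are hypothetical and not part of deid, and SimpleNamespace objects stand in for pydicom datasets so the snippet runs without extra dependencies. Each helper returns None (or skips the item) when the dataset did not come from a file.

```python
import os
from types import SimpleNamespace


def derive_dicom_file(dicom):
    """Return the absolute path of the source file, or None if there is none."""
    filename = getattr(dicom, "filename", None)
    # Missing, empty, or non-path values (e.g. a file-like object) count as "no file".
    if not filename or not isinstance(filename, (str, os.PathLike)):
        return None
    return os.path.abspath(filename)


def derive_dicom_name(dicom_file):
    """Return the base name of the file, or None when no file is known."""
    return None if dicom_file is None else os.path.basename(dicom_file)


def derive_datasets(dicom):
    """Return the datasets to scan, adding file_meta only if it exists."""
    datasets = [dicom]
    file_meta = getattr(dicom, "file_meta", None)
    if file_meta is not None:
        datasets.append(file_meta)
    return datasets


# Stand-ins for pydicom datasets, only for illustration.
from_db = SimpleNamespace()
from_disk = SimpleNamespace(filename="scans/image.dcm", file_meta=SimpleNamespace())

print(derive_dicom_file(from_db), len(derive_datasets(from_db)))
print(derive_dicom_name(derive_dicom_file(from_disk)), len(derive_datasets(from_disk)))
```

Compared with a from_file flag, this keeps the existing call signatures unchanged, at the cost that every consumer of the file path and name has to tolerate None.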
