deid

The code and files for this example can be found here. In this example, we want to replace values that are nested in sequences. This operation is available in deid versions 0.1.34 and later, and currently supports the REPLACE and BLANK actions.

Imports

We first import the functions that we need:

from deid.dicom import get_identifiers, replace_identifiers
from deid.config import DeidRecipe

We are using an MR.dcm that is provided in the example’s folder linked above.

dicom_files = ['MR.dcm']
items = get_identifiers(dicom_files)

For each item (indexed by the dicom file name), sequences are flattened out in the data structure. For example:

 'ReferencedImageSequence__ReferencedSOPClassUID': '111111111111111111',
 'ReferencedImageSequence__ReferencedSOPInstanceUID': '111111111111111',
 'ReferencedPerformedProcedureStepSequence__InstanceCreationDate': '22222222',
 'ReferencedPerformedProcedureStepSequence__InstanceCreationTime': '22222222',
 'ReferencedPerformedProcedureStepSequence__InstanceCreatorUID': 'xxxxxxx',
 'ReferencedPerformedProcedureStepSequence__ReferencedSOPClassUID': 'xxxxxxxxxx',
 'ReferencedPerformedProcedureStepSequence__ReferencedSOPInstanceUID': 'xxxxxxxx',
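To make the flattened naming concrete, here is a minimal sketch of how nested sequence items could be collapsed into Parent__Child keys. It works on plain dictionaries rather than real DICOM datasets, and flatten_sequences is a hypothetical helper written for illustration, not a function provided by deid.

```python
def flatten_sequences(dataset, parent=""):
    """Collapse nested sequences into a single-level dict.

    A sequence is modelled here as a list of dicts; keys of nested
    items are joined to the parent name with a double underscore.
    """
    flat = {}
    for name, value in dataset.items():
        key = f"{parent}__{name}" if parent else name
        if isinstance(value, list):
            # Each list entry stands in for one sequence item
            for entry in value:
                flat.update(flatten_sequences(entry, parent=key))
        else:
            flat[key] = value
    return flat


# Hypothetical data shaped like the listing above
example = {
    "PatientID": "12345",
    "ReferencedImageSequence": [
        {
            "ReferencedSOPClassUID": "111111111111111111",
            "ReferencedSOPInstanceUID": "111111111111111",
        }
    ],
}
flat = flatten_sequences(example)
print(flat)
```

In this sketch, a later sequence item overwrites an earlier one that has the same key; how deid itself handles repeated items is not shown here.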

The function we will use for the example generates a uid, but you can also use such a function to communicate with databases or APIs, or to save the original value (and the newly generated one) in some (IRB approved) place.

The Deid Recipe

Updating header values means writing a series of actions in the deid recipe. In this folder, the file deid.dicom asks to REPLACE a field with the output of a function:

FORMAT dicom

%header

REPLACE InstanceCreationDate func:generate_uid

Let’s create an instance of our recipe:

# Create the DeidRecipe Instance from deid.dicom
recipe = DeidRecipe('deid.dicom')

Here are a few different ways to interact:

# To see an entire (raw in a dictionary) recipe just look at
recipe.deid

# What is the format?
recipe.get_format()
# dicom

# What actions do we want to do on the header?
recipe.get_actions()

[{'action': 'REPLACE',
  'field': 'InstanceCreationDate',
  'value': 'func:generate_uid'}]

# We can filter to an action type (not useful here, we only have one type)
recipe.get_actions(action='REPLACE')

# or we can filter to a field
recipe.get_actions(field='InstanceCreationDate')
[{'action': 'REPLACE',
  'field': 'InstanceCreationDate',
  'value': 'func:generate_uid'}]

# and logically, both (not useful here)
recipe.get_actions(field='PatientID', action="REMOVE")
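As a rough illustration of how a recipe line maps onto the action dictionaries shown above, the following sketch parses the %header section by hand and filters the result. Both parse_header_actions and filter_actions are hypothetical helpers made up for this example; DeidRecipe does its own parsing, and this is not its implementation.

```python
def parse_header_actions(recipe_text):
    """Turn '%header' lines such as 'REPLACE Field value' into dicts."""
    actions = []
    in_header = False
    for line in recipe_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("%"):
            in_header = line == "%header"
            continue
        if not in_header:
            continue  # e.g. the FORMAT line
        parts = line.split(maxsplit=2)
        action = {"action": parts[0], "field": parts[1]}
        if len(parts) == 3:
            action["value"] = parts[2]
        actions.append(action)
    return actions


def filter_actions(actions, action=None, field=None):
    """Keep only the actions matching the given action type and/or field."""
    return [
        a for a in actions
        if (action is None or a["action"] == action)
        and (field is None or a["field"] == field)
    ]


recipe_text = """FORMAT dicom

%header

REPLACE InstanceCreationDate func:generate_uid
"""
actions = parse_header_actions(recipe_text)
print(actions)
# Filtering on a field and action that are not in the recipe gives an empty list
print(filter_actions(actions, field="PatientID", action="REMOVE"))
```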

Our recipe instance is ready to go. The recipe says that we want to replace the InstanceCreationDate field with the output of the generate_uid function, which deid expects to find in the item dictionary. Let’s write that function next.

Write Your Function

A simple function with a uid generated from the uuid library might look like this:

def generate_uid(item, value, field):
    '''This function will generate a uuid! You can expect it to be passed
       the dictionary of items extracted from the dicom (and your function)
       and variables, the original value (func:generate_uid) and the field
       name you are applying it to.
    '''
    import uuid
    # a field can either be just the name string, or a DicomElement
    if hasattr(field, 'name'):
        field = field.name
    # replace spaces so the prefix is a single token
    prefix = field.lower().replace(" ", "_")
    return prefix + "-" + str(uuid.uuid4())

but if we want to be more correct and adhere to the dicom standard, we would want to do:

def generate_uid(item, value, field, dicom):
    '''This function will generate a dicom uid! You can expect it to be passed
       the dictionary of items extracted from the dicom (and your function)
       and variables, the original value (func:generate_uid) and the field
       object you are applying it to.
    '''
    import uuid

    # a field can either be just the name string, or a DicomElement
    if hasattr(field, 'name'):
        field = field.name

    # Your organization should have its own DICOM ORG ROOT.
    # For the purpose of an example, borrowing PYMEDPHYS_ROOT_UID
    ORG_ROOT = "1.2.826.0.1.3680043.10.188"  # e.g. PYMEDPHYS_ROOT_UID
    bigint_uid = str(uuid.uuid4().int)
    full_uid = ORG_ROOT + "." + bigint_uid
    # A DICOM UID is limited to 64 characters and may only contain digits
    # and dots, so no field name prefix is added to the returned value
    sliced_uid = full_uid[0:64]
    return sliced_uid

As stated in the docstring, the function is passed the dictionary of items extracted from the dicom (which also holds your function and any variables), the original value from the recipe (func:generate_uid), and the field you are applying it to. The second version also receives the dicom itself.
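If you want to check that a generated value has the shape of a DICOM UID, a small validator can help. The sketch below assumes the usual UID rules (at most 64 characters, dot-separated numeric components, no leading zeros in a component). The helper looks_like_dicom_uid is a hypothetical name for this example, and the check is syntactic only: it says nothing about whether the root is actually registered to you.

```python
import re
import uuid

# Dot-separated numeric components, none with a leading zero (except "0")
UID_PATTERN = re.compile(r"^(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*$")


def looks_like_dicom_uid(value):
    """Return True if value fits the UID character and length rules."""
    return len(value) <= 64 and UID_PATTERN.match(value) is not None


ORG_ROOT = "1.2.826.0.1.3680043.10.188"  # example root borrowed in the text
candidate = (ORG_ROOT + "." + str(uuid.uuid4().int))[:64]

print(looks_like_dicom_uid(candidate))
# A value carrying a text prefix does not pass the check
print(looks_like_dicom_uid("studyinstanceuid-" + candidate))
```

The first call prints True and the second prints False, since letters and hyphens are not allowed in a UID.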

Development Tip

If you want to interactively develop and test what is passed to the function, just insert an embedded ipython into the function:

def generate_uid(item, value, field, dicom):
    '''This function will generate a dicom uid! You can expect it to be passed
       the dictionary of items extracted from the dicom (and your function)
       and variables, the original value (func:generate_uid) and the field
       object you are applying it to.
    '''
    import IPython
    IPython.embed()

Then proceed to run the replace operation. This will put you into an interactive session with all the variables available for inspection. For example:

item
# {'(0008, 0005)': (0008, 0005) Specific Character Set              CS: 'ISO_IR 100'  [SpecificCharacterSet],
# ...
# 'generate_uid': <function __main__.generate_uid(item, value, field, dicom)>}

value
# 'func:generate_uid'

field
# (0020, 000d) Study Instance UID                  UI: 1.2.276.0.7230010.3.1.2.8323329.5329.1495927169.580350  [StudyInstanceUID]

dicom
# (0008, 0005) Specific Character Set              CS: 'ISO_IR 100'
...

Note that field can be either the string identifier or the full element, depending on how it is used internally, so you should always check.
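One way to make that check reusable is a small helper that always returns a string name. This is a minimal sketch: field_to_name and FakeElement are hypothetical names used only here, and FakeElement merely stands in for whatever element object is passed to your function.

```python
def field_to_name(field):
    """Return the field's name whether field is a string or an object."""
    if isinstance(field, str):
        return field
    # Fall back to a 'name' attribute, as the functions above do
    return getattr(field, "name", str(field))


class FakeElement:
    """Hypothetical stand-in for an element object that has a name."""

    def __init__(self, name):
        self.name = name


print(field_to_name("StudyInstanceUID"))
print(field_to_name(FakeElement("StudyInstanceUID")))
```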

Update Your Items

How do we update the items? Remember, the action is:

REPLACE InstanceCreationDate func:generate_uid

so each item in items needs a ‘generate_uid’ key that holds the function. Just do this:

for item in items:
    items[item]['generate_uid'] = generate_uid
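To see why the key name has to match the name after func: in the recipe, consider this simplified sketch of the lookup. The resolve_value helper is hypothetical and only approximates the idea; it is not deid's internal code, and the generate_uid defined here is a trivial stand-in so the block runs on its own.

```python
def resolve_value(item, value, field, dicom=None):
    """Resolve a recipe value, calling a function for 'func:<name>'."""
    if not value.startswith("func:"):
        return value  # a literal replacement value
    func_name = value.split(":", 1)[1]
    if func_name not in item:
        raise KeyError(f"{func_name} is not defined in the item dictionary")
    # Call the user function with the arguments described in the docstring
    return item[func_name](item=item, value=value, field=field, dicom=dicom)


def generate_uid(item, value, field, dicom):
    # Trivial stand-in so the sketch is self-contained
    return "replaced-" + field


items = {"MR.dcm": {}}
for item in items:
    items[item]["generate_uid"] = generate_uid

result = resolve_value(
    items["MR.dcm"], "func:generate_uid", "InstanceCreationDate"
)
print(result)
```

If the key were missing or misspelled, the lookup in this sketch would fail with a KeyError rather than silently skipping the action.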

Replace identifiers

We are ready to go! Now let’s generate the cleaned files! It will output to a temporary directory. Since we want to replace nested sequences, we need to set strip_sequences to False.

# Clean the files
cleaned_files = replace_identifiers(
    dicom_files=dicom_files, deid=recipe, strip_sequences=False, ids=items
)

Note that expansion of sequences is not currently supported for operations that remove or add a value (ADD, REMOVE, JITTER).

You can load a cleaned file to see what was done:

print(cleaned_files[0].InstanceCreationDate)
print(cleaned_files[0].ReferencedPerformedProcedureStepSequence[0].InstanceCreationDate)

20200608
20200608

Full code for the example above is available here.