Pydicom API Reference Guide

Common pydicom functions called by user code

File Reading/Parsing

The main function for reading and parsing DICOM files with pydicom is dcmread. It is defined in the module pydicom.filereader, but is also available at the package level when pydicom is imported:

>>> import pydicom
>>> dataset = pydicom.dcmread(...)

If you need finer control over reading, you can either call read_partial or use open_dicom. dcmread and read_partial are documented below:

pydicom.filereader.dcmread(fp, defer_size=None, stop_before_pixels=False, force=False, specific_tags=None)

Read and parse a DICOM dataset stored in the DICOM File Format.

Read a DICOM dataset stored in accordance with the DICOM File Format (DICOM Standard Part 10 Section 7). If the dataset is not stored in accordance with the File Format (i.e. the preamble and prefix are missing, there are missing required Type 1 File Meta Information Group elements or the entire File Meta Information is missing) then you will have to set force to True.

Parameters:

fp : str or file-like

Either a file-like object, or a string containing the file name. If a file-like object, the caller is responsible for closing it.

defer_size : int or str or None

If None (default), all elements are read into memory. If specified, then if a data element’s stored value is larger than defer_size, the value is not read into memory until it is accessed in code. Specify an integer (bytes), or a string value with units, e.g. “512 KB”, “2 MB”.

stop_before_pixels : bool

If False (default), the full file will be read and parsed. Set True to stop before reading (7FE0,0010) ‘Pixel Data’ (and all subsequent elements).

force : bool

If False (default), raises an InvalidDicomError if the file is missing the File Meta Information header. Set to True to force reading even if no File Meta Information header is found.

specific_tags : list or None

If not None, only the tags in the list are returned. The list elements can be tags or tag names.

Returns:

FileDataset

An instance of FileDataset that represents a parsed DICOM file.

Raises:

InvalidDicomError

If force is False and the file is not a valid DICOM file.

See also

pydicom.dataset.FileDataset
Data class that is returned.
pydicom.filereader.read_partial
Only read part of a DICOM file, stopping on given conditions.

Examples

Read and return a dataset stored in accordance with the DICOM File Format:

>>> ds = pydicom.dcmread("rtplan.dcm")
>>> ds.PatientName

Read and return a dataset not in accordance with the DICOM File Format:

>>> ds = pydicom.dcmread("rtplan.dcm", force=True)
>>> ds.PatientName

Use within a context manager:

>>> with pydicom.dcmread("rtplan.dcm") as ds:
...     ds.PatientName

pydicom.filereader.read_partial(fileobj, stop_when=None, defer_size=None, force=False, specific_tags=None)

Parse a DICOM file until a condition is met.

Parameters:

fileobj : a file-like object

Note that the file will not close when the function returns.

stop_when :

Stop condition. See read_dataset for more info.

defer_size : int, str, None, optional

See dcmread for parameter info.

force : boolean

See dcmread for parameter info.

specific_tags : list or None

See dcmread for parameter info.

Returns:

FileDataset instance or DicomDir instance.

See also

dcmread
More generic file reading function.

Notes

Use dcmread unless you need to stop on some condition other than reaching pixel data.

File Writing

DICOM files can also be written using pydicom. There are two ways to do this. The first is to use write_file with a pre-existing FileDataset (derived from Dataset) instance. The second is to use the save_as method on a Dataset instance.

pydicom.filewriter.write_file(filename, dataset, write_like_original=True)

Write dataset to the filename specified.

If write_like_original is True then dataset will be written as is (after minimal validation checking) and may or may not contain all or parts of the File Meta Information (and hence may or may not be conformant with the DICOM File Format). If write_like_original is False, dataset will be stored in the DICOM File Format in accordance with DICOM Standard Part 10 Section 7. The byte stream of the dataset will be placed into the file after the DICOM File Meta Information.

Parameters:

filename : str or file-like

Name of file or the file-like to write the new DICOM file to.

dataset : pydicom.dataset.FileDataset

Dataset holding the DICOM information; e.g. an object read with pydicom.dcmread().

write_like_original : bool

If True (default), preserves the following information from the Dataset (and may result in a non-conformant file):

  • preamble – if the original file has no preamble then none will be written.
  • file_meta – if the original file was missing any required File Meta Information Group elements then they will not be added or written. If (0002,0000) ‘File Meta Information Group Length’ is present then it may have its value updated.
  • seq.is_undefined_length – if the original had delimiters, write them now too, instead of the more sensible length characters.
  • is_undefined_length_sequence_item – for datasets that belong to a sequence, write the undefined length delimiters if that is what the original had.

If False, produces a file conformant with the DICOM File Format, with explicit lengths for all elements.

See also

pydicom.dataset.FileDataset
Dataset class with relevant attributes and information.
pydicom.dataset.Dataset.save_as
Write a DICOM file from a dataset that was read in with dcmread(). save_as wraps dcmwrite.

Dataset.save_as(filename, write_like_original=True)

Write the Dataset to filename.

Saving a Dataset requires that the Dataset.is_implicit_VR and Dataset.is_little_endian attributes exist and are set appropriately. If Dataset.file_meta.TransferSyntaxUID is present then it should be set to a consistent value to ensure conformance.

Parameters:

filename : str or file-like

Name of file or the file-like to write the new DICOM file to.

write_like_original : bool

If True (default), preserves the following information from the Dataset (and may result in a non-conformant file):

  • preamble – if the original file has no preamble then none will be written.
  • file_meta – if the original file was missing any required File Meta Information Group elements then they will not be added or written. If (0002,0000) ‘File Meta Information Group Length’ is present then it may have its value updated.
  • seq.is_undefined_length – if the original had delimiters, write them now too, instead of the more sensible length characters.
  • is_undefined_length_sequence_item – for datasets that belong to a sequence, write the undefined length delimiters if that is what the original had.

If False, produces a file conformant with the DICOM File Format, with explicit lengths for all elements.

See also

pydicom.filewriter.write_dataset
Write a DICOM Dataset to a file.
pydicom.filewriter.write_file_meta_info
Write the DICOM File Meta Information Group elements to a file.
pydicom.filewriter.dcmwrite
Write a DICOM file from a FileDataset instance.

Dataset

class pydicom.dataset.Dataset(*args, **kwargs)

A collection (dictionary) of DICOM DataElements.

Examples

Add DataElements to the Dataset (for elements in the DICOM dictionary):

>>> from pydicom.dataset import Dataset
>>> from pydicom.dataelem import DataElement
>>> ds = Dataset()
>>> ds.PatientName = "CITIZEN^Joan"
>>> ds.add_new(0x00100020, 'LO', '12345')
>>> ds[0x0010, 0x0030] = DataElement(0x00100030, 'DA', '20010101')

Add a Sequence DataElement to the Dataset:

>>> ds.BeamSequence = [Dataset(), Dataset(), Dataset()]
>>> ds.BeamSequence[0].Manufacturer = "Linac, co."
>>> ds.BeamSequence[1].Manufacturer = "Linac and Sons, co."
>>> ds.BeamSequence[2].Manufacturer = "Linac and Daughters, co."

Add private DataElements to the Dataset (three equivalent ways):

>>> ds.add(DataElement(0x0043102b, 'SS', [4, 4, 0, 0]))
>>> ds.add_new(0x0043102b, 'SS', [4, 4, 0, 0])
>>> ds[0x0043, 0x102b] = DataElement(0x0043102b, 'SS', [4, 4, 0, 0])

Updating and retrieving DataElement values:

>>> ds.PatientName = "CITIZEN^Joan"
>>> ds.PatientName
'CITIZEN^Joan'
>>> ds.PatientName = "CITIZEN^John"
>>> ds.PatientName
'CITIZEN^John'

Retrieving a DataElement’s value from a Sequence:

>>> ds.BeamSequence[0].Manufacturer
'Linac, co.'
>>> ds.BeamSequence[1].Manufacturer
'Linac and Sons, co.'

Retrieving DataElements:

>>> elem = ds[0x00100010]
>>> elem = ds.data_element('PatientName')
>>> elem
(0010, 0010) Patient's Name PN: 'CITIZEN^Joan'

Deleting a DataElement from the Dataset:

>>> del ds.PatientID
>>> del ds.BeamSequence[1].Manufacturer
>>> del ds.BeamSequence[2]

Deleting a private DataElement from the Dataset:

>>> del ds[0x0043, 0x102b]

Determining if a DataElement is present in the Dataset:

>>> 'PatientName' in ds
True
>>> 'PatientID' in ds
False
>>> (0x0010, 0x0030) in ds
True
>>> 'Manufacturer' in ds.BeamSequence[0]
True

Iterating through the top level of a Dataset only (excluding Sequences):

>>> for elem in ds:
...     print(elem)

Iterating through the entire Dataset (including Sequences):

>>> for elem in ds.iterall():
...     print(elem)

Recursively iterate through a Dataset (including Sequences):

>>> def recurse(ds):
...     for elem in ds:
...         if elem.VR == 'SQ':
...             # Each item of a sequence is itself a Dataset
...             for item in elem.value:
...                 recurse(item)
...         else:
...             # Do something useful with each DataElement
...             pass

Attributes

default_element_format (str) The default formatting for string display.
default_sequence_element_format (str) The default formatting for string display of sequences.
indent_chars (str) For string display, the characters used to indent nested Sequences. Default is "   " (three spaces).

Methods

add(data_element) Add a DataElement to the Dataset.
add_new(tag, VR, value) Add a DataElement to the Dataset.
clear()
convert_pixel_data() Convert the Pixel Data to a numpy array internally.
copy()
data_element(name) Return the DataElement corresponding to the element keyword name.
decode() Apply character set decoding to all DataElements in the Dataset.
decompress() Decompresses pixel data and modifies the Dataset in-place
dir(*filters) Return an alphabetical list of DataElement keywords in the Dataset.
formatted_lines([element_format, …]) Iterate through the Dataset yielding formatted str for each element.
fromkeys(iterable[, value]) Returns a new dict with keys from iterable and values equal to value.
get(key[, default]) Extend dict.get() to handle DICOM DataElement keywords.
get_item(key) Return the raw data element if possible.
group_dataset(group) Return a Dataset containing only DataElements of a certain group.
items()
iterall() Iterate through the Dataset, yielding all DataElements.
keys()
pop(k[, d]) Remove the specified key and return the corresponding value.
popitem() Remove and return some (key, value) pair as a 2-tuple.
remove_private_tags() Remove all private DataElements in the Dataset.
save_as(filename[, write_like_original]) Write the Dataset to filename.
setdefault(k[,d])
top() Return a str of the Dataset’s top level DataElements only.
trait_names() Return a list of valid names for auto-completion code.
update(dictionary) Extend dict.update() to handle DICOM keywords.
values()
walk(callback[, recursive]) Iterate through the DataElements and run callback on each.

add(data_element)

Add a DataElement to the Dataset.

Equivalent to ds[data_element.tag] = data_element

Parameters:

data_element : pydicom.dataelem.DataElement

The DataElement to add to the Dataset.

add_new(tag, VR, value)

Add a DataElement to the Dataset.

Parameters:

tag

The DICOM (group, element) tag in any form accepted by pydicom.tag.Tag such as [0x0010, 0x0010], (0x10, 0x10), 0x00100010, etc.

VR : str

The 2 character DICOM value representation (see DICOM standard part 5, Section 6.2).

value

The value of the data element. One of the following:

  • a single string or number
  • a list or tuple with all strings or all numbers
  • a multi-value string with backslash separator
  • for a sequence DataElement, an empty list or list of Dataset

clear() → None. Remove all items from D.

convert_pixel_data()

Convert the Pixel Data to a numpy array internally.

Returns:

None

Converted pixel data is stored internally in the dataset. If the data is in a compressed image format, the image is decompressed, and any related data elements are changed accordingly.

copy() → a shallow copy of D

data_element(name)

Return the DataElement corresponding to the element keyword name.

Parameters:

name : str

A DICOM element keyword.

Returns:

pydicom.dataelem.DataElement or None

For the given DICOM element keyword, return the corresponding Dataset DataElement if present, None otherwise.

decode()

Apply character set decoding to all DataElements in the Dataset.

See DICOM PS3.5-2008 6.1.1.

decompress()

Decompress the pixel data and modify the Dataset in-place.

If the transfer syntax is not compressed, the pixel data is converted to a numpy array internally, but not returned.

If the pixel data is compressed, it is decompressed using an image handler, and the internal state is updated appropriately:

  • TransferSyntax is updated to non-compressed form
  • is_undefined_length for pixel data is set False

Returns:

None

Raises:

NotImplementedError

If the pixel data was originally compressed but the file is not Explicit VR Little Endian, as required by the DICOM standard.

dir(*filters)

Return an alphabetical list of DataElement keywords in the Dataset.

Intended mainly for use in interactive Python sessions. Only lists the DataElement keywords in the current level of the Dataset (i.e. the contents of any Sequence elements are ignored).

Parameters:

filters : str

Zero or more string arguments to the function. Used for case-insensitive match to any part of the DICOM keyword.

Returns:

list of str

The matching DataElement keywords in the dataset. If no filters are used then all DataElement keywords are returned.

formatted_lines(element_format='%(tag)s %(name)-35.35s %(VR)s: %(repval)s', sequence_element_format='%(tag)s %(name)-35.35s %(VR)s: %(repval)s', indent_format=None)

Iterate through the Dataset yielding formatted str for each element.

Parameters:

element_format : str

The string format to use for non-sequence elements. Formatting uses the attributes of DataElement. Default is “%(tag)s %(name)-35.35s %(VR)s: %(repval)s”.

sequence_element_format : str

The string format to use for sequence elements. Formatting uses the attributes of DataElement. Default is “%(tag)s %(name)-35.35s %(VR)s: %(repval)s”

indent_format : str or None

Placeholder for future functionality.

Yields:

str

A string representation of a DataElement.

fromkeys(iterable, value=None, /)

Returns a new dict with keys from iterable and values equal to value.

get(key, default=None)

Extend dict.get() to handle DICOM DataElement keywords.

Parameters:

key : str or pydicom.tag.Tag

The element keyword or Tag or the class attribute name to get.

default : obj or None

If the DataElement or class attribute is not present, return default (default None).

Returns:

value

If key is the keyword for a DataElement in the Dataset then return the DataElement’s value.

pydicom.dataelem.DataElement

If key is a tag for a DataElement in the Dataset then return the DataElement instance.

value

If key is a class attribute then return its value.

get_item(key)

Return the raw data element if possible.

It will be raw if the user has never accessed the value, or set their own value. Note if the data element is a deferred-read element, then it is read and converted before being returned.

Parameters:

key

The DICOM (group, element) tag in any form accepted by pydicom.tag.Tag such as [0x0010, 0x0010], (0x10, 0x10), 0x00100010, etc.

Returns:

pydicom.dataelem.DataElement

group_dataset(group)

Return a Dataset containing only DataElements of a certain group.

Parameters:

group : int

The group part of a DICOM (group, element) tag.

Returns:

pydicom.dataset.Dataset

A dataset instance containing elements of the group specified.

items() → a set-like object providing a view on D's items

iterall()

Iterate through the Dataset, yielding all DataElements.

Unlike Dataset.__iter__, this does recurse into sequences, and so returns all data elements as if the file were “flattened”.

Yields:

pydicom.dataelem.DataElement

keys() → a set-like object providing a view on D's keys

pixel_array

Return the Pixel Data as a NumPy array.

Returns:

numpy.ndarray

The Pixel Data (7FE0,0010) as a NumPy ndarray.

pop(k[, d]) → v, remove specified key and return the corresponding value.

If key is not found, d is returned if given, otherwise KeyError is raised.

popitem() → (k, v), remove and return some (key, value) pair as a 2-tuple; but raise KeyError if D is empty.

remove_private_tags()

Remove all private DataElements in the Dataset.

save_as(filename, write_like_original=True)

Write the Dataset to filename.

Saving a Dataset requires that the Dataset.is_implicit_VR and Dataset.is_little_endian attributes exist and are set appropriately. If Dataset.file_meta.TransferSyntaxUID is present then it should be set to a consistent value to ensure conformance.

Parameters:

filename : str or file-like

Name of file or the file-like to write the new DICOM file to.

write_like_original : bool

If True (default), preserves the following information from the Dataset (and may result in a non-conformant file):

  • preamble – if the original file has no preamble then none will be written.
  • file_meta – if the original file was missing any required File Meta Information Group elements then they will not be added or written. If (0002,0000) ‘File Meta Information Group Length’ is present then it may have its value updated.
  • seq.is_undefined_length – if the original had delimiters, write them now too, instead of the more sensible length characters.
  • is_undefined_length_sequence_item – for datasets that belong to a sequence, write the undefined length delimiters if that is what the original had.

If False, produces a file conformant with the DICOM File Format, with explicit lengths for all elements.

See also

pydicom.filewriter.write_dataset
Write a DICOM Dataset to a file.
pydicom.filewriter.write_file_meta_info
Write the DICOM File Meta Information Group elements to a file.
pydicom.filewriter.dcmwrite
Write a DICOM file from a FileDataset instance.

setdefault(k[, d]) → D.get(k,d), also set D[k]=d if k not in D

top()

Return a str of the Dataset’s top level DataElements only.

trait_names()

Return a list of valid names for auto-completion code.

Used in IPython, so that data element names can be found and offered for autocompletion on the IPython command line.

update(dictionary)

Extend dict.update() to handle DICOM keywords.

values() → an object providing a view on D's values

walk(callback, recursive=True)

Iterate through the DataElements and run callback on each.

Visit all DataElements, possibly recursing into sequences and their datasets. The callback function is called for each DataElement (including SQ elements). This can be used to perform an operation on certain types of DataElements; for example, remove_private_tags() finds all private tags and deletes them. DataElements come back in DICOM order (by increasing tag number within their dataset).

Parameters:

callback

A callable that takes two arguments:
  • a Dataset
  • a DataElement belonging to that Dataset

recursive : bool

Flag to indicate whether to recurse into Sequences.