Pydicom Complete API Reference

Auto-generated API documentation

DICOM Dataset

Define the Dataset and FileDataset classes.

The Dataset class represents the DICOM Dataset while the FileDataset class adds extra functionality to Dataset when data is read from or written to file.

Overview of DICOM object model

Dataset (dict subclass)
Contains DataElement instances, each of which has a tag, VR, VM and value.
The DataElement value can be:
  • A single value, such as a number, string, etc. (i.e. VM = 1)
  • A list of numbers, strings, etc. (i.e. VM > 1)
  • A Sequence (list subclass), where each item is a Dataset which
    contains its own DataElements, and so on in a recursive manner.

class pydicom.dataset.Dataset(*args, **kwargs)

A collection (dictionary) of DICOM DataElements.

Examples

Add DataElements to the Dataset (for elements in the DICOM dictionary):

>>> ds = Dataset()
>>> ds.PatientName = "CITIZEN^Joan"
>>> ds.add_new(0x00100020, 'LO', '12345')
>>> ds[0x0010, 0x0030] = DataElement(0x00100030, 'DA', '20010101')

Add Sequence DataElement to the Dataset:

>>> ds.BeamSequence = [Dataset(), Dataset(), Dataset()]
>>> ds.BeamSequence[0].Manufacturer = "Linac, co."
>>> ds.BeamSequence[1].Manufacturer = "Linac and Sons, co."
>>> ds.BeamSequence[2].Manufacturer = "Linac and Daughters, co."

Add private DataElements to the Dataset:

>>> ds.add(DataElement(0x0043102b, 'SS', [4, 4, 0, 0]))
>>> ds.add_new(0x0043102b, 'SS', [4, 4, 0, 0])
>>> ds[0x0043, 0x102b] = DataElement(0x0043102b, 'SS', [4, 4, 0, 0])

Updating and retrieving DataElement values:

>>> ds.PatientName = "CITIZEN^Joan"
>>> ds.PatientName
'CITIZEN^Joan'
>>> ds.PatientName = "CITIZEN^John"
>>> ds.PatientName
'CITIZEN^John'

Retrieving a DataElement’s value from a Sequence:

>>> ds.BeamSequence[0].Manufacturer
'Linac, co.'
>>> ds.BeamSequence[1].Manufacturer
'Linac and Sons, co.'

Retrieving DataElements:

>>> elem = ds[0x00100010]
>>> elem = ds.data_element('PatientName')
>>> elem
(0010, 0010) Patient's Name                      PN: 'CITIZEN^John'

Deleting a DataElement from the Dataset:

>>> del ds.PatientID
>>> del ds.BeamSequence[1].Manufacturer
>>> del ds.BeamSequence[2]

Deleting a private DataElement from the Dataset:

>>> del ds[0x0043, 0x102b]

Determining if a DataElement is present in the Dataset:

>>> 'PatientName' in ds
True
>>> 'PatientID' in ds
False
>>> (0x0010, 0x0030) in ds
True
>>> 'Manufacturer' in ds.BeamSequence[0]
True

Iterating through the top level of a Dataset only (excluding Sequences):

>>> for elem in ds:
...     print(elem)

Iterating through the entire Dataset (including Sequences):

>>> for elem in ds.iterall():
...     print(elem)

Recursively iterate through a Dataset (including Sequences):

>>> def recurse(ds):
...     for elem in ds:
...         if elem.VR == 'SQ':
...             # Each item of a Sequence is itself a Dataset
...             for item in elem.value:
...                 recurse(item)
...         else:
...             # Do something useful with each DataElement
...             pass

Attributes:
default_element_format : str

The default formatting for string display.

default_sequence_element_format : str

The default formatting for string display of sequences.

indent_chars : str

For string display, the characters used to indent nested Sequences. Default is "   " (three spaces).

is_little_endian : bool

Shall be set before writing with write_like_original=False. The written dataset (excluding the pixel data) will be written using the given endianness.

is_implicit_VR : bool

Shall be set before writing with write_like_original=False. The dataset will be written using the transfer syntax with the given VR handling, e.g. LittleEndianImplicit if True, and LittleEndianExplicit or BigEndianExplicit (depending on is_little_endian) if False.

Methods

add(data_element) Add a DataElement to the Dataset.
add_new(tag, VR, value) Add a DataElement to the Dataset.
clear()
convert_pixel_data() Convert the Pixel Data to a numpy array internally.
copy()
data_element(name) Return the DataElement corresponding to the element keyword name.
decode() Apply character set decoding to all DataElements in the Dataset.
decompress() Decompresses pixel data and modifies the Dataset in-place
dir(*filters) Return an alphabetical list of DataElement keywords in the Dataset.
elements() Iterate through the top-level of the Dataset, yielding DataElements or RawDataElements (no conversion done).
formatted_lines([element_format, …]) Iterate through the Dataset yielding formatted str for each element.
fromkeys(iterable[, value]) Return a new dict with keys from iterable and values equal to value.
get(key[, default]) Extend dict.get() to handle DICOM DataElement keywords.
get_item(key) Return the raw data element if possible.
group_dataset(group) Return a Dataset containing only DataElements of a certain group.
items()
iterall() Iterate through the Dataset, yielding all DataElements.
keys()
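pop(k[,d]) Remove the specified key and return the corresponding value. If key is not found, d is returned if given, otherwise KeyError is raised.
popitem() Remove and return some (key, value) pair as a 2-tuple; but raise KeyError if the Dataset is empty.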
remove_private_tags() Remove all private DataElements in the Dataset.
save_as(filename[, write_like_original]) Write the Dataset to filename.
setdefault(k[,d])
top() Return a str of the Dataset’s top level DataElements only.
trait_names() Return a list of valid names for auto-completion code.
update(dictionary) Extend dict.update() to handle DICOM keywords.
values()
walk(callback[, recursive]) Iterate through the DataElements and run callback on each.

add(data_element)

Add a DataElement to the Dataset.

Equivalent to ds[data_element.tag] = data_element

Parameters:
data_element : pydicom.dataelem.DataElement

The DataElement to add to the Dataset.

add_new(tag, VR, value)

Add a DataElement to the Dataset.

Parameters:
tag

The DICOM (group, element) tag in any form accepted by pydicom.tag.Tag such as [0x0010, 0x0010], (0x10, 0x10), 0x00100010, etc.

VR : str

The 2 character DICOM value representation (see DICOM standard part 5, Section 6.2).

value

The value of the data element. One of the following:

  • a single string or number
  • a list or tuple with all strings or all numbers
  • a multi-value string with backslash separator
  • for a sequence DataElement, an empty list or list of Dataset
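The tag forms accepted by add_new all denote the same 32-bit tag. The following is a minimal sketch in plain Python, not pydicom's own Tag implementation, of how a (group, element) pair relates to the single-integer form; the helper names to_int_tag and to_pair are hypothetical.

```python
def to_int_tag(tag):
    """Normalize a (group, element) pair or a plain int to one 32-bit int."""
    if isinstance(tag, (list, tuple)):
        group, element = tag
        # The group occupies the upper 16 bits, the element the lower 16.
        return (group << 16) | element
    return int(tag)


def to_pair(tag):
    """Split a tag into its (group, element) halves."""
    value = to_int_tag(tag)
    return value >> 16, value & 0xFFFF


# The three spellings from the parameter description are the same tag.
forms = [[0x0010, 0x0010], (0x10, 0x10), 0x00100010]
normalized = {to_int_tag(form) for form in forms}
print(normalized == {0x00100010})

# Format a tag the way it appears in element listings.
print("(%04x, %04x)" % to_pair(0x00100020))
```

Normalizing to a single integer is one convenient choice because integers sort in increasing tag order.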

convert_pixel_data()

Convert the Pixel Data to a numpy array internally.

Returns:
None

Converted pixel data is stored internally in the dataset.

If the image is in a compressed format, it is decompressed and any related data elements are changed accordingly.

data_element(name)

Return the DataElement corresponding to the element keyword name.

Parameters:
name : str

A DICOM element keyword.

Returns:
pydicom.dataelem.DataElement or None

For the given DICOM element keyword, return the corresponding Dataset DataElement if present, None otherwise.

decode()

Apply character set decoding to all DataElements in the Dataset.

See DICOM PS3.5-2008 6.1.1.

decompress()

Decompress the pixel data and modify the Dataset in-place.

If the transfer syntax is not compressed, the pixel data is converted to a numpy array internally, but not returned.

If the pixel data is compressed, it is decompressed using an image handler, and the internal state is updated accordingly:

  • TransferSyntax is updated to non-compressed form
  • is_undefined_length for pixel data is set False
Returns:
None
Raises:
NotImplementedError

If the pixel data was originally compressed but the file is not Explicit VR Little Endian, as required by the DICOM standard.

dir(*filters)

Return an alphabetical list of DataElement keywords in the Dataset.

Intended mainly for use in interactive Python sessions. Only lists the DataElement keywords in the current level of the Dataset (i.e. the contents of any Sequence elements are ignored).

Parameters:
filters : str

Zero or more string arguments to the function. Used for case-insensitive match to any part of the DICOM keyword.

Returns:
list of str

The matching DataElement keywords in the dataset. If no filters are used then all DataElement keywords are returned.
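The filter behaviour described above can be sketched without pydicom. The helper filter_keywords is hypothetical, and treating several filters as alternatives (a keyword matches if it contains any one of them) is an assumption of this sketch rather than something the text above states.

```python
def filter_keywords(keywords, *filters):
    """Return sorted keywords that contain any filter text, ignoring case."""
    if not filters:
        # No filters: every keyword is returned.
        return sorted(keywords)
    lowered = [text.lower() for text in filters]
    return sorted(
        keyword for keyword in keywords
        if any(text in keyword.lower() for text in lowered)
    )


keywords = ["PatientName", "PatientID", "BeamSequence", "StudyDate"]
print(filter_keywords(keywords, "pat"))
print(filter_keywords(keywords, "date", "BEAM"))
print(filter_keywords(keywords))
```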

elements()

Iterate through the top-level of the Dataset, yielding DataElements or RawDataElements (no conversion done).

>>> for elem in ds.elements():
>>>     print(elem)

The elements are returned in the same way as in __getitem__.

Yields:
pydicom.dataelem.DataElement or pydicom.dataelem.RawDataElement

The Dataset’s DataElements, sorted by increasing tag order.

formatted_lines(element_format='%(tag)s %(name)-35.35s %(VR)s: %(repval)s', sequence_element_format='%(tag)s %(name)-35.35s %(VR)s: %(repval)s', indent_format=None)

Iterate through the Dataset yielding formatted str for each element.

Parameters:
element_format : str

The string format to use for non-sequence elements. Formatting uses the attributes of DataElement. Default is “%(tag)s %(name)-35.35s %(VR)s: %(repval)s”.

sequence_element_format : str

The string format to use for sequence elements. Formatting uses the attributes of DataElement. Default is “%(tag)s %(name)-35.35s %(VR)s: %(repval)s”.

indent_format : str or None

Placeholder for future functionality.

Yields:
str

A string representation of a DataElement.
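The default format string is an ordinary Python %-style template. A small sketch of how it expands, using a plain dict as a stand-in for the DataElement attributes it names (the values are illustrative only):

```python
element_format = "%(tag)s %(name)-35.35s %(VR)s: %(repval)s"

# Stand-in values for the DataElement attributes used by the template.
fields = {
    "tag": "(0010, 0010)",
    "name": "Patient's Name",
    "VR": "PN",
    "repval": "'CITIZEN^Joan'",
}

line = element_format % fields
print(line)

# "-35.35s" left-justifies the name and pads or truncates it to exactly
# 35 columns, which keeps the VR column aligned across elements.
long_line = element_format % {**fields, "name": "X" * 50}
print(long_line)
```

A custom element_format can use the same placeholders with a different width, for example "%(tag)s %(name)-20.20s %(repval)s".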

get(key, default=None)

Extend dict.get() to handle DICOM DataElement keywords.

Parameters:
key : str or pydicom.tag.Tag

The element keyword or Tag or the class attribute name to get.

default : obj or None

If the DataElement or class attribute is not present, return default (default None).

Returns:
value

If key is the keyword for a DataElement in the Dataset then return the DataElement’s value.

pydicom.dataelem.DataElement

If key is a tag for a DataElement in the Dataset then return the DataElement instance.

value

If key is a class attribute then return its value.

get_item(key)

Return the raw data element if possible.

It will be raw if the user has never accessed the value, or set their own value. Note if the data element is a deferred-read element, then it is read and converted before being returned.

Parameters:
key

The DICOM (group, element) tag in any form accepted by pydicom.tag.Tag such as [0x0010, 0x0010], (0x10, 0x10), 0x00100010, etc. May also be a slice made up of DICOM tags.

Returns:
pydicom.dataelem.DataElement

group_dataset(group)

Return a Dataset containing only DataElements of a certain group.

Parameters:
group : int

The group part of a DICOM (group, element) tag.

Returns:
pydicom.dataset.Dataset

A dataset instance containing elements of the group specified.

is_original_encoding

Return True if the properties to be used for writing are set and have the same values as the ones in the dataset after reading it. This includes properties related to endianness, VR handling and the specific character set.

iterall()

Iterate through the Dataset, yielding all DataElements.

Unlike Dataset.__iter__, this does recurse into sequences, and so returns all data elements as if the file were “flattened”.

Yields:
pydicom.dataelem.DataElement

pixel_array

Return the Pixel Data as a NumPy array.

Returns:
numpy.ndarray

The Pixel Data (7FE0,0010) as a NumPy ndarray.

remove_private_tags()

Remove all private DataElements in the Dataset.

save_as(filename, write_like_original=True)

Write the Dataset to filename.

Saving a Dataset requires that the Dataset.is_implicit_VR and Dataset.is_little_endian attributes exist and are set appropriately. If Dataset.file_meta.TransferSyntaxUID is present then it should be set to a consistent value to ensure conformance.

Parameters:
filename : str or file-like

Name of file or the file-like to write the new DICOM file to.

write_like_original : bool

If True (default), preserves the following information from the Dataset (and may result in a non-conformant file):

  • preamble – if the original file has no preamble then none will
    be written.
  • file_meta – if the original file was missing any required File
    Meta Information Group elements then they will not be added or written. If (0002,0000) ‘File Meta Information Group Length’ is present then it may have its value updated.
  • seq.is_undefined_length – if the original had delimiters, write them
    now too, instead of the more sensible length characters.
  • is_undefined_length_sequence_item – for datasets that belong to
    a sequence, write the undefined length delimiters if that is what the original had.

If False, produces a file conformant with the DICOM File Format, with explicit lengths for all elements.

See also

pydicom.filewriter.write_dataset
Write a DICOM Dataset to a file.
pydicom.filewriter.write_file_meta_info
Write the DICOM File Meta Information Group elements to a file.
pydicom.filewriter.dcmwrite
Write a DICOM file from a FileDataset instance.

top()

Return a str of the Dataset’s top level DataElements only.

trait_names()

Return a list of valid names for auto-completion code.

Used in IPython, so that data element names can be found and offered for autocompletion on the IPython command line.

update(dictionary)

Extend dict.update() to handle DICOM keywords.

walk(callback, recursive=True)

Iterate through the DataElements and run callback on each.

Visit all DataElements, possibly recursing into sequences and their datasets. The callback function is called for each DataElement (including SQ elements). Can be used to perform an operation on certain types of DataElements. E.g., remove_private_tags() finds all private tags and deletes them. DataElements will come back in DICOM order (by increasing tag number within their dataset).

Parameters:
callback
A callable that takes two arguments:
  • a Dataset
  • a DataElement belonging to that Dataset
recursive : bool

Flag to indicate whether to recurse into Sequences.
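The traversal order and callback contract can be illustrated with a toy model that does not use pydicom. In this sketch a dataset is a plain dict mapping integer tags to values and a list of dicts stands in for a Sequence; unlike the real method, the callback here receives the tag rather than a DataElement. The function names are hypothetical.

```python
def walk(dataset, callback, recursive=True):
    """Call callback(dataset, tag) for every tag in increasing tag order."""
    for tag in sorted(dataset):  # sorted() takes a snapshot, so callbacks may delete
        value = dataset[tag]
        callback(dataset, tag)
        # Recurse only if the element survived the callback and looks like a Sequence.
        if recursive and tag in dataset and isinstance(value, list):
            for item in value:
                if isinstance(item, dict):
                    walk(item, callback, recursive)


def remove_if_private(dataset, tag):
    """Delete the element when its group number is odd, i.e. a private tag."""
    if (tag >> 16) % 2 == 1:
        del dataset[tag]


ds = {
    0x00100010: "CITIZEN^Joan",
    0x0043102B: [4, 4, 0, 0],  # private: group 0x0043 is odd
    # Stand-in for BeamSequence (300A,00B0) holding one item
    0x300A00B0: [{0x00080070: "Linac, co.", 0x00431001: "private"}],
}

visited = []
walk(ds, lambda dataset, tag: visited.append(tag), recursive=False)
walk(ds, remove_if_private)
print(sorted(ds), sorted(ds[0x300A00B0][0]))
```

With recursive=False only the three top-level tags are visited; the recursive pass also removes the private element nested inside the sequence item.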

class pydicom.dataset.FileDataset(filename_or_obj, dataset, preamble=None, file_meta=None, is_implicit_VR=True, is_little_endian=True)

An extension of Dataset to make reading from and writing to file-like objects easier.

Attributes:
preamble : str or bytes or None

The optional DICOM preamble prepended to the dataset, if available.

file_meta : pydicom.dataset.Dataset or None

The Dataset’s file meta information as a Dataset, if available (None if not present). Consists of group 0002 elements.

filename : str or None

The filename that the dataset was read from (if read from file) or None if the filename is not available (if read from a BytesIO or similar).

fileobj_type

The object type of the file-like the Dataset was read from.

is_implicit_VR : bool

True if the dataset encoding is implicit VR, False otherwise.

is_little_endian : bool

True if the dataset encoding is little endian byte ordering, False otherwise.

timestamp : float or None

The modification time of the file the dataset was read from, None if the modification time is not available.

exception pydicom.dataset.PropertyError

Raised for AttributeErrors caught in a property, so that they do not fall through to __getattr__.

Various helper modules

Configuration of pydicom behavior - the config module

Pydicom configuration options.

pydicom.config.DS_decimal(use_Decimal_boolean=True)

Set the DS class to be derived from Decimal (True) or from float (False). If this function is never called, the default in pydicom >= 0.9.8 is for DS to be based on float.

pydicom.config.allow_DS_float = False

Set allow_DS_float to True to allow DSdecimal instances to be created with floats; otherwise, they must be explicitly converted to strings, with the user explicitly setting the precision of digits and rounding. Default: False
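The reason for requiring an explicit string conversion can be seen with the standard decimal module alone. This sketch does not use DSdecimal itself; it only shows what happens when a Decimal is built directly from a float compared with a string whose precision the caller chose (six decimal places here is an arbitrary choice).

```python
from decimal import Decimal

# Built straight from a float, the Decimal carries the full binary
# approximation of 0.1, far more digits than the caller intended.
from_float = Decimal(0.1)

# Built from a string, the Decimal holds exactly the digits written.
from_string = Decimal("0.1")

# Explicitly choosing precision and rounding before conversion.
rounded = Decimal(format(0.1, ".6f"))

print(from_float == from_string)
print(len(str(from_float)), len(str(from_string)))
print(rounded)
```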

pydicom.config.data_element_callback = None

Set data_element_callback to a function to be called from read_dataset every time a RawDataElement has been returned, before it is added to the dataset.

pydicom.config.data_element_callback_kwargs = {}

Set this to use as keyword arguments passed to the data_element_callback function

pydicom.config.datetime_conversion = False

Set datetime_conversion to convert DA, DT and TM data elements to datetime.date, datetime.datetime and datetime.time respectively. Default: False
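As an illustration of the mapping named above, the following sketch converts fully specified DA, TM and DT strings with the standard datetime module. It is not pydicom's conversion code: the helper names are hypothetical, and shortened values, fractional seconds and UTC offsets, which DICOM permits, are deliberately not handled.

```python
from datetime import datetime


def parse_da(value):
    """DA 'YYYYMMDD' -> datetime.date"""
    return datetime.strptime(value, "%Y%m%d").date()


def parse_tm(value):
    """TM 'HHMMSS' -> datetime.time"""
    return datetime.strptime(value, "%H%M%S").time()


def parse_dt(value):
    """DT 'YYYYMMDDHHMMSS' -> datetime.datetime"""
    return datetime.strptime(value, "%Y%m%d%H%M%S")


print(parse_da("20010101"))
print(parse_tm("134501"))
print(parse_dt("20010101134501"))
```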

pydicom.config.debug(debug_on=True)

Turn debugging of DICOM file reading and writing on or off. When debugging is on, file location and details about the elements read at that location are logged to the ‘pydicom’ logger using python’s logging module.

Parameters:
debug_on : bool

True (default) to turn on debugging, False to turn off.
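Because the messages go to the ‘pydicom’ logger, an application can route them with the standard logging module. A minimal sketch follows; the in-memory stream, the message text and the use of the DEBUG level are illustrative assumptions, and the logger.debug call merely stands in for a message the library would emit.

```python
import io
import logging

stream = io.StringIO()  # stand-in for a log file or the console
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))

# Attach the handler to the logger named in the description above.
logger = logging.getLogger("pydicom")
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)

# Stand-in for a message produced while debugging is turned on.
logger.debug("reading element at offset %d", 132)
print(stream.getvalue(), end="")
```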

pydicom.config.enforce_valid_values = False

Raise errors if any value is not allowed by DICOM standard, e.g. DS strings that are longer than 16 characters; IS strings outside the allowed range.
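The two checks mentioned above can be sketched in plain Python. These helpers are hypothetical and are not pydicom's validation code; the IS bounds used here are the signed 32-bit range given for IS in DICOM Part 5.

```python
def check_ds(text):
    """Return text unchanged, or raise if the DS string exceeds 16 characters."""
    if len(text) > 16:
        raise ValueError("DS value %r is longer than 16 characters" % text)
    return text


def check_is(text):
    """Return the integer value, or raise if it is outside the IS range."""
    number = int(text)
    if not -2**31 <= number <= 2**31 - 1:
        raise ValueError("IS value %r is outside the allowed range" % text)
    return number


print(check_ds("0.1"))
print(check_is("2147483647"))

for checker, bad_value in [(check_ds, "0.10000000000000001"), (check_is, "2147483648")]:
    try:
        checker(bad_value)
    except ValueError as error:
        print("rejected:", error)
```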

pydicom.config.image_handlers = [pydicom.pixel_data_handlers.numpy_handler, pydicom.pixel_data_handlers.rle_handler, pydicom.pixel_data_handlers.pillow_handler]

Image handlers for converting pixel data. This is an ordered list of handlers that the dataset._get_pixel_array() method tries in turn to extract a correctly sized numpy array from the PixelData attribute. If a handler lacks required dependencies or cannot otherwise be loaded, it shall raise an ImportError. Handlers shall have two methods:

supports_transfer_syntax(dicom_dataset)
Returns True if the handler might support the transfer syntax indicated in the dicom_dataset.
get_pixeldata(dicom_dataset)
Shall either raise an exception or return a correctly sized numpy array derived from the PixelData. Reshaping the array to the correct dimensions is handled outside the image handler.

The first handler that both announces that it supports the transfer syntax and does not throw an exception, either in getting the data or when the data is reshaped to the correct dimensions, is the handler that will provide the data.

If they all fail, the exception raised by the last failing handler is propagated.

If no handler raises an exception, but they all refuse to support the transfer syntax, a NotImplementedError is raised.
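The selection rules above can be written as a short loop. This is a sketch of the described behaviour with stub handlers, not the library's actual _get_pixel_array() code; the reshaping step is omitted, and plain lists stand in for numpy arrays.

```python
from types import SimpleNamespace


def get_pixel_array(dataset, handlers):
    """Return data from the first handler that supports and decodes dataset."""
    last_error = None
    for handler in handlers:
        if not handler.supports_transfer_syntax(dataset):
            continue  # this handler refuses; try the next one
        try:
            return handler.get_pixeldata(dataset)
        except Exception as error:
            last_error = error  # remember the failure and keep trying
    if last_error is not None:
        raise last_error  # every willing handler failed: surface the last error
    raise NotImplementedError("no handler supports this transfer syntax")


def failing_decode(dataset):
    raise RuntimeError("decoder dependency missing")


# Stub handlers exposing the two required methods.
refuses = SimpleNamespace(supports_transfer_syntax=lambda ds: False,
                          get_pixeldata=failing_decode)
fails = SimpleNamespace(supports_transfer_syntax=lambda ds: True,
                        get_pixeldata=failing_decode)
works = SimpleNamespace(supports_transfer_syntax=lambda ds: True,
                        get_pixeldata=lambda ds: [0, 1, 2, 3])

print(get_pixel_array({}, [refuses, fails, works]))
```

Because the list is ordered, placing a preferred handler earlier gives it the first chance to decode the data.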

Working with compressed pixel data - the encaps module

Functions for working with encapsulated (compressed) pixel data.

pydicom.encaps.decode_data_sequence(data)

Read encapsulated data and return a list of byte strings.

Parameters:
data : str

String of encapsulated data, typically dataset.PixelData

Returns:
list of bytes

All fragments in a list of byte strings

pydicom.encaps.defragment_data(data)

Read encapsulated data and return the fragments as one continuous string.

Parameters:
data : list of bytes

The encapsulated pixel data fragments.

Returns:
bytes

All fragments concatenated together.

pydicom.encaps.generate_pixel_data(bytestream)

Yield an encapsulated pixel data frame as a tuple of bytes.

For the following transfer syntaxes, a fragment may not contain encoded data from more than one frame. However data from one frame may span multiple fragments.

  • 1.2.840.10008.1.2.4.50 - JPEG Baseline (Process 1)
  • 1.2.840.10008.1.2.4.51 - JPEG Baseline (Process 2 and 4)
  • 1.2.840.10008.1.2.4.57 - JPEG Lossless, Non-Hierarchical (Process 14)
  • 1.2.840.10008.1.2.4.70 - JPEG Lossless, Non-Hierarchical, First-Order
    Prediction (Process 14 [Selection Value 1])
  • 1.2.840.10008.1.2.4.80 - JPEG-LS Lossless Image Compression
  • 1.2.840.10008.1.2.4.81 - JPEG-LS Lossy (Near-Lossless) Image Compression
  • 1.2.840.10008.1.2.4.90 - JPEG 2000 Image Compression (Lossless Only)
  • 1.2.840.10008.1.2.4.91 - JPEG 2000 Image Compression
  • 1.2.840.10008.1.2.4.92 - JPEG 2000 Part 2 Multi-component Image Compression
    (Lossless Only)
  • 1.2.840.10008.1.2.4.93 - JPEG 2000 Part 2 Multi-component Image Compression

For the following transfer syntaxes, each frame shall be encoded in one and only one fragment.

1.2.840.10008.1.2.5 - RLE Lossless

Parameters:
bytestream : bytes

The value of the (7fe0, 0010) ‘Pixel Data’ element from an encapsulated dataset. The Basic Offset Table item should be present and the Sequence Delimiter item may or may not be present.

Yields:
tuple of bytes

A tuple representing an encapsulated pixel data frame, with the contents of the tuple the frame’s fragmented data.

References

DICOM Standard Part 5, Annex A

pydicom.encaps.generate_pixel_data_fragment(fp)

Yield the encapsulated pixel data fragments as bytes.

For compressed (encapsulated) Transfer Syntaxes, the (7fe0,0010) ‘Pixel Data’ element is encoded in an encapsulated format.

Parameters:
fp : pydicom.filebase.DicomBytesIO

The encoded (7fe0,0010) ‘Pixel Data’ element value, positioned at the start of the item tag for the first item after the Basic Offset Table item. fp.is_little_endian should be set to True.

Yields:
bytes

A pixel data fragment.

Raises:
ValueError

If the data contains an item with an undefined length or an unknown tag.

References

DICOM Standard Part 5, Annex A.4
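To make the item layout concrete, here is a stdlib-only sketch that walks little-endian encapsulated items in a bytes object and yields each fragment. It mirrors the behaviour described above but is not the library's implementation; the function name iter_fragments is hypothetical, and the input is assumed to start at the first item after the Basic Offset Table.

```python
import struct

ITEM_TAG = (0xFFFE, 0xE000)
SEQUENCE_DELIMITER_TAG = (0xFFFE, 0xE0DD)
UNDEFINED_LENGTH = 0xFFFFFFFF


def iter_fragments(buffer):
    """Yield the value of each Item until the Sequence Delimiter or end of data."""
    position = 0
    while position + 8 <= len(buffer):
        # Each item starts with a 4-byte tag and a 4-byte length, little endian.
        group, element, length = struct.unpack_from("<HHL", buffer, position)
        position += 8
        tag = (group, element)
        if tag == SEQUENCE_DELIMITER_TAG:
            break
        if tag != ITEM_TAG:
            raise ValueError("unexpected tag (%04X,%04X)" % tag)
        if length == UNDEFINED_LENGTH:
            raise ValueError("item with undefined length")
        yield buffer[position:position + length]
        position += length


stream = (
    bytes.fromhex("FE FF 00 E0 04 00 00 00") + b"\x01\x02\x03\x04"
    + bytes.fromhex("FE FF 00 E0 02 00 00 00") + b"\xaa\xbb"
    + bytes.fromhex("FE FF DD E0 00 00 00 00")  # Sequence Delimiter item
)
fragments = list(iter_fragments(stream))
print(fragments)

# If both fragments belong to one frame, the frame is their concatenation.
print(b"".join(fragments))
```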

pydicom.encaps.generate_pixel_data_frame(bytestream)

Yield an encapsulated pixel data frame as bytes.

Parameters:
bytestream : bytes

The value of the (7fe0, 0010) ‘Pixel Data’ element from an encapsulated dataset. The Basic Offset Table item should be present and the Sequence Delimiter item may or may not be present.

Yields:
bytes

A frame contained in the encapsulated pixel data.

References

DICOM Standard Part 5, Annex A

pydicom.encaps.get_frame_offsets(fp)

Return a list of the fragment offsets from the Basic Offset Table.

The Basic Offset Table Item must be present and have a tag (FFFE,E000) and a length, however it may or may not have a value.

Basic Offset Table with no value:

  Item Tag    | Length      |
  FE FF 00 E0 | 00 00 00 00 |

Basic Offset Table with value (2 frames):

  Item Tag    | Length      | Offset 1    | Offset 2    |
  FE FF 00 E0 | 08 00 00 00 | 00 00 00 00 | 10 00 00 00 |

For single or multi-frame images with only one frame, the Basic Offset Table may or may not have a value. When it has no value then its length shall be 0x00000000.

For multi-frame images with more than one frame, the Basic Offset Table should have a value containing concatenated 32-bit unsigned integer values that are the byte offsets to the first byte of the Item tag of the first fragment of each frame as measured from the first byte of the first item tag following the Basic Offset Table Item.

All decoders, both for single and multi-frame images should accept both an empty Basic Offset Table and one containing offset values.

Parameters:
fp : pydicom.filebase.DicomBytesIO

The encapsulated pixel data positioned at the start of the Basic Offset Table. fp.is_little_endian should be set to True.

Returns:
offsets : list of int

The byte offsets to the first fragment of each frame, as measured from the start of the first item following the Basic Offset Table item.

Raises:
ValueError

If the Basic Offset Table item’s tag is not (FFFE,E000) or if the length in bytes of the item’s value is not a multiple of 4.

References

DICOM Standard Part 5, Annex A.4
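The two byte layouts shown in the description can be decoded with the standard struct module. This is an illustrative sketch operating on a bytes object rather than a DicomBytesIO; the function name is hypothetical, and returning an empty list for an empty table is a choice made here, not necessarily what get_frame_offsets() returns.

```python
import struct


def parse_basic_offset_table(buffer):
    """Return the 32-bit offsets stored in a little-endian Basic Offset Table item."""
    group, element, length = struct.unpack_from("<HHL", buffer, 0)
    if (group, element) != (0xFFFE, 0xE000):
        raise ValueError("unexpected tag (%04X,%04X)" % (group, element))
    if length % 4:
        raise ValueError("item length must be a multiple of 4")
    count = length // 4
    if count == 0:
        return []  # table present but without a value
    # The value is `count` unsigned 32-bit integers right after the 8-byte header.
    return list(struct.unpack_from("<%dL" % count, buffer, 8))


empty_table = bytes.fromhex("FE FF 00 E0 00 00 00 00")
two_frames = bytes.fromhex("FE FF 00 E0 08 00 00 00 00 00 00 00 10 00 00 00")

print(parse_basic_offset_table(empty_table))
print(parse_basic_offset_table(two_frames))
```

For the two-frame example the second frame starts 16 bytes (0x10) after the first item tag that follows the table.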

pydicom.encaps.read_item(fp)

Read and return a single Item in the fragmented data stream.

Parameters:
fp : pydicom.filebase.DicomIO

The file-like to read the item from.

Returns:
bytes

The Item’s raw bytes (value?).

Miscellaneous helper functions - the misc module

Miscellaneous helper functions

pydicom.misc.is_dicom(file_path)

Boolean specifying if file is a proper DICOM file.

This function is a pared down version of read_preamble meant for a fast return. The file is read for a proper preamble (‘DICM’), returning True if so, and False otherwise. This is a conservative approach.

Parameters:
file_path : str

The path to the file.

See also

filereader.read_preamble, filereader.read_partial
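The check described for is_dicom amounts to looking for the four characters ‘DICM’ at position 128 of the file. A stdlib-only sketch of that test on a file-like object follows; it is not the library's code, and has_dicm_prefix is a hypothetical name.

```python
import io


def has_dicm_prefix(stream):
    """Return True if the four bytes at offset 128 are b'DICM'."""
    stream.seek(128)  # skip the 128-byte preamble
    return stream.read(4) == b"DICM"


with_prefix = io.BytesIO(b"\x00" * 128 + b"DICM" + b"rest of file")
without_prefix = io.BytesIO(b"not a DICOM file")

print(has_dicm_prefix(with_prefix))
print(has_dicm_prefix(without_prefix))
```

A file that fails this test may still hold DICOM data without the preamble and prefix, which is why the check is described as conservative.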

pydicom.misc.size_in_bytes(expr)

Return the number of bytes for a defer_size argument to dcmread().

Pydicom-specific exceptions - the errors module

Module for pydicom exception classes

exception pydicom.errors.InvalidDicomError(*args)

Exception that is raised when the file does not seem to be a valid DICOM file, usually when the four characters “DICM” are not present at position 128 in the file. (According to the DICOM specification, each DICOM file should have this.)

To force reading the file (because maybe it is a dicom file without a header), use dcmread(…, force=True).