Pydicom Complete API Reference
Auto-generated API documentation
DICOM Dataset
Define the Dataset and FileDataset classes.
The Dataset class represents the DICOM Dataset while the FileDataset class adds extra functionality to Dataset when data is read from or written to file.
Overview of DICOM object model
- Dataset (dict subclass)
  - Contains DataElement instances, each of which has a tag, VR, VM and value.
  - The DataElement value can be:
    - A single value, such as a number, string, etc. (i.e. VM = 1)
    - A list of numbers, strings, etc. (i.e. VM > 1)
    - A Sequence (list subclass), where each item is a Dataset which contains its own DataElements, and so on in a recursive manner.
class pydicom.dataset.Dataset(*args, **kwargs)
A collection (dictionary) of DICOM DataElements.
Examples
Add DataElements to the Dataset (for elements in the DICOM dictionary):
>>> from pydicom.dataset import Dataset
>>> from pydicom.dataelem import DataElement
>>> ds = Dataset()
>>> ds.PatientName = "CITIZEN^Joan"
>>> ds.add_new(0x00100020, 'LO', '12345')
>>> ds[0x0010, 0x0030] = DataElement(0x00100030, 'DA', '20010101')
Add Sequence DataElement to the Dataset:
>>> ds.BeamSequence = [Dataset(), Dataset(), Dataset()]
>>> ds.BeamSequence[0].Manufacturer = "Linac, co."
>>> ds.BeamSequence[1].Manufacturer = "Linac and Sons, co."
>>> ds.BeamSequence[2].Manufacturer = "Linac and Daughters, co."
Add private DataElements to the Dataset:
>>> ds.add(DataElement(0x0043102b, 'SS', [4, 4, 0, 0]))
>>> ds.add_new(0x0043102b, 'SS', [4, 4, 0, 0])
>>> ds[0x0043, 0x102b] = DataElement(0x0043102b, 'SS', [4, 4, 0, 0])
Updating and retrieving DataElement values:
>>> ds.PatientName = "CITIZEN^Joan"
>>> ds.PatientName
'CITIZEN^Joan'
>>> ds.PatientName = "CITIZEN^John"
>>> ds.PatientName
'CITIZEN^John'
Retrieving a DataElement’s value from a Sequence:
>>> ds.BeamSequence[0].Manufacturer
'Linac, co.'
>>> ds.BeamSequence[1].Manufacturer
'Linac and Sons, co.'
Retrieving DataElements:
>>> elem = ds[0x00100010]
>>> elem = ds.data_element('PatientName')
>>> elem
(0010, 0010) Patient's Name PN: 'CITIZEN^Joan'
Deleting a DataElement from the Dataset:
>>> del ds.PatientID
>>> del ds.BeamSequence[1].Manufacturer
>>> del ds.BeamSequence[2]
Deleting a private DataElement from the Dataset:
>>> del ds[0x0043, 0x102b]
Determining if a DataElement is present in the Dataset:
>>> 'PatientName' in ds
True
>>> 'PatientID' in ds
False
>>> (0x0010, 0x0030) in ds
True
>>> 'Manufacturer' in ds.BeamSequence[0]
True
Iterating through the top level of a Dataset only (excluding Sequences):
>>> for elem in ds:
...     print(elem)
Iterating through the entire Dataset (including Sequences):
>>> for elem in ds.iterall():
...     print(elem)
Recursively iterate through a Dataset (including Sequences):
>>> def recurse(ds):
...     for elem in ds:
...         if elem.VR == 'SQ':
...             # Each item of a Sequence is itself a Dataset
...             for item in elem.value:
...                 recurse(item)
...         else:
...             # Do something useful with each DataElement
...             pass
Attributes: - default_element_format : str
The default formatting for string display.
- default_sequence_element_format : str
The default formatting for string display of sequences.
- indent_chars : str
For string display, the characters used to indent nested Sequences. Default is " ".
- is_little_endian : bool
Shall be set before writing with write_like_original=False. The written dataset (excluding the pixel data) will be written using the given endianness.
- is_implicit_VR : bool
Shall be set before writing with write_like_original=False. The written dataset will be written using the transfer syntax with the given VR handling, e.g. LittleEndianImplicit if True, and LittleEndianExplicit or BigEndianExplicit (depending on is_little_endian) if False.
Methods
- add(data_element): Add a DataElement to the Dataset.
- add_new(tag, VR, value): Add a DataElement to the Dataset.
- clear()
- convert_pixel_data(): Convert the Pixel Data to a numpy array internally.
- copy()
- data_element(name): Return the DataElement corresponding to the element keyword name.
- decode(): Apply character set decoding to all DataElements in the Dataset.
- decompress(): Decompresses pixel data and modifies the Dataset in-place.
- dir(*filters): Return an alphabetical list of DataElement keywords in the Dataset.
- elements(): Iterate through the top-level of the Dataset, yielding DataElements or RawDataElements (no conversion done).
- formatted_lines([element_format, …]): Iterate through the Dataset yielding formatted str for each element.
- fromkeys($type, iterable[, value]): Returns a new dict with keys from iterable and values equal to value.
- get(key[, default]): Extend dict.get() to handle DICOM DataElement keywords.
- get_item(key): Return the raw data element if possible.
- group_dataset(group): Return a Dataset containing only DataElements of a certain group.
- items()
- iterall(): Iterate through the Dataset, yielding all DataElements.
- keys()
- pop(k[,d]): If key is not found, d is returned if given, otherwise KeyError is raised.
- popitem(): 2-tuple; but raise KeyError if D is empty.
- remove_private_tags(): Remove all private DataElements in the Dataset.
- save_as(filename[, write_like_original]): Write the Dataset to filename.
- setdefault(k[,d])
- top(): Return a str of the Dataset’s top level DataElements only.
- trait_names(): Return a list of valid names for auto-completion code.
- update(dictionary): Extend dict.update() to handle DICOM keywords.
- values()
- walk(callback[, recursive]): Iterate through the DataElements and run callback on each.
add(data_element)
Add a DataElement to the Dataset.
Equivalent to ds[data_element.tag] = data_element
Parameters: - data_element : pydicom.dataelem.DataElement
The DataElement to add to the Dataset.
add_new(tag, VR, value)
Add a DataElement to the Dataset.
Parameters: - tag
The DICOM (group, element) tag in any form accepted by pydicom.tag.Tag such as [0x0010, 0x0010], (0x10, 0x10), 0x00100010, etc.
- VR : str
The 2 character DICOM value representation (see DICOM standard part 5, Section 6.2).
- value
The value of the data element. One of the following:
- a single string or number
- a list or tuple with all strings or all numbers
- a multi-value string with backslash separator
- for a sequence DataElement, an empty list or list of Dataset
convert_pixel_data()
Convert the Pixel Data to a numpy array internally.
Returns: - None
Converted pixel data is stored internally in the dataset. If the image is in a compressed format, it is decompressed and any related data elements are changed accordingly.
data_element(name)
Return the DataElement corresponding to the element keyword name.
Parameters: - name : str
A DICOM element keyword.
Returns: - pydicom.dataelem.DataElement or None
For the given DICOM element keyword, return the corresponding Dataset DataElement if present, None otherwise.
decode()
Apply character set decoding to all DataElements in the Dataset.
See DICOM PS3.5-2008 6.1.1.
decompress()
Decompress the pixel data and modify the Dataset in place.
If the transfer syntax is not compressed, the pixel data is converted to a numpy array internally, but not returned.
If the pixel data is compressed, it is decompressed using an image handler and the internal state is updated accordingly:
- TransferSyntax is updated to non-compressed form
- is_undefined_length for pixel data is set False
Returns: - None
Raises: - NotImplementedError
If the pixel data was originally compressed but the file is not Explicit VR Little Endian, as required by the DICOM standard.
dir(*filters)
Return an alphabetical list of DataElement keywords in the Dataset.
Intended mainly for use in interactive Python sessions. Only lists the DataElement keywords in the current level of the Dataset (i.e. the contents of any Sequence elements are ignored).
Parameters: - filters : str
Zero or more string arguments to the function. Used for case-insensitive match to any part of the DICOM keyword.
Returns: - list of str
The matching DataElement keywords in the dataset. If no filters are used then all DataElement keywords are returned.
elements()
Iterate through the top-level of the Dataset, yielding DataElements or RawDataElements (no conversion done).
>>> for elem in ds.elements():
...     print(elem)
The elements are returned in the same way as in __getitem__.
Yields: - pydicom.dataelem.DataElement or pydicom.dataelem.RawDataElement
The Dataset’s DataElements, sorted by increasing tag order.
formatted_lines(element_format='%(tag)s %(name)-35.35s %(VR)s: %(repval)s', sequence_element_format='%(tag)s %(name)-35.35s %(VR)s: %(repval)s', indent_format=None)
Iterate through the Dataset yielding formatted str for each element.
Parameters: - element_format : str
The string format to use for non-sequence elements. Formatting uses the attributes of DataElement. Default is “%(tag)s %(name)-35.35s %(VR)s: %(repval)s”.
- sequence_element_format : str
The string format to use for sequence elements. Formatting uses the attributes of DataElement. Default is “%(tag)s %(name)-35.35s %(VR)s: %(repval)s”
- indent_format : str or None
Placeholder for future functionality.
Yields: - str
A string representation of a DataElement.
get(key, default=None)
Extend dict.get() to handle DICOM DataElement keywords.
Parameters: - key : str or pydicom.tag.Tag
The element keyword or Tag or the class attribute name to get.
- default : obj or None
If the DataElement or class attribute is not present, return default (default None).
Returns: - value
If key is the keyword for a DataElement in the Dataset then return the DataElement’s value.
- pydicom.dataelem.DataElement
If key is a tag for a DataElement in the Dataset then return the DataElement instance.
- value
If key is a class attribute then return its value.
get_item(key)
Return the raw data element if possible.
It will be raw if the user has never accessed the value, or set their own value. Note if the data element is a deferred-read element, then it is read and converted before being returned.
Parameters: - key
The DICOM (group, element) tag in any form accepted by pydicom.tag.Tag such as [0x0010, 0x0010], (0x10, 0x10), 0x00100010, etc. May also be a slice made up of DICOM tags.
Returns: - pydicom.dataelem.DataElement
group_dataset(group)
Return a Dataset containing only DataElements of a certain group.
Parameters: - group : int
The group part of a DICOM (group, element) tag.
Returns: - pydicom.dataset.Dataset
A dataset instance containing elements of the group specified.
is_original_encoding
Return True if the properties to be used for writing are set and have the same values as the ones in the dataset after reading it. This includes properties related to endianness, VR handling and the specific character set.
iterall()
Iterate through the Dataset, yielding all DataElements.
Unlike Dataset.__iter__, this does recurse into sequences, and so returns all data elements as if the file were “flattened”.
Yields: - pydicom.dataelem.DataElement
pixel_array
Return the Pixel Data as a NumPy array.
Returns: - numpy.ndarray
The Pixel Data (7FE0,0010) as a NumPy ndarray.
remove_private_tags()
Remove all private DataElements in the Dataset.
save_as(filename, write_like_original=True)
Write the Dataset to filename.
Saving a Dataset requires that the Dataset.is_implicit_VR and Dataset.is_little_endian attributes exist and are set appropriately. If Dataset.file_meta.TransferSyntaxUID is present then it should be set to a consistent value to ensure conformance.
Parameters: - filename : str or file-like
Name of file or the file-like to write the new DICOM file to.
- write_like_original : bool
If True (default), preserves the following information from the Dataset (and may result in a non-conformant file):
- preamble – if the original file has no preamble then none will be written.
- file_meta – if the original file was missing any required File Meta Information Group elements then they will not be added or written. If (0002,0000) ‘File Meta Information Group Length’ is present then it may have its value updated.
- seq.is_undefined_length – if the original had delimiters, write them now too, instead of the more sensible length characters.
- is_undefined_length_sequence_item – for datasets that belong to a sequence, write the undefined length delimiters if that is what the original had.
If False, produces a file conformant with the DICOM File Format, with explicit lengths for all elements.
See also
pydicom.filewriter.write_dataset
- Write a DICOM Dataset to a file.
pydicom.filewriter.write_file_meta_info
- Write the DICOM File Meta Information Group elements to a file.
pydicom.filewriter.dcmwrite
- Write a DICOM file from a FileDataset instance.
trait_names()
Return a list of valid names for auto-completion code.
Used in IPython, so that data element names can be found and offered for autocompletion on the IPython command line.
walk(callback, recursive=True)
Iterate through the DataElements and run callback on each.
Visit all DataElements, possibly recursing into sequences and their datasets. The callback function is called for each DataElement (including SQ elements). This can be used to perform an operation on certain types of DataElements; for example, remove_private_tags() finds all private tags and deletes them. DataElements come back in DICOM order (by increasing tag number within their dataset).
Parameters: - callback
A callable that takes two arguments:
- a Dataset
- a DataElement belonging to that Dataset
- recursive : bool
Flag to indicate whether to recurse into Sequences.
class pydicom.dataset.FileDataset(filename_or_obj, dataset, preamble=None, file_meta=None, is_implicit_VR=True, is_little_endian=True)
An extension of Dataset to make reading from and writing to a file-like easier.
Attributes: - preamble : str or bytes or None
The optional DICOM preamble prepended to the dataset, if available.
- file_meta : pydicom.dataset.Dataset or None
The Dataset’s file meta information as a Dataset, if available (None if not present). Consists of group 0002 elements.
- filename : str or None
The filename that the dataset was read from (if read from file) or None if the filename is not available (if read from a BytesIO or similar).
- fileobj_type
The object type of the file-like the Dataset was read from.
- is_implicit_VR : bool
True if the dataset encoding is implicit VR, False otherwise.
- is_little_endian : bool
True if the dataset encoding is little endian byte ordering, False otherwise.
- timestamp : float or None
The modification time of the file the dataset was read from, None if the modification time is not available.
Methods
- add(data_element): Add a DataElement to the Dataset.
- add_new(tag, VR, value): Add a DataElement to the Dataset.
- clear()
- convert_pixel_data(): Convert the Pixel Data to a numpy array internally.
- copy()
- data_element(name): Return the DataElement corresponding to the element keyword name.
- decode(): Apply character set decoding to all DataElements in the Dataset.
- decompress(): Decompresses pixel data and modifies the Dataset in-place.
- dir(*filters): Return an alphabetical list of DataElement keywords in the Dataset.
- elements(): Iterate through the top-level of the Dataset, yielding DataElements or RawDataElements (no conversion done).
- formatted_lines([element_format, …]): Iterate through the Dataset yielding formatted str for each element.
- fromkeys($type, iterable[, value]): Returns a new dict with keys from iterable and values equal to value.
- get(key[, default]): Extend dict.get() to handle DICOM DataElement keywords.
- get_item(key): Return the raw data element if possible.
- group_dataset(group): Return a Dataset containing only DataElements of a certain group.
- items()
- iterall(): Iterate through the Dataset, yielding all DataElements.
- keys()
- pop(k[,d]): If key is not found, d is returned if given, otherwise KeyError is raised.
- popitem(): 2-tuple; but raise KeyError if D is empty.
- remove_private_tags(): Remove all private DataElements in the Dataset.
- save_as(filename[, write_like_original]): Write the Dataset to filename.
- setdefault(k[,d])
- top(): Return a str of the Dataset’s top level DataElements only.
- trait_names(): Return a list of valid names for auto-completion code.
- update(dictionary): Extend dict.update() to handle DICOM keywords.
- values()
- walk(callback[, recursive]): Iterate through the DataElements and run callback on each.
Various helper modules
Configuration of pydicom behavior - the config module
Pydicom configuration options.
pydicom.config.DS_decimal(use_Decimal_boolean=True)
Set DS class to be derived from Decimal (True) or from float (False). If this function is never called, the default in pydicom >= 0.9.8 is for DS to be based on float.
pydicom.config.allow_DS_float = False
Set allow_DS_float to True to allow DSdecimal instances to be created with floats; otherwise, they must be explicitly converted to strings, with the user explicitly setting the precision of digits and rounding. Default: False
pydicom.config.data_element_callback = None
Set data_element_callback to a function to be called from read_dataset every time a RawDataElement has been returned, before it is added to the dataset.
pydicom.config.data_element_callback_kwargs = {}
Set this to a dict of keyword arguments to be passed to the data_element_callback function.
pydicom.config.datetime_conversion = False
Set datetime_conversion to convert DA, DT and TM data elements to datetime.date, datetime.datetime and datetime.time respectively. Default: False
pydicom.config.debug(debug_on=True)
Turn debugging of DICOM file reading and writing on or off. When debugging is on, file location and details about the elements read at that location are logged to the ‘pydicom’ logger using Python’s logging module.
Parameters: debug_on – True (default) to turn on debugging, False to turn off.
pydicom.config.enforce_valid_values = False
Raise errors if any value is not allowed by the DICOM standard, e.g. DS strings that are longer than 16 characters or IS strings outside the allowed range.
pydicom.config.image_handlers = [<module 'pydicom.pixel_data_handlers.numpy_handler' from '/home/ubuntu/pydicom/pydicom/pixel_data_handlers/numpy_handler.py'>, <module 'pydicom.pixel_data_handlers.rle_handler' from '/home/ubuntu/pydicom/pydicom/pixel_data_handlers/rle_handler.py'>, <module 'pydicom.pixel_data_handlers.pillow_handler' from '/home/ubuntu/pydicom/pydicom/pixel_data_handlers/pillow_handler.py'>]
Image handlers for converting pixel data. This is an ordered list of handlers that the dataset._get_pixel_array() method will try, in turn, to extract a correctly sized numpy array from the PixelData attribute. If a handler lacks required dependencies or cannot otherwise be loaded, it shall throw an ImportError. Handlers shall have two methods:
- supports_transfer_syntax(dicom_dataset)
This returns True if the handler might support the transfer syntax indicated in the dicom_dataset.
- get_pixeldata(dicom_dataset)
This shall either throw an exception or return a correctly sized numpy array derived from the PixelData. Reshaping the array to the correct dimensions is handled outside the image handler.
The first handler that both announces that it supports the transfer syntax and does not throw an exception, either in getting the data or when the data is reshaped to the correct dimensions, is the handler that will provide the data.
If they all fail, the exception raised by the last handler to fail is the one that propagates to the caller.
If no handler throws an exception, but they all refuse to support the transfer syntax, then this is reported with a NotImplementedError exception.
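The selection rules for image handlers can be sketched in plain Python. This is not pydicom's implementation: `select_pixel_data` and the three handler classes are hypothetical stand-ins invented for the illustration, and a list of integers stands in for the numpy array.

```python
class UnsupportedHandler:
    """Hypothetical handler that declines every transfer syntax."""
    def supports_transfer_syntax(self, dicom_dataset):
        return False

    def get_pixeldata(self, dicom_dataset):
        raise AssertionError("should never be asked for data")


class FailingHandler:
    """Hypothetical handler that claims support but fails to decode."""
    def supports_transfer_syntax(self, dicom_dataset):
        return True

    def get_pixeldata(self, dicom_dataset):
        raise ValueError("cannot decode")


class WorkingHandler:
    """Hypothetical handler that succeeds."""
    def supports_transfer_syntax(self, dicom_dataset):
        return True

    def get_pixeldata(self, dicom_dataset):
        return [0, 1, 2, 3]   # stand-in for a numpy array


def select_pixel_data(dicom_dataset, handlers):
    """Return data from the first handler that supports and decodes it."""
    last_exception = None
    for handler in handlers:
        if not handler.supports_transfer_syntax(dicom_dataset):
            continue
        try:
            return handler.get_pixeldata(dicom_dataset)
        except Exception as exc:       # remember it and try the next one
            last_exception = exc
    if last_exception is not None:
        raise last_exception           # the last failure is reported
    raise NotImplementedError("no handler supports this transfer syntax")


pixels = select_pixel_data(
    None, [UnsupportedHandler(), FailingHandler(), WorkingHandler()])
print(pixels)
```

The order of the list matters: a handler placed earlier gets the first chance to supply the data.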
Working with compressed pixel data - the encaps module
Functions for working with encapsulated (compressed) pixel data.
pydicom.encaps.decode_data_sequence(data)
Read encapsulated data and return a list of byte strings.
Parameters: - data : str
String of encapsulated data, typically dataset.PixelData
Returns: - list of bytes
All fragments in a list of byte strings
pydicom.encaps.defragment_data(data)
Read encapsulated data and return the fragments as one continuous string.
Parameters: - data : list of bytes
The encapsulated pixel data fragments.
Returns: - bytes
All fragments concatenated together.
pydicom.encaps.generate_pixel_data(bytestream)
Yield an encapsulated pixel data frame as a tuple of bytes.
For the following transfer syntaxes, a fragment may not contain encoded data from more than one frame. However data from one frame may span multiple fragments.
1.2.840.10008.1.2.4.50 - JPEG Baseline (Process 1)
1.2.840.10008.1.2.4.51 - JPEG Extended (Process 2 and 4)
1.2.840.10008.1.2.4.57 - JPEG Lossless, Non-Hierarchical (Process 14)
1.2.840.10008.1.2.4.70 - JPEG Lossless, Non-Hierarchical, First-Order Prediction (Process 14 [Selection Value 1])
1.2.840.10008.1.2.4.80 - JPEG-LS Lossless Image Compression
1.2.840.10008.1.2.4.81 - JPEG-LS Lossy (Near-Lossless) Image Compression
1.2.840.10008.1.2.4.90 - JPEG 2000 Image Compression (Lossless Only)
1.2.840.10008.1.2.4.91 - JPEG 2000 Image Compression
1.2.840.10008.1.2.4.92 - JPEG 2000 Part 2 Multi-component Image Compression (Lossless Only)
1.2.840.10008.1.2.4.93 - JPEG 2000 Part 2 Multi-component Image Compression
For the following transfer syntaxes, each frame shall be encoded in one and only one fragment.
1.2.840.10008.1.2.5 - RLE Lossless
Parameters: - bytestream : bytes
The value of the (7fe0, 0010) ‘Pixel Data’ element from an encapsulated dataset. The Basic Offset Table item should be present and the Sequence Delimiter item may or may not be present.
Yields: - tuple of bytes
A tuple representing an encapsulated pixel data frame, with the contents of the tuple the frame’s fragmented data.
References
DICOM Standard Part 5, Annex A
pydicom.encaps.generate_pixel_data_fragment(fp)
Yield the encapsulated pixel data fragments as bytes.
For compressed (encapsulated) Transfer Syntaxes, the (7fe0,0010) ‘Pixel Data’ element is encoded in an encapsulated format.
Parameters: - fp : pydicom.filebase.DicomBytesIO
The encoded (7fe0,0010) ‘Pixel Data’ element value, positioned at the start of the item tag for the first item after the Basic Offset Table item. fp.is_little_endian should be set to True.
Yields: - bytes
A pixel data fragment.
Raises: - ValueError
If the data contains an item with an undefined length or an unknown tag.
References
DICOM Standard Part 5, Annex A.4
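To show what the encapsulated format looks like at the byte level, here is a standard-library sketch that walks the items of a little-endian byte string. It is not the pydicom implementation; `iter_fragments` is a hypothetical helper, and the input is a hand-built example rather than real pixel data.

```python
import struct

ITEM_TAG = (0xFFFE, 0xE000)
SEQUENCE_DELIMITER_TAG = (0xFFFE, 0xE0DD)


def iter_fragments(buffer, offset=0):
    """Yield each fragment from little-endian encapsulated data.

    `offset` must point at the item tag of the first fragment, i.e. just
    past the Basic Offset Table item.
    """
    while offset + 8 <= len(buffer):
        # Each item starts with a 4-byte tag and a 4-byte length.
        group, element, length = struct.unpack_from('<HHL', buffer, offset)
        offset += 8
        tag = (group, element)
        if tag == SEQUENCE_DELIMITER_TAG:
            break
        if tag != ITEM_TAG:
            raise ValueError("unexpected tag in encapsulated data")
        if length == 0xFFFFFFFF:
            raise ValueError("undefined item length is not allowed")
        yield buffer[offset:offset + length]
        offset += length


encapsulated = (
    b'\xfe\xff\x00\xe0' b'\x04\x00\x00\x00' b'\x01\x02\x03\x04'  # item, 4 bytes
    b'\xfe\xff\x00\xe0' b'\x02\x00\x00\x00' b'\x05\x06'          # item, 2 bytes
    b'\xfe\xff\xdd\xe0' b'\x00\x00\x00\x00'                      # delimiter
)
fragments = list(iter_fragments(encapsulated))
print(fragments)
```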
pydicom.encaps.generate_pixel_data_frame(bytestream)
Yield an encapsulated pixel data frame as bytes.
Parameters: - bytestream : bytes
The value of the (7fe0, 0010) ‘Pixel Data’ element from an encapsulated dataset. The Basic Offset Table item should be present and the Sequence Delimiter item may or may not be present.
Yields: - bytes
A frame contained in the encapsulated pixel data.
References
DICOM Standard Part 5, Annex A
pydicom.encaps.get_frame_offsets(fp)
Return a list of the fragment offsets from the Basic Offset Table.
The Basic Offset Table Item must be present and have a tag (FFFE,E000) and a length, however it may or may not have a value.
Basic Offset Table with no value
Item Tag    | Length      |
FE FF 00 E0 | 00 00 00 00 |
Basic Offset Table with value (2 frames)
Item Tag    | Length      | Offset 1    | Offset 2    |
FE FF 00 E0 | 08 00 00 00 | 00 00 00 00 | 10 00 00 00 |
For single or multi-frame images with only one frame, the Basic Offset Table may or may not have a value. When it has no value then its length shall be 0x00000000.
For multi-frame images with more than one frame, the Basic Offset Table should have a value containing concatenated 32-bit unsigned integer values that are the byte offsets to the first byte of the Item tag of the first fragment of each frame as measured from the first byte of the first item tag following the Basic Offset Table Item.
All decoders, both for single and multi-frame images should accept both an empty Basic Offset Table and one containing offset values.
Parameters: - fp : pydicom.filebase.DicomBytesIO
The encapsulated pixel data positioned at the start of the Basic Offset Table. fp.is_little_endian should be set to True.
Returns: - offsets : list of int
The byte offsets to the first fragment of each frame, as measured from the start of the first item following the Basic Offset Table item.
Raises: - ValueError
If the Basic Offset Table item’s tag is not (FFFE,E000) or if the length in bytes of the item’s value is not a multiple of 4.
References
DICOM Standard Part 5, Annex A.4
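The two byte layouts shown for the Basic Offset Table can be decoded with the standard library alone. The sketch below is not pydicom's code: `parse_basic_offset_table` is a hypothetical helper, and returning an empty list for a table with no value is a choice made for this example only.

```python
import struct


def parse_basic_offset_table(buffer):
    """Return the frame offsets stored in a little-endian Basic Offset Table."""
    group, element, length = struct.unpack_from('<HHL', buffer, 0)
    if (group, element) != (0xFFFE, 0xE000):
        raise ValueError("Basic Offset Table item tag must be (FFFE,E000)")
    if length % 4:
        raise ValueError("Basic Offset Table length must be a multiple of 4")
    # Each offset is an unsigned 32-bit integer following the 8-byte header.
    count = length // 4
    return list(struct.unpack_from('<{}L'.format(count), buffer, 8))


empty_table = bytes.fromhex('FE FF 00 E0 00 00 00 00')
two_frames = bytes.fromhex(
    'FE FF 00 E0 08 00 00 00 00 00 00 00 10 00 00 00')

empty_offsets = parse_basic_offset_table(empty_table)
frame_offsets = parse_basic_offset_table(two_frames)
print(empty_offsets, frame_offsets)
```

In the second table the offsets are 0 and 0x10, so the second frame's first fragment starts 16 bytes after the first item that follows the table.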
Miscellaneous helper functions - the misc module
Miscellaneous helper functions
pydicom.misc.is_dicom(file_path)
Boolean specifying if file is a proper DICOM file.
This function is a pared down version of read_preamble meant for a fast return. The file is read for a proper preamble (‘DICM’), returning True if so, and False otherwise. This is a conservative approach.
Parameters: - file_path : str
The path to the file.
See also
filereader.read_preamble, filereader.read_partial
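The preamble check described for is_dicom amounts to looking for the four bytes 'DICM' after a 128-byte preamble. A standard-library sketch of that idea follows; `looks_like_dicom` is a hypothetical name, and the code is an approximation of the check rather than pydicom's own source.

```python
import os
import tempfile


def looks_like_dicom(file_path):
    """Return True if 'DICM' follows the 128-byte preamble."""
    with open(file_path, 'rb') as fp:
        fp.seek(128)                  # skip the preamble
        return fp.read(4) == b'DICM'


with tempfile.TemporaryDirectory() as tmp_dir:
    good_path = os.path.join(tmp_dir, 'good.dcm')
    bad_path = os.path.join(tmp_dir, 'bad.bin')
    with open(good_path, 'wb') as fp:
        fp.write(b'\x00' * 128 + b'DICM')
    with open(bad_path, 'wb') as fp:
        fp.write(b'not a DICOM file')
    good_result = looks_like_dicom(good_path)
    bad_result = looks_like_dicom(bad_path)

print(good_result, bad_result)
```

A file that is valid DICOM but was written without a preamble fails this check, which is why the approach is described as conservative.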
Pydicom-specific exceptions - the errors module
Module for pydicom exception classes
exception pydicom.errors.InvalidDicomError(*args)
Exception that is raised when the file does not seem to be a valid DICOM file, usually when the four characters “DICM” are not present at position 128 in the file. (According to the DICOM specification, each DICOM file should have this.)
To force reading the file (because maybe it is a dicom file without a header), use dcmread(…, force=True).