Element VRs and Python types

DICOM elements can contain anything from ASCII strings to Unicode text, decimals, floats, signed and unsigned integers of different byte depths, and even encoded data. The format of an element's value is given by its Value Representation (VR), and the full list of VRs is given in Part 5 of the DICOM Standard, Table 6.2-1.

So when using pydicom, what Python type should be used with a given VR to ensure that the value gets written correctly?

  • Elements of any VR:

    • Can be set as empty by using None

    • Can have their values set using either the "Set using" or the "Stored as" type from the table below

  • Non-SQ element values:

    • Can also be set using a list of their "Set using" type. For Value Multiplicity (VM) > 1, the value will be stored as a MultiValue of their "Stored as" type

    • However, according to the DICOM Standard, elements with VR LT, OB, OD, OF, OL, OW, ST, UN, UR and UT should never have a VM greater than 1.

  • SQ element values should be set using a list of zero or more Dataset instances.

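====  =========================  ==============================  ==========================  =============================
VR    Name                       Set using                       Stored as (T)               Type hint for element value
====  =========================  ==============================  ==========================  =============================
AE    Application Entity         str                             str                         None | T | MutableSequence[T]
AS    Age String                 str                             str                         None | T | MutableSequence[T]
AT    Attribute Tag              Tag [1]                         BaseTag                     None | T | MutableSequence[T]
CS    Code String                str                             str                         None | T | MutableSequence[T]
DA    Date                       str                             str or DA [2]               None | T | MutableSequence[T]
DS    Decimal String             str, float or int               DSfloat or DSdecimal [3]    None | T | MutableSequence[T]
DT    Date Time                  str                             str or DT [2]               None | T | MutableSequence[T]
FL    Floating Point Single      float                           float                       None | T | MutableSequence[T]
FD    Floating Point Double      float                           float                       None | T | MutableSequence[T]
IS    Integer String             str or int                      IS                          None | T | MutableSequence[T]
LO    Long String                str                             str                         None | T | MutableSequence[T]
LT    Long Text                  str                             str                         None | T
OB    Other Byte                 bytes or io.BufferedIOBase [4]  bytes or io.BufferedIOBase  None | T
OD    Other Double               bytes or io.BufferedIOBase [4]  bytes or io.BufferedIOBase  None | T
OF    Other Float                bytes or io.BufferedIOBase [4]  bytes or io.BufferedIOBase  None | T
OL    Other Long                 bytes or io.BufferedIOBase [4]  bytes or io.BufferedIOBase  None | T
OV    Other 64-bit Very Long     bytes or io.BufferedIOBase [4]  bytes or io.BufferedIOBase  None | T
OW    Other Word                 bytes or io.BufferedIOBase [4]  bytes or io.BufferedIOBase  None | T
PN    Person Name                str                             PersonName                  None | T | MutableSequence[T]
SH    Short String               str                             str                         None | T | MutableSequence[T]
SL    Signed Long                int                             int                         None | T | MutableSequence[T]
SQ    Sequence of Items          list                            Sequence                    MutableSequence[Dataset]
SS    Signed Short               int                             int                         None | T | MutableSequence[T]
ST    Short Text                 str                             str                         None | T
SV    Signed 64-bit Very Long    int                             int                         None | T | MutableSequence[T]
TM    Time                       str                             str or TM [2]               None | T | MutableSequence[T]
UC    Unlimited Characters       str                             str                         None | T | MutableSequence[T]
UI    Unique Identifier (UID)    str                             UID                         None | T | MutableSequence[T]
UL    Unsigned Long              int                             int                         None | T | MutableSequence[T]
UN    Unknown                    bytes                           bytes                       None | T
UR    URI/URL                    str                             str                         None | T
US    Unsigned Short             int                             int                         None | T | MutableSequence[T]
UT    Unlimited Text             str                             str                         None | T
UV    Unsigned 64-bit Very Long  int                             int                         None | T | MutableSequence[T]
====  =========================  ==============================  ==========================  =============================

[1] Any type accepted by Tag() can be used
[2] If config.datetime_conversion = True (default False)
[3] If config.use_DS_decimal = True (default False)
[4] See the notes for bufferable O* VRs below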

Bufferable O* VRs

The value for elements with O* VRs (OB, OD, OF, OL, OV and OW) can be set using an object that inherits from io.BufferedIOBase, such as the io.BufferedReader instance returned by the open() built-in when a file is opened in binary read mode ("rb"). This avoids having to read a large amount of data into memory when creating datasets:

from pydicom import Dataset, FileMetaDataset
from pydicom.uid import ExplicitVRLittleEndian

with open("a_large_amount_of_data", "rb") as f:
    ds = Dataset()
    ds.file_meta = FileMetaDataset()
    ds.file_meta.TransferSyntaxUID = ExplicitVRLittleEndian
    ds.PixelData = f
    ds.save_as("large_dataset.dcm")
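
Which objects count as inheriting from io.BufferedIOBase can be checked with the standard library alone. The sketch below uses a hypothetical helper, is_bufferable(), that is not part of pydicom; it only tests the inheritance requirement stated above, and pydicom may apply further checks of its own.

```python
import io
import os
import tempfile


def is_bufferable(obj: object) -> bool:
    """Hypothetical helper: True if obj inherits from io.BufferedIOBase."""
    return isinstance(obj, io.BufferedIOBase)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.bin")
    with open(path, "wb") as f:
        f.write(b"\x00" * 16)

    with open(path, "rb") as binary, open(path, "r") as text:
        results = {
            "open(..., 'rb')": is_bufferable(binary),  # io.BufferedReader
            "open(..., 'r')": is_bufferable(text),     # io.TextIOWrapper
            "io.BytesIO": is_bufferable(io.BytesIO(b"\x00")),
        }

print(results)
```

A file opened in text mode is wrapped in io.TextIOWrapper, which does not inherit from io.BufferedIOBase, so only binary-mode files and in-memory io.BytesIO objects pass this check.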

To use a buffered Pixel Data value with a dataset that has a compressed transfer syntax such as JPEG 2000 Lossless, the encapsulate_buffer() and encapsulate_extended_buffer() functions can be used to encapsulate the buffered frames:

from pydicom import Dataset, FileMetaDataset
from pydicom.encaps import encapsulate_buffer
from pydicom.uid import JPEG2000Lossless

with open("a_large_jpeg2000_file.j2k", "rb") as f:
    ds = Dataset()
    ds.file_meta = FileMetaDataset()
    ds.file_meta.TransferSyntaxUID = JPEG2000Lossless
    ds.PixelData = encapsulate_buffer([f])
    ds.save_as("large_compressed_dataset.dcm")

However, be aware that a Dataset containing a buffered object that does not itself support pickle or deepcopy(), such as a BufferedReader, cannot be pickled or deep-copied.
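
The underlying limitation comes from the buffer objects themselves and can be seen without pydicom. The sketch below uses a hypothetical helper, can_pickle_and_copy(), to compare an io.BufferedReader with an in-memory io.BytesIO. It tests the buffer objects directly, not a Dataset that contains them.

```python
import copy
import io
import os
import pickle
import tempfile


def can_pickle_and_copy(obj: object) -> bool:
    """Hypothetical helper: True if obj survives pickle.dumps() and deepcopy()."""
    try:
        pickle.dumps(obj)
        copy.deepcopy(obj)
    except (TypeError, pickle.PicklingError):
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.bin")
    with open(path, "wb") as f:
        f.write(b"\x00" * 16)

    with open(path, "rb") as reader:
        reader_ok = can_pickle_and_copy(reader)

bytesio_ok = can_pickle_and_copy(io.BytesIO(b"\x00" * 16))

print(f"BufferedReader: {reader_ok}, BytesIO: {bytesio_ok}")
```

If a dataset needs to be pickled or deep-copied, one option is to set the element value from bytes, at the cost of holding the data in memory.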