pydicom.dataset.Dataset

class pydicom.dataset.Dataset(*args: Dict[pydicom.tag.BaseTag, Union[pydicom.dataelem.DataElement, pydicom.dataelem.RawDataElement]], **kwargs: object)

Contains a collection (dictionary) of DICOM Data Elements. Behaves like a dict.

Note
Dataset is only derived from dict to make it work in a NumPy ndarray. The parent dict class is never called, as all dict methods are overridden.

Examples

Add an element to the Dataset (for elements in the DICOM dictionary):

    >>> from pydicom.dataset import Dataset
    >>> from pydicom.dataelem import DataElement
    >>> ds = Dataset()
    >>> ds.PatientName = "CITIZEN^Joan"
    >>> ds.add_new(0x00100020, 'LO', '12345')
    >>> ds[0x0010, 0x0030] = DataElement(0x00100030, 'DA', '20010101')

Add a sequence element to the Dataset:

    >>> ds.BeamSequence = [Dataset(), Dataset(), Dataset()]
    >>> ds.BeamSequence[0].Manufacturer = "Linac, co."
    >>> ds.BeamSequence[1].Manufacturer = "Linac and Sons, co."
    >>> ds.BeamSequence[2].Manufacturer = "Linac and Daughters, co."

Add private elements to the Dataset:

    >>> block = ds.private_block(0x0041, 'My Creator', create=True)
    >>> block.add_new(0x01, 'LO', '12345')

Updating and retrieving element values:

    >>> ds.PatientName = "CITIZEN^Joan"
    >>> ds.PatientName
    'CITIZEN^Joan'
    >>> ds.PatientName = "CITIZEN^John"
    >>> ds.PatientName
    'CITIZEN^John'

Retrieving an element's value from a Sequence:

    >>> ds.BeamSequence[0].Manufacturer
    'Linac, co.'
    >>> ds.BeamSequence[1].Manufacturer
    'Linac and Sons, co.'

Accessing the DataElement items:

    >>> elem = ds['PatientName']
    >>> elem
    (0010, 0010) Patient's Name PN: 'CITIZEN^John'
    >>> elem = ds[0x00100010]
    >>> elem
    (0010, 0010) Patient's Name PN: 'CITIZEN^John'
    >>> elem = ds.data_element('PatientName')
    >>> elem
    (0010, 0010) Patient's Name PN: 'CITIZEN^John'

Accessing a private DataElement item:

    >>> block = ds.private_block(0x0041, 'My Creator')
    >>> elem = block[0x01]
    >>> elem
    (0041, 1001) Private tag data LO: '12345'
    >>> elem.value
    '12345'

Alternatively:

    >>> ds.get_private_item(0x0041, 0x01, 'My Creator').value
    '12345'

Deleting an element from the Dataset:

    >>> del ds.PatientID
    >>> del ds.BeamSequence[1].Manufacturer
    >>> del ds.BeamSequence[2]

Deleting a private element from the Dataset:

    >>> block = ds.private_block(0x0041, 'My Creator')
    >>> if 0x01 in block:
    ...     del block[0x01]

Determining if an element is present in the Dataset:

    >>> 'PatientName' in ds
    True
    >>> 'PatientID' in ds
    False
    >>> (0x0010, 0x0030) in ds
    True
    >>> 'Manufacturer' in ds.BeamSequence[0]
    True

Iterating through the top level of a Dataset only (excluding Sequences):

    >>> for elem in ds:
    ...     print(elem)
    (0010, 0010) Patient's Name PN: 'CITIZEN^John'

Iterating through the entire Dataset (including Sequences):

    >>> for elem in ds.iterall():
    ...     print(elem)
    (0010, 0010) Patient's Name PN: 'CITIZEN^John'

Recursively iterate through a Dataset (including Sequences):

    >>> def recurse(ds):
    ...     for elem in ds:
    ...         if elem.VR == 'SQ':
    ...             # Each item of a sequence is itself a Dataset
    ...             for item in elem.value:
    ...                 recurse(item)
    ...         else:
    ...             # Do something useful with each DataElement
    ...             pass

Converting the Dataset to and from JSON:

    >>> ds = Dataset()
    >>> ds.PatientName = "Some^Name"
    >>> jsonmodel = ds.to_json()
    >>> ds2 = Dataset()
    >>> ds2.from_json(jsonmodel)
    (0010, 0010) Patient's Name PN: 'Some^Name'

indent_chars

For string display, the characters used to indent nested Sequences. Default is " ".

Type
    str

is_little_endian

Shall be set before writing with write_like_original=False. The Dataset (excluding the pixel data) will be written using the given endianness.

Type
    bool

is_implicit_VR

Shall be set before writing with write_like_original=False. The Dataset will be written using the transfer syntax with the given VR handling, e.g. Little Endian Implicit VR if True, and Little Endian Explicit VR or Big Endian Explicit VR (depending on Dataset.is_little_endian) if False.

Type
    bool

__init__(*args: Dict[pydicom.tag.BaseTag, Union[pydicom.dataelem.DataElement, pydicom.dataelem.RawDataElement]], **kwargs: object) → None

Create a new Dataset instance.

Methods

__init__(*args, **kwargs) – Create a new Dataset instance.
add(data_element) – Add an element to the Dataset.
add_new(tag, VR, value) – Create a new element and add it to the Dataset.
clear() – Delete all the elements from the Dataset.
convert_pixel_data([handler_name]) – Convert pixel data to a numpy.ndarray internally.
copy() – Return a shallow copy of the dataset.
data_element(name) – Return the element corresponding to the element keyword name.
decode() – Apply character set decoding to the elements in the Dataset.
decompress([handler_name]) – Decompresses Pixel Data and modifies the Dataset in-place.
dir(*filters) – Return an alphabetical list of element keywords in the Dataset.
elements() – Yield the top-level elements of the Dataset.
ensure_file_meta() – Create an empty Dataset.file_meta if none exists.
fix_meta_info([enforce_standard]) – Ensure the file meta info exists and has the correct values for transfer syntax and media storage UIDs.
formatted_lines([element_format, …]) – Iterate through the Dataset yielding formatted str for each element.
from_json(json_dataset[, bulk_data_uri_handler]) – Add elements to the Dataset from DICOM JSON format.
fromkeys([value]) – Create a new dictionary with keys from iterable and values set to value.
get() – Simulate dict.get() to handle element tags and keywords.
get_item() – Return the raw data element if possible.
get_private_item(group, element_offset, …) – Return the data element for the given private tag group.
group_dataset(group) – Return a Dataset containing only elements of a certain group.
items() – Return the Dataset items to simulate dict.items().
iterall() – Iterate through the Dataset, yielding all the elements.
keys() – Return the Dataset keys to simulate dict.keys().
overlay_array(group) – Return the Overlay Data in group as a numpy.ndarray.
pop(key, *args) – Emulate dict.pop() with support for tags and keywords.
popitem() – Emulate dict.popitem().
private_block(group, private_creator[, create]) – Return the block for the given tag group and private_creator.
private_creators(group) – Return a list of private creator names in the given group.
remove_private_tags() – Remove all private elements from the Dataset.
save_as(filename[, write_like_original]) – Write the Dataset to filename.
set_original_encoding(is_implicit_vr, …) – Set the values for the original transfer syntax and encoding.
setdefault(key[, default]) – Emulate dict.setdefault() with support for tags and keywords.
to_json([bulk_data_threshold, …]) – Return a JSON representation of the Dataset.
to_json_dict([bulk_data_threshold, …]) – Return a dictionary representation of the Dataset conforming to the DICOM JSON Model as described in the DICOM Standard, Part 18, Annex F.
top() – Return a str representation of the top level elements.
trait_names() – Return a list of valid names for auto-completion code.
update(dictionary) – Extend dict.update() to handle DICOM tags and keywords.
values() – Return the Dataset values to simulate dict.values().
walk(callback[, recursive]) – Iterate through the Dataset's elements and run callback on each.
waveform_array(index) – Return an ndarray for the multiplex group at index in the (5400,0100) Waveform Sequence.

Attributes

is_original_encoding – Return True if the encoding to be used for writing is set and is the same as that used to originally encode the Dataset.
pixel_array – Return the pixel data as a numpy.ndarray.

add(data_element: pydicom.dataelem.DataElement) → None

Add an element to the Dataset.

Equivalent to ds[data_element.tag] = data_element

Parameters
    data_element (dataelem.DataElement) – The DataElement to add.

add_new(tag: Union[int, str, Tuple[int, int], BaseTag], VR: str, value: object) → None

Create a new element and add it to the Dataset.

Parameters
    tag – The DICOM (group, element) tag in any form accepted by Tag() such as [0x0010, 0x0010], (0x10, 0x10), 0x00100010, etc.
    VR (str) – The 2 character DICOM value representation (see DICOM Standard, Part 5, Section 6.2).
    value – The value of the data element. One of the following:

convert_pixel_data(handler_name: str = '') → None

Convert pixel data to a numpy.ndarray internally.

Parameters
    handler_name (str, optional) – The name of the pixel handler that shall be used to decode the data. Supported names are: 'gdcm', 'pillow', 'jpeg_ls', 'rle', 'numpy' and 'pylibjpeg'. If not used (the default), a matching handler is used from the handlers configured in pixel_data_handlers.

Returns
    Converted pixel data is stored internally in the dataset.

Return type
    None

Raises
    ValueError – If handler_name is not a valid handler name.
    NotImplementedError – If the given handler or any handler, if none given, is unable to decompress pixel data with the current transfer syntax.
    RuntimeError – If the given handler, or the handler that has been selected if none given, is not available.

Notes
If the pixel data is in a compressed image format, the data is decompressed and any related data elements are changed accordingly.

copy() → pydicom.dataset.Dataset

Return a shallow copy of the dataset.

data_element(name: str) → Optional[pydicom.dataelem.DataElement]

Return the element corresponding to the element keyword name.

Parameters
    name (str) – A DICOM element keyword.

Returns
    For the given DICOM element keyword, return the corresponding DataElement if present, None otherwise.

Return type
    dataelem.DataElement or None

decode() → None

Apply character set decoding to the elements in the Dataset.

See DICOM Standard, Part 5, Section 6.1.1.

decompress(handler_name: str = '') → None

Decompresses Pixel Data and modifies the Dataset in-place.

Changed in version 1.4: The handler_name keyword argument was added.

If the transfer syntax is not compressed, the pixel data is converted to a numpy.ndarray internally, but not returned.

If the pixel data is compressed, it is decompressed using an image handler, and internal state is updated appropriately:

    Dataset.file_meta.TransferSyntaxUID is updated to the non-compressed form.
    is_undefined_length is False for the (7FE0,0010) Pixel Data element.

Parameters
    handler_name (str, optional) – The name of the pixel handler that shall be used to decode the data. Supported names are: 'gdcm', 'pillow', 'jpeg_ls', 'rle' and 'numpy'. If not used (the default), a matching handler is used from the handlers configured in pixel_data_handlers.

Return type
    None

Raises
    NotImplementedError – If the pixel data was originally compressed but the file is not Explicit VR Little Endian as required by the DICOM Standard.

dir(*filters: str) → List[str]

Return an alphabetical list of element keywords in the Dataset.

Intended mainly for use in interactive Python sessions. Only lists the element keywords in the current level of the Dataset (i.e. the contents of any sequence elements are ignored).

Parameters
    filters (str) – Zero or more string arguments to the function. Used for case-insensitive match to any part of the DICOM keyword.

Returns
    The matching element keywords in the dataset. If no filters are used then all element keywords are returned.

Return type
    list of str
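
The matching rule described for dir() can be illustrated without pydicom. The following is a minimal sketch, not the library's implementation: filter_keywords is a hypothetical helper, and it assumes a keyword is kept when it contains any one of the filters.

```python
from typing import Iterable, List


def filter_keywords(keywords: Iterable[str], *filters: str) -> List[str]:
    """Hypothetical helper mimicking the matching rule described for dir()."""
    if not filters:
        # No filters: every keyword is returned, in alphabetical order.
        return sorted(keywords)
    lowered = [f.lower() for f in filters]
    # Assumption: a keyword is kept if ANY filter occurs anywhere in it,
    # compared case-insensitively.
    return sorted(
        kw for kw in keywords if any(f in kw.lower() for f in lowered)
    )


keywords = ["PatientName", "Modality", "PatientID", "StudyDate"]
patient_keywords = filter_keywords(keywords, "pat")
date_or_mod = filter_keywords(keywords, "DATE", "mod")
```

With a real Dataset the equivalent call would be ds.dir("pat"), which only considers keywords at the top level of the dataset.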

elements() → Iterator[pydicom.dataelem.DataElement]

Yield the top-level elements of the Dataset.

New in version 1.1.

Examples

    >>> ds = Dataset()
    >>> for elem in ds.elements():
    ...     print(elem)

The elements are returned in the same way as in Dataset.__getitem__().

Yields
    dataelem.DataElement or dataelem.RawDataElement – The unconverted elements sorted by increasing tag order.

ensure_file_meta() → None

Create an empty Dataset.file_meta if none exists.

New in version 1.2.

fix_meta_info(enforce_standard: bool = True) → None

Ensure the file meta info exists and has the correct values for transfer syntax and media storage UIDs.

New in version 1.2.

Warning
The transfer syntax for is_implicit_VR = False and is_little_endian = True is ambiguous and will therefore not be set.

Parameters
    enforce_standard (bool, optional) – If True, a check for incorrect and missing elements is performed (see validate_file_meta()).
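
To make the warning concrete, the sketch below maps the two encoding flags to a transfer syntax UID where they identify exactly one. It is an illustration only, not pydicom's code: infer_transfer_syntax is a hypothetical helper, and raising ValueError for the implicit/big-endian combination is an assumption of this sketch. The UIDs themselves are the standard DICOM values.

```python
from typing import Optional

# Standard DICOM transfer syntax UIDs.
IMPLICIT_VR_LITTLE_ENDIAN = "1.2.840.10008.1.2"
EXPLICIT_VR_BIG_ENDIAN = "1.2.840.10008.1.2.2"


def infer_transfer_syntax(
    is_implicit_vr: bool, is_little_endian: bool
) -> Optional[str]:
    """Hypothetical helper: return a UID only when the flags identify one."""
    if is_implicit_vr and is_little_endian:
        return IMPLICIT_VR_LITTLE_ENDIAN
    if not is_implicit_vr and not is_little_endian:
        return EXPLICIT_VR_BIG_ENDIAN
    if is_implicit_vr and not is_little_endian:
        # Assumption of this sketch: reject the combination outright,
        # since DICOM defines no Implicit VR Big Endian transfer syntax.
        raise ValueError("Implicit VR Big Endian is not a transfer syntax")
    # Explicit VR + little endian is shared by Explicit VR Little Endian
    # and by the compressed (encapsulated) transfer syntaxes, so the flags
    # alone cannot pick one.
    return None


ambiguous = infer_transfer_syntax(is_implicit_vr=False, is_little_endian=True)
implicit_le = infer_transfer_syntax(is_implicit_vr=True, is_little_endian=True)
```

In the ambiguous case the caller has to set Dataset.file_meta.TransferSyntaxUID explicitly.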

formatted_lines(element_format: str = '%(tag)s %(name)-35.35s %(VR)s: %(repval)s', sequence_element_format: str = '%(tag)s %(name)-35.35s %(VR)s: %(repval)s', indent_format: Optional[str] = None) → Iterator[str]

Iterate through the Dataset yielding formatted str for each element.

Parameters
    element_format (str) – The string format to use for non-sequence elements. Formatting uses the attributes of DataElement. Default is "%(tag)s %(name)-35.35s %(VR)s: %(repval)s".
    sequence_element_format (str) – The string format to use for sequence elements. Formatting uses the attributes of DataElement. Default is "%(tag)s %(name)-35.35s %(VR)s: %(repval)s".
    indent_format (str or None) – Placeholder for future functionality.

Yields
    str – A string representation of an element.
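
The default format is an ordinary printf-style mapping format, so its effect can be shown with plain Python. In this sketch the fields dict is a hypothetical stand-in for the attributes of a DataElement; no pydicom objects are involved.

```python
ELEMENT_FORMAT = "%(tag)s %(name)-35.35s %(VR)s: %(repval)s"

# Hypothetical stand-in for the DataElement attributes used by the format.
fields = {
    "tag": "(0010, 0010)",
    "name": "Patient's Name",
    "VR": "PN",
    "repval": "'CITIZEN^John'",
}

# "-35.35s" left-justifies the name, pads it to 35 characters and
# truncates anything longer than 35 characters.
line = ELEMENT_FORMAT % fields
long_name_line = ELEMENT_FORMAT % {**fields, "name": "X" * 50}
```

A custom format such as "%(tag)s %(VR)s" can be passed as element_format to drop the columns that are not needed.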

classmethod from_json(json_dataset: Union[Dict[str, bytes], str], bulk_data_uri_handler: Optional[Union[Callable[[pydicom.tag.BaseTag, str, str], object], Callable[[str], object]]] = None) → _Dataset

Add elements to the Dataset from DICOM JSON format.

New in version 1.3.

See the DICOM Standard, Part 18, Annex F.

Parameters
    json_dataset (dict or str) – dict or str representing a DICOM Data Set formatted based on the DICOM JSON Model.
    bulk_data_uri_handler (callable, optional) – Callable function that accepts either the tag, vr and "BulkDataURI" or just the "BulkDataURI" of the JSON representation of a data element and returns the actual value of the data element (retrieved via DICOMweb WADO-RS).

Return type
    Dataset

get(key: str, default: Optional[object] = 'None') → object
get(key: Union[int, Tuple[int, int], pydicom.tag.BaseTag], default: Optional[object] = 'None') → pydicom.dataelem.DataElement

Simulate dict.get() to handle element tags and keywords.

Returns
    value – If key is the keyword for an element in the Dataset then return the element's value.
    dataelem.DataElement – If key is a tag for an element in the Dataset then return the DataElement instance.
    value – If key is a class attribute then return its value.

get_item(key: slice) → Dataset
get_item(key: Union[int, str, Tuple[int, int], BaseTag]) → pydicom.dataelem.DataElement

Return the raw data element if possible.

It will be raw if the user has never accessed the value, or set their own value. Note that if the data element is a deferred-read element, it is read and converted before being returned.

get_private_item(group: int, element_offset: int, private_creator: str) → pydicom.dataelem.DataElement

Return the data element for the given private tag group.

New in version 1.3.

This is analogous to Dataset.__getitem__(), but only for private tags. It allows the private tag for the correct private creator to be found without first adding the tag to the private dictionary.

Returns
    The corresponding element.

Return type
    dataelem.DataElement

Raises
    ValueError – If group is not part of a private tag or private_creator is empty.
    KeyError – If the private creator tag is not found in the given group, or if the private tag is not found.

group_dataset(group: int) → pydicom.dataset.Dataset

Return a Dataset containing only elements of a certain group.

property is_original_encoding

Return True if the encoding to be used for writing is set and is the same as that used to originally encode the Dataset.

New in version 1.1.

This includes properties related to endianness, VR handling and the (0008,0005) Specific Character Set.

items() → ItemsView[pydicom.tag.BaseTag, Union[pydicom.dataelem.DataElement, pydicom.dataelem.RawDataElement]]

Return the Dataset items to simulate dict.items().

Returns
    The top-level (BaseTag, DataElement) items for the Dataset.

Return type
    dict_items

iterall() → Iterator[pydicom.dataelem.DataElement]

Iterate through the Dataset, yielding all the elements.

Unlike iter(Dataset), this does recurse into sequences, and so yields all elements as if the dataset were "flattened".

Yields
    dataelem.DataElement

keys() → KeysView[pydicom.tag.BaseTag]

Return the Dataset keys to simulate dict.keys().

overlay_array(group: int) → np.ndarray

Return the Overlay Data in group as a numpy.ndarray.

New in version 1.4.

Parameters
    group (int) – The group number of the overlay data.

Returns
    The (group,3000) Overlay Data converted to a numpy.ndarray.

Return type
    numpy.ndarray

property pixel_array

Return the pixel data as a numpy.ndarray.

Changed in version 1.4: Added support for Float Pixel Data and Double Float Pixel Data.

Returns
    The (7FE0,0008) Float Pixel Data, (7FE0,0009) Double Float Pixel Data or (7FE0,0010) Pixel Data converted to a numpy.ndarray.

Return type
    numpy.ndarray

pop(key: Union[int, str, Tuple[int, int], BaseTag], *args: object) → Union[pydicom.dataelem.DataElement, pydicom.dataelem.RawDataElement]

Emulate dict.pop() with support for tags and keywords.

Removes the element for key if it exists and returns it, otherwise returns a default value if given or raises KeyError.

Parameters
    *args (zero or one argument) – Defines the behavior if no tag exists for key: if given, it defines the return value; if not given, KeyError is raised.

Returns
    The element for key if it exists, or the default value if given.

Raises
    KeyError – If the key is not a valid tag or keyword, or if the tag does not exist and no default is given.

popitem() → Tuple[pydicom.tag.BaseTag, Union[pydicom.dataelem.DataElement, pydicom.dataelem.RawDataElement]]

Emulate dict.popitem().

Return type
    tuple of (BaseTag, DataElement)

private_block(group: int, private_creator: str, create: bool = False) → pydicom.dataset.PrivateBlock

Return the block for the given tag group and private_creator.

New in version 1.3.

If create is True and the private_creator does not exist, the private creator tag is added.

Notes
We ignore the unrealistic case that no free block is available.

Parameters
    group (int) – The group of the private tag to be found as a 32-bit int. Must be an odd number (i.e. a private group).
    private_creator (str) – The private creator string associated with the tag.
    create (bool, optional) – If True and private_creator does not exist, a new private creator tag is added at the next free block. If False (the default) and private_creator does not exist, KeyError is raised instead.

Returns
    The existing or newly created private block.

Return type
    PrivateBlock

Raises
    ValueError – If group doesn't belong to a private tag or private_creator is empty.
    KeyError – If the private creator tag is not found in the given group and the create parameter is False.
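
The tag shown in the class examples, (0041, 1001) for element offset 0x01, follows from the DICOM private block layout: a private creator stored at element 0x00XX of an odd group reserves elements 0xXX00 to 0xXXFF. The arithmetic can be sketched in plain Python. private_element_tag is a hypothetical helper, not part of pydicom, and the example assumes the creator landed in the first free block, 0x10.

```python
def private_element_tag(group: int, block_start: int, element_offset: int) -> int:
    """Hypothetical helper: combine group, block and offset into a 32-bit tag."""
    if group % 2 == 0:
        raise ValueError("private groups are odd-numbered")
    if not 0x10 <= block_start <= 0xFF:
        raise ValueError("private blocks start at 0x10 and end at 0xFF")
    if not 0x00 <= element_offset <= 0xFF:
        raise ValueError("the element offset must fit in one byte")
    # The block number is the high byte of the element, the offset the low byte.
    element = (block_start << 8) | element_offset
    return (group << 16) | element


tag = private_element_tag(0x0041, 0x10, 0x01)
formatted = f"({tag >> 16:04x}, {tag & 0xFFFF:04x})"
```

Working with the block returned by private_block() avoids doing this arithmetic by hand, since the block resolves the offset against whichever block the creator actually occupies.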

private_creators(group: int) → List[str]

Return a list of private creator names in the given group.

New in version 1.3.

Examples

This can be used to check if a given private creator exists in the group of the dataset:

    >>> ds = Dataset()
    >>> if 'My Creator' in ds.private_creators(0x0041):
    ...     block = ds.private_block(0x0041, 'My Creator')

Parameters
    group (int) – The private group as a 32-bit int. Must be an odd number.

Returns
    All private creator names for private blocks in the group.

Return type
    list of str

Raises
    ValueError – If group is not a private group.

remove_private_tags()

Remove all private elements from the Dataset.

save_as(filename: Union[str, os.PathLike[AnyStr], BinaryIO], write_like_original: bool = True) → None

Write the Dataset to filename.

Wrapper for pydicom.filewriter.dcmwrite, passing this dataset to it. See documentation for that function for details.

See also
    pydicom.filewriter.dcmwrite – Write a DICOM file from a FileDataset instance.

set_original_encoding(is_implicit_vr: Optional[bool], is_little_endian: Optional[bool], character_encoding: Optional[str]) → None

Set the values for the original transfer syntax and encoding.

New in version 1.2.

Can be used for a Dataset with raw data elements to enable optimized writing (e.g. without decoding the data elements).

setdefault(key: Union[int, str, Tuple[int, int], BaseTag], default: Optional[object] = None) → pydicom.dataelem.DataElement

Emulate dict.setdefault() with support for tags and keywords.

Examples

    >>> ds = Dataset()
    >>> elem = ds.setdefault((0x0010, 0x0010), "Test")
    >>> elem
    (0010, 0010) Patient's Name PN: 'Test'
    >>> elem.value
    'Test'
    >>> elem = ds.setdefault('PatientSex',
    ...     DataElement(0x00100040, 'CS', 'F'))
    >>> elem.value
    'F'

Parameters
    default (pydicom.dataelem.DataElement or object, optional) – The DataElement to use with key, or the value of the DataElement to use with key (default None).

Returns
    The DataElement for key.

Return type
    pydicom.dataelem.DataElement

Raises
    ValueError – If key is not convertible to a valid tag or a known element keyword.
    KeyError – If enforce_valid_values is True and key is an unknown non-private tag.

to_json(bulk_data_threshold: int = 1024, bulk_data_element_handler: Optional[Callable[[pydicom.dataelem.DataElement], str]] = None, dump_handler: Optional[Callable[[Dataset], str]] = None) → str

Return a JSON representation of the Dataset.

New in version 1.3.

See the DICOM Standard, Part 18, Annex F.

Parameters
    bulk_data_threshold (int, optional) – Threshold for the length of a base64-encoded binary data element above which the element should be considered bulk data and the value provided as a URI rather than included inline (default: 1024). Ignored if no bulk data handler is given.
    bulk_data_element_handler (callable, optional) – Callable function that accepts a bulk data element and returns a JSON representation of the data element (dictionary including the "vr" key and either the "InlineBinary" or the "BulkDataURI" key).
    dump_handler (callable, optional) – Callable function that accepts a dict and returns the serialized (dumped) JSON string (by default uses json.dumps()).

Returns
    Dataset serialized into a string based on the DICOM JSON Model.

Return type
    str

Examples

    >>> import json
    >>> def my_json_dumps(data):
    ...     return json.dumps(data, indent=4, sort_keys=True)
    >>> ds.to_json(dump_handler=my_json_dumps)
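
What such a dump handler does to its input can be shown with the standard library alone. In this sketch dicom_json_model is a hand-written dict in the shape of the DICOM JSON Model (tag as eight hexadecimal digits, a "vr" key and a "Value" list); it stands in for the dict that would be handed to the handler and was not produced by pydicom.

```python
import json

# Hand-written stand-in, in DICOM JSON Model shape, for the dict that
# would be passed to a dump handler.
dicom_json_model = {
    "00100020": {"vr": "LO", "Value": ["12345"]},
    "00100010": {"vr": "PN", "Value": [{"Alphabetic": "Some^Name"}]},
}


def my_json_dumps(data: dict) -> str:
    # Indent for readability and sort keys so the tags appear in order.
    return json.dumps(data, indent=4, sort_keys=True)


pretty = my_json_dumps(dicom_json_model)
print(pretty)
```

Sorting the keys is a convenience for diffing and reading; the JSON content is unchanged.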

to_json_dict(bulk_data_threshold: int = 1024, bulk_data_element_handler: Optional[Callable[[pydicom.dataelem.DataElement], str]] = None) → _Dataset

Return a dictionary representation of the Dataset conforming to the DICOM JSON Model as described in the DICOM Standard, Part 18, Annex F.

New in version 1.4.

Parameters
    bulk_data_threshold (int, optional) – Threshold for the length of a base64-encoded binary data element above which the element should be considered bulk data and the value provided as a URI rather than included inline (default: 1024). Ignored if no bulk data handler is given.
    bulk_data_element_handler (callable, optional) – Callable function that accepts a bulk data element and returns a JSON representation of the data element (dictionary including the "vr" key and either the "InlineBinary" or the "BulkDataURI" key).

Returns
    Dataset representation based on the DICOM JSON Model.

Return type
    dict

trait_names() → List[str]

Return a list of valid names for auto-completion code.

Used in IPython, so that data element names can be found and offered for autocompletion on the IPython command line.

update(dictionary: Union[Dict[str, object], Dict[Union[int, str, Tuple[int, int], BaseTag], pydicom.dataelem.DataElement]]) → None

Extend dict.update() to handle DICOM tags and keywords.

values() → ValuesView[Union[pydicom.dataelem.DataElement, pydicom.dataelem.RawDataElement]]

Return the Dataset values to simulate dict.values().

Returns
    The DataElements that make up the values of the Dataset.

Return type
    dict_values

walk(callback: Callable[[Dataset, pydicom.dataelem.DataElement], None], recursive: bool = True) → None

Iterate through the Dataset's elements and run callback on each.

Visit all elements in the Dataset, possibly recursing into sequences and their items. The callback function is called for each DataElement (including elements with a VR of 'SQ'). Can be used to perform an operation on certain types of elements.

For example, remove_private_tags() finds all elements with private tags and deletes them.

The elements will be returned in order of increasing tag number within their current Dataset.

Parameters
    callback – A callable function that takes two arguments:
        a Dataset
        a DataElement belonging to that Dataset
    recursive (bool, optional) – Flag to indicate whether to recurse into sequences (default True).
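
The traversal order described for walk() can be sketched on a toy structure. This is an illustration under stated assumptions, not pydicom's implementation: a toy dataset is a plain dict mapping an integer tag to a (VR, value) pair, the value of an 'SQ' element is a list of nested toy datasets, and the callback receives the tag rather than a DataElement.

```python
from typing import Any, Callable, Dict, List, Tuple

# Toy model (assumption): tag -> (VR, value); 'SQ' values are lists of
# nested toy datasets.
ToyDataset = Dict[int, Tuple[str, Any]]


def walk(
    ds: ToyDataset,
    callback: Callable[[ToyDataset, int], None],
    recursive: bool = True,
) -> None:
    # Visit elements in increasing tag order within the current dataset.
    for tag in sorted(ds):
        # The callback runs for every element, sequences included.
        callback(ds, tag)
        vr, value = ds[tag]
        if recursive and vr == "SQ":
            for item in value:
                walk(item, callback, recursive)


toy: ToyDataset = {
    0x300A00B0: ("SQ", [{0x00080070: ("LO", "Linac, co.")}]),
    0x00100010: ("PN", "CITIZEN^John"),
}

visited: List[int] = []
walk(toy, lambda ds, tag: visited.append(tag))

top_only: List[int] = []
walk(toy, lambda ds, tag: top_only.append(tag), recursive=False)
```

Because the callback also receives the dataset that owns the element, it can modify or delete that element in place, which is the pattern the remove_private_tags() description relies on.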

waveform_array(index: int) → np.ndarray

Return an ndarray for the multiplex group at index in the (5400,0100) Waveform Sequence.

New in version 2.1.

Parameters
    index (int) – The index of the multiplex group to return the array for.

Returns
    The Waveform Data for the multiplex group as an ndarray with shape (samples, channels). If (003A,0210) Channel Sensitivity is present then the values will be in the units specified by the (003A,0211) Channel Sensitivity Units Sequence.

Return type
    numpy.ndarray