Working with Pixel Data
How to work with pixel data in pydicom.
Introduction
Many DICOM SOP classes contain bulk pixel data, which is usually used to represent one or more image frames (although other types of data are possible). In these SOP classes the pixel data is almost always contained in the (7FE0,0010) Pixel Data element. The only exception is Parametric Map Storage, which may instead contain data in the (7FE0,0008) Float Pixel Data or (7FE0,0009) Double Float Pixel Data elements.
Note
In the following, the term pixel data will be used to refer to the bulk data from the Pixel Data, Float Pixel Data and Double Float Pixel Data elements. While the examples use PixelData, FloatPixelData or DoubleFloatPixelData could also be used interchangeably, provided the dataset contains the corresponding element.
By default pydicom reads in pixel data as the raw bytes found in the file:
>>> from pydicom import dcmread, examples
>>> path = examples.get_path("mr") # The path to the examples.mr dataset
>>> ds = dcmread(path)
>>> ds.PixelData
b'\x89\x03\xfb\x03\xcb\x04\xeb\x04\xf9\x02\x94\x01\x7f...
PixelData is often not immediately useful, as the data may be stored in a variety of different ways:
The pixel values may be signed or unsigned integers, or floats
There may be multiple image frames
There may be multiple planes per frame (e.g. RGB) and the order of the pixels may be different
The image data may be encoded using one of the available compression standards (1.2.840.10008.1.2.4.50 JPEG Baseline, 1.2.840.10008.1.2.5 RLE Lossless, etc.). Encoded image data will also be encapsulated, and each encapsulated image frame may be broken up into one or more fragments.
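To make the first three points concrete, here is a minimal NumPy-only sketch of interpreting uncompressed bytes by hand. The byte strings and the values standing in for Rows, Columns and Number of Frames are made up for illustration, and little-endian byte order is assumed. This is not how pydicom itself decodes pixel data.

```python
import numpy as np

# Hypothetical stand-ins for values that would normally come from the dataset
rows, columns, frames = 2, 3, 2
raw = bytes([0x00, 0x00, 0xFF, 0xFF, 0x00, 0x80, 0x01, 0x00]) * 3  # 12 x 16-bit values

# The same bytes give different values depending on the assumed signedness
unsigned = np.frombuffer(raw, dtype="<u2")  # as if Pixel Representation were 0
signed = np.frombuffer(raw, dtype="<i2")    # as if Pixel Representation were 1

# Multiple frames are stored one after another
arr = unsigned.reshape(frames, rows, columns)

# Color-by-plane ordering (RR GG BB) rearranged to color-by-pixel (RGB RGB)
planes = np.frombuffer(bytes([1, 1, 2, 2, 3, 3]), dtype="u1")  # 2 pixels, 3 samples
by_pixel = planes.reshape(3, 2).T  # one (R, G, B) row per pixel
```

In a real dataset the sample ordering is described by (0028,0006) Planar Configuration, and compressed data cannot be read this way at all.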
Because of the complexity in interpreting the pixel data, pydicom provides an easy way to get it in a convenient form: Dataset.pixel_array.
Dataset.pixel_array
Warning
Dataset.pixel_array requires NumPy.
Dataset.pixel_array returns a numpy.ndarray containing the pixel data:
>>> arr = ds.pixel_array
>>> arr
array([[ 905, 1019, 1227, ...,  302,  304,  328],
       [ 628,  770,  907, ...,  298,  331,  355],
       [ 498,  566,  706, ...,  280,  285,  320],
       ...,
       [ 334,  400,  431, ..., 1094, 1068, 1083],
       [ 339,  377,  413, ..., 1318, 1346, 1336],
       [ 378,  374,  422, ..., 1369, 1129,  862]], dtype=int16)
>>> arr.shape
(64, 64)
If the pixel data is compressed then pixel_array will return the uncompressed data, provided the dependencies of the required pixel data decoder have been met. See handling compressed image data for more information.
NumPy can be used to modify the data, but if the changes are to be saved, they must be written back to the dataset’s PixelData element.
Warning
Converting data from an ndarray back to bytes may not be as straightforward as in the following example, particularly for multi-planar images or where compression is required.
# example: zero anything < 300
arr = ds.pixel_array
arr[arr < 300] = 0
ds.PixelData = arr.tobytes()
ds.save_as("temp.dcm")
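One thing to watch in the example above is the array's dtype: tobytes() writes whatever item size the array currently has, so arithmetic that silently promotes the array (for example to float64) changes the number of bytes per pixel. A minimal NumPy-only sketch, using a made-up array in place of ds.pixel_array:

```python
import numpy as np

# Made-up 16-bit image standing in for ds.pixel_array
arr = np.array([[100, 400], [250, 900]], dtype=np.int16)

# Multiplying by a float promotes the result to float64 (8 bytes per pixel)
scaled = arr * 0.5

# Cast back to the original 16-bit type before converting to bytes
restored = scaled.astype(arr.dtype)
pixel_bytes = restored.tobytes()
```

For uncompressed data the item size would normally be expected to match (0028,0100) Bits Allocated divided by 8.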
Some changes may require other DICOM tags to be modified. For example, if the image size is reduced (e.g. a 512x512 image is shrunk to 256x256) then Rows and Columns should be set appropriately. You must explicitly set these yourself; pydicom does not do so automatically.
See Downsize MRI image using pydicom for an example.
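As a rough illustration of the bookkeeping involved (separate from the linked example), the sketch below shrinks a made-up array by simple decimation and collects the values that would need to be written back together. The array is a stand-in for ds.pixel_array, and the dataset assignments are shown only as comments.

```python
import numpy as np

# Made-up 4x4 image standing in for ds.pixel_array
arr = np.arange(16, dtype=np.uint16).reshape(4, 4)

# Keep every second row and column (simple decimation, no smoothing)
small = arr[::2, ::2]
new_rows, new_columns = small.shape
pixel_bytes = small.tobytes()

# With a real dataset `ds`, all three would then be updated together:
#   ds.PixelData = pixel_bytes
#   ds.Rows, ds.Columns = new_rows, new_columns
```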
pixel_array can also be used to pass image data to graphics libraries for viewing. See Viewing Images for details.
Color space
When using pixel_array with Pixel Data that has a (0028,0002) Samples per Pixel value of 3, the returned pixel data will be in the color space given by (0028,0004) Photometric Interpretation (e.g. RGB, YBR_FULL, YBR_FULL_422, etc.).
pydicom offers a limited ability to convert between 8-bits/channel YBR and RGB color spaces through the convert_color_space() function. When changing the color space you should also change the value of Photometric Interpretation to match.
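To show what such a conversion involves, here is a minimal NumPy sketch of an 8-bit YBR_FULL to RGB transform. It uses the widely used full-range BT.601 coefficients (the same ones as JPEG/JFIF); the function name ybr_full_to_rgb is hypothetical, and the rounding may not match the output of convert_color_space() exactly.

```python
import numpy as np

def ybr_full_to_rgb(ybr: np.ndarray) -> np.ndarray:
    """Convert an 8-bit (..., 3) YBR_FULL array to RGB (illustrative only)."""
    ybr = ybr.astype(np.float64)
    y = ybr[..., 0]
    cb = ybr[..., 1] - 128.0  # chroma channels are centered on 128
    cr = ybr[..., 2] - 128.0
    rgb = np.stack(
        [
            y + 1.402 * cr,
            y - 0.344136 * cb - 0.714136 * cr,
            y + 1.772 * cb,
        ],
        axis=-1,
    )
    # Round and clip back into the 8-bit range
    return np.clip(np.rint(rgb), 0, 255).astype(np.uint8)

# Neutral chroma (128, 128) should map to gray, white and black
pixels = np.array([[[128, 128, 128], [255, 128, 128], [0, 128, 128]]], dtype=np.uint8)
rgb = ybr_full_to_rgb(pixels)
```

With pydicom itself the equivalent step would be a call such as convert_color_space(arr, "YBR_FULL", "RGB"), followed by setting Photometric Interpretation to RGB.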
Note
See the DICOM Standard, Part 3, Section C.7.6.3.1 for more information about color spaces.
Palette Color
Some DICOM datasets store their output image pixel values in a lookup table (LUT), where the values in Pixel Data are the index to a corresponding LUT entry. When a dataset’s (0028,0004) Photometric Interpretation value is PALETTE COLOR then the apply_color_lut() function can be used to apply a palette color LUT to the pixel data to produce an RGB image.
from pydicom import examples
from pydicom.pixels import apply_color_lut
# Fetch an example PALETTE COLOR dataset
ds = examples.palette_color
arr = ds.pixel_array
rgb = apply_color_lut(arr, ds)
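Conceptually, the lookup is an array-indexing operation. The sketch below uses a made-up four-entry palette and made-up index values to show the idea. Real palette color datasets store separate red, green and blue tables along with descriptors (including an offset for the first mapped value), all of which apply_color_lut() takes care of.

```python
import numpy as np

# Hypothetical 4-entry palette: one (R, G, B) triple per index
lut = np.array(
    [[0, 0, 0], [255, 0, 0], [0, 255, 0], [0, 0, 255]], dtype=np.uint8
)

# Pixel values are indices into the palette
indices = np.array([[0, 1], [3, 2]], dtype=np.uint8)

# Integer-array indexing looks up every pixel at once: (rows, columns, 3)
rgb = lut[indices]
```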
It’s also possible to apply one of the DICOM well-known color palettes provided the bit-depth of the pixel data is 8-bit.
from pydicom import examples
from pydicom.pixels import apply_color_lut
ds = examples.palette_color
arr = ds.pixel_array
# You could also use the corresponding well-known SOP Instance UID
rgb = apply_color_lut(arr, palette='PET')
Modality LUT or Rescale Operation
The DICOM Modality LUT module converts raw pixel data values to a specific (possibly unitless) physical quantity, such as Hounsfield units for CT. The apply_modality_lut() function can be used with an input array of raw values and a dataset containing a Modality LUT module to return the converted values. When a dataset requires multiple grayscale transformations, the Modality LUT transformation is always applied first.
from pydicom import examples
from pydicom.pixels import apply_modality_lut
ds = examples.ct
arr = ds.pixel_array
hu = apply_modality_lut(arr, ds)
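When the module specifies Rescale Slope and Rescale Intercept rather than a lookup table, the operation is a linear transform. A minimal sketch with made-up stored values and made-up (but typical-looking for CT) slope and intercept:

```python
import numpy as np

# Made-up stored values and rescale parameters
stored = np.array([[0, 1024], [2048, 3000]], dtype=np.uint16)
slope, intercept = 1.0, -1024.0  # stand-ins for Rescale Slope / Rescale Intercept

# Linear rescale: output = stored * slope + intercept
# Converting to float first avoids unsigned integer wrap-around
hu = stored.astype(np.float64) * slope + intercept
```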
VOI LUT or Windowing Operation
The DICOM VOI LUT module applies a VOI or windowing operation to input values. The apply_voi_lut() function can be used with an input array and a dataset containing a VOI LUT module to return values with applied VOI LUT or windowing. When a dataset contains multiple VOI or windowing views then a particular view can be returned by using the index keyword parameter.
When a dataset requires multiple grayscale transformations, it’s assumed that the modality LUT or rescale operation has already been applied.
from pydicom import examples
from pydicom.pixels import apply_voi_lut
ds = examples.overlay
arr = ds.pixel_array
out = apply_voi_lut(arr, ds, index=0)
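For the windowing case, the following is a minimal sketch of the linear window function described in the DICOM Standard, applied to made-up values that are assumed to have been through the modality rescale step already. The function name linear_window and the 0 to 255 output range are choices made for this illustration; apply_voi_lut() derives its own output range and supports other VOI functions as well.

```python
import numpy as np

def linear_window(arr, center, width, y_min=0.0, y_max=255.0):
    """Linear windowing sketch; assumes width > 1."""
    arr = np.asarray(arr, dtype=np.float64)
    low = center - 0.5 - (width - 1) / 2
    high = center - 0.5 + (width - 1) / 2
    # Values inside the window are mapped linearly onto [y_min, y_max]
    out = ((arr - (center - 0.5)) / (width - 1) + 0.5) * (y_max - y_min) + y_min
    # Values outside the window are clamped to the output limits
    out[arr <= low] = y_min
    out[arr > high] = y_max
    return out

# Made-up rescaled values (e.g. Hounsfield units) and window settings
hu = np.array([-1000.0, 40.0, 1000.0])
windowed = linear_window(hu, center=40, width=400)
```

Values below the window map to the minimum output, values above it to the maximum, and a value at the window center lands close to the middle of the output range.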