No conversation on AI in the medical domain is complete without delving into medical imagery. Medical images are often problematic, though, because of all the different formats that exist. In this post, we’ll tell you everything you need to know about all the different medical image formats that are out there.
If you already work with medical image formats and are looking to label them to train your ML model, check out our post on medical data labeling, or our introduction to medical image annotation. If you’re ready to label your data, check out our data labeling platform Ango Hub: it’s free and it supports all mainstream medical file formats. And if you’re looking to outsource your medical labeling, it’s easy with Ango Service.
Properties of Medical Images
To understand the different medical image formats it is important to first understand the properties of medical images. Many of us are familiar with normal/planar images. The following discussion will take elements from our understanding of traditional images to understand medical images.
Planar images (the common jpg, png, etc.) have 2 dimensions, or 3 if we count the color channels. Medical images often have additional dimensions in order to capture more information. We refer to images with added dimensions as volumes.
Thus we have the following divisions:
- 2 Dimensional Image: projection of an anatomical volume onto an image plane, such as an X-Ray.
- 3 Dimensional Volume: a series of images representing thin slices through a volume, such as a single CT scan. The third dimension is spatial, representing the different locations of the slices.
- 4 Dimensional Volume: a set of 3D volumes or multiple acquisitions of the same 3D volume over time to produce a dynamic series of acquisitions, such as a study containing multiple MRI scans. The fourth dimension is often temporal, representing scans at different times.
Pixel depth is the number of bits used to encode each pixel. For example, an RGB image needs 1 byte (8 bits) for each color channel, and since there are 3 color channels, the pixel depth is 3 bytes (24 bits). Each byte represents 256 possible values per color, from 0 to 255. The more values each pixel can take, the finer the intensity differences we can capture.
However, for medical images, this pixel depth varies. Generally, since more detail needs to be represented, based on the type of study and format used, the pixel depth is higher than that of normal RGB images. Many medical images can store up to 2 bytes per pixel (pixel depth of 2 bytes) representing a range of values from 0 – 65535.
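The relationship between pixel depth and value range is simple arithmetic. The sketch below is only an illustration; the function name `value_range` is our own and does not come from any imaging library.

```python
def value_range(bits_per_sample: int, signed: bool = False) -> tuple[int, int]:
    """Return the (min, max) integer values representable at a given bit depth."""
    if bits_per_sample <= 0:
        raise ValueError("bits_per_sample must be positive")
    if signed:
        half = 2 ** (bits_per_sample - 1)
        return -half, half - 1
    return 0, 2 ** bits_per_sample - 1


# 8 bits per channel, as in an ordinary RGB image.
print(value_range(8))   # (0, 255)
# 16 bits per pixel, as in many medical images.
print(value_range(16))  # (0, 65535)
```

Signed 16-bit storage, which some formats in the table below allow, would instead cover -32768 to 32767.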
Photometric interpretation determines how pixel values are to be displayed: as monochrome or as color. For instance, a grayscale display interprets pixel values differently than an RGB display. As concrete examples, X-rays and CT scans are grayscale, whereas PET, SPECT, and some ultrasound images (for example Doppler) are commonly displayed in color.
Medical Images may have information about how the image was produced. For example, a magnetic resonance image will have parameters related to the pulse sequence used, e.g., timing information, flip angle, the number of acquisitions, etc.
A nuclear medicine image like a PET image will have information about the radiopharmaceutical injected and the weight of the patient.
Metadata is a powerful tool to annotate and exploit image-related information for clinical and research purposes and to organize and retrieve images of certain specifications.
Pixel data is where the actual values representing the pixels of the image are stored, commonly as an array of values. Combined with the properties described in the metadata (dimensions, pixel depth, photometric interpretation), the pixel data can be rendered.
For computer vision applications, this is usually the part of greatest importance, and often the one fed to Deep Learning architectures.
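As a rough illustration of how metadata and pixel data work together, the sketch below turns a flat buffer of raw bytes into a 2D grid of pixel values. It assumes uncompressed, unsigned pixels, and it assumes the rows, columns, and pixel depth were already read from the header. The function name `decode_pixels` is hypothetical, and in practice an array library would normally do this work.

```python
import struct


def decode_pixels(raw: bytes, rows: int, cols: int,
                  bytes_per_pixel: int = 2,
                  little_endian: bool = True) -> list[list[int]]:
    """Turn a flat buffer of unsigned pixels into a rows x cols nested list."""
    codes = {1: "B", 2: "H", 4: "I"}  # struct codes for unsigned integers
    if bytes_per_pixel not in codes:
        raise ValueError("unsupported pixel depth")
    expected = rows * cols * bytes_per_pixel
    if len(raw) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(raw)}")
    fmt = ("<" if little_endian else ">") + str(rows * cols) + codes[bytes_per_pixel]
    flat = struct.unpack(fmt, raw)
    # Split the flat sequence into rows using the width from the metadata.
    return [list(flat[r * cols:(r + 1) * cols]) for r in range(rows)]


# A made-up 2 x 3 image with 16-bit pixels.
raw = struct.pack("<6H", 0, 1000, 65535, 12, 34, 56)
image = decode_pixels(raw, rows=2, cols=3)
print(image)  # [[0, 1000, 65535], [12, 34, 56]]
```

Without the metadata, the same 12 bytes could just as well be read as a 3 x 2 image or as twelve 8-bit pixels, which is why the two parts always travel together.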
The table below (from this paper) indicates the pixel data storage methods for various medical image data formats:
|Format|Header|Extension|Data types|
|---|---|---|---|
|Analyze|Fixed-length: 348-byte binary format|.img and .hdr|Unsigned integer (8-bit), signed integer (16-, 32-bit), float (32-, 64-bit), complex (64-bit)|
|Nifti|Fixed-length: 352-byte binary format (348 bytes in the case of data stored as .img and .hdr)|.nii|Signed and unsigned integer (from 8- to 64-bit), float (from 32- to 128-bit), complex (from 64- to 256-bit)|
|Minc|Extensible binary format|.mnc|Signed and unsigned integer (from 8- to 32-bit), float (32-, 64-bit), complex (32-, 64-bit)|
|Dicom|Variable length binary format|.dcm|Signed and unsigned integer (8-, 16-bit; 32-bit only allowed for radiotherapy dose), float not supported|
Medical Image Formats
DICOM File Format
DICOM is arguably the most popular format for storing medical images. It defines a standard for handling, storing, printing, and transmitting information in medical imaging. This is the format of files you can expect right off a scanner or hospital PACS.
A DICOM file consists of a header and the image data in the same file (*.dcm). The header contains information such as the Patient Id, Patient Name, Modality, and other information. It also defines how many frames are contained and in which resolutions. Any annotation that is done on the DICOM volume or image may also be stored in the header.
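A quick way to recognize such a file is to look at how it starts: a DICOM file stored on disk normally begins with a 128-byte preamble followed by the four characters `DICM`, after which the header elements follow. The sketch below checks only that marker. The function name `looks_like_dicom` is our own, and some older files omit the preamble, so treat this as a heuristic rather than a validator.

```python
def looks_like_dicom(data: bytes) -> bool:
    """Check for the 128-byte preamble followed by the 'DICM' marker."""
    return len(data) >= 132 and data[128:132] == b"DICM"


# A synthetic buffer standing in for the first bytes of a file.
sample = bytes(128) + b"DICM" + b"\x00" * 16
print(looks_like_dicom(sample))          # True
print(looks_like_dicom(b"not a dicom"))  # False
```

To actually read fields such as the patient ID or modality, a dedicated DICOM library is the practical choice, since the header uses a tagged, variable-length encoding.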
Apart from single-image studies such as X-rays, a single acquisition (scan) typically produces multiple DICOM files, one per slice. Ideally, each of these files carries a spatial and temporal location, so that we can arrange them in order.
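Arranging the files of one acquisition can be sketched as a plain sort. The example below assumes the position along the scan axis and a time index have already been extracted from each file's header (in DICOM, the spatial part is usually derived from the Image Position (Patient) attribute). The `SliceInfo` class, its fields, and the file names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SliceInfo:
    filename: str       # hypothetical file name
    z_position: float   # position along the scan axis, in mm
    time_index: int = 0  # which time point the slice belongs to


def order_slices(slices: list[SliceInfo]) -> list[SliceInfo]:
    """Order slices by time point first, then by spatial position."""
    return sorted(slices, key=lambda s: (s.time_index, s.z_position))


# Files often arrive in arbitrary order.
unordered = [
    SliceInfo("c.dcm", z_position=5.0),
    SliceInfo("a.dcm", z_position=-5.0),
    SliceInfo("b.dcm", z_position=0.0),
]
print([s.filename for s in order_slices(unordered)])  # ['a.dcm', 'b.dcm', 'c.dcm']
```

Sorting by file name alone is not reliable, since scanners and PACS systems are free to name files however they like.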
NIFTI File Format
NIFTI is also a popular format for storing medical images. A NIFTI file (.nii) contains two affine coordinate definitions, which relate each voxel index (i,j,k) to a spatial location (x,y,z).
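An affine of this kind is usually written as a 4x4 matrix that is multiplied with the voxel index in homogeneous form (i, j, k, 1). The sketch below applies such a matrix using plain Python lists. The matrix values are made up for illustration (2 mm voxels with an arbitrary origin), and `voxel_to_world` is our own name.

```python
def voxel_to_world(affine, i, j, k):
    """Apply a 4x4 affine (nested lists, row-major) to the voxel index (i, j, k)."""
    voxel = (i, j, k, 1.0)
    # Only the first three rows are needed; the last row is (0, 0, 0, 1).
    x, y, z = (sum(row[c] * voxel[c] for c in range(4)) for row in affine[:3])
    return x, y, z


# Illustrative affine: 2 mm isotropic voxels, origin shifted into the volume.
affine = [
    [2.0, 0.0, 0.0, -90.0],
    [0.0, 2.0, 0.0, -126.0],
    [0.0, 0.0, 2.0, -72.0],
    [0.0, 0.0, 0.0, 1.0],
]
print(voxel_to_world(affine, 0, 0, 0))     # (-90.0, -126.0, -72.0)
print(voxel_to_world(affine, 45, 63, 36))  # (0.0, 0.0, 0.0)
```

The diagonal entries carry the voxel size and the last column carries the translation, so the same voxel grid can be placed anywhere in scanner or template space.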
The main difference between DICOM and NIFTI is the way 3D data is stored. In DICOM, a volume is typically stored as a series of separate 2D slices that the reader must assemble into a volume. NIFTI files, on the other hand, natively store 3D image volumes (such as CT scans and MRIs) in a single file.
More recently, an updated version of the standard, NIFTI-2, was created to manage larger datasets. This new version encodes each dimension of the image matrix with a 64-bit integer instead of the 16-bit integer used in NIFTI-1, allowing for much larger image matrices.
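The two versions can be told apart from the first four bytes of the header, which store the header size: 348 for NIFTI-1 and 540 for NIFTI-2. The sketch below reads that value and the dimension array using only the standard library. It is a simplified illustration: `read_nifti_dims` is our own name, the example headers are synthetic, and the magic string is not checked, which a real reader would need in order to rule out Analyze files that share the 348-byte header size.

```python
import struct


def read_nifti_dims(header: bytes):
    """Return (version, dims) read from the start of a NIFTI header."""
    for endian in ("<", ">"):  # try little-endian first, then big-endian
        (sizeof_hdr,) = struct.unpack_from(endian + "i", header, 0)
        if sizeof_hdr == 348:
            # NIFTI-1: eight 16-bit integers at byte offset 40.
            dim = struct.unpack_from(endian + "8h", header, 40)
            return 1, dim[1:1 + dim[0]]
        if sizeof_hdr == 540:
            # NIFTI-2: eight 64-bit integers at byte offset 16.
            dim = struct.unpack_from(endian + "8q", header, 16)
            return 2, dim[1:1 + dim[0]]
    raise ValueError("not a NIFTI-1 or NIFTI-2 header")


# Synthetic NIFTI-1 header describing a 256 x 256 x 120 volume.
hdr = bytearray(348)
struct.pack_into("<i", hdr, 0, 348)
struct.pack_into("<8h", hdr, 40, 3, 256, 256, 120, 1, 1, 1, 1)
print(read_nifti_dims(bytes(hdr)))  # (1, (256, 256, 120))
```

The first entry of the dimension array is the number of dimensions in use. With 16-bit signed integers, no single dimension can exceed 32767, which is the limit NIFTI-2 lifts.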
NRRD File Format
The flexible NRRD format (.nrrd) includes a single header file and image file(s) that can be separated or combined. A NRRD header accurately represents N-dimensional raster information for scientific visualization and medical image processing.
Unlike DICOM, its focus is not on medical data storage but on the representation of N-dimensional data (N commonly being 2, 3, or 4). According to the NRRD documentation, files are stored in the format below:
The general format of a NRRD file (with attached header) is:
NRRD000X
<field>: <desc>
<field>: <desc>
# <comment>
...
<field>: <desc>
<key>:=<value>
<key>:=<value>
<key>:=<value>
# <comment>

<data><data><data><data><data><data>...
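Because the header is line-oriented text, it can be read with a few string operations. The sketch below parses a header that has already been decoded to a string: the magic line, `field: desc` pairs, `key:=value` pairs, and comments, stopping at the blank line that separates the header from the data. The function name `parse_nrrd_header` and the example values are our own; a real file would be opened in binary mode, since the data that follows the header is usually not text.

```python
def parse_nrrd_header(text: str):
    """Parse a NRRD text header into (magic, fields, key_values)."""
    lines = text.splitlines()
    if not lines or not lines[0].startswith("NRRD"):
        raise ValueError("missing NRRD magic line")
    magic, fields, key_values = lines[0], {}, {}
    for line in lines[1:]:
        if line == "":            # a blank line ends the header
            break
        if line.startswith("#"):  # comment line
            continue
        field_at, key_at = line.find(": "), line.find(":=")
        if key_at != -1 and (field_at == -1 or key_at < field_at):
            key, value = line.split(":=", 1)
            key_values[key] = value
        elif field_at != -1:
            field, desc = line.split(": ", 1)
            fields[field] = desc
    return magic, fields, key_values


# Illustrative header; the values are made up.
header = """NRRD0004
# example header
type: short
dimension: 3
sizes: 256 256 120
encoding: raw
patient:=anonymous

"""
magic, fields, key_values = parse_nrrd_header(header)
print(magic)            # NRRD0004
print(fields["sizes"])  # 256 256 120
print(key_values)       # {'patient': 'anonymous'}
```

The fields describe how to interpret the raster data, while the key/value pairs are free-form and left to the application.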
The Analyze Medical File Format
Analyze is one of the earliest formal medical image storage formats, developed in the 1980s. It was designed for multidimensional data (3D and 4D volumes), which made it a great advance at a time when medical image storage typically supported only 2D images.
An Analyze 7.5 volume consists of two binary files: an image file with extension “.img” that contains the voxel raw data and a header file with extension “.hdr” that contains the metadata, such as the number of pixels in the x, y, and z directions, voxel size, and data type.
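Since the image file holds nothing but raw voxel values, its size follows directly from the header. The small sketch below computes the expected size of the “.img” file from the dimensions and the bits per voxel, which can serve as a sanity check that a header and image file belong together. The function name `expected_img_bytes` and the example dimensions are our own.

```python
def expected_img_bytes(dims, bits_per_voxel: int) -> int:
    """Size in bytes of a raw image file with the given dimensions."""
    n_voxels = 1
    for d in dims:
        n_voxels *= d
    return n_voxels * bits_per_voxel // 8


# A 256 x 256 x 120 volume stored as 16-bit integers.
print(expected_img_bytes((256, 256, 120), 16))  # 15728640
```

If the file on disk is a different size, the header probably describes another volume or a different data type.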
MINC File Format
The MINC format (.mnc) for neuroinformatics data was designed and implemented beginning in 1992. The goal of the project was the development of a data format and programming tools for neuroimaging.
MINC files are commonly used to represent any 2D or 3D image data, such as MRI, PET, CT, or histology. One advantage for PET data is MINC’s ability to represent an irregularly-spaced time axis. MINC can also be used to represent derived data such as diffusion tensors and deformation fields.
In its present state (MINC 2.0), it uses a hierarchical data storage format built on HDF5, which makes it flexible and extensible.
Medical Image Formats In Short
While each file format has its own use cases, DICOM is by far the most commonly used medical image format, and the other formats are often converted to DICOM for further processing. For a natively 3D system, however, NRRD and NIFTI are often preferred, since they can natively store multidimensional data.
With an understanding of medical image formats and their internal storage methods, we can make proper use of their metadata and pixel data in Machine Learning and Deep Learning applications. Numerous frameworks in the open-source community ingest these files, along with quality annotations, to make inferences on medical data.
Quality annotations on medical data, however, are hard to come by. Annotating these complex formats is an even more onerous task than deciphering and interpreting them. This is where we come in at Ango AI. With our team of medical experts and a platform geared to handling the complexities of reading and annotating medical data, we ensure that our clients get the highest quality medical datasets to push the limits of AI in the medical domain.
Author: Balaj Saleem
Technical Proofreader: Onur Aydın
Editor: Lorenzo Gravina