Efficient GPU-based Image Registration: for Detailed Large-Scale Whole-body Analysis


ACTA UNIVERSITATIS UPSALIENSIS

UPPSALA

Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine 1687

Efficient GPU-based Image Registration

for Detailed Large-Scale Whole-body Analysis

SIMON EKSTRÖM

ISSN 1651-6206 ISBN 978-91-513-1026-8


Dissertation presented at Uppsala University to be publicly examined in Föreläsningssalen, Röntgen, Akademiska Sjukhuset, Uppsala, Friday, 27 November 2020 at 09:15 for the degree of Doctor of Philosophy (Faculty of Medicine). The examination will be conducted in English. Faculty examiner: Professor Fred Hamprecht (Heidelberg Collaboratory for Image Processing (HCI) Interdisciplinary Center for Scientific Computing (IWR) and Department of Physics and Astronomy, Heidelberg University).

Abstract

Ekström, S. 2020. Efficient GPU-based Image Registration: for Detailed Large-Scale Whole-body Analysis. Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine 1687. 63 pp. Uppsala: Acta Universitatis Upsaliensis. ISBN 978-91-513-1026-8.

Imaging has become an important aspect of medicine, enabling visualization of internals in a non-invasive manner. The rapid advancement and adoption of imaging techniques have led to a demand for tools able to take advantage of the information that is produced. Medical image analysis aims to extract relevant information from acquired images to aid diagnostics in healthcare and increase the understanding within medical research. The main subject of this thesis, image registration, is a widely used tool in image analysis that can be employed to find a spatial transformation aligning a set of images. One application, described in detail in this thesis, is the use of image registration for large-scale analysis of whole-body images through the utilization of the correspondences defined by the resulting transformations. To produce detailed results, the correspondences, i.e. transformations, need to be of high resolution, and the quality of the result has a direct impact on the quality of the analysis. Also, this type of application aims to analyze large cohorts, so the value of a registration method is determined not only by its ability to produce an accurate result but also by its efficiency. This thesis presents two contributions on the subject: a new method for efficient image registration with the ability to produce dense deformable transformations, and the application of the presented method in large-scale analysis of a whole-body dataset acquired using an integrated positron emission tomography (PET) and magnetic resonance imaging (MRI) system. In this thesis, it is shown that efficient and detailed image registration can be performed by employing graph cuts and a heuristic where the optimization is performed on subregions of the image. The performance can be improved further by the efficient utilization of a graphics processing unit (GPU). It is also shown that the method can be employed to produce a model of health based on a PET-MRI dataset, which can be utilized to automatically detect pathology in the imaging.

Keywords: Magnetic resonance imaging, Image registration, whole-body

Simon Ekström, Department of Surgical Sciences, Radiology, Akademiska sjukhuset, Uppsala University, SE-75185 Uppsala, Sweden.

urn:nbn:se:uu:diva-421472 (http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-421472)


Till mor och far


List of papers

This thesis is based on the following papers, which are referred to in the text by their Roman numerals:

I Simon Ekström, Filip Malmberg, Håkan Ahlström, Joel Kullberg, Robin Strand. Fast graph-cut based optimization for practical dense deformable registration of volume images. Computerized Medical Imaging and Graphics (2020): 101745.

II Therese Sjöholm, Simon Ekström, Robin Strand, Håkan Ahlström, Lars Lind, Filip Malmberg, Joel Kullberg. A whole-body FDG PET/MR atlas for multiparametric voxel-based analysis. Scientific Reports 9.1 (2019): 1-10.

III Simon Ekström, Martino Pilia, Joel Kullberg, Håkan Ahlström, Robin Strand, Filip Malmberg. Faster dense deformable image registration by utilizing both CPU and GPU. Submitted.

IV Simon Ekström, Joel Kullberg, Håkan Ahlström, Robin Strand, Filip Malmberg. Deformable Image Registration of Volumetric Whole-body MRI: An Evaluation. Manuscript.

Reprints were made with permission from the publishers.

The author has made substantial contributions to all papers. In Papers I, III, and IV the author was the main contributor and responsible for method development, implementation, experiments, and analysis of the results. In Paper II, Sjöholm and Ekström contributed equally to the paper, with Ekström contributing to the writing and being responsible for the image registration aspect of the method development and data analysis.


Related work

The author has also contributed to the following publications:

1. Ahlström H, Ekström S, Sjöholm T, Strand R, Kullberg J, Johansson E, Hagmar P, Malmberg F. Registration-based automated lesion detection and therapy evaluation of tumors in whole body PET-MR images. Annals of Oncology, 28(suppl_5), 2017.

2. Ekström S, Sjöholm T, Malmberg F, Lind L, Ahlström H, Kullberg J, Strand R. Whole-body morphological and functional atlas using an integrated PET-MRI system and fat-water registration. In Proceedings 25. Annual Meeting International Society for Magnetic Resonance in Medicine, volume 25, page 3884, Honolulu, Hawaii, 2017.

3. Sjöholm T, Ekström S, Malmberg F, Strand R, Johansson M, Lind L, Engström M, Ahlström H, Kullberg J. Intensity inhomogeneity correction of whole body fat-water images using fat and water fraction information on a 3T PET/MR scanner. In Proceedings 25. Annual Meeting International Society for Magnetic Resonance in Medicine, volume 25, page 3888, Honolulu, Hawaii, 2017.

4. Carlbom L, Ekström S, Strand R, Boersma G, Eriksson J, Ahlström H, Kullberg J. Voxel-wise analysis of tissue specific insulin sensitivity and body composition by imiomics, a whole-body pet-mr study. Manuscript.

5. Pilia M, Kullberg J, Ahlström H, Malmberg F, Ekström S, Strand R. Average volume reference space for large scale registration of whole-body magnetic resonance images. PLoS One, 14(10):e0222700, 2019.

6. Guglielmo P, Ekström S, Strand R, Visvanathar R, Malmberg F, Johansson E, Pereira MJ, Skrtic S, Carlsson BCL, Eriksson JW, Ahlström H, Kullberg J. Validation of automated whole-body analysis of metabolic and morphological parameters from an integrated FDG-PET/MRI acquisition. Scientific Reports, 10(1):1-8, 2020.

7. Strand R, Ekström S, Breznik E, Sjöholm T, Pilia M, Lind L, Malmberg F, Ahlström H, Kullberg J. Recent Advances in Large Scale Whole Body MRI Image Analysis - Imiomics. 2020 International Conference on Sustainable Information Engineering and Technology (SIET).


Abbreviations

ALU     Arithmetic logic units
ANN     Artificial neural network
ANTs    Advanced normalization tools
API     Application programming interface
CT      Computed tomography
CUDA    Compute unified device architecture
DIR     Deformable image registration
DNN     Deep neural network
DRAM    Dynamic random access memory
DSC     Dice similarity coefficient
FCN     Fully convolutional network
FDG     Fluorodeoxyglucose
FFD     Free-form deformations
GPU     Graphics processing unit
GPGPU   General-purpose computing on graphics processing units
ICM     Iterated conditional modes
MRF     Markov random field
MRI     Magnetic resonance imaging
NCC     Normalized cross-correlation
NIREP   Non-rigid image registration evaluation project
OpenCL  Open computing language
PCC     Pearson's correlation coefficient
PET     Positron emission tomography
ReLU    Rectified linear unit
SIMT    Single instruction multiple thread
SM      Streaming multiprocessor
SSD     Sum of squared differences
T2D     Type 2 diabetes
VME     Vector magnitude error


1. Introduction

Medical imaging has become an important aspect of diagnostics and intervention in modern healthcare. A prevalent application of imaging in medicine is the visualization of internals in a non-invasive manner. This has led to the development of a large number of different image types, or modalities, all with different applications. The advancement and adoption of these techniques, combined with the rapid growth of computing, sparked the computational analysis of medical imaging. This type of analysis, termed medical image analysis, is applied to the acquired images in an attempt to extract relevant information. Two common topics within medical image analysis are image segmentation and image registration. The latter is the main subject of this thesis, which presents both the development of a new image registration method and the application of image registration for large-scale image analysis in medicine. To apply image registration in practice, a number of factors need to be taken into consideration, the imaging modality being an important one. To limit the scope to the applications discussed in this thesis, only two modalities are covered: magnetic resonance imaging (MRI) and positron emission tomography (PET).

1.1 Magnetic resonance imaging

Magnetic resonance imaging (MRI) has become a widely adopted technique in medical imaging due to its ability to provide great soft tissue contrast without the involvement of ionizing radiation. It has also proven to be a versatile tool, providing imaging of not only tissue contrast but also of biological processes such as diffusion and perfusion. MRI utilizes strong magnetic fields and radio waves in order to generate images. This is possible due to the properties of certain atomic nuclei, which have the capability to absorb and emit radio frequency energy when placed in a strong magnetic field. These nuclei include hydrogen, which is a prevalent nucleus in the human body. By applying mathematical models together with imposed gradients in the magnetic fields, MRI is able to spatially map these nuclei and their properties.

A possible application of MRI is the imaging of fat and water content in soft tissue [2], and this is the technique that has been used to produce the fat and water whole-body images that can be seen throughout this thesis. An example of fat and water tissue imaging acquired through MRI can be seen in Fig. 1.1.


Figure 1.1. Whole-body images acquired using PET-MRI (panels, left to right: Fat, Water, FDG PET). The first two images are fat and water images collected from MRI presenting the body composition. The third image is an FDG-PET image containing information on glucose metabolism.

An issue with imaging of this type is the lack of a quantifiable interpretation of the produced image intensities. Imaging the same tissue with different scanners or positioning may produce different intensities, as opposed to computed tomography (CT) which produces absolute radiodensity. This is an important aspect to take into consideration when applying image registration on images acquired using MRI.

1.2 Positron emission tomography

Positron emission tomography (PET) is a functional imaging technique that aims to map metabolic processes. To produce an image, a radioactive tracer substance is injected into the body. These positron-emitting tracers are synthesized as analogs of molecules relevant to the biological processes to be depicted. One such tracer is Fludeoxyglucose ([¹⁸F]FDG), which is an analog of glucose, enabling assessment of the glucose metabolism throughout the body. A positron emitted by the tracer annihilates with a nearby electron, which produces a pair of photons traveling in opposite directions. These two photons are captured by an array of detectors surrounding the subject, and a 3D mapping of this emission is constructed.


A problem with PET is that most of the emitted photons are actually absorbed by the body itself, a phenomenon referred to as attenuation. To solve this problem, PET is typically combined with either MRI or CT in order to get an anatomical reference for attenuation correction. This also provides the extra benefit of an anatomical reference for the viewer, since PET only produces functional imaging.

1.3 Medical image analysis

As mentioned, the development and adoption of imaging techniques are growing rapidly, and so is the amount of data collected. This increases the demand for tools able to analyze this data. The field of medical image analysis aims to extract relevant information from this growing amount of data, information that can be valuable both for clinical applications and medical research. Two very common tools within medical image analysis are image segmentation and image registration, the latter being the main subject of this thesis.

Image segmentation aims to aid the delineation of regions of interest, either automatically or semi-automatically, within medical images, and these delineations can later be used in various applications. This includes measurements of specific tissue such as measuring the volume of visceral and subcutaneous adipose tissue [39], tumor segmentation [30], or as a tool for improving image registrations [64]. There are plenty of approaches for segmentation but they can generally be divided into two subgroups: traditional and learning-based. Examples of traditional approaches include atlas-based segmentation [33], graph cut based segmentation [5], or the use of fuzzy c-means clustering algorithms [11]. The learning-based category has grown rapidly thanks to the growth of deep learning, not only for segmentation tasks but for medical image analysis as a whole [44]. These applications utilize a neural network which is trained to perform a specific segmentation task. These networks and their uses are further described in Chapter 6.

Image registration is the problem of aligning two or more images. One image is selected as a reference image, which acts as the target image, and the other images, the source images, are transformed to overlap this target.

The target and source images can also be referred to as the fixed and the moving images. This has a wide range of applications, and one application that will be covered in this thesis is the large-scale analysis of whole-body MRI. Whole-body images from large cohorts can be analyzed by utilizing image registration to transform all images of a cohort into a common reference space. This concept is termed Imiomics (imaging-omics) and will be further described in Chapter 2. In short, it enables point-wise statistical analysis of very large groups of images thanks to the shared coordinate system produced by the image registration.


It is not uncommon to use registration in combination with segmentation. This process is generally referred to as atlas-based segmentation and has been shown to perform well in a variety of biomedical applications [33]. This method is built around the registration of a set of atlases to a target image, where the atlases are later applied to compute a likely segmentation. Registration, like segmentation, has also been impacted by the advancement of deep learning, and several new methods based on learning have been proposed [29]. The deep neural networks applied for registration are usually similar to the ones used for segmentation, with the exception of the output. Two or more images, i.e. target and source, are fed into the network, and instead of constructing the network to output a segmentation, a transformation is produced.

1.4 Objectives

Many analysis applications utilizing image registration are heavily dependent on the registration performance both in terms of computation time and quality, including the Imiomics concept. The necessity for a fast and robust registration method is even more evident when analyzing larger datasets with greater spatial resolution.

The original paper describing Imiomics only applied the concept to MRI, limiting itself to body composition and morphology. Integrated PET-MRI allows for simultaneous acquisition of both the functional imaging provided by PET and the imaging provided by MRI, which has the potential to improve the analysis of systemic conditions. The inherent co-registration between PET and MRI, in this case, should make the incorporation of PET into the Imiomics analysis possible without any greater difficulty.

The aims of this work can thus be summarized by two main objectives:

i Improve upon the image registration of volumetric whole-body imaging through the development of new tools to enable efficient and robust analysis of large datasets.

ii Apply the tools together with the Imiomics concept to explore the viability and possible applications on multi-parametric PET-MRI datasets.

1.5 Thesis outline

This thesis is a comprehensive summary based on four papers. Chapter 2 describes the Imiomics concept and how it can be applied on large-scale imaging datasets. Chapter 3 provides an introduction to image registration. Chapter 4 is a brief introduction to graph theory and focuses on graph cut as a tool for energy minimization. Chapter 5 presents the graphics processing unit (GPU) architecture and how it can be utilized for high-performance computing. Chapter 6 presents a brief introduction to deep neural networks and their applications. The papers that this thesis is based on are summarized in Chapter 7. Chapter 8 is a discussion of the contributions of this thesis, the challenges, and the outlook. In the last chapter, Chapter 9, a Swedish summary of the thesis is presented.


2. Imiomics

This chapter gives a more in-depth description of Imiomics, both from a technical perspective and its value in medical research and the clinic.

2.1 Background

As mentioned, the development and adoption of imaging techniques in medicine are growing at a rapid pace, allowing for the acquisition of spatially detailed and incredibly large datasets. This has enabled databases, such as the UK Biobank [66], to collect enormous amounts of data with detailed imaging of 100,000 subjects. There is a huge potential in a resource like this, and the detailed imaging of a cohort of this scale aids in answering a range of research questions. Despite this, the data is generally reduced into small sets of a priori defined measurements of regions of interest. This information loss is even more evident when studying systemic conditions, which have an effect on the body as a whole and are not limited to these predefined regions. In [64], a new concept for a holistic analysis of whole-body imaging data was proposed, termed Imiomics (imaging-omics), which allows the analysis of large groups of images without this major reduction of the available information. The core idea is to employ image registration in order to deform all images into a common coordinate system, allowing for detailed analysis by applying traditional statistical tools on a point-wise basis. This allows for a thorough analysis of the complete imaging data.

Imiomics involves a number of different technical challenges, but due to the scope of this thesis the main focus will be on the task of image registration. Registering a large number of whole-body images is a daunting task, as not only the precision is of importance but also the efficiency in terms of computation time. For further reading on the other technical challenges see [63].

2.2 Method

This section breaks down the process of performing Imiomics analyses into three steps. Assuming all data has already been acquired, the first step is to select a reference space for the analysis. In the second step, image registration is employed to transform the imaging of all subjects into said reference space. How these two steps are performed may to some degree depend on the application and the data, but the last step, the analysis, is fully determined by what is to be investigated.


2.2.1 Reference space selection

Image registration searches for a transformation from a source image to a reference image, and the choice of a reference space can have a large impact on the quality of the result. The key function of the registration in Imiomics is to produce point-wise correspondences, which means that both input images have to be similar to some degree before the registration. An example of an ill-suited reference image is the use of a female reference subject when registering a male group, as there are inherent anatomical differences. The reference space should be representative of the cohort that is analyzed. The most straightforward approach for selecting a reference space is to select a subject from the cohort with proximity to the cohort mean. This, however, can be challenging as the body shape and composition may vary heavily throughout a cohort.

An alternative approach is to produce a synthetic average reference space. One such method was presented in [50], where a reference space neutral to local tissue volume was computed in order to aid the registration process. The aim was to produce an image which, when used as a reference space, has an average local volume change close to zero. For a given cohort, a synthetic reference space is produced by first selecting an initial reference as a starting guess. The reference space is then iteratively improved by applying a warping that is optimized towards producing an average local volume change of zero.

2.2.2 Image registration

When the Imiomics concept was first published, a three-stage process was implemented for the registration of whole-body fat-water MRI [64]. This process was designed to mirror the elasticity of the different tissues. The first stage was a locally affine piece-wise registration of bone, where major bone segments were transformed one by one in an affine fashion to reflect the rigidity of bone. This also gives a good initial transformation for the remaining tissues. The next stage registered muscle tissue by using the water image acquired from the MRI. Muscle, as opposed to bone, is expected to be elastic, and thus deformable registration was applied. The last stage was the registration of fat tissue. Here the fat images were also registered in a deformable fashion; however, fat is expected to have a higher elasticity than muscle, and the regularization was decreased in order to reflect this.

The initial implementation was based on the image registration framework elastix [37], which utilizes a parametric transformation model, i.e. the resulting transformation is represented by a set of parameters, with a lower transformation resolution. In Paper II the same three-stage process was implemented, but with the elastix method replaced by the method presented in Paper I in an attempt to improve the quality. This new implementation of the three-stage process has also been applied in another publication [27] evaluating its performance in whole-body analysis of metabolic and morphological parameters.

The image registration is a key component of the Imiomics analysis, and the quality of the analysis is heavily reliant on the quality of the registration. The registration of whole-body imaging is not free from problems, however. Selecting a suitable reference space may to some degree facilitate the process, but the registration still needs to be able to handle the large variety in anatomy and the various imaging artifacts that are typically present in a dataset of this scale. Another important aspect of the image registration is the efficiency: as the datasets grow, it is essential that the process can produce a result within a reasonable timeframe. The trade-off between quality and computation time is a vital aspect of making the registration viable for analyses of this kind. Paper IV presents an in-depth comparison of different approaches for registering whole-body images where these aspects are evaluated.

2.2.3 Analysis

The image registration of whole-body images for a group of subjects into a single reference subject enables point-to-point correspondences for the whole body of all subjects through a shared coordinate system. Using these correspondences, various information of a group can be extracted on a voxelwise basis. Having the information structured in this manner allows the utilization of many common statistical tools.

Depending on the acquired imaging data, there are multiple choices of what to analyze. In Paper II, imaging of fat fraction together with FDG-PET was included in the analysis. The integrated PET/MRI performs simultaneous imaging of both modalities, and the inherent registration of the system removed the need for any explicit image registration between them. However, the registration step described could be extended to also include other modalities.

In addition to the imaging information, the registration also produces a map of the change in local tissue volume. This parameter is a measurement of the local change in volume induced by the resulting transformation. To determine the local change in volume, the Jacobian determinant of the transformation produced by the registration is computed. A Jacobian determinant value between 0 and 1 implies a local contraction, a value of 1 implies no change in volume, and a value above 1 implies a local expansion. A value below 0 implies a folding of the space; this case is further described in Section 3.2. The Jacobian determinant can be included in the analysis to investigate differences in volume for every point in the image space.
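As an illustration of the Jacobian determinant map described above, the following is a minimal NumPy sketch, not the implementation used in the papers. It assumes the transformation has the form W(x) = x + u(x), that the displacement field u is stored as an array with one component per leading index, and that derivatives are estimated with finite differences. The function name `jacobian_determinant` and the array layout are hypothetical choices made for this example.

```python
import numpy as np

def jacobian_determinant(displacement, spacing=None):
    """Return det(J) of W(x) = x + u(x) at every grid point.

    displacement: array of shape (d, n_1, ..., n_d) holding the d
    components of u on a regular grid (assumed layout for this sketch).
    spacing: optional grid spacing per axis, defaults to 1 for every axis.
    """
    u = np.asarray(displacement, dtype=float)
    dim = u.shape[0]
    if spacing is None:
        spacing = (1.0,) * dim
    # jac[..., i, j] = d W_i / d x_j = delta_ij + d u_i / d x_j
    jac = np.empty(u.shape[1:] + (dim, dim))
    for i in range(dim):
        grads = np.gradient(u[i], *spacing)  # finite differences per axis
        for j in range(dim):
            jac[..., i, j] = grads[j] + (1.0 if i == j else 0.0)
    return np.linalg.det(jac)
```

With a zero displacement field the map is 1 everywhere. Values in (0, 1) mark local contraction, values above 1 mark local expansion, and negative values mark folding.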

2.3 Medical Applications

This section briefly presents three possible medical applications of Imiomics.


2.3.1 Correlation Analysis

Utilizing correlations as a statistical tool in order to assess possible associations is commonplace in medical research. However, as described previously, a common problem when such analyses involve imaging data is the fact that the information from images is reduced to a small number of pre-defined measurements. Using the information structured as described in Section 2.2.3 to compute the correlation coefficient for every point enables the construction of correlation maps. These maps are a great tool for understanding associations between imaging and non-imaging data. By including a non-imaging parameter such as age, the association between age and imaging data such as the FDG uptake can be presented for every point in the image, which can provide additional information not available when associating age with any pre-defined measurement of a specific tissue.
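The construction of a correlation map can be sketched as follows. This is a minimal NumPy example under the assumption that all images have already been registered to the reference space and stacked along the first axis. The function name `correlation_map` is hypothetical, and setting the coefficient to zero where a voxel is constant across subjects is a choice made only for this sketch.

```python
import numpy as np

def correlation_map(images, covariate):
    """Pearson correlation between every voxel and a per-subject covariate.

    images: (n_subjects, ...) array of images in a common reference space.
    covariate: (n_subjects,) array of a non-imaging parameter, e.g. age.
    """
    x = np.asarray(images, dtype=float)
    y = np.asarray(covariate, dtype=float)
    xc = x - x.mean(axis=0)
    # Center the covariate and broadcast it over the spatial axes.
    yc = (y - y.mean()).reshape((-1,) + (1,) * (x.ndim - 1))
    cov = (xc * yc).sum(axis=0)
    denom = np.sqrt((xc ** 2).sum(axis=0) * (yc ** 2).sum())
    with np.errstate(divide="ignore", invalid="ignore"):
        r = cov / denom
    # Voxels that are constant across subjects have no defined correlation;
    # they are set to 0 here (an assumption of this sketch).
    return np.where(denom > 0, r, 0.0)
```

The result has the same spatial shape as a single image, so it can be displayed on top of the reference image.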

This type of analysis was applied in [43] to investigate relationships between endothelium-dependent vasodilation and body composition. This investigation concluded that Imiomics not only confirmed findings produced by traditional approaches but also produced a more detailed result. Another example in the literature is [13], where Imiomics was utilized to investigate associations between fat fraction, FDG-PET uptake rate, and a range of metabolites.

Figure 2.1. An example of a group comparison performed using Imiomics (panels: Control, T2D, Results). Here the pointwise mean FDG uptake rate Ki of a control group and a group diagnosed with type 2 diabetes (T2D) is presented together with the result of a pointwise t-test. Presented in the result are the regions with a p-value in the range [0, 0.05] on top of the T2D mean image. The arms were excluded from the analysis due to differences in arm positioning between the subjects.


2.3.2 Group Comparison

The act of comparing groups is also common in medical research. For instance, in a cross-sectional study one may want to compare two subsets of the population. Similar to correlation analysis, the Imiomics framework enables group comparisons on a point-wise basis. Point-wise statistical tests for group comparisons can be performed on two subsets of the registered subjects to produce maps of the detected difference. This allows for a detailed analysis of differences in the available imaging information.

In [42], a group comparison in this fashion was applied in order to investigate whether the Imiomics framework was able to provide further information on the relationship between metabolic syndrome and body composition.

Figure 2.1 presents an example of a pointwise t-test to compare a group of subjects diagnosed with type 2 diabetes (T2D) against a group of controls. Depicted in the figure are the mean uptake rates, Ki, for the two groups, together with the p-values computed using the t-test. The T2D group consists of 13 subjects and the control group consists of 9 subjects.
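A point-wise group comparison of this kind can be sketched in a few lines of NumPy. The text does not state which variant of the t-test was used, so the example below assumes Welch's unequal-variance t-test. The function name `welch_t_map` is hypothetical, and both groups are assumed to be stacks of registered images with subjects along the first axis.

```python
import numpy as np

def welch_t_map(group_a, group_b):
    """Voxel-wise Welch t-statistic and degrees of freedom.

    group_a, group_b: (n_subjects, ...) arrays in a common reference space.
    Voxels with zero variance in both groups yield undefined values.
    """
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    na, nb = a.shape[0], b.shape[0]
    # Squared standard error of each group mean, per voxel.
    va = a.var(axis=0, ddof=1) / na
    vb = b.var(axis=0, ddof=1) / nb
    t = (a.mean(axis=0) - b.mean(axis=0)) / np.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df
```

A two-sided p-value map then follows from the Student t distribution, for example with `2 * scipy.stats.t.sf(np.abs(t), df)` if SciPy is available. Because one test is performed per voxel, some form of multiple-comparison handling is normally needed when interpreting such maps.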

2.3.3 Anomaly Detection

Another application is the use of Imiomics to model the whole body of a healthy population in order to test patients against this model to detect anomalies. This has interesting applications for systemic diseases such as oncological diseases and T2D and could potentially aid in the detection of pathology.

This type of application is covered in Paper II, where a whole-body atlas containing both functional information and information on body composition was computed using image registration. This atlas was built in order to create a statistical model of the average healthy subject. Having an atlas with voxelwise information of healthy tissue enables automated anomaly detection. In the paper a proof-of-concept task was set up where subjects with suspected pathology were tested against the atlas in order to detect deviations from the normal.
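One simple way to express the idea of testing a subject against a voxel-wise model of healthy tissue is a z-score map. The sketch below is an illustration under that assumption and is not necessarily the procedure used in Paper II. The function names, the threshold of 3 standard deviations, and the small `eps` guard are hypothetical choices for this example.

```python
import numpy as np

def build_atlas(healthy_images):
    """Voxel-wise mean and standard deviation of registered healthy subjects."""
    h = np.asarray(healthy_images, dtype=float)
    return h.mean(axis=0), h.std(axis=0, ddof=1)

def anomaly_mask(image, mean, std, threshold=3.0, eps=1e-6):
    """Flag voxels deviating more than `threshold` standard deviations.

    image: a subject image registered to the same reference space as the atlas.
    eps: lower bound on the standard deviation to avoid division by zero.
    """
    z = (np.asarray(image, dtype=float) - mean) / np.maximum(std, eps)
    return z, np.abs(z) > threshold
```

The returned mask has the same shape as the image and can be overlaid on it to highlight candidate regions for review.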


3. Image registration

The act of aligning images is a common task in a number of different fields. In computer vision, one may want to align images from multiple cameras to derive the positions of objects in a scene. Fields applying remote sensing may need to align acquired data. It has also been widely applied in medical imaging. This alignment is generally referred to as image registration and even though the fields are different in multiple aspects, the general principle is the same. The objective of image registration is to, given a set of images, find a spatial transformation that aligns the images in a way that is optimal given a measurement of optimality. Fig. 3.1 presents an example of registering one whole-body image (source) to another (target).

Figure 3.1. Example of registered whole-body fat-water images where the source image has been registered to the target (panels: Source, Target, Result). Shown are selected slices from 3D volumes.

Image registration has played a vital role in medical imaging within a vast number of tasks and can be considered one of the main problems in medical image analysis. It is one of the cornerstones in a multitude of different tasks, including image fusion, atlas segmentation, motion correction, and many others. Image fusion is an important aspect of medical imaging and could involve fusing imaging from different devices or from different time points. A notable example is the fusion of PET and CT imaging in order to overlap the biological function seen in PET with the detailed anatomy seen in CT. Image registration can also be employed in image segmentation through atlas segmentation. The concept is, given a set of pre-labeled images, i.e. atlases, to transform the labels from the atlases to the target image by the use of image registration and finally fuse the labels from all atlases, e.g. by majority voting.

Multi-atlas segmentation, which utilizes multiple atlases, has been shown to perform very well in a large variety of biomedical applications [33]. Multiple surveys providing introductions to the wide array of proposed image registration techniques and their applications have been presented throughout the years [7][48][46][71][62].
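The label fusion step of atlas segmentation mentioned above can be illustrated with majority voting. The sketch assumes that the label images of all atlases have already been transformed to the target image by registration. The function name `majority_vote` and the tie-breaking rule are choices made for this example.

```python
import numpy as np

def majority_vote(warped_labels, num_labels):
    """Fuse atlas label images that are already aligned with the target.

    warped_labels: (n_atlases, ...) integer array with labels in
    the range [0, num_labels).
    """
    labels = np.asarray(warped_labels)
    # votes[k] counts, per voxel, how many atlases propose label k.
    votes = np.stack([(labels == k).sum(axis=0) for k in range(num_labels)])
    # Ties are resolved in favor of the lowest label index.
    return votes.argmax(axis=0)
```

For three one-dimensional atlases `[[0, 1, 1, 2], [0, 1, 2, 2], [1, 1, 2, 0]]` the fused labels are `[0, 1, 2, 2]`.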

3.1 Preliminaries

This section describes the components that characterize an image registration method. This first requires a formal definition of an image. An image can be described by two components: a set of points V and a mapping I : V → R. The image domain is defined by V, whose points are arranged on a discrete grid; for a three-dimensional image, V is the set of voxels in the image. The mapping I maps a point x ∈ V to the intensity value at that point.

Figure 3.2. Illustration of the three interpolation strategies: nearest neighbor (a), linear (b), and spline (c). Depicted is the mapping of the intensities I, given x, for a one-dimensional image.

To enable a precision greater than the grid defined by V, a continuous model of an image can be approximated through interpolation. This is an important aspect of the registration method presented in Paper I, as the searched displacements are not bound by the image grid. Three common interpolation strategies are illustrated in Fig. 3.2: nearest neighbor, linear interpolation, and spline interpolation. The choice of interpolation may have a significant impact on both quality and performance. Nearest neighbor interpolation has the property of not introducing any new intensities in the image, which is desirable when interpolating images containing discrete values, such as labels defining a region of interest. The other strategies, on the other hand, can be applied to better model the expected behavior of images with continuous intensities.
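The difference between the first two strategies can be sketched for a one-dimensional image as follows. This is a minimal illustration, not the implementation used in Paper I; the function names and the clamping of coordinates at the image border are assumptions made for the example.

```python
import math

def sample_nearest(intensities, x):
    """Nearest neighbor: return the intensity of the closest grid point."""
    i = math.floor(x + 0.5)
    i = min(max(i, 0), len(intensities) - 1)  # clamp to the image domain (assumption)
    return intensities[i]

def sample_linear(intensities, x):
    """Linear: blend the two neighboring grid points by their distance to x."""
    x = min(max(x, 0.0), len(intensities) - 1.0)  # clamp to the image domain (assumption)
    i0 = math.floor(x)
    i1 = min(i0 + 1, len(intensities) - 1)
    t = x - i0
    return (1.0 - t) * intensities[i0] + t * intensities[i1]

# A one-dimensional label image with the two labels 0 and 1.
labels = [0, 0, 1, 1]
nearest_value = sample_nearest(labels, 1.4)  # remains one of the original labels
linear_value = sample_linear(labels, 1.4)    # a blend that is not a valid label
```

Sampling the label image between two grid points shows why nearest neighbor interpolation is preferred for discrete values: the linear variant returns a value between the two labels.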

Figure 3.3. An illustration of the backward transform between the source domain, V_S, and the target domain, V_T. The transformation is performed for every x ∈ V_T by applying the displacement u(x) to determine which point in the source image to sample.

As mentioned, the general definition of image registration applies to two or more images. To narrow down the problem, image registration is from here on defined as the problem of aligning only two images. The first image will be referred to as the source or moving image, denoted as S = (I_S, V_S). The second image is the target or fixed image, denoted as T = (I_T, V_T). These two images are related by a transformation, W : V_T → R^d, mapping points in the target image to points in the source image as illustrated in Fig. 3.3. This type of mapping is referred to as a backward transform and is commonly applied in practice since its implementation is more straightforward than that of the forward transform. W can be defined as

W(x) = x + u(x), (3.1)

where u is a mapping producing a displacement vector for every point x ∈ V_T. To transform the source image in practice, a new image S′ is constructed, inheriting the domain of the target image, V_T. Each point x ∈ V_T is then evaluated by applying the transformation and sampling the corresponding intensity in S, i.e. I_S′ = I_S ∘ W. The transformed point does not necessarily coincide with a point in V_S, and this is where interpolation can be applied to acquire a continuous model of the image.
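The backward transform can be sketched for a one-dimensional image as follows. The sketch assumes linear interpolation, a source and target defined on grids of the same size, and clamping at the image border; the function names are hypothetical and chosen for the example only.

```python
import math

def sample_linear(image, x):
    """Linearly interpolate a 1D image at the continuous position x."""
    x = min(max(x, 0.0), len(image) - 1.0)  # clamp to the image domain (assumption)
    i0 = math.floor(x)
    i1 = min(i0 + 1, len(image) - 1)
    t = x - i0
    return (1.0 - t) * image[i0] + t * image[i1]

def backward_warp(source, displacement):
    """Resample the source on the target grid: for every target point x,
    sample the source at W(x) = x + u(x)."""
    return [sample_linear(source, x + u) for x, u in enumerate(displacement)]

source = [0.0, 10.0, 20.0, 30.0, 40.0]
u = [0.5] * len(source)  # constant displacement of half a grid step
warped = backward_warp(source, u)
```

Because the loop runs over the target grid, every point of the transformed image receives exactly one value, which is what makes the backward transform simpler to implement than the forward transform.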

To reiterate the objective of image registration: given a set of images, i.e. S and T, find a transformation, W, that is optimal given some measurement of optimality, f(W). This can be formulated as an optimization problem of the form

$$\hat{W} = \arg\min_{W} f(W). \tag{3.2}$$

Image registration techniques are characterized by three components: a model for W, the objective function f(W), and an optimization approach. Naturally, the outcome is heavily dependent upon the first two components. In addition, since the problem is inherently ill-posed and reaching the global optimum can in general not be guaranteed, the outcome will also vary with the choice of the optimization method.

3.2 Transformation model

The transformation model is considered the most fundamental characteristic of an image registration technique [7] and may have a significant impact on the performance. On one hand, a restricted model with few degrees of freedom may be too limiting, preventing the method from finding meaningful transformations. On the other hand, a less restricted model increases the search space, making the search for a solution both difficult and time-consuming.

Figure 3.4. Three examples of transformation models with different degrees of freedom. (a) visualizes a simple rotation by the angle θ. (b) visualizes a sparse deformable transformation model, where the transformation is represented by a sparse grid of control points. (c) visualizes a dense, non-parametric representation where each image element is displaced by a corresponding displacement vector.

Figure 3.4 depicts examples from three different classes of transformation models. The first example, (a), visualizes a rotation by the angle θ. This type of transformation belongs to the first class, which involves linear transformations. These models are global by nature and are thus only able to align entire images, making local transformations impossible. They are ill-suited for more intricate tasks, but their simplicity makes them very efficient when representing global alignments. Two examples, affine and rigid transformations, apply linear transformations in combination with translation.

The first example, the affine transformation, is a geometric transformation involving rotation, translation, reflection, and shearing, amongst others. A property of an affine transform is that straight lines remain straight. This type of transformation can be formulated as

W(x) = Ax + t, (3.3)

where A is a d × d matrix holding the linear part and t is the translation vector. In terms of the displacement in Eq. 3.1, this corresponds to u(x) = (A − I)x + t. The second example, rigid transformations, is a special case of affine transformations for which the Euclidean distance between all points is preserved. This means it only permits rotation, translation, and reflection, and can thus be written as

W(x) = Rx + t, (3.4)

where R is an orthogonal matrix. As mentioned, rigid transformations also include reflections, which is generally an undesirable property in image registration. Proper rigid transformations, a subclass of rigid transformations, prohibit reflections by adhering to the condition det(R) = 1.
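The distinction between affine, rigid, and proper rigid transformations can be checked numerically. The following two-dimensional sketch maps a point x to Ax + t and compares how a rotation, a reflection, and a shear affect the distance between two points; all names and values are hypothetical and chosen for illustration.

```python
import math

def rotation(theta):
    """2D rotation matrix for the angle theta (in radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def transform(A, t, x):
    """Map the 2D point x to A x + t."""
    return (A[0][0] * x[0] + A[0][1] * x[1] + t[0],
            A[1][0] * x[0] + A[1][1] * x[1] + t[1])

def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

R = rotation(math.radians(30.0))       # proper rigid: det = 1
reflection = [[1.0, 0.0], [0.0, -1.0]]  # rigid but improper: det = -1
shear = [[1.0, 0.5], [0.0, 1.0]]        # affine, not rigid
t = (5.0, -2.0)

p, q = (1.0, 2.0), (4.0, 6.0)           # two points at distance 5
d_rigid = distance(transform(R, t, p), transform(R, t, q))
d_reflect = distance(transform(reflection, t, p), transform(reflection, t, q))
d_shear = distance(transform(shear, t, p), transform(shear, t, q))
```

The rotation and the reflection both preserve the distance between the points, and only the sign of the determinant separates them, whereas the shear changes the distance.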

These global models do not adequately represent the transformations required by more intricate tasks, and thus deformable transformation models are often applied. These models do not restrict themselves to global transformations and can thus map local variations between two images. Techniques utilizing deformable models fall under the category of deformable image registration (DIR). A comprehensive review of this class of transformations was presented in [32].

Sparse representations are common when modeling deformable transformations as they reduce the number of unknowns. These representations can be either irregular or regular based on the regularity of their control points.

Landmark-based registration, where the objective is to register a set of corresponding landmarks, typically uses an irregular representation. Free-form deformations (FFD) [56], on the other hand, use a discrete grid of control points to form a lattice over the image. The lattice, together with the image, can then be deformed by manipulating these control points. Transformation (b) in Fig. 3.4 visualizes one such sparse representation, where the markers represent the control points.

These representations are preferably combined with splines in order to produce a continuous mapping in the image domain, Ω. This mapping can be defined as

$$u(x) = \sum_{k} p_k B_k(x), \tag{3.5}$$

where p_k ∈ R^d are the parameters and B_k : Ω → R the basis functions.
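Eq. 3.5 can be illustrated in one dimension with cubic B-splines as basis functions. The sketch below assumes a regular control-point grid with a fixed spacing, scalar parameters (d = 1), and basis functions centered on the control points; these choices and all names are assumptions made for the example and are not taken from [56].

```python
def cubic_bspline(t):
    """Cubic B-spline kernel, non-zero only for |t| < 2."""
    t = abs(t)
    if t < 1.0:
        return 2.0 / 3.0 - t * t + 0.5 * t * t * t
    if t < 2.0:
        return (2.0 - t) ** 3 / 6.0
    return 0.0

def displacement(x, params, spacing):
    """u(x) = sum_k p_k B_k(x), with B_k centered on control point k."""
    return sum(p * cubic_bspline(x / spacing - k) for k, p in enumerate(params))

spacing = 4.0  # distance between control points, in image elements

# Equal parameters give a constant displacement away from the border,
# since the shifted kernels sum to one.
uniform = displacement(10.0, [1.5] * 8, spacing)

# Moving a single control point only affects its neighborhood.
params = [0.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0]
near = displacement(12.0, params, spacing)  # at the moved control point
far = displacement(20.0, params, spacing)   # two control points away
```

The local support of the basis functions is what makes the representation sparse: eight parameters describe the displacement over 32 image elements, and each parameter influences only a limited region.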

The previously mentioned transformation models are all examples of parametric models since they can be described by a number of explicit parameters. A popular implementation of image registration applying these models is the elastix tool [36]. Another variation is the non-parametric model, which follows no such parametrization and thus allows each point in the source image to be displaced arbitrarily, as visualized in the last example of Fig. 3.4, (c). In this case, the transformation could be written as

W(x) = x + u(x), (3.6)

where u is a dense displacement field holding a displacement vector for every x ∈ V. In comparison to the sparse nature of the previous deformable models, this could also be referred to as a dense model.

This dense representation is very expressive, since each point can be displaced arbitrarily. This leads to an immense search space, and one full of improbable deformations at that. Several physical models have been proposed to both reduce the search space and restrict the solution to physically probable deformations. One such method was presented in [69], which formulated image matching as a diffusion process. Others include elastic body models, models based on viscous fluids, and flows of diffeomorphisms.

Transformations produced by image registration are often discussed in terms of their properties. Two well-known properties are inverse consistency and diffeomorphism.

For an image registration method to be inverse consistent, the transformation produced in one direction, W_A→B, must be the inverse of the transformation produced in the other direction, W_B→A. However, due to the lack of a unique solution, this is seldom the case. An inverse consistent method, such as the one proposed by [10], aims to produce transformations in both directions that adhere to the constraint W_A→B = (W_B→A)^−1.

Diffeomorphisms are invertible functions mapping one manifold to another where both the functions and their inverses are differentiable. This means that they can be used to represent a smooth and continuous mapping between the two images while preserving the topology. These properties are valuable in image registration as they prevent physically implausible deformations such as foldings.

3.3 Objective function

The objective function, or matching criterion, attempts to quantify the alignment of the source and the target image. This matching could be either geometric or iconic. A geometric method attempts to establish a correspondence between landmarks, while an iconic, or intensity-based, method evaluates the matching using image intensities. There are also hybrid methods, in which information on the geometric relationships between two images is combined with the intensities.

Choosing the similarity metric is an important aspect of devising an iconic matching criterion. The appropriate choice depends mostly on the intensity relationship between the two images being compared, and ideally the employed similarity metric should account for this relationship. This problem is even more evident when matching images of different modalities.

Sum of squared differences (SSD) is a commonplace distance metric which is easy to adopt due to its simplicity. The SSD between S and T can be defined as

$$D_{SSD}(u) = \sum_{v \in V_T} \big( (I_S \circ W)(v) - I_T(v) \big)^2, \tag{3.7}$$

where W(x) = x + u(x) as in Eq. 3.1. However, SSD is limited by the assumption that there is a direct correspondence in the intensities between the two images, i.e. I_T = I_S ∘ W. This makes it less attractive for some modalities. For instance, data acquired from MRI is generally non-quantitative, and the intensity values of the same tissue may differ when images are acquired on different scanners or with differences in positioning. Another popular metric is Pearson's correlation coefficient (PCC), sometimes also referred to as normalized cross-correlation (NCC). PCC has the property of being insensitive to both differences in mean intensities and noise in low-intensity areas, which are both common in MRI.

The PCC between two images, A and B, can be defined as

$$PCC(A,B) = \frac{\sum_{v \in V} (I_A(v) - \bar{I}_A)(I_B(v) - \bar{I}_B)}{\sqrt{\sum_{v \in V} (I_A(v) - \bar{I}_A)^2 \sum_{v \in V} (I_B(v) - \bar{I}_B)^2}}, \tag{3.8}$$

where $\bar{I}$ denotes the mean value of I and V is the set of overlapping points between A and B. The PCC for a given point, v, is defined as the PCC of a window, w, surrounding v. PCC as a similarity metric can thus be formulated as

$$D_{PCC}(u) = \sum_{v \in V_T} 1 - PCC\big(T_w(v), (S \circ W)_w(v)\big)^2, \tag{3.9}$$

where T_w(v) and (S ∘ W)_w(v) are both subsets of their original images, covering a window w with v at its center.
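The difference in behavior between SSD and PCC can be sketched on a single window of intensities. The example assumes the usual normalization of the PCC by the square root of the product of the two sums of squared deviations, and that neither window is constant (which would give a zero denominator); the intensity values are hypothetical.

```python
import math

def ssd(a, b):
    """Sum of squared intensity differences between two windows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pcc(a, b):
    """Pearson's correlation coefficient between two windows.
    Assumes neither window is constant (non-zero denominator)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    spread_a = sum((x - mean_a) ** 2 for x in a)
    spread_b = sum((y - mean_b) ** 2 for y in b)
    return covariance / math.sqrt(spread_a * spread_b)

target = [1.0, 3.0, 2.0, 5.0, 4.0]
# Same structure, but with a different intensity scale and offset, as could
# occur between two non-quantitative acquisitions.
source = [2.0 * v + 10.0 for v in target]

ssd_value = ssd(target, source)               # large despite perfect alignment
pcc_cost = 1.0 - pcc(target, source) ** 2     # per-window term of the PCC metric
```

Although the two windows are perfectly aligned, SSD reports a large distance because the intensities differ linearly, while the PCC-based cost is zero up to rounding.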

Another problem that arises is the lack of intensity correspondence in multi-modal image registration. Mutual information and normalized mutual information attempt to tackle this problem by applying information theory to measure the similarity between two images [51].
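No formula for mutual information is given here, so the following sketch uses the standard definition from information theory, MI(A, B) = Σ p(a, b) log(p(a, b) / (p(a) p(b))), estimated from the joint histogram of two images with discrete (binned) intensities. The function name and the example data are hypothetical, and the details in [51] may differ.

```python
import math
from collections import Counter

def mutual_information(a, b):
    """Mutual information (in nats) of two equally sized images with
    discrete intensities, estimated from their joint histogram."""
    n = len(a)
    joint = Counter(zip(a, b))
    marginal_a = Counter(a)
    marginal_b = Counter(b)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        p_x = marginal_a[x] / n
        p_y = marginal_b[y] / n
        mi += p_xy * math.log(p_xy / (p_x * p_y))
    return mi

image = [0, 0, 1, 1, 2, 2, 3, 3]
inverted = [3 - v for v in image]       # same structure, different intensities
unrelated = [0, 1, 0, 1, 0, 1, 0, 1]    # statistically independent of `image`

mi_inverted = mutual_information(image, inverted)
mi_unrelated = mutual_information(image, unrelated)
```

The inverted image shares no intensity values with the original at corresponding points, yet its mutual information is maximal because each intensity in one image predicts the intensity in the other. This is the property that makes the measure attractive across modalities.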

Due to the ill-posed nature of the registration problem, it is often desirable to include regularization in the objective function. This typically involves imposing a degree of smoothness on the transformation. Numerous registration methods, such as the ones applying a parametric deformation model, impose implicit regularization as part of the model. This may also be the case for non-parametric methods employing physical deformation models. Still, for certain models, explicit regularization is often necessary. A common variant is diffusion regularization, which penalizes first-order derivatives in order to impose smoothness. This term can be written as

$$R(u) = \sum_{x \in V} \lVert \nabla u(x) \rVert^2. \tag{3.10}$$


Combining a similarity metric and regularization, the final objective function can be formulated as

f(u) = D + αR, (3.11)

where α is a weighting factor controlling the degree of regularization.
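The effect of the regularization term can be sketched for a one-dimensional displacement field. The example assumes forward differences with unit grid spacing as the discretization of the first-order derivative; the function names and values are hypothetical.

```python
def diffusion_regularizer(u):
    """R(u): sum of squared forward differences of a 1D displacement field
    (unit grid spacing assumed)."""
    return sum((u[i + 1] - u[i]) ** 2 for i in range(len(u) - 1))

def objective(data_term, u, alpha):
    """f(u) = D + alpha * R(u), with D given as a precomputed value."""
    return data_term + alpha * diffusion_regularizer(u)

smooth = [1.0, 1.0, 1.0, 1.0]    # constant displacement: no penalty
jagged = [1.0, -1.0, 1.0, -1.0]  # oscillating displacement: heavily penalized

r_smooth = diffusion_regularizer(smooth)
r_jagged = diffusion_regularizer(jagged)
f_jagged = objective(2.0, jagged, alpha=0.5)
```

A constant displacement, i.e. a pure translation, is not penalized at all, whereas rapidly varying displacements increase the objective in proportion to α.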

3.4 Optimization method

The third component, the optimization method, is the means by which to find the transformation that optimally aligns the two images with respect to the objective function. Optimization for image registration can be separated into two classes: continuous and discrete methods.

Continuous optimization operates, as the name implies, on continuous and differentiable objective functions. Gradient descent is a common algorithm for continuous optimization and has been applied to a large range of tasks.
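As a minimal sketch of continuous optimization, the following example uses gradient descent to estimate a one-dimensional translation between two signals by minimizing the SSD. The Gaussian test signal, the fixed step size, and the iteration count are assumptions chosen so that the example converges; practical methods typically use more elaborate schemes.

```python
import math

MU, SIGMA, TRUE_SHIFT = 16.0, 3.0, 2.5  # hypothetical test configuration

def source(x):
    """Continuous model of the source signal (a Gaussian bump)."""
    return math.exp(-((x - MU) ** 2) / (2.0 * SIGMA ** 2))

def source_derivative(x):
    """Analytical derivative of the source signal."""
    return -(x - MU) / SIGMA ** 2 * source(x)

GRID = range(32)
TARGET = [source(x + TRUE_SHIFT) for x in GRID]  # target = shifted source

def ssd_gradient(t):
    """Derivative of sum_x (S(x + t) - T(x))^2 with respect to t."""
    return sum(2.0 * (source(x + t) - TARGET[x]) * source_derivative(x + t)
               for x in GRID)

def gradient_descent(t=0.0, step=1.0, iterations=100):
    """Repeatedly move the translation against the gradient of the SSD."""
    for _ in range(iterations):
        t -= step * ssd_gradient(t)
    return t

estimate = gradient_descent()
```

The example converges because the initial guess lies within the capture range of the Gaussian; starting too far from the true shift would leave the gradient close to zero, which relates to the local minima discussed in Section 3.5.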

Discrete optimization, on the other hand, operates in a discrete domain. This method of optimization can be adopted by formulating image registration as a discrete labeling problem. Labeling problems are frequently encountered in a variety of fields involving images, with the purpose of producing optimal mappings between sites (e.g. pixels or voxels) and labels with respect to an objective function. Image segmentation is a typical example of a labeling problem.

Markov random field (MRF) optimization is a convenient tool for modeling problems of this nature [41] and has been applied in multiple instances of image registration [20][31]. Solving the labeling problem can then be seen as analogous to minimizing the energy function of an MRF. The MRF can be represented by an undirected graph, G = {V, E}, consisting of a set of vertices, V, and a set of edges, E. The vertices represent image elements while the edges represent their spatial relationship, i.e. they connect adjacent vertices. Let $\mathcal{S} = \{1, \dots, m\}$ be a set of indices for m sites, in this case vertices, $\mathcal{L} = \{l_1, \dots, l_M\}$ be a set of M labels, and $L : \mathcal{S} \to \mathcal{L}$ the mapping between the two. The energy function can then be formulated as

$$E(L) = \sum_{v \in V} \phi_v(L(v)) + \sum_{(v,w) \in E} \phi_{v,w}(L(v), L(w)), \tag{3.12}$$

where φ_v is the unary potential and φ_{v,w} the binary potential. The unary potentials typically represent the data term of the objective function while the binary potentials, modeled by the edges of the MRF graph, represent the regularization.
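The energy in Eq. 3.12 can be evaluated directly for a small graph. The following sketch uses a hypothetical chain of three sites with two labels and a Potts-style binary potential (a fixed penalty when neighboring labels differ), and finds the minimum by exhaustive search, which is only feasible for toy problems of this size.

```python
from itertools import product

def energy(labeling, unary, edges, binary):
    """E(L): sum of unary potentials plus sum of binary potentials over edges."""
    data = sum(unary[v][label] for v, label in enumerate(labeling))
    smoothness = sum(binary(labeling[v], labeling[w]) for v, w in edges)
    return data + smoothness

def potts(a, b):
    """Hypothetical binary potential: unit penalty when labels differ."""
    return 0.0 if a == b else 1.0

# unary[v][l] is the cost of assigning label l to site v (hypothetical values).
unary = [[0.0, 2.0], [1.5, 1.0], [0.0, 2.0]]
edges = [(0, 1), (1, 2)]  # a chain: 0 - 1 - 2

labelings = list(product(range(2), repeat=3))
best = min(labelings, key=lambda l: energy(l, unary, edges, potts))
data_only = min(labelings, key=lambda l: energy(l, unary, [], potts))
```

Without edges, the middle site prefers label 1 since its unary cost is lower. With the binary potentials included, the penalty for disagreeing with both neighbors outweighs that preference and the minimum becomes the uniform labeling.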

Energy functions of this form can then be minimized in a multitude of ways, including iterated conditional modes (ICM) [3], simulated annealing, linear programming, and graph-based methods. A popular approach for finding the global minima of energy functions of the form described in Eq. 3.12 is graph cut optimization. This approach has also been employed successfully in multiple image registration methods [68][61][67]. Graph cut and how it relates to energy minimization is further described in Chapter 4.
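Of the methods listed above, ICM is the simplest to sketch: each site in turn is assigned the label that minimizes its local energy while all other sites are held fixed. The example below illustrates ICM only, not the graph cut approach of Chapter 4; it assumes a symmetric binary potential, and the chain graph and costs are hypothetical. Unlike graph cuts, ICM only guarantees a local minimum.

```python
def icm(unary, edges, binary, max_sweeps=10):
    """Iterated conditional modes for an energy with unary and binary terms.
    Assumes binary(a, b) == binary(b, a)."""
    n = len(unary)
    neighbors = {v: [] for v in range(n)}
    for v, w in edges:
        neighbors[v].append(w)
        neighbors[w].append(v)

    # Initialize each site with the label of lowest unary cost.
    labeling = [min(range(len(costs)), key=lambda l: costs[l]) for costs in unary]

    for _ in range(max_sweeps):
        changed = False
        for v in range(n):
            def local_cost(label):
                return unary[v][label] + sum(
                    binary(label, labeling[w]) for w in neighbors[v])
            best = min(range(len(unary[v])), key=local_cost)
            if best != labeling[v]:
                labeling[v] = best
                changed = True
        if not changed:  # converged: no site wants to change its label
            break
    return labeling

def potts(a, b):
    """Hypothetical binary potential: unit penalty when labels differ."""
    return 0.0 if a == b else 1.0

# Five sites on a chain; the middle site weakly prefers label 1.
unary = [[0.0, 1.0], [0.0, 1.0], [1.0, 0.4], [0.0, 1.0], [0.0, 1.0]]
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
result = icm(unary, edges, potts)
```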

3.5 Multi-resolution strategies

Local minima are a common problem due to the ill-posed nature of image registration. Their number can to a certain extent be reduced by a suitable choice of registration components, but usually not sufficiently. A common strategy for further avoiding these local minima is to employ hierarchical multi-resolution registration. The registration is performed in a hierarchy with increasing resolution until the desired final resolution level is completed. The hierarchy is commonly defined in terms of either increasing data complexity or increasing transformation complexity. A summary of hierarchical models applied in image registration can be found in [40].
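A hierarchy of increasing data complexity can be sketched as a coarse-to-fine loop over an image pyramid. The one-dimensional example below assumes downsampling by averaging pairs of elements and a doubling of the displacement when moving to the next finer level; `register_level` is a hypothetical callback standing in for any single-resolution registration method.

```python
def downsample(signal):
    """Halve the resolution by averaging neighboring pairs of elements."""
    return [(signal[i] + signal[i + 1]) / 2.0
            for i in range(0, len(signal) - 1, 2)]

def build_pyramid(signal, levels):
    """Return the pyramid ordered from coarsest to finest resolution."""
    pyramid = [signal]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid[::-1]

def coarse_to_fine(source, target, levels, register_level):
    """Register level by level, using each result to initialize the next."""
    u = None
    pairs = zip(build_pyramid(source, levels), build_pyramid(target, levels))
    for s, t in pairs:
        if u is None:
            u = [0.0] * len(s)  # start from the identity transform
        else:
            # Each coarse element covers two fine elements, and displacements
            # double when expressed in the finer grid's units.
            u = [2.0 * d for d in u for _ in range(2)][:len(s)]
            u += [u[-1]] * (len(s) - len(u))  # pad if the finer level is odd-sized
        u = register_level(s, t, u)
    return u

signal = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 15.0]
pyramid = build_pyramid(signal, 3)

# Dummy level solver that adds one grid step of displacement per level.
final_u = coarse_to_fine(signal, signal, 3, lambda s, t, u: [d + 1.0 for d in u])
```

The dummy solver only demonstrates how displacements propagate between levels; in practice the callback would minimize the objective function at the given resolution.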

3.6 Evaluation

A major challenge when researching image registration techniques is the ill-posed nature of the problem. The lack of a unique solution in combination with the lack of ground truth makes evaluation a difficult task. The optimization often aims to maximize the similarity between the image pair, and image similarity is thus widely used as a metric for registration accuracy. This metric, however, together with tissue overlap metrics, is unreliable on its own [53] and should thus be combined with metrics that also evaluate the quality of the produced transformation.

The evaluation of image registration methods is still an open research question. A common approach to sidestep the downsides mentioned is to combine any metric for image similarity or tissue overlap with other metrics focused on the properties of the transformation. The non-rigid image registration evaluation project (NIREP) [9] aims to establish a standardized framework for the evaluation of image registration. Three metrics that have been used throughout this thesis are tissue overlap, inverse consistency, and the Jacobian determinant.

Tissue overlap is a measurement of the overlap of different tissues. This requires corresponding tissues to be delineated beforehand, and the metric measures the overlap between these two delineations. The Dice similarity coefficient (DSC) is a common tool for measuring the overlap given two segmented regions and can be defined as

$$DSC = \frac{2 |X \cap Y|}{|X| + |Y|}, \tag{3.13}$$

where X and Y are the two sets defining the segmentations.
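Eq. 3.13 translates directly into set operations. In the sketch below, segmentations are represented as sets of pixel coordinates; the value returned for two empty segmentations is a convention assumed for the example.

```python
def dice(x, y):
    """Dice similarity coefficient between two sets of image elements."""
    x, y = set(x), set(y)
    if not x and not y:
        return 1.0  # convention assumed here for two empty segmentations
    return 2.0 * len(x & y) / (len(x) + len(y))

# Two hypothetical 2 x 2 regions, shifted by one column.
region_a = {(0, 0), (0, 1), (1, 0), (1, 1)}
region_b = {(0, 1), (1, 1), (0, 2), (1, 2)}

overlap = dice(region_a, region_b)
```

The coefficient ranges from 0 for disjoint regions to 1 for identical regions; here half of each region is shared.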

The inverse consistency property of a transformation, which was described in Section 3.2, is an important aspect of image registration as true correspondences are desired. To measure this property for a specific image pair, (A, B), the registration is performed in both directions, producing two transformations: T_A→B and T_B→A. Ideally, these two registrations should produce identical correspondences and the composition, T_B→A ∘ T_A→B, should be the identity transform. However, this property is seldom fully realized and the error can be measured as the vector magnitude error (VME),

$$VME = \frac{1}{|V_B|} \sum_{x \in V_B} \lVert x - (T_{B \to A} \circ T_{A \to B})(x) \rVert. \tag{3.14}$$
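The VME can be sketched in one dimension, where the vector magnitude reduces to an absolute value. The transformations are represented as plain functions, and the two translations are hypothetical stand-ins for the outputs of registration in each direction.

```python
def vme(points, t_ab, t_ba):
    """Mean magnitude of x - (t_ba o t_ab)(x) over the given points (1D)."""
    return sum(abs(x - t_ba(t_ab(x))) for x in points) / len(points)

points = range(10)

def forward(x):
    return x + 1.5

def exact_inverse(x):
    return x - 1.5

def inexact_inverse(x):
    return x - 1.0

consistent_error = vme(points, forward, exact_inverse)
inconsistent_error = vme(points, forward, inexact_inverse)
```

When the second transformation is the exact inverse of the first, the composition is the identity and the error is zero; otherwise the VME reports the average residual displacement.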

Another important property described in Section 3.2 is the ability to produce diffeomorphic transformations. The Jacobian determinant of a displacement field u, J_u(x), from here on referred to simply as the Jacobian, is a useful tool for interpreting the displacement field produced by the registration. In short, it quantifies the local change in volume caused by the transformation at a point in the image. It can be used not only for determining the volume change in an image element but also to determine whether the diffeomorphic property is realized. A negative value of J_u(x) signals that the displacement field inverts, or folds, the space at the point x. The Jacobian of the registration output can be computed to assess the number of foldings, |{x : J_u(x) < 0}|, providing a metric of the method's ability to produce diffeomorphic transformations, i.e. displacement fields without foldings.
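Counting foldings can be sketched for a two-dimensional displacement field. The example assumes that J_u(x) is the determinant of the Jacobian of the transformation W(x) = x + u(x), i.e. det(I + ∇u(x)), discretized with forward differences and unit grid spacing; the function names and test fields are hypothetical.

```python
def jacobian_determinants(ux, uy):
    """Determinant of the Jacobian of W(x) = x + u(x) for a 2D displacement
    field with components ux[row][col] and uy[row][col], using forward
    differences (the last row and column are therefore skipped)."""
    height, width = len(ux), len(ux[0])
    determinants = []
    for r in range(height - 1):
        row = []
        for c in range(width - 1):
            dux_dx = ux[r][c + 1] - ux[r][c]
            dux_dy = ux[r + 1][c] - ux[r][c]
            duy_dx = uy[r][c + 1] - uy[r][c]
            duy_dy = uy[r + 1][c] - uy[r][c]
            row.append((1.0 + dux_dx) * (1.0 + duy_dy) - dux_dy * duy_dx)
        determinants.append(row)
    return determinants

def count_foldings(determinants):
    """Number of points where the Jacobian determinant is negative."""
    return sum(1 for row in determinants for j in row if j < 0.0)

size = 3
zeros = [[0.0] * size for _ in range(size)]
# Uniform expansion by 50 % along both axes.
expand_x = [[0.5 * c for c in range(size)] for _ in range(size)]
expand_y = [[0.5 * r for _ in range(size)] for r in range(size)]
# Displacement that mirrors the x-axis, inverting the space.
fold_x = [[-2.0 * c for c in range(size)] for _ in range(size)]

identity_j = jacobian_determinants(zeros, zeros)
expansion_j = jacobian_determinants(expand_x, expand_y)
folding_j = jacobian_determinants(fold_x, zeros)
```

The identity transform gives a determinant of 1 everywhere, the expansion gives 2.25 (the factor by which each area element grows), and the mirrored field gives negative determinants at every evaluated point.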