KTH Royal Institute of Technology
Bachelor thesis
Reconstruction in
Computerized Tomography using Entropy Regularization
Author:
Martin Sjöberg
Theoharis Charitidis
Supervisor:
Axel Ringh
Abstract
Inverse problems are a class of problems that often appear in science. They can
be thought of as calculating the cause that corresponds to measured data, and
it is the opposite (i.e. inverse) of a direct problem, where data is calculated from
a known source. In Computerized Tomography (CT), which is an important
non-invasive imaging method in medicine, inverse problems arise when we
want to reconstruct the inner structure of the physical body from indirect
measurements. In this thesis we investigate different methods for CT image
reconstruction. We compare the reconstruction quality of Entropy as a regularizer to Tikhonov and Total Variation (TV) regularization and to the analytic Filtered Backprojection approach, and also investigate the sensitivity of entropy regularization with respect to priors.
Contents
1 Introduction
2 Background
  2.1 Tomography
    2.1.1 The Radon Transform
  2.2 Inverse Problems
  2.3 Filtered backprojection
  2.4 Optimization
    2.4.1 Tikhonov regularization
    2.4.2 Total Variation regularization
    2.4.3 Entropy regularization
3 Method
  3.1 ODL library
  3.2 Approach
4 Results
  4.1 Comparing reconstructions with different methods
  4.2 Entropy Priors
5 Discussion
A Python code utilizing ODL
1 Introduction
Inverse problems arise in many areas of science and mathematics when one wants to infer properties of something which is not directly observed. This has applications in many areas including signal processing, astronomy and medical imaging. Simply put it is the problem of finding a cause from an effect, which is the opposite of a direct problem where the result is calculated from the cause.
Computerized tomography (CT) is a method used in a number of industries and is also an active field of research. It is a non-invasive method of looking at the inner structure of objects, which is useful in industry for internal inspection of components, and one common application area is in medical imaging. The type of CT in focus here is X-ray Tomography, which is one of several types [8, p. 41-61].
The reconstruction problem of CT is an ill-posed inverse problem, and there are different ways to try to solve it. One important algorithm is the Filtered Backprojection algorithm (FBP) [8, p. 81], which is based on approximating an inverse of the Radon transform analytically. However, FBP has drawbacks, for example its sensitivity to noise and its tendency to produce artifacts (e.g. if the data is incomplete) [10]. Alternative methods have been suggested that are normally computationally more expensive but can be more robust to noise or incomplete data. One way to construct such methods is to formulate the reconstruction problem as an optimization problem, which can be solved by iterative methods. In this setting the ill-posedness can be mitigated with regularizing cost functions, where different cost functions can be used to promote different structures and features.
The main focus of this thesis is to evaluate the use of entropy as a regularizer, first by comparing maximum entropy regularization (MER) to Tikhonov and Total Variation (TV) regularization, and secondly by comparing MER to relative entropy regularization with different priors.
2 Background
2.1 Tomography
Tomography aims at creating images of sections of an object. The technique is used in industrial and scientific applications as well as in the medical field. The basic technique works by shooting rays through the analyzed object from several angles, which are then captured by a detector on the opposite side (see figure 1). This method is based on what is called the Radon Transform [7].
Figure 1: Illustration of how tomographic measurements are made (from [7], p. 28).
2.1.1 The Radon Transform
The Radon Transform is the foundation of the reconstruction process in tomography. The idea is that by measuring line integrals from several angles through a body, described by a function f, one can recover the original function f. In the case of tomography the function f corresponds to the image we want to reconstruct, and the line integrals correspond to the rays that travel through the analyzed object.
If we shoot one ray with known intensity I_0 and measure the intensity I on the opposite side of the body, the ratio between them can be approximated by the following line integral:
\[
\log\frac{I_0}{I} \approx \int_a^b f(x)\,dx, \tag{1}
\]
where f(x) is the attenuation along the path of the ray, and a and b are the endpoints of the path [7, p. 24]. The forward operator in CT, called the Radon transform (see figure 2), is based on this and is defined as
\[
\mathcal{R}(f)(\vec{n}^{\perp}, x) = \int_{-\infty}^{\infty} f(x + t\,\vec{n}^{\perp})\,dt, \tag{2}
\]
where x is the point where the ray crosses the horizontal axis, and α is the angle \vec{n} makes with the horizontal axis. The integration is along the ray, which is perpendicular to \vec{n}, i.e., parallel to \vec{n}^{\perp}.
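To make (2) concrete, the line integral can be approximated numerically by sampling f along the ray. The following is an illustrative sketch, not part of the thesis code (the helper `ray_integral` and the test function are our own constructions); for the indicator function of the unit disk the exact line integral at offset s is the chord length 2√(1 − s²):

```python
import numpy as np

def ray_integral(f, s, alpha, t_max=2.0, n=20000):
    """Approximate the line integral (2): integrate f along the ray
    through the point s * n_vec, in the direction n_perp (midpoint rule).
    Hypothetical helper, for illustration only."""
    n_vec = np.array([np.cos(alpha), np.sin(alpha)])
    n_perp = np.array([-np.sin(alpha), np.cos(alpha)])
    dt = 2 * t_max / n
    t = -t_max + (np.arange(n) + 0.5) * dt          # midpoint samples
    points = s * n_vec[None, :] + t[:, None] * n_perp[None, :]
    return np.sum(f(points)) * dt

# f = indicator of the unit disk; the exact integral is the chord length.
disk = lambda p: (np.sum(p**2, axis=1) <= 1.0).astype(float)

approx = ray_integral(disk, s=0.6, alpha=0.3)
exact = 2 * np.sqrt(1 - 0.6**2)                     # chord length at s = 0.6
```

With 20000 sample points the midpoint rule agrees with the analytic chord length to well below one percent.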
Figure 2: Illustration of how the Radon Transform conceptually works from en.wikipedia.org.
2.2 Inverse Problems
In the following subsection we present some general theory for inverse problems. For a more thorough introduction see, e.g., [7].
Unlike direct problems, which consist of finding the result from a cause, inverse problems consist of finding the cause from a result. For instance, if the colors of an image are inverted, then the inverse problem is to invert them back to the original. One can move seamlessly between the inverted image and the original without any loss of information or error, because this example problem is well-posed. The direct problem is formulated as computing m = Ax, and the inverse problem is to find x, that is, to solve Ax = m for x. A is called the forward map, and in the context of tomography it is the Radon transform, sometimes referred to as the ray transform.
Figure 3: Illustration of color inversion of an image (from pexels.com).
However, the methods used in this paper are formulated for ill-posed inverse problems, which are inverse problems that are more difficult to solve. To understand this better we use the notion of a well-posed problem, introduced by Jacques Hadamard (1865-1963). The inverse problem Ax = b is said to be well-posed if

(A1) Existence: there is at least one solution.
(A2) Uniqueness: only one solution exists.
(A3) Stability: the solution depends continuously on the data.

An ill-posed problem is one which fails to fulfill at least one of the above conditions.
Example 1: Consider the equation
\[
Ax = b \iff
\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
=
\begin{pmatrix} b_1 \\ b_2 \end{pmatrix}, \tag{3}
\]
with b_2 ≠ 0. Since 0 · x_1 + 0 · x_2 = 0 ≠ b_2, a solution does not exist; the problem fails to satisfy (A1). This occurs in tomography as well, where noise is prevalent in the measurement.
Example 2: Consider the equation
\[
Ax = b \iff
\begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
=
\begin{pmatrix} b_1 \\ b_2 \end{pmatrix}, \tag{4}
\]
where b_1, b_2 are constants. It is easily seen that any vector of the form
\[
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
=
\begin{pmatrix} b_1 - s \\ b_2 \\ s \end{pmatrix}, \tag{5}
\]
for s ∈ ℝ, is a solution. Thus this system does not have a unique solution, hence violating (A2).
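Both failure modes can be verified numerically. A small sketch (the matrices are those of Examples 1 and 2; the concrete right-hand sides are arbitrary illustrative choices):

```python
import numpy as np

# Example 1: no solution exists when b2 != 0 (violates A1).
A1 = np.array([[0., 1.], [0., 0.]])
b = np.array([3., 1.])                  # b2 = 1 != 0
# The second equation reads 0*x1 + 0*x2 = b2, which no x can satisfy;
# even the least-squares "solution" leaves the second row unmatched:
x_ls, residual, *_ = np.linalg.lstsq(A1, b, rcond=None)

# Example 2: infinitely many solutions (violates A2).
A2 = np.array([[1., 0., 1.], [0., 1., 0.]])
b2 = np.array([5., 2.])
# the family (b1 - s, b2, s) from (5), for a few values of s:
solutions = [np.array([5. - s, 2., s]) for s in (0.0, 1.0, -3.7)]
```

Every member of `solutions` satisfies A2 x = b2 exactly, while no x satisfies A1 x = b.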
We consider the general mathematical formulation of an inverse problem, which is to solve the following equation for f:
\[
m = Af + \epsilon, \tag{6}
\]
where A : X → Y is the ray transform, mapping the image f ∈ X into the data space Y (represented in figure 4b as a sinogram), thus producing the measured data m ∈ Y. It should be noted that the direct problem must be well-posed: A must be well defined, single valued and continuous. The vector ε ∈ Y corresponds to the error caused by noise in the measurement.

If f is the image, then arguably the solution could be approximated as
\[
A^{-1} m \approx f, \tag{7}
\]
however, reconstructing the CT image using (7) will not work because the ray transform A is not invertible.
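Even when a discretized forward map happens to be invertible, it is typically so ill-conditioned that a naive inversion like (7) amplifies measurement noise enormously. A sketch of this effect, using a Hilbert matrix as a stand-in for an ill-conditioned forward map (illustrative only; it is not an actual ray transform):

```python
import numpy as np

# A Hilbert matrix as a stand-in for a discretized, ill-conditioned
# forward map (the real ray transform is not even invertible).
n = 10
A = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])

f_true = np.ones(n)
m_clean = A @ f_true

rng = np.random.default_rng(0)
noise = 1e-6 * rng.standard_normal(n)       # tiny perturbation of the data
f_naive = np.linalg.solve(A, m_clean + noise)

cond = np.linalg.cond(A)                    # enormous (~1e13 for n = 10)
err = np.linalg.norm(f_naive - f_true) / np.linalg.norm(f_true)
# the relative error in f is many orders of magnitude larger than
# the relative error in the data, illustrating the failure of (A3)
```

This is exactly the instability that the regularized formulations in section 2.4 are designed to suppress.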
2.3 Filtered backprojection
Initially in X-ray tomography, the filtered backprojection (FBP) algorithm was used to create images from a finite data set. Even today FBP holds a strong presence in the tomography industry, arguably because of its accuracy, easy implementation and computational speed.
The downside of FBP is that it needs evenly spaced angles in its measurements to work properly. If part of the data is missing or corrupted, the end result will contain line artifacts that affect the overall image quality.
FBP is an analytical approach, unlike the iterative methods described later. Simply put, this algorithm can be thought of as solving the reconstruction problem by approximating the inverse of the forward map connected to the specific space, i.e., it is a pseudo-inverse of the ray transform [10].
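The "filtered" part of FBP can be illustrated in isolation: each measured projection is filtered with the ramp filter |ω| in the Fourier domain before being backprojected over the image. A minimal sketch of that filtering step only (illustrative; this is not the implementation used in this thesis, and the backprojection step and all sampling details are omitted):

```python
import numpy as np

def ramp_filter(projection):
    """Apply the ramp filter |omega| to a single 1-D projection
    (the filtering step of FBP, illustrative only)."""
    n = len(projection)
    freqs = np.fft.rfftfreq(n)            # nonnegative frequencies
    spectrum = np.fft.rfft(projection)
    return np.fft.irfft(spectrum * freqs, n=n)

p = np.cos(np.linspace(0, 2 * np.pi, 64, endpoint=False)) + 5.0
filtered = ramp_filter(p)
# the ramp filter zeroes the DC component, so the filtered projection
# has (numerically) zero mean
```

The high-pass nature of the ramp filter is also one reason FBP amplifies measurement noise.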
2.4 Optimization
As described earlier, the naive reconstruction (7) will not suffice to solve the inverse problem. Instead, reconstruction methods can be formulated using optimization. A broad class of methods is captured by the following optimization problem:
\[
\min_{f} \; \|Af - m\|_2^2 + \alpha F(f), \tag{8}
\]
where F(f) is the regularizing functional and α > 0 is the regularization parameter, meaning that it tunes the balance between the regularizing term and the data-matching term. Note that the problem at hand is convex if F is convex. What separates the different methods in this context is the choice of the regularization functional F(f), which can depend on our prior information about the solution. Such information about the solution is often referred to as the prior [7].
2.4.1 Tikhonov regularization
Tikhonov regularization is a classical method for solving ill-posed inverse problems, owing to its ability to provide smoothing and to incorporate known properties of the solution.
The classical Tikhonov regularization is formulated as
\[
\min_{f \geq 0} \; \|Af - m\|_2^2 + \alpha \|f\|_2^2, \tag{9}
\]
where α > 0 acts as the regularization parameter, which as stated earlier tunes the balance between the regularizing term and the data-matching term.
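If the nonnegativity constraint in (9) is dropped, the minimizer has the well-known closed form f = (AᵀA + αI)⁻¹Aᵀm, obtained from the normal equations. A small sketch of this, with a random matrix standing in for the ray transform (illustrative only; the thesis solves (9) iteratively with the constraint):

```python
import numpy as np

def tikhonov(A, m, alpha):
    """Unconstrained minimizer of ||Af - m||_2^2 + alpha ||f||_2^2 via
    the normal equations (the f >= 0 constraint in (9) is ignored here)."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ m)

rng = np.random.default_rng(1)
A = rng.standard_normal((30, 20))       # stand-in for a discretized forward map
m = rng.standard_normal(30)             # stand-in for noisy data

f_small = tikhonov(A, m, alpha=1e-3)
f_large = tikhonov(A, m, alpha=1e3)
# a larger alpha shrinks the solution toward zero, trading data fit
# for regularity
```

The gradient of the objective, 2Aᵀ(Af − m) + 2αf, vanishes at the returned solution, confirming optimality.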
2.4.2 Total Variation regularization
Total Variation (TV) is another well-known method for solving ill-posed inverse problems. Unlike Tikhonov and FBP, which produce fairly smooth reconstructions, TV has the ability to suppress noise while preserving edges. This comes at the cost of losing fine details, but gives more definition along the sharp edges of a reconstruction [7].
The mathematical formulation of Total Variation regularization is
\[
\min_{f \geq 0} \; \|Af - m\|_2^2 + \alpha \|\nabla f\|_1, \tag{10}
\]
where ∇f is the spatial gradient of f. The regularizing functional in TV is similar to the one in Tikhonov regularization, but here it is the 1-norm of the spatial gradient of f instead. The regularization parameter α > 0 acts as the "tuning" parameter here as well: it can be modified to decide how much we "trust" the measured data. This is a non-smooth problem due to the 1-norm, wherefore we need a solver able to handle non-smooth optimization.
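On a discrete 1-D signal the TV regularizer is simply the summed absolute differences, which shows why it tolerates sharp edges but penalizes oscillating noise. An illustrative sketch (the thesis uses the 2-D analogue):

```python
import numpy as np

def total_variation(f):
    """Discrete 1-D total variation: the 1-norm of the finite-difference
    gradient (illustrative sketch)."""
    return np.abs(np.diff(f)).sum()

step = np.array([0., 0., 0., 1., 1., 1.])     # one sharp edge
ramp = np.linspace(0., 1., 6)                 # same endpoints, gradual slope
noisy = step + np.array([0., .1, -.1, .1, -.1, 0.])

tv_step = total_variation(step)    # a single jump costs exactly its height
tv_ramp = total_variation(ramp)    # the same total rise costs the same
tv_noisy = total_variation(noisy)  # oscillations add extra TV cost
```

A sharp edge and a gradual ramp with the same total rise have equal TV, while superimposed oscillations strictly increase it, which is why minimizing (10) removes noise without blurring edges.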
2.4.3 Entropy regularization
The mathematical notion of entropy, closely related to and sometimes used interchangeably with the Kullback-Leibler (KL) divergence [6], was introduced by C. E. Shannon [2, 9] in the field of information theory. The use of so-called maximum entropy regularization (MER) has been investigated for solving inverse problems [2, 5], including image reconstruction [10].
According to Tseitlin et al. [10] there have been a small number of conflicting reports on the use of entropy for magnetic resonance imaging (MRI). In that setting they conclude that it can be better than FBP at suppressing baseline noise, and that it can produce reconstructions without the characteristic artifacts of FBP when using few projection angles and/or projections without a uniform angle distribution [10].
Here, MER is formulated as
\[
\min_{f \geq 0} \; \|Af - m\|_2^2 + \alpha H(f), \tag{11}
\]
where H is the functional
\[
H(f) =
\begin{cases}
\displaystyle\int f(x) - g(x) + g(x) \log\!\left(\frac{g(x)}{f(x)}\right) dx & \text{if } f > 0, \\[1ex]
+\infty & \text{otherwise,}
\end{cases} \tag{12}
\]
with prior g ≡ 1. Inserting a prior g(x) > 0 corresponds to the relative entropy of g with respect to f. We consider the use of MER and relative entropy separately, using different priors g(x) in the latter method.
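The integrand in (12) is pointwise nonnegative and vanishes exactly where f equals the prior g, which is what makes H usable as a regularizer: it penalizes deviation from the prior. A discretized sketch of this property (illustrative only, not the thesis code):

```python
import numpy as np

def entropy_functional(f, g, dx):
    """Discretization of (12): integral of f - g + g*log(g/f),
    for strictly positive f and g (illustrative sketch)."""
    if np.any(f <= 0):
        return np.inf
    return np.sum(f - g + g * np.log(g / f)) * dx

x = np.linspace(0., 1., 100, endpoint=False)
dx = x[1] - x[0]
g = np.ones_like(x)                 # the 'dumb' MER prior g == 1

h_at_prior = entropy_functional(g, g, dx)   # 0: f == g costs nothing
h_elsewhere = entropy_functional(g + 0.5 * np.sin(2 * np.pi * x) + 0.6, g, dx)
# any positive f != g gives a strictly positive value
```

Minimizing (11) therefore balances data fit against closeness to the prior, exactly as the α-weighted terms in the earlier formulations.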
3 Method
The goal of this thesis is twofold. First we compare the reconstruction quality of Filtered Backprojection (FBP), Tikhonov, Total Variation (TV) and entropy regularization. Secondly, we test the entropy regularization's sensitivity with respect to priors.
3.1 ODL library
The Operator Discretization Library (ODL) is an open-source Python library for prototyping reconstruction methods for inverse problems, and is used for all computation and reconstruction in this thesis [1].
3.2 Approach
For the specifics of how ODL is used, the complete code can be seen in Appendix A. A more in-depth view of the package, with example code etc., can be found in its documentation. ODL works on a high level, and the built-in solver used, the Douglas-Rachford primal-dual splitting algorithm [3], can be utilized with minimalistic code. We also utilize the ASTRA toolbox [11] for computing the ray transform.
The phantom we use is the Shepp-Logan phantom, which can be seen in Figure 4a; the corresponding data generated by the ray transform can be seen in Figure 4b. The data is generated using a uniform angle partition of 180 angles over π radians, and white Gaussian noise at a level of approximately 5 % relative to the data is added; the noisy data can be seen in Figure 4c. We also use a detector partition of 256 'bins'. After the reconstruction methods are defined, we try different regularization parameters and select the best reconstructions by visual assessment. The side-by-side comparisons can be seen in the results section.
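The noise level described above can be produced by scaling white Gaussian noise to a fixed fraction of the norm of the clean data. A sketch of this step with plain NumPy arrays (the array `data` is a stand-in for the 180 × 256 sinogram in figure 4b; this is not the ODL-based code of Appendix A):

```python
import numpy as np

rng = np.random.default_rng(42)
data = rng.random((180, 256))          # stand-in for the 180x256 sinogram

# scale white Gaussian noise so that ||noise|| is 5% of ||data||
raw = rng.standard_normal(data.shape)
noise = 0.05 * (np.linalg.norm(data) / np.linalg.norm(raw)) * raw
noisy_data = data + noise

rel_level = np.linalg.norm(noisy_data - data) / np.linalg.norm(data)
```

By construction the relative noise level comes out at exactly 5 %, matching the setup used for all reconstructions below.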
The second part of this paper is focused on testing entropy regularization's sensitivity to priors. This is done by using different priors, where some are created by modifying the original Shepp-Logan phantom (e.g. by removing or adding some features), and one reconstruction is done using the Forbild phantom as a prior, which can be considered a 'bad' prior. This is done to investigate whether the algorithm performs well even though the prior is bad.
[Figure panels: (a) Phantom, (b) Noise-free data, (c) Data.]
Figure 4: The Shepp-Logan phantom (a), the corresponding data (b) and data with additive white Gaussian noise (c).
4 Results
As stated above, white Gaussian noise is added to the data at a level of approximately 5 %. This is used for all reconstructions. Below, the best reconstructions are presented for FBP, Tikhonov, TV and Entropy respectively.
4.1 Comparing reconstructions with different methods
In figure 5 below is a side-by-side comparison of FBP, Tikhonov, TV and Maximum Entropy. As can be seen in figure 5a, FBP does not suppress much of the noise, and the noise is present fairly uniformly over the entire image, independently of the underlying density of the phantom. The three smaller dots in the lower region are not very distinct, and if we did not know they were there to begin with we would not be able to tell for sure.
This stands in strong contrast to Tikhonov (figure 5b), which does a lot better at suppressing the noise in the regions with low density, e.g. on the outside of the body and the inside of the two bigger ellipses inside the phantom. However, Tikhonov does not handle the noise as well in regions with higher density, i.e., the lighter gray areas. As with FBP, this makes it difficult to discern the smaller dots from the surrounding noise, so we cannot claim it is better than FBP in that respect.
Looking at figure 5c, Total Variation does a tremendous job at suppressing noise all over and smoothing out the image, which is essentially what was expected. The strong edges between areas are well preserved when they differ enough in contrast, but when the difference in density is smaller it does not perform as well. This can be seen by the three smaller dots being difficult to discern from one another.
Lastly we look at Maximum Entropy in figure 5d. There is a decent amount of noise suppression, especially on the outside of the phantom, but the noise present inside the body is almost on the same level as with Tikhonov. The main drawback that can be observed is bad edge definition, which looks almost like a 'glow', especially visible on the outer rim of the body where the stark contrast between white and black meet. This also makes the edges of features inside the body worse defined than in all the other reconstructions.
[Figure panels: (a) Filtered Backprojection; (b) Tikhonov, α = 1.2; (c) Total Variation, α = 0.02; (d) Maximum Entropy, α = 1.0.]
Figure 5: Reconstructions using FBP (a), Tikhonov (b), TV (c) and MER (d).
4.2 Entropy Priors
Earlier we used so-called maximum entropy for reconstruction (see figure 5d). This means setting the prior in (12) to the constant 1, i.e., g(x) ≡ 1. Simply put, it is a prior that is white on the entire domain, which can be seen as a 'dumb' prior in the sense that it does not use any prior information about the body under study. Now we investigate the use of a few different priors to see how much they can affect the reconstruction. We use both priors that are close to the Shepp-Logan phantom (figure 4a) but with a few features changed, and some that do not resemble the phantom as much. Below (figures 6-9) all priors are shown side-by-side with their resulting reconstructions.
Looking at the first one in figure 6, the Forbild phantom is used as a prior and the reconstruction is seen in 6b. This prior obviously differs from the Shepp-Logan phantom, and the effect on the reconstruction is clearly seen: artifacts are introduced from the prior, which does not result in a good reconstruction.
In figure 7 a prior is used where the three small dots in the lower half of the phantom have been removed. The major thing to notice is the difference in sharpness, contrast and edge-preservation compared to maximum entropy (compare figures 5d and 7b). On the other hand, the three small dots are still hard to see.
A prior with another feature removed can be seen in figure 8a. This is similar in result to the one used above; the difference is that now we removed one of the medium-sized dots in the middle. The same advantages can be seen here, e.g. a major improvement in edge-preservation and contrast. Even in this case the dot is visible to some degree, but it is hard to tell if one would notice it if it was not known beforehand that it is supposed to be there.
Lastly we use the same phantom and prior as in figure 7, but interchange how they are used, i.e., we use figure 7a as the phantom and the Shepp-Logan phantom as the prior. This means that the measurement data is different, because we use figure 7a as the data-generating phantom. Thus, the data does not contain the three small dots, i.e., they do not exist in the actual body; they are only introduced through the prior. This can be thought of as using prior information that the body used to contain the three dots but no longer does. In this case the reconstruction shows the dots clearly, and thus introduces false structures in the reconstruction.
[Figure panels: (a) Prior, (b) Reconstruction.]
Figure 6: Reconstruction (b) using the Forbild phantom (a) as prior.
[Figure panels: (a) Prior, (b) Reconstruction.]
Figure 7: Reconstruction (b) using a modified SL phantom (a) as prior.
[Figure panels: (a) Prior, (b) Reconstruction.]
Figure 8: Reconstruction (b) using a modified SL phantom (a) as prior.
[Figure panels: (a) Prior, (b) Reconstruction.]