DEGREE PROJECT IN TECHNOLOGY, FIRST CYCLE, 15 CREDITS
STOCKHOLM, SWEDEN 2016
Information Theoretic Similarity Measures for Robust Image Matching
Multimodal Imaging - Infrared and Visible light
JAMILA YUSUF ISSE
CHAIMAE EL GHOUCH
Table of contents
1. Introduction
1.1 Problem definition
1.2 Scope and constraints
2. Background
2.1 Image matching
2.1.1 Area-based methods
2.1.2 Feature-based methods
2.2 Similarity measures
2.2.1 Mutual information (MI)
2.2.2 Cross-cumulative residual entropy (CCRE)
2.2.3 Sum of conditional variances (SCV)
2.3 Multimodality
2.4 Infrared - Visible Imaging
2.4.1 Electromagnetic radiation (EM)
2.4.1.1 Visible light
2.4.1.2 Infrared light
2.4.2 Optical remote sensing - utilizing both Infrared and visible light in imaging
3. Method
3.1 Data selection
3.1.1 Data sets
3.2 Experiment
3.2.1 Experimental setup A
3.2.2 Experimental setup B
4. Results
4.1 Experimental setup A
4.1.1 Thermal camera - more details
4.1.2 Satellite image - less details
4.2 Experimental setup B
4.2.1 Thermal camera pair - more details
4.2.2 Satellite pair - less details
5. Discussion
5.1 Experiment evaluation
5.1.1 Experimental setup A - Working with one modality
5.1.1.1 Experimental setup A - Recommendation
5.1.2 Experimental setup B - Working with two modalities
5.1.2.1 Experimental setup B - Recommendation
6. Conclusion
1. Introduction
Given an image, one challenge is to determine whether or not the image contains a specific object or feature. Humans solve this task easily because they recognize objects by structuring them into categories and matching them on their characteristics [1]. Despite the systems and technologies developed to date, the problem remains difficult because the matching requires reasoning over various image attributes and extensive knowledge representation. Nevertheless, extensive studies in the area of image matching and registration have produced more robust and accurate techniques, and this progress has benefited many different fields.
A similarity measure can be regarded as a tool for evaluating the spatial correspondence of images, and it plays a fundamental role in image matching and registration [2]. Several similarity measures exist, and whether a given measure is applicable depends on the data involved. For images of different modalities, information theoretic similarity measures are more applicable than other standard similarity measures because of their strong roots in mathematics and their ability to detect nonlinear changes in intensity. This concept was introduced with mutual information, which measures the information that two random variables have in common. Mutual information (MI) made it possible to quantify similarity with greater robustness, which in turn led to the introduction of further measures such as cross-cumulative residual entropy (CCRE) and sum of conditional variances (SCV).
challenges occur in the stage of quantifying the similarities, as these changes may not be linear transformations but rather nonlinear ones, as is often the case when dealing with multimodality [3]. Intensity changes are among the most common challenges, and images with such differences are matched regularly in many fields. Fields such as satellite imaging and medical imaging share this problem because they use imaging with various types of light. Infrared light is used in both of these fields, especially the former, and images taken with infrared light are commonly compared with images taken with visible light.
1.1. Problem definition
This report investigates image matching under nonlinear brightness changes, such as those that arise in multimodal imaging. The aim is to study three different similarity measures and their performance on different sets of data, including multimodal images taken with infrared and visible light. The investigation aims to answer the following questions: How do these similarity measures perform on different data, and how is their performance affected by multimodal inputs in conjunction with the variations in data?
1.2. Scope and constraints
The focus of this report is on image matching under differences in image modality in conjunction with differences in data inputs. The interest is in testing three information theoretic similarity measures, which is done using images taken in the infrared and visible light modalities. The similarity measures in question are mutual information, cross-cumulative residual entropy and sum of conditional variances.
Consequently, feature-based methods are usually employed for images that contain enough distinctive and easily detectable objects. This is usually the case in remote sensing and computer vision applications, where the typical images contain a lot of detail (towns, rivers, roads and forests). Area-based methods, on the other hand, are frequently used in the medical field because medical images are not as rich in such detail. Recently, registration methods that use area-based and feature-based approaches simultaneously have started to appear [4]. However, a key issue in image matching is the choice of similarity measure, that is, the measure used to quantify the match between entities.
2.2. Similarity measures
Similarity measures are crucial when solving problems such as pattern recognition, clustering and classification [10]. One issue to consider when selecting a similarity measure is the modalities involved. When images are from the same modality, similarity measures such as sum of squared differences (SSD) or correlation ratio (CR) are useful because images of the same type have similar intensities in corresponding areas [11]. To deal with multimodal images, however, a similarity measure must be robust enough to handle transformations such as the nonlinear brightness changes caused by differences in modality [2]. Information theoretic similarity measures are more suitable here because they measure the statistical relationship between pixel intensities and can therefore detect nonlinear changes [12]. The information theoretic similarity measures explained further in sections 2.2.1 to 2.2.3 are mutual information (MI), cross-cumulative residual entropy (CCRE) and sum of conditional variances (SCV).
2.2.1. Mutual information (MI)
Mutual information (MI) was introduced as a similarity measure by Viola and Maes (1997) [15,16] and is suitable when dealing with multimodality. MI measures the amount of information that one variable (image) contains about another, and it tends to be maximized when both images are in geometric alignment. MI assumes a statistical relationship between the images and analyzes the joint entropy of the observed images, which we denote as variables. It derives from the formal definition of entropy as the amount of uncertainty about a certain variable [13]:
$$H(X) = -\sum_{x \in X} P(x) \log P(x)$$
where P(x) is the probability distribution of the variable. MI combines the individual entropies of the two variables with their joint entropy, which gives the following [14]:
$$MI(X, Y) = H(X) + H(Y) - H(X, Y) \tag{1}$$
MI between two images can be maximized by maximizing the individual entropies and minimizing the joint entropy. Although MI has been a useful tool in image registration, it has the drawback that it does not take neighbourhood relationships into account, nor does it consider the spatial correspondence that exists among pixels [15].
2.2.2. Cross-cumulative residual entropy (CCRE)
Cross-cumulative residual entropy was introduced by Wang and Vemuri (2006) as an information theoretic similarity measure. It measures entropy using cumulative distributions and derives from the cumulative residual entropy (CRE). The key strengths of CCRE over MI are its significantly larger noise immunity and its much larger convergence range over the field of transformations [16]. Given two images X and Y, CCRE is defined as follows:
$$C(X, Y) = \xi(X) - E[\xi(X \mid Y)] \tag{2}$$
where $\xi(X) = -\int_{\mathbb{R}_+} F(\lambda) \log F(\lambda) \, d\lambda$ and $F(\lambda) = P(|X| > \lambda)$.
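As an illustration of equations (1) and (2), the following is a minimal sketch in Python rather than the implementation used in this report. It assumes that each image has been flattened into a list of non-negative integer intensities already quantised to a small number of levels, and that the integral in the definition of ξ is approximated by a sum over integer levels with unit step. The function names (`entropy`, `mutual_information`, `cre`, `ccre`) and the `levels` parameter are hypothetical and chosen only for this example.

```python
from collections import Counter, defaultdict
from math import log


def entropy(counts, total):
    """Shannon entropy -sum P log P (natural log) from raw occurrence counts."""
    return -sum(c / total * log(c / total) for c in counts if c > 0)


def mutual_information(x, y):
    """Equation (1): H(X) + H(Y) - H(X, Y) for two equal-length intensity lists."""
    n = len(x)
    h_x = entropy(Counter(x).values(), n)
    h_y = entropy(Counter(y).values(), n)
    h_xy = entropy(Counter(zip(x, y)).values(), n)  # joint histogram of pixel pairs
    return h_x + h_y - h_xy


def cre(values, levels):
    """Cumulative residual entropy: -sum over levels of F log F, F = P(X > level)."""
    n = len(values)
    counts = Counter(values)
    remaining = n  # number of samples strictly above the current level
    total = 0.0
    for level in range(levels):
        remaining -= counts.get(level, 0)
        if remaining > 0:
            survival = remaining / n
            total -= survival * log(survival)
    return total


def ccre(x, y, levels):
    """Equation (2): CRE of X minus the expected CRE of X given each value of Y."""
    n = len(x)
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[yi].append(xi)  # intensities of X conditioned on Y = yi
    conditional = sum(len(g) / n * cre(g, levels) for g in groups.values())
    return cre(x, levels) - conditional
```

Both functions return larger values the more one image tells about the other, so a nonlinear but deterministic intensity mapping between the two inputs scores high while statistically independent inputs score close to zero. Unlike MI, the sketched CCRE is not symmetric in its arguments, since only X is described by its cumulative distribution.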
group of clustered pixels in one image should be clustered in a similar way in the other image. This approach was shown to be rather robust against nonlinear brightness changes and to perform better while having a lower computational complexity [2].
To calculate the sum of conditional variances, two images are taken: X, the so-called reference image, and Y, the target image. The target image Y is partitioned into $n_b$ disjoint bins $Y(j)$ that correspond to the intensity regions $X(j)$ of the reference image, where $j = 1, ..., n_b$. The matching value is then obtained by summing the variances of the intensities within each bin $Y(j)$ of the target image:
$$S_{SCV}(X, Y) = \sum_{j=1}^{n_b} E\left[(Y_i - E(Y_i))^2 \mid X_i \in X(j)\right] \tag{3}$$
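The partition-and-sum procedure of equation (3) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this report: both images are assumed to be flattened lists of integer intensities in the range 0 to `levels - 1`, the reference intensity range is split into `n_bins` equally wide regions, and the function name `sum_of_conditional_variances` and its parameters are hypothetical.

```python
from collections import defaultdict


def sum_of_conditional_variances(reference, target, n_bins, levels=256):
    """Equation (3): sum over reference-intensity bins of the target variance."""
    bins = defaultdict(list)
    for x_i, y_i in zip(reference, target):
        j = x_i * n_bins // levels  # index of the reference intensity region X(j)
        bins[j].append(y_i)         # target pixel assigned to bin Y(j)

    total = 0.0
    for ys in bins.values():
        mean = sum(ys) / len(ys)                             # E(Y_i) within the bin
        total += sum((y - mean) ** 2 for y in ys) / len(ys)  # conditional variance
    return total
```

In contrast to MI and CCRE, a lower value indicates a better match here: if the target intensities are any fixed function of the reference intensities and every reference level has its own bin, each bin holds a single target value and the sum is zero.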
2.5.1.1. Visible light
With a frequency of 400 THz to 800 THz and a wavelength of 740 nm to 380 nm, visible light is only a small part of the electromagnetic spectrum, and it is the only part of the spectrum that can be seen with human eyes. The most important characteristic of visible light is often said to be color. Light at the low-frequency end of the visible spectrum, with a longer wavelength, is seen as red, whereas light in the middle of the visible spectrum is seen as green and light at the high-frequency end is seen as violet, as shown in figure 2.5.1 [20].
Visible light is used in imaging in many cases where the property of light scattering is exploited. When light hits an object it is either absorbed by the object or it changes direction; the latter is referred to as scattering [21]. Light scatters in different ways depending on the material of the object it hits, and radiation of different wavelengths is emitted. This can be used to identify materials and to distinguish between them [22]. In the medical optical field, for instance, light scattering is utilized when visualizing soft tissues. It helps to distinguish between tissues because different types of tissue absorb and scatter light in different ways [23].
2.5.1.2. Infrared light
Infrared light has a frequency of roughly 300 GHz to 400 THz and a wavelength of roughly 1 mm to 740 nm, and it is invisible to the human eye. This type of light can be felt as heat, as infrared radiation is able to transfer heat. Infrared light can be used in a variety of ways, and its spectrum is divided by wavelength: the shorter wavelengths are called near-infrared, the longer wavelengths are called far-infrared, and the intermediate wavelengths in between are called mid-infrared (see figure 2.5.1) [24].
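The frequency and wavelength limits quoted for a band are tied together by the relation between the speed of light c, the frequency f and the wavelength λ. As a rough check of the visible band limits, assuming propagation in vacuum with c ≈ 2.998 × 10^8 m/s:

$$\lambda = \frac{c}{f}, \qquad \frac{2.998 \times 10^{8}\ \text{m/s}}{400 \times 10^{12}\ \text{Hz}} \approx 750\ \text{nm}, \qquad \frac{2.998 \times 10^{8}\ \text{m/s}}{800 \times 10^{12}\ \text{Hz}} \approx 375\ \text{nm}$$

These values agree with the quoted limits of 740 nm and 380 nm to within the rounding of the band edges, and the 400 THz edge is also the boundary between visible and infrared light.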