## Maximum Entropy Matching: An Approach to Fast Template Matching

### Frans Lundberg

### October 25, 2000

**Contents**

**1 Introduction**

**2 Maximum Entropy Matching**

2.1 The cornerstones of Maximum Entropy Matching

2.2 Bitset creation

2.3 Bitset comparison

**3 PAIRS and the details of the bitset comparison algorithm**

3.1 PAIRS

3.2 Motivation for PAIRS

3.3 The bitset comparison algorithm

3.4 Implementation issues

3.5 Speed

**4 A comparison between PAIRS and normalized cross-correlation**

4.1 Test setup

4.2 Generation of image distortions

4.2.1 Gaussian noise, NOISE

4.2.2 Rotation of the image, ROT

4.2.3 Scaling of the image, ZOOM

4.2.4 Perspective change, PERSP

4.2.5 Salt and pepper noise, SALT

4.2.6 A gamma correction of the intensity values, GAMMA

4.2.7 NODIST and STD

4.3 Relevance of the distortions

4.4 Results

4.5 Other template sizes

4.6 Performance using other images

**5 Statistics of PAIRS bitsets**

5.1 Statistics of acquired bitsets

**1 Introduction**

One important problem in image analysis is the localization of a template in a larger image. Applications where a solution to this problem can be used include tracking, optical flow, and stereo vision. The matching method studied here solves this problem by defining a new similarity measurement between a template and an image neighborhood. This similarity is computed for all possible integer positions of the template within the image. The position with the highest similarity is considered to be the match. The similarity is not necessarily computed from the original pixel values directly, but can of course be derived from higher-level image features.

The similarity measurement can be computed in different ways, and the simplest approaches are correlation-type algorithms. Aschwanden and Guggenbühl [2] have compared such algorithms. One of the best and simplest algorithms they tested is normalized cross-correlation (NCC). This algorithm has therefore been used as the reference for the PAIRS algorithm, which was developed by the author and is described in this text. PAIRS uses a completely different similarity measurement based on sets of bits extracted from the template and the image.

This work is done within WITAS, a project dealing with UAVs (unmanned aerial vehicles). Two specific applications of the developed template matching algorithm have been studied.

1. Tracking of cars in video sequences from a helicopter.

2. Computing optical flow in such video sequences in order to detect moving objects, especially vehicles on roads.

The video from the helicopter is in color (RGB), and this fact is used in the presented tracking algorithm. The PAIRS algorithm has been applied to these two applications and the results are reported.

A part of this text will concern a general approach to template matching called Maximum Entropy Matching (MEM) that is developed here. The main idea of MEM is that the more data we compare on a computer, the longer it takes; therefore the data we compare should carry maximum average information, that is, maximum entropy. We will see that this approach can be used to create template matching algorithms that are on the order of 10 times faster than correlation (NCC) without decreasing the performance.

**2 Maximum Entropy Matching**

**2.1 The cornerstones of Maximum Entropy Matching**

The purpose of template matching in image processing is to find the displacement $\mathbf{r}$ such that the image function $I(\mathbf{x} - \mathbf{r})$ is as similar as possible to the template function $T(\mathbf{x})$. This can be expressed as

$$\mathbf{r}_{match} = \operatorname*{argmax}_{\mathbf{r}} \, \mathrm{similarity}\big(T(\mathbf{x}),\, I(\mathbf{x} - \mathbf{r})\big). \tag{1}$$

Maximum Entropy Matching (MEM) and the PAIRS method described later are valid for all types of discretely sampled signals of arbitrary dimension, but here we will discuss the specific case of template matching of RGB images.

The difficult part with template matching is to find a similarity measurement that will give a displacement of the template which corresponds to the real displacement of the signal in the world around us. For many applications it is difficult to even define this ideal displacement, since the difference between the image neighborhood and the template does not consist of a pure translation. This fact makes it difficult to compare different template matching algorithms. Furthermore, the similarity measurement that should be used is application dependent. For example, rotation invariance might be wanted for one application, but not for another.

Maximum Entropy Matching does not necessarily lead to a similarity measurement that is better than others, but it aims to increase the speed of the template matching while keeping the performance. It works by comparing derived image features of the image and the template for each possible displacement of the template. The approach is based on the following statements.

1. The less data we compare for each possible template position, the faster this comparison will be.

2. The data we compare should have high entropy.

3. On average, less data needs to be compared to conclude that two objects are dissimilar than to conclude that they are similar. This statement will be called the *fast dissimilarity principle*.

4. The data that we use for the comparison should be chosen so that the similarity measurement will be distortion persistent.

Statement 1 is true in the sense that the time to compute a similarity measurement is usually proportional to the amount of data that is compared. For correlation-type template matching, all of the pixel data in the template is used in the matching algorithm. We will see that the amount of data used for comparison can be decreased substantially using the MEM approach. The compare time also depends on the way the data is compared. Not counting normalizations, the similarity measurement for these algorithms is acquired by one multiplication and one addition for each byte of pixel data (assuming each intensity value is stored as one byte). The comparison of data for the MEM approach is done by an XOR operation and a look-up table, which is faster per byte than the correlation-type approaches and simple to implement in hardware.

Statement 2 is intuitively appealing. To increase the speed of the matching algorithm we want to use as little data as possible in the comparisons, but we wish to use as much information as possible. Therefore the data used in the comparison should have high average information, that is, high entropy. Experiments show that it is possible to reduce the original amount of data used in the comparisons on the order of 10 to 100 times while keeping good performance.

It is not very difficult to prove that the maximum entropy of digital data is only achieved when the following two criteria are fulfilled. One, the probability of each bit in the data being 1 is 0.50. Two, all bits should be statistically independent. When we talk about entropy we view the data to be compared as one random variable, and when we talk about independence of bits, the single bits are considered random variables.

Since statistical independence of the bits is necessary to achieve maximum entropy, it is natural and necessary to view the compare data as a set of bits. This view is used in MEM, where a bitset is extracted from the template and from each neighborhood in the image. The similarity measurement used to compare the image neighborhood bitset and the template bitset is simply the number of equal bits.

Lossy data compression of images is a large research area that I believe can be very useful for finding high-entropy image features that are suitable for template matching. However, the problems of finding these compare features and of compressing image data are fundamentally different, since there is no demand for image reconstruction from the compare features.

Statement 3 (the *fast dissimilarity principle*) is an important and very general statement that is valid for all types of objects built up from smaller parts. The statement comes from the fact that two objects are considered similar only if all their parts are similar. If a part of object A is dissimilar to the corresponding part of object B, we can conclude that A and B are dissimilar. If the part of A and the corresponding part of B are similar, we cannot conclude anything about the similarity of the whole objects. Therefore it usually takes less data to conclude that two objects are dissimilar than to conclude that they are similar. We will see how this statement can be used to speed up the matching algorithm.

Statement 4. In template matching no image neighborhood is identical to the template. There is always some distortion present (for example noise, rotation, or a shadow on the template), and we must try to choose the data we compare so that the similarity measurement is affected as little as possible by these distortions.

The following two sections will describe how to extract compare data, and then how to compare this data for fast, high performance template matching.

**2.2 Bitset creation**

Maximum Entropy Matching consists of two separate parts: *bitset creation* and *bitset comparison*. In the bitset creation part, a set of bits is produced, directly or indirectly from the pixel data, for the template and for each neighborhood in the image. How these bitsets are created is not determined by MEM. The optimal bitsets to extract depend on which image distortions are expected in the intended application. One example of a bitset creation algorithm is PAIRS. We demand three things from the bitset creation algorithm.

1. The created bitsets should have high entropy.

2. The created bitsets should be resistant to the image distortions that appear in the intended application.

3. The bitset creation time should be short.

In order to compare two different bitset creation algorithms we must have measurements of “high entropy”, “distortion resistance” and “bitset creation time”. We will suggest possible ways of measuring these quantities.

It is difficult to estimate the entropy of a bitset consisting of more than a few bits. If the extracted bitset only has, say, 8 bits, we can estimate the full discrete probability distribution using a database of image neighborhoods. The entropy is then computed by its definition from the probability distribution. This is possible for a 1-byte bitset, which has only 256 possible states. But for a 4-byte bitset we have $2^{32} \approx 4 \cdot 10^9$ possible states, and an explicit estimation of the full probability distribution is not possible. There are other ways to estimate entropy. In [3] a method for estimating the entropy of one-dimensional information sequences is applied to gray-scale images. The method

uses pattern matching to estimate the entropy. More about pattern matching in infor-mation theory can be found in [4].
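The 1-byte case described above can be sketched in Python. This is an illustrative snippet, not part of the original implementation: it estimates the full 256-state distribution from a sample and computes the entropy by its definition.

```python
import math
import random

def empirical_entropy_bits(bitsets):
    """Estimate the entropy (in bits) of 1-byte bitsets by
    estimating the full 256-state probability distribution."""
    counts = {}
    for b in bitsets:
        counts[b] = counts.get(b, 0) + 1
    n = len(bitsets)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

random.seed(0)
# Uniform random bytes approach the 8-bit maximum entropy...
uniform = [random.randrange(256) for _ in range(100000)]
# ...while a heavily skewed source has much lower entropy.
skewed = [random.choice([0, 0, 0, 255]) for _ in range(100000)]
h_uniform = empirical_entropy_bits(uniform)
h_skewed = empirical_entropy_bits(skewed)
```

For the uniform sample the estimate approaches the maximum of 8 bits; the skewed two-state source stays below 1 bit.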

Since the entropy is difficult to estimate we can instead use a measurement of how close to maximum entropy the data is. Assuming we have a database of image neighborhoods, we can find the probabilities of each bit being set to 1. These probabilities should be close to 0.50 to achieve high entropy. Also, we can measure how independent the bits are by estimating the correlation $\rho_{ij}$:

$$\rho_{ij} = \frac{E\big((b_i - E(b_i))(b_j - E(b_j))\big)}{\sqrt{V(b_i)\,V(b_j)}} \tag{2}$$

$E$ denotes expectation value, $V$ denotes variance, and $b_k$ denotes the $k$'th bit in the bitset. Since we are dealing with binary distributions we can fortunately conclude that if $\rho_{ij} = 0$, the $i$'th and the $j$'th bits are independent.

We can construct a measurement of how much the bitsets deviate from having max-imum entropy by studying how much they deviate from the assumption of 50 per cent probability of a bit set to one and from the desired independence of the bits.

If the bits in a bitset have a probability of being 1 equal to 0.50 and they are in-dependent the distribution of the number of ones in the bitset will follow a binomial distribution. Therefore we can define another measurement of how close to maximum entropy the bitsets are as the deviation from a binomial distribution of the number of ones in the bitsets. These two ways of measuring how close to maximum entropy the bitsets are will be exemplified.
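The two deviation measurements above can be sketched as follows. This illustrative Python code (names are my own, not from the original implementation) estimates the per-bit probabilities and the correlation of Equation 2 from a sample of bitsets:

```python
import math
import random

def bit_probs_and_corr(bitsets, nbits):
    """Per-bit probability of a 1, plus an estimator of the
    correlation rho_ij of Equation 2, from sample bitsets
    given as lists of 0/1 values."""
    n = len(bitsets)
    p = [sum(bs[i] for bs in bitsets) / n for i in range(nbits)]

    def rho(i, j):
        # E(b_i b_j) - E(b_i)E(b_j) over sqrt(V(b_i) V(b_j))
        e_ij = sum(bs[i] * bs[j] for bs in bitsets) / n
        cov = e_ij - p[i] * p[j]
        return cov / math.sqrt(p[i] * (1 - p[i]) * p[j] * (1 - p[j]))

    return p, rho

random.seed(1)
# Independent fair bits: probabilities near 0.5, rho near 0,
# i.e. close to the maximum entropy criteria.
sample = [[random.randrange(2) for _ in range(8)] for _ in range(20000)]
p, rho = bit_probs_and_corr(sample, 8)
```

Deviations of `p` from 0.5 and of `rho(i, j)` from 0 quantify how far the bitsets are from maximum entropy.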

*Distortion persistence* of the bits can be measured by performing experiments on a number of templates subject to controlled distortions. A bitset is created from the template before and after the distortion. The number of bits that are equal in the two bitsets is a measurement of how persistent the bitset creation algorithm is to the applied distortion.

*The bitset creation time* can be measured for a specific computer. However, if bitset creation method A is faster than method B on computer X, A is not necessarily faster on computer Y. Also, the implementations are often not trivial to optimize. So it is not always possible to determine which bitset creation method is generally the fastest.

**2.3 Bitset comparison**

The bitset comparison part of MEM is not application dependent. For each image neighborhood and for the template, a set of bits is generated somehow. The similarity measure in the template matching algorithm is simply the number of equal bits in the template bitset and the image neighborhood bitset. The bitsets consist of a whole number of bytes for practical reasons.

It is possible to use the fast dissimilarity principle (MEM Statement 3 in Section 2.1) to decrease the bitset compare time. This is done by first comparing only the first bytes of the neighborhood and template bitsets. If the number of equal bits in these parts of the bitsets is below a certain threshold, the bitsets are considered dissimilar and the similarity value is set to zero. Otherwise the whole bitsets are compared, and the similarity measure is the number of equal bits of the whole bitsets. This algorithm and its implementation are described in detail in the next section.

**3 PAIRS and the details of the bitset comparison algorithm**

**3.1 PAIRS**

PAIRS is an algorithm that creates bitsets of an arbitrary number of bytes from neighborhoods in RGB images. PAIRS can easily be modified to deal with other kinds of signals of arbitrary inner and outer dimension. The PAIRS method is based on random pairs of pixels within a neighborhood of an image. Each bit in a bitset is created from a certain pair of pixels. A bit is set to 1 if the first pixel value in the pair is larger than the second; otherwise the bit is set to 0. The two pixel values of a pair are taken from the same color band. The random pairs are chosen according to Algorithm 1, which is presented in C-like pseudo-code.

Algorithm 1

```
// Computes a list of pairs to be used
// for bitset creation.

INPUT VARIABLES
  n       Number of bytes in each bitset
  N       Side of the N x N neighborhood
  colors  Number of colors (3 for RGB images)

OUTPUT VARIABLES
  list    List of pixel pair coordinates used to
          form the bitsets, size: 8 x n, where each
          element contains the coordinates of the
          pixel pair

FUNCTIONS CALLED
  rand    rand(low,high) returns a random integer
          between low and high.

ALGORITHM
For i1=0 to n*8-1 {
    index1x = rand(0,N-1);
    index1y = rand(0,N-1);
    index2x = rand(0,N-1);
    index2y = rand(0,N-1);
    index3  = rand(0,colors-1);
    Store all five index variables in list[i1].
}
```

When the list of pixel pairs has been created according to Algorithm 1, or a pre-computed list has been loaded from a file, the actual bitsets are created according to Algorithm 2.

Algorithm 2

```
// Creates bitsets from image neighborhoods.

INPUT VARIABLES
  im    An RGB image,
        size: imSize1 x imSize2 x 3
  list  A list of pixel pairs created
        by Algorithm 1

OUTPUT VARIABLES
  bs    Image bitset,
        size: (imSize1-N+1) x (imSize2-N+1) x 8*n

ALGORITHM
For i1=0 to imSize1-N, i2=0 to imSize2-N   /* For all image neighborhoods */
{
    For i3=0 to 8*n-1                      /* For all bits in the bitset */
    {
        Get index1x, index1y, index2x, index2y
        and index3 from list[i3].
        If im[i1+index1x, i2+index1y, index3] >
           im[i1+index2x, i2+index2y, index3]
        { bs[i1,i2,i3] = 1; }
        Else
        { bs[i1,i2,i3] = 0; }
    }
}
```

Note that Algorithm 2 is used to create both the image bitset and the template bitset. The size of the resulting template bitset will be 1 x 1 x 8*n bits, or simply n bytes if we neglect the singleton dimensions.
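Algorithms 1 and 2 can be condensed into a short Python sketch. This is an illustrative reimplementation (not the author's C code, and the function names are my own); it produces the bitset for a single neighborhood.

```python
import random

def make_pairs(n_bytes, N, colors=3, rng=None):
    """Algorithm 1 sketch: 8*n_bytes random pixel pairs inside an
    N x N neighborhood, each pair with a shared color index."""
    rng = rng or random.Random(0)
    return [(rng.randrange(N), rng.randrange(N),
             rng.randrange(N), rng.randrange(N),
             rng.randrange(colors)) for _ in range(8 * n_bytes)]

def neighborhood_bitset(im, top, left, pairs):
    """Algorithm 2 sketch for one neighborhood: a bit is 1 when the
    first pixel value of the pair is larger than the second.
    im is indexed as im[row][col][color]."""
    bits = []
    for (x1, y1, x2, y2, c) in pairs:
        a = im[top + x1][left + y1][c]
        b = im[top + x2][left + y2][c]
        bits.append(1 if a > b else 0)
    return bits

# Tiny 4x4 "RGB image" with a horizontal intensity ramp.
im = [[[col * 10 + c for c in range(3)] for col in range(4)]
      for row in range(4)]
pairs = make_pairs(n_bytes=2, N=4)
bs = neighborhood_bitset(im, 0, 0, pairs)
```

For a full image, `neighborhood_bitset` would be called for every valid `(top, left)` position, as in the loop of Algorithm 2.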

**3.2 Motivation for PAIRS**

The PAIRS method for bitset creation described in the previous section has been developed since it is a good compromise between the desired properties of MEM bitsets as described in Section 2.2. The entropy of these bitsets is high, the similarity measurement is resistant to certain kinds of distortion, and the bitset creation time is low. I believe other ways to create bitsets can prove better than PAIRS for some applications, but PAIRS is fast and rather simple to implement, and it works on the original input intensity data. The method has proved useful in applications and is used to demonstrate the maximum entropy approach to template matching. Matching using bitsets created with PAIRS compares very well with correlation approaches in experiments with controlled distortions, see Section 4. When high invariance against certain types of distortions, such as rotation, is needed, I believe higher-level image features should be used when forming the bitsets.

**3.3 The bitset comparison algorithm**

The previous section described the PAIRS way to create bitsets. The bitset comparison algorithm is used to match the template bitset with the image bitsets. The algorithm does not depend on which bitset creation method is used. There are two versions of the bitset comparison algorithm (Algorithm 3), with or without *sort out*. If sort out is not used, the similarity measurement between two bitsets is simply the number of equal bits. Sort out can be used to increase the speed of the algorithm by setting the similarity to zero if the number of equal bits in the first n1 bytes is less than a certain limit. The principle behind this is the *fast dissimilarity principle* discussed in Section 2.1. The average number of bytes that have to be compared can be reduced substantially by using sort out. Notice that a drawback of using sort out is that the execution time will depend on the input data.

Algorithm 3

```
// Computes a similarity measurement between
// the template bitset and the image bitsets.

INPUT VARIABLES
  im_bs    Image bitset, size: s1 x s2 x 8*n
  temp_bs  Template bitset, size: 8*n

ADDITIONAL INPUT VARIABLES FOR SORT OUT VERSION
  n1       Number of bytes to use for initial sort out.
  thres    Threshold

OUTPUT VARIABLES
  s        The similarity measurement, size: s1 x s2

FUNCTIONS CALLED
  simil    simil(bs1, bs2) computes the number of
           equal bits in bitsets bs1 and bs2.

ALGORITHM (without sort out)
For i1=0 to s1-1, i2=0 to s2-1 {
    s[i1,i2] = simil(im_bs[i1,i2,0:n-1], temp_bs[0:n-1]);
}

ALGORITHM (with sort out)
For i1=0 to s1-1, i2=0 to s2-1 {
    sortout_sim = simil(im_bs[i1,i2,0:n1-1], temp_bs[0:n1-1]);
    if sortout_sim < thres {
        s[i1,i2] = 0;
    } else {
        s[i1,i2] = sortout_sim
                 + simil(im_bs[i1,i2,n1:n-1], temp_bs[n1:n-1]);
    }
}
```
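For a single neighborhood, the sort-out logic can be sketched in Python. This is an illustrative reimplementation (function names are my own, not from the original C code):

```python
def similarity(bs1, bs2):
    """Number of equal bits in two equal-length byte sequences:
    XOR the bytes, then count the zero bits of each result."""
    return sum(8 - bin(a ^ b).count("1") for a, b in zip(bs1, bs2))

def compare_with_sort_out(im_bs, temp_bs, n1, thres):
    """Algorithm 3 sketch (sort-out version) for one neighborhood:
    compare only the first n1 bytes; if too few bits agree,
    return 0 without touching the remaining bytes."""
    s = similarity(im_bs[:n1], temp_bs[:n1])
    if s < thres:
        return 0
    return s + similarity(im_bs[n1:], temp_bs[n1:])

temp = bytes([0b10101010] * 8)
same = bytes([0b10101010] * 8)   # identical bitset
diff = bytes([0b01010101] * 8)   # every bit flipped
s_same = compare_with_sort_out(same, temp, n1=2, thres=10)
s_diff = compare_with_sort_out(diff, temp, n1=2, thres=10)
```

The identical bitset passes the sort-out test and scores all 64 bits; the flipped bitset is rejected after only 2 of the 8 bytes have been examined.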

**3.4 Implementation issues**

The implementations of the algorithms presented in this text do not follow the exact syntax presented, but they are functionally equivalent. To compare the speed of MEM-PAIRS with correlation-type approaches, Algorithms 1 through 3 and NCC have been implemented in C and run on a general purpose computer of the type UltraSPARC II, 333 MHz. Execution times mentioned in this text refer to runs on this computer. The details of the NCC algorithm are explained in Section 4.1. Algorithm 1 is not time critical since the list of pixel pairs can be pre-computed. The bitsets created by Algorithm 2 are stored as arrays of `unsigned char`s, that is, as arrays of bytes. This is not necessarily the fastest way, but it is flexible. Some effort has been made to optimize the innermost loop of the algorithm. The bitset creation time $t_{cr}$ has been measured to 0.40 µs/byte. This time does not depend much on the number of bytes per bitset or the size of the image.

Algorithm 3 is about counting the number of equal bits in two arrays of bytes. An XOR operation is performed between each byte in the image array and the corresponding template array byte. The number of zeros in the resulting byte (which is the number of equal bits of the two compared bytes) is computed with a 256-item lookup table. The numbers of equal bits from all bytes in the arrays are summed, and the result is the similarity measurement between the bitsets. The time to compare two bytes in the bitsets, $t_{cmp}$, has been measured to 0.025 µs.
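The XOR-and-table mechanism can be sketched in Python (illustrative only; the measured implementation is in C):

```python
# Precompute, once, the number of zero bits in each possible byte.
# After XOR-ing two bytes, each zero bit marks a position where
# the two original bytes agreed.
EQUAL_BITS = [8 - bin(v).count("1") for v in range(256)]

def bitset_similarity(bs1, bs2):
    """XOR each byte pair and sum the equal-bit counts
    from the 256-item table."""
    return sum(EQUAL_BITS[a ^ b] for a, b in zip(bs1, bs2))

# 64-byte bitsets: identical bitsets agree on all 512 bits,
# bitwise-opposite bitsets agree on none.
s_identical = bitset_similarity(bytes(64), bytes(64))
s_opposite = bitset_similarity(bytes([0xFF] * 64), bytes(64))
```

The same table-driven inner loop translates directly to C, where the XOR and table lookup cost only a few instructions per byte.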

The NCC algorithm has also been implemented in C. Some effort has been made to optimize the code. The time it takes to compare two bytes (two intensity values) is 0.120 µs, denoted $t_{cmpN}$. The compare time per byte is based on the amount of input data, even though the bytes are converted to `double`s internally to do the multiplication in the NCC algorithm.

**3.5 Speed**

In this section we will compare the speed of MEM-PAIRS³ and NCC for template matching. Let $t_m$ and $t_{mN}$ denote the total time to match the template with the image for PAIRS and NCC. Let $s$ denote the size of the template, $s_x$ the width of the search space, $s_y$ the height of the search space, and $n$ the number of bytes used in the bitsets. By search space is meant the rectangular set of tested template positions. If no sort out is used in the PAIRS method, and both the bitsets have to be created to do the matching, the match time for PAIRS is

$$t_m = (n s_x s_y + n)\, t_{cr} + n s_x s_y\, t_{cmp}. \tag{3}$$

Usually the time to create the template bitset can be neglected, and then we have

$$t_m = n s_x s_y\, (t_{cr} + t_{cmp}). \tag{4}$$

The match time for NCC is

$$t_{mN} = 3 s_x s_y s^2\, t_{cmpN}. \tag{5}$$

We exemplify using the following values: $s = 16$, $s_x = s_y = 41$, $n = 64$, $t_{cr} = 0.400 \cdot 10^{-6}$ s, $t_{cmp} = 0.025 \cdot 10^{-6}$ s, and $t_{cmpN} = 0.120 \cdot 10^{-6}$ s. These values are used in the comparative test between PAIRS and NCC in Section 4. The resulting match times for this example are:

$$t_m = 46 \text{ ms}$$

$$t_{mN} = 155 \text{ ms}$$

We see that PAIRS is 3 times faster in this case. We note that the time to compare the bitsets, $t_{cmp}$, is much shorter than the time to create them, $t_{cr}$. For certain applications the bitsets could be pre-computed, or a large number of templates could be used on one image. We will see later that if optical flow estimation is done by PAIRS template matching, the total time to create the bitsets is shorter than the total compare time, since each bitset is used in many comparisons. For an application where the bitset creation time is of no importance, the actual match time for PAIRS would instead be

$$t_m = 2.7 \text{ ms}$$

if the same parameters as above are used. For this case PAIRS is 60 times faster than NCC.

³ The MEM-PAIRS template matching algorithm will from now on often be abbreviated to just “PAIRS”.
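The example figures above follow directly from Equations 4 and 5; as a plain arithmetic check (variable names are my own):

```python
# Example parameters from the text.
n = 64             # bytes per bitset
s = 16             # template side (16 x 16 template)
sx = sy = 41       # search-space width and height
t_cr = 0.400e-6    # bitset creation time per byte (s)
t_cmp = 0.025e-6   # bitset compare time per byte (s)
t_cmpN = 0.120e-6  # NCC compare time per byte (s)

t_m = n * sx * sy * (t_cr + t_cmp)      # Equation 4: PAIRS match time
t_mN = 3 * sx * sy * s ** 2 * t_cmpN    # Equation 5: NCC match time
t_m_precreated = n * sx * sy * t_cmp    # bitset creation excluded
```

Evaluating these gives roughly 46 ms, 155 ms and 2.7 ms, matching the figures quoted in the text.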


Figure 1: This figure shows how the bitsets can be created sparsely in the image if several template bitsets are used in the matching.

For the application of tracking an object in a video sequence, only one template is matched with a certain image. The bitsets created from the image will therefore be used only once, so the bitset creation will take most of the time. There is a remedy to speed up the process of creating the bitsets. Have a look at Figure 1. The figure depicts how we can reduce the total bitset creation time by creating fewer image neighborhood bitsets and more template bitsets. The image bitsets are only created at every $d_{sparse}$'th pixel position in the x- and y-direction, as shown on the left side of the figure for the case $d_{sparse} = 3$. The right hand side of the figure shows a 10×10 template that we wish to match with the image. We view this template as a collection of $d_{sparse}^2 = 9$ 8×8 templates with the upper left corner positioned within the gray square. One of these 8×8 templates is marked in the figure. The 8×8 templates are denoted by the position of their upper left corner in the 10×10 neighborhood. The position indices $k$ and $l$ run from 0 to $d_{sparse} - 1$. The assumption we use now is that if the $(k, l)$-template matches the image at position $(i, j)$, then the 10×10 template matches the image at position $(i - k, j - l)$. This assumption is reasonable when $d_{sparse}$ is small compared to the template size. The assumption makes it possible to find a similarity measurement for all possible positions of the template in the image even though the image bitsets are created sparsely. The algorithms for bitset creation and comparison become somewhat more complicated to implement, but the total match time decreases substantially for many applications. We will create $d_{sparse}^2$ times fewer image bitsets, and $d_{sparse}^2$ template bitsets instead of only one. Equation 3 can be modified for the case of sparsely sampled image bitsets to

$$t_m = n \frac{s_x s_y}{d_{sparse}^2}\, t_{cr} + n d_{sparse}^2\, t_{cr} + n s_x s_y\, t_{cmp}. \tag{6}$$

The terms in the sum are, from left to right, the time to create the image bitsets, the time to create the template bitsets, and the time to compare them. The equation neglects the edge effects that occur if $s_x$ or $s_y$ is not a multiple of $d_{sparse}$. For the example parameters given above and with $d_{sparse} = 4$, the total match time would be

$$t_m = 2.7 + 0.4 + 2.7 \text{ ms} = 5.8 \text{ ms}$$

which is 27 times faster than NCC. Later we will see how the use of sparsely sampled bitsets affects the performance of template matching.
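Equation 6 can likewise be checked term by term with plain arithmetic (variable names are my own):

```python
# Example parameters from the text.
n, sx, sy = 64, 41, 41
t_cr, t_cmp = 0.400e-6, 0.025e-6
d_sparse = 4

# Equation 6, term by term: sparsely created image bitsets,
# the d_sparse**2 template bitsets, and the comparison itself.
t_image = n * sx * sy / d_sparse ** 2 * t_cr
t_templates = n * d_sparse ** 2 * t_cr
t_compare = n * sx * sy * t_cmp
t_m_sparse = t_image + t_templates + t_compare
```

The three terms come out at roughly 2.7 ms, 0.4 ms and 2.7 ms, reproducing the 5.8 ms total given above.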

**4 A comparison between PAIRS and normalized cross-correlation**

**4.1 Test setup**

In this section we describe a performance test of PAIRS matching and normalized cross-correlation (NCC). The PAIRS matching algorithm uses Algorithms 1 and 2 to create the bitsets, and Algorithm 3 without sort out to match the bitsets. The NCC method uses a similarity measurement $s$ between the image neighborhood $I$ and the template $T$ as defined below.

$$s = \sum_{k=0}^{2} \frac{\displaystyle\sum_{i=0}^{N-1} \sum_{j=0}^{N-1} I(i,j,k)\, T(i,j,k)}{\sqrt{\displaystyle\sum_{i=0}^{N-1} \sum_{j=0}^{N-1} I^2(i,j,k) \;\sum_{i=0}^{N-1} \sum_{j=0}^{N-1} T^2(i,j,k)}} \tag{7}$$

$i$ and $j$ are spatial coordinates, and $k$ is the color index. Since RGB images are assumed, $k$ runs from 0 to 2. The similarity measurement can be described as standard gray-scale normalized cross-correlation done separately for the three color bands, where the resulting similarity is the sum of the similarity measurements from each color band.
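As an illustrative Python sketch of Equation 7 for one template position (not the optimized C implementation used in the test):

```python
import math

def ncc_rgb(I, T, N):
    """Equation 7 sketch: gray-scale NCC computed per color band
    and summed over the three bands. I and T are indexed
    [i][j][k] with spatial i, j and color index k."""
    s = 0.0
    for k in range(3):
        num = sum(I[i][j][k] * T[i][j][k]
                  for i in range(N) for j in range(N))
        e_i = sum(I[i][j][k] ** 2 for i in range(N) for j in range(N))
        e_t = sum(T[i][j][k] ** 2 for i in range(N) for j in range(N))
        s += num / math.sqrt(e_i * e_t)
    return s

# A patch is maximally similar to itself: each band contributes 1,
# so the maximum similarity is 3.
patch = [[[(i + 2 * j + k) % 7 + 1 for k in range(3)]
          for j in range(2)] for i in range(2)]
s_self = ncc_rgb(patch, patch, 2)
```

In the actual test this value is computed for all 41² template positions, and the position with the largest $s$ is taken as the match.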

The approach in this test is to define a template as a neighborhood of an image. Then controlled distortions are applied to the template, and the template is matched with the original image. The RGB test images are chosen as random parts of images from a database containing 20000 images. The database is Photo Library 1 and 2 in Corel's product *Corel GALLERY Magic*, see their homepage [1] for more information. 5000 test images of size 56×56 are chosen from the database, and the templates are taken from the central neighborhoods of these images. The size of the templates is 16×16. This implies that the number of possible positions of the template in the image is 41². When other template sizes are used, the image sizes are adjusted so that the number of possible positions of the template is the same. Figure 2 shows 20 of these images and the region from which the templates are taken. On the order of 50 per cent of the images initially chosen were not used because the energy of the template area was too low. (An image was rejected if the average standard deviation of the three color bands in the template area was less than 20. The intensity values are between 0 and 255.)

The distortions that are applied to the templates are defined in the subsequent sections. After the template is distorted, it is matched with the image using the PAIRS and the NCC methods. If the result is within a distance of 2 pixels from the correct position it is considered a hit. All 5000 images are used for each of the 26 different distortion cases. A new list of pairs (as defined by Algorithm 1) is produced every 100th image. The numbers of misses for NCC and PAIRS are recorded. The tests are performed for different numbers of bytes in the bitsets formed by PAIRS. These different methods will be denoted PAIRSX, or just PX, where X is the number of bytes used to form the bitsets. PAIRSX+ denotes PAIRS with a number of bytes greater than or equal to X.

**4.2 Generation of image distortions**

The template is always shifted a subpixel distance in both directions before any other distortion is applied. This is a natural thing to do since exact integer shifts are not more common than other shifts in real applications. The sizes of the x- and y-shifts are

These images are from Corel GALLERY Magic and are protected by copyright laws. Used under license.

Figure 2: Examples of 20 images used in a comparative test between PAIRS and NCC. The template regions are marked.

chosen randomly from a uniform distribution. The shift is performed using bicubic interpolation.

The distortions are applied only to the template. If more than one distortion is used, they are applied in the same order as they appear below. The distortions are applied to a template larger than the final distorted template, so that geometric distortions such as rotation and zoom can be performed without introducing undetermined intensity values in certain regions. The different types and strengths of the distortions are denoted by a LABEL (for example NOISE or ROT) followed by the distortion strength parameter $P_{LABEL}$. For example, “ROT10” denotes a rotation distortion with a distortion strength parameter ($P_{ROT}$) of 10, which in this case means a rotation with a maximum angle of 10°.

**4.2.1 Gaussian noise, NOISE**

The distortion called NOISE is generated by adding zero-mean Gaussian noise to the image. The signal energy of the image is computed as the sum of the squares of all pixel values for all colors. $P_{NOISE}$ is the noise-to-signal energy ratio⁴.

⁴ Unconventionally, noise to signal ratio is used instead of signal to noise ratio, since we want a greater


Figure 3: Perspective change for the PERSP distortion.

**4.2.2 Rotation of the image, ROT**

The rotation of the template is done by an angle uniformly distributed between $-\varphi_{max}$ and $\varphi_{max}$. Bicubic interpolation is used. $P_{ROT}$ equals $\varphi_{max}$.

**4.2.3 Scaling of the image, ZOOM**

When this distortion is used, the template is scaled by a factor drawn from a uniform distribution between $1 - P_{ZOOM}$ and $1 + P_{ZOOM}$. Bicubic interpolation is used. When the scale factor is less than 1, a suitable low-pass filter is applied before the interpolation to avoid aliasing.

**4.2.4 Perspective change, PERSP**

Have a look at Figure 3. We assume that the undistorted template comes from a flat surface, represented by the rectangle in the figure. The camera is located at C on the z-axis, which is perpendicular to the surface. The distorted template is then the image that would be acquired if we moved the camera to C' and still aimed it at the origin O. The distance from the origin is not changed. The z'-axis is obtained by rotating the unprimed coordinate system an angle $\varphi$ around the z-axis followed by an angle $\theta$ around the x-axis. The angle $\varphi$ is picked from a uniform distribution between 0 and $2\pi$, and $\theta$ from a uniform distribution between 0 and $P_{PERSP}$.

**4.2.5** **Salt and pepper noise, SALT**

This type of noise is created by randomly setting some intensity values to 0 or 255. The probability of setting the intensity to 0 is the same as the probability of setting it to 255. This probability is denoted P_SALT.
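A minimal sketch of this distortion (function name and list representation are mine): each pixel is independently set to 0 with probability P_SALT, to 255 with the same probability, and otherwise left unchanged.

```python
import random

def salt_and_pepper(pixels, p_salt, seed=0):
    """With probability p_salt set a pixel to 0, with the same
    probability set it to 255; otherwise keep the original value."""
    rng = random.Random(seed)
    out = []
    for v in pixels:
        u = rng.random()
        if u < p_salt:
            out.append(0)        # "pepper"
        elif u < 2.0 * p_salt:
            out.append(255)      # "salt"
        else:
            out.append(v)
    return out
```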

**4.2.6** **A gamma correction of the intensity values, GAMMA**

This distortion is applied independently to all intensity values. In this case the input intensity values i_in are integers between 0 and 255 and the output values i_out are

$$i_{out} = \mathrm{round}\left(\left(\frac{i_{in}}{255}\right)^{\gamma} \cdot 255\right). \qquad (8)$$

The value of γ is chosen as P_GAMMA raised to the power u, where u is drawn from a uniform distribution between −1 and 1.
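Equation (8) can be sketched as follows (assuming, as the text suggests, that one γ is drawn per distorted template; the function name is mine):

```python
import random

def gamma_distort(pixels, p_gamma, seed=0):
    """Apply Equation (8): i_out = round((i_in / 255)**gamma * 255),
    with gamma = p_gamma**u and u ~ U(-1, 1)."""
    rng = random.Random(seed)
    gamma = p_gamma ** rng.uniform(-1.0, 1.0)  # gamma > 0 for p_gamma > 0
    return [round((v / 255.0) ** gamma * 255.0) for v in pixels]
```

Since the mapping is strictly increasing, the ordering of intensity values is preserved, which is exactly why PAIRS is nearly invariant to it.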

**4.2.7** **NODIST and STD**

The label NODIST denotes no distortion except for the subpixel shift. This case can be used as a reference for the minimum possible miss rate. STD (standard distortion) is a mixture of the distortions defined above and consists of NOISE0.01, ROT5, ZOOM0.05, PERSP10, SALT0.02 and GAMMA1.5. STD is a standard test case with distortions that are considered relevant for many applications, including tracking and optical flow estimation.

**4.3** **Relevance of the distortions**

**NOISE** Noise in imaging systems is often modeled well as additive Gaussian noise.

**ROT, ZOOM, PERSP** Template matching for tracking must be able to handle small rotations, image scaling and perspective changes due to object and camera motion.

**SALT** Salt-and-pepper noise occurs in some image measurement systems. Furthermore, this distortion can be used to model object occlusion.

**GAMMA** Cameras are often not calibrated. This distortion is relevant when the template and the image are acquired from different cameras. Also, it reflects how well the template matching algorithm can handle different lighting conditions for the template and the image.

The strength of the distortions is chosen quite high, since we need a large number of misses in the template matching tests in order to get statistically reliable data. However, the distortions are not so large as to be irrelevant for practical applications.

**4.4** **Results**

The results of the comparative test are shown in Figures 4 through 10, and the numerical values are presented in Table 1. The miss rate is on the y-axis and the number of bytes used by the PAIRS method is on the logarithmic x-axis of the graphs. The horizontal line in each figure corresponds to the miss rate of the NCC matching; if the line is missing, it is outside the y-axis range. The other curve corresponds to the miss rate of the PAIRS method for different numbers of bytes in the bitsets.

Figure 4 shows the miss rate for NODIST and for the standard test case, STD. The only distortion applied for the NODIST case is the subpixel shift. Contrary to expectations, PAIRS32+ performs significantly better than NCC for the NODIST case. PAIRS32+ also performs better for the STD test case: the miss rate for NCC is 7 times greater than for PAIRS256.

For the case of Gaussian noise, PAIRS64 performs as well as NCC for a low amount of noise. For very high noise levels, NCC outperforms PAIRS. See Figure 5. A noise to signal energy ratio higher than 0.01 is not common for high quality video sequences from scenes with good lighting conditions.

The relative performance of NCC and PAIRS can be characterized in the same way for all three types of geometrical distortions that have been tested: rotation (ROT),

| Distortion | NCC | P16 | P32 | P64 | P128 | P256 | P512 | P1024 |
|---|---|---|---|---|---|---|---|---|
| NODIST | 0.008 | 0.009 | 0.005 | 0.003 | 0.004 | 0.002 | 0.002 | 0.002 |
| STD | 0.046 | 0.065 | 0.034 | 0.019 | 0.011 | 0.007 | 0.005 | 0.006 |
| NOISE0.001 | 0.008 | 0.018 | 0.009 | 0.005 | 0.004 | 0.004 | 0.005 | 0.004 |
| NOISE0.01 | 0.008 | 0.043 | 0.021 | 0.012 | 0.009 | 0.006 | 0.005 | 0.005 |
| NOISE0.1 | 0.012 | 0.140 | 0.084 | 0.047 | 0.030 | 0.022 | 0.018 | 0.015 |
| NOISE1 | 0.032 | 0.499 | 0.311 | 0.203 | 0.140 | 0.103 | 0.077 | 0.067 |
| ROT5 | 0.011 | 0.014 | 0.006 | 0.005 | 0.003 | 0.003 | 0.002 | 0.002 |
| ROT10 | 0.039 | 0.054 | 0.030 | 0.018 | 0.011 | 0.008 | 0.008 | 0.007 |
| ROT15 | 0.102 | 0.151 | 0.097 | 0.066 | 0.056 | 0.047 | 0.043 | 0.041 |
| ROT20 | 0.213 | 0.272 | 0.205 | 0.163 | 0.150 | 0.132 | 0.128 | 0.123 |
| ZOOM0.05 | 0.010 | 0.015 | 0.006 | 0.004 | 0.003 | 0.003 | 0.003 | 0.003 |
| ZOOM0.10 | 0.013 | 0.022 | 0.008 | 0.004 | 0.003 | 0.003 | 0.002 | 0.002 |
| ZOOM0.15 | 0.028 | 0.055 | 0.028 | 0.018 | 0.013 | 0.012 | 0.012 | 0.011 |
| ZOOM0.20 | 0.040 | 0.082 | 0.049 | 0.032 | 0.024 | 0.021 | 0.020 | 0.017 |
| PERSP10 | 0.006 | 0.012 | 0.004 | 0.003 | 0.002 | 0.001 | 0.001 | 0.001 |
| PERSP30 | 0.012 | 0.016 | 0.008 | 0.004 | 0.004 | 0.003 | 0.003 | 0.003 |
| PERSP40 | 0.021 | 0.034 | 0.017 | 0.011 | 0.008 | 0.006 | 0.006 | 0.005 |
| PERSP50 | 0.056 | 0.084 | 0.051 | 0.036 | 0.030 | 0.025 | 0.024 | 0.021 |
| SALT0.02 | 0.019 | 0.014 | 0.005 | 0.003 | 0.002 | 0.002 | 0.001 | 0.001 |
| SALT0.04 | 0.037 | 0.019 | 0.007 | 0.003 | 0.002 | 0.002 | 0.001 | 0.001 |
| SALT0.06 | 0.062 | 0.026 | 0.007 | 0.004 | 0.002 | 0.003 | 0.002 | 0.002 |
| SALT0.08 | 0.082 | 0.038 | 0.011 | 0.005 | 0.003 | 0.002 | 0.001 | 0.001 |
| GAMMA1.5 | 0.016 | 0.009 | 0.005 | 0.002 | 0.002 | 0.002 | 0.002 | 0.002 |
| GAMMA2.0 | 0.039 | 0.011 | 0.004 | 0.002 | 0.001 | 0.002 | 0.002 | 0.002 |
| GAMMA3.0 | 0.144 | 0.012 | 0.007 | 0.004 | 0.002 | 0.002 | 0.002 | 0.002 |
| GAMMA5.0 | 0.252 | 0.022 | 0.011 | 0.009 | 0.008 | 0.006 | 0.006 | 0.005 |

Table 1: The miss rate of template matching experiments with different types of distortions and similarity measurements. PX denotes the PAIRS algorithm with X bytes in the bitsets.

Figure 4: The miss rate for the NODIST and STD test cases.


Figure 5: The miss rate for different levels of Gaussian noise. The noise to signal ratios of 0.001, 0.01, 0.1 and 1 correspond to SNRs of 30 dB, 20 dB, 10 dB and 0 dB.


Figure 6: The miss rate for different amounts of rotation of the template. Note the different scales of the y-axis. The NCC miss rate is out of the y-axis range for the ROT20 distortion.

Figure 7: The miss rate for different amounts of scaling (ZOOM) of the template.


Figure 8: The miss rate for different amounts of perspective change applied to the template.

Figure 9: The miss rate after salt and pepper noise has distorted the template.

rescaling (ZOOM) and perspective change (PERSP). See Figures 6 to 8. PAIRS32+ works better for all these distortion cases except ZOOM0.20, for which PAIRS64+ works better.

PAIRS16+ clearly outperforms NCC when the template is distorted with salt and pepper noise. See Figure 9. This can be explained by the fact that if a pixel value is set to 255, corresponding to "white" or maximum intensity, it has a larger effect on the total similarity measurement in NCC than in PAIRS, since large intensity values affect the NCC similarity measurement more than small intensity values do. This is not the case for PAIRS, where each pixel value (statistically) has the same amount of influence on the similarity measurement, independent of the magnitude of its intensity value.

The results from the γ-distortion are shown in Figure 10. As expected, PAIRS is far superior to NCC for this type of distortion. This is due to the fact that only the information about which of two intensity values is greater is used when forming a bit in a bitset. This information stays intact for any strictly increasing intensity value transform, at least if we neglect quantization effects. Thus, the bitsets are (nearly)


Figure 10: The miss rate for different amounts of γ-distortion of the pixel values. The miss rate of NCC for GAMMA3.0 and GAMMA5.0 is out of the y-axis range and therefore not shown in the figure.


Figure 11: The distribution of distances between the match position and the correct position. The bar furthest to the right in each graph indicates the relative frequency of all distances greater than 3.0 pixels.

invariant to arbitrary strictly increasing intensity transforms. This fact is especially useful when the image and the template are acquired with different cameras.

In the above experiments a "miss" has been defined as a match position more than 2.0 pixels from the correct position. This threshold was set rather arbitrarily. Experiments with other thresholds seem to give the same qualitative results on how PAIRS and NCC compare. Figure 11 shows the distribution of distances from the match position to the correct position for NCC and PAIRS for the standard test case (STD). Most match positions are less than one pixel from the correct position. For PAIRS1024 the relative frequency of matches less than 1.0 pixels from the correct position is 0.984. The same figure for PAIRS16, PAIRS64, and NCC is 0.859, 0.949, and 0.911 respectively.

To summarize the results, PAIRS performs better for most types of distortions, except for Gaussian noise with very high energy. PAIRS64 performs better than NCC in 23 of the 26 test cases. PAIRS clearly outperforms NCC for the SALT and GAMMA distortions even when few bytes are used in the bitsets. The distributions of distances between the match position and the correct position are similar for

| Test case | NCC | P16 | P32 | P64 | P128 | P256 | P512 | P1024 |
|---|---|---|---|---|---|---|---|---|
| STD4 | 0.526 | 0.532 | 0.462 | 0.429 | 0.401 | 0.394 | 0.387 | 0.384 |
| STD8 | 0.152 | 0.111 | 0.067 | 0.048 | 0.039 | 0.035 | 0.034 | 0.033 |
| STD16 | 0.046 | 0.065 | 0.034 | 0.019 | 0.011 | 0.007 | 0.005 | 0.006 |
| STD32 | 0.022 | 0.089 | 0.042 | 0.020 | 0.009 | 0.004 | 0.003 | 0.002 |
| STD64 | 0.064 | 0.243 | 0.145 | 0.087 | 0.056 | 0.040 | 0.031 | 0.023 |

Table 2: The miss rate for NCC and PAIRS for different template sizes. The standard test case of distortions has been used. STDX denotes this distortion and the template size X.

PAIRS and NCC. Note that these results are valid for this type of image, template and distortion. Other choices of test images could give different results.

**4.5** **Other template sizes**

For the tests described above the template size was kept constant and equal to 16, since this is a reasonable size for the intended applications. Figure 12 and Table 2 show the results of the STD test case when different template sizes were used. The cases are denoted STDX, where X equals the template size. The test images have been chosen from an image database as previously explained. Different sets of images are used for the five different tests, since the images that are sorted out due to low variance depend on the size of the template. The number of possible positions of the template in the image is still 412.

The first thing to notice from the figure is that both NCC and PAIRS work best for template sizes 16 and 32. For smaller templates the uniqueness and the structure of the templates decrease, which increases the miss rate. The reason why a larger template size also increases the miss rate is that the geometrical distortions affect the outer pixels more in a large template than in a small one. Another interesting thing is to compare the number of bytes needed by the PAIRS algorithm with the total number of bytes in the template, which is what the NCC algorithm uses for its comparisons. The PAIRS algorithm performs as well as NCC when 32, 64 and 128 bytes are used for the template sizes 16, 32 and 64 respectively. The ratio between the amount of data used in the bitsets and the amount of data in the template is 0.04, 0.02, and 0.01 for these cases. That is, for larger templates we can reduce the data amount more than for small templates and still get the same performance as NCC. This is not surprising, since the larger the template, the more redundant data we have.

The results show that for this test case PAIRS128+ performs better than NCC for all template sizes.

**4.6** **Performance using other images**

How well PAIRS compares to NCC depends on the set of images used in the test. The images previously used were picked randomly from a large database of photographs; only images with a low-variance template region have been discarded. The other set of images is taken from a video sequence showing two cars driving through an intersection. The video was acquired for the WITAS project from a radio-controlled helicopter and is called REVINGE2D. The position of one of the cars has been tracked for 500 frames, and the set of test templates consists of 20×20


Figure 12: The miss rate for NCC and PAIRS for different template sizes. The standard test case of distortions (STD) was used. Note the different scale for STD4.

Figure 13: Frame 0, 299 and 499 from sequence REVINGE2D. The template positions are marked.

neighborhoods around the tracked position. The test images are 60×60 neighborhoods around that position. Possibly these images are more relevant for the specific application of tracking cars than the randomly chosen images. Figure 13 shows three of these frames with the template positions marked. Altogether 5000 matchings are done for each type of applied distortion. Each image is reused 10 times. To limit the experiments only one distortion strength is used per type of distortion. The distortions used are: NODIST, STD, NOISE0.1, ROT15, ZOOM0.20, PERSP50, SALT0.15 and GAMMA5.0. The results are presented in Table 3, and in Figures 14 and 15.

The results from these tests are difficult to interpret, since we do not know how well they generalize to other sequences. They should have some relevance for the specific application of vehicle tracking. When we compare these results with the ones from the Corel image database, we see that PAIRS does not compare as well with NCC for a low number of bytes in the bitsets. However, PAIRS512+ performs as well as or better than NCC for all the distortions tested. Note also that many results are close to zero and therefore statistically uncertain. The reason why PAIRS with a low number of bytes does not compare as well with NCC for these images as for the previous images is not clear. Possibly PAIRS generally compares better with NCC for bad templates, but a definite conclusion cannot be drawn from these limited tests. Further testing where the templates are classified on a scale from "bad" to "good" would possibly provide interesting results.

| Distortion | NCC | P16 | P32 | P64 | P128 | P256 | P512 | P1024 |
|---|---|---|---|---|---|---|---|---|
| NODIST | 0.000 | 0.005 | 0.001 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| STD | 0.003 | 0.053 | 0.014 | 0.003 | 0.001 | 0.000 | 0.000 | 0.000 |
| NOISE0.1 | 0.000 | 0.052 | 0.011 | 0.002 | 0.000 | 0.000 | 0.000 | 0.000 |
| ROT15 | 0.012 | 0.121 | 0.050 | 0.032 | 0.020 | 0.014 | 0.010 | 0.009 |
| ZOOM0.20 | 0.038 | 0.134 | 0.080 | 0.052 | 0.043 | 0.038 | 0.034 | 0.033 |
| PERSP50 | 0.038 | 0.113 | 0.060 | 0.042 | 0.033 | 0.025 | 0.024 | 0.023 |
| SALT0.15 | 0.017 | 0.219 | 0.048 | 0.005 | 0.000 | 0.000 | 0.000 | 0.000 |
| GAMMA5.0 | 0.295 | 0.004 | 0.001 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |

Table 3: The miss rate for NCC and PAIRS for different distortions. The set of test images is taken from the REVINGE2D sequence.


Figure 14: The miss rate for the NODIST, STD, NOISE0.1, and ROT15 distortions. The test images are taken from the sequence REVINGE2D.


Figure 15: The miss rate for the ZOOM0.20, PERSP50, SALT0.15, and GAMMA5.0 distortions. The test images are taken from the sequence REVINGE2D.


Figure 16: The probabilities for a zero of the first 64 bits in three different populations of bitsets. The population in Graph 1 is generated by Algorithm 1, the population corresponding to Graph 2 is created by a slightly modified version of Algorithm 1, and the last graph corresponds to a population created by a random number generator, which ideally has the maximum possible entropy.

**5** **Statistics of PAIRS bitsets**

**5.1** **Statistics of acquired bitsets**

The bitsets used in Maximum Entropy Matching should of course have maximum possible entropy, which is fulfilled when the probability of each bit being set to 1 is 0.50 and the bits b_i are statistically independent. Given a specific template bitset b_temp,i that we match with an image bitset b_im,i, the number of equal bits is the number of zeros in

$$b_{match,i} = \mathrm{XOR}(b_{im,i}, b_{temp,i}). \qquad (9)$$

If the image bitsets have maximum entropy, the number of equal bits m follows a binomial distribution. We continue by studying a population of bitsets which are created randomly using the same database as in Section 4. Random 16×16 neighborhoods are chosen, from which 64-byte bitsets are created.
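Counting equal bits per Equation (9) reduces to an XOR and a population count; a minimal sketch over byte strings (the function name is mine, not the paper's):

```python
def equal_bits(b_temp: bytes, b_im: bytes) -> int:
    """Number of equal bit positions: the zeros of XOR(b_im, b_temp),
    i.e. the total number of bits minus the set bits in the XOR."""
    differing = sum(bin(a ^ b).count("1") for a, b in zip(b_temp, b_im))
    return 8 * len(b_temp) - differing
```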

First we investigate the probability of each bit being set to 0, which should be close to 0.50 to obtain high entropy. Figure 16, Graph 1 shows the relative frequency of the event that the bit is set to 0 for the first 64 bits in the bitset. 10000 bitsets were used. The average of the relative frequencies for the first 64 bits is 0.544 and the standard deviation is 0.034. The reason why these values are not closer to 0.50 is that the pixel values in a pair are sometimes equal due to the limited resolution, and then the bit is set to 0 according to Algorithm 2. This problem is made worse by the fact that the images in the Corel image database are compressed, so that neighboring pixel values are more likely to be exactly equal. The problem can be solved by adding an if-statement to Algorithm 2: when the pixel values in a pair are equal, the bit is set to 1 if the pixel value is even. This makes the probabilities closer to 0.50.

The result when Algorithm 2 has been modified in this way can be seen in Figure 16, Graph 2. For this case the average is 0.501 and the standard deviation is 0.007. We call these bitsets REAL. Graph 3 is included for reference. The bits in the bitsets used for this graph have been created by a random number generator with the probability of a bit being 0 equal to 0.50. The average and the standard deviation are 0.502 and 0.005 for this case. This set of bitsets is called OPTIMAL. 10000 bitsets have been used for all three cases. We do not expect much better results from modifying Algorithm 2 in the


Figure 17: The absolute value of the correlation matrices of the first 64 bits in the bitsets called REAL and OPTIMAL.

way described above, but it makes the bitsets follow our theory better. For all practical applications we can say that the probability of a bit being set to 0 is exactly 0.50 if we modify Algorithm 2; this is assumed in the rest of this section.

We know that the probability of a bit being 0 is 0.50. To investigate how close to maximum entropy the bitsets are, we now study how dependent the bits are. Since the bits are independent if they are uncorrelated, we study the correlation matrix ρ_ij, see Equation 2. The absolute value of the correlation matrix of the first 64 bits for REAL and OPTIMAL can be seen in Figure 17. To be able to compare different populations of bitsets we need a measurement of how dependent the bits are. We can use the *dependence value*

$$p_{dep} = \sqrt{\frac{\sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \rho_{ij}^{2} - N}{N^{2} - N}} \qquad (10)$$

as such a measurement. N is the number of bits. p_dep is simply the root mean square value of the off-diagonal elements of the matrix ρ. We also define an *independence value* as

$$p_{ind} = 1 - p_{dep}. \qquad (11)$$
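Equations (10) and (11) can be computed directly from a population of bitsets, represented here as lists of 0/1 values (this sketch and its names are mine, not the paper's code):

```python
import math

def dependence_value(bitsets):
    """p_dep (Eq. 10): root mean square of the off-diagonal elements
    of the bit correlation matrix rho_ij over a bitset population."""
    n_sets = len(bitsets)
    n_bits = len(bitsets[0])
    means = [sum(bs[i] for bs in bitsets) / n_sets for i in range(n_bits)]

    def rho(i, j):
        # sample correlation coefficient of bit i and bit j
        cov = sum((bs[i] - means[i]) * (bs[j] - means[j])
                  for bs in bitsets) / n_sets
        var_i = sum((bs[i] - means[i]) ** 2 for bs in bitsets) / n_sets
        var_j = sum((bs[j] - means[j]) ** 2 for bs in bitsets) / n_sets
        return cov / math.sqrt(var_i * var_j) if var_i * var_j > 0 else 0.0

    sq_sum = sum(rho(i, j) ** 2 for i in range(n_bits) for j in range(n_bits))
    # subtract the N diagonal terms (rho_ii = 1), average over N^2 - N
    return math.sqrt(max(0.0, sq_sum - n_bits) / (n_bits ** 2 - n_bits))

def independence_value(bitsets):
    """p_ind = 1 - p_dep (Eq. 11)."""
    return 1.0 - dependence_value(bitsets)
```

Fully correlated bits give p_dep = 1, while pairwise uncorrelated bits give p_dep = 0.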

To verify the correlation between entropy and the independence value p_ind, bitsets consisting of 8 bits are formed randomly in the same way as the REAL set of bitsets. This is done for a number of different template sizes and compared to the estimated relative entropy H_R, defined as

$$H_R = \frac{-\sum_{i=0}^{N-1} p_i \log_2 p_i}{\log_2 N}. \qquad (12)$$

p_dep, p_ind and H_R are limited to the interval [0, 1]. Since the list of pixel pairs formed by Algorithm 1 influences the relative entropy and the independence value significantly for small bitsets, we study the average of these values over many different lists of pairs. Figure 18 shows the average relative entropy and the average independence value when 50 different lists are used. 10000 bitsets have been used for each list and template size. We can see that the estimated entropy and the independence value are closely related. It is difficult to say how well this result generalizes to bitsets with much more data than one byte. We make the assumption that when two populations of bitsets are compared, the one with the highest independence value also has the highest entropy. A proof of the validity of this assumption is not available.
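A sketch of Equation (12), under my reading that the p_i are the relative frequencies of the N possible bitset values observed in the sample (function name and signature are mine):

```python
import math
from collections import Counter

def relative_entropy(samples, n_outcomes):
    """H_R (Eq. 12): sample entropy normalized by log2(n_outcomes),
    so that a uniform distribution gives H_R = 1."""
    total = len(samples)
    probs = [c / total for c in Counter(samples).values()]
    h = -sum(p * math.log2(p) for p in probs)  # estimated entropy in bits
    return h / math.log2(n_outcomes)
```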

Figure 18: The average relative entropy and the average independence value as a function of the template size.

It is interesting to study the distribution of the number of equal bits when a template bitset is compared to an image neighborhood corresponding to a miss...

**6** **Comments**

This paper was never completed; hopefully this part of it is still of some value. It lacks some important things, such as the applications of tracking cars and estimating optical flow. A section about possible improvements of the algorithm should also be added. However, the existing main sections are fairly complete, except for Section 5.

**References**

[1] Corel’s homepage, August 2000,www.corel.com.

[2] P. Aschwanden, W. Guggenb ¨uhl. Experimental Results from a Comparative Study
*on Correlation-Type Registration algorithms. Robust Computer Vision, Forstner,*
Rudwiedel (Eds.), Wichmann 1992, pp. 268-289.

[3] Salvatore D. Morgera, Jihad M. Hallik. A Fast Algorithm for Entropy Estimation
*of Grey-level Images. Workshop on Physics and Computation, 1994. PhysComp*
’94, Proceedings.

[4] Aaron D Wyner, Jacob Ziv, Abraham J. Wyner. On the Role of Pattern Matching
*in Information Theory. IEEE Transactions of Information Theory, Vol. 44, No. 6,*
October 1998.