Mice recognition - representation of earprints


Technical report, IDE0855, May 2008


Master’s Thesis in Computer Systems Engineering

Antonio Vara Piquer

Sebastià Antoni Verger Vidal

School of Information Science, Computer and Electrical Engineering


1. INTRODUCTION

1.1 Motivation

1.2 Objectives

2. BACKGROUND

2.1 Feature extraction

2.1.1 Triangulation Method

2.1.2 Minutia Points

2.2. Matching

3. METHODS

3.1. Sparse Representation

3.2. New features added to the sparse representation

3.3. Reconstruction of the bvp

3.3.1. Ordering

3.3.2. Selection of candidates for connection

3.3.3 Calculation of the connection probability

3.3.4 Probability matrix and Connection matrix

3.4. Graph Representation

4. EXPERIMENTS AND RESULTS

4.1. Data

4.2. Reconstruction of the bvp

4.2.1. Restricting the number of candidates

4.2.2. Parameters for the probability function

4.3. Results

4.3.1. The truth matrix

4.3.2. Evaluation scheme and Calculation of error rates

4.3.3 Results of the algorithm

4.4. Experiment step by step.

5. DISCUSSION

5.1. Graph Matching

5.1.1. Objective

5.1.2. Problems in Graph Matching

5.1.3. Binary Tree

6. CONCLUSIONS


1. Introduction

Biometric recognition has a new application area in which it can replace today’s invasive identification methods. Because biometric identification is non-invasive, its code space is restricted only by the uniqueness of the biometric identifier, and it can be applied with a predictable error rate.

In this work, we will extract the information encoded in the blood vessel pattern (bvp) in a mouse ear as the unique information for each ear. The tests will be done using a sample of 50 mice.

1.1 Motivation

Investigations involving animals are essential in biomedical research, and they require a reliable method for identifying individual animals. Today’s identification methods are invasive for the animals; by invasive we mean the use of tattoos, ear tags, freeze brands, subcutaneous electromagnetic transponders, ear notches, or even toe clipping (the latter is, however, considered inhumane) [1]. All these methods are to some degree invasive and restricted in code space (e.g. there is a limited number of notches that can be cut in a mouse’s ear), so new methods are needed that are accurate, non-invasive (and humane) and offer a larger code space.

“Biometric identifiers, such as fingerprints, face geometry, voice, iris, retina etc. are today used to verify the identity of humans and it is natural to extend these ideas to animals, since animals also have individual physical characteristics.” [1]

Usually laboratory animals are rodents (e.g. mice and rats) and the measurement must be quick and not unpleasant for the animal since the identification will be made several times. A retinal scan, which takes about 15 seconds to perform on a cow, is completely unfeasible for each rodent in the lab several times a week.

In this paper we present a new way to extract the information of the bvp that can be used later as a technique in the identification of mice.

1.2 Objectives

Starting from a sparse representation, defined as the positions of the m-points and the local direction around each point [3], the objective is to reconstruct the bvp. By reconstruction we mean finding the truly connected m-points, i.e. m-points that are connected by a blood vessel, and connecting them with straight lines. This procedure yields a more complete description of the bvp and therefore provides more features to describe it.


An effective representation is important when two reconstructed bvps are matched during identification (one-to-many matching). The effective representation is an ordered set of m-points: the m-points must be ordered in the same way for the two bvps so that corresponding points can be matched.

Figure 1. A 24-bit RGB color image of a mouse ear.

2. Background

Little information on this topic is available because it is a new branch of research. Some research groups, as well as private companies, are working in this field. We approached one of these groups, but its work on the project has been discontinued because of its complexity. The main problem has been extracting unique features from mouse ear images.

2.1 Feature extraction

In previous work on “Biometric Identification of Mice” [3], the researchers extracted the m-points and the histograms. From that data we can also derive the global direction of the blood vessel pattern (bvp) and the local direction around the m-points. We will return to this point later in the project description.

2.1.1 Triangulation Method


Figure 2. Triangulation method

2.1.2 Minutia Points

We will work with one kind of information, the minutia points (m-points). This information is derived from the analysis of the image and is used to determine the location and orientation of ridge bifurcations and ridge terminations. The same procedure is used in fingerprint recognition. Figure 3 shows the minutia points in a fingerprint.

Figure 3. Fingerprint with minutia points marked with a square

2.2. Matching

Graph matching is a powerful yet computationally expensive procedure. If the sample graph is matched against a large database of model graphs, the size of the database is introduced as an additional factor into the overall complexity of the matching process.


3. Methods

3.1. Sparse Representation

The information that we have about the image (see Figure 4) is the sparse representation [3]. The sparse representation consists of the positions (see Figure 5) of the special points called minutiae points (m-points, labelled in red in Figure 4) and the local direction histograms (see Figure 6).

Figure 4. Gray scale image with the extracted m-points labelled with red numbers 1-13.

Figure 6. Direction histogram for the first nine m-points. The maxima in the histograms mean that there is a possible connection in this direction. Analysis can confirm the hypothesized connection between the two points.

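| Point number | Column | Row |
|---|---|---|
| 1 | 36 | 41 |
| 2 | 36 | 54 |
| 3 | 64 | 20 |
| 4 | 40 | 13 |
| 5 | 65 | 35 |
| 6 | 55 | 24 |
| 7 | 58 | 10 |
| 8 | 55 | 45 |
| 9 | 18 | 46 |
| 10 | 21 | 10 |
| 11 | 22 | 26 |
| 12 | 26 | 64 |
| 13 | 50 | 65 |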

Figure 5. List of the positions of the m-points in


Thus with the initial information and the objective as stated under 1.2 of this work, we can divide the thesis into the following procedural steps:

Project steps:

0. The point of departure is the sparse representation derived from previous work [3].

1. Perform the reconstruction of the blood vessel pattern (bvp) from the sparse representation. The reconstruction can be seen as a graphical representation of the bvp.

2. Evaluate the performance of the reconstruction.

3. Compute invariant features for each node in the graphical representation.

a. Connect the m-points of the bvp from the sparse representation. We can then determine which points are the sons of a given m-point and calculate the vectors between them. From these vectors we can calculate the three angles between them and their lengths.

4. Identification by graph matching.

a. For this step we will use a graphical structure with the angles of the sons and the length to each son.

b. We want to find an identification of O(n) complexity.
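Step 3a can be illustrated with a small Python sketch. It is only a sketch under assumptions: the function name `node_features`, the use of (column, row) tuples, and the choice of measuring the angles between vectors that are neighbours in angular order are ours, not taken from the original implementation.

```python
import math

def node_features(point, neighbours):
    """Lengths of the vectors from `point` to its connected m-points, and the
    angles between vectors that are adjacent in angular order (hypothetical helper)."""
    vectors = [(n[0] - point[0], n[1] - point[1]) for n in neighbours]
    lengths = [math.hypot(dx, dy) for dx, dy in vectors]
    directions = sorted(math.atan2(dy, dx) for dx, dy in vectors)
    count = len(directions)
    # Angle from each direction to the next one, wrapping around the full circle.
    angles = [(directions[(i + 1) % count] - directions[i]) % (2 * math.pi)
              for i in range(count)]
    return lengths, angles

lengths, angles = node_features((0, 0), [(1, 0), (0, 2), (-3, 0)])
print(lengths, angles)
```

For an m-point with three connections the three angles sum to a full turn. Lengths and angles do not change when the whole pattern is translated or rotated, although the order in which the angles are listed may shift cyclically.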


3.2. New features added to the sparse representation

To improve the analysis of the image and the results, we added two more features:

- The mean direction of the blood vessel pattern (see Figure 7).

- The mean direction of each m-point (see Figure 8).

Figure 7. Representation of the mean direction of the blood vessel pattern.

Figure 8. Example of the mean direction of one m-point.

The mean direction of the blood vessel pattern is used for ordering the points in a “natural order”. This order means that the points closer to the root are checked first and the leaves are checked last. The mean direction of an m-point is used in the reconstruction process of the bvp.


Figure 9. The vector image for the gray scale image in Figure 4.

Figure 10. The vectors around an m-point marked in gray in Figure 9.

Given the position of an m-point and the vectors around it, we derive the mean direction around the m-point by adding all the vectors in a neighbourhood of the point (see Formula 2). The resulting argument is in the double angle (d.a.) representation of the gradient.

The formulas to calculate mean directions are as follows:

1. Mean direction for the bvp.

$$MD_{bvp} = \operatorname{Arg}\left\{ \sum_{i=1}^{N} v_i \right\} \tag{1}$$

vi= the gradients in the vector image (in d.a. representation).

2. Mean direction for an m-point.

$$MD_{m} = \operatorname{Arg}\left\{ \sum_{i=1}^{M} v_i \right\} \tag{2}$$

vi = the gradients in the neighbourhood of an m-point (in d.a. representation).

M = number of gradients in the neighbourhood.

To calculate the orientation from the gradients in d.a. representation we need to divide the argument by 2 and add pi/2. We can then derive the orientation in a normal angle, rather than in a double angle representation.
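As a rough illustration of Formulas 1 and 2 and of the conversion to an ordinary angle, the following Python sketch assumes that each gradient is available as a complex number gx + i·gy and that its double angle representation is obtained by squaring it. The function names and this encoding are assumptions made for the example, not part of the original implementation.

```python
import cmath
import math

def to_double_angle(gradient):
    """Square a complex gradient so that opposite directions coincide (assumed d.a. encoding)."""
    return gradient * gradient

def mean_direction(da_gradients):
    """Formulas 1 and 2: argument of the sum of the gradients in d.a. representation."""
    return cmath.phase(sum(da_gradients))

def orientation(md):
    """Convert a d.a. argument to an ordinary orientation: divide by 2 and add pi/2."""
    return md / 2 + math.pi / 2

# Two opposite horizontal gradients, as found on either side of a vertical vessel.
neighbourhood = [to_double_angle(complex(1, 0)), to_double_angle(complex(-1, 0))]
print(orientation(mean_direction(neighbourhood)))
```

In this toy case the two opposite gradients reinforce each other instead of cancelling, which is the reason for working in the double angle representation, and the resulting orientation is perpendicular to the gradients.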

3.3. Reconstruction of the bvp

What we now have is the sparse representation of the image (see Figures 5 and 6) in addition to the new features that we added. Our goal is to reconstruct the blood vessel pattern, i.e. to make the connections between the m-points (see Figure 11).

It is important to say that for this project it is better to obtain good connections than to obtain large numbers of connections.

Figure 11. Example of the connections that we want to find in the image shown in Figure 4, represented with a graph structure.

3.3.1. Ordering

Let M be the vector to be projected and K a vector of length one that gives the direction of projection. Then

$$\langle M, K \rangle \, K$$

is the projection of M in the direction of K, where $\langle M, K \rangle$ is the scalar product.

Thus, we will use projections on the mean direction K to order the m-points as we can see in Figure 13.

Figure 13. Example of the projection on the mean direction of the bvp for the three m-points: M7, M8, and M10.


Thus with this we have ordered the m-points starting at the root of the bvp.

This procedure enables the analysis to proceed faster, because we attempt to connect each m-point in only one direction along the ordering. Consequently, we try to connect an m-point only with those points that can be connected to it given the propagation of the bvp, and we do not have to check the analysed m-point against all the others.
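The ordering can be sketched in a few lines of Python. The sketch assumes that the column is used as the x coordinate and the row as the y coordinate, that K = (cos MD, sin MD) with the bvp mean direction MD = 2,3238 quoted in section 4.4, and that the points are sorted by increasing scalar product; the function name is hypothetical. Under these assumptions the sketch yields the order 7, 3, 6, 5, 4, 10, 8, 11, 1, 13, 2, 9, 12 listed in Figure 15.

```python
import math

# (column, row) positions of the m-points from Figure 5, keyed by label.
positions = {
    1: (36, 41), 2: (36, 54), 3: (64, 20), 4: (40, 13), 5: (65, 35),
    6: (55, 24), 7: (58, 10), 8: (55, 45), 9: (18, 46), 10: (21, 10),
    11: (22, 26), 12: (26, 64), 13: (50, 65),
}

def order_by_projection(points, mean_direction):
    """Sort m-point labels by the scalar product <M, K>, where K is the unit
    vector of the mean direction (assumed convention: x = column, y = row)."""
    k = (math.cos(mean_direction), math.sin(mean_direction))

    def projection(label):
        col, row = points[label]
        return col * k[0] + row * k[1]

    return sorted(points, key=projection)

ordered = order_by_projection(positions, 2.3238)
print(ordered)
```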

3.3.2. Selection of candidates for connection

Before starting to find the connections, we have to take into consideration a few properties of the blood vessel pattern:

- With the points ordered, all forward connections of a point are to its “sons”, and the backward connections are to its “fathers”. We do not need to handle the backward connections, because they have already been examined as connections of the father.

- Connected m-points are never more than 8 positions apart in the ordered list, because of the structure of the bvp.

- When two points are connected, they have two very similar local maxima on the direction histogram.

- It is possible that two points have really close local maxima on the histograms without being connected.

To explain the method better, we present an example showing the connection of one m-point, namely the m-point labelled 6 in Figure 4.

Local maxima in the direction histograms

From the direction histograms we can derive the directions of the blood vessel pattern. These directions are gradient directions in a double angle representation, and we compute the orientation angle D using Formula 3:

$$D = \frac{\mathrm{Angle}}{2} + \frac{\pi}{2} \tag{3}$$

Angle is the value in the direction histogram, in radians, that we want to translate into the orientation angle D.

Every direction histogram covers 360 degrees, and for every value inside this range it gives the number of vectors in the neighbourhood of the m-point that point in that direction.

We want to know the local maxima because it is a feature that can help us in our work to know how the m-points are connected.

We have to iterate through 360 degrees to find all the local maxima in the direction histogram.

Figure 14. Direction histogram for the m-point labelled equal to one. Extracted local maxima are colored in red.
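The search for local maxima can be sketched as follows. This is a minimal sketch assuming one histogram bin per degree and no smoothing or minimum peak height; the function name and the peak criterion are our assumptions, not the original code.

```python
def local_maxima(histogram):
    """Return the bins whose count is greater than the left neighbour and not
    smaller than the right neighbour; the histogram is treated as circular."""
    n = len(histogram)
    peaks = []
    for angle in range(n):
        previous = histogram[(angle - 1) % n]
        following = histogram[(angle + 1) % n]
        if histogram[angle] > previous and histogram[angle] >= following:
            peaks.append(angle)
    return peaks

# Toy histogram with peaks at 0, 28 and 165 degrees.
hist = [0] * 360
hist[359], hist[0] = 1, 3
hist[27], hist[28], hist[29] = 2, 4, 2
hist[165] = 7
print(local_maxima(hist))
```

Because the neighbours are taken modulo the histogram length, a peak at the wrap-around between 359 and 0 degrees is found like any other.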

Close in maxima in the direction histograms

For each point, we first need to know the point’s local maxima. Then, for each local maximum, we make a list of candidates, i.e. points with a high probability of being connected.

To list the candidates, we take the value of the first local maximum of our m-point (see Figure 16) and compare it with the values of the local maxima of the points included in the comparison, which are coloured green in Figure 15.

List of m-points: 7 3 6 5 4 10 8 11 1 13 2 9 12

Legend: studied m-point; m-points that will be checked for connection.

Figure 15. Example of the relationship between the studied point and the rest of the m-points checked for connections (in green).


Figure 16. Example of the local maxima comparison
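The listing of candidates can be sketched in Python with the data of the running example. The sketch assumes the closeness limit of 30° given in section 4.2, the window of 8 positions mentioned above, a circular difference on 360 degrees, and that a value of 0 in Figure 36 means "no maximum"; all names are hypothetical. With these assumptions the sketch gives, for the first m-point in the ordering, the candidate lists 5, 6, 8, 9 (maximum 165) and 2, 3, 4, 7, 9 (maximum 28) quoted in section 4.4.

```python
# Local maxima in degrees per m-point (Figure 36); 0 is treated as "no maximum" (assumption).
maxima = {
    1: (89, 146, 20), 2: (19, 69, 299), 3: (360, 107, 39), 4: (166, 194, 131),
    5: (5, 86, 350), 6: (100, 39, 21), 7: (165, 28, 321), 8: (79, 32, 0),
    9: (92, 121, 158), 10: (177, 205, 0), 11: (119, 145, 219), 12: (82, 0, 0),
    13: (29, 0, 0),
}
order = [7, 3, 6, 5, 4, 10, 8, 11, 1, 13, 2, 9, 12]  # m-points sorted from the root

def circular_difference(a, b):
    """Smallest difference in degrees between two directions on a 360-degree circle."""
    diff = abs(a - b) % 360
    return min(diff, 360 - diff)

def candidates(position, value, limit=30, window=8):
    """Positions (1-based, in the ordering) of the next `window` m-points that
    have a local maximum closer than `limit` degrees to `value`."""
    found = []
    last = min(position + window, len(order))
    for other in range(position + 1, last + 1):
        peaks = [m for m in maxima[order[other - 1]] if m != 0]
        if any(circular_difference(value, m) < limit for m in peaks):
            found.append(other)
    return found

print(candidates(1, 165))
print(candidates(1, 28))
```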

Restricting the number of candidates

Some m-points in the list of candidates are known statistically not to be connected to the studied m-point. These are the points that lie at a long distance from the studied m-point or whose connection makes a large angle with its mean direction. The distance and the difference in angle are computed from the coordinates of the candidate points. To speed up the execution of the algorithm, we discard these m-points, since they are not true connections of the studied m-point.

The list of candidates will be filtered with these two values: the angle difference and the distance.
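A possible form of this filter is sketched below. The limits 56 and 50 are the values derived in section 4.2.1; treating the distance as a Euclidean distance in image coordinates, the angle as degrees, and the mean direction as an undirected orientation (so that differences are folded into 0 to 90 degrees) are assumptions of this sketch, as are the function names.

```python
import math

MAX_DISTANCE = 56    # section 4.2.1: 51 + 5 (security range)
MAX_ANGLE_DIFF = 50  # section 4.2.1: 45 + 5 (security range), assumed to be degrees

def angle_difference(mean_direction, origin, target):
    """Difference in degrees between an orientation (radians) and the line
    origin -> target, folded into the range 0..90 (assumed convention)."""
    connection = math.atan2(target[1] - origin[1], target[0] - origin[0])
    diff = math.degrees(abs(connection - mean_direction)) % 180
    return min(diff, 180 - diff)

def filter_candidates(origin, mean_direction, candidates):
    """Keep the candidates within both limits; `candidates` maps label -> (col, row)."""
    kept = []
    for label, position in candidates.items():
        distance = math.dist(origin, position)
        if (distance <= MAX_DISTANCE
                and angle_difference(mean_direction, origin, position) <= MAX_ANGLE_DIFF):
            kept.append(label)
    return kept

toy = {"a": (10, 0), "b": (0, 10), "c": (100, 0), "d": (-10, 1)}
print(filter_candidates((0, 0), 0.0, toy))
```

In the toy example, candidate "b" is rejected because of its angle and "c" because of its distance, while "d" is kept because an orientation has no preferred sense along the vessel.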

Figure 17. Example of discarding two candidate points, 10 and 5, because their difference in angle is too large.

3.3.3 Calculation of the connection probability

When there is more than one candidate in the candidates list, the best candidate is selected by calculating the connection probability Pconnection(x,y) of each point according to Formula 4.

P_connection(x,y)=P_diffangle(x)⋅P_length(y) (4)

To calculate this probability we use the Gaussian Function (see Formula 5).

$$P(x) = e^{-\frac{(x-b)^{2}}{2c^{2}}} \tag{5}$$

Applying this formula to the difference in angle to the mean direction (called diffangle) and to the distance to the m-point (called length), we estimate the probabilities Pdiffangle(x) and Plength(y) that this point is connected.

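Using the following parameters:

b: The reference value, i.e. the value of x at which the Gaussian function reaches its maximum (see 4.2.2).

c²: This parameter controls the width of the Gaussian function. In our case it controls the probability values assigned to x.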

Figure 18. Example of the Gaussian Function for the parameters c= 0.2, b=0. P(x) = 0.35 when x=0.6.
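Formulas 4 and 5 translate directly into code. The sketch below assumes that the angle is passed as a difference, so that both reference values b are 0, and uses the widths 26.64 and 30 reported in section 4.2.2 only as example inputs; the function names are ours.

```python
import math

def gaussian(x, b, c):
    """Formula 5: P(x) = exp(-(x - b)^2 / (2 c^2))."""
    return math.exp(-((x - b) ** 2) / (2 * c ** 2))

def connection_probability(diff_angle, length, c_angle, c_length,
                           b_angle=0.0, b_length=0.0):
    """Formula 4: product of the angle probability and the length probability."""
    return gaussian(diff_angle, b_angle, c_angle) * gaussian(length, b_length, c_length)

# Example inputs: an angle difference of 22.5 and a distance of 25.
print(round(connection_probability(22.5, 25.0, 26.64, 30.0), 2))
```

Because the two factors are multiplied, a candidate needs both a small angle difference and a short distance to reach a high connection probability.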

3.3.4 Probability matrix and Connection matrix

The connection probabilities Pconnection are presented in the form of a Probability matrix (see Figure 19). This matrix has M rows and M columns, where M is the number of m-points in the image. A value P(r,c) in the matrix is interpreted as the probability of connecting points r and c.

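| POINT | 7 | 3 | 6 | 5 | 4 | 10 | 8 | 11 | 1 | 13 | 2 | 9 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 7 | 0 | 0 | 0,49 | 0,27 | 0,65 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 0 | 0 | 0,79 | 0,6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 6 | 0,49 | 0,79 | 0 | 0 | 0 | 0 | 0 | 0 | 0,44 | 0 | 0 | 0 | 0 |
| 5 | 0,27 | 0,6 | 0 | 0 | 0 | 0 | 0,75 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 0,65 | 0 | 0 | 0 | 0 | 0 | 0 | 0,5 | 0 | 0 | 0 | 0 | 0 |
| 10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 8 | 0 | 0 | 0 | 0,75 | 0 | 0 | 0 | 0 | 0 | 0,59 | 0 | 0 | 0,2 |
| 11 | 0 | 0 | 0 | 0 | 0,5 | 0 | 0 | 0 | 0 | 0 | 0 | 0,31 | 0 |
| 1 | 0 | 0 | 0,44 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0,81 | 0,56 | 0 |
| 13 | 0 | 0 | 0 | 0 | 0 | 0 | 0,59 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0,81 | 0 | 0 | 0 | 0,77 |
| 9 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0,31 | 0,56 | 0 | 0 | 0 | 0 |
| 12 | 0 | 0 | 0 | 0 | 0 | 0 | 0,2 | 0 | 0 | 0 | 0,77 | 0 | 0 |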

Figure 19. Example of a probability matrix.

This matrix can be thresholded at a chosen value. A low threshold generates more good connections but also more bad connections. With a high threshold we obtain fewer good and fewer bad connections but many more missing connections (see Figure 20).

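NOTE: in this project, thresholding means that only values lower than the threshold are set to zero.

| POINT | 7 | 3 | 6 | 5 | 4 | 10 | 8 | 11 | 1 | 13 | 2 | 9 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 7 | 0 | 0 | 0 | 0 | 0,65 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 0 | 0 | 0,79 | 0,6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 6 | 0 | 0,79 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | 0 | 0,6 | 0 | 0 | 0 | 0 | 0,75 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 0,65 | 0 | 0 | 0 | 0 | 0 | 0 | 0,5 | 0 | 0 | 0 | 0 | 0 |
| 10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 8 | 0 | 0 | 0 | 0,75 | 0 | 0 | 0 | 0 | 0 | 0,59 | 0 | 0 | 0 |
| 11 | 0 | 0 | 0 | 0 | 0,5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0,81 | 0,56 | 0 |
| 13 | 0 | 0 | 0 | 0 | 0 | 0 | 0,59 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0,81 | 0 | 0 | 0 | 0,77 |
| 9 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0,56 | 0 | 0 | 0 | 0 |
| 12 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0,77 | 0 | 0 |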


The Connection matrix (see Figure 21) is an M x N matrix, where the index equals the label of an m-point and the matrix elements are the label of the connected points.

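| m-point | Connected m-points |
|---|---|
| 7 | 4 0 0 |
| 3 | 6 5 0 |
| 6 | 3 0 0 |
| 5 | 3 8 0 |
| 4 | 7 11 0 |
| 10 | 0 0 0 |
| 8 | 5 13 0 |
| 11 | 4 0 0 |
| 1 | 2 9 0 |
| 13 | 8 0 0 |
| 2 | 1 12 0 |
| 9 | 1 0 0 |
| 12 | 2 0 0 |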

Figure 21. Example of a Connection matrix after thresholding the probability matrix with a value of 0.5
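The step from the probability matrix to the connection matrix can be sketched as follows. The sketch stores only the non-zero entries of Figure 19, one per pair of m-points, which is a representation chosen for brevity and not the one used in the original program; the function name is hypothetical. With a threshold of 0.5 it gives the rows of Figure 21.

```python
# Non-zero entries of the probability matrix in Figure 19, one per pair of m-points.
probabilities = {
    (7, 6): 0.49, (7, 5): 0.27, (7, 4): 0.65, (3, 6): 0.79, (3, 5): 0.60,
    (6, 1): 0.44, (5, 8): 0.75, (4, 11): 0.50, (8, 13): 0.59, (8, 12): 0.20,
    (11, 9): 0.31, (1, 2): 0.81, (1, 9): 0.56, (2, 12): 0.77,
}
order = [7, 3, 6, 5, 4, 10, 8, 11, 1, 13, 2, 9, 12]

def connection_matrix(probabilities, order, threshold):
    """Drop every probability below the threshold and list, for each m-point,
    the labels of the m-points it remains connected to."""
    connections = {label: [] for label in order}
    for (a, b), p in probabilities.items():
        if p >= threshold:  # only values lower than the threshold are set to zero
            connections[a].append(b)
            connections[b].append(a)
    # List the connected labels in the column order of the probability matrix.
    rank = {label: i for i, label in enumerate(order)}
    return {label: sorted(row, key=rank.get) for label, row in connections.items()}

for label, row in connection_matrix(probabilities, order, 0.5).items():
    print(label, row)
```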

3.4. Graph Representation

This part of the project concerns project steps 3 and 4, so our objective here is to build a representation from the data that we have extracted from the image.

We use Java to build a graph structure for matching mice. If the previous part was difficult, this part is even more complex: without a good result from the first part it is quite difficult to fill a binary tree with the data, because we have to rely on those results to build the graph representation and the connections.

From the data in Matlab we made an XML file, the structure of which is:

</point>

</mice> </Database>

We use XML to represent the database. We chose this kind of file because it is a well-known standard, it is easier to understand than a flat file containing arrays of numbers, and numerous tools are available to work with it. Furthermore, anyone who continues our work in the future can use the data in many ways.


Using Java we read the XML file and we build an xml tree before moving on to putting all the data into a binary tree structure; the structure is shown in Figure 22:

class node
{
    Integer nodeNumber;
    Double angle1;
    Double length1;
    Double angle2;
    Double length2;
    Double angle3;
    Double length3;
    node left;
    node right;
}

Figure 22. Graphical representation of the structure.

This is the basic structure needed to build a binary tree, but we need a good data set from the previous step; otherwise it is almost impossible to fill the tree.

Figure 23. Result obtained from the reconstruction method.

Figure 24. How the graph is intended to be.


4. Experiments and Results

4.1. Data

The analysed data are 50 images from the first data set of images that we were provided with. The data set is composed of images of 50 mice: five images of each of the first 20 mice and one image of each of the next 30 mice. To obtain a good representation of the data, we took one image of each mouse numbered 1-20 and the images of the mice numbered 21-50.

4.2. Reconstruction of the bvp

Close in maxima in the direction histogram

Closeness of local maxima between direction histograms is defined as a difference in direction of less than 30°.

4.2.1. Restricting the number of candidates

To find out how to discard candidate points, we derived statistics from the true connections and looked for the maximum distance and the maximum angle difference between two connected points. After analysing all the true connections in all fifty images, the largest distance found between two connected points is 51 (see Figure 25). When we filter the candidates list, we therefore remove all candidates that lie more than 51 + 5 (security range) = 56 points away from the m-point.

Figure 26. The histogram of the differences in angle between the mean direction of an m-point and the direction of the connection to a point.

Looking at the histogram of the differences in angle between the mean direction of an m-point and the direction of the connection to another m-point (see Figure 26), we can see that the maximum value is 45; adding a security range of 5 gives 50. Consequently, this is a good value for filtering the candidates list, i.e. an angle difference > 50 disqualifies a candidate from the list.

4.2.2. Parameters for the probability function

Reference values

For the Gaussian function, we need two reference values, one for the length and one for the diffangle (value b in the Formula 5).

For the difference in angle between the mean direction of the m-point and the direction of the connection to the candidates, we will use the mean direction of the m-point as the reference value for our probability function. That is because we want the probability value higher when this difference is small and lower when there is a major difference.


Standard deviation

In order to calculate the probability values we only need the values for the standard deviation for the angle and for the distance (c in the Figure 18).

For the angle, we examined the histogram (see Figure 26) and took its smallest value, 0, and its highest value, 45. We then calculated the mean of those two values: 22.5.

We then calculated the standard deviation that gives a probability of 0.7 for an angle difference of 22.5. The result is c = 26.64, which is the value we use for this standard deviation.

For the standard deviation of the distance, we took the value 25 because it is the highest peak in the distance histogram (see Figure 27) and because it is close to the mean. As with the angle, we required a probability of 0.7 for this value, and the result is c = 30. That is the value that we use for the standard deviation of the difference in length.
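The two values of c follow from solving Formula 5 for c: setting P(x) = p gives c = |x − b| / sqrt(−2 ln p). The short Python sketch below evaluates this expression, assuming a reference value b = 0 for both the angle difference and the distance; the function name is ours. It gives approximately 26.64 for the angle and 29.60 for the distance, which is consistent with the rounded value of 30 used above.

```python
import math

def width_for_probability(x, p, b=0.0):
    """Solve exp(-(x - b)^2 / (2 c^2)) = p for c."""
    return abs(x - b) / math.sqrt(-2.0 * math.log(p))

c_angle = width_for_probability(22.5, 0.7)
c_length = width_for_probability(25.0, 0.7)
print(round(c_angle, 2), round(c_length, 2))
```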

4.3. Results

NOTE: we only present results for step 1 and 2.

4.3.1. The truth matrix

The Truth matrix is the matrix with the true connections of the bvp. It is calculated by looking at the image and creating the true connection matrix (see 3.3.4) by hand. This matrix will be used to calculate the percentage of the connections that we found and the level of truth in our computer-derived calculation matrix.

4.3.2. Evaluation scheme and Calculation of error rates

When each image is analyzed, the algorithm used to calculate the rates of success, failure and missing of the connections is the following one:

- First we check on the truth matrix to see how many connections there are.

- Second, we check how many connections are the same in the truth matrix and in the connection matrix and count them. These are the good connections.

- Third, we count how many connections are in the truth matrix and not in the connection matrix. These connections are the missing connections.

- Fourth, we count how many connections are in the connection matrix and not in the truth matrix. These are the bad connections.


One example:

Note: The extracted data have been filtered with a probability threshold. That means that all the connections with a probability of less than 0.5 have been discarded.

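The data that we need for this are the truth matrix (see Figure 27) and the connection matrix (see Figure 28).

| m-point | Connected m-points |
|---|---|
| 7 | 4 0 0 |
| 3 | 6 5 0 |
| 6 | 3 1 0 |
| 5 | 3 8 0 |
| 4 | 7 10 11 |
| 10 | 4 0 0 |
| 8 | 5 13 0 |
| 11 | 4 0 0 |
| 1 | 6 2 9 |
| 13 | 8 0 0 |
| 2 | 1 12 0 |
| 9 | 1 0 0 |
| 12 | 2 0 0 |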

Figure 27. Truth matrix of the image in Figure 4.

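| m-point | Connected m-points |
|---|---|
| 7 | 4 0 0 |
| 3 | 6 5 0 |
| 6 | 3 0 0 |
| 5 | 3 8 0 |
| 4 | 7 11 0 |
| 10 | 0 0 0 |
| 8 | 5 13 0 |
| 11 | 4 0 0 |
| 1 | 2 9 0 |
| 13 | 8 0 0 |
| 2 | 8 0 0 |
| 9 | 1 12 0 |
| 12 | 2 0 0 |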

Figure 28. Connection matrix of the image in Figure 4.

| Result Matrix | Truth matrix |
|---|---|
| 4 0 0 | 4 0 0 |
| 6 5 0 | 6 5 0 |
| 3 1 0 | 3 0 0 |
| 3 8 0 | 3 8 0 |
| 7 10 11 | 7 11 0 |
| 4 0 0 | 0 0 0 |
| 5 13 0 | 5 13 0 |
| 4 0 0 | 4 0 0 |
| 6 2 9 | 2 9 0 |
| 8 0 0 | 8 0 0 |
| 8 12 0 | 8 0 0 |
| 1 0 0 | 1 12 0 |
| 2 0 0 | 2 0 0 |

Legend: Good Connections, Bad Connections, Missing Connections.

Figure 29. Good, bad, and missing connections between the truth matrix and the connection matrix of the image in Figure 4.

Now we have all the data:

- Total number of connections in the truth matrix: 18

- Number of good connections: 17

- Number of bad connections: 5

- Number of missing connections: 1

Then the rates are:

- Rate of good connections: 17/18 = 0.94

- Rate of bad connections: 5/18 = 0.28

- Rate of missing connections: 1/18 = 0.06
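The evaluation scheme can be sketched with set operations. In this sketch a connection is counted once as an unordered pair of labels, and all three rates are divided by the number of connections in the truth matrix, as in the example above; the counting convention and the function names are assumptions, so the counts need not coincide with those of the original program.

```python
def edges(matrix):
    """Undirected connections of a connection matrix given as
    {label: [connected labels]}; 0 entries are padding and are ignored."""
    return {frozenset((a, b)) for a, row in matrix.items() for b in row if b != 0}

def connection_rates(truth, found):
    """Return the rates of good, bad and missing connections relative to the truth."""
    true_edges, found_edges = edges(truth), edges(found)
    total = len(true_edges)
    good = len(true_edges & found_edges)      # in both matrices
    bad = len(found_edges - true_edges)       # found but not true
    missing = len(true_edges - found_edges)   # true but not found
    return good / total, bad / total, missing / total

# Toy example with three m-points.
truth = {1: [2, 3], 2: [1, 0], 3: [1, 0]}
found = {1: [2, 0], 2: [1, 3], 3: [2, 0]}
print(connection_rates(truth, found))
```

The rates of good and missing connections always add up to 1, while the rate of bad connections is independent of them.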


4.3.3 Results of the algorithm

After analysing all the 50 images, we have the Probability matrix and the Connection matrix for each image.

In order to present the results, we filtered the Probability matrix using different values of the threshold (value of the probability filter in Figure 32). In the graphic below (see Figure 32) the results are presented as the rates of good, bad, and missing connections for different threshold values in the range of 0.1 to 1 in steps of 0.1 when generating the connection matrix.

For each threshold value the connection rates for each of the 50 images are calculated and the presented rates in Figure 32 are the mean of these 50 rates.

Figure 32: Graphical representation of the results.

NOTE: This algorithm cannot handle a non-straight bvp (see Figure 33). This is why the good connections rate for the threshold of 0.1 is not close to 1.

Figure 33. Example of a non-straight bvp.

The difference between the local maxima in the direction histograms of the two truly connected m-points 9 and 12 is not small enough for point 12 to appear in the candidate list of point 9.


4.4. Experiment step by step.

The following procedures are all the steps that our program uses to analyze a sparse representation of an image.

1. We start with the information of the sparse representation and the added features:

a. Position of the m-points (see Figure 34).

b. Direction histograms

c. The bvp mean direction = 2,3238

d. Each point mean direction (see Figure 35)

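| M-point Number | Column | Row |
|---|---|---|
| 1 | 36 | 41 |
| 2 | 36 | 54 |
| 3 | 64 | 20 |
| 4 | 40 | 13 |
| 5 | 65 | 35 |
| 6 | 55 | 24 |
| 7 | 58 | 10 |
| 8 | 55 | 45 |
| 9 | 18 | 46 |
| 10 | 21 | 10 |
| 11 | 22 | 26 |
| 12 | 26 | 64 |
| 13 | 50 | 65 |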

Figure 34. Position of the m-points (Column and Row).

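| m-point | Mean direction |
|---|---|
| 1 | 2,348 |
| 2 | 1,593 |
| 3 | 1,998 |
| 4 | 3,057 |
| 5 | 1,944 |
| 6 | 2,412 |
| 7 | 2,967 |
| 8 | 2,16 |
| 9 | 2,52 |
| 10 | 3,121 |
| 11 | 2,784 |
| 12 | 2,264 |
| 13 | 1,789 |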

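2. Now we estimate the local maxima of the direction histograms (see Figure 36).

| m-point | Local maxima |
|---|---|
| 1 | 89 146 20 |
| 2 | 19 69 299 |
| 3 | 360 107 39 |
| 4 | 166 194 131 |
| 5 | 5 86 350 |
| 6 | 100 39 21 |
| 7 | 165 28 321 |
| 8 | 79 32 0 |
| 9 | 92 121 158 |
| 10 | 177 205 0 |
| 11 | 119 145 219 |
| 12 | 82 0 0 |
| 13 | 29 0 0 |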

Figure 36. Values of the local maxima in degrees.

3. Now with the mean direction of the bvp we need to calculate the new order for the m-points (see Figure 37).

| Position | Number of the m-point |
|---|---|
| 1 | 7 |
| 2 | 3 |
| 3 | 6 |
| 4 | 5 |
| 5 | 4 |
| 6 | 10 |
| 7 | 8 |
| 8 | 11 |
| 9 | 1 |
| 10 | 13 |
| 11 | 2 |
| 12 | 9 |
| 13 | 12 |

4. With the new order, we have to rearrange the values of the maxima of the local histograms (see Figure 38), the values of the positions (see Figure 39) and the mean directions around the m-points (see Figure 40).

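| M-point Number | Local maxima |
|---|---|
| 1 | 165 28 321 |
| 2 | 360 107 39 |
| 3 | 100 39 21 |
| 4 | 5 86 350 |
| 5 | 166 194 131 |
| 6 | 177 205 0 |
| 7 | 79 32 0 |
| 8 | 119 145 219 |
| 9 | 89 146 20 |
| 10 | 29 0 0 |
| 11 | 19 69 299 |
| 12 | 92 121 158 |
| 13 | 82 0 0 |

Figure 38. Values of the maxima of the direction histograms after ordering.

| M-point Number | Column | Row |
|---|---|---|
| 1 | 58 | 10 |
| 2 | 64 | 20 |
| 3 | 55 | 24 |
| 4 | 65 | 35 |
| 5 | 40 | 13 |
| 6 | 21 | 10 |
| 7 | 55 | 45 |
| 8 | 22 | 26 |
| 9 | 36 | 41 |
| 10 | 50 | 65 |
| 11 | 36 | 54 |
| 12 | 18 | 46 |
| 13 | 26 | 64 |

Figure 39. Position of the m-points after the ordering.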

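| New Position | Old Position | mean direction |
|---|---|---|
| 1 | 7 | 2,9665 |
| 2 | 3 | 1,9976 |
| 3 | 6 | 2,4123 |
| 4 | 5 | 1,9444 |
| 5 | 4 | 3,0572 |
| 6 | 10 | 3,1212 |
| 7 | 8 | 2,1598 |
| 8 | 11 | 2,7841 |
| 9 | 1 | 2,348 |
| 10 | 13 | 1,7894 |
| 11 | 2 | 1,5928 |
| 12 | 9 | 2,5201 |
| 13 | 12 | 2,264 |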

5. Now we enter the “calculus” function.

6. We take the m-point number “1”.

a. We take the first maximum of the local histogram: local_maxima=165

i. We search in the matrix of the local maxima of the direction histograms for the m-points with a local maximum close to 165.

ii. In the list we have the points 5, 6, 8 and 9.

iii. We filter out the points with an angle difference greater than 50 or at a distance of more than 56 points.

iv. In the candidates list we have the point 5.

v. We calculate the probability for the diffangle: Probability(meanDirectionPoint1, angle_1_5, 27)

vi. We calculate the probability for the length: Probability(0,distance1_5, 30)

vii. We multiply these values: 0,6546

viii. We mark the connection between 1 and 5, and between 5 and 1, as 0.6546.

b. We take the second maximum of the local histogram: local_top=28

i. We search for the candidates.

ii. List of candidates: 2, 3, 4, 7, 9

iii. We filter by length and by angle: 3

iv. We calculate the probability for the point 3: 0,4926

v. We mark this connection.

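NOTE: we now repeat this for the next 12 points of the image. The result is shown in Figure 41.

| Order | | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | POINT | 7 | 3 | 6 | 5 | 4 | 10 | 8 | 11 | 1 | 13 | 2 | 9 | 12 |
| 1 | 7 | 0 | 0 | 0,49 | 0,27 | 0,65 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 3 | 0 | 0 | 0,79 | 0,6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 6 | 0,49 | 0,79 | 0 | 0 | 0 | 0 | 0 | 0 | 0,44 | 0 | 0 | 0 | 0 |
| 4 | 5 | 0,27 | 0,6 | 0 | 0 | 0 | 0 | 0,75 | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | 4 | 0,65 | 0 | 0 | 0 | 0 | 0 | 0 | 0,5 | 0 | 0 | 0 | 0 | 0 |
| 6 | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 7 | 8 | 0 | 0 | 0 | 0,75 | 0 | 0 | 0 | 0 | 0 | 0,59 | 0 | 0 | 0,2 |
| 8 | 11 | 0 | 0 | 0 | 0 | 0,5 | 0 | 0 | 0 | 0 | 0 | 0 | 0,31 | 0 |
| 9 | 1 | 0 | 0 | 0,44 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0,81 | 0,56 | 0 |
| 10 | 13 | 0 | 0 | 0 | 0 | 0 | 0 | 0,59 | 0 | 0 | 0 | 0 | 0 | 0 |
| 11 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0,81 | 0 | 0 | 0 | 0,77 |
| 12 | 9 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0,31 | 0,56 | 0 | 0 | 0 | 0 |
| 13 | 12 | 0 | 0 | 0 | 0 | 0 | 0 | 0,2 | 0 | 0 | 0 | 0,77 | 0 | 0 |

Legend: the Order row and column give the new ordering; the POINT row and column give the original (unordered) labels.

Figure 41. The probability matrix. The grey points are the probabilities calculated in the steps above. The points are labelled in both the old and the new ordering.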


5. Discussion

5.1. Graph Matching

5.1.1. Objective

Our objective was to match the blood vessel pattern (bvp) of a mouse against a database of bvps in order to recognize that mouse as fast as possible; for that purpose we built a graph structure, as shown in section 3.4 (Graph Representation).

We call it a graph representation, but it is really a binary tree [4], because working with general graphs can lead to problems that are very complex: in graph theory some problems can be solved in polynomial time, while for others no polynomial-time algorithm is known.

The latter include the NP-complete problems (NP stands for nondeterministic polynomial time), which can take a great deal of time to solve. We discuss this in the next section.

5.1.2. Problems in Graph Matching

A "graph" refers to a collection of vertices or 'nodes' and a collection of edges that connect pairs of vertices. A graph may be undirected, meaning that there is no distinction between the two vertices associated with each edge, or its edges may be directed from one vertex to another.

If we built a general graph, some wrong connections could introduce cycles, and general graph matching problems are usually NP-complete. A related example is the graph isomorphism problem, the graph theory problem of determining whether a graph isomorphism exists between two graphs. Two graphs are isomorphic if one can be transformed into the other simply by renaming vertices. This problem belongs to NP, although it is not known to be NP-complete.

We will use the binary tree method because it is easier to work with these structures and to write search algorithms for them than for a general graph. The binary tree also fits the bvp well, because every m-point has two sons.

5.1.3. Binary Tree

5.1.3.1. Description

To build the algorithms, consider that each node contains a record with a key value, which we use to perform searches. In the implementations that we present, each node holds only a single value, although in general a node consists of two components: a key, which is the field used for searching and ordering, and the information associated with that key. Seen another way, a node holds a composite item on which an order is defined.

5.1.3.2. Binary Search Tree

A binary search tree is a binary tree with the property that all elements stored in the left subtree of any node x are lower than the element stored in x, and all elements stored in the right subtree of x are higher than the element stored in x. We can see an example in Figure 42.

Figure 42. Two binary trees with the same elements.

This property makes the search algorithm of a binary search tree very simple.

To determine whether a key k is present in the tree, we compare it with the key r at the root. If they match, the search ends successfully. If k < r, then k, if present, must be a descendant of the left child of the root; if k is greater than r, it must be a descendant of the right child.
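The search described above can be sketched as follows. The class and function names are hypothetical, and the sketch stores only a key in each node, without the angles and lengths of the node structure in Figure 22.

```python
class Node:
    """A binary search tree node holding only a key (simplified for the example)."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Insert a key and return the (possibly new) root; duplicate keys are ignored."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root

def contains(root, key):
    """Compare with the current node and descend left or right until found or exhausted."""
    node = root
    while node is not None:
        if key == node.key:
            return True
        node = node.left if key < node.key else node.right
    return False

root = None
for key in [8, 3, 10, 1, 6]:
    root = insert(root, key)
print(contains(root, 6), contains(root, 7))
```

Each comparison discards one subtree, so the number of comparisons is bounded by the height of the tree; this is why the shape of the tree matters for the same set of elements.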


6. Conclusions

During our work we could not complete all the steps that we had planned at the beginning because of lack of time. An overview of the steps of our work is as follows:

1. A reconstruction of the bvp from the sparse representation and evaluation of the results.

2. Identification by graph matching.

We spent almost all our time on the first two steps of the project, the reconstruction and its evaluation; we had only one semester for our work, and this was the first time that we had worked on a project like this. At the beginning we were somewhat ‘in the dark’, but our supervisor guided us along the right path and we learnt a great deal as we went along. Nevertheless, our delays meant that we were not able to complete the identification by graph matching. We have, however, written guidance for future research in the discussion part.

7. References

[1] K. Nilsson, T. Rögnvaldsson, J. Cameron, and C. Jacobson. Biometric Identification of Mice. The 18th International Conference on Pattern Recognition, Hong Kong, 20-24 August 2006.

[2] A. Ellmauthaler and E. Wernsperger. Biometric Identification of Mice. Master’s Thesis in Computer Engineering, 2007.

[3] J. Cameron, C. Jacobson, K. Nilsson, and T. Rögnvaldsson. Identifying laboratory rodents using earprints. NC3Rs Publications, June 2007, www.nc3rs.org.uk.