
Bachelor Degree Project

Reed-Solomon Codes: Error Correcting Codes

Author: Isabell Skoglund
Supervisor: Per-Anders Svensson
Examiner: Marcus Nilsson
Subject: Mathematics
Semester: VT2020


Abstract

In the following pages an introduction to the error correcting codes known as Reed-Solomon codes is presented, together with different approaches for decoding. This is supplemented by a Mathematica program and a description of this program, which gives an understanding of how the choice of decoding algorithm affects the time it takes to find errors in stored or transmitted information.

Contents

1 Introduction
2 Error Correcting Codes
 2.1 Bounds on Codes
 2.2 Linear Codes for Finite Fields
 2.3 Cyclic Codes
 2.4 BCH Codes
3 Reed-Solomon Codes
 3.1 Encoding
 3.2 Properties
4 Methods
 4.1 Decoding
  4.1.1 Direct Method
  4.1.2 Berlekamp–Massey algorithm
 4.2 My program
5 Result
6 Conclusion
A The Mathematica Code


1 Introduction

To be able to safely store or transmit information without losing important messages, one needs a way to detect and correct errors that can occur during the process. All available channels of communication have some degree of noise or interference, such as a scratch on a CD or a neighbouring channel in radio transmission, and this has to be considered when transmitting information.

One way to still be able to transmit messages safely over a noisy channel is to add some redundancy to the message. This gives the ability to reconstruct a corrupted message. It can be done by replacing the symbols in the original message by codewords that have some redundancy in them.

There are several different ways to do this. One of the simplest is to just repeat the message, which gives a repetition code. That is, the message that is sent is repeated some number of times, so even if some part is disrupted the message is most likely still readable. Here it is probably quite easy to see if there is an error, but one is required to send a lot of extra information.

Example 1. If the message that should be sent is hello, it is repeated some number of times, for example three times, so the transmitted code is hellohellohello. If some errors occur, say that the received message is weloohellohello, then one can still see what the original message is. If there are too many errors, one needs to ask for the message to be resent.

Another way is to append a parity check digit at the end of the message. That is, if the message is, for example, in binary, one can adopt the rule that a 1 or a 0 is appended so that the total number of 1's in the message becomes either odd or even.

Example 2. Suppose the message 1100001 should be sent; the number of ones is 3. If the rule is that the total number of ones should be even, one adds an extra 1 at the end when the count is odd, otherwise a 0; this is the parity check digit. The transmitted code is then 11000011. If an error occurs, for example if the received message is 11100011, the total number of ones is now 5. Since this is not even, some error has occurred and the message needs to be resent.

This method gives the ability to notice when there is only one error in the message. The error can't be corrected, it is only detected, and the receiver can then ask for the message to be resent [8]. If the message is disrupted it can't be read, but almost no extra information is sent. This illustrates the most important balance of error correcting codes: to send as little as possible and still be able to recover the information when it is disrupted.
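As a small illustration of Example 2, here is a minimal sketch in Python (the helper names are illustrative, not from the thesis) that appends an even-parity digit and detects a single flipped bit:

```python
# Sketch: even-parity encoding and single-error detection.
def add_parity(bits):
    """Append a digit so that the total number of 1's becomes even."""
    return bits + [sum(bits) % 2]

def has_error(received):
    """An odd number of 1's means at least one bit was flipped."""
    return sum(received) % 2 != 0

sent = add_parity([1, 1, 0, 0, 0, 0, 1])      # [1,1,0,0,0,0,1,1], as in Example 2
corrupted = sent.copy(); corrupted[2] ^= 1    # flip one bit: 11100011
print(has_error(sent), has_error(corrupted))  # False True
```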

2 Error Correcting Codes

Error correcting codes are used to ensure that potential errors in a message that is sent over a noisy communication channel, or is stored on sensitive devices, can be detected and corrected within specific limitations [9]. A message that is transmitted over a noisy channel needs to be encoded to obtain codewords. These codewords consist of symbols that come from some alphabet A. The codewords are the ones that are transmitted, and the receiver then decodes the received words, which might no longer be actual codewords. In the decoding process any possible errors are detected and corrected, to some extent. This is represented in Figure 1, found in Introduction to Cryptography with Coding Theory, Second Edition, page 399 [8].

The alphabets used in Example 1 and in Example 2 are the English alphabet and the binary numbers, respectively. If A is an alphabet and $A^n$ denotes the set of n-tuples of elements in A, then the elements in a subset of $A^n$ are the codewords of a block code with length n. A block code is a code where all codewords have the same length. Block codes with some additional conditions are mostly used in practice; one common condition is to require that A is a finite field, which gives that $A^n$ is a vector space. These types of codes are called linear codes [8].

Figure 1: Overview for message transmission over a noisy channel [8].


Example 3. Let $A = \{0, 1\}$ be the alphabet of a binary repetition code where each symbol is repeated four times. Then the code is the set $\{(0, 0, 0, 0), (1, 1, 1, 1)\}$, which is a subset of $A^4$.

To be able to decode over any alphabet, it is useful to have a measure of how close two words are to each other. This measure is called the Hamming distance, denoted $d(v_1, v_2)$ for words $v_1$ and $v_2$ from $A^n$. The Hamming distance is defined as the number of places where the two words differ, that is, the minimum number of errors that must occur for $v_1$ to be changed into $v_2$.

Example 4. If $A = \{0, 1\}$, the Hamming distance $d(v_1, v_2)$ between $v_1 = 1100$ and $v_2 = 0111$ in $A^4$ is equal to 3, since the two words differ in places 1, 3 and 4.

If one calculates the Hamming distance between all pairs of different codewords in a code C, there exists a minimum value, the minimum distance of the code C, denoted d(C):

$$d(C) = \min \{d(v_1, v_2) \mid v_1, v_2 \in C,\ v_1 \neq v_2\}.$$

The minimum distance of C is important since it gives the smallest number of errors that can change one codeword into another codeword. This is used when a received message has some error, so that it does not correspond to an existing codeword. These errors are then corrected by finding the codeword that has the smallest Hamming distance to the received message. This is called nearest neighbour decoding, that is, changing the received message into a codeword by changing as few symbols as possible.

Rules can be set up to guarantee that nearest neighbour decoding actually gives the correct answer when there are at most t errors. These are described in Theorem 1. Trouble can occur when there is more than one nearest neighbour to the received message. For example, using the same setup as in Example 4, the word 1000 has the same Hamming distance to all four of $\{0000, 1100, 1010, 1001\}$. Here one approach is to just guess one of them, which can seem risky, but if one symbol in a long message is guessed wrong the message will probably still be readable. Or, if it represents a pixel in a picture and the colour of this pixel is guessed wrong, one will still be able to see the picture. If it is more sensitive information, where the meaning of the message changes depending on one symbol, the safest way to go is to have the message resent.


Theorem 1. A code C can detect up to s errors if d(C) ≥ s + 1 and a code C can correct up to t errors if d(C) ≥ 2t + 1 [8].

According to Theorem 1, a code can detect up to s errors if one is able to change any codeword $v_1$ in s places without changing it into another existing codeword $v_2$. The code can also correct up to t errors if one is able to change any codeword $v_1$ in t places and still have the codeword $v_1$ closest according to the Hamming distance [8].
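The following sketch computes Hamming distances, the minimum distance d(C) and a nearest neighbour decoding for the repetition code of Example 3 (the function names are illustrative):

```python
from itertools import combinations

def hamming(u, v):
    """Number of positions where the words u and v differ."""
    return sum(a != b for a, b in zip(u, v))

def min_distance(code):
    """d(C): the smallest distance between two distinct codewords."""
    return min(hamming(u, v) for u, v in combinations(code, 2))

def nearest(code, word):
    """Nearest neighbour decoding; ties are broken arbitrarily by min()."""
    return min(code, key=lambda c: hamming(c, word))

C = [(0, 0, 0, 0), (1, 1, 1, 1)]       # the repetition code of Example 3
d = min_distance(C)                    # d(C) = 4
t = (d - 1) // 2                       # corrects up to t = 1 error, by Theorem 1
print(d, t, nearest(C, (1, 0, 1, 1)))  # 4 1 (1, 1, 1, 1)
```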

2.1 Bounds on Codes

As mentioned in the introduction, the balance between sending as little additional information as possible and still being able to recover a message after an error has occurred is really important. This is captured by the code rate, or information rate, R, of a code, which represents the ratio of input data symbols to transmitted code symbols. It can be calculated for a q-ary (n, M, d) code, where n is the length of the code, M is the number of codewords in the code and d is the minimum distance of the code, using

$$R = \frac{\log_q(M)}{n}.$$

This represents the part of the bandwidth that is being used to transmit actual data. When using a code to transmit messages one would like the relative minimum distance, d/n, to be as large as possible, to be able to correct a great number of errors. The relative minimum distance is a measure of the error correcting capability of the code relative to its length [6]. One would also like M to be as large as possible, so that the code rate R is close to 1, since this gives bandwidth efficiency when transmitting messages over noisy channels. The problem is that increasing d tends to decrease M, or increase n, which in turn lowers the code rate. This creates a dilemma where one wants both the code rate and the relative minimum distance to be as large as possible. The trade-off is described by the so-called Singleton bound, given by R. Singleton in 1964 and presented in Theorem 2.

Theorem 2. Let C be a q-ary (n, M, d) code. Then $M \leq q^{n-d+1}$ [8].


A code that satisfies the Singleton bound with equality is called a maximum distance separable, MDS, code. This is a code that has the largest possible value of M for a given n and d.

Proof. For a codeword $c = (a_1, \dots, a_n)$, let $c' = (a_d, \dots, a_n)$ be the word obtained by removing the first $d - 1$ entries. If two codewords $c_1$ and $c_2$ are different, then they differ in at least d places, so when $c_1'$ and $c_2'$ are obtained by removing the first $d - 1$ entries from $c_1$ and $c_2$, the words $c_1'$ and $c_2'$ must still differ in at least one place. The number M of codewords c is therefore equal to the number of vectors $c'$ obtained in this way. There are at most $q^{n-d+1}$ vectors $c'$, since there are $n - d + 1$ positions in these vectors. This implies that M is less than or equal to $q^{n-d+1}$, as desired [8].

One class of codes that fulfils the Singleton bound with equality, and hence is MDS, is the Reed-Solomon codes [8].

Example 5. Consider the code in Example 3, the binary repetition code of length 4. This is a (4, 2, 4) code, and the Singleton bound gives

$$2 = M \leq q^{4-4+1} = q^1 = 2^1 = 2,$$

where q is 2 since it is a binary code. Since there is equality in the Singleton bound, this code is an MDS code.

2.2 Linear Codes for Finite Fields

Being able to decode a code efficiently is really important. For the decoding process to be quick it is useful to impose some conditions on the code; this motivates linear codes. Here the alphabet A will be a finite field F, where F can still be a lot of different alphabets as long as they are finite, for example the binary numbers, which give the alphabet $F = \mathbb{Z}_2$, or the integers modulo a prime p, which give the alphabet $F = \mathbb{Z}_p$. The corresponding vector space over F is the set of n-tuples over F and is denoted $F^n$. A subspace of $F^n$ is a nonempty subset S that is closed under linear combinations, so for all $s_1, s_2 \in S$ and $a_1, a_2 \in F$ it holds that $a_1s_1 + a_2s_2 \in S$. For the finite fields $\mathbb{Z}_2$ and $\mathbb{Z}_p$ all calculations on elements are done modulo 2 or modulo p, respectively.

Definition 1. A linear code of dimension k and length n over a finite field F is a k-dimensional subspace of $F^n$. This type of code is called an [n, k] code. It can also be written as an [n, k, d] code when the minimum distance d of the code is known [8].

For example, the binary repetition code in Example 3 is a linear code: a one-dimensional subspace of $\mathbb{Z}_2^4$.

The binary parity check code in Example 2 is a linear code: a seven-dimensional subspace of $\mathbb{Z}_2^8$. This binary code of dimension 7 and length 8 consists of the binary vectors such that the sum of all entries is zero modulo 2. Then the vectors

(1, 0, 0, 0, 0, 0, 0, 1), (0, 1, 0, 0, 0, 0, 0, 1), ..., (0, 0, 0, 0, 0, 0, 1, 1)

form a basis of the subspace that contains the binary vectors where the sum of all entries is zero modulo 2.

The ISBN code, that is, the International Standard Book Number, is an error detecting code that is not linear. When a book is published it is assigned an ISBN, a 10-digit codeword. The first digit gives the language, the second and third digits represent the publisher, and the fourth to ninth digits form a book identity number that the publisher assigns to the book. The last digit is chosen to fulfil

$$\sum_{j=1}^{10} j\,a_j \equiv 0 \pmod{11},$$

where $a_1, \dots, a_{10}$ are the digits of the ISBN of a specific book. Since the calculation is made modulo 11, the tenth digit can be 10, which is then represented by X, while the first nine digits can only be chosen from $\{0, 1, \dots, 9\}$, and it is this that makes the code not linear. The code is not closed under linear combinations, due to the fact that one cannot choose 10 as one of the first nine entries.

When a linear code C of dimension k is defined over a finite field F, where F has q elements, the code C has $q^k$ elements. This is seen by taking a basis of C containing k elements, $v_1, \dots, v_k$. Then every element of the code C can be uniquely written in the form $a_1v_1 + \cdots + a_kv_k$, where $a_1, \dots, a_k \in F$. There are q choices for each $a_i$, since F contains q elements, and there are k coefficients $a_i$, since the dimension of the code is k. Hence there are $q^k$ different elements in C. The Singleton bound can then be rewritten for linear codes as $q^k \leq q^{n-d+1}$, where d is the minimum distance of the code and n is the length. This implies that $k + d \leq n + 1$.


As discussed before, the minimum distance of a code is the smallest number of symbols that have to be changed to transform one codeword into another codeword, and it is given by the Hamming distance for the code. Computing the minimum distance of an arbitrary code, which might not be linear, can be tiresome, since it could require computing $d(v_1, v_2)$ for every pair of different codewords belonging to the code C. When it is known that the code is linear, the minimum distance can instead be found using the Hamming weight. The Hamming weight is defined as $\mathrm{wt}(v_1) = d(v_1, 0)$, where $0 = (0, 0, \dots, 0)$; that is, the number of nonzero places of $v_1$. Then d(C) is the smallest Hamming weight of all the nonzero codewords,

$$d(C) = \min \{\mathrm{wt}(v_1) \mid 0 \neq v_1 \in C\}.$$

This gives an advantage, since one no longer needs to compare every pair of codewords; instead there is only one calculation per codeword, which goes much faster.

When constructing a linear [n, k] code, one needs a k-dimensional subspace of $F^n$. One way to do this is to choose k vectors that are linearly independent and take their span, that is, the set of all linear combinations of these linearly independent vectors [1]. To do this one can choose a $k \times n$ generating matrix G of rank k, with entries in F. The subspace is then given by the set of vectors of the form vG, where v runs through all row vectors in $F^k$. The rows of the generating matrix G thus form a basis for a k-dimensional subspace of vectors of length n over F. This subspace is the linear code C. This means that every codeword is uniquely expressible as a linear combination of the rows of G.

Definition 2. Let G be a generating $k \times n$ matrix for a linear [n, k] code C. An $(n - k) \times n$ matrix H such that

$$GH^T = 0$$

is called a parity check matrix for the code C with generating matrix G.

Theorem 3. If a linear code C has the generating matrix $G = [I_k, P]$, then $H = [-P^T, I_{n-k}]$ is a parity check matrix for C [8].

For a generating matrix $G = [I_k, P]$, where $I_k$ is the $k \times k$ identity matrix, the last $n - k$ columns give the redundancy that, together with the first k columns, which still carry the message, gives the full codeword. Such a code is called systematic; the first k symbols are the information symbols and the rest are the check symbols.

Example 6. For the [8, 7] code of Example 2 the generating matrix G looks like this:

$$G = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1
\end{pmatrix}.$$

So the codeword 11000011 is the sum of the first, second and seventh rows modulo 2. This codeword is obtained by multiplying (1, 1, 0, 0, 0, 0, 1) by the generating matrix.

To check whether any errors have occurred one can use the parity check matrix $H = [-P^T, I_{n-k}]$, where $P^T$ is the transpose of the P used to construct the generating matrix G. If the dot product between the received word and the matrix $H^T$ is not zero, there is some error.

The corresponding parity check matrix H for Example 6 is $(1, 1, 1, 1, 1, 1, 1, 1)$, since

$$-P^T = (-1, -1, -1, -1, -1, -1, -1) = (1, 1, 1, 1, 1, 1, 1)$$

modulo 2, and $I_{n-k} = I_{8-7} = I_1 = 1$. This gives that if $v = 11000011$ is a codeword, the dot product between v and $H^T$ should be zero; here $v \cdot H^T = 1 \cdot 1 + 1 \cdot 1 + 0 \cdot 1 + 0 \cdot 1 + 0 \cdot 1 + 0 \cdot 1 + 1 \cdot 1 + 1 \cdot 1 = 4 \equiv 0 \pmod{2}$.

Generally, $C = \{uG \mid u \in A^k\}$ is a subspace of $A^n$, where G is the $k \times n$ generating matrix. If $v_1 = uG$ is a codeword, then $v_1H^T$ should be equal to zero, as

$$v_1H^T = (uG)H^T = u(GH^T) = 0,$$

since $GH^T = 0$ for every generating matrix and its corresponding parity check matrix. If some error e is introduced, the received vector is $v_2 = uG + e$, and multiplying this with the parity check matrix yields

$$v_2H^T = (uG + e)H^T = uGH^T + eH^T = eH^T \neq 0,$$


and an error is detected, provided that e is not itself a codeword.

If a codeword is transmitted and the vector v is received, the receiver computes $vH^T$ to see whether any error has occurred. If this is not equal to zero, at least one error is detected. The value of $vH^T$ is called the syndrome of the vector v and is denoted S(v). When $vH^T$ is equal to zero one cannot say that there is no error, only that v is a codeword. Since it is more likely that no errors occurred when $vH^T = 0$ than that enough errors occurred to change one codeword into another codeword, one can assume that no errors have occurred. The parity check matrix can now be used to detect and correct errors in the process of decoding a received message. Two definitions about cosets will help in understanding the general decoding procedure using the parity check matrix.

Definition 3. Let C be a linear code and let u be an n-dimensional vector. The set u + C given by

$$u + C = \{u + c \mid c \in C\}$$

is called a coset of C [8].

Definition 4. A vector having minimum Hamming weight in a coset is called a coset leader [8].

Using syndrome decoding requires far fewer steps than simply searching for the codeword nearest to the received vector. It can be done by using a syndrome lookup table that consists of the coset leaders and their corresponding syndromes. Decoding is then done in three steps.

1. Calculate the syndrome of the received vector r, $S(r) = rH^T$.

2. Find the coset leader that has the same syndrome as S(r). Let it be $c_0$.

3. Decode the received vector r as $r - c_0$, using the coset leader $c_0$.

Example 7. Let C be the binary linear code with generating matrix

$$G = \begin{pmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{pmatrix}.$$


Then the code C consists of the codewords

{(0, 0, 0, 0), (1, 0, 1, 0), (0, 1, 0, 1), (1, 1, 1, 1)} ,

These elements form the first row of a decoding table that will help in the decoding process. To create the next row, take a vector of smallest Hamming weight that does not already have a place in the table (there can be more than one choice); the next three elements of the row are then obtained by adding the first element of the row to the codeword at the top of each column (over $\mathbb{Z}_2$, addition and subtraction coincide). This is repeated for all rows until all possible vectors of length four are used. Together this creates the table

(0,0,0,0) (1,0,1,0) (0,1,0,1) (1,1,1,1)
(1,0,0,0) (0,0,1,0) (1,1,0,1) (0,1,1,1)
(0,1,0,0) (1,1,1,0) (0,0,0,1) (1,0,1,1)
(1,1,0,0) (0,1,1,0) (1,0,0,1) (0,0,1,1)

When a vector is received, look it up in the table and decode it to the vector at the top of the same column. If the received vector v is (0, 0, 0, 1), it is decoded to (0, 1, 0, 1).

Example 7 is quite small, and even though (0, 0, 0, 1) is decoded to one of its nearest neighbours, it is not the only codeword that is equally close: (0, 0, 0, 0) is also a nearest neighbour of (0, 0, 0, 1). This becomes a problem, since the minimum distance of this code is 2, which means that general error correction might not be possible. If the code had fulfilled the conditions described in Theorem 1, the same procedure would decode the vectors correctly.

This small code was used because writing out the table and searching it for the received vector can be difficult for large codes. Here the parity check matrix H can be used to make the process more manageable.

The vectors in the first column are the coset leaders l. If v is in the same row as l, then $v = l + c$ for some codeword c. This gives that

$$vH^T = lH^T + cH^T = lH^T,$$

since c is a codeword and therefore $cH^T = 0$. The syndromes are the vectors $S(v) = vH^T$; if two vectors have the same syndrome they belong to the same coset and have the same coset leader, so the table in Example 7 can be replaced by the smaller table


(0,0,0,0) (0,0)
(1,0,0,0) (1,0)
(0,1,0,0) (0,1)
(1,1,0,0) (1,1)

Example 8. Using the same code C as in Example 7 with the same generating matrix G, the received vector v = (0, 0, 0, 1) is now decoded by multiplying it by $H^T$, which gives

$$S(v) = vH^T = (0, 0, 0, 1) \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 0 \\ 0 & 1 \end{pmatrix} = (0, 1).$$

This is the syndrome of the third row in the smaller table; now subtract that row's coset leader from the vector v modulo 2, and the codeword (0, 1, 0, 1) is found, which is the same as in Example 7.
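A minimal sketch of the three decoding steps for the code of Examples 7 and 8, with the coset leaders taken from the small table above (illustrative Python, not the thesis's Mathematica program):

```python
import numpy as np

H = np.array([[1, 0, 1, 0],     # parity check matrix for the code of Example 7;
              [0, 1, 0, 1]])    # here H = G, since G = [I_2 | I_2]

# Syndrome lookup table: the coset leaders of the small table, keyed by syndrome.
leaders = [(0, 0, 0, 0), (1, 0, 0, 0), (0, 1, 0, 0), (1, 1, 0, 0)]
table = {tuple(np.array(l) @ H.T % 2): np.array(l) for l in leaders}

def decode(r):
    s = tuple(r @ H.T % 2)      # step 1: the syndrome S(r) = rH^T
    return (r - table[s]) % 2   # steps 2-3: subtract the matching coset leader

print(decode(np.array([0, 0, 0, 1])))   # [0 1 0 1], as found in Example 8
```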

For large codes this procedure is too inefficient to be practical [8]. For a general linear code the problem of finding the nearest neighbour is hard and is considered an NP-complete problem, where NP stands for "nondeterministic polynomial time" and is a classification of how hard the problem is to solve [4]. There are certain types of codes that have more efficient decoding procedures, for example the cyclic codes.

2.3 Cyclic Codes

A linear code C is called cyclic if a cyclic shift of any codeword in C yields another codeword in C. That is, if C is cyclic, then

$$(c_0, c_1, \dots, c_{n-1}) \in C \implies (c_{n-1}, c_0, c_1, \dots, c_{n-2}) \in C.$$

Continuing with cyclic shifts generates more codewords, so all cyclic permutations of a codeword are also codewords. The code used in Example 7 is therefore also a cyclic code: if any of its codewords is shifted cyclically, the result is still a codeword.

Let F be a finite field, as before consisting of the integers mod p, where p is a prime, and let F[x] denote the set of all polynomials in x with coefficients in F. For a positive integer n the code will live in

$$F[x]/(x^n - 1),$$

which denotes the elements of F[x] mod $(x^n - 1)$; these are the polynomials of degree less than n. If a polynomial of degree n or larger is encountered, it is divided by $x^n - 1$ and the remainder is taken as the new polynomial. A cyclic shift of a word corresponds to multiplying the corresponding polynomial in $F[x]/(x^n - 1)$ by x modulo $x^n - 1$. The general description of a cyclic code is given in Theorem 4.

Example 9. In the code of Example 7, the codeword (1, 0, 1, 0), which is the first row of the generating matrix G, can be represented as the polynomial $g(x) = 1 + x^2$. Then g(x)x gives the second row of G. Continuing with $g(x)x^2$, which represents two cyclic shifts,

$$g(x)x^2 = x^2 + x^4 \equiv 1 + x^2 \pmod{x^4 - 1},$$

since the degree reaches n = 4 and the computation is done modulo $x^4 - 1$; this gives the first row of G again.
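A small sketch of this correspondence for Example 9, storing a word $(c_0, \dots, c_{n-1})$ as the coefficient list of its polynomial:

```python
# Sketch: one cyclic shift = multiplication by x mod (x^n - 1).
def shift(c):
    """The coefficient of x^{n-1} wraps around to x^0, since x^n = 1."""
    return [c[-1]] + c[:-1]

g = [1, 0, 1, 0]          # g(x) = 1 + x^2, the first row of G in Example 7
print(shift(g))           # [0, 1, 0, 1] -> x + x^3 = g(x) x, the second row
print(shift(shift(g)))    # [1, 0, 1, 0] -> back to 1 + x^2, as in Example 9
```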

Theorem 4. Let C be a cyclic code of length n over a finite field F. To each codeword $(c_0, \dots, c_{n-1})$ in C, associate the polynomial $c_0 + c_1x + \cdots + c_{n-1}x^{n-1}$ in F[x]. Let g(x) be the polynomial of smallest degree among all the nonzero polynomials obtained from C in this way. Dividing g(x) by its leading coefficient, one may assume that g(x) is a monic polynomial, that is, a polynomial whose leading coefficient is one. This polynomial g(x) is called the generating polynomial for C, and

1. g(x) is uniquely determined by C.

2. g(x) is a divisor of $x^n - 1$, i.e. $g(x)h(x) = x^n - 1$ for some $h(x) \in F[x]$.

3. C is exactly the set of coefficient vectors of the polynomials of the form g(x)f(x), where $\deg(f) \leq n - 1 - \deg(g)$.

4. A polynomial $m(x) \in F[x]/(x^n - 1)$ corresponds to a codeword in C if and only if $h(x)m(x) \equiv 0 \pmod{x^n - 1}$, where h(x) is defined in part 2 [8].


If $g(x) = g_0 + g_1x + \cdots + g_{k-1}x^{k-1} + x^k$ is built as in Theorem 4, then by part 3 of the theorem every codeword of C corresponds to a polynomial of the form g(x)f(x), where $\deg(f) \leq n - 1 - \deg(g)$. Since f(x) is a linear combination of $1, x, x^2, \dots, x^{n-k-1}$, every codeword in C is a linear combination of the codewords corresponding to the polynomials

$$g(x),\ g(x)x,\ g(x)x^2,\ \dots,\ g(x)x^{n-k-1}.$$

These in turn correspond to the vectors

$$(g_0, \dots, g_k, 0, \dots, 0),\ (0, g_0, \dots, g_k, 0, \dots),\ \dots,\ (0, \dots, 0, g_0, \dots, g_k).$$

A generating matrix for C can then be built similarly to the one for linear codes,

$$G = \begin{pmatrix}
g_0 & g_1 & \dots & g_k & 0 & 0 & \dots & 0 \\
0 & g_0 & g_1 & \dots & g_k & 0 & \dots & 0 \\
\vdots & & \ddots & & & \ddots & & \vdots \\
0 & \dots & 0 & g_0 & g_1 & \dots & & g_k
\end{pmatrix}.$$

To construct the parity check matrix for C corresponding to this generating matrix, one uses part 4 of Theorem 4. With $h(x) = h_0 + h_1x + \cdots + h_mx^m$, where $m = n - k$,

$$H = \begin{pmatrix}
h_m & h_{m-1} & \dots & h_0 & 0 & 0 & \dots & 0 \\
0 & h_m & h_{m-1} & \dots & h_0 & 0 & \dots & 0 \\
\vdots & & \ddots & & & \ddots & & \vdots \\
0 & \dots & 0 & h_m & h_{m-1} & \dots & & h_0
\end{pmatrix}.$$

Here $g(x)h(x) = x^n - 1$, which is equivalent to $g(x)h(x) \equiv 0 \pmod{x^n - 1}$, and this in turn gives that $GH^T = 0$, as must hold for every generating matrix and its corresponding parity check matrix. As for linear codes, a parity check matrix H for a cyclic code C satisfies $cH^T = 0$ if and only if $c \in C$.

Example 10. A generating matrix G for a code of length 7 can be constructed by factorising the polynomial $x^7 - 1$, since the generating polynomial g(x) should divide

$$x^7 - 1 = (x - 1)(x^3 + x^2 + 1)(x^3 + x + 1).$$


The generating polynomial can be chosen as $g(x) = 1 + x + x^2 + x^4$, which generates the matrix

$$G = \begin{pmatrix}
1 & 1 & 1 & 0 & 1 & 0 & 0 \\
0 & 1 & 1 & 1 & 0 & 1 & 0 \\
0 & 0 & 1 & 1 & 1 & 0 & 1
\end{pmatrix}.$$

Here the cyclic shifts of the first row give all the nonzero codewords, so the codewords of C are

C = {(0, 0, 0, 0, 0, 0, 0), (1, 1, 1, 0, 1, 0, 0), (0, 1, 1, 1, 0, 1, 0), (0, 0, 1, 1, 1, 0, 1), (1, 0, 0, 1, 1, 1, 0), (0, 1, 0, 0, 1, 1, 1), (1, 0, 1, 0, 0, 1, 1), (1, 1, 0, 1, 0, 0, 1)}.

Note that this happens to be the case for this particular example; for cyclic codes in general there can be additional codewords, whose cyclic shifts are also codewords.

To check that these are all the codewords, one can take the linear combinations of the rows in every possible way and check that each one yields a codeword in this list. This code is cyclic, since a cyclic shift of one codeword generates another codeword. The parity check matrix H is constructed from the parity check polynomial h(x), which satisfies $g(x)h(x) = x^7 - 1$; hence $h(x) = x^3 + x + 1$, which gives the matrix

$$H = \begin{pmatrix}
1 & 0 & 1 & 1 & 0 & 0 & 0 \\
0 & 1 & 0 & 1 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 0 & 1 & 1
\end{pmatrix}.$$

The parity check matrix gives a way to detect errors in a transmitted message for cyclic codes, in a similar way as for linear codes. It can still be hard to correct the occurring errors for a general cyclic code; to make it easier one can give even more structure to the code, as in a BCH code [8].

2.4 BCH Codes

BCH codes were discovered in the late 1950s by R. C. Bose and D. K. Ray-Chaudhuri, and independently by A. Hocquenghem, hence the name BCH. BCH codes are a class of cyclic codes that have a decoding algorithm able to correct multiple errors. These types of codes are used, for example, in satellites, and the special BCH codes called Reed-Solomon codes have a lot of different applications. To construct a BCH code one needs some background information regarding polynomials.

If a polynomial d(x) is a divisor of the polynomial f(x), then $f(x) = d(x)g(x)$ for some g(x). Here 1 and f(x) are called trivial divisors of f(x), since they are always divisors of f(x). All other divisors are called nontrivial, or proper, divisors of f(x). If a polynomial f(x) has no proper divisors over a finite field F, then f(x) is said to be irreducible over F.

Let the polynomial f(x) be of degree $n \geq 1$ and irreducible over $\mathbb{Z}_p$, and let

$$F = GF(p^n) = \mathbb{Z}_p[x]/(f(x)) = \{a_0 + a_1\alpha + \cdots + a_{n-1}\alpha^{n-1} \mid a_i \in \mathbb{Z}_p,\ f(\alpha) = 0\},$$

where $GF(p^n)$ denotes the Galois field with $p^n$ elements and p is a prime. The group $F^* = F \setminus \{0\}$ of nonzero elements is cyclic. If α is a generating element of this group, so that $\langle \alpha \rangle = F^*$, then f(x) is called a primitive polynomial [1].

Addition and multiplication of polynomials is done modulo some irreducible polynomial h(x) of degree n. Let $F_n[x]$ be the set of all polynomials in F[x] with degree less than n. Each codeword in $F^n$ corresponds to a polynomial in $F_n[x]$, so one can also use addition and multiplication of codewords in $F^n$; multiplication in $F^n$ is then defined modulo an irreducible polynomial of degree n [5].

When a primitive polynomial is used to construct $GF(2^r)$, the Galois field whose elements, in binary, are r-bit numbers, all computations in the field are easier than when a non-primitive irreducible polynomial is used.

Let β in $F^n$ represent the codeword corresponding to $x \bmod h(x)$, where h(x) is a primitive polynomial of degree n; then $\beta^i$ corresponds to $x^i \bmod h(x)$. Note that if $1 \equiv x^m \pmod{h(x)}$, then $0 \equiv 1 + x^m \pmod{h(x)}$ in binary, which gives that h(x) divides $1 + x^m$. Since h(x) is a primitive polynomial, h(x) does not divide $1 + x^m$ for m less than $2^n - 1$, and hence $\beta^m \neq 1$ for m less than $2^n - 1$. Moreover, if $\beta^j = \beta^i$ with $j > i$, then $\beta^i = \beta^{j-i}\beta^i$, which implies $\beta^{j-i} = 1$; so the powers $\beta^0, \dots, \beta^{2^n-2}$ are all distinct. From this one can conclude that

$$F^n \setminus \{0\} = \{\beta^i \mid i = 0, 1, \dots, 2^n - 2\}.$$

That is, every nonzero codeword in $F^n$ can be represented by some power of β. This property makes multiplication in this field easy. An example of this, using $GF(2^4)$ and $h(x) = 1 + x + x^4$, is shown in Table 1, found in Coding Theory and Cryptography: The Essentials, page 114 [5].

Example 11. Using Table 1, multiplication of codewords is done via powers of β. To compute (1100)(1010), transform the codewords to powers of β; then

$$(1100)(1010) = \beta^4\beta^8 = \beta^{12} = 1111.$$

This works since

$$(1 + x)(1 + x^2) \equiv 1 + x + x^2 + x^3 \pmod{h(x)}.$$

codeword | polynomial in x mod h(x) | power of β
0000 | 0 | –
1000 | 1 | β⁰ = 1
0100 | x | β
0010 | x² | β²
0001 | x³ | β³
1100 | 1 + x ≡ x⁴ | β⁴
0110 | x + x² ≡ x⁵ | β⁵
0011 | x² + x³ ≡ x⁶ | β⁶
1101 | 1 + x + x³ ≡ x⁷ | β⁷
1010 | 1 + x² ≡ x⁸ | β⁸
0101 | x + x³ ≡ x⁹ | β⁹
1110 | 1 + x + x² ≡ x¹⁰ | β¹⁰
0111 | x + x² + x³ ≡ x¹¹ | β¹¹
1111 | 1 + x + x² + x³ ≡ x¹² | β¹²
1011 | 1 + x² + x³ ≡ x¹³ | β¹³
1001 | 1 + x³ ≡ x¹⁴ | β¹⁴

Table 1: Construction of $GF(2^4)$ where $h(x) = 1 + x + x^4$ [5].
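The following sketch shows how Table 1 can be generated and used for the multiplication in Example 11. Field elements are stored as bit masks where bit i holds the coefficient of $x^i$; this low-to-high convention mirrors the order in which the table lists the codeword bits, but is otherwise an assumption of the sketch:

```python
# Sketch: GF(2^4) built from the primitive polynomial h(x) = 1 + x + x^4.
def times_x(a):
    """Multiply by x mod h(x): shift up, then reduce x^4 via x^4 = x + 1."""
    a <<= 1
    return a ^ 0b10011 if a & 0b10000 else a   # 0b10011 encodes x^4 + x + 1

power, beta = {}, 1
for i in range(15):
    power[i] = beta            # power[i] = beta^i as a bit mask
    beta = times_x(beta)
log = {v: i for i, v in power.items()}

a, b = 0b0011, 0b0101          # the codewords 1100 (= 1 + x) and 1010 (= 1 + x^2)
product = power[(log[a] + log[b]) % 15]
print(f"{product:04b}")        # 1111: beta^4 beta^8 = beta^12, as in Example 11
```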

An element α in $GF(2^r)$ is called a primitive element if $\alpha^m$ is not equal to 1 for any m with $1 \leq m < 2^r - 1$. That is, α is a primitive element if every nonzero codeword in $GF(2^r)$ can be expressed as a power of α. If a primitive polynomial h(x) is used to construct the finite field $GF(2^r)$, with β defined as above, then β is a primitive element.


The order of a nonzero element α in $GF(2^r)$ is the smallest positive integer m such that $\alpha^m = 1$. Any nonzero element α in $GF(2^r)$ has an order m that divides $2^r - 1$, and α is a primitive element exactly when its order is $2^r - 1$ [5].

This definition of a primitive element α will be useful when one wants to construct the class of BCH codes called Reed-Solomon codes, since it is used when constructing the generating polynomial for the code.

To start the construction of a BCH code of length n over a finite field F, one factorizes $x^n - 1$ in the same way as in the section on cyclic codes,

$$x^n - 1 = f_1(x)f_2(x)\dots f_r(x),$$

where each $f_i(x)$ is an irreducible polynomial over the field F. If α is a primitive n-th root of unity, then $\alpha^0, \alpha^1, \dots, \alpha^{n-1}$ are the roots of $x^n - 1$, so that

$$x^n - 1 = (x - 1)(x - \alpha)\dots(x - \alpha^{n-1}).$$

This means that each $f_i(x)$ is a product of some of the factors $x - \alpha^j$, so each $\alpha^j$ is a root of one of the polynomials $f_i(x)$. For each j, let $q_j(x)$ be the polynomial $f_i(x)$ that fulfils $f_i(\alpha^j) = 0$; this forms the polynomials $q_0(x), q_1(x), \dots, q_{n-1}(x)$. The polynomials $q_j(x)$ are not all distinct, since a polynomial $f_i(x)$ can have two different powers $\alpha^j$ and $\alpha^l$ as roots, in which case $f_i(x)$ serves as both $q_j(x)$ and $q_l(x)$. A BCH code of designed distance δ is then a code with the generating polynomial

$$g(x) = \mathrm{lcm}\{q_{k+1}(x), q_{k+2}(x), \dots, q_{k+\delta-1}(x)\},$$

where k is some chosen integer. A BCH code C with designed distance δ satisfies $d(C) \geq \delta$. This is the so-called BCH bound, which says that if, for a cyclic code C of length n over F, the roots of the generating polynomial include δ − 1 consecutive powers of α, then the minimum distance d is greater than or equal to δ [6]. Each $q_j(x)$ is the minimal polynomial of $\alpha^j$ over F, that is, the monic polynomial of minimal degree in F[x] such that $q_j(\alpha^j) = 0$.

Example 12. Using the same polynomial as in Example 10,

$$x^7 - 1 = (x - 1)(x^3 + x^2 + 1)(x^3 + x + 1),$$

now choose the other possible generating polynomial $g(x) = x^4 + x^3 + x^2 + 1 = (x - 1)(x^3 + x + 1)$. If α is a root of $x^3 + x + 1$, then α is a primitive n-th root of unity, where n is equal to 7. This gives that g(α) = 0, and also $(\alpha^2)^3 + \alpha^2 + 1 = 0$: the computations are done with binary numbers and $\alpha^3 = \alpha + 1$, so squaring gives $(\alpha^3)^2 = (\alpha + 1)^2 = \alpha^2 + 2\alpha + 1 = \alpha^2 + 1$, and hence $(\alpha^2)^3 + \alpha^2 + 1 = (\alpha^2 + 1) + \alpha^2 + 1 = 0$. Thus the square of a root of $x^3 + x + 1$ is again a root, and then $\alpha^4 = (\alpha^2)^2$ is also a root. The factor can therefore be rewritten as

$$x^3 + x + 1 = (x - \alpha)(x - \alpha^2)(x - \alpha^4).$$

All the remaining powers of α must be roots of $x - 1$ or $x^3 + x^2 + 1$, respectively. In summary, the different polynomials $q_j$ are

$$q_0(x) = x - 1,$$
$$q_1(x) = q_2(x) = q_4(x) = x^3 + x + 1,$$
$$q_3(x) = q_5(x) = q_6(x) = x^3 + x^2 + 1.$$

If the chosen integer k is equal to −1 and the designed distance is δ = 3, the generating polynomial is

$$g(x) = \mathrm{lcm}\{q_{k+1}(x), q_{k+2}(x), \dots, q_{k+\delta-1}(x)\} = \mathrm{lcm}\{q_0(x), q_1(x)\} = x^4 + x^3 + x^2 + 1.$$

This says that the minimum distance of the code is at least 3. If k = −1 and δ is instead chosen to be 4, then the generating polynomial is

$$g_1(x) = \mathrm{lcm}\{q_0(x), q_1(x), q_2(x)\} = g(x);$$

since $q_1(x) = q_2(x)$ the least common multiple does not change, and now the minimum distance of this code is at least 4. The actual minimum distance is equal to 4, which can be seen by calculating the minimum weight of the codewords.

3 Reed-Solomon Codes

The Reed-Solomon codes were introduced in 1960 by I. S. Reed and G. Solomon and are a type of BCH codes. Let F be a finite field with q elements, where $q = p^r$ for a prime p; for a binary code $q = 2^r$. If $n = q - 1$,


then F contains a primitive element α. The generating polynomial is constructed as

$$g(x) = (x - \alpha^b)(x - \alpha^{b+1})\cdots(x - \alpha^{b+d-2}), \tag{1}$$

where d is between 1 and n, so that g(x) has degree d − 1. Usually b is chosen to be 0 or 1; from here on it will be 1. This generator polynomial has coefficients in F and generates a BCH code C over F of length n, called a Reed-Solomon code.

Since $g(\alpha^i) = 0$ for the d − 1 consecutive powers $i = 1, \dots, d - 1$, the BCH bound gives that the minimum distance of C is at least d.

A Reed-Solomon code is a cyclic [n, n + 1 − d, d] code, where $n = 2^r - 1$ and the elements are from $GF(2^r)$. The rest follows from the fact that the generating polynomial has degree d − 1, so it has at most d nonzero coefficients. This gives that the codeword corresponding to the generating polynomial has weight at most d, so the minimum weight of C is exactly d, and the dimension of C is $n - \deg(g) = n + 1 - d$, where g is the generating polynomial; this gives the notation for the code. The notation [n, k] can also be used, where $d = n + 1 - k$.

The codewords in C are given by the polynomials

$$g(x)f(x),$$

where $\deg(f)$ is less than or equal to $n - d$. Since there are q different choices for each of the $n - d + 1$ coefficients of f(x), there are $q^{n-d+1}$ polynomials f(x). Hence there are $q^{n-d+1}$ different codewords in C. Thus the Reed-Solomon code fulfils the criterion for an MDS code, that is, there is equality in the Singleton bound [8].

3.1 Encoding

To start the encoding process one needs to define the message polynomial. For a Reed-Solomon [n, k] code, k information symbols form the message that is encoded as one block; they can be represented by the message polynomial m(x), of degree k − 1,

$$m(x) = m_{k-1}x^{k-1} + \cdots + m_1x + m_0,$$


where the coefficients $m_{k-1}, \dots, m_0$ are message symbols from an alphabet, usually the Galois field $GF(2^r)$.

Reed-Solomon encoding can be done both in cyclic and in systematic form; when it is done systematically there is still a cyclic structure in the background, for example in the construction of the generating polynomial. Both methods of encoding use the same generating polynomial g(x), shown in Equation (1). For the cyclic approach, the generating matrix is made in the same manner as in the section on cyclic codes; the coefficients of the message polynomial then form a message vector, which is multiplied by the generating matrix to construct a codeword. The generating polynomial is

$$g(x) = (x - \alpha)(x - \alpha^2)\cdots(x - \alpha^{d-1}) = g_0 + g_1x + \cdots + g_{d-2}x^{d-2} + x^{d-1},$$

and the corresponding generating matrix for g(x) is

$$G = \begin{pmatrix}
g_0 & g_1 & \dots & g_{d-1} & 0 & 0 & \dots & 0 \\
0 & g_0 & g_1 & \dots & g_{d-1} & 0 & \dots & 0 \\
\vdots & & \ddots & & & \ddots & & \vdots \\
0 & \dots & 0 & g_0 & g_1 & \dots & & g_{d-1}
\end{pmatrix},$$

where there are n columns and k rows. The codeword c is then

$$c = (m_{k-1}, \dots, m_1, m_0)G.$$
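A minimal sketch of this cyclic encoding, under assumed toy parameters over the prime field GF(7) rather than $GF(2^r)$ (all names and values are illustrative, not from the thesis program): q = 7, n = q − 1 = 6, d = 3 and α = 3, so that k = n + 1 − d = 4:

```python
import numpy as np

# Sketch: cyclic Reed-Solomon encoding over GF(7) with toy parameters.
p, n, d, alpha = 7, 6, 3, 3
k = n + 1 - d                                  # k = 4

# g(x) = (x - alpha)(x - alpha^2): d - 1 factors; coefficient lists low-first.
g = np.array([1])
for i in (1, 2):
    g = np.convolve(g, np.array([-pow(alpha, i, p), 1])) % p

# k x n generator matrix whose rows are shifts of g, as for cyclic codes.
G = np.array([np.concatenate([np.zeros(i, int), g, np.zeros(n - len(g) - i, int)])
              for i in range(k)])
message = np.array([2, 0, 5, 1])
print(g)                  # [6 2 1]  ->  g(x) = 6 + 2x + x^2
print(message @ G % p)    # the corresponding length-6 codeword
```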

3.2 Properties

For many types of applications the errors often occur in bursts and are not randomly distributed. A burst error is when errors occur in many adjacent bits, for example from a scratch on a CD. Consider Reed-Solomon codes where the message symbols come from the finite field $F = GF(2^r)$, the Galois field with $2^r$ elements. Using the elements of this Galois field as message symbols, each coefficient of the message polynomial is represented by an r-bit binary number. It is due to this that Reed-Solomon codes handle burst errors well: even if all bits in a symbol are in error, it only counts as one symbol error in terms of the correction capacity of the code [3].


4 Methods

When one starts to decode a received message there are some different approaches to consider. There is a direct method based on some trial and error, and there are different algorithms that speed up the direct method. Another type of decoding algorithm is the Berlekamp-Massey algorithm, which will be used later on.

4.1 Decoding

Let [n, k, d] be some Reed-Solomon code, where $n = 2^r - 1$, k is the dimension and d is the minimum distance. Since the elements of the code come from $GF(2^r)$, correcting a received vector means that one needs to find both the locations of the errors and their magnitudes. An error location is a position in the received vector that holds an error, and it is referred to by an error location number; if the j-th coordinate of the vector is an error location, the error sits in the coefficient of $x^j$. The error magnitude at an error location j is the size of the error in that location, that is, the error in the coefficient of $x^j$. To decode a Reed-Solomon code one needs to find both the error locations and the corresponding error magnitudes [5].

If the received vector is represented as a polynomial R(x), then it can be written as

$$R(x) = T(x) + E(x), \tag{2}$$

where T(x) is the transmitted codeword and E(x) is the error that has occurred. Here $E(x) = E_{n-1}x^{n-1} + \cdots + E_1x + E_0$, where each coefficient is an element of $GF(2^r)$, and the positions of the errors are determined by the degrees of x. As for the correction capacity of the code, d needs, as discussed before, to be greater than or equal to 2t + 1, where t is the number of errors that can be corrected. This gives that if more than $t = (d - 1)/2$ of the coefficients in E(x) are nonzero, the errors may not be correctable, like what happened in Example 7, where the algorithm still worked but the nearest codeword might be another one than the one that was transmitted.

To know whether any errors have occurred, one needs to calculate the syndromes of the received polynomial. This can be done in some different ways. One way is to divide the received polynomial by the generator polynomial; if the received polynomial is an actual codeword, this can be done without any


remainder. This property extends to the factors of the generator polynomial, which gives the ability to find each syndrome value $S_1, \dots, S_{d-1}$ by dividing the received polynomial by each of the factors $x - \alpha^i$ in Equation (1),

$$\frac{R(x)}{x - \alpha^i} = Q_i(x) + \frac{S_i}{x - \alpha^i}, \tag{3}$$

where $Q_i(x)$ is the quotient and i goes from 1 to d − 1. The remainders are the sought syndrome values $S_1, \dots, S_{d-1}$. Rearranging Equation (3) gives the equation for each syndrome value,

$$S_i = Q_i(x)(x - \alpha^i) + R(x)$$

(over $GF(2^r)$ addition and subtraction coincide). Hence, when $x = \alpha^i$, this reduces to

$$S_i = R(\alpha^i) = R_{n-1}(\alpha^i)^{n-1} + \cdots + R_1\alpha^i + R_0,$$

where the coefficients $R_{n-1}, \dots, R_0$ are the symbols of the received polynomial.

This gives an alternative way of finding the syndrome values, namely by substituting $x = \alpha^i$ into the received polynomial. This is possible since the syndrome values depend only on the error pattern, because

$$R(\alpha^i) = T(\alpha^i) + E(\alpha^i),$$

where $T(\alpha^i)$ is equal to zero, since $x - \alpha^i$ is a factor of the generating polynomial, which in turn is a factor of T(x). Hence this reduces to

$$R(\alpha^i) = E(\alpha^i) = S_i. \tag{4}$$

When all the syndrome values $S_1, \dots, S_{d-1}$ are zero, no error is detected [3].
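Continuing the toy GF(7) code sketched in the encoding section, here is a sketch of syndrome computation by substituting powers of α into the received polynomial, as in Equation (4):

```python
# Sketch: syndromes S_i = R(alpha^i) for the toy GF(7) Reed-Solomon code.
p, alpha = 7, 3

def syndromes(r, count):
    """r holds the coefficients R_0, ..., R_{n-1}; returns S_1, ..., S_count."""
    return [sum(c * pow(alpha, i * j, p) for j, c in enumerate(r)) % p
            for i in range(1, count + 1)]

c = [6, 2, 1, 0, 0, 0]  # the generator polynomial itself, hence a codeword
r = [6, 2, 1, 3, 0, 0]  # the same word with an error of magnitude 3 in position 3
print(syndromes(c, 2))  # [0, 0]: a codeword, no error detected
print(syndromes(r, 2))  # [4, 3]: nonzero syndromes, an error has occurred
```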

The relation between the syndromes and the error polynomial can be used to set up a system of simultaneous equations, from which the errors can be found. Here the error polynomial is rewritten to only include the error locations. Assume that v errors have occurred; v needs to be less than or equal to t, otherwise the errors cannot be corrected. The error polynomial is rewritten as

$$E(x) = Y_1x^{j_1} + Y_2x^{j_2} + \cdots + Y_vx^{j_v},$$


where $j_1, \dots, j_v$ are the error location numbers of the respective errors, and the error magnitudes at these locations are $Y_1, \dots, Y_v$. Substituting this back into the syndrome Equation (4) gives

$$S_i = E(\alpha^i) = Y_1\alpha^{ij_1} + Y_2\alpha^{ij_2} + \cdots + Y_v\alpha^{ij_v} = Y_1X_1^i + Y_2X_2^i + \cdots + Y_vX_v^i,$$

where $X_1 = \alpha^{j_1}, \dots, X_v = \alpha^{j_v}$ are the error locators, with error location numbers $j_1, \dots, j_v$. Now the 2t syndrome equations can be set up as a matrix equation:

$$\begin{pmatrix} S_1 \\ S_2 \\ \vdots \\ S_{2t} \end{pmatrix} =
\begin{pmatrix}
X_1^1 & X_2^1 & \dots & X_v^1 \\
X_1^2 & X_2^2 & \dots & X_v^2 \\
\vdots & \vdots & & \vdots \\
X_1^{2t} & X_2^{2t} & \dots & X_v^{2t}
\end{pmatrix}
\begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_v \end{pmatrix}. \tag{5}$$

Note that the syndromes $S_1, \dots, S_{2t}$ correspond to the roots $\alpha, \dots, \alpha^{2t}$ of the generating polynomial chosen in Equation (1).

There are two different ways to construct the error locator polynomial. The first one, denoted σ(x), is constructed as

$$\sigma(x) = (x - X_1)(x - X_2)\dots(x - X_v),$$

where the error locators $X_1, \dots, X_v$ are the roots of the polynomial; this produces a polynomial of degree v with coefficients $\sigma_1, \dots, \sigma_v$.

The second error locator polynomial, denoted Λ(x), is constructed as

$$\Lambda(x) = (1 - X_1x)(1 - X_2x)\dots(1 - X_vx), \tag{6}$$

where the factors $(1 - X_jx)$ give that it is the inverses $X_1^{-1}, \dots, X_v^{-1}$ of the error locators that are the roots of the polynomial, with coefficients $\Lambda_1, \dots, \Lambda_v$. The coefficients of σ and Λ are the same, since σ(x) can be rewritten as $x^v\Lambda(1/x)$. The two error locator polynomials are used in different places to make the computations easier.

When the error locators $X_1, \dots, X_v$ have been found, they can be substituted back into the syndrome equation, which is then solved by direct calculation using matrix inversion of the matrix equation (5); this produces the error magnitudes $Y_1, \dots, Y_v$. If the matrix obtained is not invertible, an alternative method of calculating the error values $Y_j$ is the Forney algorithm, which will not be described here; see Algebraic Coding Theory by Berlekamp for further reading [2]. When the symbols containing errors have been identified by the $X_j$ and the magnitudes $Y_j$ of these errors are found, the errors are corrected by subtracting the error polynomial E(x) from the received vector R(x); by rewriting Equation (2) this gives the transmitted codeword T(x) [3].

4.1.1 Direct Method

The task of finding the coefficients of the error locator polynomial can be approached in some different ways; this section describes the direct method.

For each error there is a corresponding root $X_j^{-1}$ that makes Λ(x) equal to zero; this can be written as

$$1 + \Lambda_1X_j^{-1} + \cdots + \Lambda_{v-1}X_j^{-v+1} + \Lambda_vX_j^{-v} = 0.$$

This can be multiplied through by $Y_jX_j^{i+v}$ and rewritten as

$$Y_jX_j^{i+v} + \Lambda_1Y_jX_j^{i+v-1} + \cdots + \Lambda_vY_jX_j^i = 0,$$

and summing these equations over $j = 1, \dots, v$, the sums of the terms $Y_jX_j^{\,\cdot}$ can be rewritten as syndromes,

$$S_{i+v} + \Lambda_1S_{i+v-1} + \cdots + \Lambda_vS_i = 0, \quad \text{where } i = 1, \dots, 2t - v.$$

This produces a set of 2t − v simultaneous key equations, where $\Lambda_1, \dots, \Lambda_v$ are unknown. To solve these equations for $\Lambda_1, \dots, \Lambda_v$ one can use the first v equations (over $GF(2^r)$ addition and subtraction coincide, so no minus signs appear),

$$\begin{pmatrix} S_{v+1} \\ S_{v+2} \\ S_{v+3} \\ \vdots \\ S_{2v} \end{pmatrix} =
\begin{pmatrix}
S_v & S_{v-1} & \dots & S_1 \\
S_{v+1} & S_v & \dots & S_2 \\
S_{v+2} & S_{v+1} & \dots & S_3 \\
\vdots & \vdots & & \vdots \\
S_{2v-1} & S_{2v-2} & \dots & S_v
\end{pmatrix}
\begin{pmatrix} \Lambda_1 \\ \Lambda_2 \\ \Lambda_3 \\ \vdots \\ \Lambda_v \end{pmatrix}, \tag{7}$$

except that v is unknown here. To find v it is necessary to calculate the determinant of the matrix for each value of v. These calculations should start with v = t and continue downwards until a nonzero determinant is found [7].


A nonzero determinant means that the equations are independent and can be solved. The coefficients of the error locator polynomial are then found by inverting the matrix and solving the equations. When the coefficients of the error locator polynomial are found, they can be used to find the sought error location numbers that indicate the position of each error. When the error locator polynomial is written as

$$\Lambda(x) = X_1(x - X_1^{-1})\,X_2(x - X_2^{-1})\dots,$$

the function value is zero if $x = X_1^{-1}, X_2^{-1}, \dots$, and this is the case when $x = \alpha^{-j_1}, \alpha^{-j_2}, \dots$; hence the values of $X_1, \dots, X_v$ are found by trial and error. This can be done using the Chien search, that is, trying the powers $\alpha^j$ for j between 0 and n − 1, since this covers the whole field, to find the roots of Λ(x). The values $\alpha^j$ for j between 0 and n − 1 are substituted into Equation (6) and each result is evaluated. If the expression evaluates to zero, then that value of x is a root and identifies an error location; here j gives the error location number [3].

4.1.2 Berlekamp–Massey algorithm

This algorithm gives an alternative method for finding the error locator polynomial that is faster than the direct method. Here the error locator polynomial σ(x) is calculated from the syndromes $S_1, \dots, S_{2t}$. Let $\sigma^R(x) = 1 + \sigma_{t-1}x + \sigma_{t-2}x^2 + \cdots + \sigma_0x^t$, which can be seen as the reverse of the error locator polynomial σ(x), and let $S(x) = 1 + S_1x + S_2x^2 + \cdots + S_{2t}x^{2t}$ be the syndrome polynomial. By the division algorithm one can then write

$$\sigma^R(x)S(x) = q(x)x^{2t+1} + r(x),$$

where the degree of r(x) is less than or equal to 2t.

This version of the algorithm produces a polynomial $P_{2t}(x)$ satisfying $P_{2t}(x)S(x) = q_{2t}(x)x^{2t+1} + r_{2t}(x)$, where the degree of $P_{2t}(x)$ is less than or equal to t and the degree of $r_{2t}(x)$ is less than t; hence $P_{2t}(x)$ is equal to $\sigma^R(x)$. Now let

$$q_i(x) = q_{i,0} + q_{i,1}x + \cdots + q_{i,2t-1-i}x^{2t-1-i}$$

and also let

$$p_i(x) = x^{2t+1-i}P_i(x) = p_{i,0} + p_{i,1}x + \cdots + p_{i,l}x^l;$$


then at step i the algorithm calculates $q_i(x)$, $p_i(x)$ and the integers $D_i$ and $z_i$.

The following steps are then used to calculate the error locator polynomial with the Berlekamp-Massey algorithm. Let T(x) be the transmitted codeword, encoded using a generator polynomial g(x) constructed in the same manner as in Equation (1), and let the received vector with some error be R(x). The decoding process then continues as follows:

1. Calculate the syndromes of the received vector as $S_i = R(\alpha^i)$, for $i = 1, \dots, 2t$.

2. Now define

$$q_{-1} = 1 + S_1x + S_2x^2 + \cdots + S_{2t}x^{2t}, \qquad q_0 = S_1 + S_2x + \cdots + S_{2t}x^{2t-1},$$
$$p_{-1} = x^{2t+1} \qquad \text{and} \qquad p_0 = x^{2t},$$

as well as the initial conditions $D_{-1} = -1$, $D_0 = 0$ and $z_0 = 1$.

3. Then, for $i = 1, \dots, 2t$, the quantities $q_i(x)$, $p_i(x)$, $D_i$ and $z_i$ are defined recursively, with two different cases, as follows:

(a) If $q_{i-1,0} = 0$, then

$$q_i(x) = \frac{q_{i-1}(x)}{x}, \qquad p_i(x) = \frac{p_{i-1}(x)}{x},$$
$$D_i = 2 + D_{i-1} \quad \text{and} \quad z_i = z_{i-1}.$$

(b) If $q_{i-1,0} \neq 0$, then

$$q_i(x) = \frac{q_{i-1}(x) - \frac{q_{i-1,0}}{q_{z_{i-1},0}}\, q_{z_{i-1}}(x)}{x}, \qquad
p_i(x) = \frac{p_{i-1}(x) - \frac{q_{i-1,0}}{q_{z_{i-1},0}}\, p_{z_{i-1}}(x)}{x},$$
$$D_i = 2 + \min\{D_{i-1}, D_{z_{i-1}}\} \quad \text{and} \quad
z_i = \begin{cases} i - 1 & \text{if } D_{i-1} \geq D_{z_{i-1}}, \\ z_{i-1} & \text{otherwise.} \end{cases}$$


If e errors, with e less than or equal to t, have occurred during the transmission of the codeword, then $p_{2t}(x)$, which equals $\sigma^R(x)$, has degree e, and the error locator polynomial $\sigma = p_{2t,e} + p_{2t,e-1}x + \cdots + p_{2t,1}x^{e-1} + x^e$ has e distinct roots. These roots can be found in a similar way as in the direct method, and they give the error location numbers [5].
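For comparison, here is a sketch of the Berlekamp-Massey algorithm in its common textbook form over a prime field, computing Λ(x) from the syndromes. This is not the exact $p_i, q_i$ recursion above, only one standard formulation of the same algorithm, and it is illustrative code rather than the thesis program:

```python
# Sketch: textbook Berlekamp-Massey over GF(p); returns the coefficients of
# Lambda(x), constant term first, given the syndromes S_1, ..., S_2t.
def berlekamp_massey(S, p):
    C, B = [1], [1]        # current and previous connection polynomials
    L, m, b = 0, 1, 1      # L counts the errors found so far
    for i in range(len(S)):
        # discrepancy between the next syndrome and the current prediction
        delta = (S[i] + sum(C[j] * S[i - j] for j in range(1, L + 1))) % p
        if delta == 0:
            m += 1
            continue
        T = C[:]
        coef = delta * pow(b, p - 2, p) % p
        if len(B) + m > len(C):
            C = C + [0] * (len(B) + m - len(C))
        for j, bj in enumerate(B):
            C[j + m] = (C[j + m] - coef * bj) % p
        if 2 * L <= i:
            L, B, b, m = i + 1 - L, T, delta, 1
        else:
            m += 1
    return C

print(berlekamp_massey([4, 3], 7))   # [1, 1] -> Lambda(x) = 1 + x
```

For the toy GF(7) example this gives $\Lambda(x) = 1 + x \equiv 1 - 6x \pmod 7$, whose root $x = 6 = X_1^{-1}$ matches the error locator $X_1 = 6$ found with the direct method.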

4.2 My program

The Mathematica program that I have used can be found in Appendix A; in this section a specific example run of the program shows how it works. The first part of the program takes a text together with some n and k, representing the Reed-Solomon code [n, k], and encodes the text. The encoding process starts with each letter being represented by a field element, and k such elements then form one block of length k. This block is encoded by the $k \times n$ generating matrix G, constructed from the generating polynomial g(x). This polynomial consists of consecutive powers of the smallest primitive element α modulo n,

$$g(x) = (x - \alpha^1)(x - \alpha^2)\dots(x - \alpha^{n-k}).$$

For this example n = 257 and k = 249, the generating polynomial is $g(x) = 44 + 118x + 4x^2 + 174x^3 + 156x^4 + 42x^5 + 157x^6 + 183x^7 + x^8$, the computations are done mod n, and here α = 3. The generating matrix G is constructed in the same manner as for a cyclic code, and the blocks consisting of the message elements are multiplied by G to encode the message. The output of the program shows how many errors are introduced in each block (at most t, where $d(C) \geq 2t + 1$ according to Theorem 1), together with one calculation of the error positions by the direct method and one by the Berlekamp–Massey algorithm.

Example 13. If 2 errors are introduced in one block of the text found in Appendix A, there are errors in two different positions, represented by $x^j$. The number of errors, as well as their positions, are of course unknown when the decoding process begins. First the syndrome values of the corresponding vector are calculated; here they are

$$\{17, 31, 133, 61, 71, 20, 176, 155\},$$

and they are the same for both methods. From now on the methods differ; the first description will consider the direct method.
