Degree Project in Electrical Engineering, Second Cycle, 30 Credits
Stockholm, Sweden 2018

Polar Codes for Identification Systems

LINGHUI ZHOU

TRITA-EECS-EX-2018:103, ISSN 1653-5146
Abstract
Identification systems are ubiquitous; consider, for example, biometric identification systems based on fingerprints or Face ID. The identification problem consists of two phases. In the enrollment phase, the user's data are captured, compressed and stored, for example by taking a fingerprint or extracting important features of a face. In the identification phase, an observation, such as a fingerprint or a face, is compared with the information stored in the database to provide an affirmative answer. Since the system involves many users, both storing the data and searching for the correct user are challenging.
This project aims to implement compression and identification algorithms for a high-dimensional identification system with M users, using polar codes as the main toolbox. First, we implement polar codes for source compression and then design corresponding identification mappings. The source compression can be seen as the channel decoding of polar codes. In the identification phase, the observation can be seen as side information, so we consider using Wyner-Ziv coding with polar codes to reconstruct and identify. In the next step, we implement polar codes for two-layer Wyner-Ziv coding for identification systems. This enables us to store the compressed data in separate databases and perform the reconstruction in two stages. With the enrollment mapping and identification mapping implemented, we evaluate the performance of the designed identification systems in terms of, for example, identification error rate and complexity. Possible further directions are to implement more advanced algorithms such as simplified or fast simplified successive cancellation encoding in source coding and universal decoding in identification.
Sammanfattning
Identification systems occur everywhere, for example biometric identification systems using fingerprints and face recognition. Fundamentally, the problem can be broken down into two phases. In the enrollment phase, data about the user are collected, compressed and stored, for example by taking a fingerprint or extracting important facial features. In the identification phase, an observation, your fingerprint or face, is compared with previously stored information to provide a positive answer. Since the system handles many users, both storage and the search for the correct user are challenging.

The purpose of this project is to design and implement efficient compression and identification algorithms for the high-dimensional identification system with M users. Polar codes are used as the main tool. First we implement polar codes for efficient source compression and then design the corresponding identification mapping. Source compression can be seen as the channel decoding of polar codes, and in the identification phase the observation can be seen as side information, so we consider using Wyner-Ziv coding with polar codes to reconstruct and identify. In the next step we implement polar codes for two-layer Wyner-Ziv problems. This allows us to store compressed data in separate databases and reconstruct in two stages. With the enrollment mapping and identification mapping implemented, we evaluate the performance of the designed identification systems with measures such as identification error rate and computational complexity.
Acknowledgment
First, I would like to express my sincerest gratitude to my examiner, Tobias Oechtering, Asst. Prof. at the Department of Information Science and Engineering of the Royal Institute of Technology (KTH), who provided me with the opportunity to do this master thesis and supervised it. I would like to thank my supervisor, Minh Thanh Vu, for his supervision, valuable advice and patience throughout this master thesis project. Finally, I would like to thank my family and friends for their constant support and encouragement.
List of Symbols and Abbreviations
Symbol Definition
X random variable
X alphabet
x realization of X
|X | cardinality of the alphabet X
W(y|x)  transition probability of channel W
C(W)  capacity of channel W
I(W)  symmetric capacity of channel W
Z(W)  Bhattacharyya parameter of channel W
W_N^{(i)}  the i-th bit channel
W^N  N independent uses of channel W
W_N  vector channel of size N
log(·)  logarithm to base 2
h_2(p)  binary entropy function, −p log p − (1 − p) log(1 − p)
O(N)  asymptotic complexity of N
M number of users in an identification system
α ∗ β α(1 − β) + β(1 − α)
Ber(p) Bernoulli distribution with expectation p
Abbreviations Definition
B-DMC Binary Discrete Memoryless Channel
BEC Binary Erasure Channel
BSC Binary Symmetric Channel
LR Likelihood Ratio
LLR Log Likelihood Ratio
SC Successive Cancellation decoder
SCL Successive Cancellation List decoder
RV Random Variable
MMI maximum mutual information
ML maximum likelihood
l.c.e. lower convex envelope
iff if and only if
i.i.d. independent and identically distributed
Contents

1 Introduction
1.1 Motivation
1.2 Societal Impact
1.3 Introduction of Identification Systems
1.4 Thesis Outline

2 Polar Codes for Channel Coding
2.1 Polarization Basics
2.1.1 Binary Input Channels
2.1.2 Binary Discrete Memoryless Channel
2.2 Channel Transform
2.2.1 Basic Channel Transform
2.2.2 Recursive Channel Transform
2.2.3 Channel Polarization
2.3 Code Construction
2.4 Polar Codes Achieve Channel Capacity
2.5 Decoding Algorithms
2.5.1 Successive Cancellation Decoder
2.6 Successive Cancellation List Decoding
2.7 Complexity Analysis

3 Polar Codes for Source Coding
3.1 Source Coding Basics
3.2 Successive Cancellation Encoder
3.3 List based SC Encoder
3.4 Simulation Results and Discussion

4 Polar Codes with Side Information
4.1 Wyner-Ziv Problem
4.2 Two-layer Wyner-Ziv Coding
4.2.1 Two-layer Polar Coding
4.2.2 Two-layer Wyner-Ziv Encoding
4.2.3 Two-layer Wyner-Ziv Decoding
4.3 Simulation Results
4.3.1 One-layer Polar Codes for Wyner-Ziv Problem
4.3.2 Two-layer Polar Codes for Wyner-Ziv Problem

5 Identification System
5.1 Model of Identification System
5.2 Polar Codes for Identification Systems
5.2.1 Basic Identification Systems
5.2.2 Wyner-Ziv Scenario Based Identification Systems
5.2.3 Two-layer Identification Systems
5.2.4 Two-layer Identification System with Pre-processing
5.3 Simulation Results and Discussion
5.3.1 One-layer Polar Codes for Identification Systems
5.3.2 Two-layer Polar Codes for Identification Systems
5.4 Complexity Analysis

6 Conclusion and Future Work
6.1 Conclusion
6.2 Future Work

Bibliography
Chapter 1
Introduction
1.1 Motivation
The issue of biometric identification has attracted considerable attention in the last few decades. An introduction to biometric identification systems was given in [1]. Biometric identification systems, which use physical features to identify individuals, ensure greater security than traditional identification strategies.

The most common traditional identification methods are passwords, keys, electric tokens, and cards. Passwords can be forgotten, and keys or cards can be lost or stolen. The physical features of a human, however, are unique to each individual and unlikely to change over time. The most common physical features are the face, fingerprint, voice, iris, hand geometry, etc. A comparison between these five biometrics was given in [2]. Depending on the application and the characteristics of the biometric features, a specific biometric feature can be matched to an application [3]. However, unlike the traditional identification methods, implementing a biometric identification system requires storing the biometric data of the users and reconstructing from the database. In this work, we are interested in finding an efficient compression mechanism and reconstruction method.
Polar codes, recently proposed by Arikan [6], were the first codes proven to achieve the capacity of binary-input discrete memoryless channels (B-DMCs). However, the results obtained at short sequence lengths are not satisfying. It was shown in [7], [8] that, with list based successive cancellation decoding, polar codes achieve better performance at short sequence lengths. In [10], polar codes were also proved to be optimal for lossy source coding. In [16], it was shown that polar codes are also optimal for the Wyner-Ziv scenario. Polar codes for two-layer Wyner-Ziv coding were discussed in [20]. In this project, we use polar codes for source compression and reconstruction in an identification system. We also discuss how list based successive cancellation influences the performance of source coding. In addition, we consider implementing polar codes for two-layer Wyner-Ziv coding, which generates two separate databases.
1.2 Societal Impact
Identification systems play an increasingly critical role in our society. As a result, more accurate and faster identification becomes a crucial task. Biometric identification systems are being adopted widely, by both private organizations and governmental institutions, regardless of political or economic structure, size or geography. It was estimated that the biometrics market will grow from 12.84 billion dollars in 2016 to 29.41 billion dollars by 2022 [4].
1.3 Introduction of Identification Systems
Biometric identification systems, which use physical features to identify indi- viduals, ensure better security than password or numbers. Some of the most common and best-known features are the face, fingerprints, voice, irises, etc.
Generally, a biometric identification system involves two phases. In the first phase, the enrollment phase, the physical features of the observed individuals are quantized and stored in the database. In the identification phase, a noisy version of the biometric data of an unknown individual is observed. The observed data are compared to the enrolled data in the database to decide which user has been observed.

Since an identification system may involve a large number of individuals, it can be difficult to store the original data, and it becomes necessary to compress the data efficiently. Possible solutions are data mining, efficient data compression mechanisms, and storing data in several devices separately. In this thesis, we focus on the second and third aspects. In addition, we also consider implementing corresponding identification mappings.
1.4 Thesis Outline
The report is organized as follows.
• In Chapter 2, we introduce the basics of polar codes, including channel polarization and the channel transform. Successive cancellation decoding for channel coding with polar codes is also discussed.
• In Chapter 3, we introduce polar codes for source coding. Two encoders are applied: the successive cancellation encoder and the list based successive cancellation encoder.
• In Chapter 4, we discuss polar codes for the Wyner-Ziv problem. The two-layer Wyner-Ziv problem is also discussed.
• In Chapter 5, we consider the model of an identification system and implement polar codes for data compression as well as reconstruction.
• In Chapter 6, we briefly discuss the conclusions, challenges and future work on polar codes for identification systems.
Chapter 2
Polar Codes for Channel Coding
In this chapter, we will discuss the basics of polar codes for channel coding.
This is based on the work of Arikan [6].
Polar code construction is based on the following transformation. Given the input $u_1^N$, apply the encoding operation $x_1^N = u_1^N G_N$ and transmit $x_1^N$ through $N$ copies of a B-DMC $W$. The transformation matrix $G_N$ is defined as

    G_N = G_2^{\otimes n} R_N,

where $G_2^{\otimes n}$ is the $n$-th Kronecker power of $G_2$ and $R_N$ is the bit-reversal permutation matrix. The matrix $R_N$ can be interpreted as the bit-reversal operator: if $v_1^N = u_1^N R_N$, then $v_{b_1, \dots, b_n} = u_{b_n, \dots, b_1}$. The $n$-th Kronecker power of $G_2$ is defined as

    G_2^{\otimes n} = G_2 \otimes G_2^{\otimes (n-1)} =
    \begin{pmatrix} G_2^{\otimes (n-1)} & 0 \\ G_2^{\otimes (n-1)} & G_2^{\otimes (n-1)} \end{pmatrix}.   (2.1)

Here, the base case is

    G_2 = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}.
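The construction above can be sketched in a few lines of Python; this is a minimal sketch with our own helper names (`kron`, `bit_reverse`, `polar_generator`), not code from the thesis. It builds $G_2^{\otimes n}$ recursively and applies the bit-reversal permutation, reproducing the matrix $G_4$ given in (2.17):

```python
# A minimal sketch of the transformation matrix G_N, using plain Python lists.
# Helper names (kron, bit_reverse, polar_generator) are our own.

def kron(a, b):
    """Kronecker product of two 0/1 matrices."""
    return [[x * y for x in arow for y in brow]
            for arow in a for brow in b]

def bit_reverse(i, n):
    """Reverse the n-bit binary representation of i."""
    r = 0
    for _ in range(n):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def polar_generator(n):
    """G_N for N = 2^n: the n-th Kronecker power of G_2 with the
    bit-reversal permutation applied (row i taken from row rev(i))."""
    g2 = [[1, 0], [1, 1]]
    g = [[1]]
    for _ in range(n):
        g = kron(g2, g)
    return [g[bit_reverse(i, n)] for i in range(2 ** n)]

# n = 2 gives the 4x4 matrix G_4 of equation (2.17):
print(polar_generator(2))
```

Since $R_N$ only permutes rows by bit reversal, applying it as a row lookup avoids building the permutation matrix explicitly.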
Next, applying the chain rule to the mutual information between the input $U_1^N$ and the output $Y_1^N$ gives

    I(U_1^N; Y_1^N) = \sum_{i=1}^{N} I(U_i; Y_1^N | U_1^{i-1}) = \sum_{i=1}^{N} I(U_i; Y_1^N, U_1^{i-1}).
The essential observation behind polar codes is that, as the block size $N$ increases, the terms in the summation approach either 0 or 1. This phenomenon is referred to as channel polarization.
In the following sections, we will give more details about polar codes.
2.1 Polarization Basics
2.1.1 Binary Input Channels
Assume X is the field of size two and Y is an arbitrary set. X and Y are the input and output alphabets of a channel W . Denote the channel as W : X → Y.
Then the probability of observing $Y = y \in \mathcal{Y}$ when the input is $X = x \in \mathcal{X}$ is given by

    \Pr\{Y = y | X = 0\} = W(y|0) \quad \text{and} \quad \Pr\{Y = y | X = 1\} = W(y|1).   (2.2)
2.1.2 Binary Discrete Memoryless Channel
Among the binary input channels, the binary discrete memoryless channel (B-DMC) is an important class in information theory. We write $W^N$ to denote the channel corresponding to $N$ independent uses of channel $W$; therefore, $W^N : \mathcal{X}^N \to \mathcal{Y}^N$ with

    W^N(y_1^N | x_1^N) = \prod_{i=1}^{N} W(y_i | x_i).
Given a B-DMC W , we can measure the rate and reliability of W by deter- mining two parameters, the symmetric capacity and the Bhattacharyya param- eter.
Definition 1. The symmetric capacity is defined as

    I(W) \triangleq \sum_{y \in \mathcal{Y}} \sum_{x \in \mathcal{X}} \frac{1}{2} W(y|x) \log \frac{W(y|x)}{\frac{1}{2} W(y|0) + \frac{1}{2} W(y|1)}.   (2.3)

The symmetric capacity $I(W)$ equals the Shannon capacity when $W$ is a symmetric channel. A channel is symmetric if there exists a permutation $\pi$ such that for each output symbol $y$, $W(y|1) = W(\pi(y)|0)$. Two examples of symmetric channels are the binary symmetric channel (BSC) and the binary erasure channel (BEC).
A BSC with crossover probability $p_e$ is a B-DMC $W$ with output alphabet $\mathcal{Y} = \{0, 1\}$, $W(0|0) = W(1|1) = 1 - p_e$ and $W(1|0) = W(0|1) = p_e$. The Shannon capacity of a BSC($p_e$) is

    C(\text{BSC}(p_e)) = 1 - h_2(p_e),

where $h_2$ is the binary entropy function, $h_2(p_e) = -p_e \log(p_e) - (1 - p_e) \log(1 - p_e)$.

A BEC with erasure probability $p_e$ is a B-DMC with $W(0|0) = W(1|1) = 1 - p_e$ and $W(e|0) = W(e|1) = p_e$, where $e$ is the erasure symbol. The Shannon capacity of a BEC($p_e$) is

    C(\text{BEC}(p_e)) = 1 - p_e.
Another important parameter is the Bhattacharyya parameter, which is defined as follows.

Definition 2. The Bhattacharyya parameter is defined as

    Z(W) \triangleq \sum_{y \in \mathcal{Y}} \sqrt{W(y|0) W(y|1)}.   (2.4)
In the code construction of polar codes, we focus on the Bhattacharyya parameters. The Bhattacharyya parameter is an upper bound on the maximum likelihood (ML) decision error probability [6]. More details about the properties of the Bhattacharyya parameter and its relationship with the block error probability will be discussed in later sections.
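As a quick sanity check of Definitions 1 and 2, the following Python sketch (helper names are ours, not the thesis code) evaluates $I(W)$ and $Z(W)$ directly from the transition probabilities of a BSC and a BEC and compares them with the closed forms quoted in this chapter:

```python
import math

# Sketch: I(W) per (2.3) and Z(W) per (2.4), evaluated numerically.
# A channel is a dict mapping (y, x) -> W(y|x); names are our own.

def symmetric_capacity(channel):
    """I(W) per (2.3), uniform input."""
    ys = {y for (y, _) in channel}
    total = 0.0
    for y in ys:
        q = 0.5 * channel[(y, 0)] + 0.5 * channel[(y, 1)]
        for x in (0, 1):
            w = channel[(y, x)]
            if w > 0:
                total += 0.5 * w * math.log2(w / q)
    return total

def bhattacharyya(channel):
    """Z(W) per (2.4)."""
    ys = {y for (y, _) in channel}
    return sum(math.sqrt(channel[(y, 0)] * channel[(y, 1)]) for y in ys)

p = 0.11
bsc = {('0', 0): 1 - p, ('1', 0): p, ('0', 1): p, ('1', 1): 1 - p}
e = 0.3
bec = {('0', 0): 1 - e, ('e', 0): e, ('1', 0): 0.0,
       ('1', 1): 1 - e, ('e', 1): e, ('0', 1): 0.0}

h2 = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
print(abs(symmetric_capacity(bsc) - (1 - h2)))                # ~0: I(BSC(p)) = 1 - h2(p)
print(abs(bhattacharyya(bsc) - 2 * math.sqrt(p * (1 - p))))   # ~0: Z(BSC(p)) = 2 sqrt(p(1-p))
print(abs(symmetric_capacity(bec) - (1 - e)))                 # ~0: I(BEC(e)) = 1 - e
print(abs(bhattacharyya(bec) - e))                            # ~0: Z(BEC(e)) = e
```

For both symmetric channels the numerically evaluated $I(W)$ matches the Shannon capacity, as claimed above.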
2.2 Channel Transform
In this section, we discuss the channel transform and the transformations of $I(W)$ and $Z(W)$. First we introduce the basic (single-level) transform, then we extend it to the recursive transform.
2.2.1 Basic Channel Transform
Let X = {0, 1}, W : X → Y be a B-DMC, and U12 a random vector that is uniformly distributed over X2. Consider the following channel combining of two channels as depicted in Figure 2.1.
Figure 2.1: The basic channel transform
Denote the input of the channel by $x_1^2 = u_1^2 G_2$ and let $y_1^2$ be the corresponding outputs. We have the transition probabilities

    W_2(y_1^2 | u_1^2) \triangleq \prod_{i=1}^{2} W(y_i | x_i),   (2.5)

where

    G_2 = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}.   (2.6)
The channel combining here describes how two individual channels $W$ are transformed into a new channel $W_2 : \mathcal{X}^2 \to \mathcal{Y}^2$. Since the transform between $U_1^2$ and $X_1^2$ is linear and bijective, the mutual information between the input $U_1^2$ and the output $Y_1^2$ satisfies

    I(U_1^2; Y_1^2) = I(X_1^2; Y_1^2) = 2 I(W).   (2.7)

Now we split this mutual information by applying the chain rule:

    I(U_1^2; Y_1^2) = I(U_1; Y_1^2) + I(U_2; Y_1^2 | U_1) = I(U_1; Y_1^2) + I(U_2; Y_1^2, U_1).   (2.8)

The term $I(U_1; Y_1^2)$ can be interpreted as the mutual information between the input $U_1$ and the output $Y_1^2$, with the input $U_2$ treated as noise. We denote this "channel" by $W_2^{(1)}$.

Similarly, the term $I(U_2; Y_1^2, U_1)$ can be seen as the mutual information between the input $U_2$ and the output $Y_1^2$ with $U_1$ known. We denote this "channel" by $W_2^{(2)}$.
Based on this, we can write $(W, W) \mapsto (W_2^{(1)}, W_2^{(2)})$ for any given B-DMC $W$, with

    W_2^{(1)}(y_1^2 | u_1) \triangleq \sum_{u_2} \frac{1}{2} W_2(y_1^2 | u_1^2) = \sum_{u_2} \frac{1}{2} W(y_1 | u_1 \oplus u_2) W(y_2 | u_2),   (2.9)

    W_2^{(2)}(y_1^2, u_1 | u_2) \triangleq \frac{1}{2} W_2(y_1^2 | u_1^2) = \frac{1}{2} W(y_1 | u_1 \oplus u_2) W(y_2 | u_2).   (2.10)

For simplicity, we define the following notation for these channel transformations. Given any B-DMC $W : \mathcal{X} \to \mathcal{Y}$,

    (W * W)(y_1, y_2 | u_1) \stackrel{\text{def}}{=} \frac{1}{2} \sum_{u_2 \in \mathcal{X}} W(y_1 | u_1 \oplus u_2) W(y_2 | u_2),   (2.11)

    (W \circledast W)(y_1, y_2, u_1 | u_2) \stackrel{\text{def}}{=} \frac{1}{2} W(y_1 | u_1 \oplus u_2) W(y_2 | u_2).   (2.12)

For any B-DMC $W$, the transformation $(W, W) \mapsto (W_2^{(1)}, W_2^{(2)})$ is rate-preserving and moves the symmetric capacity away from the center, in the sense that

    I(W_2^{(1)}) + I(W_2^{(2)}) = 2 I(W),   (2.13)
    I(W_2^{(1)}) \le I(W) \le I(W_2^{(2)}),   (2.14)

and the Bhattacharyya parameters of this transformation satisfy

    Z(W_2^{(1)}) \le 2 Z(W) - Z(W)^2,   (2.15)
    Z(W_2^{(2)}) = Z(W)^2.   (2.16)

Equality holds in (2.15) only when $W$ is a BEC [6].
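The relations (2.13)-(2.16) can be verified numerically for a small example. The following Python sketch (function names are ours) applies (2.9) and (2.10) literally to a BSC and evaluates the resulting symmetric capacities and Bhattacharyya parameters:

```python
import math

# Numerical check of (2.13)-(2.16) for a BSC; a sketch with our own helper
# names. A channel is a dict mapping (y, x) -> W(y|x).

def capacity(ch):
    """Symmetric capacity I(.) evaluated from transition probabilities."""
    ys = {y for (y, _) in ch}
    total = 0.0
    for y in ys:
        q = 0.5 * ch.get((y, 0), 0.0) + 0.5 * ch.get((y, 1), 0.0)
        for x in (0, 1):
            w = ch.get((y, x), 0.0)
            if w > 0:
                total += 0.5 * w * math.log2(w / q)
    return total

def bhatta(ch):
    """Bhattacharyya parameter Z(.)."""
    ys = {y for (y, _) in ch}
    return sum(math.sqrt(ch.get((y, 0), 0.0) * ch.get((y, 1), 0.0)) for y in ys)

p = 0.2
W = {(0, 0): 1 - p, (1, 0): p, (0, 1): p, (1, 1): 1 - p}

# W2^(1): output (y1, y2), per (2.9); W2^(2): output (y1, y2, u1), per (2.10)
W_minus = {((y1, y2), u1): 0.5 * sum(W[(y1, u1 ^ u2)] * W[(y2, u2)] for u2 in (0, 1))
           for y1 in (0, 1) for y2 in (0, 1) for u1 in (0, 1)}
W_plus = {((y1, y2, u1), u2): 0.5 * W[(y1, u1 ^ u2)] * W[(y2, u2)]
          for y1 in (0, 1) for y2 in (0, 1) for u1 in (0, 1) for u2 in (0, 1)}

I, Im, Ip = capacity(W), capacity(W_minus), capacity(W_plus)
Z, Zm, Zp = bhatta(W), bhatta(W_minus), bhatta(W_plus)
print(abs(Im + Ip - 2 * I) < 1e-9)   # (2.13)
print(Im <= I <= Ip)                 # (2.14)
print(Zm <= 2 * Z - Z * Z)           # (2.15), strict inequality for a BSC
print(abs(Zp - Z * Z) < 1e-9)        # (2.16)
```

The rate sum is preserved exactly, while the two new channels split symmetrically around $I(W)$, which is the starting point of the polarization argument.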
2.2.2 Recursive Channel Transform
In this section, we explain how channel combining works at higher levels. When the block size is large enough, the channels polarize into either almost noiseless or completely noisy channels.
The next level, second level (n = 2) of channel transform is illustrated in Figure 2.2.
Figure 2.2: Second level channel transform

The mapping $u_1^4 \mapsto x_1^4$ from the input of $W_4$ to the four copies of $W$ can be written as $x_1^4 = u_1^4 G_4$, where

    G_4 = R_4 G_2^{\otimes 2} =
    \begin{pmatrix}
    1 & 0 & 0 & 0 \\
    1 & 0 & 1 & 0 \\
    1 & 1 & 0 & 0 \\
    1 & 1 & 1 & 1
    \end{pmatrix}.   (2.17)
Then the transition probabilities transform as $W_4(y_1^4 | u_1^4) = W^4(y_1^4 | u_1^4 G_4)$. This operation can be generalized to higher levels in a recursive manner. Define the channel $W_N : \mathcal{X}^N \to \mathcal{Y}^N$ and perform the channel combining

    W_N(y_1^N | u_1^N) = W_{N/2}(y_1^{N/2} | u_{1,o}^N \oplus u_{1,e}^N) \, W_{N/2}(y_{N/2+1}^N | u_{1,e}^N),   (2.18)

where $u_{1,o}^N = (u_1, u_3, \dots, u_{N-1})$ and $u_{1,e}^N = (u_2, u_4, \dots, u_N)$. For the channel splitting, apply the chain rule as before:

    I(U_1^N; Y_1^N) = \sum_{i=1}^{N} I(U_i; Y_1^N, U_1^{i-1}).

The term $I(U_i; Y_1^N, U_1^{i-1})$ can be seen as the mutual information between $U_i$ and $(Y_1^N, U_1^{i-1})$. Denote this channel by $W_N^{(i)}$, with transition probability $W_N^{(i)}(y_1^N, u_1^{i-1} | u_i) \triangleq P(y_1^N, u_1^{i-1} | u_i)$.
For any $n \ge 0$, $N = 2^n$, and $1 \le i \le N$, we have

    W_{2N}^{(2i-1)}(y_1^{2N}, u_1^{2i-2} | u_{2i-1})
    = \sum_{u_{2i}} \frac{1}{2} W_N^{(i)}(y_1^N, u_{1,o}^{2i-2} \oplus u_{1,e}^{2i-2} | u_{2i-1} \oplus u_{2i}) \, W_N^{(i)}(y_{N+1}^{2N}, u_{1,e}^{2i-2} | u_{2i}),   (2.19)

    W_{2N}^{(2i)}(y_1^{2N}, u_1^{2i-1} | u_{2i})
    = \frac{1}{2} W_N^{(i)}(y_1^N, u_{1,o}^{2i-2} \oplus u_{1,e}^{2i-2} | u_{2i-1} \oplus u_{2i}) \, W_N^{(i)}(y_{N+1}^{2N}, u_{1,e}^{2i-2} | u_{2i}).   (2.20)

The above channels can be written compactly as

    W_N^{(2i-1)} = W_{N/2}^{(i)} * W_{N/2}^{(i)}, \qquad W_N^{(2i)} = W_{N/2}^{(i)} \circledast W_{N/2}^{(i)}.

For $I(W_N^{(i)})$ and $Z(W_N^{(i)})$ at higher levels, we have

    I(W_N^{(2i-1)}) \le I(W_{N/2}^{(i)}) \le I(W_N^{(2i)}), \qquad I(W_N^{(2i-1)}) + I(W_N^{(2i)}) = 2 I(W_{N/2}^{(i)}),

and

    Z(W_N^{(2i-1)}) \le 2 Z(W_{N/2}^{(i)}) - Z(W_{N/2}^{(i)})^2, \qquad Z(W_N^{(2i)}) = Z(W_{N/2}^{(i)})^2.

Equality holds when the channel $W$ is a BEC [6]. For this special case where $W$ is a BEC with erasure probability $e$, the Bhattacharyya parameters can be calculated recursively as

    Z(W_N^{(2i-1)}) = 2 Z(W_{N/2}^{(i)}) - Z(W_{N/2}^{(i)})^2,   (2.21)
    Z(W_N^{(2i)}) = Z(W_{N/2}^{(i)})^2,   (2.22)

with initialization $Z(W_1^{(1)}) = e$.
Let $P_B$ denote the block error probability. Then $P_B$ can be upper bounded as given in [6]:

    P_B \le \sum_{i \in F^c} Z(W_N^{(i)}).   (2.23)

There is also an important property of the Bhattacharyya parameter for degraded channels. Consider two B-DMCs $W$ and $W'$ and suppose $W \preceq W'$ ($W$ is degraded with respect to $W'$). Then $W_N^{(i)} \preceq W_N'^{(i)}$ and $Z(W_N^{(i)}) \ge Z(W_N'^{(i)})$ [14, Lemma 4.7].
2.2.3 Channel Polarization
In the previous section, we discussed the channel transformation from $N$ copies of a channel $W$ to the polarized "channels" $\{W_N^{(i)}\}_{i=1}^N$. Figure 2.3 illustrates the result of polarization for the case where $W$ is a BEC with erasure probability $p_e = 0.5$. The bit-channel capacities are calculated using the recursion

    I(W_N^{(2i)}) = 2 I(W_{N/2}^{(i)}) - I(W_{N/2}^{(i)})^2,   (2.24)
    I(W_N^{(2i-1)}) = I(W_{N/2}^{(i)})^2,   (2.25)

with initialization $I(W_1^{(1)}) = 1 - p_e$. This recursion follows from equations (2.21), (2.22) and the fact that $I(W_N^{(i)}) = 1 - Z(W_N^{(i)})$ for a BEC $W$.

Figure 2.3: $I(W_N^{(i)})$ for $i = 1, \dots, 512$ for a BEC(0.5)

Note that this recursion is valid only for BECs; an exact calculation for general B-DMCs is not known. Figure 2.3 shows that $I(W_N^{(i)})$ tends to approach 0 for smaller indices and 1 for larger indices. It was proved in [6] that if the block length $N$ is sufficiently large, each $I(W_N^{(i)})$ approaches either 0 or 1. This is implied by the following theorem [6].
Theorem 2.1. For any B-DMC $W$, the channels $\{W_N^{(i)}\}$ polarize in the sense that, for any fixed $\delta \in (0, 1)$, as $N$ goes to infinity through powers of two, the fraction of indices $i \in \{1, \dots, N\}$ for which $I(W_N^{(i)}) \in (1 - \delta, 1]$ goes to $I(W)$ and the fraction for which $I(W_N^{(i)}) \in [0, \delta)$ goes to $1 - I(W)$.
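Theorem 2.1 can be observed numerically for a BEC, where the exact capacity recursion is available. The sketch below (the helper name is ours) tracks the fraction of nearly perfect and nearly useless bit channels as $N$ grows:

```python
# Sketch: polarization of W = BEC(0.5) via the exact BEC recursion
# I -> (I^2, 2I - I^2). The helper name is our own.

def bec_bit_channel_capacities(e, n):
    """I(W_N^{(i)}) for N = 2^n bit channels of a BEC(e)."""
    caps = [1.0 - e]
    for _ in range(n):
        nxt = []
        for c in caps:
            nxt.append(c * c)          # minus transform, channel 2i-1
            nxt.append(2 * c - c * c)  # plus transform, channel 2i
        caps = nxt
    return caps

for n in (4, 10):
    caps = bec_bit_channel_capacities(0.5, n)
    frac_good = sum(c > 0.99 for c in caps) / len(caps)
    frac_bad = sum(c < 0.01 for c in caps) / len(caps)
    print(n, round(frac_good, 3), round(frac_bad, 3))
```

The fractions of near-perfect and near-useless channels both grow toward $I(W) = 0.5$ and $1 - I(W) = 0.5$ as $N = 2^n$ increases, while the average bit-channel capacity stays at $I(W)$, consistent with the rate-preservation in (2.13).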
2.3 Code Construction
We use the polarization effect for code construction. The idea of polar coding is to send data only through the channels for which $Z(W_N^{(i)})$ approaches 0. Figure 2.4 illustrates the code construction for polar codes with block length $N = 8$, $K = 4$, assuming the channel is $W = \text{BEC}(1/2)$. According to the discussion in Section 2.2.2, the Bhattacharyya parameters for a BEC can be calculated directly with equations (2.21) and (2.22). We rank the Bhattacharyya parameters and select the channels with the smallest $Z(W_N^{(i)})$ as information bits, which are $U_4$, $U_6$, $U_7$ and $U_8$.
Therefore the code construction problem for polar codes can be seen as finding the Bhattacharyya parameters $Z(W_N^{(i)})$. To construct an $(N, K)$ polar code, we first calculate $Z(W_N^{(i)})$ for each $i \in \{1, \dots, N\}$ and divide the indices into two parts, the free set $F^c$ and the frozen set $F$. We use the indices belonging to $F^c$ to transmit information and fix the indices belonging to $F$ to some known values, usually 0.

Figure 2.4: Code construction for polar codes with $N = 8$, $K = 4$, $W = \text{BEC}(1/2)$. The computed parameters are:

    i    Z(W_8^{(i)})    Rank    Assignment
    1    0.9961          8       frozen
    2    0.8789          7       frozen
    3    0.8086          6       frozen
    4    0.3164          4       data
    5    0.6836          5       frozen
    6    0.1914          3       data
    7    0.1211          2       data
    8    0.0039          1       data
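For the BEC example of Figure 2.4, the whole construction fits in a few lines. The following Python sketch (function names are ours) evaluates (2.21)-(2.22) and selects the $K$ indices with the smallest Bhattacharyya parameters:

```python
# Sketch of BEC code construction: Z(W_N^{(i)}) via (2.21)-(2.22), then
# keep the K most reliable indices. Function names are our own.

def bec_bhattacharyya(e, n):
    """Z(W_N^{(i)}) for N = 2^n bit channels of a BEC(e)."""
    z = [e]
    for _ in range(n):
        nxt = []
        for v in z:
            nxt.append(2 * v - v * v)  # (2.21), channel 2i-1
            nxt.append(v * v)          # (2.22), channel 2i
        z = nxt
    return z

def construct(e, n, K):
    """Return the Z values and the 1-indexed information set F^c."""
    z = bec_bhattacharyya(e, n)
    order = sorted(range(len(z)), key=lambda i: z[i])  # most reliable first
    free = sorted(i + 1 for i in order[:K])
    return z, free

z, free = construct(0.5, 3, 4)
print([round(v, 4) for v in z])  # [0.9961, 0.8789, 0.8086, 0.3164, 0.6836, 0.1914, 0.1211, 0.0039]
print(free)                      # [4, 6, 7, 8], matching Figure 2.4
```

The remaining indices {1, 2, 3, 5} form the frozen set $F$ and are fixed to 0.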
For BECs, the transformations of $Z(W_N^{(i)})$ and $I(W_N^{(i)})$ are known exactly, so the code construction can be described precisely. For other channels, the code construction problem is more complex. Arikan proposed a Monte-Carlo method for estimating the Bhattacharyya parameters [6]. First, generate samples of $(U_1^N, Y_1^N)$ with the given distribution. Then find the empirical means $\{\hat{Z}(W_N^{(i)})\}$, where each sample contributes a realization of the random variable

    \sqrt{\frac{W_N^{(i)}(Y_1^N, U_1^{i-1} | U_i \oplus 1)}{W_N^{(i)}(Y_1^N, U_1^{i-1} | U_i)}}.   (2.26)

A successive cancellation (SC) decoder can be used for this computation because the RV is the square root of the decision statistic. The details of SC decoding will be introduced in the next section.
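The estimator (2.26) can be illustrated in the simplest case $N = 1$, where no SC decoder is needed. In the following Python sketch (our own construction, not the thesis code), the empirical mean of the square-root likelihood ratio for a BSC($p$) converges to $Z(W) = 2\sqrt{p(1-p)}$:

```python
import math
import random

# Sketch: Monte-Carlo estimate of Z(W) for a BSC(p) at block length N = 1.
# Each sample contributes sqrt(W(y | x XOR 1) / W(y | x)) as in (2.26).

random.seed(1)
p, runs = 0.11, 200_000
acc = 0.0
for _ in range(runs):
    x = random.getrandbits(1)
    y = x ^ (random.random() < p)     # BSC: flip x with probability p
    w_true = 1 - p if y == x else p   # W(y | x)
    w_flip = p if y == x else 1 - p   # W(y | x XOR 1)
    acc += math.sqrt(w_flip / w_true)
z_hat = acc / runs

# The empirical mean approaches the closed form Z(BSC(p)) = 2 sqrt(p(1-p)).
print(abs(z_hat - 2 * math.sqrt(p * (1 - p))) < 0.02)
```

For general $N$, the same averaging is done per bit channel, with the ratio in (2.26) supplied by an SC decoder run on each sample.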
Algorithm 1 describes how to estimate $Z(W_N^{(i)})$ for a BSC with a pre-defined crossover probability $p$.

Algorithm 1 The Monte-Carlo estimation
Input: sequence length $N$, crossover probability $p$, number of Monte-Carlo iterations $Runs$
Output: the estimated $Z(W_N^{(i)})$ for $i = 1, 2, \dots, N$
 1: $Z \leftarrow \text{zeros}(N, 1)$
 2: for $r = 1 : Runs$ do
 3:   draw a uniformly random binary vector $x_1^N$
 4:   $y_1^N \leftarrow \text{BSC}(x_1^N, p)$
 5:   for $i = 1 : N$ do
 6:     $l \leftarrow \text{LLR}_N^{(i)}(y_1^N, x_1^{i-1} | x_i)$
 7:     if $l \ge 0$ then
 8:       $Z(W_N^{(i)}) \leftarrow \big( Z(W_N^{(i)}) \cdot (r - 1) + e^{-l/2} \big) / r$
 9:     else
10:       $Z(W_N^{(i)}) \leftarrow \big( Z(W_N^{(i)}) \cdot (r - 1) + e^{l/2} \big) / r$
11: return $Z$
2.4 Polar Codes Achieve Channel Capacity
In the previous sections, we have seen how the channels polarize and how to take advantage of polarization for code construction. According to Theorem 2.1, the fraction of "clean" channels approaches $I(W)$; therefore the achievable rate is close to $I(W)$.

Recall that in the channel splitting we defined the bit channel $W_N^{(i)}$ with respect to the mutual information term $I(U_i; Y_1^N, U_1^{i-1})$. To realize such a channel, the decoder should have access to $U_1^{i-1}$ and the output $Y_1^N$. Therefore, consider using the SC decoder, which decodes in the order $U_1, \dots, U_N$. In this way, the decoder has an estimate of $U_1^{i-1}$ when decoding $U_i$. Based on this idea, Arikan proposed SC decoding based on computing the likelihood ratio (LR) [6],

    L_N^{(i)}(y_1^N, \hat{u}_1^{i-1}) \triangleq \frac{W_N^{(i)}(y_1^N, \hat{u}_1^{i-1} | 0)}{W_N^{(i)}(y_1^N, \hat{u}_1^{i-1} | 1)},

and generates decisions as follows:

(a) If $i \in F$, then set $\hat{u}_i = u_i$.
(b) If $i \in F^c$, then calculate $L_N^{(i)}$ and set

    \hat{u}_i = \begin{cases} 0 & L_N^{(i)}(y_1^N, \hat{u}_1^{i-1}) \ge 1 \\ 1 & \text{otherwise.} \end{cases}
As stated in equation (2.23), the block error probability $P_B$ can be upper bounded as

    P_B \le \sum_{i \in F^c} Z(W_N^{(i)}).

For the block error probability $P_B$ to become sufficiently small or vanish, the Bhattacharyya parameters $Z(W_N^{(i)})$ for $i \in F^c$ should approach 0. In [17], Arikan and Telatar obtained the following result, which gives the rate at which $Z(W_N^{(i)})$ approaches 0.

Theorem 2.2. Given a B-DMC $W$ and any $\beta < 1/2$,

    \lim_{n \to \infty} \Pr\big( Z(W_N^{(i)}) \le 2^{-N^\beta} \text{ for } i \in \{1, \dots, N\} \big) = I(W).   (2.27)

In [6], Arikan proved that polar codes achieve the symmetric capacity.

Theorem 2.3. Given a B-DMC $W$ and a fixed rate $R < I(W)$, for any $\beta < 1/2$ there exists a sequence of polar codes of rate $R_N < R$ such that the block error probability satisfies

    P_N = O(2^{-N^\beta}).

This can be proved with the following code construction [14]. For any $0 < \beta < 1/2$ and $\epsilon > 0$, choose the frozen set $F$ as

    F = \Big\{ i : Z(W_N^{(i)}) > \frac{1}{N} 2^{-N^\beta} \Big\}.

Theorem 2.2 implies that, for sufficiently large block length $N$,

    \frac{|F^c|}{N} \ge I(W) - \epsilon.

The block error probability of this scheme with SC decoding is

    P_B(F) \le \sum_{i \in F^c} Z(W_N^{(i)}) \le 2^{-N^\beta}.

This proves Theorem 2.3.
2.5 Decoding Algorithms
2.5.1 Successive Cancellation Decoder
We have discussed SC decoding in the previous section; in this section, we study its details. Recall that SC decoding is realized by calculating the likelihood ratios (LRs). The LRs can be calculated using the recursive structure in equation (2.18), which gives

    L_N^{(2i-1)}(y_1^N, \hat{u}_1^{2i-2}) =
    \frac{L_{N/2}^{(i)}(y_1^{N/2}, \hat{u}_{1,o}^{2i-2} \oplus \hat{u}_{1,e}^{2i-2}) \, L_{N/2}^{(i)}(y_{N/2+1}^N, \hat{u}_{1,e}^{2i-2}) + 1}
         {L_{N/2}^{(i)}(y_1^{N/2}, \hat{u}_{1,o}^{2i-2} \oplus \hat{u}_{1,e}^{2i-2}) + L_{N/2}^{(i)}(y_{N/2+1}^N, \hat{u}_{1,e}^{2i-2})},

and

    L_N^{(2i)}(y_1^N, \hat{u}_1^{2i-1}) =
    \Big[ L_{N/2}^{(i)}(y_1^{N/2}, \hat{u}_{1,o}^{2i-2} \oplus \hat{u}_{1,e}^{2i-2}) \Big]^{1 - 2\hat{u}_{2i-1}}
    \cdot L_{N/2}^{(i)}(y_{N/2+1}^N, \hat{u}_{1,e}^{2i-2}).

This calculation can be recursively reduced to block length 1 with the initialization $L_1^{(1)}(y_i) = W(y_i|0)/W(y_i|1)$, which can be computed directly from the output sequence $y$ and the channel parameter.

To avoid a large number of multiplications, the successive cancellation can be performed in the logarithm domain, where the decision rule becomes:

(a) If $i \in F$, then set $\hat{u}_i = u_i$.
(b) If $i \in F^c$, compute the log-likelihood ratio and set

    \hat{u}_i = \begin{cases} 0 & \ln \dfrac{W_N^{(i)}(y_1^N, \hat{u}_1^{i-1} | 0)}{W_N^{(i)}(y_1^N, \hat{u}_1^{i-1} | 1)} \ge 0 \\ 1 & \text{otherwise.} \end{cases}

For simplicity, denote $\text{LLR}_{N/2}^{(i)}(y_1^{N/2}, \hat{u}_{1,o}^{2i-2} \oplus \hat{u}_{1,e}^{2i-2})$ by $\text{LLR}_1$ and $\text{LLR}_{N/2}^{(i)}(y_{N/2+1}^N, \hat{u}_{1,e}^{2i-2})$ by $\text{LLR}_2$. Then in the log domain, the recursive calculation becomes

    \text{LLR}_N^{(2i-1)}(y_1^N, \hat{u}_1^{2i-2}) = 2 \tanh^{-1}\Big( \tanh\frac{\text{LLR}_1}{2} \tanh\frac{\text{LLR}_2}{2} \Big),   (2.28)

    \text{LLR}_N^{(2i)}(y_1^N, \hat{u}_1^{2i-1}) = (-1)^{\hat{u}_{2i-1}} \text{LLR}_1 + \text{LLR}_2
    = \begin{cases} \text{LLR}_2 + \text{LLR}_1 & \hat{u}_{2i-1} = 0 \\ \text{LLR}_2 - \text{LLR}_1 & \hat{u}_{2i-1} = 1. \end{cases}   (2.29)

Using a proper approximation [27], equation (2.28) can be approximated as

    \text{LLR}_N^{(2i-1)}(y_1^N, \hat{u}_1^{2i-2}) \approx \text{sgn}(\text{LLR}_1) \, \text{sgn}(\text{LLR}_2) \, \min(|\text{LLR}_1|, |\text{LLR}_2|).

The recursive form in the logarithm domain is much simpler, consisting only of additions and sign operations. Algorithm 2 describes a successive cancellation decoder.
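The exact update (2.28), its min-sum approximation, and the update (2.29) can be written as small functions. In this Python sketch, the names `f_exact`, `f_minsum` and `g` are ours:

```python
import math

# Sketch of the log-domain SC update rules; function names are our own.

def f_exact(a, b):
    """Exact check-node update, equation (2.28)."""
    return 2 * math.atanh(math.tanh(a / 2) * math.tanh(b / 2))

def f_minsum(a, b):
    """Min-sum approximation: sgn(a) sgn(b) min(|a|, |b|)."""
    s = (1 if a >= 0 else -1) * (1 if b >= 0 else -1)
    return s * min(abs(a), abs(b))

def g(a, b, u):
    """Variable-node update, equation (2.29): (-1)^u * LLR1 + LLR2."""
    return b + a if u == 0 else b - a

for a, b in [(1.5, -2.0), (0.3, 0.4), (-3.0, -5.0)]:
    print(round(f_exact(a, b), 3), round(f_minsum(a, b), 3))
```

The min-sum value always has the same sign as the exact value and bounds its magnitude from above by $\min(|\text{LLR}_1|, |\text{LLR}_2|)$, which is why the approximation preserves the hard decisions well at moderate LLR magnitudes.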
Algorithm 2 Successive cancellation decoder
Input: received vector $y$ of length $N$, frozen set $F$
Output: decoded sequence $\hat{u}$ and re-encoded codeword $x$
 1: $\hat{u} \leftarrow \text{zeros}(N, 1)$
 2: for $i = 1 : N$ do
 3:   if $i \in F$ then
 4:     $\hat{u}_i \leftarrow 0$
 5:   else
 6:     $l \leftarrow \text{LLR}_N^{(i)}(y_1^N, \hat{u}_1^{i-1})$
 7:     if $l \ge 0$ then
 8:       $\hat{u}_i \leftarrow 0$
 9:     else
10:       $\hat{u}_i \leftarrow 1$
11: $x \leftarrow \hat{u} G_N$
12: return $x$
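A compact recursive implementation of SC decoding can tie the pieces of this chapter together. The sketch below (all names are ours) uses the natural, non-bit-reversed ordering $x = u F^{\otimes n}$; for $W = \text{BEC}(1/2)$ and $N = 8$ this ordering happens to yield the same information set $\{4, 6, 7, 8\}$ as Figure 2.4. The channel is noiseless here, purely to check the encode/decode round trip:

```python
import math

# End-to-end sketch of SC decoding for an (8, 4) polar code; all names are
# our own. Encoding uses the natural (non-bit-reversed) order x = u F^{(x)n}.

def polar_encode(u):
    """x = u * F^{(tensor) n} over GF(2)."""
    if len(u) == 1:
        return u[:]
    half = len(u) // 2
    a = [u[i] ^ u[i + half] for i in range(half)]
    return polar_encode(a) + polar_encode(u[half:])

def f(a, b):
    """Exact check-node LLR update, equation (2.28)."""
    return 2 * math.atanh(math.tanh(a / 2) * math.tanh(b / 2))

def sc_decode(llr, frozen):
    """Return (u_hat, x_hat) for the sub-code seen through `llr`."""
    if len(llr) == 1:
        bit = 0 if frozen[0] else int(llr[0] < 0)
        return [bit], [bit]
    half = len(llr) // 2
    # Upper branch: LLRs of x_left XOR x_right, via (2.28).
    u1, x1 = sc_decode([f(llr[i], llr[i + half]) for i in range(half)],
                       frozen[:half])
    # Lower branch: (2.29), using the re-encoded partial sums x1.
    u2, x2 = sc_decode([llr[i + half] + (llr[i] if x1[i] == 0 else -llr[i])
                        for i in range(half)],
                       frozen[half:])
    return u1 + u2, [x1[i] ^ x2[i] for i in range(half)] + x2

frozen = [True, True, True, False, True, False, False, False]  # info at 4,6,7,8
u = [0, 0, 0, 1, 0, 1, 1, 0]   # message bits 1,1,1,0 on the free positions
x = polar_encode(u)
llr = [4.0 if xi == 0 else -4.0 for xi in x]  # noiseless observation
u_hat, _ = sc_decode(llr, frozen)
print(u_hat == u)  # the round trip recovers the input exactly
```

With noiseless LLRs, every intermediate sign is correct, so the decoder recovers $u$ exactly; with channel noise, errors propagate through the successive decisions, which motivates the list decoder of the next section.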
2.6 Successive Cancellation List Decoding
The main drawback of the SC decoder is that once a wrong decision is made, it cannot be corrected. To avoid this problem and improve the performance, an improved version of SC decoding, the successive cancellation list (SCL) decoder, was introduced to approach the maximum likelihood decoder with an acceptable complexity [7], [8].
Similar to the SC decoder, the SCL decoder also uses the recursive calculation to make decisions. However, instead of only making one decision, the SCL