Degree Project in Electrical Engineering, Second Cycle, 30 Credits
Stockholm, Sweden 2018

Polar Codes for Identification Systems

LINGHUI ZHOU

TRITA-EECS-EX-2018:103, ISSN 1653-5146
Abstract
Identification systems are ubiquitous; consider, for example, biometric identification systems based on fingerprints or Face ID. The identification problem consists of two phases. In the enrollment phase, the user's data are captured, compressed and stored, for example by taking a fingerprint or extracting important features of a face. In the identification phase, an observation, such as a fingerprint or a face, is compared with the information stored in the database to provide an affirmative answer. Since the system involves many users, both storing the data and searching for the correct user are challenging.
This project aims to implement compression and identification algorithms for a high-dimensional identification system with M users, using polar codes as the main toolbox. First, we implement polar codes for source compression and then design corresponding identification mappings. The source compression can be seen as the channel decoding of polar codes. In the identification phase, the observation can be seen as side information, so we consider using Wyner-Ziv coding with polar codes to reconstruct and identify. In the next step, we implement polar codes for two-layer Wyner-Ziv coding for identification systems. This enables us to store the compressed data in separate databases and perform the reconstruction in two stages. With the enrollment mapping and identification mapping implemented, we evaluate the performance of the designed identification systems in terms of, for example, identification error rate and complexity. Possible further directions are to implement more advanced algorithms such as simplified or fast simplified successive cancellation encoding in source coding and universal decoding in identification.
Sammanfattning
Identification systems occur everywhere, for example biometric identification systems using fingerprints and face recognition. Fundamentally, the problem can be broken down into two phases. In the enrollment phase, data about the user are collected, compressed and stored, for example by taking a fingerprint or extracting important facial features. In the identification phase, an observation, your fingerprint or face, is compared with previously stored information to provide a positive answer. Since the system handles many users, both storage and the search for the correct user are challenging.

The purpose of this project is to design and implement efficient compression and identification algorithms for the high-dimensional identification system with M users. Polar codes are used as the main tool. First we implement polar codes for efficient source compression and then design the corresponding identification mapping. Source compression can be seen as the channel decoding of polar codes, and in the identification phase the observation can be seen as side information, so we consider using Wyner-Ziv coding with polar codes to reconstruct and identify. In the next step we implement polar codes for two-layer Wyner-Ziv problems. This allows us to store compressed data in separate databases and reconstruct in two stages. With the enrollment mapping and identification mapping implemented, we evaluate the performance of the designed identification systems with measures such as identification error rate and computational complexity.
Acknowledgment
First, I would like to express my sincerest gratitude to my examiner, Tobias Oechtering, Asst. Prof. at the Department of Information Science and Engineering of the Royal Institute of Technology (KTH), who provided me with the opportunity to do this master thesis and supervised it. I would like to thank my supervisor, Minh Thanh Vu, for his supervision, valuable advice and patience throughout this master thesis project. Finally, I would like to thank my family and friends for their constant support and encouragement.
List of Symbols and Abbreviations
Symbol Definition
X random variable
X alphabet
x realization of X
|X | cardinality of the alphabet X
W(y|x)  transition probability of channel W
C(W)  capacity of channel W
I(W)  symmetric capacity of channel W
Z(W)  Bhattacharyya parameter of channel W
W_N^{(i)}  the i-th bit channel
W^N  N independent uses of channel W
W_N  vector channel of size N
log(·)  logarithm to base 2
h_2(p)  binary entropy function, −p log p − (1 − p) log(1 − p)
O(N)  asymptotic complexity of N
M number of users in an identification system
α ∗ β α(1 − β) + β(1 − α)
Ber(p) Bernoulli distribution with expectation p
Abbreviations Definition
B-DMC Binary Discrete Memoryless Channel
BEC Binary Erasure Channel
BSC Binary Symmetric Channel
LR Likelihood Ratio
LLR Log Likelihood Ratio
SC Successive Cancellation decoder
SCL Successive Cancellation List decoder
RV Random Variable
MMI maximum mutual information
ML maximum likelihood
l.c.e. lower convex envelope
iff if and only if
i.i.d. independent and identically distributed
Contents

1 Introduction
1.1 Motivation
1.2 Societal Impact
1.3 Introduction of Identification Systems
1.4 Thesis Outline

2 Polar Codes for Channel Coding
2.1 Polarization Basics
2.1.1 Binary Input Channels
2.1.2 Binary Discrete Memoryless Channel
2.2 Channel Transform
2.2.1 Basic Channel Transform
2.2.2 Recursive Channel Transform
2.2.3 Channel Polarization
2.3 Code Construction
2.4 Polar Codes Achieve Channel Capacity
2.5 Decoding Algorithms
2.5.1 Successive Cancellation Decoder
2.6 Successive Cancellation List Decoding
2.7 Complexity Analysis

3 Polar Codes for Source Coding
3.1 Source Coding Basics
3.2 Successive Cancellation Encoder
3.3 List based SC Encoder
3.4 Simulation Results and Discussion

4 Polar Codes with Side Information
4.1 Wyner-Ziv Problem
4.2 Two-layer Wyner-Ziv Coding
4.2.1 Two-layer Polar Coding
4.2.2 Two-layer Wyner-Ziv Encoding
4.2.3 Two-layer Wyner-Ziv Decoding
4.3 Simulation Results
4.3.1 One-layer Polar Codes for Wyner-Ziv Problem
4.3.2 Two-layer Polar Codes for Wyner-Ziv Problem

5 Identification System
5.1 Model of Identification System
5.2 Polar Codes for Identification Systems
5.2.1 Basic Identification Systems
5.2.2 Wyner-Ziv Scenario Based Identification Systems
5.2.3 Two-layer Identification Systems
5.2.4 Two-layer Identification System with Pre-processing
5.3 Simulation Results and Discussion
5.3.1 One-layer Polar Codes for Identification Systems
5.3.2 Two-layer Polar Codes for Identification Systems
5.4 Complexity Analysis

6 Conclusion and Future Work
6.1 Conclusion
6.2 Future Work

Bibliography
Chapter 1
Introduction
1.1 Motivation
The issue of biometric identification has attracted considerable attention in the last few decades. An introduction to biometric identification systems was given in [1]. Biometric identification systems, which use physical features to identify individuals, ensure greater security than traditional identification strategies.

The most common traditional identification methods are passwords, keys, electric tokens, and cards. Passwords can be forgotten, and keys or cards can be lost or stolen. The physical features of a human, however, are unique to each individual and unlikely to change over time. The most common physical features are the face, fingerprint, voice, iris, hand geometry, etc. A comparison between these five biometrics was given in [2]. Depending on the application and the characteristics of the biometric features, a specific biometric feature can be matched to an application [3]. However, unlike the traditional identification methods, implementing a biometric identification system requires storing the biometric data of the users and reconstructing from the database. In this work, we are interested in finding an efficient compression mechanism and reconstruction method.
Polar codes, recently proposed by Arikan [6], were the first codes proven to achieve the capacity of binary-input discrete memoryless channels (B-DMCs). However, the results obtained at short sequence lengths are not satisfying. It was shown in [7], [8] that, with list based successive cancellation decoding, polar codes achieve better performance at short sequence lengths. In [10], polar codes were also proved to be optimal for lossy source coding. In [16], it was shown that polar codes are also optimal for the Wyner-Ziv scenario. Polar codes for two-layer Wyner-Ziv coding were discussed in [20]. In this project, we use polar codes for source compression and reconstruction in an identification system. We also discuss how list based successive cancellation influences the performance of source coding. In addition, we consider implementing polar codes for two-layer Wyner-Ziv coding, which generates two separate databases.
1.2 Societal Impact
Identification systems play an increasingly critical role in our society. As a result, more accurate and faster identification becomes a crucial task. Biometric identification systems are being adopted widely, by both private organizations and governmental institutions, regardless of political or economic structure, size or geography. It was estimated that the biometrics market will grow from 12.84 billion dollars in 2016 to 29.41 billion dollars by 2022 [4].
1.3 Introduction of Identification Systems
Biometric identification systems, which use physical features to identify indi- viduals, ensure better security than password or numbers. Some of the most common and best-known features are the face, fingerprints, voice, irises, etc.
Generally, a biometric identification system involves two phases. In the first phase, the enrollment phase, the physical features of the observed individuals are quantized and stored in the database. In the identification phase, a noisy version of the biometric data of an unknown individual is observed. The observed data are compared to the enrolled data in the database to decide which user has been observed.

Since an identification system may involve a large number of individuals, it can be difficult to store the original data, and it becomes necessary to compress the data efficiently. Possible solutions are data mining, efficient data compression mechanisms, and storing data in several devices separately. In this thesis, we focus on the second and third aspects. In addition, we also consider implementing corresponding identification mappings.
1.4 Thesis Outline
The report is organized as follows.
• In Chapter 2, we introduce the basics of polar codes, including channel polarization and the channel transform. Successive cancellation decoding for channel coding with polar codes is also discussed.
• In Chapter 3, we introduce polar codes for source coding. Two encoders are applied: the successive cancellation encoder and the list based successive cancellation encoder.
• In Chapter 4, we discuss polar codes for the Wyner-Ziv problem. The two-layer Wyner-Ziv problem is also discussed.
• In Chapter 5, we consider the model of an identification system and implement polar codes for data compression as well as reconstruction.
• In Chapter 6, we briefly discuss the conclusions, challenges and future work on polar codes for identification systems.
Chapter 2
Polar Codes for Channel Coding
In this chapter, we will discuss the basics of polar codes for channel coding.
This is based on the work of Arikan [6].
Polar code construction is based on the following transformation. Given the input $u_1^N$, apply the encoding operation $x_1^N = u_1^N G_N$ and transmit $x_1^N$ through $N$ copies of a B-DMC $W$. The transformation matrix $G_N$ is defined as

    G_N = G_2^{\otimes n} R_N,

where $G_2^{\otimes n}$ is the $n$-th Kronecker power of $G_2$ and $R_N$ is the bit-reversal permutation matrix. The matrix $R_N$ can be interpreted as the bit-reversal operator: if $v_1^N = u_1^N R_N$, then $v_{b_1, \dots, b_n} = u_{b_n, \dots, b_1}$. The $n$-th Kronecker power of $G_2$ is defined as

    G_2^{\otimes n} = G_2 \otimes G_2^{\otimes (n-1)} =
    \begin{pmatrix} G_2^{\otimes (n-1)} & 0 \\ G_2^{\otimes (n-1)} & G_2^{\otimes (n-1)} \end{pmatrix}.   (2.1)

Here, the base case is

    G_2 = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}.
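The construction above can be sketched in a few lines of Python; this is a minimal sketch with our own helper names (`kron`, `bit_reverse`, `polar_generator`), not code from the thesis. It builds $G_2^{\otimes n}$ recursively and applies the bit-reversal permutation, reproducing the matrix $G_4$ given in (2.17):

```python
# A minimal sketch of the transformation matrix G_N, using plain Python lists.
# Helper names (kron, bit_reverse, polar_generator) are our own.

def kron(a, b):
    """Kronecker product of two 0/1 matrices."""
    return [[x * y for x in arow for y in brow]
            for arow in a for brow in b]

def bit_reverse(i, n):
    """Reverse the n-bit binary representation of i."""
    r = 0
    for _ in range(n):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def polar_generator(n):
    """G_N for N = 2^n: the n-th Kronecker power of G_2 with the
    bit-reversal permutation applied (row i taken from row rev(i))."""
    g2 = [[1, 0], [1, 1]]
    g = [[1]]
    for _ in range(n):
        g = kron(g2, g)
    return [g[bit_reverse(i, n)] for i in range(2 ** n)]

# n = 2 gives the 4x4 matrix G_4 of equation (2.17):
print(polar_generator(2))
```

Since $R_N$ only permutes rows by bit reversal, applying it as a row lookup avoids building the permutation matrix explicitly.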
Next, applying the chain rule to the mutual information between the input $U_1^N$ and the output $Y_1^N$ gives

    I(U_1^N; Y_1^N) = \sum_{i=1}^{N} I(U_i; Y_1^N | U_1^{i-1}) = \sum_{i=1}^{N} I(U_i; Y_1^N, U_1^{i-1}).
The essential observation behind polar codes is that, as the block size $N$ increases, the terms in the summation approach either 0 or 1. This phenomenon is referred to as channel polarization.
In the following sections, we will give more details about polar codes.
2.1 Polarization Basics
2.1.1 Binary Input Channels
Assume X is the field of size two and Y is an arbitrary set. X and Y are the input and output alphabets of a channel W . Denote the channel as W : X → Y.
Then the probability of observing $Y = y \in \mathcal{Y}$ when the input is $X = x \in \mathcal{X}$ is given by

    \Pr\{Y = y | X = 0\} = W(y|0) \quad \text{and} \quad \Pr\{Y = y | X = 1\} = W(y|1).   (2.2)
2.1.2 Binary Discrete Memoryless Channel
Among the binary input channels, the binary discrete memoryless channel (B-DMC) is an important class in information theory. We write $W^N$ to denote the channel corresponding to $N$ independent uses of channel $W$; therefore, $W^N : \mathcal{X}^N \to \mathcal{Y}^N$ with

    W^N(y_1^N | x_1^N) = \prod_{i=1}^{N} W(y_i | x_i).
Given a B-DMC W , we can measure the rate and reliability of W by deter- mining two parameters, the symmetric capacity and the Bhattacharyya param- eter.
Definition 1. The symmetric capacity is defined as

    I(W) \triangleq \sum_{y \in \mathcal{Y}} \sum_{x \in \mathcal{X}} \frac{1}{2} W(y|x) \log \frac{W(y|x)}{\frac{1}{2} W(y|0) + \frac{1}{2} W(y|1)}.   (2.3)

The symmetric capacity $I(W)$ equals the Shannon capacity when $W$ is a symmetric channel. A channel is symmetric if there exists a permutation $\pi$ such that for each output symbol $y$, $W(y|1) = W(\pi(y)|0)$. Two examples of symmetric channels are the binary symmetric channel (BSC) and the binary erasure channel (BEC).
A BSC with crossover probability $p_e$ is a B-DMC $W$ with output alphabet $\mathcal{Y} = \{0, 1\}$, $W(0|0) = W(1|1) = 1 - p_e$ and $W(1|0) = W(0|1) = p_e$. The Shannon capacity of a BSC($p_e$) is

    C(\text{BSC}(p_e)) = 1 - h_2(p_e),

where $h_2$ is the binary entropy function, $h_2(p_e) = -p_e \log(p_e) - (1 - p_e) \log(1 - p_e)$.

A BEC with erasure probability $p_e$ is a B-DMC with $W(0|0) = W(1|1) = 1 - p_e$ and $W(e|0) = W(e|1) = p_e$, where $e$ is the erasure symbol. The Shannon capacity of a BEC($p_e$) is

    C(\text{BEC}(p_e)) = 1 - p_e.
Another important parameter is the Bhattacharyya parameter, which is defined as follows.

Definition 2. The Bhattacharyya parameter is defined as

    Z(W) \triangleq \sum_{y \in \mathcal{Y}} \sqrt{W(y|0) W(y|1)}.   (2.4)
In the code construction of polar codes, we focus on the Bhattacharyya parameters. The Bhattacharyya parameter is an upper bound on the maximum likelihood (ML) decision error probability [6]. More details about the properties of the Bhattacharyya parameter and its relationship with the block error probability will be discussed in later sections.
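As a quick sanity check of Definitions 1 and 2, the following Python sketch (helper names are ours, not the thesis code) evaluates $I(W)$ and $Z(W)$ directly from the transition probabilities of a BSC and a BEC and compares them with the closed forms quoted in this chapter:

```python
import math

# Sketch: I(W) per (2.3) and Z(W) per (2.4), evaluated numerically.
# A channel is a dict mapping (y, x) -> W(y|x); names are our own.

def symmetric_capacity(channel):
    """I(W) per (2.3), uniform input."""
    ys = {y for (y, _) in channel}
    total = 0.0
    for y in ys:
        q = 0.5 * channel[(y, 0)] + 0.5 * channel[(y, 1)]
        for x in (0, 1):
            w = channel[(y, x)]
            if w > 0:
                total += 0.5 * w * math.log2(w / q)
    return total

def bhattacharyya(channel):
    """Z(W) per (2.4)."""
    ys = {y for (y, _) in channel}
    return sum(math.sqrt(channel[(y, 0)] * channel[(y, 1)]) for y in ys)

p = 0.11
bsc = {('0', 0): 1 - p, ('1', 0): p, ('0', 1): p, ('1', 1): 1 - p}
e = 0.3
bec = {('0', 0): 1 - e, ('e', 0): e, ('1', 0): 0.0,
       ('1', 1): 1 - e, ('e', 1): e, ('0', 1): 0.0}

h2 = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
print(abs(symmetric_capacity(bsc) - (1 - h2)))                # ~0: I(BSC(p)) = 1 - h2(p)
print(abs(bhattacharyya(bsc) - 2 * math.sqrt(p * (1 - p))))   # ~0: Z(BSC(p)) = 2 sqrt(p(1-p))
print(abs(symmetric_capacity(bec) - (1 - e)))                 # ~0: I(BEC(e)) = 1 - e
print(abs(bhattacharyya(bec) - e))                            # ~0: Z(BEC(e)) = e
```

For both symmetric channels the numerically evaluated $I(W)$ matches the Shannon capacity, as claimed above.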
2.2 Channel Transform
In this section, we discuss the channel transform and the transformations of $I(W)$ and $Z(W)$. First we introduce the basic (single-level) transform, then we extend it to the recursive transform.
2.2.1 Basic Channel Transform
Let X = {0, 1}, W : X → Y be a B-DMC, and U12 a random vector that is uniformly distributed over X2. Consider the following channel combining of two channels as depicted in Figure 2.1.
Figure 2.1: The basic channel transform
Denote the input of the channel by $x_1^2 = u_1^2 G_2$ and let $y_1^2$ be the corresponding outputs. We have the transition probabilities

    W_2(y_1^2 | u_1^2) \triangleq \prod_{i=1}^{2} W(y_i | x_i),   (2.5)

where

    G_2 = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}.   (2.6)
The channel combining here describes how two individual channels $W$ are transformed into a new channel $W_2 : \mathcal{X}^2 \to \mathcal{Y}^2$. Since the transform between $U_1^2$ and $X_1^2$ is linear and bijective, the mutual information between the input $U_1^2$ and the output $Y_1^2$ satisfies

    I(U_1^2; Y_1^2) = I(X_1^2; Y_1^2) = 2 I(W).   (2.7)

Now we split this mutual information by applying the chain rule:

    I(U_1^2; Y_1^2) = I(U_1; Y_1^2) + I(U_2; Y_1^2 | U_1) = I(U_1; Y_1^2) + I(U_2; Y_1^2, U_1).   (2.8)

The term $I(U_1; Y_1^2)$ can be interpreted as the mutual information between the input $U_1$ and the output $Y_1^2$, with the input $U_2$ treated as noise. We denote this "channel" by $W_2^{(1)}$.

Similarly, the term $I(U_2; Y_1^2, U_1)$ can be seen as the mutual information between the input $U_2$ and the output $Y_1^2$ with $U_1$ known. We denote this "channel" by $W_2^{(2)}$.
Based on this, we can write $(W, W) \mapsto (W_2^{(1)}, W_2^{(2)})$ for any given B-DMC $W$, with

    W_2^{(1)}(y_1^2 | u_1) \triangleq \sum_{u_2} \frac{1}{2} W_2(y_1^2 | u_1^2) = \sum_{u_2} \frac{1}{2} W(y_1 | u_1 \oplus u_2) W(y_2 | u_2),   (2.9)

    W_2^{(2)}(y_1^2, u_1 | u_2) \triangleq \frac{1}{2} W_2(y_1^2 | u_1^2) = \frac{1}{2} W(y_1 | u_1 \oplus u_2) W(y_2 | u_2).   (2.10)

For simplicity, we define the following notation for these channel transformations. Given any B-DMC $W : \mathcal{X} \to \mathcal{Y}$,

    (W * W)(y_1, y_2 | u_1) \stackrel{\text{def}}{=} \frac{1}{2} \sum_{u_2 \in \mathcal{X}} W(y_1 | u_1 \oplus u_2) W(y_2 | u_2),   (2.11)

    (W \circledast W)(y_1, y_2, u_1 | u_2) \stackrel{\text{def}}{=} \frac{1}{2} W(y_1 | u_1 \oplus u_2) W(y_2 | u_2).   (2.12)

For any B-DMC $W$, the transformation $(W, W) \mapsto (W_2^{(1)}, W_2^{(2)})$ is rate-preserving and moves the symmetric capacity away from the center, in the sense that

    I(W_2^{(1)}) + I(W_2^{(2)}) = 2 I(W),   (2.13)
    I(W_2^{(1)}) \le I(W) \le I(W_2^{(2)}),   (2.14)

and the Bhattacharyya parameters of this transformation satisfy

    Z(W_2^{(1)}) \le 2 Z(W) - Z(W)^2,   (2.15)
    Z(W_2^{(2)}) = Z(W)^2.   (2.16)

Equality holds in (2.15) only when $W$ is a BEC [6].
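The relations (2.13)-(2.16) can be verified numerically for a small example. The following Python sketch (function names are ours) applies (2.9) and (2.10) literally to a BSC and evaluates the resulting symmetric capacities and Bhattacharyya parameters:

```python
import math

# Numerical check of (2.13)-(2.16) for a BSC; a sketch with our own helper
# names. A channel is a dict mapping (y, x) -> W(y|x).

def capacity(ch):
    """Symmetric capacity I(.) evaluated from transition probabilities."""
    ys = {y for (y, _) in ch}
    total = 0.0
    for y in ys:
        q = 0.5 * ch.get((y, 0), 0.0) + 0.5 * ch.get((y, 1), 0.0)
        for x in (0, 1):
            w = ch.get((y, x), 0.0)
            if w > 0:
                total += 0.5 * w * math.log2(w / q)
    return total

def bhatta(ch):
    """Bhattacharyya parameter Z(.)."""
    ys = {y for (y, _) in ch}
    return sum(math.sqrt(ch.get((y, 0), 0.0) * ch.get((y, 1), 0.0)) for y in ys)

p = 0.2
W = {(0, 0): 1 - p, (1, 0): p, (0, 1): p, (1, 1): 1 - p}

# W2^(1): output (y1, y2), per (2.9); W2^(2): output (y1, y2, u1), per (2.10)
W_minus = {((y1, y2), u1): 0.5 * sum(W[(y1, u1 ^ u2)] * W[(y2, u2)] for u2 in (0, 1))
           for y1 in (0, 1) for y2 in (0, 1) for u1 in (0, 1)}
W_plus = {((y1, y2, u1), u2): 0.5 * W[(y1, u1 ^ u2)] * W[(y2, u2)]
          for y1 in (0, 1) for y2 in (0, 1) for u1 in (0, 1) for u2 in (0, 1)}

I, Im, Ip = capacity(W), capacity(W_minus), capacity(W_plus)
Z, Zm, Zp = bhatta(W), bhatta(W_minus), bhatta(W_plus)
print(abs(Im + Ip - 2 * I) < 1e-9)   # (2.13)
print(Im <= I <= Ip)                 # (2.14)
print(Zm <= 2 * Z - Z * Z)           # (2.15), strict inequality for a BSC
print(abs(Zp - Z * Z) < 1e-9)        # (2.16)
```

The rate sum is preserved exactly, while the two new channels split symmetrically around $I(W)$, which is the starting point of the polarization argument.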
2.2.2 Recursive Channel Transform
In this section, we explain how channel combining works at higher levels. When the block size is large enough, the channels polarize into either almost noiseless or completely noisy channels.
The next level, second level (n = 2) of channel transform is illustrated in Figure 2.2.
Figure 2.2: Second level channel transform

The mapping $u_1^4 \mapsto x_1^4$ from the input of $W_4$ to the four copies of $W$ can be written as $x_1^4 = u_1^4 G_4$, where

    G_4 = R_4 G_2^{\otimes 2} =
    \begin{pmatrix}
    1 & 0 & 0 & 0 \\
    1 & 0 & 1 & 0 \\
    1 & 1 & 0 & 0 \\
    1 & 1 & 1 & 1
    \end{pmatrix}.   (2.17)
Then the transition probabilities transform as $W_4(y_1^4 | u_1^4) = W^4(y_1^4 | u_1^4 G_4)$. This operation can be generalized to higher levels in a recursive manner. Define the channel $W_N : \mathcal{X}^N \to \mathcal{Y}^N$ and perform the channel combining

    W_N(y_1^N | u_1^N) = W_{N/2}(y_1^{N/2} | u_{1,o}^N \oplus u_{1,e}^N) \, W_{N/2}(y_{N/2+1}^N | u_{1,e}^N),   (2.18)

where $u_{1,o}^N = (u_1, u_3, \dots, u_{N-1})$ and $u_{1,e}^N = (u_2, u_4, \dots, u_N)$. For the channel splitting, apply the chain rule as before:

    I(U_1^N; Y_1^N) = \sum_{i=1}^{N} I(U_i; Y_1^N, U_1^{i-1}).

The term $I(U_i; Y_1^N, U_1^{i-1})$ can be seen as the mutual information between $U_i$ and $(Y_1^N, U_1^{i-1})$. Denote this channel by $W_N^{(i)}$, with transition probability $W_N^{(i)}(y_1^N, u_1^{i-1} | u_i) \triangleq P(y_1^N, u_1^{i-1} | u_i)$.
For any $n \ge 0$, $N = 2^n$, and $1 \le i \le N$, we have

    W_{2N}^{(2i-1)}(y_1^{2N}, u_1^{2i-2} | u_{2i-1})
    = \sum_{u_{2i}} \frac{1}{2} W_N^{(i)}(y_1^N, u_{1,o}^{2i-2} \oplus u_{1,e}^{2i-2} | u_{2i-1} \oplus u_{2i}) \, W_N^{(i)}(y_{N+1}^{2N}, u_{1,e}^{2i-2} | u_{2i}),   (2.19)

    W_{2N}^{(2i)}(y_1^{2N}, u_1^{2i-1} | u_{2i})
    = \frac{1}{2} W_N^{(i)}(y_1^N, u_{1,o}^{2i-2} \oplus u_{1,e}^{2i-2} | u_{2i-1} \oplus u_{2i}) \, W_N^{(i)}(y_{N+1}^{2N}, u_{1,e}^{2i-2} | u_{2i}).   (2.20)

The above channels can be written compactly as

    W_N^{(2i-1)} = W_{N/2}^{(i)} * W_{N/2}^{(i)}, \qquad W_N^{(2i)} = W_{N/2}^{(i)} \circledast W_{N/2}^{(i)}.

For $I(W_N^{(i)})$ and $Z(W_N^{(i)})$ at higher levels, we have

    I(W_N^{(2i-1)}) \le I(W_{N/2}^{(i)}) \le I(W_N^{(2i)}), \qquad I(W_N^{(2i-1)}) + I(W_N^{(2i)}) = 2 I(W_{N/2}^{(i)}),

and

    Z(W_N^{(2i-1)}) \le 2 Z(W_{N/2}^{(i)}) - Z(W_{N/2}^{(i)})^2, \qquad Z(W_N^{(2i)}) = Z(W_{N/2}^{(i)})^2.

Equality holds when the channel $W$ is a BEC [6]. For this special case where $W$ is a BEC with erasure probability $e$, the Bhattacharyya parameters can be calculated recursively as

    Z(W_N^{(2i-1)}) = 2 Z(W_{N/2}^{(i)}) - Z(W_{N/2}^{(i)})^2,   (2.21)
    Z(W_N^{(2i)}) = Z(W_{N/2}^{(i)})^2,   (2.22)

with initialization $Z(W_1^{(1)}) = e$.
Let $P_B$ denote the block error probability. Then $P_B$ can be upper bounded as given in [6]:

    P_B \le \sum_{i \in F^c} Z(W_N^{(i)}).   (2.23)

There is also an important property of the Bhattacharyya parameter for degraded channels. Consider two B-DMCs $W$ and $W'$ and suppose $W \preceq W'$ ($W$ is degraded with respect to $W'$). Then $W_N^{(i)} \preceq W_N'^{(i)}$ and $Z(W_N^{(i)}) \ge Z(W_N'^{(i)})$ [14, Lemma 4.7].
2.2.3 Channel Polarization
In the previous section, we discussed the channel transformation from $N$ copies of a channel $W$ to the polarized "channels" $\{W_N^{(i)}\}_{i=1}^N$. Figure 2.3 illustrates the result of polarization for the case where $W$ is a BEC with erasure probability $p_e = 0.5$. The bit-channel capacities are calculated using the recursion

    I(W_N^{(2i)}) = 2 I(W_{N/2}^{(i)}) - I(W_{N/2}^{(i)})^2,   (2.24)
    I(W_N^{(2i-1)}) = I(W_{N/2}^{(i)})^2,   (2.25)

with initialization $I(W_1^{(1)}) = 1 - p_e$. This recursion follows from equations (2.21), (2.22) and the fact that $I(W_N^{(i)}) = 1 - Z(W_N^{(i)})$ for a BEC $W$.

Figure 2.3: $I(W_N^{(i)})$ for $i = 1, \dots, 512$ for a BEC(0.5)

Note that this recursion is valid only for BECs; an exact calculation for general B-DMCs is not known. Figure 2.3 shows that $I(W_N^{(i)})$ tends to approach 0 for smaller indices and 1 for larger indices. It was proved in [6] that if the block length $N$ is sufficiently large, each $I(W_N^{(i)})$ approaches either 0 or 1. This is implied by the following theorem [6].
Theorem 2.1. For any B-DMC $W$, the channels $\{W_N^{(i)}\}$ polarize in the sense that, for any fixed $\delta \in (0, 1)$, as $N$ goes to infinity through powers of two, the fraction of indices $i \in \{1, \dots, N\}$ for which $I(W_N^{(i)}) \in (1 - \delta, 1]$ goes to $I(W)$ and the fraction for which $I(W_N^{(i)}) \in [0, \delta)$ goes to $1 - I(W)$.
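Theorem 2.1 can be observed numerically for a BEC, where the exact capacity recursion is available. The sketch below (the helper name is ours) tracks the fraction of nearly perfect and nearly useless bit channels as $N$ grows:

```python
# Sketch: polarization of W = BEC(0.5) via the exact BEC recursion
# I -> (I^2, 2I - I^2). The helper name is our own.

def bec_bit_channel_capacities(e, n):
    """I(W_N^{(i)}) for N = 2^n bit channels of a BEC(e)."""
    caps = [1.0 - e]
    for _ in range(n):
        nxt = []
        for c in caps:
            nxt.append(c * c)          # minus transform, channel 2i-1
            nxt.append(2 * c - c * c)  # plus transform, channel 2i
        caps = nxt
    return caps

for n in (4, 10):
    caps = bec_bit_channel_capacities(0.5, n)
    frac_good = sum(c > 0.99 for c in caps) / len(caps)
    frac_bad = sum(c < 0.01 for c in caps) / len(caps)
    print(n, round(frac_good, 3), round(frac_bad, 3))
```

The fractions of near-perfect and near-useless channels both grow toward $I(W) = 0.5$ and $1 - I(W) = 0.5$ as $N = 2^n$ increases, while the average bit-channel capacity stays at $I(W)$, consistent with the rate-preservation in (2.13).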
2.3 Code Construction
We use the polarization effect for code construction. The idea of polar coding is to send data only through the channels for which $Z(W_N^{(i)})$ approaches 0. Figure 2.4 illustrates the code construction for polar codes with block length $N = 8$, $K = 4$, assuming the channel is $W = \text{BEC}(1/2)$. According to the discussion in Section 2.2.2, the Bhattacharyya parameters for a BEC can be calculated directly with equations (2.21) and (2.22). We rank the Bhattacharyya parameters and select the channels with the smallest $Z(W_N^{(i)})$ as information bits, which are $U_4$, $U_6$, $U_7$ and $U_8$.
Therefore the code construction problem for polar codes can be seen as finding the Bhattacharyya parameters $Z(W_N^{(i)})$. To construct an $(N, K)$ polar code, we first calculate $Z(W_N^{(i)})$ for each $i \in \{1, \dots, N\}$ and divide the indices into two parts, the free set $F^c$ and the frozen set $F$. We use the indices belonging to $F^c$ to transmit information and fix the indices belonging to $F$ to some known values, usually 0.

Figure 2.4: Code construction for polar codes with $N = 8$, $K = 4$, $W = \text{BEC}(1/2)$. The computed parameters are:

    i    Z(W_8^{(i)})    Rank    Assignment
    1    0.9961          8       frozen
    2    0.8789          7       frozen
    3    0.8086          6       frozen
    4    0.3164          4       data
    5    0.6836          5       frozen
    6    0.1914          3       data
    7    0.1211          2       data
    8    0.0039          1       data
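For the BEC example of Figure 2.4, the whole construction fits in a few lines. The following Python sketch (function names are ours) evaluates (2.21)-(2.22) and selects the $K$ indices with the smallest Bhattacharyya parameters:

```python
# Sketch of BEC code construction: Z(W_N^{(i)}) via (2.21)-(2.22), then
# keep the K most reliable indices. Function names are our own.

def bec_bhattacharyya(e, n):
    """Z(W_N^{(i)}) for N = 2^n bit channels of a BEC(e)."""
    z = [e]
    for _ in range(n):
        nxt = []
        for v in z:
            nxt.append(2 * v - v * v)  # (2.21), channel 2i-1
            nxt.append(v * v)          # (2.22), channel 2i
        z = nxt
    return z

def construct(e, n, K):
    """Return the Z values and the 1-indexed information set F^c."""
    z = bec_bhattacharyya(e, n)
    order = sorted(range(len(z)), key=lambda i: z[i])  # most reliable first
    free = sorted(i + 1 for i in order[:K])
    return z, free

z, free = construct(0.5, 3, 4)
print([round(v, 4) for v in z])  # [0.9961, 0.8789, 0.8086, 0.3164, 0.6836, 0.1914, 0.1211, 0.0039]
print(free)                      # [4, 6, 7, 8], matching Figure 2.4
```

The remaining indices {1, 2, 3, 5} form the frozen set $F$ and are fixed to 0.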
For BECs, the transformations of $Z(W_N^{(i)})$ and $I(W_N^{(i)})$ are known exactly, so the code construction can be described precisely. For other channels, the code construction problem is more complex. Arikan proposed a Monte-Carlo method for estimating the Bhattacharyya parameters [6]. First, generate samples of $(U_1^N, Y_1^N)$ with the given distribution. Then find the empirical means $\{\hat{Z}(W_N^{(i)})\}$, where each sample contributes a realization of the random variable

    \sqrt{\frac{W_N^{(i)}(Y_1^N, U_1^{i-1} | U_i \oplus 1)}{W_N^{(i)}(Y_1^N, U_1^{i-1} | U_i)}}.   (2.26)

A successive cancellation (SC) decoder can be used for this computation because the RV is the square root of the decision statistic. The details of SC decoding will be introduced in the next section.
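The estimator (2.26) can be illustrated in the simplest case $N = 1$, where no SC decoder is needed. In the following Python sketch (our own construction, not the thesis code), the empirical mean of the square-root likelihood ratio for a BSC($p$) converges to $Z(W) = 2\sqrt{p(1-p)}$:

```python
import math
import random

# Sketch: Monte-Carlo estimate of Z(W) for a BSC(p) at block length N = 1.
# Each sample contributes sqrt(W(y | x XOR 1) / W(y | x)) as in (2.26).

random.seed(1)
p, runs = 0.11, 200_000
acc = 0.0
for _ in range(runs):
    x = random.getrandbits(1)
    y = x ^ (random.random() < p)     # BSC: flip x with probability p
    w_true = 1 - p if y == x else p   # W(y | x)
    w_flip = p if y == x else 1 - p   # W(y | x XOR 1)
    acc += math.sqrt(w_flip / w_true)
z_hat = acc / runs

# The empirical mean approaches the closed form Z(BSC(p)) = 2 sqrt(p(1-p)).
print(abs(z_hat - 2 * math.sqrt(p * (1 - p))) < 0.02)
```

For general $N$, the same averaging is done per bit channel, with the ratio in (2.26) supplied by an SC decoder run on each sample.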
Algorithm 1 describes how to estimate $Z(W_N^{(i)})$ for a BSC with a pre-defined crossover probability $p$.

Algorithm 1 The Monte-Carlo estimation
Input: sequence length $N$, crossover probability $p$, number of Monte-Carlo iterations $Runs$
Output: the estimated $Z(W_N^{(i)})$ for $i = 1, 2, \dots, N$
 1: $Z \leftarrow \text{zeros}(N, 1)$
 2: for $r = 1 : Runs$ do
 3:   draw a uniformly random binary vector $x_1^N$
 4:   $y_1^N \leftarrow \text{BSC}(x_1^N, p)$
 5:   for $i = 1 : N$ do
 6:     $l \leftarrow \text{LLR}_N^{(i)}(y_1^N, x_1^{i-1} | x_i)$
 7:     if $l \ge 0$ then
 8:       $Z(W_N^{(i)}) \leftarrow \big( Z(W_N^{(i)}) \cdot (r - 1) + e^{-l/2} \big) / r$
 9:     else
10:       $Z(W_N^{(i)}) \leftarrow \big( Z(W_N^{(i)}) \cdot (r - 1) + e^{l/2} \big) / r$
11: return $Z$
2.4 Polar Codes Achieve Channel Capacity
In the previous sections, we have seen how the channels polarize and how to take advantage of polarization for code construction. According to Theorem 2.1, the fraction of "clean" channels approaches $I(W)$; therefore the achievable rate is close to $I(W)$.

Recall that in the channel splitting we defined the bit channel $W_N^{(i)}$ with respect to the mutual information term $I(U_i; Y_1^N, U_1^{i-1})$. To realize such a channel, the decoder should have access to $U_1^{i-1}$ and the output $Y_1^N$. Therefore, consider using the SC decoder, which decodes in the order $U_1, \dots, U_N$. In this way, the decoder has an estimate of $U_1^{i-1}$ when decoding $U_i$. Based on this idea, Arikan proposed SC decoding based on computing the likelihood ratio (LR) [6],

    L_N^{(i)}(y_1^N, \hat{u}_1^{i-1}) \triangleq \frac{W_N^{(i)}(y_1^N, \hat{u}_1^{i-1} | 0)}{W_N^{(i)}(y_1^N, \hat{u}_1^{i-1} | 1)},

and generates decisions as follows:

(a) If $i \in F$, then set $\hat{u}_i = u_i$.
(b) If $i \in F^c$, then calculate $L_N^{(i)}$ and set

    \hat{u}_i = \begin{cases} 0 & L_N^{(i)}(y_1^N, \hat{u}_1^{i-1}) \ge 1 \\ 1 & \text{otherwise.} \end{cases}
As stated in equation (2.23), the block error probability $P_B$ can be upper bounded as

    P_B \le \sum_{i \in F^c} Z(W_N^{(i)}).

For the block error probability $P_B$ to become sufficiently small or vanish, the Bhattacharyya parameters $Z(W_N^{(i)})$ for $i \in F^c$ should approach 0. In [17], Arikan and Telatar obtained the following result, which gives the rate at which $Z(W_N^{(i)})$ approaches 0.

Theorem 2.2. Given a B-DMC $W$ and any $\beta < 1/2$,

    \lim_{n \to \infty} \Pr\big( Z(W_N^{(i)}) \le 2^{-N^\beta} \text{ for } i \in \{1, \dots, N\} \big) = I(W).   (2.27)

In [6], Arikan proved that polar codes achieve the symmetric capacity.

Theorem 2.3. Given a B-DMC $W$ and a fixed rate $R < I(W)$, for any $\beta < 1/2$ there exists a sequence of polar codes of rate $R_N < R$ such that the block error probability satisfies

    P_N = O(2^{-N^\beta}).

This can be proved with the following code construction [14]. For any $0 < \beta < 1/2$ and $\epsilon > 0$, choose the frozen set $F$ as

    F = \Big\{ i : Z(W_N^{(i)}) > \frac{1}{N} 2^{-N^\beta} \Big\}.

Theorem 2.2 implies that, for sufficiently large block length $N$,

    \frac{|F^c|}{N} \ge I(W) - \epsilon.

The block error probability of this scheme with SC decoding is

    P_B(F) \le \sum_{i \in F^c} Z(W_N^{(i)}) \le 2^{-N^\beta}.

This proves Theorem 2.3.
2.5 Decoding Algorithms
2.5.1 Successive Cancellation Decoder
We have discussed SC decoding in the previous section; in this section, we study its details. Recall that SC decoding is realized by calculating the likelihood ratios (LRs). The LRs can be calculated using the recursive structure in equation (2.18), which gives

    L_N^{(2i-1)}(y_1^N, \hat{u}_1^{2i-2}) =
    \frac{L_{N/2}^{(i)}(y_1^{N/2}, \hat{u}_{1,o}^{2i-2} \oplus \hat{u}_{1,e}^{2i-2}) \, L_{N/2}^{(i)}(y_{N/2+1}^N, \hat{u}_{1,e}^{2i-2}) + 1}
         {L_{N/2}^{(i)}(y_1^{N/2}, \hat{u}_{1,o}^{2i-2} \oplus \hat{u}_{1,e}^{2i-2}) + L_{N/2}^{(i)}(y_{N/2+1}^N, \hat{u}_{1,e}^{2i-2})},

and

    L_N^{(2i)}(y_1^N, \hat{u}_1^{2i-1}) =
    \Big[ L_{N/2}^{(i)}(y_1^{N/2}, \hat{u}_{1,o}^{2i-2} \oplus \hat{u}_{1,e}^{2i-2}) \Big]^{1 - 2\hat{u}_{2i-1}}
    \cdot L_{N/2}^{(i)}(y_{N/2+1}^N, \hat{u}_{1,e}^{2i-2}).

This calculation can be recursively reduced to block length 1 with the initialization $L_1^{(1)}(y_i) = W(y_i|0)/W(y_i|1)$, which can be computed directly from the output sequence $y$ and the channel parameter.

To avoid a large number of multiplications, the successive cancellation can be performed in the logarithm domain, where the decision rule becomes:

(a) If $i \in F$, then set $\hat{u}_i = u_i$.
(b) If $i \in F^c$, compute the log-likelihood ratio and set

    \hat{u}_i = \begin{cases} 0 & \ln \dfrac{W_N^{(i)}(y_1^N, \hat{u}_1^{i-1} | 0)}{W_N^{(i)}(y_1^N, \hat{u}_1^{i-1} | 1)} \ge 0 \\ 1 & \text{otherwise.} \end{cases}

For simplicity, denote $\text{LLR}_{N/2}^{(i)}(y_1^{N/2}, \hat{u}_{1,o}^{2i-2} \oplus \hat{u}_{1,e}^{2i-2})$ by $\text{LLR}_1$ and $\text{LLR}_{N/2}^{(i)}(y_{N/2+1}^N, \hat{u}_{1,e}^{2i-2})$ by $\text{LLR}_2$. Then in the log domain, the recursive calculation becomes

    \text{LLR}_N^{(2i-1)}(y_1^N, \hat{u}_1^{2i-2}) = 2 \tanh^{-1}\Big( \tanh\frac{\text{LLR}_1}{2} \tanh\frac{\text{LLR}_2}{2} \Big),   (2.28)

    \text{LLR}_N^{(2i)}(y_1^N, \hat{u}_1^{2i-1}) = (-1)^{\hat{u}_{2i-1}} \text{LLR}_1 + \text{LLR}_2
    = \begin{cases} \text{LLR}_2 + \text{LLR}_1 & \hat{u}_{2i-1} = 0 \\ \text{LLR}_2 - \text{LLR}_1 & \hat{u}_{2i-1} = 1. \end{cases}   (2.29)

Using a proper approximation [27], equation (2.28) can be approximated as

    \text{LLR}_N^{(2i-1)}(y_1^N, \hat{u}_1^{2i-2}) \approx \text{sgn}(\text{LLR}_1) \, \text{sgn}(\text{LLR}_2) \, \min(|\text{LLR}_1|, |\text{LLR}_2|).

The recursive form in the logarithm domain is much simpler, consisting only of additions and sign operations. Algorithm 2 describes a successive cancellation decoder.
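The exact update (2.28), its min-sum approximation, and the update (2.29) can be written as small functions. In this Python sketch, the names `f_exact`, `f_minsum` and `g` are ours:

```python
import math

# Sketch of the log-domain SC update rules; function names are our own.

def f_exact(a, b):
    """Exact check-node update, equation (2.28)."""
    return 2 * math.atanh(math.tanh(a / 2) * math.tanh(b / 2))

def f_minsum(a, b):
    """Min-sum approximation: sgn(a) sgn(b) min(|a|, |b|)."""
    s = (1 if a >= 0 else -1) * (1 if b >= 0 else -1)
    return s * min(abs(a), abs(b))

def g(a, b, u):
    """Variable-node update, equation (2.29): (-1)^u * LLR1 + LLR2."""
    return b + a if u == 0 else b - a

for a, b in [(1.5, -2.0), (0.3, 0.4), (-3.0, -5.0)]:
    print(round(f_exact(a, b), 3), round(f_minsum(a, b), 3))
```

The min-sum value always has the same sign as the exact value and bounds its magnitude from above by $\min(|\text{LLR}_1|, |\text{LLR}_2|)$, which is why the approximation preserves the hard decisions well at moderate LLR magnitudes.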
Algorithm 2 Successive cancellation decoder
Input: received vector $y$ of length $N$, frozen set $F$
Output: decoded sequence $\hat{u}$ and re-encoded codeword $x$
 1: $\hat{u} \leftarrow \text{zeros}(N, 1)$
 2: for $i = 1 : N$ do
 3:   if $i \in F$ then
 4:     $\hat{u}_i \leftarrow 0$
 5:   else
 6:     $l \leftarrow \text{LLR}_N^{(i)}(y_1^N, \hat{u}_1^{i-1})$
 7:     if $l \ge 0$ then
 8:       $\hat{u}_i \leftarrow 0$
 9:     else
10:       $\hat{u}_i \leftarrow 1$
11: $x \leftarrow \hat{u} G_N$
12: return $x$
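A compact recursive implementation of SC decoding can tie the pieces of this chapter together. The sketch below (all names are ours) uses the natural, non-bit-reversed ordering $x = u F^{\otimes n}$; for $W = \text{BEC}(1/2)$ and $N = 8$ this ordering happens to yield the same information set $\{4, 6, 7, 8\}$ as Figure 2.4. The channel is noiseless here, purely to check the encode/decode round trip:

```python
import math

# End-to-end sketch of SC decoding for an (8, 4) polar code; all names are
# our own. Encoding uses the natural (non-bit-reversed) order x = u F^{(x)n}.

def polar_encode(u):
    """x = u * F^{(tensor) n} over GF(2)."""
    if len(u) == 1:
        return u[:]
    half = len(u) // 2
    a = [u[i] ^ u[i + half] for i in range(half)]
    return polar_encode(a) + polar_encode(u[half:])

def f(a, b):
    """Exact check-node LLR update, equation (2.28)."""
    return 2 * math.atanh(math.tanh(a / 2) * math.tanh(b / 2))

def sc_decode(llr, frozen):
    """Return (u_hat, x_hat) for the sub-code seen through `llr`."""
    if len(llr) == 1:
        bit = 0 if frozen[0] else int(llr[0] < 0)
        return [bit], [bit]
    half = len(llr) // 2
    # Upper branch: LLRs of x_left XOR x_right, via (2.28).
    u1, x1 = sc_decode([f(llr[i], llr[i + half]) for i in range(half)],
                       frozen[:half])
    # Lower branch: (2.29), using the re-encoded partial sums x1.
    u2, x2 = sc_decode([llr[i + half] + (llr[i] if x1[i] == 0 else -llr[i])
                        for i in range(half)],
                       frozen[half:])
    return u1 + u2, [x1[i] ^ x2[i] for i in range(half)] + x2

frozen = [True, True, True, False, True, False, False, False]  # info at 4,6,7,8
u = [0, 0, 0, 1, 0, 1, 1, 0]   # message bits 1,1,1,0 on the free positions
x = polar_encode(u)
llr = [4.0 if xi == 0 else -4.0 for xi in x]  # noiseless observation
u_hat, _ = sc_decode(llr, frozen)
print(u_hat == u)  # the round trip recovers the input exactly
```

With noiseless LLRs, every intermediate sign is correct, so the decoder recovers $u$ exactly; with channel noise, errors propagate through the successive decisions, which motivates the list decoder of the next section.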
2.6 Successive Cancellation List Decoding
The main drawback of the SC decoder is that once a wrong decision is made, it cannot be corrected. To avoid this problem and improve the performance, an improved version of SC decoding, the successive cancellation list (SCL) decoder, was introduced to approach the maximum likelihood decoder with an acceptable complexity [7], [8].
Similar to the SC decoder, the SCL decoder also uses the recursive calculation to make decisions. However, instead of only making one decision, the SCL