Lattice Reduction Aided Multiple Input Multiple Output Detection Algorithm Design For 5G Communication


Degree Project in Electrical Engineering, Second Cycle, 30 Credits

Stockholm, Sweden 2017


YUQI ZHANG


Abstract

In recent years, multiple-input multiple-output (MIMO) systems have attracted intensive research interest because of their great potential to achieve higher spectral efficiency and data rates. However, transmitting multiple data streams simultaneously over the same time and frequency resources leads to severe interference between the transmitted signals. To solve this problem, lattice reduction has been proposed as a suboptimal alternative to maximum likelihood detection. It aims at finding a better set of basis vectors for the channel matrix and yields satisfactory performance improvements compared to conventional linear detectors. However, the average complexity of the lattice reduction algorithm is too high for practical use. This thesis studies conventional linear detection algorithms and lattice reduction aided detection algorithms for uplink receiver design. Furthermore, an iterative lattice reduction algorithm is proposed that exploits the frequency coherence in an orthogonal frequency division multiplexing system to achieve low complexity. The performance of the algorithms is verified in various scenarios through simulations. The results show promising performance improvements for the presented lattice reduction detection algorithms, at the cost of an acceptable increase in complexity.


Sammanfattning (Summary)

In recent years, MIMO systems have attracted increasingly intensive research interest because of their great potential to achieve higher spectral efficiency and data rates. However, transmitting several data streams over the same time and frequency resources leads to severe interference between the transmitted signals. To solve this problem, lattice reduction is proposed as a suboptimal maximum likelihood algorithm. It aims at finding a set of better basis vectors for the channel matrix and results in satisfactory performance improvements compared to conventional linear detectors. However, the average complexity of lattice reduction is too high for practical applications. This thesis studies conventional linear detection algorithms and lattice reduction detection algorithms for uplink receivers. Furthermore, an iterative lattice reduction algorithm is proposed that exploits the frequency coherence in an OFDM system.

In this thesis, the performance of the different algorithms is verified in various scenarios through simulation. The results show promising performance improvements for the presented lattice reduction detection algorithms, at the cost of an acceptable increase in complexity.


Acknowledgment

First of all, I would like to express my deepest gratitude to my thesis supervisor Dr. Jinliang Huang. I thank him for giving me the opportunity to carry out my thesis project at Huawei Sweden AB and, more importantly, for his help, guidance and encouragement during the entire thesis project. I am privileged to have worked with him and I could not have finished this project without his help. Many thanks also to Leting Li and Shousheng He, who have continuously helped me with fruitful discussions and valuable advice.

Secondly, I would like to thank associate professor Ming Xiao at the Department of Information Science and Engineering at the Royal Institute of Technology, KTH, who acts as my university thesis examiner.

Last but not least, I dedicate this thesis to my parents and my brother, for the unconditional financial and mental support they provided during my stay in Sweden and always. I also want to extend this gratitude and love to my girlfriend Ms. Qianli Yu, for encouraging me to keep going for a bright future.


List of Figures

2.1 OFDM visualization in frequency domain
2.2 Implementation structure of OFDM transmitter and receiver
2.3 MIMO system model with N transmit antennas and N receive antennas
2.4 FDD frame structure in LTE
2.5 Turbo encoder structure in LTE
2.6 Shift register realization of encoder in LTE
2.7 Structure of turbo decoder in LTE
3.1 Projection view for ZF in signal space
3.2 Projection view for MMSE in signal space
3.3 3GPP 16QAM constellation
4.1 Lattice Parallelogram and Voronoi regions
4.2 Unbounded loops in LLL algorithm
4.3 CDF of column swapping number in LLL algorithm
4.4 Flow chart of fixed-complexity and iterative LLL algorithm
4.5 Generation multiple transmitted symbol vectors
5.1 BER performance for QPSK in 4x4 uncoded MIMO system
5.2 BER performance for QPSK in 6x6, 8x8 uncoded MIMO systems
5.3 BER performance for 16QAM in 4x4 uncoded MIMO system
5.4 BER performance for 16QAM in 6x6, 8x8 uncoded MIMO systems
5.5 BER performance for 64QAM in 4x4 uncoded MIMO system
5.6 BER performance for 64QAM in 6x6, 8x8 uncoded MIMO systems
5.7 Correlation influence in 4x4 16QAM uncoded MIMO systems
5.8 FER performance for 16QAM in 4x4 coded MIMO systems
5.9 FER performance for 16QAM in 8x4 coded MIMO systems
5.10 Performance comparison between original and iterative LLL algorithm


List of Tables

2.1 Bandwidth and PRB number in LTE
2.2 Power delay profile specification in SCM
5.1 Parameter settings in uncoded systems
5.2 General parameter settings in advanced simulator
5.3 Detailed parameter settings in advanced simulator
5.4 Real multiplication for lattice reduction algorithm in a N × N MIMO system with M symbol vectors
5.5 Real addition for lattice reduction algorithm in a N × N MIMO system with M symbol vectors


Acronyms

3GPP 3rd Generation Partnership Project.
5G Fifth Generation Mobile Communication Systems.
BER Bit Error Rate.
BS Base Station.
CP Cyclic Prefix.
DM-RS Demodulation Reference Signals.
FDD Frequency Division Duplex.
FER Frame Error Rate.
FFT Fast Fourier Transform.
IFFT Inverse Fast Fourier Transform.
KPI Key Performance Indicator.
LLL Lenstra-Lenstra-Lovász.
LLR Log-likelihood Ratio.
LTE Long-Term Evolution.
MAP Maximum A Posteriori.
MIMO Multiple-input Multiple-output.
MLD Maximum Likelihood Detection.
MMSE Minimum Mean Square Error.
MU-MIMO Multiple-User Multiple-input Multiple-output.
NP Hard Non-deterministic Polynomial-time Hard.
OFDM Orthogonal Frequency Division Multiplexing.
OFDMA Orthogonal Frequency Division Multiple Access.
PRACH Physical Random Access Channel.
PRB Physical Resource Block.
PUCCH Physical Uplink Control Channel.
PUSCH Physical Uplink Shared Channel.
QAM Quadrature Amplitude Modulation.
QPSK Quadrature Phase Shift Keying.
RE Resource Element.
SCM Spatial Channel Model.
SNR Signal to Noise Ratio.
SRS Sounding Reference Signals.
SU-MIMO Single-User Multiple-input Multiple-output.
UE User Equipment.


Chapter 1

Introduction

1.1 Motivation

Mobile data traffic is expected to grow more than tenfold between 2015 and 2021, boosted by a rapid increase in mobile broadband subscriptions [1]. The emerging fifth generation mobile communication systems (5G) are being developed to meet the future need for higher data rates and throughput in radio networks. Several techniques have been proposed to satisfy this demand. One of them is to increase the number of antennas at both the transmitter and the receiver side, which results in so-called multiple-input multiple-output (MIMO) systems. However, one major difficulty in implementing such radio transmission systems is the signal detection at the receiver side. Because multiple users transmit signals at the same time, they produce severe interference with each other. Conventional detection algorithms suffer from this interference, resulting in poor performance in MIMO communication systems.

The well-known optimal detection algorithm in this scenario is maximum likelihood detection (MLD). It exhaustively searches all possible transmitted symbols in the signal space and chooses the symbols that have the minimum Euclidean distance to the received signal. However, the complexity of this algorithm grows exponentially with the problem dimension, and solving the detection problem using MLD is proven to be non-deterministic polynomial-time hard (NP hard) [2]. As a consequence, in order to benefit from joint detection in MIMO systems, more efficient sub-optimal MLD algorithms have to be developed.

Lattices, or lattice theory, are used to develop powerful algorithms in many wireless communication applications, especially for the precoding and detection problems in MIMO scenarios [3]. Moreover, as orthogonal frequency division multiple access (OFDMA) is proposed for the 5G uplink, it is of particular interest to develop new cost-efficient uplink receiver detection algorithms using lattice theory, from both a theoretical and a practical perspective. Such an algorithm is expected to offer a promising performance improvement over conventional linear detectors at low complexity.


1.2 Problem Formulation and Method

This thesis aims to study different types of algorithms for the 5G receiver detection problem, to evaluate their performance in terms of bit error rate (BER) and frame error rate (FER) as well as their complexity at link level, and to propose new algorithms that can be used in practical applications.

In order to evaluate the performance of the different algorithms, simulations have been carried out in two link level radio simulators. The simulators are capable of simulating radio link quality in various radio channel models and visualizing different key performance indicators (KPI). However, all new MLD algorithms had to be implemented in the original simulation framework to complete the thesis work.

The objectives of this thesis project are:

• Study different MLD algorithms in the literature and implement those algorithms in the simulator framework.

• Evaluate the performance of the various algorithms through Monte Carlo simulations. The performance is assessed by FER in a coded system and BER in an uncoded system under different MIMO dimensions and different signal-to-noise ratios (SNR).

• Evaluate the complexity of the various algorithms based on the number of real multiplications and real additions needed for one resource element (RE) in OFDM systems.

• Propose and evaluate a new cost-efficient MLD algorithm that can be used in a practical system and analyze its complexity compared with the other algorithms.

1.3 Previous Work

From the lattice point of view, the detection problem can be treated as finding the closest lattice point in a lattice space [4]. Various lattice decoders based on an algorithm called lattice reduction have been proposed for MIMO detection problems [5, 6, 7, 8]. The basic idea behind the lattice reduction algorithm is to find a unimodular transformation matrix for the MIMO channel such that the transformed channel matrix is more orthogonal, which means that the new channel matrix has a better condition number than the original channel matrix. Conventional linear detectors can then benefit from this transformation and yield better performance. Alternatively, other sub-optimal detection algorithms such as tree search algorithms can also be used on the equivalent channel matrix [9].

There are several different lattice reduction algorithms which offer different trade-offs between performance and complexity [10, 11, 12]. Among them, the most popular one is the Lenstra-Lenstra-Lovász (LLL) algorithm. By repeating reduction and basis vector swapping, an LLL-reduced basis can be obtained with polynomial complexity in the length of the basis. It has also been shown that LLL lattice reduction aided detectors achieve full diversity [12]. Moreover, lattice reduction aided detectors are used in coded systems [13, 14, 15]. The log-likelihood ratio (LLR) of each coded bit can be obtained by generating a number of candidate symbol vectors and calculating the corresponding Euclidean distances. In [15], three different methods of calculating the LLR are presented, which optimize performance, complexity and memory usage respectively. In [13], the k nearest neighbors algorithm is combined with lattice reduction to obtain better performance.

As for complexity, although the average complexity of the LLL lattice reduction algorithm is polynomial in the length of the basis, the worst-case complexity is unbounded. A fixed-complexity version of the LLL algorithm is presented in [16], obtained by changing the direction of the signal flow inside the algorithm.

1.4 Outline of Thesis

This thesis report is organized as follows. Chapter 2 briefly summarizes the background knowledge on MIMO transmission and radio networks that is necessary to understand the thesis, while conventional linear detection algorithms are described in chapter 3. Chapter 4 gives a detailed analysis of the LLL lattice reduction algorithm and the proposed method to decrease its complexity. Chapter 5 contains the parameter settings for the simulations conducted in this thesis and the corresponding simulation results, as well as the complexity analysis. Finally, chapter 6 concludes the thesis and identifies directions for future work.


Chapter 2

Background

2.1 OFDM Modulation

Orthogonal frequency division multiplexing (OFDM) is a modulation technique that transmits signals over a set of orthogonal subcarriers. It is used in the Long-Term Evolution (LTE) downlink and lays the foundation for the single-carrier modulation technique used in the LTE uplink. Moreover, OFDMA has also been proposed for the uplink of future 5G systems.

For a given OFDM system whose symbol interval is $T$ and sampling rate is $N/T$, the discrete transmitted signal is written as

$$x_k = \sum_{n=0}^{N-1} s_n e^{j 2\pi n k / N}, \tag{2.1}$$

where $s_n$ denotes the complex symbol on the $n$th subcarrier.

As can be seen in figure 2.1, the subcarriers are evenly spaced and overlap with each other in the frequency domain, which achieves higher spectral efficiency than single-carrier transmission. Moreover, if the subcarriers are spaced apart by an integer multiple of $1/T$, the contribution of every other subcarrier is zero at each subcarrier frequency due to perfect orthogonality, which can be shown as

$$\int_0^T s_m e^{j 2\pi f_m t}\, \bar{s}_n e^{-j 2\pi f_n t}\, dt = s_m \bar{s}_n \frac{e^{j 2\pi (f_m - f_n) T} - 1}{j 2\pi (f_m - f_n)} = T s_m \bar{s}_n \delta_{f_m, f_n}. \tag{2.2}$$

OFDM modulation has further advantages. Since the whole bandwidth is split into multiple narrow channels, the bandwidth of the signal transmitted on each subcarrier is typically smaller than the coherence bandwidth of the channel. Each subcarrier therefore experiences flat fading, which makes the transmission more robust to frequency selective fading. Moreover, by introducing a cyclic prefix (CP), that is, by inserting the last part of the OFDM symbol at its beginning, the linear convolution with the channel is converted into a cyclic convolution as seen by the demodulator. The transmitted symbols can then be recovered as

$$s_n = \frac{1}{N} \sum_{k=0}^{N-1} x_k e^{-j 2\pi n k / N}. \tag{2.3}$$


Figure 2.1: OFDM visualization in frequency domain

This makes OFDM feasible to implement using the inverse fast Fourier transform (IFFT) and the fast Fourier transform (FFT), as shown in figure 2.2.

Further, inter-symbol interference caused by multipath propagation can also be combated by removing cyclic prefix before equalization.
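The role of the cyclic prefix can be illustrated with a small numerical sketch. The code below is only an illustration and not the simulator used in this thesis: it uses a direct DFT instead of an FFT, and the QPSK symbols, the 3-tap channel `taps` and all function names are hypothetical choices. It modulates the symbols as in (2.1), prepends a CP that covers the channel memory, passes the signal through a noise-free multipath channel, removes the CP, demodulates as in (2.3) and equalizes every subcarrier with a single complex division.

```python
import cmath

def ofdm_modulate(symbols):
    """IDFT as in (2.1): x_k = sum_n s_n * exp(+j*2*pi*n*k/N)."""
    n_sc = len(symbols)
    return [sum(s * cmath.exp(2j * cmath.pi * n * k / n_sc)
                for n, s in enumerate(symbols))
            for k in range(n_sc)]

def ofdm_demodulate(samples):
    """Scaled DFT as in (2.3): s_n = (1/N) * sum_k x_k * exp(-j*2*pi*n*k/N)."""
    n_sc = len(samples)
    return [sum(x * cmath.exp(-2j * cmath.pi * n * k / n_sc)
                for k, x in enumerate(samples)) / n_sc
            for n in range(n_sc)]

def channel_filter(samples, taps):
    """Linear convolution with the channel taps, truncated to the input length."""
    return [sum(taps[d] * samples[i - d] for d in range(len(taps)) if i - d >= 0)
            for i in range(len(samples))]

# Hypothetical example values (not taken from the thesis simulations).
symbols = [1+1j, 1-1j, -1+1j, -1-1j, 1+1j, -1-1j, 1-1j, -1+1j]  # QPSK, N = 8
taps = [1.0, 0.5, 0.25j]       # 3-tap multipath channel
cp_len = len(taps) - 1         # the CP must cover the channel memory

time_signal = ofdm_modulate(symbols)
tx = time_signal[-cp_len:] + time_signal   # prepend the cyclic prefix
rx = channel_filter(tx, taps)[cp_len:]     # channel, then CP removal

# After CP removal the channel acts as a cyclic convolution, so each
# subcarrier sees one complex gain: the DFT of the channel taps.
n_sc = len(symbols)
channel_gain = [sum(t * cmath.exp(-2j * cmath.pi * n * d / n_sc)
                    for d, t in enumerate(taps))
                for n in range(n_sc)]
recovered = [r / g for r, g in zip(ofdm_demodulate(rx), channel_gain)]
```

In this noise-free setting `recovered` matches `symbols` up to rounding errors, which reflects why a one-tap equalizer per subcarrier is sufficient once the CP is at least as long as the channel memory.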

Figure 2.2: Implementation structure of OFDM transmitter and receiver.

2.2 Multiple Antenna Transmission System

Multiple antenna transmission systems, or so-called MIMO systems, have been studied for many years. Such systems are more robust to varying channel conditions. For single-user MIMO (SU-MIMO) systems, if the antennas are placed sufficiently far apart from each other, they experience different fading amplitudes. It is very rare for all antennas to have a very poor channel condition simultaneously, so diversity can be achieved against channel fading. In multiple-user MIMO (MU-MIMO) systems, different users are allowed to transmit signals using the same time and frequency resources at the cost of more signal processing operations. Both SU-MIMO and MU-MIMO exploit common MIMO fundamentals and benefit from spatial multiplexing gain.

Figure 2.3: MIMO system model with N transmit antennas and N receive antennas

Consider an $N_T \times N_R$ MIMO system with $N_T$ transmitters and $N_R$ receivers. Assuming that the channel conditions are stable during one symbol duration, the system model is given as

$$\mathbf{y} = \mathbf{H}\mathbf{s} + \mathbf{n}, \tag{2.4}$$

where $\mathbf{s} = [s_1, s_2, \ldots, s_{N_T}]^T$ denotes the signals transmitted from the different transmitters and $\mathbf{y} = [y_1, y_2, \ldots, y_{N_R}]^T$ denotes the received signals. $\mathbf{n}$ is the additive white Gaussian noise vector whose entries are circularly symmetric complex Gaussian random variables $\mathcal{CN}(0, 2\sigma^2)$, and $\mathbf{H}$ is the complex $N_R \times N_T$ channel matrix, each column of which represents the channel gains seen by one transmitter. If the columns of the channel matrix are linearly independent, $\mathbf{H}$ has full rank, and for a square channel matrix the transmitted signals can be recovered by multiplying the received signal by the inverse of the channel matrix:

$$\begin{pmatrix} \hat{s}_1 \\ \vdots \\ \hat{s}_{N_T} \end{pmatrix} = \mathbf{H}^{-1}\mathbf{y} = \begin{pmatrix} s_1 \\ \vdots \\ s_{N_T} \end{pmatrix} + \mathbf{H}^{-1}\mathbf{n}. \tag{2.5}$$

If the channel is well conditioned, the capacity of this MIMO system scales linearly with $\min(N_T, N_R)$. Transmitting different data streams simultaneously in this way is referred to as spatial multiplexing. For rich scattering environments in which the entries of the channel matrix are i.i.d. Gaussian distributed, the MIMO capacity is given by [17]

$$C = \min(N_T, N_R)\, E\left[\log\left(1 + \frac{\mathrm{SNR}}{N_T}\lambda\right)\right], \tag{2.6}$$

where $\lambda$ denotes an eigenvalue of the Wishart matrix $\mathbf{W}$, defined as

$$\mathbf{W} = \begin{cases} \mathbf{H}\mathbf{H}^H, & N_T > N_R \\ \mathbf{H}^H\mathbf{H}, & N_R \ge N_T \end{cases}.$$
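As a rough numerical illustration of (2.6), the sketch below estimates the ergodic capacity of a 2 × 2 i.i.d. Rayleigh channel by Monte Carlo simulation. It is a simplified example rather than the evaluation used in this thesis: the restriction to 2 × 2 channels (so that the eigenvalues of $\mathbf{W} = \mathbf{H}^H\mathbf{H}$ have a closed form), the base-2 logarithm, the number of trials and all function names are assumptions made for illustration. Summing over both eigenvalues is equivalent to multiplying the expectation over one randomly chosen eigenvalue by $\min(N_T, N_R) = 2$.

```python
import math
import random

def random_channel_2x2(rng):
    """2x2 channel with i.i.d. CN(0, 1) entries."""
    scale = math.sqrt(0.5)
    return [[complex(rng.gauss(0.0, scale), rng.gauss(0.0, scale)) for _ in range(2)]
            for _ in range(2)]

def wishart_eigenvalues_2x2(h):
    """Eigenvalues of W = H^H H for a 2x2 channel (closed form for a Hermitian 2x2 matrix)."""
    a = abs(h[0][0]) ** 2 + abs(h[1][0]) ** 2                            # W[0][0]
    d = abs(h[0][1]) ** 2 + abs(h[1][1]) ** 2                            # W[1][1]
    b = h[0][0].conjugate() * h[0][1] + h[1][0].conjugate() * h[1][1]    # W[0][1]
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    return mean - radius, mean + radius

def ergodic_capacity_2x2(snr_db, trials=2000, seed=1):
    """Monte Carlo estimate of (2.6) in bit/s/Hz for N_T = N_R = 2."""
    rng = random.Random(seed)
    snr = 10 ** (snr_db / 10)
    n_t = 2
    total = 0.0
    for _ in range(trials):
        eigenvalues = wishart_eigenvalues_2x2(random_channel_2x2(rng))
        # max(., 0) guards against tiny negative values caused by rounding.
        total += sum(math.log2(1 + snr / n_t * max(lam, 0.0)) for lam in eigenvalues)
    return total / trials

capacity_0db = ergodic_capacity_2x2(0)
capacity_20db = ergodic_capacity_2x2(20)
```

With these assumptions the estimate grows with the SNR, and at high SNR each additional 3 dB adds roughly $\min(N_T, N_R)$ bits, which is the linear scaling mentioned above.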

In the rest of this thesis, unless explicitly stated otherwise, simulations are based on MU-MIMO systems in which each user is equipped with one transmit antenna.

2.3 LTE Physical Resource

How to exploit time and frequency efficiently is a crucial problem in cellular networks. The downlink of the LTE system uses OFDMA to schedule different users dynamically. Each user equipment (UE) is assigned a unique set of subcarriers over a predetermined time period.

According to the 3rd Generation Partnership Project (3GPP) specification [18], transmissions are organized into frames of length 10 ms, and three frame structures are supported for different types of system. For frequency division duplex (FDD) systems, each frame is divided into 10 sub-frames of equal duration. Each sub-frame is in turn divided into 2 time slots. The OFDM subcarrier spacing is 15 kHz, which corresponds to a symbol duration of 66.7 µs in the time domain. Each slot contains 7 or 6 OFDM symbols for the normal or extended CP, with CP lengths of 4.7 µs and 16.7 µs respectively. The frame structure is visualized in figure 2.4.

Figure 2.4: FDD frame structure in LTE

LTE defines the number of subcarriers based on the system bandwidth. The resource element (RE), which contains one subcarrier in the frequency domain over one OFDM symbol duration, is the smallest resource in LTE. 12 consecutive subcarriers in one slot are grouped into one physical resource block (PRB), which is the smallest scheduling unit of the base station. The LTE specification allows the system bandwidth to range from 1.4 MHz to 20 MHz, corresponding to a number of PRBs ranging from 6 to 100, as shown in table 2.1.

System bandwidth (MHz)   1.4   3     5     10    15    20
Number of PRBs           6     15    25    50    75    100

Table 2.1: Bandwidth and PRB number in LTE
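The numbers above can be combined in a short calculation. The sketch below is only an illustration with hypothetical function names. It derives the number of subcarriers, the occupied bandwidth and the number of resource elements per slot from table 2.1, assuming 12 subcarriers per PRB, 15 kHz subcarrier spacing and 7 OFDM symbols per slot (normal CP).

```python
SUBCARRIER_SPACING_HZ = 15_000
SUBCARRIERS_PER_PRB = 12

# Table 2.1: system bandwidth in MHz -> number of PRBs.
PRBS_PER_BANDWIDTH_MHZ = {1.4: 6, 3: 15, 5: 25, 10: 50, 15: 75, 20: 100}

def subcarriers(bandwidth_mhz):
    """Number of data subcarriers for a given system bandwidth."""
    return PRBS_PER_BANDWIDTH_MHZ[bandwidth_mhz] * SUBCARRIERS_PER_PRB

def occupied_bandwidth_hz(bandwidth_mhz):
    """Bandwidth actually covered by the subcarriers."""
    return subcarriers(bandwidth_mhz) * SUBCARRIER_SPACING_HZ

def resource_elements_per_slot(bandwidth_mhz, symbols_per_slot=7):
    """Resource elements in one slot (7 symbols per slot assumes the normal CP)."""
    return subcarriers(bandwidth_mhz) * symbols_per_slot

# Useful OFDM symbol duration implied by the subcarrier spacing, in seconds.
symbol_duration_s = 1 / SUBCARRIER_SPACING_HZ
```

For example, a 20 MHz system has 100 × 12 = 1200 subcarriers, which occupy 18 MHz and provide 8400 resource elements per slot with the normal CP. The remaining bandwidth acts as guard band.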

In LTE, synchronization and channel estimation are not implemented by preambles. Instead, reference signals are inserted in the PRBs to obtain essential channel state information. There are mainly two types of reference signals in the uplink: demodulation reference signals (DM-RS) and sounding reference signals (SRS). DM-RS are used for equalization and demodulation by the base station (BS). Each UE transmits DM-RS during the fourth OFDM symbol of each slot over all allocated subcarriers. SRS are transmitted separately on an arbitrary number of subcarriers in an uplink subframe and are used to convey the UE's channel characteristics.

There are three uplink physical channels in LTE.

• PUCCH, the physical uplink control channel, is used for transmitting hybrid automatic repeat request acknowledgments, channel state information and requests for uplink resource allocation.

• PUSCH, the physical uplink shared channel, is used for transmitting UE-specific data to the BS.

• PRACH, the physical random access channel, is used for initiating access from the UE to the BS.

2.4 Turbo Receiver

The basic idea of channel coding is to introduce redundancy into the transmitted signals so that they are more robust to noise and interference. It is known from Shannon theory that sufficiently long random codes achieve capacity, but in practice they are infeasible to implement due to their high complexity. Turbo codes employ convolutional codes to construct a random-like code that approaches capacity.

The turbo encoder used in LTE is a parallel concatenated turbo code with two constituent encoders of rate 1/2. Hence the overall rate of the turbo encoder is 1/3 [19]. The structure of the turbo encoder is depicted in figure 2.5.

Figure 2.5: Turbo encoder structure in LTE

For each constituent encoder, the transfer function is given as

$$G(D) = \left(1, \frac{1 + D + D^3}{1 + D^2 + D^3}\right), \tag{2.7}$$

where $D$ denotes the delay operator, $D = z^{-1}$. This encoder is a recursive systematic encoder with 8 different states, and its shift register realization is shown in figure 2.6.

Figure 2.6: Shift register realization of encoder in LTE
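The shift register realization can be sketched in a few lines of code. The example below is a minimal illustration of a single constituent encoder, assuming the feedback polynomial $1 + D^2 + D^3$ and the feedforward polynomial $1 + D + D^3$ of the LTE constituent code as given in 3GPP TS 36.212. It starts from the all-zero state and omits trellis termination, the interleaver and the second constituent encoder. The function name `rsc_encode` is hypothetical.

```python
def rsc_encode(bits):
    """Rate-1/2 recursive systematic convolutional encoder.

    Assumed transfer function: G(D) = (1, (1 + D + D^3) / (1 + D^2 + D^3)).
    Returns the systematic bits and the parity bits.
    """
    s1 = s2 = s3 = 0  # register contents after 1, 2 and 3 delays (8 states in total)
    systematic, parity = [], []
    for u in bits:
        a = u ^ s2 ^ s3   # feedback: denominator 1 + D^2 + D^3
        p = a ^ s1 ^ s3   # feedforward: numerator 1 + D + D^3
        systematic.append(u)
        parity.append(p)
        s1, s2, s3 = a, s1, s2  # shift the register
    return systematic, parity

# Impulse response: a single 1 followed by zeros.
impulse_systematic, impulse_parity = rsc_encode([1, 0, 0, 0, 0, 0, 0, 0])
```

Because of the feedback, a single input 1 produces a parity sequence that does not die out (here it starts with 1, 1, 1, 1, 0, 0, 1, 0 and then repeats with period 7), which is the property that makes recursive encoders suitable as turbo code constituents.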

As shown above, turbo codes use a long interleaver between highly structured constituent codes to obtain a random-like code, so that their performance is not far from the Shannon limit. Furthermore, the trellis representation of the constituent codes can be used to obtain an effective decoding algorithm.

The optimal decoder is the so-called maximum a posteriori (MAP) decoder, which is infeasible for a whole code block due to its high complexity. However, the BCJR decoding algorithm can be used on structured codes that can be described with a limited number of states [20]. The BCJR algorithm estimates the posterior distribution of the bits by running forward and backward Viterbi-like recursions for each constituent encoder, taking the received signals and the code structure into consideration. Figure 2.7 presents the signal flow of the iterative decoder.

Figure 2.7: Structure of turbo decoder in LTE

First, the inner decoder exploits the received signal from the channel and the a priori bit information to obtain extrinsic information for each bit, which can be interpreted as the difference between the a priori information and the a posteriori information. This information is then deinterleaved and fed into the outer decoder as a priori information. After BCJR decoding, the output information from the outer decoder is interleaved again and serves as a priori information for the inner decoder. The decoding process keeps exchanging information between the decoders in this manner until certain conditions are satisfied, such as reaching the maximum number of iterations, passing the CRC check or reaching a sufficiently large information amplitude.

It is worth noting that in the thesis simulation setup, the decoding process only terminates when the number of iterations reaches 8.

2.5 Spatial Channel Model

This section presents the spatial channel model(SCM) used in this thesis based on 3GPP specification.[21]

SCM was designed for simulating MIMO systems and algorithms from both link level calibration and system level performance. It tries to specify various fading and correlation characteristics between BSs and UEs. Some large scale channel characteristics such as delay spread and shadow fading are pre-defined based on measured data. There are in total six di↵erent models that are suitable for simulating di↵erent environments, among which ’Urban’ model for typical city macro environment is used in this thesis.

It is assumed that the transmission channel consists of several paths. The power and angle distribution of each path typically follows a Laplacian distribution whose center depends on the average angles of arrival and angles of departure. Further, each path is divided into 20 subpaths with the same power whose relative angles are produced by random variables.

In the Urban model, the angle spread at the BS is either 8° or 15°. The power and angle distribution at the BS side are obtained by averaging multipath signals and are also pre-defined as the power delay profile in table 2.2. During one simulation, the positions of the different UEs are fixed to maintain the same correlation between each other.

Modified Pedestrian A   Delay (ns)            0     0      110    190     410
                        Relative power (dB)   0     0      -9.7   -19.2   -22.8
Vehicular A             Delay (ns)            0     310    710    1090    1730
                        Relative power (dB)   0     -1     -9     -10     -15
Pedestrian B            Delay (ns)            0     200    800    1200    2300
                        Relative power (dB)   0     -0.9   -4.9   -8      -7.8

Table 2.2: Power delay profile specification in SCM
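To give a feeling for how dispersive these profiles are, the sketch below converts the relative powers of table 2.2 to normalized linear tap powers and computes the RMS delay spread of each profile. The RMS delay spread is a standard summary statistic that is used here only for illustration. It is not a quantity reported in this thesis, and the dictionary and function names are hypothetical.

```python
import math

# Table 2.2: (delays in ns, relative powers in dB) for each profile.
POWER_DELAY_PROFILES = {
    "Modified Pedestrian A": ([0, 0, 110, 190, 410], [0, 0, -9.7, -19.2, -22.8]),
    "Vehicular A": ([0, 310, 710, 1090, 1730], [0, -1, -9, -10, -15]),
    "Pedestrian B": ([0, 200, 800, 1200, 2300], [0, -0.9, -4.9, -8, -7.8]),
}

def normalized_tap_powers(powers_db):
    """Convert relative powers in dB to linear powers that sum to one."""
    linear = [10 ** (p / 10) for p in powers_db]
    total = sum(linear)
    return [p / total for p in linear]

def rms_delay_spread_ns(delays_ns, powers_db):
    """Power-weighted standard deviation of the tap delays."""
    weights = normalized_tap_powers(powers_db)
    mean = sum(w * d for w, d in zip(weights, delays_ns))
    second_moment = sum(w * d * d for w, d in zip(weights, delays_ns))
    return math.sqrt(second_moment - mean ** 2)

delay_spreads = {name: rms_delay_spread_ns(delays, powers)
                 for name, (delays, powers) in POWER_DELAY_PROFILES.items()}
```

A larger delay spread implies a smaller coherence bandwidth, so for the tabulated taps the Pedestrian B and Vehicular A profiles vary faster across subcarriers than the Modified Pedestrian A profile.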


Chapter 3

Linear Detection Algorithm

It is well known that if the transmitter knows the channel state information, singular value decomposition can be used to decompose the MIMO channel into several parallel channels. This is achieved by pre-processing the transmit data and sending them through the channel eigenmodes. However, this is not possible when the transmitter does not know the exact channel information. If multiple UEs send data over the same time and frequency resources, the transmitted signals arrive at the receive antennas overlapping each other. This poses new challenges for separating the data at the receiver side.

In the following description of the different detection algorithms, the MIMO notation of section 2.2 is used:

$$\mathbf{y} = \mathbf{H}\mathbf{s} + \mathbf{n} = \sum_{i=1}^{n} \mathbf{h}_i s_i + \mathbf{n}, \tag{3.1}$$

where $\mathbf{h}_i$ denotes the $i$th column of the $N_R \times N_T$ fading channel matrix, $s_i$ is the normalized-power signal transmitted from the $i$th transmitter and $n = N_T$ is the number of transmitters. It is further assumed that the number of receive antennas is always equal to or larger than the number of transmit antennas.

The simplest detection algorithm for MIMO detection is the linear detection algorithm, also called the linear detector. It correlates the received signals linearly with a matrix to obtain the corresponding decision information. Equation (3.1) can be rewritten as

$$\mathbf{y} = \mathbf{h}_1 s_1 + \sum_{i=2}^{n} \mathbf{h}_i s_i + \mathbf{n}. \tag{3.2}$$

To obtain a good decision on the transmitted signal $s_1$ of the first user, the interference from the other users, together with the noise, needs to be considered. An intuitive solution is to suppress the interference through the correlation matrix such that the power of the desired signal $s_1$ is much larger than the power of $s_i$ for $i \neq 1$. This leads to two linear detectors: the zero-forcing (ZF) detector and the minimum mean square error (MMSE) detector.


3.1 Zero Forcing Detection

The zero forcing algorithm aims at erasing the effect of the interference from the other transmitters by considering the signal space spanned by the vectors $\mathbf{h}_1, \mathbf{h}_2, \cdots, \mathbf{h}_n$. In this space, the received signal is a combination of components along each vector. The strategy to force the interference to zero while at the same time maximizing the power of the desired signal is simply to project the received signal onto the direction that is perpendicular to the subspace spanned by all interferers. Let $\mathbf{w}_{zf}$ denote the correlation matrix. For signal $s_1$, its column $\mathbf{w}_{zf,1}$ has to satisfy

$$\mathbf{w}_{zf,1}^H \mathbf{H} = \mathbf{e}_1^T = (1, 0, 0, \cdots, 0). \tag{3.3}$$

Considering that $\mathbf{w}_{zf,1}$ is a linear combination of the channel vectors, $\mathbf{w}_{zf,1} = \mathbf{H}\mathbf{c}$, this becomes

$$\mathbf{H}^H \mathbf{H} \mathbf{c} = \mathbf{e}_1 = (1, 0, 0, \cdots, 0)^T. \tag{3.4}$$

Together with the other desired signals $s_2, s_3, \cdots$, the zero forcing detector for MIMO communication is written as

$$\mathbf{w}_{zf} = \mathbf{H}(\mathbf{H}^H \mathbf{H})^{-1}. \tag{3.5}$$

A visualization of the ZF method in signal space is shown in figure 3.1. However, compared with the optimal matched filter used in the interference-free scenario, the projection leads to a noise amplification problem. Some signal power is lost during the projection. When the correlated signal is normalized to unit power, the additive white Gaussian noise is multiplied by the same reciprocal gain and is therefore amplified. This is the major problem of the zero forcing detector and results in poor performance.
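The construction in (3.3) to (3.5) can be checked numerically. The sketch below is a minimal illustration for a hypothetical 3 × 2 channel (three receive antennas, two single-antenna users), written with plain Python lists so that it has no dependencies. All helper names and the channel values are assumptions. The estimate is formed as $\mathbf{w}_{zf}^H \mathbf{y}$, and the diagonal of $(\mathbf{H}^H\mathbf{H})^{-1}$ is taken as the noise gain per stream.

```python
def hermitian(a):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[a[i][j].conjugate() for i in range(len(a))] for j in range(len(a[0]))]

def matmul(a, b):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def inv2(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

# Hypothetical channel: columns are the channel vectors h_1 and h_2.
H = [[1 + 1j, 0.5],
     [0.2j, 1 - 0.5j],
     [0.3, -0.4 + 0.1j]]
s = [[1 + 1j], [-1 + 1j]]      # transmitted symbols (column vector)
y = matmul(H, s)               # noise-free received signal

gram_inv = inv2(matmul(hermitian(H), H))   # (H^H H)^{-1}
W_zf = matmul(H, gram_inv)                 # (3.5)
s_hat = matmul(hermitian(W_zf), y)         # ZF estimate of s

# Noise variance at output n is N0 * [(H^H H)^{-1}]_{n,n}.
noise_gain = [gram_inv[n][n].real for n in range(2)]
# A matched filter without interference would only see 1 / ||h_n||^2.
matched_filter_gain = [1 / sum(abs(row[n]) ** 2 for row in H) for n in range(2)]
```

Without noise, `s_hat` reproduces `s` exactly (up to rounding), since $\mathbf{w}_{zf}^H\mathbf{H}$ is the identity. The entries of `noise_gain` are never smaller than those of `matched_filter_gain`, which is the noise amplification described above.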


3.2 Minimum Mean Square Error Detection

Since the ZF detector leads to a severe noise amplification problem, it is beneficial to take the noise as well as the interference into consideration. A trade-off between noise and interference needs to be achieved: instead of nulling all the interference, some interference should be kept for the sake of suppressing noise. A new received signal model can be considered as

$$\mathbf{y} = \mathbf{h}_1 s_1 + \tilde{\mathbf{n}}. \tag{3.6}$$

Compared with (3.2), the interference and the white noise are merged into a new colored noise. In order to apply the matched filter, which is optimal in the interference-free scenario, the received signals first go through a whitening filter so that the colored noise becomes white again. This filter is inversely proportional to the power of the interference and noise. For the $k$th UE, the filter is given as

$$\mathbf{W}_k = N_0 \mathbf{I} + P \sum_{i=1, i \neq k}^{n} \mathbf{h}_i \mathbf{h}_i^H, \tag{3.7}$$

$$\mathbf{W}^{-\frac{1}{2}} \mathbf{y} = \mathbf{W}^{-\frac{1}{2}} \mathbf{H}\mathbf{s} + \mathbf{W}^{-\frac{1}{2}} \tilde{\mathbf{n}}, \tag{3.8}$$

where $N_0$ is the noise power and $P$ is the uniform power of all transmitted signals.

Then, by applying the matched filter to project the filtered signal along the direction $\mathbf{W}^{-\frac{1}{2}}\mathbf{h}$, the result becomes

$$(\mathbf{W}^{-\frac{1}{2}} \mathbf{H})^H \mathbf{W}^{-\frac{1}{2}} \mathbf{y} = \mathbf{H}^H \mathbf{W}^{-1} \mathbf{H}\mathbf{s} + \mathbf{H}^H \mathbf{W}^{-1} \tilde{\mathbf{n}}. \tag{3.9}$$

So the MMSE detector is defined as

$$\mathbf{w}_{mmse} = \mathbf{H}^H \left(\mathbf{H}\mathbf{H}^H + N_0 \mathbf{I}_{N_R}\right)^{-1}. \tag{3.10}$$

It is known that the MMSE detector maximizes the post-detection SNR and minimizes the mean square error in estimating the transmitted signals. It also follows from (3.10) that the MMSE detector reduces to the ZF detector if the noise vanishes. Since MMSE makes a compromise between noise and interference, noise is no longer a limiting factor in an interference-limited scenario. A view of this detector is shown in figure 3.2.
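The behaviour of (3.10) can be illustrated numerically. The sketch below is a simplified example for a hypothetical 2 × 2 channel with unit transmit power ($P = 1$), again using plain Python lists. The helper names and channel values are assumptions. It builds the filter as in (3.10) and also in the algebraically equivalent form $(\mathbf{H}^H\mathbf{H} + N_0\mathbf{I})^{-1}\mathbf{H}^H$, which for a full-rank channel turns into the ZF solution as $N_0$ approaches zero.

```python
def hermitian(a):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[a[i][j].conjugate() for i in range(len(a))] for j in range(len(a[0]))]

def matmul(a, b):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def inv2(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def add_noise_power(m, n0):
    """Return m + n0 * I for a 2x2 matrix m."""
    return [[m[i][j] + (n0 if i == j else 0.0) for j in range(2)] for i in range(2)]

def mmse_filter(h, n0):
    """MMSE filter as in (3.10): H^H (H H^H + N0 I)^{-1}, for a 2x2 channel."""
    h_herm = hermitian(h)
    return matmul(h_herm, inv2(add_noise_power(matmul(h, h_herm), n0)))

def mmse_filter_alt(h, n0):
    """Equivalent form: (H^H H + N0 I)^{-1} H^H."""
    h_herm = hermitian(h)
    return matmul(inv2(add_noise_power(matmul(h_herm, h), n0)), h_herm)

# Hypothetical 2x2 channel and noise power.
H = [[1, 0.8],
     [0.6j, 1]]
W_mmse = mmse_filter(H, 0.1)
W_low_noise = mmse_filter(H, 1e-9)   # behaves like the ZF detector
```

At $N_0 = 0.1$ the product `matmul(W_mmse, H)` is no longer the identity: the diagonal gains are smaller than one and some residual interference remains, which is the price paid for the reduced noise amplification.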

3.3 A Posteriori SNR

The a posteriori SNR, or post-SNR, is the effective signal-to-noise ratio after filtering the received signal with a linear detector. By using (3.5) and (3.10), the received signal can be decoupled into several output streams. Assuming that the pre-SNR $\tilde{\rho}$ is ideally known, the post-SNR is given as

$$\rho_{zf,n} = \frac{\tilde{\rho}}{\left[(\mathbf{H}^H \mathbf{H})^{-1}\right]_{n,n}}, \tag{3.11}$$

and

$$\rho_{mmse,n} = \frac{\tilde{\rho}}{\left[(\mathbf{H}^H \mathbf{H} + N_0 \mathbf{I}_{N_R})^{-1}\right]_{n,n}}, \tag{3.12}$$


Figure 3.2: Projection view for MMSE in signal space

where $n$ denotes the $n$th output stream and $[\cdot]_{n,n}$ denotes the $n$th diagonal element of the matrix. Furthermore, by using the column-wise representation of the channel matrix, the post-SNR can be written as

$$\rho_{zf,n} = \tilde{\rho}\left[\mathbf{h}_n^H \mathbf{h}_n - \mathbf{h}_n^H \mathbf{H}_n (\mathbf{H}_n^H \mathbf{H}_n)^{-1} \mathbf{H}_n^H \mathbf{h}_n\right], \tag{3.13}$$

and

$$\rho_{mmse,n} = \tilde{\rho}\left[\mathbf{h}_n^H \mathbf{h}_n - \mathbf{h}_n^H \mathbf{H}_n (\mathbf{H}_n^H \mathbf{H}_n + N_0 \mathbf{I}_{N_R})^{-1} \mathbf{H}_n^H \mathbf{h}_n\right]. \tag{3.14}$$

In the equations above, $\mathbf{H}_n$ denotes the channel matrix with the $n$th column excluded. With the help of the post-SNR values, a new system model after correlation with the linear detector, assuming that the noise is normalized, is

$$\hat{s}(n) = \rho\, d(n) + z(n), \tag{3.15}$$

where $d(n)$ denotes one point in the constellation and $z(n)$ is the corresponding noise after amplification. In the MMSE case, this noise has variance

$$\sigma_z^2 = \rho - \rho^2.$$

Based on this model, the calculation of the log-likelihood ratio (LLR) for linear detectors can be simplified.

3.4 Log-likelihood Ratio Calculation in Linear Detector

Maximizing a posteriori probability of a transmitted bit minimizes the corre-sponding detection error probability. This a posterior probability is often rep-resented by log-likelihood ratio(LLR). In wireless communication systems, the sign of LLR values matches with di↵erent transmitted binary bits, either 1 or 0, while the amplitudes of LLR values show how reliable this bit detection is. Use bI,i/bQ,i to denote the ith bit for real/imaginary part of one symbol. The conditional probability density function of 3.15 is Gaussian distribution


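$$p(\hat{s}(n)\mid d(n)=c) = \frac{1}{\sqrt{2\pi}\,\sigma_z}\exp\left\{-\frac{1}{2}\frac{\|\hat{s}(n)-\rho c\|^2}{\sigma_z^2}\right\}. \tag{3.16}$$

Without loss of generality, the calculation of $b_{I,1}$ is given as:

where $c^{(0/1)}$ denotes the constellation points whose first bit is 0/1. Using the Jacobian logarithm approximation:

$$\ln(e^{a}+e^{b}) = \max(a,b) + \ln\left(1+e^{-|a-b|}\right), \tag{3.18}$$

3.17 can be further simplified as:

$$\Lambda_{I,1} = \max_{c^{(1)}\in d_I^{1}} \ln p(\hat{s}(n)\mid d(n)=c^{(1)}) - \max_{c^{(0)}\in d_I^{0}} \ln p(\hat{s}(n)\mid d(n)=c^{(0)}) = \frac{1}{2\sigma_z^2}\left\{\min_{c^{(0)}\in d_I^{0}}\|\hat{s}(n)-\rho c^{(0)}\|^2 - \min_{c^{(1)}\in d_I^{1}}\|\hat{s}(n)-\rho c^{(1)}\|^2\right\}. \tag{3.19}$$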
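The Jacobian logarithm in 3.18 is an exact identity; the max-log simplification keeps only the max term. The sketch below illustrates both with arbitrary example values for a and b (the function names and numbers are chosen here for illustration only).

```python
import math

def jacobian_log(a, b):
    """Exact form of 3.18: ln(e^a + e^b) = max(a, b) + ln(1 + e^-|a - b|)."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

def max_log(a, b):
    """Max-log approximation: the correction term ln(1 + e^-|a - b|) is dropped."""
    return max(a, b)

a, b = -1.3, -4.0                      # example log-domain metrics
exact = math.log(math.exp(a) + math.exp(b))
approx_error = exact - max_log(a, b)   # always between 0 and ln 2
```

The approximation error is largest (ln 2) when a equals b and decays quickly as the two metrics move apart, which is why the correction term is commonly neglected.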

Figure 3.3 shows a 16QAM constellation, where the star represents one received symbol corrupted by white noise. For the first bit of the real part, the constellation can be partitioned into two parts based on the 0/1 values, as shown in the figure. An important observation is that the two candidate symbols, one from each sub-constellation, that are closest to the received signal always lie in the same column. So only the imaginary-part distance needs to be considered when the LLR value is calculated.

3.19 for the first bit can be calculated directly from the relative position of the received signal, normalized by $2/\sigma_z^2$:

$$\Lambda_{I,1} = \frac{1}{4}\left[\|\hat{s}_Q(n)-\rho\|^2 - \|\hat{s}_Q(n)+\rho\|^2\right] = -\hat{s}_Q(n)\rho, \tag{3.20}$$

where $\hat{s}_{I,Q}(n)$ represents the real or imaginary part of $\hat{s}$. The derivation above can be generalized to the other bits based on the bit-value distribution of the constellation. The LLR calculation for 16QAM is given below.

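$$\Lambda_{I/Q,1} = \begin{cases} -\hat{s}_{I/Q}(n)\rho, & |\hat{s}_{I/Q}(n)| \le 2\rho \\ -2\rho\,(\hat{s}_{I/Q}(n)-\rho), & \hat{s}_{I/Q}(n) > 2\rho \\ -2\rho\,(\hat{s}_{I/Q}(n)+\rho), & \hat{s}_{I/Q}(n) < -2\rho \end{cases} \tag{3.21}$$

$$\Lambda_{I/Q,2} = \rho\,(|\hat{s}_{I/Q}(n)| - 2\rho). \tag{3.22}$$

Similarly, for 64QAM: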

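Returning to the 16QAM first-bit expression in 3.21, the piecewise form can be compared against a brute-force max-log search over the two sub-constellations, using the 1/4 normalization of 3.20. This is only a sketch under stated assumptions: the per-dimension levels are taken as {±1, ±3} scaled by ρ, the first bit is assumed to be 0 on the positive levels and 1 on the negative levels, and the branch threshold is taken as 2ρ. None of these labeling details are confirmed by the text.

```python
import numpy as np

LEVELS_BIT0 = np.array([1.0, 3.0])    # assumed labeling: first bit 0
LEVELS_BIT1 = np.array([-1.0, -3.0])  # assumed labeling: first bit 1

def llr_first_bit_bruteforce(s_hat, rho):
    """Max-log LLR: (min distance to bit-0 points - min distance to bit-1 points) / 4."""
    d0 = np.min((s_hat - rho * LEVELS_BIT0) ** 2)
    d1 = np.min((s_hat - rho * LEVELS_BIT1) ** 2)
    return 0.25 * (d0 - d1)

def llr_first_bit_piecewise(s_hat, rho):
    """Closed-form branches in the style of 3.21, with threshold 2*rho (assumed)."""
    if abs(s_hat) <= 2 * rho:
        return -s_hat * rho
    if s_hat > 2 * rho:
        return -2 * rho * (s_hat - rho)
    return -2 * rho * (s_hat + rho)

rho = 0.8
grid = np.linspace(-5.0, 5.0, 201)
brute = np.array([llr_first_bit_bruteforce(x, rho) for x in grid])
closed = np.array([llr_first_bit_piecewise(x, rho) for x in grid])
```

Under these assumptions the two curves coincide, and the closed form avoids any distance search at run time.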

Chapter 4

Lattice Reduction

4.1 Algorithm Overview

The basic idea of lattice reduction in the detection problem is to find an equivalent channel matrix with a better condition number. The columns of the new channel matrix are closer to orthogonal, which leads to better detection performance.

Based on the MIMO model in 3.1, the model of lattice reduction can be described as

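$$\mathbf{y} = \mathbf{H}\mathbf{T}\mathbf{T}^{-1}\mathbf{s} + \mathbf{n} = \tilde{\mathbf{H}}\mathbf{z} + \mathbf{n}, \tag{4.1}$$

where $\mathbf{T}$ is a unimodular matrix whose entries are complex integers and whose determinant is $\pm 1$ or $\pm j$. $\tilde{\mathbf{H}}$ is the equivalent new channel matrix and $\mathbf{z}$ is the transmitted signal in the lattice reduced domain.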

After this transformation, a conventional linear detector such as the MMSE detector can be applied to the signal z. Given that the new channel matrix has a better condition number, the noise amplification is less severe than when the MMSE detector is applied directly. The detection results in the lattice reduced domain are then mapped back to the original domain by s = Tz. In summary, there are three main steps in a lattice reduction aided detection algorithm:

• Calculate the unimodular matrix T and obtain a new, better-conditioned channel matrix.
• Solve the detection problem by applying a linear detector (MMSE) in the lattice reduced domain and generate a candidate list with different paths.
• Transform the solutions in the lattice reduced domain back into the original domain.

4.2 LLL Algorithm

There are several different algorithms for implementing lattice reduction. Gauss reduction is restricted to problems of dimension 2 [22]. Seysen's reduction algorithm achieves the best performance at the cost of high complexity [23], while Brun's algorithm has low complexity but less favorable performance [24]. Among all these algorithms, the Lenstra-Lenstra-Lovász (LLL) algorithm is the most popular one, with satisfactory performance and a polynomial complexity in the problem size.

Some basic lattice theory is essential for understanding the LLL algorithm. A complex lattice $\mathcal{L}$ is a discrete additive subgroup of $\mathbb{Z}_j^{n}$ and can be represented by a set of linearly independent basis vectors [3], where $\mathbb{Z}_j$ denotes the set of complex integers. For example, an m-dimensional complex lattice can be expressed as

$$\mathcal{L} = \left\{\mathbf{x} \;\middle|\; \mathbf{x} = \sum_{n=1}^{m} z_n\mathbf{b}_n,\; z_n \in \mathbb{Z}_j\right\}, \tag{4.2}$$

where $\mathbf{b}_n$ are the basis vectors, $z_n$ are complex integers and m is the dimension of the lattice. An m-dimensional lattice is analogous to an m-dimensional MIMO system, where the columns of the channel matrix are treated as lattice basis vectors and the received signals, in the absence of noise, are points of the corresponding lattice. A lattice remains unchanged when its basis is multiplied by a unimodular integer matrix, so the set of basis vectors for a given lattice is not unique. This provides the foundation of lattice reduction: it finds a new set of basis vectors that are shorter and closer to orthogonal, but still represent the same lattice.

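Any lattice is associated with two fundamental regions:

• Parallelogram: the region spanned by the lattice basis vectors with coefficients between 0 and 1,

$$\mathcal{P}(\mathbf{x}) = \left\{\mathbf{x} \;\middle|\; \mathbf{x} = \sum_{n=1}^{m}\theta_n\mathbf{b}_n,\; 0 \le \theta_n < 1\right\}.$$

• Voronoi region: the set of points that are closer to the origin than to any other lattice point,

$$\mathcal{V}(\mathbf{x}) = \left\{\mathbf{x} \;\middle|\; \|\mathbf{x}\| < \|\mathbf{x}-\mathbf{y}\|,\; \mathbf{y} \in \mathcal{L},\; \mathbf{y} \ne \mathbf{0}\right\}.$$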

Figure 4.1 illustrates the parallelogram and Voronoi regions. For the same lattice there are multiple parallelogram regions, because infinitely many sets of basis vectors exist. In the figure, the red region denotes the parallelogram for the basis $\mathbf{H} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, while the green region denotes the parallelogram for the basis $\mathbf{H} = \begin{pmatrix} 4 & 1 \\ 1 & 0 \end{pmatrix}$. In contrast, the Voronoi region is independent of the basis vectors since it depends only on the lattice.
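That the two bases quoted for figure 4.1 describe the same lattice can be verified by checking that one is obtained from the other through an integer matrix with determinant ±1. The helper name `same_lattice` and the counter-example `B3` are introduced here for illustration only; the sketch covers the real-valued case.

```python
import numpy as np

B1 = np.array([[1, 0], [0, 1]])   # basis of the red parallelogram
B2 = np.array([[4, 1], [1, 0]])   # basis of the green parallelogram
B3 = np.array([[2, 0], [0, 1]])   # hypothetical basis of a different (sparser) lattice

def same_lattice(A, B, tol=1e-9):
    """True if A = B @ T for an integer matrix T with |det T| = 1 (unimodular)."""
    T = np.linalg.solve(B, A)
    is_integer = np.allclose(T, np.round(T), atol=tol)
    return bool(is_integer and np.isclose(abs(np.linalg.det(T)), 1.0))

print(same_lattice(B2, B1))  # the two bases from the figure
print(same_lattice(B3, B1))  # scaled basis, not unimodularly related
```

The first check succeeds because the change-of-basis matrix has integer entries and determinant -1, while the second fails because its determinant is 2.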

These two regions carry over to MIMO detection by analogy. For MLD, the optimal solution under white Gaussian noise is to find the lattice point nearest to the received signal, which is exactly what the Voronoi region indicates. For a linear detector, the white noise is affected by the detection weight as $s_{ld} = s + wn$, so the corresponding decision region is the parallelogram formed by the columns of the channel matrix. From this perspective, the LLL algorithm transforms the parallelogram region into a Voronoi-like region by forming a set of shorter and more orthogonal basis vectors, which leads to better performance.


Figure 4.1: Lattice Parallelogram and Voronoi regions

Definition 1 The matrix H is called LLL reduced if the following conditions are satisfied:

$$|r_{k,l}| \le \frac{1}{2}|r_{k,k}|, \quad \text{for } 1 \le k < l \le n,$$

$$\delta\,|r_{l-1,l-1}|^2 \le |r_{l,l}|^2 + |r_{l-1,l}|^2, \quad \text{for } 2 \le l \le n,$$

where $r_{k,l}$ is the entry of matrix R from the QR decomposition H = QR and $\delta$ is a parameter in the interval $[\frac{1}{4}, 1]$.

The original LLL algorithm consists of three major parts and is shown in detail in algorithm 1:

• Gram-Schmidt orthogonalization (QR decomposition)
• Size reduction
• Lovász condition

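The first step of the LLL algorithm is to find a set of Gram-Schmidt orthogonal basis vectors for the channel matrix. Similar to the rationale of the zero forcing detector, the new linearly independent basis is obtained by projecting each vector onto the direction orthogonal to the subspace of the previous vectors. For a channel matrix $\mathbf{H} = [\mathbf{h}_1, \mathbf{h}_2, \cdots, \mathbf{h}_n]$, the new basis can be calculated as

$$\boldsymbol{\phi}_1 = \mathbf{h}_1,$$
$$\boldsymbol{\phi}_2 = \mathbf{h}_2 - \frac{\mathbf{h}_2\cdot\boldsymbol{\phi}_1}{\|\boldsymbol{\phi}_1\|^2}\boldsymbol{\phi}_1,$$
$$\boldsymbol{\phi}_3 = \mathbf{h}_3 - \frac{\mathbf{h}_3\cdot\boldsymbol{\phi}_1}{\|\boldsymbol{\phi}_1\|^2}\boldsymbol{\phi}_1 - \frac{\mathbf{h}_3\cdot\boldsymbol{\phi}_2}{\|\boldsymbol{\phi}_2\|^2}\boldsymbol{\phi}_2,$$
$$\vdots$$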


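This process can be easily implemented by a QR decomposition, where Q is an orthonormal matrix and R is an upper triangular matrix:

$$\mathbf{H} = [\mathbf{h}_1, \mathbf{h}_2, \cdots, \mathbf{h}_n] = \mathbf{Q}\mathbf{R} = [\boldsymbol{\phi}_1, \boldsymbol{\phi}_2, \cdots, \boldsymbol{\phi}_n]\begin{bmatrix} \mathbf{h}_1\cdot\boldsymbol{\phi}_1 & \mathbf{h}_2\cdot\boldsymbol{\phi}_1 & \cdots & \mathbf{h}_n\cdot\boldsymbol{\phi}_1 \\ 0 & \mathbf{h}_2\cdot\boldsymbol{\phi}_2 & \cdots & \mathbf{h}_n\cdot\boldsymbol{\phi}_2 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \mathbf{h}_n\cdot\boldsymbol{\phi}_n \end{bmatrix}. \tag{4.3}$$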

The second step is the so-called size reduction operation. It checks the condition $|r_{k,l}| \le \frac{1}{2}|r_{k,k}|$ row by row, so that the projection of $\mathbf{h}_l$ onto the kth orthogonal direction, for k < l, is no longer than half the length of the projection of $\mathbf{h}_k$ onto the direction orthogonal to the subspace spanned by the preceding basis vectors. If the condition is not satisfied, the length of the corresponding basis vector is reduced further. Matrix T is used to record the size reduction operations. A more intuitive explanation is that subtracting integer multiples of other columns from one column makes the channel matrix more orthogonal.

The final step of LLL is to check the Lovász condition. If this condition is not fulfilled, the (k-1)th basis vector is not short enough, and swapping the two columns is necessary to make it shorter in subsequent size reductions. After each swap, the counter goes back to the previous stage and size reduction is performed again from there.

Algorithm 1: Lenstra-Lenstra-Lovász Algorithm [7]
Result: $\tilde{\mathbf{H}} = \tilde{\mathbf{Q}}\tilde{\mathbf{R}}$, $\mathbf{T}$
 1  initialization: $\mathbf{Q} \in \mathbb{C}^{n\times m}$, $\mathbf{R} \in \mathbb{C}^{m\times m}$, $\mathbf{T} = \mathbf{I}_m$, k = 2;
 2  while k ≤ m do
 3      for l = k-1 to 1 step -1 do
 4          μ = round(R(l,k) / R(l,l));
 5          R(1:l, k) = R(1:l, k) - μ R(1:l, l);
 6          T(:, k) = T(:, k) - μ T(:, l);
 7      end
 8      if δ |R(k-1,k-1)|^2 > |R(k,k)|^2 + |R(k-1,k)|^2 then
 9          Swap columns k-1 and k in R and T;
10          Calculate $\boldsymbol{\Theta} = \begin{bmatrix} \bar{\alpha} & \bar{\beta} \\ -\beta & \alpha \end{bmatrix}$ with $\alpha = \frac{R(k-1,k-1)}{\|R(k-1:k,\,k-1)\|}$, $\beta = \frac{R(k,k-1)}{\|R(k-1:k,\,k-1)\|}$;
11          R(k-1:k, k-1:m) = Θ R(k-1:k, k-1:m);
12          Q(:, k-1:k) = Q(:, k-1:k) Θ^H;
13          k = max(k-1, 2);
14      else
15          k = k+1;
16      end
17  end
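The steps of algorithm 1 can be sketched in NumPy as follows. This is a minimal illustration, not the implementation used in the thesis: indices are 0-based, the function name `clll`, the default δ = 0.75 and the random test channel are assumptions, and the QR factors come from `numpy.linalg.qr` rather than an explicit Gram-Schmidt procedure.

```python
import numpy as np

def clll(H, delta=0.75):
    """Complex LLL following the structure of algorithm 1 (0-based indices)."""
    H = np.asarray(H, dtype=complex)
    m = H.shape[1]
    Q, R = np.linalg.qr(H)               # reduced QR: Q is n x m, R is m x m
    T = np.eye(m, dtype=complex)
    k = 1                                # k = 2 in the 1-based listing
    while k < m:
        # Size reduction of column k against columns k-1, ..., 0.
        for l in range(k - 1, -1, -1):
            q = R[l, k] / R[l, l]
            mu = np.round(q.real) + 1j * np.round(q.imag)
            if mu != 0:
                R[: l + 1, k] -= mu * R[: l + 1, l]
                T[:, k] -= mu * T[:, l]
        # Lovasz condition; swap and re-triangularize if it is violated.
        if delta * abs(R[k - 1, k - 1]) ** 2 > abs(R[k, k]) ** 2 + abs(R[k - 1, k]) ** 2:
            R[:, [k - 1, k]] = R[:, [k, k - 1]]
            T[:, [k - 1, k]] = T[:, [k, k - 1]]
            a, b = R[k - 1, k - 1], R[k, k - 1]
            nrm = np.hypot(abs(a), abs(b))
            alpha, beta = a / nrm, b / nrm
            Theta = np.array([[np.conj(alpha), np.conj(beta)], [-beta, alpha]])
            R[k - 1 : k + 1, k - 1 :] = Theta @ R[k - 1 : k + 1, k - 1 :]
            Q[:, k - 1 : k + 1] = Q[:, k - 1 : k + 1] @ Theta.conj().T
            k = max(k - 1, 1)
        else:
            k += 1
    return Q, R, T

# Hypothetical i.i.d. Rayleigh test channel.
rng = np.random.default_rng(1)
H = (rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))) / np.sqrt(2)
Q, R, T = clll(H)
H_red = H @ T                            # reduced channel, equal to Q @ R
```

The returned T has Gaussian-integer entries and a determinant of unit magnitude, so `H @ T` spans the same lattice as `H`.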


4.3 Fixed-complexity LLL Algorithm

The original LLL algorithm is guaranteed to produce a new channel matrix whose shortest basis vector is no longer than a bounded multiple of the shortest vector in the lattice [3]. However, the complexity of the LLL lattice reduction algorithm has no upper bound in the worst case, although the average complexity is polynomial in the problem size [25]. This is similar to the complexity problem of sphere decoding and is extremely unfavorable in practice.

The unbounded complexity comes from step 13 in algorithm 1. After each swap, the flow-control counter k is decreased to start another loop, which increases the complexity of the algorithm. An unpredictable flow is also difficult to implement efficiently in hardware. The flow chart is shown in figure 4.2.

Figure 4.2: Unbounded loops in LLL algorithm

One natural step is to limit the number of loops in each run of the LLL algorithm. More specifically, a single loop can be extracted from the original LLL algorithm so that the loop flow is pre-defined when the algorithm is executed. To achieve this, step 13 of the original algorithm needs to be eliminated, and the operations on matrix Q are shifted to matrix H so that the desired new channel matrix is output directly. The extracted single LLL loop is given in algorithm 2.

Compared with algorithm 1, the single LLL loop has a fixed flow control, with k running from 2 to m. Repetition of this single loop is then necessary to obtain a new channel matrix with good orthogonality. To determine how many loops are needed, simulations based on simple i.i.d. Rayleigh fading channels are conducted, with the result shown in figure 4.3.

Algorithm 2: Single LLL Loop
Result: $\tilde{\mathbf{H}}$, $\mathbf{T}$
 1  initialization: $\mathbf{Q} \in \mathbb{C}^{n\times m}$, $\mathbf{R} \in \mathbb{C}^{m\times m}$, $\mathbf{T} = \mathbf{I}_m$, k = 2;
 2  for k = 2 to m step 1 do
 3      for l = k-1 to 1 step -1 do
 4          μ = round(R(l,k) / R(l,l));
 5          R(1:l, k) = R(1:l, k) - μ R(1:l, l);
 6          T(:, k) = T(:, k) - μ T(:, l);
 7          H(:, k) = H(:, k) - μ H(:, l);
 8      end
 9      if δ |R(k-1,k-1)|^2 > |R(k,k)|^2 + |R(k-1,k)|^2 then
10          Swap columns k-1 and k in H, R and T;
11          Calculate $\boldsymbol{\Theta} = \begin{bmatrix} \bar{\alpha} & \bar{\beta} \\ -\beta & \alpha \end{bmatrix}$ with $\alpha = \frac{R(k-1,k-1)}{\|R(k-1:k,\,k-1)\|}$, $\beta = \frac{R(k,k-1)}{\|R(k-1:k,\,k-1)\|}$;
12          R(k-1:k, k-1:m) = Θ R(k-1:k, k-1:m);
13      end
14  end

As introduced later in section 4.5, under the MMSE criterion the process of lattice reduction is affected by the SNR, while it remains unchanged under the ZF criterion. Figure 4.3 shows the cumulative distribution function of the number of basis swaps under different conditions, which also indicates how many LLL loops are needed for each realization of the LLL algorithm. It is clear that 6 loops cover 90% of all realizations to achieve full performance, and even 4 loops are enough for the MMSE case in the low SNR region. In fact, [16] points out that 4 loops of the fixed-complexity LLL algorithm together with the MMSE detector achieve nearly the same performance as the original algorithm.

Despite this fixed-complexity improvement, the LLL algorithm still has a very high complexity. Further steps have to be taken to reduce the complexity before the algorithm can be applied in a practical scenario.


4.4 Iterative LLL Algorithm

As introduced in chapter 2, a certain number of PRBs are allocated to each UE in an OFDMA system. In each RE, the channel state information is assumed to be stable during one transmission. This means that the time and frequency resources are split into many blocks, and the channel coherence inside one block is exploited to transmit information. However, the frequency coherence between different REs is not utilized.

Lattice reduction focuses on acquiring a channel matrix with a better condition number. Correlations between different columns of the channel matrix, which lead to a poor condition number, strongly depend on the positions of all UEs. Keep in mind that in the time domain one subframe lasts only 1 ms, and UEs are not likely to change position dramatically in such a short time. As a result, the frequency coherence between consecutive REs can be taken into consideration. The idea is inspired by the iterative approach of least mean square estimation, where the estimation coefficients are updated iteratively for each new data input: the lattice reduction result at each subcarrier, matrix T, can be updated in the same way by reusing the result from its neighboring subcarrier. A flowchart of this iterative lattice reduction is shown in figure 4.4.

Figure 4.4: Flow chart of fixed-complexity and iterative LLL algorithm

Instead of initializing matrix T as an identity matrix, matrix T is stored after the LLL algorithm in each RE and used in the next RE. During each detection, the channel matrix is first multiplied by the reused matrix T and then single LLL loops are applied to it. Simulation results in chapter 5 show that, in a practical scenario, performance is maintained with nearly no loss.
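The reuse of T across neighboring REs can be sketched as below. This is an illustrative reading of the description, not the thesis code: `single_lll_loop` mirrors the fixed flow of algorithm 2 with 0-based indices, `iterative_lr` carries T from one subcarrier to the next, and the slowly varying synthetic channels, δ = 0.75 and all names are assumptions.

```python
import numpy as np

def single_lll_loop(H, T, delta=0.75):
    """One fixed-flow pass (k = 2..m in 1-based terms) in the spirit of algorithm 2."""
    H = np.array(H, dtype=complex)
    T = np.array(T, dtype=complex)
    m = H.shape[1]
    R = np.linalg.qr(H, mode="r")        # only the triangular factor is needed
    for k in range(1, m):
        for l in range(k - 1, -1, -1):
            q = R[l, k] / R[l, l]
            mu = np.round(q.real) + 1j * np.round(q.imag)
            R[: l + 1, k] -= mu * R[: l + 1, l]
            T[:, k] -= mu * T[:, l]
            H[:, k] -= mu * H[:, l]
        if delta * abs(R[k - 1, k - 1]) ** 2 > abs(R[k, k]) ** 2 + abs(R[k - 1, k]) ** 2:
            for M in (H, R, T):
                M[:, [k - 1, k]] = M[:, [k, k - 1]]
            a, b = R[k - 1, k - 1], R[k, k - 1]
            Theta = np.array([[np.conj(a), np.conj(b)], [-b, a]]) / np.hypot(abs(a), abs(b))
            R[k - 1 : k + 1, k - 1 :] = Theta @ R[k - 1 : k + 1, k - 1 :]
    return H, T

def iterative_lr(channels):
    """Apply one single loop per RE, starting from the T of the previous RE."""
    T = np.eye(channels[0].shape[1], dtype=complex)
    results = []
    for Hk in channels:
        H_red, T = single_lll_loop(Hk @ T, T)   # reuse the neighbor's T
        results.append((H_red, T.copy()))
    return results

# Hypothetical slowly varying channels over 8 neighboring subcarriers.
rng = np.random.default_rng(2)
crandn = lambda *s: (rng.standard_normal(s) + 1j * rng.standard_normal(s)) / np.sqrt(2)
H0, D = crandn(6, 4), crandn(6, 4)
channels = [H0 + 0.05 * i * D for i in range(8)]
results = iterative_lr(channels)
```

Because only column operations and swaps are applied, each stored T stays unimodular and each reduced matrix equals the current channel multiplied by the accumulated T.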


4.5 Lattice Reduction Aided Linear Detector

After lattice reduction, the new system can be modeled as in equation 4.1, where z represents the lattice points in the lattice reduced domain:

$$\mathbf{y} = \mathbf{H}\mathbf{T}\mathbf{T}^{-1}\mathbf{s} + \mathbf{n} = \tilde{\mathbf{H}}\mathbf{z} + \mathbf{n}.$$

Conventional linear detectors benefit from less noise amplification when applied directly to this new model. The MMSE detector generally performs better than the ZF detector, but the optimal strategy for applying the MMSE detector in lattice reduction is not so straightforward. One obvious solution is

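$$\hat{\mathbf{z}}_{\mathrm{mmse}} = (\tilde{\mathbf{H}}^{H}\tilde{\mathbf{H}} + \sigma_n^2\mathbf{T}^{H}\mathbf{T})^{-1}\tilde{\mathbf{H}}^{H}\mathbf{y}. \tag{4.4}$$

However, it is pointed out that this solution does not give the best performance [7]. Another approach is to use the uniform framework for linear detectors. By extending the channel matrix and applying the lattice reduction algorithm to the extended matrix, a better performance can be achieved, since the MMSE criterion is introduced into the process of lattice reduction:

$$\underline{\mathbf{H}} = \begin{bmatrix} \mathbf{H} \\ \sigma_n\mathbf{I}_m \end{bmatrix} \quad \text{and} \quad \underline{\mathbf{y}} = \begin{bmatrix} \mathbf{y} \\ \mathbf{0}_{m,1} \end{bmatrix}.$$

Then the ZF detection structure can be used on the extended matrix:

$$\tilde{\mathbf{z}}_{\mathrm{mmse}} = (\underline{\tilde{\mathbf{H}}}^{H}\underline{\tilde{\mathbf{H}}})^{-1}\underline{\tilde{\mathbf{H}}}^{H}\underline{\mathbf{y}} = \rho\mathbf{T}^{-1}\mathbf{s} + \tilde{\mathbf{n}}, \tag{4.5}$$

where $\tilde{\mathbf{n}}$ denotes the filtered noise. Equation 4.5 indicates that, in order to get the final detection results, a transformation is necessary to map the results from the lattice reduced domain z back to the original domain s. Two problems then arise, and extra steps are needed to solve them.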

The first problem is the shift and scaling compensation. According to the lattice theory in section 4.2, lattice reduction only works for the set of infinite consecutive integers. For M-QAM modulation, a shift and scaling of the transmitted signals is therefore needed:

$$\mathbf{s} := \frac{\mathbf{s} - (1+j)}{2}.$$

There are two possible ways to carry out this operation. The first option is to compensate for the shift and scaling during detection, which is adopted in this thesis and presented later. The second option is to incorporate the scaling into the channel matrix and the shift into the received signals.

The second problem is the shaping problem. The major gain of the lattice reduction algorithm comes from the fact that multiplication with the MMSE weight causes less noise amplification, so detection in the lattice reduced domain is more reliable. The shaping problem appears when the original constellation points are transformed to the lattice reduced domain and the received signals are quantized there based on the decision region of each transformed constellation point. Because the lattice in the reduced domain consists of infinite consecutive integers, finding the constellation boundary in the reduced domain is a prerequisite for reconstructing the decision regions of points near that boundary. However, this process is again computationally expensive. The sub-optimal strategy is to first quantize the received signal to the nearest integer in the lattice reduced domain regardless of the boundary, then transform the result back into the original domain, and finally quantize it to the nearest constellation point. Mathematically, the whole detection procedure can be described as follows.

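• Quantization in the lattice reduced domain with shift and scaling:

$$\hat{\mathbf{z}} = \mathcal{Q}\left(\frac{\tilde{\mathbf{z}}_{\mathrm{mmse}} - (1+j)\rho\,\mathbf{T}^{-1}\mathbf{1}_{N\times 1}}{2}\right), \tag{4.6}$$

where $\mathcal{Q}$ denotes element-wise quantization to the nearest integers.

• Transformation back to the original domain with shift and scaling:

$$\hat{\mathbf{s}} = 2\mathbf{T}\hat{\mathbf{z}} + (1+j)\mathbf{1}_{N\times 1}. \tag{4.7}$$

• Quantization to the nearest constellation point:

$$\tilde{\mathbf{s}} = \mathcal{Q}_c(\hat{\mathbf{s}}), \tag{4.8}$$

where $\mathcal{Q}_c$ denotes element-wise quantization to the nearest constellation point.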

To clarify the various notations, $\tilde{\mathbf{z}}_{\mathrm{mmse}}$ denotes the decision statistic obtained through the detection process in 4.5; $\hat{\mathbf{z}}$ denotes the quantized integer vector in the lattice reduced domain; $\hat{\mathbf{s}}$ denotes the integer vector in the original domain and $\tilde{\mathbf{s}}$ denotes the nearest constellation points with respect to $\hat{\mathbf{s}}$.
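The three detection steps can be sketched end to end as follows. The sketch assumes an unnormalized 16QAM alphabet with odd-integer levels (so ρ = 1), takes the unimodular matrix T as an input rather than computing it, and uses a hand-picked T and a near-noiseless test case; the function name and all test values are illustrative assumptions.

```python
import numpy as np

LEVELS = np.array([-3.0, -1.0, 1.0, 3.0])   # assumed unnormalized 16QAM levels per dimension

def nearest_level(x):
    """Element-wise quantization of real values to the nearest constellation level."""
    return LEVELS[np.argmin(np.abs(x[:, None] - LEVELS[None, :]), axis=1)]

def lr_aided_mmse_detect(y, H, T, sigma_n):
    """Extended-matrix ZF (MMSE) in the reduced domain, then shift/scale and map back."""
    n_t = H.shape[1]
    H_ext = np.vstack([H, sigma_n * np.eye(n_t)])       # extended channel matrix
    y_ext = np.concatenate([y, np.zeros(n_t)])          # extended received vector
    z_tilde = np.linalg.pinv(H_ext @ T) @ y_ext         # decision statistic
    ones = np.ones(n_t)
    arg = (z_tilde - (1 + 1j) * (np.linalg.inv(T) @ ones)) / 2   # shift and scaling
    z_hat = np.round(arg.real) + 1j * np.round(arg.imag)         # integer quantization
    s_hat = 2 * (T @ z_hat) + (1 + 1j) * ones                    # back to original domain
    return nearest_level(s_hat.real) + 1j * nearest_level(s_hat.imag)

# Hypothetical test case: tall random channel, hand-picked unimodular T, no noise.
rng = np.random.default_rng(3)
H = (rng.standard_normal((6, 4)) + 1j * rng.standard_normal((6, 4))) / np.sqrt(2)
T = np.eye(4, dtype=complex)
T[0, 1] = 1 + 1j
T[2, 3] = -1
s = np.array([1 + 3j, -3 - 1j, 3 + 1j, -1 - 3j])
s_det = lr_aided_mmse_detect(H @ s, H, T, sigma_n=1e-6)
```

In a full receiver, T would come from a lattice reduction of the extended matrix, and the final clipping step is what handles points that fall outside the constellation after the inverse transform.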

4.6 Dual LLL Reduction

As pointed out in 3.11 and 3.12, the post SNR of a linear detector is inversely proportional to the squared length of the rows of the pseudoinverse of the channel matrix. In order to make this post SNR as high as possible, it is also feasible to apply the LLL lattice reduction algorithm to this pseudoinverse matrix, which can be expressed as

$$\mathbf{B} = (\mathbf{H}^{\dagger})^{H}\mathbf{T}, \tag{4.9}$$

where $\mathbf{H}^{\dagger}$ denotes the pseudoinverse of the channel matrix.

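Directly correlating this transformed matrix with the received signals then yields the system model of the dual LLL algorithm:

$$\mathbf{B}^{H}\mathbf{y} = ((\mathbf{H}^{\dagger})^{H}\mathbf{T})^{H}\mathbf{H}\mathbf{s} + \mathbf{B}^{H}\mathbf{n} = \mathbf{T}^{H}\mathbf{H}^{\dagger}\mathbf{H}\mathbf{s} + \mathbf{B}^{H}\mathbf{n} = \mathbf{T}^{H}\mathbf{s} + \mathbf{B}^{H}\mathbf{n} = \mathbf{z} + \mathbf{B}^{H}\mathbf{n}. \tag{4.10}$$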
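The chain of equalities in 4.10 relies on the pseudoinverse satisfying $\mathbf{H}^{\dagger}\mathbf{H} = \mathbf{I}$ for a full-column-rank channel. A short numerical check, with a random channel and a hand-picked unimodular T that are both assumptions made for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
H = (rng.standard_normal((6, 4)) + 1j * rng.standard_normal((6, 4))) / np.sqrt(2)

# Hypothetical unimodular matrix (upper triangular with unit diagonal).
T = np.eye(4, dtype=complex)
T[0, 2] = 2 - 1j
T[1, 3] = 1j

B = np.linalg.pinv(H).conj().T @ T      # dual basis as in 4.9
effective = B.conj().T @ H              # effective channel seen after correlating with B
```

The effective channel equals $\mathbf{T}^{H}$, so the noiseless part of $\mathbf{B}^{H}\mathbf{y}$ is the integer-transformed vector $\mathbf{T}^{H}\mathbf{s}$.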

Following the standard detection procedure for the iterative LLL algorithm, the quantization in the lattice reduced domain stays the same. However, the matrix T differs from the one used before, so the transformation from the lattice reduced domain to the original domain is given as

$$\hat{\mathbf{z}} = \mathcal{Q}\left(\frac{\tilde{\mathbf{z}}_{\mathrm{mmse}} - (1+j)\mathbf{T}^{H}\mathbf{1}_{N\times 1}}{2}\right), \tag{4.11}$$

$$\hat{\mathbf{s}} = 2\,(\mathbf{T}^{\dagger})^{H}\hat{\mathbf{z}} + (1+j)\mathbf{1}_{N\times 1}. \tag{4.12}$$

The final step is exactly the same as before: quantize to the nearest constellation point with respect to $\hat{\mathbf{s}}$. It is shown later in chapter 5 that, compared with the normal LLL algorithm, the dual LLL algorithm performs better when the number of antennas at the transmitter and receiver sides is larger than 4.

4.7 Log-likelihood Ratio Calculation in Lattice Reduction

The second equality comes from the fact that the interleaver in the turbo encoder makes the bits independent of each other. $\mathbf{x}_{k,0/1}$ denotes the symbol vectors of all UEs whose kth bit is 0/1.

Next, the likelihood function of y given the transmitted symbol vector can be derived directly from the MIMO system model:

$$p(\mathbf{y}\mid\mathbf{s}) = \frac{\exp\left[-\frac{1}{2\sigma^2}\|\mathbf{y}-\mathbf{H}\mathbf{s}\|^2\right]}{(2\pi\sigma^2)^{N}}. \tag{4.14}$$

Obviously it is not feasible to enumerate all possible symbol vectors and apply 4.14 in 4.13. Instead, only a limited number of symbol vectors are picked out, and the Jacobian logarithm approximation 3.18 is applied to decrease the computational complexity.

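Given that the last term of the Jacobian logarithm is negligible, the max-log LLR calculation becomes

$$L(x_k\mid\mathbf{y}) = \frac{1}{2}\max_{\mathbf{s}\in S_{k,1}}\left\{-\frac{1}{\sigma^2}\|\mathbf{y}-\mathbf{H}\mathbf{s}\|^2\right\} - \frac{1}{2}\max_{\mathbf{s}\in S_{k,0}}\left\{-\frac{1}{\sigma^2}\|\mathbf{y}-\mathbf{H}\mathbf{s}\|^2\right\}. \tag{4.15}$$

For this purpose, the LLR calculation in the lattice reduction algorithm can be divided into two parts.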

• Generate a candidate list consisting of various possible transmitted symbol vectors.
• Calculate the LLR for each bit based on the candidate list and 4.15.

Figure 4.5: Generation of multiple transmitted symbol vectors

Generating candidate symbol vectors in lattice reduction is based on the quantization error in the lattice reduced domain, since ideally the effect of noise is eliminated by the quantization process in that domain. As shown in figure 4.5, $z_{\mathrm{mmse}}$ is the weighted signal, while $z_1$ is the corresponding quantized integer. In addition to the nearest integer, neighboring points are also used, chosen according to the direction of the quantization error:

$$\hat{z} = [\,z_1,\; z_1 + \mathrm{Re}\{\epsilon\},\; z_1 + j\,\mathrm{Im}\{\epsilon\}\,], \tag{4.16}$$

where $\epsilon$ represents the quantization error. In this manner, each user has 3 different lattice points in the lattice reduced domain, which represent the 3 most likely transmitted symbols.

For a MIMO system with 4 transmitters, the number of different combinations of lattice points in the lattice reduced domain is $3^4 = 81$. Ideally, 81 different vectors are obtained when those combinations are transformed back into the original domain. In the transformation process, however, different vectors in the lattice reduced domain might result in the same vector in the original domain, due to the shaping problem and the second quantization described in 4.8.
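The candidate construction of 4.16 and the resulting $3^4 = 81$ combinations can be sketched as follows. The offsets are interpreted here as unit steps in the direction of the real and imaginary parts of the quantization error, which is an assumption about 4.16; the function name and the example decision statistic are also illustrative.

```python
import itertools
import numpy as np

def lattice_candidates(z_tilde):
    """Per-user candidates {z1, z1 + step_re, z1 + j*step_im} and all their combinations."""
    z1 = np.round(z_tilde.real) + 1j * np.round(z_tilde.imag)
    eps = z_tilde - z1                       # quantization error per user
    per_user = []
    for z, e in zip(z1, eps):
        step_re = 1.0 if e.real >= 0 else -1.0   # assumed: unit step toward the error
        step_im = 1.0 if e.imag >= 0 else -1.0
        per_user.append((z, z + step_re, z + 1j * step_im))
    # Cartesian product over users: 3^N candidate vectors in the reduced domain.
    return np.array(list(itertools.product(*per_user)))

# Hypothetical decision statistic for 4 users in the lattice reduced domain.
z_tilde = np.array([0.3 + 0.2j, -1.4 + 0.6j, 2.1 - 0.7j, 0.49 + 0.51j])
cands = lattice_candidates(z_tilde)
print(cands.shape)
```

Each row would then be mapped back with 4.7 and 4.8; duplicates that appear after that mapping can be removed before evaluating 4.15.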

Then 4.15 can be applied to the different vectors to obtain the LLR value of each bit. These LLR values are later fed into the turbo decoder introduced in chapter 2 to get the final detection result for a coded system.


Chapter 5

Simulations and Results

5.1 Simulation Setup

The simulations in this thesis project are carried out in two radio simulators. The first simulator supports simulations of uncoded MIMO systems under an i.i.d. Rayleigh fading channel. It is used for validating the performance of the presented lattice reduction algorithms and for studying the influence of different system dimensions and modulation schemes.

For each simulation in this simulator, 200000 realizations are used to obtain a statistical result. The simulator setup for the uncoded system is shown in table 5.1.

Parameters               Settings
Modulations              QPSK/16QAM/64QAM
MIMO (N_T x N_R)         4x4, 6x6, 8x8
Channel                  i.i.d. Rayleigh fading
Channel Estimation       Ideal
Channel correlation      Supported
Realization              200000

Table 5.1: Parameter settings in uncoded systems
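To make the uncoded setup concrete, the following is a minimal Monte Carlo sketch for a ZF detector with QPSK over an i.i.d. Rayleigh channel. It is not the simulator used in the thesis: the SNR definition (unit symbol energy over noise variance), the default number of realizations, the bit mapping and the function name are all assumptions.

```python
import numpy as np

def zf_qpsk_ber(snr_db, n_t=4, n_r=4, n_real=2000, seed=0):
    """Estimate the BER of ZF detection for QPSK over i.i.d. Rayleigh fading."""
    rng = np.random.default_rng(seed)
    n0 = 10 ** (-snr_db / 10)                 # assumed: noise variance for unit symbol energy
    errors = 0
    for _ in range(n_real):
        bits = rng.integers(0, 2, size=(n_t, 2))
        s = ((1 - 2 * bits[:, 0]) + 1j * (1 - 2 * bits[:, 1])) / np.sqrt(2)
        H = (rng.standard_normal((n_r, n_t)) + 1j * rng.standard_normal((n_r, n_t))) / np.sqrt(2)
        n = np.sqrt(n0 / 2) * (rng.standard_normal(n_r) + 1j * rng.standard_normal(n_r))
        s_hat = np.linalg.pinv(H) @ (H @ s + n)   # zero forcing
        bits_hat = np.stack([s_hat.real < 0, s_hat.imag < 0], axis=1).astype(int)
        errors += np.count_nonzero(bits_hat != bits)
    # BER = number of bit errors / number of transmitted bits
    return errors / (n_real * n_t * 2)

ber_low_snr = zf_qpsk_ber(0.0, n_real=1000)
ber_high_snr = zf_qpsk_ber(30.0, n_real=1000)
```

Replacing the pseudoinverse step with an MMSE or lattice reduction aided detector gives the other curves of such a comparison; far more realizations than used here are needed to resolve low BER values.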

Ideal channel estimation means that the receiver knows the channel state information exactly. The channel correlation refers to a pre-defined matrix concatenated with the channel matrix at the receiver side, which reflects how the transmitted data streams interfere with each other. In the simulation, a widely used correlation matrix for the 4x4 MIMO system is adopted:

$$\mathbf{J} = \begin{Bmatrix} 1 & 0.01+0.7i & 0.47-0.08i & 0.19-0.26i \\ 0.01-0.7i & 1 & 0.01+0.7i & 0.47-0.08i \\ 0.47+0.08i & 0.01-0.7i & 1 & 0.01+0.7i \\ 0.19+0.26i & 0.47+0.08i & 0.01-0.7i & 1 \end{Bmatrix}. \tag{5.1}$$

The second radio simulator is an internal advanced radio simulator which is capable of simulating more realistic scenarios. It is based on the structure of an OFDMA uplink receiver. Multiple features, including turbo encoder and decoder, spatial channel model, antenna-to-beam transformation and channel estimation, are available in this simulator. The details of the simulator settings used in this thesis project are shown in table 5.2.

Parameters                              Settings
TTI Number                              1000
Data Symbol per TTI                     12
Number of RB                            6
Number of Subcarriers per RB            12
Channel                                 Spatial Channel Model
Channel Estimation                      Ideal/Realistic
Beam Domain Transformation              Supported
Cell number                             1
Number of Horizontal Antenna Port       8
Number of Vertical Antenna Port         4
Cluster Number                          1
Subpath number                          20
BS Subpath Horizontal Angle Spread      15
BS Subpath Vertical Angle Spread        8
BS Subpath delay spread range           170-750 µs
BS Height                               32m
UE Height                               1.5m
UE Position                             Fixed

Table 5.2: General parameter settings in advanced simulator

5.2 Uncoded Systems

The simulation results for quadrature phase shift keying (QPSK) modulation in the uncoded systems are shown in figures 5.1 and 5.2. The performance evaluation is based on the bit error rate (BER), which is defined as the ratio between the number of detection errors and the number of transmitted bits. For comparison, the performance of the different methods is displayed in the same figure as a function of SNR.

• As the number of antennas increases, the performance of the linear detectors improves only slightly because of more severe interference.
• Regardless of the number of antennas in the MIMO system, there is always a large gap between the linear detectors and the lattice reduction aided detectors. At a BER of $10^{-3}$, the performance improvement of the lattice reduction algorithms is around 15dB. This gap can be explained by the reduced noise amplification that results from a more orthogonal lattice reduced channel matrix.
• Comparing the results for the 6x6 and 8x8 MIMO systems, the dual lattice reduction algorithm outperforms the original lattice reduction algorithm. The rationale is that dual lattice reduction is applied directly to the pseudoinverse of the channel matrix, so the basis vectors of this matrix are optimized to be as short as possible. According to 3.11, the post SNR of ZF is directly affected by the basis vector length: shorter basis vectors lead to higher post SNR values and better performance. However, the performance difference is less clear in systems with fewer antennas, where the original LLL algorithm and the dual LLL algorithm have the same effect on the basis vector length.

Figure 5.1: BER performance for QPSK in 4x4 uncoded MIMO system

(a) 6X6 MIMO (b) 8x8 MIMO


For 16QAM, the simulation results for MIMO systems with antenna numbers ranging from 4 to 8 are shown in figures 5.3 and 5.4.

• The BER performance of 16QAM is worse than that of the QPSK system. At a BER of $10^{-2}$, the linear detector requires 27dB, while the lattice reduction aided detector requires around 18dB. Again, it is obvious from the results that the lattice reduction aided detectors achieve diversity, while the diversity order of the linear detectors remains 1.

• A high order modulation scheme is more sensitive to noise, so the performance difference between the ZF detector and the MMSE detector becomes minor in 16QAM. However, there is still a gain of around 1dB or more between the lattice reduction aided ZF and MMSE detectors for BER below $10^{-3}$.

Figure 5.3: BER performance for 16QAM in 4x4 uncoded MIMO system


For 64QAM, similar simulation results are shown in figures 5.5 and 5.6.

• For 64QAM modulation, the difference between the ZF detector and the MMSE detector vanishes, and the overall performance again decreases compared with lower order constellations. The lattice reduction aided linear detectors still yield a 10dB performance gain compared with the conventional linear detectors.

• The dual lattice reduction algorithm shows a larger performance improvement in the 8x8 MIMO system. In situations where the MMSE criterion does not help the BER performance, dual lattice reduction can be a good candidate algorithm for detection in systems with a larger number of antennas and high order constellations.

Figure 5.5: BER performance for 64QAM in 4x4 uncoded MIMO system


Simulation results in figure 5.7 show the BER performance when the correlation matrix 5.1 is applied in the simulation. Adding this correlation matrix introduces more interference between the data streams, which leads to a large performance degradation. As can be observed in the figure, at a BER of $10^{-1}$ the MMSE detector has around 8dB performance loss, while the lattice reduction aided MMSE detector only has 5dB performance loss, since it takes interference into consideration as well.

Figure 5.7: Correlation influence in 4x4 16QAM uncoded MIMO systems

5.3 Coded System

The impact of different modulation schemes and numbers of antennas on the lattice reduction algorithm has been studied in the previous section. It is shown that the MMSE detector outperforms the ZF detector in all simulation scenarios, and that the differences between the ZF and MMSE detectors and their lattice reduction aided versions are quite similar. So in this section, simulation results for a more realistic scenario are presented mainly for the MMSE and lattice reduction aided MMSE detectors.

Parameters                                Settings
Modulation and Coding Scheme Index        20
MIMO (N_R x N_T)                          4x4, 8x4
Beam Domain Transformation                True
Channel Estimation                        Ideal
UE Position                               Fixed Position

Table 5.3: Detailed parameter settings in advanced simulator


For the following simulations, the detailed parameter settings are shown in table 5.3, while the general settings can be found in table 5.2. The UE positions are carefully selected so that interference is introduced into the system, and ideal channel information is used at the receiver side. Only results for the iterative LLL lattice reduction algorithm are presented here, since iterative lattice reduction yields nearly no performance loss compared with the original algorithm, as will be shown later.

As can be seen from figure 5.8, the lattice reduction aided MMSE algorithm still achieves diversity and performs much better than the linear MMSE detector in the realistic scenario. At an FER of $10^{-1}$, the performance gain is over 5dB.

Figure 5.8: FER performance for 16QAM in 4x4 coded MIMO systems

For the 8x4 MIMO system, the linear detector attains 6dB at an FER of $10^{-1}$, which is much better than the performance of the 4x4 MIMO system. On top of this improvement, the lattice reduction aided MMSE detector brings a further gain of 2dB.

Figure 5.10 shows the performance difference between the iterative LLL algorithm and the original LLL algorithm in an 8x4 OFDMA system. Both algorithms achieve diversity, and the maximum performance loss over all FER levels is only 0.1dB, which is negligible. This is due to the frequency coherence between different REs: the channel information stays the same or changes slowly over neighboring subcarriers within a relatively short time. Although one single LLL loop is not enough to obtain a new orthogonal channel matrix for the first several REs, matrix T quickly reaches the optimal solution, which results in optimal detection performance in the remaining REs. As a consequence, the overall system performance is hardly affected by the iterative version of the LLL algorithm.


Figure 5.9: FER performance for 16QAM in 8x4 coded MIMO systems

Figure 5.10: Performance comparison between original and iterative LLL algorithm


5.4 Complexity

The complexity of the LLL algorithm in the worst case is proven to be unbounded [16]. However, it is verified that the average number of column swaps is smaller than $O(n^2\log n)$, where n is the dimension of the MIMO system. This supports the conclusion that the average complexity of the original LLL algorithm is upper-bounded by $O(n^4\log n)$ [5]. In the fixed-complexity lattice reduction algorithm and the iterative lattice reduction algorithm, the minimum unit is one single LLL loop. Tables 5.4 and 5.5 give a detailed description of the complexity of these two algorithms in terms of real multiplications and additions. From an engineering perspective, the complexity is counted for the worst case, where all steps in algorithm 2 are added up regardless of whether the condition in step 9 is fulfilled or not.

Algorithm        Steps                                  Real Multiplication                    4x4 16QAM, M=81
Original LLL     4 LLL loops                            4(14/3 N^3 + 4N^2 - 26/3 N)            1312
Iterative LLL    HT                                     4N^3                                   256
                 Single LLL Loop                        4(14/3 N^3 + 4N^2 - 26/3 N)            328
Common Steps     MMSE weight                            4(2N^3 + 1/3 (N^3 - N)) + 4N^3         848
                 QR Decomposition                       4N^3                                   236
                 Correlating Signal with MMSE Weight    4N^2                                   64
                 Shift and Scaling                      4(1/3 N^3 + 2/3 N)                     96
                 Transformation to Original Domain      4N^2 log_3 M                           192

Table 5.4: Real multiplication for lattice reduction algorithm in a NxN MIMO system with M symbol vectors

The iterative LLL algorithm clearly has a much lower complexity than the fixed-complexity LLL algorithm, not to mention the original LLL algorithm, whose worst-case complexity is unbounded. This difference becomes even larger as the system dimension increases, while the performance of these algorithms remains nearly the same in that scenario.
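As a rough illustration, taking the numeric column of table 5.4 at face value for the 4x4 case, the lattice reduction part of the iterative algorithm consists of the HT product plus one single LLL loop, compared with four LLL loops for the fixed-complexity variant. The grouping of rows used here is an interpretation of the table rather than a statement from the text:

$$\underbrace{256}_{\mathbf{H}\mathbf{T}} + \underbrace{328}_{\text{single loop}} = 584 \quad \text{versus} \quad \underbrace{1312}_{\text{4 loops}}, \qquad \frac{584}{1312} \approx 0.45 .$$

Under this reading, the lattice reduction step of the iterative algorithm needs a little under half of the real multiplications of four fixed loops, before the common steps are added to both.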


| Algorithm | Step | Real additions | 4 × 4, 16QAM, M = 81 |
|---|---|---|---|
| Original LLL | 4 LLL loops | $4\left(\frac{14}{3}N^3 + 2N^2 - \frac{20}{3}N\right)$ | 1216 |
| Iterative LLL | $HT$ | $4N^2(N - 1)$ | 192 |
| | Single LLL loop | $\frac{14}{3}N^3 + 2N^2 - \frac{20}{3}N$ | 304 |
| Common steps | MMSE weight | $12N^3 - 12N^2 + \frac{1}{6}(4N^2 - 9N + 5)$ | 716 |
| | QR decomposition | $4N^3 - 3N^2$ | 208 |
| | Correlating signal with MMSE weight | $4N^2$ | 64 |
| | Shift and scaling | $\frac{1}{3}(10N^2 - 9N + 5) + 2NM$ | 443 |
| | Transformation to original domain | $4N^2 \log_3 M$ | 192 |

Table 5.5: Real additions for the lattice reduction algorithms in an N × N MIMO system with M symbol vectors
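As a worked check of the lattice reduction rows for N = 4, the addition counts can be evaluated by hand. The last line adds the HT row and the single-loop row to obtain a per-RE cost for the iterative algorithm; that sum is a reading of the table assumed here, not a figure stated in the thesis.

\[
\begin{aligned}
\text{single LLL loop:}\quad & \tfrac{14}{3}\cdot 4^3 + 2\cdot 4^2 - \tfrac{20}{3}\cdot 4 = \tfrac{896}{3} + 32 - \tfrac{80}{3} = 304,\\
\text{four LLL loops:}\quad & 4 \cdot 304 = 1216,\\
HT:\quad & 4 \cdot 4^2 \cdot (4 - 1) = 192,\\
\text{iterative LLL per RE:}\quad & 192 + 304 = 496.
\end{aligned}
\]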


Chapter 6

Conclusions and Future Work

In this thesis, lattice reduction aided detection algorithms are studied. Simulation results show that lattice reduction aided linear detectors outperform conventional linear detectors both in uncoded systems with i.i.d. Rayleigh fading channels and in more realistic coded OFDMA systems with SCM. The performance improvement is around 10 dB for various MIMO system dimensions in uncoded systems and more than 2 dB in a realistic coded system. An iterative lattice reduction algorithm is proposed to reduce the complexity of the original LLL algorithm. The proposed algorithm operates on each subcarrier in an OFDMA system and exploits frequency coherence to reuse the result obtained from previous subcarriers. Simulation results show that this iterative algorithm incurs nearly no performance loss while reducing the complexity by half.

A future continuation of this work is to study more efficient methods of generating different symbol vectors so that the LLL algorithm can be used in higher-order coded MIMO systems. The method adopted in this thesis is not feasible in systems whose dimension is higher than 8, since it produces too many symbol vectors, which causes too much computational complexity. An alternative is to use a tree search algorithm after lattice reduction. A conventional tree search then needs fewer paths at each level of the search tree because the channel matrix has a better condition number. However, this method still has some problems. Originally, LLR values of different paths can be calculated along with the tree search process, but with the lattice transformation, the Euclidean distance acquired in the lattice-reduced domain cannot be used in the original domain. This leads to a large complexity increase in order to obtain proper LLR values. Finding an algorithm that efficiently calculates LLR values in this scenario could therefore further improve detection performance without drastically increasing computational complexity.

