Department of Electrical Engineering (Institutionen för systemteknik)
Master's Thesis (Examensarbete)
Study of Channel Estimation in MIMO-OFDM Systems for Software Defined Radio
Master's thesis in Computer Engineering carried out at Linköping Institute of Technology (Tekniska högskolan i Linköping)
by
Qi Wang
LITH-ISY-EX--07/3954--SE
Linköping 2007
Department of Electrical Engineering, Linköpings tekniska högskola, Linköpings universitet
Supervisors: Di Wu (ISY, Linköpings universitet) and Johan Eilert (ISY, Linköpings universitet)
Examiner: Dake Liu (ISY, Linköpings universitet)
Division, Department: Division of Computer Engineering, Department of Electrical Engineering, Linköpings universitet, SE-581 83 Linköping, Sweden
Date: 2007-10-19
Language (form options): Swedish, English
Report category (form options): Licentiate thesis, Master's thesis (Examensarbete), C-level essay, D-level essay, Other report
URL for electronic version: http://www.ep.liu.se/2007/3954
ISBN: -
ISRN: LITH-ISY-EX--07/3954--SE
Title of series, numbering:
ISSN: -
Title: Study of Channel Estimation in MIMO-OFDM Systems for Software Defined Radio
Author: Qi Wang
Abstract
The aim of this thesis is to identify the most suitable channel estimation algorithms for an existing MIMO-OFDM SDR platform. Starting with an analysis of several prevalent channel estimation algorithms, their MSE performance is compared under different scenarios. As a result of the hardware-independent analysis, the complex-valued matrix computations involved in the algorithms are decomposed into real FLoating-point OPerations (FLOPs). Four feasible algorithms are selected for hardware-dependent discussion based on the proposed hardware architecture. The computational latency is presented in the form of a case study.
Acknowledgments
First of all, I would like to express my gratitude to Di Wu and Johan Eilert in the Division of Computer Engineering, who came up with the idea and supervised this thesis. Their different working styles, with Di introducing novel inspiration while Johan helped me go deeper, are the most important thing I learned from the thesis work. In addition, special thanks go to Di, who presented the hardware architecture that has been used to expose the hardware-dependent computational latency of the estimation algorithms.
Also, I would like to thank my examiner, Prof. Dake Liu, for providing the opportunity to approach the academic world, and for his useful suggestions and endless encouragement.
Finally, to my parents in China, who support me the most, and to all my friends here and there who always believe in me: no words of thanks are enough.
Contents
1 Introduction
1.1 Motivation
1.2 Objective
1.3 Reading Guidelines
2 MIMO-OFDM System Model
3 Overview of MIMO-OFDM Channel Estimation
3.1 LS vs. MMSE
3.1.1 LS
3.1.2 MMSE
3.2 Time Domain vs. Frequency Domain
3.3 Block-Type Pilot vs. Comb-Type Pilot
3.3.1 Estimation with Block-Type Pilot
3.3.2 Estimation with Comb-Type Pilot
3.4 Iterative vs. Direct Method
4 Algorithms Analysis
4.1 LS Estimator with Block-Type Pilot
4.1.1 FFT-based Time Domain Approach
4.1.2 Matrix Manipulation based Time Domain Approach
4.1.3 Frequency Domain Approach
4.1.4 Optimal Training Sequence Design
4.1.5 Channel Length in Time Domain
4.2 MMSE Estimator with Block-Type Pilot
4.2.1 Variant 1. R_HH-based
4.2.2 Variant 2. R_hh-based
4.3 Estimation with Comb-Type Pilot (PSA Channel Estimation)
4.3.1 Appended Notation
4.3.2 PSA Estimation with FFT Interpolation
4.3.3 PSA Estimation with WIF
5 Simulation
5.1 Scenarios
5.2 LS Estimator with Block-Type Pilot
5.3 MMSE Estimator with Block-Type Pilot
5.4 PSA Estimator
5.5 Comparison
5.5.1 Urban Micro Scenario
5.5.2 Urban Macro Scenario
6 Computational Cost
6.1 Kernel Arithmetic
6.1.1 K-FFT
6.1.2 Matrix Multiplication
6.1.3 Matrix Inversion
6.1.4 Singular Value Decomposition
6.2 Complexity in FLOPs
6.2.1 Real Time Computation
6.2.2 Off-line Computation
7 Hardware Dependent Analysis
7.1 Hardware Platform
7.1.1 Matrix Processor
7.2 Computational Latency: A Case Study
7.2.1 LS Estimator with Block-Type Pilot
7.2.2 MMSE Estimator with Block-Type Pilot
7.2.3 Estimator with Comb-Type Pilot
8 Conclusion
8.1 Conclusion
8.2 Future Works
List of Figures
2.1 MIMO-OFDM System
3.1 MIMO-OFDM System with Partition in Time & Frequency Domain
3.2 Block-type Pilot
3.3 Comb-type Pilot
3.4 Pilot Tones in Two Dimensions
4.1 Time Domain FFT-based Channel Estimator for MIMO-OFDM system
5.1 MSE/SNR Result of LS Estimator with Block-Type Pilot in Urban Micro Scenario
5.2 MSE/SNR Result of LS Estimator with Block-Type Pilot in Urban Macro Scenario
5.3 MSE/SNR Result of R_HH involved MMSE Estimator with Block-Type Pilot in Urban Micro Scenario
5.4 MSE/SNR Result of R_hh involved MMSE Estimator with Block-Type Pilot in Urban Micro Scenario
5.5 MSE/SNR Result of R_hh involved MMSE Estimator with Block-Type Pilot in Urban Macro Scenario
5.6 MSE/SNR Result of PSA Estimator with FFT Interpolation in Urban Macro Scenario
5.7 MSE/SNR Result of PSA Estimator with Wiener Interpolation Filter in Urban Macro Scenario
5.8 Comparison of Channel Estimators with Block-Type Pilot in Urban Micro Scenario
5.9 Comparison of PSA Channel Estimators in Urban Macro Scenario
6.1 K = 8 radix-2 FFT Algorithm
7.1 Hardware Architecture of Channel Estimator
7.2 Matrix Processor
List of Tables
4.1 Complexity of the FFT-based LS Estimator with Block-Type Pilot
4.2 Complexity of Matrix Manipulation based LS Estimator
4.3 Complexity of the Frequency Domain LS Estimator
4.4 Complexity of Basic R_HH-based MMSE Estimator
4.5 Complexity of SVD Optimized R_HH-based MMSE Estimator
4.6 Complexity of Basic R_hh-based MMSE Estimator
4.7 Complexity of SVD Optimized R_hh-based MMSE Estimator
4.8 Complexity of the PSA Estimator with FFT Interpolator
4.9 Complexity of Basic PSA Estimator with WIF
4.10 Complexity of SVD I PSA Estimator with WIF
4.11 Complexity of SVD II PSA Estimator with WIF
5.1 Simulation Parameters
6.1 Conversion from complex to FLOPs
6.2 Complexity of the Algorithms in Real FLOPs
6.3 Example of the Complexity in Real FLOPs
7.1 Complex Floating-Point Instructions
7.2 Cycle Cost of the FFT-based LS Estimator with Block-Type Pilot
7.3 Cycle Cost of SVD Optimized R_hh-based MMSE Estimator
7.4 Cycle Cost of the PSA Estimator with FFT Interpolator
Glossary
3GPP 3rd Generation Partnership Project
ASIC Application Specific Integrated Circuit
ASIP Application Specific Instruction-set Processor
BER Bit Error Ratio
CP Cyclic Prefix
DFT Discrete Fourier Transform
DMA Direct Memory Access
FFT Fast Fourier Transform
FLOPs FLoating-point OPerations
FPCMAC Floating-Point Complex Multiply ACcumulate
IFFT Inverse Fast Fourier Transform
LS Least Square
LTE Long Term Evolution
MIMO Multiple-Input-Multiple-Output
MMSE Minimum Mean Square Error
MSE Mean Square Error
OFDM Orthogonal Frequency Division Multiplexing
PE Processing Element
PHY PHYsical layer
PSA Pilot Symbol Aided
PSAM Pilot Symbol Aided Modulation
QPSK Quadrature Phase-Shift Keying
RX Receiving antenna
SDR Software Defined Radio
SNR Signal-to-Noise Ratio
SVD Singular Value Decomposition
TX Transmitting antenna
WIF Wiener Interpolation Filter
Chapter 1
Introduction
1.1 Motivation
The concept of Software-Defined Radio (SDR) was first proposed by Joseph Mitola in [1], where, enabled by an ideal analog-to-digital converter (ADC), all RF and baseband signal processing is done in the digital domain. However, this concept later proved to be too expensive due to the extraordinary requirements on the ADC. As defined in [2], an SDR is a transceiver whose functions are realized, where feasible, as programs running on a suitable processor. Based on the same hardware, different transmitter/receiver algorithms are implemented in software. This makes it possible for a single platform to support a multimode system adaptively. In contrast to the ideal software radio, which directly samples the antenna outputs, SDR in this thesis means a practical version of software radio that receives the sampled signal after a band selection filter.
SDR has become realistic over recent years owing to the scaling of semiconductor processes. By carefully defining the functional coverage, an application-specific programmable hardware solution can achieve efficiency similar to that of a fixed-function ASIC. Furthermore, it brings flexibility, lower development cost and shorter time-to-market.
The Multiple-Input-Multiple-Output (MIMO) technique, as one of the promising technologies in modern communications, can increase the spectrum efficiency by deploying multiple antennas at both the transmitter and receiver sides in a rich scattering environment [3]. The use of multiple spatially distributed antennas introduces a spatial dimension beyond time and frequency, which leads to the development of space-time coding but also challenges the signal processing capability and hardware complexity.
Orthogonal Frequency Division Multiplexing (OFDM) has become popular because of its capability to convert a frequency-selective channel into a set of parallel frequency-flat sub-channels. Because the sub-carriers are orthogonal to each other, guard bands are no longer necessary, which largely increases the spectrum efficiency. Although signals on different sub-carriers overlap in frequency, they can be recovered at the receiver side as long as the orthogonality is maintained. Recent developments in MIMO techniques promise a significant growth in performance over conventional OFDM systems.
MIMO-OFDM has become a hot topic and is expected to be adopted in various communication standards. Under such circumstances, the idea arises of implementing MIMO-OFDM on an SDR platform supporting multiple standards. This thesis mainly focuses on the channel estimation module, which is crucial but also computation-intensive in space-time decoders.
1.2 Objective
This thesis is related to the research on multi-antenna enhanced SDR performed in the Division of Computer Engineering at Linköping University. It is intended to investigate several commonly referenced channel estimation algorithms and analyze their computational complexity, followed by a discussion of the related implementation issues.
1.3 Reading Guidelines
Chapter 2 defines the system model on which all problems in this thesis are based.
Chapter 3 gives an overview of channel estimation in MIMO-OFDM communication systems. Several commonly referenced algorithms are elaborated in Chapter 4. Simulations are presented in Chapter 5, where the MSE performance of selected algorithms is compared.
In Chapter 6, as a result of the hardware-independent analysis, the FLOPs cost of the algorithms is calculated based on the kernel arithmetic. In Chapter 7, the hardware-dependent analysis is performed on top of the proposed hardware architecture.
Chapter 2
MIMO-OFDM System Model
A MIMO-OFDM system model is depicted in Figure 2.1. The system has $N_T$ transmitting (TX) antennas, $N_R$ receiving (RX) antennas and $K$ sub-carriers in one OFDM block. It is assumed that the time-variant wireless channel obeys a Rayleigh distribution and is quasi-static over $P$ consecutive OFDM block durations. The maximum multipath delay length is $L$. The length of the Cyclic Prefix (CP) is chosen to be longer than $L$. The channels between different TX-RX antenna pairs are mutually uncorrelated.
At transmission time $n$, a stream of binary bits $b$ is coded into $N_T$ symbol blocks. The signal on the $k$th sub-carrier at the $i$th TX antenna is denoted by $X_i[n,k]$, where $i = 1,\dots,N_T$, $k = 0,\dots,K-1$ and $n = 0,\dots,P-1$. The received signal at RX antenna $j$ is
$$Y_j[n,k] = \sum_{i=1}^{N_T} X_i[n,k]\,H_{ij}[n,k] + N_j[n,k], \tag{2.1}$$
where $H_{ij}[n,k]$ is the frequency response between antennas $i$ and $j$, and $N_j[n,k]$ is additive Gaussian noise with zero mean and variance $\sigma_n^2$. Since the RX antennas are merely replicas of each other, the index $j$ of the RX antenna is omitted in the following. For convenience of later mathematical manipulation, we define
$$\begin{aligned}
\mathbf{X}_i(n) &= \mathrm{diag}\big(X_i[n,0]\;\; X_i[n,1]\;\; \dots\;\; X_i[n,K-1]\big) \in \mathbb{C}^{K\times K}\\
\mathbf{X}(n) &= [\mathbf{X}_1(n)\;\; \mathbf{X}_2(n)\;\; \dots\;\; \mathbf{X}_{N_T}(n)] \in \mathbb{C}^{K\times KN_T}\\
\mathbf{H}_i &= [H_i(0)\;\; H_i(1)\;\; \dots\;\; H_i(K-1)]^T \in \mathbb{C}^{K\times 1}\\
\mathbf{H} &= [\mathbf{H}_1^T\;\; \mathbf{H}_2^T\;\; \dots\;\; \mathbf{H}_{N_T}^T]^T \in \mathbb{C}^{KN_T\times 1}\\
\mathbf{Y}(n) &= [Y(n,0)\;\; Y(n,1)\;\; \dots\;\; Y(n,K-1)]^T \in \mathbb{C}^{K\times 1}
\end{aligned} \tag{2.2}$$
Figure 2.1. MIMO-OFDM System
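Then Eq. (2.1) can be rewritten as
$$\mathbf{Y}(n) = \mathbf{X}(n)\mathbf{H} + \mathbf{N}(n). \tag{2.3}$$
Considering the case of $P$ blocks of pilot symbols, we have
$$\begin{aligned}
\mathbf{Y} &= [\mathbf{Y}^T(0)\;\; \mathbf{Y}^T(1)\;\; \dots\;\; \mathbf{Y}^T(P-1)]^T \in \mathbb{C}^{KP\times 1}\\
\mathbf{X} &= [\mathbf{X}^T(0)\;\; \mathbf{X}^T(1)\;\; \dots\;\; \mathbf{X}^T(P-1)]^T \in \mathbb{C}^{KP\times KN_T}\\
\mathbf{N} &= [\mathbf{N}^T(0)\;\; \mathbf{N}^T(1)\;\; \dots\;\; \mathbf{N}^T(P-1)]^T \in \mathbb{C}^{KP\times 1}
\end{aligned} \tag{2.4}$$
Thus the system equation in the frequency domain becomes
$$\mathbf{Y} = \mathbf{X}\mathbf{H} + \mathbf{N}. \tag{2.5}$$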
For OFDM systems with proper cyclic extension and sample timing, it has been shown in [19] that, with tolerable leakage, the channel frequency response can be expressed as the Discrete Fourier Transform (DFT) of the time domain channel impulse response:
$$H_i[n,k] = \sum_{l=0}^{L-1} h_i[n,l]\,W_K^{kl}, \tag{2.6}$$
where $W_K = \exp(-j2\pi/K)$ and $K$ is the number of sub-carriers (tones) of an OFDM block. The average power of $h_i[n,l]$ and the index $L$ ($\ll K$) depend on the delay profiles and dispersion of the wireless channels. Specifically, the time domain channel response from TX antenna $i$ can be represented in vector format:
$$\mathbf{h}_i = [h_i(0)\;\; h_i(1)\;\; \dots\;\; h_i(L-1)]^T \tag{2.7}$$
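As a small illustration of Eq. (2.6), the sketch below evaluates the frequency response of one TX antenna directly from its impulse response. The three-tap channel `h` and the block size `K = 8` are hypothetical values chosen only for this example; they do not come from the thesis.

```python
import cmath

def channel_frequency_response(h, K):
    """Evaluate H[k] = sum_l h[l] * W_K^(k*l) for k = 0..K-1, as in Eq. (2.6)."""
    W = cmath.exp(-2j * cmath.pi / K)  # W_K = exp(-j*2*pi/K)
    return [sum(h[l] * W ** (k * l) for l in range(len(h))) for k in range(K)]

# Hypothetical impulse response with L = 3 taps and K = 8 sub-carriers.
h = [1.0, 0.5j, 0.25]
H = channel_frequency_response(h, K=8)
```

Since only $L \ll K$ taps are nonzero, the same result is what a length-$K$ FFT of the zero-padded impulse response would return, which is the role of the mapping matrix introduced next.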
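If we define the $K \times K$ DFT matrix $\mathbf{F}_K$ and the $K \times L$ mapping matrix $\mathbf{M}$ for zero-padding as
$$\mathbf{F}_K = \begin{bmatrix} W_K^{0\cdot 0} & \cdots & W_K^{0\cdot(K-1)}\\ \vdots & \ddots & \vdots\\ W_K^{(K-1)\cdot 0} & \cdots & W_K^{(K-1)(K-1)} \end{bmatrix} \tag{2.8}$$
$$\mathbf{M} = [\mathbf{I}_{L\times L}\;\; \mathbf{0}_{L\times(K-L)}]^T \tag{2.9}$$
where $\mathbf{I}_{L\times L}$ is the $L \times L$ identity matrix, then the transformation from the time domain to the frequency domain channel response can be written as
$$\mathbf{H} = \mathbf{F}_M\mathbf{h}, \tag{2.10}$$
where $\mathbf{h} = [\mathbf{h}_1^T\;\; \mathbf{h}_2^T\;\; \dots\;\; \mathbf{h}_{N_T}^T]^T$ stacks the impulse responses of all TX antennas and
$$\mathbf{F}_M = \mathrm{diag}(\mathbf{F}_K\mathbf{M}\;\; \mathbf{F}_K\mathbf{M}\;\; \dots\;\; \mathbf{F}_K\mathbf{M}) \tag{2.11}$$
is an $N_TK \times N_TL$ block-diagonal matrix with $N_T$ sub-matrices on the diagonal. Substituting Eq. (2.10) into Eq. (2.5), the system equation is given by
$$\mathbf{Y} = \mathbf{X}\mathbf{F}_M\mathbf{h} + \mathbf{N}. \tag{2.12}$$
Define the $KP \times N_TL$ data matrix
$$\mathbf{A} = \mathbf{X}\mathbf{F}_M; \tag{2.13}$$
then the dependency between the frequency domain $\mathbf{Y}$ and the time domain channel response $\mathbf{h}$ is linear:
$$\mathbf{Y} = \mathbf{A}\mathbf{h} + \mathbf{N}. \tag{2.14}$$
The channel estimation discussed in this thesis is based on Eqs. (2.5) and (2.14). For channel estimation, the symbols sent from the transmitter are agreed on by both sides in advance. These symbols are called pilot symbols. With a sequence of well-designed pilot symbols, the channel estimator determines the solution of Eqs. (2.5) and (2.14), where $\mathbf{Y}$ and either $\mathbf{X}$ or $\mathbf{A}$ are available.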
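The matrices of Eqs. (2.8) to (2.13) can be assembled in a few lines to check that the two system equations describe the same noise-free observation. The following is only a sketch under assumed toy dimensions ($K = 8$, $L = 2$, $N_T = 2$, $P = 1$) with random QPSK pilots; none of these values is taken from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)
K, L, N_T = 8, 2, 2  # assumed toy sizes, with P = 1

# K x K DFT matrix F_K (Eq. 2.8) and K x L zero-padding matrix M (Eq. 2.9)
k = np.arange(K)
F_K = np.exp(-2j * np.pi * np.outer(k, k) / K)
M = np.vstack([np.eye(L), np.zeros((K - L, L))])

# F_M = diag(F_K M, ..., F_K M), size N_T*K x N_T*L (Eq. 2.11)
F_M = np.kron(np.eye(N_T), F_K @ M)

# Pilot matrix X = [X_1 ... X_NT], each X_i diagonal (Eq. 2.2)
qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
X = np.hstack([np.diag(rng.choice(qpsk, K)) for _ in range(N_T)])

# Random stacked impulse response h, its frequency response H, and A
h = rng.standard_normal(N_T * L) + 1j * rng.standard_normal(N_T * L)
H = F_M @ h  # Eq. (2.10)
A = X @ F_M  # Eq. (2.13)
```

With these definitions `X @ H` and `A @ h` agree, and the first `K` entries of `H` equal the zero-padded FFT of the first antenna's taps.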
Chapter 3
Overview of MIMO-OFDM Channel Estimation
The performance boost brought by MIMO systems is mainly attributed to space-time techniques, which aim to eliminate or compensate for the channel effects in both the transmission and detection phases. The channel state information generated by the channel estimation module is either sent to the detection block or fed back to the transmitter to construct the beamforming weight vector. In this chapter, methods of channel estimation in MIMO-OFDM systems are explained briefly. Basically, methods of channel estimation can be classified along four dimensions.
• From the view of estimation theory, there are
– Least Square (LS) Estimation
– Minimum Mean Square Error (MMSE) Estimation
• According to the processing domain, estimation can be done in
– Time Domain
– Frequency Domain
• According to the pilot-symbol arrangement, there are
– Estimator with Block-Type Pilot (Training Based)
– Estimator with Comb-Type Pilot (Pilot Symbol Aided Modulation)
• According to the number of estimation iterations, there are
– Iterative Methods
– Direct Methods
3.1 LS vs. MMSE
Estimation theory is a branch of statistical signal processing. It deals with the problem of estimating parameters based on measured data. The purpose of estimation theory is to develop an estimator, preferably an implementable one that can be used in practice. The estimator takes the measured data as input and produces estimated values of the parameters.
Both Eq. (2.5) and Eq. (2.14) are systems of linear equations. $\mathbf{H}$ and $\mathbf{h}$ are the unknown vectors, $\mathbf{X}$ and $\mathbf{A}$ are the known matrices, and $\mathbf{Y}$ is the measurement vector for both. On this basis, two estimators are mainly used for channel estimation, namely Least Squares (LS) and Minimum Mean Square Error (MMSE).
3.1.1 LS
In mathematical terms, taking Eq. (2.14) as an example, channel estimation is to find a solution $\hat{\mathbf{h}}$ for the equation $\mathbf{A}\hat{\mathbf{h}} \approx \mathbf{Y}$. LS minimizes the squared Euclidean norm of the residual $\mathbf{A}\hat{\mathbf{h}} - \mathbf{Y}$, that is,
$$\begin{aligned}
\|\mathbf{A}\hat{\mathbf{h}} - \mathbf{Y}\|^2 &= (\mathbf{A}\hat{\mathbf{h}} - \mathbf{Y})^H(\mathbf{A}\hat{\mathbf{h}} - \mathbf{Y})\\
&= (\mathbf{A}\hat{\mathbf{h}})^H(\mathbf{A}\hat{\mathbf{h}}) - \mathbf{Y}^H\mathbf{A}\hat{\mathbf{h}} - (\mathbf{A}\hat{\mathbf{h}})^H\mathbf{Y} + \mathbf{Y}^H\mathbf{Y}.
\end{aligned}$$
The minimum is found at the zero of the derivative with respect to $\hat{\mathbf{h}}$:
$$2\mathbf{A}^H\mathbf{A}\hat{\mathbf{h}} - 2\mathbf{A}^H\mathbf{Y} = 0 \;\Longrightarrow\; \mathbf{A}^H\mathbf{A}\hat{\mathbf{h}} = \mathbf{A}^H\mathbf{Y}.$$
Therefore, $\hat{\mathbf{h}}$ is given by
$$\hat{\mathbf{h}} = (\mathbf{A}^H\mathbf{A})^{-1}\mathbf{A}^H\mathbf{Y} \tag{3.1}$$
under the condition that $\mathbf{A}$ has full column rank. The term $(\mathbf{A}^H\mathbf{A})^{-1}\mathbf{A}^H$ is called the pseudo-inverse of $\mathbf{A}$, sometimes denoted by $\mathbf{A}^\dagger$.
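The LS solution of Eq. (3.1) can be checked numerically. The sketch below assumes a random tall matrix `A` (12 observations, 4 unknowns) and a noise-free observation, both chosen only for illustration; it solves the normal equations and compares the result with NumPy's pseudo-inverse.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 12, 4  # assumed sizes: m observations, n unknowns, m > n
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
h = rng.standard_normal(n) + 1j * rng.standard_normal(n)
Y = A @ h  # noise-free observation, so the estimate should recover h

# Normal equations A^H A h_hat = A^H Y, i.e. Eq. (3.1)
AH = A.conj().T
h_hat = np.linalg.solve(AH @ A, AH @ Y)

# Same estimate through the pseudo-inverse of A
h_pinv = np.linalg.pinv(A) @ Y
```

Solving the normal equations with `np.linalg.solve` avoids forming the explicit inverse; with additive noise in `Y` the two estimates still coincide, but they no longer equal `h` exactly.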
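3.1.2 MMSE
The MMSE estimator aims to approach the optimal result by exploiting the statistical dependence between the measured data and the estimated parameter. Eq. (2.14) is again taken as an example, where $\mathbf{h}$ is to be estimated. To minimize the Mean Square Error (MSE) $E[\|\mathbf{h} - \hat{\mathbf{h}}_{MMSE}\|^2]$, according to [8], the estimated channel impulse response is given by
$$\hat{\mathbf{h}} = \mathbf{R}_{hY}\mathbf{R}_{YY}^{-1}\mathbf{Y}, \tag{3.2}$$
where $\mathbf{R}_{hY}$ and $\mathbf{R}_{YY}$ are the cross-covariance matrix between $\mathbf{h}$ and $\mathbf{Y}$ and the auto-covariance matrix of $\mathbf{Y}$, respectively, with
$$\begin{aligned}
\mathbf{R}_{hY} &= E[\mathbf{h}\mathbf{Y}^H] = E[\mathbf{h}(\mathbf{A}\mathbf{h} + \mathbf{N})^H] = \mathbf{R}_{hh}\mathbf{A}^H\\
\mathbf{R}_{YY} &= E[\mathbf{Y}\mathbf{Y}^H] = E[(\mathbf{A}\mathbf{h} + \mathbf{N})(\mathbf{A}\mathbf{h} + \mathbf{N})^H]\\
&= E[\mathbf{A}\mathbf{h}(\mathbf{A}\mathbf{h})^H + \mathbf{A}\mathbf{h}\mathbf{N}^H + \mathbf{N}(\mathbf{A}\mathbf{h})^H + \mathbf{N}\mathbf{N}^H] = \mathbf{A}\mathbf{R}_{hh}\mathbf{A}^H + \sigma_n^2\mathbf{I}.
\end{aligned}$$
The cross terms vanish under the assumption that the noise is zero-mean and uncorrelated with $\mathbf{h}$. $\mathbf{R}_{hh} = E[\mathbf{h}\mathbf{h}^H]$ is the auto-covariance matrix of $\mathbf{h}$ and $\sigma_n^2$ denotes the noise variance $E|n_k|^2$. These two quantities are assumed to be known at the estimator. Then Eq. (3.2) can be rewritten as
$$\begin{aligned}
\hat{\mathbf{h}} &= \mathbf{R}_{hh}\mathbf{A}^H(\mathbf{A}\mathbf{R}_{hh}\mathbf{A}^H + \sigma_n^2\mathbf{I})^{-1}\mathbf{Y}\\
&= \mathbf{R}_{hh}\big(\mathbf{R}_{hh} + \sigma_n^2(\mathbf{A}^H\mathbf{A})^{-1}\big)^{-1}\mathbf{A}^\dagger\mathbf{Y}.
\end{aligned} \tag{3.3}$$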
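The two lines of Eq. (3.3) can be compared numerically. This is a sketch under assumed values: a random $12 \times 4$ matrix `A`, a hypothetical diagonal `R_hh` with decaying tap powers, and noise variance 0.1. The second form shows the MMSE estimate as the LS estimate followed by a smaller smoothing step.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, sigma2 = 12, 4, 0.1  # assumed sizes and noise variance
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
R_hh = np.diag([1.0, 0.5, 0.25, 0.125]).astype(complex)  # hypothetical tap powers
Y = rng.standard_normal(m) + 1j * rng.standard_normal(m)

AH = A.conj().T

# First line of Eq. (3.3): requires solving an m x m system
h1 = R_hh @ AH @ np.linalg.solve(A @ R_hh @ AH + sigma2 * np.eye(m), Y)

# Second line of Eq. (3.3): LS estimate, then an n x n smoothing matrix
h_ls = np.linalg.solve(AH @ A, AH @ Y)
h2 = R_hh @ np.linalg.solve(R_hh + sigma2 * np.linalg.inv(AH @ A), h_ls)
```

Both expressions give the same vector; the second only inverts $n \times n$ matrices, which is smaller whenever `A` is tall.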
3.2 Time Domain vs. Frequency Domain
Figure 3.1. MIMO-OFDM System with Partition in Time & Frequency Domain
There are two different views on the time/frequency domain partition. Both are frequently mentioned and have to be distinguished carefully.
According to the first view, estimators based on Eq. (2.14) are classified as time domain approaches, which generate the estimated time domain channel impulse response $\hat{\mathbf{h}}$, while estimation based on Eq. (2.5) is the frequency domain approach, where the channel transfer function $\hat{\mathbf{H}}$ is produced directly. It has been proved in [10] that, under certain conditions, LS channel estimation with block-type pilots is theoretically equivalent in these time and frequency domains.
From the other point of view, the time/frequency classification depends on the domain from which the estimator takes the received signal. As shown in Figure 3.1, the information bits are first pre-coded in the frequency domain, then fed into the IFFT block in order to be transmitted in the time domain. At the receiver, an estimator that takes the received signal after the FFT is called a frequency domain estimator, while one that takes time domain inputs before the FFT is a time domain approach.
3.3 Block-Type Pilot vs. Comb-Type Pilot
As mentioned at the end of Chapter 2, one crucial part of channel estimation is the design of the pilot symbols, which are agreed on by both sides of the transmission. Basically, pilot tones can be inserted either into all of the sub-carriers of OFDM blocks periodically or into a subset of the sub-carriers of each OFDM block.
The first scheme, channel estimation with block-type pilot, sometimes also referred to as training based, has been developed under the assumption of a slow fading channel. The second, comb-type pilot channel estimation, is introduced to satisfy the need for equalization when the channel changes even from one OFDM block to the next.
3.3.1 Estimation with Block-Type Pilot
Figure 3.2. Block-type Pilot
Figure 3.2 illustrates the block-type pilot, in which a continuous pilot block is used to obtain the channel response on all sub-carriers. The length of the training block is fixed to the number of sub-carriers in an OFDM block. Estimation is carried out once per channel coherence time $T_c$, which is a statistical measure of the time duration over which the channel impulse response is essentially invariant. It is assumed that the channel is flat block-fading when the block-type pilot is employed. When objects move at high speed, the channel coherence time $T_c$ shrinks significantly due to the increasing Doppler spread $f_m$, as $T_c = 0.423/f_m$ [12]. To follow the faster channel variation, the estimator has to run the estimation process more frequently. Since the pilot symbols are regarded as overhead, an increase in travel speed may lower the data rate.
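To give a feeling for how quickly the re-estimation interval shrinks with speed, the sketch below evaluates $T_c = 0.423/f_m$. It assumes the standard relation $f_m = v f_c / c$ for the maximum Doppler shift; the function name and its arguments are illustrative and are not taken from the thesis.

```python
C = 299_792_458.0  # speed of light in m/s

def coherence_time(speed_mps, carrier_hz):
    """Return T_c = 0.423 / f_m, with f_m = v * f_c / c (assumed Doppler model)."""
    f_m = speed_mps * carrier_hz / C
    return 0.423 / f_m

# Example with an assumed 2 GHz carrier: pedestrian vs. vehicular speed.
t_slow = coherence_time(1.0, 2e9)   # about 1 m/s
t_fast = coherence_time(30.0, 2e9)  # about 30 m/s
```

Because $T_c$ is inversely proportional to speed, a receiver moving thirty times faster needs its block-type pilots roughly thirty times more often.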
3.3.2 Estimation with Comb-Type Pilot
Comb-type pilot channel estimation is introduced to handle the situation where the channel changes even from one OFDM block to the next. It is performed by inserting pilot symbols into a subset of the sub-carriers of either each OFDM block, as in Figure 3.3, or some of the OFDM blocks periodically (Figure 3.4). Estimation is done on these known sub-carriers. The other tones are handled by interpolation. Such a scheme, which multiplexes pilot symbols with data, is referred to as Pilot-Symbol-Aided Modulation (PSAM), and the corresponding channel estimator is called a Pilot-Symbol-Aided (PSA) estimator.
Figure 3.3. Comb-type Pilot
Figure 3.4. Pilot Tones in Two Dimensions
By periodically inserting pilots in the time-frequency grid such that the sampling theorem is satisfied, the channel response can be reconstructed by exploiting the correlation of the received signal in frequency and time. In wireless communication, small-scale channel fading can be characterized in both the frequency and time dimensions. Let $\tau_{max}$ and $f_{max}$ be the maximum delay spread and Doppler frequency, and let $D_f$ and $D_t$ be the pilot spacing in the frequency and time domains. They must satisfy $D_f\tau_{max}/T \le 1$ and $f_{max}T_{sym}D_t \le 1/2$ [13], where $T$ and $T_{sym}$ are the OFDM symbol durations without and with the guard interval, respectively. It is also suggested in [13] that two-times oversampling is a good compromise between pilot overhead and performance. For MIMO-OFDM channel estimation, however, a larger oversampling factor has to be chosen in order to separate the signals from different antennas [14].
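The two sampling constraints translate directly into upper bounds on the pilot spacing. The sketch below computes the largest integer spacings that satisfy them. The function name, the way the oversampling factor divides both bounds, and the numeric values (a 66.7 µs symbol, 71.4 µs with guard interval, 5 µs delay spread, 200 Hz Doppler) are assumptions made for illustration only.

```python
import math

def max_pilot_spacing(tau_max, f_max, T, T_sym, oversampling=1.0):
    """Largest integer (D_f, D_t) with D_f*tau_max/T <= 1 and f_max*T_sym*D_t <= 1/2.

    The oversampling factor tightens both bounds by the same ratio
    (an assumed interpretation of "two-times oversampling").
    """
    D_f = math.floor(T / (tau_max * oversampling))
    D_t = math.floor(1.0 / (2.0 * f_max * T_sym * oversampling))
    return D_f, D_t

# Hypothetical system parameters, not taken from the thesis.
nyquist = max_pilot_spacing(5e-6, 200.0, 66.7e-6, 71.4e-6)
twofold = max_pilot_spacing(5e-6, 200.0, 66.7e-6, 71.4e-6, oversampling=2.0)
```

With these numbers a pilot on every 13th sub-carrier and every 35th symbol would sit at the sampling limit, and the two-times oversampled grid uses every 6th sub-carrier and every 17th symbol.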
3.4 Iterative vs. Direct Method
As in computational mathematics, an iterative method attempts to approach the exact solution of the system equations (2.14) and (2.5) by finding successive approximations starting from an initial guess of $\mathbf{H}$. In contrast, a direct method solves the problem in one shot. In a general MIMO channel estimator, iterative methods use more than one iteration to remove additive noise effects and achieve better performance. For systems with a small number of variables, however, iterative methods are relatively expensive compared with direct ones.
Chapter 4
Algorithms Analysis
Based on the classification presented in Chapter 3, eight typical algorithms are chosen for discussion. They fall into three categories, namely LS estimators with block-type pilot, MMSE estimators with block-type pilot and channel estimators for PSAM. In preparation for the implementation discussion, the basic arithmetic operations of each algorithm are summarized, since they determine the implementation cost of these methods.
4.1 LS Estimator with Block-Type Pilot
The LS channel estimation with block-type pilot for MIMO-OFDM was first proposed in [15] and was simplified in [16]. It is an FFT-based approach, where the frequency domain received signals after the FFT are collected as inputs to the channel estimation module, and the time domain channel impulse response is then estimated. The frequency domain channel matrix is provided as the output.
Such an estimation procedure can also be accomplished by pure matrix manipulation, where the time domain $\hat{\mathbf{h}}$ is likewise generated and then transformed into the frequency domain. The complexity of the matrix manipulation based approach differs slightly from that of the FFT-based one.
As an alternative, an equivalent frequency domain channel matrix can be computed directly under certain conditions [10]. However, such conditions turn out to be unreachable in practical MIMO systems.
All three versions of the LS estimator with block-type pilot are explained in detail in this section. It is assumed that estimation is carried out within one training block, that is, $P = 1$.
4.1.1 FFT-based Time Domain Approach
A. Principle
Figure 4.1 depicts a general FFT-based channel estimator for a MIMO-OFDM system.
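Given the notation in (2.2) and (2.4), we define
$$\begin{aligned}
p_j[n,l] &\triangleq \frac{1}{K}\sum_{k=0}^{K-1} X_j^H[n,k]\,Y[n,k]\,W_K^{-kl}\\
q_{ij}[n,l] &\triangleq \frac{1}{K}\sum_{k=0}^{K-1} X_i^H[n,k]\,X_j[n,k]\,W_K^{-kl}
\end{aligned} \tag{4.1}$$
where $i$ and $j$ are the indices of different TX antennas. The estimated channel impulse response $\hat{\mathbf{h}}$ is obtained as
$$\hat{\mathbf{h}}_{LS}[n] = \mathbf{Q}^{-1}[n]\,\mathbf{p}[n] \tag{4.2}$$
where
$$\mathbf{p}[n] \triangleq \big(\mathbf{p}_1^T[n]\;\; \mathbf{p}_2^T[n]\;\; \dots\;\; \mathbf{p}_{N_T}^T[n]\big)^T, \qquad
\mathbf{Q}[n] \triangleq \begin{bmatrix} \mathbf{Q}_{11}[n] & \cdots & \mathbf{Q}_{N_T1}[n]\\ \vdots & \ddots & \vdots\\ \mathbf{Q}_{1N_T}[n] & \cdots & \mathbf{Q}_{N_TN_T}[n] \end{bmatrix} \tag{4.3}$$
with
$$\mathbf{p}_i[n] \triangleq \big(p_i[n,0]\;\; p_i[n,1]\;\; \dots\;\; p_i[n,L-1]\big)^T, \qquad
\mathbf{Q}_{ij}[n] \triangleq \begin{bmatrix} q_{ij}[n,0] & q_{ij}[n,-1] & \cdots & q_{ij}[n,-L+1]\\ q_{ij}[n,1] & q_{ij}[n,0] & \cdots & q_{ij}[n,-L+2]\\ \vdots & \vdots & \ddots & \vdots\\ q_{ij}[n,L-1] & q_{ij}[n,L-2] & \cdots & q_{ij}[n,0] \end{bmatrix} \tag{4.4}$$
Finally, $\hat{\mathbf{H}}_{LS}[n]$ is obtained by applying a length-$K$ FFT to $\hat{\mathbf{h}}_{LS}[n]$.
B. Complexity
Since $\mathbf{Q}^{-1}$ depends only on the predefined pilot symbols, it can be calculated offline. Thus, the real-time computation consists of producing $\mathbf{p}$ and multiplying it by the pre-calculated $\mathbf{Q}^{-1}$.
Accordingly, the complexity consists of two parts: offline complexity and real-time complexity (Table 4.1).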
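Figure 4.1. Time Domain FFT-based Channel Estimator for MIMO-OFDM system

Offline Complexity:
Function                Size                               Counts
Complex-valued mult.    -                                  N_T × N_T × K
IFFT                    K                                  N_T × N_T
Matrix inversion        N_T L × N_T L                      1
Real Time Complexity:
Function                Size                               Counts
Complex-valued mult.    -                                  N_T × K
IFFT                    K                                  N_T
Matrix mult.            (N_T L × N_T L) × (N_T L × 1)      1
FFT                     K                                  N_T
Table 4.1. Complexity of the FFT-based LS Estimator with Block-Type Pilot

4.1.2 Matrix Manipulation based Time Domain Approach
A. Principle
According to the estimation theory in [8], the estimated channel impulse response can be derived from Eq. (2.14) as
$$\hat{\mathbf{h}}_{LS} = \mathbf{A}^\dagger\mathbf{Y} \tag{4.5}$$
where $\mathbf{A}^\dagger$ is the pseudo-inverse of $\mathbf{A}$, given by the Singular Value Decomposition (SVD):
$$\mathbf{A}^\dagger = \mathbf{V}\begin{bmatrix} \mathbf{\Sigma}^{-1} & \mathbf{0}\\ \mathbf{0} & \mathbf{0} \end{bmatrix}\mathbf{U}^H \tag{4.6}$$
$\mathbf{U}$ and $\mathbf{V}$ are unitary matrices, $\mathbf{\Sigma}^{-1} = \mathrm{diag}(\lambda_1^{-1}, \lambda_2^{-1}, \dots, \lambda_r^{-1})$, $\lambda_i$ are the nonzero singular values of $\mathbf{A}$, and $r$ is the rank of $\mathbf{A}$. However, the pseudo-inverse takes the closed form $(\mathbf{A}^H\mathbf{A})^{-1}\mathbf{A}^H$ only when the number of rows exceeds the number of columns and $\mathbf{A}$ has full column rank. Thus, when $K > N_TL$ and $\mathrm{rank}(\mathbf{A}) = N_TL$, $\mathbf{A}^\dagger = (\mathbf{A}^H\mathbf{A})^{-1}\mathbf{A}^H$. Therefore, the condition for time domain LS channel estimation is
$$K > N_TL, \qquad \mathrm{rank}(\mathbf{A}) = N_TL \tag{4.7}$$
and $\hat{\mathbf{h}}_{LS}$ and $\hat{\mathbf{H}}_{LS}$ are obtained as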
$$\hat{\mathbf{h}}_{LS} = (\mathbf{A}^H\mathbf{A})^{-1}\mathbf{A}^H\mathbf{Y}, \qquad \hat{\mathbf{H}}_{LS} = \mathbf{F}_M\hat{\mathbf{h}}_{LS} \tag{4.8}$$
B. Complexity
The first step of any estimation based on Eq. (2.14) is to produce the matrix $\mathbf{A}$ from the transmitted signal $\mathbf{X}$, which is done by offline matrix multiplication. In Eq. (4.8), the factor $(\mathbf{A}^H\mathbf{A})^{-1}\mathbf{A}^H$ is generated offline. The real-time computational load mainly depends on the multiplication of this offline result by the received signal vector $\mathbf{Y}$ and on the Fourier transform (Table 4.2).

Offline Complexity:
Function            Size                               Counts
Matrix mult.        (K × K)diag. × (K × L)             N_T
                    (L × K) × (K × L)                  N_T
Matrix inversion    N_T L × N_T L                      1
Matrix mult.        (N_T L × N_T L) × (N_T L × K)      1
Real Time Complexity:
Function            Size                               Counts
Matrix mult.        (N_T L × K) × (K × 1)              1
FFT                 K                                  N_T
Table 4.2. Complexity of Matrix Manipulation based LS Estimator
4.1.3 Frequency Domain Approach
A. Principle
According to Eq. (2.5), LS channel estimation in the frequency domain is given by
$$\hat{\mathbf{H}}_{LS} = \mathbf{X}^\dagger\mathbf{Y}. \tag{4.9}$$
Similar to the derivation in Section 4.1.2.A, SVD can also be applied to $\mathbf{X}$ under the condition
$$P > N_T, \qquad \mathrm{rank}(\mathbf{X}) = N_TK \tag{4.10}$$
and $\hat{\mathbf{H}}_{LS}$ is obtained as
$$\hat{\mathbf{H}}_{LS} = (\mathbf{X}^H\mathbf{X})^{-1}\mathbf{X}^H\mathbf{Y}. \tag{4.11}$$
B. Complexity
The complexity of the frequency domain LS estimator is given in Table 4.3.

Offline Complexity:
Function                Size                             Counts
Complex-valued mult.    -                                N_T × K
Matrix inversion        (N_T K × N_T K)diag.             1
Matrix mult.            (K × K)diag. × (K × K)diag.      N_T
Real Time Complexity:
Function                Size                             Counts
Matrix mult.            (N_T K × K) × (K × 1)            1
Table 4.3. Complexity of the Frequency Domain LS Estimator
C. Discussion
Unless Eq. (4.10) is satisfied, the pseudo-inverse $(\mathbf{X}^H\mathbf{X})^{-1}\mathbf{X}^H$ does not exist, leaving the system under-determined. In realistic cases, even without the assumption $P = 1$, $P$ usually takes the value 1 or 2, since otherwise the data transmission efficiency decreases. For a MIMO system, where more than one antenna is deployed, the condition $P > N_T$ is therefore impossible to meet.
From another perspective, comparing the conditions in Eq. (4.7) and Eq. (4.10) shows that the two schemes are equivalent if $K = L$. That is, all $K$ taps are taken into account in the frequency domain estimation, while only $L$ ($\ll K$) are used in the time domain. In practice, only $L$ taps are actually received, due to loss and fading effects.
Such a constraint makes any comparison between time domain and frequency domain LS channel estimation meaningless.
4.1.4 Optimal Training Sequence Design
It has been proved in [15] that the MSE of the basic temporal channel estimation achieves its lower bound when $\mathbf{Q}_{ij}[n] = \mathbf{0}$ for all $i \ne j$. Specifically, the training signals from different antennas should satisfy
$$q_{ij}[n,l] = \begin{cases} \delta[l] & \text{for } i = j\\ 0 & \text{for } i \ne j \end{cases} \tag{4.12}$$
where $\delta[l]$ equals 1 for $l = 0$ and 0 otherwise. Consequently, $\mathbf{Q}[n] = \mathbf{I}$, where $\mathbf{I}$ is an $N_TL \times N_TL$ identity matrix [16]. On this basis, if the training signal for TX antenna 1 is $X_1[n,k]$, the phase-shifted training signal for TX antenna $i$ is
$$X_i[n,k] = X_1[n,k]\,e^{-j\frac{2\pi}{K}\bar{K}_0(i-1)k}, \tag{4.13}$$
where $\bar{K}_0 = \lfloor K/N_T \rfloor \ge L$ and $\lfloor x \rfloor$ denotes the largest integer not larger than $x$. Since the matrices $\mathbf{Q}_{ij}[n]$ are then diagonal, the computational complexity of the matrix inversion is greatly reduced. Therefore, an optimal training sequence not only reduces the complexity of channel estimation but also improves the performance during the training period.
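The orthogonality produced by the phase-shift design of Eq. (4.13) can be verified with a short script. This sketch assumes $K = 16$, $N_T = 2$, $L = 4$ and a unit-modulus chirp as the base sequence $X_1$; these values and the helper names are chosen only for illustration.

```python
import cmath

def phase_shift_pilots(X1, N_T):
    """Build pilots for N_T antennas from X1 using Eq. (4.13).

    The loop index i runs from 0, so it plays the role of (i - 1) in the text.
    """
    K = len(X1)
    K0 = K // N_T  # floor(K / N_T)
    return [[X1[k] * cmath.exp(-2j * cmath.pi * K0 * i * k / K) for k in range(K)]
            for i in range(N_T)]

def q(Xi, Xj, l):
    """q_ij[l] = (1/K) * sum_k conj(X_i[k]) * X_j[k] * W_K^(-k*l), W_K = exp(-j*2*pi/K)."""
    K = len(Xi)
    return sum(Xi[k].conjugate() * Xj[k] * cmath.exp(2j * cmath.pi * k * l / K)
               for k in range(K)) / K

K, N_T, L = 16, 2, 4  # assumed toy sizes with floor(K / N_T) >= L
X1 = [cmath.exp(1j * cmath.pi * k * k / K) for k in range(K)]  # assumed unit-modulus base pilot
X = phase_shift_pilots(X1, N_T)
```

For every lag $|l| \le L - 1$ the cross term `q(X[0], X[1], l)` is zero and the auto term `q(X[0], X[0], l)` is one only at `l = 0`, so every block of $\mathbf{Q}[n]$ is either zero or the identity.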
4.1.5 Channel Length in Time Domain
In an OFDM system, the maximum channel delay spread is obtained from statistics, and it determines the length of the Cyclic Prefix (CP). As indicated in [15], for a system with a 160 µs symbol duration and 128 tones, $L = 17$ taps are used for estimation if the channel delay spread is 20 µs.
For practical multipath wireless channels, the number of paths with significant energy is relatively small compared to the FFT size $K$. Most samples contain little or no energy apart from noise perturbation. Neglecting channel taps may introduce degradation if signal energy is missed. Usually, however, the total noise in those taps is much higher than the multipath energy they contain, especially at low SNR [20]. Hence, ignoring the non-significant taps can improve the channel estimation performance to some extent.
4.2
MMSE Estimator with Block-Type Pilot
The LS estimator is obtained using only knowledge of the training symbols. It can be further improved by making use of the channel statistics in either the frequency domain or the time domain. In the frequency domain, the channel statistics are described by the auto-covariance matrix of the channel transfer function, $R_{HH}$, while in the time domain they are described by that of the channel impulse response, $R_{hh}$.
4.2.1
Variant 1. $R_{HH}$-based
A. Principle
The MMSE estimator utilizes the frequency domain auto-covariance matrix $R_{HH}$ to improve the result of the LS estimator. Based on the system Eq.(2.5), the MMSE estimator is given by Eq.(4.14).
$$ \hat{H}_{MMSE} = R_{HH}\left(R_{HH} + \sigma_n^2 (X^H X)^{-1}\right)^{-1} \hat{H}_{LS} \tag{4.14} $$
We replace the term $(X^H X)^{-1}$ with its expectation $E\{(X^H X)^{-1}\}$. Simulations indicate that the performance degradation is negligible [21]. Assuming the same signal constellation on all tones and equal probability on all constellation points, we have $E\{(X^H X)^{-1}\} = E\{|1/X_k|^2\} I$. Thus Eq.(4.14) can be further expressed as
$$ \hat{H}_{MMSE} = R_{HH}\left(R_{HH} + \frac{\beta}{SNR} I\right)^{-1} \hat{H}_{LS} \tag{4.15} $$
where $\beta = E\{|X_k|^2\} E\{|1/X_k|^2\}$ is a constant depending on the training symbol constellation and $SNR = E\{|X_k|^2\}/\sigma_n^2$ is the Signal-to-Noise Ratio. The correlation between sub-carriers is represented by the auto-covariance matrix of $H$, $R_{HH} = E\{H H^H\}$, which is an $N_T K \times N_T K$ matrix. The LS estimate $\hat{H}_{LS}$ follows the definition in Eq.(4.8).
B. SVD Optimization
The SVD based channel estimator for single antenna OFDM systems has been given in [21]. Although that solution cannot be applied directly to MIMO systems, a similar scheme can be obtained by performing SVD on the channel auto-covariance matrix, so that large-scale matrix inversion is avoided.
$$ R_{HH} = U \Sigma U^H \tag{4.16} $$
where $\Sigma = \mathrm{diag}(\lambda_0, \lambda_1, \ldots, \lambda_{r-1}, 0, \ldots, 0)$ with the nonzero singular values $\lambda_0 \geq \lambda_1 \geq \ldots \geq \lambda_{r-1}$ and $r = \mathrm{rank}(\Sigma)$. $U$ is a unitary matrix whose columns are the eigenvectors. Thus, using the result in [21] and Eq.(4.15), the optimal rank $r$ estimator is given by
$$ \hat{H}_{MMSE,r} = U \Omega U^H \hat{H}_{LS} \tag{4.17} $$
where $\Omega$ is a diagonal matrix with
$$ \delta_k = \begin{cases} \dfrac{\lambda_k}{\lambda_k + \frac{\beta}{SNR}} & \text{for } k = 0, 1, \ldots, r-1 \\ 0 & \text{for } k = r, r+1, \ldots, K-1 \end{cases} \tag{4.18} $$
on its diagonal.
C. Complexity
The MMSE estimator refines the initial estimates of the LS estimator. This refinement is based on knowledge of the channel auto-covariance matrix $R_{HH}$. In practice, $R_{HH}$ depends on the physical properties of the transmission channel, such as the placement of obstacles, weather and temperature. These may vary over time, so $R_{HH}$ has to be updated periodically. In this section, $R_{HH}$ is assumed to be known. In other words, the environment is nearly static, with TX, RX and obstacles either static or moving at relatively low speed, which has no effect on the channel statistics.
Excluding the complexity of the initial LS estimates, the computations required by the MMSE estimators without and with SVD optimization are summarized in Tables 4.4 and 4.5.
Real Time Complexity:
| Function | Size | Counts |
|---|---|---|
| Complex-valued add. | - | $N_T K$ |
| Matrix inversion | $(N_T K \times N_T K)$ | 1 |
| Matrix mult. | $(N_T K \times N_T K) \times (N_T K \times 1)$ | 2 |
Table 4.4. Complexity of Basic $R_{HH}$-based MMSE Estimator

Offline Complexity:
| Function | Size | Counts |
|---|---|---|
| SVD | $N_T K \times N_T K$ | 1 |
Real Time Complexity:
| Function | Size | Counts |
|---|---|---|
| real add. | - | $r$ |
| real mult. | - | $2r$ |
| Matrix mult. | $(N_T K \times N_T K)_{diag.} \times (N_T K \times N_T K)$ | 1 |
| Matrix mult. | $(N_T K \times N_T K) \times (N_T K \times 1)$ | 2 |
Table 4.5. Complexity of SVD Optimized $R_{HH}$-based MMSE Estimator
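As a numerical illustration of Eqs.(4.15) to (4.18), the following NumPy sketch compares the direct estimator with the SVD form. It is a minimal example under stated assumptions: the covariance matrix is synthetic (a random Hermitian positive semi-definite matrix, not a 3GPP channel), the size n = 8 merely stands in for $N_T K$, $\beta = 1$ corresponds to a constant-modulus constellation such as QPSK, and the function names are hypothetical.

```python
import numpy as np

def mmse_direct(R_HH, H_ls, beta, snr):
    """Eq.(4.15): R_HH (R_HH + (beta/SNR) I)^-1 H_LS, computed with a linear solve."""
    n = R_HH.shape[0]
    return R_HH @ np.linalg.solve(R_HH + (beta / snr) * np.eye(n), H_ls)

def mmse_svd(R_HH, H_ls, beta, snr, rank=None):
    """Eqs.(4.16)-(4.18): U Omega U^H H_LS, keeping the `rank` largest singular values."""
    U, lam, _ = np.linalg.svd(R_HH)          # Hermitian PSD input: R_HH = U diag(lam) U^H
    r = len(lam) if rank is None else rank
    delta = np.zeros_like(lam)
    delta[:r] = lam[:r] / (lam[:r] + beta / snr)   # diagonal of Omega, Eq.(4.18)
    return U @ (delta * (U.conj().T @ H_ls))

rng = np.random.default_rng(0)
n = 8                                        # stands in for N_T * K (assumption)
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
R_HH = A @ A.conj().T / n                    # synthetic Hermitian PSD covariance
H_ls = rng.standard_normal(n) + 1j * rng.standard_normal(n)
beta, snr = 1.0, 10.0

H_direct = mmse_direct(R_HH, H_ls, beta, snr)
H_full = mmse_svd(R_HH, H_ls, beta, snr)             # full rank: matches Eq.(4.15)
H_low = mmse_svd(R_HH, H_ls, beta, snr, rank=4)      # rank-4 approximation
```

In this form the SVD can be computed offline, and the real-time part reduces to matrix-vector products and a diagonal scaling, in line with Table 4.5.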
4.2.2
Variant 2. $R_{hh}$-based
A. Principle
According to the estimation theory in [8], it is also feasible to estimate the time domain channel impulse response based on the system equation (2.14). Following Eq.(3.3) and replacing $(A^H A)^{-1}$ with its expectation gives
$$ \begin{aligned} E\{(A^H A)^{-1}\} &= E\{((X F_M)^H (X F_M))^{-1}\} \\ &= (F_M)^{-1} E\{(X^H X)^{-1}\} (F_M^H)^{-1} \\ &= \frac{1}{K} E\{(X^H X)^{-1}\}. \end{aligned} $$
Hence, the MMSE channel estimates can be expressed as:
$$ \hat{h}_{MMSE} = R_{hh}\left(R_{hh} + \frac{\beta}{SNR \cdot K} I\right)^{-1} \hat{h}_{LS} \tag{4.19} $$
$\beta$ and $SNR$ have the same definitions as in the frequency domain, and $K$ is the OFDM size. $R_{hh} = E\{h h^H\}$ is the auto-covariance matrix of $h$, which is an $N_T L \times N_T L$ matrix. $\hat{h}_{LS}$ is as defined in Eq.(4.8).
B. SVD based Approach
Similar to the approach in the frequency domain, the SVD is applicable to the time domain approach. Given the $N_T L \times N_T L$ time domain channel auto-covariance matrix $R_{hh}$, which is much smaller than that in the frequency domain, the computational load of the SVD is considerably decreased. Compared with the frequency domain approach, there are only two differences in the time domain approach: $R_{HH}$ is replaced with $R_{hh} = V \Sigma' V^H$, and the elements on the diagonal of the matrix $\Omega'$ become
$$ \delta'_k = \begin{cases} \dfrac{\lambda_k}{\lambda_k + \frac{\beta}{SNR \cdot K}} & \text{for } k = 0, 1, \ldots, r-1 \\ 0 & \text{for } k = r, r+1, \ldots, K-1 \end{cases} \tag{4.20} $$
C. Complexity
Excluding the complexity of the initial LS estimates, the corresponding complexity of the $R_{hh}$-based MMSE estimators without and with SVD optimization is summarized in Tables 4.6 and 4.7.
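Real Time Complexity:
| Function | Size | Counts |
|---|---|---|
| Complex-valued add. | - | $N_T L$ |
| Matrix inversion | $(N_T L \times N_T L)$ | 1 |
| Matrix mult. | $(N_T L \times N_T L) \times (N_T L \times 1)$ | 2 |
| FFT | $K$ | $N_T$ |
Table 4.6. Complexity of Basic $R_{hh}$-based MMSE Estimator

Offline Complexity:
| Function | Size | Counts |
|---|---|---|
| SVD | $N_T L \times N_T L$ | 1 |
Real Time Complexity:
| Function | Size | Counts |
|---|---|---|
| real add. | - | $r$ |
| real mult. | - | $2r$ |
| Matrix mult. | $(N_T L \times N_T L)_{diag.} \times (N_T L \times N_T L)$ | 1 |
| Matrix mult. | $(N_T L \times N_T L) \times (N_T L \times 1)$ | 2 |
| FFT | $K$ | $N_T$ |
Table 4.7. Complexity of SVD Optimized $R_{hh}$-based MMSE Estimator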
4.3
Estimation with Comb-Type Pilot (PSA Channel estimation)
Since the PSA channel estimation only applies pilots to a subset of the OFDM sub-carriers, the estimation procedure includes estimation on the pilot tones and interpolation to the data tones, so that the channel response on all sub-carriers is obtained. In this section, we will consider the basic 1D pilot insertion in the frequency dimension. Interpolation with FFT and with the Wiener Interpolation Filter (WIF) will be discussed.
4.3.1
Appended Notation
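For PSA channel estimation, $K_P$ known pilot sub-carriers are periodically multiplexed into the $K$ sub-carriers. The spacing between two adjacent pilots is $D_f = \lfloor K/K_P \rfloor$. In this section, we define the subset of the transmitted signals containing only pilot symbols from TX $i$ to be $X_{p_i}[n, k]$. Similar to the notation in Chapter 3, the received pilot sequence can be expressed as
$$ Y_p(n) = X_p(n) H_p + N(n) \tag{4.21} $$
where the transmitted pilot sequence, the channel transfer function and the additive noise are given by
$$ \begin{aligned} X_{p_i}(n) &= \mathrm{diag}(X_{p_i}[n, 0] \;\; X_{p_i}[n, 1] \;\; \ldots \;\; X_{p_i}[n, K_P - 1]) \in \mathbb{C}^{K_P \times K_P} \\ X_p(n) &= [X_{p_1}(n) \;\; X_{p_2}(n) \;\; \ldots \;\; X_{p_{N_T}}(n)] \in \mathbb{C}^{K_P \times K_P N_T} \\ H_{p_i} &= [H_{p_i}(0) \;\; H_{p_i}(1) \;\; \ldots \;\; H_{p_i}(K_P - 1)]^T \in \mathbb{C}^{K_P \times 1} \\ H_p &= [H_{p_1}^T \;\; H_{p_2}^T \;\; \ldots \;\; H_{p_{N_T}}^T]^T \in \mathbb{C}^{K_P N_T \times 1} \\ Y_p(n) &= [Y_p(n, 0) \;\; Y_p(n, 1) \;\; \ldots \;\; Y_p(n, K_P - 1)]^T \in \mathbb{C}^{K_P \times 1} \end{aligned} \tag{4.22} $$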
In our discussion, the pilot tones are assumed to occupy the first sub-carrier of every group of $D_f$ sub-carriers. For the sake of pilot extraction, we define a sampling vector $D$ of length $D_f$:
$$ D = \underbrace{[\,1 \;\; 0 \;\; \ldots \;\; 0\,]}_{D_f} $$
Then for each TX antenna, the downsampling matrix $D_{spl}$ of size $K_P \times K$ can be built as
$$ D_{spl} = \mathrm{diag}(\underbrace{D \;\; D \;\; \cdots \;\; D}_{K_P}) \tag{4.23} $$
Finally, we get the $N_T K_P \times N_T K$ downsampling matrix, in other words the pilot extraction matrix $S$, as
$$ S = \mathrm{diag}(\underbrace{D_{spl} \;\; D_{spl} \;\; \cdots \;\; D_{spl}}_{N_T}) \tag{4.24} $$
Therefore, the channel transfer functions on the pilot tones, $H_p$, can be easily extracted by
$$ H_p = S H \tag{4.25} $$
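The construction of the pilot extraction matrix in Eqs.(4.23) and (4.24) can be sketched in a few lines of Python. This is an illustrative example only: it assumes that the extraction is the matrix-vector product $H_p = S H$, uses plain nested lists instead of a matrix library, and the function name and the toy sizes (K = 8, K_P = 4, N_T = 2) are hypothetical.

```python
def pilot_extraction_matrix(K, K_P, N_T):
    """Build the (N_T*K_P) x (N_T*K) block-diagonal pilot extraction matrix S."""
    D_f = K // K_P                        # pilot spacing
    D = [1] + [0] * (D_f - 1)             # sampling vector: pilot on the first tone
    # D_spl = diag(D, ..., D): row m selects tone m * D_f
    D_spl = [[0] * K for _ in range(K_P)]
    for m in range(K_P):
        D_spl[m][m * D_f:(m + 1) * D_f] = D
    # S = diag(D_spl, ..., D_spl): one block per TX antenna
    S = [[0] * (N_T * K) for _ in range(N_T * K_P)]
    for i in range(N_T):
        for m in range(K_P):
            S[i * K_P + m][i * K:(i + 1) * K] = D_spl[m]
    return S

K, K_P, N_T = 8, 4, 2
S = pilot_extraction_matrix(K, K_P, N_T)

# Toy "channel" whose entries equal their own index, so the selection is easy to read.
H = list(range(N_T * K))
H_p = [sum(s * h for s, h in zip(row, H)) for row in S]
```

With these toy values, every second tone of each antenna is selected. In a real implementation the product with $S$ would be replaced by strided indexing, since $S$ contains only zeros and ones.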
4.3.2
PSA Estimation with FFT Interpolation
A. Principle
With FFT interpolation, the time domain estimate from the pilot tones, $\hat{h}_p$, is obtained by the basic FFT-based approach (Sec. 4.1.1). It is then zero-padded to the required FFT size and transformed into the frequency domain to get $\hat{H}$.
B. Complexity
The complexity of the PSA estimator with FFT interpolator is roughly the same as that of the LS estimator with block-type pilot, except that the FFT size refers to the number of pilot tones. The detailed complexity analysis can be found in Table 4.8.
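Offline Complexity:
| Function | Size | Counts |
|---|---|---|
| Complex-valued mult. | - | $N_T \times N_T \times K_P$ |
| IFFT | $K_P$ | $N_T \times N_T$ |
| Matrix inversion | $N_T L \times N_T L$ | 1 |
Real Time Complexity:
| Function | Size | Counts |
|---|---|---|
| Complex-valued mult. | - | $N_T \times K_P$ |
| IFFT | $K_P$ | $N_T$ |
| Matrix mult. | $(N_T L \times N_T L) \times (N_T L \times 1)$ | 1 |
| FFT | $K$ | $N_T$ |
Table 4.8. Complexity of the PSA Estimator with FFT Interpolator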
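The interpolation step itself (IFFT of size $K_P$, zero-padding, FFT of size $K$) can be illustrated with NumPy. The sketch below is a simplified single-antenna, noise-free example, so the matrix operations that separate the TX antennas in Table 4.8 are omitted. The sizes and the function name are hypothetical, and the channel taps are random numbers rather than an SCM realization.

```python
import numpy as np

def fft_interpolate(H_p, K, L):
    """Interpolate pilot-tone estimates to all K tones: IFFT, keep L taps, zero-pad, FFT."""
    h_p = np.fft.ifft(H_p)                 # K_P-point IFFT gives the time domain taps
    h = np.zeros(K, dtype=complex)
    h[:L] = h_p[:L]                        # keep the first L taps, zero-pad to length K
    return np.fft.fft(h)                   # K-point FFT gives all sub-carriers

K, K_P, L = 64, 16, 4                      # toy sizes with L <= K_P
D_f = K // K_P                             # pilot spacing
rng = np.random.default_rng(1)
h_true = rng.standard_normal(L) + 1j * rng.standard_normal(L)
H_true = np.fft.fft(h_true, K)             # true channel transfer function on K tones

H_p = H_true[::D_f]                        # pilots on the first tone of every D_f tones
H_hat = fft_interpolate(H_p, K, L)
```

Without noise and with $L \leq K_P$, the interpolated response reproduces the true channel on all tones, which shows why the number of pilot tones must not fall below the channel length.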
4.3.3
PSA Estimation with WIF
A. Principle
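With the WIF, the estimated channel transfer function is given by
$$ \hat{H}_{WIF} = W Y_p = R_{HY_p} R_{Y_pY_p}^{-1} Y_p \tag{4.26} $$
The WIF minimizes the MSE between the desired response $H$ and the filtered output $\hat{H}_{WIF}$, given the received pilot sequence $Y_p$ [24]. Similar to the MMSE estimator, the channel statistics are required. The terms in Eq.(4.26) can be rewritten as
$$ \begin{aligned} R_{Y_pY_p} &= E[Y_p Y_p^H] = X_p R_{H_pH_p} X_p^H + \sigma_n^2 I \\ R_{HY_p} &= E[H (X_p H_p + \tilde{N})^H] = R_{HH_p} X_p^H \end{aligned} \tag{4.27} $$
where $R_{HH_p}$ is the cross-covariance matrix between the channel frequency response on all sub-carriers and that on the pilot sub-carriers, and $R_{H_pH_p}$ is the auto-covariance matrix of the channel frequency response on the pilot sub-carriers. Then Eq.(4.26) becomes
$$ \hat{H}_{WIF} = R_{HH_p}\left(R_{H_pH_p} + \sigma_n^2 (X_p X_p^H)^{-1}\right)^{-1} \hat{H}_{pLS} \tag{4.28} $$
With the definition of $\beta = E\{|X_k|^2\} E\{|1/X_k|^2\}$, Eq.(4.26) can be expressed as
$$ \hat{H}_{WIF} = R_{HH_p}\left(R_{H_pH_p} + \frac{\beta}{SNR} I\right)^{-1} \hat{H}_{pLS} \tag{4.29} $$
The initial LS estimates on the pilot sub-carriers, $\hat{H}_{pLS}$, follow Sec. 4.1.1.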
B. SVD Optimization I
Eq.(4.29) looks almost the same as Eq.(4.15) for the MMSE estimator with block-type pilot, which suggests that matrix inversion can be avoided by performing SVD on both $R_{HH_p}$ and $R_{H_pH_p}$. Since $R_{H_pH_p}$ is a symmetric matrix, its SVD is represented by $R_{H_pH_p} = V \Sigma_p V^H$. The SVD of $R_{HH_p}$ is represented by $R_{HH_p} = U \Sigma V^H$. Thus, Eq.(4.29) can be rewritten as
$$ \begin{aligned} \hat{H}_{WIF} &= U \Sigma V^H \left(V \Sigma_p V^H + \frac{\beta}{SNR} I\right)^{-1} \hat{H}_{pLS} \\ &= U \Sigma \left(\Sigma_p + \frac{\beta}{SNR} I\right)^{-1} V^H \hat{H}_{pLS} \\ &= U \Omega V^H \hat{H}_{pLS} \end{aligned} \tag{4.30} $$
with the updated singular values $\delta_k$ on the diagonal of $\Omega$:
$$ \delta_k = \begin{cases} \dfrac{\lambda_k}{\lambda_{pk} + \frac{\beta}{SNR}} & \text{for } k = 0, 1, \cdots, r-1 \\ 0 & \text{for } k = r, r+1, \cdots, N_T K - 1 \end{cases} $$
where $\lambda_k$ and $\lambda_{pk}$ denote the singular values of $\Sigma$ and $\Sigma_p$ respectively.
Basically, in order to get all elements in Eq.(4.30), two SVDs have to be performed. However, as $R_{H_pH_p}$ is symmetric and the resulting $V$ is a unitary matrix which can also be obtained from the SVD of $R_{HH_p}$, the SVD of $R_{H_pH_p}$ can be replaced by a matrix multiplication to get the singular value matrix $\Sigma_p$. Hence, the estimation can be fulfilled with only one SVD operation.
$$ \Sigma_p = V^H R_{H_pH_p} V \tag{4.31} $$
C. SVD Optimization II
Similar to the MMSE estimator with block-type pilot, the WIF can be simplified by using the time domain auto-covariance matrix of all sub-carriers, $R_{hh}$. As a preparation, with Eq.(2.10) and Eq.(4.25), we can rewrite $R_{HH_p}$ and $R_{H_pH_p}$ as
$$ R_{HH_p} = E\{H H_p^H\} = E\{F_M h h^H F_M^H S^H\} = F_M R_{hh} F_M^H S^H \tag{4.32} $$
$$ R_{H_pH_p} = E\{H_p H_p^H\} = E\{S F_M h h^H F_M^H S^H\} = S F_M R_{hh} F_M^H S^H \tag{4.33} $$
Performing SVD on $R_{hh}$, we have $R_{hh} = V' \Sigma' V'^H$. Then,
$$ R_{HH_p} = F_M V' \Sigma' V'^H F_M^H S^H \tag{4.34} $$
$$ R_{H_pH_p} = S F_M V' \Sigma' V'^H F_M^H S^H \tag{4.35} $$
Let
$$ F_M V' = P, \qquad S F_M V' = Q \tag{4.36} $$
we can get
$$ \begin{aligned} \hat{H}_{WIF} &= P \Sigma' Q^H \left(Q \Sigma' Q^H + \frac{\beta}{SNR} I\right)^{-1} \hat{H}_{pLS} \\ &= P \Sigma' \left(\Sigma' + \frac{\beta}{SNR} (Q^H Q)^{-1}\right)^{-1} Q^{-1} \hat{H}_{pLS} \\ &= P \Sigma' \left(\Sigma' + \frac{\beta}{SNR \cdot K_P}\right)^{-1} (S F_M V')^{-1} \hat{H}_{pLS} \\ &= P \Sigma' \left(\Sigma' + \frac{\beta}{SNR \cdot K_P}\right)^{-1} V'^H F_M^{-1} S^{-1} S F_M \hat{h}_{LS} \\ &= P \Omega' V'^H \hat{h}_{LS} \end{aligned} \tag{4.37} $$
It is noted from Sec. 4.3.2 that the time domain channel impulse responses are equivalent for the pilot tones and for all sub-carriers, in other words, $\hat{h}_{LS} = \hat{h}_{pLS}$. To get the corresponding frequency domain channel transfer functions, they are zero-padded to different lengths. In addition, on the diagonal of $\Omega'$, the updated singular values $\delta'_k$ are:
$$ \delta'_k = \begin{cases} \dfrac{\lambda'_k}{\lambda'_k + \frac{\beta}{SNR \cdot K_P}} & \text{for } k = 0, 1, \cdots, r-1 \\ 0 & \text{for } k = r, r+1, \cdots, N_T L - 1 \end{cases} $$
where $\lambda'_k$ denotes the singular values of $\Sigma'$.
With this formulation, only one small-scale SVD operation is required for the estimator.
D. Complexity
As before, the full channel auto-covariance matrix is assumed to be available, from which $R_{HH_p}$ and $R_{H_pH_p}$ can be derived. The complexity of the original PSA estimator with WIF, excluding the computation of the initial LS estimates on the pilot tones, is listed in Table 4.9. The optimized versions I and II are summarized in Table 4.10 and Table 4.11.
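Real Time Complexity:
| Function | Size | Counts |
|---|---|---|
| Complex-valued add. | - | $N_T K_P$ |
| Matrix inversion | $(N_T K_P \times N_T K_P)$ | 1 |
| Matrix mult. | $(N_T K_P \times N_T K_P) \times (N_T K_P \times 1)$ | 1 |
| Matrix mult. | $(N_T K \times N_T K_P) \times (N_T K_P \times 1)$ | 1 |
Table 4.9. Complexity of Basic PSA Estimator with WIF

Offline Complexity:
| Function | Size | Counts |
|---|---|---|
| SVD | $N_T K \times N_T K_P$ | 1 |
| Matrix mult. | $(N_T K_P \times N_T K_P) \times (N_T K_P \times N_T K_P)$ | 2 |
Real Time Complexity:
| Function | Size | Counts |
|---|---|---|
| real add. | - | $r$ |
| real mult. | - | $2r$ |
| Matrix mult. | $(N_T K_P \times N_T K_P)_{diag.} \times (N_T K_P \times 1)$ | 1 |
| Matrix mult. | $(N_T K \times N_T K_P) \times (N_T K_P \times 1)$ | 1 |
| Matrix mult. | $(N_T K \times N_T K) \times (N_T K \times 1)$ | 1 |
Table 4.10. Complexity of SVD I PSA Estimator with WIF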
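Offline Complexity:
| Function | Size | Counts |
|---|---|---|
| SVD | $N_T L \times N_T L$ | 1 |
Real Time Complexity:
| Function | Size | Counts |
|---|---|---|
| real add. | - | $r$ |
| real mult. | - | $2r$ |
| Matrix mult. | $(N_T L \times N_T L) \times (N_T L \times 1)$ | 2 |
| Matrix mult. | $(N_T L \times N_T L)_{diag.} \times (N_T L \times 1)$ | 1 |
| FFT | $K$ | $N_T$ |
Table 4.11. Complexity of SVD II PSA Estimator with WIF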
Chapter 5
Simulation
In order to compare the performance of algorithms in Chapter 4, simulation is carried out in MATLAB. Mean Square Error (MSE) performances are provided as the results.
5.1
Scenarios
Channel estimation algorithms discussed in Chapter 4 are implemented based on the 3GPP Spatial Channel Model [27]. The carrier center frequency is $f_c = 2$ GHz, and the wavelength is $\lambda = 0.15$ m. Following the 3GPP LTE PHY downlink transmission scheme, for a bandwidth of W = 5 MHz, the number of sub-carriers in an OFDM block is K = 512. A QPSK modulated 2 × 1 MIMO-OFDM system is considered. This can be easily extended to the multiple RX case because the RXs are completely independent from each other. For the sake of simplicity, no data is multiplexed with the pilots.
Two scenarios are considered, namely urban micro and urban macro. Parameters under the different scenarios are summarized in Table 5.1.
| Parameter | Urban Micro | Urban Macro |
|---|---|---|
| Maximum Delay Spread (µs) | 1.2 | 3 |
| Number of Taps | 6 | 15 |
| Ave. Speed (m/s) | < 3 | 15 |
| Maximum Doppler Frequency $f_d$ (Hz) | 20 | 100 |
| Channel Coherence Time $T_c$ (µs) | $9 \times 10^3$ | $1.8 \times 10^3$ |
| FFT Size | 512 | 512 |
| OFDM Symbol (Block) Duration (µs) | 102.4 | 102.4 |
Table 5.1. Simulation Parameters
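The entries of Table 5.1 are mutually consistent under some standard textbook relations, which the following Python sketch reproduces. The relations used here are our assumption, since the text does not state how the table was derived: $f_d = v/\lambda$, $T_c \approx 9/(16\pi f_d)$, number of taps $= \tau_{max} W$, and symbol duration $= K/W$. The function name is hypothetical.

```python
import math

def scenario_parameters(speed_mps, max_delay_s, wavelength_m=0.15,
                        bandwidth_hz=5e6, fft_size=512):
    """Derive the Table 5.1 quantities from speed and delay spread (assumed formulas)."""
    f_d = speed_mps / wavelength_m                      # maximum Doppler frequency
    return {
        "taps": round(max_delay_s * bandwidth_hz),      # delay spread in sample periods
        "doppler_hz": f_d,
        "coherence_time_us": 9 / (16 * math.pi * f_d) * 1e6,   # rule-of-thumb T_c
        "symbol_duration_us": fft_size / bandwidth_hz * 1e6,   # K / W
    }

micro = scenario_parameters(speed_mps=3, max_delay_s=1.2e-6)    # urban micro
macro = scenario_parameters(speed_mps=15, max_delay_s=3e-6)     # urban macro
```

Under these assumptions the coherence time spans roughly 87 OFDM symbols in the urban micro case and about 17 in the urban macro case, which gives a feeling for how often the channel estimate has to be refreshed.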
5.2
LS Estimator with Block-Type Pilot
The LS estimator with block-type pilot is simulated under scenarios of urban micro and urban macro.
Under the urban micro scenario, where the channel between TX and RX can be regarded as static, a 6-tap channel impulse response is generated by the 3GPP SCM. In real cases, however, it is almost impossible to acquire the exact number of channel taps L at the estimator. Although methods exist to estimate the channel length [28], the estimated value is never perfect. In order to see the impact of the mismatch between the estimated $\hat{L}$ and the real L, both 8-tap and 5-tap estimation cases are simulated.
Figure 5.1 shows that perfect knowledge of the channel length (6 taps used for estimation) leads to the minimum MSE. Taking more taps for estimation introduces more noise while little additional signal power is gained, which may degrade the performance slightly. In contrast, for the smaller $\hat{L}$ case, the curve is worse and saturates as the SNR increases. This is interpreted as the loss of useful energy in the missing taps.
For the urban macro scenario, where the mobile station moves at intermediate speed, a 15-tap channel is assumed. Estimation with 15, 18, 13 and 9 taps is carried out for comparison. The resulting MSE/SNR curves are shown in Figure 5.2. The trend is roughly the same as that in Figure 5.1. The 9-tap estimation degrades badly even compared with the 13-tap case. Because less energy (both signal and noise) is contained in the later taps, the 18-tap curve is quite close to the perfect case.
Figure 5.1. MSE/SNR Result of LS Estimator with Block-Type Pilot in Urban Micro Scenario
5.3
MMSE Estimator with Block-Type Pilot
MMSE estimators with block-type pilot are evaluated under the pre-defined scenarios: the $R_{HH}$-based approach in urban micro and the $R_{hh}$-based approach in both. As in the LS case, various numbers of taps are used for estimation to see the mismatch effect. Since training based MMSE estimators usually involve large-scale matrix inversion or SVD, the simulation takes a rather long time. MSE/SNR results are shown in Figures 5.3, 5.4 and 5.5. The curves of all full rank solutions stay quite close to each other, going down linearly from the order of below $10^{-2}$ to the order of below $10^{-4}$ as the SNR goes from 5 to 25. For the low rank case, the frequency domain $R_{HH}$ aided approach gives better performance, with an MSE of about $10^{-2}$ that stays almost constant as the SNR rises. In contrast, the time domain $R_{hh}$ based scheme performs somewhat worse in the low rank case.
Figure 5.2. MSE/SNR Result of LS Estimator with Block-Type Pilot in Urban Macro Scenario
Figure 5.3. MSE/SNR Result of $R_{HH}$ involved MMSE Estimator with Block-Type
Figure 5.4. MSE/SNR Result of $R_{hh}$ involved MMSE Estimator with Block-Type Pilot in Urban Micro Scenario
Figure 5.5. MSE/SNR Result of $R_{hh}$ involved MMSE Estimator with Block-Type Pilot
5.4
PSA Estimator
For the PSA approach, estimators with FFT interpolation and WIF are tested under the urban macro scenario, where the subject is moving at moderate speed. Assuming that the exact number of channel taps (15) is known at the estimator, pilot spacings of 2, 4 and 8 are adopted and the performance of the different interpolation schemes is compared.
From Figures 5.6 and 5.7, it can be seen that the performance degrades as the pilot spacing increases. The $D_f = 2$ case, where half of the tones are occupied by pilot symbols, gives the minimum MSE. More pilot tones give better estimates than fewer, because interpolation only approximates the real value. However, a small pilot spacing decreases the number of sub-carriers available for useful data. Therefore, there is a trade-off between the number of pilot tones and the number of data tones.
Figure 5.6. MSE/SNR Result of PSA Estimator with FFT Interpolation in Urban Macro Scenario
5.5
Comparison
5.5.1
Urban Micro Scenario
The urban micro scenario describes a state in which both ends of the transmission as well as the obstacles in between are relatively static. This is the usual case in indoor environments. The distance between TX and RX is on the order of 100 meters, giving a shorter rms delay and fewer received taps.
Under this scenario, the performance of the LS and MMSE estimators with block-type pilot for different estimated channel lengths is displayed in Figure 5.8. When all the significant taps are taken into consideration (the 6-tap and 8-tap cases), the MMSE estimator performs slightly better than the LS estimator, and the curves go down as the SNR increases. If some significant taps are missed (for example the 5-tap case), the LS and MMSE curves overlap at a level below $10^{-1}$ and remain almost flat as the SNR increases.
Figure 5.7. MSE/SNR Result of PSA Estimator with Wiener Interpolation Filter in Urban Macro Scenario
Figure 5.8. Comparison of Channel Estimators with Block-Type Pilot in Urban Micro Scenario
5.5.2
Urban Macro Scenario
The urban macro scenario describes transmission properties in a common outdoor area, where TX and RX are surrounded by both static and moving obstacles but Line-Of-Sight is unavailable. The channel coherence time is shorter compared with the urban micro scenario. This implies that the channel statistics may change from time to time and have to be updated periodically.
MSE/SNR curves of the schemes with FFT interpolation and WIF are shown in Figure 5.9. The estimator with WIF provides a better result than the FFT interpolator with the same pilot spacing, and even approaches the performance of estimation with twice the number of pilot tones. For instance, the curve of $D_f = 4$, WIF stays quite close to that of $D_f = 2$, FFT.
Chapter 6
Computational Cost
According to the principle and complexity analysis conducted in Chapter 4, the computational cost of the selected algorithms is addressed in this chapter. Since this thesis is aimed at SDR based on programmable hardware, all complex-valued arithmetic operations are decomposed into real FLoating-point OPerations (FLOPs).
6.1
Kernel Arithmetic
For each algorithm, the estimation process has been decomposed into complex matrix computations. Based on the theory and analysis in [29], these computations can be transformed into basic complex-valued operations, which are further converted to real FLOPs in Table 6.1. On top of this, the major computation involved in the channel estimation algorithms can be measured in real FLOPs.
6.1.1
K-FFT
Fast Fourier Transform (FFT) is a computationally efficient algorithm for the DFT that exploits the symmetry and periodicity properties of the phase factor $W_K$. An example of the K = 8 radix-2 FFT algorithm is given in Figure 6.1. Basically, the calculation of the K-point radix-2 FFT requires $\frac{K}{2}\log_2 K$ complex multiplications and $K \log_2 K$ complex additions, which means $2K \log_2 K$ real multiplications and $3K \log_2 K$ real additions. In all, it is $5K \log_2 K$ FLOPs.

| Complex Operation | Real Floating-point Operation |
|---|---|
| complex-complex mult. | 4 real mult. + 2 real add. = 6 FLOPs |
| complex-complex add./sub. | 2 real add./sub. = 2 FLOPs |
| magnitude squared | 2 real mult. + 1 real add. = 3 FLOPs |
| real-complex mult. | 2 real mult. = 2 FLOPs |
| complex inverse | 1 magnitude squared + 2 real div. = 5 FLOPs |
| complex-complex div. | 1 complex inverse + 1 complex mult. = 11 FLOPs |
Table 6.1. Conversion from complex to FLOPs
Figure 6.1. K = 8 radix-2 FFT Algorithm
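The FLOP count of the radix-2 FFT follows directly from the conversion rules for complex multiplication and addition. The short Python sketch below carries out this bookkeeping. It is an illustration only, the function name is hypothetical, and it counts every butterfly multiplication as a full complex multiplication, as the text does.

```python
def radix2_fft_flops(K):
    """Return (real mult., real add., total FLOPs) of a K-point radix-2 FFT."""
    stages = K.bit_length() - 1                 # log2(K) for a power of two
    if K < 2 or (1 << stages) != K:
        raise ValueError("K must be a power of two")
    complex_mults = (K // 2) * stages           # (K/2) log2 K
    complex_adds = K * stages                   # K log2 K
    real_mults = 4 * complex_mults              # 4 real mult. per complex mult.
    real_adds = 2 * complex_mults + 2 * complex_adds   # 2 per complex mult. and add.
    return real_mults, real_adds, real_mults + real_adds

flops_8 = radix2_fft_flops(8)       # the K = 8 example of Figure 6.1
flops_512 = radix2_fft_flops(512)   # FFT size used in the simulation
```

For K = 512 the total is $5 \cdot 512 \cdot 9 = 23040$ real FLOPs per FFT.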
6.1.2
Matrix Multiplication
Generally speaking, multiplication of two complex matrices is fulfilled by Algorithm 6.1 [29]:
Algorithm 6.1 If $A \in \mathbb{C}^{m \times p}$, $B \in \mathbb{C}^{p \times n}$, and $C \in \mathbb{C}^{m \times n}$ are given, then this algorithm overwrites C with AB + C.
for i = 1 : m do
    for j = 1 : n do
        for k = 1 : p do
            C(i, j) = A(i, k)B(k, j) + C(i, j)
        end for
    end for
end for
Mult(n × n square, n × n square)
Through Algorithm 6.1, $mnp$ complex multiplications and $mnp$ complex additions are required to get the product of two matrices. When both matrices are $n \times n$ square, this number becomes $n^3$. Considering that 1 complex multiplication is 4 real multiplications and 2 real additions, and that 1 complex addition requires 2 real additions, the multiplication of two n-by-n complex matrices requires $4n^3$ real multiplications and $4n^3$ real additions respectively, that is, $8n^3$ FLOPs.
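The operation count of Algorithm 6.1 can be verified by instrumenting a direct implementation. The following Python sketch is illustrative: it uses nested lists of Python complex numbers, and the function name and the 2 × 2 test matrices are hypothetical.

```python
def matmul_accumulate(A, B, C):
    """Overwrite C with A*B + C as in Algorithm 6.1 and return the real FLOP count."""
    m, p, n = len(A), len(B), len(B[0])
    flops = 0
    for i in range(m):
        for j in range(n):
            for k in range(p):
                C[i][j] = A[i][k] * B[k][j] + C[i][j]
                flops += 6 + 2      # one complex mult. (6 FLOPs) + one complex add. (2)
    return flops

A = [[1 + 1j, 2 + 0j], [0j, 1j]]
B = [[1 + 0j, 1j], [2 + 0j, 0j]]
C = [[0j, 0j], [0j, 0j]]
flops = matmul_accumulate(A, B, C)   # n = 2, so 8 * n^3 = 64 FLOPs are counted
```

The counter adds 8 real FLOPs per inner iteration, so the total is $8mnp$ in general and $8n^3$ for square matrices.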
Mult(n × n square, n × n triangular)
For the case where a square matrix is multiplied by a triangular matrix, Algorithm 6.1 becomes a variant in which the inner loop runs only j times.
Algorithm 6.2 If $A \in \mathbb{C}^{n \times n}$ and $B \in \mathbb{C}^{n \times n}$ is upper triangular, then this algorithm computes C = AB as
for i = 1 : n do
    for j = 1 : n do
        for k = 1 : j do
            C(i, j) = A(i, k)B(k, j) + C(i, j)
        end for
    end for
end for
Consequently, Algorithm 6.2 requires $\frac{n^2}{2}(n + 1)$ complex multiplications and additions respectively, that is, $4n^3 + 4n^2$ real FLOPs.
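The saving of the triangular variant can be checked in the same way. The sketch below is a hypothetical Python rendering of Algorithm 6.2 with a FLOP counter, and the test matrices are arbitrary small examples. It also compares the result with the full triple loop to confirm that skipping the zero entries of B does not change the product.

```python
def square_times_upper_triangular(A, B):
    """Compute C = A*B for upper triangular B (Algorithm 6.2) and count real FLOPs."""
    n = len(A)
    C = [[0j] * n for _ in range(n)]
    flops = 0
    for i in range(n):
        for j in range(n):
            for k in range(j + 1):          # B[k][j] = 0 for k > j, so stop at k = j
                C[i][j] += A[i][k] * B[k][j]
                flops += 8                  # complex mult. (6) + complex add. (2)
    return C, flops

n = 3
A = [[complex(i + 1, j) for j in range(n)] for i in range(n)]
B = [[complex(k + j, 1) if k <= j else 0j for j in range(n)] for k in range(n)]

C, flops = square_times_upper_triangular(A, B)
C_full = [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]                # reference: full inner loop
```

For n = 3 the counter gives $4 \cdot 27 + 4 \cdot 9 = 144$ FLOPs, compared with $8 \cdot 27 = 216$ for the general algorithm.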
Mult(n × n square, n × 1 vector)
Finally we arrive at a square matrix multiplied by a vector, where the second loop in Algorithm 6.1 can be removed.
Algorithm 6.3 If $A \in \mathbb{C}^{n \times n}$ and $B \in \mathbb{C}^{n \times 1}$ is a vector, then this algorithm computes C = AB as
for i = 1 : n do
    for k = 1 : n do
        C(i) = A(i, k)B(k) + C(i)
    end for
end for
Therefore, the multiplication of a square matrix and a vector calls for $n^2$ complex multiplications and additions, which gives $8n^2$ FLOPs in all.
6.1.3
Matrix Inversion
The basic idea of matrix inversion is to factorize the matrix into an orthogonal matrix Q and an upper triangular matrix R by applying QR decomposition. Then the inverse of the original matrix can be obtained by computing the inverse of the upper triangular matrix. This process can be summarized by
$$ A = QR \implies A^{-1} = R^{-1} Q^{-1} = R^{-1} Q^H \tag{6.1} $$
There are three operations involved in the computation of matrix inversion, namely QR decomposition, upper triangular matrix inversion and multiplication.
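The three steps of Eq.(6.1) can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions: the QR decomposition is delegated to `numpy.linalg.qr` rather than to a specific hardware-oriented method, the triangular inverse uses plain back-substitution, and the function names and the random 4 × 4 test matrix are hypothetical.

```python
import numpy as np

def invert_upper_triangular(R):
    """Invert an upper triangular matrix by back-substitution, column by column."""
    n = R.shape[0]
    X = np.zeros_like(R)
    for j in range(n):
        X[j, j] = 1.0 / R[j, j]
        for i in range(j - 1, -1, -1):      # rows above the diagonal, bottom-up
            X[i, j] = -(R[i, i + 1:j + 1] @ X[i + 1:j + 1, j]) / R[i, i]
    return X

def inverse_via_qr(A):
    """Eq.(6.1): A = QR, so A^-1 = R^-1 Q^H."""
    Q, R = np.linalg.qr(A)                  # step 1: QR decomposition
    R_inv = invert_upper_triangular(R)      # step 2: upper triangular inversion
    return R_inv @ Q.conj().T               # step 3: matrix multiplication

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A_inv = inverse_via_qr(A)
```

Because $R^{-1}$ is itself upper triangular, the final product is a triangular-by-square multiplication, which is cheaper than the general case.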