
Linköping Studies in Science and Technology
Thesis No. 1514

Optimization of Computational Resources for MIMO Detection

Mirsad Čirkić

Division of Communication Systems
Department of Electrical Engineering (ISY)
Linköping University, SE-581 83 Linköping, Sweden


This is a Swedish Licentiate Thesis.

The Licentiate degree comprises 120 ECTS credits of postgraduate studies.

Optimization of Computational Resources for MIMO Detection
© 2011 Mirsad Čirkić, unless otherwise noted.
LIU-TEK-LIC-2011:53
ISBN: 978-91-7393-011-6
ISSN 0280-7971


Abstract

For the past decades, the demand in transferring large amounts of data rapidly and reliably has been increasing drastically. One of the more promising techniques that can provide the desired performance is the multiple-input multiple-output (MIMO) technology, where multiple antennas are placed at both the transmitting and receiving side of the communication link. One major implementation difficulty of the MIMO technology is the signal separation (detection) problem at the receiving side of the MIMO link. This is due to the fact that the transmitted signals interfere with each other and that separating them can be very difficult if the MIMO channel conditions are not beneficial, i.e., the channel is not well-conditioned.

For well-conditioned channels, low-complexity detection methods are often sufficiently accurate. In such cases, performing computationally very expensive optimal detection would be a waste of computational power. This said, for MIMO detection in a coded system, there is always a trade-off between performance and complexity. The fundamental question is, can we save computational resources by performing optimal detection only when it is needed, and something simpler when it is not? This is the question that this thesis aims to answer. In doing so, we present a general framework for adaptively allocating computational resources to different ("simple" and "difficult") detection problems. This general framework is applicable to any MIMO detector and scenario of choice, and it is exemplified using one particular detection method for which specific allocation techniques are developed and evaluated.


Acknowledgments

First of all, I would like to thank my supervisor, Prof. Erik G. Larsson, who has given me the opportunity and guidance to take my first steps into the world's research community. His great knowledge and ability to question, to address, and to solve technical problems is very inspiring. I would also like to thank my colleagues at the Communication Systems and Information Coding divisions. Especially Dr. Daniel Persson and Docent Jan-Åke Larsson, who were never too "busy" to give a helping hand and were always open for fruitful discussions, deserve a big thank you. I also want to thank my fellow PhD students, especially Reza Moosavi, TVK Chaitanya, Erik Axell, and Johannes Lindblom, for many valuable technical and non-technical discussions. I wish to convey my sincere thanks and appreciation to Dr. Lasse Alfredsson and Dr. Mikael Olofsson for their help in my teaching work. I thank my parents, Emir & Raza Čirkić, who have supported and encouraged me to pursue my dream of becoming a researcher. Last, but most definitely not the least, I would like to thank my fiancée, Therése M. Čirkić, for never doubting me and for putting up with me working many late after-office hours.

Linköping, December 2011
Mirsad Čirkić


Contents

Part I: Introduction

Motivation

MIMO Detection
1 Hard MIMO Detection
  1.1 Zero Forcing
  1.2 Zero Forcing with Decision-Feedback
  1.3 Sphere Decoding
  1.4 Fixed Complexity Sphere Decoding
  1.5 Reduced-Dimension Maximum Likelihood Search
  1.6 Lattice-Reduction Aided Detectors
2 Soft MIMO Detection
  2.1 Max-Log Detection
  2.2 Soft Zero Forcing
  2.3 The PM Method
3 Summary

Adaptive Computational Resource Allocation
1 Previous Work
2 Contributions of the Thesis
3 Future Work

Part II: Included Papers

A Allocation of Computational Resources for Soft MIMO Detection
1 Introduction
2 Preliminaries
  2.1 Model
  2.2 Setup
  2.3 Soft Detection
3 Proposed Allocation Framework
  3.1 General Formulation
  3.2 Prediction of Detector Performance
  3.3 Allocation Approaches
4 Numerical Experiments
  4.1 MI Table Generation
  4.2 Results
5 Discussion and Conclusions
A PM Complexity
B Unpublished Simulation Results

B Approximating the LLR Distribution for the Optimal and Partial Marginalization MIMO Detectors
1 Introduction
2 Preliminaries
3 Distribution Approximations
  3.1 Gaussian Approximation
  3.2 Gaussian Mixture Model Approximation
4 Numerical Results
  4.1 Simulation Setup
  4.2 Results
5 Conclusions


Part I


Motivation

For the past decades, the demand in transferring large amounts of data rapidly and reliably [1] has been increasing drastically. Researchers and engineers have been developing different wireless technologies and techniques to meet this continuously increasing demand. One of the more promising ones is the multiple-input multiple-output (MIMO) technology, where multiple antennas are placed at both the transmitting and receiving side of the communication link. It has many advantages over single-antenna technologies, such as increased information throughput [2] and link reliability through diversity [3] in rich scattering environments.

One major implementation difficulty of the MIMO technology is the signal separation (detection) problem at the receiving side of the MIMO link. This is due to the fact that the transmitted signals interfere with each other and that separating them can be very difficult if the MIMO channel matrix is not well-conditioned. Optimal detection methods, which have been well-known for a long time and which solve these difficult detection problems optimally, enumerate the transmitted constellation. Unfortunately, the complexity of such enumeration procedures increases exponentially in the number of transmitting antennas. Therefore, such methods are not realizable for a large number of transmitting antennas. This is especially the case for the newly emerging technology known as very large MIMO, where tens of antennas or more are placed at both the transmitter and receiver [4]. Today's research addresses the MIMO detection problem in an approximate manner by developing methods that yield adequate performance with tolerable complexity [5–17].


The research revolving around MIMO detection had its first boom during the code-division multiple-access era, where the MIMO link consisted of multiple single-antenna users transmitting to a multiple-antenna receiver [18]. The detection problem was to separate the streams of each user. Since then, a vast literature has been produced that presents different approximate detection methods. The simplest and computationally cheapest algorithms stem from the least squares solution where a linear decoupling procedure is performed followed by a quantization step that chooses a unique point from the transmitted constellation. Some more advanced methods offer the possibility of trading performance for complexity via some user parameter: sphere decoding (SD) [5], fixed-complexity SD (FCSD) [6], soft-output via partial marginalization (PM) [7], and lattice reduction (LR) [8].

The vast majority of the MIMO detection literature presents different detection techniques, but not which technique to use based on the effective channel conditions. For well-conditioned channels, sub-optimal low-complexity methods, such as soft zero-forcing (ZF), are often sufficiently accurate. In fact, for wireless environments that yield channel matrices with orthogonal columns, the ZF method is optimal and equivalent to maximum ratio combining. In such a case, performing optimal detection and computationally heavy enumeration would be a waste of computational power. This said, for MIMO detection, there is always a trade-off between performance and complexity. The fundamental question is, can we save computational resources by performing heavy enumeration only when it is needed, and something simpler when it is not? There is not much earlier work that considers such adaptive detection, which is the main question that this thesis addresses and tries to answer. More about this further ahead.


MIMO Detection

We consider the real-valued MIMO-channel model
$$y = Hs + e, \qquad (1)$$
where $H \in \mathbb{R}^{N_R \times N_T}$ is the MIMO channel matrix with full column rank and $s \in \mathcal{S}^{N_T}$ is the transmitted vector. Further, $e \in \mathbb{R}^{N_R}$, $e \sim \mathcal{N}\!\left(0, \tfrac{N_0}{2} I\right)$, denotes the noise vector and $y \in \mathbb{R}^{N_R}$ is the received vector. The channel is perfectly known to the receiver and, in what follows, we assume that $N_R \geq N_T$ since this is typical in practice and simplifies the mathematics. With separable complex symbol constellations, every complex-valued model of type (1) can be posed as a real-valued model, see e.g. [7].

There are two main categories of detectors: hard decision detectors, which decide whether a bit is zero or one, and soft decision detectors, which decide how likely it is that a bit is zero or one. Compared with the former, the latter category produces more information at the output and, in a receiving chain with appropriate soft decoders, yields much better performance. It is noteworthy that any hard decision detector can be modified to produce soft decisions via the max-log approximation. A large part of the literature to the present day considers hard decision detectors.

1 Hard MIMO Detection

The optimal hard MIMO detector is the one that chooses the candidate in the symbol vector constellation $\mathcal{S}^{N_T}$ that maximizes the a posteriori probability of the symbol vectors. Hence,
$$\hat{s} \triangleq \operatorname{argmax}_{s \in \mathcal{S}^{N_T}} P(s \mid y). \qquad (2)$$
With uniform a priori probabilities and the model in (1), the hard detection problem becomes equivalent to
$$\hat{s} = \operatorname{argmin}_{s \in \mathcal{S}^{N_T}} \|y - Hs\| = \operatorname{argmin}_{s \in \mathcal{S}^{N_T}} \|y' - Ls\|, \qquad (3)$$
where $y' = Q^T y$, $H = QL$, $Q \in \mathbb{R}^{N_R \times N_T}$ with $Q^T Q = I$, and $L \in \mathbb{R}^{N_T \times N_T}$ is a lower-triangular invertible matrix. This problem, which is referred to as the maximum likelihood (ML) problem, can be thought of as a tree-search problem where the algorithm needs to reach all the leaf nodes in the tree in order to decide which branch path is optimal, see Fig. 1. The exponential complexity in $N_T$ ($|\mathcal{S}|^{N_T}$ possible solution candidates) that arises in this problem statement is the main issue in MIMO detection. It has required serious attention and still does. Thus, many approximate methods have been proposed, some of which are explained thoroughly in the sections that follow. A more condensed overview can be found in [19].
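To make the enumeration in (3) concrete, the following is a minimal NumPy sketch of exhaustive ML detection. The real-valued 4-PAM alphabet and the small random channel are assumptions made here purely for illustration.

```python
import itertools
import numpy as np

def ml_detect(y, H, alphabet):
    """Exhaustive ML detection, cf. (3): enumerate all |S|^NT candidate
    vectors and keep the one minimizing ||y - H s||. Exponential in NT."""
    NT = H.shape[1]
    best_s, best_metric = None, np.inf
    for cand in itertools.product(alphabet, repeat=NT):
        s = np.array(cand, dtype=float)
        metric = np.linalg.norm(y - H @ s)
        if metric < best_metric:
            best_s, best_metric = s, metric
    return best_s

# Illustrative usage with arbitrary values (not from the thesis).
rng = np.random.default_rng(0)
alphabet = np.array([-3.0, -1.0, 1.0, 3.0])     # real-valued 4-PAM (assumed)
H = rng.standard_normal((4, 3))                 # N_R = 4, N_T = 3
s_true = rng.choice(alphabet, size=3)
y = H @ s_true + 0.1 * rng.standard_normal(4)
print(ml_detect(y, H, alphabet), s_true)
```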

1.1 Zero Forcing

One of the crudest and simplest approximations of the problem in (3) is the zero forcing (ZF) solution. It consists of two steps: decoupling of the interfering symbols using a pseudo-inverse and then quantizing the decoupled symbols independently per dimension. Hence, first a relaxed problem is solved where the relaxation is made on the constraint $s \in \mathcal{S}^{N_T}$,
$$\hat{s}_{\mathrm{r}} = \operatorname{argmin}_{s \in \mathbb{R}^{N_T}} \|y' - Ls\| = L^{-1} y', \qquad (4)$$
which can be performed using Gaussian elimination: $\hat{s}_{\mathrm{r},1} = y'_1 / L_{11}$, $\hat{s}_{\mathrm{r},2} = (y'_2 - L_{21}\hat{s}_{\mathrm{r},1}) / L_{22}$, and so on. Thereafter, the relaxation is removed and the resulting solution in (4) is quantized,
$$\hat{s}_{\mathrm{zf}} \triangleq \operatorname{argmin}_{s \in \mathcal{S}^{N_T}} \|\hat{s}_{\mathrm{r}} - s\| = \lceil \hat{s}_{\mathrm{r}} \rfloor, \qquad (5)$$

where the operator $\lceil \cdot \rfloor$ denotes the quantization operation. Note that this is done independently per dimension.

[Figure 1: Hard detection as a tree-search problem. The illustration shows the tree down to layer 2. In each node, there are $M = |\mathcal{S}|$ constellation points (branch paths) to choose from. The branch metric for a particular branch, say $c_{22}$ where for example $s_1 = a_1$ and $s_2 = a_2$, is $c_{22} = |y'_2 - L_{21}a_1 - L_{22}a_2|$, and so on. The objective is to find the path down to layer $N_T$ that yields the smallest accumulated branch metric.]

The complexity of this detector is mainly dictated by the QL-decomposition $H = QL$ in (3), which requires of the order $\mathcal{O}(N_T^3)$ operations. The main problem that inhibits the performance of this algorithm is the occurrence of ill-conditioned $H$ matrices, which after the inversion in (4) significantly increase the noise variance in $\hat{s}_{\mathrm{r}}$ compared to that in $y$. There are several flavors of this solution, such as minimum mean square-error (MMSE) inspired solutions, which improve the performance somewhat, but unfortunately do not avoid the fundamental issue regarding the conditioning of the problem.
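As a concrete illustration of (4)–(5), here is a small NumPy sketch of ZF detection; the pseudo-inverse stands in for the Gaussian-elimination step described above, and the alphabet is assumed to be a NumPy array of real constellation points.

```python
import numpy as np

def zf_detect(y, H, alphabet):
    """Zero forcing, cf. (4)-(5): unconstrained least-squares decoupling
    followed by independent per-dimension quantization."""
    alphabet = np.asarray(alphabet, dtype=float)
    s_relaxed = np.linalg.pinv(H) @ y                        # relaxed solution (4)
    idx = np.argmin(np.abs(s_relaxed[:, None] - alphabet[None, :]), axis=1)
    return alphabet[idx]                                     # quantization (5)
```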

1.2 Zero Forcing with Decision-Feedback

The ZF with decision-feedback (ZF-DF) method is an iterative extension to the ZF method. Instead of performing a quantization step after all the symbols are decoupled, it does that intermediately. Hence, using Gaussian elimination, it detects $\hat{s}_{\mathrm{df},1} = \lceil y'_1 / L_{11} \rfloor$, then it assumes $s_1 = \hat{s}_{\mathrm{df},1}$ to be known and detects $\hat{s}_{\mathrm{df},2} = \lceil (y'_2 - L_{21}\hat{s}_{\mathrm{df},1}) / L_{22} \rfloor$, and so on. The ZF-DF solution is denoted with $\hat{s}_{\mathrm{df}}$. In a tree-search like formulation as in Fig. 1, the ZF-DF method can be viewed as a greedy search algorithm that in each layer goes only along the branch with the so-far smallest accumulated branch metric.

Compared to the ZF method, the complexity is of the same order and the ZF-DF method generally yields better performance. The main issue with it is, due to the noise variance, the error propagation that occurs when a wrong decision is made on a preceding symbol. This error is then highly likely to induce an error in the next decision, and so on. Error propagation can be partially reduced by ordering the symbols properly, instead of using the natural ordering in which the algorithm is presented. A good ordering approach is to detect the symbols with the lowest noise variance first. This is not necessarily the optimal approach and there are indeed many possible ways available to do so. Unfortunately, even with optimal ordering (the ordering that minimizes $\|y' - L\hat{s}_{\mathrm{df}}\|$), the error propagation has a significant impact on the performance since the fundamental issue with the conditioning of the problem remains. The worse the conditioning is, the larger the risk of significant error propagation.
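A minimal sketch of ZF-DF on the triangularized model in (3) follows. The QL helper obtains $H = QL$ from NumPy's QR routine applied to a row- and column-reversed matrix, and the symbol ordering is simply the natural one (no optimized ordering), both being choices made only for this sketch.

```python
import numpy as np

def ql_decomposition(H):
    """QL decomposition H = Q L (Q^T Q = I, L lower-triangular), computed
    via a QR factorization of the row- and column-reversed matrix."""
    Qb, Rb = np.linalg.qr(H[::-1, ::-1])
    return Qb[::-1, ::-1], Rb[::-1, ::-1]

def zfdf_detect(y, H, alphabet):
    """ZF with decision feedback: quantize one symbol per layer and feed the
    decision back before detecting the next symbol (greedy tree search)."""
    alphabet = np.asarray(alphabet, dtype=float)
    Q, L = ql_decomposition(H)
    y_prime = Q.T @ y
    NT = H.shape[1]
    s_hat = np.zeros(NT)
    for k in range(NT):                                  # layer k
        residual = y_prime[k] - L[k, :k] @ s_hat[:k]     # feed back earlier decisions
        s_hat[k] = alphabet[np.argmin(np.abs(residual / L[k, k] - alphabet))]
    return s_hat
```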

1.3 Sphere Decoding

The sphere decoding (SD) method, first presented in [20] in a different context than MIMO detection, performs a partial enumeration of the entire constellation $\mathcal{S}^{N_T}$ [5]. It does that via a pre-determined user parameter, the so-called sphere radius $R$, by considering only candidate vectors $s \in \mathcal{S}^{N_T}$ that lie inside the sphere $\|y' - Ls\|^2 \leq R$; hence the name sphere decoding. The partial enumeration is performed on the fly by sequentially deciding, in the tree-search problem in Fig. 1, one layer at a time, which candidates do not lie within the sphere and thus are to be excluded in the following layers. Hence, for layer $k$, the candidates $s_1, \ldots, s_k \in \mathcal{S}$ (branch paths in the tree) that fulfill $\sum_{i=1}^{k} \sum_{j=1}^{i} (y'_i - L_{ij}s_j)^2 \leq R$ are the only ones considered in the upcoming layers $k+1, \ldots, N_T$. The rest of the paths are ignored. In the end, if the radius $R$ is large enough such that the SD method visits paths down to layer $N_T$, the candidates $\hat{s}_1, \hat{s}_2, \ldots, \hat{s}_{N_T}$ that yield the smallest quantity $\sum_{i=1}^{N_T} \sum_{j=1}^{i} (y'_i - L_{ij}\hat{s}_j)^2$ (accumulated branch metric) when the SD method finishes give the solution to (3).

There are many extensions to this method. One of the simplest extensions is SD with pruning. Some more advanced extensions are mentioned in Sec. 1.4 and 1.5. In the SD method with pruning, the sphere radius $R$ is varied. It is first set to some very large value; then, when the algorithm reaches a leaf node (a node in layer $N_T$), the corresponding candidate vector $\hat{s}$ will be within the radius and its distance to $y'$ will be smaller than the current radius, $\|y' - L\hat{s}\|^2 < R$; the radius $R$ is then reduced to $\|y' - L\hat{s}\|^2$. When the algorithm reaches another leaf node, the distance of the corresponding candidate vector to $y'$ will be smaller than the previously updated radius; the radius is therefore updated again, and so forth.

By adjusting the initial sphere radius $R$, the SD algorithm provides a trade-off between complexity and performance. For smaller $R$, the algorithm may not reach any leaf node and therefore not yield a unique solution vector; one would then have to pick randomly amongst the candidate vectors left. On the other hand, if $R$ is large, the algorithm may, depending on $H$, visit many candidates and therefore require a significant amount of time before it finishes.

The main advantage of the SD is that it is simple to implement and apprehend. One of the main drawbacks of the SD algorithm is the variable complexity, which depends on the effective channel conditions, and the fact that the worst-case and expected complexity are exponential in the number $N_T$ [21].


Due to the sequential tree-search like nature of the algorithm, it does not fit well for parallel implementations, which is a necessity when $N_T$ is large.
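The following is a minimal depth-first sketch of SD with radius shrinking (the pruning variant described above), operating on the triangularized model $y' = Ls + e'$ from (3). It assumes $y'$ and $L$ have already been computed, for example with the QL helper from the ZF-DF sketch.

```python
import numpy as np

def sphere_decode(y_prime, L, alphabet, radius=np.inf):
    """Depth-first sphere decoding with radius shrinking: branches whose
    accumulated metric exceeds the current squared radius are pruned."""
    NT = L.shape[1]
    best = {"s": None, "metric": float(radius)}        # current squared radius
    s = np.zeros(NT)

    def search(k, metric_so_far):
        if k == NT:                                    # reached a leaf node
            best["s"], best["metric"] = s.copy(), metric_so_far
            return
        for a in alphabet:
            s[k] = a
            branch = (y_prime[k] - L[k, :k + 1] @ s[:k + 1]) ** 2
            if metric_so_far + branch <= best["metric"]:
                search(k + 1, metric_so_far + branch)

    search(0, 0.0)
    return best["s"]        # None if no candidate was found inside the sphere
```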

1.4 Fixed Complexity Sphere Decoding

To address the drawbacks of the SD algorithm, the authors in [6] proposed an approach that fixes the complexity of SD and provides a highly parallelizable structure. In order to explain the fixed complexity SD (FCSD) method in [6], for fixed $r \in \{0, \ldots, N_T - 1\}$, we define the following partitioning of the model in (1):
$$y = Hs + e = \underbrace{[\,\bar{H} \;\; \tilde{H}\,]}_{\text{col. permut. of } H} \, \underbrace{[\,\bar{s}^T \;\; \tilde{s}^T\,]^T}_{\text{permut. of } s} + e = \bar{H}\bar{s} + \tilde{H}\tilde{s} + e, \qquad (6)$$
where $\bar{H} \in \mathbb{R}^{N_R \times (r+1)}$, $\tilde{H} \in \mathbb{R}^{N_R \times (N_T - r - 1)}$, $\bar{s} \in \mathcal{S}^{r+1}$, and $\tilde{s} \in \mathcal{S}^{N_T - r - 1}$. The choice of partitioning involves the choice of a permutation, and how to perform this choice is not obvious. In fact, there are $\binom{N_T}{r+1}$ possible partitionings in (6). The aim in FCSD is to find a partitioning such that the condition number of the matrix $\tilde{H}$ is minimized; it will become clear why in what follows.

The FCSD method offers a trade-off between exact and approximate computation of (3) via the parameter $r$. More specifically, the FCSD splits the minimization in (3) into two parts,
$$\hat{s} = \operatorname{argmin}_{s \in \mathcal{S}^{N_T}} \|y - Hs\| = \operatorname{argmin}_{\bar{s} \in \mathcal{S}^{r+1}} \, \operatorname{argmin}_{\tilde{s} \in \mathcal{S}^{N_T - r - 1}} \|y - \bar{H}\bar{s} - \tilde{H}\tilde{s}\|, \qquad (7)$$
and then approximates the second minimization by a simple sub-optimal hard detector such as ZF-DF. Thus, the enumeration is only performed over the $\bar{s}$ part of the vector $s$. This can be viewed in the tree-search problem as a full search down to layer $r$ followed by a greedy search that only picks the best branch paths down to the last layer. The ZF-DF method is computationally much more efficient than the exact minimization in (7), but it performs well only for well-conditioned matrices. However, the secondary min problems in (7) are generally well-conditioned since the matrices $\tilde{H}$ are tall. In addition to that, when forming the partitioning in (6), the original symbol order in $s = [s_1, \ldots, s_{N_T}]^T$ is permuted so that the condition number of $\tilde{H}$ is minimized. Notably, FCSD performs ZF-DF for $r = 0$ and solves the exact ML problem (as defined by (3)) for $r = N_T - 1$.
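A minimal FCSD sketch in the spirit of (6)–(7) follows, reusing the zfdf_detect helper from the ZF-DF sketch above. For simplicity it enumerates over the first $r+1$ columns of $H$ rather than the condition-number-optimized permutation described in the text.

```python
import itertools
import numpy as np

def fcsd_detect(y, H, alphabet, r):
    """Fixed-complexity SD, cf. (7): full enumeration over s_bar (r+1
    symbols), ZF-DF for the remaining s_tilde, keep the best candidate."""
    H_bar, H_tilde = H[:, :r + 1], H[:, r + 1:]
    best_s, best_metric = None, np.inf
    for cand in itertools.product(alphabet, repeat=r + 1):
        s_bar = np.array(cand, dtype=float)
        # Inner (secondary) problem of (7), approximated by ZF-DF:
        s_tilde = zfdf_detect(y - H_bar @ s_bar, H_tilde, alphabet)
        s_full = np.concatenate([s_bar, s_tilde])
        metric = np.linalg.norm(y - H @ s_full)
        if metric < best_metric:
            best_s, best_metric = s_full, metric
    return best_s
```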


1.5 Reduced-Dimension Maximum Likelihood Search

The reduced-dimension maximum-likelihood search (RD-MLS) method [22] uses the same core idea as the FCSD. It splits the minimization problem in (3), by using the partitioned model in (6), into two parts as in (7). Then, as opposed to the FCSD method that uses a simple hard detector to solve the secondary problem, it uses solely a linear estimator $F$, such as the MMSE estimator, to approximate the secondary problem without performing any quantization. Hence,
$$\hat{\bar{s}} = \operatorname{argmin}_{\bar{s} \in \mathcal{S}^{r+1}} \|y - \bar{H}\bar{s} - \tilde{H}F(y - \bar{H}\bar{s})\| = \operatorname{argmin}_{\bar{s} \in \mathcal{S}^{r+1}} \|z - G\bar{s}\|, \qquad (8)$$
where $z = (I - \tilde{H}F)y$ and $G = (I - \tilde{H}F)\bar{H}$. This reduces the dimension of the minimization to that of the space of $\bar{s}$. Then, the problem with the reduced dimension is solved using an SD type of algorithm. Using the solution of the reduced-dimension problem, $\hat{\bar{s}}$, the rest of the symbol vector is detected with a simple hard detector such as the MMSE with successive interference cancellation method. The main difference between the RD-MLS and the FCSD method is that FCSD uses the output of a simple hard detector in the secondary minimization when solving the primary minimization, whereas the RD-MLS does not. The disadvantage of the RD-MLS algorithm is that it does not improve the conditioning of the reduced problem (matrix $G$) compared to the original one (matrix $H$), as is done in the FCSD method. The reason is the unquantized linear estimator (matrix $F$), which essentially results in a projection of the reduced-dimension space $\bar{s}$ onto the orthogonal complement of the column space of $\tilde{H}$.

This algorithm provides two parameters with which complexity can be traded for performance: the dimension parameter $r$ and the sphere radius $R$ of the SD algorithm. The RD-MLS algorithm reduces the complexity of the SD method. It also inherits the SD algorithm's properties, both good and bad. For instance, the disadvantage of variable complexity comes along.
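The sketch below forms the reduced-dimension problem in (8). A ZF-type estimator $F$ is used here for simplicity (the method in [22] uses an MMSE estimator), and the reduced search is done by brute force instead of by an SD-type algorithm; both are assumptions of this sketch, not the method of [22].

```python
import itertools
import numpy as np

def rdmls_reduced_problem(y, H_bar, H_tilde):
    """Form z and G of (8) with a ZF-type linear estimator F for s_tilde."""
    F = np.linalg.pinv(H_tilde)
    P = np.eye(H_bar.shape[0]) - H_tilde @ F      # (I - H~F)
    return P @ y, P @ H_bar                       # z = (I - H~F) y,  G = (I - H~F) H_bar

def rdmls_detect_sbar(z, G, alphabet):
    """Solve argmin ||z - G s_bar|| over the reduced space (brute force here;
    an SD-type search would be used in practice)."""
    r1 = G.shape[1]
    cands = (np.array(c, dtype=float) for c in itertools.product(alphabet, repeat=r1))
    return min(cands, key=lambda s_bar: np.linalg.norm(z - G @ s_bar))
```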

1.6 Lattice-Reduction Aided Detectors

The lattice-reduction (LR) aided MIMO detection algorithms build upon the idea of transforming an ill-conditioned problem into an equivalent well-conditioned problem via a linear transform T that fulfills certain conditions.


Once the problem is well-conditioned, simple hard detectors such as ZF can be used to achieve near-optimal performance.

How does this work? First, the finite constellation $\mathcal{S}$ is, if the constellation points are uniformly spaced, scaled with some scalar $1/\alpha$ and superimposed to enumerate all integers $\mathbb{Z}$. The infinite constellation $\mathbb{Z}^{N_T}$ represents a square lattice with dimension $N_T$. Then the ML problem in (3) is relaxed to find the point in the lattice (after applying $\alpha H$) that is closest to the received data $y$. Hence, the relaxed ML problem becomes
$$\hat{x} = \operatorname{argmin}_{x \in \mathbb{Z}^{N_T}} \|y - \alpha H x\|, \qquad (9)$$
which is equivalent to finding the point, in the lattice having the basis consisting of the column vectors of $\alpha H$, that is closest to $y$ ($\operatorname{argmin}_{x \in \alpha H \mathbb{Z}^{N_T}} \|y - x\|$). The basis matrix $\alpha H$ does not contain a unique set of basis vectors for the particular lattice $\alpha H \mathbb{Z}^{N_T}$. Indeed, there is an infinite number of different basis vectors that span the same lattice; some yield well-conditioned basis matrices and some do not, see Fig. 2. By finding an appropriate transformation matrix $T$, an ill-conditioned realization of the problem in (9) can be transformed into an equivalent well-conditioned problem. The $T$ matrix must be invertible and endomorphic (maps a space onto itself) with respect to the lattice spanned by $\alpha H$. Hence,
$$\hat{x} = \operatorname{argmin}_{x \in \mathbb{Z}^{N_T}} \|y - \alpha H x\| = \operatorname{argmin}_{x \in \mathbb{Z}^{N_T}} \|y - \alpha H T T^{-1} x\| = T \operatorname{argmin}_{x \in \mathbb{Z}^{N_T}} \|y - \alpha H T x\|.$$
With a well-conditioned problem (matrix $\alpha H T$), finding the closest lattice point is easy. The main difficulty has been shifted from finding the closest point in a lattice to finding an appropriate transformation matrix $T$. There are certainly many ways to find a good matrix $T$: some are computationally demanding and some are not. One of the most well-known and computationally cheap algorithms is the LLL lattice reduction (LLL-LR) algorithm introduced by A. K. Lenstra, H. W. Lenstra, and L. Lovász in [23]. This algorithm has, as many other LR algorithms do due to their iterative nature, a variable complexity which has not been upper-bounded yet. Nevertheless, the LLL average complexity has been shown to be polynomial in the number $N_T$. For more detail on LR-aided MIMO detection, see [8] and the references therein. The authors in [8] summarize the main properties and algorithms of LR.
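A sketch of the LR-aided ZF idea on the integer model (9) follows. The lll_reduce routine is a placeholder for any lattice-reduction algorithm (e.g. an implementation of LLL [23]) that returns a unimodular $T$; mapping the integer decision back to the original constellation (undoing the $1/\alpha$ scaling and any offset) is constellation-dependent and omitted here.

```python
import numpy as np

def lr_aided_zf_integer_detect(y, A, lll_reduce):
    """LR-aided ZF on the lattice problem (9), with A = alpha * H.
    `lll_reduce(A)` is assumed to return a unimodular matrix T such that
    A @ T is a better-conditioned basis of the same lattice."""
    T = lll_reduce(A)                            # placeholder lattice-reduction step
    B = A @ T                                    # well-conditioned basis of the lattice
    x_reduced = np.rint(np.linalg.pinv(B) @ y)   # ZF + per-dimension rounding
    return T @ x_reduced                         # x_hat = T * argmin_x ||y - A T x||
```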


[Figure 2: Square lattice generated by two different bases. The two bases are given by the column vectors in $B_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ and $B_2 = \begin{bmatrix} 1 & 0 \\ 2 & 2 \end{bmatrix}$, respectively. The basis matrix $B_1$ is obviously much better conditioned than $B_2$, and thus finding the closest lattice point $\hat{x}$ in the basis given by $B_1$ is much simpler than in that given by $B_2$.]

2 Soft MIMO Detection

The optimal soft information desired by the channel decoder is the a posteriori log-likelihood ratio
$$l(b_i \mid y) \triangleq \log\left(\frac{P(b_i = 1 \mid y)}{P(b_i = 0 \mid y)}\right), \qquad (10)$$
where $b_i$ is the $i$:th bit of the transmitted vector $s$. The quantity in (10) tells us how likely it is that the $i$:th bit of $s$ is equal to zero or one, respectively. By using Bayes' rule, performing marginalization over all bits except the $i$:th bit, and assuming uniform a priori probabilities, the log-likelihood ratio (LLR) becomes
$$l(b_i \mid y) = \log\left(\frac{\sum_{s:\, b_i(s)=1} \exp\left(-\frac{1}{N_0}\|y - Hs\|^2\right)}{\sum_{s:\, b_i(s)=0} \exp\left(-\frac{1}{N_0}\|y - Hs\|^2\right)}\right), \qquad (11)$$
where the notation $\sum_{s:\, b_i(s)=x}$ means the sum over all possible vectors $s \in \mathcal{S}^{N_T}$ for which the $i$:th bit is equal to $x$. In (11), there are $|\mathcal{S}|^{N_T}$ terms that need to be evaluated and added, which again results in an exponential complexity in $N_T$. This is a significant obstacle when it comes to realizing the optimal soft detector in practice.
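To make (11) concrete, the following sketch computes the exact LLRs by full enumeration for the simplest possible labeling, $\mathcal{S} = \{-1, +1\}$ with bit $b_i = 1$ mapped to $s_i = +1$ (one bit per real dimension); this labeling is an assumption made here only to keep the bit mapping trivial.

```python
import itertools
import numpy as np

def exact_llrs_bpsk(y, H, N0):
    """Exact LLRs per (11) by enumerating all candidates in {-1,+1}^NT.
    num/den hold running log-sums for the hypotheses b_i = 1 and b_i = 0."""
    NT = H.shape[1]
    num = np.full(NT, -np.inf)
    den = np.full(NT, -np.inf)
    for cand in itertools.product([-1.0, 1.0], repeat=NT):
        s = np.array(cand)
        log_term = -np.linalg.norm(y - H @ s) ** 2 / N0
        for i in range(NT):
            if s[i] > 0:
                num[i] = np.logaddexp(num[i], log_term)
            else:
                den[i] = np.logaddexp(den[i], log_term)
    return num - den
```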


2.1 Max-Log Detection

A very good approximation of (11) is the so-called max-log approximation, where the sums in both the numerator and the denominator are replaced by their corresponding largest term. Why max-log is a very good approximation follows from the fact that
$$\log(e^a + e^b) = \max(a, b) + \log(1 + e^{-|a-b|}) \approx \max(a, b) + e^{-|a-b|} \approx \max(a, b),$$
where the approximations are very close to equalities when $e^{-|a-b|}$ is very small. The quantity $e^{-|a-b|}$ is generally very small even when the difference between $a$ and $b$ is not very large. Keeping only the largest term of each sum in (11) corresponds to keeping the candidate with the smallest distance under each bit hypothesis, so the max-log approximation of (11) is
$$l(b_i \mid y) \approx \frac{1}{N_0}\left(\min_{s:\, b_i(s)=0} \|y - Hs\|^2 - \min_{s:\, b_i(s)=1} \|y - Hs\|^2\right). \qquad (12)$$
The complexity of this approximation remains exponential in the number $N_T$, since a full enumeration of all the $|\mathcal{S}|^{N_T}$ candidates is still performed. With this approximation, we end up taking hard decisions, i.e., solving minimization problems, in order to obtain soft decisions. Therefore, any hard detector of choice can produce soft values via this approximation.
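A small sketch of (12) for the same assumed $\{-1, +1\}$ labeling as above: keep the best (smallest) squared distance under each bit hypothesis and difference them.

```python
import itertools
import numpy as np

def maxlog_llrs_bpsk(y, H, N0):
    """Max-log LLRs per (12) for S = {-1,+1} (bit 1 -> s = +1)."""
    NT = H.shape[1]
    best1 = np.full(NT, np.inf)      # smallest ||y - Hs||^2 with b_i = 1
    best0 = np.full(NT, np.inf)      # smallest ||y - Hs||^2 with b_i = 0
    for cand in itertools.product([-1.0, 1.0], repeat=NT):
        s = np.array(cand)
        d2 = np.linalg.norm(y - H @ s) ** 2
        sel = s > 0
        best1[sel] = np.minimum(best1[sel], d2)
        best0[~sel] = np.minimum(best0[~sel], d2)
    return (best0 - best1) / N0
```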

2.2 Soft Zero Forcing

A very crude approximation to (11), as in the hard decision case, is the soft zero forcing approximation. Similarly to hard decision ZF, soft decision ZF first decouples the symbols in $s$ via
$$z = L^{-1} y' = s + L^{-1} Q^T e = s + n, \qquad n \sim \mathcal{N}\!\left(0, \tfrac{N_0}{2}(L^T L)^{-1}\right).$$
The decoupled vector model is split up into $N_T$ scalar models,
$$z_k = s_k + n_k, \qquad k = 1, \ldots, N_T. \qquad (13)$$
Then, by approximating the noise terms $n_k$ as uncorrelated over all $k = 1, \ldots, N_T$, soft decisions are computed on the bits in $s_k$ (independently of the bits in $s_\ell$ for all $\ell \neq k$) by applying (11) to the scalar model in (13). The complexity of this algorithm is of the same order of magnitude as in the hard decision case, namely $\mathcal{O}(N_T^3)$.
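A sketch of soft ZF for the assumed $\{-1,+1\}$ labeling. The pseudo-inverse is used for the decoupling (equivalent to $L^{-1}y'$, since $H^T H = L^T L$), and the per-dimension noise variances are read off the diagonal of the decoupled noise covariance.

```python
import numpy as np

def soft_zf_llrs_bpsk(y, H, N0):
    """Soft ZF per (13) for S = {-1,+1}: decouple, then scalar LLRs with
    the noise treated as uncorrelated across dimensions."""
    z = np.linalg.pinv(H) @ y                            # z_k = s_k + n_k
    var = (N0 / 2) * np.diag(np.linalg.inv(H.T @ H))     # Var(n_k)
    return 2.0 * z / var                                 # scalar LLR: 2 z_k / Var(n_k)
```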


2.3 The PM Method

The soft-output via partial marginalization (PM) method in [7] offers a trade-off between exact and approximate computation of (11), via a parameter $r \in \{0, \ldots, N_T - 1\}$. The PM method is an extension of the hard decision detector FCSD to the case of soft decision detection. It retains the highly parallelizable structure. We present the slightly modified version in [24] of the method in [7], which is simpler than that in [7] but without compromising performance.

Consider again the partitioned model in (6), where now $\bar{s} \in \mathcal{S}^{r+1}$ contains the $i$:th bit in the original symbol vector $s$. How the partitioning in (6) is chosen is analogous to that in FCSD, i.e., the condition number of the matrix $\tilde{H}$ is aimed to be minimized. The PM method implements a two-step approximation of (11). More specifically, in the first step it approximates the sums of (11) that correspond to $\tilde{s}$ with a maximization,
$$l(b_i \mid y) \approx \log\left(\frac{\sum_{\bar{s}:\, b_i(s)=1} \max_{\tilde{s}} \exp\left(-\frac{1}{N_0}\|y - \bar{H}\bar{s} - \tilde{H}\tilde{s}\|^2\right)}{\sum_{\bar{s}:\, b_i(s)=0} \max_{\tilde{s}} \exp\left(-\frac{1}{N_0}\|y - \bar{H}\bar{s} - \tilde{H}\tilde{s}\|^2\right)}\right). \qquad (14)$$
In the second step, the maximization in (14) is approximated, as in the FCSD method, with a simple hard detector such as the ZF-DF detector [7]. Why this approximation is reasonable follows from the discussions in Sec. 1.4 and Sec. 2.1. Notably, PM performs ZF-DF aided max-log detection for $r = 0$ and computes the exact LLR values (as defined by (11)) for $r = N_T - 1$.
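A minimal PM sketch for the assumed $\{-1,+1\}$ labeling, reusing zfdf_detect from the earlier sketch. For simplicity it takes the first $r+1$ columns as the $\bar{s}$ part and only returns LLRs for the bits carried by $\bar{s}$; in the full method of [7, 24], the partitioning is chosen per bit (and with the condition number of $\tilde{H}$ in mind) so that every bit of interest ends up in $\bar{s}$.

```python
import itertools
import numpy as np

def pm_llrs_bpsk(y, H, N0, r):
    """Partial marginalization, cf. (14): marginalize exactly over s_bar,
    approximate the max over s_tilde with ZF-DF."""
    alphabet = np.array([-1.0, 1.0])
    H_bar, H_tilde = H[:, :r + 1], H[:, r + 1:]
    num = np.full(r + 1, -np.inf)
    den = np.full(r + 1, -np.inf)
    for cand in itertools.product(alphabet, repeat=r + 1):
        s_bar = np.array(cand)
        s_tilde = zfdf_detect(y - H_bar @ s_bar, H_tilde, alphabet)   # inner max of (14)
        log_term = -np.linalg.norm(y - H_bar @ s_bar - H_tilde @ s_tilde) ** 2 / N0
        for i in range(r + 1):
            if s_bar[i] > 0:
                num[i] = np.logaddexp(num[i], log_term)
            else:
                den[i] = np.logaddexp(den[i], log_term)
    return num - den          # LLRs for the bits carried by s_bar
```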

3 Summary

Table 1 presents a summary of all the methods presented so far. Apart from the detection accuracy and complexity, there are two categories of detector properties that are important to observe: fixed versus variable complexity, and adjustable versus constant complexity. Fixed-complexity detectors allow for highly optimized hardware implementations, and adjustable-complexity detectors allow for a trade-off between detection accuracy and complexity. Note that in Table 1, the per-$H$ preprocessing is disregarded in the complexity count, and it can vary for different detectors.


detector   decisions   fixed complexity   adjustable complexity   complexity per y     accuracy
ZF         hard        YES                NO                      O(N_T^3)             poor
ZF-DF      hard        YES                NO                      O(N_T^3)             poor
SD         hard        NO                 YES                     -                    good
FCSD       hard        YES                YES                     O(|S|^r N_T^3)       good
RD-MLS     hard        NO                 YES                     -                    good
LLL-LR     hard        NO                 YES                     -                    good
ML         hard        YES                NO                      O(|S|^(N_T))         optimal
soft ZF    soft        YES                NO                      O(N_T^3)             good
PM         soft        YES                YES                     O(|S|^r N_T^3)       good
Max-Log    soft        YES                NO                      O(|S|^(N_T))         excellent
LLR        soft        YES                NO                      O(|S|^(N_T))         optimal

Table 1: A summary of the presented detectors. The complexity is presented in terms of the number (order of magnitude) of elementary operations (addition, multiplication, comparison, etc.) needed to calculate hard/soft bits for one received vector y, disregarding the preprocessing that is made per H. The accuracy is presented in terms of how well the detectors perform for difficult (ill-conditioned) MIMO detection problems.

For instance, the SD method has very little preprocessing complexity, while the FCSD method has somewhat higher. Nevertheless, if the channel does not change for several channel uses at a time, the preprocessing complexity can be amortized over several channel uses and therefore ignored.


Adaptive Computational Resource Allocation

1 Previous Work

In recent years, a limited amount of literature has been produced that considers adaptive detection. Some general ideas were outlined in [25] and some more specific aspects of the problem were addressed in [26–30].

The work of [26] is specific to the SD algorithm, where the idea is to let the algorithm decide whether it requires more or less processing power to solve a detection problem. This is possible due to the fact that the SD algorithm has a variable complexity that depends on the channel conditions. For ill-conditioned channels, it requires more time (processing power) to find the solution to the detection problem than for well-conditioned channels. In [26], a maximum allowed time (processing power) limit is set both per received data packet and per detection problem. The SD algorithm is then executed to solve the detection problems as they appear without violating the specified time limits. A received data packet generally contains multiple detection problems. The first detection problems that fit within the predetermined time limits are solved, and the problems that do not are simply ignored.

In [28], the ideas of [26] are extended to adaptively vary the user parameter (sphere radius) of the SD algorithm in an iterative decoding setting;


in iterative decoding, the detector and decoder interchange soft information in several iterative steps in order to improve the overall performance. For SD, using a larger sphere radius means better performance with higher complexity, and a smaller radius means vice versa. The idea in [28] is to set a minimum accuracy threshold for each detected bit and use that threshold to determine the initial smallest allowed sphere radius that will assure a certain accuracy of each detected bit. For some bits, it is computationally more expensive to meet the accuracy threshold than for others. Then, in subsequent iterations, the sphere radius is decreased for those bits that pass well over the accuracy threshold, in order not to spend unnecessary computational power. Since SD has a variable complexity, so do the techniques in [26] and [28], which is not a desirable property due to the necessity of over-dimensioned hardware; the utilization will not be full whenever the channel conditions are good and the SD algorithm finishes early. Additionally, the techniques in [26] and [28] do not give priority to the detection problems that are most "beneficial" to solve, which is something that can, as we will see in this thesis, yield large performance gains.

The approach in [27] tries to predict whether a detection problem is simple or difficult. An approximate bit-error-rate expression is derived given the current channel conditions for the ZF detector and for the optimal detector. This measure is then used to predict whether it is necessary to use the optimal detector or if it is sufficient to use the simple ZF detector on different detection problems. The aim is to reduce processing power without violating a predetermined minimum bit error rate. This approach is crude in the sense that it performs either computationally cheap ZF detection or expensive optimal detection, but nothing in between.

A more delicate approach is given in [29], which is specific to the FCSD algorithm. Recall that the FCSD algorithm performs ZF-DF and ML detection for the user parameter $r = 0$ and $r = N_T - 1$, respectively. The FCSD user parameter is adjusted beforehand, using a rule that is based on the estimation variance, such that a predetermined tolerance level is met, with the aim to reduce the required computational power. This procedure makes the FCSD algorithm adapt to the effective channel conditions in a less crude manner than in [27]. Similarly, in [30], such a procedure is performed but instead using a rule that is based on the condition number of the effective channel matrix. The better the conditioning is, the smaller the user parameter $r$ that is used.
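As a toy illustration of this kind of rule, the sketch below picks the FCSD/PM parameter $r$ from the condition number of $H$: the better conditioned the channel, the smaller $r$. The thresholds are illustrative placeholders, not values taken from [29] or [30].

```python
import numpy as np

def choose_detector_order(H, thresholds=(2.0, 5.0, 10.0)):
    """Toy condition-number rule in the spirit of [30]: well-conditioned
    channels get a small r (cheap detection), ill-conditioned ones a larger r."""
    cond = np.linalg.cond(H)
    r = sum(cond > t for t in thresholds)        # r in {0, ..., len(thresholds)}
    return min(r, H.shape[1] - 1)                # r may not exceed N_T - 1
```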


2 Contributions of the Thesis

The main contributions of this thesis relate to adaptive means of allocating computational resources at the receiver during the signal separation (detection) process. A general framework is presented, which is not specific to certain detection algorithms nor scenarios. The main ideas of this thesis are exemplified with the PM method, but can be applied to any MIMO detector of choice. The techniques that are proposed facilitate fixed complexity detection and incorporate the possibility of tuning the trade-off between complexity and detection accuracy with arbitrarily fine (discrete) resolution. The work in [26–30] can all be thought of as special cases of the general framework presented here.

The main contents consist of two papers. The first paper aims to address the allocation problem directly and as a result brings out new interesting problems, such as identifying quantities (measures) on which the adaptive allocation should be based. One of the fundamental and promising measures proposed there requires knowledge of the probability distribution of the detector outputs, which is difficult to acquire. In the second paper, this difficulty is addressed and a good approximate distribution is found. Additionally, this thesis presents supplementary unpublished simulation results.

Paper A Allocation of Computational Resources for Soft MIMO Detection

Authored by Mirsad Čirkić, Daniel Persson, and Erik G. Larsson.

Published in the IEEE Journal on Selected Topics in Signal Processing, Special Issue on Soft MIMO Detection, Dec., 2011. The work is mainly based on the conference papers in [31, 32].

We consider soft MIMO detection for the case of block fading. That is, the transmitted codeword spans several independent channel realizations, and several instances of the detection problem must be solved for each such realization. We develop methods that adaptively allocate computational resources to the detection problems of each channel realization, under a total per-codeword complexity constraint. Our main results are a formulation of the problem as a mathematical optimization problem with a well-defined objective function and constraints, and algorithms that solve this optimization problem in a computationally efficient manner.


Paper B Approximating the LLR Distribution for the Optimal and Partial Marginalization MIMO Detectors

Authored by Mirsad Čirkić, Daniel Persson, Jan-Åke Larsson, and Erik G. Larsson.

Will be submitted for publication in the near future. The work is an extension of the conference paper [33].

We consider an approximation of the LLR distribution for the soft-output via partial marginalization MIMO detector, which incorporates the optimal soft detector as a special case. More specifically, in a MIMO AWGN setting, we approximate the LLR distribution conditioned on the transmitted signal and the channel matrix with a Gaussian mixture model (GMM). Our main results consist of an analytical expression of the GMM model and a proof that, in the limit of high SNR, this LLR distribution converges in probability towards a unique Gaussian distribution.

3 Future Work

The presented work has closed some open problems, but in doing so it has opened others. Finding efficient algorithms that perform the adaptive allocation was not the difficult part of this work; this part has been mainly resolved with several efficient sub-optimal and optimal algorithms. The difficult part that still contains open questions is to find quantitative accuracy measures that predict the detector accuracy given the effective channel conditions. The results in Paper A indicate that there exist measures that can yield better performance than what is presented. Thus, finding such accuracy measures is definitely future work worth considering.

So far, the work in this thesis has not considered detection with iterative decoding. It is nevertheless fully possible to develop such extensions, but it would require more care when determining the detector accuracy measures. Similarly, higher-order constellations are not considered either; implementing such an extension is straightforward, but it also requires finding appropriate detector accuracy measures.

The fundamental ideas of this work can be extended further to other applications, such as energy-efficient detection. One could control the processing frequency [34] and/or adapt the signal sampling in order to save energy.


Energy-efficient detection can be performed by controlling the processing frequency of the different computational units that perform detection. Easy detection problems can be solved using computationally cheap detection methods with processing units that work at a lower frequency. On the contrary, difficult detection problems, which require computationally heavy and advanced detection methods, can be processed with units running at a higher frequency, so that all problems (simple and difficult) are solved within the same amount of time. This is beneficial since, as pointed out in [34], processing units performing the same task at a lower frequency over a longer time interval require less energy than at a higher frequency over a shorter time interval. In addition to that, the hardware utilization will be very high since none of the computational units will wait for one another, which is a very desirable property.

Adaptive signal sampling can be utilized to control the sampling frequency at each receive antenna in order to simplify the computations without compromising performance. Higher-resolution arithmetic requires more sophisticated hardware, more energy to acquire, and more computational power to process. This technique would be beneficial since, in one extreme case, some antennas would only see noise, for which high-resolution sampling would be a waste of energy and computational resources. In such an extreme case, one could simply shut down that antenna and the chain of units that follows. The technique where antennas are shut down due to bad channel conditions is widely known as antenna selection [35]. This technique can be extended in a more delicate manner by adjusting the number of significant bits at each antenna based on the effective channel conditions (information throughput) at each antenna.


Bibliography

[1] M. Meeker, S. Flannery, L. Wu, et al., "The mobile internet report," Tech. Rep., Morgan Stanley Research, Dec. 2009.

[2] E. Telatar, “Capacity of multi-antenna Gaussian channels,” European Transactions on Telecommunications, vol. 10, no. 6, pp. 585–596, Jun. 2001.

[3] D. Tse and P. Viswanath, Fundamentals of wireless communication, Cambridge Uni. Press, New York, NY, USA, 2005.

[4] N. Srinidhi, T. Datta, A. Chockalingam, and B. S. Rajan, "Layered tabu search algorithm for large-MIMO detection and a lower bound on ML performance (early access)," IEEE Transactions on Communications, 2011.

[5] B. Hassibi and H. Vikalo, “On the sphere-decoding algorithm I. Expected complexity,” IEEE Transactions on Signal Processing, vol. 53, no. 8, pp. 2806–2818, Aug. 2005.

[6] L. G. Barbero and J. S. Thompson, "Extending a fixed-complexity sphere decoder to obtain likelihood information for turbo-MIMO systems," IEEE Transactions on Vehicular Technology, vol. 57, no. 5, pp. 2804–2814, Sept. 2008.

[7] E. G. Larsson and J. Jaldén, "Fixed-complexity soft MIMO detection via partial marginalization," IEEE Transactions on Signal Processing, vol. 56, no. 8, pp. 3397–3407, Aug. 2008.

[8] D. Wübben, D. Seethaler, J. Jaldén, and G. Matz, "Lattice reduction," IEEE Signal Processing Magazine, vol. 28, no. 3, pp. 70–91, May 2011.


[9] J. Choi and H. Nguyen, "SIC-based detection with list and lattice reduction for MIMO channels," IEEE Transactions on Vehicular Technology, vol. 58, no. 7, pp. 3786–3790, Sept. 2009.

[10] D. L. Milliner, E. Zimmermann, J. R. Barry, and G. Fettweis, "A fixed-complexity smart candidate adding algorithm for soft-output MIMO detection," IEEE Journal of Selected Topics in Signal Processing, vol. 3, no. 6, pp. 1016–1025, Dec. 2009.

[11] L. Bai and J. Choi, "Partial MAP-based list detection for MIMO systems," IEEE Transactions on Vehicular Technology, vol. 58, no. 5, pp. 2544–2548, June 2009.

[12] Y. Li and J. Moon, “Reduced-complexity soft MIMO detection based on causal and noncausal decision feedback,” IEEE Transactions on Signal Processing, vol. 56, no. 3, pp. 1178–1187, Mar. 2008.

[13] P. Aggarwal, N. Prasad, and X. Wang, "An enhanced deterministic Monte Carlo method for near-optimal MIMO demodulation with QAM constellations," IEEE Transactions on Signal Processing, vol. 55, no. 6, pp. 2395–2406, June 2007.

[14] J. Choi, Y. Hong, and J. Yuan, “An approximate MAP-based iterative receiver for MIMO channels using modified sphere detection,” IEEE Transactions on Wireless Communications, vol. 5, no. 8, pp. 2119–2126, Aug. 2006.

[15] Z. Guo and P. Nilsson, “Algorithm and implementation of the K-best sphere decoding for MIMO detection,” IEEE Journal on Selected Areas in Communications, vol. 24, no. 3, pp. 491–503, Mar. 2006.

[16] I. Nevat, T. Yang, K. Avnit, and J. Yuan, "MIMO detection with high-level modulations using power equality constraints," IEEE Transactions on Vehicular Technology, vol. 59, no. 7, pp. 3383–3392, Sept. 2010.

[17] Z. Muhammad and Z. Ding, "Blind multiuser detection for synchronous high rate space-time block coded transmission," IEEE Transactions on Wireless Communications, vol. 10, no. 7, pp. 2171–2185, July 2011.

[18] S. Verdu, Multiuser Detection, Cambridge Uni. Press, New York, NY, USA, 1st edition, 1998.

[19] E. G. Larsson, “MIMO detection methods: How they work,” IEEE Signal Processing Magazine, vol. 26, no. 3, pp. 91–95, May 2009.


[20] U. Fincke and M. Pohst, "Improved methods for calculating vectors of short length in a lattice, including a complexity analysis," Mathematics of Computation, vol. 44, no. 170, pp. 463–471, 1985.

[21] J. Jald´en and B. Ottersten, “On the complexity of sphere decoding in digital communications,” IEEE Transactions on Signal Processing, vol. 53, no. 4, pp. 1474–1484, Apr. 2005.

[22] J. W. Choi, B. Shim, A. C. Singer, and N. I. Cho, "Low-complexity decoding via reduced dimension maximum-likelihood search," IEEE Transactions on Signal Processing, vol. 58, no. 3, pp. 1780–1793, Mar. 2010.

[23] A. K. Lenstra, H. W. Lenstra, and L. Lovász, "Factoring polynomials with rational coefficients," Mathematische Annalen, vol. 261, no. 4, pp. 515–534, 1982.

[24] D. Persson and E. G. Larsson, “Partial marginalization soft MIMO detection with higher order constellations,” IEEE Transactions on Signal Processing, vol. 59, no. 1, pp. 453–458, Jan. 2011.

[25] D. W. Waters, N. Sommer, A. Batra, and S. Hosur, "Dynamic resource allocation to improve MIMO detection performance," U.S. Patent Application 0 137 762 A1, Jun. 12, 2008.

[26] C. Studer, A. Burg, and H. Bölcskei, "Soft-output sphere decoding: algorithms and VLSI implementation," IEEE Journal on Selected Areas in Communications, vol. 26, no. 2, pp. 290–300, Feb. 2008.

[27] I.-W. Lai, G. Ascheid, H. Meyr, and T.-D. Chiueh, "Low-complexity channel-adaptive MIMO detection with just-acceptable error rate," in Proc. IEEE 69th Vehicular Technology Conference (VTC), 2009, pp. 1–5.

[28] K. Nikitopoulos and G. Ascheid, "Complexity adjusted soft-output sphere decoding by adaptive LLR clipping," IEEE Communications Letters, vol. 15, no. 8, pp. 810–812, Aug. 2011.

[29] K.-C. Lai, C.-C. Huang, and J.-J. Jia, “Variation of the fixed-complexity sphere decoder,” IEEE Communications Letters, vol. 15, no. 9, pp. 1001– 1003, Sept. 2011.

[30] X. Wu and J. S. Thompson, “FPGA implementation of an efficient high-throughput sphere decoder for MIMO systems based on the smallest singular value threshold,” in Proc. IEEE NASA/ESA Conference on Adaptive Hardware and Systems, 2010, pp. 340–345.


[31] M. Čirkić, D. Persson, and E. G. Larsson, "Optimization of computational resource allocation for soft MIMO detection," in Proc. 43rd Asilomar Conference on Signals, Systems and Computers, 2009, pp. 1488–1492.

[32] M. Čirkić, D. Persson, and E. G. Larsson, "New results on adaptive computational resource allocation in soft MIMO detection," in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing, 2011, pp. 2972–2975.

[33] M. Čirkić, D. Persson, E. G. Larsson, and J.-Å. Larsson, "Gaussian approximation of the LLR distribution for the ML and partial marginalization MIMO detectors," in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing, 2011, pp. 3232–3235.

[34] E. G. Larsson and O. Gustafsson, “The impact of dynamic voltage and frequency scaling on multicore DSP algorithm design,” IEEE Signal Processing Magazine, vol. 28, no. 3, pp. 127–144, May 2011.

[35] S. Sanayei and A. Nosratinia, “Antenna selection in MIMO systems,” IEEE Communications Magazine, vol. 42, no. 10, pp. 68–73, Oct. 2004.


Part II

Included Papers


Linköping Studies in Science and Technology
Licentiate Theses, Division of Communication Systems
Department of Electrical Engineering (ISY), Linköping University, Sweden

Erik Axell, Topics in Spectrum Sensing for Cognitive Radio, Thesis No. 1417, 2009.

Johannes Lindblom, Resource Allocation on the MISO Interference Channel, Thesis No. 1438, 2010.

Reza Moosavi, Aspects of Control Signaling in Wireless Multiple Access Systems, Thesis No. 1493, 2011.
