
5312 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 12, DECEMBER 2006

Achieving the Optimal Diversity-Versus-Multiplexing Tradeoff for MIMO Flat Channels With QAM Space–Time Spreading and DFE Equalization

Abdelkader Medles, Member, IEEE, and Dirk T. M. Slock, Fellow, IEEE

Abstract: The use of multiple transmit (Tx) and receive (Rx) antennas makes it possible to transmit multiple signal streams in parallel and hence to increase communication capacity. We have previously introduced simple convolutive linear precoding schemes that spread transmitted symbols in time and space, involving spatial spreading, delay diversity, and possibly temporal spreading. In this paper we show that the use of the classical multiple-input–multiple-output (MIMO) decision feedback equalizer (DFE) (but with joint detection) for this system achieves the optimal diversity-versus-multiplexing tradeoff introduced in Zheng and Tse, “Diversity and multiplexing: A fundamental tradeoff in multiple-antenna channels,” IEEE Trans. Inf. Theory, May 2003, when a minimum mean squared error (MMSE) design is used. One of the major contributions of this work is the diversity analysis of an MMSE equalizer without the Gaussian approximation. Furthermore, the tradeoff is discussed for an arbitrary number of transmit and receive antennas. We also show the tradeoff obtained for an MMSE zero forcing (ZF) design. Another original contribution of this paper is to show that the MIMO optimal tradeoff can be attained with a suboptimal receiver, in this case a DFE, as opposed to optimal maximum likelihood sequence estimation (MLSE).

Index Terms: Decision-feedback, diversity, equalization, linear precoding, multiple-input–multiple-output (MIMO), multiplexing, quadrature amplitude modulation (QAM) constellations, tradeoff.

I. INTRODUCTION

The diversity degree is (currently) defined as the asymptotic slope of the error probability curve at high signal-to-noise ratio (SNR). So it can be enjoyed with a fixed symbol constellation at high SNR. On the other hand, if the SNR is increasing, the channel capacity increases and can be taken advantage of by increasing the rate via an adaptive modulation. However, the work of Zheng and Tse [1] showed that both high SNR benefits cannot be enjoyed simultaneously and a compromise must be accepted. An optimal diversity-versus-rate tradeoff exists. In the context of multiple-input–multiple-output (MIMO) transmission, this is also called the diversity-versus-multiplexing tradeoff since MIMO systems make it possible to increase the rate through spatial multiplexing. The optimal tradeoff for frequency flat MIMO channels has been derived in [1], together with a proper positioning of some existing space–time coding schemes and a theoretical scheme based on Gaussian codes.

Manuscript received December 21, 2004; revised February 24, 2006. Eurécom’s research is partially supported by its industrial partners: BMW, Bouygues Telecom, Cisco Systems, France Télécom, Hitachi Europe, SFR, Sharp, ST Microelectronics, Swisscom, Thales. The research reported herein was also partially supported by the French RNRT project LaoTseu, and by the European Network of Excellence NewCom.

A. Medles was with the Eurecom Institute, 06904 Sophia Antipolis Cedex, France. He is now with Icera Semiconductor, Bristol BS32 4AQ, U.K. (e-mail: [email protected]).

D. T. M. Slock is with the Eurecom Institute, 06904 Sophia Antipolis Cedex, France (e-mail: [email protected]).

Communicated by M. Médard, Associate Editor for Communications.

Color versions of Figs. 2, 5, and 6 are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TIT.2006.885489

The work in [1] has sparked a number of research activities proposing practical techniques that approach the optimal tradeoff. In [2], Yao and Wornell proposed a numerically optimized rotation-based code that achieves the optimal tradeoff for a $2 \times 2$ channel. More recently, Tavildar and Viswanath [3] introduced a design criterion for permutation codes in order to achieve the optimal diversity-versus-multiplexing tradeoff for the parallel channel that results from zero-forcing equalization of the original MIMO channel. This equalization results in a degraded tradeoff when compared to the original channel. Furthermore, the permutation codes have to be optimized for every setting (constellation size and number of transmit antennas), which limits their application for rate adaptive systems.

Another technique, LAST, based on lattice coding was proposed by El Gamal et al. in [4]. It achieves the optimal tradeoff. In [4], it is stated that the use of LAST codes, as opposed to linear dispersion (LD) codes, is essential to attain the optimal tradeoff. However, the technique proposed in this paper is a special form of LD codes and also attains the optimal tradeoff. The essential difference between LAST and LD codes is one of shaping region, which is a hypersphere for LAST codes and a hypercube for LD codes. A hypersphere allows for some coding gain but is nonessential for the tradeoff (diversity degree) considered here. On the other hand, the shaping region and dithering considered in [4] lead to a significant complexity increase in both transmitter and receiver. Furthermore, no explicit lattice construction is provided in [4]. And, if a lattice code can be found, it needs to be adapted for every (multiplexing) rate used. In LD codes, one only needs to adapt a simple symbol (e.g., quadrature amplitude modulation (QAM)) constellation.

The latest proposed block codes, the perfect space–time codes with minimum delay (PSTC-MD) [5], [6], have a nonvanishing determinant, which ensures that they achieve the optimal diversity-versus-multiplexing tradeoff [7] if an ML decoder is used. PSTC-MD have a minimum delay of $T = N_{tx}$ ($N_{tx}$ being the number of transmit and $N_{rx}$ the number of receive antennas), and the detection is performed jointly for $N_{tx}^2$ QAM symbols. The joint decoding in the decision feedback equalizer (DFE) receiver proposed here leads to a joint detection of only $N_{tx}$ QAM symbols, resulting in lower detection complexity.

0018-9448/$20.00 © 2006 IEEE

In what follows, superscripts denote the transpose, Hermitian transpose and (Moore–Penrose) pseudoinverse [8], respectively. For two Hermitian matrices , of size , the matrix inequality signifies that the difference is positive semidefinite: , which in turn means that for any complex vector . Consider a Hermitian positive semidefinite matrix with the following eigendecomposition , where is a real diagonal matrix containing only the nonzero eigenvalues on the diagonal, and the columns of contain the corresponding (orthonormal) eigenvectors. The square root (unique up to a postmultiplication with a unitary matrix) and the pseudoinverse of are defined as and , respectively, where the power of a diagonal matrix is the diagonal matrix of the powers of its diagonal elements. will denote the square root of . The filtering (convolution) operation can be denoted by , where is the filter transfer function and is the one-sample delay operator: . The -transform of is simply obtained by substituting by . The superscript in the filter denotes the Hermitian transpose and time reversal of the impulse response coefficients of ; is the matched filter version of .

The rest of this paper is organized as follows. Section II introduces the particular space–time spreading (STS) scheme considered in this paper. Relations with threading and linear dispersion approaches are highlighted and the associated frame structure is also introduced. The MIMO decision-feedback equalizer considered in this paper is presented in Section III. The minimum mean square error (MMSE), unbiased MMSE, and zero-forcing designs are elaborated, as are the details of the resulting feedforward and feedback filters for the precoder-channel cascade at hand. Section IV constitutes the main part of the paper as it derives the diversity-multiplexing tradeoff. In part A the unbiased MMSE design is considered. This part proceeds in three steps. In Step 1, the frame error rate is related to the probability of a first symbol error. The latter gets lower bounded in Step 2 with particular attention to the Gaussian noise and ISI parts in the symbol error, and to edge effects in the frame. In Step 3, an easy upper bound is introduced and the asymptotic sandwiching of both bounds leads to the desired tradeoff result. Part A ends with a generalization of the number of transmit antennas, which was so far constrained to be a power of two. In part B, a worse tradeoff for the zero-forcing design is obtained by paralleling the three steps of part A. The advantages of the proposed convolutive precoding STS scheme compared to minimum block length linear dispersion codes such as the Golden Code are highlighted in Section V, both from a receiver complexity and from a BER point of view. The simulations furthermore show that error propagation in the DFE only leads to an SNR offset. Finally, some concluding remarks are offered in Section VI.

Fig. 1. MIMO transmission with STS.

II. SPACE–TIME SPREADING (STS) SCHEME

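The MIMO system with single-carrier transmission is shown in Fig. 1 and is essentially described by

$$y_k = H a_k + v_k = H\, T(q)\, b_k + v_k \tag{1}$$

where the white noise power spectral density matrix is $S_{vv}(z) = \sigma_v^2 I$. The input signal $b_k$ is white, $S_{bb}(z) = \sigma_b^2 I$, and the SNR parameter is defined as $\rho = \sigma_b^2 / \sigma_v^2$. We consider the case of channel state information being absent at the transmitter (Tx) and perfect at the receiver (Rx). The channel elements are assumed to be independent and identically distributed (i.i.d.) following a centered Gaussian distribution with unit norm. The linear precoding considered here (introduced in [9] and further analyzed in [10]) consists of a modification of VBLAST, obtained by inserting a square matrix prefilter $T(z)$ before inputting the vector signal $b_k$ into the channel $H$. The $N_{tx}$ (“full rate”) component signals of $b_k$ are called streams or layers. The suggested prefilter is

$$T(z) = D(z)\, Q, \quad D(z) = \mathrm{diag}\{1, z^{-1}, \ldots, z^{-(N_{tx}-1)}\}, \quad Q^H Q = I, \quad |Q_{ij}| = \frac{1}{\sqrt{N_{tx}}} \tag{2}$$

with

$$Q = \frac{1}{\sqrt{N_{tx}}} \begin{bmatrix} 1 & \theta_1 & \cdots & \theta_1^{N_{tx}-1} \\ 1 & \theta_2 & \cdots & \theta_2^{N_{tx}-1} \\ \vdots & \vdots & & \vdots \\ 1 & \theta_{N_{tx}} & \cdots & \theta_{N_{tx}}^{N_{tx}-1} \end{bmatrix} \tag{3}$$

where $\theta_i$ are the roots of $\theta^{N_{tx}} - j = 0$, $j = \sqrt{-1}$, and the FIR filter length is $N_{tx}$.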

Note that for a channel with a delay spread of $L$ symbol periods, the prefilter can be immediately adapted by replacing the elementary delay $z^{-1}$ by $z^{-L}$ in the delay diversity matrix $D(z)$. In what follows, we focus on the frequency-flat channel case, in which case symbol stream $n$ passes through the equivalent finite-impulse response single-input–multiple-output (FIR-SIMO) channel $\sum_{i=1}^{N_{tx}} z^{-(i-1)} H_{:,i}\, Q_{i,n}$ (with $H_{:,i}$ the $i$th column of the channel matrix), which now has memory due to the delay diversity introduced by $D(z)$. It is important that the different columns of the channel matrix get spread out in time to get full diversity (otherwise, the streams just pass through a linear combination of the columns, as in VBLAST, which offers limited diversity). The delay diversity only becomes effective by the introduction of the spatial spreading matrix $Q$, which has equal magnitude elements for uniform diversity spreading, because different streams need to see different SIMO channels.

The prefilter is hence the cascade of the spreading matrix $Q$, which spreads the information in each symbol of a stream over all $N_{tx}$ spatial components, followed by the delay diversity $D(z)$, which maps these components onto different symbol periods in a DBLAST-like fashion, see Fig. 2.

Fig. 2. MIMO transmission with STS.

The specific Vandermonde choice for the spreading matrix $Q$ shown in (3) corresponds to the discrete Fourier transform (DFT) matrix postmultiplied by a diagonal matrix containing the elements of the first row of $Q$. This choice for $Q$ can be shown to lead to maximum coding gain in case of QAM symbols [9], [10], among all matrices with normalized columns. In fact, with the components of taken from the same constellation with a minimum distance , and for all and , the coding gain defined as satisfies [9]

(4)

With the proposed space-time spreading, each symbol stream has the same matched filter bound (MFB), which is proportional to the squared channel Frobenius norm (total energy), namely MFB . Hence full diversity (regardless of channel fading structure (joint distribution of channel coefficients), and equal to $N_{tx} N_{rx}$ for i.i.d. fading) is exploited. Also, since the prefilter is paraunitary and transforms the white vector stream into the white vector stream , no loss in ergodic capacity is incurred [9] (at sufficiently high SNR). The STS scheme discussed here has been introduced in [9], [10] as a full (symbol) rate full diversity scheme. With hindsight, such a scheme in signal processing parlance meant a scheme that can reach the endpoints on both axes of the optimal diversity-versus-multiplexing tradeoff curve. However, whether the whole optimal curve can be attained depends on the receiver, as illustrated in this paper with two examples.
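As a minimal sketch of the spreading matrix described above, the following Python builds a normalized Vandermonde matrix whose generators lie on the unit circle and checks numerically that it is unitary with equal-magnitude entries. The function names `spreading_matrix` and `gram` are hypothetical, and taking the generators as the roots of $\theta^{n} = j$ is an assumption made for illustration; the $n$ roots of $\theta^{n} = c$ for any constant $c$ with $|c| = 1$ give a DFT matrix postmultiplied by a diagonal matrix.

```python
import cmath
import math


def spreading_matrix(n, c=1j):
    """Normalized n x n Vandermonde matrix with rows (1, t_i, ..., t_i^(n-1)).

    The generators t_i are the n roots of t**n = c, where |c| = 1 is assumed.
    The default c = 1j is an illustrative assumption.
    """
    phase = cmath.phase(c)
    roots = [cmath.exp(1j * (phase + 2 * math.pi * i) / n) for i in range(n)]
    scale = 1 / math.sqrt(n)
    return [[scale * t ** k for k in range(n)] for t in roots]


def gram(q):
    """Return Q^H Q as nested lists (the identity when Q is unitary)."""
    n = len(q)
    return [
        [sum(q[r][i].conjugate() * q[r][j] for r in range(n)) for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    q = spreading_matrix(4)
    g = gram(q)
    # Largest deviation of Q^H Q from the identity matrix.
    print(max(abs(g[i][j] - (1 if i == j else 0)) for i in range(4) for j in range(4)))
    # Every entry of Q should have magnitude 1/sqrt(4) = 0.5.
    print(max(abs(abs(x) - 0.5) for row in q for x in row))
```

Both printed deviations should be at the level of floating-point rounding, which illustrates the unitarity and equal-magnitude properties that the text relies on for uniform diversity spreading.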

A strongly related approach with an interesting interpretation is obtained as follows. Consider grouping the symbol sequence in groups of consecutive symbols; then one group of symbols forms a square matrix of size . An alternative approach is obtained by transposing the matrix before inputting its columns into (hence inputting the rows of into instead of the columns; this interleaving has no effect if no channel coding is introduced). It corresponds to spreading within streams instead of between streams. The resulting scheme can be interpreted as follows. It corresponds to the sawtooth threading approach of [11], which transforms the time-invariant (flat) MIMO channel into a periodically time-varying SIMO channel for each stream (period ), and then the temporal fading gets exploited with the constellation rotation matrix as suggested in [12].

In the case of linear dispersion codes [13], [14], a packet of vector symbols (hence a matrix) gets constructed as a linear combination of fixed matrices in which the combination coefficients are symbols . A particular case is the Alamouti code which is a full diversity single rate code corresponding to block length . Here we focus on essentially continuous transmission in which linear precoding corresponds to MIMO prefiltering. The linear convolutive precoding scheme proposed here can also be considered as a special case of linear dispersion codes (making abstraction of the packet boundaries) in which the fixed matrices are time-shifted versions of the impulse responses of the columns of the MIMO prefilter .

In what follows we denote the overall channel by . At the receiver side we propose a DFE receiver. This receiver, as we will see, achieves the optimal diversity versus multiplexing tradeoff and has an acceptable complexity that results from a sphere decoding of a vector of size . The drawback of the DFE receiver is the error propagation due to the successive detection and cancellation. The error propagation has no impact on the frame error rate, but it degrades the symbol (and bit) error rate. To avoid this, we can take advantage of the presence of binary channel codes (used in general for error correction) and choose appropriate codes, like those introduced in [15] and [16], that allow successive joint detection and decoding.

A. Frame Structure

Although the proposed linear prefiltering technique is ideally intended for continuous transmission, in practice data get transmitted in packets or frames. As with convolutional channel codes, the proposed convolutive linear precoding may require proper handling of the frame borders. The memory introduced by the prefiltering does not have to lead to interframe interference, at least for a frequency-flat channel, if circulant convolution is used for the prefilter. However, the requirement of proper initialization of a DFE, as proposed here, necessitates the introduction of a guard interval of size equal to the memory of the DFE feedback filter, which is equal to the size of the memory of the prefilter, hence of size symbol periods. We shall assume, without loss of generality, that the frame of symbol periods starts with a guard interval of size followed by data symbols at time , and is followed by the guard interval of the next frame. The introduction of the guard interval leads to a reduction in rate by a factor which can be made arbitrarily small by increasing the frame length . We will neglect this reduction in the sequel to simplify notation.
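The rate loss caused by the guard interval can be illustrated with a short calculation. This sketch assumes a frame of `frame_len` symbol periods of which `guard_len` carry no data; both parameter names and the function `rate_efficiency` are hypothetical and are not taken from the paper.

```python
def rate_efficiency(frame_len, guard_len):
    """Fraction of symbol periods in a frame that carry data.

    frame_len: total number of symbol periods per frame (assumed parameter).
    guard_len: number of guard symbol periods per frame (assumed parameter).
    """
    if not 0 <= guard_len < frame_len:
        raise ValueError("guard_len must satisfy 0 <= guard_len < frame_len")
    return (frame_len - guard_len) / frame_len


if __name__ == "__main__":
    # With a fixed guard interval, the loss shrinks as the frame grows.
    for frame_len in (10, 100, 1000):
        print(frame_len, rate_efficiency(frame_len, 3))
```

For a fixed guard length the efficiency approaches one as the frame length grows, which is why the text can neglect the reduction.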


Fig. 3. Conventional MIMO-DFE receiver.

III. CONVENTIONAL MIMO-DFE RECEIVER

Consider the classical MIMO-DFE, in which the symbol vectors are processed sequentially in time (see Fig. 3). We call this the conventional MIMO-DFE [17], as opposed to the sequential processing MIMO-DFE (an extension of the VBLAST Rx to channels with memory) in which the component sequences of are processed sequentially (in component order) [18].

The Rx starts with a preprocessing by the matched filter (MF) , the output of which is . The DFE, operating on the MF output, produces the following symbol estimate

(5)

where the feedback filter is such that is causal, monic and minimum phase. There are two possible designs of interest for the DFE filters, the minimum mean squared error (MMSE) and the MMSE zero forcing (ZF) designs. We consider the MMSE design first.

A. MMSE Conventional MIMO DFE Rx

The linear MMSE symbol vector estimate (MMSE linear equalizer output) can be expressed as

(6)

where and . The symbol estimate leads to symbol estimation error and we can write

(7)

The orthogonality principle of MMSE estimation leads to the power spectral density matrix

(8)

Consider now the minimum and maximum phase spectral factorization of (see [17]). Let be the unique causal, monic minimum phase factor of , then

(9)

where is a constant positive definite Hermitian matrix. Then . By canceling the anticausal intersymbol interference (ISI) linearly, namely by choosing , we get the DFE Rx

(10)

and hence

(11)

and in fact with (hence is temporally white). The feedback filter is closely related to the backward MIMO prediction error filter of the spectrum , which satisfies . Indeed, obviously and . So is the covariance matrix of the backward prediction error vector of the spectrum . The following theorem provides in the case of a frequency-flat MIMO channel.

Theorem 1: For a frequency-flat MIMO channel combined with the proposed precoder filter in (2), the feedback filter is

(12)

and the corresponding covariance matrix of the backward prediction error of is

(13)

where and result from the LDU triangular matrix factorization of .

Let and note that (13) corresponds to the eigendecomposition of .

Proof: We need to show that is a minimum phase causal monic filter and verifies Constant Matrix. Since is upper triangular with unit diagonal, then due to the diagonal structure of , is a monic causal filter. As is unitary, is also a causal monic filter. Furthermore , which shows that is minimum phase. To complete the proof of the theorem it is sufficient to verify that is a constant matrix.
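For readers who want to experiment with the kind of triangular factorization invoked in Theorem 1, here is a small sketch of an LDU factorization of a Hermitian positive definite matrix, $A = L D L^H$, with $L$ unit lower triangular and $D$ real diagonal. The function name `ldl_hermitian` and the example matrix are illustrative assumptions; the specific matrix that the theorem factors is not reproduced here.

```python
def ldl_hermitian(a):
    """Factor a Hermitian positive definite matrix as A = L D L^H.

    a: square matrix as nested lists of (complex) numbers.
    Returns (l, d): l is unit lower triangular (nested lists),
    d is the list of real, positive diagonal entries of D.
    """
    n = len(a)
    l = [[(1 + 0j) if i == j else 0j for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        # Diagonal entry: a_jj = sum_{k<j} |l_jk|^2 d_k + d_j.
        d[j] = (a[j][j] - sum(abs(l[j][k]) ** 2 * d[k] for k in range(j))).real
        if d[j] <= 0:
            raise ValueError("matrix is not positive definite")
        for i in range(j + 1, n):
            # Off-diagonal: a_ij = sum_{k<j} l_ik d_k conj(l_jk) + l_ij d_j.
            s = sum(l[i][k] * d[k] * l[j][k].conjugate() for k in range(j))
            l[i][j] = (a[i][j] - s) / d[j]
    return l, d


if __name__ == "__main__":
    a = [[4, 2 + 2j], [2 - 2j, 6]]  # illustrative Hermitian example
    l, d = ldl_hermitian(a)
    print(l)
    print(d)
```

The diagonal entries of $D$ are real and positive for a positive definite input, which is what lets them play the role of per-component error variances in a DFE analysis.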

Unbiased MMSE (UMMSE) Conventional MIMO DFE Rx: The discussion below will be valid also for the case . This will lead to singular matrices and requires the use of pseudoinverses (denoted by ). Now, it is well-known that is a biased estimate of , since indeed

(14)

where the sample

(15)

is now uncorrelated with the sample . The covariance matrix of the vector is

(16)

The UMMSE feedforward (and feedback) filter is obtained by removing the bias in (14) via inversion of . Since can be singular, its inversion requires the use of the pseudoinverse

(17)

The corresponding feedback filter is

(18)

and the output of the DFE is then

(19)

where . The capacity [footnote 1] of such a Tx system with UMMSE DFE Rx, assuming perfect feedback and joint decoding of the components of , is (interpreting (19) as a vector AWGN channel)

(20)

Footnote 1: The notion of capacity will be used loosely here, in the sense of mutual information for the case of Gaussian inputs with the same spectrum.

In order to show that equals the capacity of the MIMO channel, note that

(21)

since and . Hence such a Rx and decoding strategy conserves capacity. The linear precoding introduces memory into the equivalent channel , whereas the DFE makes the equivalent channel memoryless again (temporal correlation in the noise gets ignored at the DFE detection point).

B. MMSE-ZF Conventional MIMO-DFE Rx

For this case we need to assume . The linear MMSE-ZF symbol vector estimate is

(22)

where now . Consider again the spectral factorization . Then in the same way as for the MMSE design, and again with the usual DFE analysis assumption that detected symbols (in the feedback) are correct, we get

(23)

where and this time result from the LDU factorization of . Unlike the MMSE design, the MMSE-ZF design makes no compromise between noise enhancement and interference cancellation. It removes (zero-forces) the entire interference. Hence bias is not an issue and the noise at the output of the equalizer, , is Gaussian.

In both DFE designs, the feedback filter is monic causal and FIR of length , and by the same analysis the feedforward filter is anticausal and FIR of length also. Note that is FIR because does not have zeros. In both DFE designs, different choices are possible for the detection of the symbol vector . For example a V-BLAST-like detector can be used. However, such a sequential processing of the symbol vector components degrades performance. The optimal choice is the joint detection of the components of using a weighted minimum distance detector which in the MMSE ZF case minimizes w.r.t. ( is the DFE output). Such a detector has an acceptable complexity especially for a small number of transmit antennas and small constellation size. The complexity can be reduced by the use of sphere decoding. Such a DFE detector with joint decoding is less complex than direct maximum likelihood sequence estimation (MLSE), which could in principle be done with the Viterbi algorithm, but in which the number of states grows exponentially with (instead of for joint detection in the DFE) due to the memory of introduced by the precoder .
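The joint weighted minimum distance detection described above can be sketched as an exhaustive search. The code below is a brute-force illustration rather than the sphere decoder mentioned in the text, and it assumes an unnormalized square QAM alphabet with odd-integer coordinates and a Hermitian positive definite weight matrix `w`; the function names and the example numbers are hypothetical.

```python
from itertools import product


def qam_points(m):
    """Square M-QAM alphabet with odd-integer coordinates (unnormalized)."""
    side = int(round(m ** 0.5))
    if side * side != m:
        raise ValueError("m must be a perfect square")
    levels = [2 * i - (side - 1) for i in range(side)]
    return [complex(x, y) for x in levels for y in levels]


def weighted_distance(e, w):
    """Return e^H W e for an error vector e and a Hermitian weight matrix w."""
    n = len(e)
    total = sum(
        e[i].conjugate() * w[i][j] * e[j] for i in range(n) for j in range(n)
    )
    return complex(total).real


def joint_detect(y, w, alphabet):
    """Jointly detect all components of y by exhaustive weighted search.

    The search visits len(alphabet) ** len(y) candidate vectors, so it is
    only practical for few streams and small constellations.
    """
    best = min(
        product(alphabet, repeat=len(y)),
        key=lambda cand: weighted_distance(
            [yi - ci for yi, ci in zip(y, cand)], w
        ),
    )
    return list(best)


if __name__ == "__main__":
    alphabet = qam_points(4)
    y = [0.9 + 1.2j, -1.1 - 0.8j]  # illustrative noisy equalizer output
    w = [[2, 0.5], [0.5, 1]]       # illustrative weight matrix
    print(joint_detect(y, w, alphabet))
```

The exponential cost in the number of jointly detected symbols is visible in the size of the candidate set, which is the motivation for replacing the exhaustive search with sphere decoding in practice.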

IV. DIVERSITY-VERSUS-MULTIPLEXING TRADEOFF

In what follows, we study the diversity-versus-multiplexing tradeoff achieved by the Conventional MIMO DFE equalizer, applied to the linearly precoded system considered here.

A. Case of UMMSE DFE Design

Theorem 2: In the case of a frequency-flat channel, $N_{tx} = 2^m$ ($m$ integer), the use of a weighted minimum distance detector and QAM constellations allows the Conventional MIMO DFE Rx, with UMMSE design, to achieve the optimal diversity-versus-multiplexing tradeoff given by $d^*(r)$ (see [1]). $d^*(r)$ is given by the piecewise-linear function connecting the points $(k, d^*(k))$, where

$$d^*(k) = (N_{tx} - k)(N_{rx} - k) \tag{24}$$

with $k = 0, 1, \ldots, p$ and $p = \min\{N_{tx}, N_{rx}\}$.

This theorem shows that the proposed transmitter-receiver combination with UMMSE design attains the optimal diversity-versus-multiplexing tradeoff derived in [1]. The result is derived without the use of the Gaussian assumption for the residual intersymbol interference (ISI) at the output of the UMMSE-DFE equalizer. This aspect is handled by bounding the residual ISI contribution; the bound turns out to be independent of the SNR, and its influence disappears at high SNR.
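To make the optimal tradeoff curve of [1] concrete, the following sketch evaluates the piecewise-linear function that connects the corner points $(k, (N_{tx}-k)(N_{rx}-k))$ for $k = 0, \ldots, \min\{N_{tx}, N_{rx}\}$, which is the standard form of the Zheng and Tse result. The function name `optimal_diversity` and its argument names are hypothetical.

```python
def optimal_diversity(r, n_tx, n_rx):
    """Optimal diversity gain d*(r) for multiplexing gain r.

    Linear interpolation between the corner points
    (k, (n_tx - k) * (n_rx - k)), k = 0, ..., min(n_tx, n_rx).
    """
    p = min(n_tx, n_rx)
    if not 0 <= r <= p:
        raise ValueError("r must lie in [0, min(n_tx, n_rx)]")
    k = min(int(r), p - 1)  # index of the corner point to the left of r
    d_left = (n_tx - k) * (n_rx - k)
    d_right = (n_tx - k - 1) * (n_rx - k - 1)
    return d_left + (r - k) * (d_right - d_left)


if __name__ == "__main__":
    # Sample the curve for a 2 x 2 system in steps of 0.5.
    for r in (0, 0.5, 1, 1.5, 2):
        print(r, optimal_diversity(r, 2, 2))
```

The two endpoints of the curve are the maximum diversity $N_{tx} N_{rx}$ at zero multiplexing gain and zero diversity at the maximum multiplexing gain $\min\{N_{tx}, N_{rx}\}$.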

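Proof: Consider the unbiased MMSE Conventional MIMO-DFE Rx. Let $\doteq$ denote exponential equality in terms of SNR, i.e., $f(\rho) \doteq \rho^{b}$ means

$$\lim_{\rho \to \infty} \frac{\log f(\rho)}{\log \rho} = b. \tag{25}$$

The exponential inequalities are defined in the same way; for example, $f(\rho) \,\dot{\leq}\, \rho^{b}$ means

$$\lim_{\rho \to \infty} \frac{\log f(\rho)}{\log \rho} \leq b. \tag{26}$$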
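The SNR exponent used in this kind of asymptotic analysis can be estimated numerically as the slope of a curve on a log-log scale. The sketch below uses a two-point finite difference, which is an assumption made for illustration (the definition itself is a limit as the SNR grows without bound); the function name `exponent_estimate` is hypothetical.

```python
import math


def exponent_estimate(f, rho_lo, rho_hi):
    """Two-point estimate of the exponent b in f(rho) ~ rho**b.

    Computes the slope of log f(rho) versus log rho between rho_lo and rho_hi.
    Constant factors in f cancel in the difference.
    """
    num = math.log(f(rho_hi)) - math.log(f(rho_lo))
    den = math.log(rho_hi) - math.log(rho_lo)
    return num / den


if __name__ == "__main__":
    # An error probability decaying like 3 * rho**-4 has diversity order 4.
    slope = exponent_estimate(lambda rho: 3 * rho ** -4, 1e3, 1e6)
    print(-slope)
```

For an error probability curve the diversity order is the negative of this slope, which matches the definition of the diversity degree as the asymptotic slope of the error probability at high SNR.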

The proof of Theorem 2 is structured in three steps. In Step 1, we characterize the frame (block) error probability in terms of the probability of a first symbol error. In Step 2, we derive a lower bound on this first symbol error probability. In this same step, we also handle the non-Gaussian interference. Finally, in Step 3, we characterize the behavior of the error probability for large SNR and derive the diversity-versus-multiplexing tradeoff.

Step 1: The symbol vectors of the transmitted frame are detected sequentially in time using the DFE Rx. We denote by the event of making an error when detecting the th symbol vector ( is the complement, i.e., the event in which no error is made when detecting the th symbol vector). Whenever there is an error on any of the detected symbols, the frame is said to be in error. denotes the frame error probability. is the probability of the union of individual error events

(27)

The union and intersection of events are distributive, so the event can be written as the union of the two events and

(28)

and are two disjoint sets , hence

(29)

where denotes . Exploiting this fact, we can show by recursion that

(30)

Since probability is nonnegative, , and also

(31)

which leads to

(32)

We would like to show that , which will be obtained if we can show that . Since is finite, it is sufficient to show that for any . To that end, consider a genie-aided receiver that has access to the correct value of the past detected symbols and cancels exactly the interference coming from these symbols. The genie-aided receiver reproduces for most of the frame the same situation as for the first symbol, for which there is no interference from the past. Then the probability of error of the genie-aided receiver when detecting symbol , , satisfies

(33)

Indeed, the DFE symbol estimation error is stationary as far as the noise contribution is concerned, whereas for the residual ISI, it is determined by the filter which is anticausal and FIR of memory . So the ISI is stationary for most of the frame, except for the last symbols, where fewer future symbols interfere due to the following guard interval. The second part of (33) will be discussed at the end of Step 2. On the other hand

(34)

In the event , the symbols , are cor-rectly detected, which provides access to their true values, hence

(35)

Obviously, when knowing the true values of , theprobability to make an error on deciding them is zero, hence

(36)

and (34) becomes

(37)

where in the second inequality we used the fact that . Combining (33) and (37), we conclude

that , and from (30)

(38)

Since is finite

(39)

hence . Combined with (32) the desired result then follows:

(40)

The above results are independent of the criterion used for the design of the equalizer filters, and assume only the DFE structure.
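The counting argument of Step 1 can be checked on a toy model. The sketch below is only an illustration under stated assumptions: it uses a hypothetical two-state error process (the names `frame_error_stats`, `p_clean` and `p_after_error` are not from the paper) in which a symbol vector is in error with probability `p_clean` as long as all past decisions were correct, and with probability `p_after_error` once an error has propagated. It enumerates every error pattern exactly and confirms that the frame error probability equals the sum of the disjoint "first error at position k" events, each of which is no larger than the genie-aided error probability.

```python
from fractions import Fraction
from itertools import product


def frame_error_stats(T, p_clean, p_after_error):
    """Enumerate all 2**T error patterns of a toy DFE with error propagation.

    p_clean:       error probability of a symbol vector when every past
                   decision was correct (the genie-aided situation).
    p_after_error: error probability once some past decision was wrong.
    Returns (P_frame, [P(first error at position k) for k = 1..T]).
    """
    p_frame = Fraction(0)
    p_first = [Fraction(0)] * T
    for pattern in product((0, 1), repeat=T):
        prob = Fraction(1)
        seen_error = False
        for e in pattern:
            p = p_after_error if seen_error else p_clean
            prob *= p if e else 1 - p
            seen_error = seen_error or bool(e)
        if seen_error:
            p_frame += prob
            p_first[pattern.index(1)] += prob
    return p_frame, p_first


T = 6
p = Fraction(1, 100)
P_frame, P_first = frame_error_stats(T, p, Fraction(1, 2))
print(P_frame == sum(P_first))   # union of errors = sum of disjoint first-error events
print(p <= P_frame <= T * p)     # first-symbol lower bound, T-fold upper bound
```

In this toy model the frame error probability does not depend on `p_after_error` at all, which mirrors the observation that the result relies only on the DFE structure and not on how severe the error propagation is.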

Step 2: In this step of the proof, we derive an upper bound on the first symbol error probability for a fixed channel realization

. In Step 2, we shall denote as to simplify notation. We use the weighted minimum distance detector at the output of the UMMSE Conventional MIMO DFE (19):

(41)
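As a rough illustration of what a weighted minimum distance detector does, the following sketch searches exhaustively for the QAM vector that minimises a quadratic form in the decision error. It is not the paper's detector: the weight matrix `weight`, the helper names and the test vector are all assumptions made for the example, and the exact weighting used in (41) is not reproduced here.

```python
from itertools import product


def qam_alphabet(side):
    """Square QAM alphabet on the odd-integer grid, `side` levels per real dimension."""
    levels = [2 * i - (side - 1) for i in range(side)]
    return [complex(x, y) for x in levels for y in levels]


def weighted_metric(err, weight):
    """Quadratic form err^H W err for a Hermitian weight matrix W (list of rows)."""
    n = len(err)
    total = sum(err[i].conjugate() * weight[i][j] * err[j]
                for i in range(n) for j in range(n))
    return total.real


def weighted_min_distance_detect(y, weight, alphabet):
    """Exhaustive search for the symbol vector a minimising (y - a)^H W (y - a)."""
    return min(
        product(alphabet, repeat=len(y)),
        key=lambda a: weighted_metric([yi - ai for yi, ai in zip(y, a)], weight),
    )


alphabet = qam_alphabet(2)                     # 4-QAM: {+-1 +-1j}
y = [0.9 - 1.2j, -0.3 + 0.8j]                  # made-up noisy equalizer output
W = [[1.0, 0.0], [0.0, 0.25]]                  # made-up diagonal weighting
print(weighted_min_distance_detect(y, W, alphabet))
```

The exhaustive search grows exponentially with the vector length and is only meant for tiny examples; Section V points to the sphere decoder as the numerically efficient way to carry out this kind of search.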

An error occurs if we decide for transmitted . For an error to occur we need to have

(42)

where . If we denote by , then (42) is equivalent to

(43)

where is defined in (15) and denotes the real part of its argument. Consider now the rotated symbols . Then the MMSE DFE estimate gets produced with the filters

(44)

and the corresponding symbol estimation error has a diagonal covariance matrix . Let and

. Note that which is the projection matrix on the column space of

. is a diagonal matrix of ones and zeros, and hence verifies . Now, (43) translates to the following inequality for

(45)

where the second inequality is an application of the Cauchy–Schwarz inequality [8]. This can also be written as

(46)

The instantaneous channel capacity is being diagonal, Jensen’s inequality

leads to

(47)


Now consider a Tx scheme in which the transmitted rate varies with SNR. The different components of come from the same QAM constellation of size where

is the overall allocated rate² and is a positive integer. The minimum distance of the constellation is , with . So the components of

belong to the QAM constellation

. The error components of then belong to the set

. This leads to the upper bound

(48)
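To make the role of the constellation size concrete, the following sketch builds square QAM constellations normalised to unit average energy and checks the standard relation between squared minimum distance and size M, namely 6/(M - 1). The unit-energy normalisation and the helper names are assumptions for this illustration; the normalisation actually used in the paper is not visible in the text above.

```python
import math


def unit_energy_qam(bits_per_dim):
    """Square QAM with 2**bits_per_dim levels per real dimension, unit average energy."""
    side = 2 ** bits_per_dim
    levels = [2 * i - (side - 1) for i in range(side)]
    pts = [complex(x, y) for x in levels for y in levels]
    scale = math.sqrt(sum(abs(p) ** 2 for p in pts) / len(pts))
    return [p / scale for p in pts]


def min_distance(points):
    """Smallest Euclidean distance between two distinct constellation points."""
    return min(abs(p - q) for i, p in enumerate(points) for q in points[i + 1:])


for b in (1, 2, 3):
    M = 4 ** b                                  # 4-QAM, 16-QAM, 64-QAM
    print(M, min_distance(unit_energy_qam(b)) ** 2, 6 / (M - 1))
```

Because the minimum distance shrinks as the constellation grows, a rate that increases with SNR trades part of the SNR gain for a denser constellation, which is the mechanism behind the bounds that follow.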

On the other hand the choice of ensures that [9]

(49)

Applying the bounds (47), (48), (49) to (46), leads to

(50)

For a given channel realization, the error event is included in the event described by (50), hence

(51)

The normalized symbol error , which is for , can be written as

(52)

can be written as where

is an vector, is an matrix,

and

² The actual rate is R(ρ) = (1 − )r ln ρ as already commented on earlier.

The normalized DFE output error is a mixture of Gaussian noise and non-Gaussian ISI . In the following, we show how to handle the error probability without the Gaussian assumption.

By construction of the matrix norm (largest singular value) (applied here to ) induced by the vector

norm [8], . On the one hand

(53)

Consequently, . On the other hand, all the components of belong to the same QAM constellation, hence

(54)

We conclude that . Now, by the triangle inequality [8], we get . The upper bound (51) now leads to

(55)

where

(56)

has a zero mean Gaussian distribution, with a bounded covariance matrix , by a reasoning similar to the one in (53), and with rank equal to . The rank (which is also the rank of where ) is a function of and . can only be zero if which is clearly a case of outage and hence does not need to be considered further (see Step 3). Let be a matrix of rank

such that . Denote where . Then is Gaussian with zero mean and covariance matrix . Also implies , and since , we get

(57)

From (55), the error probability is then majorized by

(58)

So the main idea in the probability of error analysis without the Gaussian assumption is that, in the normalized error, the contribution of the residual ISI is finite, whereas for any rate below capacity, an error event is situated in an exponentially receding tail of the Gaussian noise part. The tail is parameterized by where for high SNR , so that the error probability, for a rate below capacity, behaves asymptotically as if the noise were effectively Gaussian. Further analysis of the error probability is presented in Step 3 of the proof.
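The claim that a bounded ISI term does not change the asymptotic behavior of a Gaussian tail can be illustrated numerically in one dimension. The sketch below is a simplified scalar analogue, not the vector bound of the proof: it assumes a unit-variance real Gaussian and an offset bounded by `isi_bound` (both hypothetical), and compares the logarithm of the tail probability with and without the worst-case offset.

```python
import math


def gaussian_tail(x):
    """Q(x) = P(N(0, 1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def exponent_ratio(t, isi_bound):
    """ln Q(t - B) / ln Q(t): effect of a bounded offset B on the tail exponent."""
    return math.log(gaussian_tail(t - isi_bound)) / math.log(gaussian_tail(t))


for t in (5.0, 10.0, 30.0):
    print(t, round(exponent_ratio(t, 1.0), 3))
```

The ratio approaches one as the threshold `t` grows, so on a logarithmic scale the bounded offset becomes negligible relative to the receding Gaussian tail.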

For the last symbol periods of the frame, fewer symbols appear in the residual ISI and as a result gets reduced to and gets increased to , hence

(59)

which leads to the second part of (33), at least for the asymptotic analysis considered here.

Step 3: In this step, we seek to study the behavior of the error probability for large SNR, in order to derive the diversity-versus-multiplexing tradeoff achieved by the proposed scheme. Let , be the nonzero eigenvalues of sorted in a nondecreasing order. We continue in the footsteps of [1] and introduce the variables via . At high SNR we have

, where denotes . On the other hand the capacity satisfies ,

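hence . In [1], it was shown that for an allocated rate $R = r \ln \rho$, the outage probability is

$$P(\text{outage}) \doteq \rho^{-d^*(r)} \qquad (60)$$

where $d^*(r)$ is given by the piecewise-linear function connecting the points $(k, d^*(k))$, $k = 0, 1, \ldots, \min(N_{tx}, N_{rx})$, where

$$d^*(k) = (N_{tx} - k)(N_{rx} - k) \qquad (61)$$

It was also shown in the same reference that any scheme with rate $R = r \ln \rho$ has an error probability that satisfies

$$P_e \,\dot{\geq}\, \rho^{-d^*(r)} \qquad (62)$$

hence $d^*(r)$ is also called the optimal diversity-rate tradeoff curve.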
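The optimal tradeoff curve of [1] is easy to evaluate numerically. The helper below is a minimal sketch that interpolates linearly between the integer points (k, (n_tx - k)(n_rx - k)) established in [1]; the function name `optimal_tradeoff` and the argument names `n_tx` and `n_rx` are chosen here for illustration.

```python
def optimal_tradeoff(r, n_tx, n_rx):
    """Piecewise-linear d*(r) through (k, (n_tx - k)(n_rx - k)), k = 0..min(n_tx, n_rx)."""
    k_max = min(n_tx, n_rx)
    if not 0 <= r <= k_max:
        raise ValueError("multiplexing gain must lie in [0, min(n_tx, n_rx)]")
    k = min(int(r), k_max - 1)          # left end of the linear segment containing r
    d_left = (n_tx - k) * (n_rx - k)
    d_right = (n_tx - k - 1) * (n_rx - k - 1)
    return d_left + (r - k) * (d_right - d_left)


for r in (0, 0.5, 1, 1.5, 2):
    print(r, optimal_tradeoff(r, 2, 2))
```

For a 2 x 2 system the curve starts at the full diversity of 4 for zero multiplexing gain, passes through 1 at r = 1, and reaches 0 at the maximum multiplexing gain of 2.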

Let . We define the outage event as. The complementary event of outage is denoted

as no outage . An upper bound on can now be derived as

outage

outage no outage (63)

since

outage

outage no outage (64)

From (60) we conclude that

outage

(65)

Let us characterize no outage . By applying the Chernoff bound to (58), we get for any

no outage

(66)

In particular for

no outage

(67)
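The way a Chernoff bound captures the exponential decay of a Gaussian-norm tail can be seen on a chi-square variable, for which the exact tail is available in closed form when the number of degrees of freedom is even. The sketch below is a generic illustration under that assumption and does not reproduce the specific expressions (66) and (67); the function names are hypothetical.

```python
import math


def chi2_tail_exact(t, dof):
    """P(X >= t) for a chi-square variable with an even number of degrees of freedom."""
    m = dof // 2
    half = t / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(m))


def chi2_tail_chernoff(t, dof, s):
    """Chernoff bound E[exp(sX)] exp(-s t) = (1 - 2s)**(-dof/2) exp(-s t), 0 < s < 1/2."""
    return (1.0 - 2.0 * s) ** (-dof / 2.0) * math.exp(-s * t)


dof = 4                                   # e.g. a rank-2 complex Gaussian vector
for t in (10.0, 40.0, 160.0):
    s = 0.5 * (1.0 - dof / t)             # minimiser of the bound, valid for t > dof
    print(t, chi2_tail_exact(t, dof), chi2_tail_chernoff(t, dof, s))
```

The bound always lies above the exact tail, yet both decay with the same exponential rate in `t`, which is all that matters for a diversity analysis carried out on the exponential scale.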

For any channel realization, the event means, or hence

Introducing the definition of (50), (54), (56) into (67) leads to

no outage

(68)

Now for any . Hence for any finite

we have no outage , and consequently

no outage (69)

Combining this result with (63) and (65) leads to

(70)

This is valid for any , hence

(71)

Using (40), we arrive at an upper bound for the frame error probability

(72)

Combined with the lower bound of (62), this allows us to conclude that the proposed scheme attains the optimal diversity-versus-multiplexing tradeoff, i.e.

(73)


It is actually possible to extend this result to any number of transmit antennas.

Theorem 3: The use of a weighted minimum distance detector and QAM constellations allows the Conventional MIMO-DFE Rx, with UMMSE design, to achieve the optimal diversity-versus-multiplexing tradeoff for any number of transmit antennas .

Proof: Theorem 2 showed that the UMMSE Conventional MIMO-DFE achieves the optimal tradeoff for with integer. The proof of Theorem 2 can be extended to the case of

by using a number of data streams and a precoding matrix that contains only the first rows of

(74)

where now the are the roots of . The overall channel is . One can now introduce a virtual channel of size , that contains in the first columns and zeros in the remaining columns: . This leads to

where is the delay diversity matrix of size . Using this embedding, Steps 1 and 3 of the proof of Theorem 2 remain unchanged, whereas for Step 2, should be replaced by .

B. Case of MMSE-ZF-DFE Design

Theorem 4: In the case of a frequency-flat channel and ( integer), the use of a weighted minimum distance detector and QAM constellations (for the streams ) allows the Conventional MIMO-DFE Rx, with MMSE-ZF design, to achieve the diversity-versus-multiplexing tradeoff given by . The tradeoff is the piecewise-linear function connecting the points , with

(75)

The tradeoff of the MMSE-ZF DFE is compared to that of the MMSE DFE in Fig. 4.

Proof: The proof of Theorem 4 follows the same lines as the proof of Theorem 2. We point out below the minor differences.

Step 1 This step is the same as Step 1 of Theorem 2; its main result is

(76)

Fig. 4. Diversity-versus-multiplexing tradeoff comparison between MMSE and MMSE-ZF-DFE designs (assuming N � N ).

Step 2 As we have established in Step 2 for Theorem 2, an error occurs when detecting the first symbol if

where and (see Section III-B). Unlike in the MMSE design, the MMSE-ZF Conventional MIMO-DFE makes no compromise between interference cancellation and noise enhancement, and cancels the interference entirely. Hence has a Gaussian distribution with zero mean and covariance matrix . Again using the Cauchy–Schwarz inequality we get

(77)

which can also be written as

(78)

Let . Using Jensen’s inequality we get

(79)

As in the proof of Theorem 2, the error event can be shown to be included in the following event

(80)

Step 3 is required for the MMSE-ZF design to hold. is identifiable from the QR factorization of . In fact, is a unitary matrix and is an upper triangular matrix, and also with . Denote where is column of , and

the projection of onto the orthogonal complement of . Then and .
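The link between the diagonal of the triangular QR factor and successive orthogonal projections can be verified with a few lines of Gram-Schmidt. The sketch below is an illustration only: the column ordering (each column is projected onto the orthogonal complement of the columns before it), the helper names and the example vectors are assumptions, since the exact definitions are not legible in the text above.

```python
def gram_schmidt_qr(columns):
    """Classical Gram-Schmidt on a list of complex column vectors.

    Returns (q_columns, r_diag), where r_diag[i] is the norm of column i after
    projecting out columns 0..i-1, i.e. the i-th diagonal entry of R in H = QR.
    """
    q_cols, r_diag = [], []
    for h in columns:
        v = list(h)
        for q in q_cols:
            coeff = sum(qi.conjugate() * hi for qi, hi in zip(q, h))   # q^H h
            v = [vi - coeff * qi for vi, qi in zip(v, q)]
        norm = sum(abs(vi) ** 2 for vi in v) ** 0.5
        q_cols.append([vi / norm for vi in v])
        r_diag.append(norm)
    return q_cols, r_diag


def det_gram_2x2(h1, h2):
    """det(H^H H) for a two-column H, computed directly from inner products."""
    g11 = sum(abs(x) ** 2 for x in h1)
    g22 = sum(abs(x) ** 2 for x in h2)
    g12 = sum(a.conjugate() * b for a, b in zip(h1, h2))
    return g11 * g22 - abs(g12) ** 2


h1 = [1 + 1j, 0.5 - 2j, -1j]               # made-up channel columns
h2 = [2 - 1j, 1 + 0j, 0.3 + 0.7j]
_, r = gram_schmidt_qr([h1, h2])
print(r[0] ** 2 * r[1] ** 2, det_gram_2x2(h1, h2))
```

The two printed numbers agree up to rounding: the product of the squared diagonal entries of R equals det(H^H H), so each stream of a zero-forcing successive-cancellation receiver sees a gain equal to the squared norm of one projected channel column.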

has been reported in [1] to be the instantaneous capacity of the BLAST technique. The get introduced via

. Paralleling the steps in [1], an outage event occurs for . In [1], it has been shown that , where denotes the exponential equality. is the piecewise-linear function connecting the points , with

(81)

It was also shown in the same reference that any scheme with rate , and a ZF constrained structure, has an error probability that satisfies

(82)

The rest of the proof is the same as for Theorem 2, except that is upper bounded by

(83)

In the end we conclude that and

(84)

V. COMPARISON WITH PERFECT SPACE–TIME CODES WITH MINIMUM DELAY

The perfect space–time codes with minimum delay (PSTC-MD) are block codes that span symbol periods [5], [6]. They have a full rate of information symbols per symbol period. They also have a nonvanishing determinant, which ensures that they achieve the optimal diversity-versus-multiplexing tradeoff if an ML decoder is used [7]. The minimum delay of PSTC-MD (block size) ensures that they have minimum complexity for ML detection w.r.t. all (unstructured) block codes that are known to achieve the optimal diversity-versus-multiplexing tradeoff. Note though that STS codes are very structured Linear Dispersion codes in which the basis matrices are only nonzero in one diagonal. ML detection for the STS codes could be done efficiently using the Viterbi algorithm (with the state involving symbols). We continue to focus here on the DFE receiver for STS, though.

For the Golden Code [19] is a PSTC-MD example, for which the transmitted code block has the following form:

(85)

where and . are the encoded QAM symbols. In the following we compare the STS and the PSTC-MD in terms of detection complexity. For performance evaluation, simulations are performed for the case by comparing the STS and the Golden Code. The Golden Code is the best known code with minimum delay for [19].
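The nonvanishing determinant property can be checked by brute force for small symbols. Since (85) is not legible in this copy, the sketch below assumes the form of the Golden Code usually quoted from [19], with the golden ratio and its conjugate; all constant and function names are illustrative. It enumerates nonzero Gaussian-integer symbol differences with real and imaginary parts in {-1, 0, 1} and reports the smallest squared determinant magnitude.

```python
import math
from itertools import product

THETA = (1 + math.sqrt(5)) / 2          # golden ratio
THETA_BAR = 1 - THETA                   # its algebraic conjugate
ALPHA = 1 + 1j * (1 - THETA)
ALPHA_BAR = 1 + 1j * (1 - THETA_BAR)


def golden_codeword(a, b, c, d):
    """2x2 Golden Code block for four complex (QAM) symbols a, b, c, d."""
    s = 1 / math.sqrt(5)
    return [
        [s * ALPHA * (a + b * THETA), s * ALPHA * (c + d * THETA)],
        [s * 1j * ALPHA_BAR * (c + d * THETA_BAR), s * ALPHA_BAR * (a + b * THETA_BAR)],
    ]


def det2(x):
    """Determinant of a 2x2 matrix given as a list of rows."""
    return x[0][0] * x[1][1] - x[0][1] * x[1][0]


# The code is linear, so codeword differences are codewords of symbol differences.
gauss_ints = [complex(re, im) for re in (-1, 0, 1) for im in (-1, 0, 1)]
min_det_sq = min(
    abs(det2(golden_codeword(*syms))) ** 2
    for syms in product(gauss_ints, repeat=4)
    if any(syms)
)
print(min_det_sq)    # stays bounded away from zero
```

The minimum does not shrink when the constellation is enlarged, because the determinant is a nonzero Gaussian integer scaled by a fixed constant; this is the property that lets the code reach the optimal tradeoff under ML decoding.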

A. Detection Complexity

The ML detection for PSTC-MD and the minimum distance detection for the DFE equalizer output can both be performed in a numerically efficient way by using the sphere decoder (SD). The complexity in arithmetic operations of the SD was shown in [20], [21] to be polynomial in the number of jointly decoded real symbols , for most of the SNR range of interest

(86)

The exponent is in general a function of the SNR and the symbol constellation size. Both PSTC-MD-ML and STS-DFE use the same QAM constellations for the same transmission rate.

STS With DFE Equalizer: The number of jointly detected real symbols is , and this detection is performed each symbol period. Hence the complexity per symbol period is . One should also consider the complexity of the filtering by the feedforward and feedback filters, which is per symbol period, by exploiting the fact that these filters are cascades of constant matrix multiplications and delays. The complexity of computing the filters is , but that is incurred only once per channel coherence time. So the dominating complexity is that of detection.

Fig. 5. Performance comparison of STS with UMMSE-DFE receiver and the Golden Code (GC) with ML receiver.

PSTC-MD With ML Detection: The number of transmitted QAM symbols per code block is , hence the number of jointly detected real symbols is . The detection is performed each symbol periods, hence the complexity per symbol period is . Hence the PSTC-MD receiver is times more complex than the STS receiver. Typical values of the exponent are mostly cubic

for high SNR, but can range from 3 to 7 for the low to middle SNR region and high-order constellations [21].
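The complexity comparison can be made concrete with a small cost model. The sketch below rests on assumptions that are inferred rather than read from the text: each QAM symbol counts as two real symbols, the STS-DFE jointly detects one vector of `n_tx` QAM symbols per symbol period, and the PSTC-MD detects a block of `n_tx ** 2` QAM symbols once every `n_tx` symbol periods. The function names and the exponent `gamma` are illustrative.

```python
def sd_cost(num_real_symbols, gamma):
    """Polynomial sphere-decoder cost model: (jointly detected real symbols) ** gamma."""
    return num_real_symbols ** gamma


def per_period_costs(n_tx, gamma):
    """Detection cost per symbol period for STS-DFE and for PSTC-MD with ML detection."""
    sts = sd_cost(2 * n_tx, gamma)                   # one detection per symbol period
    pstc = sd_cost(2 * n_tx ** 2, gamma) / n_tx      # one detection every n_tx periods
    return sts, pstc


for n_tx in (2, 4):
    sts, pstc = per_period_costs(n_tx, 3)
    print(n_tx, sts, pstc, pstc / sts)
```

Under these assumptions the ratio of the two costs is `n_tx ** (gamma - 1)`, so with a cubic exponent the block-code receiver is 4 times more expensive for two antennas and 16 times for four.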

B. Numerical Performance

In Fig. 5, we compare the performance of STS with the UMMSE design for the DFE receiver (‘STS’ curves) with the Golden Code with an ML decoder (‘GC’ curve), whereas in Fig. 6 the performance of the STS with the UMMSE and the MMSE-ZF designs for the DFE receiver is shown.

The number of antennas is and the STS frame length is fixed here to . Since the frame length of the STS and the block length of the Golden Code (two symbol periods) are different, we chose the BER for performance comparison. Performance is evaluated for both 4-QAM and 16-QAM constellations. In order to eliminate the error propagation effect, the “STS-GA” curve shows the performance of the DFE decoder in which perfect feedback is assumed (Genie Aided (GA) detection). Such performance can ultimately be achieved by combining with a binary error-correcting code (FEC) [15], [16] to decode and reencode the past detected bits before using them in the feedback.

Fig. 6. Performance comparison of the UMMSE and the MMSE-ZF designs for the Genie-Aided DFE receiver of STS.

From Fig. 5 we can observe that the STS, with UMMSE DFE receiver, outperforms the Golden Code when error propagation is eliminated but suffers otherwise. The difference in performance between ‘STS-GA’ and ‘GC’ is larger for higher order constellations, at BER: 1.1 dB for the 16-QAM constellation against 0.4 dB for the 4-QAM constellation. This offset is due to a lower coding gain of the minimum-delay Golden Code scheme.

In Fig. 6 we can verify that the UMMSE design largely outperforms the MMSE-ZF design; in particular, the BER slope at high SNR is steeper for the UMMSE than for the MMSE-ZF. This indicates that a higher degree of diversity is exploited by the UMMSE. Finally we note that in the example considered the guard interval of the STS reduces the rate by %. As mentioned before, this loss can be avoided by increasing the frame length .

VI. CONCLUDING REMARKS

In this paper, we showed again, for the specific transmission problem and equalizer considered here, that even though an MMSE design tends to the corresponding MMSE-ZF design at high SNR, those two designs behave substantially differently from a diversity point of view, even though diversity is characterized at high SNR. Furthermore, MMSE designs also apply to scenarios in which zero forcing is not possible. For finite SNR, it may be desirable to robustify the transmission scheme against error propagation by combining it with error correcting codes [15], [16], [22].

The introduction of a DFE reduces the cascade of prefilter, frequency-flat channel and DFE to a frequency-flat MIMO system again, leading to the joint detection of a vector of symbols of size (when it is a power of two). We conjecture that the complexity of this detection problem constitutes a lower bound for the complexity of MIMO transceiver schemes attaining the optimal diversity-versus-multiplexing tradeoff.

We may also remark that even though the space–time spreading transmission scheme considered here is easily extended to the frequency-selective channel case [18], the use of a DFE to attain the optimal tradeoff is not so clear. What happens in the frequency-flat case is that the feedforward and feedback filters of the DFE are simple functions of the prefilter and the channel separately and do not require consideration of the cascade . This separation property disappears in the frequency-selective case.

REFERENCES

[1] L. Zheng and D. Tse, “Diversity and multiplexing: A fundamental tradeoff in multiple-antenna channels,” IEEE Trans. Inf. Theory, vol. 49, no. 5, pp. 1073–1096, May 2003.

[2] H. Yao and G. Wornell, “Achieving the full MIMO diversity-multiplexing frontier with rotation-based space-time codes,” in Proc. 41st Annu. Allerton Conf. Commun., Contr., Comput., Monticello, IL, Oct. 2003.

[3] S. Tavildar and P. Viswanath, “Permutation codes: Achieving the diversity-multiplexing tradeoff,” in Proc. IEEE ISIT 2004, Chicago, IL, Jun.–Jul. 2004.

[4] H. E. Gamal, G. Caire, and M. Damen, “Lattice coding and decoding achieve the optimal diversity-vs-multiplexing tradeoff of MIMO channels,” IEEE Trans. Inf. Theory, vol. 50, no. 6, pp. 968–985, Jun. 2004.

[5] F. Oggier, G. Rekaya, J.-C. Belfiore, and E. Viterbo, “Perfect space-time block codes,” IEEE Trans. Inf. Theory, vol. 52, no. 9, pp. 3885–3902, Sep. 2006.

[6] P. Elia, B. A. Sethuraman, and P. Kumar, “Perfect space-time codes with minimum and non-minimum delay for any number of antennas,” in Proc. Int. Conf. Wireless Networks, Communication and Mobile Computing, Hawaii, Jun. 2005, vol. 1, pp. 722–727.

[7] P. Elia, K. R. Kumar, and H. Lu, “Explicit space-time codes that achieve the diversity-multiplexing gain tradeoff,” in Proc. IEEE Int. Symp. Information Theory, Adelaide, Australia, Sep. 2005.

[8] G. Golub and C. V. Loan, Matrix Computations, 3rd ed. Baltimore, MD: The Johns Hopkins University Press, 1996.

[9] A. Medles and D. Slock, “Linear convolutive space-time precoding for spatial multiplexing MIMO systems,” in Proc. 39th Annu. Allerton Conf. Commun., Contr., Comput., Monticello, IL, Oct. 2001.

[10] ——, “Multistream space-time coding by spatial spreading, scrambling and delay diversity,” in Proc. ICASSP Conf., Orlando, FL, May 2002.

[11] H. E. Gamal and R. Hammons, “A new approach to layered space-time coding and signal processing,” IEEE Trans. Inf. Theory, vol. 47, no. 6, pp. 2321–2334, Sep. 2001.

[12] X. Giraud, E. Boutillon, and J.-C. Belfiore, “Algebraic tools to build modulation schemes for fading channels,” IEEE Trans. Inf. Theory, vol. 43, no. 3, pp. 938–952, May 1997.

[13] B. Hassibi and B. M. Hochwald, “High-rate codes that are linear in space and time,” IEEE Trans. Inf. Theory, vol. 48, no. 7, pp. 1804–1824, Aug. 2000.

[14] S. Galliou and J. Belfiore, “A new family of linear full-rate space-time codes based on Galois theory,” in Proc. IEEE ISIT 2002, Lausanne, Switzerland, Jun.–Jul. 2002.

[15] T. Guess and M. Varanasi, “A new successively decodable coding technique for intersymbol-interference channels,” in Proc. IEEE ISIT, Sorrento, Italy, Jun. 2000.

[16] M. Kobayashi and G. Caire, “A low complexity approach to space-time coding for multipath channels,” in Proc. 6th Int. Symp. Wireless Pers. Multimedia Commun., Yokosuka, Japan, Oct. 2003.

[17] A. Duel-Hallen, “Equalizers for multiple input/multiple output channels and PAM systems with cyclostationary input sequences,” IEEE J. Sel. Areas Commun., vol. 10, pp. 630–639, Apr. 1992.

[18] A. Medles and D. Slock, “Spatial multiplexing by spatiotemporal spreading of multiple symbol streams,” in Proc. 5th Int. Symp. Wireless Pers. Multimedia Commun. 2002 (WPMC 2002), Honolulu, HI, Oct. 2002.

[19] J.-C. Belfiore, G. Rekaya, and E. Viterbo, “The golden code: A 2 × 2 full-rate space-time code with nonvanishing determinants,” IEEE Trans. Inf. Theory, vol. 51, no. 4, pp. 1432–1436, Apr. 2005.

[20] B. Hassibi and H. Vikalo, “On the sphere-decoding algorithm I. Expected complexity,” IEEE Trans. Signal Process., vol. 53, no. 8, pp. 2806–2818, Aug. 2005.

[21] H. Vikalo and B. Hassibi, “On the sphere-decoding algorithm II. Generalizations, second-order statistics, and applications to communications,” IEEE Trans. Signal Process., vol. 53, no. 8, pp. 2819–2834, Aug. 2005.

[22] A. Medles and D. Slock, “Linear versus channel coding trade-offs in full diversity full rate MIMO systems,” in Proc. ICASSP Conf., Montreal, QC, Canada, May 2004.