
UNIVERSITY OF CALIFORNIA SAN DIEGO

Bandwidth Efficient Channel Estimation for Multiple-Input Multiple-Output

(MIMO) Wireless Communication Systems: A Study of Semi-Blind and

Superimposed Schemes.

A dissertation submitted in partial satisfaction of the

requirements for the degree Doctor of Philosophy

in

Electrical Engineering

(Communication Theory and Systems)

by

Aditya K. Jagannatham

Committee in charge:

Professor Bhaskar D. Rao, Chair
Professor Ian Abramson
Professor Robert Bitmead
Professor Kenneth Kreutz-Delgado
Professor Laurence Milstein

2007

Copyright

Aditya K. Jagannatham, 2007

All rights reserved.

The dissertation of Aditya K. Jagannatham is approved,

and it is acceptable in quality and form for publication

on microfilm:

Chair

University of California San Diego

2007


To My Father,

Prof. Anantha Swamy Jagannatham

(January 5, 1950 - October 14, 2004)


TABLE OF CONTENTS

Signature Page
Dedication
Table of Contents
List of Figures
List of Tables
Acknowledgements
Vita and Publications
Abstract

1 Introduction
    1.1 MIMO System Modeling and Channel Estimation
    1.2 Estimation Philosophies
        1.2.1 Pilot based Estimation
        1.2.2 Blind Estimation
        1.2.3 Semi-Blind Philosophy
    1.3 Complex-Constrained Cramer-Rao Bounds
    1.4 Whitening-Rotation Based Semi-Blind MIMO Channel Estimation
    1.5 FIM based Regularity Analysis of Semi-Blind MIMO FIR Channels
    1.6 Semi-Blind Channel Estimation for MRT Based MIMO Systems
    1.7 Superimposed Pilots for MIMO Channel Estimation
    1.8 Channel Estimation for Time-Varying Channels
    1.9 Discussion

2 Complex Constrained Cramer-Rao Bound (CC-CRB)
    2.1 Introduction
    2.2 CRB For Complex Parameters With Constraints
    2.3 A Constrained Matrix Estimation Example
        2.3.1 Problem Formulation
        2.3.2 Cramer-Rao Bound
        2.3.3 ML Estimate and Simulation Results
    2.4 Conclusion

3 Whitening-Rotation Based Semi-Blind MIMO Channel Estimation
    3.1 Introduction
    3.2 Problem Formulation
    3.3 Estimation accuracy for semi-blind approaches
        3.3.1 Estimation Accuracy of the WR scheme
        3.3.2 Constrained CRB of the WR scheme
    3.4 Algorithms
        3.4.1 Orthogonal Pilot ML (OPML) estimator
        3.4.2 Iterative ML procedure for general pilot - IGML
        3.4.3 Total Optimization
    3.5 Simulation Results
    3.6 OFDM Channel Estimation
        3.6.1 Problem Description
        3.6.2 Simulation results
    3.7 Conclusions
    3.8 Appendix for Chapter(3)

4 Fisher Information Based Regularity and Semi-Blind Estimation of MIMO-FIR Channels
    4.1 Introduction
    4.2 Problem Formulation
    4.3 Semi-Blind Fisher Information Matrix (FIM)
        4.3.1 FIM: A General Result
        4.3.2 Blind FIM
        4.3.3 Pilots and FIM
        4.3.4 Pilots and Identifiability
    4.4 Semi-Blind Estimation: Performance
        4.4.1 Asymptotic Semi-Blind FIM
    4.5 Semi-blind Estimation: Algorithm
        4.5.1 Orthogonal Pilot ML (OPML) for Q Estimation
        4.5.2 Orthogonal Pilot Matrix Construction
    4.6 Simulation Results
    4.7 Conclusion
    4.8 Appendix for Chapter(4)
        4.8.1 Proof of Theorem 2
        4.8.2 Proof of Theorem 3

5 Semi-Blind Estimation for Maximum Ratio Transmission
    5.1 Introduction
    5.2 Preliminaries
        5.2.1 System Model and Notation
        5.2.2 Conventional Least Squares Estimation (CLSE)
        5.2.3 Semi-Blind Estimation
    5.3 Conventional Least Squares Estimation (CLSE)
        5.3.1 Perturbation of Eigenvectors
        5.3.2 MSE in $v_c$
        5.3.3 Received SNR and Symbol Error Rate (SER)
    5.4 Closed-Form Semi-Blind estimation (CFSB)
        5.4.1 MSE in $v_s$ with Perfect $u_s$
        5.4.2 Received SNR with Perfect $u_s$
        5.4.3 MSE in $v_s$ with Noise-Free Training
        5.4.4 Received SNR with Noise-Free Training
        5.4.5 Semi-blind Estimation: Summary
    5.5 Comparison of CLSE and Semi-blind Schemes
        5.5.1 Performance of a 2 × 2 System with CLSE and CFSB
        5.5.2 Discussion
        5.5.3 Semi-blind Estimation: Limitations and Alternative Solutions
    5.6 Simulation Results
    5.7 Conclusion
        5.7.1 Proof of Lemma 1
        5.7.2 Received SNR with perfect $u_s$
        5.7.3 Proof for equations (5.28) and (5.29)
        5.7.4 Performance of Alamouti Space-Time Coded Data with Conventional Estimation
        5.7.5 Other Useful Lemmas

6 Superimposed Pilots for MIMO Channel Estimation
    6.1 Superimposed Pilots (SP) Based MIMO Estimation
    6.2 MSE of Estimation
        6.2.1 Cramer-Rao Bound (CRB) for SP Estimation
        6.2.2 Semi-Blind SP Estimation
    6.3 Throughput Performance
        6.3.1 A Throughput Lower Bound for Channels with Correlated Noise
        6.3.2 Throughput Comparison of Superimposed and Conventional Pilots (CP)
        6.3.3 Conventional Pilots (CP) based estimation
    6.4 Optimal Power Allocation in SP
        6.4.1 Minimum Variance Distortionless Response (MVDR) Beamformer
    6.5 Simulation Results
        6.5.1 MSE of Estimation
        6.5.2 Throughput Performance
        6.5.3 Optimal Power Allocation
    6.6 Conclusion
    6.7 Appendix for Chapter(6)
        6.7.1 Proof of Expression for MSEs in section(6.2)
        6.7.2 Proof of Theorem 6
        6.7.3 Proof of Theorem 7
        6.7.4 MVDR - Post-Processing SNR

7 MIMO Time Varying Channel Estimation
    7.1 Introduction
    7.2 Problem Setup
        7.2.1 SP Estimation Based on the CEBEM MIMO Model
    7.3 EM Based Algorithm for CEBEM SP Estimation
        7.3.1 Likelihood computation and Sphere Decoding
    7.4 Simulations
    7.5 Conclusion

8 Conclusions

Bibliography


LIST OF FIGURES

Figure 1.1: Schematic representation of a MIMO System.

Figure 1.2: Schematic representation of a MIMO frame.

Figure 1.3: Pictorial Representation of Pilot vs. Blind Tradeoff.

Figure 1.4: Pictorial Representation of Conventional Vs. Superimposed Pilots.

Figure 2.1: Computed MSE Vs SNR, $|\hat{Q}(1,1) - Q(1,1)|^2$.

Figure 2.2: Computed MSE Vs SNR, $\|\hat{Q} - Q\|^2$.

Figure 3.1: MSE vs. SNR of OPML semi-blind channel estimation and the semi-blind CRB with perfect knowledge of W. Also shown for reference is MSE of the exclusively training based channel estimate. H is an 8 × 4 complex flat-fading channel matrix and pilot length L = 12.

Figure 3.2: Computed MSE vs. Pilot length (L) for the OPML, IGML, ROML and exclusive training based channel estimation. H is an 8 × 4 complex flat-fading channel matrix and SNR = 8 dB.

Figure 3.3: Comparison of OPML with perfect W, OPML with imperfect or estimated W, total optimization and training based estimation of H.

Figure 3.4: Probability of Bit Error vs. SNR for 8 × 4 MIMO system employing OPML, Total Optimization (N = 1000, 500). The performance of the exclusively training based channel estimate is also given for comparison.

Figure 3.5: Constrained Vs. unconstrained channel estimation for OFDM.

Figure 4.1: Schematic representation of an SB system.

Figure 4.2: Schematic representation of input and output symbol blocks.

Figure 4.3: Paley Hadamard Matrix.

Figure 4.4: Rank deficiency of the complex MIMO FIM Vs. number of transmitted pilot symbols ($L_p$) for a 6 × 5 MIMO FIR system of length $L_h$ = 5.

Figure 4.5: MSE Vs SNR in a 4 × 2 MIMO channel with $L_h$ = 2 channel taps, $L_p$ = 20 pilot symbols.

Figure 4.6: MSE performance for estimation of a 4 × 2 MIMO frequency-selective channel. Left - MSE Vs. $L_p$ and Right - MSE Vs. number of blind symbols.

Figure 4.7: Symbol error rate (SER) Vs. SNR for QPSK symbol transmission of a 4 × 2 MIMO frequency selective channel with $L_h$ = 2 channel taps.

Figure 5.1: MIMO system model, with beamforming at the transmitter and receiver.

Figure 5.2: Comparison of the transmission scheme for conventional least squares (CLSE) and closed-form semi-blind (CFSB) estimation.

Figure 5.3: Average channel gain of a t = r = 2 MIMO channel with L = 2, N = 8 and $P_D$ = 6 dB, for the CLSE and beamforming, CFSB and beamforming (with and without knowledge of $u_1$), CLSE and white data (Alamouti-coded), and perfect beamforming at transmitter and receiver. Also plotted is the theoretical result for the performance of Alamouti-coded data with channel estimation error, given by (5.34).

Figure 5.4: MSE in $v_1$ vs training data length L, for a t = r = 4 MIMO system. Curves for CLSE, CFSB and OPML with perfect $u_1$ are plotted. The top five curves correspond to a training symbol SNR of 2 dB, and the bottom five curves 10 dB.

Figure 5.5: SER of beamformed-data vs number of training symbols L, t = r = 4 system, for two different values of white-data length N, and data and training symbol SNR fixed at $P_T$ = $P_D$ = 6 dB. The two competing semi-blind techniques, OPML and CFSB, are plotted. CFSB marginally outperforms OPML for N = 50, as it only requires an accurate estimate of $u_1$ from the blind data.

Figure 5.6: SER vs L, t = r = 4 system, for two different values of N, and data and training symbol SNR fixed at $P_T$ = $P_D$ = 6 dB. The theoretical and experimental curves are plotted for the CFSB estimation technique. Also, the LCSB technique outperforms both the conventional (CLSE) and semi-blind (CFSB) techniques.

Figure 5.7: SER versus data SNR for the t = r = 2 system, with L = 2, N = 16, $\gamma_p$ = 2 dB. ‘CLSE-Alamouti’ refers to the performance of the spatially-white data with conventional estimation, ‘CLSE-bf’ is the performance of the beamformed data with $v_c$, ‘CFSB’ and ‘LCSB’ refer to the performance of the corresponding techniques after accounting for the loss due to the white data. ‘CFSB-u1’ is the performance of CFSB with perfect $u_1$, and ‘Perf-bf’ is the performance with the perfect $u_1$ and $v_1$ assumption.

Figure 6.1: Schematic of a Superimposed Pilot System.

Figure 6.2: Schematic diagram of the superimposed pilot (SP) frame structure.

Figure 6.3: Schematic of conventional (time-multiplexed) pilots frame (block) structure.

Figure 6.4: MSE of Estimation of MIMO wireless channel with r = t = 4, PNR = 5 dB, $N_f$ = 10 and $L_p$ = 8 symbols.

Figure 6.5: MSE of Estimation of SIMO Rayleigh wireless channel with r = 4 antennas, $N_f$ = 20, $L_p$ = 8, PNR = 5 dB.

Figure 6.6: Throughput performance of SP and CP vs. $N_f$, SNR = PNR = 5 dB, $L_p$ = 64.

Figure 6.7: Throughput performance of SP and CP Vs. SNR for a 4 × 4 Rayleigh flat-fading MIMO channel with $N_f$ = 10 sub-frames and $L_p$ = 64 pilots.

Figure 6.8: Detection performance vs. SNR for SP based estimation. SER vs. SNR ($P_d/\sigma_n^2$) for QPSK signaling, r = 4 SIMO channel and different [$N_f$, $L_p$, $\alpha_s$ (dB)].

Figure 6.9: Optimal power allocation ratio $10 \log_{10}(\rho_d^\star / \rho_t^\star)$ of a r = 4 antenna SIMO channel Vs. Total Power ($\alpha_s$ dB) for various $N_f$, $L_p$.

Figure 7.1: MSE of Kalman based estimation of a time-varying wireless channel.


LIST OF TABLES

Table 6.1: Table showing covariance matrices for SP and CP systems with channel estimation error.


ACKNOWLEDGEMENTS

First and foremost, I would like to thank my advisor Prof. Bhaskar Rao for his continuous guidance and dedicated personal efforts which led to the fruition of this research work. His valuable advice and inputs contributed to a great extent to this work. I am also thankful to him for the uninterrupted financial support during my several years here at UCSD. It was a privilege to work with him. I would also like to thank my committee members, Prof. Ian Abramson, Prof. Robert Bitmead, Prof. Kenneth Kreutz-Delgado and Prof. Laurence Milstein, for their inputs and critique which have helped address important aspects in this research. I owe special gratitude to the UCSD CoRe research grant agency¹ and the affiliated companies for supporting me throughout my PhD program.

¹ This work was supported by CoRe Research Grants Com00-10074, com02-10119, com04-10176 and com04-10173.

The text of chapter 2, in part, is a reprint of the material as it appears in A. K. Jagannatham and B. D. Rao, “Cramer-Rao Lower Bound for Constrained Complex Parameters”, IEEE Signal Processing Letters, Vol. 11, No. 11, Nov’04, Pages: 875 - 878 and A. K. Jagannatham and B. D. Rao, “Complex Constrained CRB and its applications to Semi-Blind MIMO and OFDM Channel Estimation”, Sensor Array and Multichannel Signal Processing Workshop Proceedings, 2004, Barcelona, Spain, 18-21 July 2004, Pages: 397 - 401. Chapter 3, in part, is a reprint of papers which have been published as A. K. Jagannatham and B. D. Rao, “Whitening-Rotation Based Semi-Blind MIMO Channel Estimation”, IEEE Transactions on Signal Processing, Vol. 54, No. 3, Mar’06, Pages: 861 - 869; A. K. Jagannatham and B. D. Rao, “A Semi-Blind Technique For MIMO Channel Matrix Estimation”, 4th IEEE Workshop on Signal Processing Advances in Wireless Communications, 2003, Rome, Italy, 15-18 June 2003, Pages: 304 - 308; and A. K. Jagannatham and B. D. Rao, “Constrained ML Algorithms for Semi-Blind MIMO Channel Estimation”, IEEE Global Telecommunications Conference, 2004 (GLOBECOM ’04), Vol. 4, Nov’29 - Dec’3, 2004, Pages: 2475 - 2479. The text of chapter 4 has appeared in A. K. Jagannatham and B. D. Rao, “FIM Regularity for Gaussian Semi-Blind MIMO FIR Channel Estimation”, Conference Record of the Thirty-Ninth Asilomar Conference on Signals, Systems and Computers, Oct. 28 - Nov. 1, 2005, Pages: 848 - 852. Chapter 5, in part, is a reprint of the material which has appeared as A. K. Jagannatham, C. R. Murthy and B. D. Rao, “A Semi-Blind Channel Estimation Scheme for MRT”, Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005 (ICASSP ’05), Mar’05, Vol. 3, Pages: 585 - 588. Finally, chapter 6 is adapted from the content of A. K. Jagannatham and B. D. Rao, “Superimposed Vs. Conventional Pilots for Channel Estimation”, Conference Record of the Fortieth Asilomar Conference on Signals, Systems and Computers, Nov., 2006. The dissertation author was the primary researcher and author, and the co-authors listed in these publications contributed to or supervised the research which forms the basis for this dissertation.

My labmate Chandra R. Murthy deserves a special mention not only for his valuable technical feedback but also for directly collaborating with me on part of this work. I am grateful to the DSP lab members Abhijeet Bhorkar, Zhongren Cao, Ethan Duni, Yogananda Isukapalli, Cecile Levasseur, Joseph Murray, June Chul Roh, Shankar Shivappa, Anand Subramaniam, Thomas Svantesson, Yeliz Tokgoz, David Wipf, Chengjin Zhang, Wenyi Zhang, Jun Zheng and UCSD colleagues Preeti Nagvanshi and Ramesh Annavajjala for the many hours of discussions, both technical and non-technical. UCSD’s Mesa graduate housing has provided me with a very comfortable and affordable abode during the years of my graduate studies. This stay was made lively by my roommates Ashay Dhamdhere, Sandeep Kanumuri and Daniel Richter, and I thank them for providing great company. I also wish to thank friends at UCSD, Sumit Bhardwaj, Anuj Grover, Anuj Mishra, Swamy Muddu, Ali Rangwala, Sourja Ray, Satish Narayanasamy and Sachin Talati, whose companionship has contributed to enriching the grad life experience at UCSD. My studies so far have been made a fun-filled experience by friends Mohan Dunga, Prashanth Gangu, Phanindra Ganti, Srikanth Geedipalli, Ram Kolli, Girish Nagavarapu, Sameer Ranjan, Pramod Reddy, Sridhar Reddy, Saurabha Tavildar, Sampath Vetsa, Satish Vutukuru and many others.

I express a deep sense of gratitude to family friends Dr. Venkat R. Mali and Harini Mali for their help, both material and emotional, during my stay here. My uncle, Sreedhara Swamy Jagannatham, and aunt, Lalitha Jagannatham, have been a constant source of support and motivation for me during this entire period and I wish to thank them profoundly. Finally, I wish to greatly thank my sister Anila and brother-in-law Dr. Nandan R. Thirunahari for their help and encouragement over the years. Seeing them happy contributes to the joy of my life and I wish them all the success in their careers.

In the tradition of India, it is not customary to thank one’s parents because they are in essence present in each of their child’s endeavors. This work belongs to my mother Smt. Bhagya L. Jagannatham and my father Dr. Anantha Swamy Jagannatham (Professor of Chemistry and former Vice Chancellor of Osmania University, Hyderabad, India), as much as it is mine. The toughest challenge during this course was coping with the loss of my father, whose dream it was to see me earn this PhD. In him I lost a mentor, supporter and a great friend. I dedicate this thesis to his loving memory. May his soul rest in peace.


VITA

1980 Born - Hyderabad, Andhra Pradesh, INDIA.

1995 Central Board of Secondary Education Certificate, Hyderabad Public School, Hyderabad, AP, INDIA.

1997 AP State Board of Intermediate Education Certificate, Little Flower Junior College, Hyderabad, INDIA.

2001 Bachelor of Technology, Electrical Engineering, Indian Institute of Technology Bombay, Powai, Mumbai, INDIA.

2001-2007 Research Assistant, Department of Electrical and Computer Engineering, University of California San Diego, La Jolla, CA, U.S.A.

2003 Graduate Student Intern, Zyray Wireless (Now Broadcom Inc.), San Diego, CA, U.S.A.

2004 Master of Science Degree, Electrical Engineering, University of California San Diego, La Jolla, CA, U.S.A.

2007 Doctor of Philosophy Degree, Electrical Engineering (Communication Theory and Systems), University of California San Diego, La Jolla, CA, U.S.A.


PUBLICATIONS

A. K. Jagannatham and B. D. Rao, “Whitening-Rotation Based Semi-Blind MIMO Channel Estimation”, IEEE Transactions on Signal Processing, Vol. 54, No. 3, Mar’06, Pages: 861 - 869.

A. K. Jagannatham and B. D. Rao, “Cramer-Rao Lower Bound for Constrained Complex Parameters”, IEEE Signal Processing Letters, Vol. 11, No. 11, Nov’04, Pages: 875 - 878.

A. K. Jagannatham and B. D. Rao, “Superimposed Vs. Conventional Pilots for Channel Estimation”, Conference Record of the Fortieth Asilomar Conference on Signals, Systems and Computers, Nov., 2006.

A. K. Jagannatham and B. D. Rao, “FIM Regularity for Gaussian Semi-Blind MIMO FIR Channel Estimation”, Conference Record of the Thirty-Ninth Asilomar Conference on Signals, Systems and Computers, Oct. 28 - Nov. 1, 2005, Pages: 848 - 852.

A. K. Jagannatham, C. R. Murthy and B. D. Rao, “A Semi-Blind Channel Estimation Scheme for MRT”, Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005 (ICASSP ’05), Mar’05, Vol. 3, Pages: 585 - 588.

A. K. Jagannatham and B. D. Rao, “Constrained ML Algorithms for Semi-Blind MIMO Channel Estimation”, IEEE Global Telecommunications Conference, 2004 (GLOBECOM ’04), Vol. 4, Nov’29 - Dec’3, 2004, Pages: 2475 - 2479.

A. K. Jagannatham and B. D. Rao, “Complex Constrained CRB and its applications to Semi-Blind MIMO and OFDM Channel Estimation”, Sensor Array and Multichannel Signal Processing Workshop Proceedings, 2004, Barcelona, Spain, 18-21 July 2004, Pages: 397 - 401.

A. K. Jagannatham and B. D. Rao, “A Semi-Blind Technique For MIMO Channel Matrix Estimation”, 4th IEEE Workshop on Signal Processing Advances in Wireless Communications, 2003, Rome, Italy, 15-18 June 2003, Pages: 304 - 308.


ABSTRACT OF THE DISSERTATION

Bandwidth Efficient Channel Estimation for Multiple-Input Multiple-Output

(MIMO) Wireless Communication Systems: A Study of Semi-Blind and

Superimposed Schemes

by

Aditya K. Jagannatham

Doctor of Philosophy in Electrical Engineering

(Communication Theory and Systems)

University of California, San Diego, 2007

Professor Bhaskar D. Rao, Chair

This dissertation explores and analyzes novel schemes for bandwidth efficient channel estimation in multiple-input multiple-output (MIMO) wireless systems. As the number of receive/transmit antennas increases in MIMO systems, the number of channel coefficients to be estimated increases. This, together with the low SNR of operation in MIMO systems, necessitates an increase in the pilot symbol overhead, which reduces the bandwidth efficiency. To alleviate this problem, we study several procedures such as whitening-rotation and superimposed pilots for bandwidth efficient MIMO channel estimation. The Cramer-Rao bound (CRB) serves as an important tool in the performance evaluation of estimators which arise frequently in the fields of communications and signal processing. In applications such as semi-blind channel estimation one is frequently faced with the estimation of constrained complex parameters. We present a result for the CRB for complex-constrained parameters, and the utility of this framework is illustrated in the subsequent work on semi-blind channel estimation.


In addition to using the pilot sequence, the accuracy of the channel estimate at the receiver can be enhanced by employing second-order statistical information. For this purpose, we propose a whitening-rotation (WR) based algorithm for semi-blind estimation of the complex flat-fading MIMO channel matrix H. Utilizing the complex constrained CRB, we show that the semi-blind scheme can significantly improve estimation accuracy. Next, we consider the problem of semi-blind (SB) channel estimation for MIMO frequency-selective (FS) channels. We motivate a Fisher information matrix (FIM) based analysis of this semi-blind estimation problem and demonstrate that the rank deficiency of the FIM is related to the number of unidentifiable parameters. We also establish the minimum number of pilot symbols necessary to achieve regularity (full rank) of the FIM for identifiability. The efficacy and applicability of the semi-blind philosophy is further exemplified by demonstrating its utility in the context of channel estimation for MIMO systems employing Maximum Ratio Transmission (MRT).

Superimposed pilots (SP) are another interesting alternative to reduce the impact of pilot overhead without a significant increase in computational complexity. We present a study of the mean-squared error (MSE) and throughput performance of SP for the estimation of a MIMO wireless channel. We illustrate a semi-blind scheme for SP based MIMO channel estimation, which improves performance over the traditional mean-estimator. A new result is presented for the worst-case capacity of a communication channel with correlated information symbols and noise. We also address the issue of optimal source-pilot power allocation for SP. Finally, we consider the problem of estimation of a time-selective MIMO wireless channel using SP symbols. We demonstrate a scheme for channel estimation based on a complex exponential basis expansion model (CEBEM) approximation of the time-selective wireless channel. We further reduce the MSE of estimation by employing an expectation-maximization (EM) based iterative estimation procedure.


1 Introduction

In recent years, the field of wireless communications has experienced revolutionary growth which has changed the face of telecommunications. On the one hand, the easy availability of GSM and CDMA based mobile wireless cellular devices has connected hitherto far-flung corners of the world; on the other, Wi-Fi based 802.11b/a devices have enabled ubiquitous data/content access. However, at present, the bandwidth available on these devices is limited. Currently there is a flurry of research and development activity to produce 4G wireless devices that will in the near future support very high data rate applications such as DVB, which distributes high-definition multimedia content.

One of the most challenging issues faced by designers in such an endeavor is the harsh nature of the wireless radio channel. Unlike the wireline channel, the wireless channel is highly variable and subject to fading. This fading nature of the wireless channel arises from the superposition of multiple signal paths due to scattering from local obstructions such as buildings, trees and other objects. Initially, wireless communication systems were single-input single-output (SISO) systems, which means that they employed a single transmit antenna that inputs the transmit symbol stream into the radio channel, and a single receive antenna that receives a single copy of the signal. However, the capacity of such a SISO wireless link is adversely affected by fading, as the received signal is severely attenuated when the channel is in a deep fade. This problem can be partially alleviated by introducing multiple antennas at the receiver. By separating the receiver antennas by a spacing greater than approximately half the wavelength of the narrowband signal, one can ensure that the radio channel seen by each of these antennas is an independent realization of the fading process. Thus, at any given instant, the probability that all of these channels are in a deep fade is greatly reduced, thereby ensuring signal reliability. This scheme, whereby multiple antennas at the receiver ensure an improved signal quality by combating fading, is termed diversity reception. Alternatively, it can be demonstrated that by introducing multiple transmit antennas the resulting multiple-input single-output (MISO) system can combat fading by using transmit diversity [1]. Thus, diversity at the wireless transmitter or the receiver helps combat fading, and systems that use multiple antennas at the transmitter or the receiver are also colloquially termed smart antenna systems.

Figure 1.1: Schematic representation of a MIMO System.
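The benefit of diversity reception described above can be illustrated numerically. The following is a minimal sketch, assuming independent Rayleigh fading with unit mean power on every receive branch (so each channel power gain is exponentially distributed) and a hypothetical deep-fade threshold. The function names and the threshold value are illustrative assumptions and do not come from this dissertation.

```python
import math
import random


def deep_fade_probability(threshold, branches):
    """Probability that ALL branches are in a deep fade.

    Assumes i.i.d. Rayleigh fading with unit mean power, so the power
    gain |h|^2 of each branch is exponential with mean 1 and
    P(|h|^2 < threshold) = 1 - exp(-threshold).
    """
    single_branch = 1.0 - math.exp(-threshold)
    return single_branch ** branches


def simulate_deep_fade(threshold, branches, trials=100_000, seed=0):
    """Monte Carlo estimate of the same probability (illustrative only)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw one exponential power gain per branch; count the trial
        # only if every branch falls below the threshold.
        if all(rng.expovariate(1.0) < threshold for _ in range(branches)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(n, deep_fade_probability(0.1, n))
```

Under these assumptions, a threshold of 0.1 (10 dB below the mean power) gives a single-antenna deep-fade probability of roughly 0.095, and the probability falls geometrically with each additional independent branch.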

It is then natural to consider the effect of introducing multiple antennas at both the transmitter and the receiver; such a system is termed a multiple-input multiple-output (MIMO) system. In one of the most interesting results in communication theory, it can be demonstrated [2] that introducing multiple antennas in this manner makes possible a linear increase in the information-theoretic capacity with the minimum of the number of transmit and receive antennas. Thus, a MIMO system can effectively result in a multi-fold

increase in capacity over a conventional SISO wireless link. This is achieved by

transmitting different information streams over the transmit antennas and is also

termed as spatial multiplexing. Thus, a MIMO system offers the dual benefits


of increased capacity due to spatial multiplexing and fading suppression due to

receive/transmit diversity. These properties have greatly attracted the interest of

researchers in the wireless community and in recent years there has been a flurry

of activity in the MIMO area. This forms the general area of work described in

this thesis.
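
The capacity scaling mentioned above can be checked with a short computation. The sketch below uses the standard expression log2 det(I + (ρ/t) H H^H) for a channel known at the receiver with equal power across the transmit antennas. This expression is a well-known result assumed here for illustration (it is not stated in the text), and the function name is hypothetical.

```python
import numpy as np

def mimo_capacity_bits(H, snr_linear):
    """Capacity in bits/s/Hz: log2 det(I_r + (snr/t) H H^H), equal power per antenna."""
    r, t = H.shape
    gram = np.eye(r) + (snr_linear / t) * (H @ H.conj().T)
    _, logdet = np.linalg.slogdet(gram)   # numerically stable log-determinant
    return logdet / np.log(2)
```

For an identity channel with n antennas on each side the expression reduces to n log2(1 + ρ/n), i.e. n parallel streams, which is the spatial multiplexing effect described above.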

Channel estimation for a wireless system is a significantly more challeng-

ing task than for a wireline channel. This is attributed to the mobility feature of

wireless devices which causes temporal variations in the wireless channel arising

due to the Doppler effect. Thus, the period of time for which the radio channel is invariant, also known as the coherence time, is of the order of a few milliseconds,

necessitating frequent re-estimation of the channel coefficients.

The issue of channel estimation assumes a much more critical significance

in the context of a MIMO system. As the diversity of the MIMO system increases,

the number of parameters to be estimated increases as the product rt, where r/t

denote the number of receive/transmit antennas in the MIMO system. Further,

due to the diversity feature of the MIMO system, the SNR at each antenna is even

lower. For instance, employing binary orthogonal FSK modulation at an operating BER of $2 \times 10^{-3}$, while an SNR of 25 dB is required with a single receive antenna, an SNR of 12 dB suffices with 4 antennas [3]. The SNR at each antenna is even

lower. Hence, in a MIMO system, ironically, one needs to estimate many more

parameters at a much lower SNR compared to a SISO system. Hence, such harsh

conditions motivate the development of more robust channel estimation techniques

which estimate the MIMO channel "efficiently". This notion of efficiency will be

clarified in the discussion that follows and the topic of efficient channel estimation

forms the central goal of this thesis.


1.1 MIMO System Modeling and Channel Estimation

The MIMO wireless system can be represented as a matrix wireless channel as will be seen below. Let $x_d(k) \in \mathbb{C}^{t \times 1}$ be the $k$th transmitted MIMO symbol vector. This vector $x_d(k)$ is given as,

\[
x_d(k) = \begin{bmatrix} x_{d,1}(k) \\ x_{d,2}(k) \\ \vdots \\ x_{d,t}(k) \end{bmatrix},
\]

where $x_{d,j}(k)$, $1 \le j \le t$, is the symbol transmitted from the $j$th transmit antenna. Similarly, the receive symbol vector $y_d(k)$ is given as,

\[
y_d(k) = \begin{bmatrix} y_{d,1}(k) \\ y_{d,2}(k) \\ \vdots \\ y_{d,r}(k) \end{bmatrix},
\]

where $y_{d,i}(k)$, $1 \le i \le r$, is the received signal at the $i$th receive antenna. In a flat-fading MIMO system (where the symbol duration is much greater than the multipath delay spread of the channel), the input-output system model at each receive antenna can be expressed as,

\[
y_{d,i}(k) = \sum_{j=1}^{t} h_{ij}\, x_{d,j}(k) + \eta_i(k),
\]

where $\eta_i(k)$ is the noise added at the $i$th receive antenna. This can be represented in matrix form as,

\[
y_d(k) = H x_d(k) + \eta(k),
\]


Figure 1.2: Schematic representation of a MIMO frame.

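where the matrix $H \in \mathbb{C}^{r \times t}$ represents the MIMO channel. This matrix $H$ is given as,

\[
H = \begin{bmatrix}
h_{11} & h_{12} & \cdots & h_{1t} \\
h_{21} & h_{22} & \cdots & h_{2t} \\
\vdots & \vdots & \ddots & \vdots \\
h_{r1} & h_{r2} & \cdots & h_{rt}
\end{bmatrix}.
\]

The coefficient $h_{ij}$ represents the flat-fading channel between the $i$th receiver and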

the jth transmitter. Knowledge of this channel matrix is necessary for detection

of the transmitted symbols xd(k). Hence, it is necessary to estimate the channel

matrix H. This procedure of estimating H is known as channel estimation.

This channel estimate can then either be employed at the receiver for detection [4] or fed back to the transmitter [5, 6] for transmit precoding and beamforming.

Finally, it is worth mentioning that the combined impact of channel state feed-

back and estimation error is an interesting problem which has been handled in a

comprehensive fashion in [7].
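
The equivalence between the per-antenna model and the matrix model of this section can be verified directly. The sketch below is an illustration only: the antenna counts, the BPSK symbols, the noise level and the function names are assumptions made for the example.

```python
import numpy as np

def received_vector(H, x, noise):
    """Matrix form of the flat-fading model: y = H x + eta."""
    return H @ x + noise

def received_per_antenna(H, x, noise):
    """Scalar form: y_i = sum_j h_ij x_j + eta_i, evaluated antenna by antenna."""
    r, t = H.shape
    y = np.zeros(r, dtype=complex)
    for i in range(r):
        y[i] = sum(H[i, j] * x[j] for j in range(t)) + noise[i]
    return y

rng = np.random.default_rng(0)
r, t = 4, 2                                         # receive / transmit antennas
H = (rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))) / np.sqrt(2)
x = rng.choice([-1.0, 1.0], size=t) + 0j            # BPSK symbols (illustrative choice)
eta = 0.1 * (rng.standard_normal(r) + 1j * rng.standard_normal(r))

y_matrix = received_vector(H, x, eta)
y_scalar = received_per_antenna(H, x, eta)
```

Both forms yield the same received vector, and the channel matrix contains $rt$ complex coefficients that must be estimated.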

1.2 Estimation Philosophies

1.2.1 Pilot based Estimation

A typical wireless system involves the transmission of a sequence of sym-

bols also known as a frame. The number of symbols in the frame is also termed


as the frame length. Each such frame consists of an initial transmission of pilot

symbols (or simply pilots) and the number of these symbols is the pilot length

of the frame. A schematic diagram of such a system is shown in fig(1.2). These

pilots are a fixed set of symbols which are known at the receiver. Thus, at the

receiver, by observing the outputs to this known sequence of pilots, one can esti-

mate the MIMO channel. This estimate can then be employed for detection of the

information symbols transmitted subsequently. Such a scheme is termed as pilot

based estimation and is the most commonly employed channel estimation proce-

dure. A mathematically rigorous treatment of this procedure is presented in the

subsequent chapters. Overall, the above scheme has the dual benefits of a robust

estimate and low computational complexity. However, a major drawback of such a

scheme is that these predetermined pilots themselves carry no information. Hence,

these pilot symbols are in effect an overhead on the communication system and

result in wastage of bandwidth, making such schemes "bandwidth inefficient".
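
As a concrete illustration of pilot based estimation, the sketch below forms a least-squares estimate of the channel from the outputs to a known pilot block. The least-squares form, the particular orthogonal pilot matrix and the function names are assumptions made for this example; the rigorous treatment is deferred to the subsequent chapters.

```python
import numpy as np

def ls_channel_estimate(Y_p, X_p):
    """Least-squares estimate H_hat = Y_p X_p^H (X_p X_p^H)^{-1} from pilot outputs."""
    gram = X_p @ X_p.conj().T
    return Y_p @ X_p.conj().T @ np.linalg.inv(gram)

def pilot_overhead(pilot_length, frame_length):
    """Fraction of the frame that carries no information."""
    return pilot_length / frame_length

# Orthogonal pilots for t = 2 transmit antennas over 4 symbol periods (illustrative).
X_p = np.array([[1, 1, 1, 1],
                [1, -1, 1, -1]], dtype=complex)

rng = np.random.default_rng(1)
H = (rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))) / np.sqrt(2)
Y_p = H @ X_p                      # noise-free pilot outputs, for clarity
H_hat = ls_channel_estimate(Y_p, X_p)
```

In the absence of noise the estimate coincides with the true channel; with noise, the accuracy improves with the pilot length, which is exactly the overhead measured by `pilot_overhead`.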

1.2.2 Blind Estimation

Blind estimation is a novel strategy to eliminate the pilot overhead in a

communication system. Ideally, a blind scheme does not employ any pilot symbols

and instead relies only on the information symbol outputs to estimate the channel.

One is now curious as to how this can be achieved since the information sym-

bols, by their very definition are unknown at the receiver. It is to be noted, that

though the information symbols are individually unknown, one can have statistical

knowledge about an ensemble of such symbols. For instance, if the transmitter employs a symmetric transmit constellation with equal a priori probabilities (as is the

case frequently), then the received symbol stream has a statistical mean of zero.

Further, with knowledge of the covariance of the input information symbols, the

computed covariance of the output information symbols can be employed to esti-

mate at least part of the channel. Thus, statistical information provides a viable

means to estimate the channel. Theoretically, if such a blind scheme were pos-


Figure 1.3: Pictorial Representation of Pilot vs. Blind Tradeoff.

sible, it would eliminate completely the need to transmit pilot symbols and thus

would be totally bandwidth efficient. In principle such schemes exist[8], but are

often computationally complex and are plagued by convergence problems. Further,

most blind schemes can estimate the channel only up to a residual indeterminate

phase factor. Thus, blind schemes, while being extremely bandwidth efficient, are

unattractive to implement in wireless systems where robustness of the estimate

and computational complexity are critical.
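
The residual indeterminacy of blind estimation can be seen from second-order statistics. The sketch below assumes uncorrelated unit-variance information symbols and white noise, so that the output covariance is $HH^H + \sigma^2 I$; under this assumption the covariance is unchanged when $H$ is replaced by $HQ$ for any unitary $Q$. The function names and dimensions are illustrative.

```python
import numpy as np

def output_covariance(H, noise_var):
    """Covariance of y = Hx + eta for unit-variance uncorrelated x and white noise."""
    r = H.shape[0]
    return H @ H.conj().T + noise_var * np.eye(r)

def random_unitary(t, rng):
    """Unitary matrix obtained from the QR factorisation of a random complex matrix."""
    A = rng.standard_normal((t, t)) + 1j * rng.standard_normal((t, t))
    Q, _ = np.linalg.qr(A)
    return Q

rng = np.random.default_rng(0)
H = (rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))) / np.sqrt(2)
Q = random_unitary(2, rng)

R_original = output_covariance(H, 0.1)
R_rotated = output_covariance(H @ Q, 0.1)   # a different channel, same statistics
```

Since two distinct channels produce identical output covariances, the covariance alone can determine the channel only up to such a unitary factor.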

1.2.3 Semi-Blind Philosophy

Thus, as seen above, there is an inherent complexity and robustness vs.

bandwidth efficiency tradeoff in pilot and blind estimation schemes as shown in

fig(1.3). One can now ask, is it possible to design schemes which are a hybrid of

pilot and blind schemes. In other words, we desire to construct a channel estimation

scheme with a limited number of pilots to alleviate the potential shortcomings of a

blind scheme, while also employing statistical "blind" information for bandwidth

efficiency. Such a scheme is semi-blind in nature since it employs both pilot and


blind information. The development of such a scheme is motivated by the following reasoning. Given a certain amount of pilot information, we wish to enhance the

quality of the channel estimate by employing statistical information to aid the

estimation process, or in other words, we wish to minimize the number of pilot

symbols transmitted by employing statistical information to improve the nature of

the channel estimate, thereby increasing the bandwidth efficiency. The formulation

and analysis of such schemes in the context of a MIMO wireless system forms a

focal point of this thesis.

1.3 Complex-Constrained Cramer-Rao Bounds

At this point, the discussion digresses slightly to discuss another impor-

tant aspect of channel estimation. The formulation of novel channel estimation

schemes as described above is incomplete without an accompanying analysis that

evaluates the merits and demerits of the particular estimator. In particular, in the

context of estimation, we are left with the problem of evaluating the mean-squared

error performance of the designed estimator, which is the frequently employed met-

ric to judge the performance of an estimator. The Cramer-Rao bound theory from

statistical estimation literature presents a classic method to characterize the per-

formance of an unbiased or asymptotically unbiased estimator. However, most

literature deals only with the problem of estimation of unconstrained real param-

eters. By this we mean that the components of the parameter vector that is being

estimated are real and can vary independent of each other in a space of appropriate

dimension.

However, in the context of semi-blind estimation as described above, one

is frequently confronted with the problem of analyzing the performance of an estimator with constraints. Further, the components of the parameter vector are

usually drawn from an underlying complex space. Such an estimation problem

with constraints essentially reduces to the estimation of a parameter vector lying


on a complex manifold. For instance, consider a unit-norm constrained complex

parameter vector $\theta \in \mathbb{C}^{m \times 1}$, i.e. $\theta$ is an $m$-dimensional complex vector with the constraint $f(\theta)$ defined as,

\[
f(\theta) \triangleq \theta^H \theta = \|\theta\|^2 = 1. \tag{1.1}
\]

The above scenario can be considered as a typical example of constrained complex

parameter estimation. As will be seen in the later chapters, the nature of semi-

blind estimation necessitates the development of a framework to analyze the MSE

performance of such estimators. In chapter(2) we address the issue of Cramer-Rao

Bounds for a constrained complex parameter. This framework is then employed

throughout the rest of the thesis to analyze the performance of the proposed esti-

mation schemes.

1.4 Whitening-Rotation Based Semi-Blind MIMO Channel Estimation

In chapter(3), we introduce the "whitening-rotation" (WR) scheme for semi-blind MIMO flat-fading channel estimation. This scheme is described as follows. Consider a MIMO channel $H \in \mathbb{C}^{r \times t}$ which has at least as many receive antennas as transmit antennas, i.e. $r \ge t$. Then, the channel matrix $H$ can be decomposed as $H = WQ^H$, where $W \in \mathbb{C}^{r \times t}$ and $Q \in \mathbb{C}^{t \times t}$ is unitary, i.e. $Q^H Q = QQ^H = I$. The matrix $W$ is popularly termed the whitening matrix$^1$. $Q$ induces a rotation on the space $\mathbb{C}^{t \times 1}$ and is therefore known as the rotation matrix. For instance, consider the singular value decomposition (SVD) of $H$ given as $H = P\Sigma V^H$. A possible choice for $W, Q$, which is employed in subsequent portions of this work, is given by

\[
W = P\Sigma, \quad \text{and} \quad Q = V. \tag{1.2}
\]

$^1$If $a \in \mathbb{C}^{t \times 1}$ is a random vector such that $E\{aa^H\} = I$ and $b \in \mathbb{C}^{r \times 1}$ is obtained by transforming $a$ as $b = Ha$, then $W$ can be employed to decorrelate or whiten $b$ as $c = W^{\dagger} b$, i.e. $E\{cc^H\} = I$.


It then becomes evident that all such matrices $W$ satisfy the property $WW^H = HH^H$, and it is well known that $W$ can be determined from the blind data alone. $Q$ can then be exclusively determined from the transmitted pilot symbols. Such a technique potentially improves estimation accuracy because the matrix $Q$, by virtue of its unitary constraint, is parameterized by fewer parameters ($t^2$ real parameters, to be precise) and hence can be determined with greater accuracy from the limited pilot data.
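
The decomposition and the parameter count above can be checked numerically. The sketch below builds $W$ and $Q$ from the economy-size SVD as in (1.2); the function names and the example dimensions are illustrative assumptions.

```python
import numpy as np

def whitening_rotation(H):
    """Return W = P Sigma and Q = V from the economy-size SVD H = P Sigma V^H."""
    P, s, Vh = np.linalg.svd(H, full_matrices=False)
    W = P @ np.diag(s)
    Q = Vh.conj().T
    return W, Q

def real_parameter_counts(r, t):
    """Real parameters in an unconstrained r x t complex H versus a t x t unitary Q."""
    return 2 * r * t, t * t

rng = np.random.default_rng(0)
r, t = 4, 2
H = (rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))) / np.sqrt(2)
W, Q = whitening_rotation(H)
```

For $r = 4$, $t = 2$ the unconstrained channel has 16 real parameters while $Q$ has only 4, which is the reduction that the limited pilot data has to support.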

The estimation of the unitary matrix Q is significantly more involved

since $Q$, unlike $H$, is a constrained matrix, constrained as $QQ^H = Q^H Q = I_t$.

Thus, the matrix Q lies on a constrained manifold. In this context, the CC-CRB

theory mentioned in the last section is employed to quantify the bounds on the

MSE of estimation of the matrix Q, in essence quantifying the performance of

the WR semi-blind scheme. In chapter(3) we demonstrate through a CC-CRB

analysis that the WR scheme has an MSE that is at least 3 dB lower than the MSE

of a pilot based estimator. Thus, it leads to a significant reduction in the MSE of

estimation.

In chapter(3), we also formulate several maximum-likelihood (ML) schemes

for the estimation of the unitary matrix Q. These schemes are then demonstrated

to asymptotically achieve the CC-CRB, thus indeed reducing the MSE over con-

ventional pilot based estimation. The constrained ML schemes and the CC-CRB

analysis form the focus of chapter(3).
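
To make the constrained estimation of $Q$ concrete, the sketch below uses one standard solution of a unitary-constrained least-squares fit, the orthogonal Procrustes solution. It assumes that $W$ is known perfectly and that the pilot matrix has orthogonal rows of equal energy; it is offered as an illustration under these assumptions and is not necessarily one of the estimators formulated in chapter(3). All names are hypothetical.

```python
import numpy as np

def estimate_rotation(Y_p, X_p, W):
    """Unitary Q minimising ||Y_p - W Q^H X_p||_F when X_p X_p^H is a scaled identity."""
    M = X_p @ Y_p.conj().T @ W        # t x t matrix whose unitary polar factor is sought
    U, _, Vh = np.linalg.svd(M)
    return U @ Vh

rng = np.random.default_rng(2)
r, t = 4, 2
H = (rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))) / np.sqrt(2)

# Ground-truth whitening and rotation matrices from the SVD of H.
P, s, Vh_true = np.linalg.svd(H, full_matrices=False)
W, Q_true = P @ np.diag(s), Vh_true.conj().T

X_p = np.array([[1, 1, 1, 1],
                [1, -1, 1, -1]], dtype=complex)   # orthogonal pilots (illustrative)
Y_p = H @ X_p                                     # noise-free pilot outputs
Q_hat = estimate_rotation(Y_p, X_p, W)
H_hat = W @ Q_hat.conj().T
```

In this noise-free setting the rotation matrix and hence the channel are recovered exactly; with noise, only the $t^2$ real parameters of $Q$ are affected by the pilot noise.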

The results arising out of the above CC-CRB analysis in the context of

WR based MIMO estimation are very general and can be readily applied in a wide

variety of estimation scenarios. One such application in the context of orthogo-

nal frequency division multiplexing (OFDM) systems is demonstrated towards the

end of chapter(3). We consider the problem of time-domain vs frequency-domain

channel estimation for OFDM systems, similar to the study in [9]. In their study,

they derive the ratio of the MSE of estimation of the above competing schemes

for OFDM channel estimation. However, their analysis involves an elaborate com-


putation of the actual MSE covariance matrices. We present an alternative and

simple approach to characterize the above scenario using the framework of con-

strained complex parameters. The analysis presented demonstrates the versatile

nature of the CC-CRB concept and its suitability for diverse estimation scenarios.

1.5 FIM based Regularity Analysis of Semi-Blind MIMO FIR Channels

The WR scheme described above elaborates one possible estimation al-

gorithm for efficient estimation of the MIMO wireless channel. In the efforts to

minimize the number of pilot symbols transmitted in a frame, it is essential to

address the following question: exactly how many pilots are required to estimate a

wireless channel in the presence of blind information? Or in other words, is blind information sufficient to estimate the MIMO channel, thus making the transmission of pilots redundant? We now investigate the theoretical limit on the minimum

number of pilot transmissions necessary for complete estimation of the MIMO

channel2. With this knowledge of the least number of pilots necessary, one can in

principle restrict the number of pilot transmissions to the minimum necessary to

estimate the MIMO channel, enhancing the bandwidth efficiency. Also, to make

our study more general, we now consider a frequency selective MIMO channel

modeled as an $L_h$-tap MIMO FIR filter.

Addressing the above issue forms the central aim of chapter(4). We

demonstrate therein that a Fisher Information Matrix (FIM) based analysis can be employed as a tool to characterize the number of identifiable parameters in the MIMO system. In fact, to be more precise, it is demonstrated that the rank of the FIM is equal to the number of identifiable parameters, or in other words, the dimension of the nullspace of the FIM yields the number of unidentifiable parameters. First,

employing the Gaussianity assumption on the transmitted information symbols,

$^2$By "complete" we mean the total identifiability of the MIMO channel without any residual indeterminacy.


we compute the blind information likelihood of a MIMO FIR system. This blind FIM component is denoted by the matrix $J^b$. It is then demonstrated that the dimension of the nullspace of $J^b$ is at least $t^2$, thus implying that blind information alone

is not sufficient to estimate the MIMO channel.

Hence, we now rely on pilot symbols to identify the $t^2$ parameters which cannot be identified from blind information alone. This can be done by simply reformulating the FIM for the total information available, i.e. pilot and blind symbol outputs. It is demonstrated therein that under certain circumstances, the FIM corresponding to pilot and blind information can be evaluated as a sum $J^b + J^t$, where $J^b$, $J^t$ are the FIMs corresponding to the blind and training symbols

respectively. The rank of this matrix can now be evaluated for each additional pilot

symbol. The minimum number of pilot symbols necessary for identifiability can

then be arrived at by simply looking at the number of pilot transmissions for which

the resulting FIM has full rank. The study in the chapter shows that at least t pilot

symbol transmissions are necessary in the MIMO FIR context for identifiability.

Thus we address the question of the fundamental limit on the number of pilot

transmissions.

Further, we demonstrate that the semi-blind CRB converges asymptot-

ically to the complex-constrained CRB demonstrated in chapter(3). Thus, the

semi-blind scheme can indeed achieve better performance by greatly reducing the

MSE of estimation of the MIMO FIR channel. Finally, for optimum MSE perfor-

mance, it is desired to employ an orthogonal pilot sequence. The design of such

a sequence is not straightforward in the context of a MIMO frequency selective

channel, as the resulting Toeplitz structure of the pilot symbol matrix imposes

additional constraints on the nature of the pilot sequence. We demonstrate a

construction scheme based on the Paley-Hadamard matrix structure to construct

an orthogonal pilot symbol matrix in the context of a MIMO frequency selective

channel.


1.6 Semi-Blind Channel Estimation for MRT Based MIMO Systems

Maximum ratio transmission (MRT) is an innovative transmission scheme

for MIMO systems. It relies on beamforming employing the dominant singular

vectors of the MIMO channel. Let the singular value decomposition of the MIMO channel be given as $H = U\Sigma V^H$, where $U \in \mathbb{C}^{r \times r}$, $V \in \mathbb{C}^{t \times t}$ are the left and right singular matrices respectively. Let $u_1$ denote the dominant left singular vector of the MIMO channel and $v_1$ the dominant right singular vector. In MRT, the MIMO transmitter employs $v_1$ for transmit beamforming. Let $x_d(k)$ be the $k$th transmitted data symbol. The transmit vector for this symbol is given as $v_1 x_d(k)$. At the MIMO receiver, the received data vector can be expressed as,

\[
y_d(k) = H v_1 x_d(k) + \eta(k) = \sigma_1 u_1 x_d(k) + \eta(k),
\]

where $\sigma_1$ is the dominant singular value of the channel. It can now be seen that the receiver can employ $u_1$ for receive beamforming. Hence, the final MRT system can be represented as,

\[
\tilde{y}_d(k) = u_1^H y_d(k) = \sigma_1 x_d(k) + \tilde{\eta}(k),
\]

where $\tilde{\eta}(k) = u_1^H \eta(k)$ is the noise after receive beamforming. It can be seen that the above channel can be modeled as a SISO channel with gain $\sigma_1$. The attractive feature of MRT is its low implementation complexity while retaining the diversity gain of the MIMO system. It can be demonstrated that MRT achieves the full MIMO diversity. Further, it also achieves the MIMO

capacity at low SNR. Hence, MRT has many advantages for implementation in a

MIMO system.
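
The reduction of the MRT link to a scalar channel can be verified with a few lines of linear algebra. The sketch below extracts the dominant singular vectors from the SVD of a known channel and checks the effective gain after transmit and receive beamforming. The function name and dimensions are illustrative assumptions, and perfect channel knowledge is assumed.

```python
import numpy as np

def mrt_beamformers(H):
    """Dominant left/right singular vectors and dominant singular value of H."""
    U, s, Vh = np.linalg.svd(H)
    u1 = U[:, 0]
    v1 = Vh[0, :].conj()        # first column of V is the conjugate of the first row of V^H
    return u1, v1, s[0]

rng = np.random.default_rng(0)
H = (rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))) / np.sqrt(2)
u1, v1, sigma1 = mrt_beamformers(H)

# Effective scalar gain seen after transmit beamforming with v1 and receive beamforming with u1.
effective_gain = np.vdot(u1, H @ v1)   # np.vdot conjugates its first argument
```

The effective gain equals the dominant singular value, so the beamformed link behaves as a scalar channel driven by the strongest spatial mode.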

It can be observed from the above description that an implementation

of MRT requires the estimation of the dominant left and right singular vectors

of the MIMO channel. One scheme to estimate these vectors is to first estimate

the MIMO channel and then perform an SVD on the channel estimate to in turn

estimate the dominant singular vectors. This is termed as the conventional scheme


for MRT channel estimation. However, such a two step estimation procedure is

sub-optimal and can result in a poor MSE performance. In chapter(5) we present

a semi-blind scheme for channel estimation in the context of MRT. This scheme

directly estimates the left and right beamforming vectors while employing the

statistical information from the transmitted data symbols.

We employ a framework based on the eigenvector perturbation analysis

from [10] to derive the expressions for MSE and BER performance of both the

schemes. It is also demonstrated that the semi-blind scheme can potentially achieve

a lower MSE than the conventional scheme. Thus, semi-blind estimation provides

a versatile estimation philosophy to address a multitude of estimation problems

arising in the context of MIMO systems.

1.7 Superimposed Pilots for MIMO Channel Estimation

Up to this point, our desire to improve the MIMO bandwidth efficiency

has been focused on a conventional pilot transmission model, where the pilot sym-

bols are time multiplexed with the information symbols. However, superimposed

pilots (SP) present a paradigm shift in the area of pilot based channel estimation.

In SP based systems, the pilot symbols are superimposed over the information

symbols (see fig(1.4)), thus enabling the transmission of information symbols over

the entire frame. Such a scheme would result in a reduction in the signal power

allocated to the data symbols. However, from Shannon’s famous channel capac-

ity result[11], the capacity of a communication system varies logarithmically with

SNR, while it depends linearly on bandwidth. Thus, by avoiding the exclusive

transmission of pilot symbols, one is in fact enhancing the bandwidth available for

information transmission, thus enhancing the overall throughput of the system.

At the receiver it is now necessary to develop novel signal processing

schemes for pilot and data separation followed by channel estimation. In chap-

ter(6) we focus on the MSE and throughput performance of MIMO channel esti-


Figure 1.4: Pictorial Representation of Conventional Vs. Superimposed Pilots

mation with superimposed pilots. We derive the Cramer-Rao Bound for SP based

MIMO estimation. Employing an asymptotic analysis, we demonstrate that the

MSE bound of an SP scheme for a SIMO system is 3 dB lower than that of the

mean-based SP estimator popularly employed in literature. This is shown to arise

because the mean-based estimator only employs the first-order statistical infor-

mation and ignores the information present in the second-order output statistics

(or the covariance). Based on this, we present an improved semi-blind estimator,

that employs both the mean and second-order statistical information to compute an enhanced

estimate of the MIMO channel. This semi-blind estimate is seen to have a superior

performance compared to the SP mean based estimator. Despite this improvement

in MSE performance of the semi-blind SP estimator, SP based estimation is outperformed by conventional pilots (CP) in terms of MSE. The reason for such a performance degradation

of SP is the additional interference from the information symbols. Hence, for a

given constant per frame pilot power, the estimation performance of an SP system

is bounded by the performance of the CP system. Yet, in spite of such a loss in

MSE performance, SP can result in a net gain of throughput owing to the band-

width efficiency arising out of simultaneous transmission of pilot and information

streams as mentioned in the discussion above. Motivated by this observation, we


derive a framework to quantify the throughput performance of SP and CP sys-

tems. A precise closed form expression for the capacity of a system with error in

the channel estimate is intractable. Hence, we employ the framework of worst case

capacity analysis with estimation error, first proposed in [12]. We generalize their

result to the scenario with correlated information and noise symbols, where the

correlation itself arises due to the error in the channel estimate. This framework

can then be employed to characterize the throughput performance of SP and CP

systems. It is observed that SP based systems can yield an improved throughput

performance in comparison to CP. Finally, we also address one other crucial question in SP systems, that of power allocation between source and pilot symbols. We

derive a closed form expression for the optimum pilot power allocation, which is

presented towards the end.
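
The interference that limits SP estimation can be illustrated with a simple first-order estimator. The sketch below assumes the model $Y = H(X_d + X_p) + N$ with known pilots $X_p$ superimposed on zero-mean data $X_d$, and estimates $H$ by correlating the output with the pilots. This particular estimator, the pilot pattern and the names are assumptions for illustration and may differ from the mean-based estimator analysed in chapter(6).

```python
import numpy as np

def sp_mean_based_estimate(Y, X_p):
    """Correlate the received frame with the known superimposed pilots."""
    return Y @ X_p.conj().T @ np.linalg.inv(X_p @ X_p.conj().T)

rng = np.random.default_rng(3)
r, t, N = 4, 2, 1024                    # antennas and frame length (illustrative)
H = (rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))) / np.sqrt(2)

# Orthogonal pilot rows spanning the whole frame.
X_p = np.vstack([np.ones(N), np.tile([1.0, -1.0], N // 2)]).astype(complex)
# Zero-mean BPSK information symbols transmitted simultaneously with the pilots.
X_d = rng.choice([-1.0, 1.0], size=(t, N)).astype(complex)

Y = H @ (X_d + X_p)                     # noise-free, to isolate the data interference
H_hat = sp_mean_based_estimate(Y, X_p)
residual = np.linalg.norm(H_hat - H)    # caused entirely by the information symbols
```

Even without noise the estimate carries a residual error caused by the information symbols, which averages out only as the frame length grows; no symbol period, however, is spent exclusively on pilots.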

1.8 Channel Estimation for Time-Varying Channels

Up until this point in this thesis our study has largely focused on channels

that are block time-invariant, i.e. channels that are invariant over one frame of

symbols. This is true of MIMO channels where there is no relative mobility between

the transmitter and receiver such as an indoor wireless LAN scenario. However,

in the context of mobile cellular communications, the mobile terminal users are

frequently in motion, introducing a Doppler shift in the carrier. Together with the

multipath environment, this results in a temporal variation of the MIMO channel,

making the channel time selective. Such a time-varying channel presents additional

challenges for channel estimation.

Popular approaches to estimating time-selective channels involve devel-

oping a parametric model for this time-varying channel and then following a para-

metric estimation approach for the model coefficients. One such frequently em-

ployed scheme involves modeling the time-selective MIMO channel as a vector

auto-regressive (AR) process. The coefficients of this AR process are then esti-


mated using the Yule-Walker MMSE algorithm. The optimal channel estimator,

conditioned on this AR model is a Kalman filter [13].

Recently, there has been an increasing research activity on complex ex-

ponential basis expansion models (CEBEM) for the modeling of time-selective

channels. In our study in chapter(7) we study the performance of CEBEM in the

context of channel estimation using superimposed pilots for time-varying MIMO

channels. The performance of the SP scheme can be further enhanced by employ-

ing an iterative estimation procedure which employs a soft-decision based scheme

to compute the channel estimate. The expectation-maximization algorithm natu-

rally lends itself to such an estimation scenario, since the transmitted information

symbols can be treated as the classical missing information. Thus, by computing

the likelihoods for the different symbols of the source constellation, one can arrive

at soft decisions on the transmitted symbols which can be employed to enhance

the channel estimate. The complexity of EM based iterative estimation increases

exponentially with the number of transmit antennas t and the size of the transmit

constellation. For instance, for a MIMO system with 4 transmit antennas employing a 16-QAM constellation, it is required to perform $16^t = 65{,}536$ likelihood

computations for each symbol, per EM iteration, which is prohibitively large for

implementation in a wireless device where the computational power available is

limited. Hence we present a novel modification to the EM algorithm where the

number of likelihood computations can be greatly reduced by employing the sphere

decoding (SD) algorithm in conjunction with the EM algorithm. Thus, the SD-EM

algorithm based on a CEBEM model is a viable and effective strategy to tackle

the problem of channel estimation for a time-varying MIMO channel.
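
The exponential growth in the number of likelihood computations quoted above follows from enumerating every candidate transmit vector. The short sketch below reproduces the count; the function names are illustrative, and the small QPSK enumeration is included only to show what is being counted.

```python
from itertools import product

def em_likelihood_evaluations(constellation_size, num_tx_antennas):
    """Number of candidate transmit vectors per symbol period: M^t."""
    return constellation_size ** num_tx_antennas

def candidate_vectors(constellation, num_tx_antennas):
    """Explicit enumeration of all candidate transmit vectors (feasible only for small cases)."""
    return list(product(constellation, repeat=num_tx_antennas))

count_16qam_4tx = em_likelihood_evaluations(16, 4)
qpsk_candidates = candidate_vectors([1, -1, 1j, -1j], 2)
```

A method that restricts attention to a small subset of these candidates, as sphere decoding does, reduces the per-iteration cost accordingly.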

1.9 Discussion

Currently, the Global System for Mobile (GSM) telecommunication standard specification, which is widely employed around the world as a cellular mobile


standard, uses 26 symbols in every time slot of 156 symbols for synchronization

and channel acquisition[14]. Thus, pilot overhead represents about 16% of the

data payload, which is a significant overhead. This figure is bound to rise in

MIMO systems in view of the reasons cited earlier in the chapter. Further, as the

communication bandwidth increases due to the changing nature of modern wire-

less devices which are supporting increasingly rich multimedia applications, the

Doppler spread increases. For instance, at a speed of about $v = 60$ miles/hr ($\approx 26$ m/s) the Doppler bandwidth $D$ of a $f_c = 2$ GHz ($f_c$ denotes the carrier frequency) channel is,

\[
D = \frac{26}{3 \times 10^{8}} \times 2 \times 10^{9} \approx 200~\text{Hz}. \tag{1.3}
\]

Thus the channel coherence time $T_c$ is of the order of $1/(4D) \approx 1.25$ ms. This

in turn implies that the pilot overhead has to be sent over the channel every

1.25 milliseconds, resulting in a substantial bandwidth overhead and poor spectral

efficiency.
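
The arithmetic above can be reproduced as follows. The sketch uses the same relations, $D = (v/c) f_c$ and $T_c \approx 1/(4D)$; the function names are illustrative. Evaluated exactly, the Doppler bandwidth for 26 m/s at 2 GHz is about 173 Hz, which the text rounds to 200 Hz before computing the coherence time.

```python
SPEED_OF_LIGHT = 3e8  # metres per second, as used in (1.3)

def doppler_spread_hz(speed_mps, carrier_hz):
    """Doppler bandwidth D = (v / c) * f_c."""
    return speed_mps / SPEED_OF_LIGHT * carrier_hz

def coherence_time_s(doppler_hz):
    """Rule-of-thumb coherence time T_c = 1 / (4 D)."""
    return 1.0 / (4.0 * doppler_hz)

D_exact = doppler_spread_hz(26.0, 2e9)     # about 173 Hz
Tc_rounded = coherence_time_s(200.0)       # 1.25 ms, using the rounded value of D
Tc_exact = coherence_time_s(D_exact)       # about 1.44 ms
```

Either way the coherence time is of the order of a millisecond, so the conclusion about frequent pilot transmission is unaffected by the rounding.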

On the other hand, emerging wireless standards and strategies are pro-

gressively complex in terms of wireless connectivity and spectrum management.

Wireless adhoc networks are aimed at supporting communication between a large

number of mobile nodes in which each node acts as a data forwarding node, thus

setting up a mobile network route on the fly. Dynamic spectrum strategies such as

cognitive radio are based on the concept of multi-user spectrum utilization, where a block of spectrum which is not currently being utilized, also termed a "spectrum-hole", is rapidly allocated to other users as it becomes available. Such

wireless scenarios of adhoc networks and cognitive radio require fast channel acqui-

sition to support the increasingly dynamic and fluctuating communication links.

This is more complex in the context of MIMO where the nodes are equipped with

multiple antennas, thus giving rise to new challenges in design and implementa-

tion of channel estimation algorithms. This thesis addresses some issues in such

an endeavor.

2 Complex Constrained Cramer-Rao Bound (CC-CRB)

2.1 Introduction

In this chapter, we present a general theory of the Cramer-Rao Bound

(CRB) for the estimation of complex-constrained parameters which provides a

valuable framework to analyze the MIMO channel estimation problems arising

in the several chapters that follow in this thesis. The CRB serves as an impor-

tant tool in the performance evaluation of estimators which arise frequently in

the fields of communications and signal processing. Most problems involving the

CRB are formulated in terms of unconstrained real parameters [15]. Two use-

ful developments of the CRB theory have been presented in later research. The

first being a CRB formulation for unconstrained complex parameters given in [16].

This treatment has valuable applications in studying the base-band performance

of modern communication systems where the problem of estimating complex pa-

rameters arises frequently. A second result is the development of the CRB theory

for constrained real parameters [17–19]. However, in applications such as semi-

blind channel estimation one is faced with the estimation of constrained complex

parameters. Though one can reduce the problem to that of estimating constrained

real parameters by considering the real and imaginary components of the complex

parameter vector, the complicated resulting expressions result in loss of insight.

Using the calculus of complex derivatives as is often done in signal processing ap-


plications, considerable insight and simplicity can be achieved by working with

the complex vector parameter as a single entity [15, 20, 21]. We thus present an

extension of the result in [17–19] inspired by the theory in [16] for the case of

constrained complex parameters. To conclude, we illustrate its usefulness by an

example of a semi-blind channel estimation problem.

2.2 CRB For Complex Parameters With Constraints

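Consider the complex parameter vector $\gamma \in \mathbb{C}^{n \times 1}$. Let $\gamma \triangleq \alpha + j\beta$ such that the real and imaginary parameter vectors $\alpha, \beta \in \mathbb{R}^{n \times 1}$ and $\xi \triangleq [\alpha^T, \beta^T]^T$. Assume that the likelihood function of the (possibly complex) observation vector $\omega \in \Omega$ parameterized by $\xi$ is $s(\omega; \xi)$. Let $\hat{\xi} : \Omega \to \mathbb{R}^{2n \times 1}$ be given as $\hat{\xi} \triangleq [\hat{\alpha}^T, \hat{\beta}^T]^T$, where $\hat{\alpha}, \hat{\beta}$ are unbiased estimators of $\alpha, \beta$ respectively. In the following analysis, we define the gradient $\frac{dr(\alpha)}{d\alpha} \in \mathbb{R}^{1 \times n}$ of a scalar function $r(\alpha)$ as a row vector:

\[
\frac{dr(\alpha)}{d\alpha} \triangleq \left[ \frac{dr(\alpha)}{d\alpha_1}, \frac{dr(\alpha)}{d\alpha_2}, \ldots, \frac{dr(\alpha)}{d\alpha_n} \right]. \tag{2.1}
\]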

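Let $\theta \in \mathbb{C}^{2n \times 1}$ be defined as in [16] by

\[
\theta \triangleq \begin{bmatrix} \gamma \\ \gamma^* \end{bmatrix}. \tag{2.2}
\]

Suppose now that the $l$ complex constraints on $\theta$ are given as

\[
h(\theta) = 0, \tag{2.3}
\]

i.e. $h(\theta) \in \mathbb{C}^{l \times 1}$. We then construct an extended constraint set (of possibly redundant constraints) $f(\theta) \in \mathbb{C}^{2l \times 1}$ as

\[
f(\theta) \triangleq \begin{bmatrix} h(\theta) \\ h^*(\theta) \end{bmatrix} = 0. \tag{2.4}
\]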

An important observation from (2.4) above is that symmetric complex constraints on these parameters are treated as disjoint. For instance, given the orthogonality of complex parameter vectors $\theta_1, \theta_2$, i.e. $\theta_1^H \theta_2 = 0$, the symmetric constraint $\theta_2^H \theta_1 = 0$ is to be treated as an additional complex constraint and hence $f(\theta) = [\theta_1^H \theta_2, \theta_2^H \theta_1]^T$. The extension of the constraints is akin to the extension of the parameter set from $\gamma$ to $\theta = [\gamma, \gamma^*]$ called for when dealing with complex parameters, and the need will become evident from the proof of Lemma 1. Reparameterizing $h(\theta) = h_R(\theta) + j h_I(\theta)$ in terms of $\xi$, let the set of $2l$ parameter constraints for $\xi$ be given by $g(\xi) = \left[ h_R(\theta)^T, h_I(\theta)^T \right]^T \big|_{\theta = \alpha + j\beta}$. Employing notation defined in [17] and borrowing the notion of a complex derivative from [15, 20], we define $F(\theta) \in \mathbb{C}^{2l \times 2n}$ as

$$F(\theta) \triangleq \frac{\partial f(\theta)}{\partial \theta} = \left[ \frac{\partial f(\theta)}{\partial \gamma}, \frac{\partial f(\theta)}{\partial \gamma^*} \right]. \tag{2.5}$$

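It then follows from the properties of the complex derivative [20] that

$$F(\theta) = \frac{1}{2}\, T\, \frac{\partial g(\xi)}{\partial \xi}\, S, \tag{2.6}$$

where $T \in \mathbb{C}^{2l \times 2l}$, $S \in \mathbb{C}^{2n \times 2n}$ are given as

$$T \triangleq \begin{bmatrix} 1 & j \\ 1 & -j \end{bmatrix} \otimes I_{l \times l}, \qquad S \triangleq \begin{bmatrix} 1 & 1 \\ -j & j \end{bmatrix} \otimes I_{n \times n}. \tag{2.7}$$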

The non-minimality of the set of complex constraints does not affect the CRB. Alternatively, a minimal set of complex constraints can be obtained by first formulating $g(\xi)$ and then reparameterizing in terms of $\theta$. However, such a process involves a tedious procedure of separating the real and imaginary parts, when it might be more natural to consider the complex parameters themselves as in the above example of orthogonality of parameter vectors. Let $\mathrm{rank}(F(\theta)) = k < 2n$. Hence there exists a $U \in \mathbb{C}^{2n \times (2n-k)}$ such that the columns of $U$ form an orthonormal basis for the nullspace of $F(\theta)$, i.e. $F(\theta) U = 0$. Let the likelihood of the observed data $p(\omega; \theta)$ be reparameterized as $s(\omega; \xi)$ by substituting $\gamma = \alpha + j\beta$, $\gamma^* = \alpha - j\beta$. Define $\Delta$ as

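$$\Delta \triangleq \frac{\partial \ln p(\omega; \theta)}{\partial \theta} = \left[ \left( \frac{1}{2} \frac{\partial \ln s(\omega; \xi)}{\partial \alpha} - \frac{j}{2} \frac{\partial \ln s(\omega; \xi)}{\partial \beta} \right), \left( \frac{1}{2} \frac{\partial \ln s(\omega; \xi)}{\partial \alpha} + \frac{j}{2} \frac{\partial \ln s(\omega; \xi)}{\partial \beta} \right) \right]^T, \tag{2.8}$$

where the last equality follows from the definition of $p(\omega; \theta)$. Let $J = \mathrm{E}\{\Delta^* \Delta^T\}$ denote the Fisher information matrix (FIM) for the unconstrained estimation of $\theta$. Also assume that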

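A.1: The parameter vector $\xi \in \mathbb{R}^{2n \times 1}$ and the likelihood function $s(\omega; \xi)$ satisfy the regularity conditions as in [17, 22]. We present them below for the sake of completeness.

(i) $\xi \in \Xi$, where $\Xi \subseteq \mathbb{R}^{2n}$.

(ii) $\frac{\partial s(\omega; \xi)}{\partial \xi_i}$, $i \in \{1, 2, \ldots, 2n\}$, exists and is a.s. finite for every $\xi \in \Xi$.

(iii) $\int \left| \frac{\partial^k s(\omega; \xi)}{\partial \xi_i^k} \right| < \infty$, for every $\xi \in \Xi$, and $k = 1, 2$.

(iv) $\mathrm{E}\left| \frac{\partial s(\omega; \xi)}{\partial \xi_i} \right|^2 < \infty$, for every $\xi \in \Xi$.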

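We now present a result for the constrained complex estimator $\hat{\theta}$ analogous to the real case.

Lemma 1. Under assumption A.1 and constraints given by (2.3), the constrained estimator $\hat{\theta} : \Omega \to \mathbb{C}^{2n \times 1}$ defined as

$$\hat{\theta} \triangleq \begin{bmatrix} \hat{\alpha} + j\hat{\beta} \\ \hat{\alpha} - j\hat{\beta} \end{bmatrix} \tag{2.9}$$

satisfies the property

$$\mathrm{E}\left\{ (\hat{\theta} - \theta) \Delta^T \right\} U U^H = U U^H. \tag{2.10}$$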

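Proof. From the results for the constrained real parameter vector in [17, 19] we have

$$\mathrm{E}\left\{ (\hat{\xi} - \xi) \bar{\Delta}^T \right\} \bar{U} \bar{U}^T = \bar{U} \bar{U}^T, \tag{2.11}$$

where $\bar{\Delta} = \left[ \frac{\partial \ln s(\omega; \xi)}{\partial \alpha} \;\; \frac{\partial \ln s(\omega; \xi)}{\partial \beta} \right]^T$, and $\bar{U} \in \mathbb{R}^{2n \times (2n-k)}$ is a basis for the nullspace of $\frac{\partial g(\xi)}{\partial \xi}$. Let $\bar{U} = \left[ U_I^T, U_R^T \right]^T$, $U_I, U_R \in \mathbb{R}^{n \times (2n-k)}$, $\tilde{\alpha} \triangleq \hat{\alpha} - \alpha$ and $\tilde{\beta} \triangleq \hat{\beta} - \beta$. Then rewriting the above expression in terms of block partitioned matrices we have

$$\int_\Omega \begin{bmatrix} \tilde{\alpha} \\ \tilde{\beta} \end{bmatrix} \left[ \frac{\partial \ln s(\omega; \xi)}{\partial \alpha} \;\; \frac{\partial \ln s(\omega; \xi)}{\partial \beta} \right] \begin{bmatrix} U_I \\ U_R \end{bmatrix} \left[ U_I^T \;\; U_R^T \right] d\omega = \begin{bmatrix} U_I \\ U_R \end{bmatrix} \left[ U_I^T \;\; U_R^T \right]. \tag{2.12}$$

Let $U \in \mathbb{C}^{2n \times (2n-k)}$ be defined as

$$U \triangleq \frac{1}{\sqrt{2}} \begin{bmatrix} U_I + j U_R \\ U_I - j U_R \end{bmatrix}.$$

With some manipulation, (2.12) can be written in terms of complex matrices as

$$\int_\Omega \begin{bmatrix} \tilde{\alpha} + j\tilde{\beta} \\ \tilde{\alpha} - j\tilde{\beta} \end{bmatrix} \left[ \frac{1}{2} \frac{\partial \ln s(\omega; \xi)}{\partial \alpha} - \frac{j}{2} \frac{\partial \ln s(\omega; \xi)}{\partial \beta}, \;\; \frac{1}{2} \frac{\partial \ln s(\omega; \xi)}{\partial \alpha} + \frac{j}{2} \frac{\partial \ln s(\omega; \xi)}{\partial \beta} \right] U U^H \, d\omega = U U^H.$$

Using (2.8) and (2.9), the above equation can be expressed in the form given by (2.10). It remains to show that $U$ forms a basis for the nullspace of $F(\theta)$. It follows from the definition of $\bar{U}$ that $\frac{\partial g(\xi)}{\partial \xi} \bar{U} = 0$, and this equality is true if and only if

$$\frac{1}{\sqrt{2}} \frac{\partial g(\xi)}{\partial \xi} \left( \frac{1}{2} S S^H \right) \bar{U} = 0 \tag{2.13}$$

$$\Leftrightarrow \frac{1}{2\sqrt{2}}\, T\, \frac{\partial g(\xi)}{\partial \xi}\, S S^H \bar{U} = 0 \tag{2.14}$$

$$\Leftrightarrow F(\theta) \left( \frac{1}{\sqrt{2}} S^H \bar{U} \right) = 0, \tag{2.15}$$

where the equalities in (2.13), (2.14) follow from the facts that $\frac{1}{2} S S^H = I$ and that $T$ is invertible, respectively. The matrices $S, T$ have been defined in (2.7). It can be seen that $U = \frac{1}{\sqrt{2}} S^H \bar{U}$ and therefore $U \perp F(\theta)$. Moreover, $U^H U = \frac{1}{2} \bar{U}^T S S^H \bar{U} = I_{(2n-k) \times (2n-k)}$. Hence $U$ contains orthonormal columns. Showing that it spans the nullspace of $F(\theta)$ completes the proof. Suppose $U$ does not span the nullspace of $F(\theta)$. Then there exists $u \triangleq \left[ u_a^T, u_b^T \right]^T$, where $u_a, u_b \in \mathbb{C}^{n \times 1}$, such that $F(\theta) u = 0$ and $U^H u = 0$. Hence we have $T \frac{\partial g(\xi)}{\partial \xi} S u = 0 \Rightarrow \frac{\partial g(\xi)}{\partial \xi} S u = 0$, as $T$ is an invertible matrix. Let $\bar{u} \triangleq S u = \left[ u_a^T + u_b^T, \; j u_b^T - j u_a^T \right]^T$. Since $\frac{\partial g(\xi)}{\partial \xi}$ is real we have $\frac{\partial g(\xi)}{\partial \xi} \bar{u}_R = 0$, where $\bar{u}_R$ is the real part of $\bar{u}$. Also, it can be observed that $U^H u = 0 \Rightarrow \bar{U}^T \bar{u} = 0$ and, since $\bar{U}$ is a real matrix, $\bar{U}^T \bar{u}_R = 0$. Thus there exists a real vector, viz. $v \triangleq \bar{u}_R \in \mathbb{R}^{2n \times 1}$, such that $\frac{\partial g(\xi)}{\partial \xi} v = 0$ and $\bar{U}^T v = 0$, contradicting the assumption that $\bar{U}$ is a basis for the nullspace of $\frac{\partial g(\xi)}{\partial \xi}$. This completes the proof.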

Theorem 1. Under assumption A.1 and constraints given by (2.3), the CRB for estimation of the constrained parameter $\theta \in \mathbb{C}^{2n \times 1}$ is then given as

$$\mathrm{E}\left\{ (\hat{\theta} - \theta)(\hat{\theta} - \theta)^H \right\} \geq U \left( U^H J U \right)^{-1} U^H. \tag{2.16}$$

Proof. Let $P_U = U U^H$ be the projection matrix onto the column space of $U$ and let $W \in \mathbb{C}^{2n \times 2n}$ be an arbitrary matrix. Let $\tilde{\theta} \triangleq (\hat{\theta} - \theta)$. As in [17] we now consider $\mathrm{E}\left\{ (\tilde{\theta} - W P_U \Delta^*)(\tilde{\theta} - W P_U \Delta^*)^H \right\}$. Following a procedure similar to that for real vectors provided in [17], the proof of (2.16) then follows by making the obvious modifications for complex matrices (i.e. replacing the transpose operator with the Hermitian, etc.).
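The bound in (2.16) is straightforward to evaluate numerically once $F(\theta)$ and $J$ are available. The following is a minimal sketch, not the author's code: the helper names `null_space_basis` and `cc_crb` are hypothetical, the nullspace basis is obtained from an SVD, and the toy problem (a single complex parameter with the unit-modulus constraint $\gamma^*\gamma - 1 = 0$ and an assumed FIM $J = cI$) is chosen only for illustration.

```python
import numpy as np

def null_space_basis(F, tol=1e-10):
    """Orthonormal basis U for the nullspace of F, so that F @ U = 0."""
    _, s, Vh = np.linalg.svd(F)
    rank = int(np.sum(s > tol))
    # Rows of Vh beyond the rank span the nullspace (after conjugate transpose).
    return Vh[rank:].conj().T

def cc_crb(F, J):
    """Evaluate U (U^H J U)^{-1} U^H as in (2.16)."""
    U = null_space_basis(F)
    return U @ np.linalg.inv(U.conj().T @ J @ U) @ U.conj().T

# Toy example (an assumption, not from the text): theta = [gamma, gamma*]^T with
# h(theta) = gamma* gamma - 1. Both rows of F are [d/d gamma, d/d gamma*] of h.
gamma = np.exp(1j * 0.7)
F = np.array([[np.conj(gamma), gamma],
              [np.conj(gamma), gamma]])
c = 4.0                      # assumed unconstrained Fisher information scale
J = c * np.eye(2)
crb = cc_crb(F, J)
print(crb[0, 0].real)        # bound on the variance of the estimate of gamma
```

In this toy case the constraint removes one of the two real degrees of freedom, and the bound on $\gamma$ is half of the unconstrained value $1/c$.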

2.3 A Constrained Matrix Estimation Example

2.3.1 Problem Formulation

We consider in this section the problem of pilot assisted semi-blind estimation of a complex MIMO (Multi-Input Multi-Output) channel matrix $H \in \mathbb{C}^{t \times t}$ (i.e. # transmit antennas = # receive antennas = $t$). Let a total of $L$ pilot symbols be transmitted. The channel input-output relation is represented as

$$y_k = H x_k + v_k, \quad k = 1, 2, \ldots, L, \tag{2.17}$$

where $y_k, x_k \in \mathbb{C}^{t \times 1}$ are the received and transmitted signal vectors at the $k$-th time instant. $v_k \in \mathbb{C}^{t \times 1}$ is spatio-temporally uncorrelated Gaussian noise such that $\mathrm{E}\{v_k v_k^H\} = \sigma_n^2 I$. $H$ can be factorized using its singular value decomposition (SVD) as $H = P \Sigma Q^H$ where $P, Q \in \mathbb{C}^{t \times t}$ are orthogonal matrices such that $P^H P = Q^H Q = I$, $\Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_t)$, $\sigma_1 \geq \sigma_2 \geq \ldots \geq \sigma_t > 0$. $P, \Sigma$ can be estimated using blind techniques. We then employ the pilot data exclusively to estimate the constrained orthogonal matrix $Q$. More about the significance of such a problem can be found in [23].

2.3.2 Cramer-Rao Bound

Let $\tilde{y}_k = P^H y_k$, $\tilde{v}_k = P^H v_k$. Denote by $q_i$ the $i$-th column of the matrix $Q$. The unconstrained input-output relation for each $q_i$ can be written as

$$\tilde{y}_{k,i} = \sigma_i x_k^H q_i + \tilde{v}_{k,i}, \tag{2.18}$$

where $\tilde{y}_{k,i}$ denotes the $i$-th element of $\tilde{y}_k$ and analogously for $\tilde{v}_{k,i}$. Define the desired parameter vector to be estimated as $\theta \triangleq [\mathrm{vec}(Q), \mathrm{vec}(Q^*)]^T$. It can now be seen that $\theta$ is a constrained parameter vector and the constraints are given as

$$q_i^H q_i = 1, \quad 1 \leq i \leq t \tag{2.19}$$

$$q_i^H q_j = 0, \quad 1 \leq i < j \leq t. \tag{2.20}$$

Hence, the set of $t + \binom{t}{2}$ complex constraints $h(\theta)$ is given as

$$h(\theta) = \begin{bmatrix} q_1^H q_1 - 1 \\ q_1^H q_2 \\ q_1^H q_3 \\ \vdots \\ q_t^H q_t - 1 \end{bmatrix}.$$

The extended constraint set $f(\theta)$ is then given as

$$f(\theta) = \begin{bmatrix} q_1^H q_1 - 1 \\ q_1^H q_2 \\ q_1^H q_3 \\ \vdots \\ q_t^H q_t - 1 \\ q_1^H q_1 - 1 \\ q_2^H q_1 \\ q_3^H q_1 \\ \vdots \\ q_t^H q_t - 1 \end{bmatrix}.$$

The constraint vector $f(\theta)$ can be employed to compute $U$. However, it can be noticed that the repeated constraint $q_i^H q_i - 1$ for $i = 1, 2, \ldots, t$ is trivially redundant. Eliminating this redundancy, the minimal set of $t + 2\binom{t}{2} = t^2$ non-redundant constraints $f(\theta)$ can be obtained as

$$f(\theta) = \begin{bmatrix} q_1^H q_1 - 1 \\ q_1^H q_2 \\ q_2^H q_1 \\ q_1^H q_3 \\ q_3^H q_1 \\ \vdots \\ q_t^H q_t - 1 \end{bmatrix}.$$

The matrix $F(\theta)$ is constructed as given in (2.5), by differentiating $f(\theta)$ with respect to the parameter vector $\theta$. For example, the derivative of constraint #2, i.e. $q_1^H q_2$, is given as

$$\frac{\partial q_1^H q_2}{\partial \theta} = \left[ 0, q_1^H, 0, \ldots, q_2^T, 0, 0, \ldots \right],$$

where we have used the fact that $\frac{\partial q_1^H}{\partial q_1} = \frac{\partial q_2}{\partial q_2^H} = 0$. This result follows from the properties of the complex derivative in [15]. Similarly,

$$\frac{\partial q_1^H q_1}{\partial \theta} = \left[ q_1^H, 0, 0, \ldots, q_1^T, 0, 0, \ldots \right],$$

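and so on. The columns of the matrix $U$ form an orthonormal basis for the nullspace of $F(\theta)$. Hence, for this example, the matrices $F(\theta) \in \mathbb{C}^{t^2 \times 2t^2}$, $U \in \mathbb{C}^{2t^2 \times t^2}$ can be written explicitly and are given as

$$F(\theta) = \begin{bmatrix}
q_1^H & 0 & 0 & \cdots & q_1^T & 0 & 0 & \cdots \\
0 & q_1^H & 0 & \cdots & q_2^T & 0 & 0 & \cdots \\
q_2^H & 0 & 0 & \cdots & 0 & q_1^T & 0 & \cdots \\
0 & q_2^H & 0 & \cdots & 0 & q_2^T & 0 & \cdots \\
q_3^H & 0 & 0 & \cdots & 0 & 0 & q_1^T & \cdots \\
0 & 0 & q_1^H & \cdots & q_3^T & 0 & 0 & \cdots \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots
\end{bmatrix},$$

$$U = \frac{1}{\sqrt{2}} \begin{bmatrix}
q_1 & 0 & q_2 & 0 & q_3 & \cdots \\
0 & q_1 & 0 & q_2 & 0 & \cdots \\
0 & 0 & 0 & 0 & 0 & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots \\
-q_1^* & -q_2^* & 0 & 0 & 0 & \cdots \\
0 & 0 & -q_1^* & -q_2^* & 0 & \cdots \\
0 & 0 & 0 & 0 & -q_1^* & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{bmatrix}.$$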

The simple and insightful structure of the above matrices $F(\theta)$, $U$ in terms of the orthogonal parameter vectors $q_1, q_2, \ldots, q_t$ is particularly appealing and illustrates the efficacy of using the complex CRB. From Eq. (2.18) and using the results for least-squares estimation [15], the Fisher information matrix $J(\theta) \in \mathbb{C}^{2t^2 \times 2t^2}$ for the unconstrained case is given by the block diagonal matrix $J(\theta) = \frac{1}{\sigma_n^2} \left( I_{2 \times 2} \otimes \Sigma^2 \otimes X_p X_p^H \right)$. The complex constrained CRB for the parameter vector $\theta$ is then obtained by substituting these matrices in (2.16).

2.3.3 ML Estimate and Simulation Results

We now compute the Maximum-Likelihood (ML) estimate and compare its performance with that predicted by the CRB. The rotated received symbol vectors $\tilde{y}_k = P^H y_k$ can be stacked as $\tilde{Y}_p \triangleq (\tilde{y}_1, \tilde{y}_2, \ldots, \tilde{y}_L)$. Let $X_p$ be defined analogously by stacking the transmitted symbol vectors. Then $\hat{Q}$, the ML estimate of $Q$, is given as a solution of

$$\hat{Q} = \arg \min \left\| \tilde{Y}_p^H - X_p^H Q \Sigma \right\|^2 \quad \text{subject to } Q Q^H = I,$$

where the norm $\|\cdot\|$ is the matrix Frobenius norm such that $\|A\|^2 = \mathrm{tr}(A A^H)$. From [24] the constrained estimate $\hat{Q}$ employing an orthonormal pilot sequence $X_p$ (i.e. $X_p X_p^H = I$) is given as

$$\hat{Q} = P_p R_p^H \quad \text{where} \quad P_p \Sigma_p R_p^H = \mathrm{SVD}\left( X_p \tilde{Y}_p^H \Sigma \right). \tag{2.21}$$
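The closed form in (2.21) can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions rather than the simulation code behind the figures: the function names are hypothetical, the pilot is a random unitary matrix with $L = t$ (so that $X_p X_p^H = I$), and the noiseless data model $\tilde{Y}_p = \Sigma Q^H X_p$ is the one implied by the cost function above.

```python
import numpy as np

def random_unitary(n, rng):
    """Unitary factor from the QR decomposition of a complex Gaussian matrix."""
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    Qfac, _ = np.linalg.qr(A)
    return Qfac

def ml_rotation_estimate(Xp, Yp_tilde, sigma):
    """Constrained estimate of Q as in (2.21), assuming Xp @ Xp^H = I."""
    M = Xp @ Yp_tilde.conj().T @ np.diag(sigma)
    Pp, _, RpH = np.linalg.svd(M)
    return Pp @ RpH

rng = np.random.default_rng(0)
t = 4
Q = random_unitary(t, rng)                    # true rotation matrix
Xp = random_unitary(t, rng)                   # orthonormal pilot, L = t (assumed)
sigma = np.array([4.0, 3.0, 2.0, 1.0])        # assumed singular values of H
Yp_tilde = np.diag(sigma) @ Q.conj().T @ Xp   # noiseless rotated outputs
Q_hat = ml_rotation_estimate(Xp, Yp_tilde, sigma)
```

Without noise the estimate coincides with $Q$; adding noise to `Yp_tilde` gives an estimate that remains exactly unitary by construction.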

Our simulation set-up consists of a $4 \times 4$ MIMO channel $H$ (i.e. $t = 4$). A single realization of $H$ was generated as a matrix of zero-mean circularly symmetric complex Gaussian random entries such that the variance of the real and imaginary parts was unity. The source symbol vectors $x \in \mathbb{C}^{4 \times 1}$ are assumed to be drawn from a BPSK constellation and the orthonormality condition is achieved by using the Hadamard structure. The transmitted pilot was assumed to be of length $L = 12$ symbols. The error was then averaged for a fixed $H$ over several instantiations ($N_i = 1000$) of the channel noise $v_k$. Figure 2.1 shows the MSE in the first element $Q(1,1)$ (i.e. $|\hat{Q}(1,1) - Q(1,1)|^2$) vs its CRB. Similar results were obtained for the CRB of other elements of $Q$. Figure 2.2 then shows the total MSE in estimation of $Q$ (i.e. $\|\hat{Q} - Q\|^2$) vs the trace of the CRB matrix. The ML estimate $\hat{Q}$ can be seen to achieve a performance close to the CRB and its performance progressively improves with increasing SNR.

[Figure 2.1: Computed MSE vs SNR, $|\hat{Q}(1,1) - Q(1,1)|^2$. Plot title: "MSE Vs CRLB For Estimation of Q(1,1)"; x-axis: SNR (0 to 20); y-axis: MSE (log scale); curves: Computed MSE, CRLB.]

[Figure 2.2: Computed MSE vs SNR, $\|\hat{Q} - Q\|^2$. Plot title: "MSE Vs CRLB For Estimation of Q"; x-axis: SNR (0 to 20); y-axis: MSE (log scale); curves: MSE, CRLB.]

2.4 Conclusion

As illustrated in the example above, the CC-CRB framework provides

an elegant means to characterize the MSE of estimation of constrained matrices,

a problem that frequently arises in the context of semi-blind MIMO estimation.

A complete example of such an estimation procedure is demonstrated in the next

chapter on whitening-rotation (WR) based MIMO channel estimation.

Acknowledgement

The text of this chapter, in part, is a reprint of the material as it appears

in A. K. Jagannatham and B. D. Rao, “Cramer-Rao Lower Bound for Constrained

Complex Parameters”, IEEE Signal Processing Letters, Vol. 11, No. 11, Nov’04,

Pages: 875 - 878 and A. K. Jagannatham and B. D. Rao,“Complex Constrained



Barcelona, Spain, 18-21 July 2004, Pages: 397 - 401.

3 Whitening-Rotation Based

Semi-Blind MIMO Channel

Estimation

3.1 Introduction

As elaborated in chapter(1) semi-blind schemes provide a bandwidth ef-

ficient means to estimate the MIMO wireless channel. In this chapter, we present

one such semi-blind scheme, termed as the whitening-rotation (WR) scheme, for

MIMO channel estimation. We utilize the fact that the MIMO channel matrix $H$ can be decomposed as the product $H = W Q^H$, where $W$ is a whitening matrix and $Q$ is a unitary matrix, i.e. $Q Q^H = Q^H Q = I$. It is well known that $W$ can be computed blind from the second order statistical information in received output data. Training data can then be utilized to estimate only the unitary matrix $Q$. Significant estimation gains can then be achieved by estimating such orthogonal matrices, which are parameterized by far fewer parameters. A

more rigorous justification of this statement is given in subsequent sections. Such

a whitening-rotation factorization based estimation procedure naturally arises in

the independent component analysis (ICA) based framework for source separa-

tion, where it has been noted that when the sources are uncorrelated Gaussian,

the channel matrix can be estimated blind up to a rotation matrix. A more com-

plete discussion of ICA can be found in [25, 26]. A totally blind higher order

statistics algorithm based on such a decomposition is elaborated in [27], for any

source distribution.

Extensive work has been done by Slock et al. in [28, 29] where several

semi-blind techniques have been reported. More relevant literature to our semi-

blind estimation scheme can be found in Pal’s work [30, 31]. However, it does not

consider the problem of a constrained estimator for Q. Our research is novel in the

following aspects. First, we use the theory of complex constrained Cramer-Rao

bound (CC-CRB) reported in [32] to quantify exactly how much improvement in

performance can be achieved over a traditional training based technique. Also,

since Q is a unitary constrained matrix, optimal estimation of Q necessitates the

construction of constrained estimators. Such an estimator can be found in [33,34]

for an orthogonal pilot sequence. We refer to this as the OPML estimator and

examine its properties. Another salient feature of this chapter is the development

of a novel IGML algorithm for the constrained estimation of Q employing any (not

necessarily orthogonal) pilot sequence. We then present the ROML algorithm as

a low complexity alternative to the IGML estimator.

The chapter is organized as follows. The next section describes the prob-

lem setup. An analysis of the constrained CRBs is given in section 3.3 and estima-

tion algorithms are presented in section 3.4. Finally, simulation results are given

in section 3.5 and we conclude with section 3.7.

3.2 Problem Formulation

Consider a flat-fading MIMO channel matrix $H \in \mathbb{C}^{r \times t}$ where $t$ is the number of transmit antennas and $r$ is the number of receive antennas in the system, and each $h_{ij}$ represents the flat-fading channel coefficient between the $i$th receiver and $j$th transmitter. Denoting the complex received data by $y \in \mathbb{C}^{r \times 1}$, the equivalent base-band system can be modelled as

$$y(k) = H x(k) + \eta(k), \tag{3.1}$$

where $k$ represents the time instant, $x \in \mathbb{C}^{t \times 1}$ is the complex transmitted symbol vector and $\eta$ is spatio-temporally white additive Gaussian noise such that $\mathrm{E}\{\eta(k)\eta(l)^H\} = \delta(k,l)\sigma_n^2 I$, where $\delta(k,l) = 1$ if $k = l$ and 0 otherwise. Also, the sources are assumed to be spatially and temporally independent with identical source power $\sigma_s^2$, i.e. $\mathrm{E}\{x(k)x(l)^H\} = \delta(k,l)\sigma_s^2 I$. The signal to noise ratio (SNR) of operation is defined as $\mathrm{SNR} \triangleq \frac{\sigma_s^2}{\sigma_n^2}$. Now assume that the channel has been used for a total of $N$ symbol transmissions. Out of these $N$ transmissions, the initial $L$ symbols are known training symbols and the observed outputs are thus training outputs. Stacking the training symbols as a matrix we have $X_p = [x(1), x(2), \ldots, x(L)]$ where $X_p \in \mathbb{C}^{t \times L}$. $Y_p \in \mathbb{C}^{r \times L}$ is given by similarly stacking the received training outputs. The remaining $N - L$ information symbols transmitted are termed 'blind symbols' and their corresponding outputs 'blind outputs'. $X_b \in \mathbb{C}^{t \times (N-L)}$, $Y_b \in \mathbb{C}^{r \times (N-L)}$ can be defined analogously for the blind symbols. $[X_p, Y_p]$, $Y_b$ is the complete available data.

Consider two possible estimation strategies. $H$ can be estimated exclusively using the pilot $X_p$ as

$$\hat{H}_{TS} = Y_p X_p^{\dagger}, \tag{3.2}$$

where $X_p^{\dagger}$ denotes the Moore-Penrose pseudo-inverse of $X_p$. This qualifies as training based estimation and is simple to implement. However, it results in poor usage

of available bandwidth since the pilot itself conveys no source information. Alter-

natively, H may be estimated from blind data without the aid of any pilot. Thus,

in effect this reduces to the case L = 0 and only blind data Yb is available. This

is very efficient in usage of bandwidth since it totally eliminates the need for a

pilot. However, most second order statistics based blind techniques are limited

to estimating the channel matrix up to a scaling and permutation indeterminacy

as detailed in [26],[8]. Blind methods that employ higher order statistics typi-

cally require a large number of data symbols. Moreover, such techniques are often

computationally complex and result in ill-convergence. Based on the above obser-

vations, one is motivated to find a technique which performs reasonably well in

terms of bandwidth efficiency and computational complexity. Moreover, pilot sym-

bols are usually feasible in communication scenarios. Hence, the focus of our work

has been to develop a semi-blind estimation procedure which uses a small number

of pilot symbols along with blind data. Such a procedure serves the dual pur-

pose of reducing the required pilot overhead at the same time achieving a greater

estimation accuracy for a given number of pilot symbols.
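For reference, the training based estimate in (3.2) can be sketched as follows. This is a minimal illustration, not the author's implementation: the function name `training_ls_estimate` is hypothetical, and the dimensions and the complex Gaussian pilot are arbitrary choices that make $X_p$ full row rank with probability one.

```python
import numpy as np

def training_ls_estimate(Yp, Xp):
    """Training based estimate H_TS = Yp @ pinv(Xp), as in (3.2)."""
    return Yp @ np.linalg.pinv(Xp)

rng = np.random.default_rng(1)
r, t, L = 4, 4, 12                       # assumed dimensions
H = rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))
Xp = rng.standard_normal((t, L)) + 1j * rng.standard_normal((t, L))

# Noiseless outputs: the estimate recovers H exactly when Xp has full row rank.
H_ts_clean = training_ls_estimate(H @ Xp, Xp)

# Noisy outputs with an assumed noise variance sigma_n^2 = 0.01.
noise = np.sqrt(0.01 / 2) * (rng.standard_normal((r, L))
                             + 1j * rng.standard_normal((r, L)))
H_ts_noisy = training_ls_estimate(H @ Xp + noise, Xp)
print(np.linalg.norm(H_ts_noisy - H) ** 2)   # squared Frobenius error
```

This estimator treats all $2rt$ real parameters of $H$ as unknown, which is the baseline the semi-blind scheme is compared against.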

Consider a MIMO channel $H \in \mathbb{C}^{r \times t}$ which has at least as many receive antennas as transmit antennas, i.e. $r \geq t$. Then, the channel matrix $H$ can be decomposed as $H = W Q^H$ where $W \in \mathbb{C}^{r \times t}$ and $Q \in \mathbb{C}^{t \times t}$ is unitary, i.e. $Q^H Q = Q Q^H = I$. The matrix $W$ is popularly termed the whitening matrix.¹ $Q$ induces a rotation on the space $\mathbb{C}^{t \times 1}$ and is therefore known as the rotation matrix. For instance, consider the singular value decomposition (SVD) of $H$ given as $H = P \Sigma V^H$. A possible choice for $W, Q$, which is employed in subsequent portions of this chapter, is given by

$$W = P \Sigma, \quad \text{and} \quad Q = V. \tag{3.3}$$

It is then evident that all such matrices $W$ satisfy the property $W W^H = H H^H$, and it is well known that $W$ can be determined from the blind data $Y_b$. $Q$ can then be determined exclusively from the pilot data $X_p$. This semi-blind estimation procedure is termed a Whitening-Rotation (WR) scheme. Such a technique potentially improves estimation accuracy because the matrix $Q$, by virtue of its unitary constraint, is parameterized by fewer parameters and hence can be determined with greater accuracy from the limited pilot data $X_p$. The precise improvement in quantitative terms is presented in the next section.
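The factorization in (3.3) and the property $W W^H = H H^H$ can be checked numerically. The snippet below is a small sketch in which the helper name `whitening_rotation` and the $6 \times 4$ Gaussian channel are assumptions made for illustration.

```python
import numpy as np

def whitening_rotation(H):
    """Return (W, Q) with W = P Sigma and Q = V from the SVD H = P Sigma V^H."""
    P, s, Vh = np.linalg.svd(H, full_matrices=False)
    W = P * s                 # scales column i of P by s[i], i.e. P @ diag(s)
    Q = Vh.conj().T
    return W, Q

rng = np.random.default_rng(2)
r, t = 6, 4                   # assumed dimensions with r >= t
H = rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))
W, Q = whitening_rotation(H)

print(np.allclose(W @ Q.conj().T, H))               # H = W Q^H
print(np.allclose(W @ W.conj().T, H @ H.conj().T))  # W W^H = H H^H
print(np.allclose(Q.conj().T @ Q, np.eye(t)))       # Q is unitary
```

Only $H H^H$ is needed to identify $W$ up to a unitary factor, which is why second order statistics of the received data suffice for the whitening step.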

To avoid repetition, we present here a list of assumptions which may be

potentially employed in our work. The exact subset of assumptions used will be

stated specifically in the result.

A.1 $W \in \mathbb{C}^{r \times t}$ is perfectly known at the output.

A.2 $X_p \in \mathbb{C}^{t \times L}$ is orthogonal, i.e. $X_p X_p^H = \sigma_s^2 L\, I_{t \times t}$.

A.1 is reasonable if we assume the transmission of a long data stream from which $W$ can be estimated with considerable accuracy, and A.2 can be easily achieved for signal constellations such as BPSK, QPSK, etc. by using an integer orthogonal structure such as the Hadamard matrix.

¹ If $a \in \mathbb{C}^{t \times 1}$ is a random vector such that $\mathrm{E}\{a a^H\} = I$ and $b \in \mathbb{C}^{r \times 1}$ is obtained by transforming $a$ as $b = Ha$, then $W$ can be employed to decorrelate or whiten $b$ as $c = W^{\dagger} b$, i.e. $\mathrm{E}\{c c^H\} = I$.

3.3 Estimation accuracy for semi-blind approaches

We now present a general result to quantify the improvement in estima-

tion accuracy of semi-blind schemes over training based channel estimators. The

Cramer-Rao bound (CRB) is frequently used as a framework to study the estima-

tion efficiency. However, semi-blind approaches involve estimation of constrained

complex parameter vectors. Therefore in our analysis, we use the CC-CRB frame-

work developed in [32], inspired by the result in [17], which provides an ideal setting

to study the performance of such schemes. However, from the CRB matrices which

describe a lower bound on the estimation covariance it is harder to interpret the

achievable estimation accuracy in quantitative terms. This necessitates the development of a positive scalar measure to evaluate and contrast the performance of

different estimators. Frequently, the trace of the covariance or the MSE in estima-

tion is used to quantify the performance of an estimator. We next present a result

which justifies the use of such a positive scalar measure.

Lemma 2. Let $A, B \in \mathbb{C}^{n \times n}$ be positive definite matrices and let $A \geq B$, i.e. $u^H A u \geq u^H B u$, $\forall u \in \mathbb{C}^{n \times 1}$. Then $\mathrm{tr}(A) = \mathrm{tr}(B) \Leftrightarrow A = B$.

Proof. It is easy to see that $A = B \Rightarrow \mathrm{tr}(A) = \mathrm{tr}(B)$. To prove the converse, observe that $A \geq B \Rightarrow G = A - B \geq 0$ and hence $G$ is Positive Semi-Definite (PSD). Further, $\mathrm{tr}(A) = \mathrm{tr}(B) \Rightarrow \mathrm{tr}(G) = 0 \Rightarrow \sum_{i=1}^{n} \lambda_i = 0$, where $\lambda_i$ are the eigenvalues of $G$. However $G$ is PSD and hence $\lambda_i \geq 0$, $\forall i$. Therefore $\lambda_i = 0$, $\forall i \Rightarrow G = 0 \Rightarrow A = B$.

Setting $A, B$ to be the error covariance and the covariance lower bound (obtained from the CRB analysis) respectively, it is easy to see that if the trace of the covariance approaches the trace of the bound, then the covariance itself approaches the bound. Thus, given the estimation error matrix $E \triangleq \hat{H} - H$, it is reasonable to consider the mean of the squared Frobenius norm of $E$, given by $\mathrm{E}\{\|E\|_F^2\} = \mathrm{E}\{\mathrm{tr}(E E^H)\}$, as a performance measure. We now present a central result which relates the MSE of estimation to the number of unconstrained parameters in $H$.

Lemma 3. Under A.2, the minimum estimation error in $H$ is directly proportional to $N_\theta$, the number of unconstrained real parameters required to describe $H$, and in fact

$$\mathrm{E}\left\{ \left\| \hat{H} - H \right\|_F^2 \right\} \geq \frac{\sigma_n^2}{2 \sigma_s^2 L} N_\theta. \tag{3.4}$$

Proof. $H$ is an $r \times t$ dimensional matrix and therefore has $2rt$ real parameters. Let the parameter vector $\gamma$ be defined as $\gamma \triangleq \left[ \mathrm{vec}(H^T)^T, \mathrm{vec}(H^H)^T \right]$, where $\mathrm{vec}(H)$ denotes a stacking of the columns of $H$ as $\mathrm{vec}(H) = \left[ h_1^T, h_2^T, \ldots, h_t^T \right]^T$ and $h_i$ denotes the $i$th column of $H$ for $1 \leq i \leq t$. Since we are concerned with a constrained parameter estimation problem, we wish to employ the CC-CRB (Complex Constrained CRB). For this purpose we will need to redefine the following notation. Let the extended set of constraints on $\gamma$ be given as $f(\gamma) = 0$ such that $f(\gamma) \in \mathcal{F}^{k \times 1}$, where $\mathcal{F}$ is the space of functions $f$ such that $f : \mathbb{C}^{2rt} \to \mathbb{R}$. Let $F(\theta) \in \mathbb{C}^{k \times 2rt}$ be defined as $F(\theta) \triangleq \frac{\partial f(\gamma)}{\partial \gamma}$. Thus, there exists a matrix $U$ such that the columns of $U$ form an orthonormal basis of the nullspace of $F(\theta)$.

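Since the number of un-constrained parameters in $H$ is $N_\theta$, the number of constraints on the system is given as $2rt - N_\theta$. This can be seen as follows. Let the elements of $H$ be stacked as $\delta \triangleq \left[ \mathrm{vec}(\mathrm{Re}(H))^T, \mathrm{vec}(\mathrm{Im}(H))^T \right]^T \in \mathbb{R}^{2rt \times 1}$. Define $\zeta \triangleq [\zeta_1, \zeta_2, \ldots, \zeta_{N_\theta}]^T$ as the vector of the unconstrained parameters $\zeta_i$, $1 \leq i \leq N_\theta$. Let the parametric representation of the elements of $\delta$ be given as $\delta_j \triangleq \chi_j(\zeta)$, $1 \leq j \leq 2rt$, with $\chi_j : \mathbb{R}^{N_\theta \times 1} \to \mathbb{R}$. Let $\tilde{\delta} \triangleq [\delta_1, \delta_2, \ldots, \delta_{N_\theta}]^T$. Define the vector function $\tilde{\chi}$ as $\tilde{\chi} \triangleq [\chi_1, \chi_2, \ldots, \chi_{N_\theta}]^T$. Therefore, $\tilde{\chi} : \mathbb{R}^{N_\theta \times 1} \to \mathbb{R}^{N_\theta \times 1}$ with $\tilde{\chi}(\zeta) = \tilde{\delta}$. Now, by the inverse function theorem [35], under mild conditions² on $\tilde{\chi}$, there exists an inverse function $\bar{\chi} : \mathbb{R}^{N_\theta \times 1} \to \mathbb{R}^{N_\theta \times 1}$ such that $\bar{\chi}(\tilde{\delta}) = \zeta$. The $2rt - N_\theta$ constraints on the parameter vector $\delta$, and in turn on the elements of $H$, are then obtained by the constraint equations

$$\chi_j\left( \bar{\chi}(\tilde{\delta}) \right) - \delta_j = 0, \quad N_\theta + 1 \leq j \leq 2rt. \tag{3.5}$$

Therefore, $\mathrm{rank}(F(\theta)) = 2rt - N_\theta$, the number of non-redundant constraints. It follows that $U \in \mathbb{C}^{2rt \times N_\theta}$. From [32], the CC-CRB for the estimation of $\gamma$ is given as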

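$$\mathrm{E}\left\{ (\hat{\gamma} - \gamma)(\hat{\gamma} - \gamma)^H \right\} \geq U \left( U^H J U \right)^{-1} U^H, \tag{3.6}$$

where $J$ is the unconstrained complex Fisher information matrix (FIM). $J$ for the above scenario is then given as $J = \frac{\sigma_s^2 L}{\sigma_n^2} I_{2rt \times 2rt}$ [15]. Substituting this expression for $J$ in (3.6) and considering the trace of the resulting matrices on both sides as justified by Lemma 2, we have

$$\mathrm{tr}\left( \mathrm{E}\left\{ (\hat{\gamma} - \gamma)(\hat{\gamma} - \gamma)^H \right\} \right) \geq \mathrm{tr}\left( \left( U^H J U \right)^{-1} \right), \tag{3.7}$$

$$\mathrm{E}\left\{ 2 \left\| \hat{H} - H \right\|_F^2 \right\} \geq \mathrm{tr}\left( \frac{\sigma_n^2}{\sigma_s^2 L} I_{N_\theta \times N_\theta} \right) = \frac{\sigma_n^2}{\sigma_s^2 L} N_\theta,$$

$$\mathrm{E}\left\{ \left\| \hat{H} - H \right\|_F^2 \right\} \geq \frac{\sigma_n^2}{2 \sigma_s^2 L} N_\theta. \tag{3.8}$$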

Thus, the above result validates the claim that the estimation of a matrix

with fewer un-constrained parameters i.e. a constrained matrix, can result in a

significant improvement in estimation accuracy. We next examine the significance

of the result in lemma 3 as applied to the WR based semi-blind algorithm.

² Existence of the inverse function requires that the derivative of $\chi$ be continuous and that the linear operator $\chi'$ be invertible. A rigorous formulation can be found in [35].

3.3.1 Estimation Accuracy of the WR scheme

The following result, which compares the lower bounds of estimation er-

rors of the training based and WR schemes, gives critical insight into the estimation

accuracy of the proposed semi-blind scheme.

Lemma 4. Under assumptions A.1 and A.2, the potential gain of the semi-blind algorithm (in dB) in terms of MSE of estimation is $10 \log_{10}\left( \frac{2r}{t} \right)$.

Proof. Under A.1, since $W$ is perfectly known, it suffices to estimate the unitary matrix $Q$ to estimate the channel matrix as $\hat{H} = W \hat{Q}^H$. From [36], the number of real parameters required to parameterize $Q$, which under A.1 equals the number of un-constrained parameters in $H$, is given as $N_Q = t^2$. However, the general matrix $H$ has $N_H = 2rt$ un-constrained real parameters. Hence, from the result in Lemma 3, the estimation gain in dB of the semi-blind scheme, which estimates the constrained unitary matrix rather than the complex matrix $H$, is given by

$$G = 10 \log_{10}\left( \frac{N_H}{N_Q} \right) \text{dB} = 10 \log_{10}\left( \frac{2r}{t} \right) \text{dB}, \tag{3.9}$$

which completes the proof.

Two advantages of the WR scheme can be seen from the above result.

1. In the case when the number of receivers equals the number of transmitters

i.e. r = t, the algorithm can potentially perform 3dB more efficiently than

estimating H directly.

2. The estimation gain progressively increases as r, the number of receive an-

tennas, increases. This can be expected since as r increases, the complexity

of estimating H (size r × t) increases while that of Q (size t × t) remains

constant.

Thus for a size 8× 4 complex channel matrix H, i.e. H ∈ C8×4 the estimation gain

of the semi-blind technique is 6 dB which represents a significant improvement

over the conventional technique described in (3.2).
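The two numerical cases quoted above follow directly from (3.9). A short sketch (the function name `wr_gain_db` is hypothetical) that evaluates the gain from the parameter counts $N_H = 2rt$ and $N_Q = t^2$:

```python
import math

def wr_gain_db(r, t):
    """Potential MSE gain (in dB) of the WR scheme over training based
    estimation, per (3.9): 10 log10(N_H / N_Q) with N_H = 2rt, N_Q = t^2."""
    if r < t:
        raise ValueError("the WR decomposition assumes r >= t")
    n_h = 2 * r * t      # unconstrained real parameters in H
    n_q = t * t          # real parameters needed to describe unitary Q
    return 10.0 * math.log10(n_h / n_q)

gain_square = wr_gain_db(4, 4)   # r = t: about 3 dB
gain_8x4 = wr_gain_db(8, 4)      # 8 x 4 channel: about 6 dB
print(round(gain_square, 2), round(gain_8x4, 2))
```

Each doubling of the ratio $r/t$ adds roughly another 3 dB of potential gain.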

3.3.2 Constrained CRB of the WR scheme

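An exact expression is now derived for the variance bound in each element of $H$. To begin with, we assume that only A.1 holds. Let the channel matrix be factorized using its singular value decomposition (SVD) as $H = P \Sigma Q^H$ where $P \in \mathbb{C}^{r \times t}$, $Q \in \mathbb{C}^{t \times t}$ are orthogonal matrices such that $P^H P = I_{t \times t}$, $Q^H Q = I_{t \times t}$, $\Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_t)$, $\sigma_1 \geq \sigma_2 \geq \ldots \geq \sigma_t > 0$. As seen earlier in (3.3), $W$ can be given as $W = P \Sigma$. Let $q_i$ for $1 \leq i \leq t$ be the columns of $Q$. Define the desired parameter vector to be estimated as $\rho \triangleq \left[ \mathrm{vec}(Q)^T, \mathrm{vec}(Q^*)^T \right]^T = \left[ q_1^T, q_2^T, \ldots, q_t^T, q_1^H, q_2^H, \ldots, q_t^H \right]^T$. It can then be seen that $\rho$ is a constrained parameter vector and the constraints are given as

$$q_i^H q_i = 1, \quad 1 \leq i \leq t \tag{3.10}$$

$$q_i^H q_j = 0, \quad 1 \leq i < j \leq t. \tag{3.11}$$

Let $U_f \in \mathbb{C}^{2t^2 \times t^2}$ be defined as

$$U_f = \begin{bmatrix} U_1 \\ U_2 \end{bmatrix} \triangleq \frac{1}{\sqrt{2}} \begin{bmatrix}
q_1 & 0 & q_2 & 0 & q_3 & \cdots \\
0 & q_1 & 0 & q_2 & 0 & \cdots \\
0 & 0 & 0 & 0 & 0 & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots \\
-q_1^* & -q_2^* & 0 & 0 & 0 & \cdots \\
0 & 0 & -q_1^* & -q_2^* & 0 & \cdots \\
0 & 0 & 0 & 0 & -q_1^* & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{bmatrix}. \tag{3.12}$$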

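From [32], $C_Q$, the CC-CRB for the estimation error of $\rho$, can be obtained as

$$C_Q = U_f \left( U_f^H J U_f \right)^{-1} U_f^H, \tag{3.13}$$

and the Fisher information matrix $J \in \mathbb{C}^{2t^2 \times 2t^2}$ for the unconstrained case is given by the block diagonal matrix $J = \frac{1}{\sigma_n^2} \left( I_{2 \times 2} \otimes \Sigma^2 \otimes X_p X_p^H \right)$. Block partitioning $C_Q$ as

$$C_Q = \begin{bmatrix} C_{Q11} & C_{Q12} \\ C_{Q21} & C_{Q22} \end{bmatrix}, \tag{3.14}$$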

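the CRB for the estimation of $\omega = \mathrm{vec}(Q)$ is given by $C_{Q11}$. Let $\theta = \mathrm{vec}(H^T)$ and $\Gamma = W \otimes I_{t \times t}$. We then have $\theta = \Gamma \omega$. Hence, from the property of the CRB under transforms [15], the error covariance of estimation of the channel matrix $H$ is then given as

$$\mathrm{E}\left\{ (\hat{\theta} - \theta)(\hat{\theta} - \theta)^H \right\} \geq \Gamma C_{Q11} \Gamma^H. \tag{3.15}$$

Eq. (3.15) gives the bound for a general pilot $X_p$. Additionally, if A.2 holds, then from [15] it follows that $J = \frac{\sigma_s^2 L}{\sigma_n^2} \left( I_{2 \times 2} \otimes \Sigma^2 \otimes I \right)$ and is therefore diagonal. Further, it can be verified that $U_f^H J U_f$ is also diagonal and is given as $U_f^H J U_f = \frac{\sigma_s^2 L}{\sigma_n^2} \frac{1}{2} \tilde{\Sigma}$, where $\tilde{\Sigma} \in \mathbb{C}^{t^2 \times t^2}$ is given as $\tilde{\Sigma} = \mathrm{diag}\left( [2\sigma_1^2, \sigma_1^2 + \sigma_2^2, \sigma_2^2 + \sigma_1^2, 2\sigma_2^2, \sigma_1^2 + \sigma_3^2, \ldots] \right)$. Hence, $C_{Q11} = U_1 \frac{\sigma_n^2}{\sigma_s^2 L} \left( \frac{1}{2} \tilde{\Sigma} \right)^{-1} U_1^H$. Substituting these quantities in Eq. (3.15), the CRB for the estimation of $\theta$ is obtained as

$$\mathrm{E}\left\{ (\hat{\theta} - \theta)(\hat{\theta} - \theta)^H \right\} \geq (P \Sigma \otimes I_{t \times t})\, U_1 \frac{\sigma_n^2}{\sigma_s^2 L} \left( \frac{1}{2} \tilde{\Sigma} \right)^{-1} U_1^H\, (P \Sigma \otimes I_{t \times t})^H = C_H. \tag{3.16}$$

The variance of the $(k, l)$ element of $H$ is bounded by the diagonal entry $C_H((k-1)t + l, (k-1)t + l)$:

$$\mathrm{E}\left\{ \left| \hat{H}(k,l) - H(k,l) \right|^2 \right\} \geq C_H((k-1)t + l, (k-1)t + l) = \frac{\sigma_n^2}{\sigma_s^2 L} \sum_{i=1}^{t} \sum_{j=1}^{t} \frac{\sigma_i^2}{\sigma_j^2 + \sigma_i^2} |P_{k,i}|^2 |Q_{l,j}|^2, \tag{3.17}$$

where $P_{k,i}$, $Q_{l,j}$ represent the $(k, i)$ element of $P$ and the $(l, j)$ element of $Q$ respectively. Thus Eq. (3.17) gives the variance bound for the estimation of each element of $H$. The weighting factor $\frac{\sigma_i^2}{\sigma_j^2 + \sigma_i^2}$ in each term of the above summation results in the net reduction of estimation error over the training based scheme as given in Lemma 4.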
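As a consistency check, summing the element-wise bound (3.17) over all $(k, l)$ should reproduce the total bound of Lemma 3 with $N_\theta = t^2$, since the weights $\frac{\sigma_i^2}{\sigma_j^2 + \sigma_i^2}$ sum to $t^2/2$. The sketch below illustrates this under assumptions: the function name `wr_element_crb` is hypothetical, `snr_L` stands for $\sigma_s^2 L / \sigma_n^2$, and the channel is an arbitrary $6 \times 4$ Gaussian draw.

```python
import numpy as np

def wr_element_crb(P, Q, sigma, snr_L):
    """Element-wise bound of (3.17); snr_L denotes sigma_s^2 * L / sigma_n^2.

    bound[k, l] = (1/snr_L) * sum_i sum_j w[i, j] |P[k, i]|^2 |Q[l, j]|^2,
    with w[i, j] = sigma_i^2 / (sigma_j^2 + sigma_i^2).
    """
    s2 = sigma ** 2
    weights = s2[:, None] / (s2[None, :] + s2[:, None])
    return (np.abs(P) ** 2) @ weights @ (np.abs(Q) ** 2).T / snr_L

rng = np.random.default_rng(3)
r, t = 6, 4                                  # assumed dimensions
H = rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))
P, sigma, Vh = np.linalg.svd(H, full_matrices=False)
Q = Vh.conj().T

snr_L = 100.0                                # assumed value of sigma_s^2 L / sigma_n^2
bound = wr_element_crb(P, Q, sigma, snr_L)   # r x t array of per-element bounds
total = bound.sum()
print(total, t ** 2 / (2 * snr_L))           # the two values should agree
```

The agreement holds for any singular values, because the pair of weights for $(i, j)$ and $(j, i)$ always sums to one.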

3.4 Algorithms

3.4.1 Orthogonal Pilot ML (OPML) estimator

Under A.1 and A.2, $\hat{Q}$, the constrained OPML estimator of $Q$ such that $\hat{Q} : \mathbb{C}^{r \times L} \to S$, where $S$ is the manifold of $t \times t$ unitary matrices, is then obtained by minimizing the ML cost

$$\left\| Y_p - W Q^H X_p \right\|^2 \quad \text{such that } Q Q^H = I. \tag{3.18}$$

It is shown in [24, 37] that $\hat{Q}$ under the above conditions is given by

$$\hat{Q} = V_Q U_Q^H \quad \text{where} \quad U_Q \Sigma_Q V_Q^H = \mathrm{SVD}\left( W^H Y_p X_p^H \right). \tag{3.19}$$

The above equation thus yields a closed form expression for the computation of $\hat{Q}$, the ML estimate of $Q$. The channel matrix $H$ is then estimated as $\hat{H} = W \hat{Q}^H$. We next present properties of the above estimator.
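The closed form (3.19) amounts to one SVD of a $t \times t$ matrix. The following sketch shows the computation under stated assumptions: the function name `opml_estimate` is hypothetical, $W$ is taken as perfectly known (A.1), and the orthogonal pilot is built from rows of a random unitary matrix scaled so that $X_p X_p^H = L\,I$ (i.e. $\sigma_s^2 = 1$), rather than from a Hadamard matrix.

```python
import numpy as np

def opml_estimate(W, Yp, Xp):
    """OPML estimate per (3.19): Q_hat = V_Q U_Q^H, H_hat = W Q_hat^H."""
    UQ, _, VQh = np.linalg.svd(W.conj().T @ Yp @ Xp.conj().T)
    Q_hat = VQh.conj().T @ UQ.conj().T
    H_hat = W @ Q_hat.conj().T
    return Q_hat, H_hat

rng = np.random.default_rng(4)
r, t, L = 6, 4, 8                            # assumed dimensions
H = rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))
P, s, Vh = np.linalg.svd(H, full_matrices=False)
W, Q = P * s, Vh.conj().T                    # W = P Sigma, Q = V as in (3.3)

# Orthogonal pilot: first t rows of an L x L unitary matrix, scaled by sqrt(L).
A = rng.standard_normal((L, L)) + 1j * rng.standard_normal((L, L))
Xp = np.sqrt(L) * np.linalg.qr(A)[0][:t, :]

Yp = H @ Xp                                  # noiseless training outputs
Q_hat, H_hat = opml_estimate(W, Yp, Xp)
print(np.allclose(Q_hat, Q), np.allclose(H_hat, H))
```

With noise added to `Yp` the estimate no longer equals $Q$, but it always lies on the unitary manifold, which is the property the unconstrained estimate (3.2) lacks.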

Properties of the OPML Estimator:

In this section we discuss properties of the OPML estimator. We show

that the estimator is biased and hence does not achieve the CRB for finite sample

length. However, from the properties of ML estimators, it achieves the CRB

asymptotically as the sample length increases. Further, it is also shown that the

bound is achieved for all sample lengths at high SNR .

P.1 There does not exist a finite length constrained unbiased estimator of the rotation matrix $Q$ and hence $\hat{Q}$, the OPML estimator of $Q$, is biased.

Proof. Let there exist $\hat{Q}$ such that $\hat{Q} : \mathbb{C}^{r \times L} \to S$ is a constrained unbiased estimator of $Q$. $\mathbb{C}^{r \times L}$ is the observation space ($Y_p$) and $S$ is the manifold of orthogonal matrices. Then $\hat{Q} = Q + E$ where $E$ is such that $\mathrm{E}\{E\} = 0$. Now since $\hat{Q}$ is a constrained estimator we have $\hat{Q}^H \hat{Q} = I$ and therefore

$$(Q + E)^H (Q + E) = I,$$

which when simplified using the fact that $Q^H Q = I$ yields

$$Q^H E + E^H Q + E^H E = 0.$$

Rearranging terms in the above expression and taking the trace and the expectation of quantities on both sides (where the expectation is with respect to the distribution of $E$ conditioned on $Q$) yields

$$\mathrm{tr}\left( Q^H \mathrm{E}\{E\} + \mathrm{E}\{E^H\} Q \right) = -\mathrm{E}\left\{ \|E\|^2 \right\}. \tag{3.20}$$

It can immediately be observed that the right hand side is strictly less than 0 while the left hand side is equal to zero (by virtue of $\mathrm{E}\{E\} = 0$), and hence the contradiction.

The above result then implies that the CRB cannot be achieved in a general

scenario as there does not exist an unbiased estimator which is necessary

for the achievement of the CRB. However, the properties presented next

guarantee the asymptotic achievability of the CRB both in sample length

and SNR.

P.2 The OPML estimator achieves the CRB given in (3.18) as the pilot sequence

length L → ∞.

Proof. Follows from the asymptotic property of ML estimators, reviewed in

[15].

P.3 The OPML estimator of $Q$ achieves the CRB given in (3.18) at high SNR, i.e. as $\frac{\sigma_s^2}{\sigma_n^2} \to \infty$.

Proof. The above result can be proved using the theory of matrix eigenspace

perturbation analysis detailed in [10]. The detailed proof can be found in

the appendix.

3.4.2 Iterative ML procedure for general pilot - IGML

The ML estimate of Q for an orthogonal pilot Xp is given by (3.19).

In this section we present the IGML algorithm to compute the estimate for any given pilot sequence $X_p$, i.e. when A.2 does not necessarily hold. As is shown later, the proposed IGML scheme reduces to the OPML under A.2. The ML cost function to be minimized is given as in (3.18). Let A.1 hold true and $\tilde{Y}_p \triangleq P^H Y_p$. With constraints given by (3.10) and (3.11), the Lagrange cost $f(Q, \lambda, \mu)$ to be minimized can then be formulated as

$$f(Q, \lambda, \mu) = \sum_{i=1}^{t} \left\| \tilde{Y}_p(i) - \sigma_i q_i^H X_p \right\|^2 + \sum_{i=1}^{t} \mathrm{Re}\left\{\lambda_i \left(q_i^H q_i - 1\right)\right\} + \sum_{i=1}^{t} \sum_{j=i+1}^{t} \mathrm{Re}\left\{\mu_{ij}\, q_i^H q_j\right\},$$

where $\lambda_i \in \mathbb{R}$, $\mu_{ij} \in \mathbb{C}$ are the Lagrange multipliers, $\tilde{Y}_p(i) \in \mathbb{C}^{1 \times L}$ is the $i$-th row (output at the $i$-th receiver) and $q_i$ is the $i$-th column of $Q$ for $1 \le i, j \le t$. Define the matrix of Lagrange multipliers $S \in \mathbb{C}^{t \times t}$ as $S_{ii} \triangleq \lambda_i$, $S_{ij} \triangleq \mu_{ij}$ if $i > j$ and $S_{ij} \triangleq \mu_{ji}^*$ if $i < j$. Observe that $S$ is a Hermitian symmetric matrix, i.e. $S = S^H$.

The above cost function can now be differentiated with respect to $\mathrm{Re}\, q_i$, $\mathrm{Im}\, q_i$ for $1 \le i \le t$. These quantities can then be equated to 0 for extrema and after some manipulation, the resulting equations can be represented in terms of complex matrices as

$$X_p \tilde{Y}_p^H \Sigma - X_p X_p^H Q \Sigma^2 = Q S. \tag{3.21}$$

where $Q$ is unitary. We avoid repeated mention of this constraint in the following analysis and it is implicitly assumed to hold. Let $A \triangleq X_p \tilde{Y}_p^H \Sigma = X_p Y_p^H W$. Pre-multiplying (3.21) by $Q^H$ then gives

$$Q^H A - Q^H X_p X_p^H Q \Sigma^2 = S.$$

As noted, $S = S^H$ and therefore the Lagrange multiplier matrix $S$ can be eliminated as

$$Q^H A - A^H Q = Q^H X_p X_p^H Q \Sigma^2 - \Sigma^2 Q^H X_p X_p^H Q. \tag{3.22}$$

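Adding and subtracting $L\sigma_s^2 \Sigma^2$ in (3.22) and rearranging terms yields,

$$Q^H \left(A + \left(L\sigma_s^2 I_{t \times t} - X_p X_p^H\right) Q \Sigma^2\right) = \left(A^H + \Sigma^2 Q^H \left(L\sigma_s^2 I_{t \times t} - X_p X_p^H\right)\right) Q.$$

Let $T \triangleq A + \left(L\sigma_s^2 I_{t \times t} - X_p X_p^H\right) Q \Sigma^2$. Thus from the above equation, $Q^H T$ is Hermitian symmetric, or in other words $Q^H T = T^H Q$. Also, if $U_T \Lambda_T V_T^H = \mathrm{SVD}(T)$ then $Q^H U_T \Lambda_T V_T^H = \mathrm{SVD}\left(Q^H T\right)$. We have then from the symmetry of $Q^H T$,

$$Q^H U_T = V_T \;\Rightarrow\; Q = U_T V_T^H. \tag{3.23}$$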

Expression (3.23) gives the critical step in the IGML algorithm which is succinctly presented below. Some of the definitions above are repeated for the sake of completeness.

IGML Algorithm: Let A.1 hold, i.e. $\hat{W} = W = P\Sigma$. $X_p$ is the transmitted pilot symbol sequence and not necessarily orthogonal. We then compute the constrained ML estimate of $Q$ as follows.

S.1 Compute $A = X_p Y_p^H W$, where $Y_p$ is the received output data.

S.2 Let $\hat{Q}_0$ denote the initial estimate of the unitary matrix $Q$. Compute $\hat{Q}_0$ by employing $X_p$, $W$ and $Y_p$ in (3.19).

S.3 Repeat for $N$ iterations. At the $k$th iteration, i.e. $1 \le k \le N$,

S.3.1 Let $T_k = A + \left(L\sigma_s^2 I_{t \times t} - X_p X_p^H\right) \hat{Q}_{k-1} \Sigma^2$.

S.3.2 Compute the refined estimate $\hat{Q}_k$ from $T_k$ by employing (3.23).

S.4 Finally estimate $H$ as $\hat{H} = W \hat{Q}_N^H$.
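Steps S.1 to S.3 can be sketched as below. This is an illustrative sketch under stated assumptions rather than the dissertation's code: the function names and the optional `Q0` argument are hypothetical, $\Sigma^2$ is obtained as $W^H W$ (valid when $W = P\Sigma$ with $P$ having orthonormal columns), and the pilot is a random QPSK block with $\sigma_s^2 = 1$.

```python
import numpy as np

def polar_unitary(T):
    """Unitary factor U_T V_T^H of T = U_T diag(s) V_T^H, as used in (3.23)."""
    U, _, Vh = np.linalg.svd(T)
    return U @ Vh

def igml_rotation(W, Yp, Xp, sigma_s2, n_iter=5, Q0=None):
    """Iterative constrained ML estimate of Q for a general pilot (steps S.1-S.3)."""
    t, L = Xp.shape
    Sigma2 = W.conj().T @ W                          # Sigma^2 when W = P Sigma (A.1)
    A = Xp @ Yp.conj().T @ W                         # S.1
    G = L * sigma_s2 * np.eye(t) - Xp @ Xp.conj().T  # vanishes for an orthogonal pilot
    # S.2: (3.19) gives V_Q U_Q^H, which is the unitary factor of A = (W^H Y_p X_p^H)^H
    Q = polar_unitary(A) if Q0 is None else Q0
    for _ in range(n_iter):                          # S.3
        T = A + G @ Q @ Sigma2                       # S.3.1
        Q = polar_unitary(T)                         # S.3.2
    return Q

rng = np.random.default_rng(1)
r, t, L = 8, 4, 32
P, _ = np.linalg.qr(rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t)))
W = P @ np.diag([2.0, 1.5, 1.0, 0.5])
Q, _ = np.linalg.qr(rng.standard_normal((t, t)) + 1j * rng.standard_normal((t, t)))
# Random (non-orthogonal) unit-power QPSK pilot, an assumption for this example
Xp = (rng.choice([-1.0, 1.0], (t, L)) + 1j * rng.choice([-1.0, 1.0], (t, L))) / np.sqrt(2)
noise = 0.1 * (rng.standard_normal((r, L)) + 1j * rng.standard_normal((r, L)))
Yp = W @ Q.conj().T @ Xp + noise
Q_hat = igml_rotation(W, Yp, Xp, sigma_s2=1.0)
H_hat = W @ Q_hat.conj().T                           # S.4
print(np.linalg.norm(Q_hat - Q))
```

Each iterate is unitary by construction, and in the noise-free case the true $Q$ is a fixed point of the update because $T$ then reduces to $L\sigma_s^2 Q\Sigma^2$.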

$N$, the number of iterations, is small and typically $N \le 5$ as found in our simulations. It can now also be noticed that if A.2 holds, $X_p X_p^H = L\sigma_s^2 I$. Therefore, $T = A = X_p Y_p^H W$. The SVD of $T$ is then given by $U_T \Lambda_T V_T^H = V_Q \Sigma_Q U_Q^H$. It follows that the IGML solution given as

$$\hat{Q} = U_T V_T^H = V_Q U_Q^H, \tag{3.24}$$

is identical to the solution given in (3.19). Thus, when $X_p$ is orthogonal, the IGML algorithm converges in a single iteration to the OPML solution.

Finally, we wish to compare the CRB of estimation of $H$ for the IGML and OPML schemes. Let $X_p$, the pilot for the IGML scheme, be a random sequence such that $\mathrm{E}\left\{X_p X_p^H\right\} = L\sigma_s^2 I$, or in other words it is statistically white. Denoting by $\tilde{J}$ the unconstrained FIM for IGML, we have from section (3.3), $\tilde{J} = I_{2 \times 2} \otimes \frac{1}{\sigma_n^2} X_p X_p^H$. Therefore, $\mathrm{E}\{\tilde{J}\} = J$, where $J$ is the corresponding FIM for an orthogonal pilot. The average CRB of IGML, where the averaging is over the distribution of $X_p$, is then given as $\mathrm{CRB}_{IGML} = \mathrm{E}\{\tilde{J}^{-1}\}$. Employing Jensen's inequality for matrices from [38] we have,

$$\mathrm{CRB}_{IGML} = \mathrm{E}\left\{\tilde{J}^{-1}\right\} \ge \left(\mathrm{E}\{\tilde{J}\}\right)^{-1} = J^{-1} = \mathrm{CRB}_{OPML}. \tag{3.25}$$

Thus, the error in the estimation of $H$ is minimum for an orthogonal pilot $X_p$. Similar optimality properties of orthogonal pilots have been previously reported in [39] and [40].
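The matrix Jensen inequality behind (3.25) can be checked empirically. The sketch below is only an illustration: it uses the Gram matrix $X_p X_p^H$ as a stand-in for the FIM (to which it is proportional in the expression above), and it assumes complex Gaussian pilot entries of unit variance, which is one possible statistically white pilot and not necessarily the one used in the dissertation.

```python
import numpy as np

rng = np.random.default_rng(2)
t, L, trials = 4, 12, 500
mean_gram = np.zeros((t, t), dtype=complex)
mean_inv = np.zeros((t, t), dtype=complex)
for _ in range(trials):
    # Statistically white pilot with unit-variance entries (assumption for this example)
    Xp = (rng.standard_normal((t, L)) + 1j * rng.standard_normal((t, L))) / np.sqrt(2)
    gram = Xp @ Xp.conj().T                      # proportional to the FIM block
    mean_gram += gram / trials
    mean_inv += np.linalg.inv(gram) / trials     # sample average of the inverse

# Average of inverses minus inverse of the average should be positive semidefinite
gap = mean_inv - np.linalg.inv(mean_gram)
min_eig = np.linalg.eigvalsh((gap + gap.conj().T) / 2).min()
print(min_eig >= -1e-10, np.trace(mean_inv).real, np.trace(np.linalg.inv(mean_gram)).real)
```

The first trace printed (average bound for random pilots) is never smaller than the second (bound for the averaged Gram matrix), which is the ordering stated in (3.25).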

’Rotation-Optimization’ ML (ROML)

The above suggested IGML scheme to compute $\hat{Q}$ for a general pilot sequence $X_p$ might be computationally complex owing to the multiple SVD computations involved. Thus, to avoid the complexity involved in the full computation of the optimal ML solution, we propose a simplistic ROML procedure for the sub-optimal estimation of $Q$, thus trading complexity for optimality. The first step of ROML involves construction of a modified cost function as

$$\min_{Q} \left\| \tilde{W} Y_p - Q^H X_p \right\|^2 \quad \text{where } Q Q^H = I. \tag{3.26}$$

$\bar{Y}_p = \tilde{W} Y_p$ is the whitening pre-equalized data, where $\tilde{W}$ denotes the pre-equalization filter. The closed form solution $\hat{Q}$ for the modified cost in (3.26) is given as

$$\hat{Q} = V_h U_h^H \quad \text{where } U_h S_h V_h^H = \mathrm{SVD}\left(\tilde{W} Y_p X_p^H\right), \tag{3.27}$$

which can be implemented with low complexity. This result for problem (3.26) follows by noting its similarity to problem (3.18). Several choices can then be considered for the pre-equalization filter $\tilde{W}$. The standard Zero-Forcing (ZF) equalizer is given by $\tilde{W}_{ZF} = W^{\dagger}$ (where $\dagger$ denotes the Moore-Penrose pseudo-inverse) and is usually referred to as 'data whitening' in literature. However, ZF is susceptible to noise enhancement as frequently cited in literature. Alternatively, a robust MMSE pre-filter is given as $\tilde{W}_{MMSE} = \sigma_s^2 W^H \left(\sigma_s^2 W W^H + \sigma_n^2 I\right)^{-1}$.
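The closed form (3.27) with either pre-filter can be sketched as follows. The function name and the convention of selecting ZF when no noise variance is supplied are assumptions of this example, not part of the original text.

```python
import numpy as np

def roml_rotation(W, Yp, Xp, sigma_s2=1.0, sigma_n2=None):
    """Closed-form ROML estimate of Q from (3.27).

    Uses the ZF pre-filter pinv(W) when sigma_n2 is None, otherwise the MMSE pre-filter.
    """
    r = W.shape[0]
    if sigma_n2 is None:
        prefilter = np.linalg.pinv(W)                          # ZF: Moore-Penrose pseudo-inverse
    else:
        R = sigma_s2 * W @ W.conj().T + sigma_n2 * np.eye(r)
        prefilter = sigma_s2 * W.conj().T @ np.linalg.inv(R)   # MMSE pre-filter
    M = prefilter @ Yp @ Xp.conj().T                           # pre-equalized data times X_p^H
    U, _, Vh = np.linalg.svd(M)                                # M = U_h S_h V_h^H
    return Vh.conj().T @ U.conj().T                            # Q_hat = V_h U_h^H

rng = np.random.default_rng(5)
r, t, L = 8, 4, 20
P, _ = np.linalg.qr(rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t)))
W = P @ np.diag([2.0, 1.5, 1.0, 0.5])
Q, _ = np.linalg.qr(rng.standard_normal((t, t)) + 1j * rng.standard_normal((t, t)))
Xp = (rng.standard_normal((t, L)) + 1j * rng.standard_normal((t, L))) / np.sqrt(2)
Y_clean = W @ Q.conj().T @ Xp                                  # noise-free pilot output
noise = 0.3 * (rng.standard_normal((r, L)) + 1j * rng.standard_normal((r, L)))
Q_zf = roml_rotation(W, Y_clean, Xp)                           # ZF, noise-free
Q_mmse = roml_rotation(W, Y_clean + noise, Xp, 1.0, 0.18)      # MMSE, noisy data
print(np.allclose(Q_zf, Q), np.linalg.norm(Q_mmse - Q))
```

In the noise-free ZF case the pre-equalized data equals $Q^H X_p$, so the estimate recovers $Q$ for any full-rank pilot; with noise, the result is a unitary matrix that only approximates the ML solution.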

$\hat{Q}$ given by (3.27) is a reasonably accurate closed form estimate of $Q$. However, the resulting estimate does not have any statistical optimality properties as it does not compute the solution to the true cost function given in (3.18). This estimate of $Q$ can now be employed to initialize the IGML procedure to minimize the true cost. However, to avoid the complexity associated with an SVD computation, a constrained minimization procedure (e.g. 'fmincon' in MATLAB) can now be employed to converge to the solution with the $t^2$ non-linear constraints given by the unit norm and mutual orthogonality of the rows of $Q$. This procedure then yields $\hat{Q}$ which is close to the optimal ML estimate, and the low computational cost of the proposed solution makes it attractive to implement in practical systems.

3.4.3 Total Optimization

This procedure builds on the above described schemes. The ML schemes

(OPML and IGML) for estimating the unitary matrix are optimal given perfect

knowledge of W . However, in finite total symbol run situations where this as-

sumption is not valid (for example in fast fading mobile environments where the

data symbols available in the channel coherence time are limited and hence the

estimated whitening matrix may not be exact as assumed earlier), the disjoint

estimation of the whitening matrix from blind symbols and rotation matrix from

pilot symbols is not optimal. We present a scheme for such a system to iteratively

compute the joint solution for W and Q based on minimizing a Gaussian likelihood

cost function.


Initialization of W and Q

$W$ can be estimated from the output correlation matrix $R_y$ which is given as

$$R_y = \sigma_s^2 W W^H + \sigma_n^2 I. \tag{3.28}$$

The ML estimate of $R_y$ can be computed blindly from the entire received data $y(1), y(2), \ldots, y(N)$ as $\hat{R}_y = \frac{1}{N} \sum_{i=1}^{N} y(i) y(i)^H$. Using relation (3.28) and assuming that $\sigma_s^2$ and $\sigma_n^2$ are known at the receiver, $W W^H$ may be estimated as

$$\hat{H}\hat{H}^H = \frac{1}{\sigma_s^2} \left(\hat{R}_y - \sigma_n^2 I\right) = \hat{W}_{ML} \hat{W}_{ML}^H. \tag{3.29}$$

$\hat{W}_{ML}$ can then be computed from a Cholesky factorization of $\hat{R}_y$. $\hat{Q}$, the initial estimate of $Q$, is then computed by employing $\hat{W}$ in the OPML or IGML algorithms outlined in sections (3.4.1) and (3.4.2) respectively.
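A possible numerical sketch of this initialization follows. Note the swap of technique: instead of the Cholesky factorization mentioned above, the sketch uses an eigendecomposition of the matrix in (3.29) truncated to its $t$ largest eigenvalues, which yields a factor of the form $P\Sigma$ as assumed under A.1. The function names and test values are hypothetical.

```python
import numpy as np

def whitening_from_covariance(Ry, t, sigma_s2, sigma_n2):
    """Factor (Ry - sigma_n2 I)/sigma_s2 as W_hat W_hat^H with W_hat = P Sigma (r x t).

    Uses an eigendecomposition (an assumption of this sketch, not the Cholesky route).
    """
    r = Ry.shape[0]
    C = (Ry - sigma_n2 * np.eye(r)) / sigma_s2
    C = (C + C.conj().T) / 2                       # enforce Hermitian symmetry
    evals, evecs = np.linalg.eigh(C)               # ascending eigenvalues
    idx = np.argsort(evals)[::-1][:t]              # keep the t dominant modes
    sig2 = np.clip(evals[idx], 0.0, None)          # guard against small negative values
    return evecs[:, idx] * np.sqrt(sig2)           # scale each column by sigma_i

def whitening_from_data(Y, t, sigma_s2, sigma_n2):
    """Blind estimate from received vectors stored as the columns of Y."""
    Ry_hat = Y @ Y.conj().T / Y.shape[1]           # sample correlation matrix
    return whitening_from_covariance(Ry_hat, t, sigma_s2, sigma_n2)

rng = np.random.default_rng(6)
r, t, N, sigma_s2, sigma_n2 = 8, 4, 5000, 1.0, 0.1
P, _ = np.linalg.qr(rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t)))
W = P @ np.diag([2.0, 1.5, 1.0, 0.5])
Ry_exact = sigma_s2 * W @ W.conj().T + sigma_n2 * np.eye(r)      # relation (3.28)
W_hat = whitening_from_covariance(Ry_exact, t, sigma_s2, sigma_n2)
X = np.sqrt(sigma_s2 / 2) * (rng.standard_normal((t, N)) + 1j * rng.standard_normal((t, N)))
Nz = np.sqrt(sigma_n2 / 2) * (rng.standard_normal((r, N)) + 1j * rng.standard_normal((r, N)))
W_blind = whitening_from_data(W @ X + Nz, t, sigma_s2, sigma_n2)
rel_err = np.linalg.norm(W_blind @ W_blind.conj().T - W @ W.conj().T) / np.linalg.norm(W @ W.conj().T)
print(rel_err)
```

Only the product $\hat{W}\hat{W}^H$ is identifiable from second-order statistics; the remaining unitary ambiguity is exactly the rotation $Q$ resolved from the pilots.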

Likelihood for Total Optimization

In order to arrive at a reasonably tractable likelihood function, we now assume that the transmitted data $x(k)$, $k = L+1, \ldots, N$ is Gaussian, i.e. $x \sim \mathcal{N}(0, \sigma_s^2 I)$. The likelihood of the complete received data, conditioned on the pilot symbols $X_p$, is given as

$$\mathcal{L}(W, Q) = \underbrace{\frac{1}{2}(N - L) \ln |R(W)| - \sum_{i=L+1}^{N} y(i)^H R(W)^{-1} y(i)}_{\mathcal{L}_1(W)} \; \underbrace{- \frac{1}{\sigma_n^2} \sum_{j=1}^{L} \left\| y(j) - W Q^H x(j) \right\|^2}_{\mathcal{L}_2(W, Q)}, \tag{3.30}$$

where $R(W) \triangleq \sigma_s^2 W W^H + \sigma_n^2 I$. $\mathcal{L}_1$ is a function entirely of blind data and $\mathcal{L}_2$ depends only on training data. This cost function can be minimized for $W$ to compute $\hat{W}$ as given below.

Total Optimization: Let $X_p$ be the transmitted pilot symbol sequence, not necessarily orthogonal. We then compute estimates of the $W$ and $Q$ matrices as follows.

T.1 Compute $\hat{W}_0$, the initial estimate of $W$, from (3.29).

T.2 Compute $\hat{Q}_0$ by employing $X_p$, $\hat{W}_0$ in the IGML algorithm in section (3.4.2).

T.3 Repeat for $N_T$ iterations. At the $k$th iteration, i.e. $1 \le k \le N_T$,

T.3.1 Using $\hat{W}_{k-1}$ as an initial estimate, compute $\hat{W}_k$ by minimizing $\mathcal{L}\left(W, \hat{Q}_{k-1}\right)$ ('fminunc' in MATLAB).

T.3.2 Compute the IGML estimate $\hat{Q}_k$ from $\hat{W}_k$.

T.4 Finally estimate $H$ as $\hat{H} = \hat{W}_{N_T} \hat{Q}_{N_T}^H$.

It is seen from the simulation results that minimization of the above like-

lihood yields an improved estimate of the channel matrix H even when elements

of the transmitted symbol vectors x(k) are drawn from a discrete signal constel-

lation. This solution however involves a computational overhead. Nevertheless it

provides a useful benchmark for the estimation of the flat-fading channel matrix

H. Practical implementation of this algorithm would require a recipe for efficient

numerical computation.

As the data length N increases with pilot length L kept constant, the

effect of L2 on the above expression diminishes for the estimation of W . Hence

for large blind data lengths N , maximizing the likelihood expression L with re-

spect to W , reduces to maximization of L1. The solution W is then given by the

ML estimate in (3.29). The second step maximizes L2, which is the cost function

optimized by the OPML and IGML algorithms. Thus, as N → ∞, the total opti-

mization scheme reduces to a one iteration algorithm involving the ML estimation

of W followed by the constrained ML estimation of Q.


3.5 Simulation Results

Our simulation set-up consists of a 8×4 MIMO channel H (i.e. r = 8, t =

4). H was generated as a matrix of zero-mean circularly symmetric complex Gaus-

sian random entries such that the sum variance of the real and imaginary parts was

unity. For an orthogonal pilot, the source symbol vectors x ∈ C4×1 are assumed to

be drawn from a BPSK constellation and the orthonormality condition is achieved

by using the Hadamard structure. But otherwise, for general pilot sequences and

data vectors, symbols were drawn from a 16-QAM signal constellation. Further,

the transmitted training symbol vectors satisfy $X_p X_p^H = L\sigma_s^2 I$ and the data vectors satisfy $\mathrm{E}\left\{x(k) x(l)^H\right\} = \delta(k, l)\, \sigma_s^2 I$, thus maintaining the source power $\sigma_s^2$ constant. Noise vectors $\eta(k)$ were generated as spatio-temporally uncorrelated complex Gaussian random vectors with the variance of each element equal to $\sigma_n^2$. The SNR of operation was measured as $\mathrm{SNR} = 10 \log_{10}\left(\frac{\sigma_s^2}{\sigma_n^2}\right)$. Simulations described below investigate the performance of the proposed semi-blind algorithm under different conditions.
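One way to generate a pilot block matching this description is sketched below. It is not the original simulation code: the function name is hypothetical, $\sigma_s^2 = 1$ is assumed, and the BPSK Hadamard pilot is built by the Sylvester construction, so the sketch uses a power-of-two pilot length such as $L = 16$.

```python
import numpy as np

def simulate_pilot_block(r=8, t=4, L=16, snr_db=8.0, rng=None):
    """Draw a flat-fading channel, an orthogonal BPSK pilot and the noisy pilot output."""
    rng = np.random.default_rng() if rng is None else rng
    sigma_s2 = 1.0                                       # assumed source power
    sigma_n2 = sigma_s2 / 10 ** (snr_db / 10)            # from SNR = 10 log10(sigma_s2/sigma_n2)
    # Circularly symmetric entries; real and imaginary variances sum to one
    H = (rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))) / np.sqrt(2)
    Hd = np.array([[1.0]])
    while Hd.shape[0] < L:                               # Sylvester-Hadamard, L a power of 2
        Hd = np.kron(Hd, [[1.0, 1.0], [1.0, -1.0]])
    Xp = np.sqrt(sigma_s2) * Hd[:t, :]                   # X_p X_p^H = L sigma_s2 I
    noise = np.sqrt(sigma_n2 / 2) * (rng.standard_normal((r, L)) + 1j * rng.standard_normal((r, L)))
    Yp = H @ Xp + noise
    return H, Xp, Yp, sigma_s2, sigma_n2

H, Xp, Yp, sigma_s2, sigma_n2 = simulate_pilot_block(rng=np.random.default_rng(7))
print(H.shape, Xp.shape, Yp.shape, 10 * np.log10(sigma_s2 / sigma_n2))
```

The returned block can be passed to any of the estimators sketched earlier once $W$ is known or estimated.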

Experiment 1: In this experiment, we demonstrate the enhancement in estimation accuracy that can be achieved by the use of statistical side information (white data) as quantified by lemma 4. For this purpose we evaluate the MSE performance of the different constrained ML estimators of $Q$ under A.1 and compare it to the training based estimate given by (3.2), which neglects white data. The MSE of estimation of the channel matrix $H$ has been averaged over 1000 instantiations of the channel noise $\eta$. In Fig. 3.1, this MSE has been plotted vs. SNR in the range $4\,\mathrm{dB} \le \mathrm{SNR} \le 11\,\mathrm{dB}$ for the OPML semi-blind scheme. As noted in section 3.3.1, the MSE of the semi-blind scheme is 6 dB lower than that of exclusively training based channel estimation. The CRB of the semi-blind scheme is also plotted for reference.

[Figure 3.1 (plot): "ML Error vs. SNR. H is 8 X 4, L = 12". Axes: MSE (log scale) vs. SNR, 4 to 11 dB. Curves: Training, OPML, CR Bound.]

Figure 3.1: MSE vs. SNR of OPML semi-blind channel estimation and the semi-blind CRB with perfect knowledge of $W$. Also shown for reference is MSE of the exclusively training based channel estimate. $H$ is an $8 \times 4$ complex flat-fading channel matrix and pilot length $L = 12$.

Next we compute the MSE for different pilot lengths $L$ in the range $20 \le L \le 100$. A statistically white pilot ($\mathrm{E}\left\{X_p X_p^H\right\} = L\sigma_s^2 I$) was employed for the IGML, ROML and training based schemes while an orthogonal pilot $X_p$ was used for the OPML scheme with $X_p X_p^H = L\sigma_s^2 I$, thus maintaining constant source power. Fig. 3.2 (left) shows the error for these different schemes and also that for the exclusive training based scheme. It can be seen that the semi-blind schemes are 6 dB more efficient than the training scheme as suggested by lemma 4. OPML performs very close to the CRB while the IGML progressively improves towards the CRB as the pilot length increases. In Fig. 3.2 (right), which is a blown up version of the same plot, it is seen that the ROML, because of its sub-optimality, loses slightly (0.5 dB) in terms of estimation gain when compared to the other constrained estimators.

[Figure 3.2 (two plots, left and right): "ML Error. H is 8 X 4, SNR = 7.96 dB". Axes: MSE (log scale) vs. Pilot Length; left panel 20 to 100, right panel (zoom) 50 to 70. Curves: Training, OPML, IGML, ROML, CR Bound.]

Figure 3.2: Computed MSE vs. Pilot length ($L$) for the OPML, IGML, ROML and exclusive training based channel estimation. $H$ is an $8 \times 4$ complex flat-fading channel matrix and SNR = 8 dB.

Experiment 2: We now consider the effect of estimation inaccuracies in $W$ arising from the availability of finite blind data. We demonstrate the performance of the Total Optimization procedure for the joint optimization of $W$ and $Q$ and contrast it with the MSE of the IGML estimate with imperfect $W$. We consider estimation of $W$ from $N - L = 300, 500, 1000$ blind data symbols with the source symbols drawn from a 16-QAM constellation and employing (3.29). The pilot sequence $X_p$ was orthogonal. As in the previous experiment, we consider the MSE in estimation for different pilot lengths $20 \le L \le 100$. It can then be seen from Fig. 3.3 that while the OPML with imperfect $W$ for $N - L = 500$ performs marginally better than the training sequence based technique ($L \le 60$ training symbols), the TotOpt scheme which optimizes the likelihood in (3.30) performs consistently better than the training sequence based scheme in all the cases. Their performance is also compared to the situation of availability of perfect knowledge of $W$ (perf W), which can be seen to achieve the best performance. As noted in section 3.4.3, the performance of TotOpt approaches that of the OPML with perfect $W$ as $N \to \infty$.

[Figure 3.3 (plot): "Total Optimization. H is 8 X 4, SNR = 10 dB". Axes: MSE (log scale) vs. Pilot Length (L), 10 to 90. Curves: Perf W; Training; Imperf W, N−L = 500; Tot Opt, N−L = 300; Tot Opt, N−L = 500; Tot Opt, N−L = 1000.]

Figure 3.3: Comparison of OPML with perfect $W$, OPML with imperfect or estimated $W$, total optimization and training based estimation of $H$.

Experiment 3: Finally, we consider the probability of error $P_e$ in detection of the transmitted symbol vectors employing $\hat{H}$ estimated from different schemes. We illustrate the performance of OPML with perfect knowledge of $W$ at the receiver and Total Optimization with $N - L = 1000, 500$ blind symbols. The performance of the exclusively training based estimate of $H$ is also plotted for $L = 12$. Fig. 3.4 shows the probability of detection error vs. SNR for a linear MMSE receiver at the output for an $8 \times 4$ system $H$. It can be seen that at an SNR of 6 dB the semi-blind scheme achieves about a 1 dB improvement in probability of bit error performance and thus improves over the exclusively training based estimate.

[Figure 3.4 (plot): "PROB. SYMBOL ERROR VS Es/No, L = 12". Axes: Pe (log scale) vs. Eb/No (dB), 0 to 6. Curves: OPML, Perfect W; Tot Opt, N−L = 1000; Tot Opt, N−L = 500; Training.]

Figure 3.4: Probability of Bit Error vs. SNR for $8 \times 4$ MIMO system employing OPML, Total Optimization ($N = 1000, 500$). The performance of the exclusively training based channel estimate is also given for comparison.

3.6 OFDM Channel Estimation

The concept of constrained parameter estimation and CC-CRB provides a general and powerful framework to characterize problems of a wide variety arising in the context of wireless channel estimation. To illustrate this point we demonstrate another application of the constrained estimation framework by considering a problem that arises in the context of channel estimation in orthogonal frequency division multiplexing (OFDM) communication systems. The problem of time vs. frequency domain channel estimation for an OFDM based communication system has been detailed in [9,41]. It has been shown therein that when the number of subcarriers $K$ exceeds the number of taps $L$ in the channel impulse response (CIR), the time domain least squares channel estimate (TLSE) is more accurate than the frequency domain least squares estimate (FLSE). Indeed, we demonstrate below that this result follows as an immediate consequence of the complex constrained CRB theory developed above.

3.6.1 Problem Description

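Employing notation in [9], let the complex baseband channel from the transmitter to the receiver be modelled by a tapped delay line as

$$h(\tau, t) = \sum_{l=0}^{L-1} h_l(t)\, \delta(\tau - l T_s), \tag{3.31}$$

where $L$ is the number of taps in the channel and is known. Let $K$ denote the number of subcarriers and $\mathbf{p} \triangleq [a_0, a_1, \ldots, a_{K-1}]^T$ be the pilot signal known at the receiver. Denoting the discrete time CIR as $\mathbf{h} = [h_0, h_1, \ldots, h_{L-1}]^T$, the cyclic-prefix extended OFDM communication system can be modeled as

$$\mathbf{r} = \mathbf{a}\mathbf{h} + \mathbf{n} \tag{3.32}$$

where $\mathbf{r}, \mathbf{n} \in \mathbb{C}^{K \times 1}$ are the received symbol vector and additive white Gaussian noise respectively. The matrix $\mathbf{a} \in \mathbb{C}^{K \times L}$ is constructed from the pilot symbols as

$$\mathbf{a} \triangleq \begin{bmatrix} a_0 & a_{K-1} & a_{K-2} & \cdots & a_{K-L+1} \\ a_1 & a_0 & a_{K-1} & \cdots & a_{K-L+2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{L-1} & a_{L-2} & a_{L-3} & \cdots & a_0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{K-1} & a_{K-2} & a_{K-3} & \cdots & a_{K-L} \end{bmatrix}. \tag{3.33}$$

$\hat{\mathbf{h}}$, the LS estimate of $\mathbf{h}$, is given as

$$\hat{\mathbf{h}} = \left(\mathbf{a}^H \mathbf{a}\right)^{-1} \mathbf{a}^H \mathbf{r}. \tag{3.34}$$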

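The frequency domain equivalent of the system in (3.32) can be obtained by computing the DFT of both sides as

$$\mathbf{R} = \mathbf{F}\mathbf{r} = \mathbf{F}\mathbf{a}\mathbf{h} + \mathbf{F}\mathbf{n}, \tag{3.35}$$

where $\mathbf{F} \in \mathbb{C}^{K \times K}$ is given as

$$\mathbf{F} = \begin{bmatrix} W^{00} & \cdots & W^{0(K-1)} \\ \vdots & \ddots & \vdots \\ W^{(K-1)0} & \cdots & W^{(K-1)(K-1)} \end{bmatrix} \tag{3.36}$$

and $W^{il} \triangleq e^{-j \frac{2\pi i l}{K}}$. The system in (3.35) is then given as

$$\mathbf{R} = \mathbf{A}\mathbf{H} + \mathbf{N}, \tag{3.37}$$

where $\mathbf{A} \triangleq \mathrm{diag}(\mathbf{F}\mathbf{p}) \in \mathbb{C}^{K \times K}$, $\mathbf{N} \triangleq \mathbf{F}\mathbf{n}$ and $\mathbf{H} \triangleq \tilde{\mathbf{F}}\mathbf{h}$, where $\tilde{\mathbf{F}}$ is the left $K \times L$ sub-matrix of $\mathbf{F}$. The unconstrained least squares estimate of $\mathbf{H}$, which is also the FLSE, is therefore given as

$$\hat{\mathbf{H}}_f = \left(\mathbf{A}^H \mathbf{A}\right)^{-1} \mathbf{A}^H \mathbf{R}. \tag{3.38}$$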

However, the parameter vector $\mathbf{H}$ is a constrained parameter vector and in fact, the constraints on $\mathbf{H}$ are given as

$$f(\mathbf{H}) \triangleq \bar{\mathbf{F}}^H \mathbf{H} = 0 \tag{3.39}$$

where $\bar{\mathbf{F}}$ is the right $K \times (K - L)$ sub-matrix of $\mathbf{F}$. Therefore, from (3.39), it can be seen that the number of constraints on $\mathbf{H}$ is $K - L$. Hence, even though $\mathbf{H}$ contains $K$ complex parameters ($2K$ real parameters), it only contains $L$ ($< K$) un-constrained complex parameters ($2L$ real parameters) and these un-constrained parameters are in fact the elements of the parameter vector $\mathbf{h}$. $\mathbf{H}$ is given as a function of its un-constrained parameters as $\mathbf{H} = \tilde{\mathbf{F}}\mathbf{h}$. Thus, an alternative constrained technique to estimate $\mathbf{H}$, based on the estimate in (3.34), is given as,

$$\hat{\mathbf{H}}_t = \tilde{\mathbf{F}} \hat{\mathbf{h}}, \tag{3.40}$$

which is also the TLSE of $\mathbf{H}$. Assuming a constant power spectrum as in [9], $\mathbf{A}^H \mathbf{A} = I$. Hence, the pilot orthogonality requirement of theorem 3 is satisfied and in fact, the CRB is exactly achievable since the estimation problem in this case involves a linear least squares cost function and the noise is Gaussian [15]. Therefore, from theorem 3, the ratio of the estimation error of the FLSE in (3.38) to the estimation error in the TLSE in (3.40) is precisely given by the ratio of the number of parameters to the number of un-constrained parameters as

$$\frac{\mathrm{E}\left\{\left\| \hat{\mathbf{H}}_f - \mathbf{H} \right\|_F^2\right\}}{\mathrm{E}\left\{\left\| \hat{\mathbf{H}}_t - \mathbf{H} \right\|_F^2\right\}} = \frac{K}{L}, \tag{3.41}$$

as reported in [9], where the above conclusion was reached after an explicit computation of the covariance matrices of the time domain and frequency domain estimation schemes. Thus, the constrained parameter framework and particularly theorem 3 provides a powerful framework, where results such as the one in (3.41) can be deduced by just reckoning the number of un-constrained parameters, thus avoiding explicit computation of the error covariance matrices.

[Figure 3.5 (plot): "OFDM − Channel Estimation. K = 40, L = 5". Axes: MSE (log scale) vs. SNR, 0 to 14 dB. Curves: Unconstrained − FLSE, Constrained − TLSE.]

Figure 3.5: Constrained vs. unconstrained channel estimation for OFDM.

3.6.2 Simulation results

Our simulation setup consisted of an OFDM system with $K = 40$ subcarriers and $L = 5$ taps. The channel $\mathbf{h}$ was generated as a complex Gaussian vector of zero mean independent entries, with the variance of the real and imaginary parts each equal to 0.5. The time domain and frequency domain channel estimates were found as given in (3.40) and (3.38) respectively. The experiment was repeated for 1000 iterations at different SNRs in the range 2 to 14 dB. The mean estimation error vs. SNR is given in Fig. 3.5. It can be seen that the time domain estimate is more accurate than the frequency domain estimate. Also, the ratio of the estimation error of the FLSE to that of the TLSE is precisely $10 \log_{10}\left(\frac{K}{L}\right) = 10 \log_{10}(8) \approx 9$ dB.
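The $K/L$ error ratio in (3.41) can be reproduced with a short Monte Carlo sketch. The code below is an illustration under stated assumptions, not the original simulation: the function name is hypothetical, the pilot is given unit-modulus frequency samples with random phases (one way to obtain a constant power spectrum), and NumPy's unnormalized FFT plays the role of $\mathbf{F}$.

```python
import numpy as np

def ofdm_estimates(h, Pf, sigma_n2, rng):
    """Return (H_true, FLSE, TLSE) for one noise realization.

    Pf holds the pilot's frequency samples F p (unit modulus assumed by the caller).
    """
    K, L = Pf.size, h.size
    p = np.fft.ifft(Pf)                                   # time-domain pilot, so that F p = Pf
    idx = (np.arange(K)[:, None] - np.arange(L)[None, :]) % K
    a = p[idx]                                            # K x L circulant pilot matrix (3.33)
    n = np.sqrt(sigma_n2 / 2) * (rng.standard_normal(K) + 1j * rng.standard_normal(K))
    r = a @ h + n                                         # received vector (3.32)
    H_true = np.fft.fft(h, K)                             # H = (left K x L block of F) h
    H_f = np.fft.fft(r) / Pf                              # FLSE (3.38) with A = diag(F p)
    h_hat = np.linalg.lstsq(a, r, rcond=None)[0]          # time-domain LS estimate (3.34)
    H_t = np.fft.fft(h_hat, K)                            # TLSE (3.40)
    return H_true, H_f, H_t

rng = np.random.default_rng(3)
K, L, trials, sigma_n2 = 40, 5, 2000, 0.1
Pf = np.exp(2j * np.pi * rng.random(K))                   # constant power spectrum, random phases
err_f = err_t = 0.0
for _ in range(trials):
    h = (rng.standard_normal(L) + 1j * rng.standard_normal(L)) / np.sqrt(2)
    H_true, H_f, H_t = ofdm_estimates(h, Pf, sigma_n2, rng)
    err_f += np.sum(np.abs(H_f - H_true) ** 2)
    err_t += np.sum(np.abs(H_t - H_true) ** 2)
ratio = err_f / err_t
print(ratio, 10 * np.log10(ratio))
```

For $K = 40$ and $L = 5$ the printed ratio should be close to $K/L = 8$, i.e. roughly 9 dB. In this setting the TLSE error is the projection of the FLSE error onto the $L$-dimensional column space of the left block of $\mathbf{F}$, which is why its norm is never larger in any single realization.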

3.7 Conclusions

A semi-blind scheme based on a whitening-rotation decomposition of the

channel matrix H has been proposed for MIMO flat-fading channel estimation.

The algorithm computes the whitening matrix W blind from received data and

the unitary matrix Q exclusively from the pilot data. Closed form expressions


for the CRB of the proposed scheme have been derived employing the CC-CRB

framework. Using the bounds, it is shown that the lower bound for the MSE in

channel matrix estimation is directly proportional to the number of un-constrained

parameters leading to the conclusion that the semi-blind scheme can be very effi-

cient when the number of receive antennas is greater than or equal to the number

of transmit antennas. We also develop and analyze algorithms for channel estima-

tion based on the decomposition. Properties of the constrained ML estimator of

Q have been studied and an iterative constrained Q-estimator has been detailed

for non-orthogonal pilot sequences. In the absence of perfect knowledge of W , a

Gaussian likelihood function has been presented for the joint estimation of W and

Q. Simulation results have been presented to support the algorithms and analy-

sis and they demonstrate improved performance compared to exclusively training

based estimation. The applicability of the above framework is also shown in the

context of time versus frequency domain channel estimation in OFDM systems.

Acknowledgement


in A. K. Jagannatham and B. D. Rao, “Whitening-Rotation Based Semi-Blind

MIMO Channel Estimation”, IEEE Transactions on Signal Processing, Vol. 54,

No. 3, Mar’06, Pages: 861 - 869, A. K. Jagannatham and B. D. Rao,“A Semi-Blind

Technique For MIMO Channel Matrix Estimation”, 4th IEEE Workshop on Signal

Processing Advances in Wireless Communications, 2003, Rome, Italy , 15-18 June

2003 Pages:304 - 308, Rome, Italy, A. K. Jagannatham and B. D. Rao,“Constrained

ML Algorithms for Semi-Blind MIMO Channel Estimation”, IEEE Global Telecom-

munications Conference, 2004 GLOBECOM ’04, Vol. 4, Nov’29 - Dec’3, 2004,

Pages: 2475 - 2479, A. K. Jagannatham and B. D. Rao,“Complex Constrained



Barcelona, Spain, 18-21 July 2004, Pages: 397 - 401.


3.8 Appendix for Chapter(3)

In this section we present a more rigorous justification of P.3, i.e. the OPML estimator of $Q$ achieves the CRB given in (3.18) at high SNR, i.e. as $\frac{\sigma_s^2}{\sigma_n^2} \to \infty$. We start by recapitulating a result from matrix perturbation theory [10].

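Perturbation of eigenvectors: Consider a first order perturbation of a Hermitian symmetric matrix $R \in \mathbb{C}^{n \times n}$ by an error matrix $\Delta R$ to yield $\hat{R} \in \mathbb{C}^{n \times n}$, i.e. $\hat{R} = R + \Delta R$. Let $s_k$ represent the eigenvectors and $\lambda_k$ the eigenvalues of $R$ for $k = 1, 2, \ldots, n$. Also let the eigenvalues be distinct, i.e. $\lambda_i \ne \lambda_j$, $\forall i \ne j$. Then for small perturbations $\Delta R$, the perturbed eigenvectors $\hat{s}_k$ can be approximated as

$$\hat{s}_k = s_k + \sum_{\substack{r=1 \\ r \ne k}}^{n} \omega_{rk}\, s_r, \qquad \omega_{rk} \triangleq \frac{s_r^H\, \Delta R\, s_k}{\lambda_k - \lambda_r}, \tag{3.42}$$

and the perturbed eigenvalues $\hat{\lambda}_k$ are given as

$$\hat{\lambda}_k = \lambda_k + \Delta\lambda_k, \qquad \Delta\lambda_k \triangleq s_k^H\, \Delta R\, s_k. \tag{3.43}$$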
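The first-order expressions (3.42) and (3.43) can be compared against an exact eigendecomposition. The sketch below is illustrative only: the function name, the test matrix with eigenvalues 1, 2, 3, 4 and the perturbation scale are assumptions, and the exact eigenvectors are phase-aligned to the unperturbed ones before comparison since eigenvectors are defined only up to a unit-modulus factor.

```python
import numpy as np

def first_order_eig(R, dR):
    """First-order perturbed eigenvalues and eigenvectors of Hermitian R, per (3.42)-(3.43)."""
    lam, S = np.linalg.eigh(R)                    # columns of S are the eigenvectors s_k
    B = S.conj().T @ dR @ S                       # B[r, k] = s_r^H dR s_k
    lam_pert = lam + np.real(np.diag(B))          # (3.43)
    denom = lam[None, :] - lam[:, None]           # denom[r, k] = lam_k - lam_r
    np.fill_diagonal(denom, 1.0)                  # placeholder, diagonal is zeroed below
    omega = B / denom
    np.fill_diagonal(omega, 0.0)                  # the sum in (3.42) excludes r = k
    S_pert = S + S @ omega                        # column k: s_k + sum_r omega[r, k] s_r
    return lam, S, lam_pert, S_pert

rng = np.random.default_rng(4)
n, eps = 4, 1e-4
U, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
R = U @ np.diag([1.0, 2.0, 3.0, 4.0]) @ U.conj().T
R = (R + R.conj().T) / 2                          # distinct, well separated eigenvalues
Bm = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
dR = eps * (Bm + Bm.conj().T) / 2                 # small Hermitian perturbation

lam, S, lam_pert, S_pert = first_order_eig(R, dR)
lam_true, S_true = np.linalg.eigh(R + dR)
phase = np.sum(S.conj() * S_true, axis=0)         # s_k^H s_true_k for each k
S_true = S_true * (phase.conj() / np.abs(phase))  # remove the arbitrary phase
eig_err = np.max(np.abs(lam_pert - lam_true))
vec_err = np.max(np.abs(S_pert - S_true))
zeroth_err = np.max(np.abs(S - S_true))
print(eig_err, vec_err, zeroth_err)
```

The residuals of the first-order approximation scale with the square of the perturbation, whereas simply reusing the unperturbed eigenvectors leaves an error that is linear in it.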

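Perturbation analysis of the SVD: The above results then provide a framework for the analysis of an SVD of a perturbed matrix, as the SVD involves operations similar to the eigen decomposition. Let $\Phi \triangleq \frac{1}{\sigma_s^2 L} W^H Y_p X_p^H$. $\eta_p \in \mathbb{C}^{r \times L}$ is obtained by stacking the noise vectors $\eta(k)$, $k = 1, 2, \ldots, L$ as $\eta_p \triangleq [\eta(1), \eta(2), \ldots, \eta(L)]$. Thus, $Y_p = H X_p + \eta_p$ and substituting this in the expression for $\Phi$ yields

$$\Phi = \frac{1}{\sigma_s^2 L}\, \Sigma P^H \left(P \Sigma Q^H X_p + \eta_p\right) X_p^H = \Sigma^2 Q^H + E,$$

where $E = \frac{1}{\sigma_s^2 L}\, \Sigma P^H \eta_p X_p^H$ can be regarded as a perturbation matrix. From (3.19) it is clear that computing the high SNR estimate of $Q$ involves a computation of the SVD of $\Phi$. Define $\Omega \in \mathbb{C}^{t \times t}$ as $\Omega \triangleq \Sigma^2$. The SVD of the unperturbed matrix $\Omega Q^H$ is given as $I_{t \times t}\, \Omega\, Q^H$. We now wish to use the result in (3.42) to compute the SVD of the perturbed matrix $\Phi = \Omega Q^H + E$ given as $\hat{I} \hat{\Omega} \hat{Q}^H$ where $\hat{I}, \hat{Q} \in \mathbb{C}^{t \times t}$ and $\hat{\Omega} = \mathrm{diag}\left(\hat{\Omega}_1, \hat{\Omega}_2, \ldots, \hat{\Omega}_t\right)$. The perturbed left and right singular vectors $\hat{I}_i, \hat{q}_i \in \mathbb{C}^{t \times 1}$ for $i = 1, \ldots, t$ are given in terms of the basis vectors of $I$, $Q$ as

$$\hat{I}_i = I_i + K_{ii} I_i + \sum_{\substack{j=1 \\ j \ne i}}^{t} \alpha_{ji} I_j, \qquad \hat{q}_i = q_i + L_{ii} q_i + \sum_{\substack{j=1 \\ j \ne i}}^{t} \beta_{ji} q_j, \tag{3.44}$$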

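where $K_{ii}, \alpha_{ji}, L_{ii}, \beta_{ji}$ are the perturbation coefficients. Recasting the above expressions in terms of matrices, the perturbed matrices $\hat{I}, \hat{Q}$ are given as $\hat{I} = IC$, $\hat{Q} = QD$ where $C, D \in \mathbb{C}^{t \times t}$ are defined as

$$C \triangleq \begin{bmatrix} 1 + K_{11} & \alpha_{12} & \cdots & \alpha_{1t} \\ \alpha_{21} & 1 + K_{22} & \cdots & \alpha_{2t} \\ \vdots & \vdots & \ddots & \vdots \\ \alpha_{t1} & \alpha_{t2} & \cdots & 1 + K_{tt} \end{bmatrix}, \qquad D \triangleq \begin{bmatrix} 1 + L_{11} & \beta_{12} & \cdots & \beta_{1t} \\ \beta_{21} & 1 + L_{22} & \cdots & \beta_{2t} \\ \vdots & \vdots & \ddots & \vdots \\ \beta_{t1} & \beta_{t2} & \cdots & 1 + L_{tt} \end{bmatrix}. \tag{3.45}$$

In the absence of noise, $K_{ii} = \alpha_{ji} = L_{ii} = \beta_{ji} = 0$, $\forall\, i, j$ and thus $C, D = I_{t \times t}$. Hence these coefficients are essentially small in magnitude and therefore higher order terms involving them are neglected.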

Perturbation coefficients: We wish to now find expressions for these perturbation coefficients in terms of the perturbation matrix $E$. By definition $\hat{I}_i$ are the eigenvectors of $\Phi\Phi^H$ which is given as,

$$\Phi\Phi^H = \left(\Omega Q^H + E\right)\left(\Omega Q^H + E\right)^H = \Omega^2 + \tilde{E},$$

where the perturbation matrix $\tilde{E} = E Q \Omega + \Omega Q^H E^H$ and the higher order term $E E^H$ has been neglected. Thus, $I_i$ is the eigenvector of $\Omega^2$ while $\hat{I}_i$ is the perturbed eigenvector of $\Omega^2 + \tilde{E}$. Hence from Eq. (3.42) the coefficients $\alpha_{ji}$ are given as

$$\alpha_{ji} = \frac{I_j^H \tilde{E} I_i}{\Omega_i^2 - \Omega_j^2} = \frac{\Omega_i I_j^H E q_i + \Omega_j q_j^H E^H I_i}{\Omega_i^2 - \Omega_j^2}. \tag{3.46}$$

Similarly $\hat{q}_i$ is the perturbed eigenvector of $\Phi^H\Phi$. Hence considering $\Phi^H\Phi$ and repeating the above procedure it can be shown that the complex coefficients $\beta_{ji}$ are given as

$$\beta_{ji} = \frac{\Omega_j I_j^H E q_i + \Omega_i q_j^H E^H I_i}{\Omega_i^2 - \Omega_j^2}. \tag{3.47}$$

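It can also be observed that

$$\alpha_{ij} = -\alpha_{ji}^*, \qquad \beta_{ij} = -\beta_{ji}^*. \tag{3.48}$$

Thus $C, D$ are skew-Hermitian matrices. Let $\hat{\Omega}_i = \Omega_i + \Delta\Omega_i$. $\hat{\Omega}_1^2, \hat{\Omega}_2^2, \ldots, \hat{\Omega}_t^2$ are eigenvalues of the perturbed matrix $\Phi\Phi^H$. We have $(\Omega_i + \Delta\Omega_i)^2 - \Omega_i^2 \approx 2\Omega_i \Delta\Omega_i$. Hence using the result for perturbed eigenvalues from Eq. (3.43) we have $2\Omega_i \Delta\Omega_i = I_i^H \tilde{E} I_i$, with $\tilde{E} = E Q \Omega + \Omega Q^H E^H$, and hence

$$\Delta\Omega_i = \frac{1}{2\Omega_i}\, I_i^H \left(\Omega Q^H E^H + E Q \Omega\right) I_i. \tag{3.49}$$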

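Finally, we derive a constraint for the coefficients $K_{ii}, L_{ii}$. By definition, the singular vectors satisfy $\left\|\hat{I}_i\right\|^2 = \left\|\hat{q}_i\right\|^2 = 1$. Using this in (3.44) we have

$$K_{ii} + K_{ii}^* = -\sum_{\substack{j=1 \\ j \ne i}}^{t} |\alpha_{ji}|^2, \qquad L_{ii} + L_{ii}^* = -\sum_{\substack{j=1 \\ j \ne i}}^{t} |\beta_{ji}|^2. \tag{3.50}$$

Also, from the properties of the singular value decomposition it can be seen that $\hat{\Omega}_i \hat{q}_i^H = \hat{I}_i^H \left(\hat{I} \hat{\Omega} \hat{Q}^H\right) = \hat{I}_i^H \left(\Omega Q^H + E\right)$. This yields,

$$\hat{\Omega}_i \left(q_i + L_{ii} q_i + \sum_{\substack{j=1 \\ j \ne i}}^{t} \beta_{ji} q_j\right)^H = \left(I_i^H + K_{ii}^* I_i^H + \sum_{\substack{j=1 \\ j \ne i}}^{t} \alpha_{ji}^* I_j^H\right) \left(I \Omega Q^H + E\right). \tag{3.51}$$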

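Right multiplying both sides by $q_i$ in (3.51) and simplifying by ignoring second and higher order terms (such as $\Delta\Omega_i L_{ii}$, $K_{ii} E$ etc.) yields,

$$\left(1 + K_{ii}^*\right) \Omega_i + I_i^H E q_i = \left(\Omega_i + \Delta\Omega_i\right)\left(1 + L_{ii}^*\right)$$

$$\Rightarrow\; K_{ii}^* - L_{ii}^* = \frac{1}{\Omega_i} \left(\Delta\Omega_i - \frac{1}{\Omega_i}\, I_i^H E Q \Omega I_i\right). \tag{3.52}$$

Finally, substituting the expression for $\Delta\Omega_i$ from (3.49) yields

$$K_{ii}^* - L_{ii}^* = \frac{1}{2\Omega_i^2}\, I_i^H \left(\Omega Q^H E^H - E Q \Omega\right) I_i. \tag{3.53}$$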

(3.46), (3.47) and (3.53) thus give expressions for the perturbation coefficients, which depend on the current realization of the observation noise $\eta_p$ through $E$. Next, we express the estimation error in $H$ in terms of the perturbation coefficients. From (3.19), the estimate of $Q^H$ is given as $\hat{I}\hat{Q}^H$. We now employ the above expressions for the perturbation coefficients to compute the MSE in the estimation of $H$.

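Estimation error: The estimate of $H$ is given as $\hat{H} = W \hat{I}\hat{Q}^H = P\sqrt{\Omega}\, \hat{I}\hat{Q}^H$. Simplifying the estimation error $H - \hat{H}$ yields,

$$\begin{aligned} H - \hat{H} &= P\sqrt{\Omega}\, Q^H - P\sqrt{\Omega}\, C D^H Q^H \\ &= \sum_{i=1}^{t} p_i \sqrt{\Omega_i}\, q_i^H - \sum_{i=1}^{t} \sqrt{\Omega_i}\, p_i \left(1 + K_{ii} + L_{ii}^*\right) q_i^H \\ &\quad - \sum_{j=1}^{t} \sum_{i=j+1}^{t} \sqrt{\Omega_j}\, p_i \left(\beta_{ij}^* - \alpha_{ij}^*\right) q_j^H - \sum_{i=1}^{t} \sum_{j=1}^{i-1} \sqrt{\Omega_i}\, p_i \left(\alpha_{ij} - \beta_{ij}\right) q_j^H, \end{aligned}$$

where we have employed (3.48) and neglected second order terms of the type $\alpha_{ij}\beta_{kl}$ in the above expansion. The estimation error of the $(k, l)$-th term may then be obtained as,

$$\begin{aligned} H_{kl} - \hat{H}_{kl} &= -\sum_{i=1}^{t} \sqrt{\Omega_i}\, P_{ki} \left(K_{ii} + L_{ii}^*\right) Q_{li}^* - \sum_{j=1}^{t} \sum_{i=j+1}^{t} \sqrt{\Omega_j}\, P_{ki} \left(\beta_{ij}^* - \alpha_{ij}^*\right) Q_{lj}^* \\ &\quad - \sum_{i=1}^{t} \sum_{j=1}^{i-1} \sqrt{\Omega_i}\, P_{ki} \left(\alpha_{ij} - \beta_{ij}\right) Q_{lj}^*. \end{aligned}$$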

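From (3.50) we have $K_{ii} + L_{ii}^* = K_{ii} - L_{ii} - \sum_{j=1, j \ne i}^{t} |\beta_{ji}|^2$ and therefore, neglecting higher than second order terms, $|K_{ii} + L_{ii}^*|^2 \approx |K_{ii} - L_{ii}|^2$. Utilizing this fact, the variance of the estimation error in the $(k, l)$-th term can be written as

$$\begin{aligned} \mathrm{E}\left\{\left| H_{kl} - \hat{H}_{kl} \right|^2\right\} &= \sum_{i=1}^{t} \Omega_i |P_{ki}|^2\, \mathrm{E}\left\{|K_{ii} - L_{ii}|^2\right\} |Q_{li}|^2 + \sum_{j=1}^{t} \sum_{i=j+1}^{t} \Omega_j |P_{ki}|^2\, \mathrm{E}\left\{|\beta_{ij} - \alpha_{ij}|^2\right\} |Q_{lj}|^2 \\ &\quad + \sum_{i=1}^{t} \sum_{j=1}^{i-1} \Omega_i |P_{ki}|^2\, \mathrm{E}\left\{|\alpha_{ij} - \beta_{ij}|^2\right\} |Q_{lj}|^2. \end{aligned} \tag{3.54}$$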

Perturbation coefficient statistics: From the expressions for the perturbation coefficients $\alpha_{ij}$, $\beta_{ij}$ given in (3.46) and (3.47) respectively we have,

$$|\alpha_{ij} - \beta_{ij}| = \frac{\left| I_j^H E q_i - q_j^H E^H I_i \right|}{\Omega_i + \Omega_j}. \tag{3.55}$$

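Let $A_{ii} \triangleq \mathrm{E}\left\{|K_{ii} - L_{ii}|^2\right\}$ and $B_{ij} \triangleq \mathrm{E}\left\{|\beta_{ij} - \alpha_{ij}|^2\right\}$. Also, let $E = \Sigma G$ where $G \triangleq \frac{1}{\sigma_s^2 L} P^H \eta_p X_p^H$. Since the noise $\eta$ is spatio-temporally white, the elements of the matrix $\eta_p$ are uncorrelated. Further the variance of each element of $\eta_p$ is $\sigma_n^2$. $P$ and $X_p$ are orthogonal matrices. Hence $G$ has uncorrelated entries. However, since every row of $X_p$ has a squared norm of $\sigma_s^2 L$, the variance of each element of $G$ is given as $\mathrm{E}\left\{|G_{ij}|^2\right\} \triangleq \sigma_G^2 = \frac{\sigma_n^2}{\sigma_s^2 L}$. Also,

$$I_j^H E q_i - q_j^H E^H I_i = \sigma_j I_j^H G q_i - \sigma_i q_j^H G^H I_i. \tag{3.56}$$

Since $\|I_i\|^2 = \|q_i\|^2 = 1$ we have $\mathrm{E}\left\{\left| I_j^H G q_i \right|^2\right\} = \mathrm{E}\left\{\left| q_j^H G^H I_i \right|^2\right\} = \sigma_G^2$. Also, from properties of zero-mean circularly symmetric random variables, $I_j^H G q_i$ and $q_j^H G^H I_i$ are uncorrelated. Therefore, $\mathrm{E}\left\{\left| I_j^H E q_i - q_j^H E^H I_i \right|^2\right\} = \left(\sigma_j^2 + \sigma_i^2\right) \sigma_G^2 = \left(\Omega_i + \Omega_j\right) \sigma_G^2$. Substituting this result in (3.55) yields

$$B_{ij} = \frac{\sigma_G^2}{\Omega_i + \Omega_j}, \tag{3.57}$$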

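where the expectation is over the distribution of $\eta_p$ conditioned on the channel matrix $H$. Similarly, repeating the above analysis for $K_{ii} - L_{ii}$ from the expression given in (3.53) yields $A_{ii} = \frac{\sigma_G^2}{2\Omega_i}$. Substituting these results in (3.54) we have

\[
\mathrm{E}\left\{ \left| \hat{H}_{kl} - H_{kl} \right|^2 \right\} = \frac{\sigma_n^2}{\sigma_s^2 L} \sum_{i=1}^{t} \frac{1}{2} |P_{ki}|^2 |Q_{li}|^2 + \frac{\sigma_n^2}{\sigma_s^2 L} \sum_{i=1}^{t} \sum_{j=1,\, j \neq i}^{t} \frac{\Omega_i}{\Omega_i + \Omega_j} |P_{ki}|^2 |Q_{lj}|^2 = \frac{\sigma_n^2}{\sigma_s^2 L} \sum_{i=1}^{t} \sum_{j=1}^{t} \frac{\sigma_i^2}{\sigma_j^2 + \sigma_i^2} |P_{ki}|^2 |Q_{lj}|^2, \tag{3.58}
\]

which is similar to the CRB of the variance of each element given in (3.18). Thus the constrained ML estimator achieves the CRB asymptotically in SNR.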
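As a quick numerical illustration of (3.58), the following sketch evaluates its right-hand side for every channel entry. It assumes that $P$, $Q$ and $\sigma_i$ are the factors of the thin SVD $H = P\,\mathrm{diag}(\sigma)\,Q^H$; the function name and the parameter values are hypothetical. Summing the per-element variances over all $(k,l)$ gives $\frac{\sigma_n^2}{\sigma_s^2 L}\cdot\frac{t^2}{2}$, because the weights for the index pairs $(i,j)$ and $(j,i)$ add up to one.

```python
import numpy as np

def element_error_variance(H, sigma_n2, sigma_s2, L):
    """Evaluate the right-hand side of (3.58) for every entry (k, l) of H."""
    # Assumption: P, Q and sigma_i come from the thin SVD H = P diag(sigma) Q^H.
    P, sigma, Qh = np.linalg.svd(H, full_matrices=False)
    Q = Qh.conj().T
    s2 = sigma ** 2
    # weight[i, j] = sigma_i^2 / (sigma_j^2 + sigma_i^2)
    weight = s2[:, None] / (s2[None, :] + s2[:, None])
    scale = sigma_n2 / (sigma_s2 * L)
    # var[k, l] = scale * sum_{i, j} |P_ki|^2 * weight[i, j] * |Q_lj|^2
    return scale * (np.abs(P) ** 2) @ weight @ (np.abs(Q) ** 2).T

rng = np.random.default_rng(0)
r, t = 4, 2
H = rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))
var = element_error_variance(H, sigma_n2=0.1, sigma_s2=1.0, L=8)
total = var.sum()  # total MSE over all entries of H
```

The total depends only on $t^2$ and not on $r$, which is the same scaling that appears in the semi-blind bound of the next chapter.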

4 Fisher Information Based Regularity and Semi-Blind Estimation of MIMO-FIR Channels

4.1 Introduction

Identifiability is another key consideration in the study of channel estima-

tion, especially in the context of semi-blind channel estimation where it is unclear

as to how much of the wireless channel can be estimated from the statistical infor-

mation available. In [42] the authors make an intuitive observation that the nullity

of the Fisher information matrix equals the number of unidentifiable parameters.

It is then demonstrated that at least one parameter is unidentifiable from the sta-

tistical information in the context of a SIMO channel. This is also reported widely

in research on subspace based channel estimation such as [43]. In [44,45], a variety of channel estimation scenarios such as pilot, blind and semi-blind are considered for a SIMO channel, and criteria for identifiability are derived. In the context of MIMO channels, criteria for identifiability have been demonstrated in [46]. Treating the transmitted symbols as deterministic unknown quantities, the conditions for the regularity of the MIMO FIM and strict identifiability have been demonstrated in [47]. Yet another interesting result on identifiability is reported in [48] where it has been demonstrated that in the context of orthogonal space-time block codes the channel matrix can be estimated up to a complex scalar ambiguity from blind symbols. Identifiability results for a general MIMO space-time coding framework (not necessarily orthogonal) are presented in [49].

Figure 4.1: Schematic representation of an SB system.

Parallel to these developments, work has been simultaneously reported

on blind and semi-blind algorithms for the estimation of MIMO frequency selective

channels. In [50] Tugnait and Huang have elaborated a blind algorithm based on

linear prediction for the estimation of MIMO frequency selective channels from

higher order statistics, and also demonstrate that it is not necessary to assume

that the channel matrix is column reduced. Another interesting scheme is demon-

strated in [51] for equalizable matrices that are not necessarily irreducible but can

be decomposed into the product of an irreducible and para-unitary component. A

closed form blind channel estimator has been demonstrated for MIMO channels

employing orthogonal space-time block codes in [52]. An algorithm for the simulta-

neous semi-blind estimation of channel state information of multiple transmitters

employing orthogonal space-time block codes has been demonstrated in [53].

In this work we address issues regarding the semi-blind estimation of

MIMO-FIR channels through an FIM approach. The FIM has also been employed

as a tool in [42] and [47] to study SIMO and MIMO channel identifiability issues

respectively. However, a key difference in our work is that we treat the unknown

data symbols as stochastic quantities bearing valuable statistical information. In


fact, a comparison of the stochastic (random) and deterministic (nonrandom) sig-

nal scenarios has been presented in [54] in the context of direction of arrival es-

timation where it has been demonstrated that if Cs, Cd denote the Cramer-Rao

Bounds (CRB) on the covariance matrices of the stochastic and deterministic es-

timation schemes, then Cd ≥ Cs. In other words, the stochastic model is statisti-

cally more efficient than its deterministic counterpart. The stochastic signal model

has the advantage that the number of unknowns in the system no longer grows with the number of transmitted data symbols and, moreover, the statistical information can be employed to enhance the accuracy of the computed channel estimate as we

demonstrate. It can be observed that an r × t MIMO system with Lh channel

taps has 2rtLh real parameters. Given the additional statistical information in a

semi-blind system, a key concern is how many pilot symbols are necessary for identifiability. Using the FIM framework, we demonstrate that at least $t$ pilot symbols are necessary for regularity (zero nullity of the FIM), which implies identifiability [47] of the MIMO-FIR channel. This framework also recovers the well known result that the MIMO channel can be estimated at most up to $t^2$ indeterminate parameters from second order statistical information.¹ Thus, in a MIMO-FIR system, under certain mild conditions, pilot symbols are needed only to identify $t^2$ blindly unidentifiable parameters. Further, we quantify the performance enhancement of a semi-blind scheme relative to an exclusively training based scheme which does not leverage the statistical information available. Employing an asymptotic analysis we show that the semi-blind CRB (SB-CRB) indeed approaches the complex-constrained CRB (CC-CRB) [55] for the estimation of these $t^2$ parameters. Therefore, the semi-blind MSE of estimation is potentially very small compared to an exclusively pilot based scheme.

¹ This does not imply that $2rtL_h - t^2$ of the original $2rtL_h$ parameters can be estimated, but rather that each of the $2rtL_h$ parameters can be identified only up to $t^2$ degrees of freedom. This will become clearer from the discussion in the following sections.

It is known from earlier work that the $t^2$ blindly unidentifiable parameters correspond to a $t \times t$ unitary matrix indeterminacy [46]. Employing this observation, we describe a scheme for the SB estimation of a MIMO-FIR channel based on an irreducible-unitary decomposition. The irreducible part can be

estimated from blind symbols by employing second order statistics (SOS) based

techniques. One such scheme which utilizes linear prediction is elaborated in [50].

Finally, we demonstrate an orthogonal pilot sequence based maximum-likelihood

(ML) scheme for the estimation of the unitary matrix indeterminacy and also de-

scribe a technique for the construction of an orthogonal pilot symbol matrix for

a MIMO FIR system from Paley-Hadamard matrices. The rest of the chapter is

organized as follows. The MIMO-FIR estimation problem is formulated in section

4.2. In section 4.3 we present results on the SB-FIM followed by section 4.4 which

describes the irreducible-unitary SB scheme for channel estimation. Simulation

results are presented in section 4.6 followed by conclusions in section 4.7. In what

follows, i ∈ m,n represents m ≤ i ≤ n; i,m, n ∈ N where N denotes the set of

natural numbers, rank (·) the Rank of a matrix and N (·) represents the null space

of a matrix.

4.2 Problem Formulation

Consider an $L_h$ tap frequency-selective MIMO channel. The system input-output relation can be expressed as,

\[
y(k) = \sum_{i=0}^{L_h - 1} H(i)\, x(k - i) + \eta(k), \tag{4.1}
\]

where $y(k)$, $x(k)$ are the $k$th received and transmitted symbol vectors respectively. $\eta(k)$ is spatio-temporally white additive Gaussian noise of variance $\sigma_n^2$, i.e. $\mathrm{E}\{\eta(k)\eta(l)^H\} = \sigma_n^2\, \delta(k - l)\, I_r$ where $\delta(k) = 1$ if $k = 0$ and $0$ otherwise. Let $t$, $r$ be the number of transmitters and receivers respectively, so that $y(k) \in \mathbb{C}^{r \times 1}$ and $x(k) \in \mathbb{C}^{t \times 1}$. Each $H(i) \in \mathbb{C}^{r \times t}$, $i \in 0, L_h - 1$ is the MIMO channel matrix corresponding to the $i$-th lag. Also, we assume $r > t$, i.e. the number of receivers is greater than the number of transmitters. Let $x_p(1), x_p(2), \ldots, x_p(L_p)$ be a burst of $L_p$ transmitted pilot symbols. The subscript $p$ in the above notation represents pilots. We also assume that the leading channel matrix $H(0)$ is full rank. Let $H \in \mathbb{C}^{r \times L_h t}$ be defined as,

\[
H \triangleq [H(0), H(1), \ldots, H(L_h - 1)].
\]

The input output relation can then be represented as $Y_p = H X_p + N_p$, where the block Toeplitz pilot matrix $X_p \in \mathbb{C}^{L_h t \times L_p}$ is constructed from the transmitted pilot symbols as

\[
X_p \triangleq \begin{bmatrix} x_p(1) & x_p(2) & \cdots & x_p(L_p) \\ 0 & x_p(1) & \cdots & x_p(L_p - 1) \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & x_p(L_p - L_h + 1) \end{bmatrix}. \tag{4.2}
\]

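For the data symbols (which are blind information at the receiver), let us stack $N > L_h$ transmitted symbol vectors in $\mathcal{X}(k)$, described by the system model given below as,

\[
\underbrace{\begin{bmatrix} y(kN) \\ y(kN - 1) \\ \vdots \\ y((k-1)N + L_h) \end{bmatrix}}_{\mathcal{Y}(k)} = \mathcal{H} \underbrace{\begin{bmatrix} x(kN) \\ x(kN - 1) \\ \vdots \\ x((k-1)N + 1) \end{bmatrix}}_{\mathcal{X}(k)} + \begin{bmatrix} \eta(kN) \\ \eta(kN - 1) \\ \vdots \\ \eta((k-1)N + L_h) \end{bmatrix}, \tag{4.3}
\]

where the matrix $\mathcal{H} \in \mathbb{C}^{(N - L_h + 1)r \times Nt}$ is the standard block Sylvester channel matrix often employed for the analysis of MIMO-FIR channels [42] and is given as,

\[
\mathcal{H} \triangleq \begin{bmatrix} H(0) & H(1) & H(2) & \cdots & H(L_h - 1) & 0 & \cdots \\ 0 & H(0) & H(1) & \cdots & H(L_h - 2) & H(L_h - 1) & \cdots \\ 0 & 0 & H(0) & \cdots & H(L_h - 3) & H(L_h - 2) & \cdots \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \ddots \end{bmatrix}.
\]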

The input vector $\mathcal{X}(k) \in \mathbb{C}^{Nt \times 1}$ is the data symbol block. This model is represented pictorially in fig. 4.2, where each block of data is $N$ symbols long. Such a stacking of the input/output symbols into blocks results in the loss of a small number of information symbols ($L_h - 1$ symbols per block) due to interblock interference (IBI) as depicted in fig. 4.2. This model is similar to the one considered in [56] and is adopted because eliminating the IBI makes the analysis tractable by yielding simple likelihood expressions.

Figure 4.2: Schematic representation of input and output symbol blocks.
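The block model (4.3) can be checked numerically. The sketch below is a minimal illustration under assumed dimensions (the helper name `sylvester_matrix` and all sizes are hypothetical): it builds the block Sylvester matrix from the taps $H(0), \ldots, H(L_h - 1)$ and compares the stacked, noise-free output with a direct evaluation of the convolution in (4.1) for the $N - L_h + 1$ outputs that are free of IBI.

```python
import numpy as np

def sylvester_matrix(taps, N):
    """Block Sylvester matrix of size (N-Lh+1)r x Nt built from taps H(0..Lh-1)."""
    Lh = len(taps)
    r, t = taps[0].shape
    rows = N - Lh + 1
    S = np.zeros((rows * r, N * t), dtype=complex)
    for m in range(rows):        # block row m
        for i in range(Lh):      # tap H(i) sits at block column m + i
            S[m * r:(m + 1) * r, (m + i) * t:(m + i + 1) * t] = taps[i]
    return S

rng = np.random.default_rng(1)
r, t, Lh, N = 3, 2, 2, 5
taps = [rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t)) for _ in range(Lh)]
x = rng.standard_normal((N, t)) + 1j * rng.standard_normal((N, t))  # x[n]: n-th symbol of the block

# Stack the newest symbol first, as in (4.3).
X_block = x[::-1].reshape(-1)
Y_block = sylvester_matrix(taps, N) @ X_block

# Direct noise-free convolution for the outputs that involve only symbols of this block.
y_direct = [sum(taps[i] @ x[n - i] for i in range(Lh)) for n in range(N - 1, Lh - 2, -1)]
max_err = np.max(np.abs(Y_block - np.concatenate(y_direct)))
```

Only $N - L_h + 1$ output vectors are retained per block, which is the loss of $L_h - 1$ symbols per block mentioned above.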

Let the transmitted data symbols $x(k)$ be spatio-temporally white, i.e. $\mathrm{E}\{x(k)x(l)^H\} = \sigma_s^2\, \delta(k - l)\, I_t$, with the normalized source power $\sigma_s^2 \triangleq 1$. Hence the covariance of the block input vector $\mathcal{X}(k)$ is given as $R_{\mathcal{X}} \triangleq \mathrm{E}\{\mathcal{X}(i)\mathcal{X}(i)^H\} = I_{Nt}$. Next, we present insights into the nature of the above estimation problem.

4.3 Semi-Blind Fisher Information Matrix (FIM)

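In this section we formally set up the complex FIM for the estimation of the channel matrix $H$ and provide insights into the nature of semi-blind estimation. The parameter vector to be estimated, $\theta_H \in \mathbb{C}^{2L_h rt \times 1}$, is defined by stacking the complex parameter vector and its conjugate as suggested in [16,55] as,

\[
\theta_H \triangleq \begin{bmatrix} \theta_{H(0)} \\ \theta_{H(1)} \\ \vdots \\ \theta_{H(L_h - 1)} \end{bmatrix}, \quad \text{where} \quad \theta_{H(i)} \triangleq \begin{bmatrix} \mathrm{vec}(H(i)) \\ \mathrm{vec}(H(i))^{*} \end{bmatrix} \in \mathbb{C}^{2rt \times 1},
\]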

and $\mathrm{vec}(\cdot)$ denotes the standard matrix vector operator which represents a column wise stacking of the entries of a matrix into a single column vector. In what follows $k \in 0, L_h - 1$, $i \in 1, rt$. Observe also that $\theta_H^{*}(2krt + i) = \theta_H((2k + 1)rt + i)$. Let $L_b$ blocks of data symbols $\mathcal{X}(p)$, $p \in 1, L_b$ be transmitted. In addition, let the data symbol vectors $x(l)$, $l \in L_p + 1, NL_b + L_p$ be Gaussian. Then, $R_Y$, the correlation matrix of the output $\mathcal{Y}$ defined in section 4.2, is given as,

\[
R_Y = \mathrm{E}\{\mathcal{Y}(l)\mathcal{Y}(l)^H\} = \mathcal{H}\mathcal{H}^H + \sigma_n^2 I,
\]

where $R_Y \in \mathbb{C}^{(N - L_h + 1)r \times (N - L_h + 1)r}$. To make the analysis tractable, the ISI between the pilot and first block of blind output symbols is ignored as shown in fig. 4.2. The log-likelihood expression for the semi-blind scenario described above is then given by a simple sum of the blind and pilot log-likelihoods as,

\[
\mathcal{L}(\mathcal{Y}; \theta_H) = \mathcal{L}_b + \mathcal{L}_p,
\]

where $\mathcal{L}_b$, the Gaussian log-likelihood of the blind symbols, is given as,

\[
\mathcal{L}_b = -\sum_{k=1}^{L_b} \mathrm{tr}\left( \mathcal{Y}(k)^H R_Y^{-1} \mathcal{Y}(k) \right) - L_b \ln \det R_Y,
\]

and $\mathcal{L}_p$, the least-squares log-likelihood of the pilot part (up to an additive constant), is given as,

\[
\mathcal{L}_p = -\frac{1}{\sigma_n^2} \sum_{i=1}^{L_p} \left\| y_p(i) - \sum_{j=0}^{L_h - 1} H(j)\, x_p(i - j) \right\|^2. \tag{4.4}
\]

Hence, the FIM for the sum likelihood is given as,

\[
J_{\theta_H} = J^b + J^p, \tag{4.5}
\]

where $J^b, J^p \in \mathbb{C}^{2rtL_h \times 2rtL_h}$ are the FIMs for the blind and pilot symbol bursts respectively, which are defined by the likelihoods $\mathcal{L}_b$, $\mathcal{L}_p$ [15, 55]. This splitting of the FIM into pilot and blind components is similar to the approaches employed in [57, 58]. Next we present a general result on the properties of the FIM before we apply it to the problem at hand in the succeeding sections.

4.3.1 FIM: A General Result

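In this section, we present an interesting property of an FIM based analysis by demonstrating a relation between the rank of the FIM and the number of unidentifiable parameters. Let $\alpha \in \mathbb{C}^{k \times 1}$ be the complex parameter vector of interest. As described in [16,55] for estimation of complex parameters, we employ a stacking of $\alpha$ as $\theta = [\alpha^T, \alpha^H]^T \in \mathbb{C}^{n \times 1}$ where $n \triangleq 2k$. Let $p(\omega; g(\theta))$ be the pdf of the observation vector $\omega$ parameterized by $g(\theta)$, where $g(\theta): \mathbb{C}^{n \times 1} \to \mathbb{C}^{l \times 1}$ is a function of the parameter vector $\theta$. Similar to stacking $\alpha$, $\alpha^{*}$, let the function $f(\theta): \mathbb{C}^{n \times 1} \to \mathbb{C}^{m \times 1}$, $m \triangleq 2l$ be defined as,

\[
f(\theta) = \begin{bmatrix} g(\theta) \\ g^{*}(\theta) \end{bmatrix}. \tag{4.6}
\]

Given the log-likelihood $\mathcal{L}(\omega; \theta) \triangleq \ln p(\omega; f(\theta))$, the FIM $J_\theta \in \mathbb{C}^{n \times n}$ is given [15] as,

\[
J_\theta \triangleq -\mathrm{E}\left\{ \frac{\partial^2 \mathcal{L}(\omega; \theta)}{\partial \theta\, \partial \theta^H} \right\}.
\]

Let $f(\theta)$ be an identifiable function of the parameter $\theta$, i.e. the FIM with respect to $f(\theta)$ has full rank. We then have the following result.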

Lemma 5. Let $p(\omega; f(\theta))$ be the pdf of the observation vector $\omega$ and $f(\theta): \mathbb{C}^{n \times 1} \to \mathbb{C}^{m \times 1}$ be a function of the parameter vector $\theta$ satisfying the following conditions.

C.1 Let $f(\theta)$ itself be an identifiable function of the parameter $\theta$, i.e. the FIM with respect to $f(\theta)$ has full rank.

C.2 Let $\dim\left( N\left( \frac{\partial f(\theta)}{\partial \theta} \right) \right) \geq d$, or in other words the dimension of the null space of $\frac{\partial f(\theta)}{\partial \theta}$ is at least $d$.

Under the above conditions, the FIM $J(\theta) \in \mathbb{C}^{n \times n}$ is rank deficient and moreover, $\mathrm{rank}(J(\theta)) \leq n - d$.

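Proof. Let $p(\omega; f(\theta))$ be the pdf of the observations $\omega$. The derivative of the log-likelihood with respect to the parameter vector $\theta$ is given as,

\[
\frac{\partial}{\partial \theta} \ln p(\omega; f(\theta)) = \frac{\partial}{\partial f(\theta)} \ln p(\omega; f(\theta))\, \frac{\partial f(\theta)}{\partial \theta}.
\]

The un-constrained FIM for the estimation of the parameter vector $\theta$ is given as,

\[
J(\theta) = \mathrm{E}\left\{ -\left( \frac{\partial}{\partial \theta} \ln p(\omega; f(\theta)) \right)^T \frac{\partial}{\partial \theta} \ln p(\omega; f(\theta)) \right\} = \left( \frac{\partial f(\theta)}{\partial \theta} \right)^T \underbrace{\mathrm{E}\left\{ -\left( \frac{\partial}{\partial f(\theta)} \ln p(\omega; f(\theta)) \right)^T \frac{\partial}{\partial f(\theta)} \ln p(\omega; f(\theta)) \right\}}_{J(f(\theta))} \frac{\partial f(\theta)}{\partial \theta}.
\]

Hence, from the condition C.2 above it follows that $\mathrm{rank}(J(\theta)) \leq n - d$.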

The value $d$ has a critical significance in the above lemma and is related to the number of un-identifiable parameters in the system. We now provide a deeper insight into the above result that connects the nature of the FIM to the number of unidentifiable parameters. Explicitly, let $\theta$ be reparameterized by the real parameter vector $\xi \triangleq [\xi_1^T, \xi_2^T]^T$, $\xi_1 \in \mathbb{R}^{d' \times 1}$, $\xi_2 \in \mathbb{R}^{d \times 1}$ as $\theta(\xi)$. Let $f(\theta(\xi)) \in \mathbb{C}^{m \times 1}$ satisfy the property,

\[
\frac{\partial f(\theta(\xi))}{\partial \xi_2} = \frac{\partial f(\theta)}{\partial \theta} \frac{\partial \theta}{\partial \xi_2} = 0_{m \times d}, \tag{4.7}
\]

or in other words, the function $f(\theta)$ remains unchanged as the parameter vector $\theta$ varies over the $d$ dimensional constrained manifold $\theta(\xi_2)$ and thus $\frac{\partial f(\theta)}{\partial \theta}$ has at least a $d$ dimensional null space. The parameter vector $\xi_2$ is the un-constrained parameterization of the constraint manifold and represents the unidentifiable parameters. We assume that $\frac{\partial \theta}{\partial \xi_2}$ is full rank, i.e. $\theta$ is non-trivially dependent on $\xi_2$.

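The above scenario can be better illustrated with the aid of the following example. Consider the estimation of a frequency flat channel matrix $H$, i.e. $L_h = 1$. Let the system be parameterized as $\alpha = \mathrm{vec}(H)$ and hence from the discussion above

\[
\theta \triangleq \begin{bmatrix} \mathrm{vec}(H) \\ \mathrm{vec}(H)^{*} \end{bmatrix}. \tag{4.8}
\]

From the system model described in (4.1), the pdf of the observation vector $y \in \mathbb{C}^{r \times 1}$ is given by $y \sim \mathcal{N}(0, g(\theta))$, where $g(\theta) \in \mathbb{C}^{r^2 \times 1}$ is the output correlation defined as $g(\theta) \triangleq \mathrm{vec}(HH^H + \sigma_n^2 I)$. Consider a re-parameterization of the channel matrix $H$ by the real parameter vector $\xi$ as $H(\xi) = W(\xi_1)\, Q(\xi_2)$, where $W(\xi_1) \in \mathbb{C}^{r \times t}$ is also known as a whitening matrix and $Q(\xi_2) \in \mathbb{C}^{t \times t}$ is a unitary matrix. $g(\theta)$ is now a many to one mapping since $g(\theta(\xi)) = \mathrm{vec}\left( W(\xi_1) W(\xi_1)^H + \sigma_n^2 I \right)$ and

\[
f(\theta(\xi)) = \begin{bmatrix} \mathrm{vec}\left( W(\xi_1) W(\xi_1)^H + \sigma_n^2 I \right) \\ \left( \mathrm{vec}\left( W(\xi_1) W(\xi_1)^H + \sigma_n^2 I \right) \right)^{*} \end{bmatrix}, \tag{4.9}
\]

which is independent of the parameter vector $\xi_2$. Hence, it is a many to one mapping since for all unitary matrices $Q(\xi_2)$ we have,

\[
\frac{\partial f(\theta(\xi))}{\partial \xi_2} = 0_{2r^2 \times t^2}, \tag{4.10}
\]

since $\xi_2 \in \mathbb{R}^{t^2 \times 1}$ ($t^2$ is the number of real parameters needed to characterize a $t \times t$ unitary matrix).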
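The flat-fading example can be illustrated numerically. The sketch below is a real-parameter analogue rather than the complex stacked formulation of the text: it forms the exact Jacobian of the map $H \mapsto HH^H$ with respect to the $2rt$ real and imaginary parts of $H$ and counts the dimension of its null space. The function name and the dimensions are hypothetical. For a full column rank $H$ the nullity equals $t^2$, the number of real parameters of the unitary factor.

```python
import numpy as np

def correlation_jacobian(H):
    """Real Jacobian of H -> H H^H with respect to the 2rt real parameters of H."""
    r, t = H.shape
    cols = []
    for k in range(r):
        for l in range(t):
            for unit in (1.0, 1.0j):  # perturb the real part, then the imaginary part
                dH = np.zeros((r, t), dtype=complex)
                dH[k, l] = unit
                # Exact differential of H H^H in the direction dH.
                dR = dH @ H.conj().T + H @ dH.conj().T
                cols.append(np.concatenate([dR.real.ravel(), dR.imag.ravel()]))
    return np.array(cols).T

rng = np.random.default_rng(2)
r, t = 4, 2
H = rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))
Jac = correlation_jacobian(H)
nullity = Jac.shape[1] - np.linalg.matrix_rank(Jac)  # number of unidentifiable real parameters
```

The noise term $\sigma_n^2 I$ is omitted because it does not depend on $H$ and therefore does not affect the Jacobian.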

Thus, the rank of the FIM is deficient by at least d, which is the number of

un-identifiable parameters. This implies that each parameter θi in θ is identifiable

only up to d degrees of freedom owing to the un-identifiability of the ξ2 component

which is of dimension d. For instance, in the above example of the flat-fading


channel, each parameter can be identified only up to the corresponding parameter

of the whitening matrix. The above result has interesting applications, especially

in the investigation of identifiability issues in the context of blind and semi-blind

wireless channel estimation. In the next section we apply this analysis to the

problem of semi-blind MIMO-FIR channel estimation where we examine the rank

of the semi-blind FIM for different cases and derive further insights into the nature

of the estimation problem.

4.3.2 Blind FIM

We now apply the above result to our problem of MIMO-FIR channel estimation. We start by investigating the properties of the blind FIM $J^b$. Let the block Toeplitz parameter derivative matrix $E(k) \in \mathbb{C}^{(N - L_h + 1)r \times Nt}$ be defined employing complex derivatives as

\[
E(krt + i) \triangleq \frac{\partial \mathcal{H}}{\partial \theta_H^{2krt + i}}.
\]

From the results for the Fisher information matrix of a complex Gaussian stochastic process [59], $J^b$ defined in (4.5) above is given as,

\[
\begin{aligned}
J^b_{2krt+i,\, 2lrt+j} &= J^b_{(2l+1)rt+j,\, (2k+1)rt+i} = L_b\, \mathrm{tr}\left( E(krt + i)\, \mathcal{H}^H R_Y^{-1} \mathcal{H}\, E(lrt + j)^H R_Y^{-1} \right), \\
J^b_{2krt+i,\, (2l+1)rt+j} &= \left( J^b_{(2l+1)rt+j,\, 2krt+i} \right)^{*} = L_b\, \mathrm{tr}\left( E(krt + i)\, \mathcal{H}^H R_Y^{-1} E(lrt + j)\, \mathcal{H}^H R_Y^{-1} \right),
\end{aligned} \tag{4.11}
\]

where $J^b_{k,l}$ denotes its $(k,l)$th element. We can now apply the result in lemma 5 above to the FIM $J^b$ and we have the following result on the rank of the blind FIM for the MIMO FIR channel.

Theorem 2. Let the channel matrix $H(0)$ have full column rank. Then, a rank upper-bound on the blind FIM $J^b \in \mathbb{C}^{2rtL_h \times 2rtL_h}$ defined in (4.11) above is given as,

\[
\mathrm{rank}(J^b) \leq 2rtL_h - t^2.
\]

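In fact, a basis for the $t^2$-dimensional subspace of the null space $N(J^b)$ is given by $\mathcal{U}(H) \in \mathbb{C}^{2rtL_h \times t^2}$ as,

\[
\mathcal{U}(H) \triangleq \begin{bmatrix} U(H(0)) \\ U(H(1)) \\ \vdots \\ U(H(L_h - 1)) \end{bmatrix},
\]

where the matrix function $U(H): \mathbb{C}^{r \times t} \to \mathbb{C}^{2rt \times t^2}$ for the matrix $H = [h_1, h_2, \ldots, h_t]$ is defined as,

\[
U(H) = \begin{bmatrix}
-h_1^{*} & -h_2^{*} & -h_3^{*} & \cdots & 0 & 0 & \cdots \\
0 & 0 & 0 & \cdots & -h_1^{*} & -h_2^{*} & \cdots \\
0 & 0 & 0 & \cdots & 0 & 0 & \cdots \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \ddots \\
h_1 & 0 & 0 & \cdots & h_2 & 0 & \cdots \\
0 & h_1 & 0 & \cdots & 0 & h_2 & \cdots \\
0 & 0 & h_1 & \cdots & 0 & 0 & \cdots \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \ddots
\end{bmatrix}. \tag{4.12}
\]

Proof. See Appendix 4.8.1.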

Thus from the above result and lemma 5, it is clear that the MIMO-FIR impulse response of the channel can be estimated at most up to an indeterminacy of $t^2$ real parameters from the statistical information. Thus, blind information alone is not sufficient for the estimation of the MIMO channel. At the same time, the above result has significant implications for estimation accuracy. As $r$, $L_h$ increase, the number of real parameters ($2rtL_h$) in the system that need to be identified increases many fold, but the number of parameters that cannot be identified from blind symbols may be as small as $t^2$, implying that most of the channel information can be identified from the blind symbols without any need for pilots. Continuing, we derive properties of the semi-blind (pilot + blind) FIM $J_{\theta_H}$ defined in (4.5).

4.3.3 Pilots and FIM

Recall that $x_p(1), x_p(2), \ldots, x_p(L_p)$ are the $L_p$ transmitted pilot symbols. Then, the FIM of the pilot symbols $J^p$ is given as $J^p = \sum_{i=1}^{L_p} J^p(i)$, where $J^p(i)$ is the FIM contribution from the $i$th pilot symbol transmission. Given complex vectors in $\mathbb{C}^{t \times 1}$, let the matrix function $V(i,j): (\mathbb{C}^{t \times 1}, \mathbb{C}^{t \times 1}) \to \mathbb{C}^{2rt \times 2rt}$ be defined as,

\[
{}_{i}V_{j} \triangleq \begin{bmatrix} x_p(i)\, x_p(j)^H \otimes I_r & 0 \\ 0 & x_p(i)^{*} x_p(j)^T \otimes I_r \end{bmatrix}, \quad \text{if } i, j > 0, \tag{4.13}
\]

and ${}_{i}V_{j} = 0_{2rt \times 2rt}$ if $i \leq 0$ or $j \leq 0$. After some manipulations, it can be shown that the FIM contribution $J^p(i) \in \mathbb{C}^{2rtL_h \times 2rtL_h}$ is given as,

\[
J^p(i) = \frac{1}{\sigma_n^2} \begin{bmatrix}
{}_{i}V_{i} & {}_{i}V_{i-1} & \cdots & {}_{i}V_{i-L_h+1} \\
{}_{i-1}V_{i} & {}_{i-1}V_{i-1} & \cdots & {}_{i-1}V_{i-L_h+1} \\
\vdots & \vdots & \ddots & \vdots \\
{}_{i-L_h+1}V_{i} & {}_{i-L_h+1}V_{i-1} & \cdots & {}_{i-L_h+1}V_{i-L_h+1}
\end{bmatrix}. \tag{4.14}
\]

The following result bounds the rank of the semi-blind (pilot + blind) FIM $J_{\theta_H}$.

Theorem 3. Let $L_p \leq t$ pilot symbols $x_p(1), x_p(2), \ldots, x_p(L_p)$ be transmitted and the matrix $H(0)$ be full column rank as assumed above. A rank upper bound of the sum (pilot + blind) FIM $J_{\theta_H}$ defined in (4.5) above is given as,

\[
\mathrm{rank}(J_{\theta_H}) \leq 2rtL_h - t^2 + \left( 2tL_p - L_p^2 \right), \quad 0 \leq L_p \leq t, \tag{4.15}
\]

or in other words, a lower bound on the rank deficiency is given as $t^2 - (2tL_p - L_p^2) = (t - L_p)^2$.


The above result gives an expression for the rank upper bound of the

MIMO-FIR Fisher information matrix for each transmitted pilot symbol. Since

identifiability requires a full rank FIM, it thus presents a key insight into the

number of pilot symbols needed for identifiability of the MIMO FIR system as

shown next.


4.3.4 Pilots and Identifiability

From the above result, one can obtain a lower bound for the minimum

number of pilot symbols necessary to achieve regularity or a full rank FIM JθH .

This result is stated below.

Lemma 6. The number of pilot symbol transmissions $L_p$ should at least equal the number of transmit antennas $t$ for the FIM $J_{\theta_H}$ to be full rank and hence the MIMO-FIR system in (4.1) to be identifiable.

Proof. It is easy to see from (4.15) that for $L_p < t$,

\[
\mathrm{rank}(J_{\theta_H}) \leq 2rtL_h - (t - L_p)^2 < 2rtL_h,
\]

i.e. strictly less than full rank. As the number of pilot symbols increases, for $L_p = t$, $\mathrm{rank}(J_{\theta_H}) \leq 2rtL_h$, where $2rtL_h$ is the dimension of $J_{\theta_H}$ and therefore represents full rank. Hence, at least $t$ pilot symbols are necessary for the identifiability of the MIMO-FIR wireless channel.
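The bound in (4.15) is easy to tabulate. The following sketch (the function name is hypothetical) lists the lower bound $(t - L_p)^2$ on the rank deficiency for $t = 5$ transmit antennas as the number of pilot symbols grows from $0$ to $t$; the deficiency can reach zero only at $L_p = t$.

```python
def rank_deficiency_lower_bound(t, Lp):
    """Lower bound t^2 - (2*t*Lp - Lp^2) = (t - Lp)^2 from (4.15), stated for 0 <= Lp <= t."""
    if not 0 <= Lp <= t:
        raise ValueError("the bound in (4.15) is stated only for 0 <= Lp <= t")
    return t * t - (2 * t * Lp - Lp * Lp)

t = 5
deficiency = [rank_deficiency_lower_bound(t, Lp) for Lp in range(t + 1)]
# deficiency == [25, 16, 9, 4, 1, 0]
```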

Thus at least $t$ pilot symbols are necessary for the system to become identifiable. This seemingly counter intuitive result implies that the use of statistical information does not necessarily reduce the minimum number of pilot symbols required for identifiability, which is equal to $t$ symbols even for pilot based estimation. However, one has to observe that in the case of semi-blind estimation potentially far fewer parameters ($t^2$) need to be estimated from the pilots. Hence, even though semi-blind schemes necessitate the transmission of a similar minimum number of pilot symbols, the accuracy of estimation of such a scheme can be higher owing to the fact that fewer parameters are estimated. Exactly how much improvement one can expect by using such a scheme is quantified in the next section, where we present results about the asymptotic performance of the SB estimator using the above FIM framework.


4.4 Semi-Blind Estimation: Performance

In this section we demonstrate and quantify the advantage of employing semi-blind estimation as compared to pilot based estimation. For this purpose, let the MIMO transfer function of the FIR channel be defined as $H(z) = \sum_{i=0}^{L_h - 1} H(i)\, z^{-i}$. Let $H(z)$ satisfy:

A.1 $H(z)$ is irreducible, i.e. $H(z)$ has full column rank for all $z \neq 0$ (including $z = \infty$). It follows that if $H(z)$ is irreducible, the leading coefficient matrix $[h_1(0), h_2(0), \ldots, h_t(0)]$ has full column rank (substitute $z = \infty$ in $H(z)$).

A.2 $H(z)$ is column reduced, i.e. the trailing coefficient matrix $H(L_h - 1) = [h_1(L_h - 1), h_2(L_h - 1), \ldots, h_t(L_h - 1)]$ has full column rank.

The above assumptions are mild in nature and usually satisfied with very high probability by wireless channel matrices arising from random fading coefficients. For a discussion about special scenarios where the above conditions are not satisfied the reader is referred to [50,51]. Under the assumptions above, it is known [46] that $H(z)$ can be identified up to a constant $t \times t$ unitary matrix from second-order statistical information. Based on the above observations we conjecture that if the MIMO transfer function $H(z)$ satisfies A.1 and A.2 described above, the rank of the blind MIMO Fisher information matrix $J^b$ is given as

\[
\mathrm{rank}(J^b) = 2rtL_h - t^2, \tag{4.16}
\]

i.e. the upper bound on the rank of the blind FIM J b in theorem 2 holds with

equality. A proof of the above statement has not been obtained, however this

has been extensively observed in our simulations. It is further conjectured that

in the case of fading wireless channels where the channel coefficients are random

quantities, the above result holds with probability 1 (see section 4.6). Therefore,

for the purposes of this section we assume that the result holds.


The above result implies that at most $t^2$ of the $2rtL_h$ parameters of the MIMO-FIR system are unidentifiable from statistical information. This has significant implications for semi-blind schemes, which leverage statistical information to potentially achieve estimation gains. In the next section, through an analysis of the asymptotic FIM, we quantify the improvement in performance that can result from a semi-blind scheme.

4.4.1 Asymptotic Semi-Blind FIM

Consider now the asymptotic performance of the semi-blind scheme from an FIM perspective. As the amount of blind information increases ($L_b \to \infty$), the variance of estimation of the blind information (for instance the covariance matrices) progressively decreases to zero, implying that the blindly identifiable parameters (such as the whitening matrix) can be estimated accurately. Thus, the SB estimation problem reduces to the constrained estimation problem of the $t^2$ blindly un-identifiable parameters. The CRB for such an estimation scheme is given by the Complex-Constrained CRB as illustrated in [55]. The next result demonstrates that the SB-CRB indeed converges to the complex-constrained CRB (CC-CRB) as the amount of blind information increases. Hence, the limiting MSE is equal to the MSE for the complex constrained estimation of the $t^2$ blindly un-identifiable parameters. SB techniques can therefore yield a far lower MSE of estimation than an exclusively pilot based scheme, as illustrated by the following result.

Theorem 4. Let $J^p = \frac{L_p}{\sigma_n^2} I_{2rtL_h}$, which is achieved by an orthogonal pilot matrix.² Then, as the number of blind symbol transmissions increases ($L_b \to \infty$), the semi-blind CRB $J_{\theta_H}^{-1}$ approaches the CRB for the exclusive estimation of the $t^2$ blindly unidentifiable parameters. Further, the trace of the CRB matrix converges to,

\[
\lim_{L_b \to \infty} \mathrm{E}\left\{ \left\| \hat{H} - H \right\|_F^2 \right\} \geq \frac{1}{2} \mathrm{tr}\left( J_{\theta_H}^{-1} \right) = \left( \frac{\sigma_n^2}{2L_p} \right) t^2, \tag{4.17}
\]

which depends only on $t^2$, the number of blindly unidentifiable parameters.

² The construction of such an orthogonal pilot matrix $X_p^o$ is shown later in section 4.5.2.

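Proof. Given the fact that $J^p = \frac{L_p}{\sigma_n^2} I_{2rtL_h}$, the semi-blind FIM can be expressed as

\[
J_{\theta_H} = \frac{L_p}{\sigma_n^2} I_{2rtL_h \times 2rtL_h} + L_b \tilde{J}^b,
\]

where $\tilde{J}^b$ is the blind FIM corresponding to a single observed blind symbol block $\mathcal{Y}$ and is given as $\tilde{J}^b \triangleq J^b / L_b$, where the blind FIM $J^b$ is defined in (4.11). From (4.16), it can be seen that $\tilde{J}^b$ is rank deficient and $\mathrm{rank}(\tilde{J}^b) = \mathrm{rank}(J^b) = 2rtL_h - t^2$. Let the eigen-decomposition of $\tilde{J}^b$ be given as $\tilde{J}^b = E_b \Lambda_b E_b^H$, where $\Lambda_b \in \mathbb{C}^{(2rtL_h - t^2) \times (2rtL_h - t^2)}$ is a diagonal matrix. Then,

\[
J(\theta_H) = \frac{L_p}{\sigma_n^2} \left[ E_b^{\perp}, E_b \right] \left[ E_b^{\perp}, E_b \right]^H + L_b E_b \Lambda_b E_b^H = \left[ E_b, E_b^{\perp} \right] \begin{bmatrix} \frac{L_p}{\sigma_n^2} I + L_b \Lambda_b & 0 \\ 0 & \frac{L_p}{\sigma_n^2} I \end{bmatrix} \left[ E_b, E_b^{\perp} \right]^H.
\]

Hence the CRB $J^{-1}(\theta_H)$ is given as,

\[
J^{-1}(\theta_H) = \left[ E_b, E_b^{\perp} \right] \begin{bmatrix} \left( \frac{L_p}{\sigma_n^2} I + L_b \Lambda_b \right)^{-1} & 0 \\ 0 & \frac{\sigma_n^2}{L_p} I \end{bmatrix} \left[ E_b, E_b^{\perp} \right]^H.
\]

As the number of blind symbols $L_b \to \infty$, the diagonal matrix $\left( \frac{L_p}{\sigma_n^2} I + L_b \Lambda_b \right)^{-1} \to 0_{(2rtL_h - t^2) \times (2rtL_h - t^2)}$ in the above expression. Thus the semi-blind bound approaches the complex constrained Cramer-Rao bound (CC-CRB) [55] given as,

\[
\lim_{L_b \to \infty} J^{-1}(\theta_H) = \frac{\sigma_n^2}{L_p} E_b^{\perp} E_b^{\perp H}.
\]

In fact, the bound on the MSE is clearly seen to be given as,

\[
\begin{aligned}
\mathrm{E}\left\{ \left\| \hat{\theta}_H - \theta_H \right\|_F^2 \right\} &\geq \frac{\sigma_n^2}{L_p} \mathrm{tr}\left( E_b^{\perp} E_b^{\perp H} \right) \\
\Rightarrow\; 2\, \mathrm{E}\left\{ \left\| \hat{H} - H \right\|_F^2 \right\} &\geq \frac{\sigma_n^2}{L_p} \mathrm{tr}\left( E_b^{\perp} E_b^{\perp H} \right) \\
\mathrm{E}\left\{ \left\| \hat{H} - H \right\|_F^2 \right\} &\geq \frac{1}{2} \frac{\sigma_n^2}{L_p} \left( 2rtL_h - \left( 2rtL_h - t^2 \right) \right) = \left( \frac{\sigma_n^2}{2L_p} \right) t^2,
\end{aligned}
\]

which is the constrained bound for the estimation of the MIMO channel matrix $H$.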
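The limiting argument of the proof can be illustrated with a small numerical sketch. The matrix `J_blind` below is a synthetic stand-in (a random Hermitian positive semi-definite matrix of rank $n - d$) and not the actual blind FIM of (4.11); the dimensions, noise variance and pilot length are likewise assumed values. The sketch only shows that the trace of the inverse of $\frac{L_p}{\sigma_n^2} I + L_b \tilde{J}^b$ decreases towards $\frac{\sigma_n^2}{L_p} d$ as $L_b$ grows, where $d$ plays the role of $t^2$.

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 16, 4            # n stands for 2*r*t*Lh, d for the t^2 unidentifiable parameters
sigma_n2, Lp = 0.5, 8   # assumed noise variance and pilot length

# Synthetic stand-in for the single-block blind FIM: Hermitian PSD with rank n - d.
A = rng.standard_normal((n, n - d)) + 1j * rng.standard_normal((n, n - d))
J_blind = A @ A.conj().T

def crb_trace(Lb):
    """Trace of the inverse of the semi-blind FIM for Lb blind blocks."""
    J = (Lp / sigma_n2) * np.eye(n) + Lb * J_blind
    return np.trace(np.linalg.inv(J)).real

limit = (sigma_n2 / Lp) * d   # trace of the limiting CRB
traces = [crb_trace(Lb) for Lb in (1, 1e2, 1e4, 1e8)]
```

Halving this limiting trace, as in (4.17), gives the bound on the channel MSE itself.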

Thus the bound for the MSE of estimation, and hence the asymptotic MSE of the maximum-likelihood estimate of the channel matrix $H$ with the aid of blind information, is directly proportional to $t^2$. The MSE of least-squares estimation exclusively using pilot symbols is given as $\frac{1}{2} \mathrm{tr}\left( (J^p)^{-1} \right) = \left( \frac{\sigma_n^2}{2L_p} \right) 2rtL_h$ and is proportional to $2rtL_h$, the total number of real parameters. Hence, the SB estimate, which has an asymptotic MSE lower by a factor of $2\left( \frac{r}{t} \right) L_h$, can potentially be very efficient compared to exclusively pilot based channel estimation schemes. For instance, in a MIMO system with $r = 4$, $t = 2$ and $L_h = 2$ channel taps, the potential reduction in MSE by employing a semi-blind estimation procedure is $2\left( \frac{r}{t} \right) L_h = 8$, i.e. approximately 9 dB, as demonstrated in section 4.6. Thus the SB estimation scheme can result in significantly lower MSE.
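The gain quoted above follows directly from the ratio of the two parameter counts. The sketch below (the function name is hypothetical) computes the ratio of the pilot-only MSE, proportional to $2rtL_h$, to the asymptotic semi-blind MSE, proportional to $t^2$, and expresses it in dB.

```python
import math

def semi_blind_gain_db(r, t, Lh):
    """Ratio of pilot-only MSE (prop. to 2*r*t*Lh) to asymptotic semi-blind MSE (prop. to t^2), in dB."""
    ratio = (2 * r * t * Lh) / (t * t)  # equals 2 * (r / t) * Lh
    return 10 * math.log10(ratio)

gain = semi_blind_gain_db(r=4, t=2, Lh=2)  # ratio 8, roughly 9 dB
```

The gain grows with the number of receive antennas and channel taps, and shrinks as the number of transmit antennas increases.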

4.5 Semi-blind Estimation: Algorithm

As shown above, the SB problem involves identifying $t^2$ parameters from the pilot data. These $t^2$ parameters correspond to a unitary matrix. More precisely, let $H(z) \in \mathbb{C}^{r \times t}(z)$ be the $r \times t$ irreducible channel transfer matrix. Let the input output system model be as shown in section 4.2. Then, $H(z)$ can be identified up to a unitary matrix from the output second order statistics of data. The matrix $H \in \mathbb{C}^{r \times L_h t}$ can be expressed as

\[
H = W \left( I_{L_h} \otimes Q^H \right), \quad \text{where} \quad W \triangleq [W(0), W(1), \ldots, W(L_h - 1)].
\]

From the above result, the matrices $W(i)$, $i \in 0, L_h - 1$ can be estimated blind from the correlation lags $R_y(j)$, $j \in 0, L_h - 1$. In the flat fading channel case ($L_h = 1$), this can be done by a simple Cholesky decomposition of the instantaneous output correlation matrix $R_y(0)$. However, for the case of frequency selective channels, estimating the matrices $W(i)$ is more involved and a scheme based on designing multiple delay linear predictors is given in [50] (set $n_a = 0$, $d = n_b = L_h - 1$ and it follows that $F_i = W(i)$). It thus remains to compute the unitary matrix $Q \in \mathbb{C}^{t \times t}$, i.e. $Q$ is such that $QQ^H = Q^HQ = I$, and $Q$ has very few parameters ($t^2$ real parameters, [55]). In the next section, we present algorithms for the estimation of this unitary $Q$ indeterminacy from the transmitted pilot symbols.

4.5.1 Orthogonal Pilot ML (OPML) for Q Estimation

We now describe a procedure to estimate the unitary matrix $Q$ from an orthogonal pilot symbol sequence $X_p$. Let $X_p(i)$, $i \in 0, L_h - 1$ be defined as $X_p(i) \triangleq [x_p(L_h - i), x_p(L_h - i + 1), \ldots, x_p(L_p - i)]$. Let the pilot matrix $X_p^o$ be defined as,

\[
X_p^o \triangleq \begin{bmatrix} X_p(0) \\ X_p(1) \\ \vdots \\ X_p(L_h - 1) \end{bmatrix}.
\]

We choose a different structure for this pilot matrix as compared to $X_p$ in (4.2) to enable the construction of an orthogonal pilot matrix as shown later. The least squares cost function for the constrained estimation of the unitary matrix $Q$ can then be written as,

\[
\left\| Y_p - \sum_{i=0}^{L_h - 1} W(i)\, Q^H X_p(i) \right\|^2, \quad \text{subject to } QQ^H = I_t.
\]

Let the pilot matrix $X_p^o$ be orthogonal, i.e. $X_p^o \left( X_p^o \right)^H = L_p I_{L_h t}$. The cost minimizing $\hat{Q}$ is then given as,

\[
\hat{Q} = UV^H, \quad \text{where} \quad U \Sigma V^H = \mathrm{SVD}\left( \sum_{i=0}^{L_h - 1} X_p(i)\, Y_p^H W(i) \right).
\]

Proof. Follows from an extension of the result in [55].

Finally, $\hat{H}$ is given as $\hat{H} \triangleq W \left( I_{L_h} \otimes \hat{Q}^H \right)$. It now remains to demonstrate a scheme to construct such an orthogonal pilot matrix $X_p^o$, which is treated next.
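The closed-form solution for $Q$ can be sketched for the flat-fading special case $L_h = 1$, where the sum reduces to the single term $X_p Y_p^H W$. The code below is an illustration under assumed conditions, not the estimator as implemented for the thesis: the whitening matrix `W` is a random stand-in that is taken as perfectly known, the pilots are two rows of a $4 \times 4$ Hadamard matrix, and the received pilots are noise-free so that the estimate can be compared with the true $Q$.

```python
import numpy as np

def estimate_unitary(W, Xp, Yp):
    """Unitary Q minimising ||Yp - W Q^H Xp||_F for orthogonal pilots (flat fading, Lh = 1)."""
    M = Xp @ Yp.conj().T @ W      # t x t matrix X_p Y_p^H W
    U, _, Vh = np.linalg.svd(M)   # M = U diag(s) Vh
    return U @ Vh                 # Q = U V^H

rng = np.random.default_rng(4)
r, t, Lp = 4, 2, 4
W = rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))  # stand-in whitening matrix
Q, _ = np.linalg.qr(rng.standard_normal((t, t)) + 1j * rng.standard_normal((t, t)))  # random unitary
H = W @ Q.conj().T

# Orthogonal pilots: two rows of a 4 x 4 Hadamard matrix, so Xp Xp^H = Lp * I.
Xp = np.array([[1, 1, 1, 1], [1, -1, 1, -1]], dtype=complex)
Yp = H @ Xp                       # noise-free received pilot block
Q_hat = estimate_unitary(W, Xp, Yp)
H_hat = W @ Q_hat.conj().T
```

In the noise-free case the matrix passed to the SVD equals $L_p\, Q\, W^H W$, whose unitary polar factor is exactly $Q$, so the estimate is exact; with noise the same steps return the constrained least-squares fit.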


Figure 4.3: Paley Hadamard Matrix

4.5.2 Orthogonal Pilot Matrix Construction

An orthogonal pilot in the context of MIMO FIR channels can be constructed by employing the Paley Hadamard (PH) orthogonal matrix structure shown in fig. 4.3, and such a scheme has been described in [60]. The PH matrix has blocks of shifted orthogonal rows (illustrated with the aid of rectangular boundaries), thus giving it the block Sylvester structure. Each transmit stream of orthogonal pilots for the FIR system can be constructed by considering the 'L' shaped block shown in the figure and removing the prefix of $L_h$ (channel length) symbols at the receiver. Thus, the modified pilot matrix (obtained by removing the initial $L_h - 1$ columns corresponding to zero transmissions in (4.2)) for orthogonal pilots $X_p^o$ is given as,

\[
X_p^o \triangleq \begin{bmatrix} x_p(L_h) & x_p(L_h + 1) & \cdots & x_p(L_p) \\ x_p(L_h - 1) & x_p(L_h) & \cdots & x_p(L_p - 1) \\ \vdots & \vdots & \ddots & \vdots \\ x_p(1) & x_p(2) & \cdots & x_p(L_p - L_h + 1) \end{bmatrix}. \tag{4.18}
\]

[Figure 4.4 plot: FIM rank deficiency vs. pilot length ($L_p$), $t = 5$.]

Figure 4.4: Rank deficiency of the complex MIMO FIM vs. number of transmitted pilot symbols ($L_p$) for a 6 × 5 MIMO FIR system of length $L_h = 5$.

Orthogonal pilots have been shown to be optimally suited for MIMO channel estimation in studies such as [39, 40]. However, the Sylvester structure of FIR pilot matrices further constrains the set of orthogonal pilot symbol streams compared to flat-fading channels. As the number of channel taps increases, employing a PH matrix to construct an orthogonal pilot symbol stream implies choosing a PH matrix with a large dimension. This in turn implies an increase in the length of the pilot symbol sequence and hence a larger overhead in communication. This problem can be alleviated by employing a non-orthogonal pilot symbol sequence, which results in slightly suboptimal estimation performance but enables the designer to choose any pilot length desired. The IGML scheme for channel estimation using non-orthogonal pilots is described in [55] for flat-fading channels and can be extended to FIR channels in a straightforward manner. Experimental results have shown its performance to be comparable to the scheme that employs orthogonal pilots.
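To make the constraint on admissible pilot lengths concrete, the following sketch builds a Hadamard matrix of order $q + 1$ using the standard Paley type-I construction for a prime $q \equiv 3 \pmod 4$. This is an assumption about the underlying construction: it is the textbook Paley matrix, and it is not claimed to reproduce the exact row arrangement of Fig. 4.3 or the scheme in [60]. The function name `paley_hadamard` is hypothetical.

```python
import numpy as np

def paley_hadamard(q):
    """Paley type-I Hadamard matrix of order q + 1, for a prime q with q % 4 == 3."""
    is_prime = q > 2 and all(q % d for d in range(2, int(q ** 0.5) + 1))
    if not is_prime or q % 4 != 3:
        raise ValueError("q must be a prime congruent to 3 mod 4")
    # Quadratic character chi(a): 0 for a = 0, +1 for residues, -1 otherwise.
    residues = {(k * k) % q for k in range(1, q)}
    chi = np.array([0] + [1 if a in residues else -1 for a in range(1, q)])
    idx = np.arange(q)
    jacobsthal = chi[(idx[:, None] - idx[None, :]) % q]   # circulant, skew-symmetric
    S = np.zeros((q + 1, q + 1), dtype=int)
    S[0, 1:] = 1
    S[1:, 0] = -1
    S[1:, 1:] = jacobsthal
    return np.eye(q + 1, dtype=int) + S

for q in (11, 19, 47, 67, 139):
    H = paley_hadamard(q)
    ok = np.array_equal(H @ H.T, (q + 1) * np.eye(q + 1, dtype=int))
    print(q + 1, ok)
```

Under this construction the available orders are $q + 1$ for suitable primes $q$, which is consistent with pilot lengths such as 12 and 20; the circulant core of the matrix is what supplies rows that are shifts of one another.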


In this section we present results of computer simulation experiments to illustrate the salient aspects of the work described above. In a majority of our simulations below we consider a 4 × 2 MIMO FIR channel with 2 taps, i.e. $L_h = 2$, $r = 4$ and $t = 2$. Each of the elements of $H$ is generated as a zero-mean circularly symmetric complex Gaussian random variable of unit variance, i.e. a Rayleigh fading wireless channel. The orthogonal pilot sequence is constructed from Paley Hadamard matrices by employing the scheme in Section 4.5.2. The transmitted symbols, both pilot and blind (data), are assumed to be drawn from a quadrature phase shift keying (QPSK) symbol constellation [3].

Experiment 1: In Fig. 4.4 we plot the rank deficiency of the FIM of a 6 × 5 MIMO FIR system ($r = 6$, $t = 5$) with $L_h = 5$ channel taps as a function of the number of transmitted pilot symbols $L_p$. The rank was computed for 100 realizations of randomly generated Rayleigh fading MIMO channels and the rank deficiency observed was precisely [25, 16, 9, 4, 1, 0] for $L_p$ = [0, 1, 2, 3, 4, 5] transmitted pilot symbols respectively. Hence, the rank deficiency of 25 for $L_p = 0$ verifies that the result in (4.16) holds with overwhelming probability in the case of randomly generated MIMO channels. Further, for $1 \le L_p \le 5$, the rank deficiency is given as $(5 - L_p)^2$, which additionally verifies that the bound in (4.15) for FIM rank deficiency as a function of the number of pilot symbols holds with equality with high probability.
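The observed sequence can be checked against the bound with a few lines of arithmetic. The sketch below assumes that the rank deficiency bound takes the form $t^2 - (2tL_p - L_p^2) = (t - L_p)^2$ for $L_p \le t$, as suggested by the rank inequality derived in the appendix; the clamp to zero for $L_p > t$ and the function name are illustrative assumptions.

```python
def fim_rank_deficiency_bound(t, Lp):
    """Assumed lower bound on FIM rank deficiency: (t - Lp)^2 for Lp <= t, else 0."""
    return (t - Lp) ** 2 if Lp <= t else 0

deficiencies = [fim_rank_deficiency_bound(5, Lp) for Lp in range(6)]
print(deficiencies)  # [25, 16, 9, 4, 1, 0]
```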

Experiment 2: In Fig. 4.5 we plot the MSE vs. SNR when the whitening matrix $W(z)$ is estimated from $NL_b$ = 1000, 5000 blind received symbols employing the linear prediction based scheme from [50]. The $Q$ matrix is estimated from $L_p = 20$ orthogonal pilot symbols employing the semi-blind scheme in Section 4.5.1. For comparison we also plot the MSE when $H$ is estimated exclusively from training using least-squares [15, 55], the asymptotic complex constrained CRB (CC-CRB) given by (4.17), and the MSE of estimation with the genie assisted case of perfect knowledge of $W(z)$. It can be observed that the MSE progressively decreases towards the complex constrained CRB as the number of blind symbols increases. Also observe that, as established in Theorem 4, the asymptotic SB estimation error is $10 \log_{10}\left(\frac{32}{4}\right) \approx 9$ dB lower than that of the pilot based scheme, as illustrated in Section 4.4.1.
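The 9 dB figure can be reproduced from parameter counts alone. The sketch below assumes that the asymptotic MSE ratio equals the ratio of the number of real parameters in the unconstrained channel ($2rtL_h$) to that in the unitary matrix $Q$ ($t^2$), which is our reading of the ratio 32/4 for the $4 \times 2$, $L_h = 2$ system; the function name is hypothetical.

```python
import math

def asymptotic_sb_gain_db(r, t, Lh):
    """Assumed asymptotic semi-blind gain: ratio of real parameter counts, in dB."""
    unconstrained = 2 * r * t * Lh   # real parameters in the full FIR channel
    constrained = t * t              # real parameters in a t x t unitary matrix
    return 10 * math.log10(unconstrained / constrained)

gain_db = asymptotic_sb_gain_db(r=4, t=2, Lh=2)
print(round(gain_db, 2))  # 9.03
```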

[Figure 4.5 plot: MSE vs. SNR (dB), $L_h = 2$, $r \times t = 4 \times 2$, $L_p = 20$; legend entries: Semi-Blind, $NL_b = 1000$, $NL_b = 5000$, Training, Asymp CRB.]

Figure 4.5: MSE vs. SNR in a 4 × 2 MIMO channel with $L_h = 2$ channel taps, $L_p = 20$ pilot symbols.

In Fig. 4.6 (left) we plot the MSE performance of the competing estimation schemes above for different transmitted pilot symbol lengths $L_p$ and 5000 transmitted QPSK data symbols (blind received symbols). As described in Section 4.5.2, we employ Paley-Hadamard matrices to construct the orthogonal pilot sequences. Since such matrices exist only for certain lengths $L_p$, we plot the performance for $L_p$ = 12, 20, 48, 68, 140 pilot symbols. The asymptotic semi-blind performance is 9 dB lower in MSE, as seen above. Also, for a given number of blind symbols, the gap in MSE between the semi-blind scheme with $W(z)$ estimation and the training scheme progressively decreases as $L_p$ increases. This is due to the fact that more blind symbols are required to accurately estimate the whitening matrix $W(z)$ for the MSE performance of the semi-blind scheme to be commensurate with the performance improvement of the pilot based scheme. Finally, in Fig. 4.6 (right) we plot the performance of the competing schemes for different numbers of blind symbols in the range 1000-5000 QPSK symbols and $L_p = 12$ pilot symbols. The performance of the SB scheme with $W(z)$ estimated can be seen to progressively improve as the number of received blind symbols increases.

[Figure 4.6 plots. Left: MSE vs. pilot length ($L_p$), $L_h = 2$, $r \times t = 4 \times 2$, SNR = 5 dB; legend entries: Semi-Blind, $NL_b = 5000$, Training, Asymp CRB. Right: MSE vs. $NL_b$ (number of blind symbols), SNR = 5 dB, $L_h = 2$, $r \times t = 4 \times 2$, $L_p = 20$; legend entries: Semi-Blind, Imperfect $W(z)$, Training, Asymp CRB.]

Figure 4.6: MSE performance for estimation of a 4× 2 MIMO frequency-selective

channel. Left- MSE Vs. Lp and Right - MSE Vs. number of blind symbols.

Experiment 3: We compare the symbol error rate (SER) performance of the training and semi-blind channel estimation schemes. At the receiver, we employ a stacking as in (4.3) of 7 received symbol vectors $y(k)$ followed by a MIMO minimum mean-square error (MMSE) detector [4] constructed from the MIMO channel matrix $H$. In Fig. 4.7 we plot the SER of detection of the transmitted QPSK symbols vs. SNR in the range 2-16 dB. It can be seen that the asymptotic semi-blind estimator has a 1-2 dB improvement in detection performance over the exclusive training based scheme. The SER drops from around $1 \times 10^{-1}$ at 2 dB to $1 \times 10^{-8}$ at 16 dB. Thus, an SB based estimation scheme can potentially yield significant throughput gains when employed for the estimation of the wireless MIMO frequency-selective channel.

[Figure 4.7 plot: SER vs. SNR, $L_h = 2$, $r \times t = 4 \times 2$; legend entries: Training, Asymp SB.]

Figure 4.7: Symbol error rate (SER) Vs. SNR for QPSK symbol transmission of

a 4 × 2 MIMO frequency selective channel with Lh = 2 channel taps.

4.7 Conclusion

In this work we have investigated the rank properties of the semi-blind

FIM of a MIMO FIR channel and demonstrated that at least t pilot symbol trans-

missions are necessary to achieve a full rank FIM (and hence identifiability) for

an Lh tap r × t (r > t) channel. Under certain mild conditions, the MIMO

channel transfer function $H(z)$ can be decomposed as $H(z) = W(z)Q^H$, where

the whitening transfer function W (z) can be estimated from the blind symbols

alone. A constrained semi-blind estimation scheme has been presented to estimate

the unitary matrix Q from pilot symbols along with an algorithm to achieve an

orthogonal pilot matrix structure for MIMO frequency selective channels using

Paley-Hadamard matrices. Simulation results demonstrate the performance of the

proposed scheme.


Acknowledgement


in A. K. Jagannatham and B. D. Rao, “FIM Regularity for Gaussian Semi-Blind MIMO FIR Channel Estimation”, Conference Record of the Thirty-Ninth Asilomar Conference on Signals, Systems and Computers, Oct. 28 - Nov. 1, 2005, Pages: 848 - 852.



4.8.1 Proof of Theorem 2

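Proof. We illustrate the result for the simpler case of the flat fading channel, i.e. $L_h = 1$. Then, $\mathcal{H} = H(0) = H \in \mathbb{C}^{r \times t}$. Let $\Phi \triangleq \left(HH^H + \sigma_n^2 I\right)^{-1}$. Let the blind FIM $J^b$ be block partitioned as

\[ J^b \triangleq \begin{bmatrix} J^b_{11} & J^b_{12} \\ J^b_{21} & J^b_{22} \end{bmatrix}. \tag{4.19} \]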

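It can be verified from (4.11) that $J^b_{21} = (J^b_{12})^H$ and $J^b_{22} = (J^b_{11})^T$. The block components of the FIM are given as

\[ J^b_{11} = \begin{bmatrix} (h_1^H \Phi h_1)\,\Phi^T & \cdots & (h_1^H \Phi h_t)\,\Phi^T \\ \vdots & \ddots & \vdots \\ (h_t^H \Phi h_1)\,\Phi^T & \cdots & (h_t^H \Phi h_t)\,\Phi^T \end{bmatrix}, \]

whose first row, for instance, reads $h_1^H \Phi h_1 \Phi_{11},\ h_1^H \Phi h_1 \Phi_{21},\ \ldots,\ h_1^H \Phi h_t \Phi_{r1}$, and which can be written succinctly as $\left(H^H \Phi H\right) \otimes \Phi^T$. Similarly, $J^b_{12}$ is given as

\[ J^b_{12} = \begin{bmatrix}
\chi_{11}\chi_{11} & \chi_{12}\chi_{11} & \cdots & \chi_{11}\chi_{21} & \cdots & \chi_{1r}\chi_{t1} \\
\chi_{11}\chi_{12} & \chi_{12}\chi_{12} & \cdots & \chi_{11}\chi_{22} & \cdots & \chi_{1r}\chi_{t2} \\
\vdots & \vdots & & \vdots & & \vdots \\
\chi_{11}\chi_{1r} & \chi_{12}\chi_{1r} & \cdots & \chi_{11}\chi_{2r} & \cdots & \chi_{1r}\chi_{tr} \\
\chi_{21}\chi_{11} & \chi_{22}\chi_{11} & \cdots & \chi_{21}\chi_{21} & \cdots & \chi_{2r}\chi_{t1} \\
\vdots & \vdots & & \vdots & & \vdots \\
\chi_{t1}\chi_{1r} & \chi_{t2}\chi_{1r} & \cdots & \chi_{t1}\chi_{2r} & \cdots & \chi_{tr}\chi_{tr}
\end{bmatrix}, \]

that is, the entry in row $(a-1)r + k$ and column $(b-1)r + l$ is $\chi_{al}\chi_{bk}$ for $1 \le a, b \le t$ and $1 \le k, l \le r$,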

where $\chi \triangleq H^H \Phi$. It can now be seen that $JU = 0$, where $U$ is as defined in (4.12). For instance, the top $t$ elements of $JU(:, 1)$ (where we employ MATLAB notation and $U(:, 1)$ denotes the first column of $U$) are given as

\[ [J^b_{11}, J^b_{12}]\, U(:, 1) = -\left(h_1^H \Phi h_1\right)\Phi^T h_2^* + \left(h_2^H \Phi\right)^T \left(h_1^H \Phi h_1\right) = 0, \]

and so on. The structure of the FIM for the most general case of arbitrary $L_h$ is complex, but the result can be seen to hold by employing a symbolic manipulation software tool such as the MATLAB symbolic toolbox package.

It can be seen that the top part of the null space basis matrix $U(\mathcal{H})$ is $U(H(0))$. As assumed earlier, $\mathrm{rank}(H(0)) = t$. Now it is easy to see that if $U(\mathcal{H})$ is rank deficient, $U(H(0))$ is rank deficient and, from its structure, $H(0)$ is rank deficient, violating the assumption. Hence $\mathrm{rank}(U(\mathcal{H})) = t^2$.


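Proof. $J^b$ and $J^p(i)$, $1 \le i \le k$, are positive semi-definite (PSD) matrices. We use the following property: if $A, B$ are PSD matrices, $(A + B)v = 0 \Leftrightarrow Av = Bv = 0$. Therefore, $J_{\theta_{\mathcal{H}}} v = 0 \Leftrightarrow J^b v = J^p(i) v = 0_{2rtL_h \times 1}$, $\forall i$, $1 \le i \le k$. In other words,

\[ \mathcal{N}(J) = \mathcal{N}\left(J^b\right) \cap \bigcap_{i=1}^{L_p} \mathcal{N}\left(J^p(i)\right). \]

Let $v \in \mathcal{N}(J)$. Then, from the null space structure of $J^b$ in (4.12), it follows that $v = U(\mathcal{H})\, s$, where $s \in \mathbb{C}^{t^2 \times 1}$. Also,

\[ J^p(i) v = J^p(i)\, U(\mathcal{H})\, s = 0, \quad \forall i,\ 1 \le i \le L_p. \]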

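From Lemma 7, this implies that ${}^{i}V_{i}\, U(H(0))\, s = 0$, $\forall i$, $1 \le i \le L_p$, where ${}^{i}V_{i}$ is as defined in (4.13). Let the matrix $T(u) : \mathbb{C}^{t \times 1} \to \mathbb{C}^{2t \times t^2}$ be defined as

\[ T(u) \triangleq \begin{bmatrix}
0 & -u_1^* & -u_2^* & \cdots \\
-u_1^* & 0 & 0 & \cdots \\
\vdots & \vdots & \vdots & \ddots \\
u_2 & u_1 & 0 & \cdots \\
0 & 0 & u_1 & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{bmatrix}. \tag{4.20} \]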

Recall that $H(0)$ is assumed to have full column rank. Then, from the structure of $U(H(0))$ given in (4.12) it can be shown that the relation above holds if and only if $Ps = 0$, where the matrix $P \in \mathbb{C}^{2L_p t \times t^2}$ is given as

\[ P \triangleq \begin{bmatrix} T(x_p(1)) \\ T(x_p(2)) \\ \vdots \\ T(x_p(L_p)) \end{bmatrix}. \tag{4.21} \]

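It can then be seen that the matrix $G \in \mathbb{C}^{2tL_p \times L_p^2}$ forms a basis for the left nullspace of $P$, i.e. $G^T P = 0$, where

\[ G \triangleq \begin{bmatrix}
x_p(1) & x_p(2) & 0 & \cdots \\
x_p(1)^* & 0 & x_p(2)^* & \cdots \\
0 & 0 & x_p(1) & \cdots \\
0 & x_p(1)^* & 0 & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{bmatrix}. \tag{4.22} \]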

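Thus, $\mathrm{rank}(P) \le 2tL_p - L_p^2$ and therefore the right nullity (or nullity) of $P$ is $\dim(\mathcal{N}(P)) \ge t^2 - \left(2tL_p - L_p^2\right)$. Therefore, $\mathrm{rank}(J_{\theta_{\mathcal{H}}}) \le 2rtL_h - \dim(\mathcal{N}(P)) \le 2rtL_h - t^2 + \left(2tL_p - L_p^2\right)$.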

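Lemma 7. Let $J^p(i) v = 0$, $\forall i$, $1 \le i \le L_p$, where $v = U(\mathcal{H})\, s$. Then, ${}^{i}V_{i}\, U(H(0))\, s = 0$, $\forall i$, $1 \le i \le L_p$, where ${}^{i}V_{i}$ is as defined in (4.13).

Proof. Consider $J^p(1)$, the FIM contribution of the first transmitted pilot symbol. It can be seen clearly that $J^p(1)$ is given as

\[ J^p(1) = \begin{bmatrix} {}^{1}V_{1} & 0_{2rt \times (2L_h - 2)rt} \\ 0_{(2L_h - 2)rt \times 2rt} & 0_{(2L_h - 2)rt \times (2L_h - 2)rt} \end{bmatrix}. \tag{4.23} \]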

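Hence, $J^p(1)\, U(\mathcal{H})\, s = 0$ implies that ${}^{1}V_{1}\, U(H(0))\, s = 0$. Further, from the properties of the matrix Kronecker product we have $AB \otimes CD = (A \otimes C)(B \otimes D)$. Substituting $A = x_p(i)$, $B = x_p(i)^H$, $C = D = I_r$, one can then obtain that

\[ K(x_p(1))\, U(H(0))\, s = 0, \quad \text{where} \quad K(x_p(i)) \triangleq \begin{bmatrix} x_p(i)^H \otimes I_r & 0 \\ 0 & x_p(i)^T \otimes I_r \end{bmatrix}. \tag{4.24} \]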
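The Kronecker mixed-product step used above can be checked numerically. The following sketch verifies $x x^H \otimes I_r = (x \otimes I_r)(x^H \otimes I_r)$ for a random vector; the dimensions and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
t, r = 3, 2   # hypothetical sizes
x = rng.standard_normal((t, 1)) + 1j * rng.standard_normal((t, 1))
I_r = np.eye(r)

# Mixed-product property with A = x, B = x^H, C = D = I_r.
lhs = np.kron(x @ x.conj().T, I_r)
rhs = np.kron(x, I_r) @ np.kron(x.conj().T, I_r)
print(np.allclose(lhs, rhs))  # True
```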

Since $H(0)$ (and hence $H(0)^*$) is full rank, after some manipulation it can be shown that the above condition implies $T(x_p(1))\, s = 0$. Now consider the contribution of the second pilot transmission $J^p(2)$. $J^p(2)\, U(\mathcal{H})\, s = 0$ implies that

\[ K(x_p(2))\, U(H(0))\, s + K(x_p(1))\, U(H(1))\, s = 0. \tag{4.25} \]

Since $T(x_p(1))\, s = 0$ and $U(H(0))$, $U(H(1))$ have the same structure, it can be shown that $K(x_p(1))\, U(H(1))\, s = 0$, and hence it follows from the above equation that $K(x_p(2))\, U(H(0))\, s = 0$, which in turn implies that $T(x_p(2))\, s = 0$ and hence ${}^{2}V_{2}\, U(H(0))\, s = 0$, and so on. This proves the lemma.

5 Semi-Blind Estimation for

Maximum Ratio Transmission

5.1 Introduction

To recapitulate, the standard technique to estimate the channel is to

transmit a sequence of training symbols (also called pilot symbols) at the begin-

ning of each frame. This training symbol sequence is known at the receiver and

thus the channel is estimated from the measured outputs to training symbols.

Training based schemes usually have very low complexity making them ideally

suited for implementation in systems (e.g., mobile stations) where the available

computational capacity is limited.

However, the above training-based technique for channel estimation in

MRT based MIMO systems is transmission scheme agnostic. For example, channel

estimation algorithms when MRT is employed at the transmitter only need to

estimate $v_1$ and $u_1$, where $v_1$ and $u_1$ are the dominant eigenvectors of $H^H H$
and $HH^H$ respectively, $H$ is the $r \times t$ channel transfer matrix, and $r$ / $t$ are the

number of receive / transmit antennas. Hence, techniques that estimate the entire

H matrix from a set of training symbols and use the estimated H to compute v1

and u1 may be inefficient, compared to techniques designed to use the training

data specifically for estimating the beamforming vectors. Moreover, as r increases,

the mean squared error (MSE) in estimation of $v_1 \in \mathbb{C}^t$ remains constant since the number of unknown parameters in $v_1$ does not change with $r$, while that of $H$ increases since the number of elements, $rt$, grows linearly with $r$. Added to this, the

complexity of reliably estimating the channel increases with its dimensionality. The

channel estimation problem is further complicated in MIMO systems because the

SNR per bit required to achieve a given system throughput performance decreases

as the number of antennas is increased. Such low SNR environments call for more

training symbols, lowering the effective data rate.

For the above reasons, semi-blind techniques can enhance the accuracy

of channel estimation by efficiently utilizing not only the known training sym-

bols but also the unknown data symbols. Hence, they can be used to reduce the

amount of training data required to achieve the desired system performance, or

equivalently, achieve better accuracy of estimation for a given number of training

symbols, thereby improving the spectral efficiency and channel throughput. Work

on semi-blind techniques for the design of fractional semi-blind equalizers in multi-

path channels has been reported earlier by Pal in [30,31]. In [28,58] error bounds

and asymptotic properties of blind and semi-blind techniques are analyzed. In

[24, 55], an orthogonal pilot based maximum likelihood (OPML) semi-blind esti-

mation scheme is proposed, where the channel matrix H is factored into the prod-

uct of a whitening matrix W and a unitary rotation matrix Q. W is estimated

from the data using a blind algorithm, while Q is estimated exclusively from the

training data using the OPML algorithm. However, feedback-based transmission

schemes such as MRT pose new challenges for semi-blind estimation, because em-

ployment of the precoder (beamforming vector) corresponding to an erroneous

channel estimate precludes the use of the received data symbols to improve the

channel estimate. This necessitates the development of new transmission schemes

to enable implementation of semi-blind estimation, as shown in Section 5.2.3.

Furthermore, the proposed techniques specifically estimate the MRT beamforming

vector and hence can potentially achieve better estimation accuracy compared to

techniques that are independent of the transmission scheme.

The contributions of this chapter are as follows. We describe the training-


only based conventional least squares estimation (CLSE) algorithm, and derive an-

alytical expressions for the MSE in the beamforming vector, the mean received SNR

and the symbol error rate (SER) performance. For improved spectral efficiency (re-

duced training overhead), we propose a closed-form semi-blind (CFSB) algorithm

that estimates u1 from the data using a blind algorithm, and estimates v1 exclu-

sively from the training. This necessitates the introduction of a new signal trans-

mission scheme that involves transmission of information-bearing spectrally white

data symbols to enable semi-blind estimation of the beamforming vectors. Expres-

sions are derived for the performance of the proposed CFSB scheme. We show that

given perfect knowledge of u1 (which can be achieved when there are a large num-

ber of white data symbols), the error in estimating v1 using the semi-blind scheme

asymptotically achieves the theoretical Cramer-Rao lower bound (CRB), and thus

the CFSB scheme outperforms the CLSE scheme. However, there is a trade-off in

transmission of white data symbols in semi-blind estimation, since the SER for the

white data is frequently greater than that for the beamformed data. Thus, we show

that there exist scenarios where for a reasonable number of white data symbols,

the gains from beamformed data for this improved estimate in CFSB outweigh

the loss in performance due to transmission of white data. As a more general

estimation method when a given number of blind data symbols are available, we

propose a new scheme that judiciously combines the above described CFSB and

CLSE estimates based on a heuristic criterion. Through Monte-Carlo simulations,

we demonstrate that this proposed linearly-combined semi-blind (LCSB) scheme

outperforms the CLSE and CFSB scheme in terms of both estimation accuracy as

well as SER and thus achieves good performance.

The rest of this chapter is organized as follows. In Section 5.2, we present

the problem setup and notation. We also present both the CFSB and CLSE

schemes in detail. The MSE and the received SNR performance of the CLSE

scheme are derived using a first order perturbation analysis in Section 5.3 and the

performance of the CFSB scheme is analyzed in Section 5.4. In Section 5.5, to


conduct an end-to-end system comparison, we derive the performance of Alamouti

space-time coded data with training-based channel estimation, and present the

proposed LCSB algorithm. We compare the different schemes through Monte-

Carlo simulations in Section 5.6 and present our conclusions in Section 5.7.

5.2 Preliminaries

5.2.1 System Model and Notation

Fig. 5.1 shows the MIMO system model with beamforming at the trans-

mitter and the receiver. We model a flat-fading channel by a complex-valued

channel matrix $H \in \mathbb{C}^{r \times t}$. We assume that $H$ is quasi-static and constant over the period of one transmission block. We denote the singular value decomposition (SVD) of $H$ by $H = U\Sigma V^H$, where $\Sigma \in \mathbb{R}^{r \times t}$ contains the singular values $\sigma_1 \ge \sigma_2 \ge \ldots \ge \sigma_m > 0$ along the diagonal and $m = \mathrm{rank}(H)$. Let $v_1$ and $u_1$ denote the first columns of $V$ and $U$, respectively.

The channel input-output relation at time instant $k$ is

\[ y_k = Hx_k + \eta_k, \tag{5.1} \]

where $x_k \in \mathbb{C}^t$ is the channel input, $y_k \in \mathbb{C}^r$ is the channel output, and $\eta_k \in \mathbb{C}^r$ is the spatially and temporally white noise vector with i.i.d. zero mean circularly symmetric complex Gaussian (ZMCSCG) entries. The input $x_k$ could denote either data or training symbols. Also, we let the noise power in each receive antenna be unity, that is, $E\{\eta_k \eta_k^H\} = I_r$, where $E\{\cdot\}$ denotes the expectation operation, and $I_r$ is the $r \times r$ identity matrix.

Let L training symbol vectors be transmitted at an average power PT per

vector (T stands for ‘training’). The training symbols are stacked together to form

a training symbol matrix $X_p \in \mathbb{C}^{t \times L}$ as $X_p = [x_1, x_2, \ldots, x_L]$ (p stands for ‘pilot’). We employ orthogonal training sequences because of their optimality properties in channel estimation [40]. That is, $X_p X_p^H = \gamma_p I_t$, where $\gamma_p \triangleq LP_T/t$, thus maintaining the training power of $P_T$. The data symbols $x_k$ could either be spatially-white (i.e., $E\{x_k x_k^H\} = (P_D/t)\, I_t$), or they could be the result of using beamforming at the transmitter with unit-norm weight vector $w \in \mathbb{C}^{t \times 1}$ (i.e., $E\{x_k x_k^H\} = P_D\, ww^H$), where the data transmit power is $E\{x_k^H x_k\} = P_D$ (D stands for ‘data’). We let

N denote the number of spatially-white data symbols transmitted, that is, a total

of N + L symbols are transmitted prior to transmitting beamformed-data. Note

that the N white data symbols carry (unknown) information bits, and hence are

not a waste of available bandwidth.

In this chapter, we restrict our attention to the case where the transmitter

employs MRT to send data, that is, a single data stream is transmitted over t

transmit antennas after passing through a beamformer w. Given the channel

matrix H, the optimum choice of w is v1 [61]. Thus, MRT only needs an accurate

estimate of v1 to be fed-back to the transmitter. We assume that t ≥ 2, since

when t = 1, estimation of the beamforming vector has no relevance. Finally,

we will compare the performance of different estimation techniques using several

different measures, namely, the MSE in the estimate of v1, the gain (rather, the

power amplification/attenuation), and the symbol error rate (SER) of the one-

dimensional channel resulting from beamforming with the estimated vector v1

assuming uncoded $M$-ary QAM transmission. The performance of a practical communication system would also be affected by factors such as quantization error in $v_1$, errors in the feedback channel, feedback delay in time-varying environments, etc., and a detailed study of these factors warrants separate treatment.

5.2.2 Conventional Least Squares Estimation (CLSE)

Conventionally, an ML estimate of the channel matrix, $H_c$, is first obtained from the training data as the solution to the following least squares problem:

\[ H_c = \arg \min_{G \in \mathbb{C}^{r \times t}} \left\| Y_p - GX_p \right\|_F^2, \tag{5.2} \]

where $\|\cdot\|_F$ represents the Frobenius norm, $Y_p$ is the $r \times L$ matrix of received symbols given by $Y_p = HX_p + \eta_p$, and $\eta_p \in \mathbb{C}^{r \times L}$ is the set of AWGN (spatially and temporally white) vectors. From [15], the solution to this least squares estimation problem can be shown to be $H_c = Y_p X_p^\dagger$, where $X_p^\dagger$ is the Moore-Penrose generalized inverse of $X_p$. Since orthogonal training sequences are employed, we have $X_p^\dagger = \frac{1}{\gamma_p} X_p^H$, and consequently

\[ H_c = \frac{1}{\gamma_p} Y_p X_p^H. \tag{5.3} \]

The ML estimates of $v_1$ and $u_1$, denoted $v_c$ and $u_c$ respectively, are now obtained via an SVD of the estimated channel matrix $H_c$. Since $H_c$ is the ML estimate of $H$, from properties of ML estimation of principal components [62], the $v_c$ obtained by this technique is also the ML estimate of $v_1$ given only the training data.
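The CLSE steps above can be sketched in a few lines. The function name `clse`, the dimensions, and the use of scaled DFT rows as the orthogonal training matrix are assumptions for illustration only; any $X_p$ with $X_p X_p^H = \gamma_p I_t$ would serve equally well.

```python
import numpy as np

def clse(Yp, Xp, gamma_p):
    """Channel estimate Hc = Yp Xp^H / gamma_p and its dominant singular vectors."""
    Hc = Yp @ Xp.conj().T / gamma_p
    U, _, Vh = np.linalg.svd(Hc)
    return Hc, U[:, 0], Vh[0].conj()   # Hc, u_c, v_c

rng = np.random.default_rng(0)
r, t, L, P_T = 4, 2, 8, 10.0           # hypothetical sizes and training power
gamma_p = L * P_T / t

# Orthogonal training: scaled DFT rows, so that Xp Xp^H = gamma_p I_t.
Xp = np.sqrt(P_T / t) * np.fft.fft(np.eye(L))[:t, :]

# Rayleigh channel and unit-variance ZMCSCG noise.
H = (rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))) / np.sqrt(2)
noise = (rng.standard_normal((r, L)) + 1j * rng.standard_normal((r, L))) / np.sqrt(2)

Hc, u_c, v_c = clse(H @ Xp + noise, Xp, gamma_p)
v1 = np.linalg.svd(H)[2][0].conj()
print("alignment |v1^H v_c| =", abs(np.vdot(v1, v_c)))
```

The alignment $|v_1^H v_c|$ is used instead of a direct difference because singular vectors are only defined up to a unit-modulus phase factor.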

5.2.3 Semi-Blind Estimation

In the scenario that the transmitted data symbols are spatially-white, the ML estimate of $u_1$ is the dominant eigenvector of the output correlation matrix $R_y$, which is estimated as $\hat{R}_y = \sum_{i=1}^{N} y_i y_i^H$. Now, the estimate of $u_1$ is obtained by computing the following SVD

\[ \hat{U} \hat{\Sigma}^2 \hat{U}^H = \hat{R}_y. \tag{5.4} \]

Note that it is possible to use the entire received data to compute $\hat{R}_y$ in (5.4) rather than just the data symbols; in this case, $N$ should be changed to $N + L$. The estimate of $u_1$, denoted $u_s$ (the subscript ‘s’ stands for semi-blind), is thus computed blind from the received data as the first column of $\hat{U}$. As $N$ grows, a near perfect estimate of $u_1$ can be obtained.

In order to estimate $u_1$ as described above, it is necessary that the transmitted symbols be spatially-white. If the transmitter uses any (single) beamforming vector $w$, the expected value of the correlation at the receiver is $Hw(Hw)^H = Hww^H H^H \neq HH^H$, and hence, the estimated eigenvector will be a vector proportional to $Hw$ instead of $u_1$. Fig. 5.2 shows a schematic representation of the

CLSE and the CFSB schemes. Thus, the CFSB scheme involves a two-phase

data transmission: spatially-white data followed by beamformed data. White data


transmission could lead to a loss of performance relative to beamformed data, but

this performance loss can be compensated for by the gain obtained from the im-

proved estimate of the MRT beamforming vector. Thus, the semi-blind scheme can

have an overall better performance than the CLSE scheme. Section 5.5 presents

an overall SER comparison in a practical scenario, after accounting for the perfor-

mance of the white data as well as for the beamformed data.

Having obtained the estimate of u1 from the white data, the training

symbols are now exclusively used to estimate v1. Since the vector v1 has fewer

real parameters (2t−1) than the channel matrix H (2rt), it is expected to achieve a

greater accuracy of estimation for the same number of training symbols, compared

to the CLSE technique which requires an accurate estimate of the full H matrix in

order to estimate v1 accurately. If u1 is estimated perfectly from the blind data,

the received training symbols can be filtered by $u_1^H$ to obtain

\[ u_1^H Y_p = \sigma_1 v_1^H X_p + u_1^H \eta_p. \tag{5.5} \]

Since $\|u_1\| = 1$ (here $\|\cdot\|$ represents the 2-norm), the statistics of the Gaussian noise $\eta_p$ are unchanged by the above operation. We seek the estimate of $v_1$ as the solution to the following least squares problem

\[ v_s = \arg \min_{v \in \mathbb{C}^t,\ \|v\| = 1} \left\| u_1^H Y_p - \sigma_1 v^H X_p \right\|^2, \tag{5.6} \]

where vs denotes the semi-blind estimate of v1. The following lemma establishes

the solution.

Lemma 1. If $X_p$ satisfies $X_p X_p^H = \gamma_p I_t$, the least squares estimate of $v_1$ (under $\|v_1\| = 1$) given perfect knowledge of $u_1$ is

\[ v_s = \frac{X_p Y_p^H u_1}{\left\| X_p Y_p^H u_1 \right\|}. \tag{5.7} \]



Closed-Form Semi-Blind Estimation Algorithm (CFSB)

Based on the above observations, the proposed CFSB algorithm is as

follows. First, we obtain us, the estimate of u1, from (5.4). Then, we estimate

v1 from the L training symbols by substituting us for u1 in (5.7). This requires

L + N symbols to actually estimate v1, however, N of these symbols are data

symbols (which carry information bits). Hence, we can potentially achieve the

desired accuracy of estimation of v1 using fewer training symbols compared to the

CLSE technique.
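The two steps of the CFSB algorithm can be sketched as follows. The function names `cfsb` and `cnoise`, the dimensions, the power levels, and the DFT-based training matrix are illustrative assumptions rather than the simulation setup of this chapter.

```python
import numpy as np

def cfsb(Y_white, Yp, Xp):
    """CFSB sketch: u_s from the white-data correlation, v_s from (5.7) with u_s for u1."""
    Ry = Y_white @ Y_white.conj().T          # sum_i y_i y_i^H
    _, eigvecs = np.linalg.eigh(Ry)          # eigenvalues in ascending order
    u_s = eigvecs[:, -1]                     # dominant eigenvector
    z = Xp @ Yp.conj().T @ u_s
    return u_s, z / np.linalg.norm(z)

def cnoise(rng, shape):
    """Unit-variance ZMCSCG samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

rng = np.random.default_rng(0)
r, t, L, N, P_T, P_D = 4, 2, 8, 2000, 10.0, 10.0   # hypothetical parameters
Xp = np.sqrt(P_T / t) * np.fft.fft(np.eye(L))[:t, :]
H = cnoise(rng, (r, t))

# Spatially-white QPSK data with E{x x^H} = (P_D / t) I_t.
qpsk = (rng.choice([-1, 1], (t, N)) + 1j * rng.choice([-1, 1], (t, N))) / np.sqrt(2)
Y_white = H @ (np.sqrt(P_D / t) * qpsk) + cnoise(rng, (r, N))
Yp = H @ Xp + cnoise(rng, (r, L))

u_s, v_s = cfsb(Y_white, Yp, Xp)
v1 = np.linalg.svd(H)[2][0].conj()
print("alignment |v1^H v_s| =", abs(np.vdot(v1, v_s)))
```

With noiseless training and an exact $u_1$, the product $X_p Y_p^H u_1$ equals $\gamma_p \sigma_1 v_1$, so normalization returns $v_1$ up to a phase factor.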

An alternative to employing $u_1$ at the receiver is to employ maximum ratio

combining (MRC), i.e., to use an estimate of Hv1/ ‖Hv1‖ (which can be accurately

estimated as the dominant eigenvector of the sample covariance matrix of the

beamformed data). The performance of such a scheme is summarized in [63], and

the analysis can be carried out along the lines presented in this chapter.

5.3 Conventional Least Squares Estimation (CLSE)

5.3.1 Perturbation of Eigenvectors

We recapitulate a result from matrix perturbation theory [10] that we will use frequently in the sequel. Consider a first order perturbation of a Hermitian symmetric matrix $R$ by an error matrix $\Delta R$ to get $\hat{R}$, that is, $\hat{R} = R + \Delta R$. Then, if the eigenvalues of $R$ are distinct, for small perturbations, the eigenvectors $\hat{s}_k$ of $\hat{R}$ can be approximately expressed in terms of the eigenvectors $s_k$ of $R$ as

\[ \hat{s}_k = s_k + \sum_{\substack{r=1 \\ r \neq k}}^{n} \frac{s_r^H \Delta R\, s_k}{\lambda_k - \lambda_r}\, s_r, \tag{5.8} \]

where $n$ is the rank of $R$, $\lambda_k$ is the $k$-th eigenvalue of $R$, and $\lambda_k \neq \lambda_j$, $k \neq j$. When $k = 1$, we have $\hat{s}_1 = S\hat{d}$, where $S = [s_1, s_2, \ldots, s_n]$ is the matrix of eigenvectors and $\hat{d} = \left[1, \frac{s_2^H \Delta R\, s_1}{\lambda_1 - \lambda_2}, \ldots, \frac{s_n^H \Delta R\, s_1}{\lambda_1 - \lambda_n}\right]^T$. One could scale the vector $\hat{s}_1$ to construct a unit-norm vector as $\tilde{s}_1 = \hat{s}_1 / \|\hat{s}_1\|$. Then, $\tilde{s}_1 = Sd$, where $d = \hat{d} / \|\hat{d}\| = [1 + \Delta d_1, \Delta d_2, \ldots, \Delta d_n]^T$. Following an approach similar to [64], if $\Delta d_i$ are small, since $\|d\| = 1$, the components $\Delta d_i$ are approximately given by

\[ \Delta d_i \simeq \frac{s_i^H \Delta R\, s_1}{\lambda_1 - \lambda_i}, \quad i = 2, \ldots, n, \qquad \Delta d_1 \simeq -\frac{1}{2} \sum_{i=2}^{n} |\Delta d_i|^2. \tag{5.9} \]

Note that $\Delta d_1$ is real, and is a higher-order term compared to $\Delta d_i$, $i \ge 2$. We will use this fact in our first-order approximations to ignore terms such as $|\Delta d_1|^2, |\Delta d_1|^3, \ldots$ and $|\Delta d_i|^3, |\Delta d_i|^4, \ldots$, $i \ge 2$. In the sequel, we assume that the dominant singular value of $H$ is distinct, so the conditions required for the above result are valid.
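The first-order expansion (5.8) can be checked numerically against an exact eigendecomposition. In the sketch below, the function name `perturbed_eigvec`, the matrix size, the eigenvalues, and the perturbation scale are assumptions chosen for illustration. Note that `numpy.linalg.eigh` returns eigenvalues in ascending order, so the dominant eigenvector is the last column, whereas the text indexes it as $k = 1$.

```python
import numpy as np

def perturbed_eigvec(R, dR, k):
    """First-order approximation of the k-th eigenvector of R + dR, normalized."""
    lam, S = np.linalg.eigh(R)
    s_hat = S[:, k].astype(complex)
    for r in range(len(lam)):
        if r != k:
            coeff = (S[:, r].conj() @ dR @ S[:, k]) / (lam[k] - lam[r])
            s_hat = s_hat + coeff * S[:, r]
    return s_hat / np.linalg.norm(s_hat)

rng = np.random.default_rng(2)
n = 4
Qm, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
R = Qm @ np.diag([1.0, 2.0, 3.0, 4.0]) @ Qm.conj().T   # distinct eigenvalues
R = (R + R.conj().T) / 2                                # enforce Hermitian symmetry
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
dR = 1e-3 * (B + B.conj().T)                            # small Hermitian perturbation

k = n - 1                                               # dominant eigenvector
s_approx = perturbed_eigvec(R, dR, k)
s_exact = np.linalg.eigh(R + dR)[1][:, k]
alignment = abs(np.vdot(s_exact, s_approx))
baseline = abs(np.vdot(s_exact, np.linalg.eigh(R)[1][:, k]))
print("first-order alignment:", alignment, " unperturbed alignment:", baseline)
```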

5.3.2 MSE in vc

To compute the MSE in $v_c$, we use (5.3) to write the matrix $H_c^H H_c$ as a perturbation of $H^H H$ and use the above matrix perturbation result to derive the desired expressions.

\[ H_c^H H_c = V\Sigma^2 V^H + E_t, \tag{5.10} \]

where $E_t \approx \left[V\Sigma U^H E_p + E_p^H U\Sigma V^H\right]$ with $E_p = \frac{1}{\gamma_p} \eta_p X_p^H$. Here, we have ignored the $E_p^H E_p$ term in writing the expression for $E_t$, since it is a second order term due to the $\frac{1}{\gamma_p}$ factor in $E_p$. Now, we can regard $E_t$ as a perturbation of the matrix $H^H H$. As seen in Section 5.2.2, $v_c$ is estimated from the SVD of $H_c$. Since the basis vectors $V$ span $\mathbb{C}^t$, we can let $v_c = Vd$, and write $d = [1 + \Delta d_1, \Delta d_2, \ldots, \Delta d_t]^T$ as a perturbation of $[1, 0, \ldots, 0]^T$.

For $i \ge 2$, $\Delta d_i$ is obtained from (5.9) as

\[ \Delta d_i = \frac{v_i^H E_t v_1}{\sigma_1^2 - \sigma_i^2} = \frac{\sigma_i u_i^H E_p v_1 + \sigma_1 v_i^H E_p^H u_1}{\sigma_1^2 - \sigma_i^2}. \tag{5.11} \]

Note that, if $r < t$, we have $\sigma_i = 0$, $i > r$, hence, $\Delta d_i = v_i^H E_p^H u_1 / \sigma_1$, for $i > r$. Therefore, to simplify notation, we can define $u_i \triangleq 0_{r \times 1}$ and $v_j \triangleq 0_{t \times 1}$, for $i > r$ and $j > t$ respectively. The following result is used to find $E\{|\Delta d_i|^2\}$.


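Lemma 2. Let $\mu_1, \mu_2 \in \mathbb{C}$ be fixed complex numbers. Let $\sigma_p^2 = \frac{1}{\gamma_p}$ denote the variance of one of the elements of $E_p$. Then,

\[ E\left\{ \left| \mu_1 u_i^H E_p v_j + \mu_2 v_i^H E_p^H u_j \right|^2 \right\} = \sigma_p^2 \left( |\mu_1|^2 + |\mu_2|^2 \right), \tag{5.12} \]

for any $1 \le i \le r$, $1 \le j \le t$.

Proof. Let $a \triangleq u_i^H E_p v_j$ and $b \triangleq v_i^H E_p^H u_j$. Then, from Lemma 6 in Section 5.7.5 of the Appendix, $a$ and $b$ are circularly symmetric random variables. Since $E_p$ is circularly symmetric ($E\{E_p(i, j)\, E_p(k, l)\} = 0$, $\forall\, i, j, k, l$) and $a$ and $b^*$ are both linear combinations of elements of $E_p$, we have $E\{ab^*\} = 0$. Finally, since $\|u_i\| = \|v_j\| = 1$, the variances of $a$ and $b$ are equal, and $\sigma_a^2 = \sigma_b^2 = \sigma_p^2$. Substituting, we have

\[ E\left\{ \left| \mu_1 u_i^H E_p v_j + \mu_2 v_i^H E_p^H u_j \right|^2 \right\} = |\mu_1|^2 \sigma_a^2 + |\mu_2|^2 \sigma_b^2 = \sigma_p^2 \left( |\mu_1|^2 + |\mu_2|^2 \right). \]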

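Using the above lemma with $\mu_1 = \sigma_i$, $\mu_2 = \sigma_1$ and $j = 1$, we get, for $i \ge 2$,

\[ E\left\{ |\Delta d_i|^2 \right\} = \sigma_p^2\, \frac{\sigma_1^2 + \sigma_i^2}{\left(\sigma_1^2 - \sigma_i^2\right)^2}, \tag{5.13} \]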

where the expectation is taken with respect to the AWGN term ηp. The following

lemma helps simplify the expression further. We omit the proof, as it is straight-

forward.

Lemma 3. If $v_c = Vd$, then

\[ \|v_c - v_1\|^2 = 2\left(1 - \mathrm{Re}(d_1)\right) = -\left(\Delta d_1 + \Delta d_1^*\right), \tag{5.14} \]

where $d_1 = 1 + \Delta d_1$ is the first element of $d$.

Using (5.13) in (5.9) and substituting into (5.14), the final estimation error is

\[ E\left\{ \|v_c - v_1\|^2 \right\} = \frac{1}{\gamma_p} \sum_{i=2}^{t} \frac{\sigma_1^2 + \sigma_i^2}{\left(\sigma_1^2 - \sigma_i^2\right)^2}. \tag{5.15} \]
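As a quick aid for evaluating (5.15) on a given channel realization, the following sketch computes the predicted MSE from the singular values of $H$. The function name is hypothetical, and singular values beyond $\min(r, t)$ are taken as zero, following the convention stated above.

```python
import numpy as np

def clse_beamformer_mse(H, gamma_p):
    """First-order MSE of v_c from (5.15), with sigma_i = 0 for i > min(r, t)."""
    t = H.shape[1]
    sigma = np.zeros(t)
    s = np.linalg.svd(H, compute_uv=False)
    sigma[: len(s)] = s
    s1 = sigma[0] ** 2
    si = sigma[1:] ** 2
    return np.sum((s1 + si) / (s1 - si) ** 2) / gamma_p

# Example: singular values 2 and 1, gamma_p = 10 gives (4 + 1) / (4 - 1)^2 / 10.
mse = clse_beamformer_mse(np.diag([2.0, 1.0]), gamma_p=10.0)
print(mse)
```

The expression grows without bound as $\sigma_2 \to \sigma_1$, which reflects the requirement that the dominant singular value be distinct.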


5.3.3 Received SNR and Symbol Error Rate (SER)

In this section, we derive the expression for the received SNR when beamforming using $v_c$ at the transmitter and filtering using $u_c$ at the receiver. Since the unitary matrices $V$ and $U$ span $\mathbb{C}^t$ and $\mathbb{C}^r$, $v_c$ and $u_c$ can be expressed as $u_c = Uc$ and $v_c = Vd$ respectively. Borrowing notation from Section 5.3.2, let $c = [1 + \Delta c_1, \Delta c_2, \ldots, \Delta c_r]^T \in \mathbb{C}^r$ and $d = [1 + \Delta d_1, \Delta d_2, \ldots, \Delta d_t]^T \in \mathbb{C}^t$ respectively. Then, $c$ can be derived by a perturbation analysis on $H_c H_c^H$ analogous to that in (5.10) in Section 5.3.2. We obtain

\[ \Delta c_i = \frac{\sigma_i v_i^H E_p^H u_1 + \sigma_1 u_i^H E_p v_1}{\sigma_1^2 - \sigma_i^2}, \]

where, as before, we define $\sigma_i = 0$, and $u_i \triangleq 0_{r \times 1}$, $v_j \triangleq 0_{t \times 1}$, for $i > r$ and $j > t$ respectively, so that $\Delta c_i = 0$, $i > r$, as expected. The channel gain is given by

\[ u_c^H H v_c = c^H \Sigma d = \sigma_1 (1 + \Delta d_1)(1 + \Delta c_1^*) + \sum_{i=2}^{t} \sigma_i\, \Delta c_i^* \Delta d_i. \]

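Ignoring higher order terms (cf. Section 5.3.1), the power amplification $\rho_c \triangleq E\left\{ \left| u_c^H H v_c \right|^2 \right\}$ is

\[ \rho_c \approx \sigma_1^2\, E\left\{ 1 + (\Delta d_1 + \Delta d_1^*) + (\Delta c_1 + \Delta c_1^*) + \sum_{i=2}^{t} \frac{\sigma_i}{\sigma_1} \left( \Delta c_i \Delta d_i^* + \Delta c_i^* \Delta d_i \right) \right\}. \tag{5.16} \]

From (5.9) and (5.13), we have

\[ E\{\Delta d_1 + \Delta d_1^*\} = -\frac{1}{\gamma_p} \sum_{i=2}^{t} \frac{\sigma_1^2 + \sigma_i^2}{\left(\sigma_1^2 - \sigma_i^2\right)^2}; \qquad E\{\Delta c_1 + \Delta c_1^*\} = -\frac{1}{\gamma_p} \sum_{i=2}^{r} \frac{\sigma_1^2 + \sigma_i^2}{\left(\sigma_1^2 - \sigma_i^2\right)^2}. \]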

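Now, $E\{\Delta c_i \Delta d_i^*\}$ can be written as

\[
\begin{aligned}
E\{\Delta c_i \Delta d_i^*\}
&= E\left\{ \left( \frac{\sigma_i v_i^H E_p^H u_1 + \sigma_1 u_i^H E_p v_1}{\sigma_1^2 - \sigma_i^2} \right) \left( \frac{\sigma_i v_1^H E_p^H u_i + \sigma_1 u_1^H E_p v_i}{\sigma_1^2 - \sigma_i^2} \right) \right\}, \\
&= E\left\{ \frac{\sigma_1 \sigma_i \left( \left|u_i^H E_p v_1\right|^2 + \left|v_i^H E_p^H u_1\right|^2 \right)}{(\sigma_1^2 - \sigma_i^2)^2} \right\}, \\
&= \frac{2 \sigma_p^2 \sigma_1 \sigma_i}{(\sigma_1^2 - \sigma_i^2)^2} = \frac{2 \sigma_1 \sigma_i}{\gamma_p (\sigma_1^2 - \sigma_i^2)^2}.
\end{aligned}
\]

And likewise for $\Delta c_i^* \Delta d_i$. Denoting $m \triangleq \mathrm{rank}(H)$, the power amplification is

\[
\begin{aligned}
\rho_c &= \sigma_1^2 \left( 1 - \frac{1}{\gamma_p} \sum_{i=2}^{t} \frac{\sigma_1^2 + \sigma_i^2}{(\sigma_1^2 - \sigma_i^2)^2} - \frac{1}{\gamma_p} \sum_{i=2}^{r} \frac{\sigma_1^2 + \sigma_i^2}{(\sigma_1^2 - \sigma_i^2)^2} + \frac{4}{\gamma_p} \sum_{i=2}^{m} \frac{\sigma_i^2}{(\sigma_1^2 - \sigma_i^2)^2} \right), \\
&= \sigma_1^2 - \frac{2}{\gamma_p} \sum_{i=2}^{m} \frac{\sigma_1^2}{\sigma_1^2 - \sigma_i^2} - \frac{1}{\gamma_p} (r + t - 2m).
\end{aligned} \tag{5.17}
\]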

In obtaining (5.17), we have used the fact that $\sigma_i = 0$ for $i > m$, where $m = \mathrm{rank}(H)$. Finally, the received SNR is

\[
\mathrm{SNR} = \rho_c P_D, \tag{5.18}
\]

where $P_D$ is the power per data symbol. The power amplification with perfect knowledge of $H$ at the transmitter and the receiver is $\rho_p \triangleq \sigma_1^2$. As $\gamma_p = L P_T / t$ increases, $\rho_c$ approaches $\rho_p$. Note that, when $r = 1$, the above expression simplifies to $\rho_c = \rho_p - \frac{1}{\gamma_p}(t - 1)$. Also, $\sum_{i=2}^{m} \frac{\sigma_1^2}{\sigma_1^2 - \sigma_i^2} \geq (m - 1)$ since $\sigma_1 \geq \sigma_i$. Hence, if $r = t$, the CLSE performs best when the channel is spatially single dimensional (for example, in keyhole channels or highly correlated channels), that is, $\sigma_i = 0$, $i \geq 2$. In this case, we have $\rho_c = \rho_p - \frac{2}{\gamma_p}(t - 1)$. At the other extreme, if the dominant singular values are very close to each other such that $(\sigma_1^2 - \sigma_2^2) < 2/\gamma_p$, the analysis no longer holds, because it requires that the dominant singular values of $H$ be sufficiently separated. For Rayleigh fading channels, i.e., when $H$ has i.i.d. ZMCSCG entries of unit variance, we can numerically evaluate the probability $\Pr\{\sigma_1^2 - \sigma_2^2 < 2/\gamma_p\}$ to be approximately $1.7 \times 10^{-4}$, with $r = t = 4$ and a typical value of $\gamma_p = 10$ dB. Thus, the above analysis is valid for most channel instantiations.
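The two lines of (5.17) and the special cases quoted above can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, `sigmas` is assumed to hold the $m$ nonzero singular values in decreasing order, and $\gamma_p$ is on a linear scale.

```python
def rho_c_expanded(sigmas, r, t, gamma_p):
    """First line of (5.17): three separate sums, sigma_i = 0 beyond the rank."""
    s1 = sigmas[0] ** 2
    m = len(sigmas)  # rank of H

    def sq(i):
        # sigma_i^2 for a 1-based index i, zero past the rank
        return sigmas[i - 1] ** 2 if i <= m else 0.0

    def term(i):
        return (s1 + sq(i)) / (s1 - sq(i)) ** 2

    loss_t = sum(term(i) for i in range(2, t + 1))
    loss_r = sum(term(i) for i in range(2, r + 1))
    cross = sum(sq(i) / (s1 - sq(i)) ** 2 for i in range(2, m + 1))
    return s1 * (1.0 - loss_t / gamma_p - loss_r / gamma_p + 4.0 * cross / gamma_p)


def rho_c_closed(sigmas, r, t, gamma_p):
    """Second line of (5.17): the simplified closed form."""
    s1 = sigmas[0] ** 2
    m = len(sigmas)
    s = sum(s1 / (s1 - x ** 2) for x in sigmas[1:])
    return s1 - 2.0 * s / gamma_p - (r + t - 2 * m) / gamma_p
```

With a single nonzero singular value, `rho_c_closed` reduces to $\sigma_1^2 - (t-1)/\gamma_p$ for $r = 1$ and to $\sigma_1^2 - 2(t-1)/\gamma_p$ for $r = t$, matching the two cases discussed in the text.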

Having determined the expected received SNR for a given channel instantiation, assuming uncoded $M$-ary QAM transmission, the corresponding SER $P_M$ is given as [3]

\[
P_{\sqrt{M}}(\rho_c) = 2 \left[ 1 - \frac{1}{\sqrt{M}} \right] Q\left( \sqrt{\frac{3 \rho_c P_D}{M - 1}} \right), \tag{5.19}
\]

\[
P_M(\rho_c) = 1 - \left( 1 - P_{\sqrt{M}}(\rho_c) \right)^2, \tag{5.20}
\]

where $Q(\cdot)$ is the Gaussian Q-function, and $\rho_c$ is given by (5.17). The above expression can now be averaged over the probability density function of $\sigma_i^2$ through numerical integration.
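For a single channel instantiation, the SER expression can be evaluated directly. The following Python sketch uses the received SNR $\rho_c P_D$ from (5.18) as the argument of the Q-function; the function names are assumptions of this example, and the Q-function is expressed through the complementary error function as $Q(x) = \tfrac{1}{2}\mathrm{erfc}(x/\sqrt{2})$.

```python
import math


def q_function(x):
    """Gaussian Q-function via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def ser_mqam(rho, p_d, m_order):
    """SER of uncoded square M-QAM for power amplification rho.

    rho     : power amplification (linear scale)
    p_d     : power per data symbol (linear scale)
    m_order : constellation size M (4, 16, 64, ...)
    """
    snr = rho * p_d
    # Error rate of each underlying sqrt(M)-ary PAM component.
    p_sqrt = 2.0 * (1.0 - 1.0 / math.sqrt(m_order)) * q_function(
        math.sqrt(3.0 * snr / (m_order - 1))
    )
    return 1.0 - (1.0 - p_sqrt) ** 2
```

Averaging over the channel would then amount to evaluating `ser_mqam` on samples (or a quadrature grid) of the singular values and weighting by their density.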

5.4 Closed-Form Semi-Blind estimation (CFSB)

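First, recall that the first order Taylor expansion of a function of two variables $g(x, y)$ is given by

\[
\begin{aligned}
g(x + \Delta x, y + \Delta y) - g(x, y) &= \frac{\partial g(x, y)}{\partial x} \Delta x + \frac{\partial g(x, y)}{\partial y} \Delta y + O\left(\Delta x^2\right) + O\left(\Delta y^2\right) \\
&\approx \left[ g(x + \Delta x, y) - g(x, y) \right] + \left[ g(x, y + \Delta y) - g(x, y) \right].
\end{aligned}
\]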

Now, in CFSB, the error in v1 (or loss in SNR) occurs due to two reasons: first,

the noise in the received training symbols, and second, the use of an imperfect

estimate of u1 (from the noise in the data symbols and availability of only a finite

number N of unknown white data). More precisely, let the estimator of v1 be

expressed as a function vs = f(Yp, us) of the two variables Yp and us. Using the

above expansion, we have

f (Yp, us) − f (HXp,u1) ≈ [f (Yp,u1) − f (HXp,u1)] + [f (HXp, us) − f (HXp,u1)]

(5.21)

where vs = f (Yp, us) and from (5.7), v1 = f (HXp,u1). Since the training noise

ηp and the error in the estimate us are mutually independent, we get

\[
E\|v_s - v_1\|^2 \approx \underbrace{E\|f(Y_p, u_1) - f(H X_p, u_1)\|^2}_{T_1} + \underbrace{E\|f(H X_p, u_s) - f(H X_p, u_1)\|^2}_{T_2}. \tag{5.22}
\]

Note that the term $T_1$ represents the MSE in $v_s$ as if the receiver had perfect knowledge of $u_1$ (i.e., $u_s = u_1$), and the term $T_2$ represents the MSE in $v_s$ when the training symbols are noise-free (i.e., $Y_p = H X_p$). Hence, the error in $v_s$ can be thought of as the sum of two terms: the first being the error due to the noise in the training data, and the second being the error due to the noise in the white (unknown) data. A similar decomposition can be used to express the loss in channel gain (relative to $\sigma_1$).

5.4.1 MSE in vs with Perfect us

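In this section we consider the error arising exclusively from the training noise, by setting $u_s = u_1$. Let $\tilde{v}_s$ be defined as $\tilde{v}_s \triangleq \frac{X_p Y_p^H u_1}{\sigma_1 \gamma_p}$. Then, from (5.5),

\[
\tilde{v}_s = v_1 + \frac{E_p^H u_1}{\sigma_1},
\]

where $E_p \triangleq \eta_p X_p^H / \gamma_p$, as before. Recall from (5.7) that $v_s = \tilde{v}_s / \|\tilde{v}_s\|$. Now, $\|\tilde{v}_s\|$ can be simplified as $\|\tilde{v}_s\|^2 \simeq 1 + \left( u_1^H E_p v_1 + v_1^H E_p^H u_1 \right)/\sigma_1$, whence we get

\[
v_s \simeq \left( v_1 + \frac{E_p^H u_1}{\sigma_1} \right) \left[ 1 - \frac{1}{2\sigma_1} \left( u_1^H E_p v_1 + v_1^H E_p^H u_1 \right) \right].
\]

Ignoring terms of order $E_p^H E_p$ and simplifying, the error in $v_s$ and its squared norm are

\[
\begin{aligned}
v_s - v_1 &\simeq \frac{E_p^H u_1}{\sigma_1} - \frac{1}{2\sigma_1} \left( u_1^H E_p v_1 + v_1^H E_p^H u_1 \right) v_1, \\
\|v_s - v_1\|^2 &= \frac{\left\| E_p^H u_1 \right\|^2}{\sigma_1^2} - \frac{1}{4\sigma_1^2} \left| u_1^H E_p v_1 + v_1^H E_p^H u_1 \right|^2.
\end{aligned} \tag{5.23}
\]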

Taking expectation and simplifying the above expression using Lemma 2, we get

\[
E\|v_s - v_1\|^2 = \frac{1}{2\gamma_p \sigma_1^2} (2t - 1). \tag{5.24}
\]
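The expectation step can be spelled out as follows. This is a sketch under the assumption, consistent with the earlier derivations, that the elements of $E_p$ are uncorrelated with variance $\sigma_p^2 = 1/\gamma_p$, so that the $t$ entries of $E_p^H u_1$ each have variance $\sigma_p^2$, and that Lemma 2 is applied with $\mu_1 = \mu_2 = 1$ and $i = j = 1$.

```latex
\begin{align*}
E\left\| E_p^H u_1 \right\|^2 &= t\,\sigma_p^2 = \frac{t}{\gamma_p}, \qquad
E\left| u_1^H E_p v_1 + v_1^H E_p^H u_1 \right|^2 = 2\sigma_p^2 = \frac{2}{\gamma_p}, \\
E\|v_s - v_1\|^2 &= \frac{t}{\gamma_p \sigma_1^2} - \frac{1}{4\sigma_1^2}\cdot\frac{2}{\gamma_p}
                  = \frac{2t - 1}{2\gamma_p \sigma_1^2}.
\end{align*}
```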

Interestingly, the above expression is the Cramer-Rao lower bound (CRB) for the

estimation of v1 assuming perfect knowledge of u1, which we prove in the following

theorem.

Theorem 5. The error given in (5.24) is the CRB for the estimation of v1 under

perfect knowledge of u1.

Proof. From (5.36), the effective SNR for estimation of $v_1$ is $\gamma_s = \gamma_p \sigma_1^2$. From the results derived for the CRB with constrained parameters [23, 65], since $\bar{X}_p \bar{X}_p^H = I_t/\gamma_p$, the estimation error in $v_1$ is proportional to the number of parameters, which equals $2t - 1$ as $v_1$ is a $t$-dimensional complex vector with one constraint ($\|v_1\| = 1$). The estimation error is given by

\[
E\|v_s - v_1\|^2 = \frac{1}{2\gamma_s} \, (\text{Num. Parameters}) = \frac{1}{2\gamma_p \sigma_1^2} (2t - 1), \tag{5.25}
\]

which agrees with the ML error derived in (5.24).

5.4.2 Received SNR with Perfect us

We start with the expression for the channel gain when using $v_s$ and $u_s$ as the transmit and receive beamforming vectors, respectively. When we have perfect knowledge of $u_1$ at the receiver, $u_s = u_1$ and $v_s = \tilde{v}_s / \|\tilde{v}_s\|$, where $\tilde{v}_s = v_1 + E_u u_1$ and $E_u \triangleq E_p^H/\sigma_1$. The power amplification with perfect knowledge of $u_1$ is denoted by $\rho_u \triangleq E\left|u_1^H H v_s\right|^2 = E\left\{ \frac{|u_1^H H \tilde{v}_s|^2}{\|\tilde{v}_s\|^2} \right\}$. As shown in Appendix 5.7.2, this can be simplified to

\[
\rho_u = \sigma_1^2 - \frac{t - 1}{\gamma_p}. \tag{5.26}
\]

Finally, the received SNR is given by $P_D \rho_u$, as before. Comparing the above expression with the power amplification with CLSE (5.17), we see that when $r = t$, even in the best case of a spatially single-dimensional channel, $\rho_c = \rho_p - \frac{2}{\gamma_p}(t - 1) < \rho_u$. Next, when $r = 1$, the CLSE and CFSB techniques perform exactly the same: $\rho_c = \rho_u = \sigma_1^2 - \frac{t-1}{\gamma_p}$, since $u_1 = 1$ (that is, no receive beamforming is needed). Thus, if perfect knowledge of $u_1$ is available at the receiver, CFSB is guaranteed to perform at least as well as CLSE, regardless of the training symbol SNR.
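The comparison between (5.26) and the rank-one case of (5.17) is simple enough to tabulate. The sketch below uses hypothetical function names and linear-scale quantities; `rho_c_rank_one` is (5.17) specialized to $m = 1$.

```python
def rho_u(sigma1, t, gamma_p):
    """Power amplification of CFSB with perfect u_1, equation (5.26)."""
    return sigma1 ** 2 - (t - 1) / gamma_p


def rho_c_rank_one(sigma1, r, t, gamma_p):
    """Power amplification of CLSE, (5.17) with rank m = 1."""
    return sigma1 ** 2 - (r + t - 2) / gamma_p


# Gap between the two schemes for a square system: (t - 1) / gamma_p.
gap = rho_u(2.0, 4, 10.0) - rho_c_rank_one(2.0, 4, 4, 10.0)
```

The gap shrinks as $1/\gamma_p$, so the advantage of perfect knowledge of $u_1$ matters most at low pilot SNR or short training length.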

5.4.3 MSE in vs with Noise-Free Training

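We now present analysis to compute the second term in (5.22), the MSE in $v_s$ solely due to the use of the erroneous vector $u_s$ in (5.7), and hence let $\eta_p = 0$, or $Y_p = H X_p$. As in Section 5.3.3, we can express $u_s$ as a linear combination of the columns of $U$ as $u_s = U c$. We slightly abuse notation from Section 5.4.1 and redefine $\tilde{v}_s$ as $\tilde{v}_s \triangleq X_p Y_p^H u_s/\gamma_p = V \Sigma c$. Hence,

\[
\|\tilde{v}_s\|^2 = c^H \Sigma^2 c.
\]

Thus, from (5.7), we have $v_s = V \bar{c}$, where $\bar{c} = \frac{\Sigma c}{\sqrt{c^H \Sigma^2 c}}$. From Lemma 3,

\[
\|v_s - v_1\|^2 = 2 \left( 1 - \mathrm{Re}(\bar{c}_1) \right). \tag{5.27}
\]

Let $c = [1 + \Delta c_1, \Delta c_2, \ldots, \Delta c_r]^T$. Then, as shown in Appendix 5.7.3, $\bar{c}_1$, the first element of $\bar{c}$, is given by

\[
\bar{c}_1 \simeq 1 - \frac{1}{2} \sum_{i=2}^{r} \frac{\sigma_i^2}{\sigma_1^2} |\Delta c_i|^2, \tag{5.28}
\]

and hence $\|v_s - v_1\|^2 = \sum_{i=2}^{r} \frac{\sigma_i^2}{\sigma_1^2} |\Delta c_i|^2$. Let $\gamma_d$ be defined as $\gamma_d \triangleq N P_D/t$. Then, from Appendix 5.7.3, $E|\Delta c_i|^2$ is given by

\[
E|\Delta c_i|^2 = \frac{1}{(\sigma_1^2 - \sigma_i^2)^2} \left( \frac{\sigma_1^2 \sigma_i^2}{N} + \frac{\sigma_i^2 + \sigma_1^2}{\gamma_d} + \frac{N}{\gamma_d^2} \right). \tag{5.29}
\]

Substituting, we get the final expression for the MSE as

\[
E\|v_s - v_1\|^2 = \sum_{i=2}^{r} \frac{\sigma_i^2}{\sigma_1^2 (\sigma_1^2 - \sigma_i^2)^2} \left( \frac{\sigma_1^2 \sigma_i^2}{N} + \frac{\sigma_i^2 + \sigma_1^2}{\gamma_d} + \frac{N}{\gamma_d^2} \right). \tag{5.30}
\]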

Note that the above expression decreases as O(1/N) (since γd depends linearly on

N), and therefore the MSE asymptotically (as N → ∞) approaches the bound in

(5.24).

5.4.4 Received SNR with Noise-Free Training

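The power amplification with noise-free training, denoted $\rho_w$, is given by $\rho_w = \left|u_s^H H v_s\right|^2$. We also have $u_s = U c$ and $v_s = V \bar{c}$, where $\bar{c} = \frac{\Sigma c}{\sqrt{c^H \Sigma^2 c}}$. Then, $u_s^H H v_s = c^H \Sigma \bar{c} = \sqrt{c^H \Sigma^2 c}$, and thus

\[
\begin{aligned}
\rho_w = c^H \Sigma^2 c &= \sigma_1^2 (1 + \Delta c_1^*)(1 + \Delta c_1) + \sum_{i=2}^{r} \sigma_i^2 \Delta c_i^* \Delta c_i \\
&\simeq \sigma_1^2 (1 + \Delta c_1 + \Delta c_1^*) + \sum_{i=2}^{r} \sigma_i^2 |\Delta c_i|^2.
\end{aligned}
\]

Substituting for $\Delta c_1$ from (5.9) and $\Delta c_i$ from (5.29), we obtain the power amplification with noise-free training as

\[
\rho_w = \sigma_1^2 - \sum_{i=2}^{r} \frac{1}{(\sigma_1^2 - \sigma_i^2)} \left( \frac{\sigma_1^2 \sigma_i^2}{N} + \frac{\sigma_1^2 + \sigma_i^2}{\gamma_d} + \frac{N}{\gamma_d^2} \right). \tag{5.31}
\]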

As before, the received SNR is given by PDρw. Note that ρw approaches ρp = σ21

for large values of length N and SNR γd.

5.4.5 Semi-blind Estimation: Summary

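Recall that $\gamma_p = L P_T/t$ and $\gamma_d = N P_D/t$. The final expressions for the MSE in $v_s$ and the power amplification, from (5.22), are:

\[
E\|v_1 - v_s\|^2 = \frac{(2t - 1)}{2\gamma_p \sigma_1^2} + \sum_{i=2}^{r} \frac{\sigma_i^2}{\sigma_1^2 (\sigma_1^2 - \sigma_i^2)^2} \left( \frac{\sigma_1^2 \sigma_i^2}{N} + \frac{\sigma_i^2 + \sigma_1^2}{\gamma_d} + \frac{N}{\gamma_d^2} \right), \tag{5.32}
\]

\[
\rho_s = \sigma_1^2 - \frac{t - 1}{\gamma_p} - \sum_{i=2}^{r} \frac{1}{(\sigma_1^2 - \sigma_i^2)} \left( \frac{\sigma_1^2 \sigma_i^2}{N} + \frac{\sigma_1^2 + \sigma_i^2}{\gamma_d} + \frac{N}{\gamma_d^2} \right). \tag{5.33}
\]

The SER with semi-blind estimation is given by $P_M(\rho_s)$, with $P_M(\cdot)$ defined as in (5.19) and (5.20).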
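The two summary expressions share the same bracketed term, which suggests evaluating them together. The following is a minimal Python sketch of (5.32) and (5.33) for one channel realization; the function name and argument order are assumptions of this example, `sigmas` is assumed to contain all $r$ singular values in decreasing order, and all SNR-like quantities are on a linear scale.

```python
def cfsb_summary(sigmas, t, gamma_p, gamma_d, n_white):
    """Return (MSE in v_s, power amplification rho_s) from (5.32) and (5.33).

    sigmas  : the r singular values of H, decreasing order
    t       : number of transmit antennas
    gamma_p : L * P_T / t
    gamma_d : N * P_D / t
    n_white : number of white data symbols N
    """
    s1 = sigmas[0] ** 2
    mse = (2 * t - 1) / (2.0 * gamma_p * s1)  # training-noise term, (5.24)
    rho = s1 - (t - 1) / gamma_p              # perfect-u_1 gain, (5.26)
    for s in sigmas[1:]:                      # indices i = 2, ..., r
        si = s ** 2
        bracket = s1 * si / n_white + (si + s1) / gamma_d + n_white / gamma_d ** 2
        mse += si / (s1 * (s1 - si) ** 2) * bracket
        rho -= bracket / (s1 - si)
    return mse, rho
```

As $N$ and $\gamma_d$ grow, the loop contributions vanish and the outputs approach the perfect-$u_1$ values in (5.24) and (5.26).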

5.5 Comparison of CLSE and Semi-blind Schemes

In order to compare the CFSB and CLSE techniques, one needs to ac-

count for the performance of the white data versus beamformed data, an issue

we address now. Generic comparison of the semi-blind and conventional schemes

for any arbitrary system configuration is difficult, so we consider an example to

illustrate the trade-offs involved. We consider the 2× 2 system with the Alamouti

scheme [1] employed for white data transmission, and with uncoded 4-QAM symbol transmission. The choice of the Alamouti scheme enables us to present a fair comparison of the two estimation algorithms since it has an effective data rate of 1 symbol per channel use, the same as that of MRT. Additionally, it is possible to employ

a simple receiver structure, which makes the performance analysis tractable.


Let the beamformed data and the white data be statistically independent, and a zero-forcing receiver based on the conventional estimate of the channel (5.3) be used to detect the white data symbols. In Appendix 5.7.4, we derive the average SNR of this system as

\[
\rho_w = \frac{\left( \|H\|_F^4 + \frac{\|H\|_F^2}{\gamma_p} \right) P_x}{\frac{\|H\|_F^2}{\gamma_p} P_x + \|H\|_F^2 + \frac{2r}{\gamma_p}}, \tag{5.34}
\]

where $\|\cdot\|_F$ is the Frobenius norm, $P_x$ is the per-symbol transmit power and $\gamma_p = L P_T/t$ as defined before. From (5.34), we can also obtain the symbol error rate performance of the Alamouti coded white data by using (5.19) with $\rho_c P_D$ replaced by $\rho_w$. The resulting expression can be numerically averaged over the pdf of $\|H\|_F^2$, which is Gamma distributed with $2rt$ degrees of freedom, to obtain the SER. The analysis of the beamformed data with the CFSB estimation when the Alamouti scheme is employed to transmit spatially white data remains largely the same as that presented in the previous section, where we had assumed that $X_d$ satisfies $E\{X_d X_d^H\} = \gamma_d I_t$. With Alamouti white-data transmission, we have that $X_d X_d^H = \gamma_d I_t$, which causes the $E_\chi$ term to drop out in (5.40) of Appendix 5.7.3.
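The structure of (5.34), signal power over leakage plus noise, can be made explicit in code. This is an illustrative sketch with a hypothetical function name; `h_fro_sq` stands for $\|H\|_F^2$ and all quantities are on a linear scale.

```python
def alamouti_zf_snr(h_fro_sq, r, gamma_p, p_x):
    """Average SNR (5.34) of Alamouti-coded data with a zero-forcing receiver
    built from the conventional (training-only) channel estimate.

    h_fro_sq : squared Frobenius norm of H
    r        : number of receive antennas
    gamma_p  : L * P_T / t
    p_x      : per-symbol transmit power
    """
    signal = (h_fro_sq ** 2 + h_fro_sq / gamma_p) * p_x
    leakage = (h_fro_sq / gamma_p) * p_x   # residual interference from x_2
    noise = h_fro_sq + 2.0 * r / gamma_p   # AWGN filtered by the estimate
    return signal / (leakage + noise)
```

For large $\gamma_p$ the leakage and the estimation-induced noise vanish and the expression tends to $\|H\|_F^2 P_x$, the familiar Alamouti SNR with perfect channel knowledge.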

5.5.1 Performance of a 2 × 2 System with CLSE and CFSB

In order to get a more concrete feel for the expressions obtained in the

preceding, let us consider a 2 × 2 system with L = 2, N = 8, PD = 6dB and

110 total symbols per frame, i.e., 2 training symbols, 8 white data symbols and

100 beamformed data symbols in the semi-blind case, and 2 training symbols and

108 beamformed data symbols in the conventional case. The average channel power gain $\rho$ versus training symbol SNR ($P_T$), obtained under different CSI and signal transmission conditions, is shown in Fig. 5.3. When the receiver has perfect channel knowledge (labelled perfect $u_1$, $v_1$), the average power gain $\rho$ is $E\{\sigma_1^2\} = 5.5$ dB, independent of the training symbol SNR. The $\rho$ with CLSE as

well as the semi-blind techniques asymptotically tend to this gain of 5.5dB as the

SNR becomes large, since the loss due to estimation error becomes negligible. The


channel power gain with only white (Alamouti) data transmission asymptotically

approaches 3dB (the gain per symbol of the 2×2 system with Alamouti encoding).

The channel power gain at any PT is given by (5.34), which is validated

in Fig. 5.3 through simulation. Observe that at a given training SNR, there is

a loss of approximately Pa = −3dB in terms of the channel gain performance for

the Alamouti scheme compared to the beamforming with conventional estimation.

The results of the channel power gain obtained by employing the CFSB technique

with N = 8 Alamouti-coded data symbols are shown in Fig. 5.3, and show the

improved performance of CFSB. By transmitting a few (N = 8) Alamouti-coded

symbols, the CFSB scheme obtains a better estimate of v1, thereby gaining about

Psb = 0.8dB per symbol over the CLSE scheme, at a training SNR of 2dB.

If the frame length is 110 symbols, we have $L_d = 100$ beamformed data symbols in the semi-blind case and $L_d + N = 108$ beamformed data symbols in the conventional case. Using the beamforming vectors estimated by the CFSB algorithm, we then have a net power gain $\rho_g$ given by $\rho_g = \frac{L_d + N}{L_d/P_{sb} + N/P_a}$, or about 0.4 dB per frame. Thus, this simple example shows that CFSB estimation can

potentially offer an overall better performance compared to the CLSE. Although

we have considered uncoded modulation here, in more practical situations a chan-

nel code will be used with interleaving both between the white and beamformed

symbols as well as across multiple frames. In this case, burst errors can be avoided

and the errors in the white data symbols corrected. Furthermore, the performance

of the white data symbols can also be improved by employing an MMSE receiver

or other more advanced multi-user detectors rather than the zero-forcing receiver,

leading to additional improvements in the CFSB technique.
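The net gain quoted in this example can be reproduced from the per-symbol figures given in the text ($P_{sb} = 0.8$ dB, $P_a = -3$ dB, $L_d = 100$, $N = 8$). The sketch below is only a numerical check of the expression $\rho_g = (L_d + N)/(L_d/P_{sb} + N/P_a)$; the function names are hypothetical.

```python
import math


def db_to_linear(x_db):
    """Convert a power ratio from dB to linear scale."""
    return 10.0 ** (x_db / 10.0)


def net_gain_db(l_d, n_white, p_sb_db, p_a_db):
    """Net per-frame power gain rho_g of CFSB over CLSE, in dB.

    l_d     : number of beamformed data symbols in the semi-blind frame
    n_white : number of white (Alamouti-coded) data symbols N
    p_sb_db : per-symbol gain of CFSB beamformed data over CLSE (dB)
    p_a_db  : per-symbol gain (negative = loss) of the white data (dB)
    """
    p_sb = db_to_linear(p_sb_db)
    p_a = db_to_linear(p_a_db)
    rho_g = (l_d + n_white) / (l_d / p_sb + n_white / p_a)
    return 10.0 * math.log10(rho_g)


frame_gain_db = net_gain_db(100, 8, 0.8, -3.0)
```

With the numbers above, `frame_gain_db` comes out a little under 0.4 dB, in line with the figure quoted in the text.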

5.5.2 Discussion

We are now in a position to discuss the merits of the conventional estimation and the semi-blind estimation. Clearly, the CLSE enjoys the advantages of being simple and easy to implement. As with any semi-blind technique, CFSB, being a second-order method, requires the channel to be relatively slowly

time-varying. If not, the CLSE can still estimate the channel quickly from a few

training symbols, whereas the CFSB may not be able to converge to an accurate

estimate of u1 from the second order statistics computed using just a few received

vectors. Another disadvantage of the CFSB is that it requires the implementation

of two separate receivers, one for detecting the white data and the other for the

beamformed data. However, the CFSB estimation could outperform the CLSE in

channels where the loss due to the transmission of spatially-white data is not too

great, i.e., in full column-rank channels. Given the parameters N , L, PT and PD,

the theory developed in this chapter can be used to decide if the CFSB technique

would offer any performance benefits versus the CLSE technique. If the CFSB

technique is to perform comparably or better than the CLSE, two things need to

be satisfied:

1. The estimation performance of CLSE and CFSB should be comparable, i.e.,

the number of white data symbols N and the data power PD should be large

enough to ensure that the estimate us is accurate, so that the resulting vs

can perform comparably to the conventional estimate. For example, since

the channel gain with semi-blind estimation is given by (5.33), N should be

chosen to be of the same order as γd; and both N and γd should be of the

order higher than γp. With such a choice, the (t − 1)/γp term will dominate

the SNR loss in the CFSB, thus enabling the beamformed data with CFSB

estimation to outperform the beamformed data with CLSE.

2. The block length should be sufficiently long to ensure that after sending L+N

symbols, there is sufficient room to send as many beamformed symbols as is

necessary for the CFSB technique to be able to make up for the performance

lost during the white data transmission. In the above example, after having

obtained the appropriate value of N , one can use (5.34) to determine the

loss due to the white data symbols (for the t = 2 case), and then finally

determine whether the block length is long enough for the CFSB to be able


to outperform the CLSE method.

In Section 5.6, we demonstrate through additional simulations that the CFSB tech-

nique does offer performance benefits relative to the CLSE, for an appropriately

designed system.

5.5.3 Semi-blind Estimation: Limitations and Alternative Solutions

The CFSB algorithm requires a sufficiently large number of spatially-

white data (N) to guarantee a near perfect estimate of u1 and this error cannot

be overcome by increasing the white-data SNR. It is therefore desirable to find

an estimation scheme that performs at least as well as the CLSE algorithm, re-

gardless of the value of N and L. Formal fusion of the estimates obtained from

the CLSE and CFSB techniques is difficult, hence we adopt an intuitive approach

and consider a simple weighted linear combination of the estimated beamforming

vectors as follows:

\[
\hat{u}_1 = \frac{\beta_u \gamma_p u_c + \gamma_d u_s}{\left\| \beta_u \gamma_p u_c + \gamma_d u_s \right\|_2}, \qquad
\hat{v}_1 = \frac{\beta_v \gamma_p v_c + \gamma_d v_s}{\left\| \beta_v \gamma_p v_c + \gamma_d v_s \right\|_2}. \tag{5.35}
\]

The above estimates will be referred to as the linear combination semi-blind (LCSB) estimates. The weights $\gamma_p = L P_T/t$ and $\gamma_d = N P_D/t$ are a measure of the accuracy of the vectors estimated from the CLSE and CFSB schemes respectively. The scaling factors $\beta_u$ and $\beta_v$ are introduced because $u_c$ obtained from known training symbols is more reliable than the blind estimate $u_s$ when $L = N$ and $\gamma_p = \gamma_d$. In our simulations, for $t = r = 4$, the choice $\beta_u = \beta_v = 4$ was found to perform well. Analysis of the impact of $\beta_u$ and $\beta_v$ is a topic for future research.
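A minimal Python sketch of the combination in (5.35) is given below. The function name is hypothetical, the vectors are plain lists of complex numbers, and the sketch assumes that the two input estimates have already been brought to a common phase reference (the text does not specify how the phase ambiguity between them is resolved).

```python
import math


def lcsb_combine(x_c, x_s, gamma_p, gamma_d, beta):
    """Weighted linear combination of a CLSE and a CFSB beamforming vector.

    x_c     : conventional (CLSE) estimate, list of complex numbers
    x_s     : semi-blind (CFSB) estimate, same length as x_c
    gamma_p : L * P_T / t, weight of the training-based estimate
    gamma_d : N * P_D / t, weight of the blind estimate
    beta    : extra scaling applied to the training-based estimate
    """
    combined = [beta * gamma_p * a + gamma_d * b for a, b in zip(x_c, x_s)]
    # Normalize by the Euclidean (2-)norm so the result is a unit vector.
    norm = math.sqrt(sum(abs(z) ** 2 for z in combined))
    return [z / norm for z in combined]
```

For example, `lcsb_combine(u_c, u_s, gamma_p, gamma_d, 4.0)` would correspond to the choice $\beta_u = 4$ reported for the $t = r = 4$ simulations.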


5.6 Simulation Results

In this section, we present simulation results to illustrate the performance

of the different estimation schemes. The simulation setup consists of a Rayleigh

flat fading channel with 4 transmit antennas and 4 receive antennas (t = r = 4).


The data (and training) are drawn from a 16-QAM constellation. 10,000 random

instantiations of the channel were used in the averaging.

Measuring the error between singular vectors

In the simulations, $v_1$ and $\hat{v}_1$ are obtained by computing the SVD of two different matrices $H$ and $\hat{H}$ respectively. However, the SVD involves an unknown phase factor, that is, if $v_1$ is a singular vector, so is $v_1 e^{j\phi}$ for any $\phi \in (-\pi, \pi]$. Hence, for computational consistency in measuring the MSE in $v_1$, we use the following dephased norm in our simulations, similar to [66]: $\|v_1 - \hat{v}_1\|_{DN}^2 \triangleq 2\left(1 - \left|v_1^H \hat{v}_1\right|\right)$, which satisfies $\|v_1 - \hat{v}_1\|_{DN}^2 = \min_{\phi \in (-\pi, \pi]} \left\| v_1 - \hat{v}_1 e^{j\phi} \right\|^2$. The norm considered in our analysis is implicitly consistent with the above dephased norm. For example, the norm in (5.14) is the same as the dephased norm, since the perturbation term $\Delta d_1$ is real (as noted in Section 5.3.1). Also, for small additive perturbations, it can easily be shown that (for example) in (5.23), the dephased norm reduces to the Euclidean norm.
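The dephased norm and its minimization property can be illustrated with a short Python sketch. The function names are hypothetical, the vectors are lists of complex numbers assumed to have unit norm, and the brute-force search over $\phi$ is included only to check the closed form on a grid.

```python
import cmath
import math


def dephased_norm_sq(v, v_hat):
    """Dephased squared norm 2 * (1 - |v^H v_hat|) for unit-norm vectors."""
    inner = sum(a.conjugate() * b for a, b in zip(v, v_hat))
    return 2.0 * (1.0 - abs(inner))


def min_over_phase(v, v_hat, steps=3600):
    """Grid search for min over phi in (-pi, pi] of ||v - v_hat * e^{j phi}||^2."""
    best = float("inf")
    for k in range(steps):
        phi = -math.pi + 2.0 * math.pi * (k + 1) / steps
        rot = cmath.exp(1j * phi)
        dist = sum(abs(a - b * rot) ** 2 for a, b in zip(v, v_hat))
        best = min(best, dist)
    return best
```

A vector and any phase-rotated copy of it are at dephased distance zero, which is why this measure is insensitive to the phase ambiguity of the SVD.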

Experiment 1

In this experiment, we compute the MSE of conventional estimation and

the MSE of the semi-blind estimation with perfect u1, which serves as a benchmark

for the performance of the proposed semi-blind scheme. Fig. 5.4 shows the MSE

in v1 versus L, for two different values of pilot SNR (or γp), with perfect u1.

CFSB performs better than the CLSE technique by about 6dB, in terms of the

training symbol SNR for achieving the same MSE in v1. The experimental curves

agree well with the theoretical curves from (5.15), (5.24). Also, the results for

the performance of the semi-blind OPML technique proposed in [55] are plotted

in Fig. 5.4. In the OPML technique, the channel matrix H is factored into the

product of a whitening matrix W (= UΣ) and a unitary rotation matrix Q. A

blind algorithm is used to estimate W , while the training data is used exclusively

to estimate Q. Thus, the OPML technique outperforms the CFSB because it


assumes perfect knowledge of the entire U and Σ matrices (and is computationally

more expensive). The CFSB technique, on the other hand, only needs an accurate

estimate of u1 from the spatially-white data.

Experiment 2

Next, we relax the perfect u1 assumption. Fig. 5.5 shows the SER performance of the CLSE, OPML and the CFSB schemes at two different values of

N , as well as the N = ∞ (perfect knowledge of U) case. At N = 50 white data

symbols, the CLSE technique outperforms the CFSB for L ≥ 24, as the error in u1

dominates the error in the semi-blind technique. As white data length increases,

the CFSB performs progressively better than the CLSE. Also, in the presence of

a finite number (N) of white data, the CFSB marginally outperforms the OPML

scheme as CFSB only requires an accurate estimate of the dominant eigenvector

u1 from the white data. In Fig. 5.6, we plot both the theoretical and experimental

curves for the CFSB scheme when N = 100, as well as the simulation result for

the LCSB scheme defined in Section 5.5.3. The LCSB outperforms the CLSE

and the CFSB technique at both N = 50 and N = 100. Thus, the theory devel-

oped in this chapter can be used to compare the performance of CFSB and CLSE

techniques for any choice of N and L.

Experiment 3

Finally, as an example of overall performance comparison, Fig. 5.7 shows

the SER performance versus the data SNR of the different estimation schemes for a

2× 2 system, with uncoded 4-QAM transmission, L = 2 training symbols, N = 16

white data symbols (for the semi-blind technique) and a frame size Ld = 500

symbols. The parameter values are chosen for illustrative purposes, and as L and

PT increase, the gap between the CLSE and CFSB reduces. From the graph, it is

clear that the LCSB scheme outperforms the CLSE scheme in terms of its SER

performance, including the effect of white data transmission.


5.7 Conclusion

In this chapter, we have investigated training-only and semi-blind channel

estimation for MIMO flat-fading channels with MRT, in terms of the MSE in the

beamforming vector v1, received SNR and the SER with uncoded M-ary QAM

modulation. The CFSB scheme is proposed as a closed-form semi-blind solution

for estimating the optimum transmit beamforming vector v1, and is shown to

achieve the CRB with the perfect u1 assumption. Analytical expressions for the

MSE, the channel power gain and the SER performance of both the CLSE and

the CFSB estimation schemes are developed, which can be used to compare their

performance. A novel LCSB algorithm is proposed, which is shown to outperform

both the CFSB and the CLSE schemes over a wide range of training lengths and

SNRs. We have also presented Monte-Carlo simulation results to illustrate the

relative performance of the different techniques.

Acknowledgement

The text of this chapter, in part, is a reprint of the material as it appears in A. K. Jagannatham, C. R. Murthy and B. D. Rao, “A Semi-Blind Channel Estimation Scheme for MRT”, Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005 (ICASSP ’05), Mar. 2005, Vol. 3, Pages: 585 - 588.


Appendix for Chapter 5

5.7.1 Proof of Lemma 1:

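Let $\bar{Y}_p \triangleq \frac{u_1^H Y_p}{\sigma_1 \gamma_p}$, $\bar{X}_p \triangleq \frac{X_p}{\gamma_p}$, and $\bar{n} \triangleq \frac{u_1^H \eta_p}{\sigma_1 \gamma_p}$. Then, since the training sequence is orthogonal, $\bar{X}_p \bar{X}_p^H = I_t/\gamma_p$ holds. Substituting into (5.5), we have

\[
\bar{Y}_p = v_1^H \bar{X}_p + \bar{n}. \tag{5.36}
\]

Thus, we seek the estimate of $v_1$ as the solution to the following least squares problem:

\[
v_s = \arg \min_{v \in \mathbb{C}^t,\ \|v\| = 1} \left\| \bar{Y}_p - v^H \bar{X}_p \right\|^2. \tag{5.37}
\]

Note that

\[
\begin{aligned}
\arg \min_{v_1:\ \|v_1\| = 1} \left\| \bar{Y}_p - v_1^H \bar{X}_p \right\|^2
&= \arg \min_{v_1:\ \|v_1\| = 1} \left( \bar{Y}_p \bar{Y}_p^H + \frac{\|v_1\|^2}{\gamma_p} - \bar{Y}_p \bar{X}_p^H v_1 - v_1^H \bar{X}_p \bar{Y}_p^H \right) \\
&= \arg \max_{v_1:\ \|v_1\| = 1} \left( \bar{Y}_p \bar{X}_p^H v_1 + v_1^H \bar{X}_p \bar{Y}_p^H \right).
\end{aligned}
\]

The $v_1$ that maximizes the above expression is readily found to be $v_1 = \bar{X}_p \bar{Y}_p^H / \left\| \bar{X}_p \bar{Y}_p^H \right\|$. Substituting for $\bar{X}_p$ and $\bar{Y}_p$, the desired result is obtained.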

5.7.2 Received SNR with perfect us

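Here, we derive the expression in (5.26). For notational simplicity, define $x \triangleq v_1^H E_u u_1$ and $y \triangleq u_1^H E_u^H E_u u_1$. Then, we have

\[
\begin{aligned}
\rho_u &= E\left\{ \frac{\sigma_1^2 (1 + x)(1 + x^*)}{1 + x + x^* + y} \right\} \\
&\simeq \sigma_1^2 \, E\left\{ (1 + x)(1 + x^*) \left( 1 - (x + x^* + y) + (x + x^* + y)^2 \right) \right\} \\
&\simeq \sigma_1^2 \left( 1 + E\{x x^* - y\} \right),
\end{aligned} \tag{5.38}
\]

where $x^*$ is the complex conjugate of $x$. Also, $E\{x x^*\} = \frac{\sigma_p^2}{\sigma_1^2} = \frac{1}{\gamma_p \sigma_1^2}$, and $E\{y\} = E\{u_1^H E_u^H E_u u_1\} = \frac{t}{\gamma_p \sigma_1^2}$. Thus, the power amplification for perfect $u_1$ is given by $\rho_u = \sigma_1^2 - \frac{t - 1}{\gamma_p}$.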


5.7.3 Proof for equations (5.28) and (5.29)

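In order to derive an expression for $\bar{c}_1$, we write $c = [1 + \Delta c_1, \Delta c_2, \ldots, \Delta c_r]^T$ as a perturbation of $[1, 0, \ldots, 0]^T$. Since $\bar{c} = \frac{\Sigma c}{\sqrt{c^H \Sigma^2 c}}$, equating components, we have

\[
\begin{aligned}
\bar{c}_1 &= \frac{\sigma_1 (1 + \Delta c_1)}{\sqrt{\sigma_1^2 |1 + \Delta c_1|^2 + \sum_{i=2}^{r} \sigma_i^2 |\Delta c_i|^2}} \\
&\simeq (1 + \Delta c_1) \left[ 1 - \frac{1}{2} \left( 2 \Delta c_1 + \sum_{i=2}^{r} \frac{\sigma_i^2}{\sigma_1^2} |\Delta c_i|^2 \right) \right] \\
&\simeq 1 - \frac{1}{2} \sum_{i=2}^{r} \frac{\sigma_i^2}{\sigma_1^2} |\Delta c_i|^2.
\end{aligned}
\]

Substituting in (5.27), we get

\[
\|v_1 - v_s\|^2 = \sum_{i=2}^{r} \frac{\sigma_i^2}{\sigma_1^2} |\Delta c_i|^2. \tag{5.39}
\]

It now remains to compute $\Delta c_i$. Recall that $u_s$ is computed from the SVD in (5.4). Stacking the transmitted and received data vectors into matrices $X_d \in \mathbb{C}^{t \times N}$ and $Y_d \in \mathbb{C}^{r \times N}$ and the noise vectors into $\eta_d \in \mathbb{C}^{r \times N}$, with appropriate scaling we can rewrite (5.4) as

\[
\hat{U} \hat{\Sigma}^2 \hat{U}^H = H H^H + E_s,
\]

where

\[
E_s \triangleq H E_\chi H^H + H E_{\chi\eta} + E_{\chi\eta}^H H^H + E_\eta,
\]

and $E_\chi \triangleq \frac{1}{\gamma_d} \left( X_d X_d^H - \gamma_d I_t \right)$, $E_{\chi\eta} \triangleq \frac{X_d \eta_d^H}{\gamma_d}$, $E_\eta \triangleq \frac{1}{\gamma_d} \left( \eta_d \eta_d^H - N I \right)$, and finally $\gamma_d = \frac{N P_D}{t}$, as before.

Observe that, since the white data $X_d$ and AWGN are mutually independent, the elements of $E_\chi$, $E_{\chi\eta}$ and $E_\eta$ are pairwise uncorrelated. Also, $E|E_\chi(i, j)|^2 = \left(\frac{P_D}{t}\right)^2 \Big/ \left( N \left(\frac{P_D}{t}\right)^2 \right) = 1/N$, $E|E_{\chi\eta}(i, j)|^2 = \left(\frac{P_D}{t}\right) \Big/ \left( N \left(\frac{P_D}{t}\right)^2 \right) = 1/\gamma_d$, and $E|E_\eta(i, j)|^2 = 1 \Big/ \left( N \left(\frac{P_D}{t}\right)^2 \right) = N/\gamma_d^2$. Thus, from the first order perturbation analysis (5.8), $\Delta c_i = \frac{u_i^H E_s u_1}{\sigma_1^2 - \sigma_i^2}$, and therefore

\[
E|\Delta c_i|^2 = \frac{1}{(\sigma_1^2 - \sigma_i^2)^2} \left( E\left|u_i^H H E_\chi H^H u_1\right|^2 + E\left|u_i^H H E_{\chi\eta} u_1\right|^2 + E\left|u_i^H E_{\chi\eta}^H H^H u_1\right|^2 + E\left|u_i^H E_\eta u_1\right|^2 \right). \tag{5.40}
\]

Simplifying the different components in the above expression, we have

\[
E\left|u_i^H H E_\chi H^H u_1\right|^2 = \sigma_1^2 \sigma_i^2/N, \qquad E\left|u_i^H E_\eta u_1\right|^2 = N/\gamma_d^2,
\]

and

\[
E\left|u_i^H H E_{\chi\eta} u_1\right|^2 = \sigma_i^2/\gamma_d, \qquad E\left|u_i^H E_{\chi\eta}^H H^H u_1\right|^2 = \sigma_1^2/\gamma_d,
\]

where the last term follows in the same way as the one before it, with the roles of $\sigma_i$ and $\sigma_1$ interchanged. Substituting into (5.40), we get (5.29).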

5.7.4 Performance of Alamouti Space-Time Coded Data with Conventional Estimation

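In this section, we determine the performance of Alamouti space-time coded data for a general $r \times 2$ matrix channel with estimation error and a zero-forcing receiver. Similar results for other specific cases can be found in [67], [68]. Denote the $r \times 2$ channel matrix $H$ in terms of its columns as $H = [h_1, h_2]$. Also, let the $2 \times L$ orthogonal training symbol matrix $X_p$ be defined in terms of its rows as $X_p = [X_{p1}^T, X_{p2}^T]^T$. Thus, from (5.3), the channel is estimated conventionally as

\[
H_c = \frac{1}{\gamma_p} \left[ Y_p X_{p1}^H,\ Y_p X_{p2}^H \right] \triangleq \left[ \hat{h}_1,\ \hat{h}_2 \right] = \left[ h_1 + \frac{\eta_p X_{p1}^H}{\gamma_p},\ h_2 + \frac{\eta_p X_{p2}^H}{\gamma_p} \right]. \tag{5.41}
\]

The effective channel with Alamouti-coded data transmission can be represented by stacking two consecutively received $r \times 1$ vectors $y_1$ and $y_2^*$ vertically as follows:

\[
\begin{bmatrix} y_1 \\ y_2^* \end{bmatrix} =
\begin{bmatrix} h_1 & h_2 \\ -h_2^* & h_1^* \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2^* \end{bmatrix} +
\begin{bmatrix} \eta_{w1} \\ \eta_{w2}^* \end{bmatrix}, \tag{5.42}
\]

where $\eta_{wi}$, $i = 1, 2$ is the AWGN affecting the white data symbols. When a zero-forcing receiver based on the estimated channel is employed, the received vectors are decoded using $[\hat{h}_1, \hat{h}_2]$ as

\[
\begin{bmatrix} \hat{x}_1 \\ \hat{x}_2^* \end{bmatrix} =
\begin{bmatrix} \hat{h}_1^H & -\hat{h}_2^T \\ \hat{h}_2^H & \hat{h}_1^T \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2^* \end{bmatrix}. \tag{5.43}
\]

It is clear from symmetry that the performance of $x_1$ and $x_2$ will be the same; hence, we can focus on determining the SER performance of $x_1$. Now, $\hat{x}_1$ contains three components: the signal component coming from $x_1$, a leakage term coming from the symbol $x_2$, and the noise term coming from the white noise term $\eta_w$, as follows:

\[
\hat{x}_1 = \underbrace{\left( \hat{h}_1^H h_1 + h_2^H \hat{h}_2 \right)}_{\xi_{x1}} x_1 + \underbrace{\left( \hat{h}_1^H h_2 - h_1^H \hat{h}_2 \right)}_{\xi_{x2}} x_2^* + \underbrace{\hat{h}_1^H \eta_{w1} - \eta_{w2}^H \hat{h}_2}_{\xi_n}. \tag{5.44}
\]

The coefficient of the $x_1$ term, denoted $\xi_{x1}$, is

\[
\xi_{x1} = \left( h_1 + \frac{\eta_p X_{p1}^H}{\gamma_p} \right)^H h_1 + h_2^H \left( h_2 + \frac{\eta_p X_{p2}^H}{\gamma_p} \right), \tag{5.45}
\]

\[
\phantom{\xi_{x1}} = \|H\|_F^2 + \frac{X_{p1} \eta_p^H h_1 + h_2^H \eta_p X_{p2}^H}{\gamma_p}. \tag{5.46}
\]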

From the above equation, it is clear that the performance of the $x_1$ symbol is dependent on the training noise instantiation $\eta_p$. However, we can consider the average power gain, averaged over the training noise, as follows:

\[
E|\xi_{x1}|^2 = \|H\|_F^4 + \frac{1}{\gamma_p^2} E\left\{ X_{p1} \eta_p^H h_1 h_1^H \eta_p X_{p1}^H + h_2^H \eta_p X_{p2}^H X_{p2} \eta_p^H h_2 \right\}, \tag{5.47}
\]

\[
\phantom{E|\xi_{x1}|^2} = \|H\|_F^4 + \frac{1}{\gamma_p^2} \left( \gamma_p \|h_1\|^2 + \gamma_p \|h_2\|^2 \right), \tag{5.48}
\]

\[
\phantom{E|\xi_{x1}|^2} = \|H\|_F^4 + \frac{\|H\|_F^2}{\gamma_p}, \tag{5.49}
\]

where, in (5.47), the cross terms disappear since the noise $\eta_p$ is zero-mean and due to the orthogonality of the training $X_p$. Similarly, the coefficient of the $x_2^*$ term, denoted $\xi_{x2}$, can be simplified as

\[
\xi_{x2} = \frac{X_{p1} \eta_p^H h_2 - h_1^H \eta_p X_{p2}^H}{\gamma_p}. \tag{5.50}
\]

We will assume for simplicity that the $x_2$ term is an additive white Gaussian noise impairing the estimation of $x_1$, i.e., we do not perform joint detection. This noise term is independent of the AWGN component $\eta_w$. Similar to the coefficient of $x_1$, we can consider the average power gain of the $x_2$ term, which can be obtained after a little manipulation as

\[
E|\xi_{x2}|^2 = \frac{\|H\|_F^2}{\gamma_p}. \tag{5.51}
\]

Finally, the noise term, denoted $\xi_n$, is

\[
\xi_n = h_1^H \eta_{w1} - \eta_{w2}^H h_2 + \frac{X_{p1} \eta_p^H \eta_{w1} - \eta_{w2}^H \eta_p X_{p2}^H}{\gamma_p}, \tag{5.52}
\]

from which we can obtain the noise power as

\[
E|\xi_n|^2 = \|H\|_F^2 + \frac{2r}{\gamma_p}. \tag{5.53}
\]

Thus, the SNR for detection of a white data symbol is given by

\[
\rho_w = \frac{\left( \|H\|_F^4 + \frac{\|H\|_F^2}{\gamma_p} \right) P_x}{\frac{\|H\|_F^2}{\gamma_p} P_x + \|H\|_F^2 + \frac{2r}{\gamma_p}}. \tag{5.54}
\]

5.7.5 Other Useful Lemmas:

In this section, we present three useful lemmas without proof for the sake

of brevity.

Lemma 4. Let $X_p \in \mathbb{C}^{t \times L}$ be an orthogonal set of vectors (i.e., $X_p X_p^H = \gamma_p I_t$), and let $\eta_p \in \mathbb{C}^{r \times L}$ contain i.i.d. ZMCSCG entries with mean $\mu = 0$ and variance $\sigma_n^2 = 1$. Then, the elements of $E_p = X_p \eta_p^H$ are uncorrelated, and the variance of each element of $E_p$ is $\sigma_p^2 = \gamma_p$.

Lemma 5. A transformation of $E_p$ (defined in Lemma 4) by any orthogonal matrix $V \in \mathbb{C}^{t \times t}$ (i.e., $V V^H = V^H V = I_t$) to get $\tilde{E} = V E_p$ leaves the second order statistics of $E_p$ unaltered, that is,

\[
E\{\tilde{E}(i, j)\} = E\{E_p(i, j)\} = 0,
\]

\[
E\{\tilde{E}(i, j) \tilde{E}^*(k, l)\} = E\{E_p(i, j) E_p^*(k, l)\} = \sigma_p^2 \, \delta(i - k, j - l), \quad \forall\ i, j, k, l,
\]

where $\delta(p, q) = 1$ when $p = q = 0$, and $0$ otherwise.

Lemma 6. If the random vector $X_p \in \mathbb{C}^{t \times L}$ has zero-mean circularly symmetric i.i.d. entries, then so does $v^H X_p$, where $v \in \mathbb{C}^{t \times 1}$. Further, if $v$ satisfies $\|v\| = 1$, then the variance of an element of $X_p$ is the same as that of $v^H X_p$.


[Figure 5.1: block diagram. Blocks: transmit beamforming ($\|w\| = 1$), channel $H$ ($r \times t$), receive beamforming ($\|z\| = 1$), decoding / decision.]

Figure 5.1: MIMO system model, with beamforming at the transmitter and receiver.

[Figure 5.2: frame structure diagram. Conventional: Training, Beamformed Data. Semi-blind: Training, White Data, Beamformed Data. Annotations mark the points at which $u_1$ and $v_1$ are estimated.]

Figure 5.2: Comparison of the transmission scheme for conventional least squares (CLSE) and closed-form semi-blind (CFSB) estimation.

[Figure 5.3: plot. Horizontal axis: Pilot SNR (dB). Vertical axis: Power Amplification (dB). Legend: Perfect u1, v1; CFSB + perfect u1; CFSB + estimated u1; CLSE + beamforming; CLSE + Alamouti, exp; CLSE + Alamouti, theory.]

Figure 5.3: Average channel gain of a t = r = 2 MIMO channel with L = 2, N = 8 and PD = 6dB, for the CLSE and beamforming, CFSB and beamforming (with and without knowledge of u1), CLSE and white data (Alamouti-coded), and perfect beamforming at transmitter and receiver. Also plotted is the theoretical result for the performance of Alamouti-coded data with channel estimation error, given by (5.34).

[Figure 5.4: plot. Horizontal axis: Pilot Length (L). Vertical axis: MSE in v1 (logarithmic scale). Legend: CLSE-Theory; CLSE; CFSB-Theory; CFSB, perfect u1; OPML, perfect U. Curve groups are annotated with pilot SNR = 2dB and pilot SNR = 10dB.]

Figure 5.4: MSE in v1 vs training data length L, for a t = r = 4 MIMO system. Curves for CLSE, CFSB and OPML with perfect u1 are plotted. The top five curves correspond to a training symbol SNR of 2dB, and the bottom five curves 10dB.

10 20 30 40 50 60

10−2

SE

R

num pilot

CLSE−expOPML, N=50CFSB, N=50CFSB−u1OPML−UPerf−bf

10 20 30 40 50 60

10−2

SE

R

num pilot

CLSE−expOPML, N=100CFSB, N=100CFSB−u1,theoryCFSB−u1Perf−bf

Figure 5.5: SER of beamformed-data vs number of training symbols L, t = r = 4

system, for two different values of white-data length N , and data and training

symbol SNR fixed at PT = PD = 6dB. The two competing semi-blind techniques,

OPML and CFSB, are plotted. CFSB marginally outperforms OPML for N = 50,

as it only requires an accurate estimate of u1 from the blind data.

[Plots: SER versus number of pilot symbols, two panels. First panel curves: CLSE-exp; CLSE-theory; CFSB-exp, N=50; LCSB, N=50; Perf-bf. Second panel curves: CLSE, exp; CFSB-theory, N=100; CFSB-exp, N=100; Perf-bf.]

Figure 5.6: SER vs L, t = r = 4 system, for two different values of N, and data and training symbol SNR fixed at P_T = P_D = 6 dB. The theoretical and experimental curves are plotted for the CFSB estimation technique. Also, the LCSB technique outperforms both the conventional (CLSE) and semi-blind (CFSB) techniques.

[Plot: SER versus data SNR (dB). Curves: CLSE-Alamouti; CLSE-bf; CFSB; CFSB-u1; LCSB; Perf-bf.]

Figure 5.7: SER versus data SNR for the t = r = 2 system, with L = 2, N = 16, γ_p = 2 dB. 'CLSE-Alamouti' refers to the performance of the spatially-white data with conventional estimation, 'CLSE-bf' is the performance of the beamformed data with v_c, 'CFSB' and 'LCSB' refer to the performance of the corresponding techniques after accounting for the loss due to the white data. 'CFSB-u1' is the performance of CFSB with perfect u1, and 'Perf-bf' is the performance with the perfect u1 and v1 assumption.

6 Superimposed Pilots for MIMO Channel Estimation

Until the recent past, blind schemes [8, 42] have been the only alternative for estimating a channel without sacrificing bandwidth. Frequently, such schemes cannot estimate the channel completely and leave a residual indeterminate 'phase factor', such as a complex phase for SIMO [44] and a unitary matrix [46] for MIMO systems. Further, the optimization algorithms are often complex, involving second and higher order statistics, and frequently result in sub-optimal performance from convergence to local minima. Recent advances in signal processing have suggested an innovative scheme for channel estimation with superimposed pilot (SP) symbols, also termed hidden or embedded pilots. SP based schemes employ additional power to transmit a repetitive sequence of pilot symbols superimposed over the information bearing data symbols and hence do not sacrifice bandwidth by exclusively transmitting pilots. Further, since they employ first order (mean) statistics, in contrast to blind algorithms which traditionally employ second and higher order statistics, they result in simple algorithms which obviate convergence problems. Thus, they offer the attractive benefit of bandwidth efficiency with moderate computational complexity. Early research on such channel estimation schemes has been reported in [69, 70]. Alternative schemes for SP based estimation have been explored in [71–73] and [74]. SP based channel estimation for OFDM systems is discussed in [75].

In this chapter, we derive the true Cramer-Rao Bounds (CRB) for SP based estimation, where only approximate bounds have been derived previously in the literature [71]. We demonstrate that the simple first-order statistic (mean) based estimation scheme proposed in works such as [71, 72] has a sub-optimal estimation performance compared to the CRB. Hence, we propose a semi-blind SP estimation scheme which asymptotically achieves the CRB [30, 55], thus improving the estimation performance over existing schemes. In the SIMO context, this estimate can be seen to have an asymptotic MSE that is 3 dB lower than that of the mean based estimate. Another aspect of our work is the development of a framework for the throughput performance analysis of SP. A similar study has been presented in [76, 77] for frequency selective single-input multiple-output (SIMO) channels. A novel contribution of our work is the derivation of an expression for the capacity lower bound of channels with source-noise correlation, which we use to analyze the throughput performance of SP. This framework is more general and can be used to characterize the throughput performance of any estimator, and is not limited to the MMSE estimate as in [39, 76]. Specifically, we focus on the maximum-likelihood (ML) estimate, which is commonly employed in practice. This expression for worst case capacity is also utilized to demonstrate the throughput gains of SP over a system employing conventional pilots (CP). From simulation studies employing this framework, SP can be seen to potentially outperform CP in terms of overall system throughput, especially in scenarios where the block length is small so that repeated transmission of pilot symbols results in a significant bandwidth overhead. This typically arises in ad hoc and sensor networks, where the information is communicated in short bursts over a large number of channels. It also arises in mobile wireless scenarios where a short coherence time renders repeated training inefficient.

Figure 6.1: Schematic of a Superimposed Pilot System.

Further, unlike in CP based estimation, where the channel estimate is independent of the data SNR and depends only on the pilot-to-noise ratio (PNR), in SP estimation the transmitted data has a corrupting influence on the channel estimate. This aspect has been considered in the studies in [72, 78]. Our work further addresses the resulting problem of optimal pilot-source power allocation for a fixed total transmit power, based on maximizing the post-processing SNR (PSNR) for a Capon-like receive beamformer.

The rest of the chapter is organized as follows. In the next section we

formulate the problem. We derive the expressions for SP estimation in section

(6.2) and present the CRB analysis for SP in section (6.2.1). The expression

for worst case capacity with correlation is presented in section (6.3) followed by

performance comparison with CP in section (6.3.2). The expressions for optimum

power allocation are derived in section (6.4). We present results from simulation

studies in section (6.5) and conclusions in the end.

6.1 Superimposed Pilots (SP) Based MIMO Estimation

Consider a multiple-input multiple-output (MIMO) wireless system with $r$ receive antennas, $t$ transmit antennas and $r \geq t$, i.e. at least as many receive antennas as transmit antennas. Let $\mathbf{H} = [\mathbf{h}_1, \mathbf{h}_2, \ldots, \mathbf{h}_t] \in \mathbb{C}^{r \times t}$ denote the flat-fading MIMO channel, where $\mathbf{h}_j \triangleq [h_{1j}, h_{2j}, \ldots, h_{rj}]^T$ represents the vector of complex fading coefficients between the $j$th transmit antenna and the receiver. The equivalent discrete-time baseband MIMO system model after matched filtering and sampling is given as,

$$\mathbf{y}(k) = \mathbf{H}\mathbf{x}(k) + \boldsymbol{\eta}(k), \quad 1 \leq k \leq N_b \tag{6.1}$$

where the index $k$ denotes the time instant and $\mathbf{y}(k) \in \mathbb{C}^{r \times 1}$, $\mathbf{x}(k) \in \mathbb{C}^{t \times 1}$ denote the $k$th received and transmitted symbol vectors respectively. The vector $\boldsymbol{\eta}(k) \in \mathbb{C}^{r \times 1}$ is complex circularly-symmetric spatio-temporally uncorrelated additive white Gaussian noise (AWGN) of power $\sigma_n^2$, i.e. $\mathrm{E}\{\boldsymbol{\eta}(k)\boldsymbol{\eta}(l)^H\} = \sigma_n^2 \delta(k-l)\mathbf{I}_r$, where $\delta(k) = 1$ if $k = 0$ and $0$ otherwise.

Figure 6.2: Schematic diagram of the superimposed pilot (SP) frame structure.

The SP estimation scheme can be described as follows. Let each frame of contiguous transmitted symbols contain $N_f$ sub-frames of length $L_p$ symbols, where $N_b \triangleq N_f L_p$ denotes the block length. The transmitted data symbols $\mathbf{x}_d^s(k)$ are assumed to be stochastic in nature with $\mathrm{E}\{\mathbf{x}_d^s(k)\} = \mathbf{0}$ and power $P_d^s$, i.e. $\mathrm{E}\{\mathbf{x}_d^s(k)\mathbf{x}_d^s(l)^H\} = P_d^s \delta(k-l)\mathbf{I}_t$. Let $\mathbf{X}_d \triangleq [\mathbf{x}_d^s(1), \mathbf{x}_d^s(2), \ldots, \mathbf{x}_d^s(N_b)] \in \mathbb{C}^{t \times N_b}$ be the transmitted information symbol sequence. Each sub-frame consists of independent data symbols with the pilot sequence $\mathbf{X}_p \in \mathbb{C}^{t \times L_p}$, defined as $\mathbf{X}_p \triangleq [\mathbf{x}_p(1), \mathbf{x}_p(2), \ldots, \mathbf{x}_p(L_p)]$, of length $L_p$ symbols and pilot power $P_t^s$ superimposed over the data symbols, i.e. $\mathrm{tr}(\mathbf{X}_p\mathbf{X}_p^H) = t P_t^s L_p$. Also, let $\rho_d^s \triangleq P_d^s/\sigma_n^2$ and $\rho_t^s \triangleq P_t^s/\sigma_n^2$ be the signal-to-noise power ratio (SNR) and pilot-to-noise power ratio (PNR) respectively. A schematic diagram of this SP frame structure is given in fig. (6.2). The actual transmitted symbol at the $k$th instant, $\mathbf{x}_s(k)$, is therefore given as,

$$\mathbf{x}_s(k) \triangleq \mathbf{x}_d^s(k) + \mathbf{x}_p^s(k) = \mathbf{x}_d^s(k) + \mathbf{x}_p\left(\mathrm{mod}(k-1, L_p) + 1\right). \tag{6.2}$$

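The SP system model has the form,

$$\mathbf{y}_s(k) = \mathbf{H}\underbrace{\left(\mathbf{x}_d^s(k) + \mathbf{x}_p\left(\mathrm{mod}(k-1, L_p) + 1\right)\right)}_{\mathbf{x}_s(k)} + \boldsymbol{\eta}(k), \quad 1 \leq k \leq N_b, \tag{6.3}$$

where $\mathbf{y}_s(k)$, $\mathbf{x}_s(k)$ are the $k$th received and transmitted symbol vectors respectively. We employ a scheme similar to the ones suggested in [69, 71, 73] to estimate the channel $\mathbf{H}$, which is described below. Let $\bar{\mathbf{y}}_s(k) \in \mathbb{C}^{r \times 1}$, $1 \leq k \leq L_p$, be defined as,

$$\bar{\mathbf{y}}_s(k) \triangleq \frac{1}{N_f}\sum_{j=0}^{N_f-1} \mathbf{y}_s(k + jL_p), \quad 1 \leq k \leq L_p. \tag{6.4}$$

Let $\bar{\mathbf{Y}}_s \triangleq [\bar{\mathbf{y}}_s(1), \bar{\mathbf{y}}_s(2), \ldots, \bar{\mathbf{y}}_s(L_p)] \in \mathbb{C}^{r \times L_p}$ be a stacking of the processed received symbol vectors. Statistically, $\mathrm{E}\{\bar{\mathbf{Y}}_s\} = \mathbf{H}\mathbf{X}_p$. The channel estimate $\hat{\mathbf{H}}_s$ is now computed by the standard least squares procedure [15] as,

$$\hat{\mathbf{H}}_s = \bar{\mathbf{Y}}_s\mathbf{X}_p^{\dagger} = \bar{\mathbf{Y}}_s\mathbf{X}_p^H\left(\mathbf{X}_p\mathbf{X}_p^H\right)^{-1} = \mathbf{Y}_s\left(\mathbf{X}_p^s\right)^{\dagger}, \tag{6.5}$$

where $\mathbf{X}_p^s \triangleq [\mathbf{X}_p, \mathbf{X}_p, \ldots, \mathbf{X}_p] \in \mathbb{C}^{t \times N_f L_p}$ is the superimposed pilot signal and $\mathbf{Y}_s \triangleq [\mathbf{y}_s(1), \ldots, \mathbf{y}_s(N_b)]$ collects all $N_b$ received vectors. We refer to the above estimate as the mean-estimate for superimposed pilots, as it employs the mean of the received signal. Since it is based only on the first order statistics of $\mathbf{Y}_s$, it converges faster (compared to estimates based on second and higher order statistics) while having a low complexity of implementation. The estimate $\hat{\mathbf{H}}_s$ is then used for detection of the transmitted data $\mathbf{x}_d^s(k)$ after removing the superimposed pilot symbol $\mathbf{x}_p\left(\mathrm{mod}(k-1, L_p) + 1\right)$.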
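The averaging step (6.4) and the least squares step (6.5) can be sketched numerically as follows. This is a minimal illustration and not the dissertation's simulation code: the helper `dft_pilot`, the Gaussian data source, and all numerical values are assumptions chosen for the example.

```python
import numpy as np

def dft_pilot(t, Lp, Pt):
    """Hypothetical helper: t x Lp pilot with Xp Xp^H = Pt * Lp * I_t (needs t <= Lp)."""
    rows = np.arange(t)[:, None]
    cols = np.arange(Lp)[None, :]
    return np.sqrt(Pt) * np.exp(-2j * np.pi * rows * cols / Lp)

def sp_mean_estimate(Y, Xp):
    """Mean-based SP channel estimate, following (6.4)-(6.5).

    Y  : r x Nb received block, with Nb = Nf * Lp
    Xp : t x Lp pilot matrix for one sub-frame
    """
    r, Nb = Y.shape
    Lp = Xp.shape[1]
    if Nb % Lp != 0:
        raise ValueError("block length must be a multiple of Lp")
    Nf = Nb // Lp
    # (6.4): average over the Nf sub-frames; column k collects y(k + j*Lp).
    Y_bar = Y.reshape(r, Nf, Lp).mean(axis=1)
    # (6.5): least squares via the right pseudo-inverse of Xp.
    return Y_bar @ Xp.conj().T @ np.linalg.inv(Xp @ Xp.conj().T)

# Illustrative parameters (assumed, not taken from the text).
rng = np.random.default_rng(0)
r, t, Lp, Nf = 4, 4, 8, 10
Nb = Nf * Lp
Pd, Pt, sigma2 = 1.0, 1.0, 0.1

H = (rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))) / np.sqrt(2)
Xp = dft_pilot(t, Lp, Pt)
Xs_p = np.tile(Xp, (1, Nf))                       # repeated pilot, t x Nb

# Zero-mean Gaussian data of power Pd and AWGN of power sigma2.
Xd = np.sqrt(Pd / 2) * (rng.standard_normal((t, Nb)) + 1j * rng.standard_normal((t, Nb)))
noise = np.sqrt(sigma2 / 2) * (rng.standard_normal((r, Nb)) + 1j * rng.standard_normal((r, Nb)))

H_hat = sp_mean_estimate(H @ (Xd + Xs_p) + noise, Xp)
H_clean = sp_mean_estimate(H @ Xs_p, Xp)          # no data, no noise: exact recovery
sq_err = np.linalg.norm(H_hat - H, "fro") ** 2
```

With neither data nor noise present the estimate reproduces the channel exactly, which is a quick sanity check of the averaging and pseudo-inverse steps; with data present, `sq_err` reflects the data-induced estimation error discussed in the next section.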

6.2 MSE of Estimation

In this section, we first compute the MSE of estimation for the SP estimator given in (6.5). Following this, we present the Cramer-Rao bound (CRB) for SP based estimation, which yields a lower asymptotic MSE than that obtained for the estimator in (6.5), since the mean-estimate ignores the channel information available in the second-order statistics (source covariance). We demonstrate for a SIMO channel that this asymptotic MSE bound for SP is 3 dB lower than that achieved by the simple mean estimator, and develop a semi-blind MIMO estimation scheme that achieves this bound at high SNR ($\rho_d^s$).

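From equation (6.4) above, the quantity $\bar{\mathbf{Y}}_s$ is given as $\bar{\mathbf{Y}}_s = \mathbf{H}\mathbf{X}_p + \mathbf{H}\bar{\mathbf{X}}_d^s + \bar{\mathbf{N}}$, where $\bar{\mathbf{X}}_d^s$ and $\bar{\mathbf{N}}$ are defined analogously for $\mathbf{x}_d^s(k)$, $\boldsymbol{\eta}(k)$, $1 \leq k \leq N_b$ as in (6.4). Simplifying the expression for the SP estimate given in (6.5), the quantity $\hat{\mathbf{H}}_s$ can be seen to be given as,

$$\hat{\mathbf{H}}_s = \mathbf{H} + \mathbf{H}\bar{\mathbf{X}}_d^s\mathbf{X}_p^H\left(\mathbf{X}_p\mathbf{X}_p^H\right)^{-1} + \bar{\mathbf{N}}\mathbf{X}_p^H\left(\mathbf{X}_p\mathbf{X}_p^H\right)^{-1}. \tag{6.6}$$

The MSE of the mean-estimate for SP, defined as $\mathrm{MSE}_s \triangleq \mathrm{E}\left\{\left\|\hat{\mathbf{H}}_s - \mathbf{H}\right\|_F^2\right\}$, can be simplified as demonstrated in appendix section (6.7.1) to yield,

$$\mathrm{MSE}_s = \mathrm{E}\left\{\mathrm{tr}\left(\left(\hat{\mathbf{H}}_s - \mathbf{H}\right)\left(\hat{\mathbf{H}}_s - \mathbf{H}\right)^H\right)\right\} = \frac{1}{N_f}\left(\mathrm{tr}\left(\mathbf{H}\mathbf{H}^H\right)P_d^s + r\sigma_n^2\right)\mathrm{tr}\left(\left(\mathbf{X}_p\mathbf{X}_p^H\right)^{-1}\right). \tag{6.7}$$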

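The optimal pilot symbol matrix for SP estimation that minimizes the MSE of estimation can be obtained as $\mathbf{X}_p^{\star} = \arg\min \mathrm{MSE}_s = \arg\min \mathrm{tr}\left(\left(\mathbf{X}_p\mathbf{X}_p^H\right)^{-1}\right)$. The following result gives the structure of the optimal pilot matrix $\mathbf{X}_p$.

Lemma 8. For a fixed total pilot power $\mathrm{tr}\left(\mathbf{X}_p\mathbf{X}_p^H\right) = tL_pP_t^s$, the optimal pilot symbol matrix $\mathbf{X}_p^{\star} \in \mathbb{C}^{t \times L_p}$ that minimizes the quantity $\mathrm{MSE}_s$, the MSE of estimation of the MIMO channel $\mathbf{H}$ using superimposed pilots, is given by $\mathbf{X}_p^{\star}$ such that $\mathbf{X}_p^{\star}\left(\mathbf{X}_p^{\star}\right)^H = P_t^sL_p\mathbf{I}_t$, i.e. the pilot symbol matrix $\mathbf{X}_p$ is orthogonal.

Proof. Similar to [79, 80].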
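Lemma 8 can be illustrated with a small numerical comparison. The sketch below evaluates the pilot-dependent factor $\mathrm{tr}\left(\left(\mathbf{X}_p\mathbf{X}_p^H\right)^{-1}\right)$ of (6.7) for an orthogonal pilot and for a non-orthogonal pilot of equal total power. The specific pilot matrices (DFT rows and `X_skew`) are assumptions made for the example; the underlying inequality, that a positive definite $t \times t$ matrix $\mathbf{A}$ satisfies $\mathrm{tr}(\mathbf{A}^{-1}) \geq t^2/\mathrm{tr}(\mathbf{A})$ with equality only when $\mathbf{A}$ is a scaled identity, is a standard fact and is not taken from the proofs cited in [79, 80].

```python
import numpy as np

def pilot_cost(Xp):
    """tr((Xp Xp^H)^{-1}), the pilot-dependent factor of MSE_s in (6.7)."""
    return np.trace(np.linalg.inv(Xp @ Xp.conj().T)).real

t, Lp, Pt = 2, 4, 1.0

# Orthogonal pilot: rows of a DFT matrix, so that Xp Xp^H = Pt * Lp * I_t.
rows = np.arange(t)[:, None]
cols = np.arange(Lp)[None, :]
X_orth = np.sqrt(Pt) * np.exp(-2j * np.pi * rows * cols / Lp)

# Non-orthogonal pilot with the same total power tr(X X^H) = t * Lp * Pt.
X_skew = np.sqrt(Pt) * np.array([[1, 1, 1, 1],
                                 [1, 1, 1, -1]], dtype=complex)

cost_orth = pilot_cost(X_orth)   # attains the lower bound t / (Pt * Lp)
cost_skew = pilot_cost(X_skew)   # strictly larger for correlated rows
```

Both pilots spend the same power, but the correlated rows of `X_skew` inflate the trace of the inverse Gram matrix, and hence the MSE in (6.7).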

In the remainder of the chapter we assume that $\mathbf{X}_p = \mathbf{X}_p^{\star}$, the optimal orthogonal pilot sequence. Thus, the MSE for SP based estimation is given as,

$$\mathrm{MSE}_s = \frac{tP_d^s}{N_bP_t^s}\,\mathrm{tr}\left(\mathbf{H}\mathbf{H}^H\right) + \frac{rt\sigma_n^2}{N_bP_t^s}. \tag{6.8}$$

We wish to look at the dominant MSE term, which is achieved by defining $\mathrm{MSE}_s^{\infty} \in O\left(P_d^s\right)$, the asymptotic MSE of the mean-estimate at high SNR, as,

$$\mathrm{MSE}_s^{\infty} \triangleq P_d^s\left(\lim_{P_d^s \to \infty}\frac{\mathrm{MSE}_s}{P_d^s}\right) = \frac{tP_d^s}{N_bP_t^s}\,\mathrm{tr}\left(\mathbf{H}\mathbf{H}^H\right). \tag{6.9}$$

Hence, $\mathrm{MSE}_s$ can be expressed as $\mathrm{MSE}_s = \mathrm{MSE}_s^{\infty} + o\left(P_d^s\right)$. Ideally, one desires $\mathrm{MSE}_s^{\infty} = 0$, to ensure that the MSE does not increase without bound as the source power $P_d^s$ increases. However, as is seen from above, this is not true of the SP mean-estimate, which is adversely affected as $P_d^s$ increases. Thus, increasing $P_d^s$ can worsen the estimate $\hat{\mathbf{H}}_s$ and in turn result in poor detection performance. We explore the Cramer-Rao bound in the next section to address the issue of optimal MSE performance.
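The growth of the mean-estimate MSE with source power can be made concrete by evaluating (6.8) and (6.9) directly. The following is a minimal sketch; the function names and the identity channel used in the loop are illustrative assumptions.

```python
import numpy as np

def mse_sp_mean(H, Pd, Pt, sigma2, Nb):
    """MSE of the SP mean-estimate with orthogonal pilots, as in (6.8)."""
    r, t = H.shape
    gain = np.linalg.norm(H, "fro") ** 2           # tr(H H^H)
    return t * Pd * gain / (Nb * Pt) + r * t * sigma2 / (Nb * Pt)

def mse_sp_mean_asymptote(H, Pd, Pt, Nb):
    """Dominant high-SNR term of (6.9)."""
    t = H.shape[1]
    return t * Pd * np.linalg.norm(H, "fro") ** 2 / (Nb * Pt)

H = np.eye(2, dtype=complex)                       # illustrative channel, tr(H H^H) = 2
for Pd in (1.0, 10.0, 100.0):
    total = mse_sp_mean(H, Pd, Pt=1.0, sigma2=1.0, Nb=80)
    asym = mse_sp_mean_asymptote(H, Pd, Pt=1.0, Nb=80)
    print(f"Pd = {Pd:6.1f}   MSE_s = {total:.4f}   MSE_s_inf = {asym:.4f}")
```

For this example the gap between the two quantities is the constant noise term $rt\sigma_n^2/(N_bP_t^s)$, while the dominant term scales linearly with $P_d^s$.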

6.2.1 Cramer-Rao Bound (CRB) for SP Estimation

In this section, we compute the complex CRB for the SP based estimation of $\mathbf{H}$. To make the analysis tractable and demonstrate insights into SP estimation, we assume a Gaussian symbol source, i.e. $\mathbf{x}_d^s(k) \sim \mathcal{N}\left(\mathbf{0}, P_d^s\mathbf{I}_t\right)$. It is worth mentioning that the results derived employing this simplification are in close agreement with the performance of a system employing a discrete signal constellation such as quadrature phase-shift keying (QPSK). As suggested in [32] for the construction of CRBs of complex parameters, let the complex parameter vector $\boldsymbol{\theta} \in \mathbb{C}^{2rt \times 1}$ be constructed by stacking the parameter vector $\mathrm{vec}(\mathbf{H})$ and its conjugate as $\boldsymbol{\theta} \triangleq \left[\mathrm{vec}(\mathbf{H})^T, \mathrm{vec}(\mathbf{H}^*)^T\right]^T$. From the SP system model for pilot symbol outputs given in (6.3), the parameter dependent log-likelihood (i.e. the log-likelihood ignoring additive constants) for the estimation of the parameter vector $\boldsymbol{\theta}$ is given as,

$$\mathcal{L}\left(\mathbf{Y}_s \,|\, \mathbf{X}_p^s; \boldsymbol{\theta}\right) = -N_b\ln\left|\mathbf{R}_e\right| - \sum_{i=1}^{N_b}\left(\mathbf{y}_s(i) - \mathbf{H}\mathbf{x}_p^s(i)\right)^H\mathbf{R}_e^{-1}\left(\mathbf{y}_s(i) - \mathbf{H}\mathbf{x}_p^s(i)\right) \tag{6.10}$$

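where $\mathbf{Y}_s \triangleq [\mathbf{y}_s(1), \mathbf{y}_s(2), \ldots, \mathbf{y}_s(N_b)]$, $\mathbf{x}_p^s(i) \triangleq \mathbf{x}_p\left(\mathrm{mod}(i-1, L_p) + 1\right)$ and $\mathbf{R}_e$, the covariance of the effective noise $\mathbf{H}\mathbf{x}_d^s(i) + \boldsymbol{\eta}(i)$, is given as $\mathbf{R}_e \triangleq P_d^s\mathbf{H}\mathbf{H}^H + \sigma_n^2\mathbf{I}_r$. The Cramer-Rao Bound (CRB) for the estimation of $\boldsymbol{\theta}$ is given by the matrix $\mathbf{J}_{\theta}^{-1}$, where $\mathbf{J}_{\theta} \in \mathbb{C}^{2rt \times 2rt}$ is the complex Fisher information matrix (FIM) for the parameter vector $\boldsymbol{\theta} \in \mathbb{C}^{2rt \times 1}$ and is given as,

$$\mathbf{J}_{\theta} = -\mathrm{E}\left\{\frac{\partial^2\mathcal{L}\left(\mathbf{Y}_s \,|\, \mathbf{X}_p^s; \boldsymbol{\theta}\right)}{\partial\boldsymbol{\theta}\,\partial\boldsymbol{\theta}^H}\right\}.$$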

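Therefore, the MSE lower bound for SP based estimation, denoted by $\mathrm{MSE}_b$, is given as $\mathrm{MSE}_b = \mathrm{tr}\left(\mathbf{J}_{\theta}^{-1}\right)$, which is also the asymptotic MSE of a maximum-likelihood (ML) estimator which maximizes the likelihood in (6.10). The SP mean-estimate suggested in [71] and related works, and described in equation (6.5) above, is the ML estimate obtained by ignoring the dependence of the covariance $\mathbf{R}_e$ on $\mathbf{H}$; it reduces to a straightforward LS estimator, i.e.,

$$\hat{\mathbf{H}}_s = \arg\max_{\mathbf{H}} \mathcal{L}\left(\mathbf{Y}_s \,|\, \mathbf{X}_p^s, \mathbf{R}_e; \mathbf{H}\right) = \arg\min_{\mathbf{H}} \sum_{i=1}^{N_b}\left(\mathbf{y}_s(i) - \mathbf{H}\mathbf{x}_p^s(i)\right)^H\mathbf{R}_e^{-1}\left(\mathbf{y}_s(i) - \mathbf{H}\mathbf{x}_p^s(i)\right),$$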

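where $\mathbf{R}_e$ is assumed known. This procedure, although suboptimal, results in a simple estimation algorithm when compared to minimizing the true cost function involving $\mathbf{R}_e(\mathbf{H})$. The FIM corresponding to such an estimation procedure, which exclusively employs the information in the pilots while ignoring the information in the covariance $\mathbf{R}_e$, is given by the Pilot FIM (PFIM) component $\mathbf{J}_{\theta}^p$ of the total FIM $\mathbf{J}_{\theta}$ as,

$$\mathbf{J}_{\theta}^p = -\mathrm{E}\left\{\frac{\partial^2\mathcal{L}\left(\mathbf{Y}_s \,|\, \mathbf{X}_p^s, \mathbf{R}_e; \mathbf{H}\right)}{\partial\boldsymbol{\theta}\,\partial\boldsymbol{\theta}^H}\right\} = \begin{bmatrix} \left(\mathbf{X}_p^s\left(\mathbf{X}_p^s\right)^H\right) \otimes \left(\mathbf{R}_e^{-1}\right)^T & \mathbf{0} \\ \mathbf{0} & \left(\mathbf{X}_p^s\left(\mathbf{X}_p^s\right)^H\right)^T \otimes \mathbf{R}_e^{-1} \end{bmatrix}. \tag{6.11}$$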

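where $\otimes$ denotes the matrix Kronecker product. Hence the MSE for the exclusive pilot based SP estimation of $\mathbf{H}$ is given as,

$$\mathrm{MSE}_b^p = \frac{1}{2}\mathrm{tr}\left(\left(\mathbf{J}_{\theta}^p\right)^{-1}\right) = \mathrm{tr}\left(\left(\mathbf{X}_p^s\left(\mathbf{X}_p^s\right)^H\right)^{-1}\right)\mathrm{tr}\left(\mathbf{R}_e\right) = \mathrm{tr}\left(\mathbf{H}\mathbf{H}^H\right)\frac{P_d^s}{N_bP_t^s}\,t + \frac{\sigma_n^2}{N_bP_t^s}\,rt,$$

which is equal to the MSE for the SP estimate given in section (6.2). The factor $\frac{1}{2}$ in the above expression accounts for the fact that the parameter vector $\boldsymbol{\theta}$ contains both $\mathbf{H}$ and $\mathbf{H}^*$, so that $\mathrm{tr}\left(\left(\mathbf{J}_{\theta}^p\right)^{-1}\right)$ counts the MSE of $\mathbf{H}$ twice.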
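The equality between the PFIM-based MSE and the closed form (6.8) can be checked numerically by building the block-diagonal matrix of (6.11) explicitly. The sketch below assumes an orthogonal DFT-row pilot and arbitrary illustrative parameters; it relies on the Kronecker identity $\mathrm{tr}\left((\mathbf{A} \otimes \mathbf{B})^{-1}\right) = \mathrm{tr}\left(\mathbf{A}^{-1}\right)\mathrm{tr}\left(\mathbf{B}^{-1}\right)$.

```python
import numpy as np

rng = np.random.default_rng(1)
r, t, Lp, Nf = 3, 2, 4, 5
Nb, Pd, Pt, sigma2 = Nf * Lp, 2.0, 1.0, 0.5       # illustrative values

H = (rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))) / np.sqrt(2)
rows = np.arange(t)[:, None]
cols = np.arange(Lp)[None, :]
Xp = np.sqrt(Pt) * np.exp(-2j * np.pi * rows * cols / Lp)   # orthogonal pilot (assumed)
Xs_p = np.tile(Xp, (1, Nf))                                 # t x Nb repeated pilot

Re = Pd * H @ H.conj().T + sigma2 * np.eye(r)     # effective noise covariance R_e
Re_inv = np.linalg.inv(Re)
A = Xs_p @ Xs_p.conj().T                          # equals Nb * Pt * I_t

# Block-diagonal pilot FIM of (6.11).
top = np.kron(A, Re_inv.T)
bottom = np.kron(A.T, Re_inv)
zero = np.zeros_like(top)
J_p = np.block([[top, zero], [zero, bottom]])

mse_pfim = 0.5 * np.trace(np.linalg.inv(J_p)).real
gain = np.linalg.norm(H, "fro") ** 2              # tr(H H^H)
mse_closed_form = t * Pd * gain / (Nb * Pt) + r * t * sigma2 / (Nb * Pt)   # (6.8)
```

The two numbers agree to numerical precision, which confirms that the mean-estimate extracts exactly the pilot component of the Fisher information and nothing from $\mathbf{R}_e$.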

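Thus, the mean based estimate is suboptimal in the sense that it ignores the information in the second order statistics (covariance $\mathbf{R}_e$) while maximizing the likelihood in (6.10). The true FIM corresponding to information in both $\mathbf{X}_p$ and $\mathbf{R}_e$ can be obtained as $\mathbf{J}_{\theta} = \mathbf{J}_{\theta}^p + \mathbf{J}_{\theta}^r$, where the FIM component $\mathbf{J}_{\theta}^r$ corresponds to the information in the covariance matrix $\mathbf{R}_e$. Let the block Toeplitz parameter derivative matrix $\mathbf{E}(i) \in \mathbb{C}^{r \times t}$ be defined employing complex derivatives as $\mathbf{E}(i) \triangleq \frac{\partial\mathbf{H}}{\partial\theta_i}$. The elements of the component $\mathbf{J}_{\theta}^r$ can be seen to be given as [59, 81],

$$J_{i,j}^r = J_{rt+j,\,rt+i}^r = N_b\left(P_d^s\right)^2\mathrm{tr}\left(\mathbf{E}(i)\mathbf{H}^H\mathbf{R}_e^{-1}\mathbf{H}\mathbf{E}(j)^H\mathbf{R}_e^{-1}\right),$$

$$J_{i,\,rt+j}^r = \left(J_{rt+j,\,i}^r\right)^* = N_b\left(P_d^s\right)^2\mathrm{tr}\left(\mathbf{E}(i)\mathbf{H}^H\mathbf{R}_e^{-1}\mathbf{E}(j)\mathbf{H}^H\mathbf{R}_e^{-1}\right).$$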

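The covariance FIM $\mathbf{J}_{\theta}^r$ can be block partitioned as,

$$\mathbf{J}^r \triangleq N_b\left(P_d^s\right)^2\begin{bmatrix} \mathbf{J}_{11}^r & \mathbf{J}_{12}^r \\ \mathbf{J}_{21}^r & \mathbf{J}_{22}^r \end{bmatrix}.$$

It can be verified that $\mathbf{J}_{21}^r = \left(\mathbf{J}_{12}^r\right)^H$ and $\mathbf{J}_{22}^r = \left(\mathbf{J}_{11}^r\right)^T$. The block components of the FIM are given as $\mathbf{J}_{11}^r = \left(\mathbf{H}^H\mathbf{R}_e^{-1}\mathbf{H}\right) \otimes \left(\mathbf{R}_e^{-1}\right)^T$ and,

$$\mathbf{J}_{12}^r = \left(\mathbf{e}^H \otimes \mathbf{H}^H\mathbf{R}_e^{-1} \otimes \mathbf{e}\right) \odot \left(\mathbf{e}^H \otimes \mathbf{H}^H\mathbf{R}_e^{-1} \otimes \mathbf{e}\right)^T,$$

where $\mathbf{e} = [1, 1, \ldots, 1]^T \in \mathbb{C}^{r \times 1}$ and $\odot$ denotes the matrix Hadamard product. The expressions for the FIM components $\mathbf{J}_{\theta}^p$, $\mathbf{J}_{\theta}^r$ given above can be employed to obtain the true FIM $\mathbf{J}_{\theta}$. Thus, the CRB for SP based estimation of $\mathbf{H}$ is obtained as,

$$\mathrm{E}\left\{\left(\boldsymbol{\theta} - \hat{\boldsymbol{\theta}}\right)\left(\boldsymbol{\theta} - \hat{\boldsymbol{\theta}}\right)^H\right\} \geq \mathbf{J}_{\theta}^{-1}. \tag{6.12}$$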

Also, $\mathbf{J}_{\theta} > \mathbf{J}_{\theta}^p$ in the positive semi-definite matrix sense [82] and hence, $\mathrm{MSE}_b < \mathrm{MSE}_s$. A more insightful result can be obtained in the context of a SIMO channel $\mathbf{h} \in \mathbb{C}^{r \times 1}$, i.e. $t = 1$. The high SNR approximation to the CRB matrix given by the result below yields a critical insight into the relation between this MSE bound $\mathrm{MSE}_b$ and the quantity $\mathrm{MSE}_s$.

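Theorem 6. In the context of a SIMO wireless channel $\mathbf{h} \in \mathbb{C}^{r \times 1}$, the MSE bound for SP based estimation is given as $\mathrm{MSE}_b = \mathrm{MSE}_b^{\infty} + o\left(P_d^s\right)$, where $\mathrm{MSE}_b^{\infty}$, the high SNR asymptote, is,

$$\mathrm{MSE}_b^{\infty} \triangleq P_d^s\left(\lim_{P_d^s \to \infty}\frac{\mathrm{MSE}_b}{P_d^s}\right) = \frac{1}{2}\left(\frac{P_d^s}{N_bP_t^s}\right)\|\mathbf{h}\|^2. \tag{6.13}$$

Proof. Given in appendix 6.2.1.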

Interesting observations can be made from the above result. The quantity $\mathrm{MSE}_b^{\infty}$, the dominant term in the SP MSE bound, increases linearly with $P_d^s$, similar to the MSE of the mean-estimator in section (6.2). This means that even if one were to use all the available statistical information for estimation, the least achievable MSE of estimation still increases with SNR. Hence, the problem of source-pilot power allocation assumes a critical significance in the context of SP and is addressed in section (6.4). We also have the following result.

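Lemma 9. The asymptotic MSE measures $\mathrm{MSE}_b^{\infty}$ and $\mathrm{MSE}_s^{\infty}$, the asymptotic MSE bound and the asymptotic MSE of the mean-estimate respectively for SP based estimation of a SIMO channel, are related as,

$$\frac{\mathrm{MSE}_b^{\infty}}{\mathrm{MSE}_s^{\infty}} = \frac{1}{2}. \tag{6.14}$$

Proof. Follows from (6.9) with $t = 1$ and (6.13).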

The above result implies that at reasonably high SNRs, the MSE of estimating the channel by employing the complete information in the likelihood (6.10) is 3 dB lower than that of the mean-estimate. Equivalently, neglecting the covariance information in $\mathbf{R}_e$ results in a 3 dB loss of estimation performance in the SIMO context. We now describe a semi-blind estimation algorithm which asymptotically achieves the above MSE bound for SP based MIMO estimation and thus has a lower MSE of estimation compared to the mean-estimator of section (6.1).

6.2.2 Semi-Blind SP Estimation

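In this section, we derive a semi-blind SP estimator that asymptotically achieves the CRB for SP estimation by employing the information in the covariance $\mathbf{R}_e$. Observe that the output covariance $\mathbf{R}_y$ is given as,

$$\mathbf{R}_y = \mathrm{E}\left\{\mathbf{y}_s(i)\mathbf{y}_s(i)^H\right\} = \left(P_d^s + P_t^s\right)\mathbf{H}\mathbf{H}^H + \sigma_n^2\mathbf{I}_r = P_t^s\mathbf{H}\mathbf{H}^H + \mathbf{R}_e.$$

Hence, let the output covariance $\mathbf{R}_y$ be estimated from the received data symbols as $\hat{\mathbf{R}}_y \triangleq \frac{1}{N_b}\left(\sum_{i=1}^{N_b}\mathbf{y}_s(i)\mathbf{y}_s(i)^H\right)$. Employing a Cholesky matrix factorization, one can compute the matrix estimate $\mathbf{W}$ such that,

$$\mathbf{W}\mathbf{W}^H = \frac{1}{\left(P_d^s + P_t^s\right)}\left(\hat{\mathbf{R}}_y - \sigma_n^2\mathbf{I}_r\right). \tag{6.15}$$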

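The matrix $\mathbf{W}$ is also known as the whitening matrix [55] and differs from the estimate of the channel $\hat{\mathbf{H}}_b$ by a unitary matrix $\mathbf{Q}_b$, i.e. $\hat{\mathbf{H}}_b = \mathbf{W}\mathbf{Q}_b^H$. The unitary matrix $\mathbf{Q}_b$ can be estimated from $\mathbf{X}_p^s$ by minimizing the negative log-likelihood,

$$\mathbf{Q}_b = \arg\min_{\mathbf{Q}} \mathrm{tr}\left(\left(\mathbf{Y}_s - \mathbf{W}\mathbf{Q}^H\mathbf{X}_p^s\right)^H\hat{\mathbf{R}}_e^{-1}\left(\mathbf{Y}_s - \mathbf{W}\mathbf{Q}^H\mathbf{X}_p^s\right)\right), \tag{6.16}$$

subject to the constraint $\mathbf{Q}_b\mathbf{Q}_b^H = \mathbf{Q}_b^H\mathbf{Q}_b = \mathbf{I}_t$. The quantity $\hat{\mathbf{R}}_e$ is given as $\hat{\mathbf{R}}_e \triangleq P_d^s\mathbf{W}\mathbf{W}^H + \sigma_n^2\mathbf{I}_r$. It can then be demonstrated that the optimal unitary matrix $\mathbf{Q}_b$ that minimizes the above cost is given as,

$$\mathbf{Q}_b = \mathbf{U}_b\mathbf{V}_b^H, \quad \text{where} \quad \mathbf{U}_b\boldsymbol{\Sigma}_b\mathbf{V}_b^H = \mathrm{SVD}\left(\mathbf{X}_p^s\left(\mathbf{Y}_s\right)^H\hat{\mathbf{R}}_e^{-1}\mathbf{W}\right), \tag{6.17}$$

under the condition that the pilot symbol matrix $\mathbf{X}_p$ is orthogonal, as has been assumed for optimal MSE performance. The semi-blind channel estimate is obtained as $\hat{\mathbf{H}}_b = \mathbf{W}\mathbf{Q}_b^H$. This is akin to the whitening-rotation SB procedure elaborated in [55]. The above SB estimator yields a biased estimate at low SNR, owing to the constrained ML estimator in (6.16). However, the bias progressively decreases as the SNR increases. Hence, theoretically, the SB estimator asymptotically achieves the MSE lower bound in (6.13) at high SNR [55]. Simulation studies presented in section (6.5) suggest that the performance of the proposed SB scheme is close to the bound even for moderately high SNR.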
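The whitening-rotation procedure of (6.15)-(6.17) can be sketched as below. Two points in this sketch are assumptions rather than the dissertation's method: the whitening factor is obtained from an eigendecomposition truncated to the $t$ largest eigenvalues instead of a Cholesky factorization, so that the rank-deficient case $r > t$ is also handled, and the demonstration uses the exact covariance and noise-free pilot observations so that the recovery can be verified deterministically. All function names are illustrative.

```python
import numpy as np

def whitening_matrix(Ry_hat, sigma2, Pd, Pt, t):
    """r x t factor W with W W^H ~ (Ry_hat - sigma2 I) / (Pd + Pt), as in (6.15).

    Assumption: truncated eigendecomposition in place of Cholesky.
    """
    S = (Ry_hat - sigma2 * np.eye(Ry_hat.shape[0])) / (Pd + Pt)
    S = 0.5 * (S + S.conj().T)                    # enforce Hermitian symmetry
    vals, vecs = np.linalg.eigh(S)                # eigenvalues in ascending order
    vals, vecs = vals[-t:], vecs[:, -t:]          # keep the t dominant components
    return vecs * np.sqrt(np.clip(vals, 0.0, None))

def rotation_estimate(W, Y, Xs_p, Re_inv):
    """Unitary Q_b = U_b V_b^H from the SVD in (6.17)."""
    M = Xs_p @ Y.conj().T @ Re_inv @ W            # t x t matrix
    U, _, Vh = np.linalg.svd(M)
    return U @ Vh

def semi_blind_estimate(W, Y, Xs_p, Re_inv):
    """Semi-blind channel estimate H_b = W Q_b^H."""
    Q = rotation_estimate(W, Y, Xs_p, Re_inv)
    return W @ Q.conj().T

# Idealized check: exact covariance, noise-free pilot observations.
rng = np.random.default_rng(2)
r, t, Lp, Nf = 4, 2, 4, 6
Pd, Pt, sigma2 = 1.0, 1.0, 0.2
H = (rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))) / np.sqrt(2)
rows = np.arange(t)[:, None]
cols = np.arange(Lp)[None, :]
Xp = np.sqrt(Pt) * np.exp(-2j * np.pi * rows * cols / Lp)   # orthogonal pilot
Xs_p = np.tile(Xp, (1, Nf))

Ry = (Pd + Pt) * H @ H.conj().T + sigma2 * np.eye(r)
W = whitening_matrix(Ry, sigma2, Pd, Pt, t)
Re_inv = np.linalg.inv(Pd * W @ W.conj().T + sigma2 * np.eye(r))
H_b = semi_blind_estimate(W, H @ Xs_p, Xs_p, Re_inv)
```

In this idealized setting `H_b` coincides with `H`, since the whitening factor equals the channel up to a unitary rotation and the pilots resolve that rotation exactly. With a sample covariance and noisy data the estimate is only approximate.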


6.3 Throughput Performance

One of the promising aspects of SP estimation compared to CP is the potential savings in bandwidth due to the transmission of superimposed data and pilot signals, thereby eliminating an exclusive slot for the transmission of pilot symbols. In this section we quantify the throughput performance of SP and contrast it with that of CP to demonstrate the achievable bandwidth gains. The result in [39] provides an expression to characterize the worst case capacity of a communication channel in the presence of channel estimation errors. The framework therein relies on the central assumption that the channel estimate $\hat{\mathbf{H}}$ and the estimation error $\mathbf{H} - \hat{\mathbf{H}}$ satisfy the decorrelation property, i.e. $\mathrm{E}\left\{\hat{\mathbf{H}}\left(\mathbf{H} - \hat{\mathbf{H}}\right)^H\right\} = \mathbf{0}_{r \times r}$, which is satisfied by the minimum mean-squared error (MMSE) estimate. However, this result cannot be used in the context of SP based estimation for the following reasons.

A.1 The SP channel estimate $\hat{\mathbf{H}}_s$ is correlated with the transmitted data symbols $\mathbf{X}_d^s$ as $\mathrm{E}\left\{\hat{\mathbf{H}}_s\left[\mathbf{x}_d^s(k)\right]_i\right\} = \left(\frac{P_d^s}{N_bP_t^s}\right)\mathbf{h}_i\left(\mathbf{x}_p^s(k)\right)^H$. This can be seen from (6.6) and is unlike the scenario in [39], where the channel estimate is uncorrelated with the data, i.e. $\mathrm{E}\left\{\hat{\mathbf{H}}\left[\mathbf{x}_d^s(k)\right]_i\right\} = \mathbf{0}_{r \times t}$.

A.2 Further, it can also be observed that the decorrelation property mentioned above is not satisfied by many estimators, including the least-squares (LS) estimator. For instance, it can be observed from section (6.2) that,

$$\mathrm{tr}\left(\mathrm{E}\left\{\hat{\mathbf{H}}_c\left(\mathbf{H} - \hat{\mathbf{H}}_c\right)^H\right\}\right) = -\left(\frac{t\sigma_n^2}{N_bP_t^s}\right)\mathrm{tr}\left(\mathbf{I}_r\right) \neq 0. \tag{6.18}$$

This is a disadvantage since the LS estimate is robust and has a low computational complexity, which makes it especially suited for implementation in wireless systems. Therefore it is of interest to develop a framework that takes into account such estimators.

The following discussion presents a result for the worst-case capacity $C_w$ of a channel with non-zero signal-noise correlation. This framework can be employed to quantify the throughput performance of the SP system. Further, we also utilize this framework to demonstrate the bandwidth gains of SP when compared with a system employing conventional pilots (CP) for channel estimation.

6.3.1 A Throughput Lower Bound for Channels with Correlated Noise

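In this section, similar to the result in [39], we derive an expression for the throughput lower bound of a communication system with correlated noise. Consider the communication channel,

$$\mathbf{y}(k) = \mathbf{H}\mathbf{x}(k) + \mathbf{v}(k) = \mathbf{s}(k) + \mathbf{v}(k), \quad \mathbf{s}(k), \mathbf{y}(k) \in \mathbb{C}^{r \times 1} \tag{6.19}$$

where $\mathbf{v}(k) \in \mathbb{C}^{r \times 1}$ is additive noise and $\mathbf{s}(k) \triangleq \mathbf{H}\mathbf{x}(k)$. The worst case capacity for the above channel is,

$$C_w = \min_{p_v(\cdot),\ \mathrm{tr}(\mathbf{R}_v) = r\sigma_n^2}\ \max_{p_x(\cdot),\ \mathrm{tr}(\mathbf{R}_x) = tP_d}\ I(\mathbf{y}; \mathbf{x}). \tag{6.20}$$

The important difference between the above model and the one in [39] is that the signal and noise components $\mathbf{s}(k)$, $\mathbf{v}(k)$ are not necessarily uncorrelated, i.e. $\mathrm{E}\left\{\mathbf{v}(k)\mathbf{s}(l)^H\right\} = \delta(k-l)\mathbf{R}_{vs} \neq \mathbf{0}$. The result below gives the expression for the worst case capacity of the above channel.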

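Theorem 7. Worst Case Correlated Capacity: Let the system input-output model of a matrix-valued noisy communication channel be given as,

$$\mathbf{y}(k) = \mathbf{H}\mathbf{x}(k) + \mathbf{v}(k) = \mathbf{s}(k) + \mathbf{v}(k), \tag{6.21}$$

where $\mathbf{x}(k) \in \mathbb{C}^{t \times 1}$, $\mathbf{v}(k) \in \mathbb{C}^{r \times 1}$ represent the signal and the unknown noise components respectively. Let $\mathbf{v}(k)$, $\mathbf{x}(k)$ satisfy the covariance constraints,

$$\mathrm{E}\left\{\mathbf{x}(k)^H\mathbf{x}(l)\right\} = \delta(k-l)\,\mathrm{tr}\left(\mathbf{R}_x\right),\ \ \mathrm{tr}\left(\mathbf{R}_x\right) = tP_d, \qquad \mathrm{E}\left\{\mathbf{v}(k)^H\mathbf{v}(l)\right\} = \delta(k-l)\,\mathrm{tr}\left(\mathbf{R}_v\right),\ \ \mathrm{tr}\left(\mathbf{R}_v\right) = r\sigma_n^2,$$

where $\delta(k) = 1$ if and only if $k = 0$ and $\delta(k) = 0$ otherwise. Further, let the correlation between the signal and noise components be given as,

$$\mathrm{E}\left\{\mathbf{v}(k)\mathbf{s}(l)^H\right\} = \delta(k-l)\mathbf{R}_{vs} = \delta(k-l)\mathbf{R}_{sv}^H,$$

where $\mathbf{R}_{vs}$ is not necessarily $\mathbf{0}_{r \times r}$. For the above communication system, the worst case capacity $C_w$ as defined in (6.20) is given by the expression,

$$C_w\left(\mathbf{R}_s, \mathbf{R}_v, \mathbf{R}_{vs}\right) = \min_{\mathrm{tr}(\mathbf{R}_v) = r\sigma_n^2}\ \max_{\mathrm{tr}(\mathbf{R}_x) = tP_d}\ \log\left|\mathbf{I} + \mathbf{R}_{v|s}^{-1}\left(\mathbf{R}_s + \mathbf{R}_{vs}\right)\mathbf{R}_s^{-1}\left(\mathbf{R}_s + \mathbf{R}_{vs}\right)^H\right|, \tag{6.22}$$

where the conditional covariance $\mathbf{R}_{v|s} \in \mathbb{C}^{r \times r}$ is given as $\mathbf{R}_{v|s} \triangleq \mathbf{R}_v - \mathbf{R}_{vs}\mathbf{R}_s^{-1}\mathbf{R}_{sv}$.

Proof. See appendix 6.7.3.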

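We employ this result next to derive expressions for the worst case capacity of the SP and CP estimation schemes. Also, it can be seen from (6.22) that for the case of uncorrelated noise, i.e. $\mathbf{R}_{sv} = \mathbf{R}_{vs} = \mathbf{0}_{r \times r}$, the expression above reduces to the result in [39] for the uncorrelated signal-noise case,

$$C_w = \min_{\mathrm{tr}(\mathbf{R}_v) = r\sigma_n^2}\ \max_{\mathrm{tr}(\mathbf{R}_x) = tP_d}\ \log\left|\mathbf{I} + \mathbf{R}_v^{-1}\mathbf{R}_s\right| = \min_{\mathrm{tr}(\mathbf{R}_v) = r\sigma_n^2}\ \max_{\mathrm{tr}(\mathbf{R}_x) = tP_d}\ \log\left|\mathbf{I} + \mathbf{R}_v^{-1}\mathbf{H}\mathbf{R}_x\mathbf{H}^H\right|. \tag{6.23}$$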
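For fixed covariance matrices, the log-determinant term inside (6.22) is straightforward to evaluate. The sketch below does only that: it does not perform the outer minimization over $\mathbf{R}_v$ or maximization over $\mathbf{R}_x$, it assumes $\mathbf{R}_s$ is invertible, and it uses a base-2 logarithm on the assumption that throughput is reported in bits per channel use. The function name and the example matrices are illustrative.

```python
import numpy as np

def correlated_rate(Rs, Rv, Rvs):
    """log2 |I + R_{v|s}^{-1} (Rs + Rvs) Rs^{-1} (Rs + Rvs)^H| for fixed covariances.

    Evaluates only the log-det term of (6.22); Rs must be invertible.
    """
    r = Rs.shape[0]
    Rs_inv = np.linalg.inv(Rs)
    Rsv = Rvs.conj().T
    Rv_given_s = Rv - Rvs @ Rs_inv @ Rsv           # conditional covariance R_{v|s}
    B = Rs + Rvs
    M = np.eye(r) + np.linalg.solve(Rv_given_s, B @ Rs_inv @ B.conj().T)
    _, logabsdet = np.linalg.slogdet(M)
    return logabsdet / np.log(2.0)

I2 = np.eye(2)
# Uncorrelated case: reduces to log2 |I + Rv^{-1} Rs| as in (6.23).
rate_uncorrelated = correlated_rate(3.0 * I2, I2, np.zeros((2, 2)))
# Illustrative negatively correlated signal and noise (assumed values).
rate_correlated = correlated_rate(3.0 * I2, I2, -0.5 * I2)
```

In the uncorrelated example the result is $\log_2|4\mathbf{I}_2| = 4$ bits, matching (6.23). The negative correlation chosen in the second example lowers the value, which is the kind of effect the SP covariances in table (6.1) introduce.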

6.3.2 Throughput Comparison of Superimposed and Conventional Pilots (CP)

We now apply the result for the worst case capacity derived above to the scenarios of SP and CP based channel estimation. Let $\tilde{\mathbf{y}}_s(k)$ denote the output of the SP system after removal of the pilot symbol $\mathbf{x}_p\left(\mathrm{mod}(k-1, L_p) + 1\right)$ employing the estimate $\hat{\mathbf{H}}_s$ described in (6.6). From (6.3), the input-output relation for the SP channel after pilot removal is given as,

$$\tilde{\mathbf{y}}_s(k) = \hat{\mathbf{H}}_s\mathbf{x}_d^s(k) + \underbrace{\left(\mathbf{H} - \hat{\mathbf{H}}_s\right)\left(\mathbf{x}_p\left(\mathrm{mod}(k-1, L_p) + 1\right) + \mathbf{x}_d^s(k)\right) + \boldsymbol{\eta}(k)}_{\mathbf{v}_s(k)}, \tag{6.24}$$

where $\mathbf{s}_s(k) \triangleq \hat{\mathbf{H}}_s\mathbf{x}_d^s(k)$ and $\mathbf{s}_s(k), \mathbf{v}_s(k) \in \mathbb{C}^{r \times 1}$ denote the effective SP channel output (after pilot removal) and noise respectively at the $k$th time instant. The optimal $\mathbf{R}_x$ which maximizes the worst case capacity in (6.22) depends on the channel matrix $\mathbf{H}$ and can be fairly challenging to compute. However, in simple communication scenarios where the channel information is not fed back to the transmitter, a reasonable choice for the transmit covariance matrix is $\mathbf{R}_x = P_d\mathbf{I}_t$, where power is loaded uniformly on all the transmit antennas. Further, since in our study we are only interested in a comparison between the SP and CP scenarios, the above choice of $\mathbf{R}_x$ can be used as a benchmark. Hence, the throughput lower bound for the SP system and the throughput bound for the SP semi-blind system in bits per channel use are given as,

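$$C_w^s = C_w\left(\mathbf{R}_s^s, \mathbf{R}_v^s, \mathbf{R}_{vs}^s\right), \qquad C_w^b = C_w\left(\mathbf{R}_s^b, \mathbf{R}_v^b, \mathbf{R}_{vs}^b\right) \tag{6.25}$$

where the expressions for the covariance matrices $\mathbf{R}_s$, $\mathbf{R}_v$, $\mathbf{R}_{vs}$ for the different estimation schemes are listed in table (6.1). The covariance matrices associated with the SP mean-estimator can be derived from the expression in (6.6) by employing the expression for $\mathrm{E}\left\{\left(\hat{\mathbf{H}}_s - \mathbf{H}\right)\left(\hat{\mathbf{H}}_s - \mathbf{H}\right)^H\right\}$, which can be simplified as,

$$\begin{aligned} \mathrm{E}\left\{\left(\hat{\mathbf{H}}_s - \mathbf{H}\right)\left(\hat{\mathbf{H}}_s - \mathbf{H}\right)^H\right\} &= \left(\frac{1}{N_bP_t^s}\right)^2\mathrm{E}\left\{\mathbf{H}\mathbf{X}_d^s\left(\mathbf{X}_p^s\right)^H\mathbf{X}_p^s\left(\mathbf{X}_d^s\right)^H\mathbf{H}^H\right\} + \left(\frac{1}{N_bP_t^s}\right)^2\mathrm{E}\left\{\mathbf{N}\left(\mathbf{X}_p^s\right)^H\mathbf{X}_p^s\mathbf{N}^H\right\} \\ &= \left(\frac{1}{N_bP_t^s}\right)^2\left(\mathrm{E}\left\{\mathbf{H}\mathbf{X}_d^s\mathbf{G}\left(\mathbf{X}_d^s\right)^H\mathbf{H}^H\right\} + \mathrm{E}\left\{\mathbf{N}\mathbf{G}\mathbf{N}^H\right\}\right) \\ &= \left(\frac{tP_d^s}{N_bP_t^s}\right)\mathbf{H}\mathbf{H}^H + \left(\frac{t\sigma_n^2}{N_bP_t^s}\right)\mathbf{I}_r, \qquad \mathbf{G} \triangleq \left(\mathbf{X}_p^s\right)^H\mathbf{X}_p^s, \end{aligned}$$

where $\mathbf{X}_d^s$ and $\mathbf{N}$ here denote the full-length ($N_b$ column) data and noise matrices, and the last equality follows from the simplification $\mathrm{E}\left\{\mathbf{X}_d^s\mathbf{G}\left(\mathbf{X}_d^s\right)^H\right\} = \mathrm{tr}\left(\mathbf{G}\right)P_d^s\mathbf{I}_t = tN_bP_t^sP_d^s\mathbf{I}_t$, as demonstrated in appendix section (6.7.1). For the SP semi-blind estimate, we derive the approximate error covariance $\mathrm{E}\left\{\left(\mathbf{H} - \hat{\mathbf{H}}_b\right)\left(\mathbf{H} - \hat{\mathbf{H}}_b\right)^H\right\}$ from the expression for the semi-blind CRB in (6.12). The quantity $\mathcal{J}^b$, the error matrix for SP semi-blind estimation, is given as,

$$\mathrm{E}\left\{\left(\hat{\mathbf{H}}_b - \mathbf{H}\right)\left(\hat{\mathbf{H}}_b - \mathbf{H}\right)^H\right\} \approx \mathcal{J}^b, \qquad \mathcal{J}_{ij}^b \triangleq \sum_{k=0}^{t-1}\left[\mathbf{J}_{\theta}^{-1}\right]_{i+kr,\,j+kr}. \tag{6.26}$$

The above expression provides a good lower bound on the error covariance, and is tight even at moderate SNR. It can hence be employed to derive the error covariance matrices associated with SP semi-blind estimation.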

Figure 6.3: Schematic of conventional (time-multiplexed) pilots frame (block) structure.

6.3.3 Conventional Pilots (CP) based estimation

In contrast to SP, CP based channel estimation involves the exclusive transmission of pilot symbols, which results in a bandwidth overhead. The CP system frame can be modeled as a transmission of $L_p$ pilot symbols followed by $(N_f - 1)L_p$ information bearing data symbols. Since the total pilot power in SP is $N_bP_t^s$, we scale each CP pilot symbol by a factor of $\sqrt{N_f}$ to transmit equal pilot power, i.e. $P_t^c = P_t^s\sqrt{N_f}$. Similarly the CP source power is scaled as $P_d^c = P_d^s/\sqrt{1 - \frac{1}{N_f}}$ to ensure equal source power. A schematic diagram for the frame structure of a CP based system is given in fig. (6.3). The input-output model for the CP system is given as,

$$\mathbf{y}_c(k) = \mathbf{H}\mathbf{x}_c(k) + \boldsymbol{\eta}(k), \quad \text{where} \quad \mathbf{x}_c(k) = \begin{cases} \mathbf{x}_p^c(k) = \sqrt{N_f}\,\mathbf{x}_p^s(k), & 1 \leq k \leq L_p \\ \mathbf{x}_d^c(k) = \dfrac{1}{\sqrt{1 - \frac{1}{N_f}}}\,\mathbf{x}_d^s(k), & L_p + 1 \leq k \leq N_b \end{cases} \tag{6.27}$$

Defining a stacking of the received pilot symbol outputs as

$$\mathbf{Y}_c \triangleq \left[\mathbf{y}_c(1), \mathbf{y}_c(2), \ldots, \mathbf{y}_c(L_p)\right], \tag{6.28}$$

the conventional estimate $\hat{\mathbf{H}}_c$ is then given by the well known LS estimate as,

$$\hat{\mathbf{H}}_c = \mathbf{Y}_c\left(\mathbf{X}_p^c\right)^{\dagger} = \mathbf{Y}_c\left(\mathbf{X}_p^c\right)^H\left(\mathbf{X}_p^c\left(\mathbf{X}_p^c\right)^H\right)^{-1}, \tag{6.29}$$

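where $\mathbf{X}_p^c = \left[\mathbf{x}_p^c(1), \ldots, \mathbf{x}_p^c(L_p)\right]$.

Table 6.1: Covariance matrices for SP and CP systems with channel estimation error.

SP (mean-estimate):
$\mathbf{R}_{vs}^s = -\frac{t\left(P_d^s\right)^2}{N_bP_t^s}\left(1 + \frac{t}{N_b}\right)\mathbf{H}\mathbf{H}^H - \frac{P_d^st\sigma_n^2}{N_bP_t^s}\mathbf{I}_r$
$\mathbf{R}_s^s = P_d^s\left(1 + \frac{tP_d^s}{N_bP_t^s}\right)\mathbf{H}\mathbf{H}^H + \frac{t\sigma_n^2P_d^s}{N_bP_t^s}\mathbf{I}_r$
$\mathbf{R}_v^s = \sigma_n^2\mathbf{I}_r + \left(P_d^s + P_t^s\right)\left(\frac{tP_d^s}{N_bP_t^s}\mathbf{H}\mathbf{H}^H + \frac{t\sigma_n^2}{N_bP_t^s}\mathbf{I}_r\right)$

CP:
$\mathbf{R}_{vs}^c = -\frac{t\sigma_n^2P_d^c}{N_fP_t^c}\mathbf{I}_r$
$\mathbf{R}_s^c = P_d^c\mathbf{H}\mathbf{H}^H + \frac{tP_d^c\sigma_n^2}{N_fP_t^c}\mathbf{I}_r$
$\mathbf{R}_v^c = \sigma_n^2\mathbf{I}_r + \frac{t\sigma_n^2P_d^c}{N_fP_t^c}\mathbf{I}_r$

SP-SB (semi-blind):
$\mathbf{R}_{vs}^b = -P_d^s\mathcal{J}^b$
$\mathbf{R}_s^b = P_d^s\mathbf{H}\mathbf{H}^H + P_d^s\mathcal{J}^b$
$\mathbf{R}_v^b = \left(P_d^s + P_t^s\right)\mathcal{J}^b + \sigma_n^2\mathbf{I}_r$

CSIR (perfect channel knowledge at the receiver):
$\mathbf{R}_{vs}^p = \mathbf{0}_r$
$\mathbf{R}_s^p = P_d\mathbf{H}\mathbf{H}^H$
$\mathbf{R}_v^p = \sigma_n^2\mathbf{I}_r$

The worst case throughput performance of CP is given as,

$$C_w^c = \left(1 - \frac{1}{N_f}\right)C_w\left(\mathbf{R}_s^c, \mathbf{R}_v^c, \mathbf{R}_{vs}^c\right), \tag{6.30}$$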

where the factor $\left(\frac{N_b - L_p}{N_b}\right) = \left(1 - \frac{1}{N_f}\right)$ arises due to the loss of one sub-frame per frame owing to exclusive transmission of the pilot symbols. This results in a loss in throughput in CP systems, especially for a low number of sub-frames $N_f$. As illustrated by the simulation results, for reasonable values of SNR ($= P_d/\sigma_n^2$), PNR ($= P_t/\sigma_n^2$) and number of sub-frames ($= N_f$), an SP scheme has a throughput approximately 0.5 bits per channel use greater than that of CP. This is predominantly because CP is disadvantaged by the loss of one sub-frame of bandwidth due to the exclusive transmission of pilot symbols, while the estimation errors are comparable at low SNRs. Hence, for reasonable SNRs and short data frame sizes SP has a higher throughput than CP. This makes SP especially suitable for employment in scenarios such as ad hoc and sensor networks, where the information transmitted is typically bursty and of short duration and the pilot overhead in CP would be comparable to the total transmitted data.


6.4 Optimal Power Allocation in SP

It can be seen from (6.6) that $\hat{\mathbf{H}}_s$, the estimate of the channel, is corrupted by the data symbols $\mathbf{x}_d^s(k)$, which enhance the noise during the estimation of the channel. This presents an interesting tradeoff in SP systems. While on one hand higher data power improves the detection performance, it also results in a poorer channel estimate and a consequent loss in detection performance. In fact, for a given number of sub-frames $N_f$, if the source power $P_d$ is too high, the detection performance tends to be very poor. Motivated by this observation, we derive expressions for the optimal data SNR $\rho_d^s$ $\left(\triangleq P_d^s/\sigma_n^2\right)$ in the SIMO context to maximize the post-processing SNR (PSNR) for Capon beamforming. Consider the analogous SIMO SP system model, obtained by setting the number of transmit antennas $t = 1$ in (6.3), with channel vector denoted as $\mathbf{h}$. After estimation of $\hat{\mathbf{h}}_s$ and subtraction of the pilot symbol $x_p\left(\mathrm{mod}(k-1, L_p) + 1\right)$, the model for the detection of the symbol $x_d^s(k)$ employing the computed estimate $\hat{\mathbf{h}}_s$ is given, as demonstrated in (6.24), as,

$$\tilde{\mathbf{y}}_s(k) = \hat{\mathbf{h}}_sx_d^s(k) + \underbrace{\Delta\mathbf{h}_s\left(x_p\left(\mathrm{mod}(k-1, L_p) + 1\right) + x_d^s(k)\right) + \boldsymbol{\eta}(k)}_{\mathbf{v}_s(k)}, \tag{6.31}$$

where $\mathbf{v}_s(k)$ represents the effective detection noise and $\Delta\mathbf{h}_s$, the error in the estimate of $\mathbf{h}$, is defined as $\Delta\mathbf{h}_s \triangleq \mathbf{h} - \hat{\mathbf{h}}_s$. The expression for the covariance of the effective noise $\mathbf{R}_v^s$ is given in table 6.1. In the discussion below, we present the expression for the optimum SNR-PNR allocation for PSNR maximization at the receiver.

6.4.1 Minimum Variance Distortionless Response (MVDR) Beamformer

The MVDR beamformer [83] $\mathbf{w}_m$ is given as the solution to the detection SNR maximization criterion

\[
\mathbf{w}_m = \arg\min_{\mathbf{w}} \mathbf{w}^H \mathbf{R}_v^s \mathbf{w} \quad \text{subject to} \quad \mathbf{w}^H \hat{\mathbf{h}}_s = 1.
\]

From the result in [83], $\mathbf{w}_m$ is given as $\mathbf{w}_m^H = \left( \hat{\mathbf{h}}_s^H (\mathbf{R}_v^s)^{-1} \hat{\mathbf{h}}_s \right)^{-1} \hat{\mathbf{h}}_s^H (\mathbf{R}_v^s)^{-1}$. Substituting this in (6.31) above, the expression for the estimation of $x_d(k)$ can be obtained as

\[
\mathbf{w}_m^H \mathbf{y}_s(k) = \mathbf{w}_m^H \hat{\mathbf{h}}_s x_d^s(k) + \mathbf{w}_m^H \mathbf{v}_s(k) = x_d^s(k) + \mathbf{w}_m^H \mathbf{v}_s(k).
\]

Hence, the post-processing SNR for the MVDR beamformer can be seen to be given as

\[
\kappa_m = \frac{P_d^s}{E\left\{ |\mathbf{w}_m^H \mathbf{v}_s(k)|^2 \right\}} = P_d^s\, \hat{\mathbf{h}}_s^H (\mathbf{R}_v^s)^{-1} \hat{\mathbf{h}}_s. \tag{6.32}
\]

Figure 6.4: MSE of Estimation of MIMO wireless channel with r = t = 4, PNR = 5dB, Nf = 10 and Lp = 8 symbols. [Plot "MIMO MSE Vs. SNR for SP Estimation": MSE vs. SNR; curves: SP Mean Estimate, SP Mean Estimate Theory, SP CRB, SP SemiBlind, SP Semi Blind Asymp, CP.]

As demonstrated in appendix 6.7.4, the above expression can be simplified by substituting the expression for $\mathbf{R}_v^s$ in table 6.1 to yield

\[
\kappa_m \approx \frac{\rho_t^s \rho_d^s N_b \|\mathbf{h}\|^2}{(\rho_d^s + \rho_t^s)\, \rho_d^s \|\mathbf{h}\|^2 + \rho_t^s (N_b + 1) + \rho_d^s}, \tag{6.33}
\]

where $\rho_d^s, \rho_t^s$ are the data and pilot SNR respectively as defined previously. Let the total symbol transmit power be constrained as

\[
\rho_t^s + \rho_d^s = \alpha^s. \tag{6.34}
\]

The optimum power allocation $\rho_t^\star, \rho_d^\star$ that maximizes the above expression for the post-processing SNR $\kappa_m$ is given by the following result.

Lemma 10. The optimum PNR $\rho_t^\star$ that maximizes the post-processing SNR $\kappa_m$ for the MVDR beamformer with the transmit power constraint in (6.34) above, is given as

\[
\rho_t^\star = \frac{1}{\gamma}\left( \sqrt{\delta^2 + \delta \alpha^s \gamma} - \delta \right), \qquad \rho_d^\star = \alpha^s - \rho_t^\star, \tag{6.35}
\]

where $\delta \triangleq (\alpha^s)^2 \|\mathbf{h}\|^2 + \alpha^s$ and $\gamma \triangleq N_b - \alpha^s \|\mathbf{h}\|^2$.

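Proof. Substituting $\rho_d^s = \alpha^s - \rho_t^s$ from (6.34) in (6.33) gives $\kappa_m \approx N_b \|\mathbf{h}\|^2 \rho_t^s (\alpha^s - \rho_t^s) / (\delta + \gamma \rho_t^s)$. Differentiating with respect to $\rho_t^s$ and setting the derivative to zero yields the quadratic $\gamma (\rho_t^s)^2 + 2\delta \rho_t^s - \alpha^s \delta = 0$, and the result follows by noting that $\rho_t^\star > 0$.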

The expression in (6.35) gives the optimum pilot and data power alloca-

tion that maximizes the post-processing SNR for MVDR reception.
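As a quick numerical sanity check of Lemma 10, the following sketch evaluates the closed form in (6.35) and compares it with a brute-force grid search over (6.33) under the constraint (6.34). The function names and the operating point (total SNR, block length, channel norm) are hypothetical choices for illustration, not values from the simulations.

```python
import numpy as np

def psnr_mvdr(rho_t, rho_d, n_b, h_norm_sq):
    """Approximate MVDR post-processing SNR, following (6.33)."""
    num = rho_t * rho_d * n_b * h_norm_sq
    den = (rho_d + rho_t) * rho_d * h_norm_sq + rho_t * (n_b + 1) + rho_d
    return num / den

def optimal_allocation(alpha, n_b, h_norm_sq):
    """Closed-form optimum of (6.35); assumes gamma != 0."""
    delta = alpha**2 * h_norm_sq + alpha
    gamma = n_b - alpha * h_norm_sq
    rho_t = (np.sqrt(delta**2 + delta * alpha * gamma) - delta) / gamma
    return rho_t, alpha - rho_t

# Hypothetical operating point (not taken from the simulations).
alpha = 10 ** (12.5 / 10)      # total SNR of 12.5 dB on a linear scale
n_b, h_norm_sq = 16 * 8, 4.0   # N_b = N_f * L_p and ||h||^2

rho_t_opt, rho_d_opt = optimal_allocation(alpha, n_b, h_norm_sq)

# Brute-force check: maximize (6.33) over a fine grid of pilot SNRs.
grid = np.linspace(1e-3, alpha - 1e-3, 200001)
rho_t_grid = grid[np.argmax(psnr_mvdr(grid, alpha - grid, n_b, h_norm_sq))]

print(f"closed form: {rho_t_opt:.4f}, grid search: {rho_t_grid:.4f}")
```

The two values should agree to within the grid resolution, since the objective has a single interior maximum on $0 < \rho_t^s < \alpha^s$.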

Figure 6.5: MSE of Estimation of SIMO Rayleigh wireless channel with r = 4 antennas, Nf = 20, Lp = 8, PNR = 5dB. [Plot "SIMO Estimation MSE Vs SNR, PNR = 8 dB, Nf = 20, Lp = 12": estimation MSE vs. SNR; curves: SP Mean Estimate, SP Mean Estimate Theory, SP CRB, SP CRB Asymp, SP Semi Blind, MATLAB, SP Semi Blind Asymp, CP.]


6.5.1 MSE of Estimation

In our simulations we employ a MIMO/SIMO wireless channel with r = 4

receive antennas, $t \in \{4, 1\}$ transmit antennas and QPSK symbol modulation. We consider a Rayleigh fading channel with coefficients $H_{ij}$, $1 \le i, j \le 4$ generated as independent zero-mean circularly symmetric complex Gaussian random variables of unit variance, i.e. $E\{|H_{ij}|^2\} = 1$. In the first example we consider the estimation

performance of the semi-blind SP scheme with an orthogonal pilot sequence Xp of

length Lp = 12 symbols and Nf = 10 sub-frames per frame, or a total of Nb = 120

symbols per frame. It is assumed that the channel does not vary significantly

over the block, or in other words, the channel coherence time is larger than the

block duration. Fig.(6.4) shows computed MSE averaged over 2000 independent

realizations of the wireless channel H. It is seen that the MSE of the SP estimate

Hs given by (6.5) is in close agreement with theory from section(6.2). The semi-

blind estimate in (6.17) has a lower MSE than the mean-estimate and achieves the

CRB in (6.12). The asymptotic semi-blind estimator, which has the least MSE, is

the semi-blind estimate as Nb → ∞, implying that the estimate of the whitening

matrix is exact, i.e. $\hat{\mathbf{W}} = \mathbf{W}$. It can also be seen that even though the CRB results are derived

assuming Gaussian signaling, they are in close agreement with the performance of

a system employing a discrete constellation, QPSK in the above case. Fig.(6.5)

shows the MSE of estimation of a SIMO channel h with r = 4 receive antennas. In

this scenario, the semi-blind estimate in (6.17) involves the constrained estimation

of a scalar phase. As illustrated in lemma 9, it is seen that at high SNR the

SP MSE bound is 3dB lower than the MSE of the SP mean-estimate. We also

plot the MSE of the estimate hf , which is obtained by employing a numerical

optimization routine fminunc(·) in MATLAB to optimize the likelihood in (6.10).

The mean-estimate hs is employed to initialize the procedure. This estimate can

also be seen to achieve the asymptotic MSE bound for SP estimation. Thus, both these estimators asymptotically outperform the SP mean-estimator. Finally, the MSE of the CP estimate from (6.29) is plotted for comparison and can be seen to outperform SP based estimation. This is expected as the performance of CP is not limited by the data SNR. However, CP has a net throughput loss compared to SP as is seen next.

Figure 6.6: Throughput performance of SP and CP vs. Nf , SNR = PNR = 5dB, Lp = 64. [Plot "Worstcase MIMO Throughput Outage with SP": outage probability vs. throughput (bits/channel use); curves: CP, SP-Mean, SP-SemiBlind, Perfect CSIR.]

6.5.2 Throughput Performance

Utilizing the framework of worst case correlated capacity developed in

section(6.3.2), we compute the net throughput performance of a superimposed

pilot system and contrast it with the performance of a system employing conven-

tional or time-multiplexed pilots. The results shown in fig. (6.6) consider a system

with Lp = 64 pilot symbols, Nf = 4 sub-frames and PNR, SNR fixed at 5dB.

Employing the expressions in (6.25) and (6.30), we plot the probability of outage


of the throughput lower bounds for SP and CP. In Fig. (6.6) it can be seen that

the throughput of SP semi-blind estimation and SP mean-estimation is approximately 1.0 and 0.5 bits per channel use, respectively, higher than that of CP based

estimation. This throughput margin progressively decreases as the CP bandwidth

loss relative to the block length, i.e. Lp symbols per block of Nb = NfLp symbols,

decreases. Thus, SP estimation can potentially yield significant bandwidth gains,

especially in scenarios which warrant communication in bursts of shorter block

lengths, where employing CP results in significant pilot overheads.

Fig. (6.7) shows the net throughput, computed from the bit error rate (BER), for SP and CP

based detection employing a QPSK symbol constellation, for SNR in the range

2 − 35dB, Nf = 10 sub-frames and pilot sequence length Lp = 64. We employ

a minimum mean-squared error (MMSE) receiver for symbol detection. The net

throughput expressions for SP and CP, denoted by $\mu_s, \mu_c$ respectively, are given as

\[
\mu_s = tq\,(1 - p_e^s), \qquad \mu_c = tq \left( 1 - \frac{1}{N_f} \right) (1 - p_e^c), \tag{6.36}
\]

where $q$ is the number of bits per complex symbol ($q = 2$ for QPSK) and $p_e^s, p_e^c$ are the bit error rates of the SP and CP systems respectively. The SP throughput

with perfect channel state information at receiver (Per CSIR) is also given for

comparison. It can be observed that SP based detection schemes, both mean and

semi-blind, outperform CP by about 0.7 bits per channel use in the mid-SNR range

of 5 − 15dB. This experiment demonstrates a practical MIMO scenario where SP

yields throughput gains over CP by avoiding the exclusive transmission of pilots.

As the SNR increases, the throughput of the SP scheme progressively worsens due

to increasing signal interference.
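The net throughput expressions in (6.36) can be evaluated with a few lines of code. The sketch below is only illustrative: the function names are hypothetical and the bit error rates are made-up inputs, not simulation results.

```python
def throughput_sp(t, q, ber):
    """Net SP throughput in bits per channel use, as in (6.36)."""
    return t * q * (1.0 - ber)

def throughput_cp(t, q, n_f, ber):
    """Net CP throughput: one of every n_f sub-frames carries only pilots."""
    return t * q * (1.0 - 1.0 / n_f) * (1.0 - ber)

# Hypothetical bit error rates, chosen only for illustration.
mu_s = throughput_sp(t=4, q=2, ber=0.05)
mu_c = throughput_cp(t=4, q=2, n_f=10, ber=0.02)
print(mu_s, mu_c)
```

With these inputs SP retains a higher net throughput even though its assumed error rate is worse, because CP gives up a fraction $1/N_f$ of the frame to pilots.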

6.5.3 Optimal Power Allocation

Next we consider the problem of optimal data/pilot transmit power allocation for receive post-processing SNR (PSNR) maximization, as examined in section(6.4). Fig.(6.8) demonstrates the symbol error rate (SER) performance as a function of the transmitted data SNR for a QPSK transmit constellation, when

MVDR beamforming is employed at the receiver. We consider the receiver SER

corresponding to several choices of sub-frame number, pilot sequence length and

total transmit power, given in the figure by the legend entry [Nf , Lp, αs (dB)]. For

instance, the legend entry [16, 8, 12.5] denotes the SIMO SER performance curve

for Nf = 16, Lp = 8, αs = 12.5dB. The SER performance reaches a minimum

for a unique SNR (in this scenario for ρd ≈ 11dB) and increases for higher data

power allocation. The corresponding vertical line represents the analytically com-

puted optimal power allocation from the expression in (6.35) and is seen to be

approximately 0.5dB away from optimal performance. Thus, it yields a reliable

benchmark for optimal power allocation. Fig. (6.9) illustrates the optimal power allocation ratio $10 \log_{10}\left( \rho_d^\star / \rho_t^\star \right)$ vs. total transmit power $\alpha^s$ dB for different pilot lengths $L_p$ and numbers of sub-frames $N_f$. It can be seen that as the block length

(NfLp) increases, the fraction of pilot power decreases from −8dB to −13dB. Fur-

ther, increasing total transmit power results in increasing pilot power allocation to

offset the increase in estimation error from data.

6.6 Conclusion

In the above work we have derived the CRB for SP estimation and demon-

strated a semi-blind scheme that achieves this CRB. We have analyzed the effective

throughput of SP and CP systems employing a novel result for the worst-case ca-

pacity with correlated symbols and noise. It has been observed that SP has a

higher effective throughput than CP based systems. Thus, SP based estimation

can lead to a significant conservation of bandwidth in communication systems.

Acknowledgement


in A. K. Jagannatham and B. D. Rao, “Superimposed Vs. Conventional Pilots for Channel Estimation”, Conference Record of the Fortieth Asilomar Conference on Signals, Systems and Computers, Nov., 2006.

Figure 6.7: Throughput performance of SP and CP Vs. SNR for a 4 × 4 Rayleigh flat-fading MIMO channel with Nf = 10 sub-frames and Lp = 64 pilots. [Plot "Throughput Vs SNR, Nf = 10, Lp = 64, PNR = 5 dB": throughput vs. SNR (dB); curves: SP, CP, SP-SB, SP-SB Asymp, SP-Per CSIR.]

Figure 6.8: Detection performance vs. SNR for SP based estimation. SER vs. SNR ($P_d/\sigma_n^2$) for QPSK signaling, r = 4 SIMO channel and different [Nf , Lp, αs (dB)]. [Plot "SER Vs. SNR for Total Power Constraint": SER vs. SNR (dB); curves [8,4,15], [8,8,12.5], [16,8,12.5], each with its analytical optimum marked.]


6.7.1 Proof of Expression for MSEs in section(6.2)

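The expression for $\mathrm{MSE}_s$ can be simplified as

\[
\mathrm{MSE}_s = \mathrm{tr}\left( E\left\{ (\mathbf{H}\mathbf{X}_d^s)\, \mathbf{F}_p\, (\mathbf{H}\mathbf{X}_d^s)^H \right\} \right) + \mathrm{tr}\left( E\left\{ \mathbf{N} \mathbf{F}_p \mathbf{N}^H \right\} \right),
\]

where $\mathbf{F}_p \triangleq \left( \mathbf{X}_p^H (\mathbf{X}_p \mathbf{X}_p^H)^{-1} \right) \left( \mathbf{X}_p^H (\mathbf{X}_p \mathbf{X}_p^H)^{-1} \right)^H$. Let $\mathbf{U} = [\mathbf{u}(1), \mathbf{u}(2), \ldots, \mathbf{u}(L)] \in \mathbb{C}^{m \times L}$ be any matrix such that $E\{\mathbf{u}(k)\mathbf{u}(l)^H\} = \sigma_u^2 \delta(k-l)\mathbf{I}$. Then, $E\{\mathbf{U}\mathbf{F}_p\mathbf{U}^H\}$ is given as

\begin{align*}
E\left\{ \mathbf{U}\mathbf{F}_p\mathbf{U}^H \right\} &= \sum_{j=1}^{L} \sum_{i=1}^{L} E\left\{ \mathbf{u}(i) [\mathbf{F}_p]_{ij} \mathbf{u}(j)^H \right\} \\
&= \sum_{j=1}^{L} \sum_{i=1}^{L} \sigma_u^2 \delta(i-j) [\mathbf{F}_p]_{ij} \mathbf{I}_m \\
&= \sum_{j=1}^{L} \sigma_u^2 [\mathbf{F}_p]_{jj} \mathbf{I}_m = \sigma_u^2\, \mathrm{tr}(\mathbf{F}_p)\, \mathbf{I}_m.
\end{align*}

Hence, it follows that $E\left\{ \mathbf{X}_d^s \mathbf{F}_p (\mathbf{X}_d^s)^H \right\} = \frac{P_d^s}{N_f} \mathrm{tr}(\mathbf{F}_p)\, \mathbf{I}_t$ and $E\left\{ \mathbf{N}\mathbf{F}_p\mathbf{N}^H \right\} = \frac{\sigma_n^2}{N_f} \mathrm{tr}(\mathbf{F}_p)\, \mathbf{I}_r$. Further,

\begin{align*}
\mathrm{tr}(\mathbf{F}_p) &= \mathrm{tr}\left( \left( \mathbf{X}_p^H (\mathbf{X}_p\mathbf{X}_p^H)^{-1} \right) \left( \mathbf{X}_p^H (\mathbf{X}_p\mathbf{X}_p^H)^{-1} \right)^H \right) \\
&= \mathrm{tr}\left( (\mathbf{X}_p\mathbf{X}_p^H)^{-1} (\mathbf{X}_p\mathbf{X}_p^H)^{-1} \mathbf{X}_p\mathbf{X}_p^H \right) \\
&= \mathrm{tr}\left( (\mathbf{X}_p\mathbf{X}_p^H)^{-1} \right).
\end{align*}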

Hence the expression in (6.7) follows.
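The two identities used in this proof, $\mathrm{tr}(\mathbf{F}_p) = \mathrm{tr}((\mathbf{X}_p\mathbf{X}_p^H)^{-1})$ and $E\{\mathbf{U}\mathbf{F}_p\mathbf{U}^H\} = \sigma_u^2\,\mathrm{tr}(\mathbf{F}_p)\,\mathbf{I}_m$, can be checked numerically. The following is a minimal sketch with hypothetical sizes and a random pilot matrix (an assumption made only so that $\mathbf{X}_p\mathbf{X}_p^H$ is invertible); the expectation is approximated by a Monte Carlo average.

```python
import numpy as np

rng = np.random.default_rng(0)
t, L, m = 4, 12, 3            # hypothetical sizes
sigma2, trials = 0.5, 20000

# Random full-rank pilot matrix (any X_p with invertible X_p X_p^H works).
Xp = rng.standard_normal((t, L)) + 1j * rng.standard_normal((t, L))
gram_inv = np.linalg.inv(Xp @ Xp.conj().T)
A = Xp.conj().T @ gram_inv                 # X_p^H (X_p X_p^H)^{-1}
Fp = A @ A.conj().T

trace_Fp = np.trace(Fp).real
trace_gram_inv = np.trace(gram_inv).real   # should equal tr(F_p)

# Monte Carlo estimate of E{U F_p U^H} for white columns of variance sigma2.
U = np.sqrt(sigma2 / 2) * (rng.standard_normal((trials, m, L))
                           + 1j * rng.standard_normal((trials, m, L)))
avg = np.mean(U @ Fp @ U.conj().transpose(0, 2, 1), axis=0)
expected = sigma2 * trace_Fp * np.eye(m)

print(trace_Fp, trace_gram_inv)
print(np.max(np.abs(avg - expected)))
```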


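It can be observed that

\[
E\left\{ \mathrm{tr}\left( \frac{\partial (\mathbf{y}_s(i) - \mathbf{h}x_p^s(i))^H}{\partial \theta_i} \frac{\partial \mathbf{R}_e^{-1}}{\partial \theta_j^*} (\mathbf{y}_s(i) - \mathbf{h}x_p^s(i)) \right) \right\} = \mathrm{tr}\left( x_p^s(i)^* \frac{\partial \mathbf{h}^H}{\partial \theta_i} \frac{\partial \mathbf{R}_e^{-1}}{\partial \theta_j^*} E\left\{ \mathbf{y}_s(i) - \mathbf{h}x_p^s(i) \right\} \right) = 0,
\]

since $E\{\mathbf{y}_s(i) - \mathbf{h}x_p^s(i)\} = E\{\mathbf{h}x_d^s(i) + \boldsymbol{\eta}(i)\} = 0$. Hence, the total FIM corresponding to information in both $\mathbf{X}_p$ and $\mathbf{R}_e$ can be obtained as $\mathbf{J}_\theta = \mathbf{J}_\theta^p + \mathbf{J}_\theta^r$, where the FIM component $\mathbf{J}_\theta^r$ corresponds to the information in the covariance matrix $\mathbf{R}_e$. From the results for the FIM of a complex Gaussian stochastic process [59,81], the covariance FIM component $\mathbf{J}_\theta^r \in \mathbb{C}^{2r \times 2r}$ is given as

\[
\mathbf{J}_\theta^r(i, j) = \mathbf{J}_\theta^r(r+j, r+i) = N_b (P_d^s)^2\, \mathrm{tr}\left( \frac{\partial \mathbf{h}}{\partial h_i} \mathbf{h}^H \mathbf{R}_e^{-1} \mathbf{h} \left( \frac{\partial \mathbf{h}}{\partial h_j} \right)^H \mathbf{R}_e^{-1} \right), \quad 1 \le i, j \le r,
\]

\[
\mathbf{J}_\theta^r(i, r+j) = \left( \mathbf{J}_\theta^r(r+j, i) \right)^* = N_b (P_d^s)^2\, \mathrm{tr}\left( \frac{\partial \mathbf{h}}{\partial h_i} \mathbf{h}^H \mathbf{R}_e^{-1} \frac{\partial \mathbf{h}}{\partial h_j} \mathbf{h}^H \mathbf{R}_e^{-1} \right), \quad 1 \le i, j \le r.
\]

Figure 6.9: Optimal power allocation ratio $10 \log_{10}\left( \rho_d^\star / \rho_t^\star \right)$ of a r = 4 antenna SIMO channel Vs. Total Power ($\alpha^s$ dB) for various Nf , Lp. [Plot "Optimal Power Allocation for Superimposed Pilots": $\rho_t^{opt}/\rho_s^{opt}$ (dB) vs. total SNR $(\rho_s + \rho_t)$ (dB); curves: Nf = 64, Lp = 32; Nf = 32, Lp = 16; Nf = 16, Lp = 16.]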

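It can then be shown after simplification that the matrix $\mathbf{J}_\theta^r$ is given as

\[
\mathbf{J}_\theta^r = N_b (P_d^s)^2 \begin{bmatrix} (\mathbf{h}^H \mathbf{R}_e^{-1} \mathbf{h}) (\mathbf{R}_e^{-1})^T & (\mathbf{h}^H \mathbf{R}_e^{-1})^T\, \mathbf{h}^H \mathbf{R}_e^{-1} \\ (\mathbf{h}^H \mathbf{R}_e^{-1})^H (\mathbf{h}^H \mathbf{R}_e^{-1})^* & (\mathbf{h}^H \mathbf{R}_e^{-1} \mathbf{h})\, \mathbf{R}_e^{-1} \end{bmatrix}.
\]

Using results on matrix inversion [82], the quantities $\mathbf{h}^H \mathbf{R}_e^{-1}$ and $\mathbf{h}^H \mathbf{R}_e^{-1} \mathbf{h}$ can be further simplified as $\mathbf{h}^H \mathbf{R}_e^{-1} = \frac{\mathbf{h}^H}{\sigma_n^2 + P_d^s \|\mathbf{h}\|^2}$ and $\mathbf{h}^H \mathbf{R}_e^{-1} \mathbf{h} = \frac{\|\mathbf{h}\|^2}{\sigma_n^2 + P_d^s \|\mathbf{h}\|^2}$. Substituting these expressions in the FIM expression above we obtain the final expression for $\mathbf{J}_\theta^r$. The SP FIM is given as

\[
\mathbf{J}_\theta = N_b P_t^s \begin{bmatrix} (\mathbf{R}_e^{-1})^T & \mathbf{0}_{r \times r} \\ \mathbf{0}_{r \times r} & \mathbf{R}_e^{-1} \end{bmatrix} + N_b (P_d^s)^2 \begin{bmatrix} \dfrac{\|\mathbf{h}\|^2 (\mathbf{R}_e^{-1})^T}{\sigma_n^2 + P_d^s \|\mathbf{h}\|^2} & \dfrac{\mathbf{h}^* \mathbf{h}^H}{(\sigma_n^2 + P_d^s \|\mathbf{h}\|^2)^2} \\ \dfrac{\mathbf{h} \mathbf{h}^T}{(\sigma_n^2 + P_d^s \|\mathbf{h}\|^2)^2} & \dfrac{\|\mathbf{h}\|^2 \mathbf{R}_e^{-1}}{\sigma_n^2 + P_d^s \|\mathbf{h}\|^2} \end{bmatrix}.
\]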

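Let the constants $\alpha, \beta, \gamma, \theta$ be defined as $\alpha \triangleq N_b P_t^s$, $\gamma \triangleq \sigma_n^2 + P_d^s \|\mathbf{h}\|^2$, $\beta \triangleq \frac{N_b (P_d^s)^2}{\gamma}$ and $\theta \triangleq \frac{\beta}{\gamma}$. Substituting these in the above expression for the FIM, $\mathbf{J}_\theta$ can be written as

\[
\mathbf{J}_\theta = \underbrace{\begin{bmatrix} (\alpha + \beta \|\mathbf{h}\|^2) (\mathbf{R}_e^{-1})^T & \mathbf{0} \\ \mathbf{0} & (\alpha + \beta \|\mathbf{h}\|^2)\, \mathbf{R}_e^{-1} \end{bmatrix}}_{\mathbf{K}_\theta^{-1}} + \theta \begin{bmatrix} \mathbf{0} & \mathbf{h}^* \mathbf{h}^H \\ \mathbf{h}\mathbf{h}^T & \mathbf{0} \end{bmatrix},
\]

where $\mathbf{K}_\theta$ is defined as

\[
\mathbf{K}_\theta \triangleq \frac{1}{\alpha + \beta \|\mathbf{h}\|^2} \begin{bmatrix} \mathbf{R}_e^T & \mathbf{0} \\ \mathbf{0} & \mathbf{R}_e \end{bmatrix}.
\]

Employing the matrix inversion lemma [82], the CRB for the parameter vector $\theta$ given by $\mathbf{J}_\theta^{-1}$ can be expressed as

\[
\mathbf{J}_\theta^{-1} = \mathbf{K}_\theta - \mathbf{K}_\theta \begin{bmatrix} \mathbf{h}^* & \mathbf{0} \\ \mathbf{0} & \mathbf{h} \end{bmatrix} \mathbf{R}_\theta^{-1} \begin{bmatrix} \mathbf{0} & \mathbf{h}^H \\ \mathbf{h}^T & \mathbf{0} \end{bmatrix} \mathbf{K}_\theta,
\]

where the $2 \times 2$ matrix $\mathbf{R}_\theta$ is defined as

\[
\mathbf{R}_\theta \triangleq \frac{1}{\theta} \mathbf{I}_2 + \begin{bmatrix} \mathbf{0} & \mathbf{h}^H \\ \mathbf{h}^T & \mathbf{0} \end{bmatrix} \mathbf{K}_\theta \begin{bmatrix} \mathbf{h}^* & \mathbf{0} \\ \mathbf{0} & \mathbf{h} \end{bmatrix} = \frac{1}{\theta} \mathbf{I}_2 + \frac{1}{\alpha + \beta \|\mathbf{h}\|^2} \begin{bmatrix} 0 & \mathbf{h}^H \mathbf{R}_e \mathbf{h} \\ \mathbf{h}^T \mathbf{R}_e^T \mathbf{h}^* & 0 \end{bmatrix}.
\]

The MSE bound for the estimation of the parameter vector $\theta$ is given as

\[
\mathrm{MSE}_b = \frac{1}{2} \mathrm{tr}\left( \mathbf{J}_\theta^{-1} \right) = \frac{1}{2} \mathrm{tr}(\mathbf{K}_\theta) - \frac{1}{2} \mathrm{tr}\left( \mathbf{R}_\theta^{-1} \begin{bmatrix} \mathbf{0} & \mathbf{h}^H \\ \mathbf{h}^T & \mathbf{0} \end{bmatrix} \mathbf{K}_\theta \mathbf{K}_\theta \begin{bmatrix} \mathbf{h}^* & \mathbf{0} \\ \mathbf{0} & \mathbf{h} \end{bmatrix} \right).
\]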

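Simplifying the above expression, it can be demonstrated that the MSE lower bound for the estimation of $\mathbf{h}$ is given as

\[
E\left\{ \left\| \hat{\mathbf{h}} - \mathbf{h} \right\|^2 \right\} \ge \frac{\mathrm{tr}(\mathbf{R}_e)}{\alpha + \beta \|\mathbf{h}\|^2} + \frac{(\mathbf{h}^H \mathbf{R}_e \mathbf{h})}{(\alpha + \beta \|\mathbf{h}\|^2)} \frac{(\mathbf{h}^H \mathbf{R}_e \mathbf{R}_e \mathbf{h})}{(\alpha + \beta \|\mathbf{h}\|^2)^2\, |\mathbf{R}_\theta|}, \tag{6.37}
\]

where $|\mathbf{R}_\theta|$ is the determinant of the matrix $\mathbf{R}_\theta$ and is given as

\begin{align*}
|\mathbf{R}_\theta| &= \frac{1}{\theta^2} - \frac{(\mathbf{h}^H \mathbf{R}_e \mathbf{h})(\mathbf{h}^T \mathbf{R}_e^T \mathbf{h}^*)}{(\alpha + \beta \|\mathbf{h}\|^2)^2} = \frac{(\sigma_n^2 + P_d^s \|\mathbf{h}\|^2)^4}{(N_b (P_d^s)^2)^2} - \frac{\|\mathbf{h}\|^4 (\sigma_n^2 + P_d^s \|\mathbf{h}\|^2)^2}{(\alpha + \beta \|\mathbf{h}\|^2)^2} \\
&= \gamma^2 \left( \frac{1}{\beta^2} - \frac{\|\mathbf{h}\|^4}{(\alpha + \beta \|\mathbf{h}\|^2)^2} \right) = \frac{\alpha \gamma^2 (\alpha + 2\beta \|\mathbf{h}\|^2)}{\beta^2 (\alpha + \beta \|\mathbf{h}\|^2)^2}.
\end{align*}

At high SNR, i.e. as $P_d^s \to \infty$, we have $\beta \|\mathbf{h}\|^2 \gg \alpha$ and it can be seen that $|\mathbf{R}_\theta| \to \frac{2\alpha\gamma^2}{\beta^3 \|\mathbf{h}\|^2}$. It can be observed that $\mathbf{h}^H \mathbf{R}_e \mathbf{R}_e \mathbf{h} \to (P_d^s)^2 \|\mathbf{h}\|^6$ and $\frac{(\mathbf{h}^H \mathbf{R}_e \mathbf{h})}{(\alpha + \beta \|\mathbf{h}\|^2)} \to \frac{P_d^s \|\mathbf{h}\|^2}{\beta}$ as $P_d^s \to \infty$. Substituting these and the expression for $|\mathbf{R}_\theta|$ from above in the MSE expression in (6.37), the high SNR CRB asymptote is obtained as

\[
\mathrm{MSE}_b^\infty = P_d^s \left( \lim_{P_d^s \to \infty} \frac{\mathrm{MSE}_b}{P_d^s} \right) = P_d^s \left( \lim_{P_d^s \to \infty} \frac{1}{P_d^s} \left( \frac{\|\mathbf{h}\|^2}{N_b} + \frac{\|\mathbf{h}\|^2 P_d^s}{2 N_b P_t} \right) \right) = \frac{\|\mathbf{h}\|^2 P_d^s}{2 N_b P_t}.
\]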
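The chain of simplifications above can be cross-checked numerically. The sketch below builds the FIM $\mathbf{J}_\theta$ directly, inverts it, and compares $\frac{1}{2}\mathrm{tr}(\mathbf{J}_\theta^{-1})$ with the closed form in (6.37) and with the high SNR expression $\frac{\|\mathbf{h}\|^2}{N_b} + \frac{\|\mathbf{h}\|^2 P_d^s}{2 N_b P_t}$. It assumes the covariance model $\mathbf{R}_e = P_d^s \mathbf{h}\mathbf{h}^H + \sigma_n^2 \mathbf{I}$, which is inferred from the simplification $\mathbf{h}^H\mathbf{R}_e^{-1} = \mathbf{h}^H/(\sigma_n^2 + P_d^s\|\mathbf{h}\|^2)$ rather than quoted from this section; the function name and parameter values are hypothetical.

```python
import numpy as np

def sp_crb(h, n_b, p_t, p_d, sigma2):
    """Return (numeric, closed_form, asymptote) values of the SP MSE bound."""
    r = h.size
    hn2 = np.vdot(h, h).real
    # Assumed covariance model R_e = P_d h h^H + sigma_n^2 I.
    Re = p_d * np.outer(h, h.conj()) + sigma2 * np.eye(r)
    alpha = n_b * p_t
    gamma = sigma2 + p_d * hn2
    beta = n_b * p_d**2 / gamma
    theta = beta / gamma
    s = alpha + beta * hn2
    Re_inv = np.linalg.inv(Re)
    # FIM: K_theta^{-1} plus the off-diagonal rank-two term.
    J = np.block([[s * Re_inv.T, theta * np.outer(h.conj(), h.conj())],
                  [theta * np.outer(h, h), s * Re_inv]])
    numeric = 0.5 * np.trace(np.linalg.inv(J)).real
    det_R = alpha * gamma**2 * (alpha + 2 * beta * hn2) / (beta**2 * s**2)
    hRh = np.vdot(h, Re @ h).real
    hRRh = np.vdot(h, Re @ Re @ h).real
    closed = np.trace(Re).real / s + (hRh / s) * hRRh / (s**2 * det_R)
    asymptote = hn2 / n_b + hn2 * p_d / (2 * n_b * p_t)
    return numeric, closed, asymptote

rng = np.random.default_rng(0)
h = (rng.standard_normal(4) + 1j * rng.standard_normal(4)) / np.sqrt(2)
numeric, closed, _ = sp_crb(h, n_b=120, p_t=1.0, p_d=2.0, sigma2=1.0)
_, closed_hi, asym_hi = sp_crb(h, n_b=120, p_t=1.0, p_d=1e4, sigma2=1.0)
print(numeric, closed, closed_hi / asym_hi)
```

At moderate SNR the direct inverse and the closed form should coincide, while at very high data power the ratio of the closed form to the asymptotic expression should approach one.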


The capacity of the communication channel of (6.19) with uncorrelated

Gaussian noise is given by the well known maximization of mutual information

[2, 11]. When the nature of the noise process v(k) is unknown, the worst case

capacity [39] can be expressed as,

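\[
C_w = \min_{p_v(\cdot),\ \mathrm{tr}(\mathbf{R}_v) = r\sigma_n^2}\ \max_{p_s(\cdot),\ \mathrm{tr}(\mathbf{R}_s) = tP_d} I(\mathbf{y}; \mathbf{s}).
\]

The system in (6.21) can be equivalently written as

\[
\mathbf{y}(k) \equiv \left( \mathbf{I} + \mathbf{R}_{vs} \mathbf{R}_s^{-1} \right) \mathbf{s}(k) + \tilde{\mathbf{v}}(k),
\]

where $\tilde{\mathbf{v}} \triangleq \mathbf{v} - \mathbf{R}_{vs} \mathbf{R}_s^{-1} \mathbf{s}$, and the innovations noise $\tilde{\mathbf{v}}$ is uncorrelated with the source $\mathbf{s}$, i.e. $E\{\tilde{\mathbf{v}} \mathbf{s}^H\} = \mathbf{R}_{vs} - \mathbf{R}_{vs}\mathbf{R}_s^{-1}\mathbf{R}_s = \mathbf{0}$. The covariance $\mathbf{R}_{\tilde{v}}$ is given as $\mathbf{R}_{\tilde{v}} = \mathbf{R}_{v|s} = \mathbf{R}_v - \mathbf{R}_{vs} \mathbf{R}_s^{-1} \mathbf{R}_{sv}$. It can be seen that the transformation

\[
\begin{bmatrix} \tilde{\mathbf{v}} \\ \mathbf{s} \end{bmatrix} = \begin{bmatrix} \mathbf{I}_r & -\mathbf{R}_{vs} \mathbf{R}_s^{-1} \\ \mathbf{0}_{r \times r} & \mathbf{I}_r \end{bmatrix} \begin{bmatrix} \mathbf{v} \\ \mathbf{s} \end{bmatrix} \tag{6.38}
\]

is invertible (since the transform matrix is upper triangular with unit diagonal). Therefore, given a distribution function $p_{v,s}(\cdot)$, there exists a distribution $p_{\tilde{v},s}(\cdot)$ and vice versa. Hence it can be seen that

\[
\min_{p_v(\cdot),\ E\{\mathbf{v}\mathbf{v}^H\} = \mathbf{R}_v}\ \max_{p_s(\cdot),\ E\{\mathbf{s}\mathbf{s}^H\} = \mathbf{R}_s} I(\mathbf{y}; \mathbf{s}) = \min_{p_{\tilde{v}}(\cdot),\ E\{\tilde{\mathbf{v}}\tilde{\mathbf{v}}^H\} = \mathbf{R}_{\tilde{v}}}\ \max_{p_s(\cdot),\ E\{\mathbf{s}\mathbf{s}^H\} = \mathbf{R}_s} I(\mathbf{y}; \mathbf{s}).
\]

Now, employing the result for worst case capacity with uncorrelated noise from [39], the worst case capacity for the above system can be seen to be given as

\begin{align*}
C_w &= \min_{\mathrm{tr}(\mathbf{R}_v) = r\sigma_n^2}\ \max_{\mathrm{tr}(\mathbf{R}_s) = tP_d} \log \left| \mathbf{I} + \mathbf{R}_{\tilde{v}}^{-1} \left( \mathbf{I} + \mathbf{R}_{vs}\mathbf{R}_s^{-1} \right) \mathbf{R}_s \left( \mathbf{I} + \mathbf{R}_{vs}\mathbf{R}_s^{-1} \right)^H \right| \\
&= \min_{\mathrm{tr}(\mathbf{R}_v) = r\sigma_n^2}\ \max_{\mathrm{tr}(\mathbf{R}_s) = tP_d} \log \left| \mathbf{I} + \mathbf{R}_{v|s}^{-1} \left( \mathbf{R}_s + \mathbf{R}_{vs} \right) \mathbf{R}_s^{-1} \left( \mathbf{R}_s + \mathbf{R}_{vs} \right)^H \right|,
\end{align*}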

which is the expression for the worst case capacity in (6.22).
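The decorrelation step behind this argument can be illustrated with covariance algebra alone. The sketch below assumes the innovations are formed as $\tilde{\mathbf{v}} = \mathbf{v} - \mathbf{R}_{vs}\mathbf{R}_s^{-1}\mathbf{s}$ (the choice that makes $E\{\tilde{\mathbf{v}}\mathbf{s}^H\} = \mathbf{0}$), draws a random joint covariance for the stacked vector $[\mathbf{v}; \mathbf{s}]$, and applies the triangular transformation. All sizes and the random covariance are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3  # hypothetical common dimension of v and s

# Random Hermitian positive definite joint covariance of [v; s].
A = rng.standard_normal((2 * n, 2 * n)) + 1j * rng.standard_normal((2 * n, 2 * n))
C = A @ A.conj().T
Rv, Rvs = C[:n, :n], C[:n, n:]
Rsv, Rs = C[n:, :n], C[n:, n:]

# Triangular map [v_tilde; s] = T [v; s] with v_tilde = v - Rvs Rs^{-1} s.
B = Rvs @ np.linalg.inv(Rs)
T = np.block([[np.eye(n), -B],
              [np.zeros((n, n)), np.eye(n)]])
C_tilde = T @ C @ T.conj().T          # joint covariance of [v_tilde; s]

cross = C_tilde[:n, n:]               # E{v_tilde s^H}, expected to vanish
R_innov = C_tilde[:n, :n]             # expected to equal Rv - Rvs Rs^{-1} Rsv
print(np.max(np.abs(cross)))
```

The cross-covariance block vanishes and the upper-left block reduces to the conditional covariance $\mathbf{R}_{v|s}$, while the source covariance is left unchanged.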

6.7.4 MVDR - Post-Processing SNR

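Below, we derive the expression in (6.33) for the MVDR post-processing SNR $\kappa_m$. From table 6.1, the covariance of the effective noise is $\mathbf{R}_v^s = \beta_h \mathbf{h}\mathbf{h}^H + \beta_n \mathbf{I}$, where the constants $\beta_h, \beta_n$ are defined as $\beta_h \triangleq \frac{P_d^s}{N_b} \left( 1 + \frac{P_d^s}{P_t^s} \right)$ and $\beta_n \triangleq \frac{\sigma_n^2}{N_b} \left( 1 + \frac{P_d^s}{P_t^s} \right) + \sigma_n^2$. Using results on matrix inversion [82], the matrix $(\mathbf{R}_v^s)^{-1}$ can be expressed as

\begin{align*}
(\mathbf{R}_v^s)^{-1} &= \frac{1}{\beta_h} \left( \frac{\beta_h}{\beta_n} \mathbf{I} - \frac{\beta_h}{\beta_n} \mathbf{I} \mathbf{h} \left( 1 + \mathbf{h}^H \frac{\beta_h}{\beta_n} \mathbf{I} \mathbf{h} \right)^{-1} \mathbf{h}^H \frac{\beta_h}{\beta_n} \mathbf{I} \right) \\
&= \frac{1}{\beta_n} \mathbf{I} - \left( \frac{\beta_h}{\beta_n} \right) \frac{\mathbf{h}\mathbf{h}^H}{\beta_n + \beta_h \|\mathbf{h}\|^2}. \tag{6.39}
\end{align*}

Substituting this expression for $(\mathbf{R}_v^s)^{-1}$ in (6.32), the expression for $\kappa_m$ can be simplified as

\begin{align*}
\kappa_m &= P_d^s\, \hat{\mathbf{h}}_s^H (\mathbf{R}_v^s)^{-1} \hat{\mathbf{h}}_s \\
&\approx P_d^s\, \mathbf{h}^H (\mathbf{R}_v^s)^{-1} \mathbf{h} \\
&= \frac{P_d^s}{\beta_n} \|\mathbf{h}\|^2 - \left( \frac{\beta_h}{\beta_n} \right) \frac{P_d^s \|\mathbf{h}\|^4}{\beta_n + \beta_h \|\mathbf{h}\|^2} \\
&= \frac{P_d^s \|\mathbf{h}\|^2}{\beta_n + \beta_h \|\mathbf{h}\|^2}.
\end{align*}

Substituting the expressions for $\beta_h, \beta_n$ defined above, the final expression for $\kappa_m$ in terms of the quantities $P_d^s, P_t^s$ is obtained as

\[
\kappa_m = \frac{N_b \|\mathbf{h}\|^2 P_d^s P_t^s}{\|\mathbf{h}\|^2 P_d^s (P_d^s + P_t^s) + \sigma_n^2 \left( P_t^s (1 + N_b) + P_d^s \right)},
\]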

which reduces to the expression in (6.33).
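The derivation can be verified numerically for a randomly drawn channel. The following sketch forms $\mathbf{R}_v^s = \beta_h\mathbf{h}\mathbf{h}^H + \beta_n\mathbf{I}$, checks the inverse in (6.39), computes the MVDR weights, and compares $P_d^s\,\mathbf{h}^H(\mathbf{R}_v^s)^{-1}\mathbf{h}$ with the closed form in (6.33). It uses the true channel in place of the estimate (the approximation made in the derivation), and all numerical values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
r, n_b = 4, 128                       # hypothetical antenna count and block length
p_d, p_t, sigma2 = 2.0, 1.5, 0.5      # hypothetical data power, pilot power, noise power

h = (rng.standard_normal(r) + 1j * rng.standard_normal(r)) / np.sqrt(2)
hn2 = np.vdot(h, h).real

beta_h = p_d / n_b * (1 + p_d / p_t)
beta_n = sigma2 / n_b * (1 + p_d / p_t) + sigma2
Rv = beta_h * np.outer(h, h.conj()) + beta_n * np.eye(r)

# Closed-form inverse from (6.39).
Rv_inv_closed = (np.eye(r) / beta_n
                 - (beta_h / beta_n) * np.outer(h, h.conj()) / (beta_n + beta_h * hn2))

# MVDR weights w = R^{-1} h / (h^H R^{-1} h), so that w^H h = 1.
Rinv_h = np.linalg.solve(Rv, h)
w = Rinv_h / np.vdot(h, Rinv_h)

kappa_numeric = p_d * np.vdot(h, Rinv_h).real
rho_d, rho_t = p_d / sigma2, p_t / sigma2
kappa_closed = (rho_t * rho_d * n_b * hn2
                / ((rho_d + rho_t) * rho_d * hn2 + rho_t * (n_b + 1) + rho_d))
print(kappa_numeric, kappa_closed)
```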

7 MIMO Time Varying Channel Estimation

7.1 Introduction

The previous chapter has studied a novel scheme for channel estimation

based on superimposed pilots(SP). In such systems, pilot symbols are not trans-

mitted exclusively but superimposed over the information symbols and hence do

not result in a bandwidth overhead. In [84] it has been demonstrated that the

transmission of such a sequence of superimposed pilot symbols can in fact result

in increased throughput performance.

The problem of MIMO channel estimation is further complicated by the

relative motion between the base station and mobile terminals. This results in

a time varying channel arising due to the doppler shift in the carrier signal. The

velocity of the mobile terminal dictates the doppler bandwidth of the channel which

in turn determines the coherence time of the channel, or the time duration for

which the channel can be assumed to be static. Thus, as the velocity of the mobile

node increases, the coherence time decreases. A popular scheme to estimate such

a time varying channel is the auto-regressive (AR) modeling based Kalman filter

estimation [13]. More recently, in works such as [85,86], complex exponential basis

expansion modeling (CEBEM) based channel estimation with superimposed pilots

has been shown to yield promising results. The CEBEM models for time-varying

channels were first presented in [87] and have recently gained much attention. In


this work, we study the performance of SP for CEBEM based MIMO time-selective

channel estimation.

Further, the performance of the SP based channel estimation scheme

can be significantly improved by employing a soft-decision based iterative algo-

rithm. The expectation-maximization algorithm[82,88] provides a framework that

is suited for such a procedure since the unknown information symbols can be

treated as missing data. However, one of the major shortcomings of such a scheme

is the associated high computational cost. For instance, employing a 16-QAM

constellation in a 4 transmit antenna MIMO system, it is necessary to perform

$16^4 = 65{,}536$ likelihood computations, which is prohibitively high. In this context,

we suggest a novel modification to the EM algorithm based on the sphere-decoding

algorithm[89]. This scheme can reduce the computational complexity order by re-

ducing the number of likelihood computations to the number of sphere vectors.

This scheme trades off computational complexity for MSE performance and has a slightly higher MSE owing to the sub-optimality resulting from the selection of fewer source vectors. In the end we present simulation results comparing the performance of different time-varying MIMO channel estimation schemes. In the discussion that follows the notation $k \in m, n$ represents $m \le k \le n$, where $k, m, n$ are integers. The vector $\mathbf{e}_m$ is defined as $\mathbf{e}_m \triangleq [1, 1, \ldots, 1]^T \in \mathbb{C}^{m \times 1}$.

7.2 Problem Setup

Consider an r×t MIMO system, i.e. a MIMO system with r receive and t

transmit antennas. Let the multiple-input multiple-output (MIMO) system model

be described as

\[
\mathbf{y}(l) = \int_{-\infty}^{\infty} \mathbf{H}(l, \tau)\, \mathbf{x}(l - \tau)\, d\tau + \boldsymbol{\eta}(l), \tag{7.1}
\]

where $l$ denotes continuous time and $\mathbf{H}(l, \tau) \in \mathbb{C}^{r \times t}$ is the time-varying MIMO channel impulse response. For simplicity, we assume that the coherence bandwidth $B_c$ of the channel is such that $B_c \gg R_s$, where $R_s$ is the symbol rate. Hence, from [3], the MIMO channel response $\mathbf{H}(l, \tau)$ can be approximated by the time-selective but frequency-flat response $\mathbf{H}(l, \tau) = \mathbf{H}(l)\delta(\tau)$. The discrete time MIMO system model at the sampling instants can be represented as

\[
\mathbf{y}(k) = \mathbf{H}(k)\mathbf{x}(k) + \boldsymbol{\eta}(k), \tag{7.2}
\]

where the index $k$ denotes the sampling index. We now consider the problem of estimation of the channel $\mathbf{H}(k)$ using superimposed pilot symbols. Let a frame of $N_b$ symbols be transmitted by repeated superposition of a pilot sequence $\mathbf{X}_p = [\mathbf{x}_p(1), \mathbf{x}_p(2), \ldots, \mathbf{x}_p(L_p)]$ of length $L_p$ symbols, i.e. $\mathbf{x}_p(k) = \mathbf{x}_p(\mathrm{mod}(k-1, L_p) + 1)$. Hence, the SP system model can be derived as

\[
\mathbf{y}(k) = \mathbf{H}(k) \left( \mathbf{x}_d(k) + \mathbf{x}_p(\mathrm{mod}(k-1, L_p) + 1) \right) + \boldsymbol{\eta}(k), \tag{7.3}
\]

where $\mathbf{x}_d(k)$ are the stochastic zero-mean ($E\{\mathbf{x}_d(k)\} = \mathbf{0}$) transmitted data symbols of power $P_d$, i.e. $E\{\mathbf{x}_d(k)\mathbf{x}_d(k)^H\} = P_d \mathbf{I}_t$.

7.2.1 SP Estimation Based on the CEBEM MIMO Model

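Let $f_d$ be the maximum frequency of the Doppler spread. Then, $\tilde{f}_d \triangleq f_d / R_s$ is the normalized Doppler component. From [85], the complex exponential basis expansion of the time varying MIMO channel $\mathbf{H}(k)$ can be expressed as

\[
\mathbf{H}(k) = \sum_{v=-V}^{V} \tilde{\mathbf{H}}(v)\, e^{j 2\pi v (k-1)/N_b}, \qquad V \triangleq \lceil \tilde{f}_d N_b \rceil,
\]

where the matrices $\tilde{\mathbf{H}}(v) \in \mathbb{C}^{r \times t}$, $-V \le v \le V$ are the coefficient matrices of the CEBEM model and $\mathcal{H} \triangleq \left[ \tilde{\mathbf{H}}(-V), \tilde{\mathbf{H}}(-V+1), \ldots, \tilde{\mathbf{H}}(V) \right]$. Let the exponential basis vector $\mathbf{d}_v(k) \in \mathbb{C}^{(2V+1) \times 1}$ be defined as

\[
\mathbf{d}_v(k) \triangleq \left[ e^{-j 2\pi V (k-1)/N_b},\ e^{-j 2\pi (V-1)(k-1)/N_b},\ \ldots,\ e^{j 2\pi V (k-1)/N_b} \right]^T.
\]

Hence, the equivalent CEBEM based SP system model can be described as

\[
\mathbf{y}(k) = \mathcal{H}\, \mathbf{D}_v(k) \left( \mathbf{x}_d(k) + \mathbf{x}_p(\mathrm{mod}(k-1, L_p) + 1) \right) + \boldsymbol{\eta}(k),
\]

where $\mathbf{D}_v(k) \triangleq (\mathbf{d}_v(k) \otimes \mathbf{I}_t)$ and $\otimes$ denotes the matrix Kronecker product. Let the block diagonal matrices $\mathbf{X}_p, \mathbf{X}_s$ be given as

\[
\mathbf{X}_p \triangleq \begin{bmatrix} \mathbf{x}_p(1) & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \mathbf{x}_p(2) & \cdots & \mathbf{0} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \cdots & \mathbf{x}_p(N_b) \end{bmatrix}, \qquad \mathbf{X}_s \triangleq \begin{bmatrix} \mathbf{x}_s(1) & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \mathbf{x}_s(2) & \cdots & \mathbf{0} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \cdots & \mathbf{x}_s(N_b) \end{bmatrix},
\]

where $\mathbf{x}_s(k) \triangleq \mathbf{x}_d(k) + \mathbf{x}_p(\mathrm{mod}(k-1, L_p) + 1)$ is the symbol transmitted at the $k$th symbol instant, and let $\mathbf{X}_d$ be defined analogously from the data vectors $\mathbf{x}_d(k)$. Let $\mathbf{D}_v \triangleq [\mathbf{d}_v(1), \mathbf{d}_v(2), \ldots, \mathbf{d}_v(N_b)]$. Let the received symbol matrix $\mathbf{Y}_s$ be defined as $\mathbf{Y}_s \triangleq [\mathbf{y}_s(1), \mathbf{y}_s(2), \ldots, \mathbf{y}_s(N_b)]$. The above CEBEM estimation scenario can be recast as

\[
\mathbf{Y}_s = \mathcal{H} \left( \mathbf{D}_v \otimes \mathbf{I}_t \right) \mathbf{X}_p + \mathcal{H} \left( \mathbf{D}_v \otimes \mathbf{I}_t \right) \mathbf{X}_d + \mathbf{N}_s,
\]

where the matrix $\mathbf{N}_s$ represents a stacking of the noise vectors $\boldsymbol{\eta}(k)$ similar to $\mathbf{Y}_s$ defined above. The estimate $\hat{\mathcal{H}}_p$ is given as

\[
\hat{\mathcal{H}}_p = \mathbf{Y}_s \left[ \left( \mathbf{D}_v \otimes \mathbf{I}_t \right) \mathbf{X}_p \right]^{\dagger}, \tag{7.4}
\]

where $\dagger$ denotes the Moore-Penrose pseudo-inverse [82]. The time-selective channel estimate is now given as

\[
\hat{\mathbf{H}}(k) \triangleq \hat{\mathcal{H}}_p \left[ \mathbf{d}_v(k) \otimes \mathbf{I}_t \right].
\]

Thus, one can arrive at an estimate of the time-varying MIMO channel $\mathbf{H}(k)$. In the next section we present an EM based iterative algorithm to enhance the above channel estimate.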
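The pilot-based CEBEM estimate in (7.4) amounts to a least squares fit against the regressors $(\mathbf{d}_v(k) \otimes \mathbf{I}_t)\,\mathbf{x}_p(k)$. The sketch below illustrates this on a small synthetic system; the system sizes, the pilot pattern, the data and noise amplitudes, and all helper names are hypothetical, and the basis uses the exponent $j2\pi v(k-1)/N_b$ from the channel expansion.

```python
import numpy as np

def basis(k, V, n_b):
    """Exponential basis vector d_v(k) for the 1-based symbol index k."""
    return np.exp(2j * np.pi * np.arange(-V, V + 1) * (k - 1) / n_b)

def regressors(X, V):
    """Column k holds (d_v(k) kron I_t) x(k), which equals d_v(k) kron x(k)."""
    n_b = X.shape[1]
    return np.stack([np.kron(basis(k + 1, V, n_b), X[:, k])
                     for k in range(n_b)], axis=1)

rng = np.random.default_rng(3)
r, t, V, L_p, N_f = 2, 2, 1, 4, 6          # hypothetical small system
n_b = L_p * N_f

def crandn(*shape):
    """Unit-variance circularly symmetric complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

coeffs = crandn(r, (2 * V + 1) * t)         # true stacked CEBEM coefficients
pilot = np.array([[1, 1, 1, 1], [1, -1, 1, -1]], dtype=complex)
X_p = np.tile(pilot, (1, N_f))              # pilot sequence repeated N_f times
X_d = 0.3 * (rng.choice([-1.0, 1.0], size=(t, n_b))
             + 1j * rng.choice([-1.0, 1.0], size=(t, n_b)))   # QPSK data

Phi_p = regressors(X_p, V)
Y = coeffs @ regressors(X_p + X_d, V) + 0.05 * crandn(r, n_b)

coeffs_hat = Y @ np.linalg.pinv(Phi_p)      # pilot-based estimate, as in (7.4)
H_hat_1 = coeffs_hat @ np.kron(basis(1, V, n_b)[:, None], np.eye(t))  # channel at k = 1

# Without data and noise the pseudo-inverse recovers the coefficients exactly.
clean = (coeffs @ Phi_p) @ np.linalg.pinv(Phi_p)
print(np.max(np.abs(coeffs_hat - coeffs)), np.max(np.abs(clean - coeffs)))
```

The estimate is exact only when the superimposed data and noise are absent; with data present, the residual error is the data interference that the EM refinement of the next section aims to reduce.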

7.3 EM Based Algorithm for CEBEM SP Estimation

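Let the symbols at each transmit antenna be drawn from a set $\Omega$ of size $|\Omega|$. Let $\Gamma$ denote the set of all possible symbol vectors from which each data symbol $\mathbf{x}_d(k)$ is drawn, i.e. $\mathbf{x}_d(k) \in \Gamma$ and $\Gamma = \Omega \times \Omega \times \ldots \times \Omega$. The size of this set is given as $|\Gamma| = |\Omega|^t$. Let this set $\Gamma$ be indexed by $j \in 1, |\Omega|^t$, where $\mathbf{x}_j$ denotes the $j$-th symbol vector in the set. Also, let $\mathbf{x}_j \in \Gamma$ have an a priori probability denoted by $\gamma_j$, i.e. $p(\mathbf{x}_d(i) = \mathbf{x}_j) = \gamma_j, \forall i$. Define the indicator parameter $\chi_i(j)$, $i \in 1, N_b$, $j \in 1, |\Omega|^t$ as $\chi_i(j) = 1$ if $\mathbf{x}_d(i) = \mathbf{x}_j$ and 0 otherwise. It can be seen that $\mathbf{x}_d(i)$ are given as a function of the $\chi_i(j)$ as $\mathbf{x}_d(i) = \sum_{j=1}^{|\Omega|^t} \chi_i(j)\mathbf{x}_j$. The ML estimate of the CEBEM coefficient matrix $\mathcal{H}$ is given by optimization of the cost function

\[
\hat{\mathcal{H}} = \arg\min_{\mathcal{H}} \left\| \mathbf{Y}_s - \mathcal{H} \left[ \left( \mathbf{D}_v \otimes \mathbf{I}_t \right) \mathbf{X}_s \right] \right\|^2.
\]

The estimate $\hat{\mathcal{H}}$ is then given by the least squares solution

\[
\hat{\mathcal{H}} = \mathbf{Y}_s \left[ \left( \mathbf{D}_v \otimes \mathbf{I}_t \right) \mathbf{X}_s \right]^{\dagger}.
\]

The closed form solution above can be implemented with very low computational complexity if the $\chi_i(j)$ were known. It can thus be seen that this formulation naturally leads the way for application of the EM algorithm by defining the complete data $\Pi = (\mathbf{Y}_s, \chi)$, where $\chi \in \{0, 1\}^{|\Omega|^t \times N_b}$ is defined as $\chi \triangleq [\chi_1, \chi_2, \ldots, \chi_{N_b}]$ and $\chi_i \triangleq [\chi_i(1), \chi_i(2), \ldots, \chi_i(|\Omega|^t)]^T \in \{0, 1\}^{|\Omega|^t \times 1}$, $i \in 1, N_b$. The matrix $\chi$ is popularly known as the 'missing' data. In Gaussian noise, the log-likelihood of the complete data $\Pi$ is given by the sum of the log-likelihoods of the individual $\mathbf{y}_s(i)$ as

\[
L_g(\Pi; \mathcal{H}) = \sum_{k=1}^{N_b} \sum_{j=1}^{|\Omega|^t} \chi_k(j) \left\| \mathbf{y}_s(k) - \mathcal{H}\mathbf{D}_v(k) \left( \mathbf{x}_p(k) + \mathbf{x}_j \right) \right\|^2. \tag{7.5}
\]

The EM algorithm can now be employed to compute the ML estimate of the matrix $\mathcal{H}$ as follows. Let $\mathcal{H}^{(k)}$ denote the estimate of $\mathcal{H}$ at the $k$th iteration. The algorithm is initialized by the pilot estimate $\mathcal{H}^{(0)} = \hat{\mathcal{H}}_p$. Then, $\mathcal{H}^{(k)}$ is given as

\[
\mathcal{H}^{(k)} = \arg\min_{\mathcal{H}} U^{(k-1)}(\Pi; \mathcal{H}), \tag{7.6}
\]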

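where the quantity $U^{(k-1)}(\Pi; \mathcal{H})$ is defined as

\begin{align*}
U^{(k-1)}(\Pi; \mathcal{H}) &\triangleq \int_{-\infty}^{\infty} p\left( \chi | \mathbf{Y}_s; \mathcal{H}^{(k-1)} \right) L_g(\Pi; \mathcal{H})\, d\chi \\
&= \sum_{k=1}^{N_b} \sum_{j=1}^{|\Omega|^t} E\left\{ \chi_k(j) \,\middle|\, \mathcal{H}^{(k-1)} \right\} \left\| \mathbf{y}_s(k) - \mathcal{H}\mathbf{D}_v(k) \left( \mathbf{x}_p(k) + \mathbf{x}_j \right) \right\|^2.
\end{align*}

By definition of $\chi_i(j)$ it follows that $E\{\chi_i(j) | \mathcal{H}^{(k)}\} = p\left( \mathbf{x}_j | \mathbf{y}_s(i); \mathcal{H}^{(k)} \right)$, which is given as

\begin{align}
E\left\{ \chi_i(j) \,\middle|\, \mathcal{H}^{(k)} \right\} = p\left( \mathbf{x}_j | \mathbf{y}_s(i); \mathcal{H}^{(k)} \right) &= \frac{p\left( \mathbf{y}_s(i), \mathbf{x}_j; \mathcal{H}^{(k)} \right)}{p\left( \mathbf{y}_s(i); \mathcal{H}^{(k)} \right)} \tag{7.7} \\
&= \frac{p\left( \mathbf{y}_s(i) | \mathbf{x}_j; \mathcal{H}^{(k)} \right) p(\mathbf{x}_j)}{\sum_{l=1}^{|\Omega|^t} p\left( \mathbf{y}_s(i) | \mathbf{x}_l; \mathcal{H}^{(k)} \right) p(\mathbf{x}_l)}, \tag{7.8}
\end{align}

where the likelihood $p\left( \mathbf{y}_s(i) | \mathbf{x}_j; \mathcal{H}^{(k)} \right)$ is given, up to a constant factor that cancels in (7.8), by the exponential function

\[
p\left( \mathbf{y}_s(i) | \mathbf{x}_j; \mathcal{H}^{(k)} \right) \propto e^{-\left\| \mathbf{y}_s(i) - \mathcal{H}^{(k)}\mathbf{D}_v(i) \left( \mathbf{x}_p(i) + \mathbf{x}_j \right) \right\|^2}.
\]

The closed form expression for $\mathcal{H}^{(k)}$ which solves the optimization in (7.6) is given as

\[
\mathcal{H}^{(k)} = \mathbf{R}_{yx}^{(k)} \left( \mathbf{R}_{xx}^{(k)} \right)^{-1},
\]

where the matrices $\mathbf{R}_{yx}^{(k)}$ and $\mathbf{R}_{xx}^{(k)}$ are defined as

\begin{align*}
\mathbf{R}_{yx}^{(k)} &\triangleq \frac{1}{N_b} \sum_{i=1}^{N_b} \sum_{j=1}^{|\Omega|^t} p\left( \mathbf{x}_j | \mathbf{y}_s(i); \mathcal{H}^{(k-1)} \right) \mathbf{y}_s(i) \left( \mathbf{x}_p(i) + \mathbf{x}_j \right)^H \mathbf{D}_v(i)^H, \\
\mathbf{R}_{xx}^{(k)} &\triangleq \frac{1}{N_b} \sum_{i=1}^{N_b} \sum_{j=1}^{|\Omega|^t} p\left( \mathbf{x}_j | \mathbf{y}_s(i); \mathcal{H}^{(k-1)} \right) \mathbf{D}_v(i) \left( \mathbf{x}_p(i) + \mathbf{x}_j \right) \left( \mathbf{x}_p(i) + \mathbf{x}_j \right)^H \mathbf{D}_v(i)^H.
\end{align*}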
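One EM iteration, i.e. the posterior computation of (7.8) followed by the closed-form update $\mathbf{R}_{yx}(\mathbf{R}_{xx})^{-1}$, can be sketched as follows. This is an illustrative sketch rather than the implementation used for the reported results: it assumes a uniform prior over the candidate vectors, a known noise variance `sigma2` inside the exponent, QPSK data on a small hypothetical system, and exhaustive enumeration of all $|\Omega|^t$ candidates. All function and variable names are hypothetical.

```python
import itertools
import numpy as np

def basis(k, V, n_b):
    """Exponential basis vector d_v(k) for the 1-based symbol index k."""
    return np.exp(2j * np.pi * np.arange(-V, V + 1) * (k - 1) / n_b)

def e_step(Y, coeffs, X_p, cands, V, sigma2):
    """Posterior probability of every candidate vector at every symbol instant."""
    n_b = Y.shape[1]
    post = np.empty((n_b, cands.shape[0]))
    for i in range(n_b):
        d = basis(i + 1, V, n_b)
        # One regressor column d_v(i) kron (x_p(i) + x_j) per candidate x_j.
        Phi = np.stack([np.kron(d, X_p[:, i] + x) for x in cands], axis=1)
        resid = Y[:, [i]] - coeffs @ Phi
        log_p = -np.sum(np.abs(resid) ** 2, axis=0) / sigma2
        log_p -= log_p.max()                  # guard against underflow
        post[i] = np.exp(log_p) / np.exp(log_p).sum()
    return post

def m_step(Y, post, X_p, cands, V):
    """Closed-form coefficient update R_yx R_xx^{-1}."""
    n_b = Y.shape[1]
    dim = (2 * V + 1) * X_p.shape[0]
    R_yx = np.zeros((Y.shape[0], dim), dtype=complex)
    R_xx = np.zeros((dim, dim), dtype=complex)
    for i in range(n_b):
        d = basis(i + 1, V, n_b)
        for p, x in zip(post[i], cands):
            phi = np.kron(d, X_p[:, i] + x)
            R_yx += p * np.outer(Y[:, i], phi.conj())
            R_xx += p * np.outer(phi, phi.conj())
    return (R_yx / n_b) @ np.linalg.inv(R_xx / n_b)

rng = np.random.default_rng(5)
r, t, V, n_b, sigma2 = 4, 2, 1, 24, 1e-4      # hypothetical system
qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
cands = np.array(list(itertools.product(qpsk, repeat=t)))   # all |Omega|^t vectors

coeffs = (rng.standard_normal((r, 3 * t)) + 1j * rng.standard_normal((r, 3 * t))) / np.sqrt(2)
X_p = np.tile(np.array([[1, 1, 1, 1], [1, -1, 1, -1]], dtype=complex), (1, n_b // 4))
idx = rng.integers(0, cands.shape[0], size=n_b)               # true data indices
noise = np.sqrt(sigma2 / 2) * (rng.standard_normal((r, n_b)) + 1j * rng.standard_normal((r, n_b)))
Y = np.stack([coeffs @ np.kron(basis(i + 1, V, n_b), X_p[:, i] + cands[idx[i]])
              for i in range(n_b)], axis=1) + noise

post = e_step(Y, coeffs, X_p, cands, V, sigma2)   # posteriors under the true channel
coeffs_new = m_step(Y, post, X_p, cands, V)
print(np.max(np.abs(coeffs_new - coeffs)))
```

In practice the E-step would be evaluated at the previous iterate (starting from the pilot estimate) rather than at the true coefficients; the true channel is used here only to keep the check deterministic.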

7.3.1 Likelihood computation and Sphere Decoding

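The complexity of the posterior probability computation in (7.8) is of the order of $O\left( |\Omega|^t \right)$ likelihood computations, such as $p\left( \mathbf{y}_s(i) | \mathbf{x}_j; \mathcal{H}^{(k)} \right)$, $j \in 1, |\Omega|^t$, for each $i \in 1, N_b$. As seen above, for a $4 \times 4$ MIMO system employing a 16 QAM signal constellation, $|\Omega|^t \approx 10^5$, which makes the computational complexity prohibitively high for real-time implementation. It can also be observed that the quantity $p\left( \mathbf{y}_s(i) | \mathbf{x}_j; \mathcal{H}^{(k)} \right)$, $j \in 1, |\Omega|^t$ is significantly different from zero only for symbol vectors $\mathbf{x}_j$ in the neighborhood of $\mathbf{x}_d(i) = \sum_{j=1}^{|\Omega|^t} \chi_i(j)\mathbf{x}_j$. The sphere decoding algorithm described in [89] for maximum likelihood detection in MIMO systems can be employed to find the vectors $\mathbf{x}_j$ such that

\[
\left\| \mathbf{y}_s(i) - \mathcal{H}^{(k)}\mathbf{D}_v(i) \left( \mathbf{x}_p(i) + \mathbf{x}_j \right) \right\| \le r_e, \tag{7.9}
\]

where $r_e$ is the sphere radius. We then choose $r_e$ such that only a few signal vectors, $N_{sp}$ with $N_c < N_{sp} \ll |\Omega|^t$, lie in the sphere. $N_c$ is a certain 'critical' number of constellation vectors which contribute significantly to the likelihood expression for each received symbol $\mathbf{y}_s(i)$. These $N_{sp}$ symbols can then be used to compute the probabilities $p\left( \mathbf{y}_s(i) | \mathbf{x}_j; \mathcal{H}^{(k)} \right)$ of the vectors $\mathbf{x}_j$. Let $\mathbf{F}^{(k)}(i) \in \mathbb{R}^{2r \times 2t}$ be defined as

\[
\mathbf{F}^{(k)}(i) \triangleq \begin{bmatrix} \mathrm{Re}\left( \mathcal{H}^{(k)}\mathbf{D}_v(i) \right) & -\mathrm{Im}\left( \mathcal{H}^{(k)}\mathbf{D}_v(i) \right) \\ \mathrm{Im}\left( \mathcal{H}^{(k)}\mathbf{D}_v(i) \right) & \mathrm{Re}\left( \mathcal{H}^{(k)}\mathbf{D}_v(i) \right) \end{bmatrix}, \tag{7.10}
\]

and $\bar{\mathbf{y}}_s(i) \in \mathbb{R}^{2r \times 1}$ be defined as $\bar{\mathbf{y}}_s(i) = \left[ \mathrm{Re}\left( \mathbf{y}_s(i)^T \right), \mathrm{Im}\left( \mathbf{y}_s(i)^T \right) \right]^T$. Then, the condition in (7.9) reduces to building the set of sphere vectors $\Omega_i$ such that

\[
\Omega_i \triangleq \left\{ \mathbf{x}_j : \left\| \bar{\mathbf{y}}_s(i) - \mathbf{F} \left( \bar{\mathbf{x}}_p(i) + \bar{\mathbf{x}}_j \right) \right\| \le r_e \right\}, \tag{7.11}
\]

where $\bar{\mathbf{x}}_j, \bar{\mathbf{x}}_p(i)$ are obtained by a stacking of $\mathbf{x}_j, \mathbf{x}_p(i)$ respectively, similar to $\bar{\mathbf{y}}_s(i)$. The construction of the above set can be further simplified as follows. Let the source vectors $\mathbf{x}_j$ be drawn from a 16QAM symbol constellation; the scheme can be readily extended to square constellations of other sizes. Then, $\mathbf{x}_j$ can be represented as $\mathbf{x}_j = \sqrt{\frac{P_d}{10}} \left( 2\mathbf{s}_j - 5\left( \mathbf{e}_t + \sqrt{-1}\, \mathbf{e}_t \right) \right)$, where $\mathbf{s}_j = \mathbf{s}_p^j + \sqrt{-1}\, \mathbf{s}_q^j$. Each element of the vectors $\mathbf{s}_p^j, \mathbf{s}_q^j$ is drawn from the set $S \triangleq \{ S_{min}, \ldots, S_{max} \}$, where $S_{min} = 1$, $S_{max} = 4$ for the 16 QAM constellation. The above set $\Omega_i$ can be recast as

\[
\Omega_i \triangleq \left\{ \bar{\mathbf{s}}_j : \left\| \breve{\mathbf{y}}_s(i) - \mathbf{G}\bar{\mathbf{s}}_j \right\| \le r_e \right\}, \tag{7.12}
\]

where $\breve{\mathbf{y}}_s(i) \triangleq \bar{\mathbf{y}}_s(i) - \mathbf{F}\bar{\mathbf{x}}_p(i) + \sqrt{\frac{5P_d}{2}}\, \mathbf{F}\mathbf{e}_{2t}$ and $\mathbf{G} \triangleq \sqrt{\frac{2P_d}{5}}\, \mathbf{F}$. The vector $\bar{\mathbf{s}}_j$ is obtained from $\mathbf{s}_j$ similar to $\bar{\mathbf{y}}_s(i)$ defined above. Let $\mathbf{V}\mathbf{R} = \mathbf{G}$ be the QR factorization of $\mathbf{G}$ (i.e. $\mathbf{V}^H\mathbf{V} = \mathbf{I}$ and $\mathbf{R}$ is upper triangular). Let $\mathbf{V}$ be block partitioned as $\mathbf{V} = [\mathbf{V}_r, \mathbf{V}_n]$, $\mathbf{V}_r \in \mathbb{R}^{2r \times 2t}$, $\mathbf{V}_n \in \mathbb{R}^{2r \times (2r - 2t)}$. Let $\mathbf{u} = [u_1, u_2, \ldots, u_{2t}]^T \triangleq \mathbf{V}_r^H \breve{\mathbf{y}}_s(i)$. Below, we adapt the sphere decoding algorithm in [89] to find the set of sphere vectors $\Omega_i$. Let $r_{m,n}$ denote the $(m, n)$th entry of the matrix $\mathbf{R}$.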

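Algorithm

S.1 Set $k = 2t$, $r_{2t}^2 = r_e^2 - \left\| \mathbf{V}_n^H \breve{\mathbf{y}}_s(i) \right\|^2$, $u_{2t|2t+1} = u_{2t}$, $\Omega_i = \emptyset$.

S.2 Computation of bounds for $s_k$:

\[
U_k \triangleq \min\left( \left\lfloor \frac{r_k + u_{k|k+1}}{r_{k,k}} \right\rfloor, S_{max} \right), \qquad s_k = \max\left( \left\lceil \frac{-r_k + u_{k|k+1}}{r_{k,k}} \right\rceil - 1,\ S_{min} - 1 \right).
\]

S.3 Set $s_k = s_k + 1$. If $s_k \le U_k$ go to S.5. Else go to S.4.

S.4 Set $k = k + 1$. If $k = 2t + 1$ terminate algorithm. Else go to S.3.

S.5 If $k = 1$ go to S.6. Else, set $k = k - 1$,

\[
u_{k|k+1} = u_k - \sum_{j=k+1}^{2t} r_{k,j} s_j, \qquad r_k^2 = r_{k+1}^2 - \left( u_{k+1} - \sum_{j=k+1}^{2t} r_{k+1,j} s_j \right)^2,
\]

and go to S.2.

S.6 Solution found. Let $\mathbf{s} \triangleq [s_1, s_2, \ldots, s_{2t}]^T$ be partitioned as $\mathbf{s} = \left[ \mathbf{s}_p^T, \mathbf{s}_q^T \right]^T$ where $\mathbf{s}_p, \mathbf{s}_q \in \mathbb{R}^{t \times 1}$. Set $\Omega_i = \Omega_i \bigcup \left\{ \sqrt{\frac{P_d}{10}} \left( 2\left( \mathbf{s}_p + \sqrt{-1}\, \mathbf{s}_q \right) - 5\left( \mathbf{e}_t + \sqrt{-1}\, \mathbf{e}_t \right) \right) \right\}$ and go to S.3.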
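The sphere search in steps S.1 to S.6 can be illustrated with a compact recursive variant. The sketch below is a simplification and not a line-by-line transcription of the algorithm: instead of computing the floor and ceiling bounds of S.2, it loops over the small alphabet $\{S_{min}, \ldots, S_{max}\}$ at each level and prunes any branch whose accumulated cost exceeds the remaining radius budget, which enumerates the same set of points. The function name, test matrix and radius are hypothetical.

```python
import numpy as np

def sphere_points(G, y, radius, s_min=1, s_max=4):
    """Return all integer vectors s in {s_min..s_max}^m with ||y - G s|| <= radius."""
    n, m = G.shape
    Q, R = np.linalg.qr(G, mode="complete")   # G = Q R with Q of size n x n
    z = Q.T @ y
    u, tail = z[:m], z[m:]                    # tail is the part outside range(G)
    R = R[:m, :]
    found, s = [], np.zeros(m)

    def search(k, budget):
        # Interference of the already fixed coordinates s[k+1:] on row k.
        centre = u[k] - R[k, k + 1:] @ s[k + 1:]
        for cand in range(s_min, s_max + 1):
            cost = (centre - R[k, k] * cand) ** 2
            if cost <= budget:                # prune branches leaving the sphere
                s[k] = cand
                if k == 0:
                    found.append(s.astype(int))
                else:
                    search(k - 1, budget - cost)

    search(m - 1, radius**2 - tail @ tail)
    return found

rng = np.random.default_rng(4)
G = rng.standard_normal((6, 4))               # hypothetical real-valued 2r x 2t matrix
s_true = rng.integers(1, 5, size=4)
y = G @ s_true + 0.1 * rng.standard_normal(6)
radius = 2.0
points = sphere_points(G, y, radius)
print(len(points), "of", 4**4, "candidates lie inside the sphere")
```

The depth-first search visits the last coordinate first, mirroring the initialization $k = 2t$ in S.1, and the radius budget plays the role of $r_k^2$ in S.5.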

Figure 7.1: MSE of Kalman based estimation of a time-varying wireless channel. [Plot "MSE Vs. SNR for CEBEM-EM Based Time Selective Channel Estimation": MSE vs. SNR (dB); curves: MSE-CEBEM, MSE Static, MSE-CEBEM-EM, MSE-CEBEM-EM-SD.]

We then sort the computed probabilities $p\left( \mathbf{y}_s(i) | \mathbf{x}_j; \mathcal{H}^{(k)} \right)$, for $\mathbf{x}_j \in \Omega_i$, and choose the largest $N_c$ of them and the corresponding symbol vectors. The rest of the probabilities for each $\mathbf{y}_s(i)$ are set to 0. These probabilities are used to compute the posterior probabilities in (7.8). Thus every iteration of the EM algorithm uses only $N_c \ll |\Omega|^t$ symbol vectors, significantly speeding up the likelihood computation.

7.4 Simulations

For our simulations, we consider a 4×4 MIMO system with frame length

$N_b = 240$ symbols. The information and pilot symbols are drawn from a QPSK symbol constellation. The wireless channel between the receive antenna $i$ and transmit antenna $j$ is given by the modified Jakes' process outlined in [90] as

\[
[\mathbf{H}(t)]_{ij} = \sqrt{\frac{2}{N_p}} \sum_{n=1}^{N_p} e^{j[\Psi_n]_{ij}} \cos\left( 2\pi f_d t \cos[\Upsilon_n]_{ij} + [\Phi]_{ij} \right),
\]

and the matrix $\Upsilon_n$ is given as

\[
\Upsilon_n = \frac{1}{4N_p} \left( (2\pi n - \pi)\, \mathbf{e}_r \mathbf{e}_t^T + \Theta \right),
\]

where the entries of $\Psi_n, \Theta, \Phi \in \mathbb{R}^{r \times t}$ are IID and uniformly distributed as $U[-\pi, \pi)$ and $f_d \triangleq f_c (v_m / c)$ is the Doppler frequency shift for a node in motion with velocity $v_m$. The coefficients of the channel matrix $\mathbf{H}(k)$ at the sampling instants $k$ are given as

\[
[\mathbf{H}(k)]_{ij} = \sqrt{\frac{2}{N_p}} \sum_{n=1}^{N_p} e^{j[\Psi_n]_{ij}} \cos\left( 2\pi \tilde{f}_d k \cos[\Upsilon_n]_{ij} + [\Phi]_{ij} \right), \tag{7.13}
\]

and $\tilde{f}_d \triangleq f_d / R_s$ is the normalized Doppler frequency. The normalized Doppler is a convenient handle on the nature of variation of a fast-fading process and for most wireless applications $\tilde{f}_d \le 0.01$. We employ average mean-squared error as the metric to evaluate the performance of the above estimators, which is given as

\[
\overline{\mathrm{MSE}} = \frac{1}{r t N_b} \sum_{k=1}^{N_b} \left\| \hat{\mathbf{H}}(k) - \mathbf{H}(k) \right\|_F^2. \tag{7.14}
\]
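The sampled channel model in (7.13) and the metric in (7.14) can be sketched as follows. This is a minimal illustration under stated assumptions: the number of sinusoids `n_p`, the function names and the parameter values are hypothetical, and the random phases are drawn independently for every antenna pair as described above.

```python
import numpy as np

def jakes_channel(r, t, n_b, fd_norm, n_p=8, rng=None):
    """Sample H(k), k = 1..n_b, from the sum-of-sinusoids model in (7.13).

    Returns an array of shape (n_b, r, t)."""
    rng = np.random.default_rng() if rng is None else rng
    psi = rng.uniform(-np.pi, np.pi, size=(n_p, r, t))
    theta = rng.uniform(-np.pi, np.pi, size=(r, t))
    phi = rng.uniform(-np.pi, np.pi, size=(r, t))
    n = np.arange(1, n_p + 1).reshape(n_p, 1, 1)
    upsilon = ((2 * np.pi * n - np.pi) + theta) / (4 * n_p)      # (n_p, r, t)
    k = np.arange(1, n_b + 1).reshape(n_b, 1, 1, 1)
    arg = 2 * np.pi * fd_norm * k * np.cos(upsilon) + phi        # (n_b, n_p, r, t)
    return np.sqrt(2.0 / n_p) * np.sum(np.exp(1j * psi) * np.cos(arg), axis=1)

def average_mse(H_hat, H):
    """Average MSE of (7.14) for arrays of shape (n_b, r, t)."""
    n_b, r, t = H.shape
    return np.sum(np.abs(H_hat - H) ** 2) / (r * t * n_b)

rng = np.random.default_rng(0)
H = jakes_channel(r=4, t=4, n_b=240, fd_norm=0.01, rng=rng)

# A static estimate (first sample held over the frame) as a crude baseline.
static_mse = average_mse(np.broadcast_to(H[0], H.shape), H)
print(H.shape, static_mse)
```

Holding the first channel sample over the whole frame gives a simple reference for how much the channel drifts over $N_b$ symbols at the chosen normalized Doppler.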

In fig.(7.1) we plot the MSE of CEBEM based SP estimation of the time-varying

4 × 4 MIMO channel employing the standard mean-estimator described in (7.4).

We also plot the MSE of EM based iterative estimation with sphere-decoding

described in section(7.3) for $r_e = \sqrt{30}$ and compare this performance with the

standard mean based SP estimator. It can be seen that the EM based iterative

scheme has a significantly lower MSE compared to the mean-estimator. It can

also be seen this low complexity iterative procedure has a slight performance loss

compared to the full complexity EM, which results from the sub-optimality of

the sphere decoding based EM procedure. This MSE difference worsens as the

SNR increases. The MSE of the the static SP mean estimator, which assumes an

invariant MIMO channel matrix H(k) = H, is plotted for comparison. It can be

seen that the static estimator results in poor MSE performance.


7.5 Conclusion

In this work we successfully demonstrated the applicability of CEBEM
based modeling in the context of time-selective MIMO channel estimation using
superimposed pilots (SP). An expectation-maximization (EM) based soft decoding
procedure has been suggested for iterative refinement of the MIMO channel estimate.
The computational complexity of the EM implementation has been substantially
reduced by adapting the sphere decoding algorithm for likelihood computation.

8 Conclusions

In this thesis we investigated several different schemes for bandwidth efficient
channel estimation in the context of MIMO systems. As the number of
receive/transmit antennas grows in a MIMO system, the number of parameters
to be estimated increases significantly. This, coupled with the low SNR regime
operation of MIMO systems, poses great challenges for channel estimation.
Transmission of pilot symbols to estimate all the channel coefficients constitutes a
significant bandwidth overhead in MIMO systems. The alternative to pilot based
estimation is blind estimation, in which the channel is estimated exclusively from
the statistical information present in the transmitted information symbols. Such
a scheme is bandwidth optimal as it avoids the transmission of pilots. However,
it can incur a significant computational complexity overhead, and such
algorithms frequently converge to local minima due to the non-convexity of
the cost functions.

We have demonstrated that semi-blind estimation, which employs both
pilots and blind information, significantly reduces the mean-squared error of
estimation of a MIMO channel. Further, by employing pilots, one can resolve the
indeterminacy that often arises in blind estimation. By achieving an enhanced
MSE performance through employment of blind information, the number of pilots
in the symbol frame can be reduced, which leads to greater bandwidth efficiency.
From a constrained CRB analysis, it has been shown that asymptotically such
a semi-blind scheme leads to at least a 3 dB decrease in the MSE of estimation.
Several constrained maximum-likelihood schemes have been suggested that
asymptotically achieve this constrained Cramer-Rao bound for semi-blind estimation.

We have also addressed the issue of the minimum number of pilot symbols
required for the identifiability of a MIMO frequency selective (FS) channel through
a Fisher information matrix (FIM) based regularity analysis. It is demonstrated that the
rank deficiency of the Gaussian FIM of an FS channel is at least $t^2$ and further, that
at least $t$ pilot symbols are necessary for the complete estimation of an FIR MIMO
channel. It is also shown that the semi-blind estimation bound asymptotically
converges to the complex constrained CRB, thus resulting in a significantly reduced
MSE of estimation for the semi-blind scheme.

The semi-blind estimation philosophy has also been demonstrated to yield

performance improvements in the context of maximum-ratio transmission (MRT)

based MIMO systems. MRT based MIMO relies on beamforming at the transmitter

and receiver to utilize the dominant eigenmode of transmission of a MIMO system

and hence has a low implementation complexity. A semi-blind scheme for MRT

based estimation which directly estimates the dominant left and right singular

vectors of the MIMO channel matrix has been demonstrated. The expressions for

MSE performance and the effective channel gain for the semi-blind scheme and

conventional pilot scheme have been derived using a matrix perturbation analysis.

It has been demonstrated that semi-blind estimation yields MSE and throughput

gains compared to the conventional pilot based scheme.

In a paradigm shift in channel estimation, pilots superimposed over data
symbols, or superimposed pilots (SP), offer another alternative for bandwidth efficient
channel estimation. Such a scheme avoids the exclusive transmission of pilots. A

semi-blind scheme has been demonstrated for SP based estimation, which improves

performance over the traditional mean-estimator. The throughput performance of

the SP system has been characterized through the development of a worst case

capacity analysis for systems with information noise correlation. Through this

analysis it has been demonstrated that the SP scheme can yield throughput gains

compared to a conventional system employing time-multiplexed pilots. Finally,


an application of SP based estimation has been demonstrated in the context of a

time-varying MIMO channel which is modeled using a complex exponential basis

expansion model.

In summary, the schemes proposed in this thesis represent progress
towards the design of bandwidth efficient algorithms for channel estimation in MIMO
systems. Future work can include designing novel implementation strategies to
deploy these schemes in real-time wireless devices. Such efficient algorithms can
significantly enhance the bandwidth efficiency of MIMO systems by
reducing the pilot overhead.

Bibliography

[1] S. Alamouti, “A simple transmit diversity technique for wireless communications,” IEEE Journal on Selected Areas in Communications, vol. 16, pp. 1451–1458, Oct. 1998.

[2] E. Telatar, “Capacity of multi-antenna Gaussian channels,” European Transactions on Telecommunications, vol. 10, pp. 585–596, Nov. 1999.

[3] J. Proakis, Digital Communications. New York, NY 10020: McGraw-Hill Higher Education, international ed., 2001.

[4] A. Paulraj, R. Nabar, and D. Gore, Introduction to Space-Time Wireless Communications. Cambridge University Press, first ed., 2003.

[5] J. Zheng, E. Duni, and B. D. Rao, “Analysis of multiple antenna systems with finite-rate feedback using high resolution quantization theory,” IEEE Transactions on Signal Processing, to appear.

[6] J. Zheng and B. D. Rao, “Capacity analysis of correlated multiple antenna systems with finite rate feedback,” Proceedings of the IEEE International Conference on Communications, Jun. 2006.

[7] Y. Isukapalli, R. Annavajjala, and B. D. Rao, “Performance analysis of transmit beamforming for MISO systems with imperfect feedback,” IEEE Transactions on Communications, in review.

[8] L. Tong and S. Perreau, “Multichannel blind identification: From subspace to maximum likelihood methods,” Proceedings of the IEEE, Oct. 1998.

[9] Z. Cheng and D. Dahlhaus, “Time versus frequency domain channel estimation for OFDM systems with antenna arrays,” Proc. of 6th International Conference on Signal Processing (ICSP’02), Beijing, China, vol. 2, pp. 1340–1343, Aug. 2002.

[10] J. H. Wilkinson, The Algebraic Eigenvalue Problem. Walton St., Oxford: Oxford University Press, first ed., 1965.

[11] T. M. Cover and J. A. Thomas, Elements of Information Theory. N.Y.: John Wiley & Sons, Inc., 1991.


[12] B. Hassibi and B. Hochwald, “How much training is needed in multiple-antenna wireless links,” IEEE Trans. on Info. Theory, Apr. 2003.

[13] M. Yan and B. D. Rao, “Performance of an array receiver with a Kalman channel predictor for fast Rayleigh flat fading environments,” IEEE Journal on Selected Areas in Communications-Wireless Series, vol. 19, pp. 1164–1172, Jun. 2001.

[14] T. S. Rappaport, Wireless Communications: Principles and Practice. Upper Saddle River, NJ 07458: Prentice Hall, second ed., 2002.

[15] S. M. Kay, Fundamentals of Statistical Signal Processing, Vol. I: Estimation Theory. Prentice Hall PTR, first ed., 1993.

[16] A. van den Bos, “A Cramer-Rao lower bound for complex parameters,” IEEE Transactions on Signal Processing, vol. 42, p. 2859, Oct. 1994.

[17] P. Stoica and B. C. Ng, “On the Cramer-Rao bound under parametric constraints,” IEEE Signal Processing Letters, vol. 5, pp. 177–179, Jul. 1998.

[18] J. Gorman and A. Hero, “Lower bounds on parametric estimators with constraints,” IEEE Transactions on Information Theory, vol. 36, pp. 1285–1301, Nov. 1990.

[19] T. Marzetta, “A simple derivation of the constrained multiple parameter Cramer-Rao bound,” IEEE Transactions on Signal Processing, vol. 41, pp. 2247–2249, June 1993.

[20] D. H. Brandwood, “A complex gradient operator and its application in adaptive array theory,” IEE Proc., vol. 130, pp. 11–16, Feb. 1983.

[21] R. Fischer, Precoding and Signal Shaping for Digital Transmission (Appendix-A). Wiley InterSciences, 2002.

[22] S. Zacks, The Theory of Statistical Inference. John Wiley and Sons, first ed., 1971.

[23] A. K. Jagannatham and B. D. Rao, “A semi-blind technique for MIMO channel matrix estimation,” in Proc. of IEEE Workshop on Signal Processing Advances in Wireless Communications (SPAWC 2003), # 582, (Rome, Italy), 2003.

[24] A. Medles, D. T. M. Slock, and E. D. Carvalho, “Linear prediction based semi-blind estimation of MIMO FIR channels,” Third IEEE SPAWC, Taoyuan, Taiwan, pp. 58–61.

[25] P. Comon, “Independent component analysis, a new concept?,” Signal Processing, vol. 36, no. 3, pp. 287–314, 1994.


[26] J. Cardoso, “Blind signal separation: Statistical principles,” Proceedings of the IEEE, vol. 86, pp. 2009–25, Oct. 1998.

[27] V. Zarzoso and A. Nandi, “Adaptive blind source separation for virtually any source probability density function,” IEEE Transactions on Signal Processing, vol. 48, Feb. 2000.

[28] E. Carvalho and D. Slock, “Asymptotic performance of ML methods for semi-blind channel estimation,” Thirty-First Asilomar Conference, vol. 2, pp. 1624–8, 1998.

[29] A. Medles and D. Slock, “Semiblind channel estimation for MIMO spatial multiplexing systems,” Vehicular Technology Conference, Fall 2001, 2001.

[30] D. Pal, “Fractionally spaced semi-blind equalization of wireless channels,” The Twenty-Sixth Asilomar Conference, vol. 2, pp. 642–645, 1992.

[31] D. Pal, “Fractionally spaced equalization of multipath channels: a semi-blind approach,” 1993 International Conference on Acoustics, Speech and Signal Processing, vol. 3, pp. 9–12, 1993.

[32] A. K. Jagannatham and B. D. Rao, “Cramer-Rao lower bound for constrained complex parameters,” IEEE Signal Processing Letters, vol. 11, pp. 875–878, Nov. 2004.

[33] A. Medles, D. Slock, and E. D. Carvalho, “Linear prediction based semi-blind estimation of MIMO FIR channels,” Third IEEE SPAWC, Taiwan, 2001.

[34] Y. Sung and L. T. et al., “Semiblind channel estimation for space-time coded WCDMA,” 36th Asilomar Conference on Sig., Syst., pp. 1637–1641, 2002.

[35] A. Taylor and W. Mann, Advanced Calculus. Wiley Text Books, 3rd ed.

[36] P. Vaidyanathan, Multirate Systems and Filter Banks. Englewood Cliffs, NJ, USA: Prentice Hall, 1993.

[37] G. H. Golub and C. F. Van Loan, Matrix Computations. Johns Hopkins Univ Pr, second ed., 1984.

[38] T. S. Ferguson, Mathematical Statistics: A Decision Theoretic Approach. Boston: Academic Press, 1967.

[39] B. Hassibi and B. Hochwald, “How much training is needed in multiple-antenna wireless links,” IEEE Transactions on Information Theory, vol. 49, pp. 951–964, Apr. 2003.

[40] T. Marzetta, “BLAST training: Estimating channel characteristics for high-capacity space-time wireless,” Proc. 37th Annual Allerton Conference on Communications, Control, and Computing, Monticello, IL, pp. 22–24, Sept. 1999.


[41] J. Heiskala and J. Terry, OFDM Wireless LANs: A Theoretical and Practical Guide. SAMS Publishing, 2002.

[42] Y. Hua, “Fast maximum-likelihood for blind identification of multiple FIR channels,” IEEE Transactions on Signal Processing, vol. 44, pp. 661–672, March 1996.

[43] P. Loubaton, E. Moulines, and P. Regalia, Signal Processing Advances in Wireless and Mobile Communications, Chapter 3: Subspace Method for Blind Identification and Deconvolution. Upper Saddle River, NJ 07458: Prentice Hall, first ed., 2001.

[44] E. Carvalho and D. Slock, “Blind and semi-blind FIR multichannel estimation: Global identifiability conditions,” IEEE Transactions on Signal Processing, vol. 50, pp. 1053–1064, April 2004.

[45] E. deCarvalho and D. Slock, Signal Processing Advances in Wireless and Mobile Communications, Chapter 7: Semi-Blind methods for FIR multi-channel estimation. Upper Saddle River, NJ 07458: Prentice Hall, first ed., 2001.

[46] D. Slock, “Blind joint equalization of multiple synchronous mobile users using oversampling and/or multiple antennas,” Conference Record of the Twenty-Eighth Asilomar Conference on Signals, Systems and Computers, 1994, vol. 2, pp. 1154–1158.

[47] T. Moore, B. Sadler, and R. Kozick, “Regularity and strict identifiability in MIMO systems,” IEEE Transactions on Signal Processing, vol. 50, pp. 1831–1842, August 2002.

[48] N. Ammar and Z. Ding, “On blind channel identifiability under space-time coded transmission,” Asilomar Conference on Signals, Systems and Computers, vol. 1, pp. 664–668, Nov. 2002.

[49] A. Swindlehurst and G. Leus, “Blind and semi-blind equalization for generalized space-time block codes,” IEEE Transactions on Signal Processing, vol. 50, pp. 2489–2498, Oct. 2002.

[50] J. Tugnait and B. Huang, “Multistep linear predictors-based blind identification and equalization of multiple-input multiple-output channels,” IEEE Transactions on Signal Processing, vol. 48, pp. 26–38, January 2000.

[51] Y. Inouye and R.-W. Liu, “A system-theoretic foundation for blind equalization of an FIR MIMO channel system,” IEEE Transactions on Circuits and Systems-I: Fundamental Theory and Applications, vol. 49, pp. 425–435, April 2002.

[52] S. Shahbazpanahi, A. Gershman, and J. H. Manton, “Closed form blind MIMO channel estimation for orthogonal space-time block codes,” IEEE Transactions on Signal Processing, vol. 53, pp. 4506–4517, Dec. 2005.


[53] S. Shahbazpanahi, A. Gershman, and G. Giannakis, “Semi-blind multi-user MIMO channel estimation based on Capon and MUSIC techniques,” Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, vol. 4, pp. 773–776, 2005.

[54] P. Stoica and A. Nehorai, “Performance study of conditional and unconditional direction of arrival estimation,” IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 38, pp. 1783–1795, Oct.

[55] A. K. Jagannatham and B. D. Rao, “Whitening rotation based semi-blind MIMO channel estimation,” IEEE Transactions on Signal Processing, vol. 54, pp. 861–869, Mar. 2006.

[56] N. Dhahir and A. Sayed, “The finite-length multi-input multi-output MMSE-DFE,” IEEE Transactions on Signal Processing, vol. 48, pp. 2921–2936, Oct. 2000.

[57] A. Medles and D. T. Slock, “Augmenting the training sequence part in semi-blind estimation for MIMO channels,” Proc. of the 37th Asilomar Conference on Signals, Systems and Computers, 2003.

[58] E. Carvalho and D. Slock, “Cramer-Rao bounds for semi-blind and training sequence based channel estimation,” First IEEE Workshop on Signal Processing Advances in Wireless Communications, pp. 129–32, 1997.

[59] K. Miller, Complex Stochastic Processes: An Introduction to Theory and Application. Addison-Wesley, first ed., 1974.

[60] M. Siyau, P. Nobles, and R. Ormondroyd, “Channel estimation for layered space-time systems,” Proc. Signal Processing Advances in Wireless Communications, pp. 482–486, Jun. 2003.

[61] T. K. Y. Lo, “Maximum ratio transmission,” IEEE Trans. Commun., vol. 47, pp. 1458–1461, Oct. 1999.

[62] T. W. Anderson, An Introduction to Multivariate Statistical Analysis, ch. 11. John Wiley & Sons, 1971.

[63] A. K. Jagannatham, C. R. Murthy, and B. D. Rao, “A semi-blind MIMO channel estimation scheme for MRT,” in Proc. ICASSP, vol. 3, (Philadelphia, PA, USA), pp. 585–588, Mar. 2005.

[64] M. Kaveh and A. J. Barabell, “The statistical performance of the MUSIC and the minimum-norm algorithms in resolving plane waves in noise,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 34, no. 2, pp. 331–341, 1986.


[65] A. K. Jagannatham and B. D. Rao, “Complex constrained CRB and its application to semi-blind MIMO and OFDM channel estimation,” in Proc. of the IEEE SAM Workshop, 2004, (Sitges, Barcelona).

[66] D. J. Love and R. W. Heath, Jr., “Equal gain transmission in multiple-input multiple-output wireless systems,” IEEE Trans. Commun., vol. 51, pp. 1102–1110, July 2003.

[67] D. Gu and C. Leung, “Performance analysis of transmit diversity scheme with imperfect channel estimation,” IEE Electronics Letters, vol. 39, pp. 402–403, Feb. 2003.

[68] T. Baykas and A. Yongacoglu, “Robustness of transmit diversity schemes with multiple receive antennas at imperfect channel state information,” in IEEE CCECE 2003, vol. 1, pp. 191–194, May 2003.

[69] F. Mazzenga, “Channel estimation and equalization for M-QAM transmission with a hidden pilot sequence,” IEEE Transactions on Broadcasting, vol. 46, pp. 170–176, Jun. 2000.

[70] B. Farhang-Boroujeny, “Pilot-based channel identification: Proposal for semi-blind identification of communication channels,” Electronics Letters, vol. 31, pp. 1044–1046, June 1995.

[71] G. Zhou, M. Viberg, and T. McKelvey, “Superimposed periodic pilots for blind channel estimation,” Proceedings of the 35th Annual Asilomar Conference on Signals, Systems and Computers, pp. 653–657, Nov. 2001.

[72] J. Tugnait and W. Luo, “On channel estimation using superimposed training and first-order statistics,” IEEE Communications Letters, vol. 7, pp. 413–415, Sep. 2003.

[73] A. Orozco-Lugo, M. Lara, and D. McLernon, “Channel estimation using implicit training,” IEEE Transactions on Signal Processing, vol. 52, pp. 240–254, Jan. 2004.

[74] M. Ghogho and A. Swami, “Estimation of doubly-selective channels in block transmissions using data-dependent training,” Proceedings of the European Signal Processing Conference (EUSIPCO), 2006.

[75] N. Chen and G. Zhou, “A superimposed periodic pilot scheme for semi-blind channel estimation of OFDM systems,” Proceedings of 2002 IEEE 10th Digital Signal Processing Workshop and the 2nd Signal Processing Education Workshop, pp. 362–365, Oct. 2002.

[76] P. Bohlin and M. Coldrey, “Performance evaluation of MIMO communication systems based on superimposed pilots,” IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 425–428, 2004.


[77] M. Coldrey, Ph.D. Thesis: Estimation and performance analysis for wireless multiple antenna communication channels. Goteborg, Sweden: Chalmers University of Technology, 2006.

[78] J. K. Tugnait, H. Shuangchi, and X. Meng, “On superimposed-training power allocation for time-varying channel estimation,” IEEE/SP 13th Workshop on Statistical Signal Processing, pp. 1330–1335, Jul. 2005.

[79] M. Biguesh and A. Gershman, “Training-based MIMO channel estimation: a study of estimator tradeoffs and optimal training signals,” IEEE Transactions on Signal Processing, vol. 54, Mar. 2006.

[80] M. Dong and L. Tong, “Optimal design and placement of pilot symbols for channel estimation,” IEEE Transactions on Signal Processing, vol. 50, December 2002.

[81] A. K. Jagannatham and B. D. Rao, “Semi-blind MIMO FIR channel estimation: Regularity and algorithms,” submitted to the IEEE Transactions on Signal Processing.

[82] T. K. Moon and W. C. Stirling, Mathematical Methods and Algorithms for Signal Processing. Prentice Hall, first ed., 2000.

[83] H. L. Van Trees, Optimum Array Processing: Part IV of Detection, Estimation and Modulation Theory. New York: Wiley Interscience, 2002.

[84] A. K. Jagannatham and B. D. Rao, “Superimposed pilots (SP) vs. conventional pilots (CP) based MIMO wireless channel estimation,” in preparation.

[85] S. He and J. K. Tugnait, “Direct equalization of multiuser doubly-selective channels based on superimposed training,” Proc. of European Signal Processing Conf. (EUSIPCO), Florence, Italy, Sep. 2006.

[86] M. Ghogho and A. Swami, “Estimation of doubly-selective channels in block transmissions using data-dependent superimposed training,” Proc. of European Signal Processing Conf. (EUSIPCO), Florence, Italy, Sep. 2006.

[87] G. Giannakis and C. Tepedelenlioglu, “Basis expansion models and diversity techniques for blind identification and equalization of time-varying channels.”

[88] A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum likelihood from incomplete data via the EM algorithm,” J. Royal Stats. Soc., vol. 39, pp. 1–38, 1977.

[89] B. Hassibi and H. Vikalo, “On the sphere decoding algorithm: Part I, the expected complexity,” submitted to IEEE Transactions on Signal Processing.

[90] “Simulation models with correct statistical properties for Rayleigh fading channels,” IEEE Transactions on Communications, vol. 51, pp. 920–928, Jun. 2003.
