
ROBUST AND OPTIMAL CONTROL

KEMIN ZHOU

with

JOHN C. DOYLE and KEITH GLOVER

PRENTICE HALL, Englewood Cliffs, New Jersey 07632


TO OUR PARENTS


Contents

Preface xiii

Notation and Symbols xvi

List of Acronyms xx

1 Introduction 1

1.1 Historical Perspective 1

1.2 How to Use This Book 4

1.3 Highlights of The Book 6

2 Linear Algebra 17

2.1 Linear Subspaces 17

2.2 Eigenvalues and Eigenvectors 20

2.3 Matrix Inversion Formulas 22

2.4 Matrix Calculus 24

2.5 Kronecker Product and Kronecker Sum 25

2.6 Invariant Subspaces 26

2.7 Vector Norms and Matrix Norms 28

2.8 Singular Value Decomposition 32

2.9 Generalized Inverses 36

2.10 Semidefinite Matrices 36

2.11 Matrix Dilation Problems* 38

2.12 Notes and References 44

3 Linear Dynamical Systems 45

3.1 Descriptions of Linear Dynamical Systems 45

3.2 Controllability and Observability 47

3.3 Kalman Canonical Decomposition 53

3.4 Pole Placement and Canonical Forms 58

3.5 Observers and Observer-Based Controllers 63

3.6 Operations on Systems 65

3.7 State Space Realizations for Transfer Matrices 68

3.8 Lyapunov Equations 71

3.9 Balanced Realizations 72

3.10 Hidden Modes and Pole-Zero Cancelation 78

3.11 Multivariable System Poles and Zeros 80

3.12 Notes and References 89

4 Performance Specifications 91

4.1 Normed Spaces 91

4.2 Hilbert Spaces 93

4.3 Hardy Spaces H2 and H∞ 97

4.4 Power and Spectral Signals 102

4.5 Induced System Gains 104

4.6 Computing L2 and H2 Norms 112

4.7 Computing L∞ and H∞ Norms 114

4.8 Notes and References 116

5 Stability and Performance of Feedback Systems 117

5.1 Feedback Structure 117

5.2 Well-Posedness of Feedback Loop 119

5.3 Internal Stability 121

5.4 Coprime Factorization over RH∞ 126

5.5 Feedback Properties 130

5.6 The Concept of Loop Shaping 134

5.7 Weighted H2 and H∞ Performance 137

5.8 Notes and References 141

6 Performance Limitations 143

6.1 Introduction 143

6.2 Integral Relations 145

6.3 Design Limitations and Sensitivity Bounds 149

6.4 Bode's Gain and Phase Relation 151

6.5 Notes and References 152


7 Model Reduction by Balanced Truncation 153

7.1 Model Reduction by Balanced Truncation 154

7.2 Frequency-Weighted Balanced Model Reduction 160

7.3 Relative and Multiplicative Model Reductions 161

7.4 Notes and References 167

8 Hankel Norm Approximation 169

8.1 Hankel Operator 170

8.2 All-pass Dilations 175

8.3 Optimal Hankel Norm Approximation 184

8.4 L∞ Bounds for Hankel Norm Approximation 188

8.5 Bounds for Balanced Truncation 192

8.6 Toeplitz Operators 194

8.7 Hankel and Toeplitz Operators on the Disk* 195

8.8 Nehari's Theorem* 200

8.9 Notes and References 206

9 Model Uncertainty and Robustness 207

9.1 Model Uncertainty 207

9.2 Small Gain Theorem 211

9.3 Stability under Stable Unstructured Uncertainties 214

9.4 Unstructured Robust Performance 222

9.5 Gain Margin and Phase Margin 229

9.6 Deficiency of Classical Control for MIMO Systems 232

9.7 Notes and References 237

10 Linear Fractional Transformation 239

10.1 Linear Fractional Transformations 239

10.2 Examples of LFTs 246

10.3 Basic Principle 256

10.4 Redheffer Star-Products 257

10.5 Notes and References 260

11 Structured Singular Value 261

11.1 General Framework for System Robustness 262

11.2 Structured Singular Value 266

11.3 Structured Robust Stability and Performance 276

11.4 Overview on μ Synthesis 287

11.5 Notes and References 290


12 Parameterization of Stabilizing Controllers 291

12.1 Existence of Stabilizing Controllers 292

12.2 Duality and Special Problems 295

12.3 Parameterization of All Stabilizing Controllers 301

12.4 Structure of Controller Parameterization 312

12.5 Closed-Loop Transfer Matrix 314

12.6 Youla Parameterization via Coprime Factorization* 315

12.7 Notes and References 318

13 Algebraic Riccati Equations 319

13.1 All Solutions of A Riccati Equation 320

13.2 Stabilizing Solution and Riccati Operator 325

13.3 Extreme Solutions and Matrix Inequalities 333

13.4 Spectral Factorizations 342

13.5 Positive Real Functions 353

13.6 Inner Functions 357

13.7 Inner-Outer Factorizations 358

13.8 Normalized Coprime Factorizations 362

13.9 Notes and References 364

14 H2 Optimal Control 365

14.1 Introduction to Regulator Problem 365

14.2 Standard LQR Problem 367

14.3 Extended LQR Problem 372

14.4 Guaranteed Stability Margins of LQR 373

14.5 Standard H2 Problem 375

14.6 Optimal Controlled System 378

14.7 H2 Control with Direct Disturbance Feedforward* 379

14.8 Special Problems 381

14.9 Separation Theory 388

14.10 Stability Margins of H2 Controllers 390

14.11 Notes and References 392

15 Linear Quadratic Optimization 393

15.1 Hankel Operators 393

15.2 Toeplitz Operators 395

15.3 Mixed Hankel-Toeplitz Operators 397

15.4 Mixed Hankel-Toeplitz Operators: The General Case* 399

15.5 Linear Quadratic Max-Min Problem 401

15.6 Notes and References 404


16 H∞ Control: Simple Case 405

16.1 Problem Formulation 405

16.2 Output Feedback H∞ Control 406

16.3 Motivation for Special Problems 412

16.4 Full Information Control 416

16.5 Full Control 423

16.6 Disturbance Feedforward 423

16.7 Output Estimation 425

16.8 Separation Theory 426

16.9 Optimality and Limiting Behavior 431

16.10 Controller Interpretations 436

16.11 An Optimal Controller 438

16.12 Notes and References 439

17 H∞ Control: General Case 441

17.1 General H∞ Solutions 442

17.2 Loop Shifting 445

17.3 Relaxing Assumptions 449

17.4 H2 and H∞ Integral Control 450

17.5 H∞ Filtering 454

17.6 Youla Parameterization Approach* 456

17.7 Connections 460

17.8 State Feedback and Differential Game 462

17.9 Parameterization of State Feedback H∞ Controllers 465

17.10 Notes and References 468

18 H∞ Loop Shaping 469

18.1 Robust Stabilization of Coprime Factors 469

18.2 Loop Shaping Using Normalized Coprime Stabilization 478

18.3 Theoretical Justification for H∞ Loop Shaping 481

18.4 Notes and References 487

19 Controller Order Reduction 489

19.1 Controller Reduction with Stability Criteria 490

19.2 H∞ Controller Reductions 497

19.3 Frequency-Weighted L∞ Norm Approximations 504

19.4 An Example 509

19.5 Notes and References 513


20 Structure Fixed Controllers 515

20.1 Lagrange Multiplier Method 515

20.2 Fixed Order Controllers 520

20.3 Notes and References 525

21 Discrete Time Control 527

21.1 Discrete Lyapunov Equations 527

21.2 Discrete Riccati Equations 528

21.3 Bounded Real Functions 537

21.4 Matrix Factorizations 542

21.5 Discrete Time H2 Control 548

21.6 Discrete Balanced Model Reduction 553

21.7 Model Reduction Using Coprime Factors 558

21.8 Notes and References 561

Bibliography 563

Index 581


Preface

This book is inspired by the recent development in the robust and H∞ control theory, particularly the state-space H∞ control theory developed in the paper by Doyle, Glover, Khargonekar, and Francis [1989] (known as the DGKF paper). We give a fairly comprehensive and step-by-step treatment of the state-space H∞ control theory in the style of DGKF. We also treat the robust control problems with unstructured and structured uncertainties. The linear fractional transformation (LFT) and the structured singular value (known as μ) are introduced as the unified tools for robust stability and performance analysis and synthesis. Chapter 1 contains a more detailed chapter-by-chapter review of the topics and results presented in this book.

We would like to thank Professor Bruce A. Francis at University of Toronto for his helpful comments and suggestions on early versions of the manuscript. As a matter of fact, this manuscript was inspired by his lectures given at Caltech in 1987 and his masterpiece, A Course in H∞ Control Theory. We are grateful to Professor André Tits at University of Maryland, who has made numerous helpful comments and suggestions that have greatly improved the quality of the manuscript. Professor Jakob Stoustrup, Professor Hans Henrik Niemann, and their students at The Technical University of Denmark have read various versions of this manuscript and have made many helpful comments and suggestions. We are grateful for their help. Special thanks go to Professor Andrew Packard at University of California-Berkeley for his help during the preparation of the early versions of this manuscript. We are also grateful to Professor Jie Chen at University of California-Riverside for providing material used in Chapter 6. We would also like to thank Professor Kang-Zhi Liu at Chiba University (Japan) and Professor Tongwen Chen at University of Calgary for their valuable comments and suggestions. In addition, we would like to thank G. Balas, C. Beck, D. S. Bernstein, G. Gu, W. Lu, J. Morris, M. Newlin, L. Qiu, H. P. Rotstein, and many other people for their comments and suggestions. The first author is especially grateful to Professor Pramod P. Khargonekar at The University of Michigan for introducing him to robust and H∞ control and to Professor Tryphon Georgiou at University of Minnesota for encouraging him to complete this work.

Kemin Zhou


John C. Doyle

Keith Glover


Notation and Symbols

R          field of real numbers
C          field of complex numbers
F          field, either R or C
R+         nonnegative real numbers

C− and C̄−  open and closed left-half plane
C+ and C̄+  open and closed right-half plane
C0, jR     imaginary axis
D          unit disk
D̄          closed unit disk
∂D         unit circle

∈          belong to
⊂          subset
∪          union
∩          intersection

□          end of proof
◇          end of example
~          end of remark

:=         defined as
≳          asymptotically greater than
≲          asymptotically less than
≫          much greater than
≪          much less than

ᾱ          complex conjugate of α ∈ C
|α|        absolute value of α ∈ C
Re(α)      real part of α ∈ C

δ(t)       unit impulse
δij        Kronecker delta function, δii = 1 and δij = 0 if i ≠ j
1+(t)      unit step function


In                 n × n identity matrix
[aij]              a matrix with aij as its i-th row and j-th column element
diag(a1, ..., an)  an n × n diagonal matrix with ai as its i-th diagonal element
A^T                transpose of A
A*                 adjoint operator of A or complex conjugate transpose of A
A^{-1}             inverse of A
A^+                pseudo inverse of A
A^{-*}             shorthand for (A^{-1})*

det(A)             determinant of A
Trace(A)           trace of A
λ(A)               eigenvalue of A
ρ(A)               spectral radius of A
σ(A)               the set of spectrum of A
σ̄(A)               largest singular value of A
σ_(A)              smallest singular value of A
σi(A)              i-th singular value of A
κ(A)               condition number of A
‖A‖                spectral norm of A: ‖A‖ = σ̄(A)
Im(A), R(A)        image (or range) space of A
Ker(A), N(A)       kernel (or null) space of A
X−(A)              stable invariant subspace of A
X+(A)              antistable invariant subspace of A

Ric(H)             the stabilizing solution of an ARE
g ∗ f              convolution of g and f
⊗                  Kronecker product
⊕                  direct sum or Kronecker sum
∠                  angle
⟨·, ·⟩             inner product
x ⊥ y              orthogonal, ⟨x, y⟩ = 0
D⊥                 orthogonal complement of D, i.e., [D D⊥] or [D; D⊥] is unitary
S⊥                 orthogonal complement of subspace S, e.g., H2⊥

L2(−∞, ∞)          time domain Lebesgue space
L2[0, ∞)           subspace of L2(−∞, ∞)
L2(−∞, 0]          subspace of L2(−∞, ∞)
L2+                shorthand for L2[0, ∞)
L2−                shorthand for L2(−∞, 0]
l2+                shorthand for l2[0, ∞)
l2−                shorthand for l2(−∞, 0)
L2(jR)             square integrable functions on C0 including at ∞
L2(∂D)             square integrable functions on ∂D
H2(jR)             subspace of L2(jR) with analytic extension to the rhp


H2(∂D)             subspace of L2(∂D) with analytic extension to the inside of ∂D
H2⊥(jR)            subspace of L2(jR) with analytic extension to the lhp
H2⊥(∂D)            subspace of L2(∂D) with analytic extension to the outside of ∂D
L∞(jR)             functions bounded on Re(s) = 0 including at ∞
L∞(∂D)             functions bounded on ∂D
H∞(jR)             the set of L∞(jR) functions analytic in Re(s) > 0
H∞(∂D)             the set of L∞(∂D) functions analytic in |z| < 1
H∞−(jR)            the set of L∞(jR) functions analytic in Re(s) < 0
H∞−(∂D)            the set of L∞(∂D) functions analytic in |z| > 1

prefix B or B̄      closed unit ball, e.g. B̄H∞ and B̄Δ
prefix Bo          open unit ball
prefix R           real rational, e.g., RH∞ and RH2, etc.

R[s]               polynomial ring
Rp(s)              rational proper transfer matrices

G~(s)              shorthand for G^T(−s) (continuous time)
G~(z)              shorthand for G^T(z^{−1}) (discrete time)
[A B; C D]         shorthand for state space realization C(sI − A)^{−1}B + D or C(zI − A)^{−1}B + D

Fl(M, Q)           lower LFT
Fu(M, Q)           upper LFT
S(M, N)            star product


List of Acronyms

ARE         algebraic Riccati equation
BR          bounded real
CIF         complementary inner factor
DF          disturbance feedforward
FC          full control
FDLTI       finite dimensional linear time invariant
FI          full information
HF          high frequency
iff         if and only if
lcf         left coprime factorization
LF          low frequency
LFT         linear fractional transformation
lhp or LHP  left-half plane Re(s) < 0
LQG         linear quadratic Gaussian
LQR         linear quadratic regulator
LTI         linear time invariant
LTR         loop transfer recovery
MIMO        multi-input multi-output
nlcf        normalized left coprime factorization
NP          nominal performance
nrcf        normalized right coprime factorization
NS          nominal stability
OE          output estimation
OF          output feedback
OI          output injection
rcf         right coprime factorization
rhp or RHP  right-half plane Re(s) > 0
RP          robust performance
RS          robust stability
SF          state feedback
SISO        single-input single-output
SSV         structured singular value (μ)
SVD         singular value decomposition


1 Introduction

1.1 Historical Perspective

This book gives a comprehensive treatment of optimal H2 and H∞ control theory and an introduction to the more general subject of robust control. Since the central subject of this book is state-space H∞ optimal control, in contrast to the approach adopted in the famous book by Francis [1987], A Course in H∞ Control Theory, it may be helpful to provide some historical perspective on the state-space H∞ control theory to be presented in this book. This section is not intended as a review of the literature in H∞ theory or robust control, but rather only an attempt to outline some of the work that most closely touches on our approach to state-space H∞. Hopefully our lack of written historical material will be somewhat made up for by the pictorial history of control shown in Figure 1.1. Here we see how the practical but classical methods yielded to the more sophisticated modern theory. Robust control sought to blend the best of both worlds. The strange creature that resulted is the main topic of this book.

The H∞ optimal control theory was originally formulated by Zames [1981] in an input-output setting. Most solution techniques available at that time involved analytic functions (Nevanlinna-Pick interpolation) or operator-theoretic methods [Sarason, 1967; Adamjan et al., 1978; Ball and Helton, 1983]. Indeed, H∞ theory seemed to many to signal the beginning of the end for the state-space methods which had dominated control for the previous 20 years. Unfortunately, the standard frequency-domain approaches to H∞ started running into significant obstacles in dealing with multi-input multi-output (MIMO) systems, both mathematically and computationally, much as the H2 (or LQG) theory of the 1950's had.


Figure 1.1: A picture history of control

Not surprisingly, the first solution to a general rational MIMO H∞ optimal control problem, presented in Doyle [1984], relied heavily on state-space methods, although more as a computational tool than in any essential way. The steps in this solution were as follows: parameterize all internally-stabilizing controllers via [Youla et al., 1976]; obtain realizations of the closed-loop transfer matrix; convert the resulting model-matching problem into the equivalent 2 × 2-block general distance or best approximation problem involving mixed Hankel-Toeplitz operators; reduce to the Nehari problem (Hankel only); solve the Nehari problem by the procedure of Glover [1984]. Both [Francis, 1987] and [Francis and Doyle, 1987] give expositions of this approach, which will be referred to as the "1984" approach.

In a mathematical sense, the 1984 procedure "solved" the general rational H∞ optimal control problem, and much of the subsequent work in H∞ control theory focused on the 2 × 2-block problems, either in the model-matching or general distance forms. Unfortunately, the associated complexity of computation was substantial, involving several Riccati equations of increasing dimension, and formulae for the resulting controllers tended to be very complicated and have high state dimension. Encouragement came


from Limebeer and Hung [1987] and Limebeer and Halikias [1988], who showed, for problems transformable to 2 × 1-block problems, that a subsequent minimal realization of the controller has state dimension no greater than that of the generalized plant G. This suggested the likely existence of similarly low dimension optimal controllers in the general 2 × 2 case.

Additional progress on the 2 × 2-block problems came from Ball and Cohen [1987], who gave a state-space solution involving 3 Riccati equations. Jonckheere and Juang [1987] showed a connection between the 2 × 1-block problem and previous work by Jonckheere and Silverman [1978] on linear-quadratic control. Foias and Tannenbaum [1988] developed an interesting class of operators called skew Toeplitz to study the 2 × 2-block problem. Other approaches have been derived by Hung [1989] using an interpolation theory approach, Kwakernaak [1986] using a polynomial approach, and Kimura [1988] using a method based on conjugation.

The simple state-space H∞ controller formulae to be presented in this book were first derived in Glover and Doyle [1988] with the 1984 approach, but using a new 2 × 2-block solution, together with a cumbersome back substitution. The very simplicity of the new formulae and their similarity with the H2 ones suggested a more direct approach.

Independent encouragement for a simpler approach to the H∞ problem came from papers by Petersen [1987], Khargonekar, Petersen, and Zhou [1990], Zhou and Khargonekar [1988], and Khargonekar, Petersen, and Rotea [1988]. They showed that for the state-feedback H∞ problem one can choose a constant gain as a (sub)optimal controller. In addition, a formula for the state-feedback gain matrix was given in terms of an algebraic Riccati equation. Also, these papers established connections between H∞-optimal control, quadratic stabilization, and linear-quadratic differential games.

The landmark breakthrough came in the DGKF paper (Doyle, Glover, Khargonekar, and Francis [1989]). In addition to providing controller formulae that are simple and expressed in terms of plant data as in Glover and Doyle [1988], the methods in that paper are a fundamental departure from the 1984 approach. In particular, the Youla parameterization and the resulting 2 × 2-block model-matching problem of the 1984 solution are avoided entirely, replaced by a more purely state-space approach involving observer-based compensators, a pair of 2 × 1 block problems, and a separation argument. The operator theory still plays a central role (as does Redheffer's work [Redheffer, 1960] on linear fractional transformations), but its use is more straightforward. The key to this was a return to simple and familiar state-space tools, in the style of Willems [1971], such as completing the square, and the connection between frequency domain inequalities (e.g., ‖G‖∞ < 1), Riccati equations, and spectral factorizations. This book in some sense can be regarded as an expansion of the DGKF paper.

The state-space theory of H∞ can be carried much further, by generalizing time-invariant to time-varying, infinite horizon to finite horizon, and finite dimensional to infinite dimensional. A flourish of activity has begun on these problems since the publication of the DGKF paper and numerous results have been published in the literature. Not surprisingly, many results in the DGKF paper generalize, mutatis mutandis, to these cases, which are beyond the scope of this book.


1.2 How to Use This Book

This book is intended to be used either as a graduate textbook or as a reference for control engineers. With the second objective in mind, we have tried to balance the breadth and the depth of the material covered in the book. In particular, some chapters have been written sufficiently self-contained so that one may jump to those special topics without going through all the preceding chapters, for example, Chapter 13 on algebraic Riccati equations. Some other topics may only require some basic linear system theory; for instance, many readers may find that it is not difficult to go directly to Chapters 9–11. In some cases, we have tried to collect the most frequently used formulas and results in one place for convenience of reference, although they may not have any direct connection with the main results presented in the book. For example, readers may find the matrix formulas collected in Chapter 2 on linear algebra convenient in their research. On the other hand, if the book is used as a textbook, it may be advisable to skip topics like Chapter 2 in the regular lectures and leave them for students to read. It is obvious that only some selected topics in this book can be covered in a one or two semester course. The specific choice of the topics depends on the time allotted for the course and the preference of the instructor. The diagram in Figure 1.2 shows roughly the relations among the chapters and should give the users some idea for the selection of the topics. For example, the diagram shows that the only prerequisite for Chapters 7 and 8 is Section 3.9 of Chapter 3 and, therefore, these two chapters alone may be used as a short course on model reductions. Similarly, one only needs the knowledge of Sections 13.2 and 13.6 of Chapter 13 to understand Chapter 14. Hence one may only cover those related sections of Chapter 13 if time is a factor. The book is separated roughly into the following subgroups:

• Basic Linear System Theory: Chapters 2–3.

• Stability and Performance: Chapters 4–6.

• Model Reduction: Chapters 7–8.

• Robustness: Chapters 9–11.

• H2 and H∞ Control: Chapters 12–19.

• Lagrange Method: Chapter 20.

• Discrete Time Systems: Chapter 21.

In view of the above classification, one possible choice for a one-semester course on robust control would cover Chapters 4–5 and 9–11, or 4–11, and a one-semester advanced course on H2 and H∞ control would cover (parts of) Chapters 12–19. Another possible choice for a one-semester course on H∞ control may include Chapter 4, parts of Chapter 5 (5.1–5.3, 5.5, 5.7), Chapter 10, Chapter 12 (except Section 12.6), parts of Chapter 13 (13.2, 13.4, 13.6), Chapter 15, and Chapter 16. Although Chapters 7–8 are very much independent of other topics and can, in principle, be studied at any


[Figure 1.2: Relations among the chapters]

stage with the background of Section 3.9, they may serve as an introduction to sources of model uncertainties and hence to robustness problems.

Robust Control   H∞ Control           Advanced H∞ Control   Model & Controller Reductions
4                4                    12                    3.9
5                5.1–5.3, 5.5, 5.7    13.2, 13.4, 13.6      7
6*               10                   14                    8
7*               12                   15                    5.4, 5.7
8*               13.2, 13.4, 13.6     16                    10.1
9                15                   17*                   16.1, 16.2
10               16                   18*                   17.1
11                                    19*                   19

Table 1.1: Possible choices for a one-semester course (* chapters may be omitted)

Table 1.1 lists several possible choices of topics for a one-semester course. A course on model and controller reductions may only include the concept of H∞ control and the H∞ controller formulas, with the detailed proofs omitted, as suggested in the above table.

1.3 Highlights of The Book

The key results in each chapter are highlighted below. Note that some of the statements in this section are not precise; they hold under certain assumptions that are not explicitly stated. Readers should consult the corresponding chapters for the exact statements and conditions.

Chapter 2 reviews some basic linear algebra facts and treats a special class of matrix dilation problems. In particular, we show

$$\min_X \left\| \begin{bmatrix} X & B \\ C & A \end{bmatrix} \right\| = \max\left\{ \left\| \begin{bmatrix} C & A \end{bmatrix} \right\|,\; \left\| \begin{bmatrix} B \\ A \end{bmatrix} \right\| \right\}$$

and characterize all optimal (suboptimal) X.
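The dilation identity above (Parrott's theorem) is easy to check numerically in the scalar case. The sketch below uses hypothetical scalar data a = 0.5, b = 1, c = 2 (not from the book), brute-forces the free entry X over a grid, and compares the achieved minimum with the max-of-row-and-column lower bound:

```python
import numpy as np

def spectral_norm(M):
    """Largest singular value -- the norm used in the dilation result."""
    return np.linalg.svd(M, compute_uv=False)[0]

# Hypothetical scalar data for the partially specified matrix [[X, B], [C, A]].
a, b, c = 0.5, 1.0, 2.0

# Lower bound: the norms of the known row [C A] and the known column [B; A].
lower = max(spectral_norm(np.array([[c, a]])),
            spectral_norm(np.array([[b], [a]])))

# Brute-force the free entry X; by the identity the minimum meets the bound.
xs = np.linspace(-3.0, 3.0, 6001)
best = min(spectral_norm(np.array([[x, b], [c, a]])) for x in xs)

print(f"lower bound = {lower:.4f}, minimized norm = {best:.4f}")
```

For this data the minimum is attained at X = -0.25, where the dilated matrix's norm drops exactly to the lower bound.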

Chapter 3 reviews some system theoretical concepts: controllability, observability, stabilizability, detectability, pole placement, observer theory, system poles and zeros, and state space realizations. Particularly, the balanced state space realizations are studied in some detail. We show that for a given stable transfer function $G(s)$ there is a state space realization
\[
G(s) = \left[\begin{array}{c|c} A & B \\ \hline C & D \end{array}\right]
\]
such that the controllability Gramian $P$ and the observability Gramian $Q$ defined below are equal and diagonal, $P = Q = \Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_n)$, where
\[
AP + PA^* + BB^* = 0
\]
\[
A^*Q + QA + C^*C = 0.
\]
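As a numerical illustration (a sketch with an arbitrary example system, not data from the book), the two Lyapunov equations above can be solved with SciPy and the realization balanced by the standard Cholesky/SVD state transformation:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, cholesky, svd

# Arbitrary stable example system (illustrative data, not from the book).
A = np.array([[-1.0, 0.5], [0.0, -2.0]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 2.0]])

# Controllability Gramian P and observability Gramian Q:
#   A P + P A* + B B* = 0,   A* Q + Q A + C* C = 0.
P = solve_continuous_lyapunov(A, -B @ B.T)
Q = solve_continuous_lyapunov(A.T, -C.T @ C)

# Cholesky/SVD balancing: choose T so both Gramians become diag(sigma_i).
R = cholesky(P, lower=True)          # P = R R*
U, s, _ = svd(R.T @ Q @ R)           # R* Q R = U diag(s) U*
Sigma = np.sqrt(s)                   # Hankel singular values sigma_i
T = R @ U @ np.diag(Sigma ** -0.5)
Ti = np.linalg.inv(T)

Ab, Bb, Cb = Ti @ A @ T, Ti @ B, C @ T
Pb = solve_continuous_lyapunov(Ab, -Bb @ Bb.T)
Qb = solve_continuous_lyapunov(Ab.T, -Cb.T @ Cb)
print(np.diag(Pb), np.diag(Qb), Sigma)
```

After the change of coordinates $x \to T^{-1}x$, both Gramians equal $\mathrm{diag}(\sigma_1, \ldots, \sigma_n)$, which is the balanced form used throughout the model reduction chapters.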

Chapter 4 defines several norms for signals and introduces the $\mathcal{H}_2$ spaces and the $\mathcal{H}_\infty$ spaces. The input/output gains of a stable linear system under various input signals are characterized. We show that the $H_2$ and $H_\infty$ norms come out naturally as measures of the worst possible performance for many classes of input signals. For example, let
\[
G(s) = \left[\begin{array}{c|c} A & B \\ \hline C & 0 \end{array}\right] \in \mathcal{RH}_\infty, \qquad g(t) = Ce^{At}B.
\]
Then
\[
\|G\|_\infty = \sup_u \frac{\|g * u\|_2}{\|u\|_2}
\quad \text{and} \quad
\sigma_1 \le \|G\|_\infty \le \int_0^\infty \|g(t)\|\, dt \le 2\sum_{i=1}^n \sigma_i.
\]
Some state space methods of computing real rational $H_2$ and $H_\infty$ transfer matrix norms are also presented:
\[
\|G\|_2^2 = \mathrm{trace}(B^*QB) = \mathrm{trace}(CPC^*)
\]
and
\[
\|G\|_\infty = \max\{\gamma : H \text{ has an eigenvalue on the imaginary axis}\}
\]
where
\[
H = \begin{bmatrix} A & BB^*/\gamma^2 \\ -C^*C & -A^* \end{bmatrix}.
\]
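These state space computations can be sketched numerically (arbitrary example data with $D = 0$, not a routine from the book): the $H_2$ norm follows from a Gramian, and the $H_\infty$ norm from bisecting on $\gamma$ until the Hamiltonian $H$ loses its imaginary-axis eigenvalues:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Arbitrary stable example with D = 0 (illustrative data, not from the book).
A = np.array([[-1.0, 2.0], [0.0, -3.0]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 0.0]])

# H2 norm from the observability Gramian: ||G||_2^2 = trace(B* Q B).
Q = solve_continuous_lyapunov(A.T, -C.T @ C)
h2 = np.sqrt(np.trace(B.T @ Q @ B))

def has_imag_eig(gamma, tol=1e-8):
    """Does the Hamiltonian H(gamma) have an imaginary-axis eigenvalue?"""
    H = np.block([[A, (B @ B.T) / gamma**2], [-C.T @ C, -A.T]])
    return bool(np.any(np.abs(np.linalg.eigvals(H).real) < tol))

# Bisection: gamma exceeds ||G||_inf exactly when H(gamma) has no
# imaginary-axis eigenvalues, so shrink the bracket accordingly.
lo, hi = 1e-6, 1e3
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if has_imag_eig(mid) else (lo, mid)
hinf = hi

# Cross-check against a dense frequency grid of |G(jw)|.
ws = np.logspace(-3, 3, 2000)
gains = [abs((C @ np.linalg.solve(1j * w * np.eye(2) - A, B))[0, 0]) for w in ws]
print(h2, hinf, max(gains))
```

For this example $G(s) = (s+5)/((s+1)(s+3))$, so the bisection should converge to $\|G\|_\infty = |G(0)| = 5/3$.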

Chapter 5 introduces the feedback structure and discusses its stability and performance properties.

(Figure: the standard feedback loop, with plant $P$ and controller $K$ connected in feedback, external signals $w_1, w_2$ injected at the two summing junctions, and $e_1, e_2$ the resulting internal signals.)

We show that the above closed-loop system is internally stable if and only if
\[
\begin{bmatrix} I & -K \\ -P & I \end{bmatrix}^{-1}
= \begin{bmatrix} I + K(I-PK)^{-1}P & K(I-PK)^{-1} \\ (I-PK)^{-1}P & (I-PK)^{-1} \end{bmatrix}
\in \mathcal{RH}_\infty.
\]

Alternative characterizations of internal stability using coprime factorizations are also presented.

Chapter 6 introduces some multivariable versions of Bode's sensitivity integral relations and the Poisson integral formula. The sensitivity integral relations are used to study the design limitations imposed by bandwidth constraints and open-loop unstable poles, while the Poisson integral formula is used to study the design constraints imposed by non-minimum phase zeros. For example, let $S(s)$ be a sensitivity function, let $p_i$ be the right half plane poles of the open-loop system, and let $\eta_i$ be the corresponding pole directions. Then we show that
\[
\int_0^\infty \ln \bar\sigma\big(S(j\omega)\big)\, d\omega
= \pi\,\sigma_{\max}\!\left(\sum_i (\mathrm{Re}\, p_i)\,\eta_i\eta_i^*\right) + H_\infty,
\qquad H_\infty \ge 0.
\]
This equality shows that the design limitations in multivariable systems depend on the directionality properties of the sensitivity function as well as those of the poles (and zeros), in addition to the dependence upon pole (and zero) locations which is known in single-input single-output systems.

Chapter 7 considers the problem of reducing the order of a linear multivariable dynamical system using the balanced truncation method. Suppose
\[
G(s) = \left[\begin{array}{cc|c} A_{11} & A_{12} & B_1 \\ A_{21} & A_{22} & B_2 \\ \hline C_1 & C_2 & D \end{array}\right] \in \mathcal{RH}_\infty
\]
is a balanced realization with controllability and observability Gramians $P = Q = \Sigma = \mathrm{diag}(\Sigma_1, \Sigma_2)$,
\[
\Sigma_1 = \mathrm{diag}(\sigma_1 I_{s_1}, \sigma_2 I_{s_2}, \ldots, \sigma_r I_{s_r})
\]
\[
\Sigma_2 = \mathrm{diag}(\sigma_{r+1} I_{s_{r+1}}, \sigma_{r+2} I_{s_{r+2}}, \ldots, \sigma_N I_{s_N}).
\]
Then the truncated system
\[
G_r(s) = \left[\begin{array}{c|c} A_{11} & B_1 \\ \hline C_1 & D \end{array}\right]
\]
is stable and satisfies an additive error bound:
\[
\|G(s) - G_r(s)\|_\infty \le 2 \sum_{i=r+1}^N \sigma_i.
\]
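The additive bound can be sketched numerically (a random stable example, balanced with the Cholesky/SVD construction; illustrative data only):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, cholesky, svd

# Random stable example system (illustrative; any minimal realization works).
rng = np.random.default_rng(0)
n, r = 4, 2
A = rng.standard_normal((n, n))
A -= (np.abs(np.linalg.eigvals(A).real).max() + 1.0) * np.eye(n)  # force stability
B = rng.standard_normal((n, 1))
C = rng.standard_normal((1, n))

# Balance the realization (Cholesky/SVD construction).
P = solve_continuous_lyapunov(A, -B @ B.T)
Q = solve_continuous_lyapunov(A.T, -C.T @ C)
R = cholesky(P, lower=True)
U, s, _ = svd(R.T @ Q @ R)
sigma = np.sqrt(s)                       # Hankel singular values, descending
T = R @ U @ np.diag(sigma ** -0.5)
Ti = np.linalg.inv(T)
Ab, Bb, Cb = Ti @ A @ T, Ti @ B, C @ T

# Truncate to the first r balanced states; the tail gives the bound.
Ar, Br, Cr = Ab[:r, :r], Bb[:r], Cb[:, :r]
bound = 2 * sigma[r:].sum()

def tf(Am, Bm, Cm, w):
    return (Cm @ np.linalg.solve(1j * w * np.eye(Am.shape[0]) - Am, Bm))[0, 0]

ws = np.logspace(-3, 3, 1500)
err = max(abs(tf(Ab, Bb, Cb, w) - tf(Ar, Br, Cr, w)) for w in ws)
print(err, bound)  # sampled error vs. the additive bound 2*(sigma_{r+1}+...)
```

The frequency-gridded error is a lower estimate of $\|G - G_r\|_\infty$, so it should sit below $2\sum_{i>r}\sigma_i$.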

On the other hand, if $G^{-1} \in \mathcal{RH}_\infty$, and $P$ and $Q$ satisfy
\[
PA^* + AP + BB^* = 0
\]
\[
Q(A - BD^{-1}C) + (A - BD^{-1}C)^*Q + C^*(D^{-1})^*D^{-1}C = 0
\]
such that $P = Q = \mathrm{diag}(\Sigma_1, \Sigma_2)$ with $G$ partitioned compatibly as before, then
\[
G_r(s) = \left[\begin{array}{c|c} A_{11} & B_1 \\ \hline C_1 & D \end{array}\right]
\]
is stable and minimum phase, and satisfies respectively the following relative and multiplicative error bounds:
\[
\left\| G^{-1}(G - G_r) \right\|_\infty \le \prod_{i=r+1}^N \left( 1 + 2\sigma_i\left(\sqrt{1 + \sigma_i^2} + \sigma_i\right) \right) - 1
\]
\[
\left\| G_r^{-1}(G - G_r) \right\|_\infty \le \prod_{i=r+1}^N \left( 1 + 2\sigma_i\left(\sqrt{1 + \sigma_i^2} + \sigma_i\right) \right) - 1.
\]


Chapter 8 deals with the optimal Hankel norm approximation and its applications in $L_\infty$ norm model reduction. We show that for a given $G(s)$ of McMillan degree $n$ there is a $\hat{G}(s)$ of McMillan degree $r < n$ such that
\[
\left\| G(s) - \hat{G}(s) \right\|_H = \inf_{\deg \hat{G} \le r} \left\| G(s) - \hat{G}(s) \right\|_H = \sigma_{r+1}.
\]
Moreover, there is a constant matrix $D_0$ such that
\[
\left\| G(s) - \hat{G}(s) - D_0 \right\|_\infty \le \sum_{i=r+1}^N \sigma_i.
\]
The well-known Nehari's theorem is also shown:
\[
\inf_{Q \in \mathcal{RH}_\infty^-} \|G - Q\|_\infty = \|G\|_H = \sigma_1.
\]
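The Hankel norm itself is directly computable from the Gramians, since $\|G\|_H = \sigma_1 = \sqrt{\lambda_{\max}(PQ)}$; a sketch with arbitrary example data:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# The Hankel norm of a stable G is sigma_1 = sqrt(lambda_max(PQ)),
# computable from the two Gramians (arbitrary example data, not from the book).
A = np.array([[-1.0, 0.0], [1.0, -2.0]])
B = np.array([[1.0], [0.0]])
C = np.array([[0.0, 1.0]])

P = solve_continuous_lyapunov(A, -B @ B.T)
Q = solve_continuous_lyapunov(A.T, -C.T @ C)
hankel_svs = np.sort(np.sqrt(np.linalg.eigvals(P @ Q).real))[::-1]
hankel_norm = hankel_svs[0]
print(hankel_svs)
```

The computed $\sigma_1$ should respect the inequalities quoted in Chapter 4: $\sigma_1 \le \|G\|_\infty \le 2\sum_i \sigma_i$.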

Chapter 9 derives robust stability tests for systems under various modeling assumptions through the use of a small gain theorem. In particular, we show that an uncertain system described below with an unstructured uncertainty $\Delta \in \mathcal{RH}_\infty$ with $\|\Delta\|_\infty < 1$ is robustly stable if and only if the transfer function from $w$ to $z$ has $H_\infty$ norm no greater than 1.

(Figure: an uncertainty block $\Delta$ in feedback with the nominal system; $w$ and $z$ are the input and output of the channel seen by $\Delta$.)

Chapter 10 introduces the linear fractional transformation (LFT). We show that many control problems can be formulated and treated in the LFT framework. In particular, we show that every analysis problem can be put in an LFT form with some structured $\Delta(s)$ and some interconnection matrix $M(s)$, and every synthesis problem can be put in an LFT form with a generalized plant $G(s)$ and a controller $K(s)$ to be designed.

(Figures: $\Delta$ connected in feedback around $M$ with signals $w, z$; $K$ connected in feedback around $G$ with signals $w, z, y, u$.)


Chapter 11 considers robust stability and performance for systems with multiple sources of uncertainty. We show that an uncertain system is robustly stable for all $\Delta_i \in \mathcal{RH}_\infty$ with $\|\Delta_i\|_\infty < 1$ if and only if the structured singular value ($\mu$) of the corresponding interconnection model is no greater than 1.

(Figure: the nominal system with multiple uncertainty blocks $\Delta_1, \Delta_2, \Delta_3, \Delta_4$ connected around it.)

Chapter 12 characterizes all controllers that stabilize a given dynamical system $G(s)$ using the state space approach. The construction of the controller parameterization is done via separation theory and a sequence of special problems: the so-called full information (FI) problem, disturbance feedforward (DF) problem, full control (FC) problem, and output estimation (OE) problem. The relations among these special problems are established.

(Figure: FI and DF are equivalent, FC and OE are equivalent; FI is dual to FC, and DF is dual to OE.)

For a given generalized plant
\[
G(s) = \begin{bmatrix} G_{11}(s) & G_{12}(s) \\ G_{21}(s) & G_{22}(s) \end{bmatrix}
= \left[\begin{array}{c|cc} A & B_1 & B_2 \\ \hline C_1 & D_{11} & D_{12} \\ C_2 & D_{21} & D_{22} \end{array}\right]
\]
we show that all stabilizing controllers can be parameterized as the transfer matrix from $y$ to $u$ below, where $F$ and $L$ are such that $A + LC_2$ and $A + B_2F$ are stable.

(Figure: the observer-based controller parameterization, a state observer built from $A$, $B_2$, $C_2$, $D_{22}$ with gains $F$ and $L$, and a free parameter $Q \in \mathcal{RH}_\infty$ connected from the innovation output $y_1$ to the auxiliary input $u_1$.)
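A minimal numerical sketch of the $Q = 0$ (observer-based) member of this controller family, using SciPy pole placement to pick $F$ and $L$; the plant data and pole locations below are arbitrary illustrations, and $D_{22} = 0$ is assumed:

```python
import numpy as np
from scipy.signal import place_poles

# Arbitrary unstable example plant x' = Ax + B2 u, y = C2 x (D22 = 0 assumed).
A = np.array([[0.0, 1.0], [-2.0, 3.0]])
B2 = np.array([[0.0], [1.0]])
C2 = np.array([[1.0, 0.0]])

F = -place_poles(A, B2, [-2.0, -3.0]).gain_matrix        # A + B2 F stable
L = -place_poles(A.T, C2.T, [-4.0, -5.0]).gain_matrix.T  # A + L C2 stable

# Observer-based controller (the Q = 0 member of the parameterization):
#   xhat' = (A + B2 F + L C2) xhat - L y,   u = F xhat.
# By separation, the closed-loop eigenvalues are those of A + B2 F
# together with those of A + L C2.
Acl = np.block([[A, B2 @ F], [-L @ C2, A + B2 @ F + L @ C2]])
print(np.sort(np.linalg.eigvals(Acl).real))
```

The printed closed-loop eigenvalues should be exactly the chosen state-feedback and observer poles, illustrating the separation structure behind the parameterization.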

Chapter 13 studies the algebraic Riccati equation (ARE) and related problems: the properties of its solutions, the methods to obtain the solutions, and some applications. In particular, we study in detail the so-called stabilizing solution and its applications in matrix factorizations. A solution to the ARE
\[
A^*X + XA + XRX + Q = 0
\]
is said to be a stabilizing solution if $A + RX$ is stable. Now let
\[
H := \begin{bmatrix} A & R \\ -Q & -A^* \end{bmatrix}
\]
and let $X_-(H)$ be the stable $H$-invariant subspace with
\[
X_-(H) = \mathrm{Im} \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}
\]
where $X_1, X_2 \in \mathbb{C}^{n \times n}$. If $X_1$ is nonsingular, then $X := X_2 X_1^{-1}$ is uniquely determined by $H$, denoted by $X = \mathrm{Ric}(H)$.

A key result of this chapter is the relationship between the spectral factorization of a transfer function and the solution of a corresponding ARE. Suppose $(A, B)$ is stabilizable and suppose either $A$ has no eigenvalues on the $j\omega$-axis, or $P$ is sign definite (i.e., $P \ge 0$ or $P \le 0$) and $(P, A)$ has no unobservable modes on the $j\omega$-axis. Define
\[
\Phi(s) = \begin{bmatrix} B^*(-sI - A^*)^{-1} & I \end{bmatrix}
\begin{bmatrix} P & S \\ S^* & R \end{bmatrix}
\begin{bmatrix} (sI - A)^{-1}B \\ I \end{bmatrix}.
\]


Then $\Phi(j\omega) > 0$ for all $0 \le \omega \le \infty$ if and only if there exists a stabilizing solution $X$ to
\[
(A - BR^{-1}S^*)^*X + X(A - BR^{-1}S^*) - XBR^{-1}B^*X + P - SR^{-1}S^* = 0,
\]
if and only if the Hamiltonian matrix
\[
H = \begin{bmatrix} A - BR^{-1}S^* & -BR^{-1}B^* \\ -(P - SR^{-1}S^*) & -(A - BR^{-1}S^*)^* \end{bmatrix}
\]
has no $j\omega$-axis eigenvalues.

Similarly, $\Phi(j\omega) \ge 0$ for all $0 \le \omega \le \infty$ if and only if there exists a solution $X$ to
\[
(A - BR^{-1}S^*)^*X + X(A - BR^{-1}S^*) - XBR^{-1}B^*X + P - SR^{-1}S^* = 0
\]
such that $\sigma(A - BR^{-1}S^* - BR^{-1}B^*X) \subset \overline{\mathbb{C}}_-$.

Furthermore, there exists an $M \in \mathcal{R}_p$ such that
\[
\Phi = M^\sim R M
\]
with
\[
M = \left[\begin{array}{c|c} A & B \\ \hline -F & I \end{array}\right], \qquad F = -R^{-1}(S^* + B^*X).
\]
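The construction $X = \mathrm{Ric}(H)$ can be sketched numerically with an ordered Schur decomposition, which is the standard robust way to compute the stable invariant subspace; the ARE data below are an arbitrary example:

```python
import numpy as np
from scipy.linalg import schur

def ric(H):
    """X = Ric(H) from the stable invariant subspace of the Hamiltonian,
    computed with an ordered real Schur decomposition."""
    n = H.shape[0] // 2
    _, Z, k = schur(H, sort='lhp')        # stable eigenvalues ordered first
    assert k == n, "H must have exactly n stable eigenvalues"
    X1, X2 = Z[:n, :n], Z[n:, :n]
    return np.linalg.solve(X1.T, X2.T).T  # X2 @ inv(X1)

# Example ARE  A*X + XA + XRX + Q = 0 with arbitrary data (R = -BB* as in LQR).
A = np.array([[1.0, 0.0], [0.0, -2.0]])
R = -np.eye(2)
Qm = np.eye(2)
H = np.block([[A, R], [-Qm, -A.T]])

X = ric(H)
residual = A.T @ X + X @ A + X @ R @ X + Qm
print(np.abs(residual).max(), np.linalg.eigvals(A + R @ X).real)
```

For this decoupled example the stabilizing solution is $X = \mathrm{diag}(1+\sqrt{2},\, \sqrt{5}-2)$, and the residual should be at machine-precision level with $A + RX$ stable.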

Chapter 14 treats the optimal control of linear time-invariant systems with quadratic performance criteria, i.e., LQR and $H_2$ problems. We consider a dynamical system described by an LFT with
\[
G(s) = \left[\begin{array}{c|cc} A & B_1 & B_2 \\ \hline C_1 & 0 & D_{12} \\ C_2 & D_{21} & 0 \end{array}\right].
\]

(Figure: the standard LFT configuration, with controller $K$ in feedback around the generalized plant $G$; exogenous input $w$, controlled output $z$, measurement $y$, and control input $u$.)


Define
\[
H_2 := \begin{bmatrix} A & 0 \\ -C_1^*C_1 & -A^* \end{bmatrix}
- \begin{bmatrix} B_2 \\ -C_1^*D_{12} \end{bmatrix} \begin{bmatrix} D_{12}^*C_1 & B_2^* \end{bmatrix}
\]
\[
J_2 := \begin{bmatrix} A^* & 0 \\ -B_1B_1^* & -A \end{bmatrix}
- \begin{bmatrix} C_2^* \\ -B_1D_{21}^* \end{bmatrix} \begin{bmatrix} D_{21}B_1^* & C_2 \end{bmatrix}
\]
\[
X_2 := \mathrm{Ric}(H_2) \ge 0, \qquad Y_2 := \mathrm{Ric}(J_2) \ge 0
\]
\[
F_2 := -(B_2^*X_2 + D_{12}^*C_1), \qquad L_2 := -(Y_2C_2^* + B_1D_{21}^*).
\]
Then the $H_2$ optimal controller, i.e., the controller that minimizes $\|T_{zw}\|_2$, is given by
\[
K_{\mathrm{opt}}(s) := \left[\begin{array}{c|c} A + B_2F_2 + L_2C_2 & -L_2 \\ \hline F_2 & 0 \end{array}\right].
\]
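A minimal numerical sketch of these formulas with arbitrary plant data, under the usual orthogonality assumptions $D_{12}^*[C_1\ D_{12}] = [0\ I]$ and $B_1D_{21}^* = 0$ (so $F_2 = -B_2^*X_2$ and $L_2 = -Y_2C_2^*$); the cross-check uses the standard optimal-cost identity $\|T_{zw}\|_2^2 = \mathrm{trace}(B_1^*X_2B_1) + \mathrm{trace}(F_2Y_2F_2^*)$:

```python
import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov

# Arbitrary example plant satisfying the simplifying assumptions above.
A   = np.array([[0.0, 1.0], [-1.0, 1.0]])     # unstable open loop
B1  = np.array([[1.0, 0.0], [0.0, 0.0]])
B2  = np.array([[0.0], [1.0]])
C1  = np.array([[1.0, 0.0], [0.0, 0.0]])
D12 = np.array([[0.0], [1.0]])
C2  = np.array([[1.0, 0.0]])
D21 = np.array([[0.0, 1.0]])

X2 = solve_continuous_are(A, B2, C1.T @ C1, np.eye(1))
Y2 = solve_continuous_are(A.T, C2.T, B1 @ B1.T, np.eye(1))
F2, L2 = -B2.T @ X2, -Y2 @ C2.T

# Observer-based H2 controller Kopt = [A + B2 F2 + L2 C2, -L2; F2, 0].
Ak, Bk, Ck = A + B2 @ F2 + L2 @ C2, -L2, F2

# Closed loop Tzw and its H2 norm via the closed-loop Gramian.
Acl = np.block([[A, B2 @ Ck], [Bk @ C2, Ak]])
Bcl = np.vstack([B1, Bk @ D21])
Ccl = np.hstack([C1, D12 @ Ck])
Pcl = solve_continuous_lyapunov(Acl, -Bcl @ Bcl.T)
h2_cost = np.sqrt(np.trace(Ccl @ Pcl @ Ccl.T))

h2_theory = np.sqrt(np.trace(B1.T @ X2 @ B1) + np.trace(F2 @ Y2 @ F2.T))
print(h2_cost, h2_theory)
```

The closed loop should be stable (separation) and the two cost computations should agree.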

Chapter 15 solves a max-min problem, i.e., a full information (or state feedback) $H_\infty$ control problem, which is the key to the $H_\infty$ theory considered in the next chapter. Consider a dynamical system
\[
\dot{x} = Ax + B_1w + B_2u
\]
\[
z = C_1x + D_{12}u, \qquad D_{12}^*\begin{bmatrix} C_1 & D_{12} \end{bmatrix} = \begin{bmatrix} 0 & I \end{bmatrix}.
\]
Then we show that $\displaystyle \sup_{w \in \mathcal{B}L_{2+}} \min_{u \in L_{2+}} \|z\|_2 < \gamma$ if and only if $H_\infty \in \mathrm{dom}(\mathrm{Ric})$ and $X_\infty = \mathrm{Ric}(H_\infty) \ge 0$, where
\[
H_\infty := \begin{bmatrix} A & \gamma^{-2}B_1B_1^* - B_2B_2^* \\ -C_1^*C_1 & -A^* \end{bmatrix}.
\]
Furthermore, $u = F_\infty x$ with $F_\infty := -B_2^*X_\infty$ is an optimal control.

Chapter 16 considers a simplified $H_\infty$ control problem with the generalized plant $G(s)$ as given in Chapter 14. We show that there exists an admissible controller such that $\|T_{zw}\|_\infty < \gamma$ iff the following three conditions hold:

(i) $H_\infty \in \mathrm{dom}(\mathrm{Ric})$ and $X_\infty := \mathrm{Ric}(H_\infty) \ge 0$;

(ii) $J_\infty \in \mathrm{dom}(\mathrm{Ric})$ and $Y_\infty := \mathrm{Ric}(J_\infty) \ge 0$, where
\[
J_\infty := \begin{bmatrix} A^* & \gamma^{-2}C_1^*C_1 - C_2^*C_2 \\ -B_1B_1^* & -A \end{bmatrix};
\]

(iii) $\rho(X_\infty Y_\infty) < \gamma^2$.


Moreover, the set of all admissible controllers such that $\|T_{zw}\|_\infty < \gamma$ equals the set of all transfer matrices from $y$ to $u$ obtained by closing a free parameter $Q$ around
\[
M_\infty(s) = \left[\begin{array}{c|cc} \hat{A}_\infty & -Z_\infty L_\infty & Z_\infty B_2 \\ \hline F_\infty & 0 & I \\ -C_2 & I & 0 \end{array}\right]
\]
where $Q \in \mathcal{RH}_\infty$, $\|Q\|_\infty < \gamma$, and
\[
\hat{A}_\infty := A + \gamma^{-2}B_1B_1^*X_\infty + B_2F_\infty + Z_\infty L_\infty C_2
\]
\[
F_\infty := -B_2^*X_\infty, \qquad L_\infty := -Y_\infty C_2^*, \qquad Z_\infty := (I - \gamma^{-2}Y_\infty X_\infty)^{-1}.
\]
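The three conditions suggest a simple $\gamma$-iteration. The sketch below (arbitrary plant data under the simplified assumptions, with a hand-rolled $\mathrm{Ric}$ solver) bisects on $\gamma$ until the smallest value passing (i)-(iii) is bracketed:

```python
import numpy as np
from scipy.linalg import schur

def ric(H):
    """X = Ric(H), or None when H has imaginary-axis eigenvalues or the
    X1 block of the stable invariant subspace is (nearly) singular."""
    n = H.shape[0] // 2
    if np.any(np.abs(np.linalg.eigvals(H).real) < 1e-9):
        return None
    _, Z, k = schur(H, sort='lhp')
    X1, X2 = Z[:n, :n], Z[n:, :n]
    if k != n or np.linalg.cond(X1) > 1e12:
        return None
    return np.linalg.solve(X1.T, X2.T).T

def gamma_ok(gamma, A, B1, B2, C1, C2):
    """Conditions (i)-(iii): X >= 0, Y >= 0, and rho(XY) < gamma^2."""
    Hi = np.block([[A, B1 @ B1.T / gamma**2 - B2 @ B2.T], [-C1.T @ C1, -A.T]])
    Ji = np.block([[A.T, C1.T @ C1 / gamma**2 - C2.T @ C2], [-B1 @ B1.T, -A]])
    X, Y = ric(Hi), ric(Ji)
    if X is None or Y is None:
        return False
    ok_X = np.linalg.eigvalsh((X + X.T) / 2).min() > -1e-9
    ok_Y = np.linalg.eigvalsh((Y + Y.T) / 2).min() > -1e-9
    return ok_X and ok_Y and np.abs(np.linalg.eigvals(X @ Y)).max() < gamma**2

# Arbitrary example plant data (same simplifying assumptions as in Chapter 14).
A  = np.array([[0.0, 1.0], [-1.0, 1.0]])
B1 = np.array([[1.0, 0.0], [0.0, 0.0]])
B2 = np.array([[0.0], [1.0]])
C1 = np.array([[1.0, 0.0], [0.0, 0.0]])
C2 = np.array([[1.0, 0.0]])

# Geometric bisection on gamma down to the feasibility boundary.
lo, hi = 1e-2, 1e3
for _ in range(60):
    mid = float(np.sqrt(lo * hi))
    lo, hi = (lo, mid) if gamma_ok(mid, A, B1, B2, C1, C2) else (mid, hi)
print("near-optimal gamma:", hi)
```

This is only a feasibility check; building the central controller from $X_\infty, Y_\infty, Z_\infty$ then follows the formulas above.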

Chapter 17 considers again the standard $H_\infty$ control problem but with some assumptions of the last chapter relaxed. We indicate how the assumptions can be relaxed to accommodate other more complicated problems such as singular control problems. We also consider integral control in the $H_2$ and $H_\infty$ theory and show how the general $H_\infty$ solution can be used to solve the $H_\infty$ filtering problem. The conventional Youla parameterization approach to the $H_2$ and $H_\infty$ problems is also outlined. Finally, the general state feedback $H_\infty$ control problem and its relations with full information control and differential game problems are discussed.

Chapter 18 first solves a gap metric minimization problem. Let $P = \tilde{M}^{-1}\tilde{N}$ be a normalized left coprime factorization. Then we show that
\[
\inf_{K\ \mathrm{stabilizing}} \left\| \begin{bmatrix} K \\ I \end{bmatrix} (I + PK)^{-1} \begin{bmatrix} I & P \end{bmatrix} \right\|_\infty
= \inf_{K\ \mathrm{stabilizing}} \left\| \begin{bmatrix} K \\ I \end{bmatrix} (I + PK)^{-1} \tilde{M}^{-1} \right\|_\infty
= \left( \sqrt{1 - \left\| \begin{bmatrix} \tilde{N} & \tilde{M} \end{bmatrix} \right\|_H^2} \right)^{-1}.
\]
This implies that there is a robustly stabilizing controller for
\[
P_\Delta = (\tilde{M} + \tilde{\Delta}_M)^{-1}(\tilde{N} + \tilde{\Delta}_N)
\]
with
\[
\left\| \begin{bmatrix} \tilde{\Delta}_N & \tilde{\Delta}_M \end{bmatrix} \right\|_\infty < \epsilon
\]
if and only if
\[
\epsilon \le \sqrt{1 - \left\| \begin{bmatrix} \tilde{N} & \tilde{M} \end{bmatrix} \right\|_H^2}.
\]

Using this stabilization result, a loop shaping design technique is proposed. The proposed technique uses only the basic concept of loop shaping methods, and then a robust stabilization controller for the normalized coprime factor perturbed system is used to construct the final controller.

Chapter 19 considers the design of reduced order controllers by means of controller reduction. Special attention is paid to the controller reduction methods that preserve the closed-loop stability and performance. In particular, two $H_\infty$ performance preserving reduction methods are proposed:

a) Let $K_0$ be a stabilizing controller such that $\|F_\ell(G, K_0)\|_\infty < \gamma$. Then $\hat{K}$ is also a stabilizing controller such that $\|F_\ell(G, \hat{K})\|_\infty < \gamma$ if
\[
\left\| W_2^{-1}(\hat{K} - K_0)W_1^{-1} \right\|_\infty < 1
\]
where $W_1$ and $W_2$ are some stable, minimum phase and invertible transfer matrices.

b) Let $K_0 = \Theta_{12}\Theta_{22}^{-1}$ be a central $H_\infty$ controller such that $\|F_\ell(G, K_0)\|_\infty < \gamma$, and let $\hat{U}, \hat{V} \in \mathcal{RH}_\infty$ be such that
\[
\left\| \begin{bmatrix} \gamma^{-1}I & 0 \\ 0 & I \end{bmatrix} \Theta^{-1}
\left( \begin{bmatrix} \Theta_{12} \\ \Theta_{22} \end{bmatrix} - \begin{bmatrix} \hat{U} \\ \hat{V} \end{bmatrix} \right) \right\|_\infty < 1/\sqrt{2}.
\]
Then $\hat{K} = \hat{U}\hat{V}^{-1}$ is also a stabilizing controller such that $\|F_\ell(G, \hat{K})\|_\infty < \gamma$.

Thus the controller reduction problem is converted to weighted model reduction problems for which some numerical methods are suggested.

Chapter 20 briefly introduces the Lagrange multiplier method for the design of fixed order controllers.

Chapter 21 discusses discrete time Riccati equations and some of their applications in discrete time control. Finally, the discrete time balanced model reduction is considered.


2 Linear Algebra

Some basic linear algebra facts will be reviewed in this chapter. The detailed treatment of this topic can be found in the references listed at the end of the chapter. Hence we shall omit most proofs and provide proofs only for those results that either cannot be easily found in the standard linear algebra textbooks or are insightful to the understanding of some related problems. We then treat a special class of matrix dilation problems which will be used in Chapters 8 and 17; however, most of the results presented in this book can be understood without the knowledge of the matrix dilation theory.

2.1 Linear Subspaces

Let $\mathbb{R}$ denote the real scalar field and $\mathbb{C}$ the complex scalar field. For the interest of this chapter, let $\mathbb{F}$ be either $\mathbb{R}$ or $\mathbb{C}$ and let $\mathbb{F}^n$ be the vector space over $\mathbb{F}$, i.e., $\mathbb{F}^n$ is either $\mathbb{R}^n$ or $\mathbb{C}^n$. Now let $x_1, x_2, \ldots, x_k \in \mathbb{F}^n$. Then an element of the form $\alpha_1x_1 + \cdots + \alpha_kx_k$ with $\alpha_i \in \mathbb{F}$ is a linear combination over $\mathbb{F}$ of $x_1, \ldots, x_k$. The set of all linear combinations of $x_1, x_2, \ldots, x_k \in \mathbb{F}^n$ is a subspace called the span of $x_1, x_2, \ldots, x_k$, denoted by
\[
\mathrm{span}\{x_1, x_2, \ldots, x_k\} := \{x = \alpha_1x_1 + \cdots + \alpha_kx_k : \alpha_i \in \mathbb{F}\}.
\]
A set of vectors $x_1, x_2, \ldots, x_k \in \mathbb{F}^n$ is said to be linearly dependent over $\mathbb{F}$ if there exist $\alpha_1, \ldots, \alpha_k \in \mathbb{F}$, not all zero, such that $\alpha_1x_1 + \cdots + \alpha_kx_k = 0$; otherwise the vectors are said to be linearly independent.

Let $S$ be a subspace of $\mathbb{F}^n$; then a set of vectors $\{x_1, x_2, \ldots, x_k\} \in S$ is called a basis for $S$ if $x_1, x_2, \ldots, x_k$ are linearly independent and $S = \mathrm{span}\{x_1, x_2, \ldots, x_k\}$. However, such a basis for a subspace $S$ is not unique, although all bases for $S$ have the same number of elements. This number is called the dimension of $S$, denoted by $\dim(S)$.

A set of vectors $\{x_1, x_2, \ldots, x_k\}$ in $\mathbb{F}^n$ is mutually orthogonal if $x_i^*x_j = 0$ for all $i \ne j$, and orthonormal if $x_i^*x_j = \delta_{ij}$, where the superscript $*$ denotes complex conjugate transpose and $\delta_{ij}$ is the Kronecker delta function with $\delta_{ij} = 1$ for $i = j$ and $\delta_{ij} = 0$ for $i \ne j$. More generally, a collection of subspaces $S_1, S_2, \ldots, S_k$ of $\mathbb{F}^n$ is mutually orthogonal if $x^*y = 0$ whenever $x \in S_i$ and $y \in S_j$ for $i \ne j$.

The orthogonal complement of a subspace $S \subset \mathbb{F}^n$ is defined by
\[
S^\perp := \{y \in \mathbb{F}^n : y^*x = 0 \text{ for all } x \in S\}.
\]
We call a set of vectors $\{u_1, u_2, \ldots, u_k\}$ an orthonormal basis for a subspace $S \subset \mathbb{F}^n$ if the vectors form a basis of $S$ and are orthonormal. It is always possible to extend such a basis to a full orthonormal basis $\{u_1, u_2, \ldots, u_n\}$ for $\mathbb{F}^n$. Note that in this case
\[
S^\perp = \mathrm{span}\{u_{k+1}, \ldots, u_n\},
\]
and $\{u_{k+1}, \ldots, u_n\}$ is called an orthonormal completion of $\{u_1, u_2, \ldots, u_k\}$.

Let $A \in \mathbb{F}^{m \times n}$ be a linear transformation from $\mathbb{F}^n$ to $\mathbb{F}^m$, i.e.,
\[
A : \mathbb{F}^n \longmapsto \mathbb{F}^m.
\]
(Note that a vector $x \in \mathbb{F}^m$ can also be viewed as a linear transformation from $\mathbb{F}$ to $\mathbb{F}^m$; hence anything said for the general matrix case is also true for the vector case.) Then the kernel or null space of the linear transformation $A$ is defined by
\[
\mathrm{Ker}\,A = N(A) := \{x \in \mathbb{F}^n : Ax = 0\},
\]
and the image or range of $A$ is
\[
\mathrm{Im}\,A = R(A) := \{y \in \mathbb{F}^m : y = Ax,\ x \in \mathbb{F}^n\}.
\]
It is clear that $\mathrm{Ker}\,A$ is a subspace of $\mathbb{F}^n$ and $\mathrm{Im}\,A$ is a subspace of $\mathbb{F}^m$. Moreover, it can easily be seen that $\dim(\mathrm{Ker}\,A) + \dim(\mathrm{Im}\,A) = n$ and $\dim(\mathrm{Im}\,A) = \dim((\mathrm{Ker}\,A)^\perp)$. Note that $(\mathrm{Ker}\,A)^\perp$ is a subspace of $\mathbb{F}^n$.

Let $a_i$, $i = 1, 2, \ldots, n$ denote the columns of a matrix $A \in \mathbb{F}^{m \times n}$; then
\[
\mathrm{Im}\,A = \mathrm{span}\{a_1, a_2, \ldots, a_n\}.
\]
The rank of a matrix $A$ is defined by
\[
\mathrm{rank}(A) = \dim(\mathrm{Im}\,A).
\]
It is a fact that $\mathrm{rank}(A) = \mathrm{rank}(A^*)$, and thus the rank of a matrix equals the maximal number of independent rows or columns. A matrix $A \in \mathbb{F}^{m \times n}$ is said to have full row rank if $m \le n$ and $\mathrm{rank}(A) = m$. Dually, it is said to have full column rank if $n \le m$ and $\mathrm{rank}(A) = n$. A full rank square matrix is called a nonsingular matrix. It is easy to see that $\mathrm{rank}(A) = \mathrm{rank}(AT) = \mathrm{rank}(PA)$ if $T$ and $P$ are nonsingular matrices with appropriate dimensions.

A square matrix $U \in \mathbb{F}^{n \times n}$ whose columns form an orthonormal basis for $\mathbb{F}^n$ is called a unitary matrix (or orthogonal matrix if $\mathbb{F} = \mathbb{R}$), and it satisfies $U^*U = I = UU^*$. The following lemma is useful.

Lemma 2.1 Let $D = \begin{bmatrix} d_1 & \cdots & d_k \end{bmatrix} \in \mathbb{F}^{n \times k}$ ($n > k$) be such that $D^*D = I$, so $d_i$, $i = 1, 2, \ldots, k$ are orthonormal. Then there exists a matrix $D_\perp \in \mathbb{F}^{n \times (n-k)}$ such that $\begin{bmatrix} D & D_\perp \end{bmatrix}$ is a unitary matrix. Furthermore, the columns of $D_\perp$, $d_i$, $i = k+1, \ldots, n$, form an orthonormal completion of $\{d_1, d_2, \ldots, d_k\}$.
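A numerical sketch of Lemma 2.1 (arbitrary example data): a completion $D_\perp$ of a matrix $D$ with orthonormal columns can be read off from a full QR decomposition.

```python
import numpy as np

# Build a D with orthonormal columns (D*D = I, n > k) from random data.
rng = np.random.default_rng(1)
n, k = 5, 2
D, _ = np.linalg.qr(rng.standard_normal((n, k)))

Qfull, _ = np.linalg.qr(D, mode='complete')   # full n x n unitary matrix
D_perp = Qfull[:, k:]                         # last n-k columns complete D
U = np.hstack([D, D_perp])
print(U.shape)
```

The columns of $D_\perp$ are orthonormal and orthogonal to those of $D$, so $[D\ D_\perp]$ is unitary regardless of the sign conventions inside the QR routine.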

The following results are standard:

Lemma 2.2 Consider the linear equation
\[
AX = B
\]
where $A \in \mathbb{F}^{n \times l}$ and $B \in \mathbb{F}^{n \times m}$ are given matrices. Then the following statements are equivalent:

(i) there exists a solution $X \in \mathbb{F}^{l \times m}$;

(ii) the columns of $B \in \mathrm{Im}\,A$;

(iii) $\mathrm{rank}\begin{bmatrix} A & B \end{bmatrix} = \mathrm{rank}(A)$;

(iv) $\mathrm{Ker}(A^*) \subset \mathrm{Ker}(B^*)$.

Furthermore, the solution, if it exists, is unique if and only if $A$ has full column rank.
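Condition (iii) is the easiest to test numerically; a sketch with arbitrary example data, taking $X = A^+B$ (the pseudo-inverse solution) when the equation is consistent:

```python
import numpy as np

# AX = B is solvable iff rank([A B]) = rank(A); X = pinv(A) B is then a solution.
A = np.array([[1.0, 2.0], [2.0, 4.0], [0.0, 1.0]])   # full column rank, 3 x 2
X_true = np.array([[1.0], [3.0]])
B = A @ X_true                                        # consistent by construction

solvable = np.linalg.matrix_rank(np.hstack([A, B])) == np.linalg.matrix_rank(A)
X = np.linalg.pinv(A) @ B
print(solvable, np.allclose(A @ X, B))

# Perturbing B off the column space of A breaks the rank condition.
B_bad = B + np.array([[0.0], [1.0], [0.0]])
print(np.linalg.matrix_rank(np.hstack([A, B_bad])) == np.linalg.matrix_rank(A))
```

Since this $A$ has full column rank, the solution recovered is the unique one, in line with the last statement of the lemma.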

The following lemma concerns the rank of the product of two matrices.

Lemma 2.3 (Sylvester's inequality) Let $A \in \mathbb{F}^{m \times n}$ and $B \in \mathbb{F}^{n \times k}$. Then
\[
\mathrm{rank}(A) + \mathrm{rank}(B) - n \le \mathrm{rank}(AB) \le \min\{\mathrm{rank}(A), \mathrm{rank}(B)\}.
\]
For simplicity, a matrix $M$ with $m_{ij}$ as its $i$-th row and $j$-th column element will sometimes be denoted as $M = [m_{ij}]$ in this book. We will mostly use $I$ as above to denote an identity matrix with compatible dimensions, but from time to time, we will use $I_n$ to emphasize that it is an $n \times n$ identity matrix.

Now let $A = [a_{ij}] \in \mathbb{C}^{n \times n}$; then the trace of $A$ is defined as
\[
\mathrm{Trace}(A) := \sum_{i=1}^n a_{ii}.
\]
Trace has the following properties:
\[
\mathrm{Trace}(\alpha A) = \alpha\,\mathrm{Trace}(A), \qquad \forall \alpha \in \mathbb{C},\ A \in \mathbb{C}^{n \times n}
\]
\[
\mathrm{Trace}(A + B) = \mathrm{Trace}(A) + \mathrm{Trace}(B), \qquad \forall A, B \in \mathbb{C}^{n \times n}
\]
\[
\mathrm{Trace}(AB) = \mathrm{Trace}(BA), \qquad \forall A \in \mathbb{C}^{n \times m},\ B \in \mathbb{C}^{m \times n}.
\]


2.2 Eigenvalues and Eigenvectors

Let $A \in \mathbb{C}^{n \times n}$; then the eigenvalues of $A$ are the $n$ roots of its characteristic polynomial $p(\lambda) = \det(\lambda I - A)$. This set of roots is called the spectrum of $A$ and is denoted by $\sigma(A)$ (not to be confused with singular values defined later). That is, $\sigma(A) := \{\lambda_1, \lambda_2, \ldots, \lambda_n\}$ if $\lambda_i$ is a root of $p(\lambda)$. The maximal modulus of the eigenvalues is called the spectral radius, denoted by
\[
\rho(A) := \max_{1 \le i \le n} |\lambda_i|
\]
where, as usual, $|\cdot|$ denotes the magnitude.

If $\lambda \in \sigma(A)$ then any nonzero vector $x \in \mathbb{C}^n$ that satisfies
\[
Ax = \lambda x
\]
is referred to as a right eigenvector of $A$. Dually, a nonzero vector $y$ is called a left eigenvector of $A$ if
\[
y^*A = \lambda y^*.
\]

It is a well known (but nontrivial) fact in linear algebra that any complex matrix admits a Jordan canonical representation:

Theorem 2.4 For any square complex matrix $A \in \mathbb{C}^{n \times n}$, there exists a nonsingular matrix $T$ such that
\[
A = TJT^{-1}
\]
where
\[
J = \mathrm{diag}\{J_1, J_2, \ldots, J_l\}
\]
\[
J_i = \mathrm{diag}\{J_{i1}, J_{i2}, \ldots, J_{im_i}\}
\]
\[
J_{ij} = \begin{bmatrix} \lambda_i & 1 & & \\ & \lambda_i & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_i \end{bmatrix} \in \mathbb{C}^{n_{ij} \times n_{ij}}
\]
with $\sum_{i=1}^l \sum_{j=1}^{m_i} n_{ij} = n$, and with $\{\lambda_i : i = 1, \ldots, l\}$ as the distinct eigenvalues of $A$.

The transformation $T$ has the following form:
\[
T = \begin{bmatrix} T_1 & T_2 & \cdots & T_l \end{bmatrix}
\]
\[
T_i = \begin{bmatrix} T_{i1} & T_{i2} & \cdots & T_{im_i} \end{bmatrix}
\]
\[
T_{ij} = \begin{bmatrix} t_{ij1} & t_{ij2} & \cdots & t_{ijn_{ij}} \end{bmatrix}
\]
where $t_{ij1}$ are the eigenvectors of $A$,
\[
At_{ij1} = \lambda_i t_{ij1},
\]


and $t_{ijk} \ne 0$, defined by the following linear equations for $k \ge 2$,
\[
(A - \lambda_i I)t_{ijk} = t_{ij(k-1)},
\]
are called the generalized eigenvectors of $A$. For a given integer $q \le n_{ij}$, the generalized eigenvectors $t_{ijl}$, $\forall l < q$, are called the lower rank generalized eigenvectors of $t_{ijq}$.

Definition 2.1 A square matrix $A \in \mathbb{R}^{n \times n}$ is called cyclic if the Jordan canonical form of $A$ has one and only one Jordan block associated with each distinct eigenvalue.

More specifically, a matrix $A$ is cyclic if its Jordan form has $m_i = 1$, $i = 1, \ldots, l$. Clearly, a square matrix $A$ with all distinct eigenvalues is cyclic and can be diagonalized:
\[
A \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}
= \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}
\begin{bmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{bmatrix}.
\]
In this case, $A$ has the following spectral decomposition:
\[
A = \sum_{i=1}^n \lambda_i x_i y_i^*
\]
where $y_i \in \mathbb{C}^n$ is given by
\[
\begin{bmatrix} y_1^* \\ y_2^* \\ \vdots \\ y_n^* \end{bmatrix}
= \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}^{-1}.
\]
In general, eigenvalues need not be real, and neither do their corresponding eigenvectors. However, if $A$ is real and $\lambda$ is a real eigenvalue of $A$, then there is a real eigenvector corresponding to $\lambda$. In the case that all eigenvalues of a matrix $A$ are real¹, we will denote by $\lambda_{\max}(A)$ the largest eigenvalue of $A$ and by $\lambda_{\min}(A)$ the smallest eigenvalue. In particular, if $A$ is a Hermitian matrix, then there exist a unitary matrix $U$ and a real diagonal matrix $\Lambda$ such that $A = U\Lambda U^*$, where the diagonal elements of $\Lambda$ are the eigenvalues of $A$ and the columns of $U$ are the eigenvectors of $A$.
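The spectral decomposition can be verified numerically for a small example with distinct eigenvalues (arbitrary data): the rows of the inverse of the right-eigenvector matrix supply the $y_i^*$.

```python
import numpy as np

# A = sum_i lambda_i x_i y_i*, with x_i the right and y_i the left eigenvectors.
A = np.array([[2.0, 1.0], [0.0, -1.0]])   # distinct eigenvalues 2 and -1

lam, Xv = np.linalg.eig(A)                # columns of Xv: right eigenvectors x_i
Yh = np.linalg.inv(Xv)                    # rows of Yh: y_i^*
A_rebuilt = sum(lam[i] * np.outer(Xv[:, i], Yh[i, :]) for i in range(2))
print(np.allclose(A, A_rebuilt))
```

Each row $y_i^*$ is indeed a left eigenvector, i.e., $y_i^*A = \lambda_i y_i^*$, which follows from $X^{-1}A = \Lambda X^{-1}$.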

The following theorem is useful in linear system theory.

Theorem 2.5 (Cayley-Hamilton) Let $A \in \mathbb{C}^{n \times n}$ and denote
\[
\det(\lambda I - A) = \lambda^n + a_1\lambda^{n-1} + \cdots + a_n.
\]
Then
\[
A^n + a_1A^{n-1} + \cdots + a_nI = 0.
\]

¹For example, this is the case if $A$ is Hermitian, i.e., $A = A^*$.


This is obvious if $A$ has distinct eigenvalues, since then
\[
A^n + a_1A^{n-1} + \cdots + a_nI
= T\, \mathrm{diag}\{\ldots,\ \lambda_i^n + a_1\lambda_i^{n-1} + \cdots + a_n,\ \ldots\}\, T^{-1} = 0
\]
and each $\lambda_i$ is an eigenvalue of $A$, hence a root of the characteristic polynomial. The proof for the general case follows from the following lemma.

Lemma 2.6 Let $A \in \mathbb{C}^{n \times n}$. Then
\[
(\lambda I - A)^{-1} = \frac{1}{\det(\lambda I - A)}\left(R_1\lambda^{n-1} + R_2\lambda^{n-2} + \cdots + R_n\right)
\]
and
\[
\det(\lambda I - A) = \lambda^n + a_1\lambda^{n-1} + \cdots + a_n
\]
where $a_i$ and $R_i$ can be computed from the following recursive formulas:
\[
\begin{array}{ll}
a_1 = -\mathrm{Trace}\,A & R_1 = I \\
a_2 = -\frac{1}{2}\mathrm{Trace}(R_2A) & R_2 = R_1A + a_1I \\
\quad\vdots & \quad\vdots \\
a_{n-1} = -\frac{1}{n-1}\mathrm{Trace}(R_{n-1}A) & R_n = R_{n-1}A + a_{n-1}I \\
a_n = -\frac{1}{n}\mathrm{Trace}(R_nA) & 0 = R_nA + a_nI.
\end{array}
\]
The proof is left to the reader as an exercise. Note that the Cayley-Hamilton theorem follows from the fact that
\[
0 = R_nA + a_nI = A^n + a_1A^{n-1} + \cdots + a_nI.
\]
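The recursion of Lemma 2.6 (the Faddeev-LeVerrier algorithm) is short to implement; a sketch with an arbitrary $2 \times 2$ example:

```python
import numpy as np

def leverrier(A):
    """Characteristic polynomial coefficients a_1..a_n via the recursion
    of Lemma 2.6 (the Faddeev-LeVerrier algorithm)."""
    n = A.shape[0]
    coeffs = []
    R = np.eye(n)
    for k in range(1, n + 1):
        ak = -np.trace(R @ A) / k        # a_k = -(1/k) Trace(R_k A)
        coeffs.append(ak)
        R = R @ A + ak * np.eye(n)       # R_{k+1} = R_k A + a_k I
    return np.array(coeffs), R           # final R = R_n A + a_n I

A = np.array([[1.0, 2.0], [3.0, 4.0]])
a, final = leverrier(A)
print(a)                    # coefficients of det(lambda*I - A) after the leading 1
print(np.abs(final).max())  # Cayley-Hamilton: R_n A + a_n I = 0
```

For this matrix $\det(\lambda I - A) = \lambda^2 - 5\lambda - 2$, so the recursion should return $(-5, -2)$ and a zero final matrix.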

2.3 Matrix Inversion Formulas

Let $A$ be a square matrix partitioned as follows:
\[
A := \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}
\]
where $A_{11}$ and $A_{22}$ are also square matrices. Now suppose $A_{11}$ is nonsingular; then $A$ has the following decomposition:
\[
\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}
= \begin{bmatrix} I & 0 \\ A_{21}A_{11}^{-1} & I \end{bmatrix}
\begin{bmatrix} A_{11} & 0 \\ 0 & \Delta \end{bmatrix}
\begin{bmatrix} I & A_{11}^{-1}A_{12} \\ 0 & I \end{bmatrix}
\]
with $\Delta := A_{22} - A_{21}A_{11}^{-1}A_{12}$, and $A$ is nonsingular iff $\Delta$ is nonsingular.

Dually, if $A_{22}$ is nonsingular, then
\[
\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}
= \begin{bmatrix} I & A_{12}A_{22}^{-1} \\ 0 & I \end{bmatrix}
\begin{bmatrix} \hat{\Delta} & 0 \\ 0 & A_{22} \end{bmatrix}
\begin{bmatrix} I & 0 \\ A_{22}^{-1}A_{21} & I \end{bmatrix}
\]
with $\hat{\Delta} := A_{11} - A_{12}A_{22}^{-1}A_{21}$, and $A$ is nonsingular iff $\hat{\Delta}$ is nonsingular. The matrix $\Delta$ ($\hat{\Delta}$) is called the Schur complement of $A_{11}$ ($A_{22}$) in $A$.

Moreover, if $A$ is nonsingular, then
\[
\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}^{-1}
= \begin{bmatrix} A_{11}^{-1} + A_{11}^{-1}A_{12}\Delta^{-1}A_{21}A_{11}^{-1} & -A_{11}^{-1}A_{12}\Delta^{-1} \\ -\Delta^{-1}A_{21}A_{11}^{-1} & \Delta^{-1} \end{bmatrix}
\]
and
\[
\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}^{-1}
= \begin{bmatrix} \hat{\Delta}^{-1} & -\hat{\Delta}^{-1}A_{12}A_{22}^{-1} \\ -A_{22}^{-1}A_{21}\hat{\Delta}^{-1} & A_{22}^{-1} + A_{22}^{-1}A_{21}\hat{\Delta}^{-1}A_{12}A_{22}^{-1} \end{bmatrix}.
\]
The above matrix inversion formulas are particularly simple if $A$ is block triangular:
\[
\begin{bmatrix} A_{11} & 0 \\ A_{21} & A_{22} \end{bmatrix}^{-1}
= \begin{bmatrix} A_{11}^{-1} & 0 \\ -A_{22}^{-1}A_{21}A_{11}^{-1} & A_{22}^{-1} \end{bmatrix}
\]
\[
\begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix}^{-1}
= \begin{bmatrix} A_{11}^{-1} & -A_{11}^{-1}A_{12}A_{22}^{-1} \\ 0 & A_{22}^{-1} \end{bmatrix}.
\]
The following identity is also very useful. Suppose $A_{11}$ and $A_{22}$ are both nonsingular matrices; then
\[
(A_{11} - A_{12}A_{22}^{-1}A_{21})^{-1}
= A_{11}^{-1} + A_{11}^{-1}A_{12}(A_{22} - A_{21}A_{11}^{-1}A_{12})^{-1}A_{21}A_{11}^{-1}.
\]

As a consequence of the matrix decomposition formulas mentioned above, we can calculate the determinant of a matrix by using its sub-matrices. Suppose $A_{11}$ is nonsingular; then
\[
\det A = \det A_{11} \det(A_{22} - A_{21}A_{11}^{-1}A_{12}).
\]
On the other hand, if $A_{22}$ is nonsingular, then
\[
\det A = \det A_{22} \det(A_{11} - A_{12}A_{22}^{-1}A_{21}).
\]
In particular, for any $B \in \mathbb{C}^{m \times n}$ and $C \in \mathbb{C}^{n \times m}$, we have
\[
\det \begin{bmatrix} I_m & B \\ -C & I_n \end{bmatrix} = \det(I_n + CB) = \det(I_m + BC)
\]
and for $x, y \in \mathbb{C}^n$
\[
\det(I_n + xy^*) = 1 + y^*x.
\]
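These determinant identities are easy to confirm numerically (arbitrary random data):

```python
import numpy as np

# Check det([I B; -C I]) = det(I + CB) = det(I + BC) with random matrices.
rng = np.random.default_rng(2)
m, n = 2, 3
B = rng.standard_normal((m, n))
C = rng.standard_normal((n, m))

lhs = np.linalg.det(np.block([[np.eye(m), B], [-C, np.eye(n)]]))
print(lhs, np.linalg.det(np.eye(n) + C @ B), np.linalg.det(np.eye(m) + B @ C))

# Rank-one special case: det(I + x y*) = 1 + y* x.
x, y = rng.standard_normal(n), rng.standard_normal(n)
print(np.isclose(np.linalg.det(np.eye(n) + np.outer(x, y)), 1 + y @ x))
```

The middle identity, $\det(I_n + CB) = \det(I_m + BC)$, is the workhorse behind many of the transfer-function determinant manipulations used later in the book.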


2.4 Matrix Calculus

Let $X = [x_{ij}] \in \mathbb{C}^{m \times n}$ be a real or complex matrix and $F(X) \in \mathbb{C}$ be a scalar real or complex function of $X$; then the derivative of $F(X)$ with respect to $X$ is defined as
\[
\frac{\partial}{\partial X}F(X) := \left[\frac{\partial}{\partial x_{ij}}F(X)\right].
\]
Let $A$ and $B$ be constant complex matrices with compatible dimensions. Then the following is a list of formulas for the derivatives²:
\[
\frac{\partial}{\partial X}\mathrm{Trace}\{AXB\} = A^TB^T
\]
\[
\frac{\partial}{\partial X}\mathrm{Trace}\{AX^TB\} = BA
\]
\[
\frac{\partial}{\partial X}\mathrm{Trace}\{AXBX\} = A^TX^TB^T + B^TX^TA^T
\]
\[
\frac{\partial}{\partial X}\mathrm{Trace}\{AXBX^T\} = A^TXB^T + AXB
\]
\[
\frac{\partial}{\partial X}\mathrm{Trace}\{X^k\} = k(X^{k-1})^T
\]
\[
\frac{\partial}{\partial X}\mathrm{Trace}\{AX^k\} = \left(\sum_{i=0}^{k-1} X^iAX^{k-i-1}\right)^T
\]
\[
\frac{\partial}{\partial X}\mathrm{Trace}\{AX^{-1}B\} = -(X^{-1}BAX^{-1})^T
\]
\[
\frac{\partial}{\partial X}\log\det X = (X^T)^{-1}
\]
\[
\frac{\partial}{\partial X}\det X^T = \frac{\partial}{\partial X}\det X = (\det X)(X^T)^{-1}
\]
\[
\frac{\partial}{\partial X}\det\{X^k\} = k(\det X^k)(X^T)^{-1}.
\]

And finally, the derivative of a matrix $A(\alpha) \in \mathbb{C}^{m \times n}$ with respect to a scalar $\alpha \in \mathbb{C}$ is defined as
\[
\frac{dA}{d\alpha} := \left[\frac{da_{ij}}{d\alpha}\right]
\]
so that all the rules applicable to a scalar function also apply here. In particular, we have
\[
\frac{d(AB)}{d\alpha} = \frac{dA}{d\alpha}B + A\frac{dB}{d\alpha}
\]
\[
\frac{dA^{-1}}{d\alpha} = -A^{-1}\frac{dA}{d\alpha}A^{-1}.
\]

²Note that the transpose rather than the complex conjugate transpose should be used in the list even if the involved matrices are complex.


2.5 Kronecker Product and Kronecker Sum

Let $A \in \mathbb{C}^{m \times n}$ and $B \in \mathbb{C}^{p \times q}$; then the Kronecker product of $A$ and $B$ is defined as
\[
A \otimes B := \begin{bmatrix}
a_{11}B & a_{12}B & \cdots & a_{1n}B \\
a_{21}B & a_{22}B & \cdots & a_{2n}B \\
\vdots & \vdots & & \vdots \\
a_{m1}B & a_{m2}B & \cdots & a_{mn}B
\end{bmatrix} \in \mathbb{C}^{mp \times nq}.
\]
Furthermore, if the matrices $A$ and $B$ are square, with $A \in \mathbb{C}^{n \times n}$ and $B \in \mathbb{C}^{m \times m}$, then the Kronecker sum of $A$ and $B$ is defined as
\[
A \oplus B := (A \otimes I_m) + (I_n \otimes B) \in \mathbb{C}^{nm \times nm}.
\]
Let $X \in \mathbb{C}^{m \times n}$ and let $\mathrm{vec}(X)$ denote the vector formed by stacking the columns of $X$ into one long vector:
\[
\mathrm{vec}(X) := \begin{bmatrix} x_{11} & x_{21} & \cdots & x_{m1} & x_{12} & x_{22} & \cdots & x_{mn} \end{bmatrix}^T.
\]
Then for any matrices $A \in \mathbb{C}^{k \times m}$, $B \in \mathbb{C}^{n \times l}$, and $X \in \mathbb{C}^{m \times n}$, we have
\[
\mathrm{vec}(AXB) = (B^T \otimes A)\,\mathrm{vec}(X).
\]
Consequently, if $k = m$ and $l = n$, then
\[
\mathrm{vec}(AX + XB) = (B^T \oplus A)\,\mathrm{vec}(X).
\]

Let $A \in \mathbb{C}^{n \times n}$ and $B \in \mathbb{C}^{m \times m}$, and let $\{\lambda_i,\ i = 1, \ldots, n\}$ be the eigenvalues of $A$ and $\{\mu_j,\ j = 1, \ldots, m\}$ be the eigenvalues of $B$. Then we have the following properties:

• The eigenvalues of $A \otimes B$ are the $mn$ numbers $\lambda_i\mu_j$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$.

• The eigenvalues of $A \oplus B = (A \otimes I_m) + (I_n \otimes B)$ are the $mn$ numbers $\lambda_i + \mu_j$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$.

• Let $\{x_i,\ i = 1, \ldots, n\}$ be the eigenvectors of $A$ and let $\{y_j,\ j = 1, \ldots, m\}$ be the eigenvectors of $B$. Then the eigenvectors of $A \otimes B$ and $A \oplus B$ corresponding to the eigenvalues $\lambda_i\mu_j$ and $\lambda_i + \mu_j$ are $x_i \otimes y_j$.

Using these properties, we can show the following lemma.

Lemma 2.7 Consider the Sylvester equation
\[
AX + XB = C \tag{2.1}
\]
where $A \in \mathbb{F}^{n \times n}$, $B \in \mathbb{F}^{m \times m}$, and $C \in \mathbb{F}^{n \times m}$ are given matrices. There exists a unique solution $X \in \mathbb{F}^{n \times m}$ if and only if $\lambda_i(A) + \mu_j(B) \ne 0$, $\forall i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, m$.

In particular, if $B = A^*$, (2.1) is called the "Lyapunov equation"; and the necessary and sufficient condition for the existence of a unique solution is that $\lambda_i(A) + \bar{\lambda}_j(A) \ne 0$, $\forall i, j = 1, 2, \ldots, n$.

Proof. Equation (2.1) can be written as a linear matrix equation by using the Kronecker product:
\[
(B^T \oplus A)\,\mathrm{vec}(X) = \mathrm{vec}(C).
\]
Now this equation has a unique solution iff $B^T \oplus A$ is nonsingular. Since the eigenvalues of $B^T \oplus A$ have the form $\lambda_i(A) + \mu_j(B^T) = \lambda_i(A) + \mu_j(B)$, the conclusion follows. $\Box$

The properties of the Lyapunov equations will be studied in more detail in the next chapter.
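The vec/Kronecker reformulation in the proof is also a practical (if dense) way to solve small Sylvester equations; a sketch with arbitrary random data, cross-checked against SciPy's dedicated solver:

```python
import numpy as np
from scipy.linalg import solve_sylvester

# AX + XB = C as the linear system (B^T oplus A) vec(X) = vec(C),
# where vec stacks columns (Fortran order in NumPy terms).
rng = np.random.default_rng(3)
n, m = 3, 2
A = rng.standard_normal((n, n))
B = rng.standard_normal((m, m))
C = rng.standard_normal((n, m))

K = np.kron(np.eye(m), A) + np.kron(B.T, np.eye(n))  # B^T oplus A
x = np.linalg.solve(K, C.flatten(order='F'))         # vec(C), column stacking
X = x.reshape((n, m), order='F')

print(np.allclose(A @ X + X @ B, C))
print(np.allclose(X, solve_sylvester(A, B, C)))      # cross-check with SciPy
```

Random $A$ and $B$ generically satisfy $\lambda_i(A) + \mu_j(B) \ne 0$, so $B^T \oplus A$ is nonsingular and the solution is unique.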

2.6 Invariant Subspaces

Let $A : \mathbb{C}^n \mapsto \mathbb{C}^n$ be a linear transformation, $\lambda$ an eigenvalue of $A$, and $x$ a corresponding eigenvector. Then $Ax = \lambda x$ and $A(\alpha x) = \lambda(\alpha x)$ for any $\alpha \in \mathbb{C}$. Clearly, the eigenvector $x$ defines a one-dimensional subspace that is invariant with respect to pre-multiplication by $A$ since $A^kx = \lambda^kx$, $\forall k$. In general, a subspace $S \subset \mathbb{C}^n$ is called invariant for the transformation $A$, or $A$-invariant, if $Ax \in S$ for every $x \in S$. In other words, that $S$ is invariant for $A$ means that the image of $S$ under $A$ is contained in $S$: $AS \subset S$. For example, $\{0\}$, $\mathbb{C}^n$, $\mathrm{Ker}\,A$, and $\mathrm{Im}\,A$ are all $A$-invariant subspaces.

As a generalization of the one-dimensional invariant subspace induced by an eigenvector, let $\lambda_1, \ldots, \lambda_k$ be eigenvalues of $A$ (not necessarily distinct), and let $x_i$ be the corresponding eigenvectors and generalized eigenvectors. Then $S = \mathrm{span}\{x_1, \ldots, x_k\}$ is an $A$-invariant subspace provided that all the lower rank generalized eigenvectors are included. More specifically, let $\lambda_1 = \lambda_2 = \cdots = \lambda_l$ be eigenvalues of $A$, and let $x_1, x_2, \ldots, x_l$ be the corresponding eigenvector and generalized eigenvectors obtained through the following equations:
\[
(A - \lambda_1 I)x_1 = 0
\]
\[
(A - \lambda_1 I)x_2 = x_1
\]
\[
\vdots
\]
\[
(A - \lambda_1 I)x_l = x_{l-1}.
\]
Then a subspace $S$ with $x_t \in S$ for some $t \le l$ is an $A$-invariant subspace only if all lower rank eigenvectors and generalized eigenvectors of $x_t$ are in $S$, i.e., $x_i \in S$, $\forall 1 \le i \le t$. This will be further illustrated in Example 2.1.

On the other hand, if $S$ is a nontrivial subspace³ and is $A$-invariant, then there exist $x \in S$ and $\lambda$ such that $Ax = \lambda x$.

An $A$-invariant subspace $S \subset \mathbb{C}^n$ is called a stable invariant subspace if all the eigenvalues of $A$ constrained to $S$ have negative real parts. Stable invariant subspaces will play an important role in computing the stabilizing solutions to the algebraic Riccati equations in Chapter 13.

Example 2.1 Suppose a matrix $A$ has the following Jordan canonical form:
\[
A\begin{bmatrix} x_1 & x_2 & x_3 & x_4 \end{bmatrix}
= \begin{bmatrix} x_1 & x_2 & x_3 & x_4 \end{bmatrix}
\begin{bmatrix} \lambda_1 & 1 & & \\ & \lambda_1 & & \\ & & \lambda_3 & \\ & & & \lambda_4 \end{bmatrix}
\]
with $\mathrm{Re}\,\lambda_1 < 0$, $\lambda_3 < 0$, and $\lambda_4 > 0$. Then it is easy to verify that
\[
S_1 = \mathrm{span}\{x_1\} \qquad S_{12} = \mathrm{span}\{x_1, x_2\} \qquad S_{123} = \mathrm{span}\{x_1, x_2, x_3\}
\]
\[
S_3 = \mathrm{span}\{x_3\} \qquad S_{13} = \mathrm{span}\{x_1, x_3\} \qquad S_{124} = \mathrm{span}\{x_1, x_2, x_4\}
\]
\[
S_4 = \mathrm{span}\{x_4\} \qquad S_{14} = \mathrm{span}\{x_1, x_4\} \qquad S_{34} = \mathrm{span}\{x_3, x_4\}
\]
are all $A$-invariant subspaces. Moreover, $S_1$, $S_3$, $S_{12}$, $S_{13}$, and $S_{123}$ are stable $A$-invariant subspaces. However, the subspaces $S_2 = \mathrm{span}\{x_2\}$, $S_{23} = \mathrm{span}\{x_2, x_3\}$, $S_{24} = \mathrm{span}\{x_2, x_4\}$, and $S_{234} = \mathrm{span}\{x_2, x_3, x_4\}$ are not $A$-invariant subspaces since the lower rank generalized eigenvector $x_1$ of $x_2$ is not in these subspaces. To illustrate, consider the subspace $S_{23}$. By definition, $Ax_2 \in S_{23}$ if $S_{23}$ is an $A$-invariant subspace. Since
\[
Ax_2 = \lambda_1 x_2 + x_1,
\]
$Ax_2 \in S_{23}$ would require that $x_1$ be a linear combination of $x_2$ and $x_3$, but this is impossible since $x_1$ is independent of $x_2$ and $x_3$.

³We will say a subspace $S$ is trivial if $S = \{0\}$.


2.7 Vector Norms and Matrix Norms

In this section, we will define vector and matrix norms. Let $X$ be a vector space. A real-valued function $\|\cdot\|$ defined on $X$ is said to be a norm on $X$ if it satisfies the following properties:

(i) $\|x\| \ge 0$ (positivity);

(ii) $\|x\| = 0$ if and only if $x = 0$ (positive definiteness);

(iii) $\|\alpha x\| = |\alpha|\,\|x\|$, for any scalar $\alpha$ (homogeneity);

(iv) $\|x + y\| \le \|x\| + \|y\|$ (triangle inequality)

for any $x \in X$ and $y \in X$. A function is said to be a semi-norm if it satisfies (i), (iii), and (iv) but not necessarily (ii).

Let $x \in \mathbb{C}^n$. Then we define the vector $p$-norm of $x$ as
\[
\|x\|_p := \left(\sum_{i=1}^n |x_i|^p\right)^{1/p}, \quad \text{for } 1 \le p < \infty.
\]
In particular, for $p = 1, 2, \infty$ we have
\[
\|x\|_1 := \sum_{i=1}^n |x_i|;
\]
\[
\|x\|_2 := \sqrt{\sum_{i=1}^n |x_i|^2};
\]
\[
\|x\|_\infty := \max_{1 \le i \le n} |x_i|.
\]

Clearly, norm is an abstraction and extension of our usual concept of length in 3-dimensional Euclidean space. So a norm of a vector is a measure of the vector "length"; for example, $\|x\|_2$ is the Euclidean distance of the vector $x$ from the origin. Similarly, we can introduce some kind of measure for a matrix.

Let $A = [a_{ij}] \in \mathbb{C}^{m \times n}$; then the matrix norm induced by a vector $p$-norm is defined as
\[
\|A\|_p := \sup_{x \ne 0} \frac{\|Ax\|_p}{\|x\|_p}.
\]
In particular, for $p = 1, 2, \infty$, the corresponding induced matrix norm can be computed as
\[
\|A\|_1 = \max_{1 \le j \le n} \sum_{i=1}^m |a_{ij}| \quad \text{(column sum)};
\]
\[
\|A\|_2 = \sqrt{\lambda_{\max}(A^*A)};
\]
\[
\|A\|_\infty = \max_{1 \le i \le m} \sum_{j=1}^n |a_{ij}| \quad \text{(row sum)}.
\]

The matrix norms induced by vector $p$-norms are sometimes called induced $p$-norms. This is because $\|A\|_p$ is defined by, or induced from, a vector $p$-norm. In fact, $A$ can be viewed as a mapping from a vector space $\mathbb{C}^n$ equipped with a vector norm $\|\cdot\|_p$ to another vector space $\mathbb{C}^m$ equipped with a vector norm $\|\cdot\|_p$. So from a system theoretical point of view, the induced norms have the interpretation of input/output amplification gains.

We shall adopt the following convention throughout the book for the vector and matrix norms unless specified otherwise: let $x \in \mathbb{C}^n$ and $A \in \mathbb{C}^{m \times n}$; then we shall denote the Euclidean 2-norm of $x$ simply by
\[
\|x\| := \|x\|_2
\]
and the induced 2-norm of $A$ by
\[
\|A\| := \|A\|_2.
\]

The Euclidean 2-norm has some very nice properties:

Lemma 2.8 Let $x \in \mathbb{F}^n$ and $y \in \mathbb{F}^m$.

1. Suppose $n \ge m$. Then $\|x\| = \|y\|$ iff there is a matrix $U \in \mathbb{F}^{n \times m}$ such that $x = Uy$ and $U^*U = I$.

2. Suppose $n = m$. Then $|x^*y| \le \|x\|\,\|y\|$. Moreover, the equality holds iff $x = \alpha y$ for some $\alpha \in \mathbb{F}$ or $y = 0$.

3. $\|x\| \le \|y\|$ iff there is a matrix $\Delta \in \mathbb{F}^{n \times m}$ with $\|\Delta\| \le 1$ such that $x = \Delta y$. Furthermore, $\|x\| < \|y\|$ iff $\|\Delta\| < 1$.

4. $\|Ux\| = \|x\|$ for any appropriately dimensioned unitary matrix $U$.

Another often used matrix norm is the so-called Frobenius norm. It is defined as
$$\|A\|_F := \sqrt{\operatorname{Trace}(A^*A)} = \sqrt{\sum_{i=1}^m \sum_{j=1}^n |a_{ij}|^2}.$$
However, the Frobenius norm is not an induced norm.

The following properties of matrix norms are easy to show:

Lemma 2.9 Let $A$ and $B$ be any matrices with appropriate dimensions. Then

1. $\rho(A) \le \|A\|$ (this is also true for the $F$-norm and any induced matrix norm);

2. $\|AB\| \le \|A\|\,\|B\|$. In particular, this gives $\|A^{-1}\| \ge \|A\|^{-1}$ if $A$ is invertible. (This is also true for any induced matrix norm.)

3. $\|UAV\| = \|A\|$ and $\|UAV\|_F = \|A\|_F$ for any appropriately dimensioned unitary matrices $U$ and $V$;

4. $\|AB\|_F \le \|A\|\,\|B\|_F$ and $\|AB\|_F \le \|B\|\,\|A\|_F$.

Note that although pre-multiplication or post-multiplication by a unitary matrix does not change the induced 2-norm or $F$-norm of a matrix, it does change its eigenvalues. For example, let
$$A = \begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix}.$$
Then $\lambda_1(A) = 1$, $\lambda_2(A) = 0$. Now let
$$U = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix};$$
then $U$ is a unitary matrix and
$$UA = \begin{bmatrix} \sqrt{2} & 0 \\ 0 & 0 \end{bmatrix}$$
with $\lambda_1(UA) = \sqrt{2}$, $\lambda_2(UA) = 0$. This property is useful in some matrix perturbation problems, particularly in the computation of bounds for structured singular values, which will be studied in Chapter 10.
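The $2\times 2$ example above is easy to reproduce numerically; this sketch (not from the book) confirms that the induced 2-norm and $F$-norm survive the unitary multiplication while the eigenvalues do not:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 0.0]])
s = 1 / np.sqrt(2)
U = np.array([[ s, s],
              [-s, s]])   # the unitary matrix from the example

UA = U @ A
# norms are invariant under unitary pre-multiplication
assert np.isclose(np.linalg.norm(UA, 2), np.linalg.norm(A, 2))
assert np.isclose(np.linalg.norm(UA, 'fro'), np.linalg.norm(A, 'fro'))
# but the eigenvalues change: {1, 0} for A versus {sqrt(2), 0} for UA
assert np.allclose(sorted(np.linalg.eigvals(A).real), [0.0, 1.0])
assert np.allclose(sorted(np.linalg.eigvals(UA).real), [0.0, np.sqrt(2)])
```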

Lemma 2.10 Let $A$ be a block partitioned matrix with
$$A = \begin{bmatrix} A_{11} & A_{12} & \cdots & A_{1q} \\ A_{21} & A_{22} & \cdots & A_{2q} \\ \vdots & \vdots & & \vdots \\ A_{m1} & A_{m2} & \cdots & A_{mq} \end{bmatrix} =: [A_{ij}],$$
and let each $A_{ij}$ be an appropriately dimensioned matrix. Then for any induced matrix $p$-norm
$$\|A\|_p \le \left\| \begin{bmatrix} \|A_{11}\|_p & \|A_{12}\|_p & \cdots & \|A_{1q}\|_p \\ \|A_{21}\|_p & \|A_{22}\|_p & \cdots & \|A_{2q}\|_p \\ \vdots & \vdots & & \vdots \\ \|A_{m1}\|_p & \|A_{m2}\|_p & \cdots & \|A_{mq}\|_p \end{bmatrix} \right\|_p. \tag{2.2}$$
Further, the inequality becomes an equality if the $F$-norm is used.

Page 48: Robust and Optimal Control

2.7. Vector Norms and Matrix Norms 31

Proof. It is obvious that if the $F$-norm is used, then the right-hand side of inequality (2.2) equals the left-hand side. Hence only the induced $p$-norm cases, $1 \le p \le \infty$, will be shown. Let a vector $x$ be partitioned consistently with $A$ as
$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_q \end{bmatrix},$$
and note that
$$\|x\|_p = \left\| \begin{bmatrix} \|x_1\|_p \\ \|x_2\|_p \\ \vdots \\ \|x_q\|_p \end{bmatrix} \right\|_p.$$
Then
$$\|[A_{ij}]\|_p := \sup_{\|x\|_p=1} \|[A_{ij}]x\|_p = \sup_{\|x\|_p=1} \left\| \begin{bmatrix} \sum_{j=1}^q A_{1j}x_j \\ \sum_{j=1}^q A_{2j}x_j \\ \vdots \\ \sum_{j=1}^q A_{mj}x_j \end{bmatrix} \right\|_p = \sup_{\|x\|_p=1} \left\| \begin{bmatrix} \left\|\sum_{j=1}^q A_{1j}x_j\right\|_p \\ \left\|\sum_{j=1}^q A_{2j}x_j\right\|_p \\ \vdots \\ \left\|\sum_{j=1}^q A_{mj}x_j\right\|_p \end{bmatrix} \right\|_p$$
$$\le \sup_{\|x\|_p=1} \left\| \begin{bmatrix} \sum_{j=1}^q \|A_{1j}\|_p\|x_j\|_p \\ \sum_{j=1}^q \|A_{2j}\|_p\|x_j\|_p \\ \vdots \\ \sum_{j=1}^q \|A_{mj}\|_p\|x_j\|_p \end{bmatrix} \right\|_p = \sup_{\|x\|_p=1} \left\| \begin{bmatrix} \|A_{11}\|_p & \|A_{12}\|_p & \cdots & \|A_{1q}\|_p \\ \|A_{21}\|_p & \|A_{22}\|_p & \cdots & \|A_{2q}\|_p \\ \vdots & \vdots & & \vdots \\ \|A_{m1}\|_p & \|A_{m2}\|_p & \cdots & \|A_{mq}\|_p \end{bmatrix} \begin{bmatrix} \|x_1\|_p \\ \|x_2\|_p \\ \vdots \\ \|x_q\|_p \end{bmatrix} \right\|_p$$
$$\le \sup_{\|x\|_p=1} \left\| \left[ \|A_{ij}\|_p \right] \right\|_p \|x\|_p = \left\| \left[ \|A_{ij}\|_p \right] \right\|_p. \qquad \Box$$
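Lemma 2.10 can be spot-checked numerically. The sketch below (random blocks of my choosing, not from the book) verifies the induced 2-norm inequality and the Frobenius-norm equality:

```python
import numpy as np

rng = np.random.default_rng(0)
blocks = [[rng.standard_normal((2, 3)) for _ in range(2)] for _ in range(2)]
A = np.block(blocks)                       # assemble the partitioned matrix

# matrix of block norms [ ||A_ij|| ] in the induced 2-norm
N2 = np.array([[np.linalg.norm(b, 2) for b in row] for row in blocks])
assert np.linalg.norm(A, 2) <= np.linalg.norm(N2, 2) + 1e-12

# with the Frobenius norm the inequality becomes an equality
NF = np.array([[np.linalg.norm(b, 'fro') for b in row] for row in blocks])
assert np.isclose(np.linalg.norm(A, 'fro'), np.linalg.norm(NF, 'fro'))
```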


2.8 Singular Value Decomposition

A very useful tool in matrix analysis is the Singular Value Decomposition (SVD). It will be seen that singular values of a matrix are good measures of the "size" of the matrix and that the corresponding singular vectors are good indications of strong/weak input or output directions.

Theorem 2.11 Let $A \in \mathbb{F}^{m\times n}$. There exist unitary matrices
$$U = [u_1, u_2, \ldots, u_m] \in \mathbb{F}^{m\times m}, \qquad V = [v_1, v_2, \ldots, v_n] \in \mathbb{F}^{n\times n}$$
such that
$$A = U\Sigma V^*, \qquad \Sigma = \begin{bmatrix} \Sigma_1 & 0 \\ 0 & 0 \end{bmatrix}$$
where
$$\Sigma_1 = \begin{bmatrix} \sigma_1 & 0 & \cdots & 0 \\ 0 & \sigma_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_p \end{bmatrix}$$
and
$$\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_p \ge 0, \qquad p = \min\{m, n\}.$$

Proof. Let $\sigma = \|A\|$, and without loss of generality assume $m \ge n$. Then, from the definition of $\|A\|$, there exists a $z \in \mathbb{F}^n$ such that
$$\|Az\| = \sigma\|z\|.$$
By Lemma 2.8, there is a matrix $\tilde U \in \mathbb{F}^{m\times n}$ such that $\tilde U^*\tilde U = I$ and
$$Az = \sigma\tilde Uz.$$
Now let
$$x = \frac{z}{\|z\|} \in \mathbb{F}^n, \qquad y = \frac{\tilde Uz}{\|\tilde Uz\|} \in \mathbb{F}^m.$$
We have $Ax = \sigma y$. Let
$$V = \begin{bmatrix} x & V_1 \end{bmatrix} \in \mathbb{F}^{n\times n}$$
and
$$U = \begin{bmatrix} y & U_1 \end{bmatrix} \in \mathbb{F}^{m\times m}$$
be unitary. (Recall that it is always possible to extend an orthonormal set of vectors to an orthonormal basis for the whole space.) Consequently, $U^*AV$ has the following structure:
$$A_1 := U^*AV = \begin{bmatrix} \sigma & w^* \\ 0 & B \end{bmatrix}$$


where $w \in \mathbb{F}^{n-1}$ and $B \in \mathbb{F}^{(m-1)\times(n-1)}$. Since
$$\left\| A_1 \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \right\|_2^2 = \sigma^2 + w^*w,$$
it follows that $\|A_1\|^2 \ge \sigma^2 + w^*w$. But since $\sigma = \|A\| = \|A_1\|$, we must have $w = 0$. An obvious induction argument gives
$$U^*AV = \Sigma.$$
This completes the proof. $\Box$

The $\sigma_i$ is the $i$-th singular value of $A$, and the vectors $u_i$ and $v_j$ are, respectively, the $i$-th left singular vector and the $j$-th right singular vector. It is easy to verify that
$$Av_i = \sigma_i u_i, \qquad A^*u_i = \sigma_i v_i.$$
The above equations can also be written as
$$A^*Av_i = \sigma_i^2 v_i, \qquad AA^*u_i = \sigma_i^2 u_i.$$
Hence $\sigma_i^2$ is an eigenvalue of $AA^*$ or $A^*A$, $u_i$ is an eigenvector of $AA^*$, and $v_i$ is an eigenvector of $A^*A$.
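These relations can be confirmed with `np.linalg.svd` (a sketch on a random matrix, not from the book; note that NumPy returns $V^*$ as `Vh`):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))
U, s, Vh = np.linalg.svd(A)    # A = U @ diag(s) @ Vh, s in decreasing order

for i in range(len(s)):
    u, v = U[:, i], Vh[i, :]
    assert np.allclose(A @ v, s[i] * u)            # A v_i = sigma_i u_i
    assert np.allclose(A.T @ u, s[i] * v)          # A* u_i = sigma_i v_i
    assert np.allclose(A.T @ A @ v, s[i]**2 * v)   # sigma_i^2 is an eigenvalue of A*A
```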

The following notation for singular values is often adopted:
$$\bar\sigma(A) = \sigma_{\max}(A) = \sigma_1 = \text{the largest singular value of } A$$
and
$$\underline\sigma(A) = \sigma_{\min}(A) = \sigma_p = \text{the smallest singular value of } A.$$
Geometrically, the singular values of a matrix $A$ are precisely the lengths of the semi-axes of the hyperellipsoid $E$ defined by
$$E = \{ y : y = Ax,\ x \in \mathbb{C}^n,\ \|x\| = 1 \}.$$
Thus $v_1$ is the direction in which $\|y\|$ is largest for all $\|x\| = 1$, while $v_n$ is the direction in which $\|y\|$ is smallest for all $\|x\| = 1$. From the input/output point of view, $v_1$ ($v_n$) is the highest (lowest) gain input direction, while $u_1$ ($u_m$) is the highest (lowest) gain observing direction. This can be illustrated by the following $2\times 2$ matrix:
$$A = \begin{bmatrix} \cos\theta_1 & -\sin\theta_1 \\ \sin\theta_1 & \cos\theta_1 \end{bmatrix} \begin{bmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{bmatrix} \begin{bmatrix} \cos\theta_2 & -\sin\theta_2 \\ \sin\theta_2 & \cos\theta_2 \end{bmatrix}.$$


It is easy to see that $A$ maps a unit disk to an ellipsoid with semi-axes $\sigma_1$ and $\sigma_2$.

Hence it is often convenient to introduce the following alternative definitions for the largest singular value $\bar\sigma$:
$$\bar\sigma(A) := \max_{\|x\|=1} \|Ax\|$$
and for the smallest singular value $\underline\sigma$ of a tall matrix:
$$\underline\sigma(A) := \min_{\|x\|=1} \|Ax\|.$$

Lemma 2.12 Suppose $A$ and $\Delta$ are square matrices. Then

(i) $|\underline\sigma(A+\Delta) - \underline\sigma(A)| \le \bar\sigma(\Delta)$;

(ii) $\underline\sigma(A\Delta) \ge \underline\sigma(A)\,\underline\sigma(\Delta)$;

(iii) $\bar\sigma(A^{-1}) = \dfrac{1}{\underline\sigma(A)}$ if $A$ is invertible.

Proof.

(i) By definition,
$$\underline\sigma(A+\Delta) := \min_{\|x\|=1} \|(A+\Delta)x\| \ge \min_{\|x\|=1} \{\|Ax\| - \|\Delta x\|\} \ge \min_{\|x\|=1} \|Ax\| - \max_{\|x\|=1} \|\Delta x\| = \underline\sigma(A) - \bar\sigma(\Delta).$$
Hence $-\bar\sigma(\Delta) \le \underline\sigma(A+\Delta) - \underline\sigma(A)$. The other inequality, $\underline\sigma(A+\Delta) - \underline\sigma(A) \le \bar\sigma(\Delta)$, follows by replacing $A$ by $A+\Delta$ and $\Delta$ by $-\Delta$ in the above proof.

(ii) This follows by noting that
$$\underline\sigma(A\Delta) := \min_{\|x\|=1} \|A\Delta x\| = \sqrt{\min_{\|x\|=1} x^*\Delta^*A^*A\Delta x} \ge \underline\sigma(A) \min_{\|x\|=1} \|\Delta x\| = \underline\sigma(A)\,\underline\sigma(\Delta).$$

(iii) Let the singular value decomposition of $A$ be $A = U\Sigma V^*$; then $A^{-1} = V\Sigma^{-1}U^*$. Hence $\bar\sigma(A^{-1}) = \bar\sigma(\Sigma^{-1}) = 1/\underline\sigma(\Sigma) = 1/\underline\sigma(A)$. $\Box$
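The three perturbation properties are easy to probe with a random test (a sketch of my choosing, not from the book):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
D = 0.1 * rng.standard_normal((4, 4))   # the perturbation Delta

def smax(M): return np.linalg.svd(M, compute_uv=False)[0]
def smin(M): return np.linalg.svd(M, compute_uv=False)[-1]

assert abs(smin(A + D) - smin(A)) <= smax(D) + 1e-12    # property (i)
assert smin(A @ D) >= smin(A) * smin(D) - 1e-12         # property (ii)
assert np.isclose(smax(np.linalg.inv(A)), 1 / smin(A))  # property (iii)
```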

Some useful properties of SVD are collected in the following lemma.

Lemma 2.13 Let $A \in \mathbb{F}^{m\times n}$ and
$$\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > \sigma_{r+1} = \cdots = 0, \qquad r \le \min\{m, n\}.$$
Then

1. $\operatorname{rank}(A) = r$;

2. $\operatorname{Ker}A = \operatorname{span}\{v_{r+1}, \ldots, v_n\}$ and $(\operatorname{Ker}A)^\perp = \operatorname{span}\{v_1, \ldots, v_r\}$;

3. $\operatorname{Im}A = \operatorname{span}\{u_1, \ldots, u_r\}$ and $(\operatorname{Im}A)^\perp = \operatorname{span}\{u_{r+1}, \ldots, u_m\}$;

4. $A \in \mathbb{F}^{m\times n}$ has a dyadic expansion:
$$A = \sum_{i=1}^r \sigma_i u_i v_i^* = U_r \Sigma_r V_r^*$$
where $U_r = [u_1, \ldots, u_r]$, $V_r = [v_1, \ldots, v_r]$, and $\Sigma_r = \operatorname{diag}(\sigma_1, \ldots, \sigma_r)$;

5. $\|A\|_F^2 = \sigma_1^2 + \sigma_2^2 + \cdots + \sigma_r^2$;

6. $\|A\| = \sigma_1$;

7. $\sigma_i(U_0AV_0) = \sigma_i(A)$, $i = 1, \ldots, p$, for any appropriately dimensioned unitary matrices $U_0$ and $V_0$;

8. Let $k < r = \operatorname{rank}(A)$ and $A_k := \sum_{i=1}^k \sigma_i u_i v_i^*$; then
$$\min_{\operatorname{rank}(B) \le k} \|A - B\| = \|A - A_k\| = \sigma_{k+1}.$$

Proof. We shall only give a proof for part 8. It is easy to see that $\operatorname{rank}(A_k) \le k$ and $\|A - A_k\| = \sigma_{k+1}$. Hence, we only need to show that $\min_{\operatorname{rank}(B)\le k} \|A - B\| \ge \sigma_{k+1}$. Let $B$ be any matrix such that $\operatorname{rank}(B) \le k$. Then
$$\|A - B\| = \|U\Sigma V^* - B\| = \|\Sigma - U^*BV\| \ge \left\| \begin{bmatrix} I_{k+1} & 0 \end{bmatrix}(\Sigma - U^*BV)\begin{bmatrix} I_{k+1} \\ 0 \end{bmatrix} \right\| = \left\| \Sigma_{k+1} - \hat B \right\|$$
where $\hat B = \begin{bmatrix} I_{k+1} & 0 \end{bmatrix}U^*BV\begin{bmatrix} I_{k+1} \\ 0 \end{bmatrix} \in \mathbb{F}^{(k+1)\times(k+1)}$, $\Sigma_{k+1} = \operatorname{diag}(\sigma_1, \ldots, \sigma_{k+1})$, and $\operatorname{rank}(\hat B) \le k$. Let $x \in \mathbb{F}^{k+1}$ be such that $\hat Bx = 0$ and $\|x\| = 1$. Then
$$\|A - B\| \ge \left\| \Sigma_{k+1} - \hat B \right\| \ge \left\| (\Sigma_{k+1} - \hat B)x \right\| = \|\Sigma_{k+1}x\| \ge \sigma_{k+1}.$$
Since $B$ is arbitrary, the conclusion follows. $\Box$
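Part 8 is the familiar optimal low-rank approximation property. A short sketch (random data of my choosing, not from the book) builds $A_k$ from the truncated dyadic expansion and checks the error:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 4))
U, s, Vh = np.linalg.svd(A)
k = 2

# A_k = sum of the k leading terms sigma_i u_i v_i* of the dyadic expansion
Ak = U[:, :k] @ np.diag(s[:k]) @ Vh[:k, :]
assert np.linalg.matrix_rank(Ak) == k
# the induced 2-norm error is exactly the first neglected singular value
assert np.isclose(np.linalg.norm(A - Ak, 2), s[k])
```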


2.9 Generalized Inverses

Let $A \in \mathbb{C}^{m\times n}$. A matrix $X \in \mathbb{C}^{n\times m}$ is said to be a right inverse of $A$ if $AX = I$. Obviously, $A$ has a right inverse iff $A$ has full row rank, and, in that case, one of the right inverses is given by $X = A^*(AA^*)^{-1}$. Similarly, if $YA = I$, then $Y$ is called a left inverse of $A$. By duality, $A$ has a left inverse iff $A$ has full column rank, and, furthermore, one of the left inverses is $Y = (A^*A)^{-1}A^*$. Note that right (or left) inverses are not necessarily unique. For example, any matrix of the form $\begin{bmatrix} I \\ ? \end{bmatrix}$ is a right inverse of $\begin{bmatrix} I & 0 \end{bmatrix}$.

More generally, if a matrix $A$ has neither full row rank nor full column rank, then none of the ordinary matrix inverses exist; however, the so-called pseudo-inverse, known also as the Moore-Penrose inverse, is useful. This pseudo-inverse is denoted by $A^+$ and satisfies the following conditions:

(i) $AA^+A = A$;

(ii) $A^+AA^+ = A^+$;

(iii) $(AA^+)^* = AA^+$;

(iv) $(A^+A)^* = A^+A$.

It can be shown that the pseudo-inverse is unique. One way of computing $A^+$ is by writing
$$A = BC$$
so that $B$ has full column rank and $C$ has full row rank. Then
$$A^+ = C^*(CC^*)^{-1}(B^*B)^{-1}B^*.$$
Another way to compute $A^+$ is by using the SVD. Suppose $A$ has a singular value decomposition
$$A = U\Sigma V^*$$
with
$$\Sigma = \begin{bmatrix} \Sigma_r & 0 \\ 0 & 0 \end{bmatrix}, \qquad \Sigma_r > 0.$$
Then $A^+ = V\Sigma^+U^*$ with
$$\Sigma^+ = \begin{bmatrix} \Sigma_r^{-1} & 0 \\ 0 & 0 \end{bmatrix}.$$

2.10 Semidefinite Matrices

A square hermitian matrix $A = A^*$ is said to be positive definite (semi-definite), denoted by $A > 0$ ($\ge 0$), if $x^*Ax > 0$ ($\ge 0$) for all $x \ne 0$. Suppose $A \in \mathbb{F}^{n\times n}$ and $A = A^* \ge 0$; then there exists a $B \in \mathbb{F}^{n\times r}$ with $r \ge \operatorname{rank}(A)$ such that $A = BB^*$.

Lemma 2.14 Let $B \in \mathbb{F}^{m\times n}$ and $C \in \mathbb{F}^{k\times n}$. Suppose $m \ge k$ and $B^*B = C^*C$. Then there exists a matrix $U \in \mathbb{F}^{m\times k}$ such that $U^*U = I$ and $B = UC$.


Proof. Let $V_1$ and $V_2$ be unitary matrices such that
$$B = V_1 \begin{bmatrix} B_1 \\ 0 \end{bmatrix}, \qquad C = V_2 \begin{bmatrix} C_1 \\ 0 \end{bmatrix}$$
where $B_1$ and $C_1$ have full row rank. Then $B_1$ and $C_1$ have the same number of rows, and $V_3 := B_1C_1^*(C_1C_1^*)^{-1}$ satisfies $V_3^*V_3 = I$ since $B^*B = C^*C$. Hence $V_3$ is a unitary matrix and $V_3^*B_1 = C_1$. Finally, let
$$U = V_1 \begin{bmatrix} V_3 & 0 \\ 0 & V_4 \end{bmatrix} V_2^*$$
for any suitably dimensioned $V_4$ such that $V_4^*V_4 = I$. $\Box$

We can define the square root of a positive semi-definite matrix $A$, $A^{1/2} = (A^{1/2})^* \ge 0$, by
$$A = A^{1/2}A^{1/2}.$$
Clearly, $A^{1/2}$ can be computed by using spectral decomposition or SVD: let $A = U\Lambda U^*$; then
$$A^{1/2} = U\Lambda^{1/2}U^*$$
where
$$\Lambda = \operatorname{diag}\{\lambda_1, \ldots, \lambda_n\}, \qquad \Lambda^{1/2} = \operatorname{diag}\{\sqrt{\lambda_1}, \ldots, \sqrt{\lambda_n}\}.$$

Lemma 2.15 Suppose $A = A^* > 0$ and $B = B^* \ge 0$. Then $A > B$ iff $\rho(BA^{-1}) < 1$.

Proof. Since $A > 0$, we have $A > B$ iff
$$0 < I - A^{-1/2}BA^{-1/2} = I - A^{-1/2}(BA^{-1})A^{1/2}.$$
However, $A^{-1/2}BA^{-1/2}$ and $BA^{-1}$ are similar, and hence $\lambda_i(BA^{-1}) = \lambda_i(A^{-1/2}BA^{-1/2})$. Therefore, the conclusion follows from the fact that $0 < I - A^{-1/2}BA^{-1/2}$ iff $\rho(A^{-1/2}BA^{-1/2}) < 1$ iff $\rho(BA^{-1}) < 1$. $\Box$
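The equivalence in Lemma 2.15 is easy to probe numerically (a sketch with random hermitian matrices of my choosing, not from the book):

```python
import numpy as np

rng = np.random.default_rng(6)
M = rng.standard_normal((3, 3))
A = M @ M.T + 3 * np.eye(3)     # A = A* > 0
N = rng.standard_normal((3, 3))
B = N @ N.T                     # B = B* >= 0

spec_rad = max(abs(np.linalg.eigvals(B @ np.linalg.inv(A))))
A_gt_B = bool(np.all(np.linalg.eigvalsh(A - B) > 0))
# Lemma 2.15: A > B iff rho(B A^{-1}) < 1
assert A_gt_B == (spec_rad < 1)
```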

Lemma 2.16 Let $X = X^* \ge 0$ be partitioned as
$$X = \begin{bmatrix} X_{11} & X_{12} \\ X_{12}^* & X_{22} \end{bmatrix}.$$
Then $\operatorname{Ker}X_{22} \subset \operatorname{Ker}X_{12}$. Consequently, if $X_{22}^+$ is the pseudo-inverse of $X_{22}$, then $Y = X_{12}X_{22}^+$ solves
$$YX_{22} = X_{12}$$
and
$$\begin{bmatrix} X_{11} & X_{12} \\ X_{12}^* & X_{22} \end{bmatrix} = \begin{bmatrix} I & X_{12}X_{22}^+ \\ 0 & I \end{bmatrix} \begin{bmatrix} X_{11} - X_{12}X_{22}^+X_{12}^* & 0 \\ 0 & X_{22} \end{bmatrix} \begin{bmatrix} I & 0 \\ X_{22}^+X_{12}^* & I \end{bmatrix}.$$


Proof. Without loss of generality, assume
$$X_{22} = \begin{bmatrix} U_1 & U_2 \end{bmatrix} \begin{bmatrix} \Lambda_1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} U_1^* \\ U_2^* \end{bmatrix}$$
with $\Lambda_1 = \Lambda_1^* > 0$ and $U = \begin{bmatrix} U_1 & U_2 \end{bmatrix}$ unitary. Then
$$\operatorname{Ker}X_{22} = \operatorname{span}\{\text{columns of } U_2\}$$
and
$$X_{22}^+ = \begin{bmatrix} U_1 & U_2 \end{bmatrix} \begin{bmatrix} \Lambda_1^{-1} & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} U_1^* \\ U_2^* \end{bmatrix}.$$
Moreover,
$$\begin{bmatrix} I & 0 \\ 0 & U^* \end{bmatrix} \begin{bmatrix} X_{11} & X_{12} \\ X_{12}^* & X_{22} \end{bmatrix} \begin{bmatrix} I & 0 \\ 0 & U \end{bmatrix} \ge 0$$
gives $X_{12}U_2 = 0$. Hence $\operatorname{Ker}X_{22} \subset \operatorname{Ker}X_{12}$, and now
$$X_{12}X_{22}^+X_{22} = X_{12}U_1U_1^* = X_{12}U_1U_1^* + X_{12}U_2U_2^* = X_{12}.$$
The factorization follows easily. $\Box$

2.11 Matrix Dilation Problems*

In this section, we consider the following induced 2-norm optimization problem:
$$\min_X \left\| \begin{bmatrix} X & B \\ C & A \end{bmatrix} \right\| \tag{2.3}$$
where $X$, $B$, $C$, and $A$ are constant matrices of compatible dimensions.

The matrix $\begin{bmatrix} X & B \\ C & A \end{bmatrix}$ is a dilation of its sub-matrices, as indicated in the following diagram:

[Diagram: $A$ dilates ("d") to $\begin{bmatrix} B \\ A \end{bmatrix}$ and to $\begin{bmatrix} C & A \end{bmatrix}$, each of which in turn dilates to $\begin{bmatrix} X & B \\ C & A \end{bmatrix}$; the reverse arrows are compressions ("c").]

In this diagram, "c" stands for the operation of compression and "d" stands for dilation. Compression is always norm non-increasing and dilation is always norm non-decreasing. Sometimes dilation can be made norm preserving. Norm preserving dilations are the focus of this section.

The simplest matrix dilation problem occurs when solving
$$\min_X \left\| \begin{bmatrix} X \\ A \end{bmatrix} \right\|. \tag{2.4}$$
Although (2.4) is a much simplified version of (2.3), we will see that it contains all the essential features of the general problem. Letting $\gamma_0$ denote the minimum norm in (2.4), it is immediate that
$$\gamma_0 = \|A\|.$$
The following theorem characterizes all solutions to (2.4).

Theorem 2.17 For every $\gamma \ge \gamma_0$,
$$\left\| \begin{bmatrix} X \\ A \end{bmatrix} \right\| \le \gamma$$
iff there is a $Y$ with $\|Y\| \le 1$ such that
$$X = Y(\gamma^2 I - A^*A)^{1/2}.$$

Proof.
$$\left\| \begin{bmatrix} X \\ A \end{bmatrix} \right\| \le \gamma$$
iff
$$X^*X + A^*A \le \gamma^2 I$$
iff
$$X^*X \le \gamma^2 I - A^*A.$$
Now suppose $X^*X \le \gamma^2 I - A^*A$ and let
$$Y := X\left[(\gamma^2 I - A^*A)^{1/2}\right]^+;$$
then $X = Y(\gamma^2 I - A^*A)^{1/2}$ and $Y^*Y \le I$. Similarly, if $X = Y(\gamma^2 I - A^*A)^{1/2}$ and $Y^*Y \le I$, then $X^*X \le \gamma^2 I - A^*A$. $\Box$

This theorem implies that, in general, (2.4) has more than one solution, which is in contrast to the minimization in the Frobenius norm, for which $X = 0$ is the unique solution. The solution $X = 0$ is the central solution, but others are possible unless $A^*A = \gamma_0^2 I$.


Remark 2.1 The theorem still holds if $(\gamma^2 I - A^*A)^{1/2}$ is replaced by any matrix $R$ such that $\gamma^2 I - A^*A = R^*R$.

A more restricted version of the above theorem is shown in the following corollary.

Corollary 2.18 For every $\gamma > \gamma_0$,
$$\left\| \begin{bmatrix} X \\ A \end{bmatrix} \right\| \le \gamma \ (< \gamma)$$
iff
$$\left\| X(\gamma^2 I - A^*A)^{-1/2} \right\| \le 1 \ (< 1).$$

The corresponding dual results are:

Theorem 2.19 For every $\gamma \ge \gamma_0$,
$$\left\| \begin{bmatrix} X & A \end{bmatrix} \right\| \le \gamma$$
iff there is a $Y$, $\|Y\| \le 1$, such that
$$X = (\gamma^2 I - AA^*)^{1/2}Y.$$

Corollary 2.20 For every $\gamma > \gamma_0$,
$$\left\| \begin{bmatrix} X & A \end{bmatrix} \right\| \le \gamma \ (< \gamma)$$
iff
$$\left\| (\gamma^2 I - AA^*)^{-1/2}X \right\| \le 1 \ (< 1).$$

Now, returning to the problem in (2.3), let
$$\gamma_0 := \min_X \left\| \begin{bmatrix} X & B \\ C & A \end{bmatrix} \right\|. \tag{2.5}$$
The following so-called Parrott's theorem will play an important role in many control-related optimization problems. The proof is a straightforward application of Theorem 2.17 and its dual, Theorem 2.19.

Theorem 2.21 (Parrott's Theorem) The minimum in (2.5) is given by
$$\gamma_0 = \max\left\{ \left\| \begin{bmatrix} C & A \end{bmatrix} \right\|,\ \left\| \begin{bmatrix} B \\ A \end{bmatrix} \right\| \right\}. \tag{2.6}$$


Proof. Denote by $\hat\gamma$ the right-hand side of equation (2.6). Clearly, $\gamma_0 \ge \hat\gamma$ since compressions are norm non-increasing; that $\gamma_0 \le \hat\gamma$ will be shown by using Theorem 2.17 and Theorem 2.19.

Suppose $A \in \mathbb{C}^{n\times m}$ and $n \ge m$ (the case $m > n$ can be shown in the same fashion). Then $A$ has the following singular value decomposition:
$$A = U \begin{bmatrix} \Sigma_m \\ 0_{n-m,m} \end{bmatrix} V^*, \qquad U \in \mathbb{C}^{n\times n}, \ V \in \mathbb{C}^{m\times m}.$$
Hence
$$\hat\gamma^2 I - A^*A = V(\hat\gamma^2 I - \Sigma_m^2)V^*$$
and
$$\hat\gamma^2 I - AA^* = U \begin{bmatrix} \hat\gamma^2 I - \Sigma_m^2 & 0 \\ 0 & \hat\gamma^2 I_{n-m} \end{bmatrix} U^*.$$
Now let
$$(\hat\gamma^2 I - A^*A)^{1/2} := V(\hat\gamma^2 I - \Sigma_m^2)^{1/2}V^*$$
and
$$(\hat\gamma^2 I - AA^*)^{1/2} := U \begin{bmatrix} (\hat\gamma^2 I - \Sigma_m^2)^{1/2} & 0 \\ 0 & \hat\gamma I_{n-m} \end{bmatrix} U^*.$$
Then it is easy to verify that
$$(\hat\gamma^2 I - A^*A)^{1/2}A^* = A^*(\hat\gamma^2 I - AA^*)^{1/2}.$$
Using this equality, we can show that
$$\begin{bmatrix} -A^* & (\hat\gamma^2 I - A^*A)^{1/2} \\ (\hat\gamma^2 I - AA^*)^{1/2} & A \end{bmatrix} \begin{bmatrix} -A^* & (\hat\gamma^2 I - A^*A)^{1/2} \\ (\hat\gamma^2 I - AA^*)^{1/2} & A \end{bmatrix}^* = \begin{bmatrix} \hat\gamma^2 I & 0 \\ 0 & \hat\gamma^2 I \end{bmatrix}.$$
Now we are ready to show that $\gamma_0 \le \hat\gamma$. From Theorem 2.17 we have that $B = Y(\hat\gamma^2 I - A^*A)^{1/2}$ for some $Y$ such that $\|Y\| \le 1$. Similarly, Theorem 2.19 yields $C = (\hat\gamma^2 I - AA^*)^{1/2}Z$ for some $Z$ with $\|Z\| \le 1$. Now let $X = -YA^*Z$. Then
$$\left\| \begin{bmatrix} X & B \\ C & A \end{bmatrix} \right\| = \left\| \begin{bmatrix} -YA^*Z & Y(\hat\gamma^2 I - A^*A)^{1/2} \\ (\hat\gamma^2 I - AA^*)^{1/2}Z & A \end{bmatrix} \right\| = \left\| \begin{bmatrix} Y & 0 \\ 0 & I \end{bmatrix} \begin{bmatrix} -A^* & (\hat\gamma^2 I - A^*A)^{1/2} \\ (\hat\gamma^2 I - AA^*)^{1/2} & A \end{bmatrix} \begin{bmatrix} Z & 0 \\ 0 & I \end{bmatrix} \right\|$$
$$\le \left\| \begin{bmatrix} -A^* & (\hat\gamma^2 I - A^*A)^{1/2} \\ (\hat\gamma^2 I - AA^*)^{1/2} & A \end{bmatrix} \right\| = \hat\gamma.$$
Thus $\gamma_0 \le \hat\gamma$, so $\gamma_0 = \hat\gamma$. $\Box$
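Parrott's theorem and its central solution $X = -YA^*Z$ can be verified numerically. The sketch below (random data of my choosing; `psd_sqrt` is a helper I introduce, not from the book) computes $\hat\gamma$ from (2.6) and checks that the central solution achieves it:

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((2, 2))
B = rng.standard_normal((2, 2))
C = rng.standard_normal((2, 2))
n2 = lambda M: np.linalg.norm(M, 2)

gamma = max(n2(np.hstack([C, A])), n2(np.vstack([B, A])))   # right side of (2.6)

def psd_sqrt(M):
    # square root of a symmetric positive semi-definite matrix
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.sqrt(np.clip(w, 0, None))) @ V.T

g2I = gamma**2 * np.eye(2)
Y = B @ np.linalg.pinv(psd_sqrt(g2I - A.T @ A))   # from Theorem 2.17
Z = np.linalg.pinv(psd_sqrt(g2I - A @ A.T)) @ C   # from Theorem 2.19
X = -Y @ A.T @ Z                                  # central solution

big = np.block([[X, B], [C, A]])
assert n2(big) <= gamma + 1e-8    # the central solution achieves gamma...
assert gamma <= n2(big) + 1e-12   # ...which is also a lower bound (compression)
```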

This theorem gives one solution to (2.3) and an expression for $\gamma_0$. As in (2.4), there may be more than one solution to (2.3), although the proof of Theorem 2.21 exhibits only one. Theorem 2.22 considers the problem of parameterizing all solutions. The solution $X = -YA^*Z$ is the "central" solution analogous to $X = 0$ in (2.4).

Theorem 2.22 Suppose $\gamma \ge \gamma_0$. The solutions $X$ such that
$$\left\| \begin{bmatrix} X & B \\ C & A \end{bmatrix} \right\| \le \gamma \tag{2.7}$$
are exactly those of the form
$$X = -YA^*Z + \gamma(I - YY^*)^{1/2}W(I - Z^*Z)^{1/2} \tag{2.8}$$
where $W$ is an arbitrary contraction ($\|W\| \le 1$), and $Y$ with $\|Y\| \le 1$ and $Z$ with $\|Z\| \le 1$ solve the linear equations
$$B = Y(\gamma^2 I - A^*A)^{1/2}, \tag{2.9}$$
$$C = (\gamma^2 I - AA^*)^{1/2}Z. \tag{2.10}$$

Proof. Since $\gamma \ge \gamma_0$, again from Theorem 2.19 there exists a $Z$ with $\|Z\| \le 1$ such that
$$C = (\gamma^2 I - AA^*)^{1/2}Z.$$
Note that using the above expression for $C$ we have
$$\gamma^2 I - \begin{bmatrix} C & A \end{bmatrix}^* \begin{bmatrix} C & A \end{bmatrix} = \begin{bmatrix} \gamma(I - Z^*Z)^{1/2} & 0 \\ -A^*Z & (\gamma^2 I - A^*A)^{1/2} \end{bmatrix}^* \begin{bmatrix} \gamma(I - Z^*Z)^{1/2} & 0 \\ -A^*Z & (\gamma^2 I - A^*A)^{1/2} \end{bmatrix}.$$
Now apply Theorem 2.17 (Remark 2.1) to inequality (2.7) with respect to the partitioned matrix $\begin{bmatrix} X & B \\ C & A \end{bmatrix}$ to get
$$\begin{bmatrix} X & B \end{bmatrix} = \hat W \begin{bmatrix} \gamma(I - Z^*Z)^{1/2} & 0 \\ -A^*Z & (\gamma^2 I - A^*A)^{1/2} \end{bmatrix}$$
for some contraction $\hat W$, $\|\hat W\| \le 1$. Partition $\hat W$ as $\hat W = \begin{bmatrix} W_1 & Y \end{bmatrix}$ to obtain the expressions for $X$ and $B$:
$$X = -YA^*Z + \gamma W_1(I - Z^*Z)^{1/2},$$
$$B = Y(\gamma^2 I - A^*A)^{1/2}.$$
Then $\|Y\| \le 1$, and the theorem follows by noting that $\left\| \begin{bmatrix} W_1 & Y \end{bmatrix} \right\| \le 1$ iff there is a $W$, $\|W\| \le 1$, such that $W_1 = (I - YY^*)^{1/2}W$. $\Box$

The following corollary gives an alternative version of Theorem 2.22 when $\gamma > \gamma_0$.


Corollary 2.23 For $\gamma > \gamma_0$,
$$\left\| \begin{bmatrix} X & B \\ C & A \end{bmatrix} \right\| \le \gamma \ (< \gamma) \tag{2.11}$$
iff
$$\left\| (I - YY^*)^{-1/2}(X + YA^*Z)(I - Z^*Z)^{-1/2} \right\| \le \gamma \ (< \gamma) \tag{2.12}$$
where
$$Y = B(\gamma^2 I - A^*A)^{-1/2}, \tag{2.13}$$
$$Z = (\gamma^2 I - AA^*)^{-1/2}C. \tag{2.14}$$

Note that in the case of $\gamma > \gamma_0$, $I - YY^*$ and $I - Z^*Z$ are invertible since Corollaries 2.18 and 2.20 clearly show that $\|Y\| < 1$ and $\|Z\| < 1$. There are many alternative characterizations of solutions to (2.11), although the formulas given above seem to be the simplest.

As a straightforward application of the dilation results obtained above, consider the following matrix approximation problem:
$$\gamma_0 = \min_Q \|R + UQV\| \tag{2.15}$$
where $R$, $U$, and $V$ are constant matrices such that $U^*U = I$ and $VV^* = I$.

Corollary 2.24 The minimum achievable norm is given by
$$\gamma_0 = \max\{ \|U_\perp^*R\|,\ \|RV_\perp^*\| \},$$
and the parameterization of all optimal solutions $Q$ can be obtained from Theorem 2.22 with $X = Q + U^*RV^*$, $A = U_\perp^*RV_\perp^*$, $B = U^*RV_\perp^*$, and $C = U_\perp^*RV^*$.

Proof. Let $U_\perp$ and $V_\perp$ be such that $\begin{bmatrix} U & U_\perp \end{bmatrix}$ and $\begin{bmatrix} V \\ V_\perp \end{bmatrix}$ are unitary matrices. Then
$$\gamma_0 = \min_Q \|R + UQV\| = \min_Q \left\| \begin{bmatrix} U & U_\perp \end{bmatrix}^*(R + UQV)\begin{bmatrix} V \\ V_\perp \end{bmatrix}^* \right\| = \min_Q \left\| \begin{bmatrix} U^*RV^* + Q & U^*RV_\perp^* \\ U_\perp^*RV^* & U_\perp^*RV_\perp^* \end{bmatrix} \right\|.$$
The result follows after applying Theorems 2.21 and 2.22. $\Box$

A similar problem arises in $H_\infty$ control theory.


2.12 Notes and References

A very extensive treatment of most topics in this chapter can be found in Brogan [1991], Horn and Johnson [1990, 1991], and Lancaster and Tismenetsky [1985]. Golub and Van Loan's book [1983] contains many numerical algorithms for solving most of the problems in this chapter. The matrix dilation theory can be found in Davis, Kahan, and Weinberger [1982].


3 Linear Dynamical Systems

This chapter reviews some basic system theoretical concepts. The notions of controllability, observability, stabilizability, and detectability are defined, and various algebraic and geometric characterizations of these notions are summarized. Kalman canonical decomposition, pole placement, and observer theory are then introduced. The solutions of Lyapunov equations and their connections with system stability, controllability, and so on are discussed. System interconnections and realizations, in particular the balanced realization, are studied in some detail. Finally, the concepts of system poles and zeros are introduced.

3.1 Descriptions of Linear Dynamical Systems

Let a finite dimensional linear time invariant (FDLTI) dynamical system be described by the following linear constant coefficient differential equations:
$$\dot x = Ax + Bu, \qquad x(t_0) = x_0 \tag{3.1}$$
$$y = Cx + Du \tag{3.2}$$
where $x(t) \in \mathbb{R}^n$ is called the system state, $x(t_0)$ is called the initial condition of the system, $u(t) \in \mathbb{R}^m$ is called the system input, and $y(t) \in \mathbb{R}^p$ is the system output. $A$, $B$, $C$, and $D$ are appropriately dimensioned real constant matrices. A dynamical system with a single input ($m = 1$) and single output ($p = 1$) is called a SISO (single-input single-output) system; otherwise, it is called a MIMO (multiple-input multiple-output) system. The corresponding transfer matrix from $u$ to $y$ is defined as
$$Y(s) = G(s)U(s)$$
where $U(s)$ and $Y(s)$ are the Laplace transforms of $u(t)$ and $y(t)$ with zero initial condition ($x(0) = 0$). Hence, we have
$$G(s) = C(sI - A)^{-1}B + D.$$
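As a small sketch (with an example system of my choosing, not from the book), the formula $G(s) = C(sI - A)^{-1}B + D$ can be evaluated directly at a point in the complex plane:

```python
import numpy as np

# companion-form realization of G(s) = 1 / (s^2 + 3s + 2)  (my example)
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

s = 1.0 + 2.0j
G = C @ np.linalg.inv(s * np.eye(2) - A) @ B + D   # G(s) = C (sI - A)^{-1} B + D
assert np.isclose(G[0, 0], 1 / (s**2 + 3*s + 2))
```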

Note that the system equations (3.1) and (3.2) can be written in a more compact matrix form:
$$\begin{bmatrix} \dot x \\ y \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} x \\ u \end{bmatrix}.$$
To expedite calculations involving transfer matrices, the notation
$$\begin{bmatrix} A & B \\ \hline C & D \end{bmatrix} := C(sI - A)^{-1}B + D$$
will be used. Other reasons for using this notation will be discussed in Chapter 10. Note that
$$\begin{bmatrix} A & B \\ C & D \end{bmatrix}$$
is a real block matrix, not a transfer function.

Now given the initial condition $x(t_0)$ and the input $u(t)$, the dynamical system response $x(t)$ and $y(t)$ for $t \ge t_0$ can be determined from the following formulas:
$$x(t) = e^{A(t-t_0)}x(t_0) + \int_{t_0}^t e^{A(t-\tau)}Bu(\tau)\,d\tau \tag{3.3}$$
$$y(t) = Cx(t) + Du(t). \tag{3.4}$$
In the case of $u(t) = 0$ for all $t \ge t_0$, it is easy to see from the solution that for any $t_1 \ge t_0$ and $t \ge t_0$, we have
$$x(t) = e^{A(t-t_1)}x(t_1).$$
Therefore, the matrix function $\Phi(t, t_1) = e^{A(t-t_1)}$ acts as a transformation from one state to another, and thus $\Phi(t, t_1)$ is usually called the state transition matrix. Since the state of a linear system at one time can be obtained from the state at another through the transition matrix, we can assume without loss of generality that $t_0 = 0$. This will be assumed in the sequel.

The impulse matrix of the dynamical system is defined as
$$g(t) = \mathcal{L}^{-1}\{G(s)\} = Ce^{At}B1_+(t) + D\delta(t)$$
where $\delta(t)$ is the unit impulse and $1_+(t)$ is the unit step defined as
$$1_+(t) := \begin{cases} 1, & t \ge 0; \\ 0, & t < 0. \end{cases}$$


The input/output relationship (i.e., with zero initial state: $x_0 = 0$) can be described by the convolution equation
$$y(t) = (g * u)(t) := \int_{-\infty}^{\infty} g(t-\tau)u(\tau)\,d\tau = \int_{-\infty}^{t} g(t-\tau)u(\tau)\,d\tau.$$

3.2 Controllability and Observability

We now turn to some very important concepts in linear system theory.

Definition 3.1 The dynamical system described by equation (3.1) or the pair $(A, B)$ is said to be controllable if, for any initial state $x(0) = x_0$, $t_1 > 0$, and final state $x_1$, there exists a (piecewise continuous) input $u(\cdot)$ such that the solution of (3.1) satisfies $x(t_1) = x_1$. Otherwise, the system or the pair $(A, B)$ is said to be uncontrollable.

The controllability (or observability, introduced next) of a system can be verified through some algebraic or geometric criteria.

Theorem 3.1 The following are equivalent:

(i) $(A, B)$ is controllable.

(ii) The matrix
$$W_c(t) := \int_0^t e^{A\tau}BB^*e^{A^*\tau}\,d\tau$$
is positive definite for any $t > 0$.

(iii) The controllability matrix
$$\mathcal{C} = \begin{bmatrix} B & AB & A^2B & \cdots & A^{n-1}B \end{bmatrix}$$
has full row rank or, in other words, $\langle A \mid \operatorname{Im}B \rangle := \sum_{i=1}^n \operatorname{Im}(A^{i-1}B) = \mathbb{R}^n$.

(iv) The matrix $[A - \lambda I,\ B]$ has full row rank for all $\lambda$ in $\mathbb{C}$.

(v) Let $\lambda$ and $x$ be any eigenvalue and any corresponding left eigenvector of $A$, i.e., $x^*A = x^*\lambda$; then $x^*B \ne 0$.

(vi) The eigenvalues of $A + BF$ can be freely assigned (with the restriction that complex eigenvalues are in conjugate pairs) by a suitable choice of $F$.
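Conditions (iii) and (iv) translate directly into rank computations. A sketch (with an example pair of my choosing, not from the book):

```python
import numpy as np

A = np.array([[-1.0, 1.0],
              [ 0.0, -2.0]])
B = np.array([[0.0], [1.0]])
n = A.shape[0]

# (iii): controllability matrix [B, AB, ..., A^{n-1}B] has full row rank
ctrb = np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(n)])
assert np.linalg.matrix_rank(ctrb) == n

# (iv): the PBH test [A - lambda I, B] keeps full row rank at each eigenvalue
for lam in np.linalg.eigvals(A):
    pbh = np.hstack([A - lam * np.eye(n), B])
    assert np.linalg.matrix_rank(pbh) == n
```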


Proof.

(i) $\Leftrightarrow$ (ii): Suppose $W_c(t_1) > 0$ for some $t_1 > 0$, and let the input be defined as
$$u(\tau) = -B^*e^{A^*(t_1-\tau)}W_c(t_1)^{-1}(e^{At_1}x_0 - x_1).$$
Then it is easy to verify using the formula in (3.3) that $x(t_1) = x_1$. Since $x_1$ is arbitrary, the pair $(A, B)$ is controllable.

To show that the controllability of $(A, B)$ implies that $W_c(t) > 0$ for any $t > 0$, assume that $(A, B)$ is controllable but $W_c(t_1)$ is singular for some $t_1 > 0$. Since $e^{At}BB^*e^{A^*t} \ge 0$ for all $t$, there exists a real vector $0 \ne v \in \mathbb{R}^n$ such that
$$v^*e^{At}B = 0, \quad t \in [0, t_1].$$
Now let $x(t_1) = x_1 = 0$; then from the solution (3.3), we have
$$0 = e^{At_1}x(0) + \int_0^{t_1} e^{A(t_1-\tau)}Bu(\tau)\,d\tau.$$
Pre-multiplying the above equation by $v^*$ gives
$$0 = v^*e^{At_1}x(0).$$
If we choose the initial state $x(0) = e^{-At_1}v$, then $0 = v^*v$, so $v = 0$, and this is a contradiction. Hence, $W_c(t)$ cannot be singular for any $t > 0$.

(ii) $\Leftrightarrow$ (iii): Suppose $W_c(t) > 0$ for all $t > 0$ (in fact, it can be shown that $W_c(t) > 0$ for all $t > 0$ iff $W_c(t_1) > 0$ for some $t_1$) but the controllability matrix $\mathcal{C}$ does not have full row rank. Then there exists a $v \in \mathbb{R}^n$ such that
$$v^*A^iB = 0$$
for all $0 \le i \le n-1$. In fact, this equality holds for all $i \ge 0$ by the Cayley-Hamilton Theorem. Hence,
$$v^*e^{At}B = 0$$
for all $t$ or, equivalently, $v^*W_c(t) = 0$ for all $t$; this is a contradiction, and hence the controllability matrix $\mathcal{C}$ must have full row rank. Conversely, suppose $\mathcal{C}$ has full row rank but $W_c(t_1)$ is singular for some $t_1$. Then there exists a $0 \ne v \in \mathbb{R}^n$ such that $v^*e^{At}B = 0$ for all $t \in [0, t_1]$. Therefore, setting $t = 0$, we have
$$v^*B = 0.$$
Next, evaluate the $i$-th derivative of $v^*e^{At}B = 0$ at $t = 0$ to get
$$v^*A^iB = 0, \quad i > 0.$$
Hence, we have
$$v^*\begin{bmatrix} B & AB & A^2B & \cdots & A^{n-1}B \end{bmatrix} = 0$$
or, in other words, the controllability matrix $\mathcal{C}$ does not have full row rank. This is again a contradiction.


(iii) $\Rightarrow$ (iv): Suppose, on the contrary, that the matrix $\begin{bmatrix} A - \lambda I & B \end{bmatrix}$ does not have full row rank for some $\lambda \in \mathbb{C}$. Then there exists a vector $x \in \mathbb{C}^n$ such that
$$x^*\begin{bmatrix} A - \lambda I & B \end{bmatrix} = 0,$$
i.e., $x^*A = \lambda x^*$ and $x^*B = 0$. However, this will result in
$$x^*\begin{bmatrix} B & AB & \cdots & A^{n-1}B \end{bmatrix} = \begin{bmatrix} x^*B & \lambda x^*B & \cdots & \lambda^{n-1}x^*B \end{bmatrix} = 0,$$
i.e., the controllability matrix $\mathcal{C}$ does not have full row rank, and this is a contradiction.

(iv) $\Rightarrow$ (v): This is obvious from the proof of (iii) $\Rightarrow$ (iv).

(v) $\Rightarrow$ (iii): We will again prove this by contradiction. Assume that (v) holds but $\operatorname{rank}\mathcal{C} = k < n$. Then, in Section 3.3, we will show that there is a transformation $T$ such that
$$TAT^{-1} = \begin{bmatrix} \bar A_c & \bar A_{12} \\ 0 & \bar A_{\bar c} \end{bmatrix}, \qquad TB = \begin{bmatrix} \bar B_c \\ 0 \end{bmatrix}$$
with $\bar A_{\bar c} \in \mathbb{R}^{(n-k)\times(n-k)}$. Let $\lambda_1$ and $x_{\bar c}$ be any eigenvalue and any corresponding left eigenvector of $\bar A_{\bar c}$, i.e., $x_{\bar c}^*\bar A_{\bar c} = \lambda_1 x_{\bar c}^*$. Then $x^*(TB) = 0$ and
$$x = \begin{bmatrix} 0 \\ x_{\bar c} \end{bmatrix}$$
is a left eigenvector of $TAT^{-1}$ corresponding to the eigenvalue $\lambda_1$, which implies that $(TAT^{-1}, TB)$ is not controllable. This is a contradiction since a similarity transformation does not change controllability. Hence, the proof is completed.

(vi) $\Rightarrow$ (i): This follows the same arguments as in the proof of (v) $\Rightarrow$ (iii): assume that (vi) holds but $(A, B)$ is uncontrollable. Then there is a decomposition so that some subsystems are not affected by the control, but this contradicts condition (vi).

(i) $\Rightarrow$ (vi): This will be clear in Section 3.4. In that section, we will explicitly construct a matrix $F$ so that the eigenvalues of $A + BF$ are in the desired locations. $\Box$

Definition 3.2 An unforced dynamical system $\dot x = Ax$ is said to be stable if all the eigenvalues of $A$ are in the open left half plane, i.e., $\operatorname{Re}\lambda(A) < 0$. A matrix $A$ with such a property is said to be stable or Hurwitz.


Definition 3.3 The dynamical system (3.1), or the pair $(A, B)$, is said to be stabilizable if there exists a state feedback $u = Fx$ such that the system is stable, i.e., $A + BF$ is stable.

Therefore, it is more appropriate to call this stabilizability the state feedback stabilizability to differentiate it from the output feedback stabilizability defined later.

The following theorem is a consequence of Theorem 3.1.

Theorem 3.2 The following are equivalent:

(i) $(A, B)$ is stabilizable.

(ii) The matrix $[A - \lambda I,\ B]$ has full row rank for all $\operatorname{Re}\lambda \ge 0$.

(iii) For all $\lambda$ and $x$ such that $x^*A = x^*\lambda$ and $\operatorname{Re}\lambda \ge 0$, $x^*B \ne 0$.

(iv) There exists a matrix $F$ such that $A + BF$ is Hurwitz.

We now consider the dual notions of observability and detectability of the system described by equations (3.1) and (3.2).

Definition 3.4 The dynamical system described by the equations (3.1) and (3.2), or by the pair $(C, A)$, is said to be observable if, for any $t_1 > 0$, the initial state $x(0) = x_0$ can be determined from the time history of the input $u(t)$ and the output $y(t)$ in the interval $[0, t_1]$. Otherwise, the system, or $(C, A)$, is said to be unobservable.

Theorem 3.3 The following are equivalent:

(i) $(C, A)$ is observable.

(ii) The matrix
$$W_o(t) := \int_0^t e^{A^*\tau}C^*Ce^{A\tau}\,d\tau$$
is positive definite for any $t > 0$.

(iii) The observability matrix
$$\mathcal{O} = \begin{bmatrix} C \\ CA \\ CA^2 \\ \vdots \\ CA^{n-1} \end{bmatrix}$$
has full column rank or $\bigcap_{i=1}^n \operatorname{Ker}(CA^{i-1}) = 0$.

(iv) The matrix
$$\begin{bmatrix} A - \lambda I \\ C \end{bmatrix}$$
has full column rank for all $\lambda$ in $\mathbb{C}$.


(v) Let $\lambda$ and $y$ be any eigenvalue and any corresponding right eigenvector of $A$, i.e., $Ay = \lambda y$; then $Cy \ne 0$.

(vi) The eigenvalues of $A + LC$ can be freely assigned (with the restriction that complex eigenvalues are in conjugate pairs) by a suitable choice of $L$.

(vii) $(A^*, C^*)$ is controllable.

Proof. First, we will show the equivalence between conditions (i) and (iii). Once this is done, the rest will follow by duality or condition (vii).

(i) $\Leftarrow$ (iii): Note that given the input $u(t)$ and the initial condition $x_0$, the output in the time interval $[0, t_1]$ is given by
$$y(t) = Ce^{At}x(0) + \int_0^t Ce^{A(t-\tau)}Bu(\tau)\,d\tau + Du(t).$$
Since $y(t)$ and $u(t)$ are known, there is no loss of generality in assuming $u(t) = 0$ for all $t$. Hence,
$$y(t) = Ce^{At}x(0), \quad t \in [0, t_1].$$
From this equation, we have
$$\begin{bmatrix} y(0) \\ \dot y(0) \\ \vdots \\ y^{(n-1)}(0) \end{bmatrix} = \begin{bmatrix} C \\ CA \\ CA^2 \\ \vdots \\ CA^{n-1} \end{bmatrix}x(0)$$
where $y^{(i)}$ stands for the $i$-th derivative of $y$. Since the observability matrix $\mathcal{O}$ has full column rank, there is a unique solution $x(0)$ to the above equation. This completes the proof.

(i) $\Rightarrow$ (iii): This will be proven by contradiction. Assume that $(C, A)$ is observable but that the observability matrix does not have full column rank, i.e., there is a vector $x_0$ such that $\mathcal{O}x_0 = 0$ or, equivalently, $CA^ix_0 = 0$ for all $i \ge 0$ by the Cayley-Hamilton Theorem. Now suppose the initial state $x(0) = x_0$; then $y(t) = Ce^{At}x(0) = 0$. This implies that the system is not observable since $x(0)$ cannot be determined from $y(t) \equiv 0$. $\Box$

Definition 3.5 The system, or the pair $(C, A)$, is detectable if $A + LC$ is stable for some $L$.


Theorem 3.4 The following are equivalent:

(i) $(C, A)$ is detectable.

(ii) The matrix $\begin{bmatrix} A - \lambda I \\ C \end{bmatrix}$ has full column rank for all $\operatorname{Re}\lambda \ge 0$.

(iii) For all $\lambda$ and $x$ such that $Ax = \lambda x$ and $\operatorname{Re}\lambda \ge 0$, $Cx \ne 0$.

(iv) There exists a matrix $L$ such that $A + LC$ is Hurwitz.

(v) $(A^*, C^*)$ is stabilizable.

The conditions (iv) and (v) of Theorem 3.1 and Theorem 3.3 and the conditions (ii) and (iii) of Theorem 3.2 and Theorem 3.4 are often called Popov-Belevitch-Hautus (PBH) tests. In particular, the following definitions of modal controllability and observability are often useful.

Definition 3.6 Let $\lambda$ be an eigenvalue of $A$ or, equivalently, a mode of the system. Then the mode $\lambda$ is said to be controllable (observable) if $x^*B \ne 0$ ($Cx \ne 0$) for all left (right) eigenvectors of $A$ associated with $\lambda$, i.e., $x^*A = \lambda x^*$ ($Ax = \lambda x$) and $0 \ne x \in \mathbb{C}^n$. Otherwise, the mode is said to be uncontrollable (unobservable).

It follows that a system is controllable (observable) if and only if every mode is controllable (observable). Similarly, a system is stabilizable (detectable) if and only if every unstable mode is controllable (observable).

For example, consider the following 4th order system:
$$\begin{bmatrix} A & B \\ \hline C & D \end{bmatrix} = \left[ \begin{array}{cccc|c} \lambda_1 & 1 & 0 & 0 & 0 \\ 0 & \lambda_1 & 1 & 0 & 1 \\ 0 & 0 & \lambda_1 & 0 & \alpha \\ 0 & 0 & 0 & \lambda_2 & 1 \\ \hline 1 & 0 & 0 & \beta & 0 \end{array} \right]$$
with $\lambda_1 \ne \lambda_2$. Then the mode $\lambda_1$ is not controllable if $\alpha = 0$, and $\lambda_2$ is not observable if $\beta = 0$. Note that if $\lambda_1 = \lambda_2$, the system is uncontrollable and unobservable for any $\alpha$ and $\beta$ since in that case both
$$x_1 = \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix} \quad \text{and} \quad x_2 = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}$$
are left eigenvectors of $A$ corresponding to $\lambda_1$. Hence any linear combination of $x_1$ and $x_2$ is still a left eigenvector of $A$ corresponding to $\lambda_1$. In particular, let $x = x_1 - \alpha x_2$; then $x^*B = 0$, and as a result, the system is not controllable. Similar arguments can be applied to check observability. However, if the $B$ matrix is changed into a $4\times 2$ matrix with the last two rows independent of each other, then the system is controllable even if $\lambda_1 = \lambda_2$. For example, the reader may easily verify that the system with
$$B = \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ \alpha & 1 \\ 1 & 0 \end{bmatrix}$$
is controllable for any $\alpha$.

In general, for a system given in Jordan canonical form, the controllability and observability can be concluded by inspection. The interested reader may easily derive some explicit conditions for controllability and observability by using the Jordan canonical form and the tests (iv) and (v) of Theorem 3.1 and Theorem 3.3.

3.3 Kalman Canonical Decomposition

There are usually many different coordinate systems in which to describe a dynamical system. For example, consider a simple pendulum: its motion can be uniquely determined either in terms of the angle of the string attached to the pendulum or in terms of the vertical displacement of the pendulum. In most cases, however, the angular displacement is a more natural description than the vertical displacement, in spite of the fact that both describe the same dynamical system. This is true for most physical dynamical systems. On the other hand, although some coordinates may not be natural descriptions of a physical dynamical system, they may make the system analysis and synthesis much easier.

In general, let $T \in \mathbb{R}^{n\times n}$ be a nonsingular matrix and define
\[
\bar{x} = Tx.
\]
Then the original dynamical system equations (3.1) and (3.2) become
\[
\dot{\bar{x}} = TAT^{-1}\bar{x} + TBu
\]
\[
y = CT^{-1}\bar{x} + Du.
\]
These equations represent the same dynamical system for any nonsingular matrix $T$, and hence, we can regard these representations as equivalent. It is easy to see that the input/output transfer matrix is not changed under the coordinate transformation, i.e.,
\[
G(s) = C(sI - A)^{-1}B + D = CT^{-1}(sI - TAT^{-1})^{-1}TB + D.
\]

In this section, we will consider the system structure decomposition using coordinate transformations when the system is not completely controllable and/or not completely observable. To begin with, let us consider further the dynamical systems related by a similarity transformation:
\[
\left[\begin{array}{c|c} A & B \\ \hline C & D \end{array}\right]
\longmapsto
\left[\begin{array}{c|c} \bar{A} & \bar{B} \\ \hline \bar{C} & \bar{D} \end{array}\right]
=
\left[\begin{array}{c|c} TAT^{-1} & TB \\ \hline CT^{-1} & D \end{array}\right].
\]
The controllability and observability matrices are related by
\[
\bar{\mathcal{C}} = T\mathcal{C}, \qquad \bar{\mathcal{O}} = \mathcal{O}T^{-1}.
\]
Moreover, the following theorem follows easily from the PBH tests or from the above relations.

Theorem 3.5 The controllability (or stabilizability) and observability (or detectability) are invariant under similarity transformations.

Using this fact, we can now show the following theorem.

Theorem 3.6 If the controllability matrix $\mathcal{C}$ has rank $k_1 < n$, then there exists a similarity transformation
\[
\bar{x} = \begin{bmatrix} \bar{x}_c \\ \bar{x}_{\bar c} \end{bmatrix} = Tx
\]
such that
\[
\begin{bmatrix} \dot{\bar{x}}_c \\ \dot{\bar{x}}_{\bar c} \end{bmatrix}
=
\begin{bmatrix} \bar{A}_c & \bar{A}_{12} \\ 0 & \bar{A}_{\bar c} \end{bmatrix}
\begin{bmatrix} \bar{x}_c \\ \bar{x}_{\bar c} \end{bmatrix}
+
\begin{bmatrix} \bar{B}_c \\ 0 \end{bmatrix} u
\]
\[
y = \begin{bmatrix} \bar{C}_c & \bar{C}_{\bar c} \end{bmatrix}
\begin{bmatrix} \bar{x}_c \\ \bar{x}_{\bar c} \end{bmatrix} + Du
\]
with $\bar{A}_c \in \mathbb{C}^{k_1\times k_1}$ and $(\bar{A}_c, \bar{B}_c)$ controllable. Moreover,
\[
G(s) = C(sI - A)^{-1}B + D = \bar{C}_c(sI - \bar{A}_c)^{-1}\bar{B}_c + D.
\]

Proof. Since rank $\mathcal{C} = k_1 < n$, the pair $(A,B)$ is not controllable. Now let $q_1, q_2, \ldots, q_{k_1}$ be any linearly independent columns of $\mathcal{C}$. Let $q_i$, $i = k_1+1, \ldots, n$ be any $n - k_1$ linearly independent vectors such that the matrix
\[
Q := \begin{bmatrix} q_1 & \cdots & q_{k_1} & q_{k_1+1} & \cdots & q_n \end{bmatrix}
\]
is nonsingular. Define
\[
T := Q^{-1}.
\]
Then the transformation $\bar{x} = Tx$ will give the desired decomposition. To see that, note that for each $i = 1, 2, \ldots, k_1$, $Aq_i$ can be written as a linear combination of $q_i$, $i = 1, 2, \ldots, k_1$, since $Aq_i$ is a linear combination of the columns of $\mathcal{C}$ by the Cayley-Hamilton theorem. Therefore, we have
\[
AT^{-1} = \begin{bmatrix} Aq_1 & \cdots & Aq_{k_1} & Aq_{k_1+1} & \cdots & Aq_n \end{bmatrix}
= \begin{bmatrix} q_1 & \cdots & q_{k_1} & q_{k_1+1} & \cdots & q_n \end{bmatrix}
\begin{bmatrix} \bar{A}_c & \bar{A}_{12} \\ 0 & \bar{A}_{\bar c} \end{bmatrix}
= T^{-1}\begin{bmatrix} \bar{A}_c & \bar{A}_{12} \\ 0 & \bar{A}_{\bar c} \end{bmatrix}
\]
for some $k_1 \times k_1$ matrix $\bar{A}_c$. Similarly, each column of the matrix $B$ is a linear combination of $q_i$, $i = 1, 2, \ldots, k_1$; hence
\[
B = Q\begin{bmatrix} \bar{B}_c \\ 0 \end{bmatrix} = T^{-1}\begin{bmatrix} \bar{B}_c \\ 0 \end{bmatrix}
\]
for some $\bar{B}_c \in \mathbb{C}^{k_1\times m}$. To show that $(\bar{A}_c, \bar{B}_c)$ is controllable, note that rank $\mathcal{C} = k_1$ and
\[
\mathcal{C} = T^{-1}\begin{bmatrix}
\bar{B}_c & \bar{A}_c\bar{B}_c & \cdots & \bar{A}_c^{k_1-1}\bar{B}_c & \cdots & \bar{A}_c^{n-1}\bar{B}_c \\
0 & 0 & \cdots & 0 & \cdots & 0
\end{bmatrix}.
\]
Since, for each $j \geq k_1$, $\bar{A}_c^j$ is a linear combination of $\bar{A}_c^i$, $i = 0, 1, \ldots, k_1-1$, by the Cayley-Hamilton theorem, we have
\[
\mathrm{rank}\begin{bmatrix} \bar{B}_c & \bar{A}_c\bar{B}_c & \cdots & \bar{A}_c^{k_1-1}\bar{B}_c \end{bmatrix} = k_1,
\]
i.e., $(\bar{A}_c, \bar{B}_c)$ is controllable. $\Box$

A numerically reliable way to find such a transformation $T$ is to use the QR factorization. For example, if the controllability matrix $\mathcal{C}$ has the QR factorization $QR = \mathcal{C}$, then $T = Q^{-1}$.
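As a minimal numerical sketch of this construction (assuming NumPy and SciPy are available, and using a column-pivoted QR as a practical rank-revealing choice — a detail not spelled out in the text), the controllable decomposition of Theorem 3.6 can be computed as follows:

```python
import numpy as np
from scipy.linalg import qr

def controllable_decomposition(A, B, tol=1e-9):
    """Display the controllable/uncontrollable structure of (A, B) using a
    QR factorization of the controllability matrix, a sketch of Theorem 3.6."""
    n = A.shape[0]
    ctrb = np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(n)])
    k = np.linalg.matrix_rank(ctrb, tol)
    # Column-pivoted QR: the first k columns of Q span Im(ctrb),
    # the controllable subspace.
    Q, _, _ = qr(ctrb, pivoting=True)
    T = Q.T                      # T = Q^{-1} since Q is orthogonal
    return T @ A @ Q, T @ B, k

# Illustrative pair: the second state is uncontrollable.
A = np.array([[-1.0, 0.0], [0.0, -2.0]])
B = np.array([[1.0], [0.0]])
Abar, Bbar, k = controllable_decomposition(A, B)
print(k)                         # 1
print(abs(Bbar[1, 0]) < 1e-9)    # rows below the controllable block vanish
```

The zero rows of the transformed $B$ and the block-triangular structure of the transformed $A$ match the decomposition in Theorem 3.6.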

Corollary 3.7 If the system is stabilizable and the controllability matrix $\mathcal{C}$ has rank $k_1 < n$, then there exists a similarity transformation $T$ such that
\[
\left[\begin{array}{c|c} TAT^{-1} & TB \\ \hline CT^{-1} & D \end{array}\right]
=
\left[\begin{array}{cc|c}
\bar{A}_c & \bar{A}_{12} & \bar{B}_c \\
0 & \bar{A}_{\bar c} & 0 \\ \hline
\bar{C}_c & \bar{C}_{\bar c} & D
\end{array}\right]
\]
with $\bar{A}_c \in \mathbb{C}^{k_1\times k_1}$, $(\bar{A}_c, \bar{B}_c)$ controllable, and $\bar{A}_{\bar c}$ stable.

Hence, the state space $\bar{x}$ is partitioned into two orthogonal subspaces
\[
\left\{\begin{bmatrix} \bar{x}_c \\ 0 \end{bmatrix}\right\}
\quad \text{and} \quad
\left\{\begin{bmatrix} 0 \\ \bar{x}_{\bar c} \end{bmatrix}\right\}
\]
with the first subspace controllable from the input and the second completely uncontrollable from the input (i.e., the states $\bar{x}_{\bar c}$ are not affected by the control $u$). Writing these subspaces in terms of the original coordinates $x$, we have
\[
x = T^{-1}\bar{x} = \begin{bmatrix} q_1 & \cdots & q_{k_1} & q_{k_1+1} & \cdots & q_n \end{bmatrix}
\begin{bmatrix} \bar{x}_c \\ \bar{x}_{\bar c} \end{bmatrix}.
\]
So the controllable subspace is the span of $q_i$, $i = 1, \ldots, k_1$ or, equivalently, $\mathrm{Im}\,\mathcal{C}$. On the other hand, the uncontrollable subspace is given by the complement of the controllable subspace.

By duality, we have the following decomposition if the system is not completely observable.


Theorem 3.8 If the observability matrix $\mathcal{O}$ has rank $k_2 < n$, then there exists a similarity transformation $T$ such that
\[
\left[\begin{array}{c|c} TAT^{-1} & TB \\ \hline CT^{-1} & D \end{array}\right]
=
\left[\begin{array}{cc|c}
\bar{A}_o & 0 & \bar{B}_o \\
\bar{A}_{21} & \bar{A}_{\bar o} & \bar{B}_{\bar o} \\ \hline
\bar{C}_o & 0 & D
\end{array}\right]
\]
with $\bar{A}_o \in \mathbb{C}^{k_2\times k_2}$ and $(\bar{C}_o, \bar{A}_o)$ observable.

Corollary 3.9 If the system is detectable and the observability matrix $\mathcal{O}$ has rank $k_2 < n$, then there exists a similarity transformation $T$ such that
\[
\left[\begin{array}{c|c} TAT^{-1} & TB \\ \hline CT^{-1} & D \end{array}\right]
=
\left[\begin{array}{cc|c}
\bar{A}_o & 0 & \bar{B}_o \\
\bar{A}_{21} & \bar{A}_{\bar o} & \bar{B}_{\bar o} \\ \hline
\bar{C}_o & 0 & D
\end{array}\right]
\]
with $\bar{A}_o \in \mathbb{C}^{k_2\times k_2}$, $(\bar{C}_o, \bar{A}_o)$ observable, and $\bar{A}_{\bar o}$ stable.

Similarly, we have
\[
G(s) = C(sI - A)^{-1}B + D = \bar{C}_o(sI - \bar{A}_o)^{-1}\bar{B}_o + D.
\]
Carefully combining the above two theorems, we get the following Kalman canonical decomposition. The proof is left to the reader as an exercise.

Theorem 3.10 Let an LTI dynamical system be described by the equations (3.1) and (3.2). Then there exists a nonsingular coordinate transformation $\bar{x} = Tx$ such that
\[
\begin{bmatrix} \dot{\bar{x}}_{co} \\ \dot{\bar{x}}_{c\bar o} \\ \dot{\bar{x}}_{\bar co} \\ \dot{\bar{x}}_{\bar c\bar o} \end{bmatrix}
=
\begin{bmatrix}
\bar{A}_{co} & 0 & \bar{A}_{13} & 0 \\
\bar{A}_{21} & \bar{A}_{c\bar o} & \bar{A}_{23} & \bar{A}_{24} \\
0 & 0 & \bar{A}_{\bar co} & 0 \\
0 & 0 & \bar{A}_{43} & \bar{A}_{\bar c\bar o}
\end{bmatrix}
\begin{bmatrix} \bar{x}_{co} \\ \bar{x}_{c\bar o} \\ \bar{x}_{\bar co} \\ \bar{x}_{\bar c\bar o} \end{bmatrix}
+
\begin{bmatrix} \bar{B}_{co} \\ \bar{B}_{c\bar o} \\ 0 \\ 0 \end{bmatrix} u
\]
\[
y = \begin{bmatrix} \bar{C}_{co} & 0 & \bar{C}_{\bar co} & 0 \end{bmatrix}
\begin{bmatrix} \bar{x}_{co} \\ \bar{x}_{c\bar o} \\ \bar{x}_{\bar co} \\ \bar{x}_{\bar c\bar o} \end{bmatrix}
+ Du
\]
or equivalently
\[
\left[\begin{array}{c|c} TAT^{-1} & TB \\ \hline CT^{-1} & D \end{array}\right]
=
\left[\begin{array}{cccc|c}
\bar{A}_{co} & 0 & \bar{A}_{13} & 0 & \bar{B}_{co} \\
\bar{A}_{21} & \bar{A}_{c\bar o} & \bar{A}_{23} & \bar{A}_{24} & \bar{B}_{c\bar o} \\
0 & 0 & \bar{A}_{\bar co} & 0 & 0 \\
0 & 0 & \bar{A}_{43} & \bar{A}_{\bar c\bar o} & 0 \\ \hline
\bar{C}_{co} & 0 & \bar{C}_{\bar co} & 0 & D
\end{array}\right]
\]
where the vector $\bar{x}_{co}$ is controllable and observable, $\bar{x}_{c\bar o}$ is controllable but unobservable, $\bar{x}_{\bar co}$ is observable but uncontrollable, and $\bar{x}_{\bar c\bar o}$ is uncontrollable and unobservable. Moreover, the transfer matrix from $u$ to $y$ is given by
\[
G(s) = \bar{C}_{co}(sI - \bar{A}_{co})^{-1}\bar{B}_{co} + D.
\]

One important issue is that although the transfer matrix of a dynamical system
\[
\left[\begin{array}{c|c} A & B \\ \hline C & D \end{array}\right]
\]
is equal to that of its controllable and observable part
\[
\left[\begin{array}{c|c} \bar{A}_{co} & \bar{B}_{co} \\ \hline \bar{C}_{co} & D \end{array}\right],
\]
their internal behaviors are very different. In other words, while their input/output behaviors are the same, their state space responses with nonzero initial conditions are very different. This can be illustrated by the state space response of the simple system
\[
\begin{bmatrix} \dot{\bar{x}}_{co} \\ \dot{\bar{x}}_{c\bar o} \\ \dot{\bar{x}}_{\bar co} \\ \dot{\bar{x}}_{\bar c\bar o} \end{bmatrix}
=
\begin{bmatrix}
\bar{A}_{co} & 0 & 0 & 0 \\
0 & \bar{A}_{c\bar o} & 0 & 0 \\
0 & 0 & \bar{A}_{\bar co} & 0 \\
0 & 0 & 0 & \bar{A}_{\bar c\bar o}
\end{bmatrix}
\begin{bmatrix} \bar{x}_{co} \\ \bar{x}_{c\bar o} \\ \bar{x}_{\bar co} \\ \bar{x}_{\bar c\bar o} \end{bmatrix}
+
\begin{bmatrix} \bar{B}_{co} \\ \bar{B}_{c\bar o} \\ 0 \\ 0 \end{bmatrix} u
\]
\[
y = \begin{bmatrix} \bar{C}_{co} & 0 & \bar{C}_{\bar co} & 0 \end{bmatrix}
\begin{bmatrix} \bar{x}_{co} \\ \bar{x}_{c\bar o} \\ \bar{x}_{\bar co} \\ \bar{x}_{\bar c\bar o} \end{bmatrix}
\]
with $\bar{x}_{co}$ controllable and observable, $\bar{x}_{c\bar o}$ controllable but unobservable, $\bar{x}_{\bar co}$ observable but uncontrollable, and $\bar{x}_{\bar c\bar o}$ uncontrollable and unobservable.

The solution to this system is given by
\[
\begin{bmatrix} \bar{x}_{co}(t) \\ \bar{x}_{c\bar o}(t) \\ \bar{x}_{\bar co}(t) \\ \bar{x}_{\bar c\bar o}(t) \end{bmatrix}
=
\begin{bmatrix}
e^{\bar{A}_{co}t}\bar{x}_{co}(0) + \int_0^t e^{\bar{A}_{co}(t-\tau)}\bar{B}_{co}u(\tau)\,d\tau \\
e^{\bar{A}_{c\bar o}t}\bar{x}_{c\bar o}(0) + \int_0^t e^{\bar{A}_{c\bar o}(t-\tau)}\bar{B}_{c\bar o}u(\tau)\,d\tau \\
e^{\bar{A}_{\bar co}t}\bar{x}_{\bar co}(0) \\
e^{\bar{A}_{\bar c\bar o}t}\bar{x}_{\bar c\bar o}(0)
\end{bmatrix}
\]
\[
y(t) = \bar{C}_{co}\bar{x}_{co}(t) + \bar{C}_{\bar co}\bar{x}_{\bar co}(t);
\]
note that $\bar{x}_{\bar co}(t)$ and $\bar{x}_{\bar c\bar o}(t)$ are not affected by the input $u$, while $\bar{x}_{c\bar o}(t)$ and $\bar{x}_{\bar c\bar o}(t)$ do not show up in the output $y$. Moreover, if the initial condition is zero, i.e., $\bar{x}(0) = 0$, then the output is
\[
y(t) = \int_0^t \bar{C}_{co}e^{\bar{A}_{co}(t-\tau)}\bar{B}_{co}u(\tau)\,d\tau.
\]


However, if the initial state is not zero, then the response $\bar{x}_{\bar co}(t)$ will show up in the output. In particular, if $\bar{A}_{\bar co}$ is not stable, then the output $y(t)$ will grow without bound. The problems with uncontrollable and/or unobservable unstable modes are even more profound than what we have mentioned. Since the states are the internal signals of the dynamical system, any unbounded internal signal will eventually destroy the system. On the other hand, since it is impossible to make the initial states exactly zero, any uncontrollable and/or unobservable unstable mode will result in unacceptable system behavior. This issue will be explored further in Section 3.7. In the next section, we will consider how to place the system poles to achieve desired closed-loop behavior if the system is controllable.

3.4 Pole Placement and Canonical Forms

Consider a MIMO dynamical system described by
\[
\dot{x} = Ax + Bu
\]
\[
y = Cx + Du,
\]
and let $u$ be a state feedback control law given by
\[
u = Fx + v.
\]
This closed-loop system is shown in Figure 3.1, and the closed-loop system equations are given by
\[
\dot{x} = (A + BF)x + Bv
\]
\[
y = (C + DF)x + Dv.
\]

[Figure 3.1: State Feedback — block diagram of the plant $(A, B, C, D)$ with state feedback gain $F$ and external input $v$]

Then we have the following lemma; the proof is simple and is left as an exercise to the reader.


Lemma 3.11 Let $F$ be a constant matrix with appropriate dimensions; then $(A,B)$ is controllable (stabilizable) if and only if $(A+BF, B)$ is controllable (stabilizable).

However, the observability of the system may change under state feedback. For example, the following system
\[
\left[\begin{array}{c|c} A & B \\ \hline C & D \end{array}\right]
=
\left[\begin{array}{cc|c} 1 & 1 & 1 \\ 1 & 0 & 0 \\ \hline 1 & 0 & 0 \end{array}\right]
\]
is controllable and observable. But with the state feedback
\[
u = Fx = \begin{bmatrix} -1 & -1 \end{bmatrix}x,
\]
the system becomes
\[
\left[\begin{array}{c|c} A+BF & B \\ \hline C+DF & D \end{array}\right]
=
\left[\begin{array}{cc|c} 0 & 0 & 1 \\ 1 & 0 & 0 \\ \hline 1 & 0 & 0 \end{array}\right]
\]
and is not completely observable.

The dual operation on the dynamical system,
\[
\dot{x} = Ax + Bu \;\longmapsto\; \dot{x} = Ax + Bu + Ly,
\]
is called output injection, which can be written as
\[
\left[\begin{array}{c|c} A & B \\ \hline C & D \end{array}\right]
\longmapsto
\left[\begin{array}{c|c} A+LC & B+LD \\ \hline C & D \end{array}\right].
\]
By duality, output injection does not change the system observability (detectability) but may change the system controllability (stabilizability).

Remark 3.1 We would like to call attention to the diagrams and the signal flow convention used in this book. It may not be conventional to let signals flow from the right to the left; however, the reader will find that this representation is much more visually appealing than the traditional representation and is consistent with matrix manipulations. For example, the following diagram represents the multiplication of three matrices and will be very helpful in dealing with complicated systems:
\[
z = M_1 M_2 M_3 w:
\qquad
z \longleftarrow \boxed{M_1} \longleftarrow \boxed{M_2} \longleftarrow \boxed{M_3} \longleftarrow w
\]
The conventional signal block diagrams, i.e., signals flowing from left to right, will also be used in this book. ~


We now consider some special state space representations of the dynamical system described by equations (3.1) and (3.2). First, we will consider systems with a single input.

Assume that a single input, multiple output dynamical system is given by
\[
G(s) = \left[\begin{array}{c|c} A & b \\ \hline C & d \end{array}\right],
\quad b \in \mathbb{R}^n,\; C \in \mathbb{R}^{p\times n},\; d \in \mathbb{R}^p,
\]
and assume that $(A, b)$ is controllable. Let
\[
\det(\lambda I - A) = \lambda^n + a_1\lambda^{n-1} + \cdots + a_n,
\]
and define
\[
A_1 := \begin{bmatrix}
-a_1 & -a_2 & \cdots & -a_{n-1} & -a_n \\
1 & 0 & \cdots & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & 1 & 0
\end{bmatrix},
\qquad
b_1 := \begin{bmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
\]
and
\[
\mathcal{C} = \begin{bmatrix} b & Ab & \cdots & A^{n-1}b \end{bmatrix},
\qquad
\mathcal{C}_1 = \begin{bmatrix} b_1 & A_1b_1 & \cdots & A_1^{n-1}b_1 \end{bmatrix}.
\]
Then it is easy to verify that both $\mathcal{C}$ and $\mathcal{C}_1$ are nonsingular. Moreover, the transformation
\[
T_c = \mathcal{C}_1\mathcal{C}^{-1}
\]
will give the equivalent system representation
\[
\left[\begin{array}{c|c} T_cAT_c^{-1} & T_cb \\ \hline CT_c^{-1} & d \end{array}\right]
=
\left[\begin{array}{c|c} A_1 & b_1 \\ \hline CT_c^{-1} & d \end{array}\right]
\]
where
\[
CT_c^{-1} = \begin{bmatrix} \beta_1 & \beta_2 & \cdots & \beta_{n-1} & \beta_n \end{bmatrix}
\]
for some $\beta_i \in \mathbb{R}^p$. This state space representation is usually called the controllable canonical form or controller canonical form. It is also easy to show that the transfer matrix is given by
\[
G(s) = C(sI - A)^{-1}b + d
= \frac{\beta_1 s^{n-1} + \beta_2 s^{n-2} + \cdots + \beta_{n-1}s + \beta_n}{s^n + a_1 s^{n-1} + \cdots + a_{n-1}s + a_n} + d,
\]
which also shows that, given a column vector of transfer functions $G(s)$, a state space representation of the transfer matrix can be obtained as above. The quadruple $(A, b, C, d)$ is called a state space realization of the transfer matrix $G(s)$.
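The construction above can be sketched numerically; the following NumPy snippet (the example transfer function $G(s) = (s+3)/(s^2+2s+5)$ is an illustrative choice, not from the text) builds $A_1$, $b_1$ from the denominator coefficients and checks the transfer function formula at a test point:

```python
import numpy as np

def controller_canonical(a, beta, d=0.0):
    """Controller canonical form (A1, b1, C, d) from denominator
    coefficients a = [a1, ..., an] and numerator coefficients
    beta = [beta1, ..., betan], following the text's construction."""
    a = np.asarray(a, dtype=float)
    n = len(a)
    A1 = np.zeros((n, n))
    A1[0, :] = -a                  # first row: -a1 ... -an
    A1[1:, :-1] = np.eye(n - 1)    # subdiagonal of ones
    b1 = np.zeros((n, 1)); b1[0, 0] = 1.0
    C = np.asarray(beta, dtype=float).reshape(1, n)
    return A1, b1, C, d

# G(s) = (s + 3) / (s^2 + 2s + 5): a = [2, 5], beta = [1, 3]
A1, b1, C, d = controller_canonical([2, 5], [1, 3])

# det(sI - A1) reproduces the denominator coefficients.
assert np.allclose(np.poly(A1), [1, 2, 5])

# Evaluate C (sI - A1)^{-1} b1 + d at a test point and compare.
s = 2.0
G_ss = (C @ np.linalg.solve(s * np.eye(2) - A1, b1))[0, 0] + d
G_tf = (s + 3) / (s**2 + 2 * s + 5)
assert abs(G_ss - G_tf) < 1e-12
print("controller canonical form verified")
```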


Now consider a dynamical system equation given by
\[
\dot{x} = A_1x + b_1u
\]
and a state feedback control law
\[
u = Fx = \begin{bmatrix} f_1 & f_2 & \cdots & f_{n-1} & f_n \end{bmatrix}x.
\]
Then the closed-loop system is given by
\[
\dot{x} = (A_1 + b_1F)x
\]
and $\det(\lambda I - (A_1+b_1F)) = \lambda^n + (a_1-f_1)\lambda^{n-1} + \cdots + (a_n-f_n)$. It is clear that the zeros of $\det(\lambda I - (A_1+b_1F))$ can be made arbitrary for an appropriate choice of $F$, provided that the complex zeros appear in conjugate pairs, thus showing that the eigenvalues of $A + bF$ can be freely assigned if $(A, b)$ is controllable.
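In practice, pole placement is done numerically; a minimal sketch using SciPy's `place_poles` (note it returns a gain $K$ with $\mathrm{eig}(A-BK)$ assigned, so $F = -K$ in the sign convention of the text; the system values are illustrative):

```python
import numpy as np
from scipy.signal import place_poles

# Controllable single-input pair in controller canonical form (a = [2, 5]).
A = np.array([[-2.0, -5.0],
              [ 1.0,  0.0]])
b = np.array([[1.0], [0.0]])

desired = np.array([-3.0, -4.0])
res = place_poles(A, b, desired)
F = -res.gain_matrix            # place_poles assigns eig(A - bK) = desired

assert np.allclose(sorted(np.linalg.eigvals(A + b @ F).real), sorted(desired))
print("closed-loop poles:", np.sort(np.linalg.eigvals(A + b @ F).real))
```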

Dually, consider a multiple input, single output system
\[
G(s) = \left[\begin{array}{c|c} A & B \\ \hline c & d \end{array}\right],
\quad B \in \mathbb{R}^{n\times m},\; c^* \in \mathbb{R}^n,\; d^* \in \mathbb{R}^m,
\]
and assume that $(c, A)$ is observable; then there is a transformation $T_o$ such that
\[
\left[\begin{array}{c|c} T_oAT_o^{-1} & T_oB \\ \hline cT_o^{-1} & d \end{array}\right]
=
\left[\begin{array}{ccccc|c}
-a_1 & 1 & 0 & \cdots & 0 & \beta_1 \\
-a_2 & 0 & 1 & \cdots & 0 & \beta_2 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
-a_{n-1} & 0 & 0 & \cdots & 1 & \beta_{n-1} \\
-a_n & 0 & 0 & \cdots & 0 & \beta_n \\ \hline
1 & 0 & 0 & \cdots & 0 & d
\end{array}\right],
\qquad \beta_i^* \in \mathbb{R}^m,
\]
and
\[
G(s) = c(sI - A)^{-1}B + d
= \frac{\beta_1 s^{n-1} + \beta_2 s^{n-2} + \cdots + \beta_{n-1}s + \beta_n}{s^n + a_1 s^{n-1} + \cdots + a_{n-1}s + a_n} + d.
\]
This representation is called the observable canonical form or observer canonical form.

Similarly, we can show that the eigenvalues of $A + Lc$ can be freely assigned if $(c, A)$ is observable.

The pole placement problem for a multiple input system $(A, B)$ can be converted into a simple single input pole placement problem. To describe the procedure, we need some preliminary results.

Lemma 3.12 If an $m$-input system pair $(A, B)$ is controllable and if $A$ is cyclic, then for almost all $v \in \mathbb{R}^m$, the single input pair $(A, Bv)$ is controllable.


Proof. Without loss of generality, assume that $A$ is in the Jordan canonical form and that the matrix $B$ is partitioned accordingly:
\[
A = \begin{bmatrix} J_1 & & & \\ & J_2 & & \\ & & \ddots & \\ & & & J_k \end{bmatrix},
\qquad
B = \begin{bmatrix} B_1 \\ B_2 \\ \vdots \\ B_k \end{bmatrix}
\]
where $J_i$ is of the form
\[
J_i = \begin{bmatrix}
\lambda_i & 1 & & & \\
& \lambda_i & \ddots & & \\
& & \ddots & \ddots & \\
& & & \lambda_i & 1 \\
& & & & \lambda_i
\end{bmatrix}
\]
and $\lambda_i \neq \lambda_j$ if $i \neq j$. By the PBH test, the pair $(A, B)$ is controllable if and only if, for each $i = 1, \ldots, k$, the last row of $B_i$ is not zero. Let $b_i$ be the last row of $B_i$; then we only need to show that, for almost all $v \in \mathbb{R}^m$, $b_iv \neq 0$ for each $i = 1, \ldots, k$. This is clear since, for each $i$, the set $\{v \in \mathbb{R}^m : b_iv = 0\}$ has measure zero in $\mathbb{R}^m$ because $b_i \neq 0$. $\Box$

The cyclicity assumption in this lemma is essential. Without it, the lemma does not hold. For example, the pair
\[
A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix},
\qquad
B = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]
is controllable, but there is no $v \in \mathbb{R}^2$ such that $(A, Bv)$ is controllable, since $A$ is not cyclic.

Since a matrix $A$ with distinct eigenvalues is cyclic by definition, we have the following lemma.

Lemma 3.13 If $(A, B)$ is controllable, then for almost any $K \in \mathbb{R}^{m\times n}$, all the eigenvalues of $A+BK$ are distinct and, consequently, $A+BK$ is cyclic.

A proof can be found in Brasch and Pearson [1970], Davison [1968], and Heymann [1968].

Now it is evident that, given a multiple input controllable pair $(A, B)$, there is a matrix $K \in \mathbb{R}^{m\times n}$ and a vector $v \in \mathbb{R}^m$ such that $A+BK$ is cyclic and $(A+BK, Bv)$ is controllable. Moreover, from the pole placement results for the single input system, there is a row vector $f$ with $f^* \in \mathbb{R}^n$ so that the eigenvalues of $(A+BK)+(Bv)f$ can be arbitrarily assigned. Hence, the eigenvalues of $A+BF$ can be arbitrarily assigned by choosing a state feedback in the form of
\[
u = Fx = (K + vf)x.
\]
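The two-step reduction above can be sketched numerically; the following Python snippet (illustrative values; random $K$ and $v$ play the "almost all" roles of Lemmas 3.12 and 3.13, and `place_poles` handles the single-input step) assigns poles to the non-cyclic pair $A = B = I_2$:

```python
import numpy as np
from scipy.signal import place_poles

rng = np.random.default_rng(0)

# Two-input pair with a repeated eigenvalue (A = I is not cyclic).
A = np.eye(2)
B = np.eye(2)

# Step 1: a random K makes A + BK cyclic (Lemma 3.13, generic choice).
K = rng.standard_normal((2, 2))
Acl = A + B @ K
ev = np.linalg.eigvals(Acl)
assert abs(ev[0] - ev[1]) > 1e-6          # distinct eigenvalues

# Step 2: a random v makes (A + BK, Bv) controllable (Lemma 3.12).
v = rng.standard_normal((2, 1))
bv = B @ v

# Step 3: single-input pole placement on (A + BK, Bv).
desired = np.array([-1.0, -2.0])
f = -place_poles(Acl, bv, desired).gain_matrix
F = K + v @ f                              # combined feedback u = F x
assert np.allclose(sorted(np.linalg.eigvals(A + B @ F).real), sorted(desired))
print("poles assigned via u = (K + v f) x")
```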


A dual procedure can be applied to the output injection $A + LC$.

The canonical forms for single input or single output systems can also be generalized to multiple input and multiple output systems at the expense of notation. The interested reader may consult Kailath [1980] or Chen [1984].

If a system is not completely controllable, then the Kalman controllable decomposition can be applied first, and the above procedure can be used to assign the system poles corresponding to the controllable subspace.

3.5 Observers and Observer-Based Controllers

We have shown in the last section that if a system is controllable and, furthermore, if the system states are available for feedback, then the system closed-loop poles can be assigned arbitrarily through a constant feedback. However, in most practical applications, the system states are not completely accessible, and all the designer knows are the output $y$ and the input $u$. Hence, the estimation of the system states from the given output information $y$ and input $u$ is often necessary to realize some specific design objectives. In this section, we consider such an estimation problem and the application of this state estimation in feedback control.

Consider a plant modeled by equations (3.1) and (3.2). An observer is a dynamical system with input $(u, y)$ and output, say $\hat{x}$, which asymptotically estimates the state $x$. More precisely, a (linear) observer is a system such as
\[
\dot{q} = Mq + Nu + Hy
\]
\[
\hat{x} = Qq + Ru + Sy
\]
so that $\hat{x}(t) - x(t) \to 0$ as $t \to \infty$ for all initial states $x(0)$, $q(0)$ and for every input $u(\cdot)$.

Theorem 3.14 An observer exists iff $(C, A)$ is detectable. Further, if $(C, A)$ is detectable, then a full order Luenberger observer is given by
\[
\dot{q} = Aq + Bu + L(Cq + Du - y) \tag{3.5}
\]
\[
\hat{x} = q \tag{3.6}
\]
where $L$ is any matrix such that $A + LC$ is stable.

Proof. We first show that the detectability of $(C, A)$ is sufficient for the existence of an observer. To that end, we only need to show that the so-called Luenberger observer defined in the theorem is indeed an observer. Note that equation (3.5) for $q$ is a simulation of the equation for $x$, with an additional forcing term $L(Cq + Du - y)$, which is a gain times the output error. Equivalent equations are
\[
\dot{q} = (A + LC)q + Bu + LDu - Ly
\]
\[
\hat{x} = q.
\]
These equations have the form allowed in the definition of an observer. Define the error $e := \hat{x} - x$; then simple algebra gives
\[
\dot{e} = (A + LC)e;
\]
therefore $e(t) \to 0$ as $t \to \infty$ for every $x(0)$, $q(0)$, and $u(\cdot)$.

To show the converse, assume that $(C, A)$ is not detectable. Take the initial state $x(0)$ in the undetectable subspace and consider a candidate observer:
\[
\dot{q} = Mq + Nu + Hy
\]
\[
\hat{x} = Qq + Ru + Sy.
\]
Take $q(0) = 0$ and $u(t) \equiv 0$. Then the equations for $x$ and the candidate observer are
\[
\dot{x} = Ax
\]
\[
\dot{q} = Mq + HCx
\]
\[
\hat{x} = Qq + SCx.
\]
Since the unobservable subspace is an $A$-invariant subspace containing $x(0)$, it follows that $x(t)$ is in the unobservable subspace for all $t \geq 0$. Hence, $Cx(t) = 0$ for all $t \geq 0$, and, consequently, $q(t) \equiv 0$ and $\hat{x}(t) \equiv 0$. However, for some $x(0)$ in the undetectable subspace, $x(t)$ does not converge to zero. Thus the candidate observer does not have the required property, and therefore no observer exists. $\Box$
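The Luenberger observer of Theorem 3.14 is easy to simulate; a minimal sketch (illustrative second-order plant with $D = 0$, forward Euler integration, and the observer gain designed by pole placement on the dual pair):

```python
import numpy as np
from scipy.signal import place_poles

# Observable pair (C, A) for an illustrative stable plant.
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

# Choose L so that A + LC is stable, by placing poles of the dual pair.
L = -place_poles(A.T, C.T, [-5.0, -6.0]).gain_matrix.T

# Simulate plant and observer (D = 0) with forward Euler; the error
# e = q - x obeys edot = (A + LC)e and should decay.
dt, steps = 1e-3, 8000
x = np.array([[1.0], [-1.0]])
q = np.zeros((2, 1))            # observer state, q(0) != x(0)
for k in range(steps):
    u = np.array([[np.sin(0.01 * k)]])
    y = C @ x
    x = x + dt * (A @ x + B @ u)
    q = q + dt * (A @ q + B @ u + L @ (C @ q - y))
print(float(np.linalg.norm(q - x)))  # tiny: the estimate has converged
```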

The above Luenberger observer has dimension $n$, which is the dimension of the state $x$. It is possible to get an observer of lower dimension. The idea is this: since we can measure $y - Du = Cx$, we already know $x$ modulo $\mathrm{Ker}\,C$, so we only need to generate the part of $x$ in $\mathrm{Ker}\,C$. If $C$ has full row rank and $p := \dim y$, then the dimension of $\mathrm{Ker}\,C$ equals $n - p$, so we might suspect that we can get an observer of order $n - p$. This is true. Such an observer is called a "minimal order observer". We will not pursue this issue further here. The interested reader may consult Chen [1984].

Recall that, for a dynamical system described by the equations (3.1) and (3.2), if $(A, B)$ is controllable and the state $x$ is available for feedback, then there is a state feedback $u = Fx$ such that the closed-loop poles of the system can be arbitrarily assigned. Similarly, if $(C, A)$ is observable, then the system observer poles can be arbitrarily placed so that the state estimate $\hat{x}$ can be made to approach $x$ arbitrarily fast. Now let us consider what happens if the system states are not available for feedback, so that the estimated state has to be used. The controller then has the following dynamics:
\[
\dot{\hat{x}} = (A + LC)\hat{x} + Bu + LDu - Ly
\]
\[
u = F\hat{x}.
\]


Then the total system state equations are given by
\[
\begin{bmatrix} \dot{x} \\ \dot{\hat{x}} \end{bmatrix}
=
\begin{bmatrix} A & BF \\ -LC & A+BF+LC \end{bmatrix}
\begin{bmatrix} x \\ \hat{x} \end{bmatrix}.
\]
Let $e := x - \hat{x}$; then the system equations become
\[
\begin{bmatrix} \dot{e} \\ \dot{\hat{x}} \end{bmatrix}
=
\begin{bmatrix} A+LC & 0 \\ -LC & A+BF \end{bmatrix}
\begin{bmatrix} e \\ \hat{x} \end{bmatrix},
\]
and the closed-loop poles consist of two parts: the poles resulting from state feedback, $\sigma(A+BF)$, and the poles resulting from state estimation, $\sigma(A+LC)$. Now if $(A, B)$ is controllable and $(C, A)$ is observable, then there exist $F$ and $L$ such that the eigenvalues of $A+BF$ and $A+LC$ can be arbitrarily assigned. In particular, they can be made stable. Note that a slightly weaker result also holds even if $(A, B)$ and $(C, A)$ are only stabilizable and detectable.
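This separation of the closed-loop poles can be checked numerically; a minimal sketch (illustrative unstable plant with $D = 0$; the block matrix is exactly the $(x, \hat{x})$-coordinate closed loop above):

```python
import numpy as np
from scipy.signal import place_poles

A = np.array([[0.0, 1.0],
              [2.0, -1.0]])     # unstable open-loop plant (eig = 1, -2)
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

F = -place_poles(A, B, [-2.0, -3.0]).gain_matrix        # eig(A+BF) = {-2,-3}
L = -place_poles(A.T, C.T, [-8.0, -9.0]).gain_matrix.T  # eig(A+LC) = {-8,-9}

# Closed-loop matrix in the (x, xhat) coordinates from the text (D = 0).
Acl = np.vstack([np.hstack([A, B @ F]),
                 np.hstack([-L @ C, A + B @ F + L @ C])])

poles = np.sort(np.linalg.eigvals(Acl).real)
assert np.allclose(poles, [-9.0, -8.0, -3.0, -2.0])
print("closed-loop poles:", poles)
```

The closed-loop spectrum is exactly $\sigma(A+BF) \cup \sigma(A+LC)$, the separation property.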

The controller given above is called an observer-based controller and is denoted by
\[
u = K(s)y
\]
with
\[
K(s) = \left[\begin{array}{c|c} A+BF+LC+LDF & -L \\ \hline F & 0 \end{array}\right].
\]
Now denote the open-loop plant by
\[
G = \left[\begin{array}{c|c} A & B \\ \hline C & D \end{array}\right];
\]
then the closed-loop feedback system is the loop $y = G(s)u$, $u = K(s)y$:

[Feedback loop diagram: $y \longleftarrow \boxed{G} \longleftarrow u \longleftarrow \boxed{K} \longleftarrow y$]

In general, if a system is stabilizable through feeding back the output $y$, then it is said to be output feedback stabilizable. It is clear from the above construction that a system is output feedback stabilizable if $(A, B)$ is stabilizable and $(C, A)$ is detectable. The converse is also true and will be shown in Chapter 12.

3.6 Operations on Systems

In this section, we present some facts about system interconnection. Since these proofsare straightforward, we will leave the details to the reader.


Suppose that $G_1$ and $G_2$ are two subsystems with state space representations
\[
G_1 = \left[\begin{array}{c|c} A_1 & B_1 \\ \hline C_1 & D_1 \end{array}\right],
\qquad
G_2 = \left[\begin{array}{c|c} A_2 & B_2 \\ \hline C_2 & D_2 \end{array}\right].
\]
Then the series or cascade connection of these two subsystems is a system with the output of the second subsystem as the input of the first subsystem, as shown in the following diagram:
\[
\longleftarrow \boxed{G_1} \longleftarrow \boxed{G_2} \longleftarrow
\]
In terms of the transfer matrices of the two subsystems, this operation is essentially the product of the two transfer matrices. Hence, a representation for the cascaded system can be obtained as
\[
G_1G_2 = \left[\begin{array}{c|c} A_1 & B_1 \\ \hline C_1 & D_1 \end{array}\right]
\left[\begin{array}{c|c} A_2 & B_2 \\ \hline C_2 & D_2 \end{array}\right]
=
\left[\begin{array}{cc|c}
A_1 & B_1C_2 & B_1D_2 \\
0 & A_2 & B_2 \\ \hline
C_1 & D_1C_2 & D_1D_2
\end{array}\right]
=
\left[\begin{array}{cc|c}
A_2 & 0 & B_2 \\
B_1C_2 & A_1 & B_1D_2 \\ \hline
D_1C_2 & C_1 & D_1D_2
\end{array}\right].
\]

Similarly, the parallel connection or the addition of $G_1$ and $G_2$ can be obtained as
\[
G_1 + G_2 = \left[\begin{array}{c|c} A_1 & B_1 \\ \hline C_1 & D_1 \end{array}\right]
+ \left[\begin{array}{c|c} A_2 & B_2 \\ \hline C_2 & D_2 \end{array}\right]
=
\left[\begin{array}{cc|c}
A_1 & 0 & B_1 \\
0 & A_2 & B_2 \\ \hline
C_1 & C_2 & D_1+D_2
\end{array}\right].
\]

Next we consider a feedback connection of $G_1$ and $G_2$, with $G_1$ in the forward path and $G_2$ in the feedback path, input $r$, and output $y$:

[Feedback diagram: $y = G_1(r - G_2y)$]

Then the closed-loop transfer matrix from $r$ to $y$ is given by
\[
T = \left[\begin{array}{cc|c}
A_1 - B_1D_2R_{12}^{-1}C_1 & -B_1R_{21}^{-1}C_2 & B_1R_{21}^{-1} \\
B_2R_{12}^{-1}C_1 & A_2 - B_2D_1R_{21}^{-1}C_2 & B_2D_1R_{21}^{-1} \\ \hline
R_{12}^{-1}C_1 & -R_{12}^{-1}D_1C_2 & D_1R_{21}^{-1}
\end{array}\right]
\]
where $R_{12} = I + D_1D_2$ and $R_{21} = I + D_2D_1$. Note that these state space representations are not necessarily controllable and/or observable even if the original subsystems $G_1$ and $G_2$ are.

For future reference, we shall also introduce the following definitions.


Definition 3.7 The transpose of a transfer matrix $G(s)$, or the dual system, is defined as
\[
G \longmapsto G^T(s) = B^*(sI - A^*)^{-1}C^* + D^*
\]
or equivalently
\[
\left[\begin{array}{c|c} A & B \\ \hline C & D \end{array}\right]
\longmapsto
\left[\begin{array}{c|c} A^* & C^* \\ \hline B^* & D^* \end{array}\right].
\]

Definition 3.8 The conjugate system of $G(s)$ is defined as
\[
G \longmapsto G^\sim(s) := G^T(-s) = B^*(-sI - A^*)^{-1}C^* + D^*
\]
or equivalently
\[
\left[\begin{array}{c|c} A & B \\ \hline C & D \end{array}\right]
\longmapsto
\left[\begin{array}{c|c} -A^* & -C^* \\ \hline B^* & D^* \end{array}\right].
\]
In particular, we have $G^*(j\omega) := [G(j\omega)]^* = G^\sim(j\omega)$.

Definition 3.9 A real rational matrix $\hat{G}(s)$ is called a right (left) inverse of a transfer matrix $G(s)$ if $G(s)\hat{G}(s) = I$ ($\hat{G}(s)G(s) = I$). Moreover, if $\hat{G}(s)$ is both a right inverse and a left inverse of $G(s)$, then it is simply called the inverse of $G(s)$.

Lemma 3.15 Let $D^\dagger$ denote a right (left) inverse of $D$ if $D$ has full row (column) rank. Then
\[
G^\dagger = \left[\begin{array}{c|c} A - BD^\dagger C & -BD^\dagger \\ \hline D^\dagger C & D^\dagger \end{array}\right]
\]
is a right (left) inverse of $G$.

Proof. The right inverse case will be proven; the left inverse case follows by duality. Suppose $DD^\dagger = I$. Then
\[
GG^\dagger = \left[\begin{array}{cc|c}
A & BD^\dagger C & BD^\dagger \\
0 & A - BD^\dagger C & -BD^\dagger \\ \hline
C & DD^\dagger C & DD^\dagger
\end{array}\right]
=
\left[\begin{array}{cc|c}
A & BD^\dagger C & BD^\dagger \\
0 & A - BD^\dagger C & -BD^\dagger \\ \hline
C & C & I
\end{array}\right].
\]
Performing the similarity transformation $T = \begin{bmatrix} I & I \\ 0 & I \end{bmatrix}$ on the above system yields
\[
GG^\dagger = \left[\begin{array}{cc|c}
A & 0 & 0 \\
0 & A - BD^\dagger C & -BD^\dagger \\ \hline
C & 0 & I
\end{array}\right]
= I.
\]
$\Box$


3.7 State Space Realizations for Transfer Matrices

In some cases, the natural or convenient description of a dynamical system is in terms of transfer matrices. This occurs, for example, in some highly complex systems for which the analytic differential equations are too hard or too complex to write down. Hence, certain engineering approximations or identifications have to be carried out; for example, input and output frequency responses are obtained from experiments so that some transfer matrix approximating the system dynamics is available. Since state space computations are the most convenient to implement on a computer, some appropriate state space representation of the resulting transfer matrix is necessary.

In general, assume that $G(s)$ is a real rational transfer matrix that is proper. Then we call a state-space model $(A, B, C, D)$ such that
\[
G(s) = \left[\begin{array}{c|c} A & B \\ \hline C & D \end{array}\right]
\]
a realization of $G(s)$.

We note that if the transfer matrix is either single input or single output, then the formulas in Section 3.4 can be used to obtain a controllable or observable realization. The realization of a general MIMO transfer matrix is more complicated and is the focus of this section.

Definition 3.10 A state space realization $(A, B, C, D)$ of $G(s)$ is said to be a minimal realization of $G(s)$ if $A$ has the smallest possible dimension.

Theorem 3.16 A state space realization $(A, B, C, D)$ of $G(s)$ is minimal if and only if $(A, B)$ is controllable and $(C, A)$ is observable.

Proof. We shall first show that if $(A, B, C, D)$ is a minimal realization of $G(s)$, then $(A, B)$ must be controllable and $(C, A)$ must be observable. Suppose, on the contrary, that $(A, B)$ is not controllable and/or $(C, A)$ is not observable. Then from the Kalman decomposition, there is a controllable and observable state space realization of smaller dimension that has the same transfer matrix; this contradicts the minimality assumption. Hence $(A, B)$ must be controllable and $(C, A)$ must be observable.

Next we show that if an $n$-th order realization $(A, B, C, D)$ is controllable and observable, then it is minimal. Suppose it is not minimal, and let $(A_m, B_m, C_m, D)$ be a minimal realization of $G(s)$ with order $k < n$. Since
\[
G(s) = C(sI - A)^{-1}B + D = C_m(sI - A_m)^{-1}B_m + D,
\]
we have
\[
CA^iB = C_mA_m^iB_m, \quad \forall i \geq 0.
\]
This implies that
\[
\mathcal{O}\mathcal{C} = \mathcal{O}_m\mathcal{C}_m \tag{3.7}
\]
where $\mathcal{C}$ and $\mathcal{O}$ are the controllability and observability matrices of $(A, B)$ and $(C, A)$, respectively, and
\[
\mathcal{C}_m := \begin{bmatrix} B_m & A_mB_m & \cdots & A_m^{n-1}B_m \end{bmatrix},
\qquad
\mathcal{O}_m := \begin{bmatrix} C_m \\ C_mA_m \\ \vdots \\ C_mA_m^{n-1} \end{bmatrix}.
\]
By Sylvester's inequality,
\[
\mathrm{rank}\,\mathcal{C} + \mathrm{rank}\,\mathcal{O} - n \leq \mathrm{rank}\,(\mathcal{O}\mathcal{C}) \leq \min\{\mathrm{rank}\,\mathcal{C},\, \mathrm{rank}\,\mathcal{O}\},
\]
and, therefore, we have $\mathrm{rank}\,(\mathcal{O}\mathcal{C}) = n$ since $\mathrm{rank}\,\mathcal{C} = \mathrm{rank}\,\mathcal{O} = n$ by the controllability and observability assumptions. Similarly, since $(A_m, B_m, C_m, D)$ is minimal, $(A_m, B_m)$ is controllable and $(C_m, A_m)$ is observable. Moreover,
\[
\mathrm{rank}\,\mathcal{O}_m\mathcal{C}_m = k < n,
\]
which is a contradiction since $\mathrm{rank}\,\mathcal{O}\mathcal{C} = \mathrm{rank}\,\mathcal{O}_m\mathcal{C}_m$ by equality (3.7). $\Box$

The following property of minimal realizations can also be verified; this is left to the reader.

Theorem 3.17 Let $(A_1, B_1, C_1, D)$ and $(A_2, B_2, C_2, D)$ be two minimal realizations of a real rational transfer matrix $G(s)$, and let $\mathcal{C}_1$, $\mathcal{C}_2$, $\mathcal{O}_1$, and $\mathcal{O}_2$ be the corresponding controllability and observability matrices, respectively. Then there exists a unique nonsingular $T$ such that
\[
A_2 = TA_1T^{-1}, \qquad B_2 = TB_1, \qquad C_2 = C_1T^{-1}.
\]
Furthermore, $T$ can be specified as $T = (\mathcal{O}_2^*\mathcal{O}_2)^{-1}\mathcal{O}_2^*\mathcal{O}_1$ or $T^{-1} = \mathcal{C}_1\mathcal{C}_2^*(\mathcal{C}_2\mathcal{C}_2^*)^{-1}$.
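The formulas for $T$ can be verified numerically; a minimal sketch (an illustrative minimal realization and a similarity-transformed copy of it) recovers the transformation from the observability and controllability matrices:

```python
import numpy as np

def ctrb(A, B):
    n = A.shape[0]
    return np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(n)])

def obsv(C, A):
    n = A.shape[0]
    return np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(n)])

# A minimal realization and a similarity-transformed copy of it.
A1 = np.array([[-1.0, 1.0], [0.0, -2.0]])
B1 = np.array([[1.0], [1.0]])
C1 = np.array([[1.0, 0.0]])

S = np.array([[2.0, 1.0], [1.0, 1.0]])   # some nonsingular matrix
A2, B2, C2 = S @ A1 @ np.linalg.inv(S), S @ B1, C1 @ np.linalg.inv(S)

# T = (O2* O2)^{-1} O2* O1 and T^{-1} = C1 C2* (C2 C2*)^{-1} (Theorem 3.17).
O1, O2 = obsv(C1, A1), obsv(C2, A2)
T = np.linalg.solve(O2.T @ O2, O2.T @ O1)
Tinv = ctrb(A1, B1) @ ctrb(A2, B2).T @ np.linalg.inv(ctrb(A2, B2) @ ctrb(A2, B2).T)

assert np.allclose(T, S)
assert np.allclose(Tinv, np.linalg.inv(S))
print("similarity transformation recovered")
```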

We now describe several ways to obtain a state space realization for a given multiple input and multiple output transfer matrix $G(s)$. The simplest and most straightforward way to obtain a realization is by realizing each element of the matrix $G(s)$ and then combining all these individual realizations to form a realization for $G(s)$. To illustrate, let us consider a $2 \times 2$ (block) transfer matrix such as
\[
G(s) = \begin{bmatrix} G_1(s) & G_2(s) \\ G_3(s) & G_4(s) \end{bmatrix}
\]
and assume that $G_i(s)$ has a state space realization
\[
G_i(s) = \left[\begin{array}{c|c} A_i & B_i \\ \hline C_i & D_i \end{array}\right], \quad i = 1, \ldots, 4.
\]
Note that $G_i(s)$ may itself be a multiple input and multiple output transfer matrix. In particular, if $G_i(s)$ is a column or row vector of transfer functions, then the formulas in Section 3.4 can be used to obtain a controllable or observable realization for $G_i(s)$. Then a realization for $G(s)$ can be given by
\[
G(s) = \left[\begin{array}{cccc|cc}
A_1 & 0 & 0 & 0 & B_1 & 0 \\
0 & A_2 & 0 & 0 & 0 & B_2 \\
0 & 0 & A_3 & 0 & B_3 & 0 \\
0 & 0 & 0 & A_4 & 0 & B_4 \\ \hline
C_1 & C_2 & 0 & 0 & D_1 & D_2 \\
0 & 0 & C_3 & C_4 & D_3 & D_4
\end{array}\right].
\]

Alternatively, if the transfer matrix $G(s)$ can be factored into the product and/or the sum of several simply realized transfer matrices, then a realization for $G$ can be obtained by using the cascade or addition formulas of the last section.

A problem inherent in these kinds of realization procedures is that the realization thus obtained will generally not be minimal. To obtain a minimal realization, a Kalman controllability and observability decomposition has to be performed to eliminate the uncontrollable and/or unobservable states. (An alternative numerically reliable method to eliminate uncontrollable and/or unobservable states is the balanced realization method, which will be discussed later.)

We will now describe one factorization procedure that does result in a minimal realization, using partial fraction expansion. (The resulting realization is sometimes called Gilbert's realization, due to Gilbert.)

Let $G(s)$ be a $p \times m$ transfer matrix and write it in the following form:
\[
G(s) = \frac{N(s)}{d(s)}
\]
with $d(s)$ a scalar polynomial. For simplicity, we shall assume that $d(s)$ has only real and distinct roots, $\lambda_i \neq \lambda_j$ if $i \neq j$, and
\[
d(s) = (s - \lambda_1)(s - \lambda_2) \cdots (s - \lambda_r).
\]
Then $G(s)$ has the following partial fraction expansion:
\[
G(s) = D + \sum_{i=1}^{r} \frac{W_i}{s - \lambda_i}.
\]
Suppose
\[
\mathrm{rank}\,W_i = k_i
\]
and let $B_i \in \mathbb{R}^{k_i\times m}$ and $C_i \in \mathbb{R}^{p\times k_i}$ be two constant matrices such that
\[
W_i = C_iB_i.
\]


Then a realization for $G(s)$ is given by
\[
G(s) = \left[\begin{array}{ccc|c}
\lambda_1 I_{k_1} & & & B_1 \\
& \ddots & & \vdots \\
& & \lambda_r I_{k_r} & B_r \\ \hline
C_1 & \cdots & C_r & D
\end{array}\right].
\]
It follows immediately from the PBH tests that this realization is controllable and observable. Hence, it is minimal.

An immediate consequence of this minimal realization is that a transfer matrix with an $r$-th order polynomial denominator does not necessarily have an $r$-th order state space realization unless $W_i$ for each $i$ is a rank one matrix.

This approach can, in fact, be generalized to more complicated cases where $d(s)$ may have complex and/or repeated roots. Readers may convince themselves by trying some simple examples.
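Gilbert's construction is short to implement; a minimal sketch (illustrative $2 \times 2$ example, using the SVD as one convenient way to factor each residue $W_i = C_iB_i$ — the text does not prescribe a particular factorization):

```python
import numpy as np
from scipy.linalg import block_diag

def gilbert_realization(poles, residues, D):
    """Gilbert's minimal realization from a partial fraction expansion
    G(s) = D + sum_i W_i/(s - lambda_i) with real distinct poles.
    Each residue W_i is factored as C_i B_i with k_i = rank W_i."""
    A_blocks, B_rows, C_cols = [], [], []
    for lam, W in zip(poles, residues):
        U, svals, Vt = np.linalg.svd(W)
        k = int(np.sum(svals > 1e-9))     # k_i = rank W_i
        C_cols.append(U[:, :k] * svals[:k])   # W_i = C_i B_i
        B_rows.append(Vt[:k, :])
        A_blocks.append(lam * np.eye(k))
    return block_diag(*A_blocks), np.vstack(B_rows), np.hstack(C_cols), np.asarray(D, float)

# G(s) = [[1/(s+1), 1/(s+1)], [1/(s+2), 0]]:
# W1 = [[1,1],[0,0]] (rank 1), W2 = [[0,0],[1,0]] (rank 1).
A, B, C, D = gilbert_realization(
    [-1.0, -2.0],
    [np.array([[1.0, 1.0], [0.0, 0.0]]), np.array([[0.0, 0.0], [1.0, 0.0]])],
    np.zeros((2, 2)))
assert A.shape == (2, 2)   # order k1 + k2 = 2, not the element count 3

s = 1.0j
G = C @ np.linalg.solve(s * np.eye(2) - A, B) + D
assert np.allclose(G, [[1/(s+1), 1/(s+1)], [1/(s+2), 0]])
print("Gilbert realization is minimal of order", A.shape[0])
```

Note the realized order equals $\sum_i k_i$, illustrating the remark that the order exceeds $r$ only when some residue has rank greater than one.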

3.8 Lyapunov Equations

Testing stability, controllability, and observability of a system is very important in linear system analysis and synthesis. However, these tests often have to be done indirectly. In that respect, Lyapunov theory is sometimes useful. Consider the following Lyapunov equation
\[
A^*X + XA + Q = 0 \tag{3.8}
\]
with given real matrices $A$ and $Q$. It has been shown in Chapter 2 that this equation has a unique solution iff $\lambda_i(A) + \bar{\lambda}_j(A) \neq 0$, $\forall i, j$. In this section, we will study the relationships between the stability of $A$ and the solution $X$. The following results are standard.

Lemma 3.18 Assume that $A$ is stable; then the following statements hold:

(i) $X = \int_0^\infty e^{A^*t}Qe^{At}\,dt$.

(ii) $X > 0$ if $Q > 0$ and $X \geq 0$ if $Q \geq 0$.

(iii) If $Q \geq 0$, then $(Q, A)$ is observable iff $X > 0$.

An immediate consequence of part (iii) is that, given a stable matrix $A$, a pair $(C, A)$ is observable if and only if the solution of the following Lyapunov equation is positive definite:
\[
A^*L_o + L_oA + C^*C = 0.
\]
The solution $L_o$ is called the observability Gramian. Similarly, a pair $(A, B)$ is controllable if and only if the solution of
\[
AL_c + L_cA^* + BB^* = 0
\]
is positive definite; $L_c$ is called the controllability Gramian.

In many applications, we are given the solution of the Lyapunov equation and need to conclude the stability of the matrix $A$.

to conclude the stability of the matrix A.

Lemma 3.19 Suppose X is the solution of the Lyapunov equation (3.8). Then

(i) Re λi(A) ≤ 0 if X > 0 and Q ≥ 0.

(ii) A is stable if X > 0 and Q > 0.

(iii) A is stable if X ≥ 0, Q ≥ 0, and (Q, A) is detectable.

Proof. Let λ be an eigenvalue of A and v ≠ 0 a corresponding eigenvector, so Av = λv. Pre-multiply equation (3.8) by v^* and post-multiply by v to get

2 Re λ (v^*Xv) + v^*Qv = 0.

Now if X > 0 then v^*Xv > 0, and it is clear that Re λ ≤ 0 if Q ≥ 0 and Re λ < 0 if Q > 0. Hence (i) and (ii) hold. To see (iii), assume Re λ ≥ 0. Then we must have v^*Qv = 0, i.e., Qv = 0. This implies that λ is an unstable and unobservable mode, which contradicts the assumption that (Q, A) is detectable. □
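The Gramian computations above are easy to carry out numerically. The following sketch (not part of the original text; the matrices A and C are arbitrary example values) solves the observability Lyapunov equation with SciPy and uses Lemma 3.18(iii) as an observability test:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# A stable matrix A and an output matrix C (arbitrary example values).
A = np.array([[-1.0, 1.0],
              [0.0, -2.0]])
C = np.array([[1.0, 0.0]])

# Observability Gramian: A^* Lo + Lo A + C^* C = 0.
# solve_continuous_lyapunov(a, q) solves a X + X a^H = q, so pass a = A^*.
Lo = solve_continuous_lyapunov(A.conj().T, -C.conj().T @ C)

# By Lemma 3.18(iii), (C, A) is observable iff Lo > 0.
eigs = np.linalg.eigvalsh(Lo)
print(eigs.min() > 0)  # True: the pair is observable
```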

3.9 Balanced Realizations

Although there are infinitely many different state space realizations for a given transfer matrix, some particular realizations have proven to be very useful in control engineering and signal processing. Here we will only introduce one class of realizations for stable transfer matrices that are most useful in control applications. To motivate this class of realizations, we first consider some simple facts.

Lemma 3.20 Let [A B; C D] be a state space realization of a (not necessarily stable) transfer matrix G(s). Suppose that there exists a symmetric matrix

P = P^* = [ P1 0
            0  0 ]

with P1 nonsingular such that

AP + PA^* + BB^* = 0.

Now partition the realization (A, B, C, D) compatibly with P as

[ A11 A12 B1
  A21 A22 B2
  C1  C2  D  ].

Page 90: Robust and Optimal Control

3.9. Balanced Realizations 73

Then [A11 B1; C1 D] is also a realization of G. Moreover, (A11, B1) is controllable if A11 is stable.

Proof. Use the partitioned P and (A, B, C) to get

0 = AP + PA^* + BB^* = [ A11 P1 + P1 A11^* + B1 B1^*   P1 A21^* + B1 B2^*
                         A21 P1 + B2 B1^*              B2 B2^*            ],

which gives B2 = 0 and A21 = 0 since P1 is nonsingular. Hence, part of the realization is not controllable:

[ A11 A12 B1        [ A11 A12 B1
  A21 A22 B2    =     0   A22 0     =   [ A11 B1
  C1  C2  D  ]        C1  C2  D ]         C1  D  ].

Finally, it follows from Lemma 3.18 that (A11, B1) is controllable if A11 is stable. □

We also have

Lemma 3.21 Let [A B; C D] be a state space realization of a (not necessarily stable) transfer matrix G(s). Suppose that there exists a symmetric matrix

Q = Q^* = [ Q1 0
            0  0 ]

with Q1 nonsingular such that

QA + A^*Q + C^*C = 0.

Now partition the realization (A, B, C, D) compatibly with Q as

[ A11 A12 B1
  A21 A22 B2
  C1  C2  D  ].

Then [A11 B1; C1 D] is also a realization of G. Moreover, (C1, A11) is observable if A11 is stable.

The above two lemmas suggest that, to obtain a minimal realization from a stable non-minimal realization, one only needs to eliminate all states corresponding to the zero block diagonal terms of the controllability Gramian P and the observability Gramian Q. In the case where P is not block diagonal, the following procedure can be used to eliminate non-controllable subsystems:


1. Let G(s) = [A B; C D] be a stable realization.

2. Compute the controllability Gramian P ≥ 0 from

   AP + PA^* + BB^* = 0.

3. Diagonalize P to get P = [U1 U2] [ Λ1 0 ; 0 0 ] [U1 U2]^* with Λ1 > 0 and [U1 U2] unitary.

4. Then G(s) = [ U1^* A U1   U1^* B ; C U1   D ] is a controllable realization.

A dual procedure can also be applied to eliminate non-observable subsystems.
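A minimal sketch of steps 1–4 in NumPy/SciPy (not part of the original text; the realization below is an arbitrary example whose second state is deliberately disconnected from the input, hence uncontrollable):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# A stable but non-minimal realization (arbitrary example values).
A = np.array([[-1.0, 0.0],
              [0.0, -3.0]])
B = np.array([[1.0],
              [0.0]])
C = np.array([[1.0, 1.0]])

# Step 2: controllability Gramian from A P + P A^* + B B^* = 0.
P = solve_continuous_lyapunov(A, -B @ B.conj().T)

# Step 3: diagonalize P = U diag(lam) U^*, split U = [U1 U2] by lam > 0.
lam, U = np.linalg.eigh(P)
U1 = U[:, lam > 1e-10]

# Step 4: restrict the realization to the controllable subspace.
A1 = U1.conj().T @ A @ U1
B1 = U1.conj().T @ B
C1 = C @ U1

print(A1.shape)  # (1, 1): only one controllable state remains
```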

Now assume that Λ1 > 0 is diagonal and that Λ1 = diag(Λ11, Λ12) with σmax(Λ12) ≪ σmin(Λ11). Then it is tempting to conclude that one can also discard the states corresponding to Λ12 without causing much error. However, this is not necessarily true, as shown in the following example. Consider a stable transfer function

G(s) = (3s + 18)/(s^2 + 3s + 18).

Then G(s) has a state space realization given by

G(s) = [ -1    -4/ε   1
         4ε    -2     2ε
         -1    2/ε    0  ]

where ε is any nonzero number. It is easy to check that the controllability Gramian of the realization is given by

P = [ 0.5  0
      0    ε^2 ].

Since the last diagonal term of P can be made arbitrarily small by making ε small, the controllability of the corresponding state can be made arbitrarily weak. If the state corresponding to the last diagonal term of P is removed, we get the transfer function

Ĝ = [ -1  1 ; -1  0 ] = -1/(s + 1),

which is not close to the original transfer function in any sense. The problem is easily detected if one checks the observability Gramian Q, which is

Q = [ 0.5  0
      0    1/ε^2 ].


Since 1/ε^2 is very large if ε is small, the state corresponding to the last diagonal term is strongly observable. This example shows that the controllability (or observability) Gramian alone cannot give an accurate indication of the dominance of the system states in the input/output behavior.
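The example can be checked numerically; the sketch below (not part of the original text; the value of ε is arbitrary) computes both Gramians for the ε-dependent realization:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

eps = 0.01  # the book's parameter epsilon; any nonzero value works
A = np.array([[-1.0, -4.0 / eps],
              [4.0 * eps, -2.0]])
B = np.array([[1.0],
              [2.0 * eps]])
C = np.array([[-1.0, 2.0 / eps]])

P = solve_continuous_lyapunov(A, -B @ B.T)      # controllability Gramian
Q = solve_continuous_lyapunov(A.T, -C.T @ C)    # observability Gramian

# P = diag(0.5, eps^2) is nearly singular, yet Q = diag(0.5, 1/eps^2)
# shows the weakly controllable state is strongly observable.
print(np.round(np.diag(P), 6), np.round(np.diag(Q), 2))
```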

This motivates the introduction of a balanced realization, which gives balanced Gramians for controllability and observability.

Suppose G = [A B; C D] is stable, i.e., A is stable. Let P and Q denote the controllability Gramian and observability Gramian, respectively. Then by Lemma 3.18, P and Q satisfy the following Lyapunov equations:

AP + PA^* + BB^* = 0,    (3.9)

A^*Q + QA + C^*C = 0,    (3.10)

and P ≥ 0, Q ≥ 0. Furthermore, the pair (A, B) is controllable iff P > 0, and (C, A) is observable iff Q > 0.

Suppose the state is transformed by a nonsingular T to x̄ = Tx to yield the realization

Ḡ = [ Ā B̄ ; C̄ D̄ ] = [ TAT^{-1}  TB ; CT^{-1}  D ].

Then the Gramians are transformed to P̄ = TPT^* and Q̄ = (T^{-1})^* Q T^{-1}. Note that P̄Q̄ = T PQ T^{-1}, and therefore the eigenvalues of the product of the Gramians are invariant under state transformation.

Consider the similarity transformation T which gives the eigenvector decomposition

PQ = T^{-1} Λ T,   Λ = diag(λ1, …, λn).

Then the columns of T^{-1} are eigenvectors of PQ corresponding to the eigenvalues {λi}. Later, it will be shown that PQ has a real diagonal Jordan form and that Λ ≥ 0, which are consequences of P ≥ 0 and Q ≥ 0.

Although the eigenvectors are not unique, in the case of a minimal realization they can always be chosen such that

P̄ = TPT^* = Σ,
Q̄ = (T^{-1})^* Q T^{-1} = Σ,

where Σ = diag(σ1, σ2, …, σn) and Σ^2 = Λ. This new realization, with controllability and observability Gramians P̄ = Q̄ = Σ, will be referred to as a balanced realization (also called an internally balanced realization). The decreasingly ordered numbers σ1 ≥ σ2 ≥ … ≥ σn ≥ 0 are called the Hankel singular values of the system.

More generally, if a realization of a stable system is not minimal, then there is a transformation such that the controllability and observability Gramians for the transformed realization are diagonal and the controllable and observable subsystem is balanced. This is a consequence of the following matrix fact.


Theorem 3.22 Let P and Q be two positive semidefinite matrices. Then there exists a nonsingular matrix T such that

TPT^* = diag(Σ1, Σ2, 0, 0),   (T^{-1})^* Q T^{-1} = diag(Σ1, 0, Σ3, 0),

respectively, with Σ1, Σ2, Σ3 diagonal and positive definite.

Proof. Since P is a positive semidefinite matrix, there exists a transformation T1 such that

T1 P T1^* = [ I 0 ; 0 0 ].

Now let

(T1^*)^{-1} Q T1^{-1} = [ Q11 Q12 ; Q12^* Q22 ],

and let U1 be a unitary matrix such that

U1 Q11 U1^* = [ Σ1^2 0 ; 0 0 ],   Σ1 > 0.

Let

(T2^*)^{-1} = [ U1 0 ; 0 I ];

then

(T2^*)^{-1} (T1^*)^{-1} Q T1^{-1} T2^{-1} = [ Σ1^2    0       Q121
                                              0       0       Q122
                                              Q121^*  Q122^*  Q22  ].

But Q ≥ 0 implies Q122 = 0. So now let

(T3^*)^{-1} = [ I                0  0
                0                I  0
                -Q121^* Σ1^{-2}  0  I ],

giving

(T3^*)^{-1} (T2^*)^{-1} (T1^*)^{-1} Q T1^{-1} T2^{-1} T3^{-1} = [ Σ1^2  0  0
                                                                  0     0  0
                                                                  0     0  Q22 - Q121^* Σ1^{-2} Q121 ].

Next find a unitary matrix U2 such that

U2 (Q22 - Q121^* Σ1^{-2} Q121) U2^* = [ Σ3 0 ; 0 0 ],   Σ3 > 0.

Define

(T4^*)^{-1} = [ Σ1^{-1/2}  0  0
                0          I  0
                0          0  U2 ]

and let T = T4 T3 T2 T1. Then

TPT^* = diag(Σ1, Σ2, 0, 0),   (T^*)^{-1} Q T^{-1} = diag(Σ1, 0, Σ3, 0)

with Σ2 = I. □

Corollary 3.23 The product of two positive semidefinite matrices is similar to a positive semidefinite matrix.

Proof. Let P and Q be any positive semidefinite matrices. Then it is easy to see that, with the transformation given above,

T PQ T^{-1} = [ Σ1^2 0 ; 0 0 ]. □

Corollary 3.24 For any stable system G = [A B; C D], there exists a nonsingular T such that Ḡ = [TAT^{-1} TB; CT^{-1} D] has controllability Gramian P̄ and observability Gramian Q̄ given by

P̄ = diag(Σ1, Σ2, 0, 0),   Q̄ = diag(Σ1, 0, Σ3, 0),

respectively, with Σ1, Σ2, Σ3 diagonal and positive definite.

In the special case where [A B; C D] is a minimal realization, a balanced realization can be obtained through the following simplified procedure:


1. Compute the controllability and observability Gramians P > 0, Q > 0.

2. Find a matrix R such that P = R^*R.

3. Diagonalize RQR^* to get RQR^* = U Σ^2 U^*.

4. Let T^{-1} = R^*U Σ^{-1/2}. Then TPT^* = (T^*)^{-1}QT^{-1} = Σ and [TAT^{-1} TB; CT^{-1} D] is balanced.
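A minimal numerical sketch of this simplified procedure (not part of the original text; the realization is an arbitrary stable, minimal example):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, cholesky

# A stable, minimal realization (arbitrary example values).
A = np.array([[-1.0, 0.5],
              [0.0, -2.0]])
B = np.array([[1.0],
              [1.0]])
C = np.array([[1.0, 0.0]])

# Step 1: Gramians.
P = solve_continuous_lyapunov(A, -B @ B.T)
Q = solve_continuous_lyapunov(A.T, -C.T @ C)

# Step 2: factor P = R^* R (upper-triangular Cholesky factor).
R = cholesky(P)                    # P = R.T @ R

# Step 3: diagonalize R Q R^* = U Sigma^2 U^*.
sig2, U = np.linalg.eigh(R @ Q @ R.T)
sig2, U = sig2[::-1], U[:, ::-1]   # decreasing order
Sigma = np.diag(np.sqrt(sig2))     # Hankel singular values on the diagonal

# Step 4: T^{-1} = R^* U Sigma^{-1/2}.
Tinv = R.T @ U @ np.diag(sig2 ** -0.25)
T = np.linalg.inv(Tinv)

# Check: both transformed Gramians equal Sigma.
Pb = T @ P @ T.T
Qb = Tinv.T @ Q @ Tinv
print(np.allclose(Pb, Sigma), np.allclose(Qb, Sigma))  # True True
```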

Assume that the Hankel singular values of the system are decreasingly ordered so that Σ = diag(σ1, σ2, …, σn) with σ1 ≥ σ2 ≥ … ≥ σn, and suppose σr ≫ σ_{r+1} for some r. Then the balanced realization implies that the states corresponding to the singular values σ_{r+1}, …, σn are less controllable and observable than those corresponding to σ1, …, σr. Therefore, truncating the less controllable and observable states will not lose much information about the system. These statements will be made more concrete in Chapter 7, where error bounds for the truncation error will be derived.

Two other closely related realizations are the input normal realization, with P = I and Q = Σ^2, and the output normal realization, with P = Σ^2 and Q = I. Both realizations can be obtained easily from the balanced realization by a suitable scaling of the states.

3.10 Hidden Modes and Pole-Zero Cancelation

Another important issue associated with realization theory is the problem of uncontrollable and/or unobservable unstable modes in a dynamical system. This problem is illustrated in the following example. Consider a series connection of two subsystems, with intermediate signal v:

u → [ (s-1)/(s+1) ] --v--> [ 1/(s-1) ] → y

The transfer function for this system,

g(s) = (s-1)/(s+1) · 1/(s-1) = 1/(s+1),

is stable and has a first order minimal realization. On the other hand, let

x1 = y
x2 = u - v.

Then a state space description for this dynamical system is given by

[ ẋ1 ; ẋ2 ] = [ 1 -1 ; 0 -1 ] [ x1 ; x2 ] + [ 1 ; 2 ] u

y = [ 1 0 ] [ x1 ; x2 ]


and is a second order system. Moreover, it is easy to show that the unstable mode 1 is uncontrollable but observable. Hence, the output can be unbounded if the initial state x1(0) is not zero. We should also note that the above problem does not go away by changing the interconnection order:

u → [ 1/(s-1) ] → [ (s-1)/(s+1) ] → y

In the latter case, the unstable mode 1 becomes controllable but unobservable. The unstable mode can still make the internal signal between the two blocks unbounded if the corresponding initial state is not zero. Of course, there are fundamental differences between these two types of interconnections as far as control design is concerned. For instance, if the state is available for feedback control, then the latter interconnection can be stabilized while the former cannot be.

This example shows that we must be very careful in canceling unstable modes when forming a transfer function in control design; otherwise, the results obtained may be misleading, and those unstable modes become hidden modes waiting to blow up. One observation from this example is that the problem is really caused by the unstable zero of the subsystem (s-1)/(s+1). Although the zeros of an SISO transfer function are easy to see, this is not quite so for an MIMO transfer matrix. In fact, the notion of "system zero" cannot be generalized naturally from the scalar transfer function zeros. For example, consider the following transfer matrix

G(s) = [ 1/(s+1)   1/(s+2)
         2/(s+2)   1/(s+1) ]

which is stable, and each element of G(s) has no finite zeros. Let

K = [ (s+2)/(s-√2)   -(s+1)/(s-√2)
      0              1             ]

which is unstable. However,

KG = [ -(s+√2)/((s+1)(s+2))   0
       2/(s+2)                1/(s+1) ]

is stable. This implies that G(s) must have an unstable zero at √2 that cancels the unstable pole of K. This leads us to the next topic: multivariable system poles and zeros.


3.11 Multivariable System Poles and Zeros

Let R[s] denote the polynomial ring with real coefficients. A matrix is called a polynomial matrix if every element of the matrix is in R[s]. A square polynomial matrix is called a unimodular matrix if its determinant is a nonzero constant (not a polynomial of s). It is a fact that a square polynomial matrix is unimodular if and only if it is invertible in R[s], i.e., its inverse is also a polynomial matrix.

Definition 3.11 Let Q(s) ∈ R[s] be a (p × m) polynomial matrix. Then the normal rank of Q(s), denoted normalrank(Q(s)), is the rank in R[s] or, equivalently, the maximum dimension of a square submatrix of Q(s) with nonzero determinant in R[s].

In short, we sometimes say that a polynomial matrix Q(s) has rank(Q(s)) in R[s] when we refer to the normal rank of Q(s).

To show the difference between the normal rank of a polynomial matrix and the rank of the polynomial matrix evaluated at a certain point, consider

Q(s) = [ s    1
         s^2  1 ].

Then Q(s) has normal rank 2 since det Q(s) = s - s^2 ≢ 0. However, Q(0) has rank 1.

It is a fact in linear algebra that any polynomial matrix can be reduced to a so-called Smith form through some pre- and post-unimodular operations [cf. Kailath, 1980, pp. 391].

Lemma 3.25 (Smith form) Let P(s) ∈ R[s] be any polynomial matrix. Then there exist unimodular matrices U(s), V(s) ∈ R[s] such that

U(s)P(s)V(s) = S(s) := [ α1(s)  0      ⋯  0      0
                         0      α2(s)  ⋯  0      0
                         ⋮      ⋮      ⋱  ⋮      ⋮
                         0      0      ⋯  αr(s)  0
                         0      0      ⋯  0      0 ]

and αi(s) divides αi+1(s).

S(s) is called the Smith form of P(s). It is also clear that r is the normal rank of P(s). We shall illustrate the procedure of obtaining a Smith form by an example. Let

P(s) = [ s+1   (s+1)(2s+1)        s(s+1)
         s+2   (s+2)(s^2+5s+3)    s(s+2)
         1     2s+1               s      ].

The polynomial matrix P(s) has normal rank 2 since

det(P(s)) ≡ 0,   det [ s+1  (s+1)(2s+1) ; s+2  (s+2)(s^2+5s+3) ] = (s+1)^2 (s+2)^2 ≢ 0.


First interchange the first row and the third row, and use elementary row operations to zero out the s+1 and s+2 elements of P(s). This can be done by pre-multiplying P(s) by a unimodular matrix U:

U = [ 0  0  1
      0  1  -(s+2)
      1  0  -(s+1) ].

Then

P1(s) := U(s)P(s) = [ 1  2s+1           s
                      0  (s+1)(s+2)^2   0
                      0  0              0 ].

Next use column operations to zero out the 2s+1 and s terms in P1. This can be done by post-multiplying P1(s) by a unimodular matrix V:

V(s) = [ 1  -(2s+1)  -s
         0  1        0
         0  0        1 ]

and

P1(s)V(s) = [ 1  0              0
              0  (s+1)(s+2)^2   0
              0  0              0 ].

Then we have

S(s) = U(s)P(s)V(s) = [ 1  0              0
                        0  (s+1)(s+2)^2   0
                        0  0              0 ].
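The row and column operations of this example are easy to verify symbolically. The following SymPy sketch (not part of the original text) checks that U and V are unimodular and that U P V equals the Smith form computed above:

```python
import sympy as sp

s = sp.symbols('s')

P = sp.Matrix([[s + 1, (s + 1)*(2*s + 1), s*(s + 1)],
               [s + 2, (s + 2)*(s**2 + 5*s + 3), s*(s + 2)],
               [1, 2*s + 1, s]])

U = sp.Matrix([[0, 0, 1],
               [0, 1, -(s + 2)],
               [1, 0, -(s + 1)]])

V = sp.Matrix([[1, -(2*s + 1), -s],
               [0, 1, 0],
               [0, 0, 1]])

S = sp.expand(U * P * V)

# Both U and V are unimodular: their determinants are nonzero constants.
print(U.det(), V.det())           # -1 1
print(S.applyfunc(sp.factor))     # diag(1, (s+1)*(s+2)**2, 0)
```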

Similarly, let Rp(s) denote the set of rational proper transfer matrices.¹ Then any real rational transfer matrix can be reduced to a so-called McMillan form through some pre- and post-unimodular operations.

Lemma 3.26 (McMillan form) Let G(s) ∈ Rp(s) be any proper real rational transfer matrix. Then there exist unimodular matrices U(s), V(s) ∈ R[s] such that

U(s)G(s)V(s) = M(s) := [ α1(s)/β1(s)  0             ⋯  0             0
                         0            α2(s)/β2(s)   ⋯  0             0
                         ⋮            ⋮             ⋱  ⋮             ⋮
                         0            0             ⋯  αr(s)/βr(s)   0
                         0            0             ⋯  0             0 ]

and αi(s) divides αi+1(s) and βi+1(s) divides βi(s).

¹It is obvious that a similar notion of normal rank can be extended to a matrix in Rp(s). In particular, let G(s) ∈ Rp(s) be a rational matrix and write G(s) = N(s)/d(s) such that d(s) is a polynomial and N(s) is a polynomial matrix; then G(s) is said to have normal rank r if N(s) has normal rank r.


Proof. Write the transfer matrix G(s) as G(s) = N(s)/d(s), where d(s) is a scalar polynomial and N(s) is a p × m polynomial matrix, and let the Smith form of N(s) be S(s) = U(s)N(s)V(s). The conclusion follows by letting M(s) = S(s)/d(s). □

Definition 3.12 The number Σi deg(βi(s)) is called the McMillan degree of G(s), where deg(βi(s)) denotes the degree of the polynomial βi(s), i.e., the highest power of s in βi(s).

The McMillan degree of a transfer matrix is closely related to the dimension of a minimal realization of G(s). In fact, it can be shown that the dimension of a minimal realization of G(s) is exactly the McMillan degree of G(s).

Definition 3.13 The roots of all the polynomials βi(s) in the McMillan form of G(s) are called the poles of G.

Let (A, B, C, D) be a minimal realization of G(s). Then it is fairly easy to show that a complex number is a pole of G(s) if and only if it is an eigenvalue of A.

Definition 3.14 The roots of all the polynomials αi(s) in the McMillan form of G(s) are called the transmission zeros of G(s). A complex number z0 ∈ C is called a blocking zero of G(s) if G(z0) = 0.

It is clear that a blocking zero is a transmission zero. Moreover, for a scalar transfer function, the blocking zeros and the transmission zeros are the same.

We now illustrate the above concepts through an example. Consider a 3 × 3 transfer matrix:

G(s) = [ 1/((s+1)(s+2))      (2s+1)/((s+1)(s+2))      s/((s+1)(s+2))
         1/(s+1)^2           (s^2+5s+3)/(s+1)^2       s/(s+1)^2
         1/((s+1)^2(s+2))    (2s+1)/((s+1)^2(s+2))    s/((s+1)^2(s+2)) ].

Then G(s) can be written as

G(s) = 1/((s+1)^2(s+2)) [ s+1   (s+1)(2s+1)       s(s+1)
                          s+2   (s+2)(s^2+5s+3)   s(s+2)
                          1     2s+1              s      ] := N(s)/d(s).


Since N(s) is exactly the same as the P(s) in the previous example, it is clear that G(s) has the McMillan form

M(s) = U(s)G(s)V(s) = [ 1/((s+1)^2(s+2))   0             0
                        0                  (s+2)/(s+1)   0
                        0                  0             0 ]

and G(s) has McMillan degree 4. The poles of the transfer matrix are {-1, -1, -1, -2} and the transmission zero is {-2}. Note that the transfer matrix has a pole and a zero at the same location {-2}; this is a unique feature of multivariable systems.

To get a minimal state space realization for G(s), note that G(s) has the following partial fraction expansion:

G(s) = [ 0 0 0 ; 0 1 0 ; 0 0 0 ]
     + 1/(s+1)   [ 1 ; 0 ; 0 ] [ 1 -1 -1 ]
     + 1/(s+1)   [ 0 ; 1 ; 0 ] [ 0 3 1 ]
     + 1/(s+1)   [ 0 ; 0 ; 1 ] [ -1 3 2 ]
     + 1/(s+1)^2 [ 0 ; 1 ; 1 ] [ 1 -1 -1 ]
     + 1/(s+2)   [ -1 ; 0 ; 1 ] [ 1 -3 -2 ].

Since there are repeated poles at -1, Gilbert's realization procedure described in the last section cannot be used directly. Nevertheless, a careful inspection of the partial fraction expansion results in a 4-th order minimal state space realization:

G(s) = [ -1  0  1  0 |  0  3  1
          0 -1  1  0 | -1  3  2
          0  0 -1  0 |  1 -1 -1
          0  0  0 -2 |  1 -3 -2
         ----------------------
          0  0  1 -1 |  0  0  0
          1  0  0  0 |  0  1  0
          0  1  0  1 |  0  0  0 ].

We remind readers that there are many different definitions of system zeros. The definitions introduced here are the most common ones and are useful in this book.

Lemma 3.27 Let G(s) be a p × m proper transfer matrix with full column normal rank. Then z0 ∈ C is a transmission zero of G(s) if and only if there exists 0 ≠ u0 ∈ C^m such that G(z0)u0 = 0.


Proof. We shall outline a proof of this lemma. We first show that there is a vector u0 ∈ C^m such that G(z0)u0 = 0 if z0 ∈ C is a transmission zero. Without loss of generality, assume

G(s) = U1(s) [ α1(s)/β1(s)  0             ⋯  0
               0            α2(s)/β2(s)   ⋯  0
               ⋮            ⋮             ⋱  ⋮
               0            0             ⋯  αm(s)/βm(s)
               0            0             ⋯  0            ] V1(s)

for some unimodular matrices U1(s) and V1(s), and suppose z0 is a zero of α1(s), i.e., α1(z0) = 0. Let

u0 = V1^{-1}(z0) e1 ≠ 0,

where e1 = [1, 0, 0, …]^* ∈ R^m. Then it is easy to verify that G(z0)u0 = 0. On the other hand, suppose there is a u0 ∈ C^m such that G(z0)u0 = 0. Then

U1(z0) [ α1(z0)/β1(z0)  0               ⋯  0
         0              α2(z0)/β2(z0)   ⋯  0
         ⋮              ⋮               ⋱  ⋮
         0              0               ⋯  αm(z0)/βm(z0)
         0              0               ⋯  0              ] V1(z0) u0 = 0.

Define

[ u1 ; u2 ; … ; um ] = V1(z0) u0 ≠ 0.

Then

[ α1(z0)u1 ; α2(z0)u2 ; … ; αm(z0)um ] = 0.

This implies that z0 must be a root of one of the polynomials αi(s), i = 1, …, m. □

Note that the lemma may not be true if G(s) does not have full column normal rank. This can be seen from the following example. Consider

G(s) = 1/(s+1) [ 1 1 ; 1 1 ],   u0 = [ 1 ; -1 ].


It is easy to see that G has no transmission zeros, but G(s)u0 = 0 for all s. It should also be noted that the above lemma applies even if z0 is a pole of G(s), although G(z0) is not defined; the reason is that G(z0)u0 may still be well defined. For example,

G(s) = [ (s-1)/(s+1)  0 ; 0  (s+2)/(s-1) ],   u0 = [ 1 ; 0 ].

Then G(1)u0 = 0. Therefore, 1 is a transmission zero. Similarly, we have the following lemma:

Lemma 3.28 Let G(s) be a p × m proper transfer matrix with full row normal rank. Then z0 ∈ C is a transmission zero of G(s) if and only if there exists 0 ≠ η0 ∈ C^p such that η0^* G(z0) = 0.

In the case where the transmission zero is not a pole of G(s), we can give a useful alternative characterization of the transfer matrix transmission zeros. Furthermore, G(s) is not required to have full column (or row) normal rank in this case.

The following lemma is easy to show from the definition of zeros.

Lemma 3.29 Suppose z0 ∈ C is not a pole of G(s). Then z0 is a transmission zero if and only if rank(G(z0)) < normalrank(G(s)).

Corollary 3.30 Let G(s) be a square m × m proper transfer matrix with det G(s) ≢ 0. Suppose z0 ∈ C is not a pole of G(s). Then z0 ∈ C is a transmission zero of G(s) if and only if det G(z0) = 0.

Using the above corollary, we can confirm that the example in the last section does have a zero at √2, since

det [ 1/(s+1)  1/(s+2) ; 2/(s+2)  1/(s+1) ] = (2 - s^2)/((s+1)^2 (s+2)^2).

Note that the above corollary may not be true if z0 is a pole of G. For example,

G(s) = [ (s-1)/(s+1)  0 ; 0  (s+2)/(s-1) ]

has a zero at 1 which is not a zero of det G(s).

The poles and zeros of a transfer matrix can also be characterized in terms of its state space realizations. Let [A B; C D] be a state space realization of G(s).


De�nition 3.15 The eigenvalues of A are called the poles of the realization of G(s).

To define zeros, let us consider the following system matrix:

Q(s) = [ A - sI  B ; C  D ].

Definition 3.16 A complex number z0 ∈ C is called an invariant zero of the system realization if it satisfies

rank [ A - z0 I  B ; C  D ] < normalrank [ A - sI  B ; C  D ].

The invariant zeros are not changed by constant state feedback since

rank [ A + BF - z0 I  B ; C + DF  D ] = rank ( [ A - z0 I  B ; C  D ] [ I 0 ; F I ] ) = rank [ A - z0 I  B ; C  D ].

It is also clear that invariant zeros are not changed under similarity transformation.
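Numerically, the invariant zeros can be found as the finite generalized eigenvalues of the pencil (M, N) with M = [A B; C D] and N = [I 0; 0 0], since for a square system det(M − zN) = det [A − zI  B; C  D]. A sketch (not from the original text; the example realization is arbitrary):

```python
import numpy as np
from scipy.linalg import eig, block_diag

# Arbitrary example: G(s) = 1/(s+1) + 1/(s+2) = (2s+3)/((s+1)(s+2)),
# whose only zero is at s = -1.5.
A = np.array([[-1.0, 0.0],
              [0.0, -2.0]])
B = np.array([[1.0],
              [1.0]])
C = np.array([[1.0, 1.0]])
D = np.array([[0.0]])

n = A.shape[0]
M = np.block([[A, B], [C, D]])              # system matrix [A B; C D]
N = block_diag(np.eye(n), np.zeros(D.shape))

# Invariant zeros = finite generalized eigenvalues of (M, N).
w = eig(M, N, right=False)
zeros = w[np.isfinite(w)]
print(np.round(zeros.real, 6))  # [-1.5]
```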

The following lemma is obvious.

Lemma 3.31 Suppose [A - sI B; C D] has full column normal rank. Then z0 ∈ C is an invariant zero of a realization (A, B, C, D) if and only if there exist 0 ≠ x ∈ C^n and u ∈ C^m such that

[ A - z0 I  B ; C  D ] [ x ; u ] = 0.

Moreover, if u = 0, then z0 is also a non-observable mode.

Proof. By definition, z0 is an invariant zero if there is a vector [x ; u] ≠ 0 such that

[ A - z0 I  B ; C  D ] [ x ; u ] = 0,

since [A - sI B; C D] has full column normal rank.

On the other hand, suppose z0 is an invariant zero; then there is a vector [x ; u] ≠ 0 such that

[ A - z0 I  B ; C  D ] [ x ; u ] = 0.

We claim that x ≠ 0. Otherwise, [B ; D] u = 0, and hence u = 0 since [A - sI B; C D] has full column normal rank, i.e., [x ; u] = 0, which is a contradiction.


Finally, note that if u = 0, then

[ A - z0 I ; C ] x = 0

and z0 is a non-observable mode by the PBH test. □

Lemma 3.32 Suppose [A - sI B; C D] has full row normal rank. Then z0 ∈ C is an invariant zero of a realization (A, B, C, D) if and only if there exist 0 ≠ y ∈ C^n and v ∈ C^p such that

[ y^*  v^* ] [ A - z0 I  B ; C  D ] = 0.

Moreover, if v = 0, then z0 is also a non-controllable mode.

Lemma 3.33 G(s) has full column (row) normal rank if and only if [A - sI B; C D] has full column (row) normal rank.

Proof. This follows by noting that

[ A - sI  B ; C  D ] = [ I  0 ; C(A - sI)^{-1}  I ] [ A - sI  B ; 0  G(s) ]

and

normalrank [ A - sI  B ; C  D ] = n + normalrank(G(s)). □

Theorem 3.34 Let G(s) be a real rational proper transfer matrix and let (A, B, C, D) be a corresponding minimal realization. Then a complex number z0 is a transmission zero of G(s) if and only if it is an invariant zero of the minimal realization.

Proof. We will give a proof only for the case where the transmission zero is not a pole of G(s). Then, of course, z0 is not an eigenvalue of A since the realization is minimal. Note that

[ A - sI  B ; C  D ] = [ I  0 ; C(A - sI)^{-1}  I ] [ A - sI  B ; 0  G(s) ].

Since we have assumed that z0 is not a pole of G(s), we have

rank [ A - z0 I  B ; C  D ] = n + rank G(z0).


Hence

rank [ A - z0 I  B ; C  D ] < normalrank [ A - sI  B ; C  D ]

if and only if rank G(z0) < normalrank G(s). The conclusion then follows from Lemma 3.29. □

Note that the minimality assumption is essential for the converse statement. For example, consider a transfer matrix G(s) = D (constant) and a realization G(s) = [A 0; C D], where A is any square matrix of any dimension and C is any matrix with compatible dimensions. Then G(s) has no poles or zeros, but every eigenvalue of A is an invariant zero of the realization [A 0; C D].

Nevertheless, we have the following corollary if a realization of the transfer matrix is not minimal.

Corollary 3.35 Every transmission zero of a transfer matrix G(s) is an invariant zero of all its realizations, and every pole of a transfer matrix G(s) is a pole of all its realizations.

Lemma 3.36 Let G(s) ∈ Rp(s) be a p × m transfer matrix and let (A, B, C, D) be a minimal realization. If the system input is of the form u(t) = u0 e^{λt}, where λ ∈ C is not a pole of G(s) and u0 ∈ C^m is an arbitrary constant vector, then the output due to the input u(t) and the initial state x(0) = (λI - A)^{-1}Bu0 is y(t) = G(λ)u0 e^{λt}, ∀t ≥ 0.

Proof. The system response with respect to the input u(t) = u0 e^{λt} and the initial condition x(0) = (λI - A)^{-1}Bu0 is (in terms of the Laplace transform):

Y(s) = C(sI - A)^{-1}x(0) + C(sI - A)^{-1}BU(s) + DU(s)
     = C(sI - A)^{-1}x(0) + C(sI - A)^{-1}Bu0(s - λ)^{-1} + Du0(s - λ)^{-1}
     = C(sI - A)^{-1}(x(0) - (λI - A)^{-1}Bu0) + G(λ)u0(s - λ)^{-1}
     = G(λ)u0(s - λ)^{-1}.

Hence y(t) = G(λ)u0 e^{λt}. □

Combining the above two lemmas, we have the following result, which gives a dynamical interpretation of a system's transmission zeros.

Corollary 3.37 Let G(s) ∈ Rp(s) be a p × m transfer matrix and let (A, B, C, D) be a minimal realization. Suppose that z0 ∈ C is a transmission zero of G(s) and is not a pole of G(s). Then for any nonzero vector u0 ∈ C^m with G(z0)u0 = 0, the output of the system due to the initial state x(0) = (z0 I - A)^{-1}Bu0 and the input u = u0 e^{z0 t} is identically zero: y(t) = G(z0)u0 e^{z0 t} = 0.
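This zero-blocking property can be illustrated by simulation. In the sketch below (not part of the original text), G(s) = (2s+3)/((s+1)(s+2)) is an arbitrary example with transmission zero z0 = -1.5, and the simulated output stays (numerically) identically zero:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Arbitrary example: G(s) = (2s+3)/((s+1)(s+2)), transmission zero
# at z0 = -1.5 (which is not a pole); G(z0) = 0 for this scalar system.
A = np.array([[-1.0, 0.0],
              [0.0, -2.0]])
B = np.array([[1.0],
              [1.0]])
C = np.array([[1.0, 1.0]])
z0, u0 = -1.5, 1.0

# Initial state x(0) = (z0 I - A)^{-1} B u0.
x0 = np.linalg.solve(z0 * np.eye(2) - A, B * u0).ravel()

# Simulate xdot = A x + B u0 exp(z0 t); the output y = C x stays zero.
sol = solve_ivp(lambda t, x: A @ x + (B * u0 * np.exp(z0 * t)).ravel(),
                (0.0, 5.0), x0, rtol=1e-10, atol=1e-12)
y = (C @ sol.y).ravel()
print(np.max(np.abs(y)) < 1e-6)  # True: the zero blocks this input
```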


The following lemma characterizes the relationship between the zeros of a transfer function and the poles of its inverse.

Lemma 3.38 Suppose that G = [A B; C D] is a square transfer matrix with D nonsingular, and suppose z0 is not an eigenvalue of A (note that the realization is not necessarily minimal). Then there exists x0 such that

(A - BD^{-1}C)x0 = z0 x0,   Cx0 ≠ 0

iff there exists u0 ≠ 0 such that

G(z0)u0 = 0.

Proof. (⇐) G(z0)u0 = 0 implies that

G^{-1}(s) = [ A - BD^{-1}C   -BD^{-1} ; D^{-1}C   D^{-1} ]

has a pole at z0 which is observable. Then, by definition, there exists x0 such that

(A - BD^{-1}C)x0 = z0 x0 and Cx0 ≠ 0.

(⇒) Set u0 = -D^{-1}Cx0 ≠ 0. Then

(z0 I - A)x0 = -BD^{-1}Cx0 = Bu0.

Using this equality, one gets

G(z0)u0 = C(z0 I - A)^{-1}Bu0 + Du0 = Cx0 - Cx0 = 0. □

The above lemma implies that z0 is a zero of an invertible G(s) if and only if it is a pole of G^{-1}(s).

3.12 Notes and References

Readers are referred to Brogan [1991], Chen [1984], Kailath [1980], and Wonham [1985] for extensive treatments of the standard linear system theory. The balanced realization was first introduced by Mullis and Roberts [1976] to study the roundoff noise in digital filters. Moore [1981] proposed the balanced truncation method for model reduction, which will be considered in Chapter 7.


4 Performance Specifications

The most important objective of a control system is to achieve certain performance specifications in addition to providing internal stability. One way to describe the performance specifications of a control system is in terms of the size of certain signals of interest. For example, the performance of a tracking system could be measured by the size of the tracking error signal. In this chapter, we look at several ways of defining a signal's size, i.e., at several norms for signals. Of course, which norm is most appropriate depends on the situation at hand. For that purpose, we shall first introduce some normed spaces and some basic notions of linear operator theory; in particular, the Hardy spaces H2 and H∞ are introduced. We then consider the performance of a system under various input signals and derive the worst possible outputs for the classes of input signals under consideration. We show that the H2 and H∞ norms arise naturally as measures of the worst possible performance for many classes of input signals. Some state space methods of computing real rational H2 and H∞ transfer matrix norms are also presented.

4.1 Normed Spaces

Let V be a vector space over C (or R) and let ‖·‖ be a norm defined on V. Then V is a normed space. For example, the vector space C^n with any vector p-norm ‖·‖_p, 1 ≤ p ≤ ∞, is a normed space. As another example, consider the linear vector space C[a, b] of all bounded continuous functions on the real interval [a, b]. Then C[a, b] becomes a normed space if a supremum norm is defined on the space:

‖f‖_∞ := sup_{t ∈ [a,b]} |f(t)|.

A sequence {xn} in a normed space V is called a Cauchy sequence if ‖xn - xm‖ → 0 as n, m → ∞. A sequence {xn} is said to converge to x ∈ V, written xn → x, if ‖xn - x‖ → 0. A normed space V is said to be complete if every Cauchy sequence in V converges in V. A complete normed space is called a Banach space. For example, R^n and C^n with the usual spatial p-norm ‖·‖_p, 1 ≤ p ≤ ∞, are Banach spaces. (The notation ‖·‖_p used here should not be confused with the same notation used below for function spaces; usually the context will make the meaning clear.)

The following are some more examples of Banach spaces:

lp[0, ∞) spaces for 1 ≤ p < ∞:

For each 1 ≤ p < ∞, lp[0, ∞) consists of all sequences x = (x0, x1, …) such that Σ_{i=0}^∞ |xi|^p < ∞. The associated norm is defined as

‖x‖_p := ( Σ_{i=0}^∞ |xi|^p )^{1/p}.

l∞[0, ∞) space:

l∞[0, ∞) consists of all bounded sequences x = (x0, x1, …), and the l∞ norm is defined as

‖x‖_∞ := sup_i |xi|.

Lp(I) spaces for 1 ≤ p ≤ ∞:

For each 1 ≤ p < ∞, Lp(I) consists of all Lebesgue measurable functions x(t) defined on an interval I ⊂ R such that

‖x‖_p := ( ∫_I |x(t)|^p dt )^{1/p} < ∞, for 1 ≤ p < ∞,

and

‖x‖_∞ := ess sup_{t ∈ I} |x(t)|.

Some of these spaces, for example L2(-∞, 0], L2[0, ∞), and L2(-∞, ∞), will be discussed in more detail later on.

C[a, b] space:

C[a, b] consists of all continuous functions on the real interval [a, b] with the norm defined as

‖x‖_∞ := sup_{t ∈ [a,b]} |x(t)|.
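For sampled signals, the Lp norms above can be approximated by Riemann sums. A small sketch (not part of the original text; the signal is an arbitrary example):

```python
import numpy as np

# Sampled exponential x(t) = exp(-t) on [0, 10] (example signal).
dt = 1e-3
t = np.arange(0.0, 10.0, dt)
x = np.exp(-t)

# Approximate the L1, L2 and Linf norms of x.
l1 = np.sum(np.abs(x)) * dt                  # ~ integral of exp(-t) = 1
l2 = np.sqrt(np.sum(np.abs(x) ** 2) * dt)    # ~ sqrt(1/2) ~ 0.707
linf = np.max(np.abs(x))                     # = 1, attained at t = 0

print(round(l1, 3), round(l2, 3), round(linf, 3))
```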


Note that if each component or function is itself a vector or matrix, then the corre-sponding Banach space can also be formed by replacing the absolute value j � j of eachcomponent or function with its spatially normed component or function. For example,consider a vector space with all sequences in the form of

x = (x0; x1; : : :)

where each component xi is a k �m matrix and each element of xi is bounded. Thenxi is bounded in any matrix norm, and the vector space becomes a Banach space if thefollowing norm is de�ned

kxk1 := supi�i(xi)

where �i(xi) := kxik is any matrix norm. This space will also be denoted by l1.
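As a small numerical illustration of the $p$-norms above (a sketch in plain Python, with the vector and helper name `p_norm` our own choices), the $p$-norm of a fixed vector decreases toward the $\infty$-norm $\sup_i |x_i|$ as $p$ grows:

```python
# Illustration of vector p-norms and their limiting case, the infinity-norm.
x = [3.0, -4.0, 1.0]

def p_norm(x, p):
    # ||x||_p = (sum_i |x_i|^p)^(1/p)
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

inf_norm = max(abs(v) for v in x)   # ||x||_inf = 4
print(p_norm(x, 1))                 # 8.0
print(p_norm(x, 2))                 # sqrt(26), about 5.099
print(p_norm(x, 50))                # already very close to ||x||_inf = 4
```

This also illustrates the standard ordering $\|x\|_\infty \le \|x\|_p \le \|x\|_1$ for a fixed finite-dimensional vector.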

Let $V_1$ and $V_2$ be two vector spaces and let $T$ be an operator from $S \subset V_1$ into $V_2$. The operator $T$ is said to be linear if for any $x_1, x_2 \in S$ and scalars $\alpha, \beta \in \mathbb{C}$,
$$T(\alpha x_1 + \beta x_2) = \alpha(Tx_1) + \beta(Tx_2).$$
Moreover, let $V_0$ be a linear subspace of $V_1$. Then the operator $T_0 : V_0 \mapsto V_2$ defined by $T_0 x = Tx$ for every $x \in V_0$ is called the restriction of $T$ to $V_0$ and is denoted $T|_{V_0} = T_0$. Conversely, a linear operator $T : V_1 \mapsto V_2$ coinciding with $T_0$ on $V_0 \subset V_1$ is called an extension of $T_0$. For example, let
$$V_0 := \left\{ \begin{bmatrix} x_1 \\ 0 \end{bmatrix} : x_1 \in \mathbb{C}^n \right\} \subset V_1 = \mathbb{C}^{n+m},$$
and let
$$T = \begin{bmatrix} A_1 & A_2 \\ 0 & A_3 \end{bmatrix} \in \mathbb{C}^{(n+m)\times(n+m)}, \qquad T_0 = \begin{bmatrix} A_1 & 0 \\ 0 & 0 \end{bmatrix} \in \mathbb{C}^{(n+m)\times(n+m)}.$$
Then $T|_{V_0} = T_0$.

Definition 4.1 Two normed spaces $V_1$ and $V_2$ are said to be linearly isometric, denoted $V_1 \cong V_2$, if there exists a linear operator $T$ of $V_1$ onto $V_2$ such that
$$\|Tx\| = \|x\|$$
for all $x$ in $V_1$. In this case, the mapping $T$ is said to be an isometric isomorphism.

4.2 Hilbert Spaces

Recall the inner product of vectors defined on a Euclidean space $\mathbb{C}^n$:
$$\langle x, y \rangle := x^* y = \sum_{i=1}^{n} \bar{x}_i y_i \quad \forall\, x = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix},\; y = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} \in \mathbb{C}^n.$$

Note that many important metric notions and geometrical properties, such as length, distance, angle, and the energy of physical systems, can be deduced from this inner product. For instance, the length of a vector $x \in \mathbb{C}^n$ is defined as
$$\|x\| := \sqrt{\langle x, x \rangle},$$
and the angle between two vectors $x, y \in \mathbb{C}^n$ can be computed from
$$\cos \angle(x,y) = \frac{\langle x, y \rangle}{\|x\|\,\|y\|}, \qquad \angle(x,y) \in [0, \pi].$$
The two vectors are said to be orthogonal if $\angle(x,y) = \frac{\pi}{2}$.

We now consider a natural generalization of the inner product on $\mathbb{C}^n$ to more general (possibly infinite dimensional) vector spaces.

Definition 4.2 Let $V$ be a vector space over $\mathbb{C}$. An inner product on $V$ is a complex-valued function
$$\langle \cdot, \cdot \rangle : V \times V \mapsto \mathbb{C}$$
such that for any $x, y, z \in V$ and $\alpha, \beta \in \mathbb{C}$:

(i) $\langle x, \alpha y + \beta z \rangle = \alpha \langle x, y \rangle + \beta \langle x, z \rangle$;^1
(ii) $\langle x, y \rangle = \overline{\langle y, x \rangle}$;
(iii) $\langle x, x \rangle > 0$ if $x \neq 0$.

A vector space $V$ with an inner product is called an inner product space.

It is clear that the inner product defined above induces a norm $\|x\| := \sqrt{\langle x, x \rangle}$, so that the norm conditions of Chapter 2 are satisfied. In particular, the distance between vectors $x$ and $y$ is $d(x,y) = \|x - y\|$.

Two vectors $x$ and $y$ in an inner product space $V$ are said to be orthogonal if $\langle x, y \rangle = 0$, denoted $x \perp y$. More generally, a vector $x$ is said to be orthogonal to a set $S \subset V$, denoted $x \perp S$, if $x \perp y$ for all $y \in S$.

The inner product and the inner-product induced norm have the following familiar properties.

Theorem 4.1 Let $V$ be an inner product space and let $x, y \in V$. Then

(i) $|\langle x, y \rangle| \le \|x\|\,\|y\|$ (Cauchy-Schwarz inequality). Moreover, equality holds if and only if $x = \alpha y$ for some constant $\alpha$ or $y = 0$.
(ii) $\|x+y\|^2 + \|x-y\|^2 = 2\|x\|^2 + 2\|y\|^2$ (parallelogram law).
(iii) $\|x+y\|^2 = \|x\|^2 + \|y\|^2$ if $x \perp y$.

^1 This is the other way round from the usual mathematical convention, since we want $\langle x, y \rangle = x^* y$ rather than $y^* x$ for $x, y \in \mathbb{C}^n$.
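The properties in Theorem 4.1 can be checked numerically on $\mathbb{C}^2$. The following sketch (our own illustration, using the book's convention $\langle x, y \rangle = x^* y$ with conjugation on the first argument) verifies Cauchy-Schwarz and the parallelogram law on a pair of complex vectors:

```python
# Check Theorem 4.1 (i) and (ii) on C^2 with <x,y> = x* y.
def inner(x, y):
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))

def norm(x):
    return abs(inner(x, x)) ** 0.5

x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 1j]

# (i) Cauchy-Schwarz: |<x,y>| <= ||x|| ||y||
print(abs(inner(x, y)) <= norm(x) * norm(y))

# (ii) parallelogram law: ||x+y||^2 + ||x-y||^2 = 2||x||^2 + 2||y||^2
s = [a + b for a, b in zip(x, y)]
d = [a - b for a, b in zip(x, y)]
lhs = norm(s) ** 2 + norm(d) ** 2
rhs = 2 * norm(x) ** 2 + 2 * norm(y) ** 2
print(abs(lhs - rhs) < 1e-12)
```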

A Hilbert space is a complete inner product space with the norm induced by its inner product. Obviously, a Hilbert space is also a Banach space. For example, $\mathbb{C}^n$ with the usual inner product is a (finite dimensional) Hilbert space. More generally, it is straightforward to verify that $\mathbb{C}^{n\times m}$ with the inner product defined as
$$\langle A, B \rangle := \operatorname{Trace} A^* B = \sum_{i=1}^{n} \sum_{j=1}^{m} \bar{a}_{ij} b_{ij} \quad \forall A, B \in \mathbb{C}^{n\times m}$$
is also a (finite dimensional) Hilbert space.

Here are some examples of infinite dimensional Hilbert spaces:

$l_2(-\infty,\infty)$:

$l_2(-\infty,\infty)$ consists of the set of all real or complex square summable sequences
$$x = (\ldots, x_{-2}, x_{-1}, x_0, x_1, x_2, \ldots),$$
i.e.,
$$\sum_{i=-\infty}^{\infty} |x_i|^2 < \infty,$$
with the inner product defined as
$$\langle x, y \rangle := \sum_{i=-\infty}^{\infty} \bar{x}_i y_i$$
for $x, y \in l_2(-\infty,\infty)$. The subspaces $l_2(-\infty,0)$ and $l_2[0,\infty)$ of $l_2(-\infty,\infty)$ are defined similarly and consist of sequences of the form $x = (\ldots, x_{-2}, x_{-1})$ and $x = (x_0, x_1, x_2, \ldots)$, respectively.

Note that we can also define a corresponding Hilbert space even if each component $x_i$ is a vector or a matrix; in fact, the following inner product will suffice:
$$\langle x, y \rangle := \sum_{i=-\infty}^{\infty} \operatorname{Trace}(x_i^* y_i).$$

$L_2(I)$ for $I \subset \mathbb{R}$:

$L_2(I)$ consists of all square integrable and Lebesgue measurable functions defined on an interval $I \subset \mathbb{R}$ with the inner product defined as
$$\langle f, g \rangle := \int_I f(t)^* g(t)\, dt$$
for $f, g \in L_2(I)$. Similarly, if the function is vector or matrix valued, the inner product is defined correspondingly as
$$\langle f, g \rangle := \int_I \operatorname{Trace}[f(t)^* g(t)]\, dt.$$

Some frequently used spaces in this book are $L_2[0,\infty)$, $L_2(-\infty,0]$, and $L_2(-\infty,\infty)$. More precisely, they are defined as

$L_2 = L_2(-\infty,\infty)$: Hilbert space of matrix-valued functions on $\mathbb{R}$, with inner product
$$\langle f, g \rangle := \int_{-\infty}^{\infty} \operatorname{Trace}[f(t)^* g(t)]\, dt.$$

$L_{2+} = L_2[0,\infty)$: subspace of $L_2(-\infty,\infty)$ with functions zero for $t < 0$.

$L_{2-} = L_2(-\infty,0]$: subspace of $L_2(-\infty,\infty)$ with functions zero for $t > 0$.

Let $H$ be a Hilbert space and $M \subset H$ a subset. Then the orthogonal complement of $M$, denoted $H \ominus M$ or $M^\perp$, is defined as
$$M^\perp = \{x \in H : \langle x, y \rangle = 0,\ \forall y \in M\}.$$
It can be shown that $M^\perp$ is closed (hence $M^\perp$ is also a Hilbert space). For example, let $M = L_{2+} \subset L_2$; then $M^\perp = L_{2-}$ is a Hilbert space.

Let $M$ and $N$ be subspaces of a vector space $V$. $V$ is said to be the direct sum of $M$ and $N$, written $V = M \oplus N$, if $M \cap N = \{0\}$ and every element $v \in V$ can be expressed as $v = x + y$ with $x \in M$ and $y \in N$. If $V$ is an inner product space and $M$ and $N$ are orthogonal, then $V$ is said to be the orthogonal direct sum of $M$ and $N$. As an example, it is easy to see that $L_2$ is the orthogonal direct sum of $L_{2-}$ and $L_{2+}$. Similarly, $l_2(-\infty,\infty)$ is the orthogonal direct sum of $l_2(-\infty,0)$ and $l_2[0,\infty)$.

The following is a version of the so-called orthogonal projection theorem:

Theorem 4.2 Let $H$ be a Hilbert space, and let $M$ be a closed subspace of $H$. Then for each vector $v \in H$, there exist unique vectors $x \in M$ and $y \in M^\perp$ such that $v = x + y$, i.e., $H = M \oplus M^\perp$. Moreover, $x \in M$ is the unique vector such that $d(v, M) = \|v - x\|$.

Let $H_1$ and $H_2$ be two Hilbert spaces, and let $A$ be a bounded linear operator from $H_1$ into $H_2$. Then there exists a unique linear operator $A^* : H_2 \mapsto H_1$ such that for all $x \in H_1$ and $y \in H_2$
$$\langle Ax, y \rangle = \langle x, A^* y \rangle.$$
$A^*$ is called the adjoint of $A$. Furthermore, $A$ is called self-adjoint if $A = A^*$.

Let $H$ be a Hilbert space and $M \subset H$ a closed subspace. A bounded operator $P$ mapping $H$ into itself is called the orthogonal projection onto $M$ if
$$P(x+y) = x, \quad \forall\, x \in M \text{ and } y \in M^\perp.$$
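A finite dimensional sketch of Theorem 4.2 and the orthogonal projection: take $H = \mathbb{R}^3$ and the (hypothetical, chosen for illustration) closed subspace $M = \operatorname{span}\{e_1, e_2\}$, so $Pv$ simply keeps the first two coordinates. The residual $v - Pv$ is orthogonal to $M$, and no other point of $M$ is closer to $v$:

```python
# Orthogonal projection onto M = span{(1,0,0),(0,1,0)} in R^3.
import math, random

def proj(v):
    # P v: the component of v lying in M
    return [v[0], v[1], 0.0]

def dist(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

v = [1.0, 2.0, 3.0]
x = proj(v)                           # x in M
y = [v[i] - x[i] for i in range(3)]   # y in M-perp
print(sum(x[i] * y[i] for i in range(3)))   # <x, y> = 0.0

# Sample other points of M: all are at least as far from v as x is.
random.seed(0)
closest = all(
    dist(v, [random.uniform(-5, 5), random.uniform(-5, 5), 0.0]) >= dist(v, x)
    for _ in range(1000))
print(closest)
```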

4.3 Hardy Spaces $H_2$ and $H_\infty$

Let $S \subset \mathbb{C}$ be an open set, and let $f(s)$ be a complex valued function defined on $S$:
$$f(s) : S \mapsto \mathbb{C}.$$
Then $f(s)$ is said to be analytic at a point $z_0$ in $S$ if it is differentiable at $z_0$ and also at each point in some neighborhood of $z_0$. It is a fact that if $f(s)$ is analytic at $z_0$ then $f$ has continuous derivatives of all orders at $z_0$. Hence, a function analytic at $z_0$ has a power series representation at $z_0$. The converse is also true: if a function has a power series representation at $z_0$, then it is analytic at $z_0$. A function $f(s)$ is said to be analytic in $S$ if it is analytic at each point of $S$. A matrix valued function is analytic in $S$ if every element of the matrix is analytic in $S$. For example, all real rational stable transfer matrices are analytic in the right-half plane, and $e^{-s}$ is analytic everywhere.

A well-known property of analytic functions is the so-called maximum modulus theorem.

Theorem 4.3 If $f(s)$ is defined and continuous on a closed, bounded set $S$ and analytic on the interior of $S$, then the maximum of $|f(s)|$ on $S$ is attained on the boundary of $S$, i.e.,
$$\max_{s \in S} |f(s)| = \max_{s \in \partial S} |f(s)|$$
where $\partial S$ denotes the boundary of $S$.

Next we consider some frequently used complex (matrix) function spaces.

$L_2(j\mathbb{R})$ Space

$L_2(j\mathbb{R})$, or simply $L_2$, is a Hilbert space of matrix-valued (or scalar-valued) functions on $j\mathbb{R}$; it consists of all complex matrix functions $F$ for which the integral below is bounded:
$$\int_{-\infty}^{\infty} \operatorname{Trace}[F^*(j\omega) F(j\omega)]\, d\omega < \infty.$$
The inner product for this Hilbert space is defined as
$$\langle F, G \rangle := \frac{1}{2\pi} \int_{-\infty}^{\infty} \operatorname{Trace}[F^*(j\omega) G(j\omega)]\, d\omega$$
for $F, G \in L_2$, and the inner product induced norm is given by
$$\|F\|_2 := \sqrt{\langle F, F \rangle}.$$
For example, all real rational strictly proper transfer matrices with no poles on the imaginary axis form a (non-closed) subspace of $L_2(j\mathbb{R})$, which is denoted by $\mathcal{RL}_2(j\mathbb{R})$ or simply $\mathcal{RL}_2$.

$H_2$ Space^2

$H_2$ is a (closed) subspace of $L_2(j\mathbb{R})$ with matrix functions $F(s)$ analytic in $\operatorname{Re}(s) > 0$ (the open right-half plane). The corresponding norm is defined as
$$\|F\|_2^2 := \sup_{\sigma > 0} \left\{ \frac{1}{2\pi} \int_{-\infty}^{\infty} \operatorname{Trace}[F^*(\sigma + j\omega) F(\sigma + j\omega)]\, d\omega \right\}.$$
It can be shown^3 that
$$\|F\|_2^2 = \frac{1}{2\pi} \int_{-\infty}^{\infty} \operatorname{Trace}[F^*(j\omega) F(j\omega)]\, d\omega.$$
Hence, we can compute the norm for $H_2$ just as we do for $L_2$. The real rational subspace of $H_2$, which consists of all strictly proper and real rational stable transfer matrices, is denoted by $\mathcal{RH}_2$.

$H_2^\perp$ Space

$H_2^\perp$ is the orthogonal complement of $H_2$ in $L_2$, i.e., the (closed) subspace of functions in $L_2$ that are analytic in the open left-half plane. The real rational subspace of $H_2^\perp$, which consists of all strictly proper rational transfer matrices with all poles in the open right-half plane, will be denoted by $\mathcal{RH}_2^\perp$. It is easy to see that if $G$ is a strictly proper, stable, and real rational transfer matrix, then $G \in H_2$ and $G^\sim \in H_2^\perp$. Most of our study in this book will be focused on the real rational case.

The $L_2$ spaces defined above in the frequency domain can be related to the $L_2$ spaces defined in the time domain. Recall that a function in an $L_2$ space in the time domain admits a bilateral Laplace (or Fourier) transform. In fact, this bilateral Laplace (or Fourier) transform yields an isometric isomorphism between the $L_2$ spaces in the time domain and the $L_2$ spaces in the frequency domain (this is what is called Parseval's relation, or the Plancherel theorem, in complex analysis):
$$L_2(-\infty,\infty) \cong L_2(j\mathbb{R})$$
$$L_2[0,\infty) \cong H_2$$
$$L_2(-\infty,0] \cong H_2^\perp.$$
As a result, if $g(t) \in L_2(-\infty,\infty)$ and if its Fourier (or bilateral Laplace) transform is $G(j\omega) \in L_2(j\mathbb{R})$, then
$$\|G\|_2 = \|g\|_2.$$

^2 The $H_2$ space and $H_\infty$ space defined below, together with the $H_p$ spaces, $p \ge 1$, which will not be introduced in this book, are usually called Hardy spaces, named after the mathematician G. H. Hardy (hence the notation $H$).

^3 See Francis [1987].

Hence, whenever there is no confusion, the notation of functions in the time domain and in the frequency domain will be used interchangeably.

Define an orthogonal projection
$$P_+ : L_2(-\infty,\infty) \mapsto L_2[0,\infty)$$
such that, for any function $f(t) \in L_2(-\infty,\infty)$, we have $g(t) = P_+ f(t)$ with
$$g(t) := \begin{cases} f(t), & \text{for } t \ge 0; \\ 0, & \text{for } t < 0. \end{cases}$$
In this book, $P_+$ will also be used to denote the projection from $L_2(j\mathbb{R})$ onto $H_2$. Similarly, define $P_-$ as another orthogonal projection from $L_2(-\infty,\infty)$ onto $L_2(-\infty,0]$ (or from $L_2(j\mathbb{R})$ onto $H_2^\perp$). The relationships between the $L_2$ spaces and the $H_2$ spaces are shown in Figure 4.1.

[Figure 4.1: Relationships among function spaces. The Laplace transform and its inverse connect $L_2(-\infty,\infty) \leftrightarrow L_2(j\mathbb{R})$, $L_2[0,\infty) \leftrightarrow H_2$, and $L_2(-\infty,0] \leftrightarrow H_2^\perp$, while $P_+$ and $P_-$ act as the vertical projections in each domain.]

Other classes of important complex matrix functions used in this book are those bounded on the imaginary axis.

$L_\infty(j\mathbb{R})$ Space

$L_\infty(j\mathbb{R})$, or simply $L_\infty$, is a Banach space of matrix-valued (or scalar-valued) functions that are (essentially) bounded on $j\mathbb{R}$, with norm
$$\|F\|_\infty := \operatorname{ess\,sup}_{\omega\in\mathbb{R}} \bar{\sigma}[F(j\omega)].$$
The rational subspace of $L_\infty$, denoted by $\mathcal{RL}_\infty(j\mathbb{R})$ or simply $\mathcal{RL}_\infty$, consists of all proper rational transfer matrices with no poles on the imaginary axis.

$H_\infty$ Space

$H_\infty$ is a (closed) subspace of $L_\infty$ with functions that are analytic in the open right-half plane and bounded on the imaginary axis. The $H_\infty$ norm is defined as
$$\|F\|_\infty := \sup_{\operatorname{Re}(s) > 0} \bar{\sigma}[F(s)] = \operatorname{ess\,sup}_{\omega\in\mathbb{R}} \bar{\sigma}[F(j\omega)].$$
The second equality can be regarded as a generalization of the maximum modulus theorem to matrix functions. The real rational subspace of $H_\infty$ is denoted by $\mathcal{RH}_\infty$, which consists of all proper and real rational stable transfer matrices.

$H_\infty^-$ Space

$H_\infty^-$ is a (closed) subspace of $L_\infty$ with functions that are analytic in the open left-half plane and bounded on the imaginary axis. The $H_\infty^-$ norm is defined as
$$\|F\|_\infty := \sup_{\operatorname{Re}(s) < 0} \bar{\sigma}[F(s)] = \operatorname{ess\,sup}_{\omega\in\mathbb{R}} \bar{\sigma}[F(j\omega)].$$
The real rational subspace of $H_\infty^-$ is denoted by $\mathcal{RH}_\infty^-$, which consists of all proper rational transfer matrices with all poles in the open right-half plane.

Definition 4.3 A transfer matrix $G(s) \in H_\infty^-$ is usually said to be antistable or anticausal.

Some facts about $L_\infty$ and $H_\infty$ functions are worth mentioning:

(i) if $G(s) \in L_\infty$, then $G(s)L_2 := \{G(s)f(s) : f(s) \in L_2\} \subset L_2$;
(ii) if $G(s) \in H_\infty$, then $G(s)H_2 := \{G(s)f(s) : f(s) \in H_2\} \subset H_2$;
(iii) if $G(s) \in H_\infty^-$, then $G(s)H_2^\perp := \{G(s)f(s) : f(s) \in H_2^\perp\} \subset H_2^\perp$.

Remark 4.1 The notation for $L_\infty$ is somewhat unfortunate; it should be clear to the reader that the $L_\infty$ space in the time domain and the $L_\infty$ space in the frequency domain denote completely different spaces. The $L_\infty$ space in the time domain is usually used to denote signals, while the $L_\infty$ space in the frequency domain is usually used to denote transfer functions and operators.

Let $G(s) \in L_\infty$ be a $p \times q$ transfer matrix. Then the multiplication operator is defined as
$$M_G : L_2 \mapsto L_2, \qquad M_G f := Gf.$$
In writing the above mapping, we have assumed that $f$ has a compatible dimension. A more accurate description of the operator is
$$M_G : L_2^q \mapsto L_2^p,$$
i.e., $f$ is a $q$-dimensional vector function with each component in $L_2$. However, we shall suppress all dimensions in this book and assume all objects have compatible dimensions. It is easy to verify that the adjoint operator satisfies $M_G^* = M_{G^\sim}$.

A useful fact about the multiplication operator is that the norm of a matrix $G$ in $L_\infty$ equals the norm of the corresponding multiplication operator.

Theorem 4.4 Let $G \in L_\infty$ be a $p \times q$ transfer matrix. Then $\|M_G\| = \|G\|_\infty$.

Remark 4.2 It is also true that this operator norm equals the norm of the operator restricted to $H_2$ (or $H_2^\perp$), i.e.,
$$\|M_G\| = \|M_G|_{H_2}\| := \sup\{\|Gf\|_2 : f \in H_2,\ \|f\|_2 \le 1\}.$$
This will be clear in the proof, where an $f \in H_2$ is constructed.

Proof. By definition, we have
$$\|M_G\| = \sup\{\|Gf\|_2 : f \in L_2,\ \|f\|_2 \le 1\}.$$
First we see that $\|G\|_\infty$ is an upper bound for the operator norm:
$$\|Gf\|_2^2 = \frac{1}{2\pi} \int_{-\infty}^{\infty} f^*(j\omega) G^*(j\omega) G(j\omega) f(j\omega)\, d\omega \le \|G\|_\infty^2 \cdot \frac{1}{2\pi} \int_{-\infty}^{\infty} \|f(j\omega)\|^2\, d\omega = \|G\|_\infty^2\, \|f\|_2^2.$$
To show that $\|G\|_\infty$ is the least upper bound, first choose a frequency $\omega_0$ where $\bar{\sigma}[G(j\omega)]$ is maximum, i.e.,
$$\bar{\sigma}[G(j\omega_0)] = \|G\|_\infty,$$
and denote the singular value decomposition of $G(j\omega_0)$ by
$$G(j\omega_0) = \bar{\sigma}\, u_1(j\omega_0) v_1^*(j\omega_0) + \sum_{i=2}^{r} \sigma_i\, u_i(j\omega_0) v_i^*(j\omega_0)$$
where $r$ is the rank of $G(j\omega_0)$ and $u_i, v_i$ have unit length.

If $\omega_0 < \infty$, write $v_1(j\omega_0)$ as
$$v_1(j\omega_0) = \begin{bmatrix} \alpha_1 e^{j\theta_1} \\ \alpha_2 e^{j\theta_2} \\ \vdots \\ \alpha_q e^{j\theta_q} \end{bmatrix}$$
where $\alpha_i \in \mathbb{R}$ is such that $\theta_i \in [-\pi, 0)$ and $q$ is the column dimension of $G$. Now let $\beta_i \ge 0$ be such that
$$\theta_i = \angle\left( \frac{\beta_i - j\omega_0}{\beta_i + j\omega_0} \right),$$
and let $f$ be given by
$$f(s) = \begin{bmatrix} \alpha_1 \dfrac{\beta_1 - s}{\beta_1 + s} \\ \alpha_2 \dfrac{\beta_2 - s}{\beta_2 + s} \\ \vdots \\ \alpha_q \dfrac{\beta_q - s}{\beta_q + s} \end{bmatrix} \hat{f}(s)$$
where the scalar function $\hat{f}$ is chosen so that
$$|\hat{f}(j\omega)| = \begin{cases} c & \text{if } |\omega - \omega_0| < \epsilon \text{ or } |\omega + \omega_0| < \epsilon \\ 0 & \text{otherwise} \end{cases}$$
where $\epsilon$ is a small positive number and $c$ is chosen so that $\hat{f}$ has unit 2-norm, i.e., $c = \sqrt{\pi/2\epsilon}$. This in turn implies that $f$ has unit 2-norm. Then
$$\|Gf\|_2^2 \approx \frac{1}{2\pi} \left\{ \bar{\sigma}[G(-j\omega_0)]^2 \pi + \bar{\sigma}[G(j\omega_0)]^2 \pi \right\} = \bar{\sigma}[G(j\omega_0)]^2 = \|G\|_\infty^2.$$
If $\omega_0 = \infty$, then $G(j\omega_0)$ is real and $v_1$ is real. In this case, the conclusion follows by letting $f(s) = v_1 \hat{f}(s)$. $\Box$

4.4 Power and Spectral Signals

In this section, we introduce two additional classes of signals that have been widely used in engineering. These classes of signals have nice statistical and frequency domain representations. Let $u(t)$ be a function of time. Define its autocorrelation matrix as
$$R_{uu}(\tau) := \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} u(t+\tau) u^*(t)\, dt,$$
if the limit exists and is finite for all $\tau$. It is easy to see from the definition that $R_{uu}(\tau) = R_{uu}^*(-\tau)$. Assume further that the Fourier transform of the signal's autocorrelation matrix function exists (it may contain impulses). This Fourier transform is called the spectral density of $u$, denoted $S_{uu}(j\omega)$:
$$S_{uu}(j\omega) := \int_{-\infty}^{\infty} R_{uu}(\tau) e^{-j\omega\tau}\, d\tau.$$

Thus $R_{uu}(\tau)$ can be obtained from $S_{uu}(j\omega)$ by performing an inverse Fourier transform:
$$R_{uu}(\tau) = \frac{1}{2\pi} \int_{-\infty}^{\infty} S_{uu}(j\omega) e^{j\omega\tau}\, d\omega.$$
We will call a signal $u(t)$ a power signal if the autocorrelation matrix $R_{uu}(\tau)$ exists and is finite for all $\tau$, and, moreover, if the power spectral density function $S_{uu}(j\omega)$ exists (note that $S_{uu}(j\omega)$ need not be bounded and may include impulses).

The power of the signal is defined as
$$\|u\|_{\mathcal{P}} = \sqrt{\lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} \|u(t)\|^2\, dt} = \sqrt{\operatorname{Trace}[R_{uu}(0)]}$$
where $\|\cdot\|$ is the usual Euclidean norm and the capital script $\mathcal{P}$ is used to differentiate this power semi-norm from the usual Lebesgue $L_p$ norm. The set of all signals having finite power will be denoted by $\mathcal{P}$.

The power semi-norm of a signal can also be computed from its spectral density function:
$$\|u\|_{\mathcal{P}}^2 = \frac{1}{2\pi} \int_{-\infty}^{\infty} \operatorname{Trace}[S_{uu}(j\omega)]\, d\omega.$$
This expression implies that, if $u \in \mathcal{P}$, $S_{uu}$ is strictly proper in the sense that $S_{uu}(\infty) = 0$. We note that if $u \in \mathcal{P}$ and $\|u(t)\|_\infty := \sup_t \|u(t)\| < \infty$, then $\|u\|_{\mathcal{P}} \le \|u\|_\infty$. However, not every signal having finite $\infty$-norm is a power signal, since the limit in the definition may not exist. For example, let
$$u(t) = \begin{cases} 1, & 2^{2k} < t < 2^{2k+1},\ k = 0, 1, 2, \ldots \\ 0, & \text{otherwise}. \end{cases}$$
Then $\lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} u^2\, dt$ does not exist. Note also that power signals are persistent signals in time, such as sines or cosines; clearly, a time-limited signal has zero power, as does an $L_2$ signal. Thus $\|\cdot\|_{\mathcal{P}}$ is only a semi-norm, not a norm.

Now let $G$ be a linear system transfer matrix with convolution kernel $g(t)$, input $u(t)$, and output $z(t)$. Then $R_{zz}(\tau) = g(\tau) * R_{uu}(\tau) * g^*(-\tau)$ and $S_{zz}(j\omega) = G(j\omega) S_{uu}(j\omega) G^*(j\omega)$. These properties are useful in establishing some input and output relationships in the next section.

A signal $u(t)$ is said to have bounded power spectral density if $\|S_{uu}(j\omega)\|_\infty < \infty$. The set of signals having bounded spectral density is denoted by
$$\mathcal{S} := \{u(t) \in \mathbb{R}^m : \|S_{uu}(j\omega)\|_\infty < \infty\}.$$
The quantity $\|u\|_{\mathcal{S}} := \sqrt{\|S_{uu}(j\omega)\|_\infty}$ is called the spectral density norm of $u(t)$. The set $\mathcal{S}$ can be used to model signals with fixed spectral characteristics by passing white noise signals through a weighting filter. Similarly, $\mathcal{P}$ can be used to model signals whose spectrum is not known but whose power is finite.
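The power semi-norm above can be approximated by truncating the averaging limit. The following sketch (a numerical illustration only; the horizon and step size are arbitrary choices) checks that $u(t) = \sin(\omega_0 t)$ has power $\|u\|_{\mathcal{P}} = 1/\sqrt{2}$, since the time-average of $\sin^2$ is $1/2$:

```python
# Discrete approximation of ||u||_P = sqrt(lim (1/2T) int_{-T}^{T} u^2 dt)
# for u(t) = sin(w0 t); the exact value is 1/sqrt(2).
import math

w0, T, n = 2.0, 2000.0, 400000
dt = 2 * T / n
avg = sum(math.sin(w0 * (-T + k * dt)) ** 2 for k in range(n)) * dt / (2 * T)
power_norm = math.sqrt(avg)
print(power_norm)   # close to 1/sqrt(2) ~= 0.7071
```

Note that the same computation applied to a time-limited signal gives an average that shrinks like $1/T$, consistent with such signals having zero power.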

4.5 Induced System Gains

Many control engineering problems involve keeping some system signals "small" under various conditions, for example under a set of possible disturbances and system parameter variations. In this section we are interested in answering the following question: if we know how "big" the input (disturbance) is, how "big" is the output going to be for a given stable dynamical system?

Consider a $q$-input, $p$-output linear finite dimensional system with input $u$, output $z$, and transfer matrix $G \in \mathcal{RH}_\infty$. We will further assume that $G(s)$ is strictly proper, i.e., $G(\infty) = 0$, although most of the results derived here hold for the non-strictly proper case. In the time domain, an input-output model for such a system has the form of a convolution equation,
$$z = g * u,$$
i.e.,
$$z(t) = \int_0^t g(t-\tau) u(\tau)\, d\tau,$$
where the $p \times q$ real matrix $g(t)$ is the convolution kernel. Let the convolution kernel and the corresponding transfer matrix be partitioned as
$$g(t) = \begin{bmatrix} g_{11}(t) & \cdots & g_{1q}(t) \\ \vdots & & \vdots \\ g_{p1}(t) & \cdots & g_{pq}(t) \end{bmatrix} = \begin{bmatrix} g_1(t) \\ \vdots \\ g_p(t) \end{bmatrix}, \qquad G(s) = \begin{bmatrix} G_{11}(s) & \cdots & G_{1q}(s) \\ \vdots & & \vdots \\ G_{p1}(s) & \cdots & G_{pq}(s) \end{bmatrix} = \begin{bmatrix} G_1(s) \\ \vdots \\ G_p(s) \end{bmatrix}$$
where $g_i(t)$ is a $q$-dimensional row vector of the convolution kernel and $G_i(s)$ is a row vector of the transfer matrix. Now if $G$ is considered as an operator from the input space to the output space, then a norm is induced on $G$ which, loosely speaking, measures the size of the output for a given input $u$. These norms can determine the achievable performance of the system for different classes of input signals.

The various input/output relationships, given different classes of input signals, are summarized in the two tables below. Table 4.1 summarizes the results for fixed input signals. Note that $\delta(t)$ in this table denotes the unit impulse and $u_0 \in \mathbb{R}^q$ is a constant vector indicating the direction of the input signal.

We now prove Table 4.1.

Table 4.1: Output norms with fixed inputs

Input $u(t) = u_0 \delta(t)$, $u_0 \in \mathbb{R}^q$:
$$\|z\|_2 = \|Gu_0\|_2 = \|gu_0\|_2, \qquad \|z\|_\infty = \|gu_0\|_\infty, \qquad \|z\|_{\mathcal{P}} = 0.$$

Input $u(t) = u_0 \sin \omega_0 t$, $u_0 \in \mathbb{R}^q$:
$$\|z\|_2 = \infty, \qquad \limsup_{t\to\infty} \max_i |z_i(t)| = \|G(j\omega_0)u_0\|_\infty = \max_i |G_i(j\omega_0)u_0|,$$
$$\limsup_{t\to\infty} \|z(t)\| = \|G(j\omega_0)u_0\|, \qquad \|z\|_{\mathcal{P}} = \frac{1}{\sqrt{2}} \|G(j\omega_0)u_0\|.$$

Proof.

$u = u_0\delta(t)$: If $u(t) = u_0\delta(t)$, then $z(t) = g(t)u_0$, so $\|z\|_2 = \|gu_0\|_2$ and $\|z\|_\infty = \|gu_0\|_\infty$. But by Parseval's relation, $\|gu_0\|_2 = \|Gu_0\|_2$. On the other hand,
$$\|z\|_{\mathcal{P}}^2 = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} u_0^* g^*(t) g(t) u_0\, dt \le \lim_{T\to\infty} \frac{1}{2T} \int_{-\infty}^{\infty} \operatorname{Trace}\{g^*(t)g(t)\}\, dt\, \|u_0\|^2 = \lim_{T\to\infty} \frac{1}{2T} \|G\|_2^2\, \|u_0\|^2 = 0.$$

$u = u_0 \sin \omega_0 t$: With the input $u(t) = u_0 \sin(\omega_0 t)$, the $i$-th output as $t \to \infty$ is
$$z_i(t) = |G_i(j\omega_0)u_0| \sin[\omega_0 t + \arg\{G_i(j\omega_0)u_0\}]. \tag{4.1}$$

The 2-norm of this signal is infinite as long as $G_i(j\omega_0)u_0 \neq 0$, i.e., as long as the system's transfer function does not have a zero in every channel in the input direction at the frequency of excitation.

The amplitude of the sinusoid (4.1) equals $|G_i(j\omega_0)u_0|$. Hence
$$\limsup_{t\to\infty} \max_i |z_i(t)| = \max_i |G_i(j\omega_0)u_0| = \|G(j\omega_0)u_0\|_\infty$$
and
$$\limsup_{t\to\infty} \|z(t)\| = \sqrt{\sum_{i=1}^{p} |G_i(j\omega_0)u_0|^2} = \|G(j\omega_0)u_0\|.$$
Finally, let $\phi_i := \arg G_i(j\omega_0)u_0$. Then
$$\|z\|_{\mathcal{P}}^2 = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} \sum_{i=1}^{p} |G_i(j\omega_0)u_0|^2 \sin^2(\omega_0 t + \phi_i)\, dt = \sum_{i=1}^{p} |G_i(j\omega_0)u_0|^2 \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} \frac{1 - \cos 2(\omega_0 t + \phi_i)}{2}\, dt = \frac{1}{2} \sum_{i=1}^{p} |G_i(j\omega_0)u_0|^2 = \frac{1}{2} \|G(j\omega_0)u_0\|^2. \qquad \Box$$

It is interesting to see what this table signifies from the control point of view. To focus our discussion, let us assume that $u(t) = u_0 \sin\omega_0 t$ is a disturbance or a command signal on a feedback system and $z(t)$ is the tracking error. Then we say that the system has good tracking behavior if $z(t)$ is small in some sense, for instance, if $\limsup_{t\to\infty} \|z(t)\|$ is small. Note that
$$\limsup_{t\to\infty} \|z(t)\| = \|G(j\omega_0)u_0\|$$
for any given $\omega_0$ and $u_0 \in \mathbb{R}^q$. Now if we want to track signals from various channels, that is, if $u_0$ can point in any direction, then we would require that $\bar{\sigma}(G(j\omega_0))$ be small. Furthermore, if, in addition, we want to track signals of many different frequencies, we would require that $\bar{\sigma}(G(j\omega_0))$ be small at all those frequencies. This interpretation enables us to consider the control system in the frequency domain even though the specifications are given in the time domain.

Table 4.2 lists the maximum possible system gains when the input signal $u$ is not required to be a fixed signal; instead, it can be any signal in the unit ball of some function space.

Now we give a proof for Table 4.2. Note that the first row ($L_2 \mapsto L_2$) has been shown in Section 4.3.

Table 4.2: Induced system gains

Input $u \in L_2$, output $z \in L_2$: with $\|u\|_2^2 = \int_0^\infty \|u\|^2\, dt$, the induced norm is $\|G\|_\infty$.

Input $u \in \mathcal{S}$, output $z \in \mathcal{S}$: with $\|u\|_{\mathcal{S}}^2 = \|S_{uu}\|_\infty$, the induced norm is $\|G\|_\infty$.

Input $u \in \mathcal{S}$, output $z \in \mathcal{P}$: with $\|u\|_{\mathcal{S}}^2 = \|S_{uu}\|_\infty$ and $\|z\|_{\mathcal{P}}^2 = \frac{1}{2\pi}\int_{-\infty}^{\infty} \operatorname{Trace}\{S_{zz}(j\omega)\}\, d\omega$, the induced norm is $\|G\|_2$.

Input $u \in \mathcal{P}$, output $z \in \mathcal{P}$: with $\|u\|_{\mathcal{P}}^2 = \frac{1}{2\pi}\int_{-\infty}^{\infty} \operatorname{Trace}\{S_{uu}(j\omega)\}\, d\omega$, the induced norm is $\|G\|_\infty$.

Input $u \in L_\infty$, output $z \in L_\infty$: with $\|u\|_\infty = \sup_t \max_i |u_i(t)|$, the induced norm is $\max_i \|g_i\|_1$; with $\|u\|_\infty = \sup_t \|u(t)\|$, the induced gain is bounded by $\int_0^\infty \|g(t)\|\, dt$.

Proof.

$\mathcal{S} \mapsto \mathcal{S}$: If $u \in \mathcal{S}$, then
$$S_{zz}(j\omega) = G(j\omega) S_{uu}(j\omega) G^*(j\omega).$$
Now suppose
$$\bar{\sigma}[G(j\omega_0)] = \|G\|_\infty$$
and take a signal $u$ such that $S_{uu}(j\omega_0) = I$. Then
$$\|S_{zz}(j\omega)\|_\infty = \|G\|_\infty^2.$$

$\mathcal{S} \mapsto \mathcal{P}$: By definition, we have
$$\|z\|_{\mathcal{P}}^2 = \frac{1}{2\pi} \int_{-\infty}^{\infty} \operatorname{Trace}\{G(j\omega) S_{uu}(j\omega) G^*(j\omega)\}\, d\omega.$$

Now let $u$ be white with unit spectral density, i.e., $S_{uu} = I$. Then
$$\|z\|_{\mathcal{P}}^2 = \frac{1}{2\pi} \int_{-\infty}^{\infty} \operatorname{Trace}\{G(j\omega) G^*(j\omega)\}\, d\omega = \|G\|_2^2.$$

$\mathcal{P} \mapsto \mathcal{P}$: If $u$ is a power signal, then
$$\|z\|_{\mathcal{P}}^2 = \frac{1}{2\pi} \int_{-\infty}^{\infty} \operatorname{Trace}\{G(j\omega) S_{uu}(j\omega) G^*(j\omega)\}\, d\omega,$$
and immediately we get
$$\|z\|_{\mathcal{P}} \le \|G\|_\infty \|u\|_{\mathcal{P}}.$$
To achieve equality, assume that $\omega_0$ is such that
$$\bar{\sigma}[G(j\omega_0)] = \|G\|_\infty$$
and denote the singular value decomposition of $G(j\omega_0)$ by
$$G(j\omega_0) = \bar{\sigma}\, u_1(j\omega_0) v_1^*(j\omega_0) + \sum_{i=2}^{r} \sigma_i\, u_i(j\omega_0) v_i^*(j\omega_0)$$
where $r$ is the rank of $G(j\omega_0)$ and $u_i, v_i$ have unit length.

If $\omega_0 < \infty$, write $v_1(j\omega_0)$ as
$$v_1(j\omega_0) = \begin{bmatrix} \alpha_1 e^{j\theta_1} \\ \alpha_2 e^{j\theta_2} \\ \vdots \\ \alpha_q e^{j\theta_q} \end{bmatrix}$$
where $\alpha_i \in \mathbb{R}$ is such that $\theta_i \in [-\pi, 0)$ and $q$ is the column dimension of $G$. Now let $\beta_i \ge 0$ be such that
$$\theta_i = \angle\left( \frac{\beta_i - j\omega_0}{\beta_i + j\omega_0} \right)$$
and let the input $u$ be generated by passing $\hat{u}$ through a filter:
$$u(s) = \begin{bmatrix} \alpha_1 \dfrac{\beta_1 - s}{\beta_1 + s} \\ \alpha_2 \dfrac{\beta_2 - s}{\beta_2 + s} \\ \vdots \\ \alpha_q \dfrac{\beta_q - s}{\beta_q + s} \end{bmatrix} \hat{u}(s)$$
where
$$\hat{u}(t) = \sqrt{2} \sin(\omega_0 t).$$
Then $R_{\hat{u}\hat{u}}(\tau) = \cos(\omega_0 \tau)$, so
$$\|\hat{u}\|_{\mathcal{P}}^2 = R_{\hat{u}\hat{u}}(0) = 1.$$
Also,
$$S_{\hat{u}\hat{u}}(j\omega) = \pi[\delta(\omega - \omega_0) + \delta(\omega + \omega_0)].$$
Then
$$S_{uu}(j\omega) = \begin{bmatrix} \alpha_1 \dfrac{\beta_1 - j\omega}{\beta_1 + j\omega} \\ \vdots \\ \alpha_q \dfrac{\beta_q - j\omega}{\beta_q + j\omega} \end{bmatrix} S_{\hat{u}\hat{u}}(j\omega) \begin{bmatrix} \alpha_1 \dfrac{\beta_1 - j\omega}{\beta_1 + j\omega} \\ \vdots \\ \alpha_q \dfrac{\beta_q - j\omega}{\beta_q + j\omega} \end{bmatrix}^*$$
and it is easy to show that
$$\|u\|_{\mathcal{P}} = 1.$$
Finally,
$$\|z\|_{\mathcal{P}}^2 = \frac{1}{2} \bar{\sigma}[G(j\omega_0)]^2 + \frac{1}{2} \bar{\sigma}[G(-j\omega_0)]^2 = \bar{\sigma}[G(j\omega_0)]^2 = \|G\|_\infty^2.$$
Similarly, if $\omega_0 = \infty$, then $G(j\omega)$ is real. The conclusion follows by letting $u = v_1 \hat{u}(t)$ and $\omega_0 \to \infty$.

$L_\infty \mapsto L_\infty$:

1. First of all, $\max_i \|g_i\|_1$ is an upper bound on the $\infty$-norm/$\infty$-norm system gain:
$$|z_i(t)| = \left| \int_0^t g_i(\tau) u(t-\tau)\, d\tau \right| \le \int_0^t |g_i(\tau) u(t-\tau)|\, d\tau \le \int_0^t \sum_{j=1}^{q} |g_{ij}(\tau)|\, d\tau\, \|u\|_\infty \le \int_0^\infty \sum_{j=1}^{q} |g_{ij}(\tau)|\, d\tau\, \|u\|_\infty = \|g_i\|_1 \|u\|_\infty$$
and
$$\|z\|_\infty := \max_i \sup_t |z_i(t)| \le \max_i \|g_i\|_1\, \|u\|_\infty.$$
That $\max_i \|g_i\|_1$ is the least upper bound can be seen as follows. Assume that the maximum $\max_i \|g_i\|_1$ is achieved for $i = 1$. Let $t_0$ be given and set
$$u(t_0 - \tau) := \operatorname{sign}(g_1^*(\tau)), \quad \forall \tau.$$
Note that since $g_1(\tau)$ is a vector function, the sign function $\operatorname{sign}(g_1^*(\tau))$ is a component-wise operation. Then $\|u\|_\infty = 1$ and
$$z_1(t_0) = \int_0^{t_0} g_1(\tau) u(t_0 - \tau)\, d\tau = \int_0^{t_0} \sum_{j=1}^{q} |g_{1j}(\tau)|\, d\tau = \|g_1\|_1 - \int_{t_0}^{\infty} \sum_{j=1}^{q} |g_{1j}(\tau)|\, d\tau.$$
Hence, letting $t_0 \to \infty$, we have $\|z\|_\infty = \|g_1\|_1$.

2. If $\|u(t)\|_\infty := \sup_t \|u(t)\|$, then
$$\|z(t)\| = \left\| \int_0^t g(\tau) u(t-\tau)\, d\tau \right\| \le \int_0^t \|g(\tau)\|\, \|u(t-\tau)\|\, d\tau \le \int_0^t \|g(\tau)\|\, d\tau\, \|u\|_\infty.$$
Therefore, $\|z\|_\infty \le \int_0^\infty \|g(\tau)\|\, d\tau\, \|u\|_\infty$. $\Box$
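The worst-case-input construction above can be seen numerically. For the scalar kernel $g(t) = e^{-t}$ (our own example), $\|g\|_1 = \int_0^\infty e^{-t}\, dt = 1$, and feeding the input $u(t_0 - \tau) = \operatorname{sign}(g(\tau)) = 1$ drives the output at $t_0$ toward that bound. A discretized convolution sketch, with arbitrary step size and horizon:

```python
# Induced L_inf -> L_inf gain for g(t) = e^{-t}: ||g||_1 = 1, achieved
# asymptotically by the sign-of-kernel input.
import math

dt, n = 0.001, 20000                  # horizon t0 = 20
g = [math.exp(-k * dt) for k in range(n)]
u = [1.0] * n                         # sign(g) = +1 everywhere, ||u||_inf = 1
z_t0 = sum(gk * uk for gk, uk in zip(g, u)) * dt   # z(t0) = int_0^{t0} g dτ
g_l1 = sum(abs(v) for v in g) * dt
print(z_t0)   # both approach ||g||_1 = 1
print(g_l1)
```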

Next we shall derive some simple and useful bounds for the $H_\infty$ norm and the $L_1$ norm of a stable system. Suppose
$$G(s) = \begin{bmatrix} A & B \\ \hline C & 0 \end{bmatrix} \in \mathcal{RH}_\infty$$
is a balanced realization, i.e., there exists
$$\Sigma = \operatorname{diag}(\sigma_1, \sigma_2, \ldots, \sigma_n) \ge 0$$
with $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n \ge 0$, such that
$$A\Sigma + \Sigma A^* + BB^* = 0, \qquad A^*\Sigma + \Sigma A + C^*C = 0.$$
Then we have the following theorem.

Theorem 4.5
$$\sigma_1 \le \|G\|_\infty \le \int_0^\infty \|g(t)\|\, dt \le 2 \sum_{i=1}^{n} \sigma_i$$
where $g(t) = Ce^{At}B$.

Remark 4.3 It should be clear that the inequalities stated in the theorem do not depend on the particular state space realization of $G(s)$. However, the use of a balanced realization does make the proof simple.
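The chain of bounds in Theorem 4.5 can be hand-checked on a scalar balanced realization (our own example): with $A = -1$, $B = C = 1$, i.e. $G(s) = 1/(s+1)$, both Lyapunov equations give $\Sigma = 1/2$, so $\sigma_1 = 1/2$, while $\|G\|_\infty = 1$, $\int_0^\infty |g(t)|\, dt = 1$, and $2\sum_i \sigma_i = 1$:

```python
# Theorem 4.5 for G(s) = 1/(s+1): sigma_1 <= ||G||_inf <= int|g| <= 2*sum(sigma).
import math

sigma1 = 0.5                                   # from -2*Sigma + 1 = 0
# ||G||_inf via a frequency grid of |G(jw)| = |1/(1+jw)| (peak at w = 0)
hinf = max(abs(1 / complex(1, k * 0.01)) for k in range(-1000, 1001))
# int_0^20 e^{-t} dt by a Riemann sum (tail beyond t = 20 is negligible)
l1 = sum(math.exp(-k * 0.001) for k in range(20000)) * 0.001
print(sigma1, hinf, l1, 2 * sigma1)            # 0.5, 1.0, ~1.0, 1.0
```

Here all three upper quantities coincide at 1 (up to discretization error), so the last two inequalities are tight for this example.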

Proof. The inequality $\sigma_1 \le \|G\|_\infty$ follows from the Nehari theorem of Chapter 8. We will now show the other inequalities. Since
$$G(s) = \int_0^\infty g(t) e^{-st}\, dt, \quad \operatorname{Re}(s) > 0,$$
by the definition of the $H_\infty$ norm we have
$$\|G\|_\infty = \sup_{\operatorname{Re}(s)>0} \left\| \int_0^\infty g(t) e^{-st}\, dt \right\| \le \sup_{\operatorname{Re}(s)>0} \int_0^\infty \|g(t)\|\, |e^{-st}|\, dt \le \int_0^\infty \|g(t)\|\, dt.$$
To prove the last inequality, let $u_i$ be the $i$th unit vector. Then
$$u_i^* u_j = \delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$$
and $\sum_{i=1}^{n} u_i u_i^* = I$. Define $\psi_i(t) = u_i^* e^{At/2} B$ and $\chi_i(t) = C e^{At/2} u_i$. It is easy to verify that
$$\|\psi_i\|_2^2 = \int_0^\infty u_i^* e^{At/2} B B^* e^{A^*t/2} u_i\, dt = 2 u_i^* \Sigma u_i = 2\sigma_i$$
$$\|\chi_i\|_2^2 = \int_0^\infty u_i^* e^{A^*t/2} C^* C e^{At/2} u_i\, dt = 2 u_i^* \Sigma u_i = 2\sigma_i.$$
Using these two equalities and the identity $g(t) = C e^{At/2} \left( \sum_i u_i u_i^* \right) e^{At/2} B = \sum_i \chi_i(t) \psi_i(t)$, we have
$$\int_0^\infty \|g(t)\|\, dt = \int_0^\infty \left\| \sum_{i=1}^{n} \chi_i \psi_i \right\| dt \le \sum_{i=1}^{n} \int_0^\infty \|\chi_i \psi_i\|\, dt \le \sum_{i=1}^{n} \|\chi_i\|_2\, \|\psi_i\|_2 \le 2 \sum_{i=1}^{n} \sigma_i. \qquad \Box$$

It should be clear from the above two tables that many system performance criteria can be stipulated as requiring that a certain closed-loop transfer matrix have a small $H_2$ norm, $H_\infty$ norm, or $L_1$ norm. Moreover, if the $L_1$ performance specification is satisfied, then the $H_\infty$ norm performance specification is also satisfied. We will be most interested in $H_2$ and $H_\infty$ performance in this book.

4.6 Computing $L_2$ and $H_2$ Norms

Let $G(s) \in L_2$ and recall that the $L_2$ norm of $G$ is defined as
$$\|G\|_2 := \sqrt{\frac{1}{2\pi} \int_{-\infty}^{\infty} \operatorname{Trace}\{G^*(j\omega) G(j\omega)\}\, d\omega} = \|g\|_2 = \sqrt{\int_{-\infty}^{\infty} \operatorname{Trace}\{g^*(t) g(t)\}\, dt}$$
where $g(t)$ denotes the convolution kernel of $G$.

It is easy to see that the $L_2$ norm defined above is finite iff the transfer matrix $G$ is strictly proper, i.e., $G(\infty) = 0$. Hence, we will generally assume that the transfer matrix is strictly proper whenever we refer to the $L_2$ norm of $G$ (of course, this also applies to $H_2$ functions). One straightforward way of computing the $L_2$ norm is to use a contour integral. Suppose $G$ is strictly proper; then we have
$$\|G\|_2^2 = \frac{1}{2\pi} \int_{-\infty}^{\infty} \operatorname{Trace}\{G^*(j\omega) G(j\omega)\}\, d\omega = \frac{1}{2\pi j} \oint \operatorname{Trace}\{G^\sim(s) G(s)\}\, ds.$$
The last integral is a contour integral along the imaginary axis and around an infinite semi-circle in the left half-plane; the contribution to the integral from this semi-circle is zero because $G$ is strictly proper. By the residue theorem, $\|G\|_2^2$ equals the sum of the residues of $\operatorname{Trace}\{G^\sim(s) G(s)\}$ at its poles in the left half-plane.

Although $\|G\|_2$ can, in principle, be computed from its definition or from the method suggested above, it is useful in many applications to have alternative characterizations and to take advantage of the state space representation of $G$. The computation of an $\mathcal{RH}_2$ transfer matrix norm is particularly simple.

Although kGk2 can, in principle, be computed from its de�nition or from the methodsuggested above, it is useful in many applications to have alternative characterizationsand to take advantage of the state space representations of G. The computation of aRH2 transfer matrix norm is particularly simple.

Lemma 4.6 Consider a transfer matrix

G(s) =

�A BC 0

Page 130: Robust and Optimal Control

4.6. Computing L2 and H2 Norms 113

with A stable. Then we have

kGk22 = trace(B�LoB) = trace(CLcC�) (4:2)

where Lo and Lc are observability and controllability Gramians which can be obtainedfrom the following Lyapunov equations

ALc + LcA� +BB� = 0 A�Lo + LoA+ C�C = 0:

Proof. Since G is stable, we have

g(t) = L�1(G) =�CeAtB; t � 00; t < 0

and

kGk22 =

Z 1

0

Tracefg�(t)g(t)g dt =Z 1

0

Tracefg(t)g(t)�g dt

=

Z 1

0

TracefB�eA�tC�CeAtBg dt =Z 1

0

TracefCeAtBB�eA�tC�g dt:

The lemma follows from the fact that the controllability Gramian of (A;B) and theobservability Gramian of (C;A) can be represented as

Lo =

Z 1

0

eA�tC�CeAt dt; Lc =

Z 1

0

eAtBB�eA�t dt;

which can also be obtained from

ALc + LcA� +BB� = 0 A�Lo + LoA+ C�C = 0:

2

To compute the L2 norm of a rational transfer function, G(s) 2 L2, using state spaceapproach. Let G(s) = [G(s)]+ + [G(s)]� with G+ 2 RH2 and G� 2 RH?2 . Then

kGk22 = k[G(s)]+k22 + k[G(s)]�k22where k[G(s)]+k2 and k[G(s)]�k2 = k[G(�s)]+k2 can be computed using the abovelemma.

Still another useful characterization of the H2 norm of G is in terms of hypotheticalinput-output experiments. Let ei denote the i

th standard basis vector of Rm where mis the input dimension of the system. Apply the impulsive input �(t)ei (�(t) is the unitimpulse) and denote the output by zi(t)(= g(t)ei). Assume D = 0, and then zi 2 L2+and

kGk22 =mXi=1

kzik22:

Note that this characterization of the H2 norm can be appropriately generalized fornonlinear time varying systems, see Chen and Francis [1992] for an application of thisnorm in sampled-data control.
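Lemma 4.6 can be checked on a scalar example (our own choice): for $(A, B, C) = (-1, 1, 1)$, i.e. $G(s) = 1/(s+1)$, the Lyapunov equation $2AL_c + B^2 = 0$ gives $L_c = 1/2$, hence $\|G\|_2^2 = C L_c C = 1/2$, which matches the frequency-domain definition $\frac{1}{2\pi}\int |G(j\omega)|^2\, d\omega$:

```python
# H2 norm of G(s) = 1/(s+1) two ways: Gramian formula vs. frequency integral.
import math

A, B, C = -1.0, 1.0, 1.0
Lc = -B * B / (2 * A)          # solves A*Lc + Lc*A + B^2 = 0  ->  Lc = 1/2
h2_sq_state = C * Lc * C       # trace(C Lc C*) = 0.5

# (1/2pi) int |G(jw)|^2 dw with |G(jw)|^2 = 1/(1 + w^2), truncated at |w|=2000
dw, N = 0.01, 200000
h2_sq_freq = sum(1.0 / (1.0 + (k * dw) ** 2) for k in range(-N, N + 1)) \
             * dw / (2 * math.pi)
print(h2_sq_state)   # 0.5
print(h2_sq_freq)    # close to 0.5 (truncation/discretization error only)
```

The exact frequency integral is $\frac{1}{\pi}\arctan(\omega)\big|_0^\infty = \frac{1}{2}$, confirming the Gramian value.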

Page 131: Robust and Optimal Control

114 PERFORMANCE SPECIFICATIONS

4.7 Computing $\mathcal{L}_\infty$ and $\mathcal{H}_\infty$ Norms

We shall first consider, as in the $\mathcal{L}_2$ case, how to compute the $\infty$-norm of an $\mathcal{L}_\infty$ transfer matrix. Let $G(s) \in \mathcal{L}_\infty$ and recall that the $\mathcal{L}_\infty$ norm of a transfer function $G$ is defined as

\[
\|G\|_\infty := \operatorname{ess\,sup}_\omega \bar\sigma\{G(j\omega)\}.
\]

The computation of the $\mathcal{L}_\infty$ norm of $G$ is complicated and requires a search. A control engineering interpretation of the infinity norm of a scalar transfer function $G$ is the distance in the complex plane from the origin to the farthest point on the Nyquist plot of $G$; it also appears as the peak value on the Bode magnitude plot of $|G(j\omega)|$. Hence the $\infty$-norm of a transfer function can, in principle, be obtained graphically.

To get an estimate, set up a fine grid of frequency points

\[
\{\omega_1, \cdots, \omega_N\}.
\]

Then an estimate for $\|G\|_\infty$ is

\[
\max_{1 \leq k \leq N} \bar\sigma\{G(j\omega_k)\}.
\]

This value is usually read directly from a Bode singular value plot. The $\mathcal{L}_\infty$ norm can also be computed in state space if $G$ is rational.
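The frequency-grid estimate can be sketched in a few lines (the function name, grid, and test system are ours; the grid gives only a lower estimate, since the true peak may fall between grid points):

```python
import numpy as np

def linf_grid_estimate(A, B, C, D, omegas):
    """Lower estimate of ||G||_inf: max over the grid of
    sigma_max(G(jw)), with G(s) = C (sI - A)^{-1} B + D."""
    n = A.shape[0]
    est = 0.0
    for w in omegas:
        G = C @ np.linalg.solve(1j * w * np.eye(n) - A, B) + D
        est = max(est, np.linalg.svd(G, compute_uv=False)[0])
    return est

# Example: G(s) = 1/(s+1); the peak value 1 occurs at w = 0
A = np.array([[-1.0]]); B = np.array([[1.0]])
C = np.array([[1.0]]); D = np.array([[0.0]])
print(linf_grid_estimate(A, B, C, D, np.linspace(0.0, 10.0, 101)))  # ~1.0
```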

Lemma 4.7 Let $\gamma > 0$ and

\[
G(s) = \left[\begin{array}{c|c} A & B \\ \hline C & D \end{array}\right] \in \mathcal{RL}_\infty. \tag{4.3}
\]

Then $\|G\|_\infty < \gamma$ if and only if $\bar\sigma(D) < \gamma$ and $H$ has no eigenvalues on the imaginary axis, where

\[
H := \begin{bmatrix} A + BR^{-1}D^*C & BR^{-1}B^* \\ -C^*(I + DR^{-1}D^*)C & -(A + BR^{-1}D^*C)^* \end{bmatrix} \tag{4.4}
\]

and $R = \gamma^2 I - D^*D$.

Proof. Let $\Phi(s) = \gamma^2 I - G^\sim(s)G(s)$. Then it is clear that $\|G\|_\infty < \gamma$ if and only if $\Phi(j\omega) > 0$ for all $\omega \in \mathbb{R}$. Since $\Phi(\infty) = R > 0$ and since $\Phi(j\omega)$ is a continuous function of $\omega$, $\Phi(j\omega) > 0$ for all $\omega \in \mathbb{R}$ if and only if $\Phi(j\omega)$ is nonsingular for all $\omega \in \mathbb{R} \cup \{\infty\}$, i.e., $\Phi(s)$ has no imaginary-axis zero. Equivalently, $\Phi^{-1}(s)$ has no imaginary-axis pole. It is easy to compute by some simple algebra that

\[
\Phi^{-1}(s) = \left[\begin{array}{c|c} H & \begin{matrix} BR^{-1} \\ -C^*DR^{-1} \end{matrix} \\ \hline \begin{matrix} R^{-1}D^*C & R^{-1}B^* \end{matrix} & R^{-1} \end{array}\right].
\]


Thus the conclusion follows if the above realization has neither uncontrollable modes nor unobservable modes on the imaginary axis. Assume that $j\omega_0$ is an eigenvalue of $H$ but not a pole of $\Phi^{-1}(s)$. Then $j\omega_0$ must be either an unobservable mode of $\left(\begin{bmatrix} R^{-1}D^*C & R^{-1}B^* \end{bmatrix}, H\right)$ or an uncontrollable mode of $\left(H, \begin{bmatrix} BR^{-1} \\ -C^*DR^{-1} \end{bmatrix}\right)$. Now suppose $j\omega_0$ is an unobservable mode of $\left(\begin{bmatrix} R^{-1}D^*C & R^{-1}B^* \end{bmatrix}, H\right)$. Then there exists an $x_0 = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \neq 0$ such that

\[
Hx_0 = j\omega_0 x_0, \qquad \begin{bmatrix} R^{-1}D^*C & R^{-1}B^* \end{bmatrix}x_0 = 0.
\]

These equations can be simplified to

\[
(j\omega_0 I - A)x_1 = 0
\]
\[
(j\omega_0 I + A^*)x_2 = -C^*Cx_1
\]
\[
D^*Cx_1 + B^*x_2 = 0.
\]

Since $A$ has no imaginary-axis eigenvalues, we have $x_1 = 0$ and $x_2 = 0$. This contradicts our assumption, and hence the realization has no unobservable modes on the imaginary axis.

Similarly, a contradiction will also be reached if $j\omega_0$ is assumed to be an uncontrollable mode of $\left(H, \begin{bmatrix} BR^{-1} \\ -C^*DR^{-1} \end{bmatrix}\right)$. $\Box$

Bisection Algorithm

Lemma 4.7 suggests the following bisection algorithm to compute the $\mathcal{RL}_\infty$ norm:

(a) select an upper bound $\gamma_u$ and a lower bound $\gamma_l$ such that $\gamma_l \leq \|G\|_\infty \leq \gamma_u$;

(b) if $(\gamma_u - \gamma_l)/\gamma_l \leq$ specified level, stop; $\|G\|_\infty \approx (\gamma_u + \gamma_l)/2$. Otherwise go to the next step;

(c) set $\gamma = (\gamma_l + \gamma_u)/2$;

(d) test if $\|G\|_\infty < \gamma$ by calculating the eigenvalues of $H$ for the given $\gamma$;

(e) if $H$ has an eigenvalue on $j\mathbb{R}$, set $\gamma_l = \gamma$; otherwise set $\gamma_u = \gamma$; go back to step (b).

Of course, the above algorithm applies to $\mathcal{H}_\infty$ norm computation as well. Thus $\mathcal{L}_\infty$ norm computation requires a search, over either $\gamma$ or $\omega$, in contrast to $\mathcal{L}_2$ ($\mathcal{H}_2$) norm computation, which does not. A somewhat analogous situation occurs for constant matrices with the norms $\|M\|_2^2 = \mathrm{trace}(M^*M)$ and $\|M\|_\infty = \bar\sigma[M]$. In principle, $\|M\|_2^2$ can be computed exactly with a finite number of operations, as can the test for whether $\bar\sigma(M) < \gamma$ (e.g., $\gamma^2 I - M^*M > 0$), but the value of $\bar\sigma(M)$ cannot. To compute $\bar\sigma(M)$, we must use some type of iterative algorithm.
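Steps (a)-(e) can be sketched as follows (a minimal illustration, not a production routine: the function name, the numerical tolerance used to decide when an eigenvalue lies on $j\mathbb{R}$, and the test system are all ours; we also assume $\bar\sigma(D) < \gamma_l$ so the Hamiltonian test of Lemma 4.7 applies at every trial $\gamma$):

```python
import numpy as np

def hinf_bisect(A, B, C, D, gl, gu, tol=1e-6):
    """Bisection on gamma using the Hamiltonian test of Lemma 4.7.
    Assumes gl <= ||G||_inf <= gu and sigma_max(D) < gl."""
    p, m = D.shape
    while (gu - gl) / gl > tol:
        g = 0.5 * (gl + gu)
        R = g**2 * np.eye(m) - D.T @ D
        Ri = np.linalg.inv(R)
        Ag = A + B @ Ri @ D.T @ C
        H = np.block([
            [Ag,                                      B @ Ri @ B.T],
            [-C.T @ (np.eye(p) + D @ Ri @ D.T) @ C,   -Ag.T],
        ])
        lam = np.linalg.eigvals(H)
        on_axis = np.any(np.abs(lam.real) < 1e-7 * (1.0 + np.abs(lam)))
        if on_axis:        # eigenvalue on jR  ->  ||G||_inf >= g
            gl = g
        else:              # no eigenvalue on jR  ->  ||G||_inf < g
            gu = g
    return 0.5 * (gl + gu)

# Example: G(s) = 1/(s+1), for which ||G||_inf = 1
A = np.array([[-1.0]]); B = np.array([[1.0]])
C = np.array([[1.0]]); D = np.array([[0.0]])
print(hinf_bisect(A, B, C, D, 0.5, 2.0))  # ~1.0
```

For this scalar example, $H = \begin{bmatrix} -1 & 1/\gamma^2 \\ -1 & 1 \end{bmatrix}$ has eigenvalues $\pm\sqrt{1 - 1/\gamma^2}$, which are purely imaginary exactly when $\gamma < 1$, so the bisection converges to the true norm.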


Remark 4.4 It is clear that $\|G\|_\infty < \gamma$ iff $\left\|\gamma^{-1}G\right\|_\infty < 1$. Hence, there is no loss of generality in assuming $\gamma = 1$. This assumption will often be made in the remainder of this book. It is also noted that there are other fast algorithms to carry out the above norm computation; nevertheless, this bisection algorithm is the simplest. ~

The $\mathcal{H}_\infty$ norm of a stable transfer function can also be estimated experimentally using the fact that the $\mathcal{H}_\infty$ norm of a stable transfer function is the maximum magnitude of the steady-state response to all possible unit-amplitude sinusoidal input signals.

4.8 Notes and References

The basic concepts of function spaces presented in this chapter can be found in any standard functional analysis textbook, for instance, Naylor and Sell [1982] and Gohberg and Goldberg [1981]. The system-theoretical interpretations of the norms and function spaces can be found in Desoer and Vidyasagar [1975]. The bisection $\mathcal{H}_\infty$ norm computational algorithm was first developed in Boyd, Balakrishnan, and Kabamba [1989]. A more efficient $\mathcal{L}_\infty$ norm computational algorithm is presented in Bruinsma and Steinbuch [1990].


5 Stability and Performance of Feedback Systems

This chapter introduces the feedback structure and discusses its stability and performance properties. The arrangement of this chapter is as follows: Section 5.1 discusses the necessity for introducing the feedback structure and describes the general feedback configuration. In Section 5.2, the well-posedness of the feedback loop is defined. Next, the notion of internal stability is introduced and the relationship is established between the state-space characterization of internal stability and the transfer matrix characterization of internal stability in Section 5.3. The stable coprime factorizations of rational matrices are also introduced in Section 5.4. Section 5.5 considers feedback properties and discusses how to achieve desired performance using feedback control. These discussions lead to a loop-shaping control design technique, which is introduced in Section 5.6. Finally, we consider the mathematical formulations of optimal $\mathcal{H}_2$ and $\mathcal{H}_\infty$ control problems in Section 5.7.

5.1 Feedback Structure

In designing control systems, there are several fundamental issues that transcend the boundaries of specific applications. Although they may differ for each application and may have different levels of importance, these issues are generic in their relationship to control design objectives and procedures. Central to these issues is the requirement to provide satisfactory performance in the face of modeling errors, system variations, and uncertainty. Indeed, this requirement was the original motivation for the development of feedback systems. Feedback is only required when system performance cannot be achieved because of uncertainty in system characteristics. The more detailed treatment of model uncertainties and their representations will be discussed in Chapter 9.

For the moment, assuming we are given a model, including a representation of uncertainty, which we believe adequately captures the essential features of the plant, the next step in the controller design process is to determine what structure is necessary to achieve the desired performance. Prefiltering input signals (or open-loop control) can change the dynamic response of the model set but cannot reduce the effect of uncertainty. If the uncertainty is too great to achieve the desired accuracy of response, then a feedback structure is required. The mere assumption of a feedback structure, however, does not guarantee a reduction of uncertainty, and there are many obstacles to achieving the uncertainty-reducing benefits of feedback. In particular, since for any reasonable model set representing a physical system uncertainty becomes large and the phase is completely unknown at sufficiently high frequencies, the loop gain must be small at those frequencies to avoid destabilizing the high-frequency system dynamics. Even worse, the feedback system actually increases uncertainty and sensitivity in the frequency ranges where uncertainty is significantly large. In other words, because of the type of sets required to reasonably model physical systems and because of the restriction that our controllers be causal, we cannot use feedback (or any other control structure) to cause our closed-loop model set to be a proper subset of the open-loop model set. Often, what can be achieved with intelligent use of feedback is a significant reduction of uncertainty for certain signals of importance with a small increase spread over other signals. Thus, the feedback design problem centers around the tradeoff involved in reducing the overall impact of uncertainty. This tradeoff also occurs, for example, when using feedback to reduce command/disturbance error while minimizing response degradation due to measurement noise. To be of practical value, a design technique must provide means for performing these tradeoffs. We will discuss these tradeoffs in more detail later in Section 5.5 and in Chapter 6.

To focus our discussion, we will consider the standard feedback configuration shown in Figure 5.1. It consists of the interconnected plant $P$ and controller $K$ forced by

[Figure 5.1: Standard Feedback Configuration]


command $r$, sensor noise $n$, plant input disturbance $d_i$, and plant output disturbance $d$. In general, all signals are assumed to be multivariable, and all transfer matrices are assumed to have appropriate dimensions.

5.2 Well-Posedness of Feedback Loop

Assume that the plant $P$ and the controller $K$ in Figure 5.1 are fixed real-rational proper transfer matrices. Then the first question one would ask is whether the feedback interconnection makes sense or is physically realizable. To be more specific, consider a simple example where

\[
P = -\frac{s-1}{s+2}, \qquad K = 1
\]

are both proper transfer functions. However,

\[
u = \frac{s+2}{3}(r - n - d) - \frac{s-1}{3}d_i,
\]

i.e., the transfer functions from the external signals $r - n - d$ and $d_i$ to $u$ are not proper. Hence, the feedback system is not physically realizable!

Definition 5.1 A feedback system is said to be well-posed if all closed-loop transfer matrices are well-defined and proper.

Now suppose that all the external signals $r, n, d$, and $d_i$ are specified and that the closed-loop transfer matrices from them to $u$ are respectively well-defined and proper. Then $y$ and all other signals are also well-defined and the related transfer matrices are proper. Furthermore, since the transfer matrices from $d$ and $n$ to $u$ are the same and differ from the transfer matrix from $r$ to $u$ by only a sign, the system is well-posed if and only if the transfer matrix from $\begin{bmatrix} d_i \\ d \end{bmatrix}$ to $u$ exists and is proper.

In order to be consistent with the notation used in the rest of the book, we shall denote

\[
\hat{K} := -K
\]

and regroup the external input signals into the feedback loop as $w_1$ and $w_2$ and regroup the input signals of the plant and the controller as $e_1$ and $e_2$. Then the feedback loop with the plant and the controller can be simply represented as in Figure 5.2, and the system is well-posed if and only if the transfer matrix from $\begin{bmatrix} w_1 \\ w_2 \end{bmatrix}$ to $e_1$ exists and is proper.

Lemma 5.1 The feedback system in Figure 5.2 is well-posed if and only if

\[
I - \hat{K}(\infty)P(\infty) \tag{5.1}
\]

is invertible.


[Figure 5.2: Internal Stability Analysis Diagram]

Proof. The system in the above diagram can be represented in equation form as

\[
e_1 = w_1 + \hat{K}e_2, \qquad e_2 = w_2 + Pe_1.
\]

Then an expression for $e_1$ can be obtained as

\[
(I - \hat{K}P)e_1 = w_1 + \hat{K}w_2.
\]

Thus well-posedness is equivalent to the condition that $(I - \hat{K}P)^{-1}$ exists and is proper. But this is equivalent to the condition that the constant term of the transfer function $I - \hat{K}P$ is invertible. $\Box$

It is straightforward to show that (5.1) is equivalent to either one of the following two conditions:

\[
\begin{bmatrix} I & -\hat{K}(\infty) \\ -P(\infty) & I \end{bmatrix} \text{ is invertible;} \tag{5.2}
\]

\[
I - P(\infty)\hat{K}(\infty) \text{ is invertible.}
\]

The well-posedness condition is simple to state in terms of state-space realizations. Introduce realizations of $P$ and $\hat{K}$:

\[
P = \left[\begin{array}{c|c} A & B \\ \hline C & D \end{array}\right] \tag{5.3}
\]

\[
\hat{K} = \left[\begin{array}{c|c} \hat{A} & \hat{B} \\ \hline \hat{C} & \hat{D} \end{array}\right]. \tag{5.4}
\]

Then $P(\infty) = D$ and $\hat{K}(\infty) = \hat{D}$. For example, well-posedness in (5.2) is equivalent to the condition that

\[
\begin{bmatrix} I & -\hat{D} \\ -D & I \end{bmatrix} \text{ is invertible.} \tag{5.5}
\]

Fortunately, in most practical cases we will have $D = 0$, and hence well-posedness for most practical control systems is guaranteed.


5.3 Internal Stability

Consider a system described by the standard block diagram in Figure 5.2 and assume the system is well-posed. Furthermore, assume that the realizations for $P(s)$ and $\hat{K}(s)$ given in equations (5.3) and (5.4) are stabilizable and detectable.

Let $x$ and $\hat{x}$ denote the state vectors for $P$ and $\hat{K}$, respectively, and write the state equations in Figure 5.2 with $w_1$ and $w_2$ set to zero:

\[
\dot{x} = Ax + Be_1 \tag{5.6}
\]
\[
e_2 = Cx + De_1 \tag{5.7}
\]
\[
\dot{\hat{x}} = \hat{A}\hat{x} + \hat{B}e_2 \tag{5.8}
\]
\[
e_1 = \hat{C}\hat{x} + \hat{D}e_2. \tag{5.9}
\]

Definition 5.2 The system of Figure 5.2 is said to be internally stable if the origin $(x, \hat{x}) = (0, 0)$ is asymptotically stable, i.e., the states $(x, \hat{x})$ go to zero from all initial states when $w_1 = 0$ and $w_2 = 0$.

Note that internal stability is a state-space notion. To get a concrete characterization of internal stability, solve equations (5.7) and (5.9) for $e_1$ and $e_2$:

\[
\begin{bmatrix} e_1 \\ e_2 \end{bmatrix} = \begin{bmatrix} I & -\hat{D} \\ -D & I \end{bmatrix}^{-1} \begin{bmatrix} 0 & \hat{C} \\ C & 0 \end{bmatrix} \begin{bmatrix} x \\ \hat{x} \end{bmatrix}.
\]

Note that the existence of the inverse is guaranteed by the well-posedness condition. Now substitute this into (5.6) and (5.8) to get

\[
\begin{bmatrix} \dot{x} \\ \dot{\hat{x}} \end{bmatrix} = \tilde{A}\begin{bmatrix} x \\ \hat{x} \end{bmatrix}
\]

where

\[
\tilde{A} = \begin{bmatrix} A & 0 \\ 0 & \hat{A} \end{bmatrix} + \begin{bmatrix} B & 0 \\ 0 & \hat{B} \end{bmatrix}\begin{bmatrix} I & -\hat{D} \\ -D & I \end{bmatrix}^{-1}\begin{bmatrix} 0 & \hat{C} \\ C & 0 \end{bmatrix}.
\]

Thus internal stability is equivalent to the condition that $\tilde{A}$ has all its eigenvalues in the open left-half plane. In fact, this can be taken as a definition of internal stability.

Lemma 5.2 The system of Figure 5.2 with given stabilizable and detectable realizations for $P$ and $\hat{K}$ is internally stable if and only if $\tilde{A}$ is a Hurwitz matrix.

It is routine to verify that the above definition of internal stability depends only on $P$ and $\hat{K}$, not on specific realizations of them, as long as the realizations of $P$ and $\hat{K}$ are both stabilizable and detectable, i.e., no extra unstable modes are introduced by the realizations.

The above notion of internal stability is defined in terms of state-space realizations of $P$ and $\hat{K}$. It is also important and useful to characterize internal stability from the


transfer matrix point of view. Note that the feedback system in Figure 5.2 is described, in terms of transfer matrices, by

\[
\begin{bmatrix} I & -\hat{K} \\ -P & I \end{bmatrix}\begin{bmatrix} e_1 \\ e_2 \end{bmatrix} = \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}. \tag{5.10}
\]

Now it is intuitively clear that if the system in Figure 5.2 is internally stable, then for all bounded inputs $(w_1, w_2)$, the outputs $(e_1, e_2)$ are also bounded. The following lemma shows that this idea leads to a transfer matrix characterization of internal stability.

Lemma 5.3 The system in Figure 5.2 is internally stable if and only if the transfer matrix

\[
\begin{bmatrix} I & -\hat{K} \\ -P & I \end{bmatrix}^{-1} = \begin{bmatrix} I + \hat{K}(I - P\hat{K})^{-1}P & \hat{K}(I - P\hat{K})^{-1} \\ (I - P\hat{K})^{-1}P & (I - P\hat{K})^{-1} \end{bmatrix} \tag{5.11}
\]

from $(w_1, w_2)$ to $(e_1, e_2)$ belongs to $\mathcal{RH}_\infty$.

Proof. As above, let $\left[\begin{array}{c|c} A & B \\ \hline C & D \end{array}\right]$ and $\left[\begin{array}{c|c} \hat{A} & \hat{B} \\ \hline \hat{C} & \hat{D} \end{array}\right]$ be stabilizable and detectable realizations of $P$ and $\hat{K}$, respectively. Let $y_1$ denote the output of $P$ and $y_2$ the output of $\hat{K}$. Then the state-space equations for the system in Figure 5.2 are

\[
\begin{bmatrix} \dot{x} \\ \dot{\hat{x}} \end{bmatrix} = \begin{bmatrix} A & 0 \\ 0 & \hat{A} \end{bmatrix}\begin{bmatrix} x \\ \hat{x} \end{bmatrix} + \begin{bmatrix} B & 0 \\ 0 & \hat{B} \end{bmatrix}\begin{bmatrix} e_1 \\ e_2 \end{bmatrix}
\]
\[
\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} C & 0 \\ 0 & \hat{C} \end{bmatrix}\begin{bmatrix} x \\ \hat{x} \end{bmatrix} + \begin{bmatrix} D & 0 \\ 0 & \hat{D} \end{bmatrix}\begin{bmatrix} e_1 \\ e_2 \end{bmatrix}
\]
\[
\begin{bmatrix} e_1 \\ e_2 \end{bmatrix} = \begin{bmatrix} w_1 \\ w_2 \end{bmatrix} + \begin{bmatrix} 0 & I \\ I & 0 \end{bmatrix}\begin{bmatrix} y_1 \\ y_2 \end{bmatrix}.
\]

The last two equations can be rewritten as

\[
\begin{bmatrix} I & -\hat{D} \\ -D & I \end{bmatrix}\begin{bmatrix} e_1 \\ e_2 \end{bmatrix} = \begin{bmatrix} 0 & \hat{C} \\ C & 0 \end{bmatrix}\begin{bmatrix} x \\ \hat{x} \end{bmatrix} + \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}.
\]

Now suppose that this system is internally stable. Then the well-posedness condition implies that $(I - D\hat{D}) = (I - P\hat{K})(\infty)$ is invertible. Hence, $(I - P\hat{K})$ is invertible. Furthermore, since the eigenvalues of

\[
\tilde{A} = \begin{bmatrix} A & 0 \\ 0 & \hat{A} \end{bmatrix} + \begin{bmatrix} B & 0 \\ 0 & \hat{B} \end{bmatrix}\begin{bmatrix} I & -\hat{D} \\ -D & I \end{bmatrix}^{-1}\begin{bmatrix} 0 & \hat{C} \\ C & 0 \end{bmatrix}
\]

are in the open left-half plane, it follows that the transfer matrix from $(w_1, w_2)$ to $(e_1, e_2)$ given in (5.11) is in $\mathcal{RH}_\infty$.


Conversely, suppose that $(I - P\hat{K})$ is invertible and the transfer matrix in (5.11) is in $\mathcal{RH}_\infty$. Then, in particular, $(I - P\hat{K})^{-1}$ is proper, which implies that $(I - P\hat{K})(\infty) = (I - D\hat{D})$ is invertible. Therefore,

\[
\begin{bmatrix} I & -\hat{D} \\ -D & I \end{bmatrix}
\]

is nonsingular. Now routine calculations give the transfer matrix from $\begin{bmatrix} w_1 \\ w_2 \end{bmatrix}$ to $\begin{bmatrix} e_1 \\ e_2 \end{bmatrix}$ in terms of the state-space realizations:

\[
\begin{bmatrix} I & -\hat{D} \\ -D & I \end{bmatrix}^{-1}\left\{\begin{bmatrix} I & -\hat{D} \\ -D & I \end{bmatrix} + \begin{bmatrix} 0 & \hat{C} \\ C & 0 \end{bmatrix}(sI - \tilde{A})^{-1}\begin{bmatrix} B & 0 \\ 0 & \hat{B} \end{bmatrix}\right\}\begin{bmatrix} I & -\hat{D} \\ -D & I \end{bmatrix}^{-1}.
\]

Since the above transfer matrix belongs to $\mathcal{RH}_\infty$, it follows that

\[
\begin{bmatrix} 0 & \hat{C} \\ C & 0 \end{bmatrix}(sI - \tilde{A})^{-1}\begin{bmatrix} B & 0 \\ 0 & \hat{B} \end{bmatrix}
\]

as a transfer matrix belongs to $\mathcal{RH}_\infty$. Finally, since $(A, B, C)$ and $(\hat{A}, \hat{B}, \hat{C})$ are stabilizable and detectable,

\[
\left(\tilde{A}, \begin{bmatrix} B & 0 \\ 0 & \hat{B} \end{bmatrix}, \begin{bmatrix} 0 & \hat{C} \\ C & 0 \end{bmatrix}\right)
\]

is stabilizable and detectable. It then follows that the eigenvalues of $\tilde{A}$ are in the open left-half plane. $\Box$

Note that to check internal stability, it is necessary (and sufficient) to test whether each of the four transfer matrices in (5.11) is in $\mathcal{RH}_\infty$. Stability cannot be concluded even if three of the four transfer matrices in (5.11) are in $\mathcal{RH}_\infty$. For example, let an interconnected system transfer function be given by

\[
P = \frac{s-1}{s+1}, \qquad \hat{K} = -\frac{1}{s-1}.
\]

Then it is easy to compute

\[
\begin{bmatrix} e_1 \\ e_2 \end{bmatrix} = \begin{bmatrix} \dfrac{s+1}{s+2} & -\dfrac{s+1}{(s-1)(s+2)} \\[2mm] \dfrac{s-1}{s+2} & \dfrac{s+1}{s+2} \end{bmatrix}\begin{bmatrix} w_1 \\ w_2 \end{bmatrix},
\]

which shows that the system is not internally stable although three of the four transfer functions are stable. This can also be seen by calculating the closed-loop $A$-matrix with any stabilizable and detectable realizations of $P$ and $\hat{K}$.
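The closed-loop $A$-matrix check mentioned above can be carried out numerically by forming $\tilde{A}$ from minimal realizations of $P$ and $\hat{K}$ (the particular realizations below are our choice):

```python
import numpy as np

# Minimal realizations: P = (s-1)/(s+1) = 1 - 2/(s+1),  Khat = -1/(s-1)
A,  B,  C,  D  = [np.array([[v]]) for v in (-1.0, 1.0, -2.0, 1.0)]
Ah, Bh, Ch, Dh = [np.array([[v]]) for v in ( 1.0, 1.0, -1.0, 0.0)]

Z = np.zeros((1, 1))
W = np.linalg.inv(np.block([[np.eye(1), -Dh], [-D, np.eye(1)]]))
Atilde = np.block([[A, Z], [Z, Ah]]) \
       + np.block([[B, Z], [Z, Bh]]) @ W @ np.block([[Z, Ch], [C, Z]])
print(sorted(np.linalg.eigvals(Atilde).real))  # [-2.0, 1.0]: unstable mode at s = 1
```

The eigenvalue at $+1$ confirms that the unstable pole/zero cancelation between $P$ and $\hat{K}$ destroys internal stability, even though three of the four closed-loop transfer functions are stable.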


Remark 5.1 It should be noted that internal stability is a basic requirement for a practical feedback system. This is because all interconnected systems may be unavoidably subject to some nonzero initial conditions and some (possibly small) errors, and it cannot be tolerated in practice that such errors at some locations will lead to unbounded signals at some other locations in the closed-loop system. Internal stability guarantees that all signals in a system are bounded provided that the injected signals (at any locations) are bounded. ~

However, there are some special cases under which determining system stability is simple.

Corollary 5.4 Suppose $\hat{K} \in \mathcal{RH}_\infty$. Then the system in Figure 5.2 is internally stable iff $(I - P\hat{K})^{-1}P \in \mathcal{RH}_\infty$.

Proof. The necessity is obvious. To prove the sufficiency, it is sufficient to show that $(I - P\hat{K})^{-1} \in \mathcal{RH}_\infty$. But this follows from

\[
(I - P\hat{K})^{-1} = I + (I - P\hat{K})^{-1}P\hat{K}
\]

and $(I - P\hat{K})^{-1}P, \hat{K} \in \mathcal{RH}_\infty$. $\Box$

This corollary is in fact the basis for classical control theory, where stability is checked only for one closed-loop transfer function, with the implicit assumption that the controller itself is stable. Also, we have

Corollary 5.5 Suppose $P \in \mathcal{RH}_\infty$. Then the system in Figure 5.2 is internally stable iff $\hat{K}(I - P\hat{K})^{-1} \in \mathcal{RH}_\infty$.

Corollary 5.6 Suppose $P \in \mathcal{RH}_\infty$ and $\hat{K} \in \mathcal{RH}_\infty$. Then the system in Figure 5.2 is internally stable iff $(I - P\hat{K})^{-1} \in \mathcal{RH}_\infty$.

To study the more general case, define

\[
n_k := \text{number of open rhp poles of } \hat{K}(s)
\]
\[
n_p := \text{number of open rhp poles of } P(s).
\]

Theorem 5.7 The system is internally stable if and only if

(i) the number of open rhp poles of $P(s)\hat{K}(s)$ equals $n_k + n_p$;

(ii) $\Phi(s) := \det(I - P(s)\hat{K}(s))$ has all its zeros in the open left-half plane (i.e., $(I - P(s)\hat{K}(s))^{-1}$ is stable).


Proof. It is easy to show that $P\hat{K}$ and $(I - P\hat{K})^{-1}$ have the following realizations:

\[
P\hat{K} = \left[\begin{array}{cc|c} A & B\hat{C} & B\hat{D} \\ 0 & \hat{A} & \hat{B} \\ \hline C & D\hat{C} & D\hat{D} \end{array}\right]
\]

\[
(I - P\hat{K})^{-1} = \left[\begin{array}{c|c} \bar{A} & \bar{B} \\ \hline \bar{C} & \bar{D} \end{array}\right]
\]

where

\[
\bar{A} = \begin{bmatrix} A & B\hat{C} \\ 0 & \hat{A} \end{bmatrix} + \begin{bmatrix} B\hat{D} \\ \hat{B} \end{bmatrix}(I - D\hat{D})^{-1}\begin{bmatrix} C & D\hat{C} \end{bmatrix}
\]
\[
\bar{B} = \begin{bmatrix} B\hat{D} \\ \hat{B} \end{bmatrix}(I - D\hat{D})^{-1}
\]
\[
\bar{C} = (I - D\hat{D})^{-1}\begin{bmatrix} C & D\hat{C} \end{bmatrix}
\]
\[
\bar{D} = (I - D\hat{D})^{-1}.
\]

It is also easy to see that $\bar{A} = \tilde{A}$. Hence, the system is internally stable iff $\bar{A}$ is stable. Now suppose that the system is internally stable; then $(I - P\hat{K})^{-1} \in \mathcal{RH}_\infty$. This implies that all zeros of $\det(I - P(s)\hat{K}(s))$ must be in the left-half plane. So we only need to show that, given condition (ii), condition (i) is necessary and sufficient for internal stability. This follows by noting that $(\bar{A}, \bar{B})$ is stabilizable iff

\[
\left(\begin{bmatrix} A & B\hat{C} \\ 0 & \hat{A} \end{bmatrix}, \begin{bmatrix} B\hat{D} \\ \hat{B} \end{bmatrix}\right) \tag{5.12}
\]

is stabilizable, and $(\bar{C}, \bar{A})$ is detectable iff

\[
\left(\begin{bmatrix} C & D\hat{C} \end{bmatrix}, \begin{bmatrix} A & B\hat{C} \\ 0 & \hat{A} \end{bmatrix}\right) \tag{5.13}
\]

is detectable. But conditions (5.12) and (5.13) are equivalent to condition (i), i.e., $P\hat{K}$ has no unstable pole/zero cancelations. $\Box$

With this observation, the MIMO version of the Nyquist stability theorem is obvious.

Theorem 5.8 (Nyquist Stability Theorem) The system is internally stable if and only if condition (i) in Theorem 5.7 is satisfied and the Nyquist plot of $\Phi(j\omega)$ for $-\infty \leq \omega \leq \infty$ encircles the origin, $(0, 0)$, $n_k + n_p$ times in the counter-clockwise direction.

Proof. Note that by the SISO Nyquist stability theorem, $\Phi(s)$ has all its zeros in the open left-half plane if and only if the Nyquist plot of $\Phi(j\omega)$ for $-\infty \leq \omega \leq \infty$ encircles the origin, $(0, 0)$, $n_k + n_p$ times in the counter-clockwise direction. $\Box$


5.4 Coprime Factorization over $\mathcal{RH}_\infty$

Recall that two polynomials $m(s)$ and $n(s)$, with, for example, real coefficients, are said to be coprime if their greatest common divisor is 1 (equivalently, they have no common zeros). It follows from Euclid's algorithm¹ that two polynomials $m$ and $n$ are coprime iff there exist polynomials $x(s)$ and $y(s)$ such that $xm + yn = 1$; such an equation is called a Bezout identity. Similarly, two transfer functions $m(s)$ and $n(s)$ in $\mathcal{RH}_\infty$ are said to be coprime over $\mathcal{RH}_\infty$ if there exist $x, y \in \mathcal{RH}_\infty$ such that

\[
xm + yn = 1.
\]

The more primitive, but equivalent, definition is that $m$ and $n$ are coprime if every common divisor of $m$ and $n$ is invertible in $\mathcal{RH}_\infty$, i.e.,

\[
h, mh^{-1}, nh^{-1} \in \mathcal{RH}_\infty \implies h^{-1} \in \mathcal{RH}_\infty.
\]

More generally, we have

Definition 5.3 Two matrices $M$ and $N$ in $\mathcal{RH}_\infty$ are right coprime over $\mathcal{RH}_\infty$ if they have the same number of columns and if there exist matrices $X_r$ and $Y_r$ in $\mathcal{RH}_\infty$ such that

\[
\begin{bmatrix} X_r & Y_r \end{bmatrix}\begin{bmatrix} M \\ N \end{bmatrix} = X_rM + Y_rN = I.
\]

Similarly, two matrices $\tilde{M}$ and $\tilde{N}$ in $\mathcal{RH}_\infty$ are left coprime over $\mathcal{RH}_\infty$ if they have the same number of rows and if there exist matrices $X_l$ and $Y_l$ in $\mathcal{RH}_\infty$ such that

\[
\begin{bmatrix} \tilde{M} & \tilde{N} \end{bmatrix}\begin{bmatrix} X_l \\ Y_l \end{bmatrix} = \tilde{M}X_l + \tilde{N}Y_l = I.
\]

Note that these definitions are equivalent to saying that the matrix $\begin{bmatrix} M \\ N \end{bmatrix}$ is left-invertible in $\mathcal{RH}_\infty$ and the matrix $\begin{bmatrix} \tilde{M} & \tilde{N} \end{bmatrix}$ is right-invertible in $\mathcal{RH}_\infty$. These two equations are often called Bezout identities.

Now let $P$ be a proper real-rational matrix. A right-coprime factorization (rcf) of $P$ is a factorization $P = NM^{-1}$ where $N$ and $M$ are right-coprime over $\mathcal{RH}_\infty$. Similarly, a left-coprime factorization (lcf) has the form $P = \tilde{M}^{-1}\tilde{N}$ where $\tilde{N}$ and $\tilde{M}$ are left-coprime over $\mathcal{RH}_\infty$. A matrix $P(s) \in \mathcal{R}_p(s)$ is said to have a double coprime factorization if there exist a right coprime factorization $P = NM^{-1}$, a left coprime factorization $P = \tilde{M}^{-1}\tilde{N}$, and $X_r, Y_r, X_l, Y_l \in \mathcal{RH}_\infty$ such that

\[
\begin{bmatrix} X_r & Y_r \\ -\tilde{N} & \tilde{M} \end{bmatrix}\begin{bmatrix} M & -Y_l \\ N & X_l \end{bmatrix} = I. \tag{5.14}
\]

Of course, implicit in these definitions is the requirement that both $M$ and $\tilde{M}$ be square and nonsingular.

¹See, e.g., [Kailath, 1980, pp. 140-141].


Theorem 5.9 Suppose $P(s)$ is a proper real-rational matrix and

\[
P = \left[\begin{array}{c|c} A & B \\ \hline C & D \end{array}\right]
\]

is a stabilizable and detectable realization. Let $F$ and $L$ be such that $A + BF$ and $A + LC$ are both stable, and define

\[
\begin{bmatrix} M & -Y_l \\ N & X_l \end{bmatrix} = \left[\begin{array}{c|cc} A + BF & B & -L \\ \hline F & I & 0 \\ C + DF & D & I \end{array}\right] \tag{5.15}
\]

\[
\begin{bmatrix} X_r & Y_r \\ -\tilde{N} & \tilde{M} \end{bmatrix} = \left[\begin{array}{c|cc} A + LC & -(B + LD) & L \\ \hline F & I & 0 \\ C & -D & I \end{array}\right]. \tag{5.16}
\]

Then $P = NM^{-1} = \tilde{M}^{-1}\tilde{N}$ are rcf and lcf, respectively, and, furthermore, (5.14) is satisfied.

Proof. The theorem follows by verifying the equation (5.14). $\Box$

Remark 5.2 Note that if $P$ is stable, then we can take $X_r = X_l = I$, $Y_r = Y_l = 0$, $N = \tilde{N} = P$, $M = \tilde{M} = I$. ~

Remark 5.3 The coprime factorization of a transfer matrix can be given a feedback control interpretation. For example, right coprime factorization comes out naturally from changing the control variable by a state feedback. Consider the state-space equations for a plant $P$:

\[
\dot{x} = Ax + Bu
\]
\[
y = Cx + Du.
\]

Next, introduce a state feedback and change the variable

\[
v := u - Fx
\]

where $F$ is such that $A + BF$ is stable. Then we get

\[
\dot{x} = (A + BF)x + Bv
\]
\[
u = Fx + v
\]
\[
y = (C + DF)x + Dv.
\]

Evidently from these equations, the transfer matrix from $v$ to $u$ is

\[
M(s) = \left[\begin{array}{c|c} A + BF & B \\ \hline F & I \end{array}\right],
\]

and that from $v$ to $y$ is

\[
N(s) = \left[\begin{array}{c|c} A + BF & B \\ \hline C + DF & D \end{array}\right].
\]

Therefore,

\[
u = Mv, \qquad y = Nv,
\]

so that $y = NM^{-1}u$, i.e., $P = NM^{-1}$. ~
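This state-feedback construction is easy to exercise numerically. The sketch below (plant, pole location, and gain are our illustrative choices; the gain could equally be produced by `scipy.signal.place_poles`) builds $M$ and $N$ for an unstable scalar plant and verifies $P = NM^{-1}$ at a test point:

```python
import numpy as np

# Unstable plant (our example): P(s) = 1/(s-1), i.e. A=1, B=1, C=1, D=0
A = np.array([[1.0]]); B = np.array([[1.0]])
C = np.array([[1.0]]); D = np.array([[0.0]])

F = np.array([[-3.0]])        # state feedback: A + BF = -2 (stable)
AF = A + B @ F

def tf(Ax, Bx, Cx, Dx, s):
    """Evaluate Cx (sI - Ax)^{-1} Bx + Dx at the complex point s."""
    return Cx @ np.linalg.solve(s * np.eye(Ax.shape[0]) - Ax, Bx) + Dx

s = 2.0 + 1.0j                # arbitrary test point
M = tf(AF, B, F, np.eye(1), s)          # M = [A+BF, B; F, I]
N = tf(AF, B, C + D @ F, D, s)          # N = [A+BF, B; C+DF, D]
P = tf(A, B, C, D, s)
print(np.allclose(N @ np.linalg.inv(M), P))  # True: P = N M^{-1}
```

Here $M(s) = (s-1)/(s+2)$ and $N(s) = 1/(s+2)$ are both stable, and the unstable pole of $P$ reappears as a right half-plane zero of $M$.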

We shall now see how coprime factorizations can be used to obtain alternative characterizations of internal stability conditions. Consider again the standard stability analysis diagram in Figure 5.2. We begin with any rcf's and lcf's of $P$ and $\hat{K}$:

\[
P = NM^{-1} = \tilde{M}^{-1}\tilde{N} \tag{5.17}
\]
\[
\hat{K} = UV^{-1} = \tilde{V}^{-1}\tilde{U}. \tag{5.18}
\]

Lemma 5.10 Consider the system in Figure 5.2. The following conditions are equivalent:

1. The feedback system is internally stable.

2. $\begin{bmatrix} M & U \\ N & V \end{bmatrix}$ is invertible in $\mathcal{RH}_\infty$.

3. $\begin{bmatrix} \tilde{V} & -\tilde{U} \\ -\tilde{N} & \tilde{M} \end{bmatrix}$ is invertible in $\mathcal{RH}_\infty$.

4. $\tilde{M}V - \tilde{N}U$ is invertible in $\mathcal{RH}_\infty$.

5. $\tilde{V}M - \tilde{U}N$ is invertible in $\mathcal{RH}_\infty$.

Proof. As we saw in Lemma 5.3, internal stability is equivalent to

\[
\begin{bmatrix} I & -\hat{K} \\ -P & I \end{bmatrix}^{-1} \in \mathcal{RH}_\infty
\]

or, equivalently,

\[
\begin{bmatrix} I & \hat{K} \\ P & I \end{bmatrix}^{-1} \in \mathcal{RH}_\infty. \tag{5.19}
\]

Now

\[
\begin{bmatrix} I & \hat{K} \\ P & I \end{bmatrix} = \begin{bmatrix} I & UV^{-1} \\ NM^{-1} & I \end{bmatrix} = \begin{bmatrix} M & U \\ N & V \end{bmatrix}\begin{bmatrix} M^{-1} & 0 \\ 0 & V^{-1} \end{bmatrix}
\]

so that

\[
\begin{bmatrix} I & \hat{K} \\ P & I \end{bmatrix}^{-1} = \begin{bmatrix} M & 0 \\ 0 & V \end{bmatrix}\begin{bmatrix} M & U \\ N & V \end{bmatrix}^{-1}.
\]

Since the matrices

\[
\begin{bmatrix} M & 0 \\ 0 & V \end{bmatrix}, \qquad \begin{bmatrix} M & U \\ N & V \end{bmatrix}
\]

are right-coprime (this fact is left as an exercise for the reader), (5.19) holds iff

\[
\begin{bmatrix} M & U \\ N & V \end{bmatrix}^{-1} \in \mathcal{RH}_\infty.
\]

This proves the equivalence of conditions 1 and 2. The equivalence of 1 and 3 is proved similarly.

Conditions 4 and 5 are implied by 2 and 3 from the following equation:

\[
\begin{bmatrix} \tilde{V} & -\tilde{U} \\ -\tilde{N} & \tilde{M} \end{bmatrix}\begin{bmatrix} M & U \\ N & V \end{bmatrix} = \begin{bmatrix} \tilde{V}M - \tilde{U}N & 0 \\ 0 & \tilde{M}V - \tilde{N}U \end{bmatrix}.
\]

Since the left-hand side of the above equation is invertible in $\mathcal{RH}_\infty$, so is the right-hand side. Hence, conditions 4 and 5 are satisfied. We only need to show that either condition 4 or condition 5 implies condition 1. Let us show condition 5 $\Rightarrow$ 1; this is obvious since

\[
\begin{bmatrix} I & \hat{K} \\ P & I \end{bmatrix}^{-1} = \begin{bmatrix} I & \tilde{V}^{-1}\tilde{U} \\ NM^{-1} & I \end{bmatrix}^{-1} = \begin{bmatrix} M & 0 \\ 0 & I \end{bmatrix}\begin{bmatrix} \tilde{V}M & \tilde{U} \\ N & I \end{bmatrix}^{-1}\begin{bmatrix} \tilde{V} & 0 \\ 0 & I \end{bmatrix} \in \mathcal{RH}_\infty
\]

if $\begin{bmatrix} \tilde{V}M & \tilde{U} \\ N & I \end{bmatrix}^{-1} \in \mathcal{RH}_\infty$, i.e., if condition 5 is satisfied. $\Box$

Combining Lemma 5.10 and Theorem 5.9, we have the following corollary.

Corollary 5.11 Let $P$ be a proper real-rational matrix and $P = NM^{-1} = \tilde{M}^{-1}\tilde{N}$ be corresponding rcf and lcf over $\mathcal{RH}_\infty$. Then there exists a controller

\[
K_0 = U_0V_0^{-1} = \tilde{V}_0^{-1}\tilde{U}_0
\]

with $U_0, V_0, \tilde{U}_0$, and $\tilde{V}_0$ in $\mathcal{RH}_\infty$ such that

\[
\begin{bmatrix} \tilde{V}_0 & -\tilde{U}_0 \\ -\tilde{N} & \tilde{M} \end{bmatrix}\begin{bmatrix} M & U_0 \\ N & V_0 \end{bmatrix} = \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix}. \tag{5.20}
\]

Furthermore, let $F$ and $L$ be such that $A + BF$ and $A + LC$ are stable. Then a particular set of state-space realizations for these matrices can be given by

\[
\begin{bmatrix} M & U_0 \\ N & V_0 \end{bmatrix} = \left[\begin{array}{c|cc} A + BF & B & -L \\ \hline F & I & 0 \\ C + DF & D & I \end{array}\right] \tag{5.21}
\]

\[
\begin{bmatrix} \tilde{V}_0 & -\tilde{U}_0 \\ -\tilde{N} & \tilde{M} \end{bmatrix} = \left[\begin{array}{c|cc} A + LC & -(B + LD) & L \\ \hline F & I & 0 \\ C & -D & I \end{array}\right]. \tag{5.22}
\]


Proof. The idea behind the choice of these matrices is as follows. Using observer theory, find a controller $K_0$ achieving internal stability; for example,

\[
K_0 := \left[\begin{array}{c|c} A + BF + LC + LDF & -L \\ \hline F & 0 \end{array}\right]. \tag{5.23}
\]

Perform factorizations

\[
K_0 = U_0V_0^{-1} = \tilde{V}_0^{-1}\tilde{U}_0
\]

which are analogous to the ones performed on $P$. Then Lemma 5.10 implies that each of the two left-hand side block matrices of (5.20) must be invertible in $\mathcal{RH}_\infty$. In fact, (5.20) is satisfied by comparing it with the equation (5.14). $\Box$
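The observer-based controller (5.23) is simple to construct and test numerically. The sketch below (plant, pole locations, and gains are our illustrative choices; the gains could also be computed with `scipy.signal.place_poles`) builds $K_0$ for a scalar unstable plant and checks that the closed-loop $A$-matrix is Hurwitz:

```python
import numpy as np

# Assumed plant: P(s) = 1/(s-1)
A = np.array([[1.0]]); B = np.array([[1.0]])
C = np.array([[1.0]]); D = np.array([[0.0]])

F = np.array([[-3.0]])   # A + BF = -2 (stable)
L = np.array([[-3.0]])   # A + LC = -2 (stable)

# Observer-based controller (5.23): K0 = [A+BF+LC+LDF, -L; F, 0]
Ak = A + B @ F + L @ C + L @ D @ F
Bk = -L
Ck = F

# Closed-loop A-matrix of the Figure 5.2 loop with Khat = K0 (D = Dk = 0)
Acl = np.block([[A, B @ Ck], [Bk @ C, Ak]])
print(np.linalg.eigvals(Acl).real)  # all in the open left-half plane
```

For these numbers the closed-loop poles sit at $-2$ (a double pole), confirming internal stability of the loop closed with $K_0$.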

5.5 Feedback Properties

In this section, we discuss the properties of a feedback system. In particular, we consider the benefits of the feedback structure and the concept of design tradeoffs for conflicting objectives, namely, how to achieve the benefits of feedback in the face of uncertainties.

[Figure 5.3: Standard Feedback Configuration]

Consider again the feedback system shown in Figure 5.1. For convenience, the system diagram is shown again in Figure 5.3. For further discussion, it is convenient to define the input loop transfer matrix, $L_i$, and the output loop transfer matrix, $L_o$, as

\[
L_i = KP, \qquad L_o = PK,
\]

respectively, where $L_i$ is obtained by breaking the loop at the input ($u$) of the plant while $L_o$ is obtained by breaking the loop at the output ($y$) of the plant. The input sensitivity matrix is defined as the transfer matrix from $d_i$ to $u_p$:

\[
S_i = (I + L_i)^{-1}, \qquad u_p = S_id_i.
\]

And the output sensitivity matrix is defined as the transfer matrix from $d$ to $y$:

\[
S_o = (I + L_o)^{-1}, \qquad y = S_od.
\]


The input and output complementary sensitivity matrices are defined as

\[
T_i = I - S_i = L_i(I + L_i)^{-1}
\]
\[
T_o = I - S_o = L_o(I + L_o)^{-1},
\]

respectively. (The word complementary is used to signify the fact that $T$ is the complement of $S$: $T = I - S$.) The matrix $I + L_i$ is called the input return difference matrix and $I + L_o$ is called the output return difference matrix.

It is easy to see that the closed-loop system, if it is internally stable, satisfies the following equations:

\[
y = T_o(r - n) + S_oPd_i + S_od \tag{5.24}
\]
\[
r - y = S_o(r - d) + T_on - S_oPd_i \tag{5.25}
\]
\[
u = KS_o(r - n) - KS_od - T_id_i \tag{5.26}
\]
\[
u_p = KS_o(r - n) - KS_od + S_id_i. \tag{5.27}
\]

These four equations show the fundamental benefits and design objectives inherent in feedback loops. For example, equation (5.24) shows that the effects of disturbance $d$ on the plant output can be made "small" by making the output sensitivity function $S_o$ small. Similarly, equation (5.27) shows that the effects of disturbance $d_i$ on the plant input can be made small by making the input sensitivity function $S_i$ small. The notion of smallness for a transfer matrix in a certain range of frequencies can be made explicit using frequency-dependent singular values; for example, $\bar\sigma(S_o) < 1$ over a frequency range would mean that the effects of disturbance $d$ at the plant output are effectively desensitized over that frequency range.

Hence, good disturbance rejection at the plant output ($y$) would require that

\[
\bar\sigma(S_o) = \bar\sigma\left((I + PK)^{-1}\right) = \frac{1}{\underline\sigma(I + PK)} \quad \text{(for disturbance at plant output, } d\text{)}
\]
\[
\bar\sigma(S_oP) = \bar\sigma\left((I + PK)^{-1}P\right) = \bar\sigma(PS_i) \quad \text{(for disturbance at plant input, } d_i\text{)}
\]

be made small, and good disturbance rejection at the plant input ($u_p$) would require that

\[
\bar\sigma(S_i) = \bar\sigma\left((I + KP)^{-1}\right) = \frac{1}{\underline\sigma(I + KP)} \quad \text{(for disturbance at plant input, } d_i\text{)}
\]
\[
\bar\sigma(S_iK) = \bar\sigma\left(K(I + PK)^{-1}\right) = \bar\sigma(KS_o) \quad \text{(for disturbance at plant output, } d\text{)}
\]

be made small, particularly in the low-frequency range where $d$ and $d_i$ are usually significant.

Note that

\[
\underline\sigma(PK) - 1 \leq \underline\sigma(I + PK) \leq \underline\sigma(PK) + 1
\]
\[
\underline\sigma(KP) - 1 \leq \underline\sigma(I + KP) \leq \underline\sigma(KP) + 1;
\]

then

\[
\frac{1}{\underline\sigma(PK) + 1} \leq \bar\sigma(S_o) \leq \frac{1}{\underline\sigma(PK) - 1}, \quad \text{if } \underline\sigma(PK) > 1
\]
\[
\frac{1}{\underline\sigma(KP) + 1} \leq \bar\sigma(S_i) \leq \frac{1}{\underline\sigma(KP) - 1}, \quad \text{if } \underline\sigma(KP) > 1.
\]

These equations imply that

\[
\bar\sigma(S_o) \ll 1 \iff \underline\sigma(PK) \gg 1
\]
\[
\bar\sigma(S_i) \ll 1 \iff \underline\sigma(KP) \gg 1.
\]

Now suppose $P$ and $K$ are invertible; then

\[
\underline\sigma(PK) \gg 1 \text{ or } \underline\sigma(KP) \gg 1 \implies \bar\sigma(S_oP) = \bar\sigma\left((I + PK)^{-1}P\right) \approx \bar\sigma(K^{-1}) = \frac{1}{\underline\sigma(K)}
\]
\[
\underline\sigma(PK) \gg 1 \text{ or } \underline\sigma(KP) \gg 1 \implies \bar\sigma(KS_o) = \bar\sigma\left(K(I + PK)^{-1}\right) \approx \bar\sigma(P^{-1}) = \frac{1}{\underline\sigma(P)}.
\]

Hence good performance at the plant output ($y$) requires in general a large output loop gain $\underline\sigma(L_o) = \underline\sigma(PK) \gg 1$ in the frequency range where $d$ is significant, for desensitizing $d$, and a large enough controller gain $\underline\sigma(K) \gg 1$ in the frequency range where $d_i$ is significant, for desensitizing $d_i$. Similarly, good performance at the plant input ($u_p$) requires in general a large input loop gain $\underline\sigma(L_i) = \underline\sigma(KP) \gg 1$ in the frequency range where $d_i$ is significant, for desensitizing $d_i$, and a large enough plant gain $\underline\sigma(P) \gg 1$ in the frequency range where $d$ is significant, which cannot be changed by controller design, for desensitizing $d$. (It should be noted that in general $S_o \neq S_i$ unless $K$ and $P$ are square and diagonal, which is true if $P$ is a scalar system. Hence, small $\bar\sigma(S_o)$ does not necessarily imply small $\bar\sigma(S_i)$; in other words, good disturbance rejection at the output does not necessarily mean good disturbance rejection at the plant input.)

Hence, good multivariable feedback loop design boils down to achieving high loop (and possibly controller) gains in the necessary frequency range.
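The sensitivity bounds above hold pointwise in frequency, so they can be spot-checked for any constant loop-gain matrix. The sketch below (a synthetic loop gain $L$ with $\underline\sigma(L) = 2$ by construction; all names are ours) verifies $1/(\underline\sigma(PK)+1) \leq \bar\sigma(S_o) \leq 1/(\underline\sigma(PK)-1)$:

```python
import numpy as np

rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((3, 3)))
V, _ = np.linalg.qr(rng.standard_normal((3, 3)))
L = U @ np.diag([5.0, 3.0, 2.0]) @ V.T      # loop gain PK with sigma_min = 2

smin = np.linalg.svd(L, compute_uv=False)[-1]
So = np.linalg.inv(np.eye(3) + L)           # output sensitivity (I + PK)^{-1}
smax_So = np.linalg.svd(So, compute_uv=False)[0]

# 1/(smin+1) <= sigma_max(So) <= 1/(smin-1) whenever smin > 1
print(1/(smin + 1) <= smax_So <= 1/(smin - 1))  # True
```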

Despite the simplicity of this statement, feedback design is by no means trivial. This is true because loop gains cannot be made arbitrarily high over arbitrarily large frequency ranges. Rather, they must satisfy certain performance tradeoffs and design limitations. A major performance tradeoff, for example, concerns command and disturbance error reduction versus stability under model uncertainty. Assume that the plant model is perturbed to $(I + \Delta)P$ with $\Delta$ stable, and assume that the system is nominally stable, i.e., the closed-loop system with $\Delta = 0$ is stable. Now the perturbed closed-loop system is stable if

\[
\det\left(I + (I + \Delta)PK\right) = \det(I + PK)\det(I + \Delta T_o)
\]

has no right-half plane zero. This would in general amount to requiring that $\|\Delta T_o\|$ be small or that $\bar\sigma(T_o)$ be small at those frequencies where $\Delta$ is significant, typically in the


high frequency range, which in turn implies that the loop gain, �(Lo), should be smallat those frequencies.

Still another tradeoff is with the sensor noise error reduction. The conflict between the disturbance rejection and the sensor noise reduction is evident in equation (5.24). Large σ(Lo(jω)) values over a large frequency range make errors due to d small. However, they also make errors due to n large, because this noise is "passed through" over the same frequency range, i.e.,

y = To(r − n) + SoPdi + Sod ≈ r − n.

Note that n is typically significant in the high frequency range. Worse still, large loop gains outside of the bandwidth of P, i.e., σ(Lo(jω)) ≫ 1 or σ(Li(jω)) ≫ 1 while σ̄(P(jω)) ≪ 1, can make the control activity (u) quite unacceptable, which may cause saturation of the actuators. This follows from

u = KSo(r − n − d) − Tidi = SiK(r − n − d) − Tidi ≈ P⁻¹(r − n − d) − di.

Here, we have assumed P to be square and invertible for convenience. The resulting equation shows that disturbances and sensor noise are actually amplified at u whenever the frequency range significantly exceeds the bandwidth of P, since for ω such that σ̄(P(jω)) ≪ 1 we have

σ̄[P⁻¹(jω)] = 1/σ[P(jω)] ≫ 1.
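The identity σ̄[P⁻¹(jω)] = 1/σ[P(jω)] used above is a standard singular-value fact; a quick numerical spot-check (with numpy, using a random invertible matrix standing in for P(jω)) might look like:

```python
import numpy as np

rng = np.random.default_rng(0)
# A random complex matrix standing in for the frequency response P(jw).
P = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))

smax_inv = np.linalg.svd(np.linalg.inv(P), compute_uv=False)[0]  # largest s.v. of P^{-1}
smin_P = np.linalg.svd(P, compute_uv=False)[-1]                  # smallest s.v. of P

# sigma_max(P^{-1}) = 1 / sigma_min(P)
assert np.isclose(smax_inv, 1.0 / smin_P)
```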

Similarly, the controller gain, σ̄(K), should also be kept not too large in the frequency range where the loop gain is small, in order not to saturate the actuators. This is because for small loop gain σ̄(Lo(jω)) ≪ 1 or σ̄(Li(jω)) ≪ 1,

u = KSo(r − n − d) − Tidi ≈ K(r − n − d).

Therefore, it is desirable to keep σ̄(K) not too large when the loop gain is small.

To summarize the above discussion, we note that good performance requires in some frequency range, typically some low frequency range (0, ωl):

σ(PK) ≫ 1,  σ(KP) ≫ 1,  σ(K) ≫ 1

and good robustness and good sensor noise rejection require in some frequency range, typically some high frequency range (ωh, ∞):

σ̄(PK) ≪ 1,  σ̄(KP) ≪ 1,  σ̄(K) ≤ M

where M is not too large. These design requirements are shown graphically in Figure 5.4. The specific frequencies ωl and ωh depend on the specific application and the knowledge one has of the disturbance characteristics, the modeling uncertainties, and the sensor noise levels.


[Figure 5.4: Desired Loop Gain. The figure plots σ̄(L) and σ(L) versus log ω: the loop gains must lie above the performance boundary below ωl and below the robustness boundary above ωh.]

5.6 The Concept of Loop Shaping

The analysis in the last section motivates a conceptually simple controller design technique: loop shaping. Loop shaping controller design involves essentially finding a controller K that shapes the loop transfer function L so that the loop gains, σ̄(L) and σ(L), clear the boundaries specified by the performance requirements at low frequencies and by the robustness requirements at high frequencies, as shown in Figure 5.4.

In the SISO case, the loop shaping design technique is particularly effective and simple since σ̄(L) = σ(L) = |L|. The design procedure can be completed in two steps:

SISO Loop Shaping

(1) Find a rational strictly proper transfer function L which contains all the right half plane poles and zeros of P such that |L| clears the boundaries specified by the performance requirements at low frequencies and by the robustness requirements at high frequencies, as shown in Figure 5.4.

L must also be chosen so that 1 + L has all its zeros in the open left half plane, which can usually be guaranteed by making L well-behaved in the crossover region, i.e., L should not be decreasing too fast in the frequency range where |L(jω)| ≈ 1.

(2) The controller is given by K = L=P .

The loop shaping for MIMO systems can be done similarly if the singular values of the loop transfer function are used for the loop gains.


MIMO Loop Shaping

(1) Find a rational strictly proper transfer matrix L which contains all the right half plane poles and zeros of P, so that the product of P and P⁻¹L (or LP⁻¹ and P) involves no unstable pole and/or zero cancellations, and such that σ(L) clears the boundary specified by the performance requirements at low frequencies and σ̄(L) clears the boundary specified by the robustness requirements at high frequencies, as shown in Figure 5.4.

L must also be chosen so that det(I + L) has all its zeros in the open left half plane. (This is not easy to guarantee for MIMO systems.)

(2) The controller is given by K = P⁻¹L if L is the output loop transfer function (or K = LP⁻¹ if L is the input loop transfer function).

The loop shaping design technique can be quite useful, especially for SISO control system design. However, there are severe limitations when it is used for MIMO system design.

Limitations of MIMO Loop Shaping

Although the above loop shaping design can be effective in some applications, there are severe intrinsic limitations. Some of these limitations are listed below:

• The loop shaping technique described above can only effectively deal with problems with uniformly specified performance and robustness specifications. More specifically, the method cannot effectively deal with problems with different specifications in different channels and/or problems with different uncertainty characteristics in different channels without introducing significant conservatism. To illustrate this difficulty, consider an uncertain dynamical system

PΔ = (I + Δ)P

where P is the nominal plant and Δ is the multiplicative modeling error. Assume that Δ can be written in the following form:

Δ = Δ̃Wt,  σ̄(Δ̃) < 1.

Then, for robust stability, we would require σ̄(ΔTo) = σ̄(Δ̃WtTo) < 1, or σ̄(WtTo) ≤ 1. If a uniform bound is required on the loop gain to apply the loop shaping technique, we would need to overbound σ̄(WtTo):

σ̄(WtTo) ≤ σ̄(Wt)σ̄(To) ≤ σ̄(Wt)σ̄(Lo)/(1 − σ̄(Lo)),  if σ̄(Lo) < 1,

and the robust stability requirement is implied by

σ̄(Lo) ≤ 1/(σ̄(Wt) + 1) ≈ 1/σ̄(Wt),  if σ̄(Lo) < 1.


Similarly, if the performance requirements, say output disturbance rejection, are not uniformly specified in all channels but by a weighting matrix Ws such that σ̄(WsSo) ≤ 1, then it is also necessary to overbound σ̄(WsSo) in order to apply the loop shaping technique:

σ̄(WsSo) ≤ σ̄(Ws)σ̄(So) ≤ σ̄(Ws)/(σ(Lo) − 1),  if σ(Lo) > 1,

and the performance requirement is implied by

σ(Lo) ≥ σ̄(Ws) + 1 ≈ σ̄(Ws),  if σ(Lo) > 1.
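The two overbounds above rest on the singular-value inequalities σ̄(To) ≤ σ̄(Lo)/(1 − σ̄(Lo)) (valid when σ̄(Lo) < 1) and σ̄(So) ≤ 1/(σ(Lo) − 1) (valid when σ(Lo) > 1), which can be spot-checked on random loop matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
I = np.eye(3)
sv = lambda M: np.linalg.svd(M, compute_uv=False)  # singular values, descending

for _ in range(100):
    # Case sigma_max(Lo) < 1: bound on the complementary sensitivity To.
    Lo = rng.standard_normal((3, 3))
    Lo *= 0.5 / sv(Lo)[0]                    # scale so sigma_max(Lo) = 0.5 < 1
    To = Lo @ np.linalg.inv(I + Lo)
    assert sv(To)[0] <= sv(Lo)[0] / (1 - sv(Lo)[0]) + 1e-12

    # Case sigma_min(Lo) > 1: bound on the sensitivity So.
    Lo = rng.standard_normal((3, 3))
    Lo *= 2.0 / sv(Lo)[-1]                   # scale so sigma_min(Lo) = 2 > 1
    So = np.linalg.inv(I + Lo)
    assert sv(So)[0] <= 1 / (sv(Lo)[-1] - 1) + 1e-12
```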

It is possible that the bounds for the loop shape contradict each other in some frequency range, as shown in Figure 5.5. However, this does not imply that there is no controller that will satisfy both nominal performance and robust stability, except for SISO systems. This contradiction happens because the bounds

[Figure 5.5: Conflict Requirements. The performance bound σ̄(Ws) and the robustness bound 1/σ̄(Wt) are plotted versus log ω; in an intermediate frequency range the two loop-gain boundaries overlap, so both cannot be cleared.]

do not utilize the structure of the weights Ws and Wt, and the bounds are only sufficient conditions for robust stability and nominal performance. This possibility can be further illustrated by the following example. Assume that a two-input two-output system transfer matrix is given by

P(s) = 1/(s + 1) [ 1  α ; 0  1 ],

and suppose the weighting matrices are given by

Ws = [ 1/(s + 1)   α/((s + 1)(s + 2)) ; 0   1/(s + 1) ],   Wt = [ (s + 2)/(s + 10)   −α(s + 1)/(s + 10) ; 0   (s + 2)/(s + 10) ].


It is easy to show that for large α the weighting functions are as shown in Figure 5.5, and thus the above loop shaping technique cannot be applied. However, it is also easy to show that the system with controller

K = I2

gives

WsS = [ 1/(s + 2)   0 ; 0   1/(s + 2) ],   WtT = [ 1/(s + 10)   0 ; 0   1/(s + 10) ]

and, therefore, the robust performance criterion is satisfied.
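The closed-loop expressions for this example can be verified numerically; the sketch below (with α = 100 as an arbitrary "large α" and an arbitrary test frequency) forms S = (I + P)⁻¹ and T = I − S for K = I2 and checks that WsS and WtT are the diagonal transfer matrices claimed above:

```python
import numpy as np

alpha = 100.0       # an arbitrary "large alpha"
s = 2.0j            # an arbitrary test frequency s = jw

P = (1 / (s + 1)) * np.array([[1, alpha], [0, 1]])
Ws = np.array([[1 / (s + 1), alpha / ((s + 1) * (s + 2))],
               [0, 1 / (s + 1)]])
Wt = np.array([[(s + 2) / (s + 10), -alpha * (s + 1) / (s + 10)],
               [0, (s + 2) / (s + 10)]])

I = np.eye(2)
S = np.linalg.inv(I + P)   # sensitivity with K = I2, so L = P
T = I - S                  # complementary sensitivity

# Ws*S = I/(s+2) and Wt*T = I/(s+10): the weighted criteria hold despite large alpha.
assert np.allclose(Ws @ S, I / (s + 2))
assert np.allclose(Wt @ T, I / (s + 10))
```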

• Even if all of the above problems can be avoided, it may still be difficult to find a matrix function Lo such that K = P⁻¹Lo is stabilizing. This becomes much harder if P is non-minimum phase and/or unstable.

Hence some new methodologies have to be introduced to solve complicated problems. The so-called LQG/LTR (Linear Quadratic Gaussian / Loop Transfer Recovery) procedure, developed first by Doyle and Stein [1981] and extended later by various authors, can solve some of the problems, but it is essentially limited to nominally minimum phase systems with output multiplicative uncertainty. For these reasons, it will not be introduced here. This motivates us to consider the closed-loop performance directly in terms of the closed-loop transfer functions instead of the open-loop transfer functions. The following section considers some simple closed-loop performance problem formulations.

5.7 Weighted H2 and H∞ Performance

In this section, we consider how to formulate some performance objectives as mathematically tractable problems. As shown in section 5.5, the performance objectives of a feedback system can usually be specified in terms of requirements on the sensitivity functions and/or complementary sensitivity functions, or in terms of some other closed-loop transfer functions. For instance, the performance criteria for a scalar system may be specified as requiring

|s(jω)| ≤ α < 1  ∀ω ≤ ω0,
|s(jω)| ≤ β  ∀ω > ω0

where s(jω) = 1/(1 + p(jω)k(jω)) and β > 1. However, it is much more convenient to reflect the system performance objectives by choosing appropriate weighting functions. For example, the above performance objective can be written as

|ws(jω)s(jω)| ≤ 1,  ∀ω

with

|ws(jω)| = α⁻¹ for ω ≤ ω0,  β⁻¹ for ω > ω0.


[Figure 5.6: Standard Feedback Configuration with Weights. The loop of Figure 5.3 augmented with weights: Wd and Wi shape the disturbances d and di, Wn shapes the sensor noise n, We weights the error output e, Wu weights the control signal u, and a dashed precompensator Wr acts on the reference r.]

In order to use ws in control design, a rational transfer function ws(s) is usually used to approximate the above frequency response.
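One common first-order rational approximation (an illustrative choice, not prescribed by the text) is ws(s) = (s/β + ω0/α)/(s + ω0), which has magnitude approximately 1/α for ω ≪ ω0 and 1/β for ω ≫ ω0:

```python
import numpy as np

alpha, beta, w0 = 0.1, 2.0, 1.0   # illustrative levels: |s| <= 0.1 at low freq, <= 2 at high freq

def ws(s):
    # First-order weight: ws(0) = 1/alpha, ws(inf) = 1/beta.
    return (s / beta + w0 / alpha) / (s + w0)

assert np.isclose(abs(ws(0.0)), 1 / alpha)                  # low-frequency level 1/alpha
assert np.isclose(abs(ws(1j * 1e6)), 1 / beta, rtol=1e-3)   # high-frequency level 1/beta
assert abs(ws(1j * w0)) < 1 / alpha                         # magnitude rolls off near w0
```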

The advantage of using weighted performance specifications is obvious in multivariable system design. First of all, some components of a vector signal are usually more important than others. Secondly, each component of the signal may not be measured in the same metric; for example, some components of the output error signal may be measured in terms of length, and the others may be measured in terms of voltage. Therefore, weighting functions are essential to make these components comparable. Also, we might be primarily interested in rejecting errors in a certain frequency range (for example, low frequencies); hence some frequency dependent weights must be chosen.

In general, we shall modify the standard feedback diagram in Figure 5.3 into Figure 5.6. The weighting functions in Figure 5.6 are chosen to reflect the design objectives and knowledge of the disturbances and sensor noise. For example, Wd and Wi may be chosen to reflect the frequency contents of the disturbances d and di, or they may be used to model the disturbance power spectrum, depending on the nature of the signals involved in the practical system. The weighting matrix Wn is used to model the frequency contents of the sensor noise, while We may be used to reflect the requirements on the shape of certain closed-loop transfer functions, for example, the shape of the output sensitivity function. Similarly, Wu may be used to reflect some restrictions on the control or actuator signals, and the dashed precompensator Wr is an optional element used to achieve deliberate command shaping or to represent a non-unity feedback system in equivalent unity feedback form.

It is, in fact, essential that some appropriate weighting matrices be used in order to utilize the optimal control theory discussed in this book, i.e., H2 and H∞ theory. So a very important step in the controller design process is to choose the appropriate weights We, Wd, Wu, and possibly Wn, Wi, and Wr. The appropriate choice of weights for a particular practical problem is not trivial. On many occasions, as in the scalar case, the weights are chosen purely as design parameters without any physical bases, so these weights may be treated as tuning parameters which are chosen by the designer to achieve the best compromise between the conflicting objectives. The selection of the weighting matrices should be guided by the expected system inputs and the relative importance of the outputs.

Hence, control design may be regarded as a process of choosing a controller K such that certain weighted signals are made small in some sense. There are many different ways to define the smallness of a signal or transfer matrix, as we have discussed in the last chapter. Different definitions lead to different control synthesis methods, and some are much harder than others. A control engineer should weigh the mathematical complexity against the engineering requirements.

Below, we introduce two classes of performance formulations: H2 and H∞ criteria. For simplicity of presentation, we shall assume di = 0 and n = 0.

H2 Performance

Assume, for example, that the disturbance d̃ can be approximately modeled as an impulse with random input direction, i.e.,

d̃(t) = ηδ(t)

and

E(ηη*) = I

where E denotes the expectation. We may choose to minimize the expected energy of the error e due to the disturbance d̃:

E{‖e‖₂²} = E{ ∫₀^∞ ‖e‖² dt } = ‖WeSoWd‖₂².

Alternatively, if we suppose that the disturbance d̃ can be approximately modeled as white noise with spectral density Sd̃d̃ = I, then

See = (WeSoWd)Sd̃d̃(WeSoWd)*,

and we may choose to minimize the power of e:

‖e‖P² = (1/2π) ∫_{−∞}^{∞} Trace[See(jω)] dω = ‖WeSoWd‖₂².
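For a state-space realization G(s) = C(sI − A)⁻¹B, the H2 norm can be computed from the controllability Gramian: ‖G‖₂² = CPC*, where AP + PA* + BB* = 0. A minimal sketch for the scalar stand-in G(s) = 1/(s + 1) (not the actual WeSoWd; it has ‖G‖₂² = 1/2), cross-checked against the frequency-domain power integral:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Scalar stand-in G(s) = 1/(s+1): A = -1, B = 1, C = 1.
A = np.array([[-1.0]]); B = np.array([[1.0]]); C = np.array([[1.0]])

# Controllability Gramian: A P + P A^T = -B B^T.
P = solve_continuous_lyapunov(A, -B @ B.T)
h2_sq = float(C @ P @ C.T)                     # ||G||_2^2 = C P C^T

# Cross-check: (1/2 pi) * integral of |G(jw)|^2 over the real frequency axis.
w = np.linspace(-200, 200, 400001)
dw = w[1] - w[0]
integral = np.sum(np.abs(1 / (1j * w + 1)) ** 2) * dw / (2 * np.pi)

assert np.isclose(h2_sq, 0.5)
assert np.isclose(integral, 0.5, atol=1e-2)
```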

In general, a controller minimizing only the above criterion can lead to a very large control signal u that could cause saturation of the actuators as well as many other undesirable problems. Hence, for a realistic controller design, it is necessary to include the control signal u in the penalty function. Thus, our design criterion would usually be something like

E{ ‖e‖₂² + ρ²‖ũ‖₂² } = ‖ [ WeSoWd ; ρWuKSoWd ] ‖₂²


with some appropriate choice of weighting matrix Wu and scalar ρ. The parameter ρ clearly defines the tradeoff we discussed earlier between good disturbance rejection at the output and control effort (or disturbance and sensor noise rejection at the actuators). Note that ρ can be set to ρ = 1 by an appropriate choice of Wu. This problem can be viewed as minimizing the energy consumed by the system in order to reject the disturbance d.

This type of problem was the dominant paradigm in the 1960's and 1970's and is usually referred to as Linear Quadratic Gaussian control, or simply LQG. (Such problems will also be referred to as H2 mixed sensitivity problems for consistency with the H∞ problems discussed next.) The development of this paradigm stimulated extensive research efforts and is responsible for important technological innovation, particularly in the area of estimation. The theoretical contributions include a deeper understanding of linear systems and improved computational methods for complex systems through state-space techniques. The major limitation of this theory is the lack of formal treatment of uncertainty in the plant itself. By allowing only additive noise for uncertainty, the stochastic theory ignored this important practical issue. Plant uncertainty is particularly critical in feedback systems.

H∞ Performance

Although the H2 norm (or L2 norm) may be a meaningful performance measure, and although LQG theory can give efficient design compromises under certain disturbance and plant assumptions, the H2 norm suffers a major deficiency. This deficiency is due to the fact that the tradeoff between disturbance error reduction and sensor noise error reduction is not the only constraint on feedback design. The problem is that these performance tradeoffs are often overshadowed by a second limitation on high loop gains, namely the requirement for tolerance to uncertainties. Though a controller may be designed using FDLTI models, the design must be implemented and operated with a real physical plant. The properties of physical systems, in particular the ways in which they deviate from finite-dimensional linear models, put strict limitations on the frequency range over which the loop gains may be large.

A solution to this problem would be to put explicit constraints on the loop gain in the penalty function. For instance, one may choose to minimize

sup_{‖d̃‖₂ ≤ 1} ‖e‖₂ = ‖WeSoWd‖∞,

subject to some restrictions on the control energy or control bandwidth:

sup_{‖d̃‖₂ ≤ 1} ‖ũ‖₂ = ‖WuKSoWd‖∞.

Or, more frequently, one may introduce a parameter ρ and a mixed criterion

sup_{‖d̃‖₂ ≤ 1} { ‖e‖₂² + ρ²‖ũ‖₂² } = ‖ [ WeSoWd ; ρWuKSoWd ] ‖∞².


This problem can also be regarded as minimizing the maximum power of the error subject to all bounded power disturbances: let

ẽ := [ e ; ũ ];

then

Sẽẽ = [ WeSoWd ; WuKSoWd ] Sd̃d̃ [ WeSoWd ; WuKSoWd ]*

and

sup_{‖d̃‖P ≤ 1} ‖ẽ‖P² = sup_{‖d̃‖P ≤ 1} { (1/2π) ∫_{−∞}^{∞} Trace[Sẽẽ(jω)] dω } = ‖ [ WeSoWd ; WuKSoWd ] ‖∞².

Alternatively, if the system robust stability margin is the major concern, the weighted complementary sensitivity has to be limited. Thus the whole cost function may be

‖ [ WeSoWd ; ρW1ToW2 ] ‖∞

where W1 and W2 are frequency dependent uncertainty scaling matrices. These design problems are usually called H∞ mixed sensitivity problems. For a scalar system, an H∞ norm minimization problem can also be viewed as minimizing the maximum magnitude of the system's steady-state response with respect to the worst case sinusoidal inputs.
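For a scalar transfer function the H∞ norm is simply the peak magnitude over frequency, which can be estimated by gridding. A sketch for the illustrative lightly damped system T(s) = 1/(s² + 0.2s + 1), whose analytic peak is 1/(2ζ√(1 − ζ²)) with ζ = 0.1:

```python
import numpy as np

zeta = 0.1
T = lambda s: 1.0 / (s**2 + 2 * zeta * s + 1)   # lightly damped second-order system

w = np.linspace(0.0, 10.0, 200001)              # frequency grid (illustrative)
hinf = np.max(np.abs(T(1j * w)))                # ||T||_inf ~ peak gain over the grid

# Analytic resonant peak of a second-order system: 1 / (2 zeta sqrt(1 - zeta^2)).
peak = 1.0 / (2 * zeta * np.sqrt(1 - zeta**2))
assert np.isclose(hinf, peak, rtol=1e-4)
```

Consistent with the worst-case sinusoid interpretation: a unit-amplitude sinusoid at the resonant frequency is amplified in steady state by about this peak factor, roughly five here.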

5.8 Notes and References

The presentation of this chapter is based primarily on Doyle [1984]. The discussion of internal stability and coprime factorization can also be found in Francis [1987] and Vidyasagar [1985]. The loop shaping design is well known for SISO systems in classical control theory. The idea was extended to MIMO systems by Doyle and Stein [1981] using the LQG design technique. The limitations of the loop shaping design are discussed in detail in Stein and Doyle [1991]. Chapter 18 presents another loop shaping method using H∞ control theory which has the potential to overcome the limitations of the LQG/LTR method.


6 Performance Limitations

This chapter introduces some multivariable versions of Bode's sensitivity integral relations and the Poisson integral formula. The sensitivity integral relations are used to study the design limitations imposed by bandwidth constraints and the open-loop unstable poles, while the Poisson integral formula is used to study the design constraints imposed by the non-minimum phase zeros. These results show that the design limitations in multivariable systems depend on the directionality properties of the sensitivity function as well as those of the poles and zeros, in addition to the dependence upon pole and zero locations which is known in single-input single-output systems. These integral relations are also used to derive lower bounds on the singular values of the sensitivity function which display the design tradeoffs.

6.1 Introduction

One important problem that arises frequently concerns the level of performance that can be achieved in feedback design. It has been shown in the previous chapters that the feedback design goals are inherently conflicting, and a tradeoff must be performed among different design objectives. It is also known that fundamental requirements such as stability and robustness impose inherent limitations upon the feedback properties irrespective of design methods, and the design limitations become more severe in the presence of right-half plane zeros and poles in the open-loop transfer function.

An important tool that can be used to quantify feedback design constraints is furnished by Bode's sensitivity integral relation and the Poisson integral formula. These integral formulae express design constraints directly in terms of the system's sensitivity and complementary sensitivity functions. A well-known theorem due to Bode states that for single-input single-output open-loop stable systems with more than one pole-zero excess, the integral of the logarithmic magnitude of the sensitivity function over all frequencies must equal zero. This integral relation therefore suggests that in the presence of bandwidth constraints, the desirable property of sensitivity reduction in one frequency range must be traded off against the undesirable property of sensitivity increase at other frequencies. A result by Freudenberg and Looze [1985] further extends Bode's theorem to open-loop unstable systems, showing that the presence of open-loop unstable poles makes the sensitivity tradeoff a more difficult task. In the same reference, the limitations imposed by the open-loop non-minimum phase zeros upon the feedback properties were also quantified using the Poisson integral. The results presented here are some multivariable extensions of the above-mentioned integral relations.

We shall now consider a linear time-invariant feedback system with an n × n loop transfer matrix L. Let S(s) and T(s) be the sensitivity function and the complementary sensitivity function, respectively:

S(s) = (I + L(s))⁻¹,  T(s) = L(s)(I + L(s))⁻¹.  (6.1)

Before presenting the multivariable integral relations, recall that a point z ∈ C is a transmission zero of L(s), which has full normal rank and a minimal state space realization (A, B, C, D), if there exist vectors ξ and η such that the relation

[ A − zI  B ; C  D ] [ ξ ; η ] = 0

holds, where η*η = 1, and η is called the input zero direction associated with z. Analogously, a transmission zero z of L(s) satisfies the relation

[ x*  w* ] [ A − zI  B ; C  D ] = 0,

where x and w are some vectors with w satisfying the condition w*w = 1. The vector w is called the output zero direction associated with z. Note also that p ∈ C is a pole of L(s) if and only if it is a zero of L⁻¹(s). By a slight abuse of terminology, we shall call the input and output zero directions of L⁻¹(s) the input and output pole directions of L(s), respectively.
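As a concrete illustration of the definition (an illustrative scalar example, not from the text): L(s) = (s − 1)/(s + 2) has a transmission zero at z = 1; with the minimal realization A = −2, B = 1, C = −3, D = 1, a null vector of the block matrix yields ξ and the input zero direction η:

```python
import numpy as np

A, B, C, D = -2.0, 1.0, -3.0, 1.0   # realization of L(s) = (s-1)/(s+2)
z = 1.0                             # the transmission zero

M = np.array([[A - z, B],
              [C,     D]])
# The block matrix loses rank at a transmission zero.
assert np.isclose(np.linalg.svd(M, compute_uv=False)[-1], 0.0)

# Null vector [xi; eta], scaled so that eta has unit norm.
_, _, Vh = np.linalg.svd(M)
v = Vh[-1].conj()
xi, eta = v[0] / abs(v[1]), v[1] / abs(v[1])
assert np.allclose(M @ np.array([xi, eta]), 0.0)

# Direct check: L(z) * eta = 0, i.e. L vanishes in the direction eta.
L = lambda s: (s - 1) / (s + 2)
assert np.isclose(L(z) * eta, 0.0)
```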

In the sequel we shall preclude the possibility that z is both a zero and a pole of L(s). Then, by Lemmas 3.27 and 3.28, z is a zero of L(s) if and only if L(z)η = 0 for some vector η, η*η = 1, or w*L(z) = 0 for some vector w, w*w = 1. Similarly, p is a pole of L(s) if and only if L⁻¹(p)η = 0 for some vector η, η*η = 1, or w*L⁻¹(p) = 0 for some vector w, w*w = 1.

It is well known that a non-minimum phase transfer function admits a factorization that consists of a minimum phase part and an all-pass factor. Let zi ∈ C+, i = 1, …, k, be the non-minimum phase zeros of L(s) and let ηi, with ηi*ηi = 1, be the input directions generated from the following iterative procedure:


• Let (A, B, C, D) be a minimal realization of L(s) and B(0) := B;

• Repeat for i = 1 to k:

[ A − ziI  B(i−1) ; C  D ] [ ξi ; ηi ] = 0,

B(i) := B(i−1) − 2(Re zi)ξiηi*.

Then the input factorization of L(s) is given by

L(s) = Lm(s)Bk(s) ⋯ B1(s)

where Lm(s) denotes the minimum phase factor of L(s), and Bi(s) corresponds to the all-pass factor associated with zi:

Bi(s) = I − (2Re zi)/(s + z̄i) ηiηi*.  (6.2)

In fact, Lm(s) can be written as

Lm(s) := [ A  B(k) ; C  D ].

For example, suppose z ∈ C is a zero of L(s). Then it is easy to verify using a state space calculation that L(s) can be factorized as

L(s) = [ A  B − 2(Re z)ξη* ; C  D ] ( I − (2Re z)/(s + z̄) ηη* ).

Note that a non-minimum phase transfer function admits an output factorization analogous to the input factorization, and the subsequent results can be applied to both types of factorizations.
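That Bi(s) is indeed all-pass can be checked numerically: B(jω) = I − (2Re z)/(jω + z̄) ηη* is unitary at every frequency (the values of z and η below are arbitrary):

```python
import numpy as np

z = 1.5 + 0.7j                      # arbitrary right-half plane zero
eta = np.array([1.0, 2.0j, -0.5])
eta = eta / np.linalg.norm(eta)     # unit input direction, eta* eta = 1
Eta = np.outer(eta, eta.conj())     # rank-one projector eta eta*

I = np.eye(3)
for w in [0.0, 0.3, 2.0, 50.0]:
    B = I - (2 * z.real / (1j * w + z.conjugate())) * Eta
    # All-pass: B(jw) is unitary, so it preserves norms at every frequency.
    assert np.allclose(B @ B.conj().T, I)
```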

6.2 Integral Relations

In this section we provide extensions of Bode's sensitivity integral relations and the Poisson integral relations to multivariable systems. Consider a unity feedback system with a loop transfer function L. The following assumptions are needed:

Assumption 6.1 The closed-loop system is stable.

Assumption 6.2

lim_{R→∞} sup_{s ∈ C+, |s| ≥ R} R σ̄(L(s)) = 0.


Each of these assumptions has important implications. Assumption 6.1 implies that the sensitivity function S(s) is analytic in C+. Assumption 6.2 states that the open-loop transfer function has a rolloff rate of more than one pole-zero excess. Note that most practical systems require a rolloff rate of more than one pole-zero excess in order to maintain a sufficient stability margin. One instance in which this assumption holds is when each element of L has a rolloff rate of more than one pole-zero excess.

Suppose that the open-loop transfer function L(s) has poles pi in the open right-half plane with input pole directions ηi, i = 1, …, k, which are obtained through an iterative procedure similar to that in the last section.

Lemma 6.1 Let L(s) and S(s) be defined by (6.1). Then p ∈ C is a zero of S(s) with zero direction η if and only if it is a pole of L(s) with pole direction η.

Proof. Let p be a pole of L(s). Then there exists a vector η such that η*η = 1 and L⁻¹(p)η = 0. However, S(s) = (I + L(s))⁻¹ = (I + L⁻¹(s))⁻¹L⁻¹(s). Hence, S(p)η = 0. This establishes the sufficiency part. The proof of necessity follows by reversing the above procedure. □

Then the sensitivity function S(s) can be factorized as

S(s) = Sm(s)B1(s)B2(s) ⋯ Bk(s)  (6.3)

where Sm(s) has no zeros in C+, and Bi(s) is given by

Bi(s) = I − (2Re pi)/(s + p̄i) ηiηi*.

Theorem 6.2 Let S(s) be factorized as in (6.3) and suppose that Assumptions 6.1–6.2 hold. Then

∫₀^∞ ln σ̄(S(jω)) dω ≥ π λmax( Σ_{j=1}^{k} (Re pj)ηjηj* ).  (6.4)

It is also instructive to examine the following two extreme cases.

(i) If ηi*ηj = 0 for all i, j = 1, …, k, i ≠ j, then

∫₀^∞ ln σ̄(S(jω)) dω ≥ π max_i Re pi.  (6.5)

(ii) If ηi = η for all i = 1, …, k, then

∫₀^∞ ln σ̄(S(jω)) dω ≥ π Σ_{i=1}^{k} Re pi.  (6.6)


Note also that the following equality holds:

∫₀^∞ ln|S(jω)| dω = π Σ_{i=1}^{k} Re pi

if S(s) is a scalar function.

Theorem 6.2 has an important implication for the limitations imposed by the open-loop unstable poles on sensitivity properties. It shows that there will exist a frequency range over which the largest singular value of the sensitivity function exceeds one if it is to be kept below one at other frequencies. In the presence of a bandwidth constraint, this imposes a sensitivity tradeoff between different frequency ranges. Furthermore, this result suggests that, unlike in single-input single-output systems, the limitations imposed by open-loop unstable poles in multivariable systems are related not only to the locations, but also to the directions of the poles and their relative interaction. Case (i) of this result corresponds to the situation where the pole directions are mutually orthogonal, for which the integral corresponding to each singular value is related solely to one unstable pole with a corresponding distance to the imaginary axis, as if each channel of the system were decoupled from the others in its sensitivity properties. Case (ii) corresponds to the situation where all the pole directions are parallel, for which the unstable poles affect only the integral corresponding to the largest singular value, as if the channel corresponding to the largest singular value contained all the unstable poles. Clearly, these phenomena are unique to multivariable systems. The following result further strengthens these observations and shows that the interaction between open-loop unstable poles plays an important role in sensitivity properties.
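The scalar equality can be checked numerically. Take the illustrative loop L(s) = 10/((s − 1)(s + 2)), which has one unstable pole at p = 1, a pole-zero excess of two, and a stable closed loop (the zeros of 1 + L solve s² + s + 8 = 0); then ∫₀^∞ ln|S(jω)| dω should equal π·Re(p):

```python
import numpy as np

p = 1.0                                    # the open-loop unstable pole
L = lambda s: 10.0 / ((s - 1) * (s + 2))   # illustrative loop, pole-zero excess of two
S = lambda s: 1.0 / (1.0 + L(s))

# Closed loop is stable: zeros of 1 + L solve s^2 + s + 8 = 0.
assert np.all(np.roots([1, 1, 8]).real < 0)

w = np.geomspace(1e-6, 1e5, 400001)        # log-spaced grid; integrand tail ~ 10/w^2
y = np.log(np.abs(S(1j * w)))
integral = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(w))   # trapezoid rule

# Bode integral with one unstable pole: integral = pi * Re(p).
assert np.isclose(integral, np.pi * p, atol=0.01)
```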

Corollary 6.3 Let k = 2 and let Assumptions 6.1–6.2 hold. Then

∫₀^∞ ln σ̄(S(jω)) dω ≥ (π/2) [ Re(p1 + p2) + √( (Re(p1 − p2))² + 4 Re p1 Re p2 cos²∠(η1, η2) ) ].  (6.7)

Proof. Note that

λmax( (Re p1)η1η1* + (Re p2)η2η2* )
  = λmax( [ η1  η2 ] [ Re p1  0 ; 0  Re p2 ] [ η1  η2 ]* )
  = λmax( [ Re p1  0 ; 0  Re p2 ] [ η1  η2 ]* [ η1  η2 ] )
  = λmax( [ Re p1  0 ; 0  Re p2 ] [ 1  η1*η2 ; η2*η1  1 ] )
  = λmax( [ Re p1  (Re p1)η1*η2 ; (Re p2)η2*η1  Re p2 ] ).

A straightforward calculation then gives

λmax( (Re p1)η1η1* + (Re p2)η2η2* ) = (1/2) Re(p1 + p2) + (1/2) √( (Re(p1 − p2))² + 4 Re p1 Re p2 cos²∠(η1, η2) ).

The proof is now complete. □

The utility of this corollary is clear. It fully characterizes the limitation imposed by a pair of open-loop unstable poles on the sensitivity reduction properties. This limitation depends not only on the relative distances of the poles to the imaginary axis, but also on the principal angle between the two pole directions.
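The eigenvalue formula in the proof is easy to test numerically on randomly generated pole data (η1, η2 below are random complex unit vectors):

```python
import numpy as np

rng = np.random.default_rng(2)
for _ in range(100):
    a, b = rng.uniform(0.1, 5.0, size=2)           # Re p1, Re p2 > 0
    e1 = rng.standard_normal(4) + 1j * rng.standard_normal(4)
    e2 = rng.standard_normal(4) + 1j * rng.standard_normal(4)
    e1 /= np.linalg.norm(e1); e2 /= np.linalg.norm(e2)

    M = a * np.outer(e1, e1.conj()) + b * np.outer(e2, e2.conj())
    lam = np.max(np.linalg.eigvalsh(M))            # lambda_max of a Hermitian matrix

    cos2 = np.abs(e1.conj() @ e2) ** 2             # cos^2 of the principal angle
    formula = 0.5 * (a + b) + 0.5 * np.sqrt((a - b) ** 2 + 4 * a * b * cos2)
    assert np.isclose(lam, formula)
```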

Next, we investigate the design constraints imposed by open-loop non-minimum phase zeros upon sensitivity properties. The results below may be considered to be a matrix extension of the Poisson integral relation.

Theorem 6.4 Let S(s) ∈ H∞ be factorized as in (6.3) and assume that

lim_{R→∞} max_{θ∈[−π/2, π/2]} |ln σ̄(S(Re^{jθ}))| / R = 0.  (6.8)

Then, for any non-minimum phase zero z = x0 + jy0 ∈ C+ of L(s) with output direction w, w*w = 1,

∫_{−∞}^{∞} ln σ̄(S(jω)) x0/(x0² + (ω − y0)²) dω ≥ π ln σ̄(Sm(z)) ≥ π ln σ̄(S(z)),  (6.9)

∫_{−∞}^{∞} ln σ̄(S(jω)) x0/(x0² + (ω − y0)²) dω ≥ π ln ‖ w*Bk⁻¹(z) ⋯ B1⁻¹(z) ‖.  (6.10)

Note that condition (6.8) is satisfied if L(s) is a proper rational transfer matrix.

Furthermore, for single-input single-output systems (n = 1),

‖ w* Π_{i=1}^{k} Bi⁻¹(z) ‖ = Π_{i=1}^{k} | (z + p̄i)/(z − pi) |

and

∫_{−∞}^{∞} ln|S(jω)| x0/(x0² + (ω − y0)²) dω = π ln Π_{i=1}^{k} | (z + p̄i)/(z − pi) |.

The more interesting result, however, is the inequality (6.10). This result again suggests that the multivariable sensitivity properties are closely related to the pole and zero directions. It implies that the sensitivity reduction ability of the system may be severely limited by the open-loop unstable poles and non-minimum phase zeros, especially when these poles and zeros are close to each other and the angles between their directions are small.


6.3 Design Limitations and Sensitivity Bounds

The integral relations derived in the preceding section are now used to analyze the design tradeoffs and the limitations imposed by the bandwidth constraint and right-half plane poles and zeros upon sensitivity reduction properties. Similar to their counterparts for single-input single-output systems, these integral relations show that there will necessarily exist frequencies at which the sensitivity function exceeds one if it is to be kept below one over other frequencies, hence exhibiting a tradeoff between the reduction of the sensitivity over one frequency range against its increase over another frequency range.

Suppose that the feedback system is designed such that the level of sensitivity reduction is given by

σ̄(S(jω)) ≤ ML < 1,  ∀ω ∈ [−ωL, ωL],  (6.11)

where ML > 0 is a given constant. Let z = x0 + jy0 ∈ C+ be an open right-half plane zero of L(s) with output direction w. Define also

θ(z) := ∫_{−ωL}^{ωL} x0/(x0² + (ω − y0)²) dω.
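θ(z) has the closed form θ(z) = arctan((ωL − y0)/x0) + arctan((ωL + y0)/x0), since the antiderivative of the Poisson kernel is arctan((ω − y0)/x0); a direct numerical integration confirms this (the values of x0, y0, ωL below are arbitrary):

```python
import numpy as np

x0, y0, wL = 0.8, 1.5, 3.0     # illustrative zero z = x0 + j*y0 and bandwidth wL

# Closed form from the antiderivative arctan((w - y0)/x0).
theta = np.arctan((wL - y0) / x0) + np.arctan((wL + y0) / x0)

w = np.linspace(-wL, wL, 2000001)
y = x0 / (x0**2 + (w - y0) ** 2)
numeric = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(w))   # trapezoid rule

assert np.isclose(theta, numeric, atol=1e-8)
assert 0 < theta < np.pi       # theta(z) always lies strictly between 0 and pi
```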

The following lower bound on the maximum sensitivity displays a limitation due to the open right-half plane zeros upon the sensitivity reduction properties.

Corollary 6.5 Let the assumption in Theorem 6.4 hold. In addition, suppose that condition (6.11) is satisfied. Then, for each open right-half plane zero z ∈ C+ of L(s) with output direction w,

‖S(s)‖∞ ≥ (1/ML)^{θ(z)/(π−θ(z))} ‖ w*Bk⁻¹(z) ⋯ B1⁻¹(z) ‖^{π/(π−θ(z))},  (6.12)

and

‖S(s)‖∞ ≥ (1/ML)^{θ(z)/(π−θ(z))} ( σ̄(S(z)) )^{π/(π−θ(z))}.  (6.13)

Proof. Note that

∫_{−∞}^{∞} ln σ̄(S(jω)) x0/(x0² + (ω − y0)²) dω ≤ (π − θ(z)) ln ‖S(s)‖∞ + θ(z) ln ML.

Then inequality (6.12) follows by applying inequality (6.10), and inequality (6.13) follows by applying inequality (6.9). □

The interpretation of Corollary 6.5 is similar to that in single-input single-output systems. Roughly stated, this result shows that for a non-minimum phase system, its sensitivity must increase beyond one at certain frequencies if sensitivity reduction is to be achieved at other frequencies. Of particular importance here is that the sensitivity function will in general exhibit a larger peak in multivariable systems than in single-input single-output systems, due to the fact that $\bar\sigma(S(z)) \ge 1$.

The design tradeoffs and limitations on the sensitivity reduction which arise from bandwidth constraints as well as open-loop unstable poles can be studied using the extended Bode integral relations. However, these integral relations by themselves do not mandate a meaningful tradeoff between the sensitivity reduction and the sensitivity increase, since the sensitivity function can be allowed to exceed one by an arbitrarily small amount over an arbitrarily large frequency range so as not to violate the Bode integral relations. However, bandwidth constraints in feedback design typically require that the open-loop transfer function be small above a specified frequency, and that it roll off at a rate of more than one pole-zero excess above that frequency. These constraints are commonly needed to ensure stability robustness despite the presence of modeling uncertainty in the plant model, particularly at high frequencies. One way of quantifying such bandwidth constraints is by requiring the open-loop transfer function to satisfy
$$\bar\sigma(L(j\omega)) \le \frac{M_H}{\omega^{1+k}} \le \epsilon < 1, \quad \forall \omega \in [\omega_H, \infty) \qquad (6.14)$$
where $\omega_H > \omega_L$, and $M_H > 0$, $k > 0$ are some given constants. With the bandwidth constraint given as such, the following result again shows that the sensitivity reduction specified by (6.11) can be achieved only at the expense of increasing the sensitivity at certain frequencies.

Corollary 6.6 Suppose Assumptions 6.1 and 6.2 hold. In addition, suppose that conditions (6.11) and (6.14) are satisfied for some $\omega_H$ and $\omega_L$ such that $\omega_H > \omega_L$. Then
$$\max_{\omega \in [\omega_L, \omega_H]} \bar\sigma(S(j\omega)) \ge e^{-\alpha} \left(\frac{1}{M_L}\right)^{\frac{\omega_L}{\omega_H - \omega_L}} (1 - \epsilon)^{\frac{\omega_H}{k(\omega_H - \omega_L)}} \qquad (6.15)$$
where
$$\alpha = \frac{\pi\, \lambda_{\max}\!\left(\sum_{i=1}^{k} (\operatorname{Re} p_i)\, \eta_i \eta_i^*\right)}{\omega_H - \omega_L}.$$

Proof. Note first that for $\omega \ge \omega_H$,
$$\bar\sigma(S(j\omega)) = \frac{1}{\underline\sigma(I + L(j\omega))} \le \frac{1}{1 - \bar\sigma(L(j\omega))} \le \frac{1}{1 - M_H/\omega^{1+k}}$$
and
$$-\int_{\omega_H}^{\infty} \ln\left(1 - \frac{M_H}{\omega^{1+k}}\right) d\omega = \sum_{i=1}^{\infty} \int_{\omega_H}^{\infty} \frac{1}{i}\left(\frac{M_H}{\omega^{1+k}}\right)^i d\omega = \sum_{i=1}^{\infty} \frac{1}{i}\, \frac{\omega_H}{i(1+k) - 1}\left(\frac{M_H}{\omega_H^{1+k}}\right)^i$$
$$\le \frac{\omega_H}{k} \sum_{i=1}^{\infty} \frac{1}{i}\left(\frac{M_H}{\omega_H^{1+k}}\right)^i = -\frac{\omega_H}{k} \ln\left(1 - \frac{M_H}{\omega_H^{1+k}}\right) \le -\frac{\omega_H}{k} \ln(1 - \epsilon).$$
Then
$$\int_0^{\infty} \ln \bar\sigma(S(j\omega))\, d\omega = \int_0^{\omega_L} \ln \bar\sigma(S(j\omega))\, d\omega + \int_{\omega_L}^{\omega_H} \ln \bar\sigma(S(j\omega))\, d\omega + \int_{\omega_H}^{\infty} \ln \bar\sigma(S(j\omega))\, d\omega$$
$$\le \omega_L \ln M_L + (\omega_H - \omega_L) \max_{\omega \in [\omega_L, \omega_H]} \ln \bar\sigma(S(j\omega)) - \int_{\omega_H}^{\infty} \ln\left(1 - \frac{M_H}{\omega^{1+k}}\right) d\omega$$
$$\le \omega_L \ln M_L + (\omega_H - \omega_L) \max_{\omega \in [\omega_L, \omega_H]} \ln \bar\sigma(S(j\omega)) - \frac{\omega_H}{k} \ln(1 - \epsilon)$$
and the result follows from applying (6.4). $\Box$

The above lower bound shows that the sensitivity increase can be very significant in the transition band.

6.4 Bode's Gain and Phase Relation

In the classical feedback theory, Bode's gain-phase integral relation (see Bode [1945]) has been used as an important tool to express design constraints in scalar systems. The following is an extended version of Bode's gain and phase relationship for an open-loop stable scalar system with possible right-half plane zeros; see Freudenberg and Looze [1988] and Doyle, Francis, and Tannenbaum [1992]:
$$\angle L(j\omega_0) = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{d \ln|L|}{d\nu} \ln \coth\frac{|\nu|}{2}\, d\nu + \angle \prod_{i=1}^{k} \frac{j\omega_0 + z_i}{j\omega_0 - z_i}$$
where the $z_i$'s are assumed to be the right-half plane zeros of $L(s)$ and $\nu := \ln(\omega/\omega_0)$. Note that $\frac{d \ln|L(j\omega)|}{d\nu}$ is the slope of the Bode plot, which is almost always negative. It follows that $\angle L(j\omega_0)$ will be large if the gain $L$ attenuates slowly and small if it attenuates rapidly. The behavior of $\angle L(j\omega)$ is particularly important near the crossover frequency $\omega_c$ where $|L(j\omega_c)| = 1$, since $\pi + \angle L(j\omega_c)$ is the phase margin of the feedback system; furthermore, the return difference is given by
$$|1 + L(j\omega_c)| = |1 + L^{-1}(j\omega_c)| = 2\left|\sin\frac{\pi + \angle L(j\omega_c)}{2}\right|,$$
which must not be too small for good stability robustness. If $\pi + \angle L(j\omega_c)$ is forced to be very small by rapid gain attenuation, the feedback system will amplify disturbances and exhibit little uncertainty tolerance at and near $\omega_c$. Since $\angle \frac{j\omega_0 + z_i}{j\omega_0 - z_i} \le 0$ for each $i$, a non-minimum phase zero contributes an additional phase lag and imposes limitations upon the rolloff rate of the open-loop gain. The conflict between attenuation rate and loop quality near crossover is thus clearly evident. A thorough discussion of the limitations these relations impose upon feedback control design is given by Bode [1945], Horowitz [1963], and Freudenberg and Looze [1988]. See also Freudenberg and Looze [1988] for some multivariable generalizations.

In the classical feedback theory, it has been common to express design goals in terms of the "shape" of the open-loop transfer function. A typical design requires that the open-loop transfer function have a high gain at low frequencies and a low gain at high frequencies, while the transition should be well-behaved. The same conclusion applies to multivariable systems, where the singular value plots should be well-behaved in the transition band.

6.5 Notes and References

The results presented in this chapter are based on Chen [1992a, 1992b, 1995]. Some related results can be found in Boyd and Desoer [1985] and Freudenberg and Looze [1988]. The related results for scalar systems can be found in Bode [1945], Horowitz [1963], Doyle, Francis, and Tannenbaum [1992], and Freudenberg and Looze [1988].

The study of analytic functions, harmonic functions$^1$, and various integral relations in the scalar case can be found in Garnett [1981] and Hoffman [1962].

$^1$A function $f : \mathbb{C} \mapsto \mathbb{R}$ is said to be a harmonic function (subharmonic function) if $\nabla^2 f(s) = 0$ ($\nabla^2 f(s) \ge 0$), where the symbol $\nabla^2 f(s)$ with $s = x + jy$ denotes the Laplacian of $f(s)$ and is defined by
$$\nabla^2 f(s) := \frac{\partial^2 f(s)}{\partial x^2} + \frac{\partial^2 f(s)}{\partial y^2}.$$


7 Model Reduction by Balanced Truncation

In this chapter we consider the problem of reducing the order of a linear multivariable dynamical system. There are many ways to reduce the order of a dynamical system; however, we shall study only two of them: the balanced truncation method and the Hankel norm approximation method. This chapter focuses on the balanced truncation method, while the next chapter studies the Hankel norm approximation method.

A model order reduction problem can in general be stated as follows: given a full order model $G(s)$, find a lower order model, say an $r$-th order model $G_r$, such that $G$ and $G_r$ are close in some sense. Of course, there are many ways to define the closeness of an approximation. For example, one may desire that the reduced model be such that
$$G = G_r + \Delta_a$$
and $\Delta_a$ is small in some norm. This is usually called an additive model reduction problem. On the other hand, one may also desire that the approximation be in the relative form
$$G_r = G(I + \Delta_{rel})$$
so that $\Delta_{rel}$ is small in some norm. This is called a relative model reduction problem. We shall only be interested in the $\mathcal{L}_\infty$ norm approximation in this book. Once the norm is chosen, the additive model reduction problem can be formulated as
$$\inf_{\deg(G_r) \le r} \|G - G_r\|_\infty$$
and the relative model reduction problem can be formulated as
$$\inf_{\deg(G_r) \le r} \left\|G^{-1}(G - G_r)\right\|_\infty$$
if $G$ is invertible. In general, a practical model reduction problem is inherently frequency weighted, i.e., the requirement on the approximation accuracy at one frequency range can be drastically different from the requirement at another frequency range. These problems can in general be formulated as frequency weighted model reduction problems:
$$\inf_{\deg(G_r) \le r} \|W_o(G - G_r)W_i\|_\infty$$
with an appropriate choice of $W_i$ and $W_o$. We shall see in this chapter how the balanced realization can give an effective approach to the above model reduction problems.

7.1 Model Reduction by Balanced Truncation

Consider a stable system $G \in \mathcal{RH}_\infty$ and suppose $G = \left[\begin{array}{c|c} A & B \\ \hline C & D \end{array}\right]$ is a balanced realization, i.e., its controllability and observability Gramians are equal and diagonal. Denote the balanced Gramians by $\Sigma$; then
$$A\Sigma + \Sigma A^* + BB^* = 0 \qquad (7.1)$$
$$A^*\Sigma + \Sigma A + C^*C = 0. \qquad (7.2)$$
Now partition the balanced Gramian as $\Sigma = \begin{bmatrix} \Sigma_1 & 0 \\ 0 & \Sigma_2 \end{bmatrix}$ and partition the system accordingly as
$$G = \left[\begin{array}{cc|c} A_{11} & A_{12} & B_1 \\ A_{21} & A_{22} & B_2 \\ \hline C_1 & C_2 & D \end{array}\right].$$
Then (7.1) and (7.2) can be written in terms of their partitioned matrices as
$$A_{11}\Sigma_1 + \Sigma_1 A_{11}^* + B_1 B_1^* = 0 \qquad (7.3)$$
$$\Sigma_1 A_{11} + A_{11}^* \Sigma_1 + C_1^* C_1 = 0 \qquad (7.4)$$
$$A_{21}\Sigma_1 + \Sigma_2 A_{12}^* + B_2 B_1^* = 0 \qquad (7.5)$$
$$\Sigma_2 A_{21} + A_{12}^* \Sigma_1 + C_2^* C_1 = 0 \qquad (7.6)$$
$$A_{22}\Sigma_2 + \Sigma_2 A_{22}^* + B_2 B_2^* = 0 \qquad (7.7)$$
$$\Sigma_2 A_{22} + A_{22}^* \Sigma_2 + C_2^* C_2 = 0. \qquad (7.8)$$
The following theorem characterizes the properties of these subsystems.

The following theorem characterizes the properties of these subsystems.

Theorem 7.1 Assume that �1 and �2 have no diagonal entries in common. Then bothsubsystems (Aii; Bi; Ci); i = 1; 2 are asymptotically stable.


Proof. It is clearly sufficient to show that $A_{11}$ is asymptotically stable. The proof for the stability of $A_{22}$ is similar.

By Lemma 3.20 or Lemma 3.21, $\Sigma_1$ can be assumed to be positive definite without loss of generality. Then it is obvious that $\operatorname{Re}\lambda_i(A_{11}) \le 0$ by Lemma 3.19. Assume that $A_{11}$ is not asymptotically stable; then there exists an eigenvalue at $j\omega$ for some $\omega$. Let $V$ be a basis matrix for $\operatorname{Ker}(A_{11} - j\omega I)$. Then we have
$$(A_{11} - j\omega I)V = 0, \qquad (7.9)$$
which gives
$$V^*(A_{11}^* + j\omega I) = 0.$$
Equations (7.3) and (7.4) can be rewritten as
$$(A_{11} - j\omega I)\Sigma_1 + \Sigma_1 (A_{11}^* + j\omega I) + B_1 B_1^* = 0 \qquad (7.10)$$
$$\Sigma_1 (A_{11} - j\omega I) + (A_{11}^* + j\omega I)\Sigma_1 + C_1^* C_1 = 0. \qquad (7.11)$$
Multiplication of (7.11) from the right by $V$ and from the left by $V^*$ gives $V^* C_1^* C_1 V = 0$, which is equivalent to
$$C_1 V = 0.$$
Multiplication of (7.11) from the right by $V$ now gives
$$(A_{11}^* + j\omega I)\Sigma_1 V = 0.$$
Analogously, first multiply (7.10) from the right by $\Sigma_1 V$ and from the left by $V^* \Sigma_1$ to obtain
$$B_1^* \Sigma_1 V = 0.$$
Then multiply (7.10) from the right by $\Sigma_1 V$ to get
$$(A_{11} - j\omega I)\Sigma_1^2 V = 0.$$
It follows that the columns of $\Sigma_1^2 V$ are in $\operatorname{Ker}(A_{11} - j\omega I)$. Therefore, there exists a matrix $\bar\Sigma_1$ such that
$$\Sigma_1^2 V = V \bar\Sigma_1^2.$$
Since $\bar\Sigma_1^2$ is the restriction of $\Sigma_1^2$ to the space spanned by $V$, it follows that it is possible to choose $V$ such that $\bar\Sigma_1^2$ is diagonal. It is then also possible to choose $\bar\Sigma_1$ diagonal and such that the diagonal entries of $\bar\Sigma_1$ are a subset of the diagonal entries of $\Sigma_1$.

Multiply (7.5) from the right by $\Sigma_1 V$ and (7.6) by $V$ to get
$$A_{21}\Sigma_1^2 V + \Sigma_2 A_{12}^* \Sigma_1 V = 0$$
$$\Sigma_2 A_{21} V + A_{12}^* \Sigma_1 V = 0,$$
which gives
$$(A_{21}V)\bar\Sigma_1^2 = \Sigma_2^2 (A_{21}V).$$
This is a Sylvester equation in $(A_{21}V)$. Because $\bar\Sigma_1^2$ and $\Sigma_2^2$ have no diagonal entries in common, it follows from Lemma 2.7 that
$$A_{21}V = 0 \qquad (7.12)$$
is the unique solution. Now (7.12) and (7.9) imply that
$$\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \begin{bmatrix} V \\ 0 \end{bmatrix} = j\omega \begin{bmatrix} V \\ 0 \end{bmatrix},$$
which means that the $A$-matrix of the original system has an eigenvalue at $j\omega$. This contradicts the fact that the original system is asymptotically stable. Therefore $A_{11}$ must be asymptotically stable. $\Box$

Corollary 7.2 If $\Sigma$ has distinct singular values, then every subsystem is asymptotically stable.

The stability condition in Theorem 7.1 is only sufficient. For example,
$$\frac{(s-1)(s-2)}{(s+1)(s+2)} = \left[\begin{array}{cc|c} -2 & -2.8284 & -2 \\ 0 & -1 & -1.4142 \\ \hline 2 & 1.4142 & 1 \end{array}\right]$$
is a balanced realization with $\Sigma = I$, and every subsystem of the realization is stable. On the other hand,
$$\frac{s^2 - s + 2}{s^2 + s + 2} = \left[\begin{array}{cc|c} -1 & 1.4142 & 1.4142 \\ -1.4142 & 0 & 0 \\ \hline -1.4142 & 0 & 1 \end{array}\right]$$
is also a balanced realization with $\Sigma = I$, but one of the subsystems is not stable.
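Both example realizations are easy to verify numerically: with $\Sigma = I$ the two Lyapunov equations collapse to $A + A^* + BB^* = 0$ and $A + A^* + C^*C = 0$ (the entries $\pm 1.4142$ and $-2.8284$ stand for $\pm\sqrt{2}$ and $-2\sqrt{2}$). A quick check:

```python
import numpy as np

def is_balanced_with_identity_gramians(A, B, C):
    # Sigma = I collapses (7.1)-(7.2) to A + A^T + B B^T = 0 and A + A^T + C^T C = 0
    return (np.allclose(A + A.T + B @ B.T, 0) and
            np.allclose(A + A.T + C.T @ C, 0))

r2 = np.sqrt(2.0)
# (s-1)(s-2)/((s+1)(s+2)): both 1x1 subsystems of A are stable
A1 = np.array([[-2.0, -2 * r2], [0.0, -1.0]])
B1 = np.array([[-2.0], [-r2]])
C1 = np.array([[2.0, r2]])
assert is_balanced_with_identity_gramians(A1, B1, C1)
assert A1[0, 0] < 0 and A1[1, 1] < 0

# (s^2 - s + 2)/(s^2 + s + 2): the trailing subsystem A22 = 0 is not asymptotically stable
A2 = np.array([[-1.0, r2], [-r2, 0.0]])
B2 = np.array([[r2], [0.0]])
C2 = np.array([[-r2, 0.0]])
assert is_balanced_with_identity_gramians(A2, B2, C2)
assert A2[1, 1] == 0.0
```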

Theorem 7.3 Suppose $G(s) \in \mathcal{RH}_\infty$ and
$$G(s) = \left[\begin{array}{cc|c} A_{11} & A_{12} & B_1 \\ A_{21} & A_{22} & B_2 \\ \hline C_1 & C_2 & D \end{array}\right]$$
is a balanced realization with Gramian $\Sigma = \mathrm{diag}(\Sigma_1, \Sigma_2)$,
$$\Sigma_1 = \mathrm{diag}(\sigma_1 I_{s_1}, \sigma_2 I_{s_2}, \ldots, \sigma_r I_{s_r})$$
$$\Sigma_2 = \mathrm{diag}(\sigma_{r+1} I_{s_{r+1}}, \sigma_{r+2} I_{s_{r+2}}, \ldots, \sigma_N I_{s_N})$$
and
$$\sigma_1 > \sigma_2 > \cdots > \sigma_r > \sigma_{r+1} > \sigma_{r+2} > \cdots > \sigma_N,$$
where $\sigma_i$ has multiplicity $s_i$, $i = 1, 2, \ldots, N$, and $s_1 + s_2 + \cdots + s_N = n$. Then the truncated system
$$G_r(s) = \left[\begin{array}{c|c} A_{11} & B_1 \\ \hline C_1 & D \end{array}\right]$$
is balanced and asymptotically stable. Furthermore,
$$\|G(s) - G_r(s)\|_\infty \le 2(\sigma_{r+1} + \sigma_{r+2} + \cdots + \sigma_N)$$
and the bound is achieved if $r = N - 1$, i.e.,
$$\|G(s) - G_{N-1}(s)\|_\infty = 2\sigma_N.$$

Proof. The stability of $G_r$ follows from Theorem 7.1. We shall now prove the error bound for the case $s_i = 1$ for all $i$. The case where the multiplicity of $\sigma_i$ is not equal to one is more complicated, and an alternative proof is given in the next chapter. Hence, we assume $s_i = 1$ and $N = n$.

Let
$$\Phi(s) := (sI - A_{11})^{-1}$$
$$\Psi(s) := sI - A_{22} - A_{21}\Phi(s)A_{12}$$
$$\tilde B(s) := A_{21}\Phi(s)B_1 + B_2$$
$$\tilde C(s) := C_1\Phi(s)A_{12} + C_2;$$
then, using the partitioned matrix results of Section 2.3,
$$G(s) - G_r(s) = C(sI - A)^{-1}B - C_1\Phi(s)B_1 = \begin{bmatrix} C_1 & C_2 \end{bmatrix} \begin{bmatrix} sI - A_{11} & -A_{12} \\ -A_{21} & sI - A_{22} \end{bmatrix}^{-1} \begin{bmatrix} B_1 \\ B_2 \end{bmatrix} - C_1\Phi(s)B_1 = \tilde C(s)\Psi^{-1}(s)\tilde B(s).$$
Computing this quantity on the imaginary axis gives
$$\bar\sigma[G(j\omega) - G_r(j\omega)] = \lambda_{\max}^{1/2}\left[\Psi^{-1}(j\omega)\tilde B(j\omega)\tilde B^*(j\omega)\Psi^{-*}(j\omega)\tilde C^*(j\omega)\tilde C(j\omega)\right]. \qquad (7.13)$$
Expressions for $\tilde B(j\omega)\tilde B^*(j\omega)$ and $\tilde C^*(j\omega)\tilde C(j\omega)$ are obtained by using the partitioned form of the internally balanced Gramian equations (7.3)-(7.8).

An expression for $\tilde B(j\omega)\tilde B^*(j\omega)$ is obtained by using the definition of $\tilde B(s)$ and substituting for $B_1B_1^*$, $B_1B_2^*$, and $B_2B_2^*$ from the partitioned form of the Gramian equations (7.3)-(7.5); we get
$$\tilde B(j\omega)\tilde B^*(j\omega) = \Psi(j\omega)\Sigma_2 + \Sigma_2\Psi^*(j\omega).$$
The expression for $\tilde C^*(j\omega)\tilde C(j\omega)$ is obtained analogously and is given by
$$\tilde C^*(j\omega)\tilde C(j\omega) = \Sigma_2\Psi(j\omega) + \Psi^*(j\omega)\Sigma_2.$$
These expressions for $\tilde B(j\omega)\tilde B^*(j\omega)$ and $\tilde C^*(j\omega)\tilde C(j\omega)$ are then substituted into (7.13) to obtain
$$\bar\sigma[G(j\omega) - G_r(j\omega)] = \lambda_{\max}^{1/2}\left[\left(\Sigma_2 + \Psi^{-1}(j\omega)\Sigma_2\Psi^*(j\omega)\right)\left(\Sigma_2 + \Psi^{-*}(j\omega)\Sigma_2\Psi(j\omega)\right)\right].$$
Now consider a one-step order reduction, i.e., $r = n - 1$; then $\Sigma_2 = \sigma_n$ and
$$\bar\sigma[G(j\omega) - G_r(j\omega)] = \sigma_n\,\lambda_{\max}^{1/2}\left\{\left[1 + \theta^{-1}(j\omega)\right]\left[1 + \theta(j\omega)\right]\right\} \qquad (7.14)$$
where $\theta := \Psi^{-*}(j\omega)\Psi(j\omega)$ is an "all-pass" scalar function. (This is the only place we need the assumption $s_i = 1$.) Hence $|\theta(j\omega)| = 1$.

Using the triangle inequality, we get
$$\bar\sigma[G(j\omega) - G_r(j\omega)] \le \sigma_n[1 + |\theta(j\omega)|] = 2\sigma_n. \qquad (7.15)$$
This completes the bound for $r = n - 1$.

The remainder of the proof is achieved by using the one-step order reduction result and by noting that $G_k(s) = \left[\begin{array}{c|c} A_{11} & B_1 \\ \hline C_1 & D \end{array}\right]$, obtained by the $k$-th order partitioning, is internally balanced with balanced Gramian given by
$$\Sigma_1 = \mathrm{diag}(\sigma_1 I_{s_1}, \sigma_2 I_{s_2}, \ldots, \sigma_k I_{s_k}).$$
Let $E_k(s) = G_{k+1}(s) - G_k(s)$ for $k = 1, 2, \ldots, N - 1$, and let $G_N(s) = G(s)$. Then
$$\bar\sigma[E_k(j\omega)] \le 2\sigma_{k+1},$$
since $G_k(s)$ is a reduced order model obtained from the internally balanced realization of $G_{k+1}(s)$, and the bound (7.15) for one-step order reduction holds. Noting that
$$G(s) - G_r(s) = \sum_{k=r}^{N-1} E_k(s)$$
by the definition of $E_k(s)$, we have
$$\bar\sigma[G(j\omega) - G_r(j\omega)] \le \sum_{k=r}^{N-1} \bar\sigma[E_k(j\omega)] \le 2\sum_{k=r}^{N-1} \sigma_{k+1}.$$
This is the desired upper bound.

To see that the bound is actually achieved when $r = N - 1$, we note that $\theta(0) = 1$. Then the right hand side of (7.14) is $2\sigma_N$ at $\omega = 0$. $\Box$
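Theorem 7.3 is easy to exercise numerically. The sketch below balances a randomly generated stable SISO system (the Lyapunov equations are solved via the Kronecker product identity of Section 2.5, and the balancing transformation uses a standard Cholesky/SVD square root construction, one of several equivalent choices) and checks the truncation error against the bound $2\sum_{i>r}\sigma_i$ on a frequency grid:

```python
import numpy as np

def lyap(A, W):
    # Solve A X + X A^T + W = 0 (real data) using vec(AX + XA^T) = (I kron A + A kron I) vec(X)
    n = A.shape[0]
    K = np.kron(np.eye(n), A) + np.kron(A, np.eye(n))
    return np.linalg.solve(K, -W.reshape(-1, order="F")).reshape((n, n), order="F")

rng = np.random.default_rng(1)
n, r = 6, 2
A = rng.standard_normal((n, n))
A -= (np.max(np.linalg.eigvals(A).real) + 1.0) * np.eye(n)   # shift spectrum into Re s < 0
B = rng.standard_normal((n, 1))
C = rng.standard_normal((1, n))

P = lyap(A, B @ B.T)                  # controllability Gramian, cf. (7.1)
Q = lyap(A.T, C.T @ C)                # observability Gramian, cf. (7.2)
R = np.linalg.cholesky(P).T           # P = R^T R
U, s2, _ = np.linalg.svd(R @ Q @ R.T)
sigma = np.sqrt(s2)                   # Hankel singular values, decreasing
T = np.diag(sigma**0.5) @ U.T @ np.linalg.inv(R.T)
Tinv = R.T @ U @ np.diag(sigma**-0.5)
Ab, Bb, Cb = T @ A @ Tinv, T @ B, C @ Tinv
assert np.allclose(T @ P @ T.T, np.diag(sigma), atol=1e-8)   # balanced Gramian

G = lambda s: (C @ np.linalg.solve(s * np.eye(n) - A, B))[0, 0]
Gr = lambda s: (Cb[:, :r] @ np.linalg.solve(s * np.eye(r) - Ab[:r, :r], Bb[:r]))[0, 0]
err = max(abs(G(1j * w) - Gr(1j * w)) for w in np.logspace(-3, 3, 500))
assert err <= 2.0 * sigma[r:].sum() + 1e-8                   # Theorem 7.3 bound
```

The grid maximum underestimates the true $\mathcal{H}_\infty$ norm, so the inequality must hold whenever the theorem does.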

The singular values $\sigma_i$ are called the Hankel singular values. A useful consequence of the above theorem is the following corollary.


Corollary 7.4 Let $\sigma_i$, $i = 1, \ldots, N$, be the Hankel singular values of $G(s) \in \mathcal{RH}_\infty$. Then
$$\|G(s) - G(\infty)\|_\infty \le 2(\sigma_1 + \cdots + \sigma_N).$$

The above bound can be tight for some systems. For example, consider an $n$-th order transfer function
$$G(s) = \sum_{j=1}^{n} \frac{\alpha^{2j}}{s + \alpha^{2j}}$$
with $\alpha > 0$. Then $\|G(s)\|_\infty = G(0) = n$, and $G(s)$ has the following state space realization:
$$G = \left[\begin{array}{cccc|c} -\alpha^2 & & & & \alpha \\ & -\alpha^4 & & & \alpha^2 \\ & & \ddots & & \vdots \\ & & & -\alpha^{2n} & \alpha^n \\ \hline \alpha & \alpha^2 & \cdots & \alpha^n & 0 \end{array}\right]$$
and the controllability and observability Gramians of the realization are given by
$$P = Q = \left[\frac{\alpha^{i+j}}{\alpha^{2i} + \alpha^{2j}}\right]$$
with $P = Q \to \frac{1}{2}I_n$ as $\alpha \to \infty$. So the Hankel singular values $\sigma_j \to \frac{1}{2}$ and $2(\sigma_1 + \sigma_2 + \cdots + \sigma_n) \to n = \|G(s)\|_\infty$ as $\alpha \to \infty$.
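Both the closed-form Gramian and the limit $P = Q \to \frac{1}{2}I_n$ can be checked directly; a small numeric sketch:

```python
import numpy as np

def gramian(alpha, n):
    A = np.diag([-alpha**(2 * j) for j in range(1, n + 1)])
    b = np.array([alpha**j for j in range(1, n + 1)])
    # Closed-form solution of A P + P A^T + b b^T = 0 claimed in the text
    P = np.array([[alpha**(i + j) / (alpha**(2 * i) + alpha**(2 * j))
                   for j in range(1, n + 1)] for i in range(1, n + 1)])
    assert np.allclose(A @ P + P @ A.T, -np.outer(b, b))
    return P

n = 4
for alpha in (2.0, 10.0, 1e6):
    P = gramian(alpha, n)
# For large alpha: P ~ I/2, so sigma_j ~ 1/2 and 2 * sum(sigma_j) ~ n = ||G||_inf
assert np.allclose(P, np.eye(n) / 2, atol=1e-5)
hsv = np.linalg.eigvalsh(P)   # P = Q, so the Hankel singular values are the eigenvalues of P
assert abs(2.0 * hsv.sum() - n) < 1e-4
```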

The model reduction bound can also be loose for systems with Hankel singular values close to each other. For example, consider the balanced realization of a fourth order system
$$G(s) = \left[\begin{array}{cccc|c} -19.9579 & -5.4682 & 9.6954 & 0.9160 & -6.3180 \\ 5.4682 & 0 & 0 & 0.2378 & 0.0020 \\ -9.6954 & 0 & 0 & -4.0051 & -0.0067 \\ 0.9160 & -0.2378 & 4.0051 & -0.0420 & 0.2893 \\ \hline -6.3180 & -0.0020 & 0.0067 & 0.2893 & 0 \end{array}\right]$$
with Hankel singular values given by
$$\sigma_1 = 1, \quad \sigma_2 = 0.9977, \quad \sigma_3 = 0.9957, \quad \sigma_4 = 0.9952.$$
The approximation errors and the estimated bounds are listed in the following table. The table shows that the actual error for an $r$-th order approximation is almost the same as $2\sigma_{r+1}$, which would be the estimated bound if we regarded $\sigma_{r+1} = \sigma_{r+2} = \cdots = \sigma_4$. In general, it is not hard to construct an $n$-th order system such that the $r$-th order balanced model reduction error is approximately $2\sigma_{r+1}$ but the error bound is arbitrarily close to $2(n - r)\sigma_{r+1}$. One method to construct such a system is as follows: let $G(s)$ be a stable all-pass function, i.e., $G^\sim(s)G(s) = I$; then there is a balanced realization for $G$ such that the controllability and observability Gramians are $P = Q = I$. Next, make a very small perturbation to the balanced realization; the perturbed system then has a balanced realization with distinct singular values and $P = Q \approx I$. This perturbed system will have the desired properties, and this is exactly how the above example was constructed.

r                                          0        1        2        3
$\|G - G_r\|_\infty$                       2        1.996    1.991    1.9904
Bounds: $2\sum_{i=r+1}^{4}\sigma_i$        7.9772   5.9772   3.9818   1.9904
$2\sigma_{r+1}$                            2        1.9954   1.9914   1.9904
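The bound rows of the table are simple functions of the listed singular values; a quick arithmetic check:

```python
sigma = [1.0, 0.9977, 0.9957, 0.9952]   # Hankel singular values from the example
# Row "Bounds": 2 * (sigma_{r+1} + ... + sigma_4) for r = 0, 1, 2, 3
bounds = [round(2.0 * sum(sigma[r:]), 4) for r in range(4)]
assert bounds == [7.9772, 5.9772, 3.9818, 1.9904]
# Row "2 sigma_{r+1}"
two_sigma = [round(2.0 * s, 4) for s in sigma]
assert two_sigma == [2.0, 1.9954, 1.9914, 1.9904]
```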

7.2 Frequency-Weighted Balanced Model Reduction

This section considers the extension of the balanced truncation method to the frequency weighted case. Given the original full order model $G \in \mathcal{RH}_\infty$, the input weighting matrix $W_i \in \mathcal{RH}_\infty$, and the output weighting matrix $W_o \in \mathcal{RH}_\infty$, our objective is to find a lower order model $G_r$ such that
$$\|W_o(G - G_r)W_i\|_\infty$$
is made as small as possible. Assume that $G$, $W_i$, and $W_o$ have the following state space realizations:
$$G = \left[\begin{array}{c|c} A & B \\ \hline C & 0 \end{array}\right], \quad W_i = \left[\begin{array}{c|c} A_i & B_i \\ \hline C_i & D_i \end{array}\right], \quad W_o = \left[\begin{array}{c|c} A_o & B_o \\ \hline C_o & D_o \end{array}\right]$$
with $A \in \mathbb{R}^{n \times n}$. Note that there is no loss of generality in assuming $D = G(\infty) = 0$, since otherwise it can be eliminated by replacing $G_r$ with $D + G_r$.

Now the state space realization for the weighted transfer matrix is given by
$$W_o G W_i = \left[\begin{array}{ccc|c} A & 0 & BC_i & BD_i \\ B_oC & A_o & 0 & 0 \\ 0 & 0 & A_i & B_i \\ \hline D_oC & C_o & 0 & 0 \end{array}\right] =: \left[\begin{array}{c|c} \bar A & \bar B \\ \hline \bar C & 0 \end{array}\right].$$
Let $\bar P$ and $\bar Q$ be the solutions to the following Lyapunov equations:
$$\bar A \bar P + \bar P \bar A^* + \bar B \bar B^* = 0 \qquad (7.16)$$
$$\bar Q \bar A + \bar A^* \bar Q + \bar C^* \bar C = 0. \qquad (7.17)$$
Then the input weighted Gramian $P$ and the output weighted Gramian $Q$ are defined by
$$P := \begin{bmatrix} I_n & 0 \end{bmatrix} \bar P \begin{bmatrix} I_n \\ 0 \end{bmatrix}, \qquad Q := \begin{bmatrix} I_n & 0 \end{bmatrix} \bar Q \begin{bmatrix} I_n \\ 0 \end{bmatrix}.$$


It can be shown easily that $P$ and $Q$ satisfy the following lower order equations:
$$\begin{bmatrix} A & BC_i \\ 0 & A_i \end{bmatrix} \begin{bmatrix} P & P_{12} \\ P_{12}^* & P_{22} \end{bmatrix} + \begin{bmatrix} P & P_{12} \\ P_{12}^* & P_{22} \end{bmatrix} \begin{bmatrix} A & BC_i \\ 0 & A_i \end{bmatrix}^* + \begin{bmatrix} BD_i \\ B_i \end{bmatrix} \begin{bmatrix} BD_i \\ B_i \end{bmatrix}^* = 0 \qquad (7.18)$$
$$\begin{bmatrix} Q & Q_{12} \\ Q_{12}^* & Q_{22} \end{bmatrix} \begin{bmatrix} A & 0 \\ B_oC & A_o \end{bmatrix} + \begin{bmatrix} A & 0 \\ B_oC & A_o \end{bmatrix}^* \begin{bmatrix} Q & Q_{12} \\ Q_{12}^* & Q_{22} \end{bmatrix} + \begin{bmatrix} C^*D_o^* \\ C_o^* \end{bmatrix} \begin{bmatrix} C^*D_o^* \\ C_o^* \end{bmatrix}^* = 0. \qquad (7.19)$$
The computation can be further reduced if $W_i = I$ or $W_o = I$. In the case of $W_i = I$, $P$ can be obtained from
$$PA^* + AP + BB^* = 0, \qquad (7.20)$$
while in the case of $W_o = I$, $Q$ can be obtained from
$$QA + A^*Q + C^*C = 0. \qquad (7.21)$$
Now let $T$ be a nonsingular matrix such that
$$TPT^* = (T^{-1})^* Q T^{-1} = \begin{bmatrix} \Sigma_1 & \\ & \Sigma_2 \end{bmatrix}$$
(i.e., balanced) with $\Sigma_1 = \mathrm{diag}(\sigma_1 I_{s_1}, \ldots, \sigma_r I_{s_r})$ and $\Sigma_2 = \mathrm{diag}(\sigma_{r+1} I_{s_{r+1}}, \ldots, \sigma_n I_{s_n})$, and partition the system accordingly as
$$\left[\begin{array}{c|c} TAT^{-1} & TB \\ \hline CT^{-1} & 0 \end{array}\right] = \left[\begin{array}{cc|c} A_{11} & A_{12} & B_1 \\ A_{21} & A_{22} & B_2 \\ \hline C_1 & C_2 & 0 \end{array}\right].$$
Then a reduced order model $G_r$ is obtained as
$$G_r = \left[\begin{array}{c|c} A_{11} & B_1 \\ \hline C_1 & 0 \end{array}\right].$$

Unfortunately, there is generally no known a priori error bound for the approximation error, and the reduced order model $G_r$ is not guaranteed to be stable either.

7.3 Relative and Multiplicative Model Reductions

A very special frequency weighted model reduction problem is the relative error model reduction problem, where the objective is to find a reduced order model $G_r$ such that
$$G_r = G(I + \Delta_{rel})$$
and $\|\Delta_{rel}\|_\infty$ is made as small as possible; $\Delta_{rel}$ is usually called the relative error. In the case where $G$ is square and invertible, this problem can be simply formulated as
$$\min_{\deg G_r \le r} \left\|G^{-1}(G - G_r)\right\|_\infty.$$
Of course, the dual approximation problem
$$G_r = (I + \Delta_{rel})G$$
can be obtained by taking the transpose of $G$. We will show below that, as a bonus, the approximation $G_r$ obtained below also serves as a multiplicative approximation:
$$G = G_r(I + \Delta_{mul}),$$
where $\Delta_{mul}$ is usually called the multiplicative error.

Theorem 7.5 Let $G, G^{-1} \in \mathcal{RH}_\infty$ be an $n$-th order square transfer matrix with a state space realization
$$G(s) = \left[\begin{array}{c|c} A & B \\ \hline C & D \end{array}\right].$$
Let $W_i = I$ and $W_o = G^{-1}(s) = \left[\begin{array}{c|c} A - BD^{-1}C & -BD^{-1} \\ \hline D^{-1}C & D^{-1} \end{array}\right]$.

(a) The weighted Gramians $P$ and $Q$ for the frequency weighted balanced realization of $G$ can be obtained from
$$PA^* + AP + BB^* = 0 \qquad (7.22)$$
$$Q(A - BD^{-1}C) + (A - BD^{-1}C)^*Q + C^*(D^{-1})^*D^{-1}C = 0. \qquad (7.23)$$

(b) Suppose the realization for $G$ is weighted balanced, i.e.,
$$P = Q = \mathrm{diag}(\sigma_1 I_{s_1}, \ldots, \sigma_r I_{s_r}, \sigma_{r+1} I_{s_{r+1}}, \ldots, \sigma_N I_{s_N}) = \mathrm{diag}(\Sigma_1, \Sigma_2)$$
with $\sigma_1 > \sigma_2 > \cdots > \sigma_N \ge 0$, and let the realization of $G$ be partitioned compatibly with $\Sigma_1$ and $\Sigma_2$ as
$$G(s) = \left[\begin{array}{cc|c} A_{11} & A_{12} & B_1 \\ A_{21} & A_{22} & B_2 \\ \hline C_1 & C_2 & D \end{array}\right].$$
Then
$$G_r(s) = \left[\begin{array}{c|c} A_{11} & B_1 \\ \hline C_1 & D \end{array}\right]$$
is stable and minimum phase. Furthermore,
$$\|\Delta_{rel}\|_\infty \le \prod_{i=r+1}^{N} \left(1 + 2\sigma_i\left(\sqrt{1 + \sigma_i^2} + \sigma_i\right)\right) - 1$$
$$\|\Delta_{mul}\|_\infty \le \prod_{i=r+1}^{N} \left(1 + 2\sigma_i\left(\sqrt{1 + \sigma_i^2} + \sigma_i\right)\right) - 1.$$


Proof. Since the input weighting matrix $W_i = I$, it is obvious that the input weighted Gramian is given by $P$. Now the output weighted transfer matrix is given by
$$G^{-1}(G - D) = \left[\begin{array}{cc|c} A & 0 & B \\ -BD^{-1}C & A - BD^{-1}C & 0 \\ \hline D^{-1}C & D^{-1}C & 0 \end{array}\right] =: \left[\begin{array}{c|c} \bar A & \bar B \\ \hline \bar C & 0 \end{array}\right].$$
It is easy to verify that
$$\bar Q := \begin{bmatrix} Q & Q \\ Q & Q \end{bmatrix}$$
satisfies the following Lyapunov equation:
$$\bar Q \bar A + \bar A^* \bar Q + \bar C^* \bar C = 0.$$
Hence $Q$ is the output weighted Gramian.

The proof for part (b) is more involved and needs much more work. We refer readers to the references at the end of the chapter for details. $\Box$
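The claim that $\bar Q = \begin{bmatrix} Q & Q \\ Q & Q \end{bmatrix}$ solves the augmented Lyapunov equation can be confirmed numerically; a sketch with an ad hoc two-state SISO system (the values are chosen, not from the text, so that both $G$ and $G^{-1}$ are stable, with $D = 1$):

```python
import numpy as np

def lyap(A, W):
    # Solve A X + X A^T + W = 0 via Kronecker products (Section 2.5)
    n = A.shape[0]
    K = np.kron(np.eye(n), A) + np.kron(A, np.eye(n))
    return np.linalg.solve(K, -W.reshape(-1, order="F")).reshape((n, n), order="F")

A = np.array([[-1.0, 0.2], [0.0, -2.0]])
B = np.array([[1.0], [1.0]])
C = np.array([[0.3, 0.4]])
M = A - B @ C                      # A - B D^{-1} C with D = 1
assert np.max(np.linalg.eigvals(M).real) < 0   # G^{-1} is stable
Q = lyap(M.T, C.T @ C)             # output weighted Gramian, equation (7.23)

# Augmented realization of G^{-1}(G - D) from the proof
Abar = np.block([[A, np.zeros((2, 2))], [-B @ C, M]])
Cbar = np.hstack([C, C])           # D^{-1} C in both blocks
Qbar = np.block([[Q, Q], [Q, Q]])
assert np.allclose(Qbar @ Abar + Abar.T @ Qbar + Cbar.T @ Cbar, 0, atol=1e-10)
```

Each block of the residual reduces to $Q(A - BD^{-1}C) + (A - BD^{-1}C)^*Q + C^*C$, which is exactly (7.23).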

In the above theorem we have assumed that the system is square; we shall now extend the results to the non-square case. Let $G(s)$ be a minimum phase transfer matrix with a minimal realization
$$G(s) = \left[\begin{array}{c|c} A & B \\ \hline C & D \end{array}\right]$$
and assume that $D$ has full row rank. Without loss of generality, we shall also assume that $D$ is normalized such that $DD^* = I$. Let $D_\perp$ be a matrix with full row rank such that $\begin{bmatrix} D \\ D_\perp \end{bmatrix}$ is square and unitary.

Lemma 7.6 A complex number $z \in \mathbb{C}$ is a zero of $G(s)$ if and only if $z$ is an uncontrollable mode of $(A - BD^*C, BD_\perp^*)$.

Proof. Since $D$ has full row rank and $G(s) = \left[\begin{array}{c|c} A & B \\ \hline C & D \end{array}\right]$ is a minimal realization, $z$ is a transmission zero of $G(s)$ if and only if
$$\begin{bmatrix} A - zI & B \\ C & D \end{bmatrix}$$
does not have full row rank. Now note that
$$\begin{bmatrix} A - zI & B \\ C & D \end{bmatrix} \begin{bmatrix} I & 0 \\ 0 & [D^*,\ D_\perp^*] \end{bmatrix} \begin{bmatrix} I & 0 & 0 \\ -C & I & 0 \\ 0 & 0 & I \end{bmatrix} = \begin{bmatrix} A - BD^*C - zI & BD^* & BD_\perp^* \\ 0 & I & 0 \end{bmatrix}.$$
Then it is clear that $\begin{bmatrix} A - zI & B \\ C & D \end{bmatrix}$ does not have full row rank if and only if
$$\begin{bmatrix} A - BD^*C - zI & BD_\perp^* \end{bmatrix}$$
does not have full row rank. By the PBH test, this implies that $z$ is a zero of $G(s)$ if and only if it is an uncontrollable mode of $(A - BD^*C, BD_\perp^*)$. $\Box$

Corollary 7.7 There exists a matrix $\tilde C$ such that the augmented system
$$G_a := \left[\begin{array}{c|c} A & B \\ \hline C_a & D_a \end{array}\right] = \left[\begin{array}{c|c} A & B \\ \hline C & D \\ \tilde C & D_\perp \end{array}\right] = \begin{bmatrix} G(s) \\ \tilde G(s) \end{bmatrix}$$
is minimum phase.

Proof. Note that the zeros of $G_a$ are given by the eigenvalues of
$$A - B \begin{bmatrix} D \\ D_\perp \end{bmatrix}^{-1} \begin{bmatrix} C \\ \tilde C \end{bmatrix} = A - BD^*C - BD_\perp^*\tilde C.$$
Hence $\tilde C$ can always be chosen so that $A - BD^*C - BD_\perp^*\tilde C$ is stable. $\Box$

If the previous model reduction algorithms are applied to the augmented system $G_a$, the corresponding $P$ and $Q$ equations are given by
$$PA^* + AP + BB^* = 0$$
$$Q(A - BD_a^{-1}C_a) + (A - BD_a^{-1}C_a)^*Q + C_a^*(D_a^{-1})^*D_a^{-1}C_a = 0.$$
Moreover, denoting the corresponding reduced order models by $\hat G$ and $\hat{\tilde G}$, we have
$$\begin{bmatrix} \hat G(s) \\ \hat{\tilde G}(s) \end{bmatrix} = \begin{bmatrix} G(s) \\ \tilde G(s) \end{bmatrix}(I + \Delta_{rel}), \qquad \begin{bmatrix} G(s) \\ \tilde G(s) \end{bmatrix} = \begin{bmatrix} \hat G(s) \\ \hat{\tilde G}(s) \end{bmatrix}(I + \Delta_{mul})$$
and
$$\hat G(s) = G(s)(I + \Delta_{rel}), \qquad G(s) = \hat G(s)(I + \Delta_{mul}).$$
However, there are in general infinitely many choices of $\tilde C$, and the model reduction results will in general depend on the specific choice. Hence an appropriate choice of $\tilde C$ is important. To motivate our choice, note that the equation for $Q$ can be rewritten as
$$Q(A - BD^*C) + (A - BD^*C)^*Q - QBD_\perp^*D_\perp B^*Q + C^*C + (\tilde C - D_\perp B^*Q)^*(\tilde C - D_\perp B^*Q) = 0.$$


A natural choice might be
$$\tilde C = D_\perp B^*Q.$$
The existence of a solution $Q$ to the following so-called algebraic Riccati equation
$$Q(A - BD^*C) + (A - BD^*C)^*Q - QBD_\perp^*D_\perp B^*Q + C^*C = 0$$
such that
$$A - BD_a^{-1}C_a = A - BD^*C - BD_\perp^*\tilde C = A - BD^*C - BD_\perp^*D_\perp B^*Q$$
is stable will be shown in Chapter 13.

In the case where the model is not stable and/or is not minimum phase, the following procedure can be used: let $G(s)$ be factorized as $G(s) = G_{ap}(s)G_{mp}(s)$ such that $G_{ap}$ is all-pass, i.e., $G_{ap}^\sim G_{ap} = I$, and $G_{mp}$ is stable and minimum phase. Let $\hat G_{mp}$ be a relative/multiplicative reduced order model of $G_{mp}$ such that
$$\hat G_{mp} = G_{mp}(I + \Delta_{rel})$$
and
$$G_{mp} = \hat G_{mp}(I + \Delta_{mul}).$$
Then $\hat G := G_{ap}\hat G_{mp}$ has exactly the same right half plane poles and zeros as $G$, and
$$\hat G = G(I + \Delta_{rel})$$
$$G = \hat G(I + \Delta_{mul}).$$
Unfortunately, this approach may be conservative if the transfer matrix has many non-minimum phase zeros or unstable poles.

An alternative relative/multiplicative model reduction approach, which does not require that the transfer matrix be minimum phase but does require solving an algebraic Riccati equation, is the so-called Balanced Stochastic Truncation (BST) method. Let $G(s) \in \mathcal{RH}_\infty$ be a square transfer matrix with a state space realization
$$G(s) = \left[\begin{array}{c|c} A & B \\ \hline C & D \end{array}\right]$$
and $\det(D) \ne 0$. Let $W(s) \in \mathcal{RH}_\infty$ be a minimum phase left spectral factor of $G(s)G^\sim(s)$, i.e.,
$$W^\sim(s)W(s) = G(s)G^\sim(s).$$
Then $W(s)$ can be obtained as
$$W(s) = \left[\begin{array}{c|c} A & B_W \\ \hline C_W & D^* \end{array}\right]$$


with
$$B_W = PC^* + BD^*$$
$$C_W = D^{-1}(C - B_W^*X),$$
where $P$ is the controllability Gramian given by
$$AP + PA^* + BB^* = 0 \qquad (7.24)$$
and $X$ is the solution of a Riccati equation
$$XA_W + A_W^*X + XB_W(DD^*)^{-1}B_W^*X + C^*(DD^*)^{-1}C = 0 \qquad (7.25)$$
with $A_W := A - B_W(DD^*)^{-1}C$ such that $A_W + B_W(DD^*)^{-1}B_W^*X$ is stable. The realization $G$ is said to be a balanced stochastic realization if
$$P = X = \mathrm{diag}(\sigma_1 I_{s_1}, \sigma_2 I_{s_2}, \ldots, \sigma_n I_{s_n})$$
with $\sigma_1 > \sigma_2 > \cdots > \sigma_n \ge 0$; $\sigma_i$ is in fact the $i$-th Hankel singular value of the so-called "phase matrix" $(W^\sim(s))^{-1}G(s)$.

Theorem 7.8 Let $G(s) \in \mathcal{RH}_\infty$ have the following balanced stochastic realization
$$G(s) = \left[\begin{array}{c|c} A & B \\ \hline C & D \end{array}\right] = \left[\begin{array}{cc|c} A_{11} & A_{12} & B_1 \\ A_{21} & A_{22} & B_2 \\ \hline C_1 & C_2 & D \end{array}\right]$$
with $\det(D) \ne 0$ and $P = X = \mathrm{diag}(M_1, M_2)$, where
$$M_1 = \mathrm{diag}(\sigma_1 I_{s_1}, \ldots, \sigma_r I_{s_r}), \qquad M_2 = \mathrm{diag}(\sigma_{r+1} I_{s_{r+1}}, \ldots, \sigma_n I_{s_n}).$$
Then
$$\hat G = \left[\begin{array}{c|c} A_{11} & B_1 \\ \hline C_1 & D \end{array}\right]$$
is stable, and
$$\left\|G^{-1}(G - \hat G)\right\|_\infty \le \prod_{i=r+1}^{n} \frac{1 + \sigma_i}{1 - \sigma_i} - 1$$
$$\left\|\hat G^{-1}(G - \hat G)\right\|_\infty \le \prod_{i=r+1}^{n} \frac{1 + \sigma_i}{1 - \sigma_i} - 1.$$


It can be shown that the balanced stochastic realization and the self-weighted balanced realization in Theorem 7.5 are the same if $G(s)$ is minimum phase, with the singular values of the two characterizations related by $\sigma_i = \tau_i/\sqrt{1 + \tau_i^2}$, where $\tau_i$ denotes the corresponding weighted Hankel singular value of Theorem 7.5.

To illustrate, consider a third order stable and minimum phase transfer function
$$G(s) = \frac{s^3 + 2s^2 + 3s + 4}{s^3 + 3s^2 + 4s + 4}.$$
It is easy to show that the Hankel singular values of the phase function are given by
$$\sigma_1 = 0.55705372196966, \quad \sigma_2 = 0.53088390857035, \quad \sigma_3 = 0.03715882438832,$$
and a first order BST approximation is given by
$$\hat G = \frac{s + 0.00375717470515}{s + 0.00106883052786}.$$
The relative approximation error is $2.51522024045904$, and the error bound is
$$\prod_{i=2}^{3} \frac{1 + \sigma_i}{1 - \sigma_i} - 1 = 2.51522024046226.$$

7.4 Notes and References

The balanced model reduction method was �rst introduced by Moore [1981]. The stabil-ity properties of the reduced order model were shown by Pernebo and Silverman [1982].The error bound for the balanced model reduction was shown by Enns [1984] and Glover[1984] subsequently gave an independent proof. The frequency weighted balanced modelreduction method was also introduced by Enns [1984] from a somewhat di�erent per-spective. The error bounds for the relative and multiplicative approximations using theself-weighted balanced realization were shown by Zhou [1993]. The Balanced StochasticTruncation (BST) method was proposed by Desai and Pal [1984] and generalized byGreen [1988a,1988b] and many other people. The relative error bound for the BalancedStochastic Truncation method was obtained by Green [1988a] and the multiplicativeerror bound for the BST was obtained by Wang and Safonov [1992]. It was also shownin Zhou [1993] that the frequency self-weighted method and the BST method are thesame if the transfer matrix is minimum phase. Improved error bounds for the BSTreduction were reported in Wang and Safonov [1990,1992] where it was claimed thatthe following error bounds hold

G�1(G� G) 1� 2

nXi=r+1

�i1� �i

G�1(G� G) 1� 2

nXi=r+1

�i1� �i

:


However, the example in the last section gives
$$2\sum_{i=2}^{3} \frac{\sigma_i}{1 - \sigma_i} = 2.34052280324021,$$
which is smaller than the actual error. Other weighted model reduction methods can be found in Al-Saggaf and Franklin [1988], Glover [1986, 1989], Glover, Limebeer and Hung [1992], Hung and Glover [1986], and references therein. Discrete time balanced model reduction results can be found in Al-Saggaf and Franklin [1987], Hinrichsen and Pritchard [1990], and references therein.
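Evaluating the claimed improved bound on the Section 7.3 example confirms the inconsistency:

```python
sigma2, sigma3 = 0.53088390857035, 0.03715882438832
claimed = 2.0 * (sigma2 / (1 - sigma2) + sigma3 / (1 - sigma3))
assert abs(claimed - 2.34052280324021) < 1e-6
assert claimed < 2.51522024045904   # below the actual relative error, so the claim fails
```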


8 Hankel Norm Approximation

This chapter is devoted to the study of optimal Hankel norm approximation and its applications in $\mathcal{L}_\infty$ norm model reduction. The Hankel operator is introduced first, together with some time domain and frequency domain characterizations. The optimal Hankel norm approximation problem can be stated as follows: given $G(s)$ of McMillan degree $n$, find $\hat G(s)$ of McMillan degree $k < n$ such that $\left\|G(s) - \hat G(s)\right\|_H$ is minimized. The solution to this approximation problem relies on the all-pass dilation result for a square transfer function, which will be given for a general class of transfer functions. The all-pass dilation results are then specialized to obtain the optimal Hankel norm approximations, which gives
$$\inf \left\|G(s) - \hat G(s)\right\|_H = \sigma_{k+1},$$
where $\sigma_1 > \sigma_2 > \cdots > \sigma_{k+1} > \cdots > \sigma_n$ are the Hankel singular values of $G(s)$. Moreover, we show that a square stable transfer function $G(s)$ can be represented as
$$G(s) = D_0 + \sigma_1 E_1(s) + \sigma_2 E_2(s) + \cdots + \sigma_n E_n(s),$$
where the $E_k(s)$ are all-pass functions and the partial sums $D_0 + \sigma_1 E_1(s) + \sigma_2 E_2(s) + \cdots + \sigma_k E_k(s)$ have McMillan degree $k$. This representation is obtained by reducing the order one dimension at a time via optimal Hankel norm approximations. This representation also gives
$$\|G\|_\infty \le 2(\sigma_1 + \cdots + \sigma_n)$$
and, further, that there exists a constant $D_0$ such that
$$\|G(s) - D_0\|_\infty \le (\sigma_1 + \cdots + \sigma_n).$$

The above bounds are then used to show that the $k$-th order optimal Hankel norm approximation $\hat G(s)$, together with some constant matrix $D_0$, satisfies
$$\left\|G(s) - \hat G(s) - D_0\right\|_\infty \le (\sigma_{k+1} + \cdots + \sigma_n).$$
We shall also provide an alternative proof for the error bounds derived in the last chapter for truncated balanced realizations, using the results obtained in this chapter.

Finally, we consider the Hankel operator in discrete time and offer an alternative proof of the well-known Nehari theorem.

8.1 Hankel Operator

Let G(s) 2 L1 be a matrix function. The Hankel operator associated with G will bedenoted by �G and is de�ned as

�G : H?2 7�! H2

�Gf := (P+MG)f = P+(Gf); for f 2 H?2i.e., �G = P+MGjH?2 . This is shown in the following diagram:

���������>

P+

�G

MG

L2

H2H?2?

-

There is a corresponding Hankel operator in the time domain. Let g(t) denote theinverse (bilateral) Laplace transform of G(s). Then the time domain Hankel operatoris

�g : L2(�1; 0] 7�! L2[0;1)

�gf := P+(g � f); for f 2 L2(�1; 0]:

Thus

(�gf)(t) =

� R 0�1 g(t� �)f(�)d�; t � 0;

0; t < 0:

Because of the isometric isomorphism property between the L2 spaces in the time do-main and in the frequency domain, we have

k�gk = k�Gk :

Page 188: Robust and Optimal Control

8.1. Hankel Operator 171

Hence, in this book we will use the time domain and frequency domain descriptions of Hankel operators interchangeably.

For the purposes of this book, we now further restrict $G$ to be rational, i.e., $G(s) \in \mathcal{RL}_\infty$. Then $G$ can be decomposed into a strictly causal part and an anticausal part; i.e., there are $G_s(s) \in \mathcal{RH}_2$ and $G_u(s) \in \mathcal{RH}_2^\perp$ such that
\[ G(s) = G_s(s) + G(\infty) + G_u(s). \]
Now for any $f \in \mathcal{H}_2^\perp$, it is easy to see that
\[ \Gamma_G f = P_+(Gf) = P_+(G_s f). \]
Hence, the Hankel operator associated with $G \in \mathcal{RL}_\infty$ depends only on the strictly causal part of $G$. In particular, if $G$ is antistable, i.e., $G^\sim(s) \in \mathcal{RH}_\infty$, then $\Gamma_G = 0$. Therefore, there is no loss of generality in assuming $G \in \mathcal{RH}_\infty$ and strictly proper.

The adjoint of $\Gamma_G$ can be computed easily from the definition as follows: let $f \in \mathcal{H}_2^\perp$ and $g \in \mathcal{H}_2$; then
\begin{align*}
\langle \Gamma_G f, g \rangle &:= \langle P_+ Gf, g \rangle \\
&= \langle P_+ Gf, g \rangle + \langle P_- Gf, g \rangle && (\text{since } P_-Gf \text{ and } g \text{ are orthogonal}) \\
&= \langle Gf, g \rangle = \langle f, G^* g \rangle \\
&= \langle f, P_+ G^* g \rangle + \langle f, P_- G^* g \rangle \\
&= \langle f, P_- G^* g \rangle && (\text{since } f \text{ and } P_+G^*g \text{ are orthogonal}).
\end{align*}
Hence $\Gamma_G^* g = P_-(G^* g) : \mathcal{H}_2 \longrightarrow \mathcal{H}_2^\perp$, or $\Gamma_G^* = P_- M_{G^*}|_{\mathcal{H}_2}$.

Now suppose $G \in \mathcal{RH}_\infty$ has a state space realization
\[ \dot x = Ax + Bu, \qquad y = Cx \]
with $A$ stable and $x(-\infty) = 0$. Then the Hankel operator $\Gamma_g$ can be written as
\[ (\Gamma_g u)(t) = \int_{-\infty}^{0} C e^{A(t-\tau)} B u(\tau)\, d\tau, \quad \text{for } t \ge 0, \]
and has the interpretation of the system's future output
\[ y(t) = (\Gamma_g u)(t), \quad t \ge 0, \]
based on the past input $u(t)$, $t \le 0$.

[Figure 8.1: System theoretical interpretation of Hankel operators]

In the state space representation, the Hankel operator can be decomposed more specifically as the composition of a map from the past input to the initial state and a map from the initial state to the future output. These two operators will be called the controllability operator, $\Psi_c$, and the observability operator, $\Psi_o$, respectively; they are defined as
\[ \Psi_c : \mathcal{L}_2(-\infty, 0] \longrightarrow \mathbb{C}^n, \qquad \Psi_c u := \int_{-\infty}^{0} e^{-A\tau} B u(\tau)\, d\tau \]
and
\[ \Psi_o : \mathbb{C}^n \longrightarrow \mathcal{L}_2[0, \infty), \qquad (\Psi_o x_0)(t) := C e^{At} x_0, \quad t \ge 0. \]
(If all the data are real, then the two operators become $\Psi_c : \mathcal{L}_2(-\infty, 0] \longrightarrow \mathbb{R}^n$ and $\Psi_o : \mathbb{R}^n \longrightarrow \mathcal{L}_2[0, \infty)$.) Clearly, $x_0 = \Psi_c u$ for $u \in \mathcal{L}_2(-\infty, 0]$ is the system state at $t = 0$ due to the past input, and $y(t) = (\Psi_o x_0)(t)$, $t \ge 0$, is the future output due to the initial state $x_0$ with the input set to zero.

It is easy to verify that
\[ \Gamma_g = \Psi_o \Psi_c, \]
i.e., $\Gamma_g$ maps $\mathcal{L}_2(-\infty, 0]$ into $\mathcal{L}_2[0, \infty)$ through the state space $\mathbb{C}^n$.

The adjoint operators of $\Psi_c$ and $\Psi_o$ can also be obtained easily from their definitions as follows: let $u(t) \in \mathcal{L}_2(-\infty, 0]$, $x_0 \in \mathbb{C}^n$, and $y(t) \in \mathcal{L}_2[0, \infty)$; then
\[ \langle \Psi_c u, x_0 \rangle_{\mathbb{C}^n} = \int_{-\infty}^{0} u^*(\tau) B^* e^{-A^*\tau} x_0\, d\tau = \langle u, B^* e^{-A^*\tau} x_0 \rangle_{\mathcal{L}_2(-\infty, 0]} = \langle u, \Psi_c^* x_0 \rangle_{\mathcal{L}_2(-\infty, 0]} \]


and
\[ \langle \Psi_o x_0, y \rangle_{\mathcal{L}_2[0, \infty)} = \int_{0}^{\infty} x_0^* e^{A^*t} C^* y(t)\, dt = \Big\langle x_0, \int_{0}^{\infty} e^{A^*t} C^* y(t)\, dt \Big\rangle_{\mathbb{C}^n} = \langle x_0, \Psi_o^* y \rangle_{\mathbb{C}^n}, \]
where $\langle \cdot, \cdot \rangle_X$ denotes the inner product in the Hilbert space $X$. Therefore, we have
\[ \Psi_c^* : \mathbb{C}^n \longrightarrow \mathcal{L}_2(-\infty, 0], \qquad (\Psi_c^* x_0)(\tau) = B^* e^{-A^*\tau} x_0, \quad \tau \le 0, \]
and
\[ \Psi_o^* : \mathcal{L}_2[0, \infty) \longrightarrow \mathbb{C}^n, \qquad \Psi_o^* y = \int_{0}^{\infty} e^{A^*t} C^* y(t)\, dt. \]
This also gives the adjoint of $\Gamma_g$:
\[ \Gamma_g^* = (\Psi_o \Psi_c)^* = \Psi_c^* \Psi_o^* : \mathcal{L}_2[0, \infty) \longrightarrow \mathcal{L}_2(-\infty, 0], \qquad (\Gamma_g^* y)(\tau) = \int_{0}^{\infty} B^* e^{A^*(t-\tau)} C^* y(t)\, dt, \quad \tau \le 0. \]
Let $L_c$ and $L_o$ be the controllability and observability Gramians of the system, i.e.,
\[ L_c = \int_{0}^{\infty} e^{At} B B^* e^{A^*t}\, dt, \qquad L_o = \int_{0}^{\infty} e^{A^*t} C^* C e^{At}\, dt. \]
Then we have
\[ \Psi_c \Psi_c^* x_0 = L_c x_0, \qquad \Psi_o^* \Psi_o x_0 = L_o x_0 \]
for every $x_0 \in \mathbb{C}^n$. Thus $L_c$ and $L_o$ are the matrix representations of the operators $\Psi_c \Psi_c^*$ and $\Psi_o^* \Psi_o$.

Theorem 8.1 The operator $\Gamma_g^*\Gamma_g$ (or $\Gamma_G^*\Gamma_G$) and the matrix $L_cL_o$ have the same nonzero eigenvalues. In particular, $\|\Gamma_g\| = \sqrt{\lambda_{\max}(L_cL_o)}$.

Proof. Let $\lambda^2 \neq 0$ be an eigenvalue of $\Gamma_g^*\Gamma_g$, and let $0 \neq u \in \mathcal{L}_2(-\infty, 0]$ be a corresponding eigenvector. Then by definition
\[ \Gamma_g^*\Gamma_g u = \Psi_c^*\Psi_o^*\Psi_o\Psi_c u = \lambda^2 u. \tag{8.1} \]
Premultiply (8.1) by $\Psi_c$ and define $x = \Psi_c u \in \mathbb{C}^n$ to get
\[ L_cL_o x = \lambda^2 x. \tag{8.2} \]
Note that $x = \Psi_c u \neq 0$, since otherwise $\lambda^2 u = 0$ from (8.1), which is impossible. So $\lambda^2$ is an eigenvalue of $L_cL_o$.

On the other hand, suppose $\lambda^2 \neq 0$ and $x \neq 0$ are an eigenvalue and a corresponding eigenvector of $L_cL_o$. Premultiply (8.2) by $\Psi_c^*L_o$ and define $u = \Psi_c^*L_o x$ to get (8.1). It is easy to see that $u \neq 0$, since $\Psi_c u = \Psi_c\Psi_c^*L_o x = L_cL_o x = \lambda^2 x \neq 0$. Therefore $\lambda^2$ is an eigenvalue of $\Gamma_g^*\Gamma_g$.

Finally, since $G(s)$ is rational, $\Gamma_g^*\Gamma_g$ is compact and self-adjoint and has only discrete spectrum. Hence $\|\Gamma_g\|^2 = \|\Gamma_g^*\Gamma_g\| = \lambda_{\max}(L_cL_o)$. $\Box$
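Theorem 8.1 also gives a practical recipe for computing Hankel singular values. The following is a minimal numerical sketch (not from the book; it assumes NumPy and SciPy are available, and the function name is ours): solve the two Lyapunov equations that characterize $L_c$ and $L_o$, then take $\sqrt{\lambda_i(L_cL_o)}$. For the scalar example $G(s) = 1/(s+1)$ one has $L_c = L_o = \tfrac12$, so the Hankel norm is $\tfrac12$.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def hankel_singular_values(A, B, C):
    """Hankel singular values of C(sI - A)^{-1}B, A stable."""
    # Controllability Gramian: A Lc + Lc A* + B B* = 0
    Lc = solve_continuous_lyapunov(A, -B @ B.conj().T)
    # Observability Gramian: A* Lo + Lo A + C* C = 0
    Lo = solve_continuous_lyapunov(A.conj().T, -C.conj().T @ C)
    # Theorem 8.1: nonzero eigenvalues of Gamma* Gamma are those of Lc Lo,
    # so ||Gamma_g|| = sqrt(lambda_max(Lc Lo)).  Clip tiny negative
    # round-off before the square root.
    ev = np.linalg.eigvals(Lc @ Lo).real
    return np.sort(np.sqrt(np.clip(ev, 0.0, None)))[::-1]

# Sanity check on G(s) = 1/(s + 1): Lc = Lo = 1/2, Hankel norm = 1/2.
sv = hankel_singular_values(np.array([[-1.0]]), np.array([[1.0]]), np.array([[1.0]]))
assert abs(sv[0] - 0.5) < 1e-9
```

The same routine applies verbatim to multivariable realizations.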

Remark 8.1 Let $\lambda^2 \neq 0$ be an eigenvalue of $\Gamma_g^*\Gamma_g$ and $0 \neq u \in \mathcal{L}_2(-\infty, 0]$ be a corresponding eigenvector. Define
\[ v := \frac{1}{\lambda}\Gamma_g u \in \mathcal{L}_2[0, \infty). \]
Then $(u, v)$ satisfy
\[ \Gamma_g u = \lambda v, \qquad \Gamma_g^* v = \lambda u. \]
Such a pair of vectors $(u, v)$ is called a Schmidt pair of $\Gamma_g$. The proof given above suggests a way to construct such pairs: find the eigenvalues and eigenvectors of $L_cL_o$, i.e., $\lambda_i^2$ and $x_i$ such that
\[ L_cL_o x_i = \lambda_i^2 x_i. \]
Then the pairs $(u_i, v_i)$ given below are the corresponding Schmidt pairs:
\[ u_i = \Psi_c^*\Big(\frac{1}{\lambda_i}L_o x_i\Big) \in \mathcal{L}_2(-\infty, 0], \qquad v_i = \Psi_o x_i \in \mathcal{L}_2[0, \infty). \]

Remark 8.2 As seen in the literature, there are alternative ways to write a Hankel operator. For comparison, let us examine some of the alternatives:

(i) Let $v(t) = u(-t)$ for $u(t) \in \mathcal{L}_2(-\infty, 0]$; then $v(t) \in \mathcal{L}_2[0, \infty)$. Hence, the Hankel operator can be written as
\[ \Gamma_g : \mathcal{L}_2[0, \infty) \longrightarrow \mathcal{L}_2[0, \infty) \quad \text{or} \quad \Gamma_G : \mathcal{H}_2 \longrightarrow \mathcal{H}_2, \]
\[ (\Gamma_g v)(t) = \begin{cases} \displaystyle\int_{0}^{\infty} g(t+\tau)v(\tau)\,d\tau, & t \ge 0; \\ 0, & t < 0 \end{cases} \;=\; \int_{0}^{\infty} C e^{A(t+\tau)} B v(\tau)\,d\tau, \quad \text{for } t \ge 0. \]

(ii) In some applications, it is more convenient to work with an anticausal $G$ and view the Hankel operator associated with $G$ as the mapping from the future input to the past output. It will be clear in later chapters that this operator is closely related to the problem of approximating an anticausal function by a causal function, which is the problem at the heart of $\mathcal{H}_\infty$ control theory.

Let $G(s) = \left[\begin{array}{c|c} A & B \\ \hline C & 0 \end{array}\right]$ be an antistable transfer matrix, i.e., all the eigenvalues of $A$ have positive real parts. Then the Hankel operator associated with $G(s)$ can be written as
\[ \Gamma_g : \mathcal{L}_2[0, \infty) \longrightarrow \mathcal{L}_2(-\infty, 0], \]
\[ (\Gamma_g v)(t) = \begin{cases} \displaystyle\int_{0}^{\infty} g(t-\tau)v(\tau)\,d\tau, & t \le 0; \\ 0, & t > 0 \end{cases} \;=\; \int_{0}^{\infty} C e^{A(t-\tau)} B v(\tau)\,d\tau, \quad \text{for } t \le 0, \]
or in the frequency domain
\[ \Gamma_G = P_- M_G|_{\mathcal{H}_2} : \mathcal{H}_2 \longrightarrow \mathcal{H}_2^\perp, \qquad \Gamma_G v = P_-(Gv), \quad \text{for } v \in \mathcal{H}_2. \]
Now for any $v \in \mathcal{H}_2$ and $u \in \mathcal{H}_2^\perp$, we have
\[ \langle P_-(Gv), u \rangle = \langle Gv, u \rangle = \langle v, G^*u \rangle = \langle v, P_+(G^*u) \rangle. \]
Hence, $\Gamma_G^* = \Gamma_{G^*}$.

8.2 All-pass Dilations

This section considers the dilation of a given transfer function to an all-pass transfer function. This dilation is the key to the optimal Hankel norm approximation developed in the next section. But first we need some preliminary results and some state space characterizations of all-pass functions.

Definition 8.1 The inertia of a general complex, square matrix $A$, denoted $\mathrm{In}(A)$, is the triple $(\pi(A), \nu(A), \delta(A))$, where

$\pi(A)$ = number of eigenvalues of $A$ in the open right half-plane,

$\nu(A)$ = number of eigenvalues of $A$ in the open left half-plane,

$\delta(A)$ = number of eigenvalues of $A$ on the imaginary axis.


Theorem 8.2 Given complex $n \times n$ and $n \times m$ matrices $A$ and $B$, and a Hermitian matrix $P = P^*$ satisfying
\[ AP + PA^* + BB^* = 0, \tag{8.3} \]
then

(1) if $\delta(P) = 0$, then $\pi(A) \le \nu(P)$ and $\nu(A) \le \pi(P)$;

(2) if $\delta(A) = 0$, then $\pi(P) \le \nu(A)$ and $\nu(P) \le \pi(A)$.

Proof. (1) If $\delta(P) = 0$, observe that (8.3) implies
\[ AP + PA^* \le 0. \]
Now suppose $\pi(A) > \nu(P)$; then there exist an eigenvalue $\lambda \in \mathbb{C}$ of $A$ with $\mathrm{Re}(\lambda) > 0$ and a corresponding left eigenvector $x \in \mathbb{C}^n$ with $x^*A = \lambda x^*$ and $x^*Px > 0$, which implies
\[ x^*(AP + PA^*)x = (\lambda + \bar\lambda)x^*Px \le 0, \]
i.e., $\mathrm{Re}(\lambda) \le 0$, a contradiction. Hence $\pi(A) \le \nu(P)$. Similarly, we can show that $\nu(A) \le \pi(P)$.

(2) Assume $\delta(A) = 0$ and that $P = U\begin{bmatrix} P_1 & 0 \\ 0 & 0 \end{bmatrix}U^*$ with $\delta(P_1) = 0$ and $U^*U = I$, and define
\[ \tilde A = U^*AU = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}, \qquad \tilde B = U^*B = \begin{bmatrix} B_1 \\ B_2 \end{bmatrix}. \]
Then $U^*(8.3)U$ gives
\[ \tilde A\begin{bmatrix} P_1 & 0 \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} P_1 & 0 \\ 0 & 0 \end{bmatrix}\tilde A^* + \tilde B\tilde B^* = 0. \tag{8.4} \]
The $(2,2)$ block of (8.4) gives
\[ B_2B_2^* = 0 \Rightarrow B_2 = 0, \tag{8.5} \]
then (8.4) and (8.5) give
\[ A_{21}P_1 = 0 \Rightarrow A_{21} = 0, \tag{8.6} \]
and the $(1,1)$ block of (8.4) gives
\[ A_{11}P_1 + P_1A_{11}^* + B_1B_1^* = 0, \tag{8.7} \]
so that, by part (1), $\pi(A_{11}) \le \nu(P_1)$ and $\nu(A_{11}) \le \pi(P_1)$. But since $\delta(A_{11}) = \delta(P_1) = 0$, these inequalities hold with equality, and
\[ \pi(P_1) = \nu(A_{11}) \le \nu(A), \qquad \nu(P_1) = \pi(A_{11}) \le \pi(A). \]
$\Box$

Theorem 8.3 Given a realization $(A, B, C)$ (not necessarily stable) with $A \in \mathbb{C}^{n\times n}$, $B \in \mathbb{C}^{n\times m}$, $C \in \mathbb{C}^{m\times n}$, then

(1) If $(A, B, C)$ is completely controllable and completely observable, the following two statements are equivalent:

(a) there exists a $D$ such that $GG^\sim = \sigma^2 I$, where $G(s) = D + C(sI - A)^{-1}B$;

(b) there exist $P, Q \in \mathbb{C}^{n\times n}$ such that

(i) $P = P^*$, $Q = Q^*$;

(ii) $AP + PA^* + BB^* = 0$;

(iii) $A^*Q + QA + C^*C = 0$;

(iv) $PQ = \sigma^2 I$.

(2) Given that part (1b) is satisfied, there exists a $D$ satisfying
\[ D^*D = \sigma^2 I, \qquad D^*C + B^*Q = 0, \qquad DB^* + CP = 0, \]
and any such $D$ will satisfy part (1a) (note: observability and controllability are not assumed).

Proof. Any system satisfying part (1a) or (1b) can be transformed to the case $\sigma = 1$ via $B \to B/\sqrt{\sigma}$, $C \to C/\sqrt{\sigma}$, $D \to D/\sigma$, $P \to P/\sigma$, $Q \to Q/\sigma$. Hence, without loss of generality, the proof will be given for the case $\sigma = 1$ only.

(1a) $\Rightarrow$ (1b): This is proved by constructing $P$ and $Q$ to satisfy (1b), as follows. Given (1a), $G(\infty) = D \Rightarrow DD^* = I$. Also $GG^\sim = I \Rightarrow G^\sim = G^{-1}$, i.e.,
\[ G^{-1}(s) = \left[\begin{array}{c|c} A - BD^{-1}C & -BD^{-1} \\ \hline D^{-1}C & D^{-1} \end{array}\right] = \left[\begin{array}{c|c} A - BD^*C & -BD^* \\ \hline D^*C & D^* \end{array}\right] = G^\sim(s) = \left[\begin{array}{c|c} -A^* & -C^* \\ \hline B^* & D^* \end{array}\right]. \]
These two transfer functions are identical and both minimal (since $(A, B, C)$ is assumed to be minimal), and hence there exists a similarity transformation $T$ relating the state space descriptions, i.e.,
\begin{align}
-A^* &= T(A - BD^*C)T^{-1} \tag{8.8} \\
C^* &= TBD^* \tag{8.9} \\
B^* &= D^*CT^{-1}. \tag{8.10}
\end{align}
Further,
\begin{align}
(8.9) &\Rightarrow B^* = D^*C(T^*)^{-1} \tag{8.11} \\
(8.10) &\Rightarrow C^* = T^*BD^* \tag{8.12} \\
(8.8) &\Rightarrow -A^* = -C^*DB^* + (T^{-1}A^*T)^* \notag\\
&\phantom{\Rightarrow -A^*} = T^*\big(A - (T^*)^{-1}C^*DB^*T^*\big)(T^*)^{-1} \notag\\
\text{(8.9) and (8.10)} &\Rightarrow \phantom{-A^*} = T^*(A - BD^*C)(T^*)^{-1}. \tag{8.13}
\end{align}
Hence, $T$ and $T^*$ satisfy identical equations, (8.8) to (8.10) and (8.11) to (8.13); minimality implies these have a unique solution, and hence $T = T^*$.

Now setting
\begin{align}
Q &= -T \tag{8.14} \\
P &= -T^{-1} \tag{8.15}
\end{align}
clearly satisfies part (1b), equations (i) and (iv). Further, (8.8) and (8.9) imply
\[ TA + A^*T - C^*C = 0, \tag{8.16} \]
which verifies (1b), equation (iii). Also, (8.16) implies
\[ AT^{-1} + T^{-1}A^* - T^{-1}C^*CT^{-1} = 0, \tag{8.17} \]
which together with (8.10) implies part (1b), equation (ii).

(1b) $\Rightarrow$ (1a): This is proved by first constructing $D$ according to part (2) and then verifying part (1a) by calculation. First note that since $Q = P^{-1}$, $Q \cdot \text{((1b), equation (ii))} \cdot Q$ gives
\[ QA + A^*Q + QBB^*Q = 0, \tag{8.18} \]
which together with part (1b), equation (iii) implies that
\[ QBB^*Q = C^*C, \tag{8.19} \]
and hence by Lemma 2.14 there exists a $D$ with $D^*D = I$ such that
\begin{align}
DB^*Q &= -C \tag{8.20} \\
DB^* &= -CQ^{-1} = -CP. \tag{8.21}
\end{align}
Equations (8.20) and (8.21) imply that the conditions of part (2) are satisfied. Now note that
\[ BB^* = (sI - A)P + P(-sI - A^*) \]
\[ \Rightarrow\ C(sI - A)^{-1}BB^*(-sI - A^*)^{-1}C^* = CP(-sI - A^*)^{-1}C^* + C(sI - A)^{-1}PC^* \]
\[ \stackrel{(8.21)}{=} -DB^*(-sI - A^*)^{-1}C^* - C(sI - A)^{-1}BD^*. \]
Hence, on expanding $G(s)G^\sim(s)$, we get
\[ G(s)G^\sim(s) = I. \]

Part (2) follows immediately from the proof of (1b) $\Rightarrow$ (1a) above. $\Box$

The following theorem dilates a given transfer function to an all-pass function.


Theorem 8.4 Let $G(s) = \left[\begin{array}{c|c} A & B \\ \hline C & D \end{array}\right]$ with $A \in \mathbb{C}^{n\times n}$, $B \in \mathbb{C}^{n\times m}$, $C \in \mathbb{C}^{m\times n}$, $D \in \mathbb{C}^{m\times m}$ satisfy
\begin{align}
AP + PA^* + BB^* &= 0 \tag{8.22} \\
A^*Q + QA + C^*C &= 0 \tag{8.23}
\end{align}
for
\begin{align}
P = P^* &= \mathrm{diag}(\Sigma_1, \sigma I_r) \tag{8.24} \\
Q = Q^* &= \mathrm{diag}(\Sigma_2, \sigma I_r) \tag{8.25}
\end{align}
with $\Sigma_1$ and $\Sigma_2$ diagonal, $\sigma \neq 0$, and $\delta(\Sigma_1\Sigma_2 - \sigma^2 I) = 0$.

Partition $(A, B, C)$ conformably with $P$ as
\[ A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}, \qquad B = \begin{bmatrix} B_1 \\ B_2 \end{bmatrix}, \qquad C = \begin{bmatrix} C_1 & C_2 \end{bmatrix}, \tag{8.26} \]
and define $W(s) := \left[\begin{array}{c|c} \hat A & \hat B \\ \hline \hat C & \hat D \end{array}\right]$ with
\begin{align}
\hat A &= \Gamma^{-1}(\sigma^2A_{11}^* + \Sigma_2A_{11}\Sigma_1 - \sigma C_1^*UB_1^*) \tag{8.27} \\
\hat B &= \Gamma^{-1}(\Sigma_2B_1 + \sigma C_1^*U) \tag{8.28} \\
\hat C &= C_1\Sigma_1 + \sigma UB_1^* \tag{8.29} \\
\hat D &= D - \sigma U \tag{8.30}
\end{align}
where $U$ is a unitary matrix satisfying
\[ B_2 = -C_2^*U \tag{8.31} \]
and
\[ \Gamma = \Sigma_1\Sigma_2 - \sigma^2 I. \tag{8.32} \]
Also define the error system
\[ E(s) = G(s) - W(s) = \left[\begin{array}{c|c} A_e & B_e \\ \hline C_e & D_e \end{array}\right] \]
with
\[ A_e = \begin{bmatrix} A & 0 \\ 0 & \hat A \end{bmatrix}, \quad B_e = \begin{bmatrix} B \\ \hat B \end{bmatrix}, \quad C_e = \begin{bmatrix} C & -\hat C \end{bmatrix}, \quad D_e = D - \hat D. \tag{8.33} \]
Then

(1) $(A_e, B_e, C_e)$ satisfy
\begin{align}
A_eP_e + P_eA_e^* + B_eB_e^* &= 0 \tag{8.34} \\
A_e^*Q_e + Q_eA_e + C_e^*C_e &= 0 \tag{8.35}
\end{align}
with
\begin{align}
P_e &= \begin{bmatrix} \Sigma_1 & 0 & I \\ 0 & \sigma I_r & 0 \\ I & 0 & \Sigma_2\Gamma^{-1} \end{bmatrix} \tag{8.36} \\
Q_e &= \begin{bmatrix} \Sigma_2 & 0 & -\Gamma \\ 0 & \sigma I_r & 0 \\ -\Gamma & 0 & \Sigma_1\Gamma \end{bmatrix} \tag{8.37} \\
P_eQ_e &= \sigma^2 I. \tag{8.38}
\end{align}

(2) $E(s)E^\sim(s) = \sigma^2 I$.

(3) If $\delta(A) = 0$, then

(a) $\delta(\hat A) = 0$;

(b) if $\delta(\Sigma_1\Sigma_2) = 0$, then
\[ \mathrm{In}(\hat A) = \mathrm{In}(-\Sigma_1\Gamma) = \mathrm{In}(-\Sigma_2\Gamma); \]

(c) if $P > 0$ and $Q > 0$, then the McMillan degree of the stable part of $(\hat A, \hat B, \hat C)$ equals $\pi(\Sigma_1\Gamma) = \pi(\Sigma_2\Gamma)$;

(d) if either (i) $\Sigma_1\Gamma > 0$ and $\Sigma_2\Gamma > 0$, or (ii) $\Sigma_1\Gamma < 0$ and $\Sigma_2\Gamma < 0$, then $(\hat A, \hat B, \hat C)$ is a minimal realization.

Proof. For notational convenience it will be assumed that $\sigma = 1$; this can be done without loss of generality, since $B$, $C$ and $\Sigma$ can simply be rescaled to give $\sigma = 1$.

It is first necessary to verify that there exists a unitary matrix $U$ satisfying (8.31). The $(2,2)$ blocks of (8.22) and (8.23) give
\begin{align}
A_{22} + A_{22}^* + B_2B_2^* &= 0 \tag{8.39} \\
A_{22}^* + A_{22} + C_2^*C_2 &= 0, \tag{8.40}
\end{align}
hence $B_2B_2^* = C_2^*C_2$, and by Lemma 2.14 there exists a unitary $U$ satisfying (8.31).

(1) The proof of equations (8.34) to (8.38) is by straightforward calculation, as follows. To verify (8.34) and (8.36) we need (8.22), which is assumed, together with
\begin{align}
A_{11} + \hat A^* + B_1\hat B^* &= 0 \tag{8.41} \\
A_{21} + B_2\hat B^* &= 0 \tag{8.42} \\
\hat A\Sigma_2\Gamma^{-1} + \Sigma_2\Gamma^{-1}\hat A^* + \hat B\hat B^* &= 0, \tag{8.43}
\end{align}
which will now be verified:
\begin{align}
B_2\hat B^* &= B_2(B_1^*\Sigma_2 + U^*C_1)\Gamma^{-1} \tag{8.44} \\
\text{(8.31)} \Rightarrow &= (B_2B_1^*\Sigma_2 - C_2^*C_1)\Gamma^{-1} \tag{8.45} \\
&= \big((-A_{21}\Sigma_1 - A_{12}^*)\Sigma_2 + A_{12}^*\Sigma_2 + A_{21}\big)\Gamma^{-1} \tag{8.46} \\
&= -A_{21} \Rightarrow (8.42),
\end{align}
where $B_2B_1^*$ and $C_2^*C_1$ were substituted in (8.45) using the $(2,1)$ blocks of (8.22) and (8.23), respectively. To verify (8.41),
\begin{align}
B_1\hat B^* &= (B_1B_1^*\Sigma_2 + B_1U^*C_1)\Gamma^{-1} \tag{8.47} \\
&= (-A_{11}\Sigma_1\Sigma_2 - \Sigma_1A_{11}^*\Sigma_2 + B_1U^*C_1)\Gamma^{-1} \tag{8.48} \\
&= -A_{11} - \hat A^* \Rightarrow (8.41),
\end{align}
where $B_1B_1^*$ was substituted using the $(1,1)$ block of (8.22), and (8.27) was substituted in (8.48). Finally, to verify (8.43), consider
\begin{align}
\Gamma\hat A\Sigma_2 + \Sigma_2\hat A^*\Gamma &= (A_{11}^* + \Sigma_2A_{11}\Sigma_1 - C_1^*UB_1^*)\Sigma_2 + \Sigma_2(A_{11} + \Sigma_1A_{11}^*\Sigma_2 - B_1U^*C_1) \tag{8.49} \\
&= -(\Sigma_2B_1 + C_1^*U)(B_1^*\Sigma_2 + U^*C_1) + (A_{11}^*\Sigma_2 + \Sigma_2A_{11} + C_1^*C_1) \notag\\
&\qquad + \Sigma_2(A_{11}\Sigma_1 + \Sigma_1A_{11}^* + B_1B_1^*)\Sigma_2 \tag{8.50} \\
&= -\Gamma\hat B\hat B^*\Gamma \Rightarrow (8.43),
\end{align}
where (8.27) was substituted into (8.49), (8.50) is a rearrangement of (8.49), and finally the $(1,1)$ blocks of (8.22) and (8.23) are used. Equations (8.34) and (8.36) are hence verified.

Similarly, in order to verify (8.35) and (8.37) we need (8.23), which is assumed, together with
\begin{align}
A_{11}^*(-\Gamma) + (-\Gamma)\hat A - C_1^*\hat C &= 0 \tag{8.51} \\
A_{12}^*(-\Gamma) - C_2^*\hat C &= 0 \tag{8.52} \\
\hat A^*\Sigma_1\Gamma + \Sigma_1\Gamma\hat A + \hat C^*\hat C &= 0. \tag{8.53}
\end{align}
Equations (8.51) to (8.53) are now verified in a manner analogous to (8.41) to (8.43):
\begin{align}
C_2^*\hat C &= C_2^*(C_1\Sigma_1 + UB_1^*) \notag\\
&= (-A_{12}^*\Sigma_2 - A_{21})\Sigma_1 - B_2B_1^* \notag\\
&= -A_{12}^*\Gamma \Rightarrow (8.52), \notag
\end{align}
\begin{align}
C_1^*\hat C &= C_1^*C_1\Sigma_1 + C_1^*UB_1^* \notag\\
&= -A_{11}^*\Sigma_2\Sigma_1 - \Sigma_2A_{11}\Sigma_1 + C_1^*UB_1^* \notag\\
&= -A_{11}^*\Gamma - \Gamma\hat A \Rightarrow (8.51), \notag
\end{align}
\begin{align}
\hat A^*\Sigma_1\Gamma + \Sigma_1\Gamma\hat A &= (A_{11} + \Sigma_1A_{11}^*\Sigma_2 - B_1U^*C_1)\Sigma_1 + \Sigma_1(A_{11}^* + \Sigma_2A_{11}\Sigma_1 - C_1^*UB_1^*) \notag\\
&= -(\Sigma_1C_1^* + B_1U^*)(C_1\Sigma_1 + UB_1^*) \notag\\
&\qquad + \Sigma_1(A_{11}^*\Sigma_2 + \Sigma_2A_{11} + C_1^*C_1)\Sigma_1 + (A_{11}\Sigma_1 + \Sigma_1A_{11}^* + B_1B_1^*) \notag\\
&= -\hat C^*\hat C \Rightarrow (8.53). \notag
\end{align}
Therefore, (8.35) and (8.37) have been verified, (8.38) is immediate, and the proof of part (1) is complete.

(2) Equations (8.34), (8.35) and (8.38) ensure that the conditions of Theorem 8.3, part (1b) are satisfied, and Theorem 8.3, part (2) can be used to show that the $D_e$ given in (8.33) makes $E(s)$ all-pass. (Note it is still assumed that $\sigma = 1$.) We hence need to verify that
\begin{align}
D_e^*D_e &= I \tag{8.54} \\
D_e^*C_e + B_e^*Q_e &= 0 \tag{8.55} \\
D_eB_e^* + C_eP_e &= 0. \tag{8.56}
\end{align}
Equation (8.54) is immediate; (8.55) follows by substituting the definitions of $\hat B$, $\hat C$, $D_e$ and $Q_e$; and (8.56) follows from $D_e \cdot (8.55) \cdot P_e$.

(3) (a) To show that $\delta(\hat A) = 0$ if $\delta(A) = 0$, we will assume that there exist $x \in \mathbb{C}^{n-r}$ and $\lambda \in \mathbb{C}$ such that $\hat A x = \lambda x$ and $\lambda + \bar\lambda = 0$, and show that this implies $x = 0$. From $x^*(8.53)x$,
\begin{align}
(\lambda + \bar\lambda)x^*\Sigma_1\Gamma x + x^*\hat C^*\hat C x &= 0 \tag{8.57} \\
\Rightarrow \hat C x &= 0. \tag{8.58}
\end{align}
Now $(8.51)x$ gives
\[ -A_{11}^*\Gamma x - \lambda\Gamma x - C_1^*\hat C x = 0 \]
\[ \Rightarrow x^*\Gamma A_{11} = -\bar\lambda x^*\Gamma. \tag{8.59} \]
Also, $(8.52)x$ and (8.58) give
\[ A_{12}^*\Gamma x = 0. \tag{8.60} \]
Equations (8.59) and (8.60) imply that $(x^*\Gamma, 0)A = -\bar\lambda(x^*\Gamma, 0)$; but since it is assumed that $\delta(A) = 0$, $\lambda + \bar\lambda = 0$, and $\Gamma^{-1}$ exists, this implies that $x = 0$, and $\delta(\hat A) = 0$ is proven.

(b) Since $\delta(\hat A) = 0$ has been proved and $\delta(\Sigma_1\Sigma_2) = 0$ is assumed (which gives $\delta(\Sigma_2\Gamma^{-1}) = \delta(\Sigma_1\Gamma) = 0$), Theorem 8.2 can be applied, since equations (8.43) and (8.53) have been verified. Hence
\[ \mathrm{In}(\hat A) = \mathrm{In}(-\Sigma_2\Gamma^{-1}) = \mathrm{In}(-\Sigma_1\Gamma) = \mathrm{In}(-\Sigma_2\Gamma). \]

(c) Assume that there exist $x \neq 0 \in \mathbb{C}^{n-r}$ and $\lambda \in \mathbb{C}$ such that $\hat A x = \lambda x$ and $\hat C x = 0$ (i.e., $(\hat C, \hat A)$ is not completely observable). Then $(8.51)x$ and $(8.52)x$ give
\[ -A_{11}^*\Gamma x - \lambda\Gamma x = 0, \qquad A_{12}^*\Gamma x = 0; \]
hence $(-\bar\lambda)$ is an eigenvalue of $A$, since $\Gamma x \neq 0$. However, since $P > 0$ and $\delta(A) = 0$ are assumed, $\mathrm{In}(A) = (0, n, 0)$, and all the unobservable modes must therefore be in the open right half plane. Similarly, if it is assumed that $(\hat A, \hat B)$ is not completely controllable, then (8.41) and (8.42) give the analogous conclusion; therefore all the modes in the left half-plane are controllable and observable, and the condition in (3b) gives their number.

(d) (i) If $\Sigma_1\Gamma > 0$ or $\Sigma_2\Gamma > 0$, then by (3b) $\mathrm{In}(\hat A) = (0, n-r, 0)$, and by (3c) the McMillan degree of $(\hat A, \hat B, \hat C)$ is $n - r$, and the result is proven.

(ii) Assume there exists $x$ such that $\hat A x = \lambda x$ and $\hat C x = 0$. Then $x^*(8.53)x$ gives
\[ (\lambda + \bar\lambda)x^*\Sigma_1\Gamma x = 0; \]
but $(\lambda + \bar\lambda) \neq 0$ by (3a), and $\Sigma_1\Gamma < 0$ is assumed, so that $x = 0$. Hence $(\hat C, \hat A)$ is completely observable. Similarly, (8.43) gives $(\hat A, \hat B)$ completely controllable. $\Box$

Example 8.1 Take
\[ G(s) = \frac{39s^2 + 105s + 250}{(s+2)(s+5)^2}. \]
This has a balanced realization given by
\[ A = \begin{bmatrix} -2 & 4 & -4 \\ -4 & -1 & -4 \\ -4 & 4 & -9 \end{bmatrix}, \quad B = \begin{bmatrix} 2 \\ 1 \\ 6 \end{bmatrix}, \quad C = \begin{bmatrix} 2 & -1 & 6 \end{bmatrix}, \quad \Sigma = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \tfrac12 & 0 \\ 0 & 0 & 2 \end{bmatrix}. \]
Now using the above construction with $\Sigma_1 = \begin{bmatrix} 1 & 0 \\ 0 & \tfrac12 \end{bmatrix}$, $\sigma = 2$ gives
\[ \hat A = \frac13\begin{bmatrix} 2 & 10 \\ -8 & 5 \end{bmatrix}, \quad \hat B = \frac13\begin{bmatrix} 2 \\ -2 \end{bmatrix}, \quad \hat C = \begin{bmatrix} -2 & -\tfrac52 \end{bmatrix}, \quad \hat D = 2, \]
\[ W(s) = \frac{6s^2 - 13s + 90}{3s^2 - 7s + 30}, \qquad G(s) - W(s) = \frac{2(-s+2)(-s+5)^2(3s^2 + 7s + 30)}{(s+2)(s+5)^2(3s^2 - 7s + 30)}. \]
$W(s)$ is an optimal anticausal approximation to $G(s)$ with $\mathcal{L}_\infty$ error of 2.
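The construction above is easy to check numerically. The sketch below (assuming NumPy; variable names are ours) rebuilds $(\hat A, \hat B, \hat C, \hat D)$ from equations (8.27) to (8.31) for the data of this example and confirms that $E = G - W$ has modulus $\sigma = 2$ along the imaginary axis, as Theorem 8.4, part (2) guarantees.

```python
import numpy as np

# Data of Example 8.1 (balanced realization, P = Q = diag(1, 1/2, 2)).
A = np.array([[-2., 4, -4], [-4, -1, -4], [-4, 4, -9]])
B = np.array([[2.], [1.], [6.]])
C = np.array([[2., -1, 6]])
S1 = np.diag([1.0, 0.5])             # Sigma_1 (= Sigma_2 here, since P = Q)
sig = 2.0                            # sigma

A11, B1, C1 = A[:2, :2], B[:2], C[:, :2]
U = np.array([[-1.0]])               # solves B2 = -C2* U  (B2 = 6, C2 = 6)
Gam = S1 @ S1 - sig**2 * np.eye(2)   # Gamma = Sigma_1 Sigma_2 - sigma^2 I
Ah = np.linalg.solve(Gam, sig**2 * A11.T + S1 @ A11 @ S1 - sig * C1.T @ U @ B1.T)
Bh = np.linalg.solve(Gam, S1 @ B1 + sig * C1.T @ U)
Ch = C1 @ S1 + sig * U @ B1.T
Dh = -sig * U                        # D = 0 for this example

def tf(A, B, C, D, s):
    """Evaluate D + C (sI - A)^{-1} B for a scalar transfer function."""
    return (C @ np.linalg.solve(s * np.eye(A.shape[0]) - A, B) + D).item()

# E = G - W should be all-pass with |E(jw)| = sigma = 2 at every frequency.
for w in [0.0, 1.0, 10.0]:
    E = tf(A, B, C, 0.0, 1j * w) - tf(Ah, Bh, Ch, Dh, 1j * w)
    assert abs(abs(E) - 2.0) < 1e-9
```

The computed $(\hat A, \hat B, \hat C, \hat D)$ reproduce the matrices printed above up to the common factor $\tfrac13$.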

Example 8.2 Let us also illustrate Theorem 8.4 when $(\Sigma_1^2 - \sigma^2 I)$ is indefinite. Take $G(s)$ as in the above example and permute the first and third states of the balanced realization so that $\Sigma = \mathrm{diag}(2, \tfrac12, 1)$, $\Sigma_1 = \mathrm{diag}(2, \tfrac12)$, $\sigma = 1$. The construction of Theorem 8.4 now gives
\[ \hat A = \begin{bmatrix} -3 & 2 \\ 8 & 3 \end{bmatrix}, \quad \hat B = \begin{bmatrix} 2 \\ -2 \end{bmatrix}, \quad \hat C = \begin{bmatrix} 6 & -\tfrac32 \end{bmatrix}, \quad \hat D = 1. \]
Theorem 8.4, part (3b) implies that
\[ \mathrm{In}(\hat A) = \mathrm{In}\big({-\Sigma_1(\Sigma_1^2 - \sigma^2 I)}\big) = \mathrm{In}\begin{bmatrix} -6 & 0 \\ 0 & \tfrac38 \end{bmatrix} = (1, 1, 0), \]
which is verified by noting that $\hat A$ has eigenvalues $5$ and $-5$. Here
\[ W(s) = \left[\begin{array}{c|c} \hat A & \hat B \\ \hline \hat C & \hat D \end{array}\right] = \frac{s + 20}{s + 5}, \]
and we note that the stable part of $W(s)$ has McMillan degree 1, as predicted by Theorem 8.4, part (3c). However, this example has been constructed to show that $(\hat A, \hat B, \hat C)$ itself may not be minimal when the conditions of part (3d) are not satisfied; in this case the unstable pole at $+5$ is both uncontrollable and unobservable. In fact, $\frac{s+20}{s+5}$ is an optimal Hankel norm approximation to $G(s)$ of degree 1, and
\[ E(s) = \frac{(-s+2)(-s+5)^2}{(s+2)(s+5)^2}. \]
In general the error $E(j\omega)$ will have modulus equal to $\sigma$, but $E(s)$ will contain unstable poles.
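A short numerical check of this example (assuming NumPy; a sketch, not the book's code): applying formula (8.27) to the permuted balanced data reproduces $\hat A$ and its eigenvalues $\pm 5$, i.e., the inertia $(1, 1, 0)$ predicted by part (3b).

```python
import numpy as np

# Balanced data of Example 8.1 with first and third states permuted,
# so that Sigma = diag(2, 1/2, 1), Sigma_1 = diag(2, 1/2), sigma = 1.
A = np.array([[-9., 4, -4], [-4, -1, -4], [-4, 4, -2]])
B = np.array([[6.], [1.], [2.]])
C = np.array([[6., -1, 2]])
S1, sig = np.diag([2.0, 0.5]), 1.0

A11, B1, C1 = A[:2, :2], B[:2], C[:, :2]
U = np.array([[-1.0]])                   # solves B2 = -C2* U  (B2 = 2, C2 = 2)
Gam = S1 @ S1 - sig**2 * np.eye(2)       # diag(3, -3/4), indefinite
Ah = np.linalg.solve(Gam, sig**2 * A11.T + S1 @ A11 @ S1 - sig * C1.T @ U @ B1.T)

# Inertia (1, 1, 0): one stable and one unstable mode, at -5 and +5.
eigs = np.sort(np.linalg.eigvals(Ah).real)
assert np.allclose(eigs, [-5.0, 5.0])
```

Since $\Gamma$ is indefinite here, neither condition of part (3d) holds, which is consistent with the nonminimality noted in the text.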

Example 8.3 Let us finally complete the analysis of this $G(s)$ by permuting the second and third states in the balanced realization of the last example, to obtain $\Sigma_1 = \mathrm{diag}(2, 1)$ and $\sigma = \tfrac12$. We find
\[ \hat A = \begin{bmatrix} -15 & -4 \\ -20 & -6 \end{bmatrix}, \quad \hat B = \begin{bmatrix} 4 \\ 4 \end{bmatrix}, \quad \hat C = \begin{bmatrix} 15 & 3 \end{bmatrix}, \quad \hat D = -\tfrac12, \]
\[ W(s) = \left[\begin{array}{c|c} \hat A & \hat B \\ \hline \hat C & \hat D \end{array}\right] = \frac{\tfrac12(-s^2 + 123s + 110)}{s^2 + 21s + 10}, \]
\[ E(s) = G(s) - W(s) = -\frac12 \cdot \frac{(-s+2)(-s+5)^2(s^2 - 21s + 10)}{(s+2)(s+5)^2(s^2 + 21s + 10)}. \]
Note that $-\Sigma_1\Gamma = \mathrm{diag}(-\tfrac{15}{2}, -\tfrac34)$, so that $\hat A$ is stable by Theorem 8.4, part (3b); $|E(j\omega)| = \tfrac12$ by Theorem 8.4, part (2); and $(\hat A, \hat B, \hat C)$ is minimal by Theorem 8.4, part (3d). $W(s)$ is in fact an optimal second-order Hankel norm approximation to $G(s)$.

8.3 Optimal Hankel Norm Approximation

We are now ready to give a solution to the optimal Hankel norm approximation problem based on Theorem 8.4. The following lemma gives a lower bound on the achievable Hankel norm of the error,
\[ \inf_{\hat G}\big\| G(s) - \hat G(s) \big\|_H \ge \sigma_{k+1}(G), \]
and Theorem 8.6 then shows that the construction of Theorem 8.4 can be used to achieve this lower bound.


Lemma 8.5 Given a stable, rational, $p \times m$ transfer function matrix $G(s)$ with Hankel singular values $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_k \ge \sigma_{k+1} \ge \sigma_{k+2} \ge \dots \ge \sigma_n > 0$, then for all $\hat G(s)$ stable and of McMillan degree $\le k$,
\begin{align}
\sigma_i(G(s) - \hat G(s)) &\ge \sigma_{i+k}(G(s)), \quad i = 1, \dots, n-k, \tag{8.61} \\
\sigma_{i+k}(G(s) - \hat G(s)) &\le \sigma_i(G(s)), \quad i = 1, \dots, n. \tag{8.62}
\end{align}
In particular,
\[ \big\| G(s) - \hat G(s) \big\|_H \ge \sigma_{k+1}(G(s)). \tag{8.63} \]

Proof. We shall prove (8.61) only; inequality (8.62) then follows from (8.61) by setting
\[ G(s) = \big(G(s) - \hat G(s)\big) - \big({-\hat G(s)}\big). \]
Let $(\hat A, \hat B, \hat C)$ be a minimal state space realization of $\hat G(s)$; then $(A_e, B_e, C_e)$ given by (8.33) will be a state space realization of $G(s) - \hat G(s)$. Now let $P = P^*$ and $Q = Q^*$ satisfy (8.34) and (8.35), respectively (but not necessarily (8.36) and (8.37)), and write
\[ P = \begin{bmatrix} P_{11} & P_{12} \\ P_{12}^* & P_{22} \end{bmatrix}, \qquad Q = \begin{bmatrix} Q_{11} & Q_{12} \\ Q_{12}^* & Q_{22} \end{bmatrix}, \qquad P_{11}, Q_{11} \in \mathbb{R}^{n\times n}. \]
Since $P \ge 0$, it can be factorized as
\[ P = RR^*, \qquad R = \begin{bmatrix} R_{11} & R_{12} \\ 0 & R_{22} \end{bmatrix}, \]
with
\[ R_{22} = P_{22}^{1/2}, \qquad R_{12} = P_{12}P_{22}^{-1/2}, \qquad R_{11}R_{11}^* = P_{11} - R_{12}R_{12}^* \]
($P_{22} > 0$ since $(\hat A, \hat B, \hat C)$ is a minimal realization). Then
\begin{align}
\sigma_i^2(G(s) - \hat G(s)) &= \lambda_i(PQ) = \lambda_i(RR^*Q) = \lambda_i(R^*QR) \notag\\
&\ge \lambda_i\Big(\begin{bmatrix} I_n & 0 \end{bmatrix}R^*QR\begin{bmatrix} I_n \\ 0 \end{bmatrix}\Big) \notag\\
&= \lambda_i\Big(\begin{bmatrix} R_{11}^* & 0 \end{bmatrix}Q\begin{bmatrix} R_{11} \\ 0 \end{bmatrix}\Big) = \lambda_i(R_{11}^*Q_{11}R_{11}) = \lambda_i(Q_{11}R_{11}R_{11}^*) \notag\\
&= \lambda_i\big(Q_{11}(P_{11} - R_{12}R_{12}^*)\big) \notag\\
&= \lambda_i\big(Q_{11}^{1/2}P_{11}Q_{11}^{1/2} - XX^*\big), \quad \text{where } X = Q_{11}^{1/2}R_{12} \notag\\
&\ge \lambda_{i+k}\big(Q_{11}^{1/2}P_{11}Q_{11}^{1/2}\big) \tag{8.64}\\
&= \lambda_{i+k}(P_{11}Q_{11}) = \sigma_{i+k}^2(G), \notag
\end{align}
where (8.64) follows from the fact that $X$ is an $n \times k$ matrix (so that $\mathrm{rank}(XX^*) \le k$). $\Box$
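The lower bounds (8.61) can be observed numerically. In the sketch below (our own test setup, assuming NumPy/SciPy), $G(s)$ is the three-state system of Example 8.1 with Hankel singular values $2, 1, \tfrac12$, and several stable first-order candidates $\hat G(s) = c/(s+a)$ are tried; in every case $\sigma_1(G - \hat G) \ge \sigma_2(G) = 1$ and $\sigma_2(G - \hat G) \ge \sigma_3(G) = \tfrac12$.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, block_diag

def hsv(A, B, C):
    """Hankel singular values via the Gramians (Theorem 8.1)."""
    Lc = solve_continuous_lyapunov(A, -B @ B.T)
    Lo = solve_continuous_lyapunov(A.T, -C.T @ C)
    ev = np.linalg.eigvals(Lc @ Lo).real
    return np.sort(np.sqrt(np.clip(ev, 0.0, None)))[::-1]

# G(s) of Example 8.1, with Hankel singular values 2, 1, 1/2.
A = np.array([[-2., 4, -4], [-4, -1, -4], [-4, 4, -9]])
B = np.array([[2.], [1.], [6.]])
C = np.array([[2., -1, 6]])

# Try several stable first-order candidates Ghat(s) = c/(s + a).
for a, c in [(1.0, 1.0), (5.0, -3.0), (0.1, 10.0)]:
    Ae = block_diag(A, [[-a]])
    Be = np.vstack([B, [[1.0]]])
    Ce = np.hstack([C, [[-c]]])        # realizes the error G - Ghat
    s = hsv(Ae, Be, Ce)
    assert s[0] >= 1.0 - 1e-9          # (8.61): sigma_1(G - Ghat) >= sigma_2(G)
    assert s[1] >= 0.5 - 1e-9          # (8.61): sigma_2(G - Ghat) >= sigma_3(G)
```

No first-degree stable approximant, however chosen, can beat the bound (8.63).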

We can now give a solution to the optimal Hankel norm approximation problem for square transfer functions.

Theorem 8.6 Given a stable, rational, $m \times m$ transfer function $G(s)$, then

(1) $\displaystyle \sigma_{k+1}(G(s)) = \inf_{\hat G \in \mathcal{H}_\infty,\ F \in \mathcal{H}_\infty^-} \big\| G(s) - \hat G(s) - F(s) \big\|_\infty$, where the McMillan degree of $\hat G$ is $\le k$.

(2) If $G(s)$ has Hankel singular values $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_k > \sigma_{k+1} = \sigma_{k+2} = \dots = \sigma_{k+r} > \sigma_{k+r+1} \ge \dots \ge \sigma_n > 0$, then $\hat G(s)$ of McMillan degree $k$ is an optimal Hankel norm approximation to $G(s)$ if and only if there exists $F(s) \in \mathcal{H}_\infty^-$ (whose McMillan degree can be chosen $\le n + k - 1$) such that $E(s) := G(s) - \hat G(s) - F(s)$ satisfies
\[ E(s)E^\sim(s) = \sigma_{k+1}^2 I. \tag{8.65} \]
In this case,
\[ \big\| G(s) - \hat G(s) \big\|_H = \sigma_{k+1}. \tag{8.66} \]

(3) Let $G(s)$ be as in (2) above; then an optimal Hankel norm approximation of McMillan degree $k$, $\hat G(s)$, can be constructed as follows. Let $(A, B, C)$ be a balanced realization of $G(s)$ with corresponding
\[ \Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \dots, \sigma_k, \sigma_{k+r+1}, \dots, \sigma_n, \sigma_{k+1}, \dots, \sigma_{k+r}), \]
and define $(\hat A, \hat B, \hat C, \hat D)$ from equations (8.26) to (8.31). Then
\[ \hat G(s) + F(s) = \left[\begin{array}{c|c} \hat A & \hat B \\ \hline \hat C & \hat D \end{array}\right] \tag{8.67} \]
where $\hat G(s) \in \mathcal{H}_\infty$ and $F(s) \in \mathcal{H}_\infty^-$, with the McMillan degree of $\hat G(s)$ equal to $k$ and the McMillan degree of $F(s)$ at most $n - k - r$.

Proof. By the definition of the $\mathcal{L}_\infty$ norm, for all $F(s) \in \mathcal{H}_\infty^-$ and all $\hat G(s)$ of McMillan degree $k$,
\begin{align}
\big\|G(s) - \hat G(s) - F(s)\big\|_\infty &\ge \sup_{f\in\mathcal{H}_2^\perp,\ \|f\|_2\le 1}\big\|(G(s) - \hat G(s) - F(s))f\big\|_2 \notag\\
&\ge \sup_{f\in\mathcal{H}_2^\perp,\ \|f\|_2\le 1}\big\|P_+(G - \hat G - F)f\big\|_2 \notag\\
&= \sup_{f\in\mathcal{H}_2^\perp,\ \|f\|_2\le 1}\big\|P_+(G - \hat G)f\big\|_2 \notag\\
&= \big\|G - \hat G\big\|_H \ge \sigma_{k+1}(G(s)), \tag{8.68}
\end{align}
where (8.68) follows from Lemma 8.5.

Now define $\hat G(s)$ and $F(s)$ via equation (8.67); then Theorem 8.4, part (2) implies that (8.65) holds, and hence
\[ \|E(s)\|_\infty = \sigma_{k+1}. \tag{8.69} \]
Also, from Theorem 8.4, part (3b),
\[ \mathrm{In}(\hat A) = \mathrm{In}\big({-\Sigma_1(\Sigma_1^2 - \sigma_{k+1}^2 I)}\big) = (n-k-r,\ k,\ 0). \tag{8.70} \]
Hence $\hat G$ has McMillan degree $k$ and is in the correct class, and therefore (8.69) implies that the inequalities in (8.68) become equalities; part (1) is proven, as is part (3). Clearly, the sufficiency of part (2) can be verified similarly by noting that (8.65) implies that (8.68) is satisfied with equality.

To show the necessity of part (2), suppose that $\hat G(s)$ is an optimal Hankel norm approximation to $G(s)$ of McMillan degree $k$, i.e., equation (8.66) holds. Now Theorem 8.4 can be applied to $G(s) - \hat G(s)$ to produce an optimal anticausal approximation $F(s)$ such that $(G(s) - \hat G(s) - F(s))/\sigma_{k+1}(G)$ is all-pass, since $\sigma_{k+1}(G) = \sigma_1(G - \hat G)$. Further, the McMillan degree of this $F(s)$ will be the McMillan degree of $G(s) - \hat G(s)$ minus the multiplicity of $\sigma_1(G - \hat G)$, which is $\le n + k - 1$. $\Box$

The following corollary gives the solution to the well-known Nehari problem.

Corollary 8.7 Let $G(s)$ be a stable, rational, $m \times m$ transfer function of McMillan degree $n$ such that $\sigma_1(G)$ has multiplicity $r_1$. Also let $F(s)$ be an optimal anticausal approximation of degree $n - r_1$ given by the construction of Theorem 8.4. Then

(1) $(G(s) - F(s))/\sigma_1$ is all-pass;

(2) $\sigma_{i-r_1}(F(-s)) = \sigma_i(G(s))$, $i = r_1 + 1, \dots, n$.

Proof. (1) is proved in Theorem 8.4, part (2). (2) is obtained from the forms of $P_e$ and $Q_e$ in Theorem 8.4, part (1). $F(-s)$ is used since it is stable and hence has well-defined Hankel singular values. $\Box$

The optimal Hankel norm approximation in the non-square case can be obtained by first augmenting the function to form a square function. For example, consider a stable, rational, $p \times m$ ($p < m$) transfer function $G(s)$. Let $G_a = \begin{bmatrix} G \\ 0 \end{bmatrix}$ be an augmented square transfer function and let $\hat G_a = \begin{bmatrix} \hat G \\ \hat G_2 \end{bmatrix}$ be the optimal Hankel norm approximation of $G_a$ such that
\[ \big\|G_a - \hat G_a\big\|_H = \sigma_{k+1}(G_a). \]
Then
\[ \sigma_{k+1}(G) \le \big\|G - \hat G\big\|_H \le \big\|G_a - \hat G_a\big\|_H = \sigma_{k+1}(G_a) = \sigma_{k+1}(G), \]
i.e., $\hat G$ is an optimal Hankel norm approximation of $G(s)$.


8.4 $\mathcal{L}_\infty$ Bounds for Hankel Norm Approximation

The natural question that arises now is: does the Hankel norm of the error being small imply that other, more familiar norms of the error are also small? We shall give a definite answer in this section.

Lemma 8.8 Let an $m \times m$ transfer matrix $E = \left[\begin{array}{c|c} A & B \\ \hline C & D \end{array}\right]$ satisfy $E(s)E^\sim(s) = \sigma^2 I$ and all the equations of Theorem 8.3, and let $A$ have dimension $n_1 + n_2$, with $n_1$ eigenvalues strictly in the left half plane and $n_2 < n_1$ eigenvalues strictly in the right half plane. If $E = G_s + F$ with $G_s \in \mathcal{RH}_\infty$ and $F \in \mathcal{RH}_\infty^-$, then
\[ \sigma_i(G_s) = \begin{cases} \sigma, & i = 1, 2, \dots, n_1 - n_2; \\ \sigma_{i-n_1+n_2}(F(-s)), & i = n_1 - n_2 + 1, \dots, n_1. \end{cases} \]

Proof. First let the realization be transformed to
\[ E = \left[\begin{array}{cc|c} A_1 & 0 & B_1 \\ 0 & A_2 & B_2 \\ \hline C_1 & C_2 & D \end{array}\right] = \left[\begin{array}{c|c} A & B \\ \hline C & D \end{array}\right], \qquad \mathrm{Re}\,\lambda_i(A_1) < 0, \quad \mathrm{Re}\,\lambda_i(A_2) > 0, \]
in which case $G_s = \left[\begin{array}{c|c} A_1 & B_1 \\ \hline C_1 & D \end{array}\right]$ and $F = \left[\begin{array}{c|c} A_2 & B_2 \\ \hline C_2 & 0 \end{array}\right]$. The equations (i) to (iv) of Theorem 8.3 are then satisfied by a transformed $P$ and $Q$, partitioned as
\[ P = \begin{bmatrix} P_{11} & P_{12} \\ P_{12}^* & P_{22} \end{bmatrix}, \qquad Q = \begin{bmatrix} Q_{11} & Q_{21}^* \\ Q_{21} & Q_{22} \end{bmatrix}. \]
$PQ = \sigma^2 I$ implies that
\begin{align*}
\det(\lambda I - P_{11}Q_{11}) &= \det\big(\lambda I - (\sigma^2 I - P_{12}Q_{21})\big) \\
&= \det\big((\lambda - \sigma^2)I + P_{12}Q_{21}\big) \\
&= (\lambda - \sigma^2)^{n_1 - n_2}\det\big((\lambda - \sigma^2)I + Q_{21}P_{12}\big) \\
&= (\lambda - \sigma^2)^{n_1 - n_2}\det(\lambda I - Q_{22}P_{22}).
\end{align*}
The result now follows on observing that $\sigma_i^2(G_s(s)) = \lambda_i(P_{11}Q_{11})$ and $\sigma_i^2(F(-s)) = \lambda_i(Q_{22}P_{22})$. $\Box$

Corollary 8.9 Let $E(s) = G(s) - \hat G(s) - F(s)$ be as defined in part (3) of Theorem 8.6, with $G(s), \hat G(s) \in \mathcal{RH}_\infty$ and $F(s) \in \mathcal{RH}_\infty^-$. Then for $i = 1, 2, \dots, 2k+r$,
\[ \sigma_i(G - \hat G) = \sigma_{k+1}(G), \]
and for $i = 1, 2, \dots, n-k-r$,
\[ \sigma_{i+3k+r}(G) \le \sigma_i(F(-s)) = \sigma_{i+2k+r}(G - \hat G) \le \sigma_{i+k+r}(G). \]

Proof. The construction of $E(s)$ ensures that the all-pass equations are satisfied, and an inertia argument easily establishes that the $A$-matrix of $E$ has precisely $n_1 = n + k$ eigenvalues in the open left half plane and $n_2 = n - k - r$ in the open right half plane. Hence Lemma 8.8 can be applied to give the equalities. The inequalities follow from Lemma 8.5. $\Box$

The following lemma gives the properties of certain optimal Hankel norm approximations when the degree is reduced by the multiplicity of $\sigma_n$. In this case some precise statements on the error and the approximating system can be made.

Lemma 8.10 Let $G(s)$ be a stable, rational, $m \times m$ transfer function of McMillan degree $n$, and such that $\sigma_n(G)$ has multiplicity $r$. Also let $\hat G(s)$ be an optimal Hankel norm approximation of degree $n - r$ given by Theorem 8.6, part (3) (with $F(s) \equiv 0$). Then

(1) $(G(s) - \hat G(s))/\sigma_n(G(s))$ is all-pass;

(2) $\sigma_i(\hat G(s)) = \sigma_i(G(s))$, $i = 1, \dots, n - r$.

Proof. Theorem 8.6 gives that $\hat A \in \mathbb{R}^{(n-r)\times(n-r)}$ is stable, and hence $F(s)$ can be chosen to be zero; therefore $(G(s) - \hat G(s))/\sigma_n(G)$ is all-pass. The $\sigma_i(\hat G(s))$ are obtained from Lemma 8.8. $\Box$
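Lemma 8.10 (2) can be seen concretely on Example 8.3 (a numerical sketch assuming NumPy/SciPy; not the book's code): the degree-2 optimal approximant constructed there, which removes the smallest Hankel singular value $\tfrac12$, should inherit the remaining Hankel singular values $2$ and $1$ of $G(s)$ exactly.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# (Ahat, Bhat, Chat) of Example 8.3: an optimal Hankel norm approximant
# of the G(s) of Example 8.1 with the smallest sigma = 1/2 removed.
Ah = np.array([[-15., -4], [-20, -6]])
Bh = np.array([[4.], [4.]])
Ch = np.array([[15., 3]])

Lc = solve_continuous_lyapunov(Ah, -Bh @ Bh.T)
Lo = solve_continuous_lyapunov(Ah.T, -Ch.T @ Ch)
sv = np.sort(np.sqrt(np.linalg.eigvals(Lc @ Lo).real))[::-1]

# Lemma 8.10 (2): the approximant keeps sigma_1 = 2 and sigma_2 = 1 of G.
assert np.allclose(sv, [2.0, 1.0])
```

In exact arithmetic the Gramians come out diagonal, $L_c = \mathrm{diag}(\tfrac{8}{15}, \tfrac43)$ and $L_o = \mathrm{diag}(\tfrac{15}{2}, \tfrac34)$, so $L_cL_o = \mathrm{diag}(4, 1)$.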

Applying the above reduction procedure again on $\hat G(s)$, and repeating until $\hat G(s)$ has zero McMillan degree, gives the following new representation of stable systems.

Theorem 8.11 Let $G(s)$ be a stable, rational, $m \times m$ transfer function with Hankel singular values $\sigma_1 > \sigma_2 > \dots > \sigma_N$, where $\sigma_i$ has multiplicity $r_i$ and $r_1 + r_2 + \dots + r_N = n$. Then there exists a representation of $G(s)$ as
\[ G(s) = D_0 + \sigma_1 E_1(s) + \sigma_2 E_2(s) + \dots + \sigma_N E_N(s) \tag{8.71} \]
where

(1) $E_k(s)$ are all-pass and stable for all $k$;

(2) for $k = 1, 2, \dots, N$,
\[ G_k(s) := D_0 + \sum_{i=1}^{k} \sigma_i E_i(s) \]
has McMillan degree $r_1 + r_2 + \dots + r_k$.


Proof. Let $G_k(s)$ be the optimal Hankel norm approximation to $G_{k+1}(s)$ (given by Lemma 8.10) of degree $r_1 + r_2 + \dots + r_k$, with $G_N(s) := G(s)$. Lemma 8.10 (2) applied at each step then gives that the Hankel singular values of $G_k(s)$ will be $\sigma_1, \sigma_2, \dots, \sigma_k$ with multiplicities $r_1, r_2, \dots, r_k$, respectively. Hence Lemma 8.10 (1) gives that $G_k(s) - G_{k-1}(s) = \sigma_k E_k(s)$ for some stable, all-pass $E_k(s)$. Note also that Theorem 8.4, part (3d), relation (i) ensures that each $G_k(s)$ will have McMillan degree $r_1 + r_2 + \dots + r_k$. Finally, taking $D_0 = G_0(s)$, which will be a constant, and combining the steps gives the result. $\Box$

Note that the construction of Theorem 8.11 immediately gives an approximation algorithm that will satisfy $\big\| G(s) - \hat G(s) \big\|_\infty \le \sigma_{k+1} + \sigma_{k+2} + \dots + \sigma_N$. This will not be an optimal Hankel norm approximation in general, but it involves less computation, since the decomposition into $\hat G(s) + F(s)$ need not be carried out, and at each step a balanced realization of $G_k(s)$ is given by $(\hat A_k, \hat B_k, \hat C_k)$ with a diagonal scaling.

An upper bound on the $\mathcal{L}_\infty$ norm of $G(s)$ is now obtained as an immediate consequence of Theorem 8.11.

Corollary 8.12 Let $G(s)$ be a stable, rational, $p \times m$ transfer function with Hankel singular values $\sigma_1 > \sigma_2 > \dots > \sigma_N$, where each $\sigma_i$ has multiplicity $r_i$, and such that $G(\infty) = 0$. Then

(1) $\|G(s)\|_\infty \le 2(\sigma_1 + \sigma_2 + \dots + \sigma_N)$;

(2) there exists a constant $D_0$ such that
\[ \|G(s) - D_0\|_\infty \le \sigma_1 + \sigma_2 + \dots + \sigma_N. \]

Proof. For $p = m$, consider the representation of $G(s)$ given by Theorem 8.11; then
\[ \|G(s) - D_0\|_\infty = \|\sigma_1 E_1(s) + \sigma_2 E_2(s) + \dots + \sigma_N E_N(s)\|_\infty \le \sigma_1 + \sigma_2 + \dots + \sigma_N, \]
since the $E_k(s)$ are all-pass. Further, setting $s = \infty$, since $G(\infty) = 0$, gives
\[ \|D_0\| \le \sigma_1 + \sigma_2 + \dots + \sigma_N \]
\[ \Rightarrow \|G(s)\|_\infty \le 2(\sigma_1 + \sigma_2 + \dots + \sigma_N). \]
For the case $p \neq m$, just augment $G(s)$ by zero rows or columns to make it square; this does not change the $\mathcal{L}_\infty$ norm, and the above argument then gives upper bounds on the $\mathcal{L}_\infty$ norm of the augmented system. $\Box$
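Corollary 8.12 is easy to spot-check on the system of Example 8.1 (a sketch assuming NumPy; gridding the frequency axis only underestimates $\|G\|_\infty$, which is all a check of an upper bound needs): with $\sigma_1 + \sigma_2 + \sigma_3 = 3.5$, part (1) guarantees $\|G\|_\infty \le 7$.

```python
import numpy as np

# Balanced realization of Example 8.1, Hankel singular values 2, 1, 1/2.
A = np.array([[-2., 4, -4], [-4, -1, -4], [-4, 4, -9]])
B = np.array([[2.], [1.], [6.]])
C = np.array([[2., -1, 6]])

def G(w):
    """Frequency response G(jw) = C (jwI - A)^{-1} B (strictly proper)."""
    return (C @ np.linalg.solve(1j * w * np.eye(3) - A, B)).item()

assert abs(G(0.0) - 5.0) < 1e-9      # DC gain 250/50 = 5

# Grid estimate of ||G||_inf; Corollary 8.12(1) bounds it by 2*(2 + 1 + 1/2) = 7.
peak = max(abs(G(w)) for w in np.logspace(-2, 3, 2000))
assert peak <= 7.0 + 1e-6
```

The bound is conservative here; the point is only that it is cheap to obtain from the Hankel singular values alone.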

Theorem 8.13 Given a stable, rational, $m \times m$ transfer function $G(s)$ with Hankel singular values $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_k > \sigma_{k+1} = \sigma_{k+2} = \dots = \sigma_{k+r} > \sigma_{k+r+1} \ge \dots \ge \sigma_n > 0$, and let $\hat G(s) \in \mathcal{RH}_\infty$ of McMillan degree $k$ be an optimal Hankel norm approximation to $G(s)$ obtained in Theorem 8.6. Then there exists a $D_0$ such that
\[ \sigma_{k+1}(G) \le \big\|G - \hat G - D_0\big\|_\infty \le \sigma_{k+1}(G) + \sum_{i=1}^{n-k-r}\sigma_{i+k+r}(G). \]

Proof. The theorem follows from Corollary 8.9 and Corollary 8.12. $\Box$

It should be noted that if $\hat G$ is an optimal Hankel norm approximation of $G$, then $\hat G + D$ for any constant matrix $D$ is also an optimal Hankel norm approximation. Hence the constant term of $\hat G$ cannot be determined from the Hankel norm alone.

An appropriate constant term $D_0$ in Theorem 8.13 can be obtained in many ways. We shall mention three of them:

• Apply Corollary 8.12 to $G(s) - \hat G(s)$. This is usually complicated, since $G(s) - \hat G(s)$ in general has McMillan degree $n + k$.

• An alternative is to use the unstable part of the optimal Hankel norm approximation in Theorem 8.6. Let $\hat G + F$ be obtained from Theorem 8.6, part (3), such that $F(s) \in \mathcal{RH}_\infty^-$ has McMillan degree $\le n - k - r$; then
\[ \big\|G - \hat G - D_0\big\|_\infty \le \big\|G - \hat G - F\big\|_\infty + \|F - D_0\|_\infty = \sigma_{k+1}(G) + \|F - D_0\|_\infty. \]
Now Corollary 8.12 can be applied to $F(-s)$ to obtain a $D_0$ such that
\[ \|F - D_0\|_\infty \le \sum_{i=1}^{n-k-r}\sigma_i(F(-s)) \le \sum_{i=1}^{n-k-r}\sigma_{i+k+r}(G), \]
since by Corollary 8.9,
\[ \sigma_i(F(-s)) \le \sigma_{i+k+r}(G) \]
for $i = 1, 2, \dots, n-k-r$.

• $D_0$ can of course be obtained using any standard convex optimization algorithm:
\[ D_0 = \arg\min_{D_0}\big\|G - \hat G - D_0\big\|_\infty. \]

Note that Theorem 8.13 can also be applied to non-square systems. For example, consider a stable, rational, $p \times m$ ($p < m$) transfer function $G(s)$. Let $G_a = \begin{bmatrix} G \\ 0 \end{bmatrix}$ be an augmented square transfer function and let $\hat G_a = \begin{bmatrix} \hat G \\ \hat G_2 \end{bmatrix}$ be the optimal Hankel norm approximation of $G_a$ such that
\[ \sigma_{k+1}(G_a) \le \big\|G_a - \hat G_a - D_a\big\|_\infty \le \sigma_{k+1}(G_a) + \sum_{i=1}^{n-k-r}\sigma_{i+k+r}(G_a) \]
with $D_a = \begin{bmatrix} D_0 \\ D_2 \end{bmatrix}$. Then
\[ \sigma_{k+1}(G) \le \big\|G - \hat G - D_0\big\|_\infty \le \sigma_{k+1}(G) + \sum_{i=1}^{n-k-r}\sigma_{i+k+r}(G), \]
since $\sigma_i(G) = \sigma_i(G_a)$ and $\big\|G - \hat G - D_0\big\|_\infty \le \big\|G_a - \hat G_a - D_a\big\|_\infty$.

A less tight error bound can be obtained without computing the appropriate $D_0$.

Corollary 8.14 Given a stable, rational, $m \times m$, strictly proper transfer function $G(s)$ with Hankel singular values $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_k > \sigma_{k+1} = \sigma_{k+2} = \cdots = \sigma_{k+r} > \sigma_{k+r+1} \geq \cdots \geq \sigma_n > 0$, and let $\hat{G}(s) \in \mathcal{RH}_\infty$ of McMillan degree $k$ be a strictly proper optimal Hankel norm approximation to $G(s)$ obtained in Theorem 8.6. Then
\[
\sigma_{k+1}(G) \leq \left\| G - \hat{G} \right\|_\infty \leq 2\left( \sigma_{k+1}(G) + \sum_{i=1}^{n-k-r} \sigma_{i+k+r}(G) \right).
\]

Proof. The result follows from Theorem 8.13. $\Box$

8.5 Bounds for Balanced Truncation

Very similar techniques to those of Theorem 8.11 can be used to bound the error obtained by truncating a balanced realization. We will first need a lemma that gives a perhaps surprising relationship between a truncated balanced realization of degree $(n - r_N)$ and an optimal Hankel norm approximation of the same degree.

Lemma 8.15 Let $(A, B, C)$ be a balanced realization of the stable, rational, $m \times m$ transfer function $G(s)$, and let
\[
A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}, \quad
B = \begin{bmatrix} B_1 \\ B_2 \end{bmatrix}, \quad
C = \begin{bmatrix} C_1 & C_2 \end{bmatrix}, \quad
\Sigma = \begin{bmatrix} \Sigma_1 & 0 \\ 0 & \sigma I \end{bmatrix}.
\]
Let $(\hat{A}, \hat{B}, \hat{C}, \hat{D})$ be defined by equations (8.27) to (8.32) (where $\Sigma_2 = \sigma I$) and define
\[
G_b := \begin{bmatrix} A_{11} & B_1 \\ \hline C_1 & 0 \end{bmatrix}, \qquad
G_h := \begin{bmatrix} \hat{A} & \hat{B} \\ \hline \hat{C} & \hat{D} \end{bmatrix}.
\]
Then


(1) $(G_b(s) - G_h(s))/\sigma$ is all-pass.

(2) $\|G(s) - G_b(s)\|_\infty \leq 2\sigma$.

(3) If $\Sigma_1 > \sigma I$ then $\|G(s) - G_b(s)\|_H \leq 2\sigma$.

Proof. (1) In order to prove that $(G_b(s) - G_h(s))/\sigma$ is all-pass, we note that
\[
G_b(s) - G_h(s) = \begin{bmatrix} \tilde{A} & \tilde{B} \\ \hline \tilde{C} & \tilde{D} \end{bmatrix}
\]
where
\[
\tilde{A} = \begin{bmatrix} A_{11} & 0 \\ 0 & \hat{A} \end{bmatrix}, \quad
\tilde{B} = \begin{bmatrix} B_1 \\ \hat{B} \end{bmatrix}, \quad
\tilde{C} = \begin{bmatrix} C_1 & -\hat{C} \end{bmatrix}, \quad
\tilde{D} = -\hat{D}.
\]
Now Theorem 8.4, part (1), gives that the solutions to the Lyapunov equations
\[
\tilde{A}\tilde{P} + \tilde{P}\tilde{A}^* + \tilde{B}\tilde{B}^* = 0 \tag{8.72}
\]
\[
\tilde{A}^*\tilde{Q} + \tilde{Q}\tilde{A} + \tilde{C}^*\tilde{C} = 0 \tag{8.73}
\]
are
\[
\tilde{P} = \begin{bmatrix} \Sigma_1 & I \\ I & \Sigma_1\Gamma^{-1} \end{bmatrix}, \qquad
\tilde{Q} = \begin{bmatrix} \Sigma_1 & -\Gamma \\ -\Gamma & \Sigma_1\Gamma \end{bmatrix}. \tag{8.74}
\]
(This is verified by noting that the blocks of equations (8.72) and (8.73) are also blocks of equations (8.34) and (8.35) for $P_e$ and $Q_e$.) Hence $\tilde{P}\tilde{Q} = \sigma^2 I$, and by Theorem 8.3 there exists $\tilde{D}$ such that $(G_b(s) - G_h(s))/\sigma$ is all-pass. That $\tilde{D} = -\hat{D}$ is an appropriate choice is verified from equations (8.54) to (8.56) and Theorem 8.3, part (2).

(2) $(G(s) - G_b(s))/\sigma = (G(s) - G_h(s))/\sigma + (G_h(s) - G_b(s))/\sigma$, but the first term on the right-hand side is all-pass by Theorem 8.4, part (2), and the second term is all-pass by part (1) above. Hence $\|G(s) - G_b(s)\|_\infty \leq 2\sigma$.

(3) Similarly, using the fact that all-pass functions have unity Hankel norms gives
\[
\|G(s) - G_b(s)\|_H \leq \|G(s) - G_h(s)\|_H + \|G_b(s) - G_h(s)\|_H = 2\sigma.
\]
(Note that $G_h(s)$ is stable if $\Sigma_1 > \sigma I$.) $\Box$

Given the results of Lemma 8.15, bounds on the error in a truncated balanced realization are easily proved as follows.

Theorem 8.16 Let $G(s)$ be a stable, rational, $p \times m$ transfer function with Hankel singular values $\sigma_1 > \sigma_2 > \cdots > \sigma_N$, where each $\sigma_i$ has multiplicity $r_i$, and let $\tilde{G}_k(s)$ be obtained by truncating the balanced realization of $G(s)$ to the first $(r_1 + r_2 + \cdots + r_k)$ states. Then

(1) $\left\| G(s) - \tilde{G}_k(s) \right\|_\infty \leq 2(\sigma_{k+1} + \sigma_{k+2} + \cdots + \sigma_N)$.

(2) $\left\| G(s) - \tilde{G}_k(s) \right\|_H \leq 2(\sigma_{k+1} + \sigma_{k+2} + \cdots + \sigma_N)$.


Proof. If $p \neq m$, then augmenting $B$ or $C$ by zero columns or rows, respectively, will still give a balanced realization and the same argument is valid; hence assume $p = m$. Notice that since truncations of a balanced realization are also balanced, satisfying the truncated Lyapunov equations, the Hankel singular values of $\tilde{G}_i(s)$ will be $\sigma_1, \sigma_2, \ldots, \sigma_i$ with multiplicities $r_1, r_2, \ldots, r_i$, respectively. Also $\tilde{G}_i(s)$ can be obtained by truncating the balanced realization of $\tilde{G}_{i+1}(s)$, and hence $\left\| \tilde{G}_{i+1}(s) - \tilde{G}_i(s) \right\| \leq 2\sigma_{i+1}$ for both the $L_\infty$ and Hankel norms. Hence (with $\tilde{G}_N(s) := G(s)$)
\[
\left\| G(s) - \tilde{G}_k(s) \right\| = \left\| \sum_{i=k}^{N-1} \left( \tilde{G}_{i+1}(s) - \tilde{G}_i(s) \right) \right\| \leq 2(\sigma_{k+1} + \sigma_{k+2} + \cdots + \sigma_N)
\]
for both norms, and the proof is complete. $\Box$
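Theorem 8.16 is easy to exercise numerically. The sketch below uses our own four-state example (the data and the square-root balancing method are assumptions for illustration, not from this section): it balances a realization, truncates, and checks a gridded $L_\infty$ error estimate against the twice-the-tail bound:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, cholesky, svd

# A stable 4-state SISO example (hypothetical data, for illustration only).
A = np.diag([-1.0, -2.0, -3.0, -4.0]); A[0, 1] = 0.5
B = np.array([[1.0], [1.0], [0.5], [0.2]])
C = np.array([[1.0, 0.5, 1.0, 0.3]])

# Gramians: A P + P A* + B B* = 0 and A* Q + Q A + C* C = 0.
P = solve_continuous_lyapunov(A, -B @ B.T)
Q = solve_continuous_lyapunov(A.T, -C.T @ C)

# Square-root balancing: T brings both Gramians to diag(hsv).
R = cholesky(P, lower=True)         # P = R R^T
U, s, Vt = svd(R.T @ Q @ R)
hsv = np.sqrt(s)                    # Hankel singular values, descending
T = R @ U / np.sqrt(hsv)            # balancing transformation (columns scaled)
Ab, Bb, Cb = np.linalg.solve(T, A @ T), np.linalg.solve(T, B), C @ T

k = 2                               # keep the first k balanced states
Abt, Bbt, Cbt = Ab[:k, :k], Bb[:k], Cb[:, :k]

def fresp(A_, B_, C_, w_):
    n = A_.shape[0]
    return np.array([(C_ @ np.linalg.solve(1j*wi*np.eye(n) - A_, B_))[0, 0]
                     for wi in w_])

w = np.logspace(-3, 3, 2000)
err = np.max(np.abs(fresp(A, B, C, w) - fresp(Abt, Bbt, Cbt, w)))
bound = 2 * np.sum(hsv[k:])         # Theorem 8.16, part (1)
print(err, bound)
```

The gridded error is a lower estimate of the true $L_\infty$ norm, so it must in particular respect the theorem's bound.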

8.6 Toeplitz Operators

In this section, we consider another operator. Again let $G(s) \in \mathcal{L}_\infty$. Then the Toeplitz operator associated with $G$ is denoted by $T_G$ and is defined as
\[
T_G : H_2 \longmapsto H_2, \qquad T_G f := (P_+ M_G) f = P_+(Gf), \quad \text{for } f \in H_2,
\]
i.e., $T_G = P_+ M_G|_{H_2}$. In particular, if $G(s) \in H_\infty$ then $T_G = M_G|_{H_2}$.

(Diagram: the Toeplitz operator $T_G$ factors as the multiplication operator $M_G : H_2 \to L_2$ followed by the orthogonal projection $P_+ : L_2 \to H_2$.)

Analogous to the Hankel operator, there is also a corresponding time domain description for the Toeplitz operator:
\[
T_g : L_2[0,\infty) \longmapsto L_2[0,\infty), \qquad
T_g f := P_+(g * f) = \int_0^\infty g(t-\tau) f(\tau)\, d\tau, \quad t \geq 0,
\]
for $f(t) \in L_2[0,\infty)$.


It is seen that the multiplication operator (in the frequency domain) or the convolution operator (in the time domain) plays an important role in the development of the Hankel and Toeplitz operators. In fact, a multiplication operator can be decomposed into several Hankel and Toeplitz operators: let $L_2 = H_2^\perp \oplus H_2$ and $G \in L_\infty$. Then the multiplication operator associated with $G$ can be written as
\[
M_G : H_2^\perp \oplus H_2 \longmapsto H_2^\perp \oplus H_2
\]
\[
M_G = \begin{bmatrix} P_- M_G|_{H_2^\perp} & P_- M_G|_{H_2} \\ P_+ M_G|_{H_2^\perp} & P_+ M_G|_{H_2} \end{bmatrix}
    = \begin{bmatrix} P_- M_G|_{H_2^\perp} & \Gamma_{G^*}^* \\ \Gamma_G & T_G \end{bmatrix}.
\]
Note that $P_- M_G|_{H_2^\perp}$ is actually a Toeplitz operator; however, we will not discuss it further here. A fact worth mentioning is that if $G$ is causal, i.e., $G \in H_\infty$, then $\Gamma_{G^*}^* = 0$ and $M_G$ is a lower block triangular operator. In fact, it can be shown that $G$ is causal if and only if $\Gamma_{G^*}^* = 0$, i.e., the future input does not affect the past output. This is yet another characterization of causality.

8.7 Hankel and Toeplitz Operators on the Disk*

It is sometimes more convenient and insightful to study operators on the disk. From the control system point of view, some operators are much easier to interpret and compute in discrete time. The objective of this section is to examine the Hankel and Toeplitz operators in discrete time (i.e., on the disk) and, hopefully, to give the reader some intuitive ideas about these operators, since they are very important in $H_\infty$ control theory. To start with, it is necessary to introduce some function spaces with respect to the unit disk.

Let $\mathbb{D}$ denote the unit disk:
\[
\mathbb{D} := \{ \lambda \in \mathbb{C} : |\lambda| < 1 \},
\]
and let $\bar{\mathbb{D}}$ and $\partial\mathbb{D}$ denote its closure and boundary, respectively:
\[
\bar{\mathbb{D}} := \{ \lambda \in \mathbb{C} : |\lambda| \leq 1 \}, \qquad
\partial\mathbb{D} := \{ \lambda \in \mathbb{C} : |\lambda| = 1 \}.
\]

Let $L_2(\partial\mathbb{D})$ denote the Hilbert space of matrix valued functions defined on the unit circle $\partial\mathbb{D}$ as
\[
L_2(\partial\mathbb{D}) = \left\{ F(\lambda) : \frac{1}{2\pi} \int_0^{2\pi} \mathrm{Trace}\left[ F^*(e^{j\theta}) F(e^{j\theta}) \right] d\theta < \infty \right\}
\]
with inner product
\[
\langle F, G \rangle := \frac{1}{2\pi} \int_0^{2\pi} \mathrm{Trace}\left[ F^*(e^{j\theta}) G(e^{j\theta}) \right] d\theta.
\]


Furthermore, let $H_2(\partial\mathbb{D})$ be the (closed) subspace of $L_2(\partial\mathbb{D})$ of matrix functions $F(\lambda)$ analytic in $\mathbb{D}$, i.e.,
\[
H_2(\partial\mathbb{D}) = \left\{ F(\lambda) \in L_2(\partial\mathbb{D}) : \frac{1}{2\pi} \int_0^{2\pi} F(e^{j\theta}) e^{jn\theta}\, d\theta = 0, \ \text{for all } n > 0 \right\},
\]
and let $H_2^\perp(\partial\mathbb{D})$ be the (closed) subspace of $L_2(\partial\mathbb{D})$ of matrix functions $F(\lambda)$ analytic in $\mathbb{C} \setminus \bar{\mathbb{D}}$.

It can be shown that the Fourier transform (or bilateral $Z$-transform) gives the following isometric isomorphisms:
\[
L_2(\partial\mathbb{D}) \cong l_2(-\infty,\infty), \qquad
H_2(\partial\mathbb{D}) \cong l_2[0,\infty), \qquad
H_2^\perp(\partial\mathbb{D}) \cong l_2(-\infty,0).
\]

Remark 8.3 It should be kept in mind that, in contrast to the variable $z$ in the standard $Z$-transform, $\lambda$ here denotes $\lambda = z^{-1}$. ~

Analogous to the spaces on the half-plane, $L_\infty(\partial\mathbb{D})$ is used to denote the Banach space of essentially bounded matrix functions with norm
\[
\|F\|_\infty = \operatorname*{ess\,sup}_{\theta \in [0,2\pi]} \bar{\sigma}\left[ F(e^{j\theta}) \right].
\]

The Hardy space $H_\infty(\partial\mathbb{D})$ is the closed subspace of $L_\infty(\partial\mathbb{D})$ with functions analytic in $\mathbb{D}$ and is defined as
\[
H_\infty(\partial\mathbb{D}) = \left\{ F(\lambda) \in L_\infty(\partial\mathbb{D}) : \int_0^{2\pi} F(e^{j\theta}) e^{jn\theta}\, d\theta = 0, \ \text{for all } n > 0 \right\}.
\]

The $H_\infty$ norm is defined as
\[
\|F\|_\infty := \sup_{\lambda \in \mathbb{D}} \bar{\sigma}[F(\lambda)] = \operatorname*{ess\,sup}_{\theta \in [0,2\pi]} \bar{\sigma}\left[ F(e^{j\theta}) \right].
\]
It is easy to see that $L_\infty(\partial\mathbb{D}) \subset L_2(\partial\mathbb{D})$ and $H_\infty(\partial\mathbb{D}) \subset H_2(\partial\mathbb{D})$. (However, it should be pointed out that these inclusions are not true for functions on the half-plane or for continuous time functions.)

Example 8.4 Let $F(\lambda) \in L_2(\partial\mathbb{D})$, and let $F(\lambda)$ have a power series representation as follows:
\[
F(\lambda) = \sum_{i=-\infty}^{\infty} F_i \lambda^i.
\]
Then $F(\lambda) \in H_2(\partial\mathbb{D})$ if and only if $F_i = 0$ for all $i < 0$, and $F(\lambda) \in H_2^\perp(\partial\mathbb{D})$ if and only if $F_i = 0$ for all $i \geq 0$. ♦


Now let $P_-$ and $P_+$ denote the orthogonal projections:
\[
P_+ : l_2(-\infty,\infty) \longmapsto l_2[0,\infty) \ \text{ or } \ L_2(\partial\mathbb{D}) \longmapsto H_2(\partial\mathbb{D})
\]
\[
P_- : l_2(-\infty,\infty) \longmapsto l_2(-\infty,0) \ \text{ or } \ L_2(\partial\mathbb{D}) \longmapsto H_2^\perp(\partial\mathbb{D}).
\]
Suppose $G_d(\lambda) \in L_\infty(\partial\mathbb{D})$. Then the Hankel operator associated with $G_d(\lambda)$ is defined as
\[
\Gamma_{G_d} : l_2(-\infty,0) \longmapsto l_2[0,\infty) \ \text{ or } \ H_2^\perp(\partial\mathbb{D}) \longmapsto H_2(\partial\mathbb{D}), \qquad
\Gamma_{G_d} = P_+ M_{G_d}\big|_{H_2^\perp(\partial\mathbb{D})}.
\]
Similarly, a Toeplitz operator associated with $G_d(\lambda)$ is defined as
\[
T_{G_d} : l_2[0,\infty) \longmapsto l_2[0,\infty) \ \text{ or } \ H_2(\partial\mathbb{D}) \longmapsto H_2(\partial\mathbb{D}), \qquad
T_{G_d} = P_+ M_{G_d}\big|_{H_2(\partial\mathbb{D})}.
\]

Now let $G_d(\lambda) = \sum_{i=-\infty}^{\infty} G_i \lambda^i \in L_\infty(\partial\mathbb{D})$ be a linear system transfer matrix, $u(\lambda) = \sum_{i=-\infty}^{\infty} u_i \lambda^i \in L_2(\partial\mathbb{D})$ be the system input in the frequency domain, and $u_i$ be the input at time $t = i$. Then the system output in the frequency domain is given by
\[
y(\lambda) = G_d(\lambda) u(\lambda) = \sum_{i=-\infty}^{\infty} \sum_{j=-\infty}^{\infty} G_i u_j \lambda^{i+j} =: \sum_{i=-\infty}^{\infty} y_i \lambda^i
\]
where $y_i$ is the system time response at time $t = i$, and consequently we have
\[
\begin{bmatrix} \vdots \\ y_2 \\ y_1 \\ y_0 \\ y_{-1} \\ y_{-2} \\ y_{-3} \\ \vdots \end{bmatrix}
=
\begin{bmatrix}
 & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \\
\cdots & G_0 & G_1 & G_2 & G_3 & G_4 & G_5 & \cdots \\
\cdots & G_{-1} & G_0 & G_1 & G_2 & G_3 & G_4 & \cdots \\
\cdots & G_{-2} & G_{-1} & G_0 & G_1 & G_2 & G_3 & \cdots \\
\cdots & G_{-3} & G_{-2} & G_{-1} & G_0 & G_1 & G_2 & \cdots \\
\cdots & G_{-4} & G_{-3} & G_{-2} & G_{-1} & G_0 & G_1 & \cdots \\
\cdots & G_{-5} & G_{-4} & G_{-3} & G_{-2} & G_{-1} & G_0 & \cdots \\
 & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots &
\end{bmatrix}
\begin{bmatrix} \vdots \\ u_2 \\ u_1 \\ u_0 \\ u_{-1} \\ u_{-2} \\ u_{-3} \\ \vdots \end{bmatrix}. \tag{8.75}
\]
Equation (8.75) can be rewritten as
\[
\begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ \vdots \\ y_{-1} \\ y_{-2} \\ y_{-3} \\ \vdots \end{bmatrix}
=
\begin{bmatrix}
G_0 & G_{-1} & G_{-2} & \cdots & G_1 & G_2 & G_3 & \cdots \\
G_1 & G_0 & G_{-1} & \cdots & G_2 & G_3 & G_4 & \cdots \\
G_2 & G_1 & G_0 & \cdots & G_3 & G_4 & G_5 & \cdots \\
\vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & \\
G_{-1} & G_{-2} & G_{-3} & \cdots & G_0 & G_1 & G_2 & \cdots \\
G_{-2} & G_{-3} & G_{-4} & \cdots & G_{-1} & G_0 & G_1 & \cdots \\
G_{-3} & G_{-4} & G_{-5} & \cdots & G_{-2} & G_{-1} & G_0 & \cdots \\
\vdots & \vdots & \vdots & & \vdots & \vdots & \vdots &
\end{bmatrix}
\begin{bmatrix} u_0 \\ u_1 \\ u_2 \\ \vdots \\ u_{-1} \\ u_{-2} \\ u_{-3} \\ \vdots \end{bmatrix}
=: \begin{bmatrix} T_1 & H_1 \\ H_2 & T_2 \end{bmatrix}
\begin{bmatrix} u_0 \\ u_1 \\ u_2 \\ \vdots \\ u_{-1} \\ u_{-2} \\ u_{-3} \\ \vdots \end{bmatrix}.
\]

Matrices like $T_1$ and $T_2$ are called (block) Toeplitz matrices, and matrices like $H_1$ and $H_2$ are called (block) Hankel matrices. In fact, $H_1$ is the matrix representation of the Hankel operator $\Gamma_{G_d}$ and $T_1$ is the matrix representation of the Toeplitz operator $T_{G_d}$, and so on. Thus these operator norms can be computed from the matrix norms of their corresponding matrix representations.
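The block partition above can be checked on truncated sequences. The sketch below (our own hypothetical scalar impulse response; finite sections stand in for the infinite matrices) builds $T_1$ and $H_1$ for a causal sequence and verifies that together they reproduce the convolution output for $t \geq 0$:

```python
import numpy as np
from scipy.linalg import toeplitz

# Causal scalar sequence (G_i = 0 for i < 0); hypothetical impulse response:
g = np.array([0.0, 1.0, 0.5, 0.25, 0.125])       # G_0 .. G_4
n = len(g)
gpad = np.r_[g, np.zeros(n)]                     # zero-pad for indexing

# T1: future inputs -> future outputs, (T1)_{kj} = G_{k-j} (lower-triangular).
T1 = toeplitz(g, np.r_[g[0], np.zeros(n - 1)])
# H1: past inputs (u_{-1}, u_{-2}, ...) -> future outputs, (H1)_{kj} = G_{k+1+j}.
H1 = np.array([[gpad[k + 1 + j] for j in range(n - 1)] for k in range(n)])

u_future = np.array([1.0, -1.0, 2.0, 0.0, 0.0])  # u_0, u_1, ...
u_past = np.array([0.5, 1.0, 0.0, 0.0])          # u_{-1}, u_{-2}, ...

y_future = T1 @ u_future + H1 @ u_past           # rows y_0 .. y_4 of (8.75)

# Cross-check against direct convolution y_k = sum_j G_{k-j} u_j.
u_full = np.r_[u_past[::-1], u_future]           # times t = -4 .. 4
y_full = np.convolve(gpad[:2 * n - 1], u_full)   # result starts at t = -4
print(y_future, y_full[4:9])
```

The Hankel block carries exactly the contribution of the past input to the future output, which is the content of the decomposition.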

Lemma 8.17 $\|G_d(\lambda)\|_\infty = \left\| \begin{bmatrix} T_1 & H_1 \\ H_2 & T_2 \end{bmatrix} \right\|$, $\ \|\Gamma_{G_d}\| = \|H_1\|$, and $\|T_{G_d}\| = \|T_1\|$.

Example 8.5 Let $G_d(\lambda) \in \mathcal{RH}_\infty(\partial\mathbb{D})$, let $G_d(\lambda) = C(\lambda^{-1}I - A)^{-1}B + D$ be a state space realization of $G_d(\lambda)$, and let $A$ have all its eigenvalues in $\mathbb{D}$. Then
\[
G_d(\lambda) = D + \sum_{i=0}^{\infty} \lambda^{i+1} C A^i B = \sum_{i=0}^{\infty} G_i \lambda^i
\]
with $G_0 = D$ and $G_i = CA^{i-1}B$, $\forall i \geq 1$. The state space equations are given by
\[
x_{k+1} = Ax_k + Bu_k, \quad x_{-\infty} = 0, \qquad y_k = Cx_k + Du_k.
\]

Then for $u \in l_2(-\infty,0)$, we have $x_0 = \sum_{i=1}^{\infty} A^{i-1}Bu_{-i}$, which defines the controllability operator
\[
x_0 = \Psi_c u = \begin{bmatrix} B & AB & A^2B & \cdots \end{bmatrix}
\begin{bmatrix} u_{-1} \\ u_{-2} \\ u_{-3} \\ \vdots \end{bmatrix} \in \mathbb{R}^n.
\]
On the other hand, given $x_0 \in \mathbb{R}^n$ and $u_k = 0$, $k \geq 0$, the output can be computed as
\[
\Psi_o x_0 = \begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ \vdots \end{bmatrix}
= \begin{bmatrix} C \\ CA \\ CA^2 \\ \vdots \end{bmatrix} x_0 \in l_2[0,\infty),
\]
which defines the observability operator. Of course, the adjoint operators of $\Psi_c$ and $\Psi_o$ can be computed easily as
\[
\Psi_c^* x_0 = \begin{bmatrix} B^* \\ B^*A^* \\ B^*(A^*)^2 \\ \vdots \end{bmatrix} x_0 \in l_2(-\infty,0)
\]
and
\[
\Psi_o^* y = \begin{bmatrix} C^* & A^*C^* & (A^*)^2C^* & \cdots \end{bmatrix}
\begin{bmatrix} y_0 \\ y_1 \\ y_2 \\ \vdots \end{bmatrix} \in \mathbb{R}^n.
\]

Hence, the Hankel operator has the following matrix representation:
\[
H = \begin{bmatrix} C \\ CA \\ CA^2 \\ \vdots \end{bmatrix}
\begin{bmatrix} B & AB & A^2B & \cdots \end{bmatrix}
= \begin{bmatrix}
G_1 & G_2 & G_3 & \cdots \\
G_2 & G_3 & G_4 & \cdots \\
G_3 & G_4 & G_5 & \cdots \\
\vdots & \vdots & \vdots &
\end{bmatrix}.
\]
♦
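A small numerical sketch of the example (the two-state realization is our own hypothetical data): the singular values of a large finite section of $H$ built from the Markov parameters $CA^{i-1}B$ match the Hankel singular values $\sqrt{\lambda_i(L_cL_o)}$ computed from the discrete Gramians:

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

# Hypothetical stable discrete-time realization (eigenvalues of A inside D).
A = np.array([[0.5, 0.2], [0.0, -0.3]])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, -1.0]])

# Discrete Gramians:  A Lc A* - Lc + B B* = 0,  A* Lo A - Lo + C* C = 0.
Lc = solve_discrete_lyapunov(A, B @ B.T)
Lo = solve_discrete_lyapunov(A.T, C.T @ C)
hsv = np.sqrt(np.sort(np.linalg.eigvals(Lc @ Lo).real)[::-1])

# Finite section of H: H[i, j] = G_{i+j+1} = C A^{i+j} B.
N = 60
markov = [(C @ np.linalg.matrix_power(A, k) @ B).item() for k in range(2 * N)]
H = np.array([[markov[i + j] for j in range(N)] for i in range(N)])
svals = np.linalg.svd(H, compute_uv=False)
print(hsv, svals[:2])
```

Since the realization has two states, $H$ has rank two, and only two singular values of the finite section are (numerically) nonzero.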

It is interesting to establish some connections between a Hankel operator defined on the unit disk and one defined on the half-plane. First define the map
\[
\lambda = \frac{s-1}{s+1}, \qquad s = \frac{1+\lambda}{1-\lambda}, \tag{8.76}
\]
which maps the closed right-half plane $\mathrm{Re}(s) \geq 0$ onto the closed unit disk $\bar{\mathbb{D}}$. Suppose $G(s) \in H_\infty(j\mathbb{R})$ and define
\[
G_d(\lambda) := G(s)\Big|_{s = \frac{1+\lambda}{1-\lambda}}. \tag{8.77}
\]
Since $G(s)$ is analytic in $\mathrm{Re}(s) > 0$, including the point at $\infty$, $G_d(\lambda)$ is analytic in $\mathbb{D}$, i.e., $G_d(\lambda) \in H_\infty(\partial\mathbb{D})$.

Lemma 8.18 $\left\| \Gamma_{G(s)} \right\| = \left\| \Gamma_{G_d(\lambda)} \right\|$.

Proof. Define the function
\[
\phi(s) = \frac{\sqrt{2}}{s+1}.
\]
The relation between a point $j\omega$ on the imaginary axis and the corresponding point $e^{j\theta}$ on the unit circle is, from (8.76),
\[
e^{j\theta} = \frac{j\omega - 1}{j\omega + 1}.
\]
This yields
\[
d\theta = -\frac{2}{\omega^2 + 1}\, d\omega = -|\phi(j\omega)|^2\, d\omega,
\]
which implies that the mapping
\[
f_d(\lambda) \longmapsto \phi(s) f(s) : H_2(\partial\mathbb{D}) \longmapsto H_2(j\mathbb{R}),
\]
where $f(s) = f_d(\lambda)\big|_{\lambda = \frac{s-1}{s+1}}$, is an isomorphism. Similarly,
\[
f_d(\lambda) \longmapsto \phi(s) f(s) : H_2^\perp(\partial\mathbb{D}) \longmapsto H_2^\perp(j\mathbb{R})
\]
is an isomorphism; note that if $f_d(\lambda) \in H_2^\perp(\partial\mathbb{D})$, then $f_d(\infty) = 0$, so that $f(-1) = 0$, and hence $\phi(s)f(s)$ is analytic in $\mathrm{Re}(s) < 0$.

The lemma now follows from the commutative diagram in which these isomorphisms carry $\Gamma_{G_d(\lambda)} : H_2^\perp(\partial\mathbb{D}) \to H_2(\partial\mathbb{D})$ onto $\Gamma_{G(s)} : H_2^\perp(j\mathbb{R}) \to H_2(j\mathbb{R})$. $\Box$

The above isomorphism between $H_2(\partial\mathbb{D})$ and $H_2(j\mathbb{R})$ also has the implication that every discrete time $H_2$ control problem can be converted to an equivalent continuous time $H_2$ control problem. It should be emphasized that the bilinear transformation alone (without the factor $\phi(s)$) is not an isomorphism between $H_2(\partial\mathbb{D})$ and $H_2(j\mathbb{R})$.
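The substitution (8.77) is easy to check numerically. The sketch below uses our own scalar example $G(s) = 1/(s+2)$ and verifies that boundary values agree under the map (8.76), and that $j\omega$ indeed lands on the unit circle:

```python
import numpy as np

def G(s):
    # hypothetical scalar example, stable and proper
    return 1.0 / (s + 2.0)

def Gd(lam):
    # the substitution (8.77): s = (1 + lam) / (1 - lam)
    return G((1.0 + lam) / (1.0 - lam))

w = np.logspace(-3, 3, 500)
lam = (1j * w - 1.0) / (1j * w + 1.0)   # image of jw under (8.76)
err = np.max(np.abs(G(1j * w) - Gd(lam)))
circ = np.max(np.abs(np.abs(lam) - 1.0))
print(err, circ)
```

Both printed quantities should be at roundoff level: boundary values match exactly under the substitution, and the imaginary axis maps onto $\partial\mathbb{D}$.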

8.8 Nehari's Theorem*

As we have mentioned at the beginning of this chapter, a Hankel operator is closely related to an analytic approximation problem. In this section, we shall be concerned with approximating $G$ by an anticausal transfer matrix, i.e., one analytic in $\mathrm{Re}(s) < 0$ (or $|\lambda| > 1$), where the approximation is with respect to the $L_\infty$ norm. For convenience, let $H_\infty^-(\partial\mathbb{D})$ denote the subspace of $L_\infty(\partial\mathbb{D})$ of functions analytic in $|\lambda| > 1$. So $G(\lambda) \in H_\infty^-(\partial\mathbb{D})$ if and only if $G(\lambda^{-1}) \in H_\infty(\partial\mathbb{D})$.

Minimum Distance Problem

In the following, we shall establish that the distance in $L_\infty$ from $G$ to the nearest matrix in $H_\infty^-$ equals $\|\Gamma_G\|$. This is the classical Nehari Theorem. Some applications and the explicit construction of the approximation for the matrix case will be considered in later chapters.

Theorem 8.19 Suppose $G \in L_\infty$; then
\[
\inf_{Q \in H_\infty^-} \|G - Q\|_\infty = \|\Gamma_G\|
\]
and the infimum is achieved.

Remark 8.4 Note that from the mapping (8.76) and Lemma 8.18, the above theorem is the same whether the problem is posed on the unit disk or on the half-plane. Hence, we will only give the proof on the unit disk. ~

Proof. We shall give the proof on the unit disk. First, it is easy to establish that the Hankel norm is a lower bound: for fixed $Q \in H_\infty^-(\partial\mathbb{D})$, we have
\[
\|G - Q\|_\infty
= \sup_{f \in H_2^\perp(\partial\mathbb{D})} \frac{\|(G-Q)f\|_2}{\|f\|_2}
\geq \sup_{f \in H_2^\perp(\partial\mathbb{D})} \frac{\|P_+(G-Q)f\|_2}{\|f\|_2}
= \sup_{f \in H_2^\perp(\partial\mathbb{D})} \frac{\|P_+Gf\|_2}{\|f\|_2}
= \|\Gamma_G\|.
\]

Next we shall construct a $Q(\lambda) \in H_\infty^-(\partial\mathbb{D})$ such that
\[
\|G - Q\|_\infty = \|\Gamma_G\|. \tag{8.78}
\]
Let the power series expansions of $G(\lambda)$ and $Q(\lambda)$ be
\[
G(\lambda) = \sum_{i=-\infty}^{\infty} \lambda^i G_i, \qquad
Q(\lambda) = \sum_{i=-\infty}^{0} \lambda^i Q_i.
\]
The left-hand side of (8.78) equals the norm of the operator
\[
f \longmapsto (G(\lambda) - Q(\lambda)) f(\lambda) : H_2^\perp(\partial\mathbb{D}) \longmapsto L_2(\partial\mathbb{D}).
\]


From the discussion of the last section, there is a matrix representation of this operator:
\[
\begin{bmatrix}
\vdots & \vdots & \vdots & \\
G_3 & G_4 & G_5 & \cdots \\
G_2 & G_3 & G_4 & \cdots \\
G_1 & G_2 & G_3 & \cdots \\
G_0 - Q_0 & G_1 & G_2 & \cdots \\
G_{-1} - Q_{-1} & G_0 - Q_0 & G_1 & \cdots \\
G_{-2} - Q_{-2} & G_{-1} - Q_{-1} & G_0 - Q_0 & \cdots \\
\vdots & \vdots & \vdots &
\end{bmatrix}. \tag{8.79}
\]
The idea in the construction of a $Q$ to satisfy (8.78) is to select $Q_0, Q_{-1}, \ldots$ in turn to minimize the matrix norm of (8.79). First choose $Q_0$ to minimize
\[
\left\| \begin{bmatrix}
\vdots & \vdots & \vdots & \\
G_3 & G_4 & G_5 & \cdots \\
G_2 & G_3 & G_4 & \cdots \\
G_1 & G_2 & G_3 & \cdots \\
G_0 - Q_0 & G_1 & G_2 & \cdots
\end{bmatrix} \right\|.
\]
By Parrott's Theorem (i.e., the matrix dilation theory presented in Chapter 2), the minimum equals the norm of the matrix
\[
\begin{bmatrix}
\vdots & \vdots & \vdots & \\
G_3 & G_4 & G_5 & \cdots \\
G_2 & G_3 & G_4 & \cdots \\
G_1 & G_2 & G_3 & \cdots
\end{bmatrix},
\]
which is the rotated Hankel matrix $H_1$. Hence, the minimum equals the Hankel norm $\|\Gamma_G\|$. Next, choose $Q_{-1}$ to minimize
\[
\left\| \begin{bmatrix}
\vdots & \vdots & \vdots & \\
G_3 & G_4 & G_5 & \cdots \\
G_2 & G_3 & G_4 & \cdots \\
G_1 & G_2 & G_3 & \cdots \\
G_0 - Q_0 & G_1 & G_2 & \cdots \\
G_{-1} - Q_{-1} & G_0 - Q_0 & G_1 & \cdots
\end{bmatrix} \right\|.
\]
Again, the minimum equals $\|\Gamma_G\|$. Continuing in this way gives a suitable $Q$. $\Box$
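The one-step dilation used at each stage of this construction can be verified directly in the smallest nontrivial case. The sketch below (hypothetical scalar data, standing in for the fixed row and column blocks above) minimizes the spectral norm over the free corner entry and compares with the Parrott bound $\max(\|[A\ B]\|, \|[A; C]\|)$:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Fixed first row [a, b] and first column [a; c]; free corner x.
# Parrott's theorem: min_x || [[a, b], [c, x]] || = max(||[a, b]||, ||[a; c]||).
a, b, c = 1.0, 2.0, 3.0            # hypothetical numbers

def specnorm(x):
    # spectral (largest singular value) norm of the completed 2x2 matrix
    return np.linalg.norm(np.array([[a, b], [c, x]]), 2)

res = minimize_scalar(specnorm, bounds=(-10.0, 10.0), method="bounded")
parrott = max(np.hypot(a, b), np.hypot(a, c))   # sqrt(5) vs sqrt(10)
print(res.fun, parrott)
```

The spectral norm is convex in the free entry, so the scalar search finds the global minimum, which matches the dilation bound $\sqrt{10}$.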

As we have mentioned earlier, we might be interested in approximating an $L_\infty$ function by an $H_\infty$ function. Then we have the following corollary.


Corollary 8.20 Suppose $G \in L_\infty$; then
\[
\inf_{Q \in H_\infty} \|G - Q\|_\infty = \|\Gamma_{G^*}\|
\]
and the infimum is achieved.

Proof. This follows from the fact that
\[
\inf_{Q \in H_\infty} \|G - Q\|_\infty = \inf_{Q^* \in H_\infty^-} \|G^* - Q^*\|_\infty = \|\Gamma_{G^*}\|.
\]
$\Box$

Let $G \in \mathcal{RH}_\infty$, let $g$ be the impulse response of $G$, and let $\sigma_1$ be the largest Hankel singular value of $G$, i.e., $\|\Gamma_G\| = \sigma_1$. Suppose $u \in L_2(-\infty,0]$ (or $l_2(-\infty,0)$) and $v \in L_2[0,\infty)$ (or $l_2[0,\infty)$) are the corresponding Schmidt pair:
\[
\Gamma_g u = \sigma_1 v, \qquad \Gamma_g^* v = \sigma_1 u.
\]
Now denote the Laplace transform (or $Z$-transform) of $u$ and $v$ as $U \in \mathcal{RH}_2^\perp$ and $V \in \mathcal{RH}_2$.

Lemma 8.21 Let $G \in \mathcal{RH}_\infty$ and let $\sigma_1$ be the largest Hankel singular value of $G$. Then
\[
\inf_{Q \in H_\infty^-} \|G - Q\|_\infty = \sigma_1
\]
and
\[
(G - Q)U = \sigma_1 V.
\]
Moreover, if $G$ is a scalar function, then $Q = G - \sigma_1 V/U$ is the unique solution and $G - Q$ is all-pass.

Proof. Let $H := (G - Q)U$ and note that $\Gamma_G U \in \mathcal{RH}_2$ and $P_+H = P_+(GU) = \Gamma_G U$. Then
\begin{align*}
0 &\leq \|H - \Gamma_G U\|_2^2 \\
&= \|H\|_2^2 + \|\Gamma_G U\|_2^2 - \langle H, \Gamma_G U \rangle - \langle \Gamma_G U, H \rangle \\
&= \|H\|_2^2 + \|\Gamma_G U\|_2^2 - \langle P_+H, \Gamma_G U \rangle - \langle \Gamma_G U, P_+H \rangle \\
&= \|H\|_2^2 - \langle \Gamma_G U, \Gamma_G U \rangle \\
&= \|H\|_2^2 - \langle U, \Gamma_G^* \Gamma_G U \rangle \\
&= \|H\|_2^2 - \sigma_1^2 \langle U, U \rangle \\
&= \|H\|_2^2 - \sigma_1^2 \|U\|_2^2 \\
&\leq \|G - Q\|_\infty^2 \|U\|_2^2 - \sigma_1^2 \|U\|_2^2 = 0.
\end{align*}
Hence $H = \Gamma_G U$, i.e., $(G - Q)U = \Gamma_G U = \sigma_1 V$.

Now if $G$ is a scalar function, then $Q$ is uniquely determined and $Q = G - \sigma_1 V/U$. We shall show through the explicit construction below that $V/U$ is all-pass. $\Box$

Formulas for Continuous Time

Let $G(s) \in \mathcal{RH}_\infty$ and let
\[
G(s) = \begin{bmatrix} A & B \\ \hline C & 0 \end{bmatrix}.
\]
Let $L_c$ and $L_o$ be the corresponding controllability and observability Gramians:
\[
AL_c + L_cA^* + BB^* = 0, \qquad A^*L_o + L_oA + C^*C = 0.
\]
And let $\sigma_1^2$, $\xi$ be the largest eigenvalue and a corresponding eigenvector of $L_cL_o$:
\[
L_cL_o\xi = \sigma_1^2\xi.
\]
Define
\[
\eta := \frac{1}{\sigma_1} L_o \xi.
\]
Then
\[
u(t) = \Psi_c^*\eta = B^*e^{-A^*t}\eta \in L_2(-\infty,0], \qquad
v(t) = \Psi_o\xi = Ce^{At}\xi \in L_2[0,\infty)
\]
are the Schmidt pair, and
\[
U(s) = \begin{bmatrix} -A^* & -\eta \\ \hline B^* & 0 \end{bmatrix} \in \mathcal{RH}_2^\perp, \qquad
V(s) = \begin{bmatrix} A & \xi \\ \hline C & 0 \end{bmatrix} \in \mathcal{RH}_2.
\]
It is easy to show that if $G$ is scalar, then $U, V$ are scalar transfer functions and $U^\sim U = V^\sim V$. Hence $V/U$ is all-pass. The details are left as an exercise for the reader.
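The exercise can be checked numerically. The sketch below uses our own hypothetical two-state scalar $G$, forms $\xi$, $\eta$, $U$, and $V$ as above, and verifies that $|V(j\omega)/U(j\omega)| = 1$ on a frequency grid (the sign convention on $U$ does not affect the check):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Hypothetical stable 2-state scalar realization.
A = np.array([[-1.0, 0.5], [0.0, -2.0]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 0.5]])

Lc = solve_continuous_lyapunov(A, -B @ B.T)     # A Lc + Lc A* + B B* = 0
Lo = solve_continuous_lyapunov(A.T, -C.T @ C)   # A* Lo + Lo A + C* C = 0
evals, evecs = np.linalg.eig(Lc @ Lo)
i = np.argmax(evals.real)
sigma1 = np.sqrt(evals[i].real)
xi = evecs[:, i].real
eta = Lo @ xi / sigma1

w = np.logspace(-3, 3, 400)
I = np.eye(2)
U_ = np.array([-(B.T @ np.linalg.solve(1j*wi*I + A.T, eta)).item() for wi in w])
V_ = np.array([(C @ np.linalg.solve(1j*wi*I - A, xi)).item() for wi in w])
ratio = np.abs(V_ / U_)
print(ratio.min(), ratio.max())   # both should be (numerically) 1
```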

Formulas for Discrete Time

Similarly, let $G$ be a discrete time transfer matrix and let
\[
G(\lambda) = C(\lambda^{-1}I - A)^{-1}B = \begin{bmatrix} A & B \\ \hline C & 0 \end{bmatrix}.
\]
Let $L_c$ and $L_o$ be the corresponding controllability and observability Gramians:
\[
AL_cA^* - L_c + BB^* = 0, \qquad A^*L_oA - L_o + C^*C = 0.
\]
And let $\sigma_1^2$, $\xi$ be the largest eigenvalue and a corresponding eigenvector of $L_cL_o$:
\[
L_cL_o\xi = \sigma_1^2\xi.
\]
Define
\[
\eta := \frac{1}{\sigma_1} L_o \xi.
\]
Then
\[
u = \Psi_c^*\eta = \begin{bmatrix} B^* \\ B^*A^* \\ B^*(A^*)^2 \\ \vdots \end{bmatrix} \eta \in l_2(-\infty,0), \qquad
v = \Psi_o\xi = \begin{bmatrix} C \\ CA \\ CA^2 \\ \vdots \end{bmatrix} \xi \in l_2[0,\infty)
\]
are the Schmidt pair, and
\[
U(\lambda) = \sum_{i=1}^{\infty} B^*(A^*)^{i-1}\eta\,\lambda^{-i}
= \left( \begin{bmatrix} A & B \\ \hline \eta^* & 0 \end{bmatrix} \right)^\sim \in \mathcal{RH}_2^\perp,
\]
\[
V(\lambda) = \sum_{i=0}^{\infty} CA^i\xi\,\lambda^i
= \begin{bmatrix} A & \xi \\ \hline CA & C\xi \end{bmatrix} \in \mathcal{RH}_2,
\]
where $W(\lambda)^\sim := W^T(\lambda^{-1})$.

Alternatively, since the Hankel operator has a matrix representation $H$, $u$ and $v$ can be obtained from a "singular value decomposition" (possibly of infinite size): let
\[
G(\lambda) = \sum_{i=1}^{\infty} G_i\lambda^i, \qquad G_i \in \mathbb{R}^{p \times q}
\]
(in fact, $G_i = CA^{i-1}B$ if a state space realization of $G$ is available) and let
\[
H = \begin{bmatrix}
G_1 & G_2 & G_3 & \cdots \\
G_2 & G_3 & G_4 & \cdots \\
G_3 & G_4 & G_5 & \cdots \\
\vdots & \vdots & \vdots &
\end{bmatrix}.
\]


Suppose $H$ has a "singular value decomposition"
\[
H = U\Sigma V^*, \qquad
\Sigma = \mathrm{diag}\{\sigma_1, \sigma_2, \cdots\}, \qquad
U = \begin{bmatrix} u_1 & u_2 & \cdots \end{bmatrix}, \qquad
V = \begin{bmatrix} v_1 & v_2 & \cdots \end{bmatrix}
\]
with $U^*U = UU^* = I$ and $V^*V = VV^* = I$. Since $Hv_1 = \sigma_1 u_1$, the Schmidt pair is given by
\[
u = v_1 = \begin{bmatrix} v_{11} \\ v_{12} \\ \vdots \end{bmatrix}, \qquad
v = u_1 = \begin{bmatrix} u_{11} \\ u_{12} \\ \vdots \end{bmatrix},
\]
where $u_1$ and $v_1$ are partitioned such that $u_{1i} \in \mathbb{R}^p$ and $v_{1i} \in \mathbb{R}^q$. Finally, $U(\lambda)$ and $V(\lambda)$ can be obtained as
\[
U(\lambda) = \sum_{i=1}^{\infty} v_{1i}\lambda^{-i} \in \mathcal{RH}_2^\perp, \qquad
V(\lambda) = \sum_{i=1}^{\infty} u_{1i}\lambda^{i-1} \in \mathcal{RH}_2.
\]

In particular, if $G(\lambda)$ is an $n$-th order matrix polynomial, then the matrix $H$ has only a finite number of nonzero elements and
\[
H = \begin{bmatrix} H_n & 0 \\ 0 & 0 \end{bmatrix}
\quad \text{with} \quad
H_n = \begin{bmatrix}
G_1 & G_2 & \cdots & G_{n-1} & G_n \\
G_2 & G_3 & \cdots & G_n & 0 \\
G_3 & G_4 & \cdots & 0 & 0 \\
\vdots & \vdots & & \vdots & \vdots \\
G_n & 0 & \cdots & 0 & 0
\end{bmatrix}.
\]
Hence, $u$ and $v$ can be obtained from a standard SVD of $H_n$.
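For a concrete sketch, take the hypothetical scalar polynomial $G(\lambda) = \lambda + 2\lambda^2$ (our own example), so $G_1 = 1$, $G_2 = 2$ and the only nonzero block is $H_2$; a standard SVD then yields the Hankel norm and the finite parts of the Schmidt pair:

```python
import numpy as np

# H_2 for G(lambda) = lambda + 2*lambda^2 (hypothetical coefficients).
H2 = np.array([[1.0, 2.0],
               [2.0, 0.0]])
U, s, Vt = np.linalg.svd(H2)

hankel_norm = s[0]                 # ||Gamma_G||; by Nehari, also the
                                   # distance from G to the anticausal functions
u1, v1 = U[:, 0], Vt[0, :]         # finite parts of the Schmidt vectors

# For this symmetric H_2 the singular values are (sqrt(17) +- 1) / 2.
print(hankel_norm, (np.sqrt(17) + 1) / 2)
```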

8.9 Notes and References

The study of Hankel operators and of the optimal Hankel norm approximation theory can be found in Adamjan, Arov, and Krein [1978], Bettayeb, Silverman, and Safonov [1980], Kung and Lin [1981], Power [1982], Francis [1987], Kavranoğlu and Bettayeb [1994], and references therein. The presentation of this chapter is based on Francis [1987] and Glover [1984, 1989].


9 Model Uncertainty and Robustness

In this chapter we briefly describe various types of uncertainties which can arise in physical systems, and we single out "unstructured uncertainties" as generic errors which are associated with all design models. We obtain robust stability tests for systems under various model uncertainty assumptions through the use of the small gain theorem, and we also obtain some sufficient conditions for robust performance under unstructured uncertainties. The difficulty associated with MIMO robust performance design and the role of plant condition numbers for systems with skewed performance and uncertainty specifications are revealed. We show by examples that the classical gain margin and phase margin are insufficient indicators of system robustness. A simple example is also used to indicate the fundamental difference between the robustness of an SISO system and that of an MIMO system. In particular, we show that applying the SISO analysis/design method to an MIMO system may lead to erroneous results.

9.1 Model Uncertainty

Most control designs are based on the use of a design model. The relationship between models and the reality they represent is subtle and complex. A mathematical model provides a map from inputs to responses. The quality of a model depends on how closely its responses match those of the true plant. Since no single fixed model can respond exactly like the true plant, we need, at the very least, a set of maps. However, the


modeling problem is much deeper: the universe of mathematical models from which a model set is chosen is distinct from the universe of physical systems. Therefore, a model set which includes the true physical plant can never be constructed. It is necessary for the engineer to make a leap of faith regarding the applicability of a particular design based on a mathematical model. To be practical, a design technique must help make this leap small by accounting for the inevitable inadequacy of models. A good model should be simple enough to facilitate design, yet complex enough to give the engineer confidence that designs based on the model will work on the true plant.

The term uncertainty refers to the differences or errors between models and reality, and whatever mechanism is used to express these errors will be called a representation of uncertainty. Representations of uncertainty vary primarily in terms of the amount of structure they contain. This reflects both our knowledge of the physical mechanisms which cause differences between the model and the plant and our ability to represent these mechanisms in a way that facilitates convenient manipulation. For example, consider the problem of bounding the magnitude of the effect of some uncertainty on the output of a nominally fixed linear system. A useful measure of uncertainty in this context is to provide a bound on the spectrum of the output's deviation from its nominal response. In the simplest case, this spectrum is assumed to be independent of the input. This is equivalent to assuming that the uncertainty is generated by an additive noise signal with a bounded spectrum; the uncertainty is represented as additive noise. Of course, no physical system is linear with additive noise, but some aspects of physical behavior are approximated quite well using this model. This type of uncertainty received a great deal of attention in the literature during the 1960's and 1970's, and the attention is probably due more to the elegant theoretical solutions that are yielded (e.g., white noise propagation in linear systems, Wiener and Kalman filtering, LQG) than to the great practical significance offered.

Generally, the spectrum of the true output's deviation from the nominal will depend significantly on the input. For example, an additive noise model is entirely inappropriate for capturing uncertainty arising from variations in the material properties of physical plants. The actual construction of model sets for more general uncertainty can be quite difficult. For example, a set membership statement for the parameters of an otherwise known FDLTI model is a highly-structured representation of uncertainty. It typically arises from the use of linear incremental models at various operating points, e.g., aerodynamic coefficients in flight control vary with flight environment and aircraft configurations, and equation coefficients in power plant control vary with aging, slag buildup, coal composition, etc. In each case, the amounts of variation and any known relationships between parameters can be expressed by confining the parameters to appropriately defined subsets of parameter space. However, for certain classes of signals (e.g., high frequency), the parameterized FDLTI model fails to describe the plant because the plant will always have dynamics which are not represented in the fixed order model.

In general, we are forced to use not just a single parameterized model but model setsthat allow for plant dynamics which are not explicitly represented in the model structure.


A simple example of this involves using frequency-domain bounds on transfer functions to describe a model set. To use such sets to describe physical systems, the bounds must roughly grow with frequency. In particular, at sufficiently high frequencies, phase is completely unknown, i.e., there are $\pm 180^\circ$ uncertainties. This is a consequence of dynamic properties which inevitably occur in physical systems. This gives a less structured representation of uncertainty.

Examples of less-structured representations of uncertainty are direct set membership statements for the transfer function matrix of the model. For instance, the statement
\[
P_\Delta(s) = P(s) + W_1(s)\Delta(s)W_2(s), \qquad \bar{\sigma}[\Delta(j\omega)] < 1, \ \forall \omega \geq 0, \tag{9.1}
\]
where $W_1$ and $W_2$ are stable transfer matrices that characterize the spatial and frequency structure of the uncertainty, confines the matrix $P_\Delta$ to a neighborhood of the nominal model $P$. In particular, if $W_1 = I$ and $W_2 = w(s)I$ where $w(s)$ is a scalar function, then $P_\Delta$ describes a disk centered at $P$ with radius $w(j\omega)$ at each frequency, as shown in Figure 9.1. The statement does not imply a mechanism or structure which gives rise to $\Delta$. The uncertainty may be caused by parameter changes, as mentioned above, or by neglected dynamics, or by a host of other unspecified effects. An alternative statement to (9.1) is the so-called multiplicative form:
\[
P_\Delta(s) = (I + W_1(s)\Delta(s)W_2(s))P(s). \tag{9.2}
\]
This statement confines $P_\Delta$ to a normalized neighborhood of the nominal model $P$. An advantage of (9.2) over (9.1) is that in (9.2) compensated transfer functions have the same uncertainty representation as the raw model (i.e., the weighting functions apply to $PK$ as well as $P$). Some other alternative set membership statements will be discussed later.

Figure 9.1: Nyquist Diagram of an Uncertain Model (at each frequency, a disk of radius $|w(j\omega)|$ centered at the nominal $P(j\omega)$)

The best choice of uncertainty representation for a specific FDLTI model depends, of course, on the errors the model makes. In practice, it is generally possible to represent


some of these errors in a highly-structured parameterized form. These are usually the low frequency error components. There are always remaining higher frequency errors, however, which cannot be covered this way. These are caused by such effects as infinite-dimensional electro-mechanical resonance, time delays, diffusion processes, etc. Fortunately, the less-structured representations, e.g., (9.1) or (9.2), are well suited to represent this latter class of errors. Consequently, (9.1) and (9.2) have become widely used "generic" uncertainty representations for FDLTI models. An important point is that the construction of the weighting matrices $W_1$ and $W_2$ for multivariable systems is not trivial.

Motivated by these observations, we will focus for the moment on the multiplicative description of uncertainty. We will assume that $P_\Delta$ in (9.2) remains a strictly proper FDLTI system for all $\Delta$. More general perturbations (e.g., time varying, infinite dimensional, nonlinear) can also be covered by this set provided they are given appropriate "conic sector" interpretations via Parseval's theorem. This connection is developed in Safonov [1980] and Zames [1966] and will not be pursued here.

When used to represent the various high frequency mechanisms mentioned above, the weighting functions in (9.2) commonly have the properties illustrated in Figure 9.2. They are small ($\ll 1$) at low frequencies and increase to unity and above at higher frequencies. The growth with frequency inevitably occurs because phase uncertainties eventually exceed $\pm 180$ degrees and magnitude deviations eventually exceed the nominal transfer function magnitudes. Readers who are skeptical about this reality are encouraged to try a few experiments with physical devices.

Figure 9.2: Typical Behavior of Multiplicative Uncertainty: $p_\Delta(s) = [1 + w(s)\delta(s)]p(s)$ (magnitude versus $\log\omega$; the actual model departs from the nominal model at high frequencies)

Also note that the representation of uncertainty in (9.2) can be used to include perturbation effects that are in fact certain. A nonlinear element may be quite accurately


modeled, but because our design techniques cannot effectively deal with the nonlinearity, it is treated as a conic linearity1. As another example, we may deliberately choose to ignore various known dynamic characteristics in order to achieve a simple nominal design model. This is the model reduction process discussed in the previous chapters.

The following terminologies are used in this book.

Definition 9.1 Given the description of an uncertainty model set $\Pi$ and a set of performance objectives, suppose $P \in \Pi$ is the nominal design model and $K$ is the resulting controller. Then the closed-loop feedback system is said to have

Nominal Stability (NS): if $K$ internally stabilizes the nominal model $P$.

Robust Stability (RS): if $K$ internally stabilizes every plant belonging to $\Pi$.

Nominal Performance (NP): if the performance objectives are satisfied for the nominal plant $P$.

Robust Performance (RP): if the performance objectives are satisfied for every plant belonging to $\Pi$.

The nominal stability and performance can be easily checked using various standard techniques. The conditions under which robust stability and robust performance are satisfied, under various assumptions on the uncertainty set $\Pi$, will be considered in the following sections.

9.2 Small Gain Theorem

This section and the next consider the stability of a nominally stable system under unstructured perturbations. The basis for the robust stability criteria derived in the sequel is the so-called small gain theorem.

Consider the interconnected system shown in Figure 9.3 with $M(s)$ a stable $p \times q$ transfer matrix.

Figure 9.3: Small Gain Theorem (the feedback interconnection of $M$ and $\Delta$, with $e_1 = w_1 + \Delta e_2$ and $e_2 = w_2 + M e_1$)

1See, for example, Safonov [1980] and Zames [1966].


Theorem 9.1 (Small Gain Theorem) Suppose $M \in \mathcal{RH}_\infty$ and let $\gamma > 0$. Then the interconnected system shown in Figure 9.3 is well-posed and internally stable for all $\Delta(s) \in \mathcal{RH}_\infty$ with

(a) $\|\Delta\|_\infty \leq 1/\gamma$ if and only if $\|M(s)\|_\infty < \gamma$;

(b) $\|\Delta\|_\infty < 1/\gamma$ if and only if $\|M(s)\|_\infty \leq \gamma$.

Proof. We shall only prove part (a). The proof for part (b) is similar. Without loss of generality, assume $\gamma = 1$.

(Sufficiency) It is clear that $M(s)\Delta(s)$ is stable since both $M(s)$ and $\Delta(s)$ are stable. Thus by Theorem 5.7 (or Corollary 5.6) the closed-loop system is stable if $\det(I - M\Delta)$ has no zero in the closed right-half plane for all $\Delta \in \mathcal{RH}_\infty$ with $\|\Delta\|_\infty \leq 1$. Equivalently, the closed-loop system is stable if
\[
\inf_{s \in \overline{\mathbb{C}}_+} \underline{\sigma}\left( I - M(s)\Delta(s) \right) \neq 0
\]
for all $\Delta \in \mathcal{RH}_\infty$ with $\|\Delta\|_\infty \leq 1$. But this follows from
\[
\inf_{s \in \overline{\mathbb{C}}_+} \underline{\sigma}\left( I - M(s)\Delta(s) \right)
\geq 1 - \sup_{s \in \overline{\mathbb{C}}_+} \bar{\sigma}\left( M(s)\Delta(s) \right)
= 1 - \|M(s)\Delta(s)\|_\infty \geq 1 - \|M(s)\|_\infty > 0.
\]

(Necessity) This will be shown by contradiction. Suppose ‖M‖∞ ≥ 1. We will show that there exists a Δ ∈ RH∞ with ‖Δ‖∞ ≤ 1 such that det(I − M(s)Δ(s)) has a zero on the imaginary axis, so the system is unstable. Suppose ω0 ∈ R+ ∪ {∞} is such that σ̄(M(jω0)) ≥ 1. Let M(jω0) = U(jω0)Σ(jω0)V*(jω0) be a singular value decomposition with

    U(jω0) = [u1  u2  ⋯  up],   V(jω0) = [v1  v2  ⋯  vq],   Σ(jω0) = diag(σ1, σ2, …)

and ‖M‖∞ = σ̄(M(jω0)) = σ1. To obtain a contradiction, it now suffices to construct a Δ ∈ RH∞ such that Δ(jω0) = (1/σ1) v1 u1* and ‖Δ‖∞ ≤ 1. Indeed, for such a Δ(s),

    det(I − M(jω0)Δ(jω0)) = det(I − UΣV* v1 u1*/σ1) = 1 − u1* UΣV* v1/σ1 = 0

and thus the closed-loop system is either not well-posed (if ω0 = ∞) or unstable (if ω0 ∈ R). There are two different cases:

(1) ω0 = 0 or ∞: then U and V are real matrices. In this case, Δ(s) can be chosen as the constant matrix

    Δ = (1/σ1) v1 u1ᵀ ∈ R^{q×p}.


(2) 0 < ω0 < ∞: write u1 and v1 in the following form:

    u1 = [u11 e^{jθ1}, u12 e^{jθ2}, …, u1p e^{jθp}]ᵀ,   v1 = [v11 e^{jφ1}, v12 e^{jφ2}, …, v1q e^{jφq}]ᵀ

where u1i ∈ R and v1j ∈ R are chosen so that θi, φj ∈ [−π, 0) for all i, j. Choose βi ≥ 0 and αj ≥ 0 so that

    ∠((βi − jω0)/(βi + jω0)) = θi,   ∠((αj − jω0)/(αj + jω0)) = φj

for i = 1, 2, …, p and j = 1, 2, …, q. Let

    Δ(s) = (1/σ1) [v11 (α1 − s)/(α1 + s), …, v1q (αq − s)/(αq + s)]ᵀ [u11 (β1 − s)/(β1 + s), …, u1p (βp − s)/(βp + s)] ∈ RH∞.

Then ‖Δ‖∞ = 1/σ1 ≤ 1 and Δ(jω0) = (1/σ1) v1 u1*.   □

The theorem still holds even if Δ and M are infinite dimensional. This is summarized in the following corollary.

Corollary 9.2 The following statements are equivalent:

(i) The system is well-posed and internally stable for all Δ ∈ H∞ with ‖Δ‖∞ < 1/γ;

(ii) The system is well-posed and internally stable for all Δ ∈ RH∞ with ‖Δ‖∞ < 1/γ;

(iii) The system is well-posed and internally stable for all Δ ∈ C^{q×p} with ‖Δ‖ < 1/γ;

(iv) ‖M‖∞ ≤ γ.

Remark 9.1 It can be shown that the small gain condition is sufficient to guarantee internal stability even if Δ is a nonlinear and time-varying "stable" operator with an appropriately defined notion of stability; see Desoer and Vidyasagar [1975]. ~
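As a rough numerical illustration of the small gain condition (not from the book; the transfer matrix M below and all numbers are hypothetical), the H∞ norm can be estimated by gridding the frequency axis, and the determinant bound from the sufficiency proof can be spot-checked against random constant contractions Δ:

```python
import numpy as np

def hinf_norm_grid(M_of_jw, omegas):
    # Largest maximum singular value of M(jw) over a frequency grid;
    # a crude estimate of the H-infinity norm of a stable M.
    return max(np.linalg.svd(M_of_jw(w), compute_uv=False)[0] for w in omegas)

def M(w):
    # Hypothetical stable example: M(s) = [[1/(s+2), 0.3/(s+1)], [0, 0.5/(s+3)]]
    s = 1j * w
    return np.array([[1 / (s + 2), 0.3 / (s + 1)],
                     [0.0, 0.5 / (s + 3)]])

omegas = np.logspace(-3, 3, 2000)
gamma = hinf_norm_grid(M, omegas)          # well below 1 for this M

# Since ||M||_inf < 1, for every constant Delta with ||Delta|| <= 1 each
# singular value of I - M(jw)Delta is at least 1 - gamma, hence
# |det(I - M(jw)Delta)| >= (1 - gamma)^2 at every frequency.
rng = np.random.default_rng(0)
min_abs_det = np.inf
for _ in range(20):
    D = rng.standard_normal((2, 2))
    D /= np.linalg.svd(D, compute_uv=False)[0]      # scale so that ||D|| = 1
    for w in omegas[::40]:
        min_abs_det = min(min_abs_det, abs(np.linalg.det(np.eye(2) - M(w) @ D)))
```

Since ‖M‖∞ < 1 here, det(I − MΔ) stays bounded away from zero for every sampled contraction, consistent with Theorem 9.1(a).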

The following lemma shows that if ‖M‖∞ > γ, there exists a destabilizing Δ with ‖Δ‖∞ < 1/γ such that the closed-loop system has poles in the open right-half plane. (This is stronger than what is given in the proof of Theorem 9.1.)

Lemma 9.3 Suppose M ∈ RH∞ and ‖M‖∞ > γ. Then there exists a σ0 > 0 such that for any given σ* ∈ [0, σ0] there exists a Δ ∈ RH∞ with ‖Δ‖∞ < 1/γ such that det(I − M(s)Δ(s)) has a zero on the axis Re(s) = σ*.


Proof. Without loss of generality, assume γ = 1. Since M ∈ RH∞ and ‖M‖∞ > 1, there is a sufficiently small σ0 > 0 such that

    ‖M(σ0 + s)‖∞ := sup_{Re(s)>0} ‖M(σ0 + s)‖ > 1.

Now let σ* ∈ [0, σ0]. Then there exists a 0 < ω0 < ∞ such that ‖M(σ* + jω0)‖ > 1. Let M(σ* + jω0) = UΣV* be a singular value decomposition with

    U = [u1  u2  ⋯  up],   V = [v1  v2  ⋯  vq],   Σ = diag(σ1, σ2, …).

Write u1 and v1 in the following form:

    u1 = [u11 e^{jθ1}, u12 e^{jθ2}, …, u1p e^{jθp}]ᵀ,   v1 = [v11 e^{jφ1}, v12 e^{jφ2}, …, v1q e^{jφq}]ᵀ

where u1i ∈ R and v1j ∈ R are chosen so that θi, φj ∈ [−π, 0) for all i, j. Choose βi ≥ 0 and αj ≥ 0 so that

    ∠((βi − σ* − jω0)/(βi + σ* + jω0)) = θi,   ∠((αj − σ* − jω0)/(αj + σ* + jω0)) = φj

for i = 1, 2, …, p and j = 1, 2, …, q. Let

    Δ(s) = (1/σ1) [v11 (α1 − s)/(α1 + s), …, v1q (αq − s)/(αq + s)]ᵀ [u11 (β1 − s)/(β1 + s), …, u1p (βp − s)/(βp + s)] ∈ RH∞.

Then ‖Δ‖∞ = 1/σ1 < 1 and det(I − M(σ* + jω0)Δ(σ* + jω0)) = 0. Hence s = σ* + jω0 is a zero of the transfer function det(I − M(s)Δ(s)).   □

The above lemma plays a key role in the necessity proofs of many robust stability tests in the sequel.

9.3 Stability under Stable Unstructured Uncertainties

The small gain theorem of the last section will now be used to derive robust stability tests under various assumptions on the model uncertainty. The modeling error Δ will again be assumed to be stable. (Most of the robust stability tests discussed in the sequel can easily be generalized to the unstable Δ case under some mild assumptions on the number of unstable poles of the uncertain model; we encourage readers to fill in the details.) In addition, we assume that the modeling error Δ is suitably scaled with weighting functions W1 and W2, i.e., the uncertainty can be represented as W1ΔW2.

We shall consider the standard setup shown in Figure 9.4, where Π is the set of uncertain plants with P ∈ Π as the nominal plant and with K as an internally stabilizing controller for P. The sensitivity and complementary sensitivity matrix functions are defined, as usual, as

    So = (I + PK)⁻¹,   To = I − So

and

    Si = (I + KP)⁻¹,   Ti = I − Si.

Recall that the closed-loop system is well-posed and internally stable if and only if

    [ I     K ]⁻¹     [ (I + K P_Δ)⁻¹        −K(I + P_Δ K)⁻¹ ]
    [ −P_Δ  I ]    =  [ (I + P_Δ K)⁻¹ P_Δ    (I + P_Δ K)⁻¹  ]   ∈ RH∞

for every perturbed plant P_Δ ∈ Π.

[Figure 9.4: Unstructured Robust Stability Analysis — standard feedback loop with controller K and perturbed plant P_Δ ∈ Π.]

9.3.1 Additive Uncertainty

We assume that the model uncertainty can be represented by an additive perturbation:

    P_Δ = P + W1ΔW2.

Theorem 9.4 Let Π = {P + W1ΔW2 : Δ ∈ RH∞} and let K be a stabilizing controller for the nominal plant P. Then the closed-loop system is well-posed and internally stable for all ‖Δ‖∞ < 1 if and only if ‖W2KSoW1‖∞ ≤ 1.

Proof. Let P_Δ = P + W1ΔW2. Then

    [ I     K ]⁻¹     [ (I + KSoW1ΔW2)⁻¹ Si               −KSo(I + W1ΔW2KSo)⁻¹ ]
    [ −P_Δ  I ]    =  [ (I + SoW1ΔW2K)⁻¹ So(P + W1ΔW2)    So(I + W1ΔW2KSo)⁻¹  ]

is well-posed and internally stable if (I + ΔW2KSoW1)⁻¹ ∈ RH∞ since

    det(I + KSoW1ΔW2) = det(I + W1ΔW2KSo) = det(I + SoW1ΔW2K) = det(I + ΔW2KSoW1).


But (I + ΔW2KSoW1)⁻¹ ∈ RH∞ is guaranteed if ‖ΔW2KSoW1‖∞ < 1. Hence ‖W2KSoW1‖∞ ≤ 1 is sufficient for robust stability.

To show the necessity, note that robust stability implies that

    K(I + P_Δ K)⁻¹ = KSo(I + W1ΔW2KSo)⁻¹ ∈ RH∞

for all admissible Δ. This in turn implies that

    ΔW2K(I + P_Δ K)⁻¹W1 = I − (I + ΔW2KSoW1)⁻¹ ∈ RH∞

for all admissible Δ. By the small gain theorem, this is true for all Δ ∈ RH∞ with ‖Δ‖∞ < 1 only if ‖W2KSoW1‖∞ ≤ 1.   □

Similarly, it is easy to show that the closed-loop system is stable for all Δ ∈ RH∞ with ‖Δ‖∞ ≤ 1 if and only if ‖W2KSoW1‖∞ < 1.
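As a quick sanity check of Theorem 9.4 on a hypothetical scalar example (P(s) = 1/(s+1), K = 1, W1 = W2 = 1, chosen here only for illustration), the test quantity ‖KSo‖∞ can be evaluated on a frequency grid and cross-checked against constant additive perturbations:

```python
import numpy as np

# Hypothetical example: P(s) = 1/(s+1), K = 1, W1 = W2 = 1, so
# So = (I + PK)^{-1} = (s+1)/(s+2) and the test quantity is K*So.
omegas = np.logspace(-3, 4, 4000)
test_norm = max(abs((1j * w + 1) / (1j * w + 2)) for w in omegas)  # approaches 1

# Cross-check with constant perturbations Delta = d, |d| < 1: the perturbed
# characteristic equation 1 + (P + d)K = 0 has the single pole
# s = -(2 + d)/(1 + d), which stays in the open left-half plane.
worst_pole = max(-(2 + d) / (1 + d) for d in np.linspace(-0.99, 0.99, 199))
```

Here ‖KSo‖∞ = 1 (approached as ω → ∞), so the theorem predicts stability for every ‖Δ‖∞ < 1, and indeed the sampled perturbed poles all remain strictly negative.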

9.3.2 Multiplicative Uncertainty

In this section, we assume that the system model is described by the following set of multiplicative perturbations:

    P_Δ = (I + W1ΔW2)P

with W1, W2, Δ ∈ RH∞. Consider the feedback system shown in Figure 9.5.

[Figure 9.5: Output Multiplicative Perturbed Systems — feedback loop with controller K, plant P, output perturbation block W1ΔW2 (signals z, w, dm), disturbance weighting Wd, and error weighting We.]

Theorem 9.5 Let Π = {(I + W1ΔW2)P : Δ ∈ RH∞} and let K be a stabilizing controller for the nominal plant P. Then

(i) the closed-loop system is well-posed and internally stable for all Δ ∈ RH∞ with ‖Δ‖∞ < 1 if and only if ‖W2ToW1‖∞ ≤ 1;

(ii) the closed-loop system is well-posed and internally stable for all Δ ∈ RH∞ with ‖Δ‖∞ ≤ 1 if ‖W2ToW1‖∞ < 1;


(iii) robust stability of the closed-loop system for all Δ ∈ RH∞ with ‖Δ‖∞ ≤ 1 does not necessarily imply ‖W2ToW1‖∞ < 1;

(iv) the closed-loop system is well-posed and internally stable for all Δ ∈ RH∞ with ‖Δ‖∞ ≤ 1 only if ‖W2ToW1‖∞ ≤ 1;

(v) in addition, assume that neither P nor K has poles on the imaginary axis. Then the closed-loop system is well-posed and internally stable for all Δ ∈ RH∞ with ‖Δ‖∞ ≤ 1 if and only if ‖W2ToW1‖∞ < 1.

Proof. We shall first prove that the condition in (i) is necessary for robust stability. Suppose ‖W2ToW1‖∞ > 1. Then by Lemma 9.3, for any given sufficiently small σ* > 0, there is a Δ ∈ RH∞ with ‖Δ‖∞ < 1 such that (I + ΔW2ToW1)⁻¹ has poles on the axis Re(s) = σ*. This implies that

    (I + P_Δ K)⁻¹ = So(I + W1ΔW2To)⁻¹

has poles on the axis Re(s) = σ*, since Δ can always be chosen so that these unstable poles are not cancelled by the zeros of So. Hence ‖W2ToW1‖∞ ≤ 1 is necessary for robust stability. In fact, we have also proven part (iv). The sufficiency parts of (i), (ii), and (v) follow from the small gain theorem.

To show the necessity part of (v), suppose ‖W2ToW1‖∞ = 1. From the proof of the small gain theorem, there is a Δ ∈ RH∞ with ‖Δ‖∞ ≤ 1 such that (I + ΔW2ToW1)⁻¹ has poles on the imaginary axis. This implies that

    (I + P_Δ K)⁻¹ = So(I + W1ΔW2To)⁻¹

has poles on the imaginary axis, since the imaginary-axis poles of (I + W1ΔW2To)⁻¹ are not cancelled by the zeros of So, which are the poles of P and K. Hence ‖W2ToW1‖∞ < 1 is necessary for robust stability.

The proof of part (iii) is given below by exhibiting an example with ‖W2ToW1‖∞ = 1 for which there is no destabilizing Δ ∈ RH∞ with ‖Δ‖∞ ≤ 1.   □

Example 9.1 Let P(s) = 1/s, K(s) = 1, and W1 = W2 = 1. It is easy to check that K stabilizes P. We have

    To = 1/(s + 1),   ‖To‖∞ = 1

and

    (I + P_Δ K)⁻¹ = (I + K P_Δ)⁻¹ = K(I + P_Δ K)⁻¹ = (s/(s + 1)) · 1/(1 + Δ/(s + 1)),

    (I + P_Δ K)⁻¹ P_Δ = (1/(s + 1)) · (1 + Δ)/(1 + Δ/(s + 1)).


Since |To(s)| = |1/(s + 1)| < 1 for all s ≠ 0 with Re(s) ≥ 0, we have 1 + Δ/(s + 1) ≠ 0 for all s ≠ 0 with Re(s) ≥ 0, Δ ∈ RH∞, and ‖Δ‖∞ ≤ 1. The only point where 1 + Δ/(s + 1) = 0 in the closed right-half plane is s = 0, and then Δ(0) = −1. Since Δ ∈ RH∞, Δ is analytic in a neighborhood of the origin. Hence, we can write

    Δ(s) = −1 + Σ_{i=1}^{∞} a_i s^i,   a_i ∈ R.

We now claim that a1 ≥ 0. Otherwise, dΔ(s)/ds |_{s=0} = a1 < 0, along with Δ(0) = −1, implies ‖Δ‖∞ > 1. Hence Δ(s) = −1 + a1 s + s² g(s) for some g(s) and a1 ≥ 0, and

    (I + P_Δ K)⁻¹ = (I + K P_Δ)⁻¹ = K(I + P_Δ K)⁻¹ = (s/(s + 1)) · 1/(1 + Δ/(s + 1)) = 1/(1 + a1 + s g),

    (I + P_Δ K)⁻¹ P_Δ = (1/(s + 1)) · (1 + Δ)/(1 + Δ/(s + 1)) = (a1 + s g)/(1 + a1 + s g),

both of which are bounded in the neighborhood of the origin. Hence there is no destabilizing Δ ∈ RH∞ with ‖Δ‖∞ ≤ 1.   ◊

The gap between the necessity and the sufficiency of the above robust stability conditions for the closed ball of uncertainty is only technical and will not be considered in the sequel. The reason such a gap exists may be attributed to the fact that, in the multiplicative perturbed case, the signals z, w, and dm in Figure 9.5 are artificial, not physical, signals. Indeed, P_Δ = (I + W1ΔW2)P is a single system, and internal stability of the closed-loop system does not necessarily imply the boundedness of the artificial signals z or w with respect to the artificial disturbance dm. This is the case for the above example, where P_Δ = (1 + Δ)P = (a1 s + s²g(s))/s = a1 + s g(s) and the pole s = 0 is cancelled. This cancellation is artificial and is caused by the particular model representation (i.e., there is really no cancellation in the physical system). Thus the closed-loop system is robustly stable although the transfer function from dm to z is unstable.

9.3.3 Coprime Factor Uncertainty

As another example, consider a left coprime factor perturbed plant described in Figure 9.6.

Theorem 9.6 Let

    P_Δ = (M̃ + Δ̃_M)⁻¹(Ñ + Δ̃_N)

with M̃, Ñ, Δ̃_M, Δ̃_N ∈ RH∞. The transfer matrices (M̃, Ñ) are assumed to be a stable left coprime factorization of P (i.e., P = M̃⁻¹Ñ), and K internally stabilizes


[Figure 9.6: Left Coprime Factor Perturbed Systems — feedback loop of K with the perturbed factorization (M̃ + Δ̃_M)⁻¹(Ñ + Δ̃_N), with perturbation input w and outputs z1, z2.]

the nominal system P. Define

    Δ := [ Δ̃_N   Δ̃_M ].

Then the closed-loop system is well-posed and internally stable for all ‖Δ‖∞ < 1 if and only if

    ‖ [K; I] (I + PK)⁻¹ M̃⁻¹ ‖∞ ≤ 1

where [K; I] denotes the block column matrix with blocks K and I.

Proof. Let K = UV⁻¹ be a right coprime factorization over RH∞. By Lemma 5.10, the closed-loop system is internally stable if and only if

    ((Ñ + Δ̃_N)U + (M̃ + Δ̃_M)V)⁻¹ ∈ RH∞.                (9.3)

Since K stabilizes P, (ÑU + M̃V)⁻¹ ∈ RH∞. Hence (9.3) holds if and only if

    (I + (Δ̃_N U + Δ̃_M V)(ÑU + M̃V)⁻¹)⁻¹ ∈ RH∞.

By the small gain theorem, the above is true for all ‖Δ‖∞ < 1 if and only if

    ‖ [U; V] (ÑU + M̃V)⁻¹ ‖∞ = ‖ [K; I] (I + PK)⁻¹ M̃⁻¹ ‖∞ ≤ 1.   □

Similarly, one can show that the closed-loop system is well-posed and internally stable for all ‖Δ‖∞ ≤ 1 if and only if

    ‖ [K; I] (I + PK)⁻¹ M̃⁻¹ ‖∞ < 1.

9.3.4 Unstructured Robust Stability Tests

Table 9.1 summarizes robust stability tests for plant uncertainties under various assumptions. All of the tests pertain to the standard setup shown in Figure 9.4, where Π is the set of uncertain plants with P ∈ Π as the nominal plant and with K as the internally stabilizing controller of P.


Table 9.1 should be interpreted as follows:

    UNSTRUCTURED ANALYSIS THEOREM
    Given:           NS and the perturbed model set
    Then:            closed-loop robust stability
    if and only if:  the corresponding robust stability test holds.

The table also indicates representative types of physical uncertainty which can be usefully represented by cone-bounded perturbations inserted at appropriate locations. For example, the representation P_Δ = (I + W1ΔW2)P in the first row is useful for output errors at high frequencies (HF), covering such things as unmodeled high-frequency dynamics of sensors or plant, including diffusion processes, transport lags, electro-mechanical resonances, etc. The representation P_Δ = P(I + W1ΔW2) in the second row covers similar types of errors at the inputs. Both cases should be contrasted with the third and fourth rows, which treat P(I + W1ΔW2)⁻¹ and (I + W1ΔW2)⁻¹P. These representations are more useful for variations in modeled dynamics, such as low-frequency (LF) errors produced by parameter variations with operating conditions, with aging, or across production copies of the same plant. Discussion of still other cases is left to the table.

Note from the table that the stability requirements on Δ do not limit our ability to represent variations in either the number or the locations of rhp singularities, as can be seen from some simple examples.

Example 9.2 Suppose an uncertain system with a changing number of right-half plane poles is described by

    P_Δ = { 1/(s − δ) : δ ∈ R, |δ| ≤ 1 }.

Then P1 = 1/(s − 1) ∈ P_Δ has one right-half plane pole and P2 = 1/(s + 1) ∈ P_Δ has no right-half plane pole. Nevertheless, the set P_Δ can be covered by a set of feedback uncertain plants:

    P_Δ ⊂ Π := { P(1 + ΔP)⁻¹ : Δ ∈ RH∞, ‖Δ‖∞ ≤ 1 }

with P = 1/s.   ◊

Example 9.3 As another example, consider the following set of plants:

    P_δ = (s + 1 + δ) / ((s + 1)(s + 2)),   |δ| ≤ 2.


Assumptions: W1 ∈ RH∞, W2 ∈ RH∞, Δ ∈ RH∞, ‖Δ‖∞ < 1.

Perturbed Model Set Π       Representative Types of          Robust Stability Test
                            Uncertainty Characterized

(I + W1ΔW2)P                output (sensor) errors,          ‖W2ToW1‖∞ ≤ 1
                            neglected HF dynamics,
                            changing # of rhp zeros

P(I + W1ΔW2)                input (actuator) errors,         ‖W2TiW1‖∞ ≤ 1
                            neglected HF dynamics,
                            changing # of rhp zeros

(I + W1ΔW2)⁻¹P              LF parameter errors,             ‖W2SoW1‖∞ ≤ 1
                            changing # of rhp poles

P(I + W1ΔW2)⁻¹              LF parameter errors,             ‖W2SiW1‖∞ ≤ 1
                            changing # of rhp poles

P + W1ΔW2                   additive plant errors,           ‖W2KSoW1‖∞ ≤ 1
                            neglected HF dynamics,
                            uncertain rhp zeros

P(I + W1ΔW2P)⁻¹             LF parameter errors,             ‖W2SoPW1‖∞ ≤ 1
                            uncertain rhp poles

(M̃ + Δ̃_M)⁻¹(Ñ + Δ̃_N),      LF parameter errors,             ‖[K; I] So M̃⁻¹‖∞ ≤ 1
P = M̃⁻¹Ñ,                   neglected HF dynamics,
Δ = [Δ̃_N  Δ̃_M]              uncertain rhp poles & zeros

(N + Δ_N)(M + Δ_M)⁻¹,       LF parameter errors,             ‖M⁻¹ Si [K  I]‖∞ ≤ 1
P = NM⁻¹,                   neglected HF dynamics,
Δ = [Δ_N; Δ_M]              uncertain rhp poles & zeros

Table 9.1: Unstructured Robust Stability Tests (# stands for `the number'.)


This set of plants has a changing number of right-half plane zeros, since the plant has no right-half plane zero when δ = 0 and one right-half plane zero when δ = −2. The uncertain plants can be covered by a set of multiplicative perturbed plants:

    P_δ ⊂ Π := { (1/(s + 2)) (1 + 2Δ/(s + 1)) : Δ ∈ RH∞, ‖Δ‖∞ ≤ 1 }.

It should be noted that this covering can be quite conservative.   ◊

9.3.5 Equivalence: Robust Stability vs. Nominal Performance

A robust stability problem can be viewed as a nominal performance problem. For example, the output multiplicative perturbed robust stability problem can be treated as the sensor noise rejection problem shown in Figure 9.7, and vice versa. It is clear that the system with output multiplicative uncertainty shown in Figure 9.5 is robustly stable for ‖Δ‖∞ < 1 iff the H∞ norm of the transfer function from w to z, Tzw, is no greater than 1. Since Tzw = Teñ, we have ‖Tzw‖∞ ≤ 1 iff

    sup_{‖ñ‖2 ≤ 1} ‖e‖2 = ‖W2ToW1‖∞ ≤ 1.

[Figure 9.7: Equivalence Between Robust Stability with Output Multiplicative Uncertainty and Nominal Performance with Sensor Noise Rejection — feedback loop of K and P with sensor noise ñ entering through W1 and error e measured through W2.]

There is, in fact, a much more general result along this line of equivalence: any robust stability problem (with an open ball of uncertainty) can be regarded as an equivalent performance problem. This will be considered in Chapter 11.

9.4 Unstructured Robust Performance

Consider the perturbed system shown in Figure 9.8 with the set of perturbed models described by a set Π. Suppose the weighting matrices Wd, We ∈ RH∞ and the performance criterion is to keep the error e as small as possible, in some sense, for all possible models belonging to the set Π. In general, the set Π can be either a parameterized set or an unstructured set such as those described in Table 9.1. The performance specifications are usually given in terms of the magnitude of each component of e in the time domain with respect to bounded disturbances, or, alternatively and more conveniently, in terms of some requirements on the closed-loop frequency response of the transfer matrix between d̃ and e — say, the integral of the square error or the magnitude of the steady-state error with


[Figure 9.8: Diagram for Robust Performance Analysis — feedback loop of K with perturbed plant P_Δ ∈ Π, disturbance d̃ shaped by Wd, and error e weighted by We.]

respect to sinusoidal disturbances. The former design criterion leads to the so-called L1-optimal control framework, and the latter leads to the H2 and H∞ design frameworks, respectively. In this section, we will focus primarily on the H2 and H∞ performance objectives with unstructured model uncertainty descriptions. Performance under structured uncertainty will be considered in Chapter 11.

9.4.1 Robust H2 Performance

Although nominal H2 performance analysis is straightforward and involves only the computation of H2 norms, H2 performance analysis with H∞-norm-bounded model uncertainty is much harder and little studied. Nevertheless, this problem is sensible and important, because performance is sometimes more appropriately specified in terms of the H2 norm than the H∞ norm, while model uncertainties are more conveniently described using H∞-norm-bounded sets and arise naturally in identification processes.

Let T^Δ_{ed̃} denote the transfer matrix between d̃ and e; then

    T^Δ_{ed̃} = We(I + P_Δ K)⁻¹Wd,   P_Δ ∈ Π.                (9.4)

The robust H2 performance analysis problem can be stated as finding

    sup_{P_Δ ∈ Π} ‖T^Δ_{ed̃}‖2.

To simplify the analysis, consider a scalar system with Wd = 1, We = ws, P = p, and assume that the model is given by the multiplicative uncertainty set

    Π = { (1 + wt Δ)p : Δ ∈ RH∞, ‖Δ‖∞ < 1 }.

Assume further that the system is robustly stabilized by a controller k. Then

    sup_{P_Δ ∈ Π} ‖T^Δ_{ed̃}‖2 = sup_{‖Δ‖∞ < 1} ‖ ws ε / (1 + τ wt Δ) ‖2 = ‖ ws ε / (1 − |τ wt|) ‖2

where ε = (1 + pk)⁻¹ and τ = 1 − ε.

The exact analysis for the matrix case is harder, although some upper bounds can be derived, as we shall do for the H∞ case below. However, the upper bounds do not seem to be insightful in the H2 setting and are therefore omitted. It should be pointed out that synthesis for robust H2 performance is much harder, even in the scalar case, whereas synthesis for nominal performance is relatively easy and will be considered in Chapter 14.

9.4.2 Robust H∞ Performance with Output Multiplicative Uncertainty

Suppose the performance criterion is to keep the energy of the error e as small as possible, i.e.,

    sup_{‖d̃‖2 ≤ 1} ‖e‖2 ≤ ε

for some small ε. By scaling the error e (i.e., by properly selecting We) we can assume without loss of generality that ε = 1. Then the robust performance criterion in this case can be described as requiring that the closed-loop system be robustly stable and that

    ‖T^Δ_{ed̃}‖∞ ≤ 1,   ∀ P_Δ ∈ Π.                (9.5)

More specifically, an output multiplicatively perturbed system will be analyzed first. The analysis for other classes of models can be done analogously. The perturbed model can be described as

    Π := { (I + W1ΔW2)P : Δ ∈ RH∞, ‖Δ‖∞ < 1 }                (9.6)

with W1, W2 ∈ RH∞. The explicit system diagram is as shown in Figure 9.5. For this class of models, we have

    T^Δ_{ed̃} = WeSo(I + W1ΔW2To)⁻¹Wd,

and robust performance is satisfied iff

    ‖W2ToW1‖∞ ≤ 1   and   ‖T^Δ_{ed̃}‖∞ ≤ 1,   ∀ Δ ∈ RH∞, ‖Δ‖∞ < 1.

The exact analysis for this robust performance problem is not trivial and will be given in Chapter 11. However, some sufficient conditions are relatively easy to obtain by bounding these two inequalities, and they may shed some light on the nature of these problems. It will be assumed throughout that the controller K internally stabilizes the nominal plant P.


Theorem 9.7 Suppose P_Δ ∈ { (I + W1ΔW2)P : Δ ∈ RH∞, ‖Δ‖∞ < 1 } and K internally stabilizes P. Then robust performance is guaranteed if either one of the following conditions is satisfied:

(i) for each frequency ω

    σ̄(Wd)σ̄(WeSo) + σ̄(W1)σ̄(W2To) ≤ 1;                (9.7)

(ii) for each frequency ω

    κ(W1⁻¹Wd)σ̄(WeSoWd) + σ̄(W2ToW1) ≤ 1                (9.8)

where W1 and Wd are assumed to be invertible and κ(W1⁻¹Wd) is the condition number.

Proof. It is obvious that both condition (9.7) and condition (9.8) guarantee that ‖W2ToW1‖∞ ≤ 1, so it suffices to show that ‖T^Δ_{ed̃}‖∞ ≤ 1 for all Δ ∈ RH∞ with ‖Δ‖∞ < 1. Now for any frequency ω, it is easy to see that

    σ̄(T^Δ_{ed̃}) ≤ σ̄(WeSo) σ̄[(I + W1ΔW2To)⁻¹] σ̄(Wd)
                = σ̄(WeSo)σ̄(Wd) / σ̲(I + W1ΔW2To)
                ≤ σ̄(WeSo)σ̄(Wd) / (1 − σ̄(W1ΔW2To))
                ≤ σ̄(WeSo)σ̄(Wd) / (1 − σ̄(W1)σ̄(W2To)σ̄(Δ)).

Hence condition (9.7) guarantees σ̄(T^Δ_{ed̃}) ≤ 1 for all Δ ∈ RH∞ with ‖Δ‖∞ < 1 at all frequencies.

Similarly, suppose W1 and Wd are invertible; write

    T^Δ_{ed̃} = WeSoWd (W1⁻¹Wd)⁻¹ (I + ΔW2ToW1)⁻¹ (W1⁻¹Wd),

and then

    σ̄(T^Δ_{ed̃}) ≤ κ(W1⁻¹Wd) σ̄(WeSoWd) / (1 − σ̄(W2ToW1)σ̄(Δ)).

Hence by condition (9.8), σ̄(T^Δ_{ed̃}) ≤ 1 is guaranteed for all Δ ∈ RH∞ with ‖Δ‖∞ < 1 at all frequencies.   □

Remark 9.2 It is not hard to show that either one of the conditions in the theorem is also necessary for scalar-valued systems. ~


Remark 9.3 Suppose κ(W1⁻¹Wd) ≈ 1 (weighting matrices satisfying this condition are usually called round weights). This is in particular the case if W1 = w1(s)I and Wd = wd(s)I. Recall that σ̄(WeSoWd) ≤ 1 is the necessary and sufficient condition for nominal performance and that σ̄(W2ToW1) ≤ 1 is the necessary and sufficient condition for robust stability. Hence it is clear that in this case nominal performance plus robust stability almost guarantees robust performance, i.e., NP + RS ≈ RP. However, this is not necessarily true for other types of uncertainties, as will be shown later. ~

Remark 9.4 Note that, in light of the equivalence between robust stability and nominal performance, it is reasonable to conjecture that the above robust performance problem is equivalent to the robust stability problem in Figure 9.4 with the uncertainty model set given by

    Π := (I + Wd Δe We)⁻¹(I + W1ΔW2)P

and ‖Δe‖∞ < 1, ‖Δ‖∞ < 1, as shown in Figure 9.9. This conjecture is indeed true; however, the equivalent model uncertainty is structured, and the exact stability analysis for such systems is not trivial. It will be studied in Chapter 11. ~

[Figure 9.9: Robust Performance with Unstructured Uncertainty vs. Robust Stability with Structured Uncertainty — the loop of Figure 9.5 with an additional fictitious perturbation Δe connecting the weighted error We back through Wd.]

Remark 9.5 Note that if W1 and Wd are invertible, then T^Δ_{ed̃} can also be written as

    T^Δ_{ed̃} = WeSoWd ( I + (W1⁻¹Wd)⁻¹ Δ W2ToW1 (W1⁻¹Wd) )⁻¹.

So another alternative sufficient condition for robust performance is

    σ̄(WeSoWd) + κ(W1⁻¹Wd)σ̄(W2ToW1) ≤ 1.

A similar situation also occurs in the skewed case below. We will not repeat all these variations. ~


9.4.3 Skewed Specifications and Plant Condition Number

We now consider a system with skewed specifications, i.e., the uncertainty and performance are not measured at the same location. For instance, suppose the system performance is still measured in terms of output sensitivity, but the uncertainty model is in input multiplicative form:

    Π := { P(I + W1ΔW2) : Δ ∈ RH∞, ‖Δ‖∞ < 1 }.

For systems described by this class of models (see Figure 9.8), the robust stability condition becomes

    ‖W2TiW1‖∞ ≤ 1,

and the nominal performance condition becomes

    ‖WeSoWd‖∞ ≤ 1.

To consider robust performance, let T̃^Δ_{ed̃} denote the transfer matrix from d̃ to e. Then

    T̃^Δ_{ed̃} = WeSo(I + PW1ΔW2KSo)⁻¹Wd
             = WeSoWd ( I + (Wd⁻¹PW1) Δ (W2TiW1) (Wd⁻¹PW1)⁻¹ )⁻¹.

The last equality holds if W1, Wd, and P are invertible and, if W2 is also invertible, can be written as

    T̃^Δ_{ed̃} = WeSoWd (W1⁻¹Wd)⁻¹ ( I + (W1⁻¹PW1) Δ (W2P⁻¹W2⁻¹)(W2ToW1) )⁻¹ (W1⁻¹Wd).

Then the following results follow easily.

Theorem 9.8 Suppose P_Δ ∈ Π = { P(I + W1ΔW2) : Δ ∈ RH∞, ‖Δ‖∞ < 1 } and K internally stabilizes P. Assume that P, W1, W2, and Wd are square and invertible. Then robust performance is guaranteed if either one of the following conditions is satisfied:

(i) for each frequency ω

    σ̄(WeSoWd) + κ(Wd⁻¹PW1)σ̄(W2TiW1) ≤ 1;                (9.9)

(ii) for each frequency ω

    κ(W1⁻¹Wd)σ̄(WeSoWd) + σ̄(W1⁻¹PW1)σ̄(W2P⁻¹W2⁻¹)σ̄(W2ToW1) ≤ 1.                (9.10)

Remark 9.6 If the appropriate invertibility conditions are not satisfied, then an alternative sufficient condition for robust performance is given by

    σ̄(Wd)σ̄(WeSo) + σ̄(PW1)σ̄(W2KSo) ≤ 1.

As in the previous case, there are many different variations of sufficient conditions, although (9.10) may be the most useful one. ~


Remark 9.7 It is important to note that in this case the robust stability condition is given in terms of Li = KP while the nominal performance condition is given in terms of Lo = PK. These classes of problems are called skewed problems or problems with skewed specifications.² Since, in general, PK ≠ KP, the robust stability margins or tolerances for uncertainties at the plant input and at the plant output are generally not the same. ~

Remark 9.8 It is also worth noting that the robust performance condition is related to the condition number of the weighted nominal model. In general, if the weighted nominal model is ill-conditioned in the range of critical frequencies, then the robust performance condition may be far more restrictive than the robust stability condition and the nominal performance condition together. For simplicity, assume W1 = I, Wd = I, and W2 = wt I, where wt ∈ RH∞ is a scalar function, and assume further that P is invertible. Then robust performance condition (9.10) can be written as

    σ̄(WeSo) + κ(P)σ̄(wt To) ≤ 1,   ∀ω.

Comparing these conditions with those obtained for non-skewed problems shows that the condition related to robust stability is now scaled by the condition number of the plant.³ Since κ(P) ≥ 1, it is clear that the skewed specifications are much harder to satisfy if the plant is not well-conditioned. This problem will be discussed in more detail in section 11.3.3 of Chapter 11. ~

Remark 9.9 Suppose K is invertible; then T̃^Δ_{ed̃} can be written as

    T̃^Δ_{ed̃} = WeK⁻¹(I + TiW1ΔW2)⁻¹SiKWd.

Assume further that We = I, Wd = ws I, and W2 = I, where ws ∈ RH∞ is a scalar function. Then a sufficient condition for robust performance is given by

    κ(K)σ̄(Si ws) + σ̄(TiW1) ≤ 1,   ∀ω,

with κ(K) := σ̄(K)σ̄(K⁻¹). This is equivalent to treating the input multiplicative plant uncertainty as output multiplicative controller uncertainty. ~

These skewed specifications also create problems for MIMO loop-shaping design, which was discussed briefly in Chapter 5. The idea of loop-shaping design is based on the fact that robust performance is guaranteed by designing a controller with a sufficient nominal performance margin and a sufficient robust stability margin. For example, if κ(W1⁻¹Wd) ≈ 1, output multiplicative perturbed robust performance is guaranteed by designing a controller with twice the required nominal performance and robust stability margins.

²See Stein and Doyle [1991].
³An alternative condition can be derived so that the condition related to nominal performance is scaled by the condition number.


The fact that the condition number appears in the robust performance test for skewed problems can be given another interpretation by considering two sets of plants, Π1 and Π2, as shown below:

    Π1 := { P(I + wt Δ) : Δ ∈ RH∞, ‖Δ‖∞ < 1 },
    Π2 := { (I + w̃t Δ)P : Δ ∈ RH∞, ‖Δ‖∞ < 1 }.

[Figure 9.10: Converting Input Uncertainty to Output Uncertainty.]

Assume that P is invertible; then

    Π2 ⊇ Π1   if   |w̃t| ≥ |wt| κ(P)   ∀ω

since P(I + wt Δ) = (I + wt PΔP⁻¹)P.

The condition number of a transfer matrix can be very large at high frequency, which may significantly limit the achievable performance. The example below, taken from the textbook [Franklin, Powell, and Workman, 1990], gives a plant whose condition number, shown in Figure 9.11, increases with frequency:

    P(s) = [ A | B ]
           [ C | 0 ]

with

    A = [ −0.2   0.1   1 ]      B = [ 0   1   ]      C = [ 1  0  0 ]
        [ −0.05  0     0 ]          [ 0   0.7 ]          [ 0  1  0 ]
        [  0     0    −1 ]          [ 1   0   ]

or, equivalently,

    P(s) = (1/a(s)) [ −s      (s + 1)(s + 0.07)    ]
                    [ −0.05   0.7(s + 1)(s + 0.13) ]

where a(s) = (s + 1)(s + 0.1707)(s + 0.02929).
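The growth of the condition number with frequency can be reproduced directly from the state-space data above (a minimal sketch; the two frequency points are arbitrary):

```python
import numpy as np

A = np.array([[-0.2, 0.1, 1.0],
              [-0.05, 0.0, 0.0],
              [0.0, 0.0, -1.0]])
B = np.array([[0.0, 1.0],
              [0.0, 0.7],
              [1.0, 0.0]])
C = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])

def cond_number(w):
    # kappa(P(jw)) = largest / smallest singular value of C (jwI - A)^{-1} B
    Pjw = C @ np.linalg.solve(1j * w * np.eye(3) - A, B)
    sv = np.linalg.svd(Pjw, compute_uv=False)
    return sv[0] / sv[-1]

kappa_lf = cond_number(1e-3)    # low frequency: modest condition number
kappa_hf = cond_number(1e2)     # high frequency: ill-conditioned, as in Figure 9.11
```

At low frequency the plant is reasonably well-conditioned (κ of a few), while at ω = 100 rad/s the condition number is in the hundreds, consistent with the trend in Figure 9.11.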

9.5 Gain Margin and Phase Margin

In this section, we show that the gain margin and phase margin defined in classical control theory may not be sufficient indicators of a system's robustness. Let L(s) be a scalar transfer function and consider the unity feedback system shown in the following diagram:


[Figure 9.11: Condition Number κ(ω) = σ̄(P(jω))/σ̲(P(jω)) — log-log plot over frequencies 10⁻³ to 10²; the condition number rises from a few at low frequency to about 10³ at high frequency.]

[Unity feedback loop: the output of L(s) is fed back negatively to its input.]

Suppose that the above closed-loop feedback system with L(s) = L0(s) is stable. Then the system is said to have

Gain margins kmin and kmax if the closed-loop system is stable for all L(s) = kL0(s) with kmin < k < kmax but unstable for L(s) = kmaxL0(s) and for L(s) = kminL0(s), where 0 ≤ kmin ≤ 1 and kmax ≥ 1.

Phase margins φmin and φmax if the closed-loop system is stable for all L(s) = e^{−jφ}L0(s) with φmin < φ < φmax but unstable for L(s) = e^{−jφmax}L0(s) and for L(s) = e^{−jφmin}L0(s), where −π ≤ φmin ≤ 0 and 0 ≤ φmax ≤ π.

These margins can easily be read from the Nyquist diagram of the open-loop system, as shown in Figure 9.12, where kmax and kmin represent how much the loop gain can be increased and decreased, respectively, without causing instability. Similarly, φmax and φmin represent how much loop phase lag and lead, respectively, can be tolerated without causing instability.

[Figure 9.12: Gain Margin and Phase Margin of a Scalar System — Nyquist plot of L(jω) showing the negative-real-axis crossings 1/kmax and 1/kmin relative to −1 and the phase-margin angles φmax and φmin at the unit circle.]

However, gain margin or phase margin alone may not be a su�cient indicator of asystem's robustness. To be more speci�c, consider a simple dynamical system

P =a� s

as� 1; a > 1

with a stabilizing controller K. Now let L = PK and consider a controller

K =b+ s

bs+ 1; b > 0:

It is easy to show that the closed-loop system is stable for any

1

a< b < a:

To compute the stability margins, consider three cases:

(i) $b = 1$: in this case, $K = 1$ and the stability margins can easily be shown to be

$$k_{\min} = \frac{1}{a}, \quad k_{\max} = a, \quad \phi_{\min} = -\phi, \quad \phi_{\max} = \sin^{-1}\!\left(\frac{a^2-1}{a^2+1}\right) =: \phi.$$

It is easy to see that both gain margin and phase margin are very large for large $a$.

(ii) $\frac{1}{a} < b < a$ and $b \to a$: in this case

$$k_{\min} = \frac{1}{ab} \to \frac{1}{a^2}, \quad k_{\max} = ab \to a^2, \quad \phi_{\min} = -\phi_{\max}, \quad \phi_{\max} \to 0,$$

i.e., very large gain margin but arbitrarily small phase margin.

(iii) $\frac{1}{a} < b < a$ and $b \to \frac{1}{a}$: in this case

$$k_{\min} = \frac{1}{ab} \to 1, \quad k_{\max} = ab \to 1, \quad \phi_{\min} = -\phi_{\max}, \quad \phi_{\max} \to 2\phi,$$

i.e., very large phase margin but arbitrarily small gain margin.

The open-loop frequency response of the system is shown in Figure 9.13 for $a = 2$ and $b = 1$, $b = 1.9$, and $b = 0.55$, respectively.

Sometimes, gain margin and phase margin together may still not be enough to indicate the true robustness of a system. For example, it is possible to construct a (complicated) controller such that

$$k_{\min} < \frac{1}{a}, \quad k_{\max} > a, \quad \phi_{\min} = -\phi_{\max}, \quad \phi_{\max} > \phi,$$

but the Nyquist plot is arbitrarily close to $(-1, 0)$. Such a controller is complicated to construct; however, the following controller should give the reader a good idea of its construction:

$$K_{bad} = \frac{s+3.3}{3.3s+1} \cdot \frac{s+0.55}{0.55s+1} \cdot \frac{1.7s^2+1.5s+1}{s^2+1.5s+1.7}.$$

The open-loop frequency response of the system with this controller is also shown in Figure 9.13 by the dotted line. It is easy to see that the system has at least the same gain margin and phase margin as the system with controller $K = 1$, but the Nyquist plot is closer to $(-1, 0)$. Therefore this system is less robust with respect to simultaneous gain and phase perturbations. The problem is that the gain margin and phase margin do not give the correct indication of the system's robustness when the gain and phase are varying simultaneously.
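The limits in the three cases above are easy to confirm numerically. The sketch below (our own illustration, not part of the book; the helper names are ours) forms the closed-loop characteristic polynomial of $1 + kL$ and of $1 + e^{-j\phi}L$ directly from $L = PK$ and scans for the stable ranges:

```python
import numpy as np

def stable(coeffs):
    # all closed-loop characteristic roots in the open left-half plane?
    return bool(np.all(np.roots(coeffs).real < 0))

def gain_margins(a, b, ks):
    # 1 + k L = 0 with L = (a-s)(b+s)/[(as-1)(bs+1)] gives
    # (ab - k) s^2 + (1+k)(a-b) s + (k ab - 1) = 0
    ok = [k for k in ks if stable([a*b - k, (1 + k)*(a - b), k*a*b - 1])]
    return min(ok), max(ok)

def phase_margin(a, b, phis):
    # 1 + e^{-j phi} L = 0 gives the same quadratic with complex coefficients
    ok = [p for p in phis if stable([a*b - np.exp(-1j*p),
                                     (a - b)*(1 + np.exp(-1j*p)),
                                     a*b*np.exp(-1j*p) - 1])]
    return max(ok)

a = 2.0
kmin, kmax = gain_margins(a, 1.0, np.linspace(0.01, 10, 2000))
phis = np.linspace(0.0, np.pi - 0.01, 2000)
pm1 = phase_margin(a, 1.0, phis)   # case (i): about asin(3/5) ~ 0.64 rad
pm2 = phase_margin(a, 1.9, phis)   # case (ii), b near a: tiny phase margin
print(kmin, kmax, pm1, pm2)
```

With $a = 2$ and $b = 1$ the scan recovers $k_{\min} \approx 1/a$, $k_{\max} \approx a$, and $\phi_{\max} \approx \sin^{-1}(3/5)$, while $b = 1.9$ keeps a large gain margin but collapses the phase margin, exactly as case (ii) predicts.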

9.6 Deficiency of Classical Control for MIMO Systems

In this section, we show through an example that classical control theory may not be reliable when it is applied to MIMO system design.

Consider a symmetric spinning body with torque inputs, $T_1$ and $T_2$, along two orthogonal transverse axes, $x$ and $y$, as shown in Figure 9.14. Assume that the angular velocity of the spinning body with respect to the $z$ axis is a constant, $\Omega$. Assume further that the inertias of the spinning body with respect to the $x$, $y$, and $z$ axes are $I_1$, $I_2 = I_1$, and $I_3$, respectively. Denote by $\omega_1$ and $\omega_2$ the angular velocities of the body with respect to the $x$ and $y$ axes, respectively. Then Euler's equations of the spinning body are given by

$$I_1\dot{\omega}_1 - \Omega\,\omega_2(I_1 - I_3) = T_1$$
$$I_1\dot{\omega}_2 - \Omega\,\omega_1(I_3 - I_1) = T_2.$$

Figure 9.13: Nyquist Plots of $L$ with $a = 2$ and $b = 1$ (solid), $b = 1.9$ (dashed), $b = 0.55$ (dashdot), and with $K_{bad}$ (dotted)

Define

$$\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} := \begin{bmatrix} T_1/I_1 \\ T_2/I_1 \end{bmatrix}, \quad a := \Omega(1 - I_3/I_1).$$

Then the system dynamical equations can be written as

$$\begin{bmatrix} \dot{\omega}_1 \\ \dot{\omega}_2 \end{bmatrix} = \begin{bmatrix} 0 & a \\ -a & 0 \end{bmatrix}\begin{bmatrix} \omega_1 \\ \omega_2 \end{bmatrix} + \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}.$$

Now suppose that the angular rates $\omega_1$ and $\omega_2$ are measured in scaled and rotated coordinates:

$$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \frac{1}{\cos\theta}\begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} \omega_1 \\ \omega_2 \end{bmatrix} = \begin{bmatrix} 1 & a \\ -a & 1 \end{bmatrix}\begin{bmatrix} \omega_1 \\ \omega_2 \end{bmatrix}$$

where $\tan\theta := a$. (There is no specific physical meaning for the measurements $y_1$ and $y_2$; they are assumed here only for the convenience of discussion.) Then the transfer matrix for the spinning body can be computed as

$$Y(s) = P(s)U(s)$$

with

$$P(s) = \frac{1}{s^2+a^2}\begin{bmatrix} s-a^2 & a(s+1) \\ -a(s+1) & s-a^2 \end{bmatrix}.$$

Figure 9.14: Spinning Body

Suppose the control law is chosen as

$$u = K_1r - y$$

where

$$K_1 = \frac{1}{1+a^2}\begin{bmatrix} 1 & -a \\ a & 1 \end{bmatrix}.$$

Figure 9.15: Closed-loop with a "Bizarre" Controller

Then the closed-loop transfer function is given by

$$Y(s) = \frac{1}{s+1}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}R(s)$$

and the sensitivity function and the complementary sensitivity function are given by

$$S = (I+P)^{-1} = \frac{1}{s+1}\begin{bmatrix} s & -a \\ a & s \end{bmatrix}, \quad T = P(I+P)^{-1} = \frac{1}{s+1}\begin{bmatrix} 1 & a \\ -a & 1 \end{bmatrix}.$$

It is noted that this controller design has the property of decoupling the loops. Furthermore, each single loop has the open-loop transfer function

$$\frac{1}{s},$$

so each loop has phase margin $\phi_{\max} = -\phi_{\min} = 90^\circ$ and gain margin $k_{\min} = 0$, $k_{\max} = \infty$.

Suppose one loop transfer function is perturbed, as shown in Figure 9.16.

Figure 9.16: One-Loop-at-a-Time Analysis

Denote

$$\frac{z(s)}{w(s)} = -T_{11} = -\frac{1}{s+1}.$$

Then the maximum allowable perturbation is given by

$$\|\delta\|_\infty < \frac{1}{\|T_{11}\|_\infty} = 1,$$

which is independent of $a$. Similarly, the maximum allowable perturbation on the other loop is also 1, by symmetry. However, if both loops are perturbed at the same time, then the maximum allowable perturbation is much smaller, as shown below.

Consider a multivariable perturbation, as shown in Figure 9.17, i.e., $P_\Delta = (I+\Delta)P$, with

$$\Delta = \begin{bmatrix} \delta_{11} & \delta_{12} \\ \delta_{21} & \delta_{22} \end{bmatrix} \in \mathcal{RH}_\infty,$$

Figure 9.17: Simultaneous Perturbations

a $2\times 2$ transfer matrix such that $\|\Delta\|_\infty < \gamma$. Then by the small gain theorem, the system is robustly stable for every such $\Delta$ iff

$$\gamma \le \frac{1}{\|T\|_\infty} = \frac{1}{\sqrt{1+a^2}} \quad (\ll 1 \text{ if } a \gg 1).$$

In particular, consider

$$\Delta = \Delta_d = \begin{bmatrix} \delta_{11} & 0 \\ 0 & \delta_{22} \end{bmatrix} \in \mathbb{R}^{2\times 2}.$$

Then the closed-loop system is stable for every such $\Delta$ iff

$$\det(I + T\Delta_d) = \frac{1}{(s+1)^2}\left[s^2 + (2+\delta_{11}+\delta_{22})s + 1 + \delta_{11} + \delta_{22} + (1+a^2)\delta_{11}\delta_{22}\right]$$

has no zero in the closed right-half plane. Hence the stability region is given by

$$2 + \delta_{11} + \delta_{22} > 0$$
$$1 + \delta_{11} + \delta_{22} + (1+a^2)\delta_{11}\delta_{22} > 0.$$

It is easy to see that the system is unstable with

$$\delta_{11} = -\delta_{22} = \frac{1}{\sqrt{1+a^2}}.$$
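A quick numerical illustration (ours, not the book's): the bracketed polynomial above gives the closed-loop characteristic zeros under the diagonal perturbation, so one can verify that each loop individually tolerates perturbations up to 1 while a much smaller simultaneous perturbation is destabilizing.

```python
import numpy as np

def diag_pert_stable(a, d11, d22):
    # zeros of s^2 + (2 + d11 + d22)s + 1 + d11 + d22 + (1 + a^2) d11 d22,
    # the numerator of det(I + T Delta_d) derived above
    r = np.roots([1.0, 2 + d11 + d22, 1 + d11 + d22 + (1 + a**2)*d11*d22])
    return bool(np.all(r.real < 0))

a = 10.0
d = 1.0/np.sqrt(1 + a**2)                  # ~ 0.0995, far below the one-loop bound 1
assert diag_pert_stable(a, 0.9, 0.0)       # one loop perturbed by 0.9: still stable
assert diag_pert_stable(a, 0.0, -0.9)
assert not diag_pert_stable(a, 1.01*d, -1.01*d)   # both loops perturbed: unstable
print("destabilizing simultaneous perturbation size:", d)
```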

Figure 9.18: Stability Region for $a = 1$

The stability region is drawn in Figure 9.18. This clearly shows that the analysis of a MIMO system using SISO methods can be misleading and can even give erroneous results. Hence a MIMO method has to be used.

9.7 Notes and References

The small gain theorem was first presented by Zames [1966]. The book by Desoer and Vidyasagar [1975] contains a quite extensive treatment and applications of this theorem in various forms. Robust stability conditions under various uncertainty assumptions are discussed in Doyle, Wall, and Stein [1982]. It was first shown in Kishore and Pearson [1992] that the small gain condition may not be necessary for robust stability with closed-ball perturbed uncertainties. In the same reference, some extensions of stability and performance criteria under various structured/unstructured uncertainties are given. Some further extensions are also presented in Tits and Fan [1994].

10 Linear Fractional Transformation

This chapter introduces a new matrix function: the linear fractional transformation (LFT). We show that many interesting control problems can be formulated in an LFT framework and thus can be treated using the same technique.

10.1 Linear Fractional Transformations

This section introduces the matrix linear fractional transformations. It is well known from one-complex-variable function theory that a mapping $F: \mathbb{C} \to \mathbb{C}$ of the form

$$F(s) = \frac{a+bs}{c+ds}$$

with $a, b, c, d \in \mathbb{C}$ is called a linear fractional transformation. In particular, if $c \neq 0$ then $F(s)$ can also be written as

$$F(s) = \alpha + \beta s(1-\gamma s)^{-1}$$

for some $\alpha, \beta, \gamma \in \mathbb{C}$. The linear fractional transformation described above for scalars can be generalized to the matrix case.

Definition 10.1 Let $M$ be a complex matrix partitioned as

$$M = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix} \in \mathbb{C}^{(p_1+p_2)\times(q_1+q_2)},$$

and let $\Delta_\ell \in \mathbb{C}^{q_2\times p_2}$ and $\Delta_u \in \mathbb{C}^{q_1\times p_1}$ be two other complex matrices. Then we can formally define a lower LFT with respect to $\Delta_\ell$ as the map

$$F_\ell(M, \bullet): \mathbb{C}^{q_2\times p_2} \to \mathbb{C}^{p_1\times q_1}$$

with

$$F_\ell(M, \Delta_\ell) := M_{11} + M_{12}\Delta_\ell(I - M_{22}\Delta_\ell)^{-1}M_{21},$$

provided that the inverse $(I - M_{22}\Delta_\ell)^{-1}$ exists. We can also define an upper LFT with respect to $\Delta_u$ as

$$F_u(M, \bullet): \mathbb{C}^{q_1\times p_1} \to \mathbb{C}^{p_2\times q_2}$$

with

$$F_u(M, \Delta_u) = M_{22} + M_{21}\Delta_u(I - M_{11}\Delta_u)^{-1}M_{12},$$

provided that the inverse $(I - M_{11}\Delta_u)^{-1}$ exists.

The matrix $M$ in the above LFTs is called the coefficient matrix. The motivation for the terminologies of lower and upper LFTs should be clear from the following diagram representations of $F_\ell(M, \Delta_\ell)$ and $F_u(M, \Delta_u)$:

[Diagrams: on the left, $M$ with its lower loop closed by $\Delta_\ell$ (signals $w_1 \to z_1$, $y_1 \to u_1$); on the right, $M$ with its upper loop closed by $\Delta_u$ (signals $w_2 \to z_2$, $y_2 \to u_2$)]

The diagram on the left represents the following set of equations:

$$\begin{bmatrix} z_1 \\ y_1 \end{bmatrix} = M\begin{bmatrix} w_1 \\ u_1 \end{bmatrix} = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix}\begin{bmatrix} w_1 \\ u_1 \end{bmatrix}, \quad u_1 = \Delta_\ell y_1,$$

while the diagram on the right represents

$$\begin{bmatrix} y_2 \\ z_2 \end{bmatrix} = M\begin{bmatrix} u_2 \\ w_2 \end{bmatrix} = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix}\begin{bmatrix} u_2 \\ w_2 \end{bmatrix}, \quad u_2 = \Delta_u y_2.$$

It is easy to verify that the mapping defined by the left diagram equals $F_\ell(M, \Delta_\ell)$ and the mapping defined by the right diagram equals $F_u(M, \Delta_u)$. So from the above diagrams, $F_\ell(M, \Delta_\ell)$ is a transformation obtained from closing the lower loop on the left diagram; similarly, $F_u(M, \Delta_u)$ is a transformation obtained from closing the upper loop on the right diagram. In most cases, we will use the general term LFT in referring to both upper and lower LFTs and assume that the context will distinguish the situations. The reason for this is that one can use either of these notations to express a given object. Indeed, it is clear that $F_u(N, \Delta) = F_\ell(M, \Delta)$ with

$$N = \begin{bmatrix} M_{22} & M_{21} \\ M_{12} & M_{11} \end{bmatrix}.$$

It is usually not crucial which expression is used; however, it is often the case that one expression is more convenient than the other for a given problem. It should also be clear to the reader that in writing $F_\ell(M, \Delta)$ (or $F_u(M, \Delta)$) it is implied that $\Delta$ has compatible dimensions.
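The identity $F_u(N, \Delta) = F_\ell(M, \Delta)$ is easy to check numerically. Below is a small numpy sketch (the helper names are ours, not the book's) implementing both LFT formulas from Definition 10.1:

```python
import numpy as np

def f_lower(M, Delta, p1, q1):
    # F_l(M, Delta) = M11 + M12 Delta (I - M22 Delta)^{-1} M21
    M11, M12, M21, M22 = M[:p1, :q1], M[:p1, q1:], M[p1:, :q1], M[p1:, q1:]
    return M11 + M12 @ Delta @ np.linalg.inv(np.eye(M22.shape[0]) - M22 @ Delta) @ M21

def f_upper(M, Delta, p1, q1):
    # F_u(M, Delta) = M22 + M21 Delta (I - M11 Delta)^{-1} M12
    M11, M12, M21, M22 = M[:p1, :q1], M[:p1, q1:], M[p1:, :q1], M[p1:, q1:]
    return M22 + M21 @ Delta @ np.linalg.inv(np.eye(M11.shape[0]) - M11 @ Delta) @ M12

rng = np.random.default_rng(0)
p1 = q1 = p2 = q2 = 2
M = rng.standard_normal((p1 + p2, q1 + q2))
D = 0.1*rng.standard_normal((q2, p2))        # Delta_l; kept small so I - M22 D is invertible
# N swaps the blocks of M: F_u(N, Delta) = F_l(M, Delta)
N = np.block([[M[p1:, q1:], M[p1:, :q1]],
              [M[:p1, q1:], M[:p1, :q1]]])
assert np.allclose(f_upper(N, D, p2, q2), f_lower(M, D, p1, q1))
print("F_u(N, Delta) = F_l(M, Delta) verified")
```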

A useful interpretation of an LFT, e.g., $F_\ell(M, \Delta)$, is that $F_\ell(M, \Delta)$ has a nominal mapping, $M_{11}$, and is perturbed by $\Delta$, while $M_{12}$, $M_{21}$, and $M_{22}$ reflect a priori knowledge as to how the perturbation affects the nominal map, $M_{11}$. A similar interpretation can be applied to $F_u(M, \Delta)$. This is why the LFT is particularly useful in the study of perturbations, which is the focus of the next chapter.

The physical meaning of an LFT in control science is obvious if we take $M$ as a proper transfer matrix. In that case, the LFTs defined above are simply the closed-loop transfer matrices from $w_1 \mapsto z_1$ and $w_2 \mapsto z_2$, respectively, i.e.,

$$T_{z_1w_1} = F_\ell(M, \Delta_\ell), \quad T_{z_2w_2} = F_u(M, \Delta_u),$$

where $M$ may be the controlled plant and $\Delta$ may be either the system model uncertainties or the controllers.

Definition 10.2 An LFT, $F_\ell(M, \Delta)$, is said to be well defined (or well-posed) if $(I - M_{22}\Delta)$ is invertible.

Note that this definition is consistent with the well-posedness definition of the feedback system, which requires the corresponding transfer matrix to be invertible in $\mathcal{R}_p(s)$. It is clear that the study of an LFT that is not well defined is meaningless; hence throughout the book, whenever an LFT is invoked, it will be assumed implicitly to be well defined. It is also clear from the definition that, for any $M$, $F_\ell(M, 0)$ is well defined; hence any function that is not well defined at the origin cannot be expressed as an LFT in terms of its variables. For example, $f(\delta) = 1/\delta$ is not an LFT of $\delta$.

In some literature, LFT is used to refer to the following matrix functions:

$$(A+BQ)(C+DQ)^{-1} \quad \text{or} \quad (C+QD)^{-1}(A+QB),$$

where $C$ is usually assumed to be invertible due to practical considerations. The following is true.

Lemma 10.1 Suppose $C$ is invertible. Then

$$(A+BQ)(C+DQ)^{-1} = F_\ell(M, Q),$$
$$(C+QD)^{-1}(A+QB) = F_\ell(N, Q)$$

with

$$M = \begin{bmatrix} AC^{-1} & B - AC^{-1}D \\ C^{-1} & -C^{-1}D \end{bmatrix}, \quad N = \begin{bmatrix} C^{-1}A & C^{-1} \\ B - DC^{-1}A & -DC^{-1} \end{bmatrix}.$$
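A numerical spot check of the first formula in Lemma 10.1 (our own sketch with randomly generated matrices; the scaling just keeps $C$ and $C + DQ$ well conditioned):

```python
import numpy as np

def f_lower(M, Delta, p1, q1):
    M11, M12, M21, M22 = M[:p1, :q1], M[:p1, q1:], M[p1:, :q1], M[p1:, q1:]
    return M11 + M12 @ Delta @ np.linalg.inv(np.eye(M22.shape[0]) - M22 @ Delta) @ M21

rng = np.random.default_rng(1)
n = 3
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
C = np.eye(n) + 0.1*rng.standard_normal((n, n))
D = 0.3*rng.standard_normal((n, n))
Q = 0.3*rng.standard_normal((n, n))
Ci = np.linalg.inv(C)
M = np.block([[A @ Ci, B - A @ Ci @ D],
              [Ci,     -Ci @ D]])
lhs = (A + B @ Q) @ np.linalg.inv(C + D @ Q)
assert np.allclose(lhs, f_lower(M, Q, n, n))
print("Lemma 10.1 coefficient matrix verified")
```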

The converse also holds if $M$ satisfies certain conditions.

Lemma 10.2 Let $F_\ell(M, Q)$ be a given LFT with $M = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix}$. Then

(a) if $M_{21}$ is invertible,

$$F_\ell(M, Q) = (A+BQ)(C+DQ)^{-1}$$

with

$$A = M_{11}M_{21}^{-1}, \quad B = M_{12} - M_{11}M_{21}^{-1}M_{22}, \quad C = M_{21}^{-1}, \quad D = -M_{21}^{-1}M_{22};$$

(b) if $M_{12}$ is invertible,

$$F_\ell(M, Q) = (C+QD)^{-1}(A+QB)$$

with

$$A = M_{12}^{-1}M_{11}, \quad B = M_{21} - M_{22}M_{12}^{-1}M_{11}, \quad C = M_{12}^{-1}, \quad D = -M_{22}M_{12}^{-1}.$$

However, for an arbitrary LFT $F_\ell(M, Q)$, neither $M_{21}$ nor $M_{12}$ is necessarily square and invertible; therefore, the alternative fractional formula is more restrictive.

It should be pointed out that some seemingly similar functions do not have simple LFT representations. For example,

$$(A+QB)(I+QD)^{-1}$$

cannot always be written in the form of $F_\ell(M, Q)$ for some $M$; however, it can be written as

$$(A+QB)(I+QD)^{-1} = F_\ell(N, \Delta)$$

with

$$N = \begin{bmatrix} A & I & A \\ B & 0 & B \\ -D & 0 & -D \end{bmatrix}, \quad \Delta = \begin{bmatrix} Q & 0 \\ 0 & Q \end{bmatrix}.$$

Note that the dimension of $\Delta$ is twice that of $Q$. The following lemma summarizes some of the algebraic properties of LFTs.

Lemma 10.3 Let $M$, $Q$, and $G$ be suitably partitioned matrices:

$$M = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix}, \quad Q = \begin{bmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{bmatrix}, \quad G = \begin{bmatrix} A & B_1 & B_2 \\ C_1 & D_{11} & D_{12} \\ C_2 & D_{21} & D_{22} \end{bmatrix}.$$

(i) $F_u(M, \Delta) = F_\ell(N, \Delta)$ with

$$N = \begin{bmatrix} 0 & I \\ I & 0 \end{bmatrix}M\begin{bmatrix} 0 & I \\ I & 0 \end{bmatrix} = \begin{bmatrix} M_{22} & M_{21} \\ M_{12} & M_{11} \end{bmatrix},$$

where the dimensions of the identity matrices are compatible with the partitions of $M$ and $N$.

(ii) Suppose $F_u(M, \Delta)$ is square and well defined and $M_{22}$ is nonsingular. Then the inverse of $F_u(M, \Delta)$ exists and is also an LFT with respect to $\Delta$:

$$(F_u(M, \Delta))^{-1} = F_u(N, \Delta)$$

with $N$ given by

$$N = \begin{bmatrix} M_{11} - M_{12}M_{22}^{-1}M_{21} & -M_{12}M_{22}^{-1} \\ M_{22}^{-1}M_{21} & M_{22}^{-1} \end{bmatrix}.$$

(iii) $F_u(M, \Delta_1) + F_u(Q, \Delta_2) = F_u(N, \Delta)$ with

$$N = \begin{bmatrix} M_{11} & 0 & M_{12} \\ 0 & Q_{11} & Q_{12} \\ M_{21} & Q_{21} & M_{22}+Q_{22} \end{bmatrix}, \quad \Delta = \begin{bmatrix} \Delta_1 & 0 \\ 0 & \Delta_2 \end{bmatrix}.$$

(iv) $F_u(M, \Delta_1)\,F_u(Q, \Delta_2) = F_u(N, \Delta)$ with

$$N = \begin{bmatrix} M_{11} & M_{12}Q_{21} & M_{12}Q_{22} \\ 0 & Q_{11} & Q_{12} \\ M_{21} & M_{22}Q_{21} & M_{22}Q_{22} \end{bmatrix}, \quad \Delta = \begin{bmatrix} \Delta_1 & 0 \\ 0 & \Delta_2 \end{bmatrix}.$$

(v) Consider the following interconnection structure, where the dimensions of $\Delta_1$ are compatible with $A$:

[Diagram: $F_u(G, \Delta_1)$ in feedback with $F_u(Q, \Delta_2)$, with exogenous input $w$ and output $z$]

Then the mapping from $w$ to $z$ is given by

$$F_\ell(F_u(G, \Delta_1), F_u(Q, \Delta_2)) = F_u(F_\ell(G, F_u(Q, \Delta_2)), \Delta_1) = F_u(N, \Delta)$$

where

$$N = \begin{bmatrix} A+B_2Q_{22}L_1C_2 & B_2L_2Q_{21} & B_1+B_2Q_{22}L_1D_{21} \\ Q_{12}L_1C_2 & Q_{11}+Q_{12}L_1D_{22}Q_{21} & Q_{12}L_1D_{21} \\ C_1+D_{12}L_2Q_{22}C_2 & D_{12}L_2Q_{21} & D_{11}+D_{12}Q_{22}L_1D_{21} \end{bmatrix}$$

with $L_1 := (I - D_{22}Q_{22})^{-1}$, $L_2 := (I - Q_{22}D_{22})^{-1}$, and $\Delta = \begin{bmatrix} \Delta_1 & 0 \\ 0 & \Delta_2 \end{bmatrix}$.

Proof. These properties can be verified straightforwardly from the definition of LFT, so the proofs are omitted. □
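Since the proofs are omitted, properties (iii) and (iv) can at least be spot-checked numerically (our own sketch; random matrices, with small $\Delta_i$ so the inverses involved exist):

```python
import numpy as np

def f_upper(M, Delta, p1, q1):
    M11, M12, M21, M22 = M[:p1, :q1], M[:p1, q1:], M[p1:, :q1], M[p1:, q1:]
    return M22 + M21 @ Delta @ np.linalg.inv(np.eye(M11.shape[0]) - M11 @ Delta) @ M12

rng = np.random.default_rng(2)
k = 2
M = rng.standard_normal((2*k, 2*k)); Q = rng.standard_normal((2*k, 2*k))
D1 = 0.1*rng.standard_normal((k, k)); D2 = 0.1*rng.standard_normal((k, k))
Z = np.zeros((k, k))
DD = np.block([[D1, Z], [Z, D2]])
M11, M12, M21, M22 = M[:k, :k], M[:k, k:], M[k:, :k], M[k:, k:]
Q11, Q12, Q21, Q22 = Q[:k, :k], Q[:k, k:], Q[k:, :k], Q[k:, k:]

# (iii): sum of upper LFTs
Ns = np.block([[M11, Z, M12], [Z, Q11, Q12], [M21, Q21, M22 + Q22]])
assert np.allclose(f_upper(M, D1, k, k) + f_upper(Q, D2, k, k),
                   f_upper(Ns, DD, 2*k, 2*k))

# (iv): product of upper LFTs
Np = np.block([[M11, M12 @ Q21, M12 @ Q22], [Z, Q11, Q12], [M21, M22 @ Q21, M22 @ Q22]])
assert np.allclose(f_upper(M, D1, k, k) @ f_upper(Q, D2, k, k),
                   f_upper(Np, DD, 2*k, 2*k))
print("properties (iii) and (iv) verified")
```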

Property (v) shows that if the open-loop system parameters are LFTs of some variables, then the closed-loop system parameters are also LFTs of the same variables. This property is particularly useful in perturbation analysis and in building the interconnection structure. Similar results can be stated for lower LFTs. It is usually convenient to interpret an LFT $F_u(M, \Delta)$ as a state space realization of a generalized system with frequency structure $\Delta$. In fact, all the above properties can be reduced to the standard transfer matrix operations if $\Delta = \frac{1}{s}I$.

The following lemma is concerned with the algebraic properties of LFTs in the general control setup.

Lemma 10.4 Let $P = \begin{bmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{bmatrix}$ and $K$ be rational transfer function matrices and let $G = F_\ell(P, K)$. Then

(a) $G$ is proper if $P$ and $K$ are proper with $\det(I - P_{22}K)(\infty) \neq 0$;

(b) $F_\ell(P, K_1) = F_\ell(P, K_2)$ implies $K_1 = K_2$ if $P_{12}$ and $P_{21}$ have normal full column and row rank in $\mathcal{R}_p(s)$, respectively;

(c) if $P$ and $G$ are proper, $\det P(\infty) \neq 0$, $\det\left(P - \begin{bmatrix} G & 0 \\ 0 & 0 \end{bmatrix}\right)(\infty) \neq 0$, and $P_{12}$ and $P_{21}$ are square and invertible for almost all $s$, then $K$ is proper and

$$K = F_u(P^{-1}, G).$$

Proof.

(a) is immediate from the definition of $F_\ell(P, K)$ (or the well-posedness condition).

(b) follows from the identity

$$F_\ell(P, K_1) - F_\ell(P, K_2) = P_{12}(I - K_2P_{22})^{-1}(K_1 - K_2)(I - P_{22}K_1)^{-1}P_{21}.$$

(c) It is sufficient to show that the formula for $K$ is well-posed and that the $K$ thus obtained is proper. Let $Q = P^{-1}$, which will be proper since $\det P(\infty) \neq 0$, and define

$$K = F_u(Q, G) = Q_{22} + Q_{21}G(I - Q_{11}G)^{-1}Q_{12}.$$

This expression is well-posed and proper since at $s = \infty$

$$\det(I - Q_{11}G) = \det\left(I - \begin{bmatrix} I & 0 \end{bmatrix}P^{-1}\begin{bmatrix} I \\ 0 \end{bmatrix}G\right) = \det\left(P^{-1}\left(P - \begin{bmatrix} G & 0 \\ 0 & 0 \end{bmatrix}\right)\right) \neq 0.$$

We also need to ensure that $F_\ell(P, K)$ is well-posed:

$$I - P_{22}K = (I - P_{22}Q_{22}) - P_{22}Q_{21}G(I - Q_{11}G)^{-1}Q_{12}$$
$$= P_{21}Q_{12} + P_{21}Q_{11}G(I - Q_{11}G)^{-1}Q_{12}$$
$$= P_{21}(I - Q_{11}G)^{-1}Q_{12}.$$

The above form is obtained by using the fact that $PQ = I$. Then $\det(I - P_{22}K) \neq 0$ since $P_{21}^{-1}$ exists and $Q_{12}^{-1} = P_{21} - P_{22}P_{12}^{-1}P_{11}$. Hence the LFTs are both well-posed, and we immediately obtain that $F_\ell(P, K) = G$, as required, upon substituting for $K$ and $(I - P_{22}K)$ as shown above. □

Remark 10.1 This lemma shows that under certain conditions, an LFT of transfer matrices is a bijective map between two sets of proper real rational matrices. Given proper transfer matrices $P$ and $G$ with compatible dimensions which satisfy the conditions in (c), there exists a unique proper $K$ such that $G = F_\ell(P, K)$. On the other hand, the conditions of part (c) show that the feedback systems are well-posed.

Remark 10.2 A simple interpretation of the result in (c) is given by considering the signals in the feedback system

[Diagram: $P$ with inputs $(w, u)$ and outputs $(z, y)$, and $K$ closing the loop from $y$ to $u$]

assuming this structure is well-posed. We have

$$\begin{bmatrix} z \\ y \end{bmatrix} = P\begin{bmatrix} w \\ u \end{bmatrix}, \quad u = Ky \quad \Longrightarrow \quad z = F_\ell(P, K)w = Gw;$$

hence

$$\begin{bmatrix} w \\ u \end{bmatrix} = P^{-1}\begin{bmatrix} z \\ y \end{bmatrix}, \quad z = Gw \quad \Longrightarrow \quad u = F_u(P^{-1}, G)y, \quad \text{or} \quad K = F_u(P^{-1}, G).$$

10.2 Examples of LFTs

The LFT is a very convenient tool for formulating many mathematical objects. In this section and the sections to follow, some commonly encountered control or mathematical objects are given new perspectives, i.e., they will be examined from the LFT point of view.

Polynomials

A very commonly encountered object in control and mathematics is a polynomial function, for example,

$$p(\delta) = a_0 + a_1\delta + \cdots + a_n\delta^n$$

with indeterminate $\delta$. It is easy to verify that $p(\delta)$ can be written in the following LFT form:

$$p(\delta) = F_\ell(M, \delta I_n)$$

with

$$M = \begin{bmatrix} a_0 & a_1 & \cdots & a_{n-1} & a_n \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{bmatrix}.$$
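A direct check of the polynomial LFT formula (our own sketch): build $M$ for a cubic and compare $F_\ell(M, \delta I_n)$ with direct evaluation of the polynomial.

```python
import numpy as np

def f_lower(M, Delta, p1, q1):
    M11, M12, M21, M22 = M[:p1, :q1], M[:p1, q1:], M[p1:, :q1], M[p1:, q1:]
    return M11 + M12 @ Delta @ np.linalg.inv(np.eye(M22.shape[0]) - M22 @ Delta) @ M21

a = np.array([2.0, -1.0, 0.5, 3.0])     # a0 + a1 d + a2 d^2 + a3 d^3, so n = 3
n = len(a) - 1
M = np.zeros((n + 1, n + 1))
M[0, :] = a                              # first row: a0 a1 ... an
M[1:, :-1] = np.eye(n)                   # subdiagonal shift structure below
for d in (-0.7, 0.3, 1.2):
    val = f_lower(M, d*np.eye(n), 1, 1)
    assert np.allclose(val, np.polyval(a[::-1], d))
print("p(delta) = F_l(M, delta*I_n) verified")
```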

Hence every polynomial is a linear fraction of its indeterminate. More generally, any multivariate (matrix) polynomial is also an LFT in its indeterminates; for example,

$$p(\delta_1, \delta_2) = a_1\delta_1^2 + a_2\delta_2^2 + a_3\delta_1\delta_2 + a_4\delta_1 + a_5\delta_2 + a_6.$$

Then

$$p(\delta_1, \delta_2) = F_\ell(N, \Delta)$$

with

$$N = \begin{bmatrix} a_6 & 1 & 0 & 1 & 0 \\ a_4 & 0 & a_1 & 0 & a_3 \\ 1 & 0 & 0 & 0 & 0 \\ a_5 & 0 & 0 & 0 & a_2 \\ 1 & 0 & 0 & 0 & 0 \end{bmatrix}, \quad \Delta = \begin{bmatrix} \delta_1I_2 & 0 \\ 0 & \delta_2I_2 \end{bmatrix}.$$

It should be noted that these representations or realizations of polynomials are neither unique nor necessarily minimal. Here a minimal realization refers to a realization with the smallest possible dimension of $\Delta$. As is commonly known in multidimensional systems and filter theory, it is usually very hard, if not impossible, to find a minimal realization for even a two-variable polynomial. In fact, the minimal dimension of $\Delta$ depends also on the field (real, complex, etc.) of the realization. More detailed discussion of this issue is beyond the scope of this book; the interested reader should consult the references on 2-D or n-D systems and filter theory.

Rational Functions

As another example of LFT representation, we consider a rational matrix function (not necessarily proper), $F(\delta_1, \delta_2, \ldots, \delta_m)$, with a finite value at the origin: $F(0, 0, \ldots, 0)$ is finite. Then $F(\delta_1, \delta_2, \ldots, \delta_m)$ can be written as an LFT in $(\delta_1, \delta_2, \ldots, \delta_m)$ (some $\delta_i$ may be repeated). To see that, write

$$F(\delta_1, \ldots, \delta_m) = \frac{N(\delta_1, \ldots, \delta_m)}{d(\delta_1, \ldots, \delta_m)} = N(\delta_1, \ldots, \delta_m)\left(d(\delta_1, \ldots, \delta_m)I\right)^{-1},$$

where $N(\delta_1, \ldots, \delta_m)$ is a multivariate matrix polynomial and $d(\delta_1, \ldots, \delta_m)$ is a scalar multivariate polynomial with $d(0, 0, \ldots, 0) \neq 0$. Both $N$ and $dI$ can be represented as LFTs, and, furthermore, since $d(0, 0, \ldots, 0) \neq 0$, the inverse of $dI$ is also an LFT, as shown in Lemma 10.3. Now the conclusion follows from the fact that the product of LFTs is also an LFT. (Of course, the above LFT representation problem is exactly the problem of state space realization for a multidimensional discrete transfer matrix.) However, this is usually not a good way to get an LFT representation for a rational matrix, since this approach usually results in a much higher-dimensioned $\Delta$ than required. For example,

$$f(\delta) = \frac{\alpha + \beta\delta}{1 + \gamma\delta} = F_\ell(M, \delta)$$

with

$$M = \begin{bmatrix} \alpha & \beta - \alpha\gamma \\ 1 & -\gamma \end{bmatrix}.$$

By using the above approach, we would end up with

$$f(\delta) = F_\ell(N, \delta I_2)$$

and

$$N = \begin{bmatrix} \alpha & \beta & -\alpha\gamma \\ 1 & 0 & -\gamma \\ 1 & 0 & -\gamma \end{bmatrix}.$$

Although the latter can be reduced to the former, it is not easy to see how to carry out such a reduction for a complicated problem, even if it is possible.

State Space Realizations

We can use the LFT formulae to establish the relationship between transfer matrices and their state space realizations. A system with a state space realization

$$\dot{x} = Ax + Bu, \quad y = Cx + Du$$

has the transfer matrix

$$G(s) = D + C(sI-A)^{-1}B = F_u\left(\begin{bmatrix} A & B \\ C & D \end{bmatrix}, \frac{1}{s}I\right).$$
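This correspondence is easy to verify at a test frequency (our own sketch with random state-space data; any $s$ that is not an eigenvalue of $A$ works):

```python
import numpy as np

def f_upper(M, Delta, p1, q1):
    M11, M12, M21, M22 = M[:p1, :q1], M[:p1, q1:], M[p1:, :q1], M[p1:, q1:]
    return M22 + M21 @ Delta @ np.linalg.inv(np.eye(M11.shape[0]) - M11 @ Delta) @ M12

rng = np.random.default_rng(3)
n, m, p = 3, 2, 2
A = rng.standard_normal((n, n)); B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n)); D = rng.standard_normal((p, m))
M = np.block([[A, B], [C, D]])
s = 1.0 + 2.0j
G = D + C @ np.linalg.inv(s*np.eye(n) - A) @ B
assert np.allclose(G, f_upper(M, (1/s)*np.eye(n), n, n))
print("G(s) = F_u([A B; C D], I/s) verified")
```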

Now take $\Delta = \frac{1}{s}I$; the transfer matrix can be written as

$$G(s) = F_u\left(\begin{bmatrix} A & B \\ C & D \end{bmatrix}, \Delta\right).$$

More generally, consider a discrete time 2-D (or MD) system realized by the first-order state space equation

$$x_1(k_1+1, k_2) = A_{11}x_1(k_1, k_2) + A_{12}x_2(k_1, k_2) + B_1u(k_1, k_2)$$
$$x_2(k_1, k_2+1) = A_{21}x_1(k_1, k_2) + A_{22}x_2(k_1, k_2) + B_2u(k_1, k_2)$$
$$y(k_1, k_2) = C_1x_1(k_1, k_2) + C_2x_2(k_1, k_2) + Du(k_1, k_2).$$

In the same way, take

$$\Delta = \begin{bmatrix} z_1^{-1}I & 0 \\ 0 & z_2^{-1}I \end{bmatrix} =: \begin{bmatrix} \delta_1I & 0 \\ 0 & \delta_2I \end{bmatrix},$$

where $z_i$ denotes the forward shift operator, and let

$$A := \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}, \quad B := \begin{bmatrix} B_1 \\ B_2 \end{bmatrix}, \quad C := \begin{bmatrix} C_1 & C_2 \end{bmatrix};$$

then its transfer matrix is

$$G(z_1, z_2) = D + C\left(\begin{bmatrix} z_1I & 0 \\ 0 & z_2I \end{bmatrix} - A\right)^{-1}B = D + C\Delta(I - A\Delta)^{-1}B =: F_u\left(\begin{bmatrix} A & B \\ C & D \end{bmatrix}, \Delta\right).$$

Both formulations correspond to the following diagram:

[Diagram: $\begin{bmatrix} A & B \\ C & D \end{bmatrix}$ with its upper loop closed by $\Delta$]

The following notation for a transfer matrix has already been adopted in the previous chapters:

$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} := F_u\left(\begin{bmatrix} A & B \\ C & D \end{bmatrix}, \Delta\right).$$

It is easy to see that this notation can be adopted for general dynamical systems, e.g., multidimensional systems, as long as the structure $\Delta$ is specified. This notation means that the transfer matrix can be expressed as an LFT of $\Delta$ with the coefficient matrix $\begin{bmatrix} A & B \\ C & D \end{bmatrix}$. In this special case, we say the parameter matrix $\Delta$ is the frequency structure of the system state space realization. This notation is deliberately somewhat ambiguous and can be viewed as both a transfer matrix and one of its realizations. The ambiguity is benign and convenient and can always be resolved from the context.

Frequency Transformation

The bilinear transformation between the z-domain and s-domain

$$s = \frac{z+1}{z-1}$$

transforms the unit disk to the left-half plane and is the simplest example of an LFT. We may rewrite it in our standard form as

$$\frac{1}{s}I = I - \sqrt{2}I\,z^{-1}I\,(I + z^{-1}I)^{-1}\sqrt{2}I = F_u(N, z^{-1}I)$$

where

$$N = \begin{bmatrix} -I & \sqrt{2}I \\ -\sqrt{2}I & I \end{bmatrix}.$$

Now consider a continuous time system

$$G(s) = \begin{bmatrix} A & B \\ C & D \end{bmatrix} = F_u\left(M, \frac{1}{s}I\right), \quad M = \begin{bmatrix} A & B \\ C & D \end{bmatrix};$$

then the corresponding discrete time system realization is given by

$$\tilde{G}(z) = F_u\left(M, \frac{z-1}{z+1}I\right) = F_u(M, F_u(N, z^{-1}I)) = F_u(\tilde{M}, z^{-1}I)$$

with

$$\tilde{M} = \begin{bmatrix} -(I-A)^{-1}(I+A) & -\sqrt{2}(I-A)^{-1}B \\ \sqrt{2}C(I-A)^{-1} & C(I-A)^{-1}B + D \end{bmatrix}.$$

The transformation from the z-domain to the s-domain can be obtained similarly.
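The bilinear-transformed realization $\tilde{M}$ can be verified pointwise: $\tilde{G}(z)$ should equal $G(s)$ at $s = (z+1)/(z-1)$ (our own numerical sketch; $A$ is scaled so $I - A$ is invertible):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3
A = 0.3*rng.standard_normal((n, n))          # keep I - A invertible
B = rng.standard_normal((n, 1))
C = rng.standard_normal((1, n))
D = rng.standard_normal((1, 1))
E = np.linalg.inv(np.eye(n) - A)
At = -E @ (np.eye(n) + A)                    # blocks of M~ from the formula above
Bt = -np.sqrt(2)*E @ B
Ct = np.sqrt(2)*C @ E
Dt = C @ E @ B + D

z = 0.5 + 0.3j
s = (z + 1)/(z - 1)
G  = D  + C  @ np.linalg.inv(s*np.eye(n) - A)  @ B
Gt = Dt + Ct @ np.linalg.inv(z*np.eye(n) - At) @ Bt
assert np.allclose(G, Gt)
print("G((z+1)/(z-1)) = G~(z) verified")
```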

Simple Block Diagrams

A feedback system with the following block diagram

[Diagram: reference through weights $W_1$, $W_2$, controller $K$, plant $P$, and feedback filter $F$, with disturbance $d$, noise $n$, and signals $u$, $u_f$, $v$, $y$]

can be rearranged as an LFT:

[Diagram: standard LFT configuration with generalized plant $G$ and controller $K$, inputs $w$, $u$ and outputs $z$, $y$]

with

$$w = \begin{bmatrix} d \\ n \end{bmatrix}, \quad z = \begin{bmatrix} v \\ u_f \end{bmatrix},$$

and

$$G = \begin{bmatrix} W_2P & 0 & W_2P \\ 0 & 0 & W_1 \\ -FP & -F & -FP \end{bmatrix}.$$

Constrained Structure Synthesis

Using the properties of LFTs, we can show that constrained structure control synthesis problems can be converted to constrained structure constant output feedback problems. Consider the synthesis structure in the last example and assume

$$G = \begin{bmatrix} A & B_1 & B_2 \\ C_1 & D_{11} & D_{12} \\ C_2 & D_{21} & D_{22} \end{bmatrix}, \quad K = \begin{bmatrix} A_K & B_K \\ C_K & D_K \end{bmatrix}.$$

Then it is easy to show that

$$F_\ell(G, K) = F_\ell(M(s), F)$$

where

$$M(s) = \begin{bmatrix} A & 0 & B_1 & 0 & B_2 \\ 0 & 0 & 0 & I & 0 \\ C_1 & 0 & D_{11} & 0 & D_{12} \\ 0 & I & 0 & 0 & 0 \\ C_2 & 0 & D_{21} & 0 & D_{22} \end{bmatrix}$$

and

$$F = \begin{bmatrix} A_K & B_K \\ C_K & D_K \end{bmatrix}.$$

Note that $F$ is a constant matrix, not a system matrix. Hence if the controller structure is fixed (or constrained), then the corresponding control problem becomes a constant (constrained) output feedback problem.

Parametric Uncertainty: A Mass/Spring/Damper System

One natural type of uncertainty is unknown coefficients in a state space model. To motivate this type of uncertainty description, we will begin with a familiar mass/spring/damper system, shown below in Figure 10.1.

[Diagram: mass $m$ driven by force $F$, attached through spring $k$ and damper $c$]

Figure 10.1: Mass/Spring/Damper System

The dynamical equation of the system motion can be described by

$$\ddot{x} + \frac{c}{m}\dot{x} + \frac{k}{m}x = \frac{F}{m}.$$

Suppose that the three physical parameters $m$, $c$, and $k$ are not known exactly, but are believed to lie in known intervals. In particular, the actual mass $m$ is within 10% of a nominal mass $\bar{m}$, the actual damping value $c$ is within 20% of a nominal value $\bar{c}$, and the spring stiffness is within 30% of its nominal value $\bar{k}$. Now introducing perturbations $\delta_m$, $\delta_c$, and $\delta_k$, which are assumed to be unknown but to lie in the interval $[-1, 1]$, the block diagram for the dynamical system is shown in Figure 10.2.

[Diagram: integrator chain $\ddot{x} \to \dot{x} \to x$ with feedback gains $\bar{c}(1+0.2\delta_c)$ and $\bar{k}(1+0.3\delta_k)$ and input gain $\frac{1}{\bar{m}(1+0.1\delta_m)}$]

Figure 10.2: Block Diagram of Mass/Spring/Damper Equation

It is easy to check that $\frac{1}{m}$ can be represented as an LFT in $\delta_m$:

$$\frac{1}{m} = \frac{1}{\bar{m}(1+0.1\delta_m)} = \frac{1}{\bar{m}} - \frac{0.1}{\bar{m}}\delta_m(1+0.1\delta_m)^{-1} = F_\ell(M_1, \delta_m)$$

with

$$M_1 = \begin{bmatrix} \frac{1}{\bar{m}} & -\frac{0.1}{\bar{m}} \\ 1 & -0.1 \end{bmatrix}.$$

Suppose that the input signals of the dynamical system are selected as $x_1 = x$, $x_2 = \dot{x}$, $F$, and the output signals are selected as $\dot{x}_1$ and $\dot{x}_2$. To represent the system model as an LFT of the natural uncertainty parameters $\delta_m$, $\delta_c$, and $\delta_k$, we shall first isolate the uncertainty parameters and denote the inputs and outputs of $\delta_k$, $\delta_c$, and $\delta_m$ as $y_k$, $y_c$, $y_m$ and $u_k$, $u_c$, $u_m$, respectively, as shown in Figure 10.3. Then

$$\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \\ y_k \\ y_c \\ y_m \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 & 0 \\ -\frac{\bar{k}}{\bar{m}} & -\frac{\bar{c}}{\bar{m}} & \frac{1}{\bar{m}} & -\frac{1}{\bar{m}} & -\frac{1}{\bar{m}} & -\frac{0.1}{\bar{m}} \\ 0.3\bar{k} & 0 & 0 & 0 & 0 & 0 \\ 0 & 0.2\bar{c} & 0 & 0 & 0 & 0 \\ -\bar{k} & -\bar{c} & 1 & -1 & -1 & -0.1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ F \\ u_k \\ u_c \\ u_m \end{bmatrix}, \quad \begin{bmatrix} u_k \\ u_c \\ u_m \end{bmatrix} = \Delta\begin{bmatrix} y_k \\ y_c \\ y_m \end{bmatrix},$$

i.e.,

$$\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix} = F_\ell(M, \Delta)\begin{bmatrix} x_1 \\ x_2 \\ F \end{bmatrix}$$

where

$$M = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 & 0 \\ -\frac{\bar{k}}{\bar{m}} & -\frac{\bar{c}}{\bar{m}} & \frac{1}{\bar{m}} & -\frac{1}{\bar{m}} & -\frac{1}{\bar{m}} & -\frac{0.1}{\bar{m}} \\ 0.3\bar{k} & 0 & 0 & 0 & 0 & 0 \\ 0 & 0.2\bar{c} & 0 & 0 & 0 & 0 \\ -\bar{k} & -\bar{c} & 1 & -1 & -1 & -0.1 \end{bmatrix}, \quad \Delta = \begin{bmatrix} \delta_k & 0 & 0 \\ 0 & \delta_c & 0 \\ 0 & 0 & \delta_m \end{bmatrix}.$$
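The scalar uncertainty embedding used above is easy to spot-check: $F_\ell(M_1, \delta_m)$ should reproduce $1/(\bar{m}(1+0.1\delta_m))$ over the whole interval $\delta_m \in [-1, 1]$ (a small sketch of ours, with an arbitrary nominal mass):

```python
import numpy as np

m_bar = 3.0                      # hypothetical nominal mass for the check
M1 = np.array([[1/m_bar, -0.1/m_bar],
               [1.0,     -0.1]])
for dm in (-1.0, -0.4, 0.0, 0.5, 1.0):
    # F_l(M1, dm) = M1[0,0] + M1[0,1] dm (1 - M1[1,1] dm)^{-1} M1[1,0]
    val = M1[0, 0] + M1[0, 1]*dm/(1 - M1[1, 1]*dm)*M1[1, 0]
    assert np.isclose(val, 1/(m_bar*(1 + 0.1*dm)))
print("F_l(M1, dm) matches 1/m on [-1, 1]")
```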

Figure 10.3: Mass/Spring/Damper System

General Affine State-Space Uncertainty

We will consider a special class of state space models with unknown coefficients and show how this type of uncertainty can be represented via the LFT formulae with respect to an uncertain parameter matrix so that the perturbations enter the system in a feedback form. This type of modeling will form the basic building block for components with parametric uncertainty.

Consider a linear system $G_\delta(s)$ that is parameterized by $k$ uncertain parameters, $\delta_1, \ldots, \delta_k$, and has the realization

$$G_\delta(s) = \begin{bmatrix} A + \sum_{i=1}^{k}\delta_iA_i & B + \sum_{i=1}^{k}\delta_iB_i \\ C + \sum_{i=1}^{k}\delta_iC_i & D + \sum_{i=1}^{k}\delta_iD_i \end{bmatrix}.$$

Here $A, A_i \in \mathbb{R}^{n\times n}$; $B, B_i \in \mathbb{R}^{n\times n_u}$; $C, C_i \in \mathbb{R}^{n_y\times n}$; and $D, D_i \in \mathbb{R}^{n_y\times n_u}$. The various terms in these state equations are interpreted as follows: the nominal system description $G(s)$ is given by the known matrices $(A, B, C, D)$, and the parametric uncertainty in the nominal system is reflected by the $k$ scalar uncertain parameters $\delta_1, \ldots, \delta_k$, which we can specify, say, by $\delta_i \in [-1, 1]$. The structural knowledge about the uncertainty is contained in the matrices $A_i$, $B_i$, $C_i$, and $D_i$; they reflect how the $i$'th uncertainty, $\delta_i$, affects the state space model.

Now we consider the problem of describing the perturbed system via the LFT formulae so that all the uncertainty can be represented as a nominal system with the unknown parameters entering it as feedback gains. This is shown in Figure 10.4. Since $G_\delta(s) = F_u(M_\delta, \frac{1}{s}I)$ where

$$M_\delta := \begin{bmatrix} A + \sum_{i=1}^{k}\delta_iA_i & B + \sum_{i=1}^{k}\delta_iB_i \\ C + \sum_{i=1}^{k}\delta_iC_i & D + \sum_{i=1}^{k}\delta_iD_i \end{bmatrix},$$

we need to find an LFT representation for the matrix $M_\delta$ with respect to

$$\Delta_p = \mathrm{diag}\{\delta_1I, \delta_2I, \ldots, \delta_kI\}.$$

To achieve this with the smallest possible size of repeated blocks, let $q_i$ denote the rank of the matrix

$$P_i := \begin{bmatrix} A_i & B_i \\ C_i & D_i \end{bmatrix} \in \mathbb{R}^{(n+n_y)\times(n+n_u)}$$

for each $i$. Then $P_i$ can be written as

$$P_i = \begin{bmatrix} L_i \\ W_i \end{bmatrix}\begin{bmatrix} R_i \\ Z_i \end{bmatrix}^*$$

where $L_i \in \mathbb{R}^{n\times q_i}$, $W_i \in \mathbb{R}^{n_y\times q_i}$, $R_i \in \mathbb{R}^{n\times q_i}$, and $Z_i \in \mathbb{R}^{n_u\times q_i}$. Hence, we have

$$\delta_iP_i = \begin{bmatrix} L_i \\ W_i \end{bmatrix}\left[\delta_iI_{q_i}\right]\begin{bmatrix} R_i \\ Z_i \end{bmatrix}^*,$$

and $M_\delta$ can be written as

$$M_\delta = \underbrace{\begin{bmatrix} A & B \\ C & D \end{bmatrix}}_{M_{11}} + \underbrace{\begin{bmatrix} L_1 & \cdots & L_k \\ W_1 & \cdots & W_k \end{bmatrix}}_{M_{12}}\underbrace{\begin{bmatrix} \delta_1I_{q_1} & & \\ & \ddots & \\ & & \delta_kI_{q_k} \end{bmatrix}}_{\Delta_p}\underbrace{\begin{bmatrix} R_1^* & Z_1^* \\ \vdots & \vdots \\ R_k^* & Z_k^* \end{bmatrix}}_{M_{21}},$$

i.e.,

$$M_\delta = F_\ell\left(\begin{bmatrix} M_{11} & M_{12} \\ M_{21} & 0 \end{bmatrix}, \Delta_p\right).$$
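The rank-factorized construction can be spot-checked for a single uncertain term (our own sketch with a rank-one $P_1$, so $q_1 = 1$):

```python
import numpy as np

def f_lower(M, Delta, p1, q1):
    M11, M12, M21, M22 = M[:p1, :q1], M[:p1, q1:], M[p1:, :q1], M[p1:, q1:]
    return M11 + M12 @ Delta @ np.linalg.inv(np.eye(M22.shape[0]) - M22 @ Delta) @ M21

rng = np.random.default_rng(5)
n = n_u = n_y = 2
A, B, C, D = (rng.standard_normal((2, 2)) for _ in range(4))
# one rank-one structural term P1 = [A1 B1; C1 D1] = [L1; W1][R1; Z1]^T
LW = rng.standard_normal((n + n_y, 1))
RZ = rng.standard_normal((n + n_u, 1))
P1 = LW @ RZ.T
delta1 = 0.6
M_delta = np.block([[A, B], [C, D]]) + delta1*P1
Mbig = np.block([[np.block([[A, B], [C, D]]), LW],
                 [RZ.T, np.zeros((1, 1))]])
assert np.allclose(M_delta, f_lower(Mbig, delta1*np.eye(1), n + n_y, n + n_u))
print("M_delta = F_l([M11 M12; M21 0], Delta_p) verified")
```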

[Diagram: nominal system $\begin{bmatrix} A & B & B_2 \\ C & D & D_{12} \\ C_2 & D_{21} & D_{22} \end{bmatrix}$ with the frequency structure $\frac{1}{s}I$ closing the upper loop and $\mathrm{diag}\{\delta_1I, \ldots, \delta_kI\}$ closing the lower loop]

Figure 10.4: LFT Representation of State Space Uncertainty

Therefore, the matrices $B_2$, $C_2$, $D_{12}$, $D_{21}$, and $D_{22}$ in the diagram are

$$B_2 = \begin{bmatrix} L_1 & L_2 & \cdots & L_k \end{bmatrix}, \quad D_{12} = \begin{bmatrix} W_1 & W_2 & \cdots & W_k \end{bmatrix},$$
$$C_2 = \begin{bmatrix} R_1 & R_2 & \cdots & R_k \end{bmatrix}^*, \quad D_{21} = \begin{bmatrix} Z_1 & Z_2 & \cdots & Z_k \end{bmatrix}^*, \quad D_{22} = 0,$$

and

$$G_\delta(s) = F_u\left(F_\ell\left(\begin{bmatrix} M_{11} & M_{12} \\ M_{21} & 0 \end{bmatrix}, \Delta_p\right), \frac{1}{s}I\right).$$

10.3 Basic Principle

We have studied several simple examples of the use of LFTs and, in particular, their role in modeling uncertainty. The basic principle at work here in writing a matrix LFT is often referred to as "pulling out the $\Delta$s". We will try to illustrate this with another picture (inspired by Boyd and Barratt [1991]). Consider a structure with four substructures interconnected in some known way, as shown in Figure 10.5.

[Diagram: an interconnection of $K$ with uncertainty blocks $\Delta_1$, $\Delta_2$, $\Delta_3$]

Figure 10.5: Multiple Source of Uncertain Structure

This diagram can be redrawn as a standard one via "pulling out the $\Delta$s", as in Figure 10.6. Now the matrix "$M$" of the LFT can be obtained by computing the corresponding transfer matrix in the shadowed box.

We shall illustrate the above principle with an example. Consider an input/output relation

$$z = \frac{a + b\delta_2 + c\delta_1\delta_2^2}{1 + d\delta_1\delta_2 + e\delta_1^2}\,w =: Gw$$

where $a$, $b$, $c$, $d$, and $e$ are given constants or transfer functions. We would like to write $G$ as an LFT in terms of $\delta_1$ and $\delta_2$. We shall do this in three steps:

where a; b; c; d and e are given constants or transfer functions. We would like to writeG as an LFT in terms of �1 and �2. We shall do this in three steps:

1. Draw a block diagram for the input/output relation with each � separated asshown in Figure 10.7.

2. Mark the inputs and outputs of the �'s as y's and u's, respectively. (This isessentially pulling out the �s).

3. Write z and y's in terms of w and u's with all �'s taken out. (This step is equivalentto computing the transformation in the shadowed box in Figure 10.6.)2

6666664

y1

y2

y3

y4

z

37777775 =M

26666664

u1

u2

u3

u4

w

37777775

Figure 10.6: Pulling out the $\Delta$s

where

$$ M = \begin{bmatrix} 0 & -e & -d & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & -be & -bd + c & 0 & b \\ 0 & -ae & -ad & 1 & a \end{bmatrix}. $$

Then

$$ z = F_u(M, \Delta)w, \qquad \Delta = \begin{bmatrix} \delta_1 I_2 & 0 \\ 0 & \delta_2 I_2 \end{bmatrix}. $$

All LFT examples in the last section can be obtained following the above steps.
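As a numeric sanity check, the three steps can be verified by closing the loop: evaluate F_u(M, Δ) for sample values of δ1 and δ2 and compare with the rational expression for G directly. The sketch below assumes the sign convention in which the feedback gains enter M as −e, −d (consistent with the block diagram of Figure 10.7); the constants and δ values are arbitrary test data.

```python
import numpy as np

def fu(M, Delta, k):
    """Upper LFT F_u(M, Delta): close u = Delta*y around the first k
    inputs/outputs of M."""
    M11, M12 = M[:k, :k], M[:k, k:]
    M21, M22 = M[k:, :k], M[k:, k:]
    return M22 + M21 @ Delta @ np.linalg.inv(np.eye(k) - M11 @ Delta) @ M12

a, b, c, d, e = 1.3, -0.7, 2.1, 0.4, -0.9    # arbitrary test constants
d1, d2 = 0.23, -0.61                         # sample values of delta_1, delta_2

M = np.array([[0, -e, -d, 0, 1],
              [1,  0,  0, 0, 0],
              [1,  0,  0, 0, 0],
              [0, -b*e, -b*d + c, 0, b],
              [0, -a*e, -a*d, 1, a]], dtype=float)

Delta = np.diag([d1, d1, d2, d2])            # diag(delta_1*I_2, delta_2*I_2)

G_lft = fu(M, Delta, 4).item()
G_direct = (a + b*d2 + c*d1*d2**2) / (1 + d*d1*d2 + e*d1**2)
print(G_lft, G_direct)
```

The two printed numbers agree, confirming that the pulled-out matrix M reproduces G under the upper LFT.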

10.4 Redheffer Star-Products

The most important property of LFTs is that any interconnection of LFTs is again an LFT. This property is by far the most often used and is the heart of the LFT machinery. Indeed, it is not hard to see that most of the interconnection structures discussed earlier, e.g., feedback and cascade, can be viewed as special cases of the so-called star product.


[Figure 10.7: Block diagram for G. The relation is realized with gain blocks a, b, c, −e, −d, the δ1 block (inputs y1, y2, outputs u1, u2), and the δ2 block (inputs y3, y4, outputs u3, u4), driven by w and producing z]

Suppose that P and K are compatibly partitioned matrices

P =

"P11 P12

P21 P22

#; K =

"K11 K12

K21 K22

#

such that the matrix product P22K11 is well de�ned and square, and assume furtherthat I � P22K11 is invertible. Then the star product of P and K with respect to thispartition is de�ned as

S (P;K) :=

"Fl (P;K11) P12 (I �K11P22)

�1K12

K21 (I � P22K11)�1

P21 Fu (K;P22)

#:

Note that this definition depends on the partitioning of the matrices P and K above. In fact, this star product may be well defined for one partition and not well defined for another; however, we will not explicitly show this dependence because it is always clear from the context. In a block diagram, this dependence appears as shown in Figure 10.8.
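The definition can be checked numerically by eliminating the internal signals of the interconnection directly: with z = P11 w + P12 u, y = P21 w + P22 u, u = K11 y + K12 ŵ, ẑ = K21 y + K22 ŵ, the map (w, ŵ) → (z, ẑ) should equal S(P, K). A hedged NumPy sketch (the block sizes are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
nz, nw, ny, nu, nzh, nwh = 2, 2, 3, 3, 2, 2   # hypothetical channel sizes

P11 = rng.standard_normal((nz, nw)); P12 = rng.standard_normal((nz, nu))
P21 = rng.standard_normal((ny, nw)); P22 = rng.standard_normal((ny, nu))
K11 = 0.3 * rng.standard_normal((nu, ny)); K12 = rng.standard_normal((nu, nwh))
K21 = rng.standard_normal((nzh, ny));      K22 = rng.standard_normal((nzh, nwh))

Ri  = np.linalg.inv(np.eye(ny) - P22 @ K11)   # (I - P22 K11)^{-1}
Rti = np.linalg.inv(np.eye(nu) - K11 @ P22)   # (I - K11 P22)^{-1}

# star product per the definition: [F_l(P,K11), P12 Rti K12; K21 Ri P21, F_u(K,P22)]
S = np.block([
    [P11 + P12 @ K11 @ Ri @ P21, P12 @ Rti @ K12],
    [K21 @ Ri @ P21,             K22 + K21 @ P22 @ Rti @ K12],
])

# cross-check by eliminating the internal signals u and y directly
w = rng.standard_normal(nw); wh = rng.standard_normal(nwh)
u = Rti @ (K11 @ P21 @ w + K12 @ wh)
y = P21 @ w + P22 @ u
z = P11 @ w + P12 @ u
zh = K21 @ y + K22 @ wh
print(np.allclose(S @ np.concatenate([w, wh]), np.concatenate([z, zh])))
```

The cross-check prints True, i.e., S(P, K) is exactly the closed interconnection of P and K.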

Now suppose that P and K are transfer matrices with state space representations:

$$ P = \left[\begin{array}{c|cc} A & B_1 & B_2 \\ \hline C_1 & D_{11} & D_{12} \\ C_2 & D_{21} & D_{22} \end{array}\right], \qquad K = \left[\begin{array}{c|cc} A_K & B_{K1} & B_{K2} \\ \hline C_{K1} & D_{K11} & D_{K12} \\ C_{K2} & D_{K21} & D_{K22} \end{array}\right]. $$

Then the transfer matrix

$$ S(P, K) : \begin{bmatrix} w \\ \hat w \end{bmatrix} \mapsto \begin{bmatrix} z \\ \hat z \end{bmatrix} $$


[Figure 10.8: Interconnection of LFTs. P above K, with (w, z) on the P side, (ŵ, ẑ) on the K side; the interconnected pair is equivalent to S(P, K)]

has a representation

$$ S(P,K) = \left[\begin{array}{c|cc} \bar A & \bar B_1 & \bar B_2 \\ \hline \bar C_1 & \bar D_{11} & \bar D_{12} \\ \bar C_2 & \bar D_{21} & \bar D_{22} \end{array}\right] = \left[\begin{array}{c|c} \bar A & \bar B \\ \hline \bar C & \bar D \end{array}\right] $$

where

$$ \bar A = \begin{bmatrix} A + B_2\tilde R^{-1}D_{K11}C_2 & B_2\tilde R^{-1}C_{K1} \\ B_{K1}R^{-1}C_2 & A_K + B_{K1}R^{-1}D_{22}C_{K1} \end{bmatrix} $$

$$ \bar B = \begin{bmatrix} B_1 + B_2\tilde R^{-1}D_{K11}D_{21} & B_2\tilde R^{-1}D_{K12} \\ B_{K1}R^{-1}D_{21} & B_{K2} + B_{K1}R^{-1}D_{22}D_{K12} \end{bmatrix} $$

$$ \bar C = \begin{bmatrix} C_1 + D_{12}D_{K11}R^{-1}C_2 & D_{12}\tilde R^{-1}C_{K1} \\ D_{K21}R^{-1}C_2 & C_{K2} + D_{K21}R^{-1}D_{22}C_{K1} \end{bmatrix} $$

$$ \bar D = \begin{bmatrix} D_{11} + D_{12}D_{K11}R^{-1}D_{21} & D_{12}\tilde R^{-1}D_{K12} \\ D_{K21}R^{-1}D_{21} & D_{K22} + D_{K21}R^{-1}D_{22}D_{K12} \end{bmatrix} $$

$$ R = I - D_{22}D_{K11}, \qquad \tilde R = I - D_{K11}D_{22}. $$

In fact, it is easy to show that

$$ \bar A = S\!\left( \begin{bmatrix} A & B_2 \\ C_2 & D_{22} \end{bmatrix}, \begin{bmatrix} D_{K11} & C_{K1} \\ B_{K1} & A_K \end{bmatrix} \right), \qquad \bar B = S\!\left( \begin{bmatrix} B_1 & B_2 \\ D_{21} & D_{22} \end{bmatrix}, \begin{bmatrix} D_{K11} & D_{K12} \\ B_{K1} & B_{K2} \end{bmatrix} \right), $$

$$ \bar C = S\!\left( \begin{bmatrix} C_1 & D_{12} \\ C_2 & D_{22} \end{bmatrix}, \begin{bmatrix} D_{K11} & C_{K1} \\ D_{K21} & C_{K2} \end{bmatrix} \right), \qquad \bar D = S\!\left( \begin{bmatrix} D_{11} & D_{12} \\ D_{21} & D_{22} \end{bmatrix}, \begin{bmatrix} D_{K11} & D_{K12} \\ D_{K21} & D_{K22} \end{bmatrix} \right). $$
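The state space formulas above can be spot-checked by evaluating both sides as transfer matrices at a sample complex frequency: the realization (Ā, B̄, C̄, D̄) evaluated at s0 should agree with the constant star product of P(s0) and K(s0). A sketch with all signal channels taken scalar (an arbitrary simplification; the random data are test values only):

```python
import numpy as np

rng = np.random.default_rng(1)
n, nK = 2, 2   # state dimensions of P and K (hypothetical)

A  = rng.standard_normal((n, n));  B1 = rng.standard_normal((n, 1));  B2 = rng.standard_normal((n, 1))
C1 = rng.standard_normal((1, n));  C2 = rng.standard_normal((1, n))
D11, D12, D21, D22 = (rng.standard_normal((1, 1)) for _ in range(4))

AK  = rng.standard_normal((nK, nK)); BK1 = rng.standard_normal((nK, 1)); BK2 = rng.standard_normal((nK, 1))
CK1 = rng.standard_normal((1, nK));  CK2 = rng.standard_normal((1, nK))
DK11, DK12, DK21, DK22 = (rng.standard_normal((1, 1)) for _ in range(4))

Ri  = np.linalg.inv(np.eye(1) - D22 @ DK11)    # R^{-1}
Rti = np.linalg.inv(np.eye(1) - DK11 @ D22)    # R~^{-1}

Ab = np.block([[A + B2 @ Rti @ DK11 @ C2, B2 @ Rti @ CK1],
               [BK1 @ Ri @ C2,            AK + BK1 @ Ri @ D22 @ CK1]])
Bb = np.block([[B1 + B2 @ Rti @ DK11 @ D21, B2 @ Rti @ DK12],
               [BK1 @ Ri @ D21,             BK2 + BK1 @ Ri @ D22 @ DK12]])
Cb = np.block([[C1 + D12 @ DK11 @ Ri @ C2, D12 @ Rti @ CK1],
               [DK21 @ Ri @ C2,            CK2 + DK21 @ Ri @ D22 @ CK1]])
Db = np.block([[D11 + D12 @ DK11 @ Ri @ D21, D12 @ Rti @ DK12],
               [DK21 @ Ri @ D21,             DK22 + DK21 @ Ri @ D22 @ DK12]])

def tf(A, B, C, D, s):                 # evaluate D + C (sI - A)^{-1} B
    return D + C @ np.linalg.inv(s * np.eye(A.shape[0]) - A) @ B

def star(P, K):                        # constant star product, 1x1 partitions
    P11, P12, P21, P22 = P[:1, :1], P[:1, 1:], P[1:, :1], P[1:, 1:]
    K11, K12, K21, K22 = K[:1, :1], K[:1, 1:], K[1:, :1], K[1:, 1:]
    Ri  = np.linalg.inv(np.eye(1) - P22 @ K11)
    Rti = np.linalg.inv(np.eye(1) - K11 @ P22)
    return np.block([[P11 + P12 @ K11 @ Ri @ P21, P12 @ Rti @ K12],
                     [K21 @ Ri @ P21,             K22 + K21 @ P22 @ Rti @ K12]])

s0 = 0.7 + 1.1j
Ps0 = tf(A, np.hstack([B1, B2]), np.vstack([C1, C2]),
         np.block([[D11, D12], [D21, D22]]), s0)
Ks0 = tf(AK, np.hstack([BK1, BK2]), np.vstack([CK1, CK2]),
         np.block([[DK11, DK12], [DK21, DK22]]), s0)
print(np.allclose(tf(Ab, Bb, Cb, Db, s0), star(Ps0, Ks0)))
```

This also illustrates the pointwise reading of the star product: S(P, K), as a transfer matrix, is the star product of the constant matrices P(s) and K(s) at every frequency.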

10.5 Notes and References

This chapter is based on the lecture notes by Packard [1991] and the paper by Doyle, Packard, and Zhou [1991].


11 Structured Singular Value

It is noted that the robust stability and robust performance criteria derived in Chapter 9 vary with the assumptions on the uncertainty descriptions and performance requirements. We will show in this chapter that they can all be treated in a unified framework using the LFT machinery introduced in the last chapter and the structured singular value to be introduced in this chapter. This, of course, does not mean that those special problems and their corresponding results are not important; on the contrary, they are sometimes very enlightening to our understanding of complex problems, such as those in which complex problems are formed from simple problems. On the other hand, a unified approach may relieve the mathematical burden of dealing with specific problems repeatedly. Furthermore, the unified framework introduced here will enable us to treat exactly the robust stability and robust performance problems for systems with multiple sources of uncertainty, a formidable problem from the standpoint of Chapter 9, in the same fashion as a single unstructured uncertainty. Indeed, if a system is subject to multiple sources of uncertainty, then in order to use the results in Chapter 9 for unstructured cases, it is necessary to reflect all sources of uncertainty from their known points of occurrence to a single reference location in the loop. Such reflected uncertainties invariably have a great deal of structure, which must then be "covered up" with a large, arbitrarily more conservative perturbation in order to maintain a simple cone-bounded representation at the reference location. Readers might already have some idea of the conservativeness of such reflection from the skewed specification problem, where an input multiplicative uncertainty of the plant is reflected at the output and the size of the reflected uncertainty is proportional to the condition number of the plant. In general, the reflected uncertainty may be proportional to the condition


number of the transfer matrix between its original location and the reflected location. Thus it is highly desirable to treat the uncertainties as they are and where they are. The structured singular value is defined exactly for that purpose.

11.1 General Framework for System Robustness

As we have illustrated in the last chapter, any interconnected system may be rearranged to fit the general framework in Figure 11.1. Although the interconnection structure can become quite complicated for complex systems, many software packages, such as SIMULINK1 and μ-TOOLS2, are available which can be used to generate the interconnection structure from system components. Various modeling assumptions will be considered, and the impact of these assumptions on analysis and synthesis methods will be explored in this general framework.

Note that uncertainty may be modeled in two ways, either as external inputs or as perturbations to the nominal model. The performance of a system is measured in terms of the behavior of the outputs or errors. The assumptions which characterize the uncertainty, performance, and nominal models determine the analysis techniques which must be used. The models are assumed to be FDLTI systems. The uncertain inputs are assumed to be either filtered white noise or weighted power or weighted Lp signals. Performance is measured as weighted output variances, or as power, or as weighted output Lp norms. The perturbations are assumed to be themselves FDLTI systems which are norm-bounded as input-output operators. Various combinations of these assumptions form the basis for all the standard linear system analysis tools.

Given that the nominal model is an FDLTI system, the interconnection system has the form

$$ P(s) = \begin{bmatrix} P_{11}(s) & P_{12}(s) & P_{13}(s) \\ P_{21}(s) & P_{22}(s) & P_{23}(s) \\ P_{31}(s) & P_{32}(s) & P_{33}(s) \end{bmatrix} $$

and the closed-loop system is an LFT on the perturbation and the controller given by

$$ z = F_u\big(F_\ell(P,K), \Delta\big)w = F_\ell\big(F_u(P,\Delta), K\big)w. $$
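This commutation of the two partial LFTs can be demonstrated numerically. The sketch below uses scalar channels for each of the three ports (perturbation, disturbance, control), an arbitrary simplification for illustration:

```python
import numpy as np

def fu(M, D, k):   # upper LFT: close D around the first k channels of M
    M11, M12, M21, M22 = M[:k, :k], M[:k, k:], M[k:, :k], M[k:, k:]
    return M22 + M21 @ D @ np.linalg.inv(np.eye(k) - M11 @ D) @ M12

def fl(M, K, k):   # lower LFT: close K around the last k channels of M
    n = M.shape[0] - k
    M11, M12, M21, M22 = M[:n, :n], M[:n, n:], M[n:, :n], M[n:, n:]
    return M11 + M12 @ K @ np.linalg.inv(np.eye(k) - M22 @ K) @ M21

rng = np.random.default_rng(2)
P = rng.standard_normal((3, 3))          # 3x3 with scalar channels (pert, w, u)
Delta = np.array([[0.2]])                # scalar perturbation block
K = np.array([[0.1]])                    # scalar "controller" block

lhs = fu(fl(P, K, 1), Delta, 1)          # F_u(F_l(P, K), Delta)
rhs = fl(fu(P, Delta, 1), K, 1)          # F_l(F_u(P, Delta), K)
print(np.allclose(lhs, rhs))
```

Both orders of closing the loops yield the same map from w to z, since each computes the fully closed interconnection.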

We will focus our discussion in this section on analysis methods; therefore, the controller may be viewed as just another system component and absorbed into the interconnection structure. Denote

$$ M(s) = F_\ell\big(P(s), K(s)\big) = \begin{bmatrix} M_{11}(s) & M_{12}(s) \\ M_{21}(s) & M_{22}(s) \end{bmatrix}; $$

1 SIMULINK is a trademark of The MathWorks, Inc.
2 μ-TOOLS is a trademark of MUSYN Inc.


[Figure 11.1: General Framework. P with the perturbation Δ closed around the top, the controller K closed around the bottom, input w and output z]

and then the general framework reduces to Figure 11.2, where

$$ z = F_u(M,\Delta)w = \left[ M_{22} + M_{21}\Delta (I - M_{11}\Delta)^{-1} M_{12} \right] w. $$

[Figure 11.2: Analysis Framework. Δ in feedback around M, with input w and output z]

Suppose K(s) is a stabilizing controller for the nominal plant P. Then M(s) ∈ RH∞. In general, the stability of F_u(M, Δ) does not necessarily imply the internal stability of the closed-loop feedback system. However, they can be made equivalent with suitably chosen w and z. For example, consider again the multiplicatively perturbed system shown in Figure 11.3. Now let

$$ w := \begin{bmatrix} d_1 \\ d_2 \end{bmatrix}, \qquad z := \begin{bmatrix} e_1 \\ e_2 \end{bmatrix}; $$

then the system is robustly stable for all Δ(s) ∈ RH∞ with ‖Δ‖∞ < 1 if and only if F_u(M, Δ) ∈ RH∞ for all admissible Δ, which is guaranteed by ‖M11‖∞ ≤ 1. (Note that this is not necessarily equivalent to (I − M11Δ)^{-1} ∈ RH∞ if Δ belongs to a closed ball, as shown in Theorem 9.5.)

The analysis results presented in the previous chapters, together with the associated synthesis tools, are summarized in Table 11.1 with various uncertainty modeling assumptions.

However, the analysis is not so simple for systems with multiple sources of model uncertainty, including the robust performance problem for systems with unstructured


[Figure 11.3: Multiplicatively Perturbed Systems. Feedback loop of K and P with multiplicative perturbation W2 Δ W1, disturbances d1, d2, d3 and errors e1, e2, e3]

uncertainty. As we have shown in the last chapter, if a system is built from components which are themselves uncertain, then, in general, the uncertainty at the system level is structured, typically involving a large number of real parameters. Stability analysis involving real parameters is much more involved and is beyond the scope of this book; instead, we shall simply cover the real parametric uncertainty with norm-bounded dynamical uncertainty. Moreover, the interconnection model M can always be chosen so that Δ(s) is block diagonal and, by absorbing any weights, ‖Δ‖∞ < 1. Thus we shall assume that Δ(s) takes the form

$$ \Delta(s) = \left\{ \mathrm{diag}\left[\delta_1 I_{r_1}, \ldots, \delta_S I_{r_S}, \Delta_1, \ldots, \Delta_F\right] : \delta_i(s) \in RH_\infty, \ \Delta_j(s) \in RH_\infty \right\} $$

with ‖δi‖∞ < 1 and ‖Δj‖∞ < 1. Then the system is robustly stable iff the interconnected system in Figure 11.4 is stable.

[Figure 11.4: Robust Stability Analysis Framework. M11(s) in feedback with the block diagonal perturbation diag(δ1(s)I, …, ΔF(s))]

The results of Table 11.1 can be applied to the analysis of the system's robust stability in two ways:


Input Assumptions               | Performance Specifications | Perturbation Assumptions | Analysis Tests | Synthesis Methods
E(w(t)w(τ)*) = δ(t − τ)I        | E(z(t)*z(t)) ≤ 1           | Δ = 0                    | ‖M22‖2 ≤ 1     | LQG, Wiener-Hopf
w = U0 δ(t), E(U0 U0*) = I      | E(‖z‖2²) ≤ 1               | Δ = 0                    | ‖M22‖2 ≤ 1     | H2
‖w‖2 ≤ 1                        | ‖z‖2 ≤ 1                   | Δ = 0                    | ‖M22‖∞ ≤ 1     | Singular Value Loop Shaping
‖w‖2 ≤ 1                        | Internal Stability         | ‖Δ‖∞ < 1                 | ‖M11‖∞ ≤ 1     | H∞

Table 11.1: General Analysis for Single Source of Uncertainty

(1) ‖M11‖∞ ≤ 1 implies stability, but not conversely, because this test ignores the known block diagonal structure of the uncertainties and is equivalent to regarding Δ as unstructured. This can be arbitrarily conservative3 in that stable systems can have arbitrarily large ‖M11‖∞.

(2) Test for each δi (Δj) individually (assuming no uncertainty in the other channels). This test can be arbitrarily optimistic because it ignores interaction between the δi (Δj). This optimism is also clearly shown in the spinning body example.

The difference between the stability margins (or bounds on Δ) obtained in (1) and (2) can be arbitrarily far apart. Only when the margins are close can conclusions be made about the general case with structured uncertainty.

The exact stability and performance analysis for systems with structured uncertainty requires a new matrix function called the structured singular value (SSV), which is denoted by μ.

3 By "arbitrarily conservative," we mean that examples can be constructed where the degree of conservatism is arbitrarily large. Of course, other examples exist where it is quite reasonable; see, for example, the spinning body example.


11.2 Structured Singular Value

11.2.1 Basic Concept

Conceptually, the structured singular value is nothing but a straightforward generalization of the singular values for constant matrices. To be more specific, it is instructive at this point to consider again the robust stability problem of the following standard feedback interconnection with stable M(s) and Δ(s).

[Standard feedback interconnection: M and Δ in a loop, with external inputs w1, w2 and internal signals e1, e2]

One important question one might ask is how large Δ (in the sense of ‖Δ‖∞) can be without destabilizing the feedback system. Since the closed-loop poles are given by det(I − MΔ) = 0, the feedback system becomes unstable if det(I − M(s)Δ(s)) = 0 for some s in the closed right-half plane. Now let α > 0 be a sufficiently small number such that the closed-loop system is stable for all stable Δ with ‖Δ‖∞ < α. Next increase α to α_max, at which point the closed-loop system becomes unstable. Then α_max is the robust stability margin. By the small gain theorem,

$$ \frac{1}{\alpha_{\max}} = \|M\|_\infty := \sup_{s \in \overline{\mathbb{C}}_+} \bar\sigma\big(M(s)\big) = \sup_{\omega} \bar\sigma\big(M(j\omega)\big) $$

if Δ is unstructured. Note that for any fixed s in the closed right-half plane, σ̄(M(s)) can be written as

$$ \bar\sigma\big(M(s)\big) = \frac{1}{\min\left\{\bar\sigma(\Delta) : \det\big(I - M(s)\Delta\big) = 0, \ \Delta \text{ is unstructured}\right\}}. \qquad (11.1) $$

In other words, the reciprocal of the largest singular value of M is a measure of the smallest unstructured Δ that causes instability of the feedback system.

To quantify the smallest destabilizing structured Δ, the concept of singular values needs to be generalized. In view of the characterization of the largest singular value of a matrix M(s) given by (11.1), we shall define

$$ \mu_\Delta\big(M(s)\big) = \frac{1}{\min\left\{\bar\sigma(\Delta) : \det\big(I - M(s)\Delta\big) = 0, \ \Delta \text{ is structured}\right\}} \qquad (11.2) $$

as the largest structured singular value of M(s) with respect to the structured Δ. Then it is obvious that the robust stability margin of the feedback system with structured uncertainty Δ is

$$ \frac{1}{\alpha_{\max}} = \sup_{s \in \overline{\mathbb{C}}_+} \mu_\Delta\big(M(s)\big) = \sup_{\omega} \mu_\Delta\big(M(j\omega)\big). $$

The last equality follows from the following lemma.


Lemma 11.1 Let Δ be a structured set and M(s) ∈ RH∞. Then

$$ \sup_{s \in \mathbb{C}_+} \mu_\Delta\big(M(s)\big) = \sup_{s \in \overline{\mathbb{C}}_+} \mu_\Delta\big(M(s)\big) = \sup_{\omega} \mu_\Delta\big(M(j\omega)\big). $$

Proof. It is clear that

$$ \sup_{s \in \mathbb{C}_+} \mu_\Delta\big(M(s)\big) = \sup_{s \in \overline{\mathbb{C}}_+} \mu_\Delta\big(M(s)\big) \ge \sup_{\omega} \mu_\Delta\big(M(j\omega)\big). $$

Now suppose sup over the closed right-half plane of μ_Δ(M(s)) is greater than 1/β. Then by the definition of μ, there are an s_o in the closed right-half plane (including ∞) and a complex structured Δ such that σ̄(Δ) < β and det(I − M(s_o)Δ) = 0. This implies that there are 0 ≤ ω ≤ ∞ and 0 < α ≤ 1 such that det(I − M(jω)αΔ) = 0. This in turn implies that μ_Δ(M(jω)) > 1/β since σ̄(αΔ) < β. In other words, the sup over the closed right-half plane is no larger than sup_ω μ_Δ(M(jω)). The proof is complete. □

The formal definition and characterization of the structured singular value of a constant matrix will be given below.

11.2.2 Definitions of μ

This section is devoted to defining the structured singular value, a matrix function denoted by μ(·). We consider matrices M ∈ C^{n×n}. In the definition of μ(M), there is an underlying structure Δ (a prescribed set of block diagonal matrices) on which everything in the sequel depends. For each problem, this structure is, in general, different; it depends on the uncertainty and performance objectives of the problem. Defining the structure involves specifying three things: the type of each block, the total number of blocks, and their dimensions.

There are two types of blocks: repeated scalar and full blocks. Two nonnegative integers, S and F, represent the number of repeated scalar blocks and the number of full blocks, respectively. To bookkeep their dimensions, we introduce positive integers r1, …, rS; m1, …, mF. The i-th repeated scalar block is ri × ri, while the j-th full block is mj × mj. With those integers given, we define Δ ⊂ C^{n×n} as

$$ \mathbf{\Delta} = \left\{ \mathrm{diag}\left[\delta_1 I_{r_1}, \ldots, \delta_S I_{r_S}, \Delta_1, \ldots, \Delta_F\right] : \delta_i \in \mathbb{C}, \ \Delta_j \in \mathbb{C}^{m_j \times m_j} \right\}. \qquad (11.3) $$

For consistency among all the dimensions, we must have

$$ \sum_{i=1}^{S} r_i + \sum_{j=1}^{F} m_j = n. $$

Often we will need norm-bounded subsets of Δ, and we introduce the following notation:

$$ B\mathbf{\Delta} = \{\Delta \in \mathbf{\Delta} : \bar\sigma(\Delta) \le 1\} \qquad (11.4) $$

$$ B^o\mathbf{\Delta} = \{\Delta \in \mathbf{\Delta} : \bar\sigma(\Delta) < 1\} \qquad (11.5) $$


where the superscript "o" symbolizes the open ball. To keep the notation as simple as possible in (11.3), we place all of the repeated scalar blocks first; in actuality, they can come in any order. Also, the full blocks do not have to be square, but restricting them as such saves a great deal in terms of notation.

Definition 11.1 For M ∈ C^{n×n}, μ_Δ(M) is defined as

$$ \mu_\Delta(M) := \frac{1}{\min\left\{\bar\sigma(\Delta) : \Delta \in \mathbf{\Delta}, \ \det(I - M\Delta) = 0\right\}} \qquad (11.6) $$

unless no Δ ∈ Δ makes I − MΔ singular, in which case μ_Δ(M) := 0.

Remark 11.1 Without loss of generality, the full blocks in the minimal norm Δ can each be chosen to be dyads (rank one). To see this, assume S = 0, i.e., all blocks are full blocks. Suppose that I − MΔ is singular for some Δ ∈ Δ. Then there is an x ∈ C^n such that MΔx = x. Now partition x conformably with Δ:

$$ x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_F \end{bmatrix}, \qquad x_i \in \mathbb{C}^{m_i}, \ i = 1, \ldots, F, $$

and define

$$ D = \mathrm{diag}\left(d_1 I_{m_1}, \ldots, d_F I_{m_F}\right) $$

where di = 1 if xi = 0 and di = 1/‖xi‖ if xi ≠ 0. Next define

$$ \tilde x = \begin{bmatrix} \tilde x_1 \\ \tilde x_2 \\ \vdots \\ \tilde x_F \end{bmatrix} := Dx = \begin{bmatrix} d_1 x_1 \\ d_2 x_2 \\ \vdots \\ d_F x_F \end{bmatrix} $$

and

$$ y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_F \end{bmatrix} := \Delta \tilde x = \begin{bmatrix} \Delta_1 \tilde x_1 \\ \Delta_2 \tilde x_2 \\ \vdots \\ \Delta_F \tilde x_F \end{bmatrix}. $$

It follows that ‖x̃i‖ = 1 if xi ≠ 0, ‖x̃i‖ = 0 if xi = 0, and y ≠ 0. Hence, define a new perturbation Δ̃ ∈ C^{n×n} as

$$ \tilde\Delta := \mathrm{diag}\left[y_1 \tilde x_1^*, \ldots, y_F \tilde x_F^*\right]. $$


Obviously, σ̄(Δ̃) ≤ σ̄(Δ) and y = Δ̃x̃. Noting that DΔ = ΔD and DΔ̃ = Δ̃D, we have

$$ M\Delta x = x \implies MD^{-1}\Delta \tilde x = x \implies MD^{-1}y = x \implies MD^{-1}\tilde\Delta \tilde x = x \implies M\tilde\Delta x = x, $$

i.e., I − MΔ̃ is also singular. Hence we have replaced a general perturbation Δ which satisfies the singularity condition with a perturbation Δ̃ that is no larger (in the σ̄(·) sense) and has rank one in each block but still satisfies the singularity condition. ~
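The dyad construction can be carried out numerically. The sketch below (with S = 0 and two full blocks of arbitrary sizes 2 and 1) first builds a structured Δ making I − MΔ singular, then forms Δ̃ as in the remark and checks that it is block rank one, no larger than Δ, and still makes I − MΔ̃ singular:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 3                                          # block sizes 2 and 1
M = rng.standard_normal((n, n))

# a structured Delta making I - M*Delta singular: scale a random structured
# direction by the reciprocal of an eigenvalue of M*Delta0
Delta0 = np.zeros((n, n), dtype=complex)
Delta0[:2, :2] = rng.standard_normal((2, 2))
Delta0[2:, 2:] = rng.standard_normal((1, 1))
lam = np.linalg.eigvals(M @ Delta0)
Delta = Delta0 / lam[np.argmax(abs(lam))]      # now det(I - M*Delta) = 0

# Remark 11.1 construction: x with M*Delta*x = x, then dyad blocks y_i x~_i^*
w, V = np.linalg.eig(M @ Delta)
x = V[:, np.argmin(abs(w - 1))]
x1, x2 = x[:2], x[2:]                          # generically both parts nonzero
xt = np.concatenate([x1 / np.linalg.norm(x1), x2 / np.linalg.norm(x2)])
y = Delta @ xt
Dt = np.zeros((n, n), dtype=complex)
Dt[:2, :2] = np.outer(y[:2], xt[:2].conj())
Dt[2:, 2:] = np.outer(y[2:], xt[2:].conj())

smin = np.linalg.svd(np.eye(n) - M @ Dt, compute_uv=False)[-1]
print(smin,                                           # ~0: still singular
      np.linalg.svd(Dt, compute_uv=False)[0]
      <= np.linalg.svd(Delta, compute_uv=False)[0] + 1e-8)
```

Each diagonal block of Δ̃ is an outer product, hence rank one, yet the singularity of I − MΔ̃ is preserved.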

An alternative expression for μ_Δ(M) follows from the definition.

Lemma 11.2 μ_Δ(M) = max over Δ ∈ BΔ of ρ(MΔ).

In view of this lemma, continuity of the function μ : C^{n×n} → R is apparent. In general, though, the function μ : C^{n×n} → R is not a norm, since it does not satisfy the triangle inequality; however, for any α ∈ C, μ(αM) = |α|μ(M), so in some sense it is related to how "big" the matrix is.

We can relate μ_Δ(M) to familiar linear algebra quantities when Δ is one of two extreme sets.

• If Δ = {δI : δ ∈ C} (S = 1, F = 0, r1 = n), then μ_Δ(M) = ρ(M), the spectral radius of M.

Proof. The only Δ's in Δ which satisfy the det(I − MΔ) = 0 constraint are reciprocals of nonzero eigenvalues of M. The smallest one of these is associated with the largest (magnitude) eigenvalue, so μ_Δ(M) = ρ(M). □

• If Δ = C^{n×n} (S = 0, F = 1, m1 = n), then μ_Δ(M) = σ̄(M).

Proof. If σ̄(Δ) < 1/σ̄(M), then ρ(MΔ) < 1, so I − MΔ is nonsingular. Applying equation (11.6) implies μ_Δ(M) ≤ σ̄(M). On the other hand, let u and v be unit vectors satisfying Mv = σ̄(M)u, and define Δ := (1/σ̄(M)) v u*. Then σ̄(Δ) = 1/σ̄(M) and I − MΔ is obviously singular. Hence μ_Δ(M) ≥ σ̄(M). □

Obviously, for a general Δ as in (11.3) we must have

$$ \{\delta I_n : \delta \in \mathbb{C}\} \subset \mathbf{\Delta} \subset \mathbb{C}^{n\times n}. \qquad (11.7) $$

Hence directly from the definition of μ and from the two special cases above, we conclude that

$$ \rho(M) \le \mu_\Delta(M) \le \bar\sigma(M). \qquad (11.8) $$

These bounds alone are not sufficient for our purposes because the gap between ρ and σ̄ can be arbitrarily large. For example, suppose

$$ \Delta = \begin{bmatrix} \delta_1 & 0 \\ 0 & \delta_2 \end{bmatrix} $$

and consider


(1) M = [ 0 β; 0 0 ] for any β > 0. Then ρ(M) = 0 and σ̄(M) = β. But det(I − MΔ) = 1, so μ_Δ(M) = 0.

(2) M = [ −1/2 1/2; −1/2 1/2 ]. Then ρ(M) = 0 and σ̄(M) = 1. Since det(I − MΔ) = 1 + (δ1 − δ2)/2, it is easy to see that min{ max_i |δi| : 1 + (δ1 − δ2)/2 = 0 } = 1, so μ_Δ(M) = 1.

Thus neither ρ nor σ̄ provides useful bounds even in simple cases. The only time they do provide reliable bounds is when ρ ≈ σ̄.
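Both gap examples are easy to reproduce numerically; the sketch below evaluates ρ, σ̄, and the destabilizing structured perturbations directly (β = 3 is an arbitrary choice):

```python
import numpy as np

beta = 3.0
M1 = np.array([[0.0, beta], [0.0, 0.0]])
M2 = np.array([[-0.5, 0.5], [-0.5, 0.5]])

rho = lambda M: max(abs(np.linalg.eigvals(M)))
sig = lambda M: np.linalg.svd(M, compute_uv=False)[0]
print(rho(M1), sig(M1))    # ~0 and beta
print(rho(M2), sig(M2))    # ~0 and 1

# structure Delta = diag(d1, d2): det(I - M1*Delta) = 1 for every Delta, so
# mu(M1) = 0; det(I - M2*Delta) = 1 + (d1 - d2)/2, and the smallest
# destabilizing Delta is diag(-1, 1), so mu(M2) = 1
d1, d2 = -1.0, 1.0
print(np.linalg.det(np.eye(2) - M1 @ np.diag([d1, d2])))   # exactly 1
print(np.linalg.det(np.eye(2) - M2 @ np.diag([d1, d2])))   # ~0
```

For M1 the gap between μ = 0 and σ̄ = β can be made as large as desired by increasing β.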

However, the bounds can be refined by considering transformations on M that do not affect μ_Δ(M), but do affect ρ and σ̄. To do this, define the following two subsets of C^{n×n}:

$$ \mathcal{U} = \{U \in \mathbf{\Delta} : UU^* = I_n\} \qquad (11.9) $$

$$ \mathcal{D} = \left\{ \mathrm{diag}\left[D_1, \ldots, D_S, d_1 I_{m_1}, \ldots, d_{F-1} I_{m_{F-1}}, I_{m_F}\right] : D_i \in \mathbb{C}^{r_i\times r_i}, \ D_i = D_i^* > 0, \ d_j \in \mathbb{R}, \ d_j > 0 \right\}. \qquad (11.10) $$

Note that for any Δ ∈ Δ, U ∈ U, and D ∈ D,

$$ U^* \in \mathcal{U}, \quad U\Delta \in \mathbf{\Delta}, \quad \Delta U \in \mathbf{\Delta}, \quad \bar\sigma(U\Delta) = \bar\sigma(\Delta U) = \bar\sigma(\Delta), \qquad (11.11) $$

$$ D\Delta = \Delta D. \qquad (11.12) $$

Consequently,

Theorem 11.3 For all U ∈ U and D ∈ D,

$$ \mu_\Delta(MU) = \mu_\Delta(UM) = \mu_\Delta(M) = \mu_\Delta\left(DMD^{-1}\right). \qquad (11.13) $$

Proof. For all D ∈ D and Δ ∈ Δ,

$$ \det(I - M\Delta) = \det\left(I - MD^{-1}\Delta D\right) = \det\left(I - DMD^{-1}\Delta\right) $$

since D commutes with Δ. Therefore μ_Δ(M) = μ_Δ(DMD^{-1}). Also, for each U ∈ U, det(I − MΔ) = 0 if and only if det(I − MUU*Δ) = 0. Since U*Δ ∈ Δ and σ̄(U*Δ) = σ̄(Δ), we get μ_Δ(MU) = μ_Δ(M) as desired. The argument for UM is the same. □

Therefore, the bounds in (11.8) can be tightened to

$$ \max_{U\in\mathcal{U}} \rho(UM) \le \max_{\Delta\in B\mathbf{\Delta}} \rho(\Delta M) = \mu_\Delta(M) \le \inf_{D\in\mathcal{D}} \bar\sigma\left(DMD^{-1}\right) \qquad (11.14) $$

where the equality comes from Lemma 11.2. Note that the last block in the D matrix is normalized to the identity since, for any nonzero scalar γ, DMD^{-1} = (γD)M(γD)^{-1}.


Remark 11.2 Note that the scaling set D in Theorem 11.3 and in the inequality (11.14) need not be restricted to Hermitian matrices. In fact, it can be replaced by any set of nonsingular matrices that satisfy (11.12). However, enlarging the set of scaling matrices does not improve the upper bound in inequality (11.14). This can be shown as follows: Let D be any nonsingular matrix such that DΔ = ΔD. Then there exist a Hermitian matrix 0 < R = R* ∈ D and a unitary matrix U such that D = UR, and

$$ \inf_{D} \bar\sigma\left(DMD^{-1}\right) = \inf_{D} \bar\sigma\left(URMR^{-1}U^*\right) = \inf_{R\in\mathcal{D}} \bar\sigma\left(RMR^{-1}\right). $$

Therefore, there is no loss of generality in assuming D to be Hermitian. ~

11.2.3 Bounds

In this section we will concentrate on the bounds

$$ \max_{U\in\mathcal{U}} \rho(UM) \le \mu_\Delta(M) \le \inf_{D\in\mathcal{D}} \bar\sigma\left(DMD^{-1}\right). $$

The lower bound is always an equality [Doyle, 1982].

Theorem 11.4 max over U ∈ U of ρ(MU) equals μ_Δ(M).

Unfortunately, the quantity ρ(UM) can have multiple local maxima which are not global. Thus a local search cannot be guaranteed to obtain μ, but can only yield a lower bound. For computational purposes one can derive a slightly different formulation of the lower bound as a power algorithm which is reminiscent of power algorithms for eigenvalues and singular values [Packard, Fan, and Doyle, 1988]. While there are open questions about convergence, the algorithm usually works quite well and has proven to be an effective method to compute μ.

The upper bound can be reformulated as a convex optimization problem, so the global minimum can, in principle, be found. Unfortunately, the upper bound is not always equal to μ. For block structures Δ satisfying 2S + F ≤ 3, the upper bound is always equal to μ_Δ(M), and for block structures with 2S + F > 3, there exist matrices for which μ is less than the infimum. This can be summarized in the following diagram, which shows for which cases the upper bound is guaranteed to be equal to μ.

Theorem 11.5 μ_Δ(M) = inf over D ∈ D of σ̄(DMD^{-1}) if 2S + F ≤ 3.

        F=0   F=1   F=2   F=3   F=4
 S=0     -    yes   yes   yes   no
 S=1    yes   yes   no    no    no
 S=2    no    no    no    no    no
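For the two-full-block example (2) above (S = 0, F = 2, so 2S + F ≤ 3), the upper bound is exact and can be found by a crude scalar search over the one free scaling parameter; a minimal sketch:

```python
import numpy as np

M = np.array([[-0.5, 0.5], [-0.5, 0.5]])
sig = lambda A: np.linalg.svd(A, compute_uv=False)[0]

# structure Delta = diag(d1, d2): D = diag(d, 1) with d > 0 (last block
# normalized to 1); since 2S + F <= 3 here, inf_D sigma_bar(D M D^{-1})
# equals mu, which is 1 for this M
ds = np.logspace(-2, 2, 2001)
ub = min(sig(np.diag([d, 1.0]) @ M @ np.diag([1.0 / d, 1.0])) for d in ds)
print(ub)   # ~1.0, attained at d = 1
```

Here σ̄(DMD^{-1}) = 0.5 sqrt(2 + d² + 1/d²), minimized at d = 1 with value 1, matching μ_Δ(M) = 1 computed from the determinant condition.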


Several of the boxes have connections with standard results.

• S = 0, F = 1 : μ_Δ(M) = σ̄(M).

• S = 1, F = 0 : μ_Δ(M) = ρ(M) = inf over D ∈ D of σ̄(DMD^{-1}). This is a standard result in linear algebra. In fact, without loss of generality, the matrix M can be assumed to be in Jordan canonical form. Now let

$$ J_1 = \begin{bmatrix} \lambda & 1 & & \\ & \lambda & 1 & \\ & & \ddots & \ddots \\ & & & \lambda \end{bmatrix}, \qquad D_1 = \begin{bmatrix} 1 & & & \\ & k & & \\ & & \ddots & \\ & & & k^{n_1-1} \end{bmatrix} \in \mathbb{C}^{n_1\times n_1}. $$

Then the infimum over D1 ∈ C^{n1×n1} of σ̄(D1 J1 D1^{-1}) equals the limit as k → ∞ of σ̄(D1 J1 D1^{-1}), which is |λ|. (Note that by Remark 11.2, the scaling matrix need not be Hermitian.) The conclusion follows by applying this result to each Jordan block.
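The Jordan-block scaling argument is easy to see numerically: as k grows, the superdiagonal entries of D1 J1 D1^{-1} shrink like 1/k and σ̄ approaches |λ|. A short sketch (λ = 0.5 and n1 = 4 are arbitrary choices):

```python
import numpy as np

lam, n1 = 0.5, 4
J1 = lam * np.eye(n1) + np.diag(np.ones(n1 - 1), 1)   # Jordan block
svals = []
for k in [1.0, 10.0, 1e3, 1e6]:
    D1 = np.diag([k ** i for i in range(n1)])          # diag(1, k, ..., k^{n1-1})
    svals.append(np.linalg.svd(D1 @ J1 @ np.linalg.inv(D1), compute_uv=False)[0])
print(svals)   # decreases toward |lam| = 0.5
```

The scaled matrix is λI plus a superdiagonal of 1/k, so its largest singular value tends to |λ| as claimed.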

It is also equivalent to the fact that Lyapunov asymptotic stability and exponential stability are equivalent for discrete time systems. This is because ρ(M) < 1 (exponential stability of a discrete time system matrix M) implies that for some nonsingular D ∈ C^{n×n},

$$ \bar\sigma\left(DMD^{-1}\right) < 1 \quad \text{or} \quad (D^{-1})^* M^* D^* D M D^{-1} - I < 0, $$

which in turn is equivalent to the existence of a P = D*D > 0 such that

$$ M^* P M - P < 0 $$

(Lyapunov asymptotic stability).

• S = 0, F = 2 : This case was studied by Redheffer [1959].

• S = 1, F = 1 : This is equivalent to a state space characterization of the H∞ norm of a discrete time transfer function; see Chapter 21.

• S = 2, F = 0 : This is equivalent to the fact that for multidimensional systems (2-d, in fact), exponential stability is not equivalent to Lyapunov stability.

• S = 0, F ≥ 4 : For this case, the upper bound is not always equal to μ. This is important, as these are the cases that arise most frequently in applications. Fortunately, the bound seems to be close to μ. The worst known example has a ratio of μ over the bound of about .85, and most systems are close to 1.


The above bounds are much more than just computational schemes. They are also theoretically rich and can unify a number of apparently quite different results in linear systems theory. There are several connections with Lyapunov asymptotic stability, two of which were hinted at above, but there are further connections between the upper bound scalings and solutions to Lyapunov and Riccati equations. Indeed, many major theorems in linear systems theory follow from the upper bounds and from some results on linear fractional transformations. The lower bound can be viewed as a natural generalization of the maximum modulus theorem.

Of course, one of the most important uses of the upper bound is as a computational scheme when combined with the lower bound. For reliable use of the μ theory, it is essential to have both upper and lower bounds. Another important feature of the upper bound is that it can be combined with H∞ controller synthesis methods to yield an ad hoc μ-synthesis method. Note that the upper bound, when applied to transfer functions, is simply a scaled H∞ norm. This is exploited in the D-K iteration procedure to perform approximate μ-synthesis (Doyle [1982]), which will be briefly introduced in Section 11.4.

11.2.4 Well Posedness and Performance for Constant LFTs

Let M be a complex matrix partitioned as

$$ M = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix} \qquad (11.15) $$

and suppose there are two defined block structures, Δ1 and Δ2, which are compatible in size with M11 and M22, respectively. Define a third structure Δ as

$$ \mathbf{\Delta} = \left\{ \begin{bmatrix} \Delta_1 & 0 \\ 0 & \Delta_2 \end{bmatrix} : \Delta_1 \in \mathbf{\Delta}_1, \ \Delta_2 \in \mathbf{\Delta}_2 \right\}. \qquad (11.16) $$

Now we may compute μ with respect to three structures. The notation we use to keep track of these computations is as follows: μ1(·) is with respect to Δ1, μ2(·) is with respect to Δ2, and μ_Δ(·) is with respect to Δ. With this notation, μ1(M11), μ2(M22), and μ_Δ(M) all make sense, though, for instance, μ1(M) does not.

This section is interested in the following constant matrix problems:

• Determine whether the LFT F_l(M, Δ2) is well posed for all Δ2 ∈ Δ2 with σ̄(Δ2) ≤ β (respectively, < β), and,

• if so, determine how "large" F_l(M, Δ2) can get for this norm-bounded set of perturbations.

Let Δ2 ∈ Δ2. Recall that F_l(M, Δ2) is well posed if I − M22Δ2 is invertible. The first theorem is nothing more than a restatement of the definition of μ.

Theorem 11.6 The linear fractional transformation F_l(M, Δ2) is well posed


(a) for all Δ2 ∈ BΔ2 if and only if μ2(M22) < 1;

(b) for all Δ2 ∈ B°Δ2 if and only if μ2(M22) ≤ 1.

As the "perturbation" Δ2 deviates from zero, the matrix F_l(M, Δ2) deviates from M11. The range of values that μ1(F_l(M, Δ2)) takes on is intimately related to μ_Δ(M), as shown in the following theorem:

Theorem 11.7 (MAIN LOOP THEOREM) The following are equivalent:

$$ \mu_\Delta(M) < 1 \iff \begin{cases} \mu_2(M_{22}) < 1, \ \text{and} \\ \displaystyle\max_{\Delta_2\in B\mathbf{\Delta}_2} \mu_1\big(F_l(M,\Delta_2)\big) < 1; \end{cases} $$

$$ \mu_\Delta(M) \le 1 \iff \begin{cases} \mu_2(M_{22}) \le 1, \ \text{and} \\ \displaystyle\max_{\Delta_2\in B^o\mathbf{\Delta}_2} \mu_1\big(F_l(M,\Delta_2)\big) \le 1. \end{cases} $$

Proof. We shall only prove the first part of the equivalence. The proof for the second part is similar.

(⇐) Let Δi ∈ Δi be given, with σ̄(Δi) ≤ 1, and define Δ = diag[Δ1, Δ2]. Obviously Δ ∈ Δ. Now

$$ \det(I - M\Delta) = \det \begin{bmatrix} I - M_{11}\Delta_1 & -M_{12}\Delta_2 \\ -M_{21}\Delta_1 & I - M_{22}\Delta_2 \end{bmatrix}. \qquad (11.17) $$

By hypothesis I − M22Δ2 is invertible, and hence det(I − MΔ) becomes

$$ \det(I - M_{22}\Delta_2)\,\det\left(I - M_{11}\Delta_1 - M_{12}\Delta_2 (I - M_{22}\Delta_2)^{-1} M_{21}\Delta_1\right). $$

Collecting the Δ1 terms leaves

$$ \det(I - M\Delta) = \det(I - M_{22}\Delta_2)\,\det\big(I - F_l(M,\Delta_2)\Delta_1\big). \qquad (11.18) $$

But μ1(F_l(M, Δ2)) < 1 and Δ1 ∈ BΔ1, so I − F_l(M, Δ2)Δ1 must be nonsingular. Therefore, I − MΔ is nonsingular and, by definition, μ_Δ(M) < 1.

(⇒) Basically, the argument above is reversed. Again let Δ1 ∈ BΔ1 and Δ2 ∈ BΔ2 be given, and define Δ = diag[Δ1, Δ2]. Then Δ ∈ BΔ and, by hypothesis, det(I − MΔ) ≠ 0. It is easy to verify from the definition of μ that (always)

$$ \mu_\Delta(M) \ge \max\left\{\mu_1(M_{11}), \ \mu_2(M_{22})\right\}. $$

We can see that μ2(M22) < 1, which gives that I − M22Δ2 is also nonsingular. Therefore, the expression in (11.18) is valid, giving

$$ \det(I - M_{22}\Delta_2)\,\det\big(I - F_l(M,\Delta_2)\Delta_1\big) = \det(I - M\Delta) \ne 0. $$


Obviously, I − F_l(M, Δ2)Δ1 is nonsingular for all Δi ∈ BΔi, which indicates that the claim is true. □

Remark 11.3 This theorem forms the basis for all uses of μ in linear system robustness analysis, whether from a state-space, frequency domain, or Lyapunov approach. ~

The role of the block structure Δ2 in the MAIN LOOP theorem is clear: it is the structure from which the perturbations come. The role of the perturbation structure Δ1, however, is often misunderstood. Note that μ1(·) appears on the right hand side of the theorem, so the set Δ1 defines what particular property of F_l(M, Δ2) is considered. As an example, consider the theorem applied with the two simple block structures considered right after Lemma 11.2. Define Δ1 := {δ1 In : δ1 ∈ C}. Hence, for A ∈ C^{n×n}, μ1(A) = ρ(A). Likewise, define Δ2 = C^{m×m}; then for D ∈ C^{m×m}, μ2(D) = σ̄(D). Now let Δ be the diagonal augmentation of these two sets, namely

$$ \mathbf{\Delta} := \left\{ \begin{bmatrix} \delta_1 I_n & 0_{n\times m} \\ 0_{m\times n} & \Delta_2 \end{bmatrix} : \delta_1 \in \mathbb{C}, \ \Delta_2 \in \mathbb{C}^{m\times m} \right\} \subset \mathbb{C}^{(n+m)\times(n+m)}. $$

Let A ∈ C^{n×n}, B ∈ C^{n×m}, C ∈ C^{m×n}, and D ∈ C^{m×m} be given, and interpret them as the state space model of a discrete time system

$$ x_{k+1} = Ax_k + Bu_k $$
$$ y_k = Cx_k + Du_k. $$

And let M ∈ C^{(n+m)×(n+m)} be the block state space matrix of the system

$$ M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}. $$

Applying the theorem with this data gives that the following are equivalent:

• The spectral radius of A satisfies ρ(A) < 1, and

$$ \max_{\delta_1\in\mathbb{C},\ |\delta_1|\le 1} \bar\sigma\left(D + C\delta_1 (I - A\delta_1)^{-1} B\right) < 1. \qquad (11.19) $$

• The maximum singular value of D satisfies σ̄(D) < 1, and

$$ \max_{\Delta_2\in\mathbb{C}^{m\times m},\ \bar\sigma(\Delta_2)\le 1} \rho\left(A + B\Delta_2 (I - D\Delta_2)^{-1} C\right) < 1. \qquad (11.20) $$

• The structured singular value of M satisfies

$$ \mu_\Delta(M) < 1. \qquad (11.21) $$


The first condition is recognized by two things: the system is stable, and the ‖·‖∞ norm of the transfer function from u to y is less than 1 (by replacing δ1 with 1/z):

$$ \|G\|_\infty := \max_{z\in\mathbb{C},\ |z|\ge 1} \bar\sigma\left(D + C(zI - A)^{-1}B\right) = \max_{\delta_1\in\mathbb{C},\ |\delta_1|\le 1} \bar\sigma\left(D + C\delta_1(I - A\delta_1)^{-1}B\right). $$

The second condition implies that (I − DΔ2)^{-1} is well defined for all σ̄(Δ2) ≤ 1 and that a robust stability result holds for the uncertain difference equation

$$ x_{k+1} = \left(A + B\Delta_2 (I - D\Delta_2)^{-1} C\right) x_k $$

where Δ2 is any element in C^{m×m} with σ̄(Δ2) ≤ 1, but otherwise unknown.

This equivalence between the small gain condition, ‖G‖∞ < 1, and the stability robustness of the uncertain difference equation is well known. This is the small gain theorem, in its necessary and sufficient form for linear, time invariant systems with one of the components norm-bounded, but otherwise unknown. What is important to note is that both of these conditions are equivalent to a condition involving the structured singular value of the state space matrix. We have already seen that special cases of μ are the spectral radius and the maximum singular value. Here we see that other important linear system properties, namely robust stability and input-output gain, are also related to a particular case of the structured singular value.
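The two equivalent conditions can be checked numerically on a sample discrete time system. The sketch below constructs a system that satisfies ρ(A) < 1 and ‖G‖∞ < 1 by design (the scalings are arbitrary choices), then samples structured perturbations to confirm the robust stability condition (11.20):

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 3, 2
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = 0.5 * Q                                  # sigma_bar(A) = 0.5, so rho(A) < 1

def unit(M, s):                              # scale M so sigma_bar(M) = s
    return s * M / np.linalg.svd(M, compute_uv=False)[0]

B = unit(rng.standard_normal((n, m)), 0.3)
C = unit(rng.standard_normal((m, n)), 0.3)
D = unit(rng.standard_normal((m, m)), 0.2)

def G(z):                                    # discrete transfer matrix
    return D + C @ np.linalg.inv(z * np.eye(n) - A) @ B

hinf = max(np.linalg.svd(G(np.exp(1j * t)), compute_uv=False)[0]
           for t in np.linspace(0.0, 2 * np.pi, 2000))

# sampled check of the equivalent robust stability condition (11.20)
ok = True
for _ in range(200):
    R = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
    D2 = unit(R, 1.0)                        # sigma_bar(Delta_2) = 1
    Acl = A + B @ D2 @ np.linalg.inv(np.eye(m) - D @ D2) @ C
    ok = ok and max(abs(np.linalg.eigvals(Acl))) < 1
print(hinf < 1, ok)
```

Every sampled closed loop stays stable, as the small gain theorem guarantees once ρ(A) < 1 and ‖G‖∞ < 1 hold.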

11.3 Structured Robust Stability and Performance

11.3.1 Robust Stability

The most well-known use of � as a robustness analysis tool is in the frequency domain.Suppose G(s) is a stable, multi-input, multi-output transfer function of a linear system.For clarity, assume G has q1 inputs and p1 outputs. Let � be a block structure, as inequation (11.3), and assume that the dimensions are such that � � C q1�p1 . We wantto consider feedback perturbations to G which are themselves dynamical systems withthe block-diagonal structure of the set �.

Let M(Δ) denote the set of all block-diagonal and stable rational transfer functions that have block structure Δ:

    M(Δ) := { Δ(·) ∈ RH∞ : Δ(s0) ∈ Δ for all s0 ∈ C+ }.

Theorem 11.8 Let β > 0. The loop shown below is well-posed and internally stable for all Δ(·) ∈ M(Δ) with ‖Δ‖∞ < 1/β if and only if

    sup_{ω ∈ R} μ_Δ(G(jω)) ≤ β.


[Figure: feedback interconnection of G(s) and Δ, with external inputs w1, w2 entering at summing junctions e1, e2.]

Proof. (⇐) By Lemma 11.1, sup_{s ∈ C+} μ_Δ(G(s)) = sup_{ω ∈ R} μ_Δ(G(jω)) ≤ β. Hence det(I − G(s)Δ(s)) ≠ 0 for all s ∈ C+ ∪ {∞} whenever ‖Δ‖∞ < 1/β, i.e., the system is robustly stable.

(⇒) Suppose sup_{ω ∈ R} μ_Δ(G(jω)) > β. Then there is a 0 < ω0 < ∞ such that μ_Δ(G(jω0)) > β. By Remark 11.1, there is a complex Δc ∈ Δ such that each full block has rank 1, σ̄(Δc) < 1/β, and I − G(jω0)Δc is singular. Next, using the same construction used in the proof of the small gain theorem (Theorem 9.1), one can find a rational Δ(s) such that ‖Δ(s)‖∞ = σ̄(Δc) < 1/β, Δ(jω0) = Δc, and Δ(s) destabilizes the system. □

Hence, the peak value on the μ plot of the frequency response determines the size of perturbations against which the loop is robustly stable.
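Reading the peak off a μ plot can be illustrated with a minimal sketch (NumPy assumed). For the repeated-scalar structure Δ = δI2, μ reduces to the spectral radius, so for G(s) = (1/(s+1))[0 −1; 1 0] (the example used by Tits and Fan [1994] in the remark below) the plot is 1/|jω+1|, peaking at ω = 0:

```python
import numpy as np

J = np.array([[0.0, -1.0], [1.0, 0.0]])

def mu_repeated_scalar(w):
    # For the structure Delta = delta*I, mu(M) equals the spectral radius.
    Gw = J / (1j * w + 1.0)
    return max(abs(np.linalg.eigvals(Gw)))

ws = np.linspace(0.0, 100.0, 5001)
vals = np.array([mu_repeated_scalar(w) for w in ws])
# The plot peaks at w = 0 with value 1 and decays like 1/sqrt(1 + w^2).
```

The peak value 1 says the loop tolerates all such Δ with ‖Δ‖∞ strictly below 1.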

Remark 11.4 Internal stability with a closed ball of uncertainties is more complicated. The following example is shown in Tits and Fan [1994]. Consider

    G(s) = (1/(s+1)) [ 0  −1 ; 1  0 ]

and Δ = δ(s)I2. Then

    sup_{ω ∈ R} μ_Δ(G(jω)) = sup_{ω ∈ R} 1/|jω + 1| = μ_Δ(G(j0)) = 1.

On the other hand, μ_Δ(G(s)) < 1 for all s ≠ 0, s ∈ C+, and the only matrices of the form Δ = γI2 with |γ| ≤ 1 for which

    det(I − G(0)Δ) = 0

are the complex matrices ±jI2. Thus, clearly, (I − G(s)Δ(s))⁻¹ ∈ RH∞ for all real rational Δ(s) = δ(s)I2 with ‖Δ‖∞ ≤ 1, since δ(0) must be real. This shows that sup_{ω ∈ R} μ_Δ(G(jω)) < 1 is not necessary for (I − G(s)Δ(s))⁻¹ ∈ RH∞ with the closed ball of structured uncertainty ‖Δ‖∞ ≤ 1. Similar examples with no repeated blocks are generated by setting G(s) = (1/(s+1))M, where M is any real matrix with μ_Δ(M) = 1 for which there is no real Δ ∈ Δ with σ̄(Δ) = 1 such that det(I − MΔ) = 0. For example, take the rank-two real matrix M constructed in Packard and Doyle [1993] as the product of a real 3×2 and a real 2×3 matrix in parameters α, β, γ with γ² = 1/2 and α² + 2β² = 1, and let

    Δ = { diag(δ1, δ2, δ3) : δi ∈ C }.

Then it is shown in Packard and Doyle [1993] that μ_Δ(M) = 1 and all Δ ∈ Δ with σ̄(Δ) = 1 that satisfy det(I − MΔ) = 0 must be complex. ♦

Remark 11.5 Let Δ ∈ RH∞ be a structured uncertainty and

    G(s) = [ G11(s)  G12(s) ; G21(s)  G22(s) ] ∈ RH∞;

then Fu(G, Δ) ∈ RH∞ does not necessarily imply (I − G11Δ)⁻¹ ∈ RH∞, whether Δ is in an open ball or in a closed ball. For example, consider

    G(s) = [ 1/(s+1)   0          1
             0         10/(s+1)   0
             1         0          0 ]

and Δ = diag(δ1, δ2) with ‖Δ‖∞ < 1. Then

    Fu(G, Δ) = δ1 / ( 1 − δ1/(s+1) ) ∈ RH∞

for all admissible Δ (‖Δ‖∞ < 1), but (I − G11Δ)⁻¹ ∈ RH∞ holds only for ‖Δ‖∞ < 0.1. ♦

11.3.2 Robust Performance

Often, stability is not the only property of a closed-loop system that must be robust to perturbations. Typically, there are exogenous disturbances acting on the system (wind gusts, sensor noise) which result in tracking and regulation errors. Under perturbation, the effect that these disturbances have on error signals can greatly increase. In most cases, long before the onset of instability, the closed-loop performance will degrade to the point of unacceptability; hence the need for a "robust performance" test. Such a test will indicate the worst-case level of performance degradation associated with a given level of perturbations.

Assume Gp is a stable, real-rational, proper transfer function with q1 + q2 inputs and p1 + p2 outputs. Partition Gp in the obvious manner

    Gp(s) = [ G11  G12 ; G21  G22 ]

so that G11 has q1 inputs and p1 outputs, and so on. Let Δ ⊂ C^{q1×p1} be a block structure, as in equation (11.3). Define an augmented block structure

    Δ_P := { [ Δ  0 ; 0  Δf ] : Δ ∈ Δ, Δf ∈ C^{q2×p2} }.

The setup is to theoretically address the robust performance questions about the loop shown below.

[Figure: Δ(s) in feedback around Gp(s), with exogenous input w and output z.]

The transfer function from w to z is denoted by Fu(Gp, Δ).

Theorem 11.9 Let β > 0. For all Δ(·) ∈ M(Δ) with ‖Δ‖∞ < 1/β, the loop shown above is well-posed, internally stable, and ‖Fu(Gp, Δ)‖∞ ≤ β if and only if

    sup_{ω ∈ R} μ_{Δ_P}(Gp(jω)) ≤ β.

Note that, by internal stability, sup_{ω ∈ R} μ_Δ(G11(jω)) ≤ β; the proof of this theorem then runs exactly along the lines of the earlier proof of Theorem 11.8, but also appeals to Theorem 11.7. This is a remarkably useful theorem. It says that a robust performance problem is equivalent to a robust stability problem with the augmented uncertainty structure Δ_P, as shown in Figure 11.5.

[Figure 11.5: Robust Performance vs Robust Stability. Gp(s) with the augmented uncertainty diag(Δ, Δf) in feedback.]

11.3.3 Two Block μ: Robust Performance Revisited

Suppose that the uncertainty block is given by

    Δ = [ Δ1  0 ; 0  Δ2 ] ∈ RH∞

with ‖Δ‖∞ < 1 and that the interconnection model G is given by

    G(s) = [ G11(s)  G12(s) ; G21(s)  G22(s) ] ∈ RH∞.

Then the closed-loop system is well-posed and internally stable for all such Δ iff sup_ω μ_Δ(G(jω)) ≤ 1. Let

    Dω = [ dω I  0 ; 0  I ],  dω ∈ R+;

then

    Dω G(jω) Dω⁻¹ = [ G11(jω)  dω G12(jω) ; (1/dω) G21(jω)  G22(jω) ].

Hence, by Theorem 11.5, at each frequency ω

    μ_Δ(G(jω)) = inf_{dω ∈ R+} σ̄( [ G11(jω)  dω G12(jω) ; (1/dω) G21(jω)  G22(jω) ] ).    (11.22)

Since the minimization is convex in log dω [see Doyle, 1982], the optimal dω can be found by a search; however, two approximations to dω can be obtained easily by approximating the right-hand side of (11.22):

(1) From Lemma 2.10, we have

    μ_Δ(G(jω)) ≤ inf_{dω ∈ R+} σ̄( [ ‖G11(jω)‖  dω ‖G12(jω)‖ ; (1/dω) ‖G21(jω)‖  ‖G22(jω)‖ ] )
              ≤ √( inf_{dω ∈ R+} { ‖G11(jω)‖² + dω² ‖G12(jω)‖² + (1/dω²) ‖G21(jω)‖² + ‖G22(jω)‖² } )
              = √( ‖G11(jω)‖² + ‖G22(jω)‖² + 2 ‖G12(jω)‖ ‖G21(jω)‖ )

with the minimizing dω given by

    dω = √( ‖G21(jω)‖ / ‖G12(jω)‖ )   if G12 ≠ 0 and G21 ≠ 0;
    dω = 0                            if G21 = 0;
    dω = ∞                            if G12 = 0.    (11.23)

(2) An alternative approximation can be obtained by using the Frobenius norm:

    μ_Δ(G(jω)) ≤ inf_{dω ∈ R+} ‖ [ G11(jω)  dω G12(jω) ; (1/dω) G21(jω)  G22(jω) ] ‖_F
              = √( inf_{dω ∈ R+} { ‖G11(jω)‖_F² + dω² ‖G12(jω)‖_F² + (1/dω²) ‖G21(jω)‖_F² + ‖G22(jω)‖_F² } )
              = √( ‖G11(jω)‖_F² + ‖G22(jω)‖_F² + 2 ‖G12(jω)‖_F ‖G21(jω)‖_F )

with the minimizing dω given by

    d̃ω = √( ‖G21(jω)‖_F / ‖G12(jω)‖_F )   if G12 ≠ 0 and G21 ≠ 0;
    d̃ω = 0                                if G21 = 0;
    d̃ω = ∞                                if G12 = 0.    (11.24)

It can be shown that the approximations for the scalar dω obtained above are exact for a 2×2 matrix G. For higher-dimensional G, the approximations for dω are still reasonably good. Hence an approximation of μ can be obtained as

    μ_Δ(G(jω)) ≈ σ̄( [ G11(jω)  dω G12(jω) ; (1/dω) G21(jω)  G22(jω) ] )    (11.25)

or, alternatively, as

    μ_Δ(G(jω)) ≈ σ̄( [ G11(jω)  d̃ω G12(jω) ; (1/d̃ω) G21(jω)  G22(jω) ] ).    (11.26)
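The exactness claim for the 2×2 case can be verified numerically; a sketch (NumPy assumed, with an arbitrary test matrix standing in for G(jω)):

```python
import numpy as np

G = np.array([[1.0 + 0.5j, 2.0 - 1.0j],
              [-3.0 + 0.2j, 0.5 + 0.1j]])    # arbitrary 2x2 test matrix

def scaled_norm(d):
    # sigma_max of diag(d, 1) G diag(d, 1)^{-1}
    D = np.diag([d, 1.0])
    return np.linalg.svd(D @ G @ np.linalg.inv(D), compute_uv=False)[0]

d_formula = np.sqrt(abs(G[1, 0]) / abs(G[0, 1]))    # the scalar from (11.23)
grid_min = min(scaled_norm(d) for d in np.linspace(0.05, 10.0, 20000))
```

The formula value attains the grid minimum, i.e., the scalar from (11.23) already minimizes the scaled norm in the 2×2 case.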

We can now see how these approximate μ tests compare with the sufficient conditions obtained in Chapter 9.

Example 11.1 Consider again the robust performance problem of a system with output multiplicative uncertainty in Chapter 9 (see Figure 9.5):

    P_Δ = (I + W1 Δ W2) P,  ‖Δ‖∞ < 1.

Then it is easy to show that the problem can be put in the general framework by selecting

    G(s) = [ −W2 To W1   −W2 To Wd ; We So W1   We So Wd ]

and that the robust performance condition is satisfied if and only if

    ‖W2 To W1‖∞ ≤ 1    (11.27)

and

    ‖Fu(G, Δ)‖∞ ≤ 1    (11.28)


for all Δ ∈ RH∞ with ‖Δ‖∞ < 1. But (11.27) and (11.28) are satisfied iff for each frequency ω

    μ_Δ(G(jω)) = inf_{dω ∈ R+} σ̄( [ −W2 To W1   −dω W2 To Wd ; (1/dω) We So W1   We So Wd ] ) ≤ 1.

Note that, in contrast to the sufficient condition obtained in Chapter 9, this condition is an exact test for robust performance. To compare the μ test with the criteria obtained in Chapter 9, some upper bounds for μ can be derived. Let

    dω = √( ‖We So W1‖ / ‖W2 To Wd‖ ).

Then, using the first approximation for μ, we get

    μ_Δ(G(jω)) ≤ √( ‖W2 To W1‖² + ‖We So Wd‖² + 2 ‖W2 To Wd‖ ‖We So W1‖ )
              ≤ √( ‖W2 To W1‖² + ‖We So Wd‖² + 2 σ̄(W1⁻¹ Wd) ‖W2 To W1‖ ‖We So Wd‖ )
              ≤ ‖W2 To W1‖ + σ̄(W1⁻¹ Wd) ‖We So Wd‖,

where W1 is assumed to be invertible in the last two inequalities. The last term is exactly the sufficient robust performance criterion obtained in Chapter 9. It is clear that any term preceding the last forms a tighter test, since σ̄(W1⁻¹ Wd) ≥ 1. Yet another alternative sufficient test can be obtained from the above sequence of inequalities:

    μ_Δ(G(jω)) ≤ √( σ̄(W1⁻¹ Wd) ) ( ‖W2 To W1‖ + ‖We So Wd‖ ).

Note that this sufficient condition is not easy to obtain from the approach taken in Chapter 9 and is potentially less conservative than the bounds derived there. ♦

Next we consider the skewed specification problem, but first the following lemma is needed.

Lemma 11.10 Suppose β = γ1 ≥ γ2 ≥ … ≥ γm = α > 0. Then

    inf_{d ∈ R+} max_i { (dγi)² + 1/(dγi)² } = β/α + α/β.

Proof. Consider the function y = x + 1/x; it is convex, and its maximization over a closed interval is achieved at the boundary of the interval. Hence, for any fixed d,

    max_i { (dγi)² + 1/(dγi)² } = max{ (dβ)² + 1/(dβ)², (dα)² + 1/(dα)² }.

The minimum over d is then obtained iff

    (dβ)² + 1/(dβ)² = (dα)² + 1/(dα)²,

which gives d² = 1/(αβ). The result follows from substituting d. □
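Lemma 11.10 can be sanity-checked numerically; a sketch (NumPy assumed) with the arbitrary test values γ = (4, 2, 1), so β = 4 and α = 1:

```python
import numpy as np

gammas = np.array([4.0, 2.0, 1.0])          # beta = 4, alpha = 1
beta, alpha = gammas[0], gammas[-1]

def worst(d):
    # max_i (d*gamma_i)^2 + 1/(d*gamma_i)^2
    v = (d * gammas) ** 2
    return np.max(v + 1.0 / v)

numeric = min(worst(d) for d in np.linspace(0.05, 5.0, 20000))
closed_form = beta / alpha + alpha / beta   # lemma's value, here 4.25
d_opt = 1.0 / np.sqrt(alpha * beta)         # from d^2 = 1/(alpha*beta)
```

The brute-force minimum over a fine grid matches the closed form, and the optimizing d equals 1/√(αβ) as the proof shows.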

Example 11.2 As another example, consider again the skewed specification problem from Chapter 9. The corresponding G matrix is given by

    G = [ −W2 Ti W1   −W2 K So Wd ; We So P W1   We So Wd ].

So the robust performance specification is satisfied iff

    μ_Δ(G(jω)) = inf_{dω ∈ R+} σ̄( [ −W2 Ti W1   −dω W2 K So Wd ; (1/dω) We So P W1   We So Wd ] ) ≤ 1

for all ω ≥ 0. As in the last example, an upper bound can be obtained by taking

    dω = √( ‖We So P W1‖ / ‖W2 K So Wd‖ ).

Then

    μ_Δ(G(jω)) ≤ √( σ̄(Wd⁻¹ P W1) ) ( ‖W2 Ti W1‖ + ‖We So Wd‖ ).

In particular, this suggests that the robust performance margin is inversely proportional to the square root of the plant condition number if Wd = I and W1 = I. This can be further illustrated by considering a plant-inverting control system.

To simplify the exposition, we make the following assumptions:

    We = ws I,  Wd = I,  W1 = I,  W2 = wt I,

and P is stable and has a stable inverse (i.e., P is minimum phase; P can be strictly proper). Furthermore, we assume that the controller has the form

    K(s) = P⁻¹(s) l(s),

where l(s) is a scalar loop transfer function which makes K(s) proper and stabilizes the closed loop. This compensator produces diagonal sensitivity and complementary sensitivity functions with identical diagonal elements, namely

    So = Si = (1/(1 + l(s))) I,  To = Ti = (l(s)/(1 + l(s))) I.


Denote

    ε(s) = 1/(1 + l(s)),  τ(s) = l(s)/(1 + l(s))

and substitute these expressions into G; we get

    G = [ −wt τ I   −wt τ P⁻¹ ; ws ε P   ws ε I ].

The structured singular value for G at frequency ω can be computed by

    μ_Δ(G(jω)) = inf_{d ∈ R+} σ̄( [ −wt τ I   −wt τ (dP)⁻¹ ; ws ε dP   ws ε I ] ).

Let the singular value decomposition of P(jω) at frequency ω be

    P(jω) = U Σ V*,  Σ = diag(σ1, σ2, …, σm)

with σ1 = σ̄ and σm = σ̲, where m is the dimension of P. Then

    μ_Δ(G(jω)) = inf_{d ∈ R+} σ̄( [ −wt τ I   −wt τ (dΣ)⁻¹ ; ws ε dΣ   ws ε I ] ),

since unitary operations do not change the singular values of a matrix. Note that

    [ −wt τ I   −wt τ (dΣ)⁻¹ ; ws ε dΣ   ws ε I ] = P1 diag(M1, M2, …, Mm) P2,

where P1 and P2 are permutation matrices and where

    Mi = [ −wt τ   −wt τ (dσi)⁻¹ ; ws ε dσi   ws ε ].

Hence

    μ_Δ(G(jω)) = inf_{d ∈ R+} max_i σ̄(Mi)
              = inf_{d ∈ R+} max_i σ̄( [ −wt τ ; ws ε dσi ] [ 1   (dσi)⁻¹ ] )
              = inf_{d ∈ R+} max_i √( (1 + |dσi|⁻²)( |ws ε dσi|² + |wt τ|² ) )
              = inf_{d ∈ R+} max_i √( |ws ε|² + |wt τ|² + |ws ε dσi|² + |wt τ / (dσi)|² ).


Using Lemma 11.10, it is easy to show that the maximum is achieved at either σ̄ or σ̲ and that the optimal d is given by

    d² = |wt τ| / ( |ws ε| σ̄ σ̲ ),

so the structured singular value is

    μ_Δ(G(jω)) = √( |ws ε|² + |wt τ|² + |ws ε| |wt τ| [ κ(P) + 1/κ(P) ] ).    (11.29)

Note that if |ws ε| and |wt τ| are not too large, which is guaranteed if the nominal performance and robust stability conditions are satisfied, then the structured singular value is proportional to the square root of the plant condition number:

    μ_Δ(G(jω)) ≈ √( |ws ε| |wt τ| κ(P) ).    (11.30)

This confirms our intuition that an ill-conditioned plant with skewed specifications is hard to control. ♦

11.3.4 Approximation of Multiple Full Block μ

The approximations given in the last subsection can be generalized to the multiple-block μ problem by assuming that M is partitioned consistently with the structure of

    Δ = diag(Δ1, Δ2, …, ΔF),

so that

    M = [ M11  M12  …  M1F
          M21  M22  …  M2F
          ⋮    ⋮        ⋮
          MF1  MF2  …  MFF ]

and

    D = diag(d1 I, …, d_{F−1} I, I).

Now

    D M D⁻¹ = [ Mij di/dj ],  dF := 1,

and hence

    μ_Δ(M) ≤ inf_{D ∈ D} σ̄(D M D⁻¹) = inf_{D ∈ D} σ̄( [ Mij di/dj ] )
           ≤ inf_{D ∈ D} σ̄( [ ‖Mij‖ di/dj ] )
           ≤ inf_{D ∈ D} √( Σ_{i=1}^F Σ_{j=1}^F ‖Mij‖² di²/dj² )
           ≤ inf_{D ∈ D} √( Σ_{i=1}^F Σ_{j=1}^F ‖Mij‖_F² di²/dj² ).

An approximate D can be found by solving the following minimization problem:

    inf_{D ∈ D} Σ_{i=1}^F Σ_{j=1}^F ‖Mij‖² di²/dj²

or, more conveniently, by minimizing

    inf_{D ∈ D} Σ_{i=1}^F Σ_{j=1}^F ‖Mij‖_F² di²/dj²

with dF = 1. The optimal di minimizing the above two problems satisfy, respectively,

    dk⁴ = ( Σ_{i≠k} ‖Mik‖² di² ) / ( Σ_{j≠k} ‖Mkj‖² / dj² ),  k = 1, 2, …, F−1,    (11.31)

and

    dk⁴ = ( Σ_{i≠k} ‖Mik‖_F² di² ) / ( Σ_{j≠k} ‖Mkj‖_F² / dj² ),  k = 1, 2, …, F−1.    (11.32)

Using these relations, dk can be obtained by iteration.

Example 11.3 Consider the 3×3 complex matrix

    M = [ 1+j   10−2j   −20j
          5j    3+j     −1+3j
          −2    j       4−j ]

with structure Δ = diag(δ1, δ2, δ3). The largest singular value of M is σ̄(M) = 22.9094, and the structured singular value of M computed using μ-TOOLS is equal to its upper bound:

    μ_Δ(M) = inf_{D ∈ D} σ̄(D M D⁻¹) = 11.9636

with the optimal scaling D_opt = diag(0.3955, 0.6847, 1). The optimal D minimizing

    inf_{D ∈ D} Σ_{i=1}^F Σ_{j=1}^F ‖Mij‖² di²/dj²

is D_subopt = diag(0.3212, 0.4643, 1), which is solved from equation (11.31). Using this D_subopt, we obtain another upper bound for the structured singular value:

    μ_Δ(M) ≤ σ̄(D_subopt M D_subopt⁻¹) = 12.2538.

One may also use this D_subopt as an initial guess for the exact optimization. ♦
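The iteration (11.31) and the numbers of this example can be reproduced in a few lines (NumPy assumed; with three scalar blocks, ‖Mij‖ is simply |mij| and F = 3 with d3 = 1):

```python
import numpy as np

M = np.array([[1 + 1j, 10 - 2j, -20j],
              [5j,     3 + 1j,  -1 + 3j],
              [-2,     1j,      4 - 1j]])
W = np.abs(M)                 # ||M_ij|| for 1x1 blocks
F = 3
d = np.ones(F)                # d_F stays fixed at 1

for _ in range(200):          # fixed-point iteration (11.31)
    for k in range(F - 1):
        num = sum(W[i, k] ** 2 * d[i] ** 2 for i in range(F) if i != k)
        den = sum(W[k, j] ** 2 / d[j] ** 2 for j in range(F) if j != k)
        d[k] = (num / den) ** 0.25

D = np.diag(d)
sigma_M = np.linalg.svd(M, compute_uv=False)[0]
bound = np.linalg.svd(D @ M @ np.linalg.inv(D), compute_uv=False)[0]
```

The iteration converges to D_subopt ≈ diag(0.3212, 0.4643, 1), and the resulting upper bound 12.2538 matches the value quoted in the example.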

11.4 Overview of μ Synthesis

This section briefly outlines various synthesis methods. The details are somewhat complicated and are treated in other parts of the book. At this point, we simply want to point out how the analysis theory discussed in the previous sections leads naturally to synthesis questions.

From the analysis results, we see that each case eventually leads to the evaluation of

    ‖M‖α,  α = 2, ∞, or μ,    (11.33)

for some transfer matrix M. Thus, when the controller is put back into the problem, it involves only a simple linear fractional transformation, as shown in Figure 11.6, with

    M = Fl(G, K) = G11 + G12 K (I − G22 K)⁻¹ G21,

where G = [ G11  G12 ; G21  G22 ] is chosen, respectively, as

• nominal performance only (Δ = 0): G = [ P22  P23 ; P32  P33 ];

• robust stability only: G = [ P11  P13 ; P31  P33 ];

• robust performance: G = P = [ P11  P12  P13 ; P21  P22  P23 ; P31  P32  P33 ].

Each case then leads to the synthesis problem

    min_K ‖Fl(G, K)‖α  for α = 2, ∞, or μ,    (11.34)

which is subject to the internal stability of the nominal system.

The solutions of these problems for α = 2 and ∞ are the focus of the rest of this book. The solutions presented in this book unify the two approaches in a common synthesis framework. The α = 2 case was already known in the 1960s, and the results


[Figure 11.6: Synthesis Framework. Lower LFT interconnection of G and K, with input w and output z.]

are simply a new interpretation. The two Riccati solutions for the α = ∞ case were new products of the late 1980s.

The synthesis for the α = μ case is not yet fully solved. Recalling that μ may be obtained by scaling and applying ‖·‖∞ (for F ≤ 3 and S = 0), a reasonable approach is to "solve"

    min_K inf_{D, D⁻¹ ∈ H∞} ‖ D Fl(G, K) D⁻¹ ‖∞    (11.35)

by iteratively solving for K and D. This is the so-called D-K iteration. The stable and minimum-phase scaling matrix D(s) is chosen such that D(s)Δ(s) = Δ(s)D(s) (note that D(s) does not necessarily belong to D, since D(s) need not be Hermitian; see Remark 11.2). For a fixed scaling transfer matrix D, min_K ‖D Fl(G, K) D⁻¹‖∞ is a standard H∞ optimization problem, which will be solved in a later part of the book. For a given stabilizing controller K, inf_{D, D⁻¹ ∈ H∞} ‖D Fl(G, K) D⁻¹‖∞ is a standard convex optimization problem, and it can be solved pointwise in the frequency domain:

    sup_ω inf_{Dω ∈ D} σ̄( Dω Fl(G, K)(jω) Dω⁻¹ ).

Indeed,

    inf_{D, D⁻¹ ∈ H∞} ‖ D Fl(G, K) D⁻¹ ‖∞ = sup_ω inf_{Dω ∈ D} σ̄( Dω Fl(G, K)(jω) Dω⁻¹ ).

This follows intuitively from the following arguments: the left-hand side is always no smaller than the right-hand side; on the other hand, given the minimizing Dω from the right-hand side across frequency, there is always a rational function D(s) uniformly approximating the magnitude frequency response Dω.

Note that when S = 0 (no scalar blocks),

    Dω = diag(d1^ω I, …, d_{F−1}^ω I, I) ∈ D,

which is a block-diagonal scaling matrix applied pointwise across frequency to the frequency response Fl(G, K)(jω).

D-K iteration proceeds by performing this two-parameter minimization in sequential fashion: first minimizing over K with Dω fixed, then minimizing pointwise over Dω with K fixed, then again over K, and again over Dω, etc. Details of this process are summarized in the following steps:


[Figure 11.7: μ-Synthesis via Scaling. The plant G in feedback with K, wrapped by the scalings D and D⁻¹.]

(i) Fix an initial estimate of the scaling matrix Dω ∈ D pointwise across frequency.

(ii) Find scalar transfer functions di(s), di⁻¹(s) ∈ RH∞ for i = 1, 2, …, F−1 such that |di(jω)| ≈ di^ω. This step can be done using interpolation theory [Youla and Saito, 1967]; however, this will usually result in very high-order transfer functions, which explains why this process is currently done mostly by graphical matching using lower-order transfer functions.

(iii) Let D(s) = diag(d1(s)I, …, d_{F−1}(s)I, I). Construct a state-space model for the scaled system

    Ĝ(s) = [ D(s)  0 ; 0  I ] G(s) [ D⁻¹(s)  0 ; 0  I ].

(iv) Solve an H∞ optimization problem to minimize ‖Fl(Ĝ, K)‖∞ over all stabilizing K's. Note that this optimization problem uses the scaled version of G. Let its minimizing controller be denoted by K̂.

(v) Minimize σ̄[ Dω Fl(G, K̂) Dω⁻¹ ] over Dω, pointwise across frequency.⁴ Note that this evaluation uses the minimizing K̂ from the last step, but that G is unscaled. The minimization itself produces a new scaling function. Let this new function be denoted by D̂ω.

(vi) Compare D̂ω with the previous estimate Dω. Stop if they are close; otherwise, replace Dω with D̂ω and return to step (ii).

With either K or D fixed, the global optimum in the other variable may be found using the μ and H∞ solutions. Although the joint optimization of D and K is not convex and

⁴The approximate solutions given in the last section may be used.


the global convergence is not guaranteed, many designs have shown that this approach works very well [see, e.g., Balas, 1990]. In fact, this is probably the most effective design methodology available today for dealing with such complicated problems. The detailed treatment of μ analysis is given in Packard and Doyle [1991]. The rest of this book will focus on the H∞ optimization, which is a fundamental tool for μ synthesis.

11.5 Notes and References

This chapter is partially based on the lecture notes given by Doyle [1984] at Honeywell and partially on the lecture notes by Packard [1991] and the paper by Doyle, Packard, and Zhou [1991]. Parts of section 11.3.3 come from the paper by Stein and Doyle [1991]. The small μ theorem for systems with non-rational plants and uncertainties is proven in Tits [1994]. Other results on μ can be found in Fan and Tits [1986], Fan, Tits, and Doyle [1991], Packard and Doyle [1993], Packard and Pandey [1993], Young [1993], and references therein.

12 Parameterization of Stabilizing Controllers

The basic configuration of the feedback systems considered in this chapter is an LFT, as shown in Figure 12.1, where G is the generalized plant with two sets of inputs: the exogenous inputs w, which include disturbances and commands, and the control inputs u. The plant G also has two sets of outputs: the measured (or sensor) outputs y and the regulated outputs z. K is the controller to be designed. A control problem in this setup is either to analyze some specific properties of the closed loop, e.g., stability or performance, or to design the feedback control K such that the closed-loop system is stable in some appropriate sense and the error signal z is specified, i.e., some performance condition is satisfied. In this chapter we are only concerned with the basic internal stabilization problems. We will see again that this setup is very convenient for the other general control synthesis problems in the coming chapters.

Suppose that a given feedback system is feedback stabilizable. In this chapter, the problem we are mostly interested in is parameterizing all controllers that stabilize the system. The parameterization of all internally stabilizing controllers was first introduced by Youla et al. [1976]; their parameterization uses the coprime factorization technique. Most of the existing results are in the frequency domain, although they can also be transformed to state-space descriptions. In this chapter, we consider this issue in a general setting and directly in state space, without adopting the coprime factorization technique. The controller parameterization is constructed by considering a sequence of special problems: the so-called full information (FI) problems, disturbance feedforward (DF) problems, full control (FC) problems, and output estimation


[Figure 12.1: General System Interconnection. Plant G with exogenous input w, control input u, regulated output z, measured output y, and controller K from y to u.]

(OE) problems. On the other hand, these special problems are also of importance in their own right.

In addition to presenting the controller parameterization, this chapter also aims at introducing the synthesis machinery, which is essential in some control syntheses (the H2 and H∞ control in Chapter 14 and Chapter 16), and at seeing how it works in the controller parameterization problem. The structure of this chapter is as follows: in section 12.1, the conditions for the existence of a stabilizing controller are examined. In section 12.2, we examine the stabilization of the different special problems and establish the relations among them. In section 12.3, the construction of the controller parameterization for the general output feedback problem is considered via the special problems FI, DF, FC, and OE. In section 12.4, the structure of the controller parameterization is displayed. Section 12.5 gives the closed-loop transfer matrix in terms of the parameterization. Section 12.6 considers an alternative approach to the controller parameterization using coprime factorizations and establishes the connections with the state-space approach. This section can be either studied independently of all the preceding sections or skipped over without loss of continuity.

12.1 Existence of Stabilizing Controllers

Consider a system described by the standard block diagram in Figure 12.1. Assume that G(s) has a stabilizable and detectable realization of the form

    G(s) = [ G11(s)  G12(s) ; G21(s)  G22(s) ] = [ A   B1   B2
                                                   C1  D11  D12
                                                   C2  D21  D22 ].    (12.1)

The stabilization problem is to find a feedback mapping K such that the closed-loop system is internally stable; well-posedness is required for this interconnection. This general synthesis question will be called the output feedback (OF) problem.

Definition 12.1 A proper system G is said to be stabilizable through output feedback if there exists a proper controller K internally stabilizing G in Figure 12.1. Moreover, a proper controller K(s) is said to be admissible if it internally stabilizes G.


The following result is standard and follows from Chapter 3.

Lemma 12.1 There exists a proper K achieving internal stability iff (A, B2) is stabilizable and (C2, A) is detectable. Further, let F and L be such that A + B2F and A + LC2 are stable; then an observer-based stabilizing controller is given by

    K(s) = [ A + B2F + LC2 + LD22F   −L
             F                        0 ].

Proof. (⇐) By the stabilizability and detectability assumptions, there exist F and L such that A + B2F and A + LC2 are stable. Now let K(s) be the observer-based controller given in the lemma; then the closed-loop A-matrix is given by

    Ã = [ A      B2F
          −LC2   A + B2F + LC2 ].

It is easy to check that this matrix is similar to the matrix

    [ A + LC2   0
      −LC2      A + B2F ].

Thus the spectrum of Ã equals the union of the spectra of A + LC2 and A + B2F. In particular, Ã is stable.

(⇒) If (A, B2) is not stabilizable or if (C2, A) is not detectable, then there are some eigenvalues of Ã which are fixed in the right half-plane, no matter what the compensator is. The details are left as an exercise. □
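The spectral separation in this proof is easy to verify on a toy example. The matrices A, B2, C2 and the gains F, L below are hand-picked assumptions (not a design), with D22 = 0 for simplicity; NumPy is assumed:

```python
import numpy as np

A = np.array([[0.0, 1.0], [2.0, -1.0]])
B2 = np.array([[0.0], [1.0]])
C2 = np.array([[1.0, 0.0]])

F = np.array([[-4.0, -2.0]])     # A + B2 F has eigenvalues {-1, -2}
L = np.array([[-4.0], [-8.0]])   # A + L C2 has eigenvalues -2.5 +/- 1.94j

# Closed-loop A-matrix from the proof (with D22 = 0):
Atil = np.block([[A, B2 @ F],
                 [-L @ C2, A + B2 @ F + L @ C2]])

expected = np.concatenate([np.linalg.eigvals(A + L @ C2),
                           np.linalg.eigvals(A + B2 @ F)])
```

The spectrum of Ã is exactly the union of the state-feedback and observer spectra, so the closed loop is internally stable.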

The stabilizability and detectability conditions on (A, B2, C2) are assumed throughout the remainder of this chapter.¹ It follows that the realization for G22 is stabilizable and detectable, and these assumptions are enough to yield the following result:

[Figure 12.2: Equivalent Stabilization Diagram. G22 in feedback with K (u into G22, output y into K, back to u).]

¹It should be clear that the stabilizability and detectability of a realization for G do not guarantee the stabilizability and/or detectability of the corresponding realization for G22.


Lemma 12.2 Suppose (A, B2, C2) is stabilizable and detectable and G22 = [ A  B2 ; C2  D22 ]. Then the system in Figure 12.1 is internally stable iff the one in Figure 12.2 is internally stable.

In other words, K(s) internally stabilizes G(s) if and only if it internally stabilizes G22.

Proof. The necessity follows from the definition. To show sufficiency, it suffices to note that the system in Figure 12.1 and that in Figure 12.2 share the same A-matrix, which is obvious. □

From Lemma 12.2, we see that the stabilizing controller for G depends only on G22. Hence all stabilizing controllers for G can be obtained by using only G22, which is how it is usually done in the conventional Youla parameterization. However, it will be shown that the general setup is very convenient and much more useful, since any closed-loop system information can also be considered in the same framework.

Remark 12.1 There should be no confusion between a given realization for a transfer matrix G22 and the realization inherited from G, where G22 is a submatrix. A given realization for G22 may be stabilizable and detectable while the inherited realization may not be. For instance,

    G22 = 1/(s+1) = [ −1  1 ; 1  0 ]

is a minimal realization, but the inherited realization of G22 from

    [ G11  G12 ; G21  G22 ] = [ −1  0  0  1
                                 0  1  1  0
                                 0  1  0  0
                                 1  0  0  0 ]

is

    G22 = [ −1  0  1
             0  1  0
             1  0  0 ] = 1/(s+1),

which is neither stabilizable nor detectable. ♦
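The failure of stabilizability and detectability for the inherited realization can be confirmed with the PBH rank test (NumPy assumed):

```python
import numpy as np

# Inherited realization of G22 from the remark: A = diag(-1, 1), B2 = [1, 0]^T,
# C2 = [1, 0]. The unstable mode at s = 1 fails both PBH rank tests.
A = np.diag([-1.0, 1.0])
B2 = np.array([[1.0], [0.0]])
C2 = np.array([[1.0, 0.0]])
s = 1.0                                    # unstable eigenvalue of A

rank_ctrb = np.linalg.matrix_rank(np.hstack([s * np.eye(2) - A, B2]))
rank_obsv = np.linalg.matrix_rank(np.vstack([s * np.eye(2) - A, C2]))
```

Both ranks are 1 < 2, so the unstable mode is neither stabilizable nor detectable in the inherited realization, while the minimal realization of 1/(s+1) is of course both.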


12.2 Duality and Special Problems

In this section, we will discuss four problems from which the output feedback solutions are constructed via a separation argument. These special problems are fundamental to the approach taken for synthesis in this book, and, as we shall see, they are also of importance in their own right.

12.2.1 Algebraic Duality and Special Problems

Before we get into the study of the algebraic structure of control systems, we introduce the concept of algebraic duality, which will play an important role. It is well known that the concepts of controllability (stabilizability) and observability (detectability) of a system (C, A, B) are dual because of the duality between (C, A, B) and (Bᵀ, Aᵀ, Cᵀ). So, to deal with issues related to a system's controllability and/or observability, we only need to examine the issues related to the observability and/or controllability of its dual system, respectively. The notion of duality can be generalized to a general setting.

Consider a standard system block diagram

    [diagram: plant G with controller K; input w, control u, outputs z and y]

where the plant G and controller K are assumed to be linear time-invariant. Now consider another system, shown below,

    [diagram: plant Gᵀ with controller Kᵀ; input w̃, control ũ, outputs z̃ and ỹ]

whose plant and controller are obtained by transposing G and K. We can check easily that T_{zw}ᵀ = [Fl(G, K)]ᵀ = Fl(Gᵀ, Kᵀ) = T_{z̃w̃}. It is not difficult to see that K internally stabilizes G iff Kᵀ internally stabilizes Gᵀ. We say that these two control structures are algebraically dual; in particular, Gᵀ and Kᵀ are the dual objects of G and K, respectively. So, as far as stabilization or other synthesis problems are concerned, we can obtain the results for Gᵀ from the results for its dual object G if they are available.

Now, we consider some special problems which are related to the general OF problem stated in the last section and which are important in constructing the results for OF problems. The special problems considered here all pertain to the standard block diagram, but with different structures for G. The problems are labeled as


FI. Full information, with the corresponding plant

    G_FI = [ A       B1      B2
             C1      D11     D12
             [I;0]   [0;I]   [0;0] ].

FC. Full control, with the corresponding plant

    G_FC = [ A    B1    [I 0]
             C1   D11   [0 I]
             C2   D21   [0 0] ].

DF. Disturbance feedforward, with the corresponding plant

    G_DF = [ A    B1    B2
             C1   D11   D12
             C2   I     0 ].

OE. Output estimation, with the corresponding plant

    G_OE = [ A    B1    B2
             C1   D11   I
             C2   D21   0 ].

The motivations for these special problems will be given later when they are considered. There are also two additional structures which are standard and which will not be considered in this chapter; they are

SF. State feedback

    G_SF = [ A    B1    B2
             C1   D11   D12
             I    0     0 ].

OI. Output injection

    G_OI = [ A    B1    I
             C1   D11   0
             C2   D21   0 ].


Here we assume that all physical variables have compatible dimensions. We say that these special problems are special cases of the OF problem in the sense that their structures are specified in comparison with the OF problem.

The structure of the transfer matrices shows clearly that FC, OE (and OI) are the duals of FI, DF (and SF), respectively. These relationships are shown in the following diagram:

    DF <----dual----> OE
     ^                 ^
     | equivalent      | equivalent
     v                 v
    FI <----dual----> FC

The precise meaning of "equivalent" in this diagram will be explained below.

12.2.2 Full Information and Disturbance Feedforward

In the FI problem, the controller is provided with full information, since y = [x; w]. For the FI problem, we only need to assume that (A, B2) is stabilizable to guarantee solvability. It is clear that if any output feedback control problem is to be solvable, then the corresponding FI problem has to be solvable, provided FI has the same structure as OF except for the specified parts.

To motivate the name disturbance feedforward, consider the special case with C2 = 0. Then there is no feedback and the measurement is exactly w, where w is generally regarded as a disturbance to the system. Only the disturbance w is fed through directly to the output. As we shall see, the feedback caused by C2 ≠ 0 does not affect the transfer function from w to the output z, but it does affect internal stability. In fact, the conditions for the solvability of the DF problem are that (A, B2) is stabilizable and (C2, A) is detectable.

Now we examine the connection between the DF problem and the FI problem and show the meaning of their equivalence. Suppose that we have controllers $K_{FI}$ and $K_{DF}$, and let $T_{FI}$ and $T_{DF}$ denote the closed-loop $T_{zw}$'s in the standard feedback loops $(G_{FI}, K_{FI})$ and $(G_{DF}, K_{DF})$, respectively.

[Diagram: the feedback loops $(G_{FI}, K_{FI})$ and $(G_{DF}, K_{DF})$, each mapping $w$ to $z$ with measurement $y_{FI}$ (resp. $y_{DF}$) and control $u$]


The question of interest is as follows: given either the $K_{FI}$ or the $K_{DF}$ controller, can we construct the other in such a way that $T_{FI} = T_{DF}$? The answer is positive. Actually, we have the following:

Lemma 12.3 Let $G_{FI}$ and $G_{DF}$ be given as above. Then

(i) $G_{DF}(s) = \begin{bmatrix} I & 0 & 0 \\ 0 & C_2 & I \end{bmatrix} G_{FI}(s)$;

(ii) $G_{FI} = \mathcal{S}(G_{DF}, P_{DF})$ (where $\mathcal{S}(\cdot,\cdot)$ denotes the star product), with

$$P_{DF}(s) = \left[\begin{array}{c|cc} A - B_1C_2 & B_1 & B_2 \\ \hline 0 & 0 & I \\ \begin{bmatrix} I \\ -C_2 \end{bmatrix} & \begin{bmatrix} 0 \\ I \end{bmatrix} & \begin{bmatrix} 0 \\ 0 \end{bmatrix} \end{array}\right].$$

[Diagram: star-product interconnection of $P_{DF}$ and $G_{DF}$]

Proof. Part (i) is obvious. Part (ii) follows from applying the star-product formula. Nevertheless, we will give a direct proof to show the system structure. Let $x$ and $\bar{x}$ denote the states of $G_{DF}$ and $P_{DF}$, respectively. Take $e := x - \bar{x}$ and $\bar{x}$ as the states of the resulting interconnected system; then its realization is

$$\left[\begin{array}{cc|cc} A - B_1C_2 & 0 & 0 & 0 \\ B_1C_2 & A & B_1 & B_2 \\ \hline C_1 & C_1 & D_{11} & D_{12} \\ \begin{bmatrix} 0 \\ C_2 \end{bmatrix} & \begin{bmatrix} I \\ 0 \end{bmatrix} & \begin{bmatrix} 0 \\ I \end{bmatrix} & \begin{bmatrix} 0 \\ 0 \end{bmatrix} \end{array}\right] = \left[\begin{array}{c|cc} A & B_1 & B_2 \\ \hline C_1 & D_{11} & D_{12} \\ \begin{bmatrix} I \\ 0 \end{bmatrix} & \begin{bmatrix} 0 \\ I \end{bmatrix} & \begin{bmatrix} 0 \\ 0 \end{bmatrix} \end{array}\right]$$

(the $e$-dynamics are stable and uncontrollable, so they can be deleted), which is exactly $G_{FI}$, as claimed. $\Box$
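The identity of Lemma 12.3(i) can be checked numerically. Below is a minimal sketch with hypothetical example data (the matrices $A, B_1, \dots$ are not from the book): the frequency responses of $G_{DF}$ and of $\mathrm{diag}(I, [C_2\ I])\,G_{FI}$ are compared at sample points.

```python
import numpy as np

# Hypothetical example data: a 2-state plant with scalar w, u, z, y,
# and D21 = I, D22 = 0 as the DF structure requires.
A   = np.array([[0.0, 1.0], [0.0, 0.0]])
B1  = np.array([[1.0], [0.0]])
B2  = np.array([[0.0], [1.0]])
C1  = np.array([[1.0, 0.0]])
C2  = np.array([[1.0, 0.0]])
D11 = np.array([[0.0]])
D12 = np.array([[1.0]])

def tf(A, B, C, D, s):
    """Evaluate C (sI - A)^{-1} B + D at a complex frequency s."""
    n = A.shape[0]
    return C @ np.linalg.solve(s * np.eye(n) - A, B) + D

n = A.shape[0]
# G_FI: outputs (z, x, w), inputs (w, u).
C_FI = np.vstack([C1, np.eye(n), np.zeros((1, n))])
D_FI = np.block([[D11, D12],
                 [np.zeros((n, 1)), np.zeros((n, 1))],
                 [np.eye(1), np.zeros((1, 1))]])
# G_DF: outputs (z, y_DF) with y_DF = C2 x + w (i.e. D21 = I).
C_DF = np.vstack([C1, C2])
D_DF = np.block([[D11, D12], [np.eye(1), np.zeros((1, 1))]])

# Left multiplier of Lemma 12.3(i): diag(I, [C2 I]).
Lm = np.block([[np.eye(1), np.zeros((1, n)), np.zeros((1, 1))],
               [np.zeros((1, 1)), C2, np.eye(1)]])

for s in (1j, 0.7 + 2.0j):
    G_FI = tf(A, np.hstack([B1, B2]), C_FI, D_FI, s)
    G_DF = tf(A, np.hstack([B1, B2]), C_DF, D_DF, s)
    assert np.allclose(G_DF, Lm @ G_FI)
```

The check is pointwise in $s$; since both sides are rational, agreement at enough points (here, correctness of the construction) confirms the transfer-matrix identity.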

Remark 12.2 There is an alternative way to see part (ii). The point is that in the DF problem, the disturbance $w$ and the system state $x$ can be solved for in terms of $y$ and $u$:

$$\begin{bmatrix} x \\ w \end{bmatrix} = \left[\begin{array}{c|cc} A - B_1C_2 & B_1 & B_2 \\ \hline I & 0 & 0 \\ -C_2 & I & 0 \end{array}\right] \begin{bmatrix} y \\ u \end{bmatrix} =: V \begin{bmatrix} y \\ u \end{bmatrix}.$$

Now connect $V$ with $G_{DF}$ as shown below:

[Diagram: $V$ is driven by the signals $(y, u)$ of the loop around $G_{DF}$ and outputs $(x, w)$]

Then it is easy to show that the transfer function from $(w, u)$ to $(z, x, w)$ is $G_{FI}$ and, furthermore, that internal stability is not changed if $A - B_1C_2$ is stable. ~

The following theorem follows immediately:

Theorem 12.4 Let $G_{FI}$, $G_{DF}$, and $P_{DF}$ be given as above.

(i) $K_{FI} := K_{DF}\begin{bmatrix} C_2 & I \end{bmatrix}$ internally stabilizes $G_{FI}$ if $K_{DF}$ internally stabilizes $G_{DF}$. Furthermore,
$$\mathcal{F}_\ell\left(G_{FI}, K_{DF}\begin{bmatrix} C_2 & I \end{bmatrix}\right) = \mathcal{F}_\ell(G_{DF}, K_{DF}).$$

(ii) Suppose $A - B_1C_2$ is stable. Then $K_{DF} := \mathcal{F}_\ell(P_{DF}, K_{FI})$ (the lower LFT of $P_{DF}$ with $K_{FI}$) internally stabilizes $G_{DF}$ if $K_{FI}$ internally stabilizes $G_{FI}$. Furthermore,
$$\mathcal{F}_\ell(G_{DF}, \mathcal{F}_\ell(P_{DF}, K_{FI})) = \mathcal{F}_\ell(G_{FI}, K_{FI}).$$

Remark 12.3 This theorem shows that if $A - B_1C_2$ is stable, then problems FI and DF are equivalent in the above sense. Note that the transfer function from $w$ to $y_{DF}$ is
$$G_{21}(s) = \left[\begin{array}{c|c} A & B_1 \\ \hline C_2 & I \end{array}\right].$$
Hence this stability condition implies that $G_{21}(s)$ has neither right half-plane invariant zeros nor hidden unstable modes. ~


12.2.3 Full Control and Output Estimation

For the FC problem, the term Full Control is used because the controller has full access to both the state, through output injection, and the output $z$. The only restriction on the controller is that it must work with the measurement $y$. This problem is dual to the FI case and has the dual solvability condition to the FI problem, which is also guaranteed by the assumptions on OF problems. The solutions to this kind of control problem can be obtained by first transposing $G_{FC}$, then solving the corresponding FI problem, and finally transposing back.

On the other hand, problem OE is dual to DF. Thus the discussion of the DF problem is relevant here, when appropriately dualized. The solvability conditions for the OE problem are that $(A, B_2)$ is stabilizable and $(C_2, A)$ is detectable. To examine the physical meaning of output estimation, first note that
$$z = C_1x + D_{11}w + u,$$

where $z$ is to be controlled by an appropriately designed control $u$. In general, our control objective will be to specify $z$ in some well-defined mathematical sense. In other words, it is desired to find a $u$ that will estimate $C_1x + D_{11}w$ in such a defined mathematical sense. So this kind of control problem can be regarded as an estimation problem. We are focusing on this particular estimation problem because it is the one that arises in solving the output feedback problem. A more conventional estimation problem would be the special case where no internal stability condition is imposed and $B_2 = 0$. Then the problem would be that of estimating the output $z$ given the measurement $y$. This special case motivates the term output estimation, and its solution can be obtained immediately from the results for the general case.

The following discussion will explain the meaning of the equivalence between the FC and OE problems. Consider the feedback loops $(G_{FC}, K_{FC})$ and $(G_{OE}, K_{OE})$, with measurements $y_{FC}$ and $y_{OE}$, respectively.

[Diagram: the feedback loops $(G_{FC}, K_{FC})$ and $(G_{OE}, K_{OE})$, each mapping $w$ to $z$]

We have results similar to the ones in the last subsection:

Lemma 12.5 Let $G_{FC}$ and $G_{OE}$ be given as above. Then

(i) $G_{OE}(s) = G_{FC}(s)\begin{bmatrix} I & 0 \\ 0 & B_2 \\ 0 & I \end{bmatrix}$;


(ii) $G_{FC} = \mathcal{S}(G_{OE}, P_{OE})$, where $P_{OE}$ is given by

$$P_{OE}(s) = \left[\begin{array}{c|cc} A - B_2C_1 & 0 & \begin{bmatrix} I & -B_2 \end{bmatrix} \\ \hline C_1 & 0 & \begin{bmatrix} 0 & I \end{bmatrix} \\ C_2 & I & \begin{bmatrix} 0 & 0 \end{bmatrix} \end{array}\right].$$

Theorem 12.6 Let $G_{FC}$, $G_{OE}$, and $P_{OE}$ be given as above.

(i) $K_{FC} := \begin{bmatrix} B_2 \\ I \end{bmatrix} K_{OE}$ internally stabilizes $G_{FC}$ if $K_{OE}$ internally stabilizes $G_{OE}$. Furthermore,
$$\mathcal{F}_\ell\left(G_{FC}, \begin{bmatrix} B_2 \\ I \end{bmatrix} K_{OE}\right) = \mathcal{F}_\ell(G_{OE}, K_{OE}).$$

(ii) Suppose $A - B_2C_1$ is stable. Then $K_{OE} := \mathcal{F}_\ell(P_{OE}, K_{FC})$ internally stabilizes $G_{OE}$ if $K_{FC}$ internally stabilizes $G_{FC}$. Furthermore,
$$\mathcal{F}_\ell(G_{OE}, \mathcal{F}_\ell(P_{OE}, K_{FC})) = \mathcal{F}_\ell(G_{FC}, K_{FC}).$$

Remark 12.4 It is seen that if $A - B_2C_1$ is stable, then the FC and OE problems are equivalent in the above sense. This condition implies that the transfer matrix $G_{12}(s)$ from $u$ to $z$ has neither right half-plane invariant zeros nor hidden unstable modes, which indicates that it has a stable inverse. ~

12.3 Parameterization of All Stabilizing Controllers

12.3.1 Problem Statement and Solution

Consider again the standard system block diagram in Figure 12.1 with

$$G(s) = \left[\begin{array}{c|cc} A & B_1 & B_2 \\ \hline C_1 & D_{11} & D_{12} \\ C_2 & D_{21} & D_{22} \end{array}\right] = \begin{bmatrix} G_{11}(s) & G_{12}(s) \\ G_{21}(s) & G_{22}(s) \end{bmatrix}.$$

Suppose $(A, B_2)$ is stabilizable and $(C_2, A)$ is detectable. In this section we discuss the following problem:


Given a plant G, parameterize all controllers K that internally stabilize G.

This parameterization of all stabilizing controllers is usually called the Youla parameterization. As we have mentioned earlier, the stabilizing controllers for $G$ depend only on $G_{22}$. However, it is more convenient to consider the problem in the general framework, as will be shown. The parameterization of all stabilizing controllers is easy when the plant itself is stable.

Theorem 12.7 Suppose $G \in \mathcal{RH}_\infty$; then the set of all stabilizing controllers can be described as
$$K = Q(I + G_{22}Q)^{-1} \tag{12.2}$$
for any $Q \in \mathcal{RH}_\infty$ with $I + D_{22}Q(\infty)$ nonsingular.

Remark 12.5 This result is very natural in view of Corollary 5.5, which says that a controller $K$ stabilizes a stable plant $G_{22}$ iff $K(I - G_{22}K)^{-1}$ is stable. Now suppose $Q = K(I - G_{22}K)^{-1}$ is a stable transfer matrix; then $K$ can be solved from this equation, which gives exactly the controller parameterization in the above theorem. ~

Proof. Note that $G_{22}(s)$ is stable by the assumptions on $G$. Straightforward algebra verifies that the controllers given above stabilize $G_{22}$. On the other hand, suppose $K_0$ is a stabilizing controller; then $Q_0 := K_0(I - G_{22}K_0)^{-1} \in \mathcal{RH}_\infty$, so $K_0$ can be expressed as $K_0 = Q_0(I + G_{22}Q_0)^{-1}$. The invertibility in the last equation is guaranteed by the well-posedness of the interconnected system with controller $K_0$, since $I + D_{22}Q_0(\infty) = (I - D_{22}K_0(\infty))^{-1}$. $\Box$
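A scalar sketch of Theorem 12.7, with a hypothetical plant not taken from the book: for the stable plant $G_{22}(s) = 1/(s+1)$ and the constant parameter $Q = q$, the formula gives $K = q(s+1)/(s+1+q)$, the parameter is recovered as $K(I - G_{22}K)^{-1} = Q$, and the closed loop is stable.

```python
import numpy as np

# Hypothetical scalar example (not from the book): G22(s) = 1/(s+1), Q = q.
q = 2.0

def g22(s):
    return 1.0 / (s + 1.0)

def k(s):
    # K = Q (I + G22 Q)^{-1} = q (s+1) / (s+1+q)
    return q * (s + 1.0) / (s + 1.0 + q)

for s in (1j, 0.5 + 3.0j):
    # The recovered parameter K (I - G22 K)^{-1} equals q at every frequency.
    assert np.isclose(k(s) / (1.0 - g22(s) * k(s)), q)

# Internal stability: the closed-loop characteristic polynomial is s + 1 + q,
# whose root -(1+q) lies in the open left half-plane for any q > -1.
poles = np.roots([1.0, 1.0 + q])
assert np.all(poles.real < 0)
```

The check illustrates why stability of $Q$ suffices when $G_{22}$ itself is stable: the map between $K$ and $Q$ is a bijection on the stabilizing set.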

However, if $G$ is not stable, the parameterization is much more complicated. The results can be stated more conveniently using state-space representations.

Theorem 12.8 Let $F$ and $L$ be such that $A + LC_2$ and $A + B_2F$ are stable; then all controllers that internally stabilize $G$ can be parameterized as the transfer matrix from $y$ to $u$ of the feedback interconnection of $J$ with a free parameter $Q$, i.e., $K = \mathcal{F}_\ell(J, Q)$, where

$$J = \left[\begin{array}{c|cc} A + B_2F + LC_2 + LD_{22}F & -L & B_2 + LD_{22} \\ \hline F & 0 & I \\ -(C_2 + D_{22}F) & I & -D_{22} \end{array}\right]$$

with any $Q \in \mathcal{RH}_\infty$ and $I + D_{22}Q(\infty)$ nonsingular.
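Setting $Q = 0$ in Theorem 12.8 leaves the central, observer-based controller. The sketch below uses a hypothetical double-integrator example (not from the book, with $D_{22} = 0$) and verifies the separation property: the closed-loop spectrum is $\mathrm{eig}(A + B_2F) \cup \mathrm{eig}(A + LC_2)$.

```python
import numpy as np

# Hypothetical example data (not from the book), D22 = 0. With Q = 0 the
# controller is xhat' = (A+B2F+LC2) xhat - L y, u = F xhat.
A  = np.array([[0.0, 1.0], [0.0, 0.0]])
B2 = np.array([[0.0], [1.0]])
C2 = np.array([[1.0, 0.0]])
F  = np.array([[-2.0, -3.0]])    # places eig(A+B2F) at {-1, -2}
L  = np.array([[-7.0], [-12.0]]) # places eig(A+LC2) at {-3, -4}

# Closed loop of plant (A, B2, C2) with the central controller:
# x' = A x + B2 F xhat,  xhat' = -L C2 x + (A+B2F+LC2) xhat.
Acl = np.block([[A, B2 @ F],
                [-L @ C2, A + B2 @ F + L @ C2]])
eigs = np.sort(np.linalg.eigvals(Acl).real)
# Separation: closed-loop eigenvalues are {-1,-2} union {-3,-4}.
assert np.allclose(eigs, [-4.0, -3.0, -2.0, -1.0], atol=1e-6)
```

The same assembly works for any stabilizing $F$ and $L$; the block $Q$ only adds stable dynamics on top of this central design.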

A non-constructive proof of the theorem can be given by using the same argument as in the proof of Theorem 12.7: first verify that any controller given by the formula indeed stabilizes the system $G$, and then show that we can construct a stable $Q$ for any


given stabilizing controller $K$. This approach, however, does not give much insight into the controller construction and thus cannot be generalized to other synthesis problems.

The conventional Youla approach to this problem is via coprime factorization [Youla et al., 1976; Vidyasagar, 1985; Desoer et al., 1982], which will be adopted in the later part of this chapter as an alternative approach.

In the following sections, we will present a novel approach to this problem that does not use coprime factorizations. The idea of this approach is to reduce the output feedback problem to some simpler problems, such as FI and OE or FC and DF, which admit simple solutions, and then to solve the output feedback problem by the separation argument. The advantages of this approach are that it is simple and that many other synthesis problems, such as the $\mathcal{H}_2$ and $\mathcal{H}_\infty$ optimal control problems in Chapters 14 and 16, can be solved using the same machinery.

Readers should bear in mind that our objective here is to find all admissible controllers for the OF problem. So at first we will build up enough tools for this objective by considering the special problems. We will see that it is not necessary to parameterize all stabilizing controllers for these special problems to get the required tools. Instead, we only parameterize certain equivalence classes of controllers which generate the same control action.

Definition 12.2 Two controllers $K$ and $\hat{K}$ have equivalent control actions if their corresponding closed-loop transfer matrices are identical, i.e., $\mathcal{F}_\ell(G, K) = \mathcal{F}_\ell(G, \hat{K})$, written as $K \cong \hat{K}$.

Algebraically, controller equivalence is an equivalence relation. We will see that for different special problems we have different refined versions of this relation. We will also see that the characterizations of the equivalence classes of stabilizing controllers for the special problems are good enough to construct the parameterization of all stabilizing controllers for the OF problem. In the next two subsections, we mainly consider the stabilizing controller characterizations for the special problems. We then use the solutions to these special problems and the approach provided in the last section to characterize all stabilizing controllers for OF problems.

12.3.2 Stabilizing Controllers for FI and FC Problems

In this subsection, we first examine the FI structure, i.e., the feedback loop $(G_{FI}, K_{FI})$ with measurement $y_{FI}$, where the transfer matrix $G_{FI}$ is given in Section 12.2. The purpose of this subsection is to characterize the equivalence classes of controllers $K_{FI}$ that stabilize


$G_{FI}$ internally and to build up enough tools to be used later. For this problem, we say two controllers $K_{FI}$ and $\hat{K}_{FI}$ are equivalent if they produce the same closed-loop transfer function from $w$ to $u$. Obviously, this also guarantees that $\mathcal{F}_\ell(G_{FI}, K_{FI}) = \mathcal{F}_\ell(G_{FI}, \hat{K}_{FI})$. The characterization of equivalence classes of controllers can be suitably called control parameterization, in contrast with controller parameterization. Note that the same situation occurs in FC problems by duality.

Since we have full information for feedback, our controller will have the following general form:
$$K_{FI} = \begin{bmatrix} K_1(s) & K_2(s) \end{bmatrix}$$
with $K_1(s)$ internally stabilizing $(sI - A)^{-1}B_2$ and $K_2(s) \in \mathcal{RH}_\infty$ arbitrary. Note that the stability of $K_2$ is required to guarantee the internal stability of the whole system since $w$ is fed directly through $K_2$.

Lemma 12.9 Let $F$ be a constant matrix such that $A + B_2F$ is stable. Then all stabilizing controllers for FI, in the sense of generating all admissible closed-loop transfer functions, can be parameterized as
$$K_{FI} \cong \begin{bmatrix} F & Q \end{bmatrix}$$
with any $Q \in \mathcal{RH}_\infty$.

Note that for the parameter matrix $Q \in \mathcal{RH}_\infty$, it is reasonable to assume that the realization of $Q(s)$ is stabilizable and detectable.

Proof. It is easy to see that the controller given in the above formula stabilizes the system $G_{FI}$. Hence we only need to show that the given set of controllers parameterizes all stabilizing control actions $u$, i.e., that there is a choice of $Q \in \mathcal{RH}_\infty$ such that the transfer functions from $w$ to $u$ for any stabilizing controller $K_{FI} = \begin{bmatrix} K_1(s) & K_2(s) \end{bmatrix}$ and for $K_{FI}^0 = \begin{bmatrix} F & Q \end{bmatrix}$ are the same. To show this, make the change of control variable $v = u - Fx$, where $x$ denotes the state of the system $G_{FI}$; then the system with the controller $K_{FI}$ becomes the feedback loop $(\tilde{G}_{FI}, \tilde{K}_{FI})$ with

$$\tilde{G}_{FI} = \left[\begin{array}{c|cc} A + B_2F & B_1 & B_2 \\ \hline C_1 + D_{12}F & D_{11} & D_{12} \\ \begin{bmatrix} I \\ 0 \end{bmatrix} & \begin{bmatrix} 0 \\ I \end{bmatrix} & \begin{bmatrix} 0 \\ 0 \end{bmatrix} \end{array}\right], \qquad \tilde{K}_{FI} = K_{FI} - \begin{bmatrix} F & 0 \end{bmatrix}.$$


Let $Q$ be the transfer matrix from $w$ to $v$; $Q$ belongs to $\mathcal{RH}_\infty$ by internal stability. Then $u = Fx + v = Fx + Qw$, so $K_{FI} \cong \begin{bmatrix} F & Q \end{bmatrix}$. $\Box$

Remark 12.6 The equivalence $K_{FI} \cong K_{FI}^0$ can actually be shown by directly computing $Q$ from equating the transfer matrices from $w$ to $u$ for the cases of $K_{FI}$ and $K_{FI}^0$. In fact, the transfer matrices from $w$ to $u$ with $K_{FI}$ and $K_{FI}^0$ are given by
$$\left(I - K_1(sI - A)^{-1}B_2\right)^{-1}K_1(sI - A)^{-1}B_1 + \left(I - K_1(sI - A)^{-1}B_2\right)^{-1}K_2 \tag{12.3}$$
and
$$\left(I - F(sI - A)^{-1}B_2\right)^{-1}F(sI - A)^{-1}B_1 + \left(I - F(sI - A)^{-1}B_2\right)^{-1}Q, \tag{12.4}$$
respectively. We can verify that
$$Q = K_2 + (K_1 - F)(sI - A - B_2K_1)^{-1}(B_2K_2 + B_1)$$
is stable and makes the formulas in (12.3) and (12.4) equal. ~
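The verification claimed in Remark 12.6 can be carried out numerically. Below is a sketch with hypothetical example data (the matrices are not from the book): the two $w \to u$ transfer matrices (12.3) and (12.4) are evaluated at sample frequencies with $Q$ given by the formula above.

```python
import numpy as np

# Hypothetical example data (not from the book).
A  = np.array([[0.0, 1.0], [0.0, 0.0]])
B1 = np.array([[1.0], [0.0]])
B2 = np.array([[0.0], [1.0]])
F  = np.array([[-2.0, -3.0]])   # A + B2 F stable
K1 = np.array([[-1.0, -2.0]])   # A + B2 K1 stable
K2 = np.array([[0.5]])          # any constant K2 in RH_inf

I2, I1 = np.eye(2), np.eye(1)

def wu_tf(G, H, s):
    """(I - G Phi B2)^{-1} (G Phi B1 + H) with Phi = (sI - A)^{-1}:
    the w -> u transfer for the FI controller [G  H]."""
    Phi = np.linalg.inv(s * I2 - A)
    return np.linalg.solve(I1 - G @ Phi @ B2, G @ Phi @ B1 + H)

def q_of(s):
    """Q = K2 + (K1 - F)(sI - A - B2 K1)^{-1}(B2 K2 + B1)."""
    return K2 + (K1 - F) @ np.linalg.solve(s * I2 - A - B2 @ K1,
                                           B2 @ K2 + B1)

for s in (1j, 0.7 + 2.0j):
    assert np.allclose(wu_tf(K1, K2, s), wu_tf(F, q_of(s), s))

# Q itself is stable: its poles are the eigenvalues of A + B2 K1.
assert np.all(np.linalg.eigvals(A + B2 @ K1).real < 0)
```

Agreement at the sample points, together with stability of $Q$, is exactly the content of the remark for this example.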

Now we consider the dual FC problem; the relevant system diagram is the feedback loop $(G_{FC}, K_{FC})$ with measurement $y_{FC}$. Dually, we say controllers $K_{FC}$ and $\hat{K}_{FC}$ are equivalent if the same injection inputs $y_{FC}$ produce the same outputs $z$. This also guarantees that their resulting closed-loop transfer matrices from $w$ to $z$ are identical. We also have:

Lemma 12.10 Let $L$ be a constant matrix such that $A + LC_2$ is stable. Then the set of equivalence classes of all stabilizing controllers for FC in the above sense can be parameterized as
$$K_{FC} \cong \begin{bmatrix} L \\ Q \end{bmatrix}$$
with any $Q \in \mathcal{RH}_\infty$.


12.3.3 Stabilizing Controllers for DF and OE Problems

In the DF case the system diagram is the feedback loop $(G_{DF}, K_{DF})$ with measurement $y_{DF}$. The transfer matrix is given as in Section 12.2.1. We will further assume in this subsection that $A - B_1C_2$ is stable. It should be pointed out that the existence of a stabilizing controller for this system is guaranteed by the stabilizability of $(A, B_2)$ and detectability of $(C_2, A)$. Hence this assumption is not necessary for our problem to be solvable; however, it does simplify the solution.

We will now parameterize the stabilizing controllers for $G_{DF}$ by invoking the relationship between the FI and DF problems established in Section 12.2. We say that the controllers $K_{DF}$ and $\hat{K}_{DF}$ are equivalent for the DF problem if the two transfer matrices from $w$ to $u$ are the same. Of course, the resulting two closed-loop transfer matrices from $w$ to $z$ are then identical.

Remark 12.7 By the equivalence between the FI and DF problems, it is easy to show that if $K_{DF} \cong \hat{K}_{DF}$ in the DF structure, then $K_{DF}\begin{bmatrix} C_2 & I \end{bmatrix} \cong \hat{K}_{DF}\begin{bmatrix} C_2 & I \end{bmatrix}$ in the corresponding FI structure. We also have that if $K_{FI} \cong \hat{K}_{FI}$, then $\mathcal{F}_\ell(P_{DF}, K_{FI}) \cong \mathcal{F}_\ell(P_{DF}, \hat{K}_{FI})$. ~

Now we construct the parameterization of the equivalence classes of stabilizing controllers in the DF structure via the tools we developed in Section 12.2.

Let $K_{DF}(s)$ be a stabilizing controller for DF; then $K_{FI}(s) = K_{DF}(s)\begin{bmatrix} C_2 & I \end{bmatrix}$ stabilizes the corresponding $G_{FI}$. Assume $K_{FI} \cong \hat{K}_{FI} = \begin{bmatrix} F & Q \end{bmatrix}$ for some $Q \in \mathcal{RH}_\infty$; then $\hat{K}_{FI}$ stabilizes $G_{FI}$ and $\mathcal{F}_\ell(J_{DF}, Q) = \mathcal{F}_\ell(P_{DF}, \hat{K}_{FI})$, where
$$J_{DF} = \left[\begin{array}{c|cc} A + B_2F - B_1C_2 & B_1 & B_2 \\ \hline F & 0 & I \\ -C_2 & I & 0 \end{array}\right]$$

with $F$ such that $A + B_2F$ is stable. Hence by Theorem 12.4, $K_{DF} := \mathcal{F}_\ell(J_{DF}, Q)$ stabilizes $G_{DF}$ for any $Q \in \mathcal{RH}_\infty$. Since $K_{FI} \cong \hat{K}_{FI}$, by Remark 12.7 we have $K_{DF} \cong \hat{K}_{DF} = \mathcal{F}_\ell(J_{DF}, Q)$. This equation characterizes an equivalence class of all controllers for the DF problem by the equivalence of FI and DF.

In fact, we have the following lemma, which shows that the above construction characterizes all stabilizing controllers for the DF problem.


Lemma 12.11 All stabilizing controllers for the DF problem can be characterized by $K_{DF} = \mathcal{F}_\ell(J_{DF}, Q)$ with $Q \in \mathcal{RH}_\infty$, where $J_{DF}$ is given as above.

Proof. We have already shown that the controller $K_{DF} = \mathcal{F}_\ell(J_{DF}, Q)$ for any given $Q \in \mathcal{RH}_\infty$ does internally stabilize $G_{DF}$. Now let $K_{DF}$ be any stabilizing controller for $G_{DF}$; then $\mathcal{F}_\ell(\bar{J}_{DF}, K_{DF}) \in \mathcal{RH}_\infty$, where
$$\bar{J}_{DF} = \left[\begin{array}{c|cc} A & B_1 & B_2 \\ \hline -F & 0 & I \\ C_2 & I & 0 \end{array}\right]$$
($\bar{J}_{DF}$ is stabilized by $K_{DF}$ since it has the same "$G_{22}$" block as $G_{DF}$).

Let $Q_0 := \mathcal{F}_\ell(\bar{J}_{DF}, K_{DF}) \in \mathcal{RH}_\infty$; then $\mathcal{F}_\ell(J_{DF}, Q_0) = \mathcal{F}_\ell(J_{DF}, \mathcal{F}_\ell(\bar{J}_{DF}, K_{DF})) =: \mathcal{F}_\ell(J_{tmp}, K_{DF})$, where $J_{tmp}$ can be obtained by using the state-space star-product formula given in Chapter 10:
$$J_{tmp} = \left[\begin{array}{cc|cc} A - B_1C_2 + B_2F & -B_2F & B_1 & B_2 \\ -B_1C_2 & A & B_1 & B_2 \\ \hline F & -F & 0 & I \\ -C_2 & C_2 & I & 0 \end{array}\right] = \left[\begin{array}{cc|cc} A - B_1C_2 & -B_2F & B_1 & B_2 \\ 0 & A + B_2F & 0 & 0 \\ \hline 0 & -F & 0 & I \\ 0 & C_2 & I & 0 \end{array}\right] = \begin{bmatrix} 0 & I \\ I & 0 \end{bmatrix}.$$
Hence $\mathcal{F}_\ell(J_{DF}, Q_0) = \mathcal{F}_\ell(J_{tmp}, K_{DF}) = K_{DF}$. This shows that any stabilizing controller can be expressed in the form $\mathcal{F}_\ell(J_{DF}, Q_0)$ for some $Q_0 \in \mathcal{RH}_\infty$. $\Box$
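The inversion property used in the proof of Lemma 12.11 can be illustrated pointwise. Below is a sketch with hypothetical example data (not from the book): for a generic complex frequency $s$ and an arbitrary controller $K(s)$, the composition $\mathcal{F}_\ell(J_{DF}, \mathcal{F}_\ell(\bar{J}_{DF}, K))$ returns $K$.

```python
import numpy as np

# Hypothetical example data (not from the book).
A  = np.array([[0.0, 1.0], [0.0, 0.0]])
B1 = np.array([[1.0], [0.0]])
B2 = np.array([[0.0], [1.0]])
C2 = np.array([[1.0, 0.0]])
F  = np.array([[-2.0, -3.0]])   # A + B2 F stable

def tf(A_, B_, C_, D_, s):
    return C_ @ np.linalg.solve(s * np.eye(A_.shape[0]) - A_, B_) + D_

def lft(P, K):
    """Lower LFT of a 2x2 scalar-block matrix P with a scalar K."""
    return P[0, 0] + P[0, 1] * K / (1.0 - P[1, 1] * K) * P[1, 0]

B = np.hstack([B1, B2])
Dsw = np.array([[0.0, 1.0], [1.0, 0.0]])
# J_DF   = [A+B2F-B1C2 | B1 B2;  F | 0 I; -C2 | I 0]
J_DF  = lambda s: tf(A + B2 @ F - B1 @ C2, B, np.vstack([F, -C2]), Dsw, s)
# Jbar_DF = [A | B1 B2; -F | 0 I;  C2 | I 0]
Jb_DF = lambda s: tf(A, B, np.vstack([-F, C2]), Dsw, s)

K = lambda s: (s + 2.0) / (s**2 + 3.0 * s + 5.0)  # an arbitrary controller

for s in (1j, 0.4 + 1.5j):
    Q0 = lft(Jb_DF(s), K(s))          # Q0 = F_l(Jbar_DF, K)
    assert np.isclose(lft(J_DF(s), Q0), K(s))
```

This is a frequency-by-frequency restatement of $J_{tmp} = \begin{bmatrix} 0 & I \\ I & 0 \end{bmatrix}$.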

Now we turn to the dual OE case, with the feedback loop $(G_{OE}, K_{OE})$ and measurement $y_{OE}$.

We will assume that $A - B_2C_1$ is stable. Again, this assumption is made only for the simplicity of the solution; it is not necessary for the stabilization problem to be solvable. The parameterization of the equivalence classes of stabilizing controllers for OE can be obtained by invoking Theorem 12.6 and the equivalence classes of controllers for FC. Here we say controllers $K_{OE}$ and $\hat{K}_{OE}$ are equivalent if the transfer matrices from $y_{OE}$ to $z$ are the same. This also guarantees that the resulting closed-loop transfer matrices are identical.

Now we construct the parameterization for the OE structure as the dual of the DF case. Assume that $K_{OE}$ is an admissible controller for OE; then $K_{FC} = \begin{bmatrix} B_2 \\ I \end{bmatrix} K_{OE} \cong \begin{bmatrix} L \\ Q \end{bmatrix} =: \hat{K}_{FC}$ for some $Q \in \mathcal{RH}_\infty$; $\hat{K}_{FC}$ stabilizes $G_{FC}$ and $\mathcal{F}_\ell(J_{OE}, Q) = \mathcal{F}_\ell(P_{OE}, \hat{K}_{FC})$, where
$$J_{OE} = \left[\begin{array}{c|cc} A - B_2C_1 + LC_2 & L & -B_2 \\ \hline C_1 & 0 & I \\ C_2 & I & 0 \end{array}\right]$$

with $L$ such that $A + LC_2$ is stable. Hence by Theorem 12.6, $K_{OE} := \mathcal{F}_\ell(J_{OE}, Q)$ stabilizes $G_{OE}$ for any $Q \in \mathcal{RH}_\infty$. Since $K_{FC} \cong \hat{K}_{FC}$, we have $K_{OE} \cong \mathcal{F}_\ell(J_{OE}, Q)$. In fact, we have the following lemma.

Lemma 12.12 All admissible controllers for the OE problem can be characterized as $\mathcal{F}_\ell(J_{OE}, Q_0)$ with any $Q_0 \in \mathcal{RH}_\infty$, where $J_{OE}$ is defined as above.

Proof. The controllers of the stated form are admissible since the corresponding FC controllers internally stabilize the resulting $G_{FC}$.

Now assume $K_{OE}$ is any stabilizing controller for $G_{OE}$; then $\mathcal{F}_\ell(\bar{J}_{OE}, K_{OE}) \in \mathcal{RH}_\infty$, where
$$\bar{J}_{OE} = \left[\begin{array}{c|cc} A & -L & B_2 \\ \hline C_1 & 0 & I \\ C_2 & I & 0 \end{array}\right].$$
Let $Q_0 := \mathcal{F}_\ell(\bar{J}_{OE}, K_{OE}) \in \mathcal{RH}_\infty$. Then $\mathcal{F}_\ell(J_{OE}, Q_0) = \mathcal{F}_\ell(J_{OE}, \mathcal{F}_\ell(\bar{J}_{OE}, K_{OE})) = K_{OE}$, using again the state-space star-product formula given in Chapter 10. This shows that any stabilizing controller can be expressed in the form $\mathcal{F}_\ell(J_{OE}, Q_0)$ for some $Q_0 \in \mathcal{RH}_\infty$. $\Box$

12.3.4 Output Feedback and Separation

We are now ready to give a complete proof of Theorem 12.8. We will assume the results of the special problems and show how to construct all admissible controllers for the OF problem from them. We can also observe the separation argument as a byproduct; this essentially involves reducing the OF problem to the combination of the simpler FI and FC problems. Moreover, we can see from the construction why the stability conditions on $A - B_1C_2$ and $A - B_2C_1$ in the DF and OE problems were reasonable assumptions and are automatically guaranteed in this case. Again we assume that the system has the following standard system block diagram:

[Diagram: standard feedback loop with plant $G$ and controller $K$, exogenous input $w$, output $z$, measurement $y$, control $u$]

with
$$G(s) = \left[\begin{array}{c|cc} A & B_1 & B_2 \\ \hline C_1 & D_{11} & D_{12} \\ C_2 & D_{21} & D_{22} \end{array}\right]$$
and that $(A, B_2)$ is stabilizable and $(C_2, A)$ is detectable.

Proof of Theorem 12.8. Without loss of generality, we shall assume $D_{22} = 0$. In the more general case $D_{22} \neq 0$, the mapping
$$\bar{K}(s) = K(s)(I - D_{22}K(s))^{-1}$$
is well defined if the closed-loop system is assumed to be well-posed. Then the system in terms of $\bar{K}$ has the structure

[Diagram: feedback loop with plant $\bar{G}$ and controller $\bar{K}$]

where
$$\bar{G}(s) = \left[\begin{array}{c|cc} A & B_1 & B_2 \\ \hline C_1 & D_{11} & D_{12} \\ C_2 & D_{21} & 0 \end{array}\right].$$

Now we construct the controllers for the OF problem with $D_{22} = 0$. Denote by $x$ the state of the system $G$; then the open-loop system can be written as
$$\begin{bmatrix} \dot{x} \\ z \\ y \end{bmatrix} = \begin{bmatrix} A & B_1 & B_2 \\ C_1 & D_{11} & D_{12} \\ C_2 & D_{21} & 0 \end{bmatrix} \begin{bmatrix} x \\ w \\ u \end{bmatrix}.$$


Since $(A, B_2)$ is stabilizable, there is a constant matrix $F$ such that $A + B_2F$ is stable. Note that $\begin{bmatrix} F & 0 \end{bmatrix}$ is actually a special FI stabilizing controller. Now let
$$v = u - Fx.$$

Then the system can be broken into two subsystems:
$$\begin{bmatrix} \dot{x} \\ z \end{bmatrix} = \begin{bmatrix} A + B_2F & B_1 & B_2 \\ C_1 + D_{12}F & D_{11} & D_{12} \end{bmatrix} \begin{bmatrix} x \\ w \\ v \end{bmatrix}$$
and
$$\begin{bmatrix} \dot{x} \\ v \\ y \end{bmatrix} = \begin{bmatrix} A & B_1 & B_2 \\ -F & 0 & I \\ C_2 & D_{21} & 0 \end{bmatrix} \begin{bmatrix} x \\ w \\ u \end{bmatrix}.$$

This can be shown pictorially as a cascade: $K$ closes the loop from $y$ to $u$ around $G_{tmp}$, which generates the signal $v$; the stable system $G_1$, driven by $(w, v)$, produces $z$, where
$$G_1 = \left[\begin{array}{c|cc} A + B_2F & B_1 & B_2 \\ \hline C_1 + D_{12}F & D_{11} & D_{12} \end{array}\right] \in \mathcal{RH}_\infty$$
and
$$G_{tmp} = \left[\begin{array}{c|cc} A & B_1 & B_2 \\ \hline -F & 0 & I \\ C_2 & D_{21} & 0 \end{array}\right].$$

Obviously, $K$ stabilizes $G$ if and only if $K$ stabilizes $G_{tmp}$; however, $G_{tmp}$ has the OE structure. Now let $L$ be such that $A + LC_2$ is stable. Then by Lemma 12.12 all controllers stabilizing $G_{tmp}$ are given by

$$K = \mathcal{F}_\ell(J, Q)$$


where
$$J = \left[\begin{array}{c|cc} A + B_2F + LC_2 & L & -B_2 \\ \hline -F & 0 & I \\ C_2 & I & 0 \end{array}\right] = \left[\begin{array}{c|cc} A + B_2F + LC_2 & -L & B_2 \\ \hline F & 0 & I \\ -C_2 & I & 0 \end{array}\right].$$
This concludes the proof. $\Box$

Remark 12.8 We can also get the same result by applying the dual procedure to the above construction, i.e., by first using output injection to reduce the OF problem to a DF problem. The separation argument is obvious, since the synthesis of the OF problem reduces to FI and FC problems, i.e., the latter two problems can be designed independently. ~

Remark 12.9 Theorem 12.8 shows that any stabilizing controller $K(s)$ can be characterized as an LFT of a parameter matrix $Q \in \mathcal{RH}_\infty$, i.e., $K(s) = \mathcal{F}_\ell(J, Q)$. Moreover, using the same argument as in the proof of Lemma 12.11, a realization of $Q(s)$ in terms of $K$ can be obtained as
$$Q := \mathcal{F}_\ell(\bar{J}, K)$$
where
$$\bar{J} = \left[\begin{array}{c|cc} A & -L & B_2 \\ \hline -F & 0 & I \\ C_2 & I & D_{22} \end{array}\right]$$
and where $K(s)$ has a stabilizable and detectable realization. ~

Now we can reconsider the characterizations of all stabilizing controllers for the special problems under some reasonable assumptions; i.e., the stability conditions on $A - B_1C_2$ and $A - B_2C_1$ for the DF and OE problems which were assumed in the last section can be dropped.

If we specify
$$C_2 = \begin{bmatrix} I \\ 0 \end{bmatrix}, \qquad D_{21} = \begin{bmatrix} 0 \\ I \end{bmatrix}, \qquad D_{22} = \begin{bmatrix} 0 \\ 0 \end{bmatrix},$$

then the OF problem, in its general setting, becomes the FI problem. We know that the solvability conditions for the FI problem reduce, because of its special structure, to stabilizability of $(A, B_2)$. By assuming this, we can get the following result from the OF problem.

Corollary 12.13 Let $L_1$ and $F$ be such that $A + L_1$ and $A + B_2F$ are stable; then all controllers that stabilize $G_{FI}$ can be characterized as $\mathcal{F}_\ell(J_{FI}, Q)$ with any $Q \in \mathcal{RH}_\infty$, where
$$J_{FI} = \left[\begin{array}{c|cc} A + B_2F + L_1 & L & -B_2 \\ \hline -F & 0 & I \\ \begin{bmatrix} I \\ 0 \end{bmatrix} & I & 0 \end{array}\right]$$
and where $L = \begin{bmatrix} L_1 & L_2 \end{bmatrix}$ is the injection matrix for any $L_2$ of compatible dimensions.

In the same way, we can consider the FC problem as a special OF problem by specifying
$$B_2 = \begin{bmatrix} I & 0 \end{bmatrix}, \qquad D_{12} = \begin{bmatrix} 0 & I \end{bmatrix}, \qquad D_{22} = \begin{bmatrix} 0 & 0 \end{bmatrix}.$$

The DF (OE) problem can also be considered as a special case of OF by simply setting $D_{21} = I$ ($D_{12} = I$) and $D_{22} = 0$.

Corollary 12.14 Consider the DF problem, and assume that $(C_2, A, B_2)$ is stabilizable and detectable. Let $F$ and $L$ be such that $A + LC_2$ and $A + B_2F$ are stable; then all controllers that internally stabilize $G$ can be parameterized as $\mathcal{F}_\ell(J_{DF}, Q)$ for some $Q \in \mathcal{RH}_\infty$, i.e., as the transfer matrix from $y$ to $u$ of the interconnection of $J_{DF}$ with $Q$, where
$$J_{DF} = \left[\begin{array}{c|cc} A + B_2F + LC_2 & -L & B_2 \\ \hline F & 0 & I \\ -C_2 & I & 0 \end{array}\right].$$

Remark 12.10 It would be interesting to compare this result with Lemma 12.11. It can be seen that Lemma 12.11 is a special case of this corollary: the condition that $A - B_1C_2$ is stable, which is required in Lemma 12.11, provides the natural injection matrix $L = -B_1$, which satisfies a partial condition in this corollary. ~

12.4 Structure of Controller Parameterization

Let us recap what we have done. We begin with a stabilizable and detectable realization of $G_{22}$:
$$G_{22} = \left[\begin{array}{c|c} A & B_2 \\ \hline C_2 & D_{22} \end{array}\right].$$

We choose $F$ and $L$ so that $A + B_2F$ and $A + LC_2$ are stable, and define $J$ by the formula in Theorem 12.8. Then the proper $K$'s achieving internal stability are precisely those representable as in Figure 12.3, i.e., $K = \mathcal{F}_\ell(J, Q)$ where $Q \in \mathcal{RH}_\infty$ and $I + D_{22}Q(\infty)$ is invertible.


[Figure 12.3: Structure of Stabilizing Controllers. An observer-based compensator built from the plant data $A$, $B_2$, $C_2$, $D_{22}$, an integrator, and the gains $F$ and $-L$, with the free parameter $Q$ connected between the auxiliary output $u_1$ and the auxiliary input $y_1$.]

It is interesting to note that the system in the dashed box is an observer-based stabilizing controller for $G$ (or $G_{22}$). Furthermore, it is easy to show that the transfer function between $(y, y_1)$ and $(u, u_1)$ is $J$, i.e.,
$$\begin{bmatrix} u \\ u_1 \end{bmatrix} = J \begin{bmatrix} y \\ y_1 \end{bmatrix}.$$

It is also easy to show that the transfer matrix from $y_1$ to $u_1$ is zero. This diagram of the parameterization of all stabilizing controllers also suggests an interesting interpretation: every internal stabilization amounts to adding stable dynamics to the plant and then stabilizing the extended plant by means of an observer. The precise statement is as follows; for simplicity of the formulas, only the case of strictly proper $G_{22}$ and $K$ is treated.

Theorem 12.15 Assume that $G_{22}$ and $K$ are strictly proper and that the system in Figure 12.1 is internally stable. Then $G_{22}$ can be embedded in a system
$$\left[\begin{array}{c|c} A_e & B_e \\ \hline C_e & 0 \end{array}\right]$$
where
$$A_e = \begin{bmatrix} A & 0 \\ 0 & A_a \end{bmatrix}, \qquad B_e = \begin{bmatrix} B_2 \\ 0 \end{bmatrix}, \qquad C_e = \begin{bmatrix} C_2 & 0 \end{bmatrix} \tag{12.5}$$


and where $A_a$ is stable, such that $K$ has the form
$$K = \left[\begin{array}{c|c} A_e + B_eF_e + L_eC_e & -L_e \\ \hline F_e & 0 \end{array}\right] \tag{12.6}$$
where $A_e + B_eF_e$ and $A_e + L_eC_e$ are stable.

Proof. $K$ is representable as in Figure 12.3 for some $Q$ in $\mathcal{RH}_\infty$. For $K$ to be strictly proper, $Q$ must be strictly proper. Take a minimal realization of $Q$:
$$Q = \left[\begin{array}{c|c} A_a & B_a \\ \hline C_a & 0 \end{array}\right].$$

Since $Q \in \mathcal{RH}_\infty$, $A_a$ is stable. Let $x$ and $x_a$ denote the state vectors of $J$ and $Q$, respectively, and write the equations for the system in Figure 12.3:
$$\begin{aligned}
\dot{x} &= (A + B_2F + LC_2)x - Ly + B_2y_1 \\
u &= Fx + y_1 \\
u_1 &= -C_2x + y \\
\dot{x}_a &= A_ax_a + B_au_1 \\
y_1 &= C_ax_a.
\end{aligned}$$
These equations yield
$$\dot{x}_e = (A_e + B_eF_e + L_eC_e)x_e - L_ey, \qquad u = F_ex_e$$
where
$$x_e := \begin{bmatrix} x \\ x_a \end{bmatrix}, \qquad F_e := \begin{bmatrix} F & C_a \end{bmatrix}, \qquad L_e := \begin{bmatrix} L \\ -B_a \end{bmatrix}$$
and where $A_e, B_e, C_e$ are as in (12.5). $\Box$
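The separation structure of the theorem is easy to check numerically: $A_e + B_eF_e$ and $A_e + L_eC_e$ are block triangular, so their spectra are the union of the original design eigenvalues and those of the stable $Q$-dynamics $A_a$. Below is a sketch with hypothetical example data (not from the book).

```python
import numpy as np

# Hypothetical example data (not from the book).
A  = np.array([[0.0, 1.0], [0.0, 0.0]])
B2 = np.array([[0.0], [1.0]])
C2 = np.array([[1.0, 0.0]])
F  = np.array([[-2.0, -3.0]])    # eig(A+B2F) = {-1, -2}
L  = np.array([[-7.0], [-12.0]]) # eig(A+LC2) = {-3, -4}
Aa = np.array([[-5.0]])          # stable Q dynamics
Ba = np.array([[1.0]])
Ca = np.array([[1.0]])

# Extended plant (12.5) and extended gains from the proof.
Ae = np.block([[A, np.zeros((2, 1))], [np.zeros((1, 2)), Aa]])
Be = np.vstack([B2, np.zeros((1, 1))])
Ce = np.hstack([C2, np.zeros((1, 1))])
Fe = np.hstack([F, Ca])          # Fe = [F  Ca]
Le = np.vstack([L, -Ba])         # Le = [L; -Ba]

# Ae+BeFe = [[A+B2F, B2Ca],[0, Aa]]  (upper block triangular)
ef = np.sort(np.linalg.eigvals(Ae + Be @ Fe).real)
# Ae+LeCe = [[A+LC2, 0],[-BaC2, Aa]] (lower block triangular)
el = np.sort(np.linalg.eigvals(Ae + Le @ Ce).real)
assert np.allclose(ef, [-5.0, -2.0, -1.0])
assert np.allclose(el, [-5.0, -4.0, -3.0])
```

The design choice here mirrors the proof: the added dynamics $A_a$ appear in both the state-feedback and the injection spectra but never destabilize either, since $A_a$ is stable.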

12.5 Closed-Loop Transfer Matrix

Recall that the closed-loop transfer matrix from $w$ to $z$ is a linear fractional transformation $\mathcal{F}_\ell(G, K)$ and that $K$ stabilizes $G$ if and only if $K$ stabilizes $G_{22}$. Elimination of the signals $u$ and $y$ in Figure 12.3 leads to Figure 12.4 for a suitable transfer matrix $T$. Thus all closed-loop transfer matrices are representable as in Figure 12.4:
$$z = \mathcal{F}_\ell(G, K)w = \mathcal{F}_\ell(G, \mathcal{F}_\ell(J, Q))w = \mathcal{F}_\ell(T, Q)w. \tag{12.7}$$
It remains to give a realization of $T$.


[Figure 12.4: Closed-loop system: $Q$ in feedback around $T$, mapping $w$ to $z$]

Theorem 12.16 Let $F$ and $L$ be such that $A + B_2F$ and $A + LC_2$ are stable. Then the set of all closed-loop transfer matrices from $w$ to $z$ achievable by an internally stabilizing proper controller is equal to
$$\mathcal{F}_\ell(T, Q) = \{T_{11} + T_{12}QT_{21} : Q \in \mathcal{RH}_\infty,\ I + D_{22}Q(\infty) \text{ invertible}\}$$
where $T$ is given by
$$T = \begin{bmatrix} T_{11} & T_{12} \\ T_{21} & T_{22} \end{bmatrix} = \left[\begin{array}{cc|cc} A + B_2F & -B_2F & B_1 & B_2 \\ 0 & A + LC_2 & B_1 + LD_{21} & 0 \\ \hline C_1 + D_{12}F & -D_{12}F & D_{11} & D_{12} \\ 0 & C_2 & D_{21} & 0 \end{array}\right].$$

Proof. This is straightforward using the state-space star-product formula; the tedious algebra is left for the reader to verify. $\Box$

An important point to note is that the closed-loop transfer matrix is simply an affine function of the controller parameter matrix $Q$, since $T_{22} = 0$.
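The structural fact $T_{22} = 0$, and hence the affineness of the closed loop in $Q$, can be checked pointwise from the realization of $T$. Below is a sketch with hypothetical example data (not from the book).

```python
import numpy as np

# Hypothetical example data (not from the book); scalar w, u, z, y.
A   = np.array([[0.0, 1.0], [0.0, 0.0]])
B1  = np.array([[1.0], [0.0]])
B2  = np.array([[0.0], [1.0]])
C1  = np.array([[1.0, 0.0]])
C2  = np.array([[1.0, 0.0]])
D11, D12, D21 = np.zeros((1, 1)), np.eye(1), np.eye(1)
F   = np.array([[-2.0, -3.0]])     # A + B2 F stable
L   = np.array([[-7.0], [-12.0]])  # A + L C2 stable

# Realization of T from Theorem 12.16.
At = np.block([[A + B2 @ F, -B2 @ F],
               [np.zeros((2, 2)), A + L @ C2]])
Bt = np.block([[B1, B2], [B1 + L @ D21, np.zeros((2, 1))]])
Ct = np.block([[C1 + D12 @ F, -D12 @ F], [np.zeros((1, 2)), C2]])
Dt = np.block([[D11, D12], [D21, np.zeros((1, 1))]])

def T(s):
    return Ct @ np.linalg.solve(s * np.eye(4) - At, Bt) + Dt

for s in (1j, 0.6 + 2.0j):
    Ts = T(s)
    assert np.isclose(Ts[1, 1], 0.0)  # T22 = 0 at every frequency
    q = 0.3 - 0.8j                    # an arbitrary value of Q(s)
    lft = Ts[0, 0] + Ts[0, 1] * q / (1 - Ts[1, 1] * q) * Ts[1, 0]
    # With T22 = 0, the LFT collapses to the affine map T11 + T12 q T21.
    assert np.isclose(lft, Ts[0, 0] + Ts[0, 1] * q * Ts[1, 0])
```

The vanishing of $T_{22}$ follows from the block-triangular $A$-matrix of $T$, exactly as in the realization above.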

12.6 Youla Parameterization via Coprime Factorization*

In this section, the parameterization of all stabilizing controllers will be derived using the conventional coprime factorization approach. Readers should be familiar with the results presented in Section 5.4 of Chapter 5 before proceeding further.

Theorem 12.17 Let $G_{22} = NM^{-1} = \tilde{M}^{-1}\tilde{N}$ be an rcf and lcf of $G_{22}$ over $\mathcal{RH}_\infty$, respectively. Then the set of all proper controllers achieving internal stability is parameterized either by
$$K = (U_0 + MQ_r)(V_0 + NQ_r)^{-1}, \qquad \det(I + V_0^{-1}NQ_r)(\infty) \neq 0 \tag{12.8}$$
for $Q_r \in \mathcal{RH}_\infty$, or by
$$K = (\tilde{V}_0 + Q_l\tilde{N})^{-1}(\tilde{U}_0 + Q_l\tilde{M}), \qquad \det(I + Q_l\tilde{N}\tilde{V}_0^{-1})(\infty) \neq 0 \tag{12.9}$$
for $Q_l \in \mathcal{RH}_\infty$, where $U_0, V_0, \tilde{U}_0, \tilde{V}_0 \in \mathcal{RH}_\infty$ satisfy the Bezout identities
$$\tilde{V}_0M - \tilde{U}_0N = I, \qquad \tilde{M}V_0 - \tilde{N}U_0 = I.$$

Moreover, if $U_0$, $V_0$, $\tilde{U}_0$, and $\tilde{V}_0$ are chosen such that $U_0V_0^{-1} = \tilde{V}_0^{-1}\tilde{U}_0$, i.e.,
$$\begin{bmatrix} \tilde{V}_0 & -\tilde{U}_0 \\ -\tilde{N} & \tilde{M} \end{bmatrix} \begin{bmatrix} M & U_0 \\ N & V_0 \end{bmatrix} = \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix},$$
then
$$K = (U_0 + MQ_y)(V_0 + NQ_y)^{-1} = (\tilde{V}_0 + Q_y\tilde{N})^{-1}(\tilde{U}_0 + Q_y\tilde{M}) = \mathcal{F}_\ell(J_y, Q_y) \tag{12.10}$$
where
$$J_y := \begin{bmatrix} U_0V_0^{-1} & \tilde{V}_0^{-1} \\ V_0^{-1} & -V_0^{-1}N \end{bmatrix} \tag{12.11}$$
and where $Q_y$ ranges over $\mathcal{RH}_\infty$ such that $(I + V_0^{-1}NQ_y)(\infty)$ is invertible.

Proof. We shall prove the parameterization given in (12.8) first. Assume that $K$ has the form indicated, and define
$$U := U_0 + MQ_r, \qquad V := V_0 + NQ_r.$$
Then
$$\tilde{M}V - \tilde{N}U = \tilde{M}(V_0 + NQ_r) - \tilde{N}(U_0 + MQ_r) = \tilde{M}V_0 - \tilde{N}U_0 + (\tilde{M}N - \tilde{N}M)Q_r = I.$$
Thus $K$ achieves internal stability by Lemma 5.10.

Conversely, suppose $K$ is proper and achieves internal stability. Introduce an rcf of $K$ over $\mathcal{RH}_\infty$ as $K = UV^{-1}$. Then by Lemma 5.10, $Z := \tilde{M}V - \tilde{N}U$ is invertible in $\mathcal{RH}_\infty$. Define $Q_r$ by the equation
$$U_0 + MQ_r = UZ^{-1}, \tag{12.12}$$
so
$$Q_r = M^{-1}(UZ^{-1} - U_0).$$
Then, using the Bezout identity, we have
$$\begin{aligned}
V_0 + NQ_r &= V_0 + NM^{-1}(UZ^{-1} - U_0) \\
&= V_0 + \tilde{M}^{-1}\tilde{N}(UZ^{-1} - U_0) \\
&= \tilde{M}^{-1}(\tilde{M}V_0 - \tilde{N}U_0 + \tilde{N}UZ^{-1}) \\
&= \tilde{M}^{-1}(I + \tilde{N}UZ^{-1}) \\
&= \tilde{M}^{-1}(Z + \tilde{N}U)Z^{-1} \\
&= \tilde{M}^{-1}\tilde{M}VZ^{-1} \\
&= VZ^{-1}. \tag{12.13}
\end{aligned}$$


Thus,
$$K = UV^{-1} = (U_0 + MQ_r)(V_0 + NQ_r)^{-1}.$$
To see that $Q_r$ belongs to $\mathcal{RH}_\infty$, observe first from (12.12) and then from (12.13) that both $MQ_r$ and $NQ_r$ belong to $\mathcal{RH}_\infty$. Then
$$Q_r = (\tilde{V}_0M - \tilde{U}_0N)Q_r = \tilde{V}_0(MQ_r) - \tilde{U}_0(NQ_r) \in \mathcal{RH}_\infty.$$
Finally, since $V$ and $Z$ evaluated at $s = \infty$ are both invertible, so is $V_0 + NQ_r$ by (12.13), and hence so is $I + V_0^{-1}NQ_r$.

Similarly, the parameterization given in (12.9) can be obtained.

To show that the controller can be written in the form of equation (12.10), note that
$$(U_0 + MQ_y)(V_0 + NQ_y)^{-1} = U_0V_0^{-1} + (M - U_0V_0^{-1}N)Q_y(I + V_0^{-1}NQ_y)^{-1}V_0^{-1}$$
and that $U_0V_0^{-1} = \tilde{V}_0^{-1}\tilde{U}_0$. We have
$$M - U_0V_0^{-1}N = M - \tilde{V}_0^{-1}\tilde{U}_0N = \tilde{V}_0^{-1}(\tilde{V}_0M - \tilde{U}_0N) = \tilde{V}_0^{-1}$$
and
$$K = U_0V_0^{-1} + \tilde{V}_0^{-1}Q_y(I + V_0^{-1}NQ_y)^{-1}V_0^{-1}. \tag{12.14}$$
$\Box$

Corollary 12.18 Given an admissible controller $K$ with coprime factorizations $K = UV^{-1} = \tilde{V}^{-1}\tilde{U}$, the free parameter $Q_y \in \mathcal{RH}_\infty$ in the Youla parameterization is given by
$$Q_y = M^{-1}(UZ^{-1} - U_0)$$
where
$$Z := \tilde{M}V - \tilde{N}U.$$
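A scalar sketch of Theorem 12.17 and Corollary 12.18, with a hypothetical unstable plant not taken from the book: $G_{22}(s) = 1/(s-1)$, coprime factors $N = 1/(s+1)$, $M = (s-1)/(s+1)$, and the Bezout solution $U_0 = -2$, $V_0 = 1$ (in the scalar case the left and right factors coincide).

```python
import numpy as np

# Hypothetical scalar example (not from the book): G22(s) = 1/(s-1).
N = lambda s: 1.0 / (s + 1.0)
M = lambda s: (s - 1.0) / (s + 1.0)
U0, V0 = -2.0, 1.0  # Bezout: V0*M - U0*N = 1

for s in (1j, 0.3 + 2.0j):
    assert np.isclose(V0 * M(s) - U0 * N(s), 1.0)

# Pick Q = 1 in RH_inf; then K = (U0 + M Q)(V0 + N Q)^{-1} = -(s+3)/(s+2).
Q = 1.0
K = lambda s: (U0 + M(s) * Q) / (V0 + N(s) * Q)
for s in (1j, 0.3 + 2.0j):
    assert np.isclose(K(s), -(s + 3.0) / (s + 2.0))

# Internal stability: 1 - G22 K = (s+1)^2 / ((s-1)(s+2)), so the
# closed-loop characteristic polynomial is s^2 + 2s + 1 (poles at -1, -1).
poles = np.roots([1.0, 2.0, 1.0])
assert np.all(poles.real < 0)

# Corollary 12.18 recovers the parameter: with U = U0 + M Q, V = V0 + N Q,
# Z = M V - N U = 1 and Qy = M^{-1}(U Z^{-1} - U0) = Q.
for s in (1j, 0.3 + 2.0j):
    U, V = U0 + M(s) * Q, V0 + N(s) * Q
    Z = M(s) * V - N(s) * U
    assert np.isclose((U / Z - U0) / M(s), Q)
```

Note how every controller in the family is automatically stabilizing: the Bezout identity forces the closed-loop "denominator" to be the unit $Z$, regardless of the stable $Q$ chosen.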

Next, we shall establish the precise relationship between the above parameterization of all stabilizing controllers and the parameterization obtained in the previous sections via the LFT framework.

Theorem 12.19 Let the doubly coprime factorizations of $G_{22}$ be chosen as
$$\begin{bmatrix} M & U_0 \\ N & V_0 \end{bmatrix} = \left[\begin{array}{c|cc} A + B_2F & B_2 & -L \\ \hline F & I & 0 \\ C_2 + D_{22}F & D_{22} & I \end{array}\right]$$
$$\begin{bmatrix} \tilde{V}_0 & -\tilde{U}_0 \\ -\tilde{N} & \tilde{M} \end{bmatrix} = \left[\begin{array}{c|cc} A + LC_2 & -(B_2 + LD_{22}) & L \\ \hline F & I & 0 \\ C_2 & -D_{22} & I \end{array}\right]$$
where $F$ and $L$ are chosen such that $A + B_2F$ and $A + LC_2$ are both stable. Then $J_y$ can be computed as
$$J_y = \left[\begin{array}{c|cc} A + B_2F + LC_2 + LD_{22}F & -L & B_2 + LD_{22} \\ \hline F & 0 & I \\ -(C_2 + D_{22}F) & I & -D_{22} \end{array}\right].$$

Proof. This follows from some tedious algebra. $\Box$

Remark 12.11 Note that $J_y$ is exactly the same as the $J$ in Theorem 12.8 and that $K_0 := U_0V_0^{-1}$ is an observer-based stabilizing controller with
\[
K_0 = \left[\begin{array}{c|c}
A + B_2F + LC_2 + LD_{22}F & -L \\ \hline
F & 0
\end{array}\right].
\]
$\diamond$

12.7 Notes and References

The special problems FI, DF, FC, and OE were first introduced in Doyle, Glover, Khargonekar, and Francis [1989] for solving the $H_\infty$ problem, and they have since been used in many other papers for different problems. The new derivation of all stabilizing controllers was reported in Lu, Zhou, and Doyle [1991]. The paper by Moore et al. [1990] contains some other related interesting results. The conventional Youla parameterization can be found in Youla et al. [1976], Desoer et al. [1980], Doyle [1984], Vidyasagar [1985], and Francis [1987]. The parameterization of all two-degree-of-freedom stabilizing controllers is given in Youla and Bongiorno [1985] and Vidyasagar [1985].

13 Algebraic Riccati Equations

We have studied the Lyapunov equation in Chapter 3 and have seen the roles it plays in some applications. A more general equation than the Lyapunov equation in control theory is the so-called Algebraic Riccati Equation, or ARE for short. Roughly speaking, Lyapunov equations are most useful in system analysis while AREs are most useful in control system synthesis; in particular, they play the central roles in $H_2$ and $H_\infty$ optimal control.

Let $A$, $Q$, and $R$ be real $n \times n$ matrices with $Q$ and $R$ symmetric. Then an algebraic Riccati equation is the following matrix equation:
\[ A^*X + XA + XRX + Q = 0. \tag{13.1} \]
Associated with this Riccati equation is a $2n \times 2n$ matrix:
\[ H := \begin{bmatrix} A & R \\ -Q & -A^* \end{bmatrix}. \tag{13.2} \]

A matrix of this form is called a Hamiltonian matrix. The matrix $H$ in (13.2) will be used to obtain the solutions to equation (13.1). It is useful to note that $\sigma(H)$ (the spectrum of $H$) is symmetric about the imaginary axis. To see that, introduce the $2n \times 2n$ matrix
\[ J := \begin{bmatrix} 0 & -I \\ I & 0 \end{bmatrix} \]
having the property $J^2 = -I$. Then
\[ J^{-1}HJ = -JHJ = -H^* \]
so $H$ and $-H^*$ are similar. Thus $\lambda$ is an eigenvalue if and only if $-\bar{\lambda}$ is.
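This symmetry is easy to confirm numerically. The following sketch (not from the book; $A$, $Q$, $R$ are arbitrary, with $Q$ and $R$ symmetrized as required) checks both the similarity $J^{-1}HJ = -H^*$ and the mirrored spectrum:

```python
import numpy as np

n = 3
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))
Q = rng.standard_normal((n, n)); Q = Q + Q.T   # symmetric, as required
R = rng.standard_normal((n, n)); R = R + R.T   # symmetric, as required

# Hamiltonian matrix H = [A R; -Q -A*] and the matrix J with J^2 = -I
H = np.block([[A, R], [-Q, -A.T]])
J = np.block([[np.zeros((n, n)), -np.eye(n)],
              [np.eye(n), np.zeros((n, n))]])

# Similarity J^{-1} H J = -H* (conjugate transpose = transpose since H is real)
sim_ok = np.allclose(np.linalg.solve(J, H @ J), -H.T)

# Every eigenvalue lam of H is matched by -conj(lam): symmetry about the jw-axis
eig = np.linalg.eigvals(H)
spec_ok = all(np.min(np.abs(-np.conj(l) - eig)) < 1e-8 for l in eig)
```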

This chapter is devoted to the study of this algebraic Riccati equation and related problems: the properties of its solutions, the methods to obtain the solutions, and some applications.

In Section 13.1, we will study all solutions to (13.1). The word "all" means that we consider any $X$ satisfying equation (13.1): not necessarily real, not necessarily hermitian, not necessarily nonnegative definite, and not necessarily stabilizing. The conditions for a solution to be hermitian, real, and so on are also given in this section. The most important part of this chapter is Section 13.2, which focuses on the stabilizing solutions. This section is designed to be essentially self-contained so that readers who are only interested in the stabilizing solution may go to Section 13.2 directly without any difficulty. Section 13.3 presents the extreme (i.e., maximal or minimal) solutions of a Riccati equation and their properties. The relationship between the stabilizing solution of a Riccati equation and the spectral factorization of some frequency domain function is established in Section 13.4. Positive real functions and inner functions are introduced in Sections 13.5 and 13.6. Some other special rational matrix factorizations, e.g., inner-outer factorizations and normalized coprime factorizations, are given in Sections 13.7-13.8.

13.1 All Solutions of A Riccati Equation

The following theorem gives a way of constructing solutions to (13.1) in terms of invariant subspaces of $H$.

Theorem 13.1 Let $\mathcal{V} \subset \mathbb{C}^{2n}$ be an $n$-dimensional invariant subspace of $H$, and let $X_1, X_2 \in \mathbb{C}^{n \times n}$ be two complex matrices such that
\[ \mathcal{V} = \mathrm{Im}\begin{bmatrix} X_1 \\ X_2 \end{bmatrix}. \]
If $X_1$ is invertible, then $X := X_2X_1^{-1}$ is a solution to the Riccati equation (13.1) and $\sigma(A + RX) = \sigma(H|_{\mathcal{V}})$. Furthermore, the solution $X$ is independent of the specific choice of bases of $\mathcal{V}$.

Proof. Since $\mathcal{V}$ is an $H$-invariant subspace, there is a matrix $\Lambda \in \mathbb{C}^{n \times n}$ such that
\[ \begin{bmatrix} A & R \\ -Q & -A^* \end{bmatrix}\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}\Lambda. \]
Post-multiply the above equation by $X_1^{-1}$ to get
\[ \begin{bmatrix} A & R \\ -Q & -A^* \end{bmatrix}\begin{bmatrix} I \\ X \end{bmatrix} = \begin{bmatrix} I \\ X \end{bmatrix}X_1\Lambda X_1^{-1}. \tag{13.3} \]
Now pre-multiply (13.3) by $\begin{bmatrix} -X & I \end{bmatrix}$ (which annihilates the right-hand side since $\begin{bmatrix} -X & I \end{bmatrix}\begin{bmatrix} I \\ X \end{bmatrix} = 0$) to get
\[ 0 = \begin{bmatrix} -X & I \end{bmatrix}\begin{bmatrix} A & R \\ -Q & -A^* \end{bmatrix}\begin{bmatrix} I \\ X \end{bmatrix} = -XA - A^*X - XRX - Q, \]
which shows that $X$ is indeed a solution of (13.1). Equation (13.3) also gives
\[ A + RX = X_1\Lambda X_1^{-1}; \]
therefore, $\sigma(A + RX) = \sigma(\Lambda)$. But, by definition, $\Lambda$ is a matrix representation of the map $H|_{\mathcal{V}}$, so $\sigma(A + RX) = \sigma(H|_{\mathcal{V}})$. Finally, note that any other basis spanning $\mathcal{V}$ can be represented as
\[ \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}P = \begin{bmatrix} X_1P \\ X_2P \end{bmatrix} \]
for some nonsingular matrix $P$. The conclusion follows from the fact that $(X_2P)(X_1P)^{-1} = X_2X_1^{-1}$. $\Box$

As we would expect, the converse of the theorem also holds.

Theorem 13.2 If $X \in \mathbb{C}^{n \times n}$ is a solution to the Riccati equation (13.1), then there exist matrices $X_1, X_2 \in \mathbb{C}^{n \times n}$, with $X_1$ invertible, such that $X = X_2X_1^{-1}$ and the columns of $\begin{bmatrix} X_1 \\ X_2 \end{bmatrix}$ form a basis of an $n$-dimensional invariant subspace of $H$.

Proof. Define $\Lambda := A + RX$. Multiplying this by $X$ gives
\[ X\Lambda = XA + XRX = -Q - A^*X, \]
with the second equality coming from the fact that $X$ is a solution to (13.1). Write these two relations as
\[ \begin{bmatrix} A & R \\ -Q & -A^* \end{bmatrix}\begin{bmatrix} I \\ X \end{bmatrix} = \begin{bmatrix} I \\ X \end{bmatrix}\Lambda. \]
Hence, the columns of $\begin{bmatrix} I \\ X \end{bmatrix}$ span an $n$-dimensional invariant subspace of $H$, and defining $X_1 := I$ and $X_2 := X$ completes the proof. $\Box$

Remark 13.1 It is now clear that to obtain solutions to the Riccati equation, it is necessary to be able to construct bases for the invariant subspaces of $H$. One way of constructing those invariant subspaces is to use the eigenvectors and generalized eigenvectors of $H$. Suppose $\lambda_i$ is an eigenvalue of $H$ with multiplicity $k$ (then $\lambda_{i+j} = \lambda_i$ for all $j = 1, \ldots, k-1$), and let $v_i$ be a corresponding eigenvector and $v_{i+1}, \ldots, v_{i+k-1}$ be the corresponding generalized eigenvectors associated with $v_i$ and $\lambda_i$. Then the $v_j$ are related by
\begin{align*}
(H - \lambda_iI)v_i &= 0 \\
(H - \lambda_iI)v_{i+1} &= v_i \\
&\ \vdots \\
(H - \lambda_iI)v_{i+k-1} &= v_{i+k-2},
\end{align*}
and $\mathrm{span}\{v_j,\ j = i, \ldots, i+k-1\}$ is an invariant subspace of $H$. $\diamond$

Example 13.1 Let
\[ A = \begin{bmatrix} -3 & 2 \\ -2 & 1 \end{bmatrix}, \quad R = \begin{bmatrix} 0 & 0 \\ 0 & -1 \end{bmatrix}, \quad Q = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}. \]
The eigenvalues of $H$ are $1, 1, -1, -1$, and the corresponding eigenvector and generalized eigenvector of $1$ are
\[ v_1 = \begin{bmatrix} 1 \\ 2 \\ 2 \\ -2 \end{bmatrix}, \quad v_2 = \begin{bmatrix} -1 \\ -3/2 \\ 1 \\ 0 \end{bmatrix}. \]
The corresponding eigenvector and generalized eigenvector of $-1$ are
\[ v_3 = \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \quad v_4 = \begin{bmatrix} 1 \\ 3/2 \\ 0 \\ 0 \end{bmatrix}. \]
All solutions of the Riccati equation under various combinations are given below:

• $\mathrm{span}\{v_1, v_2\}$ is $H$-invariant: let $\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = [v_1\ v_2]$; then $X = X_2X_1^{-1} = \begin{bmatrix} -10 & 6 \\ 6 & -4 \end{bmatrix}$ is a solution and $\sigma(A + RX) = \{1, 1\}$;

• $\mathrm{span}\{v_1, v_3\}$ is $H$-invariant: let $\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = [v_1\ v_3]$; then $X = X_2X_1^{-1} = \begin{bmatrix} -2 & 2 \\ 2 & -2 \end{bmatrix}$ is also a solution and $\sigma(A + RX) = \{1, -1\}$;

• $\mathrm{span}\{v_3, v_4\}$ is $H$-invariant: let $\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = [v_3\ v_4]$; then $X = 0$ is a solution and $\sigma(A + RX) = \{-1, -1\}$;

• $\mathrm{span}\{v_1, v_4\}$, $\mathrm{span}\{v_2, v_3\}$, and $\mathrm{span}\{v_2, v_4\}$ are not $H$-invariant subspaces. Readers can verify that the matrices constructed from those vectors are not solutions to the Riccati equation (13.1).

$\diamond$
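The computations in this example are easy to reproduce numerically; the following sketch (plain numpy, not from the book) rebuilds the listed $H$-invariant subspaces and checks the resulting solutions:

```python
import numpy as np

# Data of Example 13.1
A = np.array([[-3., 2.], [-2., 1.]])
R = np.array([[0., 0.], [0., -1.]])
Q = np.zeros((2, 2))
H = np.block([[A, R], [-Q, -A.T]])

# Eigenvector/generalized-eigenvector chains listed in the example
v1 = np.array([1., 2., 2., -2.])
v2 = np.array([-1., -1.5, 1., 0.])
v3 = np.array([1., 1., 0., 0.])
v4 = np.array([1., 1.5, 0., 0.])

def solution_from(*basis):
    # Stack basis vectors as columns; X1 is the top half, X2 the bottom half
    V = np.column_stack(basis)
    return V[2:, :] @ np.linalg.inv(V[:2, :])

def residual(X):
    return A.T @ X + X @ A + X @ R @ X + Q

# H v1 = v1 and (H - I) v2 = v1: v2 is a generalized eigenvector for lambda = 1
chain_ok = np.allclose(H @ v1, v1) and np.allclose((H - np.eye(4)) @ v2, v1)

X12 = solution_from(v1, v2)   # span{v1, v2}: sigma(A + RX) = {1, 1}
X13 = solution_from(v1, v3)   # span{v1, v3}: sigma(A + RX) = {1, -1}
X34 = solution_from(v3, v4)   # span{v3, v4}: sigma(A + RX) = {-1, -1}
```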

Up to this point, we have said nothing about the structure of the solutions given by Theorems 13.1 and 13.2. The following theorem gives a sufficient condition for a Riccati solution to be hermitian (not necessarily real symmetric).

Theorem 13.3 Let $\mathcal{V}$ be an $n$-dimensional $H$-invariant subspace and let $X_1, X_2 \in \mathbb{C}^{n \times n}$ be such that
\[ \mathcal{V} = \mathrm{Im}\begin{bmatrix} X_1 \\ X_2 \end{bmatrix}. \]
Then $\lambda_i + \bar{\lambda}_j \neq 0$ for all $i, j = 1, \ldots, n$, $\lambda_i, \lambda_j \in \sigma(H|_{\mathcal{V}})$, implies that $X_1^*X_2$ is hermitian, i.e., $X_1^*X_2 = (X_1^*X_2)^*$. Furthermore, if $X_1$ is nonsingular, then $X = X_2X_1^{-1}$ is hermitian.

Proof. Since $\mathcal{V}$ is an invariant subspace of $H$, there is a matrix representation $\Lambda$ for $H|_{\mathcal{V}}$ such that $\sigma(\Lambda) = \sigma(H|_{\mathcal{V}})$ and
\[ H\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}\Lambda. \]
Pre-multiply this equation by $\begin{bmatrix} X_1 \\ X_2 \end{bmatrix}^*J$ to get
\[ \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}^*JH\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}^*J\begin{bmatrix} X_1 \\ X_2 \end{bmatrix}\Lambda. \tag{13.4} \]
Note that $JH$ is hermitian (actually symmetric since $H$ is real); therefore, the left-hand side of (13.4) is hermitian, and so is the right-hand side:
\[ \left(\begin{bmatrix} X_1 \\ X_2 \end{bmatrix}^*J\begin{bmatrix} X_1 \\ X_2 \end{bmatrix}\Lambda\right)^* = \Lambda^*\begin{bmatrix} X_1 \\ X_2 \end{bmatrix}^*J^*\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = -\Lambda^*\begin{bmatrix} X_1 \\ X_2 \end{bmatrix}^*J\begin{bmatrix} X_1 \\ X_2 \end{bmatrix}, \]
i.e.,
\[ (-X_1^*X_2 + X_2^*X_1)\Lambda = -\Lambda^*(-X_1^*X_2 + X_2^*X_1). \]
This is a Lyapunov equation. Since $\lambda_i + \bar{\lambda}_j \neq 0$, the equation has a unique solution:
\[ -X_1^*X_2 + X_2^*X_1 = 0. \]
This implies that $X_1^*X_2$ is hermitian.

That $X$ is hermitian is easy to see by noting that $X = (X_1^{-1})^*(X_1^*X_2)X_1^{-1}$. $\Box$

Remark 13.2 It is clear from Example 13.1 that the condition $\lambda_i + \bar{\lambda}_j \neq 0$ is not necessary for the existence of a hermitian solution. $\diamond$

The following theorem gives necessary and sufficient conditions for a solution to be real.

Theorem 13.4 Let $\mathcal{V}$ be an $n$-dimensional $H$-invariant subspace, and let $X_1, X_2 \in \mathbb{C}^{n \times n}$ be such that $X_1$ is nonsingular and the columns of $\begin{bmatrix} X_1 \\ X_2 \end{bmatrix}$ form a basis of $\mathcal{V}$. Then $X := X_2X_1^{-1}$ is real if and only if $\mathcal{V}$ is conjugate symmetric, i.e., $v \in \mathcal{V}$ implies $\bar{v} \in \mathcal{V}$.

Proof. ($\Leftarrow$) Since $\mathcal{V}$ is conjugate symmetric, there is a nonsingular matrix $P \in \mathbb{C}^{n \times n}$ such that
\[ \begin{bmatrix} \bar{X}_1 \\ \bar{X}_2 \end{bmatrix} = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}P, \]
where the overbar $\bar{X}$ denotes the complex conjugate. Therefore,
\[ \bar{X} = \bar{X}_2\bar{X}_1^{-1} = X_2P(X_1P)^{-1} = X_2X_1^{-1} = X, \]
so $X$ is real, as desired.

($\Rightarrow$) Define $X := X_2X_1^{-1}$. By assumption, $X \in \mathbb{R}^{n \times n}$ and
\[ \mathrm{Im}\begin{bmatrix} I \\ X \end{bmatrix} = \mathcal{V}; \]
therefore, $\mathcal{V}$ is conjugate symmetric. $\Box$

Example 13.2 This example is intended to show that there are non-real, non-hermitian solutions to equation (13.1). It is also designed to show that the condition $\lambda_i + \bar{\lambda}_j \neq 0,\ \forall i, j$, which excludes the possibility of imaginary axis eigenvalues (since if $\lambda_l = j\omega$ then $\lambda_l + \bar{\lambda}_l = 0$), is not necessary for the existence of a hermitian solution. Let
\[ A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix}, \quad R = \begin{bmatrix} -1 & 0 & -2 \\ 0 & 0 & 0 \\ -2 & 0 & -4 \end{bmatrix}, \quad Q = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}. \]
Then $H$ has eigenvalues $\lambda_1 = 1 = -\lambda_2$, $\lambda_3 = \lambda_5 = j = -\lambda_4 = -\lambda_6$. It is easy to show that
\[ X = \begin{bmatrix} 0 & 0.5 + 0.5j & 0.5 - 0.5j \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \]
satisfies equation (13.1) and that $X$ is neither real nor hermitian. On the other hand,
\[ X = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \]
is a real symmetric nonnegative definite solution corresponding to the eigenvalues $-1$, $j$, $-j$. $\diamond$
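Both claims are quick to verify numerically; a sketch (not from the book):

```python
import numpy as np

# Data of Example 13.2
A = np.array([[1., 0., 0.], [0., 0., -1.], [0., 1., 0.]])
R = np.array([[-1., 0., -2.], [0., 0., 0.], [-2., 0., -4.]])
Q = np.zeros((3, 3))

def residual(X):
    return A.conj().T @ X + X @ A + X @ R @ X + Q

# Non-real, non-hermitian solution
X1 = np.zeros((3, 3), dtype=complex)
X1[0, 1] = 0.5 + 0.5j
X1[0, 2] = 0.5 - 0.5j

# Real symmetric nonnegative definite solution
X2 = np.diag([2., 0., 0.])

ok1 = np.allclose(residual(X1), 0)
ok2 = np.allclose(residual(X2), 0)
hermitian1 = np.allclose(X1, X1.conj().T)
```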

13.2 Stabilizing Solution and Riccati Operator

In this section, we discuss when a solution is stabilizing, i.e., $\sigma(A + RX) \subset \mathbb{C}_-$, and the properties of such solutions. This section is the central part of this chapter and is designed to be self-contained; hence some of the material appearing in this section may be similar to that in the previous section.

Assume $H$ has no eigenvalues on the imaginary axis. Then it must have $n$ eigenvalues in $\mathrm{Re}\,s < 0$ and $n$ in $\mathrm{Re}\,s > 0$. Consider the two $n$-dimensional spectral subspaces $\mathcal{X}_-(H)$ and $\mathcal{X}_+(H)$: the former is the invariant subspace corresponding to eigenvalues in $\mathrm{Re}\,s < 0$, and the latter corresponds to eigenvalues in $\mathrm{Re}\,s > 0$. By finding a basis for $\mathcal{X}_-(H)$, stacking the basis vectors up to form a matrix, and partitioning the matrix, we get
\[ \mathcal{X}_-(H) = \mathrm{Im}\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} \]
where $X_1, X_2 \in \mathbb{C}^{n \times n}$. ($X_1$ and $X_2$ can be chosen to be real matrices.) If $X_1$ is nonsingular or, equivalently, if the two subspaces
\[ \mathcal{X}_-(H), \quad \mathrm{Im}\begin{bmatrix} 0 \\ I \end{bmatrix} \tag{13.5} \]
are complementary, we can set $X := X_2X_1^{-1}$. Then $X$ is uniquely determined by $H$, i.e., $H \mapsto X$ is a function, which will be denoted $\mathrm{Ric}$. We will take the domain of $\mathrm{Ric}$, denoted $\mathrm{dom(Ric)}$, to consist of Hamiltonian matrices $H$ with two properties: $H$ has no eigenvalues on the imaginary axis, and the two subspaces in (13.5) are complementary. For ease of reference, these will be called the stability property and the complementarity property, respectively. This solution will be called the stabilizing solution. Thus, $X = \mathrm{Ric}(H)$ and
\[ \mathrm{Ric} : \mathrm{dom(Ric)} \subset \mathbb{R}^{2n \times 2n} \longmapsto \mathbb{R}^{n \times n}. \]
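As an illustration, the map $\mathrm{Ric}$ can be sketched in a few lines of numpy by selecting the $n$ stable eigenvectors of $H$. This simplified version assumes $H$ is diagonalizable; a robust implementation would use an ordered Schur decomposition instead. The data used below is that of Example 13.3 later in this section:

```python
import numpy as np

def ric(H, tol=1e-9):
    """X = Ric(H): sketch via eigenvectors (assumes H is diagonalizable)."""
    n = H.shape[0] // 2
    lam, V = np.linalg.eig(H)
    if np.any(np.abs(lam.real) < tol):
        raise ValueError("stability property fails: imaginary-axis eigenvalues")
    Vs = V[:, lam.real < 0]               # basis of the spectral subspace X_-(H)
    X1, X2 = Vs[:n, :], Vs[n:, :]
    if np.linalg.cond(X1) > 1.0 / tol:
        raise ValueError("complementarity property fails: X1 singular")
    return (X2 @ np.linalg.inv(X1)).real  # X is real for a real Hamiltonian H

# Data of Example 13.3 below: A unstable, C = 0, stabilizing X exists
A = np.array([[1., 0.], [0., 2.]])
B = np.array([[1.], [1.]])
C = np.zeros((1, 2))
H = np.block([[A, -B @ B.T], [-C.T @ C, -A.T]])
X = ric(H)
cl_eigs = np.linalg.eigvals(A - B @ B.T @ X)   # closed loop: A + RX
```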

The following well-known results give some properties of $X$ as well as verifiable conditions under which $H$ belongs to $\mathrm{dom(Ric)}$.

Theorem 13.5 Suppose $H \in \mathrm{dom(Ric)}$ and $X = \mathrm{Ric}(H)$. Then

(i) $X$ is real symmetric;

(ii) $X$ satisfies the algebraic Riccati equation
\[ A^*X + XA + XRX + Q = 0; \]

(iii) $A + RX$ is stable.

Proof. (i) Let $X_1$, $X_2$ be as above. It is claimed that
\[ X_1^*X_2 \text{ is symmetric.} \tag{13.6} \]
To prove this, note that there exists a stable matrix $H_-$ in $\mathbb{R}^{n \times n}$ such that
\[ H\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}H_-. \]
($H_-$ is a matrix representation of $H|_{\mathcal{X}_-(H)}$.) Pre-multiply this equation by $\begin{bmatrix} X_1 \\ X_2 \end{bmatrix}^*J$ to get
\[ \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}^*JH\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}^*J\begin{bmatrix} X_1 \\ X_2 \end{bmatrix}H_-. \tag{13.7} \]
Since $JH$ is symmetric, so is the left-hand side of (13.7), and so is the right-hand side:
\[ (-X_1^*X_2 + X_2^*X_1)H_- = H_-^*(-X_1^*X_2 + X_2^*X_1)^* = -H_-^*(-X_1^*X_2 + X_2^*X_1). \]
This is a Lyapunov equation. Since $H_-$ is stable, the unique solution is
\[ -X_1^*X_2 + X_2^*X_1 = 0. \]
This proves (13.6). Since $X_1$ is nonsingular and $X = (X_1^{-1})^*(X_1^*X_2)X_1^{-1}$, $X$ is symmetric.

(ii) Start with the equation
\[ H\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}H_- \]
and post-multiply by $X_1^{-1}$ to get
\[ H\begin{bmatrix} I \\ X \end{bmatrix} = \begin{bmatrix} I \\ X \end{bmatrix}X_1H_-X_1^{-1}. \tag{13.8} \]
Now pre-multiply by $\begin{bmatrix} X & -I \end{bmatrix}$:
\[ \begin{bmatrix} X & -I \end{bmatrix}H\begin{bmatrix} I \\ X \end{bmatrix} = 0. \]
This is precisely the Riccati equation.

(iii) Pre-multiply (13.8) by $\begin{bmatrix} I & 0 \end{bmatrix}$ to get
\[ A + RX = X_1H_-X_1^{-1}. \]
Thus $A + RX$ is stable because $H_-$ is. $\Box$

Now we are ready to state one of the main theorems of this section, which gives necessary and sufficient conditions for the existence of a unique stabilizing solution of (13.1) under certain restrictions on the matrix $R$.

Theorem 13.6 Suppose $H$ has no imaginary eigenvalues and $R$ is either positive semidefinite or negative semidefinite. Then $H \in \mathrm{dom(Ric)}$ if and only if $(A, R)$ is stabilizable.

Proof. ($\Leftarrow$) To prove that $H \in \mathrm{dom(Ric)}$, we must show that
\[ \mathcal{X}_-(H), \quad \mathrm{Im}\begin{bmatrix} 0 \\ I \end{bmatrix} \]
are complementary. This requires a preliminary step. As in the proof of Theorem 13.5, define $X_1$, $X_2$, $H_-$ so that
\[ \mathcal{X}_-(H) = \mathrm{Im}\begin{bmatrix} X_1 \\ X_2 \end{bmatrix}, \]
\[ H\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}H_-. \tag{13.9} \]

We want to show that $X_1$ is nonsingular, i.e., $\mathrm{Ker}\,X_1 = 0$. First, it is claimed that $\mathrm{Ker}\,X_1$ is $H_-$-invariant. To prove this, let $x \in \mathrm{Ker}\,X_1$. Pre-multiply (13.9) by $\begin{bmatrix} I & 0 \end{bmatrix}$ to get
\[ AX_1 + RX_2 = X_1H_-. \tag{13.10} \]
Pre-multiply by $x^*X_2^*$, post-multiply by $x$, and use the fact that $X_2^*X_1$ is symmetric (see (13.6)) to get
\[ x^*X_2^*RX_2x = 0. \]
Since $R$ is semidefinite, this implies that $RX_2x = 0$. Now post-multiply (13.10) by $x$ to get $X_1H_-x = 0$, i.e., $H_-x \in \mathrm{Ker}\,X_1$. This proves the claim.

Now to prove that $X_1$ is nonsingular, suppose, on the contrary, that $\mathrm{Ker}\,X_1 \neq 0$. Then $H_-|_{\mathrm{Ker}\,X_1}$ has an eigenvalue $\lambda$ and a corresponding eigenvector $x$:
\[ H_-x = \lambda x, \tag{13.11} \]
\[ \mathrm{Re}\,\lambda < 0, \quad 0 \neq x \in \mathrm{Ker}\,X_1. \]
Pre-multiply (13.9) by $\begin{bmatrix} 0 & I \end{bmatrix}$:
\[ -QX_1 - A^*X_2 = X_2H_-. \tag{13.12} \]
Post-multiply the above equation by $x$ and use (13.11):
\[ (A^* + \lambda I)X_2x = 0. \]
Recalling that $RX_2x = 0$ and taking conjugate transposes, we have
\[ x^*X_2^*\begin{bmatrix} A + \bar{\lambda}I & R \end{bmatrix} = 0. \]
Then the stabilizability of $(A, R)$ implies $X_2x = 0$ (note that $-\bar{\lambda}$ lies in the open right-half plane). But if both $X_1x = 0$ and $X_2x = 0$, then $x = 0$ since $\begin{bmatrix} X_1 \\ X_2 \end{bmatrix}$ has full column rank, which is a contradiction.

($\Rightarrow$) This is obvious since $H \in \mathrm{dom(Ric)}$ implies that $X$ is a stabilizing solution and that $A + RX$ is asymptotically stable, which in turn implies that $(A, R)$ must be stabilizable. $\Box$

Theorem 13.7 Suppose $H$ has the form
\[ H = \begin{bmatrix} A & -BB^* \\ -C^*C & -A^* \end{bmatrix}. \]
Then $H \in \mathrm{dom(Ric)}$ if and only if $(A, B)$ is stabilizable and $(C, A)$ has no unobservable modes on the imaginary axis. Furthermore, $X = \mathrm{Ric}(H) \geq 0$ if $H \in \mathrm{dom(Ric)}$, and $\mathrm{Ker}(X) = 0$ if and only if $(C, A)$ has no stable unobservable modes.

Note that $\mathrm{Ker}(X) \subset \mathrm{Ker}(C)$, so that the equation $XM = C^*$ always has a solution for $M$, and a minimum $F$-norm solution is given by $X^{\dagger}C^*$.

Proof. It is clear from Theorem 13.6 that the stabilizability of $(A, B)$ is necessary, and it is also sufficient if $H$ has no eigenvalues on the imaginary axis. So we only need to show that, assuming $(A, B)$ is stabilizable, $H$ has no imaginary eigenvalues if and only if $(C, A)$ has no unobservable modes on the imaginary axis. Suppose that $j\omega$ is an eigenvalue and $0 \neq \begin{bmatrix} x \\ z \end{bmatrix}$ is a corresponding eigenvector. Then
\begin{align*}
Ax - BB^*z &= j\omega x \\
-C^*Cx - A^*z &= j\omega z.
\end{align*}
Rearrange:
\[ (A - j\omega I)x = BB^*z \tag{13.13} \]
\[ -(A - j\omega I)^*z = C^*Cx. \tag{13.14} \]
Thus
\begin{align*}
\langle z, (A - j\omega I)x\rangle &= \langle z, BB^*z\rangle = \|B^*z\|^2 \\
-\langle x, (A - j\omega I)^*z\rangle &= \langle x, C^*Cx\rangle = \|Cx\|^2,
\end{align*}
so $\langle x, (A - j\omega I)^*z\rangle$ is real and
\[ -\|Cx\|^2 = \langle(A - j\omega I)x, z\rangle = \langle z, (A - j\omega I)x\rangle = \|B^*z\|^2. \]
Therefore $B^*z = 0$ and $Cx = 0$. So from (13.13) and (13.14),
\begin{align*}
(A - j\omega I)x &= 0 \\
(A - j\omega I)^*z &= 0.
\end{align*}
Combine the last four equations to get
\[ z^*\begin{bmatrix} A - j\omega I & B \end{bmatrix} = 0, \qquad \begin{bmatrix} A - j\omega I \\ C \end{bmatrix}x = 0. \]
The stabilizability of $(A, B)$ gives $z = 0$. Now it is clear that $j\omega$ is an eigenvalue of $H$ if and only if $j\omega$ is an unobservable mode of $(C, A)$.

Next, set $X := \mathrm{Ric}(H)$. We'll show that $X \geq 0$. The Riccati equation is
\[ A^*X + XA - XBB^*X + C^*C = 0 \]
or, equivalently,
\[ (A - BB^*X)^*X + X(A - BB^*X) + XBB^*X + C^*C = 0. \tag{13.15} \]
Noting that $A - BB^*X$ is stable (Theorem 13.5), we have
\[ X = \int_0^\infty e^{(A - BB^*X)^*t}(XBB^*X + C^*C)e^{(A - BB^*X)t}\,dt. \tag{13.16} \]
Since $XBB^*X + C^*C$ is positive semidefinite, so is $X$.

Finally, we'll show that $\mathrm{Ker}\,X$ is nontrivial if and only if $(C, A)$ has stable unobservable modes. Let $x \in \mathrm{Ker}\,X$; then $Xx = 0$. Pre-multiply (13.15) by $x^*$ and post-multiply by $x$ to get
\[ Cx = 0. \]
Now post-multiply (13.15) again by $x$ to get
\[ XAx = 0. \]
We conclude that $\mathrm{Ker}(X)$ is an $A$-invariant subspace. Now if $\mathrm{Ker}(X) \neq 0$, then there is a $0 \neq x \in \mathrm{Ker}(X)$ and a $\lambda$ such that $\lambda x = Ax = (A - BB^*X)x$ and $Cx = 0$. Since $A - BB^*X$ is stable, $\mathrm{Re}\,\lambda < 0$; thus $\lambda$ is a stable unobservable mode. Conversely, suppose $(C, A)$ has a stable unobservable mode $\lambda$, i.e., there is an $x$ such that $Ax = \lambda x$, $Cx = 0$. By pre-multiplying the Riccati equation by $x^*$ and post-multiplying by $x$, we get
\[ 2\,\mathrm{Re}\,\lambda\; x^*Xx - x^*XBB^*Xx = 0. \]
Hence $x^*Xx = 0$, i.e., $X$ is singular. $\Box$

Example 13.3 This example shows that the observability of $(C, A)$ is not necessary for the existence of a positive definite stabilizing solution. Let
\[ A = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}, \quad B = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad C = \begin{bmatrix} 0 & 0 \end{bmatrix}. \]
Then $(A, B)$ is stabilizable, but $(C, A)$ is not detectable. However,
\[ X = \begin{bmatrix} 18 & -24 \\ -24 & 36 \end{bmatrix} > 0 \]
is the stabilizing solution. $\diamond$

Corollary 13.8 Suppose that $(A, B)$ is stabilizable and $(C, A)$ is detectable. Then the Riccati equation
\[ A^*X + XA - XBB^*X + C^*C = 0 \]
has a unique positive semidefinite solution. Moreover, the solution is stabilizing.

Proof. It is obvious from the above theorem that the Riccati equation has a unique stabilizing solution and that the solution is positive semidefinite. Hence we only need to show that any positive semidefinite solution $X \geq 0$ must also be stabilizing; then, by the uniqueness of the stabilizing solution, we can conclude that there is only one positive semidefinite solution. To achieve that goal, let us assume that $X \geq 0$ satisfies the Riccati equation but that it is not stabilizing. First rewrite the Riccati equation as
\[ (A - BB^*X)^*X + X(A - BB^*X) + XBB^*X + C^*C = 0 \tag{13.17} \]
and let $\lambda$ and $x$ be an unstable eigenvalue and a corresponding eigenvector of $A - BB^*X$, respectively, i.e.,
\[ (A - BB^*X)x = \lambda x. \]
Now pre-multiply and post-multiply equation (13.17) by $x^*$ and $x$, respectively, to get
\[ (\bar{\lambda} + \lambda)x^*Xx + x^*(XBB^*X + C^*C)x = 0. \]
This implies
\[ B^*Xx = 0, \quad Cx = 0 \]
since $\mathrm{Re}(\lambda) \geq 0$ and $X \geq 0$. Finally, we arrive at
\[ Ax = \lambda x, \quad Cx = 0, \]
i.e., $(C, A)$ is not detectable, which is a contradiction. Hence $\mathrm{Re}(\lambda) < 0$, i.e., $X \geq 0$ is the stabilizing solution. $\Box$

Lemma 13.9 Suppose $D$ has full column rank and let $R = D^*D > 0$; then the following statements are equivalent:

(i) $\begin{bmatrix} A - j\omega I & B \\ C & D \end{bmatrix}$ has full column rank for all $\omega$;

(ii) $\big((I - DR^{-1}D^*)C,\; A - BR^{-1}D^*C\big)$ has no unobservable modes on the $j\omega$-axis.

Proof. Suppose $j\omega$ is an unobservable mode of $\big((I - DR^{-1}D^*)C,\; A - BR^{-1}D^*C\big)$; then there is an $x \neq 0$ such that
\[ (A - BR^{-1}D^*C)x = j\omega x, \quad (I - DR^{-1}D^*)Cx = 0, \]
i.e.,
\[ \begin{bmatrix} A - j\omega I & B \\ C & D \end{bmatrix}\begin{bmatrix} I & 0 \\ -R^{-1}D^*C & I \end{bmatrix}\begin{bmatrix} x \\ 0 \end{bmatrix} = 0. \]
But this implies that
\[ \begin{bmatrix} A - j\omega I & B \\ C & D \end{bmatrix} \tag{13.18} \]
does not have full column rank. Conversely, suppose (13.18) does not have full column rank for some $\omega$; then there exists $\begin{bmatrix} u \\ v \end{bmatrix} \neq 0$ such that
\[ \begin{bmatrix} A - j\omega I & B \\ C & D \end{bmatrix}\begin{bmatrix} u \\ v \end{bmatrix} = 0. \]
Now let
\[ \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} I & 0 \\ -R^{-1}D^*C & I \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}. \]
Then
\[ \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} I & 0 \\ R^{-1}D^*C & I \end{bmatrix}\begin{bmatrix} u \\ v \end{bmatrix} \neq 0 \]
and
\[ (A - BR^{-1}D^*C - j\omega I)x + By = 0 \tag{13.19} \]
\[ (I - DR^{-1}D^*)Cx + Dy = 0. \tag{13.20} \]
Pre-multiply (13.20) by $D^*$ to get $y = 0$. Then we have
\[ (A - BR^{-1}D^*C)x = j\omega x, \quad (I - DR^{-1}D^*)Cx = 0, \]
i.e., $j\omega$ is an unobservable mode of $\big((I - DR^{-1}D^*)C,\; A - BR^{-1}D^*C\big)$. $\Box$

Remark 13.3 If $D$ is not square, then there is a $D_\perp$ such that $\begin{bmatrix} D_\perp & DR^{-1/2} \end{bmatrix}$ is unitary and $D_\perp D_\perp^* = I - DR^{-1}D^*$. Hence, in some cases we will write condition (ii) in the above lemma as $(D_\perp^*C,\; A - BR^{-1}D^*C)$ having no imaginary unobservable modes. Of course, if $D$ is square, the condition is simplified to $A - BR^{-1}D^*C$ having no imaginary eigenvalues. Note also that if $D^*C = 0$, condition (ii) becomes $(C, A)$ having no imaginary unobservable modes. $\diamond$

Corollary 13.10 Suppose $D$ has full column rank and denote $R = D^*D > 0$. Let $H$ have the form
\begin{align*}
H &= \begin{bmatrix} A & 0 \\ -C^*C & -A^* \end{bmatrix} - \begin{bmatrix} B \\ -C^*D \end{bmatrix}R^{-1}\begin{bmatrix} D^*C & B^* \end{bmatrix} \\
&= \begin{bmatrix} A - BR^{-1}D^*C & -BR^{-1}B^* \\ -C^*(I - DR^{-1}D^*)C & -(A - BR^{-1}D^*C)^* \end{bmatrix}.
\end{align*}
Then $H \in \mathrm{dom(Ric)}$ if and only if $(A, B)$ is stabilizable and $\begin{bmatrix} A - j\omega I & B \\ C & D \end{bmatrix}$ has full column rank for all $\omega$. Furthermore, $X = \mathrm{Ric}(H) \geq 0$ if $H \in \mathrm{dom(Ric)}$, and $\mathrm{Ker}(X) = 0$ if and only if $(D_\perp^*C,\; A - BR^{-1}D^*C)$ has no stable unobservable modes.

Proof. This is a consequence of Lemma 13.9 and Theorem 13.7. $\Box$

Remark 13.4 It is easy to see that the detectability (observability) of $(D_\perp^*C,\; A - BR^{-1}D^*C)$ implies the detectability (observability) of $(C, A)$; however, the converse is in general not true. Hence the existence of a stabilizing solution to the Riccati equation in the above corollary is not guaranteed by the stabilizability of $(A, B)$ and the detectability of $(C, A)$. Furthermore, even if a stabilizing solution exists, the positive definiteness of the solution is not guaranteed by the observability of $(C, A)$ unless $D^*C = 0$. As an example, consider
\[ A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \quad B = \begin{bmatrix} 0 \\ -1 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad D = \begin{bmatrix} 1 \\ 0 \end{bmatrix}. \]
Then $(C, A)$ is observable, $(A, B)$ is controllable, and
\[ A - BD^*C = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad D_\perp^*C = \begin{bmatrix} 0 & 0 \end{bmatrix}. \]
A Riccati equation with the above data has a nonnegative definite stabilizing solution since $(D_\perp^*C,\; A - BR^{-1}D^*C)$ has no unobservable modes on the imaginary axis. However, the solution is not positive definite since $(D_\perp^*C,\; A - BR^{-1}D^*C)$ has a stable unobservable mode. On the other hand, if the $B$ matrix is changed to
\[ B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \]
then the corresponding Riccati equation has no stabilizing solution since, in this case, $A - BD^*C$ has eigenvalues on the imaginary axis although $(A, B)$ is controllable and $(C, A)$ is observable. $\diamond$
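The two cases in this example are easy to check numerically. Here $R = D^*D = 1$ and $D_\perp^*C = 0$, so every mode of $A - BR^{-1}D^*C$ is unobservable, and the condition reduces to that matrix having no imaginary-axis eigenvalues (a sketch, not from the book):

```python
import numpy as np

A = np.array([[0., 1.], [0., 0.]])
C = np.array([[1., 0.], [0., 0.]])
D = np.array([[1.], [0.]])
R = D.T @ D                          # = 1, so R^{-1} is trivial here

def no_imaginary_modes(B, tol=1e-9):
    # Since D_perp* C = 0, check A - B R^{-1} D* C for imaginary eigenvalues
    Acl = A - B @ np.linalg.solve(R, D.T @ C)
    return bool(np.all(np.abs(np.linalg.eigvals(Acl).real) > tol))

ok_minus = no_imaginary_modes(np.array([[0.], [-1.]]))  # eigenvalues +1, -1
ok_plus = no_imaginary_modes(np.array([[0.], [1.]]))    # eigenvalues +j, -j
```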

13.3 Extreme Solutions and Matrix Inequalities

We have shown in the previous sections that a given Riccati equation generally has many solutions. Among all solutions, we are most interested in those which are real, symmetric, and, in particular, stabilizing. There is another interesting class of solutions, called extreme (maximal or minimal) solutions. Some properties of the extreme solutions will be studied in this section, and the connections between the extreme solutions and the stabilizing solutions will also be established. To illustrate the idea, let us look at an example.

Example 13.4 Let
\[ A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -3 \end{bmatrix}, \quad B = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \quad C = \begin{bmatrix} 0 & 0 & 0 \end{bmatrix}. \]
The eigenvalues of the matrix $H = \begin{bmatrix} A & -BB^* \\ -C^*C & -A^* \end{bmatrix}$ are
\[ \lambda_1 = 1, \quad \lambda_2 = -1, \quad \lambda_3 = 2, \quad \lambda_4 = -2, \quad \lambda_5 = -3, \quad \lambda_6 = 3, \]
and their corresponding eigenvectors are
\[ v_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \; v_2 = \begin{bmatrix} 3 \\ 2 \\ -3 \\ 6 \\ 0 \\ 0 \end{bmatrix}, \; v_3 = \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \; v_4 = \begin{bmatrix} 4 \\ 3 \\ -12 \\ 0 \\ 12 \\ 0 \end{bmatrix}, \; v_5 = \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \; v_6 = \begin{bmatrix} -3 \\ -6 \\ -1 \\ 0 \\ 0 \\ 6 \end{bmatrix}. \]
There are four distinct nonnegative definite symmetric solutions depending on the chosen invariant subspaces:

• $\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = [v_1\ v_3\ v_5]$: $Y_1 = X_2X_1^{-1} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$;

• $\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = [v_2\ v_3\ v_5]$: $Y_2 = X_2X_1^{-1} = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$;

• $\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = [v_1\ v_4\ v_5]$: $Y_3 = X_2X_1^{-1} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 0 \end{bmatrix}$;

• $\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = [v_2\ v_4\ v_5]$: $Y_4 = X_2X_1^{-1} = \begin{bmatrix} 18 & -24 & 0 \\ -24 & 36 & 0 \\ 0 & 0 & 0 \end{bmatrix}$.

These solutions can be ordered as $Y_4 \geq Y_i \geq Y_1$, $i = 2, 3$. Of course, this is only a partial ordering since $Y_2$ and $Y_3$ are not comparable. Note also that only $Y_4$ is a stabilizing solution, i.e., $A - BB^*Y_4$ is stable. Furthermore, $Y_4$ and $Y_1$ are the "maximal" and "minimal" solutions, respectively. $\diamond$
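The four solutions and their ordering can be reproduced directly from the eigenvectors of $H$ (a numpy sketch, not from the book):

```python
import numpy as np

# Data of Example 13.4
A = np.diag([1., 2., -3.])
B = np.ones((3, 1))
H = np.block([[A, -B @ B.T], [np.zeros((3, 3)), -A.T]])
lam, V = np.linalg.eig(H)

def solution_for(eigs):
    # Form X = X2 X1^{-1} from the eigenvectors of the requested eigenvalues
    cols = [int(np.argmin(np.abs(lam - e))) for e in eigs]
    Vs = V[:, cols].real
    return Vs[3:, :] @ np.linalg.inv(Vs[:3, :])

Y1 = solution_for([1., 2., -3.])     # span{v1, v3, v5}
Y2 = solution_for([-1., 2., -3.])    # span{v2, v3, v5}
Y3 = solution_for([1., -2., -3.])    # span{v1, v4, v5}
Y4 = solution_for([-1., -2., -3.])   # span{v2, v4, v5} = X_-(H)

def psd(M, tol=1e-8):
    # Nonnegative definiteness of a (numerically) symmetric matrix
    return bool(np.all(np.linalg.eigvalsh((M + M.T) / 2) >= -tol))

partial_order = psd(Y4 - Y2) and psd(Y4 - Y3) and psd(Y2 - Y1) and psd(Y3 - Y1)
y4_stabilizing = bool(np.all(np.linalg.eigvals(A - B @ B.T @ Y4).real < 0))
```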

The partial ordering concept shown in Example 13.4 can be stated in a much more general setting. To do that, again consider the Riccati equation (13.1). We shall call a hermitian solution $X_+$ of (13.1) a maximal solution if $X_+ \geq X$ for all hermitian solutions $X$ of (13.1). Similarly, we shall call a hermitian solution $X_-$ of (13.1) a minimal solution if $X_- \leq X$ for all hermitian solutions $X$ of (13.1). Clearly, the maximal and minimal solutions are unique if they exist.

To study the properties of the maximal and minimal solutions, we shall introduce the following quadratic matrix function:
\[ \mathcal{Q}(X) := A^*X + XA + XRX + Q. \tag{13.21} \]

Theorem 13.11 Assume $R \leq 0$ and assume there is a hermitian matrix $X = X^*$ such that $\mathcal{Q}(X) \geq 0$.

(i) If $(A, R)$ is stabilizable, then there exists a unique maximal solution $X_+$ to the Riccati equation (13.1). Furthermore,
\[ X_+ \geq X, \quad \forall X \text{ such that } \mathcal{Q}(X) \geq 0, \]
and $\sigma(A + RX_+) \subset \overline{\mathbb{C}_-}$.

(ii) If $(-A, R)$ is stabilizable, then there exists a unique minimal solution $X_-$ to the Riccati equation (13.1). Furthermore,
\[ X_- \leq X, \quad \forall X \text{ such that } \mathcal{Q}(X) \geq 0, \]
and $\sigma(A + RX_-) \subset \overline{\mathbb{C}_+}$.

(iii) If $(A, R)$ is controllable, then both $X_+$ and $X_-$ exist. Furthermore, $X_+ > X_-$ iff $\sigma(A + RX_+) \subset \mathbb{C}_-$ iff $\sigma(A + RX_-) \subset \mathbb{C}_+$. In this case,
\begin{align*}
X_+ - X_- &= \left(-\int_0^\infty e^{(A+RX_+)t}\,R\,e^{(A+RX_+)^*t}\,dt\right)^{-1} \\
&= \left(-\int_{-\infty}^0 e^{(A+RX_-)t}\,R\,e^{(A+RX_-)^*t}\,dt\right)^{-1}.
\end{align*}

(iv) If $\mathcal{Q}(X) > 0$, the results in (i) and (ii) can be respectively strengthened to $X_+ > X$, $\sigma(A + RX_+) \subset \mathbb{C}_-$, and $X_- < X$, $\sigma(A + RX_-) \subset \mathbb{C}_+$.

Proof. Write $R = -BB^*$ for some $B$, and note that $(A, R)$ is stabilizable (controllable) iff $(A, B)$ is.

(i): Let $X$ be such that $\mathcal{Q}(X) \geq 0$. Since $(A, B)$ is stabilizable, there is an $F_0$ such that
\[ A_0 := A + BF_0 \]
is stable. Now let $X_0$ be the unique solution to the Lyapunov equation
\[ X_0A_0 + A_0^*X_0 + F_0^*F_0 + Q = 0. \]
Then $X_0$ is hermitian. Define
\[ \hat{F}_0 := F_0 + B^*X, \]
and we have the following equation:
\[ (X_0 - X)A_0 + A_0^*(X_0 - X) = -\hat{F}_0^*\hat{F}_0 - \mathcal{Q}(X) \leq 0. \]
The stability of $A_0$ implies that
\[ X_0 \geq X. \]
Starting with $X_0$, we shall define a non-increasing sequence of hermitian matrices $\{X_i\}$. Associated with $\{X_i\}$, we shall also define a sequence of stable matrices $\{A_i\}$ and a sequence of matrices $\{F_i\}$. Assume inductively that we have already defined matrices $\{X_i\}$, $\{A_i\}$, and $\{F_i\}$ for $i$ up to $n-1$ such that $X_i$ is hermitian and
\begin{gather*}
X_0 \geq X_1 \geq \cdots \geq X_{n-1} \geq X, \\
A_i = A + BF_i \text{ is stable}, \quad i = 0, \ldots, n-1, \\
F_i = -B^*X_{i-1}, \quad i = 1, \ldots, n-1, \\
X_iA_i + A_i^*X_i = -F_i^*F_i - Q, \quad i = 0, 1, \ldots, n-1. \tag{13.22}
\end{gather*}
Next, introduce
\[ F_n = -B^*X_{n-1}, \qquad A_n = A + BF_n. \]
First we show that $A_n$ is stable; then, using (13.22) with $i = n$, we define a hermitian matrix $X_n$ with $X_{n-1} \geq X_n \geq X$. Using (13.22) with $i = n-1$, we get
\[ X_{n-1}A_n + A_n^*X_{n-1} + Q + F_n^*F_n + (F_n - F_{n-1})^*(F_n - F_{n-1}) = 0. \tag{13.23} \]
Let
\[ \hat{F}_n := F_n + B^*X; \]
then
\[ (X_{n-1} - X)A_n + A_n^*(X_{n-1} - X) = -\mathcal{Q}(X) - \hat{F}_n^*\hat{F}_n - (F_n - F_{n-1})^*(F_n - F_{n-1}). \tag{13.24} \]
Now assume that $A_n$ is not stable, i.e., there exist $\lambda$ with $\mathrm{Re}\,\lambda \geq 0$ and $x \neq 0$ such that $A_nx = \lambda x$. Then pre-multiply (13.24) by $x^*$ and post-multiply by $x$, and we have
\[ 2\,\mathrm{Re}\,\lambda\; x^*(X_{n-1} - X)x = -x^*\{\mathcal{Q}(X) + \hat{F}_n^*\hat{F}_n + (F_n - F_{n-1})^*(F_n - F_{n-1})\}x. \]
Since it is assumed that $X_{n-1} \geq X$, the left-hand side is nonnegative while the right-hand side is nonpositive, so each term on the right-hand side of the above equation has to be zero. In particular, we have
\[ x^*(F_n - F_{n-1})^*(F_n - F_{n-1})x = 0, \]
which implies
\[ (F_n - F_{n-1})x = 0. \]
But now
\[ A_{n-1}x = (A + BF_{n-1})x = (A + BF_n)x = A_nx = \lambda x, \]
which contradicts the stability of $A_{n-1}$. Hence $A_n$ is stable as well.

Now we introduce $X_n$ as the unique solution of the Lyapunov equation
\[ X_nA_n + A_n^*X_n = -F_n^*F_n - Q. \tag{13.25} \]
Then $X_n$ is hermitian. Next, we have
\[ (X_n - X)A_n + A_n^*(X_n - X) = -\mathcal{Q}(X) - \hat{F}_n^*\hat{F}_n \leq 0, \]
and, by using (13.23),
\[ (X_{n-1} - X_n)A_n + A_n^*(X_{n-1} - X_n) = -(F_n - F_{n-1})^*(F_n - F_{n-1}) \leq 0. \]
Since $A_n$ is stable, we have
\[ X_{n-1} \geq X_n \geq X. \]
We thus have a non-increasing sequence $\{X_i\}$, and the sequence is bounded below by $X_i \geq X$. Hence the limit
\[ X_f := \lim_{n \to \infty} X_n \]
exists and is hermitian, and we have $X_f \geq X$. Passing to the limit $n \to \infty$ in (13.25), we get $\mathcal{Q}(X_f) = 0$. So $X_f$ is a solution of (13.1). Since $X$ is an arbitrary element satisfying $\mathcal{Q}(X) \geq 0$ and $X_f$ is independent of the choice of $X$, we have
\[ X_f \geq X, \quad \forall X \text{ such that } \mathcal{Q}(X) \geq 0. \]
In particular, $X_f$ is the maximal solution of the Riccati equation (13.1), i.e., $X_f = X_+$. To establish the stability property of the maximal solution, note that $A_n$ is stable for every $n$. Hence, in the limit, the eigenvalues of
\[ A - BB^*X_f \]
will have non-positive real parts. The uniqueness follows from the fact that the maximal solution is unique.

(ii): The results follow by making the following substitutions in the proof of part (i):
\[ A \to -A, \qquad X \to -X, \qquad X_+ \to -X_-. \]

(iii): The existence of $X_+$ and $X_-$ follows from (i) and (ii). Let $A_+ := A + RX_+$; we now show that $X_+ - X_- > 0$ iff $\sigma(A_+) \subset \mathbb{C}_-$. It is easy to verify that
\[ A_+^*(X_+ - X_-) + (X_+ - X_-)A_+ - (X_+ - X_-)R(X_+ - X_-) = 0. \]
($\Rightarrow$): Suppose $X_+ - X_- > 0$; then $(A_+, R(X_+ - X_-))$ is controllable. Therefore, from the Lyapunov theorem, we conclude that $A_+$ is stable.

($\Leftarrow$): Assume now that $A_+$ is stable and that $\bar{X}$ is any other solution to the Riccati equation (13.1); then
\[ A_+^*(X_+ - \bar{X}) + (X_+ - \bar{X})A_+ - (X_+ - \bar{X})R(X_+ - \bar{X}) = 0. \]
This equation has an invertible solution
\[ X_+ - \bar{X} = \left(-\int_0^\infty e^{(A+RX_+)t}\,R\,e^{(A+RX_+)^*t}\,dt\right)^{-1} > 0. \]
A simple rearrangement of terms gives
\[ (A + R\bar{X})^*(X_+ - \bar{X}) + (X_+ - \bar{X})(A + R\bar{X}) + (X_+ - \bar{X})R(X_+ - \bar{X}) = 0. \]
Since $X_+ - \bar{X} > 0$, we conclude $\sigma(A + R\bar{X}) \subset \overline{\mathbb{C}_+}$. This in turn implies $\bar{X} = X_-$. Therefore, $X_+ > X_-$. That $X_+ - X_- > 0$ iff $\sigma(A + RX_-) \subset \mathbb{C}_+$ follows by analogy.

(iv): We shall only show the case for $X_+$; the case for $X_-$ follows by analogy. Note that from (i) we have
\[ (X_+ - X)A_+ + A_+^*(X_+ - X) = -\mathcal{Q}(X) + (X_+ - X)R(X_+ - X) < 0 \tag{13.26} \]
and $X_+ - X \geq 0$. Now suppose $X_+ - X$ is singular, i.e., there is an $x \neq 0$ such that $(X_+ - X)x = 0$. Pre-multiplying (13.26) by $x^*$ and post-multiplying by $x$, we get $x^*\mathcal{Q}(X)x = 0$, a contradiction. The stability of $A + RX_+$ then follows from the Lyapunov theorem. $\Box$

Remark 13.5 The proof given above also provides an iterative procedure to compute the maximal and minimal solutions. For example, to find the maximal solution, the following procedure can be used:

(i) find $F_0$ such that $A_0 = A + BF_0$ is stable;

(ii) solve for $X_i$: $X_iA_i + A_i^*X_i + F_i^*F_i + Q = 0$;

(iii) if $\|X_i - X_{i-1}\| \leq \epsilon$ (a specified accuracy), stop; otherwise go to (iv);

(iv) let $F_{i+1} = -B^*X_i$ and $A_{i+1} = A + BF_{i+1}$; go to (ii).

This procedure will converge to the stabilizing solution if such a solution exists. $\diamond$
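A minimal sketch of this procedure in numpy (not from the book; the Lyapunov step is solved by vectorization, and $F_0$ below is one hand-picked stabilizing gain for the data of Example 13.3, whose stabilizing solution the iteration should recover):

```python
import numpy as np

def lyap(A, W):
    """Solve A^T X + X A + W = 0 via the Kronecker (vectorization) identity."""
    n = A.shape[0]
    # vec(A^T X + X A) = (I kron A^T + A^T kron I) vec(X), column-major vec
    L = np.kron(np.eye(n), A.T) + np.kron(A.T, np.eye(n))
    return np.linalg.solve(L, -W.reshape(-1, order="F")).reshape((n, n), order="F")

def iterate(A, B, Q, F0, tol=1e-12, maxit=200):
    """Procedure of Remark 13.5 for A^T X + X A - X B B^T X + Q = 0."""
    F, X_prev = F0, None
    for _ in range(maxit):
        Ai = A + B @ F
        X = lyap(Ai, F.T @ F + Q)          # step (ii)
        if X_prev is not None and np.linalg.norm(X - X_prev) <= tol:
            return X                       # step (iii)
        X_prev, F = X, -B.T @ X            # step (iv)
    return X

# Data of Example 13.3; F0 = [3, -7] makes A + B F0 stable (eigenvalues -0.5 +/- ...)
A = np.array([[1., 0.], [0., 2.]])
B = np.array([[1.], [1.]])
Q = np.zeros((2, 2))
F0 = np.array([[3., -7.]])
X_plus = iterate(A, B, Q, F0)
```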

Corollary 13.12 Let $R \leq 0$, suppose $(A, R)$ is controllable, and let $X_1$, $X_2$ be two solutions to the Riccati equation (13.1). Then $X_1 > X_2$ implies that $X_+ = X_1$, $X_- = X_2$, $\sigma(A + RX_1) \subset \mathbb{C}_-$, and $\sigma(A + RX_2) \subset \mathbb{C}_+$.

The following example illustrates that the stabilizability of $(A, R)$ is not sufficient to guarantee the existence of a minimal hermitian solution of equation (13.1).

Example 13.5 Let
\[ A = \begin{bmatrix} 0 & 0 \\ 0 & -1 \end{bmatrix}, \quad R = \begin{bmatrix} -1 & 0 \\ 0 & 0 \end{bmatrix}, \quad Q = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}. \]
Then it can be shown that all the hermitian solutions of (13.1) are given by
\[ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \qquad \begin{bmatrix} -1 & \alpha \\ \bar{\alpha} & -\frac{1}{2}|\alpha|^2 \end{bmatrix}, \quad \alpha \in \mathbb{C}. \]
The maximal solution is clearly
\[ X_+ = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}; \]
however, there is no minimal solution. $\diamond$

The Riccati equations that appear in H∞ control, which will be considered in the later part of this book, often have R ≥ 0. The corresponding results are duals of the above theorem.

Corollary 13.13 Assume that R ≥ 0 and that there is a hermitian matrix X = X* such that Q(X) ≤ 0.

(i) If (A, R) is stabilizable, then there exists a unique minimal solution X₋ to the Riccati equation (13.1). Furthermore,

X₋ ≤ X, ∀X such that Q(X) ≤ 0

and σ(A + RX₋) ⊂ ℂ̄₋.

(ii) If (−A, R) is stabilizable, then there exists a unique maximal solution X₊ to the Riccati equation (13.1). Furthermore,

X₊ ≥ X, ∀X such that Q(X) ≤ 0

and σ(A + RX₊) ⊂ ℂ̄₊.


(iii) If (A, R) is controllable, then both X₊ and X₋ exist. Furthermore, X₊ > X₋ iff σ(A + RX₋) ⊂ ℂ₋ iff σ(A + RX₊) ⊂ ℂ₊. In this case,

X₊ − X₋ = (∫₀^∞ e^{(A+RX₋)t} R e^{(A+RX₋)*t} dt)^{−1} = (∫_{−∞}^0 e^{(A+RX₊)t} R e^{(A+RX₊)*t} dt)^{−1}.

(iv) If Q(X) < 0, the results in (i) and (ii) can be respectively strengthened to X₊ > X, σ(A + RX₊) ⊂ ℂ₊, and X₋ < X, σ(A + RX₋) ⊂ ℂ₋.

Proof. The proof is similar to the proof of Theorem 13.11. □

Theorem 13.11 can be used to derive comparative results for some Riccati equations. More specifically, let

Hs := [A − BRs⁻¹S*, −BRs⁻¹B*; −P + SRs⁻¹S*, −(A − BRs⁻¹S*)*]

and

H̃s := [Ã − B̃R̃s⁻¹S̃*, −B̃R̃s⁻¹B̃*; −P̃ + S̃R̃s⁻¹S̃*, −(Ã − B̃R̃s⁻¹S̃*)*]

where P, P̃, Rs, and R̃s are real symmetric and Rs > 0, R̃s > 0. We shall also make use of the following matrices:

T := [P, S; S*, Rs],   T̃ := [P̃, S̃; S̃*, R̃s].

We denote by X₊ and X̃₊ the maximal solutions to the Riccati equations associated with Hs and H̃s, respectively:

(A − BRs⁻¹S*)*X + X(A − BRs⁻¹S*) − XBRs⁻¹B*X + (P − SRs⁻¹S*) = 0   (13.27)

(Ã − B̃R̃s⁻¹S̃*)*X̃ + X̃(Ã − B̃R̃s⁻¹S̃*) − X̃B̃R̃s⁻¹B̃*X̃ + (P̃ − S̃R̃s⁻¹S̃*) = 0.   (13.28)

Recall that JHs and JH̃s are hermitian, where J is defined as

J = [0, −I; I, 0].

Let

K := JHs = [P − SRs⁻¹S*, (A − BRs⁻¹S*)*; A − BRs⁻¹S*, −BRs⁻¹B*]

K̃ := JH̃s = [P̃ − S̃R̃s⁻¹S̃*, (Ã − B̃R̃s⁻¹S̃*)*; Ã − B̃R̃s⁻¹S̃*, −B̃R̃s⁻¹B̃*].


Theorem 13.14 Suppose (A, B) and (Ã, B̃) are stabilizable.

(i) Assume that (13.28) has a hermitian solution and that K ≥ K̃ (K > K̃); then X₊ and X̃₊ exist, and X₊ ≥ X̃₊ (X₊ > X̃₊).

(ii) Let A = Ã, B = B̃, and T ≥ T̃ (T > T̃). Assume that (13.28) has a hermitian solution. Then X₊ and X̃₊ exist, and X₊ ≥ X̃₊ (X₊ > X̃₊).

(iii) If T ≥ 0 (T > 0), then X₊ exists, and X₊ ≥ 0 (X₊ > 0).

Proof. (i): Let X be a hermitian solution of (13.28); then we have

[I; X]* K̃ [I; X] = 0.

Then, since K ≥ K̃, we have

Qs(X) := [I; X]* K [I; X] = [I; X]* (K − K̃) [I; X] ≥ 0.

Now we use Theorem 13.11 to obtain the existence of X₊, with X₊ ≥ X. Since X is arbitrary, we get X₊ ≥ X for all hermitian solutions X of (13.28). The existence of X̃₊ follows immediately by applying Theorem 13.11. Moreover,

X₊ ≥ X̃₊.

(ii): Let A = Ã, B = B̃, and let X be a hermitian solution of (13.28). Denote L = −Rs⁻¹(S* + B*X) and L̃ = −R̃s⁻¹(S̃* + B*X). Then

Qs(X) = [I; X]* K [I; X] = A*X + XA − L*RsL + P

while (13.28) becomes

A*X + XA − L̃*R̃sL̃ + P̃ = 0.

It is easy to show that

Qs(X) = [I; X]* (K − K̃) [I; X]
      = P − P̃ − L*RsL + L̃*R̃sL̃
      = [I; L]* (T − T̃) [I; L] + (L − L̃)* R̃s (L − L̃) ≥ 0.

Now, as in part (i), there exist X₊ and X̃₊, and X₊ ≥ X̃₊.

(iii): The condition T ≥ 0 implies P − SRs⁻¹S* ≥ 0, so we have Qs(0) = P − SRs⁻¹S* ≥ 0. Apply Theorem 13.11 to get the existence of X₊, with X₊ ≥ 0. □
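Part (ii) can be illustrated numerically. The following is a hedged sketch (the data are assumptions chosen for the example, with S = 0 and Rs = R̃s = 1, so that T ≥ T̃ reduces to P ≥ P̃): enlarging P can only enlarge the maximal (here stabilizing) solution.

```python
# Hedged numerical illustration of Theorem 13.14(ii) with S = 0, Rs = 1.
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [-1.0, -1.0]])
B = np.array([[0.0], [1.0]])

P1, P2 = 3.0 * np.eye(2), 1.0 * np.eye(2)   # P1 >= P2, so T >= T~ here
X1 = solve_continuous_are(A, B, P1, np.eye(1))  # maximal (stabilizing) solutions
X2 = solve_continuous_are(A, B, P2, np.eye(1))

# X1 - X2 should be positive semidefinite
print(np.all(np.linalg.eigvalsh(X1 - X2) >= -1e-9))
```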


13.4 Spectral Factorizations

Let A, B, P, S, R be real matrices of compatible dimensions such that P = P* and R = R*, and define the parahermitian rational matrix function

Φ(s) = R + S*(sI − A)⁻¹B + B*(−sI − A*)⁻¹S + B*(−sI − A*)⁻¹P(sI − A)⁻¹B
     = [B*(−sI − A*)⁻¹, I] [P, S; S*, R] [(sI − A)⁻¹B; I].   (13.29)

Lemma 13.15 Suppose R is nonsingular and either one of the following assumptions is satisfied:

(A1) A has no eigenvalues on the jω-axis;

(A2) P is sign definite, i.e., P ≥ 0 or P ≤ 0, (A, B) has no uncontrollable modes on the jω-axis, and (P, A) has no unobservable modes on the jω-axis.

Then the following statements are equivalent:

(i) Φ(jω₀) is singular for some 0 ≤ ω₀ ≤ ∞.

(ii) The Hamiltonian matrix

H = [A − BR⁻¹S*, −BR⁻¹B*; −(P − SR⁻¹S*), −(A − BR⁻¹S*)*]

has an eigenvalue at jω₀.

Proof. (i) ⇒ (ii): Φ(s) has the realization

Φ(s) = [A, 0, B; −P, −A*, −S; S*, B*, R]

(state matrix [A, 0; −P, −A*], input matrix [B; −S], output matrix [S*, B*], and feedthrough R). Then H is this state matrix minus the input matrix times R⁻¹ times the output matrix, and

Φ⁻¹(s) = [H, [−BR⁻¹; SR⁻¹]; [R⁻¹S*, R⁻¹B*], R⁻¹].

If Φ(jω₀) is singular, then jω₀ is a zero of Φ(s). Hence jω₀ is a pole of Φ⁻¹(s), and jω₀ is an eigenvalue of H.


(ii) ⇒ (i): Suppose jω₀ is an eigenvalue of H but is not a pole of Φ⁻¹(s). Then jω₀ must be either an unobservable mode of ([R⁻¹S*, R⁻¹B*], H) or an uncontrollable mode of (H, [−BR⁻¹; SR⁻¹]). Suppose first that jω₀ is an unobservable mode of ([R⁻¹S*, R⁻¹B*], H). Then there exists an x₀ = [x₁; x₂] ≠ 0 such that

Hx₀ = jω₀x₀,  [R⁻¹S*, R⁻¹B*]x₀ = 0.

These equations can be simplified to

(jω₀I − A)x₁ = 0   (13.30)
(jω₀I + A*)x₂ = −Px₁   (13.31)
S*x₁ + B*x₂ = 0.   (13.32)

We now consider the two cases under the different assumptions:

(a) If assumption (A1) is true, then x₁ = 0 from (13.30), and this, in turn, implies x₂ = 0 from (13.31), which is a contradiction.

(b) If assumption (A2) is true, then, since (13.30) implies x₁*(jω₀I + A*) = 0, from (13.31) we have x₁*Px₁ = 0. This gives Px₁ = 0, since P is sign definite (P ≥ 0 or P ≤ 0). Together with (13.30), this implies that jω₀ is an unobservable mode of (P, A) if x₁ ≠ 0. On the other hand, if x₁ = 0, then (13.31) and (13.32) imply that (A, B) has an uncontrollable mode at jω₀; in either case a contradiction.

Similarly, a contradiction is derived if jω₀ is assumed to be an uncontrollable mode of (H, [−BR⁻¹; SR⁻¹]). □

Corollary 13.16 Suppose R > 0 and either one of the assumptions (A1) or (A2) defined in Lemma 13.15 is true. Then the following statements are equivalent:

(i) Φ(jω) > 0 for all 0 ≤ ω ≤ ∞.

(ii) The Hamiltonian matrix

H = [A − BR⁻¹S*, −BR⁻¹B*; −(P − SR⁻¹S*), −(A − BR⁻¹S*)*]

has no eigenvalues on the jω-axis.


Proof. (i) ⇒ (ii) follows easily from Lemma 13.15. To prove (ii) ⇒ (i), note that Φ(j∞) = R > 0 and det Φ(jω) ≠ 0 for all ω from Lemma 13.15. Then the continuity of Φ(jω) gives Φ(jω) > 0. □
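The corollary suggests a simple numerical test: build H, check its eigenvalues for imaginary-axis entries, and cross-check by sampling Φ(jω). A hedged sketch with illustrative data (the matrices and tolerances are assumptions, not from the text):

```python
# Hedged numerical sketch of Corollary 13.16 (R > 0, A stable so (A1) holds).
import numpy as np

A = np.array([[-1.0, 2.0], [0.0, -3.0]])
B = np.array([[1.0], [1.0]])
P = np.eye(2)
S = np.zeros((2, 1))
R = np.array([[2.0]])

Ri = np.linalg.inv(R)
A0 = A - B @ Ri @ S.T
H = np.block([[A0, -B @ Ri @ B.T],
              [-(P - S @ Ri @ S.T), -A0.T]])
no_jw_eigs = not np.any(np.abs(np.linalg.eigvals(H).real) < 1e-9)

def phi(w):  # Phi(j w) from (13.29), symmetrized to absorb rounding
    G = np.linalg.solve(1j * w * np.eye(2) - A, B)
    M = R + S.T @ G + G.conj().T @ S + G.conj().T @ P @ G
    return (M + M.conj().T) / 2

pos = all(np.linalg.eigvalsh(phi(w)).min() > 0 for w in np.logspace(-2, 2, 50))
print(no_jw_eigs, pos)
```

Here P ≥ 0 and R > 0, so Φ(jω) ≥ R > 0 for all ω, and both checks agree.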

Lemma 13.17 Suppose A is stable and P ≤ 0. Then

(i) the matrix

Φ₀(s) = [B*(−sI − A*)⁻¹, I] [P, 0; 0, I] [(sI − A)⁻¹B; I]

satisfies

Φ₀(jω) > 0 for all 0 ≤ ω ≤ ∞

if and only if there exists a unique real X = X* ≤ 0 such that

A*X + XA − XBB*X + P = 0

and σ(A − BB*X) ⊂ ℂ₋.

(ii) Φ₀(jω) ≥ 0 for all 0 ≤ ω ≤ ∞ if and only if there exists a unique real X = X* ≤ 0 such that

A*X + XA − XBB*X + P = 0   (13.33)

and σ(A − BB*X) ⊂ ℂ̄₋.

Proof. (i): (⇒) From Corollary 13.16, Φ₀(jω) > 0 implies that the Hamiltonian matrix

[A, −BB*; −P, −A*]

has no eigenvalues on the imaginary axis. This in turn implies from Theorem 13.6 that (13.33) has a stabilizing solution. Furthermore, since

[A, −BB*; −P, −A*] = [−I, 0; 0, I] [A, BB*; P, −A*] [−I, 0; 0, I],

we have

Ric[A, −BB*; −P, −A*] = −Ric[A, BB*; P, −A*] ≤ 0.

(⇐) The sufficiency proof is omitted here; it is a special case of (b) ⇒ (a) of Theorem 13.19 below.

(ii): (⇒) Let P = −C*C and G(s) := C(sI − A)⁻¹B. Then Φ₀(s) = I − G~(s)G(s). Let 0 < γ < 1 and define

Φ_γ(s) := I − γ²G~(s)G(s).


Then Φ_γ(jω) > 0 for all ω. Thus from part (i), there is an X_γ = X_γ* ≤ 0 such that A − BB*X_γ is stable and

A*X_γ + X_γA − X_γBB*X_γ − γ²C*C = 0.

It is easy to see from Theorem 13.14 that X_γ is monotone decreasing with γ, i.e., X_γ₁ ≤ X_γ₂ if γ₁ ≥ γ₂. To show that lim_{γ→1} X_γ exists, we need to show that X_γ is bounded below for all 0 < γ < 1.

In the following, it will be assumed that (A, B) is controllable. The controllability assumption will be removed later.

Let Y be the destabilizing solution to the following Riccati equation:

A*Y + YA − YBB*Y = 0

with σ(A − BB*Y) ⊂ ℂ₊ (note that the existence of such a solution is guaranteed by the controllability assumption). Then it is easy to verify that

(A − BB*Y)*(X_γ − Y) + (X_γ − Y)(A − BB*Y) − (X_γ − Y)BB*(X_γ − Y) − γ²C*C = 0.

This implies that

X_γ − Y = ∫₀^∞ e^{−(A−BB*Y)*t} [(X_γ − Y)BB*(X_γ − Y) + γ²C*C] e^{−(A−BB*Y)t} dt ≥ 0.

Thus X_γ is bounded below with Y as the lower bound, and lim_{γ→1} X_γ exists. Let X := lim_{γ→1} X_γ; then by a continuity argument, X satisfies the Riccati equation

A*X + XA − XBB*X + P = 0

and σ(A − BB*X) ⊂ ℂ̄₋.

Now suppose (A, B) is not controllable, and assume without loss of generality that

A = [A₁₁, A₁₂; 0, A₂₂],  B = [B₁; 0],  C = [C₁, C₂]

so that (A₁₁, B₁) is controllable and A₁₁ and A₂₂ are stable. Then the Riccati equation for

X_γ = [X₁₁^γ, X₁₂^γ; (X₁₂^γ)*, X₂₂^γ]

can be written as three equations:

A₁₁*X₁₁^γ + X₁₁^γA₁₁ − X₁₁^γB₁B₁*X₁₁^γ − γ²C₁*C₁ = 0

(A₁₁ − B₁B₁*X₁₁^γ)*X₁₂^γ + X₁₂^γA₂₂ + X₁₁^γA₁₂ − γ²C₁*C₂ = 0

A₂₂*X₂₂^γ + X₂₂^γA₂₂ + (X₁₂^γ)*A₁₂ + A₁₂*X₁₂^γ − (X₁₂^γ)*B₁B₁*X₁₂^γ − γ²C₂*C₂ = 0


and

A − BB*X_γ = [A₁₁ − B₁B₁*X₁₁^γ, A₁₂ − B₁B₁*X₁₂^γ; 0, A₂₂]

is stable.

Let Y₁₁ be the anti-stabilizing solution to

A₁₁*Y₁₁ + Y₁₁A₁₁ − Y₁₁B₁B₁*Y₁₁ = 0

with σ(A₁₁ − B₁B₁*Y₁₁) ⊂ ℂ₊. Then it is clear that X₁₁^γ − Y₁₁ ≥ 0. Moreover, X₁₁ = lim_{γ→1} X₁₁^γ exists and satisfies the following Riccati equation:

A₁₁*X₁₁ + X₁₁A₁₁ − X₁₁B₁B₁*X₁₁ − C₁*C₁ = 0

with σ(A₁₁ − B₁B₁*X₁₁) ⊂ ℂ̄₋. Consequently, the following Sylvester equation

(A₁₁ − B₁B₁*X₁₁)*X₁₂ + X₁₂A₂₂ + X₁₁A₁₂ − C₁*C₂ = 0

has a unique solution X₁₂, since λᵢ(A₁₁ − B₁B₁*X₁₁) + λⱼ(A₂₂) ≠ 0 for all i, j. Furthermore, the following Lyapunov equation has a unique solution X₂₂:

A₂₂*X₂₂ + X₂₂A₂₂ + X₁₂*A₁₂ + A₁₂*X₁₂ − X₁₂*B₁B₁*X₁₂ − C₂*C₂ = 0.

We have proven that there exists a unique X such that

A*X + XA − XBB*X − C*C = 0

and σ(A − BB*X) ⊂ ℂ̄₋.

(⇐) Same as in part (i). □

Lemma 13.18 Let R > 0,

Φ(s) = [B*(−sI − A*)⁻¹, I] [P, S; S*, R] [(sI − A)⁻¹B; I]

and

Φ̄(s) = [B̄*(−sI − Ā*)⁻¹, I] [P̄, 0; 0, I] [(sI − Ā)⁻¹B̄; I]

where

Ā := A − BR⁻¹S*,  B̄ := BR^{−1/2},  P̄ := P − SR⁻¹S*.

Then Φ(jω) ≥ 0 iff Φ̄(jω) ≥ 0.


Proof. Note that

[P, S; S*, R] = [I, SR^{−1/2}; 0, R^{1/2}] [P̄, 0; 0, I] [I, 0; R^{−1/2}S*, R^{1/2}].

Hence the function Φ(s) can be written as

Φ(s) = [B*(−sI − A*)⁻¹, φ~(s)] [P̄, 0; 0, I] [(sI − A)⁻¹B; φ(s)]

with φ(s) = R^{1/2} + R^{−1/2}S*(sI − A)⁻¹B. It is easy to verify that

Φ̄(s) = [φ⁻¹(s)]~ Φ(s) φ⁻¹(s).

Hence the conclusion follows. □

Now we are ready to state and prove one of the main results of this section. The following theorem and corollary characterize the relations among spectral factorizations, Riccati equations, and decompositions of Hamiltonians.

Theorem 13.19 Let A, B, P, S, R be matrices of compatible dimensions such that P = P* and R = R* > 0, with (A, B) stabilizable. Suppose either one of the following assumptions is satisfied:

(A1) A has no eigenvalues on the jω-axis;

(A2) P is sign definite, i.e., P ≥ 0 or P ≤ 0, and (P, A) has no unobservable modes on the jω-axis.

Then

(I) The following statements are equivalent:

(a) The parahermitian rational matrix

Φ(s) = [B*(−sI − A*)⁻¹, I] [P, S; S*, R] [(sI − A)⁻¹B; I]

satisfies

Φ(jω) > 0 for all 0 ≤ ω ≤ ∞.

(b) There exists a unique real X = X* such that

(A − BR⁻¹S*)*X + X(A − BR⁻¹S*) − XBR⁻¹B*X + P − SR⁻¹S* = 0

and A − BR⁻¹S* − BR⁻¹B*X is stable.


(c) The Hamiltonian matrix

H = [A − BR⁻¹S*, −BR⁻¹B*; −(P − SR⁻¹S*), −(A − BR⁻¹S*)*]

has no jω-axis eigenvalues.

(II) The following statements are also equivalent:

(d) Φ(jω) ≥ 0 for all 0 ≤ ω ≤ ∞.

(e) There exists a unique real X = X* such that

(A − BR⁻¹S*)*X + X(A − BR⁻¹S*) − XBR⁻¹B*X + P − SR⁻¹S* = 0

and σ(A − BR⁻¹S* − BR⁻¹B*X) ⊂ ℂ̄₋.

The following corollary is usually referred to as the spectral factorization theory.

Corollary 13.20 If any one of the conditions (a)–(e) in Theorem 13.19 is satisfied, then there exists an M ∈ Rp such that

Φ = M~RM.

A particular realization of one such M is

M = [A, B; −F, I]

where F = −R⁻¹(S* + B*X). Furthermore, if X is the stabilizing solution, then M⁻¹ ∈ RH∞.

Remark 13.6 If the stabilizability of (A, B) is changed into the stabilizability of (−A, B), then the theorem still holds, except that the solutions X in (b) and (e) are changed into the destabilizing solution (σ(A − BR⁻¹S* − BR⁻¹B*X) ⊂ ℂ₊) and the weakly destabilizing solution (σ(A − BR⁻¹S* − BR⁻¹B*X) ⊂ ℂ̄₊), respectively. ◊

Proof. (a) ⇒ (c) follows from Corollary 13.16.

(c) ⇒ (b) follows from Theorem 13.6 and Theorem 13.5.

(b) ⇒ (a): Suppose there exists an X = X* such that A − BR⁻¹S* − BR⁻¹B*X = A − BR⁻¹(S* + B*X) is stable. Let F = −R⁻¹(S* + B*X) and

M = [A, B; −F, I].


It is easily verified, by use of the Riccati equation for X and routine algebra, that Φ = M~RM. Since

M⁻¹ = [A + BF, B; F, I],

we have M⁻¹ ∈ RH∞. Thus M(s) has no zeros on the imaginary axis and Φ(jω) > 0.

(e) ⇒ (d) follows the same procedure as the proof of (b) ⇒ (a).

(d) ⇒ (e): Assume S = 0 and R = I; otherwise use Lemma 13.18 first to get a new function with these properties. Let P = C₁*C₁ − C₂*C₂ with C₁ and C₂ square and nonsingular. Note that this decomposition always exists, since P = αI − (αI − P), with α > 0 sufficiently large, defines one such possibility. Let X₁ be the positive definite solution to

A*X₁ + X₁A − X₁BB*X₁ + C₁*C₁ = 0.

By Theorem 13.7, X₁ indeed exists and is stabilizing, i.e., A₁ := A − BB*X₁ is stable. Let Δ = X − X₁. Then the equation in Δ becomes

A₁*Δ + ΔA₁ − ΔBB*Δ − C₂*C₂ = 0.

To show that this equation has a solution, recall Lemma 13.17 and note that A₁ is stable; it is then sufficient to show that

I − B*(−jωI − A₁*)⁻¹C₂*C₂(jωI − A₁)⁻¹B

is positive semi-definite.

Notice first that

C₂(sI − A + BB*X₁)⁻¹B = C₂(sI − A)⁻¹B [I + B*X₁(sI − A)⁻¹B]⁻¹.

From the definition of X₁ and Corollary 13.20, we also have

I + B*(−sI − A*)⁻¹C₁*C₁(sI − A)⁻¹B = [I + B*X₁(sI − A)⁻¹B]~ [I + B*X₁(sI − A)⁻¹B].

Now, by assumption,

Φ(jω) = I + B*(−jωI − A*)⁻¹C₁*C₁(jωI − A)⁻¹B − B*(−jωI − A*)⁻¹C₂*C₂(jωI − A)⁻¹B ≥ 0.

It follows that

I − {[I + B*(−jωI − A*)⁻¹X₁B]⁻¹ B*(−jωI − A*)⁻¹C₂*} {C₂(jωI − A)⁻¹B [I + B*X₁(jωI − A)⁻¹B]⁻¹} ≥ 0.


Consequently,

I − B*(−jωI − A₁*)⁻¹C₂*C₂(jωI − A₁)⁻¹B ≥ 0.

We may now apply Lemma 13.17 to the Δ equation. Consequently, there exists a unique solution Δ such that σ(A₁ − BB*Δ) = σ(A − BB*X) ⊂ ℂ̄₋. This shows the existence and uniqueness of X. □

We shall now illustrate the preceding results through a simple example. Note in particular that the function Φ can have poles on the imaginary axis.

Example 13.6 Let A = 0, B = 1, R = 1, S = 0, and P = 1. Then

Φ(s) = 1 − 1/s²

and Φ(jω) = 1 + 1/ω² > 0. It is easy to check that the Hamiltonian matrix

H = [0, −1; −1, 0]

does not have eigenvalues on the imaginary axis, X = 1 is the stabilizing solution to the corresponding Riccati equation, and the spectral factor is given by

M(s) = [0, 1; 1, 1] = (s + 1)/s.   ◊

Some frequently used special spectral factorizations are now considered.

Corollary 13.21 Assume that G(s) := [A, B; C, D] ∈ RL∞, with the realization stabilizable and detectable, and that γ > ‖G(s)‖∞. Then there exists a transfer matrix M ∈ RL∞ such that M~M = γ²I − G~G and M⁻¹ ∈ RH∞. A particular realization of M is

M(s) = [A, B; −R^{1/2}F, R^{1/2}]

where

R = γ²I − D*D
F = R⁻¹(B*X + D*C)
X = Ric[A + BR⁻¹D*C, BR⁻¹B*; −C*(I + DR⁻¹D*)C, −(A + BR⁻¹D*C)*]

and X ≥ 0 if A is stable.


Proof. This is a special case of Theorem 13.19. In fact, the corollary follows by letting P = −C*C, S = −C*D, and R = γ²I − D*D in Theorem 13.19 and by using the fact that

Ric[A + BR⁻¹D*C, BR⁻¹B*; −C*(I + DR⁻¹D*)C, −(A + BR⁻¹D*C)*]
  = −Ric[A + BR⁻¹D*C, −BR⁻¹B*; C*(I + DR⁻¹D*)C, −(A + BR⁻¹D*C)*]. □

The spectral factorization for the dual case is also often used; it follows by taking the transpose of the corresponding transfer matrices.

Corollary 13.22 Assume G(s) := [A, B; C, D] ∈ RL∞ and γ > ‖G(s)‖∞. Then there exists a transfer matrix M ∈ RL∞ such that MM~ = γ²I − GG~ and M⁻¹ ∈ RH∞. A particular realization of M is

M(s) = [A, −LR^{1/2}; C, R^{1/2}]

where

R = γ²I − DD*
L = (YC* + BD*)R⁻¹
Y = Ric[(A + BD*R⁻¹C)*, C*R⁻¹C; −B(I + D*R⁻¹D)B*, −(A + BD*R⁻¹C)]

and Y ≥ 0 if A is stable.

For convenience, we also include the following spectral factorization results, which are again special cases of Theorem 13.19.

Corollary 13.23 Let G(s) = [A, B; C, D] be a stabilizable and detectable realization.

(a) Suppose G~(jω)G(jω) > 0 for all ω, or [A − jωI, B; C, D] has full column rank for all ω. Let

X = Ric[A − BR⁻¹D*C, −BR⁻¹B*; −C*(I − DR⁻¹D*)C, −(A − BR⁻¹D*C)*]


with R := D*D > 0. Then we have the spectral factorization

W~W = G~G

where W⁻¹ ∈ RH∞ and

W = [A, B; R^{−1/2}(D*C + B*X), R^{1/2}].

(b) Suppose G(jω)G~(jω) > 0 for all ω, or [A − jωI, B; C, D] has full row rank for all ω. Let

Y = Ric[(A − BD*R̃⁻¹C)*, −C*R̃⁻¹C; −B(I − D*R̃⁻¹D)B*, −(A − BD*R̃⁻¹C)]

with R̃ := DD* > 0. Then we have the spectral factorization

W̃W̃~ = GG~

where W̃⁻¹ ∈ RH∞ and

W̃ = [A, (BD* + YC*)R̃^{−1/2}; C, R̃^{1/2}].

Theorem 13.19 also gives some additional characterizations of the H∞ norm of a transfer matrix.

Corollary 13.24 Let γ > 0, G(s) = [A, B; C, D] ∈ RH∞, and

H := [A + BR⁻¹D*C, BR⁻¹B*; −C*(I + DR⁻¹D*)C, −(A + BR⁻¹D*C)*]

where R = γ²I − D*D. Then the following conditions are equivalent:

(i) ‖G‖∞ < γ.

(ii) σ̄(D) < γ and H has no eigenvalues on the imaginary axis.

(iii) σ̄(D) < γ and H ∈ dom(Ric).

(iv) σ̄(D) < γ, H ∈ dom(Ric), and Ric(H) ≥ 0 (Ric(H) > 0 if (C, A) is observable).


Proof. This follows from the fact that ‖G‖∞ < γ is equivalent to the positive definiteness, for all ω, of

Φ(jω) := γ²I − Gᵀ(−jω)G(jω)
       = [B*(−jωI − A*)⁻¹, I] [−C*C, −C*D; −D*C, γ²I − D*D] [(jωI − A)⁻¹B; I] > 0,

and the fact that

[−I, 0; 0, I] H [−I, 0; 0, I] = [A + BR⁻¹D*C, −BR⁻¹B*; C*(I + DR⁻¹D*)C, −(A + BR⁻¹D*C)*]. □

The equivalence between (i) and (iv) in the above corollary is usually referred to as the Bounded Real Lemma.
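Condition (ii) gives the standard bisection test for the H∞ norm. The following is a hedged sketch (the system, tolerance, and bisection bounds are illustrative assumptions): a trial level γ is accepted when σ̄(D) < γ and the Hamiltonian H has no imaginary-axis eigenvalues.

```python
# Hedged sketch of Corollary 13.24(ii): bisection on gamma for the H-infinity norm.
import numpy as np

A = np.array([[-1.0, 0.0], [1.0, -2.0]])
B = np.array([[1.0], [0.0]])
C = np.array([[0.0, 1.0]])   # G(s) = 1/((s+1)(s+2)), so ||G||_inf = 1/2
D = np.array([[0.0]])

def norm_below(gamma, tol=1e-8):
    R = gamma**2 * np.eye(1) - D.T @ D
    Ri = np.linalg.inv(R)
    H = np.block([
        [A + B @ Ri @ D.T @ C,                  B @ Ri @ B.T],
        [-C.T @ (np.eye(1) + D @ Ri @ D.T) @ C, -(A + B @ Ri @ D.T @ C).T],
    ])
    sD = np.linalg.svd(D, compute_uv=False).max()
    return sD < gamma and not np.any(np.abs(np.linalg.eigvals(H).real) < tol)

lo, hi = 0.0, 10.0
for _ in range(60):
    mid = (lo + hi) / 2
    lo, hi = (lo, mid) if norm_below(mid) else (mid, hi)
print(round(hi, 4))
```

For the example system the bisection converges to ‖G‖∞ = 1/2.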

13.5 Positive Real Functions

A square (m × m) matrix function G(s) ∈ RH∞ is said to be positive real (PR) if G(jω) + G*(jω) ≥ 0 for all finite ω ∈ ℝ, and G(s) is said to be strictly positive real (SPR) if G(jω) + G*(jω) > 0 for all ω ∈ ℝ.

Theorem 13.25 Let [A, B; C, D] be a state-space realization of G(s) with A stable (not necessarily a minimal realization). Suppose there exist X ≥ 0, Q, and W such that

XA + A*X = −Q*Q   (13.34)
B*X + W*Q = C   (13.35)
D + D* = W*W.   (13.36)

Then G(s) is positive real and

G(s) + G~(s) = M~(s)M(s)

with M(s) = [A, B; Q, W]. Furthermore, if M(jω) has full column rank for all ω ∈ ℝ, then G(s) is strictly positive real.


Proof.

G(s) + G~(s) = [A, 0, B; 0, −A*, −C*; C, B*, D + D*]
             = [A, 0, B; 0, −A*, −(XB + Q*W); B*X + W*Q, B*, W*W].

Apply the similarity transformation [I, 0; X, I] to the last realization to get

G(s) + G~(s) = [A, 0, B; XA + A*X, −A*, −Q*W; W*Q, B*, W*W]
             = [A, 0, B; −Q*Q, −A*, −Q*W; W*Q, B*, W*W]
             = [−A*, −Q*; B*, W*] [A, B; Q, W]

(the cascade M~(s)M(s)). This implies that

G(jω) + G*(jω) = M*(jω)M(jω) ≥ 0,

i.e., G(s) is positive real. Finally, note that if M(jω) has full column rank for all ω ∈ ℝ, then M(jω)x ≠ 0 for all 0 ≠ x ∈ ℂᵐ and ω ∈ ℝ. Thus G(s) is strictly positive real. □

Theorem 13.26 Suppose (A, B, C, D) is a minimal realization of G(s) with A stable and G(s) positive real. Then there exist X ≥ 0, Q, and W such that

XA + A*X = −Q*Q
B*X + W*Q = C
D + D* = W*W

and

G(s) + G~(s) = M~(s)M(s)

with M(s) = [A, B; Q, W]. Furthermore, if G(s) is strictly positive real, then M(jω) given above has full column rank for all ω ∈ ℝ.

Proof. Since G(s) is assumed to be positive real, there exists a transfer matrix M(s) = [A₁, B₁; C₁, D₁] with A₁ stable such that

G(s) + G~(s) = M~(s)M(s)


where A and A₁ have the same dimensions. Now let X₁ ≥ 0 be the solution of the following Lyapunov equation:

X₁A₁ + A₁*X₁ = −C₁*C₁.

Then

M~(s)M(s) = [−A₁*, −C₁*; B₁*, D₁*] [A₁, B₁; C₁, D₁]
          = [A₁, 0, B₁; −C₁*C₁, −A₁*, −C₁*D₁; D₁*C₁, B₁*, D₁*D₁]
          = [A₁, 0, B₁; X₁A₁ + A₁*X₁, −A₁*, −C₁*D₁; D₁*C₁, B₁*, D₁*D₁]
          = [A₁, 0, B₁; 0, −A₁*, −(X₁B₁ + C₁*D₁); B₁*X₁ + D₁*C₁, B₁*, D₁*D₁]
          = D₁*D₁ + [A₁, B₁; B₁*X₁ + D₁*C₁, 0] + [−A₁*, −(X₁B₁ + C₁*D₁); B₁*, 0].

But the realization for G(s) + G~(s) is given by

G(s) + G~(s) = D + D* + [A, B; C, 0] + [−A*, −C*; B*, 0].

Since the realization for G(s) is minimal, there exists a nonsingular matrix T such that

A = TA₁T⁻¹,  B = TB₁,  C = (B₁*X₁ + D₁*C₁)T⁻¹,  D + D* = D₁*D₁.

Now the conclusion follows by defining

X = (T⁻¹)*X₁T⁻¹,  W = D₁,  Q = C₁T⁻¹. □

Corollary 13.27 Let [A, B; C, D] be a state-space realization of G(s) ∈ RH∞ with A stable and R := D + D* > 0. Then G(s) is strictly positive real if and only if there exists a stabilizing solution to the following Riccati equation:

X(A − BR⁻¹C) + (A − BR⁻¹C)*X + XBR⁻¹B*X + C*R⁻¹C = 0.

Moreover,

M(s) = [A, B; R^{−1/2}(C − B*X), R^{1/2}]

is minimum phase and

G(s) + G~(s) = M~(s)M(s).


Proof. This follows from Theorem 13.19 and the fact that

G(jω) + G*(jω) > 0

for all ω, including ∞. □
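As a numerical aside (a hedged sketch; the realization below is illustrative), strict positive realness can be tested with the Hamiltonian of Lemma 13.15 specialized to Φ = G + G~, i.e., P = 0 and S = C*, which is the Hamiltonian associated with the Riccati equation above:

```python
# Hedged sketch: SPR test for G = [A B; C D] with R = D + D* > 0.
import numpy as np

A = np.array([[-1.0, 0.0], [0.0, -2.0]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 1.0]])
D = np.array([[1.0]])   # G(s) = 1/(s+1) + 1/(s+2) + 1, Re G(jw) >= 1 > 0

R = D + D.T
Ri = np.linalg.inv(R)
H = np.block([[A - B @ Ri @ C,        -B @ Ri @ B.T],
              [C.T @ Ri @ C, -(A - B @ Ri @ C).T]])
spr_by_hamiltonian = not np.any(np.abs(np.linalg.eigvals(H).real) < 1e-9)

# cross-check by sampling the hermitian part of G(jw)
def herm_part_min_eig(w):
    G = C @ np.linalg.solve(1j * w * np.eye(2) - A, B) + D
    return np.linalg.eigvalsh(G + G.conj().T).min()

spr_by_sampling = all(herm_part_min_eig(w) > 0 for w in np.logspace(-2, 3, 60))
print(spr_by_hamiltonian, spr_by_sampling)
```

Both checks agree for this example, since Re G(jω) is bounded away from zero, including at ∞.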

The above corollary also leads to the following special spectral factorization.

Corollary 13.28 Let G(s) = [A, B; C, D] ∈ RH∞ with D full row rank and G(jω)G*(jω) > 0 for all ω. Let P be the controllability grammian of (A, B):

PA* + AP + BB* = 0.

Define

B_W = PC* + BD*.

Then there exists an X ≥ 0 such that

XA + A*X + (C − B_W*X)*(DD*)⁻¹(C − B_W*X) = 0.

Furthermore, there is an M(s) ∈ RH∞ such that M⁻¹(s) ∈ RH∞ and

G(s)G~(s) = M~(s)M(s)

where M(s) = [A, B_W; C_W, D_W], with a square matrix D_W such that

D_W*D_W = DD*

and

C_W = D_W(DD*)⁻¹(C − B_W*X).

Proof. This corollary follows from Corollary 13.27 and the facts that G(jω)G*(jω) > 0 and

G(s)G~(s) = [A, B_W; C, 0] + [−A*, −C*; B_W*, 0] + DD*. □

A dual spectral factorization can also be obtained easily.


13.6 Inner Functions

A transfer function N is called inner if N ∈ RH∞ and N~N = I, and co-inner if N ∈ RH∞ and NN~ = I. Note that N need not be square. Inner and co-inner are dual notions, i.e., N is inner iff Nᵀ is co-inner. A matrix function N ∈ RL∞ is called all-pass if N is square and N~N = I; clearly, a square inner function is all-pass. We will focus on characterizations of inner functions here; the properties of co-inner functions follow by duality.

Note that N inner implies that N has at least as many rows as columns. For N inner and any q ∈ ℂᵐ and v ∈ L₂, we have ‖N(jω)q‖ = ‖q‖ for all ω and ‖Nv‖₂ = ‖v‖₂, since N(jω)*N(jω) = I for all ω. Because of these norm-preserving properties, inner matrices will play an important role in the control synthesis theory in this book. In this section, we present a state-space characterization of inner transfer functions.

Lemma 13.29 Suppose N = [A, B; C, D] ∈ RH∞ and X = X* ≥ 0 satisfies

A*X + XA + C*C = 0.   (13.37)

Then

(a) D*C + B*X = 0 implies N~N = D*D.

(b) (A, B) controllable and N~N = D*D implies D*C + B*X = 0.

Proof. Conjugating the states of

N~N = [A, 0, B; −C*C, −A*, −C*D; D*C, B*, D*D]

by [I, 0; −X, I] on the left and [I, 0; −X, I]⁻¹ = [I, 0; X, I] on the right yields

N~N = [A, 0, B; −(A*X + XA + C*C), −A*, −(XB + C*D); B*X + D*C, B*, D*D]
    = [A, 0, B; 0, −A*, −(XB + C*D); B*X + D*C, B*, D*D].

Then (a) and (b) follow easily. □


This lemma immediately leads to one characterization of inner matrices in terms of their state-space representations: simply add the condition D*D = I to Lemma 13.29 to get N~N = I.

Corollary 13.30 Suppose N = [A, B; C, D] is stable and minimal, and X is the observability grammian. Then N is inner if and only if

(a) D*C + B*X = 0

(b) D*D = I.
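The two conditions of Corollary 13.30 are easy to check numerically. A minimal sketch (the scalar example N(s) = (s − 1)/(s + 1) and its particular realization are illustrative assumptions):

```python
# Hedged sketch of Corollary 13.30: verify N = [A B; C D] is inner.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# state-space data for N(s) = (s - 1)/(s + 1) = 1 - 2/(s + 1)
A = np.array([[-1.0]])
B = np.array([[np.sqrt(2.0)]])
C = np.array([[-np.sqrt(2.0)]])
D = np.array([[1.0]])

# observability gramian: A* X + X A + C* C = 0  (equation (13.37))
X = solve_continuous_lyapunov(A.conj().T, -C.conj().T @ C)

cond_a = np.allclose(D.conj().T @ C + B.conj().T @ X, 0)
cond_b = np.allclose(D.conj().T @ D, np.eye(1))

# sanity check on the frequency axis: |N(jw)| = 1
w = 3.7
N = C @ np.linalg.solve(1j * w * np.eye(1) - A, B) + D
print(cond_a and cond_b, np.isclose(abs(N[0, 0]), 1.0))
```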

A transfer matrix N⊥ is called a complementary inner factor (CIF) of N if [N, N⊥] is square and inner. The dual notion of the complementary co-inner factor is defined in the obvious way. Given an inner N, the following lemma gives a construction of its CIF. The proof of this lemma follows from a straightforward calculation and from the fact that CX†X = C, since Im(I − X†X) ⊂ Ker(X) ⊂ Ker(C).

Lemma 13.31 Let N = [A, B; C, D] be inner and X be the observability grammian. Then a CIF N⊥ is given by

N⊥ = [A, −X†C*D⊥; C, D⊥]

where D⊥ is an orthogonal complement of D such that [D, D⊥] is square and orthogonal.

13.7 Inner-Outer Factorizations

In this section, some special forms of coprime factorization will be developed. In particular, explicit realizations are given for coprime factorizations G = NM⁻¹ with inner numerator N and with inner denominator M, respectively. The former factorization, in the case G ∈ RH∞, will give an inner-outer factorization¹. The results will be proven for right coprime factorizations; the results for left coprime factorizations follow by duality.

Let G ∈ Rp be a p × m transfer matrix and denote R^{1/2}R^{1/2} = R. For a given full column rank matrix D, let D⊥ denote any orthogonal complement of D, so that [DR^{−1/2}, D⊥] (with R = D*D > 0) is square and orthogonal. To obtain an rcf of G with N inner, we note that if NM⁻¹ is an rcf, then (NZr)(MZr)⁻¹ is also an rcf for any nonsingular real matrix Zr. We simply need to use the formulas in Theorem 5.9 to solve for F and Zr.

¹A p × m (p ≤ m) transfer matrix Go ∈ RH∞ is called outer if Go(s) has full row rank in the open right half plane, i.e., Go(s) has full row normal rank and has no open right half plane zeros.


Theorem 13.32 Assume p ≥ m. Then there exists an rcf G = NM⁻¹ such that N is inner if and only if G~G > 0 on the jω-axis, including at ∞. This factorization is unique up to a constant unitary multiple. Furthermore, assume that the realization of G = [A, B; C, D] is stabilizable and that [A − jωI, B; C, D] has full column rank for all ω ∈ ℝ. Then a particular realization of the desired coprime factorization is

[M; N] := [A + BF, BR^{−1/2}; F, R^{−1/2}; C + DF, DR^{−1/2}] ∈ RH∞

where

R = D*D > 0
F = −R⁻¹(B*X + D*C)

and

X = Ric[A − BR⁻¹D*C, −BR⁻¹B*; −C*(I − DR⁻¹D*)C, −(A − BR⁻¹D*C)*] ≥ 0.

Moreover, a complementary inner factor can be obtained as

N⊥ = [A + BF, −X†C*D⊥; C + DF, D⊥]

if p > m.

Proof. (⇒) Suppose G = NM⁻¹ is an rcf and N~N = I. Then G~G = (M⁻¹)~M⁻¹ > 0 on the jω-axis, since M ∈ RH∞.

(⇐) This can be shown by using Corollary 13.20 first to get a factorization G~G = (M⁻¹)~(M⁻¹) and then computing N = GM. The following proof is more direct: it will be shown that the definition of an inner function and the coprime factorization formulas given in Theorem 5.9 lead directly to the above realization of the rcf of G with an inner numerator. That G = NM⁻¹ is an rcf follows immediately once it is established that F is a stabilizing state feedback. Now suppose

N = [A + BF, BZr; C + DF, DZr]

and let F and Zr be such that

(DZr)*(DZr) = I   (13.38)
(BZr)*X + (DZr)*(C + DF) = 0   (13.39)


(A + BF)*X + X(A + BF) + (C + DF)*(C + DF) = 0.   (13.40)

Clearly, we have Zr = R^{−1/2}U, where R = D*D > 0 and U is any orthogonal matrix. Take U = I and solve (13.39) for F to get

F = −R⁻¹(B*X + D*C).

Then substitute F into (13.40) to get

0 = (A + BF)*X + X(A + BF) + (C + DF)*(C + DF)
  = (A − BR⁻¹D*C)*X + X(A − BR⁻¹D*C) − XBR⁻¹B*X + C*D⊥D⊥*C

where D⊥D⊥* = I − DR⁻¹D*. To show that these choices indeed make sense, we need to show that H ∈ dom(Ric), where

H = [A − BR⁻¹D*C, −BR⁻¹B*; −C*D⊥D⊥*C, −(A − BR⁻¹D*C)*],

so that X = Ric(H). However, by Theorem 13.19, H ∈ dom(Ric) is guaranteed by the fact that [A − jωI, B; C, D] has full column rank for all ω (or, equivalently, G~(jω)G(jω) > 0).

The uniqueness of the factorization follows from coprimeness and N inner. Suppose that G = N₁M₁⁻¹ = N₂M₂⁻¹ are two right coprime factorizations and that both numerators are inner. By coprimeness, these two factorizations are unique up to a right multiple which is a unit² in RH∞. That is, there exists a unit Θ ∈ RH∞ such that

[M₁; N₁]Θ = [M₂; N₂].

Clearly, Θ is inner, since Θ~Θ = Θ~N₁~N₁Θ = N₂~N₂ = I. The only inner units in RH∞ are constant matrices, and thus the desired uniqueness property is established. Note that the non-uniqueness is contained entirely in the choice of a particular square root of R.

Finally, the formula for N⊥ follows from Lemma 13.31. □

Note that the important inner-outer factorization formula can be obtained from this inner-numerator coprime factorization if G ∈ RH∞.

Corollary 13.33 Suppose G ∈ RH∞; then the denominator matrix M in Theorem 13.32 is outer. Hence, the factorization G = NM⁻¹ given in Theorem 13.32 is an inner-outer factorization.

Remark 13.7 It is noted that the above inner-outer factorization procedure does not apply to strictly proper transfer matrices even if the factorization exists. For example,

G(s) = (s − 1)/((s + 1)(s + 2))

has an inner-outer factorization, but the above procedure cannot be used. The inner-outer factorization for general transfer matrices can be done using the method adopted in Section 6.1 of Chapter 6. ◊

²A function Θ is called a unit in RH∞ if Θ, Θ⁻¹ ∈ RH∞.


Suppose that the system G is not stable; then a coprime factorization with an inner denominator can also be obtained by solving a special Riccati equation. The proof of this result is similar to the inner numerator case and is omitted.

Theorem 13.34 Assume that G = [A, B; C, D] ∈ Rp and (A, B) is stabilizable. Then there exists a right coprime factorization G = NM⁻¹ such that M ∈ RH∞ is inner if and only if G has no poles on the jω-axis. A particular realization is

[M; N] := [A + BF, B; F, I; C + DF, D] ∈ RH∞

where

F = −B*X
X = Ric[A, −BB*; 0, −A*] ≥ 0.

Dual results can be obtained when p ≤ m by taking the transpose of the transfer function matrix. In these factorizations, output injection using the dual Riccati solution replaces state feedback to obtain the corresponding left factorizations.

Theorem 13.35 Assume p ≤ m. Then there exists an lcf G = M̃⁻¹Ñ such that Ñ is co-inner if and only if GG~ > 0 on the jω-axis, including at ∞. This factorization is unique up to a constant unitary multiple. Furthermore, assume that the realization of G = [A, B; C, D] is detectable and that [A − jωI, B; C, D] has full row rank for all ω ∈ ℝ. Then a particular realization of the desired coprime factorization is

[M̃, Ñ] := [A + LC, L, B + LD; R̃^{−1/2}C, R̃^{−1/2}, R̃^{−1/2}D] ∈ RH∞

where

R̃ = DD* > 0
L = −(BD* + YC*)R̃⁻¹

and

Y = Ric[(A − BD*R̃⁻¹C)*, −C*R̃⁻¹C; −B(I − D*R̃⁻¹D)B*, −(A − BD*R̃⁻¹C)] ≥ 0.

Moreover, a complementary co-inner factor can be obtained as

Ñ⊥ = [A + LC, B + LD; −D̃⊥B*Y†, D̃⊥]

if p < m, where D̃⊥ is a full row rank matrix such that D̃⊥*D̃⊥ = I − D*R̃⁻¹D.


Theorem 13.36 Assume that G = [A, B; C, D] ∈ Rp and (C, A) is detectable. Then there exists a left coprime factorization G = M̃⁻¹Ñ such that M̃ ∈ RH∞ is inner if and only if G has no poles on the jω-axis. A particular realization is

[M̃, Ñ] := [A + LC, L, B + LD; C, I, D] ∈ RH∞

where

L = −YC*
Y = Ric[A*, −C*C; 0, −A] ≥ 0.

13.8 Normalized Coprime Factorizations

A right coprime factorization $G = NM^{-1}$ with $N, M \in RH_\infty$ is called a normalized right coprime factorization if
\[
M^*M + N^*N = I,
\]
i.e., if $\begin{bmatrix} M \\ N \end{bmatrix}$ is inner. Similarly, an lcf $G = \tilde M^{-1}\tilde N$ is called a normalized left coprime factorization if $\begin{bmatrix}\tilde M & \tilde N\end{bmatrix}$ is co-inner.

The normalized coprime factorization is easy to obtain from the definition. The following theorem can be proven using the same procedure as in the proof for the coprime factorization with inner numerator. In this case, the proof involves choosing $F$ and $Z_r$ such that $\begin{bmatrix} M \\ N \end{bmatrix}$ is inner.

Theorem 13.37 Let a realization of $G$ be given by
\[
G = \left[\begin{array}{c|c} A & B \\ \hline C & D \end{array}\right]
\]
and define
\[
R = I + D^*D > 0, \qquad \tilde R = I + DD^* > 0.
\]

(a) Suppose $(A,B)$ is stabilizable and $(C,A)$ has no unobservable modes on the imaginary axis. Then there is a normalized right coprime factorization $G = NM^{-1}$,
\[
\begin{bmatrix} M \\ N \end{bmatrix} := \left[\begin{array}{c|c} A + BF & BR^{-1/2} \\ \hline F & R^{-1/2} \\ C + DF & DR^{-1/2} \end{array}\right] \in RH_\infty
\]


where
\[
F = -R^{-1}(B^*X + D^*C)
\]
and
\[
X = \mathrm{Ric}\begin{bmatrix} A - BR^{-1}D^*C & -BR^{-1}B^* \\ -C^*\tilde R^{-1}C & -(A - BR^{-1}D^*C)^* \end{bmatrix} \geq 0.
\]

(b) Suppose $(C,A)$ is detectable and $(A,B)$ has no uncontrollable modes on the imaginary axis. Then there is a normalized left coprime factorization $G = \tilde M^{-1}\tilde N$,
\[
\begin{bmatrix}\tilde M & \tilde N\end{bmatrix} := \left[\begin{array}{c|cc} A + LC & L & B + LD \\ \hline \tilde R^{-1/2}C & \tilde R^{-1/2} & \tilde R^{-1/2}D \end{array}\right]
\]
where
\[
L = -(BD^* + YC^*)\tilde R^{-1}
\]
and
\[
Y = \mathrm{Ric}\begin{bmatrix} (A - BD^*\tilde R^{-1}C)^* & -C^*\tilde R^{-1}C \\ -BR^{-1}B^* & -(A - BD^*\tilde R^{-1}C) \end{bmatrix} \geq 0.
\]

(c) The controllability grammian $P$ and observability grammian $Q$ of $\begin{bmatrix} M \\ N \end{bmatrix}$ are given by
\[
P = (I + YX)^{-1}Y, \qquad Q = X
\]
while the controllability grammian $\tilde P$ and observability grammian $\tilde Q$ of $\begin{bmatrix}\tilde M & \tilde N\end{bmatrix}$ are given by
\[
\tilde P = Y, \qquad \tilde Q = (I + XY)^{-1}X.
\]

Proof. We shall only prove the first part of (c). It is obvious that $Q = X$ since the Riccati equation for $X$ can be written as
\[
X(A + BF) + (A + BF)^*X + \begin{bmatrix} F \\ C + DF \end{bmatrix}^* \begin{bmatrix} F \\ C + DF \end{bmatrix} = 0
\]
while the controllability grammian solves the Lyapunov equation
\[
(A + BF)P + P(A + BF)^* + BR^{-1}B^* = 0
\]
or, equivalently,
\[
P = \mathrm{Ric}\begin{bmatrix} (A + BF)^* & 0 \\ -BR^{-1}B^* & -(A + BF) \end{bmatrix}.
\]


Now let $T = \begin{bmatrix} I & X \\ 0 & I \end{bmatrix}$; then
\[
\begin{bmatrix} (A + BF)^* & 0 \\ -BR^{-1}B^* & -(A + BF) \end{bmatrix} = T \begin{bmatrix} (A - BD^*\tilde R^{-1}C)^* & -C^*\tilde R^{-1}C \\ -BR^{-1}B^* & -(A - BD^*\tilde R^{-1}C) \end{bmatrix} T^{-1}.
\]
This shows that the stable invariant subspaces for these two Hamiltonian matrices are related by
\[
\mathcal{X}_-\!\begin{bmatrix} (A + BF)^* & 0 \\ -BR^{-1}B^* & -(A + BF) \end{bmatrix} = T\, \mathcal{X}_-\!\begin{bmatrix} (A - BD^*\tilde R^{-1}C)^* & -C^*\tilde R^{-1}C \\ -BR^{-1}B^* & -(A - BD^*\tilde R^{-1}C) \end{bmatrix}
\]
or
\[
\mathrm{Im}\begin{bmatrix} I \\ P \end{bmatrix} = T\,\mathrm{Im}\begin{bmatrix} I \\ Y \end{bmatrix} = \mathrm{Im}\begin{bmatrix} I + XY \\ Y \end{bmatrix} = \mathrm{Im}\begin{bmatrix} I \\ Y(I + XY)^{-1} \end{bmatrix}.
\]
Hence we have $P = Y(I + XY)^{-1}$. $\Box$
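Both Riccati solutions of Theorem 13.37 are in the standard cross-term form accepted by `scipy.linalg.solve_continuous_are` (with $S = C^*D$ for $X$ and $S = BD^*$ for $Y$). The sketch below, using a hypothetical scalar plant with $D = 0$ for brevity, checks numerically that $[M;\,N]$ is inner on the $j\omega$-axis and that the grammian identities of part (c) hold:

```python
import numpy as np
from scipy.linalg import solve_continuous_are, solve_lyapunov

# Hypothetical plant (illustrative values), D = 0 for brevity.
A = np.array([[1.0]])
B = np.array([[1.0]])
C = np.array([[1.0]])
D = np.zeros((1, 1))
n = 1
R  = np.eye(1) + D.T @ D          # R  = I + D*D
Rt = np.eye(1) + D @ D.T          # R~ = I + DD*

# Part (a): X from the control Riccati equation (cross term S = C*D).
X = solve_continuous_are(A, B, C.T @ C, R, s=C.T @ D)
F = -np.linalg.solve(R, B.T @ X + D.T @ C)

# Part (b): Y from the dual (filtering) Riccati equation (cross term S = BD*).
Y = solve_continuous_are(A.T, C.T, B @ B.T, Rt, s=B @ D.T)

# [M; N] is inner: M(jw)*M(jw) + N(jw)*N(jw) = I on the jw-axis.
AF = A + B @ F
Rm12 = np.linalg.inv(np.sqrt(R))  # R^{-1/2}; entrywise sqrt is fine for this scalar case
for w in [0.0, 1.0, 10.0]:
    Phi = np.linalg.solve(1j * w * np.eye(n) - AF, B @ Rm12)
    Mjw = Rm12 + F @ Phi
    Njw = D @ Rm12 + (C + D @ F) @ Phi
    assert np.allclose(Mjw.conj().T @ Mjw + Njw.conj().T @ Njw,
                       np.eye(1), atol=1e-8)

# Part (c): grammians of [M; N] satisfy P = (I + YX)^{-1} Y and Q = X.
P = solve_lyapunov(AF, -(B @ np.linalg.inv(R) @ B.T))   # AF P + P AF* + BR^{-1}B* = 0
Q = solve_lyapunov(AF.T, -(F.T @ F + (C + D @ F).T @ (C + D @ F)))
assert np.allclose(P, np.linalg.solve(np.eye(n) + Y @ X, Y))
assert np.allclose(Q, X)
print("normalized rcf verified: X =", float(X), " Y =", float(Y))
```

For this scalar plant $X = Y = 1 + \sqrt{2}$, and the check confirms $P = Y(I + XY)^{-1}$ without any symbolic work.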

13.9 Notes and References

The general solutions of a Riccati equation are given by Martensson [1971]. The iterative procedure for solving the ARE was first introduced by Kleinman [1968] for a special case and was further developed by Wonham [1968]. It was used by Coppel [1974], Ran and Vreugdenhil [1988], and many others for the proof of the existence of maximal and minimal solutions. The comparative results were obtained in Ran and Vreugdenhil [1988]. The paper by Wimmer [1985] also contains comparative results for some special cases. The paper by Willems [1971] contains a comprehensive treatment of the ARE and the related optimization problems. Some matrix factorization results are given in Doyle [1984]. Numerical methods for solving the ARE can be found in Arnold and Laub [1984], Dooren [1981], and the references therein. The state space spectral factorization for functions singular at $\infty$ or on the imaginary axis is considered in Clements and Glover [1989] and Clements [1993].


14 $H_2$ Optimal Control

In this chapter we treat the optimal control of linear time-invariant systems with a quadratic performance criterion. The material in this chapter is standard, but the treatment is somewhat novel and lays the foundation for the subsequent chapters on $H_\infty$-optimal control.

14.1 Introduction to the Regulator Problem

Consider the following dynamical system:
\[
\dot x = Ax + B_2 u, \qquad x(t_0) = x_0 \qquad (14.1)
\]
where $x_0$ is given but arbitrary. Our objective is to find a control function $u(t)$ defined on $[t_0, T]$, which can be a function of the state $x(t)$, such that the state $x(t)$ is driven to a (small) neighborhood of the origin at time $T$. This is the so-called regulator problem. One might suggest that this regulator problem can be trivially solved for any $T > t_0$ if the system is controllable. This is indeed the case if the controller can provide arbitrarily large amounts of energy since, by the definition of controllability, one can immediately construct a control function that will drive the state to zero in an arbitrarily short time. However, this is not practical since any physical system has energy limitations, i.e., the actuator will eventually saturate. Furthermore, large control action can easily drive the system out of the region where the given linear model is valid. Hence certain limitations have to be imposed on the control in practical engineering implementations.


The constraints on the control $u$ may be measured in many different ways; for example,
\[
\int_{t_0}^{T} \|u\|\, dt, \qquad \int_{t_0}^{T} \|u\|^2\, dt, \qquad \sup_{t\in[t_0,T]} \|u\|,
\]
i.e., in terms of the $L_1$-norm, $L_2$-norm, and $L_\infty$-norm or, more generally, the weighted $L_1$-norm, $L_2$-norm, and $L_\infty$-norm
\[
\int_{t_0}^{T} \|W_u u\|\, dt, \qquad \int_{t_0}^{T} \|W_u u\|^2\, dt, \qquad \sup_{t\in[t_0,T]} \|W_u u\|
\]
for some constant weighting matrix $W_u$.

Similarly, one might also want to impose some constraints on the transient response $x(t)$ in a similar fashion:
\[
\int_{t_0}^{T} \|W_x x\|\, dt, \qquad \int_{t_0}^{T} \|W_x x\|^2\, dt, \qquad \sup_{t\in[t_0,T]} \|W_x x\|
\]

for some weighting matrix $W_x$. Hence the regulator problem can be posed as an optimal control problem with a combined performance index on $u$ and $x$, as given above. In this chapter, we shall be concerned exclusively with the $L_2$ performance problem, or quadratic performance problem. Moreover, we will focus on the infinite time regulator problem, i.e., $T \to \infty$, and, without loss of generality, we shall assume $t_0 = 0$. In this case, our problem is as follows: find a control $u(t)$ defined on $[0,\infty)$ such that the state $x(t)$ is driven to the origin as $t \to \infty$ and the following performance index is minimized:
\[
\min_u \int_0^\infty \begin{bmatrix} x(t) \\ u(t) \end{bmatrix}^* \begin{bmatrix} Q & S \\ S^* & R \end{bmatrix} \begin{bmatrix} x(t) \\ u(t) \end{bmatrix} dt \qquad (14.2)
\]
for some $Q = Q^*$, $S$, and $R = R^* > 0$. This problem is traditionally called a Linear Quadratic Regulator problem, or simply an LQR problem. Here we have assumed $R > 0$ to emphasize that the control energy has to be finite, i.e., $u(t) \in L_2[0,\infty)$; this is the space over which the integral is minimized. Moreover, it is also generally assumed that
\[
\begin{bmatrix} Q & S \\ S^* & R \end{bmatrix} \geq 0. \qquad (14.3)
\]

Since $R$ is positive definite, it has a square root, $R^{1/2}$, which is also positive definite. By the substitution
\[
u \leftarrow R^{1/2}u,
\]
we may as well assume at the start that $R = I$. In fact, we can even assume $S = 0$ by using a pre-state feedback $u = -S^*x + v$, provided some care is exercised; however, this


will not be assumed in the sequel. Since the matrix in (14.3) is positive semi-definite with $R = I$, it can be factored as
\[
\begin{bmatrix} Q & S \\ S^* & I \end{bmatrix} = \begin{bmatrix} C_1^* \\ D_{12}^* \end{bmatrix} \begin{bmatrix} C_1 & D_{12} \end{bmatrix}.
\]

And (14.2) can be rewritten as
\[
\min_{u \in L_2[0,\infty)} \|C_1 x + D_{12}u\|_2^2.
\]
In fact, the LQR problem is posed traditionally as the minimization problem
\[
\min_{u \in L_2[0,\infty)} \|C_1 x + D_{12}u\|_2^2 \qquad (14.4)
\]
\[
\dot x = Ax + Bu, \qquad x(0) = x_0 \qquad (14.5)
\]

without explicitly mentioning the condition that the control should drive the state to the origin. Instead, some assumptions are imposed on $Q$, $S$, and $R$ (or, equivalently, $C_1$ and $D_{12}$) to ensure that the optimal control law $u$ has this property. To see what assumptions one needs to make in order to ensure that the minimization problem formulated in (14.4) and (14.5) has a sensible solution, let us consider a simple example with $A = 1$, $B = 1$, $Q = 0$, $S = 0$, and $R = 1$:
\[
\min_{u \in L_2[0,\infty)} \int_0^\infty u^2\, dt, \qquad \dot x = x + u, \quad x(0) = x_0.
\]

It is clear that $u = 0$ is the optimal solution. However, the system with $u = 0$ is unstable and $x(t)$ diverges exponentially to infinity, $x(t) = e^t x_0$. The problem with this example is that the performance index does not "see" the unstable state $x$. This is true in general, and the proof of this fact is left as an exercise to the reader. Hence, in order to ensure that the minimization problem in (14.4) and (14.5) is sensible, we must assume that all unstable states can be "seen" from the performance index, i.e., $(C_1, A)$ must be detectable. This will be called a standard LQR problem.

On the other hand, if closed-loop stability is imposed on the above minimization, then it can be shown that
\[
\min_{u \in L_2[0,\infty)} \int_0^\infty u^2\, dt = 2x_0^2
\]
and $u(t) = -2x(t)$ is the optimal control. This can also be generalized to the more general case where $(C_1, A)$ is not necessarily detectable. This problem will be referred to as an Extended LQR problem.
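For this scalar problem the stabilizing root of the Riccati equation $2X - X^2 = 0$ is $X = 2$, which reproduces $u = -Xx = -2x$ and the cost $x_0^* X x_0 = 2x_0^2$. A minimal numerical sketch (assuming NumPy; the horizon and grid are arbitrary truncation choices):

```python
import numpy as np

# Extended LQR for xdot = x + u: stabilizing root of 2X - X^2 = 0 is X = 2,
# so u = -2x and the optimal cost should be x0* X x0 = 2 x0^2.
x0 = 1.5
t = np.linspace(0.0, 20.0, 200001)
x = np.exp(-t) * x0              # closed loop: xdot = (1 - 2) x = -x
u = -2.0 * x

dt = t[1] - t[0]
f = u**2
cost = dt * (f.sum() - 0.5 * (f[0] + f[-1]))   # trapezoidal rule on [0, 20]
assert abs(cost - 2.0 * x0**2) < 1e-3
print("cost =", cost, " expected =", 2.0 * x0**2)
```

The truncated horizon suffices because the closed-loop state decays as $e^{-t}$, so the neglected tail is of order $e^{-40}$.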

14.2 Standard LQR Problem

In this section, we shall consider the LQR problem as traditionally formulated.

Standard LQR Problem


Let a dynamical system be described by
\[
\dot x = Ax + B_2 u, \qquad x(0) = x_0 \ \text{given but arbitrary} \qquad (14.6)
\]
\[
z = C_1 x + D_{12} u \qquad (14.7)
\]
and suppose that the system parameter matrices satisfy the following assumptions:

(A1) $(A, B_2)$ is stabilizable;

(A2) $D_{12}$ has full column rank with $\begin{bmatrix} D_{12} & D_\perp \end{bmatrix}$ unitary;

(A3) $(C_1, A)$ is detectable;

(A4) $\begin{bmatrix} A - j\omega I & B_2 \\ C_1 & D_{12} \end{bmatrix}$ has full column rank for all $\omega$.

Find an optimal control law $u \in L_2[0,\infty)$ such that the performance criterion $\|z\|_2^2$ is minimized.

Remark 14.1 Assumption (A1) is clearly necessary for the existence of a stabilizing control function $u$. Assumption (A2) is made for simplicity of notation and is actually a restatement that $R = D_{12}^* D_{12} = I$. Note also that $D_\perp$ drops out when $D_{12}$ is square. It is interesting to point out that (A3) is not needed in the Extended LQR problem. Assumption (A3) enforces that the unconditional optimization problem will result in a stabilizing control law. In fact, assumption (A3), together with (A1), guarantees that input/output stability implies internal stability, i.e., $u \in L_2$ and $z \in L_2$ imply $x \in L_2$, which will be shown in Lemma 14.1. Finally, note that (A4) is equivalent to the condition that $(D_\perp^* C_1, A - B_2 D_{12}^* C_1)$ has no unobservable modes on the imaginary axis, and it is weaker than the popular assumption of detectability of $(D_\perp^* C_1, A - B_2 D_{12}^* C_1)$. (A4), together with the stabilizability of $(A, B_2)$, guarantees by Corollary 13.10 that the following Hamiltonian matrix belongs to $\mathrm{dom(Ric)}$ and that $X = \mathrm{Ric}(H) \geq 0$:
\[
H = \begin{bmatrix} A & 0 \\ -C_1^* C_1 & -A^* \end{bmatrix} - \begin{bmatrix} B_2 \\ -C_1^* D_{12} \end{bmatrix} \begin{bmatrix} D_{12}^* C_1 & B_2^* \end{bmatrix}
= \begin{bmatrix} A - B_2 D_{12}^* C_1 & -B_2 B_2^* \\ -C_1^* D_\perp D_\perp^* C_1 & -(A - B_2 D_{12}^* C_1)^* \end{bmatrix}. \qquad (14.8)
\]
Note also that if $D_{12}^* C_1 = 0$, then (A4) is implied by the detectability of $(C_1, A)$, while the detectability of $(C_1, A)$ is implied by the detectability of $(D_\perp^* C_1, A - B_2 D_{12}^* C_1)$. The above implication is not true if $D_{12}^* C_1 \neq 0$; for example,
\[
A = \begin{bmatrix} 2 & 0 \\ 0 & -2 \end{bmatrix}, \qquad B_2 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad C_1 = \begin{bmatrix} 1 & 0 \end{bmatrix}, \qquad D_{12} = 1.
\]


Then $(C_1, A)$ is detectable, and $A - B_2 D_{12}^* C_1 = \begin{bmatrix} 1 & 0 \\ 0 & -2 \end{bmatrix}$ has no eigenvalues on the imaginary axis but is not stable. ~

Note also that the Riccati equation corresponding to (14.8) is
\[
(A - B_2 D_{12}^* C_1)^* X + X(A - B_2 D_{12}^* C_1) - XB_2B_2^*X + C_1^* D_\perp D_\perp^* C_1 = 0. \qquad (14.9)
\]
Now let $X$ be the corresponding stabilizing solution and define
\[
F := -(B_2^* X + D_{12}^* C_1). \qquad (14.10)
\]
Then $A + B_2 F$ is stable. Denote
\[
A_F := A + B_2 F, \qquad C_F := C_1 + D_{12} F
\]
and rearrange equation (14.9) to get
\[
A_F^* X + X A_F + C_F^* C_F = 0. \qquad (14.11)
\]
Thus $X$ is the observability grammian of $(C_F, A_F)$.

Consider applying the control law $u = Fx$ to the system (14.6) and (14.7). The controlled system is
\[
\dot x = A_F x, \quad x(0) = x_0, \qquad z = C_F x
\]
or, equivalently,
\[
\dot x = A_F x + x_0\delta(t), \quad x(0_-) = 0, \qquad z = C_F x.
\]
The associated transfer matrix is
\[
G_c(s) = \left[\begin{array}{c|c} A_F & I \\ \hline C_F & 0 \end{array}\right]
\]
and
\[
\|G_c x_0\|_2^2 = x_0^* X x_0.
\]

The proof of the following theorem requires a preliminary result about internal stability given input/output stability.

Lemma 14.1 If $u, z \in L_p[0,\infty)$ for $p \geq 1$ and $(C_1, A)$ is detectable in the system described by equations (14.6) and (14.7), then $x \in L_p[0,\infty)$. Furthermore, if $p < \infty$, then $x(t) \to 0$ as $t \to \infty$.


Proof. Since $(C_1, A)$ is detectable, there exists $L$ such that $A + LC_1$ is stable. Let $\hat x$ be the state estimate of $x$ given by
\[
\dot{\hat x} = (A + LC_1)\hat x + (LD_{12} + B_2)u - Lz.
\]
Then $\hat x \in L_p[0,\infty)$ since $z$ and $u$ are in $L_p[0,\infty)$. Now let $e = x - \hat x$; then
\[
\dot e = (A + LC_1)e
\]
and $e \in L_p[0,\infty)$. Therefore, $x = e + \hat x \in L_p[0,\infty)$. It is easy to see that $e(t) \to 0$ as $t \to \infty$ for any initial condition $e(0)$. Finally, $x(t) \to 0$ follows from the fact that if $p < \infty$ then $\hat x \to 0$. $\Box$

Theorem 14.2 There exists a unique optimal control for the LQR problem, namely $u = Fx$. Moreover,
\[
\min_{u \in L_2[0,\infty)} \|z\|_2 = \|G_c x_0\|_2.
\]

Note that the optimal control strategy is constant-gain state feedback, and this gain is independent of the initial condition $x_0$.

Proof. With the change of variable $v = u - Fx$, the system can be written as
\[
\begin{bmatrix} \dot x \\ z \end{bmatrix} = \left[\begin{array}{c|c} A_F & B_2 \\ \hline C_F & D_{12} \end{array}\right] \begin{bmatrix} x \\ v \end{bmatrix}, \qquad x(0) = x_0. \qquad (14.12)
\]
Now if $v \in L_2[0,\infty)$, then $x, z \in L_2[0,\infty)$ and $x(\infty) = 0$ since $A_F$ is stable. Hence $u = Fx + v \in L_2[0,\infty)$. Conversely, if $u, z \in L_2[0,\infty)$, then from Lemma 14.1, $x \in L_2[0,\infty)$, so $v \in L_2[0,\infty)$. Thus the mapping $v = u - Fx$ between $v \in L_2[0,\infty)$ and those $u \in L_2[0,\infty)$ that make $z \in L_2[0,\infty)$ is one-to-one and onto. Therefore,
\[
\min_{u \in L_2[0,\infty)} \|z\|_2 = \min_{v \in L_2[0,\infty)} \|z\|_2.
\]
By differentiating $x(t)^* X x(t)$ with respect to $t$ along a solution of the differential equation (14.12), and by using (14.9) and the fact that $C_F^* D_{12} = -XB_2$, we see that
\[
\frac{d}{dt}\,x^*Xx = \dot x^* X x + x^* X \dot x
= x^*(A_F^*X + XA_F)x + 2x^*XB_2v
= -x^*C_F^*C_Fx + 2x^*XB_2v
\]
\[
= -(C_Fx + D_{12}v)^*(C_Fx + D_{12}v) + 2x^*C_F^*D_{12}v + v^*v + 2x^*XB_2v
= -\|z\|^2 + \|v\|^2. \qquad (14.13)
\]


Now integrate (14.13) from $0$ to $\infty$ to get
\[
\|z\|_2^2 = x_0^* X x_0 + \|v\|_2^2.
\]
Clearly, the unique optimal control is $v = 0$, i.e., $u = Fx$. $\Box$
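The identities used in this proof are easy to check numerically. The sketch below (a hypothetical double-integrator plant with $D_{12}^*C_1 = 0$, assuming SciPy is available) verifies (14.11) by a Lyapunov solve and confirms $\|z\|_2^2 = x_0^* X x_0$ under $u = Fx$ by simulation:

```python
import numpy as np
from scipy.linalg import solve_continuous_are, solve_lyapunov, expm

# Hypothetical double-integrator data with D12*C1 = 0 (illustrative values).
A   = np.array([[0.0, 1.0], [0.0, 0.0]])
B2  = np.array([[0.0], [1.0]])
C1  = np.array([[1.0, 0.0], [0.0, 0.0]])
D12 = np.array([[0.0], [1.0]])
x0  = np.array([1.0, -1.0])

X = solve_continuous_are(A, B2, C1.T @ C1, np.eye(1))
F = -(B2.T @ X + D12.T @ C1)                # (14.10); the cross term vanishes here
AF, CF = A + B2 @ F, C1 + D12 @ F

# (14.11): X is the observability grammian of (CF, AF).
assert np.allclose(solve_lyapunov(AF.T, -CF.T @ CF), X)

# ||z||_2^2 under u = Fx should equal x0* X x0; simulate z = CF e^{AF t} x0.
dt, nsteps = 1e-3, 60000
Phi = expm(AF * dt)                         # exact one-step propagator
x, zsq = x0.copy(), []
for _ in range(nsteps + 1):
    z = CF @ x
    zsq.append(float(z @ z))
    x = Phi @ x
cost = dt * (sum(zsq) - 0.5 * (zsq[0] + zsq[-1]))   # trapezoidal rule
assert abs(cost - float(x0 @ X @ x0)) < 1e-3
print("simulated cost:", cost, " x0'Xx0:", float(x0 @ X @ x0))
```

For this plant $X = \begin{bmatrix}\sqrt 2 & 1\\ 1 & \sqrt 2\end{bmatrix}$, so with the chosen $x_0$ the optimal cost is $2\sqrt 2 - 2$.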

This method of proof, involving a change of variables and completion of the square, is a standard technique, and variants of it will be used throughout this book. An alternative proof can be given in the frequency domain. To do that, let us first note the following fact:

Lemma 14.3 Let a transfer matrix be defined as
\[
U := \left[\begin{array}{c|c} A_F & B_2 \\ \hline C_F & D_{12} \end{array}\right] \in RH_\infty.
\]
Then $U$ is inner and $U^\sim G_c \in RH_2^\perp$.

Proof. The proof uses standard manipulations of state space realizations. From $U$ we get
\[
U^\sim(s) = \left[\begin{array}{c|c} -A_F^* & -C_F^* \\ \hline B_2^* & D_{12}^* \end{array}\right].
\]
Then it is easy to compute
\[
U^\sim U = \left[\begin{array}{cc|c} -A_F^* & -C_F^*C_F & -C_F^*D_{12} \\ 0 & A_F & B_2 \\ \hline B_2^* & D_{12}^*C_F & I \end{array}\right], \qquad
U^\sim G_c = \left[\begin{array}{cc|c} -A_F^* & -C_F^*C_F & 0 \\ 0 & A_F & I \\ \hline B_2^* & D_{12}^*C_F & 0 \end{array}\right].
\]
Now perform the similarity transformation
\[
\begin{bmatrix} I & -X \\ 0 & I \end{bmatrix}
\]
on the states of the transfer matrices and use (14.11) to get
\[
U^\sim U = \left[\begin{array}{cc|c} -A_F^* & 0 & 0 \\ 0 & A_F & B_2 \\ \hline B_2^* & 0 & I \end{array}\right] = I
\]
\[
U^\sim G_c = \left[\begin{array}{cc|c} -A_F^* & 0 & -X \\ 0 & A_F & I \\ \hline B_2^* & 0 & 0 \end{array}\right] = \left[\begin{array}{c|c} -A_F^* & -X \\ \hline B_2^* & 0 \end{array}\right] \in RH_2^\perp.
\]

$\Box$

An alternative proof of Theorem 14.2. In the frequency domain we have
\[
z = G_c x_0 + Uv.
\]
Let $v \in H_2$. By Lemma 14.3, $G_c x_0$ and $Uv$ are orthogonal. Hence
\[
\|z\|_2^2 = \|G_c x_0\|_2^2 + \|Uv\|_2^2.
\]
Since $U$ is inner, we get
\[
\|z\|_2^2 = \|G_c x_0\|_2^2 + \|v\|_2^2.
\]
This equation immediately gives the desired conclusion. $\Box$
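Both claims of Lemma 14.3 can be spot-checked on the frequency axis: $U(j\omega)^*U(j\omega) = I$, and $U^\sim G_c$ agrees with the antistable, strictly proper realization $[-A_F^*, -X;\ B_2^*, 0]$. A sketch with the same hypothetical double-integrator data used above:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hypothetical double-integrator data (illustrative values, D12*C1 = 0).
A   = np.array([[0.0, 1.0], [0.0, 0.0]])
B2  = np.array([[0.0], [1.0]])
C1  = np.array([[1.0, 0.0], [0.0, 0.0]])
D12 = np.array([[0.0], [1.0]])

X = solve_continuous_are(A, B2, C1.T @ C1, np.eye(1))
F = -(B2.T @ X + D12.T @ C1)
AF, CF = A + B2 @ F, C1 + D12 @ F

for w in [0.0, 0.7, 5.0, 100.0]:
    Phi = np.linalg.solve(1j * w * np.eye(2) - AF, np.eye(2))
    Ujw  = D12 + CF @ Phi @ B2            # U(jw)
    Gcjw = CF @ Phi                       # Gc(jw)
    assert np.allclose(Ujw.conj().T @ Ujw, np.eye(1), atol=1e-8)   # U is inner
    # U~Gc should equal the realization [-AF*, -X; B2*, 0] evaluated at jw:
    lhs = Ujw.conj().T @ Gcjw
    rhs = B2.T @ np.linalg.solve(1j * w * np.eye(2) + AF.T, -X)
    assert np.allclose(lhs, rhs, atol=1e-8)
print("U inner and U~Gc antistable/strictly proper: verified at sample frequencies")
```

Since $-A_F^*$ is antistable and the realization has zero direct term, the computed $U^\sim G_c$ indeed lies in $RH_2^\perp$.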

Remark 14.2 It is clear that the LQR problem considered above is essentially equivalent to minimizing the 2-norm of $z$ with the input $w = x_0\delta(t)$ in the standard feedback configuration of a controller $K$ around the generalized plant
\[
\left[\begin{array}{c|cc} A & I & B_2 \\ \hline C_1 & 0 & D_{12} \\ I & 0 & 0 \end{array}\right].
\]
But this problem is a special $H_2$ norm minimization problem considered in a later section. ~

14.3 Extended LQR Problem

This section considers the Extended LQR problem, where no detectability assumption is made for $(C_1, A)$.

Extended LQR Problem

Let a dynamical system be given by
\[
\dot x = Ax + B_2 u, \qquad x(0) = x_0 \ \text{given but arbitrary}
\]
\[
z = C_1 x + D_{12} u
\]
with the following assumptions:

(A1) $(A, B_2)$ is stabilizable;


(A2) $D_{12}$ has full column rank with $\begin{bmatrix} D_{12} & D_\perp \end{bmatrix}$ unitary;

(A3) $\begin{bmatrix} A - j\omega I & B_2 \\ C_1 & D_{12} \end{bmatrix}$ has full column rank for all $\omega$.

Find an optimal control law $u \in L_2[0,\infty)$ such that the system is internally stable, i.e., $x \in L_2[0,\infty)$, and the performance criterion $\|z\|_2^2$ is minimized.

Assuming the same notation as above, we have:

Theorem 14.4 There exists a unique optimal control for the Extended LQR problem, namely $u = Fx$. Moreover,
\[
\min_{u \in L_2[0,\infty)} \|z\|_2 = \|G_c x_0\|_2.
\]

Proof. The proof of this theorem is very similar to that of the standard LQR problem except that, in this case, input/output stability does not necessarily imply internal stability. Instead, internal stability is guaranteed by the choice of control law.

Suppose that $u \in L_2[0,\infty)$ is a control law such that the system is stable, i.e., $x \in L_2[0,\infty)$. Then $v = u - Fx \in L_2[0,\infty)$. On the other hand, let $v \in L_2[0,\infty)$ and consider
\[
\begin{bmatrix} \dot x \\ z \end{bmatrix} = \left[\begin{array}{c|c} A_F & B_2 \\ \hline C_F & D_{12} \end{array}\right] \begin{bmatrix} x \\ v \end{bmatrix}, \qquad x(0) = x_0.
\]
Then $x, z \in L_2[0,\infty)$ and $x(\infty) = 0$ since $A_F$ is stable. Hence $u = Fx + v \in L_2[0,\infty)$. Again, the mapping $v = u - Fx$ between $v \in L_2[0,\infty)$ and those $u \in L_2[0,\infty)$ that make $z \in L_2[0,\infty)$ and $x \in L_2[0,\infty)$ is one-to-one and onto. Therefore,
\[
\min_{u \in L_2[0,\infty)} \|z\|_2 = \min_{v \in L_2[0,\infty)} \|z\|_2.
\]
Using the same technique as in the proof of the standard LQR problem, we have
\[
\|z\|_2^2 = x_0^* X x_0 + \|v\|_2^2
\]
and the unique optimal control is $v = 0$, i.e., $u = Fx$. $\Box$

14.4 Guaranteed Stability Margins of LQR

Now we will consider the system described by equation (14.6) with the LQR control law $u = Fx$. The closed-loop block diagram is shown in Figure 14.1.

The following result is the key to the stability margins of an LQR control law.

Lemma 14.5 Let $F = -(B_2^*X + D_{12}^*C_1)$ and define $G_{12}(s) = D_{12} + C_1(sI - A)^{-1}B_2$. Then
\[
\left(I - B_2^*(-sI - A^*)^{-1}F^*\right)\left(I - F(sI - A)^{-1}B_2\right) = G_{12}^\sim(s)\,G_{12}(s).
\]


Figure 14.1: LQR closed-loop system (the plant $\dot x = Ax + B_2u$ in feedback with the gain $F$)

Proof. Note that the Riccati equation (14.9) can be written as
\[
XA + A^*X - F^*F + C_1^*C_1 = 0.
\]
Add and subtract $sX$ to the equation to get
\[
-X(sI - A) - (-sI - A^*)X - F^*F + C_1^*C_1 = 0.
\]
Now multiply the above equation from the left by $B_2^*(-sI - A^*)^{-1}$ and from the right by $(sI - A)^{-1}B_2$ to get
\[
-B_2^*(-sI - A^*)^{-1}XB_2 - B_2^*X(sI - A)^{-1}B_2 - B_2^*(-sI - A^*)^{-1}F^*F(sI - A)^{-1}B_2
\]
\[
{}+ B_2^*(-sI - A^*)^{-1}C_1^*C_1(sI - A)^{-1}B_2 = 0.
\]
Using $-B_2^*X = F + D_{12}^*C_1$ in the above equation, we have
\[
B_2^*(-sI - A^*)^{-1}F^* + F(sI - A)^{-1}B_2 - B_2^*(-sI - A^*)^{-1}F^*F(sI - A)^{-1}B_2
\]
\[
{}+ B_2^*(-sI - A^*)^{-1}C_1^*D_{12} + D_{12}^*C_1(sI - A)^{-1}B_2 + B_2^*(-sI - A^*)^{-1}C_1^*C_1(sI - A)^{-1}B_2 = 0.
\]
Then the result follows from completing the square and from the fact that $D_{12}^*D_{12} = I$. $\Box$

Corollary 14.6 Suppose $D_{12}^*C_1 = 0$. Then
\[
\left(I - B_2^*(-sI - A^*)^{-1}F^*\right)\left(I - F(sI - A)^{-1}B_2\right) = I + B_2^*(-sI - A^*)^{-1}C_1^*C_1(sI - A)^{-1}B_2.
\]
In particular,
\[
\left(I - B_2^*(-j\omega I - A^*)^{-1}F^*\right)\left(I - F(j\omega I - A)^{-1}B_2\right) \geq I \qquad (14.14)
\]
and
\[
\left(I + B_2^*(-j\omega I - A^* - F^*B_2^*)^{-1}F^*\right)\left(I + F(j\omega I - A - B_2F)^{-1}B_2\right) \geq I. \qquad (14.15)
\]


Note that inequality (14.15) follows from taking the inverse of inequality (14.14).

Define $G(s) = -F(sI - A)^{-1}B_2$ and assume for the moment that the system is single-input. Then inequality (14.14) shows that the open-loop Nyquist diagram of the system $G(s)$ in Figure 14.1 never enters the unit disk centered at $(-1, 0)$ of the complex plane. Hence the system has at least the following stability margins:
\[
k_{\min} \leq \tfrac{1}{2}, \qquad k_{\max} = \infty, \qquad \phi_{\min} \leq -60^{\circ}, \qquad \phi_{\max} \geq 60^{\circ};
\]
i.e., the system has at least a 6 dB ($= 20\log_{10} 2$) gain margin and a $60^{\circ}$ phase margin in both directions. A similar interpretation may be generalized to multiple-input systems.
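Inequality (14.14) can be checked numerically by sweeping the loop gain $G(j\omega) = -F(j\omega I - A)^{-1}B_2$ over a frequency grid and verifying that the Nyquist plot stays outside the unit disk centered at $-1$. The plant below is a hypothetical example with $D_{12}^*C_1 = 0$ (assuming SciPy):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hypothetical LQR loop (double integrator, D12*C1 = 0), so F = -B2*X.
A  = np.array([[0.0, 1.0], [0.0, 0.0]])
B2 = np.array([[0.0], [1.0]])
C1 = np.array([[1.0, 0.0], [0.0, 0.0]])

X = solve_continuous_are(A, B2, C1.T @ C1, np.eye(1))
F = -B2.T @ X

# |1 + G(jw)| >= 1: the Nyquist plot of G avoids the unit disk around -1,
# which is exactly the 6 dB / 60 degree guaranteed-margin statement.
ws = np.logspace(-3, 3, 2000)
dist = []
for w in ws:
    G = (-F @ np.linalg.solve(1j * w * np.eye(2) - A, B2))[0, 0]
    dist.append(abs(1.0 + G))
assert min(dist) >= 1.0 - 1e-9
print("min |1 + G(jw)| over the grid =", min(dist))
```

For this particular plant one can even verify by hand that $|1 + G(j\omega)|^2 = 1 + 1/\omega^4$, so the distance to $-1$ approaches but never drops below 1.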

Next, it is noted that inequality (14.15) can also be given a robustness interpretation. In fact, it implies that the closed-loop system in Figure 14.1 remains stable even if the open-loop system $G(s)$ is perturbed additively by a $\Delta \in RH_\infty$ as long as $\|\Delta\|_\infty < 1$. This follows from the small gain theorem applied to the perturbed feedback loop, in which the transfer matrix seen by the perturbation, from $w$ to $z$, is exactly $I + F(j\omega I - A - B_2F)^{-1}B_2$.

14.5 Standard $H_2$ Problem

The system considered in this section is described by the standard block diagram in which the controller $K$ is connected to the generalized plant $G$, with exogenous input $w$, controlled output $z$, measurement $y$, and control input $u$.

The realization of the transfer matrix $G$ is taken to be of the form
\[
G(s) = \left[\begin{array}{c|cc} A & B_1 & B_2 \\ \hline C_1 & 0 & D_{12} \\ C_2 & D_{21} & 0 \end{array}\right].
\]


Notice the special off-diagonal structure of $D$: $D_{22}$ is assumed to be zero so that $G_{22}$ is strictly proper¹; also, $D_{11}$ is assumed to be zero in order to guarantee that the $H_2$ problem is properly posed.² The case $D_{11} \neq 0$ will be discussed in Section 14.7.

The following additional assumptions are made for the output feedback $H_2$ problem in this chapter:

(i) $(A, B_2)$ is stabilizable and $(C_2, A)$ is detectable;

(ii) $D_{12}$ has full column rank with $\begin{bmatrix} D_{12} & D_\perp \end{bmatrix}$ unitary, and $D_{21}$ has full row rank with $\begin{bmatrix} D_{21} \\ \tilde D_\perp \end{bmatrix}$ unitary;

(iii) $\begin{bmatrix} A - j\omega I & B_2 \\ C_1 & D_{12} \end{bmatrix}$ has full column rank for all $\omega$;

(iv) $\begin{bmatrix} A - j\omega I & B_1 \\ C_2 & D_{21} \end{bmatrix}$ has full row rank for all $\omega$.

The first assumption is for the stabilizability of $G$ by output feedback, and the third and fourth assumptions, together with the first, guarantee that the two Hamiltonian matrices associated with the $H_2$ problem below belong to $\mathrm{dom(Ric)}$. The rank assumptions in (ii) are necessary to guarantee that the $H_2$ optimal controller is a finite dimensional linear time-invariant one, while the unitary assumptions are made for the simplicity of the final solution; they are not restrictions (see, e.g., Chapter 17).

$H_2$ Problem  The $H_2$ control problem is to find a proper, real-rational controller $K$ which stabilizes $G$ internally and minimizes the $H_2$ norm of the transfer matrix $T_{zw}$ from $w$ to $z$.

In the following discussions we shall assume that we have state models of $G$ and $K$. Recall that a controller is said to be admissible if it is internally stabilizing and proper.

We now state the solution of the problem and then take up its derivation in the next several sections. By Corollary 13.10, the two Hamiltonian matrices

\[
H_2 := \begin{bmatrix} A & 0 \\ -C_1^*C_1 & -A^* \end{bmatrix} - \begin{bmatrix} B_2 \\ -C_1^*D_{12} \end{bmatrix} \begin{bmatrix} D_{12}^*C_1 & B_2^* \end{bmatrix}
\]
\[
J_2 := \begin{bmatrix} A^* & 0 \\ -B_1B_1^* & -A \end{bmatrix} - \begin{bmatrix} C_2^* \\ -B_1D_{21}^* \end{bmatrix} \begin{bmatrix} D_{21}B_1^* & C_2 \end{bmatrix}
\]

¹As we have discussed in Section 12.3.4 of Chapter 12, there is no loss of generality in making this assumption since the controller for the $D_{22} \neq 0$ case can be recovered from the $D_{22} = 0$ case.
²Recall that a rational proper stable transfer function is an $RH_2$ function iff it is strictly proper.


belong to $\mathrm{dom(Ric)}$ and, moreover, $X_2 := \mathrm{Ric}(H_2) \geq 0$ and $Y_2 := \mathrm{Ric}(J_2) \geq 0$. Define
\[
F_2 := -(B_2^*X_2 + D_{12}^*C_1), \qquad L_2 := -(Y_2C_2^* + B_1D_{21}^*)
\]
and
\[
A_{F_2} := A + B_2F_2, \qquad C_{1F_2} := C_1 + D_{12}F_2
\]
\[
A_{L_2} := A + L_2C_2, \qquad B_{1L_2} := B_1 + L_2D_{21}
\]
\[
A_2 := A + B_2F_2 + L_2C_2
\]
\[
G_c(s) := \left[\begin{array}{c|c} A_{F_2} & I \\ \hline C_{1F_2} & 0 \end{array}\right], \qquad G_f(s) := \left[\begin{array}{c|c} A_{L_2} & B_{1L_2} \\ \hline I & 0 \end{array}\right].
\]

Theorem 14.7 There exists a unique optimal controller
\[
K_{opt}(s) := \left[\begin{array}{c|c} A_2 & -L_2 \\ \hline F_2 & 0 \end{array}\right].
\]
Moreover, $\min \|T_{zw}\|_2^2 = \|G_cB_1\|_2^2 + \|F_2G_f\|_2^2 = \|G_cL_2\|_2^2 + \|C_1G_f\|_2^2$.
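Theorem 14.7 can be verified numerically: form $K_{opt}$ from the two Riccati solutions, close the loop, and compare $\|T_{zw}\|_2^2$ (computed from the closed-loop observability grammian) with $\|G_cB_1\|_2^2 + \|F_2G_f\|_2^2$. The plant below is a hypothetical example chosen to satisfy assumptions (i)-(iv) (assuming SciPy):

```python
import numpy as np
from scipy.linalg import solve_continuous_are, solve_lyapunov

def h2norm_sq(A, B, C):
    """||C (sI - A)^{-1} B||_2^2 = trace(B* Qo B) with Qo the observability grammian."""
    Qo = solve_lyapunov(A.T, -C.T @ C)
    return float(np.trace(B.T @ Qo @ B))

# Hypothetical plant data satisfying assumptions (i)-(iv) of Section 14.5.
A   = np.array([[0.0, 1.0], [-2.0, -3.0]])
B1  = np.eye(2)
B2  = np.array([[0.0], [1.0]])
C1  = np.array([[1.0, 0.0], [0.0, 0.0]])
D12 = np.array([[0.0], [1.0]])
C2  = np.array([[1.0, 0.0]])
D21 = np.array([[0.0, 1.0]])

X2 = solve_continuous_are(A, B2, C1.T @ C1, np.eye(1), s=C1.T @ D12)
Y2 = solve_continuous_are(A.T, C2.T, B1 @ B1.T, np.eye(1), s=B1 @ D21.T)
F2 = -(B2.T @ X2 + D12.T @ C1)
L2 = -(Y2 @ C2.T + B1 @ D21.T)
AF2, C1F2 = A + B2 @ F2, C1 + D12 @ F2
AL2, B1L2 = A + L2 @ C2, B1 + L2 @ D21
A2 = A + B2 @ F2 + L2 @ C2

# Closed loop with Kopt = [A2, -L2; F2, 0]: xk' = A2 xk - L2 y, u = F2 xk.
Acl = np.block([[A, B2 @ F2], [-L2 @ C2, A2]])
Bcl = np.vstack([B1, -L2 @ D21])
Ccl = np.hstack([C1, D12 @ F2])

lhs = h2norm_sq(Acl, Bcl, Ccl)                       # ||Tzw||_2^2 under Kopt
rhs = h2norm_sq(AF2, B1, C1F2) + h2norm_sq(AL2, B1L2, F2)
assert np.allclose(lhs, rhs)                         # = ||Gc B1||^2 + ||F2 Gf||^2
assert np.allclose(h2norm_sq(AF2, B1, C1F2),
                   np.trace(B1.T @ X2 @ B1))         # ||Gc B1||_2^2 = trace(B1* X2 B1)
print("min ||Tzw||_2^2 =", lhs)
```

The separation structure is visible in the check: the closed-loop spectrum is the union of the eigenvalues of $A_{F_2}$ and $A_{L_2}$, and the optimal cost splits into a state-feedback term and a filtering term.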

The controller $K_{opt}$ has the well-known separation structure, which will be discussed in more detail in Section 14.9. For comparison with the $H_\infty$ results, it is useful to describe all suboptimal controllers.

Theorem 14.8 The family of all admissible controllers such that $\|T_{zw}\|_2 < \gamma$ equals the set of all transfer matrices from $y$ to $u$ given by $K = F_\ell(M_2, Q)$, where
\[
M_2(s) = \left[\begin{array}{c|cc} A_2 & -L_2 & B_2 \\ \hline F_2 & 0 & I \\ -C_2 & I & 0 \end{array}\right]
\]
and $Q \in RH_2$ with $\|Q\|_2^2 < \gamma^2 - \left(\|G_cB_1\|_2^2 + \|F_2G_f\|_2^2\right)$.

Thus, the suboptimal controllers are parameterized by a fixed (independent of $\gamma$) linear-fractional transformation with a free parameter $Q$. With $Q = 0$, we recover $K_{opt}$. It is worth noting that the parameterization in Theorem 14.8 makes $T_{zw}$ affine in $Q$ and yields the Youla parameterization of all stabilizing controllers when the conditions on $Q$ are replaced by $Q \in RH_\infty$.


14.6 Optimal Controlled System

In this section, we look at the controller
\[
K(s) = F_\ell(M_2, Q), \qquad Q \in RH_\infty,
\]
connected to $G$. (Keep in mind that all admissible controllers are parameterized by the above formula.) We will give a brief analysis of the closed-loop system; a direct consequence of this analysis is the results of Theorems 14.7 and 14.8. The proof given here is not our emphasis, since this approach does not generalize nicely to other control problems and is often very involved. An alternative proof will be given in the later part of this chapter by using the FI and OE results discussed in Section 14.8 and the separation argument. The idea of separation is the main theme for synthesis. We shall now analyze the system structure under the control of such a controller; in particular, we will compute $\|T_{zw}\|_2^2$ explicitly.

Consider the closed-loop system formed by $G$ and the controller $K(s) = F_\ell(M_2, Q)$, with $Q$ connected around $M_2$ through the signals $u_1$ and $y_1$. Then $T_{zw} = F_\ell(N, Q)$ with
\[
N = \left[\begin{array}{cc|cc} A_{F_2} & -B_2F_2 & B_1 & B_2 \\ 0 & A_{L_2} & B_{1L_2} & 0 \\ \hline C_{1F_2} & -D_{12}F_2 & 0 & D_{12} \\ 0 & C_2 & D_{21} & 0 \end{array}\right].
\]
Define
\[
U = \left[\begin{array}{c|c} A_{F_2} & B_2 \\ \hline C_{1F_2} & D_{12} \end{array}\right], \qquad V = \left[\begin{array}{c|c} A_{L_2} & B_{1L_2} \\ \hline C_2 & D_{21} \end{array}\right].
\]
We have
\[
T_{zw} = G_cB_1 - UF_2G_f + UQV.
\]
It follows from Lemma 14.3 that $G_cB_1$ and $U$ are orthogonal. Thus
\[
\|T_{zw}\|_2^2 = \|G_cB_1\|_2^2 + \|UF_2G_f - UQV\|_2^2 = \|G_cB_1\|_2^2 + \|F_2G_f - QV\|_2^2.
\]


It can also be shown easily by duality that $G_f$ and $V$ are orthogonal, i.e., $G_fV^\sim \in RH_2^\perp$, and $V$ is co-inner, so we have
\[
\|T_{zw}\|_2^2 = \|G_cB_1\|_2^2 + \|F_2G_f - QV\|_2^2 = \|G_cB_1\|_2^2 + \|F_2G_f\|_2^2 + \|Q\|_2^2.
\]
This shows clearly that $Q = 0$ gives the unique optimal control, so $K = F_\ell(M_2, 0)$ is the unique optimal controller. Note also that $\|T_{zw}\|_2$ is finite if and only if $Q \in RH_2$. Hence Theorems 14.7 and 14.8 follow easily.

It is interesting to examine the structure of $G_c$ and $G_f$. First of all, the transfer matrix $G_c$ can be represented as the feedback gain $F_2$ wrapped around the fixed generalized plant
\[
\left[\begin{array}{c|cc} A & I & B_2 \\ \hline C_1 & 0 & D_{12} \\ I & 0 & 0 \end{array}\right].
\]
$F_2$ is, in fact, an optimal LQR controller and minimizes the $H_2$ norm of $G_c$. Similarly, $G_f$ can be represented as the output injection $L_2$ wrapped around
\[
\left[\begin{array}{c|cc} A & B_1 & I \\ \hline I & 0 & 0 \\ C_2 & D_{21} & 0 \end{array}\right]
\]
and $L_2$ minimizes the $H_2$ norm of $G_f$ and solves a special filtering problem.

14.7 $H_2$ Control with Direct Disturbance Feedforward*

Let us consider the generalized system structure again with $D_{11}$ not necessarily zero:
\[
G(s) = \left[\begin{array}{c|cc} A & B_1 & B_2 \\ \hline C_1 & D_{11} & D_{12} \\ C_2 & D_{21} & 0 \end{array}\right].
\]


We shall consider the following question: what will happen, and under what condition will the $H_2$ optimal control problem make sense, if $D_{11} \neq 0$?

Recall that $F_\ell(M_2, Q)$ with $Q \in RH_\infty$ parameterizes all stabilizing controllers for $G$ regardless of whether $D_{11} = 0$ or not. Now again consider the closed-loop transfer matrix with the controller $K = F_\ell(M_2, Q)$; then
\[
T_{zw} = G_cB_1 - UF_2G_f + UQV + D_{11}
\]
and
\[
T_{zw}(\infty) = D_{12}Q(\infty)D_{21} + D_{11}.
\]
Hence the $H_2$ optimal control problem will make sense, i.e., have a finite $H_2$ norm, if and only if there is a constant $Q(\infty)$ such that
\[
D_{12}Q(\infty)D_{21} + D_{11} = 0.
\]
This requires that
\[
Q(\infty) = -D_{12}^*D_{11}D_{21}^*
\]
and that
\[
-D_{12}D_{12}^*D_{11}D_{21}^*D_{21} + D_{11} = 0. \qquad (14.16)
\]
Note that equation (14.16) is a very restrictive condition. For example, suppose
\[
D_{12} = \begin{bmatrix} 0 \\ I \end{bmatrix}, \qquad D_{21} = \begin{bmatrix} 0 & I \end{bmatrix}
\]
and $D_{11}$ is partitioned accordingly,
\[
D_{11} = \begin{bmatrix} D_{1111} & D_{1112} \\ D_{1121} & D_{1122} \end{bmatrix}.
\]
Then equation (14.16) implies that
\[
\begin{bmatrix} D_{1111} & D_{1112} \\ D_{1121} & 0 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\]
and that $Q(\infty) = -D_{1122}$. So only $D_{1122}$ can be nonzero for a sensible $H_2$ problem. Hence from now on in this section we shall assume that (14.16) holds, and we denote $D_K := -D_{12}^*D_{11}D_{21}^*$. To find the optimal control law for the system $G$ with $D_{11} \neq 0$, let us consider the following system configuration:


Here the constant gain $D_K$ is pulled out of the controller, $K = D_K + \bar K$, and absorbed into the plant, giving the modified plant
\[
\bar G = \left[\begin{array}{c|cc} A + B_2D_KC_2 & B_1 + B_2D_KD_{21} & B_2 \\ \hline C_1 + D_{12}D_KC_2 & 0 & D_{12} \\ C_2 & D_{21} & 0 \end{array}\right].
\]
It is easy to check that the system $\bar G$ satisfies all the assumptions in Section 14.5; hence the controller formula in Section 14.5 can be used. A little algebra will show that
\[
\bar K = \left[\begin{array}{c|c} A_2 - B_2D_KC_2 & -(L_2 - B_2D_K) \\ \hline F_2 - D_KC_2 & 0 \end{array}\right]
\]
is the $H_2$ optimal controller for $\bar G$. Hence the controller $K$ for the original system $G$ will be given by
\[
K = \left[\begin{array}{c|c} A_2 - B_2D_KC_2 & -(L_2 - B_2D_K) \\ \hline F_2 - D_KC_2 & D_K \end{array}\right] = F_\ell(M_2, D_K).
\]

14.8 Special Problems

In this section we look at various $H_2$ optimization problems from which the output feedback solutions of the previous sections will be constructed via a separation argument. All the special problems in this section are to find a $K$ stabilizing $G$ and minimizing the $H_2$ norm from $w$ to $z$ in the standard setup, but with different structures for $G$. As


in Chapter 12, we shall call these special problems, respectively, state feedback (SF), output injection (OI), full information (FI), full control (FC), disturbance feedforward (DF), and output estimation (OE). OI, FC, and OE are the natural duals of SF, FI, and DF, respectively. The output feedback solutions will be constructed out of the FI and OE results.

The special problems SF, OI, FI, and FC are not, strictly speaking, special cases of the output feedback problem, since they do not satisfy all of the assumptions for output feedback (while DF and OE do). Each special problem inherits some of the assumptions (i)-(iv) from the output feedback problem as appropriate. The assumptions will be discussed in the subsection for each problem.

In each case, the results are summarized as a list of three items (in all cases, $K$ must be admissible):

1. the minimum of $\|T_{zw}\|_2$;

2. the unique controller minimizing $\|T_{zw}\|_2$;

3. the family of all controllers such that $\|T_{zw}\|_2 < \gamma$, where $\gamma$ is greater than the minimum norm.

Warning: we will be more specific below about what we mean by uniqueness and by all controllers in the second and third items. In particular, the controllers characterized here for the SF, OI, FI, and FC problems are neither unique nor all-inclusive. Once again, we regard all controllers that give the same control signal $u$, i.e., that have the same transfer function from $w$ to $u$, as an equivalence class. In other words, if $K_1$ and $K_2$ generate the same control signal $u$, we will regard them as the same, denoted $K_1 \cong K_2$. Hence the "unique controller" here means one of the controllers from the unique equivalence class. The same comment applies to the "all controllers" statement. This will be much clearer in Section 14.8.1 when we consider the state feedback problem; in that case we actually give a parameterization of all the elements in the equivalence class of the "unique" optimal controller. Thus the unique controller is really not unique. We choose not to give the parameterization of all the elements in an equivalence class in this book since it is very messy, as can be seen in Section 14.8.1, and not very useful. However, it will be seen that this equivalence class problem does not occur in the general output feedback case, including the DF and OE problems.

14.8.1 State Feedback

Consider an open-loop system transfer matrix
\[
G_{SF}(s) = \left[\begin{array}{c|cc} A & B_1 & B_2 \\ \hline C_1 & 0 & D_{12} \\ I & 0 & 0 \end{array}\right]
\]
with the following assumptions:


(i) $(A, B_2)$ is stabilizable;

(ii) $D_{12}$ has full column rank with $\begin{bmatrix} D_{12} & D_\perp \end{bmatrix}$ unitary;

(iii) $\begin{bmatrix} A - j\omega I & B_2 \\ C_1 & D_{12} \end{bmatrix}$ has full column rank for all $\omega$.

This is very much like the LQR problem except that we require from the start that $u$ be generated by state feedback, and the detectability of $(C_1, A)$ is not imposed since the controllers are restricted to providing internal stability. The controller is allowed to be dynamic, but it turns out that dynamics are not necessary.

State Feedback:

1. $\min \|T_{zw}\|_2 = \|G_cB_1\|_2 = \left(\mathrm{trace}(B_1^*X_2B_1)\right)^{1/2}$

2. $K(s) \cong F_2$

Remark 14.3 The class of all suboptimal controllers for state feedback is messy and not very useful in this book, so it is omitted, as are the OI problems. ~

Proof. Let $K$ be a stabilizing controller, $u = K(s)x$. Change control variables by defining $v := u - F_2x$, and then write the system equations as
\[
\begin{bmatrix} \dot x \\ z \\ v \end{bmatrix} = \left[\begin{array}{c|cc} A_{F_2} & B_1 & B_2 \\ \hline C_{1F_2} & 0 & D_{12} \\ (K - F_2) & 0 & 0 \end{array}\right] \begin{bmatrix} x \\ w \\ v \end{bmatrix}.
\]
In block diagram terms, $K - F_2$ is wrapped around the generalized plant
\[
\left[\begin{array}{c|cc} A_{F_2} & B_1 & B_2 \\ \hline C_{1F_2} & 0 & D_{12} \\ I & 0 & 0 \end{array}\right]
\]
with input $w$, outputs $z$ and $x$, and feedback signal $v$.

Let $T_{vw}$ denote the transfer matrix from $w$ to $v$. Notice that $T_{vw} \in RH_2$ because $K$ stabilizes $G$. Then
\[
T_{zw} = G_cB_1 + UT_{vw}
\]


where $U = \left[\begin{array}{c|c} A_{F_2} & B_2 \\ \hline C_{1F_2} & D_{12} \end{array}\right]$, and by Lemma 14.3, $U$ is inner and $U^\sim G_c$ is in $RH_2^\perp$. We get
\[
\|T_{zw}\|_2^2 = \|G_cB_1\|_2^2 + \|T_{vw}\|_2^2.
\]
Thus $\min \|T_{zw}\|_2^2 = \|G_cB_1\|_2^2$, and the minimum is achieved iff $T_{vw} = 0$. Furthermore, $K = F_2$ is a controller achieving this minimum, and any other controller achieving the minimum is in the equivalence class of $F_2$. $\Box$

Note that the above proof actually yields a much stronger result than what is needed. The proof that the optimal $T_{vw}$ is $T_{vw} = 0$ does not depend on the restriction that the controller measures just the state; we only require that the controller produce $v$ as a causal stable function $T_{vw}$ of $w$. This means that the optimal state feedback is also optimal for the full information problem as well.

We now give some further explanation about the uniqueness of the optimal controller that we commented on before. The important observation for this issue is that the controllers making $T_{vw} = 0$ are not unique. The controller given above, $F_2$, is only one of them. We will now find all of those controllers that stabilize the system and give $T_{vw} = 0$, i.e., all $K(s) \doteq F_2$.

Proposition 14.9 Let $V_c$ be a matrix whose columns form a basis for $\mathrm{Ker}\, B_1^*$ (so $V_c^* B_1 = 0$). Then all $\mathcal{H}_2$ optimal state feedback controllers can be parameterized as $K_{opt} = F_\ell(M_{sf}, \Phi)$ with $\Phi \in \mathcal{RH}_2$ and

$$M_{sf} = \begin{bmatrix} F_2 & I \\ V_c^*(sI - A_{F_2}) & -V_c^* B_2 \end{bmatrix}.$$

Proof. Since

$$T_{vw} = \left(I - \left[\begin{array}{c|c} A_{F_2} & B_2 \\ \hline K - F_2 & 0 \end{array}\right]\right)^{-1} \left[\begin{array}{c|c} A_{F_2} & B_1 \\ \hline K - F_2 & 0 \end{array}\right] = 0,$$

we get

$$(K - F_2)(sI - A_{F_2})^{-1} B_1 = 0 \qquad (14.17)$$

which is achieved if $K = F_2$. Clearly, this is the only solution if $B_1$ is square and nonsingular or if $K$ is restricted to be constant and $(A_{F_2}, B_1)$ is controllable.

To parameterize all elements in the equivalence class ($T_{vw} = 0$) to which $K = F_2$ belongs, let

$$P_c(s) := \begin{bmatrix} A_{F_2} & B_2 \\ I & 0 \end{bmatrix}.$$


Then all state feedback controllers stabilizing $G$ can be parameterized as

$$K(s) = F_2 + (I + Q P_c)^{-1} Q, \qquad Q(s) \in \mathcal{RH}_\infty$$

where $Q$ is free. Substitute $K$ in equation (14.17), and we get

$$Q(s)(sI - A_{F_2})^{-1} B_1 = 0. \qquad (14.18)$$

Hence we have

$$Q(s)(sI - A_{F_2})^{-1} = \Phi(s) V_c^*, \qquad \Phi(s) \in \mathcal{RH}_2.$$

Thus all $Q(s) \in \mathcal{RH}_\infty$ satisfying (14.18) can be written as

$$Q(s) = \Phi(s) V_c^* (sI - A_{F_2}), \qquad \Phi(s) \in \mathcal{RH}_2. \qquad (14.19)$$

Therefore,

$$K_{opt}(s) = F_2 + \big(I + \Phi(s) V_c^* (sI - A_{F_2}) P_c\big)^{-1} \Phi(s) V_c^* (sI - A_{F_2}) = F_2 + \big(I + \Phi(s) V_c^* B_2\big)^{-1} \Phi(s) V_c^* (sI - A_{F_2}), \qquad \Phi(s) \in \mathcal{RH}_2$$

parameterizes the equivalence class of $K = F_2$. $\Box$

14.8.2 Full Information and Other Special Problems

We shall consider the FI problem first.

$$G_{FI}(s) = \begin{bmatrix} A & B_1 & B_2 \\ C_1 & 0 & D_{12} \\ \begin{bmatrix} I \\ 0 \end{bmatrix} & \begin{bmatrix} 0 \\ I \end{bmatrix} & \begin{bmatrix} 0 \\ 0 \end{bmatrix} \end{bmatrix}$$

The assumptions relevant to the FI problem are the same as for the state feedback problem. This is similar to the state feedback problem except that the controller now has more information ($w$). However, as was pointed out in the discussion of the state feedback problem, this extra information is not used by the optimal controller.

Full Information:

1. $\min \|T_{zw}\|_2 = \|G_c B_1\|_2 = (\mathrm{trace}(B_1^* X_2 B_1))^{1/2}$

2. $K(s) \doteq \begin{bmatrix} F_2 & 0 \end{bmatrix}$

3. $K(s) \doteq \begin{bmatrix} F_2 & Q(s) \end{bmatrix}$, where $Q \in \mathcal{RH}_2$, $\|Q\|_2^2 < \gamma^2 - \|G_c B_1\|_2^2$


Proof. Items 1 and 2 follow immediately from the proof of the state feedback results because the argument that $T_{vw} = 0$ did not depend on the restriction to state feedback only. Thus we only need to prove item 3. Let $K$ be an admissible controller such that $\|T_{zw}\|_2 < \gamma$. As in the SF proof, define a new control variable $v = u - F_2 x$; then the closed-loop system is as shown below

[Block diagram: $\tilde{K}$ in feedback with $\tilde{G}_{FI}$; inputs $w$, $v$, outputs $z$, $y$.]

with

$$\tilde{G}_{FI} = \begin{bmatrix} A_{F_2} & B_1 & B_2 \\ C_{1F_2} & 0 & D_{12} \\ \begin{bmatrix} I \\ 0 \end{bmatrix} & \begin{bmatrix} 0 \\ I \end{bmatrix} & \begin{bmatrix} 0 \\ 0 \end{bmatrix} \end{bmatrix}, \qquad \tilde{K} = K - \begin{bmatrix} F_2 & 0 \end{bmatrix}.$$

Denote by $Q$ the transfer matrix from $w$ to $v$; it belongs to $\mathcal{RH}_2$ by internal stability and by the fact that $D_{12}$ has full column rank and $T_{zw}$, with $z = C_{1F_2} x + D_{12} Q w$, has finite $\mathcal{H}_2$ norm. Then $u = F_2 x + v = F_2 x + Q w = \begin{bmatrix} F_2 & Q \end{bmatrix} y$, so $K \doteq \begin{bmatrix} F_2 & Q \end{bmatrix}$, and

$$\|T_{zw}\|_2^2 = \|G_c B_1 + U Q\|_2^2 = \|G_c B_1\|_2^2 + \|Q\|_2^2;$$

hence,

$$\|Q\|_2^2 = \|T_{zw}\|_2^2 - \|G_c B_1\|_2^2 < \gamma^2 - \|G_c B_1\|_2^2.$$

Likewise, one can show that every controller of the form given in item 3 is admissible and suboptimal. $\Box$

The results for DF, OI, FC, and OE follow from the parallel development of Chapter 12.

Disturbance Feedforward:

$$G_{DF}(s) = \begin{bmatrix} A & B_1 & B_2 \\ C_1 & 0 & D_{12} \\ C_2 & I & 0 \end{bmatrix}$$

This problem inherits the same assumptions (i)-(iii) as the state feedback problem, in addition to the stability of $A - B_1 C_2$.

1. $\min \|T_{zw}\|_2 = \|G_c B_1\|_2$

2. $K(s) = \begin{bmatrix} A + B_2 F_2 - B_1 C_2 & B_1 \\ F_2 & 0 \end{bmatrix}$


3. the set of all transfer matrices from $y$ to $u$ in the linear fractional interconnection $F_\ell(M_{2D}, Q)$,

$$M_{2D}(s) = \begin{bmatrix} A + B_2 F_2 - B_1 C_2 & B_1 & B_2 \\ F_2 & 0 & I \\ -C_2 & I & 0 \end{bmatrix}$$

where $Q \in \mathcal{RH}_2$, $\|Q\|_2^2 < \gamma^2 - \|G_c B_1\|_2^2$

Output Injection:

$$G_{OI}(s) = \begin{bmatrix} A & B_1 & I \\ C_1 & 0 & 0 \\ C_2 & D_{21} & 0 \end{bmatrix}$$

with the following assumptions:

(i) $(C_2, A)$ is detectable;

(ii) $D_{21}$ has full row rank with $\begin{bmatrix} D_{21} \\ \tilde{D}_\perp \end{bmatrix}$ unitary;

(iii) $\begin{bmatrix} A - j\omega I & B_1 \\ C_2 & D_{21} \end{bmatrix}$ has full row rank for all $\omega$.

1. $\min \|T_{zw}\|_2 = \|C_1 G_f\|_2 = (\mathrm{trace}(C_1 Y_2 C_1^*))^{1/2}$

2. $K(s) \doteq L_2$

Full Control:

$$G_{FC}(s) = \begin{bmatrix} A & B_1 & \begin{bmatrix} I & 0 \end{bmatrix} \\ C_1 & 0 & \begin{bmatrix} 0 & I \end{bmatrix} \\ C_2 & D_{21} & \begin{bmatrix} 0 & 0 \end{bmatrix} \end{bmatrix}$$

with the same assumptions as the output injection problem.

1. $\min \|T_{zw}\|_2 = \|C_1 G_f\|_2 = (\mathrm{trace}(C_1 Y_2 C_1^*))^{1/2}$

2. $K(s) \doteq \begin{bmatrix} L_2 \\ 0 \end{bmatrix}$


3. $K(s) \doteq \begin{bmatrix} L_2 \\ Q(s) \end{bmatrix}$, where $Q \in \mathcal{RH}_2$, $\|Q\|_2^2 < \gamma^2 - \|C_1 G_f\|_2^2$

Output Estimation:

$$G_{OE}(s) = \begin{bmatrix} A & B_1 & B_2 \\ C_1 & 0 & I \\ C_2 & D_{21} & 0 \end{bmatrix}$$

The assumptions are taken to be those in the output injection problem plus the additional assumption that $A - B_2 C_1$ is stable.

1. $\min \|T_{zw}\|_2 = \|C_1 G_f\|_2$

2. $K(s) = \begin{bmatrix} A + L_2 C_2 - B_2 C_1 & L_2 \\ C_1 & 0 \end{bmatrix}$

3. the set of all transfer matrices from $y$ to $u$ in the linear fractional interconnection $F_\ell(M_{2O}, Q)$,

$$M_{2O}(s) = \begin{bmatrix} A + L_2 C_2 - B_2 C_1 & L_2 & -B_2 \\ C_1 & 0 & I \\ C_2 & I & 0 \end{bmatrix}$$

where $Q \in \mathcal{RH}_2$, $\|Q\|_2^2 < \gamma^2 - \|C_1 G_f\|_2^2$

14.9 Separation Theory

Given the results for the special problems, we can now prove Theorem 14.7 using separation arguments. This essentially involves reducing the output feedback problem to a combination of the Full Information and the Output Estimation problems.

14.9.1 H2 Controller Structure

Recall that the unique $\mathcal{H}_2$ optimal controller is

$$K_2(s) := \begin{bmatrix} \hat{A}_2 & -L_2 \\ F_2 & 0 \end{bmatrix} = \begin{bmatrix} A + B_2 F_2 + L_2 C_2 & Y_2 C_2^* \\ -B_2^* X_2 & 0 \end{bmatrix}$$

and

$$\min \|T_{zw}\|_2^2 = \|G_c B_1\|_2^2 + \|F_2 G_f\|_2^2$$

where $X_2 := \mathrm{Ric}(H_2)$, $Y_2 := \mathrm{Ric}(J_2)$, and the min is over all stabilizing controllers. Note that $F_2$ is the optimal state feedback in the Full Information problem and $L_2$ is the


optimal output injection in the Full Control case. The well-known separation property of the $\mathcal{H}_2$ solution is reflected in the fact that $K_2$ is exactly the optimal output estimator of $F_2 x$ and can be obtained by setting $C_1 = F_2$ in OE.2. Also, the minimum cost is the sum of the FI cost (FI.1) and the OE cost for estimating $F_2 x$ (OE.1).

The controller equations can be written in standard observer form as

$$\dot{\hat{x}} = A \hat{x} + B_2 u + L_2 (C_2 \hat{x} - y)$$
$$u = F_2 \hat{x}$$

where $\hat{x}$ is the optimal estimate of $x$.
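As a numerical illustration of the observer-form controller and the separation property, the following sketch (hypothetical plant data; `scipy` assumed available) computes both Riccati solutions, assembles $K_2$, and verifies that the closed loop is internally stable.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hypothetical plant data satisfying the standard H2 assumptions
# (B1 D21' = 0, D12'C1 = 0, unit-rank noise/control weights).
A   = np.array([[1.0, 1.0], [0.0, 1.0]])
B1  = np.array([[1.0, 0.0], [1.0, 0.0]])
B2  = np.array([[0.0], [1.0]])
C1  = np.array([[1.0, 0.0], [0.0, 0.0]])
C2  = np.array([[1.0, 0.0]])

# Control and filter Riccati equations: X2 = Ric(H2), Y2 = Ric(J2)
X2 = solve_continuous_are(A, B2, C1.T @ C1, np.eye(1))
Y2 = solve_continuous_are(A.T, C2.T, B1 @ B1.T, np.eye(1))
F2 = -B2.T @ X2
L2 = -Y2 @ C2.T

# Observer form: xhat' = A xhat + B2 u + L2 (C2 xhat - y), u = F2 xhat
Ak = A + B2 @ F2 + L2 @ C2      # the controller A-matrix (A-hat-2)
Bk = -L2
Ck = F2

# Closed-loop A-matrix for the plant interconnected with K2
Acl = np.block([[A, B2 @ Ck], [Bk @ C2, Ak]])
assert np.all(np.linalg.eigvals(Acl).real < 0)   # separation: stable

# Minimum cost from the separation formula (no independent check here)
cost = np.sqrt(np.trace(B1.T @ X2 @ B1) + np.trace(F2 @ Y2 @ F2.T))
print(cost)
```

By separation, the eigenvalues of `Acl` are those of $A + B_2 F_2$ together with those of $A + L_2 C_2$, which is what the stability assertion confirms.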

14.9.2 Proof of Theorem 14.7

As before we define a new control variable, $v := u - F_2 x$, and the transfer function to $z$ becomes

$$z = \begin{bmatrix} A_{F_2} & B_1 & B_2 \\ C_{1F_2} & 0 & D_{12} \end{bmatrix} \begin{bmatrix} w \\ v \end{bmatrix} = G_c B_1 w + U v \qquad (14.20)$$

where $G_c(s) := \begin{bmatrix} A_{F_2} & I \\ C_{1F_2} & 0 \end{bmatrix}$ and $U(s) := \begin{bmatrix} A_{F_2} & B_2 \\ C_{1F_2} & D_{12} \end{bmatrix}$. Furthermore, $U$ is inner (i.e., $U^\sim U = I$) and $U^\sim G_c$ belongs to $\mathcal{RH}_2^\perp$ from Lemma 14.3.

Let $K$ be any admissible controller and notice how $v$ is generated: $v$ and $y$ are the outputs of

$$G_v = \begin{bmatrix} A & B_1 & B_2 \\ -F_2 & 0 & I \\ C_2 & D_{21} & 0 \end{bmatrix}$$

driven by $w$ and $u$, with $u = K y$ closing the loop.

Note that $K$ stabilizes $G$ iff $K$ stabilizes $G_v$ (the two closed-loop systems have identical $A$-matrices) and that $G_v$ has the form of the Output Estimation problem. From (14.20) and the properties of $U$ we have that

$$\min \|T_{zw}\|_2^2 = \|G_c B_1\|_2^2 + \min \|T_{vw}\|_2^2. \qquad (14.21)$$

But from item OE.2, $\|T_{vw}\|_2$ is minimized by the controller

$$\begin{bmatrix} A + B_2 F_2 + L_2 C_2 & -L_2 \\ F_2 & 0 \end{bmatrix},$$

and then from OE.1, $\min \|T_{vw}\|_2 = \|F_2 G_f\|_2$.


14.9.3 Proof of Theorem 14.8

Continuing with the development in the previous proof, we see that the set of all suboptimal controllers equals the set of all $K$'s such that $\|T_{vw}\|_2^2 < \gamma^2 - \|G_c B_1\|_2^2$. Apply item OE.3 to get that such $K$'s are parameterized by the linear fractional interconnection $F_\ell(M_2, Q)$,

$$M_2(s) = \begin{bmatrix} \hat{A}_2 & -L_2 & B_2 \\ F_2 & 0 & I \\ -C_2 & I & 0 \end{bmatrix}$$

with $Q \in \mathcal{RH}_2$, $\|Q\|_2^2 < \gamma^2 - \|G_c B_1\|_2^2 - \|F_2 G_f\|_2^2$.

14.10 Stability Margins of H2 Controllers

We have shown that a system with an LQR controller has at least a $60°$ phase margin and a $6$ dB gain margin. However, it is not clear whether these stability margins will be preserved if the states are not available and the output feedback $\mathcal{H}_2$ (or LQG) controller has to be used. The answer is provided here through a counterexample: there are no guaranteed stability margins for an $\mathcal{H}_2$ controller.

Consider a single-input, single-output, two-state generalized dynamical system:

$$G(s) = \begin{bmatrix} \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} & \begin{bmatrix} \sqrt{\sigma} & 0 \\ \sqrt{\sigma} & 0 \end{bmatrix} & \begin{bmatrix} 0 \\ 1 \end{bmatrix} \\ \begin{bmatrix} \sqrt{q} & \sqrt{q} \\ 0 & 0 \end{bmatrix} & 0 & \begin{bmatrix} 0 \\ 1 \end{bmatrix} \\ \begin{bmatrix} 1 & 0 \end{bmatrix} & \begin{bmatrix} 0 & 1 \end{bmatrix} & 0 \end{bmatrix}.$$

It can be shown analytically that

$$X_2 = \alpha \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}, \qquad Y_2 = \beta \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}$$

and

$$F_2 = -\alpha \begin{bmatrix} 1 & 1 \end{bmatrix}, \qquad L_2 = -\beta \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

where

$$\alpha = 2 + \sqrt{4 + q}, \qquad \beta = 2 + \sqrt{4 + \sigma}.$$


Then the optimal output feedback $\mathcal{H}_2$ controller is given by

$$K_{opt} = \begin{bmatrix} 1-\beta & 1 & \beta \\ -(\alpha+\beta) & 1-\alpha & \beta \\ -\alpha & -\alpha & 0 \end{bmatrix}.$$

Suppose that the resulting closed-loop plant (or $G_{22}$) has a scalar gain $k$ with a nominal value $k = 1$. Then the controller implemented in the system is actually

$$K = k K_{opt},$$

and the closed-loop system $A$-matrix becomes

$$\tilde{A} = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & -k\alpha & -k\alpha \\ \beta & 0 & 1-\beta & 1 \\ \beta & 0 & -(\alpha+\beta) & 1-\alpha \end{bmatrix}.$$

It can be shown that the characteristic polynomial has the form

$$\det(sI - \tilde{A}) = a_4 s^4 + a_3 s^3 + a_2 s^2 + a_1 s + a_0$$

with

$$a_1 = \alpha + \beta - 4 + 2(k-1)\alpha\beta, \qquad a_0 = 1 + (1-k)\alpha\beta.$$

Note that for closed-loop stability it is necessary to have $a_0 > 0$ and $a_1 > 0$. Note also that $a_0 \approx (1-k)\alpha\beta$ and $a_1 \approx 2(k-1)\alpha\beta$ for sufficiently large $\alpha$ and $\beta$ if $k \neq 1$, so one of the two necessary conditions fails whichever way $k$ deviates from 1. It is easy to see that for sufficiently large $\alpha$ and $\beta$ (or $q$ and $\sigma$), the system is unstable for arbitrarily small perturbations in $k$ in either direction. Thus, by choice of $q$ and $\sigma$, the gain margins may be made arbitrarily small.

It is interesting to note that the margins deteriorate as the control weight ($1/q$) gets small (large $q$) and/or the system driving noise gets large (large $\sigma$). In modern control folklore, these have often been considered ad hoc means of improving sensitivity.
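The vanishing margins are easy to confirm numerically. The sketch below (plain `numpy`) uses the closed-form $\alpha$, $\beta$ above to build $\tilde{A}(k)$ and checks that with $q = \sigma = 100$ the nominal loop ($k = 1$) is stable while modest gain perturbations in either direction destabilize it.

```python
import numpy as np

def Atilde(k, q, sigma):
    """Closed-loop A-matrix of the LQG counterexample with loop gain k."""
    a = 2.0 + np.sqrt(4.0 + q)        # alpha
    b = 2.0 + np.sqrt(4.0 + sigma)    # beta
    return np.array([
        [1.0, 1.0,      0.0,     0.0],
        [0.0, 1.0,   -k * a,  -k * a],
        [b,   0.0,  1.0 - b,     1.0],
        [b,   0.0, -(a + b), 1.0 - a],
    ])

def stable(M):
    return bool(np.all(np.linalg.eigvals(M).real < 0))

q = sigma = 100.0
assert stable(Atilde(1.00, q, sigma))       # nominal LQG loop is stable
assert not stable(Atilde(1.03, q, sigma))   # +3% gain: a0 < 0, unstable
assert not stable(Atilde(0.92, q, sigma))   # -8% gain: a1 < 0, unstable

# a0 = 1 + (1-k)*alpha*beta flips sign for tiny |k-1| once alpha*beta is large
a = b = 2.0 + np.sqrt(104.0)
print(1.0 + (1.0 - 1.03) * a * b)           # negative: necessary condition fails
```

A coefficient $a_0 < 0$ or $a_1 < 0$ already violates the necessary condition that all coefficients of a Hurwitz polynomial be positive, so the eigenvalue checks agree with the coefficient formulas in the text.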

It is also important to recognize that vanishing margins are not only associated with open-loop unstable systems. It is easy to construct minimum phase, open-loop stable counterexamples for which the margins are arbitrarily small.

The point of these examples is that $\mathcal{H}_2$ (LQG) solutions, unlike LQR solutions, provide no global system-independent guaranteed robustness properties. Like their more classical colleagues, modern LQG designers are obliged to test their margins for each specific design.

It may, however, be possible to improve the robustness of a given design by relaxing the optimality of the filter (or FC controller) with respect to error properties. A successful approach in this direction is the so-called LQG loop transfer recovery (LQG/LTR) design technique. The idea is to design a filtering gain (or FC control law) in such a way that the LQG (or $\mathcal{H}_2$) control law approximates the loop properties of the regular LQR control. This will not be explored further here; interested readers may consult the related references.


14.11 Notes and References

The detailed treatment of $\mathcal{H}_2$ related theory, LQ optimal control, Kalman filtering, etc., can be found in Anderson and Moore [1990] or Kwakernaak and Sivan [1972].


15 Linear Quadratic Optimization

This chapter considers time domain characterizations of Hankel operators and Toeplitz operators by means of some related quadratic optimizations. These characterizations will be used to prove a max-min result which is the key to the $\mathcal{H}_\infty$ theory considered in the next chapter.

15.1 Hankel Operators

Let $G(s)$ be a stable real rational transfer matrix with a state space realization

$$\dot{x} = A x + B w$$
$$z = C x + D w. \qquad (15.1)$$

Consider first the problem of using an input $w \in \mathcal{L}_{2-}$ to maximize $\|P_+ z\|_2^2$. This is exactly the standard problem of computing the Hankel norm of $G$, i.e., the induced norm of the Hankel operator

$$P_+ M_G : \mathcal{H}_2^\perp \to \mathcal{H}_2,$$

and the norm can be expressed in terms of the controllability Gramian $L_c$ and observability Gramian $L_o$:

$$A L_c + L_c A^* + B B^* = 0, \qquad A^* L_o + L_o A + C^* C = 0.$$


Although this result is well known, we will include a time-domain proof similar in technique to the proofs of the optimal $\mathcal{H}_2$ and $\mathcal{H}_\infty$ control results.

Lemma 15.1 $\inf_{w \in \mathcal{L}_{2-}} \left\{ \|w\|_2^2 \;\big|\; x(0) = x_0 \right\} = x_0^* y_0$, where $y_0$ solves $L_c y_0 = x_0$.

Proof. Assume $(A, B)$ is controllable; otherwise, factor out the uncontrollable subspace. Then $L_c$ is invertible and $y_0 = L_c^{-1} x_0$. Moreover, $w \in \mathcal{L}_{2-}$ can be used to produce any $x(0) = x_0$ given $x(-\infty) = 0$. We need to show

$$\inf_{w \in \mathcal{L}_{2-}} \left\{ \|w\|_2^2 \;\big|\; x(0) = x_0 \right\} = x_0^* L_c^{-1} x_0. \qquad (15.2)$$

To show this, we can differentiate $x(t)^* L_c^{-1} x(t)$ along the solution of (15.1) for any given input $w$ as follows:

$$\frac{d}{dt}(x^* L_c^{-1} x) = \dot{x}^* L_c^{-1} x + x^* L_c^{-1} \dot{x} = x^* (A^* L_c^{-1} + L_c^{-1} A) x + 2 \langle w, B^* L_c^{-1} x \rangle.$$

Using the $L_c$ equation to substitute for $A^* L_c^{-1} + L_c^{-1} A$ and completing the squares gives

$$\frac{d}{dt}(x^* L_c^{-1} x) = \|w\|^2 - \|w - B^* L_c^{-1} x\|^2.$$

Integration from $t = -\infty$ to $t = 0$ with $x(-\infty) = 0$ and $x(0) = x_0$ gives

$$x_0^* L_c^{-1} x_0 = \|w\|_2^2 - \|w - B^* L_c^{-1} x\|_2^2 \leq \|w\|_2^2.$$

If $w(t) = B^* e^{-A^* t} L_c^{-1} x_0 = B^* L_c^{-1} e^{(A + B B^* L_c^{-1}) t} x_0$ on $(-\infty, 0]$, then $w \in \mathcal{L}_{2-}$, $w = B^* L_c^{-1} x$, and equality is achieved, thus proving (15.2). $\Box$

Lemma 15.2 $\sup_{w \in B\mathcal{L}_{2-}} \|P_+ z\|_2^2 = \sup_{w \in B\mathcal{H}_2^\perp} \|P_+ M_G w\|_2^2 = \rho(L_o L_c)$.

Proof. Given $x(0) = x_0$ and $w = 0$ for $t \geq 0$, the norm of $z(t) = C e^{At} x_0$ can be found from

$$\|P_+ z\|_2^2 = \int_0^\infty x_0^* e^{A^* t} C^* C e^{At} x_0 \, dt = x_0^* L_o x_0.$$

Combine this result with Lemma 15.1 to give

$$\sup_{w \in B\mathcal{L}_{2-}} \|P_+ z\|_2^2 = \sup_{0 \neq w \in \mathcal{L}_{2-}} \frac{\|P_+ z\|_2^2}{\|w\|_2^2} = \max_{x_0 \neq 0} \frac{x_0^* L_o x_0}{x_0^* L_c^{-1} x_0} = \rho(L_o L_c). \qquad \Box$$


Remark 15.1 Another useful way to characterize the Hankel norm is to examine the following quadratic optimization with initial condition $x(-\infty) = 0$:

$$\sup_{w \in \mathcal{L}_{2-}} \left\{ \|P_+ z\|_2^2 - \gamma^2 \|w\|_2^2 \right\}.$$

It is easy to see from the definition of the Hankel norm that

$$\sup_{0 \neq w \in \mathcal{L}_{2-}} \frac{\|P_+ z\|_2}{\|w\|_2} \leq \gamma \qquad \text{iff} \qquad \sup_{0 \neq w \in \mathcal{L}_{2-}} \left\{ \|P_+ z\|_2^2 - \gamma^2 \|w\|_2^2 \right\} \leq 0.$$

So the Hankel norm is equal to the smallest $\gamma$ such that the above inequality holds. Now

$$\sup_{w \in \mathcal{L}_{2-}} \left\{ \|P_+ z\|_2^2 - \gamma^2 \|w\|_2^2 \right\} = \sup_{x_0 \in \mathbb{R}^n} \left( x_0^* L_o x_0 - \gamma^2 x_0^* L_c^{-1} x_0 \right) = \begin{cases} 0, & \rho(L_o L_c) \leq \gamma^2; \\ +\infty, & \rho(L_o L_c) > \gamma^2. \end{cases}$$

Hence the Hankel norm is equal to the square root of $\rho(L_o L_c)$.
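Lemma 15.2 and the remark give a direct recipe for computing the Hankel norm: two Lyapunov solves and a spectral radius. A minimal sketch (`scipy` assumed available), checked against $G(s) = 1/(s+1)$, whose Gramians are both $1/2$, so the Hankel norm is $1/2$:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def hankel_norm(A, B, C):
    """Hankel norm of C(sI - A)^{-1}B (A stable): sqrt(rho(Lo Lc))."""
    Lc = solve_continuous_lyapunov(A, -B @ B.T)    # A Lc + Lc A' + B B' = 0
    Lo = solve_continuous_lyapunov(A.T, -C.T @ C)  # A' Lo + Lo A + C' C = 0
    return np.sqrt(np.max(np.abs(np.linalg.eigvals(Lo @ Lc))))

# G(s) = 1/(s+1): Lc = Lo = 1/2, so the Hankel norm is 1/2.
A = np.array([[-1.0]])
B = np.array([[1.0]])
C = np.array([[1.0]])
assert abs(hankel_norm(A, B, C) - 0.5) < 1e-9
```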

15.2 Toeplitz Operators

If a transfer matrix $G \in \mathcal{RH}_\infty$ and $\|G\|_\infty < 1$, then by Corollary 13.24, the Hamiltonian matrix

$$H = \begin{bmatrix} A + B R^{-1} D^* C & B R^{-1} B^* \\ -C^* (I + D R^{-1} D^*) C & -(A + B R^{-1} D^* C)^* \end{bmatrix}, \qquad R = \gamma^2 I - D^* D$$

with $\gamma = 1$ is in $\mathrm{dom}(\mathrm{Ric})$, $X = \mathrm{Ric}(H) \geq 0$, $A + B R^{-1}(B^* X + D^* C)$ is stable, and

$$A^* X + X A + (X B + C^* D) R^{-1} (B^* X + D^* C) + C^* C = 0. \qquad (15.3)$$

The following lemma offers yet another consequence of $\|G\|_\infty < 1$. (Recall that the $\mathcal{H}_\infty$ norm of a stable matrix is the Toeplitz operator norm.)

Lemma 15.3 Suppose $\|G\|_\infty < 1$ and $x(0) = x_0$. Then

$$\sup_{w \in \mathcal{L}_{2+}} \left( \|z\|_2^2 - \|w\|_2^2 \right) = x_0^* X x_0$$

and the sup is achieved.


Proof. We can differentiate $x(t)^* X x(t)$ as in the last section, use the Riccati equation (15.3) to substitute for $A^* X + X A$, and complete the squares to get

$$\frac{d}{dt}(x^* X x) = -\|z\|^2 + \|w\|^2 - \left\| R^{-1/2} \left[ R w - (B^* X + D^* C) x \right] \right\|^2.$$

If $w \in \mathcal{L}_{2+}$, then $x \in \mathcal{L}_{2+}$, so integrating from $t = 0$ to $t = \infty$ gives

$$\|z\|_2^2 - \|w\|_2^2 = x_0^* X x_0 - \left\| R^{-1/2} \left[ R w - (B^* X + D^* C) x \right] \right\|_2^2 \leq x_0^* X x_0. \qquad (15.4)$$

If we let $w = R^{-1}(B^* X + D^* C) x = R^{-1}(B^* X + D^* C) e^{[A + B R^{-1}(B^* X + D^* C)] t} x_0$, then $w \in \mathcal{L}_{2+}$ because $A + B R^{-1}(B^* X + D^* C)$ is stable. Thus the inequality in (15.4) can be made an equality and the proof is complete. Note that the sup is achieved for a $w$ which is a linear function of the state. $\Box$
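The stabilizing solution $X = \mathrm{Ric}(H)$ used above can be computed from the stable invariant subspace of $H$. A minimal sketch (hypothetical scalar data; `scipy` assumed available): take an ordered Schur decomposition of $H$ so the left-half-plane eigenvalues come first, set $X = X_2 X_1^{-1}$, and check the result against the Riccati equation (15.3) with $D = 0$.

```python
import numpy as np
from scipy.linalg import schur

def ric(H):
    """Stabilizing Riccati solution from the stable invariant subspace of H."""
    n = H.shape[0] // 2
    T, Z, _ = schur(H, sort='lhp')     # reorder: stable eigenvalues first
    X1 = Z[:n, :n]                     # X_-(H) = Im [X1; X2]
    X2 = Z[n:, :n]
    return X2 @ np.linalg.inv(X1)

# G(s) = 0.5/(s+1): A = -1, B = 1, C = 0.5, D = 0, so ||G||_inf = 0.5 < 1, R = I.
A = np.array([[-1.0]])
B = np.array([[1.0]])
C = np.array([[0.5]])
H = np.block([[A, B @ B.T], [-C.T @ C, -A.T]])
X = ric(H)

# Check (15.3) with D = 0:  A'X + XA + X B B' X + C'C = 0  and  X >= 0
res = A.T @ X + X @ A + X @ B @ B.T @ X + C.T @ C
assert np.allclose(res, 0)
assert X[0, 0] >= 0
```

Here the closed-form answer is $X = 1 - \sqrt{3}/2 \approx 0.134$, which the residual check confirms without relying on it.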

As a direct consequence of this lemma, we have the following

Corollary 15.4 Let $x(0) = x_0$ and $X_0 = X_0^* > 0$.

(i) If $\|G\|_\infty < \gamma$ and $X = \mathrm{Ric}(H)$, then

$$\sup_{0 \neq (x_0, w) \in \mathbb{R}^n \times \mathcal{L}_2[0,\infty)} \left\{ \|z\|_2^2 - \gamma^2 \left( \|w\|_2^2 + x_0^* X_0 x_0 \right) \right\} = \sup_{0 \neq x_0 \in \mathbb{R}^n} \left\{ x_0^* X x_0 - \gamma^2 x_0^* X_0 x_0 \right\} \begin{cases} < 0, & \lambda_{\max}(X - \gamma^2 X_0) < 0; \\ = 0, & \lambda_{\max}(X - \gamma^2 X_0) = 0; \\ = +\infty, & \lambda_{\max}(X - \gamma^2 X_0) > 0. \end{cases}$$

(ii) $\sup_{0 \neq (x_0, w) \in \mathbb{R}^n \times \mathcal{L}_2[0,\infty)} \dfrac{\|P_+ z\|_2^2}{\|w\|_2^2 + x_0^* X_0 x_0} < \gamma^2$ if and only if $\bar{\sigma}(D) < \gamma$, $H \in \mathrm{dom}(\mathrm{Ric})$, and $\lambda_{\max}(X - \gamma^2 X_0) < 0$.

Remark 15.2 The matrix $X_0$ has the interpretation of the confidence in the initial condition $x_0$. So if $\underline{\sigma}(X_0)$ is small, then the initial condition is probably not known very well. In that case $\gamma_{\min}$ will be large, where $\gamma_{\min}$ denotes the smallest $\gamma$ such that $\lambda_{\max}(X - \gamma^2 X_0) \leq 0$. On the other hand, a large $\underline{\sigma}(X_0)$ implies that the initial condition is known very well and that $\gamma_{\min}$ is determined essentially by the condition $H \in \mathrm{dom}(\mathrm{Ric})$.

A dual version of Lemma 15.3 can also be obtained and is useful in characterizing the so-called $2 \times 2$-block mixed Hankel-Toeplitz operator. To do that, we first note that¹

$$G^T(s) = \begin{bmatrix} A^* & C^* \\ B^* & D^* \end{bmatrix}$$

¹Note that since the system matrices are real, $A^T = A^*$, $B^T = B^*$, etc. The conjugate transpose is used here for the transpose for the sake of consistency in notation.


and $\|G\|_\infty < 1$ iff $\|G^T\|_\infty < 1$. Let $J$ denote the following Hamiltonian matrix

$$J = \begin{bmatrix} A^* & 0 \\ -B B^* & -A \end{bmatrix} + \begin{bmatrix} C^* \\ -B D^* \end{bmatrix} \tilde{R}^{-1} \begin{bmatrix} D B^* & C \end{bmatrix}$$

where $\tilde{R} := I - D D^*$. Then $J \in \mathrm{dom}(\mathrm{Ric})$, $Y = \mathrm{Ric}(J) \geq 0$, $A + (Y C^* + B D^*) \tilde{R}^{-1} C$ is stable, and

$$A Y + Y A^* + (Y C^* + B D^*) \tilde{R}^{-1} (C Y + D B^*) + B B^* = 0 \qquad (15.5)$$

if $\|G\|_\infty < 1$. For simplicity, we shall assume that $(A, B)$ is controllable; hence, $Y > 0$. The case in which $Y$ is singular can be obtained by restricting $x_0 \in \mathrm{Im}(Y)$ and replacing $Y^{-1}$ by $Y^+$.

Lemma 15.5 Suppose $\|G\|_\infty < 1$ and $(A, B)$ is controllable. Then

$$\sup_{w \in \mathcal{L}_{2-}} \left\{ \|P_- z\|_2^2 - \|w\|_2^2 \;\big|\; x(0) = x_0 \right\} = -x_0^* Y^{-1} x_0$$

and the sup is achieved.

Proof. Analogous to the proof of Lemma 15.3, we can differentiate $x(t)^* Y^{-1} x(t)$, use the Riccati equation (15.5) to substitute for $A Y + Y A^*$, and complete the squares to get

$$\frac{d}{dt}(x^* Y^{-1} x) = -\|z\|^2 + \|w\|^2 - \left\| R^{-1/2} \left[ R w - (B^* Y^{-1} + D^* C) x \right] \right\|^2$$

where $R = I - D^* D > 0$. If $w \in \mathcal{L}_{2-}$, then $x \in \mathcal{L}_{2-}$ and $x(-\infty) = 0$; so integrating from $t = -\infty$ to $t = 0$ gives

$$\|z\|_2^2 - \|w\|_2^2 = -x_0^* Y^{-1} x_0 - \left\| R^{-1/2} \left[ R w - (B^* Y^{-1} + D^* C) x \right] \right\|_2^2 \leq -x_0^* Y^{-1} x_0. \qquad (15.6)$$

If we let $w = R^{-1}(B^* Y^{-1} + D^* C) x = R^{-1}(B^* Y^{-1} + D^* C) e^{[A + B R^{-1}(B^* Y^{-1} + D^* C)] t} x_0$, then $w \in \mathcal{L}_{2-}$ because $A + B R^{-1}(B^* Y^{-1} + D^* C) = -Y \{ A + (Y C^* + B D^*) \tilde{R}^{-1} C \}^* Y^{-1}$ and $A + (Y C^* + B D^*) \tilde{R}^{-1} C$ is stable. Thus the inequality can be made an equality and the proof is complete. $\Box$

15.3 Mixed Hankel-Toeplitz Operators

Now suppose that the input is partitioned so that $B = \begin{bmatrix} B_1 & B_2 \end{bmatrix}$, $D = \begin{bmatrix} D_1 & D_2 \end{bmatrix}$,

$$G(s) = \begin{bmatrix} G_1(s) & G_2(s) \end{bmatrix} \in \mathcal{RH}_\infty,$$


and $w$ is partitioned conformably. Then $\|G_2\|_\infty < 1$ iff $\bar{\sigma}(D_2) < 1$ and

$$H_W := \begin{bmatrix} A & 0 \\ -C^* C & -A^* \end{bmatrix} + \begin{bmatrix} B_2 \\ -C^* D_2 \end{bmatrix} R_2^{-1} \begin{bmatrix} D_2^* C & B_2^* \end{bmatrix}$$

is in $\mathrm{dom}(\mathrm{Ric})$, where $R_2 := I - D_2^* D_2 > 0$. For $H_W \in \mathrm{dom}(\mathrm{Ric})$, define $W = \mathrm{Ric}(H_W)$.

Let

$$w \in \mathcal{W} := \mathcal{H}_2^\perp \oplus \mathcal{L}_2 = \left\{ \begin{bmatrix} w_1 \\ w_2 \end{bmatrix} \;\middle|\; w_1 \in \mathcal{H}_2^\perp, \; w_2 \in \mathcal{L}_2 \right\}. \qquad (15.7)$$

We are interested in a test for $\sup_{w \in B\mathcal{W}} \|P_+ z\|_2 < 1$, or, equivalently,

$$\sup_{w \in B\mathcal{W}} \|\Gamma w\|_2 < 1 \qquad (15.8)$$

where $\Gamma = P_+ \begin{bmatrix} M_{G_1} & M_{G_2} \end{bmatrix} : \mathcal{W} \to \mathcal{H}_2$ is a mixed Hankel-Toeplitz operator:

$$\Gamma \begin{bmatrix} w_1 \\ w_2 \end{bmatrix} = P_+ \begin{bmatrix} G_1 & G_2 \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}, \qquad w_1 \in \mathcal{H}_2^\perp, \; w_2 \in \mathcal{L}_2$$
$$= P_+ \begin{bmatrix} G_1 & G_2 \end{bmatrix} P_- w + P_+ G_2 P_+ w_2.$$

Thus $\Gamma$ is the sum of the Hankel operator $P_+ M_G : \mathcal{H}_2^\perp \oplus \mathcal{H}_2^\perp \to \mathcal{H}_2$ and the Toeplitz operator $P_+ M_{G_2} : \mathcal{H}_2 \to \mathcal{H}_2$. The following lemma generalizes Corollary 13.24 ($B_1 = 0$, $D_1 = 0$) and Lemma 15.2 ($B_2 = 0$, $D_2 = 0$).

Lemma 15.6 $\sup_{w \in B\mathcal{W}} \|\Gamma w\|_2 < 1$ iff the following two conditions hold:

(i) $\bar{\sigma}(D_2) < 1$ and $H_W \in \mathrm{dom}(\mathrm{Ric})$;

(ii) $\rho(W L_c) < 1$.

Proof. By Corollary 13.24, condition (i) is necessary for (15.8), so we will prove that, given condition (i), (15.8) holds iff condition (ii) holds. We will do this by showing, equivalently, that $\rho(W L_c) \geq 1$ iff $\sup_{w \in B\mathcal{W}} \|\Gamma w\|_2 \geq 1$. By definition of $\mathcal{W}$, if $w \in \mathcal{W}$ then

$$\|P_+ z\|_2^2 - \|w\|_2^2 = \|P_+ z\|_2^2 - \|P_+ w_2\|_2^2 - \|P_- w\|_2^2.$$

Note that the last term only contributes to $\|P_+ z\|_2^2$ through $x(0)$. Thus if $L_c$ is invertible, then Lemmas 15.1 and 15.3 yield

$$\sup_{w \in \mathcal{W}} \left\{ \|P_+ z\|_2^2 - \|w\|_2^2 \;\big|\; x(0) = x_0 \right\} = x_0^* W x_0 - x_0^* L_c^{-1} x_0 \qquad (15.9)$$

and the supremum is achieved for some $w \in \mathcal{W}$ that can be constructed from the previous lemmas. Since $\rho(W L_c) \geq 1$ iff there exists $x_0 \neq 0$ such that the right-hand side of


(15.9) is $\geq 0$, we have, by (15.9), that $\rho(W L_c) \geq 1$ iff there exists $w \in \mathcal{W}$, $w \neq 0$, such that $\|P_+ z\|_2^2 \geq \|w\|_2^2$. But this is true iff $\sup_{w \in B\mathcal{W}} \|\Gamma w\|_2 \geq 1$.

If $L_c$ is not invertible, we need only restrict $x_0$ in (15.9) to $\mathrm{Im}(L_c)$, and then the above argument generalizes in a straightforward way. $\Box$

In the differential game problem considered later and in the $\mathcal{H}_\infty$ optimal control problem, we will make use of the adjoint $\Gamma^* : \mathcal{H}_2 \to \mathcal{W}$, which is given by

$$\Gamma^* z = \begin{bmatrix} P_-(G_1^\sim z) \\ G_2^\sim z \end{bmatrix} = \begin{bmatrix} P_- G_1^\sim \\ G_2^\sim \end{bmatrix} z \qquad (15.10)$$

where $P_- G^\sim z := P_-(G^\sim z) = (P_- M_{G^\sim}) z$. That the expression in (15.10) is actually the adjoint of $\Gamma$ is easily verified from the definition of the inner product on vector-valued $\mathcal{L}_2$ space. The adjoint of $\Gamma : \mathcal{W} \to \mathcal{H}_2$ is the operator $\Gamma^* : \mathcal{H}_2 \to \mathcal{W}$ such that $\langle z, \Gamma w \rangle = \langle \Gamma^* z, w \rangle$ for all $w \in \mathcal{W}$, $z \in \mathcal{H}_2$. By definition, we have

$$\langle z, \Gamma w \rangle = \langle z, P_+(G_1 w_1 + G_2 w_2) \rangle = \langle z, G_1 w_1 \rangle + \langle z, G_2 w_2 \rangle = \langle P_-(G_1^\sim z), w_1 \rangle + \langle G_2^\sim z, w_2 \rangle = \langle \Gamma^* z, w \rangle.$$

The mixed Hankel-Toeplitz operator just studied is the so-called $2 \times 1$-block mixed Hankel-Toeplitz operator. There is a $2 \times 2$-block generalization.

15.4 Mixed Hankel-Toeplitz Operators: The General Case*

Historically, the mixed Hankel-Toeplitz operators have played important roles in $\mathcal{H}_\infty$ theory, so it is interesting to consider the $2 \times 2$-block generalization of Lemma 15.6. In fact, the whole $\mathcal{H}_\infty$ control theory can be developed using these tools. See Section 17.7 in Chapter 17. The proof of Lemma 15.7 below is completely straightforward and fairly short, given the other results in the previous sections. Suppose that

$$G(s) = \begin{bmatrix} G_{11}(s) & G_{12}(s) \\ G_{21}(s) & G_{22}(s) \end{bmatrix} = \begin{bmatrix} A & B_1 & B_2 \\ C_1 & D_{11} & D_{12} \\ C_2 & D_{21} & D_{22} \end{bmatrix} =: \begin{bmatrix} A & B \\ C & D \end{bmatrix} \in \mathcal{RH}_\infty.$$

Denote

$$D_{\cdot 2} := \begin{bmatrix} D_{12} \\ D_{22} \end{bmatrix}, \qquad D_{2 \cdot} := \begin{bmatrix} D_{21} & D_{22} \end{bmatrix},$$
$$R_x := I - D_{\cdot 2}^* D_{\cdot 2}, \qquad R_y := I - D_{2 \cdot} D_{2 \cdot}^*.$$


$$H_X := \begin{bmatrix} A & 0 \\ -C^* C & -A^* \end{bmatrix} + \begin{bmatrix} B_2 \\ -C^* D_{\cdot 2} \end{bmatrix} R_x^{-1} \begin{bmatrix} D_{\cdot 2}^* C & B_2^* \end{bmatrix}$$

$$H_Y := \begin{bmatrix} A^* & 0 \\ -B B^* & -A \end{bmatrix} + \begin{bmatrix} C_2^* \\ -B D_{2 \cdot}^* \end{bmatrix} R_y^{-1} \begin{bmatrix} D_{2 \cdot} B^* & C_2 \end{bmatrix}.$$

Define $\mathcal{W} = \mathcal{H}_2^\perp \oplus \mathcal{L}_2$, $\mathcal{Z} = \mathcal{H}_2 \oplus \mathcal{L}_2$, and $\Gamma : \mathcal{W} \to \mathcal{Z}$ as

$$\Gamma \begin{bmatrix} w_1 \\ w_2 \end{bmatrix} = \begin{bmatrix} P_+ & 0 \\ 0 & I \end{bmatrix} \begin{bmatrix} G_{11} & G_{12} \\ G_{21} & G_{22} \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}.$$

Lemma 15.7 $\sup_{w \in B\mathcal{W}} \|\Gamma w\|_2 < 1$ holds iff the following three conditions hold:

(i) $\bar{\sigma}(D_{\cdot 2}) < 1$ and $H_X \in \mathrm{dom}(\mathrm{Ric})$;

(ii) $\bar{\sigma}(D_{2 \cdot}) < 1$ and $H_Y \in \mathrm{dom}(\mathrm{Ric})$;

(iii) $\rho(X Y) < 1$ for $X = \mathrm{Ric}(H_X)$ and $Y = \mathrm{Ric}(H_Y)$.

Proof. The mixed Hankel-Toeplitz operator can be written as

$$\Gamma \begin{bmatrix} w_1 \\ w_2 \end{bmatrix} = P_+ G \begin{bmatrix} w_1 \\ w_2 \end{bmatrix} + \begin{bmatrix} 0 \\ P_- \begin{bmatrix} G_{21} & G_{22} \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \end{bmatrix} \end{bmatrix}.$$

Hence

$$\|\Gamma w\|_2^2 = \|P_+ G w\|_2^2 + \left\| P_- \begin{bmatrix} G_{21} & G_{22} \end{bmatrix} w \right\|_2^2.$$

So $\sup_{w \in B\mathcal{W}} \|\Gamma w\| < 1$ implies

$$\sup_{w \in B\mathcal{W}} \|P_+ G w\| < 1 \qquad \text{and} \qquad \sup_{w \in B\mathcal{W}} \left\| P_- \begin{bmatrix} G_{21} & G_{22} \end{bmatrix} w \right\| < 1.$$

But

$$\left\| \begin{bmatrix} G_{12} \\ G_{22} \end{bmatrix} \right\|_\infty = \sup_{w_2 \in B\mathcal{H}_2} \left\| P_+ G \begin{bmatrix} 0 \\ w_2 \end{bmatrix} \right\|_2 \leq \sup_{w \in B\mathcal{W}} \|P_+ G w\| < 1$$

and

$$\left\| \begin{bmatrix} G_{21} & G_{22} \end{bmatrix} \right\|_\infty = \sup_{w \in B\mathcal{H}_2^\perp} \left\| P_- \begin{bmatrix} G_{21} & G_{22} \end{bmatrix} w \right\|_2 \leq \sup_{w \in B\mathcal{W}} \left\| P_- \begin{bmatrix} G_{21} & G_{22} \end{bmatrix} w \right\| < 1.$$

These two inequalities then imply that (i) and (ii) are necessary. Analogous to the proof of Lemma 15.6, we will show that, given conditions (i) and (ii), $\sup_{w \in B\mathcal{W}} \|\Gamma w\|_2 < 1$


holds iff condition (iii) holds. We will do this by showing, equivalently, that $\rho(X Y) \geq 1$ iff $\sup_{w \in B\mathcal{W}} \|\Gamma w\|_2 \geq 1$. By definition of $\mathcal{W}$, if $w \in \mathcal{W}$ then

$$\|\Gamma w\|_2^2 - \|w\|_2^2 = \left( \|P_+ z\|_2^2 - \|P_+ w_2\|_2^2 \right) + \left( \left\| P_- \begin{bmatrix} G_{21} & G_{22} \end{bmatrix} w \right\|_2^2 - \|P_- w\|_2^2 \right).$$

Thus if $Y$ is invertible, then Lemmas 15.3 and 15.5 yield

$$\sup_{w \in \mathcal{W}} \left\{ \|\Gamma w\|_2^2 - \|w\|_2^2 \;\big|\; x(0) = x_0 \right\} = x_0^* X x_0 - x_0^* Y^{-1} x_0.$$

Now the same arguments as in the proof of Lemma 15.6 give the desired conclusion. $\Box$

15.5 Linear Quadratic Max-Min Problem

Consider the dynamical system

$$\dot{x} = A x + B_1 w + B_2 u \qquad (15.11)$$
$$z = C_1 x + D_{12} u \qquad (15.12)$$

with the following assumptions:

(i) $(C_1, A)$ is observable;

(ii) $(A, B_2)$ is stabilizable;

(iii) $D_{12}^* \begin{bmatrix} C_1 & D_{12} \end{bmatrix} = \begin{bmatrix} 0 & I \end{bmatrix}$.

In this section, we are interested in answering the following question: when is

$$\sup_{w \in B\mathcal{L}_{2+}} \min_{u \in \mathcal{L}_{2+}} \|z\|_2 < 1?$$

Remark 15.3 This max-min problem is a game problem in the sense that $u$ is chosen to minimize the quadratic norm of $z$ and $w$ is chosen to maximize the norm. In other words, inputs $u$ and $w$ act as "opposing players". This linear quadratic max-min problem can be reformulated in the traditional fashion as in the previous sections:

$$\sup_{w \in \mathcal{L}_{2+}} \min_{u \in \mathcal{L}_{2+}} \int_0^\infty \left( \|z\|^2 - \|w\|^2 \right) dt = \sup_{w \in \mathcal{L}_{2+}} \min_{u \in \mathcal{L}_{2+}} \left( \|z\|_2^2 - \|w\|_2^2 \right).$$

A conventional game problem setup would be to consider the min-max problem, i.e., switching the order of sup and min. However, it will be seen that they are equivalent and that a saddle point exists. By saying that, we would like to warn readers that this may not be true in the general case where $z = C_1 x + D_{11} w + D_{12} u$ and $D_{11} \neq 0$. In that case, it is possible that $\sup_w \inf_u < \inf_u \sup_w$. This will be elaborated in Chapter 17.

It should also be pointed out that the results presented here still hold, subject to some minor modifications, if the assumptions (i) and (iii) on the dynamical system are relaxed to:


(i)' $\begin{bmatrix} A - j\omega I & B_2 \\ C_1 & D_{12} \end{bmatrix}$ has full column rank for all $\omega \in \mathbb{R}$, and

(iii)' $D_{12}$ has full column rank.

It is clear from the assumptions that $H_2 \in \mathrm{dom}(\mathrm{Ric})$ and $X_2 = \mathrm{Ric}(H_2) > 0$, where

$$H_2 = \begin{bmatrix} A & -B_2 B_2^* \\ -C_1^* C_1 & -A^* \end{bmatrix}.$$

Let $F_2 = -B_2^* X_2$ and $D_\perp$ be such that $\begin{bmatrix} D_{12} & D_\perp \end{bmatrix}$ is an orthogonal matrix. Define

$$G_c(s) := \begin{bmatrix} A_{F_2} & I \\ C_{1F_2} & 0 \end{bmatrix}, \qquad U(s) := \begin{bmatrix} A_{F_2} & B_2 \\ C_{1F_2} & D_{12} \end{bmatrix}$$

and

$$U_\perp = \begin{bmatrix} A_{F_2} & -X_2^{-1} C_1^* D_\perp \\ C_{1F_2} & D_\perp \end{bmatrix} \qquad (15.13)$$

where $A_{F_2} = A + B_2 F_2$ and $C_{1F_2} = C_1 + D_{12} F_2$. The following is easily proven using Lemma 13.29 by obtaining a state-space realization and by eliminating uncontrollable states using a little algebra involving the Riccati equation for $X_2$.

Lemma 15.8 $\begin{bmatrix} U & U_\perp \end{bmatrix}$ is square and inner, and a realization for $G_c^\sim \begin{bmatrix} U & U_\perp \end{bmatrix}$ is

$$G_c^\sim \begin{bmatrix} U & U_\perp \end{bmatrix} = \begin{bmatrix} A_{F_2} & -B_2 & X_2^{-1} C_1^* D_\perp \\ X_2 & 0 & 0 \end{bmatrix} \in \mathcal{RH}_2. \qquad (15.14)$$

This implies that $U$ and $U_\perp$ are each inner and that both $U_\perp^\sim G_c$ and $U^\sim G_c$ are in $\mathcal{RH}_2^\perp$. To answer our earlier question, define a Hamiltonian matrix $H_\infty$ and the associated Riccati equation as

$$H_\infty := \begin{bmatrix} A & B_1 B_1^* - B_2 B_2^* \\ -C_1^* C_1 & -A^* \end{bmatrix}$$

$$A^* X_\infty + X_\infty A + X_\infty B_1 B_1^* X_\infty - X_\infty B_2 B_2^* X_\infty + C_1^* C_1 = 0.$$

Theorem 15.9 $\sup_{w \in B\mathcal{L}_{2+}} \min_{u \in \mathcal{L}_{2+}} \|z\|_2 < 1$ if and only if $H_\infty \in \mathrm{dom}(\mathrm{Ric})$ and $X_\infty = \mathrm{Ric}(H_\infty) > 0$. Furthermore, if the condition is satisfied, then $u = F_\infty x$ with $F_\infty := -B_2^* X_\infty$ is an optimal control.


Proof. (⇒) As in Chapter 14, define $\nu := u - F_2 x$ to get

$$z = G_c B_1 w + U \nu.$$

Then the hypothesis implies that

$$\sup_{w \in B\mathcal{H}_2} \min_{\nu \in \mathcal{H}_2} \|z\|_2 < 1. \qquad (15.15)$$

Since by Lemma 15.8 $\begin{bmatrix} U & U_\perp \end{bmatrix}$ is square and inner, $\|z\|_2 = \left\| \begin{bmatrix} U & U_\perp \end{bmatrix}^\sim z \right\|_2$, and

$$\begin{bmatrix} U & U_\perp \end{bmatrix}^\sim z = \begin{bmatrix} U^\sim G_c B_1 w + \nu \\ U_\perp^\sim G_c B_1 w \end{bmatrix} = \begin{bmatrix} P_-(U^\sim G_c B_1 w) + P_+(U^\sim G_c B_1 w + \nu) \\ U_\perp^\sim G_c B_1 w \end{bmatrix}.$$

Thus

$$\sup_{w \in B\mathcal{H}_2} \min_{\nu \in \mathcal{H}_2} \|z\|_2 = \sup_{w \in B\mathcal{H}_2} \min_{\nu \in \mathcal{H}_2} \left\| \begin{bmatrix} P_-(U^\sim G_c B_1 w) + P_+(U^\sim G_c B_1 w + \nu) \\ U_\perp^\sim G_c B_1 w \end{bmatrix} \right\|_2,$$

and the right-hand side of the above equation is minimized by $\nu = -P_+(U^\sim G_c B_1 w)$; we have

$$\sup_{w \in B\mathcal{H}_2} \min_{\nu \in \mathcal{H}_2} \|z\|_2 = \sup_{w \in B\mathcal{H}_2} \left\| \begin{bmatrix} P_-(U^\sim G_c B_1 w) \\ U_\perp^\sim G_c B_1 w \end{bmatrix} \right\|_2 =: \sup_{w \in B\mathcal{H}_2} \|\Gamma^* w\|_2 < 1$$

where $\Gamma^* : \mathcal{L}_{2+} \to \mathcal{W}$ is defined as

$$\Gamma^* w = \begin{bmatrix} P_-(U^\sim G_c B_1 w) \\ U_\perp^\sim G_c B_1 w \end{bmatrix} = \begin{bmatrix} P_- U^\sim \\ U_\perp^\sim \end{bmatrix} G_c B_1 w$$

with

$$\mathcal{W} := \left\{ \begin{bmatrix} q_1 \\ q_2 \end{bmatrix} \;\middle|\; q_1 \in \mathcal{H}_2^\perp, \; q_2 \in \mathcal{L}_2 \right\}.$$

Note that from equation (15.10) the adjoint operator $\Gamma : \mathcal{W} \to \mathcal{H}_2$ is given by

$$\Gamma \begin{bmatrix} q_1 \\ q_2 \end{bmatrix} = P_+\big( B_1^* G_c^\sim (U q_1 + U_\perp q_2) \big) = P_+ B_1^* G_c^\sim \begin{bmatrix} U & U_\perp \end{bmatrix} \begin{bmatrix} q_1 \\ q_2 \end{bmatrix}$$

where $G_c^\sim \begin{bmatrix} U & U_\perp \end{bmatrix} \in \mathcal{RH}_2$ has the realization in (15.14). So we have

$$\sup_{q \in B\mathcal{W}} \|\Gamma q\|_2 < 1.$$


This is just the condition (15.8), so from Lemma 15.6 and equation (15.14) we have that

$$H_W := \begin{bmatrix} A_{F_2} & X_2^{-1} C_1^* C_1 X_2^{-1} \\ -X_2 B_1 B_1^* X_2 & -A_{F_2}^* \end{bmatrix} \in \mathrm{dom}(\mathrm{Ric})$$

and $W = \mathrm{Ric}(H_W) \geq 0$. Note that the observability of $(C_1, A)$ implies $X_2 > 0$. Furthermore, the controllability Gramian for the system (15.14) is $X_2^{-1}$ since

$$A_{F_2} X_2^{-1} + X_2^{-1} A_{F_2}^* + B_2 B_2^* + X_2^{-1} C_1^* C_1 X_2^{-1} = 0.$$

Lemma 15.6 also implies $\rho(W X_2^{-1}) < 1$ or, equivalently, $X_2 > W$. Using the Riccati equation for $X_2$, one can verify that

$$T := \begin{bmatrix} -I & X_2^{-1} \\ -X_2 & 0 \end{bmatrix}$$

provides a similarity transformation between $H_\infty$ and $H_W$, i.e., $H_\infty = T H_W T^{-1}$. Then

$$\mathcal{X}_-(H_\infty) = T \mathcal{X}_-(H_W) = T \, \mathrm{Im} \begin{bmatrix} I \\ W \end{bmatrix} = \mathrm{Im} \begin{bmatrix} I - X_2^{-1} W \\ X_2 \end{bmatrix},$$

so $H_\infty \in \mathrm{dom}(\mathrm{Ric})$ and $X_\infty = X_2 (X_2 - W)^{-1} X_2 > 0$.

(⇐) If $H_\infty \in \mathrm{dom}(\mathrm{Ric})$ and $X_\infty = \mathrm{Ric}(H_\infty) > 0$, then $A + B_1 B_1^* X_\infty - B_2 B_2^* X_\infty$ is stable. Define

$$A_{F_\infty} := A + B_2 F_\infty, \qquad C_{1F_\infty} := C_1 + D_{12} F_\infty.$$

Then the Riccati equation can be written as

$$A_{F_\infty}^* X_\infty + X_\infty A_{F_\infty} + C_{1F_\infty}^* C_{1F_\infty} + X_\infty B_1 B_1^* X_\infty = 0.$$

We conclude from Lyapunov theory that $A_{F_\infty}$ is stable since $(A_{F_\infty}, B_1^* X_\infty)$ is detectable and $X_\infty > 0$. Now with the given control law $u = F_\infty x$, the dynamical system becomes

$$\dot{x} = A_{F_\infty} x + B_1 w$$
$$z = C_{1F_\infty} x.$$

So by Corollary 13.24, $\|T_{zw}\|_\infty < 1$, i.e., $\sup_{w \in B\mathcal{L}_{2+}} \|z\|_2 < 1$ for the given control law. $\Box$

Theorem 15.9 will be used in the next chapter to solve the FI $\mathcal{H}_\infty$ control problem.

15.6 Notes and References

Differential games are a well-studied topic; see, e.g., Bryson and Ho [1975]. The paper by Mageirou and Ho [1977] is one of the early papers relevant to the topics covered in this chapter. The current setup and proof are taken from Doyle, Glover, Khargonekar, and Francis [1989]. The application of game theoretic results to $\mathcal{H}_\infty$ problems can be found in Başar and Bernhard [1991], Khargonekar, Petersen, and Zhou [1990], Limebeer, Anderson, Khargonekar, and Green [1992], and references therein.


16 H∞ Control: Simple Case

In this chapter we consider $\mathcal{H}_\infty$ control theory. Specifically, we formulate the optimal and suboptimal $\mathcal{H}_\infty$ control problems in section 16.1. However, we will focus on the suboptimal case in this book and discuss why we do so. In section 16.2 all suboptimal controllers are characterized for a class of simplified problems, while the more general problems are left to the next chapter. Some preliminary analysis is given in section 16.3 for the output feedback results. The analysis suggests the need for solving the Full Information and Output Estimation problems, which are the topics of sections 16.4-16.7. Section 16.8 discusses the $\mathcal{H}_\infty$ separation theory and presents the proof of the output feedback results. The behavior of the $\mathcal{H}_\infty$ controller as a function of the performance level $\gamma$ is considered in section 16.9. The optimal controllers are also briefly considered in this section. Some other interpretations of the $\mathcal{H}_\infty$ controllers are given in section 16.10. Finally, section 16.11 presents the formulas for an optimal $\mathcal{H}_\infty$ controller.

16.1 Problem Formulation

Consider the system described by the standard block diagram

[Block diagram: controller $K$ in feedback with generalized plant $G$; exogenous input $w$, control input $u$, controlled output $z$, measured output $y$.]


where the plant $G$ and controller $K$ are assumed to be real-rational and proper. It will be assumed that state space models of $G$ and $K$ are available and that their realizations are stabilizable and detectable. Recall again that a controller is said to be admissible if it internally stabilizes the system. Clearly, stability is the most basic requirement for a practical system to work, hence any sensible controller has to be admissible.

Optimal H∞ Control: find all admissible controllers $K(s)$ such that $\|T_{zw}\|_\infty$ is minimized.

It should be noted that the optimal $\mathcal{H}_\infty$ controllers as defined above are generally not unique for MIMO systems. Furthermore, finding an optimal $\mathcal{H}_\infty$ controller is often both numerically and theoretically complicated, as shown in Glover and Doyle [1989]. This is certainly in contrast with the standard $\mathcal{H}_2$ theory, in which the optimal controller is unique and can be obtained by solving two Riccati equations without iterations. Knowing the achievable optimal (minimum) $\mathcal{H}_\infty$ norm may be useful theoretically since it sets a limit on what we can achieve. However, in practice it is often not necessary and sometimes even undesirable to design an optimal controller, and it is usually much cheaper to obtain controllers that are very close in the norm sense to the optimal ones, which will be called suboptimal controllers. A suboptimal controller may also have other nice properties over optimal ones, e.g., lower bandwidth.

Suboptimal H∞ Control: Given $\gamma > 0$, find all admissible controllers $K(s)$, if there are any, such that $\|T_{zw}\|_\infty < \gamma$.

For the reasons mentioned above, we focus our attention in this book on suboptimal control. When appropriate, we briefly discuss what will happen when $\gamma$ approaches the optimal value.

16.2 Output Feedback H∞ Control

16.2.1 Internal Stability and Input/Output Stability

Now suppose $K$ is a stabilizing controller for the system $G$. Then internal stability guarantees $T_{zw} = F_\ell(G, K) \in \mathcal{RH}_\infty$, but the latter does not necessarily imply internal stability. The following lemma provides the additional (mild) conditions for the equivalence of $T_{zw} = F_\ell(G, K) \in \mathcal{RH}_\infty$ and internal stability of the closed-loop system. To state the lemma, we shall assume that $G$ and $K$ have the following stabilizable and detectable realizations, respectively:

$$G(s) = \begin{bmatrix} A & B_1 & B_2 \\ C_1 & D_{11} & D_{12} \\ C_2 & D_{21} & D_{22} \end{bmatrix}, \qquad K(s) = \begin{bmatrix} \hat{A} & \hat{B} \\ \hat{C} & \hat{D} \end{bmatrix}.$$

Page 424: Robust and Optimal Control


Lemma 16.1 Suppose that the realizations for $G$ and $K$ are both stabilizable and detectable. Then the feedback connection $T_{zw} = \mathcal{F}_\ell(G,K)$ of the realizations for $G$ and $K$ is

(a) detectable if
$$ \begin{bmatrix} A - \lambda I & B_2 \\ C_1 & D_{12} \end{bmatrix} \ \text{has full column rank for all } \operatorname{Re}\lambda \ge 0; $$

(b) stabilizable if
$$ \begin{bmatrix} A - \lambda I & B_1 \\ C_2 & D_{21} \end{bmatrix} \ \text{has full row rank for all } \operatorname{Re}\lambda \ge 0. $$

Moreover, if (a) and (b) hold, then $K$ is an internally stabilizing controller iff $T_{zw} \in RH_\infty$.

Proof. The state-space equations for the closed loop are
$$ \mathcal{F}_\ell(G,K) = \begin{bmatrix} A + B_2\hat D L_1 C_2 & B_2 L_2 \hat C & B_1 + B_2 \hat D L_1 D_{21} \\ \hat B L_1 C_2 & \hat A + \hat B L_1 D_{22} \hat C & \hat B L_1 D_{21} \\ C_1 + D_{12} L_2 \hat D C_2 & D_{12} L_2 \hat C & D_{11} + D_{12} \hat D L_1 D_{21} \end{bmatrix} =: \begin{bmatrix} A_c & B_c \\ C_c & D_c \end{bmatrix} $$
where $L_1 := (I - D_{22}\hat D)^{-1}$ and $L_2 := (I - \hat D D_{22})^{-1}$.

Suppose $\mathcal{F}_\ell(G,K)$ has an undetectable mode $\lambda$ with $\operatorname{Re}\lambda \ge 0$ and a corresponding eigenvector $\begin{bmatrix} x \\ y \end{bmatrix} \neq 0$; then the PBH test gives
$$ \begin{bmatrix} A_c - \lambda I \\ C_c \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = 0. $$
This can be simplified to
$$ \begin{bmatrix} A - \lambda I & B_2 \\ C_1 & D_{12} \end{bmatrix} \begin{bmatrix} x \\ \hat D L_1 C_2 x + L_2 \hat C y \end{bmatrix} = 0 $$
and
$$ \hat B L_1 (C_2 x + D_{22}\hat C y) + \hat A y - \lambda y = 0. $$
Now if
$$ \begin{bmatrix} A - \lambda I & B_2 \\ C_1 & D_{12} \end{bmatrix} $$
has full column rank, then $x = 0$ and $\hat C y = 0$. This implies $\hat A y = \lambda y$. Since $(\hat C, \hat A)$ is detectable, we get $y = 0$, which is a contradiction. Hence part (a) is proven, and part (b) is a dual result. $\Box$

These relations will be used extensively below to simplify our development and to enable us to focus on input/output stability only.
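The closed-loop realization in the proof above is straightforward to form numerically. A minimal sketch (the plant and controller data below are illustrative, not from the text) that builds $(A_c, B_c, C_c, D_c)$ and checks internal stability via the eigenvalues of $A_c$:

```python
import numpy as np

def close_loop(A, B1, B2, C1, C2, D11, D12, D21, D22, Ah, Bh, Ch, Dh):
    """Form the closed-loop realization (Ac, Bc, Cc, Dc) of F_l(G, K)."""
    L1 = np.linalg.inv(np.eye(D22.shape[0]) - D22 @ Dh)   # (I - D22 Dhat)^-1
    L2 = np.linalg.inv(np.eye(Dh.shape[0]) - Dh @ D22)    # (I - Dhat D22)^-1
    Ac = np.block([[A + B2 @ Dh @ L1 @ C2, B2 @ L2 @ Ch],
                   [Bh @ L1 @ C2,          Ah + Bh @ L1 @ D22 @ Ch]])
    Bc = np.vstack([B1 + B2 @ Dh @ L1 @ D21, Bh @ L1 @ D21])
    Cc = np.hstack([C1 + D12 @ L2 @ Dh @ C2, D12 @ L2 @ Ch])
    Dc = D11 + D12 @ Dh @ L1 @ D21
    return Ac, Bc, Cc, Dc

# Illustrative data: unstable first-order plant, first-order stabilizing
# controller, D22 = 0 so L1 = L2 = I.
A  = np.array([[1.0]]); B1 = np.array([[1.0]]); B2 = np.array([[1.0]])
C1 = np.array([[1.0]]); D11 = np.zeros((1, 1)); D12 = np.zeros((1, 1))
C2 = np.array([[1.0]]); D21 = np.array([[1.0]]); D22 = np.zeros((1, 1))
Ah = np.array([[-10.0]]); Bh = np.array([[1.0]])
Ch = np.zeros((1, 1));    Dh = np.array([[-3.0]])

Ac, Bc, Cc, Dc = close_loop(A, B1, B2, C1, C2, D11, D12, D21, D22,
                            Ah, Bh, Ch, Dh)
internally_stable = np.all(np.linalg.eigvals(Ac).real < 0)
```

For this data $A_c$ has eigenvalues $-2$ and $-10$, so the loop is internally stable.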


H∞ CONTROL: SIMPLE CASE

16.2.2 Contraction and Stability

One of the keys to the entire development of $H_\infty$ theory is the fact that contraction and internal stability are preserved under an inner linear fractional transformation.

Theorem 16.2 Consider the following feedback system, in which $P$ maps $(w, v)$ to $(z, r)$ and $Q$ closes the loop from $r$ to $v$:

[diagram: $z = \mathcal{F}_\ell(P,Q)\,w$, with $v = Qr$]

$$ P = \begin{bmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{bmatrix} \in RH_\infty. $$

Suppose that $P^\sim P = I$, $P_{21}^{-1} \in RH_\infty$, and that $Q$ is a proper rational matrix. Then the following are equivalent:

(a) The system is internally stable, well-posed, and $\|T_{zw}\|_\infty < 1$.

(b) $Q \in RH_\infty$ and $\|Q\|_\infty < 1$.

Proof. (b) $\Rightarrow$ (a). Note that since $P, Q \in RH_\infty$, internal stability of the system is guaranteed if $(I - P_{22}Q)^{-1} \in RH_\infty$. Therefore, internal stability and well-posedness follow easily from $\|P_{22}\|_\infty \le 1$, $\|Q\|_\infty < 1$, and a small gain argument. (Note that $\|P_{22}\|_\infty \le 1$ follows from the fact that $P_{22}$ is a compression of $P$.)

To show that $\|T_{zw}\|_\infty < 1$, consider the closed-loop system at any frequency $s = j\omega$ with the signals fixed as complex constant vectors. Let $\|Q\|_\infty =: \epsilon < 1$ and note that $T_{wr} = P_{21}^{-1}(I - P_{22}Q) \in RH_\infty$. Also let $\kappa := \|T_{wr}\|_\infty$. Then $\|w\| \le \kappa \|r\|$, and $P$ inner implies that $\|z\|^2 + \|r\|^2 = \|w\|^2 + \|v\|^2$. Therefore,
$$ \|z\|^2 \le \|w\|^2 + (\epsilon^2 - 1)\|r\|^2 \le \left[1 - (1 - \epsilon^2)\kappa^{-2}\right]\|w\|^2, $$
which implies $\|T_{zw}\|_\infty < 1$.

(a) $\Rightarrow$ (b). To show that $\|Q\|_\infty < 1$, suppose there exist a (real or infinite) frequency $\omega$ and a constant nonzero vector $r$ such that at $s = j\omega$, $\|Qr\| \ge \|r\|$. Then setting $w = P_{21}^{-1}(I - P_{22}Q)r$, $v = Qr$ gives $v = T_{vw} w$. But as above, $P$ inner implies that $\|z\|^2 + \|r\|^2 = \|w\|^2 + \|v\|^2$ and, hence, that $\|z\|^2 \ge \|w\|^2$, which is impossible since $\|T_{zw}\|_\infty < 1$. It follows that $\bar\sigma(Q(j\omega)) < 1$ for all $\omega$, i.e., $\|Q\|_\infty < 1$, since $Q$ is rational.

To show $Q \in RH_\infty$, let $Q = NM^{-1}$ with $N, M \in RH_\infty$ be a right coprime factorization, i.e., there exist $X, Y \in RH_\infty$ such that $XN + YM = I$. We shall show that $M^{-1} \in RH_\infty$. By internal stability we have
$$ Q(I - P_{22}Q)^{-1} = N(M - P_{22}N)^{-1} \in RH_\infty $$
and
$$ (I - P_{22}Q)^{-1} = M(M - P_{22}N)^{-1} \in RH_\infty. $$


Thus
$$ XQ(I - P_{22}Q)^{-1} + Y(I - P_{22}Q)^{-1} = (M - P_{22}N)^{-1} \in RH_\infty. $$
This implies that the winding number of $\det(M - P_{22}N)$, as $s$ traverses the Nyquist contour, equals zero. Now note that, for all $s = j\omega$ and all $\alpha \in [0,1]$, $\det M \neq 0$ and $\det(I - \alpha P_{22}Q) \neq 0$ (this uses the fact that $\|P_{22}\|_\infty \le 1$ and $\|Q\|_\infty < 1$). Since $\det(M - \alpha P_{22}N) = \det(I - \alpha P_{22}Q)\det M$, we have $\det(M - \alpha P_{22}N) \neq 0$ for all $\alpha \in [0,1]$ and all $s = j\omega$. We conclude that the winding number of $\det M$ also equals zero. Therefore, $Q \in RH_\infty$, and the proof is complete. $\Box$

16.2.3 Simplifying Assumptions

In this chapter, we discuss a simplified version of the $H_\infty$ theory; the general case will be considered in the next chapter. The main reason for doing so is that the general case, while it has its own unique features, is much more involved algebraically, and the algebra can distract attention from the essential ideas of the theory and thus obscure insight into the problem. Nevertheless, the problem considered below contains the essential features of the $H_\infty$ theory.

The realization of the transfer matrix G is taken to be of the form

$$ G(s) = \begin{bmatrix} A & B_1 & B_2 \\ C_1 & 0 & D_{12} \\ C_2 & D_{21} & 0 \end{bmatrix}. $$

The following assumptions are made:

(i) $(A, B_1)$ is stabilizable and $(C_1, A)$ is detectable;

(ii) $(A, B_2)$ is stabilizable and $(C_2, A)$ is detectable;

(iii) $D_{12}^* \begin{bmatrix} C_1 & D_{12} \end{bmatrix} = \begin{bmatrix} 0 & I \end{bmatrix}$;

(iv) $\begin{bmatrix} B_1 \\ D_{21} \end{bmatrix} D_{21}^* = \begin{bmatrix} 0 \\ I \end{bmatrix}$.

Assumption (i) is made for a technical reason: together with (ii) it guarantees that the two Hamiltonian matrices ($H_2$ and $J_2$ in Chapter 14) in the $H_2$ problem belong to dom(Ric). This assumption simplifies the theorem statements and proofs, but if it is relaxed, the theorems and proofs can be modified so that the given formulae are still correct, as will be seen in the next chapter. An important simplification that is a consequence of assumption (i) is that internal stability is essentially equivalent to input/output stability ($T_{zw} \in RH_\infty$). This equivalence enables us to focus only on input/output stability and is captured in Corollary 16.3 below. Of course, assumption (ii) is necessary and sufficient for $G$ to be internally stabilizable, but it is not needed to prove the equivalence of internal stability and $T_{zw} \in RH_\infty$. (Readers should be clear that this does not mean that the realization for $G$ need not be stabilizable and detectable. In point of fact, internal stability and input/output stability can never be equivalent if either $G$ or $K$ has unstabilizable or undetectable modes.)

Corollary 16.3 Suppose that assumptions (i), (iii), and (iv) hold. Then a controller $K$ is admissible iff $T_{zw} \in RH_\infty$.

Proof. The realization for the plant $G$ is stabilizable and detectable by assumption (i). We only need to verify that the rank conditions on the two matrices in Lemma 16.1 are satisfied. Now suppose assumptions (i) and (iii) are satisfied and let $D_\perp$ be such that $\begin{bmatrix} D_{12} & D_\perp \end{bmatrix}$ is a unitary matrix. Then
$$ \operatorname{rank} \begin{bmatrix} A - \lambda I & B_2 \\ C_1 & D_{12} \end{bmatrix} = \operatorname{rank} \begin{bmatrix} I & 0 \\ 0 & \begin{bmatrix} D_{12}^* \\ D_\perp^* \end{bmatrix} \end{bmatrix} \begin{bmatrix} A - \lambda I & B_2 \\ C_1 & D_{12} \end{bmatrix} = \operatorname{rank} \begin{bmatrix} A - \lambda I & B_2 \\ \begin{bmatrix} 0 \\ D_\perp^* C_1 \end{bmatrix} & \begin{bmatrix} I \\ 0 \end{bmatrix} \end{bmatrix}. $$
So
$$ \begin{bmatrix} A - \lambda I & B_2 \\ C_1 & D_{12} \end{bmatrix} $$
has full column rank for all $\operatorname{Re}\lambda \ge 0$ iff
$$ \begin{bmatrix} A - \lambda I \\ D_\perp^* C_1 \end{bmatrix} $$
has full column rank for all $\operatorname{Re}\lambda \ge 0$, i.e., iff $(D_\perp^* C_1, A)$ is detectable. Since $D_\perp (D_\perp^* C_1) = (I - D_{12}D_{12}^*)C_1 = C_1$, $(D_\perp^* C_1, A)$ is detectable iff $(C_1, A)$ is detectable. The rank condition for the other matrix follows by duality. $\Box$

Two additional assumptions that are implicit in the assumed realization for $G(s)$ are that $D_{11} = 0$ and $D_{22} = 0$. As we have mentioned many times, $D_{22} \neq 0$ does not pose any problem since it is easy to form an equivalent problem with $D_{22} = 0$ by a linear fractional transformation on the controller $K(s)$. However, relaxing the assumption $D_{11} = 0$ complicates the formulae substantially, as will be seen in the next chapter.

16.2.4 Suboptimal $H_\infty$ Controllers

In this subsection, we present necessary and sufficient conditions for the existence of an admissible controller $K(s)$ such that $\|T_{zw}\|_\infty < \gamma$ for a given $\gamma$, and, when these conditions are satisfied, we characterize all admissible controllers that satisfy the norm condition. The proofs of these results will be given in the later sections. Let $\gamma_{\mathrm{opt}} := \min \{\|T_{zw}\|_\infty : K(s) \text{ admissible}\}$, i.e., the optimal level. Then, clearly, $\gamma$ must be greater than $\gamma_{\mathrm{opt}}$ for suboptimal $H_\infty$ controllers to exist. In Section 16.9 we will briefly discuss how to find an admissible $K$ to minimize $\|T_{zw}\|_\infty$. Optimal $H_\infty$ controllers are more difficult to characterize than suboptimal ones, and this is one major difference between the $H_\infty$ and $H_2$ results. Recall that similar differences arose in the norm computation problem as well.

The $H_\infty$ solution involves the following two Hamiltonian matrices:

$$ H_\infty := \begin{bmatrix} A & \gamma^{-2}B_1B_1^* - B_2B_2^* \\ -C_1^*C_1 & -A^* \end{bmatrix}, \qquad J_\infty := \begin{bmatrix} A^* & \gamma^{-2}C_1^*C_1 - C_2^*C_2 \\ -B_1B_1^* & -A \end{bmatrix}. $$

The important difference here from the $H_2$ problem is that the (1,2)-blocks are not sign definite, so we cannot use Theorem 13.7 in Chapter 13 to guarantee that $H_\infty \in \operatorname{dom}(Ric)$ or $Ric(H_\infty) \ge 0$. Indeed, these conditions are intimately related to the existence of $H_\infty$ suboptimal controllers. Note that the (1,2)-blocks are a suggestive combination of expressions from the $H_\infty$ norm characterization in Chapter 4 (or the bounded real ARE in Chapter 13) and from the $H_2$ synthesis of Chapter 14. It is also clear that as $\gamma$ approaches infinity, these two Hamiltonian matrices become the corresponding $H_2$ control Hamiltonian matrices. The reasons for the form of these expressions should become clear through the discussions and proofs of the following theorem.

Theorem 16.4 There exists an admissible controller such that $\|T_{zw}\|_\infty < \gamma$ iff the following three conditions hold:

(i) $H_\infty \in \operatorname{dom}(Ric)$ and $X_\infty := Ric(H_\infty) \ge 0$;

(ii) $J_\infty \in \operatorname{dom}(Ric)$ and $Y_\infty := Ric(J_\infty) \ge 0$;

(iii) $\rho(X_\infty Y_\infty) < \gamma^2$.

Moreover, when these conditions hold, one such controller is

$$ K_{sub}(s) := \begin{bmatrix} \hat A_\infty & -Z_\infty L_\infty \\ F_\infty & 0 \end{bmatrix} $$
where
$$ \hat A_\infty := A + \gamma^{-2}B_1B_1^*X_\infty + B_2F_\infty + Z_\infty L_\infty C_2, $$
$$ F_\infty := -B_2^*X_\infty, \qquad L_\infty := -Y_\infty C_2^*, \qquad Z_\infty := (I - \gamma^{-2}Y_\infty X_\infty)^{-1}. $$
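The three conditions and the central controller can be computed numerically: each $Ric(\cdot)$ is obtained from the stable invariant subspace of the Hamiltonian via an ordered Schur decomposition. A sketch under the simplifying assumptions, with an illustrative scalar plant of our own (not from the text); a production implementation would need more careful tolerance handling:

```python
import numpy as np
from scipy.linalg import schur

def ric(H):
    """Stabilizing Riccati solution X from the Hamiltonian H.

    Picks the stable invariant subspace [X1; X2] of H and returns
    X = X2 X1^-1; raises if H is not in dom(Ric).
    """
    n = H.shape[0] // 2
    T, Zs, sdim = schur(H, sort='lhp')   # stable eigenvalues ordered first
    if sdim != n:
        raise ValueError("H has imaginary-axis eigenvalues: not in dom(Ric)")
    X1, X2 = Zs[:n, :n], Zs[n:, :n]
    X = X2 @ np.linalg.inv(X1)
    return (X + X.T) / 2                 # symmetrize

def hinf_central(A, B1, B2, C1, C2, gam):
    """Check conditions (i)-(iii) of Theorem 16.4 and form Ksub (sketch)."""
    g2 = gam ** -2
    Hinf = np.block([[A, g2 * B1 @ B1.T - B2 @ B2.T], [-C1.T @ C1, -A.T]])
    Jinf = np.block([[A.T, g2 * C1.T @ C1 - C2.T @ C2], [-B1 @ B1.T, -A]])
    X, Y = ric(Hinf), ric(Jinf)
    ok = (np.all(np.linalg.eigvalsh(X) >= -1e-8)
          and np.all(np.linalg.eigvalsh(Y) >= -1e-8)
          and max(abs(np.linalg.eigvals(X @ Y))) < gam ** 2)
    F = -B2.T @ X
    L = -Y @ C2.T
    Z = np.linalg.inv(np.eye(A.shape[0]) - g2 * Y @ X)
    Ahat = A + g2 * B1 @ B1.T @ X + B2 @ F + Z @ L @ C2
    return ok, X, Y, (Ahat, -Z @ L, F)   # Ksub = (Ahat, -Z L, F, 0)

# Illustrative scalar data satisfying (i)-(iv), with gamma = 2:
A  = np.array([[-1.0]])
B1 = np.array([[1.0, 0.0]]); B2 = np.array([[1.0]])
C1 = np.array([[1.0], [0.0]]); C2 = np.array([[1.0]])
ok, X, Y, Ksub = hinf_central(A, B1, B2, C1, C2, gam=2.0)
```

For this scalar example the Riccati equation reduces to $0.75X^2 + 2X - 1 = 0$, giving $X_\infty = (-2+\sqrt{7})/1.5 \ge 0$, and all three conditions hold.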

The $H_\infty$ controller displayed in Theorem 16.4, which is often called the central controller or minimum entropy controller, has certain obvious similarities to the $H_2$ controller as well as some important differences. Although not as apparent as in the $H_2$ case, the $H_\infty$ controller also has an interesting separation structure. Furthermore, each of the conditions in the theorem can be given a system-theoretic interpretation in terms of this separation. These interpretations, given in Section 16.8, require the filtering and full information (i.e., state feedback) results in Sections 16.7 and 16.4; the proof of Theorem 16.4 is constructed out of these results as well. The term central controller will become obvious from the parameterization of all suboptimal controllers given below, while the meaning of minimum entropy will be discussed in Section 16.10.1.

The following theorem parameterizes the controllers that achieve a suboptimal $H_\infty$ norm less than $\gamma$.

Theorem 16.5 If conditions (i) to (iii) in Theorem 16.4 are satisfied, the set of all admissible controllers such that $\|T_{zw}\|_\infty < \gamma$ equals the set of all transfer matrices from $y$ to $u$ in

[diagram: $K = \mathcal{F}_\ell(M_\infty, Q)$, with $Q$ in feedback from the second output of $M_\infty$ to its second input]

$$ M_\infty(s) = \begin{bmatrix} \hat A_\infty & -Z_\infty L_\infty & Z_\infty B_2 \\ F_\infty & 0 & I \\ -C_2 & I & 0 \end{bmatrix} $$
where $Q \in RH_\infty$, $\|Q\|_\infty < \gamma$.

As in the $H_2$ case, the suboptimal controllers are parameterized by a fixed linear fractional transformation with a free parameter $Q$. With $Q = 0$ (at the "center" of the set $\|Q\|_\infty < \gamma$), we recover the central controller $K_{sub}(s)$.
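The parameterization is a lower LFT, which can be evaluated pointwise in frequency. A sketch with illustrative scalar numbers standing in for $\hat A_\infty$, $Z_\infty L_\infty$, $Z_\infty B_2$, $F_\infty$, $C_2$ (assumed values for illustration, not computed from a plant), confirming that $Q = 0$ returns the central controller:

```python
import numpy as np

def tf_eval(A, B, C, D, s):
    """Evaluate the transfer matrix C (sI - A)^-1 B + D at a complex s."""
    n = A.shape[0]
    return C @ np.linalg.solve(s * np.eye(n) - A, B) + D

def lft_lower(M11, M12, M21, M22, Q):
    """Lower LFT F_l(M, Q) = M11 + M12 Q (I - M22 Q)^-1 M21."""
    return M11 + M12 @ Q @ np.linalg.inv(np.eye(Q.shape[0]) - M22 @ Q) @ M21

# Illustrative scalar stand-ins (assumptions, not derived from a plant):
Ahat = np.array([[-1.7]])   # Ahat_inf
ZL   = np.array([[-0.45]])  # Z_inf L_inf
ZB2  = np.array([[1.05]])   # Z_inf B2
F    = np.array([[-0.43]])  # F_inf
C2   = np.array([[1.0]])

s = 1j * 0.7
inv = np.linalg.inv(s * np.eye(1) - Ahat)
M11 = F @ inv @ (-ZL)                 # blocks of M_inf evaluated at s
M12 = F @ inv @ ZB2 + np.eye(1)
M21 = -C2 @ inv @ (-ZL) + np.eye(1)
M22 = -C2 @ inv @ ZB2

K_center = lft_lower(M11, M12, M21, M22, np.zeros((1, 1)))  # Q = 0
K_sub    = tf_eval(Ahat, -ZL, F, np.zeros((1, 1)), s)       # central controller
K_other  = lft_lower(M11, M12, M21, M22, np.array([[0.1]])) # another choice of Q
```

Each admissible choice of $Q$ yields a different admissible controller; $Q = 0$ reproduces $K_{sub}$ exactly.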

16.3 Motivation for Special Problems

Although the proof of the output feedback results will be given later, we shall now try to give some ideas for approaching the problem. Specifically, we try to motivate the study of the OE problem (and hence the other special problems) and show how this problem arises naturally in proving the output feedback results. The key is the fact that contraction and internal stability are preserved under an inner linear fractional transformation, which is Theorem 16.2. Assuming that $X_\infty$ exists, we will show that the general output feedback problem boils down to an output estimation problem, which can be solved easily if a state feedback or full information control problem can be solved.

Suppose $X_\infty := Ric(H_\infty)$ exists. Then $X_\infty$ satisfies the following Riccati equation:

$$ A^*X_\infty + X_\infty A + C_1^*C_1 + \gamma^{-2}X_\infty B_1B_1^*X_\infty - X_\infty B_2B_2^*X_\infty = 0. \tag{16.1} $$

Let $x$ denote the state of $G$ with respect to a given input $w$; then we can differentiate $x(t)^*X_\infty x(t)$:

$$ \frac{d}{dt}(x^*X_\infty x) = \dot x^*X_\infty x + x^*X_\infty \dot x = x^*(A^*X_\infty + X_\infty A)x + 2\langle w, B_1^*X_\infty x\rangle + 2\langle u, B_2^*X_\infty x\rangle. $$
Using the Riccati equation for $X_\infty$ to substitute in for $A^*X_\infty + X_\infty A$ gives
$$ \frac{d}{dt}(x^*X_\infty x) = -\|C_1x\|^2 - \gamma^{-2}\|B_1^*X_\infty x\|^2 + \|B_2^*X_\infty x\|^2 + 2\langle w, B_1^*X_\infty x\rangle + 2\langle u, B_2^*X_\infty x\rangle. $$

Finally, completion of the squares along with the orthogonality of $C_1x$ and $D_{12}u$ gives the key equation
$$ \frac{d}{dt}(x^*X_\infty x) = -\|z\|^2 + \gamma^2\|w\|^2 - \gamma^2\|w - \gamma^{-2}B_1^*X_\infty x\|^2 + \|u + B_2^*X_\infty x\|^2. \tag{16.2} $$
Assume $x(0) = x(\infty) = 0$ and $w \in L_{2+}$, and integrate (16.2) from $t = 0$ to $t = \infty$:
$$ \|z\|_2^2 - \gamma^2\|w\|_2^2 = \|u + B_2^*X_\infty x\|_2^2 - \gamma^2\|w - \gamma^{-2}B_1^*X_\infty x\|_2^2 = \|v\|_2^2 - \gamma^2\|r\|_2^2 \tag{16.3} $$

where
$$ v := u + B_2^*X_\infty x, \qquad r := w - \gamma^{-2}B_1^*X_\infty x. \tag{16.4} $$
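Once the Riccati substitution has been made, the completion of squares from (16.2) to (16.3) is a pointwise algebraic identity in $x$, $w$, $u$: it holds for any symmetric $X$, using only $D_{12}^*D_{12} = I$ and $D_{12}^*C_1 = 0$ to expand $\|z\|^2$. A quick numerical check with random data (dimensions and $\gamma$ arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m1, m2, p = 4, 3, 2, 5
gam = 1.7
C1 = rng.standard_normal((p, n))
B1 = rng.standard_normal((n, m1))
B2 = rng.standard_normal((n, m2))
M = rng.standard_normal((n, n))
X = M + M.T                          # any symmetric X (stand-in for X_inf)
x = rng.standard_normal(n)
w = rng.standard_normal(m1)
u = rng.standard_normal(m2)

b1x = B1.T @ X @ x                   # B1* X x
b2x = B2.T @ X @ x                   # B2* X x
v = u + b2x                          # v = u + B2* X x
r = w - gam ** -2 * b1x              # r = w - gam^-2 B1* X x
z_sq = (C1 @ x) @ (C1 @ x) + u @ u   # ||z||^2, using D12*D12 = I, D12*C1 = 0

# Quadratic form before completing the square ...
pre = (-(C1 @ x) @ (C1 @ x) - gam ** -2 * b1x @ b1x + b2x @ b2x
       + 2 * w @ b1x + 2 * u @ b2x)
# ... and after:
post = -z_sq + gam ** 2 * (w @ w) - gam ** 2 * (r @ r) + v @ v
```

The two expressions agree to machine precision, which is exactly the algebra behind (16.2).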

With these newly defined variables (and writing $\bar w := \gamma w$, $\bar r := \gamma r$), the closed-loop system can be expressed as the two interconnected subsystems below:
$$ \begin{bmatrix} \dot x \\ z \\ \bar r \end{bmatrix} = \begin{bmatrix} A_{F_\infty} & \gamma^{-1}B_1 & B_2 \\ C_{1F_\infty} & 0 & D_{12} \\ -\gamma^{-1}B_1^*X_\infty & I & 0 \end{bmatrix} \begin{bmatrix} x \\ \bar w \\ v \end{bmatrix}, \qquad \begin{aligned} A_{F_\infty} &:= A + B_2F_\infty \\ C_{1F_\infty} &:= C_1 + D_{12}F_\infty \end{aligned} $$
and
$$ \begin{bmatrix} \dot x \\ v \\ y \end{bmatrix} = \begin{bmatrix} A_{tmp} & B_1 & B_2 \\ -F_\infty & 0 & I \\ C_2 & D_{21} & 0 \end{bmatrix} \begin{bmatrix} x \\ r \\ u \end{bmatrix}, \qquad A_{tmp} := A + \gamma^{-2}B_1B_1^*X_\infty $$
where $F_\infty$ is defined as in Section 16.2.4. This is shown in the following diagram:

[diagram: the controller $K$ closes the loop from $y$ to $u$ around $G_{tmp}$, generating $v$ from $r$; this loop in turn closes the lower loop of the inner system $P$, which maps $(\bar w, v)$ to $(z, \bar r)$, with the scalings $\gamma$ and $\gamma^{-1}$ relating $(w, r)$ to $(\bar w, \bar r)$]


where
$$ P := \begin{bmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{bmatrix} = \begin{bmatrix} A_{F_\infty} & \gamma^{-1}B_1 & B_2 \\ C_{1F_\infty} & 0 & D_{12} \\ -\gamma^{-1}B_1^*X_\infty & I & 0 \end{bmatrix} \tag{16.5} $$
and
$$ G_{tmp} = \begin{bmatrix} A_{tmp} & B_1 & B_2 \\ -F_\infty & 0 & I \\ C_2 & D_{21} & 0 \end{bmatrix}. \tag{16.6} $$

The equality (16.3) motivates the change of variables to $r$ and $v$ as in (16.4), and these variables provide the connection between $T_{zw}$ and $T_{vr}$. Note that, in terms of the scaled signals $\bar w := \gamma w$ and $\bar r := \gamma r$, $T_{z\bar w} = \gamma^{-1}T_{zw}$ and $T_{v\bar r} = \gamma^{-1}T_{vr}$. It is immediate from equality (16.3) that $\|T_{zw}\|_\infty \le \gamma$ iff $\|T_{vr}\|_\infty \le \gamma$. While this is the basic idea behind the proof of Lemma 16.8 below, the details needed for strict inequality and internal stability require a bit more work.

Note that $w_{worst} := \gamma^{-2}B_1^*X_\infty x$ is the worst disturbance input in the sense that it maximizes the quantity $\|z\|_2^2 - \gamma^2\|w\|_2^2$ in (16.3) for the minimizing value of $u = -B_2^*X_\infty x$; that is, the $u$ making $v = 0$ and the $w$ making $r = 0$ are values satisfying a saddle point condition. (See Section 17.8 for more interpretations.) It is also interesting to note that $w_{worst}$ is the optimal strategy for $w$ in the corresponding LQ game problem (see the differential game problem in the last chapter). Equation (16.3) also suggests that $u = -B_2^*X_\infty x$ is a suboptimal control for a full information (FI) problem if the state $x$ is available. This will be shown later. In terms of the OE problem for $G_{tmp}$, the output being estimated is the optimal FI control input $F_\infty x$, and the new disturbance $r$ is offset by the "worst case" FI disturbance input $w_{worst}$.

Notice the structure of $G_{tmp}$: it is an OE problem. We will show below that the output feedback problem can indeed be transformed into the OE problem. To show this we first need to prove some preliminary facts.

Lemma 16.6 Suppose $H_\infty \in \operatorname{dom}(Ric)$ and $X_\infty = Ric(H_\infty)$. Then $A_{F_\infty} = A + B_2F_\infty$ is stable iff $X_\infty \ge 0$.

Proof. Rearrange the Riccati equation for $X_\infty$ and use the definitions of $F_\infty$ and $C_{1F_\infty}$ to get
$$ A_{F_\infty}^*X_\infty + X_\infty A_{F_\infty} + \begin{bmatrix} C_{1F_\infty} \\ -\gamma^{-1}B_1^*X_\infty \end{bmatrix}^* \begin{bmatrix} C_{1F_\infty} \\ -\gamma^{-1}B_1^*X_\infty \end{bmatrix} = 0. \tag{16.7} $$
Since $H_\infty \in \operatorname{dom}(Ric)$, $A_{F_\infty} + \gamma^{-2}B_1B_1^*X_\infty$ is stable and hence $(B_1^*X_\infty, A_{F_\infty})$ is detectable. Then from standard results on Lyapunov equations (see Lemma 3.19), $A_{F_\infty}$ is stable iff $X_\infty \ge 0$. $\Box$

Equation (16.3) can be written as
$$ \|z\|_2^2 + \gamma^2\|r\|_2^2 = \gamma^2\|w\|_2^2 + \|v\|_2^2. $$


This suggests that $P$ might be inner when $X_\infty \ge 0$, which is verified by the following lemma.

Lemma 16.7 If $H_\infty \in \operatorname{dom}(Ric)$ and $X_\infty = Ric(H_\infty) \ge 0$, then $P$ in (16.5) is in $RH_\infty$ and inner, and $P_{21}^{-1} \in RH_\infty$.

Proof. By Lemma 16.6, $A_{F_\infty}$ is stable, so $P \in RH_\infty$. That $P$ is inner ($P^\sim P = I$) follows from Lemma 13.29 upon noting that the observability Gramian of $P$ is $X_\infty$ (see (16.7)) and
$$ \begin{bmatrix} 0 & I \\ D_{12}^* & 0 \end{bmatrix} \begin{bmatrix} C_{1F_\infty} \\ -\gamma^{-1}B_1^*X_\infty \end{bmatrix} + \begin{bmatrix} \gamma^{-1}B_1^* \\ B_2^* \end{bmatrix} X_\infty = 0. $$
Finally, the state matrix for $P_{21}^{-1}$ is $A_{F_\infty} + \gamma^{-2}B_1B_1^*X_\infty$, which is stable by the definition of $\operatorname{dom}(Ric)$. Thus $P_{21}^{-1} \in RH_\infty$. $\Box$
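The innerness test used in this proof is directly checkable in state space: with $A$ stable and $X$ the observability Gramian, $P = (A, B, C, D)$ is inner if $D^*C + B^*X = 0$ and $D^*D = I$ (Lemma 13.29). A sketch, illustrated on the all-pass function $(s-1)/(s+1)$ (this example is ours, not from the text):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def is_inner(A, B, C, D, tol=1e-8):
    """Check the state-space innerness test of Lemma 13.29."""
    if np.any(np.linalg.eigvals(A).real >= 0):
        return False
    # Observability Gramian: A^T X + X A + C^T C = 0
    X = solve_continuous_lyapunov(A.T, -C.T @ C)
    return (np.allclose(D.T @ C + B.T @ X, 0, atol=tol)
            and np.allclose(D.T @ D, np.eye(D.shape[1]), atol=tol))

# (s - 1)/(s + 1) = 1 - 2/(s + 1): A = -1, B = 1, C = -2, D = 1
A = np.array([[-1.0]]); B = np.array([[1.0]])
C = np.array([[-2.0]]); D = np.array([[1.0]])
inner = is_inner(A, B, C, D)
```

Here the Gramian is $X = 2$, so $D^*C + B^*X = -2 + 2 = 0$ and $D^*D = 1$, confirming the function is inner.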

The following lemma connects the two systems $T_{zw}$ and $T_{vr}$; it is the central part of the separation argument in Section 16.8.2. Recall that internal and input/output stability are equivalent for admissibility of $K$ in the output feedback problem by Corollary 16.3.

Lemma 16.8 Assume $H_\infty \in \operatorname{dom}(Ric)$ and $X_\infty = Ric(H_\infty) \ge 0$. Then $K$ is admissible for $G$ and $\|T_{zw}\|_\infty < \gamma$ iff $K$ is admissible for $G_{tmp}$ and $\|T_{vr}\|_\infty < \gamma$.

Proof. We may assume without loss of generality that the realization of $K$ is stabilizable and detectable. Recall from Corollary 16.3 that internal stability for $T_{zw}$ is equivalent to $T_{zw} \in RH_\infty$. Similarly, since
$$ \begin{bmatrix} A_{tmp} - \lambda I & B_1 \\ C_2 & D_{21} \end{bmatrix} = \begin{bmatrix} A - \lambda I & B_1 \\ C_2 & D_{21} \end{bmatrix} \begin{bmatrix} I & 0 \\ \gamma^{-2}B_1^*X_\infty & I \end{bmatrix} $$
has full row rank for all $\operatorname{Re}\lambda \ge 0$ and since
$$ \det \begin{bmatrix} A_{tmp} - \lambda I & B_2 \\ -F_\infty & I \end{bmatrix} = \det(A_{tmp} + B_2F_\infty - \lambda I) \neq 0 $$
for all $\operatorname{Re}\lambda \ge 0$ by the stability of $A_{tmp} + B_2F_\infty$, we have that
$$ \begin{bmatrix} A_{tmp} - \lambda I & B_2 \\ -F_\infty & I \end{bmatrix} $$
has full column rank. Hence by Lemma 16.1 the internal stability of $T_{vr}$, i.e., the internal stability of the subsystem consisting of $G_{tmp}$ and the controller $K$, is also equivalent to $T_{vr} \in RH_\infty$. Thus internal stability is equivalent to input/output stability for both $G$ and $G_{tmp}$. This shows that $K$ is an admissible controller for $G$ if and only if it is admissible for $G_{tmp}$. Now it follows from Theorem 16.2 and Lemma 16.7, along with the above block diagram, that $\|T_{z\bar w}\|_\infty < 1$ iff $\|T_{v\bar r}\|_\infty < 1$ (where $\bar w := \gamma w$, $\bar r := \gamma r$) or, equivalently, $\|T_{zw}\|_\infty < \gamma$ iff $\|T_{vr}\|_\infty < \gamma$. $\Box$

From the previous analysis, it is clear that to solve the output feedback problem we need to show

(a) $H_\infty \in \operatorname{dom}(Ric)$ and $X_\infty = Ric(H_\infty) \ge 0$;

(b) $\|T_{vr}\|_\infty < \gamma$.

To show (a), we need to solve a FI problem. Problem (b) is an OE problem, which can be solved by using the relationship between the FC and OE problems in Chapter 12, while the FC problem can be solved by using the FI solution through duality. So in the sections to follow, we will focus on these special problems.

16.4 Full Information Control

Our system diagram in this section is the standard one, as before:

[diagram: $K$ in feedback from $y$ to $u$ around $G$, with disturbance $w$ and error $z$]

with
$$ G(s) = \begin{bmatrix} A & B_1 & B_2 \\ C_1 & 0 & D_{12} \\ \begin{bmatrix} I \\ 0 \end{bmatrix} & \begin{bmatrix} 0 \\ I \end{bmatrix} & \begin{bmatrix} 0 \\ 0 \end{bmatrix} \end{bmatrix}. $$

The $H_\infty$ problem corresponding to this setup again is not, strictly speaking, a special case of the output feedback problem because it does not satisfy all of the assumptions. In particular, it should be noted that for the FI (and the FC problem in the next section), internal stability is not equivalent to $T_{zw} \in RH_\infty$, since
$$ \begin{bmatrix} A - \lambda I & B_1 \\ \begin{bmatrix} I \\ 0 \end{bmatrix} & \begin{bmatrix} 0 \\ I \end{bmatrix} \end{bmatrix} $$
can never have full row rank, although this presents no difficulties in solving the problem. We simply must remember that in the FI case, $K$ admissible means internally stabilizing, not just $T_{zw} \in RH_\infty$.

We have seen that in the $H_2$ FI case, the optimal controller uses just the state $x$ even though the controller is provided with full information. We will show below that, in the $H_\infty$ case, a suboptimal controller exists which also uses just $x$. This case could have been restricted to state feedback, which is more traditional, but we believe that, once one gets outside the pure $H_2$ setting, the full information problem is more fundamental and more natural than the state feedback problem.

One setting in which the full information case is more natural occurs when the parameterization of all suboptimal controllers is considered. It is also appropriate when studying the general case with $D_{11} \neq 0$ in the next chapter, or when $H_\infty$ optimal (not just suboptimal) controllers are desired. Even though the optimal problem is not studied in detail in this book, we want the methods to extend to the optimal case in a natural and straightforward way.

The assumptions relevant to the FI problem, which are inherited from the output feedback problem, are

(i) $(C_1, A)$ is detectable;

(ii) $(A, B_2)$ is stabilizable;

(iii) $D_{12}^* \begin{bmatrix} C_1 & D_{12} \end{bmatrix} = \begin{bmatrix} 0 & I \end{bmatrix}$.

Assumption (iv) and the second part of assumption (ii) for the general output feedback case have been effectively strengthened because of the assumed structure for $C_2$ and $D_{21}$.

Theorem 16.9 There exists an admissible controller $K(s)$ for the FI problem such that $\|T_{zw}\|_\infty < \gamma$ if and only if $H_\infty \in \operatorname{dom}(Ric)$ and $X_\infty = Ric(H_\infty) \ge 0$. Furthermore, if these conditions are satisfied, then the equivalence class of all admissible controllers satisfying $\|T_{zw}\|_\infty < \gamma$ can be parameterized as
$$ K(s) \cong \begin{bmatrix} F_\infty - \gamma^{-2}Q(s)B_1^*X_\infty & Q(s) \end{bmatrix} \tag{16.8} $$
where $Q \in RH_\infty$, $\|Q\|_\infty < \gamma$.

It is easy to see, by comparing the $H_\infty$ solution with the corresponding $H_2$ solution, that a fundamental difference between the $H_2$ and $H_\infty$ controllers is that the $H_\infty$ controller depends on the disturbance through $B_1$, whereas the $H_2$ controller does not. This difference is essentially captured by the necessary and sufficient conditions for the existence of a controller given in Theorem 16.9. Note that these conditions are the same as condition (i) in Theorem 16.4.

The two individual conditions in Theorem 16.9 may each be given their own interpretations. The condition $H_\infty \in \operatorname{dom}(Ric)$ implies that $X_\infty := Ric(H_\infty)$ exists, and $K(s) = \begin{bmatrix} F_\infty & 0 \end{bmatrix}$ gives $T_{zw}$ as
$$ T_{zw} = \begin{bmatrix} A_{F_\infty} & B_1 \\ C_{1F_\infty} & 0 \end{bmatrix}, \qquad \begin{aligned} A_{F_\infty} &= A + B_2F_\infty \\ C_{1F_\infty} &= C_1 + D_{12}F_\infty. \end{aligned} \tag{16.9} $$
Furthermore, since $T_{z\bar w} = P_{11}$ (with $\bar w = \gamma w$) and $P_{11}^\sim P_{11} = I - P_{21}^\sim P_{21}$ by $P^\sim P = I$, we have $\|P_{11}\|_\infty < 1$ and $\|T_{zw}\|_\infty = \gamma\|P_{11}\|_\infty < \gamma$. The further condition that $X_\infty \ge 0$ is equivalent, by Lemma 16.6, to this $K$ stabilizing $T_{zw}$.

Remark 16.1 Given that $H_\infty \in \operatorname{dom}(Ric)$ and $X_\infty = Ric(H_\infty) \ge 0$, there is an intuitive way to see why all the equivalence classes of controllers can be parameterized in the form of (16.8). Recall the following equality from equation (16.3):
$$ \|z\|_2^2 - \gamma^2\|w\|_2^2 = \|u + B_2^*X_\infty x\|_2^2 - \gamma^2\|w - \gamma^{-2}B_1^*X_\infty x\|_2^2 = \|v\|_2^2 - \gamma^2\|r\|_2^2. $$
$\|T_{zw}\|_\infty < \gamma$ means that $\|z\|_2^2 - \gamma^2\|w\|_2^2 < 0$ for $w \neq 0$. This implies that $r \neq 0$ and $\|v\|_2^2 - \gamma^2\|r\|_2^2 < 0$. Now all $v$ satisfying this inequality can be written as $v = Qr$ for some $Q \in RH_\infty$ with $\|Q\|_\infty < \gamma$, or as $u + B_2^*X_\infty x = Q(w - \gamma^{-2}B_1^*X_\infty x)$. This gives equation (16.8). $\diamond$

We shall now prove the theorem.

Proof. ($\Rightarrow$) For simplicity, in the proof to follow we assume that the system is normalized so that $\gamma = 1$. Further, we will show that we can, without loss of generality, strengthen the assumption on $(C_1, A)$ from detectable to observable. Suppose there exists an admissible controller $K = \begin{bmatrix} \hat A & \hat B_1 & \hat B_2 \\ \hat C & \hat D_1 & \hat D_2 \end{bmatrix}$ such that $\|T_{zw}\|_\infty < 1$. If $(C_1, A)$ is detectable but not observable, then change coordinates for the state of $G$ to $\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ with $x_2$ unobservable, $(C_{11}, A_{11})$ observable, and $A_{22}$ stable, giving the following closed-loop state equations:

x2 unobservable, (C11; A11) observable, and A22 stable, giving the following closed-loopstate equations:

26666664

_x1

_x2_x

z

u

37777775=

26666664

A11 0 0 B11 B21

A21 A22 0 B12 B22

B11 B12 A B2 0

C11 0 0 0 D12

D11 D12 C D2 0

37777775

26666664

x1

x2

x

w

u

37777775:


If we take a new plant $G_{obs}$ with state $x_1$ and output $z$, and group the rest of the equations as a new controller $K_{obs}$ with state made up of $x_2$ and $\hat x$, then
$$ G_{obs}(s) = \begin{bmatrix} A_{11} & B_{11} & B_{21} \\ C_{11} & 0 & D_{12} \\ \begin{bmatrix} I \\ 0 \end{bmatrix} & \begin{bmatrix} 0 \\ I \end{bmatrix} & \begin{bmatrix} 0 \\ 0 \end{bmatrix} \end{bmatrix} $$
still satisfies the assumptions of the FI problem and is stabilized by $K_{obs}$ with the closed-loop $H_\infty$ norm $\|T_{zw}\|_\infty < 1$, where
$$ K_{obs} = \begin{bmatrix} A_{22} + B_{22}\hat D_{12} & B_{22}\hat C & A_{21} + B_{22}\hat D_{11} & B_{12} + B_{22}\hat D_2 \\ \hat B_{12} & \hat A & \hat B_{11} & \hat B_2 \\ \hat D_{12} & \hat C & \hat D_{11} & \hat D_2 \end{bmatrix}. $$
If we now show that there exists $\bar X_\infty > 0$ solving the $H_\infty$ Riccati equation for $G_{obs}$, i.e.,
$$ \bar X_\infty = Ric \begin{bmatrix} A_{11} & B_{11}B_{11}^* - B_{21}B_{21}^* \\ -C_{11}^*C_{11} & -A_{11}^* \end{bmatrix}, $$
then
$$ Ric(H_\infty) = X_\infty = \begin{bmatrix} \bar X_\infty & 0 \\ 0 & 0 \end{bmatrix} \ge 0 $$
exists for $G$. We can therefore assume without loss of generality that $(C_1, A)$ is observable.

We shall now suppose that there exists an admissible controller such that $\|T_{zw}\|_\infty < 1$. Note that the existence of an admissible controller such that $\|T_{zw}\|_\infty < 1$ is equivalent to the existence of an admissible controller making $\sup_{w \in BL_{2+}} \|z\|_2 < 1$; hence, it is necessary that
$$ \sup_{w \in BL_{2+}} \ \min_{u \in L_{2+}} \|z\|_2 < 1, $$
since the latter is never greater than the former, by the fact that the set of signals $u$ generated from admissible controllers is a subset of $L_{2+}$. But from Theorem 15.9, the latter is true if and only if $H_\infty \in \operatorname{dom}(Ric)$ and $X_\infty = Ric(H_\infty) > 0$. Hence the necessity is proven.

($\Leftarrow$) Suppose $H_\infty \in \operatorname{dom}(Ric)$ and $X_\infty = Ric(H_\infty) \ge 0$, and suppose $K(s)$ is an admissible controller such that $\|T_{zw}\|_\infty < 1$. Again change variables to $v := u - F_\infty x$ and $r := w - B_1^*X_\infty x$, so that the closed-loop system is as shown below:

[diagram: $T_{vr}$ closes the lower loop of $P$, which maps $(w, v)$ to $(z, r)$]

$$ P = \begin{bmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{bmatrix} = \begin{bmatrix} A_{F_\infty} & B_1 & B_2 \\ C_{1F_\infty} & 0 & D_{12} \\ -B_1^*X_\infty & I & 0 \end{bmatrix}. $$
By Lemma 16.7, $P$ is inner and $P_{21}^{-1} \in RH_\infty$. By Theorem 16.2, the system is internally stable and $\|T_{zw}\|_\infty < 1$ iff $T_{vr} \in RH_\infty$ and $\|T_{vr}\|_\infty < 1$. Now denote $Q := T_{vr}$; then $v = Qr$ and
$$ v = u - F_\infty x = \left( K - \begin{bmatrix} F_\infty & 0 \end{bmatrix} \right) \begin{bmatrix} x \\ w \end{bmatrix}, \qquad r = w - B_1^*X_\infty x = \begin{bmatrix} -B_1^*X_\infty & I \end{bmatrix} \begin{bmatrix} x \\ w \end{bmatrix}. $$
Hence we have
$$ \left( K - \begin{bmatrix} F_\infty & 0 \end{bmatrix} \right) \begin{bmatrix} x \\ w \end{bmatrix} = Q \begin{bmatrix} -B_1^*X_\infty & I \end{bmatrix} \begin{bmatrix} x \\ w \end{bmatrix} $$
or
$$ K - \begin{bmatrix} F_\infty & 0 \end{bmatrix} \cong Q \begin{bmatrix} -B_1^*X_\infty & I \end{bmatrix}. $$
Thus
$$ K(s) \cong \begin{bmatrix} F_\infty - Q(s)B_1^*X_\infty & Q(s) \end{bmatrix}, \qquad Q \in RH_\infty, \quad \|Q\|_\infty < 1. $$
Hence all suboptimal controllers are in the stated equivalence class. $\Box$

Remark 16.2 It should be emphasized that the set of controllers given above does not parameterize all FI suboptimal controllers, although it is sufficient for the purpose of deriving the output feedback results; that is why "equivalence class" is used. For instance, there is a suboptimal controller $K_1 = \begin{bmatrix} F & 0 \end{bmatrix}$ with $F \neq F_\infty$; however, there is no choice of $Q$ such that $K_1$ belongs to the set. Nevertheless, this problem will not occur in the output feedback case. $\diamond$

The following theorem gives all full information controllers.

Theorem 16.10 Suppose the condition in Theorem 16.9 is satisfied; then all admissible controllers satisfying $\|T_{zw}\|_\infty < \gamma$ can be parameterized as $K = \mathcal{F}_\ell(M_{FI}, Q)$:


[diagram: $K = \mathcal{F}_\ell(M_{FI}, Q)$]

$$ M_{FI}(s) = \begin{bmatrix} A + B_2F_\infty & \begin{bmatrix} 0 & B_1 \end{bmatrix} & B_2 \\ F_\infty & \begin{bmatrix} 0 & 0 \end{bmatrix} & I \\ \begin{bmatrix} -I \\ 0 \end{bmatrix} & \begin{bmatrix} I & 0 \\ -\gamma^{-2}B_1^*X_\infty & I \end{bmatrix} & 0 \end{bmatrix} $$
where $Q = \begin{bmatrix} Q_1 & Q_2 \end{bmatrix} \in RH_\infty$ and $\|Q_2\|_\infty < \gamma$.

Remark 16.3 It is simple to verify that for $Q_1 = 0$ we have
$$ K = \begin{bmatrix} F_\infty - \gamma^{-2}Q_2B_1^*X_\infty & Q_2 \end{bmatrix}, $$
which is the parameterization given in Theorem 16.9. The parameterization of all suboptimal FC controllers follows by duality and is therefore omitted. $\diamond$

Proof. We only need to show that $\mathcal{F}_\ell(M_{FI}, Q)$ with $\|Q_2\|_\infty < \gamma$ parameterizes all FI $H_\infty$ suboptimal controllers. To show that, we shall make the change of variables as before:
$$ v = u + B_2^*X_\infty x, \qquad r = w - \gamma^{-2}B_1^*X_\infty x. $$

Then the system equations can be written as follows (with $\bar w := \gamma w$, $\bar r := \gamma r$):
$$ \begin{bmatrix} \dot x \\ z \\ \bar r \end{bmatrix} = \begin{bmatrix} A_{F_\infty} & \gamma^{-1}B_1 & B_2 \\ C_{1F_\infty} & 0 & D_{12} \\ -\gamma^{-1}B_1^*X_\infty & I & 0 \end{bmatrix} \begin{bmatrix} x \\ \bar w \\ v \end{bmatrix} $$
and
$$ \begin{bmatrix} \dot x \\ v \\ y \end{bmatrix} = \begin{bmatrix} A_{tmp} & B_1 & B_2 \\ -F_\infty & 0 & I \\ \begin{bmatrix} I \\ \gamma^{-2}B_1^*X_\infty \end{bmatrix} & \begin{bmatrix} 0 \\ I \end{bmatrix} & 0 \end{bmatrix} \begin{bmatrix} x \\ r \\ u \end{bmatrix} $$
where $A_{tmp} := A + \gamma^{-2}B_1B_1^*X_\infty$.

This is shown pictorially in the following diagram:

[diagram: $K$ closes the loop from $y$ to $u$ around $G_{FI}$, generating $v$ from $r$; this loop closes the lower loop of the inner system $P$]


where $P$ is as given in equation (16.5) and
$$ G_{FI} = \begin{bmatrix} A_{tmp} & B_1 & B_2 \\ -F_\infty & 0 & I \\ \begin{bmatrix} I \\ \gamma^{-2}B_1^*X_\infty \end{bmatrix} & \begin{bmatrix} 0 \\ I \end{bmatrix} & 0 \end{bmatrix}. $$

So from Theorem 16.2 and Lemma 16.8, we conclude that $K$ is an admissible controller for $G$ with $\|T_{zw}\|_\infty < \gamma$ iff $K$ is an admissible controller for $G_{FI}$ with $\|T_{vr}\|_\infty < \gamma$.

Now let $L = \begin{bmatrix} B_2F_\infty & -B_1 \end{bmatrix}$; then
$$ A_{tmp} + L \begin{bmatrix} I \\ \gamma^{-2}B_1^*X_\infty \end{bmatrix} = A + B_2F_\infty $$
is stable. Also note that $A_{tmp} + B_2F_\infty$ is stable. Then all controllers that stabilize $G_{FI}$ can be parameterized as $K = \mathcal{F}_\ell(M, \Phi)$, $\Phi = \begin{bmatrix} \Phi_1 & \Phi_2 \end{bmatrix} \in RH_\infty$, where
$$ M = \begin{bmatrix} A_{tmp} + B_2F_\infty + L \begin{bmatrix} I \\ \gamma^{-2}B_1^*X_\infty \end{bmatrix} & -L & B_2 \\ F_\infty & 0 & I \\ -\begin{bmatrix} I \\ \gamma^{-2}B_1^*X_\infty \end{bmatrix} & I & 0 \end{bmatrix}. $$

With this parameterization of all controllers, the transfer matrix from $r$ to $v$ is $T_{vr} = \mathcal{F}_\ell(G_{FI}, \mathcal{F}_\ell(M, \Phi)) =: \mathcal{F}_\ell(N, \Phi)$. It is easy to show that
$$ N = \begin{bmatrix} 0 & I \\ \begin{bmatrix} 0 \\ I \end{bmatrix} & 0 \end{bmatrix} $$

and $T_{vr} = \mathcal{F}_\ell(N, \Phi) = \Phi_2$. Hence $\|T_{vr}\|_\infty < \gamma$ if and only if $\|\Phi_2\|_\infty < \gamma$. This implies that all FI $H_\infty$ controllers can be parameterized as $K = \mathcal{F}_\ell(M, \Phi)$ with $\Phi = \begin{bmatrix} \Phi_1 & \Phi_2 \end{bmatrix} \in RH_\infty$, $\|\Phi_2\|_\infty < \gamma$, and
$$ M = \begin{bmatrix} A + 2B_2F_\infty & -\begin{bmatrix} B_2F_\infty & -B_1 \end{bmatrix} & B_2 \\ F_\infty & 0 & I \\ -\begin{bmatrix} I \\ \gamma^{-2}B_1^*X_\infty \end{bmatrix} & I & 0 \end{bmatrix}. $$

Now let
$$ \Phi_1 = F_\infty - \gamma^{-2}Q_2B_1^*X_\infty + Q_1, \qquad \Phi_2 = Q_2. $$
Then it is easy to show that $\mathcal{F}_\ell(M, \Phi) = \mathcal{F}_\ell(M_{FI}, Q)$. $\Box$


16.5 Full Control

$$ G(s) = \begin{bmatrix} A & B_1 & \begin{bmatrix} I & 0 \end{bmatrix} \\ C_1 & 0 & \begin{bmatrix} 0 & I \end{bmatrix} \\ C_2 & D_{21} & \begin{bmatrix} 0 & 0 \end{bmatrix} \end{bmatrix} $$

This problem is dual to the Full Information problem. The assumptions that the FC problem inherits from the output feedback problem are just the duals of those in the FI problem:

(i) $(A, B_1)$ is stabilizable;

(ii) $(C_2, A)$ is detectable;

(iv) $\begin{bmatrix} B_1 \\ D_{21} \end{bmatrix} D_{21}^* = \begin{bmatrix} 0 \\ I \end{bmatrix}$.

Theorem 16.11 There exists an admissible controller $K(s)$ for the FC problem such that $\|T_{zw}\|_\infty < \gamma$ if and only if $J_\infty \in \operatorname{dom}(Ric)$ and $Y_\infty = Ric(J_\infty) \ge 0$. Moreover, if these conditions are satisfied, then an equivalence class of all admissible controllers satisfying $\|T_{zw}\|_\infty < \gamma$ can be parameterized as
$$ K(s) \cong \begin{bmatrix} L_\infty - \gamma^{-2}Y_\infty C_1^*Q(s) \\ Q(s) \end{bmatrix} $$
where $Q \in RH_\infty$, $\|Q\|_\infty < \gamma$.

As expected, the condition in Theorem 16.11 is the same as condition (ii) of Theorem 16.4.

16.6 Disturbance Feedforward

$$ G(s) = \begin{bmatrix} A & B_1 & B_2 \\ C_1 & 0 & D_{12} \\ C_2 & I & 0 \end{bmatrix} $$

This problem inherits the same assumptions (i)-(iii) as the FI problem, but for internal stability we shall add the assumption that $A - B_1C_2$ is stable. With this assumption, it is easy to check that the conditions of Lemma 16.1 are satisfied, so that internal stability is again equivalent to $T_{zw} \in RH_\infty$, as in the output feedback case.


Theorem 16.12 There exists an admissible controller for the DF problem such that $\|T_{zw}\|_\infty < \gamma$ if and only if $H_\infty \in \operatorname{dom}(Ric)$ and $X_\infty = Ric(H_\infty) \ge 0$. Moreover, if these conditions are satisfied, then all admissible controllers satisfying $\|T_{zw}\|_\infty < \gamma$ can be parameterized as the set of all transfer matrices from $y$ to $u$ in

[diagram: $K = \mathcal{F}_\ell(M_\infty, Q)$]

$$ M_\infty(s) = \begin{bmatrix} A + B_2F_\infty - B_1C_2 & B_1 & B_2 \\ F_\infty & 0 & I \\ -C_2 - \gamma^{-2}B_1^*X_\infty & I & 0 \end{bmatrix} $$
with $Q \in RH_\infty$, $\|Q\|_\infty < \gamma$.

Proof. Suppose there is a controller $K_{DF}$ solving the DF problem, i.e., achieving $\|T_{zw}\|_\infty < \gamma$. Then by Theorem 12.4, the controller $K_{FI} = K_{DF}\begin{bmatrix} C_2 & I \end{bmatrix}$ solves the corresponding $H_\infty$ FI problem. Hence the conditions $H_\infty \in \operatorname{dom}(Ric)$ and $X_\infty = Ric(H_\infty) \ge 0$ are necessarily satisfied. On the other hand, if these conditions are satisfied, then the FI problem is solvable. It is easy to verify that $\mathcal{F}_\ell(M_\infty, Q) = \mathcal{F}_\ell(P_{DF}, K_{FI})$ with $K_{FI} = \begin{bmatrix} F_\infty - \gamma^{-2}Q(s)B_1^*X_\infty & Q(s) \end{bmatrix}$, where $P_{DF}$ is as defined in Section 12.2 of Chapter 12. So again by Theorem 12.4, the controller $\mathcal{F}_\ell(M_\infty, Q)$ solves the DF problem.

To show that $\mathcal{F}_\ell(M_\infty, Q)$ with $\|Q\|_\infty < \gamma$ parameterizes all DF $H_\infty$ suboptimal controllers, we shall make the change of variables as in equation (16.4):
$$ v = u + B_2^*X_\infty x, \qquad r = w - \gamma^{-2}B_1^*X_\infty x. $$

Then the system equations can be written as follows (with $\bar w := \gamma w$, $\bar r := \gamma r$):
$$ \begin{bmatrix} \dot x \\ z \\ \bar r \end{bmatrix} = \begin{bmatrix} A_{F_\infty} & \gamma^{-1}B_1 & B_2 \\ C_{1F_\infty} & 0 & D_{12} \\ -\gamma^{-1}B_1^*X_\infty & I & 0 \end{bmatrix} \begin{bmatrix} x \\ \bar w \\ v \end{bmatrix} $$
and
$$ \begin{bmatrix} \dot x \\ v \\ y \end{bmatrix} = \begin{bmatrix} A_{tmp} & B_1 & B_2 \\ -F_\infty & 0 & I \\ C_2 + \gamma^{-2}B_1^*X_\infty & I & 0 \end{bmatrix} \begin{bmatrix} x \\ r \\ u \end{bmatrix}. $$

This is shown pictorially in the following diagram:


[diagram: $K$ closes the loop from $y$ to $u$ around $G_{DF}$, generating $v$ from $r$; this loop closes the lower loop of the inner system $P$]

where
$$ P = \begin{bmatrix} A_{F_\infty} & \gamma^{-1}B_1 & B_2 \\ C_{1F_\infty} & 0 & D_{12} \\ -\gamma^{-1}B_1^*X_\infty & I & 0 \end{bmatrix} $$
and
$$ G_{DF} = \begin{bmatrix} A_{tmp} & B_1 & B_2 \\ -F_\infty & 0 & I \\ C_2 + \gamma^{-2}B_1^*X_\infty & I & 0 \end{bmatrix}. $$

Since $A_{tmp} - B_1(C_2 + \gamma^{-2}B_1^*X_\infty) = A - B_1C_2$ and $A_{tmp} + B_2F_\infty$ are stable, the rank conditions of Lemma 16.1 for the system $G_{DF}$ are satisfied. So from Theorem 16.2 and Lemma 16.7, we conclude that $K$ is an admissible controller for $G$ with $\|T_{zw}\|_\infty < \gamma$ iff $K$ is an admissible controller for $G_{DF}$ with $\|T_{vr}\|_\infty < \gamma$. Now it is easy to see, by comparing this formula with the controller parameterization in Theorem 12.8, that $\mathcal{F}_\ell(M_\infty, Q)$ with $Q \in RH_\infty$ (no norm constraint) parameterizes all stabilizing controllers for $G_{DF}$; moreover, simple algebra shows that $T_{vr} = \mathcal{F}_\ell(G_{DF}, \mathcal{F}_\ell(M_\infty, Q)) = Q$. So $\|T_{vr}\|_\infty < \gamma$ iff $\|Q\|_\infty < \gamma$, and $\mathcal{F}_\ell(M_\infty, Q)$ with $Q \in RH_\infty$ and $\|Q\|_\infty < \gamma$ parameterizes all suboptimal controllers for $G$. $\Box$

16.7 Output Estimation

$$ G(s) = \begin{bmatrix} A & B_1 & B_2 \\ C_1 & 0 & I \\ C_2 & D_{21} & 0 \end{bmatrix} $$


This problem is dual to DF, just as FC was dual to FI. Thus the discussion of the DF problem is relevant here, when appropriately dualized. The OE assumptions are

(i) $(A, B_1)$ is stabilizable and $A - B_2C_1$ is stable;

(ii) $(C_2, A)$ is detectable;

(iv) $\begin{bmatrix} B_1 \\ D_{21} \end{bmatrix} D_{21}^* = \begin{bmatrix} 0 \\ I \end{bmatrix}$.

Assumption (i), together with (iv), implies that internal stability is again equivalent to $T_{zw} \in RH_\infty$, as in the output feedback case.

Theorem 16.13 There exists an admissible controller for the OE problem such that $\|T_{zw}\|_\infty < \gamma$ if and only if $J_\infty \in \operatorname{dom}(Ric)$ and $Y_\infty = Ric(J_\infty) \ge 0$. Moreover, if these conditions are satisfied, then all admissible controllers satisfying $\|T_{zw}\|_\infty < \gamma$ can be parameterized as the set of all transfer matrices from $y$ to $u$ in

[diagram: $K = \mathcal{F}_\ell(M_\infty, Q)$]

$$ M_\infty(s) = \begin{bmatrix} A + L_\infty C_2 - B_2C_1 & L_\infty & -B_2 - \gamma^{-2}Y_\infty C_1^* \\ C_1 & 0 & I \\ C_2 & I & 0 \end{bmatrix} $$
with $Q \in RH_\infty$, $\|Q\|_\infty < \gamma$.

It is interesting to compare $H_\infty$ and $H_2$ in the context of the OE problem, even though, by duality, the essence of these remarks has been made before. Both optimal estimators are observers with the observer gain determined by $\mathrm{Ric}(J_\infty)$ and $\mathrm{Ric}(J_2)$, respectively. Optimal $H_2$ output estimation consists of multiplying the optimal state estimate by the output map $C_1$. Thus optimal $H_2$ estimation depends only trivially on the output $z$ that is being estimated, and state estimation is the fundamental problem. In contrast, the $H_\infty$ estimation problem depends very explicitly and importantly on the output being estimated. This will have implications for the separation properties of the $H_\infty$ output feedback controller.

16.8 Separation Theory

If we assume the results of the special problems, which were proven in the previous sections, we can now prove Theorems 16.4 and 16.5 using separation arguments. This essentially involves reducing the output feedback problem to a combination of the Full Information and the Output Estimation problems. The separation properties of the $H_\infty$ controller are more complicated than those of the $H_2$ controller, although they are no less interesting. The notation and assumptions for this section are as in Section 16.2.


16.8.1 $H_\infty$ Controller Structure

The $H_\infty$ controller formulae from Theorem 16.4 are
$$K_{sub}(s) := \begin{bmatrix} \hat{A}_\infty & -Z_\infty L_\infty \\ F_\infty & 0 \end{bmatrix}$$
$$\hat{A}_\infty := A + \gamma^{-2}B_1B_1^*X_\infty + B_2F_\infty + Z_\infty L_\infty C_2$$
$$F_\infty := -B_2^*X_\infty, \qquad L_\infty := -Y_\infty C_2^*, \qquad Z_\infty := (I - \gamma^{-2}Y_\infty X_\infty)^{-1}$$
where $X_\infty := \mathrm{Ric}(H_\infty)$ and $Y_\infty := \mathrm{Ric}(J_\infty)$. The necessary and sufficient conditions for the existence of an admissible controller such that $\|T_{zw}\|_\infty < \gamma$ are

(i) $H_\infty \in \mathrm{dom(Ric)}$ and $X_\infty := \mathrm{Ric}(H_\infty) \geq 0$;

(ii) $J_\infty \in \mathrm{dom(Ric)}$ and $Y_\infty := \mathrm{Ric}(J_\infty) \geq 0$;

(iii) $\rho(X_\infty Y_\infty) < \gamma^2$.
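Numerically, $X_\infty$ and $Y_\infty$ are obtained from the stable invariant subspace of the corresponding Hamiltonian. The following is a minimal illustrative sketch of the dom(Ric) construction (a naive eigenvector method, not the numerically robust ordered-Schur approach; the scalar data are the $a = -1$, $b_2 = c_2 = 1$, $\gamma = 1$ plant that appears in Example 16.1 below):

```python
import numpy as np

def ric(H):
    """Naive dom(Ric) construction: stack eigenvectors of the stable
    eigenvalues as [X1; X2] and return X = X2 X1^{-1}."""
    n = H.shape[0] // 2
    w, V = np.linalg.eig(H)
    Vs = V[:, w.real < 0]          # basis of the stable invariant subspace
    X1, X2 = Vs[:n, :], Vs[n:, :]
    return np.real(X2 @ np.linalg.inv(X1))

# Scalar data: H_inf = [[a, (1 - b2^2 g^2)/g^2], [-1, -a]].
a, b2, g = -1.0, 1.0, 1.0
H = np.array([[a, (1.0 - b2**2 * g**2) / g**2],
              [-1.0, -a]])
x_val = ric(H)[0, 0]   # closed form: g / (sqrt((a^2+b2^2) g^2 - 1) - a g) = 0.5
print(x_val)
```

The same helper applied to $J_\infty$ yields $Y_\infty$.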

We have seen that condition (i) corresponds to the Full Information condition and that (ii) corresponds to the Full Control condition. It is easily shown that, given the FI and FC results, these conditions are necessary for the output feedback case as well.

Lemma 16.14 Suppose there exists an admissible controller making $\|T_{zw}\|_\infty < \gamma$. Then conditions (i) and (ii) hold.

Proof. Let $K$ be an admissible controller for which $\|T_{zw}\|_\infty < \gamma$. The controller $K\begin{bmatrix} C_2 & D_{21}\end{bmatrix}$ solves the FI problem; hence from Theorem 16.9, $H_\infty \in \mathrm{dom(Ric)}$ and $X_\infty := \mathrm{Ric}(H_\infty) \geq 0$. Condition (ii) follows by the dual argument. $\Box$

We would also expect some condition beyond these two, and that is provided by (iii), which is an elegant combination of elements from FI and FC. Note that all the conditions of Theorem 16.4 are symmetric in $H_\infty$, $J_\infty$, $X_\infty$, and $Y_\infty$, but the formula for the controller is not. Needless to say, there is a dual form that can be obtained by inspection from the above formula. For a symmetric formula, the state equations above can be multiplied through by $Z_\infty^{-1}$ and put in descriptor form. A simple substitution from the Riccati equation for $X_\infty$ will then yield a symmetric, though more complicated, formula:

$$(I - \gamma^{-2}Y_\infty X_\infty)\dot{x} = A_s x - L_\infty y \qquad (16.10)$$
$$u = F_\infty x \qquad (16.11)$$
where $A_s := A + B_2F_\infty + L_\infty C_2 + \gamma^{-2}Y_\infty A^*X_\infty + \gamma^{-2}B_1B_1^*X_\infty + \gamma^{-2}Y_\infty C_1^*C_1$.

To emphasize its relationship to the $H_2$ controller formulae, the $H_\infty$ controller can be written as
$$\dot{\hat{x}} = A\hat{x} + B_1\hat{w}_{worst} + B_2u + Z_\infty L_\infty(C_2\hat{x} - y)$$
$$u = F_\infty\hat{x}, \qquad \hat{w}_{worst} = \gamma^{-2}B_1^*X_\infty\hat{x}.$$
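As a sanity check on these formulae, one can assemble $K_{sub}$ for the scalar plant used in Example 16.1 later in this chapter ($a = -1$, $b_2 = c_2 = 1$) at $\gamma = 1$, which exceeds $\gamma_{opt}$, and confirm $\|T_{zw}\|_\infty < \gamma$ on a frequency grid. A minimal sketch (gridding is a crude norm estimate, not a substitute for a proper bisection algorithm):

```python
import numpy as np

# Scalar plant: z = [x; u], y = x + w2, B1 = [1 0]; closed forms for X_inf, Y_inf.
a, b2, c2, g = -1.0, 1.0, 1.0, 1.0
Xinf = g / (np.sqrt((a**2 + b2**2) * g**2 - 1) - a * g)   # = 0.5
Yinf = g / (np.sqrt((a**2 + c2**2) * g**2 - 1) - a * g)   # = 0.5
Zinf = 1.0 / (1.0 - Xinf * Yinf / g**2)                   # = 4/3
Finf, Linf = -b2 * Xinf, -c2 * Yinf
Ahat = a + Xinf / g**2 + b2 * Finf + Zinf * Linf * c2     # A-hat_inf (B1 B1* = 1)

# Closed loop of plant state x with controller state q (K_sub has D = 0).
Acl = np.array([[a, b2 * Finf], [-Zinf * Linf * c2, Ahat]])
Bcl = np.array([[1.0, 0.0], [0.0, -Zinf * Linf]])
Ccl = np.array([[1.0, 0.0], [0.0, Finf]])

def hinf_grid(A, B, C, omegas):
    """Crude H-infinity norm estimate: max singular value over a grid."""
    n = A.shape[0]
    return max(np.linalg.norm(C @ np.linalg.solve(1j * w * np.eye(n) - A, B), 2)
               for w in omegas)

norm_est = hinf_grid(Acl, Bcl, Ccl, np.logspace(-3, 3, 2000))
print(norm_est)   # strictly below gamma = 1, as Theorem 16.4 guarantees
```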

These equations have the structure of an observer-based compensator. The obvious questions that arise when these formulae are compared with the $H_2$ formulae are:

1) Where does the term $B_1\hat{w}_{worst}$ come from?

2) Why $Z_\infty L_\infty$ instead of $L_\infty$?

3) Is there a separation interpretation of these formulae analogous to that for $H_2$?

The proof of Theorem 16.4 reveals that there is a very well-defined separation interpretation of these formulae and that $w_{worst} := \gamma^{-2}B_1^*X_\infty x$ is, in some sense, a worst-case input for the Full Information problem. Furthermore, $Z_\infty L_\infty$ is actually the optimal filter gain for estimating $F_\infty x$, the optimal Full Information control input, in the presence of this worst-case input. It is therefore not surprising that $Z_\infty L_\infty$ should enter the controller equations instead of $L_\infty$. The term $\hat{w}_{worst}$ may be thought of, loosely, as an estimate of $w_{worst}$.

16.8.2 Proof of Theorem 16.4

It has been shown in Lemma 16.14 that conditions (i) and (ii) are necessary for $\|T_{zw}\|_\infty < \gamma$. Hence we only need to show that if conditions (i) and (ii) are satisfied, condition (iii) is necessary and sufficient for $\|T_{zw}\|_\infty < \gamma$. As in Section 16.3, we define new disturbance and control variables
$$r := w - \gamma^{-2}B_1^*X_\infty x, \qquad v := u + B_2^*X_\infty x.$$

Then
$$\begin{bmatrix} v \\ y \end{bmatrix} = \begin{bmatrix} A_{tmp} & B_1 & B_2 \\ -F_\infty & 0 & I \\ C_2 & D_{21} & 0 \end{bmatrix}\begin{bmatrix} r \\ u \end{bmatrix} = G_{tmp}\begin{bmatrix} r \\ u \end{bmatrix}, \qquad A_{tmp} := A + \gamma^{-2}B_1B_1^*X_\infty.$$

(block diagrams: $K$ in feedback with $G$, mapping $w \to z$; equivalently, $K$ in feedback with $G_{tmp}$, mapping $r \to v$)

Recall from Lemma 16.8 that $K$ is admissible for $G$ with $\|T_{zw}\|_\infty < \gamma$ iff $K$ is admissible for $G_{tmp}$ with $\|T_{vr}\|_\infty < \gamma$.

While $G_{tmp}$ has the form required for the OE problem, to actually use the OE results we need to verify that $G_{tmp}$ satisfies the following assumptions for the OE problem:


(i) $(A_{tmp}, B_1)$ is stabilizable and $A_{tmp} + B_2F_\infty$ is stable;

(ii) $(C_2, A_{tmp})$ is detectable;

(iv) $\begin{bmatrix} B_1 \\ D_{21} \end{bmatrix}D_{21}^* = \begin{bmatrix} 0 \\ I \end{bmatrix}$.

Assumption (iv) and the stabilizability of $(A_{tmp}, B_1)$ follow immediately from the corresponding assumptions of Theorem 16.4. The stability of $A_{tmp} + B_2F_\infty$ follows from the definition of $H_\infty \in \mathrm{dom(Ric)}$. The following lemma gives conditions for assumption (ii) to hold. Of course, the existence of an admissible controller for $G_{tmp}$ immediately implies that assumption (ii) holds. Note that the OE Hamiltonian matrix for $G_{tmp}$ is

$$J_{tmp} := \begin{bmatrix} A_{tmp}^* & \gamma^{-2}F_\infty^*F_\infty - C_2^*C_2 \\ -B_1B_1^* & -A_{tmp} \end{bmatrix}.$$

Lemma 16.15 If $J_{tmp} \in \mathrm{dom(Ric)}$ and $Y_{tmp} := \mathrm{Ric}(J_{tmp}) \geq 0$, then $(C_2, A_{tmp})$ is detectable.

Proof. The lemma follows from the dual to Lemma 16.6, which gives that $A_{tmp} - Y_{tmp}C_2^*C_2$ is stable. $\Box$

Proof of Theorem 16.4 (Sufficiency) Assume conditions (i) through (iii) in the theorem statement hold. Using the Riccati equation for $X_\infty$, one can easily verify that
$$T := \begin{bmatrix} I & -\gamma^{-2}X_\infty \\ 0 & I \end{bmatrix}$$
provides a similarity transformation between $J_{tmp}$ and $J_\infty$, i.e., $T^{-1}J_{tmp}T = J_\infty$. So
$$\mathcal{X}_-(J_{tmp}) = T\,\mathcal{X}_-(J_\infty) = T\,\mathrm{Im}\begin{bmatrix} I \\ Y_\infty \end{bmatrix} = \mathrm{Im}\begin{bmatrix} I - \gamma^{-2}X_\infty Y_\infty \\ Y_\infty \end{bmatrix}$$
and $\rho(X_\infty Y_\infty) < \gamma^2$ implies that $J_{tmp} \in \mathrm{dom(Ric)}$ and $Y_{tmp} := \mathrm{Ric}(J_{tmp}) = Y_\infty(I - \gamma^{-2}X_\infty Y_\infty)^{-1} = Z_\infty Y_\infty \geq 0$. Thus by Lemma 16.15 the OE assumptions hold for $G_{tmp}$, and by Theorem 16.13 the OE problem is solvable. From Theorem 16.13 with $Q = 0$, one solution is
$$\begin{bmatrix} A + \gamma^{-2}B_1B_1^*X_\infty - Y_{tmp}C_2^*C_2 + B_2F_\infty & Y_{tmp}C_2^* \\ F_\infty & 0 \end{bmatrix}$$
but this is precisely $K_{sub}$ defined in Theorem 16.4. We conclude that $K_{sub}$ stabilizes $G_{tmp}$ with $\|T_{vr}\|_\infty < \gamma$. Then by Lemma 16.8, $K_{sub}$ stabilizes $G$ with $\|T_{zw}\|_\infty < \gamma$.
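The similarity claim $T^{-1}J_{tmp}T = J_\infty$ is easy to confirm numerically; a sketch on the scalar data of Example 16.1 later in this chapter ($a = -1$, $b_2 = c_2 = 1$, $\gamma = 1$), with all quantities built from their closed-form scalar expressions:

```python
import numpy as np

a, b2, c2, g = -1.0, 1.0, 1.0, 1.0
Xinf = g / (np.sqrt((a**2 + b2**2) * g**2 - 1) - a * g)   # = 0.5
Finf = -b2 * Xinf
Atmp = a + Xinf / g**2                                    # A + g^-2 B1 B1* X_inf

# OE Hamiltonian for G_tmp, and J_inf for G.
Jtmp = np.array([[Atmp, Finf**2 / g**2 - c2**2],
                 [-1.0, -Atmp]])
Jinf = np.array([[a, (1.0 - c2**2 * g**2) / g**2],
                 [-1.0, -a]])
T = np.array([[1.0, -Xinf / g**2],
              [0.0, 1.0]])
err = np.linalg.inv(T) @ Jtmp @ T - Jinf
print(np.abs(err).max())   # ~0: T is indeed a similarity between Jtmp and Jinf
```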


(Necessity) Let $K$ be an admissible controller for which $\|T_{zw}\|_\infty < \gamma$. By Lemma 16.14, $H_\infty \in \mathrm{dom(Ric)}$, $X_\infty := \mathrm{Ric}(H_\infty) \geq 0$, $J_\infty \in \mathrm{dom(Ric)}$, and $Y_\infty := \mathrm{Ric}(J_\infty) \geq 0$. From Lemma 16.8, $K$ is admissible for $G_{tmp}$ and $\|T_{vr}\|_\infty < \gamma$. This implies that the OE assumptions hold for $G_{tmp}$ and that the OE problem is solvable. Therefore, from Theorem 16.13 applied to $G_{tmp}$, we have that $J_{tmp} \in \mathrm{dom(Ric)}$ and $Y_{tmp} = \mathrm{Ric}(J_{tmp}) \geq 0$. Using the same similarity transformation as in the sufficiency part, we get $Y_{tmp} = (I - \gamma^{-2}Y_\infty X_\infty)^{-1}Y_\infty \geq 0$. We shall now show that $Y_{tmp} \geq 0$ implies $\rho(X_\infty Y_\infty) < \gamma^2$. We shall consider two cases:

• $Y_\infty$ is nonsingular: in this case $Y_{tmp} \geq 0$ implies $I - \gamma^{-2}Y_\infty^{1/2}X_\infty Y_\infty^{1/2} > 0$, so $\rho(Y_\infty^{1/2}X_\infty Y_\infty^{1/2}) < \gamma^2$, i.e., $\rho(X_\infty Y_\infty) < \gamma^2$.

• $Y_\infty$ is singular: there is a unitary matrix $U$ such that
$$Y_\infty = U^*\begin{bmatrix} Y_{11} & 0 \\ 0 & 0 \end{bmatrix}U$$
with $Y_{11} > 0$. Let $UX_\infty U^*$ be partitioned accordingly:
$$UX_\infty U^* = \begin{bmatrix} X_{11} & X_{12} \\ X_{21} & X_{22} \end{bmatrix}.$$
Then by the same argument as in the $Y_\infty$ nonsingular case,
$$Y_{tmp} = U^*\begin{bmatrix} (I - \gamma^{-2}Y_{11}X_{11})^{-1}Y_{11} & 0 \\ 0 & 0 \end{bmatrix}U \geq 0$$
implies that $\gamma^2 > \rho(X_{11}Y_{11})\ (= \rho(X_\infty Y_\infty))$. $\Box$

We now see exactly why the term involving $\hat{w}_{worst}$ appears and why the "observer" gain is $Z_\infty L_\infty$. Both terms are consequences of estimating the optimal Full Information (i.e., state feedback) control gain. While an analogous output estimation problem arises in the $H_2$ output feedback problem, the resulting equations are much simpler. This is because there is no "worst-case" disturbance for the $H_2$ Full Information problem and because the problem of estimating any output, including the optimal state feedback, is equivalent to state estimation.

We now present a separation interpretation for $H_\infty$ suboptimal controllers. It will be stated in terms of the central controller, but similar interpretations could be made for the parameterization of all suboptimal controllers (see the proofs of Theorems 16.4 and 16.5).

The $H_\infty$ output feedback controller is the output estimator of the full information control law in the presence of the "worst-case" disturbance $w_{worst}$.

Note that the same statement holds for the $H_2$ optimal controller, except that $w_{worst} = 0$.


16.8.3 Proof of Theorem 16.5

From Lemma 16.8, the set of all admissible controllers for $G$ such that $\|T_{zw}\|_\infty < \gamma$ equals the set of all admissible controllers for $G_{tmp}$ such that $\|T_{vr}\|_\infty < \gamma$. Apply Theorem 16.13. $\Box$

16.9 Optimality and Limiting Behavior

In this section, we will discuss, without proof, the behavior of the $H_\infty$ suboptimal solution as $\gamma$ varies, especially as $\gamma$ approaches the infimal achievable norm, denoted by $\gamma_{opt}$. Since Theorem 16.4 gives necessary and sufficient conditions for the existence of an admissible controller such that $\|T_{zw}\|_\infty < \gamma$, $\gamma_{opt}$ is the infimum over all $\gamma$ such that conditions (i)-(iii) are satisfied. Theorem 16.4 does not give an explicit formula for $\gamma_{opt}$, but, just as for the $H_\infty$ norm calculation, it can be computed as closely as desired by a search technique.
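The search technique is typically a bisection on $\gamma$ over the feasibility of conditions (i)-(iii). A sketch for the scalar plant of Example 16.1 below ($a = -1$, $b_2 = c_2 = 1$), with the feasibility test written directly from the closed-form scalar expressions for $X_\infty$ and $Y_\infty$ (illustrative only; a general implementation would test the Riccati conditions numerically):

```python
import numpy as np

a, b2, c2 = -1.0, 1.0, 1.0

def feasible(g):
    """Conditions (i)-(iii) of Theorem 16.4 for the scalar example."""
    d = (a**2 + b2**2) * g**2 - 1
    if d <= 0:
        return False                  # H_inf leaves dom(Ric): stability fails
    Xinf = g / (np.sqrt(d) - a * g)   # > 0 here since a < 0
    Yinf = g / (np.sqrt((a**2 + c2**2) * g**2 - 1) - a * g)
    return Xinf > 0 and Yinf > 0 and Xinf * Yinf < g**2   # (iii)

lo, hi = 0.1, 10.0                    # infeasible / feasible bracket
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if feasible(mid):
        hi = mid
    else:
        lo = mid
gamma_opt = hi
print(gamma_opt)   # -> sqrt(3) - 1 = 0.7321 for this data
```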

Although we have not focused on the problem of $H_\infty$ optimal controllers, the assumptions in this book make them relatively easy to obtain in most cases. In addition to describing the qualitative behavior of suboptimal solutions as $\gamma$ varies, we will indicate why the descriptor version of the controller formulae from Section 16.8.1 can usually provide formulae for the optimal controller when $\gamma = \gamma_{opt}$. Most of these results can be obtained relatively easily using the machinery developed in the previous sections. The reader interested in filling in the details is encouraged to begin by strengthening assumption (i) to controllable and observable and considering the Hamiltonians for $X_\infty^{-1}$ and $Y_\infty^{-1}$.

As $\gamma \to \infty$, $H_\infty \to H_2$, $X_\infty \to X_2$, etc., and $K_{sub} \to K_2$. This fact is a result of the particular choice of central controller ($Q = 0$) made here. While it could be argued that $K_{sub}$ is a natural choice, this connection with $H_2$ actually hints at deeper interpretations. In fact, $K_{sub}$ is the minimum entropy solution (see the next section) as well as the minimax controller for $\|z\|_2^2 - \gamma^2\|w\|_2^2$.

If $\gamma_2 \geq \gamma_1 > \gamma_{opt}$, then $X_\infty(\gamma_1) \geq X_\infty(\gamma_2)$ and $Y_\infty(\gamma_1) \geq Y_\infty(\gamma_2)$. Thus $X_\infty$ and $Y_\infty$ are decreasing functions of $\gamma$, as is $\rho(X_\infty Y_\infty)$. At $\gamma = \gamma_{opt}$, any one of the three conditions in Theorem 16.4 can fail. If only condition (iii) fails, then it is relatively straightforward to show that the descriptor formulae for $\gamma = \gamma_{opt}$ are optimal, i.e., the optimal controller is given by

$$(I - \gamma_{opt}^{-2}Y_\infty X_\infty)\dot{x} = A_s x - L_\infty y \qquad (16.12)$$
$$u = F_\infty x \qquad (16.13)$$
where $A_s := A + B_2F_\infty + L_\infty C_2 + \gamma_{opt}^{-2}Y_\infty A^*X_\infty + \gamma_{opt}^{-2}B_1B_1^*X_\infty + \gamma_{opt}^{-2}Y_\infty C_1^*C_1$. See the example below.

The formulae in Theorem 16.4 are not well-defined in the optimal case because the term $(I - \gamma_{opt}^{-2}X_\infty Y_\infty)$ is not invertible. It is possible, but far less likely, that condition (i) or (ii) would fail before (iii). To see this, consider (i) and let $\gamma_1$ be the largest $\gamma$


for which $H_\infty$ fails to be in $\mathrm{dom(Ric)}$ because the $H_\infty$ matrix fails to have either the stability property or the complementarity property. The same remarks apply to (ii) by duality.

If complementarity fails at $\gamma = \gamma_1$, then $\rho(X_\infty) \to \infty$ as $\gamma \to \gamma_1$. For $\gamma < \gamma_1$, $H_\infty$ may again be in $\mathrm{dom(Ric)}$, but $X_\infty$ will be indefinite. For such $\gamma$, the controller $u = -B_2^*X_\infty x$ would make $\|T_{zw}\|_\infty < \gamma$ but would not be stabilizing. See part 1) of the example below. If the stability property fails at $\gamma = \gamma_1$, then $H_\infty \notin \mathrm{dom(Ric)}$, but Ric can be extended to obtain $X_\infty$, and the controller $u = -B_2^*X_\infty x$ is stabilizing and makes $\|T_{zw}\|_\infty = \gamma_1$. The stability property will also not hold for any $\gamma \leq \gamma_1$, and no controller whatsoever exists which makes $\|T_{zw}\|_\infty < \gamma_1$. In other words, if stability breaks down first, then the infimum over stabilizing controllers equals the infimum over all controllers, stabilizing or otherwise. See part 2) of the example below. In view of this, we would typically expect that complementarity would fail first.

Complementarity failing at $\gamma = \gamma_1$ means $\rho(X_\infty) \to \infty$ as $\gamma \to \gamma_1$, so condition (iii) would fail at even larger values of $\gamma$, unless the eigenvectors associated with $\rho(X_\infty)$ as $\gamma \to \gamma_1$ are in the null space of $Y_\infty$. Thus condition (iii) is the most likely of all to fail first. If condition (i) or (ii) fails first because the stability property fails, the formulae in Theorem 16.4, as well as their descriptor versions, are optimal at $\gamma = \gamma_{opt}$. This is illustrated in the example below for the output feedback case. If the complementarity condition fails first (but (iii) does not fail), then obtaining formulae for the optimal controllers is a more subtle problem.

Example 16.1 Let an interconnected dynamical system realization be given by
$$G(s) = \begin{bmatrix} a & \begin{bmatrix} 1 & 0 \end{bmatrix} & b_2 \\ \begin{bmatrix} 1 \\ 0 \end{bmatrix} & 0 & \begin{bmatrix} 0 \\ 1 \end{bmatrix} \\ c_2 & \begin{bmatrix} 0 & 1 \end{bmatrix} & 0 \end{bmatrix}$$
with $|c_2| \geq |b_2| > 0$. Then all assumptions for the output feedback problem are satisfied and
$$H_\infty = \begin{bmatrix} a & \dfrac{1 - b_2^2\gamma^2}{\gamma^2} \\ -1 & -a \end{bmatrix}, \qquad J_\infty = \begin{bmatrix} a & \dfrac{1 - c_2^2\gamma^2}{\gamma^2} \\ -1 & -a \end{bmatrix}.$$
The eigenvalues of $H_\infty$ and $J_\infty$ are given, respectively, by
$$\lambda(H_\infty) = \left\{\pm\frac{\sqrt{(a^2 + b_2^2)\gamma^2 - 1}}{\gamma}\right\}, \qquad \lambda(J_\infty) = \left\{\pm\frac{\sqrt{(a^2 + c_2^2)\gamma^2 - 1}}{\gamma}\right\}.$$
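These eigenvalue formulas are straightforward to confirm numerically; a sketch with arbitrary illustrative values $a = 1$, $b_2 = 2$, $\gamma = 0.8$ (hypothetical data, chosen only so that $(a^2 + b_2^2)\gamma^2 > 1$):

```python
import numpy as np

a, b2, g = 1.0, 2.0, 0.8
H = np.array([[a, (1.0 - b2**2 * g**2) / g**2],
              [-1.0, -a]])
lam = np.sort(np.linalg.eigvals(H).real)          # eigenvalues of H_inf
predicted = np.sqrt((a**2 + b2**2) * g**2 - 1) / g
print(lam, predicted)   # lam = [-predicted, +predicted]
```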

If $\gamma^2 > \dfrac{1}{a^2 + b_2^2}\ \left(\geq \dfrac{1}{a^2 + c_2^2}\right)$, then $\mathcal{X}_-(H_\infty)$ and $\mathcal{X}_-(J_\infty)$ exist and
$$\mathcal{X}_-(H_\infty) = \mathrm{Im}\begin{bmatrix} \dfrac{\sqrt{(a^2+b_2^2)\gamma^2 - 1} - a\gamma}{\gamma} \\ 1 \end{bmatrix}, \qquad \mathcal{X}_-(J_\infty) = \mathrm{Im}\begin{bmatrix} \dfrac{\sqrt{(a^2+c_2^2)\gamma^2 - 1} - a\gamma}{\gamma} \\ 1 \end{bmatrix}.$$

We shall consider two cases:

1) $a > 0$: In this case, the complementarity property of $\mathrm{dom(Ric)}$ will fail before the stability property fails, since
$$\sqrt{(a^2 + b_2^2)\gamma^2 - 1} - a\gamma = 0$$
when $\gamma^2 = \dfrac{1}{b_2^2}\ \left(> \dfrac{1}{a^2 + b_2^2}\right)$. Nevertheless, if $\gamma^2 > \dfrac{1}{a^2 + b_2^2}$ and $\gamma^2 \neq \dfrac{1}{b_2^2}$, then $H_\infty \in \mathrm{dom(Ric)}$ and
$$X_\infty = \frac{\gamma}{\sqrt{(a^2+b_2^2)\gamma^2 - 1} - a\gamma}\;
\begin{cases} > 0, & \text{if } \gamma^2 > \frac{1}{b_2^2} \\ < 0, & \text{if } \frac{1}{a^2+b_2^2} < \gamma^2 < \frac{1}{b_2^2}. \end{cases}$$
Let $F_\infty = -B_2^*X_\infty$; then
$$A + B_2F_\infty = -\frac{a + b_2^2\gamma\sqrt{(a^2+b_2^2)\gamma^2 - 1}}{b_2^2\gamma^2 - 1}\;
\begin{cases} < 0 \text{ (stable)}, & \text{if } \gamma^2 > \frac{1}{b_2^2} \\ > 0 \text{ (unstable)}, & \text{if } \frac{1}{a^2+b_2^2} < \gamma^2 < \frac{1}{b_2^2}. \end{cases}$$

– Suppose full information (or states) are available for feedback and let $u = F_\infty x$. Then the closed-loop transfer matrix is given by
$$T_{zw} = \begin{bmatrix} A + B_2F_\infty & B_1 \\ C_1 + D_{12}F_\infty & 0 \end{bmatrix}
= \begin{bmatrix} -\dfrac{a + b_2^2\gamma\sqrt{(a^2+b_2^2)\gamma^2-1}}{b_2^2\gamma^2 - 1} & \begin{bmatrix} 1 & 0 \end{bmatrix} \\ \begin{bmatrix} 1 \\ \dfrac{-b_2\gamma}{\sqrt{(a^2+b_2^2)\gamma^2-1} - a\gamma} \end{bmatrix} & 0 \end{bmatrix},$$
and $T_{zw}$ is stable for all $\gamma^2 > \dfrac{1}{b_2^2}$ and is not stable for $\dfrac{1}{a^2+b_2^2} < \gamma^2 < \dfrac{1}{b_2^2}$. Furthermore, it can be shown that $\|T_{zw}\|_\infty < \gamma$ for all $\gamma^2 > \dfrac{1}{a^2+b_2^2}$ and $\gamma^2 \neq \dfrac{1}{b_2^2}$. It is clear that the optimal $H_\infty$ norm is $\dfrac{1}{|b_2|}$ but is not achievable.

– Suppose the states are not available; then output feedback must be considered. Note that if $\gamma^2 > \dfrac{1}{b_2^2}$, then $H_\infty \in \mathrm{dom(Ric)}$, $J_\infty \in \mathrm{dom(Ric)}$, and
$$X_\infty = \frac{\gamma}{\sqrt{(a^2+b_2^2)\gamma^2 - 1} - a\gamma} > 0, \qquad Y_\infty = \frac{\gamma}{\sqrt{(a^2+c_2^2)\gamma^2 - 1} - a\gamma} > 0.$$
Hence conditions (i) and (ii) in Theorem 16.4 are satisfied, and we need to check condition (iii). Since
$$\rho(X_\infty Y_\infty) = \frac{\gamma^2}{\left(\sqrt{(a^2+b_2^2)\gamma^2-1} - a\gamma\right)\left(\sqrt{(a^2+c_2^2)\gamma^2-1} - a\gamma\right)},$$
it is clear that $\rho(X_\infty Y_\infty) \to \infty$ when $\gamma^2 \to \dfrac{1}{b_2^2}$. So condition (iii) will fail before condition (i) or (ii) fails.

2) $a < 0$: In this case, the complementarity property is always satisfied, and, furthermore, $H_\infty \in \mathrm{dom(Ric)}$, $J_\infty \in \mathrm{dom(Ric)}$, and
$$X_\infty = \frac{\gamma}{\sqrt{(a^2+b_2^2)\gamma^2 - 1} - a\gamma} > 0, \qquad Y_\infty = \frac{\gamma}{\sqrt{(a^2+c_2^2)\gamma^2 - 1} - a\gamma} > 0$$
for $\gamma^2 > \dfrac{1}{a^2 + b_2^2}$.

However, for $\gamma^2 \leq \dfrac{1}{a^2 + b_2^2}$, $H_\infty \notin \mathrm{dom(Ric)}$ since the stability property fails. Nevertheless, in this case, if $\gamma_0^2 = \dfrac{1}{a^2 + b_2^2}$, we can extend $\mathrm{dom(Ric)}$ to include those matrices $H_\infty$ with imaginary axis eigenvalues via
$$\mathcal{X}_-(H_\infty) = \mathrm{Im}\begin{bmatrix} -a \\ 1 \end{bmatrix}$$
such that $X_\infty = -\dfrac{1}{a}$ is a solution to the Riccati equation
$$A^*X_\infty + X_\infty A + C_1^*C_1 + \gamma_0^{-2}X_\infty B_1B_1^*X_\infty - X_\infty B_2B_2^*X_\infty = 0$$
and $A + \gamma_0^{-2}B_1B_1^*X_\infty - B_2B_2^*X_\infty = 0$.

– For $\gamma = \gamma_0$ and $F_\infty = -B_2^*X_\infty$, we have $A + B_2F_\infty = a + \dfrac{b_2^2}{a} < 0$. So if the states are available for feedback and $u = F_\infty x$, we have
$$T_{zw} = \begin{bmatrix} a + \dfrac{b_2^2}{a} & \begin{bmatrix} 1 & 0 \end{bmatrix} \\ \begin{bmatrix} 1 \\ \dfrac{b_2}{a} \end{bmatrix} & 0 \end{bmatrix} \in \mathcal{RH}_\infty$$
and $\|T_{zw}\|_\infty = \dfrac{1}{\sqrt{a^2 + b_2^2}} = \gamma_0$. Hence the optimum is achieved.

– If the states are not available, output feedback must be considered. If $|b_2| = |c_2|$, then it can be shown that
$$\rho(X_\infty Y_\infty) = \frac{\gamma^2}{\left(\sqrt{(a^2+b_2^2)\gamma^2 - 1} - a\gamma\right)^2} < \gamma^2$$
is satisfied if and only if
$$\gamma > \frac{\sqrt{a^2 + 2b_2^2} + a}{b_2^2}\quad\left(> \frac{1}{\sqrt{a^2 + b_2^2}}\right).$$
So condition (iii) of Theorem 16.4 will fail before either (i) or (ii) fails.

In both the $a > 0$ and $a < 0$ cases, the optimal $\gamma$ for the output feedback problem is given by
$$\gamma_{opt} = \frac{\sqrt{a^2 + 2b_2^2} + a}{b_2^2}$$
if $|b_2| = |c_2|$, and the optimal controller given by the descriptor formula in equations (16.12) and (16.13) is a constant. In fact,
$$u_{opt} = -\frac{\gamma_{opt}}{\sqrt{(a^2+b_2^2)\gamma_{opt}^2 - 1} - a\gamma_{opt}}\, y.$$
For instance, let $a = -1$ and $b_2 = 1 = c_2$. Then $\gamma_{opt} = \sqrt{3} - 1 = 0.7321$ and $u_{opt} = -0.7321\,y$. Further,
$$T_{zw} = \begin{bmatrix} -1.7321 & 1 & -0.7321 \\ 1 & 0 & 0 \\ -0.7321 & 0 & -0.7321 \end{bmatrix}.$$
It is easy to check that $\|T_{zw}\|_\infty = 0.7321$. $\diamondsuit$
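The closing claims are easy to confirm numerically: the formula gives $\gamma_{opt} = \sqrt{3} - 1$ for this data, and the closed-loop realization printed above has $H_\infty$ norm approximately $0.7321$ on a frequency grid. A sketch (gridding is only a crude norm estimate):

```python
import numpy as np

a, b2 = -1.0, 1.0
gamma_opt = (np.sqrt(a**2 + 2 * b2**2) + a) / b2**2      # = sqrt(3) - 1

# Closed-loop realization from the example (static controller u = -0.7321 y).
A = np.array([[-1.7321]])
B = np.array([[1.0, -0.7321]])
C = np.array([[1.0], [-0.7321]])
D = np.array([[0.0, 0.0], [0.0, -0.7321]])

def tzw(w):
    return C @ np.linalg.solve(1j * w * np.eye(1) - A, B) + D

norm_est = max(np.linalg.norm(tzw(w), 2) for w in np.logspace(-3, 3, 2000))
print(gamma_opt, norm_est)   # both approximately 0.7321
```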


16.10 Controller Interpretations

This section considers some additional connections with the minimum entropy solution and with the work of Whittle, and will be of interest primarily to readers already familiar with them. The connection with the $Q$-parameterization approach will be considered in the next chapter for the general case.

Section 16.10.2 gives another separation interpretation of the central $H_\infty$ controller of Theorem 16.4 in the spirit of Whittle [1981]. It has been shown in Glover and Doyle [1988] that the central controller corresponds exactly to the steady-state version of the optimal risk-sensitive controller derived by Whittle [1981], who also derives a separation result and a certainty equivalence principle (see also Whittle [1986]).

16.10.1 Minimum Entropy Controller

Let $T$ be a transfer matrix with $\|T\|_\infty < \gamma$. Then the entropy of $T(s)$ is defined by
$$I(T, \gamma) = -\frac{\gamma^2}{2\pi}\int_{-\infty}^{\infty} \ln\left|\det\left(I - \gamma^{-2}T^*(j\omega)T(j\omega)\right)\right| d\omega.$$
It is easy to see that
$$I(T, \gamma) = -\frac{\gamma^2}{2\pi}\int_{-\infty}^{\infty} \sum_i \ln\left|1 - \gamma^{-2}\sigma_i^2\left(T(j\omega)\right)\right| d\omega$$
and $I(T, \gamma) \geq 0$, where $\sigma_i(T(j\omega))$ is the $i$th singular value of $T(j\omega)$. It is also easy to show that
$$\lim_{\gamma\to\infty} I(T, \gamma) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \sum_i \sigma_i^2\left(T(j\omega)\right) d\omega = \|T\|_2^2.$$
Thus the entropy $I(T, \gamma)$ is in fact a performance index measuring the tradeoff between $H_\infty$ optimality ($\gamma \to \|T\|_\infty$) and $H_2$ optimality ($\gamma \to \infty$).
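These sign and limit properties are easy to check numerically for a scalar transfer function; a sketch with $T(s) = 1/(s+1)$ (so $\|T\|_\infty = 1$ and $\|T\|_2^2 = 1/2$), using nothing beyond the definition above and a plain Riemann sum:

```python
import numpy as np

w = np.linspace(-2000.0, 2000.0, 400001)
dw = w[1] - w[0]
T2 = 1.0 / (1.0 + w**2)                 # |T(jw)|^2 for T(s) = 1/(s+1)

def entropy(gamma):
    # I(T, gamma) = -(gamma^2 / 2 pi) * integral of ln|1 - gamma^-2 |T|^2| dw
    return -(gamma**2 / (2 * np.pi)) * np.sum(np.log(np.abs(1.0 - T2 / gamma**2))) * dw

I_small, I_large = entropy(1.5), entropy(100.0)
h2_sq = np.sum(T2) * dw / (2 * np.pi)   # ~ ||T||_2^2 = 0.5
print(I_small, I_large, h2_sq)          # I decreases toward ||T||_2^2 as gamma grows
```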

It has been shown in Glover and Mustafa [1989] that the central controller given in Theorem 16.4 is actually the controller that satisfies the norm condition $\|T_{zw}\|_\infty < \gamma$ and minimizes the entropy
$$-\frac{\gamma^2}{2\pi}\int_{-\infty}^{\infty} \ln\left|\det\left(I - \gamma^{-2}T_{zw}^*(j\omega)T_{zw}(j\omega)\right)\right| d\omega.$$
Therefore, the central controller is also called the minimum entropy controller (or the maximum entropy controller, if the entropy is defined as $\tilde{I}(T, \gamma) = -I(T, \gamma)$).

16.10.2 Relations with Separation in Risk Sensitive Control

Although Whittle [1981] treats a finite horizon, discrete time, stochastic control problem, his separation result has a clear interpretation for the present infinite horizon, continuous time, deterministic control problem, as given below; and it is an interesting exercise to compare the two separation statements. This discussion will be entirely in the time domain.

We will consider the system at time $t = 0$ and evaluate the past stress, $S_-$, and the future stress, $S_+$, as functions of the current state $x$. First define the future stress as
$$S_+(x) := \sup_w \inf_u \left(\|P_+z\|_2^2 - \gamma^2\|P_+w\|_2^2\right);$$
then by the completion of the squares and by the saddle point argument of Section 16.3, where $u$ is not constrained to be a function of the measurements (the FI case), we obtain
$$S_+(x) = x^*X_\infty x.$$
The past stress, $S_-(x)$, is a function of the past inputs and observations, $u(t), y(t)$ for $-\infty < t < 0$, and of the present state, $x$, and is produced by the worst-case disturbance $w$ that is consistent with the given data:

$$S_-(x) := \sup\left(\|P_-z\|_2^2 - \gamma^2\|P_-w\|_2^2\right).$$
In order to evaluate $S_-$, we see that $w$ can be divided into two components, $D_{21}w$ and $D_{21}^\perp w$, with $x$ dependent only on $D_{21}^\perp w$ (since $B_1D_{21}^* = 0$) and $D_{21}w = y - C_2x$. The past stress is then calculated by a completion of the square and in terms of a filter output. In particular, let $\check{x}$ be given by the stable differential equation
$$\dot{\check{x}} = A\check{x} + B_2u + L_\infty(C_2\check{x} - y) + \gamma^{-2}Y_\infty C_1^*C_1\check{x}, \qquad \check{x}(-\infty) = 0.$$
Then it can be shown that the worst-case $w$ is given by
$$D_{21}^\perp w = D_{21}^\perp B_1^*Y_\infty^{-1}\left(x(t) - \check{x}(t)\right) \quad \text{for } t < 0,$$
and that this gives, with $e := x - \check{x}$,
$$S_-(x) = -\gamma^2 e(0)^*Y_\infty^{-1}e(0) - \gamma^2\|P_-(y - C_2\check{x})\|_2^2 + \|P_-(C_1\check{x})\|_2^2 + \|P_-u\|_2^2.$$
The worst-case disturbance will now reach the value of $x$ that maximizes the total stress, $S_-(x) + S_+(x)$, and this maximum is easily shown to be achieved at the current state
$$\hat{x} = Z_\infty\check{x}(0).$$

The definitions of $X_\infty$ and $Y_\infty$ can be used to show that the state equations for the central controller can be rewritten with the state $\hat{x} := Z_\infty\check{x}$, with $\check{x}$ as defined above. The control signal is then
$$u = F_\infty\hat{x} = F_\infty Z_\infty\check{x}.$$
The separation is between the evaluation of the future stress, which is a control problem with an unconstrained input, and the past stress, which is a filtering problem with known control input. The central controller then combines these evaluations to give a worst-case estimate, $\hat{x}$, and the control law acts as if this were the perfectly observed state.


16.11 An Optimal Controller

To give a general idea of what an optimal controller looks like, we shall state below, without proof, the conditions under which an optimal controller exists, together with an explicit formula for one such controller.

Theorem 16.16 There exists an admissible controller such that $\|T_{zw}\|_\infty \leq \gamma$ iff the following three conditions hold:

(i) there exists a full column rank matrix
$$\begin{bmatrix} X_{\infty 1} \\ X_{\infty 2} \end{bmatrix} \in \mathbb{R}^{2n\times n}$$
such that
$$H_\infty\begin{bmatrix} X_{\infty 1} \\ X_{\infty 2} \end{bmatrix} = \begin{bmatrix} X_{\infty 1} \\ X_{\infty 2} \end{bmatrix}T_X, \qquad \mathrm{Re}\,\lambda_i(T_X) \leq 0\ \ \forall i$$
and
$$X_{\infty 1}^*X_{\infty 2} = X_{\infty 2}^*X_{\infty 1};$$

(ii) there exists a full column rank matrix
$$\begin{bmatrix} Y_{\infty 1} \\ Y_{\infty 2} \end{bmatrix} \in \mathbb{R}^{2n\times n}$$
such that
$$J_\infty\begin{bmatrix} Y_{\infty 1} \\ Y_{\infty 2} \end{bmatrix} = \begin{bmatrix} Y_{\infty 1} \\ Y_{\infty 2} \end{bmatrix}T_Y, \qquad \mathrm{Re}\,\lambda_i(T_Y) \leq 0\ \ \forall i$$
and
$$Y_{\infty 1}^*Y_{\infty 2} = Y_{\infty 2}^*Y_{\infty 1};$$

(iii)
$$\begin{bmatrix} X_{\infty 2}^*X_{\infty 1} & \gamma^{-1}X_{\infty 2}^*Y_{\infty 2} \\ \gamma^{-1}Y_{\infty 2}^*X_{\infty 2} & Y_{\infty 2}^*Y_{\infty 1} \end{bmatrix} \geq 0.$$

Moreover, when these conditions hold, one such controller is
$$K_{opt}(s) := C_K(sE_K - A_K)^{+}B_K$$
where
$$E_K := Y_{\infty 1}^*X_{\infty 1} - \gamma^{-2}Y_{\infty 2}^*X_{\infty 2}$$
$$B_K := Y_{\infty 2}^*C_2^*, \qquad C_K := -B_2^*X_{\infty 2}$$
$$A_K := E_KT_X - B_KC_2X_{\infty 1} = T_Y^*E_K + Y_{\infty 1}^*B_2C_K.$$


Remark 16.4 It is simple to show that if $X_{\infty 1}$ and $Y_{\infty 1}$ are nonsingular and if $X_\infty = X_{\infty 2}X_{\infty 1}^{-1}$ and $Y_\infty = Y_{\infty 2}Y_{\infty 1}^{-1}$, then condition (iii) in the above theorem is equivalent to $X_\infty \geq 0$, $Y_\infty \geq 0$, and $\rho(Y_\infty X_\infty) \leq \gamma^2$. So in this case, the conditions for the existence of an optimal controller can be obtained by "taking the limit" of the corresponding conditions in Theorem 16.4. Moreover, the controller given above then reduces to the descriptor form given in equations (16.12) and (16.13).

16.12 Notes and References

This chapter is based on Doyle, Glover, Khargonekar, and Francis [1989] and Zhou [1992]. The minimum entropy controller is studied in detail in Glover and Mustafa [1989] and Mustafa and Glover [1990]. The risk-sensitivity problem is treated in Whittle [1981, 1986]. The connections between the risk-sensitive controller and the central $H_\infty$ controller are explored in Glover and Doyle [1988]. The complete characterization of optimal controllers for the general setup can be found in Glover, Limebeer, Doyle, Kasenally, and Safonov [1991].


17 $H_\infty$ Control: General Case

In this chapter we will consider again the standard $H_\infty$ control problem, but with some of the assumptions of the last chapter relaxed. Since the proof techniques of the last chapter can be applied to this general case, except with somewhat more involved algebra, the detailed proof for the general case will not be given; only the formulas are presented. However, some procedures to carry out the proof will be outlined, together with some alternative approaches to solving the standard $H_\infty$ problem and some interpretations of the solutions. We will also indicate how the assumptions in the general case can be relaxed further to accommodate other more complicated problems. More specifically, Section 17.1 presents the solutions to the general $H_\infty$ problem. Section 17.2 discusses the techniques used to transform a general problem into a standard problem satisfying the assumptions of the last chapter. The problems associated with relaxing the assumptions for the general standard problem, and techniques for dealing with them, are considered in Section 17.3. Section 17.4 considers integral control in the $H_2$ and $H_\infty$ theory, and Section 17.5 considers how the general $H_\infty$ solution can be used to solve the $H_\infty$ filtering problem. Section 17.6 considers an alternative approach to the standard $H_2$ and $H_\infty$ problems using Youla controller parameterizations, and Section 17.7 gives a $2\times 2$-block Hankel-Toeplitz operator interpretation of the $H_\infty$ solutions presented here and in the last chapter. Finally, the general state feedback $H_\infty$ control problem and its relations with full information control and differential game problems are discussed in Sections 17.8 and 17.9.


17.1 General $H_\infty$ Solutions

Consider the system described by the standard block diagram:

(block diagram: $K$ in feedback with $G$; $G$ maps $(w, u)$ to $(z, y)$, and $K$ maps $y$ to $u$)

where, as usual, $G$ and $K$ are assumed to be real rational and proper, with $K$ constrained to provide internal stability. The controller is said to be admissible if it is real rational, proper, and stabilizing. Although we are taking everything to be real, the results presented here are still true for the complex case with some obvious modifications. We will again only be interested in characterizing all suboptimal $H_\infty$ controllers.

The realization of the transfer matrix $G$ is taken to be of the form
$$G(s) = \begin{bmatrix} A & B_1 & B_2 \\ C_1 & D_{11} & D_{12} \\ C_2 & D_{21} & 0 \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix},$$
which is compatible with the dimensions $z(t) \in \mathbb{R}^{p_1}$, $y(t) \in \mathbb{R}^{p_2}$, $w(t) \in \mathbb{R}^{m_1}$, $u(t) \in \mathbb{R}^{m_2}$, and the state $x(t) \in \mathbb{R}^n$. The following assumptions are made:

(A1) $(A, B_2)$ is stabilizable and $(C_2, A)$ is detectable;

(A2) $D_{12} = \begin{bmatrix} 0 \\ I \end{bmatrix}$ and $D_{21} = \begin{bmatrix} 0 & I \end{bmatrix}$;

(A3) $\begin{bmatrix} A - j\omega I & B_2 \\ C_1 & D_{12} \end{bmatrix}$ has full column rank for all $\omega$;

(A4) $\begin{bmatrix} A - j\omega I & B_1 \\ C_2 & D_{21} \end{bmatrix}$ has full row rank for all $\omega$.

Assumption (A1) is necessary for the existence of stabilizing controllers. The assumptions in (A2) mean that the penalty on $z = C_1x + D_{12}u$ includes a nonsingular, normalized penalty on the control $u$, that the exogenous signal $w$ includes both plant disturbance and sensor noise, and that the sensor noise weighting is normalized and nonsingular. Relaxation of (A2) leads to singular control problems; see Stoorvogel [1990]. For those problems that have $D_{12}$ full column rank and $D_{21}$ full row rank but do not satisfy assumption (A2), a normalizing procedure is given in the next section so that an equivalent new system will satisfy this assumption.


Assumptions (A3) and (A4) are made for a technical reason: together with (A1), they guarantee that the two Hamiltonian matrices in the corresponding $H_2$ problem belong to $\mathrm{dom(Ric)}$, as we have seen in Chapter 14. It is tempting to suggest that (A3) and (A4) can be dropped, but they are, in some sense, necessary for the methods presented in the last chapter to be applicable. A further discussion of the assumptions and their possible relaxation is given in Section 17.3.

The main result is now stated in terms of the solutions of the $X_\infty$ and $Y_\infty$ Riccati equations, together with the "state feedback" and "output injection" matrices $F$ and $L$:
$$R := D_{1\bullet}^*D_{1\bullet} - \begin{bmatrix} \gamma^2 I_{m_1} & 0 \\ 0 & 0 \end{bmatrix}, \qquad \text{where } D_{1\bullet} := \begin{bmatrix} D_{11} & D_{12} \end{bmatrix}$$
$$\tilde{R} := D_{\bullet 1}D_{\bullet 1}^* - \begin{bmatrix} \gamma^2 I_{p_1} & 0 \\ 0 & 0 \end{bmatrix}, \qquad \text{where } D_{\bullet 1} := \begin{bmatrix} D_{11} \\ D_{21} \end{bmatrix}$$
$$H_\infty := \begin{bmatrix} A & 0 \\ -C_1^*C_1 & -A^* \end{bmatrix} - \begin{bmatrix} B \\ -C_1^*D_{1\bullet} \end{bmatrix}R^{-1}\begin{bmatrix} D_{1\bullet}^*C_1 & B^* \end{bmatrix}$$
$$J_\infty := \begin{bmatrix} A^* & 0 \\ -B_1B_1^* & -A \end{bmatrix} - \begin{bmatrix} C^* \\ -B_1D_{\bullet 1}^* \end{bmatrix}\tilde{R}^{-1}\begin{bmatrix} D_{\bullet 1}B_1^* & C \end{bmatrix}$$
$$X_\infty := \mathrm{Ric}(H_\infty), \qquad Y_\infty := \mathrm{Ric}(J_\infty)$$
$$F := \begin{bmatrix} F_{1\infty} \\ F_{2\infty} \end{bmatrix} := -R^{-1}\left[D_{1\bullet}^*C_1 + B^*X_\infty\right]$$
$$L := \begin{bmatrix} L_{1\infty} & L_{2\infty} \end{bmatrix} := -\left[B_1D_{\bullet 1}^* + Y_\infty C^*\right]\tilde{R}^{-1}.$$
Partition $D$, $F_{1\infty}$, and $L_{1\infty}$ as follows:
$$\begin{bmatrix} \; & F' \\ L' & D \end{bmatrix} = \begin{bmatrix} \; & F_{11\infty}^* & F_{12\infty}^* & F_{2\infty}^* \\ L_{11\infty}^* & D_{1111} & D_{1112} & 0 \\ L_{12\infty}^* & D_{1121} & D_{1122} & I \\ L_{2\infty}^* & 0 & I & 0 \end{bmatrix}.$$

Remark 17.1 In the above matrix partitioning, some matrices may not exist, depending on whether $D_{12}$ or $D_{21}$ is square. This issue will be discussed further later. For the time being, we shall assume all matrices in the partition exist.

Theorem 17.1 Suppose $G$ satisfies the assumptions (A1)-(A4).

(a) There exists an admissible controller $K(s)$ such that $\|F_\ell(G, K)\|_\infty < \gamma$ (i.e., $\|T_{zw}\|_\infty < \gamma$) if and only if

(i) $\gamma > \max\left(\bar{\sigma}\begin{bmatrix} D_{1111} & D_{1112} \end{bmatrix},\ \bar{\sigma}\begin{bmatrix} D_{1111}^* & D_{1121}^* \end{bmatrix}\right)$;

(ii) $H_\infty \in \mathrm{dom(Ric)}$ with $X_\infty = \mathrm{Ric}(H_\infty) \geq 0$;

(iii) $J_\infty \in \mathrm{dom(Ric)}$ with $Y_\infty = \mathrm{Ric}(J_\infty) \geq 0$;

(iv) $\rho(X_\infty Y_\infty) < \gamma^2$.

(b) Given that the conditions of part (a) are satisfied, all rational internally stabilizing controllers $K(s)$ satisfying $\|F_\ell(G, K)\|_\infty < \gamma$ are given by
$$K = F_\ell(M_\infty, Q) \quad \text{for arbitrary } Q \in \mathcal{RH}_\infty \text{ such that } \|Q\|_\infty < \gamma$$
where
$$M_\infty = \begin{bmatrix} \hat{A} & \hat{B}_1 & \hat{B}_2 \\ \hat{C}_1 & \hat{D}_{11} & \hat{D}_{12} \\ \hat{C}_2 & \hat{D}_{21} & 0 \end{bmatrix}$$
$$\hat{D}_{11} = -D_{1121}D_{1111}^*\left(\gamma^2 I - D_{1111}D_{1111}^*\right)^{-1}D_{1112} - D_{1122};$$
$\hat{D}_{12} \in \mathbb{R}^{m_2\times m_2}$ and $\hat{D}_{21} \in \mathbb{R}^{p_2\times p_2}$ are any matrices (e.g., Cholesky factors) satisfying
$$\hat{D}_{12}\hat{D}_{12}^* = I - D_{1121}\left(\gamma^2 I - D_{1111}^*D_{1111}\right)^{-1}D_{1121}^*,$$
$$\hat{D}_{21}^*\hat{D}_{21} = I - D_{1112}^*\left(\gamma^2 I - D_{1111}D_{1111}^*\right)^{-1}D_{1112};$$
and
$$\hat{B}_2 = Z_\infty(B_2 + L_{12\infty})\hat{D}_{12}, \qquad \hat{C}_2 = -\hat{D}_{21}(C_2 + F_{12\infty}),$$
$$\hat{B}_1 = -Z_\infty L_{2\infty} + \hat{B}_2\hat{D}_{12}^{-1}\hat{D}_{11}, \qquad \hat{C}_1 = F_{2\infty} + \hat{D}_{11}\hat{D}_{21}^{-1}\hat{C}_2,$$
$$\hat{A} = A + BF + \hat{B}_1\hat{D}_{21}^{-1}\hat{C}_2,$$
where $Z_\infty = (I - \gamma^{-2}Y_\infty X_\infty)^{-1}$.

(Note that if $D_{11} = 0$ then the formulae are considerably simplified.)
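In particular, when $D_{11} = 0$ (and, as in the simple case of the last chapter, $D_{12}^*C_1 = 0$), the general Hamiltonian $H_\infty$ defined above collapses to the simple-case Hamiltonian $\begin{bmatrix} A & \gamma^{-2}B_1B_1^* - B_2B_2^* \\ -C_1^*C_1 & -A^* \end{bmatrix}$. A numerical sketch of this reduction on random data (the dimensions are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m1, m2, p1 = 4, 2, 2, 5
g = 2.0
A  = rng.standard_normal((n, n))
B1 = rng.standard_normal((n, m1))
B2 = rng.standard_normal((n, m2))
# C1 chosen so that D12* C1 = 0 (its last m2 rows vanish); D11 = 0, D12 = [0; I].
C1 = np.vstack([rng.standard_normal((p1 - m2, n)), np.zeros((m2, n))])
D12 = np.vstack([np.zeros((p1 - m2, m2)), np.eye(m2)])
D1dot = np.hstack([np.zeros((p1, m1)), D12])          # [D11 D12] with D11 = 0
B = np.hstack([B1, B2])

R = D1dot.T @ D1dot - np.block([[g**2 * np.eye(m1), np.zeros((m1, m2))],
                                [np.zeros((m2, m1)), np.zeros((m2, m2))]])
H = (np.block([[A, np.zeros((n, n))], [-C1.T @ C1, -A.T]])
     - np.block([[B], [-C1.T @ D1dot]]) @ np.linalg.inv(R)
       @ np.block([[D1dot.T @ C1, B.T]]))

H_simple = np.block([[A, B1 @ B1.T / g**2 - B2 @ B2.T],
                     [-C1.T @ C1, -A.T]])
print(np.abs(H - H_simple).max())   # ~0: the two definitions agree
```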

Some Special Cases:

Case 1: $D_{12} = I$. In this case:

1. in part (a), (i) becomes $\gamma > \bar{\sigma}(D_{1121})$;

2. in part (b),
$$\hat{D}_{11} = -D_{1122}, \qquad \hat{D}_{12}\hat{D}_{12}^* = I - \gamma^{-2}D_{1121}D_{1121}^*, \qquad \hat{D}_{21}^*\hat{D}_{21} = I.$$

Case 2: $D_{21} = I$. In this case:

1. in part (a), (i) becomes $\gamma > \bar{\sigma}(D_{1112})$;

2. in part (b),
$$\hat{D}_{11} = -D_{1122}, \qquad \hat{D}_{12}\hat{D}_{12}^* = I, \qquad \hat{D}_{21}^*\hat{D}_{21} = I - \gamma^{-2}D_{1112}^*D_{1112}.$$

Case 3: $D_{12} = I$ and $D_{21} = I$. In this case:

1. in part (a), (i) drops out;

2. in part (b),
$$\hat{D}_{11} = -D_{1122}, \qquad \hat{D}_{12}\hat{D}_{12}^* = I, \qquad \hat{D}_{21}^*\hat{D}_{21} = I.$$

17.2 Loop Shifting

Let a given problem have the following diagram, where $z_p(t) \in \mathbb{R}^{p_1}$, $y_p(t) \in \mathbb{R}^{p_2}$, $w_p(t) \in \mathbb{R}^{m_1}$, and $u_p(t) \in \mathbb{R}^{m_2}$:

(block diagram: $K_p$ in feedback with $P$, mapping $w_p \to z_p$)

The plant $P$ has the following state space realization, with $D_{p12}$ full column rank and $D_{p21}$ full row rank:
$$P(s) = \begin{bmatrix} A_p & B_{p1} & B_{p2} \\ C_{p1} & D_{p11} & D_{p12} \\ C_{p2} & D_{p21} & D_{p22} \end{bmatrix}.$$

The objective is to find all rational proper controllers $K_p(s)$ which stabilize $P$ and achieve $\|F_\ell(P, K_p)\|_\infty < \gamma$. To solve this problem, we first transform it to the standard one treated in the last section. Note that the following procedure can also be applied to the $H_2$ problem (except for the procedure handling the case $D_{11} \neq 0$).


Normalize $D_{12}$ and $D_{21}$

Perform singular value decompositions to obtain the matrix factorizations
$$D_{p12} = U_p\begin{bmatrix} 0 \\ I \end{bmatrix}R_p, \qquad D_{p21} = \tilde{R}_p\begin{bmatrix} 0 & I \end{bmatrix}\tilde{U}_p$$
such that $U_p$ and $\tilde{U}_p$ are square and unitary. Now let
$$z_p = U_pz, \quad w_p = \tilde{U}_p^*w, \quad y_p = \tilde{R}_py, \quad u_p = R_pu$$
and
$$K(s) = R_pK_p(s)\tilde{R}_p$$
$$G(s) = \begin{bmatrix} U_p^* & 0 \\ 0 & \tilde{R}_p^{-1} \end{bmatrix}P(s)\begin{bmatrix} \tilde{U}_p^* & 0 \\ 0 & R_p^{-1} \end{bmatrix}
= \begin{bmatrix} A_p & B_{p1}\tilde{U}_p^* & B_{p2}R_p^{-1} \\ U_p^*C_{p1} & U_p^*D_{p11}\tilde{U}_p^* & U_p^*D_{p12}R_p^{-1} \\ \tilde{R}_p^{-1}C_{p2} & \tilde{R}_p^{-1}D_{p21}\tilde{U}_p^* & \tilde{R}_p^{-1}D_{p22}R_p^{-1} \end{bmatrix}
=: \begin{bmatrix} A & B_1 & B_2 \\ C_1 & D_{11} & D_{12} \\ C_2 & D_{21} & D_{22} \end{bmatrix} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}.$$
Then
$$D_{12} = \begin{bmatrix} 0 \\ I \end{bmatrix}, \qquad D_{21} = \begin{bmatrix} 0 & I \end{bmatrix},$$
and the new system is shown below:

(block diagram: $K$ in feedback with $G$, mapping $w \to z$)

Furthermore, $\|F_\ell(P, K_p)\|_\alpha = \|U_pF_\ell(G, K)\tilde{U}_p\|_\alpha = \|F_\ell(G, K)\|_\alpha$ for $\alpha = 2$ or $\infty$, since $U_p$ and $\tilde{U}_p$ are unitary.
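These factorizations can be formed directly from a standard full SVD: if $D_{p12} = U\Sigma V^*$, take $R_p$ as the nonzero block $\Sigma_1 V^*$ and let $U_p$ be $U$ with its column blocks swapped so that the range block lands last. A sketch for the $D_{p12}$ factor (the $D_{p21}$ factor is dual; the sizes are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
p1, m2 = 5, 2
Dp12 = rng.standard_normal((p1, m2))            # full column rank a.s.

U, s, Vt = np.linalg.svd(Dp12, full_matrices=True)
Rp = np.diag(s) @ Vt                             # square and invertible
Up = np.hstack([U[:, m2:], U[:, :m2]])           # move the range block last

E = np.vstack([np.zeros((p1 - m2, m2)), np.eye(m2)])   # the [0; I] pattern
err_fact = np.abs(Up @ E @ Rp - Dp12).max()      # Dp12 = Up [0; I] Rp
err_unit = np.abs(Up.T @ Up - np.eye(p1)).max()  # Up is unitary
D12 = Up.T @ Dp12 @ np.linalg.inv(Rp)            # normalized D12 = [0; I]
print(err_fact, err_unit, np.round(D12, 12))
```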

Remove the Assumption $D_{22} = 0$

Suppose $K(s)$ is a controller for $G$ with $D_{22}$ set to zero. Then the controller for $D_{22} \neq 0$ is $K(I + D_{22}K)^{-1}$. Hence there is no loss of generality in assuming $D_{22} = 0$.


Remove the Assumption $D_{11} = 0$

We may further assume that $D_{11} = 0$. In fact, Theorem 17.1 can be proved by first transforming the general problem to the standard problem considered in the last chapter using Theorem 16.2. This transformation is called loop shifting. Before we go into the detailed description of the transformation, let us first consider a simple unitary matrix dilation problem.

Lemma 17.2 Suppose $D$ is a constant matrix such that $\|D\|<1$; then
$$N=\begin{bmatrix}-D & (I-DD^*)^{1/2}\\ (I-D^*D)^{1/2} & D^*\end{bmatrix}$$
is a unitary matrix, i.e., $N^*N=I$.
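A quick numerical check of the lemma (the contraction $D$ below is hypothetical; the PSD square roots are taken via an eigendecomposition):

```python
import numpy as np

def psd_sqrt(M):
    """Symmetric PSD square root via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ V.T

D = np.array([[0.3, -0.2, 0.1],
              [0.0,  0.4, 0.2]])             # hypothetical, ||D|| < 1
assert np.linalg.norm(D, 2) < 1

p, m = D.shape
N = np.block([[-D,                            psd_sqrt(np.eye(p) - D @ D.T)],
              [psd_sqrt(np.eye(m) - D.T @ D), D.T]])
assert np.allclose(N.T @ N, np.eye(p + m))   # N is unitary
```

The off-diagonal blocks cancel because $D^*(I-DD^*)^{1/2}=(I-D^*D)^{1/2}D^*$.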

This result can be easily verified, and the matrix $N$ is called a unitary dilation of $D$ (of course, there are many other ways to dilate a matrix to a unitary matrix).

Consider again the feedback system shown at the beginning of this chapter; without loss of generality, we shall assume that the system $G$ has the following realization:

$$G=\begin{bmatrix}A & B_1 & B_2\\ C_1 & \begin{bmatrix}D_{1111} & D_{1112}\\ D_{1121} & D_{1122}\end{bmatrix} & \begin{bmatrix}0\\ I\end{bmatrix}\\ C_2 & \begin{bmatrix}0 & I\end{bmatrix} & 0\end{bmatrix}$$
with
$$D_{12}=\begin{bmatrix}0\\ I\end{bmatrix},\qquad D_{21}=\begin{bmatrix}0 & I\end{bmatrix}.$$

Suppose there is a stabilizing controller $K$ such that $\|F_\ell(G,K)\|_\infty<1$ (suppose we have normalized $\gamma$ to 1). In the following, we will show how to construct a new system transfer matrix $M(s)$ and a new controller $\tilde K$ such that the $D_{11}$ matrix for $M(s)$ is zero and, furthermore, $\|F_\ell(G,K)\|_\infty<1$ if and only if $\|F_\ell(M,\tilde K)\|_\infty<1$. To begin with, note that $\|F_\ell(G,K)(\infty)\|<1$ by the assumption:
$$\|F_\ell(G,K)(\infty)\|=\left\|\begin{bmatrix}D_{1111} & D_{1112}\\ D_{1121} & D_{1122}+K(\infty)\end{bmatrix}\right\|.$$
Let
$$D_\infty\in\left\{X:\ \left\|\begin{bmatrix}D_{1111} & D_{1112}\\ D_{1121} & D_{1122}+X\end{bmatrix}\right\|<1\right\}.$$
For example, let $D_\infty=-D_{1122}-D_{1121}(I-D_{1111}^*D_{1111})^{-1}D_{1111}^*D_{1112}$ and define
$$\tilde D_{11}:=\begin{bmatrix}D_{1111} & D_{1112}\\ D_{1121} & D_{1122}+D_\infty\end{bmatrix};$$


then $\|\tilde D_{11}\|<1$. Let
$$K(s)=\tilde K(s)+D_\infty$$
to get
$$F_\ell(G,K)=F_\ell(G,\tilde K+D_\infty)=F_\ell(\tilde G,\tilde K)$$

where
$$\tilde G=\begin{bmatrix}A+B_2D_\infty C_2 & B_1+B_2D_\infty D_{21} & B_2\\ C_1+D_{12}D_\infty C_2 & \begin{bmatrix}D_{1111} & D_{1112}\\ D_{1121} & D_{1122}+D_\infty\end{bmatrix} & \begin{bmatrix}0\\ I\end{bmatrix}\\ C_2 & \begin{bmatrix}0 & I\end{bmatrix} & 0\end{bmatrix}
=:\begin{bmatrix}\tilde A & \tilde B_1 & \tilde B_2\\ \tilde C_1 & \tilde D_{11} & \tilde D_{12}\\ \tilde C_2 & \tilde D_{21} & 0\end{bmatrix}.$$

Let
$$N=\begin{bmatrix}-\tilde D_{11} & (I-\tilde D_{11}\tilde D_{11}^*)^{1/2}\\ (I-\tilde D_{11}^*\tilde D_{11})^{1/2} & \tilde D_{11}^*\end{bmatrix}$$
and then $N^*N=I$. Furthermore, by Theorem 16.2, we have that $K$ stabilizes $G$ and that $\|F_\ell(G,K)\|_\infty<1$ if and only if $F_\ell(G,K)$ internally stabilizes $N$ and
$$\|F_\ell(N,F_\ell(G,K))\|_\infty<1.$$
Note that
$$F_\ell(N,F_\ell(G,K))=F_\ell(M,\tilde K)$$
with
$$M(s)=\begin{bmatrix}\tilde A+\tilde B_1R_1^{-1}\tilde D_{11}^*\tilde C_1 & \tilde B_1R_1^{-1/2} & \tilde B_2+\tilde B_1R_1^{-1}\tilde D_{11}^*\tilde D_{12}\\ \tilde R_1^{-1/2}\tilde C_1 & 0 & \tilde R_1^{-1/2}\tilde D_{12}\\ \tilde C_2+\tilde D_{21}R_1^{-1}\tilde D_{11}^*\tilde C_1 & \tilde D_{21}R_1^{-1/2} & \tilde D_{21}R_1^{-1}\tilde D_{11}^*\tilde D_{12}\end{bmatrix}$$
where $R_1=I-\tilde D_{11}^*\tilde D_{11}$ and $\tilde R_1=I-\tilde D_{11}\tilde D_{11}^*$. In summary, we have the following lemma.

Lemma 17.3 There is a controller $K$ that internally stabilizes $G$ and $\|F_\ell(G,K)\|_\infty<1$ if and only if there is a $\tilde K$ that stabilizes $M$ and $\|F_\ell(M,\tilde K)\|_\infty<1$.

Corollary 17.4 Let $G(s)=\begin{bmatrix}A & B\\ C & D\end{bmatrix}\in\mathcal{RH}_\infty$. Then $\|G(s)\|_\infty<1$ if and only if $\|D\|<1$,
$$M(s)=\begin{bmatrix}A+B(I-D^*D)^{-1}D^*C & B(I-D^*D)^{-1/2}\\ (I-DD^*)^{-1/2}C & 0\end{bmatrix}\in\mathcal{RH}_\infty,$$
and $\|M(s)\|_\infty<1$.
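Corollary 17.4 can be exercised on a hypothetical first-order example; the $H_\infty$ norms are estimated here by crude frequency gridding (illustration only, not a certified norm computation):

```python
import numpy as np

def hinf_norm_grid(A, B, C, D, n=2000):
    """Crude H-infinity norm estimate by frequency gridding."""
    peak = 0.0
    I = np.eye(A.shape[0])
    for w in np.concatenate([[0.0], np.logspace(-3, 3, n)]):
        G = C @ np.linalg.solve(1j * w * I - A, B) + D
        peak = max(peak, np.linalg.svd(G, compute_uv=False)[0])
    return peak

# Hypothetical stable G(s) = 0.5/(s+2) + 0.5 with ||D|| < 1
A = np.array([[-2.0]]); B = np.array([[1.0]])
C = np.array([[0.5]]);  D = np.array([[0.5]])

nG = hinf_norm_grid(A, B, C, D)
R = 1.0 - 0.25                          # I - D*D (scalar)
AM = A + B * (0.5 / R) * C              # A + B(I-D*D)^{-1}D*C
BM = B / np.sqrt(R)                     # B(I-D*D)^{-1/2}
CM = C / np.sqrt(R)                     # (I-DD*)^{-1/2}C
nM = hinf_norm_grid(AM, BM, CM, np.zeros((1, 1)))
assert (nG < 1) == (nM < 1)             # both sides of the corollary agree
```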

17.3 Relaxing Assumptions

In this section, we indicate how the results of Section 17.1 can be extended to more general cases.

Relaxing A3 and A4

Suppose that
$$G=\begin{bmatrix}0 & 0 & 1\\ 0 & 0 & 1\\ 1 & 1 & 0\end{bmatrix}$$
which violates both A3 and A4 and corresponds to the robust stabilization of an integrator. If the controller $u=-\alpha y$, where $\alpha>0$, is used, then
$$T_{zw}=\frac{-\alpha s}{s+\alpha},\qquad\text{with } \|T_{zw}\|_\infty=\alpha.$$
Hence the norm can be made arbitrarily small as $\alpha\to 0$, but $\alpha=0$ is not admissible since it is not stabilizing. This may be thought of as a case where the $H_\infty$ optimum is not achieved on the set of admissible controllers. Of course, for this system, $H_\infty$ optimal control is a silly problem, although the suboptimal case is not obviously so.
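The claim $\|T_{zw}\|_\infty=\alpha$ is visible on a frequency grid, since $|T_{zw}(j\omega)|=\alpha\omega/\sqrt{\omega^2+\alpha^2}$ increases monotonically toward $\alpha$:

```python
import numpy as np

for alpha in (1.0, 0.1, 0.01):
    w = np.logspace(-4, 5, 3000)
    mag = np.abs(-alpha * 1j * w / (1j * w + alpha))
    assert mag.max() <= alpha + 1e-9     # |T_zw(jw)| < alpha everywhere
    assert mag.max() >= 0.999 * alpha    # and the supremum equals alpha
```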

If one simply drops the requirement that controllers be admissible and removes assumptions A3 and A4, then the formulae presented above will yield $u=0$ for both the optimal controller and the suboptimal controller. This illustrates that assumptions A3 and A4 are necessary for the techniques used in the last chapter to be directly applicable. An alternative is to develop a theory which maintains the same notion of admissibility but relaxes A3 and A4. The easiest way to do this would be to pursue the suboptimal case, introducing $\epsilon$ perturbations so that A3 and A4 are satisfied.

Relaxing A1

If assumption A1 is violated, then it is obvious that no admissible controllers exist. Suppose A1 is relaxed to allow unstabilizable and/or undetectable modes on the $j\omega$ axis and internal stability is also relaxed to allow closed-loop $j\omega$-axis poles, but A2–A4 are still satisfied. It can be easily shown that under these conditions the closed-loop $H_\infty$ norm cannot be made finite and, in particular, that the unstabilizable and/or undetectable modes on the $j\omega$ axis must show up as poles in the closed-loop system; see Lemma 16.1.


Violating A1 and either or both of A3 and A4

Sensible control problems can be posed which violate A1 and either or both of A3 and A4. For example, cases when $A$ has modes at $s=0$ which are unstabilizable through $B_2$ and/or undetectable through $C_2$ arise when an integrator is included in a weight on a disturbance input or an error term. In these cases, either A3 or A4 is also violated, or the closed-loop $H_\infty$ norm cannot be made finite. In many applications, such problems can be reformulated so that the integrator occurs inside the loop (essentially using the internal model principle) and is hence detectable and stabilizable. We will show this process in the next section.

An alternative approach to such problems, which could potentially avoid the problem reformulation, would be to pursue the techniques in the last chapter but to relax internal stability to the requirement that all closed-loop modes be in the closed left half plane. Clearly, to have finite $H_\infty$ norms, these closed-loop modes could not appear as poles in $T_{zw}$. The formulae given in this chapter will often yield controllers compatible with these assumptions. The user would then have to decide whether closed-loop poles on the imaginary axis were due to weights, and hence acceptable, or due to the problem being poorly posed, as in the above example.

A third alternative is to again introduce $\epsilon$ perturbations so that A1, A3, and A4 are satisfied. Roughly speaking, this would produce sensible answers for sensible problems, but the behavior as $\epsilon\to 0$ could be problematic.

Relaxing A2

In the cases where either $D_{12}$ is not full column rank or $D_{21}$ is not full row rank, improper controllers can give a bounded $H_\infty$ norm for $T_{zw}$, although the controllers will not be admissible by our definition. Such singular filtering and control problems have been well studied in $H_2$ theory, and many of the same techniques carry over to the $H_\infty$ case (e.g., Willems [1981], Willems, Kitapci, and Silverman [1986], and Hautus and Silverman [1983]). In particular, the structure algorithm of Silverman [1969] could be used to make the terms $D_{12}$ and $D_{21}$ full rank by the introduction of suitable differentiators in the controller. A complete solution to the singular problem can be found in Stoorvogel [1990].

17.4 $H_2$ and $H_\infty$ Integral Control

It is interesting to note that the $H_2$ and $H_\infty$ design frameworks do not in general produce integral control. In this section we show how to introduce integral control into the $H_2$ and $H_\infty$ design framework through a simple disturbance rejection problem. We consider the feedback system shown in Figure 17.1. We shall assume that the frequency contents of the disturbance $w$ are effectively modeled by the weighting $W_d\in\mathcal{RH}_\infty$ and that the constraints on the control signal are limited by an appropriate choice of $W_u\in\mathcal{RH}_\infty$. In order to let the output $y$ track the reference signal $r$, we require that $K$ contain an integrator, i.e., that $K(s)$ have a pole at $s=0$. (In general, $K$ is required to have poles on the imaginary axis.)

[Figure 17.1: A Simple Disturbance Rejection Problem -- feedback loop with controller $K$, plant $P$, and weights $W_e$, $W_u$, $W_d$; signals $r$, $u$, $y$, $w$, $z_1$, $z_2$]

There are several ways to achieve the integral design. One approach is to introduce an integrator in the performance weight $W_e$. Then the transfer function between $w$ and $z_1$ is given by
$$z_1=W_e(I+PK)^{-1}W_dw.$$
Now if the resulting controller $K$ stabilizes the plant $P$ and makes the norm (2-norm or $\infty$-norm) between $w$ and $z_1$ finite, then $K$ must have a pole at $s=0$, which is the zero of the sensitivity function (assuming $W_d$ has no zeros at $s=0$). (This follows from the well-known internal model principle.) The problem with this approach is that the $H_2$ (or $H_\infty$) control theory presented in this chapter and in the previous chapters cannot be applied to this problem formulation directly, because the pole $s=0$ of $W_e$ becomes an uncontrollable pole of the feedback system and the very first assumption for the $H_2$ (or $H_\infty$) theory is violated.

However, the obstacles can be overcome by appropriately reformulating the problem. Suppose $W_e$ can be factorized as follows:
$$W_e=\tilde W_e(s)M(s)$$
where $M(s)$ is proper and contains all the imaginary-axis poles of $W_e$, $M^{-1}(s)\in\mathcal{RH}_\infty$, and $\tilde W_e(s)$ is stable and minimum phase. Now suppose there exists a controller $K(s)$ which contains the same imaginary-axis poles and achieves the performance. Then, without loss of generality, $K$ can be factorized as
$$K(s)=-\hat K(s)M(s)$$
where there is no unstable pole/zero cancelation in forming the product $\hat K(s)M(s)$. Now the problem can be reformulated as in Figure 17.2, and Figure 17.2 can be put in the general LFT framework as in Figure 17.3.

[Figure 17.2: Disturbance Rejection with Imaginary Axis Poles]

[Figure 17.3: LFT Framework for the Disturbance Rejection Problem]

Let $\tilde W_e$, $W_u$, $W_d$, $M$, and $P$ have the following stabilizable and detectable state space realizations:
$$\tilde W_e=\begin{bmatrix}A_e & B_e\\ C_e & D_e\end{bmatrix},\qquad W_u=\begin{bmatrix}A_u & B_u\\ C_u & D_u\end{bmatrix},\qquad W_d=\begin{bmatrix}A_d & B_d\\ C_d & D_d\end{bmatrix}$$
$$M=\begin{bmatrix}A_m & B_m\\ C_m & D_m\end{bmatrix},\qquad P=\begin{bmatrix}A_p & B_p\\ C_p & D_p\end{bmatrix}.$$

Then a realization for the generalized system is given by
$$G(s)=\begin{bmatrix}\begin{bmatrix}\tilde W_eMW_d\\ 0\end{bmatrix} & \begin{bmatrix}\tilde W_eMP\\ W_u\end{bmatrix}\\ MW_d & MP\end{bmatrix}$$
$$=\begin{bmatrix}
A_e & 0 & B_eC_m & B_eD_mC_d & B_eD_mC_p & B_eD_mD_d & B_eD_mD_p\\
0 & A_u & 0 & 0 & 0 & 0 & B_u\\
0 & 0 & A_m & B_mC_d & B_mC_p & B_mD_d & B_mD_p\\
0 & 0 & 0 & A_d & 0 & B_d & 0\\
0 & 0 & 0 & 0 & A_p & 0 & B_p\\ \hline
C_e & 0 & D_eC_m & D_eD_mC_d & D_eD_mC_p & D_eD_mD_d & D_eD_mD_p\\
0 & C_u & 0 & 0 & 0 & 0 & D_u\\
0 & 0 & C_m & D_mC_d & D_mC_p & D_mD_d & D_mD_p
\end{bmatrix}.$$

We shall illustrate the above design through a simple numerical example. Let
$$P=\frac{s-2}{(s+1)(s-3)}=\begin{bmatrix}0 & 1 & 0\\ 3 & 2 & 1\\ -2 & 1 & 0\end{bmatrix},\qquad W_d=1,$$
$$W_u=\frac{s+10}{s+100}=\begin{bmatrix}-100 & -90\\ 1 & 1\end{bmatrix},\qquad W_e=\frac{1}{s}.$$
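The state space data for $P$ above can be verified against the transfer function at an arbitrary test point:

```python
import numpy as np

A = np.array([[0., 1.], [3., 2.]])
B = np.array([[0.], [1.]])
C = np.array([[-2., 1.]])

s = 0.3 + 0.7j   # arbitrary test point (not a pole)
G_ss = (C @ np.linalg.solve(s * np.eye(2) - A, B))[0, 0]
G_tf = (s - 2) / ((s + 1) * (s - 3))
assert abs(G_ss - G_tf) < 1e-12   # C(sI - A)^{-1}B matches (s-2)/((s+1)(s-3))
```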

Then we can choose, without loss of generality, that
$$M=\frac{s+\alpha}{s},\qquad \tilde W_e=\frac{1}{s+\alpha},\qquad \alpha>0.$$

This gives the following generalized system:
$$G(s)=\begin{bmatrix}
-\alpha & 0 & 1 & -2 & 1 & 1 & 0\\
0 & -100 & 0 & 0 & 0 & 0 & -90\\
0 & 0 & 0 & -2\alpha & \alpha & \alpha & 0\\
0 & 0 & 0 & 0 & 1 & 0 & 0\\
0 & 0 & 0 & 3 & 2 & 0 & 1\\ \hline
1 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 1 & 0 & 0 & 0 & 0 & 1\\
0 & 0 & 1 & -2 & 1 & 1 & 0
\end{bmatrix}.$$
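As a sanity check on the assembled realization, the $w\to y_1$ channel of $G(s)$ should equal $MW_d=(s+\alpha)/s$ (here $\alpha=1$, a hypothetical value):

```python
import numpy as np

a = 1.0   # hypothetical alpha
A = np.array([[-a,    0., 1., -2.,   1.],
              [ 0., -100., 0.,  0.,  0.],
              [ 0.,    0., 0., -2*a,  a],
              [ 0.,    0., 0.,  0.,   1.],
              [ 0.,    0., 0.,  3.,   2.]])
Bw  = np.array([[1.], [0.], [a], [0.], [0.]])   # first input column (w)
Cy  = np.array([[0., 0., 1., -2., 1.]])          # y1 output row
Dyw = np.array([[1.0]])

s = 1.5j
T = (Cy @ np.linalg.solve(s * np.eye(5) - A, Bw) + Dyw)[0, 0]
assert abs(T - (s + a) / s) < 1e-10              # equals M*Wd = (s+alpha)/s
```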

A suboptimal $H_\infty$ controller $\hat K_\infty$ for each $\alpha$ can then be computed easily as
$$\hat K_\infty=\frac{-2060381.4(s+1)(s+\alpha)(s+100)(s-0.1557)}{(s+\alpha)^2(s+32.17)(s+262343)(s-19.89)}$$
which gives the closed-loop $\infty$-norm $7.854$. Hence the controller $K_\infty=-\hat K_\infty(s)M(s)$ is given by
$$K_\infty(s)=\frac{2060381.4(s+1)(s+100)(s-0.1557)}{s(s+32.17)(s+262343)(s-19.89)}\approx\frac{7.85(s+1)(s+100)(s-0.1557)}{s(s+32.17)(s-19.89)}$$


which is independent of $\alpha$, as expected. Similarly, we can solve an optimal $H_2$ controller
$$\hat K_2=\frac{-43.487(s+1)(s+\alpha)(s+100)(s-0.069)}{(s+\alpha)^2(s^2+30.94s+411.81)(s-7.964)}$$
and
$$K_2(s)=-\hat K_2(s)M(s)=\frac{43.487(s+1)(s+100)(s-0.069)}{s(s^2+30.94s+411.81)(s-7.964)}.$$
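The $\alpha$-independence amounts to the cancelation of the $(s+\alpha)$ factors, which can be checked pointwise (coefficients as printed above):

```python
import numpy as np

def Khat_inf(s, a):
    """Suboptimal controller from the example (depends on alpha)."""
    return (-2060381.4 * (s + 1) * (s + a) * (s + 100) * (s - 0.1557)
            / ((s + a)**2 * (s + 32.17) * (s + 262343) * (s - 19.89)))

def M(s, a):
    return (s + a) / s

s = 0.3 + 0.7j
vals = [-Khat_inf(s, a) * M(s, a) for a in (0.5, 2.0, 7.0)]
assert np.allclose(vals, vals[0])   # K_inf(s) does not depend on alpha
```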

An approximate integral control can also be achieved, without going through the above process, by letting
$$W_e=\tilde W_e=\frac{1}{s+\epsilon},\qquad M(s)=1$$
for a sufficiently small $\epsilon>0$. For example, a controller for $\epsilon=0.001$ is given by
$$K_\infty=\frac{316880(s+1)(s+100)(s-0.1545)}{(s+0.001)(s+32)(s+40370)(s-20)}\approx\frac{7.85(s+1)(s+100)(s-0.1545)}{s(s+32)(s-20)}$$
which gives a closed-loop $H_\infty$ norm of $7.85$. Similarly, an approximate $H_2$ integral controller is obtained as
$$K_2=\frac{43.47(s+1)(s+100)(s-0.0679)}{(s+0.001)(s^2+30.93s+411.7)(s-7.9718)}.$$

17.5 $H_\infty$ Filtering

In this section we show how the filtering problem can be solved using the $H_\infty$ theory developed earlier. Suppose a dynamic system is described by the following equations:
$$\dot x=Ax+B_1w(t),\quad x(0)=0 \qquad (17.1)$$
$$y=C_2x+D_{21}w(t) \qquad (17.2)$$
$$z=C_1x+D_{11}w(t) \qquad (17.3)$$
The filtering problem is to find an estimate $\hat z$ of $z$, in some sense, using the measurement of $y$. The restriction on the filtering problem is that the filter has to be causal so that it can be realized, i.e., $\hat z$ has to be generated by a causal system acting on the measurements. We will further restrict our filter to be unbiased, i.e., given $T>0$, the estimate $\hat z(t)=0\ \forall t\in[0,T]$ if $y(t)=0,\ \forall t\in[0,T]$. Now we can state our $H_\infty$ filtering problem.

$H_\infty$ Filtering: Given a $\gamma>0$, find a causal filter $F(s)\in\mathcal{RH}_\infty$, if it exists, such that
$$J:=\sup_{w\in L_2[0,\infty)}\frac{\|z-\hat z\|_2^2}{\|w\|_2^2}<\gamma^2$$
with $\hat z=F(s)y$.


[Figure 17.4: Filtering Problem Formulation -- $w$ drives the system with realization $(A,B_1;C_1,D_{11};C_2,D_{21})$, the filter $F(s)$ acts on $y$ to produce $\hat z$, and $\tilde z=z-\hat z$]

A diagram for the filtering problem is shown in Figure 17.4. The above filtering problem can also be formulated in an LFT framework: given the system
$$G(s)=\begin{bmatrix}A & B_1 & 0\\ C_1 & D_{11} & -I\\ C_2 & D_{21} & 0\end{bmatrix}$$
find a filter $F(s)\in\mathcal{RH}_\infty$ such that
$$\sup_{w\in L_2}\frac{\|\tilde z\|_2^2}{\|w\|_2^2}<\gamma^2. \qquad (17.4)$$
Hence the filtering problem can be regarded as a special $H_\infty$ problem. However, compared with control problems, there is no internal stability requirement in the filtering problem. Hence the solution to the above filtering problem can be obtained from the $H_\infty$ solution in the previous sections by setting $B_2=0$ and dropping the internal stability requirement.

Theorem 17.5 Suppose $(C_2,A)$ is detectable and
$$\begin{bmatrix}A-j\omega I & B_1\\ C_2 & D_{21}\end{bmatrix}$$
has full row rank for all $\omega$. Let $D_{21}$ be normalized and $D_{11}$ partitioned conformably as
$$\begin{bmatrix}D_{11}\\ D_{21}\end{bmatrix}=\begin{bmatrix}D_{111} & D_{112}\\ 0 & I\end{bmatrix}.$$
Then there exists a causal filter $F(s)\in\mathcal{RH}_\infty$ such that $J<\gamma^2$ if and only if $\bar\sigma(D_{111})<\gamma$ and $J_\infty\in\mathrm{dom}(\mathrm{Ric})$ with $Y_\infty=\mathrm{Ric}(J_\infty)\ge 0$, where
$$\tilde R:=\begin{bmatrix}D_{11}\\ D_{21}\end{bmatrix}\begin{bmatrix}D_{11}\\ D_{21}\end{bmatrix}^*-\begin{bmatrix}\gamma^2I & 0\\ 0 & 0\end{bmatrix}$$
$$J_\infty:=\begin{bmatrix}A^* & 0\\ -B_1B_1^* & -A\end{bmatrix}-\begin{bmatrix}C_1^* & C_2^*\\ -B_1D_{11}^* & -B_1D_{21}^*\end{bmatrix}\tilde R^{-1}\begin{bmatrix}D_{11}B_1^* & C_1\\ D_{21}B_1^* & C_2\end{bmatrix}.$$
Moreover, if the above conditions are satisfied, then a rational causal filter $F(s)$ satisfying $J<\gamma^2$ is given by
$$\hat z=F(s)y=\begin{bmatrix}A+L_{21}C_2+L_{11}D_{112}C_2 & -L_{21}-L_{11}D_{112}\\ C_1-D_{112}C_2 & D_{112}\end{bmatrix}y$$
where
$$\begin{bmatrix}L_{11} & L_{21}\end{bmatrix}:=-\begin{bmatrix}B_1D_{11}^*+Y_\infty C_1^* & B_1D_{21}^*+Y_\infty C_2^*\end{bmatrix}\tilde R^{-1}.$$
In the case where $D_{11}=0$ and $B_1D_{21}^*=0$, the filter becomes much simpler:
$$\hat z=\begin{bmatrix}A-Y_\infty C_2^*C_2 & Y_\infty C_2^*\\ C_1 & 0\end{bmatrix}y$$
where $Y_\infty$ is the stabilizing solution to
$$Y_\infty A^*+AY_\infty+Y_\infty(\gamma^{-2}C_1^*C_1-C_2^*C_2)Y_\infty+B_1B_1^*=0.$$

17.6 Youla Parameterization Approach*

In this section, we shall briefly outline an alternative approach to solving the standard $H_2$ and $H_\infty$ control problems using the $Q$ parameterization (Youla parameterization) approach. This approach was the main focus in the early 1980's and is still very useful in solving many interesting problems. We will see that this approach may suggest additional interpretations for the results presented in the last chapter, and it applies to both $H_2$ and $H_\infty$ problems. The $H_2$ problem is very simple and involves a projection. While the $H_\infty$ problem is much more difficult, it bears some similarities to the constant matrix dilation problem, but with the restriction of internal stability. Nevertheless, we have built enough tools to give a fairly complete solution using this $Q$ parameterization approach.

Assume again that $G$ has the realization
$$G=\begin{bmatrix}A & B_1 & B_2\\ C_1 & D_{11} & D_{12}\\ C_2 & D_{21} & D_{22}\end{bmatrix}$$
with the same assumptions as before. But, for convenience, we will allow assumption (A2) to be relaxed to the following:


(A2$'$) $D_{12}$ is full column rank with $\begin{bmatrix}D_{12} & D_\perp\end{bmatrix}$ unitary, and $D_{21}$ is full row rank with $\begin{bmatrix}D_{21}\\ \tilde D_\perp\end{bmatrix}$ unitary.

Next, we will outline the steps required to solve the $H_2$ and $H_\infty$ control problems. Because of the similarity between the $H_2$ and $H_\infty$ problems, they are developed in parallel below.

Parameterization: Recall that all controllers stabilizing the plant $G$ can be expressed as
$$K=F_\ell(M_2,Q),\qquad I+D_{22}Q(\infty)\ \text{invertible}$$
with $Q\in\mathcal{RH}_\infty$ and
$$M_2=\begin{bmatrix}A+B_2F_2+L_2C_2+L_2D_{22}F_2 & -L_2 & B_2+L_2D_{22}\\ F_2 & 0 & I\\ -(C_2+D_{22}F_2) & I & -D_{22}\end{bmatrix}$$
where
$$F_2=-(B_2^*X_2+D_{12}^*C_1),\qquad L_2=-(Y_2C_2^*+B_1D_{21}^*)$$
$$X_2=\mathrm{Ric}\begin{bmatrix}A-B_2D_{12}^*C_1 & -B_2B_2^*\\ -C_1^*D_\perp D_\perp^*C_1 & -(A-B_2D_{12}^*C_1)^*\end{bmatrix}\ge 0$$
$$Y_2=\mathrm{Ric}\begin{bmatrix}(A-B_1D_{21}^*C_2)^* & -C_2^*C_2\\ -B_1\tilde D_\perp^*\tilde D_\perp B_1^* & -(A-B_1D_{21}^*C_2)\end{bmatrix}\ge 0.$$

Then the closed-loop transfer matrix from $w$ to $z$ can be written as
$$F_\ell(G,K)=T_o+UQV,\qquad I+D_{22}Q(\infty)\ \text{invertible}$$
where
$$T_o=\begin{bmatrix}A_{F_2} & -B_2F_2 & B_1\\ 0 & A_{L_2} & B_{1L_2}\\ C_{1F_2} & -D_{12}F_2 & D_{11}\end{bmatrix}\in\mathcal{RH}_\infty$$
$$U=\begin{bmatrix}A_{F_2} & B_2\\ C_{1F_2} & D_{12}\end{bmatrix},\qquad V=\begin{bmatrix}A_{L_2} & B_{1L_2}\\ C_2 & D_{21}\end{bmatrix}$$
and
$$A_{F_2}:=A+B_2F_2,\qquad A_{L_2}:=A+L_2C_2,$$
$$C_{1F_2}:=C_1+D_{12}F_2,\qquad B_{1L_2}:=B_1+L_2D_{21}.$$
It is easy to show that $U$ is inner, $U^*U=I$, and $V$ is co-inner, $VV^*=I$.


Unitary Invariant: There exist $U_\perp$ and $V_\perp$ such that $\begin{bmatrix}U & U_\perp\end{bmatrix}$ and $\begin{bmatrix}V\\ V_\perp\end{bmatrix}$ are square and inner:
$$U_\perp=\begin{bmatrix}A_{F_2} & -X_2^+C_1^*D_\perp\\ C_{1F_2} & D_\perp\end{bmatrix},\qquad V_\perp=\begin{bmatrix}A_{L_2} & B_{1L_2}\\ -\tilde D_\perp B_1^*Y_2^+ & \tilde D_\perp\end{bmatrix}.$$
Since multiplication by square inner matrices does not change the $H_2$ and $H_\infty$ norms, we have, for $\alpha=2$ or $\infty$,
$$\|F_\ell(P,K)\|_\alpha=\|T_o+UQV\|_\alpha=\left\|\begin{bmatrix}U & U_\perp\end{bmatrix}^*(T_o+UQV)\begin{bmatrix}V\\ V_\perp\end{bmatrix}^*\right\|_\alpha=\left\|\begin{bmatrix}R_{11}+Q & R_{12}\\ R_{21} & R_{22}\end{bmatrix}\right\|_\alpha$$
where
$$R=\begin{bmatrix}U^*\\ U_\perp^*\end{bmatrix}T_o\begin{bmatrix}V^* & V_\perp^*\end{bmatrix}$$
with the obvious partitioning. It can be shown through some long and tedious algebra that $R$ is antistable and has the following representation:
$$R=\begin{bmatrix}
-A_{F_2}^* & EB_{1L_2}^* & -ED_{21}^* & -E\tilde D_\perp^*\\
0 & -A_{L_2}^* & C_2^* & -Y_2^+B_1\tilde D_\perp^*\\ \hline
B_2^* & F_2Y_2-D_{12}^*D_{11}B_{1L_2}^* & D_{12}^*D_{11}D_{21}^* & D_{12}^*D_{11}\tilde D_\perp^*\\
-D_\perp^*C_1X_2^+ & -D_\perp^*D_{11}B_{1L_2}^* & D_\perp^*D_{11}D_{21}^* & D_\perp^*D_{11}\tilde D_\perp^*
\end{bmatrix}$$
where $E:=X_2B_1+C_{1F_2}^*D_{11}$.

Projection/Dilation: At this point the $\alpha=2$ and $\alpha=\infty$ cases have to be treated separately.

$H_2$ case: In this case, we will assume $D_\perp^*D_{11}=0$ and $D_{11}\tilde D_\perp^*=0$; otherwise, the 2-norm of the closed-loop transfer matrix will be unbounded, since
$$R(\infty)=\begin{bmatrix}D_{12}^*D_{11}D_{21}^* & D_{12}^*D_{11}\tilde D_\perp^*\\ D_\perp^*D_{11}D_{21}^* & D_\perp^*D_{11}\tilde D_\perp^*\end{bmatrix}.$$
Now, from the definition of the 2-norm, the problem can be rewritten as
$$\|F_\ell(G,K)\|_2^2=\left\|\begin{bmatrix}R_{11}+Q & R_{12}\\ R_{21} & R_{22}\end{bmatrix}\right\|_2^2=\|R_{11}+Q\|_2^2+\left\|\begin{bmatrix}0 & R_{12}\\ R_{21} & R_{22}\end{bmatrix}\right\|_2^2.$$
Furthermore,
$$\|R_{11}+Q\|_2^2=\|R_{11}-D_{12}^*D_{11}D_{21}^*\|_2^2+\|Q+D_{12}^*D_{11}D_{21}^*\|_2^2$$
since $(R_{11}-D_{12}^*D_{11}D_{21}^*)\in\mathcal{H}_2^\perp$ and $(Q+D_{12}^*D_{11}D_{21}^*)\in\mathcal{H}_\infty$. In fact, $(Q+D_{12}^*D_{11}D_{21}^*)$ has to be in $\mathcal{H}_2$ to guarantee that the 2-norm is finite. Hence the unique optimal solution is given by a projection:
$$Q_{opt}=[-R_{11}]_+=-D_{12}^*D_{11}D_{21}^*.$$
The optimal controller is given by
$$K_{opt}=F_\ell(M_2,Q_{opt}).$$
In particular, if $D_{11}=0$ then $K_{opt}=F_\ell(M_2,0)$.

$H_\infty$ case: Recall the definition of the $\infty$-norm:
$$\left\|\begin{bmatrix}R_{11}+Q & R_{12}\\ R_{21} & R_{22}\end{bmatrix}\right\|_\infty=\sup_\omega\ \bar\sigma\left(\begin{bmatrix}R_{11}+Q & R_{12}\\ R_{21} & R_{22}\end{bmatrix}(j\omega)\right).$$
Consider the minimization problem with respect to $Q$ frequency by frequency; it becomes a constant matrix dilation problem if no causality restrictions are imposed on $Q$. The $H_\infty$ optimal control follows this idea. However, it should be noted that, since $Q$ is restricted to be an $\mathcal{RH}_\infty$ matrix, the term $R_{11}+Q$ cannot be made into an arbitrary matrix. Therefore, the problem will be significantly different from the constant matrix case and, generically,
$$\min_{Q\in\mathcal{H}_\infty}\left\|\begin{bmatrix}R_{11}+Q & R_{12}\\ R_{21} & R_{22}\end{bmatrix}\right\|_\infty>\gamma_0$$
where
$$\gamma_0:=\max\left\{\left\|\begin{bmatrix}R_{21} & R_{22}\end{bmatrix}\right\|_\infty,\ \left\|\begin{bmatrix}R_{12}\\ R_{22}\end{bmatrix}\right\|_\infty\right\}.$$
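For constant matrices the lower bound $\gamma_0$ is actually attained (Parrott's theorem). A quick check with hypothetical random data, minimizing the spectral norm over the free block with a derivative-free method (tolerances are loose; this is illustration, not a proof):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
R12 = rng.standard_normal((2, 2))
R21 = rng.standard_normal((2, 2))
R22 = rng.standard_normal((2, 2))

gamma0 = max(np.linalg.norm(np.hstack([R21, R22]), 2),
             np.linalg.norm(np.vstack([R12, R22]), 2))

def obj(x):   # spectral norm with the (1,1) block completely free
    return np.linalg.norm(np.block([[x.reshape(2, 2), R12], [R21, R22]]), 2)

res = minimize(obj, np.zeros(4), method="Nelder-Mead",
               options={"xatol": 1e-10, "fatol": 1e-12,
                        "maxiter": 40000, "maxfev": 40000})
assert res.fun >= gamma0 - 1e-9    # gamma0 is a hard lower bound
assert res.fun <= gamma0 + 1e-2    # and it is (approximately) attained
```

The first assertion holds for any feasible point; it is the near-attainment that is special to the unconstrained (constant-matrix) case and fails generically when $Q$ must be causal.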

It is convenient to use the characterization in Corollary 2.23. Recall that
$$\left\|\begin{bmatrix}R_{11}+Q & R_{12}\\ R_{21} & R_{22}\end{bmatrix}\right\|_\infty\le\gamma\ (<\gamma),\qquad\text{for }\gamma>\gamma_0,$$
if and only if
$$\left\|(I-WW^*)^{-1/2}(Q+R_{11}+WR_{22}^*Z)(I-Z^*Z)^{-1/2}\right\|_\infty\le\gamma\ (<\gamma)$$
where
$$W=R_{12}(\gamma^2I-R_{22}^*R_{22})^{-1/2},\qquad Z=(\gamma^2I-R_{22}R_{22}^*)^{-1/2}R_{21}.$$
The key here is to find spectral factors $M=(I-WW^*)^{1/2}$ and $N=(I-Z^*Z)^{1/2}$ with stable inverses, such that $M,M^{-1},N,N^{-1}\in\mathcal{H}_\infty$, $MM^*=I-WW^*$, and $N^*N=I-Z^*Z$. Now let
$$\bar Q:=M^{-1}QN^{-1},\qquad \bar G:=M^{-1}(R_{11}+WR_{22}^*Z)N^{-1};$$
then the problem is reduced to finding $\bar Q\in\mathcal{H}_\infty$ such that
$$\left\|\bar G+\bar Q\right\|_\infty\le\gamma\ (<\gamma). \qquad (17.5)$$
Note that $Q=M\bar QN\in\mathcal{H}_\infty$ iff $\bar Q\in\mathcal{H}_\infty$.

The final step in the $H_\infty$ problem involves solving (17.5) for $\bar Q\in\mathcal{H}_\infty$. This is a standard Nehari problem and is solved in Chapter 8. The optimal control law will be given by
$$K_{opt}=F_\ell(M_2,Q_{opt}).$$

17.7 Connections

This section considers the connections between the Youla parameterization approach and the current state space approach discussed in the last chapter. The key is Lemma 15.7 and its connection with the formulae given in the last section.

To see how Lemma 15.7 might be used in the last section to prove Theorems 16.4 and 16.5 or Theorem 17.1, we begin with $G$ having state dimension $n$. For simplicity, we assume further that $D_{11}=0$. Then, from the last section, we have
$$\|T_{zw}\|_\infty=\left\|R+\begin{bmatrix}Q & 0\\ 0 & 0\end{bmatrix}\right\|_\infty=\left\|\begin{bmatrix}R_{11}+Q & R_{12}\\ R_{21} & R_{22}\end{bmatrix}\right\|_\infty=\left\|R^\sim+\begin{bmatrix}Q^\sim & 0\\ 0 & 0\end{bmatrix}\right\|_\infty$$
where $R$ has state dimension $2n$. Now define
$$\begin{bmatrix}N_{11} & N_{12}\\ N_{21} & N_{22}\end{bmatrix}:=R^\sim.$$


Then
$$\begin{bmatrix}N_{11} & N_{12}\\ N_{21} & N_{22}\end{bmatrix}=\begin{bmatrix}
A_{F_2} & 0 & B_2 & -X_2^+C_1^*D_\perp\\
-B_{1L_2}B_1^*X_2 & A_{L_2} & Y_2F_2^* & 0\\ \hline
D_{21}B_1^*X_2 & -C_2 & 0 & 0\\
\tilde D_\perp B_1^*X_2 & \tilde D_\perp B_1^*Y_2^+ & 0 & 0
\end{bmatrix}\in\mathcal{RH}_\infty, \qquad (17.6)$$
and the $H_\infty$ problem becomes to find an antistable $Q^\sim$ such that
$$\left\|\begin{bmatrix}N_{11}+Q^\sim & N_{12}\\ N_{21} & N_{22}\end{bmatrix}\right\|_\infty<\gamma.$$

Let $w=\begin{bmatrix}w_1\\ w_2\end{bmatrix}$ and note that
$$\left\|\begin{bmatrix}N_{11}+Q^\sim & N_{12}\\ N_{21} & N_{22}\end{bmatrix}\right\|_\infty^2:=\sup_{w\in BL_2}\left\|\begin{bmatrix}N_{11}+Q^\sim & N_{12}\\ N_{21} & N_{22}\end{bmatrix}\begin{bmatrix}w_1\\ w_2\end{bmatrix}\right\|_2^2=\sup_{\{w\in BL_2\}\cap\{w_1\in\mathcal{H}_2^\perp\}}\left\|\begin{bmatrix}N_{11}+Q^\sim & N_{12}\\ N_{21} & N_{22}\end{bmatrix}\begin{bmatrix}w_1\\ w_2\end{bmatrix}\right\|_2^2.$$

The right-hand side of the above equation can be written as
$$\sup_{\{w\in BL_2\}\cap\{w_1\in\mathcal{H}_2^\perp\}}\left\{\left\|\begin{bmatrix}P_+ & 0\\ 0 & I\end{bmatrix}\begin{bmatrix}N_{11} & N_{12}\\ N_{21} & N_{22}\end{bmatrix}\begin{bmatrix}w_1\\ w_2\end{bmatrix}\right\|_2^2+\left\|Q^\sim w_1+P_-(N_{11}w_1+N_{12}w_2)\right\|_2^2\right\}.$$

It is clear that the last term can be made zero by an appropriate choice of an antistable $Q^\sim$. Hence the $H_\infty$ problem has a solution if and only if
$$\sup_{\{w\in BL_2\}\cap\{w_1\in\mathcal{H}_2^\perp\}}\left\|\begin{bmatrix}P_+ & 0\\ 0 & I\end{bmatrix}\begin{bmatrix}N_{11} & N_{12}\\ N_{21} & N_{22}\end{bmatrix}\begin{bmatrix}w_1\\ w_2\end{bmatrix}\right\|_2<\gamma.$$
But this is exactly the same operator as the one in Lemma 15.7. Lemma 15.7 may be applied to derive the solvability conditions, and some additional arguments then construct a $Q\in\mathcal{RH}_\infty$ from $X$ and $Y$ such that $\|T_{zw}(Q)\|_\infty<\gamma$. In fact, it turns out that $X$ in Lemma 15.7 for $N$ is exactly $W$ in the FI proof or in the differential game problem.

The final step is to obtain the controller from $M_2$ and $Q$. Since $M_2$ has state dimension $n$ and $Q$ has $2n$, the apparent state dimension of $K$ is $3n$, but some tedious state space manipulations produce cancelations resulting in the $n$-dimensional controller formulae in Theorems 16.4 and 16.5. This approach is exactly the procedure used in Doyle [1984] and Francis [1987], with Lemma 15.7 used to solve the general distance problem. Although this approach is conceptually straightforward and was, in fact, used to obtain the first proof of the current state space results in this chapter, it seems unnecessarily cumbersome and indirect.

17.8 State Feedback and Differential Game

It has been shown in Chapters 15 and 16 that a (central) suboptimal full information $H_\infty$ control law is actually a pure state feedback if $D_{11}=0$. However, this is not true in general if $D_{11}\neq 0$, as will be shown below. Nevertheless, the state feedback $H_\infty$ control problem is itself very interesting and deserves special treatment. This section and the section to follow are devoted to the study of this state feedback $H_\infty$ control problem and its connections with full information control and differential games.

Consider a dynamical system
$$\dot x=Ax+B_1w+B_2u \qquad (17.7)$$
$$z=C_1x+D_{11}w+D_{12}u \qquad (17.8)$$
where $z(t)\in\mathbf{R}^{p_1}$, $w(t)\in\mathbf{R}^{m_1}$, $u(t)\in\mathbf{R}^{m_2}$, and $x(t)\in\mathbf{R}^n$. The following assumptions are made:

(AS1) $(A,B_2)$ is stabilizable;

(AS2) There is a matrix $D_\perp$ such that $\begin{bmatrix}D_{12} & D_\perp\end{bmatrix}$ is unitary;

(AS3) $\begin{bmatrix}A-j\omega I & B_2\\ C_1 & D_{12}\end{bmatrix}$ has full column rank for all $\omega$.

We are interested in the following two related quadratic min-max problems: given $\gamma>0$, check whether
$$\sup_{w\in BL_2[0,\infty)}\ \min_{u\in L_2[0,\infty)}\|z\|_2<\gamma$$
and
$$\min_{u\in L_2[0,\infty)}\ \sup_{w\in BL_2[0,\infty)}\|z\|_2<\gamma.$$
The first problem can be regarded as a full information control problem, since the control signal $u$ can be a function of the disturbance $w$ and the system state $x$. On the other hand, the optimal control signal in the second problem cannot depend on the disturbance $w$ (the worst disturbance $w$ can be a function of $u$ and $x$). In fact, it will be shown that the control signal in the latter problem depends only on the system state; hence it is equivalent to a state feedback control.


Theorem 17.6 Let $\gamma>0$ be given and define
$$R:=D_{1\bullet}^*D_{1\bullet}-\begin{bmatrix}\gamma^2I_{m_1} & 0\\ 0 & 0\end{bmatrix},\qquad\text{where } D_{1\bullet}:=\begin{bmatrix}D_{11} & D_{12}\end{bmatrix}$$
$$H_\infty:=\begin{bmatrix}A & 0\\ -C_1^*C_1 & -A^*\end{bmatrix}-\begin{bmatrix}B\\ -C_1^*D_{1\bullet}\end{bmatrix}R^{-1}\begin{bmatrix}D_{1\bullet}^*C_1 & B^*\end{bmatrix}$$
where $B:=\begin{bmatrix}B_1 & B_2\end{bmatrix}$.

(a) $\displaystyle\sup_{w\in BL_2[0,\infty)}\ \min_{u\in L_2[0,\infty)}\|z\|_2<\gamma$ if and only if
$$\bar\sigma(D_\perp^*D_{11})<\gamma,\qquad H_\infty\in\mathrm{dom}(\mathrm{Ric}),\qquad X_\infty=\mathrm{Ric}(H_\infty)\ge 0.$$
Moreover, an optimal $u$ is given by
$$u=-D_{12}^*D_{11}w+\begin{bmatrix}D_{12}^*D_{11} & I\end{bmatrix}Fx,$$
and a worst disturbance $w^*_{worst}$ is given by
$$w^*_{worst}=F_{11}x$$
where
$$F:=\begin{bmatrix}F_{11}\\ F_{21}\end{bmatrix}:=-R^{-1}\left[D_{1\bullet}^*C_1+B^*X_\infty\right].$$

(b) $\displaystyle\min_{u\in L_2[0,\infty)}\ \sup_{w\in BL_2[0,\infty)}\|z\|_2<\gamma$ if and only if
$$\bar\sigma(D_{11})<\gamma,\qquad H_\infty\in\mathrm{dom}(\mathrm{Ric}),\qquad X_\infty=\mathrm{Ric}(H_\infty)\ge 0.$$
Moreover, an optimal $u$ is given by
$$u=F_{21}x=-D_{12}^*D_{11}w^*_{worst}+\begin{bmatrix}D_{12}^*D_{11} & I\end{bmatrix}Fx,$$
and the worst disturbance $w^{sf}_{worst}$ is given by
$$w^{sf}_{worst}=(\gamma^2I-D_{11}^*D_{11})^{-1}\left\{(D_{11}^*C_1+B_1^*X_\infty)x+D_{11}^*D_{12}u\right\}.$$

Proof. The condition for part (a) can be shown in the same way as in Chapter 15 and is, in fact, the solution to the FI problem for the general case. We now prove the condition for part (b). It is not hard to see that $\bar\sigma(D_{11})<\gamma$ is necessary, since the control $u$ cannot feed back $w$ directly, so the $D_{11}$ term cannot be (partially) eliminated as in the FI problem. The conditions $H_\infty\in\mathrm{dom}(\mathrm{Ric})$ and $X_\infty=\mathrm{Ric}(H_\infty)\ge 0$ can easily be seen to be necessary, since
$$\sup_{w\in BL_2[0,\infty)}\ \min_{u\in L_2[0,\infty)}\|z\|_2\le\min_{u\in L_2[0,\infty)}\ \sup_{w\in BL_2[0,\infty)}\|z\|_2.$$
It is easy to verify directly that the optimal control and the worst disturbance can be chosen in the given form. $\Box$

Define
$$R_0=I-D_{11}^*D_\perp D_\perp^*D_{11}/\gamma^2$$
$$\tilde R_0=I+D_{12}^*D_{11}(\gamma^2I-D_{11}^*D_{11})^{-1}D_{11}^*D_{12}$$
$$\bar R_0=I-D_{11}^*D_{11}/\gamma^2.$$
Then it is easy to show that
$$\|z\|^2-\gamma^2\|w\|^2+\frac{d}{dt}(x^*X_\infty x)=\left\|u+D_{12}^*D_{11}w-\begin{bmatrix}D_{12}^*D_{11} & I\end{bmatrix}Fx\right\|^2-\gamma^2\left\|R_0^{1/2}(w-F_{11}x)\right\|^2$$
if the conditions in (a) are satisfied. On the other hand, we have
$$\|z\|^2-\gamma^2\|w\|^2+\frac{d}{dt}(x^*X_\infty x)=\left\|\tilde R_0^{1/2}(u-F_{21}x)\right\|^2-\gamma^2\left\|\bar R_0^{1/2}(w-w^{sf}_{worst})\right\|^2$$
if the conditions in (b) are satisfied. Integrating both equations from $t=0$ to $\infty$ with $x(0)=x(\infty)=0$ gives
$$\|z\|_2^2-\gamma^2\|w\|_2^2=\left\|u+D_{12}^*D_{11}w-\begin{bmatrix}D_{12}^*D_{11} & I\end{bmatrix}Fx\right\|_2^2-\gamma^2\left\|R_0^{1/2}(w-F_{11}x)\right\|_2^2$$
if the conditions in (a) are satisfied, and
$$\|z\|_2^2-\gamma^2\|w\|_2^2=\left\|\tilde R_0^{1/2}(u-F_{21}x)\right\|_2^2-\gamma^2\left\|\bar R_0^{1/2}(w-w^{sf}_{worst})\right\|_2^2$$
if the conditions in (b) are satisfied. These relations suggest that an optimal control law and a worst disturbance for problem (a) are
$$u=-D_{12}^*D_{11}w+\begin{bmatrix}D_{12}^*D_{11} & I\end{bmatrix}Fx,\qquad w=F_{11}x,$$
and that an optimal control law and a worst disturbance for problem (b) are
$$u=F_{21}x,\qquad w=w^{sf}_{worst}.$$


Moreover, if problem (b) has a solution for a given $\gamma$, then the corresponding differential game problem
$$\min_{u\in L_2[0,\infty)}\ \sup_{w\in L_2[0,\infty)}\left\{\|z\|_2^2-\gamma^2\|w\|_2^2\right\}$$
has a saddle point, i.e.,
$$\sup_{w\in L_2[0,\infty)}\ \min_{u\in L_2[0,\infty)}\left\{\|z\|_2^2-\gamma^2\|w\|_2^2\right\}=\min_{u\in L_2[0,\infty)}\ \sup_{w\in L_2[0,\infty)}\left\{\|z\|_2^2-\gamma^2\|w\|_2^2\right\}$$
since the solvability of problem (b) implies the solvability of problem (a). However, the converse may not be true. In fact, it is easy to construct an example such that problem (a) has a solution for a given $\gamma$ while problem (b) does not. On the other hand, problems (a) and (b) are equivalent if $D_{11}=0$.

17.9 Parameterization of State Feedback $H_\infty$ Controllers

In this section, we shall consider the parameterization of all state feedback $H_\infty$ control laws. We shall first assume, for simplicity, that $D_{11}=0$ and show later how to reduce the general $D_{11}\neq 0$ case to an equivalent problem with $D_{11}=0$. We shall assume
$$G=\begin{bmatrix}A & B_1 & B_2\\ C_1 & 0 & D_{12}\\ I & 0 & 0\end{bmatrix}.$$

Note that the state feedback $H_\infty$ problem is not a special case of the output feedback problem, since $D_{21}=0$. Hence the parameterization cannot be obtained from the output feedback results.

Theorem 17.7 Suppose that assumptions (AS1)--(AS3) are satisfied and that $B_1$ has the following SVD:
$$B_1=U\begin{bmatrix}\Sigma & 0\\ 0 & 0\end{bmatrix}V^*,\qquad UU^*=I_n,\quad V^*V=I_{m_1},\quad 0<\Sigma\in\mathbf{R}^{r\times r}.$$
There exists an admissible controller $K(s)$ for the SF problem such that $\|T_{zw}\|_\infty<\gamma$ if and only if $H_\infty\in\mathrm{dom}(\mathrm{Ric})$ and $X_\infty=\mathrm{Ric}(H_\infty)\ge 0$. Furthermore, if these conditions are satisfied, then all admissible controllers satisfying $\|T_{zw}\|_\infty<\gamma$ can be parameterized as
$$K=F_\infty+\left\{I_{m_2}+Q\begin{bmatrix}\Sigma^{-1} & 0\\ 0 & I_{m_1-r}\end{bmatrix}U^{-1}B_2\right\}^{-1}Q\begin{bmatrix}\Sigma^{-1} & 0\\ 0 & I_{m_1-r}\end{bmatrix}U^{-1}(sI-\hat A)$$
where $F_\infty=-(D_{12}^*C_1+B_2^*X_\infty)$, $\hat A=A+\gamma^{-2}B_1B_1^*X_\infty+B_2F_\infty$, and $Q=\begin{bmatrix}Q_1 & Q_2\end{bmatrix}\in\mathcal{RH}_2$ with $\|Q_1\|_\infty<\gamma$. The dimensions of $Q_1$ and $Q_2$ are $m_2\times r$ and $m_2\times(m_1-r)$, respectively.


Proof. The conditions for the existence of the state feedback control law have been shown in the last section and in Chapter 15. We only need to show that the above parameterization indeed satisfies the $H_\infty$ norm condition and gives all possible state feedback $H_\infty$ control laws. As in the proof of the FI problem in Chapter 16, we make the same change of variables to get $G_{SF}$ instead of $G_{FI}$:

[Diagram: change of variables from the FI setup to the state feedback setup, with signals $w$, $z$, $u$, $y$, $r$, $v$ and blocks $K$, $G_{SF}$, $P$]

$$G_{SF}=\begin{bmatrix}A_{tmp} & B_1 & B_2\\ -F_\infty & 0 & I\\ I & 0 & 0\end{bmatrix}.$$
So, again from Theorem 16.2 and Lemma 16.8, we conclude that $K$ is an admissible controller for $G$ with $\|T_{zw}\|_\infty<\gamma$ iff $K$ is an admissible controller for $G_{SF}$ with $\|T_{vr}\|_\infty<\gamma$. Now let $L=B_2F_\infty$; then $A_{tmp}+L=A_{tmp}+B_2F_\infty$ is stable. All controllers that stabilize $G_{SF}$ can be parameterized as $K=F_\ell(M_{tmp},\Phi)$, $\Phi\in\mathcal{RH}_\infty$, where
$$M_{tmp}=\begin{bmatrix}A_{tmp}+B_2F_\infty+L & -L & B_2\\ F_\infty & 0 & I\\ -I & I & 0\end{bmatrix}.$$
Then $T_{vr}=F_\ell(G_{SF},F_\ell(M_{tmp},\Phi))=:F_\ell(N_{tmp},\Phi)$. It is easy to show that
$$N_{tmp}=\begin{bmatrix}A_{tmp}+B_2F_\infty & B_1 & 0\\ -F_\infty & 0 & I\\ I & 0 & 0\end{bmatrix}.$$
Now let $\Phi=F_\infty+\bar\Phi$, and we have
$$F_\ell(N_{tmp},\Phi)=\bar\Phi\begin{bmatrix}A_{tmp}+B_2F_\infty & B_1\\ I & 0\end{bmatrix}.$$


Let
$$\bar\Phi=\begin{bmatrix}Q_1 & Q_2\end{bmatrix}\begin{bmatrix}\Sigma^{-1} & 0\\ 0 & I\end{bmatrix}U^{-1}(sI-\hat A).$$
Then the mapping from $Q=\begin{bmatrix}Q_1 & Q_2\end{bmatrix}\in\mathcal{RH}_2$ to $\bar\Phi\in\mathcal{RH}_\infty$ is one-to-one. Hence we have
$$F_\ell(N_{tmp},\Phi)=\begin{bmatrix}Q_1 & 0\end{bmatrix}V^*$$
and $\|F_\ell(N_{tmp},\Phi)\|_\infty=\|Q_1\|_\infty$, which in turn implies that $\|T_{vr}\|_\infty=\|F_\ell(N_{tmp},\Phi)\|_\infty<\gamma$ if and only if $\|Q_1\|_\infty<\gamma$. Finally, substituting $\Phi=F_\infty+\bar\Phi$ into $K=F_\ell(M_{tmp},\Phi)$, we get the desired controller parameterization. $\Box$

The controller parameterization for the general case can also be obtained. Let
$$G_g(s)=\begin{bmatrix}A & B_1 & B_2\\ C_1 & D_{11} & D_{12}\\ I & 0 & 0\end{bmatrix}.$$
Let
$$N=\begin{bmatrix}-D_{11} & (I-D_{11}D_{11}^*)^{1/2}\\ (I-D_{11}^*D_{11})^{1/2} & D_{11}^*\end{bmatrix};$$
then $N^*N=I$. Furthermore, by Theorem 16.2, we have that $K$ stabilizes $G_g$ and $\|F_\ell(G_g,K)\|_\infty<1$ if and only if $F_\ell(G_g,K)$ internally stabilizes $N$ and
$$\|F_\ell(N,F_\ell(G_g,K))\|_\infty<1.$$
Note that
$$F_\ell(N,F_\ell(G_g,K))=F_\ell(M,\tilde K)$$
with
$$M(s)=\begin{bmatrix}A+B_1D_{11}^*R_1^{-1}C_1 & B_1(I-D_{11}^*D_{11})^{-1/2} & B_2+B_1D_{11}^*R_1^{-1}D_{12}\\ R_1^{-1/2}C_1 & 0 & R_1^{-1/2}D_{12}\\ I & 0 & 0\end{bmatrix}$$
where $R_1:=I-D_{11}D_{11}^*$. In summary, we have the following lemma.

Lemma 17.8 There is a $K$ that internally stabilizes $G_g$ and $\|F_\ell(G_g,K)\|_\infty<1$ if and only if $\|D_{11}\|<1$ and there is a $\tilde K$ that stabilizes $M$ and $\|F_\ell(M,\tilde K)\|_\infty<1$.

Now Theorem 17.7 can be applied to the new system $M(s)$ to obtain the controller parameterization for the general problem with $D_{11}\neq 0$.


17.10 Notes and References

The detailed derivation of the $H_\infty$ solution for the general case is treated in Glover and Doyle [1988, 1989]. The loop-shifting procedures are given in Safonov, Limebeer, and Chiang [1989]. The idea is also used in Zhou and Khargonekar [1988] for state feedback problems. A fairly complete solution to the singular $H_\infty$ problem is obtained in Stoorvogel [1992]. The $H_\infty$ filtering and smoothing problems are considered in detail in Nagpal and Khargonekar [1991]. The Youla parameterization approach is treated very extensively in Doyle [1984] and Francis [1987] and in the references therein. The presentation of the state feedback $H_\infty$ control in this chapter is based on Zhou [1992].

18 $H_\infty$ Loop Shaping

This chapter introduces a design technique which incorporates loop shaping methods to obtain performance/robust stability tradeoffs, and a particular $H_\infty$ optimization problem to guarantee closed-loop stability and a level of robust stability at all frequencies. The proposed technique uses only the basic concept of loop shaping methods; a robust stabilization controller for the normalized coprime factor perturbed system is then used to construct the final controller. This chapter is arranged as follows: the $H_\infty$ theory is applied to solve the stabilization problem of a normalized coprime factor perturbed system in Section 18.1. The loop shaping design procedure is described in Section 18.2. The theoretical justification for the loop shaping design procedure is given in Section 18.3.

18.1 Robust Stabilization of Coprime Factors

In this section, we use the H1 control theory developed in the previous chapters tosolve the robust stabilization of a left coprime factor perturbed plant given by

P� = ( ~M + ~�M )�1( ~N + ~�N )

with ~M; ~N; ~�M ; ~�N 2 RH1 and h ~�N

~�M

i 1< �. The transfer matrices ( ~M; ~N)

are assumed to be a left coprime factorization of P (i.e., P = ~M�1 ~N), and K internallystabilizes the nominal system.

469

Page 487: Robust and Optimal Control

470 H1 LOOP SHAPING

f

ff�

yw

z2z1

r

6

?

� �

-

~�M

~M�1--

-

~N

~�N-

-K--

Figure 18.1: Left Coprime Factor Perturbed Systems

It has been shown in Chapter 9 that the system is robustly stable i� "K

I

#(I + PK)�1 ~M�1

1

� 1=�:

Finding a controller such that the above norm condition holds is an H1 norm min-imization problem which can be solved using H1 theory developed in the previouschapters.

Suppose P has a stabilizable and detectable state space realization given by

P =

"A B

C D

#

and let L be a matrix such that A + LC is stable then a left coprime factorization ofP = ~M�1 ~N is given by

h~N ~M

i=

"A+ LC B + LD L

C D I

#:

DenoteK = �K

then the system diagram can be put in an LFT form as in Figure 18.2 with the gener-alized plant

G(s) =

264"

0

~M�1

# "I

P

#

~M�1 P

375 =

266664

A �L B"0

C

# "0

I

# "I

D

#

C I D

377775

=:

264

A B1 B2

C1 D11 D12

C2 D21 D22

375 :

Page 488: Robust and Optimal Control

18.1. Robust Stabilization of Coprime Factors 471

i~M�1

K

� ~N� �?

-

uy

z1

z2

w

Figure 18.2: LFT Diagram for Coprime Factor Stabilization

To apply the H1 control formulae in Chapter 17, we need to normalize the \D12"matrix �rst. Note that"

I

D

#= U

"0

I

#(I +D�D)

12 ; where U =

"D�(I +DD�)�

12 I +D�D)�

12

�(I +DD�)�12 D(I +D�D)�

12

#

and U is a unitary matrix. Let

K = (I +D�D)�12 ~K"

z1

z2

#= U

"z1

z2

#:

Then kTzwk1 = kU�Tzwk1 = kTzwk1 and the problem becomes of �nding a controller~K so that kTzwk1 < with the following generalized plant

G =

"U� 0

0 I

#G

"I 0

0 (I +D�D)�12

#

=

266664

A �L B"�(I +DD�)�

12C

(I +D�D)�12D�C

# "�(I +DD�)�

12

(I +D�D)�12D�

# "0

I

#

C I D(I +D�D)�12

377775 :

Now the formulae in Chapter 17 can be applied to G to obtain a controller ~K and thenthe K can be obtained from K = �(I + D�D)�

12 ~K. We shall leave the detail to the

Page 489: Robust and Optimal Control

472 H1 LOOP SHAPING

reader. In the sequel, we shall consider the case D = 0. In this case, we have > 1 and

X1(A� LC

2 � 1) + (A� LC

2 � 1)�X1 �X1(BB� � LL�

2 � 1)X1 +

2C�C

2 � 1= 0

Y1(A+ LC)� + (A+ LC)Y1 � Y1C�CY1 = 0:

It is clear that Y1 = 0 is the stabilizing solution. Hence by the formulae in Chapter 17we have h

L11 L21

i=h0 L

iand

Z1 = I; D11 = 0; D12 = I; D21 =

p 2 � 1

I:

The results are summarized in the following theorem.

Theorem 18.1 Let D = 0 and let L be such that A+ LC is stable then there exists acontroller K such that

"K

I

#(I + PK)�1 ~M�1

1

<

i� > 1 and there exists a stabilizing solution X1 � 0 solving

X1(A� LC

2 � 1) + (A� LC

2 � 1)�X1 �X1(BB� � LL�

2 � 1)X1 +

2C�C

2 � 1= 0:

Moreover, if the above conditions hold a central controller is given by

K =

"A�BB�X1 + LC L

�B�X1 0

#:

It is clear that the existence of a robust stabilizing controller depends upon the choiceof the stabilizing matrix L, i.e., the choice of the coprime factorization. Now let Y � 0be the stabilizing solution to

AY + Y A� � Y C�CY +BB� = 0

and let L = �Y C�. Then the left coprime factorization ( ~M; ~N) given by

h~N ~M

i=

"A� Y C�C B �Y C�

C 0 I

#

is a normalized left coprime factorization (see Chapter 13).

Page 490: Robust and Optimal Control

18.1. Robust Stabilization of Coprime Factors 473

Corollary 18.2 Let D = 0 and L = �Y C� where Y � 0 is the stabilizing solution to

AY + Y A� � Y C�CY +BB� = 0:

Then P = ~M�1 ~N is a normalized left coprime factorization and

infK stabilizing

"K

I

#(I + PK)�1 ~M�1

1

=1p

1� �max(Y Q)

=

�1�

h ~N ~Mi 2

H

��1=2=: min

where Q is the solution to the following Lyapunov equation

Q(A� Y C�C) + (A� Y C�C)�Q+ C�C = 0:

Moreover, if the above conditions hold then for any > min a controller achieving "K

I

#(I + PK)�1 ~M�1

1

<

is given by

K(s) =

"A�BB�X1 � Y C�C �Y C�

�B�X1 0

#

where

X1 = 2

2 � 1Q

�I � 2

2 � 1Y Q

��1:

Proof. Note that the Hamiltonian matrix associated with X1 is given by

H1 =

"A+ 1

2�1Y C�C �BB� + 1

2�1Y C�CY

� 2

2�1C�C �(A+ 1

2�1Y C�C)�

#:

Straightforward calculation shows that

H1 =

"I � 2

2�1Y

0 2

2�1I

#Hq

"I � 2

2�1Y

0 2

2�1I

#�1

where

Hq =

"A� Y C�C 0

�C�C �(A� Y C�C)�

#:

It is clear that the stable invariant subspace of Hq is given by

X�(Hq) = Im

"I

Q

#

Page 491: Robust and Optimal Control

474 H1 LOOP SHAPING

and the stable invariant subspace of H1 is given by

X�(H1) =

"I � 2

2�1Y

0 2

2�1I

#X�(Hq) = Im

"I � 2

2�1Y Q 2

2�1Q

#:

Hence there is a nonnegative de�nite stabilizing solution to the algebraic Riccati equa-tion of X1 if and only if

I � 2

2 � 1Y Q > 0

or

>1p

1� �max(Y Q)

and the solution, if it exists, is given by

X1 = 2

2 � 1Q

�I � 2

2 � 1Y Q

��1:

Note that Y and Q are the controllability Gramian and the observability Gramian ofh~N ~M

irespectively. Therefore, we also have that the Hankel norm of

h~N ~M

iisp�max(Y Q). 2

Corollary 18.3 Let P = ~M�1 ~N be a normalized left coprime factorization and

P� = ( ~M + ~�M )�1( ~N + ~�N )

with h ~�N~�M

i 1< �:

Then there is a robustly stabilizing controller for P� if and only if

� �p1� �max(Y Q) =

r1�

h ~N ~Mi 2

H:

The solutions to the normalized left coprime factorization stabilization problem arealso solutions to a related H1 problem which is shown in the following lemma.

Lemma 18.4 Let P = ~M�1 ~N be a normalized left coprime factorization. Then "K

I

#(I + PK)�1 ~M�1

1

=

"K

I

#(I + PK)�1

hI P

i 1

:

Page 492: Robust and Optimal Control

18.1. Robust Stabilization of Coprime Factors 475

Proof. Since ( ~M; ~N) is a normalized left coprime factorization of P , we haveh~M ~N

i h~M ~N

i�= I

and h ~M ~Ni 1

= h ~M ~N

i� 1

= 1:

Using these equations, we have "K

I

#(I + PK)�1 ~M�1

1

=

"K

I

#(I + PK)�1 ~M�1

h~M ~N

i h~M ~N

i� 1

� "K

I

#(I + PK)�1 ~M�1

h~M ~N

i 1

h ~M ~Ni�

1

=

"K

I

#(I + PK)�1

hI P

i 1

� "K

I

#(I + PK)�1 ~M�1

1

h ~M ~Ni 1

=

"K

I

#(I + PK)�1 ~M�1

1

:

This implies "K

I

#(I + PK)�1 ~M�1

1

=

"K

I

#(I + PK)�1

hI P

i 1

:

2

Corollary 18.5 A controller solves the normalized left coprime factor robust stabiliza-tion problem if and only if it solves the following H1 control problem

"I

K

#(I + PK)�1

hI P

i 1

<

and

infK stabilizing

"

I

K

#(I + PK)�1

hI P

i 1

=1p

1� �max(Y Q)

=

�1�

h ~N ~Mi 2

H

��1=2:

Page 493: Robust and Optimal Control

476 H1 LOOP SHAPING

The solution Q can also be obtained in other ways. Let X � 0 be the stabilizingsolution to

XA+A�X �XBB�X + C�C = 0

then it is easy to verify that

Q = (I +XY )�1X:

Hence

min =1p

1� �max(Y Q)=

�1�

h ~N ~Mi 2

H

��1=2=p1 + �max(XY ):

Similar results can be obtained if one starts with a normalized right coprime factoriza-tion. In fact, a rather strong relation between the normalized left and right coprimefactorization problems can be established using the following matrix fact.

Lemma 18.6 Let M and N be any compatibly dimensioned complex matrices suchthat MM = M , NN = N , and M +N = I. Then �i(M) = �i(N) for all i such that0 < �i(M) 6= 1.

Proof. We �rst show that the eigenvalues ofM and N are either 0 or 1 andM and Nare diagonalizable. In fact, assume that � is an eigenvalue ofM and x is a correspondingeigenvector, then �x = Mx = MMx = M(Mx) = �Mx = �2x, i.e., �(1 � �)x = 0.This implies that either � = 0 or � = 1. To show that M is diagonalizable, assumeM = TJT�1 where J is a Jordan canonical form, it follows immediately that J mustbe diagonal by the condition M =MM . The proof for N is similar.

Next, assume that M is diagonalized by a nonsingular matrix T such that

M = T

"I 0

0 0

#T�1:

Then

N = I �M = T

"0 0

0 I

#T�1:

De�ne "A B

B� D

#:= T �T

and assume 0 < � 6= 1. Then A > 0 and

det(M�M � �I) = 0

, det(

"I 0

0 0

#T �T

"I 0

0 0

#� T �T ) = 0

Page 494: Robust and Optimal Control

18.1. Robust Stabilization of Coprime Factors 477

, det

"(1� �)A ��B��B� ��D

#= 0

, det(��D � �2

1� �B�A�1B) = 0

, det(1� �

�D +B�A�1B) = 0

, det

"��A ��B��B� (1� �)D

#= 0

, det(N�N � �I) = 0:

This implies that all nonzero eigenvalues of M�M and N�N that are not equal to 1 areequal, i.e., �i(M) = �i(N) for all i such that 0 < �i(M) 6= 1. 2

Using this matrix fact, we have the following corollary.

Corollary 18.7 Let K and P be any compatibly dimensioned complex matrices. Then "

I

K

#(I + PK)�1

hI P

i = "

I

P

#(I +KP )�1

hI K

i :

Proof. De�ne

M =

"I

K

#(I + PK)�1

hI P

i; N =

"�PI

#(I +KP )�1

h�K I

i:

Then it is easy to verify that MM = M , NN = N and M +N = I . By Lemma 18.6,we have kMk = kNk. The corollary follows by noting that

"I

P

#(I +KP )�1

hI K

i=

"0 I

�I 0

#N

"0 �II 0

#:

2

Corollary 18.8 Let P = ~M�1 ~N = NM�1 be respectively the normalized left and rightcoprime factorizations. Then

"K

I

#(I + PK)�1 ~M�1

1

= M�1(I +KP )�1

hI K

i 1:

Page 495: Robust and Optimal Control

478 H1 LOOP SHAPING

Proof. This follows from Corollary 18.7 and the fact that

M�1(I +KP )�1hI K

i 1

=

"

I

P

#(I +KP )�1

hI K

i 1

:

2

This corollary says that any H1 controller for the normalized left coprime factor-ization is also an H1 controller for the normalized right coprime factorization. Henceone can work with either factorization.

18.2 Loop Shaping Using Normalized Coprime Sta-

bilization

This section considers the H1 loop shaping design. The objective of this approach is toincorporate the simple performance/robustness tradeo� obtained in the loop shaping,with the guaranteed stability properties of H1 design methods. Recall from Section 5.5of Chapter 5 that good performance controller design requires that

��(I + PK)�1

�; �

�(I + PK)�1P

�; �

�(I +KP )�1

�; �

�K(I + PK)�1

�(18:1)

be made small, particularly in some low frequency range. And good robustness requiresthat

��PK(I + PK)�1

�; �

�KP (I +KP )�1

�(18:2)

be made small, particularly in some high frequency range. These requirements in turnimply that good controller design boils down to achieving the desired loop (and con-troller) gains in the appropriate frequency range:

�(PK)� 1; �(KP )� 1; �(K)� 1

in some low frequency range and

�(PK)� 1; �(KP )� 1; �(K) �M

in some high frequency range where M is not too large.The design procedure is stated below.

Loop Shaping Design Procedure

(1) Loop Shaping: Using a precompensator W1 and/or a postcompensator W2, thesingular values of the nominal plant are shaped to give a desired open-loop shape.The nominal plant G and the shaping functions W1;W2 are combined to form theshaped plant, Gs where Gs =W2GW1. We assume that W1 andW2 are such thatGs contains no hidden modes.

Page 496: Robust and Optimal Control

18.2. Loop Shaping Using Normalized Coprime Stabilization 479

e

eee

n

6

�?

yu

ddi

�r ? --P

? -K ---

Figure 18.3: Standard Feedback Con�guration

(2) Robust Stabilization: a) Calculate �max, where

�max =

inf

K stabilizing

"

I

K

#(I +GsK)�1 ~M�1

s

1

!�1

=

r1�

h ~Ns~Ms

i 2H< 1

and ~Ms; ~Ns de�ne the normalized coprime factors of Gs such that Gs = ~M�1s

~Ns

and~Ms

~M�s + ~Ns

~N�s = I:

If �max � 1 return to (1) and adjust W1 and W2.

b) Select � � �max, then synthesize a stabilizing controller K1, which satis�es "

I

K1

#(I +GsK1)�1 ~M�1

s

1

� ��1:

(3) The �nal feedback controller K is then constructed by combining the H1 con-troller K1 with the shaping functions W1 and W2 such that

K =W1K1W2:

A typical design works as follows: the designer inspects the open-loop singular valuesof the nominal plant, and shapes these by pre- and/or postcompensation until nominalperformance (and possibly robust stability) speci�cations are met. (Recall that theopen-loop shape is related to closed-loop objectives.) A feedback controller K1 withassociated stability margin (for the shaped plant) � � �max, is then synthesized. If �maxis small, then the speci�ed loop shape is incompatible with robust stability requirements,and should be adjusted accordingly, then K1 is reevaluated.

In the above design procedure we have speci�ed the desired loop shape by W2GW1.But, after Stage (2) of the design procedure, the actual loop shape achieved is in fact

Page 497: Robust and Optimal Control

480 H1 LOOP SHAPING

e

e

W1 G W2

W1 G W2K1

W2 K1 W1 G

- - - -

- - -6-

- - - -6

K

Gs

Figure 18.4: The Loop Shaping Design Procedure

given by W1K1W2G at plant input and GW1K1W2 at plant output. It is thereforepossible that the inclusion of K1 in the open-loop transfer function will cause dete-rioration in the open-loop shape speci�ed by Gs. In the next section, we will showthat the degradation in the loop shape caused by the H1 controller K1 is limited atfrequencies where the speci�ed loop shape is su�ciently large or su�ciently small. Inparticular, we show in the next section that � can be interpreted as an indicator of thesuccess of the loop shaping in addition to providing a robust stability guarantee for theclosed-loop systems. A small value of �max (�max � 1) in Stage (2) always indicatesincompatibility between the speci�ed loop shape, the nominal plant phase, and robustclosed-loop stability.

Remark 18.1 Note that, in contrast to the classical loop shaping approach, the loopshaping here is done without explicit regard for the nominal plant phase information.That is, closed-loop stability requirements are disregarded at this stage. Also, in con-trast with conventional H1 design, the robust stabilization is done without frequencyweighting. The design procedure described here is both simple and systematic, and onlyassumes knowledge of elementary loop shaping principles on the part of the designer.

~

Remark 18.2 The above robust stabilization objective can also be interpreted as themore standard H1 problem formulation of minimizing the H1 norm of the frequencyweighted gain from disturbances on the plant input and output to the controller input

Page 498: Robust and Optimal Control

18.3. Theoretical Justi�cation for H1 Loop Shaping 481

and output as follows. "

I

K1

#(I +GsK1)�1 ~M�1

s

1

=

"

I

K1

#(I +GsK1)�1

hI Gs

i 1

=

"

W2

W�11 K

#(I +GK)�1

hW�1

2 GW1

i 1

=

"

I

Gs

#(I +K1Gs)

�1hI K1

i 1

=

"W�1

1

W2G

#(I +KG)�1

hW1 GW�1

2

i 1

This shows how all the closed-loop objectives in (18.1) and (18.2) are incorporated. ~

18.3 Theoretical Justi�cation for H1 Loop Shaping

The objective of this section is to provide justi�cation for the use of parameter � as adesign indicator. We will show that � is a measure of both closed-loop robust stabilityand the success of the design in meeting the loop shaping speci�cations.

We �rst examine the possibility of loop shape deterioration at frequencies of highloop gain (typically low frequency). At low frequency (in particular, ! 2 (0; !l)), the de-terioration in loop shape at plant output can be obtained by comparing �(GW1K1W2)to �(Gs) = �(W2GW1). Note that

�(GK) = �(GW1K1W2) = �(W�12 W2GW1K1W2) � �(W2GW1)�(K1)=�(W2)

(18:3)where �(�) denotes condition number. Similarly, for loop shape deterioration at plantinput, we have

�(KG) = �(W1K1W2G) = �(W1K1W2GW1W�11 ) � �(W2GW1)�(K1)=�(W1):

(18:4)In each case, �(K1) is required to obtain a bound on the deterioration in the loop shapeat low frequency. Note that the condition numbers �(W1) and �(W2) are selected bythe designer.

Next, recalling that Gs denotes the shaped plant, and that K1 robustly stabilizesthe normalized coprime factorization of Gs with stability margin �, then we have

"I

K1

#(I +GsK1)�1 ~M�1

s

1� ��1 := (18:5)

where ( ~Ns; ~Ms) is a normalized left coprime factorization of Gs, and the parameter is de�ned to simplify the notation to follow. The following result shows that �(K1)

Page 499: Robust and Optimal Control

482 H1 LOOP SHAPING

is explicitly bounded by functions of � and �(Gs), the minimum singular value of theshaped plant, and hence by (18.3) and (18.4) K1 will only have a limited e�ect on thespeci�ed loop shape at low frequency.

Theorem 18.9 Any controller K1 satisfying (18.5), where Gs is assumed square, alsosatis�es

�(K1(j!)) � �(Gs(j!))�p 2 � 1p

2 � 1�(Gs(j!)) + 1

for all ! such that

�(Gs(j!)) >p 2 � 1:

Furthermore, if �(Gs) �p 2 � 1, then �(K1(j!)) ' 1=

p 2 � 1, where ' denotes

asymptotically greater than or equal to as �(Gs)!1.

Proof. First note that �(Gs) >p 2 � 1 implies that

I +GsG�s > 2I:

Further since ( ~Ns; ~Ms) is a normalized left coprime factorization of Gs, we have

~Ms~M�s = I � ~Ns

~N�s = I � ~MsGsG

�s~M�s :

Then~M�s~Ms = (I +GsG

�s)�1 < �2I:

Now "

I

K1

#(I +GsK1)�1 ~M�1

s

1�

can be rewritten as

(I +K�1K1) � 2(I �K�

1G�s)( ~M

�s~Ms)(I �GsK1): (18:6)

We will next show that K1 is invertible. Suppose that there exists an x such thatK1x = 0, then x� � (18:6)� x gives

�2x�x � x� ~M�s~Msx

which implies that x = 0 since ~M�s~Ms < �2I , and hence K1 is invertible. Equa-

tion (18.6) can now be written as

(K��1 K�1

1 + I) � 2(K��1 �G�s) ~M�

s~Ms(K

�11 �Gs): (18:7)

De�ne W such that

(WW �)�1 = I � 2 ~M�s~Ms = I � 2(I +GsG

�s)�1

Page 500: Robust and Optimal Control

18.3. Theoretical Justi�cation for H1 Loop Shaping 483

and completing the square in (18.7) with respect to K�11 yields

(K��1 +N�)(WW �)�1(K�1

1 +N) � ( 2 � 1)R�R

where

N = 2Gs((1� 2)I +G�sGs)�1

R�R = (I +G�sGs)((1� 2)I +G�sGs)�1:

Hence we have

R��(K��1 +N�)(WW �)�1(K�1

1 +N)R�1 � ( 2 � 1)I

and

��W�1(K�1

1 +N)R�1� �p 2 � 1:

Use ��W�1(K�1

1 +N)R�1� � �(W�1)�(K�1

1 +N)�(R�1) to get

�(K�11 +N) �

p 2 � 1�(W )�(R)

and use �(K�11 +N) � �(K1)� �(N) to get

�(K1) �n( 2 � 1)1=2�(W )�(R) + �(N)

o�1: (18:8)

Next, note that the eigenvalues of WW �; N�N , and R�R can be computed as follows

�(WW �) =1 + �(GsG

�s)

1� 2 + �(GsG�s)

�(N�N) = 4�(GsG

�s)

(1� 2 + �(GsG�s))2

�(R�R) =1 + �(GsG

�s)

1� 2 + �(GsG�s)

therefore

�(W ) =p�max(WW �) =

�1 + �min(GsG

�s)

1� 2 + �min(GsG�s)

�1=2=

�1 + �2(Gs)

1� 2 + �2(Gs)

�1=2

�(N) =p�max(N�N) =

2p�min(GsG�s)

1� 2 + �min(GsG�s)=

2�(Gs)

1� 2 + �2(Gs)

�(R) =p�max(R�R) =

�1 + �min(GsG

�s)

1� 2 + �min(GsG�s)

�1=2=

�1 + �2(Gs)

1� 2 + �2(Gs)

�1=2:

Page 501: Robust and Optimal Control

484 H1 LOOP SHAPING

Substituting these formulas into (18.8), we have

�(K1) ��( 2 � 1)1=2(1 + �2(Gs)) + 2�(Gs)

�2(Gs)� ( 2 � 1)

��1=�(Gs))�

p 2 � 1p

2 � 1�(Gs) + 1:

2

The main implication of Theorem 18.9 is that the bound on �(K1) depends onlyon the selected loop shape, and the stability margin of the shaped plant. The value of (= ��1) directly determines the frequency range over which this result is valid{a small (large �) is desirable, as we would expect. Further, Gs has a su�ciently large loopgain, then so also will GsK1 provided (= ��1) is su�ciently small.

In an analogous manner, we now examine the possibility of deterioration in the loopshape at high frequency due to the inclusion of K1. Note that at high frequency (inparticular, ! 2 (!h;1)) the deterioration in plant output loop shape can be obtainedby comparing �(GW1K1W2) to �(Gs) = �(W2GW1). Note that, analogously to (18.3)and (18.4) we have

�(GK) = �(GW1K1W2) � �(W2GW1)�(K1)�(W2):

Similarly, the corresponding deterioration in plant input loop shape is obtained bycomparing �(W1K1W2G) to �(W2GW1) where

�(KG) = �(W1K1W2G) � �(W2GW1)�(K1)�(W1):

Hence, in each case, �(K1) is required to obtain a bound on the deterioration in theloop shape at high frequency. In an identical manner to Theorem 18.9, we now showthat �(K1) is explicitly bounded by functions of , and �(Gs), the maximum singularvalue of the shaped plant.

Theorem 18.10 Any controller K1 satisfying (18.5) also satis�es

�(K1(j!)) �p 2 � 1 + �(Gs(j!))

1�p 2 � 1�(Gs(j!))

for all ! such that

�(Gs(j!)) <1p 2 � 1

:

Furthermore, if �(Gs) � 1=p 2 � 1, then �(K1(j!)) /

p 2 � 1, where / denotes

asymptotically less than or equal to as �(Gs)! 0.

Proof. The proof of Theorem 18.10 is similar to that of Theorem 18.9, and is onlysketched here: As in the proof of Theorem 18.9, we have ~M�

s~Ms = (I +GsG

�s)�1 and

(I +K�1K1) � 2(I �K�

1G�s)( ~M

�s~Ms)(I �GsK1): (18:9)

Page 502: Robust and Optimal Control

18.3. Theoretical Justi�cation for H1 Loop Shaping 485

Since �(Gs) <1p 2�1 ,

I � 2G�s(I +GsG�s)�1Gs > 0

and there exists a spectral factorization

V �V = I � 2G�s(I +GsG�s)�1Gs:

Now completing the square in (18.9) with respect to K1 yields

(K�1 +M�)V �V (K1 +M) � ( 2 � 1)Y �Y

where

M = 2G�s(I + (1� 2)GsG�s)�1

Y �Y = ( 2 � 1)(I +GsG�s)(I + (1� 2)GsG

�s)�1:

Hence we have��V (K1 +M)Y �1

� �p 2 � 1

which implies

�(K1) �p 2 � 1

�(V )�(Y �1)+ �(M): (18:10)

As in the proof of Theorem 18.9, it is easy to show that

�(V ) = �(Y �1) =�1� ( 2 � 1)�2(Gs)

1 + �2(Gs)

�1=2

�(M) = 2�(Gs)

1� ( 2 � 1)�2(Gs):

Substituting these formulas into (18.10), we have

�(K1) � ( 2 � 1)1=2(1 + �2(Gs)) + 2�(Gs)

1� ( 2 � 1)�2(Gs)=

p 2 � 1 + �(Gs)

1�p 2 � 1�(Gs)

:

2

The results in Theorem 18.9 and 18.10 con�rm that (alternatively �) indicates thecompatibility between the speci�ed loop shape and closed-loop stability requirements.

Theorem 18.11 Let G be the nominal plant and let K = W1K1W2 be the associatedcontroller obtained from loop shaping design procedure in the last section. Then if

"I

K1

#(I +GsK1)�1 ~M�1

s

1�

Page 503: Robust and Optimal Control

486 H1 LOOP SHAPING

we have

��K(I +GK)�1

� � �( ~Ms)�(W1)�(W2) (18.11)

��(I +GK)�1

� � minn �( ~Ms)�(W2); 1 + �(Ns)�(W2)

o(18.12)

��K(I +GK)�1G

� � minn �( ~Ns)�(W1); 1 + �(Ms)�(W1)

o(18.13)

��(I +GK)�1G

� � �( ~Ns)

�(W1)�(W2)(18.14)

��(I +KG)�1

� � minn1 + �( ~Ns)�(W1); �(Ms)�(W1)

o(18.15)

��G(I +KG)�1K

� � minn1 + �( ~Ms)�(W2); �(Ns)�(W2)

o(18.16)

where

�( ~Ns) = �(Ns) =

��2(W2GW1)

1 + �2(W2GW1)

�1=2(18:17)

�( ~Ms) = �(Ms) =

�1

1 + �2(W2GW1)

�1=2

(18:18)

and ( ~Ns; ~Ms), respectively, (Ns;Ms), is a normalized left coprime factorization, respec-tively, right coprime factorization, of Gs =W2GW1.

Proof. Note that~M�s~Ms = (I +GsG

�s)�1

and~Ms

~M�s = I � ~Ns

~N�s :

Then

�2( ~Ms) = �max( ~M�s~Ms) =

1

1 + �max(GsG�s)=

1

1 + �2(Gs)

�2( ~Ns) = 1� �2(Ms) =�2(Gs)

1 + �2(Gs):

The proof for the normalized right coprime factorization is similar. All other inequalitiesfollow from noting

"I

K1

#(I +GsK1)�1 ~M�1

s

1�

and "

I

K1

#(I +GsK1)�1 ~M�1

s

1

=

"

W2

W�11 K

#(I +GK)�1

hW�1

2 GW1

i 1

=

"W�1

1

W2G

#(I +KG)�1

hW1 GW�1

2

i 1

Page 504: Robust and Optimal Control

18.4. Notes and References 487

2

This theorem shows that all closed-loop objectives are guaranteed to have boundedmagnitude and the bounds depend only on , W1;W2, and G.

18.4 Notes and References

The H1 loop shaping using normalized coprime factorization was developed by Mc-Farlane and Glover [1990, 1992]. In the same references, some design examples werealso shown. The method has been applied to the design of scheduled controllers fora VSTOL aircraft in Hyde and Glover [1993]. The robust stabilization of normalizedcoprime factors is closely related to the robustness in the gap metric and graph topol-ogy, see El-Sakkary [1985], Georgiou and Smith [1990], Glover and McFarlane [1989],McFarlane, Glover, and Vidyasagar [1990], Qiu and Davison [1992a, 1992b] Vinnicombe[1993], Vidyasagar [1984, 1985], Zhu [1989], and references therein.

Page 505: Robust and Optimal Control

488 H1 LOOP SHAPING

Page 506: Robust and Optimal Control

19Controller Order Reduction

We have shown in the previous chapters that the H1 control theory and � synthesis canbe used to design robust performance controllers for highly complex uncertain systems.However, since a great many physical plants are modeled as high order dynamical sys-tems, the controllers designed with these methodologies typically have orders compara-ble to those of the plants. Simple linear controllers are normally preferred over complexlinear controllers in control system designs for some obvious reasons: they are easier tounderstand and computationally less demanding; they are also easier to implement andhave higher reliability since there are fewer things to go wrong in the hardware or bugsto �x in the software. Therefore, a lower order controller should be sought whenever theresulting performance degradation is kept within an acceptable magnitude. There areusually three ways in arriving at a lower order controller. A seemingly obvious approachis to design lower order controllers directly based on the high order models. However,this is still largely an open research problem. The Lagrange multiplier method devel-oped in the next chapter is potentially useful for some problems. Another approach is to�rst reduce the order of a high order plant, and then based on the reduced plant modela lower order controller may be designed accordingly. A potential problem associatedwith this approach is that such a lower order controller may not even stabilize the fullorder plant since the error information between the full order model and the reducedorder model is not considered in the design of the controller. On the other hand, onemay seek to design �rst a high order, high performance controller and subsequentlyproceed with a reduction of the designed controller. This approach is usually referredto as controller reduction. A crucial consideration in controller order reduction is totake into account the closed-loop so that the closed-loop stability is guaranteed and the

489

Page 507: Robust and Optimal Control

490 CONTROLLER ORDER REDUCTION

performance degradation is minimized with the reduced order controllers. The purposeof this chapter is to introduce several controller reduction methods that can guaranteethe closed-loop stability and possibly the closed-loop performance as well.

19.1 Controller Reduction with Stability Criteria

We consider a closed-loop system shown in Figure 19.1 where the n-th order generalizedplant G is given by

G =

"G11 G12

G21 G22

#=

264

A B1 B2

C1 D11 D12

C2 D21 D22

375

and G22 =

"A B2

C2 D22

#is a p � q transfer matrix. Suppose K is an m-th order

controller which stabilizes the closed-loop system. We are interested in investigatingcontroller reduction methods that can preserve the closed-loop stability and minimizethe performance degradation of the closed-loop systems with reduced order controllers.

G(s)

K(s)

z

y

w

u

� �

-

Figure 19.1: Closed-loop System Diagram

Let K be a reduced order controller and assume for the sake of argument that K hasthe same number of right half plane poles. Then it is obvious that the closed-loop stabil-

ity is guaranteed and the closed-loop performance degradation is limited if K � K

1

is su�ciently small. Hence a trivial controller reduction approach is to apply the modelreduction procedure to the full order controller K. Unfortunately, this approach hasonly limited applications. One simple reason is that a reduced order controller that sta-bilizes the closed-loop system and gives satisfactory performance does not necessarilymake the error ��(K�K)(j!) su�ciently small uniformly over all frequencies. Therefore,the approximation error only has to be made small over those critical frequency rangesthat a�ect the closed-loop stability and performance. Since stability is the most basicrequirement for a feedback system, we shall �rst derive controller reduction methodsthat guarantee this property.

Page 508: Robust and Optimal Control

19.1. Controller Reduction with Stability Criteria 491

19.1.1 Additive Reduction

The following lemma follows from small gain theorem.

Lemma 19.1 Let K be a stabilizing controller and K be a reduced order controller.Suppose K and K have the same number of right half plane poles and de�ne

� := K �K; Wa := (I �G22K)�1G22:

Then the closed-loop system with K is stable if either

kWa�k1 < 1 (19:1)

ork�Wak1 < 1: (19:2)

Proof. Since

I �G22K = I �G22K �G22� = (I �G22K)(I � (I �G22K)�1G22�)

by small gain theorem, the system is stable if kWa�k1 < 1. On the other hand,

I � KG22 = I �KG22 ��G22 = (I ��(I �G22K)�1G22)(I �KG22)

so the system is stable if k�Wak1 < 1. 2

Now suppose K has the following state space realization

K =

"Ak Bk

Ck Dk

#:

Then

Wa = S "

0 Ip

Iq K

#; G22

!= S

0B@264Ak 0 Bk

0 0 Ip

Ck Iq Dk

375 ;"A B2

C2 D22

#1CA

=

2664A+B2DkR

�1C2 B2~R�1Ck B2

~R�1

BkR�1C2 Ak +BkD22

~R�1Ck BkD22~R�1

R�1C2 R�1D22Ck D22~R�1

3775

where R := I �D22Dk and ~R := I �DkD22. Hence in general the order of Wa is equalto the sum of the orders of G and K.

In view of the above lemma, the controller K should be reduced in such a way so

that the weighted error Wa(K � K)

1or (K � K)Wa

1is small and K and K have

Page 509: Robust and Optimal Control

492 CONTROLLER ORDER REDUCTION

the same number of unstable poles. Suppose K is unstable, then in order to make surethat K and K have the same number of right half plane poles, K is usually separatedinto stable and unstable parts as

K = K+ +K�

where K+ is stable and a reduction is done on K+ to obtain a reduced K+, the �nalreduced order controller is given by K = K+ +K�.

We shall illustrate the above procedure through a simple example.

Example 19.1 Consider a system with

A =

264�1 0 4

0 2 0

0 0 �3

375 ; B1 = C�1 =

264

0 0

1 0

0 0

375 ; B2 = C�2 =

264

1

1

1

375

D11 = 0; D12 = D�21 =

"0

1

#; D22 = 0:

A controller minimizing kTzwk2 is given by

K = � 148:79(s+ 1)(s+ 3)

(s+ 31:74)(s+ 3:85)(s� 9:19)

with kTzwk2 = 55:09. Since K is unstable, we need to separate K into stable part andantistable part K = K+ +K� with

K+ = � 114:15(s+ 3:61)

(s+ 31:74)(s+ 3:85); K� =

�34:64s� 9:19

:

Next apply the frequency weighted balanced model reduction in Chapter 7 to Wa(K+ � K+) 1;

we have

K+ = � 117:085

s+ 34:526

and

K := K+ +K� = � 151:72(s+ 0:788)

(s+ 34:526)(s� 9:19):

The kTzwk2 with the reduced order controller K is 61:69. On the other hand, if K+

is reduced directly by balanced truncation without stability weighting Wa, then thereduced order controller does not stabilize the closed-loop system. The results aresummarized in Table 19.1 where both weighted and unweighted errors are listed.

Page 510: Robust and Optimal Control

19.1. Controller Reduction with Stability Criteria 493

Methods K � K

1

Wa(K � K) 1

kTzwk2BT 0.1622 2.5295 unstable

WBT 0.1461 0.471 61.69

Table 19.1: BT: Balance and Truncate, WBT: Weighted Balance and Truncate

It may seem somewhat strange that the unweighted error K � K

1

resulted from

weighted balanced reduction is actually smaller than that from unweighted balancedreduction. This happens because the balanced model reduction is not optimal in L1norm. We should also point out that the stability criterion

Wa(K � K) 1< 1 (or (K � K)Wa

1< 1) is only su�cient. Hence having

Wa(K � K) 1� 1 does not

necessarily imply that K is not a stabilizing controller. 3

19.1.2 Coprime Factor Reduction

It is clear that the additive controller reduction method in the last section is somewhatrestrictive, in particular, this method can not be used to reduce the controller orderif the controller is totally unstable, i.e., K has all poles in the right half plane. Thismotivates the following coprime factorization reduction approach.

Let G22 and K have the following left and right coprime factorizations, respectively

G22 = ~M�1 ~N = NM�1; K = ~V �1 ~U = UV �1

and de�neh~Nn

~Mn

i:= ( ~MV � ~NU)�1

h~N ~M

i= V �1(I �G22K)�1

hG22 I

iand "

Nn

Mn

#:=

"N

M

#( ~VM � ~UN)�1 =

"G22

I

#(I �KG22)

�1 ~V �1:

Note that ~Mn; ~Nn;Mn; Nn do not depend upon the speci�c coprime factorizations ofG22.

Lemma 19.2 Let $\hat U, \hat V \in \mathcal{RH}_\infty$ be the reduced right coprime factors of $U$ and $V$. Then $\hat K := \hat U\hat V^{-1}$ stabilizes the system if
$$\left\|\begin{bmatrix}-\tilde N_n & \tilde M_n\end{bmatrix}\left(\begin{bmatrix}\hat U\\ \hat V\end{bmatrix}-\begin{bmatrix}U\\ V\end{bmatrix}\right)\right\|_\infty < 1. \qquad (19.3)$$
Similarly, let $\hat{\tilde U}, \hat{\tilde V}\in\mathcal{RH}_\infty$ be the reduced left coprime factors of $\tilde U$ and $\tilde V$. Then $\hat K := \hat{\tilde V}^{-1}\hat{\tilde U}$ stabilizes the system if
$$\left\|\left(\begin{bmatrix}\hat{\tilde U} & \hat{\tilde V}\end{bmatrix}-\begin{bmatrix}\tilde U & \tilde V\end{bmatrix}\right)\begin{bmatrix}-N_n\\ M_n\end{bmatrix}\right\|_\infty < 1. \qquad (19.4)$$

Proof. We shall only show the result for the right coprime controller reduction; the case of the left coprime factorization is analogous. It is well known that $\hat K := \hat U\hat V^{-1}$ stabilizes the system if and only if $(\tilde M\hat V - \tilde N\hat U)^{-1}\in\mathcal{RH}_\infty$. Since
$$\tilde M\hat V - \tilde N\hat U = (\tilde MV - \tilde NU)\left[I - \begin{bmatrix}-\tilde N_n & \tilde M_n\end{bmatrix}\begin{bmatrix}U-\hat U\\ V-\hat V\end{bmatrix}\right],$$
the stability of the closed-loop system is guaranteed if
$$\left\|\begin{bmatrix}-\tilde N_n & \tilde M_n\end{bmatrix}\begin{bmatrix}U-\hat U\\ V-\hat V\end{bmatrix}\right\|_\infty < 1. \qquad\Box$$

Now suppose $\hat U$ and $\hat V$ have the following state space realizations:
$$\begin{bmatrix}\hat U\\ \hat V\end{bmatrix} = \left[\begin{array}{c|c} A & B\\ \hline C_1 & D_1\\ C_2 & D_2\end{array}\right] \qquad (19.5)$$
and suppose $D_2$ is nonsingular. Then the reduced order controller is given by
$$\hat K = \left[\begin{array}{c|c} A - BD_2^{-1}C_2 & BD_2^{-1}\\ \hline C_1 - D_1D_2^{-1}C_2 & D_1D_2^{-1}\end{array}\right]. \qquad (19.6)$$
Similarly, suppose $\hat{\tilde U}$ and $\hat{\tilde V}$ have the following state space realizations:
$$\begin{bmatrix}\hat{\tilde U} & \hat{\tilde V}\end{bmatrix} = \left[\begin{array}{c|cc} A & B_1 & B_2\\ \hline C & D_1 & D_2\end{array}\right] \qquad (19.7)$$
and suppose $D_2$ is nonsingular. Then the reduced order controller is given by
$$\hat K = \left[\begin{array}{c|c} A - B_2D_2^{-1}C & B_1 - B_2D_2^{-1}D_1\\ \hline D_2^{-1}C & D_2^{-1}D_1\end{array}\right]. \qquad (19.8)$$

It is clear from this lemma that the coprime factors of the controller should be reduced so that the weighted errors in (19.3) and (19.4) are small. Note that there is no restriction on the right half plane poles of the controller; in particular, $K$ and $\hat K$ may have different numbers of right half plane poles. It is also not hard to see that the additive reduction method in the last subsection may be regarded as a special case of the coprime factor reduction when the controller $K$ is stable, by taking $V = I$ or $\tilde V = I$.

Let $L$, $F$, $L_k$ and $F_k$ be any matrices such that $A+LC_2$, $A+B_2F$, $A_k+L_kC_k$ and $A_k+B_kF_k$ are stable. Define
$$\begin{bmatrix}\tilde N & \tilde M\end{bmatrix} = \left[\begin{array}{c|cc} A+LC_2 & B_2+LD_{22} & L\\ \hline C_2 & D_{22} & I\end{array}\right], \qquad \begin{bmatrix}N\\ M\end{bmatrix} = \left[\begin{array}{c|c} A+B_2F & B_2\\ \hline C_2+D_{22}F & D_{22}\\ F & I\end{array}\right],$$
$$\begin{bmatrix}U\\ V\end{bmatrix} = \left[\begin{array}{c|c} A_k+B_kF_k & B_k\\ \hline C_k+D_kF_k & D_k\\ F_k & I\end{array}\right], \qquad \begin{bmatrix}\tilde U & \tilde V\end{bmatrix} = \left[\begin{array}{c|cc} A_k+L_kC_k & B_k+L_kD_k & L_k\\ \hline C_k & D_k & I\end{array}\right].$$
Then $K = UV^{-1} = \tilde V^{-1}\tilde U$ and $G_{22} = NM^{-1} = \tilde M^{-1}\tilde N$ are right and left coprime factorizations over $\mathcal{RH}_\infty$, respectively.

Moreover,
$$\begin{bmatrix}-\tilde N_n & \tilde M_n\end{bmatrix} = \left[\begin{array}{cc|cc} A+B_2D_kR^{-1}C_2 & -B_2\tilde R^{-1}C_k & B_2\tilde R^{-1} & -B_2D_kR^{-1}\\ -B_kR^{-1}C_2 & A_k+B_kR^{-1}D_{22}C_k & -B_kR^{-1}D_{22} & B_kR^{-1}\\ \hline -R^{-1}C_2 & R^{-1}D_{22}C_k - F_k & -R^{-1}D_{22} & R^{-1}\end{array}\right]$$
$$\begin{bmatrix}-N_n\\ M_n\end{bmatrix} = \left[\begin{array}{cc|c} A_k+B_kD_{22}\tilde R^{-1}C_k & -B_kR^{-1}C_2 & -L_k+B_kD_{22}\tilde R^{-1}\\ -B_2\tilde R^{-1}C_k & A+B_2\tilde R^{-1}D_kC_2 & -B_2\tilde R^{-1}\\ \hline -D_{22}\tilde R^{-1}C_k & R^{-1}C_2 & -D_{22}\tilde R^{-1}\\ \tilde R^{-1}C_k & -\tilde R^{-1}D_kC_2 & \tilde R^{-1}\end{array}\right]$$
where $R := I - D_{22}D_k$ and $\tilde R := I - D_kD_{22}$. Note that the orders of the weighting matrices $\begin{bmatrix}-\tilde N_n & \tilde M_n\end{bmatrix}$ and $\begin{bmatrix}-N_n\\ M_n\end{bmatrix}$ are in general equal to the sum of the orders of $G$ and $K$. However, if $K$ is an observer-based controller, i.e.,
$$K = \left[\begin{array}{c|c} A+B_2F+LC_2+LD_{22}F & -L\\ \hline F & 0\end{array}\right],$$
then letting $F_k = -(C_2+D_{22}F)$ and $L_k = -(B_2+LD_{22})$, we get
$$\begin{bmatrix}U\\ V\end{bmatrix} = \left[\begin{array}{c|c} A+B_2F & -L\\ \hline F & 0\\ C_2+D_{22}F & I\end{array}\right], \qquad \begin{bmatrix}\tilde U & \tilde V\end{bmatrix} = \left[\begin{array}{c|cc} A+LC_2 & -L & -(B_2+LD_{22})\\ \hline F & 0 & I\end{array}\right],$$
$$\tilde MV - \tilde NU = I, \qquad \tilde VM - \tilde UN = I.$$


Therefore, we can choose
$$\begin{bmatrix}-\tilde N_n & \tilde M_n\end{bmatrix} = \begin{bmatrix}-\tilde N & \tilde M\end{bmatrix} \quad\text{and}\quad \begin{bmatrix}-N_n\\ M_n\end{bmatrix} = \begin{bmatrix}-N\\ M\end{bmatrix},$$
which have the same order as that of the plant $G$. We shall illustrate the above procedure through a simple example.

Example 19.2 Consider a system with
$$A = \begin{bmatrix}-1 & 0 & 4\\ 0 & -2 & 0\\ 0 & 0 & -3\end{bmatrix}, \quad B_1 = C_1^* = \begin{bmatrix}0 & 0\\ 1 & 0\\ 0 & 0\end{bmatrix}, \quad B_2 = C_2^* = \begin{bmatrix}1\\ 1\\ 1\end{bmatrix},$$
$$D_{11} = 0, \quad D_{12} = D_{21}^* = \begin{bmatrix}0\\ 1\end{bmatrix}, \quad D_{22} = 0.$$

A controller minimizing $\|T_{zw}\|_2$ is given by
$$K = \left[\begin{array}{ccc|c} -1 & -8.198 & 4 & 0\\ -8.198 & -18.396 & -8.198 & 8.198\\ 0 & -8.198 & -3 & 0\\ \hline 0 & -8.198 & 0 & 0 \end{array}\right] = \frac{-67.2078(s+1)(s+3)}{(s+23.969)(s+3.7685)(s-5.3414)}$$
with $\|T_{zw}\|_2 = 37.02$. Since the controller is observer-based, a natural coprime factorization of $K$ is given by
$$\begin{bmatrix}U\\ V\end{bmatrix} = \left[\begin{array}{ccc|c} -1 & -8.198 & 4 & 0\\ 0 & -10.198 & 0 & 8.198\\ 0 & -8.198 & -3 & 0\\ \hline 0 & -8.198 & 0 & 0\\ 1 & 1 & 1 & 1 \end{array}\right].$$
Furthermore, we have
$$\begin{bmatrix}-\tilde N & \tilde M\end{bmatrix} = \left[\begin{array}{ccc|cc} -1 & 0 & 4 & -1 & 0\\ -8.198 & -10.198 & -8.198 & -1 & -8.198\\ 0 & 0 & -3 & -1 & 0\\ \hline 1 & 1 & 1 & 0 & 1 \end{array}\right].$$
Applying the frequency weighted balanced model reduction in Chapter 7 to the weighted coprime factors, we obtain a first order approximation
$$\begin{bmatrix}\hat U\\ \hat V\end{bmatrix} = \left[\begin{array}{c|c} -0.1814 & 1.0202\\ \hline 1.2244 & 0\\ 6.504 & 1 \end{array}\right]$$

which gives
$$\left\|\begin{bmatrix}-\tilde N_n & \tilde M_n\end{bmatrix}\left(\begin{bmatrix}\hat U\\ \hat V\end{bmatrix}-\begin{bmatrix}U\\ V\end{bmatrix}\right)\right\|_\infty = 2.0928 > 1, \qquad \hat K = \frac{1.2491}{s+6.8165}, \qquad \|T_{zw}\|_2 = 52.29,$$
and a second order approximation
$$\begin{bmatrix}\hat U\\ \hat V\end{bmatrix} = \left[\begin{array}{cc|c} -0.1814 & 14.5511 & 1.0202\\ -0.5288 & -4.1434 & -1.2642\\ \hline 1.2244 & 29.5182 & 0\\ 6.504 & -1.3084 & 1 \end{array}\right]$$
which gives
$$\left\|\begin{bmatrix}-\tilde N_n & \tilde M_n\end{bmatrix}\left(\begin{bmatrix}\hat U\\ \hat V\end{bmatrix}-\begin{bmatrix}U\\ V\end{bmatrix}\right)\right\|_\infty = 0.024 < 1, \qquad \hat K = \frac{-36.069(s+1.1102)}{(s+17.3741)(s-4.7601)}, \qquad \|T_{zw}\|_2 = 39.14.$$

Note that the first order reduced controller does not have the same number of right half plane poles as the full order controller. Moreover, the sufficient stability condition is not satisfied; nevertheless, the controller is stabilizing. It is also interesting to note that the unstable pole of the second order reduced controller is not at the same location as that of the full order controller. ♦
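The observer-based coprime factorization above can be checked numerically. The sketch below (plain NumPy, using the Example 19.2 data and the gains $F$, $L$ read off the printed controller realization — an interpretation, not a formula from the text) evaluates the factors at one frequency and verifies the Bezout identity $\tilde MV - \tilde NU = I$:

```python
# Numerical check of M̃V − ÑU = I for the observer-based coprime
# factorization in Example 19.2 (SISO data; F, L inferred from the
# printed controller realization K = [A+B2F+LC2 | −L; F | 0]).
import numpy as np

A  = np.array([[-1.0, 0.0, 4.0], [0.0, -2.0, 0.0], [0.0, 0.0, -3.0]])
B2 = np.array([[1.0], [1.0], [1.0]])
C2 = np.array([[1.0, 1.0, 1.0]])
F  = np.array([[0.0, -8.198, 0.0]])       # state feedback gain
L  = np.array([[0.0], [-8.198], [0.0]])   # observer gain

def tf(Ass, Bss, Css, Dss, s):
    """Evaluate C (sI − A)^{-1} B + D at a complex frequency s."""
    n = Ass.shape[0]
    return Css @ np.linalg.solve(s * np.eye(n) - Ass, Bss) + Dss

s = 1.0j
# Right factors of K: U = F Φ_F (−L), V = I + C2 Φ_F (−L), Φ_F = (sI−A−B2F)^{-1}
U = tf(A + B2 @ F, -L, F, np.zeros((1, 1)), s)
V = tf(A + B2 @ F, -L, C2, np.eye(1), s)
# Left factors of G22: Ñ = C2 Φ_L B2, M̃ = I + C2 Φ_L L, Φ_L = (sI−A−LC2)^{-1}
Nt = tf(A + L @ C2, B2, C2, np.zeros((1, 1)), s)
Mt = tf(A + L @ C2, L, C2, np.eye(1), s)

bezout = Mt @ V - Nt @ U   # should equal the identity (a scalar 1 here)
```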

19.2 H∞ Controller Reductions

In this section, we consider the H∞ performance preserving controller order reduction problem. Again we consider the feedback system shown in Figure 19.1 with a generalized plant realization given by
$$G(s) = \left[\begin{array}{c|cc} A & B_1 & B_2\\ \hline C_1 & D_{11} & D_{12}\\ C_2 & D_{21} & D_{22}\end{array}\right].$$

The following assumptions are made:

(A1) $(A, B_2)$ is stabilizable and $(C_2, A)$ is detectable;

(A2) $D_{12}$ has full column rank and $D_{21}$ has full row rank;

(A3) $\begin{bmatrix}A-j\omega I & B_2\\ C_1 & D_{12}\end{bmatrix}$ has full column rank for all $\omega$;

(A4) $\begin{bmatrix}A-j\omega I & B_1\\ C_2 & D_{21}\end{bmatrix}$ has full row rank for all $\omega$.

It is shown in Chapter 17 that all stabilizing controllers satisfying $\|T_{zw}\|_\infty < \gamma$ can be parameterized as
$$K = F_\ell(M_\infty, Q), \qquad Q\in\mathcal{RH}_\infty, \qquad \|Q\|_\infty < \gamma, \qquad (19.9)$$
where $M_\infty$ is of the form
$$M_\infty = \begin{bmatrix}M_{11}(s) & M_{12}(s)\\ M_{21}(s) & M_{22}(s)\end{bmatrix} = \left[\begin{array}{c|cc} \hat A & \hat B_1 & \hat B_2\\ \hline \hat C_1 & \hat D_{11} & \hat D_{12}\\ \hat C_2 & \hat D_{21} & \hat D_{22}\end{array}\right]$$
such that $\hat D_{12}$ and $\hat D_{21}$ are invertible and $\hat A - \hat B_2\hat D_{12}^{-1}\hat C_1$ and $\hat A - \hat B_1\hat D_{21}^{-1}\hat C_2$ are both stable, i.e., $M_{12}^{-1}$ and $M_{21}^{-1}$ are both stable.

The problem to be considered here is to find a controller $\hat K$ of the smallest possible order such that the H∞ performance requirement $\|F_\ell(G,\hat K)\|_\infty < \gamma$ is satisfied. This is clearly equivalent to finding a $Q$ that satisfies the above constraint and minimizes the order of $\hat K$. Instead of choosing $Q$ directly, we shall approach this problem from a different perspective. The following lemma is useful in the subsequent development and can be regarded as a special case of Theorem 11.7 (the main loop theorem).

Lemma 19.3 Consider the feedback system shown below.

[Block diagram: $Q$ in feedback around the lower loop of $N$, with external input $w$, output $z$, and internal signals $y$, $u$.]

Here $N$ is a suitably partitioned transfer matrix
$$N(s) = \begin{bmatrix}N_{11} & N_{12}\\ N_{21} & N_{22}\end{bmatrix}.$$
Then the closed-loop transfer matrix from $w$ to $z$ is given by
$$T_{zw} = F_\ell(N, Q) = N_{11} + N_{12}Q(I-N_{22}Q)^{-1}N_{21}.$$
Assume that the feedback loop is well-posed, i.e., $\det(I-N_{22}(\infty)Q(\infty))\neq 0$, that either $N_{21}(j\omega)$ has full row rank for all $\omega\in\mathbb{R}\cup\{\infty\}$ or $N_{12}(j\omega)$ has full column rank for all $\omega\in\mathbb{R}\cup\{\infty\}$, and that $\|N\|_\infty \le 1$. Then $\|F_\ell(N,Q)\|_\infty < 1$ if $\|Q\|_\infty < 1$.


Proof. We shall assume $N_{21}$ has full row rank; the case when $N_{12}$ has full column rank can be shown in the same way.

To show that $\|T_{zw}\|_\infty < 1$, consider the closed-loop system at any frequency $s = j\omega$ with the signals fixed as complex constant vectors. Let $\|Q\|_\infty =: \delta < 1$ and note that $T_{wy} = N_{21}^{+}(I-N_{22}Q)$, where $N_{21}^{+}$ is a right inverse of $N_{21}$. Also let $\kappa := \|T_{wy}\|_\infty$, so that $\|w\|_2 \le \kappa\|y\|_2$. Since $u = Qy$ gives $\|u\|_2 \le \delta\|y\|_2$, and $\|N\|_\infty \le 1$ implies that $\|z\|_2^2 + \|y\|_2^2 \le \|w\|_2^2 + \|u\|_2^2$, we have
$$\|z\|_2^2 \le \|w\|_2^2 + (\delta^2-1)\|y\|_2^2 \le \left[1-(1-\delta^2)\kappa^{-2}\right]\|w\|_2^2,$$
which implies $\|T_{zw}\|_\infty < 1$. $\Box$
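A constant-matrix (single-frequency) instance of the lemma can be checked numerically: take a random partitioned matrix $N$ scaled to be a contraction, with $N_{21}$ square and (generically) full rank, and any $Q$ with $\bar\sigma(Q) < 1$. The data below are hypothetical:

```python
# Constant-matrix check of Lemma 19.3: if σ̄(N) ≤ 1, N21 full row rank,
# and σ̄(Q) < 1, then σ̄(F_l(N, Q)) < 1. Hypothetical random data.
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
N = M / (np.linalg.svd(M, compute_uv=False)[0] + 1e-9)   # σ̄(N) ≤ 1
N11, N12 = N[:2, :2], N[:2, 2:]
N21, N22 = N[2:, :2], N[2:, 2:]

Q = rng.standard_normal((2, 2))
Q = 0.9 * Q / np.linalg.svd(Q, compute_uv=False)[0]      # σ̄(Q) = 0.9 < 1

# F_l(N, Q) = N11 + N12 Q (I − N22 Q)^{-1} N21
F = N11 + N12 @ Q @ np.linalg.inv(np.eye(2) - N22 @ Q) @ N21
gain = np.linalg.svd(F, compute_uv=False)[0]
```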

19.2.1 Additive Reduction

Consider the class of (reduced order) controllers that can be represented in the form
$$\hat K = K_0 + W_2\Delta W_1,$$
where $K_0$ may be interpreted as a nominal, higher order controller and $\Delta$ is a stable perturbation, with stable, minimum phase, and invertible weighting functions $W_1$ and $W_2$. Suppose that $\|F_\ell(G,K_0)\|_\infty < \gamma$. A natural question is whether it is possible to obtain a reduced order controller $\hat K$ in this class such that the H∞ performance bound remains valid when $\hat K$ is in place of $K_0$. Note that this is somewhat a special case of the above general problem; the specific form of $\hat K$ requires that $\hat K$ and $K_0$ have the same right half plane poles, thus to a certain degree limiting the set of attainable reduced order controllers.

Suppose $\hat K$ is a suboptimal H∞ controller, i.e., there is a $Q\in\mathcal{RH}_\infty$ with $\|Q\|_\infty < \gamma$ such that $\hat K = F_\ell(M_\infty, Q)$. It follows from simple algebra that
$$Q = F_\ell(\bar K_a^{-1}, \hat K)$$
where
$$\bar K_a^{-1} := \begin{bmatrix}0 & I\\ I & 0\end{bmatrix}M_\infty^{-1}\begin{bmatrix}0 & I\\ I & 0\end{bmatrix}.$$
Furthermore, it follows from straightforward manipulations that
$$\|Q\|_\infty < \gamma \iff \left\|F_\ell(\bar K_a^{-1}, \hat K)\right\|_\infty < \gamma \iff \left\|F_\ell(\bar K_a^{-1}, K_0 + W_2\Delta W_1)\right\|_\infty < \gamma \iff \left\|F_\ell(\tilde R, \Delta)\right\|_\infty < 1$$
where
$$\tilde R = \begin{bmatrix}\gamma^{-1/2}I & 0\\ 0 & W_1\end{bmatrix}\begin{bmatrix}R_{11} & R_{12}\\ R_{21} & R_{22}\end{bmatrix}\begin{bmatrix}\gamma^{-1/2}I & 0\\ 0 & W_2\end{bmatrix}$$
and $R$ is given by the star product
$$\begin{bmatrix}R_{11} & R_{12}\\ R_{21} & R_{22}\end{bmatrix} = S\left(\bar K_a^{-1}, \begin{bmatrix}K_0 & I\\ I & 0\end{bmatrix}\right).$$

It is easy to see that $\tilde R_{12}$ and $\tilde R_{21}$ are both minimum phase and invertible, and hence have full column and full row rank, respectively, for all $\omega\in\mathbb{R}\cup\{\infty\}$. Consequently, by invoking Lemma 19.3, we conclude that if $\tilde R$ is a contraction and $\|\Delta\|_\infty < 1$, then $\|F_\ell(\tilde R,\Delta)\|_\infty < 1$. This guarantees the existence of a $Q$ such that $\|Q\|_\infty < \gamma$, or equivalently, the existence of a $\hat K$ such that $\|F_\ell(G,\hat K)\|_\infty < \gamma$. This observation leads to the following theorem.

Theorem 19.4 Suppose $W_1$ and $W_2$ are stable, minimum phase and invertible transfer matrices such that $\tilde R$ is a contraction. Let $K_0$ be a stabilizing controller such that $\|F_\ell(G,K_0)\|_\infty < \gamma$. Then $\hat K$ is also a stabilizing controller such that $\|F_\ell(G,\hat K)\|_\infty < \gamma$ if
$$\|\Delta\|_\infty = \left\|W_2^{-1}(\hat K - K_0)W_1^{-1}\right\|_\infty < 1.$$

Since $\tilde R$ can always be made contractive for sufficiently small $W_1$ and $W_2$, there are infinitely many $W_1$ and $W_2$ that satisfy the conditions of the theorem. It is obvious that to make $\left\|W_2^{-1}(\hat K - K_0)W_1^{-1}\right\|_\infty < 1$ for some low order $\hat K$, one would like to select the "largest" $W_1$ and $W_2$.

Lemma 19.5 Assume $\|R_{22}\|_\infty < \gamma$ and define
$$L = \begin{bmatrix}L_1 & L_2\\ L_2^* & L_3\end{bmatrix} = F_\ell\left(\begin{bmatrix}0 & -R_{11} & 0 & R_{12}\\ -R_{11}^* & 0 & R_{21}^* & 0\\ 0 & R_{21} & 0 & -R_{22}\\ R_{12}^* & 0 & -R_{22}^* & 0\end{bmatrix}, \gamma^{-1}I\right).$$
Then $\tilde R$ is a contraction if $W_1$ and $W_2$ satisfy
$$\begin{bmatrix}(W_1^*W_1)^{-1} & 0\\ 0 & (W_2W_2^*)^{-1}\end{bmatrix} \ge \begin{bmatrix}L_1 & L_2\\ L_2^* & L_3\end{bmatrix}.$$

Proof. See Goddard and Glover [1993]. $\Box$

An algorithm that maximizes $\det(W_1^*W_1)\det(W_2W_2^*)$ has been developed by Goddard and Glover [1993]. The procedure below, devised directly from the above theorem, can be used to generate a reduced order controller that preserves the closed-loop H∞ performance bound $\|F_\ell(G,\hat K)\|_\infty < \gamma$.

1. Let $K_0$ be a full order controller such that $\|F_\ell(G,K_0)\|_\infty < \gamma$;

2. Compute $W_1$ and $W_2$ so that $\tilde R$ is a contraction;

3. Use a model reduction method to find a $\hat K$ so that $\left\|W_2^{-1}(\hat K - K_0)W_1^{-1}\right\|_\infty < 1$.

19.2.2 Coprime Factor Reduction

The H∞ controller reduction problem can also be considered in the coprime factor framework. For that purpose, we need the following alternative representation of all admissible H∞ controllers.

Lemma 19.6 The family of all admissible controllers such that $\|T_{zw}\|_\infty < \gamma$ can also be written as
$$K(s) = F_\ell(M_\infty, Q) = (\Theta_{11}Q+\Theta_{12})(\Theta_{21}Q+\Theta_{22})^{-1} := UV^{-1} = (Q\tilde\Theta_{12}+\tilde\Theta_{22})^{-1}(Q\tilde\Theta_{11}+\tilde\Theta_{21}) := \tilde V^{-1}\tilde U$$
where $Q\in\mathcal{RH}_\infty$, $\|Q\|_\infty < \gamma$, and $UV^{-1}$ and $\tilde V^{-1}\tilde U$ are respectively right and left coprime factorizations over $\mathcal{RH}_\infty$, and
$$\Theta = \begin{bmatrix}\Theta_{11} & \Theta_{12}\\ \Theta_{21} & \Theta_{22}\end{bmatrix} = \left[\begin{array}{c|cc} \hat A-\hat B_1\hat D_{21}^{-1}\hat C_2 & \hat B_2-\hat B_1\hat D_{21}^{-1}\hat D_{22} & \hat B_1\hat D_{21}^{-1}\\ \hline \hat C_1-\hat D_{11}\hat D_{21}^{-1}\hat C_2 & \hat D_{12}-\hat D_{11}\hat D_{21}^{-1}\hat D_{22} & \hat D_{11}\hat D_{21}^{-1}\\ -\hat D_{21}^{-1}\hat C_2 & -\hat D_{21}^{-1}\hat D_{22} & \hat D_{21}^{-1}\end{array}\right]$$
$$\tilde\Theta = \begin{bmatrix}\tilde\Theta_{11} & \tilde\Theta_{12}\\ \tilde\Theta_{21} & \tilde\Theta_{22}\end{bmatrix} = \left[\begin{array}{c|cc} \hat A-\hat B_2\hat D_{12}^{-1}\hat C_1 & \hat B_1-\hat B_2\hat D_{12}^{-1}\hat D_{11} & -\hat B_2\hat D_{12}^{-1}\\ \hline \hat C_2-\hat D_{22}\hat D_{12}^{-1}\hat C_1 & \hat D_{21}-\hat D_{22}\hat D_{12}^{-1}\hat D_{11} & -\hat D_{22}\hat D_{12}^{-1}\\ \hat D_{12}^{-1}\hat C_1 & \hat D_{12}^{-1}\hat D_{11} & \hat D_{12}^{-1}\end{array}\right]$$
$$\Theta^{-1} = \left[\begin{array}{c|cc} \hat A-\hat B_2\hat D_{12}^{-1}\hat C_1 & \hat B_2\hat D_{12}^{-1} & \hat B_1-\hat B_2\hat D_{12}^{-1}\hat D_{11}\\ \hline -\hat D_{12}^{-1}\hat C_1 & \hat D_{12}^{-1} & -\hat D_{12}^{-1}\hat D_{11}\\ \hat C_2-\hat D_{22}\hat D_{12}^{-1}\hat C_1 & \hat D_{22}\hat D_{12}^{-1} & \hat D_{21}-\hat D_{22}\hat D_{12}^{-1}\hat D_{11}\end{array}\right]$$
$$\tilde\Theta^{-1} = \left[\begin{array}{c|cc} \hat A-\hat B_1\hat D_{21}^{-1}\hat C_2 & -\hat B_1\hat D_{21}^{-1} & \hat B_2-\hat B_1\hat D_{21}^{-1}\hat D_{22}\\ \hline \hat D_{21}^{-1}\hat C_2 & \hat D_{21}^{-1} & \hat D_{21}^{-1}\hat D_{22}\\ \hat C_1-\hat D_{11}\hat D_{21}^{-1}\hat C_2 & -\hat D_{11}\hat D_{21}^{-1} & \hat D_{12}-\hat D_{11}\hat D_{21}^{-1}\hat D_{22}\end{array}\right].$$

Proof. Note that all admissible H∞ controllers are given by
$$K(s) = F_\ell(M_\infty, Q)$$
where $M_\infty = \begin{bmatrix}M_{11} & M_{12}\\ M_{21} & M_{22}\end{bmatrix}$ is given in the last subsection. Next note that from Lemma 10.2, we have
$$\Theta = \begin{bmatrix}\Theta_{11} & \Theta_{12}\\ \Theta_{21} & \Theta_{22}\end{bmatrix} = \begin{bmatrix}M_{12}-M_{11}M_{21}^{-1}M_{22} & M_{11}M_{21}^{-1}\\ -M_{21}^{-1}M_{22} & M_{21}^{-1}\end{bmatrix}$$
$$\tilde\Theta = \begin{bmatrix}\tilde\Theta_{11} & \tilde\Theta_{12}\\ \tilde\Theta_{21} & \tilde\Theta_{22}\end{bmatrix} = \begin{bmatrix}M_{21}-M_{22}M_{12}^{-1}M_{11} & -M_{22}M_{12}^{-1}\\ M_{12}^{-1}M_{11} & M_{12}^{-1}\end{bmatrix}$$
$$\Theta^{-1} = \begin{bmatrix}M_{12}^{-1} & -M_{12}^{-1}M_{11}\\ M_{22}M_{12}^{-1} & M_{21}-M_{22}M_{12}^{-1}M_{11}\end{bmatrix}$$
$$\tilde\Theta^{-1} = \begin{bmatrix}M_{21}^{-1} & M_{21}^{-1}M_{22}\\ -M_{11}M_{21}^{-1} & M_{12}-M_{11}M_{21}^{-1}M_{22}\end{bmatrix}.$$
Then the results follow from some algebra. $\Box$

Theorem 19.7 Let $K_0 = \Theta_{12}\Theta_{22}^{-1}$ be the central H∞ controller such that $\|F_\ell(G,K_0)\|_\infty < \gamma$, and let $\hat U, \hat V\in\mathcal{RH}_\infty$ be such that
$$\left\|\begin{bmatrix}\gamma^{-1}I & 0\\ 0 & I\end{bmatrix}\Theta^{-1}\left(\begin{bmatrix}\Theta_{12}\\ \Theta_{22}\end{bmatrix}-\begin{bmatrix}\hat U\\ \hat V\end{bmatrix}\right)\right\|_\infty < 1/\sqrt{2}. \qquad (19.10)$$
Then $\hat K = \hat U\hat V^{-1}$ is also a stabilizing controller such that $\|F_\ell(G,\hat K)\|_\infty < \gamma$.

Proof. Note that by Lemma 19.6, $K$ is an admissible controller such that $\|T_{zw}\|_\infty < \gamma$ if and only if there exists a $Q\in\mathcal{RH}_\infty$ with $\|Q\|_\infty < \gamma$ such that
$$\begin{bmatrix}U\\ V\end{bmatrix} := \begin{bmatrix}\Theta_{11}Q+\Theta_{12}\\ \Theta_{21}Q+\Theta_{22}\end{bmatrix} = \Theta\begin{bmatrix}Q\\ I\end{bmatrix} \qquad (19.11)$$
and
$$K = UV^{-1}.$$
Hence, to show that $\hat K = \hat U\hat V^{-1}$ with $\hat U$ and $\hat V$ satisfying equation (19.10) is also a stabilizing controller such that $\|F_\ell(G,\hat K)\|_\infty < \gamma$, we need to show that there is another coprime factorization $\hat K = UV^{-1}$ and a $Q\in\mathcal{RH}_\infty$ with $\|Q\|_\infty < \gamma$ such that equation (19.11) is satisfied.

Define
$$\Delta := \begin{bmatrix}\gamma^{-1}I & 0\\ 0 & I\end{bmatrix}\Theta^{-1}\left(\begin{bmatrix}\Theta_{12}\\ \Theta_{22}\end{bmatrix}-\begin{bmatrix}\hat U\\ \hat V\end{bmatrix}\right)$$
and partition $\Delta$ as
$$\Delta =: \begin{bmatrix}\Delta_U\\ \Delta_V\end{bmatrix}.$$
Then
$$\begin{bmatrix}\hat U\\ \hat V\end{bmatrix} = \begin{bmatrix}\Theta_{12}\\ \Theta_{22}\end{bmatrix} - \Theta\begin{bmatrix}\gamma I & 0\\ 0 & I\end{bmatrix}\Delta = \Theta\begin{bmatrix}-\gamma\Delta_U\\ I-\Delta_V\end{bmatrix}$$
and
$$\begin{bmatrix}\hat U(I-\Delta_V)^{-1}\\ \hat V(I-\Delta_V)^{-1}\end{bmatrix} = \Theta\begin{bmatrix}-\gamma\Delta_U(I-\Delta_V)^{-1}\\ I\end{bmatrix}.$$
Define $U := \hat U(I-\Delta_V)^{-1}$, $V := \hat V(I-\Delta_V)^{-1}$ and $Q := -\gamma\Delta_U(I-\Delta_V)^{-1}$. Then $UV^{-1}$ is another coprime factorization for $\hat K$. To show that $\hat K = UV^{-1} = \hat U\hat V^{-1}$ is a stabilizing controller such that $\|F_\ell(G,\hat K)\|_\infty < \gamma$, we need to show that $\left\|\gamma\Delta_U(I-\Delta_V)^{-1}\right\|_\infty < \gamma$, or equivalently $\left\|\Delta_U(I-\Delta_V)^{-1}\right\|_\infty < 1$. Now
$$\Delta_U(I-\Delta_V)^{-1} = \begin{bmatrix}I & 0\end{bmatrix}\Delta\left(I-\begin{bmatrix}0 & I\end{bmatrix}\Delta\right)^{-1} = F_\ell\left(\begin{bmatrix}0 & \begin{bmatrix}I & 0\end{bmatrix}\\ I/\sqrt{2} & \begin{bmatrix}0 & I/\sqrt{2}\end{bmatrix}\end{bmatrix}, \sqrt{2}\Delta\right)$$
and by Lemma 19.3, $\left\|\Delta_U(I-\Delta_V)^{-1}\right\|_\infty < 1$ since
$$\begin{bmatrix}0 & \begin{bmatrix}I & 0\end{bmatrix}\\ I/\sqrt{2} & \begin{bmatrix}0 & I/\sqrt{2}\end{bmatrix}\end{bmatrix}$$
is a contraction and $\|\sqrt{2}\Delta\|_\infty < 1$. $\Box$

Similarly, we have the following theorem.

Theorem 19.8 Let $K_0 = \tilde\Theta_{22}^{-1}\tilde\Theta_{21}$ be the central H∞ controller such that $\|F_\ell(G,K_0)\|_\infty < \gamma$, and let $\hat{\tilde U}, \hat{\tilde V}\in\mathcal{RH}_\infty$ be such that
$$\left\|\left(\begin{bmatrix}\tilde\Theta_{21} & \tilde\Theta_{22}\end{bmatrix}-\begin{bmatrix}\hat{\tilde U} & \hat{\tilde V}\end{bmatrix}\right)\tilde\Theta^{-1}\begin{bmatrix}\gamma^{-1}I & 0\\ 0 & I\end{bmatrix}\right\|_\infty < 1/\sqrt{2}.$$
Then $\hat K = \hat{\tilde V}^{-1}\hat{\tilde U}$ is also a stabilizing controller such that $\|F_\ell(G,\hat K)\|_\infty < \gamma$.

The above two theorems show that the H∞ controller reduction problem is equivalent to a frequency weighted H∞ model reduction problem.

H∞ Controller Reduction Procedures

(i) Let $K_0 = \Theta_{12}\Theta_{22}^{-1}\ (= \tilde\Theta_{22}^{-1}\tilde\Theta_{21})$ be a suboptimal H∞ central controller ($Q = 0$) such that $\|T_{zw}\|_\infty < \gamma$.

(ii) Find a reduced order controller $\hat K = \hat U\hat V^{-1}$ (or $\hat{\tilde V}^{-1}\hat{\tilde U}$) such that the following frequency weighted H∞ error bound holds:
$$\left\|\begin{bmatrix}\gamma^{-1}I & 0\\ 0 & I\end{bmatrix}\Theta^{-1}\left(\begin{bmatrix}\Theta_{12}\\ \Theta_{22}\end{bmatrix}-\begin{bmatrix}\hat U\\ \hat V\end{bmatrix}\right)\right\|_\infty < 1/\sqrt{2}$$
or
$$\left\|\left(\begin{bmatrix}\tilde\Theta_{21} & \tilde\Theta_{22}\end{bmatrix}-\begin{bmatrix}\hat{\tilde U} & \hat{\tilde V}\end{bmatrix}\right)\tilde\Theta^{-1}\begin{bmatrix}\gamma^{-1}I & 0\\ 0 & I\end{bmatrix}\right\|_\infty < 1/\sqrt{2}.$$

(iii) The closed-loop system with the reduced order controller $\hat K$ is then stable and the performance is maintained, i.e.,
$$\|T_{zw}\|_\infty = \|F_\ell(G,\hat K)\|_\infty < \gamma.$$

19.3 Frequency-Weighted L∞ Norm Approximations

We have shown in the previous sections that controller reduction problems are equivalent to frequency weighted model reduction problems. To that end, the frequency weighted balanced model reduction approach in Chapter 7 can be applied. In this section, we propose another method based on the frequency weighted Hankel norm approximation.

Theorem 19.9 Let $W_1(s)\in\mathcal{RH}_\infty^-$ and $W_2(s)\in\mathcal{RH}_\infty^-$ with minimal state space realizations
$$W_1(s) = \left[\begin{array}{c|c}A_{1w} & B_{1w}\\ \hline C_{1w} & D_{1w}\end{array}\right], \qquad W_2(s) = \left[\begin{array}{c|c}A_{2w} & B_{2w}\\ \hline C_{2w} & D_{2w}\end{array}\right],$$
and let $G(s)\in\mathcal{RH}_\infty$. Suppose that $\hat G_1(s) = \left[\begin{array}{c|c}\hat A_1 & \hat B_1\\ \hline \hat C_1 & \hat D_1\end{array}\right]\in\mathcal{RH}_\infty$ is an $r$-th order optimal Hankel norm approximation of $[W_1GW_2]_+$, i.e.,
$$\hat G_1 = \arg\inf_{\deg Q\le r}\left\|[W_1GW_2]_+ - Q\right\|_H,$$
and assume that
$$\begin{bmatrix}A_{1w}-\lambda I & B_{1w}\\ C_{1w} & D_{1w}\end{bmatrix}, \qquad \begin{bmatrix}A_{2w}-\lambda I & B_{2w}\\ C_{2w} & D_{2w}\end{bmatrix}$$
have respectively full row rank and full column rank for all $\lambda = \lambda_i(\hat A_1)$, $i = 1,\ldots,r$. Then there exist matrices $X, Y, Q$ and $Z$ such that
$$A_{1w}X - X\hat A_1 + B_{1w}Y = 0 \qquad (19.12)$$
$$C_{1w}X + D_{1w}Y = \hat C_1 \qquad (19.13)$$
$$QA_{2w} - \hat A_1Q + ZC_{2w} = 0 \qquad (19.14)$$
$$QB_{2w} + ZD_{2w} = \hat B_1. \qquad (19.15)$$
Furthermore, $G_r := \left[\begin{array}{c|c}\hat A_1 & Z\\ \hline Y & 0\end{array}\right]$ is the frequency weighted optimal Hankel norm approximation, i.e.,
$$\inf_{\deg\hat G\le r}\left\|W_1(G-\hat G)W_2\right\|_H = \left\|W_1(G-G_r)W_2\right\|_H = \sigma_{r+1}\left([W_1GW_2]_+\right).$$

Proof. We shall assume $W_2 = I$ for simplicity; the general case can be proven similarly. Assume without loss of generality that $\hat A_1$ has a diagonal form
$$\hat A_1 = \mathrm{diag}[\lambda_1, \lambda_2, \ldots, \lambda_r].$$
(The proof can easily be modified if $\hat A_1$ has a general Jordan canonical form.) Partition $X$, $Y$ and $\hat C_1$ as
$$X = [X_1, X_2, \ldots, X_r], \quad Y = [Y_1, Y_2, \ldots, Y_r], \quad \hat C_1 = [\hat C_{11}, \hat C_{12}, \ldots, \hat C_{1r}].$$
Then the equations (19.12) and (19.13) can be rewritten as
$$\begin{bmatrix}A_{1w}-\lambda_iI & B_{1w}\\ C_{1w} & D_{1w}\end{bmatrix}\begin{bmatrix}X_i\\ Y_i\end{bmatrix} = \begin{bmatrix}0\\ \hat C_{1i}\end{bmatrix}, \qquad i = 1, 2, \ldots, r.$$
By the assumption, the matrix
$$\begin{bmatrix}A_{1w}-\lambda_iI & B_{1w}\\ C_{1w} & D_{1w}\end{bmatrix}$$
has full row rank for all $i$, and thus the existence of $X$ and $Y$ is guaranteed. Let
$$\hat W_1 = \left[\begin{array}{c|c}A_{1w} & -X\hat B_1\\ \hline C_{1w} & -\hat D_1\end{array}\right]\in\mathcal{RH}_\infty^-.$$
Then using equations (19.12) and (19.13), we get
$$W_1G_r = \left[\begin{array}{cc|c}A_{1w} & B_{1w}Y & 0\\ 0 & \hat A_1 & \hat B_1\\ \hline C_{1w} & D_{1w}Y & 0\end{array}\right] = \left[\begin{array}{cc|c}A_{1w} & A_{1w}X-X\hat A_1+B_{1w}Y & -X\hat B_1\\ 0 & \hat A_1 & \hat B_1\\ \hline C_{1w} & C_{1w}X+D_{1w}Y & 0\end{array}\right] = \left[\begin{array}{cc|c}A_{1w} & 0 & -X\hat B_1\\ 0 & \hat A_1 & \hat B_1\\ \hline C_{1w} & \hat C_1 & 0\end{array}\right] = \hat W_1 + \hat G_1.$$
Using this expression, we have
$$\left\|W_1(G-G_r)\right\|_H = \left\|[W_1G]_+ + [W_1G]_- - \hat W_1 - \hat G_1\right\|_H = \left\|[W_1G]_+ - \hat G_1\right\|_H = \sigma_{r+1}\left([W_1G]_+\right) \le \inf_{\deg\hat G\le r}\left\|W_1(G-\hat G)\right\|_H \le \left\|W_1(G-G_r)\right\|_H. \qquad\Box$$

Note that $Y = \hat C_1$ if $W_1 = I$, and $Z = \hat B_1$ if $W_2 = I$; the rank conditions in the above theorem are actually equivalent to the statement that the poles of $\hat G_1$ are not zeros of $W_1$ and $W_2$. These conditions will of course be satisfied automatically if $W_1(s)$ and $W_2(s)$ have all their zeros in the right half plane. Numerical experience shows that if the weighted Hankel approximation is used to obtain an $L_\infty$ norm approximation, then choosing $W_1(s)$ and $W_2(s)$ to have all poles and zeros in the right half plane may reduce the $L_\infty$ norm approximation error significantly.

Corollary 19.10 Let $W_1(s)\in\mathcal{RH}_\infty^-$, $W_2(s)\in\mathcal{RH}_\infty^-$ and $G(s)\in\mathcal{RH}_\infty$. Then
$$\inf_{\deg\hat G\le r}\left\|W_1(G-\hat G)W_2\right\|_\infty \ge \inf_{\deg\hat G\le r}\left\|W_1(G-\hat G)W_2\right\|_H = \sigma_{r+1}\left([W_1GW_2]_+\right).$$

The lower bound in the above corollary is not necessarily achievable. To make the $\infty$-norm approximation error as small as possible, a suitable constant matrix $D_r$ should be chosen so that $\left\|W_1(G-\hat G-D_r)W_2\right\|_\infty$ is as small as possible. This $D_r$ can usually be obtained using any standard convex optimization algorithm. To further reduce the approximation error, the following optimization is suggested.

Weighted L∞ Model Reduction Procedures

Let $W_1$ and $W_2$ be antistable transfer matrices with all zeros in the right half plane.

(i) Let
$$\hat G_1 = \left[\begin{array}{c|c}\hat A_1 & Z\\ \hline Y & 0\end{array}\right]$$
be a weighted optimal Hankel norm approximation of $G$.

(ii) Let the reduced order model $\hat G$ be parameterized as
$$\hat G(\theta) = \left[\begin{array}{c|c}\hat A_1 & \hat B_\theta\\ \hline Y & \hat D_\theta\end{array}\right] \quad\text{or}\quad \hat G(\theta) = \left[\begin{array}{c|c}\hat A_1 & Z\\ \hline \hat C_\theta & \hat D_\theta\end{array}\right].$$

(iii) Find $\hat C_\theta$ (or $\hat B_\theta$) and $\hat D_\theta$ from the following convex optimization:
$$\min_{\theta\in\mathbb{R}^m}\left\|W_1(G-\hat G(\theta))W_2\right\|_\infty.$$
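As a scalar illustration of step (iii) with $W_1 = W_2 = 1$ and $r = 0$, so that $\hat G(\theta)$ reduces to a constant $D$: for $G(s) = 1/(s+1)$ the Nyquist plot is the circle of radius $1/2$ centered at $1/2$, so $\min_D \max_\omega |G(j\omega)-D| = 1/2$, attained at $D = 1/2$. A grid search over the (convex in $D$) objective recovers this:

```python
# Scalar instance of the convex D-fitting in step (iii), by grid search.
# Hypothetical example: G(s) = 1/(s+1), W1 = W2 = 1, r = 0.
import numpy as np

w = np.concatenate(([0.0], np.logspace(-3, 3, 2000)))
G = 1.0 / (1.0 + 1j * w)                 # samples of G(jw) on the grid

def err(D):
    return np.abs(G - D).max()           # grid estimate of the sup norm

Ds = np.linspace(0.0, 1.0, 1001)
vals = np.array([err(D) for D in Ds])
D_star = Ds[vals.argmin()]               # best constant approximant
```

In practice one would replace the grid search by a proper convex solver, but the one-dimensional picture already shows how the constant term trades off the low- and high-frequency errors.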

It is noted that the weighted Hankel singular values can be used to predict the approximation error, and hence to determine the order of the reduced model as in the unweighted Hankel approximation problem, although we do not have an explicit $L_\infty$ norm error bound in the weighted case.

If the given $W_1$ and $W_2$ do not have all poles and zeros in the right half plane, factorizations must first be performed to obtain equivalent weights $\bar W_1(s)$ and $\bar W_2(s)$ that have all poles and zeros in the right half plane and satisfy
$$W_1^\sim(s)W_1(s) = \bar W_1^\sim(s)\bar W_1(s), \qquad W_2(s)W_2^\sim(s) = \bar W_2(s)\bar W_2^\sim(s).$$
Then we have
$$\left\|W_1(G-\hat G)W_2\right\|_\infty = \left\|\bar W_1(G-\hat G)\bar W_2\right\|_\infty.$$

These factorizations can easily be done using Corollary 13.28 if $W_1$ and $W_2$ are stable and $W_1(\infty)$ and $W_2(\infty)$ have respectively full column rank and full row rank. For example, assume $W_2 = \left[\begin{array}{c|c}A & B\\ \hline C & D\end{array}\right]\in\mathcal{RH}_\infty$ with $D$ full row rank and $W_2(j\omega)W_2^\sim(j\omega) > 0$ for all $\omega$. Then there is an $M(s) = \left[\begin{array}{c|c}A & B_W\\ \hline C_W & (DD^*)^{1/2}\end{array}\right]\in\mathcal{RH}_\infty$ such that $M^{-1}(s)\in\mathcal{RH}_\infty$ and
$$W_2(s)W_2^\sim(s) = M^\sim(s)M(s)$$
where
$$B_W = PC^* + BD^*, \qquad C_W = (DD^*)^{-1/2}(C - B_W^*X)$$
and
$$PA^* + AP + BB^* = 0,$$
$$XA + A^*X + (C - B_W^*X)^*(DD^*)^{-1}(C - B_W^*X) = 0.$$
Finally, take $\bar W_2(s) = M^\sim(s)$. Then $\bar W_2(s)$ has all its poles and zeros in the right half plane and
$$\left\|W_1(G-\hat G)W_2\right\|_\infty = \left\|W_1(G-\hat G)\bar W_2\right\|_\infty.$$

In the case where $W_1$ and $W_2$ are not necessarily stable, the following procedures can be applied to accomplish this task.

Spectral Factorization Procedures

Let $W_1\in\mathcal{L}_\infty$ and $W_2\in\mathcal{L}_\infty$.

(i) Let $W_{1n} := W_1(-s)$ and $W_{2n} := W_2(-s)$.

(ii) Let $W_{1n} = M_1^{-1}N_1$ and $W_{2n} = N_2M_2^{-1}$ be respectively left and right coprime factorizations such that $M_1$ and $M_2$ are inner. (This step can be done using Theorem 13.34.)

(iii) Perform the following spectral factorizations:
$$N_1^\sim N_1 = V_1^\sim V_1, \qquad N_2N_2^\sim = V_2V_2^\sim$$
so that $V_1$ and $V_2$ have all zeros in the left half plane. (Corollary 13.23 may be used here if $N_1(\infty)$ has full column rank and $N_2(\infty)$ has full row rank. Otherwise, the factorization in Section 6.1 may be used to factor out the undesirable poles and zeros in the weights.)

(iv) Let $\bar W_1(s) = V_1(-s)$ and $\bar W_2(s) = V_2(-s)$.

We shall summarize the state space formulas for the above factorizations as a lemma.

Lemma 19.11 Let $W(s) = \left[\begin{array}{c|c}A & B\\ \hline C & D\end{array}\right]\in\mathcal{L}_\infty$ be a controllable and observable realization.

(a) Suppose $W^\sim(j\omega)W(j\omega) > 0$ for all $\omega$, or equivalently $\begin{bmatrix}A-j\omega I & B\\ C & D\end{bmatrix}$ has full column rank for all $\omega$. Let
$$Y = \mathrm{Ric}\begin{bmatrix}-A^* & -C^*C\\ 0 & A\end{bmatrix} \le 0$$
$$X = \mathrm{Ric}\begin{bmatrix}-(A-BR^{-1}D^*C) & -BR^{-1}B^*\\ -C^*(I-DR^{-1}D^*)C & (A-BR^{-1}D^*C)^*\end{bmatrix} \le 0$$
with $R := D^*D > 0$. Then we have the following spectral factorization
$$W^\sim W = \bar W^\sim\bar W$$
where $\bar W, \bar W^{-1}\in\mathcal{RH}_\infty^-$ and
$$\bar W = \left[\begin{array}{c|c}A+YC^*C & B+YC^*D\\ \hline R^{-1/2}(D^*C-B^*X)(I+YX)^{-1} & R^{1/2}\end{array}\right].$$

(b) Suppose $W(j\omega)W^\sim(j\omega) > 0$ for all $\omega$, or equivalently $\begin{bmatrix}A-j\omega I & B\\ C & D\end{bmatrix}$ has full row rank for all $\omega$. Let
$$X = \mathrm{Ric}\begin{bmatrix}-A & -BB^*\\ 0 & A^*\end{bmatrix} \le 0$$
$$Y = \mathrm{Ric}\begin{bmatrix}-(A-BD^*\tilde R^{-1}C)^* & -C^*\tilde R^{-1}C\\ -B(I-D^*\tilde R^{-1}D)B^* & (A-BD^*\tilde R^{-1}C)\end{bmatrix} \le 0$$
with $\tilde R := DD^* > 0$. Then we have the following spectral factorization
$$WW^\sim = \bar W\bar W^\sim$$
where $\bar W, \bar W^{-1}\in\mathcal{RH}_\infty^-$ and
$$\bar W = \left[\begin{array}{c|c}A+BB^*X & (I+YX)^{-1}(BD^*-YC^*)\tilde R^{-1/2}\\ \hline C+DB^*X & \tilde R^{1/2}\end{array}\right].$$

19.4 An Example

We consider a four-disk control system studied by Enns [1984]. We shall set up the dynamical system in the standard linear fractional transformation form:
$$\dot x = Ax + B_1w + B_2u$$
$$z = \begin{bmatrix}\sqrt{q_1}H\\ 0\end{bmatrix}x + \begin{bmatrix}0\\ I\end{bmatrix}u$$
$$y = C_2x + \begin{bmatrix}0 & I\end{bmatrix}w$$
where $q_1 = 1\times 10^{-6}$, $q_2 = 1$ and

$$A = \begin{bmatrix} -0.161 & -6.004 & -0.58215 & -9.9835 & -0.40727 & -3.982 & 0 & 0\\ 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \end{bmatrix}, \qquad B_2 = \begin{bmatrix}1\\0\\0\\0\\0\\0\\0\\0\end{bmatrix},$$
$$B_1 = \begin{bmatrix}\sqrt{q_2}B_2 & 0\end{bmatrix}, \qquad H = \begin{bmatrix}0 & 0 & 0 & 0 & 0.55 & 11 & 1.32 & 18\end{bmatrix},$$
$$C_2 = \begin{bmatrix}0 & 0 & 6.4432\times 10^{-3} & 2.3196\times 10^{-3} & 7.1252\times 10^{-2} & 1.0002 & 0.10455 & 0.99551\end{bmatrix}.$$

The optimal H∞ norm for $T_{zw}$ is $\gamma_{\mathrm{opt}} = 1.1272$. We choose $\gamma = 1.2$ to compute an 8th order suboptimal controller $K_0$. The coprime factorizations of $K_0$ are obtained using Lemma 19.6 as $K_0 = \Theta_{12}\Theta_{22}^{-1} = \tilde\Theta_{22}^{-1}\tilde\Theta_{21}$. The controller is reduced using several methods and the results are listed in Table 19.2, where the following abbreviations are used:


UWA — Unweighted additive reduction: $\left\|K_0 - \hat K\right\|_\infty$

UWRCF — Unweighted right coprime factor reduction: $\left\|\begin{bmatrix}\Theta_{12}\\ \Theta_{22}\end{bmatrix}-\begin{bmatrix}\hat U\\ \hat V\end{bmatrix}\right\|_\infty$

UWLCF — Unweighted left coprime factor reduction: $\left\|\begin{bmatrix}\tilde\Theta_{21} & \tilde\Theta_{22}\end{bmatrix}-\begin{bmatrix}\hat{\tilde U} & \hat{\tilde V}\end{bmatrix}\right\|_\infty$

SWA — Stability weighted additive reduction: $\left\|W_a(K_0 - \hat K)\right\|_\infty$

SWRCF — Stability weighted right coprime factor reduction: $\left\|\begin{bmatrix}-\tilde N_n & \tilde M_n\end{bmatrix}\left(\begin{bmatrix}\Theta_{12}\\ \Theta_{22}\end{bmatrix}-\begin{bmatrix}\hat U\\ \hat V\end{bmatrix}\right)\right\|_\infty$

SWLCF — Stability weighted left coprime factor reduction: $\left\|\left(\begin{bmatrix}\tilde\Theta_{21} & \tilde\Theta_{22}\end{bmatrix}-\begin{bmatrix}\hat{\tilde U} & \hat{\tilde V}\end{bmatrix}\right)\begin{bmatrix}-N_n\\ M_n\end{bmatrix}\right\|_\infty$

PWA — Performance weighted additive reduction: $\left\|W_2^{-1}(K_0 - \hat K)W_1^{-1}\right\|_\infty$

PWRCF — Performance weighted right coprime factor reduction: $\left\|\begin{bmatrix}\gamma^{-1}I & 0\\ 0 & I\end{bmatrix}\Theta^{-1}\left(\begin{bmatrix}\Theta_{12}\\ \Theta_{22}\end{bmatrix}-\begin{bmatrix}\hat U\\ \hat V\end{bmatrix}\right)\right\|_\infty$

PWLCF — Performance weighted left coprime factor reduction: $\left\|\left(\begin{bmatrix}\tilde\Theta_{21} & \tilde\Theta_{22}\end{bmatrix}-\begin{bmatrix}\hat{\tilde U} & \hat{\tilde V}\end{bmatrix}\right)\tilde\Theta^{-1}\begin{bmatrix}\gamma^{-1}I & 0\\ 0 & I\end{bmatrix}\right\|_\infty$

B — Balanced reduction, with or without weighting

H/O — Hankel/convex optimization reduction, with or without weighting

Table 19.3 lists the performance weighted right coprime factor reduction errors and their lower bounds obtained using Corollary 19.10. The $\rho$ in Table 19.3 is defined as
$$\rho := \left\|\begin{bmatrix}\gamma^{-1}I & 0\\ 0 & I\end{bmatrix}\Theta^{-1}\left(\begin{bmatrix}\Theta_{12}\\ \Theta_{22}\end{bmatrix}-\begin{bmatrix}\hat U\\ \hat V\end{bmatrix}\right)\right\|_\infty.$$
By Corollary 19.10, $\rho \ge \sigma_{r+1}$ if the McMillan degree of $\begin{bmatrix}\hat U\\ \hat V\end{bmatrix}$ is no greater than $r$.

Similarly, Table 19.4 lists the stability weighted right coprime factor reduction errors


Order of K̂       7       6       5       4       3       2       1       0
PWA    B       1.196   1.196   1.199   1.197   U       4.99    U       U
       H/O     1.196   1.197   1.195   1.199   2.73    1.94    U       U
PWRCF  B       1.2     1.196   1.207   1.195   2.98    1.674   U       U
       H/O     1.196   1.198   1.196   1.199   2.036   1.981   U       U
PWLCF  B       1.197   1.196   U       1.197   U       U       U       U
       H/O     1.197   1.197   1.198   1.198   1.586   2.975   3.501   U
UWA    B       U       1.321   U       U       U       U       U       U
       H/O     23.15   U       U       U       U       U       U       U
UWRCF  B       1.198   1.196   1.199   1.196   U       U       U       U
       H/O     1.197   1.197   1.282   1.218   U       U       U       U
UWLCF  B       1.985   1.258   27.04   5.059   U       U       U       U
       H/O     5.273   U       U       U       U       U       U       U
SWA    B       1.327   1.199   2.27    1.47    23.5    U       U       U
       H/O     1.375   2.503   2.802   4.341   1.488   15.12   2.467   U
SWRCF  B       1.236   1.197   1.251   1.201   13.91   1.415   U       U
       H/O     2.401   1.893   1.612   1.388   2.622   3.527   U       U
SWLCF  B       1.417   1.217   48.04   3.031   U       U       U       U
       H/O     1.267   1.485   2.259   1.849   4.184   27.965  3.251   U

Table 19.2: $\|F_\ell(G,\hat K)\|_\infty$ with reduced order controllers (U: closed-loop system is unstable)

and their lower bounds obtained using Corollary 19.10. The $\rho_s$ and $\rho_u$ in Table 19.4 are defined as
$$\rho_s := \left\|\begin{bmatrix}-\tilde N_n & \tilde M_n\end{bmatrix}\left(\begin{bmatrix}\Theta_{12}\\ \Theta_{22}\end{bmatrix}-\begin{bmatrix}\hat U\\ \hat V\end{bmatrix}\right)\right\|_\infty$$
$$\rho_u := \left\|\begin{bmatrix}\Theta_{12}\\ \Theta_{22}\end{bmatrix}-\begin{bmatrix}\hat U\\ \hat V\end{bmatrix}\right\|_\infty$$
where $\rho_u$ is obtained by taking the same $\hat U$ and $\hat V$ as in $\rho_s$, not from the unweighted model reduction.

Table 19.2 shows that the performance weighted controller reduction methods work very well. In particular, PWRCF and PWLCF are easy to use and effective, and there is in general no preference for using either the right coprime method or the left coprime method. Although some unweighted reduction methods and some stability weighted reduction methods do give reasonable results in some cases, their performance is very hard to predict. Worst of all, a small approximation error for the reduction criterion may have little relevance to the closed-loop H∞ performance. For example, Table 19.4 shows that the 7th order weighted approximation error $\rho_s$ obtained by the Hankel/optimization method is very small; however, the H∞ performance is very far from the desired level.

Although the unweighted right coprime factor reduction method gives very good results for this example, one should not conclude that it will work well in general. If that were true, then by considering a dual problem one could equally conclude that the unweighted left coprime factor method should do just as well. Of course, Table 19.2 shows that this is not the case, because the unweighted left coprime factor method does not give good results. The only reason for the good performance of the right coprime factor method in this example is the special data structure of the problem, i.e., the relationship between the $B_1$ and $B_2$ matrices:
$$B_1 = \begin{bmatrix}\sqrt{q_2}B_2 & 0\end{bmatrix}.$$
The interested reader may want to explore this special structure further.

Order of K̂ (r)        7       6       5       4       3       2       1       0
Lower bound σ_{r+1}   0.295   0.303   0.385   0.405   0.635   0.668   0.687   0.702
ρ    B                1.009   0.626   4.645   0.750   71.8    6.59    127.2   2.029
     H/O              0.295   0.323   0.389   0.658   0.960   1       1       1

Table 19.3: PWRCF: $\rho$ and lower bounds

19.5 Notes and References

The stability oriented controller reduction criterion was first proposed by Enns [1984]. The weighted and unweighted coprime factor controller reduction methods were proposed by Liu and Anderson [1986, 1990], Liu, Anderson, and Ly [1990], Anderson and Liu [1989], and Anderson [1993]. For normalized H∞ controllers, Mustafa and Glover [1991] have proposed a controller reduction method with a priori performance bounds. Normalized coprime factors have been used in McFarlane and Glover [1990] for controller order reduction. Lenz, Khargonekar and Doyle [1987] have also proposed another H∞ controller reduction method with guaranteed performance for a class of H∞ problems. The main results presented in this chapter are based on the work of Goddard and Glover [1993, 1994].

We should note that a satisfactory solution to the general frequency weighted L∞ norm model reduction problem remains unavailable, and this problem has a crucial implication for controller reduction whose objective is to preserve closed-loop H∞ performance. The frequency weighted Hankel norm approximation is considered in Latham and Anderson [1986], Hung and Glover [1986], and Zhou [1993]. The L∞ model reduction procedures discussed in this chapter are due to Zhou [1993].


Order of K̂ (r)                7         6         5         4
Lower bound σ_{r+1} of ρ_s   1.1e-6    1.2e-6    1.9e-6    1.9e-6
ρ_s   B                      0.8421    0.5048    2.5439    0.5473
      H/O                    0.0001    0.2092    0.3182    0.3755
ρ_u   B                      254.28    9.7018    910.01    21.444
      H/O                    185.9     30.85     305.3     15.38

Order of K̂ (r)                3         2         1         0
Lower bound σ_{r+1} of ρ_s   9e-6      6.23e-5   1.66e-4   2.145e-4
ρ_s   B                      11.791    1.3164    9.1461    1.5341
      H/O                    0.5403    0.7642    1         1
ρ_u   B                      2600.2    365.45    3000.6    383.277
      H/O                    397.9     288.1     384.3     384.3

Table 19.4: SWRCF: $\rho_s$ and the corresponding $\rho_u$


20 Structure Fixed Controllers

In this chapter we focus on the problem of designing optimal controllers with restricted controller structures; for instance, the controller may be limited to be a state feedback, a constant output feedback, or a fixed order dynamic controller. We shall be interested in deriving some explicit necessary conditions that an optimal fixed structure controller ought to satisfy. The fundamental idea is to formulate our optimal control problems as constrained minimization problems. Then the first-order Lagrange multiplier necessary conditions for optimality are applied to derive the optimal controller formulae. Readers should keep in mind that our purpose here is to introduce the method, not to solve as many problems as possible; hence, we will try to be concise but clear. In Section 20.1, we review some Lagrange multiplier optimization methods. These tools are then used in Section 20.2 to solve a fixed order H2 optimal controller problem.

20.1 Lagrange Multiplier Method

In this section, we consider the constrained minimization problem. The results quoted below are standard and can be found in any of the references given at the end of the chapter.

Let $f(x) := f(x_1, x_2, \ldots, x_n)\in\mathbb{R}$ be a real valued function defined on a set $S\subset\mathbb{R}^n$. A point $x_0\in S$ is said to be a (global) minimum point of $f$ on $S$ if
$$f(x) \ge f(x_0)$$
for all points $x\in S$. A point $x_0\in S$ is said to be a local minimum point of $f$ on $S$ if there is a neighborhood $N$ of $x_0$ such that $f(x)\ge f(x_0)$ for all points $x\in N$.


We will be particularly interested in the case where the set $S$ is described by a set of functions $h_i(x) = 0$, $i = 1, 2, \ldots, m$, with $m < n$, or equivalently
$$H(x) := \begin{bmatrix}h_1(x) & h_2(x) & \cdots & h_m(x)\end{bmatrix} = 0.$$
Hence we will focus on the local necessary (and sufficient) conditions for the following problem:
$$\begin{cases}\text{minimize } f(x)\\ \text{subject to } H(x) = 0.\end{cases} \qquad (20.1)$$

We shall assume that $f(x)$ and $h_i(x)$ are differentiable and denote
$$\nabla f(x) := \begin{bmatrix}\frac{\partial f}{\partial x_1}\\ \frac{\partial f}{\partial x_2}\\ \vdots\\ \frac{\partial f}{\partial x_n}\end{bmatrix}, \qquad \nabla H(x) := \begin{bmatrix}\nabla h_1(x) & \nabla h_2(x) & \cdots & \nabla h_m(x)\end{bmatrix} = \begin{bmatrix}\frac{\partial h_1}{\partial x_1} & \frac{\partial h_2}{\partial x_1} & \cdots & \frac{\partial h_m}{\partial x_1}\\ \frac{\partial h_1}{\partial x_2} & \frac{\partial h_2}{\partial x_2} & \cdots & \frac{\partial h_m}{\partial x_2}\\ \vdots & \vdots & & \vdots\\ \frac{\partial h_1}{\partial x_n} & \frac{\partial h_2}{\partial x_n} & \cdots & \frac{\partial h_m}{\partial x_n}\end{bmatrix}.$$

Definition 20.1 A point $x_0\in\mathbb{R}^n$ satisfying the constraints $H(x_0) = 0$ is said to be a regular point of the constraints if $\nabla H(x_0)$ has full column rank ($m$); equivalently, if $\phi(x) := H(x)z$ for $z\in\mathbb{R}^m$, then $\nabla\phi(x_0) = \nabla H(x_0)z = 0$ has the unique solution $z = 0$.

Theorem 20.1 Suppose that $x_0 \in \mathbb{R}^n$ is a local minimum of $f(x)$ subject to the constraints $H(x) = 0$, and suppose further that $x_0$ is a regular point of the constraints. Then there exists a unique multiplier
\[
\lambda = \begin{bmatrix} \lambda_1 \\ \lambda_2 \\ \vdots \\ \lambda_m \end{bmatrix} \in \mathbb{R}^m
\]
such that, if we set $F(x) = f(x) + H(x)\lambda$, then $\nabla F(x_0) = 0$, i.e.,
\[
\nabla F(x_0) = \nabla f(x_0) + \nabla H(x_0)\lambda = 0.
\]
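As a concrete illustration of Theorem 20.1 (the example and all numbers below are ours, not from the text): for a quadratic objective and a single affine constraint, the stationarity condition $\nabla f(x_0) + \nabla H(x_0)\lambda = 0$ together with $H(x_0) = 0$ is a linear system in $(x_0, \lambda)$.

```python
import numpy as np

# Hypothetical example: minimize f(x) = x1^2 + 2*x2^2
# subject to h(x) = x1 + x2 - 1 = 0.
# Stationarity plus the constraint give the linear system
#   2*x1      + lam = 0
#        4*x2 + lam = 0
#   x1 + x2         = 1
kkt = np.array([[2.0, 0.0, 1.0],
                [0.0, 4.0, 1.0],
                [1.0, 1.0, 0.0]])
rhs = np.array([0.0, 0.0, 1.0])
x1, x2, lam = np.linalg.solve(kkt, rhs)
```

Restricting $f$ to the constraint line confirms that $(x_1, x_2) = (2/3, 1/3)$ is indeed the constrained minimum.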

In the case where the regular point conditions are either not satisfied or hard to verify, we have the following alternative.


Theorem 20.2 Suppose that $x_0 \in \mathbb{R}^n$ is a local minimum of $f(x)$ subject to the constraints $H(x) = 0$. Then there exists
\[
\begin{bmatrix} \lambda_0 \\ \lambda \end{bmatrix} \in \mathbb{R}^{m+1}
\]
such that $\lambda_0\nabla f(x_0) + \nabla H(x_0)\lambda = 0$.

Remark 20.1 Although some second order necessary and sufficient conditions for local minimality can be given, they are usually not very useful in our applications because they are very hard to verify except in some very special cases. Furthermore, even if some sufficient conditions can be verified, it is still not clear whether the minimum is global. It is common practice in many applications to use the first-order necessary conditions to derive necessary conditions for the existence of a minimum, then find a solution (or solutions) from these necessary conditions and check whether the solution(s) satisfies our objectives, regardless of whether the solution(s) is a global minimum. ~

In most control applications, the constraints are given by a symmetric matrix function, and in this case we have the following lemma.

Lemma 20.3 Let $T(x) = T(x)^* \in \mathbb{R}^{l\times l}$ be a symmetric matrix function and let $x_0 \in \mathbb{R}^n$ be such that $T(x_0) = 0$. Then $x_0$ is a regular point of the constraints $T(x) = T(x)^* = 0$ if, for $P = P^* \in \mathbb{R}^{l\times l}$, $\nabla\mathrm{Trace}(T(x_0)P) = 0$ has the unique solution $P = 0$.

Proof. Since $T(x) = [t_{ij}(x)]$ is a symmetric matrix, $t_{ij} = t_{ji}$, and the effective constraints for $T(x) = 0$ are given by the $l(l+1)/2$ equations $t_{ij}(x) = 0$ for $i = 1, 2, \ldots, l$ and $i \leq j \leq l$. By definition, $x_0$ is a regular point for the effective constraints ($t_{ij}(x) = 0$ for $i = 1, 2, \ldots, l$ and $i \leq j \leq l$) if the following equation has the unique solution $p_{ij} = 0$ for $i = 1, 2, \ldots, l$ and $i \leq j \leq l$:
\[
\phi(x_0) := \sum_{i=1}^{l} \nabla t_{ii}\, p_{ii} + \sum_{i=1}^{l}\sum_{j=i+1}^{l} 2\nabla t_{ij}\, p_{ji} = 0.
\]
Now the result follows by defining $p_{ij} := p_{ji}$ and by noting that $\phi(x_0)$ can be written as
\[
\phi(x_0) = \nabla\left(\sum_{i=1}^{l}\sum_{j=1}^{l} t_{ij}p_{ji}\right) = \nabla\mathrm{Trace}(T(x_0)P)
\]
with $P = P^*$. □

Corollary 20.4 Suppose that $x_0 \in \mathbb{R}^n$ is a local minimum of $f(x)$ subject to the constraints $T(x) = 0$ where $T(x) = T(x)^* \in \mathbb{R}^{l\times l}$, and suppose further that $x_0$ is a regular point of the constraints. Then there exists a unique multiplier $P = P^* \in \mathbb{R}^{l\times l}$ such that, if we set $F(x) = f(x) + \mathrm{Trace}(T(x)P)$, then $\nabla F(x_0) = 0$, i.e.,
\[
\nabla F(x_0) = \nabla f(x_0) + \nabla\mathrm{Trace}(T(x_0)P) = 0.
\]

In general, in the case where a local minimum point $x_0$ is not necessarily a regular point, we have the following corollary.

Corollary 20.5 Suppose that $x_0 \in \mathbb{R}^n$ is a local minimum of $f(x)$ subject to the constraints $T(x) = 0$ where $T(x) = T(x)^* \in \mathbb{R}^{l\times l}$. Then there exist $0 \neq (\lambda_0, P) \in \mathbb{R} \times \mathbb{R}^{l\times l}$ with $P = P^*$ such that
\[
\lambda_0\nabla f(x_0) + \nabla\mathrm{Trace}(T(x_0)P) = 0.
\]

Remark 20.2 We shall also note that the variable $x \in \mathbb{R}^n$ may be more conveniently given in terms of a matrix $X \in \mathbb{R}^{k\times q}$, i.e., we have
\[
x = \mathrm{Vec}X := \begin{bmatrix} x_{11} & \cdots & x_{k1} & x_{12} & \cdots & x_{k2} & \cdots & x_{kq} \end{bmatrix}^T.
\]
Then
\[
\nabla F(x) := \begin{bmatrix} \frac{\partial F(x)}{\partial x_{11}} \\ \vdots \\ \frac{\partial F(x)}{\partial x_{kq}} \end{bmatrix} = 0
\]
is equivalent to
\[
\frac{\partial F(x)}{\partial X} := \begin{bmatrix}
\frac{\partial F(x)}{\partial x_{11}} & \frac{\partial F(x)}{\partial x_{12}} & \cdots & \frac{\partial F(x)}{\partial x_{1q}} \\
\frac{\partial F(x)}{\partial x_{21}} & \frac{\partial F(x)}{\partial x_{22}} & \cdots & \frac{\partial F(x)}{\partial x_{2q}} \\
\vdots & \vdots & & \vdots \\
\frac{\partial F(x)}{\partial x_{k1}} & \frac{\partial F(x)}{\partial x_{k2}} & \cdots & \frac{\partial F(x)}{\partial x_{kq}}
\end{bmatrix} = 0.
\]
This latter expression will be used throughout the sequel. ~

As an example, let us consider the following $H_2$ norm minimization with constant state feedback: the dynamic system is given by
\[
\dot x = Ax + B_1w + B_2u
\]
\[
z = C_1x + D_{12}u,
\]


and the feedback $u = Fx$ is chosen so that $A + B_2F$ is stable and
\[
J_0 = \|T_{zw}\|_2^2
\]
is minimized. For simplicity, we shall assume that $D_{12}^*D_{12} = I$ and $D_{12}^*C_1 = 0$. It is routine to verify that $J_0 = \mathrm{Trace}(B_1B_1^*X)$ where $X = X^* \geq 0$ satisfies
\[
T(X, F) := X(A + B_2F) + (A + B_2F)^*X + (C_1 + D_{12}F)^*(C_1 + D_{12}F) = 0.
\]

Hence the optimal control problem becomes a constrained minimization problem, and we can use the Lagrange multiplier method outlined above. (Note that in this case the variable $x$ takes the form $x = \begin{bmatrix} \mathrm{Vec}X \\ \mathrm{Vec}F \end{bmatrix}$.) Let
\[
J(X, F) := J_0 + \mathrm{Trace}(T(X, F)P)
\]
with $P = P^*$. We first verify the regularity conditions: the equation
\[
\nabla\mathrm{Trace}(T(X, F)P) = 0
\]
or, equivalently,
\[
\begin{bmatrix} \frac{\partial\,\mathrm{Trace}(T(X,F)P)}{\partial X} \\[1mm] \frac{\partial\,\mathrm{Trace}(T(X,F)P)}{\partial F} \end{bmatrix}
= \begin{bmatrix} P(A + B_2F)^* + (A + B_2F)P \\ 2(B_2^*X + F)P \end{bmatrix} = 0
\]
has the unique solution $P = 0$ since $A + B_2F$ is assumed to be stable at the minimum point. Hence the regularity conditions are satisfied. Now the necessary conditions for a local optimum can be applied:
\[
\frac{\partial J(X,F)}{\partial X} = B_1B_1^* + P(A + B_2F)^* + (A + B_2F)P = 0 \tag{20.2}
\]
\[
\frac{\partial J(X,F)}{\partial F} = 2(B_2^*X + F)P = 0 \tag{20.3}
\]
\[
\frac{\partial J(X,F)}{\partial P} = X(A + B_2F) + (A + B_2F)^*X + (C_1 + D_{12}F)^*(C_1 + D_{12}F) = 0. \tag{20.4}
\]
It should be pointed out that, in general, we cannot conclude from equation (20.3) that $B_2^*X + F = 0$; care must be exercised to arrive at such a conclusion. For example, if we assume that $B_1$ is square and nonsingular, then we know that if $F$ is such that $A + B_2F$ is stable, then $P > 0$. Hence we have
\[
F = -B_2^*X,
\]
and substituting this relation into equation (20.4) gives the familiar Riccati equation:
\[
XA + A^*X - XB_2B_2^*X + C_1^*C_1 = 0.
\]
This Riccati equation has a stabilizing solution if $(A, B_2)$ is stabilizable and if $(C_1, A)$ has no unobservable modes on the imaginary axis. Indeed, in this case, the controller thus obtained is a global optimal control law.
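The state feedback computation above is easy to carry out numerically. The sketch below uses hypothetical plant data (ours, not from the text), normalized so that the assumptions $D_{12}^*D_{12} = I$ and $D_{12}^*C_1 = 0$ hold by construction; scipy's `solve_continuous_are` solves exactly the Riccati equation just derived, and $F = -B_2^*X$.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hypothetical double-integrator plant (not from the text)
A  = np.array([[0.0, 1.0], [0.0, 0.0]])
B2 = np.array([[0.0], [1.0]])
C1 = np.array([[1.0, 0.0]])

# solve_continuous_are solves  A*X + XA - X B R^{-1} B* X + Q = 0,
# i.e. XA + A*X - X B2 B2* X + C1*C1 = 0 with R = I, Q = C1*C1
X = solve_continuous_are(A, B2, C1.T @ C1, np.eye(1))
F = -B2.T @ X                        # optimal state feedback u = Fx
cl_poles = np.linalg.eigvals(A + B2 @ F)
```

The Riccati residual is numerically zero and all closed-loop poles lie in the open left half plane, as the stabilizing-solution property guarantees.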


20.2 Fixed Order Controllers

In this section, we shall use the Lagrange multiplier method to derive some necessary conditions for a fixed-order controller that minimizes an $H_2$ performance. We shall again consider the standard setup, with the controller $K$ connected in feedback around the generalized plant $G$ (exogenous input $w$, control input $u$, controlled output $z$, and measured output $y$), where the system $G$ is $n$-th order with a realization given by
\[
G(s) = \left[\begin{array}{c|cc} A & B_1 & B_2 \\ \hline C_1 & 0 & D_{12} \\ C_2 & D_{21} & 0 \end{array}\right].
\]

For simplicity, we shall assume that

(i) $(A, B_1)$ is stabilizable and $(C_1, A)$ is detectable;

(ii) $(A, B_2)$ is stabilizable and $(C_2, A)$ is detectable;

(iii) $D_{12}^*\begin{bmatrix} C_1 & D_{12} \end{bmatrix} = \begin{bmatrix} 0 & I \end{bmatrix}$;

(iv) $\begin{bmatrix} B_1 \\ D_{21} \end{bmatrix} D_{21}^* = \begin{bmatrix} 0 \\ I \end{bmatrix}$.

We shall be interested in the following fixed order $H_2$ optimal controller problem: given an integer $n_c \leq n$, find an $n_c$-th order controller
\[
K = \begin{bmatrix} A_c & B_c \\ C_c & 0 \end{bmatrix}
\]
that internally stabilizes the system $G$ and minimizes the $H_2$ norm of the transfer matrix $T_{zw}$. For technical reasons, we will further assume that the realization of the controller is minimal, i.e., $(A_c, B_c)$ is controllable and $(C_c, A_c)$ is observable.

Suppose such a controller exists; then the closed loop transfer matrix $T_{zw}$ can be written as
\[
T_{zw} = \left[\begin{array}{cc|c} A & B_2C_c & B_1 \\ B_cC_2 & A_c & B_cD_{21} \\ \hline C_1 & D_{12}C_c & 0 \end{array}\right]
=: \begin{bmatrix} \tilde A & \tilde B \\ \tilde C & 0 \end{bmatrix}
\]
with $\tilde A$ stable. Moreover,
\[
\|T_{zw}\|_2^2 = \mathrm{Trace}(\tilde B\tilde B^*\tilde X) \tag{20.5}
\]
where $\tilde X$ is the observability Gramian of $T_{zw}$:
\[
\tilde X\tilde A + \tilde A^*\tilde X + \tilde C^*\tilde C = 0. \tag{20.6}
\]
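Equations (20.5) and (20.6) can be evaluated directly once a stabilizing controller is in hand. A sketch with hypothetical first-order plant and controller data (ours, not from the text):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Hypothetical plant and stabilizing controller (not from the text)
A,  B1, B2 = np.array([[-1.0]]), np.array([[1.0]]), np.array([[1.0]])
C2, D21    = np.array([[1.0]]), np.array([[1.0]])
C1  = np.array([[1.0], [0.0]])
D12 = np.array([[0.0], [1.0]])
Ac, Bc, Cc = np.array([[-3.0]]), np.array([[1.0]]), np.array([[-1.0]])

# closed-loop realization (At, Bt, Ct) of Tzw as in the display above
At = np.block([[A, B2 @ Cc], [Bc @ C2, Ac]])
Bt = np.vstack([B1, Bc @ D21])
Ct = np.hstack([C1, D12 @ Cc])

# observability Gramian Xt:  At* Xt + Xt At = -Ct* Ct
Xt = solve_continuous_lyapunov(At.T, -Ct.T @ Ct)
h2_sq = float(np.trace(Bt @ Bt.T @ Xt))     # ||Tzw||_2^2, eq. (20.5)
```

The same value is obtained from the dual (controllability Gramian) formula $\mathrm{Trace}(\tilde C\,\tilde Y\tilde C^*)$, a useful numerical cross-check.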

Theorem 20.6 Suppose $(A_c, B_c, C_c)$ is a controllable and observable triple and $K = \begin{bmatrix} A_c & B_c \\ C_c & 0 \end{bmatrix}$ internally stabilizes the system $G$ and minimizes the norm $\|T_{zw}\|_2$. Then there exist $n \times n$ nonnegative definite matrices $X$, $Y$, $\hat X$, and $\hat Y$ such that $A_c$, $B_c$, and $C_c$ are given by
\[
A_c = \Gamma(A - B_2B_2^*X - YC_2^*C_2)\Lambda^* \tag{20.7}
\]
\[
B_c = \Gamma YC_2^* \tag{20.8}
\]
\[
C_c = -B_2^*X\Lambda^* \tag{20.9}
\]
for some factorization
\[
\hat Y\hat X = \Lambda^*M\Gamma, \qquad \Gamma\Lambda^* = I_{n_c}
\]
with $M$ positive semisimple¹ and such that, with $\tau := \Lambda^*\Gamma$ and $\tau_\perp := I_n - \tau$, the following conditions are satisfied:
\[
0 = A^*X + XA - XB_2B_2^*X + \tau_\perp^*XB_2B_2^*X\tau_\perp + C_1^*C_1 \tag{20.10}
\]
\[
0 = AY + YA^* - YC_2^*C_2Y + \tau_\perp YC_2^*C_2Y\tau_\perp^* + B_1B_1^* \tag{20.11}
\]
\[
0 = (A - YC_2^*C_2)^*\hat X + \hat X(A - YC_2^*C_2) + XB_2B_2^*X - \tau_\perp^*XB_2B_2^*X\tau_\perp \tag{20.12}
\]
\[
0 = (A - B_2B_2^*X)\hat Y + \hat Y(A - B_2B_2^*X)^* + YC_2^*C_2Y - \tau_\perp YC_2^*C_2Y\tau_\perp^* \tag{20.13}
\]
\[
\mathrm{rank}\,\hat X = \mathrm{rank}\,\hat Y = \mathrm{rank}\,\hat Y\hat X = n_c.
\]

Proof. The problem can be viewed as a constrained minimization problem with the objective function given by equation (20.5) and with constraints given by equation (20.6). Let $\tilde Y = \tilde Y^* \in \mathbb{R}^{(n+n_c)\times(n+n_c)}$ and denote
\[
J_1 = \mathrm{Trace}\left\{(\tilde X\tilde A + \tilde A^*\tilde X + \tilde C^*\tilde C)\tilde Y\right\}.
\]
Then
\[
\frac{\partial J_1}{\partial \tilde X} = \tilde Y\tilde A^* + \tilde A\tilde Y = 0
\]
has the unique solution $\tilde Y = 0$ since $\tilde A$ is assumed to be stable. Hence the regularity conditions are satisfied and the Lagrange multiplier method can be applied. Form the Lagrange function $J$ as
\[
J := \mathrm{Trace}(\tilde B\tilde B^*\tilde X) + J_1
\]

¹A matrix $M$ is called positive semisimple if it is similar to a positive definite matrix.


and partition $\tilde X$ and $\tilde Y$ as
\[
\tilde X = \begin{bmatrix} X_{11} & X_{12} \\ X_{12}^* & X_{22} \end{bmatrix}, \qquad
\tilde Y = \begin{bmatrix} Y_{11} & Y_{12} \\ Y_{12}^* & Y_{22} \end{bmatrix}.
\]

The necessary conditions for $(A_c, B_c, C_c)$ to be a local minimum are
\[
\frac{\partial J}{\partial A_c} = 2(X_{12}^*Y_{12} + X_{22}Y_{22}) = 0 \tag{20.14}
\]
\[
\frac{\partial J}{\partial B_c} = 2(X_{22}B_c + X_{12}^*Y_{11}C_2^* + X_{22}Y_{12}^*C_2^*) = 0 \tag{20.15}
\]
\[
\frac{\partial J}{\partial C_c} = 2(B_2^*X_{11}Y_{12} + B_2^*X_{12}Y_{22} + C_cY_{22}) = 0 \tag{20.16}
\]
\[
\frac{\partial J}{\partial \tilde Y} = \tilde X\tilde A + \tilde A^*\tilde X + \tilde C^*\tilde C = 0 \tag{20.17}
\]
\[
\frac{\partial J}{\partial \tilde X} = \tilde Y\tilde A^* + \tilde A\tilde Y + \tilde B\tilde B^* = 0. \tag{20.18}
\]

It is clear that $\tilde X \geq 0$ and $\tilde Y \geq 0$ since $\tilde A$ is stable. Equations (20.17) and (20.18) can be written as
\[
0 = X_{11}A + A^*X_{11} + X_{12}B_cC_2 + C_2^*B_c^*X_{12}^* + C_1^*C_1 \tag{20.19}
\]
\[
0 = X_{12}A_c + A^*X_{12} + X_{11}B_2C_c + C_2^*B_c^*X_{22} \tag{20.20}
\]
\[
0 = X_{22}A_c + A_c^*X_{22} + X_{12}^*B_2C_c + C_c^*B_2^*X_{12} + C_c^*C_c \tag{20.21}
\]
\[
0 = AY_{11} + Y_{11}A^* + B_2C_cY_{12}^* + Y_{12}C_c^*B_2^* + B_1B_1^* \tag{20.22}
\]
\[
0 = AY_{12} + Y_{12}A_c^* + B_2C_cY_{22} + Y_{11}C_2^*B_c^* \tag{20.23}
\]
\[
0 = A_cY_{22} + Y_{22}A_c^* + B_cC_2Y_{12} + Y_{12}^*C_2^*B_c^* + B_cB_c^*. \tag{20.24}
\]

For clarity, we shall present the proof in three steps:

1. $X_{22} > 0$ and $Y_{22} > 0$: We show first that $X_{22} > 0$. Since $\tilde X \geq 0$, we have $X_{22} \geq 0$ and $X_{22}X_{22}^+X_{12}^* = X_{12}^*$ by Lemma 2.16. Hence equation (20.21) can be written as
\[
0 = X_{22}(A_c + X_{22}^+X_{12}^*B_2C_c) + (A_c + X_{22}^+X_{12}^*B_2C_c)^*X_{22} + C_c^*C_c.
\]
Since $(C_c, A_c)$ is observable by assumption, $(C_c, A_c + X_{22}^+X_{12}^*B_2C_c)$ is also observable. Now it follows from the Lyapunov theorem (Lemma 3.18) that $X_{22} > 0$. $Y_{22} > 0$ follows by a similar argument.

2. Formulas for $A_c$, $B_c$, $C_c$, $\Gamma$, and $\Lambda$: given $X_{22} > 0$ and $Y_{22} > 0$, we define
\[
\Gamma := -X_{22}^{-1}X_{12}^* \tag{20.25}
\]
\[
\Lambda := Y_{22}^{-1}Y_{12}^* \tag{20.26}
\]
\[
\tau := \Lambda^*\Gamma \tag{20.27}
\]
\[
\hat X := X_{12}X_{22}^{-1}X_{12}^* \geq 0 \tag{20.28}
\]
\[
\hat Y := Y_{12}Y_{22}^{-1}Y_{12}^* \geq 0 \tag{20.29}
\]
\[
X := X_{11} - X_{12}X_{22}^{-1}X_{12}^* \tag{20.30}
\]
\[
Y := Y_{11} - Y_{12}Y_{22}^{-1}Y_{12}^*. \tag{20.31}
\]
Then it follows from equation (20.14) that
\[
\Gamma\Lambda^* = I
\]
and $\tau^2 = \Lambda^*\Gamma\Lambda^*\Gamma = \Lambda^*\Gamma = \tau$. It also follows from $\tilde X \geq 0$ and $\tilde Y \geq 0$ that $X \geq 0$ and $Y \geq 0$. Moreover, from equation (20.14), we have
\[
n_c = \mathrm{rank}\,X_{22} \geq \mathrm{rank}\,X_{12} \geq n_c.
\]
This implies that $\mathrm{rank}\,X_{12} = n_c = \mathrm{rank}\,Y_{12}$ and
\[
\mathrm{rank}\,\hat X = \mathrm{rank}\,X_{12} = n_c = \mathrm{rank}\,\hat Y = \mathrm{rank}\,\hat Y\hat X.
\]
The product $\hat Y\hat X$ can be factored as
\[
\hat Y\hat X = \Lambda^*(-Y_{12}^*X_{12})\Gamma = \Lambda^*Y_{22}X_{22}\Gamma,
\]
and
\[
M := Y_{22}X_{22} = Y_{22}^{1/2}\left(Y_{22}^{1/2}X_{22}Y_{22}^{1/2}\right)Y_{22}^{-1/2}
\]
is a positive semisimple matrix.

Using these formulas in equations (20.15) and (20.16), we get
\[
B_c = -(X_{22}^{-1}X_{12}^*Y_{11} + Y_{12}^*)C_2^* = (\Gamma Y_{11} - (\Gamma\Lambda^*)Y_{12}^*)C_2^* = \Gamma(Y_{11} - \Lambda^*Y_{12}^*)C_2^* = \Gamma YC_2^*
\]
and
\[
C_c = -B_2^*(X_{11}Y_{12}Y_{22}^{-1} + X_{12}) = -B_2^*(X_{11} + X_{12}\Gamma)\Lambda^* = -B_2^*X\Lambda^*.
\]
The formula for $A_c$ follows from $(20.24) - \Gamma\cdot(20.23)$ and some algebra.

3. Equations for $X$, $Y$, $\hat X$, and $\hat Y$: equations (20.12) and (20.13) follow by substituting $A_c$, $B_c$, and $C_c$ into (20.20) and (20.23) and by using the fact that $\hat X = \tau^*\hat X$ and $\hat Y = \tau\hat Y$. Finally, equations (20.10) and (20.11) follow from equations (20.19) and (20.22) with some tedious algebra. □


Remark 20.3 It is interesting to note that if the full order controller $n_c = n$ is considered, then we have $\tau = I$ and $\tau_\perp = 0$. In that case, equations (20.10) and (20.11) become the standard $H_2$ Riccati equations:
\[
0 = A^*X + XA - XB_2B_2^*X + C_1^*C_1
\]
\[
0 = AY + YA^* - YC_2^*C_2Y + B_1B_1^*.
\]
Moreover, there exist unique stabilizing solutions $X \geq 0$ and $Y \geq 0$ to these two Riccati equations such that $A - B_2B_2^*X$ and $A - YC_2^*C_2$ are stable. Using these facts, we get that equations (20.12) and (20.13) have unique solutions:
\[
\hat X = \int_0^\infty e^{(A - YC_2^*C_2)^*t}\,XB_2B_2^*X\,e^{(A - YC_2^*C_2)t}\,dt \geq 0
\]
\[
\hat Y = \int_0^\infty e^{(A - B_2B_2^*X)t}\,YC_2^*C_2Y\,e^{(A - B_2B_2^*X)^*t}\,dt \geq 0.
\]

It is a fact that $\hat X$ is nonsingular iff $(B_2^*X, A - YC_2^*C_2)$ is observable and that $\hat Y$ is nonsingular iff $(A - B_2B_2^*X, YC_2^*)$ is controllable or, equivalently, iff
\[
K_o := \begin{bmatrix} A - B_2B_2^*X - YC_2^*C_2 & YC_2^* \\ -B_2^*X & 0 \end{bmatrix}
\]
is controllable and observable. (Note that $K_o$ is known to be the optimal $H_2$ controller from Chapter 14.) Furthermore, if $\hat X$ and $\hat Y$ are nonsingular, we can indeed find $\Gamma$ and $\Lambda$ such that $\Gamma\Lambda^* = I_n$. In fact, in this case, $\Gamma$ and $\Lambda$ are both square and $\Lambda^* = \Gamma^{-1}$. Hence we have
\[
K = \begin{bmatrix} A_c & B_c \\ C_c & 0 \end{bmatrix}
= \begin{bmatrix} \Gamma(A - B_2B_2^*X - YC_2^*C_2)\Gamma^{-1} & \Gamma YC_2^* \\ -B_2^*X\Gamma^{-1} & 0 \end{bmatrix}
= \begin{bmatrix} A - B_2B_2^*X - YC_2^*C_2 & YC_2^* \\ -B_2^*X & 0 \end{bmatrix} = K_o,
\]
i.e., if $\hat X$ and $\hat Y$ are nonsingular or, equivalently, if the optimal controller $K_o$ is controllable and observable as we assumed, then Theorem 20.6 generates the optimal controller. However, in general, $K_o$ is not necessarily minimal; hence Theorem 20.6 will not be applicable.

It is possible to derive results similar to Theorem 20.6 without assuming the minimality of the optimal controller by using pseudo-inverses in the derivations, but that is, in general, much more complicated. An alternative solution to this dilemma is simply direct testing: if a given $n_c$ does not generate a controllable and observable controller, then lower $n_c$ and try again. ~

Remark 20.4 We should also note that although we have necessary conditions for a reduced order optimal controller, it is generally hard to solve these coupled equations, although some ad hoc homotopy algorithm might be used to find a local minimum. ~


Remark 20.5 This method can also be used to derive the $H_\infty$ results presented in the previous chapters; the interested reader should consult the references for details. It should be pointed out that this method suffers a severe deficiency: global results are hard to find. This is due to the facts that (a) only first-order necessary conditions can be relatively easily derived, and (b) the controller order must be fixed; hence even if a fixed-order optimal controller can be found, it may not be optimal over all stabilizing controllers. ~

20.3 Notes and References

Optimization using Lagrange multipliers can be found in any standard optimization textbook. In particular, the book by Hestenes [1975] covers the finite dimensional case, and the one by Luenberger [1969] covers both finite and infinite dimensional cases. The Lagrange multiplier method has been used extensively by Hyland and Bernstein [1984], Bernstein and Haddad [1989], and Skelton [1988] in control applications. Theorem 20.6 was originally shown in Hyland and Bernstein [1984].


21 Discrete Time Control

In this chapter we discuss discrete time Riccati equations and some of their applications in discrete time control. A simpler form of a Riccati equation is the so-called Lyapunov equation, so we will start from the solutions of a discrete Lyapunov equation, which are given in section 21.1. Section 21.2 presents the basic properties of a Riccati equation solution as well as the necessary and sufficient conditions for the existence of a solution to the Riccati equation related to the LQR problem. Various characterizations of a bounded real transfer matrix are presented in section 21.3; the key is the relationship between the existence of a solution to a Riccati equation and the norm bound of a stable transfer matrix. Section 21.4 collects some useful matrix function factorizations and characterizations. In particular, state space criteria are stated (mostly without proof) for a transfer matrix to be inner, and for the existence of coprime factorizations, inner-outer factorizations, normalized coprime factorizations, and spectral factorizations. The discrete time $H_2$ optimal control problem will be considered briefly in section 21.5. Finally, discrete time balanced model reduction is considered in sections 21.6 and 21.7.

21.1 Discrete Lyapunov Equations

Let $A$, $B$, and $Q$ be real matrices with appropriate dimensions, and consider the following linear equation:
\[
AXB - X + Q = 0. \tag{21.1}
\]

Lemma 21.1 Equation (21.1) has a unique solution if and only if $\lambda_i(A)\lambda_j(B) \neq 1$ for all $i, j$.


Proof. Analogous to the continuous case. □

Remark 21.1 If $\lambda_i(A)\lambda_j(B) = 1$ for some $i, j$, then equation (21.1) has either no solution or more than one solution, depending on the specific data given. If $B = A^*$ and $Q = Q^*$, then the equation is called the discrete Lyapunov equation. ~

The following results are analogous to the corresponding continuous time cases, so they will be stated without proof.

Lemma 21.2 Let $Q$ be a symmetric matrix and consider the following Lyapunov equation:
\[
AXA^* - X + Q = 0.
\]

1. Suppose that $A$ is stable; then the following statements hold:

(a) $X = \sum_{i=0}^{\infty} A^iQ(A^*)^i$ and $X \geq 0$ if $Q \geq 0$.

(b) If $Q \geq 0$, then $(Q, A)$ is observable iff $X > 0$.

2. Suppose that $X$ is the solution of the Lyapunov equation; then

(a) $|\lambda_i(A)| \leq 1$ if $X > 0$ and $Q \geq 0$.

(b) $A$ is stable if $X \geq 0$, $Q \geq 0$ and $(Q, A)$ is detectable.
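A minimal numerical sketch of Lemma 21.2 with hypothetical data: `scipy.linalg.solve_discrete_lyapunov` solves $AXA^* - X + Q = 0$ directly, and the series of part 1(a) converges since $A$ is stable.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

# Hypothetical data: A stable (eigenvalues inside the unit disc), Q > 0
A = np.array([[0.5, 0.2], [0.0, -0.3]])
Q = np.array([[1.0, 0.0], [0.0, 2.0]])   # Q > 0, so (Q, A) is observable

X = solve_discrete_lyapunov(A, Q)        # solves A X A* - X + Q = 0

# truncated series of part 1(a); the tail is negligible since rho(A) < 1
X_series = np.zeros((2, 2))
Ai = np.eye(2)
for _ in range(200):
    X_series += Ai @ Q @ Ai.T
    Ai = A @ Ai
```

Since $(Q, A)$ is observable here, part 1(b) predicts $X > 0$, which the computation confirms.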

21.2 Discrete Riccati Equations

This section collects some basic results on discrete Riccati equations, so the presentation of this section and the sections to follow will be very much like the corresponding sections in Chapter 13. Just as the continuous time Riccati equations play essential roles in the continuous time $H_2$ and $H_\infty$ theories, the discrete time Riccati equations play essential roles in the discrete time $H_2$ and $H_\infty$ theories.

Let a matrix $S \in \mathbb{R}^{2n\times 2n}$ be partitioned into four $n \times n$ blocks as
\[
S := \begin{bmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{bmatrix},
\]
and let $J = \begin{bmatrix} 0 & -I \\ I & 0 \end{bmatrix} \in \mathbb{R}^{2n\times 2n}$; then $S$ is called symplectic if $J^{-1}S^*J = S^{-1}$. A symplectic matrix has no eigenvalues at the origin and, furthermore, it is easy to see that if $\lambda$ is an eigenvalue of a symplectic matrix $S$, then $\bar\lambda$, $1/\lambda$, and $1/\bar\lambda$ are also eigenvalues of $S$.

Let $A$, $Q$, and $G$ be real $n \times n$ matrices with $Q$ and $G$ symmetric and $A$ nonsingular. Define the $2n \times 2n$ matrix
\[
S := \begin{bmatrix} A + G(A^*)^{-1}Q & -G(A^*)^{-1} \\ -(A^*)^{-1}Q & (A^*)^{-1} \end{bmatrix}.
\]


Then $S$ is a symplectic matrix. Assume that $S$ has no eigenvalues on the unit circle. Then it must have $n$ eigenvalues in $|z| < 1$ and $n$ in $|z| > 1$. Consider the two $n$-dimensional spectral subspaces $X_-(S)$ and $X_+(S)$: the former is the invariant subspace corresponding to the eigenvalues in $|z| < 1$, and the latter corresponds to the eigenvalues in $|z| > 1$. After finding a basis for $X_-(S)$, stacking the basis vectors up to form a matrix, and partitioning the matrix, we get
\[
X_-(S) = \mathrm{Im}\begin{bmatrix} T_1 \\ T_2 \end{bmatrix}
\]
where $T_1, T_2 \in \mathbb{R}^{n\times n}$. If $T_1$ is nonsingular or, equivalently, if the two subspaces
\[
X_-(S), \qquad \mathrm{Im}\begin{bmatrix} 0 \\ I \end{bmatrix}
\]
are complementary, we can set $X := T_2T_1^{-1}$. Then $X$ is uniquely determined by $S$, i.e., $S \mapsto X$ is a function, which will be denoted $\mathrm{Ric}$; thus $X = \mathrm{Ric}(S)$. As in the continuous time case, we make the following definition.

Definition 21.1 The domain of $\mathrm{Ric}$, denoted by $\mathrm{dom}(\mathrm{Ric})$, consists of all $2n \times 2n$ symplectic matrices $S$ such that $S$ has no eigenvalues on the unit circle and the two subspaces $X_-(S)$ and $\mathrm{Im}\begin{bmatrix} 0 \\ I \end{bmatrix}$ are complementary.

Theorem 21.3 Suppose $S \in \mathrm{dom}(\mathrm{Ric})$ and $X = \mathrm{Ric}(S)$. Then

(a) $X$ is unique and symmetric;

(b) $I + XG$ is invertible and $X$ satisfies the algebraic Riccati equation
\[
A^*XA - X - A^*XG(I + XG)^{-1}XA + Q = 0; \tag{21.2}
\]

(c) $A - G(I + XG)^{-1}XA = (I + GX)^{-1}A$ is stable.

Note that the discrete Riccati equation in (21.2) can also be written as
\[
A^*(I + XG)^{-1}XA - X + Q = 0.
\]
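The construction of $\mathrm{Ric}(S)$ can be carried out numerically from an eigendecomposition of $S$. A sketch with hypothetical $A$, $G$, $Q$ (ours, chosen so that $(A, G)$ is stabilizable and $(Q, A)$ has no unobservable modes on the unit circle, hence $S \in \mathrm{dom}(\mathrm{Ric})$ by the results below):

```python
import numpy as np

# Hypothetical data: A nonsingular, G > 0, Q > 0
A = np.array([[1.2, 0.0], [0.3, 0.8]])
G = np.array([[1.0, 0.0], [0.0, 0.5]])
Q = np.eye(2)
n = 2

Ait = np.linalg.inv(A.T)                         # (A*)^{-1}
S = np.block([[A + G @ Ait @ Q, -G @ Ait],
              [-Ait @ Q,        Ait]])

# stable spectral subspace X_-(S): eigenvectors for |z| < 1
w, V = np.linalg.eig(S)
T = V[:, np.abs(w) < 1]
X = np.real(T[n:, :] @ np.linalg.inv(T[:n, :]))  # X = Ric(S) = T2 T1^{-1}

Acl = np.linalg.solve(np.eye(n) + G @ X, A)      # (I + GX)^{-1} A
res = (A.T @ X @ A - X
       - A.T @ X @ G @ np.linalg.solve(np.eye(n) + X @ G, X @ A) + Q)
```

The residual of (21.2) is numerically zero, $X$ is symmetric, and $(I + GX)^{-1}A$ has spectral radius below one, as Theorem 21.3 asserts.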

Remark 21.2 In the case that $A$ is singular, all results presented in this chapter will still be true if the eigenvalue problem for $S$ is replaced by the following generalized eigenvalue problem:
\[
\lambda\begin{bmatrix} I & G \\ 0 & A^* \end{bmatrix} - \begin{bmatrix} A & 0 \\ -Q & I \end{bmatrix}
\]
and $X_-(S)$ is taken to be the subspace spanned by the generalized principal vectors corresponding to those generalized eigenvalues in $|z| < 1$. Here the generalized principal vectors corresponding to a generalized eigenvalue $\lambda$ refer to the set of vectors $\{x_1, \ldots, x_k\}$ satisfying
\[
\begin{bmatrix} A & 0 \\ -Q & I \end{bmatrix}x_1 = \lambda\begin{bmatrix} I & G \\ 0 & A^* \end{bmatrix}x_1
\]
\[
\left(\begin{bmatrix} A & 0 \\ -Q & I \end{bmatrix} - \lambda\begin{bmatrix} I & G \\ 0 & A^* \end{bmatrix}\right)x_i = \begin{bmatrix} I & G \\ 0 & A^* \end{bmatrix}x_{i-1}, \qquad i = 2, \ldots, k.
\]
See Dooren [1981] and Arnold and Laub [1984] for details. ~

Proof. (a): Since $S \in \mathrm{dom}(\mathrm{Ric})$, there exists $T = \begin{bmatrix} T_1 \\ T_2 \end{bmatrix}$ such that
\[
X_-(S) = \mathrm{Im}\begin{bmatrix} T_1 \\ T_2 \end{bmatrix}
\]
and $T_1$ is invertible. Let $X := T_2T_1^{-1}$; then
\[
X_-(S) = \mathrm{Im}\begin{bmatrix} T_1 \\ T_2 \end{bmatrix} = \mathrm{Im}\left(\begin{bmatrix} I \\ X \end{bmatrix}T_1\right) = \mathrm{Im}\begin{bmatrix} I \\ X \end{bmatrix}.
\]
Obviously, $X$ is unique since
\[
\mathrm{Im}\begin{bmatrix} I \\ X_1 \end{bmatrix} = \mathrm{Im}\begin{bmatrix} I \\ X_2 \end{bmatrix}
\]
iff $X_1 = X_2$. Now let us show that $X$ is symmetric. Since
\[
XT_1 = T_2, \tag{21.3}
\]
pre-multiplying by $T_1^*$ gives
\[
T_1^*XT_1 = T_1^*T_2.
\]
We only need to show that $T_1^*T_2$ is symmetric. Since $X_-(S)$ is a stable invariant subspace, there is a stable $n \times n$ matrix $S_-$ such that
\[
ST = TS_-. \tag{21.4}
\]
Using (21.4) and noting that $S^*JS = J$, we get
\[
S_-^*(T^*JT)S_- = (TS_-)^*J(TS_-) = (ST)^*J(ST) = T^*(S^*JS)T = T^*JT,
\]
i.e.,
\[
S_-^*(T^*JT)S_- - T^*JT = 0.
\]
This is a Lyapunov equation and $S_-$ is stable. Hence there is a unique solution,
\[
T^*JT = 0,
\]
i.e.,
\[
T_2^*T_1 = T_1^*T_2.
\]

Thus we have $X = X^*$ since $T_1$ is nonsingular.

(b): To show that $X$ is a solution of the Riccati equation, we note that equation (21.4) can be written as
\[
S\begin{bmatrix} I \\ X \end{bmatrix} = \begin{bmatrix} I \\ X \end{bmatrix}T_1S_-T_1^{-1}. \tag{21.5}
\]
Pre-multiplying (21.5) by $\begin{bmatrix} -X & I \end{bmatrix}$ gives
\[
\begin{bmatrix} -X & I \end{bmatrix}S\begin{bmatrix} I \\ X \end{bmatrix} = 0.
\]
Equivalently, we get
\[
-XA + (I + XG)(A^*)^{-1}(X - Q) = 0. \tag{21.6}
\]
We now show that $I + XG$ is invertible. Suppose that $I + XG$ is not invertible; then there exists $0 \neq v \in \mathbb{R}^n$ such that
\[
v^*(I + XG) = 0. \tag{21.7}
\]
Pre-multiplying (21.6) by $v^*$ gives $v^*XA = 0$. Since $A$ is assumed to be invertible, it follows that $v^*X = 0$. This, in turn, implies $v = 0$ by (21.7), a contradiction. Hence $I + XG$ is invertible and equation (21.6) can be written as
\[
A^*(I + XG)^{-1}XA - X + Q = 0.
\]
This is the Riccati equation (21.2).

(c): To show that $X$ is the stabilizing solution, pre-multiply (21.5) by $\begin{bmatrix} I & 0 \end{bmatrix}$ to get
\[
A + G(A^*)^{-1}Q - G(A^*)^{-1}X = T_1S_-T_1^{-1}.
\]
The above equation can be simplified using the Riccati equation to get
\[
(I + GX)^{-1}A = T_1S_-T_1^{-1},
\]
which is stable. □


Remark 21.3 Let $X_+(S) = \mathrm{Im}\begin{bmatrix} \tilde T_1 \\ \tilde T_2 \end{bmatrix}$ and suppose that $\tilde T_1$ is nonsingular. Then the Riccati equation has an anti-stabilizing solution $\tilde X = \tilde T_2\tilde T_1^{-1}$ such that $(I + G\tilde X)^{-1}A$ is antistable. ~

Lemma 21.4 Suppose that $G$ and $Q$ are positive semi-definite and that $S$ has no eigenvalues on the unit circle. Let $X_-(S) = \mathrm{Im}\begin{bmatrix} T_1 \\ T_2 \end{bmatrix}$; then $T_1^*T_2 \geq 0$.

Proof. Let $T = \begin{bmatrix} T_1 \\ T_2 \end{bmatrix}$ and $S_-$ be such that
\[
ST = TS_-, \tag{21.8}
\]
where $S_-$ has all eigenvalues inside the unit disc. Let
\[
U_k = TS_-^k, \qquad k = 0, 1, \ldots.
\]
Then $U_{k+1} = SU_k$ with $U_0 = T$. Defining $V = \begin{bmatrix} 0 & I \\ 0 & 0 \end{bmatrix}$, we get $T_1^*T_2 = U_0^*VU_0$. Further define
\[
Y_k := -U_k^*VU_k + U_0^*VU_0
= -\sum_{i=0}^{k-1}\left(U_{i+1}^*VU_{i+1} - U_i^*VU_i\right)
= -\sum_{i=0}^{k-1} U_i^*(S^*VS - V)U_i.
\]
Now
\[
S^*VS - V = \begin{bmatrix} I & -Q \\ 0 & I \end{bmatrix}
\begin{bmatrix} -Q & 0 \\ 0 & -A^{-1}G(A^*)^{-1} \end{bmatrix}
\begin{bmatrix} I & 0 \\ -Q & I \end{bmatrix} \leq 0
\]
since $G$ and $Q$ are assumed to be positive semi-definite. So $Y_k \geq 0$ for all $k \geq 0$. Note that $U_k \to 0$ as $k \to \infty$ since $S_-$ has all eigenvalues inside the unit disc. Therefore, $T_1^*T_2 = \lim_{k\to\infty} Y_k \geq 0$. □

Lemma 21.5 Suppose that $G$ and $Q$ are positive semi-definite. Then $S \in \mathrm{dom}(\mathrm{Ric})$ iff $(A, G)$ is stabilizable and $S$ has no eigenvalues on the unit circle.


Proof. The necessity part is obvious. We now show that the stabilizability of $(A, G)$ and $S$ having no eigenvalues on the unit circle are, in fact, sufficient. To show this, we only need to show that $T_1$ is nonsingular, i.e., $\mathrm{Ker}\,T_1 = 0$. First, it is claimed that $\mathrm{Ker}\,T_1$ is $S_-$-invariant. To prove this, let $x \in \mathrm{Ker}\,T_1$. Rewrite (21.8) as
\[
(A + G(A^*)^{-1}Q)T_1 - G(A^*)^{-1}T_2 = T_1S_- \tag{21.9}
\]
\[
-(A^*)^{-1}QT_1 + (A^*)^{-1}T_2 = T_2S_-. \tag{21.10}
\]
Substitute (21.10) into (21.9) to get
\[
AT_1 - GT_2S_- = T_1S_-, \tag{21.11}
\]
and pre-multiply the above equation by $x^*S_-^*T_2^*$ and post-multiply by $x$ to get
\[
-x^*S_-^*T_2^*GT_2S_-x = x^*S_-^*T_2^*T_1S_-x.
\]
Since $T_2^*T_1 \geq 0$ by Lemma 21.4, we have
\[
GT_2S_-x = 0.
\]
This in turn implies that $T_1S_-x = 0$ from (21.11). Hence $\mathrm{Ker}\,T_1$ is invariant under $S_-$.

Now, to prove that $T_1$ is nonsingular, suppose on the contrary that $\mathrm{Ker}\,T_1 \neq 0$. Then $S_-|_{\mathrm{Ker}\,T_1}$ has an eigenvalue, $\mu$, and a corresponding eigenvector, $x$:
\[
S_-x = \mu x, \tag{21.12}
\]
\[
|\mu| < 1, \qquad 0 \neq x \in \mathrm{Ker}\,T_1.
\]
Post-multiply (21.10) by $x$ to get
\[
(A^*)^{-1}T_2x = T_2S_-x = \mu T_2x.
\]
Now if $T_2x \neq 0$, then $1/\bar\mu$ is an eigenvalue of $A$ and, furthermore, $0 = GT_2S_-x = \mu GT_2x$, so $GT_2x = 0$, which contradicts the stabilizability assumption on $(A, G)$. Hence we must have $T_2x = 0$. But this would imply $\begin{bmatrix} T_1 \\ T_2 \end{bmatrix}x = 0$, which is impossible. □

Lemma 21.6 Let $G$ and $Q$ be positive semi-definite matrices and
\[
S = \begin{bmatrix} A + G(A^*)^{-1}Q & -G(A^*)^{-1} \\ -(A^*)^{-1}Q & (A^*)^{-1} \end{bmatrix}.
\]
Then $S$ has no eigenvalues on the unit circle iff $(A, G)$ has no uncontrollable modes and $(Q, A)$ has no unobservable modes on the unit circle.


Proof. ($\Leftarrow$) Suppose, on the contrary, that $S$ has an eigenvalue $e^{j\theta}$. Then
\[
S\begin{bmatrix} x \\ y \end{bmatrix} = e^{j\theta}\begin{bmatrix} x \\ y \end{bmatrix}
\]
for some $\begin{bmatrix} x \\ y \end{bmatrix} \neq 0$, i.e.,
\[
(A + G(A^*)^{-1}Q)x - G(A^*)^{-1}y = e^{j\theta}x
\]
\[
-(A^*)^{-1}Qx + (A^*)^{-1}y = e^{j\theta}y.
\]
Multiplying the second equation by $G$ and adding it to the first one gives
\[
Ax - e^{j\theta}Gy = e^{j\theta}x \tag{21.13}
\]
\[
-Qx + y = e^{j\theta}A^*y. \tag{21.14}
\]
Pre-multiplying equation (21.13) by $e^{-j\theta}y^*$ and equation (21.14) by $x^*$ yields
\[
e^{-j\theta}y^*Ax = y^*Gy + y^*x
\]
\[
-x^*Qx + x^*y = e^{j\theta}x^*A^*y.
\]
Thus
\[
-y^*Gy - x^*Qx = 0.
\]
It follows that
\[
y^*G = 0, \qquad Qx = 0.
\]
Substituting these relationships into (21.13) and (21.14) gives
\[
Ax = e^{j\theta}x
\]
\[
e^{j\theta}A^*y = y.
\]
Since $x$ and $y$ cannot be zero simultaneously, $e^{j\theta}$ is either an unobservable mode of $(Q, A)$ or an uncontrollable mode of $(A, G)$, a contradiction.

($\Rightarrow$) Suppose that $S$ has no eigenvalues on the unit circle but $e^{j\theta}$ is an unobservable mode of $(Q, A)$ and $x$ is a corresponding eigenvector. Then it is easy to verify that
\[
S\begin{bmatrix} x \\ 0 \end{bmatrix} = e^{j\theta}\begin{bmatrix} x \\ 0 \end{bmatrix},
\]
so $e^{j\theta}$ is an eigenvalue of $S$, again a contradiction. The case of $(A, G)$ having an uncontrollable mode on the unit circle can be proven similarly. □


Theorem 21.7 Let $G$ and $Q$ be positive semi-definite matrices and
\[
S = \begin{bmatrix} A + G(A^*)^{-1}Q & -G(A^*)^{-1} \\ -(A^*)^{-1}Q & (A^*)^{-1} \end{bmatrix}.
\]
Then $S \in \mathrm{dom}(\mathrm{Ric})$ iff $(A, G)$ is stabilizable and $(Q, A)$ has no unobservable modes on the unit circle. Furthermore, $X = \mathrm{Ric}(S) \geq 0$ if $S \in \mathrm{dom}(\mathrm{Ric})$, and $X > 0$ if and only if $(Q, A)$ has no unobservable stable modes.

Proof. Let $Q = C^*C$ for some matrix $C$. The first half of the theorem follows from Lemmas 21.5 and 21.6. Now rewrite the discrete Riccati equation as
\[
A^*(I + XG)^{-1}X(I + GX)^{-1}A - X + A^*X(I + GX)^{-2}GXA + C^*C = 0 \tag{21.15}
\]
and note that, by definition, $(I + GX)^{-1}A$ is stable and $A^*X(I + GX)^{-2}GXA + C^*C \geq 0$. Thus $X \geq 0$ by the Lyapunov theorem. To characterize the kernel of $X$, suppose $x \in \mathrm{Ker}\,X$, pre-multiply (21.15) by $x^*$, and post-multiply by $x$ to get
\[
XAx = 0, \qquad Cx = 0. \tag{21.16}
\]
This implies that $\mathrm{Ker}\,X$ is an $A$-invariant subspace. If $\mathrm{Ker}\,X \neq 0$, then there is an $0 \neq x \in \mathrm{Ker}\,X$, with $Cx = 0$, such that $Ax = \lambda x$. But for $x \in \mathrm{Ker}\,X$, $(I + GX)^{-1}Ax = \lambda(I + GX)^{-1}x = \lambda x$, so $|\lambda| < 1$ since $(I + GX)^{-1}A$ is stable. Thus $\lambda$ is an unobservable stable mode of $(Q, A)$.

On the other hand, suppose that $|\lambda| < 1$ is a stable unobservable mode of $(Q, A)$. Then there exists an $x \in \mathbb{C}^n$ such that $Ax = \lambda x$ and $Cx = 0$; performing the same pre- and post-multiplications on (21.15) as before gives
\[
|\lambda|^2x^*(XG + I)^{-1}Xx - x^*Xx = 0.
\]
This can be rewritten as
\[
x^*X^{1/2}\left[|\lambda|^2(I + X^{1/2}GX^{1/2})^{-1} - I\right]X^{1/2}x = 0.
\]
Now $X \geq 0$, $G \geq 0$, and $|\lambda| < 1$ imply that $|\lambda|^2(I + X^{1/2}GX^{1/2})^{-1} - I < 0$. Hence $Xx = 0$, i.e., $X$ is singular. □

Lemma 21.8 Suppose that $D$ has full column rank and let $R = D^*D > 0$; then the following statements are equivalent:

(i) $\begin{bmatrix} A - e^{j\theta}I & B \\ C & D \end{bmatrix}$ has full column rank for all $\theta \in [0, 2\pi]$.

(ii) $\left((I - DR^{-1}D^*)C,\; A - BR^{-1}D^*C\right)$ has no unobservable modes on the unit circle or, equivalently, $(D_\perp^*C, A - BR^{-1}D^*C)$ has no unobservable modes on the unit circle.


Proof. Suppose that $e^{j\theta}$ is an unobservable mode of $\left((I - DR^{-1}D^*)C,\; A - BR^{-1}D^*C\right)$; then there is an $x \neq 0$ such that
\[
(A - BR^{-1}D^*C)x = e^{j\theta}x, \qquad (I - DR^{-1}D^*)Cx = 0,
\]
i.e.,
\[
\begin{bmatrix} A - e^{j\theta}I & B \\ C & D \end{bmatrix}
\begin{bmatrix} I & 0 \\ -R^{-1}D^*C & I \end{bmatrix}
\begin{bmatrix} x \\ 0 \end{bmatrix} = 0.
\]
But this implies that
\[
\begin{bmatrix} A - e^{j\theta}I & B \\ C & D \end{bmatrix} \tag{21.17}
\]
does not have full column rank. Conversely, suppose that (21.17) does not have full column rank for some $\theta$; then there exists $\begin{bmatrix} u \\ v \end{bmatrix} \neq 0$ such that
\[
\begin{bmatrix} A - e^{j\theta}I & B \\ C & D \end{bmatrix}\begin{bmatrix} u \\ v \end{bmatrix} = 0.
\]
Now let
\[
\begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} I & 0 \\ -R^{-1}D^*C & I \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}.
\]
Then
\[
\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} I & 0 \\ R^{-1}D^*C & I \end{bmatrix}\begin{bmatrix} u \\ v \end{bmatrix} \neq 0
\]
and
\[
(A - BR^{-1}D^*C - e^{j\theta}I)x + By = 0 \tag{21.18}
\]
\[
(I - DR^{-1}D^*)Cx + Dy = 0. \tag{21.19}
\]
Pre-multiplying (21.19) by $D^*$ gives $y = 0$. Then we have
\[
(A - BR^{-1}D^*C)x = e^{j\theta}x, \qquad (I - DR^{-1}D^*)Cx = 0,
\]
i.e., $e^{j\theta}$ is an unobservable mode of $\left((I - DR^{-1}D^*)C,\; A - BR^{-1}D^*C\right)$. □

Corollary 21.9 Suppose that $D$ has full column rank and denote $R = D^*D > 0$. Let $S$ have the form
\[
S = \begin{bmatrix} E + G(E^*)^{-1}Q & -G(E^*)^{-1} \\ -(E^*)^{-1}Q & (E^*)^{-1} \end{bmatrix}
\]
where $E = A - BR^{-1}D^*C$, $G = BR^{-1}B^*$, $Q = C^*(I - DR^{-1}D^*)C$, and $E$ is assumed to be invertible. Then $S \in \mathrm{dom}(\mathrm{Ric})$ iff $(A, B)$ is stabilizable and $\begin{bmatrix} A - e^{j\theta}I & B \\ C & D \end{bmatrix}$ has full column rank for all $\theta \in [0, 2\pi]$. Furthermore, $X = \mathrm{Ric}(S) \geq 0$.


Note that the Riccati equation corresponding to the symplectic matrix in Corollary 21.9 is
\[
E^*XE - X - E^*XG(I + XG)^{-1}XE + Q = 0.
\]
This equation can also be written as
\[
A^*XA - X - (B^*XA + D^*C)^*(D^*D + B^*XB)^{-1}(B^*XA + D^*C) + C^*C = 0.
\]
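The second form is exactly the discrete algebraic Riccati equation solved by scipy's `solve_discrete_are`, with $q = C^*C$, $r = D^*D$, and cross-weight $s = C^*D$. A sketch with hypothetical data (ours; $D$ of full column rank, $(A, B)$ stabilizable):

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Hypothetical data (not from the text)
A = np.array([[0.9, 0.2], [0.0, 1.1]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
D = np.array([[0.0], [0.0], [1.0]])     # full column rank, R = D*D > 0

# solves A*XA - X - (A*XB + s)(r + B*XB)^{-1}(B*XA + s*) + q = 0
X = solve_discrete_are(A, B, C.T @ C, D.T @ D, s=C.T @ D)

# residual of the second Riccati form quoted above
M = B.T @ X @ A + D.T @ C
res = (A.T @ X @ A - X
       - M.T @ np.linalg.solve(D.T @ D + B.T @ X @ B, M)
       + C.T @ C)
```

The residual is numerically zero and $X = X^* \geq 0$, consistent with Corollary 21.9.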

21.3 Bounded Real Functions

Let a real rational transfer matrix be given by
\[
M(z) = \begin{bmatrix} A & B \\ C & D \end{bmatrix}
\]
where again $A$ is assumed to be nonsingular and the realization is assumed to have no uncontrollable and no unobservable modes on the unit circle. Note again that all results hold for the singular $A$ case with the same modification as in the last section. Define $M^\sim(z) := M^T(z^{-1})$. Then
\[
M^\sim(z) = \begin{bmatrix} (A^*)^{-1} & -(A^*)^{-1}C^* \\ B^*(A^*)^{-1} & D^* - B^*(A^*)^{-1}C^* \end{bmatrix}.
\]

Lemma 21.10 Let $M(z) = \begin{bmatrix} A & B \\ C & 0 \end{bmatrix} \in RL_\infty$ and let $S$ be the symplectic matrix defined by
\[
S := \begin{bmatrix} A - BB^*(A^*)^{-1}C^*C & BB^*(A^*)^{-1} \\ -(A^*)^{-1}C^*C & (A^*)^{-1} \end{bmatrix}.
\]
Then the following statements are equivalent:

(i) $\|M(z)\|_\infty < 1$;

(ii) $S$ has no eigenvalues on the unit circle and $\|C(I - A)^{-1}B\| < 1$.

Proof. It is easy to compute that
\[
[I - M^\sim(z)M(z)]^{-1} =
\left[\begin{array}{cc|c}
A - BB^*(A^*)^{-1}C^*C & -BB^*(A^*)^{-1} & B \\
(A^*)^{-1}C^*C & (A^*)^{-1} & 0 \\ \hline
-B^*(A^*)^{-1}C^*C & -B^*(A^*)^{-1} & I
\end{array}\right].
\]
It is claimed that $[I - M^\sim(z)M(z)]^{-1}$ has no uncontrollable and/or unobservable modes on the unit circle.


To show this, suppose that $\lambda = e^{j\theta}$ is an uncontrollable mode of $[I - M^\sim(z)M(z)]^{-1}$. Then there exists $q = \begin{bmatrix} q_1 \\ q_2 \end{bmatrix} \in \mathbb{C}^{2n}$ such that
\[
q^*\begin{bmatrix} A - BB^*(A^*)^{-1}C^*C & -BB^*(A^*)^{-1} \\ (A^*)^{-1}C^*C & (A^*)^{-1} \end{bmatrix} = e^{j\theta}q^*,
\qquad
q^*\begin{bmatrix} B \\ 0 \end{bmatrix} = 0.
\]
Hence $q_1^*B = 0$ and
\[
\begin{bmatrix} q_1^*A + q_2^*(A^*)^{-1}C^*C & q_2^*(A^*)^{-1} \end{bmatrix} = e^{j\theta}\begin{bmatrix} q_1^* & q_2^* \end{bmatrix}.
\]
There are two possibilities:

1. $q_2 \neq 0$. Then we have $q_2^*(A^*)^{-1} = e^{j\theta}q_2^*$, i.e., $Aq_2 = e^{-j\theta}q_2$. This implies $e^{-j\theta}$ is an eigenvalue of $A$, which is a contradiction since $M(z) \in RL_\infty$.

2. $q_2 = 0$. Then $q_1^*A = e^{j\theta}q_1^*$, which again implies that $M(z)$ has a mode on the unit circle if $q_1 \neq 0$, again a contradiction.

A similar proof works for observability; hence the claim is true. Now note that
\[
S = \begin{bmatrix} -I & \\ & I \end{bmatrix}
\begin{bmatrix} A - BB^*(A^*)^{-1}C^*C & -BB^*(A^*)^{-1} \\ (A^*)^{-1}C^*C & (A^*)^{-1} \end{bmatrix}
\begin{bmatrix} -I & \\ & I \end{bmatrix}.
\]

Hence we have in effect proven that $S$ has no eigenvalues on the unit circle iff $(I - M^\sim M)^{-1} \in RL_\infty$. So it is sufficient to show that
\[
\|M(z)\|_\infty < 1 \iff (I - M^\sim M)^{-1} \in RL_\infty \text{ and } \|M(1)\| < 1.
\]
It is obvious that the right hand side is necessary. To show that it is also sufficient, suppose $\|M(z)\|_\infty \geq 1$; then $\sigma_{\max}(M(e^{j\theta})) = 1$ for some $\theta \in [0, 2\pi]$, since $\sigma_{\max}(M(1)) < 1$ and $M(e^{j\theta})$ is continuous in $\theta$. This implies that $1$ is an eigenvalue of $M^*(e^{j\theta})M(e^{j\theta})$, so $I - M^\sim(z)M(z)$ is singular at $z = e^{j\theta}$. This contradicts $(I - M^\sim M)^{-1} \in RL_\infty$. □

In the above lemma, we have assumed that the transfer matrix is strictly proper. We shall now see how to handle the non-strictly proper case. For that purpose, we shall focus our attention on stable systems, and we shall give an example below to show why this restriction is sometimes necessary for the technique to work.

We first note that the $H_\infty$ norm of a stable system is defined as
\[
\|M(z)\|_\infty = \sup_{|z|\geq 1} \bar\sigma\left(M(z)\right).
\]
Then it is clear that $\|M(z)\|_\infty \geq \bar\sigma(M(\infty)) = \|D\|$. Thus, in particular, if $\|M(z)\|_\infty < 1$ then $I - D^*D > 0$.

On the other hand, if a function is only known to be in $RL_\infty$, the above condition may not hold if the system is not stable.


21.3. Bounded Real Functions 539

Example 21.1 Let $0 < \alpha < 1/2$ and let
\[
M_1(z) = \frac{z}{z - 1/\alpha} = \begin{bmatrix} 1/\alpha & 1/\alpha \\ 1 & 1 \end{bmatrix} \in \mathcal{RL}_\infty.
\]
Then $\|M_1(z)\|_\infty = \frac{\alpha}{1-\alpha} < 1$, but $1 - D^*D = 0$. In general, if $M \in \mathcal{RL}_\infty$ and $\|M\|_\infty < 1$, then $I - D^*D$ can be indefinite. ♦

Lemma 21.11 Let $M(z) = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \in \mathcal{RH}_\infty$. Then $\|M(z)\|_\infty < 1$ if and only if $N(z) \in \mathcal{RH}_\infty$ and $\|N(z)\|_\infty < 1$, where
\[
N(z) = \begin{bmatrix} A + B(I - D^*D)^{-1}D^*C & B(I - D^*D)^{-1/2} \\ (I - DD^*)^{-1/2}C & 0 \end{bmatrix}
=: \begin{bmatrix} E & \bar B \\ \bar C & 0 \end{bmatrix}.
\]

Proof. This is exactly the analogue of Corollary 17.4. □

Theorem 21.12 Let $M(z) = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \in \mathcal{RH}_\infty$ and define
\[
E := A + B(I - D^*D)^{-1}D^*C, \qquad
G := -B(I - D^*D)^{-1}B^*, \qquad
Q := C^*(I - DD^*)^{-1}C.
\]
Suppose that $E$ is nonsingular and define the symplectic matrix
\[
S := \begin{bmatrix} E + G(E^*)^{-1}Q & -G(E^*)^{-1} \\ -(E^*)^{-1}Q & (E^*)^{-1} \end{bmatrix}.
\]
Then the following statements are equivalent:

(a) $\|M(z)\|_\infty < 1$;

(b) $S$ has no eigenvalues on the unit circle and $\|C(I-A)^{-1}B + D\| < 1$;

(c) there exists $X \geq 0$ such that $I - D^*D - B^*XB > 0$,
\[
E^*XE - X - E^*XG(I + XG)^{-1}XE + Q = 0,
\]
and $(I + GX)^{-1}E$ is stable; moreover, $X > 0$ if $(C,A)$ is observable;

(d) there exists $X > 0$ such that $I - D^*D - B^*XB > 0$ and
\[
E^*XE - X - E^*XG(I + XG)^{-1}XE + Q < 0;
\]


(e) there exists $X > 0$ such that
\[
\begin{bmatrix} A & B \\ C & D \end{bmatrix}^*
\begin{bmatrix} X & 0 \\ 0 & I \end{bmatrix}
\begin{bmatrix} A & B \\ C & D \end{bmatrix}
- \begin{bmatrix} X & 0 \\ 0 & I \end{bmatrix} < 0;
\]

(f) there exists a nonsingular $T$ such that
\[
\left\| \begin{bmatrix} TAT^{-1} & TB \\ CT^{-1} & D \end{bmatrix} \right\|
= \left\| \begin{bmatrix} T & 0 \\ 0 & I \end{bmatrix}
\begin{bmatrix} A & B \\ C & D \end{bmatrix}
\begin{bmatrix} T & 0 \\ 0 & I \end{bmatrix}^{-1} \right\| < 1;
\]

(g) $\mu_{\mathbf\Delta}\!\left( \begin{bmatrix} A & B \\ C & D \end{bmatrix} \right) < 1$ with $\Delta \in \mathbf\Delta$ and
\[
\mathbf\Delta := \left\{ \begin{bmatrix} \delta_1 I_n & 0 \\ 0 & \Delta_2 \end{bmatrix} : \delta_1 \in \mathbb{C},\ \Delta_2 \in \mathbb{C}^{m \times p} \right\}
\subset \mathbb{C}^{(n+m)\times(n+p)}
\]
(assuming that $M(z)$ is a $p \times m$ matrix and $A$ is an $n \times n$ matrix).

Note that the Riccati equation in (c) can also be written as
\[
A^*XA - X + (B^*XA + D^*C)^*(I - D^*D - B^*XB)^{-1}(B^*XA + D^*C) + C^*C = 0.
\]

Proof. (a)⇔(b) follows from Lemma 21.10.

(a)⇒(g) follows from Theorem 11.7.

(g)⇒(f) follows from Theorem 11.5.

(f)⇒(e) follows by letting $X = T^*T$.

(e)⇒(d) follows from the Schur complement formula.

(d)⇒(c) can be shown in the same way as in the proof of Theorem 13.11.

(c)⇒(a): We shall give the proof only for the case $D = 0$; the case $D \neq 0$ can be transformed to the zero case by Lemma 21.11. Hence, in the following, $E = A$, $G = -BB^*$, and $Q = C^*C$.

Assume that (c) is satisfied by some $X \geq 0$ and consider the obvious relation, with $z := e^{j\theta}$,
\[
(z^{-1}I - A^*)X(zI - A) + (z^{-1}I - A^*)XA + A^*X(zI - A) = X - A^*XA
= C^*C + A^*XB(I - B^*XB)^{-1}B^*XA.
\]


The last equality is obtained by substituting the Riccati equation. Now pre-multiply the above equation by $B^*(z^{-1}I - A^*)^{-1}$ and post-multiply by $(zI - A)^{-1}B$ to get
\[
I - M^*(z^{-1})M(z) = W^*(z^{-1})W(z)
\]
where
\[
W(z) = \begin{bmatrix} A & B \\ (I - B^*XB)^{-1/2}B^*XA & -(I - B^*XB)^{1/2} \end{bmatrix}.
\]
Suppose $W(e^{j\theta})v = 0$ for some $\theta$ and $v \neq 0$; then $e^{j\theta}$ is a zero of $W(z)$. However, all the zeros of $W(z)$ are given by the eigenvalues of
\[
A + B(I - B^*XB)^{-1}B^*XA = (I - BB^*X)^{-1}A,
\]
which are all inside the unit circle. Hence $e^{j\theta}$ cannot be a zero of $W$.

Therefore, $I - M^*(e^{-j\theta})M(e^{j\theta}) > 0$ for all $\theta \in [0, 2\pi]$, i.e., $\|M\|_\infty < 1$. □

The following more general result can be proven easily by the same procedure as in the proof of (c)⇒(a).

Corollary 21.13 Let $M(z) = \begin{bmatrix} A & B \\ C & 0 \end{bmatrix} \in \mathcal{RL}_\infty$ and suppose there exists $X = X^*$ such that
\[
A^*XA - X + A^*XB(I - B^*XB)^{-1}B^*XA + C^*C = 0.
\]
Then
\[
I - M^*(z^{-1})M(z) = W^*(z^{-1})(I - B^*XB)W(z)
\]
where
\[
W(z) = \begin{bmatrix} A & B \\ (I - B^*XB)^{-1}B^*XA & -I \end{bmatrix}.
\]
Moreover, the following statements hold:

(1) if $I - B^*XB > 0$ ($< 0$), then $\|M(z)\|_\infty \leq 1$ ($\geq 1$);

(2) if $I - B^*XB > 0$ ($< 0$) and $|\lambda_i\{(I - BB^*X)^{-1}A\}| \neq 1$, then $\|M(z)\|_\infty < 1$ ($> 1$).

Remark 21.4 As in the continuous time case, the equivalence between (a) and (b) in Theorem 21.12 can be used to compute the $\mathcal{H}_\infty$ norm of a discrete time transfer matrix.

~
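As a concrete illustration of Remark 21.4, the sketch below checks condition (b) of Theorem 21.12 numerically and cross-checks it against a frequency sweep of the unit circle. It is a minimal Python/NumPy sketch; the plant data and tolerance are hypothetical illustrations, not taken from the text.

```python
import numpy as np

def hinf_less_than_one(A, B, C, D, tol=1e-8):
    # Theorem 21.12 (a)<=>(b): ||M||_inf < 1 iff the symplectic matrix S
    # has no unit-circle eigenvalues and ||C(I-A)^{-1}B + D|| < 1.
    n = A.shape[0]
    Im, Ip = np.eye(B.shape[1]), np.eye(C.shape[0])
    E = A + B @ np.linalg.solve(Im - D.T @ D, D.T @ C)
    G = -B @ np.linalg.solve(Im - D.T @ D, B.T)
    Q = C.T @ np.linalg.solve(Ip - D @ D.T, C)
    Ei = np.linalg.inv(E.T)                      # (E*)^{-1} for real data
    S = np.block([[E + G @ Ei @ Q, -G @ Ei],
                  [-Ei @ Q,         Ei    ]])
    no_unit_eigs = np.all(np.abs(np.abs(np.linalg.eigvals(S)) - 1.0) > tol)
    M1 = C @ np.linalg.solve(np.eye(n) - A, B) + D   # M evaluated at z = 1
    return bool(no_unit_eigs) and bool(np.linalg.norm(M1, 2) < 1.0)

# hypothetical stable example
A = np.array([[0.5]]); B = np.array([[1.0]])
C = np.array([[0.3]]); D = np.array([[0.1]])
ok = hinf_less_than_one(A, B, C, D)

# brute-force cross-check on a grid of the unit circle
grid = max(np.linalg.norm(C @ np.linalg.inv(z * np.eye(1) - A) @ B + D, 2)
           for z in np.exp(1j * np.linspace(0.0, 2*np.pi, 400)))
```

For this example `ok` is True and the gridded norm peaks at 0.7 (attained at z = 1), consistent with condition (a).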


21.4 Matrix Factorizations

21.4.1 Inner Functions

A transfer matrix $N(z)$ is called inner if $N(z)$ is stable and $N^\sim(z)N(z) = I$ for all $z = e^{j\theta}$. Note that if $N$ has no poles at the origin, then $N^\sim(e^{j\theta}) = N^*(e^{j\theta})$. A transfer matrix is called outer if all its transmission zeros are stable (i.e., inside the unit disc).

Lemma 21.14 Let $N = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$ and suppose that $X = X^*$ satisfies
\[
A^*XA - X + C^*C = 0.
\]
Then

(a) $D^*C + B^*XA = 0$ implies $N^\sim N = (D - CA^{-1}B)^*D$;

(b) $X \geq 0$, $(A,B)$ controllable, and $N^\sim N = (D - CA^{-1}B)^*D$ implies $D^*C + B^*XA = 0$.

Proof. The results follow by noting the following equality:
\[
N^\sim(z)N(z) = \begin{bmatrix}
A & 0 & B \\
(A^*)^{-1}C^*C & (A^*)^{-1} & (A^*)^{-1}C^*D \\
(D - CA^{-1}B)^*C & -B^*(A^*)^{-1} & (D - CA^{-1}B)^*D
\end{bmatrix}
= \begin{bmatrix}
A & 0 & B \\
0 & (A^*)^{-1} & XB + (A^*)^{-1}C^*D \\
D^*C + B^*XA & -B^*(A^*)^{-1} & (D - CA^{-1}B)^*D
\end{bmatrix}.
\]
□

The following corollary is a special case of this lemma; it gives necessary and sufficient conditions for a discrete inner transfer matrix in terms of its state-space representation.

Corollary 21.15 Suppose that $N(z) = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \in \mathcal{RH}_\infty$ is a controllable realization; then $N(z)$ is inner if and only if there exists a matrix $X = X^* \geq 0$ such that

(a) $A^*XA - X + C^*C = 0$

(b) $D^*C + B^*XA = 0$

(c) $(D - CA^{-1}B)^*D = D^*D + B^*XB = I$.
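The three conditions of Corollary 21.15 are easy to check numerically. Below is a small Python sketch (assuming SciPy for the discrete Lyapunov solver) that verifies them for a first-order all-pass realization; the data are a hypothetical illustration, not taken from the text.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

# hypothetical first-order all-pass: A = a, B = C = sqrt(1 - a^2), D = -a
a = 0.4
A = np.array([[a]])
B = np.array([[np.sqrt(1 - a**2)]])
C = np.array([[np.sqrt(1 - a**2)]])
D = np.array([[-a]])

# condition (a): A* X A - X + C* C = 0.  solve_discrete_lyapunov solves
# a x a^H - x + q = 0, so pass a = A^T to get the observability-type equation.
X = solve_discrete_lyapunov(A.T, C.T @ C)

cond_a = np.allclose(A.T @ X @ A - X + C.T @ C, 0.0)
cond_b = np.allclose(D.T @ C + B.T @ X @ A, 0.0)        # condition (b)
cond_c = np.allclose(D.T @ D + B.T @ X @ B, np.eye(1))  # condition (c)

# cross-check: |N(e^{j theta})| = 1 on the unit circle
mags = [abs((C @ np.linalg.inv(z * np.eye(1) - A) @ B + D).item())
        for z in np.exp(1j * np.linspace(0.0, 2*np.pi, 200))]
```

Here X = 1 and all three conditions hold, so N is inner; the gridded magnitudes are identically 1.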


The following alternative characterization of inner transfer matrices is often useful and insightful.

Corollary 21.16 Let $N(z) = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \in \mathcal{RH}_\infty$ and assume that there exists a nonsingular $T$ such that
\[
P = \begin{bmatrix} TAT^{-1} & TB \\ CT^{-1} & D \end{bmatrix} \quad \text{and} \quad P^*P = I.
\]
Then $N(z)$ is inner. Furthermore, if the realization of $N$ is minimal, then such a $T$ exists.

Proof. Rewrite $P^*P = I$ as
\[
\begin{bmatrix} A^* & C^* \\ B^* & D^* \end{bmatrix}
\begin{bmatrix} T^*T & 0 \\ 0 & I \end{bmatrix}
\begin{bmatrix} A & B \\ C & D \end{bmatrix}
= \begin{bmatrix} T^*T & 0 \\ 0 & I \end{bmatrix}.
\]
Let $X = T^*T$; then
\[
\begin{bmatrix} A^*XA - X + C^*C & A^*XB + C^*D \\ B^*XA + D^*C & B^*XB + D^*D - I \end{bmatrix} = 0.
\]
This is the desired equation, so $N$ is inner. On the other hand, if the realization is minimal, then $X > 0$, which implies that $T$ exists. □

In a similar manner, Corollary 21.15 can be used to derive the state-space representation of the complementary inner factor (CIF).

Lemma 21.17 Suppose that a $p \times m$ ($p > m$) transfer matrix $N(z)$ (minimal) is inner; then there exists a $p \times (p-m)$ CIF $N_\perp \in \mathcal{RH}_\infty$ such that the matrix $\begin{bmatrix} N & N_\perp \end{bmatrix}$ is square and inner. A particular realization is
\[
N_\perp(z) = \begin{bmatrix} A & Y \\ C & Z \end{bmatrix}
\]
where $Y$ and $Z$ satisfy
\[
A^*XY + C^*Z = 0 \qquad (21.20)
\]
\[
B^*XY + D^*Z = 0 \qquad (21.21)
\]
\[
Z^*Z + Y^*XY = I. \qquad (21.22)
\]


Proof. Note that
\[
\begin{bmatrix} N & N_\perp \end{bmatrix} = \begin{bmatrix} A & B & Y \\ C & D & Z \end{bmatrix}
\]
is inner. Now it is easy to prove the results by using the formulae in Corollary 21.15. □

21.4.2 Coprime Factorizations

Recall that two transfer matrices M(z); N(z) 2 RH1 are said to be right coprime if"M

N

#is left invertible in RH1 i.e., 9U; V 2 RH1 such that

U(z)N(z) + V (z)M(z) = I:

The left coprime is de�ned analogously. A plant G(z) 2 RL1 is said to have doublecoprime factorization if 9 a right coprime factorization G = NM�1, a left coprimefactorization G = ~M�1 ~N , and U; V; ~U; ~V 2 RH1 such that"

V U

� ~N ~M

# "M � ~U

N ~V

#= I: (21:23)

The state space formulae for discrete time transfer matrix coprime factorization are thesame as for the continuous time. They are given by the following theorem.

Theorem 21.18 Let $G(z) = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \in \mathcal{RL}_\infty$ be a stabilizable and detectable realization. Choose $F$ and $L$ such that $A + BF$ and $A + LC$ are both stable, and let $U, V, \tilde U, \tilde V, N, M, \tilde N$, and $\tilde M$ be given by
\[
\begin{bmatrix} M \\ N \end{bmatrix} := \begin{bmatrix} A + BF & BZ_r \\ F & Z_r \\ C + DF & DZ_r \end{bmatrix}
\qquad
\begin{bmatrix} \tilde U \\ \tilde V \end{bmatrix} := \begin{bmatrix} A + BF & LZ_l^{-1} \\ F & 0 \\ -(C + DF) & Z_l^{-1} \end{bmatrix}
\]
\[
\begin{bmatrix} \tilde M & \tilde N \end{bmatrix} := \begin{bmatrix} A + LC & L & B + LD \\ Z_lC & Z_l & Z_lD \end{bmatrix}
\qquad
\begin{bmatrix} U & V \end{bmatrix} := \begin{bmatrix} A + LC & L & -(B + LD) \\ Z_l^{-1}F & 0 & Z_l^{-1} \end{bmatrix}
\]
where $Z_r$ and $Z_l$ are any nonsingular matrices. Then $G = NM^{-1} = \tilde M^{-1}\tilde N$ are rcf and lcf, respectively, and (21.23) is satisfied.
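The formulae of Theorem 21.18 can be exercised by evaluating the eight factors at a few complex frequencies and checking the identity (21.23). The Python sketch below does this for a hypothetical scalar plant G(z) = 1/(z - 2), with F and L chosen so that A + BF = A + LC = 0.5 and Zr = Zl = 1; all numbers are illustrative assumptions.

```python
import numpy as np

A, B, C, D = 2.0, 1.0, 1.0, 0.0   # hypothetical plant G(z) = 1/(z - 2)
F, L = -1.5, -1.5                 # A + B*F = A + L*C = 0.5 (stable)
Zr, Zl = 1.0, 1.0

def tf(z, a, b, c, d):
    # scalar transfer function c (z - a)^{-1} b + d
    return c * b / (z - a) + d

def bezout_product(z):
    M  = tf(z, A + B*F, B*Zr,       F,         Zr)
    N  = tf(z, A + B*F, B*Zr,       C + D*F,   D*Zr)
    Ut = tf(z, A + B*F, L/Zl,       F,         0.0)      # \tilde{U}
    Vt = tf(z, A + B*F, L/Zl,      -(C + D*F), 1.0/Zl)   # \tilde{V}
    Mt = tf(z, A + L*C, L,          Zl*C,      Zl)       # \tilde{M}
    Nt = tf(z, A + L*C, B + L*D,    Zl*C,      Zl*D)     # \tilde{N}
    U  = tf(z, A + L*C, L,          F/Zl,      0.0)
    V  = tf(z, A + L*C, -(B + L*D), F/Zl,      1.0/Zl)
    left  = np.array([[V, U], [-Nt, Mt]])
    right = np.array([[M, -Ut], [N, Vt]])
    return left @ right

errs = [np.abs(bezout_product(z) - np.eye(2)).max()
        for z in (1.3 + 0.4j, -2.0, 0.9j, 3.7)]
```

Each product is the 2-by-2 identity to machine precision, and one can check in the same way that G = N M^{-1} = M̃^{-1} Ñ.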


Some coprime factorizations are particularly interesting; for example, the coprime factorization with an inner numerator. In the case $G(z) \in \mathcal{RH}_\infty$, this factorization yields an inner-outer factorization.

Theorem 21.19 Assume that $(A,B)$ is stabilizable, $\begin{bmatrix} A - e^{j\theta}I & B \\ C & D \end{bmatrix}$ has full column rank for all $\theta \in [0, 2\pi]$, and $D$ has full column rank. Then there exists a right coprime factorization $G = NM^{-1}$ such that $N$ is inner. Furthermore, a particular realization is given by
\[
\begin{bmatrix} M \\ N \end{bmatrix} := \begin{bmatrix} A + BF & BR^{-1/2} \\ F & R^{-1/2} \\ C + DF & DR^{-1/2} \end{bmatrix}
\]
where
\[
R = D^*D + B^*XB, \qquad F = -R^{-1}(B^*XA + D^*C)
\]
and $X = X^* \geq 0$ is the unique stabilizing solution of
\[
A_D^*X(I + B(D^*D)^{-1}B^*X)^{-1}A_D - X + C^*D_\perp D_\perp^*C = 0
\]
where $A_D := A - B(D^*D)^{-1}D^*C$.

Using Lemma 21.17, the complementary inner factor of $N$ in Theorem 21.19 can be obtained as
\[
N_\perp = \begin{bmatrix} A + BF & Y \\ C + DF & Z \end{bmatrix}
\]
where $Y$ and $Z$ satisfy
\[
A^*XY + C^*Z = 0, \qquad B^*XY + D^*Z = 0, \qquad Z^*Z + Y^*XY = I.
\]
Note that $Y$ and $Z$ are related to $F$ only implicitly, through $X$.

Remark 21.5 If $G(z) \in \mathcal{RH}_\infty$, then the denominator matrix $M$ in Theorem 21.19 is outer. Hence the factorization $G = N(M^{-1})$ is an inner-outer factorization. ~

Suppose that the system $G(z)$ is not stable; then a coprime factorization with an inner denominator can also be obtained by solving a special Riccati equation.


Theorem 21.20 Assume that $G(z) := \begin{bmatrix} A & B \\ C & D \end{bmatrix} \in \mathcal{RL}_\infty$ and that $(A,B)$ is stabilizable. Then there exists a right coprime factorization $G = NM^{-1}$ such that $M$ is inner if and only if $G$ has no poles on the unit circle. A particular realization is
\[
\begin{bmatrix} M \\ N \end{bmatrix} := \begin{bmatrix} A + BF & BR^{-1/2} \\ F & R^{-1/2} \\ C + DF & DR^{-1/2} \end{bmatrix}
\]
where
\[
R = I + B^*XB, \qquad F = -R^{-1}B^*XA
\]
and $X = X^* \geq 0$ is the unique stabilizing solution to
\[
A^*X(I + BB^*X)^{-1}A - X = 0.
\]

Another special coprime factorization is the normalized coprime factorization, which has found application in many control problems, such as model reduction, controller design, and gap metric characterization.

Recall that a right coprime factorization $G = NM^{-1}$ with $N, M \in \mathcal{RH}_\infty$ is called a normalized right coprime factorization if
\[
M^\sim M + N^\sim N = I,
\]
i.e., if $\begin{bmatrix} M \\ N \end{bmatrix}$ is inner. Similarly, an lcf $G = \tilde M^{-1}\tilde N$ is called a normalized left coprime factorization if $\begin{bmatrix} \tilde M & \tilde N \end{bmatrix}$ is co-inner. The following results then follow in the same way as in the continuous time case.

Theorem 21.21 Let a realization of $G$ be given by
\[
G = \begin{bmatrix} A & B \\ C & D \end{bmatrix}
\]
and define
\[
R = I + D^*D > 0, \qquad \tilde R = I + DD^* > 0.
\]

(a) Suppose that $(A,B)$ is stabilizable and that $(C,A)$ has no unobservable modes on the unit circle. Then there is a normalized right coprime factorization $G = NM^{-1}$,
\[
\begin{bmatrix} M \\ N \end{bmatrix} := \begin{bmatrix} A + BF & BZ^{-1/2} \\ F & Z^{-1/2} \\ C + DF & DZ^{-1/2} \end{bmatrix} \in \mathcal{RH}_\infty
\]
where
\[
Z = R + B^*XB, \qquad F = -Z^{-1}(B^*XA + D^*C)
\]
and $X = X^* \geq 0$ is the unique stabilizing solution of
\[
A_n^*X(I + BR^{-1}B^*X)^{-1}A_n - X + C^*\tilde R^{-1}C = 0
\]
where $A_n := A - BR^{-1}D^*C$.

(b) Suppose that $(C,A)$ is detectable and that $(A,B)$ has no uncontrollable modes on the unit circle. Then there is a normalized left coprime factorization $G = \tilde M^{-1}\tilde N$,
\[
\begin{bmatrix} \tilde M & \tilde N \end{bmatrix} := \begin{bmatrix} A + LC & L & B + LD \\ \tilde Z^{-1/2}C & \tilde Z^{-1/2} & \tilde Z^{-1/2}D \end{bmatrix}
\]
where
\[
\tilde Z = \tilde R + CYC^*, \qquad L = -(BD^* + AYC^*)\tilde Z^{-1}
\]
and $Y = Y^* \geq 0$ is the unique stabilizing solution of
\[
\tilde A_nY(I + C^*\tilde R^{-1}CY)^{-1}\tilde A_n^* - Y + BR^{-1}B^* = 0
\]
where $\tilde A_n := A - BD^*\tilde R^{-1}C$.

(c) The controllability Gramian $P$ and observability Gramian $Q$ of $\begin{bmatrix} M \\ N \end{bmatrix}$ are given by
\[
P = (I + YX)^{-1}Y, \qquad Q = X,
\]
while the controllability Gramian $\tilde P$ and observability Gramian $\tilde Q$ of $\begin{bmatrix} \tilde M & \tilde N \end{bmatrix}$ are given by
\[
\tilde P = Y, \qquad \tilde Q = (I + XY)^{-1}X.
\]

21.4.3 Spectral Factorizations

The following theorem gives a solution to a special class of spectral factorization problems.

Theorem 21.22 Assume $G(z) := \begin{bmatrix} A & B \\ C & D \end{bmatrix} \in \mathcal{RH}_\infty$ and $\gamma > \|G(z)\|_\infty$. Then there exists a transfer matrix $M \in \mathcal{RH}_\infty$ such that $M^\sim M = \gamma^2I - G^\sim G$ and $M^{-1} \in \mathcal{RH}_\infty$. A particular realization of $M$ is
\[
M(z) = \begin{bmatrix} A & B \\ -R^{1/2}F & R^{1/2} \end{bmatrix}
\]
where
\[
R_D = \gamma^2I - D^*D, \qquad R = R_D - B^*XB, \qquad F = (R_D - B^*XB)^{-1}(B^*XA + D^*C)
\]
and $X = X^* \geq 0$ is the stabilizing solution of
\[
A_s^*X(I - BR_D^{-1}B^*X)^{-1}A_s - X + C^*(I + DR_D^{-1}D^*)C = 0
\]
where $A_s := A + BR_D^{-1}D^*C$.

Similarly to the continuous time case, we have the following theorem.

Theorem 21.23 Let $G(z) := \begin{bmatrix} A & B \\ C & D \end{bmatrix} \in \mathcal{RH}_\infty$ with $D$ full row rank and $G(e^{j\theta})G^\sim(e^{j\theta}) > 0$ for all $\theta$. Then there exists a transfer matrix $M \in \mathcal{RH}_\infty$ such that $M^\sim M = GG^\sim$. A particular realization of $M$ is
\[
M(z) = \begin{bmatrix} A & B_W \\ C_W & D_W \end{bmatrix}
\]
where
\[
B_W = APC^* + BD^*, \qquad D_W^*D_W = DD^*, \qquad C_W = D_W(DD^*)^{-1}(C - B_W^*XA)
\]
and
\[
APA^* - P + BB^* = 0
\]
\[
A^*XA - X + (C - B_W^*XA)^*(DD^*)^{-1}(C - B_W^*XA) = 0.
\]

21.5 Discrete Time H2 Control

Consider the system described by the block diagram

[Block diagram: generalized plant G with exogenous input w, control input u, controlled output z, and measured output y; controller K is connected in feedback from y to u.]

The realization of the transfer matrix $G$ is taken to be of the form
\[
G(z) = \begin{bmatrix} A & B_1 & B_2 \\ C_1 & D_{11} & D_{12} \\ C_2 & D_{21} & 0 \end{bmatrix}
=: \begin{bmatrix} A & B \\ C & D \end{bmatrix}.
\]

The following assumptions are made:

(A1) $(A, B_2)$ is stabilizable and $(C_2, A)$ is detectable;

(A2) $D_{12}$ has full column rank with $\begin{bmatrix} D_{12} & D_\perp \end{bmatrix}$ unitary, and $D_{21}$ has full row rank with $\begin{bmatrix} D_{21} \\ \tilde D_\perp \end{bmatrix}$ unitary;

(A3) $\begin{bmatrix} A - e^{j\theta}I & B_2 \\ C_1 & D_{12} \end{bmatrix}$ has full column rank for all $\theta \in [0, 2\pi]$;

(A4) $\begin{bmatrix} A - e^{j\theta}I & B_1 \\ C_2 & D_{21} \end{bmatrix}$ has full row rank for all $\theta \in [0, 2\pi]$.

The problem considered in this section is to find an admissible controller $K$ that minimizes $\|T_{zw}\|_2$.

Denote
\[
A_x := A - B_2D_{12}^*C_1, \qquad A_y := A - B_1D_{21}^*C_2.
\]
Let $X_2 \geq 0$ and $Y_2 \geq 0$ be the stabilizing solutions of the Riccati equations
\[
A_x^*(I + X_2B_2B_2^*)^{-1}X_2A_x - X_2 + C_1^*D_\perp D_\perp^*C_1 = 0
\]
and
\[
A_y(I + Y_2C_2^*C_2)^{-1}Y_2A_y^* - Y_2 + B_1\tilde D_\perp^*\tilde D_\perp B_1^* = 0.
\]
Note that the stabilizing solutions exist by assumptions (A3) and (A4). Note also that if $A_x$ and $A_y$ are nonsingular, the solutions can be obtained through the following two symplectic matrices:
\[
H_2 := \begin{bmatrix}
A_x + B_2B_2^*(A_x^*)^{-1}C_1^*D_\perp D_\perp^*C_1 & -B_2B_2^*(A_x^*)^{-1} \\
-(A_x^*)^{-1}C_1^*D_\perp D_\perp^*C_1 & (A_x^*)^{-1}
\end{bmatrix}
\]
\[
J_2 := \begin{bmatrix}
A_y^* + C_2^*C_2A_y^{-1}B_1\tilde D_\perp^*\tilde D_\perp B_1^* & -C_2^*C_2A_y^{-1} \\
-A_y^{-1}B_1\tilde D_\perp^*\tilde D_\perp B_1^* & A_y^{-1}
\end{bmatrix}.
\]


Define
\[
R_b := I + B_2^*X_2B_2
\]
\[
F_2 := -(I + B_2^*X_2B_2)^{-1}(B_2^*X_2A + D_{12}^*C_1)
\]
\[
F_0 := -(I + B_2^*X_2B_2)^{-1}(B_2^*X_2B_1 + D_{12}^*D_{11})
\]
\[
L_2 := -(AY_2C_2^* + B_1D_{21}^*)(I + C_2Y_2C_2^*)^{-1}
\]
\[
L_0 := (F_2Y_2C_2^* + F_0D_{21}^*)(I + C_2Y_2C_2^*)^{-1}
\]
and
\[
A_{F_2} := A + B_2F_2, \qquad C_{1F_2} := C_1 + D_{12}F_2
\]
\[
A_{L_2} := A + L_2C_2, \qquad B_{1L_2} := B_1 + L_2D_{21}
\]
\[
A_2 := A + B_2F_2 + L_2C_2
\]
\[
G_c(z) := \begin{bmatrix} A_{F_2} & B_1 + B_2F_0 \\ C_{1F_2} & D_{11} + D_{12}F_0 \end{bmatrix}
\qquad
G_f(z) := \begin{bmatrix} A_{L_2} & B_{1L_2} \\ R_b^{1/2}(L_0C_2 - F_2) & R_b^{1/2}(L_0D_{21} - F_0) \end{bmatrix}.
\]

Theorem 21.24 The unique optimal controller is
\[
K_{opt}(z) := \begin{bmatrix} A_2 - B_2L_0C_2 & -(L_2 - B_2L_0) \\ F_2 - L_0C_2 & L_0 \end{bmatrix}.
\]
Moreover, $\min \|T_{zw}\|_2^2 = \|G_c\|_2^2 + \|G_f\|_2^2$.

Remark 21.6 Note that for a discrete time transfer matrix $G(z) = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \in \mathcal{RH}_2$, the $\mathcal{H}_2$ norm can be computed as
\[
\|G(z)\|_2^2 = \mathrm{Trace}\{D^*D + B^*L_oB\} = \mathrm{Trace}\{DD^* + CL_cC^*\}
\]
where $L_c$ and $L_o$ are the controllability and observability Gramians:
\[
AL_cA^* - L_c + BB^* = 0
\]
\[
A^*L_oA - L_o + C^*C = 0.
\]
Using these formulae, we can compute $\min \|T_{zw}\|_2^2$ by noting that $X_2$ and $Y_2$ satisfy
\[
A_{F_2}^*X_2A_{F_2} - X_2 + C_{1F_2}^*C_{1F_2} = 0
\]
\[
A_{L_2}Y_2A_{L_2}^* - Y_2 + B_{1L_2}B_{1L_2}^* = 0.
\]
For example,
\[
\|G_c\|_2^2 = \mathrm{Trace}\{(D_{11} + D_{12}F_0)^*(D_{11} + D_{12}F_0) + (B_1 + B_2F_0)^*X_2(B_1 + B_2F_0)\}
\]
and
\[
\|G_f\|_2^2 = \mathrm{Trace}\,R_b\{(L_0D_{21} - F_0)(L_0D_{21} - F_0)^* + (L_0C_2 - F_2)Y_2(L_0C_2 - F_2)^*\}.
\]

~
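The Gramian formula of Remark 21.6 is easy to exercise numerically: compute the observability Gramian with a discrete Lyapunov solver and compare against a truncated impulse-response sum. A small Python/SciPy sketch with hypothetical data:

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def h2_norm_sq(A, B, C, D):
    # ||G||_2^2 = Trace{D*D + B* Lo B},  with A* Lo A - Lo + C*C = 0
    Lo = solve_discrete_lyapunov(A.T, C.T @ C)
    return float(np.trace(D.T @ D + B.T @ Lo @ B))

A = np.array([[0.5, 0.1], [0.0, 0.3]])   # hypothetical stable system
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 0.5]])
D = np.array([[0.2]])

n2 = h2_norm_sq(A, B, C, D)

# cross-check: ||G||_2^2 = sum_k ||h_k||_F^2 with h_0 = D, h_k = C A^{k-1} B
acc, Ak = float((D.T @ D).item()), np.eye(2)
for _ in range(200):
    h = C @ Ak @ B
    acc += float((h.T @ h).item())
    Ak = Ak @ A

# the dual (controllability-Gramian) formula gives the same number
Lc = solve_discrete_lyapunov(A, B @ B.T)
n2_dual = float(np.trace(D @ D.T + C @ Lc @ C.T))
```

Both Gramian formulas and the impulse-response sum agree, which is exactly the content of the remark.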

Proof. Let $x$ denote the state of the system $G$, so that
\[
x(k+1) = Ax(k) + B_1w(k) + B_2u(k) \qquad (21.24)
\]
\[
z(k) = C_1x(k) + D_{11}w(k) + D_{12}u(k) \qquad (21.25)
\]
\[
y(k) = C_2x(k) + D_{21}w(k). \qquad (21.26)
\]
Define $\nu := u - F_2x - F_0w$; then the transfer function from $(w, \nu)$ to $z$ becomes
\[
z = \begin{bmatrix} A_{F_2} & B_1 + B_2F_0 & B_2 \\ C_{1F_2} & D_{11} + D_{12}F_0 & D_{12} \end{bmatrix}
\begin{bmatrix} w \\ \nu \end{bmatrix}
= G_cw + UR_b^{1/2}\nu
\]
where
\[
U(z) := \begin{bmatrix} A_{F_2} & B_2R_b^{-1/2} \\ C_{1F_2} & D_{12}R_b^{-1/2} \end{bmatrix}.
\]

It is easy to show that $U$ is inner and that $U^\sim G_c \in \mathcal{RH}_2^\perp$. Now denote the transfer function from $w$ to $\nu$ by $T_{\nu w}$. Then
\[
T_{zw} = G_c + UR_b^{1/2}T_{\nu w}
\]
and
\[
\|T_{zw}\|_2^2 = \|G_c\|_2^2 + \left\|R_b^{1/2}T_{\nu w}\right\|_2^2 \geq \|G_c\|_2^2
\]
for any given stabilizing controller $K$. Hence, if the state $x$ and the disturbance $w$ are both available for feedback (i.e., full information control) and $u = F_2x + F_0w$, then $T_{\nu w} = 0$ and $\|T_{zw}\|_2 = \|G_c\|_2$. Therefore, $u = F_2x + F_0w$ is an optimal full information control law. Note that $\nu = T_{\nu w}w$ with $T_{\nu w} = F_\ell(G_\nu, K)$, where $K$ is connected in feedback from $y$ to $u$ around
\[
G_\nu = \begin{bmatrix} A & B_1 & B_2 \\ -F_2 & -F_0 & I \\ C_2 & D_{21} & 0 \end{bmatrix}.
\]


Since $A - B_2(-F_2) = A + B_2F_2$ is stable, from the relationship between an output estimation (OE) problem and a full control (FC) problem, all admissible controllers for $G_\nu$ (hence for the output feedback problem) can be written as
\[
K = F_\ell(M_t, K_{FC}), \qquad
M_t = \begin{bmatrix}
A + B_2F_2 & 0 & \begin{bmatrix} I & -B_2 \end{bmatrix} \\
-F_2 & 0 & \begin{bmatrix} 0 & I \end{bmatrix} \\
C_2 & I & \begin{bmatrix} 0 & 0 \end{bmatrix}
\end{bmatrix}
\]
where $K_{FC}$ is an internally stabilizing controller for the full control system
\[
\bar G_\nu = \begin{bmatrix}
A & B_1 & \begin{bmatrix} I & 0 \end{bmatrix} \\
-F_2 & -F_0 & \begin{bmatrix} 0 & I \end{bmatrix} \\
C_2 & D_{21} & \begin{bmatrix} 0 & 0 \end{bmatrix}
\end{bmatrix}.
\]
Furthermore, $T_{\nu w} = F_\ell(G_\nu, F_\ell(M_t, K_{FC})) = F_\ell(\bar G_\nu, K_{FC})$. Hence
\[
\min_K \|T_{zw}\|_2^2 = \|G_c\|_2^2 + \min_{K_{FC}} \left\|R_b^{1/2}F_\ell(\bar G_\nu, K_{FC})\right\|_2^2.
\]

Next define
\[
\tilde G_\nu := \begin{bmatrix} R_b^{1/2} & 0 \\ 0 & I \end{bmatrix} \bar G_\nu
\begin{bmatrix} I & 0 & 0 \\ 0 & I & 0 \\ 0 & 0 & R_b^{-1/2} \end{bmatrix}
= \begin{bmatrix}
A & B_1 & \begin{bmatrix} I & 0 \end{bmatrix} \\
-R_b^{1/2}F_2 & -R_b^{1/2}F_0 & \begin{bmatrix} 0 & I \end{bmatrix} \\
C_2 & D_{21} & \begin{bmatrix} 0 & 0 \end{bmatrix}
\end{bmatrix}
\]
and
\[
\tilde K_{FC} := \begin{bmatrix} I & 0 \\ 0 & R_b^{1/2} \end{bmatrix} K_{FC}.
\]
Then it is easy to see that
\[
R_b^{1/2}F_\ell(\bar G_\nu, K_{FC}) = F_\ell(\tilde G_\nu, \tilde K_{FC})
\]
and
\[
\min_{K_{FC}} \left\|R_b^{1/2}F_\ell(\bar G_\nu, K_{FC})\right\|_2 = \min_{\tilde K_{FC}} \left\|F_\ell(\tilde G_\nu, \tilde K_{FC})\right\|_2.
\]


A controller minimizing $\|F_\ell(\tilde G_\nu, \tilde K_{FC})\|_2$ is $\tilde K_{FC} = \begin{bmatrix} L_2 \\ R_b^{1/2}L_0 \end{bmatrix}$, since the transpose (or dual) of $F_\ell(\tilde G_\nu, \tilde K_{FC})$ is a full information feedback problem of the kind considered at the beginning of the proof. Hence we get
\[
F_\ell\left(\tilde G_\nu, \begin{bmatrix} L_2 \\ R_b^{1/2}L_0 \end{bmatrix}\right) = G_f
\]
and
\[
\min_K \|T_{zw}\|_2^2 = \|G_c\|_2^2 + \|G_f\|_2^2.
\]
Finally, the optimal output feedback controller is given by
\[
K = F_\ell\left(M_t, \begin{bmatrix} L_2 \\ L_0 \end{bmatrix}\right)
= \begin{bmatrix} A + B_2F_2 + L_2C_2 - B_2L_0C_2 & L_2 - B_2L_0 \\ -F_2 + L_0C_2 & L_0 \end{bmatrix}.
\]
The proof of uniqueness is similar to the continuous time case and hence is omitted. □

It should be noted that, in contrast with the continuous time case, the discrete time full information optimal control law is not a state feedback even when $D_{11} = 0$.

The discrete time $\mathcal{H}_\infty$ control problem is much more involved, and it is probably more effective to obtain the discrete solution by using a bilinear transformation.

21.6 Discrete Balanced Model Reduction

In this section we show another application of the LFT machinery: discrete balanced model reduction, together with an elegant proof of the balanced truncation error bound.

Consider a stable discrete time system $G(z)$ and assume that the transfer matrix has the realization
\[
G(z) = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \in \mathcal{RH}_\infty.
\]
Let $P$ and $Q$ be two positive semidefinite symmetric matrices such that
\[
APA^* - P + BB^* \leq 0 \qquad (21.27)
\]
\[
A^*QA - Q + C^*C \leq 0. \qquad (21.28)
\]
Without loss of generality, we shall assume that
\[
P = Q = \begin{bmatrix} \Sigma_1 & 0 \\ 0 & \Sigma_2 \end{bmatrix}
\]
with
\[
\Sigma_1 = \mathrm{diag}(\sigma_1I_{s_1}, \sigma_2I_{s_2}, \ldots, \sigma_rI_{s_r}) \geq 0
\]
\[
\Sigma_2 = \mathrm{diag}(\sigma_{r+1}I_{s_{r+1}}, \sigma_{r+2}I_{s_{r+2}}, \ldots, \sigma_nI_{s_n}) \geq 0
\]
where $s_i$ denotes the multiplicity of $\sigma_i$. (Note that the singular values are not necessarily ordered.) Moreover, the realization of $G(z)$ is partitioned conformably with $P$ and $Q$:
\[
G(z) = \begin{bmatrix} A_{11} & A_{12} & B_1 \\ A_{21} & A_{22} & B_2 \\ C_1 & C_2 & D \end{bmatrix}.
\]

Theorem 21.25 Suppose $\Sigma_1 > 0$. Then $A_{11}$ is stable.

Proof. We shall first assume $\Sigma_2 > 0$. From equation (21.28), we have
\[
A_{11}^*\Sigma_1A_{11} - \Sigma_1 + A_{21}^*\Sigma_2A_{21} + C_1^*C_1 \leq 0. \qquad (21.29)
\]
Assume that $\lambda$ is an eigenvalue of $A_{11}$; then there is an $x \neq 0$ such that
\[
A_{11}x = \lambda x. \qquad (21.30)
\]
Now pre-multiply equation (21.29) by $x^*$ and post-multiply by $x$ to get
\[
(|\lambda|^2 - 1)x^*\Sigma_1x + x^*A_{21}^*\Sigma_2A_{21}x + x^*C_1^*C_1x \leq 0.
\]
It is clear that $|\lambda| \leq 1$. However, if $|\lambda| = 1$, say $\lambda = e^{j\theta}$ for some $\theta$, we have
\[
A_{21}x = 0, \qquad C_1x = 0.
\]
These equations together with equation (21.30) imply that
\[
\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}
\begin{bmatrix} x \\ 0 \end{bmatrix} = e^{j\theta}\begin{bmatrix} x \\ 0 \end{bmatrix},
\]
i.e., $e^{j\theta}$ is an eigenvalue of $A$. This contradicts the stability assumption on $A$, so $A_{11}$ is stable.

Now assume that $\Sigma_2$ is singular; we will show that all states corresponding to the zero singular values can be removed without changing the stability properties of the system. For that purpose, assume $\Sigma_2 = 0$. Then inequality (21.27) can be written as
\[
\begin{bmatrix}
A_{11}\Sigma_1A_{11}^* - \Sigma_1 + B_1B_1^* & A_{11}\Sigma_1A_{21}^* + B_1B_2^* \\
A_{21}\Sigma_1A_{11}^* + B_2B_1^* & A_{21}\Sigma_1A_{21}^* + B_2B_2^*
\end{bmatrix} \leq 0.
\]
This implies that
\[
A_{11}\Sigma_1A_{11}^* - \Sigma_1 + B_1B_1^* \leq 0 \qquad (21.31)
\]
and
\[
A_{21}\Sigma_1A_{21}^* + B_2B_2^* \leq 0. \qquad (21.32)
\]
But inequality (21.32) implies that
\[
A_{21} = 0, \qquad B_2 = 0.
\]
Hence we have
\[
A = \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix},
\]
and the stability of $A_{11}$ and $A_{22}$ is ensured.

Substituting this $A$ matrix into inequality (21.28), we obtain
\[
A_{11}^*\Sigma_1A_{11} - \Sigma_1 + C_1^*C_1 \leq 0.
\]
The subsystem with $A_{11}$ still satisfies inequalities (21.27) and (21.28) with $\Sigma_1 > 0$. This proves that we may assume $\Sigma_2 > 0$ without loss of generality. □

Remark 21.7 It is important to note that the realization of the truncated subsystem
\[
G_r = \begin{bmatrix} A_{11} & B_1 \\ C_1 & D \end{bmatrix}
\]
is still balanced in some sense¹, since the system parameters satisfy
\[
A_{11}\Sigma_1A_{11}^* - \Sigma_1 + A_{12}\Sigma_2A_{12}^* + B_1B_1^* \leq 0
\]
\[
A_{11}^*\Sigma_1A_{11} - \Sigma_1 + A_{21}^*\Sigma_2A_{21} + C_1^*C_1 \leq 0.
\]
But these inequalities imply that
\[
A_{11}\Sigma_1A_{11}^* - \Sigma_1 + B_1B_1^* \leq 0
\]
\[
A_{11}^*\Sigma_1A_{11} - \Sigma_1 + C_1^*C_1 \leq 0
\]
hold. ~

Theorem 21.26 Suppose $G_r = \begin{bmatrix} A_{11} & B_1 \\ C_1 & D \end{bmatrix}$. Then
\[
\|G - G_r\|_\infty \leq 2\sum_{i=r+1}^{n}\sigma_i.
\]
In particular,
\[
\|G\|_\infty \leq \|D\| + 2\sum_{i=1}^{n}\sigma_i.
\]

¹Balanced in the sense that the same inequalities as (21.27) and (21.28) are satisfied.
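The bound of Theorem 21.26 can be exercised numerically: balance the two Gramians with the standard square-root construction, truncate, and compare a gridded error norm against 2·Σσi. The Python/SciPy sketch below uses a hypothetical fourth-order system; the balancing routine is a common textbook construction, not the proof machinery of this section.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov, cholesky, svd

def balanced_realization(A, B, C):
    P = solve_discrete_lyapunov(A, B @ B.T)      # A P A* - P + B B* = 0
    Q = solve_discrete_lyapunov(A.T, C.T @ C)    # A* Q A - Q + C* C = 0
    Lp, Lq = cholesky(P, lower=True), cholesky(Q, lower=True)
    U, s, Vt = svd(Lq.T @ Lp)                    # s = Hankel singular values
    T = Lp @ Vt.T * s**-0.5                      # balancing transformation
    Tinv = (s**-0.5)[:, None] * (U.T @ Lq.T)
    return Tinv @ A @ T, Tinv @ B, C @ T, s

# hypothetical stable system with poles 0.1, 0.2, 0.3, 0.4 (companion form)
A = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [-0.0024, 0.05, -0.35, 1.0]])
B = np.array([[0.0], [0.0], [0.0], [1.0]])
C = np.array([[1.0, 1.0, 1.0, 1.0]])

Ab, Bb, Cb, s = balanced_realization(A, B, C)
r = 2
Ar, Br, Cr = Ab[:r, :r], Bb[:r], Cb[:, :r]       # truncated G_r (D unchanged)
bound = 2.0 * s[r:].sum()                        # 2 (sigma_3 + sigma_4)

def g(z, Aa, Bb_, Cc):
    return (Cc @ np.linalg.solve(z * np.eye(Aa.shape[0]) - Aa, Bb_)).item()

err = max(abs(g(z, A, B, C) - g(z, Ar, Br, Cr))
          for z in np.exp(1j * np.linspace(0.0, 2*np.pi, 600)))
```

Consistent with Theorem 21.25, the truncated A-matrix is stable, and the gridded error stays below the bound.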


Without loss of generality, we shall assume $\sigma_n = 1$. We will prove that for $\Sigma_2 = \sigma_nI = I$ we have
\[
\|G - G_r\|_\infty \leq 2, \qquad r = n - 1.
\]
The theorem then follows immediately by scaling and by applying this result recursively, since the reduced system $G_r$ is still balanced.

It will be seen that it is more convenient to set $\Gamma = \Sigma_1^{1/2}$. The proof of the theorem follows from the two lemmas below together with the bounded real lemma, which establishes the relationship between the $\mathcal{H}_\infty$ norm of a transfer matrix and its realizations. (Note that in the following a constant matrix $X$ is said to be contractive, or a contraction, if $\|X\| \leq 1$, and strictly contractive if $\|X\| < 1$.)

The lemma below shows that for any stable system there is a realization such that $\begin{bmatrix} A & B \end{bmatrix}$ is a contraction and, similarly, another realization such that $\begin{bmatrix} A \\ C \end{bmatrix}$ is a contraction.

Lemma 21.27 Suppose that a realization of the transfer matrix $G$ satisfies $P = Q = \mathrm{diag}\{\Gamma^2, I\}$; then
\[
\begin{bmatrix} \Gamma^{-1}A_{12} & \Gamma^{-1}A_{11}\Gamma & \Gamma^{-1}B_1 \\ A_{22} & A_{21}\Gamma & B_2 \end{bmatrix}
\quad \text{and} \quad
\begin{bmatrix} A_{21}\Gamma^{-1} & A_{22} \\ \Gamma A_{11}\Gamma^{-1} & \Gamma A_{12} \\ C_1\Gamma^{-1} & C_2 \end{bmatrix}
\]
are contractive.

Proof. Since $P = \begin{bmatrix} \Gamma^2 & 0 \\ 0 & I \end{bmatrix}$ satisfies inequality (21.27),
\[
\begin{bmatrix} A & B \end{bmatrix}
\begin{bmatrix} P & 0 \\ 0 & I \end{bmatrix}
\begin{bmatrix} A^* \\ B^* \end{bmatrix} \leq P,
\]
i.e., $\begin{bmatrix} P^{-1/2}AP^{1/2} & P^{-1/2}B \end{bmatrix}$ is a contraction. But
\[
\begin{bmatrix} P^{-1/2}AP^{1/2} & P^{-1/2}B \end{bmatrix}
= \begin{bmatrix} \Gamma^{-1}A_{12} & \Gamma^{-1}A_{11}\Gamma & \Gamma^{-1}B_1 \\ A_{22} & A_{21}\Gamma & B_2 \end{bmatrix}
\begin{bmatrix} 0 & I & 0 \\ I & 0 & 0 \\ 0 & 0 & I \end{bmatrix}.
\]
Hence
\[
\begin{bmatrix} \Gamma^{-1}A_{12} & \Gamma^{-1}A_{11}\Gamma & \Gamma^{-1}B_1 \\ A_{22} & A_{21}\Gamma & B_2 \end{bmatrix}
\]
is also a contraction. The other part follows by a similar argument. □

Lemma 21.28 Suppose that $X = \begin{bmatrix} X_{11} & X_{12} \\ Z & X_{22} \end{bmatrix}$ and $Y = \begin{bmatrix} Y_{11} & Z \\ Y_{21} & Y_{22} \end{bmatrix}$ are contractive (strictly contractive). Then
\[
M := \begin{bmatrix}
0 & \tfrac{1}{\sqrt 2}X_{11} & X_{12} \\
\tfrac{1}{\sqrt 2}Y_{11} & Z & \tfrac{1}{\sqrt 2}X_{22} \\
Y_{21} & \tfrac{1}{\sqrt 2}Y_{22} & 0
\end{bmatrix}
\]
is also contractive (strictly contractive).

Proof. Dilate $M$ to the matrix
\[
M_d := \begin{bmatrix}
0 & \tfrac{1}{\sqrt 2}X_{11} & X_{12} & \tfrac{1}{\sqrt 2}X_{11} \\
\tfrac{1}{\sqrt 2}Y_{11} & Z & \tfrac{1}{\sqrt 2}X_{22} & 0 \\
Y_{21} & \tfrac{1}{\sqrt 2}Y_{22} & 0 & -\tfrac{1}{\sqrt 2}Y_{22} \\
\tfrac{1}{\sqrt 2}Y_{11} & 0 & -\tfrac{1}{\sqrt 2}X_{22} & -Z
\end{bmatrix}.
\]
Since $X$ and $Y$ are contractive, one can easily verify that $M_d^*M_d \leq I$, i.e., $M_d$ is a contraction. □

We can now prove the theorem.

Proof of Theorem 21.26. Note that
\[
G_r = \begin{bmatrix} A_{11} & B_1 \\ C_1 & D \end{bmatrix}
= \begin{bmatrix} A_{11} & 0 & B_1 \\ 0 & 0 & 0 \\ C_1 & 0 & D \end{bmatrix}.
\]
Hence
\[
\frac{1}{2}(G - G_r) = \begin{bmatrix}
A_{11} & 0 & 0 & 0 & \tfrac{1}{\sqrt 2}B_1 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & A_{11} & A_{12} & \tfrac{1}{\sqrt 2}B_1 \\
0 & 0 & A_{21} & A_{22} & \tfrac{1}{\sqrt 2}B_2 \\
-\tfrac{1}{\sqrt 2}C_1 & 0 & \tfrac{1}{\sqrt 2}C_1 & \tfrac{1}{\sqrt 2}C_2 & 0
\end{bmatrix}.
\]
Now apply the similarity transformation
\[
T = \begin{bmatrix}
-\Gamma & 0 & \Gamma & 0 \\
0 & -I & 0 & I \\
\Gamma^{-1} & 0 & \Gamma^{-1} & 0 \\
0 & I & 0 & I
\end{bmatrix}
\]


to the realization of $\frac{1}{2}(G - G_r)$ to get
\[
\frac{1}{2}(G - G_r) = \begin{bmatrix}
\Gamma A_{11}\Gamma^{-1} & \tfrac{1}{2}\Gamma A_{12} & 0 & \tfrac{1}{2}\Gamma A_{12} & 0 \\
\tfrac{1}{2}A_{21}\Gamma^{-1} & \tfrac{1}{2}A_{22} & \tfrac{1}{2}A_{21}\Gamma & \tfrac{1}{2}A_{22} & \tfrac{1}{2}B_2 \\
0 & \tfrac{1}{2}\Gamma^{-1}A_{12} & \Gamma^{-1}A_{11}\Gamma & \tfrac{1}{2}\Gamma^{-1}A_{12} & \Gamma^{-1}B_1 \\
\tfrac{1}{2}A_{21}\Gamma^{-1} & \tfrac{1}{2}A_{22} & \tfrac{1}{2}A_{21}\Gamma & \tfrac{1}{2}A_{22} & \tfrac{1}{2}B_2 \\
C_1\Gamma^{-1} & \tfrac{1}{2}C_2 & 0 & \tfrac{1}{2}C_2 & 0
\end{bmatrix}.
\]

It is easy to verify that, as a constant matrix, the right-hand side of the above realization for $\frac{1}{2}(G - G_r)$ can be written as
\[
\begin{bmatrix}
0 & 0 & I & 0 \\
0 & \tfrac{1}{\sqrt 2}I & 0 & 0 \\
I & 0 & 0 & 0 \\
0 & \tfrac{1}{\sqrt 2}I & 0 & 0 \\
0 & 0 & 0 & I
\end{bmatrix}
M_d
\begin{bmatrix}
I & 0 & 0 & 0 & 0 \\
0 & \tfrac{1}{\sqrt 2}I & 0 & \tfrac{1}{\sqrt 2}I & 0 \\
0 & 0 & I & 0 & 0 \\
0 & 0 & 0 & 0 & I
\end{bmatrix} \qquad (21.33)
\]
where
\[
M_d := \begin{bmatrix}
0 & \tfrac{1}{\sqrt 2}\Gamma^{-1}A_{12} & \Gamma^{-1}A_{11}\Gamma & \Gamma^{-1}B_1 \\
\tfrac{1}{\sqrt 2}A_{21}\Gamma^{-1} & A_{22} & \tfrac{1}{\sqrt 2}A_{21}\Gamma & \tfrac{1}{\sqrt 2}B_2 \\
\Gamma A_{11}\Gamma^{-1} & \tfrac{1}{\sqrt 2}\Gamma A_{12} & 0 & 0 \\
C_1\Gamma^{-1} & \tfrac{1}{\sqrt 2}C_2 & 0 & 0
\end{bmatrix}. \qquad (21.34)
\]

According to Theorem 21.12, (a) and (g), the theorem follows if we can show that the realization for $\frac{1}{2}(G - G_r)$, as a constant matrix, is a contraction. This is guaranteed if $M_d$ is a contraction, since both the left and right factors in (21.33) are contractive.

Finally, the contractiveness of $M_d$ follows from Lemmas 21.27 and 21.28 by identifying
\[
Z = A_{22}, \qquad X_{11} = \Gamma^{-1}A_{12}, \qquad Y_{11} = A_{21}\Gamma^{-1}
\]
and
\[
\begin{bmatrix} X_{12} \\ X_{22} \end{bmatrix}
= \begin{bmatrix} \Gamma^{-1}A_{11}\Gamma & \Gamma^{-1}B_1 \\ A_{21}\Gamma & B_2 \end{bmatrix},
\qquad
\begin{bmatrix} Y_{21} & Y_{22} \end{bmatrix}
= \begin{bmatrix} \Gamma A_{11}\Gamma^{-1} & \Gamma A_{12} \\ C_1\Gamma^{-1} & C_2 \end{bmatrix},
\]
so that $X$ and $Y$ are exactly the two contractions of Lemma 21.27. □

21.7 Model Reduction Using Coprime Factors

In this section we consider lower order controller design using coprime factor reduction. We shall only consider the special case where the normalized right coprime factors are used.

Suppose that a dynamic system is given by
\[
G = \begin{bmatrix} A & B \\ C & D \end{bmatrix}
\]
and assume that the realization is stabilizable and detectable. Recall from Theorem 21.21 that there exists a normalized right coprime factorization $G = NM^{-1}$ such that $\begin{bmatrix} M \\ N \end{bmatrix}$ is inner.

Lemma 21.29 Let $\lambda_i = \sqrt{\lambda_i(YX)}$. Then the Hankel singular values of $\begin{bmatrix} M \\ N \end{bmatrix}$ are given by
\[
\sigma_i = \frac{\lambda_i}{\sqrt{1 + \lambda_i^2}} < 1.
\]

Proof. This is obvious since
\[
\sigma_i^2 = \lambda_i(PQ) = \frac{\lambda_i(YX)}{1 + \lambda_i(YX)}
\]
and $\lambda_i(YX) \geq 0$. □

It is known that there exists a transformation such that $X$ and $Y$ are balanced:
\[
X = Y = \Sigma = \begin{bmatrix} \Sigma_1 & 0 \\ 0 & \Sigma_2 \end{bmatrix}
\]
with $\Sigma_1 = \mathrm{diag}[\lambda_1I_{s_1}, \ldots, \lambda_rI_{s_r}] > 0$.
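Lemma 21.29, together with the Gramian formulas of Theorem 21.21(c), can be verified numerically: solve the two Riccati equations for X and Y, build [M; N], and compare its Hankel singular values with λi/√(1+λi²). A Python/SciPy sketch with hypothetical data (D = 0 to keep the formulas short):

```python
import numpy as np
from scipy.linalg import solve_discrete_are, solve_discrete_lyapunov, sqrtm

A = np.array([[1.4, 0.3], [0.0, 0.8]])   # hypothetical plant, D = 0
B = np.array([[1.0], [0.4]])
C = np.array([[1.0, 0.6]])

# X, Y from the normalized-coprime Riccati equations of Theorem 21.21 (D = 0)
X = solve_discrete_are(A, B, C.T @ C, np.eye(1))
Y = solve_discrete_are(A.T, C.T, B @ B.T, np.eye(1))
Z = np.eye(1) + B.T @ X @ B
F = -np.linalg.solve(Z, B.T @ X @ A)
AF = A + B @ F
BF = B @ np.linalg.inv(sqrtm(Z).real)
Cs = np.vstack([F, C])                   # stacked output map of [M; N]

# actual Gramians of [M; N]
P = solve_discrete_lyapunov(AF, BF @ BF.T)
Q = solve_discrete_lyapunov(AF.T, Cs.T @ Cs)

# predictions: P = (I + Y X)^{-1} Y, Q = X, sigma_i = lam_i / sqrt(1 + lam_i^2)
lam = np.sqrt(np.sort(np.linalg.eigvals(Y @ X).real)[::-1])
sigma_pred = lam / np.sqrt(1.0 + lam**2)
sigma = np.sort(np.sqrt(np.linalg.eigvals(P @ Q).real))[::-1]
P_pred = np.linalg.solve(np.eye(2) + Y @ X, Y)
```

The computed Hankel singular values match the prediction, and all of them are strictly less than one, as the lemma asserts.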

Now partition the system $G$ and the matrix $F$ conformably:
\[
G = \begin{bmatrix} A_{11} & A_{12} & B_1 \\ A_{21} & A_{22} & B_2 \\ C_1 & C_2 & D \end{bmatrix},
\qquad F = \begin{bmatrix} F_1 & F_2 \end{bmatrix}.
\]
Then the reduced coprime factors
\[
\begin{bmatrix} \hat M \\ \hat N \end{bmatrix} := \begin{bmatrix} A_{11} + B_1F_1 & B_1Z^{-1/2} \\ F_1 & Z^{-1/2} \\ C_1 + DF_1 & DZ^{-1/2} \end{bmatrix} \in \mathcal{RH}_\infty
\]
satisfy the following error bound.

Lemma 21.30
\[
\left\| \begin{bmatrix} M \\ N \end{bmatrix} - \begin{bmatrix} \hat M \\ \hat N \end{bmatrix} \right\|_\infty
\leq 2\sum_{i \geq r+1}\sigma_i = 2\sum_{i \geq r+1}\frac{\lambda_i}{\sqrt{1 + \lambda_i^2}}.
\]


Proof. Analogous to the continuous time case. □

Remark 21.8 It should be understood that the reduced model can be obtained by directly computing $X$ and $P$ and obtaining a balanced model, without solving the Riccati equation for $Y$. ~

These reduced coprime factors, combined with robust or $\mathcal{H}_\infty$ controller design methods, can be used to design lower order controllers so that the system is robustly stable and specified performance criteria are satisfied. We leave the reader to explore the utility of this model reduction method. However, we would like to point out that, unlike in the continuous time case, the reduced coprime factors in discrete time may not be normalized. In fact, we can prove a more general result.

Lemma 21.31 Let a realization of $N(z) \in \mathcal{RH}_\infty$ be given by
\[
N(z) = \begin{bmatrix} A & B \\ C & D \end{bmatrix}
\]
with $A$ stable. Suppose that there exists an $X = X^* \geq 0$ such that
\[
\begin{bmatrix} A^*XA - X + C^*C & A^*XB + C^*D \\ B^*XA + D^*C & B^*XB + D^*D - I \end{bmatrix} = 0. \qquad (21.35)
\]
Then $N$ is inner, i.e., $N^\sim N = I$. Moreover, if the realization
\[
N = \begin{bmatrix} A & B \\ C & D \end{bmatrix}
= \begin{bmatrix} A_{11} & A_{12} & B_1 \\ A_{21} & A_{22} & B_2 \\ C_1 & C_2 & D \end{bmatrix}
\]
is also balanced, with
\[
X = \Sigma = \begin{bmatrix} \Sigma_1 & 0 \\ 0 & \Sigma_2 \end{bmatrix}
\]
and
\[
A\Sigma A^* - \Sigma + BB^* = 0,
\]
with $\Sigma_1 > 0$, then the truncated system
\[
N_r = \begin{bmatrix} A_{11} & B_1 \\ C_1 & D \end{bmatrix}
\]
is stable and contractive, i.e., $N_r^\sim N_r \leq I$.


Proof. Pre-multiply equation (21.35) by
\[
U = \begin{bmatrix} \begin{bmatrix} I & 0 \end{bmatrix} & 0 \\ 0 & I \end{bmatrix}
\]
and post-multiply by $U^*$ to get
\[
\begin{bmatrix}
A_{11}^*\Sigma_1A_{11} - \Sigma_1 + C_1^*C_1 + A_{21}^*\Sigma_2A_{21} & A_{11}^*\Sigma_1B_1 + C_1^*D + A_{21}^*\Sigma_2B_2 \\
B_1^*\Sigma_1A_{11} + D^*C_1 + B_2^*\Sigma_2A_{21} & B_1^*\Sigma_1B_1 + D^*D - I + B_2^*\Sigma_2B_2
\end{bmatrix} = 0
\]
or
\[
\begin{bmatrix}
A_{11}^*\Sigma_1A_{11} - \Sigma_1 + C_1^*C_1 & A_{11}^*\Sigma_1B_1 + C_1^*D \\
B_1^*\Sigma_1A_{11} + D^*C_1 & B_1^*\Sigma_1B_1 + D^*D - I
\end{bmatrix}
= -\begin{bmatrix} A_{21}^* \\ B_2^* \end{bmatrix}\Sigma_2\begin{bmatrix} A_{21}^* \\ B_2^* \end{bmatrix}^*.
\]
This gives
\[
\begin{bmatrix} A_{11} & B_1 \\ C_1 & D \end{bmatrix}^*
\begin{bmatrix} \Sigma_1 & 0 \\ 0 & I \end{bmatrix}
\begin{bmatrix} A_{11} & B_1 \\ C_1 & D \end{bmatrix}
- \begin{bmatrix} \Sigma_1 & 0 \\ 0 & I \end{bmatrix} \leq 0.
\]
Now let $T = \Sigma_1^{1/2}$; then
\[
\left\| \begin{bmatrix} TA_{11}T^{-1} & TB_1 \\ C_1T^{-1} & D \end{bmatrix} \right\| \leq 1,
\]
i.e., $\|N_r\|_\infty \leq 1$. It is clear that $N_r$ is inner if $\begin{bmatrix} A_{21}^* \\ B_2^* \end{bmatrix}\Sigma_2 = 0$ (although the converse may not be true). □

21.8 Notes and References

The results on the discrete Riccati equation are based mostly on the work of Kucera [1972] and Molinari [1975]. Numerical algorithms for solving discrete AREs with a singular $A$ matrix can be found in Arnold and Laub [1984], Dooren [1981], and the references therein. The matrix factorizations are obtained by Chu [1988]. The normalized coprime factorizations are obtained by Meyer [1990] and Walker [1990]. Detailed treatments of discrete time $\mathcal{H}_\infty$ control can be found in Stoorvogel [1990]; Limebeer, Green, and Walker [1989]; and Iglesias and Glover [1991].


Bibliography

[1] Adamjan, V. M., D. Z. Arov, and M. G. Krein (1971). "Analytic properties of Schmidt pairs for a Hankel operator and the generalized Schur-Takagi problem," Math. USSR Sbornik, Vol. 15, pp. 31-73.

[2] Adamjan, V. M., D. Z. Arov, and M. G. Krein (1978). "Infinite block Hankel matrices and related extension problems," AMS Transl., Vol. 111, pp. 133-156.

[3] Al-Saggaf, U. M. and G. F. Franklin (1987). "An error bound for a discrete reduced order model of a linear multivariable system," IEEE Trans. Automat. Contr., Vol. AC-32, pp. 815-819.

[4] Al-Saggaf, U. M. and G. F. Franklin (1988). "Model reduction via balanced realizations: an extension and frequency weighting techniques," IEEE Trans. Automat. Contr., Vol. AC-33, No. 7, pp. 687-692.

[5] Anderson, B. D. O. (1967). "A system theory criterion for positive real matrices," SIAM J. Contr. Optimiz., Vol. 6, No. 2, pp. 171-192.

[6] Anderson, B. D. O. (1967). "An algebraic solution to the spectral factorization problem," IEEE Trans. Auto. Control, Vol. AC-12, pp. 410-414.

[7] Anderson, B. D. O. (1993). "Bode Prize Lecture (Control design: moving from theory to practice)," IEEE Control Systems, Vol. 13, No. 4, pp. 16-25.

[8] Anderson, B. D. O., P. Agathoklis, E. I. Jury, and M. Mansour (1986). "Stability and the matrix Lyapunov equation for discrete 2-dimensional systems," IEEE Trans., Vol. CAS-33, pp. 261-266.

[9] Anderson, B. D. O. and Y. Liu (1989). "Controller reduction: concepts and approaches," IEEE Trans. Automat. Contr., Vol. AC-34, No. 8, pp. 802-812.

[10] Anderson, B. D. O. and J. B. Moore (1979). Optimal Filtering. Prentice-Hall, Englewood Cliffs, New Jersey.

[11] Anderson, B. D. O. and J. B. Moore (1989). Optimal Control: Linear Quadratic Methods. Prentice-Hall, Englewood Cliffs, New Jersey.


[12] Anderson, B. D. O. and S. Vongpanitlerd (1973). Network Analysis and Synthesis: A Modern System Theory Approach. Prentice-Hall, Englewood Cliffs, New Jersey.

[13] Arnold, W. F. and A. J. Laub (1984). "Generalized eigenproblem algorithms and software for algebraic Riccati equations," Proceedings of the IEEE, Vol. 72, No. 12, pp. 1746-1754.

[14] Balas, G. (1990). Robust Control of Flexible Structures: Theory and Experiments, PhD Thesis, Caltech.

[15] Balas, G., J. C. Doyle, K. Glover, A. Packard, and R. Smith (1991). μ-Analysis and Synthesis Toolbox, MUSYN Inc. and The MathWorks, Inc.

[16] Ball, J. A. and N. Cohen (1987). "Sensitivity minimization in an H∞ norm: parameterization of all suboptimal solutions," Int. J. Control, Vol. 46, pp. 785-816.

[17] Ball, J. A. and J. W. Helton (1983). "A Beurling-Lax theorem for the Lie group U(m,n) which contains most classical interpolation theory," J. Op. Theory, Vol. 9, pp. 107-142.

[18] Başar, T. and P. Bernhard (1991). H∞-Optimal Control and Related Minimax Design Problems: A Dynamic Game Approach. Systems and Control: Foundations and Applications. Birkhäuser, Boston.

[19] Başar, T. and G. J. Olsder (1982). Dynamic Noncooperative Game Theory, Academic Press.

[20] Bernstein, D. S. and W. M. Haddad (1989). "LQG control with an H∞ performance bound: A Riccati equation approach," IEEE Trans. Auto. Control, Vol. AC-34, pp. 293-305.

[21] Bettayeb, M., L. M. Silverman, and M. G. Safonov (1980). "Optimal approximation of continuous-time systems," Proc. IEEE Conf. Dec. Contr., Albuquerque, pp. 195-198.

[22] Bode, H. W. (1945). Network Analysis and Feedback Amplifier Design, D. Van Nostrand, Princeton.

[23] Boyd, S. P., V. Balakrishnan, C. H. Barratt, N. M. Khraishi, X. Li, D. G. Meyer, and S. A. Norman (1988). "A new CAD method and associated architectures for linear controllers," IEEE Trans. Auto. Contr., Vol. AC-33, pp. 268-283.

[24] Boyd, S., V. Balakrishnan, and P. Kabamba (1989). "A bisection method for computing the H∞ norm of a transfer matrix and related problems," Math. Control, Signals, and Systems, Vol. 2, No. 3, pp. 207-220.

[25] Boyd, S. and C. Barratt (1991). Linear Controller Design - Limits of Performance, Prentice Hall, Inc.


[26] Boyd, S. and C. A. Desoer (1985). "Subharmonic functions and performance bounds in linear time-invariant feedback systems," IMA J. Math. Contr. and Info., vol. 2, pp. 153-170.

[27] Boyd, S. and Q. Yang (1989). "Structured and simultaneous Lyapunov functions for system stability problems," Int. J. Control, vol. 49, No. 6, pp. 2215-2240.

[28] Brasch, F. M. and J. B. Pearson (1970). "Pole placement using dynamical compensators," IEEE Trans. Auto. Contr., Vol. 15, pp. 34-43.

[29] Brogan, W. L. (1991). Modern Control Theory, 3rd Ed., Prentice Hall, Englewood Cliffs, New Jersey.

[30] Bruinsma, N. A., and M. Steinbuch (1990). "A fast algorithm to compute the H∞-norm of a transfer function matrix," Systems and Control Letters, Vol. 14, pp. 287-293.

[31] Bryson Jr., A. E. and Y-C. Ho (1975). Applied Optimal Control, Hemisphere Publishing Corporation.

[32] Chen, C. T. (1984). Linear System Theory and Design. Holt, Rinehart and Winston.

[33] Chen, J. (1992a). "Sensitivity integral relations and design tradeoffs in linear multivariable feedback systems," submitted to IEEE Trans. Auto. Contr.

[34] Chen, J. (1992b). "Multivariable gain-phase and sensitivity integral relations and design tradeoffs," submitted to IEEE Trans. Auto. Contr.

[35] Chen, J. (1995). Personal communication.

[36] Chen, M. J. and C. A. Desoer (1982). "Necessary and sufficient condition for robust stability of linear distributed feedback systems," Int. J. Control, vol. 35, no. 2, pp. 255-267.

[37] Chen, T. and B. A. Francis (1992). "H2-optimal sampled-data control," IEEE Trans. Auto. Contr., Vol. 36, No. 4, pp. 387-397.

[38] Chu, C. C. (1985). H∞ optimization and robust multivariable control. PhD Thesis, University of Minnesota.

[39] Chu, C. C. (1988). "On discrete inner-outer and spectral factorization," Proc. Amer. Contr. Conf.

[40] Clements, D. J. (1993). "Rational spectral factorization using state-space methods," Systems and Control Letters, Vol. 20, pp. 335-343.

[41] Clements, D. J. and K. Glover (1989). "Spectral factorization by Hermitian pencils," Linear Algebra and its Applications, Vol. 122-124, pp. 797-846.


[42] Coppel, W. A. (1974). "Matrix quadratic equations," Bull. Austral. Math. Soc., Vol. 10, pp. 377-401.

[43] Dahleh, M. A. and J. B. Pearson (1987). "ℓ1 optimal feedback controllers for MIMO discrete time systems," IEEE Trans. Auto. Contr., Vol. AC-32, No. 4, pp. 314-322.

[44] Daniel, R. W., B. Kouvaritakis, and H. Latchman (1986). "Principal direction alignment: a geometric framework for the complete solution to the μ problem," IEE Proceedings, vol. 133, Part D, No. 2, pp. 45-56.

[45] Davis, C., W. M. Kahan, and H. F. Weinberger (1982). "Norm-preserving dilations and their applications to optimal error bounds," SIAM J. Numer. Anal., vol. 19, pp. 445-469.

[46] Davison, E. J. (1968). "On pole assignment in multivariable linear systems," IEEE Trans. Auto. Contr., Vol. 13, No. 6, pp. 747-748.

[47] Delsarte, P., Y. Genin, and Y. Kamp (1979). "The Nevanlinna-Pick problem for matrix valued functions," SIAM J. Applied Math., Vol. 36, pp. 47-61.

[48] Desai, U. B. and D. Pal (1984). "A transformation approach to stochastic model reduction," IEEE Trans. Automat. Contr., Vol. 29, No. 12, pp. 1097-1100.

[49] Desoer, C. A., R. W. Liu, J. Murray, and R. Saeks (1980). "Feedback system design: the fractional representation approach to analysis and synthesis," IEEE Trans. Auto. Contr., Vol. AC-25, No. 6, pp. 399-412.

[50] Desoer, C. A. and M. Vidyasagar (1975). Feedback Systems: Input-Output Properties. Academic Press, New York.

[51] Dooren, P. V. (1981). "A generalized eigenvalue approach for solving Riccati equations," SIAM J. Sci. Stat. Comput., Vol. 2, No. 2, pp. 121-135.

[52] Doyle, J. C. (1978). "Guaranteed Margins for LQG Regulators," IEEE Trans. Auto. Contr., Vol. AC-23, No. 4, pp. 756-757.

[53] Doyle, J. C. (1982). "Analysis of feedback systems with structured uncertainties," IEE Proceedings, Part D, Vol. 133, pp. 45-56.

[54] Doyle, J. C. (1985). "Structured uncertainty in control system design," IEEE CDC, Ft. Lauderdale.

[55] Doyle, J. C. (1984). "Lecture notes in advances in multivariable control," ONR/Honeywell Workshop, Minneapolis.

[56] Doyle, J. C., B. Francis, and A. Tannenbaum (1992). Feedback Control Theory, Macmillan Publishing Company.


[57] Doyle, J. C., K. Glover, P. P. Khargonekar, B. A. Francis (1989). "State-space solutions to standard H2 and H∞ control problems," IEEE Trans. Auto. Control, vol. AC-34, No. 8, pp. 831-847. Also see 1988 American Control Conference, Atlanta, June, 1988.

[58] Doyle, J., K. Lenz, and A. Packard (1986). "Design Examples using μ synthesis: Space Shuttle Lateral Axis FCS during reentry," IEEE CDC, December 1986, pp. 2218-2223.

[59] Doyle, J. C., A. Packard and K. Zhou (1991). "Review of LFTs, LMIs and μ," Proc. 30th IEEE Conf. Dec. Contr., England, pp. 1227-1232.

[60] Doyle, J. C., R. S. Smith, and D. F. Enns (1987). "Control of Plants with Input Saturation Nonlinearities," American Control Conf., pp. 1034-1039.

[61] Doyle, J. C. and G. Stein (1981). "Multivariable feedback design: Concepts for a classical/modern synthesis," IEEE Trans. Auto. Control, vol. AC-26, pp. 4-16, Feb. 1981.

[62] Doyle, J. C., J. Wall and G. Stein (1982). "Performance and robustness analysis for structured uncertainty," in Proc. 21st IEEE Conf. Decision Contr., pp. 629-636.

[63] Doyle, J. C., K. Zhou, and B. Bodenheimer (1989). "Optimal Control with mixed H2 and H∞ performance objective," Proc. of American Control Conference, Pittsburgh, Pennsylvania, pp. 2065-2070.

[64] Doyle, J. C., K. Zhou, K. Glover, and B. Bodenheimer (1994). "Mixed H2 and H∞ performance objectives II: optimal control," IEEE Trans. on Automat. Contr., vol. 39, no. 8, pp. 1575-1587.

[65] El-Sakkary, A. (1985). "The gap metric: Robustness of stabilization of feedback systems," IEEE Trans. Automat. Contr., Vol. 30, pp. 240-247.

[66] Enns, D. (1984a). Model Reduction for Control System Design, Ph.D. Dissertation, Department of Aeronautics and Astronautics, Stanford University, Stanford, California.

[67] Enns, D. (1984b). "Model reduction with balanced realizations: An error bound and a frequency weighted generalization," Proc. 23rd Conf. Dec. Contr., Las Vegas, NV.

[68] Fan, M. K. H. and A. L. Tits (1986). "Characterization and efficient computation of the structured singular value," IEEE Trans. Auto. Control, vol. AC-31, no. 8, pp. 734-743.

[69] Fan, M. K. H., A. L. Tits, and J. C. Doyle (1991). "Robustness in the presence of mixed parametric uncertainty and unmodeled dynamics," IEEE Trans. Auto. Control, vol. AC-36, no. 1, pp. 25-38.


[70] Foias, C. and A. Tannenbaum (1988). "On the four-block problem, I," Operator Theory: Advances and Applications, vol. 32 (1988), pp. 93-112, and "On the four block problem, II: the singular system," Integral Equations and Operator Theory, vol. 11 (1988), pp. 726-767.

[71] Francis, B. A. (1987). A course in H∞ control theory, Lecture Notes in Control and Information Sciences, vol. 88.

[72] Francis, B. A. and J. C. Doyle (1987). "Linear control theory with an H∞ optimality criterion," SIAM J. Control Opt., vol. 25, pp. 815-844.

[73] Freudenberg, J. S. (1989). "Analysis and design for ill-conditioned plants, part 2: directionally uniform weightings and an example," Int. J. Contr., vol. 49, no. 3, pp. 873-903.

[74] Freudenberg, J. S. and D. P. Looze (1985). "Right half plane zeros and poles and design tradeoffs in feedback systems," IEEE Trans. Auto. Contr., vol. AC-30, no. 6, pp. 555-565.

[75] Freudenberg, J. S. and D. P. Looze (1988). Frequency Domain Properties of Scalar and Multivariable Feedback Systems, Lecture Notes in Contr. and Info. Science, vol. 104, Berlin: Springer-Verlag.

[76] Freudenberg, J. S., D. P. Looze and J. B. Cruz (1982). "Robustness analysis using singular value sensitivities," Int. J. Control, vol. 35, no. 1, pp. 95-116.

[77] Garnett, J. B. (1981). Bounded Analytic Functions, Academic Press.

[78] Georgiou, T. T. (1988). "On the computation of the gap metric," Syst. Contr. Lett., Vol. 11, pp. 253-257.

[79] Georgiou, T. T. and M. C. Smith (1990). "Optimal robustness in the gap metric," IEEE Trans. Automat. Contr., Vol. AC-35, No. 6, pp. 673-686.

[80] Glover, K. (1984). "All optimal Hankel-norm approximations of linear multivariable systems and their L∞-error bounds," Int. J. Control, vol. 39, pp. 1115-1193.

[81] Glover, K. (1986). "Robust stabilization of linear multivariable systems: Relations to approximation," Int. J. Contr., Vol. 43, No. 3, pp. 741-766.

[82] Glover, K. (1989). "A tutorial on Hankel-norm Approximation," in Data to Model, J. C. Willems, Ed. New York: Springer-Verlag.

[83] Glover, K. (1986). "Multiplicative approximation of linear multivariable systems with L∞ error bounds," Proc. Amer. Contr. Conf., Seattle, WA, pp. 1705-1709.


[84] Glover, K. and J. Doyle (1988). "State-space formulae for all stabilizing controllers that satisfy an H∞ norm bound and relations to risk sensitivity," Systems and Control Letters, vol. 11, pp. 167-172.

[85] Glover, K. and J. C. Doyle (1989). "A state space approach to H∞ optimal control," in Three Decades of Mathematical Systems Theory: A Collection of Surveys at the Occasion of the 50th Birthday of Jan C. Willems, H. Nijmeijer and J. M. Schumacher (Eds.), Springer-Verlag, Lecture Notes in Control and Information Sciences, vol. 135, 1989.

[86] Glover, K., D. J. N. Limebeer, J. C. Doyle, E. M. Kasenally, and M. G. Safonov (1991). "A characterization of all solutions to the four block general distance problem," SIAM J. Control and Optimization, Vol. 29, No. 2, pp. 283-324.

[87] Glover, K., D. J. N. Limebeer, and Y. S. Hung (1992). "A structured approximation problem with applications to frequency weighted model reduction," IEEE Trans. Automat. Contr., Vol. AC-37, No. 4, pp. 447-465.

[88] Glover, K. and D. McFarlane (1989). "Robust stabilization of normalized coprime factor plant descriptions with H∞ bounded uncertainty," IEEE Trans. Automat. Contr., vol. 34, no. 8, pp. 821-830.

[89] Glover, K. and D. Mustafa (1989). "Derivation of the maximum entropy H∞ controller and a state space formula for its entropy," Int. J. Control, vol. 50, p. 899.

[90] Goddard, P. J. and K. Glover (1993). "Controller reduction: weights for stability and performance preservation," Proceedings of the 32nd Conference on Decision and Control, San Antonio, Texas, December 1993, pp. 2903-2908.

[91] Goddard, P. J. and K. Glover (1994). "Performance preserving frequency weighted controller approximation: a coprime factorization approach," Proceedings of the 33rd Conference on Decision and Control, Orlando, Florida, December 1994.

[92] Gohberg, I., and S. Goldberg (1981). Basic Operator Theory. Birkhäuser, Boston.

[93] Gohberg, I., P. Lancaster, and L. Rodman (1986). "On hermitian solutions of the symmetric algebraic Riccati equation," SIAM J. Control and Optimization, Vol. 24, No. 6, November 1986, pp. 1323-1334.

[94] Golub, G. H., and C. F. Van Loan (1983). Matrix Computations. Baltimore, Md.: Johns Hopkins Univ. Press.

[95] Green, M. (1988a). "A relative error bound for balanced stochastic truncation," IEEE Trans. Automat. Contr., Vol. AC-33, No. 10, pp. 961-965.

[96] Green, M. (1988b). "Balanced stochastic realizations," Linear Algebra and its Applications, Vol. 98, pp. 211-247.


[97] Green, M., K. Glover, D. J. N. Limebeer and J. C. Doyle (1988). "A J-spectral factorization approach to H∞ control," SIAM J. Control and Optimization, Vol. 28, pp. 1350-1371.

[98] Hautus, M. L. J. and L. M. Silverman (1983). "System structure and singular control," Linear Algebra and its Applications, Vol. 50, pp. 369-402.

[99] Helton, J. W. (1981). "Broadband gain equalization directly from data," IEEE Trans. Circuits Syst., Vol. CAS-28, pp. 1125-1137.

[100] Hestenes, M. (1975). Optimization Theory: the finite dimensional case, John Wiley & Sons.

[101] Heymann, M. (1968). "Comments 'On pole assignment in multi-input controllable linear systems,'" IEEE Trans. Auto. Contr., Vol. 13, No. 6, pp. 748-749.

[102] Hinrichsen, D., and A. J. Pritchard (1990). "An improved error estimate for reduced-order models of discrete time systems," IEEE Transactions on Automatic Control, AC-35, pp. 317-320.

[103] Hoffman, K. (1962). Banach Spaces of Analytic Functions, Prentice-Hall, Inc., Englewood Cliffs, N.J.

[104] Horn, R. A. and C. R. Johnson (1990). Matrix Analysis, Cambridge University Press, first published in 1985.

[105] Horn, R. A. and C. R. Johnson (1991). Topics in Matrix Analysis, Cambridge University Press.

[106] Horowitz, I. M. (1963). Synthesis of Feedback Systems, Academic Press, London.

[107] Hung, Y. S. (1989). "H∞-optimal control, Part I: model matching; Part II: solution for controllers," Int. J. Control, Vol. 49, pp. 1291-1359.

[108] Hung, Y. S. and K. Glover (1986). "Optimal Hankel-norm approximation of stable systems with first order stable weighting functions," Syst. Contr. Lett., Vol. 7, pp. 165-172.

[109] Hung, Y. S. and A. G. J. MacFarlane (1982). Multivariable feedback: A quasi-classical approach, Springer-Verlag.

[110] Hyde, R. A. and K. Glover (1993). "The Application of Scheduled H∞ Controllers to a VSTOL Aircraft," IEEE Trans. Automatic Control, Vol. 38, No. 7, pp. 1021-1039.

[111] Hyland, D. C. and D. S. Bernstein (1984). "The optimal projection equations for fixed-order dynamic compensation," IEEE Trans. Auto. Contr., Vol. AC-29, No. 11, pp. 1034-1037.


[112] Iglesias, P. A. and K. Glover (1991). "State-space approach to discrete-time H∞ control," Int. J. Control, Vol. 54, No. 5, pp. 1031-1073.

[113] Iglesias, P. A., D. Mustafa, and K. Glover (1990). "Discrete time H∞ controllers satisfying minimum entropy criterion," Syst. Contr. Lett., Vol. 14, pp. 275-286.

[114] Jonckheere, E. A. and J. C. Juang (1987). "Fast computation of achievable performance in mixed sensitivity H∞ design," IEEE Trans. on Auto. Control, vol. AC-32, pp. 896-906.

[115] Jonckheere, E. A. and L. M. Silverman (1978). "Spectral theory of the linear quadratic optimal control problem: discrete-time single-input case," IEEE Trans. Circuits Syst., Vol. CAS-25, No. 9, pp. 810-825.

[116] Kailath, T. (1980). Linear Systems. Prentice-Hall, Inc., Englewood Cliffs, N.J. 07632.

[117] Kalman, R. E. (1964). "When is a control system optimal?" ASME Trans. Series D: J. Basic Engineering, Vol. 86, pp. 1-10.

[118] Kalman, R. E. and R. S. Bucy (1960). "New results in linear filtering and prediction theory," ASME Trans. Series D: J. Basic Engineering, Vol. 83, pp. 95-108.

[119] Kavranoğlu, D. and M. Bettayeb (1994). "Characterization and computation of the solution to the optimal L∞ approximation problem," IEEE Trans. Auto. Contr., Vol. 39, No. 9, pp. 1899-1904.

[120] Khargonekar, P. P., I. R. Petersen, and M. A. Rotea (1988). "H∞-optimal control with state-feedback," IEEE Trans. on Auto. Control, vol. AC-33, pp. 786-788.

[121] Khargonekar, P. P., I. R. Petersen, and K. Zhou (1990). "Robust stabilization and H∞-optimal control," IEEE Trans. Auto. Contr., Vol. 35, No. 3, pp. 356-361.

[122] Khargonekar, P. P. and M. A. Rotea (1991). "Mixed H2/H∞ control: a convex optimization approach," IEEE Trans. Automat. Contr., vol. 36, no. 7, pp. 824-837.

[123] Khargonekar, P. P. and E. Sontag (1982). "On the relation between stable matrix fraction factorizations and regulable realizations of linear systems over rings," IEEE Trans. Automat. Contr., Vol. 27, pp. 627-638.

[124] Kimura, H. (1988). "Synthesis of H∞ Controllers based on Conjugation," Proc. IEEE Conf. Decision and Control, Austin, Texas.

[125] Kishore, A. P. and J. B. Pearson (1992). "Uniform stability and performance in H∞," Proc. IEEE Conf. Dec. Contr., Tucson, Arizona, pp. 1991-1996.

[126] Kleinman, D. L. (1968). "On an iterative technique for Riccati equation computation," IEEE Transactions on Automatic Control, Vol. 13, pp. 114-115.


[127] Kung, S. K., B. Levy, M. Morf and T. Kailath (1977). "New results in 2-D Systems Theory, 2-D State-Space Models: Realization and the Notions of Controllability, Observability and Minimality," Proc. IEEE, pp. 945-961.

[128] Kung, S. K. and D. W. Lin (1981). "Optimal Hankel norm model reduction: Multivariable systems," IEEE Trans. Automat. Contr., Vol. AC-26, pp. 832-852.

[129] Kwakernaak, H. (1986). "A polynomial approach to minimax frequency domain optimization of multivariable feedback systems," Int. J. Control, pp. 117-156.

[130] Kwakernaak, H. and R. Sivan (1972). Linear Optimal Control Systems, Wiley-Interscience, New York.

[131] Kucera, V. (1972). "A contribution to matrix quadratic equations," IEEE Trans. Auto. Control, AC-17, No. 3, pp. 344-347.

[132] Kucera, V. (1979). Discrete Linear Control, John Wiley and Sons, New York.

[133] Lancaster, P. and M. Tismenetsky (1985). The Theory of Matrices: with applications, 2nd Ed., Academic Press.

[134] Latham, G. A. and B. D. O. Anderson (1986). "Frequency weighted optimal Hankel-norm approximation of stable transfer function," Syst. Contr. Lett., Vol. 5, pp. 229-236.

[135] Laub, A. J. (1980). "Computation of 'balancing' transformations," Proc. 1980 JACC, Session FA8-E.

[136] Lenz, K. E., P. P. Khargonekar, and John C. Doyle (1987). "Controller order reduction with guaranteed stability and performance," American Control Conference, pp. 1697-1698.

[137] Limebeer, D. J. N., B. D. O. Anderson, P. P. Khargonekar, and M. Green (1992). "A game theoretic approach to H∞ control for time varying systems," SIAM J. Control and Optimization, Vol. 30, No. 2, pp. 262-283.

[138] Limebeer, D. J. N., M. Green, and D. Walker (1989). "Discrete time H∞ control," Proc. IEEE Conf. Dec. Contr., Tampa, Florida, pp. 392-396.

[139] Limebeer, D. J. N. and G. D. Halikias (1988). "A controller degree bound for H∞-optimal control problems of the second kind," SIAM J. Control Opt., vol. 26, no. 3, pp. 646-677.

[140] Limebeer, D. J. N. and Y. S. Hung (1987). "An analysis of pole-zero cancellations in H∞-optimal control problems of the first kind," SIAM J. Control Opt., vol. 25, pp. 1457-1493.

[141] Liu, K. Z., T. Mita, and R. Kawatani (1990). "Parameterization of state feedback H∞ controllers," Int. J. Contr., Vol. 51, No. 3, pp. 535-551.


[142] Liu, K. Z. and T. Mita (1992). "Generalized H∞ control theory," Proc. American Control Conference.

[143] Liu, Y. and B. D. O. Anderson (1986). "Controller reduction via stable factorization and balancing," Int. J. Control, Vol. 44, pp. 507-531.

[144] Liu, Y. and B. D. O. Anderson (1989). "Singular perturbation approximation of balanced systems," Int. J. Control, Vol. 50, No. 4, pp. 1379-1405.

[145] Liu, Y. and B. D. O. Anderson (1990). "Frequency weighted controller reduction methods and loop transfer recovery," Automatica, Vol. 26, No. 3, pp. 487-497.

[146] Liu, Y., B. D. O. Anderson, and U. Ly (1990). "Coprime factorization controller reduction with Bezout identity induced frequency weighting," Automatica, Vol. 26, No. 2, pp. 233-249.

[147] Lu, W. M., K. Zhou, and J. C. Doyle (1991). "Stabilization of LFT systems," Proc. 30th IEEE Conf. Dec. Contr., Brighton, England, pp. 1239-1244.

[148] Luenberger, D. G. (1969). Optimization by Vector Space Methods, John Wiley & Sons, Inc.

[149] Luenberger, D. G. (1971). "An introduction to observers," IEEE Trans. Automat. Contr., Vol. 16, No. 6, pp. 596-602.

[150] MacFarlane, A. G. J. and J. S. Hung (1983). "Analytic Properties of the singular values of a rational matrix," Int. J. Contr., vol. 37, no. 2, pp. 221-234.

[151] Maciejowski, J. M. (1989). Multivariable Feedback Design, Wokingham: Addison-Wesley.

[152] Mageirou, E. F. and Y. C. Ho (1977). "Decentralized stabilization via game theoretic methods," Automatica, Vol. 13, pp. 393-399.

[153] Mariton, M. and R. Bertrand (1985). "A homotopy algorithm for solving coupled Riccati equations," Optimal Contr. Appl. Meth., vol. 6, pp. 351-357.

[154] Martensson, K. (1971). "On the matrix Riccati equation," Information Sciences, Vol. 3, pp. 17-49.

[155] McFarlane, D. C., K. Glover, and M. Vidyasagar (1990). "Reduced-order controller design using coprime factor model reduction," IEEE Trans. Automat. Contr., Vol. 35, No. 3, pp. 369-373.

[156] McFarlane, D. C. and K. Glover (1990). Robust Controller Design Using Normalized Coprime Factor Plant Descriptions, vol. 138, Lecture Notes in Control and Information Sciences. Springer-Verlag.


[157] McFarlane, D. C. and K. Glover (1992). "A loop shaping design procedure using H∞ synthesis," IEEE Trans. Automat. Contr., Vol. 37, No. 6, pp. 759-769.

[158] Meyer, D. G. (1990). "The construction of special coprime factorizations in discrete time," IEEE Trans. Automat. Contr., Vol. 35, No. 5, pp. 588-590.

[159] Meyer, D. G. (1990). "Fractional balanced reduction: model reduction via fractional representation," IEEE Trans. Automat. Contr., Vol. 35, No. 12, pp. 1341-1345.

[160] Meyer, D. G. and G. Franklin (1987). "A connection between normalized coprime factorizations and linear quadratic regulator theory," IEEE Trans. Automat. Contr., Vol. 32, pp. 227-228.

[161] Molinari, B. P. (1975). "The stabilizing solution of the discrete algebraic Riccati equation," IEEE Trans. Auto. Contr., June 1975, pp. 396-399.

[162] Moore, B. C. (1981). "Principal component analysis in linear systems: controllability, observability, and model reduction," IEEE Trans. Auto. Contr., Vol. 26, No. 2.

[163] Moore, J. B., K. Glover, and A. Telford (1990). "All stabilizing controllers as frequency shaped state estimate feedback," IEEE Trans. Auto. Contr., Vol. AC-35, pp. 203-208.

[164] Mullis, C. T. and R. A. Roberts (1976). "Synthesis of minimum roundoff noise fixed point digital filters," IEEE Trans. Circuits and Systems, Vol. 23, pp. 551-562.

[165] Mustafa, D. (1989). "Relations between maximum entropy / H∞ control and combined H∞/LQG control," Systems and Control Letters, vol. 12, no. 3, p. 193.

[166] Mustafa, D. and K. Glover (1990). Minimum Entropy H∞ Control, Lecture Notes in Control and Information Sciences, Springer-Verlag.

[167] Mustafa, D. and K. Glover (1991). "Controller Reduction by H∞-Balanced Truncation," IEEE Trans. Automat. Contr., Vol. AC-36, No. 6, pp. 669-682.

[168] Naylor, A. W., and G. R. Sell (1982). Linear Operator Theory in Engineering and Science, Springer-Verlag.

[169] Nagpal, K. M. and P. P. Khargonekar (1991). "Filtering and smoothing in an H∞ setting," IEEE Trans. Automat. Contr., Vol. AC-36, No. 2, pp. 152-166.

[170] Nehari, Z. (1957). "On bounded bilinear forms," Annals of Mathematics, Vol. 15, No. 1, pp. 153-162.

[171] Nesterov, Y. and A. Nemirovski (1994). Interior-point polynomial algorithms in convex programming, SIAM.


[172] Nett, C. N., C. A. Jacobson and N. J. Balas (1984). "A Connection between State-Space and Doubly Coprime Fractional Representations," IEEE Trans., Vol. AC-29, pp. 831-832.

[173] Osborne, E. E. (1960). "On Preconditioning of Matrices," J. Assoc. Comp. Mach., Vol. 7, pp. 338-345.

[174] Packard, A. (1991). Notes on μ. Unpublished lecture notes.

[175] Packard, A. and J. C. Doyle (1988). "Structured singular value with repeated scalar blocks," 1988 ACC, Atlanta.

[176] Packard, A. and J. C. Doyle (1988). Robust Control of Multivariable and Large Scale Systems, Final Technical Report for Air Force Office of Scientific Research.

[177] Packard, A. and J. C. Doyle (1993). "The complex structured singular value," Automatica, Vol. 29, pp. 71-109.

[178] Packard, A. and P. Pandey (1993). "Continuity properties of the real/complex structured singular value," IEEE Trans. Automat. Contr., Vol. 38, No. 3, pp. 415-428.

[179] Packard, A., K. Zhou, P. Pandey, J. Leonhardson, G. Balas (1992). "Optimal, Constant I/O similarity scaling for full information and state feedback control problems," Syst. Contr. Lett., Vol. 19, No. 4, pp. 271-280.

[180] Parrott, S. (1978). "On a quotient norm and the Sz.-Nagy-Foias lifting theorem," Journal of Functional Analysis, vol. 30, pp. 311-328.

[181] Pernebo, L. and L. M. Silverman (1982). "Model Reduction via Balanced State Space Representation," IEEE Trans. Automat. Contr., Vol. AC-27, No. 2, pp. 382-387.

[182] Petersen, I. R. (1987). "Disturbance attenuation and H∞ optimization: a design method based on the algebraic Riccati equation," IEEE Trans. Auto. Contr., vol. AC-32, pp. 427-429.

[183] Postlethwaite, I., J. M. Edmunds, and A. G. J. MacFarlane (1981). "Principal gains and Principal phases in the analysis of linear multivariable feedback systems," IEEE Trans. Auto. Contr., Vol. AC-26, No. 1, pp. 32-46.

[184] Postlethwaite, I. and A. G. J. MacFarlane (1979). A complex variable approach to the analysis of linear multivariable feedback systems. Berlin: Springer.

[185] Power, S. C. (1982). Hankel operators on Hilbert space. Pitman Advanced Publishing Program.


[186] Ran, A. C. M. and R. Vreugdenhil (1988). "Existence and comparison theorems for algebraic Riccati equations for continuous- and discrete-time systems," Linear Algebra and its Applications, Vol. 99, pp. 63-83.

[187] Redheffer, R. M. (1959). "Inequalities for a matrix Riccati equation," Journal of Mathematics and Mechanics, vol. 8, no. 3.

[188] Redheffer, R. M. (1960). "On a certain linear fractional transformation," J. Math. and Physics, vol. 39, pp. 269-286.

[189] Richter, S. L. (1987). "A homotopy algorithm for solving the optimal projection equations for fixed-order dynamic compensation: existence, convergence and global optimality," American Control Conference, Minneapolis.

[190] Rockafellar, R. T. (1970). Convex Analysis. Princeton University Press, Princeton, New Jersey.

[191] Roesser, R. P. (1975). "A Discrete State Space Model for Linear Image Processing," IEEE Trans. Automat. Contr., Vol. AC-20, No. 2, pp. 1-10.

[192] Rotea, M. A., and P. P. Khargonekar (1991). "H2 optimal control with an H∞ constraint: the state feedback case," Automatica, Vol. 27, No. 2, pp. 307-316.

[193] Rudin, W. (1966). Real and Complex Analysis. McGraw Hill Book Company, New York.

[194] Qiu, L. and E. J. Davison (1992a). "Feedback stability under simultaneous gap metric uncertainties in plant and controller," Syst. Contr. Lett., Vol. 18, pp. 9-22.

[195] Qiu, L. and E. J. Davison (1992b). "Pointwise gap metrics on transfer matrices," IEEE Trans. Automat. Contr., Vol. 37, No. 6, pp. 770-780.

[196] Safonov, M. G. (1980). Stability and Robustness of Multivariable Feedback Systems. MIT Press.

[197] Safonov, M. G. (1982). "Stability margins of diagonally perturbed multivariable feedback systems," Proc. IEE, Part D, Vol. 129, No. 6, pp. 251-256.

[198] Safonov, M. G. (1984). "Stability of interconnected systems having slope-bounded nonlinearities," 6th Int. Conf. on Analysis and Optimization of Systems, Nice, France.

[199] Safonov, M. G. and J. C. Doyle (1984). "Minimizing conservativeness of robustness singular values," in Multivariable Control, S. G. Tzafestas, Ed. New York: Reidel.

[200] Safonov, M. G., A. J. Laub, and G. L. Hartmann (1981). "Feedback properties of multivariable systems: the role and use of the return difference matrix," IEEE Trans. Auto. Contr., Vol. AC-26, No. 1, pp. 47-65.


[201] Safonov, M. G., D. J. N. Limebeer, and R. Y. Chiang (1990). "Simplifying the H∞ Theory via Loop Shifting, matrix pencil and descriptor concepts," Int. J. Contr., Vol. 50, pp. 2467-2488.

[202] Sarason, D. (1967). "Generalized interpolation in H∞," Trans. AMS, vol. 127, pp. 179-203.

[203] Scherer, C. (1992). "H∞ control for plants with zeros on the imaginary axis," SIAM J. Contr. Optimiz., Vol. 30, No. 1, pp. 123-142.

[204] Scherer, C. (1992). "H∞ optimization without assumptions on finite or infinite zeros," SIAM J. Contr. Optimiz., Vol. 30, No. 1, pp. 143-166.

[205] Silverman, L. M. (1969). "Inversion of multivariable linear systems," IEEE Trans. Automat. Contr., Vol. AC-14, pp. 270-276.

[206] Skelton, R. E. (1988). Dynamic System Control, John Wiley & Sons.

[207] Stein, G. and J. C. Doyle (1991). "Beyond singular values and loop shapes," AIAA Journal of Guidance and Control, vol. 14, no. 1.

[208] Stein, G., G. L. Hartmann, and R. C. Hendrick (1977). "Adaptive control laws for F-8C flight test," IEEE Trans. Auto. Contr.

[209] Stoorvogel, A. A. (1992). The H∞ Control Problem: A State Space Approach, Prentice-Hall, Englewood Cliffs.

[210] Tadmor, G. (1990). "Worst-case design in the time domain: the maximum principle and the standard H∞ problem," Mathematics of Control, Systems and Signal Processing, pp. 301-324.

[211] Tits, A. L. (1994). "The small μ theorem with non-rational uncertainty," preprint.

[212] Tits, A. L. and M. K. H. Fan (1994). "On the small μ theorem," preprint.

[213] Vidyasagar, M. (1984). "The graph metric for unstable plants and robustness estimates for feedback stability," IEEE Trans. Automatic Control, Vol. 29, pp. 403-417.

[214] Vidyasagar, M. (1985). Control System Synthesis: A Factorization Approach. MIT Press, Cambridge, MA.

[215] Vidyasagar, M. (1988). "Normalized coprime factorizations for non-strictly proper systems," IEEE Trans. Automatic Control, Vol. 33, No. 3, pp. 300-301.

[216] Vidyasagar, M. and H. Kimura (1986). "Robust controllers for uncertain linear multivariable systems," Automatica, Vol. 22, pp. 85-94.

[217] Vinnicombe, G. (1993). "Frequency domain uncertainty and the graph topology," IEEE Trans. Automatic Control, Vol. 38, No. 9, pp. 1371-1383.


[218] Walker, D. J. (1990). \Robust stabilizability of discrete-time systems with nor-malized stable factor perturbation," Int. J. Control, Vol. 52, No.2, pp.441-455.

[219] Wall, J.E., Jr., J.C. Doyle, and C.A. Harvey (1980). \Tradeo�s in the design ofmultivariable feedback systems," Proc. 18th Allerton Conf., pp. 715-725.

[220] Wang, W., J.C.Doyle and K.Glover (1991). \Stability and Model Reduction ofLinear Fractional Transformation System," 1991 IEEE CDC, England.

[221] Wang, W. and M. G. Safonov (1990). \A tighter relative-error bound for balancedstochastic truncation," Syst. Contr. Lett., Vol 14, pp. 307-317.

[222] Wang, W. and M. G. Safonov (1992). \Multiplicative-error bound for balancedstochastic truncation model reduction," IEEE Trans. Automat. Contr., vol.AC-37, No. 8, pp. 1265-1267.

[223] Whittle, P. (1981). \Risk-sensitive LQG control," Adv. Appl. Prob., vol. 13, pp.764-777.

[224] Whittle, P. (1986). \A risk-sensitive certainty equivalence principle," in Essaysin Time Series and Allied Processes (Applied Probability Trust, London), pp383-388.

[225] Willems, J.C. (1971). \Least Squares Stationary Optimal Control and the Al-gebraic Riccati Equation", IEEE Trans. Auto. Control, vol. AC-16, no. 6, pp.621-634.

[226] Willems, J. C. (1981). \Almost invariant subspaces: an approach to high gainfeedback design { Part I: almost controlled invariant subspaces," IEEE Trans.Auto. Control, vol. AC-26, pp. 235-252.

[227] Willems, J. C., A. Kitapci and L. M. Silverman (1986). \Singular optimal control:a geometric approach," SIAM J. Control Optimiz., Vol. 24, pp. 323-337.

[228] Wimmer, H. K. (1985). \Monotonicity of Maximal solutions of algebraic Riccatiequations," Systems and Control Letters, Vol. 5, April 1985, pp. 317-319.

[229] Wonham, W. M. (1968). "On a matrix Riccati equation of stochastic control," SIAM J. Control, Vol. 6, No. 4, pp. 681-697.

[230] Wonham, W. M. (1985). Linear Multivariable Control: A Geometric Approach, third edition, Springer-Verlag, New York.

[231] Youla, D. C. and J. J. Bongiorno, Jr. (1985). "A feedback theory of two-degree-of-freedom optimal Wiener-Hopf design," IEEE Trans. Auto. Control, Vol. AC-30, No. 7, pp. 652-665.

[232] Youla, D. C., H. A. Jabr, and J. J. Bongiorno (1976). "Modern Wiener-Hopf design of optimal controllers: part I," IEEE Trans. Auto. Control, Vol. AC-21, pp. 3-13.

[233] Youla, D. C., H. A. Jabr, and J. J. Bongiorno (1976). "Modern Wiener-Hopf design of optimal controllers: part II," IEEE Trans. Auto. Control, Vol. AC-21, pp. 319-338.

[234] Youla, D. C., H. A. Jabr, and C. N. Lu (1974). "Single-loop feedback stabilization of linear multivariable dynamical plants," Automatica, Vol. 10, pp. 159-173.

[235] Youla, D. C. and M. Saito (1967). "Interpolation with positive-real functions," Journal of The Franklin Institute, Vol. 284, No. 2, pp. 77-108.

[236] Young, P. M. (1993). Robustness with Parametric and Dynamic Uncertainty, Ph.D. Thesis, California Institute of Technology.

[237] Yousuff, A. and R. E. Skelton (1984). "A note on balanced controller reduction," IEEE Trans. Automat. Contr., Vol. AC-29, No. 3, pp. 254-257.

[238] Zames, G. (1966). "On the input-output stability of nonlinear time-varying feedback systems, parts I and II," IEEE Trans. Auto. Control, Vol. AC-11, pp. 228 and 465.

[239] Zames, G. (1981). "Feedback and optimal sensitivity: model reference transformations, multiplicative seminorms, and approximate inverses," IEEE Trans. Auto. Control, Vol. AC-26, pp. 301-320.

[240] Zames, G. and A. K. El-Sakkary (1980). "Unstable systems and feedback: The gap metric," Proc. Allerton Conf., pp. 380-385.

[241] Zhou, K. (1992). "Comparison between H2 and H∞ controllers," IEEE Trans. Automatic Control, Vol. 37, No. 8, pp. 1261-1265.

[242] Zhou, K. (1992). "On the parameterization of H∞ controllers," IEEE Trans. Automatic Control, Vol. 37, No. 9, pp. 1442-1445.

[243] Zhou, K. (1993). "Frequency weighted model reduction with L∞ error bounds," Syst. Contr. Lett., Vol. 21, pp. 115-125.

[244] Zhou, K. (1993). "Frequency weighted L∞ norm and optimal Hankel norm model reduction," submitted to IEEE Trans. Automatic Control.

[245] Zhou, K. and P. P. Khargonekar (1988). "An algebraic Riccati equation approach to H∞ optimization," Systems and Control Letters, Vol. 11, pp. 85-92.

[246] Zhou, K., J. C. Doyle, K. Glover, and B. Bodenheimer (1990). "Mixed H2 and H∞ control," Proc. 1990 American Control Conference, San Diego, California, pp. 2502-2507.

[247] Zhou, K., K. Glover, B. Bodenheimer, and J. Doyle (1994). "Mixed H2 and H∞ performance objectives I: robust performance analysis," IEEE Trans. Automat. Contr., Vol. 39, No. 8, pp. 1564-1574.

[248] Zhu, S. Q. (1989). "Graph topology and gap topology for unstable systems," IEEE Trans. Automat. Contr., Vol. 34, No. 8, pp. 848-855.

Index

additive approximation, 157
additive uncertainty, 213, 219
adjoint operator, 94
admissible controller, 294
algebraic duality, 297
algebraic Riccati equation (ARE), 11, 168, 321
    all solutions, 322
    complementarity property, 327
    discrete time, 523
    maximal solution, 335
    minimal solution, 335
    solutions, 27
    stability property, 327
    stabilizing solution, 327
all-pass, 179, 357
    dilation, 179
analytic function, 94
anticausal, 97
antistable, 97

balanced model reduction
    additive, 157
    discrete time, 548
    error bounds, 161, 195
    frequency-weighted, 164
    multiplicative, 157
    relative, 157
    stability, 158
balanced realization, 69, 71, 157
Banach space, 90
basis, 17
Bezout identity, 123
Bode integral, 141
Bode's gain and phase relation, 155
bounded real function, 353
    discrete time, 532
bounded real lemma, 353
    discrete time, 532

Cauchy sequence, 90
Cauchy-Schwartz inequality, 92
causal function, 175
causality, 175
Cayley-Hamilton theorem, 21, 48, 51, 54
central solution
    H∞ control, 411
    matrix dilation problem, 41
classical control, 232
co-inner function, 357
compact operator, 178
complementary inner function, 358
complementary sensitivity, 128, 142
conjugate system, 66
contraction, 408
controllability, 45
    Gramian, 71
    matrix, 47
    operator, 175
controllable canonical form, 60
controller parameterization, 293
controller reduction, 485
    additive, 486
    coprime factors, 489
    H∞ performance, 493
coprime factor uncertainty, 222, 467
coprime factorization, 123, 316, 358, 467
    discrete time, 539
    normalized, 470
cyclic matrix, 21, 61

design limitation, 153
design model, 211
design tradeoff, 141
destabilizing perturbation, 268
detectability, 45, 51
differential game, 459
direct sum, 94
discrete algebraic Riccati equation, 523
discrete time, 523
    coprime factorization, 539
    H∞ control, 548
    H2 control, 543
    inner-outer factorization, 540
    Lyapunov equation, 523
    model reduction, 553
    normalized coprime factorization, 541
disturbance feedforward (DF), 10, 294, 298
    H∞ control, 423
    H2 control, 386
D-K iteration, 289
dom(Ric), 327
double coprime factorization, 123
dual system, 66
duality, 297

eigenvalue, 20
eigenvector, 20, 324
    generalized, 21
    lower rank generalized eigenvector, 21
entropy, 435
equivalence class, 305
equivalence relation, 305
equivalent control, 305

Fℓ, 242
Fu, 242
feedback, 116
filtering
    H∞ performance, 452
fixed order controller, 516
Fourier transform, 96
frequency weighting, 135
frequency-weighted balanced realization, 164
Frobenius norm, 29
full control (FC), 10, 294, 298
    H∞ control, 422
    H2 control, 387, 546
full information (FI), 10, 294, 297
    H∞ control, 416
    H2 control, 385, 546
full rank, 18
    column, 18
    row, 18

gain, 155
gain margin, 232
gap metric, 484
generalized eigenvector, 26, 27, 324
generalized inverse, 35
generalized principal vector, 525
Gilbert's realization, 69
Gramian
    controllability, 71, 74
    observability, 71, 74
graph metric, 484
graph topology, 484

Hamiltonian matrix, 321
Hankel norm, 173
Hankel norm approximation, 173, 188
Hankel operator, 173, 393
    mixed Hankel-Toeplitz operator, 397
Hankel singular value, 75, 162, 173
Hardy spaces, 95
harmonic function, 142
Hermitian matrix, 21
H∞ control, 405
    discrete time, 548
    loop shaping, 467
    singular problem, 448
    state feedback, 459

H∞ filtering, 452
H∞ optimal controller, 430, 437
H∞ performance, 135, 138
H∞ space, 89, 97
H∞⁻ space, 97
hidden modes, 77
Hilbert space, 91
homotopy algorithm, 520
H2 optimal control, 365
    discrete time, 543
H2 performance, 135
H2 space, 89, 95
H2 stability margin, 389
H2⊥ space, 96
H2(∂D) space, 199
H2⊥(∂D) space, 199
Hurwitz, 49

image, 18
induced norm, 28, 101
inertia, 179
inner function, 357
    discrete time, 537
inner product, 92
inner-outer factorization, 143, 358
input sensitivity, 128
input-output stability, 406
integral control, 448
    H∞ control, 448
    H2 control, 448
internal model principle, 448
internal stability, 119, 406
invariant subspace, 26, 322
invariant zero, 84, 333
inverse
    of a transfer function, 66
isometric isomorphism, 91, 174

Jordan canonical form, 20

Kalman canonical decomposition, 52
kernel, 18
Kronecker product, 25
Kronecker sum, 25

Lagrange multiplier, 511
left coprime factorization, 123
linear combination, 17
linear fractional transformation (LFT), 241
linear operator, 91
L∞ space, 96
loop gain, 130
loop shaping, 132
    H∞ approach, 467
    normalized coprime factorization, 475
loop transfer matrix (function), 128
LQG stability margin, 389
LQR problem, 367
LQR stability margin, 373
l2(−∞, ∞) space, 93
L2 space, 95
L2(−∞, ∞) space, 93
L2(∂D) space, 199
Lyapunov equation, 26, 70
    discrete time, 523

main loop theorem, 275
matrix
    compression, 38
    dilation, 38, 445
    factorization, 343, 537
    Hermitian, 21
    inequality, 335
    inertia, 179
    inversion formulas, 22
    norm, 27
    square root of a, 36
maximum modulus theorem, 94
max-min problem, 400
McMillan degree, 81
McMillan form, 80

minimal realization, 67, 73
minimax problem, 400
minimum entropy controller, 435
mixed Hankel-Toeplitz operator, 397
modal controllability, 51
modal observability, 51
model reduction
    additive, 157
    multiplicative, 157
    relative, 157
model uncertainty, 116, 211
μ, 263
    lower bound, 273
    upper bound, 273
μ synthesis, 288
multidimensional system, 249
multiplication operator, 98
multiplicative approximation, 157, 165
multiplicative uncertainty, 213, 220

Nehari's theorem, 203
nominal performance (NP), 215
nominal stability (NS), 215
non-minimum phase zero, 141
norm, 27
    Frobenius, 29
    induced, 28
    semi-norm, 28
normal rank, 79
normalized coprime factorization, 362
    loop shaping, 475
null space, 18
Nyquist stability theorem, 123

observability, 45
    Gramian, 71
    operator, 175
observable canonical form, 61
observable mode, 51
observer, 62
observer-based controller, 62
operator
    extension, 91
    restriction, 91
optimal Hankel norm approximation, 188
optimality of H∞ controller, 430
optimization method, 511
orthogonal complement, 18, 94
orthogonal direct sum, 94
orthogonal matrix, 18
orthogonal projection theorem, 94
outer function, 358
output estimation (OE), 10, 294, 298
    H∞ control, 425
    H2 control, 387
output injection, 298
    H2 control, 386
output sensitivity, 128

Parrott's theorem, 40
Parseval's relations, 96
PBH (Popov-Belevitch-Hautus) tests, 51
performance limitation, 141
phase, 155
phase margin, 232
Plancherel theorem, 96
plant condition number, 230
Poisson integral, 141
pole, 77, 79
pole direction, 143
pole placement, 57
pole-zero cancelation, 77
positive real, 354
positive (semi-)definite matrix, 36
power signal, 100
pseudo-inverse, 35

quadratic control, 365
quadratic performance, 365, 393

Rp(s), 80
range, 18
realization, 67
    balanced, 75
    input normal, 77
    minimal, 67
    output normal, 77

Redheffer star-product, 259
regular point, 512
regulator problem, 365
relative approximation, 157, 165
return difference, 128
RH∞ space, 97
RH∞⁻ space, 97
RH2 space, 95
RH2⊥ space, 96
Riccati equation, 321
Riccati operator, 327
right coprime factorization, 123
risk sensitive, 436
robust performance (RP), 215
    H∞ performance, 226, 275
    H2 performance, 226
    structured uncertainty, 279
robust stability (RS), 215
    structured uncertainty, 278
robust stabilization, 467
roll-off rate, 144

Schmidt pair, 178
Schur complement, 23
self-adjoint operator, 94, 178
sensitivity function, 128, 142
    bounds, 153
    integral, 144
separation theory, 310
    H∞ control, 426
    H2 control, 388
symplectic matrix, 525
singular H∞ problem, 448
singular value, 32
singular value decomposition (SVD), 31
singular vector, 32
skewed performance specification, 230
small gain theorem, 215
Smith form, 79
span, 17
spectral factorization, 343
spectral radius, 20
spectral signal, 100
stability, 45, 49
    internal, 119
stability margin
    LQG, H2, 389
stabilizability, 45, 49
stabilization, 293
stabilizing controller, 293
stable invariant subspace, 27, 327, 525
star-product, 259
state feedback, 298
    H∞ control, 459
    H2 control, 382
state space realization, 60
strictly positive real, 354
structured singular value, 263
    lower bound, 273
    upper bound, 273
structured uncertainty, 263
subharmonic function, 142
supremum norm, 89
Sylvester equation, 26
Sylvester's inequality, 19, 68

Toeplitz operator, 197, 395
trace, 19
tradeoff, 141
transition matrix, 46
transmission zero, 81

uncertainty, 116, 211
    state space, 252
unimodular matrix, 79
unitary matrix, 18
unstructured uncertainty, 211

weighted model reduction, 164, 500
weighting, 135
well-posedness, 117, 243
winding number, 409

Youla parameterization, 303, 316, 454

zero, 77, 79
    blocking, 81
    direction, 142
    invariant, 84
    transmission, 81