
Structural Optimization

Gerald Kress, David Keller and Benjamin Schläpfer

January 22, 2015

Laboratory of Composite Materials and Adaptive Structures


Contents

1 Scope, Goals, and Sample Structural Optimization Problems
  1.1 Introductory Remarks on Design Optimization
  1.2 Overview of the Contents and Acknowledgment
  1.3 Problems of Structural Optimization
    1.3.1 Weight Minimization of a Motorcycle Tubular Frame
    1.3.2 Racing Car Rim
    1.3.3 Maximum-Strength Flywheel Design
    1.3.4 Maximum Bond Strength Design
    1.3.5 Composite Boat Hull
    1.3.6 Minimum Weight Fuel Cell Stack End Plate

2 Treatment of a Structural Optimization Problem
  2.1 Eschenauer's Three-Columns Concept
    2.1.1 Structural Model or Structural Analysis
    2.1.2 Optimization Algorithm
    2.1.3 Optimization Model
  2.2 Practical Optimization Problem Solution Setup
    2.2.1 The DynOPS evaluator

3 Design Evaluation
  3.1 Local Optimality Criteria
  3.2 Global Objective Functions
  3.3 Constraining Functions
  3.4 Several Design Criteria
    3.4.1 Pareto Optimality
    3.4.2 Substitute Problem and Preference Function or Scalarization
  3.5 Transformation Methods and Pseudo Objectives
    3.5.1 Penalty Methods
    3.5.2 Method of Multipliers
    3.5.3 Unconstrained Lagrange Problem Formulation
  3.6 Fitness Function for Evolutionary Algorithms
    3.6.1 Mapping Functions for Objectives
    3.6.2 Constraint Mapping Functions
  3.7 Design Evaluation Exemplified on Selected Problems
    3.7.1 Weight Minimization of a Motorcycle Tubular Frame
    3.7.2 Racing Car Rim Design Evaluation
    3.7.3 Maximum-Strength Flywheel Design
    3.7.4 Maximum Bond-Strength Design
    3.7.5 Composite Boat Hull
    3.7.6 Minimum-Weight Fuel-Cell Stack End Plate

4 Parameterization and Variable Transformations
  4.1 Classification of Design Variables and Structural Optimization Problems
  4.2 Design Parameterization Sample Problems
    4.2.1 Motorcycle Frame and Sizing Parameters
    4.2.2 Shape Parameters of the Formula 1 Race Car Rim
    4.2.3 Maximum-Strength Flywheel and Mesh-Dependent Shape Parameters
    4.2.4 Onsert Design and Mesh-Independent Shape Parameters
    4.2.5 Composite Boat Hull and the Patch Idea
    4.2.6 Fuel-Cell-Stack End Plate
  4.3 The Parameterization Spectrum
    4.3.1 Influence of Mechanical Situation on Parameterization

5 Some Basic Concepts of Global Nonlinear Optimization
  5.1 Nonlinear Optimization Task
  5.2 Feasible Region
  5.3 Convex and Non-Convex Functions
  5.4 Method of feasible directions
    5.4.1 Inequality Constraints
    5.4.2 Equality Constraints
    5.4.3 Generalization of Lagrange factor calculation
  5.5 Lagrange Multiplier Method
    5.5.1 Lagrange Multiplier Method Sample Problem
  5.6 Necessary and Sufficient Optimality Criteria
    5.6.1 Unconstrained Objective Functions
    5.6.2 Kuhn-Tucker Optimality Conditions
  5.7 Duality
    5.7.1 The Max-Min Problem
    5.7.2 The Primal and Dual Problems
    5.7.3 Computational Considerations
    5.7.4 Use of Duality in Nonlinear Optimization
  5.8 Optimization Algorithms Overview
    5.8.1 An Argument for Mathematical Programming
    5.8.2 An Argument for Stochastic Methods
    5.8.3 Mathematical Programming and Stochastic Search Methods
  5.9 Practical Considerations for Numerical Optimization
    5.9.1 Advantages of Using Numerical Optimization
    5.9.2 Limitations of Numerical Optimization

6 Search for Design Improvement: Mathematical Programming
  6.1 Simplex Search Method
  6.2 Method of Steepest Descent
  6.3 Quadratic Objective
  6.4 Original and Modified Newton Methods
  6.5 Nonlinear Conjugate Gradient Methods
  6.6 Powell's Conjugate Direction Method
  6.7 Response-Surface Method Minimizing Algorithms
    6.7.1 Constructing a Response Surface Model from Supporting Points
    6.7.2 Finding the Minimum Point of a Response Surface Model
    6.7.3 The Relation Between RSM and NM
    6.7.4 Adaptive Response Surface Method
  6.8 Line-Search Methods
    6.8.1 One-Dimensional Search in Multidimensional Variables Space
    6.8.2 Interval Halving and Golden Section Methods
    6.8.3 Quadratic and Cubic Approximation Methods
    6.8.4 Brent's Routine
  6.9 Lagrange Multiplier Method Numerical Optimization
    6.9.1 Modified Lagrangian with Local Minima
    6.9.2 Algorithm for Removing Constraint Violations
  6.10 Objective Function Derivatives
    6.10.1 Differences Method
    6.10.2 Sensitivity-Formula Gradient Calculation
  6.11 Global Optimization
    6.11.1 Multi-Start Method
    6.11.2 Tunneling Method

7 Stochastic Search
  7.1 Introduction to stochastic search and optimization
    7.1.1 Neighborhood concept
    7.1.2 Strategies in stochastic search
    7.1.3 A prototype of a stochastic search algorithm
    7.1.4 Performance of stochastic search
  7.2 Stochastic Search Algorithms
    7.2.1 Random Search
    7.2.2 Stochastic Descent
    7.2.3 Metropolis Algorithm
    7.2.4 Simulated Annealing
    7.2.5 Evolutionary Algorithms
  7.3 Representation Concepts
    7.3.1 The universal genotype

8 Composite Structures
  8.1 Design of Fiber Reinforced Composites
  8.2 Laminated Composites
    8.2.1 Introduction
    8.2.2 Classical Laminate Theory
  8.3 Finite Element Representation
    8.3.1 Layered Shell Elements
    8.3.2 Laminate Sensitivities
  8.4 Optimization with Lamination Parameters
    8.4.1 Basic Examples
  8.5 Optimization on Physical Design Variables
    8.5.1 Optimization of the Fiber Orientation
    8.5.2 Optimization of the Stacking Sequence
    8.5.3 Material Optimization
    8.5.4 Optimization of the Laminate Thickness
    8.5.5 Combined Laminate Optimizations
  8.6 Laminate Tailoring
    8.6.1 FEM-Based Parametrization of the Sub-Domains
    8.6.2 CAD-Based Parametrization of the Sub-Domains

9 Selected Methods and Case Studies
  9.1 Computer Aided Optimization after Mattheck
  9.2 Soft-Kill Option after Mattheck
  9.3 Flywheel Optimization and Inspired Mechanical Model
    9.3.1 Stodola's Solution
    9.3.2 Shape optimization
    9.3.3 Discussion of results
    9.3.4 Simple Prediction of Optimum Shape Features
    9.3.5 Conclusions on the Flywheel Optimization and Modeling
  9.4 Topology Optimization after Bendsøe and Kikuchi
    9.4.1 Topology optimization sample problem
    9.4.2 Objective Function and Design Evaluation
    9.4.3 Parameterization
    9.4.4 Optimization Problem Statement
    9.4.5 Lagrange Function
    9.4.6 Gradients with Respect to Densities
    9.4.7 Calculation of Lagrange Factors
    9.4.8 A Dual Algorithm for Topology Design
    9.4.9 Sample Topology Design Problem
  9.5 Truss Optimization with Ground Structure Approach
    9.5.1 Problem Statement
    9.5.2 Problem Statement Extension for Multiple Loads
    9.5.3 Problem Statement with Self-Weight Loading
    9.5.4 Fully Stressed Design and Optimality Criteria Methods

10 Demonstration Programs
  10.1 Program DEMO OPT
  10.2 Program TOP

A Finite Element Method
  A.1 Equation of Motion
  A.2 Formulation of a Layered Shell Element

B Material Invariants of Orthotropic Materials

© ETH Zurich IDMF-CMAS, January 22, 2015

Chapter 1

Scope, Goals, and Sample Structural Optimization Problems

1.1 Introductory Remarks on Design Optimization

Optimization is inseparable from product development. Before the invention of structural analysis methods, optimization was based on a combination of experience, or experimentation, and intuition. The sensitivity of the performance properties of an existing product to changes of its design thus became apparent only through some trial-and-error process, and the time needed for a design improvement was determined by building the new design and testing it, or by finding out about its performance during service.

The invention of powerful structural analysis methods, such as the finite-element method (FEM), reduced the time needed for one optimization cycle because a product no longer has to be built to find out about its performance properties. The structural analysis information may give more or less direct clues to the analyst, or "decision maker", as to how the design of the structure may be changed to improve its properties with respect to some requirement or objective. Still, the optimization process requires intuition to identify the improving design changes, and manual work to implement these changes in the structural analysis models, to evaluate the results of the new simulations, and to compare them with the previous results. Apart from this, in complex situations human intuition may fail at finding the improving design variations, and the path from the initial to a fully optimized design is often so winding that a large number of very small straight steps is required to follow it.

Fully automated optimization procedures let the analyst take part more efficiently in the active and creative product development process. They reduce the process times greatly and find the best design solutions systematically, letting the analyst focus on the human role of planner and decision maker. Structural optimization is quite well developed at the academic level but is still not routinely used in industrial environments.
In the past, optimization procedures that closely follow the winding path through design space from the initial to the optimum design have received the highest attention. Such nonlinear search methods are summarized under the term mathematical programming. They assume the optimization objective to be formulated in terms of continuous and, often, at least twice differentiable functions that should also be convex. Then, mathematical programming requires a numerical effort that can be orders of magnitude less than with other methods. The practical difficulty with mathematical programming lies in the interfacing of the design, analysis, and evaluation models with the optimizers.

Generally, real optimization problems yield objective functions that cannot be processed with mathematical programming. Stochastic search methods do not suffer limitations like mathematical programming, and the interfacing problems are often less severe. They are thus better suited for many practical problems, but they require a much higher number of function evaluations unless the optimum is hit early by chance. Consequently, the latest research [1] focuses on rendering stochastic search methods more efficient by borrowing concepts from mathematical programming. The more efficient stochastic search methods are based on concepts inspired by the evolution of life or bacterial search mechanisms, for instance, and involve the evaluation of whole populations of trial design solutions (or their genotypes) within one generation of the evolution process. Since the individuals of one population can easily be evaluated in parallel, the use of modern massively parallel computer architectures, such as Beowulf clusters with hundreds of processors, mitigates the efficiency deficiency of the stochastic search methods.
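The population idea can be sketched in a few lines. The following toy Python example is illustrative only: the objective function and all numbers are invented, and it is neither the DynOPS framework nor one of the evolutionary algorithms of Chapter 7. Each generation evaluates a whole population of random trial designs, and since each evaluation is independent of the others, they could run in parallel:

```python
import math
import random

# Sketch of a population-based stochastic search on a made-up
# non-convex objective whose global minimum lies at x = 0.
# Every individual of a generation is evaluated independently,
# which is what makes such methods easy to parallelize.

random.seed(0)

def objective(x):
    # toy "design evaluation": non-convex, global minimum at x = 0
    return x * x + 10.0 * (1.0 - math.cos(x))

best_f, best_x, evaluations = float("inf"), None, 0
for generation in range(50):
    population = [random.uniform(-10.0, 10.0) for _ in range(20)]
    for x in population:            # independent -> parallelizable
        evaluations += 1
        f = objective(x)
        if f < best_f:
            best_f, best_x = f, x

print(evaluations)  # 50 generations x 20 individuals = 1000 evaluations
```

The price of the method's robustness is visible in the evaluation count: a gradient method on this smooth one-dimensional function would need far fewer function calls, which is exactly the efficiency gap discussed above.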

1.2 Overview of the Contents and Acknowledgment

The students shall understand the essential aspects of structural optimization by studying the various aspects given with this lecture class script. They begin with studying the selected structural optimization problems listed in the following Section 1.3. The problems are used in the lecture to exemplify the various aspects of setting up an automated optimization procedure. The task is supported by the schematic presented in Chapter 2. Chapter 3 teaches the various components of design evaluation in terms of the formulation of the optimization objectives and the design constraints and the recovery of the necessary data from the structural analysis results. The analysis models and the design models must communicate through transformations between design and analysis variables, which is the subject of Chapter 4 on design parameterization. The parameterization is also linked with the classification of structural optimization problems. Chapter 5 explains some basic concepts related to the minimization of constrained objective functions; the material was extended to include the dual problem formulation in June 2007, although this material was first taught in class in 2008. Optimization algorithms of mathematical programming and advanced stochastic search are explained in some detail in Chapters 6 and 7, respectively. The topic of stochastic search, Chapter 7, was provided by my colleague Dr. David Keller in 2007. In 2013, Dr. Benjamin Schläpfer contributed Chapter 8 on the optimization of laminated composite structures; this topic must also address the basics of the mechanics of composite materials as well as of the finite-element method. Finally, Chapter 9 presents special methods of importance, specifically the methods developed by Claus Mattheck, the homogenization method pioneered by Bendsøe and Kikuchi, and the ground-structure approach for truss design. It also includes a mechanical model inspired [2] by flywheel shape optimization to demonstrate that the interpretation of numerical optimization results may enhance the understanding of mechanical problems.

The sample problems of structural optimization presented in the following section have been provided by my colleagues of the optimization group. The motorcycle-frame (Section 1.3.1) and the race-car rim (Section 1.3.2) design problems are contributed by M. Wintermantel and O. König [3, 4, 5, 6]. The flywheel [2] and load-introduction for sandwich problems [7, 8] are provided by G. Kress. The problem of global laminate optimization (Section 1.3.5), with the composite boat hull sample problem, has been provided by N. Zehnder [9, 10]. The problem of finding a minimum-weight design for the end plates of a fuel-cell stack (Section 1.3.6) has been worked out by O. König [5, 11].


1.3 Problems of Structural Optimization

The structural optimization sample problems presented in the following Subsections 1.3.1 through 1.3.6 have been studied at the Centre of Structure Technologies in the course of building up know-how of optimization methods, in research and development projects, Ph.D. theses, and student graduate projects. Each of the problems is unique in terms of problem statement and solution methods. They are introduced here to trigger the creativity of the students, who are invited to suggest ideas on how the various problems could be solved. The problems are also used throughout the lecture class to illustrate the various solution approaches, design parameterizations, and search techniques for finding the best design solutions. Different types of optimization problems are illustrated in Fig. 1.1.

Figure 1.1: Types of Structural Optimization, panels (a)-(d)

Often, design solutions in terms of load-carrying frames, used for bridges, motorcycle frames, or other structures, have a fixed connectivity of the various members with each other, or topology. If the topology is already specified, a further possibility for improving the performance of such structures lies in the adjustment of the various members' sizes. The sizing changes the area values, or moments of inertia, of members such as trusses or beams. The illustration in Fig. 1.1(a) indicates not only the sizing of member properties but also the change of the position of the nodes where members connect.

The position change of connectors indicated in Fig. 1.1(a) is very similar to the shape optimization indicated in Fig. 1.1(b). The example shows an optimized design with a shape that resulted, through a shape-optimization process, from a rectangular-shaped initial design.

Sizing and shape optimization always require some initial design where the topology is already fixed. Topology optimization requires less information, or pre-existing knowledge: only a definition of the physical or geometric design space and the specification of geometric boundary conditions and loads. By redistributing the material, which is initially evenly distributed in the design space, some topology as indicated in Fig. 1.1(c) is automatically created. The method, along with a specific solution technique [12], is explained in Section 9.4. It was used to create the title page illustration. The shown result is the stiffest structure with respect to the sketched boundary conditions and the fineness of the finite-element mesh. The simulated best design hints at the optimal Michell structures [13].

Advanced composite materials consist of fibers of high stiffness and strength embedded in some matrix whose material may be rather weak and compliant. The resulting composite, for instance with unidirectional reinforcement, has mechanical properties that are highly direction-dependent, or anisotropic.
Such materials are most often used for thin shell structures, where the walls consist of laminates of several layers of the composite material. The orientation of the reinforcement of the individual layers, see Fig. 1.1(d), can be chosen to obtain some desired global structural behavior. Fiber orientation is one of the internal parameters characterizing a laminate, and algorithms for the optimum adjustment of these are called internal material parameter optimization.

Of course, different structural optimization types may be combined to solve one optimization task. Topology optimization may be followed by shape optimization, or shape optimization may be coupled with internal material parameter optimization.

1.3.1 Weight Minimization of a Motorcycle Tubular Frame

The motorcycle manufacturer Ducati, active in racing, uses tubular steel trellis frames such as the one of the Ducati 996R Superbike shown in Fig. 1.2(a). A co-operation with Ducati led to a student graduate project and inspired the development of a solution technique to be published soon [3]. The company is interested in making the frame structure as lightweight as possible whilst at the same time preserving prescribed structural stiffness properties.

Figure 1.2: Ducati 996R frame and geometry model [3]

The structural stiffness influences the racing performance of the bike because it contributes to the suspension characteristics. The springs and shocks provide the suspension characteristics of the bike in the upright position. The forced displacements or dynamic loads, exerted by the road onto the bike at certain speeds, then act parallel to the vertical axis of the bike. In curves taken at high speeds, the forces act more sideways and the suspension system takes up only a small component in the vertical direction of the bike. The lateral component must be damped by the structural compliance of the bike in the lateral direction and with respect to the contact points between wheels and road. A significant factor determining the lateral suspension properties to be provided by the frame is its torsional stiffness. Therefore, the torsional stiffness of the frame, measured in terms of the twisting moment related to the relative twist between its front and rear ends, is a fixed value that must be kept constant when reducing the weight. Such a requirement is called a constraint, specifically an equality constraint. Another constraint is derived from the fact that the frame must withstand the loads to be transferred by it. This implies that the stresses induced by external loading must not exceed the material strength at any location of the frame. In order to obtain the necessary information on the structural stiffness and the stressing of the frame, its behavior and performance under loads must be simulated. The simulations are provided by the finite-element method (FEM), and Fig. 1.2(b) shows the frame geometry that was used for the FEM modelling.
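The ingredients named here, an objective (weight), design variables, and constraints, can be made concrete on a deliberately simple substitute problem. The Python sketch below is unrelated to the actual Ducati study: all numbers are invented, a single tension rod stands in for the frame, and the stress constraint is handled with a quadratic exterior penalty of the kind introduced later in Chapter 3.

```python
# Toy substitute for the frame problem (all numbers invented):
# minimize the mass of a single tension rod, m(A) = rho * L * A,
# subject to the stress constraint sigma(A) = F / A <= sigma_allow.
# The analytic optimum is A* = F / sigma_allow; a quadratic exterior
# penalty turns the constrained problem into a pseudo objective that
# even a crude one-dimensional scan can minimize.

rho, L = 7850.0, 1.0           # steel density [kg/m^3], rod length [m]
F, sigma_allow = 50e3, 200e6   # axial load [N], allowable stress [Pa]

def pseudo_objective(A, r=1e6):
    """Mass plus penalty on the normalized stress-constraint violation."""
    mass = rho * L * A
    violation = max(0.0, F / A - sigma_allow) / sigma_allow
    return mass + r * violation ** 2

# coarse scan over candidate cross-section areas, 1 mm^2 .. 2000 mm^2
candidates = [k * 1e-6 for k in range(1, 2001)]
A_best = min(candidates, key=pseudo_objective)

print(A_best)  # ~2.5e-4 m^2, the analytic optimum F / sigma_allow
```

The penalty makes undersized rods (which violate the stress constraint) expensive, so the scan settles on the lightest feasible area. For the real frame, the scan is replaced by the search algorithms of Chapters 6 and 7 and the stress evaluation by an FEM analysis.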


1.3.2 Racing Car Rim

The problem was considered in the Ph.D. theses of O. König [5] and M. Wintermantel [6], and in a number of student projects. The present problem characterization is taken from [5]. The performance of a race car depends on several properties of its rims, such as the one shown in Fig. 1.3: the weight should be as low as possible, the moment of inertia about the rotation axis has to be small, and the stiffness should be high at the same time.

Figure 1.3: CAD-Model of the racing car rim (courtesy O. König [5])

The weight of the rim does not only influence the performance as a part of the car's overall weight. Since the wheels belong to the so-called unsprung mass, a low rim weight improves the mechanical grip of the car, especially on bumpy road surfaces. The moment of inertia about the wheel's rotational axis should be minimal for several reasons. Low moments of inertia allow faster acceleration and deceleration of the wheels and therefore of the whole car. Furthermore, the moment of inertia leads, through the gyro effect, to higher steering forces as well as a higher inertia of the car with respect to direction changes. Finally, the stiffness of the rim is of high importance in turns at high speed. Vertical loads of 5700 N, resulting from the car's weight and aerodynamic downforces, as well as maximum lateral forces of 7000 N, resulting from the centripetal forces, build up a bending moment on the rim. Additionally, strength requirements must be fulfilled. Plastic yield must not occur in use, although some parts reach temperatures well above 200 °C. A special magnesium alloy is used whose maximum yield stress hardly decreases at higher temperatures and whose mass-specific stiffness ratio is high. Manufacturing starts with a forging blank, the rim's bed is shaped on a CNC lathe, and the spokes form the interspace of CNC-milled pockets. The forging blank's shape is not to be changed, as this would incur excessive costs. FIA¹ regulations affect the bead diameter as well as the dimensions of the lower rim bed. For the optimization presented here, maximum bending stiffness is defined as the main design objective by the rim manufacturer. Nevertheless, the desired properties

• Low mass

• Low rotational moment of inertia

• Sufficient margin of safety for mechanical stresses

of the existing design should also be matched or even surpassed by the optimized design. Furthermore, the optimization must also take into account the functional, regulatory, and manufacturing requirements discussed earlier in this section. Altogether, this constitutes a highly constrained optimization problem, which cannot be tackled with classical mathematical optimization techniques. Since all original data is confidential, arbitrary new geometries for the rim and the forging blank, as well as modified load cases, were created and optimized for the presentation here.

¹ Fédération Internationale de l'Automobile (http://www.fia.com)

1.3.3 Maximum-Strength Flywheel Design

Consider a flywheel of constant thickness with a central bore, as illustrated in Fig. 1.4(a). Because of the rotational symmetry, the structural behavior of the flywheel can be simulated by a 2-dimensional FEM model where the finite elements are based on a rotational-symmetry assumption. Such a model is plotted in Fig. 1.4(b), and the radial and circumferential stress distributions are also shown. The direct stress in the radial direction falls to zero at the edge of the central bore and also at the outer rim, because both are stress-free boundaries. The direct stress in the circumferential direction, however, increases sharply at the edge of the central bore. The material of the flywheel is unevenly stressed and, therefore, not economically used. A more even stress distribution can be found by varying the thickness of the flywheel along the radius.

Figure 1.4: Flywheel with central bore, 2-dimensional FEM model and stresses [2]
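The stress pattern described above also follows from the classical closed-form plane-stress solution for a rotating annular disk of constant thickness, which the FEM result reproduces. The short Python check below uses illustrative values of density, Poisson's ratio, radii, and speed, not those of the study [2]:

```python
from math import pi

# Classical plane-stress solution for a rotating annular disk of
# constant thickness. Illustrative values only.

rho, nu = 7850.0, 0.3        # density [kg/m^3], Poisson's ratio
a, b = 0.02, 0.20            # bore radius and outer radius [m]
omega = 2 * pi * 100.0       # angular speed [rad/s] (100 rev/s)

def sigma_r(r):
    """Radial stress: zero at both free boundaries r = a and r = b."""
    k = (3 + nu) / 8 * rho * omega ** 2
    return k * (a ** 2 + b ** 2 - (a * b / r) ** 2 - r ** 2)

def sigma_t(r):
    """Circumferential (hoop) stress: peaks sharply at the bore edge."""
    k = (3 + nu) / 8 * rho * omega ** 2
    return k * (a ** 2 + b ** 2 + (a * b / r) ** 2
                - (1 + 3 * nu) / (3 + nu) * r ** 2)

print(sigma_r(a), sigma_r(b))   # both ~0: stress-free boundaries
print(sigma_t(a), sigma_t(b))   # hoop stress at the bore far exceeds rim
```

The numbers confirm the qualitative picture of Fig. 1.4(b): the radial stress vanishes at bore and rim, while the hoop stress at the bore is several times that at the rim, so the material near the outer radius is under-utilized.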

1.3.4 Maximum Bond Strength Design

The onsert is a joining element to achieve load introduction into lightweight structures typical for the transportation sector [7, 8]. In contrast to the insert, the onsert is simply bonded to the surface of a plate, which can be a sandwich structure as shown in Fig. 1.5(a). The load that can be transferred by the system consisting of the onsert and the sandwich structure depends on the stress distribution in the bonding layer between the two parts. Assuming that the onsert geometry is rotationally symmetric and a load is centrally applied in the axial direction, a 2-dimensional model similar to that used for analyzing the flywheel can be used. The 2-dimensional basic geometry model is shown in Fig. 1.5(b). The finite-element model shown in Fig. 1.5(c) yields the stress distributions in the bonding layer shown in Fig. 1.5(d). The stresses are very unevenly distributed, with high concentrations at the central bore. Failure of the bond will be initiated there. The maximum-strength design of onserts and other bonded systems is one of the ongoing research activities shared between Alcan and the chair of Structure Technologies.



Figure 1.5: Onsert design demonstrator (a), geometry model (b), finite-element model (c), and bonding-layer stress distributions (d)

1.3.5 Composite Boat Hull

The hull of a sail boat, made of composite materials, should be as stiff as possible under typical service loads. The particular model shown in Fig. 1.6 should be low-priced for marketing reasons. The sample problem is thus useful to demonstrate a problem where a mechanical and an economical objective are combined to give a multi-objective optimization problem. Also, achieving desired mechanical properties with a composite material design provides an interesting design parameterization problem. An elegant key idea to solve this problem was studied and developed by N. Zehnder in the course of his Ph.D. thesis work.

Figure 1.6: ANSYS model of the sail boat hull with composite material patches (Courtesy N. Zehnder [14])


1.3.6 Minimum Weight Fuel Cell Stack End Plate

The weight of an end plate of a fuel cell stack shall be minimized. The problem arose in fuel cell research projects² [15, 16] at the Swiss Federal Institute of Technology Zurich. The end plates for the fuel cell stacks developed in these projects were designed by Tribecraft AG³ [17], which also provided the detailed problem description for the optimization presented in this section. A fuel cell stack, as shown in Figure 1.7, consists of bipolar plates, two collectors, electrical insulation, and the end plates, which are connected with tie bolts. The fuel and cooling supply line runs through the upper end plate.

Figure 1.7: Conceptual model of a fuel cell stack.

The weight of such an end plate is minimized subject to a stress constraint and manufacturing requirements, where the structure is foreseen to be manufactured by extrusion molding. The objective is to increase the power density, i.e. power per weight, of the fuel cell stack. In a second optimization procedure, described in [17], the bottom surface of the end plate can be cambered independently to guarantee a constant pressure distribution on the fuel cell stack in the built-in state. This sequential partitioning of the optimization problem allows the weight of the plate first to be minimized under a stress constraint without concern for the stiffness of the plate. For the verification model mentioned above, the same CAD model and genotype are used to optimize a bridge-like structure. The objective of this optimization is to minimize its compliance subject to a constant mass given as a percentage of the fully-filled design domain.

²http://www.powerpac.ethz.ch
³http://www.tribrecraft.ch


Chapter 2

Treatment of a Structural Optimization Problem

Optimizing a structure by some automated numerical procedure may seem very complex and difficult to organize. The concept presented in the following sections was worked out by Eschenauer [18] and decomposes the task into manageable subtasks so that it can be solved in a straightforward manner. Although Eschenauer's concept seems to have been developed with regard to the optimization algorithms labelled by the term mathematical programming, it is also valid when other solution techniques such as genetic algorithms are used.

2.1 Eschenauer’s Three-Columns Concept

An optimization problem can generally be solved by applying the Three-Columns Concept after Eschenauer [18]. The three columns are the structural model, the optimization model, and the optimization algorithm. The concept is depicted in Fig. 2.1, and the following sections follow to some extent the presentation in the textbook [18].

Figure 2.1: Three-columns concept after Eschenauer [18]


2.1.1 Structural Model or Structural Analysis

In order to be able to perform a computerized optimization process, the real-life structure must be transferred into a structural model. The structural model describes mathematically, or numerically, the physical behavior of a structure. The physical behavior of a mechanical system is its response to loads, or structural properties such as eigenfrequencies or weight. The optimization objective and constraints are formulated in terms of state variables. For instance, if the structural model is formulated by using the finite-element method (FEM), the state variables of a solid mechanics problem are the nodal-point displacements u. Other interesting quantities such as stresses are calculated from the displacements within the postprocessing step.

2.1.2 Optimization Algorithm

Real-life optimization problems generally lead to nonlinear and constrained optimization problems. Optimization algorithms solve such problems. They are based on iteration procedures that proceed from an initial design x0 and produce improved design variable vectors xk. The procedure stops upon satisfaction of some predefined convergence criterion. Experience teaches, and numerous studies have shown, that the choice of the most appropriate optimization algorithm is problem-dependent.
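The iteration from an initial design x0 to improved designs xk can be sketched generically. The fixed-step gradient rule, the quadratic test objective, and the design-change convergence criterion below are illustrative assumptions, not part of Eschenauer's concept:

```python
def minimize(grad, x0, step=0.1, tol=1e-8, max_iter=10_000):
    """Generic iteration procedure: starting from the initial design x0,
    produce improved design variable vectors x_k until a predefined
    convergence criterion (negligible design change) is satisfied."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        x_new = [xi - step * gi for xi, gi in zip(x, g)]
        if max(abs(a - b) for a, b in zip(x, x_new)) < tol:
            return x_new
        x = x_new
    return x

# illustrative convex objective f(x) = (x1 - 3)^2 + (x2 + 1)^2
grad = lambda x: [2 * (x[0] - 3.0), 2 * (x[1] + 1.0)]
x_opt = minimize(grad, [0.0, 0.0])
```

Any real problem would replace the analytic gradient by sensitivities obtained from the structural model; the point here is only the x0, xk, convergence-check skeleton.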

2.1.3 Optimization Model

The optimization model builds a bridge between the structural model and the optimization algorithms.

The evaluation model performs the design evaluation in terms of the optimization objective and the state (i.e. violated or not) of existing constraints from the values of the state variables and other information from the structural model. The optimization objective is often formulated as a scalar objective function f or, in the case of multi-objective optimization, a vector f. The constraints of a design are formulated in terms of constraining function vectors g (inequality constraints) and h (equality constraints). The evaluation model may be based on the state variables u (for calculating stresses, for instance) or some other variables influencing the design (for calculating weight, for instance).

The optimization model also contains variable definitions and transformations that can be summarily called parameterization. The analysis variables y are chosen from the structural parameters. For example, nodal point positions at the boundary of a structural model domain define the shape of the structural model and change during a shape optimization process. The shape of a structure or design is explicitly defined in terms of design variables x. The design model describes the mathematical relation between the analysis variables y and the design variables x.

The design variables x may be additionally transformed into transformation variables z in order to adapt the optimization problem to the special requirements of the optimization algorithm.

The sensitivity analysis yields the sensitivity of the objective and constraints with respect to small changes of the design variables. This information is used to control the optimization algorithm and/or by the decision maker who judges the design.

Design Evaluation

Design evaluation and the setting up of an evaluation model is discussed in chapter 3 on the basis of the sample problems presented in chapter 1. In structural optimization, one generally uses FEM to obtain the response of a structure to loads under specified geometric boundary conditions. The solution of the numerical system of equations yields the primary solution in terms of the nodal-point degrees of freedom. In structural analysis, the degrees of freedom are displacements. From the primary solution, other results such as stresses can be obtained. The stresses may be used to formulate an objective when the strength of a part is to be maximized. Stresses may also be used to formulate a constraint when, for instance, the weight of a load-carrying part is to be minimized but a required strength must be preserved. The weight is then calculated from the integral of the material densities over the volume of the considered part and does not depend on the load response. Generally, however, the structural response is needed for the evaluation of both the objective and the constraints.

Design Parameterization

The parameterization of a design is crucial for achieving an efficient optimization process. Parameterization concerns the design and the analysis models and the transformation between the design variables and the analysis variables.

One important aspect arises from using FEM structural models in terms of a finite-element mesh. The mesh parameters comprise the element types with their physical properties, such as the thickness of plates or cross-sectional values of beams, the fineness of the mesh, i.e. the number of elements or nodal points, and the nodal point positions. Obviously, the mesh parameters can be messy to handle because there are so many of them. Also, one may wish to adapt the fineness of the mesh to specific needs, for instance to have a smaller number of unknowns to reduce the solution time for one design evaluation, or to increase the accuracy of stress results. Such changes would entail changing the whole parameterization. It is therefore desirable to use design parameters that are defined independently of mesh size and to have an automated transformation between those design parameters and the analysis parameters. The transformation can be in terms of a mesh generator that uses, as input, some fixed and variable design parameters and some data controlling the mesh fineness, and generates the FEM mesh data establishing the analysis model. The parameterization in terms of design variables is then mesh-independent. Also, the number of design variables is typically much smaller than that of the analysis variables, so that the optimization algorithm converges much faster, reducing the number of necessary design evaluations and with it the solution time of the optimization process.

It is difficult to discuss more specific details of the parameterization issue in general because the kind and choice of design variables depend very much on the considered problem. Therefore, the sample problems listed in chapter 1 are used to discuss their respective parameterization models in chapter 3.
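Such a mesh-independent parameterization can be sketched as follows; the piecewise-linear thickness profile, the control radii, and all numeric values are illustrative assumptions, not taken from the lecture notes:

```python
def thickness_profile(design_vars, design_radii, element_radii):
    """Map a few mesh-independent design variables (thicknesses at control
    radii) to the analysis variables (one thickness per element position)
    by piecewise-linear interpolation; any mesh fineness may be requested."""
    y = []
    for r in element_radii:
        if r <= design_radii[0]:          # clamp below the control range
            y.append(design_vars[0]); continue
        if r >= design_radii[-1]:         # clamp above the control range
            y.append(design_vars[-1]); continue
        for (r0, t0), (r1, t1) in zip(zip(design_radii, design_vars),
                                      zip(design_radii[1:], design_vars[1:])):
            if r0 <= r <= r1:
                y.append(t0 + (t1 - t0) * (r - r0) / (r1 - r0))
                break
    return y

x = [40.0, 15.0, 10.0]              # three design variables (thicknesses)
ctrl = [20.0, 60.0, 100.0]          # control radii (mm), fixed parameters
coarse = thickness_profile(x, ctrl, [20.0, 60.0, 100.0])
fine = thickness_profile(x, ctrl, [20.0 + i * 1.0 for i in range(81)])
```

The same three design variables serve both the coarse and the fine mesh, so the mesh fineness can be changed without touching the parameterization.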

Transformed Variables

The design variables may be transformed to meet certain requirements of the optimization algorithm. For instance, genetic algorithms operate upon genotype variables that may be defined in terms of bit strings whose encoded information must be transformed into the phenotype design variables. The topic of transformed variables is therefore discussed in context with genetic algorithms in section 6.3.
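A minimal sketch of such a genotype-to-phenotype transformation, assuming a plain binary encoding (the interval and bit length are arbitrary choices for illustration):

```python
def decode(bits, lo, hi):
    """Transform a genotype bit string into a phenotype design variable by
    mapping the encoded integer linearly onto the interval [lo, hi]."""
    value = int(bits, 2)                       # bit string -> integer
    step = (hi - lo) / (2 ** len(bits) - 1)    # resolution of the encoding
    return lo + value * step

# an 8-bit gene resolves a design variable in [1.0, 2.0] into 256 levels
t_min = decode("00000000", 1.0, 2.0)
t_mid = decode("10000000", 1.0, 2.0)
t_max = decode("11111111", 1.0, 2.0)
```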


2.2 Practical Optimization Problem Solution Setup

Some of the sample problems discussed throughout this lecture class (Flywheel, Onsert) have been solved by writing dedicated computer codes where the design is controlled by the variables read from a data input file. The variables determine details of a finite element mesh which is built by a specially written mesh generator. The solution of the FEM model, its evaluation with objective and constraints calculation, and the search algorithm are also implemented in the one dedicated code. Although this concept may give perfect programs for special problems, it is not very useful in practice, where one desires to have a general tool with which various and different problems can be solved. A concept with more practical significance has been developed by O. König [5], M. Wintermantel [6], and N. Zehnder in the course of Ph.D. work. The concept has been developed and realized with evolutionary algorithms as the optimization engine but could also be applied with mathematical programming techniques. It is here described by using material from O. König's Ph.D. thesis [5].

To efficiently apply design optimization to engineering practice, it should be possible to integrate external simulation software, only controllable through text files, with the optimizer. The in-house developed software DynOPS (Dynamic Optimization Parameter Substitution) is made for that purpose: it allows evolutionary optimization to be run using arbitrary simulation software. The software is built on a Generic Evolutionary Algorithm with its design parameters, and it also integrates the design evaluation. The main component newly added in DynOPS is an interface transferring the optimization parameters from the optimizer to the respective simulation program using file amendment, as well as handling of the results. Furthermore, a general concept for program handling is implemented, allowing also sequences with different simulation programs to be used for evaluation. An example of such an evaluation composed of two simulation programs is to handle a parameterized structure in a 3D CAD program, compute its mass and inertia, and export the model to a FEM program where, under given load cases, deflections and maximum stresses are evaluated.

2.2.1 The DynOPS evaluator

To explain the general functionality of DynOPS, the new evaluation module is first discussed as a stand-alone tool. Given an arbitrary population of eoUniGene genotypes, the DynOPS evaluator calculates fitness values for every individual as shown in Fig. 2.2. To set up such an evaluation procedure, the following problem-dependent information must be defined:

Sequence of simulation programs. The framework of the fitness evaluation is defined through a sequence of involved simulation programs together with the needed input/output files.

Input files for the simulation programs. The actual simulations to be carried out are established with the input files, as well as the effective results to be calculated and stored in output files.

Mapping from genotype to input files. Every optimization parameter in the input files must be linked to the appropriate gene of the eoUniGene genotype. This is done by storing the exact row and column where every parameter must be inserted in the input files.


Figure 2.2: DynOPS evaluator to calculate fitness values for a population of individuals using external simulation software.

Objectives, constraints, and fitness function. How to read the objectives and constraints from the appropriate output files, as well as how the actual fitness should be calculated from these values, must be defined.

The simulation program manager starts the first simulation program together with the appropriate input files and waits for job completion. The results from the evaluation stage stored in output files are either used as input for the next simulation program (e.g. a geometry file) or are directly used for fitness calculation (e.g. a mass evaluated in the CAD system). The next evaluation stage in the sequence is then started, and so forth. After completion of the sequence of simulation programs, objective and constraint values stored in result files are transferred back to the DynOPS evaluator. The evaluation of an individual is completed by computing its fitness value. This evaluation loop is repeated for the whole population.
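The row-and-column file amendment idea can be illustrated by a small substitution routine. This is a simplified sketch, not the actual DynOPS interface described in [5], and the solver input format is invented for the example:

```python
def substitute(template_lines, mapping, genes):
    """Insert optimization parameter values into the text input file of a
    simulation program: `mapping` stores, for every gene, the row and the
    whitespace-separated column where the value must be written."""
    rows = [line.split() for line in template_lines]
    for gene_index, (row, col) in mapping.items():
        rows[row][col] = repr(genes[gene_index])
    return [" ".join(fields) for fields in rows]

# hypothetical solver input deck: a keyword followed by numeric fields
template = ["THICKNESS 0.0 0.0", "RADIUS 0.0 0.0"]
mapping = {0: (0, 1), 1: (1, 2)}    # gene 0 -> row 0, col 1; gene 1 -> row 1, col 2
amended = substitute(template, mapping, [4.2, 75.0])
```

The amended file would then be passed to the first simulation program in the sequence; results read back from its output files would feed the fitness computation.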


Chapter 3

Design Evaluation

3.1 Local Optimality Criteria

Material failure is a local affair. A material fails when its strength is exceeded by the local stress state at a point. So, an equivalent stress can be a local optimality criterion. The objective would be to reduce the highest equivalent stress, appearing at some spatial position in the structure, to the smallest possible value, or

min max σ_eqv(x, spatial position)    (3.1)

by adjusting the design parameters x.

An added difficulty with such an objective is that the position where the highest stress value appears is likely to change when the design changes. Evaluating the design then implies calculating the stress distribution and selecting the highest value. The subsequent systematic design change is likely to reduce the highest stress value at the selected position but may also raise stress values at other positions. This holds the risk of a non-converging optimization process or of continued switching between two or more design solutions.

Examples of local optimality criteria are provided by the growth-strategy approaches by Mattheck, called computer aided optimization (CAO) and soft-kill option. The local optimality criteria are implicitly able, under certain circumstances or restrictions, to improve or optimize global design properties such as load transfer capacity.
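The evaluation step of Eq. (3.1), computing the stress distribution and selecting the highest value together with its position, can be sketched as follows; the two stress distributions are invented to show the position of the maximum jumping between designs:

```python
def worst_stress(stresses):
    """Return the highest equivalent stress and the position where it occurs;
    this position may jump to another location when the design changes."""
    position = max(range(len(stresses)), key=lambda i: stresses[i])
    return stresses[position], position

# two illustrative designs: lowering the bore stress raised the rim stress
design_a = [310.0, 220.0, 150.0, 90.0]    # peak at the bore (position 0)
design_b = [180.0, 190.0, 200.0, 240.0]   # peak moved to the rim (position 3)
peak_a, pos_a = worst_stress(design_a)
peak_b, pos_b = worst_stress(design_b)
```

Here the second design has a lower peak stress, yet the critical location has switched from the bore to the rim, which is exactly the behavior that can make a min-max process oscillate.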

3.2 Global Objective Functions

A global objective function defines how the global objective depends on the design parameters and how it changes with a change of the design variables. However, as a global property, it is independent of the spatial coordinates. It is customary to set up the global objective function so that its absolute minimum value corresponds to the objective, or the best design solution regarding the objective,

min f(x) , (3.2)

and if the objective is actually to maximize some property such as volume V, the objective function can be set up so that minimizing it is equivalent to maximizing that property. This is easily achieved by multiplying the property with a negative number, f(x) = −V(x), or by dividing a positive number by it, f(x) = 1/V(x). In context with evolutionary-algorithm solution techniques, the objective function is also called a fitness function.


3.3 Constraining Functions

Generally a design is subject to one or several constraints, such as discussed in section 3.2. The constraints are cast into constraining functions. Following the nature of the considered constraints, one distinguishes between inequality constraining functions g and equality constraining functions h:

g_i(x) ≤ 0,  i = 1, 2, ..., m,
h_j(x) = 0,  j = 1, 2, ..., n,    (3.3)

where m and n are the numbers of the respective constraining functions. Constraining functions restrict the search space. Inequality constraining functions form hyperplanes in the search space, dividing it into the feasible region and infeasible regions, where design constraints are violated. As long as the optimization variables vector points into a feasible region, the inequality constraints are said to be inactive, and then they do not restrict the search at the momentary stage. Inequality constraints become active when the respective constraint is violated or the limiting state is reached,

gi(x) ≥ 0. (3.4)

Equality constraints are always active.

3.4 Several Design Criteria

We have so far considered the optimization of a structure with respect to a single design criterion. If there are two or more criteria to be considered, the respective objective functions can also be minimized simultaneously. Such optimization procedures are called multicriteria optimization, vector optimization, or multiobjective optimization. Problems of this kind are of particular relevance to practice where, in general, several structural response modes and failure criteria must be taken into account in the design process [18]. The form of a constrained vector optimization problem can be stated as

min_{x ∈ ℝⁿ} { f(x) | h(x) = 0, g(x) ≤ 0 },    (3.5)

where f(x) is called a vector objective function of the design variables,

f(x) := [f_1(x), ..., f_m(x)]ᵀ.    (3.6)

At some stage during a multi-objective optimization process, the situation appears that a further minimization of one objective function comes at the cost of increasing some other objective function value. Such a situation is called an objective conflict because none of the possible solutions allows for simultaneous optimum fulfillment of all objectives.

3.4.1 Pareto Optimality

A vector x* is called Pareto-optimal if and only if no vector x ∈ X exists for which

f_j(x) ≤ f_j(x*) for all j ∈ {1, ..., m},
and f_j(x) < f_j(x*) for at least one j ∈ {1, ..., m}.    (3.7)
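The dominance test of Eq. (3.7) translates directly into code; the objective vectors below are invented for illustration:

```python
def dominates(fa, fb):
    """True if objective vector fa Pareto-dominates fb: no component worse
    and at least one component strictly better, cf. Eq. (3.7)."""
    return (all(a <= b for a, b in zip(fa, fb))
            and any(a < b for a, b in zip(fa, fb)))

def pareto_set(points):
    """Keep only the non-dominated objective vectors."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

# illustrative objective vectors (f1, f2), both objectives to be minimized
points = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0), (3.0, 3.0), (4.0, 4.0)]
front = pareto_set(points)
```

Here (3.0, 3.0) and (4.0, 4.0) are dominated by (2.0, 2.0), while the three remaining vectors are mutually non-dominated and form the Pareto set.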


Fig. 3.1 shows a projection from the two-dimensional design space X into the objective function space Y. The Pareto-optimal solutions lie on the line AB. The designer may choose from these solutions by assessing the relative values of the two objective functions.

Figure 3.1: Mapping of a feasible design space into the criteria space [18]

3.4.2 Substitute Problem and Preference Function or Scalarization

Multi-objective optimization problems can be reduced to optimization problems with a scalar objective function by formulating a substitute problem with a preference function p so that

min_{x ∈ ℝⁿ} p[f(x)],    (3.8)

such that

p[f(x*)] = min_{x ∈ ℝⁿ} p[f(x)].    (3.9)

Eschenauer [18] cites various formulations of the preference functions, from which here only the sum of weighted objectives is mentioned:

p[f(x)] := Σ_{j=1}^{m} w_j f_j(x),  x ∈ ℝⁿ.    (3.10)

The weighting factors w_j are chosen by the designer,

0 ≤ w_j ≤ 1,  Σ_{j=1}^{m} w_j = 1.    (3.11)

If all objectives are convex, a full set of Pareto-optimal solutions can be generated by running a sequence of substitute scalar problems where the preference functions cover an appropriate range of values for the weighting factors w_j. If one or several of the objectives are not convex, the Pareto-optimal set is not so easily generated; Eckart Zitzler explains the topic in his lecture class Bio-Inspired Computation and Optimization at ETH Zurich.
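Sweeping the weighting factors of Eq. (3.10) can be sketched as follows; the two convex scalar objectives and the brute-force grid minimization are illustrative assumptions made for the example:

```python
def scalarized_minimum(w1, grid):
    """Minimize the preference function p = w1*f1 + w2*f2 (cf. Eq. (3.10))
    over a grid of candidate designs, for two illustrative convex
    objectives whose individual minima lie at x = 0 and x = 1."""
    f1 = lambda x: x ** 2              # objective 1 prefers x = 0
    f2 = lambda x: (x - 1.0) ** 2      # objective 2 prefers x = 1
    w2 = 1.0 - w1
    return min(grid, key=lambda x: w1 * f1(x) + w2 * f2(x))

grid = [i / 1000 for i in range(1001)]        # candidate designs in [0, 1]
# sweeping the weighting factor traces out the Pareto-optimal designs
sweep = [scalarized_minimum(w / 10, grid) for w in range(11)]
```

As the weight on f1 grows from 0 to 1, the substitute-problem minimizer moves monotonically from the minimum of f2 toward the minimum of f1, covering the Pareto set of this convex example.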


3.5 Transformation Methods and Pseudo Objectives

Searching for a minimum of a constrained objective function adds the burden of keeping out of the infeasible region. In principle, since algorithms for unconstrained search exist, they can also be used for solving constrained problems if these are transformed into unconstrained problems. Some of these transformation methods, namely two versions of the penalty method and the multiplier method, are described in the following.

3.5.1 Penalty Methods

The penalty methods [19] transform the objective function f(x) and the constraining functions h and g into a transformed objective function p without explicit constraints. Such a transformed objective function is also called a pseudo objective, and it is obtained by adding to f a penalty function Ω that is composed of the constraining functions and some penalty parameters R,

p(x, R) = f(x) + Ω (R, g(x), h(x)) . (3.12)

The function Ω can be defined so that either the exterior point method or the interior point method results [20].

Exterior Point Penalty Method

An example of the exterior penalty method is provided by setting up the penalty function with the quadratic penalty term, so that constraint violations are penalized:

Ω(x, R) = R Σ_{j=1}^{m} max[0, g_j(x)]² + R Σ_{k=1}^{l} h_k(x)².    (3.13)

Because of the max operator, of the inequality constraining functions g(x), only the violated ones contribute to the penalty formulation (3.13). Inserting it into (3.12) results in an unconstrained function whose minimum point lies outside the feasible region, hence the name of this method. As Fig. 3.2(a) shows, with increasing values of the penalty parameter R the minimum moves closer to the feasible region, but it can never quite reach it.

Interior Point Penalty Method

An interior point method is the result of selecting a form for Ω that forces stationary points of p(x, R) to be feasible. Such methods are also called barrier methods, since the penalty term forms a barrier of infinite p function values along the boundary of the feasible region. Since the keeping of constraints may be crucial for safe design solutions, it is generally preferable to use methods that find the improved design solutions within the feasible region. The interior point method has that property because, for the inequality constraints, it penalizes closeness to the infeasible region even before the constraints are actually violated. This is achieved, for instance, with the inverse penalty term:

Ω(x, R′) = R′ Σ_{j=1}^{m} (−1 / g_j(x)) + R′ Σ_{k=1}^{l} h_k(x)².    (3.14)

Fig. 3.2(b) illustrates that with decreasing value of the penalty parameter R′, the minimum point of the transformed objective function moves closer to the infeasible region, or to the constrained minimum point. The pseudo objective after the interior penalty method exhibits an unfavorable topology.


Illustrative Sample Problem

The objective function

f(x) = (1/20)(x + 2)²    (3.15)

depends on only one variable x and has its unconstrained minimum point at x = −2.

Figure 3.2: Examples for the exterior (a) and interior (b) penalty functions

The variable x is subject to the side constraints

1 < x ∧ x < 2. (3.16)

Thus, the feasible region of the optimization variable is 1 < x < 2. The two side constraints define the inequality constraining functions

g1(x) : 1− x ≤ 0, g2(x) : x− 2 ≤ 0. (3.17)

It can be seen that the minimum point of the constrained original problem is at x = 1. The substitute problem resulting from the transformation with the exterior penalty method has its minimum point between that of the unconstrained and that of the constrained original objective function, −2 < x* < 1. In Fig. 3.2(a) the unconstrained function f and three transformed functions p are plotted for the penalty parameter values 1, 10, and 100. It can be seen how the minimum of the transformed function moves closer to x = 1 as the penalty parameter increases from 1 to 100.

The interior penalty method yields an unconstrained substitute function whose minimum point lies within the feasible region. As the penalty parameter R′ decreases from 100 to 1, the minimum point moves closer to the edge of the infeasible region. It can also be seen from Fig. 3.2(b) that it will take very small values of R′ to move the minimum of p close to x* = 1.

One might conclude that solutions close to the constrained minimum point can be obtained simply by using very high or very small values for the penalty parameters R or R′, respectively, and just minimizing the pseudo objective for these values. However, it can be seen from Fig. 3.2 that the unconstrained pseudo objectives are more distorted than the original objectives. The search methods of mathematical programming (section 6.3) are designed to work best on objective functions that behave almost like quadratic functions and might fail at the highly distorted pseudo objectives resulting from choosing the penalty parameters so that the minimum point of the unconstrained pseudo objective is very close to the true constrained minimum point.

The constrained optimization problem is therefore solved by a sequence of unconstrained subproblems, where the penalty parameters are updated at each step. Considering the exterior point method, the parameter R is chosen small, for instance zero, at the first stage, and is gradually increased in the subsequent stages. For the interior point method, one starts with a high value of R′ and decreases it from stage to stage. The minimum point of each subproblem is then used as a starting point for solving the next subproblem. So the considered regions in the search space become smaller as the pseudo objectives become more distorted. Nevertheless, it is inevitable that the generated subproblems become progressively ill-conditioned, so that at some point the sequence terminates not because of finding a very close approximation to the true constrained minimum point but because of failure of the search algorithms.
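The sequence of exterior-point subproblems can be sketched on the sample problem of Eqs. (3.15) to (3.17). The interval-shrinking search used as the inner unconstrained minimizer is an illustrative stand-in for any search algorithm:

```python
def ternary_min(p, a=-5.0, b=5.0, iters=200):
    """Minimize a unimodal one-dimensional function by interval shrinking
    (stands in for any unconstrained search algorithm)."""
    for _ in range(iters):
        m1, m2 = a + (b - a) / 3, b - (b - a) / 3
        if p(m1) < p(m2):
            b = m2
        else:
            a = m1
    return 0.5 * (a + b)

f = lambda x: (x + 2.0) ** 2 / 20.0               # objective, Eq. (3.15)
g = (lambda x: 1.0 - x, lambda x: x - 2.0)        # constraints, Eq. (3.17)

def pseudo(x, R):
    """Exterior-point pseudo objective, Eqs. (3.12) and (3.13)."""
    return f(x) + R * sum(max(0.0, gj(x)) ** 2 for gj in g)

# sequence of unconstrained subproblems with increasing penalty parameter
stages = [ternary_min(lambda x, R=R: pseudo(x, R)) for R in (1.0, 10.0, 100.0)]
```

For this problem the stage minima can be computed by hand as x*(R) = (20R − 2)/(20R + 1), so the sequence approaches the constrained minimum x = 1 from the infeasible side without ever reaching it, exactly as described in the text.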

3.5.2 Method of Multipliers

The method of multipliers (MOM) [21, 22, 23] uses the penalty function

P(x, σ, τ) = f(x) + R Σ_{j=1}^{J} { ⟨g_j(x) + σ_j⟩² − σ_j² } + R Σ_{k=1}^{K} { [h_k(x) + τ_k]² − τ_k² },    (3.18)

where R is a constant scale factor (R may vary from constraint to constraint but remains constant from stage to stage), and the bracket operator is defined as

⟨α⟩ = α if α > 0,  ⟨α⟩ = 0 if α ≤ 0.    (3.19)

The σ_j and τ_k parameters are constant during each unconstrained minimization but are updated from stage to stage. It is not necessary for the starting vector x0 to be feasible, and the parameters can be conveniently chosen for the first stage as σ = τ = 0. Thus the first minimization stage is identical to an unconstrained minimization using standard exterior point penalty terms.

Multiplier Update Rule

Suppose that the vector x^(t) minimizes the t-th-stage penalty function

P(x, σ^(t), τ^(t)) = f(x) + R Σ_{j=1}^{J} { ⟨g_j(x) + σ_j^(t)⟩² − [σ_j^(t)]² } + R Σ_{k=1}^{K} { [h_k(x) + τ_k^(t)]² − [τ_k^(t)]² }.    (3.20)

Multiplier estimates for the (t+1)-st stage are formed according to the following rules:

σ_j^(t+1) = ⟨g_j(x^(t)) + σ_j^(t)⟩,  j = 1, 2, ..., J,
τ_k^(t+1) = h_k(x^(t)) + τ_k^(t),  k = 1, 2, ..., K.    (3.21)

Because of the bracket operator, σ has no negative elements, whereas the elements of τ can take either sign.
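The update rule (3.21) can be sketched on the one-variable sample problem from section 3.5.1, keeping only the inequality constraint g(x) = 1 − x ≤ 0. The interval-shrinking inner minimizer is again an illustrative stand-in; note that, unlike the pure penalty method, the constrained minimum x = 1 is approached with a fixed, finite R:

```python
def ternary_min(p, a=-5.0, b=5.0, iters=200):
    """Minimize a unimodal one-dimensional function by interval shrinking."""
    for _ in range(iters):
        m1, m2 = a + (b - a) / 3, b - (b - a) / 3
        if p(m1) < p(m2):
            b = m2
        else:
            a = m1
    return 0.5 * (a + b)

bracket = lambda alpha: alpha if alpha > 0.0 else 0.0   # Eq. (3.19)
f = lambda x: (x + 2.0) ** 2 / 20.0                     # sample objective, Eq. (3.15)
g = lambda x: 1.0 - x                                   # inequality constraint g(x) <= 0
R = 1.0                                                 # fixed scale factor

sigma = 0.0                                             # first stage: sigma = 0
for stage in range(25):
    # stage penalty function, Eq. (3.18) restricted to one inequality constraint
    P = lambda x: f(x) + R * (bracket(g(x) + sigma) ** 2 - sigma ** 2)
    x = ternary_min(P)
    sigma = bracket(g(x) + sigma)                       # update rule, Eq. (3.21)
```

For this problem, sigma converges to the fixed point 0.15 (which equals the Lagrange multiplier divided by 2R) and x converges to the exact constrained minimum x = 1.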


3.5.3 Unconstrained Lagrange Problem Formulation

An optimization problem with linear equality constraints can be transformed into an unconstrained substitute problem by introducing the Lagrange function L:

min_{x ∈ ℝⁿ} { f(x) | h(x) = 0 }  →  min_{x ∈ ℝⁿ} L(x),  L = f + λᵀh.    (3.22)

Figure 3.3: Illustration to the unconstrained Lagrange problem formulation

Suppose that improvement is sought by moving from a reference point, which must be feasible, along a search direction s. The search direction is a linear combination of a usable direction, along which smaller values of f are found (an obvious choice is the direction of steepest descent), and the gradients of the equality constraining functions:

s = −∇f − λT∇h (3.23)

The search direction is feasible, or will not violate the linear constraint, if it is orthogonal to the constraining function gradient:

sT∇h = 0 (3.24)

Inserting the definition (3.23) of the search direction into the orthogonality condition (3.24) yields the Lagrange factors λ_i:

λ_i = − (∇fᵀ∇h_i) / (∇h_iᵀ∇h_i).    (3.25)

The Lagrange function, or Lagrangian, is the function whose negative gradient provides a feasible search direction obeying all constraints:

L(x, λ) = f(x) + Σ_{j=1}^{m} λ_j g_j(x) + Σ_{k=1}^{l} λ_{m+k} h_k(x).    (3.26)

More information on the method of feasible directions, the Lagrangian, and the Kuhn-Tucker conditions for constrained optimization problems is given in Sections 5.4 through 5.6. Section 5.5 on page 55 explains why the stationary point of the Lagrangian is a saddle point, and Section 5.7 on page 59 introduces the concept of duality, where the dual problem is that of solving the constrained optimization problem in terms of the Lagrange multipliers. Section 6.9 considers numerical solution of the Lagrangian by searching for a minimum. If the constraining function g is not linear, the found search direction is feasible only at the reference point, and moving along it will eventually violate the inequality constraints. It will then be necessary to remove the violations, or to find a feasible point close to the infeasible one just obtained. Such an algorithm is explained in Section 6.9.2.
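Eqs. (3.23) to (3.25) can be checked numerically for a single linear equality constraint. The example function, constraint, and reference point below are illustrative assumptions:

```python
def lagrange_direction(grad_f, grad_h):
    """Feasible search direction s = -grad f - lambda * grad h (Eq. (3.23))
    with the Lagrange factor from Eq. (3.25), which enforces the
    orthogonality condition (3.24) for one linear equality constraint."""
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    lam = -dot(grad_f, grad_h) / dot(grad_h, grad_h)
    s = [-gf - lam * gh for gf, gh in zip(grad_f, grad_h)]
    return s, lam

# illustrative problem: f = x1^2 + x2^2, h = x1 + x2 - 2 = 0, at point (2, 0)
grad_f = [4.0, 0.0]     # gradient of f at the feasible reference point
grad_h = [1.0, 1.0]     # constant gradient of the linear constraint
s, lam = lagrange_direction(grad_f, grad_h)
```

The resulting direction is orthogonal to the constraint gradient, so a small move along s keeps the linear constraint satisfied while decreasing f.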


3.6 Fitness Function for Evolutionary Algorithms

Evolutionary Algorithms evaluate a given design by a fitness function. The term is used in context with Genetic Algorithms, which are inspired by the understanding of the mechanisms of biological development under environmental pressure. The analogy also led to distinguishing between phenotype and genotype. The phenotype refers to the parameters of the real design, while the genotype results from transforming the phenotype parameters into a form upon which the genetic algorithms can work efficiently. In mimicry of nature's working, genotype representation used to be in terms of binary bit strings. For this reason, the symbol x, familiar to those working with mathematical programming, is replaced by p or g, to point out that phenotype or genotype variables, respectively, are meant. The fitness function is a sum of products of weights w and demands D:

F(p) = Σ_i w_i D_i(p)    (3.27)

The demands represent ratings for one or several objectives and constraints, so that the fitness function F appears at first sight analogous to the pseudo-objective function P explained in Section 3.5.1. Recent research [6, 5] has elaborated schemes for defining the ratings in such a way that the problem of finding appropriate weight values dissolves. The following is direct citation, or uses material, from Oliver König's Ph.D. thesis [5].
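The weighted sum (3.27) itself is trivial to code; a minimal sketch, with made-up demand values:

```python
def fitness(demands, weights=None):
    """Fitness (3.27): weighted sum of demand values D_i, each assumed scaled
    to [0, 1]. With normalized demand mappings, unit weights suffice."""
    if weights is None:
        weights = [1.0] * len(demands)
    return sum(w * d for w, d in zip(weights, demands))

# Example: one objective demand and two constraint demands, unit weights.
F = fitness([0.4, 0.0, 0.1])
```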

In order to avoid that one of these terms becomes much larger than the other ones and therefore dominant, only bounded functions scaled to the interval [0, 1] are used. Moreover, this facilitates the adjustment of the weight coefficients w_i. Further, to enhance general usability of the fitness formulations, the mapping functions D_i(p⃗) are defined range-independent. This means that a certain mapping function does not change its behavior if applied to objectives or constraints operating in different number ranges. Based on these requirements, functions D_i for the different possible types of optimization objectives and constraints are presented. The focus for the formulation of these mapping functions is put on good practical usability. The user of an evolutionary design optimization program shall be able to define good mapping functions by only bringing in know-how about the problem he wants to solve. Thus three types of general mapping functions are defined with their defining parameters as listed in Table 3.1. The defining parameters are chosen so that they relate directly to engineering practice. For a problem at hand,

Table 3.1: General types of fitness functions with defining parameters.

  Type                        Required parameters                Optional
  Design objective            O_init, O_estim                    α_post
  Upper/lower limit constr.   C_limit, C_feas_tol                -
  Target constraint           C_target, C_adm_tol, C_feas_tol    -

the parameters O_init and O_estim refer to the initial value and the estimated best-possible value of the design objective, respectively. The design objective function can optionally be modified using an amplification factor α_post. A limit-value constraint is defined through C_limit for the given limit value, and through a parameter C_feas_tol specifying a tolerance range for constraint values still considered feasible for the problem at hand. Finally, there are target-value constraints defined through three parameters. C_target defines the target value to be achieved. It is useful to specify an admissible tolerance C_adm_tol, defining an interval for the target value to be reached at the end of the optimization. Additionally, a feasible value tolerance C_feas_tol should be specified, defining which constraint values should still be considered during optimization. In the following, fitness functions for the different types and parameters are presented.

3.6.1 Mapping Functions for Objectives

Let O(p⃗) be a measure of the objective of a design optimization problem, as for instance the mass or the compliance of the considered structure. Then, one can define

O′(p⃗) = {  O(p⃗) : if O(p⃗) is to be minimized
         { −O(p⃗) : if O(p⃗) is to be maximized    (3.28)

and therefore O′(p⃗) always represents a value to be minimized. The functional mapping D_i(O′) should satisfy the following requirements:

1. The resulting fitness values must fit into the interval [0, 1].

2. Relevant design improvements should be reflected in distinct decreases of D_i(O′).

3. Selection pressure is initially strong and slows down close to the optimum.

To meet these requirements, the mapping function is defined as

D_i(O′) = (a O′ + b)^α    (3.29)

where the choice of the exponential factor α = 5 is based on experience, and a and b are scaling factors defined by the conditions

D_i(O = O_init) = 1
D_i(O = O_estim) = 0.1    (3.30)

O_init represents an initial value of the design objective, which shall result in the maximum fitness value 1. O_estim is the estimated goal value that can be achieved in the optimization. The fitness value 0.1 for O_estim was adjusted together with an exponential factor α = 5 in order to fulfill the third requirement defined above. Furthermore, a small fitness value for the estimated objective also ensures conformance with the second requirement defined. The scaling factors a and b can be computed as:

a = (1 − 0.1^{1/α}) / (O_init − O_estim),    b = 1 − a O_init    (3.31)

The user has only to specify O_init and O_estim to define the fitness function for a design objective of a problem at hand. The bold line in Fig. 3.4 pictures an example of the fitness function computed for O_init = 100 and O_estim = 60.

Figure 3.4: Fitness function for a design objective defined through O_init and O_estim.

The graph demonstrates that the three given requirements for the fitness function are met for arbitrary ranges of objective values. As a last tuning parameter, α_post is introduced. Leaving the scaling parameters a and b constant, the exponential factor α can be varied subsequently to adapt to the problem at hand, as presented in Fig. 3.4. Usually these variations will be of negligible influence on the overall performance of the algorithm.
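The calibration (3.30)-(3.31) and the mapping (3.29) can be coded directly; the sketch below (function name illustrative) reproduces the example values O_init = 100 and O_estim = 60 from the text:

```python
def objective_mapping(o_init, o_estim, alpha=5.0):
    """Build D_i(O') = (a*O' + b)**alpha per (3.29), with a and b from (3.31)
    calibrated so that D(o_init) = 1 and D(o_estim) = 0.1, cf. (3.30)."""
    a = (1.0 - 0.1 ** (1.0 / alpha)) / (o_init - o_estim)
    b = 1.0 - a * o_init
    return lambda o: (a * o + b) ** alpha

D = objective_mapping(100.0, 60.0)  # example values from the text
```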

3.6.2 Constraint Mapping Functions

The community of evolutionary algorithms uses a different vocabulary for constraints than the community of mathematical programming: inequality constraints become upper and lower limit constraints, and equality constraints are called target-value constraints.

Upper and Lower Limit Constraint Mapping Functions

For constraint values C(p⃗) that are not allowed to exceed or fall below a certain value C_limit, the most straightforward approach would be to define a simple step function of the form

D_i(C) = { 0 : C(p⃗) ≤ C_limit
         { 1 : else    (3.32)

With this penalty function, a strong selection pressure is exerted towards solutions that meet the constraint. On the other hand, the constraint function makes no distinction between values that violate the constraint only marginally and values that clearly violate the restriction. However, the former values can contain valuable information for the problem and should therefore not be strictly excluded. In order to take this fact into account, a smoothed step function is defined as

D_i(C) = 1 / (1 + e^{−λ(C(p⃗) − C_limit − Δ)})    (3.33)


allowing the algorithm to also extract valuable information from solutions with marginally violated constraints. The step function is controlled with two parameters: Δ adjusts the horizontal positioning of the function, and λ determines the steepness of the step. However, the parameters Δ and λ are difficult to adjust correctly, since they depend on the range of occurring constraint values. Furthermore, it is difficult to estimate their quantitative effect on the step function. Based on these findings, another definition of the step function has been developed. For practical purposes, it would be much more convenient to define the step function only with its limit value C_limit and an additional tolerance C_feas_tol giving an upper limit of feasible constraint values to be taken into account by the EA. A method to define the step functions that way is to specify two conditions

D_i(C = C_limit) = D_limit
D_i(C = C_limit + C_feas_tol) = D_feas    (3.34)

where D_limit is the penalty value typically reached at the end of an optimization, where the EA has found an equilibrium between the different demands of the fitness function. D_feas corresponds to the penalty value which is typically still taken into account by the EA. For the normalized fitness formulation used within this thesis, D_limit = 0.01 and D_feas = 0.5 proved to be reasonable. With these two conditions given, the original parameters of Equation 3.33 can be computed as

λ = (1/C_feas_tol) [ ln(1/D_limit − 1) − ln(1/D_feas − 1) ]    (3.35)

Δ = (1/λ) ln(1/D_limit − 1)

Fig. 3.5 presents the mapping functions for a critical value given as C_limit = 60, and for tolerance values C_feas_tol = 0.6...6. This corresponds to a 1-10% tolerance of the limit value C_limit. For this formulation, it must hold that C_feas_tol ∈ R+ for upper limits and C_feas_tol ∈ R− for lower limits, respectively.
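The smoothed step (3.33), with λ and Δ computed from (3.35), can be sketched as follows; the parameters match the Fig. 3.5 example with a 10% feasibility tolerance:

```python
import math

def limit_constraint_mapping(c_limit, c_feas_tol, d_limit=0.01, d_feas=0.5):
    """Smoothed step penalty (3.33); lam and delta follow from the two
    conditions (3.34) via (3.35)."""
    lam = (math.log(1.0 / d_limit - 1.0) - math.log(1.0 / d_feas - 1.0)) / c_feas_tol
    delta = math.log(1.0 / d_limit - 1.0) / lam
    return lambda c: 1.0 / (1.0 + math.exp(-lam * (c - c_limit - delta)))

D = limit_constraint_mapping(60.0, 6.0)  # upper limit 60, tolerance 6
```

Constraint values at the limit receive the small penalty D_limit, values at the edge of the feasibility tolerance receive D_feas, and clear violations are penalized with values near 1.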

Target Value Constraint Mapping Functions

A formulation for constraints where a parameter has to achieve a given target value C_target, e.g. a defined stiffness of a structure, is discussed. For real-valued parameters it makes sense to define a tolerance C_adm_tol leading to an acceptance interval within which the resulting constraint values should fall. Further, smooth transitions are defined, similar to the formulations for limit constraint values. Therefore, the function is defined as

D_i(C) = { 0 : |C(p⃗) − C_target| < C_adm_tol
         { 1 − exp(−(|C(p⃗) − C_target| − C_adm_tol)² / (2σ²)) : |C(p⃗) − C_target| ≥ C_adm_tol    (3.36)

where C(p⃗) refers to the actual constraint value. The parameter σ determines how fast the penalty value increases when the acceptance interval is left.

Figure 3.5: Upper limit constraint penalty functions defined through C_limit and C_feas_tol.

In this form, the penalty function can be used to solve practical problems. However, the parameter σ depends on the number range of the constraint considered, and is therefore very difficult to adjust for a problem at hand. To parameterize a constraint function through the parameters given in Table 3.1, an additional condition is introduced.

D_i(C = C_target + C_feas_tol) = D_feas    (3.37)

As introduced before, the feasible value tolerance C_feas_tol determines which constraint values should still be taken into consideration during optimization. D_feas corresponds to the penalty value which is typically still taken into account by the EA. For practical purposes, D_feas = 0.5 proved to be reasonable. The initial parameter σ can now be determined as

σ² = (C_feas_tol − C_adm_tol)² / (−2 ln(1 − D_feas))    (3.38)

Fig. 3.6 shows examples for this penalty function where a target value of 60 ± 5 shall be achieved. Therefore C_target = 60 and C_adm_tol = 5 are set, and the feasible tolerance is varied with C_feas_tol = 6...15.

Figure 3.6: Penalty functions for a target constraint.
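The target-value penalty (3.36), with σ computed from (3.38), can be sketched as follows; the parameters correspond to the Fig. 3.6 example (target 60 ± 5, feasible tolerance 10):

```python
import math

def target_constraint_mapping(c_target, c_adm_tol, c_feas_tol, d_feas=0.5):
    """Target-value penalty (3.36); sigma^2 follows from condition (3.37)
    via (3.38)."""
    sigma2 = (c_feas_tol - c_adm_tol) ** 2 / (-2.0 * math.log(1.0 - d_feas))

    def D(c):
        dev = abs(c - c_target)
        if dev < c_adm_tol:
            return 0.0  # inside the acceptance interval: no penalty
        return 1.0 - math.exp(-((dev - c_adm_tol) ** 2) / (2.0 * sigma2))

    return D

D = target_constraint_mapping(60.0, 5.0, 10.0)
```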


3.7 Design Evaluation Exemplified on Selected Problems

Chapter 1 presents selected structural optimization problems and specifies the design objectives and constraints. This section discusses the objectives and constraints. The form of the objective and constraining functions depends on the optimization algorithms chosen for a problem. The methods of Mathematical Programming work with the formulations given in Sections 3.1 through 3.5, while Evolutionary Algorithms draw advantage from those provided in Section 3.6. Table 3.2 tells whether the respective sample problem is solved with mathematical programming or evolutionary algorithm optimization engines. The explanations for the problems solved with evolutionary algorithms are taken from the recently finished Ph.D. work of O. König [5] and M. Wintermantel [6] and the ongoing research of N. Zehnder. The constraining function formulations sometimes depend on the

Table 3.2: Problems and solution methods.

  Problem            Solution Method
  Motorcycle Frame   Evolutionary Algorithms
  Racing Car Rim     Evolutionary Algorithms
  Flywheel           Mathematical Programming
  Onsert             Mathematical Programming
  CFRP Boat Hull     Evolutionary Algorithms
  End Plate          Evolutionary Algorithms

parameterization and cannot be considered separately. In these cases, reference is made to the respective sections in Chapter 4. Side constraints giving lower and upper bounds on analysis or design variables are directly discussed in the respective sections of Chapter 4 on parameterization and variable transformations.

3.7.1 Weight Minimization of a Motorcycle Tubular Frame

The motorcycle frame problem is presented in Section 1.3.1, and the evaluation model presented here is being published [3]. The objective is minimum weight. As a constraint, the torsional stiffness of the optimized frame must be the same as that of the initial design. Another constraint is the strength of the frame under various selected severe load cases. Further constraints come from manufacturing considerations regarding the design parameters: it is intended to make the frame from a selection of standard-size tubes or a limited number of customized tubes. The problem is thus used to demonstrate a search method based on Evolutionary Algorithms to solve the discrete problem. The objective of minimum weight and the constraints are transformed into a fitness function of the form (3.27). Normalizing the weight W(g) with respect to the initial weight W_initial gives

D_weight(g) = W(g) / W_initial.    (3.39)

The chassis is to keep the torsional stiffness value of the existing initial design, which is included in the fitness function by the target-value constraint mapping function (3.36). The frame must withstand the applied loads, so it must be demanded that the maximum equivalent stress value σ_eqv,max(g) is less than the critical value σ_crit. This is achieved by the upper-limit-constraint mapping function (3.32).

Figure 3.7: FEM model of the frame and load cases (courtesy O. König [5])

The chassis is a space-truss design and the tubes carry mainly axial loads. However, since they are welded together at the joints, bending will also contribute slightly to the structural stiffness. Thus, the fitness of the design is more sensitive to changes of the cross-sectional areas of the tubes but depends also slightly on their second moments of inertia. The area A, depending on the geometrical parameters diameter D and wall thickness t, influences both the weight and the extensional stiffness linearly. Given a certain area value, the bending stiffness increases quadratically with increasing diameter:

I = (1/8) A D².    (3.40)
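Relation (3.40) is a thin-wall approximation. As a quick numerical check, one can compare it against the exact annulus formula I = π(D_o⁴ − D_i⁴)/64; the dimensions below are illustrative:

```python
import math

def thin_wall_tube(D, t):
    """Thin-wall approximations for a tube of mean diameter D and wall
    thickness t: A ~ pi*D*t and I ~ A*D**2/8, cf. (3.40)."""
    A = math.pi * D * t
    return A, A * D ** 2 / 8.0

def exact_annulus_inertia(D, t):
    """Exact second moment of area of an annulus with mean diameter D."""
    D_o, D_i = D + t, D - t
    return math.pi * (D_o ** 4 - D_i ** 4) / 64.0

A, I_approx = thin_wall_tube(50.0, 2.0)      # mm
I_exact = exact_annulus_inertia(50.0, 2.0)   # mm^4
```

For D/t = 25 the approximation agrees with the exact value to well under one percent.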

Due to the influence of bending stiffness on the fitness, tubes will tend to grow to large diameters and small wall thicknesses. It is more difficult to weld tubes with small wall thickness than those with large wall thickness. Tubes with smaller diameter and larger wall thickness are therefore preferable. A further term is introduced to the fitness function, giving tubes with thin walls a slight penalty,

D_thick(g) = (1/N_tubes) Σ_{i=1}^{N_tubes} 1 / (α_i(t_i(g) − t_i^min) + 1)^{β_i},    (3.41)

where t_i(g) is the actual thickness and t_i^min the minimum allowed thickness of the i-th tube. The factor α_i is used for scaling, and β_i adjusts the severity of the penalty. Thanks to the normalized transformation functions developed by König [5], the weight factors w_i of each term in (3.27) can all be chosen as unity.
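The penalty term (3.41) can be sketched directly; the α_i and β_i values below are illustrative tuning parameters, not taken from the text:

```python
def thickness_penalty(t, t_min, alpha, beta):
    """Penalty (3.41): averages 1/(alpha_i*(t_i - t_min_i) + 1)**beta_i over
    all tubes. It equals 1 when every wall sits at its minimum thickness and
    decays towards 0 for thicker walls."""
    terms = [
        1.0 / (a * (ti - tm) + 1.0) ** b
        for ti, tm, a, b in zip(t, t_min, alpha, beta)
    ]
    return sum(terms) / len(terms)

D_at_min = thickness_penalty([1.0, 1.0], [1.0, 1.0], [2.0, 2.0], [2.0, 2.0])
D_thicker = thickness_penalty([3.0, 3.0], [1.0, 1.0], [2.0, 2.0], [2.0, 2.0])
```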

3.7.2 Racing Car Rim Design Evaluation

The optimization task is to improve the bending stiffness of an existing design. The bending stiffness is measured by applying loads and relating them to their conjugate displacements at the points indicated in Fig. 3.8. The rim design is subject to several types of constraints:

• mass and inertia constraints

• strength constraints

• geometric constraints

Figure 3.8: Rim compliance measure, points B1 and B2 (courtesy O. König [5])

The mass and the moment of inertia around the wheel's axis of the existing design are low, and the same values should be attained by any new design. This is achieved by using the target-value constraint mapping function (3.36). The complex geometry makes it necessary to use CAD as well as FEM to simulate and evaluate the design. The mass and moment of inertia can be evaluated by the CAD system, but the bending-stiffness objective and the strength constraints require costly FEM analysis. For evaluating the strength constraints, the load case combined from the loads indicated in Fig. 3.9 is considered. The loads are applied to the rim's shoulders – since this is the only area of contact with the tire – and include horizontal (car's weight plus aerodynamic force), vertical (centripetal), and rotational forces (braking). Essential boundary conditions of the analysis model simulate clamping at the contact area with the car's suspension.

Figure 3.9: Loading of the Rim (courtesy O. König [5])

The margin of safety for mechanical stresses is defined as

m_safety = σ_yield / σ − 1,    (3.42)

where σ is the acting stress and σ_yield is the yield stress of the material at a given temperature. Values less than zero indicate that plastic yield occurs. The margin of safety is calculated for every node, and a counter S(p) records the number of nodes with m_safety < 0.
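The nodal bookkeeping implied by (3.42) can be sketched as follows; the margin is written so that negative values indicate yielding, consistent with the text, and the stress data are made up:

```python
def count_yielding_nodes(stresses, sigma_yield):
    """Margin of safety m = sigma_yield/sigma - 1 per node, cf. (3.42);
    the counter S records nodes with m < 0, i.e. sigma > sigma_yield."""
    margins = [sigma_yield / s - 1.0 for s in stresses]
    S = sum(1 for m in margins if m < 0.0)
    return margins, S

# Illustrative nodal equivalent stresses (MPa) against a 250 MPa yield stress.
margins, S = count_yielding_nodes([100.0, 240.0, 260.0, 300.0], 250.0)
```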


The geometric constraints stem from

• manufacturing constraints

• assembly constraints

• regulatory constraints (FIA1)

and are kept by the parameterization of the design model, which will be explained in Section 4.2.2.

The manufacturing constraints include the given shape of the forging blank. A 1 mm distance to its outer contour must be kept to allow for properly machined surfaces. In addition, manufacturing techniques must not be changed: finishing is restricted to a CNC lathe and a CNC mill. Also, the rim's bed wall thickness must not be thinner than 2 mm.

The assembly constraints include contact areas to the car's suspension as well as to the nut holding the wheel and the tire. An additional constraint is a 3 mm distance to the brake assembly positioned on the inside of the rim.

The FIA regulations applying to the rim include that the maximum bed diameter is limited to 330 mm. The minimum depth of the lower bed is 13.57 mm and the maximum distance from the outside surface is 43.3 mm.

3.7.3 Maximum-Strength Flywheel Design

Kinetic Energy and Stressing of a Flywheel

A flywheel of constant mass density ρ and a radius-dependent thickness distribution t(r) stores at an angular speed ω the kinetic energy

U = ω²πρ ∫_r r³ t(r) dr.    (3.43)

The inertia effects induce a body force distribution

f = πρ t r ω²    (3.44)

so that the equilibrium of forces in the radial direction requires

(σ_r t),_r + (t/r)(σ_r − σ_θ) + t r ω² ρ = 0.    (3.45)

The equilibrium equation (3.45) shows that the radial and the circumferential stresses, σ_r and σ_θ, increase quadratically with increasing rotational speed ω. Thus, the rotational speed, and with it the kinetic energy, of the flywheel are limited by the maximum stress that the material can bear. As outlined in Section 1.3.3 and shown in Fig. 1.4, the distribution of stresses in a flywheel of constant thickness and with a central bore is uneven. In particular, the circumferential stress increases steeply towards the edge of the bore. Failure will initiate at the bore although other regions are not critically stressed.

¹Fédération Internationale de l'Automobile (http://www.fia.com)


Objective and Constraint Formulations

Assume the objective of maximizing the kinetic energy storage capability of a flywheel. Let it also be assumed that both the total mass and the rotational mass inertia of the initial design with constant thickness t₀ must be kept constant. Then, the kinetic energy capability can only be maximized by increasing the failure rotational speed of the flywheel. This, in turn, can only be done when the maximum stress can somehow be minimized:

minimize max σ(r) . (3.46)

Let the symbol σ stand for a failure criterion such as the maximum principal stress σ_I or the von Mises equivalent stress σ_eqv. The problem with evaluating the objective (3.46) is not only that the stress values at all points along r must be calculated, but also that the location at which the maximum stress appears may change when the design changes. This causes the first and second derivatives of the objective function to be discontinuous. Consequently, one would be forced to use genetic algorithms instead of a mathematical programming technique, which would increase the numerical solution effort considerably. A continuous objective function with continuous derivatives can be constructed by the following argument. The maximum stress at some point in the flywheel is minimized when its value equals the minimum stress at some other point. Then, the stress distribution must be constant. This idea was expanded by Stodola [24], who derived analytically the shape of a turbine disk of constant strength. The constant stress distribution implies that the local stress σ(r) is everywhere equal to the mean stress σ̄ and that therefore its variance (average quadratic deviation) is zero:

σ(r) = σ̄  →  s = ∫_r r (σ(r) − σ̄)² dr = 0.    (3.47)

In reality, a perfectly spatially constant stress state cannot be reached for physical reasons: the surfaces perpendicular to the radial direction at the bore and at the outside are stress-free, and the radial direct stress must drop off to zero there. The stress distribution therefore inevitably cannot be constant. However, a global objective function f can be based on (3.47), so that the objective is reached by minimizing

f = ∫_r r (σ(r) − σ̄)² dr.    (3.48)

Now let us consider again the two assumptions of the mass and the rotational moment of inertia remaining constant during the optimization process. They constitute two equality constraints, which can be written in terms of constraining functions h₁ and h₂,

h₁ = 2πρ ∫_r r (t(r) − t₀) dr = 0    (3.49)

and

h₂ = ω²πρ ∫_r r³ (t(r) − t₀) dr = 0.    (3.50)


Flywheel Evaluation Model

The objective function (3.48) and the constraining functions (3.49) and (3.50) must be evaluated from the analysis model. The analysis model is based on the finite-element method and delivers a numerical system of equations, the solution of which gives the nodal-point displacements. This primary solution is used on element level to calculate stresses from which some equivalent stress is formed. Since the stresses are numerically evaluated at the discrete stress points, the integrals in the objective and constraining functions must be replaced by sums. When the finite element mesh is equidistantly spaced in the radial direction, as indicated in Fig. 3.10, it is not necessary to attach individual weights to the stress values.

Figure 3.10: Region out of the Flywheel Analysis Model with Nodal and Optimum StressPoints of Quadratic Serendipity Type Finite Elements
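On an equidistant radial grid, the integrals in (3.48)-(3.50) reduce to plain sums with a common weight Δr; a minimal sketch with illustrative array contents:

```python
import math

def flywheel_measures(r, t, sigma, t0, rho=1.0, omega=1.0):
    """Discrete forms of the variance objective (3.48) and the mass and
    inertia equality constraints (3.49) and (3.50) on an equidistant grid."""
    dr = r[1] - r[0]
    sigma_mean = sum(sigma) / len(sigma)
    f = sum(ri * (si - sigma_mean) ** 2 for ri, si in zip(r, sigma)) * dr
    h1 = 2.0 * math.pi * rho * sum(ri * (ti - t0) for ri, ti in zip(r, t)) * dr
    h2 = omega ** 2 * math.pi * rho * sum(ri ** 3 * (ti - t0) for ri, ti in zip(r, t)) * dr
    return f, h1, h2

# A constant-thickness, constant-stress design satisfies f = h1 = h2 = 0.
r = [0.1 + 0.1 * i for i in range(10)]
f_val, h1, h2 = flywheel_measures(r, [2.0] * 10, [5.0] * 10, t0=2.0)
```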

3.7.4 Maximum Bond-Strength Design

Objective and Constraint Formulations

The objective is to maximize the load that can be transferred by the bonding layer from the onsert into the substrate. Failure within the bond layer is predicted by evaluating equivalent stress criteria for thin-layer adhesives. Material failure is a local affair, and therefore its prediction requires the evaluation of the stress states at all points within the adhesive domain. The location where the highest failure probability occurs may change when changing the onsert's shape. In order to have a smooth objective function that can be very efficiently minimized by using gradient methods, the original objective of minimizing the maximum value of a failure criterion at any point in the bonding layer is recast into a global objective functional [2]. The functional penalizes the variance of some equivalent stress σ_eqv along the radial direction,

f = (1/(r₂ − r₁)) ∫_{r₁}^{r₂} (σ_eqv − σ̄_eqv)^{2p} dr,    (3.51)

from the mean stress σ̄_eqv. For p = 1, the objective function becomes the variance (square of the standard deviation) of the stress distribution. Higher values of p further penalize stress peaks, so that the absolute values of minimum and maximum stress deviate less from the mean stress. A value of p = 4 has been used for the sample calculations. The stresses are evaluated at the optimum stress points of each finite element in the bonding layer.

The parameterization of the onsert problem will be described in Section 4.2.4. However, the shape optimization to maximize the bond strength regards the thickness distribution of the onsert. It is assumed that the interface between the onsert and the bonding layer remains straight, preserving the initial constant thickness distribution of the bonding layer. Thus, the thickness t(x) = p₂(x) − p₁ depends only on the vertical position of the points p₂(x) on the upper onsert surface. The thickness must be positive and greater than, or at least equal to, some predefined small minimum-thickness value ε_t, which gives the constraining function

g(x) = ε_t − t(x) = ε_t + p₁ − p₂(x) ≤ 0    (3.52)

Onsert Problem Evaluation Model

It is quite obvious that the analysis model of the onsert problem has much in common with that of the flywheel problem: both are based on the rotational-symmetry assumption, and they use the same finite-element formulations. The stresses threatening the cohesion of the bonding layer material are evaluated at the optimum stress points within each finite element of the bonding layer. They are then inserted into a failure criterion for thin layers of bonding materials. As explained in Section 4.2.4, considering the nature of the stress distributions indicated in Fig. 1.5(d), it is a good idea to have the mesh density increase at the inner and the outer onsert edge, as indicated in Fig. 1.5(c). In order to approximate the integration required by (3.51), it is therefore necessary to weight the stresses at the stress points with the width of the element wherein the stress points are located.
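The element-width weighting just described can be sketched as a discrete version of (3.51); the stress and width values below are illustrative:

```python
def onsert_objective(widths, sigma_eqv, p=4):
    """Width-weighted discrete sketch of the functional (3.51): equivalent
    stresses at element stress points are weighted by their element widths,
    and deviations from the weighted mean are raised to the power 2p."""
    total = sum(widths)
    mean = sum(w * s for w, s in zip(widths, sigma_eqv)) / total
    return sum(w * (s - mean) ** (2 * p) for w, s in zip(widths, sigma_eqv)) / total

# A graded mesh (finer near the edges), uniform vs. peaked stresses.
widths = [0.25, 0.5, 1.0, 0.5, 0.25]
f_uniform = onsert_objective(widths, [10.0] * 5)
f_peaked = onsert_objective(widths, [10.0, 10.0, 10.0, 10.0, 20.0])
```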

3.7.5 Composite Boat Hull

Evaluation of the objective and constraints is again based on a combination of CAD and FEM. The objective is to maximize stiffness, which is the same as minimizing the displacements conjugate to the loads indicated in Fig. 3.11. It is mapped onto a demand D by using the mapping function (3.28).

Figure 3.11: ANSYS model of the sail boat hull with composite material patches

The constraints include

• upper limit on mass

• upper limit on cost

• sufficient strength

The parameterization of the hull is explained in Chapter 4 and allows the laminate construction to change over the hull's shell area. It is foreseen that the hull is made from glass-fiber as well as carbon-fiber reinforced prepreg material. Carbon fibers are much more expensive than glass fibers, so the material cost responds to the total amount as well as to the ratio of the two types of material. However, under the given mass constraint the carbon fiber may be more effective, if used in certain regions, because of its low mass and high stiffness. The upper limits on mass and cost are expressed as demands D by using the mapping function (3.32). Their evaluation does not require FEM analysis.

The finite element model is needed for the stiffness and strength evaluations. It is assembled from shell elements. Their extensional and bending stiffness depends on the local laminate construction, which is given by the number of layers, their respective thicknesses, and the materials and orientations of principal material axes used for each layer, and is evaluated by the theory of laminated plates [25]. Once the primary unknowns of the finite-element model are known, the stresses and failure probabilities of the laminate can be evaluated.

3.7.6 Minimum-Weight Fuel-Cell Stack End Plate

Commercial versions of fuel-cell power sources for cars should be light and inexpensive. An idea for reaching the objective of low manufacturing cost is to use an extruded-aluminum-profile design [17]. A quarter model of it is rendered in Fig. 3.12.

Figure 3.12: Quarter end-plate model in the initial design (courtesy [5])

The other objective, minimum weight, is reached through an automated optimization process where the parameters of the cross-sectional design are adjusted. The necessity of an almost constant pressure distribution over the cross-sectional area of the bipolar plates is considered another objective. A further constraint follows from demanding that the maximum von Mises stress anywhere in the plate domain not exceed an allowable value. The objective and constraining function formulations are closely connected with the chosen design variables, so the design model is presented here.

Weight and Gas-Tightness Objectives

Since the plate is homogeneous, the total weight is given by the sum of the weights of all finite elements,

W = ρ Σ_{k=1}^{N_el} V_k,    (3.53)

where V_k is the volume of the k-th element.

The other objective of having an even pressure distribution on the bipolar-plate cross-section areas is reached independently of, or after completing, the optimization procedure. In the middle of the stack, the cross section will be plane because of symmetry. When the other cross sections also remain plane, the interface between the fuel-cell stack and the end plates should be plane as well. Then, the stack is under a state of plane strain, but the pressure will not necessarily be constant everywhere because of Poisson's-ratio effects. However, the animus behind the constant-pressure idea is that the bipolar plates be gas-tight, and one may expect that this goal is equally well reached by demanding the longitudinal strain to be constant. A structural analysis modelling the optimized design will reveal the out-of-plane displacement field in the interface between the end plate and the stack. The displacement field is interpreted as a shape, and the end plate's bottom surface is then given the negative of that shape. The displacements under load will then cancel out that shape so that the bottom becomes plane, consistent with a plane state of the stack. The surface-shape variations are expected to be small enough that their effect on the bending stiffness is negligible, decoupling the objective of gas tightness from the minimum-weight objective and the stress constraints.

Stress Constraining Equations

The stress constraint g_k = σ_eqv(k) − σ_max ≤ 0 is evaluated for each finite element, giving a number of constraints that equals the number of finite elements contained in the model.

Optimization Problem Statement

What remains is the single-objective constrained optimization problem

min_{x ∈ R^n} { W(x) | g(x) ≤ 0 }.    (3.54)
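Evaluating the weight (3.53) and the element-wise stress constraints for problem (3.54) can be sketched as follows; the element data are made up:

```python
def end_plate_evaluation(rho, volumes, sigma_eqv, sigma_max):
    """Weight objective (3.53) as a sum over element volumes, plus per-element
    stress constraints g_k = sigma_eqv_k - sigma_max <= 0 for problem (3.54)."""
    W = rho * sum(volumes)
    g = [s - sigma_max for s in sigma_eqv]
    feasible = all(gk <= 0.0 for gk in g)
    return W, g, feasible

# Aluminum-like density (t/mm^3), element volumes (mm^3), stresses (MPa).
W, g, feasible = end_plate_evaluation(2.7e-9, [1000.0, 2000.0], [80.0, 120.0], 150.0)
```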


Chapter 4

Parameterization and Variable Transformations

4.1 Classification of Design Variables and Structural Optimization Problems

Design parameterization defines the fixed design parameters and the design variables. The design variables may describe the configuration of a structure, element quantities such as cross sections, wall thicknesses, and shapes, and physical properties of the material. Eschenauer [18] classifies structural optimization problems in terms of their design variables. Considering a truss structure, and following [26] and [27], possible design variables can be divided into the different classes indicated in Fig. 4.1.

Figure 4.1: Classification of design optimization problems for truss-like structures in termsof different types of design variables, after Eschenauer [18]


a) Constructive layout
The determination of the best-suited layout requires optimizing each layout coming into consideration and comparing the calculated optimum solutions.

b) Topology
The topology or arrangement of the elements in a structure is often described by parameters that can be modified in discrete steps only. Different topologies can also be obtained by eliminating nodes and linking elements. Note also the topology optimization method introduced by Bendsøe and Kikuchi [12], explained in Section 9.1.

c) Material properties
The material properties of isotropic building materials such as steel or aluminum describe the stiffness in terms of Young's modulus or Poisson's ratio, the strength in terms of the yield stress or another strength limit, and the weight in terms of specific weight or mass density. The designer can often select the most suitable material from a selection of alloys, and sometimes he may decide whether steel, aluminum, or any other type of metal alloy would best meet the requirements of the design objective. All of these choices are discrete in nature, leading to a discrete design variable set in terms of the materials contained in a database. Only laminates made from anisotropic composite materials have some continuous design variables in terms of fiber orientation.

d) Geometry and shape
The geometry of trusses or frames is described by the coordinates of the nodes. The shape of solid bodies is determined by their bounding surfaces.

e) Supports and loading
Often a design may be enhanced by changing the geometric and the natural boundary conditions. Process optimization, for instance in the case of injection molding, usually deals with the optimum adjustment of boundary conditions such as injection flow speed or injection point location.

f) Sizing
Structures built from members such as bars, trusses, beams, plates, or shells, and also their FEM models, offer properties such as cross-sectional area, moment of inertia, or thickness to be used as design variables. It is important to distinguish between independent and dependent design variables. When a cross-section geometry is determined by some variable, the geometrical properties listed above are dependent on that variable. Sizing usually leads to a discrete optimization problem, as commercially available members with I-sections or channel sections come in different discrete sizes.

The foregoing classification is exemplified with truss-like structures but remains valid for shells or solid-body structures. A number of design parameterization models and transformations between design and analysis variables are discussed on the basis of the sample structural optimization problems used for the lecture class.


4.2 Design Parameterization Sample Problems

4.2.1 Motorcycle Frame and Sizing Parameters

The design freedom is limited by the consideration that some key measures, such as the spatial arrangement of the nodes, should not be changed. The remaining parameters concern the sizing of the tubes. Deciding that the tubes have a circular cross-section geometry, the inner and the outer radii are used to determine the cross-sectional area and second moments of inertia which are needed to simulate the structural stiffness behavior of the frame. Tubes are available in standard sizes or they can be custom made. Considering standard tubes,

Figure 4.2: Standard Tube Sizes after DIN 2394

one can choose from a catalogue, that is, from a set of existing discrete design variable values. Considering custom-made tubes, one can freely adjust the inner and outer diameters as continuous variables, but the number of customized tube sizes should be small, say three. All tubes of the frame must then be chosen from this small number of custom-made tubes, and when it is not predefined which of the tubes are to be of the same size, a choice must be made. That choice again renders the optimization problem discrete. The structural model must be set up so that all finite elements along one of the tubes

Tube types (legend of Fig. 4.3): type 1: ri = 12.5 mm, t = 1.5 mm; type 2: ri = 9.5 mm, t = 1.5 mm; type 3: ri = 6.75 mm, t = 1.5 mm.

Figure 4.3: FEM-Model and Parameterization of the Motorcycle Frame, after [5, 6]

have the same properties, determined by the respective design variable. Consequently, the independent design variables must be transformed into the dependent geometric element cross-section property values, and these must be assigned to all the elements constituting the respective tube. Also, the frame must be symmetric with respect to the


midplane spanned by the vertical and longitudinal directions, so that each tube on one side of the midplane must have the same dimensions as the respective tube on the other side.
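The transformation from independent sizing variables to dependent element properties can be sketched as follows. This is illustrative code, not from the lecture notes: the tube-type dimensions follow the legend of Fig. 4.3, while the assignment of the 15 tubes to the three types is an assumed example.

```python
# Hedged sketch: transform independent tube sizing variables (inner radius
# r_i, wall thickness t) into the dependent cross-section properties
# (area A, second moment of inertia I) shared by all FE elements of a tube.
import math

def tube_properties(r_i, t):
    """Area and second moment of inertia of a circular tube (mm, mm^2, mm^4)."""
    r_o = r_i + t
    area = math.pi * (r_o**2 - r_i**2)
    inertia = math.pi / 4.0 * (r_o**4 - r_i**4)
    return area, inertia

# tube types from the legend of Fig. 4.3 (dimensions in mm)
tube_types = {1: (12.5, 1.5), 2: (9.5, 1.5), 3: (6.75, 1.5)}

# assumed mapping of the 15 frame tubes to the three types (illustrative)
tube_assignment = [1, 1, 2, 2, 3, 1, 2, 1, 2, 1, 1, 2, 2, 2, 1]
element_props = [tube_properties(*tube_types[k]) for k in tube_assignment]
```

Changing one of the three independent type variables then updates the dependent properties of every element belonging to tubes of that type, which is exactly the transformation demanded above.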

4.2.2 Shape Parameters of the Formula 1 Race Car Rim

This subsection is taken from [5]. Four substructures (features) build up the basic geometry of the CAD model as shown in Fig. 4.4:

• A rotational body for the rim’s bed.

• A second rotational body for the spokes.

• Two pockets that remove the spokes’ interspace.

(Left: half of the rim bed. Right: spoke body with the two pocket features subtracted.)

Figure 4.4: Structure of the CAD model of the rim.

All these features are built on fully parametric two-dimensional sketches. In addition to these basic features, various chamfers are applied, which are chosen fully parametric as well. The basic design is taken from an existing racing car rim.

Parameterization

For performance reasons, as outlined before, a parameterization implicitly including as many of the mentioned constraints as possible has to be found, without excluding relevant feasible solutions. The bed is parameterized by 9 wall thicknesses, including the inner and the outer bead, as shown in Figure 4.5. With this approach the manufacturing constraint of a minimum wall thickness of 2 mm can be included directly. The bed's outer contour cannot be altered in some sections. The outer bead diameter and some shoulder diameters are restricted by FIA regulations. The inner bed's maximum diameter is limited to the shoulder's diameter to allow the assembly of the tire. The lower bed's minimum depth and the maximum distance from the rim's outer face are also subject to FIA regulations. Therefore the coordinates of the point in question also form parameters, allowing compliance with those restrictions. For the remaining degrees of freedom of the bed, contour lines of the forging blank and the brake assembly limit the geometric design space. To make sure


(Variable groups annotated in the figure: distances, wall thicknesses, radii, point coordinates.)

Figure 4.5: Optimization variables for spoke-body and bed contour

the minimum distances of 1 mm to the forging blank and 3 mm to the braking contour can be approached as closely as possible without crossing them, the contour lines form construction elements in the sketch and the bed's contour is directly dimensioned to those contours, setting the distances as parameters. The rotational body for the spokes is parameterized in a way similar to the rim's bed. Front and rear contour are dimensioned to the blank's contours. The parts of the contour forming the interfaces to the suspension and the nut are non-parametric. The interface between spokes and bed depends on the shape of both features. The sketch for the bed contains the spokes' rear and front contour lines as construction elements. The second sketch for the rotational body of the spokes is referenced to those construction elements and to the interface line with reference dimensions. This way, the parameterization for both features is done in only one sketch and update loops are avoided. At the same time, two separate features are needed, because the base part between the spokes is not defined by prismatic pockets but by the contour of the bed. The pockets are removed from the spoke body, going beyond the outer diameter. The remaining spokes are then added to the rim bed. The sharp-angled base of the spokes is then smoothed out by two parametric fillet features. The two pockets are parameterized in a third sketch (see Fig. 4.6). A base line with a constant radius is defined. Parameters for


Figure 4.6: Optimization variables for the pockets

the larger pocket include height and width at the base as well as at a continuous transitionpoint, and the radius of the upper rounding. The gap between the two pockets, forming


the spoke, is parameterized by two widths, one at the base and one at the same transitionpoint. This leaves only the upper rounding as a free parameter of the small pocket.

4.2.3 Maximum-Strength Flywheel and Mesh-Dependent Shape Parameters

The thickness values at the element interfaces in the radial direction take the role of shape parameters, as outlined in Fig. 4.7. Thus, the number of optimization variables depends linearly on the number of elements in the radial direction but is independent of the number of elements in the axial direction. The position of the nodes on the element sides is always on the straight line connecting the respective corner nodes. The flywheel problem uses

Figure 4.7: Relation between nodal point coordinates and shape parameter t[2]

very little pre-existing knowledge, allowing great design freedom: assuming an analysis model with 100 element columns in the radial direction, there are 101 element interface lines associated with independent thickness values to define a design with individual features. Thus, it became possible that the numerical shape optimization results suggested a simplified mechanical model for finding the optimum flywheel design by exact formulae [2].

4.2.4 Onsert Design and Mesh-Independent Shape Parameters

The onsert shape is defined by the curves of its surfaces along the radial coordinate x. A change of thickness implies a change of the y-coordinate values of the finite-element nodal points. The surface nodal point positions p of the initial design are partitioned into the axial nodal point positions p1 at the bottom and p2 at the top surfaces, see Fig. 4.8. The reference plane of the nodal-point positions is the surface of the sandwich. The values of p1 thus give the thickness of the bonding layer. In this study the bonding layer thickness is taken as constant and all the entries of p1 have the same value, p1i = p1. Changing

Figure 4.8: Definition of shape parameters p2.

the values of p2 changes the thickness distribution of the onsert. This allows studying the influence of the onsert structural stiffness properties on the bonding strength. In the initial


configuration the onsert has a rectangular cross section with a regular mesh consisting of rectangular elements evenly arranged in rows and columns. Mesh distortion is minimal when the positions of the nodes between the unmovable bottom node and the top node are adjusted so that a constant spacing in the axial direction is maintained, as can be seen from the mesh plot shown in Fig. 4.9.

Figure 4.9: Final Design and Mesh Distortion

The number of entries of p increases with increasing fineness of the finite-element mesh in the radial direction and may be quite high when an accurate stress analysis is required. The number of optimization variables becomes decoupled from the mesh size when the nodal point positions p are made to depend on a different set of design parameters x. By choosing that the onsert surface shape is composed of simple polynomial functions P, the optimization variables x control the original shape parameters p as weight coefficients of the polynomials P:

p = P_n x_n,   0 ≤ n ≤ N. (4.1)

Einstein’s summation convention is implied in (4.1) and the set of polynomials P is com-plete, ranging in degree from zero to a specified maximum value n.The shape depicted in Fig. 4.8 corresponds with a polynomial including the constant,linear, and quadratic terms.

Transformation of the Constraining Equation

The transformation between the design variables and the analysis variables is given by (4.1). The constraining function (3.52) is defined in terms of the nodal-point positions p2 and must be evaluated at all element interfaces. An analysis model with NEL element columns in the onsert region must therefore fulfill the constraint (3.52) at all NEL + 1 discrete interface positions,

g_i = ε − t_i = ε + p_1 − p_{2i} ≤ 0 (4.2)

Corresponding with the discrete analysis model, the continuous constraining function has thus been replaced by NEL + 1 discrete constraining equations that can be written as one vector equation:

g = ε − t = ε + p_1 − p_2 ≤ 0 (4.3)

The constraining equations are written in terms of the analysis variables p2. Next we wish to express them in terms of the design variables x, which is achieved by use of the transformation (4.1):

g = ε − t = ε + p_1 − P x ≤ 0 (4.4)

The transformation does not reduce the number of constraining equations. Generally, the few design variables remain subject to a much larger number of constraining equations, say one hundred.
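The transformation (4.1) and the constraint mapping (4.4) can be sketched numerically as follows. This is an illustrative example, not the authors' code: the mesh size `NEL`, the degree `N`, and the values for `eps`, `p1`, and `x` are all assumed.

```python
# Hedged sketch of the mesh-independent parameterization (4.1) and the
# transformed constraints (4.4): surface positions p2 at the NEL + 1
# interface lines are a polynomial combination P x of a few variables x.
import numpy as np

NEL = 10                                   # element columns (assumed)
xi = np.linspace(-1.0, 1.0, NEL + 1)       # normalized radial positions
N = 3                                      # highest polynomial degree
P = np.vander(xi, N + 1, increasing=True)  # columns: xi^0 ... xi^3

eps = 0.5     # minimum thickness (assumed)
p1 = 0.2      # constant bonding-layer thickness (assumed)

x = np.array([2.0, 0.3, -0.5, 0.1])  # design variables (weight coefficients)
p2 = P @ x                           # analysis variables, eq. (4.1)
g = eps + p1 - p2                    # constraints (4.4): g <= 0 required
print(np.all(g <= 0))
```

Four design variables thus control all NEL + 1 discrete constraints; refining the mesh enlarges `P` but leaves `x` unchanged.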


Mathematical method for observing the design-variable constraints

Only for polynomials of degree not higher than cubic can the constraints be expressed mathematically in terms of the design variables. For this purpose it is convenient to normalize the radius range to −1 ≤ ξ ≤ 1 and then choose for the shape parameterization the form:

t(ξ) = t_0 + t_1 ξ + t_2 ξ² + t_3 ξ³ (4.5)

A first step is to require that the minimum thickness ε is kept at the points ξ = −1, ξ = 0, and ξ = 1:

g_1 = g(−1) = ε − t_0 + t_1 − t_2 + t_3 ≤ 0
g_2 = g( 0) = ε − t_0 ≤ 0
g_3 = g( 1) = ε − t_0 − t_1 − t_2 − t_3 ≤ 0 (4.6)

If these conditions are satisfied, violations in the interior can only exist together with real polynomial extrema. These are found with the condition:

t(ξ),_ξ = 0  →  t_1 + 2 t_2 ξ + 3 t_3 ξ² = 0  →  ξ_{1,2} = −(1/(3 t_3)) [ t_2 ∓ √(t_2² − 3 t_1 t_3) ] (4.7)

Real extrema do not exist if the radicand of the root is negative, that is, if the discriminant D is positive:

D = 3 t_1 t_3 − t_2² ≥ 0 . (4.8)

This could be used complementary with (4.6) to form a set of conditions for keeping the minimum thickness everywhere within the considered domain. However, the search space would then be unnecessarily reduced. The maximum possible constrained search space is maintained by first checking with the discriminant whether real extrema exist. Then, the position of the maximum

ξ_max = −(1/(3 t_3)) [ t_2 − √(t_2² − 3 t_1 t_3) ] (4.9)

is substituted in (4.5). The result is then used as a condition complementing (4.6):

g_4 = ε − t(ξ_max) ≤ 0 . (4.10)
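The constraint logic (4.6)-(4.10) can be sketched as a small function. This is an illustrative implementation, not the authors' code; as a slightly more conservative variant of (4.9), it evaluates the thickness at both stationary points of (4.7) that fall inside the domain, rather than only at the maximum.

```python
# Hedged sketch of the cubic-thickness constraint logic (4.6)-(4.10):
# check the three sample points, then add interior-extremum constraints
# only when the radicand of (4.7) admits real extrema.
import math

def thickness_constraints(t0, t1, t2, t3, eps):
    t = lambda xi: t0 + t1*xi + t2*xi**2 + t3*xi**3
    g = [eps - t(-1.0), eps - t(0.0), eps - t(1.0)]   # eq. (4.6)
    disc = t2**2 - 3.0*t1*t3                          # radicand in (4.7)
    if disc >= 0.0 and t3 != 0.0:
        # stationary points (4.7); checking both roots is slightly more
        # conservative than (4.9), which keeps only the maximum
        for sign in (-1.0, 1.0):
            xi_ext = -(t2 + sign*math.sqrt(disc)) / (3.0*t3)
            if -1.0 < xi_ext < 1.0:
                g.append(eps - t(xi_ext))             # cf. eq. (4.10)
    return g  # feasible design: all entries <= 0

# example cubic (assumed coefficients)
print(thickness_constraints(2.0, 0.3, -0.5, 0.1, 0.5))
```

For the example coefficients, one stationary point lies inside (−1, 1), so four constraint values are returned and all are non-positive.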

4.2.5 Composite Boat Hull and the Patch Idea

The idea of patches, illustrated in Fig. 4.10, relates the parameterization of a laminated structure to its manufacturing process. A structure made from prepregs consists of an

Figure 4.10: Patch Pattern and Laminate (Courtesy N. Zehnder [14])

assembly of prepreg sheets each of which is characterized by its instance in the lay-up


sequence, position in space, shape, and size. These parameters are geometrical patch parameters. Each prepreg sheet constitutes a patch and is in addition assigned a choice of material and an orientation of the principal material axes, which are internal material patch parameters. For each point on the shell-type structure, the local laminate construction is determined by the geometrical and internal material patch parameters. Of the complete set of parameters, the lay-up sequence and the choice of material are discrete, rendering the parameterization suitable for Evolutionary Algorithms rather than for Mathematical Programming optimization engines.
The composite boat hull problem is multi-objective, where stiffness and price are in conflict with each other. This assigns a crucial role to the choice and placement of materials, which can be chosen from a range of inexpensive low-stiffness glass-fiber and expensive high-stiffness carbon-fiber reinforced plastics. A patch pattern on the boat hull is indicated in Fig. 1.6.

4.2.6 Fuel-Cell-Stack End Plate

The CAD model is assembled from three types of CAD features: the lower plate, the upper plate, and four ribs. Fig. 4.11 details how these entities are defined. The lower


Figure 4.11: CAD features: lower plate, upper plate and a rib

plate is bounded by a planar functional face at the bottom and by assembled face segments at the top. These segments are defined through equidistant sampling points, defining six optimization variables tl1...tl6 of the lower plate. Additionally, the edges of the top faces are chamfered. The upper plate is defined through two sets of sampling points, i.e. six equidistant height parameters hu1...hu6 defining the bottom face and six thickness variables tu1...tu6 defining the top face of the upper plate. Again, the edges of these faces are chamfered. Finally, each rib is defined through three optimization variables: a lower position xli, an upper position xui, and a thickness tri. For all these optimization parameters a range and a step size σ for Gaussian mutation or initialization are assigned, as listed in Table 4.1.

Table 4.1: Ranges and step sizes for the optimization variables.

Parameters    Range [mm]     Step [mm]
tl1...tl6     [1, 7]         1
hu1...hu6     [1, 33]        4
tu1...tu6     [1, 7]         1
xl1...xl4     [0.1, 73.3]    8
xu1...xu4     [0.1, 73.3]    8
tr1...tr4     [1, 7]         1

The genotype for the Evolutionary Algorithm is defined as follows. The ribs are chosen as CAD features to be optimized; one rib is defined by three parameters that are represented in one gene. For the lower and upper plate, sampling positions are chosen as genes. That means for the lower plate each thickness represents a gene, and for the upper plate the height and the thickness at a sampling position form a gene. This leads to the following genotype, where each gene is marked by braces: {tl1} {hu1, tu1} {tl2} {hu2, tu2} {tl3} {hu3, tu3} {tl4} {hu4, tu4} {tl5} {hu5, tu5} {tl6} {hu6, tu6} {xl1, xu1, tr1} {xl2, xu2, tr2} {xl3, xu3, tr3} {xl4, xu4, tr4}. This representation implicitly fulfills the manufacturing requirement, i.e. extrusion molding. For the end plate, the holes for medium flow and the tension bolts are made in a subsequent machining process, and the global vertical edges are rounded. Since material can only be removed from the extruded base block, planar horizontal faces have to be machined into the upper plate for a well defined load introduction from the bolts to the end plate. The position of these contact faces must be adapted to the varying slope and curvature of the top face of the upper plate, and for some solutions this face may even be split into two subregions.
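The genotype construction and a range-respecting Gaussian mutation can be sketched as follows. This is an illustrative example, not the authors' implementation: the initialization scheme and the way the Gaussian step is snapped to the step grid are assumptions; only the parameter families, ranges, and step sizes follow Table 4.1.

```python
# Hedged sketch: end-plate genotype (6 lower-plate genes, 6 upper-plate
# genes, 4 rib genes) with initialization and Gaussian mutation that
# respect the ranges and step sizes of Table 4.1.
import random

# (low, high, step) per parameter family, after Table 4.1, in mm
RANGES = {"tl": (1, 7, 1), "hu": (1, 33, 4), "tu": (1, 7, 1),
          "xl": (0.1, 73.3, 8), "xu": (0.1, 73.3, 8), "tr": (1, 7, 1)}

def init_value(low, high, step):
    # random value on the step grid within [low, high]
    return min(high, low + step * round(random.uniform(0, (high - low) / step)))

def mutate(value, low, high, step, sigma=1.0):
    # Gaussian step, snapped to the step grid and clipped to the range
    proposal = value + random.gauss(0.0, sigma) * step
    snapped = low + step * round((proposal - low) / step)
    return min(high, max(low, snapped))

random.seed(0)
genotype = ([[init_value(*RANGES["tl"])] for _ in range(6)]
            + [[init_value(*RANGES["hu"]), init_value(*RANGES["tu"])]
               for _ in range(6)]
            + [[init_value(*RANGES["xl"]), init_value(*RANGES["xu"]),
                init_value(*RANGES["tr"])] for _ in range(4)])
print(len(genotype))  # 16 genes
```

Mutation operates gene-wise, so a rib's three parameters stay together, matching the gene grouping described above.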

4.3 The Parameterization Spectrum

The various parameterization models presented in the previous sections can be characterized by their respective levels of design-parameter and constraint densities. This idea of a parameterization spectrum is due to M. Wintermantel and O. König [4]. Fig. 4.12 gives a two-dimensional representation of it. The vertical axis stands for the number of design

Figure 4.12: Sample problems and parameterization spectrum

variables. The horizontal axis indicates the measure of constraint density, invested pre-existing know-how, or parameterization sophistication. The indicated example problems are placed in the scheme according to the summarizing descriptions below.


Motorcycle Frame The problem was solved by assigning tube dimensions to each of the 15 independent tubes. An additional, and interesting, problem was that only three different dimensions were to be used, but these had to be adjusted optimally. On the other hand, the design freedom is reduced by prescribing fixed values for the node positions where the tubes are connected with each other. This can be regarded as pre-existing knowledge.

Race Car Rim The problem was solved by adjusting about 30 parameter values of CADentities. The problem is highly constrained as explained in Section 3.7.2.

Flywheel The problem was solved by using a mesh-dependent shape parameterization where, depending on the fineness of the mesh, the number of variables can be a hundred or more. This allows the forming of optimum design solutions with unexpected shapes. Also, only two equality constraints, namely preservation of the mass and of the torsional inertia of the initial design, were prescribed.

Onsert The problem was solved by using a mesh-independent shape parameterization where the number of variables depends on the desired shape complexity in terms of the maximum considered polynomial power. The calculated examples include powers from zero to three, which implies that a small number of four weighting coefficients act as optimization variables. This reduces the numerical effort of finding the optimum design, which is, on the other hand, limited to a shape whose complexity is bounded by a polynomial of the third degree. In order to guarantee meaningful solutions, the onsert thickness must be explicitly constrained to a minimum positive value.

FRP-Boat Hull The problem is solved by a patch parameterization including geometry and internal material parameters whose number is a multiple of the number of patches, here prepreg cuttings, used for making the structure. On the other hand, there is much freedom in adjusting these parameters, and the only restrictions used in the example problem are the given shape of the boat hull and that the mass of the initial design must not be exceeded.

Fuel-Cell-Stack End Plate Since four ribs were foreseen for the extruded-profile design, there are 27 free CAD parameters used in the optimization. Thickness values must be larger than the minimum value that can be realized by the extrusion molding technique. The upper plate must be in a position above the lower plate. The four ribs must be positioned within the geometric design space of the structure. Apart from these obvious constraints that must be imposed to guarantee meaningful design solutions, there is much freedom to arrive at an optimum design solution whose shape was not intuitively expected.

4.3.1 Influence of Mechanical Situation on Parameterization

The two problems Maximum-Strength Flywheel Design and Maximum-Strength Onsert Design use different levels of parameterization. The flywheel problem uses design variables that transform very directly to the mesh-dependent analysis variables, resulting in a high number of optimization variables, say 200. The onsert problem, on the other hand, uses design variables controlling the global onsert shape via a linear combination of low-degree polynomials, resulting in a much lower number of optimization variables, say 4.
Apart from pure arbitrariness or the curiosity to try out things, what could motivate one


to use such different parameterizations? For both problems seem to be very similar regarding the structural models, the objectives, the formulation of the objective functions, and the side constraints.
Both mechanical problems are rotationally symmetric and use in fact the same differential equations and the same quadratic eight-node Serendipity finite elements to create the structural stiffness matrix.
The objectives are to maximize strength, and the basic idea behind reaching these objectives is to make the distribution of stresses, within the regions of interest, as even as possible. So both objective functions globally penalize the sum of the quadratic (if n = 0) deviations of the local stresses from the mean stress.
The two problems also have in common that thickness values must be positive, yielding basically identical side constraints.
Only the loading sets the two problems apart. The flywheel is loaded by inertia body forces in the radial direction due to the rotation, while the onsert is loaded by some external force in the axial direction.
Anticipating the study of the behavior of the two optimization programs, the flywheel optimization procedure yields, without any problems, shape results such as shown in Fig. 1.1(b). On the other hand, an early version of the onsert program, using the same parameterization as the flywheel program, failed to produce a smooth shape such as shown in Fig. 4.9 [28, 29]. Rather, jagged shapes such as shown in Fig. 4.13 were obtained [30]. The

Figure 4.13: Jagged onsert shape obtained with mesh-dependent analysis variables [30]

parameterization based on the global polynomial shape representation makes sure that the obtained shapes are nice in appearance and easy to manufacture, although the jagged shape results are quite correct regarding the successful minimization of the objective function. Another positive effect of the global shape representation is that the reduced number of independent optimization variables tends to speed up convergence and reduce the number of necessary design evaluations.
The question remains as to why the flywheel problem is better-natured than the onsert problem. The answer lies in its loading situation and how it affects the stress distribution, forming an objective function topology favoring smooth shape results. The loading of the flywheel is in the radial direction and, assuming a mesh with only one element row, the finite elements are loaded in series. Consequently, when one of these elements becomes thinner, the radial stress must increase, and as it becomes thicker, the stress decreases. But


these stress deviations are immediately penalized by the objective function. In particular, the objective function tends to prevent exceedingly small thickness values, having the same effect as if the side constraint were implemented via a penalty-method transformation. On the other hand, the loading of the onsert with respect to the stressing of the bonding layer is partially in parallel and does not produce the same stabilizing effect occurring in the flywheel problem.
The different design variable models place the two problems at different positions on the parameterization spectrum. Also, one could say that the program for finding maximum-strength onsert shapes is useful as a preliminary design tool, while the program for finding maximum-strength flywheel shapes gave some deeper understanding of the mechanical problem and inspired a new simplified mechanical model [2].


Chapter 5

Some Basic Concepts of Global Nonlinear Optimization

The nonlinear constrained optimization problem can be written as follows [20]:

Minimize:    f(x)                                     objective function

Subject to:  g_j(x) ≤ 0,   j = 1, ..., m              inequality constraints

             h_k(x) = 0,   k = 1, ..., l              equality constraints

             x_i^l ≤ x_i ≤ x_i^u,   i = 1, ..., n     side constraints

where x = (x_1, x_2, x_3, ..., x_n)^T.

The vector x is called the vector of design variables. The objective function as well as the constraining functions may be linear or nonlinear functions of x. They may be explicit or implicit in x and may be evaluated by any analytical or numerical techniques we have at our disposal. However, if mathematical programming is used, it is important that these functions be continuous and have continuous first derivatives in x. If these conditions are not satisfied, for instance when discrete-valued variables appear, one must either invent homogenization techniques such as [12] or resort to other methods such as genetic algorithms.

5.1 Nonlinear Optimization Task

Nonlinear optimization tasks require iterative solution processes, some of which are described in Chapter 6. It is the nonlinearity that makes an iterative solution process, such as illustrated in Fig. 6.3, necessary.


5.2 Feasible Region

The inequality constraints divide the search space into feasible and infeasible regions. The infeasible regions contain all the designs violating one or several constraints. The boundaries g_i(x) = 0 between feasible and infeasible regions define hypersurfaces in the multi-dimensional search space. The equality constraints h_i(x) = 0 also define hypersurfaces, but they do not separate feasible from infeasible regions: they are the feasible regions. Any feasible design must lie on those equality-constraint hypersurfaces.

5.3 Convex and Non-Convex Functions

Fig. 5.1 illustrates convex functions, concave functions, and functions that are neither convex nor concave. We share Vanderplaats' [20] mental picture that a convex function looks like a bowl that will be able to hold water. Thus, a concave function is the negative of a convex function. The function in Fig. 5.1(c) is neither convex nor concave. A function

Figure 5.1: Convex (a), concave (b), and neither convex nor concave function (c)

f(x) is convex if, for any two points x1 and x2 contained in the set, it holds that

f[θ x1 + (1 − θ) x2] ≤ θ f(x1) + (1 − θ) f(x2),   0 ≤ θ ≤ 1 (5.1)
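Condition (5.1) can be checked numerically along a segment. The following sketch is illustrative, not from the lecture notes; the two test functions and the segment endpoints are assumed examples.

```python
# Hedged numeric illustration of the convexity condition (5.1): sample
# theta on [0, 1] and verify that the function lies below the chord for
# a convex example, and that a non-convex example violates this.
import numpy as np

def is_convex_on_segment(f, x1, x2, n=101):
    theta = np.linspace(0.0, 1.0, n)
    pts = np.array([f(t*x1 + (1 - t)*x2) for t in theta])
    chord = theta*f(x1) + (1 - theta)*f(x2)
    return bool(np.all(pts <= chord + 1e-12))

convex = lambda x: float(np.sum(x**2))            # bowl-shaped
nonconvex = lambda x: float(np.sum(np.sin(3*x)))  # oscillatory
x1, x2 = np.array([-1.0, 0.0]), np.array([1.5, 1.0])
print(is_convex_on_segment(convex, x1, x2))       # True
print(is_convex_on_segment(nonconvex, x1, x2))
```

Passing the sampled test on one segment is of course only necessary, not sufficient, for convexity on the whole set; it merely illustrates the inequality.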

For further illustration we consider the sample optimization problem:

Minimize:   f(x) = (x1 − 0.5)² + (x2 − 0.5)² (5.2)

Subject to: g(x) = 1/x1 + 1/x2 − 1 ≤ 0 (5.3)

The objective and constraining functions are plotted in Fig. 5.2. The objective function

Figure 5.2: Convex design space. Source: Vanderplaats [20]

is convex and its feasible region is bounded by a convex constraining function. Hence, the design space is a convex set.
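The sample problem (5.2)-(5.3) can also be solved numerically. This sketch is not part of the original text; the starting point, bounds, and use of SciPy's SLSQP are assumptions. By symmetry the constrained optimum lies at x1 = x2 = 2, where the constraint 1/x1 + 1/x2 = 1 is active.

```python
# Hedged sketch: solving the convex sample problem (5.2)-(5.3) with SLSQP.
import numpy as np
from scipy.optimize import minimize

f = lambda x: (x[0] - 0.5)**2 + (x[1] - 0.5)**2
g = lambda x: 1.0/x[0] + 1.0/x[1] - 1.0   # feasible: g(x) <= 0

res = minimize(f, x0=np.array([3.0, 3.0]),
               constraints=[{"type": "ineq", "fun": lambda x: -g(x)}],
               bounds=[(0.1, 10.0)] * 2, method="SLSQP")
print(res.x)  # approximately [2, 2]
```

Because the design space is a convex set and the objective is convex, any local optimum found by the algorithm is also the global optimum.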


5.4 Method of feasible directions

As the unconstrained Lagrangian, used for taking constraints into account, is so closely connected with the method of feasible directions, a first glimpse of the latter was given in Section 3.5.3. Here, more information is provided, as the equality and the inequality constraints may be treated differently.

5.4.1 Inequality Constraints

The method of feasible directions seeks to minimize the given original, or untransformed, objective function and to preserve a feasible design at each iteration step. As with the unconstrained optimization task, we search for an improved point of the optimization variables x by determining a useful search direction s and adjusting a factor α so that the new point,

x^(k+1) = x^(k) + α s, (5.4)

further minimizes the objective f. The requirement that the new design must not violate the constraints restricts the search direction to the space of feasible directions. The direction-finding algorithm is now given the double task of producing search directions that are not only usable (having the descent property) but also feasible (not violating constraints). Consider a design with an active constraint g(x^(0)), as illustrated in Fig. 5.3. The gradients of the objective as well as of the constraining functions are indicated

Figure 5.3: Usable and feasible sectors of a search space

in the figure. The line passing through the reference point x^(0) and perpendicular to the objective-function gradient ∇f divides the two-dimensional search space into the usable and unusable sectors. The line passing through the reference point x^(0) and perpendicular to the constraining-function gradient ∇g divides the feasible and the infeasible sectors. For a usable search direction it must hold that

∇f(x^(0))^T s ≤ 0 (5.5)

and for a feasible search direction

∇g(x^(0))^T s ≤ 0 (5.6)


must hold. The feasible search direction with steepest descent is tangential to the contour line of the constraining function, g = const. However, if the constraining function is nonlinear, the feasible search direction of steepest descent may stray into the infeasible region, which is unwanted. Therefore, one can introduce into (5.6) a positive parameter θ (push-off factor), pushing the search direction away from the tangent and thus away from the infeasible region:

∇g(x(0))s + θ ≤ 0 (5.7)

The concept of including push-off factors and the resulting algorithms are well described in Vanderplaats' textbook [20].
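Conditions (5.5)–(5.7) are easy to check numerically. The following sketch classifies a proposed direction s as usable and/or feasible; the function name and the example gradients are assumptions for illustration, not taken from the text:

```python
import numpy as np

def direction_status(grad_f, grad_g, s, theta=0.0):
    """Classify a search direction s at a point with one active constraint.

    grad_f : gradient of the objective at the point
    grad_g : gradient of the active constraint at the point
    theta  : push-off factor; theta > 0 pushes s away from the
             constraint tangent, cf. (5.7)
    Returns the booleans (usable, feasible).
    """
    usable = np.dot(grad_f, s) <= 0.0             # descent property (5.5)
    feasible = np.dot(grad_g, s) + theta <= 0.0   # feasible sector (5.6)/(5.7)
    return usable, feasible

# Hypothetical 2-D example: objective increases in +x1,
# constraint gradient points in +x2.
u, f = direction_status(np.array([1.0, 0.0]), np.array([0.0, 1.0]),
                        s=np.array([-1.0, -0.5]))
```

With θ = 0 the chosen direction is both usable and feasible; increasing θ shrinks the admissible cone of directions.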

5.4.2 Equality Constraints

Equality constraints are always active. If they are nonlinear, the line search necessarily causes constraint violations, demanding their subsequent removal. At the beginning of the line search, the search direction is perpendicular to the constraining function gradients. Since the search direction must also be usable, it is composed of the direction of steepest descent and the necessary corrections determined by the gradients of the constraining functions.

Fig. 3.3 illustrates the finding of a feasible search direction with an objective function that depends on two design variables subject to one equality constraint. The following equations, however, refer to a situation where the number of design variables is not specified and two equality constraints are assumed. The situation with two constraints is readily generalized to one with m constraints. One proposes a search direction composed of the direction of steepest descent and the gradients of the constraining functions:

s = −∇f + λ1∇h1 + λ2∇h2. (5.8)

The weight factors λi must now be adjusted so that the search direction s is orthogonal to the gradients of the constraining functions:

s^T∇h1 = (−∇f + λ1∇h1 + λ2∇h2)^T ∇h1 = 0

s^T∇h2 = (−∇f + λ1∇h1 + λ2∇h2)^T ∇h2 = 0. (5.9)

After executing the multiplications and re-ordering the terms one obtains the system of equations for the weight factors λi:

[ ∇h1^T∇h1   ∇h2^T∇h1 ] { λ1 }   { ∇f^T∇h1 }
[ ∇h1^T∇h2   ∇h2^T∇h2 ] { λ2 } = { ∇f^T∇h2 } . (5.10)

The system of equations can be solved when the determinant D of the coefficient matrix is not equal to zero. The determinant

D = (∇h1^T∇h1)(∇h2^T∇h2) − (∇h1^T∇h2)(∇h2^T∇h1) ≥ 0 (5.11)

vanishes only if the gradients of the two equality constraints are parallel, that is, if the two constraints are not independent.


5.4.3 Generalization of Lagrange Factor Calculation

The instantaneous feasible search direction s considering the equality and all active inequality constraints is:

s = −∇f(x) − Σ_{j=1}^{m} λj ∇gj(x) − Σ_{k=1}^{l} λ_{m+k} ∇hk(x) (5.12)

It is written in the more concise form:

s = −∇f −Bλ (5.13)

where all constraining function gradients and the Lagrange factors are collected in arrays abbreviated with:

B = [∇g ∇h],  λ = { λg ; λh } (5.14)

The feasible search direction s must be orthogonal to all active constraining function gradients B so that the inner product vanishes:

BT s = −BT∇f −BTBλ = 0. (5.15)

Resolving for the Lagrange factors gives:

λ = −[B^T B]^{−1} B^T ∇f (5.16)

5.5 Lagrange Multiplier Method

After having developed the background in terms of the method of feasible directions, one finds the Lagrange function simply to be the function whose negative gradient coincides with the feasible search direction:

L(x, λ) = f(x) + Σ_{j=1}^{m} λj gj(x) + Σ_{k=1}^{l} λ_{m+k} hk(x) (5.17)

Solving a constrained optimization problem by the method of Lagrange multipliers amounts to finding a stationary point of the Lagrangian L in terms of the optimization variables x and the Lagrange multipliers λ. The gradient of L with respect to the optimization variables, ∇xL, yields the negative of the feasible search direction s (5.12):

∇xL(x, λ) = −s (5.18)

At the stationary point ∇xL = −s = 0 the objective f possesses its smallest value within the feasible region, so that the stationary point is a minimum with respect to the optimization variables x:

L(x∗, λ∗) ≤ L(x, λ∗) ; (5.19)

The gradient of L with respect to the Lagrange multipliers, ∇λL, yields the constraining functions g and h:

∇λL(x, λ) = { g ; h } . (5.20)

At the stationary point ∇λL = 0 all constraining equations are satisfied. The inequality constraining functions g have negative values within the feasible region and the respective Lagrange factors are all positive; for the equality constraints the opposite might hold. Therefore the stationary point is a maximum with respect to the Lagrange factors λ:

L(x∗, λ∗) ≥ L(x∗, λ) . (5.21)


5.5.1 Lagrange Multiplier Method Sample Problem

For an illustrative example, the sample problem for penalty methods in Section 3.5.1 on page 18 is reconsidered. Objective and constraining functions are provided with (3.15) and (3.17), respectively. The one constraint that will be activated when searching for the best solution is g1. The Lagrangian formulation of the optimization problem is:

L = (1/20)(x + 2)² + λ(1 − x) (5.22)

The gradients of the objective and constraining functions are:

∇f = (1/10)(x + 2),  ∇g1 = −1 (5.23)

At the known constrained minimum x = 1 the Lagrange multiplier λ is calculated with (3.25) on page 21:

λ = − (∇f^T∇g1)/(∇g1^T∇g1) = 3/10 (5.24)

With λ thus specified, Fig. 5.4(a) shows that the Lagrangian, as a function of the optimization variable x, is smooth with a minimum at the smallest possible value of the objective f. To demonstrate that L as a function of λ has a maximum at the constrained minimum point, the gradient ∇xL is set equal to zero and then resolved for x:

∇xL = (x + 2)/10 − λ = 0 → x = 10λ − 2 (5.25)

The result allows eliminating x from (5.22) to express the Lagrangian as a function of the Lagrange multiplier λ:

L = λ (3− 5λ) (5.26)

Figure 5.4: Sample problem Lagrangian versus x and versus λ

The plot in Fig. 5.4(b) shows that the Lagrangian, as a function of the Lagrange multiplier λ, is smooth with a maximum at the smallest possible value of the objective f.
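The closed-form result (5.26) can be checked numerically: sampling L(λ) = λ(3 − 5λ) on a grid locates the maximum at λ* = 3/10 with L = 9/20, which equals the constrained minimum value of the objective, f(1) = 9/20. A minimal sketch:

```python
import numpy as np

# Dual function of the sample problem, eq. (5.26): L(λ) = λ(3 - 5λ).
# Eliminating x via ∇ₓL = 0 gave x = 10λ - 2 (5.25); the dual maximum
# should sit at λ* = 3/10 with L = f(x* = 1) = 9/20.
lam = np.linspace(0.0, 0.6, 601)
L = lam * (3.0 - 5.0 * lam)
i = np.argmax(L)
x_star = 10.0 * lam[i] - 2.0
print(lam[i], L[i], x_star)   # ≈ 0.3, 0.45, 1.0
```

The grid maximum reproduces both the multiplier from (5.24) and the constrained minimizer x* = 1.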


5.6 Necessary and Sufficient Optimality Criteria

5.6.1 Unconstrained Objective Functions

The minimum point has a smaller function value than all other points. Probing the points in the vicinity of a minimum point must always yield higher function values. When the objective function is smooth with existing first and second derivatives, the necessary optimality conditions can be expressed in terms of these derivatives. The first derivative with respect to the optimization variables is a vector, called the gradient ∇f(x), that must vanish at the optimum:

∇f(x) = 0,  ∇f(x) = [ ∂f(x)/∂x1, ∂f(x)/∂x2, . . . , ∂f(x)/∂xn ]^T. (5.27)

The second derivative is a matrix called the Hessian H; at the minimum point it must be positive definite:

H = [ ∂²f(x)/∂x1²      ∂²f(x)/∂x1∂x2   . . .  ∂²f(x)/∂x1∂xn
      ∂²f(x)/∂x1∂x2    ∂²f(x)/∂x2²     . . .  ∂²f(x)/∂x2∂xn
      . . .             . . .            . . .  . . .
      ∂²f(x)/∂x1∂xn    ∂²f(x)/∂x2∂xn   . . .  ∂²f(x)/∂xn²   ]  positive definite. (5.28)

Positive definiteness means that this matrix has only positive eigenvalues. If the gradient is zero and the Hessian matrix is positive definite for a given x, this ensures that the design is at least a relative minimum, but does not ensure that the design is a global minimum. Only when the objective is known to be convex do (5.27) and (5.28) suffice to identify the global minimum.
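A small numerical sketch of the test (5.27)–(5.28), with positive definiteness checked through the eigenvalues; the function name and the example data are assumptions for illustration:

```python
import numpy as np

def is_local_minimum(grad, hess, tol=1e-10):
    """Check the necessary conditions (5.27)-(5.28) at a candidate point:
    vanishing gradient and positive definite Hessian (all eigenvalues > 0)."""
    stationary = np.linalg.norm(grad) < tol
    positive_definite = np.all(np.linalg.eigvalsh(hess) > 0.0)
    return bool(stationary and positive_definite)

# Example: f(x) = x1² + 2x2² has ∇f = (2x1, 4x2) and constant
# Hessian diag(2, 4); at the origin both conditions hold.
ok = is_local_minimum(np.array([0.0, 0.0]), np.diag([2.0, 4.0]))
```

As noted above, a passing check certifies only a relative minimum unless convexity is known.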

5.6.2 Kuhn-Tucker Optimality Conditions

Constraints restrict the search space to the feasible region, and the optimal solution is the one indicated in Fig. 3.3 and Fig. 5.5 with a minimum objective value within the feasible region. The necessary Kuhn-Tucker conditions [31] for x* being a constrained minimum are:

1. x* is feasible:

gj(x*) ≤ 0, j = 1, m
hk(x*) = 0, k = 1, l

2. λ*j gj(x*) = 0, j = 1, m, with λ*j ≥ 0

3. ∇f(x*) + Σ_{j=1}^{m} λ*j ∇gj(x*) + Σ_{k=1}^{l} λ*_{m+k} ∇hk(x*) = 0

with λ*j ≥ 0 and λ*_{m+k} unrestricted in sign.


The first of the Kuhn-Tucker conditions requires that the minimum of the constrained function lie within the feasible region. The second condition implies that inactive constraints, such as the one inactive constraint shown in Fig. 5.5, must not be considered. The third condition can be understood by the method of feasible directions: at the constrained minimum point, the length of the feasible search direction becomes zero. This appears with one active constraint, see Fig. 3.3, or with several active constraints, see Fig. 5.5.

Figure 5.5: Illustrations to Kuhn-Tucker conditions for constrained optima
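The three conditions can be verified numerically. A minimal check for the sample problem of Section 5.5.1, where f = (x+2)²/20, g1 = 1 − x, with x* = 1 and λ* = 3/10 taken from the text:

```python
# Kuhn-Tucker check for the sample problem of Section 5.5.1.
x, lam = 1.0, 0.3

g = 1.0 - x                  # condition 1: g(x*) ≤ 0 (here active: g = 0)
slack = lam * g              # condition 2: λ* g(x*) = 0 (complementarity)
# condition 3: ∇f(x*) + λ* ∇g(x*) = (x+2)/10 + λ·(-1) = 0
stationarity = (x + 2.0) / 10.0 + lam * (-1.0)

print(g, slack, stationarity)   # 0.0 0.0 0.0
```

All three conditions are satisfied, confirming x* = 1 as the constrained minimum.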


5.7 Duality

Section 5.5 explains that the critical, or stationary, points of the Lagrangian are saddle points. The saddle-point nature is connected with the max-min problem considered in Section 5.7.1 and the duality concept introduced in Section 5.7.2; the presentation of both topics relies on textbook material [20, 32, 33, 34].

5.7.1 The Max-Min Problem

The results of the previous sections allow defining the Lagrangian in terms of λ alone as

L(λ) = min_x L(λ, x) (5.29)

For finding a solution to the optimization problem the Lagrangian must be maximized with respect to λ, or

max_λ L(λ) = max_λ min_x L(λ, x) (5.30)

The same solution is obtained by stating this as an equivalent min-max problem:

min_x L(x) = min_x max_λ L(λ, x) (5.31)

Vanderplaats [20] presents the example of a simple max-min problem:

F = 1/x;  g = x − 1 ≤ 0;  x ≥ 0 (5.32)

The Lagrangian is

L(x, λ) = 1/x + λ(x − 1) (5.33)

From (5.29) the gradient of L with respect to x must vanish for any value of λ, so

∇xL(x, λ) = −1/x² + λ = 0 → x = ±1/√λ (5.34)

where only the positive root is meaningful since x is required to be positive. Now, for any value of λ, the value of x which will minimize L is known. The Lagrangian can be written in terms of λ only as

L(λ) = √λ + λ(1/√λ − 1) = 2√λ − λ (5.35)

The constrained minimum is found by maximizing L(λ); the gradient with respect to λ must therefore vanish. This gives:

∇λL(λ) = 1/√λ − 1 = 0 → λ* = 1 (5.36)

and then from (5.34)

x* = 1 (5.37)

The original inequality-constrained problem has been solved by conversion to a max-min problem. Next, the primal and equivalent dual problems are defined, and finally the problem types where the presented concepts give a computational advantage will be identified.
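The max-min example (5.32)–(5.37) can be reproduced by sampling the dual function (5.35) on a grid; a minimal sketch, with the grid bounds chosen as an assumption:

```python
import numpy as np

# Vanderplaats' max-min example (5.32): F = 1/x, g = x - 1 ≤ 0, x ≥ 0.
# Minimizing L over x for fixed λ gives x = 1/√λ (5.34), leaving the
# dual function L(λ) = 2√λ - λ (5.35), to be maximized over λ ≥ 0.
lam = np.linspace(0.01, 4.0, 4000)
L = 2.0 * np.sqrt(lam) - lam
i = np.argmax(L)            # grid maximizer, should approach λ* = 1
x_star = 1.0 / np.sqrt(lam[i])   # primal recovery via (5.34)
```

The grid maximum lands at λ ≈ 1 with L ≈ 1, and the recovered primal variable x ≈ 1 matches (5.37).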


5.7.2 The Primal and Dual Problems

We are already familiar with the primal problem:

min_x { f(x) | g ≤ 0, h = 0 } (5.38)

and it can be solved without concern for the values of the Lagrange multipliers λ. At the optimum x* the values of λ* can be calculated with the third Kuhn-Tucker condition, which is actually n equations, n being the number of variables x. Usually the Lagrange multiplier values are not calculated because they are of no particular use. If, however, the saddle-point values λ* were known in advance, the constrained problem could be solved with only one unconstrained minimization.
The dual optimization problem is stated as

max_λ { L(λ) | λj ≥ 0, j = 1, m } (5.39)

The Lagrange multipliers are now called dual variables. Before we have found out where this is useful, we may just play with the idea of solving the dual problem, that is, finding the Lagrange multipliers first and then retrieving the optimum primal variables x*. Actually, in many design problems only a few of the many constraints are critical at the optimum, so that only the few corresponding Lagrange multipliers are nonzero. The maximization problem (or minimization problem for −L) has only simple non-negativity constraints on the Lagrange multipliers corresponding with the inequality constraints on x.
A further attractive aspect of duality is that the primal and dual objective values become the same at the saddle point. The dual always provides a lower bound on the primal problem.

5.7.3 Computational Considerations

The usefulness of dual methods depends on finding a way of conveniently handling the primal variables within the dual problem. This is why dual methods become attractive when the primal problem is mathematically separable. Separability exists when the objective and constraint functions are each calculated as the sum of functions of the individual design variables. Thus, a separable function shows the following characteristic:

f(x) = f1(x1) + f2(x2) + · · · + fn(xn) (5.40)

We complete the primal problem by writing the inequality constraining equations

gj(x) = gj1(x1) + gj2(x2) + · · ·+ gjn(xn) ≤ 0, j = 1, m (5.41)

where equality and side constraints are omitted here for brevity. The Lagrangian is then written as

L(x, λ) = Σ_{i=1}^{n} fi(xi) + Σ_{j=1}^{m} λj Σ_{i=1}^{n} gji(xi) (5.42)

The dual problem, depending on the Lagrange multipliers only, is prepared by using (5.29) and the property that the minimum of the separable function is the sum of the minima of the individual parts:

Maximize: L(λ) = Σ_{i=1}^{n} min_{xi} [ fi(xi) + Σ_{j=1}^{m} λj gji(xi) ] (5.43)

subject to: λj ≥ 0, j = 1, m (5.44)
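A crude numerical sketch of the separable dual function (5.43): each inner minimization is done independently per variable, here over a grid of candidate values. The helper name, the grid approach, and the example data are assumptions, not from the text. For the example below, the true optimum of min x1² + x2² subject to x1 + x2 ≥ 2 is f = 2 at λ* = 2, and the dual function returns that value there:

```python
import numpy as np

def dual_function(lam, f_i, g_ji, x_grid):
    """Separable dual function (5.43): for fixed multipliers λ, minimize
    each term f_i(x_i) + Σ_j λ_j g_ji(x_i) independently over a grid of
    candidate x_i values, then sum the minima.

    f_i   : list of n scalar functions f_i(x_i)
    g_ji  : nested list where g_ji[j][i] is the function g_ji(x_i)
    x_grid: 1-D array of candidate values for every x_i (a crude
            stand-in for an exact inner minimization)
    """
    total = 0.0
    for i, fi in enumerate(f_i):
        terms = fi(x_grid) + sum(lam[j] * g_ji[j][i](x_grid)
                                 for j in range(len(lam)))
        total += terms.min()
    return total

# Hypothetical separable problem:
# f = x1² + x2²,  g1 = (1 - x1) + (1 - x2) ≤ 0.
grid = np.linspace(-2.0, 3.0, 5001)
f_i = [lambda x: x**2, lambda x: x**2]
g_ji = [[lambda x: 1.0 - x, lambda x: 1.0 - x]]
val = dual_function(np.array([2.0]), f_i, g_ji, grid)   # ≈ 2.0
```

Each inner minimum is found independently, which is exactly the computational advantage separability brings.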


5.7.4 Use of Duality in Nonlinear Optimization

Vanderplaats [20] discusses three cases where the dual form is used for computational advantage, namely:

- Separable objective with linear and quadratic terms and linear constraints,

- Separable objective and constraints both with linear and quadratic terms, and

- Nonseparable objective and separable linear constraints.

Separable Objective with Linear and Quadratic Terms and Linear Constraints

The primal problem is

f(x) = Σ_{i=1}^{n} ( ai xi + (1/2) bi xi² ) (5.45)

gj(x) = Σ_{i=1}^{n} cji xi ≤ 0, j = 1, m (5.46)

Referring to the result (5.43), the following function is minimized with respect to xi:

min_{xi} [ ai xi + (1/2) bi xi² + Σ_{j=1}^{m} λj cji xi ] (5.47)

At the minimum the derivative with respect to xi must vanish, which condition yields

ai + bi xi + Σ_{j=1}^{m} λj cji = 0 (5.48)

Solving for xi in terms of λj gives

xi = ( −ai − Σ_{j=1}^{m} λj cji ) / bi (5.49)

Since some of the bi may be zero, it is important to foresee upper and lower limits for the primal variables to be treated explicitly:

xli ≤ xi ≤ xui, i = 1, n (5.50)

In case bi equals zero, the term in (5.47) is linear in xi with slope ai + Σj λj cji, the negative of the numerator in (5.49); xi is then set to its upper bound if that numerator is positive and to its lower bound otherwise. The result is to be inserted into (5.43), which is then to be maximized with respect to the dual variables λj.
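A minimal sketch of the primal recovery (5.49)–(5.50), here restricted to the strictly convex case bi > 0 so that clipping to the side constraints suffices; the function name and the example data are assumptions:

```python
import numpy as np

def primal_from_dual(a, b, c, lam, xl, xu):
    """Recover the primal variables from the dual variables via (5.49),
        x_i = (-a_i - Σ_j λ_j c_ji) / b_i,
    clipped to the side constraints (5.50). Assumes all b_i > 0.

    a, b : coefficient arrays of shape (n,)
    c    : constraint coefficients c_ji, shape (m, n)
    lam  : dual variables, shape (m,)
    """
    num = -a - c.T @ lam        # numerator of (5.49)
    return np.clip(num / b, xl, xu)

# Hypothetical data: two variables, one constraint, wide bounds.
a = np.array([1.0, -2.0])
b = np.array([2.0, 2.0])
c = np.array([[1.0, 1.0]])
x = primal_from_dual(a, b, c, np.array([0.5]),
                     np.full(2, -10.0), np.full(2, 10.0))
# x = [-0.75, 0.75]
```

With the primal variables expressed this way, the dual function (5.43) becomes an explicit function of λ alone.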

Separable Objective and Constraints Both with Linear and Quadratic Terms

f(x) = Σ_{i=1}^{n} ( ai xi + (1/2) bi xi² ) (5.51)

gj(x) = Σ_{i=1}^{n} ( cji xi + (1/2) dji xi² ), j = 1, m (5.52)


Inserting the objective and constraints into (5.43) gives the subproblem

min_{xi} [ ai xi + (1/2) bi xi² + Σ_{j=1}^{m} λj ( cji xi + (1/2) dji xi² ) ] (5.53)

Differentiating with respect to xi and equating to zero gives

ai + bi xi + Σ_{j=1}^{m} λj ( cji + dji xi ) = 0 (5.54)

the resolution of which gives the primal variable as

xi = ( −ai − Σ_{j=1}^{m} λj cji ) / ( bi + Σ_{j=1}^{m} λj dji ) (5.55)

Again, if the denominator becomes zero, xi is set to its upper or lower bound according to whether the numerator is positive or negative, respectively.

Nonseparable Objective and Separable Linear Constraints

Here the primal quadratic programming problem is written in matrix notation:

Minimize: f(x) = (1/2) x^T A x + b^T x (5.56)

subject to: Bx ≤ c (5.57)

Nonnegativity requirements on xi can for convenience be included in the set of linear inequality constraints. The matrix A is the Hessian matrix and is required to be positive definite for convexity and because it must be invertible. Here the Lagrangian becomes

L(x, λ) = (1/2) x^T A x + b^T x + λ^T (Bx − c) (5.58)

Solving for the stationary condition with respect to x gives

∇xL(x, λ) = Ax + b + B^T λ = 0 (5.59)

from which the primal variables

x = −A^{−1} ( b + B^T λ ) (5.60)

are obtained. Substituting this into (5.58) and simplifying gives the Lagrangian in terms of λ alone:

L(λ) = −(1/2) λ^T D λ − d^T λ − (1/2) b^T A^{−1} b (5.61)

where the abbreviations

D = B A^{−1} B^T ;  d = B A^{−1} b + c (5.62)

have been used.
The special cases presented by Vanderplaats permit writing explicit expressions for the primal variables in terms of the dual variables. Often this may not be possible, and then the minimization subproblem can be solved as an optimization problem itself. When this can be done with great efficiency, dual methods can provide efficient solution techniques for nonlinear optimization. Examples are the topology optimization problems explained in Sections 9.4 and 9.5.
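The matrix case (5.56)–(5.62) fits in a few lines of NumPy. The data below are a hypothetical two-variable problem with one linear constraint that is active at the optimum, so the unconstrained stationary point of L(λ) already satisfies λ ≥ 0:

```python
import numpy as np

# Quadratic primal: min ½xᵀAx + bᵀx  subject to  Bx ≤ c.
A = np.array([[2.0, 0.0], [0.0, 2.0]])
b = np.array([-2.0, -2.0])
B = np.array([[1.0, 1.0]])
c = np.array([1.0])

Ainv = np.linalg.inv(A)
D = B @ Ainv @ B.T             # D = B A⁻¹ Bᵀ
d = B @ Ainv @ b + c           # from substituting (5.60) into the
                               # Lagrangian with the λᵀ(Bx - c) term
lam = np.linalg.solve(D, -d)   # stationary (maximum) point of L(λ)
x = -Ainv @ (b + B.T @ lam)    # primal recovery via (5.60)
# lam = [1.0], x = [0.5, 0.5]: the constraint x1 + x2 ≤ 1 is active.
```

The recovered point satisfies the stationarity condition (5.59) exactly, and the unconstrained minimizer x = (1, 1) is correctly pushed onto the constraint.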


5.8 Optimization Algorithms Overview

The nonlinear optimization tasks stated in Section 5.1 require iterative solution methods for finding the minimum of a global objective function that may be constrained or unconstrained. The iteration scheme illustrated in Fig. 5.6 applies largely to all structural optimization processes.

Figure 5.6: Design Optimization Iteration Basic Scheme

There are many different methods from which optimization algorithms are derived. Muller provides in her recent dissertation [1] a concise scheme for dividing the methods. The direct methods require only the values of the objective function itself, and the indirect methods need additional information in terms of first or higher derivatives. Complete evaluation is an extremely simple and inefficient direct method devoid of any strategy to utilize topology information from previous function evaluations for improving convergence. The more sophisticated direct methods are further subdivided into stochastic and deterministic ones. Stochastic methods use random numbers to identify new design solutions. The stochastic methods include, for instance, evolutionary algorithms, some of which are explained in Chapter 7.
The deterministic methods rely on some information on the local objective function topology to identify more systematically promising points in variable space. They include the simplex search method and Powell's conjugate direction method, explained in Sections 6.1 and 6.6, respectively.
The indirect methods are all deterministic. One distinguishes there between first-order methods, needing first-derivative information, and second-order methods, needing in addition second-derivative information. The first-order methods include the method of steepest descent (Section 6.2) and the nonlinear conjugate gradient method of Fletcher and Reeves (Section 6.5). The second-order methods include Newton's method (Section 6.4) and surface-response methods (Section 6.7).
In the literature [18], the deterministic methods, or the algorithms based on them, are customarily called mathematical programming, and the direct deterministic methods go under the label zeroth-order methods of mathematical programming. The illustration in Fig. 5.7 adopts parts from [1] but also indicates the realm of mathematical programming. Apart from the method order indicated in Fig. 5.7, search methods are characterized by the model order, which is the order of the Taylor series expansion that approximates the objective function. Fig. 5.8 arranges the methods in a matrix whose rows and columns stand for the model and method orders, respectively. Search methods based on quadratic models of the objective function converge very quickly on quadratic functionals. For instance, the Newton and the surface-response methods converge in one step on quadratic functionals



Figure 5.7: Overview of Optimization Algorithms. Source: Muller [1]


Figure 5.8: Search methods arranged after model and method orders

although the methods are of order two and zero, respectively. The lower triangle of that matrix is empty because the model order limits the method order.

5.8.1 An Argument for Mathematical Programming

The simplest search method is complete evaluation of the search space. Value ranges are assigned to the respective variables so that a region of the search space is covered where the optimum is suspected. Along each value range several coordinate values are selected, for instance equidistantly. These coordinate values define a hypermesh of points in the n-dimensional space of the optimization variables. Fig. 5.9 illustrates the situation for n = 2. The objective function is then evaluated at each point, and the point having the lowest function value is taken to be the optimum. The numerical effort of the method explodes with increasing number n of search space dimensions. The situation is vividly depicted by Vanderplaats [20]:

"Consider, for example, a design problem described by three variables. Assume we wish to investigate the designs for 10 values of each variable. Assume also that any proposed design can be analyzed in one-tenth of a central processor unit (CPU) second on a digital computer. There are then 10³ combinations of design variables to be investigated, each requiring one-tenth second, for a total of 100 CPU seconds to obtain the desired optimum design. This would probably be considered an economical solution in most design situations. However, now consider a more realistic design problem where 10 variables describe the design. Again, we wish to investigate 10 values of each variable. Also now assume that the analysis of a proposed design requires 10 CPU seconds on the computer. The total CPU time now required to obtain the optimum design is 10¹¹ seconds, or roughly 3200 years of computer time! Clearly, for most practical design problems, a more rational approach to design automation is needed."

He uses this to underline the importance of the much more efficient mathematical programming algorithms.
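The arithmetic of the quote is easy to reproduce; the helper function is a hypothetical illustration:

```python
# Complete-evaluation cost: k candidate values per variable,
# n variables, t CPU-seconds per analysis -> kⁿ · t seconds in total.
def grid_search_cost_seconds(n, k, t):
    return (k ** n) * t

small = grid_search_cost_seconds(3, 10, 0.1)     # 3 variables:  100 s
large = grid_search_cost_seconds(10, 10, 10.0)   # 10 variables: 1e11 s
years = large / (3600 * 24 * 365)                # ≈ 3200 years
```

The cost grows exponentially in n, which is the whole point of the quoted example.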

5.8.2 An Argument for Stochastic Methods

Most of the mathematical programming methods work locally. Usually they proceed from one starting point and estimate, from the function value and derivative information at that point, or from information of previously obtained function values at other points, a new point with a function value lower than those at the other points. Thus, all points with higher function values than at the starting point, or any point reached during the search, are systematically not evaluated, which explains the incredible efficiency gain compared with complete evaluation.
The local nature of mathematical programming thus leads to great efficiency but, on the other hand, lies at the heart of a great disadvantage: the search tends to get trapped at local minima of non-convex objective functions. Practical ways and conceptual approaches to mitigate this problem are addressed in Section 6.11. However, the multi-start method encounters a similar efficiency problem as complete evaluation, and the tunneling method is not yet fully developed.
In addition, the first- and second-order methods require the objective function to be continuous with continuous first and second derivatives, respectively. But generally one has to expect that an engineering design problem may also be of a discrete nature, and then


mathematical programming is not applicable.
All of this poses no problem for the stochastic methods, which gain much higher efficiency than complete evaluation by using random numbers to define design changes.

5.8.3 Mathematical Programming and Stochastic Search Methods

Mathematical programming can be so much more efficient than other methods that successful methods have been derived where problems of a discrete nature are transformed so that minimization of continuous objective functions solves the problem. Other aspects are addressed in the introductory remarks on design optimization in Section 1.1. Chapter 6 explains some selected algorithms of mathematical programming, and Chapter 7 deals with stochastic search methods.


5.9 Practical Considerations for Numerical Optimization

From Vanderplaats [20] we quote advantages and limitations of numerical optimization.

5.9.1 Advantages of Using Numerical Optimization

• A major advantage is the reduction in design time; this is especially true when the same computer program can be applied to many design projects.

• Optimization provides a systematized logical design procedure.

• We can deal with a wide variety of design variables and constraints which are difficult to visualize using graphical or tabular methods.

• Optimization virtually always yields a design improvement.

• It is not biased by intuition or experience in engineering. Therefore, the possibility of obtaining improved, nontraditional designs is enhanced.

• Optimization requires a minimal amount of human-machine interaction.

5.9.2 Limitations of Numerical Optimization

• Computational time increases as the number of design variables increases. If one wishes to consider all possible design variables, the cost of automated design is often prohibitive. Also, as the number of design variables increases, these methods tend to become numerically ill-conditioned.

• Optimization techniques have no stored experience or intuition on which to draw. They are limited to the range of applicability of the analysis program.

• If the analysis program is not theoretically precise, the results of optimization may be misleading, and therefore the results should always be checked very carefully. Optimization will invariably take advantage of analysis errors in order to provide mathematical design improvements.

• Most optimization algorithms have difficulty in dealing with discontinuous functions. Also, highly nonlinear problems may converge slowly or not at all. This requires that we be particularly careful in formulating the automated design problem.

• It can seldom be guaranteed that the optimization algorithm will obtain the global design optimum. Therefore, it may be necessary to restart the optimization process from several different points to provide reasonable assurance of obtaining the global optimum.

• Because many analysis programs were not written with automated design in mind, adaptation of these programs to an optimization code may require significant reprogramming of the analysis routines.


Chapter 6

Search for Design Improvement: Mathematical Programming

Mathematical Programming (MP) is a rather generic term, and the algorithms embraced by it assume that in general the objective and the constraining functions are continuous and at least twice differentiable [18]. Moreover, MP assumes convex objective functions. Given a non-convex objective and a starting point, MP will generally obtain a local minimum, and an additional effort, for instance by employing a multi-start technique, is needed to find other local minima with smaller function values and, hopefully, the global minimum too. Some mathematical programming methods are regarded as highly efficient, but it is also important to note that the efficiency of these methods depends very much on the actual objective function. Therefore, general statements claiming one certain method to be always better than another are usually untenable.
The terms Sequential Linear Programming and Sequential Quadratic Programming are used to characterize solution methods for nonlinear constrained optimization problems which use linear and quadratic approximations, respectively, to the problems. The following definitions have been taken from Vanderplaats' textbook [20].

Sequential Linear Programming The nonlinear optimization problem stated at the beginning of Chapter 5 is linearized via a first-order Taylor series expansion:

Minimize: f(x) ≈ f(x0) + ∇f(x0) ∆x

Subject to: g(x) ≈ g(x0) + ∇g(x0) ∆x ≤ 0

h(x) ≈ h(x0) + ∇h(x0) ∆x = 0

where ∆x = x − x0 (6.1)

Whereas the unconstrained problem requires iterative line searches along a current search direction, SLP can replace these line searches with a direct step to activate the closest, as yet inactive, constraining equation.

Sequential Quadratic Programming Newton's method for solving nonlinear minimization problems uses a quadratic approximation to the unconstrained objective by Taylor-series approximation up to the quadratic term, see Section 6.4. SQP methods use the same quadratic Taylor-series approximation of the Lagrangian, see Section 3.5.3, of the nonlinear constrained minimization


problem:

L(x, λ) ≈ L(x0, λ0) + ∇xL(x0, λ0) ∆x + (1/2) ∆x^T ∇x²L(x0, λ0) ∆x (6.2)

Because of the approximation both methods require iterations, and each approximation is called a quadratic subproblem. If all constraints are equality constraints, the search method remains identical to that of the Newton method. If the problem includes inequality conditions, the search becomes similar to the modified Newton method, but a nonlinear line search along the search direction may be replaced with a direct step to activate the closest, as yet inactive, constraining equation.

Fig. 5.7 illustrates how the methods of mathematical programming are divided into the categories direct methods, gradient-based methods, and second-order methods. Zeroth-order, or direct, methods require only objective-function values. First-order, or gradient-based, methods require in addition the gradient (first derivative). Second-order methods also require the Hessian matrix (second derivative).
Some methods are derived upon the model of a quadratic function, and thus have a theoretical basis. The textbook by Reklaitis, Ravindran, and Ragsdell [23] gives two reasons for choosing a quadratic model:

1. It is the simplest type of nonlinear function to minimize, and hence any general technique must work well on a quadratic if it is to have any success with a general function.

2. Near the optimum, all nonlinear functions can be approximated by a quadratic (since in the Taylor expansion, the linear part must vanish). Hence, the behavior of the algorithm on the quadratic will give some indication of how the algorithm will converge for general functions.

The direct method due to Powell [35] is also based on a quadratic model. Cauchy's method of steepest descent [36] is a gradient method based on a linear model: improved points are suspected in the direction of the steepest local descent. Since no assumption is made how that descent property might change at some distance away from the reference point, the method is often less efficient than those based on quadratic models. Its one advantage over more sophisticated strategies lies in its descent property.


6.1 Simplex Search Method

The rendering of the simplex search method of Spendley, Hext, and Himsworth [37] follows the textbook by Reklaitis, Ravindran, and Ragsdell [23]. Consider the problem of finding some first-order information on the direction-dependent function-value distribution of a function in N dimensions from function values at points. For N = 1, two points can give the information whether the function is increasing or decreasing with increasing variable value. For N = 2, three points are necessary to give equivalent information. In N dimensions, N + 1 points are always necessary to obtain the directional trend of a function in some region. When this smallest number of points is arranged equidistantly, they define a regular simplex. For example, the equilateral triangle is a simplex in two dimensions; a tetrahedron is a simplex in three dimensions. The main property of a simplex employed by the simplex search method is that a new simplex can be generated on any face of the old one by projecting any chosen vertex a suitable distance through the centroid of the remaining vertices of the old simplex. The new simplex is then formed by replacing the old vertex by the newly generated projected point. In this way each new simplex is generated with a single evaluation of the objective. This process is demonstrated for two dimensions in Fig. 6.1.

Figure 6.1: Construction of new simplex (after [23])

The method begins by setting up a regular simplex in the space of the independent variables and evaluating the function at each vertex. The vertex with the highest function value is located. This vertex is then reflected through the centroid to generate a new point, which is used to complete the next simplex. As long as the function values obtained at each new point decrease monotonically, the iterations move along until either the minimum point is straddled or the iterations begin to cycle between two or more vertices. The minimum point is straddled when the same vertex is reflected back and forth through the iterations. These situations are resolved using the following three rules:

1. Minimum "Straddled": If the vertex with the highest function value was generated in the previous iteration, then choose instead the vertex with the next-highest value for reflection.

2. Cycling: If a given vertex remains unchanged for M iterations, reduce the size of the simplex by some factor. Set up a new simplex with the currently lowest point as the base point. Spendley et al. suggest that M be predicted via M = 1.65N + 0.05N^2, where N is the problem dimension and M is rounded to the nearest integer. This rule requires the specification of a reduction factor.


3. Termination Criterion: The search is terminated when the simplex gets small enough, or else if the standard deviation of the function values at the vertices gets small enough. This rule requires the specification of a termination parameter.

Apart from the objective evaluations, there are only two types of calculation required for the simplex search algorithm: (1) generation of a simplex for a given base point x^(0) and scale factor α in N-dimensional space, and (2) calculation of the reflected point. For a given base point, the other vertices of the simplex are calculated from

x_j^(i) = x_j^(0) + δ1  if j = i,
x_j^(i) = x_j^(0) + δ2  if j ≠ i,  (6.3)

for i and j = 1, 2, ..., N. The increments δ1 and δ2, which depend only on N and the selected scale factor α, are calculated from

δ1 = [(√(N+1) + N − 1) / (N√2)] α,
δ2 = [(√(N+1) − 1) / (N√2)] α.  (6.4)

The design of a simplex in two dimensions is illustrated in Fig. 6.2(a).

Figure 6.2: Design Principle (a) and Reflection (b) of a Simplex in Two Dimensions

Suppose x^(j) is the point to be reflected, as illustrated in Fig. 6.2(b). Then the centroid of the remaining N points is

x_c = (1/N) Σ_{i=0, i≠j}^{N} x^(i).  (6.5)

All points on the line from x(j) through xc are given by

x = x^(j) + λ(x_c − x^(j)).  (6.6)

Choosing λ = 2 will yield the new vertex point so that the regularity of the simplex is retained. Thus,

x_new^(j) = 2 x_c − x_old^(j).  (6.7)
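The construction (6.3)-(6.4) and the reflection (6.5)-(6.7) can be condensed into a short program. The following Python sketch is our own illustration, not the authors' code: the function names, tolerances, and the shrink-on-failure rule (a simplified combination of rules 1-3 above) are illustrative assumptions.

```python
import numpy as np

def regular_simplex(x0, alpha):
    """Vertices of a regular simplex with base point x0 and scale alpha, eqs. (6.3)-(6.4)."""
    n = len(x0)
    d1 = (np.sqrt(n + 1) + n - 1) / (n * np.sqrt(2)) * alpha
    d2 = (np.sqrt(n + 1) - 1) / (n * np.sqrt(2)) * alpha
    verts = [np.array(x0, dtype=float)]
    for i in range(n):
        v = x0 + d2 * np.ones(n)   # all coordinates shifted by delta_2 ...
        v[i] = x0[i] + d1          # ... except the ith, shifted by delta_1
        verts.append(v)
    return np.array(verts)

def simplex_search(f, x0, alpha=1.0, shrink=0.5, tol=1e-8, max_iter=2000):
    verts = regular_simplex(np.asarray(x0, float), alpha)
    fvals = np.array([f(v) for v in verts])
    last_reflected = -1
    for _ in range(max_iter):
        if np.std(fvals) < tol:            # rule 3: termination criterion
            break
        worst = int(np.argmax(fvals))
        if worst == last_reflected:        # rule 1: minimum straddled
            worst = int(np.argsort(fvals)[-2])
        centroid = (verts.sum(axis=0) - verts[worst]) / (len(verts) - 1)
        new = 2.0 * centroid - verts[worst]            # reflection, eq. (6.7)
        fnew = f(new)
        if fnew < fvals[worst]:
            verts[worst], fvals[worst] = new, fnew
            last_reflected = worst
        else:                              # simplified rule 2: shrink toward best vertex
            best = int(np.argmin(fvals))
            verts = verts[best] + shrink * (verts - verts[best])
            fvals = np.array([f(v) for v in verts])
            last_reflected = -1
    best = int(np.argmin(fvals))
    return verts[best], fvals[best]
```

Each successful reflection costs a single objective evaluation, which is the main appeal of the method.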


6.2 Method of Steepest Descent

Consider a reference point x0 and the function value f at that point. Let the gradient ∇f also be known. The gradient gives the direction of steepest ascent of the objective function f in the space of the optimization variables x. Therefore, in the neighborhood of the reference point, points with lower function values exist in the negative gradient direction, or the direction of steepest descent, s:

s = −∇f (6.8)

Thus, the direction of steepest descent is a useful search direction along which to locate points with lower function values. Moving along the search direction will initially obtain ever smaller function values, but after reaching a minimum the values will, from there on, increase again. That minimum point along the search direction can be used as a new reference point. This generates the sequence

xk+1 = xk + αksk (6.9)

where the αk are obtained by line searches (see Section 6.8). The slope f′ is the component of the gradient in the normalized search direction,

f′ = (∇f(x_{k+1}))^T s_k / |s_k|.  (6.10)

It vanishes at the minimum point along the search direction because there the search direction and the gradient are perpendicular to each other. Therefore, any two consecutive line searches are perpendicular to each other, as Fig. 6.3 illustrates.

Figure 6.3: Cauchy-method zigzag path through two-dimensional search space

However, exact values of the step length αk can be calculated only when the objective is a quadratic functional. Even then, the sequence (6.9) is not guaranteed to yield the absolute minimum of a convex function in a finite number of iteration steps: the zigzagging course indicated in Fig. 6.3 could be continued forever. Moreover, the step length can only be estimated numerically, and very accurate estimates may cost many function evaluations, further reducing the efficiency of the sequence (6.9). Therefore, inexact line searches with reduced numerical effort have the effect that two consecutive search directions are no longer exactly perpendicular to each other, but the overall efficiency may increase as long as any new point generated by the line search has a lower function value than the reference point.
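A minimal Python sketch of the sequence (6.8)-(6.9) follows; the backtracking (Armijo) rule stands in for the inexact line searches discussed above, and its constants are our own illustrative choices, not part of Cauchy's method.

```python
import numpy as np

def steepest_descent(f, grad, x0, alpha0=1.0, c=1e-4, tol=1e-6, max_iter=1000):
    """Cauchy's method, eq. (6.9), with a backtracking (inexact) line search."""
    x = np.asarray(x0, float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        s = -g                        # steepest-descent direction, eq. (6.8)
        alpha = alpha0
        # halve the step until a sufficient decrease (Armijo condition) is met
        while f(x + alpha * s) > f(x) + c * alpha * g.dot(s):
            alpha *= 0.5
        x = x + alpha * s
    return x

# Ill-conditioned quadratic: the iterates zigzag as sketched in Fig. 6.3
f = lambda x: 0.5 * (x[0]**2 + 10.0 * x[1]**2)
grad = lambda x: np.array([x[0], 10.0 * x[1]])
x_min = steepest_descent(f, grad, [10.0, 1.0])
```

The larger the ratio of the largest to the smallest curvature, the more pronounced the zigzagging and the slower the convergence.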


6.3 Quadratic Objective

First we recall a quadratic function depending on a single variable x only,

f(x) = a0 + a1 x + a2 x^2,  (6.11)

and its first and second derivatives,

f′(x) = a1 + 2 a2 x,
f′′(x) = 2 a2.  (6.12)

Setting the first derivative equal to zero determines the point x_e where the function value is an extremum:

x_e = −a1 / (2 a2).  (6.13)

If the value of the second derivative, or a2, is positive, the extremum is a minimum. When the minimum point is considered an optimum, the two conditions are called optimality conditions.
Next we generalize the concept to quadratic functions depending on several variables x_i that are arranged in the variables vector x,

f(x) = p+ xTp + xTPx, (6.14)

where the function f and the constant coefficient p are scalars, the variables vector x and the coefficient p of the linear term are vectors, and the coefficient P is a matrix. The extreme point x_e in variables space is identified by setting the first derivative of (6.14) equal to zero:

∇f(x) = ∂f/∂x = p + 2 P x = 0.  (6.15)

The first derivative of a function of several variables, or of a variables vector, is a vector called the gradient ∇f. From (6.15) follows the extremum point x_e:

x_e = −(1/2) P^(−1) p.  (6.16)

The extreme point is a minimum if, for any arbitrary vector v with non-zero length, it holds that

vTPv > 0. (6.17)
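The optimality conditions (6.15)-(6.17) can be evaluated numerically. The sketch below (the coefficient values are our own illustration) solves the linear system behind (6.16) instead of forming the inverse, and tests positive definiteness (6.17) via the eigenvalues of P:

```python
import numpy as np

# f(x) = p0 + x.p + x' P x with symmetric positive-definite P, eq. (6.14)
P = np.array([[2.0, 0.5],
              [0.5, 1.0]])
p = np.array([-1.0, 2.0])
p0 = 3.0

f = lambda x: p0 + p.dot(x) + x.dot(P).dot(x)

# Extremum from eq. (6.16); np.linalg.solve avoids explicitly inverting P
x_e = np.linalg.solve(P, -0.5 * p)

# Minimum check, eq. (6.17): P positive definite <=> all eigenvalues > 0
assert np.all(np.linalg.eigvalsh(P) > 0)
```

Solving the linear system is both cheaper and numerically more stable than inverting P, which matters once the number of variables grows.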


6.4 Original and Modified Newton Methods

We realize that the quadratic polynomial is the simplest form of a function with just one minimum that can be determined by the optimality criteria. Next we consider the expansion of a smooth function at a reference point x0 by a Taylor series up to the quadratic terms. The Taylor series approximates the function the better, the smaller the considered region is. The quadratic Taylor-series approximation of a function f depending on a single variable x is

f(x0 + Δx) = f(x0) + Δx f′(x0) + (1/2)(Δx)^2 f′′(x0),  (6.18)

where Δx is the distance from the reference point x0. When f depends on a variables vector x, the gradient ∇f is the first derivative and the Hessian matrix H = ∇∇f is the second derivative of f(x):

f(x0 + Δx) = f(x0) + Δx^T ∇f(x0) + (1/2) Δx^T H Δx.  (6.19)

The first and second derivatives of the Taylor-series expansion are

∇f(x0 + Δx) = ∇f(x0) + H Δx,
∇∇f(x0 + Δx) = H.  (6.20)

Setting the gradient of the Taylor-series approximation f(x0 + Δx) equal to zero yields the distance Δx from the reference point x0 to the extremum point x_e^app:

Δx = −H^(−1) ∇f(x0).  (6.21)

The extremum point itself follows from

x_e^app = x0 + Δx = x0 − H^(−1) ∇f(x0).  (6.22)

The extremum is a minimum if, for any arbitrary vector v, it holds that v^T H v > 0. The extremum point of the Taylor-series approximation is generally not identical with the extremum of the objective function itself. Under certain conditions, it will be closer to the true extremum than the reference point. It can then replace the reference point, and applying (6.22) again yields an improved estimate, so that the continued sequence

x_{k+1} = x_k − H^(−1) ∇f  (6.23)

converges to the true extremum. It is understood that the gradient and the Hessian matrix are evaluated at the point x_k. This procedure is called the Newton Method.
The original objective function may not be well approximated by its quadratic Taylor-series expansion. This is likely to occur when the reference point is too far away from the minimum point of the original objective. Then the calculated extremum point of the approximating function may be farther away from the true minimum than the reference point, so that the sequence does not converge. Introducing a variable step length αk leads to the so-called Modified Newton Method,

xk+1 = xk − αkH−1∇f. (6.24)

The step length is not known beforehand and must be determined by a so-called line search. Line-search methods are explained in Section 6.7.


In some cases the Newton and Modified Newton Methods may be very efficient. However, in structural optimization they enjoy only limited popularity. This is because the derivatives can usually only be extracted numerically, using the methods explained in Section 6.8. The number of function evaluations needed to calculate the Hessian matrix scales quadratically with the number of design variables, so that for a large number of design variables the numerical effort becomes too high. The methods presented in the following sections avoid the need for explicit calculation of the Hessian matrix.
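The point about numerical effort can be made concrete with a sketch of the Modified Newton sequence (6.24), where gradient and Hessian are extracted by finite differences; the step sizes and the crude backtracking rule below are our own illustrative assumptions, not prescriptions from the text.

```python
import numpy as np

def num_grad(f, x, eps=1e-6):
    """Central-difference gradient: 2n extra function evaluations."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

def num_hess(f, x, eps=1e-4):
    """Finite-difference Hessian: the evaluation count scales quadratically in n."""
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        e = np.zeros(n)
        e[i] = eps
        H[:, i] = (num_grad(f, x + e, eps) - num_grad(f, x - e, eps)) / (2 * eps)
    return 0.5 * (H + H.T)   # symmetrize against round-off

def modified_newton(f, x0, tol=1e-8, max_iter=50):
    """Eq. (6.24): Newton step scaled by a step length from a crude line search."""
    x = np.asarray(x0, float)
    for _ in range(max_iter):
        g = num_grad(f, x)
        if np.linalg.norm(g) < tol:
            break
        d = -np.linalg.solve(num_hess(f, x), g)
        alpha = 1.0
        for _ in range(20):                 # backtrack until f decreases
            if f(x + alpha * d) < f(x):
                break
            alpha *= 0.5
        x = x + alpha * d
    return x

x_min = modified_newton(lambda x: (x[0] - 1)**4 + (x[1] + 2)**2, [3.0, 3.0])
```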


6.5 Nonlinear Conjugated Gradient Methods

The disadvantage of Cauchy's Method is that the number of line searches needed to obtain the exact minimum point of even a convex objective function may be infinite. Although this does not necessarily stand in the way of finding significant improvements over an initial design after a small number of steps, one is interested in a more efficient sequence for very close approximations of a minimum point. This is achieved by the Method of Conjugated Gradients (CG), which is also used to iteratively solve large systems of linear equations. The system of equations then defines a quadratic functional (the total potential energy) of the unknown solution parameters, the minimum point of which yields the desired solution. The minimum of such a quadratic functional is guaranteed to be found after as many steps as there are unknowns, but may be found earlier depending on the condition of the coefficient matrix. The method for such linear problems is well derived and explained in [38], and it is also considered in the lecture class Structural Analysis with FEM. One important feature of CG is that, instead of the orthogonality property of two consecutive search directions, they fulfill the condition of C-conjugacy, or

s^(0)T C s^(1) = 0,  (6.25)

if the objective function is a quadratic functional, for instance

f = (1/2) x^T C x.  (6.26)

The point of C-conjugacy is illustrated in Fig. 6.4.

Figure 6.4: Orthogonal and Conjugate Search Directions

However, a structural optimization task usually leads to nonlinear objective functions, so that the algorithm that is well developed for quadratic functionals cannot be applied directly. Then the Method of Conjugated Directions developed by Fletcher and Reeves [39] can be used. It generates new search directions that are approximately C-conjugate to the previous one from the information of previous line searches, according to the formula

s_k = −∇f_k + (|∇f_k|^2 / |∇f_{k−1}|^2) s_{k−1}.  (6.27)

The sequence requires memorizing the previous search-direction and gradient vectors. At the first step of the sequence, these quantities are not known, so that the first step is the same as for the Method of Steepest Descent.


In deriving the sequence (6.27), one assumes a quadratic functional (6.26), so that the gradient at a given point x is given by

∇f = Cx. (6.28)

The search starts at a point x^(0), and the initially chosen search direction is that of steepest descent at this point,

s^(0) = −∇f(x^(0)) = −C x^(0)  →  x^(0) = C^(−1) ∇f^(0).  (6.29)

The vector of optimization variables along the search direction depends on a factor λ,

x = x(0) + λs(0). (6.30)

That factor is chosen to minimize the value of f along the search direction. At that minimum point, the derivative of f with respect to λ must vanish. This condition is used along with (6.29) and (6.30) to derive the minimizing value of λ:

f_{,λ} = x^T C x_{,λ} = 0
       = (x^(0) + λ s^(0))^T C s^(0)
       = (C^(−1) ∇f^(0) − λ ∇f^(0))^T C ∇f^(0)
       = ∇f^(0)T C^(−T) C ∇f^(0) − λ ∇f^(0)T C ∇f^(0)
→ λ = (∇f^(0)T ∇f^(0)) / (∇f^(0)T C ∇f^(0)).  (6.31)

We obtain the minimum point along the search direction s^(0) by inserting the just-derived minimizing value of λ in (6.30) and call it x^(1). The gradient at this point,

∇f^(1) = C (x^(0) + λ s^(0)) = ∇f^(0) − [(∇f^(0)T ∇f^(0)) / (∇f^(0)T C ∇f^(0))] C ∇f^(0),  (6.32)

is naturally orthogonal to ∇f^(0). Guided by the pre-existing knowledge that the minimum of a quadratic functional is obtained after a finite number of iteration steps if two consecutive search directions fulfill C-conjugacy, we require that the form

s^(1) = −∇f^(1) + β s^(0)  (6.33)

satisfy (6.25). This yields

s^(0)T C s^(1) = 0
  = −s^(0)T C ∇f^(1) + β s^(0)T C s^(0)
  = ∇f^(0)T C ∇f^(1) + β ∇f^(0)T C ∇f^(0)
→ β = −(∇f^(0)T C ∇f^(1)) / (∇f^(0)T C ∇f^(0)).  (6.34)

The problem with the result (6.34) is that the coefficients C appear explicitly. But these coefficients are generally not known. They can, however, be eliminated by an identity that is derived by analyzing the scalar product of the gradient ∇f^(1) with itself:

∇f^(1)T ∇f^(1) = ∇f^(1)T ∇f^(0) − [(∇f^(0)T ∇f^(0)) / (∇f^(0)T C ∇f^(0))] ∇f^(1)T C ∇f^(0)
  = −[(∇f^(0)T ∇f^(0)) / (∇f^(0)T C ∇f^(0))] ∇f^(1)T C ∇f^(0)
→ (∇f^(1)T C ∇f^(0)) / (∇f^(0)T C ∇f^(0)) = −(∇f^(1)T ∇f^(1)) / (∇f^(0)T ∇f^(0)).  (6.35)

With the identity (6.35) we find for the factor β:

β = (∇f^(1)T ∇f^(1)) / (∇f^(0)T ∇f^(0))  (6.36)

and for the search direction s(1)

s^(1) = −∇f^(1) + [(∇f^(1)T ∇f^(1)) / (∇f^(0)T ∇f^(0))] s^(0).  (6.37)

Generalizing the indices 0 and 1 to k − 1 and k, respectively, gives the formula (6.27) found by Fletcher and Reeves [39]. The derivation is based on a quadratic functional, and the formula will result in a better choice of search directions than Cauchy's Method if the objective function is dominated by the quadratic terms of its Taylor-series representation. The quadratic terms tend to overwhelm the other terms with decreasing distance to the minimum point.
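The generalized sequence can be sketched compactly in Python. This is our own illustration: the closed-form step length below exploits the quadratic-model reasoning of (6.31) by approximating C s with a gradient difference, an assumption that is exact only for quadratic objectives; for general functions a proper line search would replace it.

```python
import numpy as np

def fletcher_reeves(grad, x0, tol=1e-10, max_iter=100):
    """Nonlinear CG with the Fletcher-Reeves update, eq. (6.27)."""
    x = np.asarray(x0, float)
    g = grad(x)
    s = -g                                   # first step: steepest descent
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        # Line search: grad(x+s) - grad(x) approximates C s (exact for
        # quadratics), giving the minimizing step length, cf. eq. (6.31)
        Cs = grad(x + s) - grad(x)
        alpha = -g.dot(s) / s.dot(Cs)
        x = x + alpha * s
        g_new = grad(x)
        beta = g_new.dot(g_new) / g.dot(g)   # |grad f_k|^2 / |grad f_{k-1}|^2
        s = -g_new + beta * s                # eq. (6.27)
        g = g_new
    return x

# Quadratic test problem (6.26): CG reaches the minimum in at most n = 2 steps
C = np.array([[3.0, 1.0],
              [1.0, 2.0]])
x_min = fletcher_reeves(lambda x: C.dot(x), [4.0, -3.0])
```

On the quadratic functional the iteration terminates after at most as many steps as there are variables, in contrast to the endless zigzagging of Cauchy's Method.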


6.6 Powell’s Conjugate Direction Method

According to the textbook [23], the most successful of the direct search methods is the method due to Powell [35], which has been further improved by modifications of Zangwill [40] and Brent [41]. This method is motivated by the observation that a quadratic function of N variables in the form of a sum of perfect squares can be minimized in N steps, one with respect to each of the variables. Also, general quadratics can be transformed into a sum of perfect squares in terms of transformed variables. The quadratic form

Q(x) = xTCx (6.38)

is generally not a sum of perfect squares unless all off-diagonal elements are zero. The process of transforming it into a sum of perfect squares is equivalent to finding a transformation matrix T,

x = Tz, (6.39)

so that the functional becomes a sum of perfect squares in terms of the transformed variables z and the diagonal matrix D,

Q(x) = zTTTCTz = zTDz. (6.40)

We realize that the columns t_j of the transformation matrix T give a new set of coordinates that, because they diagonalize the quadratic C, correspond to its principal axes:

x = Tz = t1z1 + t2z2 + ...+ tNzN . (6.41)

The new coordinates t_j are called conjugate directions, and the remaining problem is how to calculate such a set of conjugate directions.
The basic approach to finding the conjugate directions is based on the parallel subspace property of a quadratic function. This property is explained by using the two-dimensional example illustrated in Fig. 6.5. Consider two arbitrary but distinct points x^(1) and x^(2) and a direction vector d. Let y^(1) be the point corresponding to the minimum of Q(x^(1) + λd) and y^(2) be the solution to Q(x^(2) + λd). Then the direction (y^(2) − y^(1)) is C-conjugate to d. From Fig. 6.5 it is apparent that two line searches determine the points y^(1) and y^(2), establishing the set of C-conjugate directions d and (y^(2) − y^(1)), and that a third line search with reference point y^(1) or y^(2) along the direction (y^(2) − y^(1)) finds the minimum point of the quadratic. All this is achieved without gradient calculation.
The foregoing can be extended to give an elucidating definition of conjugacy that is taken from the textbook [23]. Given an N × N symmetric matrix C, the directions s^(1), s^(2), ..., s^(r), r ≤ N, are called C-conjugate if the directions are linearly independent and

s^(i)T C s^(j) = 0 for all i ≠ j.  (6.42)

Now consider the general quadratic function

q(x) = a + b^T x + (1/2) x^T C x.  (6.43)

The points along the direction d from x(1) depend on the single variable λ,

x = x^(1) + λd.  (6.44)


Figure 6.5: Conjugacy in two dimensions. Source: [23]

The minimum of q along the line defined by (6.44) is obtained by finding λ* such that the derivative of q with respect to λ is zero:

∂q/∂λ = (∂q/∂x)(∂x/∂λ) = (b^T + x^T C) d.  (6.45)

We call the minimum y^(1), so that

[y^(1)T C + b^T] d = 0.  (6.46)

Similarly, by using the same arguments, we have

[y^(2)T C + b^T] d = 0.  (6.47)

Subtracting (6.46) from (6.47) gives us

(y^(2)T − y^(1)T) C d = 0.  (6.48)

Accordingly, the directions (y^(2) − y^(1)) and d are C-conjugate, and the parallel subspace property of quadratic functions has been verified.
It remains to develop the actual minimization algorithm. The parallel subspace property has been explained by using two starting points and a direction. Having to generate a number of starting points is, however, not elegant from a computational point of view. Therefore we construct a way to find C-conjugate directions and the minimum by using only one starting point. To this end we employ the coordinate unit vectors e^(1), e^(2), ..., e^(N). We consider, as before, the two-dimensional case and let e^(1) = [1, 0]^T and e^(2) = [0, 1]^T. We use a starting point x^(0) and calculate λ^(0) so that f(x^(0) + λ^(0) e^(1)) is minimized. This gives us the new point x^(1):

x^(1) = x^(0) + λ^(0) e^(1).  (6.49)

From the new point we start a line search in the direction of e^(2), so that f(x^(1) + λ^(1) e^(2)) is minimized, which obtains a second new point

x^(2) = x^(1) + λ^(1) e^(2).  (6.50)


Figure 6.6: Powell’s Method. Source: [23]

Next, we use the first coordinate unit vector again for a line search to calculate λ^(2), so that f(x^(2) + λ^(2) e^(1)) is minimized, and let

x^(3) = x^(2) + λ^(2) e^(1).  (6.51)

Then the directions (x^(3) − x^(1)) and e^(1) will be conjugate, as Fig. 6.6 illustrates. The construction can be extended from 2 to N dimensions and yields Powell's Conjugate Direction Method:

1. Define the starting point x^(0) and a set of N linearly independent directions, possibly s^(i) = e^(i), i = 1, 2, 3, ..., N.

2. Minimize along the N + 1 directions, using the previous minimum to begin the next search and letting s^(N) be the first and last direction searched.

3. Form the new conjugate direction using the extended parallel subspace property

4. Delete s^(1) and replace it with s^(2), and so on. Place the new conjugate direction in s^(N). Go to step 2.

Figure 6.7 illustrates the path taken by Powell's method through the variables space of a non-quadratic function.
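The four steps above can be sketched as follows. The golden-section line search, the search bracket, and the cycle limits are our own illustrative choices, not part of Powell's original formulation.

```python
import numpy as np

def line_min(f, x, s, a=-10.0, b=10.0, tol=1e-10):
    """Golden-section search for the minimum of f(x + lam*s), lam in [a, b]."""
    phi = (np.sqrt(5.0) - 1.0) / 2.0
    c, d = b - phi * (b - a), a + phi * (b - a)
    while b - a > tol:
        if f(x + c * s) < f(x + d * s):
            b, d = d, c
            c = b - phi * (b - a)
        else:
            a, c = c, d
            d = a + phi * (b - a)
    return x + 0.5 * (a + b) * s

def powell(f, x0, n_cycles=20, tol=1e-12):
    """Sketch of Powell's conjugate direction method (steps 1-4)."""
    x = np.asarray(x0, float)
    n = len(x)
    dirs = list(np.eye(n))                  # step 1: coordinate directions
    for _ in range(n_cycles):
        x_old = x.copy()
        for s in dirs:                      # step 2: minimize along each direction
            x = line_min(f, x, s)
        new_dir = x - x_old                 # step 3: new conjugate direction
        if np.linalg.norm(new_dir) < tol:
            break
        dirs = dirs[1:] + [new_dir]         # step 4: replace s(1), append new
        x = line_min(f, x, new_dir)
    return x
```

Because only function values enter, the sketch makes explicit that no gradients are required anywhere.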


Figure 6.7: Path Taken by Powell’s Method Algorithm Through Non-Quadratic FunctionSpace. Source: [20]


6.7 Response-Surface Method Minimizing Algorithms

A standard nonlinear optimization problem is usually formulated as at the beginning of Chapter 5, which can also be written more compactly:

min_{x ∈ R^n} { f(x) | g_j(x) ≤ 0, h_k(x) = 0 },  j = 1, ..., J,  k = 1, ..., K,  x_{L,i} ≤ x_i ≤ x_{U,i}.  (6.52)

The lower and upper bounds x_{L,i} and x_{U,i} are initially imposed by the designer. To reduce the number of costly function evaluations, a response-surface model, also called a surrogate, is built for the objective and possibly also for the constraining functions. The search for the optimum design can then partly make use of the surrogate functions, and the original optimization problem is replaced by

min_{x ∈ R^n} { f̃(x) | g̃_j(x) ≤ 0, h̃_k(x) = 0 },  j = 1, ..., J,  k = 1, ..., K,  x_{L,i} ≤ x_i ≤ x_{U,i},  (6.53)

where the tilde symbol indicates that the respective surrogates are meant instead of the original functions.
The response-surface methodology was originally developed for constructing empirical response functions based on physical experiments. When the physical experiments are replaced by numerical function evaluations, the methodology can be used to find minimum points of the function. For that purpose, quadratic polynomials are obviously suited: they are the simplest functions with a minimum point and can easily be constructed and evaluated. The methodology is based on the following elements:

1. Construct the response surface model for the given number of variables

2. Apply a scheme for placing the supporting points

3. Find response surface model parameters from the supporting point evaluations

4. Find minimum point of the response surface model, or surrogate objective function

5. Update response surface model in iterations

6. Apply some termination criteria

Response-surface approximations shift the computational burden from the optimization problem to the problem of constructing the approximations, and accommodate the use of detailed analysis techniques without the need for derivative information [42]. Additionally, response-surface approximations filter out numerical noise inherent to most numerical analysis procedures by providing a smooth approximate response function, and they simplify the integration of the analysis and optimization codes.

6.7.1 Constructing a Response Surface Model from Supporting Points

The objective function, actually a functional, is alternatively called a response surface. It is approximated by a response-surface model constructed from quadratic polynomials Π2 in terms of the optimization variables. The following example is written out for two variables:

Π2 = a1 + a2 x1 + a3 x2 + a4 x1^2 + a5 x1 x2 + a6 x2^2.  (6.54)

The polynomial Π2 (6.54) can in general also be written as the inner product of the coefficient and parameter vectors, c and a:

Π2 = c^T a.  (6.55)


The parameters a_i of Π2 can be adjusted to fit the objective function values f at supporting points. The number of supporting points spanning the search space must at least equal the number n_c of parameters, or coefficients, in Π2. That number depends on the number n_x of optimization variables:

n_c = 1 + n_x + (1/2) [n_x (n_x + 1)].  (6.56)

If the number of supporting points equals the number of parameters of the quadratic approximation, the parameters are obtained by matrix inversion:

a = C−1f . (6.57)

It is also possible to use a higher number of supporting points for constructing the quadratic model. Then the model response surface is chosen to give a best fit to the higher number of supporting points, which is achieved by using

a = (C^T C)^(−1) C^T f  (6.58)

instead of (6.57). If there are more supporting points than model parameters, the matrix C is not square. Instead, the number of rows equals the number of supporting points and the number of columns equals the number of model parameters of the quadratic response-surface approximation.
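A small Python sketch of the two-variable fit (6.54)-(6.58) follows; the supporting points and the test polynomial are our own illustrative choices. A least-squares solver is used rather than forming (C^T C)^(−1) explicitly, which is numerically equivalent but better conditioned.

```python
import numpy as np

def quad_basis(x):
    """Row of the matrix C for one supporting point (two variables, eq. (6.54))."""
    x1, x2 = x
    return np.array([1.0, x1, x2, x1**2, x1 * x2, x2**2])

def fit_response_surface(points, fvals):
    """Best-fit parameters a of the quadratic model, eq. (6.58)."""
    C = np.array([quad_basis(p) for p in points])
    a, *_ = np.linalg.lstsq(C, fvals, rcond=None)
    return a

# Ten supporting points (more than the n_c = 6 parameters) and an exact
# quadratic "response" whose coefficients the fit should recover
rng = np.random.default_rng(0)
pts = rng.uniform(-1.0, 1.0, size=(10, 2))
true = lambda x: 1 + 2*x[0] - x[1] + 0.5*x[0]**2 + x[0]*x[1] + 3*x[1]**2
fvals = np.array([true(p) for p in pts])
a = fit_response_surface(pts, fvals)
```

If the response were noisy, the same call would return the smoothing least-squares fit, illustrating the noise-filtering property mentioned above.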

6.7.2 Finding the Minimum Point of Response Surface Model

For estimating the minimum point of the quadratic approximation, it is more convenient to recast (6.54) into the form

Π2 = p+ xTp + xTPx, (6.59)

where the parameters a are contained in the scalar p, the vector p, and the matrix P. As a necessary condition, the gradient must vanish at an extremum,

∇Π2 = p + 2 P x = 0,  (6.60)

yielding the extreme point

x_E = −(1/2) P^(−1) p.  (6.61)

The extreme point is a minimum when for any vector v with non-zero length it holds:

v^T P v > 0,  |v| ≠ 0.  (6.62)

6.7.3 The Relation Between RSM and NM

The Response-Surface Method is a zero-order method, since it does not require function derivatives. The Newton Method, on the other hand, is a second-order method, because it requires first and second derivatives. Therefore, the two methods may seem to be very different from each other. A closer look reveals that the methods have more in common than is often realized, and they can in fact be unified by using the supporting-point placement scheme [43] explained in the following paragraph.


Figure 6.8: Placement of supporting points x(1) through x(10) in 3-dimensional variablesspace

The Unifying Supporting Point Placement Scheme

Fig. 6.8 shows the placement of the supporting points that can be used for a three-dimensional response-surface approximation. The point x^(1) at the center of the point set is the reference for constructing the other supporting points. Assuming the placement of the points on a regular lattice, indicated in Fig. 6.8, with a spacing D along the respective coordinate directions, the coordinate variations of the individual supporting points with respect to the reference point x^(1) are given in the matrix presentation of Table 6.1. The rows correspond to the optimization variables and the columns correspond to the supporting points x^(2) through x^(10).

point    2    3    4    5    6    7    8    9    10
var 1   −D    D    0    0    D    0    0    D    0
var 2    0    0   −D    D    D    0    0    0    D
var 3    0    0    0    0    0   −D    D    D    D

Table 6.1: Supporting points in terms of changes with respect to x^(1)

The coordinates of the points can generally be identified for n_x variables by the scheme developed in the following. The points are identified successively from variable 1 through n_x. For the ith variable, a number of i + 1 supporting points must be added to the set of n_c(i − 1) (6.56) existing points. The first two of the new points span the direction of the ith variable:

x_k^(n_c+1) = x_k^(1) − { D, k = i;  0, k ≠ i }  (6.63)

and

x_k^(n_c+2) = x_k^(1) + { D, k = i;  0, k ≠ i }.  (6.64)

The remaining i − 1 points are variations of the new point x^(n_c+2) (6.64):

x_k^(n_c+2+l) = x_k^(n_c+2) + { D, k = l;  0, k ≠ l },   l = 1, ..., i − 1.  (6.65)


The supporting points span some appropriate region of the search space. Along with the function values at each point, they are used to construct the response-surface approximation.
It is interesting to examine the presented supporting-point arrangement for the NM. The NM uses a set of supporting points in the vicinity of a reference point to calculate the gradient and Hessian matrix at that point. Point x^(1) then becomes the reference point at which the gradient and Hessian matrix must be obtained by the method of differences. The other supporting points must then move very close to the reference point, which is achieved by choosing a very small value for the distance D:

D = ε ≪ 1.  (6.66)

It is interesting to note that the point arrangement allows exact calculation of both first and second derivatives of quadratic polynomials, independent of the value of ε. This is illustrated for the three-dimensional case, where the points 1 through 10 are arranged as shown in Fig. 6.8. Let the gradient and Hessian matrix be calculated from

∇f = (1/(2ε)) [ f3 − f2,  f5 − f4,  f8 − f7 ]^T  (6.67)

and

∇∇f = (1/ε^2) (H^a_ij + H^b_ij + H^c_ij),  (6.68)

where Ha is formed in terms of the reference point,

H^a = − [  2f1  −f1  −f1
          −f1   2f1  −f1
          −f1  −f1   2f1 ],  (6.69)

Hb is formed in terms of the points on the coordinate axes,

H^b = [  (f2 + f3)   −(f3 + f5)   −(f3 + f8)
        −(f3 + f5)    (f4 + f5)   −(f5 + f8)
        −(f3 + f8)   −(f5 + f8)    (f7 + f8) ],  (6.70)

and H^c is formed in terms of the points shifted away from the coordinate axes in mixed directions:

H^c = [  0    f6   f9
         f6   0    f10
         f9   f10  0  ].  (6.71)

Obviously, the gradient of quadratic functions is calculated exactly, because (6.67) is the central-differences method. It can be seen from (6.69), (6.70), and (6.71) that (6.68) produces a symmetric Hessian matrix. Moreover, it can be shown that the evaluations reproduce the symmetric version of the matrix P, multiplied by two:

∇∇f = (P_ij + P_ji).  (6.72)
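The scheme can be verified numerically. The Python sketch below is our own illustration: it places the ten points of Table 6.1 around a reference point, evaluates (6.67)-(6.71), and recovers the gradient and Hessian of a quadratic exactly, even for a large spacing D.

```python
import numpy as np

# Supporting-point offsets from x(1) for N = 3, per Table 6.1 (points 2..10)
OFFSETS = np.array([
    [-1, 0, 0], [1, 0, 0],          # points 2, 3:  -/+ D along x1
    [0, -1, 0], [0, 1, 0],          # points 4, 5:  -/+ D along x2
    [1, 1, 0],                      # point  6:     mixed x1-x2
    [0, 0, -1], [0, 0, 1],          # points 7, 8:  -/+ D along x3
    [1, 0, 1], [0, 1, 1],           # points 9, 10: mixed x1-x3, x2-x3
], dtype=float)

def grad_hess(f, x1, D):
    """Gradient (6.67) and Hessian (6.68)-(6.71) from the 10-point set."""
    f1 = f(x1)
    f2, f3, f4, f5, f6, f7, f8, f9, f10 = (f(x1 + D * o) for o in OFFSETS)
    g = np.array([f3 - f2, f5 - f4, f8 - f7]) / (2 * D)
    # Entrywise sum of H^a, H^b, H^c, divided by D^2 (upper triangle written out)
    H = np.array([
        [f2 + f3 - 2*f1, f1 - f3 - f5 + f6,  f1 - f3 - f8 + f9],
        [0.0,            f4 + f5 - 2*f1,     f1 - f5 - f8 + f10],
        [0.0,            0.0,                f7 + f8 - 2*f1],
    ]) / D**2
    return g, H + np.triu(H, 1).T   # mirror the upper triangle

# Quadratic test function f = b.x + 0.5 x'Ax: results are exact for any D
A = np.array([[4.0, 1.0, 0.5], [1.0, 3.0, 0.2], [0.5, 0.2, 2.0]])
b = np.array([1.0, -2.0, 0.5])
fq = lambda x: b.dot(x) + 0.5 * x.dot(A).dot(x)
g, H = grad_hess(fq, np.array([0.3, -0.7, 1.1]), D=0.5)
```

The diagonal Hessian entries are standard second differences, while the off-diagonal entries are mixed forward differences built from the single extra point per variable pair, which is why only n_c points suffice.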

The presented point arrangement and evaluation scheme calculate, for quadratic functions f : R^n → R and regardless of the value of the parameter ε, not only the Hessian but also the gradient exactly.
The RSM evaluation scheme obtaining the approximation parameters p, p, and P, explained in Subsections 6.7.1 and 6.7.2, can therefore be replaced by p = f(x^(1)), p = ∇f, and P = (1/2)∇∇f, where ∇f and ∇∇f are obtained by (6.67) and (6.68), respectively, for arbitrary values of D. For quadratic functions, both methods obtain the same approximation parameters. Therefore, the evaluation scheme of the Newton method becomes identical with that of the RSM at the limit of D approaching very small values, D → ε.
The entries of the gradient vector, of which (6.67) is the three-variable example, are calculated by the general scheme

∇f_k = (f_{n_c(k−1)+2} − f_{n_c(k−1)+1}) / (2ε).  (6.73)

The entries of the matrices Ha, Hb, and Hc follow from

H^a_ij = − { 2f1, j = i;  −f1, j ≠ i },  (6.74)

H^b_ij = { (f_{n_c(i−1)+1} + f_{n_c(i−1)+2}), j = i;  −(f_{n_c(i−1)+2} + f_{n_c(j−1)+2}), j ≠ i },  (6.75)

and

Hcij =

0 , j = i

fnc(j)−i+1 , j 6= i. (6.76)

Rigid-Lattice Minimization Strategy

If the point defined by (6.61) has a smaller function value than any of the supporting points, it is used as the reference point for the next set of supporting points. The distance value D of the next supporting-point set depends on the distance between the current and next reference points:

D = |xk+1 − xk|. (6.77)

This is illustrated in Fig. 6.9. A current point set in 2-dimensional variables space (solid circles) defines a response-surface approximation. Its minimum point x_E is the encircled square. It becomes the reference x^(1) for the new set of supporting points that is marked with squares. The minimum-point estimate of this set is indicated by the encircled triangle, defining the next point set. As can be seen from Fig. 6.9, the distance values D of new point sets are always chosen so that the region covered by the new set extends to the previous reference point. Thus, if the successful minimum-point estimate is outside of the region of the current set, the new set will span a larger region. If the minimum point is found inside of the region of the current point set, the new region will automatically contract. Expanding the point-set region makes it more likely to find the minimum point of the objective function inside that region when future minimum-point estimates fail, and contracting the region helps accelerate convergence once it seems likely that the objective minimum point is being closed in on. If, on the other hand, the point defined by (6.61) does not yield a smaller function value than any of the supporting points, it is discarded and the strategy considers two cases.
In the one case, the current reference point has a smaller function value than any other point of the set, thus remaining the best estimate of the objective minimum point. Then the reference point is kept, and convergence is facilitated by shrinking the set of supporting

© ETH Zurich IDMF-CMAS, January 22, 2015


Search for Design Improvement: Mathematical Programming

Figure 6.9: Updated supporting point sets around successful minimum point estimates

Figure 6.10: Shrinking of supporting point region around the reference point by a factor of 1/√2

points about the reference point. The shrinking factor can be chosen as 1/√2, for instance. In the second case, a point of the set other than the reference point appears as the best estimate of the objective minimum point. The whole point set is then shifted by moving the reference point to the point with the smallest function value. Equivalently, one can say that the point with the smallest function value becomes the new reference point and that the distance value defining the extension of the region covered by the point set is kept constant. Keeping the distance value constant is motivated by the opportunity to reduce the number of function evaluations of the new point set, as some points of the new set will then have the same coordinates as other points of the previous set whose function values are already known. Fig. 6.11 illustrates the shifts to the points 2 through 6, respectively. The figure shows that in each case three points of the shifted sets coincide with previous points so that three function evaluations can be saved, making the optimization process more time-efficient. In general, however, when moving the supporting point set so that the reference point of the shifted set coincides with one of the other points in the original position, the total number of coinciding points depends on the point to which the reference point is shifted. Table 6.2 shows the number of coinciding points when the reference point of the new supporting point set coincides with the other points of the previous set. In the 1-dimensional variables space there are two other points to which the reference point can be shifted and in each case there are two coinciding points.
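The distance update just described can be sketched as follows; the function and argument names are illustrative and not taken from any original implementation:

```python
import math

def next_distance(x_ref, x_new, d_current, success, ref_still_best):
    # Distance update of the rigid-lattice strategy (a sketch):
    # - success: the new estimate becomes the reference and the region spans
    #   back to the previous reference point, D = |x_{k+1} - x_k|, cf. (6.77)
    # - failure, reference still best: shrink the region by 1/sqrt(2)
    # - failure, another point is best: shift the set and keep D, so that
    #   points of the shifted set coincide with already evaluated ones
    if success:
        return math.dist(x_new, x_ref)
    if ref_still_best:
        return d_current / math.sqrt(2.0)
    return d_current
```

The three branches mirror the three cases of the text: expansion toward the old reference, contraction about a kept reference, and a pure shift that preserves the lattice spacing.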


6.7 Response-Surface Method Minimizing Algorithms

Figure 6.11: Supporting point set shifted to center around the respective points withsmallest function value

shifted-to point    2  3  4  5  6  7  8  9  10  11  12  13  14  15
1-D                 2  2
2-D                 3  3  3  3  3
3-D                 4  4  4  4  3  4  4  3  3
4-D                 5  5  5  5  3  5  5  3  3   5   5   3   3   3

Table 6.2: Number of coinciding points for the different reference point positions of the shifted sets corresponding to 1-, 2-, 3-, and 4-dimensional variable spaces.

The situation in the 2-dimensional variables space is illustrated in Fig. 6.11. In the 3-dimensional variables space with ten supporting points, the number of coincidences depends on which point of the previous point set the reference point of the new point set is located on. The number of coincidences is either 3 or 4. In the 4-dimensional space the number of coincidences is either 3 or 5. For a set of supporting points corresponding to nx variables, the average number nc of coinciding points can be shown to be

nc = nx/(np − 1) · [2(nx + 1) + (3/2)(nx − 1)]. (6.78)

The minimum, maximum, and average numbers of coinciding points, depending on the number of variables, are shown in Fig. 6.12. The maximum number increases linearly and the minimum number remains constant at three for variables spaces higher than one-dimensional. The average number shown in the figure is calculated with (6.78).

Figure 6.12: Maximum, average, and minimum number of coinciding points versus number of variables

The points of each shifted set coinciding with previously evaluated points need not be evaluated again. Fig. 6.13 shows the point set size and its reduction due to coinciding points depending on the number of variables. The potential computing time savings are rather large when the

Figure 6.13: Point set size and reduction due to coinciding points versus number of variables

number of variables is small but tend to become insignificant at large numbers of variables, as Fig. 6.14 illustrates. The figure plots the ratio of the point set size to the point set size reduced by the average number of coinciding points,

e = np/(np − nc), (6.79)

which is a measure of the potential efficiency gain. As the figure shows, a program may run up to three times faster when searching for the minimum point of only one variable, or twice as fast when there are two variables, but only insignificantly faster when the number of variables is large.
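Under the assumption that the supporting point set is that of a full quadratic polynomial, np = (nx + 1)(nx + 2)/2 (consistent with the set sizes 3, 6, 10, and 15 quoted above), relations (6.78) and (6.79) can be sketched as:

```python
def point_set_size(nx):
    # Supporting points of a full quadratic polynomial in nx variables
    # (assumption: np = (nx + 1)(nx + 2)/2, matching the sizes 3, 6, 10, 15)
    return (nx + 1) * (nx + 2) // 2

def avg_coinciding(nx):
    # Average number of coinciding points, cf. (6.78)
    np_ = point_set_size(nx)
    return nx / (np_ - 1) * (2 * (nx + 1) + 1.5 * (nx - 1))

def efficiency_gain(nx):
    # Potential speed-up e = np / (np - nc), cf. (6.79)
    np_ = point_set_size(nx)
    return np_ / (np_ - avg_coinciding(nx))
```

For one variable the gain is a factor of three, for two variables a factor of two, in agreement with the text.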


Figure 6.14: Relative computing time savings due to coinciding points versus number of variables

6.7.4 Adaptive Response Surface Method

The Adaptive Response Surface Method (ARSM) systematically reduces the design space. Ideally, the reduced design space contains only points with function values smaller than a certain threshold value. One choice takes this threshold value from the second highest function value of the current supporting point set. The reduction of the design space can be mentally compared with the shrinking water surface of an evaporating puddle. For such a minimization strategy, a more flexible support placement scheme than the one presented above is useful; the explanations in the following subsection are taken from Wang [44].

Latin Hypercube Sampling Points Scheme

Latin Hypercube Sampling, or Latin Hypercube Design (LHD), was first introduced by McKay et al. [45]. In practice, Latin Hypercube Design samples can be obtained as follows. The range of each design input variable is divided into n intervals, and one observation on the input variable is made within each interval using random sampling. Thus, there are n observations on each of the d input variables. One of the observations on x1 is randomly selected (each observation is equally likely to be selected), matched with a randomly selected observation on x2, and so on through xd. These collectively constitute a design alternative x(1). One of the remaining observations on x1 is then matched at random with one of the remaining observations on x2, and so on, to get x(2). A similar procedure is followed for x(3), . . . , x(n), which exhausts all observations and results in n LHD sample points [15]. The sampling method is indicated in Fig. 6.15. LHD sampling is well suited for

Figure 6.15: Latin Hypercube Sampling in a Two-Dimensional Design Space

ARSM because the points are easy to construct and they fill the complete reducing design


space.

The quadratic response surface model, or surrogate, is fitted to the data using the least-squares method. When constraints are considered, a global optimization algorithm is used to find the optimum. Following this step, the value of the actual objective function at the optimum of the surrogate is calculated through an evaluation of the computation-intensive objective function. If the value of the actual objective function at the surrogate optimum is better than the values at all other experimental designs, the point is added to the set of experimental designs for the following iteration because the point represents a promising search direction.
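A minimal sketch of the LHD construction described above; the helper name and interface are illustrative:

```python
import random

def latin_hypercube(n, d, bounds, rng=random):
    # One random observation inside each of the n intervals of every
    # variable; the observations are then matched at random across the
    # d variables, giving n sample points (cf. McKay et al. [45]).
    columns = []
    for k in range(d):
        lo, hi = bounds[k]
        width = (hi - lo) / n
        obs = [lo + (i + rng.random()) * width for i in range(n)]
        rng.shuffle(obs)  # random matching across variables
        columns.append(obs)
    return [tuple(col[i] for col in columns) for i in range(n)]
```

Each variable ends up with exactly one observation per interval, which is the defining property of a Latin hypercube.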

Search Space Reduction

All experimental designs and the accepted model optimum are recorded in a design library. Then a threshold value of the objective function is chosen. The part of the design space that leads to objective function values larger than the threshold is then discarded. In ARSM, the second highest value of the objective function in the set of experimental designs is chosen as the threshold. If this second highest value cannot help to reduce the design space, the next highest value of the objective function will be used, and so on. The optimization process terminates if a satisfactory design emerges in the design library or the difference between the lower limit and the upper limit of each design variable is smaller than a given small number.
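The threshold-based reduction can be sketched as follows; the helper is illustrative and not taken from the original ARSM code:

```python
def reduce_design_space(points, values):
    # Threshold = second highest objective value in the design library;
    # keep designs strictly below it and shrink the box around them.
    threshold = sorted(values)[-2]
    kept = [p for p, v in zip(points, values) if v < threshold]
    lower = tuple(min(c) for c in zip(*kept))
    upper = tuple(max(c) for c in zip(*kept))
    return lower, upper
```

The returned box is the new, smaller design space in which the next set of supporting points is placed.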

Using Inherited Points

The new reduced design space is again subdivided into square intervals (hypercubes). Then, some of the sampling points of the previous, larger space will fall within the new space, providing an incomplete sampling. These points are inherited and complemented with new points that are placed by using the LHD scheme; the details of the method suggested by Wang can be found in his publication [44].


6.8 Line-Search Methods

The iteration rule

xk+1 = xk + αksk (6.80)

was first stated in the context of the method of steepest descent but, in its general form, stands also for all the other methods where the minimum is sought from a reference point along a useful search direction s. It has been mentioned before that the distance between the reference point x0 and the minimum point along the search direction s must be calculated for each iterate. Finding that distance is left to the so-called line-search methods. Since we are dealing with nonlinear optimization, the line-search methods are iterative. The minimization process is thus a nested loop where the outer loop determines a sequence of search directions while the inner loop finds the minimum point along each search direction.

6.8.1 One-Dimensional Search in Multidimensional Variables Space

The iteration rule (6.80) finds an improved point xk+1 in multi-dimensional variables space x by going from the reference point xk a certain distance αk along the useful search direction sk. The known search direction reduces the problem of obtaining the new point xk+1 to the problem of determining the scalar αk. When the search direction is normalized,

s̄ = s/|s|, (6.81)

the step length α becomes a direct measure of distance. A change of position Δx is related to its absolute value Δx by the normalized search direction s̄:

Δx = Δx s̄. (6.82)

At the minimum point, the slope f′, that is the derivative of the objective function f with respect to the distance variable x along the search direction s, must vanish. The slope can be calculated as the scalar product of the gradient vector and the normalized search direction:

f′ = s̄ᵀ∇f. (6.83)
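In code, the normalization (6.81) and the slope (6.83) amount to:

```python
import math

def directional_slope(grad, s):
    # f' = s̄ᵀ∇f with s̄ = s/|s|, cf. (6.81) and (6.83)
    norm = math.sqrt(sum(si * si for si in s))
    return sum(g * si for g, si in zip(grad, s)) / norm
```

A negative slope confirms the descent property that every line search must check before starting.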

6.8.2 Interval Halving and Golden Section Methods

The scalar-valued reference point x = x0 corresponds with the point in variables space xk. The initial value of the increment Δx, yielding x1 = x0 + Δx, must be chosen. If the function value f(x1) is less than the value f(x0), the new point x1 is a success and the search continues by adding another increment to obtain the point x2 = x0 + 2Δx. The variable x increases incrementally until a newly obtained point xm+1 has a higher function value than the previous point xm. Then the minimum point xmin is assumed to be bounded by xm−1 < xmin < xm+1. The orientation of the search is reversed and the increment is halved, Δx ← −Δx/2, so that the new point lies in the middle between xm and xm+1 as Fig. 6.16 indicates. In the figure, the open circles indicate points whose function values have been obtained by a previous line search. The necessary computational effort, up to a certain level of accuracy, thus follows from summing up the number of solid circles. From then on the increment is modified after no more than two steps, respectively, until the process is terminated by criteria such as


Figure 6.16: Sketch of the interval-halving algorithm

• The relative change between two consecutive function values is smaller than a preset value

• The increment is smaller than a preset value
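A simplified sketch of the interval-halving line search (here the increment is reversed and halved on every failed step, and the second termination criterion above is used):

```python
def interval_halving(f, x0, dx, tol=1e-8, max_iter=500):
    # March with increment dx while the function decreases; on failure,
    # reverse the orientation and halve the increment, cf. Fig. 6.16.
    x, fx = x0, f(x0)
    for _ in range(max_iter):
        x_try = x + dx
        f_try = f(x_try)
        if f_try < fx:
            x, fx = x_try, f_try   # success: keep stepping
        else:
            dx = -0.5 * dx         # failure: reverse and halve
        if abs(dx) < tol:          # increment smaller than preset value
            break
    return x, fx
```

For a unimodal function the iterates contract onto the bracketed minimum as the increment shrinks.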

The golden-section method uses an optimized division of the intervals to obtain the desired containment of the minimum point with the least possible number of iteration steps. For deriving the method, consider a function with one single minimum within the normalized interval between xL = 0 and xU = 1, see Fig. 6.17. Next, two more points x1 and x2 are chosen inside of the initial interval. The example illustrated in Fig. 6.17 indicates a higher function value at x1 so that this point yields an improved lower bound. The coordinates of the new points are chosen such that the interval is reduced by the same fraction at each iterate. Since

Figure 6.17: Sketch of the golden-section algorithm


generally it is not known which of the points x1 and x2 will become the new bound, the new points must be chosen symmetrically about the midpoint of the interval:

xU − x2 = x1 − xL. (6.84)

The ratio between the old and the new intervals remains constant at each iterate if the points abide by

(x1 − xL)/(xU − xL) = (x2 − x1)/(xU − x1). (6.85)

From Fig. 6.17 we have that

x2 = 1− x1 (6.86)

and with this we obtain from (6.85) that

x1 = (1 − 2x1)/(1 − x1)  →  x1² − 3x1 + 1 = 0  →  x1 = 0.38197. (6.87)

Thus, we obtain for the ratio between x2 and x1

x2/x1 = 1.61803. (6.88)

This ratio has some significance in fine arts, architecture, and philosophy: proportions in the golden-section ratio are perceived as esthetic or harmonious. In the more general case, when xL ≠ 0 and xU ≠ 1, new bounds are obtained by the formulae

x1 = xL + τ(xU − xL),  x2 = xU − τ(xU − xL), (6.89)

where τ = 0.38197.

6.8.3 Quadratic and Cubic Approximation Methods

The approximation methods approximate the objective function along the search direction by a simple polynomial whose minimum point is calculated analytically. Whether the approximation methods perform better or worse than the interval-halving or golden-section methods depends on how well the objective can be approximated. As a rule, the objective can be approximated by quadratic or cubic polynomials the better, the closer one gets to its minimum point.

The quadratic approximation methods model the original objective function by

f = a0 + a1x+ a2x2. (6.90)

One version uses three supporting points (x1, f1), (x2, f2), and (x3, f3) to determine the coefficients ai of (6.90) fitting through these points. An illustrative example of this is given in Fig. 6.18. The coefficients a0, a1, and a2 follow from the system of equations

| 1  x1  x1² | | a0 |   | f1 |
| 1  x2  x2² | | a1 | = | f2 |
| 1  x3  x3² | | a2 |   | f3 |.  (6.91)


Figure 6.18: Approximation of the objective f = x⁴ + (1/3)x³ + (1/2)x² − 0.4x + 1 with a quadratic polynomial and three supporting points at x1 = 0, x2 = 0.5, and x3 = 1.0.

The system (6.91) can be resolved with Cramer's Rule:

a0 = (1/D) | f1 x1 x1²; f2 x2 x2²; f3 x3 x3² |,
a1 = (1/D) | 1 f1 x1²; 1 f2 x2²; 1 f3 x3² |,
a2 = (1/D) | 1 x1 f1; 1 x2 f2; 1 x3 f3 |,

D = | 1 x1 x1²; 1 x2 x2²; 1 x3 x3² |.  (6.92)

If one places value upon a transparent symbolic solution, the form [23]

f = b0 + b1(x− x1) + b2(x− x2)(x− x3)

f ′ = b1 + b2(2x− x2 − x3), (6.93)

after substituting the supporting points for the variable x, leads to the less strongly coupled system of equations

f1 = f(x1) = b0 + b2(x1 − x2)(x1 − x3)
f2 = f(x2) = b0 + b1(x2 − x1)
f3 = f(x3) = b0 + b1(x3 − x1). (6.94)

It can be resolved more concisely and yields the result

b1 = (f3 − f2)/(x3 − x2),  b0 = f2 − b1(x2 − x1),  b2 = [(f2 − f1)/(x2 − x1) − b1]/(x1 − x3). (6.95)

The coefficients ai of the polynomial (6.90) are related with the bi by

a0 = b0 − b1x1 + b2x2x3

a1 = b1 − b2(x2 + x3)

a2 = b2

. (6.96)
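The three-point fit (6.95), the conversion (6.96), and the vertex of the resulting parabola can be sketched as:

```python
def quadratic_minimum(x1, x2, x3, f1, f2, f3):
    # Coefficients after (6.95) ...
    b1 = (f3 - f2) / (x3 - x2)
    b0 = f2 - b1 * (x2 - x1)
    b2 = ((f2 - f1) / (x2 - x1) - b1) / (x1 - x3)
    # ... converted to the polynomial coefficients after (6.96)
    a0 = b0 - b1 * x1 + b2 * x2 * x3
    a1 = b1 - b2 * (x2 + x3)
    a2 = b2
    # extreme point of the parabola: f' = a1 + 2 a2 x = 0
    return (a0, a1, a2), -a1 / (2.0 * a2)
```

Fitting an exactly quadratic objective reproduces its coefficients and its minimum in a single step; in the iteration, the returned point replaces one of the three supporting points.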


The extremum xe is given by (6.13). The iteration requires replacement of one of the three supporting points by the newly found point xe. The function value fe at the new point follows from evaluating the original objective function. For instance, one can discard the old supporting point with the highest function value. The algorithm is relatively easy to implement and works without additional gradient information.

The cubic approximation method uses the model

f = a0 + a1x + a2x² + a3x³
f′ = a1 + 2a2x + 3a3x². (6.97)

It needs only two supporting points but at each of these the function value f as well as the slope f′ must be calculated. An illustrative example is given in Fig. 6.19. The coefficients

Figure 6.19: Approximation of the objective f = x⁴ + (1/3)x³ + (1/2)x² − 0.4x + 1 with a cubic polynomial and two supporting points at x1 = 0 and x2 = 1.

follow from the system of equations

| 1  x1  x1²   x1³  | | a0 |   | f1  |
| 0  1   2x1   3x1² | | a1 | = | f′1 |
| 1  x2  x2²   x2³  | | a2 |   | f2  |
| 0  1   2x2   3x2² | | a3 |   | f′2 |.  (6.98)

Again, a suitable form allows a more concise resolution by hand:

f = b0 + b1(x − x1) + b2(x − x1)(x − x2) + b3(x − x1)²(x − x2)
f′ = b1 + b2[(x − x1) + (x − x2)] + b3[(x − x1)² + 2(x − x1)(x − x2)]. (6.99)

By using the abbreviations Δx = (x2 − x1), Δf = (f2 − f1), and Δf′ = (f′2 − f′1), we write the system of equations whose solution determines the coefficients bi:

f1 = f(x1) = b0
f2 = f(x2) = b0 + b1Δx
f′1 = f′(x1) = b1 − b2Δx
f′2 = f′(x2) = b1 + b2Δx + b3Δx². (6.100)


The result is

b0 = f1,  b1 = Δf/Δx,  b2 = (Δf − Δx f′1)/Δx²,  b3 = (Δx f′1 + Δx f′2 − 2Δf)/Δx³. (6.101)

The coefficients ai of the polynomial (6.97) are related to the bi by

a0 = b0 − b1x1 + b2x1x2 − b3x1²x2
a1 = b1 − b2(x1 + x2) + b3(2x1x2 + x1²)
a2 = b2 − b3(2x1 + x2)
a3 = b3. (6.102)

The extreme points then follow from setting the derivative (6.97) equal to zero:

xe² + 2p·xe + q = 0,  p = a2/(3a3),  q = a1/(3a3). (6.103)

If the original objective is a quadratic, the approximation can only be quadratic as well so that the coefficient a3 vanishes. Then xe follows from (6.13). Otherwise the solution of (6.103) is given by

xe = [−a2 ± √(a2² − 3a1a3)] / (3a3). (6.104)

The sign of the root must be chosen correctly. The cubic approximation offers at most two extreme points, one of which is the desired minimum while the other is a maximum. At the minimum point the second derivative is positive:

f″(xe) = 2a2 + 6a3xe ≥ 0. (6.105)

Combining (6.105) with (6.104) yields after some simplifications the result

±√(a2² − 3a1a3) ≥ 0 (6.106)

and reveals that by choosing the positive sign of the root in (6.104) we always obtain the minimum point xe = xmin. Extreme values exist only if the argument of the root is positive. An unfortunate choice of the supporting points may yield an approximating polynomial without extreme points. This can be avoided by checking that the condition

x ≤ xe → f′ ≤ 0  ∧  x ≥ xe → f′ ≥ 0 (6.107)

is fulfilled or that the extreme point is bounded between the two supporting points. Often, the numerical effort, or the programming difficulty, of obtaining derivative information lets the cubic approximation method based on four supporting points appear more attractive. Here, the parameters ai can easily be obtained analytically by using the form

f = b0 + b1(x− x1) + b2(x− x2)(x− x3) + b3(x− x2)(x− x3)(x− x4). (6.108)

Inserting the four supporting points yields the system of equations:

f1 = f(x1) = b0 + b2Δx12Δx13 + b3Δx12Δx13Δx14
f2 = f(x2) = b0 + b1Δx21
f3 = f(x3) = b0 + b1Δx31
f4 = f(x4) = b0 + b1Δx41 + b2Δx42Δx43, (6.109)


where Δxij = xi − xj. The equations (6.109) are resolved to give the parameters bi:

b1 = Δf32/Δx32,
b0 = f3 − b1Δx31,
b2 = (f4 − b0 − b1Δx41)/(Δx42Δx43),
b3 = (f1 − b0 − b2Δx12Δx13)/(Δx12Δx13Δx14). (6.110)

Ordering the terms of (6.108) after powers of the variable x yields the form

f = x⁰ [b0 − x1b1 + (x2x3)b2 − (x2x3x4)b3]
  + x¹ [b1 − (x2 + x3)b2 + (x2x3 + x3x4 + x4x2)b3]
  + x² [b2 − (x2 + x3 + x4)b3]
  + x³ [b3]. (6.111)

From (6.111) the parameters ai, in terms of bi, can be immediately obtained.
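The four-point variant (6.110)/(6.111) can be sketched as follows, with the differences Δxij = xi − xj written out explicitly:

```python
def cubic_coefficients_four_points(xs, fs):
    x1, x2, x3, x4 = xs
    f1, f2, f3, f4 = fs
    # parameters b_i after (6.110), with dx_ij = x_i - x_j
    b1 = (f3 - f2) / (x3 - x2)
    b0 = f3 - b1 * (x3 - x1)
    b2 = (f4 - b0 - b1 * (x4 - x1)) / ((x4 - x2) * (x4 - x3))
    b3 = (f1 - b0 - b2 * (x1 - x2) * (x1 - x3)) / (
        (x1 - x2) * (x1 - x3) * (x1 - x4))
    # polynomial coefficients a_i by ordering (6.108) after powers
    # of x, cf. (6.111)
    a0 = b0 - x1 * b1 + (x2 * x3) * b2 - (x2 * x3 * x4) * b3
    a1 = b1 - (x2 + x3) * b2 + (x2 * x3 + x3 * x4 + x4 * x2) * b3
    a2 = b2 - (x2 + x3 + x4) * b3
    a3 = b3
    return a0, a1, a2, a3
```

No derivatives are needed, which is the attraction of this variant; fitting an exactly cubic objective reproduces its coefficients.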

6.8.4 Brent’s Routine

The success, or efficiency, of the various line search methods depends very much on the objective function topology. For instance, if a gradient calculation is costly compared to a function evaluation, the line search methods depending on gradient information can be less efficient in terms of computing time although the number of iterations for one line search may be significantly smaller. Principally, we desire the routine to work reliably on a wide range of function topologies and to consume as little computing time as possible. In structural optimization we generally do not know much in advance about the topology of a given problem. Therefore, we do not wish the performance to depend on topology-dependent tuning parameters.

The combination of several methods in one routine often gives the best results in terms of reliability as well as efficiency for a range of function topologies. In the following, a collection of methods and algorithms is presented that is combined to give a reliable and efficient minimization routine. All of this follows closely the ideas underlying Brent's routine [41] available in the Numerical Recipes. The method presented here shares much of the basic approach with Brent's routine available from the Numerical Recipes library but has been coded with no direct reference to that routine. Both routines perform similarly.

Basic Ideas and Algorithms

The routine starts with bracketing the minimum point along the search direction. The reliability of the further approximation iterations is based on the knowledge that the minimum point is between those brackets. The further steps try to reduce the bracketed range. At the beginning, the reduction is based on the slower but more robust golden-section bracketing method, and at a certain point the more efficient but less robust method of three-point quadratic approximation is used instead. Upon calling of the search routine the following information is relevant input:

• Search direction vector s

• Maximum step length xmax


• Starting point in optimization variables space x0

• Objective function value at the starting point f

• The integer argument value of the minimization loop

• Small epsilon value ε

The step-length value is initially set equal to ε. The starting point in variables space refers to a zero distance along the search direction. Points at other distances are calculated accordingly. The initially given point, transferred in the array OPVAR, must be stored in another array that is called OPSTORE, because OPVAR is updated every time a new point at the distance x from the starting point is evaluated. After its initiation and during one line search OPSTORE is not changed but rather used as reference.

The routine first checks the descent property. It ensures that the line search makes any sense at all because one may not hope to find a minimum in directions of increasing function values. When the inner product of the gradient and search direction vectors is non-positive the descent property is given. Otherwise, the routine returns.

Next, the minimum point is bracketed, i.e. an upper and a lower bound of the minimum point are established. The first bracket spans the distance from the starting point x1 = 0 to the point x4 where an increase of the initially decreasing objective function value is observed. The value of the point x4 is obtained by multiplying the value STEP with increasing powers of ten until the newly calculated objective function value FUN exceeds the previously calculated value fmin by at least ten percent. However, x4 may not be greater than the maximum step length xmax, which prevents the line search from leading into the infeasible region.

An initial set of four supporting points is created by the points x1 and x4 that mark the range of the bracket, and two additional points x2 and x3 that lie in the interior of the bracket and divide the range of the bracket according to the golden-section ratio, see Fig. 6.20. The variables xL and xU carry the information on the upper and lower limits of the

Figure 6.20: Golden section method supporting points on a range bracketing the minimum point

range. The variable iQ is used to switch from the golden-section method to the quadratic-approximation method depending on a criterion explained below.
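The bracketing step can be sketched as follows; the ten-percent criterion is interpreted here relative to the magnitude of the best value found so far, and all names are illustrative rather than taken from the original routine:

```python
def bracket_step(fun_along, f_start, step, x_max):
    # Multiply the trial distance by powers of ten until the objective
    # exceeds the best value found so far by at least ten percent, but
    # never step beyond x_max (the feasibility limit).
    f_min = f_start
    x4 = step
    while True:
        x4 = min(x4, x_max)
        f_new = fun_along(x4)
        if f_new > f_min + 0.1 * abs(f_min) or x4 >= x_max:
            return x4
        f_min = min(f_min, f_new)
        x4 *= 10.0
```

The returned distance x4 is the upper bracket; together with x1 = 0 and two golden-section interior points it forms the initial four-point set.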

Loop on Refining the Brackets on the Minimum Point

First, from the range defined by the four initially found points a new and smaller range containing the minimum point is selected. The smaller range is bracketed either by the points x1 and x3 or by the points x2 and x4. Referring to the situation sketched in Fig. 6.20, the smaller range is defined by x2 and x4 because, of the four points, x4 is the one


with the smallest objective function value. A set of three points is now defined that consequently contains x2, x3, and x4 of the four-point set. The lower limit xL of the narrowed range is set equal to x1 of the three-point set and the upper limit is set to x3 of the three-point set as shown in Fig. 6.21. In the situation depicted in Fig. 6.21 the

Figure 6.21: Narrowing the bracket by reduction of four supporting points to three

minimum point, in terms of the original four supporting points, is point x4. Using the quadratic approximation method for estimating a new minimum point implies a risk that the new minimum point is estimated outside of the current bracketed range as indicated in Fig. 6.22. This is not wanted because it has been established before that the minimum must be inside the bracketed range. Therefore, the integer switch iQ is set equal to zero so that the golden section method instead of the quadratic approximation method is selected. The golden section method uses the reduced range defined by the set of three

Figure 6.22: Quadratic approximation incorrectly indicating a minimum outside of the bracketed range

supporting points to replace the previous set of four points by a new set of four points. Based on the situation depicted so far, the result is shown in Fig. 6.23. Characteristic for the golden section method is that the new four points now divide the new and narrower range with the same ratios as the previous set divided the previous range. This is achieved by using the selected three points of the previous set and adding just one new point. The new point is point x3 of the new set in Figure 6.23. The objective function is evaluated

Figure 6.23: Updated four-point set


at the new point and it is determined which point of the set has the minimum function value found so far. As Fig. 6.23 indicates, point x4 remains the minimum point for the considered example. The minimum point and function value found by the golden section method are called xGS and fGS. One could continue iterations on the golden section method by restarting the narrowing of the bracketed range as explained above. However, the routine CBSEARCH performs a quadratic approximation as well because the case of the estimated minimum point lying outside of the bracketed range is only a possibility but, generally, not a certainty.

The quadratic approximation method with three supporting points is explained in Section 6.8.3. Basically, a quadratic polynomial is fitted to the three supporting points of the current three-point set. As a condition for an extreme point, the spatial derivative of the polynomial is set equal to zero. The resulting equation is resolved for x. The second spatial derivative must be positive if x is a minimum point. If it is not positive or if the minimum point x is calculated outside the bracketed range, the result is disregarded and the new point is placed in the middle between the two brackets. In the program, x is named XQA and the objective function is evaluated and the value is called FQA. Even if the quadratic approximation as such failed, it is possible that the newly found point in the middle of the range might be a valid minimum point. Therefore, the results of the golden section and the quadratic approximation codes are compared with each other.

The step of comparing the golden-section and the quadratic-approximation methods is skipped when the golden section method has not been used. A criterion is introduced that characterizes the shape of the objective function within the bracketed range. If the function is extremely steep, numerical round-off errors may impair the quality of the quadratic-approximation results. If the function is not terribly steep and the function value found by the quadratic approximation method is smaller than that found by the golden section method, the result of the quadratic approximation is chosen for the new minimum point and the switch iQ is set equal to 1 so that the golden section method is not used anymore in order to save computing time. Referring to the situation depicted in Figs. 6.20 through 6.23 one can expect one or two more iterations on the golden section method before the program switches to the then more effective quadratic approximation method.

After it has been decided which one of the points XGS and XQA determines the new minimum-point step length XNEW, the new point x0 in variables space is calculated. If FNEW is smaller than the previously found minimum value FMIN, the minimum point is updated by replacing XMIN with XNEW and FMIN with FNEW. The newly found step-length value XNEW must now replace one of the points of the existing four-point set according to its value with respect to the other points.

The newly found step-length value XNEW and the current three-point set produce the new four-point set. In case the new point is found by quadratic approximation the golden-section ratio is generally lost. Fig. 6.24 sketches that a newly found point is located between points two and three of the current three-point set so that it becomes point three of the new four-point set. Its position deviates from that corresponding with the golden-section ratio, which does not really matter because once the quadratic search starts being preferred over the golden section method, the latter is not used anymore during the remainder of the current line search. The point with the smallest function value has the step length XMIN and the function value FMIN. The point with the second smallest function value has the step length XSEC and the function value FSEC. These points are needed for the termination or convergence criteria, respectively. Referring to Fig. 6.24, the third point would be XMIN and the fourth point would be XSEC.

There are two criteria that must both be fulfilled simultaneously in order to terminate the


Figure 6.24: Minimum point successfully estimated by quadratic approximation

line search.

1. Close to the true minimum point, the actual function value FUN must be very close to the function value FNEWA as estimated by the quadratic approximation curve. The variable DF carries a relative error that is calculated as the absolute value of the difference divided by the sum of the two values.

2. Close to the true minimum point, the relative change of the step length with respect to the absolute value is small. The criterion is set up so that smaller values result as the minimum point moves closer to the center of the bracketed range.

A third criterion terminates the search even if the two described above do not. A perfectly solved DoD problem corresponds to a zero objective function value. It is therefore safe to terminate the search when the value of the objective function itself is smaller than the chosen epsilon value. If the criteria do not terminate the search, the iterations continue with extracting a new three-point set from the four-point set shown in Fig. 6.24.


6.9 Lagrange Multiplier Method Numerical Optimization

The Lagrange multiplier method requires knowledge of the Lagrange multipliers, whose values are often not accessible through analytic formulas. The problem can be mitigated by expanding the set of optimization variables with the set of Lagrange multipliers, so that numerical optimization methods search within both sets simultaneously for the best numerical approximation of constrained minima. At constrained minima the Lagrangian formulation seen in Sections 3.5.1 and 5.5 is stationary; the stationary points are minima with respect to the optimization variables and maxima with respect to the Lagrange multipliers. Within the unified space of optimization variables and Lagrange multipliers the Lagrangian form therefore has saddle points, which cannot be found with minimum-search methods. Recasting the Lagrangian into an alternative formulation transforms all saddle points into minima, by the method explained in the following Section 6.9.1.

6.9.1 Modified Lagrangian with Local Minima

A modified form of the Lagrangian (3.26) on page 21 is built from its gradients with respect to the optimization variables x and the Lagrange multipliers λ:

∇xL = ∇xf + λg∇xg + λh∇xh ;  ∇λL = (g, h)T (6.112)

If no other means are available, the partial derivatives can be approximated with finite differences,

(∇xL)i ≈ [L(x + ε ei, λ) − L(x, λ)] / ε ;  (∇λL)i ≈ [L(x, λ + ε ei) − L(x, λ)] / ε, (6.113)

where ei denotes the ith unit vector.

The squared norms of both gradients are non-negative and vanish at critical points. Therefore, the objective L,

L = ∇xLT·∇xL + ∇λLT·∇λL, (6.114)

possesses minima at the original Lagrangian saddle points, and these can be found with any minimum-searching optimization routine simultaneously for both x and λ. The sample problem addressed in Section 3.5.1 and revisited in Section 5.5 gives the modified Lagrangian:

L = ((x + 2)/10 − λ)² + (1 − x)² (6.115)

Fig. 6.25 shows the contour lines of the modified Lagrangian, which is a quadratic polynomial, and that the method of conjugate gradients by Fletcher and Reeves finds the constrained minimum exactly within two line searches.

Figure 6.25: Contour lines of the modified Lagrangian (6.115)
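As a quick numerical check of this idea, the modified Lagrangian (6.115) can be minimized simultaneously in x and λ. Plain gradient descent replaces the conjugate-gradient method here purely for brevity; the step size and iteration count are assumed values:

```python
# Modified Lagrangian (6.115): L(x, lam) = ((x + 2)/10 - lam)^2 + (1 - x)^2
def L_bar(x, lam):
    return ((x + 2.0) / 10.0 - lam) ** 2 + (1.0 - x) ** 2

x, lam, lr = 0.0, 0.0, 0.1
for _ in range(2000):
    a = (x + 2.0) / 10.0 - lam        # common factor of both gradient entries
    gx = 0.2 * a - 2.0 * (1.0 - x)    # dL/dx
    gl = -2.0 * a                     # dL/dlam
    x, lam = x - lr * gx, lam - lr * gl

print(round(x, 4), round(lam, 4))     # → 1.0 0.3
```

The descent recovers the constrained minimum x = 1 together with the multiplier value λ = 0.3, at which both gradients of the original Lagrangian vanish.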


6.9.2 Algorithm for Removing Constraint Violations

The following algorithm removes several constraint violations in one step if the constraining functions are linear. The derivations are illustrated for a two-dimensional problem with two constraining functions by Fig. 6.26. The figure also illustrates that each constraining equation reduces the search-space dimensionality by one, so that the feasible region is reduced to one feasible point if the number of constraints equals the number of problem variables.

Figure 6.26: Two-dimensional variable space with two linear constraining functions

Let xv be a violating point in an infeasible region of the search space. It is then desired to find a new point x0 where all the violated constraints remain active with zero values. The vector pointing from xv to x0 is called c and is a linear combination of the gradients ∇gi of the constraining functions at the point xv:

c = µi∇gi. (6.116)

It is assumed that xv is close enough to the feasible region so that the constraining functions gi can be linearly approximated along c:

gi = βi g′i = βi (∇giT∇gi)/|∇gi| = βi |∇gi|. (6.117)

In the case of side constraints, the assumption made above is fully justified because the constraining functions are linear anyway and (6.117) holds exactly. The distances of the functions (or hyperplanes) gi = 0 to the point xv are

βi = gi / |∇gi|. (6.118)

The constraining-function gradients ∇gi will generally not be mutually orthogonal. Therefore one must not confuse the linear-combination weight factors µi with the respective distances βi. The following method for constructing vectors di that have components in the direction of the gradient ∇gi only, i.e. that are completely decoupled from all the other gradients ∇gj,

diT∇gj = 0,  i ≠ j, (6.119)


requires that no two of the gradients ∇gi are equal to each other. The vectors di are then constructed from the gradients ∇gj of all remaining violated constraining functions:

di = ∇gi − γij∇gj (6.120)

After substituting (6.120) into the condition (6.119) and reordering, one obtains the factors γij:

γij = 0 for i = j ;  γij = (∇giT∇gj) / (∇gjT∇gj) for i ≠ j (6.121)

Of course the correction vector c (6.116) can as well be written in terms of the vectors di:

c = νidi. (6.122)

In contrast to the gradients ∇gi, the vectors di are decoupled from all the other gradients, and so the weight factors νi can be calculated from the respective single constraining functions. The component of the vector νidi in the direction of the gradient ∇gi must be equal to the distance βi:

νi diT∇gi / |∇gi| = βi. (6.123)

Solving (6.123) for νi and substituting (6.118) for βi yields

νi = gi / (diT∇gi). (6.124)

In a program, the first step is to identify the violated constraints and to obtain the respective gradients. The factors γij are calculated after (6.121) and the set of vectors di is computed after (6.120). Next, the factors νi are calculated after (6.124), and the correction vector c follows from (6.122). In cases where the assumption (6.117), namely that the constraining functions are linear along c, is only an approximation, the new point must be checked for remaining constraint violations. If any of the constraining functions have values greater than zero, the procedure must be repeated until all constraints are satisfied. An illustration of the concept presented here is given in Fig. 6.26. If duplicate gradient vectors appear, they must first be removed from the set: from a set of identical gradients ∇gi, the one associated with the constraining function with the greatest value should be kept.
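A minimal sketch of the procedure in code, assuming linear constraining functions; the function name and array inputs (gradients and values of the violated constraints at xv) are hypothetical, and the correction step is taken such that the linearized constraint values vanish:

```python
import numpy as np

def remove_violations(x_v, grads, g_vals):
    """One correction step in the spirit of Eqs. (6.118)-(6.124): from the
    violating point x_v, step onto the hyperplanes g_i = 0 of the violated
    constraints. grads[i] is the gradient of g_i at x_v, g_vals[i] its value."""
    n = len(g_vals)
    # Decoupled vectors d_i = grad_i - sum_j gamma_ij grad_j, Eqs. (6.120)/(6.121)
    d = []
    for i in range(n):
        d_i = grads[i].astype(float).copy()
        for j in range(n):
            if i != j:
                gamma = grads[i] @ grads[j] / (grads[j] @ grads[j])
                d_i -= gamma * grads[j]
        d.append(d_i)
    # Weight factors nu_i after (6.124) and correction vector c after (6.122);
    # the step is applied so that the linearized constraint values become zero.
    c = sum((g_vals[i] / (d[i] @ grads[i])) * d[i] for i in range(n))
    return x_v - c

# Two linear constraints in 2D: g1 = x - 1 and g2 = x + y - 3, both violated.
x0 = remove_violations(np.array([2.0, 3.0]),
                       [np.array([1.0, 0.0]), np.array([1.0, 1.0])],
                       [1.0, 2.0])
print(x0)   # → [1. 2.], where g1 = g2 = 0
```

For two constraints the decoupling (6.120)-(6.121) is exact, so the corrected point lands exactly on the intersection of the two hyperplanes in a single step.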


6.10 Objective Function Derivatives

The most effective optimization algorithms, at least in terms of small numbers of iterations, can be constructed when gradient information is available for calculating the search directions as well as for performing the line searches. In terms of computing time, efficiency depends highly on the numerical effort for calculating the gradients. If the gradient calculation is very expensive, recourse to gradient-free minimization methods will yield a better time efficiency.

How expensive is the gradient calculation when the objective-function evaluation involves solving a FEM model? First of all, that depends on the computational effort for solving the system of equations, which is mainly determined by the number of unknowns (degrees of freedom) as well as the structure of the stiffness matrix. When it comes to gradient calculation, the numerical effort increases linearly with the number of independent optimization variables.

6.10.1 Differences Method

The differences method calculates the gradient at a reference point (x0, f0) by performing a loop i = 1, N over all N optimization variables. Each variable xi is increased by an increment ∆xi and the objective is then evaluated. The difference between the objective value of the modified design and f0, divided by the increment, approximates the sensitivity of the objective with respect to the respective optimization variable and, thus, the respective entry of the gradient vector:

∇fi = [f(x0 + ∆xi) − f(x0)] / ∆xi. (6.125)

The differences method implies that the structural model must be evaluated N times for each gradient calculation and is therefore also referred to as a brute-force method. The expense for one gradient calculation equals roughly N function evaluations. For problems with a large number of optimization variables, the expense for one gradient calculation may become much greater than that of one function evaluation, destroying the efficiency advantage of gradient-based minimization methods.
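The brute-force scheme (6.125) is only a few lines of code; the objective function and increment below are assumed for illustration:

```python
def difference_gradient(f, x0, dx=1e-6):
    """Forward-difference gradient after Eq. (6.125): N extra evaluations."""
    f0 = f(x0)
    grad = []
    for i in range(len(x0)):
        x = list(x0)
        x[i] += dx          # perturb one optimization variable at a time
        grad.append((f(x) - f0) / dx)
    return grad

# Usage on a simple quadratic objective (assumed example):
f = lambda x: x[0] ** 2 + 3.0 * x[1]
g = difference_gradient(f, [1.0, 0.0])
print(g)   # ≈ [2.0, 3.0]
```

Each entry costs one full model evaluation, which is exactly the N-fold expense discussed above.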

6.10.2 Sensitivity-Formula Gradient Calculation

Gradient calculation with the sensitivity formula for finite-element models requires no additional function evaluations. The finite-element method for linear static elasticity problems assembles a system of linear equations

Ku = r (6.126)

that must be solved for the unknown nodal-point displacements, or degrees of freedom, u. The stiffness matrix K is a numerical representation of the behavior of the structure and is assembled from the stiffness matrices of the individual finite elements into which the structure domain is divided. The right-hand-side vector r contains the nodal forces derived from the loads acting upon the structure. Once (6.126) is solved and the primary solution u is obtained, any other quantity is calculated as a function of the primary solution, and possibly of model parameters, in the postprocessing step. Having gradient calculation in mind, we assume that the objective function depends explicitly and implicitly, through the primary solution, on the optimization variables x: f = f(x, u(x)). Therefore, the


sensitivity of the objective with respect to a change of the ith optimization variable is obtained by the chain rule:

∇fi = ∂f/∂xi + (∂f/∂u)(∂u/∂xi) (6.127)

The postprocessing is most often numerically insignificant compared with the effort of solving (6.126). Therefore, the quantities ∂f/∂xi and ∂f/∂u are obtained at low numerical cost. The cost of calculating the sensitivity of the primary solution u with respect to the optimization variables x is much reduced by the formula derived in the following.

The first step in deriving the sensitivity formula is to form the partial derivative of (6.126) with respect to the ith optimization variable xi,

(∂K/∂xi) u + K (∂u/∂xi) = ∂r/∂xi. (6.128)

Solving (6.128) for the sensitivity of the primary solution with respect to xi yields the desired sensitivity formula

∂u/∂xi = K⁻¹ (∂r/∂xi − (∂K/∂xi) u). (6.129)

At first sight one could conclude that nothing has been gained since, after all, the inversion of the stiffness matrix is at least numerically equivalent to solving (6.126). In fact, however, (6.129) allows the gradient to be calculated very efficiently, especially for large models. Let us realize that the inverse at the reference point, at least in the form of a triangularized matrix, exists at the point (x0, f0) because the objective has just been evaluated there. It remains to calculate the terms within the parentheses. The sensitivity of the nodal forces with respect to the optimization variables, which are design parameters in structural optimization, is easily calculated at practically no cost. Often, if the design is subject to fixed and concentrated loads, the nodal forces are independent of the design parameters, so that the first term in the parentheses becomes zero. The second term in the parentheses is the product of the sensitivity of the stiffness matrix with the existing primary-solution vector. Often, the change of a design parameter affects only part of the structural model, or only a small subset of all the finite elements making up the whole model. It is therefore not necessary to assemble the complete stiffness matrix. Instead, one may consider only the elements whose shape or other properties are affected by a change of xi. Then the numerical effort may be much less than for a complete stiffness-matrix assembly, which is again much less expensive than a solution of the system equations. Fig. 6.27 illustrates the point. The example refers

Figure 6.27: Shape optimization and sensitivity formula for gradient calculation

to shape optimization, where the positions of boundary nodes are variable in the direction perpendicular to the boundary. The variation of the position of one node affects the shape


of the two adjacent finite elements, while all other elements remain unchanged. The stiffness matrices of the two elements with the reference shapes are subtracted from those with the modified shapes. The difference is multiplied with the existing displacement vector, where only the displacements at the nodes of the affected elements need to be considered.

Although the sensitivity formula allows gradient calculation at much lower cost than the differences method, it may be difficult to implement in existing general-purpose FEM programs.
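The sensitivity formula can be illustrated on a deliberately tiny model. The two-spring chain below is an assumed example (not taken from the text), with the spring stiffness k2 playing the role of the optimization variable; the result of (6.129) is cross-checked against the differences method:

```python
import numpy as np

# Two-DOF spring chain: springs k1, k2 in series, fixed at the left end,
# nodal loads r. (An assumed example, not from the text.)
def stiffness(k1, k2):
    return np.array([[k1 + k2, -k2],
                     [-k2,      k2]])

k1, k2, r = 100.0, 50.0, np.array([0.0, 1.0])
K = stiffness(k1, k2)
u = np.linalg.solve(K, r)                 # primary solution of Ku = r

# Sensitivity formula (6.129) for the variable k2; dr/dk2 = 0 here, and
# dK/dk2 contains only the entries touched by k2.
dK = np.array([[1.0, -1.0],
               [-1.0, 1.0]])
du = np.linalg.solve(K, -dK @ u)          # K^-1 (dr/dk2 - dK/dk2 u)

# Cross-check with the brute-force differences method:
h = 1e-6
du_fd = (np.linalg.solve(stiffness(k1, k2 + h), r) - u) / h
print(np.allclose(du, du_fd, rtol=1e-4, atol=1e-8))   # → True
```

Note that only the factorized K from the already completed solution is reused; no additional full solve of a perturbed model is required, which is exactly the saving described above.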

6.11 Global Optimization

Mathematical programming may obtain a local minimum and terminate there because the necessary optimality conditions are satisfied. Thus, the local minima of a non-convex objective function may stand in the way of obtaining its absolute, or global, minimum. This is a disadvantage and practical weakness of mathematical programming, which may otherwise be so much more efficient than genetic algorithms, for instance. Attempts to mitigate this weakness of mathematical programming lead to methods such as

• Multi-start method

• Tunneling method

6.11.1 Multi-Start Method

The multi-start method assigns reasonable ranges of values to the individual optimization variables. The ranges should be chosen inside the feasible region. From these ranges a number of values, perhaps in equidistant spacing, are selected. The values define a hypermesh of points in the N-dimensional space of the optimization variables. One after another,

Figure 6.28: Function with minima in considered region and starting-point mesh.

these points are used as starting points for a minimum search using mathematical programming. If the search from each starting point leads to the same minimum point, one may be dealing with a convex function. Operating upon a non-convex function, the multi-start method will generally obtain several minimum points (if there are several in the feasible region). The one with the lowest function value may be taken for the global minimum, although there is no proof that the absolute minimum has not escaped the search. The probability of finding the global minimum increases with increasing density of the starting-point mesh. However, a dense mesh implies an immense number of starting points, or optimization processes, when the number of dimensions is high. Covering the


range of each of the N variables with M points yields M^N starting points altogether. It goes without saying that the method is forbidding when it comes to large optimization problems.
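A small sketch of the multi-start idea for a single variable; the non-convex test function, the fixed-step gradient descent standing in for the local "mathematical programming" stage, and all step-size values are assumptions:

```python
import numpy as np

def local_descent(f, df, x0, lr=0.01, tol=1e-8, max_iter=10000):
    # Crude fixed-step gradient descent as the local minimization stage.
    x = float(x0)
    for _ in range(max_iter):
        step = lr * df(x)
        x -= step
        if abs(step) < tol:
            break
    return x

def multi_start(f, df, lo, hi, m):
    # Equidistant starting-point mesh with m points; keep the best local minimum.
    starts = np.linspace(lo, hi, m)
    minima = [local_descent(f, df, s) for s in starts]
    return min(minima, key=f)

# Non-convex sample: f(x) = x^4 - 4x^2 + x has two local minima;
# the global one lies near x ≈ -1.473.
f  = lambda x: x ** 4 - 4.0 * x ** 2 + x
df = lambda x: 4.0 * x ** 3 - 8.0 * x + 1.0
best = multi_start(f, df, -3.0, 3.0, 7)
print(round(best, 3))   # → -1.473
```

With one variable and seven starting points the cost is trivial; with N variables the same mesh would already require 7^N local searches, which is the curse of dimensionality noted above.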

6.11.2 Tunneling Method

Assume a non-convex function and a local minimum point that has just been obtained by mathematical programming. Starting from this local minimum point, the tunneling method invented by Levy and Gomez [46] systematically finds a new minimum point with a lower function value. The iterative process ends with finding the global minimum point. The method has the property that the other local minimum points, having higher function values than the most recently found minimum point, are not considered, or tunneled through, hence the name of the method. Each iteration of the tunneling method consists of two phases:

1. Minimization phase: starting from the point x1, a new local minimum point x∗1 is obtained by mathematical programming

2. Tunneling phase: starting from the local minimum point x∗1, a different point x2, having the same function value as x∗1, i.e. fulfilling f(x2) = f(x∗1), is searched for

After the tunneling phase, the process returns to the minimization phase. For a given minimum point x∗i, a tunneling function t,

t(x, x∗i, si) = [f(x) − f(x∗i)] / [(x − x∗i)T(x − x∗i)]^si, (6.130)

is defined. The parameter si must be chosen so that the denominator tends to zero more strongly than the numerator as x → x∗i, so that the tunneling-function value remains different from zero:

t(x → x∗i, x∗i, si) ≠ 0. (6.131)

On the other hand, the value si should also be such that the tunneling function decreases with increasing distance from the latest minimum point x∗i:

t(x∗i + ∆x, x∗i, si) ≤ t(x∗i, x∗i, si). (6.132)

If the conditions (6.131) and (6.132) are satisfied, any root of the tunneling function t is a point where the objective function f has the same value as at the reference minimum point. This new point is a suitable starting point for a new minimum search and guarantees that the next obtained minimum point has a lower function value than the previous one. The tunneling function corresponding to the global minimum has no root.

The tunneling method is demonstrated on a polynomial of degree six that depends on only one variable. The objective function shown in Fig. 6.29(a),

f(x) = x^6 − 16x^5 + 100x^4 − 310x^3 + 499x^2 − 394x + 140, (6.133)

has a minimum point at x∗1 = 1 with the value f(x∗1) = 20. For the purpose of demonstrating an unusable tunneling function, we choose the value si = 0.5 for the root parameter. Then the tunneling function shown in Fig. 6.29(b) is obtained by subtracting 20 from the polynomial (6.133) and dividing the result by (x − 1). Obviously, that tunneling function does not satisfy the condition (6.131) because it has a root at x∗1, t(x∗1) = 0. However, choosing si = 1 obtains the tunneling function shown in Fig. 6.29(c) by subtracting 20



Figure 6.29: Function with objective (a) and unusable (b) and usable (c) tunneling func-tions

from the polynomial (6.133) and dividing the result by (x − 1)². This tunneling function is usable because its value at x∗1 is different from zero and it decreases monotonically from there to the next root at x2 = 2. The root is quickly approximated by employing the Newton-Raphson method. From there, mathematical programming will obtain the other local minimum point x∗2 ≈ 2.517.
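The tunneling step can be verified numerically. The Newton iteration below uses a numerical derivative and an assumed starting point to the right of the pole at x∗1 = 1:

```python
# Sample objective (6.133) and its tunneling function with s = 1.
def f(x):
    return x**6 - 16*x**5 + 100*x**4 - 310*x**3 + 499*x**2 - 394*x + 140

def t(x, x_star=1.0, f_star=20.0):
    # Eq. (6.130) in one dimension: singular at x_star, roots where f(x) = f_star.
    return (f(x) - f_star) / (x - x_star) ** 2

x, h = 1.5, 1e-7
for _ in range(100):
    dt = (t(x + h) - t(x - h)) / (2.0 * h)   # numerical derivative of t
    x -= t(x) / dt                           # Newton step toward the root
print(round(x, 6))   # → 2.0, where f(x2) = f(x1*) = 20 again
```

The iteration lands on x2 = 2, the point with the same function value 20 from which the next minimization phase can proceed downhill to x∗2 ≈ 2.517.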


Chapter 7

Stochastic Search

A general optimization problem can be characterized by the structure of its search space, its objective space, and its objective function. The optimization variables are parameterized within the search space (e.g. Rn in a real-valued optimization problem). Candidate solutions are rated in the objective space (e.g. R for single-objective problems or Rm for multi-objective problems). The methods from the field of Mathematical Programming are well suited for continuous, homogeneous search spaces, continuous, scalar objective spaces, and smooth, convex objective functions. These qualities cannot always be expected to be present in an engineering problem at hand, indicating that deterministic algorithms are inappropriate. Therefore, a lot of research has focused on the development and application of stochastic search methods to overcome this limitation. This chapter gives an overview of the field of stochastic optimization. Some of the concepts presented earlier in this lecture are directly transferable to stochastic search. However, it should be pointed out that the separation of the parameterization concept (or, in general, the representation) and the search method no longer holds for all search methods presented in the following. Some of the methods evolved closely affiliated representation concepts dedicated to a specific problem type.

7.1 Introduction to stochastic search and optimization

According to [47] a stochastic search method is characterized by the presence of one orboth of the following properties:

• There is random noise in the measurements of the criterion to be optimized and/orrelated information.

• There is a random choice made in the search direction as the algorithm iteratestoward a solution.

These properties contrast with one of the basic hypotheses of Mathematical Programming, where it is assumed that one has perfect knowledge of the objective and, depending on the method, of its derivatives, and that this information is used to determine a search direction through a deterministic rule. In many applications, this information is not available. This situation is usually referred to as black-box optimization.

A large group of stochastic search methods is inspired by biology or physics, e.g. Simulated Annealing, Evolutionary Algorithms, Ant Colony Optimization and Swarm Optimization. Others imitate learning mechanisms, e.g. Tabu Search or Neural Networks.

c© ETH Zurich IDMF-CMAS, January 22, 2015

Page 126: ETH Struct Opt

120 Stochastic Search

The methods presented in the following sections rely on two model assumptions concerning the objective function:

1. Pointwise sampling of the search space allows one to obtain a kind of problem landscape, at least locally.

2. Better solutions can be found close to already visited good solutions.

These assumptions are less restrictive than the typical polynomial models employed in Mathematical Programming. However, there are problem types which are excluded by this model. A typical optimization problem that does not fit the above concept is:

minimize F(x) = { 0 if x = 0 ; 1 otherwise } (7.1)

7.1.1 Neighborhood concept

From the second model assumption stated above arises the requirement to define a measure of closeness. This is usually done by defining a problem-specific neighborhood function. The neighborhood function neighborhood(x) returns a subset Xn of the search space X which is close to the solution x. In a real-valued search space X = Rn the Euclidean distance is a common choice for defining a neighborhood function, i.e.

neighborhood(x, r) = {x′ ∈ X : |x′ − x| ≤ r} ⊆ X (7.2)

In a binary search space the Hamming distance H(x, x′) may serve as a distance measure. It denotes the number of bits which need to be flipped to change x into x′.
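Both neighborhood notions can be sketched briefly; the sampling routine and the radius value are illustrative assumptions:

```python
import random

def euclidean_neighbor(x, r):
    """Sample a random point x' with |x' - x| <= r (real-valued search space)."""
    while True:  # rejection sampling inside the ball of radius r
        step = [random.uniform(-r, r) for _ in x]
        if sum(s * s for s in step) <= r * r:
            return [xi + si for xi, si in zip(x, step)]

def hamming(x, y):
    """Hamming distance: number of bits to flip to change bit string x into y."""
    return sum(a != b for a, b in zip(x, y))

print(hamming("10110", "10011"))   # → 2
```

The Euclidean variant returns a random member of the neighborhood set (7.2) rather than the whole set, which is what the stochastic algorithms below actually need.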

7.1.2 Strategies in stochastic search

Stochastic global search involves two opposing search strategies, named exploration and exploitation. Both characterize the way new solutions are created within the iterations of the algorithm. Explorative search addresses unvisited regions of the search space, while exploitation probes the regions around the best solutions known so far. Exploitation drives an algorithm toward an optimum (which could be a local or the global one); thus it is usually referred to as the convergent component. Exploration should avoid a premature (i.e. local) convergence and is therefore also called the divergent component. In typical implementations of stochastic search both strategies occur, and the balance between them is either determined by a process parameter to be set by the user or adapted by the algorithm according to some rules. It should be pointed out that the chance of finding a globally optimal solution depends crucially on the weighting of exploration versus exploitation. An optimal setup is not known a priori but depends on the objective function at hand.

7.1.3 A prototype of a stochastic search algorithm

The search algorithms presented in the following sections all fit into this general scheme:

1: Create a set of start solutions xi ∈ X
2: Calculate F(xi)
3: Initialize the memory M ← (xi, F(xi))
4: Initialize the iteration counter t ← 0


5: while continue(M, t) = 1 do
6:   t ← t + 1
7:   Create a set of alternative solutions x′i ∈ X
8:   Calculate F(x′i)
9:   Update the memory M ← update(M, (x′i, F(x′i)))

10: end while
11: Return the best solution x∗ in M

The steps 1, 7 and 9 may contain a random component. The function continue(M, t) stands for some kind of termination criterion. It returns 1 as long as the current state of the search (i.e. the iteration counter t and the solutions in the memory M) does not indicate termination.

7.1.4 Performance of stochastic search

The performance of optimization algorithms is usually measured by the number of objective-function evaluations required to find an optimal solution. Since the way a stochastic algorithm iterates toward a solution is influenced by random decisions, this number may change between different runs of the same algorithm on one problem. Therefore, an empirical performance assessment has to consider multiple runs of an optimization algorithm.

No Free Lunch Theorems for Optimization

D.H. Wolpert and W.G. Macready [48] present a purely theoretical investigation of the performance of optimization methods. All optimization problems are considered, and the average performance of an arbitrary pair of optimization algorithms over all optimization problems is investigated. The so-called No Free Lunch Theorems show that the average performance of these two algorithms has to be the same. The No Free Lunch Theorems are based on the assumptions that all optimization problems are equally likely and that each algorithm knows when an optimal solution is found. These assumptions are not met by typical optimization applications. However, the No Free Lunch Theorems point out that a comparison reporting the performance of a particular algorithm on a few sample problems can only indicate the behavior on this range of problems. A generalization to other problems may not be possible. Moreover, the optimization algorithm should be chosen to match the problem type at hand as far as possible. A general optimization algorithm capable of solving a wide range of problems may lose performance when compared with a problem-specific algorithm.

7.2 Stochastic Search Algorithms

7.2.1 Random Search

Random search is a global stochastic search algorithm:

1: Create a random start solution x ∈ X
2: Calculate F(x)
3: Initialize the memory M ← (x, F(x))
4: Initialize the iteration counter t ← 0
5: while continue(M, t) = 1 do
6:   t ← t + 1
7:   Create one alternative solution x ∈ X
8:   Calculate F(x)


9:   Add the new solution to the memory M ← M ⊎ (x, F(x))
10: end while
11: Return the best solution x∗ in M

Random search applies a pure explorative search by just randomly sampling the search space. The creation of new solutions is not based on solutions stored in the memory.

7.2.2 Stochastic Descent

Stochastic descent is a local search algorithm:

1: Create a random start solution x ∈ X
2: Calculate F(x)
3: Initialize the iteration counter t ← 0
4: while continue(x, t) = 1 do
5:   t ← t + 1
6:   Create a random solution x′ ∈ neighborhood(x)
7:   Calculate F(x′)
8:   if F(x′) ≺ F(x) then
9:     x ← x′

10:   end if
11: end while
12: Return the solution x

It transfers the concept of gradient-based descent methods to stochastic methods, which makes it applicable to discrete problems. Stochastic descent is based on a pure exploitation strategy.

7.2.3 Metropolis Algorithm

The Metropolis Algorithm is a generalization of the stochastic descent algorithm. The method accepts uphill steps with a certain probability. To this end, the objective is interpreted as an energy. The energy difference between two states, ∆E = F(x′) − F(x), determines the probability of a state transition from x to x′. The probability is given by

p(∆E) = e^(−∆E/T) (7.3)

where T denotes a constant referred to as the temperature of the algorithm.

1: Create a random start solution x ∈ X
2: Calculate F(x)
3: Initialize the iteration counter t ← 0
4: while continue(x, t) = 1 do
5:   t ← t + 1
6:   Create a random solution x′ ∈ neighborhood(x)
7:   Calculate ∆E = F(x′) − F(x)
8:   Choose a random number k in [0, 1]
9:   if ∆E < 0 then

10:     x ← x′
11:   else if k ≤ p(∆E) then
12:     x ← x′
13:   end if
14: end while
15: Return the best solution x∗ encountered


Figure 7.1: Metropolis Algorithm: Probability p for the acceptance of a new solution in dependence of the energy difference ∆E and the temperature T.

The probability for uphill steps increases with increasing temperature, thus exploration isemphasized. For low temperatures the algorithm becomes similar to the stochastic descentalgorithm.

7.2.4 Simulated Annealing

In many applications an exploration strategy seems to be useful at the very beginning of the optimization run, while exploitative search is used later in the regions where good solutions have been found. This requires an adaptation of the temperature T during the run of a Metropolis algorithm. Simulated annealing is inspired by the problem of finding an equilibrium state for freezing n-body systems. There, the decreasing temperature reduces the probability of state transitions with increasing time. The basic scheme is the same as for the Metropolis algorithm, but with an additional update step for the temperature within the iteration:

1: Create a random start solution x ∈ X
2: Calculate F(x)
3: Initialize the iteration counter t ← 0
4: while continue(x, t) = 1 do
5:   t ← t + 1
6:   T ← update(T, t)
7:   Create a random solution x′ ∈ neighborhood(x)
8:   Calculate ∆E = F(x′) − F(x)
9:   Choose a random number k in [0, 1]

10:   if ∆E < 0 then
11:     x ← x′
12:   else if k ≤ p(∆E) then
13:     x ← x′
14:   end if
15: end while
16: Return the best solution x∗ encountered


Several alternatives have been introduced for the temperature update. A common choice is the linear rule:

Tt+1 = update(Tt, t) = αTt (7.4)

The constant α (0 < α < 1) is called the annealing constant. Simulated annealing is a global search algorithm: the probability Pt that simulated annealing finds a globally optimal solution within t iterations converges to one,

lim t→∞ Pt = 1 (7.5)

This is a special property of the simulated-annealing algorithm within the field of stochastic search. However, it is usually not relevant in applications, since t has to take huge values in order to achieve a high probability Pt.
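A minimal simulated-annealing sketch for a single variable follows. The test function, the neighborhood radius, the start temperature, the annealing constant α and the iteration count are all assumed values, not taken from the text:

```python
import math, random

def simulated_annealing(F, x0, T0=1.0, alpha=0.99, r=0.5, iters=5000):
    random.seed(1)                 # fixed seed only for reproducibility
    x, best = x0, x0
    T = T0
    for _ in range(iters):
        T *= alpha                              # temperature update, Eq. (7.4)
        x_new = x + random.uniform(-r, r)       # random neighbor solution
        dE = F(x_new) - F(x)                    # energy difference
        # Metropolis acceptance rule with p(dE) = exp(-dE/T), Eq. (7.3)
        if dE < 0 or random.random() <= math.exp(-dE / T):
            x = x_new
        if F(x) < F(best):
            best = x
    return best

F = lambda x: x ** 4 - 4.0 * x ** 2 + x   # non-convex: two minima, global near -1.47
best = simulated_annealing(F, 2.0)         # lands in one of the two basins
```

Tracking `best` separately means the returned solution is the lowest-energy point visited, regardless of where the random walk ends; early high temperatures allow uphill moves (exploration), late low temperatures reduce the walk to stochastic descent (exploitation).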

7.2.5 Evolutionary Algorithms

Müller [1] concisely summarizes the field of evolutionary computation [49]. Fig. 7.2 visualizes that the field divides into the three branches evolutionary programming [50], genetic algorithms [51, 52], and evolution strategies. Each branch contributes to the class of evolutionary algorithms.

Figure 7.2: The three branches of evolutionary computation

The well-written characterization of the evolutionary computation algorithms, mirrored against mathematical programming [1], is quoted here:

Evolutionary computation algorithms have in common that they are based on biological principles such as reproduction, mutation, isolation, recombination, and selection, applied to individuals in a population. The members of a population are capable of evolving over time, that is, of adapting to their environment. In each generation, a surplus of individuals is created by means of mutation and/or recombination, out of which the most promising individuals are selected as members of the next generation. In contrast to many deterministic algorithms, evolutionary algorithms require only the value of the objective function for a given point in the parameter space, but do not need gradient information. Like other stochastic algorithms, evolutionary algorithms employ randomness which makes them less sensitive to noise, discontinuities, and the danger of getting trapped in a local optimum. While evolutionary algorithms do not possess highly efficient convergence properties like for example the conjugate gradient method, they have the important feature of being inherently parallel.

Evolutionary algorithms are inspired by and based upon evolution in nature. They typically use an analogy to natural evolution to perform search by evolving solutions to problems, usually working with large collections, or populations, of solutions at a time. From the

c© ETH Zurich IDMF-CMAS, January 22, 2015


point of view of classical optimization, EAs represent a powerful stochastic zeroth-order method which can find the global optimum of very rough functions.

In Structural Optimization, evolutionary algorithms are used in many different forms. Commonly they are divided into four categories with respect to their historical background. They are:

• Genetic Algorithms (GAs) developed by John Holland [53]. The basic terminology of genetic search and its principal components are discussed by Goldberg [54]. An introduction to the application of Genetic Algorithms to Structural Optimization using traditional binary string coding is given by Hajela [55].

• Evolutionary Programming (EP) created by Lawrence Fogel [56] and developed further by his son David Fogel [57].

• Evolution Strategies (ES) created by Ingo Rechenberg [58] and today strongly promoted by Thomas Bäck [59].

• Genetic Programming (GP), the most recent development in the field, by John Koza [60].

However, they are all based on the same evolutionary principles. Therefore we will use a more modern terminology, also used by Bentley [61] and Schoenauer [62], and generally speak just of Evolutionary Algorithms (EAs). All the strategies listed above can be seen as specializations of general EAs, which we describe below.

Architecture of Evolutionary Algorithms

Based on biology and the theory of Universal Darwinism, an object to evolve must meet several criteria, which are:

• Reproduction

• Inheritance

• Variation

• Selection

EAs perform the reproduction of individuals either by directly cloning parents or by using recombination and mutation operators to allow inheritance with variation. These operators may perform many different tasks, from a simple random mutation to a complete local search algorithm. All EAs also use some form of selection to determine which solutions will have the opportunity to reproduce, and which will not. The key thing to remember about selection is that it exerts selection pressure, or evolutionary pressure, to guide the evolutionary process towards specific areas of the search space. To do this, certain individuals must be allocated a greater probability of having offspring than other individuals. As will be shown below, selection does not only mean parent selection; it can also be performed using fertility, replacement, or even death operators. It is also quite common for multiple evolutionary pressures to be exerted towards more than one objective in a single EA.

But unlike natural evolution, EAs further require three other important features:

• Initialization


• Evaluation

• Termination

Because we are not prepared to wait for the computer to evolve for several million generations, EAs are typically given a head start by initializing (or seeding) them with solutions that have fixed structures and meanings, but random values. This means we feed a certain amount of knowledge into the algorithm already at the beginning of the evolution. Evaluation in EAs is responsible for guiding evolution towards better solutions. Unlike natural evolution, evolutionary algorithms do not have a real environment in which the survivability or goodness of their solutions can be tested; they must instead rely on simulation, analysis, and calculation to evaluate solutions.

Extinction is the only guaranteed way to terminate natural evolution. This is obviously a highly unsuitable way to halt EAs, for all the evolved solutions would be lost. Instead, explicit termination criteria are employed to halt evolution, typically when a good solution has evolved or when a predefined number of generations has passed. In practice, it even more often happens that the programmer's patience is exceeded and the evolution is stopped manually.

There is one more important process which, although not necessary to trigger or control evolution, will improve the capabilities of evolution: mapping. Even though it is not always necessary, we typically distinguish between the search space (genotype space) and the solution space (phenotype space). The search space is a space of coded solutions (genotypes) to the problem, and the solution space is the space of actual solutions (phenotypes). The genotype can be understood as a recipe for how to build an actual solution; e.g., it encodes attributes of a geometry model. Coded solutions must be mapped onto actual solutions before the fitness of each solution can be evaluated. The process of decoding the recipe and building the actual solution is called mapping.

Figure 7.3 shows the general architecture of EAs. This architecture should be regarded as a general framework for EAs, not an algorithm itself. Indeed, most EAs use only a subset of the stages listed. In the following, we briefly examine each of these possible stages of EAs.

Initialization EAs typically seed the initial population with entirely random values. Evolution is then used to discover which of the randomly sampled areas of the search space contain better solutions, and then to converge upon that area. Sometimes the entire population is constructed from random mutants of a single user-supplied solution. Often random values are generated inside specified ranges (a form of constraint handling). It is not uncommon for explicit constraint handling to be performed during initialization, by deleting any solutions which do not satisfy the constraints and subsequently creating new ones. More complex problems often demand alternative methods of initialization. Some researchers provide the EA with embryos: simplified non-random solutions which are then used as starting points for the evolution of more complex solutions. Some algorithms actually attempt to evolve representations or low-level building blocks first, then use the results to initialize another EA which will evolve complex designs using these representations or building blocks. Although most algorithms do use solutions with fixed structures (i.e. a fixed number of decision variables), some allow the evolution of the number and organization of parameters in addition to parameter values. In other words, some evolve structure as well as details. For such algorithms, initialization will typically involve the seeding of solutions with both random values and random structures.
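The random seeding within specified ranges, with constraint-violating solutions deleted and recreated, can be sketched as follows (a minimal illustration; the variable ranges and the constraint function are hypothetical):

```python
import random

def initialize_population(pop_size, ranges, is_feasible, max_tries=1000):
    """Seed a population with random values inside the given ranges,
    discarding candidates that violate the constraints."""
    population = []
    while len(population) < pop_size:
        for _ in range(max_tries):
            candidate = [random.uniform(lo, hi) for (lo, hi) in ranges]
            if is_feasible(candidate):
                population.append(candidate)
                break
        else:
            raise RuntimeError("could not find enough feasible solutions")
    return population

# Example: two design variables in [0, 1], with the constraint x0 + x1 <= 1.5
ranges = [(0.0, 1.0), (0.0, 1.0)]
pop = initialize_population(20, ranges, lambda x: x[0] + x[1] <= 1.5)
```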


Figure 7.3: General architecture of an evolutionary algorithm.

Map Since EAs typically distinguish between the search space (genotype) and the solution space (phenotype), they require a mapping stage to convert genotypes into phenotypes. This process, known by biologists as embryogeny, is highly important for the following reasons:

• Reduction of search space. Embryogeny permits highly compact genotypes to define phenotypes. This reduction (often recursive, hierarchical and multi-functional) results in genotypes with fewer parameters than their corresponding phenotypes, causing a reduction in the dimensionality of the search space, hence a smaller search space for the EA.

• Better enumeration of search space. Mapping permits two very differently organized spaces to coexist, i.e. a search space designed to be easily searched allows the EA to locate corresponding solutions within a hard-to-search solution space.

• More complex solutions in solution space. By using growing instructions within genotypes to define how phenotypes should be generated, a genotype can define highly complex phenotypes.

• Improved constraint handling. Mapping can ensure that all phenotypes always satisfy all constraints, without reducing the effectiveness of the search process in any way, by mapping every genotype onto a legal phenotype.

Especially in the field of Structural Optimization this mapping is a very crucial point for the performance of the whole optimization. Therefore we will have a closer look at it in Section 7.3.


Evaluation Every new phenotype must be evaluated to provide a level of goodness for each solution. Often a single run of an EA will involve thousands of evaluations, which means that almost all computation time is spent performing the evaluation process. In Structural Optimization, evaluation is often performed by dedicated analysis software (CAD software, FEM tools, etc.) which can take minutes or even hours to evaluate a single solution. Therefore a strong emphasis often exists on reducing the number of evaluations during evolution. Sometimes one even knows at the beginning how many evaluations can be afforded, and the EA should then be designed to make the best of that budget.

Evaluation involves the use of fitness functions to assign fitness scores to solutions. These fitness functions can have single or multiple objectives; they can be unimodal or multi-modal, continuous or discontinuous, smooth or noisy, static or continuously changing. EAs are known to be proficient at finding good solutions for all these types of fitness functions. Nevertheless, the implementation of the fitness function often has a tremendous influence on the performance of the EA.

Mating Selection Parent solutions are always required in an EA; otherwise no child solutions can be generated. However, the preferential selection of some parents over others is not essential for the EA to work. Evolution will still occur without it, as long as evolutionary pressure is exerted by one of the other selection methods: fertility, replacement, and death. Nevertheless, most forms of EAs perform parent selection.

Choosing the fitter solutions to be parents of the next generation is the most common and direct way of inducing a selective pressure towards the evolution of fitter solutions. Typically, one of three selection methods is utilized: fitness ranking, tournament selection, or fitness-proportionate selection. Fitness ranking sorts the population in order of the fitness values and bases the probability of a solution being selected for parenthood on its position in the ranking. Tournament selection bases the probability of a solution being selected on how many other randomly picked individuals it can beat. Fitness-proportionate selection (or roulette-wheel selection) bases the probability of a parent being selected on the relative fitness score of each individual; e.g., a solution ten times as fit as another is ten times more likely to be picked as a parent (Goldberg [54]). This method also incorporates fertility selection (see below).

Although normally fitter parents are selected, this does not have to be the case. It is possible to select parents based on how many constraints they satisfy, or how well they fulfill other criteria, as long as a fitness-based selection pressure is introduced elsewhere in the algorithm. In algorithms that record the age of individuals, parent selection may be limited to individuals that are mature, or to individuals which are below their maximum life spans.
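Two of these selection methods can be sketched as follows (a minimal illustration; the function names, the tournament size, and the example fitness function are our own choices):

```python
import random

def tournament_select(population, fitness, k=3):
    """Pick k random individuals and return the fittest among them."""
    contenders = random.sample(population, k)
    return max(contenders, key=fitness)

def roulette_select(population, fitness):
    """Pick an individual with probability proportional to its fitness
    (assumes non-negative fitness values; higher is better)."""
    weights = [fitness(ind) for ind in population]
    return random.choices(population, weights=weights, k=1)[0]

# Example: maximize f(x) = -(x - 2)^2 + 10 over a small population
population = [0.0, 1.0, 2.0, 3.0, 4.0]
fitness = lambda x: -(x - 2.0) ** 2 + 10.0
parent = tournament_select(population, fitness)
```

Note that a tournament over the whole population is fully deterministic, while small tournaments only bias, rather than force, the choice towards fitter individuals.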

Reproduction Reproduction is the cornerstone of every evolutionary algorithm: it is the stage responsible for the generation of child solutions from parent solutions. Crucially, child solutions must inherit characteristics from their parents, and there must be some variability between child and parent. This is achieved by the use of the genetic operators recombination and mutation.

Recombination operators require two or more parent solutions. The solutions (or genotypes) are shuffled together to generate child solutions. EAs normally use recombination to generate most or all offspring. Recombination is normally performed by crossover operators in EAs.

Mutation operators modify a single solution at a time. Some EAs mutate a copy of a


parent solution in order to generate the child, some mutate the solution during the application of the recombination operators, and others use recombination to generate children and then mutate these children. In addition, the probability of mutation varies depending on the EA.

There is a huge number of different mutation operators in use today. Examples include bit mutation, translocation, segregation, inversion, structure mutation, permutation, editing, encapsulation, mutation directed by strategy parameters, and even mutation using local search algorithms (see [54, 60, 63, 59]).

An important feature of both recombination and mutation is non-disruption. Although variation between parent and child is essential, this variation should not produce excessive changes to phenotypes. In other words, child solutions should always be near their parent solutions in the solution space. If this is not the case, i.e. if huge changes are permitted, then the semblance of inheritance from parent to child solutions will be reduced, and their position in the solution space will become excessively randomized. Evolution relies on inheritance to ensure the preservation of useful characteristics of parent solutions in child solutions. When disruption is too high, evolution becomes no more than a random search algorithm.
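For real-valued genotypes, a one-point crossover and a Gaussian mutation operator can be sketched like this (a minimal illustration; parameter values such as sigma and the per-gene mutation rate are our own assumptions):

```python
import random

def one_point_crossover(parent_a, parent_b):
    """Cut both parents at a random point and swap the tails."""
    point = random.randint(1, len(parent_a) - 1)
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

def gaussian_mutation(genotype, sigma=0.1, rate=0.2):
    """Perturb each gene with probability `rate` by N(0, sigma) noise;
    a small sigma keeps children near their parents (non-disruption)."""
    return [g + random.gauss(0.0, sigma) if random.random() < rate else g
            for g in genotype]

child_a, child_b = one_point_crossover([0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0])
mutant = gaussian_mutation(child_a)
```

The small sigma in `gaussian_mutation` is exactly the non-disruption property discussed above: children stay near their parents in the solution space.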

Environmental Selection Once offspring have been created, they must be inserted into the population. EAs usually maintain populations of fixed size; hence, for every new individual that is inserted into the population, an existing individual must be deleted. Therefore, this stage is also called replacement. The simpler EAs just delete every individual and replace them with new offspring. However, some EAs use an explicit replacement operator to determine which solutions are replaced by the offspring. Replacement is often fitness-based, i.e. children always replace solutions less fit than themselves, or the weakest in the population are replaced by fitter offspring.

Replacement is clearly a third method of introducing evolutionary pressure to EAs, but instead of being a selection method, it is a negative selection method. In other words, instead of choosing which individuals should reproduce or how many offspring they should have, replacement chooses which individuals will die.

Replacement need not be fitness-based; it can be based on constraint satisfaction, the similarity of genotypes, the age of solutions, or any other criterion, as long as a fitness-based evolutionary pressure is exerted elsewhere in the EA. Replacement is also limited by speciation within EAs: a child from two parents of one species/population/island should not replace an individual in a different species/population/island.

Termination Evolution by an EA is halted by termination criteria, which are normally based on solution quality and time. Most EAs use quality-driven termination as the primary halting mechanism: they simply continue evolving until an individual which is considered sufficiently fit has been evolved. Some EAs will also re-initialize and restart evolution if no solutions have attained a specific level of fitness after a certain number of generations.

For algorithms which use computationally heavy fitness functions, or for algorithms which must generate solutions quickly, the primary termination criterion is based on time. Normally evolution is terminated after a specific number of generations, evaluations, or seconds. In order to reduce the number of unnecessary generations, some algorithms measure the convergence rate during evolution and terminate when convergence has occurred (i.e. when the genotypes, phenotypes, or fitness values of all individuals are static for a number of generations). Many EAs also permit the user to halt evolution.
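The stages discussed above (initialization, evaluation, mating selection, reproduction, environmental selection, and a generation-budget termination) can be combined into a minimal generational EA loop. This is a sketch under our own assumptions, not any particular implementation; all names and parameter values are illustrative:

```python
import random

def evolve(fitness, dim, pop_size=30, generations=100, sigma=0.1):
    """A minimal generational EA: random initialization, binary-tournament
    parent selection, one-point crossover plus Gaussian mutation,
    full replacement, and a fixed generation budget as termination."""
    # Initialization: random genotypes in [-1, 1]^dim
    pop = [[random.uniform(-1.0, 1.0) for _ in range(dim)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):          # termination: generation budget
        offspring = []
        while len(offspring) < pop_size:
            # Mating selection: two binary tournaments
            pa = max(random.sample(pop, 2), key=fitness)
            pb = max(random.sample(pop, 2), key=fitness)
            # Reproduction: one-point crossover, then Gaussian mutation
            cut = random.randint(1, dim - 1)
            child = [g + random.gauss(0.0, sigma) for g in pa[:cut] + pb[cut:]]
            offspring.append(child)
        pop = offspring                   # environmental selection: full replacement
        best = max(pop + [best], key=fitness)  # keep track of the best-so-far
    return best

# Example: maximize f(x) = -sum(x_i^2), whose optimum is at the origin
best = evolve(lambda x: -sum(g * g for g in x), dim=3)
```

Note that the evaluation stage is just the repeated calls to `fitness`; in Structural Optimization these calls would typically invoke an external FEM analysis.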


7.3 Representation Concepts

7.3.1 The universal genotype

The universal genotype consists of a collection of different gene types, where each type represents a common parameter type.

Therefore any genotype that is representable by an arbitrary collection of the available gene types can be realized by simply composing a heterogeneous list of the appropriate genes.

To date, the following gene types are available:

Float-gene represents an arbitrary floating-point parameter.

Upper and/or lower limits can be provided. If the parameter is unbounded, an additional step size ε for uniform mutation must be specified.

Integer-gene represents an integer parameter.

Again upper and/or lower limits, as well as a step size, can be provided.

Bool-gene represents a binary parameter that can be true or false.

Float-list-gene is a list of arbitrary floating-point values. The parameter always has to represent one of the values specified in the definition of the gene.

Const-float-list-gene are equally distributed floating-point values, i.e. they have a constant distance between two neighboring values.

This gene is quite similar to the integer gene. Optionally, limits or a mutation step size can be provided.

String-list-gene is a list of arbitrary discrete values upon which no norm or ordering can be applied.

This is in contrast to the integer-gene, the float-list-gene, and the const-float-list-gene.

For the definition of the listed gene types, the standard deviation σ used for Gaussian mutation must additionally be specified for every gene.

Float-gene, integer-gene, float-list-gene, and const-float-list-gene can further be provided with so-called cyclic properties.

This is suitable when a distance, but no absolute order, can be defined between two possible gene values, as is the case, e.g., for angle values.

Examples of different gene types

To clarify how a universal genotype for a given optimization problem must be defined, examples of the different gene types are explained in the following, as they are read from file during initialization of an actual EA:

• [float gene 1.2 true 0.5 false 0.0 3.0 5.0 false] denotes a float gene with a default value of 1.2, with a lower limit of 0.5, and with no upper limit defined (the booleans before the limit values specify whether a limit exists or not).

Further, the mutation parameters ε = 3.0 and σ = 5.0 are specified.

ε denotes the range for uniform mutation of an unbounded gene, and σ defines the standard deviation to be used for Gaussian mutation.

c© ETH Zurich IDMF-CMAS, January 22, 2015

Page 138: ETH Struct Opt

132 Stochastic Search

The last boolean value indicates that this example gene does not have cyclic properties.

• [int gene 3 true 0 true 360 0 10.0 true] denotes an int-gene with a default value of 3, a lower limit of 0, and an upper limit of 360.

Further, the mutation parameters ε = 0 (useless for this fully bounded gene) and σ = 10.0 are specified.

In addition, this gene has cyclic properties, indicated by the last boolean value.

• [bool gene true 2.0] denotes a bool-gene with default value true.

A standard deviation σ = 2.0 for some sort of Gaussian mutation is also defined.

• [float list gene 4 3 0.0 1.0 false 1.2 2.5 4.5 4.9] denotes a float-list-gene with 4 values, where the value with index 3 (starting from 0) is the actual default value.

Mutation parameters ε = 0.0 (useless for this gene) and σ = 1.0 follow.

Then it is defined that this gene has no cyclic properties, and the four possible float values are appended.

• [const float list gene 1.75 1.0 true 3.0 false 8 0.5 1.2 false] denotes a const-float-list-gene with a default value of 1.75.

The gene has an active lower limit of 1 and an inactive upper limit of 3 (indicated by the boolean after the respective limit value).

By specifying the number of intervals to be 8, the possible values for this gene are [1.0, 1.25, 1.5, ...].

Finally, ε = 0.5, σ = 1.2 and no cyclic properties are given.

• [string gene 3 1 1.0 blue green red ] denotes a string-gene with 3 possible values, where the value with index 1 (again starting from 0) is chosen as the actual default value. Further, a value σ = 1.0 is specified, and the different possible values are appended.

During the initialization of the Evolutionary Algorithm, a list containing such genes is read from file.

This list defines the genotype structure for the problem at hand. Further, the given default values can also be used to incorporate existing solutions for knowledge-based initialization.
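As an illustration, a parser for the float-gene line format shown in the examples could look as follows. This is a sketch based only on the field order described above; the `FloatGene` class, the function name, and the assumption that the type keyword is written as `float_gene` are our own:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FloatGene:
    default: float
    lower: Optional[float]   # None if the lower limit is inactive
    upper: Optional[float]   # None if the upper limit is inactive
    epsilon: float           # step size for uniform mutation of unbounded genes
    sigma: float             # standard deviation for Gaussian mutation
    cyclic: bool

def parse_float_gene(line):
    """Parse one float-gene definition in the bracketed file format,
    e.g. '[float_gene 1.2 true 0.5 false 0.0 3.0 5.0 false]'."""
    tokens = line.strip("[] \n").split()
    assert tokens[0] == "float_gene"
    default = float(tokens[1])
    lower = float(tokens[3]) if tokens[2] == "true" else None
    upper = float(tokens[5]) if tokens[4] == "true" else None
    return FloatGene(default, lower, upper,
                     epsilon=float(tokens[6]), sigma=float(tokens[7]),
                     cyclic=(tokens[8] == "true"))

gene = parse_float_gene("[float_gene 1.2 true 0.5 false 0.0 3.0 5.0 false]")
```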


Chapter 8

Composite Structures

This chapter discusses the structural optimization of laminated composite materials with a focus on fiber-reinforced materials. The thickness of these structures is usually small compared to the other dimensions, which is why they are typically represented in FEM simulations with layered shell finite elements. The first section gives a short introduction to the design of laminated composites, including a selection of applications. Subsequently, the Classical Lamination Theory (CLT) is introduced, which is the standard calculus for laminated composites. Afterward, the basics of a shell finite element are explained. Together with the CLT, the element is enhanced to a layered shell element which can be used for the computational representation of laminates. The majority of these fundamentals are based on the doctoral thesis of B. Schläpfer [64]. The second part of this chapter discusses different disciplines in laminate optimization, including fiber orientation, laminate thickness, stacking sequence, and material optimization. Finally, the theories are specialized to optimization techniques for finding locally varying laminates, which is also called laminate tailoring. Additionally, a selection of laminate tailoring methods is presented.

8.1 Design of Fiber Reinforced Composites

Fiber-reinforced composite materials have superior mass-specific mechanical properties, which is why they are primarily used for light-weight applications. In contrast to metallic materials such as aluminum or steel, they are usually strongly anisotropic. Together with the layer-wise building technique, the connection between the design parameters and the resulting structural behavior becomes very complex and can barely be understood intuitively. Consequently, employing computer-based engineering methods becomes important. Common optimization techniques are used for laminate design even if the objective is not to find the globally best solution but designs with significantly improved properties. This is why this process can basically be understood as an automated design process. The thickness of light-weight structures is typically small compared to the other dimensions, which is why they have shell characteristics. From the structural engineering point of view, a shell is a plane structure with curved inner and outer surfaces which are separated by the thickness t, which is small compared to the other dimensions of the shell. The thickness can be constant or may vary, and the mid-surface is defined to lie at the distance t/2 from both surfaces. Since the thickness of shell structures is parameterized with a scalar value t, it can be adapted without changing the representation of the spatial model. This is an important benefit in terms of computational cost, which is a decisive factor in preliminary design. The optimization and the automated design of fiber-reinforced plastics need a deeper understanding of the lamination theory and of layered shell elements, which are presented within the next sections. A selection of applications with shell characteristics is

(a) Fuselage section of the Airbus A350

(b) Payload fairing of the Ariane 5 launcher

(c) CFRP monocoque of a Lamborghini Aventador

Figure 8.1: Examples of applications with shell structures

presented in Figure 8.1. Figure 8.1(a) shows a fuselage section of the Airbus A350, which is entirely made of Carbon Fiber Reinforced Plastics (CFRP). Considering light-weight structures, and especially aircraft structures, the thickness of almost any component is usually much smaller than the other dimensions, which is why they are prevailingly modeled with shells. Another typical shell application is the payload fairing of the Ariane 5 launcher shown in Figure 8.1(b). Its mission is to protect the payload during the launch, and it should of course be as light as possible in order to maximize the payload capacity of the launcher. Additionally, it must fulfill acoustic and dynamic requirements. The third example, shown in Figure 8.1(c), is the CFRP monocoque of a Lamborghini. With increasing fuel costs, weight saving has become an important issue in automotive engineering, and components are increasingly often made of fiber-reinforced plastics.


8.2 Laminated Composites

8.2.1 Introduction

Composite materials are a combination of at least two different sub-materials. The combination aims for a resulting material with superior properties, taking advantage of the properties of its sub-materials. On a macroscopic scale, the composite material can be considered homogeneous, which simplifies the design and analysis significantly. The composite may have material properties that cannot be achieved by the single components. Since not all properties can be improved simultaneously, composites have to be designed considering the specific requirements of an application.

The most important class of composites are the Fiber Reinforced Plastics (FRP), which consist of fiber material embedded in matrix material. The superior material properties of FRP mainly arise from the fiber material. Typical fiber materials are carbon, glass, boron, or aramid. The benefit of fiber materials is the low number of voids due to their small diameters, which restrains crack growth and therefore increases the strength [65]. Moreover, the mentioned fiber materials have low mass densities compared to common metallic materials (e.g. carbon: ∼1.8 g/cm³), which results in outstanding mass-specific mechanical properties. However, the fibers themselves are not able to form a continuous material, wherefore they are embedded in a matrix system (see Figure 8.2(a)), preferably a duroplastic or thermoplastic material. Widely spread matrix materials are epoxy resins, which are easy to process and have acceptable mechanical properties at low specific mass (∼1.2 g/cm³). In order to simplify the handling in the manufacturing process, FRP are usually prefabricated. Fiber rovings are joined to laminar plies, the so-called laminae. The fibers are processed to textiles with two or multiple fiber directions, or to unidirectional plies with a single principal fiber orientation. In so-called PREPREGs, the fiber plies are already pre-impregnated with the polymer resin. This ensures an ideal mixing ratio of the components and simple handling of the fabrics, which results in a high quality of the final laminated composite. The mechanical properties of such FRP plies

(a) Matrix-embedded unidirectional fibers

(b) Laminated composite

Figure 8.2: Laminated composites

are highly anisotropic. While the mass-specific properties in fiber direction may be one order of magnitude higher than for metallic materials, the material properties perpendicular to the fiber orientation, which are mainly governed by the properties of the matrix material, are comparatively low. Therefore, structures are built by stacking several plies with different orientation angles; these are then called laminated composites (see Figure 8.2(b)). Basically, the orientation angles can be chosen such that the overall mechanical behavior becomes nearly isotropic (also called quasi-isotropic). This is achieved by distributing the fiber orientations regularly in all directions. A big advantage of laminated composites


is the possibility to specifically design their structural response to different loadings and requirements. In contrast to isotropic materials, for which the amount of material, in particular the thickness, is the only design parameter, the behavior of composites can be designed by varying the orientation angles, the number of plies, the stacking sequence or, in special cases, the fiber volume content. Material can easily be added to highly loaded regions, and the fibers can be oriented depending on the principal directions of the loads. Furthermore, the layered design makes it possible to build laminates of different ply materials. Due to the behavior of anisotropic materials, the layered design method, and the resulting large number of design variables, the design process of laminated composites becomes complex and time-consuming. The experience and the intuition of structural engineers may be pushed to the limit and, except for structures with a low degree of complexity, e.g. plates or cylinders, finding solutions with good light-weight properties is hard to accomplish manually. The application of computer-based design methods is indispensable when aiming for high-quality designs in a reasonable time frame.

8.2.2 Classical Laminate Theory

This section gives a short introduction to the Classical Laminate Theory (CLT), which is the standard theory for the analysis and the design of thin laminated composites. Different conventions exist in the literature and may vary slightly; the conventions chosen here are based on the textbook of Jones [25]. The CLT is based on the assumptions of the Kirchhoff-Love theory, namely that:

• the structure be thin compared to the other dimensions and have a constant thickness.

• the Bernoulli hypothesis be valid, namely that cross-sections remain plane and transverse shear strains vanish.

• the stress state be plane (σz = τxz = τyz = 0).

• material behavior be linear-elastic.

• deformations be small.

Supplementary to the assumptions for homogeneous materials, it is assumed that the single layers are perfectly bonded. Due to the stacking of several layers, the material stiffness properties of a laminate are inhomogeneous through the thickness. In order to provide a linear relation between the plate deformations and the plate line loads of the global laminate, the CLT performs a stiffness homogenization. This requires that the layer stiffnesses, which are usually given in the material principal coordinates 1, 2, be transformed to the global coordinates x, y of the problem.

Plane-Stress State and Stiffness Transformation

The stiffness matrix of an orthotropic layer is defined as

\[
\mathbf{C}_{\mathrm{ortho}} =
\begin{bmatrix}
C_{11} & C_{12} & C_{13} & 0 & 0 & 0 \\
C_{12} & C_{22} & C_{23} & 0 & 0 & 0 \\
C_{13} & C_{23} & C_{33} & 0 & 0 & 0 \\
0 & 0 & 0 & C_{44} & 0 & 0 \\
0 & 0 & 0 & 0 & C_{55} & 0 \\
0 & 0 & 0 & 0 & 0 & C_{66}
\end{bmatrix}
\tag{8.1}
\]

c© ETH Zurich IDMF-CMAS, January 22, 2015

Page 143: ETH Struct Opt

8.2 Laminated Composites 137

In order to express the stiffness entries as functions of the engineering constants, such as the Young's moduli, the Poisson ratios, and the shear moduli, it is easier to formulate the so-called compliance matrix S, which is the inverse of the stiffness matrix C. The compliance matrix of an orthotropic material is defined as

\[
\mathbf{S}_{\mathrm{ortho}} =
\begin{bmatrix}
\frac{1}{E_{11}} & -\frac{\nu_{21}}{E_{22}} & -\frac{\nu_{31}}{E_{33}} & 0 & 0 & 0 \\
-\frac{\nu_{12}}{E_{11}} & \frac{1}{E_{22}} & -\frac{\nu_{32}}{E_{33}} & 0 & 0 & 0 \\
-\frac{\nu_{13}}{E_{11}} & -\frac{\nu_{23}}{E_{22}} & \frac{1}{E_{33}} & 0 & 0 & 0 \\
0 & 0 & 0 & \frac{1}{G_{23}} & 0 & 0 \\
0 & 0 & 0 & 0 & \frac{1}{G_{31}} & 0 \\
0 & 0 & 0 & 0 & 0 & \frac{1}{G_{12}}
\end{bmatrix}
\tag{8.2}
\]

Measured engineering constants can be inserted directly, and the stiffness matrix results from the inversion. Considering a unidirectionally reinforced composite material, the material properties are transversely isotropic. Such a material is defined by 5 elastic coefficients, which are usually the Young's modulus in fiber direction E_{11}, the Young's modulus perpendicular to the fibers E_{22}, the in-plane shear modulus G_{12}, the in-plane Poisson ratio ν_{12}, and the transverse shear modulus G_{23}. The transverse shear Poisson ratio is related to the variables above by

\[
\nu_{23} = \frac{E_{22}}{2G_{23}} - 1 \tag{8.3}
\]

Due to the assumption of a plane stress state, the stresses with out-of-plane components are zero, and the compliance matrix S can be reduced to a 3×3 matrix. Consequently, the stiffness matrix C is reduced to the so-called reduced stiffness matrix Q, and Hooke's law simplifies to

\[
\begin{bmatrix} \sigma_1 \\ \sigma_2 \\ \tau_{12} \end{bmatrix} =
\begin{bmatrix}
Q_{11} & Q_{12} & 0 \\
Q_{12} & Q_{22} & 0 \\
0 & 0 & Q_{66}
\end{bmatrix}
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \gamma_{12} \end{bmatrix}
\tag{8.4}
\]

Keep in mind that the strains with out-of-plane components, namely ε3, γ13 and γ23, are not zero; they are simply not considered. Strains and stresses are expressed in the material principal coordinates 1,2. However, the orientation angle of a unidirectional laminate layer can be chosen arbitrarily, wherefore the material principal directions in general do not coincide with the global coordinates of the problem formulation. Consider a laminate as illustrated in Figure 8.3, where the material coordinates 1,2 are rotated with respect to the global coordinates x,y by a given angle ϕ. A relation is needed which transforms the stresses and the strains from the material coordinate system to the global coordinate system.

Figure 8.3: Material principal and global coordinate systems


This is achieved by applying the rotation matrix T.

\mathbf{T} = \begin{pmatrix}
\cos^2\varphi & \sin^2\varphi & 2\sin\varphi\cos\varphi \\
\sin^2\varphi & \cos^2\varphi & -2\sin\varphi\cos\varphi \\
-\sin\varphi\cos\varphi & \sin\varphi\cos\varphi & \cos^2\varphi - \sin^2\varphi
\end{pmatrix}   (8.5)

This rotation matrix is different from the common transformation matrix for a vector rotation about a coordinate axis. This is because the stresses and the strains are 2nd-order tensors (not vectors); they are only arranged in arrays. While the material stresses are mapped directly using the matrix T according to equation (8.6),

\begin{pmatrix} \sigma_1 \\ \sigma_2 \\ \tau_{12} \end{pmatrix} = \mathbf{T} \begin{pmatrix} \sigma_x \\ \sigma_y \\ \tau_{xy} \end{pmatrix}   (8.6)

it has to be taken into account that the engineering shear strains γij are twice the tensorial shear strains,

\gamma_{ij} = 2\,\varepsilon_{ij}   (8.7)

which leads to the following relation:

\begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \tfrac{1}{2}\gamma_{12} \end{pmatrix} = \mathbf{T} \begin{pmatrix} \varepsilon_x \\ \varepsilon_y \\ \tfrac{1}{2}\gamma_{xy} \end{pmatrix}   (8.8)

In order to be able to work directly with the engineering strains without a pre-factor, Reuter [66] introduced the simple matrix R:

\mathbf{R} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}   (8.9)

An application of this Reuter matrix reduces the potential for mistakes to a minimum since it is clear that only engineering strains are utilized. The engineering strains are then transformed with equation (8.10):

\varepsilon_{12} = \mathbf{R}\,\mathbf{T}\,\mathbf{R}^{-1}\,\varepsilon_{xy}   (8.10)

Conversely, the material principal strains and stresses can be mapped back to global coordinates by a multiplication with the inverse rotation matrix T−1. The connection between the strains and stresses in global coordinates is based on the transformed reduced stiffness matrix Q̄:

\begin{pmatrix} \sigma_x \\ \sigma_y \\ \tau_{xy} \end{pmatrix} = \bar{\mathbf{Q}} \begin{pmatrix} \varepsilon_x \\ \varepsilon_y \\ \gamma_{xy} \end{pmatrix}   (8.11)

The transformed reduced stiffness matrix Q̄ can be derived by using equations (8.4), (8.6) and (8.10) and is therefore a function of Q, R and T only:

\bar{\mathbf{Q}} = \mathbf{T}^{-1} \mathbf{Q}\,\mathbf{R}\,\mathbf{T}\,\mathbf{R}^{-1} = \mathbf{T}^{-1} \mathbf{Q}\,\mathbf{T}^{-T}   (8.12)

The transformed reduced stiffness matrix Q̄ has to be evaluated for every laminate layer in order to homogenize the material data of the entire laminate stack. However, the evaluation is straightforward and the computational costs are low.
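As a numerical illustration of equations (8.4) through (8.12), the following sketch builds Q from engineering constants and transforms it by an angle ϕ. The function names and the CFRP-like material values are illustrative assumptions, not part of the original script.

```python
import numpy as np

def reduced_stiffness(E11, E22, G12, nu12):
    """Reduced stiffness matrix Q of a unidirectional ply, cf. eq. (8.4)."""
    nu21 = nu12 * E22 / E11              # from the symmetry of the compliance matrix
    den = 1.0 - nu12 * nu21
    return np.array([[E11 / den, nu12 * E22 / den, 0.0],
                     [nu12 * E22 / den, E22 / den, 0.0],
                     [0.0, 0.0, G12]])

def rotation_matrix(phi):
    """Rotation matrix T of eq. (8.5), phi in radians."""
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c * c, s * s, 2 * s * c],
                     [s * s, c * c, -2 * s * c],
                     [-s * c, s * c, c * c - s * s]])

R = np.diag([1.0, 1.0, 2.0])             # Reuter matrix, eq. (8.9)

def q_bar(Q, phi):
    """Transformed reduced stiffness, eq. (8.12): Qbar = T^-1 Q R T R^-1."""
    T = rotation_matrix(phi)
    return np.linalg.inv(T) @ Q @ R @ T @ np.linalg.inv(R)

# hypothetical CFRP-like ply data in MPa
Q = reduced_stiffness(E11=140e3, E22=10e3, G12=5e3, nu12=0.3)
```

A quick consistency check: `q_bar(Q, 0)` returns Q itself, and rotating by 90° swaps the roles of the 1- and 2-directions, so `q_bar(Q, np.pi/2)[0, 0]` equals `Q[1, 1]`.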


Stiffness Homogenization

In contrast to a homogeneous material, the material properties of a laminate vary through the thickness. This complicates the analysis but also the formulation of a finite element, since the integration through the thickness becomes more complex. The stiffness discontinuity due to the layer-wise design results in a discontinuous stress distribution. To quantify a stress state in a laminate, the components have to be evaluated for each layer separately. In order to have load quantities that include all layers, the line loads are introduced, namely the force per unit length N and the moment per unit length M. The force per unit length N is the integration of the stress components over the laminate thickness t.

Figure 8.4: General layup of a laminated composite material (total thickness t, layer thicknesses h_1, …, h_n, interface coordinates z_0, …, z_n)

\mathbf{N} = \begin{pmatrix} N_x \\ N_y \\ N_{xy} \end{pmatrix}
 = \int_{-t/2}^{t/2} \begin{pmatrix} \sigma_x \\ \sigma_y \\ \tau_{xy} \end{pmatrix} dz   (8.13)

Analogously, the moment per unit length M is given by the integration of the stress components multiplied with the stacking position z:

\mathbf{M} = \begin{pmatrix} M_x \\ M_y \\ M_{xy} \end{pmatrix}
 = \int_{-t/2}^{t/2} \begin{pmatrix} \sigma_x \\ \sigma_y \\ \tau_{xy} \end{pmatrix} z \, dz   (8.14)

Since the stiffness is constant within a single layer, the integration can be replaced with a summation of the integrals over each layer, taking advantage of the transformed reduced stiffness matrix Q̄ derived above. Using the kinematic relation saying that the strains are composed of the membrane strains and a part arising from the curvature (see equation (A.27)),

\varepsilon = \varepsilon^0 + z\,\kappa   (8.15)

equations (8.13) and (8.14) yield

\mathbf{N} = \int_{-t/2}^{t/2} \sigma \, dz
 = \sum_{j=1}^{n} \bar{\mathbf{Q}}_j \int_{z_{j-1}}^{z_j} \varepsilon \, dz
 = \sum_{j=1}^{n} \bar{\mathbf{Q}}_j \int_{z_{j-1}}^{z_j} \left( \varepsilon^0 + z\kappa \right) dz   (8.16)

 = \sum_{j=1}^{n} \bar{\mathbf{Q}}_j \left[ \left( z_j - z_{j-1} \right) \varepsilon^0 + \frac{1}{2} \left( z_j^2 - z_{j-1}^2 \right) \kappa \right]   (8.17)

\mathbf{M} = \int_{-t/2}^{t/2} \sigma \, z \, dz
 = \sum_{j=1}^{n} \bar{\mathbf{Q}}_j \int_{z_{j-1}}^{z_j} \varepsilon \, z \, dz
 = \sum_{j=1}^{n} \bar{\mathbf{Q}}_j \int_{z_{j-1}}^{z_j} \left( \varepsilon^0 + z\kappa \right) z \, dz   (8.18)

 = \sum_{j=1}^{n} \bar{\mathbf{Q}}_j \left[ \frac{1}{2} \left( z_j^2 - z_{j-1}^2 \right) \varepsilon^0 + \frac{1}{3} \left( z_j^3 - z_{j-1}^3 \right) \kappa \right]   (8.19)


Considering equations (8.17) and (8.19), the matrices A, B and D can be extracted, which are all of dimension 3×3. The A-matrix connects the membrane strains ε⁰ with the forces per unit length N:

\mathbf{A} = \sum_{j=1}^{n} \bar{\mathbf{Q}}_j \left( z_j - z_{j-1} \right)   (8.20)

The D-matrix connects the plate curvatures κ with the moments per unit length M:

\mathbf{D} = \frac{1}{3} \sum_{j=1}^{n} \bar{\mathbf{Q}}_j \left( z_j^3 - z_{j-1}^3 \right)   (8.21)

The B-matrix is responsible for the coupling of membrane and bending components:

\mathbf{B} = \frac{1}{2} \sum_{j=1}^{n} \bar{\mathbf{Q}}_j \left( z_j^2 - z_{j-1}^2 \right)   (8.22)

Dependent on the laminate layup, some entries may become zero. In case of a symmetric laminate, there is no coupling between bending and membrane effects, wherefore the B-matrix vanishes. These matrices build the so-called ABD-matrix, which is the main achievement of the homogenization process:

\begin{pmatrix} \mathbf{N} \\ \mathbf{M} \end{pmatrix} =
\begin{pmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{B} & \mathbf{D} \end{pmatrix}
\begin{pmatrix} \varepsilon^0 \\ \kappa \end{pmatrix}   (8.23)

written out in full:

\begin{pmatrix} N_x \\ N_y \\ N_{xy} \\ M_x \\ M_y \\ M_{xy} \end{pmatrix} =
\begin{pmatrix}
A_{11} & A_{12} & A_{16} & B_{11} & B_{12} & B_{16} \\
A_{12} & A_{22} & A_{26} & B_{12} & B_{22} & B_{26} \\
A_{16} & A_{26} & A_{66} & B_{16} & B_{26} & B_{66} \\
B_{11} & B_{12} & B_{16} & D_{11} & D_{12} & D_{16} \\
B_{12} & B_{22} & B_{26} & D_{12} & D_{22} & D_{26} \\
B_{16} & B_{26} & B_{66} & D_{16} & D_{26} & D_{66}
\end{pmatrix}
\begin{pmatrix} \varepsilon_x^0 \\ \varepsilon_y^0 \\ \gamma_{xy}^0 \\ \kappa_x \\ \kappa_y \\ \kappa_{xy} \end{pmatrix}   (8.24)
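The homogenization of equations (8.20) through (8.22) is straightforward to code once Q̄ is available. The sketch below (helper names and ply data are illustrative assumptions) assembles A, B and D for a stack of identical plies and reproduces the statement that B vanishes for a symmetric layup:

```python
import numpy as np

def q_bar(Q, phi):
    """Transformed reduced stiffness, eq. (8.12)."""
    c, s = np.cos(phi), np.sin(phi)
    T = np.array([[c * c, s * s, 2 * s * c],
                  [s * s, c * c, -2 * s * c],
                  [-s * c, s * c, c * c - s * s]])
    R = np.diag([1.0, 1.0, 2.0])
    return np.linalg.inv(T) @ Q @ R @ T @ np.linalg.inv(R)

def abd_matrix(layup_deg, t_ply, Q):
    """A, B, D of eqs. (8.20)-(8.22) for identical plies, listed bottom to top."""
    n = len(layup_deg)
    z = np.linspace(-n * t_ply / 2.0, n * t_ply / 2.0, n + 1)  # interfaces z_0..z_n
    A, B, D = np.zeros((3, 3)), np.zeros((3, 3)), np.zeros((3, 3))
    for j, angle in enumerate(layup_deg):
        Qb = q_bar(Q, np.radians(angle))
        A += Qb * (z[j + 1] - z[j])
        B += Qb * (z[j + 1] ** 2 - z[j] ** 2) / 2.0
        D += Qb * (z[j + 1] ** 3 - z[j] ** 3) / 3.0
    return A, B, D

Q = np.array([[141.0, 3.0, 0.0],     # hypothetical ply stiffness values in GPa
              [3.0, 10.3, 0.0],
              [0.0, 0.0, 4.5]])
A_sym, B_sym, D_sym = abd_matrix([0, 90, 90, 0], 0.125, Q)   # symmetric layup
A_asy, B_asy, D_asy = abd_matrix([0, 0, 90, 90], 0.125, Q)   # unsymmetric layup
```

`B_sym` is numerically zero while `B_asy` is not; the A-matrix, by contrast, is identical for both orderings of the same plies.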


8.3 Finite Element Representation

The application of shell elements requires that the thickness of the represented structure is much smaller than its other dimensions. Thin structures are usually not modeled with 3D elements, especially when analyzing plate bending. If 3D elements are modeled thin in only the thickness direction according to Figure 8.5(a), there may be the problem of shear locking and ill-conditioning [67]. This problem can be avoided by using many elements according to Figure 8.5(b). However, the number of degrees-of-freedom (d.o.f.) is then increased drastically and the models become large, wherefore solving the problems becomes computationally expensive. Using shell elements and the underlying plate theory, the problem of having a large number of d.o.f. is mitigated (see Figure 8.5(c)).

(a) 3D-modeling with large aspect ratios may lead to shear locking and ill-conditioning
(b) 3D-modeling with a finer mesh leads to an excessive number of d.o.f.
(c) 2D-modeling with plate elements mitigates both problems

Figure 8.5: Front view of different modeling techniques for bending

8.3.1 Layered Shell Elements

Stiffness Matrix

A detailed derivation of a general shell element is given in Appendix A.2. Based on these results and the Classical Lamination Theory introduced above, a layered shell element can be formulated. In contrast to the general formulation, the material here is not homogeneous through the thickness, which is illustrated in Figure 8.6. Consequently, the integration of the stiffness parts through the thickness (see equations (A.34), (A.37) and (A.39)) has to be replaced with a summation according to the formulation of the ABD-matrix. The stiffness matrix of a layered shell element is given by

k = km + kc + kb + ks (8.25)


with

\mathbf{k}_m = \sum_{j=1}^{n} \left[ \int_A \mathbf{B}_m^T \bar{\mathbf{Q}}_j \mathbf{B}_m \, dA \, \left( z_j - z_{j-1} \right) \right]   (8.26)

\mathbf{k}_c = \sum_{j=1}^{n} \left[ \int_A \left( \mathbf{B}_m^T \bar{\mathbf{Q}}_j \mathbf{B}_b + \mathbf{B}_b^T \bar{\mathbf{Q}}_j \mathbf{B}_m \right) dA \; \frac{1}{2} \left( z_j^2 - z_{j-1}^2 \right) \right]   (8.27)

\mathbf{k}_b = \sum_{j=1}^{n} \left[ \int_A \mathbf{B}_b^T \bar{\mathbf{Q}}_j \mathbf{B}_b \, dA \; \frac{1}{3} \left( z_j^3 - z_{j-1}^3 \right) \right]   (8.28)

\mathbf{k}_s = \sum_{j=1}^{n} \left[ \int_A \mathbf{B}_s^T \kappa \bar{\mathbf{Q}}_j^S \mathbf{B}_s \, dA \, \left( z_j - z_{j-1} \right) \right]   (8.29)

Consider that the matrices Bm, Bb and Bs are strain-displacement matrices (see Appendix A.2) and should not be confused with the B-matrix of the lamination theory. In contrast to homogeneous materials, an additional term, the coupling stiffness matrix kc, appears, which again becomes zero for symmetric laminates. However, no additional components are needed since it is a combination of membrane and bending parts.

Mass Matrix

The mass matrix formulation for a layered shell element is simple when using a lumped mass model where rotary inertia is neglected. This is feasible for small deformations, which is usually the case in harmonic vibration problems. The lumped mass matrix is expressed by

\mathbf{m} = \sum_{j=1}^{n} \left[ \rho_j t_j \int_A \phi^T \phi \, dA \right]   (8.30)

The shape functions φ distribute the element mass to its nodes.

8.3.2 Laminate Sensitivities

The expression sensitivity originates from the so-called sensitivity analysis, which is a common methodology in the field of optimization with mathematical programming. The sensitivities express the influence of a change of the design variables x on the objective function f. From the mathematical point of view, the sensitivities are the gradients of the objective function. The general expression is therefore

\nabla f = \frac{df}{dx} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}   (8.31)

A simple method for the numerical determination of the sensitivities is the finite difference approximation. The forward finite difference approximation is defined as

\frac{df}{dx_i} = \frac{f(x_1, \dots, x_i + \Delta x_i, \dots, x_n) - f(x_1, \dots, x_i, \dots, x_n)}{\Delta x_i}   (8.32)

For an objective function f with n design variables, n + 1 function evaluations are needed to fully evaluate the sensitivities. In almost any problem of structural optimization, a finite element run is included in the evaluation of the function. If the size of the finite element model is large or if many design variables are used, the numerical determination


of the sensitivities may become too expensive. Additionally, the numerical determination includes a numerical error which is caused by the finite value of Δxi. However, a numerical determination of the sensitivities is the only possibility if the objective function is not known explicitly, for example in black-box optimizations.
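A minimal implementation of the forward difference of equation (8.32), demonstrated here on an analytic toy function rather than an FEM run, illustrates both the n + 1 evaluations and the truncation error:

```python
import numpy as np

def forward_difference_gradient(f, x, dx=1e-6):
    """Forward finite differences, eq. (8.32): n + 1 evaluations of f."""
    x = np.asarray(x, dtype=float)
    f0 = f(x)                       # one baseline evaluation
    grad = np.empty_like(x)
    for i in range(x.size):         # plus one perturbed evaluation per variable
        xp = x.copy()
        xp[i] += dx
        grad[i] = (f(xp) - f0) / dx
    return grad

# toy objective f(x) = x1^2 + 3 x2 with exact gradient (2 x1, 3)
g = forward_difference_gradient(lambda x: x[0] ** 2 + 3.0 * x[1], [2.0, 1.0])
```

The approximation error scales linearly with dx, which is the numerical error mentioned above; shrinking dx too far instead amplifies floating-point cancellation.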

Considering for instance the common sensitivity equations for minimal compliance,

\frac{\partial W}{\partial x} = -\mathbf{u}^T \frac{\partial \mathbf{K}}{\partial x} \mathbf{u} + 2 \mathbf{u}^T \frac{\partial \mathbf{r}}{\partial x}, \qquad W = \mathbf{u}^T \mathbf{r}   (8.33)

for displacements, which has already been derived in Section 6.10.2 (see equation (6.129)),

\frac{\partial \mathbf{u}}{\partial x} = \mathbf{K}^{-1} \left[ \frac{\partial \mathbf{r}}{\partial x} - \frac{\partial \mathbf{K}}{\partial x} \mathbf{u} \right], \qquad \mathbf{K} \mathbf{u} = \mathbf{r}   (8.34)

or for eigenfrequencies,

\frac{\partial \lambda_n}{\partial x} = \frac{\Phi_n^T \left( \frac{\partial \mathbf{K}}{\partial x} - \lambda_n \frac{\partial \mathbf{M}}{\partial x} \right) \Phi_n}{\Phi_n^T \mathbf{M} \Phi_n}, \qquad \left( \mathbf{K} - \lambda_n \mathbf{M} \right) \Phi_n = 0   (8.35)

it can be noticed that they both require the sensitivities of the stiffness matrices. Since the formulation of the finite elements, and therefore the connection between the design variables and the stiffness matrix, is known, the sensitivities can be derived analytically, which is done here for the thicknesses of the layers and the orientation angles.

Stiffness Matrix Sensitivities With Respect To Layer Thicknesses

The general formulation of the stiffness matrix of a layered shell element is given by equations (8.26) through (8.29). The formulation is dependent on the positions of the laminate interfaces zj. To differentiate the stiffness matrix with respect to the layer thicknesses hj, they have to be reformulated. To keep things simple, the sensitivities of the stiffness matrix with respect to the layer thicknesses are here derived only for symmetric laminates; general formulations are given in [64]. For symmetric laminates, the integration may be done over one half of the laminate only if the value of the stiffness matrix is doubled simultaneously. Figure 8.7 illustrates one half of the symmetric laminate, where the layers are numbered from the laminate mid-plane. Consequently, the interface coordinates zj may be replaced by the thickness values with

z_j = \sum_{k=1}^{j} h_k   (8.36)

The finite element stiffness matrices for membrane, bending and transverse shear consequently yield

\mathbf{k}_m = 2 \sum_{j=1}^{n/2} \left[ \int_A \mathbf{B}_m^T \bar{\mathbf{Q}}_j \mathbf{B}_m \, dA \left( \sum_{k=1}^{j} h_k - \sum_{k=1}^{j-1} h_k \right) \right]   (8.37)

\mathbf{k}_b = 2 \sum_{j=1}^{n/2} \left[ \int_A \mathbf{B}_b^T \bar{\mathbf{Q}}_j \mathbf{B}_b \, dA \; \frac{1}{3} \left( \Big( \sum_{k=1}^{j} h_k \Big)^3 - \Big( \sum_{k=1}^{j-1} h_k \Big)^3 \right) \right]   (8.38)

\mathbf{k}_s = 2 \sum_{j=1}^{n/2} \left[ \int_A \mathbf{B}_s^T \kappa \bar{\mathbf{Q}}_j^S \mathbf{B}_s \, dA \left( \sum_{k=1}^{j} h_k - \sum_{k=1}^{j-1} h_k \right) \right]   (8.39)


Due to the symmetry of the laminate, the coupling stiffness matrix kc is zero. Taking advantage of the relation

\sum_{k=1}^{j} h_k - \sum_{k=1}^{j-1} h_k = h_j   (8.40)

the stiffness for the membrane and shear parts can be simplified to

\mathbf{k}_m = 2 \sum_{j=1}^{n/2} \left[ \int_A \mathbf{B}_m^T \bar{\mathbf{Q}}_j \mathbf{B}_m \, dA \; h_j \right]   (8.41)

\mathbf{k}_s = 2 \sum_{j=1}^{n/2} \left[ \int_A \mathbf{B}_s^T \kappa \bar{\mathbf{Q}}_j^S \mathbf{B}_s \, dA \; h_j \right]   (8.42)

Consequently, the sensitivities of these parts depend only on the material properties of the respective layer and yield

\frac{d\mathbf{k}_m}{dh_l} = 2 \int_A \mathbf{B}_m^T \bar{\mathbf{Q}}_l \mathbf{B}_m \, dA   (8.43)

\frac{d\mathbf{k}_s}{dh_l} = 2 \int_A \mathbf{B}_s^T \kappa \bar{\mathbf{Q}}_l^S \mathbf{B}_s \, dA   (8.44)

For simplicity, it is assumed that the stiffness correction factor κ is independent of the thickness hl. The derivative of the bending part is more complex because the thickness appears with a higher order. Considering equation (8.38), it must be distinguished whether the layer with respect to which the derivative is taken is part of the summation or not. This can be expressed with

\frac{\partial}{\partial h_l} \left( \sum_{k=1}^{j} h_k \right) = \begin{cases} 1 & \text{for } j \ge l \\ 0 & \text{for } j < l \end{cases}   (8.45)

Consequently, the bending stiffness sensitivities yield

\frac{d\mathbf{k}_b}{dh_l} = 2 \int_A \mathbf{B}_b^T \bar{\mathbf{Q}}_l \mathbf{B}_b \, dA \left( \sum_{k=1}^{l} h_k \right)^2
 + 2 \sum_{j=l+1}^{n/2} \int_A \mathbf{B}_b^T \bar{\mathbf{Q}}_j \mathbf{B}_b \, dA \left[ \Big( \sum_{k=1}^{j} h_k \Big)^2 - \Big( \sum_{k=1}^{j-1} h_k \Big)^2 \right]   (8.46)

Here it becomes obvious that a change of the bending stiffness is caused by two different effects. The first term expresses the stiffness change due to the thickness change of the layer itself. The terms within the summation consider the change of the stiffness caused by pushing the overlaying layers outward, which results in a higher area moment of inertia (see Figure 8.8). This relation can alternatively be written with expressions (8.47) and (8.48). The top layer (l = n/2) is independent of the subjacent layers, wherefore its sensitivity is expressed with

\frac{d\mathbf{k}_b}{dh_l} = 2 \int_A \mathbf{B}_b^T \bar{\mathbf{Q}}_l \mathbf{B}_b \, dA \left( \sum_{k=1}^{n/2} h_k \right)^2   (8.47)


The lower layers (l = 1, ..., n/2 − 1) take into account the stiffness contribution of the overlaying layers, wherefore

\frac{d\mathbf{k}_b}{dh_l} = 2 \left( \int_A \mathbf{B}_b^T \bar{\mathbf{Q}}_l \mathbf{B}_b \, dA - \int_A \mathbf{B}_b^T \bar{\mathbf{Q}}_{l+1} \mathbf{B}_b \, dA \right) \left( \sum_{k=1}^{l} h_k \right)^2 + \frac{d\mathbf{k}_b}{dh_{l+1}}   (8.48)

The total sensitivity of the stiffness matrix is built with an addressed summation of the several parts corresponding to equation (8.25):

\frac{d\mathbf{k}}{dh_l} = \frac{d\mathbf{k}_m}{dh_l} + \frac{d\mathbf{k}_b}{dh_l} + \frac{d\mathbf{k}_s}{dh_l}   (8.49)
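The two effects in equation (8.46) can be checked numerically on a scalar analogue of the bending term (8.38), with hypothetical layer stiffness factors Q_j standing in for the integrated matrix terms; the analytic derivative must agree with a finite difference:

```python
import numpy as np

def bending_term(Q, h):
    """Scalar analogue of eq. (8.38): 2 * sum_j Q_j (z_j^3 - z_{j-1}^3) / 3,
    with z_j = h_1 + ... + h_k for one half of a symmetric laminate."""
    z = np.concatenate(([0.0], np.cumsum(h)))
    return 2.0 * np.sum(Q * (z[1:] ** 3 - z[:-1] ** 3) / 3.0)

def bending_sensitivity(Q, h, l):
    """Analytic derivative w.r.t. h_l following eq. (8.46)."""
    z = np.concatenate(([0.0], np.cumsum(h)))
    dk = 2.0 * Q[l] * z[l + 1] ** 2                 # layer l itself
    for j in range(l + 1, len(h)):                  # overlaying layers pushed outward
        dk += 2.0 * Q[j] * (z[j + 1] ** 2 - z[j] ** 2)
    return dk

Q = np.array([100.0, 50.0, 80.0])   # hypothetical per-layer stiffness factors
h = np.array([0.2, 0.3, 0.1])       # half-stack layer thicknesses
```

Perturbing any h_l by a small dh and differencing `bending_term` reproduces `bending_sensitivity`, including the contribution of the layers above layer l.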

All the derivations above have been performed on the element level for the element stiffness k. However, the sensitivity equations (8.33) through (8.35) require the sensitivities of the global stiffness matrix K. These can be derived directly from the element stiffness parts. The global stiffness matrix is an addressed summation of all the element matrices, which is denoted schematically in equation (8.50).

\mathbf{K} = \begin{pmatrix} \mathbf{k}_1 & & & 0 \\ & \mathbf{k}_2 & & \\ & & \mathbf{k}_3 & \\ 0 & & & \ddots \end{pmatrix}   (8.50)

The element stiffness matrix derivatives depend only on the thicknesses of the respective element; the derivatives with respect to the thicknesses of other elements are all zero. Thus, the global stiffness matrix derivative dK/dh_l contains only zeros except for the entries corresponding to the considered element l, which is shown schematically in equation (8.51).

\frac{d\mathbf{K}}{dh_l} = \begin{pmatrix} 0 & & 0 \\ & \dfrac{d\mathbf{k}_l}{dh_l} & \\ 0 & & 0 \end{pmatrix}   (8.51)

The sensitivity of the mass matrix with respect to a layer thickness change is simply

\frac{d\mathbf{m}}{dh_l} = \rho_l \int_A \phi^T \phi \, dA   (8.52)


Figure 8.6: Layered shell element (area A, thickness t, local coordinates x,u; y,v; z,w)

Figure 8.7: Schematic illustration of one half of a symmetric laminate (interfaces z_1, …, z_j, …, z_{n/2} = t/2)

Figure 8.8: Effect of a thickness change on the overlaying layers


Stiffness Matrix Sensitivities With Respect To Fiber Orientation

The calculation of the sensitivities with respect to the fiber orientation is much simpler, since the stiffness of the surrounding layers is not influenced by changing the fiber orientation of a particular layer. Taking the derivative of the membrane stiffness part given in equation (8.26) with respect to the fiber orientation angle simply yields

\frac{\partial \mathbf{k}_m}{\partial \varphi_j} = \int_A \mathbf{B}_m^T \frac{\partial \bar{\mathbf{Q}}_j}{\partial \varphi_j} \mathbf{B}_m \, dA \, \left( z_j - z_{j-1} \right)   (8.53)

since only the transformed reduced material stiffness matrix Q̄j is dependent on the fiber orientations. The same holds for all other stiffness parts given in equations (8.27) through (8.29). The derivative of the transformed reduced material stiffness matrix starts from equation (8.12) and yields

\frac{\partial \bar{\mathbf{Q}}}{\partial \varphi_j} = \frac{\partial \mathbf{T}^{-1}}{\partial \varphi_j} \mathbf{Q} \mathbf{R} \mathbf{T} \mathbf{R}^{-1} + \mathbf{T}^{-1} \mathbf{Q} \mathbf{R} \frac{\partial \mathbf{T}}{\partial \varphi_j} \mathbf{R}^{-1}   (8.54)

The derivative of the transformation matrix introduced in equation (8.5) is

\frac{\partial \mathbf{T}}{\partial \varphi} = \begin{pmatrix}
-2\sin\varphi\cos\varphi & 2\sin\varphi\cos\varphi & 2\cos^2\varphi - 2\sin^2\varphi \\
2\sin\varphi\cos\varphi & -2\sin\varphi\cos\varphi & -2\cos^2\varphi + 2\sin^2\varphi \\
-\cos^2\varphi + \sin^2\varphi & \cos^2\varphi - \sin^2\varphi & -4\cos\varphi\sin\varphi
\end{pmatrix}   (8.55)

The derivative of the inverse can simply be determined by considering that

\mathbf{T}^{-1}(\varphi) = \mathbf{T}(-\varphi)   (8.56)

The mass matrix is not sensitive to a change of the fiber orientations, wherefore these sensitivities are zero.
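Equations (8.54) through (8.56) can be verified numerically: the analytic derivative of Q̄ must match a central finite difference. The sketch below assumes a hypothetical reduced stiffness matrix and uses dT⁻¹/dϕ = −dT/dϕ evaluated at −ϕ, which follows from equation (8.56):

```python
import numpy as np

R = np.diag([1.0, 1.0, 2.0])                  # Reuter matrix, eq. (8.9)

def T_mat(phi):
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c * c, s * s, 2 * s * c],
                     [s * s, c * c, -2 * s * c],
                     [-s * c, s * c, c * c - s * s]])

def dT_dphi(phi):
    """Derivative of the rotation matrix, eq. (8.55)."""
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[-2 * s * c, 2 * s * c, 2 * c * c - 2 * s * s],
                     [2 * s * c, -2 * s * c, -2 * c * c + 2 * s * s],
                     [-c * c + s * s, c * c - s * s, -4 * s * c]])

def q_bar(Q, phi):
    return np.linalg.inv(T_mat(phi)) @ Q @ R @ T_mat(phi) @ np.linalg.inv(R)

def dqbar_dphi(Q, phi):
    """Eq. (8.54); dT^-1/dphi = -dT/dphi(-phi) follows from eq. (8.56)."""
    dTinv = -dT_dphi(-phi)                    # chain rule applied to T(-phi)
    Rinv = np.linalg.inv(R)
    return dTinv @ Q @ R @ T_mat(phi) @ Rinv \
        + np.linalg.inv(T_mat(phi)) @ Q @ R @ dT_dphi(phi) @ Rinv

Q = np.array([[140.9, 3.0, 0.0],    # hypothetical reduced ply stiffness
              [3.0, 10.1, 0.0],
              [0.0, 0.0, 5.0]])
```

A central difference of `q_bar` with a small step reproduces `dqbar_dphi` to high accuracy, confirming the analytic sensitivity.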


8.4 Optimization with Lamination Parameters

The theory of lamination parameters is a common approach in laminate optimization and widespread in the literature (e.g. [68, 69, 70, 71, 72]). It basically operates on the ABD-matrix given in equation (8.23). The idea is to find an optimal set of ABD entries which represents an optimal laminate regarding the considered objective function. Assuming that all layers are of the same material and the same thickness, and that the laminate thickness is fixed to a given value t, the ABD-matrix may be formulated as a sum of the matrices Γi, containing the invariants of an orthotropic material, weighted by the so-called lamination parameters V_i^{A,B,D} according to equations (8.57) through (8.59).

\mathbf{A} = t \left( \Gamma_0 + \Gamma_1 V_1^A + \Gamma_2 V_2^A + \Gamma_3 V_3^A + \Gamma_4 V_4^A \right)   (8.57)

\mathbf{B} = \frac{t^2}{4} \left( \Gamma_1 V_1^B + \Gamma_2 V_2^B + \Gamma_3 V_3^B + \Gamma_4 V_4^B \right)   (8.58)

\mathbf{D} = \frac{t^3}{12} \left( \Gamma_0 + \Gamma_1 V_1^D + \Gamma_2 V_2^D + \Gamma_3 V_3^D + \Gamma_4 V_4^D \right)   (8.59)

The invariants of an orthotropic material contained in Γi are dependent on the entries of the stiffness matrix and are explicitly listed in Appendix B. The lamination parameters are defined as non-dimensional integrals of trigonometric functions of the orientation angles over the normalized thickness coordinate z̄ = z/t according to equations (8.60), (8.61) and (8.62).

\{V_1^A, V_2^A, V_3^A, V_4^A\} = \int_{-1/2}^{1/2} \Phi \, d\bar{z}   (8.60)

\{V_1^B, V_2^B, V_3^B, V_4^B\} = 4 \int_{-1/2}^{1/2} \bar{z} \, \Phi \, d\bar{z}   (8.61)

\{V_1^D, V_2^D, V_3^D, V_4^D\} = 12 \int_{-1/2}^{1/2} \bar{z}^2 \, \Phi \, d\bar{z}   (8.62)

with

\Phi = \{\cos 2\varphi, \; \sin 2\varphi, \; \cos 4\varphi, \; \sin 4\varphi\}   (8.63)

Assuming that all layers have the same thickness and taking into account that the orientation angle ϕ is constant through the thickness of one layer, the integrals can be replaced with summations according to equations (8.64), (8.65) and (8.66).

\{V_1^A, V_2^A, V_3^A, V_4^A\} = \frac{1}{n} \sum_{i=1}^{n} \Phi_i   (8.64)

\{V_1^B, V_2^B, V_3^B, V_4^B\} = \frac{2}{n^2} \sum_{i=1}^{n} \left[ \left( \Big( \frac{n}{2} - i + 1 \Big)^2 - \Big( \frac{n}{2} - i \Big)^2 \right) \Phi_i \right]   (8.65)

\{V_1^D, V_2^D, V_3^D, V_4^D\} = \frac{4}{n^3} \sum_{i=1}^{n} \left[ \left( \Big( \frac{n}{2} - i + 1 \Big)^3 - \Big( \frac{n}{2} - i \Big)^3 \right) \Phi_i \right]   (8.66)

with

\Phi_i = \{\cos 2\varphi_i, \; \sin 2\varphi_i, \; \cos 4\varphi_i, \; \sin 4\varphi_i\}   (8.67)
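Equations (8.64), (8.66) and (8.67) translate directly into code. The helper below is an illustrative sketch (layers numbered from the top, equal thickness, as in the derivation) returning the four V^A and V^D values for a given stack:

```python
import numpy as np

def lamination_parameters(angles_deg):
    """V^A (eq. 8.64) and V^D (eq. 8.66) for n equally thick layers."""
    phi = np.radians(np.asarray(angles_deg, dtype=float))
    n = len(phi)
    i = np.arange(1, n + 1)
    wA = np.full(n, 1.0 / n)                                    # eq. (8.64) weights
    wD = 4.0 / n ** 3 * ((n / 2 - i + 1) ** 3 - (n / 2 - i) ** 3)  # eq. (8.66) weights
    Phi = np.array([np.cos(2 * phi), np.sin(2 * phi),
                    np.cos(4 * phi), np.sin(4 * phi)])          # eq. (8.67)
    return Phi @ wA, Phi @ wD                                   # (V^A, V^D), 4 values each

VA, VD = lamination_parameters([45, -45, -45, 45])   # a (±45)_S stack
```

For the (±45)_S stack this gives V_1^A = 0 and V_3^A = −1, a point on the lower boundary of the feasible domain discussed below; a pure 0° stack gives V_1^A = V_3^A = 1.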

Laminate optimization by means of the lamination parameter theory operates directly on the lamination parameters V_i^{A,B,D} and not on the physical design variables such as, for example, the orientation angles or layer thicknesses. Consequently, the number of linearly independent parameters is 12; for laminates which are symmetric and balanced, it is even only 4. Naturally, the lamination parameters cannot be chosen arbitrarily since they are bounded to a feasible domain through the underlying trigonometric functions. A few publications [73, 74, 75] focus on the description of the feasible lamination parameter domains, but so far there exists no generally valid analytical approach. However, assuming a balanced and symmetric laminate, the feasible domain of the in-plane lamination parameters can be found rather easily. The contours of the feasible domain for the in-plane lamination parameters can be determined by evaluating equation (8.64) for the border cases. An illustration of the feasible domain for the lamination parameters V_1^A and V_3^A is given in Figure 8.9. The lower boundary is defined by (±ϕ)_S-laminates; the upper straight boundary is given by (0_j/90_{n−j})_S-laminates.

Figure 8.9: Feasible domain of the in-plane lamination parameters V_1^A, V_3^A

For symmetric and balanced laminates, the in-plane lamination parameters V_2^A and V_4^A are always zero due to the integration over the thickness. Additionally, all coupling lamination parameters V_i^B are equal to zero for the symmetric case. For simplicity, only the +45°-layers are labelled in Figure 8.9. Since the lamination parameters V_1^A and V_3^A are based on cosine functions, each +45°-layer can directly be replaced with a −45°-layer.

It can be shown that the feasible domain is convex, wherefore an optimal solution of the ABD-matrix can be found efficiently by using algorithms of mathematical programming. However, the major drawback of using lamination parameters is the fact that the information of the ABD-matrix does not explicitly include the information of the physical laminate. The back-substitution from the ABD-matrix, or from the lamination parameters respectively, to a physical laminate with a known stacking sequence and orientation angles is not unique and constitutes a second optimization problem. Considering again the feasible domain in Figure 8.9, the lamination parameters can take every value in the domain. However, the feasible physical laminates are distributed discretely in the design domain. Having a given number of layers, which is a basic assumption of the theory, the optimal lamination parameters can only be represented approximately by a physical laminate. In case of a symmetric and balanced laminate as shown above, the complexity of the back-substitution is acceptable. The problem becomes much more complex for general laminates.


8.4.1 Basic Examples

A simple application of lamination parameters is demonstrated on the example of an in-plane stiffness design of balanced and symmetric laminates. According to the derivations above, V_2^A and V_4^A are zero, and V_1^A and V_3^A must lie in the feasible domain given in Figure 8.9. The A-matrix is a linear combination of the lamination parameters and the material invariants:

\mathbf{A} = \underbrace{\begin{pmatrix} U_1 & U_4 & 0 \\ U_4 & U_1 & 0 \\ 0 & 0 & U_5 \end{pmatrix}}_{\Gamma_0}
 + V_1^A \underbrace{\begin{pmatrix} U_2 & 0 & 0 \\ 0 & -U_2 & 0 \\ 0 & 0 & 0 \end{pmatrix}}_{\Gamma_1}
 + V_3^A \underbrace{\begin{pmatrix} U_3 & -U_3 & 0 \\ -U_3 & U_3 & 0 \\ 0 & 0 & -U_3 \end{pmatrix}}_{\Gamma_3}   (8.68)

In a first example, we assume that the in-plane stiffness in x-direction has to be maximized. Consequently, the entry A_{11} must become maximal. The material invariants U_2 and U_3 are strictly positive (see Appendix B). Using the plot of the feasible domain in Figure 8.9 and the matrices in equation (8.68), we can determine that the entry A_{11} is maximal for both V_1^A and V_3^A equal to one. This corresponds to a laminate consisting of 0°-layers only:

\max A_{11} \left( = U_1 + V_1^A U_2 + V_3^A U_3 \right) \;\rightarrow\; V_1^A = 1, \; V_3^A = 1 \;\rightarrow\; (0_4)_S

The same calculation can be done for a laminate with maximal in-plane shear stiffness:

\max A_{66} \left( = U_5 + V_1^A \cdot 0 + V_3^A \cdot (-U_3) \right) \;\rightarrow\; V_3^A = -1, \; (V_1^A = 0) \;\rightarrow\; (\pm 45_2)_S

Alternatively, we may seek a laminate which has equal stiffness in all directions, also known as a quasi-isotropic laminate. The equations characterizing a quasi-isotropic laminate are

A_{11} = A_{22} \;\rightarrow\; U_1 + V_1^A U_2 + V_3^A U_3 = U_1 - V_1^A U_2 + V_3^A U_3

A_{66} = \tfrac{1}{2} \left( A_{11} - A_{12} \right) \;\rightarrow\; U_5 - V_3^A U_3 = \tfrac{1}{2} \left( U_1 + V_1^A U_2 + V_3^A U_3 - U_4 + V_3^A U_3 \right)

The first equation directly implies

V_1^A = 0

Using the explicit formulation of the invariants given in Appendix B, it can be shown that

V_3^A = 0

Since we assume the lamination parameters V_2^A and V_4^A to be zero, the laminate must be balanced, wherefore the solution corresponds to a (0/45/−45/90)_S-laminate. However, this is not the only solution for a quasi-isotropic laminate. A laminate with layup (0/60/−60)_S is also quasi-isotropic, symmetric and balanced and results in the same lamination parameters (V_1^A = V_3^A = 0). This demonstrates that the back-substitution from the lamination parameters to the physical laminate is not unique. Of course, the two laminates (0/45/−45/90)_S and (0/60/−60)_S produce different entries V_i^D. However, the difference becomes small for a large number of layers.
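The non-uniqueness of the back-substitution can be reproduced with a few lines: both candidate half-stacks yield V_1^A = V_3^A = 0. This is an illustrative sketch with equally thick layers; since V^A is a thickness-weighted average, one repetition of each stack suffices:

```python
import numpy as np

def in_plane_parameters(angles_deg):
    """V1^A and V3^A of eq. (8.64) for equally thick layers."""
    phi = np.radians(np.asarray(angles_deg, dtype=float))
    return float(np.mean(np.cos(2 * phi))), float(np.mean(np.cos(4 * phi)))

quasi_isotropic = {"(0/45/-45/90)S": [0, 45, -45, 90],
                   "(0/60/-60)S": [0, 60, -60]}
params = {name: in_plane_parameters(stack)
          for name, stack in quasi_isotropic.items()}
```

Both entries of `params` are (0, 0) up to rounding, so the two physically different laminates are indistinguishable at the level of the in-plane lamination parameters.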


8.5 Optimization on Physical Design Variables

The parametrization schemes presented in this section are all based on physical design variables such as fiber orientation angles, layer thicknesses, stacking sequence, number of layers, or material. In contrast to the parametrization with lamination parameters, which directly operates on the entries of the ABD-matrix, the ABD-matrix is here a result of the physical design variables, which are processed with the homogenization routine of the CLT introduced in Section 8.2.2. As long as the geometry of the considered structure is simple (e.g. plates or cylinders) and the loads are distributed homogeneously, a solution can directly be evaluated with the CLT. However, if the geometries become more complex, the application of an FEM model becomes necessary. A flowchart of a typical optimization process including a finite element evaluation is shown in Figure 8.10. It is separated into

Figure 8.10: Flowchart of a typical optimization with an FEM-Solver (optimization environment with optimizer and read/write interfaces; FEM environment with input file, model, solver and output file)

two different environments: the optimization environment, which is usually implemented in a programming language (e.g. MATLAB, Python, C), and an FEM environment, which is represented by an FEM solver. Typically, the process starts with an initial finite element model whose laminate layup has to be optimized. An input-interface reads all the relevant data for the optimization. Naturally, the optimization can be combined with a simultaneous shape optimization. The optimization algorithm then generates an initial design variable vector x0, which is processed into the new model. Based on that, the output-interface generates an input-file for the finite element framework and the finite element solver is started by the optimizer. The resulting output-file is read again to calculate the objective value. The evaluation of the objective value can be carried out either in the finite element framework or in the optimization environment. Based on the objective value, the optimizer creates a new set of design variables. Many programming environments provide a set of pre-implemented optimization algorithms, wherefore the programming effort can be reduced. However, the implementation of the input/output-interfaces for an efficient communication between the two parts requires considerable programming effort.
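The loop of Figure 8.10 can be sketched as follows. Everything here is a placeholder: the interface functions, the mocked solver response and the random-search "optimizer" are illustrative assumptions, not the actual DynOPS setup of this script:

```python
import numpy as np

def write_fem_input(x):
    """Output-interface stand-in: turn the design vector into an FEM input (mocked)."""
    return {"angles_deg": x}

def run_fem_solver(fem_input):
    """Stand-in for the FEM run: returns a mock compliance-like value."""
    return 1.0 + 0.1 * np.sum(np.cos(2.0 * np.radians(fem_input["angles_deg"])))

def objective(x):
    fem_input = write_fem_input(x)    # optimization env -> FEM input file
    return run_fem_solver(fem_input)  # FEM run -> objective from output file

# crude optimizer stand-in: random search over the bounded angle interval
rng = np.random.default_rng(seed=0)
best_x, best_f = None, np.inf
for _ in range(200):
    x = rng.uniform(-90.0, 90.0, size=2)
    f = objective(x)
    if f < best_f:
        best_x, best_f = x, f
```

In a real setup the two placeholder functions would write the solver input file and parse the solver output file, and the random search would be replaced by the chosen optimization algorithm.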

8.5.1 Optimization of the Fiber Orientation

The optimization of the fiber orientation is a very common discipline in the field of laminate optimization. The adaption of the fiber orientations allows a specific design of the material anisotropy, which is one of the major benefits of laminates compared to isotropic materials. The fiber orientations of each layer are adapted with respect to a specific design criterion. Taking a very simple approach, it is assumed that the number of layers and the material


is predefined and fixed during the optimization. The orientation angles are taken as design variables, wherefore the optimization problem becomes continuous. Figure 8.11(a) schematically illustrates a laminate consisting of four layers whose orientation angles are optimized. The design variables can simply be arranged in an array, which is illustrated

(a) Fiber orientation optimization scheme
(b) Fiber orientation optimization parametrization

Figure 8.11: Fiber orientation optimization

in Figure 8.11(b). In general, the fiber orientation values are real numbers (ϕj ∈ R). Due to the periodicity of the orientation angles, they are often bounded to an interval of 180°, such as [0°, 180°] or [−90°, 90°]. Consequently, the search space is restricted without reducing the potential solution quality. However, some optimization algorithms cannot handle constrained problems; with this type of algorithm the search space is of course not restricted. The continuity of the design variables allows fiber orientation optimization to be solved with algorithms of the class of mathematical programming. However, it must be considered that the search space is multi-modal, which may be problematic for this class of algorithms since they risk getting stuck in a local optimum. The non-convexity of a search space with two orientation angles has been visualized by Keller [76]. Figure 8.12(a) illustrates a simple plate which is clamped at the left side and loaded with a uniform line load. The plate consists of two layers with orientation angles x1 and x2. Figure 8.12(b) shows the corresponding contour plot of the objective function, which is obviously multi-modal. There will be even more local optima when using more than two layers.

(a) The structure is clamped at the left side and a uniform line-load of 0.01 N/mm is applied to the right side.
(b) Contour plot of the objective for the tensile specimen experiment as a function of the ply angles ϕ1 and ϕ2 (in degrees).

Figure 8.12: Geometry and objective of the tensile specimen experiment. Source: Keller [76]

Allowing the orientation angles to be real may lead to solutions with many decimal places, which is inappropriate for manufacturing since the processes are often restricted to discrete steps, for instance to 5°-steps or larger. This renders the problem discrete, wherefore the application of algorithms from the category of mathematical programming becomes impossible. Stochastic algorithms may mitigate that problem. Alternatively, continuous orientation angles may be rounded to the nearest feasible angle. However, if the discrete steps are too large, the solution may change its behavior significantly.

8.5.2 Optimization of the Stacking Sequence

Another optimization discipline, which has been performed since the early stages of computer-aided laminate design, is the optimization of the laminate stacking sequence. It modifies the order of the laminate stack by exchanging layers, i.e. their positions in the laminate. Stacking sequence optimization is important if the load cases include bending. In such a case it matters greatly which layers are located in the outer positions, since these layers have a greater impact on the area moment of inertia. Moreover, the stacking sequence governs the mechanical coupling between in-plane and out-of-plane behavior, which is mathematically represented by the B-matrix. If the layers are distributed appropriately, an in-plane load may induce out-of-plane deformations and vice versa, which may be interesting for specific applications. Additionally, the interlaminar stress distribution depends strongly on the distribution of the single layers. Figure 8.13(a) schematically illustrates a stacking sequence optimization. The parametrization of

(a) Stacking sequence optimization scheme

(b) Stacking sequence optimization parametrization

Figure 8.13: Stacking sequence optimization

a stacking sequence is usually handled by arranging the numbered layers in an array according to Figure 8.13(b). An exchange within the array is equivalent to an exchange of the corresponding layers. When two layers in the laminate stack are exchanged, the objective function changes abruptly, wherefore the optimization problem is discrete. Consequently, this category of problems can only be solved with stochastic algorithms. A type of algorithm which has been used very often to optimize the stacking sequence are Genetic Algorithms (GA) [77, 78, 79, 80, 81, 82, 83, 84, 85]. Basically, a stacking sequence problem can be transformed into a problem of fiber orientation optimization. Instead of exchanging the layers, their fiber orientations may be varied and, potentially, the same solution can be found.
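A minimal sketch of the array parametrization of Figure 8.13(b) with a swap-based mutation, as typically used in Genetic Algorithms for stacking sequence problems (the function is illustrative, not taken from the cited works):

```python
import random

def swap_mutation(stacking, rng=random):
    """Exchange two randomly chosen layer positions in the stack."""
    s = list(stacking)
    i, j = rng.sample(range(len(s)), 2)
    s[i], s[j] = s[j], s[i]
    return s

# A laminate stack encoded as an array of layer labels (cf. Figure 8.13(b))
stack = ['A', 'B', 'C', 'D']
child = swap_mutation(stack)
assert sorted(child) == sorted(stack)   # no layer is lost or duplicated
```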


8.5.3 Material Optimization

Another parameter which can be optimized is the material of the single layers. Of course, it does not make sense to vary the engineering constants of the sub-materials. However, the applied materials can be selected from a list of predefined materials which are compatible with each other. Usually, a set of candidate materials is predefined in a database. Each material has a unique label number which is used as the design variable. Consequently, the design variables are discrete and the problem is solved with stochastic algorithms. Pure material optimizations are rather rare since the material which may be used for the design is often predefined. Moreover, it is often intuitively clear to the engineer which material best fits the needs. However, material optimization is sometimes used in combination with other laminate optimization disciplines.

The Discrete Material Optimization (DMO) method, introduced by Stegmann and Lund [86, 87], recasts the discrete problem in a continuous formulation. The constitutive matrix C is expressed as a weighted sum of candidate materials which are characterized by their own constitutive matrices Ci.

C = Σ_{i=1}^{n} wi Ci ,    0 ≤ wi ≤ 1    (8.69)

The weighting factors wi, which are bounded by 0 and 1, are used as design variables, wherefore the problem becomes continuous. As in the topology optimization method of Bendsøe and Kikuchi (see Section 9.4), the design variables are pushed towards their boundaries, and in the final weighting factor array of each layer exactly one value must be 1 while the rest must be zero. If several constitutive matrices are defined from one material with different orientation angles, the method can also be used for fiber orientation or stacking sequence optimizations.
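Equation (8.69) can be sketched as follows; the two candidate matrices are hypothetical placeholder values, not real material data:

```python
import numpy as np

def dmo_stiffness(weights, candidates):
    """Weighted sum of candidate constitutive matrices, eq. (8.69)."""
    weights = np.asarray(weights, dtype=float)
    assert np.all((0.0 <= weights) & (weights <= 1.0))
    return sum(w * C for w, C in zip(weights, candidates))

# Two hypothetical 3x3 in-plane stiffness matrices (placeholder values)
C1 = np.diag([100.0, 10.0, 5.0])
C2 = np.diag([10.0, 100.0, 5.0])
# A converged DMO solution pushes the weights to 0/1:
C = dmo_stiffness([1.0, 0.0], [C1, C2])
assert np.allclose(C, C1)
```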

8.5.4 Optimization of the Laminate Thickness

In contrast to the stacking sequence or the fiber orientations, the laminate thickness has an influence on the structural weight. Consequently, the thickness must be adapted at a certain stage of the design process if aiming for structures with minimal weight. There are two options for varying the laminate thickness: either varying the thickness of the individual layers or varying the number of layers.

Variable Layer Thickness

Analogously to the fiber orientation optimization, the thickness of each layer can be varied. Basically, the layer thicknesses are continuous, wherefore algorithms of mathematical programming can be used. It must be ensured that the thickness values do not become negative, since this would lead to physically meaningless solutions. Some structural models (e.g. the CLT) can formally handle negative thickness values, which leads to negative stiffness and negative mass. If the thickness values are allowed to become zero, layers can vanish, which results in a topology optimization. A layer having zero thickness is non-existent, wherefore, practically, the number of layers is changed. If the Finite Element Method is used to represent the structural model, care must be taken that the thickness does not become zero for all layers, since this would result in numerical problems.

From the practical point of view, the variation of the layer thickness with continuous variables is of limited use. If aiming for manufacturing-friendly solutions, the layer thicknesses are restricted to values given by the semi-finished products which are provided by the


(a) Layer thickness optimization scheme (layer thicknesses t1 … t4 as variables x1,1 … x4,n)

(b) Layer thickness optimization parametrization

Figure 8.14: Layer thickness optimization

manufacturers. Practically, this can be handled in different ways. A pragmatic approach is to let the thickness variables be continuous during the optimization process. The thickness values of the final solution are then rounded to the nearest value predefined by the semi-finished product. It is important to ensure that the new solution is feasible and that its objective value does not differ too much from the initially found solution; the rounded solution might behave completely differently. Alternatively, the thickness variables can be restricted to the discrete values given by the semi-finished product. This requires the application of stochastic algorithms.
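The rounding of a continuous thickness variable to the discrete values of a semi-finished product can be sketched as follows (the ply thickness of 0.25 mm is an assumed example value; after snapping, the design must be re-evaluated for feasibility and objective change):

```python
def snap_thickness(t, t_ply):
    """Round a continuous layer thickness to the nearest non-negative
    integer multiple of the semi-finished ply thickness t_ply."""
    n = max(0, round(t / t_ply))
    return n * t_ply

assert snap_thickness(0.63, 0.25) == 0.75
assert snap_thickness(0.10, 0.25) == 0.0   # the layer vanishes
```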

If the layer thicknesses are used as design variables, it is important to include the structural weight in the objective function or in the side constraints. In case the optimization objective is to maximize, for instance, the stiffness or the strength, the weight must be bounded with a side constraint. Otherwise, the thicknesses will increase to very high values and the structural weight tends to infinity. If aiming for solutions with minimal weight, stiffness or strength must be bounded. Otherwise, the thickness tends to become zero.

Variable Number of Layers

An alternative approach to varying the thickness of the laminate is to let the number of layers be variable. The handling of these problems becomes slightly more difficult since the number of design variables is no longer constant. Figure 8.15 schematically shows an

(a) Scheme of a problem with a variable number of layers

(b) Parametrization of a problem with a variable number of layers

Figure 8.15: Optimization with variable number of layers

optimization problem with a variable number of layers. For each newly appearing layer, the fiber orientation, the thickness, the material and the position in the laminate stack have to be determined (even though in Figure 8.15 only the orientation angles are indicated), which is


usually done randomly or by choosing them randomly from a predefined database. Due to the variable length of the design variable array, these kinds of problems cannot be solved with algorithms from mathematical programming. Usually, Genetic Algorithms with variable genotype length are applied. It is important to implement appropriate reproduction and mutation routines in order to come close to optimal solutions.
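A sketch of a mutation routine for a variable-length genotype, here encoded as a list of orientation angles drawn from an assumed candidate pool (the values and the 50/50 insert/delete split are illustrative):

```python
import random

def mutate_layup(layup, angle_pool=(-45, 0, 45, 90), rng=random):
    """Randomly insert or delete one layer of a variable-length layup."""
    layup = list(layup)
    if layup and rng.random() < 0.5:
        del layup[rng.randrange(len(layup))]          # remove a layer
    else:
        pos = rng.randrange(len(layup) + 1)
        layup.insert(pos, rng.choice(angle_pool))     # add a new layer
    return layup

child = mutate_layup([0, 45, 90])
assert len(child) in (2, 4)   # genotype length changed by exactly one
```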

8.5.5 Combined Laminate Optimizations

Often, the disciplines introduced above are combined; for instance, the fiber orientation angles and the layer thicknesses are optimized simultaneously. On the one hand, this increases the potential for finding optimal solutions. On the other hand, the search space becomes more complex, wherefore finding the optimal solution becomes more difficult. Basically, the parametrization must be chosen such that the solutions are feasible for production. Even if mathematically optimal solutions might be interesting from the mechanical point of view, they make little sense if they cannot be realized. An appropriate parametrization scheme helps to reduce the search space in order to obtain manufacturing-friendly solutions.


8.6 Laminate Tailoring

The methods above address problems for which a single optimal laminate has to be found. If the design domain is geometrically simple (e.g. rectangular plates or cylinders) and the loads are applied homogeneously, as illustrated in Figure 8.16, the internal load distribution is homogeneous as well. For such loadings, there exists a unique


Figure 8.16: Homogeneous loads: Tension and shear

optimal laminate which can be found rather easily. However, in general, the internal loads are inhomogeneous and the stress state at each point of the laminate is different. Figure 8.17 shows the strain field of a plate with a centered hole which is loaded uniaxially. Even


Figure 8.17: Inhomogeneous strain field of a plate with centered hole

though this problem is quite simple, the internal load distribution is highly inhomogeneous. In order to fully exploit the potential of composites, the laminate properties have to be tailored to the local load conditions. This can be achieved by splitting the design domain Ω into sub-domains Ωi, each having a unique layup, as schematically illustrated in Figure 8.18. Consequently, the objective here is to find an optimal laminate for each sub-


Figure 8.18: Global domain Ω split with subdomains Ωi

domain. The optimization of the local laminate layup is also called laminate tailoring. Even though the optimization techniques are the same as presented above, laminate tailoring is a relatively new discipline since its computational requirements are rather high. The number of design variables increases linearly with the number of sub-domains, which makes the optimization more computationally expensive.

When performing laminate tailoring, there is a risk of finding solutions which cannot be physically realized because they are not manufacturing-friendly. Consequently, manufacturing aspects should be considered in the parametrization concept. For instance, the


connectivity of the sub-domains must be guaranteed by a minimal number of layers covering multiple sub-domains. Figure 8.19 shows a laminate section consisting of three


Figure 8.19: Schematic illustration of layers covering multiple sub-domains

sub-domains. Layers A, C and E cover the entire domain, so global connectivity is given. A simple method to guarantee connectivity is to define a number of plies which cover the entire domain and may not be adapted during the optimization. This, however, again restricts the search space.

Basically, the geometric representation of the sub-domains can be predefined and assumed to be fixed during the optimization. This results in a constant number of design variables, which simplifies the optimization but also restricts the search space and thus the solution quality. More freedom in terms of design is achieved if the parametrization additionally includes the geometrical representation of the sub-domains. A variable number of sub-domains leads to a variable number of design variables, which can only be handled by means of stochastic algorithms. Two parametrization approaches for the sub-domains are presented in the next two sections.

8.6.1 FEM-Based Parametrization of the Sub-Domains

A simple approach, which is often used to find locally varying laminates (e.g. [87, 88, 89, 90, 91]), is the direct use of the finite elements as sub-domains. Consequently, the goal is to find a unique laminate within each finite element, wherefore the number of design variables increases linearly with the number of finite elements. A schematic illustration of such a parametrization is shown in Figure 8.20(a). The contours of the sub-domains are

(a) Parametrization with single elements representing a sub-domain

(b) Parametrization with multiple elements pooled into a sub-domain

Figure 8.20: FEM-based Parametrization of Sub-Domains

dependent on the finite element mesh, which may restrict the design freedom, but only if a coarse mesh is chosen. However, no additional parametrization of the geometry is needed.


Alternatively, the number of design variables can be reduced by pooling a given number of finite elements into groups which then represent the sub-domains. A schematic sketch is shown in Figure 8.20(b). One possibility is to predefine the elements belonging to one sub-domain and keep the assignment fixed during the optimization. This of course restricts the freedom of design, but it might help to make the solutions more manufacturing-friendly. Alternatively, the assignment of elements to sub-domains may be chosen to be variable. The realization and the optimization become more complex, but the potential for improved solutions becomes bigger. Note that for both parametrizations the solution quality, but also the computational cost, depends on the finite element mesh.
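The pooling of elements into sub-domains can be sketched as a simple index mapping, where one design value per sub-domain is broadcast to all its elements (the grouping shown is hypothetical):

```python
def pool_variables(element_to_domain, domain_values):
    """Map one design value per sub-domain onto all finite elements."""
    return [domain_values[d] for d in element_to_domain]

# Six elements pooled into two fixed sub-domains (hypothetical grouping)
element_to_domain = [0, 0, 0, 1, 1, 1]
angles = [45.0, -45.0]          # one design variable per sub-domain
per_element = pool_variables(element_to_domain, angles)
assert per_element == [45.0, 45.0, 45.0, -45.0, -45.0, -45.0]
```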

Giger et al. [92] proposed a graph-based parametrization scheme which allows the set of elements pooled into a sub-domain to be variable. This parametrization

Figure 8.21: Graph-based parametrized vertex patches. Source: Giger [92]

scheme allows maximal freedom in terms of design since the regions having a unique laminate can morph. The material data and the orientation angle of a so-called patch, which can be understood as a layer that partially covers the design domain, are stored in a patch vertex as shown in Figure 8.21. The elements belonging to a patch, as well as the material and the orientation angle, can be modified by means of a genetic algorithm. In order to guarantee the connectivity of a patch, only boundary elements, which are located automatically by a computer routine, may be added or removed. The total laminate layup in each finite element is obtained by stacking the patch layers the respective element belongs to in the right order (according to Figure 8.21, for instance, from left to right). This parametrization scheme is inspired by the manufacturing process, and the solutions are manufacturing-friendly.

An alternative approach to laminate tailoring based on a finite element parametrization is proposed by Schläpfer and Kress [91, 93, 64]. The method locally reinforces a predefined laminate structure with laminate patches with regard to a specific design criterion such as stiffness, strength or dynamic behavior, while simultaneously aiming for minimal mass. Layers with given material and fiber orientation are predefined in the design domain. Some of these layers have a thickness of zero, wherefore they do not actually contribute to the finite element model. These zero-thickness layers act as potential reinforcements. The local reinforcements are generated by partially setting the thickness of these layers to values of the semi-finished material. The decision whether a layer in one finite element is set to a real value or not is based on the sensitivities which have been


introduced in Section 8.3.2. In particular, the sensitivities of a particular design criterion with respect to the layer thickness in each finite element are calculated. Since a change of the thickness always influences the stiffness, even if the layer has a thickness of zero (note that in equations (8.43) and (8.44) the thickness no longer appears), the sensitivities are non-zero for the defined zero-thickness layers. Consequently, the sensitivities indicate how large the impact of a thickness change of a layer on the objective function is. Figure 8.22 shows a simple optimization of a vibrating plate, whose second and third eigenfrequencies are to be separated. The sensitivities in Figure 8.22(a) are a superposition of the sensitivities of the second and the third vibration mode. The resulting layers are represented by the black areas in Figure 8.22(b). The present approach is inspired by the manufacturing process. The global connectivity is guaranteed by basic layers which are predefined and do not participate in the optimization process. The obtained solutions can basically all be manufactured.
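A minimal sketch of the sensitivity-based decision described above, with assumed threshold and ply thickness values (the selection rule is simplified; the cited works use more elaborate strategies and a full finite element model):

```python
def activate_layers(sensitivities, t_ply, threshold):
    """Set the thickness of a zero-thickness layer to one ply where the
    sensitivity of the design criterion exceeds the threshold."""
    return [t_ply if s > threshold else 0.0 for s in sensitivities]

# Per-element sensitivities of one potential reinforcement layer
sens = [0.1, 0.9, 0.7, 0.2]
thick = activate_layers(sens, t_ply=0.125, threshold=0.5)
assert thick == [0.0, 0.125, 0.125, 0.0]
```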

8.6.2 CAD-Based Parametrization of the Sub-Domains

Alternatively, the laminate patches can be parametrized by taking advantage of a CAD environment, as done for example by Zehnder et al. [94]. There, the geometries of the patches with given materials and fiber orientations are parametrized by means of a CAD software. The shape of the patches is only restricted by the parametrization capabilities of the software itself. If splines are used, the shapes can basically be chosen arbitrarily. As illustrated in Figure 8.23, the sub-domains having a unique laminate layup are defined by stacking the patches in the right order. Here, too, the structural model is evaluated with the help of the finite element method. However, the finite element mesh is created depending on the parametrization of the sub-domains. Consequently, the sub-domains are not dependent on the finite element mesh, as was the case above. An evaluation of a solution always includes the generation and solving of the finite element model, which is

(a) Red areas indicate high sensitivities, implying that the thickness of the corresponding layer has to be increased in order to improve the design value

(b) The resulting local reinforcements are represented by the black areas

Figure 8.22: Sensitivities and local reinforcements of a vibrating plate for which the second and the third eigenfrequencies are separated. Source: Schläpfer [64]


Figure 8.23: Connection between global layers and laminate regions. Source: Zehnder [94]

rather computationally expensive. This method, too, works with a genetic algorithm.


Chapter 9

Selected Methods and Case Studies

9.1 Computer Aided Optimization after Mattheck

The growth-strategy method by Claus Mattheck [95] can, under certain circumstances, reduce geometry-caused stress concentrations in designs more efficiently than other methods. His algorithm, called computer-aided optimization (CAO), is modelled as an analog to the growth behavior of trees, which he derived from observations of nature. He identified the axiom of constant stress as the basic principle of the growth and healing mechanisms of trees. Trees annually form a region where growth takes place, called the cambium, located right under the bark. The thickness growth ensues from the cambium, which grows faster at more highly stressed locations than at others. So, over time, a strength optimization results in the sense of reducing local stress concentrations by increasing the areas of the more highly stressed cross sections.

Mattheck transfers this observed mechanism to an automated optimization method. The structural model of the considered part, suffering reduced strength from stress concentrations, is modeled with a layer of constant thickness underneath its surface along the highly stressed region, as shown in Fig. 9.1. Each iteration of the CAO process consists of

Figure 9.1: Part with stress raising geometry and modelled cambium

two phases. The first phase determines the stress distribution within the cambium, due to the specified geometric boundary conditions and applied forces, by a structural analysis. The stresses are reduced to some equivalent stress distribution. During the second phase, a temperature distribution is derived from the equivalent stress distribution. A tempera-


ture expansion coefficient is appropriately chosen, and Young's modulus of the cambium is reduced by some factor, for instance 400. The structural analysis of the second phase then simulates a growth of the cambium effected by the temperature strains. The strains tend to be higher where higher equivalent stress values were calculated in the preceding first phase. The strains imply a growth of the structural model. The growth is numerically recorded by adding the nodal-point displacements to the nodal-point reference positions. The geometry changes so obtained can have the effect that the stress concentrations due to the mechanical loading are reduced. The process is illustrated in Fig. 9.2. During the

Figure 9.2: Two-phase CAO process after Mattheck

iterations the geometry changes automatically so that the stress concentrations are systematically reduced.
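The two-phase iteration can be sketched on a toy one-dimensional bar, where phase 1 computes a stress from the load and phase 2 "grows" the cross-section where the equivalent stress exceeds a reference value (all quantities are normalized and illustrative; a real CAO run uses a finite element model and thermal expansion of the cambium layer):

```python
import numpy as np

def cao_step(thickness, load, sigma_ref, k=0.1):
    """One CAO iteration on a toy 1D bar: phase 1 computes the stress,
    phase 2 grows the cross-section where the stress exceeds sigma_ref."""
    sigma = load / thickness                  # phase 1: structural analysis
    growth = k * np.maximum(sigma - sigma_ref, 0.0) / sigma_ref
    return thickness * (1.0 + growth)         # phase 2: swelling of the cambium

t = np.array([1.0, 0.5, 1.0])                 # thin mid-section raises the stress
for _ in range(50):
    t = cao_step(t, load=1.0, sigma_ref=1.0)
assert (1.0 / t).max() < 1.05                 # stress concentration is relieved
```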

The advantage of the method, with its ingenious simplicity, lies in the relatively low numerical effort needed to obtain a significantly improved design. The method was used with much success to reduce stress concentrations, and dramatically increase lifetime, in load-carrying automotive parts that could be re-designed to achieve lower weight.

Its disadvantage lies in its very nature as a local approach: solutions more sophisticated than increasing the thickness of highly stressed parts cannot be found by it. For example, the maximum-strength flywheel design, see Section 9.3, could not be solved by CAO.


9.2 Soft-Kill Option after Mattheck

The topology optimization method after Bendsøe and Kikuchi (see Section 9.4) uses as a global objective some specified compliance property resulting from the behavior of the whole structure. Claus Mattheck [95] replaces the minimization of the global compliance objective function with an algorithm, sketched in Fig. 9.3, based on a local optimality criterion. Again, the geometric design space is specified, spanned with

Figure 9.3: Soft-Kill Option process after Mattheck

a finite-element mesh, and geometric boundary conditions as well as forces are specified. An average Young's modulus is initially assigned to each of the finite elements in the design space. A structural analysis obtains a primary displacement solution, the element derivatives of which are used to obtain a stress distribution over the domain. The stresses are combined to establish the distribution of an equivalent stress. The local optimality criterion used by Mattheck assumes that the stiffness of the design will globally increase when Young's modulus is increased in regions with higher stresses and reduced where the stresses are lower. When the stresses fall below a certain threshold, Young's modulus is reduced to zero or, with respect to numerical stability, to small values.
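The local update rule can be sketched as follows (the gain k, the bounds, and the normalized stresses are assumed values; Mattheck's actual implementation details differ):

```python
import numpy as np

def sko_update(E, sigma_eq, sigma_ref, k=0.5, E_min=1e-3, E_max=1.0):
    """Soft-kill update: raise Young's modulus where the equivalent stress
    exceeds the reference value, lower it where it falls below; elements
    clipped down to E_min are effectively 'killed'."""
    return np.clip(E + k * (sigma_eq - sigma_ref), E_min, E_max)

E = np.full(4, 0.5)                          # average initial modulus
sigma = np.array([0.9, 0.1, 0.5, 0.0])       # normalized equivalent stresses
E = sko_update(E, sigma, sigma_ref=0.5)
assert np.allclose(E, [0.7, 0.3, 0.5, 0.25])
```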


9.3 Flywheel Optimization and Inspired Mechanical Model

This section brings together the results of automated shape optimization of a flywheel and Stodola's solution for high-strength turbine disk design. The example shows that computer-aided optimization may inspire the understanding and modeling of complex mechanical situations. Within a flywheel of constant thickness having a central bore, the stresses are distributed as plotted in Fig. 9.4. The normal stress in the radial direction σr must meet the natural boundary condition at the bore and at the rim. The circumferential stress σθ is higher than the radial stress everywhere and has a stress peak at the bore. Because of the inhomogeneous distribution of both the radial and the circumferential stresses, the material of the disk of constant thickness is not economically used.

Figure 9.4: Initial flywheel shape and radial and circumferential stress distributions

9.3.1 Sodola’s Solution

Stodola [24] invented an optimum-strength design for the steam-turbine center disk. The center disk connects the turbine's blades to its drive shaft. Stodola's design essentially avoids a central hole, Fig. 9.5. The inertia forces of the blades acting on the rim of

Figure 9.5: Blueprint of a turbine disk design solution after Stodola

the central disk are taken to be equivalent to an average radial stress which enters the mathematical model as a natural boundary condition. By choosing the more general formulation, allowing for a changing thickness t, of the equilibrium of forces in the radial direction,

(σr t),r + (t/r)(σr − σθ) + t r ω² ρ = 0 ,    (9.1)


and requiring both stresses to be equal and also constant everywhere,

σr = σθ = σ , (9.2)

the general equilibrium equation reduces to a differential equation in which only the thickness remains as a dependent variable,

t,r + (ρω²/σ) r t = 0 ,    (9.3)

and the shape of the evenly stressed flywheel is defined by the solution

t = t0 e^(−ρω²r²/(2σ)) .    (9.4)

It must, however, be kept in mind that the model is based on the radial equilibrium (9.1) only, wherefore it is accurate only if the reference thickness t0 is small compared to the diameter of the disk. Then, the normal stress in the axial direction σz and the shear stress τrz can be neglected. As many flywheels possess a central bore for the purpose of fixing them on a rotating shaft, the study of Stodola's problem provokes the question whether a shape can be found for which an almost even stress distribution exists in the presence of a central bore and in the absence of external radial stress. An answer to this question is found by using a numerical shape optimization scheme in which an FEM-based structural model provides the system equations.

9.3.2 Shape optimization

The objective is to find a shape of the flywheel resulting in an even stress distribution within it. Using a flywheel of constant thickness t0 as the initial configuration, its mass and rotational energy are also to be kept constant. Mathematically, this is expressed by using the standard deviation of the radial stress distribution σ as the objective and by introducing the equality constraints h1 (constant mass) and h2 (constant rotational energy),

O = ∫r (σ − σ̄)² dr = min ,    σ̄ : average stress    (9.5)

h1 = 2πρ ∫r r (t(r) − t0) dr = 0    (9.6)

h2 = 2πρ ∫r r³ (t(r) − t0) dr = 0    (9.7)

The thickness values at the element interfaces in the radial direction take the role of shape parameters, as outlined in Fig. 9.6. Thus, the number of optimization variables depends

Figure 9.6: Parameterization of thickness and mesh generator


linearly on the number of elements in the radial direction but is independent of the number of elements in the axial direction. The positions of the nodes on the element sides always lie on the straight line connecting the respective corner nodes. The objective function is minimized using the method of feasible directions according to the textbook of Vanderplaats [20], as well as exact gradient information from the objective and the constraint functions. Additional corrections, ensuring that the equality constraints remain satisfied, are made after each line search. Quadratic approximations of the objective function are used for quickly finding the minimum value along the current search directions, as described in the textbooks of Reklaitis et al. [33] and Baier et al. [96].
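The quadratic approximation along a search direction fits a parabola through three trial step lengths and jumps to its vertex; a minimal sketch (the interpolation formula is the standard one for successive parabolic interpolation, not taken from the cited textbooks):

```python
def quadratic_min(a0, a1, a2, f0, f1, f2):
    """Vertex of the parabola through (a0, f0), (a1, f1), (a2, f2)."""
    num = (a1 - a0) ** 2 * (f1 - f2) - (a1 - a2) ** 2 * (f1 - f0)
    den = (a1 - a0) * (f1 - f2) - (a1 - a2) * (f1 - f0)
    return a1 - 0.5 * num / den

# Exact in one step for a quadratic objective: f(a) = (a - 3)^2 + 1
f = lambda a: (a - 3.0) ** 2 + 1.0
a_star = quadratic_min(0.0, 1.0, 4.0, f(0.0), f(1.0), f(4.0))
assert abs(a_star - 3.0) < 1e-12
```

For a non-quadratic objective, the step is repeated with the new point replacing the worst of the three trial points until the bracket is sufficiently small.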

9.3.3 Discussion of results

Fig. 9.4 shows the initial configuration, the finite element mesh, and the initial stress distributions. These stress distributions indicate that the region adjacent to the central hole is significantly more highly stressed than other parts of the flywheel.

The shape optimization procedure yields the results plotted in Figs. 9.7 and 9.8. Both the stress distributions and the shape suggest a division of the flywheel into three regions for a characterization of the optimized geometry. Region I, adjacent to the hole, increases progressively in thickness with decreasing distance to the hole. The circumferential stress also increases with decreasing distance to the hole, but its maximum value is much lower than in the initial configuration. The most striking feature of region II is the near constancy of both the radial and the circumferential stresses. Moreover, the values of both stresses are almost equal. The thickness of region II decreases progressively, but at a slow rate, with increasing distance to the hole. Within region III, the shape of which is somewhat reminiscent of a bell, the radial stress drops to zero.

The difference between the results shown in Fig. 9.7 and in Fig. 9.8 lies in the choice of the stress used for the objective. Fig. 9.7 shows the shape corresponding to the most

Figure 9.7: Shape optimized after maximum stress criterion and stress distributions

evenly distributed circumferential stress. Since the circumferential stress has higher values everywhere in the initial configuration, it is identical to the first principal stress, and minimizing the objective represents the engineering aim of seeking a spatially constant probability of brittle failure in the flywheel. Fig. 9.8 shows the shape corresponding to the most evenly distributed von Mises stress. The stress distribution obtained here comes closest to a spatially constant probability of yield failure in the flywheel. Thus, it may be concluded that local details of the shape depend on the choice of stress for the objective function, while global features such as the existence of regions I, II, and III remain

c© ETH Zurich IDMF-CMAS, January 22, 2015

Page 178: ETH Struct Opt

172 Selected Methods and Case Studies

Figure 9.8: Shape optimized after yield stress criterion and stress distributions

the same. Both the shape of region II and the almost even distribution of the stresseswithin it bring to mind Stodola’s evenly stressed turbine disk. The deviations from per-fect constancy of the stresses are explained by the effects of steep thickness changes on thetwo-dimensional stress equilibrium. The hypothesis is proposed that region I alleviatesthe stress perturbations of the central hole and that region III introduces radial stresseson the outer edge of region II which are used in Stodola’s model to simulate the forcesexerted by the blades on the rotating turbine disk.

9.3.4 Simple Prediction of Optimum Shape Features

A mechanical system is proposed, consisting of a disk modelling region II and two discrete rings located at the edges of the disk, modelling regions I and III (Fig. 9.9).

Figure 9.9: Mechanical model consisting of Stodola's disk and two discrete rings representing regions I and III

The disk has a thickness distribution as given by Stodola [24]. The inner and outer rings at positions r1 and r2 have the cross-sectional areas A1 and A2, respectively. The line load Nr, resulting from the integration of the radial stress σr over the thickness, is assumed to be positive, and the circumferential stresses σθ in the rings are
\[
\sigma_{\theta 1} = \rho\omega^2 r_1^2 + \frac{r_1}{A_1}\,N_{r1}\,, \qquad
\sigma_{\theta 2} = \rho\omega^2 r_2^2 - \frac{r_2}{A_2}\,N_{r2}\,,
\]

where the first terms on the right-hand side give the circumferential stress in the independently rotating ring and the second terms make corrections for the tensile radial line load, which tends to pull the inner ring apart and the outer ring together. The rotational symmetry of the problem automatically satisfies the condition of force equilibrium in the circumferential direction. The condition of force equilibrium in the radial direction requires that the radial line loads be continuous at the interfaces between the disk and the rings. If the material of the flywheel is homogeneous, the kinematics are satisfied if the circumferential stress is also continuous. The particular shape of the disk as defined by Stodola's equation guarantees the constant stress distribution indicated in (9.2) as long as the conditions at the discrete rings

\[
N_{ri} = \sigma\, t|_{r_i}\,, \qquad i = 1, 2 \tag{9.8}
\]

apply. This determines the cross-sectional areas of the rings, A1 and A2:
\[
A_1 = \frac{\sigma r_1}{\sigma - \rho\omega^2 r_1^2}\; t_0\, e^{-\frac{\rho\omega^2 r_1^2}{2\sigma}}\,, \qquad
A_2 = \frac{\sigma r_2}{\rho\omega^2 r_2^2 - \sigma}\; t_0\, e^{-\frac{\rho\omega^2 r_2^2}{2\sigma}}\,.
\]

The cross-sectional areas of the rings are positive as long as σ is higher than the circumferential stress due to the inertia body forces that would arise in the rotating inner ring as an isolated system, and lower than the circumferential stress that would arise in the rotating outer ring:
\[
\rho\omega^2 r_1^2 \le \sigma \le \rho\omega^2 r_2^2\,. \tag{9.9}
\]

Then the stress σ, acting in the disk, tends to pull the inner ring apart in addition to its own inertia effect and to restrain the outer ring against the inertia effect acting on it. Thus, the stress σ can be chosen freely as long as it lies between the bounds defined by (9.9). In accordance with the finite-element shape optimization described above, it was decided to fix the mass M as well as the rotational energy U of the disk to the respective values of the initial shape (IS),

\[
M = t_{IS}\,\pi\rho\left(r_2^2 - r_1^2\right) = 2\pi\rho\left(r_1 A_1 + r_2 A_2 + t_0\int_{r_1}^{r_2} r\, e^{-\frac{\rho\omega^2}{2\sigma}r^2}\,dr\right), \tag{9.10}
\]
\[
U = \frac{1}{4}\, t_{IS}\,\pi\rho\,\omega^2\left(r_2^4 - r_1^4\right) = \pi\rho\,\omega^2\left(r_1^3 A_1 + r_2^3 A_2 + t_0\int_{r_1}^{r_2} r^3 e^{-\frac{\rho\omega^2}{2\sigma}r^2}\,dr\right). \tag{9.11}
\]

This can be achieved by adjusting the reference thickness t0 and the constant stress value σ. This stress value will always be lower than the maximum circumferential stress at the edge of the hole in the initial design. The practical significance of the optimized shape lies in the wider choice of less expensive materials that can be used for the flywheel. The integrals in (9.10) and (9.11) cannot be solved algebraically. They can, however, be eliminated by using the identity
\[
\int_{r_1}^{r_2} r\,e^{a r^2}\,dr \;\equiv\; \frac{r^2}{2}\, e^{a r^2}\Big|_{r_1}^{r_2} - a\int_{r_1}^{r_2} r^3 e^{a r^2}\,dr\,, \qquad a = -\frac{\rho\omega^2}{2\sigma}\,. \tag{9.12}
\]

Combining (9.10), (9.11), and (9.12) leads to the stress σ,
\[
\sigma = \rho\,\frac{U}{M} = \frac{1}{4}\,\rho\omega^2\left(r_1^2 + r_2^2\right). \tag{9.13}
\]

The maximum circumferential stress at the edge of the hole of the initial configuration depends on the material's Poisson's ratio ν,
\[
\sigma_{\theta IS} = \frac{1}{4}\,\rho\omega^2\left[r_2^2\,(3+\nu) + r_1^2\,(1-\nu)\right]. \tag{9.14}
\]


Its value, even when neglecting the influence of Poisson's ratio, can be up to three times higher than that of the constant stress of the optimized flywheel. Substituting the result given in (9.13) into the exponent of the thickness function of the disk,
\[
t(r) = t_0\, e^{-\frac{\rho\omega^2}{2\sigma}r^2} = t_0\, e^{-\frac{M\omega^2}{2U}r^2} = t_0\, e^{-\frac{2r^2}{r_2^2+r_1^2}}\,, \tag{9.15}
\]

reveals that the diameters of the rim and the central hole uniquely determine the shape of the optimized disk as well as the ring areas,
\[
A_1 = t_0\,\frac{r_2^2 + r_1^2}{r_2^2 - 3r_1^2}\; r_1\, e^{-\frac{2r_1^2}{r_2^2+r_1^2}}\,, \qquad
A_2 = t_0\,\frac{r_2^2 + r_1^2}{3r_2^2 - r_1^2}\; r_2\, e^{-\frac{2r_2^2}{r_2^2+r_1^2}}\,. \tag{9.16}
\]

The condition stated by (9.9) can now be written as
\[
r_2 \ge \sqrt{3}\, r_1\,. \tag{9.17}
\]

Under the constraints of having the same mass and rotational energy as a reference flywheel of constant thickness, a shape for constant stressing can be found only when the outer diameter is at least √3 ≈ 1.732 times the diameter of the central hole. Introducing the ratio α of the two radii,
\[
\alpha = \frac{r_1}{r_2}\,, \qquad 0 \le \alpha \le \frac{1}{\sqrt{3}}\,, \tag{9.18}
\]

yields more transparent equations for the ring areas and the thickness function of the disk:
\[
A_1 = t_0\,\frac{1+\alpha^2}{1-3\alpha^2}\; r_1\, e^{-\frac{2\alpha^2}{1+\alpha^2}}\,, \qquad
A_2 = t_0\,\frac{1+\alpha^2}{3-\alpha^2}\; r_2\, e^{-\frac{2}{1+\alpha^2}}\,, \qquad
t(r) = t_0\, e^{-\frac{2(r/r_2)^2}{1+\alpha^2}}\,. \tag{9.19}
\]
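As a numerical cross-check of the simplified model, the Python sketch below (our own illustration, not part of the text's FORTRAN program; the function name and the midpoint-rule quadrature are our choices) evaluates the ring areas of (9.19) and fixes t0 from the mass constraint (9.10):

```python
import math

def flywheel_shape(alpha, r2=1.0, t_IS=0.01, n=2000):
    """Evaluate the simplified-model shape (9.19) for a diameter ratio alpha.

    Returns (t0, A1, A2). t0 follows from the mass constraint (9.10), whose
    disk integral is evaluated with the midpoint rule; the common factor
    2*pi*rho cancels from both sides of (9.10).
    """
    if not 0.0 < alpha < 1.0 / math.sqrt(3.0):
        raise ValueError("even-stress design requires 0 < alpha < 1/sqrt(3), eq. (9.18)")
    r1 = alpha * r2
    # ring areas per unit t0, eq. (9.19)
    a1 = (1 + alpha**2) / (1 - 3 * alpha**2) * r1 * math.exp(-2 * alpha**2 / (1 + alpha**2))
    a2 = (1 + alpha**2) / (3 - alpha**2) * r2 * math.exp(-2 / (1 + alpha**2))
    # disk integral of r * t(r)/t0 with t(r)/t0 = exp(-2*(r/r2)^2/(1+alpha^2))
    h = (r2 - r1) / n
    integral = sum((r1 + (i + 0.5) * h) *
                   math.exp(-2 * ((r1 + (i + 0.5) * h) / r2) ** 2 / (1 + alpha**2))
                   for i in range(n)) * h
    # mass constraint (9.10): t_IS*(r2^2 - r1^2)/2 = t0*(r1*a1 + r2*a2 + integral)
    t0 = t_IS * (r2**2 - r1**2) / 2.0 / (r1 * a1 + r2 * a2 + integral)
    return t0, t0 * a1, t0 * a2
```

The raised `ValueError` mirrors the existence condition (9.18).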

The ring areas depend on the diameter ratio as well as linearly on the absolute size of the flywheel. The thickness function is independent of the diameter of the flywheel. Thus, the ratio α uniquely determines the overall shape of the optimized flywheel. In order to determine the remaining model parameter t0, it is necessary to evaluate one of the integrals in (9.10) or (9.11), for instance by using a numerical integration scheme.

Figure 9.10 gives the areas of the two rings and the disk over the full permitted range of α as indicated by (9.18). The considered initial-shape flywheels have a unit radius and a thickness of one hundredth. In the case of very small values of α, the disk and the outer ring A2 dominate the area of the optimized flywheel. With increasing α, the area of the inner ring A1 quickly increases and that of the disk quickly decreases. For values of α higher than 0.3, the inner ring tends to take up the major portion of the total area. The area of the outer ring decreases slowly with increasing α. As α approaches the upper limit, the whole design degenerates in the sense that all the available area moves into the inner ring.

Figure 9.10: Relative values of the areas of the inner and outer rings and the disk

Figure 9.11: Optimum shapes for different values of the diameter ratio. Source: [2]

The shapes according to the simplified model are visualized on the left-hand side of Fig. 9.11, where ring areas are represented by solid circles. On the right-hand side of Fig. 9.11, the results of the numerical two-dimensional FEM analysis, based on the same input data, are shown. It appears that the shape results agree well for values of α up to 0.3, which seems remarkable in view of the crudeness of the simplified model. The main difference in shape is that the extra areas at the inner and the outer rim are more smoothly distributed in the FEM model than in the simplified model. The two models agree in the extra area at the outer rim remaining fairly constant, the extra area at the inner rim increasing quickly, and the thickness of the disk decreasing, all with increasing values of α.

The reason for the deteriorating agreement between the two models with increasing values of α becomes obvious when examining the stress plot in Fig. 9.7. The radial stress has to vanish at the boundaries, causing a perturbation of the even stress distribution. The perturbation is modeled well by the FE method but is not considered by the simplified one-dimensional model. Thus, the shape plotted in Fig. 9.7 is optimal although the stresses are not even throughout the domain. Equation (9.13) also implies that the capacity of an evenly stressed flywheel for storing kinetic energy per unit mass equals the specific strength of the material,

\[
\frac{U}{M} = \sigma^{*} = \frac{\sigma}{\rho}\,. \tag{9.20}
\]

This provides an immediate estimate of the energy storage capacity per unit mass at optimum shape for any given material.
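For instance, a short calculation after (9.20); the material values below are illustrative assumptions of our own, not data from the text:

```python
# Specific energy-storing capacity at optimum shape, eq. (9.20): U/M = sigma/rho.
# The material values are illustrative assumptions (roughly steel-like).
sigma = 450e6            # constant design stress in Pa (assumed)
rho = 7800.0             # mass density in kg/m^3 (assumed)
specific_energy = sigma / rho          # J/kg
print(f"{specific_energy / 1e3:.1f} kJ/kg")
```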


9.3.5 Conclusions on the Flywheel Optimization and Modeling

A procedure based on FEM analysis for finding a maximum-strength design for flywheels with a central bore has been coded in FORTRAN and explained. The program yields satisfactory results which approximate the objective of even stressing. A simplified analytical model yields the rough shape features of the maximum-strength design and elucidates the underlying essential mechanisms. According to the simplified model, an even-stress shape design does not exist for bore radii greater than 1/√3 times the radius of the flywheel. The optimum-shape predictions of both models agree well if the radius of the bore does not exceed approximately one third of the radius of the disk. The stress level in the optimized design is approximately only one third of the maximum circumferential stress at the edge of the bore in a flywheel of constant thickness. The specific energy-storing capacity of the evenly stressed flywheel according to the closed-form solution of the simplified model equals the specific strength of the material being used.


9.4 Topology Optimization after Bendsøe and Kikuchi

Topology optimization of solid structures involves the determination of features such as the number and location of holes and the connectivity of the domain. The only known quantities in the problem are the applied loads, the possible support conditions, the volume or domain shape of the structure, and some prescribed topological items such as the size and location of prescribed holes. Topology optimization is well established for minimum compliance design, which obtains a design proposal for highest structural stiffness under the specified geometric and natural boundary conditions and a specified total amount of material to be distributed in the physical design space. The design space is modelled by a finite-element mesh, and the deformation of the given volume is numerically evaluated by the finite-element method for any kind of geometric boundary conditions and loads. The distribution of the material is initially constant and then changed so that a topology of geometrically favorable regions with maximum density and void regions is created. Any element in the often uniform finite-element mesh maps either material or a void, rendering the topology optimization problem discrete.

Finding a solution by checking all possible combinations appears impractical, since the number of topologies nT increases exponentially with the number of finite elements n,

\[
n_T = 2^n\,, \tag{9.21}
\]

and the sketch in Fig. 9.12 illustrates the equation with n = 4.

Figure 9.12: Illustration of topologies for a mesh of 1 by 4 elements

The moderate-sized sample problem given in Fig. 9.14 uses n = 300 finite elements, and the number of unconstrained topologies is nT = 2^300 ≈ 2.037 × 10^90. Of course, a mass, or volume, constraint reduces the number of feasible topologies drastically. Let n continue to give the mesh size and introduce m to give the number of elements that must be filled with material due to a specified average density in the design space. Then the number of combinations is only

\[
n_T = \frac{n!}{m!\,(n-m)!} \tag{9.22}
\]

and Fig. 9.12 illustrates how, for instance, a mean density ρ̄ = 0.5 reduces the number of solutions from 16 to 6. The number of combinations for the sample problem with n = 300 and a specified mean material density ρ̄ = 0.25, giving m = 75, is nT ≈ 9.796 × 10^71. A further constraint arises from the consideration that the topology must at least connect the boundaries where displacements are prescribed with those where non-vanishing stresses or forces are applied. Design solutions violating that requirement are called illegal. A simple formula for obtaining the legal solutions for any mesh does of course not exist, but the illustrative sample shown in Fig. 9.13 obtains that, under the mass constraint ρ̄ = 0.5, only four of the 924 design solutions after (9.22) are legal.

Figure 9.13: The legal (top) and some illegal (bottom) topologies with 4 by 3 elements

Here it would appear attractive to investigate the possibilities of stochastic or even exhaustive search, provided that illegal topologies could be systematically identified and kept from being numerically evaluated. However, the minimum compliance problem has been proven to be convex, and the well-known homogenization approach of Bendsøe and Kikuchi [12] uses a distributed-function parameterization rendering the objective function continuous. Each element is equipped with a variable density function whose minimum value is zero, representing a void, and whose maximum value represents the density of the massive material. Between those bounds are intermediate states of more or less thinned-out material. These intermediate states are not wanted in the resulting topology, or layout, but are accepted in order to enable a continuous optimization process. The final design should have only material-filled or void elements, and the filled elements represent the generated shape of the part to be designed. Then the best design solution can be found quickly by using some method provided by mathematical programming.
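The counting arguments of (9.21) and (9.22) can be checked directly, e.g. with Python's `math.comb` (the function name `topology_counts` is our own):

```python
from math import comb

def topology_counts(n, m):
    """Unconstrained topologies per (9.21) and mass-constrained ones per (9.22)."""
    return 2 ** n, comb(n, m)

print(topology_counts(4, 2))    # 1-by-4 mesh, half filled: (16, 6)
print(topology_counts(12, 6))   # 4-by-3 mesh of Fig. 9.13: (4096, 924)
```

For the n = 300, m = 75 sample problem, the same call reproduces the magnitudes quoted above.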

9.4.1 Topology Optimization Sample Problem

These points are illustrated by the example shown in Fig. 9.14. The rectangular design space is meshed with 10 by 30 finite elements. The boundary conditions are not shown in the figure: all nodes on the left vertical part of the boundary are fixed (homogeneous geometric boundary condition), and a downward-oriented concentrated force acts upon the nodal point at the lower right corner.

Figure 9.14: Topology optimization example: initial (a), intermediate (b through e), and final (f) states for an average density ρ = 0.252

The compliance of the initial configuration, Fig. 9.14(a), is high enough to reveal the deformation of the finite-element mesh, with the highest displacement at the node where the force acts. The displacement scale factor is the same for all plots, and it can be seen that the mesh deformation reduces quickly through the optimization process. The mass density is represented by gray level. Unfortunately, there are not enough different gray levels to portray the density differences between elements in more detail. Nevertheless, it can be seen that already after the first iteration step a differentiated mass density distribution with concentrations at the corners of the clamped boundary emerges, Fig. 9.14(b). The state after 20 iterations, Fig. 9.14(c), is reminiscent of a sandwich design where less strained regions are already thinned out. After 40 iterations, some differentiation in the interior in terms of diagonally arranged members emerges, Fig. 9.14(d). The intermediate crossing-members design, emerging after 60 iterations and shown in Fig. 9.14(e), dissolves again in favor of the simple diagonal member pattern of the final design obtained after 83 iterations and shown in Fig. 9.14(f).

9.4.2 Objective Function and Design Evaluation

The objective of the highest possible structural stiffness is elegantly cast into a global objective function by minimizing the work W done by the external forces r along the conjugate displacements u caused by the forces:
\[
W = u^T r = \min. \tag{9.23}
\]

The load vector r remains constant. The displacements u depend on the state of the design parameters and require the solution of the system of equations
\[
K u = r \tag{9.24}
\]
assembled by the FEM model. The displacements are discrete values defined at the nodal points of the finite-element mesh. Then, the objective function calculation itself requires only the evaluation of the scalar product (9.23).
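In code, a compliance evaluation per (9.23)-(9.24) amounts to one linear solve and one dot product; the 3-DOF stiffness matrix and load below are made-up illustration data, not the text's example:

```python
import numpy as np

K = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  1.0]])   # symmetric positive-definite stiffness matrix
r = np.array([0.0, 0.0, 1.0])        # load vector, kept constant
u = np.linalg.solve(K, r)            # nodal displacements, eq. (9.24)
W = u @ r                            # compliance W = u^T r, eq. (9.23)
print(W)
```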

9.4.3 Parameterization

The structural stiffness depends on the stiffness contributions of the finite elements. The finite elements do not change their size or shape, but their stiffness is controlled by Young's modulus E, which depends in turn on the density distribution function ρ (Fig. 9.15):

Figure 9.15: Young's modulus and density

\[
E = E_0\,\rho^p\,, \qquad \varepsilon \le \rho \le 1\,, \tag{9.25}
\]


where E0 is a reference stiffness value and ε is a small but finite number, so that ρ is prevented from becoming exactly zero for numerical reasons. The exponent in (9.25), usually chosen from the range 3 ≤ p ≤ 4, is justified by some material-modelling considerations but also has the effect of moving Young's modulus toward its bounds more quickly. In the original approach, the values of the density functions ρ are directly related to each finite element and used as design variables. The lower bound signifies a void and the upper bound signifies a region of massive material extending over the sub-domain of just one finite element. Thus, there are as many design variables as finite elements. If the vector x denotes position in the design space, the density distribution function ρ(x) of the sought topology result is
\[
\rho(x) =
\begin{cases}
\varepsilon & \text{if } x \in \Omega_v \\
1 & \text{if } x \in \Omega_s
\end{cases} \tag{9.26}
\]

where Ωv and Ωs denote the void and solid regions in the design space Ω, respectively. So we have a typical case of a mesh-dependent parameterization, and the number of variable design parameters, or optimization variables, may become quite high: the relatively simple example with its coarse mesh presented in Section 9.4.1 already features 300 independent optimization variables. An evenly spaced mesh, where all finite elements have the same shape and size, reduces the numerical effort because all element matrices can be traced back to a reference matrix K⁰, so that
\[
K_k = E_k\, K^0 \tag{9.27}
\]
and the reference element stiffness matrix needs to be calculated only once. The stiffness matrix K of the whole structure, appearing in the system of equations (9.24), is thus assembled from the element matrices by
\[
K = E_0\, K^0 \sum_{k=1}^{NEL} \rho_k^p\,. \tag{9.28}
\]

Fig. 9.16 illustrates how the nodal point addresses are used to connect the element stiffness matrix entries with those of the system matrix. Finally, it is important to note that the

Figure 9.16: Illustration to the assembly of the global stiffness matrix

total mass must be constrained to some initially specified value M0, or mean density ρ̄ in the volume V,
\[
\sum_{i=1}^{NEL} \rho_i V_i = M_0 = \bar{\rho}\, V\,, \tag{9.29}
\]
where Vi is the volume of one finite element, because otherwise the objective of maximum structural stiffness would be reached by the trivial solution that the design space Ω is completely filled with material.
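A minimal assembly sketch following (9.25) and (9.27)-(9.29); a 1D chain of two-node elements stands in for the 2D mesh, and the densities and sizes are made-up values of our own:

```python
import numpy as np

E0, p = 1.0, 3
K0 = np.array([[1.0, -1.0], [-1.0, 1.0]])  # reference element matrix K^0
rho = np.array([1.0, 0.5, 0.01, 1.0])      # element densities (design variables)

nel = len(rho)
K = np.zeros((nel + 1, nel + 1))
for k in range(nel):
    dofs = [k, k + 1]                       # nodal addresses of element k
    # element stiffness E_k * K0 with E_k = E0 * rho_k^p, eqs. (9.25), (9.27)
    K[np.ix_(dofs, dofs)] += E0 * rho[k] ** p * K0

# mass per (9.29) with unit element volumes
V_i = 1.0
M = rho.sum() * V_i
print(M)
```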

9.4.4 Optimization Problem Statement

The topology optimization problem is summarized as

\[
\min_{\rho \in \mathbb{R}^n} \left\{\, W(\rho) \;\middle|\; h_M(\rho) = 0\,,\; g^+ \le 0\,,\; g^- \le 0 \,\right\} \tag{9.30}
\]

The equality constraining function for constant mass, hM(ρ), is
\[
h_M(\rho) = \sum_{k=1}^{NEL} \rho_k V_k - M_0 = 0\,. \tag{9.31}
\]

The side constraints on the design variables ρi are cast into the inequality constraining functions
\[
g_i^+ = \rho_{min} - \rho_i \le 0 \tag{9.32}
\]
and
\[
g_i^- = \rho_i - 1 \le 0\,. \tag{9.33}
\]

9.4.5 Lagrange Function

The optimization problem with objective and constraining functions can be expressed by a Lagrange function L(ρ, Λ, λ⁺, λ⁻):
\[
L(\rho,\Lambda,\lambda^+,\lambda^-) = W(\rho) + \Lambda\, h_M(\rho) + \sum_{i=1}^{NEL} \lambda_i^+ g_i^+ + \sum_{i=1}^{NEL} \lambda_i^- g_i^- \tag{9.34}
\]

The optimal topology is obtained at a saddle point of the Lagrangian, where its derivative with respect to the optimization variables must vanish:
\[
\frac{\partial L(\rho,\Lambda,\lambda^+,\lambda^-)}{\partial \rho_i} = 0 \tag{9.35}
\]

The next step is to derive the gradients of all parts of the Lagrange function.


9.4.6 Gradients with Respect to Densities

The large number of optimization variables of this type of topology optimization problem calls for a very efficient optimization procedure. The ingenious step of Bendsøe's and Kikuchi's method, the introduction of smooth density functions as optimization variables into the otherwise discrete-natured problem, allows using one of the methods of mathematical programming.

Objective Function Gradient

Specifically, gradient calculation after the sensitivity formula, described in Section 6.10, is extremely simple and numerically effective, as the following derivation [96] shows. The derivative of the objective function (9.23) with respect to the ith design variable ρi is
\[
\frac{\partial W}{\partial \rho_i} = \left(\frac{\partial u}{\partial \rho_i}\right)^T r + u^T \frac{\partial r}{\partial \rho_i}\,. \tag{9.36}
\]

The sensitivity of the displacement solution u to the optimization variable ρi is described by the sensitivity formula (6.129). Substituting it into (9.36) yields
\[
\frac{\partial W}{\partial \rho_i} = \left(\frac{\partial r}{\partial \rho_i} - \frac{\partial K}{\partial \rho_i}\, u\right)^T K^{-T} r + u^T \frac{\partial r}{\partial \rho_i}\,. \tag{9.37}
\]

Because of the symmetry of the global stiffness matrix K we have that K⁻ᵀr = u, and therefore (9.37) simplifies to
\[
\frac{\partial W}{\partial \rho_i} = 2\, u^T \frac{\partial r}{\partial \rho_i} - u^T \frac{\partial K}{\partial \rho_i}\, u\,. \tag{9.38}
\]

A change of the density value ρi of the ith finite element influences only the stiffness of that element and not the stiffness of any other element. If, in the case of density-dependent weight, it also influences only the load of the ith finite element and not the load of other elements, the sensitivity of the objective with respect to the variable ρi can be calculated on the element level,
\[
\frac{\partial W}{\partial \rho_i} = 2\, u_i^T \frac{\partial r_i}{\partial \rho_i} - u_i^T \frac{\partial K_i}{\partial \rho_i}\, u_i\,, \tag{9.39}
\]

where ui and Ki are the displacement vector and stiffness matrix of the ith finite element. Fixed loads do not depend on the state of ρi and vanish from (9.39). With a reference stiffness matrix K⁰ for a unit Young's modulus E = 1, the stiffness matrix of the ith finite element is obtained, here with p = 4, by
\[
K_i = \rho_i^4\, K^0\,. \tag{9.40}
\]

The reference stiffness matrix must be calculated only once if the mesh is a rectangular array or otherwise regular in a way such that all elements have the same geometry. This reduces the numerical effort for the assembly of the global stiffness matrix drastically. Returning to our gradient-calculation problem, we discover that the derivative of the stiffness matrix can now be obtained analytically,
\[
\frac{\partial K_i}{\partial \rho_i} = 4\,\rho_i^3\, K^0\,. \tag{9.41}
\]

Last, we assume fixed loads and insert (9.41) into the element-level sensitivity equation (9.39):
\[
\frac{\partial W}{\partial \rho_i} = -4\,\rho_i^3\; u_i^T K^0 u_i\,. \tag{9.42}
\]


Constraining Functions Gradients

From (9.31), (9.32), and (9.33) it follows by differentiation that
\[
\frac{\partial h_M(\rho)}{\partial \rho_i} = V_i\,, \tag{9.43}
\]
\[
\frac{\partial g_i^+(\rho)}{\partial \rho_i} = -1\,, \tag{9.44}
\]
\[
\frac{\partial g_i^-(\rho)}{\partial \rho_i} = 1\,. \tag{9.45}
\]

9.4.7 Calculation of Lagrange Factors

With the just-calculated gradient information, the optimality condition (9.35) can be written in detail,
\[
\frac{\partial L}{\partial \rho_i} = -p E_0 \rho_i^{p-1} u_i^T K^0 u_i + \Lambda V_i - \lambda_i^+ + \lambda_i^- = 0\,, \tag{9.46}
\]

but it remains to calculate the hitherto unknown Lagrange multipliers. From Section 3.5.3 it is known that
\[
\Lambda = \frac{(\nabla W)^T \nabla h}{(\nabla h)^T \nabla h} \tag{9.47}
\]

if all inequality conditions are inactive. Since all element volumes are equal, Vk = V/NEL, the inner product of the gradient of the equality constraining function with itself is
\[
(\nabla h)^T \nabla h = NEL\; V_k^2 = \frac{V^2}{NEL}\,. \tag{9.48}
\]

The inner product of the gradients of the objective and the equality constraining functions is
\[
(\nabla W)^T \nabla h = -p\,\frac{V}{NEL}\, E_0 \sum_{k=1}^{NEL} \rho_k^{p-1} u_k^T K^0 u_k\,. \tag{9.49}
\]

Inserting the results into (9.47) gives, after simplification,
\[
\Lambda = -\frac{p}{V}\, E_0 \sum_{k=1}^{NEL} \rho_k^{p-1} u_k^T K^0 u_k\,. \tag{9.50}
\]

Substituting the result into (9.46) and taking the negative of the gradient gives a search direction along which the mass remains the same:
\[
s_i = p E_0 \rho_i^{p-1} u_i^T K^0 u_i + \frac{p}{V}\, E_0\, V_i \sum_{k=1}^{NEL} \rho_k^{p-1} u_k^T K^0 u_k\,. \tag{9.51}
\]

Since it has been assumed that the inequality constraints are inactive, it must be discussed what happens if they become active. Then, since they apply individually to the density functions, the respective densities can be considered constants and the respective entries in all gradients can be set equal to zero:
\[
\rho_i \le \rho_{min} \;\wedge\; \nabla W_i > 0 \;\;\rightarrow\;\; \rho_i = \rho_{min}\,;\;\; \nabla h_i = 0\,, \tag{9.52}
\]
\[
\rho_i \ge 1 \;\wedge\; \nabla W_i < 0 \;\;\rightarrow\;\; \rho_i = 1\,;\;\; \nabla h_i = 0\,. \tag{9.53}
\]
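The mass-conserving property of the direction (9.51) — it is the negative objective gradient projected orthogonal to ∇h — can be seen in a few lines; the element "energies" below are arbitrary stand-in values of our own:

```python
import numpy as np

# stand-in values of p*E0*rho_k^(p-1) * u_k^T K0 u_k per element
e = np.array([3.0, 1.0, 0.5, 2.0])
grad_W = -e                              # objective-gradient entries, cf. (9.42)
V_i = np.full(4, 0.25)                   # element volumes = grad h entries, (9.43)

Lam = (grad_W @ V_i) / (V_i @ V_i)       # Lagrange multiplier, eq. (9.47)
s = -(grad_W - Lam * V_i)                # search direction, eq. (9.51)
print(s @ V_i)                           # ~ 0: mass is conserved along s
```

Highly strained elements (large entries of `e`) receive positive direction components and thus gain material, while weakly strained ones lose it.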


9.4.8 A Dual Algorithm for Topology Design

The same topology optimization problem has been visited by Jog [97], who uses a dual formulation of the problem to find a different solution method. We present his derivations and results but continue to use, where possible, the notation we have become familiar with. He uses a more general expression for the compliance, or external work:

\[
W = \int_{\Gamma_\sigma} \bar{\sigma}^T u\, d\Gamma - \int_{\Gamma_u} \sigma^T \bar{u}\, d\Gamma + \int_{\Omega} b^T u\, d\Omega \tag{9.54}
\]

where σ̄ and ū are prescribed natural and geometric boundary conditions, respectively, and b is a distributed body force. The first term on the right-hand side carries the same meaning as (9.23), with the only difference being that the latter is expressed with the discrete forces and displacements of a finite element model. The second term gives the compliance due to prescribed displacement rather than traction; for a stiff structure one wishes the reaction stress to be as high as possible, which explains the negative sign. The third term gives the work done by a distributed body force such as self-weight (sometimes also called dead weight). The total potential energy Π of an elastic system is the difference between the stored elastic deformation energy U and the external work W, or
\[
\Pi = U - W\,. \tag{9.55}
\]

The principle of the minimum of the total potential energy requires that Π be a minimum with respect to the displacement. Then, at the equilibrium point, Clapeyron's theorem states that Π = −W/2, or
\[
\Pi(u,\rho) = \int_{\Omega} u\, d\Omega - \int_{\Gamma_\sigma} \bar{\sigma}^T u\, d\Gamma - \int_{\Omega} b^T u\, d\Omega\,, \tag{9.56}
\]
where u denotes the strain energy density function (please distinguish this from the bold-printed displacement), which for a linear elastic material is given by
\[
u = \tfrac{1}{2}\,\varepsilon^T C\, \varepsilon\,, \tag{9.57}
\]
where ε denotes the linearized strain tensor in vector notation and C the material law in matrix notation.

Jog defines the dual problem by the task of finding the Lagrange multiplier Λ associated with the volume constraint that solves

where ε denotes the linearized strain tensor in vector notation and C the materials law inmatrix notation.Jog defines the dual problem by the task of finding the Lagrange multiplier Λ associatedwith the volume constraint that solves

minΛL(Λ), Λ > 0 (9.58)

where

L(Λ) = maxρ

minu

Π(u, ρ)− Λ

(∫ΩρdΩ−M0

)(9.59)

From Section 5.7.4 we recall that dual formulations require that the primal problem consist of separable functions. Jog proceeds by establishing a separable approximation for L(Λ) and approximates the total potential energy by using the reciprocal variable κ to denote the field 1/ρ. With u0 denoting the equilibrium displacement field corresponding to the design ρ0, a first-order approximation for Π by Taylor series is given by
\[
\Pi(u_0,\rho) \approx \Pi(u_0,\rho_0) + \int_{\Omega} \left.\frac{\partial u}{\partial \kappa}\right|_{\kappa_0} \left(\frac{1}{\rho} - \frac{1}{\rho_0}\right) d\Omega\,. \tag{9.60}
\]


The total potential energy can now be maximized without considering the constant term. Jog next introduces the discrete representation corresponding to the finite element method. Then the Lagrangian can be written as a sum over the separate contributions of each finite element, or
\[
L(\Lambda) = \max_{\rho}\left\{ \sum_i \int_{\Omega_i} \left.\frac{\partial u}{\partial \kappa}\right|_{\kappa_0} \left(\frac{1}{\rho} - \frac{1}{\rho_0}\right) d\Omega - \Lambda\left(\sum_i \int_{\Omega_i} \rho\, d\Omega - M_0\right)\right\}, \tag{9.61}
\]

where Ωi is the domain occupied by the ith finite element. Since the density values are assumed to be constant over each element, the densities can be moved out of the integrals, and integrals can be replaced by summations. Thus we get
\[
L(\Lambda) = \max_{\rho} \sum_i \left[ \rho_{0i}^2 \left(\frac{1}{\rho_{0i}} - \frac{1}{\rho_i}\right) \int_{\Omega_i} \left.\frac{\partial u}{\partial \rho_i}\right|_{\rho_{0i}} d\Omega - \Lambda\, \rho_i V_i \right] + \Lambda M_0\,. \tag{9.62}
\]

The maximization is carried out by maximizing each term under the summation sign, or
\[
L(\Lambda) = \sum_i \max_{\rho_i} \left[ \rho_{0i}^2 \left(\frac{1}{\rho_{0i}} - \frac{1}{\rho_i}\right) \int_{\Omega_i} \left.\frac{\partial u}{\partial \rho_i}\right|_{\rho_{0i}} d\Omega - \Lambda\, \rho_i V_i \right] + \Lambda M_0\,. \tag{9.63}
\]

The density ρi can take on either of the two values ρmin or ρmax. If the maximum of the term in square brackets occurs at the value ρmax, it holds that
\[
\rho_{0i}^2 \left(\frac{1}{\rho_{0i}} - \frac{1}{\rho_{max}}\right) \int_{\Omega_i} \left.\frac{\partial u}{\partial \rho_i}\right|_{\rho_{0i}} d\Omega - \Lambda\, \rho_{max} V_i
\;>\;
\rho_{0i}^2 \left(\frac{1}{\rho_{0i}} - \frac{1}{\rho_{min}}\right) \int_{\Omega_i} \left.\frac{\partial u}{\partial \rho_i}\right|_{\rho_{0i}} d\Omega - \Lambda\, \rho_{min} V_i \tag{9.64}
\]
and simplification leads to the condition
\[
\rho_i = \rho_{max} \quad\text{if}\quad \frac{\rho_{0i}^2}{\rho_{max}\rho_{min}} \int_{\Omega_i} \left.\frac{\partial u}{\partial \rho_i}\right|_{\rho_{0i}} d\Omega > \Lambda V_i\,. \tag{9.65}
\]

The other case gives the condition
\[
\rho_i = \rho_{min} \quad\text{if}\quad \frac{\rho_{0i}^2}{\rho_{max}\rho_{min}} \int_{\Omega_i} \left.\frac{\partial u}{\partial \rho_i}\right|_{\rho_{0i}} d\Omega < \Lambda V_i\,. \tag{9.66}
\]

In fact, the dual problem has only the single dual variable Λ, which needs to be found by minimizing the Lagrangian; all other variables follow from the above conditions. The dual optimization problem requires an iterative procedure: in alternation, the displacement field is solved for a given density distribution, and the Lagrangian function is then minimized with respect to the Lagrange multiplier Λ. A line search is conducted for the Lagrange multiplier only. Its value is increased if the current mass is higher than the specified mass, and vice versa. Jog reports difficulty with obtaining topologies close to the optimum, arising from the fact that the linear approximations of the Lagrangian close to a reference point permit only small changes in the design variables. As a remedy he uses a filtering technique.
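A sketch of the line search on Λ (our own bisection variant with made-up element sensitivities, not Jog's implementation): for a trial Λ each density snaps to ρmax or ρmin by conditions (9.65)/(9.66), and Λ is adjusted until the mass target is met as closely as the discrete densities allow.

```python
import numpy as np

def densities_for(lam, g, V, rho_min, rho_max):
    """Bang-bang density update by (9.65)/(9.66) for a trial multiplier lam."""
    rho = np.where(g > lam * V, rho_max, rho_min)
    return rho, float(rho @ V)

g = np.array([5.0, 3.0, 2.5, 1.0, 0.5, 0.1])   # per-element sensitivity measures
V = np.full(6, 1.0)                             # element volumes
rho_min, rho_max, M0 = 0.01, 1.0, 3.0           # mass target: about 3 solid elements

lo, hi = 0.0, float(g.max()) + 1.0              # mass decreases stepwise as lam grows
for _ in range(60):
    lam = 0.5 * (lo + hi)
    _, M = densities_for(lam, g, V, rho_min, rho_max)
    if M > M0:
        lo = lam                                # still too heavy -> raise the multiplier
    else:
        hi = lam
rho, M = densities_for(lo, g, V, rho_min, rho_max)  # heavier side of the bracket
print(rho, M)
```

Because the densities are discrete, the achievable mass is a step function of Λ; the bisection brackets the jump closest to the target.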


9.4.9 Sample Topology Design Problem

One of the sample problems used by Jog [97] is explained with Fig. 9.17. Jog uses a Young's modulus of E = 2.1 × 10^7 and a Poisson's ratio of 0.25; the force acts at the domain center, and the average mass density is 30 percent. He utilizes the existing symmetry and uses for the half-model a mesh of 40 × 25 finite elements with quadratic shape functions (Q9).

Figure 9.17: Topology design problem considered by Jog. Source: Jog [97]

His solution, see Fig. 9.18, is obtained with the dual problem and is here compared with a solution, see Fig. 9.19, obtained with the demonstration program TOP, which is based on the primal problem.

Figure 9.18: Topology design solution after Jog. Source: Jog [97]

Figure 9.19: Topology design solutions with half (left) and full (right) models

The performance data, as reported by Jog and obtained with our program TOP, are summarized in Table 9.1.

Table 9.1: Topology optimization process performance data

                  dual     primal half   primal full
  iterations      ≈ 200    142           272
  compliance      2087     2088          2117
  element type    Q9       L4            L4

Using a half model with TOP yields a design with almost the same compliance as the one obtained by Jog; it exhibits more of the expected symmetries, and the number of iterations is even smaller.


9.5 Truss Optimization with Ground Structure Approach

The ground structure approach is well established for the optimization of the geometry and topology of trusses. A number of mathematical formulations for solving the problem of finding minimum compliance are explained in the review paper [98], and the material presented here is selected from it. The layout of a truss structure is found by allowing a certain set of connections between a fixed set of nodal points as active structural members or vanishing members. Fig. 9.20 exemplifies a ground structure in two dimensions with fifteen points and four pre-defined sets of connections of increasing complexity.

Figure 9.20: Various ground structures

In the case of the so-called complete ground structure, where each nodal point is connected with all others and which is illustrated by Fig. 9.20(d), the number of bars, m, increases with the square of the number of points, n, by
\[
m = \frac{1}{2}\, n\,(n-1)\,. \tag{9.67}
\]

Then the number of bars is much higher than the number of degrees-of-freedom of the truss-structure finite-element model. Please note that with the topology optimization method,commented in Section 9.4, the numbers of density variables and degrees-of-freedom growonly linearly with the mesh density. Also, the full ground structure produces a structuralstiffness matrix lacking any sparseness and bandedness. It should therefore not surprisethat, for the two approaches, the respective most efficient problem formulations and solu-tion methods are not the same.Obviously, the ground-structure approach is only a sizing problem, where the optimumcombination of the member cross-sectional areas must be found, if the areas must notbe smaller than a lower limit. Permitting zero area values combines the sizing with thetopology optimization problem. If, moreover, the positions of the nodal points of thestructure are allowed to move, shape optimization joins in and the combination of allthree disciplines is called a layout problem.
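The quadratic growth of (9.67) against the linear growth of the degree-of-freedom count can be made concrete with a few lines; the function name and the loop below are illustrative, not part of any program mentioned in the text:

```python
def complete_ground_structure_bars(n: int) -> int:
    """Number of potential bars m = n(n-1)/2 when each of n nodal points
    is connected to every other one (complete ground structure)."""
    return n * (n - 1) // 2

for n in (15, 100, 1000):
    m = complete_ground_structure_bars(n)
    dofs = 2 * n  # a plane truss has two translational DOFs per node
    print(f"n={n:5d}  bars m={m:7d}  DOFs={dofs:5d}")
```

For the fifteen-point grid of Fig. 9.20(d) this already gives 105 candidate bars against only 30 degrees of freedom, and the ratio worsens quadratically with n.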


9.5.1 Problem Statement

The problem of finding the minimum compliance truss for a given amount of material is expressed as

\min_{u,t} \; r^T u \quad \text{subject to:} \quad \sum_{i=1}^{m} t_i K_i u = r , \quad \sum_{i=1}^{m} t_i = V , \quad t_i \geq 0 , \; i = 1, \ldots, m .    (9.68)

Here, t_i symbolizes the volume of the i-th bar; the substitution t_i = a_i l_i is introduced to achieve a more compact notation. In local coordinates, the stiffness matrix of a bar reads

K = \frac{EA}{L} \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}    (9.69)

and consequently the K_i appearing in (9.68) must be normalized with respect to bar volume:

K_i = \frac{E}{l_i^2} \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}    (9.70)

If a positive lower bound is imposed on the volumes t_i, the stiffness matrix remains positive definite for all t_i > 0, and the remaining problem of just adjusting the bar volumes for maximum stiffness with respect to the defined forces has been shown by Svanberg [99] to be convex with assured existence of solutions. The zero lower bound on the variables t_i implies that bars of the ground structure can be removed, so that the problem statement covers topology design. It also implies that the stiffness matrix is not necessarily positive definite and that the displacement solution u cannot simply be eliminated from the problem by solving the finite-element-method equations.
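As a sketch of how the volume-normalized matrices enter the analysis, the following assembles the global stiffness \sum_i t_i K_i for a two-bar example and solves the equilibrium equations; the geometry, material value, and helper name are illustrative assumptions, not taken from the text (in global 2D coordinates the 2×2 local matrix of (9.70) becomes a rank-one matrix built from the bar's direction cosines):

```python
import numpy as np

def bar_unit_volume_stiffness(E, nodes, p, q):
    """Global-coordinate stiffness matrix K_i of the 2D bar (p, q),
    normalized per unit of bar volume, so that the truss equilibrium
    reads sum_i t_i K_i u = r with t_i = a_i * l_i."""
    d = nodes[q] - nodes[p]
    L = np.linalg.norm(d)
    c, s = d / L                      # direction cosines of the bar axis
    B = np.zeros(2 * len(nodes))
    B[2 * p:2 * p + 2] = [-c, -s]
    B[2 * q:2 * q + 2] = [c, s]
    return (E / L**2) * np.outer(B, B)

# Two bars carrying a vertical tip load; nodes 0 and 1 fixed, node 2 free.
nodes = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 1.0]])
bars = [(0, 2), (1, 2)]
E, t = 210e3, [1.0, 1.0]              # unit bar volumes
K = sum(ti * bar_unit_volume_stiffness(E, nodes, p, q)
        for ti, (p, q) in zip(t, bars))
free = [4, 5]                         # DOFs of node 2
r = np.array([0.0, -1.0])             # unit downward load
u = np.linalg.solve(K[np.ix_(free, free)], r)
print(u)                              # tip displacement of node 2
```

With both bar volumes positive, the reduced stiffness matrix is positive definite and the solve succeeds; setting a volume to zero removes that bar's rank-one contribution, which is exactly why the zero lower bound endangers positive definiteness.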

9.5.2 Problem Statement Extension for Multiple Loads

The problem of finding the minimum compliance design for several load cases can be solved by minimizing a weighted average of the compliances of the individual load cases [100]. For a set of M different load cases r_k, k = 1, \ldots, M, the multiple-load problem can be written as

\min_{u,t} \; \sum_{k=1}^{M} w_k \, r_k^T u_k \quad \text{subject to:} \quad \sum_{i=1}^{m} t_i K_i u_k = r_k , \; k = 1, \ldots, M , \quad \sum_{i=1}^{m} t_i = V , \quad t_i \geq 0 , \; i = 1, \ldots, m .    (9.71)

For convenience, the extended displacement vector u = (u_1, \ldots, u_M), the extended force vector r = (w_1 r_1, \ldots, w_M r_M) of the weighted force vectors, and the extended element stiffness matrices as the block-diagonal matrices

K_i = \begin{bmatrix} w_1 K_i & & & \\ & w_2 K_i & & \\ & & \ddots & \\ & & & w_M K_i \end{bmatrix}    (9.72)

are introduced. This allows writing problem (9.71) as

\min_{u,t} \; r^T u \quad \text{subject to:} \quad \sum_{i=1}^{m} t_i K_i u = r , \quad \sum_{i=1}^{m} t_i = V , \quad t_i \geq 0 , \; i = 1, \ldots, m ,    (9.73)

which is of the same form as (9.68).
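The block-diagonal construction of (9.72) and its equivalence with the weighted compliance sum can be sketched as follows; the matrix values and weights are made-up demonstration data:

```python
import numpy as np

def extended_stiffness(K, weights):
    """Block-diagonal matrix diag(w_1 K, ..., w_M K) as in eq. (9.72),
    acting on the stacked displacement vector u = (u_1, ..., u_M)."""
    n, M = K.shape[0], len(weights)
    Kext = np.zeros((M * n, M * n))
    for k, w in enumerate(weights):
        Kext[k * n:(k + 1) * n, k * n:(k + 1) * n] = w * K
    return Kext

K = np.array([[3.0, 1.0], [1.0, 2.0]])          # stand-in for sum_i t_i K_i
loads = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
weights = [0.7, 0.3]

# Solving the extended system once ...
Kext = extended_stiffness(K, weights)
rext = np.concatenate([w * r for w, r in zip(weights, loads)])
uext = np.linalg.solve(Kext, rext)

# ... gives the same value as the weighted sum of the per-case compliances.
c_ext = rext @ uext
c_sum = sum(w * (r @ np.linalg.solve(K, r)) for w, r in zip(weights, loads))
print(c_ext, c_sum)
```

Each diagonal block enforces w_k K u_k = w_k r_k, so the individual displacement solutions are unchanged and the extended compliance r^T u reproduces \sum_k w_k r_k^T u_k.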

9.5.3 Problem Statement with Self-Weight Loading

For a truss it can be assumed that the weight of a bar is carried equally by the joints at its ends. With g_i denoting the specific nodal gravitational force vector due to the self-weight of bar i, the problem of finding the optimal topology takes the form

\min_{u,t} \; r^T u + \left( \sum_{i=1}^{m} t_i g_i \right)^{\!T} u \quad \text{subject to:} \quad \sum_{i=1}^{m} t_i K_i u = r + \sum_{i=1}^{m} t_i g_i , \quad \sum_{i=1}^{m} t_i = V , \quad t_i \geq 0 , \; i = 1, \ldots, m .    (9.74)

9.5.4 Fully Stressed Design and Optimality Criteria Methods

This section cites the derivation of the optimality conditions for the general minimum compliance problem (9.74) with self-weight and explains how these conditions constitute the basis for the optimality criteria method for the numerical solution of the general layout and topology design problem. In order to obtain the necessary conditions of optimality for problem (9.74), the Lagrange multipliers \bar{u}, \Lambda, and \lambda_i for the equilibrium constraint, the volume constraint, and the zero lower bound constraints, respectively, must be introduced. The necessary conditions are then found as the conditions of stationarity of the Lagrangian L:

L = \left( r + \sum_{i=1}^{m} t_i g_i \right)^{\!T} u \; - \; \bar{u}^T \left( \sum_{i=1}^{m} t_i K_i u - r - \sum_{i=1}^{m} t_i g_i \right) \; + \; \Lambda \left( \sum_{i=1}^{m} t_i - V \right) \; + \; \sum_{i=1}^{m} \lambda_i \, (-t_i) .    (9.75)


The conditions of vanishing derivatives,

\frac{\partial L}{\partial u} = 0 , \qquad \frac{\partial L}{\partial t_i} = 0 ,    (9.76)

yield the necessary conditions

\sum_{i=1}^{m} t_i K_i u = r + \sum_{i=1}^{m} t_i g_i ,

u^T \left( K_i u - 2 g_i \right) = \Lambda - \lambda_i , \quad i = 1, \ldots, m ,

\lambda_i \geq 0 , \quad \lambda_i t_i = 0 , \quad i = 1, \ldots, m , \quad \Lambda \geq 0 .    (9.77)

Let \mu(u) denote the maximal mutual energy with self-weight, u^T (K_i u - 2 g_i), of the individual bars, i.e.

\mu(u) = \max \left\{ u^T (K_i u - 2 g_i) \;\middle|\; i = 1, \ldots, m \right\} ,    (9.78)

and let J(u) denote the set of bars for which the mutual energy attains this maximum level,

J(u) = \left\{ i \;\middle|\; u^T (K_i u - 2 g_i) = \mu(u) \right\} .    (9.79)

After defining the non-dimensional element volumes \tilde{t}_i = t_i / V, the necessary conditions are satisfied with

\bar{u} = u ; \quad t_i = \tilde{t}_i V , \; i \in J(u) ; \quad t_i = 0 , \; i \notin J(u) ; \quad \Lambda = \mu(u) ;

\lambda_i = 0 , \; i \in J(u) ; \quad \lambda_i = \mu(u) - u^T (K_i u - 2 g_i) , \; i \notin J(u) ,    (9.80)

provided that there exists a displacement field u with corresponding set J(u) and non-dimensional element volumes \tilde{t}_i, i \in J(u), such that

V \sum_{i \in J(u)} \tilde{t}_i K_i u = r + V \sum_{i \in J(u)} \tilde{t}_i g_i ; \qquad \sum_{i \in J(u)} \tilde{t}_i = 1 .    (9.81)

The reduced optimality conditions (9.81) state that a convex combination of the gradients of the quadratic functions

V \left( \frac{1}{2} u^T K_i u - g_i^T u \right) , \quad i \in J(u) ,    (9.82)

equals the load vector r.

The following derivations, copied from [101], show that there exists a pair (u, t) which is a solution to (9.81), implying that there exists an optimal truss whose bars have constant mutual energies; the set J(u) is the set of these active bars. Because of these properties the solution is labelled a fully stressed design. The proof shows that the assumed existing optimum design (u, t) has smaller compliance u^T r than any other design (v, s). The proof also utilizes that the total potential energy of an elastic system,

Π = U −W , (9.83)


Figure 9.21: Total potential energy visualization for only one dependent variable

at equilibrium equals one half of the negative of the external work or

W = −2Π (9.84)

which is visualized in Fig. 9.21. Here, we identify the external work W and the total potential energy Π with

W = \left( r + \sum_{i=1}^{m} t_i g_i \right)^{\!T} u , \qquad \Pi = \frac{1}{2} \, u^T \left( \sum_{i=1}^{m} t_i K_i u - 2 r - 2 \sum_{i=1}^{m} t_i g_i \right)    (9.85)
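The identity W = −2Π of (9.84) is easy to verify numerically on the simplest possible elastic system, a single linear spring; the spring constant and force below are arbitrary demonstration values:

```python
# Total potential energy of a spring k loaded by a force f:
#   Pi(u) = 0.5 * k * u**2 - f * u,  minimized at the equilibrium u* = f / k.
k, f = 50.0, 7.0
u_eq = f / k                        # equilibrium displacement
Pi = 0.5 * k * u_eq**2 - f * u_eq   # total potential energy at equilibrium
W = f * u_eq                        # external work at equilibrium
print(W, -2.0 * Pi)                 # both print 0.98, confirming W = -2*Pi
```

The same relation, with W and Π as in (9.85), is what the proof exploits for the truss.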

Substituting the expressions (9.85) into the equilibrium condition (9.84) gives

\left( r + \sum_{i=1}^{m} t_i g_i \right)^{\!T} u \; = \; 2 \left( r + \sum_{i=1}^{m} t_i g_i \right)^{\!T} u - \sum_{i=1}^{m} t_i \, u^T K_i u    (9.86)

Re-ordering of the terms appearing on the right-hand side of (9.86) produces an expression that is identified with the maximal mutual energy (9.78) found in the active bars of the fully stressed design:

2 r^T u - \sum_{i=1}^{m} t_i \, u^T (K_i u - 2 g_i) \; = \; 2 r^T u - \sum_{i=1}^{m} t_i \, \mu(u)    (9.87)

Since the maximal mutual energy \mu(u) is a constant, the sum over the volumes of all bars can be replaced with the total volume V of the truss. Changing the volumes (which is traced back to changing the cross-sectional areas) gives a different truss design s_i, but let the maximum mutual energy \mu(u) continue to correspond to the fully stressed design t_i. These thoughts give

2 r^T u - V \mu(u) \; = \; 2 r^T u - \sum_{i=1}^{m} s_i \, \mu(u)    (9.88)

However, the other design s_i may have different bar volumes not only for the active bars; generally it must be assumed that volumes are also assigned to bars which are inactive in the fully stressed design t_i. The mutual energy of the inactive bars is smaller than \mu(u), and therefore it holds that

2 r^T u - \sum_{i=1}^{m} s_i \, \mu(u) \; \leq \; 2 r^T u - \sum_{i=1}^{m} s_i \, u^T (K_i u - 2 g_i)    (9.89)

Next observe that the design s_i cannot be at equilibrium with the displacements assumed here, which correspond to the fully stressed design t_i. The extremum principle for


equilibrium requires that \Pi be a minimum or, by (9.84), W be a maximum. Let the maximum of W for design s_i be found by adjusting a variable displacement vector w:

2 r^T u - \sum_{i=1}^{m} s_i \, u^T (K_i u - 2 g_i) \; \leq \; 2 \max_{w} \left[ \left( r + \sum_{i=1}^{m} s_i g_i \right)^{\!T} w - \frac{1}{2} \sum_{i=1}^{m} s_i \, w^T K_i w \right]    (9.90)

Practically, the equilibrium adjustment of w is found by simply solving the elasticity problem of the given design s_i for the displacements, which are here labelled v. Using (9.84) again identifies

2 \left( r + \sum_{i=1}^{m} s_i g_i \right)^{\!T} v - \sum_{i=1}^{m} s_i \, v^T K_i v \; = \; \left( r + \sum_{i=1}^{m} s_i g_i \right)^{\!T} v    (9.91)

The sequence (9.87) through (9.91) of equalities and inequalities proves that

\left( r + \sum_{i=1}^{m} t_i g_i \right)^{\!T} u \; \leq \; \left( r + \sum_{i=1}^{m} s_i g_i \right)^{\!T} v ,    (9.92)

which says that the fully stressed design is the best design solution.

The optimality criterion (9.81), with its implied fully stressed design solution, allows devising a very simple and effective search method known in the literature [102, 103] as the optimality criterion method. The iterative method assigns in each step volumes to the bars in proportion to their respective mutual energies, in order to approach the state of constant mutual energy in the active bars. At iteration step k the following computations are performed:

- the current intermediate design t_i^{k-1} is given

- compute the displacements u^{k-1} by solving the truss system equations

- assign preliminary bar volumes \zeta_i^k = \max\left( t_i^{k-1} \, (u^{k-1})^T K_i \, u^{k-1} , \; t_{\min} \right)

- find the actual volumes satisfying the volume constraint: V^k = \sum_{i=1}^{m} \zeta_i^k , \quad t_i^k = \zeta_i^k \left( V / V^k \right) .
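The steps above can be sketched as a complete little program for a minimal ground structure. The node layout, load, and parameter values are illustrative assumptions; without self-weight the mutual energy reduces to u^T K_i u, and the update is exactly the rescaling scheme just listed:

```python
import numpy as np

def bar_unit_volume_stiffness(E, nodes, p, q):
    """Global 2D stiffness matrix of bar (p, q) per unit of bar volume,
    so that the truss equilibrium reads sum_i t_i K_i u = r."""
    d = nodes[q] - nodes[p]
    L = np.linalg.norm(d)
    c, s = d / L
    B = np.zeros(2 * len(nodes))
    B[2 * p:2 * p + 2] = [-c, -s]
    B[2 * q:2 * q + 2] = [c, s]
    return (E / L**2) * np.outer(B, B)

# Illustrative ground structure: nodes 0 and 1 are supports, a unit load
# pulls node 2 straight down; node 3 is a free extra node.
nodes = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
bars = [(0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
E, V, tmin = 1.0, 1.0, 1e-6
ndof = 2 * len(nodes)
free = [4, 5, 6, 7]                     # DOFs of the unsupported nodes 2, 3
r = np.zeros(ndof)
r[5] = -1.0                             # unit downward load at node 2
Ks = [bar_unit_volume_stiffness(E, nodes, p, q) for p, q in bars]

t = np.full(len(bars), V / len(bars))   # uniform intermediate design t^0
for _ in range(100):
    # solve the truss system equations for the current design
    K = sum(ti * Ki for ti, Ki in zip(t, Ks))
    u = np.zeros(ndof)
    u[free] = np.linalg.solve(K[np.ix_(free, free)], r[free])
    # preliminary volumes zeta_i = max(t_i * u^T K_i u, t_min) ...
    zeta = np.maximum(t * np.array([u @ Ki @ u for Ki in Ks]), tmin)
    # ... rescaled so that the volume constraint sum_i t_i = V holds again
    t = zeta * (V / zeta.sum())

print(np.round(t, 4))   # volume concentrates in the load-carrying bars
```

For this load the iteration drives the volume into the direct load path, bars (0, 2) and (1, 2), while the bars attached to the unloaded node 3 decay towards the lower limit, which is the fully stressed design behavior derived above.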


Chapter 10

Demonstration Programs

Two demonstration programs are used for exercises and are available to the students. The data input files of both programs are explained in the following sections.

10.1 Program DEMO OPT

The various methods of mathematical programming are implemented in DEMO OPT to solve a few typical test functions in two-dimensional variable space. The students can observe the search paths taken and the number of iterations and function evaluations needed to reach the respective optimum, and thus obtain some feeling for the advantages and limitations of the various methods. The most important educational goal is to understand that the performance of any method depends very much on the nature of the objective function.

The input data file is shown in Fig. 10.1.

Figure 10.1: DEMO OPT input data file

It can be seen from it that there are five different test functions, where the functions 2 through 4 often appear in the literature. The user defines a viewing region, and choices of the corner coordinates are suggested in Table 10.1. The table also suggests starting point coordinates and the number and distribution

Table 10.1: Suggested viewing regions, contour-line settings, and starting points for the test functions

                       xmin    xmax    ymin    ymax    number   exp    xS      yS
quadratic polynomial
Himmelblau             −6.0     6.0    −6.0     6.0     100      4
Rosenbrock             −3.0     3.0    −4.0     2.0     100      4     −1.2    1.0
Fenton-Eason            0.1     3.0     0.1     3.0     100     16      0.5    0.5
Cantilever Beam         1.0    40.0     1.0    30.0     100     16      2.0    2.0

of contour lines. According to the suggested values, DEMO OPT produces a series of plots showing each iteration of the optimization process; Fig. 10.2 shows the respective last

Figure 10.2: Final plots of the optimization processes for the test functions

images. The parameter size has an effect on the initial size of the simplex and the initial lattice spacing of the supporting point set of the response-surface method, and takes on the meaning of a fixed step length when these are combined with choosing steepest-descent search directions. Choosing the central-differences method over the forward-differences method tends to increase computation time. The parameter demonstration time span can be adjusted to view the optimization process as quickly or slowly as desired.
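The remarks on step length and difference schemes can be tried out with a few lines of Python; the sketch below mimics the Rosenbrock test case with the starting point suggested in Table 10.1, while the step size and iteration count are illustrative assumptions (DEMO OPT itself is not reproduced here):

```python
import numpy as np

def rosenbrock(x):
    """Banana-valley test function; minimum f = 0 at (1, 1)."""
    return (1.0 - x[0])**2 + 100.0 * (x[1] - x[0]**2)**2

def grad_fd(f, x, h=1e-6, central=True):
    """Finite-difference gradient: central differences need roughly twice
    as many function evaluations as forward differences, but are more
    accurate -- the trade-off mentioned in the text."""
    g = np.zeros_like(x)
    fx = f(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2.0 * h) if central else (f(x + e) - fx) / h
    return g

# Fixed-step steepest descent from the suggested starting point (-1.2, 1.0);
# the constant step plays the role of DEMO OPT's "size" parameter.
x = np.array([-1.2, 1.0])
step = 1e-3
for _ in range(5000):
    g = grad_fd(rosenbrock, x)
    x = x - step * g / (np.linalg.norm(g) + 1e-12)
print(x, rosenbrock(x))   # creeps along the curved valley towards (1, 1)
```

The fixed step length forces the search to zigzag slowly along the curved valley floor, which is exactly the kind of method limitation the program is meant to make visible.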


10.2 Program TOP

Topology optimization after the homogenization method pioneered by Bendsøe and Kikuchi [12] is implemented in TOP. The making of the new version of the program has been motivated by the ETH project NOVA (http://www.nova.ethz.ch/), where the sequence of improving design solutions, Ein perfektes Tragwerk ("a perfect structure"), can be seen on a three-dimensional display. The program supports the students in creatively setting up their own topology design problems and finding solutions.

Fig. 10.3 shows a design solution and the corresponding data file, the meaning of which is illustrated by the sketches.

Figure 10.3: TOP input data file and explaining sketches

The numbers within the rectangle signify the number of elements by which the respective load introduction points are moved away from the boundary into the domain. Rather time-consuming optimization processes, for which the presented data set is an example, can be viewed at much higher speed after the end of the computations. Then, the encircled number must be set to 1, and the intermediate topology information is read from a data file and plotted.


List of Figures

1.1 Types of Structural Optimization . . . . . . . . . . . . . . . . . . . . . . . . 3

1.2 Ducati 996R frame and geometry model . . . . . . . . . . . . . . . . . . . . 4

1.3 CAD-Model of the racing car rim . . . . . . . . . . . . . . . . . . . . . . . . 5

1.4 Flywheel with central bore, 2-dimensional FEM model and stresses . . . . . 6

1.5 Onsert design demonstrator and geometry model . . . . . . . . . . . . . . . 7

1.6 ANSYS model of the sail boat hull with composite material patches . . . . 7

1.7 Conceptual model of a fuel cell stack. . . . . . . . . . . . . . . . . . . . . . 8

2.1 Three-columns concept after Eschenauer . . . . . . . . . . . . . . . . . . . . 9

2.2 DynOPS evaluator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

3.1 Mapping of a feasible design space into the criteria space . . . . . . . . . . . 17

3.2 Examples for the exterior and interior penalty functions . . . . . . . . . . . 19

3.3 Illustration to the unconstrained Lagrange problem formulation . . . . . . . 21

3.4 Fitness function for a design objective . . . . . . . . . . . . . . . . . . . . . 24

3.5 Upper limit constraint penalty functions . . . . . . . . . . . . . . . . . . . . 26

3.6 Penalty functions for a target constraint . . . . . . . . . . . . . . . . . . . . 26

3.7 FEM model of the frame and load cases . . . . . . . . . . . . . . . . . . . . 28

3.8 Rim compliance measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

3.9 Loading of the Rim . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

3.10 Region out of the Flywheel Analysis Model . . . . . . . . . . . . . . . . . . 32

3.11 ANSYS model of the sail boat hull with composite material patches . . . . 33

3.12 Quarter end-plate model in the initial design . . . . . . . . . . . . . . . . . 34

4.1 Classification of design optimization problems for truss-like structures . . . 37

4.2 Standard Tube Sizes after DIN 2394 . . . . . . . . . . . . . . . . . . . . . . 39

4.3 FEM-Model and Parameterization of the Motorcycle Frame . . . . . . . . . 39

4.4 Structure of the CAD model of the rim. . . . . . . . . . . . . . . . . . . . . 40

4.5 Optimization variables for spoke-body and bed contour . . . . . . . . . . . 41

4.6 Optimization variables for the pockets . . . . . . . . . . . . . . . . . . . . . 41

4.7 Relation between nodal point coordinates and shape parameter . . . . . . . 42

4.8 Definition of shape parameters . . . . . . . . . . . . . . . . . . . . . . . . . 42

4.9 Final Design and Mesh Distortion . . . . . . . . . . . . . . . . . . . . . . . 43

4.10 Patch Pattern and Laminate . . . . . . . . . . . . . . . . . . . . . . . . . . 44

4.11 CAD features: lower plate, upper plate and a rib . . . . . . . . . . . . . . . 45

4.12 Sample problems and parameterization spectrum . . . . . . . . . . . . . . . 46

4.13 Jagged onsert shape obtained with mesh-dependent analysis variables . . . 48

5.1 Convex and concave functions . . . . . . . . . . . . . . . . . . . . . . . . . . 52


5.2 Convex design space . . . 52
5.3 Usable and feasible sectors of a search space . . . 53
5.4 Sample problem Lagrangian versus x and versus λ . . . 56
5.5 Illustrations to Kuhn-Tucker conditions for constrained optima . . . 58
5.6 Design Optimization Iteration Basic Scheme . . . 63
5.7 Overview of Optimization Algorithms . . . 64
5.8 Search methods arranged after model and method orders . . . 64
5.9 Objective function and hypermesh in two dimensions . . . 65

6.1 Construction of new simplex . . . 71
6.2 Design Principle and Reflection of a Simplex in Two Dimensions . . . 72
6.3 Cauchy-method zigzag path through two-dimensional search space . . . 73
6.4 Orthogonal and Conjugate Search Directions . . . 79
6.5 Conjugacy in two dimensions . . . 84
6.6 Powell's Method . . . 85
6.7 Path Taken by Powell's Method . . . 86
6.8 Placement of supporting points in 3-dimensional variables space . . . 89
6.9 Updated supporting point sets around successful minimum point estimates . . . 92
6.10 Shrinking of supporting point region around the reference point . . . 92
6.11 Supporting point set shifted . . . 93
6.12 Number of coinciding points versus number of variables . . . 94
6.13 Points set size and reduction due to coinciding points . . . 94
6.14 Relative computing time savings due to coinciding points . . . 95
6.15 Latin Hypercube Sampling in a Two-Dimensional Design Space . . . 95
6.16 Sketch of the interval-halving algorithm . . . 98
6.17 Sketch of the golden-section algorithm . . . 98
6.18 Approximation with a quadratic polynomial and three supporting points . . . 100
6.19 Approximation with a cubic polynomial and two supporting points . . . 101
6.20 Golden section method supporting points . . . 104
6.21 Narrowing the bracket by reduction of four supporting points to three . . . 105
6.22 Quadratic approximation incorrectly indicated minimum . . . 105
6.23 Updated four-point set . . . 105
6.24 Minimum point successfully estimated by quadratic approximation . . . 107
6.25 Two-dimensional variable space with two linear constraining functions . . . 109
6.26 Two-dimensional variable space with two linear constraining functions . . . 110
6.27 Shape optimization and sensitivity formula for gradient calculation . . . 114
6.28 Function with minima in considered region and starting-point mesh . . . 115
6.29 Function with objective and unusable and usable tunneling functions . . . 117

7.1 Metropolis Algorithm: Probability for the acceptance of a new solution . . . 123
7.2 Selected mathematical programming methods and ordering scheme . . . 124
7.3 General architecture of an evolutionary algorithm . . . 127

8.1 Examples of applications with shell structures . . . 134
8.2 Laminated composites . . . 135
8.3 Material principal and global coordinate systems . . . 137
8.4 General layup of a laminated composite material . . . 139
8.5 Front view of different modeling techniques for bending . . . 141
8.6 Layered shell element . . . 146
8.7 Schematic illustration of one half of a symmetric laminate . . . 146


8.8 Effect of a thickness change to the overlaying layers . . . 146
8.9 Feasible domain of the in-plane lamination parameters . . . 150
8.10 Flowchart of a typical optimization with an FEM-Solver . . . 153
8.11 Fiber orientation optimization . . . 154
8.12 Geometry and objective of the tensile specimen experiment . . . 154
8.13 Stacking sequence optimization . . . 155
8.14 Layer thickness optimization . . . 157
8.15 Optimization with variable number of layers . . . 157
8.16 Homogeneous loads: Tension and shear . . . 159
8.17 Inhomogeneous strain field of a plate with centered hole . . . 159
8.18 Global domain Ω split with subdomains Ωi . . . 159
8.19 Schematic illustration of layers covering multiple sub-domains . . . 160
8.20 FEM-based Parametrization of Sub-Domains . . . 160
8.21 Graph-based parametrized vertex patches . . . 161
8.22 Sensitivities and local reinforcements of a plate with eigenfrequency band . . . 162
8.23 Connection between global layers and laminate regions . . . 163

9.1 Part with stress raising geometry and modelled cambium . . . 165
9.2 Two-phase CAO process after Mattheck . . . 166
9.3 Soft-Kill Option process after Mattheck . . . 167
9.4 Initial flywheel shape and radial and circumferential stress distributions . . . 169
9.5 Blueprint of a turbine disk design solution after Stodola . . . 169
9.6 Parameterization of thickness and mesh generator . . . 170
9.7 Shape optimized after maximum stress criterion and stress distributions . . . 171
9.8 Shape optimized after yield stress criterion and stress distributions . . . 172
9.9 Mechanical model consisting of Stodola's disk and two discrete rings . . . 172
9.10 Relative Values of the areas of the inner and outer rings and the disk . . . 175
9.11 Optimum shapes for different values of the diameter ratio . . . 175
9.12 Illustration of topologies for a mesh of 1 by 4 elements . . . 177
9.13 The legal and some illegal topologies with 4 by 3 elements . . . 178
9.14 Topology optimization example . . . 178
9.15 Young's modulus and density . . . 179
9.16 Illustration to the assembly of the global stiffness matrix . . . 180
9.17 Topology design problem considered by Jog . . . 186
9.18 Topology design solution after Jog . . . 186
9.19 Topology design solutions with half and full models . . . 186
9.20 Various ground structures . . . 187
9.21 Total potential energy visualization for only one dependent variable . . . 191

10.1 DEMO OPT input data file . . . 193
10.2 Final plots of the optimization processes for the test functions . . . 194
10.3 TOP input data file and explaining sketches . . . 195

A.1 Loads and kinematics of a Kirchhoff-plate . . . . . . . . . . . . . . . . . . . 215


List of Tables

3.1 General types of fitness functions with defining parameters . . . 22
3.2 Problems and solution methods . . . 27

4.1 Ranges and step sizes for the optimization variables. . . . . . . . . . . . . . 46

6.1 Supporting points in terms of changes with respect to x(1) . . . 89
6.2 Number of coinciding points for the different reference point positions . . . 93

9.1 Topology optimization process performance data . . . . . . . . . . . . . . . 186

10.1 Suggested viewing regions, contour-line settings, and starting points for the test functions . . . 194


Bibliography

[1] Müller, S.D. Bio-inspired optimization algorithms for engineering applications. Dissertation ETH No. 14719, Swiss Federal Institute of Technology Zurich, 2002.

[2] Kress, G.R. Shape optimization of a flywheel. Struct. Multidisc. Optim., 19:74–81, 2000.

[3] König, O., M. Wintermantel, U. Fasel, and P. Ermanni. Using evolutionary methods for design optimization of a tubular steel trellis motorbike frame. Journal article under preparation.

[4] Wintermantel, M. and O. König. Design-entity based structural optimization with evolutionary algorithms. Project performed at ETH, Structure Technologies, supported by Swiss National Science Foundation grant no. 21-66879.01, 4/1/2002 - 9/30/2004.

[5] König, O. Evolutionary design optimization: Tools and applications. Dissertation ETH No. 15???, Swiss Federal Institute of Technology Zurich, 2004.

[6] Wintermantel, M. Design-encoding for evolutionary algorithms in the field of structural optimization. Dissertation ETH No. 15323, Swiss Federal Institute of Technology Zurich, 2004.

[7] Kress, G., P. Naeff, M. Niedermeier, and P. Ermanni. Onsert strength design. Int. J. of Adhesion & Adhesives, 24:201–209, 2004.

[8] Kress, G. and P. Ermanni. The onsert: A new joining technology for sandwich structures. In Proc. International Conference on Buckling and Postbuckling Behavior of Composite Laminated Shell Structures, Eilat, Israel, 2004.

[9] Zehnder, N. and P. Ermanni. A methodology for the global optimization of laminated composite structures. Composite Structures, 72(3):311–320, 2006.

[10] Zehnder, N. and P. Ermanni. Optimizing the shape and placement of patches of reinforcement fibers. Composite Structures, 77(1):1–9, 2007.

[11] Gatzi, R., M. Uebersax, and O. König. Structural optimization tool using genetic algorithms and ANSYS. Proc. 18. CAD-FEM User's Meeting, Internationale FEM-Technologietage, Graf-Zeppelin-Haus, Friedrichshafen, 2000.

[12] Bendsøe, M.P., N. Kikuchi. Generating optimal topologies in structural design using a homogenization method. Computer Methods in Applied Mechanics and Engineering, 71:197–224, 1988.


[13] Michell, A.G.M. The limits of economy of material in frame-structures. Phil. Mag. (Series 6), 8:589–597, 1904.

[14] Zehnder, N. Global Optimization of Laminated Structures. PhD thesis, ETH Zurich, 2008. Diss. ETH No. 17573.

[15] Ruge, M. Entwicklung eines flüssigkeitsgekühlten Polymer-Elektrolyt-Membran-Brennstoffzellenstapels mit einer Leistung von 6.5 kW. Ph.D. thesis, Swiss Federal Institute of Technology, VDI Verlag GmbH, Reihe 6, Nr. 494, Fortschrittberichte VDI, 2003.

[16] Schmid, D. Entwicklung eines Brennstoffzellenstapels für portable Aggregate unterschiedlicher Leistungsbereiche. Ph.D. thesis, Swiss Federal Institute of Technology, VDI Verlag GmbH, Reihe 6, Nr. 500, Fortschrittberichte VDI, 2003.

[17] Evertz, J. and M. Günthart. Structural concepts for lightweight and cost-effective end plates for fuel cell stacks. In 2nd European PEFC Forum, Lucerne, Switzerland, 2003.

[18] Eschenauer, H., N. Olhoff, W. Schnell. Applied Structural Mechanics. Springer, 1997.

[19] Courant, R. Variational methods for the solution of problems of equilibrium and vibrations. Bull. Am. Math. Soc., pages 1–23, 1943.

[20] Vanderplaats, G.N. Numerical Optimization Techniques for Engineering Design: with Applications. McGraw-Hill Series in Mechanical Engineering, 1984.

[21] Schuldt, S.B. A method of multipliers for mathematical programming methods with equality and inequality constraints. JOTA, 17:155–162, 1975.

[22] Schuldt, S.B., G.A. Gabriele, R.R. Root, E. Sandgren, and K.M. Ragsdell. Application of a new penalty function method to design optimization. ASME J. Eng. Ind., 99:31–36, 1977.

[23] Reklaitis, G.V., A. Ravindran, K.M. Ragsdell. Engineering Optimization - Methods and Applications. John Wiley and Sons, 1983.

[24] Stodola, A. Dampf- und Gasturbinen. Springer, 1924.

[25] Jones, R.M. Mechanics of Composite Materials. Hemisphere Publishing Corporation, 1975.

[26] Schmit, L.A., R.H. Mallet. Structural synthesis and design parameter hierarchy. Journal of the Struct. Division, Proceedings of the ASCE, 89:269–299, 1963.

[27] Olhoff, N., J.E. Taylor. On structural optimization. J. Applied Mechanics, 50:1139–1151, 1983.

[28] Fritsche, D. Lasteinleitungselemente maximaler Verbindungsfestigkeit für Sandwichbauteile. Student project thesis, winter semester 2000/2001, Structure Technologies, Institute of Mechanical Systems, ETH Zurich.

[29] Naeff, P. Experimentelle Verifikation strukturell geklebter Lasteinleitungselemente. Student project thesis no. 02-115, summer semester 2002, Structure Technologies, Institute of Mechanical Systems, ETH Zurich.


[30] Heer, A. Gestaltung von Lasteinleitungselementen für Sandwichbauteile. Student project thesis, summer semester 2000, Structure Technologies, Institute of Mechanical Systems, ETH Zurich.

[31] Kuhn, H.W., A.W. Tucker. Nonlinear programming. Proc. Second Berkeley Symp. on Math. Statist. and Probability, pages 481–492, 1951. J. Neyman, Ed.

[32] Schumacher, A. Optimierung mechanischer Strukturen. Springer, 2005.

[33] Reklaitis, G.V., A. Ravindran, K.M. Ragsdell. Engineering Optimization: Methods and Applications. Wiley Interscience, 1983.

[34] Spellucci, P. Numerische Verfahren der nichtlinearen Optimierung. Birkhäuser, 1993.

[35] Powell, M.J.D. An efficient method for finding the minimum of a function of several variables without calculating derivatives. Computer J., 7:155–162, 1964.

[36] Cauchy, A. Méthode générale pour la résolution des systèmes d'équations simultanées. Compt. Rend. Acad. Sci., 25:536–538, 1847.

[37] Spendley, W., G.R. Hext, F.R. Himsworth. Sequential application of simplex designs in optimization and evolutionary operation. Technometrics, 4:441–461, 1962.

[38] Shewchuk, J.R. An introduction to the conjugate gradient method without the agonizing pain. School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, 15213, Aug. 4, 1994.

[39] Fletcher, R., C.M. Reeves. Function minimization by conjugate gradients. Computer J., 7(5):149–154, 1964.

[40] Zangwill, W.I. Minimizing a function without calculating derivatives. Computer J., 10:293–296, 1967.

[41] Brent, R.P. Algorithms for Minimization without Derivatives. Prentice-Hall, Englewood Cliffs, NJ, 1973.

[42] Venter, G. Non-dimensional response surfaces for structural optimization with uncertainty. Dissertation, University of Florida, 1998.

[43] Kress, G. and P. Ermanni. Comparison between Newton and response surface methods. Struct. Multidisc. Optim., accepted for publication, 2004.

[44] Wang, G.G. Adaptive response surface method using inherited latin hypercube design points. Journal of Mechanical Design, 125:210–220, 2004.

[45] McKay, M.D., R.J. Beckman, and W.J. Conover. A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics, 21(2):239–245, 1997.

[46] Levy, A.V., S. Gomez. The tunneling method applied to global optimization. Proc. SIAM Conf. on Num. Optimization, pages 213–244, 1984.

[47] Spall, James C. Introduction to stochastic search and optimization: estimation, simulation, and control. Wiley-Interscience Series in Discrete Mathematics and Optimization, 2003.


[48] D.H. Wolpert and W.G. Macready. No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, 1:67–82, 1997.

[49] Bäck, Th., U. Hammel, H.-P. Schwefel. Evolutionary computation: Comments on the history and current state. IEEE Transactions on Evolutionary Computation, 1(1), 1997.

[50] Fogel, L.J., A.J. Owens, M.J. Walsh. Artificial Intelligence Through Simulated Evolution. Wiley, New York, 1966.

[51] Holland, J.H. Nonlinear environments permitting efficient adaptation. Computer and Information Sciences II, Academic Press, New York, 1967.

[52] Holland, J.H. Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor, MI, 1975.

[53] J.H. Holland. Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor, MI, 1975.

[54] D.E. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley Publishing Company, Inc., 1989.

[55] P. Hajela. Stochastic search in structural optimization: Genetic algorithms and simulated annealing. In Structural Optimization: Status and Promise, volume 150, pages 611–637. Progress in Astronautics and Aeronautics, 1992.

[56] L.J. Fogel. Biotechnology: Concepts and Applications. Prentice Hall, Englewood Cliffs, NJ, 1963.

[57] D. Fogel. Evolving Artificial Intelligence. PhD thesis, University of California, San Diego, CA, 1992.

[58] I. Rechenberg. Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Frommann-Holzboog, Stuttgart, 1973.

[59] T. Bäck. Evolutionary Algorithms in Theory and Practice. Oxford University Press, New York, 1996.

[60] J. Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Cambridge, MA, 1992.

[61] Peter J. Bentley, editor. Evolutionary Design by Computers. Morgan Kaufmann Publishers, Inc., San Francisco, California, 1999.

[62] M. Schoenauer and Z. Michalewicz. Evolutionary computation: An introduction. Control and Cybernetics, Special Issue on Evolutionary Computation, 26(3):307–338, 1997.

[63] N.J. Radcliffe and P.D. Surry. Formal memetic algorithms. Technical report, Edinburgh Parallel Computing Centre, 1994.

[64] B. Schläpfer. Optimal Design of Laminated Structures with Local Reinforcements. PhD thesis, ETH Zurich, 2013. Diss. ETH No. 20894.

[65] J.E. Gordon. The New Science of Strong Materials: or Why You Don't Fall Through the Floor. Pelican Books, Penguin, 1991.


[66] R.C. Reuter. Concise property transformation relations for an anisotropic lamina. Journal of Composite Materials, 5(2):270–272, 1971.

[67] R.D. Cook, D.S. Malkus, M.E. Plesha, and R.J. Witt. Concepts and Applications of Finite Element Analysis. John Wiley & Sons, Inc., 4th edition, 2001.

[68] S.W. Tsai and N.J. Pagano. Invariant properties of composite materials. Technical report, Air Force Materials Laboratory, Wright-Patterson AFB, Ohio, 1968.

[69] H. Fukunaga. On laminate configurations for simultaneous failure. Journal of Composite Materials, 22(3):271–286, 1988.

[70] J.L. Grenestedt. Layup optimization and sensitivity analysis of the fundamental eigenfrequency of composite plates. Composite Structures, 12(3):193–209, 1989.

[71] J.L. Grenestedt. Layup optimization against buckling of shear panels. Structural Optimization, 3:115–120, 1991.

[72] J. Foldager, J.S. Hansen, and N. Olhoff. A general approach forcing convexity of ply angle optimization in composite laminates. Structural Optimization, 16:201–211, 1998.

[73] H. Fukunaga and H. Sekine. Stiffness design method of symmetric laminates using lamination parameters. AIAA Journal, 30:2791–2793, 1992.

[74] C.G. Diaconu, M. Sato, and H. Sekine. Layup optimization of symmetrically laminated thick plates for fundamental frequencies using lamination parameters. Structural and Multidisciplinary Optimization, 24(4):302–311, 2002.

[75] J.M.J.F. van Campen, C. Kassapoglou, and Z. Gürdal. Generating realistic laminate fiber angle distributions for optimal variable stiffness laminates. Composites Part B: Engineering, 43(2):354–360, 2012.

[76] D. Keller. Optimization of ply angles in laminated composite structures by a hybrid, asynchronous, parallel evolutionary algorithm. Composite Structures, 92(11):2781–2790, 2010.

[77] K.J. Callahan and G.E. Weeks. Optimum design of composite laminates using genetic algorithms. Composites Engineering, 2(3):149–160, 1992.

[78] N. Kogiso, L.T. Watson, Z. Gürdal, and R.T. Haftka. Genetic algorithms with local improvement for composite laminate design. Structural and Multidisciplinary Optimization, 7:207–218, 1994.

[79] R. Le Riche and R.T. Haftka. Improved genetic algorithm for minimum thickness composite laminate design. Composites Engineering, 5(2):143–161, 1995.

[80] A. Todoroki and R.T. Haftka. Stacking sequence optimization by a genetic algorithm with a new recessive gene like repair strategy. Composites Part B: Engineering, 29(3):277–285, 1998.

[81] Z. Gürdal, R.T. Haftka, and P. Hajela. Design and Optimization of Laminated Composite Materials. John Wiley & Sons, 1999.


[82] J.H. Park, J.H. Hwang, C.S. Lee, and W. Hwang. Stacking sequence design of composite laminates for maximum strength using genetic algorithms. Composite Structures, 52(2):217–231, 2001.

[83] G. Soremekun, Z. Gürdal, R.T. Haftka, and L.T. Watson. Composite laminate design optimization by genetic algorithm with generalized elitist selection. Computers & Structures, 79(2):131–143, 2001.

[84] A. Rama Mohan Rao and N. Arvind. A scatter search algorithm for stacking sequence optimisation of laminate composites. Composite Structures, 70(4):383–402, 2005.

[85] J.-S. Kim. Development of a user-friendly expert system for composite laminate design. Composite Structures, 79(1):76–83, 2007.

[86] J. Stegmann. Analysis and Optimization of Laminated Composite Shell Structures. PhD thesis, Institute of Mechanical Engineering, Aalborg University, 2005.

[87] J. Stegmann and E. Lund. Discrete material optimization of general composite shell structures. International Journal for Numerical Methods in Engineering, 62(14):2009–2027, 2005.

[88] E. Lund and J. Stegmann. On structural optimization of composite shell structures using a discrete constitutive parametrization. Wind Energy, 8(1):109–124, 2005.

[89] N. Pedersen. On design of fiber-nets and orientation for eigenfrequency optimization of plates. Computational Mechanics, 39:1–13, 2005.

[90] S. Setoodeh, M.M. Abdalla, and Z. Gürdal. Design of variable-stiffness laminates using lamination parameters. Composites Part B: Engineering, 37(4-5):301–309, 2006.

[91] B. Schläpfer and G. Kress. A sensitivity-based parameterization concept for the automated design and placement of reinforcement doublers. Composite Structures, 94(3):896–903, 2012.

[92] M. Giger, D. Keller, and P. Ermanni. A graph-based parameterization concept for global laminate optimization. Structural and Multidisciplinary Optimization, 36(3):289–305, 2008.

[93] B. Schläpfer, B. Rentsch, and G. Kress. Specific design of laminated composites regarding dynamic behavior by the application of local reinforcements. Composite Structures, 99:433–442, 2013.

[94] N. Zehnder and P. Ermanni. A methodology for the global optimization of laminated composite structures. Composite Structures, 72(3):311–320, 2006.

[95] Mattheck, C. Design in der Natur - der Baum als Lehrmeister, Bd. 1. Rombach Ökologie, 1993.

[96] Baier, H., Ch. Seeßelberg, B. Specht. Optimierung in der Strukturmechanik. Vieweg, 1994.

[97] Jog, C.S. A robust dual algorithm for topology design of structures in discrete variables. Int. J. Numer. Meth. Engng, 50:1607–1618, 2001.


[98] Bendsøe, M.P., A. Ben-Tal, J. Zowe. Optimization methods for truss geometry and topology design. Structural Optimization, 7:141–159, 1994.

[99] Svanberg, K. On local and global minima in structural optimization. In Atrek, E., R.H. Gallagher, K.M. Ragsdell, O.C. Zienkiewicz, editors, New Directions in Optimum Structural Design, pages 327–341. Wiley, New York, 1984.

[100] Diaz, A., Bendsøe, M.P. Shape optimization of structures for multiple loading conditions using a homogenization method. Structural Optimization, 4:17–22, 1992.

[101] Taylor, J.E. Maximum strength elastic structural design. Proc. ASCE, 95:653–663, 1969.

[102] Olhoff, N., Taylor, J.E. On structural optimization. J. Appl. Mech., 50:1134–1151, 1983.

[103] Rozvany, G.I.N. Structural Design via Optimality Criteria. Kluwer, Dordrecht, 1989.

[104] J.N. Reddy. An Introduction to the Finite Element Method. McGraw-Hill Series in Mechanical Engineering. McGraw-Hill Higher Education, 2006.

[105] H.R. Schwarz. Methode der finiten Elemente: eine Einführung unter besonderer Berücksichtigung der Rechenpraxis. Leitfäden der angewandten Mathematik und Mechanik. Teubner, 1991.

[106] H. Goldstein. Classical Mechanics. Addison-Wesley, 2001.

[107] A.E.H. Love. The small free vibrations and deformation of a thin elastic shell. Philosophical Transactions of the Royal Society of London A, 179:491–546, 1888.

[108] M. Iura and S.N. Atluri. Formulation of a membrane finite element with drilling degrees of freedom. Computational Mechanics, 9(6):417–428, 1992.

[109] R.D. Cook. Four-node flat shell element: Drilling degrees of freedom, membrane-bending coupling, warped geometry, and behavior. Computers & Structures, 50(4):549–555, 1994.


Appendix A

Finite Element Method

Within this section, a short overview of the Finite Element Method (FEM) is given. The content is mainly based on the textbooks of Cook et al. [67], Reddy [104] and Schwarz [105]. The FEM is a numerical method for solving partial differential equations. It can be applied to a wide field of physical problems such as structural analysis, heat transfer, magnetic fields, flow processes and many more. The geometry of the physical problem is discretized by dividing the domain into smaller parts called finite elements. A finite element has a domain, a boundary and so-called nodes on which the degrees of freedom are defined. While the partial differential equations can only be solved analytically for simple geometries, there is no geometric restriction when using the FEM. Moreover, there is no restriction on boundary conditions or material properties, which makes the method applicable to any continuum mechanical problem. Numerous commercial finite element codes exist and are still being enhanced and refined today. While a deeper understanding of the method requires some effort, FEM codes can be used with little knowledge of the method and the underlying problem. However, the consequences of an incorrect application "may range from embarrassing to disastrous" [67].
In the following, the explanations focus on the FEM for structural analysis problems. The solution of the elasticity problem demands the fulfillment of the fundamental equation

L^T C L u + f = \rho \ddot{u} \qquad (A.1)

within a given domain \Omega, taking into consideration the prescribed surface stresses \bar{\sigma} and displacements \bar{u} on the surface \Gamma as well as the internally acting body forces f. As mentioned, an exact solution of the problem can only be found for simple geometries and simple boundary conditions, e.g. a bar under uniaxial load. Generally, the FEM provides only approximate solutions; with sufficient numerical effort, however, the obtained solution may come arbitrarily close to the true solution.

A.1 Equation of Motion

There are several ways to derive the linear equation system of the FEM, which can be solved for the unknown displacements u. A general approach is based on Lagrangian mechanics [106]. It is a formalism based on Hamilton's principle which describes the dynamics of a system with a scalar function called the Lagrangian L, and it can be understood as a generalization of the principle of virtual displacements to the dynamics of solids [104]. The equation of motion in Lagrangian mechanics, also known as the Euler-Lagrange equation,


is defined as

\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}}\right) - \frac{\partial L}{\partial x} = 0 \qquad (A.2)

where L denotes the Lagrangian and x the generalized coordinates. The Lagrangian L is defined as

L = T - \Pi = T - (U - W) \qquad (A.3)

where T denotes the kinetic energy and \Pi the potential energy. Hamilton's principle for an elastic body reads

\delta \int_{t_1}^{t_2} \left[ T - (U - W) \right] dt = \delta \int_{t_1}^{t_2} L \, dt = 0 \qquad (A.4)

which requires the time integral of the Lagrangian L to be stationary. The fundamental lemma of the calculus of variations shows that solving the Lagrange equations is equivalent to finding a solution of Hamilton's principle. Moreover, using Lagrangian mechanics transforms the equation of motion into its weak form, which is needed for the finite element formulation.
The potential energy \Pi is the sum of the deformation energy U and the negative of the potential of the external forces W. If the displacements u are small, as required for the linear elasticity problem, the velocities can be approximated by the displacement time derivatives \dot{u}. The kinetic energy can thus be formulated as the integral over the domain \Omega of the density \rho and the scalar product of the velocities \dot{u}:

T = \frac{1}{2} \int_{\Omega} \rho \, \dot{u}^T \dot{u} \, d\Omega \qquad (A.5)

The deformation energy U is defined as the domain integral of the scalar product of stresses \sigma and strains \varepsilon:

U = \frac{1}{2} \int_{\Omega} \varepsilon^T \sigma \, d\Omega \qquad (A.6)

and the potential of the external forces W depends on the body forces f and the surface stresses \bar{\sigma}; note that the surface stresses are integrated over the surface \Gamma:

W = \int_{\Omega} f^T u \, d\Omega + \int_{\Gamma} \bar{\sigma}^T u \, d\Gamma \qquad (A.7)

The Lagrangian L is formulated for the entire domain \Omega. To obtain the FEM formulation, the domain must be discretized into smaller sub-domains \Omega_e, namely the finite elements. The integration over the domain \Omega is replaced by a summation of the integrals over the sub-domains \Omega_e. Additionally, local approximation functions \varphi, also called shape functions, are defined. They map the finite element nodal displacements \hat{u}, which also represent the degrees of freedom, to the continuous displacements u. The nodal velocities \dot{\hat{u}} and accelerations \ddot{\hat{u}} are mapped analogously:

u \approx \varphi^T \hat{u}, \quad \dot{u} \approx \varphi^T \dot{\hat{u}}, \quad \ddot{u} \approx \varphi^T \ddot{\hat{u}} \qquad (A.8)
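As a small numerical illustration of the interpolation (A.8), consider a single two-node 1D element with linear shape functions; the element, shape functions and nodal values below are a hypothetical minimal example, not taken from the text:

```python
import numpy as np

def phi(xi):
    """Linear Lagrange shape functions of a two-node 1D element, xi in [0, 1]."""
    return np.array([1.0 - xi, xi])

u_hat = np.array([2.0, 6.0])      # hypothetical nodal displacements
u_mid = phi(0.5) @ u_hat          # interpolated field value at the element midpoint

assert np.isclose(u_mid, 4.0)             # linear interpolation between 2 and 6
assert np.isclose(phi(0.0) @ u_hat, 2.0)  # shape functions recover nodal values
assert np.isclose(phi(1.0) @ u_hat, 6.0)
```

The same pattern, with more nodes and more dimensions, underlies every element formulation in this appendix.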

Considering the kinematic equation

\varepsilon = L u \qquad (A.9)

the strains \varepsilon can be expressed as a function of the nodal point displacements \hat{u}:

\varepsilon = L u = L \varphi^T \hat{u} = B \hat{u} \qquad (A.10)


A strain-displacement matrix B, which contains the spatial derivatives of the local shape functions, is built by applying the differential matrix operator L to the shape functions \varphi. The total Lagrangian for the discretized system then takes the discrete form

L = \sum_{nelm} \left[ \frac{1}{2} \int_{\Omega_e} \rho \, \dot{\hat{u}}^T \varphi \varphi^T \dot{\hat{u}} \, d\Omega_e \right]
  - \sum_{nelm} \left[ \frac{1}{2} \int_{\Omega_e} \hat{u}^T B^T C B \, \hat{u} \, d\Omega_e \right]
  + \sum_{nelm} \left[ \int_{\Omega_e} f^T \varphi^T \hat{u} \, d\Omega_e \right]
  + \sum_{nelm} \left[ \int_{\Gamma_e} \bar{\sigma}^T \varphi^T \hat{u} \, d\Gamma_e \right] \qquad (A.11)

Finally, the unknown nodal displacements \hat{u} are taken as generalized coordinates, wherefore the Euler-Lagrange equation (A.2) is modified to

\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{\hat{u}}}\right) - \frac{\partial L}{\partial \hat{u}} = 0 \qquad (A.12)

The evaluation of the Lagrangian formalism leads to the equation of motion for the discretized linear system:

\sum_{nelm} \left[ \int_{\Omega_e} \rho \varphi \varphi^T \ddot{\hat{u}} \, d\Omega_e \right]
  + \sum_{nelm} \left[ \int_{\Omega_e} B^T C B \, \hat{u} \, d\Omega_e \right]
  - \sum_{nelm} \left[ \int_{\Omega_e} \varphi f \, d\Omega_e \right]
  - \sum_{nelm} \left[ \int_{\Gamma_e} \varphi \bar{\sigma} \, d\Gamma_e \right] = 0 \qquad (A.13)

The equation is then rearranged so that terms with no dependence on the nodal displacements \hat{u} are brought to the right-hand side:

\sum_{nelm} \left[ \int_{\Omega_e} B^T C B \, d\Omega_e \right] \hat{u}
  + \sum_{nelm} \left[ \int_{\Omega_e} \rho \varphi \varphi^T d\Omega_e \right] \ddot{\hat{u}}
  = \sum_{nelm} \left[ \int_{\Omega_e} \varphi f \, d\Omega_e \right]
  + \sum_{nelm} \left[ \int_{\Gamma_e} \varphi \bar{\sigma} \, d\Gamma_e \right] \qquad (A.14)

The sums on the left-hand side represent the global stiffness matrix K and the global mass matrix M, while the right-hand side represents the load vector r. With these symbols, the basic problem of the FEM for the linear elastic case is written as

K \hat{u} + M \ddot{\hat{u}} = r \qquad (A.15)

Assuming a static problem, the accelerations \ddot{\hat{u}} vanish, so that the equation simplifies to

K \hat{u} = r \qquad (A.16)

which is a simple linear equation system for the unknown displacements \hat{u}. The equation of motion can also be used for the determination of the harmonic eigenfrequencies of the structural system. There, the load vector r is assumed to be zero. With the harmonic ansatz, the equation of motion can be transformed into an eigenvalue problem with the unknown eigenvalues \lambda and eigenvectors \Phi, where \lambda equals the square of the angular frequency \omega:

\hat{u} = \Phi \sin(\omega t) \qquad (A.17)

\ddot{\hat{u}} = -\omega^2 \Phi \sin(\omega t) \qquad (A.18)


The combination of the equation of motion (A.15) with the harmonic ansatz (A.17, A.18) yields the eigenvalue problem for the harmonic vibration:

\left( K - \omega^2 M \right) \Phi = 0 \qquad (A.19)
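The eigenvalue problem (A.19) can be solved directly with standard numerical libraries. The following sketch uses a hypothetical two-degree-of-freedom spring-mass chain (unit springs, masses 2 and 1, fixed at one end); the matrices are illustration values, not from the text:

```python
import numpy as np
from scipy.linalg import eigh

# Hypothetical assembled global matrices of a two-d.o.f. spring-mass chain.
K = np.array([[2.0, -1.0],
              [-1.0, 1.0]])     # global stiffness matrix
M = np.diag([2.0, 1.0])         # global (lumped) mass matrix

# Solve (K - omega^2 M) Phi = 0 as a generalized eigenvalue problem;
# eigh returns the eigenvalues lambda = omega^2 in ascending order.
lam, Phi = eigh(K, M)
omega = np.sqrt(lam)            # angular eigenfrequencies

# Each mode satisfies K Phi_i = lambda_i M Phi_i.
for i in range(len(lam)):
    assert np.allclose(K @ Phi[:, i], lam[i] * (M @ Phi[:, i]))
```

For large FE models, sparse variants (e.g. shift-invert Lanczos solvers) replace the dense `eigh` call, but the mathematical problem is the same.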

Considering equation (A.14), the global system matrices can be extracted directly. The global stiffness matrix K is defined as the sum of the element stiffness matrices,

K = \sum_{nelm} \left[ \int_{\Omega_e} B^T C B \, d\Omega_e \right] \qquad (A.20)

whereas the element stiffness matrices k are given as

k = \int_{\Omega_e} B^T C B \, d\Omega_e \qquad (A.21)

Analogously, the global mass matrix M and the element mass matrices m are defined as

M = \sum_{nelm} \left[ \int_{\Omega_e} \rho \varphi \varphi^T d\Omega_e \right] \qquad (A.22)

and

m = \int_{\Omega_e} \rho \varphi \varphi^T d\Omega_e \qquad (A.23)

The load vector r contains the internal body forces and the prescribed forces on the surface of the structure. The shape functions \varphi distribute the loads to the nodes:

r = \sum_{nelm} \left[ \int_{\Omega_e} \varphi f \, d\Omega_e \right] + \sum_{nelm} \left[ \int_{\Gamma_e} \varphi \bar{\sigma} \, d\Gamma_e \right] \qquad (A.24)

The summations must account for the connectivity of each element with the globally numbered mesh nodes.
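The assembly and static solution described above can be sketched for the simplest case, a chain of two-node bar elements. All data below (material, discretization, load) are hypothetical; for a uniform bar the exact tip displacement F L / (E A) serves as a check:

```python
import numpy as np

# Minimal assembly sketch (not the authors' code): element stiffness matrices
# are scatter-added into the global matrix via the connectivity, then the
# static system K u = r from (A.16) is solved.
E, A = 70.0e9, 1.0e-4            # hypothetical Young's modulus [Pa], area [m^2]
n_elem, L_total = 4, 1.0         # uniform discretization of a 1 m bar
Le = L_total / n_elem
k_e = (E * A / Le) * np.array([[1.0, -1.0],
                               [-1.0, 1.0]])   # two-node bar element stiffness

n_nodes = n_elem + 1
K = np.zeros((n_nodes, n_nodes))
for e in range(n_elem):
    dofs = [e, e + 1]             # element-to-global node connectivity
    K[np.ix_(dofs, dofs)] += k_e  # scatter-add the element contribution

r = np.zeros(n_nodes)
r[-1] = 1000.0                    # 1 kN tip load

# Impose u_0 = 0 by reducing the system to the free degrees of freedom.
free = np.arange(1, n_nodes)
u = np.zeros(n_nodes)
u[free] = np.linalg.solve(K[np.ix_(free, free)], r[free])

# For a uniform bar under a tip load the FEM solution is exact: u_tip = F L / (E A).
assert np.isclose(u[-1], 1000.0 * L_total / (E * A))
```

Real codes use sparse storage and element-local degree-of-freedom maps, but the scatter-add pattern is identical.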


A.2 Formulation of a Layered Shell Element

A general shell element combines membrane and flexural stress theory. The Kirchhoff-Love theory [107], which has already been postulated for the derivation of the CLT, is used here for the formulation of the bending behavior. The displacements of a point are composed of the mid-plane displacements u_0, v_0 and a term arising from the plate curvature. The curvature term depends on the through-thickness position z and the rotations of the normal to the middle plane, \theta_x and \theta_y, according to equations (A.25) and (A.26):

u = u_0 + z \, \theta_y \qquad (A.25)

v = v_0 + z \, (-\theta_x) \qquad (A.26)

The chosen conventions for the derivation of the finite shell element are shown in Figure A.1; an alternative convention would be feasible as well.

[Figure: plate element with axes x, y, z, displacements u, v, w, rotations \theta_x, \theta_y, in-plane forces N_x, N_y, N_{xy} and moments M_x, M_y, M_{xy}]

Figure A.1: Loads and kinematics of a Kirchhoff plate

Taking advantage

of the kinematic relations, the strains yield

\underbrace{\begin{bmatrix} \varepsilon_x \\ \varepsilon_y \\ \gamma_{xy} \end{bmatrix}}_{\varepsilon}
= \begin{bmatrix} u_{,x} \\ v_{,y} \\ u_{,y} + v_{,x} \end{bmatrix}
= \underbrace{\begin{bmatrix} u_{0,x} \\ v_{0,y} \\ u_{0,y} + v_{0,x} \end{bmatrix}}_{\varepsilon_0}
+ z \underbrace{\begin{bmatrix} \theta_{y,x} \\ -\theta_{x,y} \\ \theta_{y,y} - \theta_{x,x} \end{bmatrix}}_{\kappa} \qquad (A.27)

where \varepsilon_0 denotes the mid-plane or membrane strains and \kappa the plate curvatures. Thin-plate theory assumes the strain distribution to be linear through the thickness. The out-of-plane deflections w are connected to the in-plane displacements u, v through the transverse shear strains

\begin{bmatrix} \gamma_{xz} \\ \gamma_{yz} \end{bmatrix}
= \begin{bmatrix} \partial w / \partial x + \partial u / \partial z \\ \partial v / \partial z + \partial w / \partial y \end{bmatrix} \qquad (A.28)

Alternatively, they can be expressed in terms of \theta_x and \theta_y using the derivatives of the kinematic relations (A.25) and (A.26):

\frac{\partial u}{\partial z} = \theta_y \qquad (A.29)

\frac{\partial v}{\partial z} = -\theta_x \qquad (A.30)


which leads to

\begin{bmatrix} \gamma_{xz} \\ \gamma_{yz} \end{bmatrix}
= \begin{bmatrix} \partial w / \partial x + \theta_y \\ \partial w / \partial y - \theta_x \end{bmatrix} \qquad (A.31)

Within the FEM (see Section A.1), the strains are composed of the matrix multiplication of the differential operator L, the shape functions \varphi and the nodal displacements \hat{u} (see equation (A.10)):

\varepsilon = L u = L \varphi^T \hat{u} = B \hat{u} \qquad (A.32)

According to equation (A.21), the element stiffness matrix is defined as

k = \int_{\Omega_e} B^T C B \, d\Omega_e \qquad (A.33)

Since the membrane (m), bending (b) and transverse shear (s) parts depend differently on the out-of-plane coordinate z (see equation (A.27)) and on different material laws, respectively, the integration over the shell thickness is carried out separately for each part. Assuming a homogeneous material, the membrane stiffness matrix yields

k_m = \int_A \int_{-t/2}^{t/2} B_m^T C_f B_m \, dz \, dA = t \int_A B_m^T C_f B_m \, dA \qquad (A.34)

with the strain-displacement matrix (columns ordered u, v)

B_m = \begin{bmatrix} \partial\varphi/\partial x & 0 \\ 0 & \partial\varphi/\partial y \\ \partial\varphi/\partial y & \partial\varphi/\partial x \end{bmatrix} \qquad (A.35)

and the plane-stress material matrix

C_f = \frac{E}{1-\nu^2} \begin{bmatrix} 1 & \nu & 0 \\ \nu & 1 & 0 \\ 0 & 0 & \frac{1-\nu}{2} \end{bmatrix} \qquad (A.36)

The bending stiffness is given by

k_b = \int_A \int_{-t/2}^{t/2} B_b^T \, z C_f z \, B_b \, dz \, dA = \frac{t^3}{12} \int_A B_b^T C_f B_b \, dA \qquad (A.37)

with the strain-displacement matrix (columns ordered \theta_x, \theta_y)

B_b = \begin{bmatrix} 0 & \partial\varphi/\partial x \\ -\partial\varphi/\partial y & 0 \\ -\partial\varphi/\partial x & \partial\varphi/\partial y \end{bmatrix} \qquad (A.38)


Also the transverse shear stiffness part

k_s = \int_A \int_{-t/2}^{t/2} B_s^T \kappa C_c B_s \, dz \, dA = \kappa t \int_A B_s^T C_c B_s \, dA \qquad (A.39)

is linearly dependent on the thickness t. The stress-strain relation in transverse shear is defined as

C_c = \begin{bmatrix} G & 0 \\ 0 & G \end{bmatrix} \qquad (A.40)

and a shear correction factor \kappa is introduced to mitigate the shear locking problem. The corresponding strain-displacement matrix (columns ordered w, \theta_x, \theta_y) is given by

B_s = \begin{bmatrix} \partial\varphi/\partial x & 0 & \varphi \\ \partial\varphi/\partial y & -\varphi & 0 \end{bmatrix} \qquad (A.41)

The total element stiffness matrix k is the summation of the single parts, whereby the degrees of freedom indicated for equations (A.35), (A.38) and (A.41) must be respected:

k = k_m + k_b + k_s \qquad (A.42)

A general shell element must have six degrees of freedom per node in order to be feasible for spatial 3D modeling. However, the formulation above covers only the five d.o.f. u, v, w, \theta_x and \theta_y, but not \theta_z. The so-called drilling degree of freedom \theta_z must be introduced artificially (see [108, 109]) in order to make the equation system solvable. Considering equations (A.34), (A.37) and (A.39), it becomes obvious that no re-evaluation of the integrals is needed if the thickness t is changed, since t appears explicitly. The thickness can be adapted at very low additional computational cost, which makes the shell elements very efficient and well suited for preliminary design.
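The thickness scaling noted above can be sketched as follows. The "unit-thickness" area integrals are replaced by random symmetric stand-in matrices, since only the explicit scaling with t and t^3/12 matters for this point; the shear correction factor value is the common 5/6 choice, not prescribed by the text:

```python
import numpy as np

# Stand-in matrices for the precomputed area integrals with the thickness
# factored out (hypothetical values; real ones come from (A.34)-(A.39)).
rng = np.random.default_rng(0)

def sym(n):
    a = rng.standard_normal((n, n))
    return 0.5 * (a + a.T)

km0, kb0, ks0 = sym(4), sym(4), sym(4)
kappa = 5.0 / 6.0                 # common choice of shear correction factor

def k_elem(t):
    """Element stiffness for thickness t, without re-evaluating any integral."""
    return t * km0 + t**3 / 12.0 * kb0 + kappa * t * ks0

t = 1.0e-3
# Membrane and shear parts scale linearly in t, bending cubically, so
# k(2t) - 2 k(t) isolates the bending contribution (8 - 2) t^3/12 kb0:
assert np.allclose(k_elem(2 * t) - 2 * k_elem(t), t**3 / 2.0 * kb0)
```

This is why thickness is such a cheap design variable for shell models: an optimizer can rescale three cached matrices per element instead of re-integrating them.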


Appendix B

Material Invariants of Orthotropic Materials

The theory of lamination parameters introduced in Section 8.4 requires the matrices \Gamma_i containing the material invariants of orthotropic materials. Tsai and Pagano [68] formulated the reduced stiffness transformation equations by a combination of trigonometric identities and five material invariants U_i, which are invariant under rotations about the z-axis. Following the textbook of Jones [25], the relations are

\bar{Q}_{11} = U_1 + U_2 \cos 2\varphi + U_3 \cos 4\varphi \qquad (B.1)

\bar{Q}_{12} = U_4 - U_3 \cos 4\varphi \qquad (B.2)

\bar{Q}_{22} = U_1 - U_2 \cos 2\varphi + U_3 \cos 4\varphi \qquad (B.3)

\bar{Q}_{16} = \tfrac{1}{2} U_2 \sin 2\varphi + U_3 \sin 4\varphi \qquad (B.4)

\bar{Q}_{26} = \tfrac{1}{2} U_2 \sin 2\varphi - U_3 \sin 4\varphi \qquad (B.5)

\bar{Q}_{66} = U_5 - U_3 \cos 4\varphi \qquad (B.6)

in which

U_1 = \tfrac{1}{8}\left(3Q_{11} + 3Q_{22} + 2Q_{12} + 4Q_{66}\right) \qquad (B.7)

U_2 = \tfrac{1}{2}\left(Q_{11} - Q_{22}\right) \qquad (B.8)

U_3 = \tfrac{1}{8}\left(Q_{11} + Q_{22} - 2Q_{12} - 4Q_{66}\right) \qquad (B.9)

U_4 = \tfrac{1}{8}\left(Q_{11} + Q_{22} + 6Q_{12} - 4Q_{66}\right) \qquad (B.10)

U_5 = \tfrac{1}{8}\left(Q_{11} + Q_{22} - 2Q_{12} + 4Q_{66}\right) \qquad (B.11)

whereas the Q_{ij} are the entries of the reduced stiffness matrix in the principal material coordinate system.


These equations can be transformed to matrix notation,

\bar{Q} = \Gamma_0 + \Gamma_1 \cos 2\varphi + \Gamma_2 \sin 2\varphi + \Gamma_3 \cos 4\varphi + \Gamma_4 \sin 4\varphi \qquad (B.12)

with

\Gamma_0 = \begin{bmatrix} U_1 & U_4 & 0 \\ U_4 & U_1 & 0 \\ 0 & 0 & U_5 \end{bmatrix} \qquad (B.13)

\Gamma_1 = \begin{bmatrix} U_2 & 0 & 0 \\ 0 & -U_2 & 0 \\ 0 & 0 & 0 \end{bmatrix} \qquad (B.14)

\Gamma_2 = \frac{1}{2}\begin{bmatrix} 0 & 0 & U_2 \\ 0 & 0 & U_2 \\ U_2 & U_2 & 0 \end{bmatrix} \qquad (B.15)

\Gamma_3 = \begin{bmatrix} U_3 & -U_3 & 0 \\ -U_3 & U_3 & 0 \\ 0 & 0 & -U_3 \end{bmatrix} \qquad (B.16)

\Gamma_4 = \begin{bmatrix} 0 & 0 & U_3 \\ 0 & 0 & -U_3 \\ U_3 & -U_3 & 0 \end{bmatrix} \qquad (B.17)

These are also the matrices \Gamma_i which are employed in the lamination parameter theory in Section 8.4.
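The matrix form (B.12) can be checked numerically against the classical transformation formulas for the reduced stiffnesses. The ply data below are hypothetical (roughly a unidirectional CFRP ply, moduli in GPa), chosen only to exercise the identities:

```python
import numpy as np

# Hypothetical orthotropic ply (plane stress), moduli in GPa.
E1, E2, G12, nu12 = 140.0, 10.0, 5.0, 0.3
nu21 = nu12 * E2 / E1
Q11 = E1 / (1.0 - nu12 * nu21)
Q22 = E2 / (1.0 - nu12 * nu21)
Q12 = nu12 * Q22
Q66 = G12

# Material invariants (B.7)-(B.11)
U1 = (3*Q11 + 3*Q22 + 2*Q12 + 4*Q66) / 8
U2 = (Q11 - Q22) / 2
U3 = (Q11 + Q22 - 2*Q12 - 4*Q66) / 8
U4 = (Q11 + Q22 + 6*Q12 - 4*Q66) / 8
U5 = (Q11 + Q22 - 2*Q12 + 4*Q66) / 8

# Matrices Gamma_i from (B.13)-(B.17)
G0 = np.array([[U1, U4, 0.0], [U4, U1, 0.0], [0.0, 0.0, U5]])
G1 = np.array([[U2, 0.0, 0.0], [0.0, -U2, 0.0], [0.0, 0.0, 0.0]])
G2 = 0.5 * np.array([[0.0, 0.0, U2], [0.0, 0.0, U2], [U2, U2, 0.0]])
G3 = np.array([[U3, -U3, 0.0], [-U3, U3, 0.0], [0.0, 0.0, -U3]])
G4 = np.array([[0.0, 0.0, U3], [0.0, 0.0, -U3], [U3, -U3, 0.0]])

def Qbar_invariant(p):
    """Rotated reduced stiffness via the invariant form (B.12)."""
    return (G0 + G1*np.cos(2*p) + G2*np.sin(2*p)
               + G3*np.cos(4*p) + G4*np.sin(4*p))

def Qbar_direct(p):
    """Classical transformation of the reduced stiffnesses (Jones [25])."""
    c, s = np.cos(p), np.sin(p)
    Qb11 = Q11*c**4 + 2*(Q12 + 2*Q66)*c**2*s**2 + Q22*s**4
    Qb12 = (Q11 + Q22 - 4*Q66)*c**2*s**2 + Q12*(c**4 + s**4)
    Qb22 = Q11*s**4 + 2*(Q12 + 2*Q66)*c**2*s**2 + Q22*c**4
    Qb16 = (Q11 - Q12 - 2*Q66)*c**3*s + (Q12 - Q22 + 2*Q66)*c*s**3
    Qb26 = (Q11 - Q12 - 2*Q66)*c*s**3 + (Q12 - Q22 + 2*Q66)*c**3*s
    Qb66 = (Q11 + Q22 - 2*Q12 - 2*Q66)*c**2*s**2 + Q66*(c**4 + s**4)
    return np.array([[Qb11, Qb12, Qb16],
                     [Qb12, Qb22, Qb26],
                     [Qb16, Qb26, Qb66]])

# Both expressions agree for arbitrary ply angles.
for p in np.linspace(0.0, np.pi, 13):
    assert np.allclose(Qbar_invariant(p), Qbar_direct(p))
```

The check also confirms the internal consistency of (B.1)-(B.6) with the matrices (B.13)-(B.17): the sin-term coefficients of the Γ matrices reproduce exactly the \bar{Q}_{16} and \bar{Q}_{26} expressions.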
