dr-rnn: a deep residual recurrent neural network for ?· dr-rnn: a deep residual recurrent neural...
Post on 31-Aug-2018
Embed Size (px)
DR-RNN: A deep residual recurrent neural network for model
J.Nagoor Kania,, Ahmed H. Elsheikha
a School of Energy, Geoscience, Infrastructure and Society,Heriot-Watt University, Edinburgh, UK
We introduce a deep residual recurrent neural network (DR-RNN) as an efficientmodel reduction technique for nonlinear dynamical systems. The developed DR-RNNis inspired by the iterative steps of line search methods in finding the residual minimiserof numerically discretized differential equations. We formulate this iterative scheme asstacked recurrent neural network (RNN) embedded with the dynamical structure of theemulated differential equations. Numerical examples demonstrate that DR-RNN can ef-fectively emulate the full order models of nonlinear physical systems with a significantlylower number of parameters in comparison to standard RNN architectures. Further, wecombined DR-RNN with Proper Orthogonal Decomposition (POD) for model reduc-tion of time dependent partial differential equations. The presented numerical resultsshow the stability of proposed DR-RNN as an explicit reduced order technique. Wealso show significant gains in accuracy by increasing the depth of proposed DR-RNNsimilar to other applications of deep learning.
Recently, detailed numerical simulations of highly nonlinear partial differential equa-tions representing multi-physics problems became possible due to the increased powerand memory of modern computers. Nevertheless, detailed simulations remain far tooexpensive to be used in various engineering tasks including design optimization, uncer-tainty quantification, and real-time decision support. For example, Bayesian calibrationof subsurface reservoirs might involve millions of numerical simulations to account forthe heterogeneities in the permeability fields [13, 14]. Model Order Reduction (MOR)provides a solution to this problem by learning a computationally cheap model froma set of the detailed simulation runs. These reduced models are used to replace thehigh-fidelity models in optimization and statistical inference tasks. MOR could bebroadly categorized into three different classes: simplified physics based models, data-
Corresponding authorEmail addresses: email@example.com (J.Nagoor Kani ), firstname.lastname@example.org (Ahmed H. Elsheikh)
Preprint submitted to Elsevier September 5, 2017
fit black box models (surrogate models)  and projection based reduced order modelscommonly referred to as ROM .
Physics based reduced order models are derived from high-fidelity models using ap-proaches such as simplifying physics assumptions, using coarse grids, and/or upscalingof the model parameters. Data-fit models are generated using regression of the high-fidelity simulation data from the input to the output [15, 29]. In projection based ROM,the governing equations of the system are projected into a low-dimensional subspacespanned by a small number of basis functions commonly obtained by Galerkin pro-jection. In all projection based ROM methods, it is generally assumed that the mainsolution characteristics could be efficiently represented using a linear combination ofonly a small number of basis functions. Under this assumption, it is possible to ac-curately capture the input-output relationship of a large-scale full-order model (FOM)using a reduced system with significantly fewer degrees of freedom [24, 4].
In projection based ROM, different methods could be used to construct the pro-jection bases including: Proper Orthogonal Decomposition (POD), Krylov sub-spacemethods, and methods based on truncated balanced realization [24, 31]. ROM basedon Proper Orthogonal Decomposition has been widely used to model nonlinear sys-tems [15, 31]. Despite the success of POD based methods, there exist a number ofoutstanding issues that limit the applicability of POD method as an effective reducedorder modeling technique.
One issue is related to the cost of evaluating the projected nonlinear function and thecorresponding Jacobian matrix in every Newton iteration. These costs create a compu-tational bottleneck that reduces the performance of the resulting reduced order models.Some existing approaches for constructing a reduced order approximation to alleviatesuch computational bottleneck are gappy POD technique, sparse sampling, MissingPoint Estimation (MPE), Best Point Interpolation Method (BPIM), Empirical Inter-polation Method and Discrete Empirical Interpolation Method (DEIM) [38, 3, 9]. Allthese methods rely on interpolation schemes involving the selection of discrete spatialpoints for producing an interpolated approximation of the nonlinear functions. More-over, these methods are developed especially for removing the computational complexitydue to the nonlinear function in the PDE system after spatial discretization.
Another issue is related to convergence and stability of the extracted ROM. Al-though POD based methods decrease the calculation times by orders of magnitude asa result of reducing the state variables dimension, this reduction goes hand in handwith loss of accuracy. This may result not only in inaccurate results, but also in slowconvergence and in some cases model instabilities. Slow convergence means that manyiterations are needed to reach the final solution and corresponds to an increase in thecomputational time. Divergence is even less desirable as it produces invalid simulationresults.
Artificial Neural Networks (ANN) have found growing success in many machinelearning applications such as computer vision, speech recognition and machine trans-lation [19, 18, 20, 17]. Further, ANNs offer a promising direction for the developmentof innovative model reduction strategies. Neural network use in the domain of MOR
is generally limited to constructing surrogate models to emulate the input-output rela-tionship of the system based on the available simulation and experimental data [23, 33].Neural networks have also been combined with POD to generate reduced order modelswithout any knowledge of the governing dynamical systems . One reason for devel-oping such non-intrusive reduced order modeling methods is to address the main issuesof POD-Galerkin projection ROM technique such as stability and efficient nonlinearityreduction.
Recently, Recurrent Neural Network (RNN) a class of artificial neural network whereconnections between units form a directed cycle have been successfully applied to vari-ous sequence modeling tasks such as automatic speech recognition and system identifi-cation of time series data [19, 18, 20, 17]. RNN has been used to emulate the evolutionof dynamical systems in a number of applications [40, 2] and hence has large potentialin building surrogate models and reduced order models for nonlinear dynamical sys-tems. The standard approach of modeling dynamical systems using RNN relies on threesteps: (a) generating training samples from a number of detailed numerical simulations,(b) defining the suitable structure of RNN to represent the system evolution, and (c)fitting the RNN parameters to the training data. This pure data-driven approach isvery general and can be effectively tuned to capture any nonlinear discrete dynami-cal system. However, the accuracy of this approach relies on the number of trainingsamples (obtained by running a computationally expensive model) and on the selectedRNN architecture. In addition, generic architectures might require a large number ofparameters to fit the training data and thus increases the computational cost of theRNN calibration process.
Many types of recurrent neural network architectures have been proposed for mod-eling time-dependent phenomena [40, 2]. Among those, a recurrent neural networkcalled Error Correction Neural Network (ECNN) , that utilizes the misfit betweenthe model output and the true output termed as model error to construct the RNNarchitecture. ECNN architecture  augmented the standard RNN architecture byadding a correction factor based on the model error. Further, the correction factor inECNN was activated only during the training of RNN. In other words, ECNN takesthe time series of the reference output as an input to RNN for a certain length of thetime period and after that time period (i.e. in future time steps), ECNN forecasts theoutput without the reference output as input from the fitted model.
In the current paper, we propose a physics aware RNN architecture to capture theunderlying mathematical structure of the dynamical system under consideration. Wefurther extend this architecture as a deep residual RNN (DR-RNN) inspired by theiterative line search methods [5, 36] which iteratively find the minimiser of a nonlinearobjective function. The developed DR-RNN is trained to find the residual minimiserof numerically discretized ODEs or PDEs. We note that the concept of depth in theproposed DR-RNN is different from the view of hierarchically representing the abstractinput to fit the desired output commonly adopted in standard deep neural networkarchitectures [27, 28]. The proposed DR-RNN method reduces the computational com-plexity from O(n3) to O(n2) for fully coupled nonlinear systems of size n and from
O(n2) to O(n) for sparse nonlinear systems obtained from discretizing time-dependentpartial differential equations.
We further combined DR-RNN with projection based ROM ideas (e.g. POD andDEIM ) to produce an efficient explicit nonlinear model reduction technique withsuperior convergence and stability properties. Combining DR-RNN with POD/DEIM,resulted in further reduction of the computational complexity form O(r3) to O(r2),where r is the size of the reduced order model.
The rest of this paper is org