system identification, forecasting, control neural networks - indico
TRANSCRIPT
Page 1 © Siemens AG, Corporate TechnologyCT RTC BAM
Hans Georg Zimmermann
Siemens AG, Corporate Technology
Email : [email protected]
Corporate Technology
System Identification, Forecasting, Controlwith
Neural Networks
Page 2 © Siemens AG, Corporate TechnologyCT RTC BAM
Neural Networks @ Siemens: 25 Years of Research, Development, Innovation
Prior Rule Information
Recurrent Neural Nets
State Space Modeling
FeedforwardNeural Nets
Function Approximation Complex SystemsComplex SystemsComplex Systems
Neuro-Fuzzy Networks
Forecasting& Diagnosis
UncertaintyAnalysis
Optimal Control
TopologyDesign
LearningAlgorithms
Specification Languages
Scripted Modeling
Dataprocessing
Fuzzy Rules
Graphical User Interface
High Performance Computing
Software Environment
Process Modeling
ForecastingRenewables
Process Control
Process Surveillance
Customer Relation Mgt.
Risk Analysis Price
Forecasting
DemandForecasting
EconomicalApplications
TechnicalApplications
for Neural Networks
DecisionSupport
Mathematics of
Neural NetworksNeural NetworksNeural Networks
Prior Rule Information
Recurrent Neural Nets
State Space Modeling
FeedforwardNeural Nets
Function Approximation Complex SystemsComplex SystemsComplex Systems
Neuro-Fuzzy Networks
Forecasting& Diagnosis
UncertaintyAnalysis
Optimal Control
TopologyDesign
LearningAlgorithms
Specification Languages
Scripted Modeling
Dataprocessing
Fuzzy Rules
Graphical User Interface
High Performance Computing
Software Environment
Process Modeling
ForecastingRenewables
Process Control
Process Surveillance
Customer Relation Mgt.
Risk Analysis Price
Forecasting
DemandForecasting
EconomicalApplications
TechnicalApplications
for Neural Networks
DecisionSupport
Mathematics of
Neural NetworksNeural NetworksNeural Networks
Page 3 © Siemens AG, Corporate TechnologyCT RTC BAM
Mathematical Neural Networks
# variables
non-linearity
Linear Algebra
Calculus Neural Networks
)( 12 xWfWy =
Existence Theorem:(Hornik, Stinchcombe, White 1989)3-layer neural networks can approximate any continuous function on a compact domain.
Complex Systems Nonlinear Regression
( )21 ,1
2 minWW
T
t
dtt yyE →−=∑
=
Based on data identify an input-output relation
Neural networks imply a Correspondence ofEquations, Architectures, Local Algorithms.
)( 12 xWfWy =
2W
1W
)tanh()( zzf =
xinput
tanh
∗=∂
∂−=Δ
22 W
EW η
∗=∂
∂−=Δ
11 W
EW η
youtput
Page 4 © Siemens AG, Corporate TechnologyCT RTC BAM
Neural Networks are No Black Boxes
Sensitivity Analysis: Compute the first derivatives along the time series:
Inpu
ts
NOx Sensitivity over Time
Model Output vs. actual NOx
A classification of input-output sensitivities:
constant over time (= linear relationship)
monotone (input can be used in 1dim. control)
non-monotone (only multi-dim control possible)
~ zero (input useless in modeling and control)
0>∂∂
iinputoutput 0<
∂∂
iinputoutput
Application: Modeling of a Gas Turbine Inputs: 35 sensor measures and control
variables of the turbine Output: NOx emission of the gas turbine
Page 5 © Siemens AG, Corporate TechnologyCT RTC BAM
Neural Forecasting of Wind Power Supply
Task: Forecast the wind energy supply of an entire wind field over the next 24 hours Solution: Deep Neural Network (DNN) with 10 hidden layers. Inputs are wind speed and direction Results: DNN (RMSD: 7.20%) outperform the analytical Jensen model (benchmark, RMSD: 10.22%)
Forecast vs. actual Energy Supply (0h) Forecast vs. actual Energy Supply (6h)
Forecast vs. actual Energy Supply (12h) Forecast vs. actual Energy Supply (18h)
Training Test
Training Test Training Test
Training Test
Page 6 © Siemens AG, Corporate TechnologyCT RTC BAM
0,00
0,20
0,40
0,60
0,80
1,00
1,20
1,40
1,60
PV Supply SENN + Physics Physics SENN
Forecasting of Solar Power Supply
Forecast the energy supply of a PV plant with a neural network (SENN) based on weather forecasts and/or a physics based model
Data set: 2011-09-08 to 2012-01-23. Minute data (200k data points). 40k data points are randomly selected as test data.
Performance of a purely data-driven model (SENN) is comparable to the physics based model
A hybrid model (SENN & physics) improves the forecast accuracy
Tota
l PV
supp
ly (s
cale
d)
Forecast Horizon: 24h in minute time buckets (Test day, 2012-01-01)
5,78%
5,19%
4,45%
3,0%
3,5%
4,0%
4,5%
5,0%
5,5%
6,0%
Physics SENN SENN + Physics
norm
alize
d M
odel
Erro
r
Model
Model Error measured on Test Data
Fore
cast
Erro
rTime
Combined ApproachNeural NetworkPhysical Model
Page 7 © Siemens AG, Corporate TechnologyCT RTC BAM
Optimizing Sensors with Neural Networks
Improve Optical Smoke Detectors:
• Smoke-poor fires are hard to detect
• Avoid false alarms caused by e.g. steam, welding, exhaust fumes
sensor signal
NN-Output
target ( =1 if fire, =0 else )
water steam plastic fire
Modeling with Neural Networks:
• Input: 3 inputs extracted from one sensor over 40 time steps. 50 fire scenarios.
• Network Design: Feedforward network including local and global modeling.
• Network Learning: Stochastic learning including cleaning noise
• Result: Neural networks was able to classify all patterns correctly (time series of 2 different scenarios)
Page 8 © Siemens AG, Corporate TechnologyCT RTC BAM
Finite unfolding in time transforms time into a spatial architecture. We assume, that xt=const in the future.The analysis of open systems by RNNs allows a decomposition of its autonomous & external driven subsystems.Long-term predictability depends on a strong autonomous subsystem.
( )CBA
T
t
dtt yy
,,1
2 min→−∑=
state transition
output equation
identification
( )ttt uBAss +=+ tanh1
tt Csy =( )ttt usfs ,1 =+
( )tt sgy =
y
s
u
Modeling of Open Dynamical Systems with Recurrent Neural Networks (RNN)
3+ts
3+ty
C
3−tu
B
2−ts
2−tu
A
B
1−tu
1−ts A
B
tsC
ty
A1+ts
1+ty
CA
2+ts
2+ty
CA
1−ty
C
2−ty
C
tu
4+ts
4+ty
CA
B
1−−= ttt xxuPreprocessing:
Page 9 © Siemens AG, Corporate TechnologyCT RTC BAM
An error correction system considers the forecast error in present time as a reaction on unknown external information.
In order to correct the forecasting this error is used as an additional input, which substitutes the unknown external information.
( )dttttt yyusfs −=+ ,,1
( )gf
T
t
dtt yy
,1
2min→−∑
=
( )tt sgy =
-Iddty 2−
2−tz 2+tsD1−ts C
A
1−tz D ts C tz D1+ts C 1+ty C 2+ty
-Id-Id
A A
dty 1−
dty
B
2−tu
B
1−tu
( )
B
tu
Modeling Dynamical Systems with Error Correction Neural Networks (ECNN)
2−ts C
B
3−tu
A
noise+0 Id
Page 10 © Siemens AG, Corporate TechnologyCT RTC BAM
Combining Variance - Invariance Separation with ECNN
The bottleneck autoassociator solves the variance - invariance decomposition.
The Error Correction Neural Networksolves the transformed temporal problem.
The sub-networks are implicitly coupled by shared weights.
separation
dynamicst
invariants variantst
variantst+1
dynamicst+1
forecastidentity
invariants
combination
separation
dynamicst
invariants variantst
variantst+1
dynamicst+1
forecastidentity
invariants
combination
2+ts1−ts CA
1−tz D ts C tz D1+ts C 1+tx C
2+tx
A A
1+ty 2+ty
tx
dty 1−− d
ty−dty
ty
B
2−tuB
1−tu
B
tu
F F F
E E E
noise+0 Id
Page 11 © Siemens AG, Corporate TechnologyIntelligent Systems & Control
Load Forecast actual Load
date
Effective Load Forecasting with Recurrent Neural Networks
Task: Predict the upcoming energy demand on a 15 min. time grid up to 5 days ahead.
Difficulty: Incorporate the impact of ex-ternal influences on the energy demand.
Load Forecasting for Electrical Energy
Load Forecast vs. Actual Load
ener
gy lo
ad
Accuracy of load forecast is 97,95% (Benchmark 95%)
1ty
2ty … 95
ty96ty, , ,,
1ty
2ty … 95
ty96ty, , ,,
α1, …, α8logistic
B > 0
A
96
95
2
1
t
t
t
t
yy
yy
M
8
1
α
α
M= ·
Daily load is represented by a super-position of intraday sub-load curves
1ty
2ty … 95
ty96ty, , ,,1
ty2ty … 95
ty96ty, , ,,1
ty2ty … 95
ty96ty, , ,,
1ty
2ty … 95
ty96ty, , ,,1
ty2ty … 95
ty96ty, , ,,1
ty2ty … 95
ty96ty, , ,,
α1, …, α8logistic
B > 0
A
96
95
2
1
t
t
t
t
yy
yy
M
8
1
α
α
M= ·
Daily load is represented by a super-position of intraday sub-load curves
Compression of Electrical Load Curves
Page 12 © Siemens AG, Corporate TechnologyIntelligent Systems & Control
( )0,1
2 minsA
T
t
dtt yy →−∑
=
state transition
output equation
identification
( ) 01 ,tanh ssAs tt =+
[ ] tt sIdy 0,=
Modeling Closed Dynamical Systems with Recurrent Neural Networks
… but to understand the dynamics of the observables, we have to reconstruct at least a part of the hidden states of the world. Forecasting is based on observables and hidden states.
2+ty
Id2−ts A
1−ts Idts A
1+ts
1+ty
Id2+tsA
ty1−ty
]0,[Id]0,[Id]0,[Id]0,[IdId A
]0,[Id
bias 0s tanhtanhtanhtanh
2−ty 2+ty
Id2−ts A
1−ts Idts A
1+ts
1+ty
Id2+tsA
ty1−ty
]0,[Id]0,[Id]0,[Id]0,[IdId A
]0,[Id
bias 0s tanhtanhtanhtanh
2−ty
( )tt sfs =+1
( )tt sgy =
We can only observe a fragment of the world …
Page 13 © Siemens AG, Corporate TechnologyIntelligent Systems & Control
The Identification of Dynamical Systems in Closed Form
Embed the original architecture into a larger architecture, which is easier to learn. After thetraining, the extended architecture has to converge to the original model.
3−ts Id A Id A Id A]0,[Id]0,[Id]0,[Id]0,[Id
Id A1+ts
1+ty
2+ts
2+ty
A]0,[Id]0,[Id
−0Id
−0Id
−0Id
−0Id
2−ts 1−ts ts
dty 3−
dty 2−
dty 1−
dty
bias 0s
-Id -Idtar
0=tar
0=tar
0=tar
0=
-Id -Id
Idtanh tanh tanh tanhtanh
The essential task is NOT to reproduce the past observations, but to identify related hidden variables, which make the dynamics of the observables reasonable.
Page 14 © Siemens AG, Corporate TechnologyIntelligent Systems & Control
Approaches to Model Uncertainty in Forecasting1 Measure uncertainty as volatility (variance) of the
target series. The underlying forecast model is a constant. Thus sin(ωt) can be highly uncertain!??
2 Build a forecast model. The error is interpreted as uncertainty in form of additive noise. The width of the uncertainty channel is constant over time.
3 Describe uncertainty as a diffusion process (random walk). The diffusion channel widens over time, e.g. scaled by the one-step model error. For large systems 2 & 3 fail: We have to learn to zero error → the uncertainty channel disappears.
4 One large model doesn't allow to analyze forecastuncertainty, but an ensemble forecast shows the characteristics of an uncertainty channel: Given a finite set of data, there exist many perfect models of the past data, showing different future scenarios caused by different estimations of the hidden states.
Training Test
Target SeriesForecast
Time Steps of Forecast Horizon (20 Days)Forecast Horizon (in days)
LME
Copp
er P
rice
Id A Id A]0,[Id]0,[Id]0,[Id
Id A1+ts
1+ty
2+ts
2+ty
A]0,[Id]0,[Id
−0Id
−0Id
−0Id
2−ts 1−ts ts
dty 2−
dty 1−
dty
bias 0s
-Idtar
0=
-Id -Id
Idtanh tanh tanhtanh
tar0=
tar0=
Id A Id A]0,[Id]0,[Id]0,[Id
Id A1+ts
1+ty
2+ts
2+ty
A]0,[Id]0,[Id
−0Id
−0Id
−0Id
2−ts 1−ts ts
dty 2−
dty 1−
dty
bias 0s
-Idtar
0=tar
0=
-Id -Id
Idtanh tanh tanhtanh
tar0=
tar0=
tar0=
tar0=
Page 15 © Siemens AG, Corporate TechnologyIntelligent Systems & Control
Control of Dynamical Systems with Recurrent Neural Networks
Questions to Solve
System Identification
State Estimation
Controller Design
Page 16 © Siemens AG, Corporate TechnologyIntelligent Systems & Control
Dt
yL min)( →∑>τ
τ
normative target
System Identification, State Estimation & Optimal Control with RNNs
Unfold the RNN into the future, given A,B,C.
Learn a linear feedback controller:
3+ts
3+ty
B
2−ts
3−tu
A
B
2−tu
1−ts
B
ts
ty
1+ts
1+ty
2+ts
2+ty1−ty2−ty
1−tu
A A A A
B
3−ts
4−tu
A
3−ty
B∗+1tu
B∗
+2tuB∗
tu
D D D
τττ Dssuu ==∗ )(
C C C C C C C
( )CBA
t
dyy,,
2 min→−∑≤τ
ττ
ττ Csy =
system identification
( )τττ BuAss +=+ tanh1
noise+0
ττ sy =( )+=+ ττ ss tanh1 )( ττ suA B
C
Page 17 © Siemens AG, Corporate TechnologyIntelligent Systems & Control
Reward Maximization with a Learned Controller
0 100 200 300 400 500 600 700 800 900 10000.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Rew
ard
Number of 5 s time steps
Random actions
No operation
Learned controller(different alogrithms)
0 100 200 300 400 500 600 700 800 900 10000.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Rew
ard
Number of 5 s time steps
Random actions
No operation
Learned controller(different alogrithms)
Challenges: High data volume (5000 var/sec)
Streaming real-time analysis
Autonomous learning
Real-time data analytics using 1000 learned models
Optimization of emissions
Optimal turbine operation
Tasks:System Identification
State EstimationController Design