a reproducible flood forecasting case study using different machine learning techniques ·...
Post on 05-Aug-2020
2 Views
Preview:
TRANSCRIPT
ESoWC 2019 - Machine learning for predicting extreme weather hazards
A reproducible flood forecasting case study using different
machine learning techniques
Lukas Kugler & Sebastian Lehner
ESoWC-Mentors: Claudia Vitolo, Julia Wagemann, Stephan Siemen
16.10.2019
Modelingriverdischarge=>high-dimensionalproblem:manymeteorological/hydrologicalparametersimportantwithcomplexinteractions=>coupleddynamicalmodels.
● CanstatisticalconnectionsbasedonMLtechniqueswithcomparableskillbeestablishedforextremefloodingevents?
➔ Explorativecomparisonstudy:investigatefloodforecastingcapabilityofMLtechniques(withdatafromECMWF/Copernicus)
➔ Codeanddocumentationforinterestedpeers(opensourceandreproducible;https://github.com/esowc/ml_flood)
Motivation & goal
2
Data
● PredictordatafromERA5viatheClimateDataStoreAPIforpython
● PredictanddatafromGloFASv2.0○ Reanalysis○ 4ForecastrerunsforafloodingeventdirectlyfromtheGloFASteamviaFTP
● Dailyresolution(preprocessingCDO);spatialdomainofinterest 3
Climate Data Store API
Notebook:1.03_data_overview.ipynb
Data characteristics
● Riverdischargenon-normaldistribution=>Predictchangeindischarge
● Predictors=>normalization
4 ©GLOWA Danube
Workflow
● Determinepointofinterest(fordischargeforecasts)
● Findupstreamarea(catchment)
● Spatiallyaveragefeaturesinthecorrespondingarea
● Predictand:dischargeatspecifiedoutflowpoint
=>one-dayforecastforchangeindischarge!
5
ERA5 reanalysis
river discharge
GloFAS reanalysis
precip runoff snow
Reanalysis
day 1
ML-Model
day 2 day 3
ML model setup // Overview
Models:
● LinearRegressionModelviascikitlearn● SupportVectorRegressorviascikitlearn● GradientBoostingRegressorviascikitlearn● Time-DelayNeuralNetviaKeras
6
Training 1981 - 2005
Validation 2006 - 2011
Test 2012 - 2016
ML model settings // Validation
Hyperparameteroptimization(viaGridSearch):● StandardOLSLinearRegressiondoesnothaveanyhyperparameters✓ ● SupportVectorRegressor:kernel=poly, C=100, epsilon=0.01, degree=3
● GradientBoostingRegressor:n_estimators=200, learning_rate=0.1, max_depth=5
● Time-DelayNeuralNet: one hidden layer, 64 nodes (1281 dof), batch size 90 d, tanh activation
7 Notebooks:3.02_LinearRegression.ipynb3.03_SupportVectorRegression.ipynb
3.04_GradientBoost.ipynb3.05_Time-delay_neural-net.ipynb
Validation LR SVR GBR TDNN
RMSE [m3/s] 169 156 142 126
NSE 0.84 0.86 0.89 0.93
ML model setup // SupportVectorRegressorexample
Hyperparameteroptimization:viaGridSearch
● kernel:rbf,poly● C:[1,10,100]● epsilon:[0.01,0.1,1]● degree(ifpoly):[1,2,3,4,5]
8 Notebook:3.03_SupportVectorRegression.ipynb
Bestsetting:
Results // Model comparison in the full test period
14-dayforecastsoverthefulltestperiod:Bestmodel=>Time-delayneuralnet
9 Notebook:3.06_model_evaluation.ipynb
Results // Samples: rapid changes in discharge
● 14-dayforecastsamples● Inittimes:
a. 30.06.2012 b. 22.10.2013 c. 03.08.2014 d. 23.08.2016
● Floodingevent2013=>moredetailedcasestudy
10 Notebook:3.06_model_evaluation.ipynb
a b c d
Results // Samples: rapid changes in discharge
metricsforall4sampleperiodstogether=>
11 Notebook:3.06_model_evaluation.ipynb
<=metricsforsampleperioda)
Results // Case study: 2013 European Flood
● Rerunsareforecast-driven
● MLisReanalysis-driven
➔ Time-delayneuralnet&SupportVectorReg.performbest
➔ Grad.Boost.&Lin.Reg.Modelshownosignificantincreaseindischarge(shapeisquitefine,butamplitudenot)
12 Notebook:3.06_model_evaluation.ipynb
Conclusion
● TheTD-NNandSVRmodels:
○ areabletopredictanextremefloodingevent,despitetheirrelativelysimplesetting,
○ mayprovidecheapadditionalinformationtoassistforecasts.
● LRandGBRwerenotabletopredictanadequateincreaseindischarge(althoughingeneralGBRperformedaboutasgoodasTD-NNandSVR)
Additionaltuningofmodels→neuralnetsexhibitthemosttheoreticalpotentialduetoalotofpossibleimprovementsinthebasearchitecture(LSTM)
● Opensource/reproducible(e.g.viaGithub)approachhelpsingettingquickfeedback 13
Outlook
● Incorporatestationaryinformation○ e.g. catchment specific variables
in a CNN or an LSTM as Embedding (see e.g. Kratzert et al 2019*)
● Exploredifferentkindsofcatchments
● Comparemoreextremeeventswithforecastrerunsandanalysethem
● Coupledmodel:Conceptidea
14
Github:github.com/esowc/ml_floodNbviewer:link*Kratzert,F.,Klotz,D.,Shalev,G.,Klambauer,G.,Hochreiter,S.,&Nearing,G.(2019).BenchmarkingaCatchment-AwareLongShort-TermMemoryNetwork(LSTM)forLarge-ScaleHydrologicalModeling.arXivpreprintarXiv:1907.08456.
15
Acknowledgements Big thanks go to
● Our mentors: Claudia Vitolo, Julia Wagemann and Stephan Siemen, for always providing helpful comments
● Esperanza Cuartero, for communicating the results
● The University of Vienna, Dept. of Meteorology and Geophysics, for office and server access
● ECMWF-Copernicus, for setting up ESoWC and providing such an opportunity!
top related