Framework for Quantification and Reliability Analysis for Layered Uncertainty using Optimization: NASA UQ Challenge
Anirban Chaudhuri, Garrett Waycaster, Taiki Matsumura, Nathaniel Price, Raphael T. Haftka
Structural and Multidisciplinary Optimization Group, University of Florida


Slide 1

Framework for Quantification and Reliability Analysis for Layered Uncertainty using Optimization: NASA UQ Challenge
Anirban Chaudhuri, Garrett Waycaster, Taiki Matsumura, Nathaniel Price, Raphael T. Haftka
Structural and Multidisciplinary Optimization Group, University of Florida

Slide 2
NASA Problem Description
Combined aleatory and epistemic uncertainty.
Epistemic uncertainty: 31 θs (sub-parameters).
Aleatory uncertainty: 21 ps (parameters).
[Block diagram: design variables and parameters feed intermediate variables, which feed the constraints and the worst-case-scenario performance metrics.]

Slide 3
Toy Problem
Functions of G:
G1 = 5(-P1 + P2 - (P3 - 0.5))
G2 = 0.7 - P3
P1: constant; P2: normal distribution; P3: beta distribution.
No intermediate variables; w(p) = max(g1, g2).

| Symbol | Category | Uncertainty model | True value |
| p1 | II | p1 ∈ [0, 1] | p1 = 0.5 |
| p2 | III | Normal, -2 ≤ E[p2] ≤ 1, 0.5 ≤ V[p2] ≤ 1.1 | E[p2] = 0, V[p2] = 1 |
| p3 | III | Beta, 0.6 ≤ E[p3] ≤ 0.8, 0.02 ≤ V[p3] ≤ 0.04 | E[p3] = 0.7, V[p3] = 0.03 |

[Figure: true distribution of G1.]

Slide 4
Task A: Uncertainty Characterization

Slide 5
Assumption and Approaches
Assumption: the distribution of each uncertain parameter is modeled as a uniform distribution.
Approaches:
1) Bayesian-based approach
2) CDF matching approach
3) Prioritized Observation UQ approach

Slide 6
Bayesian-Based Approach
Uncertainty models are updated by Bayesian inference: f(θ | x_1,obs) ∝ L(θ | x_1,obs) P(θ).
The marginal distribution of each parameter θ_i is obtained by integration.
Each marginal distribution (posterior) is obtained by the Markov chain Monte Carlo (MCMC) method as a sample distribution.
Notation:
θ: set of uncertain parameters
x_1,obs: first set of observations
P(θ): prior distribution
L(θ | x_1,obs): likelihood function
f(θ | x_1,obs): posterior distribution

Slide 7
CDF Matching Approach
[Figure: empirical CDF (eCDF) constructed from the given observations.]

Slide 8
CDF Matching Approach
Sub-parameter realizations are sought whose model-output eCDF matches the observation eCDF, measured by the KS statistic. The DIRECT optimizer is used (Finkel et al.).

Slide 9
Prioritized Observation UQ
Both performance metrics measure risk, so the UQ is refined based on the amount of risk attached to an observation.
Same strategy as the CDF matching method, except the objective function is a weighted modified KS statistic, where W_R is the weight of an observation according to the risk associated with it; it could be decided based on the J2 value.
Implementation is very expensive because finding J2 requires Monte Carlo simulation. Importance sampling or surrogate-based strategies are left for future work.

Slide 10
Toy Problem Results: Posterior Distributions using the Bayesian Approach
Posterior distributions are updated using 20 observations of G1.
Initially, the mean and variance of P2 are the most uncertain (widest ranges). MCMC reduced the ranges of the mean and variance of P2, while the ranges of the other parameters remain, which makes sense.
[Figure: posterior distributions with true values marked.]

Slide 11
Toy Problem Results: Reduced Bounds using CDF Matching
Using 20 observations of G1. The maximum reduction in bounds is for the mean and variance of P2. Similar results as the Bayesian approach.

| Epistemic sub-parameter | True value | Given prior | Reduced bounds, 20 observations (median) | Reduction in median range (% of prior range) |
| p1 | 0.5 | [0, 1] | [0.0556, 0.9444] | 11.1% |
| E[p2] | 0 | [-2, 1] | [-0.5, 0.5] | 66.7% |
| V[p2] | 1 | [0.5, 1.1] | [0.7663, 1.0988] | 44.6% |
| E[p3] | 0.7 | [0.6, 0.8] | [0.6037, 0.7963] | 3.7% |
| V[p3] | 0.03 | [0.02, 0.04] | [0.0204, 0.0396] | 3.7% |
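A minimal sketch of the CDF matching loop (Slides 7-8) for the toy problem, not the authors' implementation: the observations file, sample sizes, and seeds are hypothetical, and SciPy's `direct` optimizer (available in recent SciPy) stands in for the DIRECT code of Finkel et al.

```python
# Minimal sketch of the CDF matching approach (Slides 7-8) for the toy problem.
# Hypothetical pieces: the observations file, sample sizes, and seeds; SciPy's
# direct() stands in for the DIRECT implementation of Finkel et al.
import numpy as np
from scipy import stats, optimize

g1_obs = np.loadtxt("g1_observations.txt")  # the 20 given observations of G1

def simulate_g1(theta, n=1000, seed=0):
    """Sample G1 for one epistemic realization theta = (p1, E[p2], V[p2], E[p3], V[p3])."""
    p1, m2, v2, m3, v3 = theta
    rng = np.random.default_rng(seed)   # fixed seed: common random numbers across theta
    p2 = rng.normal(m2, np.sqrt(v2), n)
    c = m3 * (1.0 - m3) / v3 - 1.0      # mean/variance -> Beta shape parameters
    p3 = rng.beta(m3 * c, (1.0 - m3) * c, n)
    return 5.0 * (-p1 + p2 - (p3 - 0.5))

def ks_distance(theta):
    """Two-sample KS statistic between the simulated eCDF and the observation eCDF."""
    return stats.ks_2samp(simulate_g1(theta), g1_obs).statistic

# Prior bounds on the epistemic sub-parameters (Slide 3).
bounds = [(0.0, 1.0), (-2.0, 1.0), (0.5, 1.1), (0.6, 0.8), (0.02, 0.04)]
res = optimize.direct(ks_distance, bounds)
print("best-matching sub-parameters:", res.x, "KS distance:", res.fun)
```

The slides do not spell out how the reduced bounds on Slide 11 are extracted from this objective; one natural reading is that all sub-parameter realizations whose KS distance stays below an acceptance threshold are retained, and their envelope gives the reduced bounds.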
Slide 12
Toy Problem Results: Effects of Number of Observations
Create an eCDF using 1000 samples, then perform a KS test of the hypothesis that this CDF is the same as the eCDF of all 20 given observations.

| | Prior | Bayesian (5 obs.) | Bayesian (20 obs.) | CDF matching (5 obs.) | CDF matching (20 obs.) |
| K-S test rejection percentage | 60.8% | 69% | 0.7% | 85.3% | 13.8% |

The rejection rate is substantially reduced by both approaches when 20 observations are used.

Slide 13
NASA Problem Results: Posterior Distribution using the Bayesian Approach
[Figures: posteriors from the first 25 observations and from all 50 observations.]

Slide 14
NASA Problem Results: Reduced Bounds using CDF Matching

Using the first 25 observations:
| Symbol | Given prior | Reduced uncertainty model | Reduction in range (% of prior range) |
| E[p1] | [0.6, 0.8] | [0.6012, 0.7444] | 28.4% |
| V[p1] | [0.02, 0.04] | [0.0209, 0.0344] | 32.1% |
| p2 | [0, 1] | [0.1173, 0.9983] | 12.4% |
| E[p4] | [-5, 5] | [-4.8148, 4.4444] | 7.4% |
| V[p4] | [0.0025, 4] | [0.0765, 3.9589] | 2.9% |
| E[p5] | [-5, 5] | [-4.4444, 0] | 55.6% |
| V[p5] | [0.0025, 4] | [0.6688, 3.7779] | 22.2% |
| ρ | [-1, 1] | [-0.6914, 0.8889] | 21% |

Using all 50 observations:
| Symbol | Given prior | Reduced uncertainty model | Reduction in range (% of prior range) |
| E[p1] | [0.6, 0.8] | [0.6267, 0.7667] | 30% |
| V[p1] | [0.02, 0.04] | [0.0231, 0.04] | 15.6% |
| p2 | [0, 1] | [0.1296, 0.9979] | 13.2% |
| E[p4] | [-5, 5] | [-4.8148, 3.2922] | 18.9% |
| V[p4] | [0.0025, 4] | [0.1423, 3.9260] | 5.3% |
| E[p5] | [-5, 5] | [-4.4444, 0.0412] | 55.1% |
| V[p5] | [0.0025, 4] | [1.5571, 3.9424] | 40.3% |
| ρ | [-1, 1] | [-0.6667, 0.8916] | 22.1% |

Slide 15
NASA Problem Results: Effects of Number of Observations
Create an eCDF using 1000 samples, then perform a KS test of the hypothesis that this CDF is the same as the eCDF of all 50 given observations.

| | Prior | Bayesian (25 obs.) | Bayesian (50 obs.) | CDF matching (25 obs.) | CDF matching (50 obs.) |
| K-S test rejection percentage | 62.4% | 51.8% | 37.6% | 41.9% | 30.8% |

The rejection rate is reduced by both approaches as compared to the prior.

Slide 16
Task B: Sensitivity Analysis

Slide 17
Primary Objectives
- Effect of reduced sub-parameter bounds on intermediate variable uncertainty.
- Fix parameter values without error in the intermediate variables.
- Effect of reduced bounds on the range of the values of interest, J1 and J2.
- Fix parameter values without error in the range of J1 or J2.

Slide 18
Intermediate Variable Sensitivity
Empirical estimate of the p-box of an intermediate variable, x, using double-loop Monte Carlo simulation: sample sub-parameter values within their bounds, then sample parameter realizations.
Reduce the range of a sub-parameter by 25% and repeat the above process, in three ways: reduce the upper bound, increase the lower bound, or apply a centered reduction.
This is a sensitivity analysis based on changes in the bounds of a variable rather than its value: the average change in the area of the p-box brought about by these three reductions is the measure of the sensitivity of those bounds.
[Figure: p-box areas A_initial and A_revised.]

Slide 19
J1 and J2 Range Sensitivity
- For J1 and J2, we use the range of values from Monte Carlo simulation.
- Surrogate models are used to reduce the computation of J1 and J2.
- Parameters are ranked based on each parameter's sensitivity on J1 and J2 using a rank sum score.

Slide 20
Fixing Parameter Values
We use DIRECT global optimization to maximize the remaining uncertainty (either p-box area or J1/J2 range) while fixing a single parameter: we generate an initial large random sample of all parameters and replace one parameter with a constant. We fix parameters where the optimized uncertainty measure is close to the initial value.
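A minimal sketch of the bound-sensitivity measure from Slide 18, under stated assumptions: `model()`, the bounds, and all sample sizes are hypothetical stand-ins, and a fixed seed provides common random numbers across bound configurations.

```python
# Minimal sketch of the p-box bound-sensitivity measure from Slide 18 and the
# double-loop estimate behind it. Hypothetical stand-ins: model(), the bounds,
# and all sample sizes; the fixed seed gives common random numbers across cases.
import numpy as np

def model(theta, rng, n):
    """Hypothetical intermediate-variable model x(theta); stand-in only."""
    return rng.normal(theta[0], theta[1], n)

def pbox_area(bounds, n_epi=100, n_alea=500, seed=0):
    """Area between the upper and lower eCDF envelopes of x (the p-box area)."""
    rng = np.random.default_rng(seed)
    lo = np.array([b[0] for b in bounds]); hi = np.array([b[1] for b in bounds])
    samples = [model(rng.uniform(lo, hi), rng, n_alea) for _ in range(n_epi)]
    grid = np.linspace(min(map(np.min, samples)), max(map(np.max, samples)), 200)
    cdfs = np.array([np.searchsorted(np.sort(s), grid, side="right") / n_alea
                     for s in samples])
    env = cdfs.max(axis=0) - cdfs.min(axis=0)
    return float(np.sum(0.5 * (env[1:] + env[:-1]) * np.diff(grid)))  # trapezoid rule

def bound_sensitivity(bounds, i, shrink=0.25):
    """Average p-box-area change from shrinking sub-parameter i's range by 25%
    three ways (Slide 18): lower the upper bound, raise the lower bound, centered."""
    a0 = pbox_area(bounds)
    lo, hi = bounds[i]
    d = shrink * (hi - lo)
    areas = []
    for new in [(lo, hi - d), (lo + d, hi), (lo + d / 2, hi - d / 2)]:
        b = list(bounds); b[i] = new
        areas.append(pbox_area(b))
    return a0 - float(np.mean(areas))  # larger value -> these bounds matter more

bounds = [(0.0, 1.0), (0.5, 1.5)]  # illustrative sub-parameter bounds
print([bound_sensitivity(bounds, i) for i in range(len(bounds))])
```

Slide 20's parameter-fixing step would, in the same spirit, replace one coordinate's draw with a constant and maximize the remaining p-box area over that constant (DIRECT in the slides).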
Slide 21
Toy Problem Results
Monte Carlo simulation introduces some error, as should be expected. The method is able to accurately rank the sensitivities of each of the parameter bounds, and it suggests fixing unimportant parameters at reasonable values.
Recall: G1 = 5(-P1 + P2 - (P3 - 0.5)); G2 = 0.7 - P3; P1: constant, P2: normal distribution, P3: beta distribution.

Percent change in range:
| | Effect on J1 | Effect on J2 |
| Parameter 1 | 43.0% | 21.8% |
| Parameter 2 | 86.3% | 44.1% |
| Parameter 3 | 85.8% | 44.2% |

Slide 22
NASA Problem Results: Revised Uncertainty Model

| Parameter | Percent change in J1 | Percent change in J2 |
| 1 | 40% | 88% |
| 6 | 13% | 48% |
| 10 | 6% | 53% |
| 12 | 19% | 47% |
| 16 | 25% | 64% |
| 18 | 33% | 89% |
| 20 | 24% | 55% |
| 21 | 51% | 115% |

From the initial intermediate-variable analysis, we are able to fix nine parameters: 2, 4, 5, 7, 8, 13, 14, 15, and 17. Based on their expected impact on both J1 and J2, we select revised models for parameters 1, 16, 18, and 21.

Slide 23
Tasks C & D: Uncertainty Propagation & Extreme Case Analysis

Slide 24
Primary Objectives
Uncertainty propagation: find the range of J1 and J2.
Extreme case analysis: find the epistemic realizations that yield extreme J1 and J2 values, and find a few representative realizations of x leading to J2 > 0.

Slide 25
Double Loop Sampling (DLS)
Double-loop Monte Carlo sampling:
- The parameter loop samples the sub-parameters (epistemic uncertainty): 31 distribution parameters.
- The probability loop samples the parameters (aleatory uncertainty): 17 parameters (ps).
Challenge: computationally expensive.

Slide 26
Efficient Reliability Re-Analysis (ERR) (Importance Sampling Method)
Full double-loop MCS is infeasible: the black-box function g = f(x, d_baseline) is computationally expensive. Instead of re-evaluating the constraints at each epistemic realization, we weight existing points based on likelihood. This is not importance sampling in the traditional sense (i.e., there is no important region).
How do we handle fixed but unknown constants that lie within a given interval? Generate initial p samples over the entire range, [0, 1], and use a narrow normal distribution as the true pdf: p_i ~ N(θ_i, 0.25θ_i).
[1] Farizal, F., and Efstratios Nikolaidis. "Assessment of Imprecise Reliability Using Efficient Probabilistic Reanalysis." System 2013: 10-17.

Slide 27
Optimized Parameter Sampling
[Figure.]

Slide 28
Validation of the ERR Method on the Toy Problem

| Method used | DLS | ERR | Optimization with ERR | MCS at optimized epistemic realizations |
| Range of J1 | [-0.10, 4.71] | [-0.11, 4.50] | [-0.10, 5.22] | [-0.10, 4.82] |
| Range of J2 | [0.23, 0.94] | [0.13, 0.92] | [0.11, 1] | [0.22, 0.96] |

An MCS was performed using the epistemic realizations from the optimization (last column). This shows that the ERR method performed well when compared to the more expensive DLS method for the toy problem.

Slide 29
Results of DLS for the NASA Problem

| | Range of J1 | Range of J2 |
| Given uncertainty model | [0.02, 5.19] | [0.08, 0.70] |
| Updated uncertainty model | [0.03, 1.11] | [0.16, 0.76] |

Results show a significant reduction in the range of J1. It was only possible to use a small number of samples due to the computational time required: 400 samples of the epistemic uncertainty and 1,000 samples of the aleatory uncertainty. Can we trust these results with such a small sample size?

Slide 30
Results of the ERR Method: NASA Problem

| Method | Range of J1 | Range of J2 |
| DLS | [0.02, 5.19] | [0.08, 0.70] |
| ERR | [0, 2.72] | [0, 1] |
| DLS with 5th-order PRS surrogate of the x-to-g functions | [11.01, 33.26] | [0.36, 0.78] |

The ERR results did not correspond very well with the DLS results.
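The core of ERR (Slide 26) is that the expensive constraint evaluations are computed once under a fixed sampling density and then reused: each new epistemic realization only changes the likelihood weights. A minimal sketch, with a cheap hypothetical constraint and a Beta aleatory model standing in for the real ones:

```python
# Minimal sketch of the ERR reweighting idea (Slide 26). Hypothetical stand-ins:
# the constraint g and the Beta(a, b) aleatory model. g is evaluated once on
# samples from a fixed sampling density h; new epistemic realizations reuse it.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
N = 1_000_000
p = rng.uniform(0.0, 1.0, N)        # initial samples from the sampling density h
h_pdf = np.ones(N)                  # Uniform(0, 1) density at those samples

def g(p):                           # hypothetical cheap stand-in for the black box
    return p - 0.95

fail = g(p) > 0                     # constraint evaluations, computed only once

def pof_reweighted(a, b):
    """P[g > 0] when p ~ Beta(a, b), via likelihood-ratio weights; no new g calls."""
    w = stats.beta.pdf(p, a, b) / h_pdf          # f(p | theta) / h(p)
    return float(np.sum(w * fail) / np.sum(w))   # self-normalized estimate

print(pof_reweighted(2.0, 2.0))     # compare: exact P[Beta(2,2) > 0.95] ≈ 0.00725
```

The self-normalized form divides by the sum of the weights, which keeps the estimate usable even when the sampling density only roughly covers the target; the slides' observed difficulty in 21 dimensions is exactly the regime where these weights degenerate.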
Slide 31
Limitations of the Current Importance-Sampling-Based Approach
Good agreement with the double-loop sampling results for the toy problem, but not for the NASA problem. The poor performance of the importance-sampling-based approach is hypothesized to be due to:
- Difficulty in creating an initial set of samples with good coverage in 21-dimensional space (limited samples).
- Fixed but unknown constant parameters that were modeled using narrow normal distributions.
Possible fixes: dimensionality reduction by fixing parameters through sensitivity analysis, and use of surrogates to reduce computational time.

Slide 32
Summary
- Uncertainty quantification using a given set of samples was successfully performed using a Bayesian approach and a CDF matching approach.
- P-box area / reduction in range was used as the criterion to decide the sensitivity of the parameters.
- An importance-sampling-based approach was utilized for uncertainty propagation and extreme case analysis.
- A simpler toy problem was used to validate all our methods, increasing our confidence in the methods.

Slide 33
Thank You. Questions?

Slide 34
Back-Up Slides

Slide 35
Reduced Bounds using CDF Matching
Repeated the process 50 times.

| Epistemic sub-parameter | True value | Given prior | Reduced bounds, 5 observations (median) | Reduction in median range (% of prior range) | Std. dev. of lower bound | Std. dev. of upper bound |
| p1 | 0.5 | [0, 1] | [0.0556, 0.9444] | 11.1% | 0.0425 | 0.0201 |
| E[p2] | 0 | [-2, 1] | [-0.5, 0.5] | 66.7% | 0.1410 | 0.0078 |
| V[p2] | 1 | [0.5, 1.1] | [0.7663, 1.0988] | 44.6% | 0.1382 | 0.0586 |
| E[p3] | 0.7 | [0.6, 0.8] | [0.6037, 0.7963] | 3.7% | 0.0032 | 0.0073 |
| V[p3] | 0.03 | [0.02, 0.04] | [0.0204, 0.0396] | 3.7% | 0.0014 | 0.0007 |

Slide 36
MCMC Implementation (Backup Slide)
Metropolis MCMC is used: 20 MCMC runs (m = 20) with different starting points*, 10,000 posterior samples each (2n = 10,000), with the first 5,000 samples discarded for accuracy.
The proposal distribution* is a normal distribution with a standard deviation of 10% of the prior range.
1000 random samples* are generated to construct an empirical PDF of G1 to calculate the likelihood; the likelihood (empirical PDF) is calculated by kernel density estimation (MATLAB ksdensity).
*Sources of noise in the output.
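A minimal Python sketch of the sampler just described, with scipy.stats.gaussian_kde standing in for MATLAB's ksdensity; it reuses `simulate_g1()` from the CDF matching sketch earlier, and the seeds and starting points are illustrative, not the authors' settings.

```python
# Minimal sketch of the Metropolis sampler described on Slide 36, with
# scipy.stats.gaussian_kde standing in for MATLAB's ksdensity. Reuses
# simulate_g1() from the CDF matching sketch; seeds/starts are illustrative.
import numpy as np
from scipy import stats

def log_likelihood(theta, g1_obs):
    """KDE over 1000 simulated G1 samples, evaluated at the observations."""
    kde = stats.gaussian_kde(simulate_g1(theta, n=1000))
    return float(np.sum(np.log(np.maximum(kde(g1_obs), 1e-300))))

def metropolis(g1_obs, bounds, n=10_000, burn=5_000, seed=1):
    rng = np.random.default_rng(seed)
    lo = np.array([b[0] for b in bounds]); hi = np.array([b[1] for b in bounds])
    step = 0.1 * (hi - lo)               # proposal sd = 10% of the prior range
    theta = rng.uniform(lo, hi)          # random starting point
    ll = log_likelihood(theta, g1_obs)
    chain = np.empty((n, len(bounds)))
    for k in range(n):
        prop = rng.normal(theta, step)
        if np.all((prop >= lo) & (prop <= hi)):   # uniform prior: zero outside bounds
            ll_prop = log_likelihood(prop, g1_obs)
            if np.log(rng.uniform()) < ll_prop - ll:
                theta, ll = prop, ll_prop
        chain[k] = theta
    return chain[burn:]                   # discard the first 5,000 samples

# Slide 36 uses m = 20 runs from different starting points, e.g.:
# chains = [metropolis(g1_obs, bounds, seed=s) for s in range(20)]
```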
Slide 37
MCMC Convergence (Backup Slide)
Potential scale reduction factor: R_hat = sqrt(V_hat / W), where V_hat = ((n - 1)/n) W + B/n, W is the average within-chain variance, and B is the between-chain variance (n times the variance of the chain means).

Slide 38
Correlations between Sub-Parameters
[Table: pairwise correlations among E[p1], V[p1], p2, E[p4], V[p4], E[p5], V[p5], and ρ; all estimated correlations are small, between -0.06 and 0.08.]

Slide 39
Task B Summary
- Sensitivity is evaluated using the p-box and the range as metrics to quantify changes.
- Surrogate models are utilized to reduce the computational expense of the double-loop simulation.
- Parameter values are fixed by optimizing the remaining uncertainty using DIRECT global optimization.
- Refined models are requested based on the rank sum score of each parameter for both values of interest, J1 and J2.
- Though the Monte Carlo simulation and surrogate models introduce errors through approximation, our simple toy problem suggests this method is still adequate to provide rankings of parameter sensitivity.

Slide 40
Other Methods That Were Tried
- P-box convolution sampling: requires replacing the distributional p-box with a free p-box.
- Failure domain bounding (homothetic deformations): the NASA UQ toolbox for MATLAB has a steep learning curve, and the theoretical background is challenging.
- Replacing the x-to-g function with surrogates: requires 8 surrogates (one for each constraint function) in 5-dimensional space. Exploration of the functions indicates delta-function-type behavior that is difficult to fit with a surrogate; attempts at creating PRS and kriging surrogates result in poor accuracy.

Slide 41
Importance Sampling Formulation
Worst-case requirement metric [equation]; similarly, for the probability of failure [equation].

Slide 42
Sampling Distributions
19 ps are bounded between 0 and 1 (beta, uniform, or constant); a uniform sampling distribution is used.
2 ps are normally distributed and possibly correlated, so samples must cover a large range: -5 ≤ E[p_i] ≤ 5 and 1/400 ≤ V[p_i] ≤ 4. An uncorrelated multivariate normal distribution with mean 0 and standard deviation 4.5 is used.
8 constraint functions are evaluated for 1e6 realizations of p.

Slide 43
Epistemic Realizations Corresponding to J1/J2 Extrema: Toy Problem

| | Min J1 | Max J1 | Min J2 | Max J2 |
| p1 | 0.60 | 0.07 | 1.00 | 0.03 |
| E[p2] | -2.00 | 1.00 | -1.43 | 1.00 |
| V[p2] | 0.52 | 1.10 | | 0.50 |
| E[p3] | 0.80 | 0.60 | 0.80 | 0.60 |
| V[p3] | 0.03 | 0.02 | | |

Slide 44
[Figures: ranges of J1 and J2 under the given uncertainty model and the updated uncertainty model.]

Slide 45
NASA Problem ERR Error
Percent error between MCS estimates of J1 and J2 using 1,000 p samples and ERR estimates using 1e6 initial samples:

| Percent error (%) | Worst-case requirement metric (J1) | Probability of failure (J2) |
| Percent error in mean | 75% | 29% |
| Percent error at max | 97% | 84% |
| Percent error at min | 70% | 40% |
| Max percent error | 4,110% | 5,670% |
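As a companion to the convergence diagnostic on Slide 37, a minimal sketch of the potential scale reduction factor, assuming the standard Gelman-Rubin form and a hypothetical `chains` array of shape (m, n): m MCMC runs of a single sub-parameter, each of length n after burn-in.

```python
# Minimal sketch of the potential scale reduction factor (Slide 37), assuming
# the standard Gelman-Rubin form. `chains` is a hypothetical (m, n) array:
# m runs of one sub-parameter, each of length n after burn-in.
import numpy as np

def psrf(chains):
    m, n = chains.shape
    W = chains.var(axis=1, ddof=1).mean()      # mean within-chain variance
    B = n * chains.mean(axis=1).var(ddof=1)    # between-chain variance
    V = (n - 1) / n * W + B / n                # pooled variance estimate
    return float(np.sqrt(V / W))               # values near 1 suggest convergence
```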