Reliability of global sensitivity indices

Chonggang Xu (1), George Z. Gertner (2)

1 - Department of Entomology and Center for Quantitative Sciences in Biomedicine, North Carolina State University; 2 - Department of Natural Resources and Environmental Sciences, University of Illinois at Urbana-Champaign.

In Press. doi:10.1080/00949655.2010.509317

Abstract: Uncertainty and sensitivity analysis is an essential ingredient of model development and applications. For many uncertainty and sensitivity analysis techniques, sensitivity indices are calculated based on a relatively large sample to measure the importance of parameters in their contributions to uncertainties in model outputs. To statistically compare their importance, it is necessary that uncertainty and sensitivity analysis techniques provide standard errors of estimated sensitivity indices. In this paper, a delta method is used to analytically approximate the standard errors of estimated sensitivity indices for a popular sensitivity analysis method, the Fourier Amplitude Sensitivity Test (FAST). Standard errors estimated based on the delta method were compared to those estimated based on 20 sample replicates. We found that the delta method can provide a good approximation of the standard errors of both first-order and higher-order sensitivity indices. Finally, based on the standard error approximation, we also propose a method to determine the minimum sample size needed to achieve a desired estimation precision for a specified sensitivity index. The standard error estimation method presented in this paper can make FAST analysis computationally much more efficient for complex models.

Keywords: Fourier Amplitude Sensitivity Test, Sensitivity analysis, Uncertainty analysis, Standard error, Simple random sampling, Random balance design sampling

1 Introduction

Uncertainty and sensitivity analysis is an essential ingredient of numerical model development and applications [1]. It can help in model application by providing model users with information on the reliability of model predictions, and can help in model development by providing judging criteria for model identification, calibration and corroboration. Uncertainty and sensitivity analysis attempts to quantify the uncertainties in model outputs (generally resulting from uncertainties in model parameters) and the importance of parameters in their contributions to the uncertainties in model outputs. In accordance with the tradition within the uncertainty and sensitivity analysis community, the parameters in this paper refer to the numerical model input variables (e.g., the daily survival rate describing mosquito survival in a mosquito population model), instead of those describing a probability distribution as used in statistics (e.g., the mean and standard deviation of a normal distribution). Many uncertainty and sensitivity analysis techniques are now available [2,3]. A large group of them are variance-based methods, which attempt to decompose the variance of a model output y into partial variances contributed by individual model parameters. Their main concern is to calculate the total variance of a model output y [i.e., V(y)] and the conditional variance of the expected value of the model output y given a specific set of parameters [4]. The ratios of conditional variances (also termed partial variances) to the total variance of the model output (referred to as sensitivity indices) are used to measure the importance of parameters in their contributions to the uncertainty in model outputs. Since variance-based sensitivity indices are generally calculated by incorporating uncertainties in all parameters simultaneously, they are also referred to as global sensitivity indices in the literature [3].

One of the most popular variance-based uncertainty and sensitivity analysis techniques is the Fourier Amplitude Sensitivity Test (FAST) [5,6,7,8]. For a review of other variance-based methods, please refer to Borgonovo [9]. FAST uses a periodic sampling approach to introduce signals for all parameters of interest and a Fourier transformation to decompose the variance of a model output into partial variances contributed by different model parameters. The ratios of partial variances to total variance (referred to as sensitivity indices) are used to measure the importance of main effects of parameters (or interaction effects among parameters) in their contributions to the variance of the model output. Different sampling approaches can be used to estimate partial variances in FAST, including the traditional sampling by a search curve in the parameter space using an auxiliary variable [5,6,8], random balance design sampling [10], and simple random sampling [11]. Xu and Gertner [11] compared the performance of different sampling algorithms for FAST analysis. FAST analysis was originally developed for models with independent parameters. In order to extend FAST to models with dependent parameters, Xu and Gertner [12,13] introduced a random reordering approach to account for rank correlations among parameters. Until now, FAST analysis has mainly been used to estimate the partial variances contributed by the main effects of parameters. For the traditional sampling based on the search curve, it has been heuristically shown or realized that FAST can be used to estimate partial variances contributed by interactions among parameters using linear combinations of the characteristic frequencies assigned to individual parameters [14]. However, there is a lack of a theoretical proof and understanding for the calculation of partial variances contributed by interactions among parameters. Xu and Gertner [11] theoretically showed that FAST analysis can be used to estimate partial variances contributed by both main effects and interaction effects of model parameters for different sampling approaches, including sampling based on the search curve, simple random sampling, and random balance design sampling.

Sensitivity indices determined by the ratios of partial variances to the total variance of a model output in FAST can be used to measure the importance of parameters in their contributions to uncertainty in the model output. However, due to the random errors introduced by sampling, it could be misleading to judge the relative importance order of parameters based only on the estimated sensitivity indices, especially if the sensitivity indices of two parameters are close. In view of the random errors introduced by sampling, it is important that we know the reliability of estimated sensitivity indices. In statistics, the reliability of a statistic (i.e., a sensitivity index in this paper) is generally measured by its standard error. A common approach to standard error estimation is to conduct the FAST analysis using many (e.g., m) replicates of its periodic sample. For each replicate, a new periodic sample with the relatively large sample size N (N > 1000) required by FAST is drawn. We then need to run the model on each newly drawn periodic sample and calculate the sensitivity indices for the different model parameters. Finally, the standard error of an estimated sensitivity index can be calculated based on the standard deviation of the m estimated sensitivity indices. This method is effective and simple to implement for a simple model (which takes a very short time to run a single simulation). However, for a complex model which takes a long time (e.g., hours) to run a single simulation, the above replicated sample method for standard error estimation could be computationally expensive or infeasible.

Nowadays more complex models (especially spatial models) have been developed to simulate complicated systems. For example, many forest landscape models have been developed to predict potential forest landscape changes under future global warming (generally including inter-species competition, seed dispersal and seedling recruitment, tree growth, and fire disturbances or harvesting), which may take a number of hours to run a single simulation for a large study area (e.g., millions of hectares) [15,16,17]. For vector-borne diseases (e.g., dengue fever or malaria), population dynamics models of disease vectors (e.g., mosquitoes) and human movement are important for a better understanding of vector dynamics and a more efficient disease control. In view of the importance of mosquito dispersal to population dynamics in a heterogeneous spatial domain and to the success of vector control for dengue fever [18], new spatial models have been developed to account for mosquito dispersal [19,20]. Since such a spatial model needs to simulate biological development within water containers (e.g., weight growth, physiological development, pupation), effects of environmental factors (e.g., temperature and moisture), spatial dispersal of individual mosquito cohorts and even gene flow for releasing genetically modified mosquitoes to reduce the virus-borne mosquito population, it can take the model several hours to run a single simulation for a single city with thousands of houses [20]. Uncertainty and sensitivity analysis is an important tool to help us understand the reliability of landscape or population dynamics predicted from those complex models. However, for complex models, the replicate-based method of standard error estimation for sensitivity indices could be difficult or infeasible with a large number (m × N) of model runs.

Two potentially efficient approaches available to calculate standard errors of sensitivity indices are the bootstrap method and the delta method. For the bootstrap method, m replicates of samples drawn from the original periodic sample can be used to estimate the standard errors of sensitivity indices calculated by FAST. As such, running the model for each of the m replicates is not required with the bootstrap method. However, since the FAST analysis depends on periodic samples to form signals for parameters (see eq. (2)), the resampling can potentially destroy the periodic signal, which would lead to an unreliable estimate of the standard error. Therefore, in this paper, the delta method is used to analytically calculate the standard errors of sensitivity indices based on a single periodic sample of the size N required by FAST. Based on the derived standard error calculation formula, we also propose a method of determining the minimum sample size required to achieve a desired estimation precision for a specified sensitivity index.

The paper is organized as follows: Section 2.1 gives a general background on FAST analysis; Section 2.2 derives the standard errors of estimated sensitivity indices for simple random sampling; Section 2.3 derives the standard errors of estimated sensitivity indices for random balance design sampling; and Section 2.4 provides a summary of FAST analysis with standard error estimation by the delta method. Section 3 provides two test models. Section 4 presents their results. Section 5 discusses the potential use of the derived standard errors for the determination of a minimum sample size to achieve a desired estimation precision and extends the proposed method to models with correlated parameters.

2 Method

2.1 Review of FAST

Consider a computer model y = g(x_1, ..., x_n), where x_i (i = 1, ..., n) is a model parameter with probability density function f_i(x_i) and n is the total number of parameters of interest. The parameter space is defined as

K_x^n = \{x_1, ..., x_n \mid x_i \sim f_i(x_i), \; i = 1, ..., n\}.   (1)

The traditional version of FAST assumes independence among model parameters. The main idea of FAST is to use a periodic sampling approach to introduce a signal for each parameter and then use a Fourier transformation to decompose the total variance of the model output (y) into partial variances contributed by different model parameters. In order to generate periodic samples for a specific parameter x_i, FAST uses a periodic search function (a function to search or explore the parameter space) as follows [11,14,21,22],

x_i = G(\theta_i) = F_i^{-1}\!\left(\tfrac{1}{2} + \tfrac{1}{\pi}\arcsin(\sin\theta_i)\right),   (2)

where θ_i is a random variable uniformly distributed between 0 and 2π, and F_i^{-1} is the inverse cumulative distribution function of parameter x_i. For the sampled values of θ_i, G(θ_i) is a function used to generate the corresponding samples for parameter x_i, which will follow the pre-specified probability density function f_i(x_i). Using eq. (2), the parameter space can now be explored by samples in the θ-space defined as follows,

K_\theta^n = \{\theta_1, ..., \theta_n \mid 0 \le \theta_i \le 2\pi, \; i = 1, ..., n\}.   (3)
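The search function in eq. (2) is straightforward to implement. The following minimal Python sketch (function and variable names are ours; SciPy distribution objects are assumed to supply the inverse cumulative distribution functions F_i^{-1}) draws a simple random sample in the θ-space and transforms it into the parameter space:

    import numpy as np
    from scipy import stats

    def fast_search_sample(N, dists, rng=None):
        """Draw N samples of (theta, x): theta_i ~ U(0, 2*pi), then apply eq. (2)."""
        rng = np.random.default_rng(rng)
        n = len(dists)
        theta = rng.uniform(0.0, 2.0 * np.pi, size=(N, n))    # simple random sample in the theta-space
        u = 0.5 + np.arcsin(np.sin(theta)) / np.pi            # periodic transform of eq. (2), values in [0, 1]
        x = np.column_stack([dists[i].ppf(u[:, i]) for i in range(n)])  # F_i^{-1} via each distribution's ppf
        return theta, x

    # example: three independent standard normal parameters
    theta, x = fast_search_sample(1000, [stats.norm(0, 1)] * 3, rng=42)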

A main concern of variance-based uncertainty and sensitivity analysis is to calculate the total variance of the model output y [i.e., V(y)] and the conditional variance of the expected value of y given a specific set of parameters [4]. The conditional variances are used to measure the importance of parameters in their contributions to uncertainty in the model output. For example, a relatively large conditional variance of the expected value of y given x_i [i.e., V(E(y|x_i))] will indicate that a relatively high amount of uncertainty in a model output is contributed by this parameter x_i. Similarly, a relatively large conditional variance of the expected value of y given a specific set of parameters x_sub [i.e., V(E(y|x_sub))] will indicate that a relatively large amount of uncertainty in a model output is contributed by this set of parameters. Using the Strong Law of Large Numbers, it can be shown that the conditional variance of the expected value of y in the parameter space can now be calculated in the θ-space (see Xu and Gertner [11] for details). Namely,

V_{(x_i)} = V_x(E_x(y \mid x_i)) = V_\theta(E_\theta(y \mid \theta_i)),
V_{(x_i, x_j)} = V_x(E_x(y \mid x_i, x_j)) = V_\theta(E_\theta(y \mid \theta_i, \theta_j)), \quad i \ne j,
V_{(x_i, x_j, x_k)} = V_x(E_x(y \mid x_i, x_j, x_k)) = V_\theta(E_\theta(y \mid \theta_i, \theta_j, \theta_k)), \quad i \ne j \ne k,
......
V_{(\{x\} \setminus x_i)} = V_x(E_x(y \mid \{x_1,...,x_n\} \setminus x_i)) = V_\theta(E_\theta(y \mid \{\theta_1,...,\theta_n\} \setminus \theta_i)),
V_{(x_1,...,x_n)} = V_x(g(x_1, ..., x_n)) = V_\theta(g(G(\theta_1), ..., G(\theta_n))),   (4)

where V_x(·) and V_θ(·) are the conditional variances calculated in the parameter space and θ-space, respectively; E_x(·) and E_θ(·) are the expected values calculated in the parameter space and θ-space, respectively; and {x_1,...,x_n}\x_i represents all parameters except x_i. For a subset x_sub of all parameters {x_1,...,x_n}, V_{(x_sub)} represents the partial variance in the model output y due to the uncertainty in the subset of parameters x_sub. V_{(x_1,...,x_n)} represents the variance of the model output y resulting from uncertainties in all model parameters. Namely,

V_{(x_1,...,x_n)} = V(y).   (5)

Following the decomposition of variance in analysis of variance (ANOVA), and assuming independence among parameters [4], we define

V_{x_i} = V_{(x_i)},
V_{x_i x_j} = V_{(x_i, x_j)} - V_{x_i} - V_{x_j}, \quad i \ne j,
......
V_{x_{i_1} \cdots x_{i_r}} = V_{(x_{i_1}, ..., x_{i_r})} - \sum_{\text{proper subsets } s \subset \{i_1, ..., i_r\}} V_{x_s}, \quad r = 1, 2, ..., n,   (6)

as the partial variances of the model output contributed by the first-order (or main) effects, the second-order interaction effects, the third-order interaction effects, and so on, up to the nth-order interaction effects of the model parameters. Summing all the left and right terms in eq. (6), we get the variance decomposition as follows,

V(y) = \sum_{i=1}^{n} V_{x_i} + \sum_{i < j} V_{x_i x_j} + \sum_{i < j < k} V_{x_i x_j x_k} + \cdots + V_{x_1 \cdots x_n},   (7)

which suggests that the total variance resulting from parameter uncertainties can be decomposed into partial variances contributed by the first-order effects, the second-order interaction effects, the third-order interaction effects, and so on, up to the nth-order interaction effects of parameters. Dividing both sides of eq. (7) by V_{(x_1,...,x_n)}, we get

1 = \sum_{i=1}^{n} S_{x_i} + \sum_{i < j} S_{x_i x_j} + \sum_{i < j < k} S_{x_i x_j x_k} + \cdots + S_{x_1 \cdots x_n},   (8)

where

S_{x_i} = \frac{V_{x_i}}{V_{(x_1,...,x_n)}}, \qquad S_{x_i x_j} = \frac{V_{x_i x_j}}{V_{(x_1,...,x_n)}}, \qquad ......, \qquad S_{x_1 \cdots x_n} = \frac{V_{x_1 \cdots x_n}}{V_{(x_1,...,x_n)}}   (9)

represent the first-order, second-order, and so on, up to the nth-order sensitivity indices, respectively.

Using the search functions in eq. (2), the model output becomes a multiple periodic function of (θ_1, ..., θ_n). Thus, we can apply a multiple Fourier transformation to the model output y = g(G(θ_1), ..., G(θ_n)) over all variables in the θ-space. Namely, we have

g(G(\theta_1), ..., G(\theta_n)) = \sum_{r_1 = -\infty}^{\infty} \cdots \sum_{r_n = -\infty}^{\infty} C^{(\theta)}_{r_1 .. r_n} \, e^{\mathrm{i}(r_1 \theta_1 + \cdots + r_n \theta_n)},   (10)

where

C^{(\theta)}_{r_1 .. r_n} = \frac{1}{(2\pi)^n} \int_0^{2\pi} \cdots \int_0^{2\pi} g(G(\theta_1), ..., G(\theta_n)) \, e^{-\mathrm{i}(r_1 \theta_1 + \cdots + r_n \theta_n)} \, d\theta_1 \cdots d\theta_n.   (11)

It is notable that C^{(θ)}_{r_1..r_n} is the expected value of g(G(θ_1), ..., G(θ_n)) e^{-i(r_1 θ_1 + ... + r_n θ_n)}, in view of the fact that θ_1, ..., θ_n are independently and uniformly distributed in the θ-space. Namely,

C^{(\theta)}_{r_1 .. r_n} = E\!\left( g(G(\theta_1), ..., G(\theta_n)) \, e^{-\mathrm{i}(r_1 \theta_1 + \cdots + r_n \theta_n)} \right),   (12)

which can be estimated based on N samples of θ_1, ..., θ_n as follows,

\hat{C}^{(\theta)}_{r_1 .. r_n} = \frac{1}{N} \sum_{j=1}^{N} g(G(\theta_1^{(j)}), ..., G(\theta_n^{(j)})) \, e^{-\mathrm{i}(r_1 \theta_1^{(j)} + \cdots + r_n \theta_n^{(j)})},   (13)

where θ_i^{(j)} represents the jth sample of θ_i. For notational convenience, we define the cosine Fourier coefficient a_{r_1..r_n} and the sine Fourier coefficient b_{r_1..r_n} as follows,

a_{r_1..r_n} = E[\, g(G(\theta_1), ..., G(\theta_n)) \cos(r_1 \theta_1 + \cdots + r_n \theta_n) \,],
b_{r_1..r_n} = E[\, g(G(\theta_1), ..., G(\theta_n)) \sin(r_1 \theta_1 + \cdots + r_n \theta_n) \,],   (14)

with

C^{(\theta)}_{r_1..r_n} = a_{r_1..r_n} - \mathrm{i}\, b_{r_1..r_n}.   (15)

Based on eq. (14), the cosine Fourier coefficient a_r and the sine Fourier coefficient b_r can be estimated as follows,

\hat{a}_r = \frac{1}{N} \sum_{j=1}^{N} g(G(\theta_1^{(j)}), ..., G(\theta_n^{(j)})) \cos(r_1 \theta_1^{(j)} + \cdots + r_n \theta_n^{(j)}),
\hat{b}_r = \frac{1}{N} \sum_{j=1}^{N} g(G(\theta_1^{(j)}), ..., G(\theta_n^{(j)})) \sin(r_1 \theta_1^{(j)} + \cdots + r_n \theta_n^{(j)}),   (16)

where θ_i^{(j)} represents the jth sample of θ_i. Using the multiple Fourier transformation, the partial variances in eq. (6) can be estimated by the sum of Fourier amplitudes (i.e., |C^{(θ)}_{r_1..r_n}|²) at different frequencies (see Xu and Gertner [11] for a proof),

V_{x_i} = \sum_{|r_i| \ge 1} |C^{(\theta)}_{0..0\, r_i\, 0..0}|^2,
V_{x_i x_j} = \sum_{|r_i|, |r_j| \ge 1} |C^{(\theta)}_{0..0\, r_i\, 0..0\, r_j\, 0..0}|^2,
V_{x_i x_j x_k} = \sum_{|r_i|, |r_j|, |r_k| \ge 1} |C^{(\theta)}_{0..0\, r_i\, 0..0\, r_j\, 0..0\, r_k\, 0..0}|^2,
...
V_{x_1 \cdots x_n} = \sum_{|r_1|, ..., |r_n| \ge 1} |C^{(\theta)}_{r_1 .. r_n}|^2.   (17)

Eq. (17) shows that the Fourier amplitude |C^{(θ)}_{0..0 r_i 0..0}|² results from the main effects of parameters, the Fourier amplitude |C^{(θ)}_{0..0 r_i 0..0 r_j 0..0}|² results from the second-order interaction effects, the Fourier amplitude |C^{(θ)}_{0..0 r_i 0..0 r_j 0..0 r_k 0..0}|² results from the third-order interaction effects, and the Fourier amplitude |C^{(θ)}_{r_1..r_n}|² results from the nth-order interaction effects. Thus, to calculate the partial variances, we only need to estimate the Fourier coefficients based on eq. (13). Summing all the terms in eq. (17) based on eq. (7), it is easy to show that

V_{(x_1,...,x_n)} = \sum_{(r_1, ..., r_n) \ne (0, ..., 0)} |C^{(\theta)}_{r_1 .. r_n}|^2.   (18)

(18) 5

It is notable that there is only one period for the sampled parameter values using search 6

functions with eq. (2). Thus, there are strong signals for parameters in the Fourier 7

amplitudes (i.e., 1

( ) 2| |nr rC

) when the fundamental frequency is equal to one (i.e., 8

1 1r , or 2 1r ,..., or 1nr ). The signals decrease at higher harmonics which are integer 9

(termed the harmonic order) multiples of the fundamental frequency [6]. The signals in 10

the Fourier amplitude become close to zero, if any of 1,..., nr r are relatively large (i.e., at a 11

relatively high harmonic order). The harmonic order at which the Fourier amplitudes 12

become close to zero is defined as a maximum harmonic order, which is commonly four 13

or six in practice and could be greater for highly nonlinear models [6]. With a specified 14

maximum harmonic order M (please see the end of Section 2.2 for a description of M 15

selection), the partial variances contributed by the main effects and interaction effects can 16

be calculated as 17

14

1 1

1

( ) 200 00

| | 1

( ) 200 00

| |,| | 1

( ) 200 00

| |,| |,| | 1

( ) 2

| |, ,| | 1

| |

| |

| |

...

| | .

i i

i

i j i j

i j

i j k i j k

i j k

n n

n

M

x rr

M

x x r rr r

M

x x x r r rr r r

M

x x r rr r

V C

V C

V C

V C

(19) 1

Different sampling methods can be used to estimate the partial variances 2

contributed by different model parameters [11]. The sampling methods include the 3

traditional sampling by a search curve using an auxiliary variable s [5,6,8], random 4

balance design sampling [10], and simple random sampling [11]. For the traditional 5

sampling using an auxiliary variable s, in addition to sampling errors, there are also 6

interference errors resulted from the selection of frequencies for different model 7

parameters (see Xu and Gertner [11] for details). The sampling errors are substantially 8

smaller than the interference errors. In addition, for the traditional search-curve based 9

sampling, it is difficult or infeasible to conduct FAST analysis for models with many 10

parameters due to the large sample size required (e.g., >20 parameters) [10]. Therefore, in 11

this paper, we are mainly concerned with simple random sampling and random balance 12

design sampling. 13

2.2 Standard errors of sensitivity indices for simple random sampling

For simple random sampling, we draw a random sample of size N in the θ-space, {(θ_1^{(j)}, ..., θ_n^{(j)}), j = 1, ..., N}, to estimate the Fourier coefficients using eq. (13). The first-order or higher-order sensitivity indices, using eq. (9) and eq. (19), can then be estimated based on a sample in the θ-space as follows,

\hat{S} = \frac{\sum_{r \in H^{(M)}} |\hat{C}^{(\theta)}_r|^2}{\hat{\sigma}^2},   (20)

where r ∈ H^{(M)}(x_i) = {(r_1, ..., r_n), |r_i| = 1, ..., M, r_j = 0 for j ∈ {1,..,n}\i} for the calculation of the first-order sensitivity index of a specific parameter x_i [i.e., S_{x_i} in eq. (9)]; r ∈ H^{(M)}(x_i, x_j) = {(r_1, ..., r_n), |r_i|, |r_j| = 1, ..., M, r_k = 0 for k ∈ {1,..,n}\{i, j}} for the calculation of the second-order sensitivity index resulting from interactions of two parameters x_i and x_j [i.e., S_{x_i x_j} in eq. (9)]; and r ∈ H^{(M)}(x_1, ..., x_n) = {(r_1, ..., r_n), |r_1|, ..., |r_n| = 1, ..., M} for the calculation of the nth-order sensitivity index resulting from interactions of all parameters [i.e., S_{x_1 ... x_n} in eq. (9)]. We define H^{(M)} as a general set representing one of the harmonic sets H^{(M)}(x_i), H^{(M)}(x_i, x_j), ..., or H^{(M)}(x_1, ..., x_n) for the calculation of the first-order, second-order and so on up to the nth-order sensitivity indices. M is a user-selected maximum harmonic order (generally four or six; we discuss selection details at the end of this section after we introduce the necessary basics). σ̂² is the sample variance of the model output y,

\hat{\sigma}^2 = \frac{1}{N-1} \sum_{j=1}^{N} \left[ g(\theta_1^{(j)}, ..., \theta_n^{(j)}) - \bar{g}(\theta_1, ..., \theta_n) \right]^2,   (21)

where ḡ(θ_1, ..., θ_n) is the sample mean of the model output y. Replacing Ĉ_r^{(θ)} in eq. (20) by the estimated cosine and sine Fourier coefficients in eq. (15), the sensitivity indices can be calculated as follows,

\hat{S} = \frac{\sum_{r \in H^{(M)}} (\hat{a}_r^2 + \hat{b}_r^2)}{\hat{\sigma}^2}.   (22)
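As a concrete illustration, the following Python sketch estimates the Fourier coefficients of eq. (16) and a first-order sensitivity index via eq. (22); it assumes numpy imported as np and the samples theta and model outputs y produced with the sampling sketch in Section 2.1, and the function names are ours:

    import numpy as np

    def fourier_coefficients(y, theta, harmonics):
        """Estimate cosine/sine Fourier coefficients of eq. (16) for each harmonic vector r."""
        a, b = [], []
        for r in harmonics:                                   # r is a length-n integer vector
            phase = theta @ np.asarray(r)                     # r_1*theta_1 + ... + r_n*theta_n per sample
            a.append(np.mean(y * np.cos(phase)))
            b.append(np.mean(y * np.sin(phase)))
        return np.array(a), np.array(b)

    def first_order_index(y, theta, i, M=4):
        """First-order index of parameter i: eq. (22) summed over the harmonic set H^(M)(x_i)."""
        n = theta.shape[1]
        harmonics = []
        for m in range(1, M + 1):                             # |r_i| = 1, ..., M; all other entries zero
            for sign in (+1, -1):
                r = np.zeros(n, dtype=int)
                r[i] = sign * m
                harmonics.append(r)
        a, b = fourier_coefficients(y, theta, harmonics)
        return np.sum(a**2 + b**2) / np.var(y, ddof=1)        # eqs. (21)-(22)

Higher-order indices differ only in the harmonic set passed to fourier_coefficients.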

Based on eq. (22), we can easily see that both the estimation errors for the Fourier coefficients â_r and b̂_r and the estimation error for the sample variance σ̂² of the model output lead to the estimation error for a sensitivity index Ŝ. If we can estimate the standard errors of â_r, b̂_r, and σ̂², we are able to approximately estimate the standard error of the sensitivity index Ŝ using the delta method (see Appendix A for details of the delta method).

For a large sample size N, we assume that â_r, b̂_r, and σ̂² approximately follow a multivariate normal distribution with mean vector ({a_r}|_{r∈H^{(M)}}, {b_r}|_{r∈H^{(M)}}, σ²) and covariance matrix Σ, where {a_r}|_{r∈H^{(M)}} represents the list of cosine Fourier coefficients and {b_r}|_{r∈H^{(M)}} represents the list of sine Fourier coefficients for all r ∈ H^{(M)}. Let

\Gamma = \frac{\partial S}{\partial\left(\{a_r\}|_{r \in H^{(M)}}, \{b_r\}|_{r \in H^{(M)}}, \sigma^2\right)} = \left( \left\{\frac{2 a_r}{\sigma^2}\right\}\bigg|_{r \in H^{(M)}}, \; \left\{\frac{2 b_r}{\sigma^2}\right\}\bigg|_{r \in H^{(M)}}, \; -\frac{\sum_{r \in H^{(M)}} (a_r^2 + b_r^2)}{\sigma^4} \right);   (23)

then, based on the delta method (see Appendix A for details), Ŝ will approximately follow a normal distribution as follows,

\hat{S} \sim N(S, \; \Gamma \Sigma \Gamma^{T}),   (24)

for a large sample size N. In order to estimate the variance of Ŝ, we need to estimate the covariance matrix Σ (i.e., the variances of and covariances among â_r, b̂_r, and σ̂²).

Based on the Central Limit Theorem, Xu and Gertner [11] showed that the estimated cosine Fourier coefficient â_r and sine Fourier coefficient b̂_r in eq. (16) approximately follow normal distributions for a large sample size N. Namely,

\hat{a}_r \sim N(a_r, V(\hat{a}_r)),
\hat{b}_r \sim N(b_r, V(\hat{b}_r)),   (25)

where V(â_{r_1..r_n}) and V(b̂_{r_1..r_n}) are the variances of â^{(θ)}_{r_1..r_n} and b̂^{(θ)}_{r_1..r_n}, respectively. In view of the fact that {θ_1^{(j)}, ..., θ_n^{(j)}, j = 1, ..., N} are random samples drawn in the θ-space, V(â_{r_1..r_n}) and V(b̂_{r_1..r_n}) can be empirically estimated based on eq. (16) as follows,

\hat{V}(\hat{a}_r) = \frac{1}{N}\left\{ \frac{1}{N}\sum_{j=1}^{N} \left[ g(G(\theta_1^{(j)}), ..., G(\theta_n^{(j)})) \cos(r_1 \theta_1^{(j)} + \cdots + r_n \theta_n^{(j)}) \right]^2 - (\hat{a}_r)^2 \right\},
\hat{V}(\hat{b}_r) = \frac{1}{N}\left\{ \frac{1}{N}\sum_{j=1}^{N} \left[ g(G(\theta_1^{(j)}), ..., G(\theta_n^{(j)})) \sin(r_1 \theta_1^{(j)} + \cdots + r_n \theta_n^{(j)}) \right]^2 - (\hat{b}_r)^2 \right\}.   (26)

In order to estimate the standard errors of sensitivity indices, we also need to estimate the covariance between any two Fourier coefficients. Using eq. (16), we can estimate the covariance between the cosine Fourier coefficient â_r and the sine Fourier coefficient b̂_{r'} (where r = (r_1, ..., r_n) and r' is another vector of harmonics, which could be different from or the same as r) as follows (see Appendix B for details),

\mathrm{Cov}(\hat{a}_r, \hat{b}_{r'}) = \frac{1}{N}\left\{ E\!\left[ g(\theta_1, ..., \theta_n)^2 \cos(r\,\theta^{T}) \sin(r'\,\theta^{T}) \right] - a_r b_{r'} \right\},   (27)

where θ = (θ_1, ..., θ_n). Based on the above equation, an empirical estimate of the covariance between the cosine Fourier coefficient â_r and the sine Fourier coefficient b̂_{r'} can be derived,

\widehat{\mathrm{Cov}}(\hat{a}_r, \hat{b}_{r'}) = \frac{1}{N}\left\{ \frac{1}{N}\sum_{j=1}^{N} g(\theta_1^{(j)}, ..., \theta_n^{(j)})^2 \cos(r\,\theta^{(j)T}) \sin(r'\,\theta^{(j)T}) - \hat{a}_r \hat{b}_{r'} \right\}.   (28)

Similarly, the covariance between two cosine Fourier coefficients â_r and â_{r'} (r ≠ r') and the covariance between two sine Fourier coefficients b̂_r and b̂_{r'} (r ≠ r') can be calculated by the following two equations,

\widehat{\mathrm{Cov}}(\hat{a}_r, \hat{a}_{r'}) = \frac{1}{N}\left\{ \frac{1}{N}\sum_{j=1}^{N} g(\theta_1^{(j)}, ..., \theta_n^{(j)})^2 \cos(r\,\theta^{(j)T}) \cos(r'\,\theta^{(j)T}) - \hat{a}_r \hat{a}_{r'} \right\},   (29)

\widehat{\mathrm{Cov}}(\hat{b}_r, \hat{b}_{r'}) = \frac{1}{N}\left\{ \frac{1}{N}\sum_{j=1}^{N} g(\theta_1^{(j)}, ..., \theta_n^{(j)})^2 \sin(r\,\theta^{(j)T}) \sin(r'\,\theta^{(j)T}) - \hat{b}_r \hat{b}_{r'} \right\},   (30)

respectively. For the standard error of σ̂², based on eq. (21) and the Central Limit Theorem, σ̂² approximately follows a normal distribution for a large sample size N. Namely,

\hat{\sigma}^2 \sim N\!\left( \sigma^2, \; \frac{1}{N-1} V\!\left( [\, g(\theta_1, ..., \theta_n) - \bar{g}(\theta_1, ..., \theta_n) \,]^2 \right) \right).   (31)

Therefore, an empirical estimate of the variance of σ̂² [i.e., V(σ̂²)] can be calculated as

\hat{V}(\hat{\sigma}^2) = \frac{1}{N-1} \hat{V}\!\left( [\, g(\theta_1, ..., \theta_n) - \bar{g}(\theta_1, ..., \theta_n) \,]^2 \right) \approx \frac{1}{N-1}\left\{ \frac{1}{N-1}\sum_{j=1}^{N} [\, g(\theta_1^{(j)}, ..., \theta_n^{(j)}) - \bar{g}(\theta_1, ..., \theta_n) \,]^4 - (\hat{\sigma}^2)^2 \right\} = \frac{1}{N-1}\left[ \hat{u}^{(4)} - (\hat{\sigma}^2)^2 \right],   (32)

where û^{(4)} is the estimated fourth central moment of g(θ_1, ..., θ_n). Finally, the covariance between a cosine Fourier coefficient and σ̂² can be estimated as follows (see Appendix C for details),

\mathrm{Cov}(\hat{a}_r, \hat{\sigma}^2) \approx \frac{1}{N-1}\left\{ E\!\left[ g(\theta_1, ..., \theta_n) \cos(r\,\theta^{T}) \, [\, g(\theta_1, ..., \theta_n) - \bar{g}(\theta_1, ..., \theta_n) \,]^2 \right] - a_r \sigma^2 \right\}.   (33)

Thus, we can get an empirical estimate of the covariance between a cosine Fourier coefficient and σ̂² with the following equation,

\widehat{\mathrm{Cov}}(\hat{a}_r, \hat{\sigma}^2) = \frac{1}{N-1}\left\{ \frac{1}{N}\sum_{j=1}^{N} g(\theta_1^{(j)}, ..., \theta_n^{(j)}) \cos(r\,\theta^{(j)T}) \, [\, g(\theta_1^{(j)}, ..., \theta_n^{(j)}) - \bar{g}(\theta_1, ..., \theta_n) \,]^2 - \hat{a}_r \hat{\sigma}^2 \right\}.   (34)

Similarly, we can derive an empirical estimate of the covariance between a sine Fourier coefficient and σ̂² as follows,

\widehat{\mathrm{Cov}}(\hat{b}_r, \hat{\sigma}^2) = \frac{1}{N-1}\left\{ \frac{1}{N}\sum_{j=1}^{N} g(\theta_1^{(j)}, ..., \theta_n^{(j)}) \sin(r\,\theta^{(j)T}) \, [\, g(\theta_1^{(j)}, ..., \theta_n^{(j)}) - \bar{g}(\theta_1, ..., \theta_n) \,]^2 - \hat{b}_r \hat{\sigma}^2 \right\}.   (35)

With the covariance matrix Σ̂ among the Fourier coefficients and the sample variance of the output y estimated from eq. (27), eq. (28), eq. (29), eq. (30), eq. (32), eq. (34), and eq. (35), we have an empirical estimate of the standard error of the sensitivity index as follows,

\hat{\sigma}(\hat{S}) = \sqrt{ \hat{\Gamma} \hat{\Sigma} \hat{\Gamma}^{T} },   (36)

where

\hat{\Gamma} = \left( \left\{\frac{2 \hat{a}_r}{\hat{\sigma}^2}\right\}\bigg|_{r \in H^{(M)}}, \; \left\{\frac{2 \hat{b}_r}{\hat{\sigma}^2}\right\}\bigg|_{r \in H^{(M)}}, \; -\frac{\sum_{r \in H^{(M)}} (\hat{a}_r^2 + \hat{b}_r^2)}{\hat{\sigma}^4} \right).
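A minimal NumPy sketch of how these pieces fit together for a single harmonic set is given below; the function name is ours, the samples theta and outputs y are assumed from the sketches above, and the empirical covariances follow eqs. (26)-(30), (32), (34) and (35) as written here:

    import numpy as np

    def delta_se_srs(y, theta, harmonics):
        """Sensitivity index (eq. (22)) and its delta-method standard error (eq. (36))
        for one harmonic set under simple random sampling."""
        N, k = len(y), len(harmonics)
        phases = np.column_stack([theta @ np.asarray(r) for r in harmonics])
        trig = np.column_stack([y[:, None] * np.cos(phases), y[:, None] * np.sin(phases)])
        ab = trig.mean(axis=0)                                # (a_1..a_k, b_1..b_k), eq. (16)
        s2 = np.var(y, ddof=1)                                # eq. (21)
        S = np.sum(ab**2) / s2                                # eq. (22)

        Sigma = np.zeros((2 * k + 1, 2 * k + 1))
        Sigma[:2*k, :2*k] = (trig.T @ trig / N - np.outer(ab, ab)) / N       # eqs. (26)-(30)
        dev2 = (y - y.mean())**2
        u4 = np.sum(dev2**2) / (N - 1)                                       # fourth central moment
        Sigma[2*k, 2*k] = (u4 - s2**2) / (N - 1)                             # eq. (32)
        cov_s2 = ((trig * dev2[:, None]).mean(axis=0) - ab * s2) / (N - 1)   # eqs. (34)-(35)
        Sigma[:2*k, 2*k] = Sigma[2*k, :2*k] = cov_s2

        Gamma = np.concatenate([2.0 * ab / s2, [-S / s2]])                   # gradient, eq. (23)
        return S, float(np.sqrt(Gamma @ Sigma @ Gamma))                      # eq. (36)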

Based on eq. (36), we can see that the standard errors of sensitivity indices depend on three components: 1) the magnitude of the Fourier amplitudes, which depends on parametric importance in terms of uncertainty contribution (i.e., the true values of the sensitivity indices); 2) the magnitude of the model outputs and the variance of the model outputs; and 3) the random error measured by Σ, which depends on the sample size N.

The FAST method gives a biased estimator of the partial variances contributed by parameters due to the random noise introduced by sampling [11]. Thus, the selection of the maximum harmonic order (M) in eq. (19) is very important for the estimation of sensitivity indices. If we select a very large M, it may cause overestimation by incorporating much noise from the Fourier amplitudes at high harmonics. If we select a low M, it may cause too much underestimation if the model is highly nonlinear (in which case there will still be signals at high harmonics). Xu and Gertner [11] recommended using a reasonably large M (specifically, M < 50 for first-order sensitivity indices, M < 5 for second-order sensitivity indices, and M < 3 for third-order sensitivity indices) and estimating the Fourier amplitudes (i.e., |Ĉ^{(θ)}_{r_1..r_n}|² = â²_{r_1..r_n} + b̂²_{r_1..r_n}) using only those sine and cosine Fourier coefficients significantly larger or smaller than zero based on hypothesis tests.

The hypothesis tests are based on z-score statistics of â_r and b̂_r as follows,

Z_{\hat{a}_r} = \frac{\hat{a}_r}{\sqrt{\hat{V}(\hat{a}_r)}},
Z_{\hat{b}_r} = \frac{\hat{b}_r}{\sqrt{\hat{V}(\hat{b}_r)}},   (37)

which will follow standard normal distributions (assuming a relatively large sample size N). Incorporating only the Fourier coefficients (i.e., â_r and b̂_r) significantly larger or smaller than zero does not affect the proposed method of standard error estimation. Namely, we only need to calculate the standard errors based on those Fourier coefficients that are significantly larger or smaller than zero.
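The screening in eq. (37) amounts to a two-sided z-test on each estimated coefficient. A small Python sketch (function name ours; the 1.96 cutoff corresponds to an assumed 5% significance level) is:

    import numpy as np

    def significant_coefficients(y, theta, harmonics, z_crit=1.96):
        """Flag Fourier coefficients whose z-scores (eq. (37)) differ significantly from zero."""
        N = len(y)
        keep_a, keep_b = [], []
        for r in harmonics:
            phase = theta @ np.asarray(r)
            ca, cb = y * np.cos(phase), y * np.sin(phase)
            a_hat, b_hat = ca.mean(), cb.mean()
            va = (np.mean(ca**2) - a_hat**2) / N              # eq. (26)
            vb = (np.mean(cb**2) - b_hat**2) / N
            keep_a.append(abs(a_hat) / np.sqrt(va) > z_crit)
            keep_b.append(abs(b_hat) / np.sqrt(vb) > z_crit)
        return np.array(keep_a), np.array(keep_b)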

2.3 Standard errors of sensitivity indices for random balance design sampling

For random balance design sampling, we first draw N grid samples for each parameter in the θ-space,

\left\{ \theta_i^{(1)}, ..., \theta_i^{(N)} \;\Big|\; \theta_i^{(j)} = \theta_0 + \frac{2\pi}{N}(j - 1), \; j = 1, ..., N \right\},   (38)

where θ_0 is randomly drawn from [0, 2π/N] so that the grid sample for θ_i can ergodically explore the interval between 0 and 2π. The samples are then randomly permuted to form N samples in the θ-space. In this way, random balance design sampling draws random samples in the high-dimensional θ-space similar to simple random sampling. Therefore, the estimation errors for Fourier coefficients using random balance design will be similar to those based on simple random sampling. However, by using grid samples for each individual parameter, the estimation errors for the Fourier coefficients used to calculate the partial variances contributed by main effects are different from those based on simple random sampling.
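A minimal Python sketch of this sampling scheme (reusing the SciPy distribution objects assumed earlier; function name ours) is:

    import numpy as np

    def rbd_sample(N, dists, rng=None):
        """Random balance design sample: one ergodic grid over [0, 2*pi) per parameter (eq. (38)),
        independently permuted for each parameter, then mapped through the search function of eq. (2)."""
        rng = np.random.default_rng(rng)
        n = len(dists)
        theta = np.empty((N, n))
        for i in range(n):
            grid = rng.uniform(0.0, 2.0 * np.pi / N) + 2.0 * np.pi * np.arange(N) / N
            theta[:, i] = rng.permutation(grid)               # random permutation of the grid
        u = 0.5 + np.arcsin(np.sin(theta)) / np.pi
        x = np.column_stack([dists[i].ppf(u[:, i]) for i in range(n)])
        return theta, x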

Based on the definition of the Fourier coefficient C^{(θ)}_{0..0 r_i 0..0} in eq. (11) and eq. (12), we have

C^{(\theta)}_{0..0\, r_i\, 0..0} = E\!\left( g(G(\theta_1), ..., G(\theta_n)) \, e^{-\mathrm{i} r_i \theta_i} \right)
 = \frac{1}{(2\pi)^n} \int_0^{2\pi} \cdots \int_0^{2\pi} g(G(\theta_1), ..., G(\theta_n)) \, e^{-\mathrm{i} r_i \theta_i} \, d\theta_1 \cdots d\theta_n
 = \frac{1}{2\pi} \int_0^{2\pi} E\!\left( g(G(\theta_1), ..., G(\theta_n)) \mid \theta_i \right) e^{-\mathrm{i} r_i \theta_i} \, d\theta_i
 = \frac{1}{2\pi} \int_0^{2\pi} u(\theta_i) \, e^{-\mathrm{i} r_i \theta_i} \, d\theta_i
 = E\!\left( u(\theta_i) \, e^{-\mathrm{i} r_i \theta_i} \right),   (39)

where

u(\theta_i) = E\!\left( g(G(\theta_1), ..., G(\theta_n)) \mid \theta_i \right).

Let the model output be

y = u(\theta_i) + \varepsilon_{(\theta_i)},   (40)

where ε_{(θ_i)} is a random error independent of θ_i with E(ε_{(θ_i)}) = 0. Using eq. (40), it can be easily shown that

V_{x_i} = V(u(\theta_i)) = V\!\left[ E(g(G(\theta_1), ..., G(\theta_n)) \mid \theta_i) \right],
V(\varepsilon_{(\theta_i)}) = V(y) - V_{x_i}.   (41)

Based on eq. (40), the Fourier coefficients can be estimated as

\hat{C}^{(\theta)}_{0..\, r_i\, ..0} = \frac{1}{N} \sum_{j=1}^{N} g(G(\theta_1^{(j)}), ..., G(\theta_n^{(j)})) \, e^{-\mathrm{i} r_i \theta_i^{(j)}}
 = \frac{1}{N} \sum_{j=1}^{N} u(\theta_i^{(j)}) \, e^{-\mathrm{i} r_i \theta_i^{(j)}} + \frac{1}{N} \sum_{j=1}^{N} \varepsilon_{(\theta_i)}^{(j)} \, e^{-\mathrm{i} r_i \theta_i^{(j)}}
 = \hat{C}^{(u,\theta_i)}_{0..\, r_i\, ..0} + \hat{C}^{(\varepsilon,\theta_i)}_{0..\, r_i\, ..0},   (42)

where

\hat{C}^{(u,\theta_i)}_{0..\, r_i\, ..0} = \frac{1}{N} \sum_{j=1}^{N} u(\theta_i^{(j)}) \, e^{-\mathrm{i} r_i \theta_i^{(j)}},
\hat{C}^{(\varepsilon,\theta_i)}_{0..\, r_i\, ..0} = \frac{1}{N} \sum_{j=1}^{N} \varepsilon_{(\theta_i)}^{(j)} \, e^{-\mathrm{i} r_i \theta_i^{(j)}}.

9

Based on eq. (42), it indicates that the standard error of Fourier coefficient ( )00.. ..00

ˆir

C is 10

resulted from estimation errors of both ( , )00 00i

urC and ( , )

00 00irC

. For 11

( , ) ( , ) ( , )00 00 00 00 00 00

ˆˆ ˆi i ir r rC a b

i with 12

( , ) ( ) ( )00 00

1

( , ) ( ) ( )00 00

1

1ˆ ( ) cos( ),

1ˆ ( )sin( ),

i i

i i

Nj j

r i ij

Nj j

r i ij

a rN

b rN

13

based on the Central Limit Theorem, we have 14

23

( , )00 00

( , )00 00

( )ˆ ~ (0, ),

2( )ˆ ~ (0, ),2

i

i

i

i

r

r

Va N

NV

b NN

(43) 1

for a relatively large sample size N. This indicates that the estimation error of Ĉ^{(ε,θ_i)}_{0..r_i..0} decreases at an order of 1/√N with increasing sample size N. For the estimation of Ĉ^{(u,θ_i)}_{0..r_i..0} (where C^{(u,θ_i)}_{0..r_i..0} = E[u(θ_i) e^{-i r_i θ_i}]), since we use grid samples for each individual parameter, the estimation errors decrease at an order of 1/N² with increasing sample size N (see Xu and Gertner [11] for details). This suggests that the estimation error for Ĉ^{(ε,θ_i)}_{0..r_i..0} is substantially larger than that for Ĉ^{(u,θ_i)}_{0..r_i..0} for a relatively large sample size N. Thus, we can ignore the estimation error for Ĉ^{(u,θ_i)}_{0..r_i..0}. Finally, for Ĉ^{(θ)}_{0..r_i..0} = â_{0..r_i..0} - i b̂_{0..r_i..0}, we have

\hat{a}_{0..\, r_i\, ..0} \sim N\!\left(a_{0..\, r_i\, ..0}, \frac{V(\varepsilon_{(\theta_i)})}{2N}\right),
\hat{b}_{0..\, r_i\, ..0} \sim N\!\left(b_{0..\, r_i\, ..0}, \frac{V(\varepsilon_{(\theta_i)})}{2N}\right),   (44)

in view of the fact that C^{(u,θ_i)}_{0..r_i..0} = C^{(θ)}_{0..r_i..0} by eq. (39). To estimate V(ε_{(θ_i)}) based on eq. (41), we need to first estimate V_{x_i}. Following Xu and Gertner [11], a conservative preliminary estimate of V_{x_i} can be calculated as follows,

\hat{V}_{x_i} = \sum_{|r_i| = 1}^{M} |\hat{C}^{(\theta)}_{0..0\, r_i\, 0..0}|^2,   (45)

where M is a relatively small maximum harmonic order (e.g., 4), under which the Fourier amplitudes have relatively small proportions of error. In order to improve the estimation accuracy of V̂_{x_i}, the statistical tests with eq. (37) for simple random sampling can be used to select those cosine and sine Fourier coefficients significantly larger or smaller than zero for the estimation of the Fourier amplitudes (i.e., |Ĉ^{(θ)}_{0..r_i..0}|² = (â_{0..r_i..0})² + (b̂_{0..r_i..0})²). Finally, V(ε_{(θ_i)}) can be estimated as

\hat{V}(\varepsilon_{(\theta_i)}) = \hat{V}(y) - \hat{V}_{x_i}.   (46)

It can be shown that the covariance between the cosine Fourier coefficient â_{0..r_i..0} and the sine Fourier coefficient b̂_{0..r_i'..0} is zero (where the harmonic r_i can be the same as or different from r_i', |r_i|, |r_i'| = 1, ..., M; see Appendix D for details). We can also show that the covariance between â_{0..r_i..0} and â_{0..r_i'..0} and the covariance between b̂_{0..r_i..0} and b̂_{0..r_i'..0} are both zero (where r_i ≠ r_i', |r_i|, |r_i'| = 1, ..., M; see Appendix D for details).

For the covariance between the cosine Fourier coefficient â_{0..r_i..0} and the sample variance σ̂², as used in simple random sampling for first-order sensitivity indices, based on eq. (42) we have

\mathrm{Cov}(\hat{a}_{0..\, r_i\, ..0}, \hat{\sigma}^2) = \mathrm{Cov}(\hat{a}^{(u,\theta_i)}_{0..\, r_i\, ..0}, \hat{\sigma}^2) + \mathrm{Cov}(\hat{a}^{(\varepsilon,\theta_i)}_{0..\, r_i\, ..0}, \hat{\sigma}^2).   (47)

Notice that the covariance estimate for Cov(â_{0..r_i..0}, σ̂²) based on eq. (33) mainly depends on the signal of the model output y for θ_i at the harmonics (0, ..., 0, r_i, 0, ..., 0) with r_i = 1, ..., M. Since â^{(ε,θ_i)}_{0..r_i..0} is estimated based on the part of the model output y with no signal for θ_i (i.e., ε_{(θ_i)}, which is independent of θ_i; see eq. (40) for details), the second term on the right-hand side of eq. (47) [i.e., Cov(â^{(ε,θ_i)}_{0..r_i..0}, σ̂²)] will be close to zero. Therefore, we have

\mathrm{Cov}(\hat{a}_{0..\, r_i\, ..0}, \hat{\sigma}^2) \approx \mathrm{Cov}(\hat{a}^{(u,\theta_i)}_{0..\, r_i\, ..0}, \hat{\sigma}^2).

At the same time, for a grid sample of parameter x_i in random balance design sampling, â^{(u,θ_i)}_{0..r_i..0} has a substantially lower estimation error compared to that of â^{(ε,θ_i)}_{0..r_i..0} [11]. Thus, we can treat â^{(u,θ_i)}_{0..r_i..0} as a constant for a relatively large sample size N, and we have

\mathrm{Cov}(\hat{a}_{0..\, r_i\, ..0}, \hat{\sigma}^2) \approx \mathrm{Cov}(\hat{a}^{(u,\theta_i)}_{0..\, r_i\, ..0}, \hat{\sigma}^2) \approx 0.

Please see Appendix E for a proof of the above equation. Similarly, we can also show that Cov(b̂_{0..r_i..0}, σ̂²) ≈ 0.

Finally, the covariance matrix Σ becomes a diagonal matrix, and the standard error of the first-order sensitivity index Ŝ_{x_i} can be estimated based on the delta method as follows (see Appendix F for details),

\hat{\sigma}^2(\hat{S}_{x_i}) = \frac{2}{N} \hat{S}_{x_i} (1 - \hat{S}_{x_i}) + \frac{\hat{S}_{x_i}^2}{\hat{\sigma}^4} \cdot \frac{1}{N-1} \left[ \hat{u}^{(4)} - (\hat{\sigma}^2)^2 \right].   (48)

Eq. (48) suggests that the standard errors of sensitivity indices calculated by FAST analysis depend on four factors: 1) the magnitude of the sensitivity indices, 2) the sample variance (σ²) of the model output, 3) the fourth central moment (u^{(4)}) of the model output, and 4) the sample size N.
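For reference, eq. (48) reduces to a few lines of Python (function name ours; the square root of the expression above gives the standard error):

    import numpy as np

    def rbd_first_order_se(S_hat, y):
        """Delta-method standard error of a first-order index under random balance design sampling, eq. (48)."""
        N = len(y)
        s2 = np.var(y, ddof=1)
        u4 = np.sum((y - y.mean())**4) / (N - 1)              # estimated fourth central moment
        var_S = 2.0 / N * S_hat * (1.0 - S_hat) + S_hat**2 / s2**2 * (u4 - s2**2) / (N - 1)
        return np.sqrt(var_S)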

2.4 Procedures

In this section, we provide a summary of the procedures for FAST analysis and the corresponding standard error estimation for the calculated sensitivity indices.

FAST procedure using simple random sampling

a. Draw independent random samples in the θ-space (K_θ^n = {θ_1, ..., θ_n | 0 ≤ θ_i ≤ 2π, i = 1, ..., n}).
b. Generate corresponding random samples in the parameter space using the search functions in eq. (2) and the sample in the θ-space.
c. Run the model using the parameter samples from b.
d. Calculate the Fourier coefficients based on eq. (16) and the corresponding sensitivity indices with eq. (22). Statistical tests with eq. (37) are used to select Fourier coefficients significantly larger or smaller than zero.
e. Calculate the covariance matrix Σ among the Fourier coefficients and the sample variance of the output y based on eq. (27), eq. (28), eq. (29), eq. (30), eq. (32), eq. (34), and eq. (35).
f. Estimate the standard errors of the sensitivity indices using eq. (36).

FAST procedure using random balance design sampling

a. Draw N grid samples for each θ_i using eq. (38), which are then randomly permuted to form a random sample in the θ-space.
b. Generate a corresponding random sample in the parameter space using the search functions in eq. (2) and the sample in the θ-space.
c. Run the model using the parameter samples from b.
d. Calculate the Fourier coefficients based on eq. (16) and the corresponding sensitivity indices with eq. (22). Statistical tests with eq. (44) are used to select Fourier coefficients significantly larger or smaller than zero for the first-order sensitivity index calculation, and statistical tests with eq. (37) are used to select Fourier coefficients significantly larger or smaller than zero for the higher-order sensitivity index calculation.
e. Calculate the covariance matrix Σ among the Fourier coefficients and the sample variance of the output y using eq. (44) and eq. (32) for first-order sensitivity indices, with Σ equal to a diagonal matrix, and using eq. (27), eq. (28), eq. (29), eq. (30), eq. (32), eq. (34), and eq. (35) for higher-order sensitivity indices.
f. Estimate the standard errors of the first-order and higher-order sensitivity indices using eq. (48) and eq. (36), respectively.
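In terms of the sketches given earlier, the simple-random-sampling procedure maps onto code roughly as follows; g and dists are placeholders for the user's model function and the list of parameter distributions:

    import numpy as np

    # steps a-c: sample the theta-space, transform to the parameter space, run the model
    theta, x = fast_search_sample(2000, dists)
    y = np.array([g(*row) for row in x])                      # one model run per sample row
    # step d: harmonic set H^(M)(x_1) for the first parameter, M = 4
    n, M = x.shape[1], 4
    H = [np.array([s * m if k == 0 else 0 for k in range(n)]) for m in range(1, M + 1) for s in (1, -1)]
    # steps e-f: covariance matrix, gradient, and delta-method standard error
    S_hat, se_hat = delta_se_srs(y, theta, H)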

3 Applications

In order to test how well the delta method approximates the standard errors of sensitivity indices, we use a simple test model and a realistic complex model. The simple test model is as follows,

y = \sum_{i=1}^{3} i\, x_i^2 + x_1 x_2 + 2 x_2 x_3 + 3 x_1 x_3 + x_1 x_2 x_3,   (49)

where x_1, x_2, and x_3 are three independent parameters of the model. We assume all parameters follow standard normal distributions. Although the model is simple, it is representative since it is nonlinear and non-monotonic. The analytical sensitivity indices for this model have been derived by Xu and Gertner [11].
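Assuming the reading of eq. (49) given above, the test model can be written directly as:

    def test_model(x1, x2, x3):
        """Simple nonlinear, non-monotonic test model of eq. (49); x1, x2, x3 are independent N(0, 1)."""
        return (x1**2 + 2 * x2**2 + 3 * x3**2
                + x1 * x2 + 2 * x2 * x3 + 3 * x1 * x3 + x1 * x2 * x3)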

For the complex model, we applied the delta method to a world population model, World3 [23]. The World3 model is a computer program for simulating the interactions among population, industrial growth, food production and limits in the ecosystems of the Earth. The model consists of six main systems: the food system, agriculture system, industrial system, population system, nonrenewable resources system and pollution system. In this paper, we are concerned with the industrial system, which provides products for the world population [23]. At the same time, this industrial system creates pollution, which reduces land productivity. For details, please refer to [23]. The seven parameters of interest in the model are shown in Table 1. The output of interest is the world human population on a 5-year basis. For real application models, the ranges, distributions and correlations among parameters can be derived from empirical or historical data [24,25]. However, to simplify our test case, we assume a uniform distribution for each parameter. The bounds for each parameter are assumed to be a 10% deviation from its central value (Table 1). Since FAST analysis does not allow for the calculation of higher-order sensitivity indices for correlated parameters, in this paper we assume independence among parameters.

Xu and Gertner [11] showed that, for partial variances contributed by the main effects of parameters in simple random sampling, and for partial variances contributed by higher-order interactions in random balance design sampling, the estimation bias for a Fourier amplitude |C^{(θ)}_{r_1..r_n}|² is related to the expected square of the model output, E[g(G(θ_1), ..., G(θ_n))²]/N. Since a sensitivity index is calculated as the ratio of a sum of Fourier amplitudes |C^{(θ)}_{r_1..r_n}|² to the variance of the model output σ² (see eq. (9) for details), it is possible that the calculated sensitivity index is larger than one if E[g(G(θ_1), ..., G(θ_n))²]/N is much larger than σ² in a model. To overcome this problem, Xu and Gertner [11] recommended that the model output be centered by the sample mean and/or scaled by the standard deviation of the model output, so that E[g(G(θ_1), ..., G(θ_n))²]/N is generally smaller than the variance σ². For the World3 model, the variance of the model output is much smaller than E[g(G(θ_1), ..., G(θ_n))²]/N, especially at the beginning of the model simulation. In order to overcome this potentially large estimation bias, we subtracted the sample mean from the model output.

In order to test the reliability of the standard errors derived based on the delta method, we used the standard deviations of the sensitivity indices based on 20 replicates of samples as references. In our preliminary analysis, we found that 20 replicates are enough for a reasonable approximation of the standard errors (i.e., increasing the number of replicates does not substantially affect the standard error estimation).

4 Results

For the simple test model, our results showed that the delta method can provide a good approximation of the standard errors of both first-order and higher-order sensitivity indices for simple random sampling (Figure 1) as well as random balance design sampling (Figure 2). The approximation is better for a larger sample size, since the delta method depends on the large-sample assumption. We can also see that the standard errors of first-order sensitivity indices for random balance design sampling are substantially reduced compared to those based on simple random sampling. Finally, the standard errors are generally higher for larger sensitivity indices. However, the coefficients of variation (i.e., the precision relative to the mean values of the sensitivity indices) are much lower (Figure 1a and Figure 3).

For the World3 model, parameters x2, x3 and x5 have relatively high first-order sensitivity indices before year 2000. After that, the interaction between x2 and x3 and the interaction between x3 and x5 become important (Figure 4). We derived the sensitivity indices for all first-order, second-order and third-order interactions based on a single sample of size N (N = 1000 or N = 5000). For convenience, we did not plot sensitivity indices less than 0.05. The standard errors of first-order and higher-order sensitivity indices estimated based on the delta method can reasonably approximate those based on 20 sample replicates for both simple random sampling (Figure 4 and Figure 5) and random balance design sampling (Figure 6 and Figure 7). Increasing the sample size from 1000 to 5000 decreases the estimated standard errors for both simple random sampling (Figure 4 and Figure 5) and random balance design sampling (Figure 6 and Figure 7). However, the difference in standard errors for first-order sensitivity indices between random balance design sampling and simple random sampling is not as evident as that for the test model of eq. (49). There are two reasons for this. First, the centered output values reduce the magnitude of the possible random errors by reducing the bias term E[g(G(θ_1), ..., G(θ_n))²]/N. Second, there is only a small amount of variation in the model output at the beginning of the simulation, so random balance design sampling does not gain much sampling efficiency compared to simple random sampling. Later on (about 100 years after the initial simulation year), random balance design sampling does slightly increase the precision (i.e., reduce the standard errors) of the first-order sensitivity indices of parameters x2 and x3 (compare Figure 4 with Figure 6, and Figure 5 with Figure 7).

5 Discussion

Due to the random errors introduced by sampling, it is important that uncertainty and sensitivity analysis techniques provide a measure of the reliability of estimated sensitivity indices (commonly the standard errors of the estimated sensitivity indices). For complex models which take a long time to run a single simulation, it is computationally expensive or infeasible to estimate the standard errors of sensitivity indices based on replicates of a relatively large sample (e.g., 10 or 20 replicates). The delta method can provide a good approximation of the standard errors of sensitivity indices at no additional cost in model simulations. This can substantially increase the efficiency of FAST analysis for complex models.

In addition, the delta method can provide the user with a general approach to obtain the minimum sample size needed to achieve a desired precision level for a specified sensitivity index using random balance design sampling. For a model with a sample variance of the output y (σ̂²) and a fourth central moment (û^{(4)}) (estimated based on a preliminary sample), if we want to have a desired standard error σ̂(Ŝ_{x_i}) for a specified sensitivity index (e.g., σ̂(Ŝ_{x_i}) less than 50% of a sensitivity index of 0.02), the minimum sample size N (N_min) can be approximately estimated based on eq. (48) as follows,

N_{\min} \approx \frac{2 \hat{S}_{x_i} (1 - \hat{S}_{x_i})}{\hat{\sigma}^2(\hat{S}_{x_i})} + \frac{1}{\hat{\sigma}^2(\hat{S}_{x_i})} \cdot \frac{\hat{S}_{x_i}^2}{\hat{\sigma}^4} \left[ \hat{u}^{(4)} - (\hat{\sigma}^2)^2 \right].   (50)

Eq. (50) provides the model user with an operational approach to determine the minimum sample size needed to achieve a desired precision level for a specified sensitivity index. Namely, the model user can first obtain a preliminary estimate of σ̂² and û^{(4)} based on a relatively small sample size (dependent on the simulation time of the model), which takes less computational time. Then, the user can define a desired standard error for a sensitivity index of interest. For example, for the last simulation year in the World3 model, we have σ̂² = 1.605 × 10^18 and û^{(4)} = 6.518 × 10^36. In order to achieve a standard error of 0.03 for a sensitivity index of 0.5, a minimum sample size of 981 is required based on eq. (50) (see Figure 8). As far as we know, this is the first statistical approach proposed to determine the sample size for an uncertainty and sensitivity analysis method based on random balance design sampling.
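The worked example above corresponds to the following short calculation (function name ours; eq. (50) with the N - 1 term of eq. (48) approximated by N):

    import math

    def min_sample_size(S_target, se_target, s2, u4):
        """Minimum sample size for a desired standard error of a first-order index, eq. (50)."""
        return math.ceil((2 * S_target * (1 - S_target)
                          + S_target**2 / s2**2 * (u4 - s2**2)) / se_target**2)

    # last simulation year of World3: sigma^2 = 1.605e18, u4 = 6.518e36;
    # a standard error of 0.03 for an index of 0.5 requires about 981 samples
    N_min = min_sample_size(0.5, 0.03, 1.605e18, 6.518e36)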

We need to point out that the sample size based on eq. (50) is only a minimum sample size from the standard error perspective. One key assumption in FAST is that the model output y becomes a multiple periodic function of the parameters in the θ-space. For models with many parameters, if a parameter has a relatively low contribution to the uncertainty in a model output, then its periodic signal may easily be contaminated by the random noise [e.g., ε_{(θ_i)} in eq. (40)] due to the other parameters. In order to detect and distinguish the signals for different parameters, a relatively large sample size is required. Currently we do not have an analytical form to calculate the minimum sample size required for that. Based on our experience, a sample size of 1000 is generally good for models with fewer than 20 parameters. A statistically robust method to achieve a desired precision level for estimated sensitivity indices is especially important for complex models, since trial-and-error approaches (i.e., trying many replicates at different sample sizes to get the desired precision) are computationally inefficient. Thus, our proposed method is a first step toward a precise determination of the sample size for FAST applications in complex models.

The proposed standard error estimation for the sensitivity indices is mainly based on the assumption that the model parameters are independent. In practice, it is common that the parameters are correlated. Xu and Gertner [12,13] proposed using a simple random reordering approach to generate samples for parameters with a specific rank correlation structure. The main idea is that, after generating samples for (θ_1, ..., θ_n) [using the simple random sampling approach or the random balance design sampling approach] and the corresponding samples for the parameters (x_1, ..., x_n) with the search function of eq. (2), the sampled parameter values are reordered to honor a specified correlation structure using Iman and Conover's method [26]. Samples of (θ_1, ..., θ_n) are reordered correspondingly based on the order of the parameter samples. For first-order sensitivity indices, it would be easy to extend our proposed approaches for standard error estimation to models with correlated parameters. For a model y = g(x_1, ..., x_n) with dependent parameters, the model output can be decomposed as follows for a specific model parameter x_i,

y = u(x_i) + \varepsilon_{(x_i)},   (51)

where

u(x_i) = E(g(x_1, ..., x_n) \mid x_i)   (52)

and ε_{(x_i)} is the random error that arises due to the uncertainty in the parameters other than x_i, with

E(\varepsilon_{(x_i)}) = 0,
V(\varepsilon_{(x_i)}) = V(y) - V(E(g(x_1, ..., x_n) \mid x_i)).   (53)

With the search function of eq. (2), we can see that the model output y becomes a periodic function of θ_i through eq. (51). Based on a Fourier transformation over θ_i, the partial variance contributed by parameter x_i can be calculated as [11]

V_{x_i} = \sum_{|r_i| = 1}^{M} |C^{(u,\theta_i)}_{r_i}|^2,   (54)

where

\hat{C}^{(u,\theta_i)}_{r_i} = \frac{1}{N} \sum_{j=1}^{N} g(G(\theta_1^{(j)}), ..., G(\theta_n^{(j)})) \, e^{-\mathrm{i} r_i \theta_i^{(j)}}.

With the sine and cosine Fourier coefficients of Ĉ^{(u,θ_i)}_{r_i}, we can estimate the first-order sensitivity indices of all parameters using eq. (22). It is noteworthy that the decomposition of y in the parameter space with eq. (51) is equivalent to the decomposition in the θ-space with eq. (40). Therefore, we can still estimate the standard errors of first-order sensitivity indices using eq. (48) for models with correlated parameters based on random balance design sampling. The decomposition in the θ-space with eq. (40) is a general decomposition and is also valid for simple random sampling for first-order sensitivity indices. Therefore, eq. (36) can still be used to estimate the standard errors of first-order sensitivity indices for models with correlated parameters based on simple random sampling. Due to the difficulty of distinguishing correlated effects from interaction effects, FAST analysis has not yet been proposed for the calculation of higher-order sensitivity indices in models with correlated parameters.

6 Acknowledgement

This study was supported by U.S. Department of Agriculture McIntire-Stennis funds (MS 875-359) and NIH grant R01-AI54954-0IA2. We thank two anonymous reviewers for their very helpful comments, which substantially improved this paper.

References:

1. A. Saltelli, S. Tarantola, and F. Campolongo, Sensitivity analysis as an ingredient of modeling. Statistical Science 15 (2000), pp. 377-395.
2. A. Saltelli, M. Ratto, S. Tarantola, and F. Campolongo, Sensitivity analysis for chemical models. Chem Rev 105 (2005), pp. 2811-2826.
3. A. Saltelli, K. Chan, and M. Scott, Sensitivity Analysis. John Wiley and Sons, West Sussex, 2000.
4. A. Saltelli, and S. Tarantola, On the relative importance of input factors in mathematical models: safety assessment for nuclear waste disposal. J Am Stat Assoc 97 (2002), pp. 702-709.
5. R.I. Cukier, J.H. Schaibly, and K.E. Shuler, Study of the sensitivity of coupled reaction systems to uncertainties in rate coefficients. III. Analysis of the approximations. J Chem Phys 63 (1975), pp. 1140-1149.
6. R.I. Cukier, H.B. Levine, and K.E. Shuler, Nonlinear sensitivity analysis of multiparameter model systems. Journal of Computational Physics 26 (1978), pp. 1-42.
7. J.H. Schaibly, and K.E. Shuler, Study of the sensitivity of coupled reaction systems to uncertainties in rate coefficients. II. Applications. J Chem Phys 59 (1973), pp. 3879-3888.
8. R.I. Cukier, C.M. Fortuin, K.E. Shuler, A.G. Petschek, and J.H. Schaibly, Study of the sensitivity of coupled reaction systems to uncertainties in rate coefficients. I. Theory. J Chem Phys 59 (1973), pp. 3873-3878.
9. E. Borgonovo, Measuring uncertainty importance: investigation and comparison of alternative approaches. Risk Anal 26 (2006), pp. 1349-1361.
10. S. Tarantola, D. Gatelli, and T.A. Mara, Random balance designs for the estimation of first order global sensitivity indices. Reliab Eng Syst Safe 91 (2006), pp. 717-727.
11. C. Xu, and G.Z. Gertner, Understanding and comparisons of different sampling approaches for the Fourier Amplitudes Sensitivity Test (FAST). Comput Stat Data An, Accepted (2010).
12. C. Xu, and G.Z. Gertner, Extending a global sensitivity analysis technique to models with correlated parameters. Comput Stat Data An 51 (2007), pp. 5579-5590.
13. C. Xu, and G.Z. Gertner, A general first-order global sensitivity analysis method. Reliab Eng Syst Safe 93 (2008), pp. 1060-1071.
14. A. Saltelli, S. Tarantola, and K.P.S. Chan, A quantitative model-independent method for global sensitivity analysis of model output. Technometrics 41 (1999), pp. 39-56.
15. R. Scheller, and D. Mladenoff, An ecological classification of forest landscape simulation models: tools and strategies for understanding broad-scale forested ecosystems. Landsc Ecol 22 (2007), pp. 491-505.
16. H.S. He, Forest landscape models: Definitions, characterization, and classification. For Ecol Manag 254 (2008), pp. 484-498.
17. C. Xu, G.Z. Gertner, and R.M. Scheller, Uncertainties in the response of a forest landscape to global climatic change. Glob Change Biol 15 (2009), pp. 116-131.
18. P. Reiter, Oviposition, dispersal, and survival in Aedes aegypti: implications for the efficacy of control strategies. Vector Borne Zoonot Dis 7 (2007), pp. 261-274.
19. M. Otero, N. Schweigmann, and H.G. Solari, A stochastic spatial dynamical model for Aedes aegypti. Bull Math Biol 70 (2008), pp. 1297-1325.
20. K. Magori, M. Legros, M.E. Puente, D.A. Focks, T.W. Scott, et al., Skeeter Buster: a stochastic, spatially-explicit modeling tool for studying Aedes aegypti population replacement and population suppression strategies. Plos Neglect Trop Dis 3 (2009), pp. e508.
21. S.F. Fang, G.Z. Gertner, S. Shinkareva, G.X. Wang, and A. Anderson, Improved generalized Fourier Amplitude Sensitivity Test (FAST) for model assessment. Stat Comput 13 (2003), pp. 221-226.
22. Y. Lu, and S. Mohanty, Sensitivity analysis of a complex, proposed geologic waste disposal system using the Fourier Amplitude Sensitivity Test method. Reliab Eng Syst Safe 72 (2001), pp. 275-291.
23. D.H. Meadows, D.L. Meadows, and J. Randers, Beyond the Limits. Chelsea Green Publishing Company, Post Mills, Vermont, 1992.
24. R.L. Iman, M.E. Johnson, and T.A. Schroeder, Assessing hurricane effects. Part 1. Sensitivity analysis. Reliab Eng Syst Safe 78 (2002), pp. 131-145.
25. A. Kanso, G. Chebbo, and B. Tassin, Application of MCMC-GSA model calibration method to urban runoff quality modeling. Reliab Eng Syst Safe 91 (2006), pp. 1398-1405.
26. R.L. Iman, and W.J. Conover, A distribution-free approach to inducing rank correlation among input variables. Commun Stat-Simul C 11 (1982), pp. 311-334.
27. E.L. Lehmann, Elements of large-sample theory. Springer, New York, 1999.

37

Tables and Figures

Table 1 Parameter specifications of uniform distributions for World3 model

Parameter  Label                                                                Lower bound    Upper bound
x1         industrial output per capita desired                                 315            385
x2         industrial capital output ratio before 1995                          2.7            3.3
x3         fraction of industrial output allocated to consumption before 1995   0.387          0.473
x4         fraction of industrial output allocated to consumption after 1995    0.387          0.473
x5         average life of industrial capital before 1995                       12.6           15.4
x6         average life of industrial capital after 1995                        16.2           19.8
x7         initial industrial capital                                           1.89 x 10^11   2.31 x 10^11

Figure 1 Estimated sensitivity indices and their standard errors based on simple random sampling. The left column [panel (a) for a sample size of 1000 and panel (c) for a sample size of 5000] shows the sensitivity indices, and the right column [panel (b) for a sample size of 1000 and panel (d) for a sample size of 5000] shows the standard errors. For the sensitivity indices in (a) and (c), the white bars represent the FAST-based sensitivity indices using a randomly selected replicate, while the grayed bars represent the analytically derived sensitivity indices. For the standard errors in (b) and (d), the white bars represent estimated standard errors based on the delta method. The grayed bars represent the standard errors of sensitivity indices calculated based on 20 sample replicates.

[Figure 1: four bar-chart panels (a)-(d) over the parameters x1, x2, x3 and their interactions x1x2, x1x3, x2x3, x1x2x3; y-axes: sensitivity (legend: FAST, Reference) and standard error (legend: Delta, Reference).]

Figure 2 Estimated sensitivity indices and their standard errors based on random balance design sampling. The left column [panel (a) for a sample size of 1000 and panel (c) for a sample size of 5000] shows the sensitivity indices, and the right column [panel (b) for a sample size of 1000 and panel (d) for a sample size of 5000] shows the standard errors. For the sensitivity indices in (a) and (c), the white bars represent the FAST-based sensitivity indices using a randomly selected replicate, while the grayed bars represent the analytically derived sensitivity indices. For the standard errors in (b) and (d), the white bars represent estimated standard errors using the delta method. The grayed bars represent the standard errors of sensitivity indices calculated based on 20 sample replicates.

[Figure 2: four bar-chart panels (a)-(d) over the parameters x1, x2, x3 and their interactions x1x2, x1x3, x2x3, x1x2x3; y-axes: sensitivity (legend: FAST, Reference) and standard error (legend: Delta, Reference).]

Figure 3 Coefficients of variation for the test model in eq. (49) based on simple random sampling with a sample size of 5000.

[Figure 3: bar chart of the coefficient of variation over the parameters x1, x2, x3 and their interactions x1x2, x1x3, x2x3, x1x2x3.]

Figure 4 Estimated sensitivity indices (left column) and their standard errors (right column) using simple random sampling with a sample size of 1000. The solid lines indicate sensitivity indices and standard errors estimated based on 20 sample replicates. The dotted lines indicate sensitivity indices and standard errors estimated using a single sample of size 1000. The standard errors are estimated using the delta method.

[Figure 4: time-series panels (years 1850-2150) of sensitivity index and standard error for x2, x3, x5, (x2, x3) and (x3, x5).]

Figure 5 Estimated sensitivity indices (left column) and their standard errors (right column) using simple random sampling with a sample size of 5000. The solid lines indicate sensitivity indices and standard errors estimated based on 20 sample replicates. The dotted lines indicate sensitivity indices and standard errors estimated using a single sample of size 5000. The standard errors are estimated using the delta method.

[Figure 5: time-series panels (years 1850-2150) of sensitivity index and standard error for x2, x3, x5, (x2, x3) and (x3, x5).]

Figure 6 Estimated sensitivity indices (left column) and their standard errors (right column) using random balance design sampling with a sample size of 1000. The solid lines indicate sensitivity indices and standard errors estimated based on 20 sample replicates. The dotted lines indicate sensitivity indices and standard errors estimated based on a single sample of size 1000. The standard errors are estimated using the delta method.

[Figure 6: time-series panels (years 1850-2150) of sensitivity index and standard error for x2, x3, x5, (x2, x3) and (x3, x5).]

Figure 7 Estimated sensitivity indices (left column) and their standard errors (right column) using random balance design sampling with a sample size of 5000. The solid lines indicate sensitivity indices and standard errors estimated based on 20 sample replicates. The dotted lines indicate sensitivity indices and standard errors estimated based on a single sample. The standard errors are estimated using the delta method.

[Figure 7: time-series panels (years 1850-2150) of sensitivity index and standard error for x2, x3, x5, (x2, x3) and (x3, x5).]

Figure 8 Minimum sample size required for specified standard errors for a sensitivity index of 0.5 for the last simulation year of the World3 model.

[Figure 8: minimum sample size (0-10,000) versus target standard error (0-0.06).]
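One simple way to reproduce the spirit of Figure 8, assuming only that the delta-method standard error shrinks roughly as 1/sqrt(N), is to rescale a pilot estimate of the standard error obtained at one sample size. The sketch below is not the paper's exact procedure, and the pilot values (se_pilot, n_pilot) are hypothetical.

    import numpy as np

    # Hypothetical pilot run: delta-method standard error of a sensitivity index
    # estimated from a single sample of size n_pilot (values are illustrative).
    se_pilot, n_pilot = 0.03, 1000
    target_se = np.array([0.05, 0.03, 0.02, 0.01])

    # If the standard error scales roughly as 1/sqrt(N), the required sample size is
    # N = n_pilot * (se_pilot / target_se)**2, rounded up.
    n_required = np.ceil(n_pilot * (se_pilot / target_se) ** 2).astype(int)

    for se, n in zip(target_se, n_required):
        print(f"target SE {se:.3f} -> minimum sample size {n}")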

Appendix A: Delta method

For a statistic $T_N$ with $E(T_N)=\theta$ and variance $V(T_N)$, suppose that

$$
\sqrt{N}\,(T_N-\theta) \xrightarrow{\;D\;} N(0,\sigma^2),
$$

where $\xrightarrow{D}$ indicates convergence in distribution as $N\to\infty$, and $N(0,\sigma^2)$ indicates a normal distribution with mean zero and variance $\sigma^2$. Then, for a function $g(\cdot)$ whose derivative $g'(\theta)$ is not zero, based on the delta method, we have

$$
\sqrt{N}\,\big(g(T_N)-g(\theta)\big) \xrightarrow{\;D\;} N\big(0,[g'(\theta)]^2\sigma^2\big).
$$

If $T_N$ is a multivariate vector with covariance matrix $\Sigma$, and

$$
\sqrt{N}\,(T_N-\theta) \xrightarrow{\;D\;} N(\mathbf{0},\Sigma),
$$

then for a function $g(\theta)$ with $\theta=(\theta_1,\ldots,\theta_m)$, the delta method states that

$$
\sqrt{N}\,\big(g(T_N)-g(\theta)\big) \xrightarrow{\;D\;} N\big(0,\nabla g(\theta)^{T}\,\Sigma\,\nabla g(\theta)\big),
$$

where

$$
\nabla g(\theta)=\Big(\frac{\partial g(\theta)}{\partial\theta_1},\ldots,\frac{\partial g(\theta)}{\partial\theta_m}\Big)^{T}\neq\mathbf{0}.
$$

The delta method is based on a first-order Taylor series expansion. For a detailed proof, please refer to Lehmann [27].
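As a quick numerical illustration of the univariate case above, the following sketch compares the delta-method standard error of g(T_N) = T_N^2, with T_N a sample mean, against a Monte Carlo reference; the normal distribution and the constants are arbitrary choices for the example.

    import numpy as np

    rng = np.random.default_rng(1)

    N, mu, sigma = 500, 2.0, 1.0           # sample size, true mean and SD of the data
    g = lambda t: t**2                      # transformation applied to the sample mean
    g_prime = lambda t: 2.0 * t             # its derivative

    # Delta-method standard error of g(T_N): |g'(mu)| * sigma / sqrt(N)
    se_delta = abs(g_prime(mu)) * sigma / np.sqrt(N)

    # Monte Carlo reference: spread of g(sample mean) over many replicated samples
    reps = np.array([g(rng.normal(mu, sigma, N).mean()) for _ in range(5000)])
    print("delta-method SE:", se_delta, " Monte Carlo SE:", reps.std(ddof=1))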


Appendix B: Covariance between estimated Fourier coefficients for simple random sampling

The covariance between cosine Fourier coefficients $\hat a_{\vec r}$ and sine Fourier coefficients $\hat b_{\vec r'}$ (where $\vec r=(r_1,\ldots,r_n)$ and $\vec r'$ is another vector of harmonics, which could be different from or the same as $\vec r$) can be calculated as

$$
\begin{aligned}
\mathrm{Cov}\big(\hat a_{\vec r},\hat b_{\vec r'}\big)
&= E\Big[\frac{1}{N}\sum_{j=1}^{N} g(\vec\varphi^{(j)})\cos\big(\vec r\,(\vec\varphi^{(j)})^{T}\big)\cdot
\frac{1}{N}\sum_{j=1}^{N} g(\vec\varphi^{(j)})\sin\big(\vec r'\,(\vec\varphi^{(j)})^{T}\big)\Big]-a_{\vec r}b_{\vec r'} \\
&= \frac{1}{N^2}E\Big[\sum_{j=1}^{N} g^{2}(\vec\varphi^{(j)})\cos\big(\vec r\,(\vec\varphi^{(j)})^{T}\big)\sin\big(\vec r'\,(\vec\varphi^{(j)})^{T}\big)\Big] \\
&\quad+\frac{1}{N^2}E\Big[\sum_{j\neq k} g(\vec\varphi^{(j)})\cos\big(\vec r\,(\vec\varphi^{(j)})^{T}\big)\,g(\vec\varphi^{(k)})\sin\big(\vec r'\,(\vec\varphi^{(k)})^{T}\big)\Big]-a_{\vec r}b_{\vec r'},
\end{aligned}
\tag{B1}
$$

where $\vec\varphi^{(j)}=(\varphi_1^{(j)},\ldots,\varphi_n^{(j)})$ and $(\vec\varphi^{(j)})^{T}$ represents the transpose of the vector $\vec\varphi^{(j)}$. The first term of the above equation can be simplified as

$$
\frac{1}{N^2}E\Big[\sum_{j=1}^{N} g^{2}(\vec\varphi^{(j)})\cos\big(\vec r\,(\vec\varphi^{(j)})^{T}\big)\sin\big(\vec r'\,(\vec\varphi^{(j)})^{T}\big)\Big]
=\frac{1}{N}E\big[g^{2}(\vec\varphi)\cos(\vec r\,\vec\varphi^{T})\sin(\vec r'\,\vec\varphi^{T})\big].
\tag{B2}
$$

Since $(\varphi_1^{(j)},\ldots,\varphi_n^{(j)})$ is drawn independently of $(\varphi_1^{(k)},\ldots,\varphi_n^{(k)})$ for $j\neq k$, the second term of eq. (B1) can be simplified as follows,

$$
\begin{aligned}
\frac{1}{N^2}E\Big[\sum_{j\neq k} g(\vec\varphi^{(j)})\cos\big(\vec r\,(\vec\varphi^{(j)})^{T}\big)\,g(\vec\varphi^{(k)})\sin\big(\vec r'\,(\vec\varphi^{(k)})^{T}\big)\Big]
&=\frac{1}{N^2}\sum_{j\neq k}E\big[g(\vec\varphi^{(j)})\cos\big(\vec r\,(\vec\varphi^{(j)})^{T}\big)\big]\,
E\big[g(\vec\varphi^{(k)})\sin\big(\vec r'\,(\vec\varphi^{(k)})^{T}\big)\big] \\
&=\frac{1}{N^2}\,N(N-1)\,a_{\vec r}b_{\vec r'}
=a_{\vec r}b_{\vec r'}-\frac{1}{N}a_{\vec r}b_{\vec r'}.
\end{aligned}
\tag{B3}
$$

Finally, based on eqs. (B1), (B2) and (B3), we have

$$
\mathrm{Cov}\big(\hat a_{\vec r},\hat b_{\vec r'}\big)
=\frac{1}{N}E\big[g^{2}(\vec\varphi)\cos(\vec r\,\vec\varphi^{T})\sin(\vec r'\,\vec\varphi^{T})\big]-\frac{1}{N}a_{\vec r}b_{\vec r'}
=\frac{1}{N}\Big(E\big[g^{2}(\vec\varphi)\cos(\vec r\,\vec\varphi^{T})\sin(\vec r'\,\vec\varphi^{T})\big]-a_{\vec r}b_{\vec r'}\Big).
$$
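The structure of the final expression, namely that the covariance of two sample averages built from the same independent draws equals (E[uv] - E[u]E[v])/N, can be checked numerically. The sketch below uses an arbitrary toy output function and an arbitrary pair of harmonics purely for illustration.

    import numpy as np

    rng = np.random.default_rng(2)
    N, reps = 200, 20000
    r1, r2 = 1, 2                                  # two harmonics (toy choice)

    def coeffs(phi):
        g = np.sin(phi) + 0.3 * np.sin(phi) ** 2   # toy model output g(phi)
        a_hat = np.mean(g * np.cos(r1 * phi))      # cosine coefficient estimate
        b_hat = np.mean(g * np.sin(r2 * phi))      # sine coefficient estimate
        return a_hat, b_hat

    # Monte Carlo covariance of the two estimates over many replicated samples
    ab = np.array([coeffs(rng.uniform(-np.pi, np.pi, N)) for _ in range(reps)])
    cov_mc = np.cov(ab[:, 0], ab[:, 1])[0, 1]

    # Analytical form: (E[g^2 cos sin] - a*b)/N, approximated with one large sample
    phi = rng.uniform(-np.pi, np.pi, 2_000_000)
    g = np.sin(phi) + 0.3 * np.sin(phi) ** 2
    cov_formula = (np.mean(g**2 * np.cos(r1 * phi) * np.sin(r2 * phi))
                   - np.mean(g * np.cos(r1 * phi)) * np.mean(g * np.sin(r2 * phi))) / N

    print(cov_mc, cov_formula)    # the two values should be close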

Appendix C: Covariance between estimated Fourier coefficients and sample variance for simple random sampling

The covariance between the estimated cosine Fourier coefficients and the sample variance $\hat\sigma^2$ can be estimated as

$$
\begin{aligned}
\mathrm{Cov}\big(\hat a_{\vec r},\hat\sigma^2\big)
&=E\Big[\frac{1}{N}\sum_{j=1}^{N}g(\vec\varphi^{(j)})\cos\big(\vec r\,(\vec\varphi^{(j)})^{T}\big)\cdot
\frac{1}{N-1}\sum_{j=1}^{N}\big(g(\vec\varphi^{(j)})-\bar g\big)^2\Big]-a_{\vec r}\sigma^2 \\
&=\frac{1}{N(N-1)}E\Big[\sum_{j=1}^{N}g(\vec\varphi^{(j)})\cos\big(\vec r\,(\vec\varphi^{(j)})^{T}\big)\big(g(\vec\varphi^{(j)})-\bar g\big)^2\Big] \\
&\quad+\frac{1}{N(N-1)}E\Big[\sum_{j\neq k}g(\vec\varphi^{(j)})\cos\big(\vec r\,(\vec\varphi^{(j)})^{T}\big)\big(g(\vec\varphi^{(k)})-\bar g\big)^2\Big]-a_{\vec r}\sigma^2,
\end{aligned}
\tag{C1}
$$

where $\bar g=\frac{1}{N}\sum_{j=1}^{N}g(\vec\varphi^{(j)})$ is the sample mean of the model output. The first term of the above equation can be simplified as

$$
\frac{1}{N(N-1)}E\Big[\sum_{j=1}^{N}g(\vec\varphi^{(j)})\cos\big(\vec r\,(\vec\varphi^{(j)})^{T}\big)\big(g(\vec\varphi^{(j)})-\bar g\big)^2\Big]
=\frac{1}{N-1}E\big[g(\vec\varphi)\cos(\vec r\,\vec\varphi^{T})\big(g(\vec\varphi)-\bar g\big)^2\big].
$$

Let $u=E[g(\vec\varphi)]$. Based on the Central Limit Theorem, for a large sample size $N$ we have

$$
\bar g\sim N\Big(u,\frac{\sigma^2}{N}\Big)
\quad\text{and}\quad
E\big[(\bar g-u)^2\big]=\frac{\sigma^2}{N}.
$$

The second term of eq. (C1) can be simplified as

$$
\begin{aligned}
\frac{1}{N(N-1)}E\Big[\sum_{j\neq k}g(\vec\varphi^{(j)})\cos\big(\vec r\,(\vec\varphi^{(j)})^{T}\big)\big(g(\vec\varphi^{(k)})-\bar g\big)^2\Big]
&\approx\frac{1}{N(N-1)}\sum_{j\neq k}E\big[g(\vec\varphi^{(j)})\cos\big(\vec r\,(\vec\varphi^{(j)})^{T}\big)\big]\,E\big[\big(g(\vec\varphi^{(k)})-\bar g\big)^2\big] \\
&=a_{\vec r}\,E\big[\big(g(\vec\varphi)-u\big)-\big(\bar g-u\big)\big]^2 \\
&=a_{\vec r}\Big\{E\big[(g(\vec\varphi)-u)^2\big]+E\big[(\bar g-u)^2\big]-2E\big[(g(\vec\varphi)-u)(\bar g-u)\big]\Big\} \\
&=a_{\vec r}\sigma^2-\frac{a_{\vec r}\sigma^2}{N}.
\end{aligned}
$$

Finally, we have

$$
\mathrm{Cov}\big(\hat a_{\vec r},\hat\sigma^2\big)
=\frac{1}{N-1}E\big[g(\vec\varphi)\cos(\vec r\,\vec\varphi^{T})\big(g(\vec\varphi)-\bar g\big)^2\big]-\frac{a_{\vec r}\sigma^2}{N}
\approx\frac{1}{N}E\big[g(\vec\varphi)\cos(\vec r\,\vec\varphi^{T})\big(g(\vec\varphi)-\bar g\big)^2\big]-\frac{a_{\vec r}\sigma^2}{N}.
$$

Similarly, we can show that

$$
\mathrm{Cov}\big(\hat b_{\vec r},\hat\sigma^2\big)
\approx\frac{1}{N}E\big[g(\vec\varphi)\sin(\vec r\,\vec\varphi^{T})\big(g(\vec\varphi)-\bar g\big)^2\big]-\frac{b_{\vec r}\sigma^2}{N}.
$$

Appendix D: Covariance between estimated Fourier coefficients for random balance design sampling

The covariance between cosine Fourier coefficients $\hat a_{00\ldots r_i\ldots 00}$ and sine Fourier coefficients $\hat b_{00\ldots r_i'\ldots 00}$ (where the harmonic $r_i$ can be the same as or different from $r_i'$, and $|r_i|,|r_i'|=1,\ldots,M$) can be calculated as

$$
\begin{aligned}
\mathrm{Cov}\big(\hat a_{00\ldots r_i\ldots 00},\hat b_{00\ldots r_i'\ldots 00}\big)
&=\mathrm{Cov}\Big[\frac{1}{N}\sum_{j=1}^{N}g\big(x_1^{(j)},\ldots,x_n^{(j)}\big)\cos\big(r_i s_i^{(j)}\big),\;
\frac{1}{N}\sum_{j=1}^{N}g\big(x_1^{(j)},\ldots,x_n^{(j)}\big)\sin\big(r_i' s_i^{(j)}\big)\Big] \\
&=\mathrm{Cov}\Big[\frac{1}{N}\sum_{j=1}^{N}\big(u_i(s_i^{(j)})+\varepsilon_i^{(j)}\big)\cos\big(r_i s_i^{(j)}\big),\;
\frac{1}{N}\sum_{j=1}^{N}\big(u_i(s_i^{(j)})+\varepsilon_i^{(j)}\big)\sin\big(r_i' s_i^{(j)}\big)\Big] \\
&=\mathrm{Cov}\Big[\frac{1}{N}\sum_{j=1}^{N}u_i(s_i^{(j)})\cos\big(r_i s_i^{(j)}\big)+\frac{1}{N}\sum_{j=1}^{N}\varepsilon_i^{(j)}\cos\big(r_i s_i^{(j)}\big),\\
&\qquad\quad\frac{1}{N}\sum_{j=1}^{N}u_i(s_i^{(j)})\sin\big(r_i' s_i^{(j)}\big)+\frac{1}{N}\sum_{j=1}^{N}\varepsilon_i^{(j)}\sin\big(r_i' s_i^{(j)}\big)\Big].
\end{aligned}
$$

By using a grid sampling for $s_i$, the estimation error of $\frac{1}{N}\sum_{j}u_i(s_i^{(j)})\cos(r_i s_i^{(j)})$ is substantially smaller than that of $\frac{1}{N}\sum_{j}\varepsilon_i^{(j)}\cos(r_i s_i^{(j)})$ for a relatively large sample size (see Section 2.2 for details). Similarly, the estimation error of $\frac{1}{N}\sum_{j}u_i(s_i^{(j)})\sin(r_i' s_i^{(j)})$ is substantially smaller than that of $\frac{1}{N}\sum_{j}\varepsilon_i^{(j)}\sin(r_i' s_i^{(j)})$. Thus, we treat $\frac{1}{N}\sum_{j}u_i(s_i^{(j)})\cos(r_i s_i^{(j)})$ and $\frac{1}{N}\sum_{j}u_i(s_i^{(j)})\sin(r_i' s_i^{(j)})$ as constants. In addition, based on eq. (44), we have

$$
E\big[\varepsilon_i\cos(r_i s_i)\big]=E\big[\hat a^{(\varepsilon,s_i)}_{00\ldots r_i\ldots 00}\big]=0,
\qquad
E\big[\varepsilon_i\sin(r_i' s_i)\big]=E\big[\hat b^{(\varepsilon,s_i)}_{00\ldots r_i'\ldots 00}\big]=0.
$$

Thus, $\mathrm{Cov}(\hat a_{00\ldots r_i\ldots 00},\hat b_{00\ldots r_i'\ldots 00})$ is simplified as

$$
\begin{aligned}
\mathrm{Cov}\big(\hat a_{00\ldots r_i\ldots 00},\hat b_{00\ldots r_i'\ldots 00}\big)
&=E\Big[\frac{1}{N}\sum_{j=1}^{N}\varepsilon_i^{(j)}\cos\big(r_i s_i^{(j)}\big)\cdot
\frac{1}{N}\sum_{j=1}^{N}\varepsilon_i^{(j)}\sin\big(r_i' s_i^{(j)}\big)\Big] \\
&=\frac{1}{N^2}E\Big[\sum_{j=1}^{N}\big(\varepsilon_i^{(j)}\big)^2\cos\big(r_i s_i^{(j)}\big)\sin\big(r_i' s_i^{(j)}\big)\Big]
+\frac{1}{N^2}E\Big[\sum_{j\neq k}\varepsilon_i^{(j)}\cos\big(r_i s_i^{(j)}\big)\,\varepsilon_i^{(k)}\sin\big(r_i' s_i^{(k)}\big)\Big] \\
&=\frac{1}{N}E\big[\varepsilon_i^2\cos(r_i s_i)\sin(r_i' s_i)\big]
+\frac{N-1}{N}E\big[\varepsilon_i\cos(r_i s_i)\big]\,E\big[\varepsilon_i\sin(r_i' s_i)\big].
\end{aligned}
$$

Finally, in view that $\varepsilon_i$ is independent of $s_i$, we have

$$
\mathrm{Cov}\big(\hat a_{00\ldots r_i\ldots 00},\hat b_{00\ldots r_i'\ldots 00}\big)
=\frac{1}{N}E\big[\varepsilon_i^2\big]\,E\big[\cos(r_i s_i)\sin(r_i' s_i)\big]
+\frac{N-1}{N}E\big[\varepsilon_i\big]^2\,E\big[\cos(r_i s_i)\big]\,E\big[\sin(r_i' s_i)\big].
$$

Using the fact that

$$
E\big[\cos(r_i s_i)\sin(r_i' s_i)\big]=\frac{1}{2\pi}\int_0^{2\pi}\cos(r_i s_i)\sin(r_i' s_i)\,ds_i=0
$$

and $E[\varepsilon_i]=0$, we have

$$
\mathrm{Cov}\big(\hat a_{00\ldots r_i\ldots 00},\hat b_{00\ldots r_i'\ldots 00}\big)=0.
$$

Similarly, we can show that the covariance between $\hat a_{00\ldots r_i\ldots 00}$ and $\hat a_{00\ldots r_i'\ldots 00}$ ($r_i\neq r_i'$) is

$$
\mathrm{Cov}\big(\hat a_{00\ldots r_i\ldots 00},\hat a_{00\ldots r_i'\ldots 00}\big)
=\frac{1}{N}E\big[\varepsilon_i^2\big]\,E\big[\cos(r_i s_i)\cos(r_i' s_i)\big]=0,
$$

in view that

$$
E\big[\cos(r_i s_i)\cos(r_i' s_i)\big]=\frac{1}{2\pi}\int_0^{2\pi}\cos(r_i s_i)\cos(r_i' s_i)\,ds_i=0\quad\text{for }r_i\neq r_i'.
$$

The covariance between $\hat b_{00\ldots r_i\ldots 00}$ and $\hat b_{00\ldots r_i'\ldots 00}$ can be calculated as

$$
\mathrm{Cov}\big(\hat b_{00\ldots r_i\ldots 00},\hat b_{00\ldots r_i'\ldots 00}\big)
=\frac{1}{N}E\big[\varepsilon_i^2\big]\,E\big[\sin(r_i s_i)\sin(r_i' s_i)\big]=0,
$$

in view that

$$
E\big[\sin(r_i s_i)\sin(r_i' s_i)\big]=\frac{1}{2\pi}\int_0^{2\pi}\sin(r_i s_i)\sin(r_i' s_i)\,ds_i=0\quad\text{for }r_i\neq r_i'.
$$
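The zero covariances above rest on the orthogonality of the trigonometric functions over one period. A quick numerical check of those averages, using the trapezoidal rule on a fine grid and a few arbitrary harmonic pairs, is shown below.

    import numpy as np

    s = np.linspace(0.0, 2.0 * np.pi, 200001)   # fine grid over one period

    def mean_over_period(values):
        # (1/2pi) * integral over [0, 2pi], via the trapezoidal rule
        return np.trapz(values, s) / (2.0 * np.pi)

    for r, rp in [(1, 1), (1, 2), (2, 5)]:
        print(r, rp,
              round(mean_over_period(np.cos(r * s) * np.sin(rp * s)), 10),
              round(mean_over_period(np.cos(r * s) * np.cos(rp * s)), 10),
              round(mean_over_period(np.sin(r * s) * np.sin(rp * s)), 10))
    # cos*sin averages to 0 for every pair; cos*cos and sin*sin average to 0 only when r != r'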

Appendix E: Covariance between estimated Fourier coefficients and sample variance of model output for random balance design sampling

Using eq. (42), the covariance between the cosine Fourier coefficient $\hat a_{00\ldots r_i\ldots 00}$ and the sample variance $\hat\sigma^2$ can be calculated as

$$
\mathrm{Cov}\big(\hat a_{00\ldots r_i\ldots 00},\hat\sigma^2\big)
=\mathrm{Cov}\big(\hat a^{(u,s_i)}_{00\ldots r_i\ldots 00}+\hat a^{(\varepsilon,s_i)}_{00\ldots r_i\ldots 00},\hat\sigma^2\big).
\tag{E1}
$$

For random balance design sampling, the cosine coefficient

$$
\hat a^{(u,s_i)}_{00\ldots r_i\ldots 00}=\frac{1}{N}\sum_{j=1}^{N}\cos\big(r_i s_i^{(j)}\big)\,u_i\big(s_i^{(j)}\big)
$$

is estimated based on a grid sample of $s_i$ (see eq. (38) for details), which has a much lower estimation error (the estimation error decreases at an order of $1/N^2$ with an increasing sample size $N$) compared to that of $\hat a^{(u,s_i)}_{00\ldots r_i\ldots 00}$ based on simple random sampling (the estimation error decreases at an order of $1/N$ with an increasing sample size $N$). Therefore, we treat $\hat a^{(u,s_i)}_{00\ldots r_i\ldots 00}$ as a constant, and we have

$$
\mathrm{Cov}\big(\hat a_{00\ldots r_i\ldots 00},\hat\sigma^2\big)
=\mathrm{Cov}\big(\hat a^{(\varepsilon,s_i)}_{00\ldots r_i\ldots 00},\hat\sigma^2\big).
$$

Using eq. (21) and eq. (42), we have

$$
\begin{aligned}
\mathrm{Cov}\big(\hat a^{(\varepsilon,s_i)}_{00\ldots r_i\ldots 00},\hat\sigma^2\big)
&=E\Big\{\frac{1}{N}\sum_{j=1}^{N}\varepsilon_i^{(j)}\cos\big(r_i s_i^{(j)}\big)\cdot
\frac{1}{N-1}\sum_{j=1}^{N}\big[g\big(G_1(s_1^{(j)}),\ldots,G_n(s_n^{(j)})\big)-\bar g\big]^2\Big\} \\
&=\frac{1}{N(N-1)}E\Big\{\sum_{j=1}^{N}\varepsilon_i^{(j)}\cos\big(r_i s_i^{(j)}\big)\big[g\big(G_1(s_1^{(j)}),\ldots,G_n(s_n^{(j)})\big)-\bar g\big]^2\Big\} \\
&\quad+E\big[\varepsilon_i\cos(r_i s_i)\big]\,E\big[\big(g\big(G_1(s_1^{(k)}),\ldots,G_n(s_n^{(k)})\big)-\bar g\big)^2\big].
\end{aligned}
$$

Since $E\big(\varepsilon_i^{(j)}\cos(r_i s_i^{(j)})\big)=E\big(\hat a^{(\varepsilon,s_i)}_{00\ldots r_i\ldots 00}\big)=0$ (see eq. (43) for details), we have

$$
\mathrm{Cov}\big(\hat a^{(\varepsilon,s_i)}_{00\ldots r_i\ldots 00},\hat\sigma^2\big)
=\frac{1}{N}E\Big\{\varepsilon_i\cos(r_i s_i)\big[g\big(G_1(s_1),\ldots,G_n(s_n)\big)-\bar g\big]^2\Big\}.
\tag{E2}
$$

Let

$$
\Delta(s_1,\ldots,s_n)=\big[g\big(G_1(s_1),\ldots,G_n(s_n)\big)-\bar g\big]^2
\quad\text{and}\quad
u_\Delta(s_i)=E\big[\Delta(s_1,\ldots,s_n)\,\big|\,s_i\big];
$$

then we have

$$
\Delta\big(s_1^{(j)},\ldots,s_n^{(j)}\big)=u_\Delta\big(s_i^{(j)}\big)+\delta^{(j)},
\tag{E3}
$$

where $\delta^{(j)}$ is a random error independent of $s_i$ with $E(\delta^{(j)})=0$. Using eq. (E3), the covariance in eq. (E2) can be simplified as follows,

$$
\begin{aligned}
\mathrm{Cov}\big(\hat a^{(\varepsilon,s_i)}_{00\ldots r_i\ldots 00},\hat\sigma^2\big)
&=\frac{1}{N}E\big[\varepsilon_i\cos(r_i s_i)\,u_\Delta(s_i)\big]+\frac{1}{N}E\big[\varepsilon_i\cos(r_i s_i)\,\delta\big] \\
&=\frac{1}{N}E\big[\varepsilon_i\big]\,E\big[\cos(r_i s_i)\,u_\Delta(s_i)\big]+\frac{1}{N}E\big[\varepsilon_i\,\delta\big]\,E\big[\cos(r_i s_i)\big].
\end{aligned}
$$

In view that $E[\varepsilon_i]=0$ and $E[\cos(r_i s_i)]=\frac{1}{2\pi}\int_0^{2\pi}\cos(r_i s_i)\,ds_i=0$, we have

$$
\mathrm{Cov}\big(\hat a_{00\ldots r_i\ldots 00},\hat\sigma^2\big)=\mathrm{Cov}\big(\hat a^{(\varepsilon,s_i)}_{00\ldots r_i\ldots 00},\hat\sigma^2\big)=0.
$$

Similarly, we can show that

$$
\mathrm{Cov}\big(\hat b_{00\ldots r_i\ldots 00},\hat\sigma^2\big)=\mathrm{Cov}\big(\hat b^{(\varepsilon,s_i)}_{00\ldots r_i\ldots 00},\hat\sigma^2\big)=0.
$$

Appendix F: Standard errors of first-order sensitivity indices for random balance design sampling

Standard errors of first-order sensitivity indices can be estimated based on the delta method and eq. (44) as follows,

$$
\begin{aligned}
\hat\sigma^{2}\big(\hat S_{x_i}\big)
&=\sum_{\hat a_{00\ldots r_i\ldots 00}\neq 0}\Big(\frac{\partial\hat S_{x_i}}{\partial\hat a_{00\ldots r_i\ldots 00}}\Big)^{2}\hat V\big(\hat a_{00\ldots r_i\ldots 00}\big)
+\sum_{\hat b_{00\ldots r_i\ldots 00}\neq 0}\Big(\frac{\partial\hat S_{x_i}}{\partial\hat b_{00\ldots r_i\ldots 00}}\Big)^{2}\hat V\big(\hat b_{00\ldots r_i\ldots 00}\big)
+\Big(\frac{\partial\hat S_{x_i}}{\partial\hat\sigma^{2}}\Big)^{2}\hat V\big(\hat\sigma^{2}\big) \\
&=\sum_{\hat a_{00\ldots r_i\ldots 00}\neq 0}\frac{4\,\hat a_{00\ldots r_i\ldots 00}^{2}}{\hat\sigma^{4}}\hat V\big(\hat a_{00\ldots r_i\ldots 00}\big)
+\sum_{\hat b_{00\ldots r_i\ldots 00}\neq 0}\frac{4\,\hat b_{00\ldots r_i\ldots 00}^{2}}{\hat\sigma^{4}}\hat V\big(\hat b_{00\ldots r_i\ldots 00}\big)
+\frac{\hat V_{x_i}^{2}}{\hat\sigma^{8}}\hat V\big(\hat\sigma^{2}\big),
\end{aligned}
$$

where $\hat a_{00\ldots r_i\ldots 00}\neq 0$ and $\hat b_{00\ldots r_i\ldots 00}\neq 0$ indicate that the cosine Fourier coefficient $\hat a_{00\ldots r_i\ldots 00}$ and the sine Fourier coefficient $\hat b_{00\ldots r_i\ldots 00}$ are significantly larger or smaller than zero based on the hypothesis test in eq. (37), respectively. Since we have

$$
\hat V\big(\hat a_{00\ldots r_i\ldots 00}\big)=\hat V\big(\hat b_{00\ldots r_i\ldots 00}\big)
=\frac{\hat V(y)-\hat V_{x_i}}{2N}
=\frac{\hat\sigma^{2}}{2N}\big(1-\hat S_{x_i}\big),
$$

the estimated standard error $\hat\sigma(\hat S_{x_i})$ can be simplified as

$$
\begin{aligned}
\hat\sigma^{2}\big(\hat S_{x_i}\big)
&=\frac{2}{N\hat\sigma^{2}}\big(1-\hat S_{x_i}\big)\Big[\sum_{\hat a_{00\ldots r_i\ldots 00}\neq 0}\hat a_{00\ldots r_i\ldots 00}^{2}
+\sum_{\hat b_{00\ldots r_i\ldots 00}\neq 0}\hat b_{00\ldots r_i\ldots 00}^{2}\Big]
+\frac{\hat S_{x_i}^{2}}{\hat\sigma^{4}}\hat V\big(\hat\sigma^{2}\big) \\
&=\frac{2}{N}\hat S_{x_i}\big(1-\hat S_{x_i}\big)+\frac{\hat S_{x_i}^{2}}{\hat\sigma^{4}}\hat V\big(\hat\sigma^{2}\big).
\end{aligned}
$$

Based on the estimation of $\hat V[\hat\sigma^{2}]$ by eq. (32), which involves the sample fourth central moment $\hat\mu^{(4)}$ and $\hat\sigma^{2}$, the standard error is finally obtained as

$$
\hat\sigma\big(\hat S_{x_i}\big)
=\sqrt{\frac{2}{N}\hat S_{x_i}\big(1-\hat S_{x_i}\big)+\frac{\hat S_{x_i}^{2}}{\hat\sigma^{4}}\,\hat V\big[\hat\sigma^{2}\big]}\,.
$$