filling gaps evapotranspiration

Upload: cristinapopa2005

Post on 05-Apr-2018

218 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/2/2019 Filling Gaps Evapotranspiration

    1/10

    Filling gaps in evapotranspiration measurements for water budgetstudies: Evaluation of a Kalman filtering approach

    Nasim Alavi a, Jon S. Warland a,*, Aaron A. Berg b

    aDepartment of Land Resource Science, University of Guelph, Guelph, Canada N1G2W1bDepartment of Geography, University of Guelph, Guelph, Canada N1G2W1

    Received 5 May 2006; accepted 29 September 2006

    Abstract

    Missing data in long-term eddy covariance measurements of latent heat flux produce errors in the estimation of evapotranspira-

    tion and the water budget. Because no standard method of gap filling has been widely accepted, identification of optimal filling

    methods for gaps is crucial for determining total evapotranspiration. In this study we evaluate the application of a Kalman filter for

    filling gaps in latent heat flux data collected from an agricultural research station. The filtering approach was compared with several

    gap-filling methods including mean diurnal variation, multiple regressions, 2-week average PriestleyTaylor coefficient, and

    multiple imputation. The results demonstrated that a Kalman filtering approach developed using the relationship between latent heat

    flux, available energy, and vapour pressure deficit provides a closer approximation of the original data and introduces smaller errors

    than the other methods evaluated. Evaluation of the Kalman filter approach demonstrates the efficiency of this technique in

    replacing data in both small and large gaps of up to several days.

    # 2006 Elsevier B.V. All rights reserved.

    Keywords: Eddy covariance; Missing data; Multiple imputation; Recursive estimation; Data filling

    1. Introduction

    With the proliferation of the eddy covariance system

    (EC) in the 1980s, direct measurement of evapotran-

    spiration (ET) has become very common (e.g. Aubinet

    et al., 2000; Baldocchi and Meyers, 1998). Advantages of

    this technique are high temporal resolution (e.g. half

    hourly) and integration of the flux over a relatively large

    area (footprint length of 1002000 m). The technique is

    limited by missing or rejected measurements due to

    system failures (system/sensor breakdown),maintenance

    and calibration, and improper weather conditions. Today,

    FluxNet, a world wide network equipped with eddy

    covariance flux towers, is operating continuously to

    collect water vapour and CO2 fluxes from more than 140

    sites around the world (Baldocchi et al., 2001). However,

    approximately 1750% of the observations in water

    vapour fluxes are reported as missing or rejected at

    FluxNet sites (Falge et al., 2001a).

    ET measurements are used in estimation of the water

    budget at daily, monthly and annual time scales.

    Consequently, data gaps in observed ET must be

    accurately filled. Accurate assessment of the water

    budget is required in many studies including con-

    taminant leaching, water available for plant growth and

    irrigation scheduling. For example, in studies of nitrate

    leaching from fertilized agricultural fields to ground

    water, estimation of drainage often depends on knowl-

    edge of ET for water budget estimation (e.g. McCoy

    www.elsevier.com/locate/agrformetAgricultural and Forest Meteorology 141 (2006) 5766

    * Corresponding author. Tel.: +1 519 824 4120;

    fax: +1 519 824 5730.

    E-mail address: [email protected] (J.S. Warland).

    0168-1923/$ see front matter # 2006 Elsevier B.V. All rights reserved.

    doi:10.1016/j.agrformet.2006.09.011

    mailto:[email protected]://dx.doi.org/10.1016/j.agrformet.2006.09.011http://dx.doi.org/10.1016/j.agrformet.2006.09.011mailto:[email protected]
  • 8/2/2019 Filling Gaps Evapotranspiration

    2/10

    et al., 2006). In many studies, groundwater or soil

    moisture contributions are estimated as a residual of the

    water budget equation. Inaccurate estimation of ET

    perpetuates errors into other aspects of the water budget

    and its applications. Thus, accurate gap-filling proce-

    dures need to be established to provide complete data

    sets for ET with a minimization of errors.Numerous methods have been proposed to fill gaps

    in eddy covariance data but no standard method has

    been widely accepted. The commonly used methods for

    filling gaps in water vapour (or latent heat flux)

    measurements include: the mean diurnal variation

    method (Falge et al., 2001a,b), regression methods

    (Greco and Baldocchi, 1996; Berbigier et al., 2001),

    evaluation of a 2-week average PriestleyTaylor

    coefficient (Wilson and Baldocchi, 2000; Wilson

    et al., 2001), look-up tables (Falge et al., 2001b) and

    multiple imputation (Hui et al., 2004).Results from different gap-filling methods vary

    widely. Falge et al. (2001b) filled gaps in annual ET

    using mean diurnal variation and look-up tables for 28

    data sets from 18 FluxNet sites and found that the

    differences in annual ET among methods ranged from

    48 to +86 mm per year. Although several gap-fillingtechniques have been evaluated in replacing missing

    half hour data (Falge et al., 2001a, b), they have not been

    evaluated for longer gaps of order of days.

    Jarvis et al. (2004) have recently demonstrated the

    application of a recursive parameter estimation algo-rithm based on a Kalman filter to fill gaps in CO2 flux

    data. An approach similar to this technique may be

    suitable for water vapour flux. However, the efficiency

    of this technique has not been examined. The Kalman

    filter is a data assimilation technique that has been

    applied in various studies. Williams et al. (2005) used an

    ensemble Kalman filter to evaluate the carbon budget

    over a ponderosa pine forest. Other studies have

    examined the ability of Kalman filters to retrieve the soil

    moisture profile by assimilation of near-surface soil

    moisture measurements (e.g. Entekhabi et al., 1994;

    Reichle et al., 2002). In a study of the surface energybalance, Boni et al. (2001) applied a Kalman filtering

    approach to satellite measurements of surface tempera-

    ture to estimate the components of the energy balance.

    In this study we compared several gap-filling

    techniques including mean diurnal variation, 2-week

    average PriestleyTaylor coefficient, regression

    method, multiple imputation, and recursive parameter

    estimation algorithms based on the Kalman filter for

    filling gaps in latent heat flux data. Second, we

    examined the application of recursive parameter

    estimation algorithms for filling gaps in latent heat

    flux data in replacing missing values over short and

    large gaps in data. We also evaluated the efficiency of

    this technique in improving the annual and monthly

    estimation of ET.

    2. Materials and methods

    2.1. Site description and data collection

    The experimental site was located at the University

    of Guelphs Elora Research Station (438390, 808250),

    25 km northwest of Guelph, Ontario, Canada. Data used

    in this study were collected in 2002 over a 18 ha winter

    wheat field. Measurement of water vapour flux was

    conducted with two eddy covariance systems. Each

    system consisted of a Campbell Scientific CSAT3 three-

    dimensional sonic anemometer and a LICOR-7500

    open path CO2/H2O analyzer. Net radiation (Rn) andsoil heat flux (G) were measured using net radiometers

    (CNR1 Kipp and Zonen, Delft, the Netherlands) and

    nine soil heat flux plates (five Model HFT-1, Radiation

    and Energy Balance Systems Inc., Seattle WA. and four

    Model 610, Thornthwaite Associates, Elmer, NJ). Air

    temperature (Ta) was estimated from measurements by

    the sonic anemometer and corrected for vapour density.

    2.2. Preparation of data

    2.2.1. Eddy covariance dataQuality control was applied to the EC data to exclude

    potentially non-representative flux measurements.

    Screening criteria for mean velocityu, friction velocity(u*), and wind direction were applied. During times of

    weak turbulence or low wind speeds, often experienced at

    night, mixing is weak and the flux is underestimated.

    Therefore, we rejected measurements with mean wind

    u< 1 m=s andwhen thefrictionvelocityu* < 0.1 m/s. Toavoid tower shadowing (Lee and Hu, 2002), periods

    where the angle between the mean horizontal wind and

    sonic axis was more than 1508 were rejected.

    The two EC systems were mounted on the sametower during the study period, therefore, the data series

    of two systems were merged to one time series by

    averaging two data series when they both had data or by

    using whichever data was available in the event that one

    system was not operating.

    In 2002 the amount of missing data in latent heat flux

    (lE) was 17% (due to maintenance and system failures).

    The quality tests discarded 27% of the measured lE

    data, leading to total gap percentage of 44% in 2002

    with 36% for day and 54% for night observations.

    Fifty-five percent of the total gaps were less than 3 h.

    N. Alavi et al. / Agricultural and Forest Meteorology 141 (2006) 576658

  • 8/2/2019 Filling Gaps Evapotranspiration

    3/10

    There were two long gaps of 7 and 14 days in summer

    2002 due to field operations.

    2.2.2. Meteorological variables

    The climatic variables, including Rn, G, D, Ta, were

    missing 4, 5.5, 17 and 17% of observations respectively,

    during 2002. Gaps in the Ta, Rn, and vapor pressure (e)time series were filled with the temperature, relative

    humidity and net radiation measurements also available

    at a nearby weather station (approximately 200 m

    away). Missing values in G were replaced by using a 20-

    day running linear regression between G and Rn.

    2.3. Gap-filling methods

    The gap-filling methods examined here include:

    mean diurnal variation (MDV), multiple regressions

    (Regr.), 2-week average Priestley and Taylor coefficient(PT), multiple imputation (MI), and two recursive

    parameter estimation algorithms using a Kalman filter.

    In this study we evaluated the recursive parameter

    estimation algorithms as new approaches to gap filling

    of ET measurements by comparing it with the other

    methods. The other filling methods where selected

    because they are commonly used and they each have

    some advantage. MDV does not require climatic data,

    therefore it can be used if meteorological variables are

    not available. Regr. and PT methods consider the impact

    of climatic variables on ET and capture the response ofET to climatic forcing. MI also preserves the relation-

    ships between lE and climatic variables. Furthermore,

    this method can impute missing climatic variables and

    missing lEat the same time and provides a confidence

    level for each imputed value.

    2.3.1. Mean diurnal variation

    In this method, a missing observation is replaced by

    the mean for that time of day based on observations

    from the previous and subsequent days (Falge et al.,

    2001a,b). Derivation of the mean for the missing data

    depends on the length of the time interval of averaging(window size). A window size of 415 days is usually

    used as the averaging interval. Small window sizes (less

    than 4 days) were not sufficient to determine a mean

    from the measurement and not recommended by

    previous studies (Falge et al., 2001a). Larger window

    sizes were not considered, because nonlinear depen-

    dence of lE on environmental variables introduces

    errors through averaging. In this study, an averaging

    interval of7 days for the missing daytime data day,and 14 days for nighttime data, was found to create

    the lowest errors, and was therefore used in this study.

    2.3.2. Multiple regression

    In this method a multiple regression relationship was

    established between latent heat flux and its main

    controlling factors including available energy (Rn G)and vapour pressure deficit (D), for some period around

    the missing data. The resulting equations were then

    used to fill the missing latent heat flux values. In thisstudy, a regression relationship for each missing value

    was calculated using the data from the adjacent 10days (where 10 days resulted in the highestdetermination). If the coefficient of determination

    (R2) was greater than 0.5, the resulting equations were

    used to fill the gap, otherwise the gap remained unfilled.

    For long gaps (gaps larger than 5 days), data from a

    window of20 days were used.

    2.3.3. Two-week average Priestley and Taylor

    coefficientMissing measurements of half hourly latent heat flux

    were estimated using the product of equilibrium

    evaporation for the half hour and the 2-week average

    Priestley and Taylor coefficient (Wilson and Baldocchi,

    2000; Wilson et al., 2001). Equilibrium evaporation was

    obtained using lEeq D=D gRn G, whereD is the slope of saturated vapour pressure with

    temperature (K), g the psychrometric constant and l is

    the latent heat of vapourization. For each missing value

    oflE, the total measured lE and estimated lEeq were

    obtained during a 2-week period before and after themissing value of lE and the Priestley and Taylor

    coefficient (a lE=lEeq, Priestley and Taylor, 1972)was calculated. At the time of the missing lE

    measurement the product of a and lEeq was used to

    estimate missing lE following Wilson and Baldocchi

    (2000).

    2.3.4. Multiple imputation

    Multiple imputation (MI) is a statistical technique

    for analyzing incomplete data sets (Hui et al., 2004). In

    this technique, the missing entries of the incomplete

    data sets are imputed m times. The variation among them imputations reflects the uncertainty with which the

    missing values can be predicted from the observed ones.

    In order to generate imputations for the missing values,

    a probability model must be imposed on the complete

    data set (observed and missing values). Once the model

    is selected, mean vector and covariance matrices of the

    incomplete data set can be generated using the

    expectation and maximization algorithm (EM). The

    EM algorithm is a common technique for fitting models

    to incomplete data and estimating mean and covariance

    matrices (Rubin, 1987; Schafer and Olsen, 1998). Then

    N. Alavi et al. / Agricultural and Forest Meteorology 141 (2006) 5766 59

  • 8/2/2019 Filling Gaps Evapotranspiration

    4/10

    a data augmentation algorithm uses the initial values

    obtained from the EM algorithm and generates the

    imputed values. Further details of this method are given

    in Hui et al. (2004) and Schafer (1997).

    For this study, we assumed multivariate normal

    distribution for observed lEand the climatic variables.

    A number of previous studies (e.g. Schafer, 1997) haveshown that MI is robust to violations from the normality

    of values used in the analysis if the amount of missing

    values is not large. We examined the assumption of

    multivariate normal distribution by transforming data

    using the BoxCox transformation method provided in

    SAS software version 9.1 (SAS Institute Inc., 2004)

    following Hui et al. (2004). Comparison of the results of

    transformed data with non-transformed data indicated

    that the annual estimations changed only slightly (about

    0.03%). We also assumed that data were missing at

    random (MAR). Under this assumption, the probabilitythat data are missing may depend on observed data but

    not on ones that are missing. This assumption allows us

    to first estimate the relationships among variables from

    the observed data, and then use these relationships to

    obtain unbiased predictions of the missing values from

    the observed data. This assumption is plausible for the

    lE data since most of the missing data were due to

    improper wind conditions (e.g. low u*) or malfunction

    of the instruments and only weakly related to lE itself

    (e.g. when condensation occurs on the gas analyzer). In

    our evaluation procedure (Section 2.4) this assumptionis strictly valid because data are removed randomly.

    A matrix of observations oflEand available energy

    (Rn G), Ta, and D was used to conduct MI analysis.MI analysis was conducted using SAS software version

    9.1 (SAS Institute Inc., 2004). The analysis used an EM

    algorithm (maximum number of iteration used in EM

    was set to 500) for initial value estimations and five

    imputations were created using a data augmentation

    algorithm, Markov Chain Monte Carlo (MCMC). Rubin

    (1987) and Schafer and Olsen (1998) showed that unless

    the rate of missing information is very high only a few

    imputations (35) are enough to estimate missing dataand there is little advantage to producing and analyzing

    more than a few imputed datasets. The average of the

    five imputed values was used to fill each missing

    observation.

    2.3.5. Recursive parameter estimation algorithms

    This method is based on the Kalman filter and

    smoothing techniques developed by Young and co-

    workers (Young, 1999; Young and Pedregal, 1999;

    Young et al., 2004). In this method, the system is

    described by a proper model with unknown parameters

    (observation model). Temporal variations of parameters

    in this model are assumed to be described by a

    generalized random-walk model (state model). Having

    introduced these two models, a forward pass filtering

    (Kalman filter) is applied to the data to predict time

    variable model parameters. The estimates obtained

    from the Kalman filter are updated using a backwardrecursive smoothing algorithm working through the

    data in reverse temporal order. Based on the type of

    observation model, different methodologies have been

    introduced for applying recursive parameter estimation.

    In this study dynamic linear regression (DLR) and state

    dependent parameters (SDP) were applied to the lE

    data using the Captain Toolbox for MATLAB (Young

    et al., 2004).

    In the DLR algorithm, parameters are considered to

    vary with time. In this study, the regression model was

    expressed as the relationship between lE, net radiation(Rn), and vapour pressure deficit (D):

    lEt atRnt Gt btDt zt (1)

    Here a(t) and b(t) are the model parameters and jis

    the regression model error series that is zero mean

    (white noise). The DLR parameters can be seen as

    coefficients in the PenmanMontheith equation:

    lED

    D gRn G

    raCp

    raD gD

    where g g1 rc=ra, ra is the air density, Cp thespecific heat at constant pressure, ra the aerodynamic

    resistance, rc the canopy resistance and g is the psy-

    chrometric constant.

    The stochastic evolution of each parameter in (1) is

    assumed to be described by the following random walk

    process (Young, 1999):

    at at 1 hat; bt bt 1 hbt

    (2)

    and ha, hb are zero mean, white noise error series.

    In the SDP algorithm, parameters are dependent on

    air temperature (Ta) as well as time:

    lEt aTa; tRnt Gt bTa; tDt zt

    (3)

    where

    aTa; t aTa; t 1 hat;

    bTa; t bTa; t 1 hbt(4)

    and ha, hb are zero mean, white noise error series. We

    assumed that the evolution of the model parameters is

    N. Alavi et al. / Agricultural and Forest Meteorology 141 (2006) 576660

  • 8/2/2019 Filling Gaps Evapotranspiration

    5/10

    nonstationary stochastic as described by a simple ran-

    dom walk process. This ensures that the recursive

    parameter estimates at any sample time t depends only

    on data in the vicinity of this sample. This weighting

    effect has a Gaussian shape which applies the maximum

    weight at sample time twith declining weight each side.

    The bandwidth of the Gaussian window is defined bythe noise variance ratio (NVR) (NVR = s2(z(t))/

    s2(h(t)). The required NVRs are optimized via max-

    imum likelihood error decomposition introduced by

    Young (1999).

    When using the SDP method the state dependent

    parameters may vary with Ta at a rate commensurate

    with the temporal variations in the input series, thus a

    simple random walk cannot be assumed to describe the

    parametric variation over time. Therefore, before

    estimation of parameter variation, the time series of

    lE, Rn, G, and D were sorted with respect to theascending order of Ta. With sorting, the rapid natural

    variations in the input series are effectively eliminated

    from the data and a simple random walk process utilized

    to describe their evolution (Jarvis et al., 2004).

    2.4. Evaluation of gap-filling methods

    We created random and systematic gaps in the data

    sets and applied each of the methods to fill these gaps.

    The analysis used four bimonthly data sets (February

    March, AprilMay, JuneJuly and SeptemberNovem-ber for the 2002 data) with low percentages (2535%)

    of missing and rejected data. Daytime and nighttime

    data were treated separately. Half hourly data were

    deleted randomly until 20% (about 100180 data

    points) of the original daytime data was removed from

    the bimonthly dataset. Then each of the gap-filling

    methods was applied to fill these artificial gaps. This

    procedure was repeated for the nighttime data. To assess

    the performance of each we examined the potential

    error associated with each method. Root mean square

    error (RMSE), coefficient of determination (R2),

    relative root mean square error (R_RMSE), and relativebias (R_bias) were calculated. The absolute error for

    each method was calculated as the measured minus the

    computed value for each artificially missed data and

    was used to calculate the total RMSE for the each

    bimonthly data set. It was divided by the mean oflEfor

    the data set to calculate R_RMSE. R2 was computed

    between the observed and filled data for each bimonthly

    data set and R_bias was obtained by dividing the bias by

    the mean oflE for each data set. The entire procedure

    was performed three times and the average values

    reported here.

    3. Results and discussion

    3.1. Results of filling artificial gaps in data

    Table 1 compares the performance of the gap-filling

    methods. For the daytime data, root mean square errors

    in each bi-monthly period range from 10.5 to 70.3 W/m2. The DLR technique resulted in the smallest error

    (average RMSE = 17.2 W/m2) and largest agreement

    coefficient (average R2 = 0.93), while the largest errors

    (average RMSE = 54.1 W/m2) were found with the PT

    method. The smallest agreement coefficients (average

    R2 = 0.5) were obtained with MDV. Unlike the other

    methods MDV, does not consider the impact of

    meteorological variables on lE so it can be used when

    meteorological data are not available.

    The root mean square errors for nighttime gaps

    ranged from 5.1 to 38.4 W/m2

    and the agreementcoefficients were all less than 0.5. Comparing the

    different methods, the errors were smallest (average

    RMSE = 11.3 W/m2) with DLR whereas PT resulted in

    the largest error (33.7 W/m2). RMSEs are larger during

    summer and daytime than winter and nighttime, but the

    fluxes are also larger. Therefore we considered the

    relative RMSE to make the comparison between

    different seasons and time. Normalizing the RMSE

    with the mean oflEfor each dataset (relative root mean

    square error) shows that all gap-filling methods resulted

    in smaller relative errors (0.10.5) during summer(JuneJuly) and spring (AprilMay) than fall (October

    November) and winter (FebruaryMarch) (0.31.5).

    Also better agreement between computed and original

    values was obtained during summer and spring daytime

    (greater than 70% except for MDV).

    R_bias was mostly less than 5% for daytime datasets

    except for the PT method. This shows that the error

    introduced by the different methods is essentially

    random.

    The gaps in daytime data during the summer can be

    filled with smaller relative errors than nighttime and

    wintertime gaps. The large relative errors associated withfilling missing nighttime data is primarily related to the

    large number of gaps in nighttime data and unreliable

    measurements during nighttime. However, due to the low

    nighttimevalue oflE, the error introducedby application

    of different filling method may not influence the total

    annual or monthly estimation oflE significantly.

    For the evaluation process, gaps in the data were

    created randomly. This may not be a true representation

    of reality because real gaps in data may not be randomly

    distributed. Most large gaps in the data were due to

    unsuitable weather conditions or field operations. These

    N. Alavi et al. / Agricultural and Forest Meteorology 141 (2006) 5766 61

  • 8/2/2019 Filling Gaps Evapotranspiration

    6/10

    N. Alavi et al. / Agricultural and Forest Meteorology 141 (2006) 576662

    Table 1

    Root mean square error (RMSE), agreement coefficient (R2), relative root mean square error (R_RMSE), and relative bias (R_bias) of gap filled lE

    (W/m2)

    Regr. MDV PT DLR SDP MI

    Daytime data

    RMSE

    AprilMay 2002 30.22 55.08 42.67 19.46 28.83 40.41JuneJuly 2002 33.68 70.26 53.59 22.80 44.67 52.40

    OctoberNovember 2002 20.20 29.52 61.15 16.19 22.98 29.14

    FebruaryMarch 2002 20.64 27.96 59.15 10.47 21.09 18.13

    Average 26.18 45.70 54.14 17.23 29.39 35.02

    R2

    AprilMay 2002 0.83 0.44 0.79 0.93 0.85 0.72

    JuneJuly 2002 0.93 0.67 0.86 0.97 0.87 0.81

    OctoberNovember 2002 0.80 0.58 0.55 0.88 0.76 0.62

    FebruaryMarch 2002 0.71 0.46 0.51 0.92 0.68 0.74

    Average 0.82 0.54 0.68 0.93 0.79 0.72

    R_RMSE

    AprilMay 2002 0.29 0.54 0.42 0.19 0.28 0.39

    JuneJuly 2002 0.19 0.39 0.29 0.13 0.25 0.29

    OctoberNovember 2002 0.40 0.58 1.21 0.32 0.45 0.55

    FebruaryMarch 2002 0.53 0.72 1.53 0.27 0.55 0.50

    Average 0.35 0.56 0.86 0.23 0.38 0.43

    R_bias

    AprilMay 2002 0.02 0.04 0.09 0.02 0.05 0.04JuneJuly 2002 0.03 0.00 0.05 0.00 0.00 0.02OctoberNovember 2002 0.02 0.05 0.54 0.04 0.06 0.02FebruaryMarch 2002 0.04 0.03 0.80 0.01 0.01 0.00Average 0.04 0.02 0.37 0.02 0.03 0.00

    Nighttime data

    RMSE

    AprilMay 2002 13.62 15.83 25.76 11.41 13.25 22.51

    JuneJuly 2002 17.88 17.83 38.43 15.61 20.35 29.29

    OctoberNovember 2002 16.06 19.28 37.49 13.11 16.60 19.23

    FebruaryMarch 2002 9.34 10.52 32.96 5.15 9.35 12.83

    Average 14.22 15.86 33.66 11.32 14.89 20.96

    R2

    AprilMay 2002 0.23 0.00 0.19 0.41 0.27 0.03

    JuneJuly 2002 0.20 0.00 0.02 0.35 0.18 0.00

    OctoberNovember 2002 0.17 0.00 0.03 0.46 0.19 0.07

    FebruaryMarch 2002 0.27 0.04 0.02 0.75 0.24 0.02

    Average 0.22 0.01 0.07 0.49 0.22 0.03

    R_RMSE

    AprilMay 2002 0.88 1.03 1.67 0.74 0.86 1.52JuneJuly 2002 1.07 1.06 2.29 0.93 1.21 1.76

    OctoberNovember 2002 1.15 1.38 2.69 0.93 1.18 1.43

    FebruaryMarch 2002 0.99 1.11 3.48 0.55 0.99 1.47

    Average 1.02 1.14 2.53 0.79 1.06 1.54

    R_bias

    AprilMay 2002 0.16 0.09 1.28 0.04 0.21 0.10JuneJuly 2002 0.53 0.14 1.84 0.40 0.74 0.53OctoberNovember 2002 0.09 0.05 1.94 0.16 0.36 0.05FebruaryMarch 2002 0.15 0.12 2.69 0.01 0.02 0.05Average 0.03 0.10 1.94 0.13 0.31 0.11

  • 8/2/2019 Filling Gaps Evapotranspiration

    7/10

    gaps may introduce biases by their non-random

    distribution. Therefore the resultant errors with artificial

    gaps may not be equivalent to the errors from

    application of filling methods to the real gaps. To

    evaluate the application of the gap-filling methods over

    longer gaps as would be expected from bad weather

    conditions or instrument failure, in the followingsection we assess the top three methods (Regr., DLR

    and SDP) for long gaps on order of few hours to days.

    Lack of energy balance closure is a common problem

    in measuring turbulent fluxes with EC technique. It has

    become generally accepted that the turbulent fluxes

    (sensible heat and latent heat fluxes) are underestimated

    by approximately 1030% relative to estimate of

    available energy (Wilson et al., 2002; Aubinet et al.,

    2000). Closure has not been considered in this analysis,

    however since the gap-filling methods are essentially

    interpolative, correcting existing data for closure wouldproduce an equal change in the filled data.

    3.2. The Kalman filter approach to filling long gaps

    This section examines the efficiency of SDP, DLR

    and Regr. methods in replacing data in large gaps by

    creating 3, 6, 12, 18 h, and 1, 3, 5 and 7 day gaps in the

    dataset. As an example, the results for a 3-day gap in

    May 2002 are shown in Fig. 1. For this period the RMSE

    for DLR was 8.2 W/m2 and for SDP 13.4 W/m2, while it

    was 20.6 W/m2 for Regr. Next, a 1-day moving gap was

    created throughout the growing season (MaySeptem-

    ber 2002). Fig. 2 shows the resulting RMSE for eachday of growing season 2002. In general, the values of

    RMSE are smaller for DLR than Regr. and SDP. The

    average daily RMSE was 22.4 W/m2 with DLR,

    33.2 W/m2 with SDP, and 30.2 W/m2 with Regr. This

    procedure was repeated using moving gap sizes of 3, 6,

    12, 18 h and 2, 3, 5, 7 days over MaySeptember 2002.

    Fig. 3 shows the average RMSE for different gap sizes.

    As the size of gap increases, mean RMSE of DLR, SDP

    and Regr. increases. DLR outperformed the other

    methods for all gap sizes and has the smallest mean

    RMSE while SDP has the greatest mean RMSE for allgap sizes. The difference in mean RMSE between Regr.

    and DLR is 6.8 W/m2 for 3 h gaps and decreases to

    3.4 W/m2 for a gap size of 7 days.

    In the SDP method, the lEtime series are sorted out

    of temporal order. Jarvis et al. (2004) claimed that in

    this case the systematic gaps in the time series become

    N. Alavi et al. / Agricultural and Forest Meteorology 141 (2006) 5766 63

    Fig. 1. Comparison of gap filled lE (*) by (a) DLR, (b) SDP, and (c) Regr. with measurements () for 1821 May 2002.

  • 8/2/2019 Filling Gaps Evapotranspiration

    8/10

    much smaller nonsystematic gaps in the sorted

    temperature space and these smaller gaps can be filled

    with smaller errors. Therefore, we should expect better

    estimation of data in longer gaps using SDP. However,

    in this study the SDP approach gave larger errors for all

    gap sizes compared to DLR. In the SDP algorithm the

    time series of Rn, G and D were also sorted with respect

    to the ascending order of Ta. Although there is some

    correlation of these variables with air temperature, otherfactors are also significant. The assumption that Ta is the

    dominant control may lead to greater errors in SDP

    analysis.

    Including more variables in the algorithm would

    increase the complexity of the model and result in large

    errors comparing to simpler DLR algorithm.

    Application of the DLR method provides a very good

    estimation of missing values in lE since it is able to

    reproduce most variation in lE due to different factors

    (e.g. net radiation, vapour pressure). It provides smallerRMSEs (3.46.8 W/m2, with the lower values for gap

    N. Alavi et al. / Agricultural and Forest Meteorology 141 (2006) 576664

    Fig. 2. RMSE of (a) Regr., (b) DLR, and (c) SDP for 1 day moving gap in growing season 2002.

    Fig. 3. Mean RMSE of Regr., SDP, and DLR for filling in 3, 6, 12, 18 h and 1, 2, 3, 5, 7 days moving gaps in growing season 2002.

  • 8/2/2019 Filling Gaps Evapotranspiration

    9/10

    sizes of 3 h to 7 days) than Regr. for estimating lE.

    Similar to the analysis over random gaps, the DLR

    method was the most accurate for filling the larger

    artificial gaps.

    3.3. Estimation of annual and monthly sum of ET

    Annual sums of ET were computed by filling the real

    gaps with the Regr. and DLR methods. Regr. method

    was applied to data with a window of10. Since somegaps were not filled because of low agreement

    coefficients (mostly in winter time) we repeated the

    application of Regr. using windows of 20 and 30days to fill data left unfilled by the 10-day window.

    The annual estimation of ET was 606 mm by Regr.,

    and it was 585 by DLR which differ by 21 mm (3%).

    Although the application of Regr. and DLR did not

    affect the annual sum very much, estimating ET duringa shorter period of time might be highly affected by the

    filling method.

    The results for monthly estimation of ET for year

    2002 showed that the difference in monthly ET filled by

    Regr. and by DLR ranged from 1 to 8 mm (0.215%) for

    different months. The absolute difference was small

    during winter months (0.21.5 mm) while it was larger

    in the growing season (48 mm). Therefore, the

    selection of proper gap-filling method is crucial in

    estimating monthly ET during growing season.

    4. Summary

    Several gap-filling methods were applied to eddy

    covariance data of latent heat flux from an agricultural

    site. The methods were evaluated based on their ability

    to fill artificial data gaps. The recursive parameter

    estimation approach based on the relation between

    latent heat flux and available energy and vapour

    pressure deficit resulted in better estimation of the

    original data and smaller errors than the MI, PT and

    MDV, and Regr. methods.

    The recursive parameter estimation approach withthe DLR algorithm produced the smallest errors for both

    short and long (hours to days) gaps. The relative errors

    were smaller for daytime data during the growing

    season, when ET is high, than for nighttime and winter

    data. The application of this method can improve the

    estimation of total ET over a period of time since it

    results in smaller errors than other methods, assuming

    no significant bias between missing and present data

    (MAR is valid).

    While the application of different gap-filling

    techniques does not greatly affect influence the total

    annual sum, the total ET for shorter periods of time

    (month or days) can be highly affected by choice of gap-

    filling technique. Total ET during the growing season,

    when ET is high, is more sensitive to choice of gap-

    filling method and so proper selection of a method is

    crucial at this time. Since the evaluation process based

    on filling artificial gaps showed that the resultant errorwith DLR was smaller than with other methods,

    application of this technique should provide better

    estimation of total ET, especially over a short period of

    time and during the growing season.

    Due to the efficiency of the Kalman filtering approach

    in replacing data in both small and large gaps, and

    providing the best approximation of the original data, the

    application of this technique is recommended as a gap-

    filling approach for estimation of total ET, as is further

    study of these techniques on the other contexts.

    Acknowledgements

    This research project was funded by the grants from

    the Ontario Ministry of Agriculture and Food and the

    Natural Sciences and Engineering Research Council of

    Canada. We are very grateful for the thoughtful and

    detailed comments of the two anonymous reviewers.

    References

    Aubinet, M., Grelle, A., Ibrom, A., Rannik, U., Moncrieff, J., Foken,T., Kowalski, A.S., Martin, P.H., Berbigier, P., Bernhofer, Ch.,

    Clement, R., Elbers, J., Granier, A., Grunwald, T., Morgenstern,

    K., Pilegaard, K., Rebmann, C., Snijders, W., Valentini, R., Vesala,

    T., 2000. Estimatesof the annual netcarbon and water exchange of

    European forests: the EUROFLUX methodology. Adv. Ecol. Res.

    30, 113175.

    Baldocchi, D., Meyers, T., 1998. On using eco-physiological, micro-

    meteorological and biogeochemical theory to evaluate carbon

    dioxide, water vapour and trace gas fluxes over vegetation: a

    perspective. Agric. For. Meteorol. 90, 125.

    Baldocchi, D., Falge, E., Gu, L., Olson, R., Hollinger, D., Running, S.,

    Anthoni, P., Bernhofer, C., Davis, K., Evans, R., Fuentes, J.,

    Goldstein, A., Katul, G., Law, B., Lee, X., Malhi, Y., Meyers,

    T., Munger, W., Oechel, W., Paw, U.K.T., Pilegaard, K., Schmid,

    H.P., Valentini, R., Verma, S., Vesala, T., Wilson, K., Wofsy, S.,

    2001. FLUXNET: a new tool to study the temporal and spatial

    variability of ecosystem-scale carbon dioxide, water vapour and

    energy flux densities. Bull. Am. Meteorol. Soc. 82, 24152434.

    Berbigier, P., Bonnefond, J.M., Mellmann, P., 2001. CO2 and water

    vapour fluxes for 2 years obove Euroflux forest site. Agric. For.

    Meteorol. 108, 183197.

    Boni, G.,Entekhabi, D.,Gastelli, F., 2001. Land data assimilationwith

    satellite measurements for the estimation of surface energy bal-

    ance components and surface control on evaporation. Water

    Resour. Res. 37 (6), 17131722.

    Entekhabi, D., Nakamura, H., Njoku, E.G., 1994. Solving the inverse

    problem for soil moisture and temperature profiles by sequential

    N. Alavi et al. / Agricultural and Forest Meteorology 141 (2006) 5766 65

  • 8/2/2019 Filling Gaps Evapotranspiration

    10/10

    assimilation of multifrequency remotely sensed observations.

    IEEE Trans. Geosci. Remote Sensing 32 (2), 438448.

    Falge, E., Baldocchi, D.D., Olson, R., Anthoni, P., Aubinet, M.,

    Bernhofer, C., Burba, G., Ceulemans, R., Clement, R., Dolman,

    H., Granier, A., Gross, P., Grunwald, T., Hollinger, S., Jensen, N.-

    O., Katul, G., Keronen, P., Kowalski, A., Lai, C.T., Law, B.E.,

    Meyers, T., Moncrieff, J., Moors, E., Munger, J.W., Pilegaard, K.,

    Rannik, U., Rebmann, C., Suyker, A.,Tenhunen,J., Tu, K.,Verma,S., Vesala, T., Wilson, K., Wofsy, S., 2001a. Gap filling strategies

    for defensible annual sums of net ecosystem exchange. Agric. For.

    Meteorol. 107, 4369.

    Falge, E., Baldocchi, D.D., Olson, R., Anthoni, P., Aubinet, M.,

    Bernhofer, C., Burba, G., Ceulemans, R., Clement, R., Dolman,

    H., Granier, A., Gross, P., Grunwald, T., Hollinger, S., Jensen, N.-

    O., Katul, G., Keronen, P., Kowalski, A., Lai, C.T., Law, B.E.,

    Meyers, T., Moncrieff, J., Moors, E., Munger, J.W., Pilegaard, K.,

    Rannik, U., Rebmann, C., Suyker, A.,Tenhunen,J., Tu, K.,Verma,

    S., Vesala, T., Wilson, K., Wofsy, S., 2001b. Gap filling strategies

    for long term energy flux data sets. Agric. For. Meteorol. 107,

    7177.

    Greco, S., Baldocchi, D.D., 1996. Seasonal variations of CO2 water

    vapour exchange rates over a temperate deciduous forest. Global

    Change Biol. 2, 183197.

    Hui, D., Wan, S., Su, B., Katul, G., Monson, R., Luo, Y., 2004. Gap-

    filling missing data in eddy covariance measurements using multi-

    ple imputation (MI) for annual estimations. Agric. For. Meteorol.

    121, 93111.

    Jarvis, A.J., Stauch, V.J., Schulz, K., Young, P.C., 2004. The seasonal

    temperature dependency of photosynthesis and respiration in two

    deciduous forests. Global Change Biol. 10, 939950.

    Lee, X.H., Hu, X.Z., 2002. Forest-air fluxes of carbon, water and

    energy over non-flat terrain. Boundary-Layer Meteorol. 103 (2),

    277301.

    McCoy, A.J., Parkin, G.W., Wagner-Riddle, C., Warland, J.S., Lauzon,

    J., von Bertoldi, P., Fallow, D., Jayasundara, S., 2006. Usingautomated soil water content and temperature measurement sys-

    tems to estimate soil water budgets. Can. J. Soil Sci. 86, 4756.

    Priestley, C.H.B., Taylor, R.J., 1972. On theassessment of surface heat

    flux and evaporation using large-scale parameters. Monthly

    Weather Rev. 100, 8192.

    Reichle, R., McLaughlin, D.B., Entekhabi, D., 2002. Hydrologic data

    assimilation with the ensemble Kalman filter. Monthly Weather

    Rev. 130 (1), 103114.

    Rubin, D.B., 1987. Multiple Imputation for Nonresponse Surveys.

    John Wiley & Sons, New York.Schafer, J.L., Olsen, M.K., 1998. Multiple imputation for multivariate

    missing-data problems: a data analysts perspective. Multivariate

    Behav. Res. 33, 545571.

    Schafer, J.L., 1997. Analysis of Incomplete Multivariate Data. Chap-

    man and Hall, London.

    Williams, M., Schwarz, P.A., Law, B.E., Irvine, J., Kurpius, M.R.,

    2005. An improved analysis of forest carbon dynamics using data

    assilmilation. Global Change Biol. 11, 89105.

    Wilson, K.B., Baldocchi, D.D., 2000. Seasonal and interannual varia-

    bility of energy fluxes over a broadleaved temperate deciduous

    forest in North America. Agric. For. Meteorol. 100, 118.

    Wilson, K.B., Hanson, P.J., Mulholland, P.J., Baldocchi, D.D., Wulls-

    chleger, S.D., 2001. A comparison of methods for determining

    forest evapotranspiration and its components: sap-flow, soil water

    budget, eddy covariance and catchment water balance. Agric. For.

    Meteorol. 106, 153168.

    Wilson, K.B., Goldstein, A.H., Falge, E., Aubinet, M., Baldocchi, D.,

    Berbigier, P., Bernhofer, Ch., Ceulemans, R., Dolman, H., Field, C.,

    Grelle, A., Law, B., Meyers, T., Moncrieff, J., Monson, R., Oechel,

    W., Tenhunen, J., Valentini, R., Verma, S., 2002. Energy balance

    closure at FLUXNET sites. Agric. For. Meteorol. 113, 223243.

    Young, P.C., 1999. Nonstationary time series analysis and forecasting.

    Progr. Environ. Sci. 1, 348.

    Young, P.C., Pedregal, D.J., 1999. Recursive and en-bloc approaches

    to signal extraction. J. Appl. Stat. 26, 103128.

    Young, P.C., Taylor, C.J., Tych, W., Pedregal, D.J., McKenna, P.G.,

    2004. The Captain Toolbox. Centre for Research on Environmen-tal Systems and Statistics, Lancaster University (http://www.e-

    s.lancs.ac.uk/cres/captain).

    N. Alavi et al. / Agricultural and Forest Meteorology 141 (2006) 576666

    http://www.es.lancs.ac.uk/cres/captainhttp://www.es.lancs.ac.uk/cres/captainhttp://www.es.lancs.ac.uk/cres/captainhttp://www.es.lancs.ac.uk/cres/captain