filling gaps evapotranspiration
TRANSCRIPT
-
8/2/2019 Filling Gaps Evapotranspiration
1/10
Filling gaps in evapotranspiration measurements for water budgetstudies: Evaluation of a Kalman filtering approach
Nasim Alavi a, Jon S. Warland a,*, Aaron A. Berg b
aDepartment of Land Resource Science, University of Guelph, Guelph, Canada N1G2W1bDepartment of Geography, University of Guelph, Guelph, Canada N1G2W1
Received 5 May 2006; accepted 29 September 2006
Abstract
Missing data in long-term eddy covariance measurements of latent heat flux produce errors in the estimation of evapotranspira-
tion and the water budget. Because no standard method of gap filling has been widely accepted, identification of optimal filling
methods for gaps is crucial for determining total evapotranspiration. In this study we evaluate the application of a Kalman filter for
filling gaps in latent heat flux data collected from an agricultural research station. The filtering approach was compared with several
gap-filling methods including mean diurnal variation, multiple regressions, 2-week average PriestleyTaylor coefficient, and
multiple imputation. The results demonstrated that a Kalman filtering approach developed using the relationship between latent heat
flux, available energy, and vapour pressure deficit provides a closer approximation of the original data and introduces smaller errors
than the other methods evaluated. Evaluation of the Kalman filter approach demonstrates the efficiency of this technique in
replacing data in both small and large gaps of up to several days.
# 2006 Elsevier B.V. All rights reserved.
Keywords: Eddy covariance; Missing data; Multiple imputation; Recursive estimation; Data filling
1. Introduction
With the proliferation of the eddy covariance system
(EC) in the 1980s, direct measurement of evapotran-
spiration (ET) has become very common (e.g. Aubinet
et al., 2000; Baldocchi and Meyers, 1998). Advantages of
this technique are high temporal resolution (e.g. half
hourly) and integration of the flux over a relatively large
area (footprint length of 1002000 m). The technique is
limited by missing or rejected measurements due to
system failures (system/sensor breakdown),maintenance
and calibration, and improper weather conditions. Today,
FluxNet, a world wide network equipped with eddy
covariance flux towers, is operating continuously to
collect water vapour and CO2 fluxes from more than 140
sites around the world (Baldocchi et al., 2001). However,
approximately 1750% of the observations in water
vapour fluxes are reported as missing or rejected at
FluxNet sites (Falge et al., 2001a).
ET measurements are used in estimation of the water
budget at daily, monthly and annual time scales.
Consequently, data gaps in observed ET must be
accurately filled. Accurate assessment of the water
budget is required in many studies including con-
taminant leaching, water available for plant growth and
irrigation scheduling. For example, in studies of nitrate
leaching from fertilized agricultural fields to ground
water, estimation of drainage often depends on knowl-
edge of ET for water budget estimation (e.g. McCoy
www.elsevier.com/locate/agrformetAgricultural and Forest Meteorology 141 (2006) 5766
* Corresponding author. Tel.: +1 519 824 4120;
fax: +1 519 824 5730.
E-mail address: [email protected] (J.S. Warland).
0168-1923/$ see front matter # 2006 Elsevier B.V. All rights reserved.
doi:10.1016/j.agrformet.2006.09.011
mailto:[email protected]://dx.doi.org/10.1016/j.agrformet.2006.09.011http://dx.doi.org/10.1016/j.agrformet.2006.09.011mailto:[email protected] -
8/2/2019 Filling Gaps Evapotranspiration
2/10
et al., 2006). In many studies, groundwater or soil
moisture contributions are estimated as a residual of the
water budget equation. Inaccurate estimation of ET
perpetuates errors into other aspects of the water budget
and its applications. Thus, accurate gap-filling proce-
dures need to be established to provide complete data
sets for ET with a minimization of errors.Numerous methods have been proposed to fill gaps
in eddy covariance data but no standard method has
been widely accepted. The commonly used methods for
filling gaps in water vapour (or latent heat flux)
measurements include: the mean diurnal variation
method (Falge et al., 2001a,b), regression methods
(Greco and Baldocchi, 1996; Berbigier et al., 2001),
evaluation of a 2-week average PriestleyTaylor
coefficient (Wilson and Baldocchi, 2000; Wilson
et al., 2001), look-up tables (Falge et al., 2001b) and
multiple imputation (Hui et al., 2004).Results from different gap-filling methods vary
widely. Falge et al. (2001b) filled gaps in annual ET
using mean diurnal variation and look-up tables for 28
data sets from 18 FluxNet sites and found that the
differences in annual ET among methods ranged from
48 to +86 mm per year. Although several gap-fillingtechniques have been evaluated in replacing missing
half hour data (Falge et al., 2001a, b), they have not been
evaluated for longer gaps of order of days.
Jarvis et al. (2004) have recently demonstrated the
application of a recursive parameter estimation algo-rithm based on a Kalman filter to fill gaps in CO2 flux
data. An approach similar to this technique may be
suitable for water vapour flux. However, the efficiency
of this technique has not been examined. The Kalman
filter is a data assimilation technique that has been
applied in various studies. Williams et al. (2005) used an
ensemble Kalman filter to evaluate the carbon budget
over a ponderosa pine forest. Other studies have
examined the ability of Kalman filters to retrieve the soil
moisture profile by assimilation of near-surface soil
moisture measurements (e.g. Entekhabi et al., 1994;
Reichle et al., 2002). In a study of the surface energybalance, Boni et al. (2001) applied a Kalman filtering
approach to satellite measurements of surface tempera-
ture to estimate the components of the energy balance.
In this study we compared several gap-filling
techniques including mean diurnal variation, 2-week
average PriestleyTaylor coefficient, regression
method, multiple imputation, and recursive parameter
estimation algorithms based on the Kalman filter for
filling gaps in latent heat flux data. Second, we
examined the application of recursive parameter
estimation algorithms for filling gaps in latent heat
flux data in replacing missing values over short and
large gaps in data. We also evaluated the efficiency of
this technique in improving the annual and monthly
estimation of ET.
2. Materials and methods
2.1. Site description and data collection
The experimental site was located at the University
of Guelphs Elora Research Station (438390, 808250),
25 km northwest of Guelph, Ontario, Canada. Data used
in this study were collected in 2002 over a 18 ha winter
wheat field. Measurement of water vapour flux was
conducted with two eddy covariance systems. Each
system consisted of a Campbell Scientific CSAT3 three-
dimensional sonic anemometer and a LICOR-7500
open path CO2/H2O analyzer. Net radiation (Rn) andsoil heat flux (G) were measured using net radiometers
(CNR1 Kipp and Zonen, Delft, the Netherlands) and
nine soil heat flux plates (five Model HFT-1, Radiation
and Energy Balance Systems Inc., Seattle WA. and four
Model 610, Thornthwaite Associates, Elmer, NJ). Air
temperature (Ta) was estimated from measurements by
the sonic anemometer and corrected for vapour density.
2.2. Preparation of data
2.2.1. Eddy covariance dataQuality control was applied to the EC data to exclude
potentially non-representative flux measurements.
Screening criteria for mean velocityu, friction velocity(u*), and wind direction were applied. During times of
weak turbulence or low wind speeds, often experienced at
night, mixing is weak and the flux is underestimated.
Therefore, we rejected measurements with mean wind
u< 1 m=s andwhen thefrictionvelocityu* < 0.1 m/s. Toavoid tower shadowing (Lee and Hu, 2002), periods
where the angle between the mean horizontal wind and
sonic axis was more than 1508 were rejected.
The two EC systems were mounted on the sametower during the study period, therefore, the data series
of two systems were merged to one time series by
averaging two data series when they both had data or by
using whichever data was available in the event that one
system was not operating.
In 2002 the amount of missing data in latent heat flux
(lE) was 17% (due to maintenance and system failures).
The quality tests discarded 27% of the measured lE
data, leading to total gap percentage of 44% in 2002
with 36% for day and 54% for night observations.
Fifty-five percent of the total gaps were less than 3 h.
N. Alavi et al. / Agricultural and Forest Meteorology 141 (2006) 576658
-
8/2/2019 Filling Gaps Evapotranspiration
3/10
There were two long gaps of 7 and 14 days in summer
2002 due to field operations.
2.2.2. Meteorological variables
The climatic variables, including Rn, G, D, Ta, were
missing 4, 5.5, 17 and 17% of observations respectively,
during 2002. Gaps in the Ta, Rn, and vapor pressure (e)time series were filled with the temperature, relative
humidity and net radiation measurements also available
at a nearby weather station (approximately 200 m
away). Missing values in G were replaced by using a 20-
day running linear regression between G and Rn.
2.3. Gap-filling methods
The gap-filling methods examined here include:
mean diurnal variation (MDV), multiple regressions
(Regr.), 2-week average Priestley and Taylor coefficient(PT), multiple imputation (MI), and two recursive
parameter estimation algorithms using a Kalman filter.
In this study we evaluated the recursive parameter
estimation algorithms as new approaches to gap filling
of ET measurements by comparing it with the other
methods. The other filling methods where selected
because they are commonly used and they each have
some advantage. MDV does not require climatic data,
therefore it can be used if meteorological variables are
not available. Regr. and PT methods consider the impact
of climatic variables on ET and capture the response ofET to climatic forcing. MI also preserves the relation-
ships between lE and climatic variables. Furthermore,
this method can impute missing climatic variables and
missing lEat the same time and provides a confidence
level for each imputed value.
2.3.1. Mean diurnal variation
In this method, a missing observation is replaced by
the mean for that time of day based on observations
from the previous and subsequent days (Falge et al.,
2001a,b). Derivation of the mean for the missing data
depends on the length of the time interval of averaging(window size). A window size of 415 days is usually
used as the averaging interval. Small window sizes (less
than 4 days) were not sufficient to determine a mean
from the measurement and not recommended by
previous studies (Falge et al., 2001a). Larger window
sizes were not considered, because nonlinear depen-
dence of lE on environmental variables introduces
errors through averaging. In this study, an averaging
interval of7 days for the missing daytime data day,and 14 days for nighttime data, was found to create
the lowest errors, and was therefore used in this study.
2.3.2. Multiple regression
In this method a multiple regression relationship was
established between latent heat flux and its main
controlling factors including available energy (Rn G)and vapour pressure deficit (D), for some period around
the missing data. The resulting equations were then
used to fill the missing latent heat flux values. In thisstudy, a regression relationship for each missing value
was calculated using the data from the adjacent 10days (where 10 days resulted in the highestdetermination). If the coefficient of determination
(R2) was greater than 0.5, the resulting equations were
used to fill the gap, otherwise the gap remained unfilled.
For long gaps (gaps larger than 5 days), data from a
window of20 days were used.
2.3.3. Two-week average Priestley and Taylor
coefficientMissing measurements of half hourly latent heat flux
were estimated using the product of equilibrium
evaporation for the half hour and the 2-week average
Priestley and Taylor coefficient (Wilson and Baldocchi,
2000; Wilson et al., 2001). Equilibrium evaporation was
obtained using lEeq D=D gRn G, whereD is the slope of saturated vapour pressure with
temperature (K), g the psychrometric constant and l is
the latent heat of vapourization. For each missing value
oflE, the total measured lE and estimated lEeq were
obtained during a 2-week period before and after themissing value of lE and the Priestley and Taylor
coefficient (a lE=lEeq, Priestley and Taylor, 1972)was calculated. At the time of the missing lE
measurement the product of a and lEeq was used to
estimate missing lE following Wilson and Baldocchi
(2000).
2.3.4. Multiple imputation
Multiple imputation (MI) is a statistical technique
for analyzing incomplete data sets (Hui et al., 2004). In
this technique, the missing entries of the incomplete
data sets are imputed m times. The variation among them imputations reflects the uncertainty with which the
missing values can be predicted from the observed ones.
In order to generate imputations for the missing values,
a probability model must be imposed on the complete
data set (observed and missing values). Once the model
is selected, mean vector and covariance matrices of the
incomplete data set can be generated using the
expectation and maximization algorithm (EM). The
EM algorithm is a common technique for fitting models
to incomplete data and estimating mean and covariance
matrices (Rubin, 1987; Schafer and Olsen, 1998). Then
N. Alavi et al. / Agricultural and Forest Meteorology 141 (2006) 5766 59
-
8/2/2019 Filling Gaps Evapotranspiration
4/10
a data augmentation algorithm uses the initial values
obtained from the EM algorithm and generates the
imputed values. Further details of this method are given
in Hui et al. (2004) and Schafer (1997).
For this study, we assumed multivariate normal
distribution for observed lEand the climatic variables.
A number of previous studies (e.g. Schafer, 1997) haveshown that MI is robust to violations from the normality
of values used in the analysis if the amount of missing
values is not large. We examined the assumption of
multivariate normal distribution by transforming data
using the BoxCox transformation method provided in
SAS software version 9.1 (SAS Institute Inc., 2004)
following Hui et al. (2004). Comparison of the results of
transformed data with non-transformed data indicated
that the annual estimations changed only slightly (about
0.03%). We also assumed that data were missing at
random (MAR). Under this assumption, the probabilitythat data are missing may depend on observed data but
not on ones that are missing. This assumption allows us
to first estimate the relationships among variables from
the observed data, and then use these relationships to
obtain unbiased predictions of the missing values from
the observed data. This assumption is plausible for the
lE data since most of the missing data were due to
improper wind conditions (e.g. low u*) or malfunction
of the instruments and only weakly related to lE itself
(e.g. when condensation occurs on the gas analyzer). In
our evaluation procedure (Section 2.4) this assumptionis strictly valid because data are removed randomly.
A matrix of observations oflEand available energy
(Rn G), Ta, and D was used to conduct MI analysis.MI analysis was conducted using SAS software version
9.1 (SAS Institute Inc., 2004). The analysis used an EM
algorithm (maximum number of iteration used in EM
was set to 500) for initial value estimations and five
imputations were created using a data augmentation
algorithm, Markov Chain Monte Carlo (MCMC). Rubin
(1987) and Schafer and Olsen (1998) showed that unless
the rate of missing information is very high only a few
imputations (35) are enough to estimate missing dataand there is little advantage to producing and analyzing
more than a few imputed datasets. The average of the
five imputed values was used to fill each missing
observation.
2.3.5. Recursive parameter estimation algorithms
This method is based on the Kalman filter and
smoothing techniques developed by Young and co-
workers (Young, 1999; Young and Pedregal, 1999;
Young et al., 2004). In this method, the system is
described by a proper model with unknown parameters
(observation model). Temporal variations of parameters
in this model are assumed to be described by a
generalized random-walk model (state model). Having
introduced these two models, a forward pass filtering
(Kalman filter) is applied to the data to predict time
variable model parameters. The estimates obtained
from the Kalman filter are updated using a backwardrecursive smoothing algorithm working through the
data in reverse temporal order. Based on the type of
observation model, different methodologies have been
introduced for applying recursive parameter estimation.
In this study dynamic linear regression (DLR) and state
dependent parameters (SDP) were applied to the lE
data using the Captain Toolbox for MATLAB (Young
et al., 2004).
In the DLR algorithm, parameters are considered to
vary with time. In this study, the regression model was
expressed as the relationship between lE, net radiation(Rn), and vapour pressure deficit (D):
lEt atRnt Gt btDt zt (1)
Here a(t) and b(t) are the model parameters and jis
the regression model error series that is zero mean
(white noise). The DLR parameters can be seen as
coefficients in the PenmanMontheith equation:
lED
D gRn G
raCp
raD gD
where g g1 rc=ra, ra is the air density, Cp thespecific heat at constant pressure, ra the aerodynamic
resistance, rc the canopy resistance and g is the psy-
chrometric constant.
The stochastic evolution of each parameter in (1) is
assumed to be described by the following random walk
process (Young, 1999):
at at 1 hat; bt bt 1 hbt
(2)
and ha, hb are zero mean, white noise error series.
In the SDP algorithm, parameters are dependent on
air temperature (Ta) as well as time:
lEt aTa; tRnt Gt bTa; tDt zt
(3)
where
aTa; t aTa; t 1 hat;
bTa; t bTa; t 1 hbt(4)
and ha, hb are zero mean, white noise error series. We
assumed that the evolution of the model parameters is
N. Alavi et al. / Agricultural and Forest Meteorology 141 (2006) 576660
-
8/2/2019 Filling Gaps Evapotranspiration
5/10
nonstationary stochastic as described by a simple ran-
dom walk process. This ensures that the recursive
parameter estimates at any sample time t depends only
on data in the vicinity of this sample. This weighting
effect has a Gaussian shape which applies the maximum
weight at sample time twith declining weight each side.
The bandwidth of the Gaussian window is defined bythe noise variance ratio (NVR) (NVR = s2(z(t))/
s2(h(t)). The required NVRs are optimized via max-
imum likelihood error decomposition introduced by
Young (1999).
When using the SDP method the state dependent
parameters may vary with Ta at a rate commensurate
with the temporal variations in the input series, thus a
simple random walk cannot be assumed to describe the
parametric variation over time. Therefore, before
estimation of parameter variation, the time series of
lE, Rn, G, and D were sorted with respect to theascending order of Ta. With sorting, the rapid natural
variations in the input series are effectively eliminated
from the data and a simple random walk process utilized
to describe their evolution (Jarvis et al., 2004).
2.4. Evaluation of gap-filling methods
We created random and systematic gaps in the data
sets and applied each of the methods to fill these gaps.
The analysis used four bimonthly data sets (February
March, AprilMay, JuneJuly and SeptemberNovem-ber for the 2002 data) with low percentages (2535%)
of missing and rejected data. Daytime and nighttime
data were treated separately. Half hourly data were
deleted randomly until 20% (about 100180 data
points) of the original daytime data was removed from
the bimonthly dataset. Then each of the gap-filling
methods was applied to fill these artificial gaps. This
procedure was repeated for the nighttime data. To assess
the performance of each we examined the potential
error associated with each method. Root mean square
error (RMSE), coefficient of determination (R2),
relative root mean square error (R_RMSE), and relativebias (R_bias) were calculated. The absolute error for
each method was calculated as the measured minus the
computed value for each artificially missed data and
was used to calculate the total RMSE for the each
bimonthly data set. It was divided by the mean oflEfor
the data set to calculate R_RMSE. R2 was computed
between the observed and filled data for each bimonthly
data set and R_bias was obtained by dividing the bias by
the mean oflE for each data set. The entire procedure
was performed three times and the average values
reported here.
3. Results and discussion
3.1. Results of filling artificial gaps in data
Table 1 compares the performance of the gap-filling
methods. For the daytime data, root mean square errors
in each bi-monthly period range from 10.5 to 70.3 W/m2. The DLR technique resulted in the smallest error
(average RMSE = 17.2 W/m2) and largest agreement
coefficient (average R2 = 0.93), while the largest errors
(average RMSE = 54.1 W/m2) were found with the PT
method. The smallest agreement coefficients (average
R2 = 0.5) were obtained with MDV. Unlike the other
methods MDV, does not consider the impact of
meteorological variables on lE so it can be used when
meteorological data are not available.
The root mean square errors for nighttime gaps
ranged from 5.1 to 38.4 W/m2
and the agreementcoefficients were all less than 0.5. Comparing the
different methods, the errors were smallest (average
RMSE = 11.3 W/m2) with DLR whereas PT resulted in
the largest error (33.7 W/m2). RMSEs are larger during
summer and daytime than winter and nighttime, but the
fluxes are also larger. Therefore we considered the
relative RMSE to make the comparison between
different seasons and time. Normalizing the RMSE
with the mean oflEfor each dataset (relative root mean
square error) shows that all gap-filling methods resulted
in smaller relative errors (0.10.5) during summer(JuneJuly) and spring (AprilMay) than fall (October
November) and winter (FebruaryMarch) (0.31.5).
Also better agreement between computed and original
values was obtained during summer and spring daytime
(greater than 70% except for MDV).
R_bias was mostly less than 5% for daytime datasets
except for the PT method. This shows that the error
introduced by the different methods is essentially
random.
The gaps in daytime data during the summer can be
filled with smaller relative errors than nighttime and
wintertime gaps. The large relative errors associated withfilling missing nighttime data is primarily related to the
large number of gaps in nighttime data and unreliable
measurements during nighttime. However, due to the low
nighttimevalue oflE, the error introducedby application
of different filling method may not influence the total
annual or monthly estimation oflE significantly.
For the evaluation process, gaps in the data were
created randomly. This may not be a true representation
of reality because real gaps in data may not be randomly
distributed. Most large gaps in the data were due to
unsuitable weather conditions or field operations. These
N. Alavi et al. / Agricultural and Forest Meteorology 141 (2006) 5766 61
-
8/2/2019 Filling Gaps Evapotranspiration
6/10
N. Alavi et al. / Agricultural and Forest Meteorology 141 (2006) 576662
Table 1
Root mean square error (RMSE), agreement coefficient (R2), relative root mean square error (R_RMSE), and relative bias (R_bias) of gap filled lE
(W/m2)
Regr. MDV PT DLR SDP MI
Daytime data
RMSE
AprilMay 2002 30.22 55.08 42.67 19.46 28.83 40.41JuneJuly 2002 33.68 70.26 53.59 22.80 44.67 52.40
OctoberNovember 2002 20.20 29.52 61.15 16.19 22.98 29.14
FebruaryMarch 2002 20.64 27.96 59.15 10.47 21.09 18.13
Average 26.18 45.70 54.14 17.23 29.39 35.02
R2
AprilMay 2002 0.83 0.44 0.79 0.93 0.85 0.72
JuneJuly 2002 0.93 0.67 0.86 0.97 0.87 0.81
OctoberNovember 2002 0.80 0.58 0.55 0.88 0.76 0.62
FebruaryMarch 2002 0.71 0.46 0.51 0.92 0.68 0.74
Average 0.82 0.54 0.68 0.93 0.79 0.72
R_RMSE
AprilMay 2002 0.29 0.54 0.42 0.19 0.28 0.39
JuneJuly 2002 0.19 0.39 0.29 0.13 0.25 0.29
OctoberNovember 2002 0.40 0.58 1.21 0.32 0.45 0.55
FebruaryMarch 2002 0.53 0.72 1.53 0.27 0.55 0.50
Average 0.35 0.56 0.86 0.23 0.38 0.43
R_bias
AprilMay 2002 0.02 0.04 0.09 0.02 0.05 0.04JuneJuly 2002 0.03 0.00 0.05 0.00 0.00 0.02OctoberNovember 2002 0.02 0.05 0.54 0.04 0.06 0.02FebruaryMarch 2002 0.04 0.03 0.80 0.01 0.01 0.00Average 0.04 0.02 0.37 0.02 0.03 0.00
Nighttime data
RMSE
AprilMay 2002 13.62 15.83 25.76 11.41 13.25 22.51
JuneJuly 2002 17.88 17.83 38.43 15.61 20.35 29.29
OctoberNovember 2002 16.06 19.28 37.49 13.11 16.60 19.23
FebruaryMarch 2002 9.34 10.52 32.96 5.15 9.35 12.83
Average 14.22 15.86 33.66 11.32 14.89 20.96
R2
AprilMay 2002 0.23 0.00 0.19 0.41 0.27 0.03
JuneJuly 2002 0.20 0.00 0.02 0.35 0.18 0.00
OctoberNovember 2002 0.17 0.00 0.03 0.46 0.19 0.07
FebruaryMarch 2002 0.27 0.04 0.02 0.75 0.24 0.02
Average 0.22 0.01 0.07 0.49 0.22 0.03
R_RMSE
AprilMay 2002 0.88 1.03 1.67 0.74 0.86 1.52JuneJuly 2002 1.07 1.06 2.29 0.93 1.21 1.76
OctoberNovember 2002 1.15 1.38 2.69 0.93 1.18 1.43
FebruaryMarch 2002 0.99 1.11 3.48 0.55 0.99 1.47
Average 1.02 1.14 2.53 0.79 1.06 1.54
R_bias
AprilMay 2002 0.16 0.09 1.28 0.04 0.21 0.10JuneJuly 2002 0.53 0.14 1.84 0.40 0.74 0.53OctoberNovember 2002 0.09 0.05 1.94 0.16 0.36 0.05FebruaryMarch 2002 0.15 0.12 2.69 0.01 0.02 0.05Average 0.03 0.10 1.94 0.13 0.31 0.11
-
8/2/2019 Filling Gaps Evapotranspiration
7/10
gaps may introduce biases by their non-random
distribution. Therefore the resultant errors with artificial
gaps may not be equivalent to the errors from
application of filling methods to the real gaps. To
evaluate the application of the gap-filling methods over
longer gaps as would be expected from bad weather
conditions or instrument failure, in the followingsection we assess the top three methods (Regr., DLR
and SDP) for long gaps on order of few hours to days.
Lack of energy balance closure is a common problem
in measuring turbulent fluxes with EC technique. It has
become generally accepted that the turbulent fluxes
(sensible heat and latent heat fluxes) are underestimated
by approximately 1030% relative to estimate of
available energy (Wilson et al., 2002; Aubinet et al.,
2000). Closure has not been considered in this analysis,
however since the gap-filling methods are essentially
interpolative, correcting existing data for closure wouldproduce an equal change in the filled data.
3.2. The Kalman filter approach to filling long gaps
This section examines the efficiency of SDP, DLR
and Regr. methods in replacing data in large gaps by
creating 3, 6, 12, 18 h, and 1, 3, 5 and 7 day gaps in the
dataset. As an example, the results for a 3-day gap in
May 2002 are shown in Fig. 1. For this period the RMSE
for DLR was 8.2 W/m2 and for SDP 13.4 W/m2, while it
was 20.6 W/m2 for Regr. Next, a 1-day moving gap was
created throughout the growing season (MaySeptem-
ber 2002). Fig. 2 shows the resulting RMSE for eachday of growing season 2002. In general, the values of
RMSE are smaller for DLR than Regr. and SDP. The
average daily RMSE was 22.4 W/m2 with DLR,
33.2 W/m2 with SDP, and 30.2 W/m2 with Regr. This
procedure was repeated using moving gap sizes of 3, 6,
12, 18 h and 2, 3, 5, 7 days over MaySeptember 2002.
Fig. 3 shows the average RMSE for different gap sizes.
As the size of gap increases, mean RMSE of DLR, SDP
and Regr. increases. DLR outperformed the other
methods for all gap sizes and has the smallest mean
RMSE while SDP has the greatest mean RMSE for allgap sizes. The difference in mean RMSE between Regr.
and DLR is 6.8 W/m2 for 3 h gaps and decreases to
3.4 W/m2 for a gap size of 7 days.
In the SDP method, the lEtime series are sorted out
of temporal order. Jarvis et al. (2004) claimed that in
this case the systematic gaps in the time series become
N. Alavi et al. / Agricultural and Forest Meteorology 141 (2006) 5766 63
Fig. 1. Comparison of gap filled lE (*) by (a) DLR, (b) SDP, and (c) Regr. with measurements () for 1821 May 2002.
-
8/2/2019 Filling Gaps Evapotranspiration
8/10
much smaller nonsystematic gaps in the sorted
temperature space and these smaller gaps can be filled
with smaller errors. Therefore, we should expect better
estimation of data in longer gaps using SDP. However,
in this study the SDP approach gave larger errors for all
gap sizes compared to DLR. In the SDP algorithm the
time series of Rn, G and D were also sorted with respect
to the ascending order of Ta. Although there is some
correlation of these variables with air temperature, otherfactors are also significant. The assumption that Ta is the
dominant control may lead to greater errors in SDP
analysis.
Including more variables in the algorithm would
increase the complexity of the model and result in large
errors comparing to simpler DLR algorithm.
Application of the DLR method provides a very good
estimation of missing values in lE since it is able to
reproduce most variation in lE due to different factors
(e.g. net radiation, vapour pressure). It provides smallerRMSEs (3.46.8 W/m2, with the lower values for gap
N. Alavi et al. / Agricultural and Forest Meteorology 141 (2006) 576664
Fig. 2. RMSE of (a) Regr., (b) DLR, and (c) SDP for 1 day moving gap in growing season 2002.
Fig. 3. Mean RMSE of Regr., SDP, and DLR for filling in 3, 6, 12, 18 h and 1, 2, 3, 5, 7 days moving gaps in growing season 2002.
-
8/2/2019 Filling Gaps Evapotranspiration
9/10
sizes of 3 h to 7 days) than Regr. for estimating lE.
Similar to the analysis over random gaps, the DLR
method was the most accurate for filling the larger
artificial gaps.
3.3. Estimation of annual and monthly sum of ET
Annual sums of ET were computed by filling the real
gaps with the Regr. and DLR methods. Regr. method
was applied to data with a window of10. Since somegaps were not filled because of low agreement
coefficients (mostly in winter time) we repeated the
application of Regr. using windows of 20 and 30days to fill data left unfilled by the 10-day window.
The annual estimation of ET was 606 mm by Regr.,
and it was 585 by DLR which differ by 21 mm (3%).
Although the application of Regr. and DLR did not
affect the annual sum very much, estimating ET duringa shorter period of time might be highly affected by the
filling method.
The results for monthly estimation of ET for year
2002 showed that the difference in monthly ET filled by
Regr. and by DLR ranged from 1 to 8 mm (0.215%) for
different months. The absolute difference was small
during winter months (0.21.5 mm) while it was larger
in the growing season (48 mm). Therefore, the
selection of proper gap-filling method is crucial in
estimating monthly ET during growing season.
4. Summary
Several gap-filling methods were applied to eddy
covariance data of latent heat flux from an agricultural
site. The methods were evaluated based on their ability
to fill artificial data gaps. The recursive parameter
estimation approach based on the relation between
latent heat flux and available energy and vapour
pressure deficit resulted in better estimation of the
original data and smaller errors than the MI, PT and
MDV, and Regr. methods.
The recursive parameter estimation approach withthe DLR algorithm produced the smallest errors for both
short and long (hours to days) gaps. The relative errors
were smaller for daytime data during the growing
season, when ET is high, than for nighttime and winter
data. The application of this method can improve the
estimation of total ET over a period of time since it
results in smaller errors than other methods, assuming
no significant bias between missing and present data
(MAR is valid).
While the application of different gap-filling
techniques does not greatly affect influence the total
annual sum, the total ET for shorter periods of time
(month or days) can be highly affected by choice of gap-
filling technique. Total ET during the growing season,
when ET is high, is more sensitive to choice of gap-
filling method and so proper selection of a method is
crucial at this time. Since the evaluation process based
on filling artificial gaps showed that the resultant errorwith DLR was smaller than with other methods,
application of this technique should provide better
estimation of total ET, especially over a short period of
time and during the growing season.
Due to the efficiency of the Kalman filtering approach
in replacing data in both small and large gaps, and
providing the best approximation of the original data, the
application of this technique is recommended as a gap-
filling approach for estimation of total ET, as is further
study of these techniques on the other contexts.
Acknowledgements
This research project was funded by the grants from
the Ontario Ministry of Agriculture and Food and the
Natural Sciences and Engineering Research Council of
Canada. We are very grateful for the thoughtful and
detailed comments of the two anonymous reviewers.
References
Aubinet, M., Grelle, A., Ibrom, A., Rannik, U., Moncrieff, J., Foken,T., Kowalski, A.S., Martin, P.H., Berbigier, P., Bernhofer, Ch.,
Clement, R., Elbers, J., Granier, A., Grunwald, T., Morgenstern,
K., Pilegaard, K., Rebmann, C., Snijders, W., Valentini, R., Vesala,
T., 2000. Estimatesof the annual netcarbon and water exchange of
European forests: the EUROFLUX methodology. Adv. Ecol. Res.
30, 113175.
Baldocchi, D., Meyers, T., 1998. On using eco-physiological, micro-
meteorological and biogeochemical theory to evaluate carbon
dioxide, water vapour and trace gas fluxes over vegetation: a
perspective. Agric. For. Meteorol. 90, 125.
Baldocchi, D., Falge, E., Gu, L., Olson, R., Hollinger, D., Running, S.,
Anthoni, P., Bernhofer, C., Davis, K., Evans, R., Fuentes, J.,
Goldstein, A., Katul, G., Law, B., Lee, X., Malhi, Y., Meyers,
T., Munger, W., Oechel, W., Paw, U.K.T., Pilegaard, K., Schmid,
H.P., Valentini, R., Verma, S., Vesala, T., Wilson, K., Wofsy, S.,
2001. FLUXNET: a new tool to study the temporal and spatial
variability of ecosystem-scale carbon dioxide, water vapour and
energy flux densities. Bull. Am. Meteorol. Soc. 82, 24152434.
Berbigier, P., Bonnefond, J.M., Mellmann, P., 2001. CO2 and water
vapour fluxes for 2 years obove Euroflux forest site. Agric. For.
Meteorol. 108, 183197.
Boni, G.,Entekhabi, D.,Gastelli, F., 2001. Land data assimilationwith
satellite measurements for the estimation of surface energy bal-
ance components and surface control on evaporation. Water
Resour. Res. 37 (6), 17131722.
Entekhabi, D., Nakamura, H., Njoku, E.G., 1994. Solving the inverse
problem for soil moisture and temperature profiles by sequential
N. Alavi et al. / Agricultural and Forest Meteorology 141 (2006) 5766 65
-
8/2/2019 Filling Gaps Evapotranspiration
10/10
assimilation of multifrequency remotely sensed observations.
IEEE Trans. Geosci. Remote Sensing 32 (2), 438448.
Falge, E., Baldocchi, D.D., Olson, R., Anthoni, P., Aubinet, M.,
Bernhofer, C., Burba, G., Ceulemans, R., Clement, R., Dolman,
H., Granier, A., Gross, P., Grunwald, T., Hollinger, S., Jensen, N.-
O., Katul, G., Keronen, P., Kowalski, A., Lai, C.T., Law, B.E.,
Meyers, T., Moncrieff, J., Moors, E., Munger, J.W., Pilegaard, K.,
Rannik, U., Rebmann, C., Suyker, A.,Tenhunen,J., Tu, K.,Verma,S., Vesala, T., Wilson, K., Wofsy, S., 2001a. Gap filling strategies
for defensible annual sums of net ecosystem exchange. Agric. For.
Meteorol. 107, 4369.
Falge, E., Baldocchi, D.D., Olson, R., Anthoni, P., Aubinet, M.,
Bernhofer, C., Burba, G., Ceulemans, R., Clement, R., Dolman,
H., Granier, A., Gross, P., Grunwald, T., Hollinger, S., Jensen, N.-
O., Katul, G., Keronen, P., Kowalski, A., Lai, C.T., Law, B.E.,
Meyers, T., Moncrieff, J., Moors, E., Munger, J.W., Pilegaard, K.,
Rannik, U., Rebmann, C., Suyker, A.,Tenhunen,J., Tu, K.,Verma,
S., Vesala, T., Wilson, K., Wofsy, S., 2001b. Gap filling strategies
for long term energy flux data sets. Agric. For. Meteorol. 107,
7177.
Greco, S., Baldocchi, D.D., 1996. Seasonal variations of CO2 water
vapour exchange rates over a temperate deciduous forest. Global
Change Biol. 2, 183197.
Hui, D., Wan, S., Su, B., Katul, G., Monson, R., Luo, Y., 2004. Gap-
filling missing data in eddy covariance measurements using multi-
ple imputation (MI) for annual estimations. Agric. For. Meteorol.
121, 93111.
Jarvis, A.J., Stauch, V.J., Schulz, K., Young, P.C., 2004. The seasonal
temperature dependency of photosynthesis and respiration in two
deciduous forests. Global Change Biol. 10, 939950.
Lee, X.H., Hu, X.Z., 2002. Forest-air fluxes of carbon, water and
energy over non-flat terrain. Boundary-Layer Meteorol. 103 (2),
277301.
McCoy, A.J., Parkin, G.W., Wagner-Riddle, C., Warland, J.S., Lauzon,
J., von Bertoldi, P., Fallow, D., Jayasundara, S., 2006. Usingautomated soil water content and temperature measurement sys-
tems to estimate soil water budgets. Can. J. Soil Sci. 86, 4756.
Priestley, C.H.B., Taylor, R.J., 1972. On theassessment of surface heat
flux and evaporation using large-scale parameters. Monthly
Weather Rev. 100, 8192.
Reichle, R., McLaughlin, D.B., Entekhabi, D., 2002. Hydrologic data
assimilation with the ensemble Kalman filter. Monthly Weather
Rev. 130 (1), 103114.
Rubin, D.B., 1987. Multiple Imputation for Nonresponse Surveys.
John Wiley & Sons, New York.Schafer, J.L., Olsen, M.K., 1998. Multiple imputation for multivariate
missing-data problems: a data analysts perspective. Multivariate
Behav. Res. 33, 545571.
Schafer, J.L., 1997. Analysis of Incomplete Multivariate Data. Chap-
man and Hall, London.
Williams, M., Schwarz, P.A., Law, B.E., Irvine, J., Kurpius, M.R.,
2005. An improved analysis of forest carbon dynamics using data
assilmilation. Global Change Biol. 11, 89105.
Wilson, K.B., Baldocchi, D.D., 2000. Seasonal and interannual varia-
bility of energy fluxes over a broadleaved temperate deciduous
forest in North America. Agric. For. Meteorol. 100, 118.
Wilson, K.B., Hanson, P.J., Mulholland, P.J., Baldocchi, D.D., Wulls-
chleger, S.D., 2001. A comparison of methods for determining
forest evapotranspiration and its components: sap-flow, soil water
budget, eddy covariance and catchment water balance. Agric. For.
Meteorol. 106, 153168.
Wilson, K.B., Goldstein, A.H., Falge, E., Aubinet, M., Baldocchi, D.,
Berbigier, P., Bernhofer, Ch., Ceulemans, R., Dolman, H., Field, C.,
Grelle, A., Law, B., Meyers, T., Moncrieff, J., Monson, R., Oechel,
W., Tenhunen, J., Valentini, R., Verma, S., 2002. Energy balance
closure at FLUXNET sites. Agric. For. Meteorol. 113, 223243.
Young, P.C., 1999. Nonstationary time series analysis and forecasting.
Progr. Environ. Sci. 1, 348.
Young, P.C., Pedregal, D.J., 1999. Recursive and en-bloc approaches
to signal extraction. J. Appl. Stat. 26, 103128.
Young, P.C., Taylor, C.J., Tych, W., Pedregal, D.J., McKenna, P.G.,
2004. The Captain Toolbox. Centre for Research on Environmen-tal Systems and Statistics, Lancaster University (http://www.e-
s.lancs.ac.uk/cres/captain).
N. Alavi et al. / Agricultural and Forest Meteorology 141 (2006) 576666
http://www.es.lancs.ac.uk/cres/captainhttp://www.es.lancs.ac.uk/cres/captainhttp://www.es.lancs.ac.uk/cres/captainhttp://www.es.lancs.ac.uk/cres/captain