the use of a multi-model ensemble in local operational ozone forecasting
DESCRIPTION
The use of a multi-model ensemble in local operational ozone forecasting. William F. Ryan Jeremy Geiger Department of Meteorology The Pennsylvania State University. 3 rd International Workshop on Air Quality Research, Washington, DC, November, 2011. - PowerPoint PPT PresentationTRANSCRIPT
The use of a multi-model ensemble in local operational ozone forecasting
William F. RyanJeremy Geiger
Department of MeteorologyThe Pennsylvania State University
3rd International Workshop on Air Quality Research, Washington, DC, November, 2011
Brief Background on Operational O3 Forecasting in the United States
• Forecasts are typically issued on the metropolitan scale by local (non-NOAA) forecasters, and are verified against the domain-wide peak 8-hour O3 at monitors operated by state or local AQ agencies.– In PHL, roughly 10-12 monitors.
• Key forecast threshold is an 8-hour average concentration ≥ 76 ppbv (Code Orange).
• In order to activate “Air Quality Action Day” plans, forecasts are issued ~ 1800 UTC on the previous day – an 18-30 hour forecast.
Standard Forecast Techniques
• Outside the margins of the season, climatology is not useful but persistence (one day lag) is a skillful forecast.– Persistence “explains” ~ 36% of variance in peak 8-h O3 in PHL.
• Persistence can be coupled with forecast back trajectories to assess regional transport (expected residual layer concentrations).
• Simple statistical models using meteorological and persistence predictors were useful but have become less so in the changing emissions environment post-2002.
• Conceptual models are useful at the regional scale, and for longer range forecasts, as multi-day, severe high O3 events in the mid-Atlantic typically follow certain “classic” synoptic scale patterns.
• However, daily metropolitan scale peak O3 concentrations vary on the meso-scale. These effects that may be successfully simulated by the latest generation of coupled chemistry-meteorological models.
• Post-processed model guidance is underway in Canada but has not been implemented in the US.– Though bias correction methods are available, see AQMOS later.
Hypothesis
• Several skillful numerical O3 forecast models are now reliably available in a timely manner for use in preparing operational O3 forecasts.– These models vary in terms of their
meteorological model drivers, emissions data bases and chemical mechanisms.
• An “ensemble” of these models will provide better forecasts at the critical forecast threshold (Code Orange) than a single model.
Useful Reading
• Djalalova, I., et al, 2010: Ensemble and bias-correction techniques for air quality model forecasts of surface O3 and PM2.5 during the TEXAQS-II experiment of 2006, Atmos. Environ., 44, 455-467.
• McKeen, S., et al, 2005: Assessment of an ensemble of seven real-time ozone forecasts over eastern North America during the summer of 2004, J. Geophys. Res., 110, doi:10.1029/2005JD005858.
Philadelphia Metropolitan Forecast Area:Test Case for Ensemble Hypothesis
Philadelphia is a large metropolitanarea embedded in the “I-95 Corridor”.It has large local emissions and is alsosubject to transported emissions fromother large cities as well as regional NOx from large power generation sources in the Ohio River Valley to the west
I-95 CorridorOhio River Valley
NOAA Operational Forecast Model Results - PhiladelphiaPeak 8-Hour Domain-Wide O3 (May-September, 2007-2010)
0 20 40 60 80 100 120 140
NAQC Forecast 8-h Max (ppbv)
0
20
40
60
80
100
120
140
Ob
serv
ed
8-h
Ma
x (p
pb
v)
Bias: +3.3 ppbv
Median Absolute Error:7.0 ppbv
Mean Absolute Error:8.0 ppbv
Best Linear Fit: [O3]OBS = 4.7 + 0.875*[O3]FC
Increased spread in the higher end of the distribution.
For 76 ppbv Threshold:Hit Rate: 0.77False Alarm: 0.37Threat Score: 0.58
NOAA Operational Model Sunday False Alarms
Sun Mon Tues Wed Thurs Fri Sat0.00
0.10
0.20
0.30
0.40
0.50
0.60
0.70
0.80
0.90
1.00
False Alarm Hit CSI(Threat)
Skill
Mea
sure
NOAA Model Seasonal Drift
1-May
8-May
15-May
22-May
29-May
5-Jun
12-Jun
19-Jun
26-Jun
3-Jul
10-Jul
17-Jul
24-Jul
31-Jul
7-Aug
14-Aug
21-Aug
28-Aug
4-Sep
-40
-30
-20
-10
0
10
20
30
40
AverageBias07Bias08Bias09Bias10
Fore
cast
Bia
s (p
pbv)
Forecast Models Used in Ensemble
• NAQC (NOAA) – Queried at Monitor Locations– 1200 UTC Run valid following day (24-36 h forecast)– http://www.emc.ncep.noaa.gov/mmb/aq/
• ZIP/NAQC – NOAA Model Queried at all Domain Land Areas– Data extracted from AQMOS (Sonoma Tech)– http://aqmos.sonomatech.com/login.cfm
• AQMOS – NOAA Model with Seasonal Bias Correction – http://aqmos.sonomatech.com/faq/
• Barons Meteorological Services – MAQSIP RT– 0600 UTC Run– http://www.baronams.com/products/
• SUNY-Albany– 1200 UTC Run, “NYSDEC_3x12z”, CMAQ 4.7.1– http://asrc.albany.edu/research/aqf/aqvis/tomorrowforecast_maps.htm
Comment on Determination of Peak Model 8-Hour O3
Numerical forecast models tend to develop strong sea-land O3 gradients.
At left: See high O3
in embayments andalong NJ coast.
If use entire area to verify model, rather thanforecasts at monitor locations, the number of false alarms of high O3 more than double and bias increases by 7 ppbv (2011 results, NOAA Operational Model).
Ensembles Tested in 2011
• ENS1: NAQC, SUNY, BARONS,ZIP • ENS2: NAQC, SUNY, BARONS, ZIP, AQMOS• ENS3: NAQC, SUNY, BARONS, AQMOS• ENS4: NAQC, ZIP• ENS5: NAQC, ZIP, AQMOS• ENS6: ZIP, AQMOS• ENS7: NAQC, NAQC Experimental• ENS8: NAQC, SUNY, AQMOS• ENS9: NAQC, SUNY, ZIP, AQMOS• ENS10: NAQC, SUNY, Barons• ENS11: SUNY, Barons
NAQC Forecast Results for 2011 (PHL)
• Bias: + 1.7 ppbv• r = 0.75• Best Fit:
– [O3]OBS = 18.3 + 0.68*[O3]NAQC
• For Code Orange Threshold:– Hit Rate: 0.74– False Alarm: 0.42– Threat: 0.48
20 40 60 80 100 120
NAQC Forecast 8-h Ozone (ppbv)
20
40
60
80
100
120
Ob
serv
ed
8-h
Ozo
ne
(p
pb
v)
Hit and False Alarm Rates for Individual ModelsThreshold: 76 ppbv 8-h Average (Code Orange)
NAQC SUNY BARONS ZIP AQMOS PERS0.00
0.10
0.20
0.30
0.40
0.50
0.60
0.70
0.80
0.90
Hit False Alarm
Selected Skill Scores for Individual Models
NAQC SUNY BARONS ZIP AQMOS PERS0.00
0.10
0.20
0.30
0.40
0.50
0.60
0.70
0.80
HeidkeCSI(Threat)
Heidke score measures skill relative to random forecast with range [-1,+1].Threat score removes influence of “correct null” forecasts, useful for rare events
Hit and False Alarm Rates for Ensembles
ENS1 ENS2 ENS3 ENS4 ENS5 ENS6 ENS7 ENS8 ENS9 ENS10 ENS110.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
False Alarm Hit
Skill
Mea
sure
ENS2: All models; ENS9: All models except Barons; ENS11: SUNY, Barons only.Dashed lines are stand alone NAQC model results.
Selected Skill Scores for Ensembles
ENS1 ENS2 ENS3 ENS4 ENS5 ENS6 ENS7 ENS8 ENS9 ENS10 ENS110.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
Heidke Threat
Skill
Mea
sure
ENS2: All models; ENS9: All models except Barons; ENS11: SUNY, Barons only.Dashed lines are stand alone NAQC model results.
Skill Measures for Selected Ensembles,NAQC Model and Persistence
False Alarm Hit Heidke CSI(Threat) PSS0.00
0.10
0.20
0.30
0.40
0.50
0.60
0.70
0.80
0.90
ens2 ens9 ens11 NAQC PERS
Skill
Mea
sure
ENS2: All models; ENS9: All models except Barons; ENS11: SUNY, Barons only.Dashed lines are stand alone NAQC model results.
Summary: Overall Ensemble Results
• Two groups performed best– Ensembles 1-3: NAQC, SUNY and Barons models with
various combinations of NAQC-based forecasts (AQMOS, ZIP)
– Ensembles 8-10: NAQC and SUNY, with various combinations of the other models.
– Ensemble 11: SUNY and Barons only.• Do the ensembles address other shortcomings of the
NAQC model?– Seasonal drift in bias– Weekday/Weekend effects
Seasonal Drift: NAQC Skill Scores for Early and Late Summer for 2011 (left) and 2007-2010 (right)
False Alarm
Hit Heidke Threat0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
< July 7 >= July 7
Skill
Mea
sure
False Alarm
Hit Heidke Threat0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
< July 7 > July 7
Skill
Mea
sure
2011 2007-2010
Seasonal Drift: Ensemble Skill
False Alarm Hit Heidke CSI(Threat)0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
NAQC ENS2 ENS9 ENS11
Skill
Mea
sure
False Alarm Hit Heidke CSI(Threat)0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
NAQC ENS2 ENS9 ENS11
Skill
Mea
sure
Early Summer (< July 7) Late Summer (≥ July 7)
Forecast Skill by Day of Week
False Alarm Hit Heidke CSI(Threat)0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
NAQC ENS2 ENS9 ENS11
Skill
Mea
sure
Sunday-Monday
False Alarm Hit Heidke CSI(Threat)0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
NAQC ENS2 ENS9 ENS11
Skill
Mea
sure
Tuesday-Saturday
Conclusions
• A variety of ensemble combinations can improve on the performance of the NAQC model.
• The seasonal drift problem with the NAQC can be partially corrected using non-NAQC models.
• The Sunday-Monday false alarm problem can also be improved with non-NAQC members.
Acknowledgements
• Funding for air quality forecasting in the Philadelphia metropolitan area is supported by the Delaware Valley Regional Planning Commission (Sean Greene), the Commonwealth of Pennsylvania Department of the Environmental Protection and the State of Delaware.
• Thanks to Dianne Miller (Sonoma Tech), John McHenry (Barons), as well as SUNY-Albany and the New York Department of Environmental Conservation for access to their model forecasts.
Supporting Slides
Skill Score Measures
• Heidke Skill Score– Range: [-1,+1]– Compares the proportion of correct forecasts to a no skill
random forecast that is constrained in that marginal total (hit, miss, false alarm) are equal to observed.
• Threat Score– Also known as Critical Success Index (CSI) or Gilbert Skill
Score– Range: [0,1]– Excludes the “null” forecast, so that is a measure of hits
divided by sum of hits, misses and false alarms.
Example of Use of Back Trajectories Coupled with Observations (August 10, 2011)
24-Hour Forward Trajectories at 1000 m AGLNOAA ARL HYSPLIT Trajectory Modelhttp://ready.arl.noaa.gov/HYSPLIT.phpSuperimposed on AirNow Tech Hourly O3
observations:http://www.airnowtech.org/
Weather Summary for Summer 2011
• In Philadelphia, and the mid-Atlantic as a region, 2011 was a warmer than average summer.– 30 days ≥ 90⁰ F compared to average of 23 (1997-2010)
• Warmest temperatures occurred from May 30-August 3 – the heart of the O3 season – with very wet conditions in August and September.
• In PHL, 19 days were Code Orange or Red compared to an average of 24 (2004-2010).– Prior to regional NOx controls, average of 40 Code Orange
or Red days (1997-2003).
Weather Summary for Philadelphia (Summer, 2011):Higher than Average Frequency of Hot Weather
1997 1999 2001 2003 2005 2007 2009 20110
10
20
30
40
50
60
T > 90 JJAOr JJA
Dotted orange line is mean number of Code Orange days in two periods:1997-2003 and 2004-2010, reflecting “NOx SIP Rule” emissions changes in 2002-2003
Daily Peak and 7-Day Running Average for PHL
20-May
24-May
28-May
1-Jun
5-Jun
9-Jun13-Ju
n17-Ju
n21-Ju
n25-Ju
n29-Ju
n3-Ju
l7-Ju
l11-Ju
l15-Ju
l19-Ju
l23-Ju
l27-Ju
l31-Ju
l4-A
ug8-A
ug
12-Aug
16-Aug
20-Aug
24-Aug
0
20
40
60
80
100
120
PHL7DayRA
Warmest weather was concentrated during the climatological peak of the O3 season
1-May5-M
ay9-M
ay
13-May
17-May
21-May
25-May
29-May
2-Jun6-Ju
n
10-Jun
14-Jun
18-Jun
22-Jun
26-Jun
30-Jun
4-Jul8-Ju
l
12-Jul
16-Jul
20-Jul
24-Jul
28-Jul
1-Aug
5-Aug
9-Aug
13-Aug
17-Aug
21-Aug
25-Aug
29-Aug
-10
-5
0
5
10
15
20
0
20
40
60
80
100
120
Delta Tmax
Dep
artu
re fr
om D
aily
Ave
rage
Tem
pera
ture
(d
eg F
)
Max
imum
Sur
face
Tem
pera
ture
(deg
F)
Blue bars are departure from average temperature at PHL, red lineis daily maximum surface temperature (deg F).
Ozone conducive weather comes to an end with very wet conditions in August
1-May
6-May
11-May
16-May
21-May
26-May
31-May
5-Jun
10-Jun
15-Jun
20-Jun
25-Jun
30-Jun
5-Jul
10-Jul
15-Jul
20-Jul
25-Jul
30-Jul
4-Aug
9-Aug
14-Aug
19-Aug
24-Aug
29-Aug
0
0.5
1
1.5
2
2.5
3Rainfall at PHL (May-August, 2011)
Dai
ly R
ainf
all (
inch
es)
August 15 (4.84”) and August 29 (4.55”) are off scale, wettest Auguston record at PHL with 19.21”
Seasonal Drift in Model BiasNAQC for PHL (2007-2011)
1-May
7-May
13-May
19-May
25-May
31-May
6-Jun
12-Jun
18-Jun
24-Jun
30-Jun
6-Jul
12-Jul
18-Jul
24-Jul
30-Jul
5-Aug
11-Aug
17-Aug
23-Aug
29-Aug
4-Sep
-40
-30
-20
-10
0
10
20
30
40
50
Bias07Bias08Bias09Bias10Bias11Average
NA
QC
Fore
cast
Mod
el B
ias
(ppb
v)
Seasonal Drift in Model BiasNAQC for PHL, 2011
21-May
26-May
31-May
5-Jun
10-Jun
15-Jun
20-Jun
25-Jun
30-Jun
5-Jul
10-Jul
15-Jul
20-Jul
25-Jul
30-Jul
4-Aug
9-Aug
14-Aug
19-Aug
24-Aug
-30
-20
-10
0
10
20
30
OPERCbias
Ope
ratio
nal M
odel
For
ecas
t Bia
s (p
pbv)
1-May 21-May 10-Jun 30-Jun 20-Jul 9-Aug 29-Aug
-6
-4
-2
0
2
4
6
cbias07cbias08cbias09cbias10cbias11
Cum
ulati
ve B
ias
(ppb
v)
23-May
26-May
29-May
1-Jun
4-Jun
7-Jun
10-Jun
13-Jun
16-Jun
19-Jun
22-Jun
25-Jun
28-Jun
1-Jul
4-Jul
7-Jul
10-Jul
13-Jul
16-Jul
19-Jul
22-Jul
25-Jul
28-Jul
31-Jul
3-Aug
6-Aug
9-Aug
12-Aug
15-Aug
18-Aug
21-Aug
24-Aug
-10.0
-8.0
-6.0
-4.0
-2.0
0.0
2.0
4.0
6.0
8.0
SUNYBaronsAQMOS
24-May
27-May
30-May
2-Jun
5-Jun
8-Jun
11-Jun
14-Jun
17-Jun
20-Jun
23-Jun
26-Jun
29-Jun
2-Jul
5-Jul
8-Jul
11-Jul
14-Jul
17-Jul
20-Jul
23-Jul
26-Jul
29-Jul
1-Aug
4-Aug
7-Aug
10-Aug
13-Aug
16-Aug
19-Aug
22-Aug
25-Aug
-40
-30
-20
-10
0
10
20
30
-7
-6
-5
-4
-3
-2
-1
0
AQMOS
BiasCbias