Matrix-analog measure-correlate-predict approach (PO.013)



Comparison of MCP methods

Data used:

12 Czech manned weather stations (measurement height 10 m) and the NCEP/NCAR, ERA Interim and MERRA reanalyses (various height/pressure levels, geostrophic or model wind)

Testing procedure:

- period 2005-2008 (only 4 years were used, to avoid homogeneity issues), 1 h measurement interval

- 12 months of training data (continuous 3-month blocks, seasonally stratified); the remaining data were used for verification

- fixed set of 50 resampled runs
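The resampling of training periods described above can be sketched as follows. This is only an illustration under stated assumptions: the poster does not say how the 3-month blocks are delimited, so calendar quarters stand in for seasons here, and the names `QUARTERS`, `draw_training_blocks`, `make_runs` and the seed value are hypothetical.

```python
import random

# Assumption: calendar quarters stand in for the four seasons.
QUARTERS = {"Q1": (1, 2, 3), "Q2": (4, 5, 6), "Q3": (7, 8, 9), "Q4": (10, 11, 12)}

def draw_training_blocks(years, rng):
    """Pick one continuous 3-month block per quarter, each from a random year."""
    blocks = {}
    for name, months in QUARTERS.items():
        year = rng.choice(years)
        blocks[name] = [(year, month) for month in months]
    return blocks

def make_runs(years, n_runs=50, seed=2015):
    """Return a fixed (seeded) list of training selections, one per run."""
    rng = random.Random(seed)
    return [draw_training_blocks(years, rng) for _ in range(n_runs)]

runs = make_runs([2005, 2006, 2007, 2008])
```

Each run yields 12 training months covering every calendar month once; all months not selected in a run would serve as verification data. Seeding the generator keeps the set of runs fixed, so every tested method sees the same training periods.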

1. Hanslian, D. (2014): Analýza výsledků měření větru. Doctoral dissertation (in Czech). Department of Meteorology and Environmental Protection, Faculty of Mathematics and Physics, Charles University in Prague, Praha, 159 p. http://www.ufa.cas.cz/files/OMET/Hanslian_disertacni_prace.pdf

2. Hanslian, D. (2014): Wind data analysis. Abstract of Doctoral Thesis. Department of Meteorology and Environmental Protection, Faculty of Mathematics and Physics, Charles University in Prague, Praha, 37 p. http://www.ufa.cas.cz/files/OMET/Hanslian_Abstract_of_Doctoral_Thesis.pdf

Requirements on the calculated long-term data differ according to their intended use:

average wind speed => basic information

wind speed distribution => enables energy calculation

wind rose => wake effects, model inputs

individual values => replacement of missing records, "prediction" tasks

For wind resource assessment, both the wind speed distribution and the wind rose are needed, but most published methods do not handle both well.

Comprehensive review of MCP methods: Carta, J. A., Velázquez, S., & Cabrera, P. (2013). A review of measure-correlate-predict (MCP) methods used to estimate long-term wind characteristics at a target site. Renewable and Sustainable Energy Reviews, 27, 362–400.

Abstract

The transformation of measured short-term wind data into the long-term wind climate is a major source of uncertainty in wind resource assessments. The usual approach, called measure-correlate-predict (MCP), employs a long-term wind data series that is correlated with the short-term wind measurement. MCP can be performed by various types of methods; the most common are simple ratio methods, linear regression methods, "matrix" methods and artificial neural networks.

Matrix-analog measure-correlate-predict approach

David Hanslian

Institute of Atmospheric Physics AS CR

PO.013

Results

Objectives

Conclusions

Methods

References

EWEA Resource Assessment 2015 – Helsinki – 2-3 June 2015

The presented MCP approach can be classified as a "matrix method" because it classifies the data into bins, most notably by the wind speed and wind direction of the reference series. An additional feature is the application of the analog principle, which makes it possible to preserve the relationship between wind speed and wind direction. As a result, the complete wind climatology, including the wind rose, is simulated. The matrix-analog approach can be implemented in several ways; two different options were proposed.

Verification confirmed that the matrix-analog approach enables reliable simulation of the long-term wind speed distribution as well as the wind rose. A comparison with regression MCP methods showed that, for the long-term average wind speed, the matrix-analog approach performs well, similarly to the linear regression method. For the wind speed distribution, the matrix-analog approach clearly outperforms simple approaches such as the variance ratio method. We suggest a more ambitious comparison that would also include more elaborate alternative methods, such as those implemented in commercial software or artificial neural networks.

"Matrix" methods

= a wide group of methods in which the data are separated into classes / bins (so that a "matrix" of bins is employed). In other respects, the label "matrix method" is used for various approaches that often differ in principle!

Proposed "matrix-analog" approach

= a general approach for simulating the long-term wind speed distribution and the wind rose at once, preserving the relationship between wind speed and wind direction.

The result is an artificial wind data series (this may be an advantage in subsequent use of the data).

1) Define (reference) bins

i) the reference data are binned into uniform "basic" bins by wind speed and direction

ii) the "basic" bins are merged by any suitable algorithm, so that the resulting bins contain enough data in the "reference" period (the period of concurrent measurements)
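Step 1 can be sketched as below. The poster leaves the merging algorithm open ("any algorithm"), so the greedy rule here is only one hypothetical choice, not the scheme shown on the poster: within each wind speed row, neighbouring direction sectors are pooled until a minimum record count is reached. The function names, the 1 m/s bin width and `min_count` are assumptions.

```python
from collections import Counter

def basic_bin(speed, direction, speed_width=1.0, n_sectors=12):
    """Return the basic bin (speed_bin, sector) of one record; direction in degrees."""
    sector_width = 360.0 / n_sectors
    # Centre the first sector on north (0 degrees).
    sector = int(((direction + sector_width / 2) % 360.0) // sector_width)
    return int(speed // speed_width), sector

def merge_bins(records, min_count=3, n_sectors=12):
    """Map every basic bin of a populated speed row to a merged-bin id k."""
    counts = Counter(basic_bin(v, d, n_sectors=n_sectors) for v, d in records)
    mapping, k = {}, 0
    for speed_bin in sorted({s for s, _ in counts}):
        pending, total, row_ids = [], 0, []
        for sector in range(n_sectors):
            pending.append((speed_bin, sector))
            total += counts.get((speed_bin, sector), 0)
            if total >= min_count:          # enough data: close this merged bin
                for b in pending:
                    mapping[b] = k
                row_ids.append(k)
                k += 1
                pending, total = [], 0
        if pending:                         # leftovers at the end of the row
            if row_ids:
                target = row_ids[-1]        # join the last merged bin of the row
            else:
                target, k = k, k + 1        # sparse row: one bin for the whole row
            for b in pending:
                mapping[b] = target
    return mapping
```

In this sketch, speed rows with no records in the reference period are not mapped at all, and sparse rows end up as a single under-filled bin; a practical merging scheme would also pool across neighbouring speed rows.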

2) Find the analog

Each time record of the "target" (long-term) period is assigned a time record from the training (reference) period such that both records fall into the same bin. Heuristics can be used to find the most relevant analog.
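A minimal sketch of step 2, assuming the merged-bin id of the reference series is already known for every record. The analog is drawn at random from the training records in the same bin; this random draw is an assumption standing in for the unspecified heuristics, and `find_analogs` is a hypothetical name.

```python
import random
from collections import defaultdict

def find_analogs(training_bins, longterm_bins, seed=0):
    """For each long-term record, return the index of a training record
    from the same merged bin, or None if the bin is empty in training."""
    rng = random.Random(seed)
    by_bin = defaultdict(list)
    for t, k in enumerate(training_bins):
        by_bin[k].append(t)
    analogs = []
    for k in longterm_bins:
        candidates = by_bin.get(k)
        analogs.append(rng.choice(candidates) if candidates else None)
    return analogs

# Merged-bin ids of the reference series in the training and long-term periods.
analogs = find_analogs([0, 1, 1, 2], [1, 0, 3, 2, 1])
```

A more selective heuristic could, for example, prefer training records from a similar time of day or season; bins left without an analog (`None`) are exactly what the merging in step 1 is meant to avoid.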

3) Calculate

The analog time record is used to calculate the wind speed and wind direction of the target record.

"Method 1": wind speed ratios and wind direction differences are applied to the reference data.

"Method 2": the wind data of the target series are used directly, based on a joint probabilistic approach.
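The two calculation options might look like the sketch below. The poster gives only the one-line descriptions above, so the formulas are one possible reading rather than the author's exact definitions: Method 1 scales the long-term reference wind speed by the target/reference ratio of the analog record and shifts the reference direction by the analog's direction difference; Method 2 copies the target-series values of the analog record. The function names and the handling of calm analogs are assumptions.

```python
def method1(v_ref_long, d_ref_long, v_ref_train, d_ref_train, v_tgt_train, d_tgt_train):
    """Apply the analog's speed ratio and direction difference to reference data.

    *_long  : reference series at the long-term record
    *_train : reference / target series at the analog (training) record
    """
    # Assumption: a calm analog (zero reference speed) leaves the speed unscaled.
    ratio = v_tgt_train / v_ref_train if v_ref_train > 0 else 1.0
    shift = d_tgt_train - d_ref_train
    return v_ref_long * ratio, (d_ref_long + shift) % 360.0

def method2(v_tgt_train, d_tgt_train):
    """Use the target-series values of the analog record directly."""
    return v_tgt_train, d_tgt_train

v, d = method1(6.0, 350.0, 5.0, 340.0, 4.0, 10.0)
```

In the example, the analog has a speed ratio of 0.8 and a direction difference of 30 degrees (340 to 10, across north), so the simulated target record is about 4.8 m/s at 20 degrees.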

4) Correct

Some statistical properties (e.g. the average wind speed) can be distorted, so a correction may improve the final result.

[Figure: scheme of the algorithm used for merging the "basic" bins. The table refers to the data of the reference series; rows = wind speed bins, columns = wind direction bins.]

[Figure: example of a simulated wind data series]

[Figure: example of a simulated wind rose]

The matrix-analog approach enables reliable simulation of the complete long-term wind climatology.

Its performance is very good compared to simple MCP approaches.

A comparison with more complex MCP methods (e.g. artificial neural networks, commercial software) would be interesting. A round-robin test?

The performance of any MCP method strongly depends on the data used. If no "perfect" nearby reference site is available, using reanalyses is the safer bet.

Rows = reference series, columns = target series; all values in %.

| Reference \ Target | Doksany | Kopisty | B-Tuřany | O-Poruba | P-Libuš | Kuchařovice | P-Ruzyně | Č.Buděj. | Cheb | Luká | K.Myslová | Milešovka | avg. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Doksany | - | 7.68 | 8.97 | 5.16 | 3.98 | 5.27 | 4.07 | 5.83 | 7.97 | 5.64 | 4.58 | 4.11 | 5.75 |
| Kopisty | 9.72 | - | 11.29 | 6.17 | 7.33 | 7.87 | 7.38 | 8.10 | 8.64 | 9.06 | 8.81 | 7.20 | 8.32 |
| B-Tuřany | 8.51 | 6.61 | - | 5.16 | 3.73 | 2.97 | 4.29 | 4.43 | 4.07 | 3.88 | 5.71 | 3.95 | 4.85 |
| O-Poruba | 10.73 | 6.41 | 7.75 | - | 6.06 | 6.40 | 6.87 | 7.06 | 6.69 | 7.08 | 8.46 | 6.26 | 7.25 |
| P-Libuš | 5.79 | 5.52 | 6.12 | 3.97 | - | 2.99 | 1.27 | 3.33 | 4.86 | 3.90 | 4.19 | 2.45 | 4.04 |
| Kuchařovice | 6.79 | 5.65 | 5.32 | 4.34 | 2.55 | - | 3.14 | 3.79 | 4.77 | 3.51 | 4.22 | 3.33 | 4.31 |
| P-Ruzyně | 5.59 | 5.36 | 6.23 | 3.70 | 1.11 | 2.97 | - | 3.16 | 4.90 | 3.82 | 3.97 | 2.16 | 3.91 |
| Č.Buděj. | 7.33 | 4.91 | 5.80 | 4.86 | 2.27 | 3.04 | 2.63 | - | 4.35 | 3.97 | 4.24 | 2.66 | 4.19 |
| Cheb | 9.64 | 5.99 | 5.57 | 5.66 | 4.35 | 4.39 | 4.92 | 4.41 | - | 5.05 | 6.55 | 4.56 | 5.55 |
| Luká | 6.98 | 5.74 | 5.45 | 4.74 | 2.51 | 2.94 | 3.01 | 3.81 | 4.97 | - | 4.59 | 2.77 | 4.32 |
| K.Myslová | 5.61 | 6.44 | 7.36 | 4.66 | 2.65 | 3.28 | 2.94 | 4.03 | 6.95 | 3.83 | - | 2.34 | 4.55 |
| Milešovka | 7.23 | 5.61 | 6.83 | 3.61 | 2.36 | 3.08 | 2.18 | 4.03 | 5.39 | 4.52 | 4.20 | - | 4.46 |
| nc_925g | 6.95 | 5.01 | 6.36 | 4.62 | 1.73 | 1.62 | 1.77 | 3.17 | 4.75 | 3.11 | 3.71 | 1.38 | 3.68 |
| nc_925w | 6.41 | 5.10 | 6.86 | 4.24 | 1.83 | 2.18 | 1.74 | 3.17 | 5.59 | 3.70 | 3.53 | 1.48 | 3.82 |
| era_1000w | 5.92 | 5.59 | 4.96 | 3.23 | 1.24 | 1.45 | 1.11 | 3.41 | 5.21 | 2.49 | 3.97 | 1.60 | 3.35 |
| era_925g | 6.77 | 5.03 | 5.68 | 3.56 | 1.40 | 1.92 | 1.51 | 3.76 | 5.23 | 3.12 | 4.05 | 1.18 | 3.60 |
| era_925w | 6.52 | 5.23 | 5.87 | 3.44 | 1.73 | 1.97 | 1.69 | 3.28 | 4.89 | 2.89 | 3.86 | 1.32 | 3.56 |
| me_10m | 6.51 | 5.50 | 5.64 | 3.54 | 1.26 | 1.65 | 1.37 | 2.98 | 5.13 | 2.55 | 3.78 | 1.35 | 3.44 |
| avg. | 7.24 | 5.73 | 6.59 | 4.39 | 2.83 | 3.29 | 3.05 | 4.22 | 5.55 | 4.24 | 4.85 | 2.95 | 4.61 |
| Null method | 11.20 | 7.17 | 7.58 | 5.78 | 5.45 | 5.22 | 6.91 | 6.48 | 6.59 | 6.81 | 6.45 | 5.41 | 6.76 |

Average wind speed is best simulated by linear regression and by both matrix-analog methods.

Wind speed distribution and power density are best simulated by the matrix-analog methods; they strongly outperform the variance ratio method.

Results of the regression methods are not relevant: simulation of residuals would have to be added (not tested).

Individual values of wind speed are best simulated by linear regression. The uncertainty of the matrix-analog methods is higher because their individual values include variability.

Simple MCP methods are compared with matrix-analog Methods 1 and 2. Column DD shows the number of directional sectors (bins) considered by the respective method. The numbers are the RMSE of the given metric over 50 runs and 13 reference/target station pairs. Wind rose agreement is measured by differences of frequencies (freq.) and by the Kolmogorov-Smirnov integral (KSI).
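The poster does not give formulas for these scores, so the following is only a sketch under assumptions: RMSE is taken over a list of errors (for instance relative errors of the mean wind speed, one per run and station pair), and the wind rose frequency difference is taken as the sum of absolute differences between sector frequencies. The KSI is not reproduced here. All function names are hypothetical.

```python
import math

def rmse(errors):
    """Root-mean-square of a list of errors (e.g. relative errors per run)."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def sector_frequencies(directions, n_sectors=12):
    """Relative frequency of each direction sector; directions in degrees."""
    width = 360.0 / n_sectors
    counts = [0] * n_sectors
    for d in directions:
        # First sector is centred on north (0 degrees).
        counts[int(((d + width / 2) % 360.0) // width)] += 1
    return [c / len(directions) for c in counts]

def frequency_difference(simulated_dirs, observed_dirs, n_sectors=12):
    """Sum of absolute sector-frequency differences (0 = identical roses)."""
    sim = sector_frequencies(simulated_dirs, n_sectors)
    obs = sector_frequencies(observed_dirs, n_sectors)
    return sum(abs(s - o) for s, o in zip(sim, obs))
```

With this definition the frequency difference ranges from 0 (identical wind roses) to 2 (no overlap between the simulated and observed sectors).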

Comparison by data used

Methods 1 and 2 perform similarly for wind speed statistics; the wind rose is better simulated by Method 2.

Dividing the data into bins by wind direction improves the result, but (with the exception of wind rose simulation in Method 1) there is no significant difference between using 12 and 36 sectors.

In general, more detailed "merged" bins improve the results, but making them too small may lead to "overfitting". Whether this becomes an issue depends on the method of merging bins, the structure of the basic bins, the calculation and correction methods and the data used, so there is no simple answer.

RMSE of the simulated average wind speed for combinations of reference and target data, computed with matrix-analog Method 1 (36 directions) over 50 runs. Numbers shown in red on the poster mark the cases where the distance between the reference and target station is more than 100 km.

The lower reference series are derived from reanalyses (nc = NCEP/NCAR, era = ERA Interim, me = MERRA; w = modelled wind, g = geostrophic wind).

This graph is related to the table on the left.

The "improvement rate" is the ratio between the error in the prediction of average wind speed by the matrix-analog MCP and by the "Null method". The "Null method" means using the original short-term (1 year) data directly.

The "improper sites" are Doksany, Kopisty, O-Poruba and Cheb. They lie in valleys or basins, so their wind regime is governed by local orography or is isolated from the large-scale wind regime. Except for these "bad" reference stations, MCP improves the results even when the correlation between the reference and target wind speed series is low.
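As a worked example of the improvement rate defined above, the snippet below applies the stated ratio to the "avg." and "Null method" rows of the RMSE table. The function name is hypothetical; the input values are taken from the table.

```python
def improvement_rate(mcp_error, null_error):
    """Ratio of the MCP error to the Null-method error (below 1 = MCP helps)."""
    return mcp_error / null_error

# Overall averages from the RMSE table (in %): MCP 4.61, Null method 6.76.
overall = improvement_rate(4.61, 6.76)

# Averages over all reference series for two individual target stations.
doksany = improvement_rate(7.24, 11.20)
milesovka = improvement_rate(2.95, 5.41)
```

The overall ratio comes out at roughly 0.68, meaning that on average the matrix-analog MCP leaves about two thirds of the error of using the one-year data directly.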

[Two schematic diagrams; labels: series (ref., target) and period (training, target)]

T_t, T_d – time record from the training (t) and target (d) period

v_r – wind speed (v) of the reference (r) series

D_c – wind direction (D) of the target (c) series

k – refers to a particular "merged" bin