seoul, 23/26-06-2013 sequential design of experiment · • assessing roi of an automated procedure...

SEQUENTIAL DESIGN OF EXPERIMENT

SEOUL, 23/26-06-2013

i4C Innovation Powered by Analytics

AGENDA

• COMPANY OVERVIEW

• DRIVERS FOR THE RESEARCH: Needs & issues, Feasibility & Savings

• METHODOLOGY: Sequential improvements

• RESULTS



COMPANY OVERVIEW

FAST FACTS

SOFTWARE VENDOR FOCUSED ON ANALYTICS

FOUNDED IN 2002

OFFICES: • MILAN

• ROME

• BOLOGNA

• LONDON

REVENUES: 9 M$

EMPLOYEES: 75

CUSTOMERS: 70+


COMPANY OVERVIEW

VISION

IN A WORLD OF NUMBERS, WE PROVIDE YOU THE ONES UPON WHICH YOU CAN BASE YOUR DECISIONS.

ORGANIZATIONS REACH EXCELLENCE WHEN THEY USE

• DATA TO ACT,

• ADVANCED ANALYTICS TO FORECAST, PREDICT AND OPTIMIZE,

• APPLICATIONS TO DRIVE EFFECTIVE INFORMATION AT THE POINT OF DECISION.

COMPANY OVERVIEW

VALUE PROPOSITION

WE DELIVER ADVANCED ANALYTIC APPLICATIONS FOR SPECIFIC INDUSTRIES AND BUSINESS PROCESSES.

WE ENABLE ANY ORGANIZATION AND USER TO USE PREDICTIVE ANALYTICS IN REAL TIME AND WITH HIGH AUTOMATION.

WHY I4C

VERTICAL KNOWLEDGE

COMPETITIVE POSITION & KEY DIFFERENTIATORS

PLAYGROUND TOOLS VS AAAs

WHEN CLASSICAL ANALYTICS TOOLS WOULD FAIL:

Users with little or no methodological skills

Large problems in need of automation

Integration into Business Process is essential

Strong Vertical Market knowledge is key

KEY DIFFERENTIATORS

• INDUSTRY SPECIFIC: i4C apps embed industry knowledge

• BUSINESS DRIVEN: i4C apps are designed for business users and focused on business results, hiding advanced analytics complexity

• ACTIONABLE: i4C apps are based on a Framework that can be easily integrated in an enterprise operational environment to drive business process action

[Figure: positioning of AAAs relative to Advanced Analytics Tools, Visualization Tools and Business Intelligence Tools]

I4C PILLARS


Need of forecasting
Why does one need a good forecasting method?

In terms of figures
Let’s estimate the difference (if any)

Issues in forecasting
Some problems to overcome

NEED OF FORECASTING

Demand forecaster (power)

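[Diagram: actors and data flows in the demand forecasting process]

• Actors: Distributors, TERNA, Reseller, Market Operator Energy

• Data and steps: master data and consumption data; AdR Load Profiling, PRA and CRPU; data consolidation (switch a/p); load curves; demand forecast for active customers; scheduling; national network balance

• Diagram labels: 9k, 4k, 3k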

ISSUES IN FORECASTING

Let us highlight the main issues of a forecasting process in the energy sector:

• Time horizon

• Instability

• Automation

• Set of relevant information

• Aggregation level

• Time consuming

• Human action

N.B. In what follows, reducing the amount of information to collect will be considered essential.

In short, we continuously face forecasting problems with:

• Large scale, impossible to simplify into aggregated stochastic processes

• High update frequency and fast model obsolescence

• Few human resources to maintain performance

DRIVERS OF THE RESEARCH: ROI

• An automated procedure for forecasting model identification could deliver significant cost savings, so its ROI is assessed here.

• We evaluated 3 scenarios for small, medium and large operators in the gas and power markets (unbalancing errors and basic figures are derived from experience and best practices).

Gas market, as is:

Operator   # of consumption   Gas portfolio   Portfolio     UNPE   Gas unbalancing   As-is unbalancing
size       points             (10^6 m3)       unbalancing          (10^6 m3)         cost (€)
Small      100                200             15%           30%    30.00             952,500.00
Medium     2500               3000            10%           25%    300.00            9,525,000.00
Large      4000               10000           5%            20%    500.00            15,875,000.00

Gas market, improvement:

Operator size   UNPE improvement   Gain (€)        Cost improvement
Small           1.5%               190,500.00      20%
Medium          1.0%               2,381,250.00    25%
Large           0.5%               6,350,000.00    40%
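As a quick check of the gas figures above, the sketch below recomputes the "Cost improvement" column. It assumes that this column is simply the gain divided by the as-is unbalancing cost; the slides do not state the formula, but the three rows are consistent with it. The dictionary and function names are illustrative only.

```python
# Gas scenarios, transcribed from the table above (costs and gains in EUR).
gas_scenarios = {
    "Small":  {"as_is_cost": 952_500.00,    "gain": 190_500.00},
    "Medium": {"as_is_cost": 9_525_000.00,  "gain": 2_381_250.00},
    "Large":  {"as_is_cost": 15_875_000.00, "gain": 6_350_000.00},
}


def cost_improvement(as_is_cost: float, gain: float) -> float:
    """Share of the as-is unbalancing cost recovered by the gain.

    Assumption: 'Cost improvement' = gain / as-is cost.
    """
    return gain / as_is_cost


cost_improvements = {
    size: cost_improvement(**figures) for size, figures in gas_scenarios.items()
}

for size, share in cost_improvements.items():
    print(f"{size}: {share:.0%}")
```

Under this assumption the three shares come out at 20%, 25% and 40%, matching the last column of the improvement table.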

DRIVERS OF THE RESEARCH: ROI

• In the gas market the problem has low complexity, and larger players gain more from adopting automated techniques.

• In the power market, instead, larger players are already efficient thanks to better-quality SCADA data and less volatile types of consumption:

Power market, as is:

Operator   # of consumption   Power portfolio   Portfolio     UNPE   Power unbalancing   As-is unbalancing
size       points             (GWh)             unbalancing          (GWh)               cost (€)
Small      2000               200               10%           28%    20.00               102,000.00
Medium     5000               3000              8%            25%    225.00              1,530,000.00
Large      15000              10000             3%            14%    300.00              5,100,000.00

Power market, improvement:

Operator size   UNPE improvement   Gain (€)        Cost improvement
Small           2.0%               57,120.00       56%
Medium          1.0%               680,000.00      44%
Large           0.25%              1,983,333.33    39%



METHODOLOGY

Main Drivers

• Reducing the number of covariates needed

• Successfully exploring a reduced region of all possible combinations

• Cluster Analysis

• Sequential Design: First Design, «Meta-model», Neighbourhood, Fine tuning of parameters

• Benchmarking

CLUSTER ANALYSIS

Data

Data are taken from the Italian daily gas distribution network.

Selecting information

• Create subsets of time series from our dataset in order to train our model on homogeneous data

• Exclude “error” clusters and keep clusters with similar “profiles”.

N.B. For each cluster under consideration, a training sample is taken.

SEQUENTIAL DESIGN: COVARIATES AND MODEL

A study based on R^2 highlights the advantages of fitting a generalized additive model (GAM) on homogeneous clusters. The set of all possible covariates to be managed by GAMs can thus be grouped as follows.

Linear

• Week: WeekThursday, WeekFriday, WeekSaturday, WeekSunday, WeekMonday, WeekTuesday

• Month: MonthDecember, MonthJanuary, MonthFebruary, MonthMarch, MonthApril, MonthMay, MonthJune, MonthJuly, MonthAugust, MonthSeptember, MonthOctober

• Holidays: HolyHolidays, HolyITRepDay, HolyITImmConception, HolyAllSaints, HolyAssumption, HolyBoxingD, HolyNewYearD, HolyChristmas, HolyEaster, HolyEasterMon, HolyEpiphany, HolyITLiberD, HolyLaborD, HolyAugustVac

• Long weekends: LongWENYeartoEpiphany, LongWEXmasToEpiphany, LongWELongWESatSun, LongWELongWESun

Smooth

• Lags: LAG1, LAG2, LAG3, LAG4, LAG5, LAG6, LAG7, LAG14

• Weather: Hdd, MinTmp, MinDew, MinDewMa3, MinDewMa7, MaxTmp, MaxDew, MaxDewMa3, MaxDewMa7, WndSpd, TmpH15, TmpH15Ma3, TmpH15Ma7, TmpH15Feel, RelHum

In what follows, a subset of linear and smooth variables will be called an experiment.

SEQUENTIAL DESIGN: EXPERIMENTS AND KPI

Estimating and forecasting with a GAM generates an error on each experiment, which is monitored through a KPI to be defined. As the cluster variance may be non-negligible, an appropriate KPI takes into account the presence of non-uniform time series. Let E be an evaluated experiment and P a time series in the considered sample. Then, the KPI per experiment can be set as

$$\mathrm{UNPE}_E = \underset{P \in Q_E}{\mathrm{mean}}\left(\mathrm{UNPE}_{E,P}\right)$$

where

• $\mathrm{UNPE}_{E,P}$ is the unbalanced percentage error for experiment E on time series P

• $Q_E$ is the set of all time series that score a $\mathrm{UNPE}_{E,P}$ under the ninth decile.

In what follows, this KPI will be simply referred to as UNPE.

CORE IDEA: estimating the dependence of the UNPE on the presence or absence of covariates in the experiments.
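The KPI defined above is a trimmed mean, which a short sketch may make concrete. The per-series error is taken as given, since the slides do not spell out its formula. Treating "under the ninth decile" as "less than or equal to the ninth decile", and computing the decile with Python's default quantile method, are both assumptions; the function name and the data are hypothetical.

```python
import statistics


def trimmed_unpe(errors_by_series: list[float]) -> float:
    """Mean of the per-series errors that do not exceed the ninth decile."""
    if len(errors_by_series) < 2:
        raise ValueError("need at least two time series")
    # statistics.quantiles(n=10) returns the nine decile cut points;
    # index 8 is the ninth decile.
    ninth_decile = statistics.quantiles(errors_by_series, n=10)[8]
    kept = [e for e in errors_by_series if e <= ninth_decile]
    return statistics.fmean(kept)


# Hypothetical per-series errors: nine regular series and one outlier.
example_errors = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 1000.0]
example_kpi = trimmed_unpe(example_errors)
print(example_kpi)
```

In this toy sample the outlier is dropped and the KPI is the mean of the nine remaining series, which illustrates why a trimmed KPI is less sensitive to a few odd time series than a plain mean.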

SEQUENTIAL DESIGN

A sequential design of experiments is built along the following path:

Exp   Temp   Holy   X_n
1     0      1      1
2     1      0      1
3     0      0      1
4     0      0      1
5     0      1      1

• Compute UNPE on a sample

Exp   UNPE    Temp   Holy   X_n
1     21.87   0      1      1
2     22.45   1      0      1
3     16.33   0      0      1
4     15.54   0      0      1
5     20.12   0      1      1

• Relate UNPE to the presence or absence of variables through the Meta-model

• The «Meta-model» suggests further designs
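To illustrate what "relating UNPE to the presence or absence of variables" can mean, the sketch below computes a simple presence contrast on the five experiments listed above. This is a deliberately simplified stand-in, not the Meta-model used in the study (which is chosen later among rpart, lm, mars and other methods); the function name is hypothetical.

```python
from statistics import fmean

# Experiments from the table above: UNPE plus presence (1) / absence (0) flags.
experiments = [
    {"UNPE": 21.87, "Temp": 0, "Holy": 1, "X_n": 1},
    {"UNPE": 22.45, "Temp": 1, "Holy": 0, "X_n": 1},
    {"UNPE": 16.33, "Temp": 0, "Holy": 0, "X_n": 1},
    {"UNPE": 15.54, "Temp": 0, "Holy": 0, "X_n": 1},
    {"UNPE": 20.12, "Temp": 0, "Holy": 1, "X_n": 1},
]


def presence_effect(rows, variable):
    """Mean UNPE with the variable present minus mean UNPE with it absent.

    Returns None when the variable is always present or always absent,
    because the contrast is then undefined.
    """
    present = [r["UNPE"] for r in rows if r[variable] == 1]
    absent = [r["UNPE"] for r in rows if r[variable] == 0]
    if not present or not absent:
        return None
    return fmean(present) - fmean(absent)


effects = {v: presence_effect(experiments, v) for v in ("Temp", "Holy", "X_n")}
print(effects)
```

A negative contrast would suggest that including the variable tends to lower the error. X_n is present in all five rows, so no contrast can be computed for it, which is one reason a balanced starting design matters.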

FIRST DESIGN

Several possible designs (D-, G-, V-optimality) have been evaluated, and a D-optimal design has been chosen as the starting point of the path. Feature: a D-optimal design minimizes the generalized variance of the parameter estimates. Then, set the following constraints and the effects to be investigated. Both aspects contribute to determining the number of experiments needed.

Effects

• Main effects

• Interactions between “lag” variables

• Interactions between “lag” variables and Holiday dummies

• …

Constraints

• 5-40 variables considered

• Not more than 5 “lag” variables

• Not more than 8 weather variables

N.B. This is the only kind of «knowledge» that has been introduced into the sequential design.
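The three constraints above can be expressed as a small validity check on an experiment, seen as a set of covariate names. The sketch assumes that lag and weather covariates can be recognized by name prefix, and that Hdd counts as a weather variable; neither rule is stated in the slides, and the function name is hypothetical.

```python
# Assumption: lag and weather covariates are recognized by name prefix,
# using the covariate names listed in the deck.
LAG_PREFIX = "LAG"
WEATHER_PREFIXES = ("Hdd", "MinTmp", "MaxTmp", "MinDew", "MaxDew",
                    "WndSpd", "TmpH15", "RelHum")


def satisfies_constraints(experiment: set[str]) -> bool:
    """Check the three design constraints on a set of covariate names."""
    n_lag = sum(name.startswith(LAG_PREFIX) for name in experiment)
    n_weather = sum(name.startswith(WEATHER_PREFIXES) for name in experiment)
    return 5 <= len(experiment) <= 40 and n_lag <= 5 and n_weather <= 8


# Hypothetical experiments.
ok_experiment = {"LAG1", "LAG7", "Hdd", "MinTmp", "WeekSunday", "HolyEaster"}
too_many_lags = {"LAG1", "LAG2", "LAG3", "LAG4", "LAG5", "LAG6", "WeekSunday"}
too_small = {"LAG1", "Hdd"}

print(satisfies_constraints(ok_experiment))
print(satisfies_constraints(too_many_lags))
print(satisfies_constraints(too_small))
```

A check of this kind can be reused later to filter candidate designs in the neighbourhood step.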

Let us show an example.

FIRST DESIGN

Example of a design in 3 variables

Constraints and effects: main effects are investigated.

The D-optimal algorithm generates a 9-experiment design. The optimization method also aims at a design in which each variable is tested in approximately the same number of experiments.

Exp         Temp   Holy   X_n
1           1      1      0
2           0      1      0
3           1      0      0
4           1      0      0
5           0      1      0
6           0      1      1
7           1      0      1
8           1      1      0
9           0      0      1
Marginals   5      5      3
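The marginals of the example design, and the quantity a D-optimal search tries to maximize, can be recomputed with a few lines. The sketch assumes a main-effects model with an intercept (the slides do not give the model matrix) and uses the standard D-criterion, the determinant of X'X; the determinant helper is a plain Gaussian elimination written for this illustration.

```python
# The 9-experiment design from the table above (columns: Temp, Holy, X_n).
design = [
    (1, 1, 0), (0, 1, 0), (1, 0, 0),
    (1, 0, 0), (0, 1, 0), (0, 1, 1),
    (1, 0, 1), (1, 1, 0), (0, 0, 1),
]

# Marginals: number of experiments in which each variable is present.
marginals = [sum(column) for column in zip(*design)]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    n, det = len(a), 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


# Assumed model matrix: intercept plus the three presence flags.
X = [(1,) + row for row in design]
# Information matrix X'X; D-optimality seeks to maximize its determinant.
xtx = [[sum(r[i] * r[j] for r in X) for j in range(4)] for i in range(4)]
d_criterion = determinant(xtx)

print(marginals)
print(d_criterion)
```

The marginals reproduce the last row of the table. Maximizing the determinant of X'X is equivalent to minimizing the generalized variance of the parameter estimates, which is the feature of D-optimal designs recalled earlier.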

«META-MODEL»: Neighbourhood

Next, we need to decide where to direct the exploration of the design space (2^59 ≈ 6 ∙ 10^17 experiments!):

Define a neighbourhood of our best designs and let a Meta-model predict the minima of UNPE.

Let us first focus on the neighbourhood, proceeding by sensitivity rather than specificity, and set:

Neighbourhood

• Center -> 20 best experiments (minima of UNPE)

• Width -> 20000 designs

• Probability -> 20%-80% for each variable, according to its presence in the best designs

In the end, we only keep new designs which also satisfy our constraints (see the FIRST DESIGN slide). This selection returns a solid neighbourhood of about 15000 experiments, whose UNPE is then predicted by the Meta-model.

«META-MODEL»: Neighbourhood

The binomial probability used to include each variable in a neighbouring design is based on the presence or absence of that variable in the previous best experiments.

N.B. Probabilities under 0.2 are shifted to 0.2 and those over 0.8 are shifted to 0.8, thus reducing the risk of getting stuck around local minima.
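A minimal sketch of the neighbourhood generation described above: per-variable inclusion probabilities are estimated from the best designs, clipped to [0.2, 0.8], and used to draw candidate designs that are then filtered by a validity rule. Keeping only distinct candidates, the toy validity rule and all names are assumptions made for the illustration.

```python
import random


def inclusion_probabilities(best_designs, low=0.2, high=0.8):
    """Presence frequency of each variable in the best designs, clipped to [low, high]."""
    n_designs = len(best_designs)
    frequencies = [sum(column) / n_designs for column in zip(*best_designs)]
    return [min(max(f, low), high) for f in frequencies]


def sample_neighbourhood(best_designs, width, is_valid, seed=0):
    """Draw `width` candidate designs and keep the distinct valid ones."""
    rng = random.Random(seed)
    probabilities = inclusion_probabilities(best_designs)
    candidates = set()
    for _ in range(width):
        # One Bernoulli draw per variable: 1 = present, 0 = absent.
        design = tuple(int(rng.random() < p) for p in probabilities)
        if is_valid(design):
            candidates.add(design)
    return candidates


# Hypothetical best designs over four variables (1 = present, 0 = absent).
best = [(1, 0, 1, 0), (1, 0, 1, 1), (1, 0, 0, 1), (1, 0, 1, 0)]
probs = inclusion_probabilities(best)
# Toy validity rule standing in for the real constraints: at least two variables.
neighbourhood = sample_neighbourhood(best, width=200,
                                     is_valid=lambda d: sum(d) >= 2)
print(probs)
print(len(neighbourhood))
```

The clipping keeps every variable reachable: a covariate that never appears in the current best designs still enters about one candidate in five, so the search can move away from its current centre.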

«META-MODEL»: KENDALL CORRELATION

Next step: finding a Meta-model able to detect the best designs before testing them on the time series.

Simulate several scenarios via a bootstrap simulation with 1000 repeats, and discard methods that score a low Kendall correlation between the predicted UNPE of not yet tested experiments and their actual UNPE.

Method   Kendall correlation
Rpart    0.28
Blm      0.05
Btlm     0.05
Bcart    -0.01
Lm       0.31
Mars     0.02
Bn       NA

Bayesian methods (network, lm, …) turn out not only to have a low Kendall correlation, but also to be too «slow» to compute.

Further analyses have been conducted for the rpart, lm and mars methods.
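The screening criterion above is a rank correlation between predicted and observed UNPE. The sketch below implements Kendall's tau-a, which coincides with the more common tau-b when there are no ties; which variant was used in the study is not stated, and the data are hypothetical.

```python
from itertools import combinations


def kendall_tau(predicted, observed):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs."""
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    score = 0
    for (p1, o1), (p2, o2) in combinations(zip(predicted, observed), 2):
        product = (p1 - p2) * (o1 - o2)
        if product > 0:
            score += 1      # the pair is ranked the same way by both
        elif product < 0:
            score -= 1      # the pair is ranked in opposite ways
    n_pairs = len(predicted) * (len(predicted) - 1) // 2
    return score / n_pairs


# Hypothetical predicted vs. observed UNPE for four untested experiments.
predicted_unpe = [16.0, 18.5, 21.0, 22.0]
observed_unpe = [15.5, 20.1, 16.3, 22.4]
tau = kendall_tau(predicted_unpe, observed_unpe)
print(tau)
```

A rank-based measure fits the purpose here: the Meta-model only needs to order candidate experiments correctly so that the most promising ones are tested first, not to predict their UNPE exactly.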

«META-MODEL»: EFFICIENCY

Exploring the behaviour of the different methods, the focus stays on maximizing the probability (in the graph, «exp.true») of finding good experiments over a set of 5000 experiments. Other parameters will be discussed in the results.

METAMODEL AT WORK

Step by step, an improvement in the UNPE is expected. Depending on the parameter scenario, a different improvement profile is produced. In the present research, fast improvement in the first steps is preferred, as in Profile 1.

[Figure: UNPE improvement over the AVAILABLE TIME; curves: Desired, Profile 1, Profile 2]

METAMODEL AT WORK

Step by step, new experiments introduced by the Meta-model keep close to the minimum, in terms of trimmed UNPE.

[Figure: trimmed UNPE by steps]

BENCHMARKING

A comparable benchmark consists of a generalized additive model (GAM), for which the following options are set in the automation process:

• Automatic selection of the degrees of freedom for each variable. This option also allows the effects of smooth variables to be shrunk completely to zero in the estimation process (shrinkage).

• Concerning linear variables, a shrinkage has been adopted so that the number of variables considered is comparable to the constraints of the sequential experiments. The subset of linear effects to be estimated is the union of those obtained by a backward stepwise algorithm, run with respect to the linear variables on the sample of time series.



RESULTS

• Cluster analysis

• Meta-model selection

• Forecasting

• Changing scale

• Benchmarking

• Values

CLUSTER ANALYSIS

Data are taken from the Italian daily gas distribution network. A cluster analysis is run to obtain homogeneous data: similar time series can be grouped and thus forecast with the same range of covariates. A hierarchical k-means method is applied, and the explained variance is displayed:

[Figure: explained variance]

CLUSTER ANALYSIS

Cluster Dimensions

Some noise clusters containing a few odd time series are excluded, as shown, leaving 5 clusters with recognizable behaviour.

Each cluster contains hundreds of comparable time series.

CLUSTER ANALYSIS

Focus on strongly different clusters

Cluster   Within-cluster sum of squares
Mnf       232,760
MnfWkn    112,812
Htn1      69,873
Htn2      89,036
Htn3      105,510

Cluster profiles

CLUSTER ANALYSIS

Highlights: the weather trend and weekdays affect the profile of this cluster.

CLUSTER ANALYSIS

Highlights: weekdays and holidays/vacations strongly affect the time series of this cluster.

META-MODEL SELECTION: INITIAL DESIGN

Effects

• Main effects

• Interactions between “lag” variables

• Interactions between “lag” variables and Holiday dummies

• …

Constraints

• Not more than 5 “lag” variables

• Not more than 8 weather variables

According to the knowledge introduced, the length of the D-optimal design depends on the number of interactions to be estimated:

• Requiring the main effects of 59 variables, a design of 65 experiments is set.

• Adding interactions among certain variables leads to a design of 188 experiments.

Running both scenarios, the UNPE appears to decrease much faster with a starting design of 65 experiments.

«META-MODEL» SELECTION

Exploring behaviour of different methods: by cluster

«META-MODEL» SELECTION

Exploring behaviour of different methods: by incidence of good experiments

«META-MODEL» SELECTION

Exploring behaviour of different methods: by incidence and executed experiments

[Figure annotation: effective range for the neighbourhood under test]

FORECASTING

Complex behaviour of a cluster of type «heating» during a period of changing weather.

[Figure: periods marked as test week, training week, test week]

FORECASTING

The same period is displayed for a «Manufacturing» cluster.

[Figure: periods marked as test week, training week, test week]

FORECASTING

Forecasting performance for a point of cluster Htn3 in the training week and in a test week

FORECASTING

Forecasting performance for a point of cluster MnfWkn in the training week and in a test week

FORECASTING

Global forecasting performance when extending experiments for the training sample to 399 timeseries of cluster Htn3 and to 189 timeseries of cluster MnfWkn.

Best_Sequential_07mar (MnfWkn)

Training sample          SAMPLE 189
Id    UNPE               Id    UNPE
1     16.54778502        9     20.9928004
2     16.6374906         8     21.09971675
3     16.64268102        3     21.20858502
4     16.65369251        4     21.2665048
5     16.6898197         1     21.29044122
6     16.70717776        6     21.31648122
7     16.71112161        2     21.32907716
8     16.72990428        5     21.3806381
9     16.74076457        10    21.44068672
10    16.7409038         7     21.5991528

Best_Sequential_07mar (Htn3)

Training sample          SAMPLE 399
Id    UNPE               Id    UNPE
1     4.004575018        1     5.574143584
2     4.049522931        5     5.620645927
3     4.059941486        7     5.666968888
4     4.107598766        4     5.72435515
5     4.108530062        6     5.739706953
6     4.11331634         9     5.76021393
7     4.122607535        8     5.827947922
8     4.139927531        2     5.889847735
9     4.144644052        10    5.918986066
10    4.164124426        3     5.94520787

Out of sample, the ranking of best designs may change because of cluster variance.

CHANGING SCALE

For the 2 sample sizes (50 and 100 time series) in cluster Htn3, the UNPE of the best models is given for the estimates and for a test week forecast.

Htn3_N50_D65          Htn3_N50_21feb        Htn3_N50_21feb_399
Id     UNPE           Id    UNPE            Id    UNPE
2906   4.004575018    1     3.938366673     1     4.145980074
2858   4.049522931    9     3.939481794     8     4.182447592
1716   4.059941486    8     4.011840404     4     4.193812874
2335   4.107598766    4     4.013029568     9     4.199241952
2852   4.108530062    5     4.022413093     5     4.22726051
2826   4.11331634     7     4.022656584     6     4.244017049
2992   4.122607535    6     4.128501192     7     4.255655226
1956   4.139927531    2     4.423881108     2     4.562697402
3013   4.144644052    3     4.484769585     3     4.623997373
1084   4.164124426    10    4.540948848     10    4.678761502

Htn3_N100_D65
Id     UNPE
2428   5.158018152
2653   5.237910971
818    5.24944972
1376   5.256119543
2332   5.258285494
2132   5.264724736
2544   5.27648049
2030   5.311219332
2911   5.324534197
2986   5.357067612

Htn3_N100_21feb_100: Id 1, UNPE 4.913663839

Htn3_N100_21feb_399: Id 1, UNPE 4.721847465

First evidence: experiments on a 20% sample appear more consistent in the out-of-sample tests than those on a 10% sample. A topic for later studies.

BENCHMARKING

Htn3

Let us have a look at the linear and smooth coefficients estimated by a model from the benchmark and by one of the last sequential experiments.

[Figure panels: Benchmark, Sequential]

BENCHMARKING

Htn3

Compare the smooth effects as estimated by the benchmark (blue) and one of the last sequential experiments (orange).

[Figure panels: Benchmark, Sequential; annotations: Common variables, Most influential]

BENCHMARKING

Htn3

Compare the density of UNPE values over all time series in the cluster: benchmark (blue) and one of the last sequential experiments (orange).

BENCHMARKING

MnfWkn

Let us have a look at the linear and smooth coefficients estimated by a model from the benchmark and by one of the last sequential experiments.

[Figure panels: Benchmark, Sequential]

BENCHMARKING

MnfWkn

Compare the smooth effects as estimated by the benchmark (blue) and one of the last sequential experiments (orange).

[Figure panels: Benchmark, Sequential; annotations: Most influential, Common variables]

BENCHMARKING

MnfWkn

Compare the density of UNPE values over all time series in the cluster: benchmark (blue) and one of the last sequential experiments (orange).

BENCHMARKING

Htn3

After the training on a sample of 50 timeseries, compare the KPI over 399 timeseries in the cluster Htn3 predicting the training week: benchmark model (top) vs 10 best sequential experiments (bottom).

Benchmark_Htn3_N399_07mar

UNPE          UNPEmin       UNPEmedian    UNPEmean      UNPEmax
6.756142112   0.970573175   6.015515484   Inf           Inf

Best_Sequential_07mar_full399

UNPE          UNPEmin       UNPEmedian    UNPEmean      UNPEmax
5.574143584   0.869633394   4.925019982   796.6289807   41228.61523
5.620645927   1.029247114   4.910178563   842.7382089   42116.95307
5.666968888   0.938584646   4.930057643   796.1396658   41285.57117
5.72435515    1.077161168   4.985055732   795.2758674   41494.30103
5.739706953   0.988378518   5.054116047   826.1754962   41959.11958
5.76021393    1.079579789   5.059421718   850.8183385   40535.89749
5.827947922   0.947614267   5.136946826   804.8961237   41179.28737
5.889847735   1.086091019   5.25241303    761.4459961   41313.40481
5.918986066   0.972054278   5.290156752   805.3848279   44657.21402
5.94520787    1.218038514   5.505236243   741.8996495   41341.10194


BENCHMARKING

Htn3

After the training on a sample of 50 timeseries, compare the KPI over 399 timeseries in the cluster Htn3 predicting a test week: benchmark model (top) vs 10 best sequential experiments (bottom).

UNPE          UNPEmin       UNPEmedian    UNPEmean      UNPEmax
7.821415053   1.239874144   7.190358923   807.1497802   42569.90183
9.722695028   2.339040222   8.887116576   794.143618    49310.63588
9.318054749   2.509801494   8.589637493   761.1218989   45756.46049
7.944944423   1.322730676   7.252391584   802.8160981   44332.19674
7.933791678   0.952097836   7.092954443   921.9813442   49176.68676
7.793662882   1.126302413   7.201884132   880.076013    52077.71637
7.856672353   1.031769978   7.076652663   791.7493517   42060.53193
7.762224051   1.419322415   7.179014866   836.2021167   56384.86802
7.673978488   1.188348831   7.080433546   926.4053681   58509.00358
10.20214356   2.298690554   9.719737856   876.2458283   52602.5069

Benchmark_Htn3_N399_21mar

BENCHMARKING

Htn3

After the training on a sample of 50 timeseries, compare the KPI over 399 timeseries in the cluster Htn3 predicting another test week: benchmark model (top) vs 10 best sequential experiments (bottom).

UNPE          UNPEmin       UNPEmedian    UNPEmean      UNPEmax
4.145980074   0.886834527   3.971211549   7.485636173   798.0116161
4.562697402   0.772751687   4.564293112   9.033988559   1252.503252
4.623997373   1.416348253   4.556774688   8.000594822   821.7149007
4.193812874   0.834246411   4.021269124   7.974742208   965.3847443
4.22726051    0.822315298   4.054336656   7.529662835   787.2986151
4.244017049   0.926757977   4.163006608   8.086179936   1007.116013
4.255655226   0.928866999   4.07842148    7.566448697   792.7945615
4.182447592   1.005476661   4.207106352   7.367601671   741.0751829
4.199241952   1.193807369   4.088120913   7.367079653   731.3849717
4.678761502   1.102314894   4.572218497   8.848636001   1139.512525

Benchmark_Htn3_N399_21feb

Best_Sequential_21feb_full399

Min improvement: 0.99%   Max improvement: 1.52%


BENCHMARKING

MnfWkn

After the training on a sample of 50 timeseries, compare the KPI over 189 timeseries in the cluster MnfWkn predicting the training week: benchmark model (top) vs 10 best sequential experiments (bottom).

UNPE          UNPEmin       UNPEmedian    UNPEmean      UNPEmax
20.9928004    3.034627954   17.96153434   49.5776483    1330.886673
21.09971675   3.584812009   18.50327941   53.51947874   1669.904926
21.20858502   3.286215776   18.37556934   50.43371962   1695.411793
21.2665048    3.580966956   18.5311022    47.78237117   931.4101305
21.29044122   3.137223717   18.96856456   48.07076619   1158.237901
21.31648122   3.244270187   19.02869405   49.25817903   1266.959704
21.32907716   2.924648713   18.47561564   48.12284217   1257.336788
21.3806381    3.505942303   18.32084632   46.73295193   1280.705276
21.44068672   2.261588565   18.89098774   47.82265418   1211.604082
21.5991528    3.728615165   18.84180034   46.29059776   1019.645011

Benchmark_MnfWkn_N189_07mar


BENCHMARKING

MnfWkn

After the training on a sample of 50 timeseries, compare the KPI over 189 timeseries in the cluster MnfWkn predicting a test week: benchmark model (top) vs 10 best sequential experiments (bottom).

UNPE          UNPEmin       UNPEmedian    UNPEmean      UNPEmax
22.50335956   4.516423048   21.66990134   72.18348715   4007.982658
24.53447347   4.667343368   24.05060776   85.7167827    5123.977102
24.85180612   3.838838164   24.08938052   85.55596367   5255.244646
24.89034377   5.12813037    24.71533233   69.43004946   3680.401682
25.07626094   5.387873475   24.43295814   83.69419439   4866.458414
25.23493072   4.644167228   25.51707557   85.14986222   4990.298799
24.22824489   2.892642832   23.14490047   75.68316431   4087.279499
24.72492358   2.699113016   24.67622489   91.00253035   5361.702732
24.63823937   4.963324932   23.08234272   92.08721982   5335.592393
23.84309351   4.036484404   24.82640998   72.36345608   3710.977893

Benchmark_MnfWkn_N189_21feb

Best_Sequential_21feb_full189

BENCHMARKING

MnfWkn

After the training on a sample of 50 timeseries, compare the KPI over 189 timeseries in the cluster MnfWkn predicting another test week: benchmark model (top) vs 10 best sequential experiments (bottom).

Benchmark_N189_21mar

UNPE          UNPEmin       UNPEmedian    UNPEmean      UNPEmax
27.78863475   5.732029325   23.79324624   110.7714118   7904.264871

10 best sequential experiments

UNPE          UNPEmin       UNPEmedian    UNPEmean      UNPEmax
27.28335019   4.388613399   24.59380591   81.49262746   3560.495685
26.52488908   4.641895654   23.4622075    82.17448388   3739.784256
26.8146899    5.868648224   24.70632077   81.08016163   4601.059866
27.12839225   4.864389335   24.68066984   73.64815156   3023.457445
26.85506333   4.639563782   25.12487291   73.36979967   3476.593222
27.01996359   5.410793911   25.66195776   87.6510011    4643.83236
25.81614433   5.318573864   24.09472574   77.71771407   3687.697249
26.44540866   5.836638159   24.30210448   82.07437242   4866.705962
27.22692683   3.026843376   25.81648518   73.3834312    3235.806229
26.89818619   6.519069706   24.07974188   67.15726844   3395.322326

Min improvement: 0.50%   Max improvement: 1.97%
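The improvement range quoted for MnfWkn appears to be the difference, in percentage points, between the benchmark UNPE and the UNPE of each of the 10 sequential experiments in the tables labelled Benchmark_N189_21mar. That reading is an assumption (the slides do not give the formula); the sketch below recomputes it from the transcribed values.

```python
# UNPE values transcribed from the MnfWkn tables above (Benchmark_N189_21mar).
benchmark_unpe = 27.78863475
sequential_unpe = [
    27.28335019, 26.52488908, 26.8146899, 27.12839225, 26.85506333,
    27.01996359, 25.81614433, 26.44540866, 27.22692683, 26.89818619,
]

# Assumed definition: improvement = benchmark UNPE minus sequential UNPE,
# expressed in percentage points.
improvements = [benchmark_unpe - u for u in sequential_unpe]
min_improvement = min(improvements)
max_improvement = max(improvements)

print(f"Min improvement: {min_improvement:.4f}")
print(f"Max improvement: {max_improvement:.4f}")
```

The two values come out at roughly 0.505 and 1.972 percentage points, in line with the 0.50% and 1.97% reported on the slide up to rounding or truncation.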

VALUES

Recalling the ROI computations and matching them with the improvements in forecasting error just observed, the following figures seem to be a realistic improvement hypothesis (well, we’ll see…).

Operator size   UNPE improvement   Gain (€)        Cost improvement
Small           1.5%               190,500.00      20%
Medium          1.0%               2,381,250.00    25%
Large           0.5%               6,350,000.00    40%

ONE STEP FURTHER THAN SEQUENTIAL …

Open points

• Cluster variance vs sample size: investigate how to select the optimal sample size for each cluster, given cluster volatility.

• Launch GAM into a competition: linear models, ARIMA or even neural networks may be tested.

• Meta-models proceeding by specificity: instead of predicting and proposing «good» experiments, a meta-model can also be designed to discard «bad» ones.

• Bayesian network knowledge

• Hierarchical time series modeling

Q&A

Michele Giordani, Head of Consulting Services
Direct +39 02 461061, Mobile +39 340 0784993
i4canalytics.com

Daniele Amberti, Principal Consultant - Forecasting
Direct +39 02 461061, Mobile +39 346 6798292
i4canalytics.com

REFERENCES

Books & Papers

• Atkinson, A.C., Donev, A., Optimum Experimental Designs, 1992, Oxford University Press.

• Fan, S., Hyndman, R.J., Short-term load forecasting based on a semi-parametric additive model, 2012, IEEE Transactions on Power Systems, 27(1), 134-141.

• Hastie, T., Tibshirani, R., Friedman, J., The Elements of Statistical Learning: Data Mining, Inference and Prediction, 2009, Springer.

• Brockwell, P.J., Davis, R.A., Introduction to Time Series and Forecasting, 1991, Springer.

• Goos, P., Jones, B., Optimal Design of Experiments, 2011, Wiley.

• Pang, B., The Impact of Additional Weather Inputs on Gas Load Forecasting, 2012, Master's Theses (2009 -), Marquette University, Paper 163.

Packages in R

• Wheeler, R.E., AlgDesign: Algorithmic Experimental Design, 2011, R package version 1.1-7. http://CRAN.R-project.org/package=AlgDesign.

• Wood, S.N., mgcv: Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models, 2011, Journal of the Royal Statistical Society (B), 73(1), 3-36.

• Leisch, F., Hornik, K., Ripley, B.D., mda: Mixture and flexible discriminant analysis, 2011, R package version 0.4-2. http://CRAN.R-project.org/package=mda.

• Scutari, M., Learning Bayesian Networks with the bnlearn R Package, 2010, Journal of Statistical Software, 35(3), 1-22. URL http://www.jstatsoft.org/v35/i03/.

• Therneau, T., Atkinson, B., Ripley, B., rpart: Recursive Partitioning, 2010, R package version 4.1-0.

Acknowledgement

• Thanks to: Claudia Berloco, Alessandra Padriali, Roberto Fontana, Ron Kenett
