EUROCONTROL EXPERIMENTAL CENTRE
INNOVATIVE RESEARCH

Characteristics in Flight Data
Estimation with Logistic Regression and Support Vector Machines

ICRAT 2004

Claus Gwiggner, LIX, Ecole Polytechnique Palaiseau
Gert Lanckriet, EECS, University of California, Berkeley

Flow Management and Planning Differences

• Time slots are distributed among aircraft to avoid congestion
• In reality, delays, re-routings, etc. lead to missed time slots
• The number of aircraft arriving in a sector differs from the number planned
• Consequences: safety, lost capacity

Planning Differences

Related Work

• Factors/Causes [ATFM Study, PRR]
  • Slot adherence, flight plan quality, in-flight change of route, ...
• Simulations [Ky, Stortz]
  • Random noise on departure times
• Reactionary Delay [Toulouse Study]
  • Microscopic model of departure times

Unknown

• Real situation at sector entries
• Interplay of factors
• Compensations of delays
• ...

Program

• Problem Formulation
• Simple Characteristics
• Binary Classification
• Conclusion
• Future Work

Planning Differences

Planning Differences = Regulated Demand – Real Demand
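
The definition above can be illustrated with a minimal sketch. The function name and the hourly counts below are made up for illustration and are not taken from the study data. With this sign convention, a negative value means more aircraft arrived than were planned (an over-delivery).

```python
from typing import List


def planning_differences(regulated: List[int], real: List[int]) -> List[int]:
    """Hourly planning differences: regulated demand minus real demand."""
    if len(regulated) != len(real):
        raise ValueError("both series must cover the same hours")
    return [planned - observed for planned, observed in zip(regulated, real)]


# Hypothetical hourly aircraft counts for a four-hour window (made-up numbers).
regulated_demand = [30, 32, 28, 35]
real_demand = [27, 36, 28, 31]

differences = planning_differences(regulated_demand, real_demand)
print(differences)  # [3, -4, 0, 4]
```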

General Problem Formulation

• Find 'regularities' of planning differences that are useful for improving the current planning procedure
• Why? Safety, suboptimally used capacity
• How?
  • MACRO approach: relations between flows, not single deviations from flight plans
  • Daily basis, not extreme situations
• How? Data analysis
  • 141 days of week-day data

Today's Question

• Are planning differences of different sectors the 'same'?
  • If yes: any model can be greatly simplified
  • If no: what are the differences?
• Difficulty
  • 24 dimensions: one variable for each hour

Comparison of Planning Differences

No visible regularities in either sector ...

Mean and Standard Deviation

... but similar mean and standard deviation over time
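
One way such a comparison could be computed is sketched below: each day is a vector of 24 hourly planning differences, and the mean and sample standard deviation are taken per hour across days. The data here are synthetic stand-ins, and the function name is an assumption, not the authors' code.

```python
import random
import statistics

HOURS = 24
DAYS = 141  # number of week days in the data set mentioned in the slides


def hourly_profile(days):
    """Return per-hour means and sample standard deviations over a list of days."""
    by_hour = list(zip(*days))  # transpose: one tuple of daily values per hour
    means = [statistics.mean(values) for values in by_hour]
    stds = [statistics.stdev(values) for values in by_hour]
    return means, stds


# Synthetic planning differences (NOT the original measurements).
rng = random.Random(0)
sector = [[rng.randint(-5, 5) for _ in range(HOURS)] for _ in range(DAYS)]

means, stds = hourly_profile(sector)
print(len(means), len(stds))  # 24 24
```

Running the same function on a second sector and comparing the two profiles hour by hour gives the kind of summary the slide refers to.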

Hypothesis Tests

• H0: same underlying distribution ... rejected at the 1 % level
  • Assumes that statistical properties do not vary over time
• ... but what are the characteristics?
  • e.g. 'if high peaks at noon => sector 1'?
• Find a rule that tells whether a sequence of values belongs to sector 1
  • Classification problem

(Binary) Classification

• Probabilistic: 'What is the probability that a new item belongs to sector 1?'
  • Logistic Regression
• Geometric: 'On which side of the boundary does the new item lie?'
  • Support Vector Machines
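
The two questions can be contrasted in a small sketch. It assumes a linear model with already fitted weights and bias (the values below are hypothetical), so it shows only the decision step, not the training of either method. A kernel SVM would replace the linear score with a non-linear one.

```python
import math


def linear_score(weights, bias, x):
    """Signed score w.x + b of an instance x."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def logistic_probability(weights, bias, x):
    """Probabilistic view: P(class = 1 | x) under linear logistic regression."""
    return 1.0 / (1.0 + math.exp(-linear_score(weights, bias, x)))


def side_of_boundary(weights, bias, x):
    """Geometric view: label +1 or -1 by the side of the hyperplane w.x + b = 0."""
    return 1 if linear_score(weights, bias, x) >= 0 else -1


# Hypothetical two-dimensional model and instance.
weights, bias = [1.0, -1.0], 0.0
item = [2.0, 1.0]

print(round(logistic_probability(weights, bias, item), 3))  # 0.731
print(side_of_boundary(weights, bias, item))                # 1
```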

Comparison

• Linear Logistic Regression vs SVMs
  • linear vs non-linear
  • simple vs mathematically sophisticated
  • traditional vs state-of-the-art
  • probabilistic vs geometric
• Common points [Hastie et al. 2003], [Friedman 2003]
  • SVM: estimator of class probabilities
  • Logistic regression induces linear boundaries

Experiments on ...

• Data from 4 sectors in Upper Berlin airspace
• Raw data (random permutations)
• Data where the numbers of instances in both classes are balanced
• In total, 8 experiments conducted
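
The slides do not say how the classes were balanced. The sketch below assumes one common option, random undersampling of the larger class followed by a random permutation; the function name and the toy data are illustrative only.

```python
import random


def balance_and_permute(instances, labels, seed=0):
    """Undersample the larger class, then shuffle (an assumed procedure)."""
    rng = random.Random(seed)
    positives = [x for x, y in zip(instances, labels) if y == 1]
    negatives = [x for x, y in zip(instances, labels) if y == -1]
    n = min(len(positives), len(negatives))
    sample = [(x, 1) for x in rng.sample(positives, n)]
    sample += [(x, -1) for x in rng.sample(negatives, n)]
    rng.shuffle(sample)  # random permutation of the balanced data
    return sample


# Toy data: five instances of class 1, two of class -1.
toy_instances = [[1], [2], [3], [4], [5], [6], [7]]
toy_labels = [1, 1, 1, 1, 1, -1, -1]

balanced = balance_and_permute(toy_instances, toy_labels)
print(len(balanced))  # 4
```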

Model Selection

• Report Estimated Prediction Error (EPE)
• Model selection:
  • Cross-Validation [Stone 1974]
  • Wilcoxon-Mann-Whitney Test
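
As a rough illustration of estimating the prediction error by cross-validation, the sketch below runs k-fold cross-validation around any classifier given as a `fit`/`predict` pair. The nearest-centroid classifier is only a stand-in to keep the example self-contained; it is not one of the methods used in the study, and the number of folds is an assumption.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Split range(n) into k disjoint folds after a random shuffle."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validated_error(xs, ys, fit, predict, k=5, seed=0):
    """Fraction of held-out instances that are misclassified."""
    errors = 0
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train_x = [x for i, x in enumerate(xs) if i not in held]
        train_y = [y for i, y in enumerate(ys) if i not in held]
        model = fit(train_x, train_y)
        errors += sum(predict(model, xs[i]) != ys[i] for i in held_out)
    return errors / len(xs)


def fit_centroids(xs, ys):
    """Stand-in classifier: one mean vector per class."""
    centroids = {}
    for label in set(ys):
        rows = [x for x, y in zip(xs, ys) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict_centroid(centroids, x):
    """Predict the class whose centroid is closest to x."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Toy, clearly separable one-dimensional data.
xs = [[0.0], [0.1], [0.2], [0.3], [0.4], [5.0], [5.1], [5.2], [5.3], [5.4]]
ys = [-1] * 5 + [1] * 5
epe = cross_validated_error(xs, ys, fit_centroids, predict_centroid, k=5)
print(epe)  # 0.0
```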

Parameters of SVMs

• Kernel functions
  • Linear, Gauss, Poly, Linear CN, Gauss CN, Poly CN
• Kernel parameters
• Cost function
  • 1 Norm, 2 Norm
• In total, over 800 combinations possible
  • Best one estimated by cross-validation
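
The way such a search space grows can be sketched as a Cartesian product. The kernel and cost-function names come from the slide; the kernel-parameter and penalty values are hypothetical, so the count below does not reproduce the figure of over 800. The `cv_error` argument is a placeholder for whatever cross-validated error estimate is used.

```python
from itertools import product

kernels = ["Linear", "Gauss", "Poly", "Linear CN", "Gauss CN", "Poly CN"]
cost_functions = ["1 Norm", "2 Norm"]
kernel_params = [0.1, 1.0, 10.0]    # hypothetical values
penalties = [0.01, 0.1, 1.0, 10.0]  # hypothetical values

# Simplification: every kernel is paired with every parameter value.
grid = list(product(kernels, cost_functions, kernel_params, penalties))
print(len(grid))  # 144


def select_best(candidates, cv_error):
    """Return the candidate with the lowest cross-validated error."""
    return min(candidates, key=cv_error)
```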

Results

Summary

characteristics in high dimensional data

comparison of a very simple and a very complicated classification method

Conclusions

• There are systematic differences between different sectors
• SVMs do not promise major improvement
  • No more than 4% better than logistic regression
• Linear prediction is possible
  • Expected prediction errors around 15 %

Future Work

• (Black box) prediction not satisfactory
• Better understanding of the underlying processes
  • Reasons for the differences
  • Model of the probability distribution of planned traffic and realized traffic

Questions ?

• Thanks for your attention!

Results

Is Week End?

Table columns: Sector | Raw | Bal+Perm | Variable Sel | Random
Table rows: UR1, UR2, UR3, UR4

Known: Causes for Planning Differences

• Departure slot adherence
• Inconsistent profile
• CASA implementation
• In-flight change of route
• Regulations too late
• Weekday, season
• Weather
• Slot tolerance window
• Missing flight plans
• Incorrect flight plan information

Priorities: Very High, High, Medium, Unknown

Figure axes: time, # over-deliveries

Source: Independent Study for the Improvement of ATFM, Final Report, 2000

Little known: Dynamics of Planning Differences

Figure: 'error' propagation across Sector 1, Sector 2, ..., Sector n (X: time, Y: number of planning differences)

Related Work: Simulation studies, reactionary delay studies

Summary Motivation

Are planning differences unpredictable? Or are there hidden 'regularities'?

Possible Research Questions

• Propagation over the network
• Dependence on traffic density, sector complexity, ...
• ...
• Characteristics
  • Comparison of different sectors

Notation

• A sector is represented as a vector of 24 variables, one for each hour
• An instance is a value for this vector
• An instance belongs to class 1 or -1, depending on the sector from which it was drawn

Binary Classification

● Given: instances from sectors 1 and -1
● Question: a rule to decide, for a new instance, to which sector it might belong
● Example: if 'high peaks at noon' then class 1
  ● Decision trees
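
The example rule can be written down directly for a 24-value instance. In this sketch, 'noon' is taken to be the variable at index 12 and 'high' means above a chosen threshold; both are assumptions for illustration, as the slides do not define them.

```python
NOON_HOUR = 12  # assumed index of the noon variable in the 24-hour vector


def noon_peak_rule(instance, threshold):
    """Return class 1 if the noon value exceeds the threshold, else class -1."""
    if len(instance) != 24:
        raise ValueError("an instance must have 24 hourly values")
    return 1 if instance[NOON_HOUR] > threshold else -1


# Hypothetical instances: one with a peak at noon, one flat.
peaked = [0] * 24
peaked[NOON_HOUR] = 9
flat = [0] * 24

print(noon_peak_rule(peaked, threshold=5))  # 1
print(noon_peak_rule(flat, threshold=5))    # -1
```

A decision tree generalizes this idea by chaining several such threshold tests on different hours.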

Geometric and Probabilistic Approaches

Example: instances are 2-dimensional

Geometric
• Instances are points in Euclidean space
• Rules are class boundaries
• Problem: overlapping classes

Probabilistic
• Classes have an underlying probability distribution
• Rules are class probabilities
• Problem: which distribution?