ecm time series forecast

Upload: ayapparaj-sks

Post on 12-Apr-2017

ECONOMETRICS PROJECT PRESENTATION
Time Series Analysis in SAS

AYapparaj SKS | Aditya nathireddy | Vibeesh CS | Neha Nehra

AGENDA
TIME SERIES FORECASTING INTRODUCTION
THINGS TO BE CHECKED BEFORE APPLYING TIME SERIES MODELLING
BUSINESS OBJECTIVE
ABOUT THE DATASET
DATA PREPARATION
MODEL IDENTIFICATION AND ESTIMATION
GENERATING FORECAST
FINAL FORECAST

TIME SERIES FORECASTING - INTRODUCTION

A time series is the set of values taken by a variable over time (such as daily sales revenue, weekly orders, monthly overheads or yearly income), tabulated or plotted as chronologically ordered data points so that valid statistical inferences can be drawn from them.

THINGS TO BE CHECKED BEFORE APPLYING TIME SERIES MODELLING

VOLATILITY: The data should not be very volatile for time series methods; the approach works well when changes are stable.
HOW IT CAN BE CHECKED: A scatter plot with time on the horizontal axis and the series on the vertical axis indicates this. A fan or inverted-fan shape in the scatter plot shows that the data is highly volatile.
WHAT CAN BE DONE TO OVERCOME THE PROBLEM: Transform the data according to the shape of the plot.
For a fan shape, decrease the scale of the data (log or square root).
For an inverted-fan shape, increase the scale of the data (exponential or square).
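The effect of these transformations on a fan-shaped series can be sketched in a few lines of Python. This is only an illustration, not the SAS steps used in the presentation: the series is made up, and spread_ratio is a hypothetical helper that compares the average size of period-to-period swings in the second half of the series with the first half. A ratio close to 1 suggests that the spread has been stabilised.

```python
import math

def spread_ratio(series):
    """Mean absolute step in the second half divided by that in the first half."""
    steps = [abs(b - a) for a, b in zip(series, series[1:])]
    half = len(steps) // 2
    first, second = steps[:half], steps[half:]
    return (sum(second) / len(second)) / (sum(first) / len(first))

# Made-up fan-shaped series: the level grows 2% per period and the
# swings around the level grow in proportion to it.
raw = [100 * 1.02 ** t * (1.2 if t % 2 == 0 else 0.8) for t in range(48)]
rooted = [math.sqrt(v) for v in raw]
logged = [math.log(v) for v in raw]

ratios = {
    "raw": spread_ratio(raw),
    "sqrt": spread_ratio(rooted),
    "log": spread_ratio(logged),
}
for name, ratio in ratios.items():
    print(f"{name:>4}: spread ratio = {ratio:.3f}")
```

On this made-up series the swings are proportional to the level, which is the case where a log transformation removes the fan shape almost completely and a square root only partly.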

PATTERN: If the past has a pattern, time series methods will yield good results. If there is no pattern, the method has nothing to extrapolate. It works well when changes are stable.
STATIONARITY OF THE DATA: If the data is stationary, the technique can be applied directly. A series whose statistical properties (such as mean and variance) change over time is non-stationary and cannot be used directly for forecasting. Stationarity is checked with the Augmented Dickey-Fuller (ADF) unit root test.
HOW DO WE PERFORM THE ADF TEST: We run a hypothesis test to determine whether the data is stationary.
H0: Non-stationary
Ha: Stationary
If p < alpha, we reject H0 and conclude that the data is stationary, so it can be used for forecasting.
If p > alpha, we fail to reject H0 and treat the data as non-stationary; it can be made stationary by differencing.
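The idea behind the unit root test can be sketched in Python. This is a simplified illustration and not what PROC ARIMA computes: it is the basic Dickey-Fuller regression with no constant and no augmenting lags, and it compares the statistic with an approximate large-sample 5% critical value (-1.95) instead of reporting a p-value. The function name and both test series are assumptions made for the example.

```python
import math
import random

def dickey_fuller_stat(y):
    """t-statistic of rho in  dy_t = rho * y_{t-1} + e_t  (no constant, no lags)."""
    lagged = y[:-1]
    diffs = [b - a for a, b in zip(y, y[1:])]
    sxx = sum(x * x for x in lagged)
    rho = sum(x * d for x, d in zip(lagged, diffs)) / sxx
    resid = [d - rho * x for x, d in zip(lagged, diffs)]
    sigma2 = sum(e * e for e in resid) / (len(diffs) - 1)
    return rho / math.sqrt(sigma2 / sxx)

# Approximate large-sample 5% critical value for the no-constant case.
CRITICAL_5PCT = -1.95

# Made-up stationary series: AR(1) with coefficient 0.5.
random.seed(0)
stationary = [0.0]
for _ in range(200):
    stationary.append(0.5 * stationary[-1] + random.gauss(0, 1))

# Made-up non-stationary series: a straight upward trend.
trending = [float(t) for t in range(1, 202)]

stat_stationary = dickey_fuller_stat(stationary)
stat_trending = dickey_fuller_stat(trending)

for name, stat in [("stationary", stat_stationary), ("trending", stat_trending)]:
    verdict = "reject H0 (stationary)" if stat < CRITICAL_5PCT else "fail to reject H0"
    print(f"{name}: statistic = {stat:.2f} -> {verdict}")
```

A statistic below the critical value corresponds to the "p < alpha" branch above; otherwise the series is differenced and tested again.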

BUSINESS OBJECTIVE

To project airline travel for the next 12 months.

ABOUT THE DATASET

Sashelp.air Airline Data (Monthly: Jan49-Dec60)
The dataset used here is SASHELP.AIR, which is airline data containing two variables: DATE and AIR (labelled International Airline Travel). It covers JAN 1949 to DEC 1960.

DATA PREPARATION

CHECK FOR VOLATILITY: The plot of AIR against DATE (not reproduced in this transcript) indicated volatility, so we apply a variable transformation.

CHECK FOR VOLATILITY: We tried both log and square-root transformations. The plots (not reproduced in this transcript) show that the log transformation gives the better result.

CHECK FOR STATIONARY CONDITION OF DATA (PROC ARIMA OUTPUT)

The initial result shows that the series is not stationary: the p-values are not all below alpha (0.0001, i.e. 0.01%), so we difference the series. After differencing, all the p-values are less than alpha.

CHECK FOR SEASONALITY: The autocorrelation function (ACF) captures the correlation between Yt and Yt-s, where s is the lag. If the ACF shows high values at a fixed interval, that interval is taken as the period of seasonality. Differencing at that lag deseasonalizes the data. The ACF output shows that the period of seasonality is 12 months.
DESEASONALIZATION: We deseasonalize the data by differencing at lag 12, where the ACF shows high correlation values.
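The seasonality check can be sketched in Python as follows. This is an illustration on a made-up monthly series (a linear trend plus a repeating 12-value profile), not the SASHELP.AIR data or the PROC ARIMA output; the helper names difference and acf are assumptions for the example.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

def acf(series, lag):
    """Sample autocorrelation of the series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom

# Made-up seasonal profile repeated every 12 observations, plus a trend.
profile = [-3, -5, 1, 0, 2, 8, 12, 11, 4, -4, -10, -6]
series = [0.5 * t + profile[t % 12] for t in range(144)]

# First difference removes the trend; the ACF is then inspected for a peak.
detrended = difference(series, lag=1)
acf_by_lag = {lag: acf(detrended, lag) for lag in range(1, 25)}
season = max(acf_by_lag, key=acf_by_lag.get)

# Differencing at the seasonal lag removes the repeating pattern.
deseasonalized = difference(detrended, lag=season)

print("lag with highest ACF:", season)
print("largest remaining value:", max(abs(v) for v in deseasonalized))
```

Because the made-up series has no noise, differencing at the detected lag leaves nothing behind; on real data a residual series would remain to be modelled.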

MODEL IDENTIFICATION AND ESTIMATION

CREATION OF DEVELOPMENT AND VALIDATION DATA: Depending on the number of observations, some of the most recent time points are set aside as the validation sample. The rest of the data, the development sample, is used to generate forecasts from multiple models, which are compared with the actuals stored in the validation sample.
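A minimal Python sketch of this split is shown below. The holdout length of 12 is an assumption chosen to match the 12-month forecasting objective (the slides do not state the size used), and the observation values are placeholders.

```python
def split_development_validation(series, holdout):
    """Keep the most recent `holdout` points as validation, the rest as development."""
    if not 0 < holdout < len(series):
        raise ValueError("holdout must be between 1 and len(series) - 1")
    return series[:-holdout], series[-holdout:]

# Placeholder values standing in for 144 monthly observations.
observations = list(range(144))
development, validation = split_development_validation(observations, holdout=12)

print(len(development), "development points,", len(validation), "validation points")
```

The split is by position, not random, because the validation sample must be the most recent stretch of the series.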

SELECTION OF P (AUTOREGRESSIVE) AND Q (MOVING AVERAGE): The model selection criteria AIC, BIC and SBC (lower values are better) are used to select the values of P and Q.
AIC: Akaike's information criterion
BIC: Bayesian information criterion
SBC: Schwarz Bayesian criterion
BIC search range: p = 0 to 5, q = 0 to 5
The MINIC (minimum information criterion) option under PROC ARIMA reports the minimum-BIC model after considering all combinations of P and Q from 0 to 5. We then take all models in the neighbourhood of the minimum-BIC model, generate AIC and SBC for each, and calculate the average of AIC and SBC. Finally, we select 6 to 7 models with relatively low averages and generate forecasts for them.

Contd.

By inspection, the minimum of the BIC matrix is -6.3503, at the AR 3, MA 0 position, i.e. (P,Q) = (3,0).
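The shortlisting step (average of AIC and SBC over neighbouring models) can be sketched in Python. Everything numeric here is hypothetical: the log-likelihoods, the number of observations and the candidate list are invented for illustration and are not taken from the PROC ARIMA output. The sketch also assumes the standard definitions AIC = -2 ln L + 2k and SBC = -2 ln L + k ln n, with k taken as p + q.

```python
import math

def aic(loglik, k):
    """Akaike's information criterion."""
    return -2.0 * loglik + 2.0 * k

def sbc(loglik, k, n):
    """Schwarz Bayesian criterion."""
    return -2.0 * loglik + k * math.log(n)

# Hypothetical fitted log-likelihoods for models near (p, q) = (3, 0).
N_OBS = 131  # hypothetical number of observations used in fitting
candidates = {
    (3, 0): 250.0,
    (2, 1): 249.0,
    (3, 1): 250.4,
    (4, 0): 250.2,
}

averages = {}
for (p, q), loglik in candidates.items():
    k = p + q  # assumption: only AR and MA terms are counted as parameters
    averages[(p, q)] = (aic(loglik, k) + sbc(loglik, k, N_OBS)) / 2.0

# Lower is better, so sort ascending by the average of AIC and SBC.
ranked = sorted(averages, key=averages.get)
for pq in ranked:
    print(pq, round(averages[pq], 3))
```

The models at the top of this ranking are the ones carried forward to the forecasting step.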

GENERATING FORECAST

For each combination of p and q shortlisted using AIC and SBC, generate a forecast using the FORECAST statement under PROC ARIMA, where:
LEAD = number of future time points to forecast
ID = name of the time variable
INTERVAL = unit of the time variable
OUT = output dataset that stores the forecast
The forecasts obtained for each combination of p and q are compared with the actuals for the same time points stored in the validation file, using MAPE (Mean Absolute Percentage Error). The combination with the minimum MAPE is selected.
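The MAPE comparison can be sketched in Python as below. The actuals and the two sets of forecasts are invented numbers, and the (p, q) labels are only illustrative; the sketch assumes the usual definition of MAPE as the mean of |actual - forecast| / |actual|, expressed as a percentage.

```python
def mape(actuals, forecasts):
    """Mean Absolute Percentage Error, in percent."""
    if len(actuals) != len(forecasts):
        raise ValueError("actuals and forecasts must have the same length")
    errors = [abs(a - f) / abs(a) for a, f in zip(actuals, forecasts)]
    return 100.0 * sum(errors) / len(errors)

# Invented validation actuals and forecasts from two candidate models.
actuals = [100.0, 200.0, 400.0]
forecasts_by_model = {
    (3, 0): [110.0, 190.0, 400.0],
    (2, 1): [120.0, 220.0, 360.0],
}

scores = {pq: mape(actuals, fc) for pq, fc in forecasts_by_model.items()}
best = min(scores, key=scores.get)

for pq, score in scores.items():
    print(pq, f"MAPE = {score:.2f}%")
print("selected:", best)
```

If the model was fitted on a transformed series (for example the log), the forecasts would need to be converted back to the original scale before computing MAPE against the actuals.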

GENERATING FORECAST (contd.)

FINAL FORECAST

The p and q combination with the minimum MAPE is selected and applied to the entire dataset to generate the final forecast.

Thank you