TRANSCRIPT
Introduction to Time Series Forecasting: With Case Studies in NLP
A Tutorial at ICON 2019
Sandhya Singh & Kevin Patel
CFILT: Center for Indian Language Technology
December 18, 2019
Sandhya and Kevin Time Series Forecasting 1
Overview
We will highlight how NLP people are also well suited to work on Time Series problems.
We will provide background information on Time Series Forecasting.
We will discuss some statistical approaches, some classical machine learning approaches, and some deep learning approaches for time series forecasting.
We will mention some commonalities between NLP and Time Series, and how one can assist the other.
Outline
1 Introduction: Time Series; Time Series Forecasting
2 Background: Time Series Components; Time Series Categorization; Time Series Forecasting Terminology
3 Statistical Methods: Simple Models; Auto Regressive Models; Evaluation Metrics
4 Classical ML Models: Preparing Data; ML Models
Outline
5 Deep Learning Models
6 Connection with NLP: Problem Level; Tooling Level
7 Demos: Statsmodel Library; Prophet Library
8 Conclusion
Acknowledgement
We thank
The ICON 2019 committee, for accepting our proposal
LGSoft, for their joint project with CFILT; our investigations in that project germinated the idea of this tutorial
Some Context Regarding the Tutorial
Why are we (people working in NLP) talking about Time Series?
Text and Time Series: both are sequential data.
Commonality: exploit structure we know about the problem in advance; here, that structure is the sequential nature of the data.
Similar tools, different terminology: knowing the terminology will enable us to apply our knowledge of tools in this area too.
Target Audience
People who have not worked with Time Series data
Subsumes
Those who are completely new to ML
Those who have basic knowledge of how to apply ML
Those who are proficient in ML and/or are working in Natural Language Processing
Introduction
Introduction: Time Series
What is a Time Series?
Definition
A time series is a sequence of observations ordered in time.
Xt ; t = 0, 1, 2, 3, . . .
The observations are collected over a fixed time interval
The time dimension adds structure and constraint to the data
Nifty 50 Index as of 13/10/2019 15:31 IST
Where (and When) does One Encounter Time Series?
As a time series of our life!
Economy and Finance: Exchange rates, Interest rates, Employment rate, Financial indices
Meteorology: Properties of weather like temperature, humidity, windspeed, etc.
Medicine: Physiological signals (EEG), heart-rate, patient temperature, etc.
Other venues:
Industry: Electric load, power consumption, resource consumption
Web: Clicks, Logs
What is Time Series Analysis?
Applying statistical approaches to time series data
Will enable one to
Predict the future based on the past
Understand the underlying mechanism which generates the data
Control the mechanism
Describe salient features of the data
Introduction: Time Series Forecasting
Time Series Forecasting
Predict future based on past
Extrapolation, in classical statistical terminology
Month | No. of Passengers (in thousands)
1949-02 118
1949-03 132
1949-04 129
1949-05 121
1949-06 135
1949-07 ?
1949-08 ?
1949-09 ?
Forecasting - Yes or No?
Determining whether tomorrow a stock will go up/down or stay put?
Given a voice recording, who is the speaker?
Given a voice recording, who is speaking after the current speaker?
Given an ECG plot, is the heart functioning normally or abnormally?
Given an ECG plot, predict whether the person will have a heart related issue in the next month.
Forecasting - Yes or No?
Determining whether tomorrow a stock will go up/down or stay put? - YES
Given a voice recording, who is the speaker? - NO
Given a voice recording, who is speaking after the current speaker? - NO
Given an ECG plot, is the heart functioning normally or abnormally? - NO
Given an ECG plot, predict whether the person will have a heart related issue in the next month. - YES
Advantages of Time Series Forecasting
Reliability:
Given the forecast of power surges in your area, you can check whether your home’s wiring is reliable or not.
Preparing for Seasons:
Looking at the patterns from previous Christmas events, stock your warehouse for the upcoming Christmas accordingly
Given that the south east coast of India experiences typhoons during monsoons, pre-allocate rescue and relief resources
Estimating trends:
Given trend of a particular stock, should I invest in it?
Time Series Forecasting and Machine Learning
Forecasting - predicting the future from the past
Given observed values Y1, . . . , Yt , predict Yt+1
In other words, learn f such that
Yt+1 = f (Y1, . . . ,Yt) (1)
Machine Learning practitioners should easily be able to relate this expression to
Y = f (X ) (2)
Are ML skills applicable? - Yes
AI/ML : NOT a Silver Bullet
AI/ML are multipliers, and not a silver bullet.
Consider the example:
EMNLP - Empirical Methods in Natural Language Processing - a top tier NLP conference
EMNLP 2015 was informally called Embedding Methods in Natural Language Processing
This is due to the sheer number of papers about word embeddings
More or less implying that word embeddings are the silver bullet
If that were the case, shouldn’t all problems be solved by now?
Shouldn’t ACL, etc. close shops?
That is not the case
Domain knowledge is still needed for proper utilization of ML
So let’s discuss some background to gain time series domain knowledge
Background
Background: Time Series Components
Time Series Components
Level
The average value of a time series
Trend
A long term pattern present in the time series
Can be positive, negative, linear or nonlinear
If there is no increasing or decreasing trend, then the time series is stationary,
i.e. Data has constant mean and variance over time
Time Series Components (contd.)
Seasonality
Regular and predictable changes that recur in regular short intervals
Largely due to involvement of periodically occurring factors
Cyclicality
Changes that recur in irregular intervals
As opposed to fixed period intervals in seasonality
Noise / Irregularity / Residual
Random variations that do not repeat in the pattern
Time Series Components for Airline Passenger Data
Background: Time Series Categorization
Categorization of Time Series Problem Formulation
Based on the number of inputs
Univariate vs. Multivariate
Based on the number of time steps predicted in the output
One step forecasting vs. Multi step forecasting
Based on the modeling of interactions between different components
Additive vs. Multiplicative models
Univariate Time Series
Single Time Dependent Variable
Examples:
Monthly Airline Passenger Data
Month | No. of Passengers (in thousands)
1949-02 118
1949-03 132
1949-04 129
1949-05 121
1949-06 135
Multivariate Time Series
Multiple time dependent variables
Can be considered as multiple univariate time series that need to be analyzed jointly
Example: Rainfall Forecast
Date | Humidity | Temperature | Rainfall
01/01/18 36.81 16.222 30.25
02/01/18 34.438 18.146 29.26
03/01/18 29.291 19.002 28.26
04/01/18 30.712 19.279 29.54
05/01/18 32.352 19.494 30.12
06/01/18 31.952 20.894 27.63
One Step vs. Multi Step Forecasting
One Step Forecasting
Given data up to time t, predict the value only for the next one step, i.e. at t + 1
Multi Step Forecasting
Given data up to time t, predict values for two or more steps, i.e. at t + 1, t + 2, t + 3, . . .
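The difference between the two modes can be sketched in pure Python with a hypothetical toy model (here, simply predicting the average of the last two observations): in one-step mode the true value is appended to the history after each forecast, while in multi-step mode the model's own forecasts are fed back in.

```python
# Toy model (hypothetical): predict the average of the last two observations.
def predict_next(history):
    return (history[-1] + history[-2]) / 2

series = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]
train, test = series[:3], series[3:]

# One step: after each forecast, the TRUE value is appended to the history.
history = list(train)
one_step = []
for actual in test:
    one_step.append(predict_next(history))
    history.append(actual)

# Multi step: the model's OWN forecasts are fed back into the history.
history = list(train)
multi_step = []
for _ in test:
    pred = predict_next(history)
    multi_step.append(pred)
    history.append(pred)

print(one_step)    # [11.5, 12.0, 12.5]
print(multi_step)  # [11.5, 11.25, 11.375]
```

Note how the multi-step forecasts drift away from the data, since errors compound once predictions are fed back in.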
One Step vs. Multi Step Forecasting (contd.)
Figure: One Step Prediction for 3 steps vs. Multi Step Prediction for 3 steps
One Step vs. Multi Step Forecasting (contd.)
Figure: One Step Prediction for 8 steps vs. Multi Step Prediction for 8 steps
Note how close the prediction is to the true value in the case of one step prediction
One Step vs. Multi Step Forecasting (contd.)
This guy should NOT use multi step forecasting
Img Src: https://xkcd.com/605/
Additive vs. Multiplicative Models
Additive models:
The series is additively dependent on the different components
Y = Level + Trend + Seasonality + Noise (3)
Multiplicative models:
The series is multiplicatively dependent on the different components
Y = Level × Trend × Seasonality × Noise (4)
Additive vs. Multiplicative Models (contd.)
Comparison of Additive and Multiplicative Seasonality
Img Src: https://kourentzes.com/forecasting/2014/11/09/additive-and-multiplicative-seasonality/
Additive or Multiplicative?
Additive
Additive or Multiplicative?
Multiplicative
Additive or Multiplicative?
Multiplicative
Additive or Multiplicative?
Additive
Additive or Multiplicative?
Multiplicative
Additive or Multiplicative?
Additive
Dealing with Multiplicative Models
Figure: Passenger data is multiplicative; Log(Passenger) data is additive
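The log trick can be sketched in pure Python on a synthetic multiplicative series (the trend and seasonality components below are made up for illustration): before the transform the seasonal swing grows with the trend; after taking logs it is constant, i.e. additive.

```python
import math

# Hypothetical multiplicative series: Y = Trend * Seasonality
trend = [100 * 1.1 ** t for t in range(8)]                # growing level
season = [1.2 if t % 2 == 0 else 0.8 for t in range(8)]   # period-2 seasonality
y = [tr * s for tr, s in zip(trend, season)]

# The seasonal swing grows along with the trend (multiplicative)...
swings = [abs(y[t] - y[t + 1]) for t in range(0, 8, 2)]
assert swings[-1] > swings[0]

# ...but after a log transform the swing is constant (additive).
log_y = [math.log(v) for v in y]
log_swings = [abs(log_y[t] - log_y[t + 1]) for t in range(0, 8, 2)]
assert max(log_swings) - min(log_swings) < 1e-9
```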
Background: Time Series Forecasting Terminology
Correlations
Captures relation between two series
r = Corr(X, Y) = Cov(X, Y) / (σX σY) = E[(X − µX)(Y − µY)] / (σX σY)
Img Src: https://en.wikipedia.org/wiki/Correlation_and_dependence
Spurious Correlations
Img Src: https://www.tylervigen.com/spurious-correlations
Autocorrelation
Capturing the relation between a series and a lagged version of the same
Figure: Passenger data; Autocorrelation on Passenger data
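The sample autocorrelation at lag k can be sketched directly from the correlation formula in pure Python (libraries such as statsmodels provide this as `acf`/`plot_acf`; the period-4 toy series below is made up):

```python
def autocorr(x, k):
    """Sample autocorrelation of series x at lag k."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k))
    return cov / var

# A period-4 seasonal series correlates with itself at the seasonal lag.
series = [10, 14, 10, 6] * 5
print(round(autocorr(series, 4), 3))   # 0.8  (strong positive at the period)
print(round(autocorr(series, 2), 3))   # -0.9 (negative at half the period)
```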
White Noise
White noise is a time series that is purely random in nature
Let’s denote it by εt
Mean of white noise, i.e. E[εt], is 0, and variance is always constant
εt and εk are uncorrelated for t ≠ k
If the data is white noise, then intelligent forecasting is not possible
The best one can do is to just return the mean as the prediction
https://en.wikipedia.org/wiki/White_noise
Stationarity
A time series is stationary if it does not exhibit any trend or seasonality
Figure: Stationary Time Series vs. Non-Stationary Time Series
Stationarity (contd.)
Strict stationarity
P(Yt) = P(Yt+k), and P(Yt, Yt+k) is independent of t
Mean and variance are time invariant
Weak Stationarity
In this case, mean is constant and variance is constant
Cov(Y1, Y1+k) = Cov(Y2, Y2+k) = Cov(Y3, Y3+k) = γ
i.e. covariance only depends on the lag value k
Statistical Methods
Statistical Methods: Simple Models
Naive Forecasting
A simple baseline forecasting approach
Predict Yt+1 = Yt
i.e. forecast that the next value is going to be the same as the current value
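As a minimal sketch in pure Python, using the airline passenger values from the earlier slide: the one-step naive forecasts are just the series shifted by one.

```python
passengers = [112, 118, 132, 129, 121, 135]

# Naive one-step forecasts: the forecast for each month is the previous value.
forecasts = passengers[:-1]
actuals = passengers[1:]

# Mean absolute error of the naive baseline on this toy slice.
errors = [abs(a - f) for a, f in zip(actuals, forecasts)]
print(forecasts)                  # [112, 118, 132, 129, 121]
print(sum(errors) / len(errors))  # 9.0
```

Despite being trivial, this baseline is a useful sanity check: any serious model should beat it.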
Simple Moving Average (SMA)
Prediction is the mean of a rolling window over previous data
Yt = (1/n) ∑(i=1..n) Xt−i
where n is the rolling window size
Month | Thousands of Passengers | 6-month SMA | 12-month SMA
1949-01-01 112 NaN NaN
1949-02-01 118 NaN NaN
1949-03-01 132 NaN NaN
1949-04-01 129 NaN NaN
1949-05-01 121 NaN NaN
1949-06-01 135 124.500000 NaN
1949-07-01 148 130.500000 NaN
1949-08-01 148 135.500000 NaN
1949-09-01 136 136.166667 NaN
1949-10-01 119 134.500000 NaN
1949-11-01 104 131.666667 NaN
1949-12-01 118 128.833333 126.666667
1950-01-01 115 123.333333 126.916667
1950-02-01 126 119.666667 127.583333
1950-03-01 141 120.500000 128.333333
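A pure-Python sketch of SMA (in practice one would use pandas’ `rolling(n).mean()`); note the 6-month value at the sixth month matches the 124.5 in the table above.

```python
def sma(series, n):
    """n-period simple moving average; None until a full window is available."""
    out = []
    for t in range(len(series)):
        if t + 1 < n:
            out.append(None)  # the NaNs in the table above
        else:
            out.append(sum(series[t + 1 - n : t + 1]) / n)
    return out

passengers = [112, 118, 132, 129, 121, 135]
print(sma(passengers, 6))  # [None, None, None, None, None, 124.5]
```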
Simple Moving Average (SMA) (contd.)
Shortcomings of SMA:
Smaller windows lead to more noise, rather than signal
Will lag by window size
Cannot predict extreme values (due to averaging)
Captures trend, but poor at capturing other components; poor at forecasting
Exponential Weighted Moving Average (EWMA)
Gives exponentially high weights to nearby values and low weights to far off values while performing weighted averaging
Y0 = X0
Yt = (1− α)Yt−1 + αXt
where α is a smoothing factor such that 0 < α ≤ 1
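The recursion above takes only a few lines of pure Python (pandas offers the same via `ewm()`; the input values below are made up):

```python
def ewma(series, alpha):
    """Y0 = X0; Yt = (1 - alpha) * Y(t-1) + alpha * Xt, with 0 < alpha <= 1."""
    y = [series[0]]
    for x in series[1:]:
        y.append((1 - alpha) * y[-1] + alpha * x)
    return y

print(ewma([10.0, 20.0, 20.0, 20.0], 0.5))  # [10.0, 15.0, 17.5, 18.75]
```

A larger α reacts faster to recent changes; α = 1 reduces to the naive forecast.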
Comparison between SMA and EWMA
One can see seasonality better captured in EWMA as compared to SMA
Statistical Methods: Auto Regressive Models
Auto Regressive (AR) Models
If the series is not white noise, then the forecasting can be modeled as
Yt = f (Y1, . . . , Yt−1, εt) (5)
It is practically not feasible to consider all time steps
Approximation time!
Yt = β0 + β1Yt−1 + εt (6)
Since we used 1 step, this is called AR(1) model
Extending to AR(p), we get
Yt = β0 + β1Yt−1 + β2Yt−2 + · · ·+ βpYt−p + εt (7)
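As a minimal sketch of fitting the AR(1) model above, here is an ordinary least-squares fit of Yt against Yt−1 in pure Python (real workflows would use e.g. statsmodels; the series is synthetic, generated exactly by Yt = 2 + 0.5 Yt−1, so the fit recovers the coefficients):

```python
def fit_ar1(y):
    """Least-squares estimates (b0, b1) for Yt = b0 + b1 * Y(t-1)."""
    x, z = y[:-1], y[1:]  # lagged values and their targets
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    b1 = (sum((a - mx) * (b - mz) for a, b in zip(x, z))
          / sum((a - mx) ** 2 for a in x))
    b0 = mz - b1 * mx
    return b0, b1

# Synthetic AR(1) series: Yt = 2 + 0.5 * Y(t-1), starting from 8.
y = [8.0]
for _ in range(6):
    y.append(2 + 0.5 * y[-1])

b0, b1 = fit_ar1(y)
print(round(b0, 6), round(b1, 6))  # 2.0 0.5
```

An AR(p) fit works the same way, just as a multiple regression on the last p lags.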
Moving Average (MA) Models
Consider the modeling in AR
Yt = f (Y1, . . . ,Yt−1, et) (8)
Prediction based on previous values
In MA models, we model upon the white noise observations
Yt = f (ε1, . . . , εt−1, εt) (9)
Using the previous analogy, an MA(q) model learns
Yt = γ0 + εt + γ1εt−1 + γ2εt−2 + · · ·+ γqεt−q (10)
ARMA Models
ARMA models combine both AR and MA models
An ARMA(p,q) model predicts Yt using p previous values and q previous noise components
Yt = β0 + β1Yt−1 + β2Yt−2 + · · ·+ βpYt−p + εt + γ1εt−1 + γ2εt−2 + · · ·+ γqεt−q (11)
Differencing: Converting Non-stationary to Stationary
A time series which is non-stationary can be converted to a stationary time series by differencing
Y′t = Yt − Yt−1
If still not stationary, do second order differencing:
Y′′t = Y′t − Y′t−1 = Yt − 2Yt−1 + Yt−2
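A minimal sketch on a linear-trend toy series (pandas’ `diff()` does the same): one round of differencing removes the trend, leaving a constant series.

```python
def difference(series):
    """First-order differencing: Y't = Yt - Y(t-1)."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]

y = [3, 5, 7, 9, 11, 13]   # linear upward trend: non-stationary
d1 = difference(y)
print(d1)                  # [2, 2, 2, 2, 2] -- trend removed
print(difference(d1))      # [0, 0, 0, 0]   -- second-order differencing
```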
ARIMA Models
Stands for Auto Regressive Integrated Moving Average
In ARIMA, the AR and MA parts are the same as in ARMA
However, I indicates the amount of differencing done
If differencing done once, it is called I(1)
Thus an ARIMA(p,d,q) model is a combination of AR(p) andMA(q) with I(d)
How to Decide p, d, q ?
Difficult for a human - one has to look at various plots, run some tests, etc.
Another approach - Auto ARIMA
Learns p, d, and q automatically
Statistical Methods: Evaluation Metrics
Evaluation Metrics
Standard evaluation metrics for time series forecasting are:
Mean Absolute Error (MAE)
Mean Absolute Percentage Error (MAPE)
Mean Squared Error (MSE)
Root Mean Squared Error (RMSE)
Normalized Root Mean Squared Error (NRMSE)
Mean Absolute Error (MAE)
MAE = (1/n) ∑(j=1..n) |ŷj − yj| (13)
Measures the average magnitude of the errors
If MAE = 0, then no error
Unable to properly alert when the forecast is very off for a few points
Mean Absolute Percentage Error (MAPE)
MAPE = (100%/n) ∑(j=1..n) |(ŷj − yj) / yj| (14)
Percentage equivalent of MAE
Not defined for zero values
Mean Squared Error (MSE)
MSE = (1/n) ∑(j=1..n) (ŷj − yj)² (15)
Measures the mean of the squared error
Those forecast values which are very off are penalized more
Squared values make it more difficult to interpret the errors
Root Mean Squared Error (RMSE)
RMSE = √[ (1/n) ∑(j=1..n) (ŷj − yj)² ] (16)
The value of the loss is of similar magnitude as that of the prediction
Thereby making it more interpretable
Also punishes large prediction errors
Normalized Root Mean Squared Error (NRMSE)
NRMSE = √[ (1/n) ∑(j=1..n) (ŷj − yj)² ] / Z (17)
where Z is the normalization factor
NRMSE allows for comparison between models across different datasets
Common normalization factors:
Mean: preferred when same preprocessing and predicted feature
Range: sensitive to sample size
Standard Deviation: suitable across datasets as well as predicted features
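All five metrics can be sketched directly from their formulas in pure Python (the true/predicted values below are made up; here Z is passed in explicitly, e.g. the mean of the true values):

```python
import math

def mae(y, yhat):
    return sum(abs(a - p) for a, p in zip(y, yhat)) / len(y)

def mape(y, yhat):
    # Undefined if any true value is zero.
    return 100 * sum(abs((a - p) / a) for a, p in zip(y, yhat)) / len(y)

def mse(y, yhat):
    return sum((a - p) ** 2 for a, p in zip(y, yhat)) / len(y)

def rmse(y, yhat):
    return math.sqrt(mse(y, yhat))

def nrmse(y, yhat, z):
    """z is the normalization factor (e.g. mean, range, or std of y)."""
    return rmse(y, yhat) / z

y_true = [100.0, 200.0, 300.0]
y_pred = [110.0, 190.0, 330.0]
print(round(mae(y_true, y_pred), 3))   # 16.667
print(round(mape(y_true, y_pred), 3))  # 8.333
print(round(rmse(y_true, y_pred), 3))  # 19.149
```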
Classical ML Models
Classical ML Models: Preparing Data
Preparing Time Series Data for Machine Learning
Time | Extra Feature | Feature of Interest
t1 e1 x1
t2 e2 x2
t3 e3 x3
t4 e4 x4
t5 e5 x5
t6 e6 x6
t7 e7 x7
t8 e8 x8
t9 e9 x9
t10 e10 x10
t11 e11 x11
t12 e12 x12
One Step Forecasting Setup
Time | Extra Feature | Feature of Interest | Forecast Feature of Interest
t1 e1 x1 x2
t2 e2 x2 x3
t3 e3 x3 x4
t4 e4 x4 x5
t5 e5 x5 x6
t6 e6 x6 x7
t7 e7 x7 x8
t8 e8 x8 x9
t9 e9 x9 x10
t10 e10 x10 x11
t11 e11 x11 x12
t12 e12 x12 NaN
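This shifting can be sketched in a line of pure Python (pandas users would call `shift(-1)`; the xi strings stand in for the values in the table):

```python
series = ["x1", "x2", "x3", "x4", "x5"]

# Pair each value with the next one; the last target is unknown (NaN above).
rows = [(series[t], series[t + 1] if t + 1 < len(series) else None)
        for t in range((len(series)))]
print(rows)
# [('x1', 'x2'), ('x2', 'x3'), ('x3', 'x4'), ('x4', 'x5'), ('x5', None)]
```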
Random Split
Time | Extra Feature | Feature of Interest | Forecast Feature of Interest
t1 e1 x1 x2
t2 e2 x2 x3
t3 e3 x3 x4
t4 e4 x4 x5
t5 e5 x5 x6
t6 e6 x6 x7
t7 e7 x7 x8
t8 e8 x8 x9
t9 e9 x9 x10
t10 e10 x10 x11
t11 e11 x11 x12

Time | Extra Feature | Feature of Interest | Forecast Feature of Interest
t1 e1 x1 x2
t2 e2 x2 x3
t4 e4 x4 x5
t6 e6 x6 x7
t7 e7 x7 x8
t9 e9 x9 x10
t10 e10 x10 x11

Table: Train Set

Time | Extra Feature | Feature of Interest | Forecast Feature of Interest
t3 e3 x3 x4
t5 e5 x5 x6
t8 e8 x8 x9
t11 e11 x11 x12

Table: Test Set
Sequential Split
Time | Extra Feature | Feature of Interest | Forecast Feature of Interest
t1 e1 x1 x2
t2 e2 x2 x3
t3 e3 x3 x4
t4 e4 x4 x5
t5 e5 x5 x6
t6 e6 x6 x7
t7 e7 x7 x8
t8 e8 x8 x9
t9 e9 x9 x10
t10 e10 x10 x11
t11 e11 x11 x12

Time | Extra Feature | Feature of Interest | Forecast Feature of Interest
t1 e1 x1 x2
t2 e2 x2 x3
t3 e3 x3 x4
t4 e4 x4 x5
t5 e5 x5 x6
t6 e6 x6 x7
t7 e7 x7 x8
t8 e8 x8 x9

Table: Train Set

Time | Extra Feature | Feature of Interest | Forecast Feature of Interest
t9 e9 x9 x10
t10 e10 x10 x11
t11 e11 x11 x12

Table: Test Set
Multiple Train-Test Split
Time | Extra Feature | Feature of Interest | Forecast (Feature of Interest)
t1 | e1 | x1 | x2
t2 | e2 | x2 | x3
t3 | e3 | x3 | x4
Table: Train Set 1

Time | Extra Feature | Feature of Interest | Forecast (Feature of Interest)
t4 | e4 | x4 | x5
t5 | e5 | x5 | x6
t6 | e6 | x6 | x7
Table: Test Set 1

Time | Extra Feature | Feature of Interest | Forecast (Feature of Interest)
t1 | e1 | x1 | x2
t2 | e2 | x2 | x3
t3 | e3 | x3 | x4
t4 | e4 | x4 | x5
t5 | e5 | x5 | x6
t6 | e6 | x6 | x7
Table: Train Set 2

Time | Extra Feature | Feature of Interest | Forecast (Feature of Interest)
t7 | e7 | x7 | x8
t8 | e8 | x8 | x9
t9 | e9 | x9 | x10
Table: Test Set 2
Multiple Train-Test Split (contd.)
Train Size | Test Size
3 timesteps (t1 - t3) | 3 timesteps (t4 - t6)
6 timesteps (t1 - t6) | 3 timesteps (t7 - t9)
9 timesteps (t1 - t9) | 3 timesteps (t10 - t12)
12 timesteps (t1 - t12) | 3 timesteps (t13 - t15)
Expanding Window Multiple Sets
Train Size | Test Size
10 timesteps (t1 - t10) | t11
11 timesteps (t1 - t11) | t12
12 timesteps (t1 - t12) | t13
13 timesteps (t1 - t13) | t14
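The expanding-window (walk-forward) scheme above can be sketched as:

```python
# Walk-forward (expanding window) validation: each step trains on all data
# seen so far and tests on the single next timestep.
series = list(range(1, 15))  # t1 .. t14 (toy values equal to their index)

splits = []
for cut in range(10, len(series)):    # start with 10 training timesteps
    train = series[:cut]              # everything seen so far
    test = series[cut]                # the next single timestep
    splits.append((len(train), test))

# splits == [(10, 11), (11, 12), (12, 13), (13, 14)]
```

Each iteration mimics how the model would actually be used in deployment: retrain on all history, forecast one step ahead.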
Fixed Window Sequential Data
Sequence No. | Input Data | Output Data
1 | t1, t2, t3 | t4
2 | t2, t3, t4 | t5
3 | t3, t4, t5 | t6
4 | t4, t5, t6 | t7
5 | t5, t6, t7 | t8
6 | t6, t7, t8 | t9
7 | t7, t8, t9 | t10
8 | t8, t9, t10 | t11
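The fixed-window pairs above can be generated with a small sliding-window helper (toy series):

```python
# Build fixed-window (input, output) pairs: a window of `width` timesteps
# is used to predict the immediately following timestep.
series = ['t%d' % i for i in range(1, 12)]  # t1 .. t11

def make_windows(seq, width=3):
    pairs = []
    for i in range(len(seq) - width):
        pairs.append((seq[i:i + width], seq[i + width]))
    return pairs

pairs = make_windows(series)
# pairs[0] == (['t1', 't2', 't3'], 't4'); 8 pairs in total, as in the table
```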
Comparison of Different Dataset Preparation Approaches
Split Approach | Comments
Random Split | Not advisable, as the temporal information is lost
Sequential Split | Mostly preferred on large datasets
Multiple Splits | Leads to leakage of data
Expanding Window Multiple Sets | Also known as walk-forward validation
Classical ML Models: ML Models
Linear Regression
Models change in one variable through changes in other variables
A method for finding the linear relationship between independent and dependent variables
Assumes that a linear relationship exists!
Also known as the line of best fit, ordinary least squares regression, etc.
Estimating Linear Regression
Simple univariate linear regression
Given training data of the form $(x, y)$, learn $w$ and $b$ such that
$$(y - (wx + b))^2 \qquad (18)$$
is minimized
Simple multivariate linear regression
Given training data of the form $(X, y)$ where $X$ is $n$-dimensional, learn $w_1, \ldots, w_n$ and $b$ such that
$$\Big(y - \Big(\sum_{i=1}^{n} w_i x_i + b\Big)\Big)^2 \qquad (19)$$
is minimized
Forecasting with Linear Regression
Simple univariate linear regression
Given a new data point $x$, forecast using learned parameters $w$ and $b$ as
$$y = wx + b \qquad (20)$$
Simple multivariate linear regression
Given new data $X$ where $X$ is $n$-dimensional, forecast using learned parameters $w_1, \ldots, w_n$ and $b$ as
$$y = \sum_{i=1}^{n} w_i x_i + b \qquad (21)$$
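As a sketch, univariate linear regression can be fitted and used for a one-step forecast with NumPy's least-squares solver (the noiseless toy series and variable names are ours, not from the tutorial):

```python
import numpy as np

# Fit univariate linear regression by ordinary least squares and forecast.
x = np.arange(1, 11, dtype=float)          # timesteps 1..10
y = 2.0 * x + 1.0                          # a noiseless linear series (toy data)

A = np.column_stack([x, np.ones_like(x)])  # design matrix [x, 1]
(w, b), *_ = np.linalg.lstsq(A, y, rcond=None)

forecast = w * 11 + b                      # one-step-ahead forecast at t = 11
# w is close to 2.0, b close to 1.0, so the forecast is close to 23.0
```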
Support Vector Regression
This model uses the concept of support vectors for regression
Performs linear regression in a high-dimensional feature space
The aim is to fit the error within a threshold range
A hyperplane is obtained such that the loss is minimized
The loss is considered to be zero within a small deviation ε from the hyperplane
Estimating SVR
Multivariate scenario
Given training data of the form $(X, y)$ where $X$ is $n$-dimensional, learn $w_1, \ldots, w_n$ and $b$ such that
$$\text{Loss} = \begin{cases} 0 & \text{if } |f(x_i) - y_i| < \varepsilon \\ |f(x_i) - y_i| - \varepsilon & \text{otherwise} \end{cases} \qquad (22)$$
is minimized, where $f(x) = \sum_{i=1}^{n} w_i x_i + b$
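The ε-insensitive loss of Eq. (22) can be sketched in NumPy (the predictions, targets, and ε value below are illustrative):

```python
import numpy as np

# Epsilon-insensitive loss: zero within a deviation of eps from the
# hyperplane, linear beyond it.
def eps_insensitive_loss(y_pred, y_true, eps=0.5):
    err = np.abs(y_pred - y_true)
    return np.where(err < eps, 0.0, err - eps)

losses = eps_insensitive_loss(np.array([1.0, 2.0, 4.0]),
                              np.array([1.2, 2.0, 2.5]))
# absolute errors are 0.2, 0.0, 1.5 -> losses 0.0, 0.0, 1.0
```

Only points outside the ε-tube contribute to the loss, which is what makes the fitted function depend on a sparse set of support vectors.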
Deep Learning Models
Feedforward Neural Networks
(Figure: a feedforward network with an input layer, a hidden layer, and an output layer)
Feedforward Neural Network: Forward Propagation
Let $X = (x_1, \ldots, x_n)$ be the set of input features
Hidden layer activation neurons: $a_j = f\big(\sum_{i=1}^{n} W_{ji} x_i\big), \; \forall j \in 1, \ldots, h$
Feedforward Neural Network: Forward Propagation
Let $a = (a_1, \ldots, a_h)$ be the set of hidden layer features
Output neurons: $o_k = g\big(\sum_{j=1}^{h} U_{kj} a_j\big), \; \forall k \in 1, \ldots, K$
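The two-stage forward pass above can be sketched in NumPy (random toy weights; the layer sizes and names are our choices):

```python
import numpy as np

# Forward pass of a one-hidden-layer feedforward network, following the
# slides' notation: a = f(W x), o = g(U a).
rng = np.random.default_rng(0)
n, h, K = 4, 3, 2                    # input, hidden, output sizes

x = rng.normal(size=n)               # input features
W = rng.normal(size=(h, n))          # input -> hidden weights
U = rng.normal(size=(K, h))          # hidden -> output weights

f = np.tanh                          # hidden activation
g = lambda z: z                      # identity output, as in regression

a = f(W @ x)                         # hidden activations a_j
o = g(U @ a)                         # outputs o_k
# a has shape (3,), o has shape (2,)
```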
Feedforward Neural Network: Learning Algorithm
Adjust weights $W$ and $U$ to minimize the error on the training set
Define the error to be the squared loss between predictions and true outputs
$$E = \frac{1}{2}(y - o)^2 \qquad (23)$$
The gradient w.r.t. the output is
$$\frac{\partial E}{\partial o_k} = \frac{1}{2} \times 2 \times (y_k - o_k) \times (-1) = (o_k - y_k) \qquad (24)$$
Recurrent Neural Networks
Feedforward networks cannot handle sequences
If sequential data is flattened, it can be learned by an FFN
However, weights will not be shared across timesteps
Recurrent networks to the rescue!
An Unrolled RNN
NOTE: The hidden state $h_t$ gives a summary of the sequence up to time $t$
Forward pass:
$$h_t = \tanh(W h_{t-1} + U x_t + b_h)$$
$$z_t = \mathrm{softmax}(V h_t + b_z)$$
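The recurrence $h_t = \tanh(W h_{t-1} + U x_t + b_h)$ can be unrolled over a toy series as follows (random illustrative weights; one input feature per timestep):

```python
import numpy as np

# Unrolled RNN forward pass over a short series, following the slide's
# recurrence h_t = tanh(W h_{t-1} + U x_t + b_h).
rng = np.random.default_rng(1)
hidden = 5

W = rng.normal(size=(hidden, hidden)) * 0.1   # hidden -> hidden weights
U = rng.normal(size=hidden) * 0.1             # input -> hidden weights
b_h = np.zeros(hidden)

h = np.zeros(hidden)                          # h_0
series = [0.5, 0.7, 0.6, 0.9]                 # toy univariate inputs
for x_t in series:
    h = np.tanh(W @ h + U * x_t + b_h)        # same W, U reused at every step

# h now summarizes the whole sequence up to the last timestep
```

Note that the same `W` and `U` are applied at every timestep: this weight sharing is exactly what the flattened-FFN approach lacks.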
Long Short Term Memory (LSTM) Network
Forward pass¹
$$f_t = \sigma(W_f [h_{t-1}; x_t] + b_f)$$
$$i_t = \sigma(W_i [h_{t-1}; x_t] + b_i)$$
$$a_t = \tanh(W_a [h_{t-1}; x_t] + b_a)$$
$$o_t = \sigma(W_o [h_{t-1}; x_t] + b_o)$$
$$C_t = f_t \ast C_{t-1} + i_t \ast a_t$$
$$h_t = o_t \ast \tanh(C_t)$$
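A single LSTM step in this notation can be sketched in NumPy (toy sizes and random weights; all names are ours):

```python
import numpy as np

# One LSTM step: gates f, i, o and candidate a are computed from the
# concatenation [h_{t-1}; x_t], then C_t = f*C_{t-1} + i*a, h_t = o*tanh(C_t).
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(2)
hidden, n_in = 4, 3
Wf, Wi, Wa, Wo = (rng.normal(size=(hidden, hidden + n_in)) * 0.1
                  for _ in range(4))
bf = bi = ba = bo = np.zeros(hidden)

h_prev, C_prev = np.zeros(hidden), np.zeros(hidden)
x_t = rng.normal(size=n_in)
z = np.concatenate([h_prev, x_t])        # [h_{t-1}; x_t]

f_t = sigmoid(Wf @ z + bf)               # forget gate
i_t = sigmoid(Wi @ z + bi)               # input gate
a_t = np.tanh(Wa @ z + ba)               # candidate cell content
o_t = sigmoid(Wo @ z + bo)               # output gate

C_t = f_t * C_prev + i_t * a_t           # cell state update
h_t = o_t * np.tanh(C_t)                 # new hidden state
```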
¹For a more detailed treatment of neural networks, refer to the ICON 2018 slides at http://www.cfilt.iitb.ac.in/documents/ICON_Tutorial_2018.pdf by Kevin Patel and Himanshu Singh
Connection with NLP
Connection with NLP: Problem Level
Problem Level Connections
Discuss connections and similarities among problems, and how one solution can impact another
Time Series Forecasting benefitting from NLP
NLP benefitting from Time Series Forecasting
On the Importance of Text Analysis for Stock Price Prediction
By Lee et al. (2014)
Forecast companies' stock price changes (UP, DOWN, STAY) in response to financial events reported by them in 8-K documents
Baseline: Using recent stock price movement and earnings surprise
Contribution: Using textual information from 8-K documents along with recent stock price movement and earnings surprise
Observation: The proposed system outperforms the baseline by 10%
Resource Contribution: Annotated 8-K documents for use in further research
Semantic Frames to Predict Stock Price Movement
Xie et al. (2013)
Uses FrameNet information to generalize specific sentences to scenarios
Semantic Frames to Predict Stock Price Movement (contd.)
Predict 1) the change in stock price, or 2) the polarity of the change (up/down)
Baseline: BOW features and LDA
Contribution: FWD features (Frames, BOW, and part-of-speech-specific DAL scores) and SemTree data representations
Model: SVM with tree kernels
Observation: The proposed features assist significantly in the polarity task, and show promise in the change task.
Stock Movement Prediction from Tweets and Historical Prices
Xu and Cohen (2018)
Present a novel deep generative model jointly exploiting text and price signals for this task
Introduce recurrent, continuous latent variables for better treatment of stochasticity, and use neural variational inference
Resource Contribution: A new stock movement prediction dataset²
²https://github.com/yumoxu/stocknet-dataset
Identifying and Following Expert Investors in Stock Microblogs
Bar-Haim et al. (2011)
Task: Identify expert investors from the information published in online stock investment message boards / tweets
Indirect evaluation by considering the advice of detected experts in stock prediction
Baseline: Assume all users are experts
Contribution: A probabilistic expert-finding framework
Observation: Information from the tweets of identified experts allowed forecasting stock price movement with higher precision.
Connection with NLP: Tooling Level
Attention
Anyone familiar with technical indicators in stock market prediction will probably have an epiphany at this point
Attention can be used by a neural network to attend to arbitrary portions of the time signal for forecasting
Qin et al. (2017) use two attentions in their paper to improve time series prediction:
  An input attention which adaptively extracts relevant input features (more interpretable)
  A temporal attention over the encoder states (better performance)
Outperforms the state of the art on two time series prediction datasets
Transfer Learning
From AlexNet and ResNet in computer vision to BERT and ELMo in NLP, transfer learning has proved its effectiveness
The idea of learning in one scenario and applying the same in another with little tuning is fascinating
What could be the equivalent in time series forecasting?
Ye and Dai (2018) combine transfer learning with an online sequential extreme learning machine and ensemble learning
  Does not discard long-ago data; instead, the authors claim that their model is able to transfer knowledge from long-ago data
  Showed effectiveness on multiple synthetic and real-world datasets
Demos
Demos: Statsmodels Library
Statsmodels Library
Statsmodels is a Python package that allows users to explore data, estimate statistical models, and perform statistical tests
Has an extensive list of descriptive statistics, statistical tests, plotting functions, and result statistics, for different data types and estimators
Built on top of NumPy and SciPy
Integrates with Pandas
Demos: Prophet Library
Prophet Library
An open-source library released by Facebook
Originally designed to forecast business data internally at Facebook
For more details, refer to Taylor and Letham (2018)
Prophet
It is an additive regression model with 4 components
$$Y_t = g_t + s_t + h_t + \varepsilon_t \qquad (25)$$
where $g_t$ is the trend, $s_t$ is seasonality, $h_t$ is holidays, and $\varepsilon_t$ is the error term.
It automatically detects change points in the data
It is robust to missing data and shifts in the trend, and handles outliers well.
Other Options
Google AutoML
  Provides a web interface
  The models being used are hidden
  Provides a final metric for the model being used
  Has a trial period
Microsoft Azure
  Provides a web interface
  Exposes the list of models
  Provides metrics for each model being tested
  Has a trial period
Amazon Forecast
  Provides a web interface
  Exposes the list of models
  Provides a final metric for the model being used
  No trial period
Demo Content
The code for the demo on Statsmodels and ARIMA can be found at https://github.com/Sandhya2207/ICON-2019-TSF-demo
The code for the demo on how statistical techniques help deep learning techniques can be found at https://github.com/KevinNPatel/icon2019_demo
Conclusion
Time Series Forecasting Competitions
Santa Fe Time Series Prediction and Analysis Competition (1994)
International Workshop on Advanced Black-box Techniques for Nonlinear Modeling competition (1998)
NN3 and NN5 competitions
Kaggle challenges
Makridakis challenges (M1, M2, M3 and M4)
Time Series Forecasting Conferences
Makridakis conferences: https://mofc.unic.ac.cy/m-conferences/
International Conference on Time Series and Forecasting (ITISE)
ACM Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD)
IEEE International Conference on Data Mining (ICDM)
Society for Industrial and Applied Mathematics (SIAM)
The usual ACL, EMNLP, etc., for the interplay between text and time series
Time Series Forecasting Datasets
UCR data: https://www.cs.ucr.edu/~eamonn/UCRsuite.html
Makridakis challenge data
Simple data for testing models, created using generative models like ARIMA
Simple real-world datasets³:
  Airline Passenger dataset
  Shampoo Sales dataset
  Minimum Daily Temperatures dataset
  Monthly Sunspot dataset
  Daily Female Births dataset
  EEG Eye State dataset
  Occupancy Detection dataset
  Ozone Level Detection dataset
³https://machinelearningmastery.com/time-series-datasets-for-machine-learning/
Conclusion
Time Series Forecasting is an interesting challenge
Provided background and clarified the terminology of time series
Discussed different approaches:
  Statistical Approaches
  Classical ML
  Deep Learning
Discussed a few papers showcasing the interplay between NLP and Time Series, and potential future directions worth exploring
Thank You
References I
Bar-Haim, R., Dinur, E., Feldman, R., Fresko, M., and Goldstein, G. (2011). Identifying and following expert investors in stock microblogs. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pages 1310-1319, Edinburgh, Scotland, UK. Association for Computational Linguistics.

Lee, H., Surdeanu, M., MacCartney, B., and Jurafsky, D. (2014). On the importance of text analysis for stock price prediction. In LREC, pages 1170-1175.

Qin, Y., Song, D., Cheng, H., Cheng, W., Jiang, G., and Cottrell, G. W. (2017). A dual-stage attention-based recurrent neural network for time series prediction. In Proceedings of the 26th International Joint Conference on Artificial Intelligence, pages 2627-2633. AAAI Press.
References II
Taylor, S. J. and Letham, B. (2018). Forecasting at scale. The American Statistician, 72(1):37-45.

Xie, B., Passonneau, R. J., Wu, L., and Creamer, G. G. (2013). Semantic frames to predict stock price movement. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 873-883, Sofia, Bulgaria. Association for Computational Linguistics.

Xu, Y. and Cohen, S. B. (2018). Stock movement prediction from tweets and historical prices. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1970-1979.

Ye, R. and Dai, Q. (2018). A novel transfer learning framework for time series forecasting. Knowledge-Based Systems, 156:74-99.