Introduction to Time Series - Purdue University (boli/stat420/lectures/lecture1.pdf)


Introduction to Time Series

Dr. Bo Li

January 8, 2013


- Introduction
  - Examples of time series
  - A time series problem
  - Terminology
  - Objectives of Time Series Analysis
- Getting started with R


What is a time series?

A time series is a collection of observations $x_t$ made sequentially through time. Examples occur in a variety of fields, ranging from economics to engineering.

Examples of time series:

- Monthly sales of U.S. houses (thousands), 1965-1975
- The Beveridge wheat price annual index series from 1500 to 1869
  - Average wheat price in nearly 50 places in various countries
  - Of particular interest to economic historians
- Time series related to temperature
  - Northern Hemisphere mean temperature
  - 14 tree ring series
  - Important for past temperature reconstruction


Monthly sales of U.S. houses (thousands), 1965-1975

[Figure: time plot of sales (thousands) against month, 1965-1975.]


Average wheat price (1500-1869)

[Figure: time plot of the price index against year, 1500-1869.]


Past temperature reconstruction


Instrumental temperature series

[Figure: time plot of temperature (degrees) against year, 1850-2000.]


14 tree ring series

[Figure: the 14 tree ring series plotted against year, 1000-2000.]


Long Climate Proxies and NH temperatures

Exploit where there is overlap in two sets of data


Reconstructed temperatures

[Figure: reconstructed temperatures (degrees C) against year, 1000-1900.]


Terminology

- Continuous: A time series is continuous when observations are made continuously through time, even when the measured variable can only take a discrete set of values. E.g., a binary process observed in continuous time is a continuous time series.
- Discrete: A time series is discrete when observations are taken only at specific times, usually equally spaced, even when the measured variable is continuous.
- We will focus mainly on discrete time series.


Terminology

- Discrete time series can arise in several ways:
  - Sampled: Given a continuous time series, we could read off the values at equal intervals of time to give a discrete time series, sometimes called a sampled series. The sampling interval between successive readings must be carefully chosen so as to lose little information.
  - Aggregated: Aggregate the values of a continuous time series over equal intervals of time. E.g., monthly exports and daily rainfall.
- What is the main difference between a time series and a random sample of independent observations?
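The two constructions above, sampling and aggregating, can be sketched in R. This is only an illustration under stated assumptions: the fine-grained record `hourly` is simulated and stands in for a continuous series, and the block length of 24 is an arbitrary choice; neither comes from the lecture.

```r
# Hypothetical fine-grained record standing in for a "continuous" series:
# 120 hourly readings (simulated, not data from the lecture).
set.seed(420)
hourly <- cumsum(rnorm(120))

# Sampled series: read off the value at equal intervals (every 24th reading).
sampled <- hourly[seq(from = 24, to = length(hourly), by = 24)]

# Aggregated series: total the values over each block of 24 readings.
block <- rep(1:5, each = 24)
aggregated <- tapply(hourly, block, sum)

length(sampled)     # one value per 24-reading block
length(aggregated)  # one value per 24-reading block
```

Both results are discrete series of the same length, but the sampled series discards everything between readings, while the aggregated series summarizes each interval.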


Terminology

Answer: The special feature of time-series analysis is that successive observations are usually not independent, so the analysis must take into account the time order of the observations. When successive observations are dependent, future values may be predicted from past observations.

- Deterministic: If a time series can be predicted exactly, it is said to be deterministic, e.g., $x_t = 3x_{t-1}$.
- Stochastic: Most time series are stochastic in that the future is only partly determined by past values, so that exact predictions are impossible and must be replaced by the idea that future values have a probability distribution conditioned on knowledge of past values, e.g., $X_t = 3X_{t-1} + \varepsilon$, $\varepsilon \sim N(0, \sigma^2)$.
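A minimal sketch of the two example recursions in R may help make the distinction concrete. The starting value 1, the series length 6, and sigma = 1 are assumptions chosen for illustration; the slides do not specify them.

```r
n <- 6          # number of time points (assumed)
sigma <- 1      # noise standard deviation (assumed)

# Deterministic: x_t = 3 * x_{t-1}, fully determined by the starting value.
x <- numeric(n)
x[1] <- 1       # assumed starting value
for (i in 2:n) x[i] <- 3 * x[i - 1]

# Stochastic: X_t = 3 * X_{t-1} + eps, with eps ~ N(0, sigma^2).
set.seed(1)
X <- numeric(n)
X[1] <- 1       # same assumed starting value
for (i in 2:n) X[i] <- 3 * X[i - 1] + rnorm(1, mean = 0, sd = sigma)

x   # the same values on every run
X   # depends on the random draws; changes with the seed
```

Given the past, the next value of `x` is known exactly, whereas the next value of `X` is known only up to a normal distribution centered at three times the previous value.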


Objectives

- Description
  - Time plot: plot the time series against time, then obtain simple descriptive measures of the main properties of the series. Cyclic pattern? Seasonal effect? Upward trend? Outliers ("wild observations")?
    Anyone who tries to analyze a time series without plotting it first is asking for trouble!
- Explanation: When observations are taken on two or more variables, it may be possible to use the variation in one time series to explain the variation in another series. E.g., it is interesting to see how sea level is affected by temperature and pressure, and how sales are affected by price and economic conditions.
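As a rough illustration of the description step, the sketch below draws a time plot and then computes a few simple descriptive measures. It uses `AirPassengers`, a monthly series that ships with R's `datasets` package; this dataset is an assumption made for the example and is not one of the lecture's series.

```r
# AirPassengers ships with R (datasets package); it is an illustrative
# stand-in, not one of the series used in the lecture.
y <- AirPassengers

# Time plot first: look for trend, seasonal pattern, and outliers.
plot(y, xlab = "year", ylab = "passengers (thousands)", type = "l")

# Simple descriptive measures of the series.
summary(as.numeric(y))
range(time(y))   # first and last time points
frequency(y)     # observations per year
```

In this series the plot shows both an upward trend and a seasonal effect, which a table of summary statistics alone would not reveal.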


Objectives

- Prediction: Given an observed time series, one may want to predict the future values of the series. This is an important task in sales forecasting and in the analysis of economic and industrial time series. Here we also call it "forecasting".
- Control: Time series are sometimes collected or analyzed so as to improve control over some physical or economic system. For example, when a time series is generated that measures the "quality" of a manufacturing process, the aim of the analysis may be to keep the process operating at a "high" level. Control problems are closely related to prediction in many situations. For example, if one can predict that a manufacturing process is going to move off target, then appropriate corrective action can be taken.


“Data Analysts Captivated by R’s Power”, The New York Times

To some people R is just the 18th letter of the alphabet. To others, it’s the rating on racy movies, a measure of an attic’s insulation or what pirates in movies say.

R is also the name of a popular programming language used by a growing number of data analysts inside corporations and academia. It is becoming their lingua franca partly because data mining has entered a golden age, whether being used to set ad prices, find new drugs more quickly or fine-tune financial models. Companies as diverse as Google, Pfizer, Merck, Bank of America, the InterContinental Hotels Group and Shell use it.


continued:

But R has also quickly found a following because statisticians, engineers and scientists without computer programming skills find it easy to use.

“R is really important to the point that it’s hard to overvalue it,” said Daryl Pregibon, a research scientist at Google, which uses the software widely. “It allows statisticians to do very intricate and complicated analyses without knowing the blood and guts of computing systems.”


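Getting started with R

R code to make the wheat time plot:

# Read the wheat price index: no header row, whitespace-separated values
file <- read.table('wheat.txt', header = FALSE, sep = '')
file <- as.matrix(file)
# Transpose so the table is flattened row by row into one vector
wheat <- as.vector(t(file))
# Time plot of the index against year, 1500-1869
plot(1500:1869, wheat, xlab = 'year', ylab = 'price index',
     type = 'l', lwd = 2.5, col = 'blue')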


More R code examples:

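x <- c(17, 19, 20, 15, 13, 14, 14, 14, 14)  # nine observations
plot(x)               # plot against the index 1, ..., 9
plot(1500:1508, x)    # plot against years; nine years to match the nine values
points(1500:1508, x)  # add the same points to the existing plot
mean(x)               # sample mean
var(x)                # sample variance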
