applied multivariate statistics - uni-mannheim.de...applied multivariate statistics (ams) -content...

21
Applied Multivariate Statistics Fall Semester 2019 University of Mannheim Department of Economics Chair of Statistics Dr. Toni Stocker

Upload: others

Post on 14-Jan-2020

53 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Applied Multivariate Statistics - uni-mannheim.de...Applied Multivariate Statistics (AMS) -Content Introduction to AMS Principal Component Analysis (PCA) Correspondence Analysis Matrix

Applied Multivariate StatisticsFall Semester 2019

University of Mannheim

Department of Economics

Chair of Statistics

Dr. Toni Stocker

Page 2: Applied Multivariate Statistics - uni-mannheim.de...Applied Multivariate Statistics (AMS) -Content Introduction to AMS Principal Component Analysis (PCA) Correspondence Analysis Matrix

Applied Multivariate Statistics (AMS) - Content

Introduction to AMS

Principal Component Analysis (PCA)

Correspondence Analysis

Matrix Algebra

Multivariate Samples

Biplots

Multidimensional Scaling (MDS)

Cluster Analysis

Linear Discriminant Analysis (LDA)

Binary Response Models

Factor Analysis

1

20

57

77

129

141

152

170

183

194

212

Page 3: Applied Multivariate Statistics - uni-mannheim.de...Applied Multivariate Statistics (AMS) -Content Introduction to AMS Principal Component Analysis (PCA) Correspondence Analysis Matrix

Introduction to AMS

1

Page 4: Applied Multivariate Statistics - uni-mannheim.de...Applied Multivariate Statistics (AMS) -Content Introduction to AMS Principal Component Analysis (PCA) Correspondence Analysis Matrix

2

● All other students:

should have attended two or more courses in Statistics

(descriptive statistics, estimating and hypothesis testing)

● Students in Economics from Mannheim: no problem

● A course in Basic Econometrics or Linear Algebra would be helpful

but ist not strictly required.

● The statistical software R will intensively be used throughout this

course. Students who are not yet familiar with R should work

through chapters 1-5 of the R introduction (see course folder) on

their own by September 13 at the latest.

● If you are not yet sure whether you will attend this course, you may

read sections 1.1 and 1.2 in Johnson & Wichern (see p. 3) to get an

idea about the purposes of this course.

● Though R is easy to learn, you need to invest some time at

the beginning. But you may benefit from it for a long time.

PrerequisitesPrerequisites

General Course InformationGeneral Course Information

Page 5: Applied Multivariate Statistics - uni-mannheim.de...Applied Multivariate Statistics (AMS) -Content Introduction to AMS Principal Component Analysis (PCA) Correspondence Analysis Matrix

Day Time Location

Lecture Friday 10:15-11:45 L9, 1-2, 003

Tutorial Friday 08:30-10:15 L9, 1-2, 003

Dr. Toni Stocker

Office Hour: Wednesday, 3:00-4:30 p.m. or by appointment

Office: L7, 3-5, 1st floor, room 143

Phone: 0621-181-3963

Email: [email protected]

General Course Information

Tutorials start in the 2nd week.

3

Time and LocationsTime and Locations

ContactContact

Page 6: Applied Multivariate Statistics - uni-mannheim.de...Applied Multivariate Statistics (AMS) -Content Introduction to AMS Principal Component Analysis (PCA) Correspondence Analysis Matrix

Slides (Lecture), Assignments (Tutorials), ‘Introduction to R’ (see p. 2)

Material will be updated weekly (Friday) to find in course folder

at Studierendenportal (ILIAS)

General Course Information

● R. Johnson, D. Wichern (2007):

Applied Multivariate Statistical Analysis; Pearson Intl. Ed.

4

Main Reference

● A. J. Izenman (2008):

Modern Multivariate Statistical Techniques; Springer.

● P. Hewson (2009):

Multivariate Statistics with R; Open Text Book.

● A. Rencher (2002):

Methods of Multivariate Analysis; Wiley.

● W. Härdle, L. Simar (2003):

Applied Multivariate Statistical Analysis; Wiley.

Course MaterialCourse Material

ReferencesReferences

Page 7: Applied Multivariate Statistics - uni-mannheim.de...Applied Multivariate Statistics (AMS) -Content Introduction to AMS Principal Component Analysis (PCA) Correspondence Analysis Matrix

Exam + Assignments:

80% written exam (120 minutes) + 20% Assignments

in terms of points to earn in total.

Example:

Points

Written Exam: 60 (from 80)

Assignments: 18 (from 20):

Total: 78 (from 100)

=> Grading will be based on 78 points (from 100)

Minimum for passing: ≤ 40

Assignments:

Need to submit homework and attend tutorial. To get full points (20) you

need to work at least on 10 assignments (out of 12) in a meaningful way.

(See Guidelines for Assignments)

5

ExaminationExamination

Page 8: Applied Multivariate Statistics - uni-mannheim.de...Applied Multivariate Statistics (AMS) -Content Introduction to AMS Principal Component Analysis (PCA) Correspondence Analysis Matrix

Multivariate analysis consists of a collection of methods that can be

used when several measurements are made on each individual or

object in one or more samples.

6

See Renchner (2002), p.1

Objectives

● Dimension reduction and structural simplification

● Grouping, discrimination and classification

● Investigation of the dependence among variables

● Visualization of high-dimensional data

(see also J+W (2007), p.2)

What is it about?What is it about?

Issues of Applied Multivariate Statistics (AMS)Issues of Applied Multivariate Statistics (AMS)

● Close link to other areas such as Exploratory

Data Analysis (EDA) and Data Mining

Page 9: Applied Multivariate Statistics - uni-mannheim.de...Applied Multivariate Statistics (AMS) -Content Introduction to AMS Principal Component Analysis (PCA) Correspondence Analysis Matrix

What is it about?

7

Example 1: Dimension Reduction

Economic Indicators for the 27 European Union Countries in 2011

(see WIREs Comput Stat 2012, 4:399–406. doi: 10.1002/wics.1200)

Page 10: Applied Multivariate Statistics - uni-mannheim.de...Applied Multivariate Statistics (AMS) -Content Introduction to AMS Principal Component Analysis (PCA) Correspondence Analysis Matrix

What is it about?

8

Example 2: Modern Graphical Techniques

Page 11: Applied Multivariate Statistics - uni-mannheim.de...Applied Multivariate Statistics (AMS) -Content Introduction to AMS Principal Component Analysis (PCA) Correspondence Analysis Matrix

What is it about?

9

Example 2 ...

Page 12: Applied Multivariate Statistics - uni-mannheim.de...Applied Multivariate Statistics (AMS) -Content Introduction to AMS Principal Component Analysis (PCA) Correspondence Analysis Matrix

What is it about?

10

Example 3: Factor Analysis

Consumer Preference (J&W, example 9.9, p. 508)

179.011.085.001.0

79.0150.071.042.0

11.050.0113.096.0

85.071.013.0102.0

01.042.096.002.01

R

Tas

te

Good b

uy

for

money

Fla

vor

Suit

able

for

snac

k

Pro

vid

es l

ots

of

ener

gy

Page 13: Applied Multivariate Statistics - uni-mannheim.de...Applied Multivariate Statistics (AMS) -Content Introduction to AMS Principal Component Analysis (PCA) Correspondence Analysis Matrix

What is it about?

Example 4: Distances

11

Voting results for 15 congressmen from New Jersey

(example from R package HSAUR)

Hunt(R) Sandman(R) Howard(D)

Hunt(R) 0 8 15

Sandman(R) 8 0 17

Howard(D) 15 17 0

Extraction from the distance matrix ...

Page 14: Applied Multivariate Statistics - uni-mannheim.de...Applied Multivariate Statistics (AMS) -Content Introduction to AMS Principal Component Analysis (PCA) Correspondence Analysis Matrix

What is it about?

Example 5: Grouping

12

Page 15: Applied Multivariate Statistics - uni-mannheim.de...Applied Multivariate Statistics (AMS) -Content Introduction to AMS Principal Component Analysis (PCA) Correspondence Analysis Matrix

What is it about?

13

Example 5 ...

Page 16: Applied Multivariate Statistics - uni-mannheim.de...Applied Multivariate Statistics (AMS) -Content Introduction to AMS Principal Component Analysis (PCA) Correspondence Analysis Matrix

What is it about?

Example 6: Classification

14

Labor Market Participation of Married Women in Switzerland (1981)

(example from R package AER)

Page 17: Applied Multivariate Statistics - uni-mannheim.de...Applied Multivariate Statistics (AMS) -Content Introduction to AMS Principal Component Analysis (PCA) Correspondence Analysis Matrix

What is it about?

Example 7: Discrimination

15

Heights and weights of students

Page 18: Applied Multivariate Statistics - uni-mannheim.de...Applied Multivariate Statistics (AMS) -Content Introduction to AMS Principal Component Analysis (PCA) Correspondence Analysis Matrix

What is it about?

Example 8: Correspondence Analysis

16

Baccalauréat in France (Härdle & Simar, p. 313)

Page 19: Applied Multivariate Statistics - uni-mannheim.de...Applied Multivariate Statistics (AMS) -Content Introduction to AMS Principal Component Analysis (PCA) Correspondence Analysis Matrix

What is it about?

Generally: Chapters 1-4, 8, 9, 11, 12 from Johnson and Wichern (J&W)

Timetable and Contents

● Lecture 3: Matrix Algebra (part 2)

● Lecture 2: Matrix Algebra (part 1)

● Lecture 4: Multivariate Samples

● Lecture 5: Principal Component Analysis

● Lecture 1: Introduction

17

● Lecture 6: Biplots

● Lecture 7: Factor Analysis

Course OutlineCourse Outline

Page 20: Applied Multivariate Statistics - uni-mannheim.de...Applied Multivariate Statistics (AMS) -Content Introduction to AMS Principal Component Analysis (PCA) Correspondence Analysis Matrix

What is it about?

● Lecture 8: Multidimensional Scaling

● Lecture 10: Linear Discriminant Analysis

18

● Lecture 13: Used as time buffer

● Lecture 11: Binary Response Models

● Lecture 12: Open Topic (Correspondence Analysis, Ggobi, ...)

● Lecture 9: Cluster Analysis

Note: This is just a plan! Topics may be skipped; order

may be changed; lecture topics may overlap

Page 21: Applied Multivariate Statistics - uni-mannheim.de...Applied Multivariate Statistics (AMS) -Content Introduction to AMS Principal Component Analysis (PCA) Correspondence Analysis Matrix

What is it about?

... at the end of the semester you

● know and (hopefully) understand most common methods for

analyzing multivariate data and their theoretical background

Generally: This is an introductory and applied course.

Modern multivariate techniques based on machine

learning algorithms will hardly be covered.

19

● can proficiently use R when using multivariate techniques:

data import, constructing graphics, inference, model diagnosis

and assessment

● have experienced the possibilities and limitations of

multivariate methods on the basis of real data examples

Main ObjectivesMain Objectives