Descriptive Statistics with MATLAB/Octave · Lab 1 · 2020. 1. 8.

Telecommunications Engineering

Department of Statistics

Universidad Carlos III de Madrid

Note: review the files Descriptive 1 and Descriptive 2 (in Spanish).


Summary

1 Introduction to MATLAB/Octave’s environment

2 Description of a data set

3 Graphical Representation

4 Position measures

5 Dispersion measures

6 Joint description of two variables

1 Introduction to MATLAB/Octave’s environment

Main window in MATLAB

The main window contains four panels: Command Window, Workspace, Command History and Current Directory.


Working directory (Current Directory).

Specify the directory: C:/.../

For example: C:/lab1/

This is where you save the *.m function files you create.

Workspace.

Here you can view all the variables currently loaded.

Command Window.

Console where you type the commands to execute.

Command History.

Here you can see all the commands typed in the Command Window.

Useful files for Lab 1

Download the following files from the course webpage: internet.mat, AlumnosIndustriales.mat and histg.m.

2 Description of a data set

Example: failures

The following table summarizes the number of failures in 40 electric central stations (only part of the table is shown):

Central station   No. of failures
1                 1
2                 3
3                 2
4                 2
...               ...
39                1
40                2


In MATLAB, we enter the data in the Command Window:

>> averias = [ 1; 3; 2; 2; 0; 3; 1; 0; 2; 1; 0; 1; 0; 3; 0; 2;
               0; 2; 1; 1; 1; 0; 2; 3; 0; 0; 0; 0; 1; 1; 4; 0;
               3; 1; 0; 2; 1; 1; 1; 2 ]

We have created the column vector averias (Spanish for "failures"):

averias =
     1
     3
     2
     2
   ...


The command tabulate calculates the absolute frequencies (Count) and the relative frequencies in % (Percent).

>> tabulate(averias)
  Value    Count   Percent
      0       13     32.50%
      1       13     32.50%
      2        8     20.00%
      3        5     12.50%
      4        1      2.50%
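tabulate belongs to MATLAB's Statistics Toolbox (in Octave it may require the statistics package), so it may not be available on every installation. As a minimal sketch, assuming only base functions, the same table can be built with unique and accumarray. The names valores, conteos, porcentaje and tabla_manual are illustrative and are not part of the lab files.

```matlab
% Data from the example (40 central stations)
averias = [1 3 2 2 0 3 1 0 2 1 0 1 0 3 0 2 ...
           0 2 1 1 1 0 2 3 0 0 0 0 1 1 4 0 ...
           3 1 0 2 1 1 1 2]';

% Distinct values and, for each observation, the index of its value
[valores, ~, idx] = unique(averias);

% Absolute frequencies: count how many observations fall on each index
conteos = accumarray(idx(:), 1);

% Relative frequencies in %
porcentaje = 100 * conteos / numel(averias);

% Same layout as the output of tabulate: Value, Count, Percent
tabla_manual = [valores(:) conteos porcentaje]
```

The three columns should match the Value, Count and Percent columns shown above.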


To manage the table as a matrix, we store it in the variable tabla. Now we can access its elements easily.

>> tabla = tabulate(averias)

For instance, we can express the relative frequencies between 0 and 1, rather than in %.

>> tabla(:,3) = tabla(:,3)/100
tabla =
         0   13.0000    0.3250
    1.0000   13.0000    0.3250
    2.0000    8.0000    0.2000
    3.0000    5.0000    0.1250
    4.0000    1.0000    0.0250


We can calculate cumulative frequencies by means of the command cumsum.

>> abs_acum = cumsum(tabla(:,2))
abs_acum =
    13
    26
    34
    39
    40

>> rel_acum = cumsum(tabla(:,3))
rel_acum =
    0.3250
    0.6500
    0.8500
    0.9750
    1.0000


The complete frequency table can be built by appending the columns of cumulative frequencies (abs_acum and rel_acum) to the previous table.

>> tabla = [ tabla abs_acum rel_acum ]
tabla =
         0   13.0000    0.3250   13.0000    0.3250
    1.0000   13.0000    0.3250   26.0000    0.6500
    2.0000    8.0000    0.2000   34.0000    0.8500
    3.0000    5.0000    0.1250   39.0000    0.9750
    4.0000    1.0000    0.0250   40.0000    1.0000


How many stations have fewer than 2 failures? What percentage of stations have 3 or fewer failures?

No. of failures   Abs Freq   Rel Freq   Cum Abs Freq   Cum Rel Freq
0                 13         0.325      13             0.325
1                 13         0.325      26             0.650
2                 8          0.200      34             0.850
3                 5          0.125      39             0.975
4                 1          0.025      40             1.000

Both answers are read from the cumulative columns. The row for 1 failure gives a cumulative absolute frequency of 26, so 26 stations have fewer than 2 failures. The row for 3 failures gives a cumulative relative frequency of 0.975, so 97.5% of the stations have 3 or fewer failures.
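The same two questions can also be checked directly on the raw data with logical comparisons, without building the table first. This is only a sketch; the names n_menos_de_2 and pct_hasta_3 are illustrative.

```matlab
% Data from the example (40 central stations)
averias = [1 3 2 2 0 3 1 0 2 1 0 1 0 3 0 2 ...
           0 2 1 1 1 0 2 3 0 0 0 0 1 1 4 0 ...
           3 1 0 2 1 1 1 2]';

% averias < 2 is a logical vector; summing it counts the stations
n_menos_de_2 = sum(averias < 2)

% The mean of a logical vector is the proportion of true entries
pct_hasta_3 = 100 * mean(averias <= 3)
```

The results, 26 stations and 97.5%, agree with the cumulative columns of the frequency table.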

3 Graphical Representation

Graphical Representation: quantitative variables

Some graphical representations differ depending on whether the variables are discrete or continuous.

They are used with absolute, relative and cumulative frequencies.

Barplot

Pie chart or pictogram

Histogram


Example: central stations failures

Barplot:

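In MATLAB we use the command bar:

>> bar(tabla(:,2))

This produces the barplot of absolute frequencies. Since only the bar heights are passed, the bars are drawn at positions 1 to 5 rather than at the values 0 to 4; bar(tabla(:,1), tabla(:,2)) places them at the observed values.

[Figure: barplot of the absolute frequencies]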


Pie chart or pictogram:

In MATLAB, we use the command pie:

>> pie(tabla(:,2))

[Figure: pie chart with slices labelled 33%, 33%, 20%, 13% and 3%]


Example: students' heights

Histogram:

The variable alturas contains the heights (in cm) of the 28 students in a classroom.

>> alturas = [ 156; 159; 153; 157; 162; 156; 162;
               153; 158; 152; 153; 159; 154; 153;
               150; 162; 163; 160; 162; 156; 164;
               168; 152; 151; 159; 147; 165; 149 ]

The command hist draws the histogram:

>> hist(alturas)

[Figure: histogram of alturas with the default number of classes]


We can choose the number of classes or intervals:

>> hist(alturas, 5) % 5 classes or intervals

[Figure: histogram of alturas with 5 classes]
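When hist is called with output arguments it returns the class counts and centres instead of drawing the figure, which gives the frequency table behind the histogram. The sketch below assumes the 5 equal-width classes used above; the names frecuencias, centros, ancho and tabla_clases are illustrative.

```matlab
% Heights of the 28 students (cm)
alturas = [156 159 153 157 162 156 162 ...
           153 158 152 153 159 154 153 ...
           150 162 163 160 162 156 164 ...
           168 152 151 159 147 165 149]';

% With outputs requested, hist returns counts and class centres (no plot)
[frecuencias, centros] = hist(alturas, 5);

% Width of each class: the data range split into 5 equal parts
ancho = (max(alturas) - min(alturas)) / 5;

% One row per class: lower limit, upper limit, absolute frequency
tabla_clases = [centros(:) - ancho/2, centros(:) + ancho/2, frecuencias(:)]
```

The frequencies in the last column add up to 28, the number of students.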


Let's suppose there is a new student who is 2.10 m tall.

We can create a new variable alturas2:

>> alturas2 = [alturas; 210];
>> hist(alturas2)

[Figure: histogram of alturas2, with an isolated bar at 210]

The new student is an outlier.


Diagram for cumulative frequencies

Example: central stations failures

Now we consider the absolute and the cumulative absolute frequencies. From tabla, we take columns 2 and 4:

>> bar([tabla(:,2) tabla(:,4)])

[Figure: grouped barplot; legend: Frec. Abs (absolute frequency), Frec. Abs Acum. (cumulative absolute frequency)]

4 Position measures

Position measures: mean, median

Example: students' heights

Let's go back to the heights data, using alturas2 (which includes the new student).

The mean, $\bar{x}$, is calculated with the command mean:

>> mean(alturas2)
ans =
  158.7931

And the median, $x_{med}$, with the command median:

>> median(alturas2)
ans =
   157
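To see how the outlier affects each measure, the mean and the median can be compared with and without the new student. This is a small sketch built on the vectors defined earlier in the lab; the matrix name resumen is illustrative.

```matlab
% Heights of the 28 students (cm) and the extended sample with the outlier
alturas = [156 159 153 157 162 156 162 ...
           153 158 152 153 159 154 153 ...
           150 162 163 160 162 156 164 ...
           168 152 151 159 147 165 149]';
alturas2 = [alturas; 210];

% Row 1: original data, row 2: data with the outlier
% Column 1: mean, column 2: median
resumen = [mean(alturas)  median(alturas);
           mean(alturas2) median(alturas2)]
```

The mean moves from about 156.96 to 158.79, while the median only moves from 156.5 to 157, which illustrates that the median is less sensitive to outliers.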

5 Dispersion measures

Dispersion measures: variance, standard deviation and range

They measure the "dispersion" of the observations with respect to a position measure.

Variance:

$$\mathrm{VAR}_x = S_x^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

Standard deviation:

$$\mathrm{DT}_x = S_x = \sqrt{S_x^2}$$

Range or amplitude:

$$\mathrm{Range}_x = \max(x_i) - \min(x_i)$$


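In MATLAB

Command var

>> var(x)

Command std

>> std(x)

Command range

>> range(x)

The range can also be computed with the commands max and min: max(x)-min(x).

Note that var(x) and std(x) divide by n − 1 by default; var(x,1) and std(x,1) divide by n, as in the formulas of this section.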
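The following sketch checks the normalization numerically on the heights data, comparing var(x,1) with the variance formula of this section written out by hand. The names var_n, var_n1, std_n, rango and var_manual are illustrative; max(x)-min(x) is used instead of range(x) because range may require the Statistics Toolbox or the statistics package, depending on the installation.

```matlab
% Heights of the 28 students (cm)
x = [156 159 153 157 162 156 162 ...
     153 158 152 153 159 154 153 ...
     150 162 163 160 162 156 164 ...
     168 152 151 159 147 165 149]';
n = numel(x);

var_n1 = var(x);        % default: divides by n - 1
var_n  = var(x, 1);     % second argument 1: divides by n
std_n  = std(x, 1);     % standard deviation with the same normalization
rango  = max(x) - min(x);

% Variance written out term by term: (1/n) * sum((x_i - mean)^2)
var_manual = sum((x - mean(x)).^2) / n;

[var_n1 var_n var_manual]
```

The second and third values coincide, and the first is larger by the factor n/(n − 1).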

6 Joint description of two variables

Example: height/weight

We have measured the height (X, in mm) and the weight (Y, in g) of a sample of 117 bolts. The information is presented in the following table of absolute frequencies $n_{ij}$:

X / Y      40-60   60-80   80-100   100-200
140-160      4       0       0        0
160-180     14      60       2        0
180-200      0      20      16        1

>> tornillos = [  4   0   0   0
                 14  60   2   0
                  0  20  16   1 ]


We can calculate the relative frequency table simply as:

$$f_{ij} = \frac{n_{ij}}{n}$$

>> frec_relat = tornillos/117

X / Y      40-60    60-80    80-100   100-200
140-160    0.0342   0        0        0
160-180    0.1197   0.5128   0.0171   0
180-200    0        0.1709   0.1368   0.0085
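Rather than typing the sample size by hand, n can be obtained from the table itself, which keeps the calculation valid if the counts change. A minimal sketch; the variable name n is illustrative.

```matlab
% Absolute frequencies n_ij (rows: height classes, columns: weight classes)
tornillos = [ 4  0  0 0
             14 60  2 0
              0 20 16 1];

% Total number of bolts: sum over all cells
n = sum(tornillos(:));

% Relative frequencies f_ij = n_ij / n
frec_relat = tornillos / n
```

Here n is 117, and the relative frequencies add up to 1.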


We calculate the marginal distributions by adding up by columns (1) and by rows (2):

>> sum(tornillos,1)
ans =
    18    80    18     1

>> sum(tornillos,2)
ans =
     4
    76
    37

X / Y      40-60   60-80   80-100   100-200   Total
140-160      4       0       0        0          4
160-180     14      60       2        0         76
180-200      0      20      16        1         37
Total       18      80      18        1        117
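The table with totals can be assembled in one step by appending the row sums as an extra column and the column sums as an extra row. This is only a sketch; the names tot_filas, tot_cols and tabla_totales are illustrative.

```matlab
% Absolute frequencies n_ij (rows: height classes, columns: weight classes)
tornillos = [ 4  0  0 0
             14 60  2 0
              0 20 16 1];

tot_filas = sum(tornillos, 2);   % marginal distribution of X (row totals)
tot_cols  = sum(tornillos, 1);   % marginal distribution of Y (column totals)

% Original table bordered by its marginals; bottom-right cell is n
tabla_totales = [tornillos tot_filas;
                 tot_cols  sum(tot_cols)]
```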


Example: internet

Two internet service companies surveyed 95 households with an internet connection. The measured variables were:

MB: MB downloaded per day.

Time connection: average daily connection time in hours.

ISP: Internet Service Provider (provider 1 or 2).

Moment: time of day of the connection (morning=1, evening=2, night=3).

The data are contained in the file internet.mat.


Part of the data table is shown here:

MB    Time connection   ISP   Moment
163   1.1               1     1
170   3.5               1     1
169   3                 1     3
173   2                 1     2
160   1.1               1     1
...   ...               ...   ...
183   8.1               2     1
175   5.6               2     3
173   3                 2     1


To import the file in MATLAB, use File → Open or File → Import data.

The matrix internet is now in the workspace.

Name the variables:

>> MB = internet(:,1);
>> tiempo_conexion = internet(:,2);
>> ISP = internet(:,3);
>> momento = internet(:,4);
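The file can also be read from the Command Window with load, which is convenient in Octave or in scripts. The sketch below is self-contained: since internet.mat is not reproduced here, it first writes a hypothetical stand-in file containing only the first five rows shown in the data table, and then loads it. The file name archivo is an assumption made for the illustration; with the real file the call would simply be load('internet.mat').

```matlab
% Hypothetical stand-in for internet.mat: first five rows from the slides only
internet = [163 1.1 1 1
            170 3.5 1 1
            169 3.0 1 3
            173 2.0 1 2
            160 1.1 1 1];
archivo = [tempname() '.mat'];   % temporary file name (illustrative)
save(archivo, 'internet');
clear internet

% Loading restores the matrix under its saved name, internet
load(archivo)

% Name the variables as in the lab
MB              = internet(:,1);
tiempo_conexion = internet(:,2);
ISP             = internet(:,3);
momento         = internet(:,4);

delete(archivo);                 % remove the temporary file
```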


The command crosstab (implemented in MATLAB's Statistics Toolbox) returns the counts of two vectors in a cross table. For example, for the qualitative variables momento and ISP it creates the following matrix of dimensions 3 × 2, where the rows represent the categories of the variable momento (1 = morning, 2 = evening and 3 = night) and the columns the two internet service providers (ISP).

>> crosstab(momento,ISP)
ans =
     3    10
    11    25
    18    28

MOMENT     ISP 1   ISP 2
morning      3      10
evening     11      25
night       18      28
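If crosstab is not available, a cross table of two integer-coded variables can be sketched with the base function accumarray, using the pair (momento, ISP) of each household as a (row, column) subscript. The example below uses only the eight rows visible in the data table of the slides, not the full 95 households, so its counts differ from the table above; the name tabla_cruzada is illustrative.

```matlab
% Only the eight rows visible in the slides (not the full data set)
momento = [1; 1; 3; 2; 1; 1; 3; 1];   % 1 = morning, 2 = evening, 3 = night
ISP     = [1; 1; 1; 1; 1; 2; 2; 2];   % provider 1 or 2

% Each row [momento ISP] is a subscript; adding 1 per row counts the pairs.
% The size [3 2] keeps empty categories in the table.
tabla_cruzada = accumarray([momento ISP], 1, [3 2])
```

For these eight rows the result is [3 2; 1 0; 1 1].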


Example: internet (histogram by groups)

The file histg.m allows us to draw a histogram by groups.

Save the file in your working directory, C:/lab1/

>> histg(MB,ISP)

[Figure: histogram of MB by groups; legend: ISP 1, ISP 2]


Example: internet (group analysis)

The command grpstats allows us to calculate statistics by groups.

grpstats(X,GROUP) calculates the mean by groups.

[MEANS,SEM,COUNTS,GNAME] = grpstats(X,GROUP) calculates the means (MEANS), the standard errors of the mean (SEM), the number of elements in each group (COUNTS) and the name of each group (GNAME).


>> [MEANS,SEM,COUNTS,GNAME] = grpstats(MB,ISP)
MEANS =
  165.3125
  179.3492
SEM =
    0.7834
    0.6345
COUNTS =
    32
    63
GNAME =
    '1'
    '2'


The standard deviations by groups ($S_g$) can be obtained from SEM and COUNTS, using the formula:

$$\mathrm{SEM}_g = \frac{S_g}{\sqrt{\mathrm{COUNTS}_g}}$$

In recent versions of MATLAB (R2011a, R2013b, ...), the command grpstats has been updated so that we can obtain the standard deviations by groups directly.
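Rearranging the formula gives $S_g = \mathrm{SEM}_g \sqrt{\mathrm{COUNTS}_g}$. As a quick check, the sketch below applies it to the SEM and COUNTS values printed by grpstats in this lab; since those printed values are rounded to four decimals, the result is only approximate.

```matlab
% Values printed by grpstats(MB,ISP) in the lab (rounded to 4 decimals)
SEM    = [0.7834; 0.6345];
COUNTS = [32; 63];

% Standard deviation by group: S_g = SEM_g * sqrt(COUNTS_g)
STDS = SEM .* sqrt(COUNTS)
```

The result is close to 4.43 for ISP 1 and 5.04 for ISP 2, in agreement, up to rounding, with the standard deviations 4.4318 and 5.0359 that grpstats reports in this lab.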


>> [MEANS,STDS,COUNTS,GNAME] = ...
       grpstats(MB,ISP,{'mean','std','numel','gname'})
MEANS =
  165.3125
  179.3492
STDS =
    4.4318
    5.0359
COUNTS =
    32
    63
GNAME =
    '1'
    '2'


Example: internet (scatter plot)

The function plot draws a scatter plot (dispersion plot) of two quantitative variables, for instance MB and tiempo_conexion:

>> plot(MB,tiempo_conexion,'.')

[Figure: scatter plot of tiempo_conexion against MB]

Is there a linear relationship between MB and tiempo_conexion? Is there any linear dependency?


Example: internet (covariance and correlation)

The command cov calculates the covariance matrix:

>> cov(MB,tiempo_conexion)
ans =
   67.6847   13.8172
   13.8172    4.7746

$$\mathrm{COV}(X,Y) = \begin{pmatrix} S_x^2 & S_{x,y} \\ S_{x,y} & S_y^2 \end{pmatrix}$$

The command corrcoef calculates the correlation coefficient:

>> corrcoef(MB,tiempo_conexion)
ans =
    1.0000    0.7686
    0.7686    1.0000

$$\mathrm{CORR}(X,Y) = \begin{pmatrix} r_{x,x} & r_{x,y} \\ r_{x,y} & r_{y,y} \end{pmatrix}, \quad \text{where } r_{x,y} = \frac{S_{x,y}}{S_x S_y}.$$
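The relation between the two matrices can be verified numerically: dividing each covariance by the product of the corresponding standard deviations gives the correlation matrix. The sketch below starts from the covariance matrix printed by cov in this lab (rounded to four decimals); the names C, r and R are illustrative.

```matlab
% Covariance matrix printed by cov(MB,tiempo_conexion) in the lab
C = [67.6847 13.8172
     13.8172  4.7746];

% Correlation coefficient: r_xy = S_xy / (S_x * S_y)
r = C(1,2) / sqrt(C(1,1) * C(2,2))

% Whole correlation matrix: divide each entry by the product of the
% standard deviations of its row and column variables
s = sqrt(diag(C));
R = C ./ (s * s')
```

The value of r is approximately 0.7686, matching the output of corrcoef, and the diagonal of R is 1.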
