Summary Statistics 1 and 2
Samenvatting_Statistiek_1_and_2.pdf
Brief Summary, Chapters 1-23
Tilburg University | Statistiek 2
Distribution not permitted | Downloaded by: Danique Sabel | Email address: [email protected]
Chapter 1 Intro and Basic Concepts N: Population elements n: Sample elements
Qualitative: Nominal: Cannot be ordered Ordinal: Can be ordered
Quantitative: Interval: Differences of values are meaningful, ratios not. Ratio: Differences and ratios are meaningful.
Discrete variables: the set of possible values can be counted. Continuous variables: the set of possible values is a real interval, containing uncountably many numbers.
Descriptive Statistics: Gathering the data, summarization (tables, graphs; statistics)
Probability (theory): Chance; probability; rules
Sampling (theory): How to draw a sample? Properties of random sampling
Inferential Statistics: Drawing conclusions about the population on the basis of a sample from it
Chapter 2 Tables and Graphs Frequency distribution: Overview of all values with accompanying frequencies
Relative frequency distribution: Overview of all values with accompanying relative frequencies
- frequency / total number of observations
CDF (cumulative distribution function F): F(a) = relative frequency of the observations ≤ a, for all real numbers a
Classified freq distribution: Overview of classes and accompanying frequencies
- depends on the classification
- preferably equal class widths
Frequency density: Overview of classes and relative frequencies divided by corresponding
class widths
Linear interpolation: (or see book page 42)
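The book's interpolation formula is not reproduced here, but the idea can be sketched in a few lines of Python (the class bounds and CDF values below are made up):

```python
def interpolate(x0, y0, x1, y1, x):
    """Linearly interpolate the y-value at x between (x0, y0) and (x1, y1)."""
    return y0 + (x - x0) * (y1 - y0) / (x1 - x0)

# Made-up classified data: the CDF reaches 0.30 at class bound 40 and
# 0.55 at class bound 50; estimate where it crosses 0.50 (the median).
median_est = interpolate(0.30, 40, 0.55, 50, 0.50)  # approximately 48.0
```

The same helper works for any table lookup between two known points, e.g. reading off a quantile inside a class.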
Chapter 3 Measures of Location Mode: value with the largest frequency
Median: middle of ordered observations
Mean (= arithmetic mean): average (population = µ, sample = x̄)
Weighted mean: x̄ = Σ wi·xi / Σ wi
Geometric mean: rg = (x1 · x2 · ... · xn)^(1/n)
Property: log(rg) is the arithmetic mean of log(x1), ..., log(xn)
Population mean: µ = (1/N) Σ xi
Sample mean: x̄ = (1/n) Σ xi
A binary variable can take only two values (0 and 1). Summing the observations counts the number of 1s; dividing that sum by n gives the mean, which is the proportion p.
= mode
= median
= (arithmetic) mean
rg = geometric mean
p = population proportion
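As an illustration of these location measures, a small Python sketch (the dataset is invented; `Counter` and `math.prod` are just one convenient way to compute them):

```python
import math
from collections import Counter

data = [2, 4, 4, 5, 8]

mode = Counter(data).most_common(1)[0][0]       # value with the largest frequency
median = sorted(data)[len(data) // 2]           # middle observation (n is odd here)
mean = sum(data) / len(data)                    # arithmetic mean
geo_mean = math.prod(data) ** (1 / len(data))   # geometric mean
```

For an even n the median is the average of the two middle observations; the sketch assumes an odd n to stay short.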
Chapter 4 Measures of Variation 4.1 Measures Based on Quartiles
IQR
Relative IQR:
Box plot: see book page 108
Ki, ki = ith quartile (i=1, 2, 3)
IQR = interquartile range = K3 - K1 (population), k3 - k1 (sample)
σ², S² = variance
σ, S = standard deviation
4.2 Measures Based on Deviations from the Mean
Population variance: σ² = (1/N) Σ (xi - µ)²
Population standard deviation: σ = √σ²
Sample variance: S² = (1/(n - 1)) Σ (xi - x̄)²
Sample standard deviation: S = √S²
Coefficients of variation: σ/µ and S/x̄
SHORT-CUT FORMULAS FOR VARIANCE
Population variance: σ² = (1/N) Σ xi² - µ²
Sample variance: S² = (1/(n - 1)) (Σ xi² - n·x̄²)
The Population variance is just the mean of the squares minus the square of the mean.
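The short-cut can be checked numerically; a Python sketch with made-up data:

```python
data = [1, 2, 4, 7]
N = len(data)
mu = sum(data) / N

# Definition: mean of the squared deviations from the mean
var_def = sum((x - mu) ** 2 for x in data) / N
# Short-cut: mean of the squares minus the square of the mean
var_short = sum(x ** 2 for x in data) / N - mu ** 2
```

Both expressions give the same population variance (5.25 for this dataset).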
4.3 Interpretation of the standard deviation
for all k > 0 it holds that:
Sample data: at least 1 - 1/k² of the observations lies in (x̄ - kS, x̄ + kS)
Population data: at least 1 - 1/k² of the observations lies in (µ - kσ, µ + kσ)
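A quick Python check of Chebyshev's lower bound 1 - 1/k² (the dataset is invented and deliberately contains one extreme value):

```python
import statistics

data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 20]
mu = statistics.mean(data)
sigma = statistics.pstdev(data)       # population standard deviation

k = 2
inside = sum(1 for x in data if abs(x - mu) < k * sigma)
fraction = inside / len(data)         # observed fraction within k std. deviations
bound = 1 - 1 / k**2                  # Chebyshev lower bound: 0.75
```

Here 9 of the 10 observations fall within 2 standard deviations, comfortably above the guaranteed 75%.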
4.4 z-Scores
Population dataset: z = (x - µ) / σ
Sample dataset: z = (x - x̄) / S
Outliers: observations that are extremely small or extremely large.
Outlier if smaller than K1 - 1.5·IQR or larger than K3 + 1.5·IQR (the 1.5·IQR definition)
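A sketch of the 1.5·IQR outlier rule in Python; note that `statistics.quantiles` uses its own quartile convention, which can differ slightly from the book's:

```python
import statistics

data = [3, 5, 7, 8, 9, 11, 12, 13, 40]     # made-up data with one extreme value

k1, k2, k3 = statistics.quantiles(data, n=4)   # quartiles
iqr = k3 - k1
lower, upper = k1 - 1.5 * iqr, k3 + 1.5 * iqr
outliers = [x for x in data if x < lower or x > upper]   # [40]
```

Only the extreme value 40 falls outside the fences.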
4.5 Variance of 0-1 data
Population dataset: µ = p, σ² = p(1 - p)
Sample dataset: x̄ = p̂, S² = (n/(n - 1))·p̂(1 - p̂)
4.6 Variance of a frequency distribution
Discrete frequency distribution:
Classified frequency distribution:
(Table: the mean µ and variance σ² can be computed either from the original observations or from the frequency distribution.)
Chapter 5 Pairs of Variables
5.1 Scatter plot, Covariance and Correlation. Measures of association: measures of the strength of the linear relationship.
σx,y, sx,y = covariance
ρx,y, rx,y = correlation coefficient
β0, b0 = intercept of regression line
β1, b1 = slope of regression line
Quadrants I, II, III, IV (around the point (x̄, ȳ)):
- In I and III: (x - x̄)(y - ȳ) > 0, a positive contribution to the covariance
- In II and IV: (x - x̄)(y - ȳ) < 0, a negative contribution
SHORT-CUT FORMULAS FOR COVARIANCE
Population covariance: σx,y = (1/N) Σ xi·yi - µx·µy
Sample covariance: sx,y = (1/(n - 1)) (Σ xi·yi - n·x̄·ȳ)
The population covariance is just the mean of the products minus the product of the means.
Population correlation coefficient: ρx,y = σx,y / (σx·σy)
Sample correlation coefficient: rx,y = sx,y / (sx·sy)
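These covariance and correlation formulas can be illustrated in Python (population versions, made-up data):

```python
x = [1, 2, 3, 4]
y = [2, 4, 5, 9]
N = len(x)
mx, my = sum(x) / N, sum(y) / N

# Short-cut: mean of the products minus the product of the means
cov_pop = sum(a * b for a, b in zip(x, y)) / N - mx * my
sd_x = (sum((a - mx) ** 2 for a in x) / N) ** 0.5
sd_y = (sum((b - my) ** 2 for b in y) / N) ** 0.5
corr = cov_pop / (sd_x * sd_y)   # close to +1: strong positive linear relation
```

For this dataset the covariance is 2.75 and the correlation is about 0.96.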
5.2 Regression Line
LEAST-‐SQUARES METHOD
LS-method: a and b are taken such that Σ (yi - a - b·xi)² is minimal
Population regression coefficients: β1 = σx,y / σx², β0 = µy - β1·µx
Population regression line: y = β0 + β1·x
Sample regression coefficients: b1 = sx,y / sx², b0 = ȳ - b1·x̄
Sample regression line: ŷ = b0 + b1·x
Property: a sample regression line passes through (x̄, ȳ); a population regression line passes through (µx, µy)
PREDICTION AND RESIDUALS
Prediction of yp: ŷp = b0 + b1·xp
Predictions of y1, y2, ..., yn: ŷi = b0 + b1·xi
The n prediction errors ei = yi - ŷi are called residuals
SUM OF SQUARED ERRORS
Sum of squared errors (overall): SSE = Σ ei²; note that the residuals of the LS-line sum to 0
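A minimal least-squares fit in Python, using the sample formulas b1 = sx,y / sx² and b0 = ȳ - b1·x̄ (data invented):

```python
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)
mx, my = sum(x) / n, sum(y) / n

# Sample covariance and variance (both with n - 1, so the factor cancels)
s_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
s_xx = sum((a - mx) ** 2 for a in x) / (n - 1)
b1 = s_xy / s_xx            # slope
b0 = my - b1 * mx           # intercept

residuals = [b - (b0 + b1 * a) for a, b in zip(x, y)]
sse = sum(e ** 2 for e in residuals)   # sum of squared errors
```

The residuals of a least-squares line always sum to (numerically) zero, which makes a handy sanity check.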
5.3 Linear Transformations
Mean of a + bx1, ..., a + bxn = a + b·x̄
Variance of a + bx1, ..., a + bxn (around a + b·x̄) = b²·sx², hence the standard deviation is |b|·sx
Transformation of covariance and correlation by v = a + bx and w = c + dy:
- Covariance: sv,w = bd·sx,y (population: σv,w = bd·σx,y)
- Correlation coefficient: rv,w = rx,y if bd > 0; rv,w = -rx,y if bd < 0
Relationship between two qualitative variables Contingency table: Table that offers an overview of all joint frequencies
(Summary table: location and variation formulas for a population dataset vs a sample dataset.)
Chapter 6 Definitions of Probability
6.1 Random experiments. Random experiment: an experiment or observation of an uncontrollable phenomenon for which more than one outcome is possible
Sample space Ω: the set of all possible outcomes
Elements: The different possible outcomes
P = probability (measure), model
Ω = sample space
Ø = empty set, empty event
⊂ = subset
∪ = union
∩ = intersection
( )ᶜ = complement
6.2 Rules for sets
Concept / Notation / Meaning for events:
- Empty set, Ø: cannot occur
- Sample space, Ω: occurs certainly
- Complement, Aᶜ: A does not occur
- Union, A ∪ B: at least one of the events A and B occurs (and/or)
- Intersection, A ∩ B: both A and B occur (and)
- Subset, A ⊂ B: if A occurs then B occurs
- Disjoint, A ∩ B = Ø: A and B cannot occur jointly
- Partition, D1, ..., Ds: exactly one of the events D1, ..., Ds occurs
6.3 Historical definitions of probability
Classical definition of probability: P(A) = (number of outcomes in A) / (number of outcomes in Ω), for all events A. Requirement: equally likely outcomes.
Empirical definition of probability: P(A) = the long-run relative frequency of A. Requirement: independent & identically repeatable experiments.
Subjective definition of probability: How strongly one individual believes in occurrence of event A
6.4 General definition of Kolmogorov. Probability measure P (definition of Kolmogorov): a probability measure P is a prescription that attaches a number P(A) to each event A such that the following axioms hold:
- P(A) ≥ 0 for all events A
- P(Ω) = 1
- If A and B are disjoint, then P(A ∪ B) = P(A) + P(B)
Chapter 7 Probability and Rules
7.1 Basic properties Important rules for probability: (7.1), (7.2), (7.6), (7.7), (7.8), (7.9)
Important rules for conditional probability: (7.12), (7.15), (7.17), (7.18), (7.19), (7.20) THE BASIC AXIOMS
Requirement / Formula / Numbering:
- (7.1)
- (7.2)
- If A, B and C are disjoint: P(A ∪ B ∪ C) = P(A) + P(B) + P(C) (7.3)
- If A1, A2, ... are disjoint: P(A1 ∪ A2 ∪ ...) = P(A1) + P(A2) + ... (7.4)
- If D1, D2, ... form a partition: P(D1) + P(D2) + ... = 1 (7.5)
- If A ⊂ B: (7.6) and (7.7)
- For all events B: (7.8)
- If D1, D2, ... form a partition: (7.9)
7.2 Rules for counting
Choosing k out of m:
- Ordered, with replacement: m^k
- Ordered, without replacement: m! / (m - k)!
- Unordered, without replacement: m! / (k!(m - k)!) (7.10)
- Unordered, with replacement: --------
7.3 Random drawing and random Sampling
(7.11)
(pa = relative frequency)
7.4 Conditional probabilities and independence
The conditional probability of event A given event B is denoted as P(A | B): P(A | B) = P(A ∩ B) / P(B) (7.12)
Interchanging A and B (if P(A) > 0) leads to the probability of B given A: P(B | A) = P(A ∩ B) / P(A) (7.13)
Multiplying both sides of (7.12) by P(B) and both sides of (7.13) by P(A) yields P(A ∩ B) = P(B)·P(A | B) = P(A)·P(B | A) (7.14)
Product rule: P(A ∩ B ∩ C) = P(A)·P(B | A)·P(C | A ∩ B) (7.15)
In the case of four events A, B, C and D it follows that P(A ∩ B ∩ C ∩ D) = P(A)·P(B | A)·P(C | A ∩ B)·P(D | A ∩ B ∩ C) (7.16)
Events A and B are stochastically independent if P(A ∩ B) = P(A)·P(B) (7.17)
Two other equivalent ways of expressing independence of A and B: P(A | B) = P(A) (7.18) and P(B | A) = P(B) (7.19)
Rule of Bayes: it expresses a conditional probability in terms of the opposite conditional probability: P(B | A) = P(A | B)·P(B) / P(A) (7.20)
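A classic illustration of the rule of Bayes combined with the total-probability idea (all numbers are invented):

```python
# Disease testing: D = "has the disease", + = "test is positive"
p_d = 0.01                  # P(D): prevalence
p_pos_d = 0.95              # P(+ | D): sensitivity
p_pos_not_d = 0.05          # P(+ | Dᶜ): false-positive rate

# Total probability: P(+) = P(+|D)P(D) + P(+|Dᶜ)P(Dᶜ)
p_pos = p_pos_d * p_d + p_pos_not_d * (1 - p_d)
# Bayes: P(D | +) = P(+|D)P(D) / P(+)
p_d_pos = p_pos_d * p_d / p_pos
```

Even with a good test, P(D | +) is only about 0.16 here, because the disease is rare; that is exactly the effect the rule of Bayes makes visible.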
Chapter 8 Probability Distribution, Expectation, Variance. A random variable (rv) is a prescription that attaches a value to each outcome of the sample space.
Working definition: quantity for which the actual outcome is determined by chance
The actual outcome of a random variable (X) is called the realisation
Discrete: Finite or countable number of values
Continuous: May assume any value in some interval
Probability distribution: Overview of the probabilities of all X-‐events
X, Y = random variables
rv = random variable
x, y = outcomes of X, Y
E(X) = expectation of X
V(X) = variance of X
SD(X) = standard deviation of X
F = cdf; (cumulative) distribution function
f = pdf; probability density function
DISCRETE VARIABLES
Probability density function (pdf) f of rv X: f(x) = P(X = x) for all outcomes x of X
f(x) is the probability that the realisation of X will be x
f is also called (discrete) density
f(x) ≥ 0 for all x, and Σ f(x) = 1 (8.1)
(Cumulative) distribution function (cdf) F of a rv X: F(a) = P(X ≤ a) for all real numbers a
F is non-decreasing
F(-∞) = 0; F(∞) = 1 (8.2)
F(a) ≤ F(b) for all a and b with a < b
Relation between discrete pdf and cdf (a non-decreasing step function): F(a) = Σ over x ≤ a of f(x) (8.3)
When x is an outcome of X and x* is the largest outcome that is smaller than x, it holds that f(x) = F(x) - F(x*) (8.4),
so f follows from F and f(x) is just the jump-size of F at x
A random variable X is called degenerate at the constant b if b is the only possible outcome of X.
Expectation or expected value or mean of a discrete X: E(X) = Σ x·f(x) (8.14)
Expectation of V = h(X) for a discrete X: E(V) = Σ h(x)·f(x) (8.17)
Variance of a discrete X: V(X) = E((X - µ)²) = Σ (x - µ)²·f(x) (8.18)
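A Python sketch of these discrete formulas, using a fair die as the pdf:

```python
# Discrete pdf f over the outcomes of X (a fair six-sided die)
f = {x: 1 / 6 for x in range(1, 7)}

mean = sum(x * p for x, p in f.items())                        # E(X) = Σ x f(x)
var = sum((x - mean) ** 2 * p for x, p in f.items())           # V(X) = Σ (x-µ)² f(x)
var_short = sum(x ** 2 * p for x, p in f.items()) - mean ** 2  # short-cut E(X²) - µ²
```

For a fair die E(X) = 3.5 and V(X) = 35/12; the short-cut gives the same number.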
CONTINUOUS VARIABLES
(8.5)
Property for continuous X: P(X = x) = 0 for all real numbers x (due to the infinite number of possibilities) (8.6)
Property for continuous X: P(X ≤ x) = P(X < x) for all real numbers x (8.7)
Probability density function (pdf) f of a continuous rv X: P(a ≤ X ≤ b) = area under f between a and b (8.8)
Elementary properties of the pdf f of a continuous X:
f(x) ≥ 0 for all real numbers x (8.9)
the total area under f is 1 (8.10)
Properties of the cdf F of a continuous X:
It is a non-decreasing continuous function that is strictly increasing on an interval where the pdf is positive
It is completely determined by the pdf f since F(b) = the area under f to the left of b, for all real numbers b (8.11)
It completely determines the pdf f since f(a) = F′(a), for all real numbers a (8.12)
(8.13)
For small h, the probability that X falls in the small neighbourhood [a, a + h] is approximately f(a)·h, so this probability is relatively large where f(a) is large compared with neighbourhoods of other outcomes. Hence f(a) is NOT the probability of a; it is called the likelihood of outcome a (see page 286 for a detailed explanation).
Expectation or expected value or mean of a continuous X: E(X) = ∫ x·f(x) dx (8.22)
Expectation of V = h(X) for a continuous X: E(V) = ∫ h(x)·f(x) dx (the total area under h(x)·f(x)) (8.23)
Variance of a continuous X: V(X) = E((X - µ)²) (8.18)
8.5 Rules for expectation and variance. Linear transformation (discrete and continuous):
E(a + bX) = a + b·E(X) and V(a + bX) = b²·V(X) (8.30)
Short-cut formula for the variance of X: V(X) = E(X²) - (E(X))² (8.32)
8.7 Other statistics of probability distributions
The p-quantile ξp of X and its distribution is such that: F(ξp) = p (8.41)
Chapter 9 Families of Discrete Distributions. This whole chapter is about 0 and 1 values
~ = is distributed as
Bin(n, p) = binomial distribution
H(n; M, N) = hypergeometric distribution
Expectation, variance and standard deviation of Alt(p): E = p, V = p(1 - p), SD = √(p(1 - p)) (9.1)
The binomial distribution with parameters n and p has the following pdf (Y ~ Bin(n, p)): f(y) = (n choose y)·p^y·(1 - p)^(n-y) (9.2)
Expectation, variance and standard deviation of Bin(n, p): E = np, V = np(1 - p), SD = √(np(1 - p)) (9.3)
Expectation, variance and standard deviation of the proportion of successes p̂ = Y/n in a binomial experiment: E = p, V = p(1 - p)/n, SD = √(p(1 - p)/n) (9.4)
If
(9.9)
Distribution / Notation / f(y) / Expectation / Variance:
- Hypergeometric, H(n; M, N): E = n·(M/N), V = n·(M/N)·(1 - M/N)·(N - n)/(N - 1)
- Binomial, Bin(n, p): f(y) = (n choose y)·p^y·(1 - p)^(n-y), E = np, V = np(1 - p)
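The binomial pdf and its moments can be verified numerically (n = 10 and p = 0.3 are chosen arbitrarily):

```python
from math import comb

n, p = 10, 0.3

def binom_pdf(y):
    # f(y) = (n choose y) p^y (1 - p)^(n - y)
    return comb(n, y) * p ** y * (1 - p) ** (n - y)

mean = sum(y * binom_pdf(y) for y in range(n + 1))
var = sum((y - mean) ** 2 * binom_pdf(y) for y in range(n + 1))
# These should match E = n*p = 3 and V = n*p*(1 - p) = 2.1
```

Summing y·f(y) over all outcomes reproduces np exactly, which is a useful check on (9.2) and (9.3).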
Chapter 10 Families of Continuous Distributions
10.1 Uniform distributions. Uniform distribution U(a, b): f(y) = 1/(b - a) for a ≤ y ≤ b (10.1)
F(y) = (y - a)/(b - a) for a ≤ y ≤ b (10.2)
Expectation, variance and standard deviation of U(a, b): E = (a + b)/2, V = (b - a)²/12 (10.3)
10.2 Exponential distributions. Exponential distribution Expo(µ): f(y) = (1/µ)·e^(-y/µ) for y ≥ 0 (10.4)
Some properties of Y ~ Expo(µ) with cdf F: F(y) = 1 - e^(-y/µ) and P(Y > y) = e^(-y/µ) (10.5 & 10.6)
10.3 Normal distribution. Normal distribution N(µ, σ²): f(y) = (1/(σ√(2π)))·e^(-(y - µ)²/(2σ²)) (10.7)
Notation: Y ~ N(µ, σ²) if Y has this density
Some properties of the pdf of N(µ, σ²):
f(µ) = 1/(σ√(2π)) is the maximum value of f
f(µ + a) = f(µ - a) for all positive a; so f is symmetric around µ
f(y) tends to 0 as y moves away from µ in either direction
At y = µ - σ and y = µ + σ the graph of f has turning points, in the sense that the decline decreases when going further from µ
Expectation and variance of N(µ, σ²): E = µ, V = σ² (10.8)
Standard normal distribution or z-distribution: N(0, 1) (10.9)
If X ~ N(µ, σ²) and Y = a + bX, then Y ~ N(a + bµ, b²σ²) (10.10)
If X ~ N(µ, σ²) and Z = (X - µ)/σ, then Z ~ N(0, 1) (10.11)
If Z ~ N(0, 1) and X = σZ + µ, then X ~ N(µ, σ²) (10.12)
) = 1 (10.14)
Distribution / Notation / Expectation / Variance:
- Uniform, U(a, b): E = (a + b)/2, V = (b - a)²/12
- Exponential, Expo(µ): E = µ, V = µ²
- Normal, N(µ, σ²): E = µ, V = σ²
Chapter 11 Joint Probability Distributions The whole chapter is about discrete variables, continuous are not looked into in this book.
Cov(X,Y) = covariance
E(X|Y = y) = conditional expectation of X given that Y = y
V(X|Y = y) = conditional variance of X given that Y = y
h(x,y) = joint pdf
f(x|Y = y) = conditional pdf of X given that Y = y
ρX,Y = correlation coefficient
σX,Y = covariance
Covariance of X and Y: σX,Y = Cov(X, Y) = E((X - µX)(Y - µY)) (11.3)
Correlation coefficient of X and Y: ρX,Y = σX,Y / (σX·σY) (11.4)
Short-cut formula for the covariance of X and Y: σX,Y = E(XY) - µX·µY (11.5)
Covariance and correlation of V = a + bX and W = c + dY:
- Covariance: σV,W = bd·σX,Y
- Correlation coefficient: ρV,W = ρX,Y if bd > 0; ρV,W = -ρX,Y if bd < 0
Conditional expectation of V = v(X) given that {Y = y}: E(V | Y = y) = Σ v(x)·f(x | Y = y) (11.8)
(Stochastically) independent X and Y: f(x, y) = fX(x)·fY(y) for all x and y (11.9)
X and Y are (stochastically) independent if the joint pdf is equal to the product of the two marginal pdfs
Properties of two independent X and Y: E(XY) = E(X)·E(Y) and Cov(X, Y) = 0 (11.10)
Expectation and variance of a linear combination of X and Y: E(aX + bY) = a·E(X) + b·E(Y); V(aX + bY) = a²·V(X) + b²·V(Y) + 2ab·Cov(X, Y) (11.14)
Expectation and variance of X1 + ... + Xn: E = Σ E(Xi); V = Σ V(Xi) plus all covariance terms (11.16)
Expectation and variance of X1 + ... + Xn for independent Xi with the same mean µ and variance σ²: E = nµ, V = nσ² (11.17)
Property of the sum of two independent binomials: if X ~ Bin(n, p) and Y ~ Bin(m, p) are independent, then X + Y ~ Bin(n + m, p) (11.18)
For V = X1 + ... + Xn with independent Xi: µV = nµ, σV² = nσ² (11.19)
For W = nX (the same X taken n times): µW = nµ, σW² = n²σ² (11.20)
The probability distribution of X + Y for independent X and Y:
- If X ~ Po(µ1) and Y ~ Po(µ2), then X + Y ~ Po(µ1 + µ2) (Poisson)
Chapter 12 Random Samples There are 4 types of random sampling
Random Sampling with replacement
Random Sampling without replacement
Stratified Random Sampling: The population is divided into natural sub-‐populations (strata)
and independent random samples are drawn from them.
Cluster Sampling: the population is divided into sub-populations (clusters), a random sample
of clusters is drawn, and all elements of these clusters constitute the sample.
i-property: X1, ..., Xn are independent (only for random sampling with replacement)
id-property: X1, ..., Xn are identically distributed
A sample statistic is a random variable that is based only on the random sample X1, ..., Xn and not on
unknown parameters.
An estimator of a parameter is a sample statistic that can be used to generate approximations of that
parameter. p̂ is the natural estimator of p; an estimate (the realised value) is denoted with a small letter.
Chapter 13 The Sample Mean Random sample with replacement:
Random sample without replacement:
If the sample size is less than 10% of the population size, the drawing of the sample is done without
replacement but its results are analysed as if done with replacement. When not stated otherwise we
can assume it is a random sample with replacement.
Central Limit Theorem (CLT): the sample mean X̄ of a random sample X1, ..., Xn has the following
property: X̄ is approximately N(µ, σ²/n) for large n.
In many cases n has to be at least 30 to use normality for the sample mean.
If the distribution of the Xi is N(µ, σ²) and/or the sample size n is large, then X̄ ~ N(µ, σ²/n) (exactly, respectively approximately).
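A small simulation of the CLT in Python: sample means of a (non-normal) uniform population are approximately N(µ, σ²/n):

```python
import random
import statistics

random.seed(1)

# Population: uniform on [0, 1], so µ = 0.5 and σ² = 1/12
n, reps = 30, 2000
means = [statistics.mean(random.random() for _ in range(n)) for _ in range(reps)]

approx_mean = statistics.mean(means)       # should be near µ = 0.5
approx_var = statistics.pvariance(means)   # should be near σ²/n = 1/360
```

With n = 30 (the rule of thumb above) the histogram of `means` already looks bell-shaped, even though the population is flat.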
Chapter 14 Sample Proportion and other Sample Statistics
If n is so large that np ≥ 5 and n(1 - p) ≥ 5, then: p̂ is approximately N(p, p(1 - p)/n)
Estimator / Standard deviation / Standard error:
- Interest is in µ: estimator X̄; SD = σ/√n (σ known); SE = S/√n (σ unknown)
- Interest is in p: estimator p̂; SD = √(p(1 - p)/n); SE = √(p̂(1 - p̂)/n)
Chapter 15 Inferential Statistics
θ = notation for a general parameter
E = notation for a general estimator
H = half width
POINT ESTIMATION:
Estimator: sample proportion
Format 1: L = E - H and U = E + H, where H is a non-negative sample statistic.
Format 2: L = aE and U = bE, where a < b (see chapters 17 and 18)
H = a·SD or, if the SD contains unknown parameters, H = a·SE (standard error) (15.2)
INTERVAL ESTIMATION: p̂ ± z(α/2)·√(p̂(1 - p̂)/n) (15.4) & (15.5)
HYPOTHESIS TESTING:
H1 = alternative hypothesis, linked to the rejection region R; H0 = null hypothesis, linked to Rc
Do not reject H0 Reject H0
H0 is true Correct conclusion Incorrect, type I error
H1 is true Incorrect, type II error Correct conclusion
Type I errors are controlled at a prescribed level α, normally 0.05 or 0.01. This α is the
significance level. Type II errors usually become small for a large n.
There are three types of testing problems; µ0 is a fixed and known constant called hinge.
I. H0: µ ≤ µ0 against H1: µ > µ0 (one-sided, upper-tailed)
II. H0: µ ≥ µ0 against H1: µ < µ0 (one-sided, lower-tailed)
III. H0: µ = µ0 against H1: µ ≠ µ0 (two-sided)
TESTING PROBLEM I
Test H0: µ ≤ µ0 against H1: µ > µ0
Test statistic: Z = (X̄ - µ0) / (σ/√n)
Reject H0 if z ≥ z(α)
Calculate val, the value of Z when the data are substituted
Draw the statistical conclusion
TESTING PROBLEM II
Test H0: µ ≥ µ0 against H1: µ < µ0
Test statistic: Z = (X̄ - µ0) / (σ/√n)
Reject H0 if z ≤ -z(α)
Calculate val, the value of Z when the data are substituted
Draw the statistical conclusion
TESTING PROBLEM III
Test H0: µ = µ0 against H1: µ ≠ µ0
Test statistic: Z = (X̄ - µ0) / (σ/√n)
Reject H0 if z ≤ -z(α/2) or z ≥ z(α/2)
Calculate val, the value of Z when the data are substituted
Draw the statistical conclusion
The p-value or observed significance level: the smallest level of α that allows the conclusion of
rejecting H0. A p-value can only be calculated afterwards, as soon as val has been calculated.
Testing p-‐value with hinge µ0
test statistic:
Reject H0 -‐z or
Calculate the val, the value of Z when data are substituted
Draw the statistical conclusion
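The five-step z-test can be sketched in Python (testing problem I, with invented data; `statistics.NormalDist` supplies the critical value and the p-value):

```python
from statistics import NormalDist

# H0: µ <= µ0 against H1: µ > µ0, with σ known (all numbers made up)
mu0, sigma, n = 100.0, 15.0, 36
xbar = 105.5
alpha = 0.05

z_val = (xbar - mu0) / (sigma / n ** 0.5)    # realised value of Z ("val")
z_crit = NormalDist().inv_cdf(1 - alpha)     # z(α), about 1.645
p_value = 1 - NormalDist().cdf(z_val)        # observed significance level

reject = z_val >= z_crit                     # statistical conclusion
```

Here val = 2.2 exceeds the critical value, so H0 is rejected; equivalently the p-value (about 0.014) is below α.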
Overview of variables
Chapter 1
N = population elements
n = sample elements
| | = absolute value
Chapter 2
= no new symbols introduced
Chapter 3
= mode
= median
= (arithmetic) mean
rg = geometric mean
Chapter 4
p = population proportion
Ki, ki = ith quartile (i=1, 2, 3)
IQR = interquartile range = K3 - K1, k3 - k1
σ², S² = variance
σ, S = standard deviation
Chapter 5
σx,y, sx,y = covariance
ρx,y, rx,y = correlation coefficient
β0, b0 = intercept of regression line
β1, b1 = slope of regression line
Chapter 6
P = probability (measure), model
Ω = sample space
Ø = empty set, empty event
⊂ = subset
∪ = union
∩ = intersection
( )ᶜ = complement
Chapter 7
= no new symbols introduced
Chapter 8
X, Y = random variables
rv = random variable
x, y = outcomes of X, Y
E(X) = expectation of X
V(X) = variance of X
SD(X) = standard deviation of X
F = cdf; (cumulative) distribution function
f = pdf; probability density function
Chapter 9
~ = is distributed as
Bin(n, p) = binomial distribution
H(n; M, N) = hypergeometric distribution
Chapter 10
= no new symbols introduced
Chapter 11
Cov(X,Y) = covariance
E(X|Y = y) = conditional expectation of X given that Y = y
V(X|Y = y) = conditional variance of X given that Y = y
h(x,y) = joint pdf
f(x|Y = y) = conditional pdf of X given that Y = y
ρX,Y = correlation coefficient
σX,Y = covariance
Chapter 12 & 13 & 14
= no new symbols introduced
Chapter 15
θ = notation for a general parameter
E = notation for a general estimator
H = half width
Chapter 16-18 t-Distribution
The graph is symmetric around 0.
CONFIDENCE INTERVAL
X̄ ± t(α/2; n - 1)·S/√n
HYPOTHESIS TESTING
(i) Test (a) H0: µ ≤ µ0 against H1: µ > µ0
(b) H0: µ ≥ µ0 against H1: µ < µ0 (c) H0: µ = µ0 against H1: µ ≠ µ0
(ii) Test statistic: T = (X̄ - µ0) / (S/√n)
(iii) Reject H0 if (a) t ≥ t(α; n - 1)
Reject if (b) t ≤ -t(α; n - 1); reject if (c) |t| ≥ t(α/2; n - 1)
(critical values from the t-table with n - 1 degrees of freedom)
(iv) Calculate val, the value of T when the data are substituted
(v) Reject / do not reject H0 since val is greater/smaller than the critical value.
P-Distributions
Use a p-distribution test when each value can only be 1 or 0 (i.e. for proportions).
CONFIDENCE INTERVAL
p̂ ± z(α/2)·√(p̂(1 - p̂)/n)
HYPOTHESIS TESTING
(i) Test (a) H0: p ≤ p0 against H1: p > p0 (b) H0: p ≥ p0 against H1: p < p0 (c) H0: p = p0 against H1: p ≠ p0
(ii) Test statistic: Z = (p̂ - p0) / √(p0(1 - p0)/n)
(iii) Reject H0 if (a) z ≥ z(α); reject if (b) z ≤ -z(α); reject if (c) |z| ≥ z(α/2)
(Excel: NORMSINV(1 - α))
(iv) Calculate val, the value of Z when the data are substituted
(v) Reject / do not reject H0 since val is greater/smaller than the critical value.
Chi-square distribution
Assume that the random sample comes from a normal distribution. The graph is not symmetric. Use when drawing conclusions about variances.
CONFIDENCE INTERVAL
((n - 1)S² / χ²(α/2; n - 1), (n - 1)S² / χ²(1 - α/2; n - 1))
HYPOTHESIS TESTING
(i) Test (a) H0: σ² ≤ σ0² against H1: σ² > σ0²
(b) H0: σ² ≥ σ0² against H1: σ² < σ0²
(c) H0: σ² = σ0² against H1: σ² ≠ σ0²
(ii) Test statistic: χ² = (n - 1)S² / σ0²
(iii) Reject H0 if (a) χ² ≥ χ²(α; n - 1)
Reject if (b) χ² ≤ χ²(1 - α; n - 1)
Reject if (c) χ² ≤ χ²(1 - α/2; n - 1) or χ² ≥ χ²(α/2; n - 1)
(Excel: CHIINV(α, n - 1))
(iv) Calculate val, the value of χ² when the data are substituted
(v) Reject / do not reject H0 since val is smaller/greater than the critical value.
Two-Parameter Distribution
Two different samples: independent samples, or dependent samples (also called paired).
Two independent samples X and Y, equal-variance test
CONFIDENCE INTERVAL
HYPOTHESIS TESTING
(i) Test (a)
(b) (c)
(ii) Test statistic:
(iii) Reject
Reject
Reject
(critical values from the t-table with n1 + n2 - 2 degrees of freedom)
(iv) Calculate the val
(v) H0 since val is smaller/greater than the critical value.
Two independent samples and , unequal-variance test
CONFIDENCE INTERVAL
m = min(n1, n2) - 1
HYPOTHESIS TESTING
(i) Test (a)
(b) (c)
(ii) Test statistic:
(iii) Reject Reject
Reject
(critical values from the t-table with m degrees of freedom)
(iv) Calculate the val
(v) H0 since val is smaller/greater than the critical value.
Two Paired samples, matched-pairs design
CONFIDENCE INTERVAL
D̄ = mean of the differences
HYPOTHESIS TESTING
(i) Test (a)
(b) (c)
(ii) Test statistic:
(iii) Reject
Reject
Reject
(critical values from the t-table with n - 1 degrees of freedom)
(iv) Calculate the val
(v) H0 since val is smaller/greater than the critical value.
F-Distribution
The F-distribution with its two parameters (degrees of freedom) is the probability distribution of a special density that is concentrated on (0, ∞).
CONFIDENCE INTERVAL
HYPOTHESIS TESTING
(i) Test (a)
(b)
(c)
(ii) Test statistic:
(iii) Reject
Reject
Reject
(Excel: FINV(α, n1 - 1, n2 - 1); left-tail critical values via FINV(1 - α, n1 - 1, n2 - 1))
(iv) Calculate the val
(v) H0 since val is smaller/greater than the critical value.
P-Distribution with two populations
CONFIDENCE INTERVAL
HYPOTHESIS TESTING
(i) Test (a) (b) (c)
(ii) Test statistic:
(iii) Reject Reject Reject
(Excel: NORMSINV(1 - α))
(iv) Calculate the val
(v) H0 since val is smaller/greater than the critical value.
Chapter 19 Simple Linear Regression Model standard deviation:
Model variance:
Standard deviation of B1
Standard error of B1
(i) Test (a) H0: β1 ≤ b against H1: β1 > b (b) H0: β1 ≥ b against H1: β1 < b (c) H0: β1 = b against H1: β1 ≠ b (b is the hinge)
(ii) Test statistic: T = (B1 - b) / SE(B1)
(iii) Reject H0 if (a) t ≥ t(α; n - 2)
Reject if (b) t ≤ -t(α; n - 2)
Reject if (c) |t| ≥ t(α/2; n - 2)
(iv) Calculate val, the value of T when the data are substituted
(v) Reject / do not reject H0 since val is greater/smaller than the critical value.
Interval estimator (L, U) for E(Yp) (confidence interval) and interval predictor (L, U) for Yp (prediction interval)
Chapter 20 Multiple Linear Regression Sample variance of the estimated model:
Standard error of the estimated model:
ANOVA table:
- Regression: sum of squares SSR; degrees of freedom k; mean square MSR = SSR / k; F-ratio F = MSR / MSE; significance = p-value
- Residual: sum of squares SSE; degrees of freedom n - (k + 1); mean square MSE = SSE / (n - (k + 1))
- Total: sum of squares SST; degrees of freedom n - 1
MODEL TEST, USEFULNESS OF THE MODEL
(i) Test H0: β1 = β2 = ... = βk = 0 against H1: at least one βi ≠ 0
(ii) Test statistic: F = MSR / MSE
(iii) Reject H0 if f ≥ F(α; k, n - (k + 1))
(iv) Calculate val, the value of F when the data are substituted
(v) Reject / do not reject H0 since val is greater/smaller than the critical value.
Ordinary coefficient of determination: R² = SSR / SST = 1 - SSE / SST
Adjusted coefficient of determination: R²adj = 1 - (SSE / (n - (k + 1))) / (SST / (n - 1))
Inference on the regression coefficients
Interval estimator (L, U) for βi: Bi ± t(α/2; n - (k + 1))·SE(Bi)
TESTING THE INDIVIDUAL SIGNIFICANCE OF, OR A CONJECTURE ABOUT, βi
(i) Test (a) H0: βi ≤ b against H1: βi > b (b) H0: βi ≥ b against H1: βi < b (c) H0: βi = b against H1: βi ≠ b (b is the hinge)
(ii) Test statistic: T = (Bi - b) / SE(Bi)
(iii) Reject H0 if (a) t ≥ t(α; n - (k + 1))
Reject if (b) t ≤ -t(α; n - (k + 1))
Reject if (c) |t| ≥ t(α/2; n - (k + 1))
(iv) Calculate val, the value of T when the data are substituted
(v) Reject / do not reject H0 since val is greater/smaller than the critical value.
Interval estimator of E(Yp):
Interval predictor of Yp:
Chapter 21 Multiple Linear Regression PARTIAL F-‐TEST (FOR USEFULNESS OF A PORTION)
(i) Test H0: βg+1 = ... = βk = 0 against H1: at least one of these βi ≠ 0
(ii) Test statistic: F = ((SSE0 - SSE1) / (k - g)) / (SSE1 / (n - (k + 1))), with SSE0 from the reduced model and SSE1 from the full model
(iii) Reject H0 if f ≥ F(α; k - g, n - (k + 1))
(iv) Calculate val, the value of F when the data are substituted
(v) Reject / do not reject H0 since val is greater/smaller than the critical value. If H1 is true, then the tested independent
variables are jointly significant.
HIGHER ORDER TERMS AND INTERACTION TERMS
Higher-order term: a power of a regressor, e.g. x²
Interaction term: the product of two regressors, e.g. xn·xm
The regression coefficient of the dummy is a difference of means under a ceteris paribus condition.
Chapter 22 Multiple Linear Regression DURBIN-‐WATSON TESTS (TO TEST FOR AUTOCORRELATION):
Positive:
Negative:
Two-‐sided:
LOGIT MODEL:
Logit regression equation: P(Y = 1) = e^(β0 + β1x1 + ... + βkxk) / (1 + e^(β0 + β1x1 + ... + βkxk))
It estimates P(Y = 1) and, after rounding, it predicts Y itself.
(Durbin-Watson decision diagrams on the 0-4 scale:
- Positive (one-sided): reject H0 if d < dL; inconclusive if dL ≤ d ≤ dU; do not reject H0 if d > dU
- Negative (one-sided): apply the same bounds to 4 - d
- Two-sided: use dα/2,L and dα/2,U at both ends of the scale)
Chapter 23 Time series and forecasting MOVING AVERAGES
Moving averages, 3-period: MAt = (yt-1 + yt + yt+1) / 3 for all t = 2, 3, ..., n - 1
EXPONENTIAL SMOOTHING AND FORECASTING
s1 = y1
st = w·yt + (1 - w)·st-1 for all t = 2, 3, ..., n
Forecast: ŷn+k = sn for all k = 1, 2, ...
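A Python sketch of 3-period moving averages and exponential smoothing (the series and the weight w are invented):

```python
y = [10, 12, 11, 14, 13]
w = 0.4

# Exponential smoothing: s1 = y1; st = w*yt + (1 - w)*s(t-1)
s = [y[0]]
for t in range(1, len(y)):
    s.append(w * y[t] + (1 - w) * s[-1])

forecast = s[-1]   # the forecast for every future period n + k is s_n

# 3-period moving averages for t = 2, ..., n - 1
ma = [(y[t - 1] + y[t] + y[t + 1]) / 3 for t in range(1, len(y) - 1)]
```

A larger w makes the smoothed series track recent observations more closely; a smaller w smooths more heavily.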
(i) Test H0: there is no first-order autocorrelation against H1: there is first-order autocorrelation
(ii) Test statistic: D
(iii) Conclude autocorrelation, inconclusive, or no autocorrelation, depending on where d falls relative to the bounds dL and dU
(iv) Calculate the val
(v) The test gives the conclusion...
(FIRST-ORDER) AUTOREGRESSIVE MODEL AR(1): yt = β0 + β1·yt-1 + εt, with E(εt) = 0, for all t = 2, 3, ...