Summary Statistics 1 and 2
Samenvatting_Statistiek_1_and_2.pdf
Brief Summary, Chapters 1-23
Tilburg University | Statistiek 2
Distribution not permitted | Downloaded by: Danique Sabel | Email address: [email protected]
Chapter 1 Intro and Basic Concepts N: Population elements n: Sample elements
Qualitative: Nominal: Cannot be ordered Ordinal: Can be ordered
Quantitative: Interval: Differences of values are meaningful, ratios not. Ratio: Differences and ratios are meaningful.
Discrete variables: the set of possible values can be counted. Continuous variables: the set of possible values is a real interval, containing uncountably many numbers.
Descriptive Statistics: Gathering the data, summarization (tables, graphs; statistics)
Probability (theory): Chance; probability; rules
Sampling (theory): How to draw a sample? Properties of random sampling
Inferential Statistics: Drawing conclusions about the population on the basis of a sample from it
Chapter 2 Tables and Graphs Frequency distribution: Overview of all values with accompanying frequencies
Relative frequency distribution: Overview of all values with accompanying relative frequencies
- frequency / total number of observations
CDF (cumulative distribution function F): F(a) = relative frequency of the observations ≤ a, for all real numbers a
Classified freq distribution: Overview of classes and accompanying frequencies
- depends on the classification
- preferably equal class widths
Frequency density: Overview of classes and relative frequencies divided by corresponding
class widths
Linear interpolation: (or see book page 42)
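The book's interpolation formula is not reproduced here, but the idea can be sketched in a few lines of Python (the class bounds and CDF values below are made up):

```python
def interpolate(x0, y0, x1, y1, x):
    """Linearly interpolate the y-value at x between (x0, y0) and (x1, y1)."""
    return y0 + (x - x0) * (y1 - y0) / (x1 - x0)

# Made-up classified data: the CDF reaches 0.30 at class bound 40 and
# 0.55 at class bound 50; estimate where it crosses 0.50 (the median).
median_est = interpolate(0.30, 40, 0.55, 50, 0.50)  # approximately 48.0
```

The same helper works for any table lookup between two known points, e.g. reading off a quantile inside a class.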
Chapter 3 Measures of Location Mode: value with the largest frequency
Median: middle of ordered observations
Mean (= arithmetic mean): average (population = µ, sample = x̄)
Weighted mean: x̄ = Σ wi·xi / Σ wi
Geometric mean: rg = (x1 · x2 · ... · xn)^(1/n)
Property: log(rg) is the arithmetic mean of log(x1), ..., log(xn)
Population mean: µ = (1/N) Σ xi
Sample mean: x̄ = (1/n) Σ xi
A binary variable can take only two values (0 and 1). Summing the observations counts the number of 1s; dividing that sum by n gives the mean, which is the proportion p.
= mode
= median
= (arithmetic) mean
rg = geometric mean
p = population proportion
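As an illustration of these location measures, a small Python sketch (the dataset is invented; `Counter` and `math.prod` are just one convenient way to compute them):

```python
import math
from collections import Counter

data = [2, 4, 4, 5, 8]

mode = Counter(data).most_common(1)[0][0]       # value with the largest frequency
median = sorted(data)[len(data) // 2]           # middle observation (n is odd here)
mean = sum(data) / len(data)                    # arithmetic mean
geo_mean = math.prod(data) ** (1 / len(data))   # geometric mean
```

For an even n the median is the average of the two middle observations; the sketch assumes an odd n to stay short.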
Chapter 4 Measures of Variation 4.1 Measures Based on Quartiles
IQR
Relative IQR:
Box plot: see book page 108
Ki, ki = ith quartile (i=1, 2, 3)
IQR = interquartile range = K3 - K1 (population), k3 - k1 (sample)
σ², S² = variance
σ, S = standard deviation
4.2 Measures Based on Deviations from the Mean
Population variance: σ² = (1/N) Σ (xi - µ)²
Population standard deviation: σ = √σ²
Sample variance: S² = (1/(n - 1)) Σ (xi - x̄)²
Sample standard deviation: S = √S²
Coefficients of variation: σ/µ and S/x̄
SHORT-CUT FORMULAS FOR VARIANCE
Population variance: σ² = (1/N) Σ xi² - µ²
Sample variance: S² = (1/(n - 1)) (Σ xi² - n·x̄²)
The Population variance is just the mean of the squares minus the square of the mean.
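The short-cut can be checked numerically; a Python sketch with made-up data:

```python
data = [1, 2, 4, 7]
N = len(data)
mu = sum(data) / N

# Definition: mean of the squared deviations from the mean
var_def = sum((x - mu) ** 2 for x in data) / N
# Short-cut: mean of the squares minus the square of the mean
var_short = sum(x ** 2 for x in data) / N - mu ** 2
```

Both expressions give the same population variance (5.25 for this dataset).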
4.3 Interpretation of the standard deviation
for all k > 0 it holds that:
Sample data: at least 1 - 1/k² of the observations lies in (x̄ - kS, x̄ + kS)
Population data: at least 1 - 1/k² of the observations lies in (µ - kσ, µ + kσ)
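A quick Python check of Chebyshev's lower bound 1 - 1/k² (the dataset is invented and deliberately contains one extreme value):

```python
import statistics

data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 20]
mu = statistics.mean(data)
sigma = statistics.pstdev(data)       # population standard deviation

k = 2
inside = sum(1 for x in data if abs(x - mu) < k * sigma)
fraction = inside / len(data)         # observed fraction within k std. deviations
bound = 1 - 1 / k**2                  # Chebyshev lower bound: 0.75
```

Here 9 of the 10 observations fall within 2 standard deviations, comfortably above the guaranteed 75%.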
4.4 z-Scores
Population dataset: z = (x - µ) / σ
Sample dataset: z = (x - x̄) / S
Outliers: observations that are extremely small or extremely large.
Outlier if smaller than K1 - 1.5·IQR or larger than K3 + 1.5·IQR (the 1.5·IQR definition)
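A sketch of the 1.5·IQR outlier rule in Python; note that `statistics.quantiles` uses its own quartile convention, which can differ slightly from the book's:

```python
import statistics

data = [3, 5, 7, 8, 9, 11, 12, 13, 40]     # made-up data with one extreme value

k1, k2, k3 = statistics.quantiles(data, n=4)   # quartiles
iqr = k3 - k1
lower, upper = k1 - 1.5 * iqr, k3 + 1.5 * iqr
outliers = [x for x in data if x < lower or x > upper]   # [40]
```

Only the extreme value 40 falls outside the fences.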
4.5 Variance of 0-1 data
Population dataset: µ = p, σ² = p(1 - p)
Sample dataset: x̄ = p̂, S² = (n/(n - 1))·p̂(1 - p̂)
4.6 Variance of a frequency distribution
Discrete frequency distribution:
Classified frequency distribution:
(Table: the mean µ and variance σ² can be computed either from the original observations or from the frequency distribution.)
Chapter 5 Pairs of Variables
5.1 Scatter plot, Covariance and Correlation. Measures of association: measures of the strength of the linear relationship.
σx,y, sx,y = covariance
ρx,y, rx,y = correlation coefficient
β0, b0 = intercept of regression line
β1, b1 = slope of regression line
Quadrants I, II, III, IV (around the point (x̄, ȳ)):
- In I and III: (x - x̄)(y - ȳ) > 0, a positive contribution to the covariance
- In II and IV: (x - x̄)(y - ȳ) < 0, a negative contribution
SHORT-CUT FORMULAS FOR COVARIANCE
Population covariance: σx,y = (1/N) Σ xi·yi - µx·µy
Sample covariance: sx,y = (1/(n - 1)) (Σ xi·yi - n·x̄·ȳ)
The population covariance is just the mean of the products minus the product of the means.
Population correlation coefficient: ρx,y = σx,y / (σx·σy)
Sample correlation coefficient: rx,y = sx,y / (sx·sy)
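These covariance and correlation formulas can be illustrated in Python (population versions, made-up data):

```python
x = [1, 2, 3, 4]
y = [2, 4, 5, 9]
N = len(x)
mx, my = sum(x) / N, sum(y) / N

# Short-cut: mean of the products minus the product of the means
cov_pop = sum(a * b for a, b in zip(x, y)) / N - mx * my
sd_x = (sum((a - mx) ** 2 for a in x) / N) ** 0.5
sd_y = (sum((b - my) ** 2 for b in y) / N) ** 0.5
corr = cov_pop / (sd_x * sd_y)   # close to +1: strong positive linear relation
```

For this dataset the covariance is 2.75 and the correlation is about 0.96.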
5.2 Regression Line
LEAST-‐SQUARES METHOD
LS-method: a and b are taken such that Σ (yi - a - b·xi)² is minimal
Population regression coefficients: β1 = σx,y / σx², β0 = µy - β1·µx
Population regression line: y = β0 + β1·x
Sample regression coefficients: b1 = sx,y / sx², b0 = ȳ - b1·x̄
Sample regression line: ŷ = b0 + b1·x
Property: a sample regression line passes through (x̄, ȳ); a population regression line passes through (µx, µy)
PREDICTION AND RESIDUALS
Prediction of yp: ŷp = b0 + b1·xp
Predictions of y1, y2, ..., yn: ŷi = b0 + b1·xi
The n prediction errors ei = yi - ŷi are called residuals
SUM OF SQUARED ERRORS
Sum of squared errors (overall): SSE = Σ ei²; note that the residuals of the LS-line sum to 0
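A minimal least-squares fit in Python, using the sample formulas b1 = sx,y / sx² and b0 = ȳ - b1·x̄ (data invented):

```python
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)
mx, my = sum(x) / n, sum(y) / n

# Sample covariance and variance (both with n - 1, so the factor cancels)
s_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
s_xx = sum((a - mx) ** 2 for a in x) / (n - 1)
b1 = s_xy / s_xx            # slope
b0 = my - b1 * mx           # intercept

residuals = [b - (b0 + b1 * a) for a, b in zip(x, y)]
sse = sum(e ** 2 for e in residuals)   # sum of squared errors
```

The residuals of a least-squares line always sum to (numerically) zero, which makes a handy sanity check.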
5.3 Linear Transformations
Mean of a + bx1, ..., a + bxn = a + b·x̄
Variance of a + bx1, ..., a + bxn (around a + b·x̄) = b²·sx², hence the standard deviation is |b|·sx
Transformation of covariance and correlation by v = a + bx and w = c + dy:
- Covariance: sv,w = bd·sx,y (population: σv,w = bd·σx,y)
- Correlation coefficient: rv,w = rx,y if bd > 0; rv,w = -rx,y if bd < 0
Relationship between two qualitative variables Contingency table: Table that offers an overview of all joint frequencies
(Summary table: location and variation formulas for a population dataset vs a sample dataset.)
Chapter 6 Definitions of Probability
6.1 Random experiments. Random experiment: an experiment or observation of an uncontrollable phenomenon for which more than one outcome is possible
Sample space Ω: the set of all possible outcomes
Elements: The different possible outcomes
P = probability (measure), model
Ω = sample space
Ø = empty set, empty event
⊂ = subset
∪ = union
∩ = intersection
( )ᶜ = complement
6.2 Rules for sets
Concept / Notation / Meaning for events:
- Empty set, Ø: cannot occur
- Sample space, Ω: occurs certainly
- Complement, Aᶜ: A does not occur
- Union, A ∪ B: at least one of the events A and B occurs (and/or)
- Intersection, A ∩ B: both A and B occur (and)
- Subset, A ⊂ B: if A occurs then B occurs
- Disjoint, A ∩ B = Ø: A and B cannot occur jointly
- Partition, D1, ..., Ds: exactly one of the events D1, ..., Ds occurs
6.3 Historical definitions of probability
Classical definition of probability: P(A) = (number of outcomes in A) / (number of outcomes in Ω), for all events A. Requirement: equally likely outcomes.
Empirical definition of probability: P(A) = the long-run relative frequency of A. Requirement: independent & identically repeatable experiments.
Subjective definition of probability: How strongly one individual believes in occurrence of event A
6.4 General definition of Kolmogorov. Probability measure P (definition of Kolmogorov): a probability measure P is a prescription that attaches a number P(A) to each event A such that the following axioms hold:
- P(A) ≥ 0 for all events A
- P(Ω) = 1
- If A and B are disjoint, then P(A ∪ B) = P(A) + P(B)
Chapter 7 Probability and Rules
7.1 Basic properties Important rules for probability: (7.1), (7.2), (7.6), (7.7), (7.8), (7.9)
Important rules for conditional probability: (7.12), (7.15), (7.17), (7.18), (7.19), (7.20) THE BASIC AXIOMS
Requirement / Formula / Numbering:
- (7.1)
- (7.2)
- If A, B and C are disjoint: P(A ∪ B ∪ C) = P(A) + P(B) + P(C) (7.3)
- If A1, A2, ... are disjoint: P(A1 ∪ A2 ∪ ...) = P(A1) + P(A2) + ... (7.4)
- If D1, D2, ... form a partition: P(D1) + P(D2) + ... = 1 (7.5)
- If A ⊂ B: (7.6) and (7.7)
- For all events B: (7.8)
- If D1, D2, ... form a partition: (7.9)
7.2 Rules for counting
Choosing k out of m:
- Ordered, with replacement: m^k
- Ordered, without replacement: m! / (m - k)!
- Unordered, without replacement: m! / (k!(m - k)!) (7.10)
- Unordered, with replacement: --------
7.3 Random drawing and random Sampling
(7.11)
(pa = relative frequency)
7.4 Conditional probabilities and independence
The conditional probability of event A given event B is denoted as P(A | B): P(A | B) = P(A ∩ B) / P(B) (7.12)
Interchanging A and B (if P(A) > 0) leads to the probability of B given A: P(B | A) = P(A ∩ B) / P(A) (7.13)
Multiplying both sides of (7.12) by P(B) and both sides of (7.13) by P(A) yields P(A ∩ B) = P(B)·P(A | B) = P(A)·P(B | A) (7.14)
Product rule: P(A ∩ B ∩ C) = P(A)·P(B | A)·P(C | A ∩ B) (7.15)
In the case of four events A, B, C and D it follows that P(A ∩ B ∩ C ∩ D) = P(A)·P(B | A)·P(C | A ∩ B)·P(D | A ∩ B ∩ C) (7.16)
Events A and B are stochastically independent if P(A ∩ B) = P(A)·P(B) (7.17)
Two other equivalent ways of expressing independence of A and B: P(A | B) = P(A) (7.18) and P(B | A) = P(B) (7.19)
Rule of Bayes: it expresses a conditional probability in terms of the opposite conditional probability: P(B | A) = P(A | B)·P(B) / P(A) (7.20)
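A classic illustration of the rule of Bayes combined with the total-probability idea (all numbers are invented):

```python
# Disease testing: D = "has the disease", + = "test is positive"
p_d = 0.01                  # P(D): prevalence
p_pos_d = 0.95              # P(+ | D): sensitivity
p_pos_not_d = 0.05          # P(+ | Dᶜ): false-positive rate

# Total probability: P(+) = P(+|D)P(D) + P(+|Dᶜ)P(Dᶜ)
p_pos = p_pos_d * p_d + p_pos_not_d * (1 - p_d)
# Bayes: P(D | +) = P(+|D)P(D) / P(+)
p_d_pos = p_pos_d * p_d / p_pos
```

Even with a good test, P(D | +) is only about 0.16 here, because the disease is rare; that is exactly the effect the rule of Bayes makes visible.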
Chapter 8 Probability Distribution, Expectation, Variance. A random variable (rv) is a prescription that attaches a value to each outcome of the sample space.
Working definition: quantity for which the actual outcome is determined by chance
The actual outcome of a random variable (X) is called the realisation
Discrete: Finite or countable number of values
Continuous: May assume any value in some interval
Probability distribution: Overview of the probabilities of all X-‐events
X, Y = random variables
rv = random variable
x, y = outcomes of X, Y
E(X) = expectation of X
V(X) = variance of X
SD(X) = standard deviation of X
F = cdf; (cumulative) distribution function
f = pdf; probability density function
DISCRETE VARIABLES
Probability density function (pdf) f of rv X: f(x) = P(X = x) for all outcomes x of X
f(x) is the probability that the realisation of X will be x
f is also called (discrete) density
f(x) ≥ 0 for all x, and Σ f(x) = 1 (8.1)
(Cumulative) distribution function (cdf) F of a rv X: F(a) = P(X ≤ a) for all real numbers a
F is non-decreasing
F(-∞) = 0; F(∞) = 1 (8.2)
F(a) ≤ F(b) for all a and b with a < b
Relation between discrete pdf and cdf (a non-decreasing step function): F(a) = Σ over x ≤ a of f(x) (8.3)
When x is an outcome of X and x* is the largest outcome that is smaller than x, it holds that f(x) = F(x) - F(x*) (8.4),
so f follows from F and f(x) is just the jump-size of F at x
A random variable X is called degenerate at the constant b if b is the only possible outcome of X.
Expectation or expected value or mean of a discrete X: E(X) = Σ x·f(x) (8.14)
Expectation of V = h(X) for a discrete X: E(V) = Σ h(x)·f(x) (8.17)
Variance of a discrete X: V(X) = E((X - µ)²) = Σ (x - µ)²·f(x) (8.18)
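A Python sketch of these discrete formulas, using a fair die as the pdf:

```python
# Discrete pdf f over the outcomes of X (a fair six-sided die)
f = {x: 1 / 6 for x in range(1, 7)}

mean = sum(x * p for x, p in f.items())                        # E(X) = Σ x f(x)
var = sum((x - mean) ** 2 * p for x, p in f.items())           # V(X) = Σ (x-µ)² f(x)
var_short = sum(x ** 2 * p for x, p in f.items()) - mean ** 2  # short-cut E(X²) - µ²
```

For a fair die E(X) = 3.5 and V(X) = 35/12; the short-cut gives the same number.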
CONTINUOUS VARIABLES
(8.5)
Property for continuous X: P(X = x) = 0 for all real numbers x (due to the infinite number of possibilities) (8.6)
Property for continuous X: P(X ≤ x) = P(X < x) for all real numbers x (8.7)
Probability density function (pdf) f of a continuous rv X: P(a ≤ X ≤ b) = area under f between a and b (8.8)
Elementary properties of the pdf f of a continuous X:
f(x) ≥ 0 for all real numbers x (8.9)
the total area under f is 1 (8.10)
Properties of the cdf F of a continuous X:
It is a non-decreasing continuous function that is strictly increasing on an interval where the pdf is positive
It is completely determined by the pdf f since F(b) = the area under f to the left of b, for all real numbers b (8.11)
It completely determines the pdf f since f(a) = F′(a), for all real numbers a (8.12)
(8.13)
For small h, the probability that X falls in the small neighbourhood [a, a + h] is approximately f(a)·h, so this probability is relatively large where f(a) is large compared with neighbourhoods of other outcomes. Hence f(a) is NOT the probability of a; it is called the likelihood of outcome a (see page 286 for a detailed explanation).
Expectation or expected value or mean of a continuous X: E(X) = ∫ x·f(x) dx (8.22)
Expectation of V = h(X) for a continuous X: E(V) = ∫ h(x)·f(x) dx (the total area under h(x)·f(x)) (8.23)
Variance of a continuous X: V(X) = E((X - µ)²) (8.18)
8.5 Rules for expectation and variance. Linear transformation (discrete and continuous):
E(a + bX) = a + b·E(X) and V(a + bX) = b²·V(X) (8.30)
Short-cut formula for the variance of X: V(X) = E(X²) - (E(X))² (8.32)
8.7 Other statistics of probability distributions
The p-quantile ξp of X and its distribution is such that: F(ξp) = p (8.41)
Chapter 9 Families of Discrete Distributions. This whole chapter is about 0 and 1 values
~ = is distributed as
Bin(n, p) = binomial distribution
H(n; M, N) = hypergeometric distribution
Expectation, variance and standard deviation of Alt(p): E = p, V = p(1 - p), SD = √(p(1 - p)) (9.1)
The binomial distribution with parameters n and p has the following pdf (Y ~ Bin(n, p)): f(y) = (n choose y)·p^y·(1 - p)^(n-y) (9.2)
Expectation, variance and standard deviation of Bin(n, p): E = np, V = np(1 - p), SD = √(np(1 - p)) (9.3)
Expectation, variance and standard deviation of the proportion of successes p̂ = Y/n in a binomial experiment: E = p, V = p(1 - p)/n, SD = √(p(1 - p)/n) (9.4)
If
(9.9)
Distribution / Notation / f(y) / Expectation / Variance:
- Hypergeometric, H(n; M, N): E = n·(M/N), V = n·(M/N)·(1 - M/N)·(N - n)/(N - 1)
- Binomial, Bin(n, p): f(y) = (n choose y)·p^y·(1 - p)^(n-y), E = np, V = np(1 - p)
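The binomial pdf and its moments can be verified numerically (n = 10 and p = 0.3 are chosen arbitrarily):

```python
from math import comb

n, p = 10, 0.3

def binom_pdf(y):
    # f(y) = (n choose y) p^y (1 - p)^(n - y)
    return comb(n, y) * p ** y * (1 - p) ** (n - y)

mean = sum(y * binom_pdf(y) for y in range(n + 1))
var = sum((y - mean) ** 2 * binom_pdf(y) for y in range(n + 1))
# These should match E = n*p = 3 and V = n*p*(1 - p) = 2.1
```

Summing y·f(y) over all outcomes reproduces np exactly, which is a useful check on (9.2) and (9.3).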
Chapter 10 Families of Continuous Distributions
10.1 Uniform distributions. Uniform distribution U(a, b): f(y) = 1/(b - a) for a ≤ y ≤ b (10.1)
F(y) = (y - a)/(b - a) for a ≤ y ≤ b (10.2)
Expectation, variance and standard deviation of U(a, b): E = (a + b)/2, V = (b - a)²/12 (10.3)
10.2 Exponential distributions. Exponential distribution Expo(µ): f(y) = (1/µ)·e^(-y/µ) for y ≥ 0 (10.4)
Some properties of Y ~ Expo(µ) with cdf F: F(y) = 1 - e^(-y/µ) and P(Y > y) = e^(-y/µ) (10.5 & 10.6)
10.3 Normal distribution. Normal distribution N(µ, σ²): f(y) = (1/(σ√(2π)))·e^(-(y - µ)²/(2σ²)) (10.7)
Notation: Y ~ N(µ, σ²) if Y has this density
Some properties of the pdf of N(µ, σ²):
f(µ) = 1/(σ√(2π)) is the maximum value of f
f(µ + a) = f(µ - a) for all positive a; so f is symmetric around µ
f(y) tends to 0 as y moves away from µ in either direction
At y = µ - σ and y = µ + σ the graph of f has turning points, in the sense that the decline decreases when going further from µ
Expectation and variance of N(µ, σ²): E = µ, V = σ² (10.8)
Standard normal distribution or z-distribution: N(0, 1) (10.9)
If X ~ N(µ, σ²) and Y = a + bX, then Y ~ N(a + bµ, b²σ²) (10.10)
If X ~ N(µ, σ²) and Z = (X - µ)/σ, then Z ~ N(0, 1) (10.11)
If Z ~ N(0, 1) and X = σZ + µ, then X ~ N(µ, σ²) (10.12)
) = 1 (10.14)
Distribution / Notation / Expectation / Variance:
- Uniform, U(a, b): E = (a + b)/2, V = (b - a)²/12
- Exponential, Expo(µ): E = µ, V = µ²
- Normal, N(µ, σ²): E = µ, V = σ²
Chapter 11 Joint Probability Distributions The whole chapter is about discrete variables, continuous are not looked into in this book.
Cov(X,Y) = covariance
E(X|Y = y) = conditional expectation of X given that Y = y
V(X|Y = y) = conditional variance of X given that Y = y
h(x,y) = joint pdf
f(x|Y = y) = conditional pdf of X given that Y = y
ρX,Y = correlation coefficient
σX,Y = covariance
Covariance of X and Y: σX,Y = Cov(X, Y) = E((X - µX)(Y - µY)) (11.3)
Correlation coefficient of X and Y: ρX,Y = σX,Y / (σX·σY) (11.4)
Short-cut formula for the covariance of X and Y: σX,Y = E(XY) - µX·µY (11.5)
Covariance and correlation of V = a + bX and W = c + dY:
- Covariance: σV,W = bd·σX,Y
- Correlation coefficient: ρV,W = ρX,Y if bd > 0; ρV,W = -ρX,Y if bd < 0
Conditional expectation of V = v(X) given that {Y = y}: E(V | Y = y) = Σ v(x)·f(x | Y = y) (11.8)
(Stochastically) independent X and Y: f(x, y) = fX(x)·fY(y) for all x and y (11.9)
X and Y are (stochastically) independent if the joint pdf is equal to the product of the two marginal pdfs
Properties of two independent X and Y: E(XY) = E(X)·E(Y) and Cov(X, Y) = 0 (11.10)
Expectation and variance of a linear combination of X and Y: E(aX + bY) = a·E(X) + b·E(Y); V(aX + bY) = a²·V(X) + b²·V(Y) + 2ab·Cov(X, Y) (11.14)
Expectation and variance of X1 + ... + Xn: E = Σ E(Xi); V = Σ V(Xi) plus all covariance terms (11.16)
Expectation and variance of X1 + ... + Xn for independent Xi with the same mean µ and variance σ²: E = nµ, V = nσ² (11.17)
Property of the sum of two independent binomials: if X ~ Bin(n, p) and Y ~ Bin(m, p) are independent, then X + Y ~ Bin(n + m, p) (11.18)
For V = X1 + ... + Xn with independent Xi: µV = nµ, σV² = nσ² (11.19)
For W = nX (the same X taken n times): µW = nµ, σW² = n²σ² (11.20)
The probability distribution of X + Y for independent X and Y:
- If X ~ Po(µ1) and Y ~ Po(µ2), then X + Y ~ Po(µ1 + µ2) (Poisson)
Chapter 12 Random Samples There are 4 types of random sampling
Random Sampling with replacement
Random Sampling without replacement
Stratified Random Sampling: The population is divided into natural sub-‐populations (strata)
and independent random samples are drawn from them.
Cluster Sampling: the population is divided into sub-populations (clusters), a random sample
of clusters is drawn, and all elements of these clusters constitute the sample.
i-property: X1, ..., Xn are independent (only for random sampling with replacement)
id-property: X1, ..., Xn are identically distributed
A sample statistic is a random variable that is based only on the random sample X1, ..., Xn and not on
unknown parameters.
An estimator of a parameter is a sample statistic that can be used to generate approximations of that
parameter. p̂ is the natural estimator of p; an estimate (the realised value) is denoted with a small letter.
Chapter 13 The Sample Mean Random sample with replacement:
Random sample without replacement:
If the sample size is less than 10% of the population size, the drawing of the sample is done without
replacement but its results are analysed as if done with replacement. When not stated otherwise we
can assume it is a random sample with replacement.
Central Limit Theorem (CLT): the sample mean X̄ of a random sample X1, ..., Xn has the following
property: X̄ is approximately N(µ, σ²/n) for large n.
In many cases n has to be at least 30 to use normality for the sample mean.
If the distribution of the Xi is N(µ, σ²) and/or the sample size n is large, then X̄ ~ N(µ, σ²/n) (exactly, respectively approximately).
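A small simulation of the CLT in Python: sample means of a (non-normal) uniform population are approximately N(µ, σ²/n):

```python
import random
import statistics

random.seed(1)

# Population: uniform on [0, 1], so µ = 0.5 and σ² = 1/12
n, reps = 30, 2000
means = [statistics.mean(random.random() for _ in range(n)) for _ in range(reps)]

approx_mean = statistics.mean(means)       # should be near µ = 0.5
approx_var = statistics.pvariance(means)   # should be near σ²/n = 1/360
```

With n = 30 (the rule of thumb above) the histogram of `means` already looks bell-shaped, even though the population is flat.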
Chapter 14 Sample Proportion and other Sample Statistics
If n is so large that np ≥ 5 and n(1 - p) ≥ 5, then: p̂ is approximately N(p, p(1 - p)/n)
Estimator / Standard deviation / Standard error:
- Interest is in µ: estimator X̄; SD = σ/√n (σ known); SE = S/√n (σ unknown)
- Interest is in p: estimator p̂; SD = √(p(1 - p)/n); SE = √(p̂(1 - p̂)/n)
Chapter 15 Inferential Statistics
θ = notation for a general parameter
E = notation for a general estimator
H = half width
POINT ESTIMATION:
Estimator: sample proportion
Format 1: L = E - H and U = E + H, where H is a non-negative sample statistic.
Format 2: L = aE and U = bE, where a < b (see chapters 17 and 18)
H = a·SD or, if the SD contains unknown parameters, H = a·SE (standard error) (15.2)
INTERVAL ESTIMATION: p̂ ± z(α/2)·√(p̂(1 - p̂)/n) (15.4) & (15.5)
HYPOTHESIS TESTING:
H1 = alternative hypothesis, linked to the rejection region R; H0 = null hypothesis, linked to Rc
Do not reject H0 Reject H0
H0 is true Correct conclusion Incorrect, type I error
H1 is true Incorrect, type II error Correct conclusion
Type I errors are controlled at a prescribed level α, normally 0.05 or 0.01. This α is the
significance level. Type II errors usually become small for a large n.
There are three types of testing problems; µ0 is a fixed and known constant called hinge.
I. H0: µ ≤ µ0 against H1: µ > µ0 (one-sided, upper-tailed)
II. H0: µ ≥ µ0 against H1: µ < µ0 (one-sided, lower-tailed)
III. H0: µ = µ0 against H1: µ ≠ µ0 (two-sided)
TESTING PROBLEM I
Test H0: µ ≤ µ0 against H1: µ > µ0
Test statistic: Z = (X̄ - µ0) / (σ/√n)
Reject H0 if z ≥ z(α)
Calculate val, the value of Z when the data are substituted
Draw the statistical conclusion
TESTING PROBLEM II
Test H0: µ ≥ µ0 against H1: µ < µ0
Test statistic: Z = (X̄ - µ0) / (σ/√n)
Reject H0 if z ≤ -z(α)
Calculate val, the value of Z when the data are substituted
Draw the statistical conclusion
TESTING PROBLEM III
Test H0: µ = µ0 against H1: µ ≠ µ0
Test statistic: Z = (X̄ - µ0) / (σ/√n)
Reject H0 if z ≤ -z(α/2) or z ≥ z(α/2)
Calculate val, the value of Z when the data are substituted
Draw the statistical conclusion
The p-value or observed significance level: the smallest level of α that allows the conclusion of
rejecting H0. A p-value can only be calculated afterwards, as soon as val has been calculated.
Testing p-‐value with hinge µ0
test statistic:
Reject H0 -‐z or
Calculate the val, the value of Z when data are substituted
Draw the statistical conclusion
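The five-step z-test can be sketched in Python (testing problem I, with invented data; `statistics.NormalDist` supplies the critical value and the p-value):

```python
from statistics import NormalDist

# H0: µ <= µ0 against H1: µ > µ0, with σ known (all numbers made up)
mu0, sigma, n = 100.0, 15.0, 36
xbar = 105.5
alpha = 0.05

z_val = (xbar - mu0) / (sigma / n ** 0.5)    # realised value of Z ("val")
z_crit = NormalDist().inv_cdf(1 - alpha)     # z(α), about 1.645
p_value = 1 - NormalDist().cdf(z_val)        # observed significance level

reject = z_val >= z_crit                     # statistical conclusion
```

Here val = 2.2 exceeds the critical value, so H0 is rejected; equivalently the p-value (about 0.014) is below α.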
Overview of variables
Chapter 1
N = population elements
n = sample elements
| | = absolute value
Chapter 2
= no new symbols introduced
Chapter 3
= mode
= median
= (arithmetic) mean
rg = geometric mean
Chapter 4
p = population proportion
Ki, ki = ith quartile (i=1, 2, 3)
IQR = interquartile range = K3 - K1, k3 - k1
σ², S² = variance
σ, S = standard deviation
Chapter 5
σx,y, sx,y = covariance
ρx,y, rx,y = correlation coefficient
β0, b0 = intercept of regression line
β1, b1 = slope of regression line
Chapter 6
P = probability (measure), model
Ω = sample space
Ø = empty set, empty event
⊂ = subset
∪ = union
∩ = intersection
( )ᶜ = complement
Chapter 7
= no new symbols introduced
Chapter 8
X, Y = random variables
rv = random variable
x, y = outcomes of X, Y
E(X) = expectation of X
V(X) = variance of X
SD(X) = standard deviation of X
F = cdf; (cumulative) distribution function
f = pdf; probability density function
Chapter 9
~ = is distributed as
Bin(n, p) = binomial distribution
H(n; M, N) = hypergeometric distribution
Chapter 10
= no new symbols introduced
Chapter 11
Cov(X,Y) = covariance
E(X|Y = y) = conditional expectation of X given that Y = y
V(X|Y = y) = conditional variance of X given that Y = y
h(x,y) = joint pdf
f(x|Y = y) = conditional pdf of X given that Y = y
ρX,Y = correlation coefficient
σX,Y = covariance
Chapter 12 & 13 & 14
= no new symbols introduced
Chapter 15
θ = notation for a general parameter
E = notation for a general estimator
H = half width
Chapter 16-18 t-Distribution
The graph is symmetric around 0.
CONFIDENCE INTERVAL
X̄ ± t(α/2; n - 1)·S/√n
HYPOTHESIS TESTING
(i) Test (a) H0: µ ≤ µ0 against H1: µ > µ0
(b) H0: µ ≥ µ0 against H1: µ < µ0 (c) H0: µ = µ0 against H1: µ ≠ µ0
(ii) Test statistic: T = (X̄ - µ0) / (S/√n)
(iii) Reject H0 if (a) t ≥ t(α; n - 1)
Reject if (b) t ≤ -t(α; n - 1); reject if (c) |t| ≥ t(α/2; n - 1)
(critical values from the t-table with n - 1 degrees of freedom)
(iv) Calculate val, the value of T when the data are substituted
(v) Reject / do not reject H0 since val is greater/smaller than the critical value.
P-Distributions
Use a p-distribution test when each value can only be 1 or 0 (i.e. for proportions).
CONFIDENCE INTERVAL
p̂ ± z(α/2)·√(p̂(1 - p̂)/n)
HYPOTHESIS TESTING
(i) Test (a) H0: p ≤ p0 against H1: p > p0 (b) H0: p ≥ p0 against H1: p < p0 (c) H0: p = p0 against H1: p ≠ p0
(ii) Test statistic: Z = (p̂ - p0) / √(p0(1 - p0)/n)
(iii) Reject H0 if (a) z ≥ z(α); reject if (b) z ≤ -z(α); reject if (c) |z| ≥ z(α/2)
(Excel: NORMSINV(1 - α))
(iv) Calculate val, the value of Z when the data are substituted
(v) Reject / do not reject H0 since val is greater/smaller than the critical value.
Chi-square distribution
Assume that the random sample comes from a normal distribution. The graph is not symmetric. Use when drawing conclusions about variances.
CONFIDENCE INTERVAL
((n - 1)S² / χ²(α/2; n - 1), (n - 1)S² / χ²(1 - α/2; n - 1))
HYPOTHESIS TESTING
(i) Test (a) H0: σ² ≤ σ0² against H1: σ² > σ0²
(b) H0: σ² ≥ σ0² against H1: σ² < σ0²
(c) H0: σ² = σ0² against H1: σ² ≠ σ0²
(ii) Test statistic: χ² = (n - 1)S² / σ0²
(iii) Reject H0 if (a) χ² ≥ χ²(α; n - 1)
Reject if (b) χ² ≤ χ²(1 - α; n - 1)
Reject if (c) χ² ≤ χ²(1 - α/2; n - 1) or χ² ≥ χ²(α/2; n - 1)
(Excel: CHIINV(α, n - 1))
(iv) Calculate val, the value of χ² when the data are substituted
(v) Reject / do not reject H0 since val is smaller/greater than the critical value.
Two-Parameter Distribution
Two different samples: independent samples, or dependent samples (also called paired).
Two independent samples X and Y, equal-variance test
CONFIDENCE INTERVAL
HYPOTHESIS TESTING
(i) Test (a)
(b) (c)
(ii) Test statistic:
(iii) Reject
Reject
Reject
(critical values from the t-table with n1 + n2 - 2 degrees of freedom)
(iv) Calculate the val
(v) H0 since val is smaller/greater than the critical value.
Two independent samples and , unequal-variance test
CONFIDENCE INTERVAL
m = min(n1, n2) - 1
HYPOTHESIS TESTING
(i) Test (a)
(b) (c)
(ii) Test statistic:
(iii) Reject Reject
Reject
(critical values from the t-table with m degrees of freedom)
(iv) Calculate the val
(v) H0 since val is smaller/greater than the critical value.
Two Paired samples, matched-pairs design
CONFIDENCE INTERVAL
D̄ = mean of the differences
HYPOTHESIS TESTING
(i) Test (a)
(b) (c)
(ii) Test statistic:
(iii) Reject
Reject
Reject
(critical values from the t-table with n - 1 degrees of freedom)
(iv) Calculate the val
(v) H0 since val is smaller/greater than the critical value.
F-Distribution
The F-distribution with its two parameters (degrees of freedom) is the probability distribution of a special density that is concentrated on (0, ∞).
CONFIDENCE INTERVAL
HYPOTHESIS TESTING
(i) Test (a)
(b)
(c)
(ii) Test statistic:
(iii) Reject
Reject
Reject
(Excel: FINV(α, n1 - 1, n2 - 1); left-tail critical values via FINV(1 - α, n1 - 1, n2 - 1))
(iv) Calculate the val
(v) H0 since val is smaller/greater than the critical value.
P-Distribution with two populations
CONFIDENCE INTERVAL
HYPOTHESIS TESTING
(i) Test (a) (b) (c)
(ii) Test statistic:
(iii) Reject Reject Reject
(Excel: NORMSINV(1 - α))
(iv) Calculate the val
(v) H0 since val is smaller/greater than the critical value.
Chapter 19 Simple Linear Regression Model standard deviation:
Model variance:
Standard deviation of B1
Standard error of B1
(i) Test (a) H0: β1 ≤ b against H1: β1 > b (b) H0: β1 ≥ b against H1: β1 < b (c) H0: β1 = b against H1: β1 ≠ b (b is the hinge)
(ii) Test statistic: T = (B1 - b) / SE(B1)
(iii) Reject H0 if (a) t ≥ t(α; n - 2)
Reject if (b) t ≤ -t(α; n - 2)
Reject if (c) |t| ≥ t(α/2; n - 2)
(iv) Calculate val, the value of T when the data are substituted
(v) Reject / do not reject H0 since val is greater/smaller than the critical value.
Interval estimator (L, U) for E(Yp) (confidence interval) and interval predictor (L, U) for Yp (prediction interval)
Chapter 20 Multiple Linear Regression Sample variance of the estimated model:
Standard error of the estimated model:
ANOVA table:
- Regression: sum of squares SSR; degrees of freedom k; mean square MSR = SSR / k; F-ratio F = MSR / MSE; significance = p-value
- Residual: sum of squares SSE; degrees of freedom n - (k + 1); mean square MSE = SSE / (n - (k + 1))
- Total: sum of squares SST; degrees of freedom n - 1
MODEL TEST, USEFULNESS OF THE MODEL
(i) Test H0: β1 = β2 = ... = βk = 0 against H1: at least one βi ≠ 0
(ii) Test statistic: F = MSR / MSE
(iii) Reject H0 if f ≥ F(α; k, n - (k + 1))
(iv) Calculate val, the value of F when the data are substituted
(v) Reject / do not reject H0 since val is greater/smaller than the critical value.
Ordinary coefficient of determination: R² = SSR / SST = 1 - SSE / SST
Adjusted coefficient of determination: R²adj = 1 - (SSE / (n - (k + 1))) / (SST / (n - 1))
Inference on the regression coefficients
Interval estimator (L, U) for βi: Bi ± t(α/2; n - (k + 1))·SE(Bi)
TESTING THE INDIVIDUAL SIGNIFICANCE OF, OR A CONJECTURE ABOUT, βi
(i) Test (a) H0: βi ≤ b against H1: βi > b (b) H0: βi ≥ b against H1: βi < b (c) H0: βi = b against H1: βi ≠ b (b is the hinge)
(ii) Test statistic: T = (Bi - b) / SE(Bi)
(iii) Reject H0 if (a) t ≥ t(α; n - (k + 1))
Reject if (b) t ≤ -t(α; n - (k + 1))
Reject if (c) |t| ≥ t(α/2; n - (k + 1))
(iv) Calculate val, the value of T when the data are substituted
(v) Reject / do not reject H0 since val is greater/smaller than the critical value.
Interval estimator of E(Yp):
Interval predictor of Yp:
Chapter 21 Multiple Linear Regression PARTIAL F-‐TEST (FOR USEFULNESS OF A PORTION)
(i) Test H0: βg+1 = ... = βk = 0 against H1: at least one of these βi ≠ 0
(ii) Test statistic: F = ((SSE0 - SSE1) / (k - g)) / (SSE1 / (n - (k + 1))), with SSE0 from the reduced model and SSE1 from the full model
(iii) Reject H0 if f ≥ F(α; k - g, n - (k + 1))
(iv) Calculate val, the value of F when the data are substituted
(v) Reject / do not reject H0 since val is greater/smaller than the critical value. If H1 is true, then the tested independent
variables are jointly significant.
HIGHER ORDER TERMS AND INTERACTION TERMS
Higher-order term: a power of a regressor, e.g. x²
Interaction term: the product of two regressors, e.g. xn·xm
The regression coefficient of the dummy is a difference of means under a ceteris paribus condition.
Chapter 22 Multiple Linear Regression DURBIN-‐WATSON TESTS (TO TEST FOR AUTOCORRELATION):
Positive:
Negative:
Two-‐sided:
LOGIT MODEL:
Logit regression equation: P(Y = 1) = e^(β0 + β1x1 + ... + βkxk) / (1 + e^(β0 + β1x1 + ... + βkxk))
It estimates P(Y = 1) and, after rounding, it predicts Y itself.
(Durbin-Watson decision diagrams on the 0-4 scale:
- Positive (one-sided): reject H0 if d < dL; inconclusive if dL ≤ d ≤ dU; do not reject H0 if d > dU
- Negative (one-sided): apply the same bounds to 4 - d
- Two-sided: use dα/2,L and dα/2,U at both ends of the scale)
Chapter 23 Time series and forecasting MOVING AVERAGES
Moving averages, 3-period: MAt = (yt-1 + yt + yt+1) / 3 for all t = 2, 3, ..., n - 1
EXPONENTIAL SMOOTHING AND FORECASTING
s1 = y1
st = w·yt + (1 - w)·st-1 for all t = 2, 3, ..., n
Forecast: ŷn+k = sn for all k = 1, 2, ...
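A Python sketch of 3-period moving averages and exponential smoothing (the series and the weight w are invented):

```python
y = [10, 12, 11, 14, 13]
w = 0.4

# Exponential smoothing: s1 = y1; st = w*yt + (1 - w)*s(t-1)
s = [y[0]]
for t in range(1, len(y)):
    s.append(w * y[t] + (1 - w) * s[-1])

forecast = s[-1]   # the forecast for every future period n + k is s_n

# 3-period moving averages for t = 2, ..., n - 1
ma = [(y[t - 1] + y[t] + y[t + 1]) / 3 for t in range(1, len(y) - 1)]
```

A larger w makes the smoothed series track recent observations more closely; a smaller w smooths more heavily.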
(i) Test H0: there is no first-order autocorrelation against H1: there is first-order autocorrelation
(ii) Test statistic: D
(iii) Conclude autocorrelation, inconclusive, or no autocorrelation, depending on where d falls relative to the bounds dL and dU
(iv) Calculate the val
(v) The test gives the conclusion...
(FIRST-ORDER) AUTOREGRESSIVE MODEL AR(1): yt = β0 + β1·yt-1 + εt, with E(εt) = 0, for all t = 2, 3, ...