mba statistics midterm review sheet


Upload: sk112988

Post on 13-Apr-2015


DESCRIPTION

8 x 11 MBA Midterm review sheet for statistics

TRANSCRIPT

Page 1: Mba Statistics Midterm Review Sheet

CHAPTER 1
Individuals are the objects described by a set of data. Individuals may be people, but they may also be animals or things.
A variable is any characteristic of an individual. A variable can take different values for different individuals.
A categorical variable places an individual into one of several groups or categories.
A quantitative variable takes numerical values for which arithmetic operations such as adding and averaging make sense.
The distribution of a variable tells us what values the variable takes and how often it takes these values.
A time plot of a variable plots each observation against the time at which it was measured. Always mark the time scale on the horizontal axis and the variable of interest on the vertical axis. If there are not too many points, connecting the points by lines helps show the pattern of changes over time.
Standard deviation s: The variance s^2 of a set of observations is the average of the squares of the deviations of the observations from their mean (for a sample, the sum of squares is divided by n - 1). In symbols, the variance of n observations x1, x2, …, xn is s^2 = [(x1 - x̄)^2 + (x2 - x̄)^2 + … + (xn - x̄)^2] / (n - 1). The standard deviation s is the square root of the variance.

CLASS 1
Two basic types of data: 1) Quantitative: the response is a number; 2) Qualitative (categorical): the answer to the question being asked is not a number, usually a word; summarized as "what % fell into each category."
Four scales of measurement (only the interval and ratio scales are truly quantitative):
1. Ratio: has a point of origin. On the Kelvin scale, for example, 0 means a complete absence of the thing being measured. A ratio scale has a logical zero value. In measuring distance around the track, the starting line is the 0 point and halfway around the mile-long outer track would be 2,640 feet. A horse that has run 100 yards has run twice as far as a horse that has run 50 yards. One can say that the outer track is three times as long as the inner.
2. Ordinal: you can say this one is higher than that one; can also be used to put people into groups.
3. Interval: interval scales measure distance but do not have a logical zero point that makes absolute magnitudes measurable; the scale is consistent throughout; usually required to make numerical comparisons. One can say that the orange hat jockey is ahead of the green hat jockey by 1 length, and the green hat ahead of the white hat by 4 lengths. But since we don't know the exact magnitude of a "length," we can't say exactly how far ahead each horse is.
4. Nominal: just puts things into groups.

Three important measures of central tendency: 1) Mode (most frequent value); 2) Mean (arithmetic sum / total number of observations); 3) Median (middle observation).
Sample mean vs. population mean: n vs. N; x̄ (x bar) vs. µ (mu).
Variance: how spread out are these numbers, i.e. how far is each observation from the mean; calculated as the mean squared distance from the mean. Population: divide by N. Sample: divide by n - 1 (one degree of freedom is lost: once the mean is known, you don't need to know what the nth number is). Sample variance is S^2; population variance is σ^2 (lowercase sigma squared).
Standard deviation: absolute dispersion; used to describe the distribution; just the square root of the variance.
Coefficient of variation: dispersion relative to the mean (relative dispersion). Sample: S / x̄; Population: σ / µ.
Range: highest value minus lowest value. Interquartile range: how far the 3rd quartile is from the 1st quartile (Q3 - Q1).
Shape: /\--- positively skewed (e.g. income) ; --/\-- symmetrical ; ---/\ negatively skewed. *If data is skewed, the mean is likely not the best description of central tendency.
Bimodal distribution: -/\--/\- two peaks, which suggests two groups that should be treated separately.
Empirical Rule (normal rule), assuming the data is normally distributed: P(within M ± 1 SD) ≈ 68%; P(within M ± 2 SD) ≈ 95%; P(within M ± 3 SD) ≈ 99.7%.
Z score = (x - Mean) / SD, i.e. the number of SDs below or above the mean. Standardized data (i.e. Z scores) will have a mean of 0 and an SD of 1.
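The Chapter 1 measures can be checked on a small data set. The sketch below is only an illustration: the data values are made up, and it uses Python's standard `statistics` module, whose `variance` divides by n - 1 and whose `pvariance` divides by N.

```python
import statistics

# Hypothetical sample, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)            # x-bar: sum / number of observations
median = statistics.median(data)        # middle observation
mode = statistics.mode(data)            # most frequent value
sample_var = statistics.variance(data)  # S^2: divides by n - 1
pop_var = statistics.pvariance(data)    # sigma^2: divides by N
sample_sd = statistics.stdev(data)      # square root of the sample variance
cv = sample_sd / mean                   # coefficient of variation (relative dispersion)

# Z scores: number of SDs each value lies above or below the mean.
z_scores = [(x - mean) / sample_sd for x in data]

print("mean, median, mode:", mean, median, mode)
print("sample variance, population variance:", sample_var, pop_var)
print("coefficient of variation:", round(cv, 3))
print("z scores:", [round(z, 2) for z in z_scores])
```

Standardizing with the sample SD gives Z scores whose own mean is 0 and whose own sample SD is 1, which is the property stated at the end of Chapter 1.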

CHAPTER 2
Density curve: a curve that 1) is always on or above the horizontal axis, and 2) has an area of exactly 1 underneath it.
The 68-95-99.7 Rule: In the normal distribution with mean µ and standard deviation σ: 1) approximately 68% of the observations fall within 1 SD of the mean; 2) approximately 95% of the observations fall within 2 SD of the mean; 3) approximately 99.7% of the observations fall within 3 SD of the mean.
Standard Normal Distribution: The standard normal distribution is the normal distribution N(0,1) with mean 0 and standard deviation 1. If a variable x has any normal distribution N(µ, σ) with mean µ and standard deviation σ, then the standardized variable z = (x - µ) / σ has the standard normal distribution.
Population vs Sample: The entire group of individuals that we want information about is called the population. A sample is a part of the population that we actually examine in order to gather information.
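The 68-95-99.7 percentages can be recovered from the standard normal distribution itself. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library; the values of `mu`, `sd`, and `x` are hypothetical and only there to show standardization.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)  # the standard normal N(0, 1)

def prob_within(k):
    """P(-k < Z < k): share of observations within k SDs of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {prob_within(k):.4f}")

# Standardizing: a value x from N(mu, sd) maps to z = (x - mu) / sd.
mu, sd = 100, 15   # hypothetical mean and standard deviation
x = 130            # hypothetical observation
z = (x - mu) / sd

# The area below x under N(mu, sd) equals the area below z under N(0, 1).
area_original = NormalDist(mu, sd).cdf(x)
area_standard = std_normal.cdf(z)
print(z, area_original, area_standard)
```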

Rule of combinations: used to find the number of ways of selecting x objects from n objects, irrespective of sequence: nCx = n! / (x!(n - x)!). Also: Z = (x - µ) / SD.
Sampling versus a Census: Sampling involves studying a part in order to gain information about the whole. A census attempts to contact every individual in the entire population.
BIAS: The design of a study is biased if it systematically favors certain outcomes.
A voluntary response sample consists of people who choose themselves by responding to a general appeal. Voluntary response samples are biased because people with strong opinions, especially negative opinions, are most likely to respond.
The Multiplication Rule for Independent Events: Two events A and B are independent if knowing that one occurs does not change the probability that the other occurs. If A and B are independent, P(A and B) = P(A)P(B). This is the multiplication rule for independent events.
• The union of any collection of events is the event that at least one of the events in the collection occurs.

Addition Rule for Disjoint Events: If events A, B, and C are disjoint in the sense that no two have any outcomes in common, then P(1 or more of A, B, C) = P(A) + P(B) + P(C). This rule extends to any number of disjoint events.
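Both rules, and the combinations formula, can be verified by brute-force counting. The sketch below assumes a sample space that is not in the original sheet (two fair dice) and uses exact fractions so the equalities hold without rounding.

```python
from fractions import Fraction
from itertools import product
from math import comb

# Assumed sample space: the 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a true/false test on one outcome."""
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))

# Independent events: A = first die is even, B = second die shows 6.
p_a = prob(lambda o: o[0] % 2 == 0)
p_b = prob(lambda o: o[1] == 6)
p_a_and_b = prob(lambda o: o[0] % 2 == 0 and o[1] == 6)
print("multiplication rule holds:", p_a_and_b == p_a * p_b)

# Disjoint events: the total is 2, the total is 3, the total is 4.
p_2 = prob(lambda o: sum(o) == 2)
p_3 = prob(lambda o: sum(o) == 3)
p_4 = prob(lambda o: sum(o) == 4)
p_any = prob(lambda o: sum(o) in (2, 3, 4))
print("addition rule holds:", p_any == p_2 + p_3 + p_4)

# Rule of combinations: ways to select 2 objects from 5, order ignored.
ways = comb(5, 2)
print("5C2 =", ways)
```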

CHAPTER 3: Numerical Descriptive Measures
Coefficient of Variation: should be computed only for data measured on a ratio scale, that is, measurements that can only take non-negative values. The coefficient of variation may not have any meaning for data on an interval scale. For example, most temperature scales are interval scales (e.g. Celsius, Fahrenheit); they can take both positive and negative values. The Kelvin scale has an absolute zero, and no negative values can naturally occur. Hence, the Kelvin scale is a ratio scale. While the standard deviation (SD) can be computed on both the Kelvin and the Celsius scale (with both giving the same SD), the CV is meaningful only on the Kelvin scale. Often, laboratory values that are measured by chromatographic methods are log-normally distributed. In this case, the CV would be constant over a large range of measurements, while SDs would vary depending on the actual range that has been measured. The CV is sometimes expressed as a percent, in which case the CV is multiplied by 100%.
Advantages: The coefficient of variation is useful because the standard deviation of data must always be understood in the context of the mean of the data. The value of the CV is independent of the unit in which the measurement has been taken, so it is a dimensionless number. For comparison between data sets with different units or widely different means, one should use the coefficient of variation instead of the standard deviation.
Disadvantages: When the mean value is close to zero, the coefficient of variation will approach infinity and is hence sensitive to small changes in the mean. This is often the case if the values do not originate from a ratio scale. Unlike the standard deviation, it cannot be used directly to construct confidence intervals for the mean.
Z-Scores: The Z score is the difference between the value and the mean, divided by the SD: Z = (X - X̄) / S. It is useful in identifying outliers. The larger the absolute value of the Z score, the greater the distance from the value to the mean.
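The Kelvin versus Celsius point can be made concrete. In this small sketch the temperature readings are invented for illustration: shifting every value by 273.15 leaves the SD unchanged but changes the mean, so the CV changes with the choice of zero point.

```python
import statistics

# Hypothetical temperature readings in degrees Celsius.
celsius = [18.0, 20.0, 22.0, 24.0, 26.0]
# The same readings on the Kelvin scale (a ratio scale with a true zero).
kelvin = [c + 273.15 for c in celsius]

sd_c = statistics.stdev(celsius)
sd_k = statistics.stdev(kelvin)

cv_c = sd_c / statistics.mean(celsius)
cv_k = sd_k / statistics.mean(kelvin)

print("SD in Celsius and Kelvin:", round(sd_c, 4), round(sd_k, 4))
print("CV in Celsius and Kelvin:", round(cv_c, 4), round(cv_k, 4))
```

The two SDs agree (up to floating-point rounding), while the Celsius CV depends on an arbitrary zero and is therefore not a meaningful measure of relative dispersion.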

CHAPTER 4: BASIC PROBABILITY (Sections 4.2 & 4.3)
Bayes' Theorem: a convoluted problem will give the tip that it's a Bayes' theorem problem. Example setup: P(B) = 70/100; P(B') = 30/100; P(O|B') = 50/100; P(O|B) = 25/100; P(B'|O) = ?
Try Prob 34, 35 (we did 33); 28 (cards); anything in Sec. 4.2.
A little bit of a positive skew, 14.448 standard error.
A priori probability: the number of ways the event occurs and the total number of possible outcomes are known in advance, e.g. from a deck of cards.
Empirical probability approach: probability based on observed data, not on prior knowledge.
X = number of ways in which the event occurs; T = total number of possible outcomes.
Simple probability = X / T.
Joint probability (AND): the probability that two or more events occur together. Two events are mutually exclusive if both cannot occur simultaneously; a set of events is collectively exhaustive if one of the events must occur.
Probability when you know certain information about the events involved:
Conditional: P(A|B) = P(A and B) / P(B); P(B|A) = P(A and B) / P(A).
Independence: Two events A and B are independent if and only if P(A|B) = P(A), where P(A|B) is the conditional probability of A given B.
Bayes: P(B|A) = P(A and B) / P(A) = P(A|B) * P(B) / P(A).
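The Bayes setup listed in this chapter (P(B) = 70/100, P(B') = 30/100, P(O|B) = 25/100, P(O|B') = 50/100) can be worked through numerically. The sheet does not say what the events B and O stand for, so the sketch below treats them as abstract events; the variable names are mine.

```python
from fractions import Fraction

# Probabilities as given on the review sheet.
p_b = Fraction(70, 100)              # P(B)
p_not_b = Fraction(30, 100)          # P(B')
p_o_given_b = Fraction(25, 100)      # P(O | B)
p_o_given_not_b = Fraction(50, 100)  # P(O | B')

# Total probability of O: add the two disjoint ways O can happen.
p_o = p_o_given_b * p_b + p_o_given_not_b * p_not_b

# Bayes: P(B' | O) = P(O | B') * P(B') / P(O)
p_not_b_given_o = p_o_given_not_b * p_not_b / p_o

print("P(O) =", p_o)
print("P(B'|O) =", p_not_b_given_o, "which is about", round(float(p_not_b_given_o), 4))
```

Note that the denominator P(O) is not given directly in the problem; it has to be built from the conditional probabilities, which is usually the step that makes these problems feel convoluted.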

CHAPTER 5: DISCRETE PROBABILITY DISTRIBUTIONS
A random variable is a variable whose value is a numerical outcome of a random phenomenon.
A discrete random variable X has a countable number of possible values. The probability distribution of X lists the values and their probabilities.
A continuous random variable X takes all values in an interval of numbers. The probability distribution of X is described by a density curve. The probability of any event is the area under the density curve and above the values of X that make up the event.
Formula: Suppose that X is a discrete random variable. To find the mean of X, multiply each possible value by its probability, then add all the products.
Independent variables: Statistical independence means the probability of A given B is the same as the probability of A. If there is independence, you can find the joint probability by multiplying the two simple probabilities.
Degrees of freedom in a counting (contingency) table: (# of rows - 1) × (# of columns - 1).
Ex. from class: 7/9 × 6/8 = 42/72 = 21/36; 7/9 × 2/8 = 14/72 = 7/36; 2/9 × 7/8 = 14/72 = 7/36; adding the last two: 7/36 + 7/36 = 14/36 = 7/18.
1) Series of independent trials: n = # of trials; x = # of successes; P = # of successes / # of trials.
Binomial probability: P(X = x | n, π) = [n! / (x!(n - x)!)] π^x (1 - π)^(n - x).
Law of Large Numbers: Draw independent observations at random from any population with finite mean µ. Decide how accurately you would like to estimate µ. As the number of observations drawn increases, the mean of the observed values eventually approaches the mean µ of the population as closely as you specified and then stays that close.
*The MBA book uses x instead of k.
The distribution of the count X of successes in the binomial setting is the binomial distribution with parameters n and p (written π in the MBA book). The parameter n is the number of observations, and p is the probability of a success on any one observation. The possible values of X are the whole numbers from 0 to n. As an abbreviation, we say that X is B(n, p).
Binomial Coefficient: The number of ways of arranging k successes among n observations is given by the binomial coefficient C(n, k) = n! / (k!(n - k)!) for k = 0, 1, 2, …, n.
Binomial Probability: If X has the binomial distribution with n observations and probability p of success on each observation, the possible values of X are 0, 1, 2, …, n. If k is any one of these values, P(X = k) = C(n, k) p^k (1 - p)^(n - k).
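The binomial formula and the "multiply each value by its probability, then add" rule for the mean can be tied together in a few lines. This is a minimal sketch; the choice n = 10 and p = 0.3 is hypothetical, and `mean_of_discrete` and `binomial_pmf` are names introduced here for illustration.

```python
from math import comb

def mean_of_discrete(values, probs):
    """Mean of a discrete random variable: sum of value * probability."""
    return sum(v * p for v, p in zip(values, probs))

def binomial_pmf(k, n, p):
    """P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.3  # hypothetical number of trials and success probability
values = list(range(n + 1))                    # possible values 0, 1, ..., n
probs = [binomial_pmf(k, n, p) for k in values]

total = sum(probs)                     # probabilities should add up to 1
mean = mean_of_discrete(values, probs) # should match n * p

print("sum of probabilities:", round(total, 10))
print("mean from the distribution:", round(mean, 10), "and n * p:", n * p)
```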

Mean and Standard Deviation of a Binomial Random Variable: If a count X has the binomial distribution with number of observations n and probability of success p, the mean µ and standard deviation of X are µ = np and SD = √(np(1 - p)).
Rule for Calculating Geometric Probabilities: If X has a geometric distribution with probability p of success and (1 - p) of failure on each observation, the possible values of X are 1, 2, 3, …
-- If n is any one of these values, the probability that the first success occurs on the nth trial is P(X = n) = (1 - p)^(n - 1) p.
-- The probability that it takes more than n trials to see the first success is P(X > n) = (1 - p)^n.
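A short sketch of these formulas follows, with a consistency check that the geometric "more than n trials" probability equals one minus the sum of the first n point probabilities. The function names and the numeric inputs are assumptions made for the example.

```python
from math import sqrt

def binomial_mean_sd(n, p):
    """Mean np and standard deviation sqrt(np(1 - p)) of a B(n, p) count."""
    return n * p, sqrt(n * p * (1 - p))

def geometric_pmf(n, p):
    """P(first success occurs on trial n) = (1 - p)^(n - 1) * p."""
    return (1 - p) ** (n - 1) * p

def geometric_tail(n, p):
    """P(more than n trials are needed) = (1 - p)^n."""
    return (1 - p) ** n

mu, sd = binomial_mean_sd(100, 0.5)  # hypothetical: 100 trials, p = 0.5
print("binomial mean and SD:", mu, sd)

p, n = 0.2, 5  # hypothetical success probability and trial count
tail_direct = geometric_tail(n, p)
tail_from_pmf = 1 - sum(geometric_pmf(k, p) for k in range(1, n + 1))
print("P(X > 5):", round(tail_direct, 6), round(tail_from_pmf, 6))
```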

BIAS CONT'D: A parameter is a number that describes the population. A parameter is a fixed number, but in practice we do not know its value because we cannot examine the entire population.

CHAPTER 7: SAMPLING AND SAMPLING DISTRIBUTIONS
A statistic is a number that describes a sample. The value of a statistic is known when we have taken a sample, but it can change from sample to sample. We often use a statistic to estimate an unknown parameter.
The sampling distribution of a statistic is the distribution of values taken by the statistic in all possible samples of the same size from the same population.
A statistic used to estimate a parameter is unbiased if the mean of its sampling distribution is equal to the true value of the parameter being estimated.
Measurement error: 1) Ambiguous question; 2) Hawthorne effect; 3) Respondent error.
Central limit theorem: regardless of how the population is distributed, as the sample size gets large enough, the sampling distribution of the sample mean gets closer to a normal distribution, centered at the population mean µ with standard deviation σ / √n.
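A quick simulation illustrates the central limit theorem. Everything specific here is an assumption for the demo: the population is exponential with mean 1 and SD 1 (strongly right-skewed), the sample size is 50, and 2,000 samples are drawn with a fixed seed so the run is repeatable.

```python
import random
import statistics

random.seed(0)  # fixed seed so the simulation is repeatable

POP_MEAN = 1.0  # mean of the assumed exponential population
POP_SD = 1.0    # SD of the assumed exponential population

def sample_mean(n):
    """Mean of one random sample of size n from the skewed population."""
    return statistics.mean(random.expovariate(1.0) for _ in range(n))

n = 50
means = [sample_mean(n) for _ in range(2000)]

center = statistics.mean(means)    # should be close to the population mean
spread = statistics.stdev(means)   # should be close to sigma / sqrt(n)
expected_spread = POP_SD / n ** 0.5

# If the sample means are roughly normal, about 68% fall within 1 SD of the center.
share_within_1sd = sum(abs(m - center) <= spread for m in means) / len(means)

print("center of sample means:", round(center, 3))
print("spread vs sigma/sqrt(n):", round(spread, 3), round(expected_spread, 3))
print("share within 1 SD:", round(share_within_1sd, 3))
```

Even though individual observations are far from normal, the sample means cluster symmetrically around the population mean, and their spread shrinks as n grows.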

General Addition Rule for the Union of Two Events: For any two events A and B, P(A or B) = P(A) + P(B) - P(A and B). Equivalently, P(A ∪ B) = P(A) + P(B) - P(A ∩ B).

General Multiplication Rule for Any Two Events: The probability that both of two events A and B happen together can be found by P(A and B) = P(A)P(B|A). Here P(B|A) is the conditional probability that B occurs given the information that A occurs.

CHAPTER 8: CONFIDENCE INTERVALS
Conditions for Constructing a Confidence Interval for µ: The construction of a confidence interval for a population mean µ is appropriate when:
1) the data come from an SRS (simple random sample) from the population of interest, and
2) the sampling distribution of x̄ is approximately normal.
Sample Size for Desired Margin of Error: To determine the sample size n that will yield a confidence interval for a population mean with a specified margin of error m, set the expression for the margin of error to be less than or equal to m and solve for n: z* σ / √n ≤ m, which gives n ≥ (z* σ / m)^2. Round n up to the next whole number.
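The sample-size rule can be turned into a small calculator. This is a sketch under stated assumptions: the population standard deviation is treated as known, the critical value z* is taken from the standard normal via `statistics.NormalDist().inv_cdf`, and the example inputs (SD of 15, margin of 3, 95% confidence) are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

def critical_value(confidence):
    """z* such that the central area under N(0, 1) equals the confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def required_sample_size(sigma, margin, confidence=0.95):
    """Smallest n with z* * sigma / sqrt(n) <= margin."""
    z_star = critical_value(confidence)
    return ceil((z_star * sigma / margin) ** 2)

sigma, margin = 15, 3  # hypothetical population SD and desired margin of error
n = required_sample_size(sigma, margin)

z_star = critical_value(0.95)
print("z* for 95% confidence:", round(z_star, 3))
print("required n:", n)
print("margin of error at that n:", round(z_star * sigma / sqrt(n), 3))
```

Rounding up rather than to the nearest integer matters: one observation fewer would leave the margin of error slightly above the target.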