statistics slides

Ritesh Singhal WELCOME To all PGDM Students from Ritesh Singhal {M.Sc.(Maths), MIT, M.Phil.}

Post on 19-Oct-2014

Page 1: Statistics slides

Ritesh Singhal

WELCOME

To all PGDM Students

from

Ritesh Singhal {M.Sc.(Maths), MIT, M.Phil.}

Page 2: Statistics slides

Statistics

The systematic and scientific treatment of quantitative measurement is precisely known as statistics.

Statistics may be called the science of counting.

Statistics is concerned with the collection, classification (or organization), presentation and analysis of data that are measurable in numerical terms.

Page 3: Statistics slides

Stages of Statistical Investigation

Collection of Data

Organization of data

Presentation of data

Analysis

Interpretation of Results

Page 4: Statistics slides

Statistics

It is divided into two major parts: Descriptive and Inferential Statistics.

Descriptive statistics is a set of methods to describe the data that we have collected, i.e. summarization of data.

Inferential statistics is a set of methods used to make a generalization, estimate, prediction or decision. It is used when we want to draw conclusions about a distribution.

Page 5: Statistics slides

Collection of Data

Data can be collected in two ways:

>>> Primary Data Collection
It is the data collected by a particular person or organization for its own use.

>>> Secondary Data Collection
It is the data collected by some other person or organization, which the investigator also obtains for his own use.

Page 6: Statistics slides

Methods of Primary data collection

>>> Direct personal interview
>>> Data through questionnaire
>>> Indirect investigation
>>> Etc.

Page 7: Statistics slides

Methods of Secondary data collection

>>> Data collected through newspapers & periodicals.
>>> Data collected from research papers.
>>> Data collected from government officials.
>>> Data collected from various NGOs, UN, UNESCO, WHO, ILO, UNICEF etc.
>>> Other published resources

Page 8: Statistics slides

Classification of data

Classification is a process of arranging data into sequences and groups according to their common characteristics, or separating them into different but related parts.

It is a process of arranging data into various homogeneous classes and subclasses according to some common characteristics.

Page 9: Statistics slides

Presentation of Data

Data should be presented in such a manner that it can be easily understood and grasped, and conclusions can be drawn promptly from the data presented, e.g.

>>> Histogram
>>> Frequency polygon & curve
>>> Pie Chart
>>> Ogives
>>> Pictogram & Cartogram
>>> Bar Chart

Page 10: Statistics slides

Variables

>>> Discrete Variable
e.g. No. of books, tables, chairs

>>> Continuous Variable
e.g. Height, Weight

>>> Quantitative Variable
That can be measured on a scale

>>> Qualitative Variable
That cannot be measured on a scale

Page 11: Statistics slides

Frequency Distribution

The observations can be recorded in three ways:

1. Individual Series
Data recorded for each individual member.

2. Discrete Series
Here the variable can assume values only after an interval (or in jumps).

3. Continuous Series
Here the variable may take any value, integer or fraction.

Page 12: Statistics slides

Statistics functions & Uses

>>> It simplifies complex data
>>> It provides techniques for comparison
>>> It studies relationships
>>> It helps in formulating policies
>>> It helps in forecasting
>>> It is helpful for the common man
>>> Statistical methods merged with the speed of computers can work wonders: SPSS, STATA, MATLAB, MINITAB etc.

Page 13: Statistics slides

Scope of Statistics

>>> In Business Decision Making
>>> In Medical Sciences
>>> In Actuarial Science
>>> In Economic Planning
>>> In Agricultural Sciences
>>> In Banking & Insurance
>>> In Politics & Social Science

Page 14: Statistics slides

Distrust & Misuse of Statistics

Statistics is like clay, of which one can make a God or a Devil.

Statistics are liars of the first order.

Statistics can prove or disprove anything.

Page 15: Statistics slides

Measure of Central Tendency

It is a single value that represents the entire mass of data. Generally, such values lie in the central part of the distribution.

It facilitates comparison & decision-making.

There are mainly three types of measure:

1. Arithmetic mean
2. Median
3. Mode

Page 16: Statistics slides

Arithmetic Mean

This single representative value can be determined by:

A.M. = Sum of observations / No. of observations

Properties:

1. The sum of the deviations from the AM is always zero.

2. If every value of the variable is increased or decreased by a constant, then the new AM is also increased or decreased by the same constant.

Page 17: Statistics slides

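Arithmetic Mean (contd..)

3. If every value of the variable is multiplied or divided by a constant, then the new AM is also multiplied or divided by the same constant.

4. The sum of squares of deviations from the AM is minimum.

5. The combined AM of two or more related groups is defined as

Combined AM = (n1 X̄1 + n2 X̄2 + ...) / (n1 + n2 + ...)

where n1, n2, ... are the group sizes and X̄1, X̄2, ... are the group means.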
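
The properties of the arithmetic mean listed on these two slides can be checked numerically. The following is only a sketch: the data values and the two groups are made up for illustration, and the combined mean uses the usual size-weighted formula.

```python
from math import isclose
from statistics import mean

data = [4, 8, 15, 16, 23, 42]           # hypothetical observations
am = mean(data)                          # 108 / 6

# Property 1: deviations from the AM sum to zero.
deviation_sum = sum(x - am for x in data)

# Property 2: adding a constant shifts the AM by that constant.
shifted_am = mean([x + 10 for x in data])

# Property 3: multiplying by a constant scales the AM by that constant.
scaled_am = mean([3 * x for x in data])

# Property 4: the sum of squared deviations is smallest about the AM.
def sum_sq(center):
    return sum((x - center) ** 2 for x in data)

# Property 5: combined AM of two groups, weighted by group size.
group_a, group_b = [2, 4, 6], [10, 20]   # hypothetical groups
combined_am = (
    len(group_a) * mean(group_a) + len(group_b) * mean(group_b)
) / (len(group_a) + len(group_b))

print(am, deviation_sum, shifted_am, scaled_am, combined_am)
```

The combined value agrees with the mean of the pooled observations, which is what property 5 asserts.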

Page 18: Statistics slides

Median

The median is that value of the variable which divides the group into two equal parts, one part comprising all values greater than the median and the other all values less than it.

Determination of Median

>>> Arrange the data first (in ascending or descending order)
>>> Find the size of the (N+1)/2 th item.
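
The two steps above can be sketched in Python. The function name is hypothetical, and averaging the two middle items when (N+1)/2 is not a whole number is the usual convention, assumed here because the slide does not state it.

```python
from statistics import median

def median_by_position(values):
    """Median as the size of the (N+1)/2 th item of the arranged data."""
    ordered = sorted(values)              # step 1: arrange the data
    n = len(ordered)
    position = (n + 1) / 2                # step 2: 1-based position
    lower = ordered[int(position) - 1]
    if position.is_integer():
        return lower
    # Assumed convention for even N: average the two middle items.
    upper = ordered[int(position)]
    return (lower + upper) / 2

odd_case = median_by_position([7, 3, 9, 1, 5])    # 3rd item of 1,3,5,7,9
even_case = median_by_position([7, 3, 9, 1])      # between 2nd and 3rd item
print(odd_case, even_case)
```

Both results agree with `statistics.median` from the standard library.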

Page 19: Statistics slides

Mode

Mode is that value which occurs most often in the series.

It is the value around which the items tend to be most heavily concentrated.

It is an important average when we talk about the “most common size of shoe or shirt”.
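
As a small illustration of the "most common size" idea, the mode can be found by counting occurrences. The shoe sizes below are invented for the example.

```python
from collections import Counter

shoe_sizes = [7, 8, 8, 9, 8, 10, 9, 8, 7]    # hypothetical sales record
counts = Counter(shoe_sizes)

# most_common(1) returns the single (value, frequency) pair with the
# highest frequency.
mode_value, mode_count = counts.most_common(1)[0]
print(mode_value, mode_count)
```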

Page 20: Statistics slides

Relationship among Mean, Median & Mode

For a symmetric distribution: Mode = Median = Mean

The empirical relationship between mean, median and mode for a moderately asymmetric distribution is: Mode = 3 Median - 2 Mean

Page 21: Statistics slides

Skewness

Mode: Peak of the curve.

Median: Divides the area under the curve into two equal parts.

Mean: Center of gravity of the curve.

For a positively skewed distribution: Mean > Median > Mode

For a negatively skewed distribution: Mean < Median < Mode

Page 22: Statistics slides

Dispersion or Variation

The average alone does not enable us to draw a full picture of the distribution, so a further measure is necessary to describe it better.

The extent or degree to which data tend to spread around an average is called dispersion or variation.

Page 23: Statistics slides

Objectives

>>> For judging the reliability of averages
>>> Comparison of distributions
>>> Useful for controlling variability
>>> Useful in further analysis

Page 24: Statistics slides

Measure of Dispersion

>>> Range
>>> Inter quartile Range
>>> Mean Deviation
>>> Standard Deviation

Page 25: Statistics slides

Range

Range is the difference between the largest and the smallest observation.

Range = L - S, where L is the largest and S the smallest observation.

It is easy to calculate and quickly gives a rough picture of the variation in the data.

It is a crude measure and is not based on all the observations.

Page 26: Statistics slides

Correlation Analysis

Correlation denotes the degree of interdependence between variables or the tendency of simultaneous variation between variables.

Types of Correlation:

1. Positive & Negative
2. Linear & Non-linear
3. Multiple & Partial

Page 27: Statistics slides

Positive & Negative Correlation

Positive

>>> Income Vs Expenditure
>>> Agricultural Prod Vs Rainfall
>>> Sales Vs Advt Expd
>>> Cost of raw material Vs Cost of Industrial Prod

Negative

>>> Price Vs Consumption
>>> Day temp Vs Sale of Woolen clothes

Page 28: Statistics slides

Measure of Correlation

>>> Scatter Diagram Method
>>> Karl Pearson’s Coefficient of Correlation
>>> Spearman’s Coefficient of Rank Correlation
>>> Concurrent Deviation Method

Page 29: Statistics slides

Scatter Diagram Method

It is a graphical method to find the correlation between variables.

Here the pairs of observations are plotted on a 2-D space.

From the pattern formed by these points we can get an idea about the relationship between the variables.

Page 30: Statistics slides

Karl-Pearson’s coefficient of correlation (r)

The value of r lies between -1 and +1, i.e. -1 ≤ r ≤ +1

The coefficient of correlation is independent of change of origin and scale.

The coefficient ‘r’ is symmetric: rxy = ryx

The probable error of ‘r’ is used in interpreting its estimated value.
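
The first three properties can be checked with a short sketch. The slide does not give the formula, so the product-moment form r = Σ(x - x̄)(y - ȳ) / √[Σ(x - x̄)² Σ(y - ȳ)²] is assumed here, and the income and expenditure figures are invented.

```python
from math import isclose, sqrt

def pearson_r(xs, ys):
    """Product-moment correlation coefficient (assumed standard formula)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

income = [10, 20, 30, 40, 50]        # hypothetical data
spending = [8, 15, 22, 33, 38]

r = pearson_r(income, spending)
r_swapped = pearson_r(spending, income)            # symmetry: rxy = ryx
# Change of origin and (positive) scale on both variables.
r_rescaled = pearson_r([2 * x + 100 for x in income],
                       [0.5 * y - 3 for y in spending])
print(r, r_swapped, r_rescaled)
```

All three values coincide and lie between -1 and +1. Note that a negative scale factor on one variable would flip the sign of r, so the invariance holds up to sign in that case.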

Page 31: Statistics slides

Spearman’s Coefficient of Rank Correlation

Karl Pearson’s method measures the relationship between quantitative variables, whereas Spearman’s coefficient is suitable for qualitative variables that can be ranked, e.g. the ranks given to the participants in a contest by two judges, when we want to measure the relationship between the ranks given by these judges.
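
The two-judges situation can be sketched as follows. The slide gives no formula, so the standard no-ties form, 1 - 6Σd² / [n(n² - 1)] with d the difference between the two ranks of a participant, is assumed here, and the ranks are invented.

```python
from math import isclose

def spearman_rho(ranks_a, ranks_b):
    """Rank correlation for untied ranks (assumed standard formula)."""
    n = len(ranks_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

judge_1 = [1, 2, 3, 4, 5]      # hypothetical ranks for five participants
judge_2 = [2, 1, 4, 3, 5]

rho = spearman_rho(judge_1, judge_2)                     # sum of d^2 is 4
rho_agree = spearman_rho(judge_1, judge_1)               # identical ranking
rho_reverse = spearman_rho(judge_1, [5, 4, 3, 2, 1])     # opposite ranking
print(rho, rho_agree, rho_reverse)
```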

Page 32: Statistics slides

Concurrent Deviation Method

This is the simplest method in which only the direction of change is taken into consideration rather than magnitude of variation.

It gives a general idea about the correlation between variables quickly.

Page 33: Statistics slides

Regression Analysis

It is concerned with the formulation and determination of an algebraic expression for the relationship between variables.

For this purpose we use regression lines.

These regression lines are used for predicting the value of one variable from that of the other.

Page 34: Statistics slides

Regression Analysis contd..

Here the variable whose value is to be predicted is called the dependent (Explained) variable and the variable used for prediction is called the independent (Explanatory) variable.

This method was first introduced by Sir Francis Galton.

It helps in prediction & estimation.

Page 35: Statistics slides

Properties of Regression Lines & Coefficient

The regression line Y on X is used to estimate the best value of Y (Dep.) for a given value of X (Indep.).

The regression line X on Y is used to estimate the best value of X (Dep.) for a given value of Y (Indep.).

Both the regression coefficients are independent of change of origin, but not of change of scale.

Page 36: Statistics slides

Properties of Regression Lines & Coefficient (contd..)

The relation between r, byx and bxy is

r = ±√(byx . bxy)

where r takes the common sign of byx and bxy.

Both the regression coefficients must have the same sign.

Both the regression coefficients cannot be numerically greater than one simultaneously.

A regression coefficient denotes the rate of change, i.e. byx measures the change in Y for a unit change in X.
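
A numerical check of the relation between r, byx and bxy may help. This is a sketch on made-up data, assuming the usual least-squares coefficients byx = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and bxy = Σ(x - x̄)(y - ȳ) / Σ(y - ȳ)².

```python
from math import copysign, isclose, sqrt

xs = [1, 2, 3, 4, 5]          # hypothetical observations
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)

b_yx = sxy / sxx              # change in Y per unit change in X
b_xy = sxy / syy              # change in X per unit change in Y

# r is the square root of the product, carrying the coefficients' sign.
r = copysign(sqrt(b_yx * b_xy), b_yx)
r_direct = sxy / sqrt(sxx * syy)
print(b_yx, b_xy, r)
```

Because the product byx . bxy equals r², it can never exceed one, which is why both coefficients cannot be numerically greater than one at the same time.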

Page 37: Statistics slides

Properties of Regression Lines & Coefficient (contd..)

Both lines cut each other at (X̄, Ȳ), the point of means.

If r = 0, both lines are perpendicular to each other.

If the regression lines are identical, the correlation between the variables is perfect.

Page 38: Statistics slides

Standard Error of Estimate

It provides us a measure of the scatter of the observations about an average line. The standard error of estimate of Y on X is:

SY.X = √[ Σ(Y - Yest)² / N ]

where Yest is the value of Y estimated from the regression line and N is the number of observations.
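
The formula can be applied step by step as in the sketch below. The data are invented, and the regression line of Y on X is fitted by ordinary least squares, which is an assumption since the slide does not say how the line is obtained.

```python
from math import isclose, sqrt

xs = [1, 2, 3, 4, 5]          # hypothetical observations
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n

# Least-squares line of Y on X: Yest = a + b_yx * X (assumed fitting method).
b_yx = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
a = mean_y - b_yx * mean_x

y_est = [a + b_yx * x for x in xs]

# Standard error of estimate: root of the mean squared residual.
s_yx = sqrt(sum((y - ye) ** 2 for y, ye in zip(ys, y_est)) / n)
print(s_yx)
```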

Page 39: Statistics slides

Probability

Probability is a concept which numerically measures the degree of uncertainty or certainty of the occurrence of any event, i.e. the chance of occurrence of any event.

The probability of an event A is

P(A) = No. of favorable cases / Total no. of cases

Page 40: Statistics slides

Probability

>>> If P(A) = 0, Impossible Event
>>> If P(A) = 1, Sure Event
>>> 0 ≤ P(A) ≤ 1
>>> P(A) = Probability of occurrence
>>> P(Ā) = Probability of non-occurrence
>>> P(A) + P(Ā) = 1

Page 41: Statistics slides

Some Keywords

Equally Likely Events: When the chance of occurrence of all the events is the same in an experiment.

Mutually Exclusive Events: If the occurrence of any one of them prevents the occurrence of the others in the same experiment.

Sample Space: The set of all possible outcomes.

Page 42: Statistics slides

Some Keywords

Independent Events: If two or more events occur in such a way that the occurrence of one does not affect the occurrence of the other.

Dependent Events: If the occurrence of one event influences the occurrence of the other.

Page 43: Statistics slides

Classical or A Priori Probability

If a trial results in ‘n’ exhaustive, mutually exclusive and equally likely cases and ‘m’ of them are favorable to the happening of an event E, then the probability ‘P’ of the happening of E is given by:

P(E) = m / n

Page 44: Statistics slides

Empirical or A Posteriori Probability

The classical definition requires that ‘n’ is finite and that all cases are equally likely.

These conditions are very restrictive and cannot cover all situations.

The empirical definition does not require these conditions to hold.

Page 45: Statistics slides

Fundamental rule of counting

If an event can occur in ‘m’ ways and, following it, a second event can occur in ‘n’ ways, then the two events in succession can occur in ‘m x n’ ways.

E.g. A tricolor can be formed out of 6 colors in 6 x 5 x 4 = 120 ways.

The no. of words of 3 distinct characters out of the 26 letters of the alphabet is 26 x 25 x 24 = 15600.
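
Both counts can be confirmed either by multiplying the number of choices at each step or by listing every arrangement. The sketch below assumes, as the products imply, that no color or letter is repeated.

```python
from itertools import permutations
from math import prod
from string import ascii_lowercase

# Multiplication rule: choices for the 1st, 2nd and 3rd position.
tricolor_ways = prod([6, 5, 4])
word_ways = prod([26, 25, 24])

# Brute-force check: list every ordered selection of 3 distinct items.
tricolors_listed = sum(1 for _ in permutations(range(6), 3))
words_listed = sum(1 for _ in permutations(ascii_lowercase, 3))

print(tricolor_ways, word_ways)
```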

Page 46: Statistics slides

Permutations

The different arrangements that can be made out of a given no. of things by taking some or all at a time are called permutations.

P(n,r) = n! / (n-r)!

E.g. permutations made with the letters a, b, c by taking two at a time: P(3,2) = 6

ab, ba, ac, ca, bc, cb
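
The a, b, c example can be reproduced with the standard library; this sketch lists the arrangements and compares the count with the factorial formula.

```python
from itertools import permutations
from math import factorial

def p(n, r):
    """Number of permutations of n things taken r at a time."""
    return factorial(n) // factorial(n - r)

# Every ordered pair of distinct letters from a, b, c.
arrangements = ["".join(pair) for pair in permutations("abc", 2)]
print(p(3, 2), arrangements)
```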

Page 47: Statistics slides

Combinations

The combination of ‘n’ different objects taken ‘r’ at a time is a selection of ‘r’ out of ‘n’ objects with no attention given to the order of arrangement.

C(n,r) = n! / [r! (n-r)!]

E.g. From 5 boys & 6 girls, a group of 3 having 2 boys & 1 girl can be formed in C(5,2) x C(6,1) = 60 ways.
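
The boys-and-girls count can be verified with `math.comb` and by listing the groups. The labels B1..B5 and G1..G6 are placeholders introduced only for this sketch.

```python
from itertools import combinations
from math import comb

boys = [f"B{i}" for i in range(1, 6)]     # 5 hypothetical boys
girls = [f"G{i}" for i in range(1, 7)]    # 6 hypothetical girls

ways_by_formula = comb(5, 2) * comb(6, 1)

# Enumerate every group of 2 boys and 1 girl; order inside a group is ignored.
groups = [
    (pair, girl)
    for pair in combinations(boys, 2)
    for girl in girls
]
print(ways_by_formula, len(groups))
```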

Page 48: Statistics slides

Example

A coin is tossed three times. Find the probability of getting:

i) Exactly one head
ii) Exactly two heads
iii) One or two heads
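
One way to work this example is to list the 8 equally likely outcomes and count the favorable ones, as in the sketch below. It assumes a fair coin.

```python
from fractions import Fraction
from itertools import product

# Sample space of three tosses: HHH, HHT, ..., TTT (8 outcomes).
outcomes = list(product("HT", repeat=3))

def probability(event):
    """Favorable cases divided by total cases."""
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))

p_one_head = probability(lambda o: o.count("H") == 1)
p_two_heads = probability(lambda o: o.count("H") == 2)
p_one_or_two = probability(lambda o: o.count("H") in (1, 2))
print(p_one_head, p_two_heads, p_one_or_two)
```

Since "exactly one head" and "exactly two heads" are mutually exclusive, the third answer is the sum of the first two.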

Page 49: Statistics slides

Example

One card is randomly drawn from a pack of 52 cards. Find the probability that

i) Drawn card is red
ii) Drawn card is an ace
iii) Drawn card is red and king
iv) Drawn card is red or king
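
The four answers can be obtained by building the pack and counting. This is a sketch that assumes a standard pack with hearts and diamonds as the red suits.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suit_color = {"hearts": "red", "diamonds": "red",
              "clubs": "black", "spades": "black"}
deck = list(product(ranks, suit_color))        # 52 (rank, suit) pairs

def probability(event):
    favorable = sum(1 for card in deck if event(card))
    return Fraction(favorable, len(deck))

def is_red(card):
    return suit_color[card[1]] == "red"

def is_king(card):
    return card[0] == "K"

p_red = probability(is_red)
p_ace = probability(lambda card: card[0] == "A")
p_red_and_king = probability(lambda card: is_red(card) and is_king(card))
p_red_or_king = probability(lambda card: is_red(card) or is_king(card))
print(p_red, p_ace, p_red_and_king, p_red_or_king)
```

Part iv) also follows from P(red) + P(king) - P(red and king) = 26/52 + 4/52 - 2/52.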

Page 50: Statistics slides

Example

A bag contains 3 red, 6 white and 7 blue balls. Two balls are drawn at random. Find the probability that

i) Both the balls are white.
ii) Both the balls are blue.
iii) One ball is red & other is white.
iv) One ball is white & other is blue.
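
A sketch of the solution using combinations follows. It assumes the two balls are drawn together (without replacement), which the slide does not state explicitly.

```python
from fractions import Fraction
from math import comb

red, white, blue = 3, 6, 7
total_pairs = comb(red + white + blue, 2)      # C(16, 2) = 120 possible pairs

p_both_white = Fraction(comb(white, 2), total_pairs)
p_both_blue = Fraction(comb(blue, 2), total_pairs)
p_red_and_white = Fraction(red * white, total_pairs)
p_white_and_blue = Fraction(white * blue, total_pairs)
print(p_both_white, p_both_blue, p_red_and_white, p_white_and_blue)
```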

Page 51: Statistics slides

Addition Theorem

For any two events A and B, the probability of the occurrence of A or B is given by:

P(A∪B) = P(A) + P(B) - P(A∩B)

If A & B are mutually exclusive then P(A∩B) = 0 and

P(A∪B) = P(A) + P(B)

Page 52: Statistics slides

Multiplication or Conditional Probability

The probability of an event B when it is known that the event A has already occurred:

P(B/A) = P(A∩B) / P(A), if P(A) > 0

i.e. P(A∩B) = P(A).P(B/A)

If A and B are independent events:

P(A∩B) = P(A).P(B)

Page 53: Statistics slides

Example

A bag contains 25 balls numbered from 1 to 25. Two balls are drawn at random from the bag with replacement. Find the probability of selecting:

i) Both odd numbers.
ii) One odd & one even.
iii) At least one odd.
iv) No odd numbers.
v) Both even numbers.
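
Because the draws are made with replacement, they are independent and the multiplication rule applies directly. In the sketch below, "one odd & one even" is read as either order, which is an interpretation of the slide.

```python
from fractions import Fraction

numbers = range(1, 26)
odd_count = sum(1 for n in numbers if n % 2 == 1)      # 13 odd numbers
even_count = len(numbers) - odd_count                  # 12 even numbers

p_odd = Fraction(odd_count, len(numbers))
p_even = Fraction(even_count, len(numbers))

p_both_odd = p_odd * p_odd
p_one_odd_one_even = 2 * p_odd * p_even    # odd-even or even-odd
p_no_odd = p_even * p_even
p_at_least_one_odd = 1 - p_no_odd          # complement of "no odd"
p_both_even = p_even * p_even              # same event as "no odd"
print(p_both_odd, p_one_odd_one_even, p_at_least_one_odd, p_no_odd)
```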

Page 54: Statistics slides

Example

Five men in a company of 20 are graduates. If 3 men are picked at random, what is the probability that they are all graduates? What is the probability that at least one is a graduate?
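
A possible working of this example counts selections with combinations, assuming the 3 men are picked without replacement and every group of 3 is equally likely.

```python
from fractions import Fraction
from math import comb

graduates, others = 5, 15
total_groups = comb(graduates + others, 3)         # C(20, 3) = 1140

p_all_graduates = Fraction(comb(graduates, 3), total_groups)

# "At least one graduate" is the complement of "no graduate at all".
p_no_graduate = Fraction(comb(others, 3), total_groups)
p_at_least_one = 1 - p_no_graduate
print(p_all_graduates, p_at_least_one)
```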

Page 55: Statistics slides

Example

The probability that A hits a target is 1/3 and the probability that B hits the target is 2/5. What is the probability that the target will be hit, if each one of A and B shoots at the target?
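
This example can be worked in two ways that should agree, provided the two shots are independent (an assumption the slide leaves implicit): through the complement "both miss", or through the addition theorem.

```python
from fractions import Fraction

p_a = Fraction(1, 3)      # A hits
p_b = Fraction(2, 5)      # B hits

# Complement route: the target is missed only if both miss.
p_hit_by_complement = 1 - (1 - p_a) * (1 - p_b)

# Addition theorem, with P(A and B) = P(A).P(B) under independence.
p_hit_by_addition = p_a + p_b - p_a * p_b
print(p_hit_by_complement, p_hit_by_addition)
```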

Page 56: Statistics slides

Expected Value of Probability

Let X be the random variable with the following distribution:

X    : x1     x2     x3    ...
P(X) : P(x1)  P(x2)  P(x3) ...

Expected Value is given by: E(X) = Σ xi . P(xi)

Page 57: Statistics slides

Example

A player tosses two coins. If two heads show he wins Rs. 4, if one head shows he wins Rs. 2, but if two tails show he pays Rs. 3 as penalty. Calculate the expected value of the game to him.

Solution: E(X) = (-3)(1/4) + (2)(1/2) + (4)(1/4) = 1.25
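
The solution can be reproduced by listing the four equally likely outcomes of two fair coins and weighting each payoff by its probability, as sketched here.

```python
from fractions import Fraction
from itertools import product

# Payoff in Rs. keyed by the number of heads shown.
payoff_by_heads = {2: 4, 1: 2, 0: -3}

outcomes = list(product("HT", repeat=2))       # HH, HT, TH, TT
p_each = Fraction(1, len(outcomes))            # fair coins assumed

expected_value = sum(
    payoff_by_heads[outcome.count("H")] * p_each for outcome in outcomes
)
print(float(expected_value))
```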

Page 58: Statistics slides

Example

An insurance company sells a particular life insurance policy with a face value of Rs. 1000 and a yearly premium of Rs. 20. If 0.2% of the policy holders can be expected to die in the course of a year, what would be the company’s expected earning per policy holder per year?

E(X) = (-980)(0.002) + (20)(0.998) = 18
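
The same expectation can be computed from the two possible outcomes for the company. In this sketch the figure -980 is taken to be the premium received minus the face value paid out.

```python
from fractions import Fraction

face_value = 1000
premium = 20
p_death = Fraction(2, 1000)                    # 0.2% of policy holders

# Company's earning in each case: it always collects the premium and
# pays the face value only if the policy holder dies.
earning_if_death = premium - face_value        # -980
earning_if_survival = premium                  # 20

expected_earning = (
    earning_if_death * p_death + earning_if_survival * (1 - p_death)
)
print(expected_earning)
```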

Page 59: Statistics slides

Theoretical Probability Distribution