bridge course


Upload: anandkasirajankak

Post on 20-Nov-2014


Page 1: Bridge Course

An Introduction to Statistics

Statistics:

The subject concerned with scientific method for collecting, summarising, presenting, and

analysing data as well as drawing conclusions or making predictions on the basis of such

analysis.

Descriptive statistics:

The branch of statistics, which seeks only to describe and analyse any data is called

descriptive statistics.

Inferential statistics:

The branch of statistics dealing with drawing conclusions about the population with the help

of the analysis of a sample, drawn from it, is known as inferential statistics.

Classification and tabulation:

Classification is the first step in tabulation. Classification implies bringing together the items

which are similar in some respect(s).

Example: students of a class may be grouped together with respect to the marks they obtained in an examination, their age, their area of specialisation, etc.

After classification, tabulation is done to condense the data in a compact form which can be

easily comprehended.

Diagrammatic / Graphical presentation:

There are several diagrams/graphs used for presentation of data.

Bar chart

Pareto chart

Pie chart

Histogram

Ogive

Line graph

Lorenz curve.


(i) Bar chart:

It comprises a series of bars of equal width, the base of each bar being equal to the width of the class interval of the grouped data. The bars stand on a common base line, and the height of each bar is proportional to the frequency of its interval.

The following data give the distribution of 215 MBA students at a management

institute according to educational qualifications.

Educational Qualification    No. of students
B.Tech                       55
B.Com                        70
B.Sc                         25
B.A                          45
C.A                          20

(a) Sub divided bar chart:

A subdivided bar chart is a bar chart wherein each bar is divided into further

components.

Suppose that, in the above example, information about the cities from which the students graduated is also available, as given below.


Educational Qualification    Metro    Large    Medium    No. of students
B.Tech                       15       25       15        55
B.Com                        35       20       15        70
B.Sc                         10       10       5         25
B.A                          15       10       20        45
C.A                          10       5        5         20

(b) Percentage bar chart:

Percentage bar chart is one in which each bar is divided into components

which are expressed as percentage of the total bar.

Automaker        Average Sales      Average Net Profit    Percentage of profit to sales
                 Estimates (i)      Estimates (ii)        (iii) = (ii)/(i) × 100
Tata Motors      6848.8             466                   7.2
Hero Honda       2196.5             224.2                 10
Bajaj Auto       2444.7             345.4                 14
TVS Motor        1032.9             35.1                  3.4
Bharat Forge     461.6              63.4                  14
Ashok Leyland    1635.8             94.7                  5.8
M&M              2365.5             200.6                 8.5
Maruti Udyog     3426.5             315.7                 9.2


(c) Multiple bar chart:

A multiple bar chart is one in which two or more bars are placed together for each entity.

The bars are placed together to give comparative assessment of values of

some parameter over two periods of time or two different locations etc.

Pain Killer 2005 2006

Voveran 16.5 23.2

Calpol 13.2 18.2

Nise 15.2 18.6

Combiflam 9.4 14.1

Dolonex 6.8 10.3

Sumo 5.1 7.4

Volini 6.9 9.6

Moov 3.8 4.9

Nimulid 3.5 4.9


Another example…

Name                           Net worth in $ Billion    Net worth in $ Billion
                               March 06                  March 07
Lakshmi Mittal                 20                        32
Mukesh Ambani                  7                         20.1
Anil Ambani                    5.5                       18.2
Azim Premji                    11                        17.1
Kushal Pal Singh               5                         10
Sunil Mittal & Family          4.9                       9.5
Kumar Mangalam Birla           4.4                       8
Shashi & Ravi Ruia             2.7                       8
Pallonji Mistry                3.3                       5.6
Adi Godrej & Family            2.3                       4.1
Shiv Nadar                     3                         4
Anil Agarwal                   2.1                       3.8
Dilip Shanghvi                 2                         3.1
Tulsi Tanti                    2.4                       3.7
Malvinder & Shivinder Singh    2                         1.55
Venugopal Dhoot                1.6                       1.6
Naresh Goyal                   1.3                       1.9
Rahul Bajaj                    1.1                       1.5


(ii) Pareto chart:

This specialist bar chart, named after the famous Italian economist, is used to

classify a variable into groups or intervals from largest to smallest frequency.

It facilitates identification of the most frequent occurrences or causes of an event or phenomenon. It can be used to sort data by any criterion, such as geographical regions, organisations like management institutes or banks, countries, cities, etc.

Academic Background       Frequency
Commerce                  18
Economics                 6
Engineering               17
Information Technology    7
Science                   8
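As a rough illustration of how the table above would be arranged for a Pareto chart, the following Python sketch sorts the categories from largest to smallest frequency and adds a cumulative percentage column. The function name `pareto_table` and the cumulative percentage column are assumptions made for illustration; they are not part of the original notes.

```python
# Frequencies taken from the academic-background table above.
frequencies = {
    "Commerce": 18,
    "Economics": 6,
    "Engineering": 17,
    "Information Technology": 7,
    "Science": 8,
}


def pareto_table(freqs):
    """Return (category, frequency, cumulative %) rows, largest frequency first."""
    total = sum(freqs.values())
    rows = []
    running = 0
    ordered = sorted(freqs.items(), key=lambda item: item[1], reverse=True)
    for category, count in ordered:
        running += count
        rows.append((category, count, round(100 * running / total, 1)))
    return rows


pareto_rows = pareto_table(frequencies)
for category, count, cumulative in pareto_rows:
    print(f"{category:<24}{count:>4}{cumulative:>8}")
```

With this ordering, Commerce and Engineering come first and together account for 62.5% of the 56 students.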


(iii) Pie chart:

It is one of the most popular charts for presenting the whole into parts. It is a

circular chart divided into sectors representing relative magnitude of various

components.

A pie chart is obtained by dividing a circle into sectors such that these sectors

have areas or centre angles proportional to different components given in the

data.

Sources of Funds       Percentage of Total    Uses of Funds                          Percentage of Total
Excise                 17                     Central Plan                           20
Customs                12                     Non-plan Assistance and Expenditure    23
Corporate Tax          21                     Defence                                12
Income Tax             13                     Interest Payments                      20
Service Tax            7                      States' Share                          18
Borrowings & Others    30                     Subsidies                              7
Total                  100                    Total                                  100

Sources of Funds       Percentage of Total    Size of Segment (Degrees)
Excise                 17                     61.2
Customs                12                     43.2
Corporate Tax          21                     75.6
Income Tax             13                     46.8
Service Tax            7                      25.2
Borrowings & Others    30                     108
Total                  100                    360


Uses of Funds                          Percentage of Total    Size of Segment (Degrees)
Central Plan                           20                     72
Non-plan Assistance and Expenditure    23                     82.8
Defence                                12                     43.2
Interest Payments                      20                     72
States' Share                          18                     64.8
Subsidies                              7                      25.2
Total                                  100                    360
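The segment sizes in the pie chart tables follow from giving each component the same share of 360 degrees as it has of the total. A minimal Python sketch of that conversion, using the uses-of-funds percentages, is given below; the function name `sector_angles` is an illustrative assumption.

```python
def sector_angles(percentages):
    """Map each component's share of the total to its central angle in degrees."""
    total = sum(percentages.values())
    return {name: round(360 * share / total, 1) for name, share in percentages.items()}


# Percentages taken from the uses-of-funds table.
uses_of_funds = {
    "Central Plan": 20,
    "Non-plan Assistance and Expenditure": 23,
    "Defence": 12,
    "Interest Payments": 20,
    "States' Share": 18,
    "Subsidies": 7,
}

angles = sector_angles(uses_of_funds)
for name, angle in angles.items():
    print(f"{name:<38}{angle:>6}")
```

The same function applied to the sources-of-funds percentages gives the segment sizes in the earlier table.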

(iv) Histogram / Frequency polygon:

A histogram consists of vertical rectangles whose bases are proportional to the class intervals and whose heights are proportional to the frequencies of the intervals.

The polygon formed by joining the midpoints of the tops of the rectangles of the histogram is called a frequency polygon.

(v) Line graphs:

A line graph is a visual presentation of a set of data values joined by straight

lines.

Bank                     Business Per Employee 2005-06    Business Per Employee 2001-02
Allahabad Bank           336                              153
Andhra Bank              426.75                           195.96
Bank of Baroda           396                              222.76
Bank of India            381                              218.74
Bank of Maharashtra      306.18                           191.44
Canara Bank              441.57                           214.88
Central Bank of India    240.46                           148.77
Corporation Bank         527                              290.44
Dena Bank                364                              221
Indian Bank              295                              156
Indian Overseas Bank     354.73                           175.41

(vi) Lorenz curve:

A Lorenz curve indicates the extent of inequality in the distribution of a financial parameter such as income.


Descriptive statistics

The branch of statistics, which seeks only to describe and analyse any data is called

descriptive statistics.

Measures of central tendency:

1) Arithmetic mean

2) Median

3) Mode

4) Geometric mean

5) Harmonic mean

Arithmetic mean:

An average is a single value within the range of the data that is used to represent all of the values

in the series.

“The arithmetic mean is the quotient of the sum of the given values and the number of given values.”

Arithmetic mean: Problems for Practice

1) Find the arithmetic mean of the marks obtained by 10 students of class X in mathematics in

a certain examination. The marks obtained are

25,30,21,55,47,10,15,17,45,35

Ans=30.

2) Find the Arithmetic Mean from the following frequency table:

Marks:              52  58  60  65  68  70  75
No. of students:     7   5   4   6   3   3   2

Ans= 61.6

3) The following table gives the distribution of 100 accidents in New Delhi during seven days of a week of a given month. During that month there were 5 Mondays, 5 Tuesdays and 5 Wednesdays, and only four each of the other days. Calculate the number of accidents per day.

Day:                 Sunday  Monday  Tuesday  Wednesday  Thursday  Friday  Saturday
No. of accidents:    26      16      12       10         8         10      18

Ans= 14.13

4) The data on number of patients attending a hospital in a month are given below. Find the

average number of patients attending the hospital in a day.

Number of patients:                       0-10  10-20  20-30  30-40  40-50  50-60
Number of days attending the hospital:    2     6      9      7      4      2

Ans=28.67

5) Ten coins were tossed together and the number of tails resulting from them was observed.

The operation was performed 1050 times and the frequencies thus obtained for different

number of tails (x) are shown in the following table. Calculate the arithmetic mean by the

shortcut method.

X: 0 1 2 3 4 5 6 7 8 9 10

Y: 2 8 43 133 207 260 213 120 54 9 1

Ans=5.0114

6) For the following frequency table, find the mean.

Class: 100-120 120-140 140-160 160-180 180-200 200-220 220-240

Frequency 10 8 4 4 3 1 2

Ans=145.625

7) In a study on patients, the following data were obtained. Find the arithmetic mean.


Age (in years):    10-19  20-29  30-39  40-49  50-59  60-69  70-79  80-89
No. of cases:      1      0      1      10     17     38     9      3

Ans=60.7

8) Find the value of p for the following distribution, whose mean is 16.6.

F:  12  16  20  24  16  8   4
X:  8   12  15  p   20  25  30

Ans: p=18

9) The mean height of 25 male workers in a factory is 61 inches and the mean height of 35

female workers in the same factory is 58 inches. Find the combined mean height of 60

workers in a factory. Ans=59.25

10) A firm of readymade garments make both men’s and women’s shirts. Its profit average is 6%

of sales. Its profits in men’s shirts average 8% of sales; and women’s shirts comprise 60% of

output. What is the average profit per sales rupee in women’s shirts? Ans= 4.67

11) The average score of girls in the class X examination in a school is 67 and that of boys is 63. The average score for the whole class is 64.5. Find the percentage of girls and boys in the class.

Ans: girls=37.5, boys=62.5

12) There are 50 students in a class of which 40 are boys and rest girls. The average weight of

the class is 44 kg and the average weight of the girls is 40 kg. Find the average weight of the

boys. Ans=45

13) The mean annual salary of all employees in a company is Rs. 25,000. The mean salary of

male and female employees is Rs. 27,000 and Rs. 17,000 respectively. Find the percentage of

males and females employed by the company. Ans: males=80 and females=20.

14) The mean marks of 100 students were found to be 40. Later on it was discovered that a score of 53 was misread as 83. Find the correct mean corresponding to the correct score.

Ans=39.7

15) The mean of 100 observations is found to be 40. If at the time of computation two items were wrongly taken as 30 and 27 instead of 3 and 72, find the correct mean. Ans=40.18

Median


Problems for Practice

1) The number of runs scored by 11 players of a cricket team of a school are

5 19 42 11 50 30 21 0 52 36 27. Find the median.

Ans=27 runs.

2) Find the median of the following items:

6 10 4 3 9 11 22 18

Ans=9.5

3) The following table represents the marks obtained by a batch of 12 students in certain class tests in Statistics and Physics. Find the median marks in each subject.

Sr. no.:               1   2   3   4   5   6   7   8   9   10  11  12
Marks (Statistics):    53  54  32  30  60  46  28  25  48  72  33  65
Marks (Physics):       55  41  48  49  27  25  23  20  28  60  43  67

Ans: Statistics=47, Physics=42.

4) Calculate median for the following data:

No. of students:    6   4   16  7   8   2
Marks:              20  9   25  50  40  80

Ans= 25

5) Find the median of the following frequency distribution:

X: 5 7 9 12 14 17 19 21

Y: 6 5 3 6 5 3 2 4

Ans=12

6) The following table gives the weekly expenditure of 100 families. Find the median weekly

expenditure.


Weekly Expenditure:    0-10  10-20  20-30  30-40  40-50
Number of Families:    14    23     27     21     15

Ans=24.815

7) Calculate the mean and median for the following data:

Height (in cm)    No. of boys    Height (in cm)    No. of boys
135-140           4              155-160           24
140-145           9              160-165           10
145-150           18             165-170           5
150-155           28             170-175           2

Ans: mean=153.45, median=153.39

8) Calculate the median from the following data.

Weight (gms) No of apples Weight (gms) No of apples

410-419 14 450-459 45

420-429 20 460-469 18

430-439 42 470-479 7

440-449 54

Ans=443.94

9) Calculate the median:

Marks No of students Marks No of students

Less than 5 29 Less than 30 644

Less than 10 224 Less than 35 650

Less than 15 465 Less than 40 653

Less than 20 582 Less than 45 655

Less than 25 634

Ans=12.15
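The median problems above use two procedures: picking the middle item of sorted raw data, and interpolating within the median class of a grouped distribution. The Python sketch below shows both under the usual textbook formula, median = L + ((N/2 - c) / f) × h, where L is the lower boundary of the median class, c the cumulative frequency before it, f its frequency and h the class width. The function names are illustrative assumptions, and equal class widths are assumed.

```python
def median_raw(values):
    """Middle value of the sorted data (mean of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def median_grouped(lower_boundaries, width, freqs):
    """Interpolated median for grouped data with equal class widths."""
    half = sum(freqs) / 2
    cumulative = 0
    for lower, f in zip(lower_boundaries, freqs):
        if cumulative + f >= half:
            return lower + (half - cumulative) / f * width
        cumulative += f
    raise ValueError("no median class found")


# Problem 1 (11 scores) and problem 2 (8 items).
m1 = median_raw([5, 19, 42, 11, 50, 30, 21, 0, 52, 36, 27])
m2 = median_raw([6, 10, 4, 3, 9, 11, 22, 18])

# Problem 6: classes 0-10, 10-20, ..., 40-50.
m6 = median_grouped([0, 10, 20, 30, 40], 10, [14, 23, 27, 21, 15])

# Problem 8: inclusive classes 410-419, ... are converted to boundaries 409.5, 419.5, ...
m8 = median_grouped([409.5, 419.5, 429.5, 439.5, 449.5, 459.5, 469.5], 10,
                    [14, 20, 42, 54, 45, 18, 7])

print(m1, m2, round(m6, 3), round(m8, 2))
```

For a discrete frequency table such as problems 4 and 5, the median is simply the value at which the cumulative frequency first reaches the middle position.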

Mode


Problems for practice

1) A shoe shop in Delhi sold 100 pairs of shoes of a particular brand on a certain day, with the following distribution. Find the mode of the distribution.

Size of Shoes 4 5 6 7 8 9 10

No of pairs: 10 15 20 35 16 3 1

Ans=7

2) Find the mode for the following data:

Marks: 1-5 6-10 11-15 16-20 21-25

No of Students: 7 10 16 32 24

Ans=19.33

3) Calculate the median and mode for the following distribution:

Production per day (in tons):    21-22  23-24  25-26  27-28  29-30
No. of days:                     7      13     22     10     8

Ans: median=25.41, mode=25.36

4) Calculate AM, median and mode from the following frequency distribution.

Variable Frequency Variable Frequency

10-13 8 25-28 54

13-16 15 28-31 36

16-19 27 31-34 18

19-22 51 34-37 9

22-25 75 37-40 7

(Mean =24.19, median=23.96, mode=23.6)
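For grouped data the mode is usually estimated by interpolating within the class of highest frequency: mode = L + ((f1 - f0) / (2 f1 - f0 - f2)) × h, where f1 is the modal class frequency, f0 and f2 are the frequencies of the classes before and after it, L is the lower boundary of the modal class and h the class width. The Python sketch below applies this to problem 4 and also picks the mode of the discrete table in problem 1. The function name `mode_grouped` is an illustrative assumption, and a single modal class with equal class widths is assumed.

```python
def mode_grouped(lower_boundaries, width, freqs):
    """Interpolated mode for grouped data with one clear modal class."""
    i = max(range(len(freqs)), key=lambda k: freqs[k])  # index of the modal class
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0
    return lower_boundaries[i] + (f1 - f0) / (2 * f1 - f0 - f2) * width


# Problem 1: discrete data, so the mode is the size sold most often.
sizes = [4, 5, 6, 7, 8, 9, 10]
pairs = [10, 15, 20, 35, 16, 3, 1]
mode_shoe_size = sizes[pairs.index(max(pairs))]

# Problem 4: classes 10-13, 13-16, ..., 37-40.
lower = [10, 13, 16, 19, 22, 25, 28, 31, 34, 37]
freqs = [8, 15, 27, 51, 75, 54, 36, 18, 9, 7]
mode_p4 = mode_grouped(lower, 3, freqs)

print(mode_shoe_size, round(mode_p4, 2))
```

The results, 7 and 23.6, agree with the stated answers.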

Measures of dispersion


The degree to which numerical data tend to spread about an average value is called the variation or

dispersion of the data.

Significance of measuring variation:

To determine the reliability of an average.

To serve as a basis for the control of the variability.

To compare two or more series with regard to their variability.

To facilitate the use of other statistical measures.

Methods of studying variation:

The range

The quartile deviation

The mean deviation

The standard deviation.

Range

1) The following are the prices of shares of AB Co Ltd from Monday to Saturday. Calculate

range and its coefficient.

Day Price Day price

Monday 200 Thursday 160

Tuesday 210 Friday 220

Wednesday 208 Saturday 250

Ans: range=90 and coefficient of range=0.22

2) Calculate the coefficient of range from the following:

Marks No of students Marks No of students

10-20 8 40-50 8

20-30 10 50-60 4

30-40 12

Ans=0.714
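The range is the difference between the largest and smallest values, and its coefficient divides that difference by their sum. A short Python sketch for the two problems above follows; the function name is an illustrative assumption, and for the grouped data in problem 2 the extreme class limits (10 and 60) are used as the smallest and largest values.

```python
def range_and_coefficient(smallest, largest):
    """Return (range, coefficient of range) for the given extreme values."""
    spread = largest - smallest
    return spread, spread / (largest + smallest)


# Problem 1: share prices from Monday to Saturday.
prices = [200, 210, 208, 160, 220, 250]
range_p1, coeff_p1 = range_and_coefficient(min(prices), max(prices))

# Problem 2: lowest class limit 10, highest class limit 60.
range_p2, coeff_p2 = range_and_coefficient(10, 60)

print(range_p1, round(coeff_p1, 2), round(coeff_p2, 3))
```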

The quartile deviation

1) Find out the value of quartile deviation and its coefficient from the following data:


Marks:              10  20  30  40  50  60
No. of students:    4   7   15  8   7   2

Ans: QD=10 and coeff=0.333

2) Calculate quartile deviation and its coefficient from the following data:

Wages in Rs per week:    Less than 35  35-37  38-40  41-43  Over 43
No. of wage earners:     14            62     99     18     7

Ans: QD=1.67 and coeff=0.044
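The quartile deviation is half the distance between the third and first quartiles, QD = (Q3 - Q1)/2, and its coefficient is (Q3 - Q1)/(Q3 + Q1). The Python sketch below works problem 1, a discrete frequency table, by locating the k(N+1)/4-th item for k = 1 and k = 3. That positional rule and the function name `quartile_discrete` are assumptions made for illustration; other conventions for locating quartiles exist.

```python
def quartile_discrete(values, freqs, k):
    """k-th quartile of a discrete table: the value holding the k(N+1)/4-th item.

    `values` must be in ascending order.
    """
    position = k * (sum(freqs) + 1) / 4
    cumulative = 0
    for value, f in zip(values, freqs):
        cumulative += f
        if cumulative >= position:
            return value
    raise ValueError("quartile position lies beyond the data")


# Problem 1: marks and number of students.
marks = [10, 20, 30, 40, 50, 60]
students = [4, 7, 15, 8, 7, 2]

q1 = quartile_discrete(marks, students, 1)
q3 = quartile_discrete(marks, students, 3)
quartile_deviation = (q3 - q1) / 2
qd_coefficient = (q3 - q1) / (q3 + q1)

print(q1, q3, quartile_deviation, round(qd_coefficient, 3))
```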

Mean deviation

1) Calculate the mean deviation and its coefficient of the two income groups of five and seven

members.

1st group 4000 4200 4400 4600 4800

2nd group 3000 4000 4200 4400 4600 4800 5800

Ans: 1st: MD=240 coeff=0.054 & 2nd: MD=571.43, coeff=0.130

2) Calculate the mean deviation:

X 10 11 12 13 14

F 3 12 18 12 3

Ans=0.75

3) Calculate mean deviation and its coefficient.

Class frequency Class Frequency

0-10 5 40-50 20

10-20 8 50-60 14

20-30 12 60-70 12

30-40 15 70-80 6

Ans: MD=15.37 & coeff=0.357
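The mean deviation is the average of the absolute deviations of the values from a central value, and its coefficient divides that average by the same central value. The Python sketch below takes deviations about the arithmetic mean and reproduces the answers to problems 1 and 2; the function name is an illustrative assumption. The stated answer to problem 3 corresponds to deviations taken about the median (43), which this sketch does not cover.

```python
def mean_deviation(values, freqs=None):
    """Return (mean deviation about the mean, coefficient of mean deviation)."""
    if freqs is None:
        freqs = [1] * len(values)  # raw data: every value occurs once
    n = sum(freqs)
    centre = sum(v * f for v, f in zip(values, freqs)) / n
    md = sum(f * abs(v - centre) for v, f in zip(values, freqs)) / n
    return md, md / centre


# Problem 1: the two income groups.
md_g1, coeff_g1 = mean_deviation([4000, 4200, 4400, 4600, 4800])
md_g2, coeff_g2 = mean_deviation([3000, 4000, 4200, 4400, 4600, 4800, 5800])

# Problem 2: values with frequencies.
md_p2, _ = mean_deviation([10, 11, 12, 13, 14], [3, 12, 18, 12, 3])

print(md_g1, round(md_g2, 2), round(coeff_g2, 3), md_p2)
```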

Standard deviation


1) Blood serum cholesterol levels of 10 persons are as under

240,260,290,245,255,288,272,263,277,251.

Calculate standard deviation.

2) The annual salaries of a group of employees are given in the following table.

Salaries (in Rs 000):    45  50  55  60  65  70  75  80
Number of persons:       3   5   8   7   9   7   4   7

Calculate SD of the salaries. Ans =10.35

3) Calculate mean and SD of the following frequency distribution of marks:

Marks No of students Marks No of students

0-10 5 40-50 50

10-20 12 50-60 37

20-30 30 60-70 21

30-40 45

Ans : mean=40.9 & SD=14.839
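The stated answers to problems 2 and 3 are consistent with the standard deviation computed with divisor N, that is, the square root of the mean squared deviation from the arithmetic mean. A minimal Python sketch under that assumption is given below; the function name `mean_and_sd` is illustrative, and the grouped data of problem 3 are represented by class midpoints.

```python
import math


def mean_and_sd(values, freqs):
    """Return (mean, standard deviation) of a frequency table, using divisor N."""
    n = sum(freqs)
    mean = sum(v * f for v, f in zip(values, freqs)) / n
    variance = sum(f * (v - mean) ** 2 for v, f in zip(values, freqs)) / n
    return mean, math.sqrt(variance)


# Problem 2: salaries (in Rs 000) and number of persons.
mean_p2, sd_p2 = mean_and_sd([45, 50, 55, 60, 65, 70, 75, 80],
                             [3, 5, 8, 7, 9, 7, 4, 7])

# Problem 3: midpoints of the classes 0-10, 10-20, ..., 60-70.
mean_p3, sd_p3 = mean_and_sd([5, 15, 25, 35, 45, 55, 65],
                             [5, 12, 30, 45, 50, 37, 21])

print(round(sd_p2, 2), round(mean_p3, 1), round(sd_p3, 3))
```

For the raw data of problem 1, the same function can be called with a frequency of 1 for each observation.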

Coefficient of variation

1) From the prices of shares of X and Y below, find out which is more stable in value:

X 35 54 52 53 56 58 52 50 51 49

Y 108 107 105 105 106 107 104 103 104 101

Ans: CV of X=11.6 & CV of Y=1.905

2) Two brands of tyres are tested with the following results:

Life (in ‘000 miles)    No. of tyres, brand X    No. of tyres, brand Y
20-25                   1                        0
25-30                   22                       24
30-35                   64                       76
35-40                   10                       0
40-45                   3                        0


a) Which brand of tyres has the greater life?

b) Compare the variability and state which brand of tyres you would use on your fleet of trucks.
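The coefficient of variation expresses the standard deviation as a percentage of the mean, CV = 100 × SD / mean, so that series with different average levels can be compared. The Python sketch below checks problem 1 using the standard library `statistics` module, with `pstdev` (divisor N) assumed to be the intended standard deviation; the function name is an illustrative assumption.

```python
import statistics


def coefficient_of_variation(values):
    """Standard deviation (divisor N) as a percentage of the arithmetic mean."""
    return 100 * statistics.pstdev(values) / statistics.mean(values)


# Problem 1: share prices of X and Y.
prices_x = [35, 54, 52, 53, 56, 58, 52, 50, 51, 49]
prices_y = [108, 107, 105, 105, 106, 107, 104, 103, 104, 101]

cv_x = coefficient_of_variation(prices_x)
cv_y = coefficient_of_variation(prices_y)

print(round(cv_x, 1), round(cv_y, 3))
```

The series with the smaller coefficient of variation, here Y, is the less variable of the two.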

*********************************************************************************


Probability

1. A can solve 80% of the problems, while B can solve 90% of problems in a Statistics book. A problem is selected at random. What is the probability that at least one of them will solve it?

2. In a box, there are 2 white and 4 black balls. What is the probability that both of the two balls drawn, one after the other, are white?

3. In families with two children, what is the probability that a family will have
i. One boy and one girl?
ii. Two girls?
iii. Two boys?

In the absence of any other information, it is assumed that the probability of a child being a boy or a girl is ½.

4. A speaks the truth in 60% and B in 75% of the cases. In what percentage of the cases are they likely to contradict each other when stating the same fact?

5. An investment consultant predicts that the odds against the price of a certain stock going up are 2:1, and odds in favour of the price remaining the same are 1:3. What is the probability that the stock will go down?

6. The probability that A can solve a problem in Statistics is ½, that B can solve it is 1/3, and that C can solve it is 1/5. If all of them try independently, find the probability that the problem will be solved.

7. A salesman is known to sell a product in 3 out of 5 attempts, while another salesman sells it in 2 out of 3 attempts. Find the probability that
i. No sale will take place when they both try to sell the product.
ii. Either of them will succeed in selling the product.

8. An investment analyst presents the following table giving probabilities of next year’s economic conditions being normal, good or very good in the country, and probabilities of the movement increase or decline.

9. A class consists of 100 students; 25 of them are girls and 75 boys; 80 of them are rich and 20 are poor; 40 of them have brown eyes and 60 have black eyes. What is the probability of selecting a brown-eyed rich girl?

10. A candidate is selected for interviews for 3 posts. For the first post there are 3 candidates, for the second there are 4, and for the third post there are 2 candidates. What is the probability that the candidate is selected for at least one post?

11. Three machines producing 40%, 35% and 25% of the total output are known to produce defective items in the proportions 0.04, 0.06 and 0.03, respectively. On a particular day, a unit of output is selected at random and is found to be defective. What is the probability that it was produced by the second machine?

12. In a basin area where oil is likely to be found underneath the surface, there are three locations with three different types of earth composition, say C1, C2 and C3.