Statistics for Economics for Class 11, N. M. Shah

PREFACE TO THE THIRD REVISED EDITION

Periodic updating and review accommodate new knowledge and add freshness while allowing for continuity. This third revised edition is an effort to that end.

Statistics for Economics for Class XI, in its revised format, reflects the changes made by the Central Board of Secondary Education, New Delhi, in 2005-06. The revision includes sufficient exercises, keeping in mind the learning tools of Statistics in the context of the study of Economics.

This volume incorporates extensive and colourful diagrams and illustrations to make the concepts of Statistics easier and friendlier to understand. Answers to the numerical questions in the exercises of Unit 3 are also provided so that students can verify their solutions.

It is hoped that this revised edition will be of great help to both teachers and students.

— N.M. SHAH

PREFACE

Statistics has become an anchor for social, economic and scientific studies. Statistical methods are widely used in several disciplines, be it planning, business management, psephology (study of voting patterns), psychology or advertising.

... steps. A list of formulae has been provided at the end of each chapter of Unit 3 ... long years of teaching the subject.

I have to acknowledge that in the writing of this volume I have got immense help from my friends and relations. The publishers have been very cooperative and helpful ...

... understanding wife, besides inspiring me and looking after the household, has done indispensable work for the volume ...

M.M. Shah, himself a scholar and teacher of economics, and who retired as Dean of the Faculty of Commerce, Nagpur University and Principal, G.S. College of ...

Shri Ram College of Commerce, Delhi April, 2002

—N.M. SHAH

SYLLABUS

STATISTICS FOR ECONOMICS-XI

One Paper

3 Hrs.

100 marks

PART-A

STATISTICS FOR ECONOMICS (104 Periods/50 Marks)

1. Introduction (5 Periods/3 Marks)

2. Collection, Organisation and Presentation of Data (25 Periods/12 Marks)

3. Statistical Tools and Interpretation (64 Periods/30 Marks)

4. Developing Projects in Economics (10 Periods/5 Marks)

Unit 1: Introduction

What is Economics?

Meaning, scope and importance of statistics in Economics

Unit 2: Collection, Organisation and Presentation of Data

... and National Sample Survey ...

Organisation of Data: Meaning and types of variables; Frequency Distribution.

Unit 3: Statistical Tools and Interpretation

... interpretation for the results derived)

... deviation; Lorenz curve: meaning ...

Correlation: meaning, scatter diagram; measures of correlation: Karl Pearson's method (two variables, ungrouped data), Spearman's rank correlation

Index numbers: ... types: wholesale price index, consumer price index ... production, uses of index numbers; inflation and index numbers

Unit 4: Developing Projects in Economics

... data, secondary ... organisations/outlets may also be encouraged. Some of the examples of the projects are as follows (they are not mandatory but suggestive):

(i) A report on demographic ...

(ii) Consumer awareness amongst households

(iii) Changing prices of a few vegetables in your market

(iv) Study of a cooperative institution: milk cooperatives

... to enable the students to develop the ways and ...

CONTENTS

UNIT 1 : Introduction

1. What is Economics

2. Introduction—Meaning and Scope .

UNIT 2 : Collection and Organisation of Data

3. Collection of Primary and Secondary Data

4. Organisation of Data

Presentation of Data

5. Tabular Presentation

6. Diagrammatic Presentation

7. Graphic Presentation

UNIT 3 : Statistical Tools and Interpretation

8. Measures of Central Tendency

9. Positional Average and Partition Values

10. Measures of Dispersion

11. Measures of Correlation

12. Introduction to Index Numbers

UNIT 4 : Developing Projects in Economics

13. Preparation of a Project Report


WHAT IS STATISTICS?

S : Scientific Methodology
T : Theory of Figures
A : Aggregate of Facts
T : Tables and Calculation for Analysis
I : Investigation
S : Systematic Collection
T : Tabulation and Organisation
I : Interpretation
C : Comparison
S : Systematic Presentation

UNIT 1 : INTRODUCTION

What is Economics?

Meaning, Scope and Importance of Statistics in Economics

Chapter 1

what is economics?

1. Introduction

2. Activity

3. Definition of Economics

4. Nature of Economics

(i) Economics as a Science (ii) Economics as an Art

introduction

If each of us possessed Aladdin's magic lamp, which we had merely to rub in order to get our desires fulfilled immediately, there would be no economic problem and no need for a science of economics. In real life we are not as lucky as Aladdin; we have to work to earn our livelihood. All people in this world work to satisfy their unlimited wants and desires. Everyone requires food to eat, clothes to wear and a house to live in. Besides these, in daily life they need a television, mobile phone, motorbike, car, etc., to lead a comfortable life. A person visits the market and enquires about the varieties and prices of the item he wants to purchase. Thinking about his resources and alternative choices, he uses his sense of economy and decides to buy that item. This is economics.

So,

A customer is a person who buys goods to satisfy his wants.

A Producer is a person who produces or manufactures goods.

A Service holder is a person who is in a job to earn either wages or salary to buy goods.

A Service provider is a person who provides services to society to earn money, e.g., doctors, scooter drivers, lawyers, bankers, transporters, etc.

All the above persons are busy in different activities to earn a living, called economic activities in the ordinary business of life. They face in their lives the problem of scarcity of income ...

Thus, economics is a branch of knowledge concerned with economic activities relating to earning and spending wealth and income. Economics is the study of how human beings ... their unlimited wants in such a ... maximise their satisfaction, producers can maximise their profits and society can maximise its social welfare.

... publication of Adam Smith's "An Inquiry into the Nature and Causes of the Wealth of Nations" in the year 1776. At its birth, the name of economics was 'Political Economy'. Some of the suggested names were:

- Catallactics or the science of exchange.

- Plutology or the science of wealth.

- Chrematistics or the science of money-making.

The word 'Economics' in English has its origin in two Greek words : Oikos (household) and nomos (to manage). Thus, the word economics was used to mean home management with limited funds available in the most economical manner possible.

activity

Iife.?herlf " ^ day

1. Non-economic Activities

2. Economic Activities

Non-economic Activities. These activities are those which have no economic aspect or are not concerned with money or wealth, viz.

- ... get-together, attending birthday parties or marriages, etc.

- ... gurudwara, mosque or church to worship God, attending mass prayer (Satsang), etc.

- Political activities, such as various activities performed by different political parties, namely Bhartiya Janata Party (BJP), Congress Party, etc.

- ... or helping ...

- Parental activities, such as love and affection towards their children.

... they involve any ...

Economic Activities. Different types of activities are performed by different types of people (doctors, teachers, businessmen, industrialists, lawyers, etc.) so as to earn their living. Everyone is concerned with one or the other type of activity to earn money or wealth to meet their wants. An economic activity means that activity which is based on or related to the use of scarce resources for the satisfaction of human wants. Economic activities are classified as under :

[Figure: Economic Activities]

Production : Production is that economic activity which is concerned with increasing the utility or value of goods and services. Manufacturing a shirt with the help of cloth (raw material) and tailoring (labour) is an act of production. Transporting sand from a river bank to a town where it is needed is also an act of production. Here utility is created by transporting the goods to the person who needs them.

Consumption : Consumption is that economic activity which is concerned with the use of goods and services for the direct satisfaction of individual and collective wants. Consumption activity is the base of all production activities. There would be no production if there were no consumption. For example, eating bread, drinking water or milk, wearing a shirt, and using the services of a lawyer or doctor are consumption activities.

Investment : Investment is that economic activity which is concerned with production of capital goods for further production of goods and services. Investment indirectly satisfies human wants. For example, the production of printing press machines to print newspapers, books, magazines etc. or investment in computers to provide Internet, banking and related services.

Exchange : Exchange is that economic activity which is concerned with sale and purchase of commodities. This buying and selling is mostly done in terms of money or price. So, it is also called 'Product Pricing', which relates to determination of the price of the product under different conditions of the market, viz., perfect competition, imperfect competition, monopoly etc.

Distribution : Distribution is that economic activity which deals with determination of price of factors of production (land, labour, capital and enterprise). This is known as the 'Factor Pricing', e.g., price of land is rent, that of labour is wage, that of capital is interest and price of entrepreneur is profit. Distribution is the study to know how the national income or total income arising from what has been produced in the country (called Gross Domestic Product or GDP) is distributed through salaries, wages, profits and interest.

"Economics is that branch of knowledge that studies consumption, production, exchange and distribution of wealth".

—Chapman


definition of economics

Economics has been defined by many economists in different ways. The set of definitions can be grouped into the following categories :

1. Wealth definition—Adam Smith

2. Material welfare definition—Alfred Marshall

3. Scarcity definition—Lionel Robbins

4. Growth definition—Paul A. Samuelson

1. Wealth Definition

(i) Adam Smith, the father of modern economics, in his book 'An Inquiry into the Nature and Causes of the Wealth of Nations' (1776), described the production and expansion of wealth as the subject matter of economics.

(ii) According to J.B. Say, economics is "the science which deals with wealth" ... of wealth ... consumption ...

(iii) Ricardo shifted the emphasis from production of wealth to distribution of wealth.

Criticism : This definition is not precise. It gives importance to wealth rather than to human and social welfare.

The wealth definition of economics was discarded towards the end of the 19th century.

2. Material Welfare Definition

"Economics is a study of mankind in the ordinary business of life; it examines that ..."


Criticism :

{a) In economics, we study immaterial things also.

(b) Welfare cannot be measured in terms of money.

(c) Welfare definition makes economics a purely social science.

... different times.

The basic difference between Adam Smith's and Marshall's definitions is that Adam ...

3. Scarcity Definition

There are three important aspects in this definition. They are :

... human wants, which is the fact of life. When one want gets satisfied, another want crops up.

... are scarce in relation ...

... coal is used in factories, in running railways, in thermal stations for electricity generation and by households, etc.

In short, according to Robbins, economics is a science of choice. It deals with how ...

Criticism : Scarcity definition of economics has been criticised on the following grounds :

(i) The definition is impractical and difficult. It is narrow and restricted in scope. It ... development. It has not ...

(ii) The definition makes economics a human science.

4. Growth Definition

Paul A. Samuelson defines:

"... to produce various commodities over time and distribute them for consumption ..."


The definition combines the essential elements of the definitions by Marshall and Robbins. Accordingly, economics is concerned with the efficient allocation and use of scarce means, as a result of which economic growth is increased and social welfare is promoted. The definition has been accepted universally. In short, the growth definition of economics is the most comprehensive of all the earlier definitions.

nature of economics

The nature of economics can be viewed as that of a science or of an art. It is a science as well as an art.

[Figure: Nature of Economics, with branches Economics as a Science and Economics as an Art]

A. ECONOMICS AS A SCIENCE

Science can be divided into (a) natural science and (b) social science. Sciences like Physics, Biology and Chemistry are natural or physical sciences, where experiments can be conducted in the laboratory under controlled conditions. Relationships between cause and effect can be established on the basis of facts. Observations can be made and used to prove or disprove theories. The results apply universally.

Economics is a social science because it is a systematic study of the economic activities of human beings. Economics is a science as it is a branch of knowledge where various facts have been systematically collected, classified and analysed.

The following arguments are given in favour of economics as a science.

(i) Systematised Study : The study of economics is systematically divided into consumption, production, exchange and distribution of wealth and finance, which have their own laws and theories. Economics as a social science is a systematic study of human behaviour concentrating on maximum satisfaction to households, maximum profit to producers and maximum social welfare to the society as a whole.

(ii) Scientific Laws : Economics is a science because its laws are universally true. Different laws in economics, namely the law of demand, law of supply, law of diminishing marginal utility, law of returns, Gresham's law etc., are applicable to all types of economies, whether capitalistic, socialistic or mixed.


(iii) Cause and Effect Relationship : Economic laws establish cause and effect relationships like the laws in other sciences. For example, the law of demand shows the relationship between change in price and change in demand. It shows that an increase in the price of a commodity (the cause) will decrease its demand (the effect), establishing the negative or inverse relationship between price and quantity demanded. The law of supply shows that an increase in the price of a commodity (the cause) will increase its supply (the effect), establishing the positive relationship between price and quantity supplied.

(iv) Verification of Laws : Like other sciences economic laws are also open to verification. These economic laws can be verified through any empirical investigation.
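The cause and effect relationships described in point (iii) can be illustrated with a small sketch. The linear demand and supply curves and all their coefficients below are hypothetical, chosen only to show the direction of each relationship; they are not taken from the text or from any real market.

```python
# Hypothetical linear curves; intercepts and slopes are illustrative assumptions.

def quantity_demanded(price, intercept=100.0, slope=2.0):
    """Law of demand: quantity demanded falls as price rises."""
    return max(intercept - slope * price, 0.0)


def quantity_supplied(price, intercept=10.0, slope=3.0):
    """Law of supply: quantity supplied rises as price rises."""
    return max(intercept + slope * price, 0.0)


prices = [10, 20, 30]
demand = [quantity_demanded(p) for p in prices]
supply = [quantity_supplied(p) for p in prices]

for p, d, s in zip(prices, demand, supply):
    print(f"price={p}: demanded={d}, supplied={s}")
```

As the price rises from 10 to 30, the quantity demanded in this sketch falls while the quantity supplied rises, which is the inverse and the positive relationship respectively.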

On the basis of the arguments given above, we can say economics is a science: not exactly a natural or physical science, but a social science that studies economic problems and policies in a scientific manner.

Economics—A Positive or Normative Science

(a) Economics as a Positive Science

A positive science is one which gives a real description of an activity. It only answers 'what is?' and 'what was?'. It has nothing to suggest about the facts. Positive economics deals with what is, or how the economic problems facing a society are actually solved. Prof. Robbins held that economics was purely a positive science. According to him, economics should be neutral or silent between ends, i.e., there should be no desire to learn about the ethics of economic decisions. Thus, in positive economics we study human decisions as facts which can be verified with actual data.

Some examples of Economics as a positive science are :

(i) India is the second most populous country in the world.

(ii) Prices have been rising in India.

(iii) Increase in real per capita income increases the standard of living of people.

(iv) The targeted growth rate of the tenth five-year plan is 8 per cent per annum.

(v) Fall in the price of a commodity leads to a rise in its quantity demanded.

(vi) Minimum wage law increases unemployment.

(vii) The share of the primary sector in the national income of India has been declining.

(viii) Ordinary business of life is affected enormously by tsunamis, earthquakes, bird flu, droughts, etc.

(b) Economics as a Normative Science

A normative science is that science which refers to 'what ought to be?' and 'what ought to have happened?'. Normative economics deals with what ought to be, or how the economic problems should be solved. Alfred Marshall and Pigou have considered the normative aspect of economics, as it prescribes that course of action which is desirable and necessary to achieve social goals. It makes an assessment of an activity and offers suggestions for it. The statements which make an assessment of an activity and offer suggestions are called normative statements. The normative statements, in fact, are the opinions of different persons relating to the rightness or wrongness of a particular thing or policy. Normative statements cannot be empirically verified. That part of economics which deals with normative statements is called Normative Economics. Thus, economics is both a positive and a normative science.

Some examples of Economics as a normative science are :

(i) Minimum wages should be guaranteed by the government in all economic activities.

(ii) India should not take loans from foreign countries.

(iii) Rich people should be taxed more.

(iv) Free education should be given to the poor.

(v) Effective steps should be taken to reduce income inequalities in India.

(vi) India should spend more money on defence.

(vii) Government should stop minimum support price to the farmers.

(viii) Our education system should produce sufficient qualified and trained persons for the economy.

Economics as a positive science and as a normative science are inseparable. In reality, economics has developed along both positive and normative lines. The role of the economist is not only to explain and explore, as the positive aspect, but also to admire and condemn, as the normative aspect, which is essential for healthy and rapid growth of the economy.

In the following examples, the first part of the statement is positive, giving facts, and the second part is normative, based on value judgements.

(i) Indian economy is a developing economy; the government should bring about development through correct and proper planning.

(ii) A rise in the price of a commodity leads to a fall in the quantity demanded of the commodity; therefore the government should check the rise in prices.

(iii) Rent Control Act provides accommodation to needy people; therefore, the act should be honestly implemented.

B. ECONOMICS AS AN ART

Art is the practical application of knowledge for achieving some definite aim. It helps in the solution of practical problems. Science lays down principles while art puts these principles into practice. Economics is an art as it gives us practical guidance in the solution of various economic problems.

We all know that there is an oil shortage in India. This information given by economics is positive science. We also know that the government aims at removing the oil shortage. This information supplied by economics is normative science. In order to achieve the objective of full availability of oil in India, the government has followed the path of oil planning. The path of planning is an art as it implies practical application of knowledge with a view to achieving some specific objectives. So, we can say that economics is an art. Economics is, thus, a science as well as an art.

exercises

Explain the origin of the word 'Economics'.

What is economic activity?

Distinguish between non-economic and economic activities.

Make a list of economic activities that constitute the ordinary business of life.

What are your reasons for studying Economics?

How will you choose the wants to be satisfied?

Give Adam Smith's definition of economics.

Define economics in the words of Alfred Marshall.

"Economics is the science of choice." Explain.

Which is the most accepted definition of economics? Give the definition.

Explain the welfare definition of economics.

"Economics is about making choices in the presence of scarcity." Explain.

How do scarcity and choice go together?

What is meant by economics?

Is economics a science? Give reasons.

Discuss the nature of economics as a science.

Give arguments in favour of economics as a science.

Is economics an art? Give reasons.

Is Economics a science or an art? Explain with reasons.

Is economics a positive science or a normative science or both? Explain.


Chapter 2

introduction-meaning and scope

1. Introduction

2. What is Statistics?

3. Functions of Statistics

4. Importance of Statistics

5. Limitations of Statistics

6. Misuse of Statistics

introduction

... planning and ... developments and progress ... (... chemistry, medicine, technology etc.) ... sources of energy, new machines have been developed ... All this is possible because man, whether Indian or ..., is gifted with thinking and reasoning which had evolved ... has given us civilisation ... irrigation system, electricity, machinery etc. We now have ... systems, better organisations for the complex business and administration today.

All this has ... applied itself in finding ways of ... and scientific manner. A methodology ... things. The empirical methodology consists of ... observations and collecting information, analysing the information ... conclusions by further observations ... Whether a common man knows it or not, he uses this method to ... decision-making. While buying vegetables he looks at different ... shops and then mentally calculates, or works out what ... which shop. A shopkeeper observes from his daily experience what ... demand, and decides to stock these ... A manufacturer also observes the pattern of demand and manufactures large ... or manufactures new items according to the ... demand ... media (the newspaper, the radio, the television etc.), it is possible to ...


demand and supply, he collects data (information) systematically, gets it organised in some logical or systematic way, analyses this data according to certain principles and draws conclusions. He has to do it carefully since a wrong judgement can completely ruin him.

Quantitative Data and Qualitative Data : An empirical investigation is an investigation where facts are collected through observation. In Physics, Chemistry and Botany, only those things that can be observed by our senses—seeing, hearing, touching, tasting and smelling—are taken to be reliable and then recorded (noted).

We all agree that the rose is beautiful. How do we reach that conclusion? We all like its colour, shape and above all its smell. In this respect, it is not a subjective or personal conclusion. But if I say that I like the rose most of all the flowers, this would be a subjective statement. A scientist, however, makes a very precise statement: he would say that roses have a sweet smell. Similarly, people might say that theft and robbery have increased these days. This might be a conclusion based on the impression people get from newspaper reports of cases of theft and robbery. This impression may or may not be true. We can find out whether it is true only by comparing the number of cases of theft and robbery reported during one year with the number of cases reported in other years. An investigator would collect such information from police records.

When information or observations are recorded in numbers or quantities, we say we have quantified information. Examples are the number of people in a state who are strict vegetarians, heights or weights of students, daily temperature, incomes of individuals, prices of wheat during this week, the number of people in a country who are poor, rich or middle class, the number of illiterate people who will not get jobs, and the number of highly educated people who will have the best job opportunities. Such information is known as 'Quantitative data'.

However, not all information can be numerically expressed. It is not possible in certain cases to measure or quantify information, e.g., preference of people viewing TV channels, intelligence of students, appreciation of art, beauty, music etc. Suppose a selection for a post is to be made: candidates are interviewed, some questions are put to them and their qualifications are taken into consideration. The interview board discusses the comparative merit of the candidates and ranks them for final selection. This judgement is not quantifiable; it is based on impression.

Non-quantifiable/qualitative items can however be measured in percentages. For example, the percentage of people watching TV news in English or Hindi or other regional languages. This information obtained in percentages is called 'Qualitative data'. It may be collected through a questionnaire or an opinion poll using landline or mobile telephone, internet or newspapers.

Social sciences, such as economics, sociology, management etc., do not always deal with what we call inherently measurable or quantifiable facts.

what is statistics?

It is necessary to have quantitative measurements even for things which are not basically quantifiable. This is necessary for preciseness of statement. The systematic treatment of quantitative expression is known as 'Statistics'. Not all quantitative expressions are statistics; we will see that certain conditions must be fulfilled for a quantitative statement to be called statistics. We will also consider later the functions and limitations of statistics. First, let us understand what comes under the name Statistics. Statistics can be defined in two ways :

(a) In plural sense.

(b) In singular sense.

Let us consider whether the figures 1600, 400, 80, 20, 700, 300, 70 and 30 are statistics. Figures are innocent and do not speak anything. But when they refer to some place, person, time etc., they are called statistics. Let us look at the table given below :

Students in Two Schools (2005-2006)

            Kendria Vidyalaya         Govt. Senior Secondary School
Students    Number    Percentage      Number    Percentage
Boys        1600      80              700       70
Girls        400      20              300       30
Total       2000      100             1000      100

The above table gives a numerical description of students in Kendria Vidyalaya and Govt. Senior Secondary School. Students are grouped as boys and girls and the percentage is calculated for each group. Now, in this context the figures 1600, 400, 700 etc. have a statistical meaning; we call these statistics of students. Similarly, we find in newspapers statistics of scores in a cricket match, statistics of prices, statistics of agricultural production, statistics of exports and imports etc.
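The percentage columns of the table of students in two schools can be recomputed from the counts. The following is a minimal sketch; the function name `percentage_shares` and the dictionary layout are illustrative choices, while the counts are those given in the table.

```python
# Counts of boys and girls as given in the table of students in two schools.
schools = {
    "Kendria Vidyalaya": {"Boys": 1600, "Girls": 400},
    "Govt. Senior Secondary School": {"Boys": 700, "Girls": 300},
}


def percentage_shares(counts):
    """Return each group's share of the total, in per cent."""
    total = sum(counts.values())
    return {group: 100 * n / total for group, n in counts.items()}


shares = {name: percentage_shares(counts) for name, counts in schools.items()}

for name, groups in shares.items():
    for group, pct in groups.items():
        print(f"{name}: {group} = {pct:.0f} per cent")
```

Expressing the counts as percentages is what makes the two schools comparable even though their totals (2000 and 1000) differ.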

"By statistics we mean aggregates of facts affected to a marked extent by multiplicity of causes, numerically expressed, enumerated or estimated according to reasonable standards of accuracy, collected in a systematic manner for a predetermined purpose and placed in relation to each other."

The above definition covers the following main points about statistics as numerical presentation of facts (Plural sense).

(a) Statistics are aggregates of facts : A single observation is not statistics; statistics is a group of observations, e.g., "pocket expenses of Anil during a month is Rs 50" is not statistics. But "pocket expenses of Anil, Prakash, Sunil and Suresh during a month are Rs 50, 55, 80 and 70 respectively" are statistics.

(b) Statistics are affected to a marked extent by multiplicity of causes : Statistics are generally not isolated facts; they are dependent on, or influenced by, a number of phenomena, e.g., electricity bills are affected by consumption and rate of electricity.

(c) Statistics are numerically expressed : Qualitative statements are not statistics unless they are supported by numbers. For example, if we say that the students of a class are very good in studies, it is not a statistical statement. But when a statement reads as 40 students got first division, 30 second division, 20 third division and 10 failed out of 100 students, it is a statistical statement expressed numerically.

(d) Statistics are enumerated or estimated according to reasonable standard of accuracy: Enumeration means a precise and accurate numerical statement. But sometimes, where the area of statistical enquiry is large, accurate enumeration may not be possible. In such cases, experts make estimations on the basis of whatever data is available. The degree of accuracy of estimates depends on the nature of enquiry.

(e) Statistics are collected in a systematic manner : Statistics collected without any order and system are unreliable and inaccurate. They must be collected in a systematic manner.

(f) Statistics are collected for a pre-determined purpose : Unless statistics are collected for a specific purpose they would be more or less useless. For example, if we want to collect statistics of agricultural production, we must decide beforehand the regions, commodities and periods for which they are required.

(g) Statistics are placed in relation to each other : Statistical data are often required for comparisons. Therefore, they should be comparable periodwise, regionwise, commoditywise etc.

When the above characteristics are not present, numerical data cannot be called statistics. Thus, "all statistics are numerical statements of facts but all numerical statements of facts are not statistics."

Statistics defined in singular sense (as a statistical method) : Statistics in its second, singular sense refers to the methods adopted for scientific empirical studies. Whenever a large amount of numerical data is collected, there arises a need to organise, present, analyse and interpret it. Statistical methods deal with these stages :

[Figure: Statistics as Methodology]

According to Croxton and Cowden, "Statistics may be defined as a science of collection, presentation, analysis and interpretation of numerical data."

The above definition covers the following statistical tools :

(a) Collection of data : This is the first step in a statistical study and is the foundation of statistical analysis. Therefore, data should be gathered with maximum care by the investigator himself or obtained from reliable published or unpublished sources.

(b) Organisation of data : Figures that are collected by an investigator need to be organised by editing, classifying and tabulating.

(c) Presentation of data : Data collected and organised are presented in some systematic manner to make statistical analysis easier. The organised data can be presented with the help of tables, graphs, diagrams etc.

(d) Analysis of data : The next stage is the analysis of the presented data. There are a large number of methods used for analysing the data, such as averages, dispersion, correlation etc.

(e) Interpretation of data : Interpretation of data implies the drawing of conclusions on the basis of the data analysed in the earlier stage. On the basis of this conclusion certain decisions can be taken.

Stages of Statistical Study

According to the figure, interpretation of data is the last stage, at which conclusions are drawn. One has to go through four stages to arrive at this final stage: collection, organisation, presentation and analysis. The first stage, collection of data, refers to gathering statistical facts by different methods. The second stage is to organise the data so that the collected information is easily intelligible; this is the arrangement of data in a systematic order after editing. The third stage is presentation of data: after collection and organisation, the data are reproduced by various methods of presentation, namely tables, graphs, diagrams etc., so that different characteristics of the data can easily be understood on the basis of their quality and uniformity. The fourth stage is the analysis of data: values are calculated by different methods and tools for various purposes, in order to arrive at the last stage of the study, viz., interpretation of data.

In brief, statistics is a method of taking decisions on the basis of numerical data properly collected, organised, presented, analysed and interpreted.
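The five stages can be traced on a very small data set. The sketch below is only an illustration: the ten family incomes, the income groups and the threshold of Rs 500 are all hypothetical figures chosen for the example, not data from this chapter.

```python
from statistics import mean

# 1. Collection: hypothetical monthly incomes (Rs) of ten families.
#    None marks a family that gave no answer.
collected = [400, 250, None, 900, 400, 650, 300, None, 500, 200]

# 2. Organisation: edit out unusable answers, arrange in order,
#    then classify into (assumed) income groups.
edited = sorted(x for x in collected if x is not None)
classes = {"below 300": 0, "300-599": 0, "600 and above": 0}
for income in edited:
    if income < 300:
        classes["below 300"] += 1
    elif income < 600:
        classes["300-599"] += 1
    else:
        classes["600 and above"] += 1

# 3. Presentation: a simple frequency table.
for label, count in classes.items():
    print(f"{label:<15}{count}")

# 4. Analysis: one summary value, the average income.
average_income = mean(edited)

# 5. Interpretation: a conclusion drawn from the analysed data.
threshold = 500  # hypothetical comparison level, not from the text
conclusion = "below" if average_income < threshold else "at or above"
print(f"Average income Rs {average_income} is {conclusion} Rs {threshold}.")
```

Each step uses only the output of the step before it, which is why a mistake at the collection stage carries through to the final conclusion.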


functions of statistics

Following are the functions of statistics :

1. Statistics simplifies complex data : With the help of statistical methods a mass of data can be presented in such a manner that they become easy to understand. For example, the complex data may be presented as totals, averages, percentages etc.


2. Statistics presents the facts in a definite form : This definiteness is achieved by stating conclusions in a numerical or quantitative form.

3. Statistics provides a technique of comparison : Comparison is an important function of statistics. For example, comparison of data of different regions; periods, conditions etc., is helpful for drawing economic conclusions. Some of the statistical tools like averages, ratios, percentages etc., are used for comparison.

4. Statistics studies relationship : Correlation analysis is used to discover functional relationship between different phenomena, for example, relationship between supply and demand, relationship between sugarcane prices and sugar, relationship between advertisement and sale. Statistics helps in finding the association between two or more attributes, for example, association between literacy and unemployment, association between inoculation and infection etc.

5. Statistics helps in formulating policies : Many policies such as that of import, export, wages, production etc., are formed on the basis of statistics. Some laws such as Malthus' theory of population and Engel's law of family expenditure are based on statistics.

6. Statistics helps in forecasting : Statistics also helps to predict the future behaviour of phenomena; for instance, the market situation for the future is predicted on the basis of available statistics of past and present. An economist might be interested in predicting the changes in one economic factor due to the changes in another factor. For example, he might be interested to know the impact of today's investment on the national income in future, which is possible with the knowledge of statistics.

7. Statistics helps to test and formulate theories : When some theory is to be tested, statistical data and techniques are useful. For example, whether cigarette smoking causes cancer; whether demand increase affects the price, can be tested by collecting and comparing the relevant data.

importance of statistics

The use of statistical methods is so widespread that statistics has become a very important tool in the affairs of the world. It is indispensable to fields of investigation, especially in the sciences, such as Botany, Sociology, Economics, Medicine etc. It helps particularly in drawing research conclusions. Let us examine the importance of statistics in some fields relating to economics and business :

(a) Statistics and Economics

(b) Statistics and Economic Planning

(c) Statistics and Business

(d) Statistics and Government

Statistics and Economics

A number of economists have given a practical shape to statistical tools for economic research. Famous economists (like Augustin Cournot, Vilfredo Pareto, Leon Walras, Alfred Marshall, Edgeworth, A.L. Bowley etc.) evolved a number of economic laws by quantitative and mathematical studies. In India, Prof. P.C. Mahalanobis, Dr V.K.R.V. Rao, R.C. Desai etc. have contributed a lot to the development of statistics.


[...] tools and the importance of [...] relationship among [...] of mathematics and statistics in economics. As a result, a new science has evolved which is called Econometrics. [...] have evolved due to statistical analysis in the field of economics, e.g., Engel's law of family expenditure, Malthus' theory of population etc. New things are being invented today in all the sciences because of the [...] new laws in [...]. Statistical methods have made a contribution to the development of the empirical side of economics; the inductive method of economics is dependent upon statistical methods. [...] are the tools and appliances of his laboratory, in the same way as the doctor uses a stethoscope for diagnosis of a patient. A number of economic problems can easily be understood by the use of statistical tools. It helps in formulation of economic policies. Let us understand the importance of statistics keeping in view the various parts of economics :

(a) Statistics and the study of consumption : Every individual needs a certain number [...] necessities, then on comforts and luxuries, which depend on his income; but there is no end to his desires and demands. No sooner does he consume one thing than he desires to obtain another. We discover how [...] statistics [...] consumption. The [...] taxable liability of individuals and their standard of living.

(b) Statistics and the study of production : The progress of production every year can easily be measured by statistics. The comparative study of the productivity of various elements of production (e.g., land, labour, capital and enterprise) is also done with the help of statistics. The statistics of production are very helpful for adjustment of demand and supply. Every developed country executes a census of production with a view to make a comparative study of various fields of production and economic planning.

(c) Statistics and the study of exchange : [...] national and international demand. A producer needs statistics for deciding the cost of [...] competition and demand of a commodity in a market. The law of price determination and cost price which are [...]

(d) Statistics and the study of distribution : Statistics are helpful in calculation of national income in the field of distribution. Statistical methods are used in solving the problem of the distribution of national income. Various problems arise due to [...] with the help of statistical data.

"Statistics are the straw out of which I, like every other economist, have to make bricks." - Marshall

Thus, statistics is useful in the various fields of economics. It gives statement of facts, direction to solve problems, evolution of economic laws and helps in economic planning. Economic laws in the modern economic world are based on mathematics and statistics which help to form econometric models; these models are helpful in solving economic problems. On this basis we can call economics a Science of Human welfare and statistics as an Arithmetic of Human welfare.

Statistics and Economic Planning

Statistics is the most important tool in economic planning. Economic planning is the best use of national resources, both human and natural. Planning without statistics is a leap in the dark. Every phase in planning—drawing a plan, execution and review is based on statistics. The success of a plan is dependent upon sufficient and accurate statistical data available at all these stages.

The comparison of the stage of development of one country with another is possible only with the availability of statistical data. There are a number of problems of underdeveloped countries, e.g., overpopulation, lack of industries, lack of agricultural development, lack of education etc. These problems can be fully viewed and understood only by getting the actual figures for different areas. Similarly, general review of progress in all fields of economic development needs the help of statistical data and statistical methods. Priorities of expenditure of a national budget can be determined through the comparative study of past performances with the present. Thus, planning without statistics is a ship without radar and compass.

Statistics and Business Planning

Business activities can be classified as under :

[Figure: Classification of business: Internal (Wholesale, Retail); International (Import, Export); [...] of Trade (Banking, Transport, Insurance, Warehousing, Packing, Advertisement)]

[...] steps of statistical method have to be followed. [...] the producer to fix the prices of [...] methods of statistics. [...] selling activities. For profitable trade he must know what the [...] demand would last. This is very important for [...] export for various commodities [...]. [...] he has to forecast when the demand [...] what amounts of reserves he must have. Similarly he must [...] what amounts [...] his depositors, otherwise his bank would [...]. [...] transaction is required where statistical tools [...]. [...] what proportion of the capital [...] kept ready for payments of matured policies [...] might fix for the same. In fact no modern [...] without analysis of the complex factors [...]. [...] business analysis statistical tools are absolutely essential. [...] uncertainties. Statistical tools of collection, classification, [...] and interpretation of data are used in all major functions [...], e.g., material control, budgetary control, financial control, cost control, personnel management and so on.

Statistics and Government

[...] like that of crimes, taxes, wealth, trade etc., so that there is no obstacle for the promotion of economic development. Such policies may develop the economic status of the people and the nation, depending upon the accuracy of the statistical law. Statistics is indispensable for all important functions of the ministries of the state. Above all, the ministry of planning takes into account statistics of various fields of the economy. Keeping in view the global price rise, the ministry plans and makes policy to import oil in 2010, which depends on expected oil production by domestic sources and likely demand for oil for the year 2010. The policy of family planning can be made effective in controlling the population of the country. Thus, statistical techniques are used to analyse economic problems of the country, viz., unemployment, poverty, disinvestment, price control etc. Sometimes, to make plans and policies, planners require the knowledge of future trends. This trend could be based on the data of past years or recent years. The required data can be obtained by surveys. For example, the production policy of 2010 depends on the consumption recorded in past years and recent years, which decides the expected level of consumption in 2010. This helps the planners to make the production policy for the future. The Ministry of Finance is responsible for preparing the annual budget of the country, for which reliable statistical data of revenue and expenditure are necessary. In short, statistical tools are of maximum utility in the governance of the state and formulation of various economic policies.

limitations of statistics

Statistics is very widely used in all sciences but it is not without limitations. It is necessary to know the misuses and limitations of statistics. The following are the limitations of statistics.

1. It does not study the qualitative aspect of a problem : The most important condition of statistical study is that the subject of investigation and inquiry should be capable of being quantitatively measured. Qualitative phenomena, e.g., honesty, intelligence, poverty etc., cannot be studied in statistics unless these attributes are expressed in terms of numerals.

2. It does not study individuals : Statistics is the study of mass data and deals with aggregates of facts which are ultimately reduced to a single value for analysis. Individual values of the observation have no specific importance. For example, the statement that the income of a family is, say, Rs 1,000 does not convey statistical meaning, while the statement that the average income of 100 families is, say, Rs 400 is a statistical statement.

3. Statistical laws are true only on an average : Laws of statistics are not universally applicable like the laws of chemistry, physics and mathematics. They are true on an average because the results are affected by a large number of causes. The ultimate results obtained by statistical analysis are true under certain circumstances only.

4. Statistics can be misused : Statistics is liable to be misused. The results obtained can be manipulated according to one's own interests and such manipulated results can mislead the community.

5. [...] observations of mass data [...] based on [...] to rectify them. Therefore, [...]

misuse of statistics

[...] statistics by deliberately twisting or manipulating [...] be interpreted by a lawyer to prove [...] tools for grinding their own axes [...] the student of commerce and economics should [...]. This generally takes place at the time of selecting [...], making comparisons and interpreting analysis of data. [...] support [...] the truth with the help of his knowledge of statistics.

exercises

1. Distinguish between qualitative and quantitative data.

3. [...] characteristics?

4. [...] and plural sense.

[...] and never with a single observation. [...]

"[...] counting." Discuss.


Discuss with illustration the importance of Statistics in the solution of social and economic problems.

"Statistical Analysis is of vital importance for successful businessmen, economists, administrators and educationists." Discuss with illustrations.

Write notes on : (a) Importance of statistics in modern economic set up, (b) Statistics in economic analysis.

Define Statistics. Explain its utility in the field of economic planning.

"Statistical thinking is as necessary for efficient citizenship as the ability to read and write." Explain this statement in about 200 words.

"Statistics in these days is indispensable for dealing with socio-economic problems". How far is this statement true?

What is the importance of Statistics in modern economic set up? Explain giving examples.

"Planning without Statistics is a ship without radar and compass." In the light of this statement explain the importance of Statistics as an effective aid to national planning.

Explain the relationship between Economics and Statistics and discuss how far it is correct to say that the science of economics is becoming statistical in its method.

Explain briefly :

(a) Statistics, (b) Statistical methods,

(c) Statistical data, (d) Statistics in economic analysis.

Statistical methods are no substitute for common sense. Comment.

"The Government and policy makers use statistical data to formulate suitable policies of economic development". Illustrate with two examples.

Mark the following statements as true or false.
(i) Statistics is of no use to economics without data.
(ii) Statistics can only deal with quantitative data.
(iii) Statistics solves Economic problems.

UNIT 2

collection and organisation of data

• Collection of Primary and Secondary Data
• Organisation of Data
• Presentation of Data

Chapter 3

collection of primary and secondary data

What is a Statistical Enquiry?
Sources of Data
Primary and Secondary Data
Drafting the Questionnaire
Methods of Collecting Primary Data
Census and Sample Surveys
Sample Surveys
Methods of Sampling
Random Sampling
Non-Random Sampling
Advantages of Sampling
Reliability of Sample Data
How Secondary Data is Collected?
Some Important Sources of Secondary Data
Census of India
National Sample Survey Organisation (NSSO)


what is a statistical enquiry ?

Enquiry means a search for truth, knowledge or information. Statistical enquiry therefore means a search conducted by statistical methods. There are different subjects on this earth; some are described by the degree of expression (quality) and some by the degree of figures or magnitudes (quantity). The application of a statistical technique is possible only when the questions are answerable in figures (quantity). In other words, the first and foremost condition of a statistical enquiry is that the answers to its questions should be quantitative, for instance :

Profit of firms measured in rupees;

Income of families measured in rupees;

Weight of students measured in kg;

Age of students measured in years;

Intelligence measured in marks obtained by students in a particular test.

But there are questions like "How great was Jawaharlal Nehru?" or "How brave was Bhagat Singh?" which cannot be answered through statistical methods. Questions that can be answered in quantity lie within the purview of statistics, viz., What is the average production of rice per acre in India? What is the total population of India? How many students are there in a class?

Thus, statistical enquiry means statistical investigation or statistical survey. One who conducts this type of enquiry is called an investigator. The investigator needs the help of certain persons to collect information; they are known as enumerators. Respondents are those from whom the statistical information is collected. A survey is a method of collecting information from individuals.

Let us observe the following table.

TABLE 1

Production of Finished Steel in India (in Million Tons)

Year       Production
1950-51     1.0
1980-81     6.8
1990-91    13.5
2000-01    30.3
2001-02    31.1
2002-03    34.5
2003-04    36.9

Source : Government of India, Economic Survey 2004-05.
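The figures of Table 1 can also be examined with a short program. The sketch below copies the seven values from the table and works out how much production changed from one listed year to the next; the names `years`, `values` and `changes` are chosen only for this illustration. Note that the listed years are not equally spaced, so the changes cover periods of different lengths.

```python
# Production of finished steel in India (million tons), copied from Table 1.
production = {
    "1950-51": 1.0,
    "1980-81": 6.8,
    "1990-91": 13.5,
    "2000-01": 30.3,
    "2001-02": 31.1,
    "2002-03": 34.5,
    "2003-04": 36.9,
}

years = list(production)            # the year of each observation
values = list(production.values())  # the production in that year

# Change between each listed year and the one before it.
changes = [round(later - earlier, 1)
           for earlier, later in zip(values, values[1:])]

for year, change in zip(years[1:], changes):
    print(f"{year}: {change:+.1f} million tons since the previous entry")

# A quantity is a variable if its value is not the same in every observation.
all_different = len(set(values)) == len(values)
print("Every year has a different value:", all_different)
```

Since no two years show the same production, the quantity varies from observation to observation.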

We observe that the production of finished steel in India differs from one year to another; it is not the same in any two years. Quantities that vary in this way are called variables in statistics and are represented as X, Y or Z. Here, the year can be represented by the X-variable and the production of finished steel by the Y-variable.

From the followmg text, we will understand :

1. What is the source of data?

2. How do we collect data?

3. By which method of survey is data collected?

sources of data

[...] data. This is the first stage in statistics.

[Figure: Sources of Data]

[...] Government departments [...] external data [...] primary and secondary [...]

But you may have another choice: visiting the factory accounts department and recording the information from the salary register, or gathering this information from the published report of the factory about the payment of wages. This is a secondary source for an investigator, but for the factory it is a primary source.

Thus, primary data is collected originally and secondary data is collected through other sources. Primary data is first hand information for a particular statistical enquiry, while the same data is second hand information for another enquiry. The same data is primary in one case and secondary in the other, e.g., any Government publication is first hand (Primary) for the Government and second hand (Secondary) for a research worker. Thus, secondary data can be obtained either from published sources or from any other source, for example, a website, which saves time and cost.

[Figure: Primary Data → Published]

How Primary Data is Collected

The most popular and common tool for collecting primary data is the questionnaire/interview schedule. The questionnaire is managed by the enumerators, researchers or trained investigators. Sometimes the questionnaire is managed by the respondents also.

Drafting the Questionnaire

Following are the basic principles of drafting questionnaire :

(1) Covering letter : The person conducting the survey must introduce himself and make the aims and objectives of the enquiry clear to the informant. A personal letter can be enclosed indicating the purposes and aims of enquiry. The informant should be taken into confidence. He should be assured that his answers will be kept confidential and he will not be solicited after he fills up the questionnaire. A self-addressed and stamped envelope should be enclosed for the convenience of the informant to return the questionnaire.

(2) Number of questions : The informant should be made comfortable by asking the minimum number of questions based on the objectives and scope of the enquiry. The more the questions, the lower the possibility of response. Therefore, normally [...]

[...] Instructions about units of measurement should be given in the questionnaire.

[...] Questions should be numbered for the convenience of the informant and the investigator.

[...] The informant should be able to give the answer [...] objective. For this [...] by using a tick-mark in the blank space, e.g.,

Which of the following languages do you use most for writing? (Put a cross)
English □
Punjabi □
Urdu □
Any other □

[...] questions should be answerable in 'Yes' or 'No' or 'Right' or 'Wrong', e.g.,

Are you married? Yes/No

Are you employed? Yes/No

[...] should start from general [...]. These questions [...], e.g., In which class do you read? In which subject are you more interested?

[...] Such questions should be [...] work. [...]

[Figure: A Specimen Questionnaire]

Example :

[...] students in University [...]

How will you solve the wage problem in your industry?

(a) Which brand of tea do you take?

(b) Why do you prefer it?

(16) [...]

(b) Do you love your children?

(c) Do you beat your wife?

methods of collecting primary data

Following are the methods of primary data collection which are in common use :

[Figure: Collection of Data]

Primary :
    Direct Personal Interview
    Indirect Personal Interview
    Telephone Interview
    Information from Correspondents
    Mailed Questionnaires
    Questionnaires Filled by Enumerators

Secondary :
    Published Sources
        Government Publications
        Publications of International Bodies
        Semi-official Publications
        Report of Committees and Commissions
        Private Publications
            (a) Journals and Newspapers
            (b) Research Institutions
            (c) Professional Trade Bodies
            (d) Annual Reports of Joint Stock Companies
            (e) Articles, Market Reviews and Reports
    Unpublished Sources

[...] etc. and collect the desired information. In the same way one can think of a personal enquiry for collecting information regarding the family budget and living conditions of a group or area. The investigator must be skilled, tactful, accurate and pleasing, and should not be biased.

Merits :

1. Original data are collected by this method.

2. There is uniformity in collection of data.

3. The required information can be properly obtained.

4. There is flexibility in the enquiry as the investigator is personally present.

5. Information can be obtained easily from the informants by a personal interview.

6. Since the enquiry is intensive and in person, the results obtained are normally reliable and accurate.

7. Informants' reactions to questions can be properly studied.

8. Investigators can use the language of communication according to the educational standard and attitude of the informant.

Limitations :

1. This method can be used if the field of enquiry is small. It cannot be used when the field of enquiry is wide.

2. It is a costly method and consumes more time.

3. Personal bias can give wrong results.

4. [...] otherwise [...]

5. This method is lengthy and complex.

(ii) Indirect personal interview : [...]

Merits :

1. [...] obtained from the third party, it is more or less free from the bias of the investigator and the informant.

[...]

3. It saves labour, time and money.

[...] of problems can properly [...]

Limitations :

[...]


Thus, we find that both the above methods, direct and indirect personal interviews, have certain plus and minus points. For this reason the choice of the method depends on the nature of the enquiry, and sometimes we balance the demerits of one method by using the other method also for the same investigation. This way we can counter-check the data collected by one method with the other.

(iii) Telephone interview : The investigator asks questions over a landline telephone, a mobile telephone or even through a website. Various researchers, newspapers, television channels, mobile service providers, banks etc. use telephone service to get information from different people, e.g., exit polls, political or economic opinions, opinions on music or dance performances etc. Sometimes a website or the internet is also used for obtaining statistical data. These days surveys through Short Message Service, i.e., SMS, have become popular.

Merits :

1. Telephone interviews are cheaper than personal interviews.

2. It can be conducted in a shorter period of time.

3. The investigator can assist the respondent by clarifying the questions.

4. Sometimes respondents are reluctant to answer some questions in personal interviews. Telephone interviews are better in such cases.

Limitations :

1. Information cannot be obtained from people who do not have their own telephones.

2. Reactions of respondents on certain issues cannot be judged; but it sometimes becomes helpful in obtaining information from respondents.

(iv) Information from correspondents : In this method, local agents or correspondents are appointed in different parts of the investigation area. These agents regularly supply the information to the central office or investigator. They collect the information according to their own judgements and own methods. Radio and newspaper agencies generally obtain information about strikes, thefts, accidents etc. by this method. It is adopted by Government departments to get estimates of agricultural crops and the wholesale price index number. It is suitable when the information is to be obtained from a wide area and where a high degree of accuracy is not required.

Merits :

1. This method is comparatively cheap.

2. It gives results easily and promptly.

3. It can cover a wide area under investigation.

Limitations :

1. In this method original data is not obtained.

2. It gives approximate and rough results.


3. As the correspondent uses his own judgement, his personal bias may affect the accuracy of the information sent.

4. [...] correspondents and agents may increase errors.

(v) Mailed questionnaire : [...] kept confidential.

Merits :

1. [...] this method in cases where informants are spread over a wide geographical area.

2. [...] less than the cost [...]

3. We can obtain original data by this method.

Limitations :

1. [...] informants. They may [...]

2. [...] misinterpret or may not understand some questions.

3. There may be delays in getting replies to the questionnaires.

4. This method can be used only when the informants are educated or literate, so that they return the questionnaires duly read, understood and answered.

5. [...] possibility of getting wrong results due to partial responses, and those [...]

6. There may be loss of questionnaires in mail.

This method is suitable for the following situations :

(a) [...] questionnaire, Government agencies compel banks and companies etc. to supply information regularly to the Government in a prescribed form.

(b) This method can be successful when the informants are educated.



Following are some suggestions for making this method more effective and successful.

(a) Questions should be simple and easy so that the informants may not find it a burden to answer them.

(b) Informants should not be required to spend for posting the questionnaires back; therefore, a prepaid postage stamp should be affixed.

(c) This method should be used in a large sample or wide universe.

(d) This method is preferred in such enquiries where it is compulsory by law to fill the schedule. Thus, there is little risk of non-response.

(e) The language of the schedule should be polite and should not hurt the sentiments of the informants.

(vi) Questionnaire filled by enumerators : The mailed questionnaire method poses a number of difficulties in collection of data. Generally, the filled questionnaires received are incomplete, inadequate and unrepresentative.

The second alternative approach is to send trained investigators or enumerators to informants with standardised questionnaires which are to be filled in [...]

The investigator helps the informants in recording their answers. The investigators should be honest, tactful and painstaking. This is the most common method used by research organisations. They train investigators properly, specifically for the purpose of an enquiry, and also train them in dealing with different persons tactfully, to get proper answers to the questions put to them. The statistical information collected under this method is highly reliable.

Merits :

1. It can cover a wide area.

2. The results are not affected by personal bias.

3. True and reliable answers to difficult questions can be obtained through establishment of personal contact between the enumerator and the informant.

4. As the information is collected by trained and experienced enumerators, it is reasonably accurate and reliable.

5. This method can be adopted in those cases also where the informants are illiterate.

6. Personal presence of the investigator assures complete response, and respondents can be persuaded to give the answers to the questionnaire.

Limitations :

1. It is an expensive method as compared to other methods of primary collection of data, as the enumerators are required to be paid.

2. This method is time consuming since the enumerator is required to visit people spread out over a wide area.

3. This method needs the supervision of investigators and enumerators.

4. Enumerators need to be trained. Without good interview and proper training, most of the collected information is vague and may lead to wrong conclusions.

5. It needs a good battery of investigators to cover the wide area of universe and therefore it can be used by bigger organisations.

Pilot Survey or Pre-Test : [...] a pre-test or a guiding survey [...] before starting the main survey. This is done to try out the questionnaire and the methods for obtaining the general information about the population to be sampled. The information supplied by the pilot survey helps in :

(i) Estimating the cost of [...] the time needed for [...] availability of [...]

[...] questions and also in the improvement of [...]

(iv) Training of field staff [...]

census and sample surveys

(a) Census method/Census Survey, and (b) Sample method/Sample Survey

[...] have a clear understanding of [...]

Population and Sample

In statistics, population or universe refers to all the items under study. A part of the population selected for study is called a sample, and the process of selection is termed sampling. Suppose there are 500 girls in a Government Senior Secondary School. If we want to know the average weight of the girls by studying the whole population, we will weigh each girl and get the information about all the five hundred girls. The average weight will be obtained by dividing the total weight of the girls by 500. Alternatively, we can estimate the average by taking only 50 girls out of 500 and obtaining the average of this part of the total population. The average of 50 girls may reasonably be representative of the average weight of the 500 girls. In this case the 50 girls whose weights are taken form the sample.
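The comparison between the average of all 500 girls and the average of 50 selected girls can be sketched in Python. This is only an illustration: the weights are simulated, and the range of 35 to 60 kg and the seed are assumptions, not data from the text.

```python
import random
import statistics

# Hypothetical weights (in kg) of 500 girls; simulated for illustration only.
rng = random.Random(42)
population = [round(rng.uniform(35.0, 60.0), 1) for _ in range(500)]

# Complete enumeration: every unit of the population is used.
census_mean = sum(population) / len(population)

# Sample method: only 50 girls, chosen at random, are weighed.
sample = rng.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"Average of all 500 girls : {census_mean:.2f} kg")
print(f"Average of 50 girls      : {sample_mean:.2f} kg")
```

With a random selection the two averages are usually close, which is the sense in which the sample is "representative" of the population.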

Census Surveys

The objective of a census method or complete enumeration is to collect information for each and every unit of the population/universe. In this method every element of the population is included in the investigation. Thus, when we make a complete enumeration of all items in the population, it is known as 'Census Method' or 'Method of Complete Enumeration'. In the above example, collecting the weights of all the 500 girls in the Senior Secondary School is the census method of collection, where no student is left out, as each student is a unit.

The following are a few examples of census :

1. The population census is carried out once in every ten years in India. The most recent population census in India was carried out in February, 2001 by house to house enquiry to cover all households in India.

2. Demographic data obtained by census method on death rates and birth rates, literacy, work force, life expectancy and composition of population etc. are published by Registrar General of India.

3. The data relating to estimation of the total area under principal crops in India are obtained by using village records maintained regularly by Patwari.

Let us review the census data in Table 2 regarding the relative growth of Urban and Rural Population in India, obtained from Census Reports and Economic Survey 2002-2003.

TABLE 2

Relative Growth of Urban and Rural Population in India

        Population (in crore)          As Percentage of Total Population
Year    Urban    Rural    Total        Urban    Rural
1901     2.58    21.25    23.83        10.9     89.1
1911     2.59    22.62    25.21        10.3     89.7
1921     2.80    22.32    25.12        11.2     88.8
1931     3.35    24.54    27.89        12.1     87.9
1941     4.41    27.44    31.85        13.8     86.2
1951     6.24    29.87    36.11        17.3     82.7
1961     7.89    36.02    43.91        18.0     82.0
1971    10.89    43.93    54.82        19.9     80.1
1981    16.22    52.11    68.33        23.7     76.3
1991    21.76    62.87    84.63        25.7     74.3
2001    28.50    74.2    102.7         27.8     72.2

Source : Census Reports and Economic Survey 2002-2003.

Rural areas still account for the larger part of India's population. In 2001, 74.2 crore persons, out of a total population of about 102.7 crore, lived in around 5.5 lakh villages, and the rest lived in towns or urban areas. In 1901, a population of around 24 crore lived in rural and urban areas. The urban population then formed about 11 per cent and the rural population 89 per cent of the total. The urban population had gone up to around 28 per cent in 2001, while still over 72 per cent of the people lived in rural areas. The above table shows the relative growth of rural and urban population in India since 1901.

The net addition to rural population between 1991 and 2001 was 11.33 crore, while urban population increased by 6.74 crore persons. There was an increase of 2.1 per cent in the growth rate of urban population in the decade ending 2001 over the preceding decade.
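The net additions and percentage shares discussed above can be re-derived from the 1991 and 2001 rows of Table 2. The short Python sketch below does only this arithmetic; the variable names are illustrative.

```python
# Population figures in crore, taken from the 1991 and 2001 rows of Table 2.
urban = {1991: 21.76, 2001: 28.50}
rural = {1991: 62.87, 2001: 74.2}
total_2001 = 102.7

# Net addition between the two censuses (2001 figure minus 1991 figure).
rural_addition = round(rural[2001] - rural[1991], 2)
urban_addition = round(urban[2001] - urban[1991], 2)

# Share of each group in the 2001 total, as a percentage.
urban_share = round(urban[2001] / total_2001 * 100, 1)
rural_share = round(rural[2001] / total_2001 * 100, 1)

print("Net addition, rural (crore):", rural_addition)
print("Net addition, urban (crore):", urban_addition)
print("Urban share in 2001 (%):", urban_share)
print("Rural share in 2001 (%):", rural_share)
```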

SAMPLE SURVEYS

We may study a sample drawn from the large population and, if that sample is adequately representative of the population, we should be able to arrive at valid conclusions. This is the sample method of collecting data. In the above example, collecting the weights of 50 girls out of 500 girls in the Senior Secondary School is the sample method of collection. In this method a few students are considered as a sample for our study.

Following are a few common examples of sampling :

(a) We look at a handful of grain to evaluate the quality of wheat, rice or pulses, etc.

(b) A few electric bulbs out of each lot are tested.

(c) A drop of blood is tested for diseases like malaria or typhoid, etc.

^ ^ fudtrnfof^'tC ^^^ » P-^-ion for final

(e) The television network provides election coverage by exit polls, and predictions are made on that basis.

In statistical terminology, population or universe does not mean the total number of people in an area; it means the total number of observations or items in the study. A sample is a part selected from a population.

methods of sampling

Broadly speaking, various methods of sampling can be grouped under two main heads : (a) Random Sampling, and (b) Non-Random Sampling.

Let us discuss now the various sampling methods which are popularly used in practice.

METHODS OF SAMPLING

(a) Random Sampling
    1. Simple or Unrestricted Random Sampling
    2. Restricted Random Sampling
       (i) Stratified Sampling
       (ii) Systematic Sampling or Quasi Random Sampling
       (iii) Cluster Sampling or Multi-stage Sampling

(b) Non-Random Sampling
    (a) Judgement Sampling
    (b) Quota Sampling
    (c) Convenience Sampling

random sampling

Random Sampling is one where the individual units (samples) are selected at random. It is also called probability sampling.

Random sampling does not mean unsystematic selection of units. It means that the chance of each item of the universe being included in the sample is equal. The term 'Random Sampling' is not used here to describe the data in the sample; it refers to the process used for selecting the sample. Following are the methods of random sampling.

Simple or Unrestricted Random Sampling

This method is known as simple or unrestricted random sampling. In this method the selection of items is not determined by the investigator; the process used to select the items of the sample decides the chances of selection. Each item of the universe has an equal chance of being included in the sample. It is free from discrimination and human judgement. Random sampling is the scientific procedure of obtaining a sample from the given population. It depends on the law of probability, which decides the inclusion of items in a sample. To ensure randomness, mechanical devices are used. There are two methods of obtaining the simple random sample. They are :

(a) Lottery Method, and

(b) Table of Random Numbers.

(a) Lottery Method : A random sample can generally be selected by this simple and popular method. All the items of the universe are numbered and these numbers are written on identical pieces of paper (slips). The slips are mixed in a bowl and then drawn one by one, the bowl being shaken before every draw. The numbers are picked out blindfolded. All slips must be identical in size, shape and colour to avoid biased selection.
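The lottery method described above can be imitated on a computer. The following Python sketch is an illustration only: the function name `lottery_draw`, the universe of 100 items and the sample of 10 are assumptions made for the example.

```python
import random

def lottery_draw(universe_size, sample_size, seed=None):
    """Imitate drawing numbered slips from a bowl, without replacement."""
    rng = random.Random(seed)
    slips = list(range(1, universe_size + 1))  # one identical slip per item
    drawn = []
    for _ in range(sample_size):
        rng.shuffle(slips)         # shake the bowl before every draw
        drawn.append(slips.pop())  # take out one slip; it is not put back
    return drawn

# Hypothetical universe of 100 numbered items, sample of 10.
selected = lottery_draw(universe_size=100, sample_size=10, seed=1)
print(selected)
```

Because a drawn slip is not returned to the bowl, no item can be selected twice, and every item still in the bowl has the same chance at each draw.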

Alternatively, the numbers may be written on identical discs, balls or metal pieces, which are rotated in a drum by a mechanical device so that one piece comes out each time. This is the method used in drawing lotteries.

(b) Table of Random Numbers : When the population is very large, the above procedures become difficult, and the selection is not truly random if the discs, balls or slips are not thoroughly mixed. There has, therefore, been a marked tendency to use tables of random digits for the purpose of drawing such samples. A table of random digits is simply a set of digits generated by a random process. The following are some of the tables of random digits :

(a) Tippett's Random Sampling Numbers. There are 10,400 numbers of 4 digits each.

(b) M.G. Kendall and Babington Smith's Random Sampling Numbers, having 1 lakh digits.

(c) Rand Corporation's A Million Random Digits.

(d) Snedecor's 10,000 random numbers.

(e) Fisher and Yates' Table, having 15,000 digits.

Tippett Numbers

2952  6641  3992  9792  7969  5911
3170  5624  4167  9524  1545  1396
7203  5356  1300  2693  2370  7483
3408  2762  3563  1069  5913  7691
0560  5246  1112  9025  6608  8126

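Suppose a sample of 15 students is to be selected from the students of a college. We will first number all the students. After numbering the students, we will consult a page of Tippett's numbers and pick up 15 successive numbers, either horizontally or vertically.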
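Reading a sample off a random number table can be sketched as follows. The 30 four-digit entries are the Tippett numbers shown above, read row by row; the college size of 8,000 students and the rule of skipping any entry larger than the population (or already chosen) are assumptions added for illustration.

```python
# The 30 Tippett numbers from the excerpt above, read horizontally.
TIPPETT_EXCERPT = [
    "2952", "6641", "3992", "9792", "7969", "5911",
    "3170", "5624", "4167", "9524", "1545", "1396",
    "7203", "5356", "1300", "2693", "2370", "7483",
    "3408", "2762", "3563", "1069", "5913", "7691",
    "0560", "5246", "1112", "9025", "6608", "8126",
]

def select_from_table(table, population_size, sample_size):
    """Pick successive table entries that correspond to a numbered unit."""
    chosen = []
    for entry in table:
        number = int(entry)
        # Skip entries with no matching unit, and entries already chosen.
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
        if len(chosen) == sample_size:
            break
    return chosen

# Hypothetical college with students numbered 1 to 8000; sample of 15.
roll_numbers = select_from_table(TIPPETT_EXCERPT, population_size=8000, sample_size=15)
print(roll_numbers)
```

The students bearing the selected numbers form the sample. If the table excerpt runs out before the sample is complete, more of the table has to be read.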

Merits

1. There are less chances of personal bias, as each item of the universe has an equal chance of being selected.

2. The sample tends to represent the universe, as the Law of Statistical Regularity begins to operate.

3. This method is economical as it saves time, money and labour in investigating a population.

4. The theory of probability is applicable, if the sample is random.

5. Sampling error can be measured.

Demerits

1. This requires a complete list of the population, but up-to-date lists are not available in many enquiries.

2. If the size of the sample is small, it will not be representative of the population.

3. When the distribution between items is very large, this method cannot be used.

4. The numbering of units and the preparation of the slips is quite time consuming and uneconomical, particularly if the population is large.

Restricted Random Sampling

They are as follows :

(i) Stratified random sampling : In this method the universe is divided into strata or homogeneous groups and an equal sample is drawn from each stratum or layer at random. This method is therefore useful when the population of the universe is not fully homogeneous. For example, suppose we want to know how much pocket money an average university student gets every month. We will take an equal sample from various strata, namely B.A. students, M.A. students, Ph.D. students, etc. Stratified random sampling is widely used in market research and opinion polls, as it is fairly easy to classify people into occupational, economic, social, religious and other strata. There are different types of stratified sampling :

(a) Proportional stratified sampling is one in which the items are taken from each stratum in the proportion of the units of the stratum to the total population.

(b) Disproportionate stratified sampling is one in which units in equal numbers are taken from each stratum irrespective of its size.

(c) Stratified weighted sampling is one where units are taken in equal number from each stratum, but weights are given to different strata on the basis of their size.
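The difference between proportional and disproportionate (equal) allocation can be sketched in Python. The stratum sizes below (6,000 B.A., 3,000 M.A. and 1,000 Ph.D. students) and all function names are hypothetical, chosen only to illustrate the pocket money example.

```python
import random

def proportional_allocation(strata_sizes, sample_size):
    """Units per stratum in proportion to the stratum's share of the population."""
    total = sum(strata_sizes.values())
    # Note: with awkward proportions, rounding may make the parts differ
    # slightly from sample_size in total.
    return {name: round(sample_size * size / total)
            for name, size in strata_sizes.items()}

def equal_allocation(strata_sizes, sample_size):
    """The same number of units from every stratum, irrespective of its size."""
    share = sample_size // len(strata_sizes)
    return {name: share for name in strata_sizes}

def draw_stratified(strata, allocation, seed=None):
    """Draw a simple random sample of the allotted size within each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(units, allocation[name])
            for name, units in strata.items()}

# Hypothetical stratum sizes for a university.
sizes = {"B.A.": 6000, "M.A.": 3000, "Ph.D.": 1000}
strata = {name: list(range(1, size + 1)) for name, size in sizes.items()}

prop = proportional_allocation(sizes, sample_size=100)
equal = equal_allocation(sizes, sample_size=90)
chosen = draw_stratified(strata, prop, seed=3)
print(prop)
print(equal)
```

In stratified weighted sampling the equal allocation would be kept, but each stratum's result would be weighted by its size when the overall average is worked out.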

Merits

1. The sample taken under this method is more representative of the universe as it has been taken from different groups of universe.

2. It ensures greater accuracy as each group (stratum) is so formed that it consists of uniform or homogeneous items.

3. It is easy to administer as universe is sub-divided.

4. Greater geographical concentration reduces the time and expenses.

5. For non-homogeneous population, it is more reliable.

6. When original population is not normal (skewed), this method is appropriate.


Demerits

1. Stratified sampling is not possible unless some information concerning the population and its strata is available.

2. If proper stratification is not done, the sample will have an effect of bias. If different strata of the population overlap, such a sample will not be a representative one.

(ii) Systematic sampling or quasi-random sampling : Systematic sampling is a simpler method. Selection is achieved by preparing a list of the universe in some random order, for example alphabetical order, and then taking every nth item from the list, where n stands for any number. Suppose we have a universe of 10,000 items and we want a sample of 1,000; then we have to take every 10th item, that is, n = 10. The method of selecting the first item from the list is to decide at random from the first 10 items. Suppose we pick up the 5th item. Then the other items will be the 15th, 25th, 35th, and so on until we have got our full sample. This method assumes that the list of the universe is fully random and that there are no inherent periodicities in the list.
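The selection rule for systematic sampling can be written out as a short Python sketch using the figures of the example (universe of 10,000, sample of 1,000, first item the 5th). The function name is illustrative; in practice the starting position would itself be chosen at random, for instance with `random.randint(1, n)`.

```python
def systematic_sample(universe_size, sample_size, start):
    """Positions in the list: the starting item, then every n-th item after it."""
    n = universe_size // sample_size  # sampling interval
    if not 1 <= start <= n:
        raise ValueError("start must lie within the first interval")
    return list(range(start, universe_size + 1, n))[:sample_size]

# Figures from the example: 10,000 items, sample of 1,000, start at the 5th item.
positions = systematic_sample(10_000, 1_000, start=5)
print(positions[:4], "...", positions[-1])
```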

Merits

1. It is systematic, very simple and convenient, and checking can also be done quickly.

2. In this method time and work are much reduced.

3. The results are also found to be generally satisfactory.

Demerits

1. Once the first item is selected, random selection will not be a determining factor in the selection of the sample.

2. It is feasible only if the units are systematically arranged.

3. If the universe is arranged in a wrong manner, the results will be misleading.

(iii) Cluster sampling or multi-stage sampling : When we have a large universe, the strategy is to divide and sub-divide the universe according to its characteristics. Thus if a survey is to be conducted in a country, it will first be divided into zones or states or regions, then into smaller units such as cities, towns and villages, and then into localities and households. At each stage a sample is selected by the random method from the list of units.

non-random sampling

The following are the methods of non-random sampling :

(a) Judgement or purposive sampling

(b) Quota sampling

(c) Convenience sampling

Judgement Sampling

This is also called purposive or deliberate sampling. In this method individual items of the sample are selected by the investigator consciously, using his judgement. Therefore, it requires that the investigator should have a good knowledge of the universe and some experience in the field of investigation. Obviously, the choice of samples will vary from one investigator to another. For example, from a universe of 10,000 ladies who use a particular brand of hairdye, the investigator will select a sample of, say, 1,000. His choice of this sample will be such that it is representative of the universe. For this an exercise of judgement is required.

In order for judgement sampling to be reliable, it should be free from individual bias or prejudice. Since the choice of sample is not based on probability, it does not guarantee accuracy and it makes the detection of sampling errors difficult. However, this method is useful in solving a number of kinds of problems in business and economics.

The purposive or judgement sampling is suitable in the following conditions :

(a) The number of items in the universe is small, so that some items with important characteristics are likely to be left out by random selection.

(b) When small sized sample is to be drawn.

(c) When some known characteristics of the universe are to be intensively studied.

(d) It is also appropriate for pilot survey.

Quota Sampling

It is a method of sampling that saves time and cost and is commonly used in surveys of political, religious and social opinion. Interviewers are allotted definite quotas of the universe and they are required to interview a certain number from their quota. Quotas are decided on the basis of the proportion of persons in various categories. In other words, the investigator is given instructions about how many interviews should be taken, say, in a given locality and what proportion should be from, say, upper, middle and lower income groups, or by some other classification which is predetermined. For example, for a study of truancy (running away) from school in Delhi, the investigators are allotted quotas of, say, 10 schools each, out of which two should be public schools (Boys), one public school (Girls), three Boys' Senior Secondary Schools, two Girls' Senior Secondary Schools and two Co-education Schools, and from each school he is asked to interview 50 students, taking 10 students each from Classes VIII, IX, X, XI and XII. The interviewer can select any 10 students according to his own judgement.

It is a kind of judgement sampling and provides satisfactory results only when interviewers are carefully trained and personal prejudice is kept out of the process of selection.

Convenience Sampling

In this method the sample is selected on the basis of convenience. For example, for the study of truancy the investigator selects a school or schools in his neighbourhood, as it is convenient for him to get to these schools. This method is used when the universe is not clearly defined, the sample unit is not clear or a complete source list is not available. The sample may be obtained from easily available lists, such as a telephone directory. The results obtained by this method can hardly be truly representative of the universe and are generally unsatisfactory.

ADVANTAGES OF SAMPLING

Sampling takes less time than a complete enumeration and, therefore, sampling is very useful in getting quick results. Sampling is also in some ways more reliable than the census method. In some cases sampling is the only possible method, for example in testing the quality of bulbs or bottles manufactured in a factory.

Reliability of sample data

The main purpose of sampling is to collect maximum information with minimum expenditure of money, time and labour and yet achieve a high degree of accuracy and reliability. For ensuring reliability certain principles must be followed. In the sampling method it is presumed that whatever conclusions are drawn from a sample are also true for the whole population. This presumption is based mainly on the following two laws :

(a) The Law of Statistical Regularity, and

(b) The Law of Inertia of Large Numbers.

(a) Law of statistical regularity : The law of statistical regularity is derived from the mathematical theory of probability. It says that a comparatively small group of items chosen at random from a very large group will, on the average, possess the characteristics of the large group. Basically, it applies to random selection. In the process of random sampling each unit of the universe has an equal chance of being selected. Therefore, the selected items can be said to be representative of the universe. Although the law is not as accurate as a scientific law, it does ensure a reasonable degree of accuracy. Since there is a certain regularity in natural phenomena, we assume a certain uniformity in nature. Random sampling is said to follow the law of statistical regularity because of this basic uniformity in a universe.

(b) Law of inertia of large numbers : This law is also called the law of stability of mass data. It is based on the law of statistical regularity. Basically, it states that if the numbers involved are very large, the change in a sample is likely to be very small. In other words, the individual units of a universe vary continually but the total universe changes slowly. That is, large aggregates are more stable than small ones. Because of the slow change in the nature of the total universe, this law is called the law of inertia (laziness) of large numbers. For example, the sugar production of a factory may vary significantly from year to year, but the sugar production of a country as a whole will remain comparatively stable. Or the male-female ratio of a family may change appreciably over a short period, but the male-female ratio of a country as a whole will remain nearly stable over the period. To take another example, if a coin is tossed 6 times we may get heads four times and tails two times. But if a coin is tossed 5,000 times, there is a high probability of getting heads and tails close to 2,500 times each. This happens due to the operation of this law. That is, when one part of a large group is changing in one direction, the other moves in the opposite direction.
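The coin tossing example can be tried out with a short simulation. This is only a sketch: the seed and the intermediate run of 50 tosses are arbitrary choices, and the exact counts will differ from run to run when no seed is fixed.

```python
import random

def heads_proportion(tosses, rng):
    """Toss a fair coin the given number of times; return the share of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(tosses))
    return heads / tosses

rng = random.Random(2024)  # fixed seed so that the run can be repeated
for n in (6, 50, 5000):
    print(f"{n:>5} tosses: proportion of heads = {heads_proportion(n, rng):.3f}")
```

With only 6 tosses the proportion of heads can be far from one half, while with 5,000 tosses it stays very near one half, which is what the law of inertia of large numbers describes.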

Thus, reliability of sampling depends mainly on randomness of selection of data and the large size of universe, expressed by the above two laws.

Statistical Errors

There is a great difference in the meaning of mistake and error in statistics. Mistake means a wrong calculation or the use of an inappropriate method in the collection or analysis of data. Error, on the other hand, means the difference between the approximated value (estimated value) and the actual value (true value); this is called statistical error in a technical sense. For example, we make an estimate that 1,000 persons are present at a particular meeting, but on counting we find 1,030 persons. There is a difference of 30 between the estimated value and the counted value. This difference is called 'error' in statistics. But when a wrong method is used or the calculation is done wrongly, the resulting differences are known as mistakes. For example, if we send a person to count the audience at a meeting and he counts it wrongly, that is a mistake.

Sources of Errors

The following errors are likely to occur in the collection of data :

1. Errors of origin arise on account of inappropriate definitions of statistical unit, scale, or a defective questionnaire, etc. For example, a wrong scale may be used for measurement, or height may be measured to the nearest inch or only approximately. Differences may also occur due to differences in measuring tapes caused by manufacturing defects. In Physics or Chemistry such errors of measurement will occur while taking readings on various instruments.

2. Errors of inadequacy arise due to incomplete data, inadequate size of sample, non-response of respondents, incomplete answers in the questionnaire, misinterpretation of questions in the questionnaire, careless or unqualified investigators, etc.

3. Errors of manipulation arise in arithmetic calculation due to clerical errors, arithmetic slips, etc., such as omitting some figure, considering a wrong value or making wrong totals, by the respondent or the investigator.

4. Errors of interpretation arise when statisticians misinterpret the data.

Types of Errors

(a) Absolute and relative errors : Absolute error is the difference between the actual (true) value and the estimated (approximate) value, while relative error is the ratio of the absolute error to the approximated value.

Absolute error = Actual value - Estimated value. Symbolically,

$$U_e = U' - U$$

Relative error = (Actual value - Estimated value) / Estimated value. Symbolically,

$$e = \frac{U' - U}{U}$$

Here, $U_e$ = Absolute error, $e$ = Relative error, $U'$ = Actual value, $U$ = Approximate value.

Illustration. Sales of a commodity were approximated at Rs 497, while the actual sales were Rs 500.

$$U_e = 500 - 497 = 3$$

$$e = \frac{500 - 497}{497} = \frac{3}{497} \approx 0.006$$

Relative error (e) can also be represented in percentage :

$$\frac{3}{497} \times 100 \approx 0.6\%$$

Relative error is generally used in statistical calculations because absolute error alone takes no account of the size of the quantity measured and can therefore be misleading.
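The two measures can be computed with a few lines of Python. The sketch follows the definition stated above, in which the absolute error is divided by the estimated (approximated) value; the function names are illustrative.

```python
def absolute_error(actual, estimated):
    """Absolute error: actual value minus estimated value."""
    return actual - estimated

def relative_error(actual, estimated):
    """Relative error: absolute error divided by the estimated value."""
    return (actual - estimated) / estimated

# Figures from the illustration: actual sales Rs 500, approximated sales Rs 497.
ue = absolute_error(500, 497)
e = relative_error(500, 497)
print("Absolute error:", ue)
print("Relative error:", round(e, 3))
print("Relative error in per cent:", round(e * 100, 1))
```

An absolute error of 3 on sales of about Rs 500 is small, but the same absolute error on sales of Rs 10 would be large; the relative error brings out this difference.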

(b) Biased and unbiased errors : Biased errors arise due to some prejudice or bias in the mind of the investigator or the informant, or in any measurement instrument. Suppose the enumerator used the deliberate sampling method in place of the simple random sampling method; the resulting error is called a biased error. These errors are cumulative in nature and increase when the sample size increases. Biased errors arise due to a faulty process of selection, faulty work during the collection of information and a faulty method of analysis.

Unbiased errors are not the result of any prejudice or bias. They are those which arise accidentally, just on account of chance, in the normal course of investigation. Unbiased errors are generally compensating.

(c) Sampling and non-sampling errors : The errors arising on account of drawing inferences about the population on the basis of a few observations (sampling) are called sampling errors. The errors mainly arising at the stages of ascertainment and processing of data are called non-sampling errors. They are common both in census enumeration and sample surveys.

To avoid these errors, the statistician must take proper precaution and care in using the correct measuring instrument. He must see that the enumerators are also not biased. Unbiased errors can be removed with proper planning of statistical investigations.

Statisticians should have none of these errors.
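The statement that biased errors are cumulative while unbiased errors are compensating can be illustrated with a small simulation. The figures are hypothetical: every biased measurement is assumed to overstate the true value by 0.5 unit, and every unbiased measurement is assumed to be off by a chance amount between -0.5 and +0.5.

```python
import random

n = 1000                      # number of measurements (hypothetical)
rng = random.Random(0)        # fixed seed so that the run can be repeated

# Biased errors: every measurement is overstated by the same 0.5 unit.
biased_errors = [0.5] * n

# Unbiased errors: chance errors, as likely to be positive as negative.
unbiased_errors = [rng.uniform(-0.5, 0.5) for _ in range(n)]

total_biased = sum(biased_errors)
total_unbiased = sum(unbiased_errors)
print("Total of biased errors   :", total_biased)
print("Total of unbiased errors :", round(total_unbiased, 2))
```

The biased errors add up in one direction and grow with the number of measurements, whereas the positive and negative chance errors largely cancel one another.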


how secondary data is collected

Secondary data are those which are collected by some other agency and are used for further studies. It is not necessary to conduct special surveys and investigations. We can obtain the required statistical information from other institutions, or from reports which are already published by them as a part of their routine work. It saves the cost and time which are involved in the collection of primary data. Secondary data may be either (a) published or (b) unpublished.

Published Sources

The various sources of published data are as under :

(i) Government publications : Different ministries and departments of Central and State Governments regularly publish current information along with statistical data on a number of subjects. This information is quite reliable for related studies. Some examples of such publications are : Annual Survey of Industries, Labour Gazette, Agriculture Statistics of India, Indian Trade Journal, etc.

(ii) Publications of international organisations : We can obtain valuable international statistics from official publications of different international organisations, like the United Nations Organisation (UNO), International Labour Organisation (ILO), International Monetary Fund (IMF), World Bank, etc.

(iii) Semi-official publications : Local bodies such as Municipal Corporations, District Boards etc. publish periodical reports which give factual information about health, sanitation, births, deaths etc.

(iv) Reports of committees and commissions : Various Committees and Commissions are appointed by the Central and State Governments for some special study and recommendations. The reports of such committees and commissions contain valuable data. Some of the reports are : Report of National Agricultural Commission, Report of the Tariff Commission, the Patel Committee Report etc.

(v) Private publications :

(a) Journals and newspapers. Journals like Eastern Economist, Journal of Industry and Trade, Monthly Statistics of Trade, and newspapers like Financial Express and Economic Times, collect and regularly publish data on different fields of economics, commerce and trade.

(b) Research institutions. There are a number of institutions doing research on allied subjects. This is the most important source of obtaining secondary data. The National Council of Applied Economic Research and the Foundation of Scientific and Economic Research are such institutions. Research scholars at the university level also contribute significantly to the availability of secondary data.

(c) Professional trade bodies. Chambers of Commerce and Trade Associations publish statistics relating to trade and commerce. Federation of Indian Chambers of Commerce, Institute of Chartered Accountants, Sugar Mills Association, Bombay Mill Owners Association, Stock Exchanges, Banks and Cooperative Societies, Trade Unions, etc. publish statistical data.

(d) Annual reports of joint stock companies are also useful for obtaining statistical information. These are published by companies every year.

^^^ ^md^'' also, provide valuable data for reseat


Unpublished Data

Research institutions, trade associations, universities, labour bureaus, research workers and scholars do collect data but they normally do not publish it. Apart from the above sources, we can get the information from records and files of government and private offices.

Limitations of Secondary Data

One should use the secondary data with care and full precaution and should not accept them at their face value as they may be suffering from the following limitations:

1. They may not have been collected by proper procedure.

2. They may not be suitable for a required purpose. The information which was collected on a particular base may not be suitable and relevant to an enquiry.

3. They may have been influenced by the biased investigation or personal prejudices.

4. They may be out of date and not suitable to the present period.

5. They may not satisfy a reasonable standard of accuracy.

6. They may not cover the full period of investigation.

Precautions in the Use of Secondary Data

The investigator should consider the following points before using the secondary data :

(a) Are the data reliable?

(b) Are the data suitable for the purpose of investigation?

(c) Are the data adequate?

(d) Are the data collected by proper method?

(e) From which source were the data collected?

(f) Who has collected the data?

(g) Are the data biased?

Thus, the secondary data should not be used at their face value. It is risky to use such statistics collected by others unless they have been properly scrutinised and found reliable, suitable and adequate.

important sources of secondary data (census of india and national sample survey organisation)

There are various sources and organisations through which statistical data are being compiled in India. Since India achieved Independence, great and rapid strides have been made in the field of collection of data. In the context of economic planning, the importance of statistics (data) in the country has become great. Statistics are necessary for framing economic plans and judging their progress. The study of Indian statistics is made under the following heads :

I. Statistical Organisation of India (CSO)

II. Indian Statistical Material.

This can be studied under the following sections :

(A) Agriculture Statistics (B) National Income and Social Accounting

(C) Population Statistics (D) National Sample Survey

(E) Price Statistics (F) Industrial Statistics

(G) Trade Statistics (H) Financial Statistics

(I) Labour Statistics

There are some agencies, both at the national and state level, which collect, process and tabulate statistical data. Some important major agencies at the national level are Census of India, National Sample Survey Organisation (NSSO), Labour Bureau, Central Statistical Organisation (CSO), Registrar General of India (RGI), Director General of Commercial Intelligence and Statistics (DGCIS), etc.

census of india

India has the unique experience of undertaking the biggest census in the world in 1981 and also has an unbroken record of more than a hundred years of decadal censuses. The Indian census is universally acknowledged as the most authentic and comprehensive source of information about our land and people. In 1869 Hunter was appointed Director General of Statistical Surveys. He not only elaborated the statistical system but also assisted the statistical surveys of districts and provinces, which later developed into the famous Gazetteers. He advised on the conducting of the census of India, which undertook exploratory surveys from 1869 to 1872 and thereafter matured into a decennial census which has continued ever since without interruption. After 1872 the next census was taken in 1881, and since then a census has been held regularly every ten years. The Census of India provides the most complete and continuous demographic record of our country. The first census after Independence was held in 1951 and the latest one was completed in 2001. The study of population is important for several reasons in the overall study of economic development. Information on demographic characteristics includes birth and death, fertility, sex ratio, age-composition, migration and literacy etc. The economic characteristics of population are manifested through workers' participation in economic activities, classification of workers in various occupations, employment etc.

The data generated by the Census of India 2001 provide benchmark statistics on the people of India at the beginning of the new millennium. This is a mirror giving a fair representation of the socio-economic and demographic condition of our people, who constitute about one-sixth of the human population on this planet. The census statistics are useful for assessing the impact of the developmental programmes and for identifying new thrust areas for focussing the efforts on improving the quality of life in our country. Basic population data from the Primary Census Abstract, Census of India 2001, give information on population in India as :

TABLE 3

Persons          Males          Females        Sex Ratio
1,028,610,328    532,156,772    496,453,556    933

national sample survey organisation (nsso)

The National Sample Survey (NSS), initiated in the year 1950, is a nationwide, large scale, continuous survey operation conducted in the form of successive rounds. It was established on the basis of a proposal from Prof. P.C. Mahalanobis to fill up data gaps for socio-economic planning and policy making through sample surveys. In March 1970, the NSS was reorganised and all aspects of its work were brought under a single Government organisation, namely the National Sample Survey Organisation (NSSO), under the overall direction of a Governing Council to impart objectivity and autonomy in the matter of collection, processing and publication of the NSS data.

The Governing Council consists of 18 experts from within and outside Government and is headed by an eminent economist/statistician. The member-secretary of the council is the Director General and Chief Executive Officer of NSSO. The Governing Council is empowered to take all technical decisions in respect of survey work, from planning of survey to release of survey results. The NSSO, headed by a Director General and Chief Executive Officer, has four divisions, namely Survey Design and Research Division (SDRD), Field Operation Division (FOD), Data Processing Division (DPD) and Coordination Publication Division (CPD). A Deputy Director General heads each division except FOD. An Additional Director General heads FOD.

Functions of NSSO

The functions of National Sample Survey Organisation are :

(i) Collection of data on socio-economic conditions, production of small scale household enterprises, consumption etc. on a continuous basis in a comprehensive manner for the whole country. A major objective of the NSS has been to provide data required to fill up the gaps in information needed for estimation of national income.

(ii) Collection of data relating to the organised industrial sector of the country.

(iii) Supervision of surveys conducted by states in the agricultural sector through their own agencies, and also giving guidance to them for analysing and coordinating the results of these surveys.

The NSSO took a forward view of the data requirements of planners, research workers and other users and drew up a long term programme. The programme conducts periodical surveys on :

(a) Demography, health and family planning;

(b) Assets, debt and investment;

(c) Land holdings and livestock enterprises;

(d) Employment and unemployment, rural labour and consumer expenditure; and

(e) Self employment in non-agricultural enterprises.

The data collected by NSSO surveys on different socio-economic subjects are released through reports and its quarterly journal 'Sarvekshana'. The data comprise different socio-economic subjects like employment, unemployment, literacy, maternity, child care, utilisation of public distribution system, utilisation of educational services etc. The survey conducted in 2004 was on morbidity and health care. Apart from collection of rural and urban retail prices for compilation of consumer price index numbers, NSSO also undertakes field work of the Annual Survey of Industries and conducts crop estimation surveys.

EXERCISES

What do you understand by Statistical Enquiry? Explain.

... merits and ...

Discuss the comparative merits of various methods of collecting primary data.

... investigations ...

What are the similarities and dissimilarities between the two methods: questionnaires to be filled in by informants and schedules to be filled in by enumerators? Explain with examples.

What is a questionnaire? Give a specimen of a questionnaire.

Describe the questionnaire method of collecting primary data. What precautions must be taken while preparing a questionnaire?

Write short notes on :

(a) Census of India

(b) National Sample Survey Organisation (NSSO)

What is secondary data? Discuss the various sources of collecting secondary data.

What precautions should be taken before using secondary data? Explain.

... constructing interview schedules and questionnaires.

Frame at least four appropriate multiple choice options for the following questions :

(i) How often do you use computers?

(ii) What is the monthly income of your family?

(iii) Rise in petrol price is justified :

(iv) Which of the newspapers do you read regularly?

(v) Which of the following is most important when you buy a new dress?

Frame two way questions (with 'Yes' or 'No').

State whether the following statements are true or false :

(i) Data collected by investigator is called secondary data.

(ii) There are many sources of data.

(iii) Telephone survey is the most suitable method of collection of data when the population is literate and spread over a large area.

16. Distinguish between population and sample.

17. Distinguish between census and sample surveys. List four important types of sampling methods. Explain the reasons for preferring sample surveys in the collection of data.

18. Name the methods of selecting a sample. Describe the method of stratified sampling with merits and demerits.

19. The Education Ministry is interested in determining the level of education of unmarried girls in the country. How would you organise a survey for this purpose?

20. Does the lottery method always give you random sample? Explain.

21. Do samples provide better results than surveys? Give reasons for your answer.

22. Examine the important types of sampling methods.

23. Distinguish between random sampling and systematic sampling. Give suitable examples.

24. Discuss briefly the following :

(a) Law of inertia of large numbers

(b) Law of statistical regularity

25. What do you understand by 'Census' investigation? Explain its suitability with illustrations.

26. What do you mean by 'Sample' investigation? Explain its suitability with illustrations.

27. Explain briefly the different methods of sampling. Give illustrations.

28. Give a comparative study of stratified sampling and multi-stage sampling.

29. Write a critical note on the random sampling method.

30. How would you distinguish convenience sampling from judgement (deliberate) sampling? Explain.

31. What do you mean by statistical errors?

32. Discuss briefly the following :

(a) Biased and unbiased errors

(b) Absolute errors and relative errors

(c) Sampling and non-sampling errors

33. Give two examples each of sample, population and variable.

34. Which of the following methods gives better result and why? (a) Census (b) Sample

Chapter 4

ORGANISATION OF DATA

(a) Classification

1. Definition

2. Objects of Classification

3. Characteristics of Classification

4. Methods of Classification

(b) Statistical Series

1. Definition

2. Types of Series

3. Frequency Distribution

(a) CLASSIFICATION

The quantitative information collected in any field of society or science is never uniform. The values always differ from one another, e.g., prices of vegetables, students in different sections, income of families, time in different watches, height or weight of students. A single numerical item out of all the observations of a group may be called a variate or variable, e.g.,

Price of potato is Rs 10.00 per kg, in a group of vegetable prices.

Students in Section A are 50, in a group of different sections.

Income of family D is Rs 10,000 per month, in a group of families.

Time in HMT watch is 10.45 a.m., in a group of different watches.

Height of Rajesh is 60", in a group of students.

Variate can also be called 'variable' or 'magnitude' or 'observation' or 'item' or 'measure' or 'value'.

The characteristics which are not capable of being measured quantitatively are called attributes. For example, blindness, deafness, literacy, sickness, tall and short, black and blue eyed, intelligence, aptitude for art and music, etc. They cannot be measured numerically in the same way as heights and weights, or, price and incomes. Individuals may be ranked according to quality of attributes. The ranks are sometimes used as their numerical values for purposes of statistical analysis.

The collected data (whether by primary or secondary method) are always in an unorganised form in schedules, questionnaires or some other written form. Collected data in this unorganised form are called RAW DATA. Because of the limited capacity of the human mind to understand such complex, varied and unorganised data, it is necessary to make them available for comparison, analysis and appreciation by proper and suitable grouping and arrangement in condensed form. The process of grouping into different classes or subclasses according to characteristics is called classification. The classified information arranged in a logical and systematic order in a particular sequence is called seriation or statistical series. The classified information presented in precise and systematic tables is called tabulation. In other words, classification is for division of data, seriation is for arrangement of data in a systematic order and tabulation is for presentation of data in a table.

DEFINITION

According to Professor Connor, "Classification is the process of arranging things (either actually or notionally) in groups according to their resemblances and affinities, and gives expression to the unity of attributes that may subsist amongst a diversity of individuals."

According to this definition, the chief features of classifications are :

(1) The facts are classified into homogeneous groups by the process of classification. All the units having similar characteristics are placed in one class or group.

(2) The basis of classification is unity in diversity.

(3) The classification may be either actual or notional.

(4) The classification may be according to either attributes or characteristics or measurements.

Classification is grouping of data according to their identity, similarity, or resemblances. For example, letters in the post office are sorted out in groups of cities and towns of destination, viz., Delhi, Chennai, Agra, Chandigarh etc. Similarly, students in a school may be grouped as boys and girls, or according to age; in a library the books and periodicals are classified and arranged according to subjects; students are classified according to the division they secured in a certain examination; animals or plants may be grouped according to origin or structure etc.

OBJECTS OF CLASSIFICATION

The chief objects of classification are :

1. To present the facts in a simple form : The classification process eliminates unnecessary details and makes the mass of complex data simple, brief, logical and understandable. For example, the data collected in a population census are so huge and fragmented that it is not possible to draw any conclusion from them. When these massive figures are classified according to sex, education, marital status, occupation etc., then the structure and nature of the population can easily be understood.

2. To bring out clearly points of similarity and dissimilarity : Classification brings out clearly the points of similarity and dissimilarity of the data so that they can be easily grasped. Facts having similar characteristics are placed in a class, such as educated, uneducated, employed, unemployed etc.

3. To facilitate comparison : Classification of data enables one to make comparisons, draw inferences and locate facts. This is not possible with unorganised and unclassified data. If marks obtained by B. Com. students in two colleges are given, no comparison can be made of their intelligence level. But classification of students into first, second, third and failure classes on the basis of marks obtained by them will make such comparison easy.

4. To bring out relationship : Classification helps in finding out cause-effect relationship, if there is any in the data. For example, data of small-pox patients can help in finding out whether small-pox cases occurred more in the vaccinated or the unvaccinated population.

5. To present a mental picture : The process of classification enables one to form a mental picture of objects of perception and conception. Summarised data can easily be understood and remembered.

6. To prepare the basis for tabulation : Classification prepares the basis for tabulation and statistical analysis of the data. Unclassified data cannot be presented in tables.

CHARACTERISTICS OF CLASSIFICATION

It is important that the classification should possess following characteristics :

1. Classification should be unambiguous : Classification is meant for removing ambiguity. It is necessary that the various classes should be so defined that there is no room for doubt and confusion, and that there is a class for each item of data.

2. The classes must not overlap : Each item of data must find its place in one class and one class only. There must be no item which can find its way into more than one class.

3. Classification should be stable : If classification is not stable and has to be changed each time an enquiry is conducted, the data would not be fit for comparison. Therefore, the classification must proceed at every stage in accordance with one principle, and that principle should be maintained throughout.

4. Classification should be flexible : It should be flexible and should have the capacity of adjustment to new situations and circumstances. With change in time, some classes become obsolete and have to be dropped and fresh classes have also to be added.

5. Classification should be suitable to enquiry : Classification should be according to the objects of enquiry. If the investigation is carried on to enquire into the economic conditions of labourers, then it will be useless to classify them on the basis of their religion.

6. Classification should have arithmetical accuracy : The total of items included in different classes should tally with the total of the universe.
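Three of these characteristics (unambiguous classes, no overlap, and arithmetical accuracy) can be checked mechanically. The Python sketch below is only an illustration: the marks and the division boundaries are hypothetical and do not come from the text.

```python
# Hypothetical marks of ten students (illustrative data, not from the text).
marks = [72, 48, 35, 61, 55, 29, 90, 45, 60, 33]

# Classes defined so that they do not overlap: each is (lower, upper) with
# the lower bound included and the upper bound excluded. Boundaries are assumed.
classes = {
    "First": (60, 101),
    "Second": (45, 60),
    "Third": (33, 45),
    "Fail": (0, 33),
}

def classify(mark):
    """Return the names of all classes whose range contains the mark."""
    return [name for name, (low, high) in classes.items() if low <= mark < high]

# Unambiguous and non-overlapping: every mark falls in exactly one class.
each_in_one_class = all(len(classify(m)) == 1 for m in marks)

# Frequency of each class.
frequency = {name: 0 for name in classes}
for m in marks:
    frequency[classify(m)[0]] += 1

# Arithmetical accuracy: class totals tally with the number of items.
totals_tally = sum(frequency.values()) == len(marks)

print(frequency, each_in_one_class, totals_tally)
```

If a boundary were written so that two classes shared a value (say 45-60 and 60-75, both inclusive), `classify(60)` would return two names and the first check would fail.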

METHODS OF CLASSIFICATION

Data can be grouped on the following bases :

1. According to time (Chronological Classification)

2. According to area (Geographical Classification)

3. According to attributes (Qualitative Classification)

4. According to magnitudes or variables (Quantitative Classification)

For example,

Population of India

Year    Population (in crores)
1951    35.7
1961    43.8
1971    54.6
1981    68.4
1991    81.8
2001    102.7

OR

Year    Population (in crores)
2001    102.7
1991    81.8
1981    68.4
1971    54.6
1961    43.8
1951    35.7

(a) Alphabetical order (names of the countries)

Country                            America   Brazil   China   Denmark   France   India
Yield of wheat in kg (per acre)    1925      127      893     225       439      862

(b) Descending order (figures of the second row above)

Country                            America   China   India   France   Denmark   Brazil
Yield of wheat in kg (per acre)    1925      893     862     439      225       127

Note : The names of the countries do not appear in an alphabetical order in example (b).
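The two orderings in this example can be reproduced with a short Python sketch. The yield figures are as they can be read from the example and should be treated as illustrative; the variable names are assumptions made here.

```python
# Wheat yield in kg per acre, as read from the example (illustrative).
yield_by_country = {
    "America": 1925, "Brazil": 127, "China": 893,
    "Denmark": 225, "France": 439, "India": 862,
}

# (a) Alphabetical order of country names.
alphabetical = sorted(yield_by_country.items())

# (b) Descending order of yield.
descending = sorted(yield_by_country.items(), key=lambda item: item[1], reverse=True)

for country, kg in descending:
    print(f"{country:<8} {kg:>5}")
```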

For example,

Two-fold classification

[Diagram: POPULATION is divided into Males and Females; each is divided into Employed and Unemployed; each of these is divided into Married and Unmarried, giving eight classes numbered (1) to (8).]

... stages of classification ... clearly defined to avoid confusion in classification. ... weight, sales, ... formation of statistical series ... classification is made in the ...

Thus, there are 15 workers in the income group of Rs 100 to 199, ... in the income group of Rs 200-299 and so on.

(b) STATISTICAL SERIES

DEFINITION

... other, the result is ... the division of the data ... the collected and classified ...



TYPES OF SERIES

Statistical series can be classified in the following way :

STATISTICAL SERIES

On the Basis of General Character

1. Time Series (on the basis of time)

2. Spatial Series (on the basis of space)

3. Condition Series (on the basis of condition)

On the Basis of Construction

1. Series of Individual Observation

2. Discrete Series (frequency distribution)

3. Continuous Series (frequency distribution)

Series on the Basis of General Character

1. Time series : A series of values of some variable according to successive points in time is called time series. Data are presented with reference to some time unit, viz., year, month, week, or day. For example :

Sugar Production of a Factory

Year    Production (in '000 tons)
1999    78
2000    75
2001    94
2002    86
2003    89
2004    92
2005    95

Sale in Super Bazar (1st week of Jan. 2006)

Day        Sale (Rs)
Mon.       1,892
Tues.      2,757
Wednes.    3,090
Thurs.     2,650
Fri.       2,592
Satur.     3,822

2. Spatial series : A series of values of some variable according to geographical division of the universe under study is called a spatial series or geographical series. Data are presented with reference to some geographical division, viz., country, state, city, town, village or colony. For example :

Per Capita Income

Country    Per Capita Income
USA        5,100
France     3,900
Japan      2,800
Canada     2,100
India      500

Number of Schools

City         No. of Schools
Delhi        792
Mumbai       649
Chennai      573
Kolkata      532
Bangalore    459

3. Condition series : A series of values of some variable made according to a condition is called condition series. Data are presented with reference to some condition, viz., height, age, weight, income etc. For example :

Weekly Income of 100 Workers

Income (Rs.) No. of Workers

500-999 35

1000-1499 25

1500-1999 15

2000-2499 20

2500-2999 5


Series on the Basis of Construction

After collection and classification of data, the most important job is to arrange the data in order, that is, to form a series for further study through presentation, analysis and interpretation. This arrangement can be done in three ways :

(a) Series of Individual Observation, (b) Discrete Series, (c) Continuous Series.

(b) and (c) are with reference to frequency distribution.

Series of Individual Observation

Mass data in its original form is called raw data or unorganised data which can be arranged in any of the following ways :

(i) Serial order or alphabetical order, (ii) Ascending order, (iii) Descending order.

The mass data when put in ascending or descending order of magnitude is called an array. A series of individual observations is a series where items are listed singly after collection. They are not listed in groups.

Suppose an investigator has obtained the following information from a factory about the payment of daily wages of 30 workers, which is in unorganised form (Raw Data) as shown in Table 1.

TABLE 1

Daily Wages Paid to Workers (in Rupees)

60 102 61 101 92 80

87 72 86 73 96 101

92 56 90 58 85 74

83 63 84 62 92 100

56 84 90 86 67 72



The above raw data can be arranged either in serial order (Table 2) or ascending order (Table 3) or descending order (Table 4) as given below :

TABLE 2 Arranged in Serial Order

S. No.   Wages (Rs)   S. No.   Wages (Rs)   S. No.   Wages (Rs)
1        60           11       61           21       92
2        87           12       86           22       96
3        92           13       90           23       85
4        83           14       84           24       92
5        56           15       90           25       67
6        102          16       101          26       80
7        72           17       73           27       101
8        56           18       58           28       74
9        63           19       62           29       100
10       84           20       86           30       72

TABLE 3 Arranged in Ascending Order (Wages in Rupees)

56    62    73    84    90    96
56    63    74    85    90    100
58    67    80    86    92    101
60    72    83    86    92    101
61    72    84    87    92    102

TABLE 4 Arranged in Descending Order (Wages in Rupees)

102   92    87    84    72    61
101   92    86    83    72    60
101   92    86    80    67    58
100   90    85    74    63    56
96    90    84    73    62    56
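The arrays in Tables 3 and 4 can be reproduced from the raw data of Table 1 with a short Python sketch (offered only as an illustration; the variable names are chosen here). The tables list the sorted values column by column, whereas the sketch prints them as one sequence.

```python
# Daily wages (in rupees) of 30 workers, read row by row from Table 1.
wages = [
    60, 102, 61, 101, 92, 80,
    87, 72, 86, 73, 96, 101,
    92, 56, 90, 58, 85, 74,
    83, 63, 84, 62, 92, 100,
    56, 84, 90, 86, 67, 72,
]

ascending = sorted(wages)                 # the array of Table 3
descending = sorted(wages, reverse=True)  # the array of Table 4

print(ascending)
print(descending)
```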

FREQUENCY DISTRIBUTION

Before discussing frequency distribution, it is advisable to know its important terms. Frequency distributions are grouped into two types :

(a) Discrete Frequency Distribution.

(b) Continuous Frequency Distribution.

Terminology of Frequency Distribution

Examine the following two sets of illustrations to clearly understand the basic terminology of frequency distribution.

Set I. Children in Families

Children    No. of Families (f)
0           25
1           45
2           37
3           15
4           8
Total       130

Set II. Height of Students

Height in Inches    No. of Students (f)
56-58               12
58-60               16
60-62               15
62-64               4
64-66               10
Total               57

Series is a systematic arrangement of items into a particular order or sequence in different classified categories, as Set I for Children in Families and Set II for Height of Students.

Frequency : The number of times a given value appears in the observations is its frequency. For example, in the above sets there are 25 families having no child and 8 families having four children; and 10 students in the group of 64" to 66" and 16 students in the group of 58" to 60" etc. So the frequency of families having no child is 25, the frequency of families having 4 children is 8, the frequency of students in the group of 64" to 66" is 10, and the frequency in the group of 58" to 60" is 16.

Class frequency : The number of values in each of the quantitative classes is called the class frequency, e.g., out of the five classes of Set II, students in the group of 58" to 60" are 16 and students in the group of 62" to 64" are 4, so the class frequency of the class 58" to 60" is 16 and of 62" to 64" is 4. There is no instance of a class in Set I.

Total frequency : The sum (total) of the frequencies is known as the total frequency, e.g., the totals 130 and 57 in our Set I and Set II.

Frequency distribution : The distribution of observations over the several values is called frequency distribution. For example, Set I is the frequency distribution of children in families, and Set II is the frequency distribution of heights of students.

Class : It is a decided group of magnitudes, e.g., 56"-58", 100-200, 10-19, 4-8, 7-13 etc.

Upper and lower limits of the classes : The lowest and the highest magnitudes, which form the boundaries of a class, are known as the lower and upper limits, respectively. For example, for a class of 62-64, 62 is the lower limit and 64 is the upper limit. Thus in the first column of Set II, the left hand side magnitudes 56, 58, 60, 62 and 64 are the lower limits and the right hand side magnitudes 58, 60, 62, 64 and 66 are the upper limits of their respective classes.

Class interval : The magnitude spread between the lower and upper class limits is called class interval. It is the span or width of a class, which can be obtained by finding the difference between the upper and lower limits of the class. For example, for the class 64"-66", the class interval is upper limit ($l_2$) - lower limit ($l_1$), i.e., $l_2 - l_1 = 66 - 64 = 2$. The class interval in this case is 2, where $l_1$ is the lower limit and $l_2$ is the upper limit.

Mid-point : The mid-value which lies half way between the lower and upper class limits is known as mid-point. Thus, in a class of 62"-64" the mid-point is

$$\frac{\text{upper class limit} + \text{lower class limit}}{2}$$

or

$$\frac{l_2 + l_1}{2} = \frac{64 + 62}{2} = 63$$

Calculated mid-points are the most important values, being the representatives of the classes, and are taken for use in further statistical calculations.
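The class interval and mid-point of every class in Set II, together with the total frequency, can be worked out as in the Python sketch below. It is only an illustration; the tuple layout and variable names are assumptions made here.

```python
# Set II: (lower limit, upper limit, frequency) for each class of heights.
set_ii = [(56, 58, 12), (58, 60, 16), (60, 62, 15), (62, 64, 4), (64, 66, 10)]

rows = []
for lower, upper, freq in set_ii:
    class_interval = upper - lower   # l2 - l1
    mid_point = (upper + lower) / 2  # (l2 + l1) / 2
    rows.append((lower, upper, class_interval, mid_point, freq))

total_frequency = sum(freq for _, _, freq in set_ii)

for lower, upper, width, mid, freq in rows:
    print(f"{lower}-{upper}  interval={width}  mid-point={mid}  f={freq}")
print("Total frequency:", total_frequency)
```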

Variable : A quantity which varies from one individual to another is known as a variable or variate. Quantitative characteristics such as income, height, weight, number of units sold etc., are variables. A variable may be either discrete or continuous.

Discrete and Continuous Variables

Discrete or discontinuous variables are those which are exact or finite and are not normally fractions. They cannot manifest every conceivable fractional value, but appear by limited gradations. For example, children in a family can be either 2 or 3, but cannot be 2.2, 2.8 or 2.7. It is a discrete variable which is not expressed in a fraction. In the same way test scores of a cricket match, rooms in a house, workers in a factory, fans installed in an auditorium and students in a class are all examples of discrete variables. The observations will be integers, i.e., 1, 2, 3, 4, 5, 6, ... and so on. Thus the variable is said to be of a discrete type when there are gaps between one value and the next. For example, in Set I, the values 0, 1, 2, 3, 4 belong to a discrete variable.

Even fractional values are discrete or discontinuous variables, provided there is a uniform difference from one value to the next. For example, if the wage rate per unit is 50 paise then workers of a factory may get wages in rupees as : 0.50, 1, 1.50, 2, 2.50, 3, 3.50, and so on.

Continuous variables are those that are in units of measurement which can be broken down into infinite gradations, e.g., weights, heights, incomes, rainfall etc. They are capable of manifesting every conceivable fractional value (i.e., in decimals) within the range of possibilities. They can take any numerical value within a certain range. For instance, in covering a distance on a road by car, say from 0 kilometre to 5 kilometres, one never jumps from 0 to 1 km, 1 to 2 km, or 2 to 3 km, but every fraction of distance from 0 km to 5 km is touched. In other words, the car must pass through all the infinitely small gradations of distance between 0 km and 5 km. All the fractional values are continuous variables. Heights of students from 56" to 58" in our Set II, for example, cover all the fractional values falling within the limits of 56 and 58.

I I

62


(a) Discrete frequency distribution, and

(b) Continuous frequency distribution.

Conslruclion of Discrete Frequency Distribution

1. Prepare a table with three columns-first for variable under study, second for 'Tallv

magnitude to another may preferably be tL satne®

column

^^^lustration .. Prepare frequency table of ages of 25 students of XI Class in your Solution.

Frequency Distribution of Ages of 25 Students

Age

15

16

17

18 19

Tally bars Total (/)

mill 7

mi nil . 9

mi 5

III 3

1 1

Total 4 25


Class interval : The magnitude spread between the lower and upper class limits is called class interval. It is the span or width of a class which can be obtained by finding the difference between the upper and lower limits of the class. For example, .or c^ass 64"-66". the class interval is upper limit (Z^) - lower limit (/,), i.e., (l^-/,) - 6t,-b4 - 2. The class interval in this case is 2, is the lower limit and is the-upper limit.

Mid-point : The mid-value which lies half way between the lower and upper class limits is known as mid-point. Thus, in a class of 62"-64" the mid-point is

upper class limit + lower class limit 2

or

l2±k 2

64+62

= 63

Calculated mid-points are the most importam values, as being the representatives of the classes, and are taken for use in further statistical calculations.

Variable : A quantity which varies from one individual to another is known as a variable or variate. Quantitative characteristics such as income, height, weight, number of units sold etc., are variables. A variable may be either discrete or continuous.

Discrete and Continuous Variables

Discrete and Discontinuous Variables are those which are exac. or finite and are not normally fractions. They cannot manifest every conceivable fractional value, but appear by limited gradations. For example, children in a family can be either 2 or 3, but cannot be 2 2 2 8 or 2 7. It is a descrete variable which is not expressed in a traction. In the same way test scores of a cricket match, rooms in a house, workers in a factory, fans mstaUed in an auditorium, students in a class are all the examples of discrete variables. The occurrence of the observation will be integers, i.e., 1, 2, 3, 4, 5, 6, ... and so on. ihus the variable is said to be of a discrete type when there are gaps between one value and

the next. For example, in set I; 0, 1, 2, 3, 4 are discrete variables.

Even fractional values are discrete or discontinuous variables provided there is an uniform difference from one variable to the other variable. For example, if wage rate per unit is 50 paise then workers of a factory may get wages in rupees as : 0.50, 1, 1.5U, Z,

2.50, 3, 3.50, and so on.

Continuous variables are those that one in units of measurement which can be broken down into infinite gradations, e.g., weights, heights, incomes, rainfall etc. They are capable of manifesting every conceivable fractional value {i.e., in decimals) within the range ot possibilities. They fall in any numerical value within a certain range. For mstance covering a distance on a road, by car, say from 0 kilometre to 5 kilometres one never jumps from 0 to 1 km, 1 to 2 km, or 2 to 3 km, but every fraction of distance from 0 km to 5 km is touched. In other words, the car must pass through'all the infinitely small gradations of distance between 0 km to 5 km. All the fractional values are continuous variables Heights of students from 56" to 58" in our set II for example, cover all the fractional values falling within the limit of 56 and 58.



Discrete series : Any series represented by discrete variables is called a discrete series, e.g., Set I of the distribution of children in families is a discrete series.

Continuous series : Any series described by continuous variables is called a continuous series, e.g., Set II of the distribution of heights of students is a continuous series.

It is to be noted that a discrete variable series can be presented in a continuous type of series also, but continuous variables cannot be presented in a discrete series. Whenever the range of values in a discrete series is too wide, one can have the choice of a continuous frequency distribution.

Considering discrete and continuous series, now individual observations can be constructed and condensed in two ways :

(a) Discrete frequency distribution, and

(b) Continuous frequency distribution.

Construction of Discrete Frequency Distribution

1. Prepare a table with three columns—first for variable under study, second for 'Tally bars' and the third for the total, representing corresponding frequency to each value or size of the variable.

2. Place all the values of the variable in the first column in ascending order, beginning with the lowest and going to the highest. The gap between one magnitude and the next should preferably be the same.

3. Put bars (vertical lines) against the values in the second column according to the number of times a particular value repeats itself. This column is for facility in counting. Blocks of five bars are prepared and some space is left between each block of bars.

4. Count the number of bars in respect of each value in the variable and place it in the third column made for total or frequency.

Illustration 1. Prepare a frequency table of ages of 25 students of Class XI in your school.

Va 16, 15, 16, 16, 15, 17, 17,

lo, ly, 16, 15

Solution.

Frequency Distribution of Ages of 25 Students

Age      Tally bars       Total (f)
15       ||||| ||         7
16       ||||| ||||       9
17       |||||            5
18       |||              3
19       |                1
Total                     25


Illustration 2. In a city 45 families were surveyed for the number of domestic appliances they used. Prepare a frequency array based on their replies as recorded below.

1 3 2 2 2 2 1 2 1 2 2 3 3 3 3

3 3 2 3 2 2 6 1 6 2 1 5 1 5 3

2 4 2 7 4 2 4 3 4 2 0 3 1 4 3

Solution.

Frequency Array of Domestic Appliances Used by 45 Families

Number of Appliances    Tally bars              Frequency (f)
0                       |                       1
1                       ||||| ||                7
2                       ||||| ||||| |||||       15
3                       ||||| ||||| ||          12
4                       |||||                   5
5                       ||                      2
6                       ||                      2
7                       |                       1
Total                                           45

Thus, from the above table it is clear that out of 45 families 1 is not using any domestic appliance, 7 using 1 appliance, 15 using 2 appliances, 12 using 3 appliances, 5 using 4 appliances, 2 using 5, 2 using 6 appliances and only 1 family using 7 domestic appliances.
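The four construction steps for a discrete frequency distribution can be sketched in Python as follows. This is only an illustrative sketch: the replies are those of Illustration 2 as read from the printed list (one entry that is unclear in print is taken as 3, which agrees with the frequency of 12 shown in the solution table), and the helper name `tally` is an assumption made for this example.

```python
from collections import Counter

# Replies of the 45 families in Illustration 2 (number of appliances used).
# Assumption: one entry that is unclear in print is read as 3.
replies = [
    1, 3, 2, 2, 2, 2, 1, 2, 1, 2, 2, 3, 3, 3, 3,
    3, 3, 2, 3, 2, 2, 6, 1, 6, 2, 1, 5, 1, 5, 3,
    2, 4, 2, 7, 4, 2, 4, 3, 4, 2, 0, 3, 1, 4, 3,
]


def tally(count, block=5):
    """Return tally bars in blocks of `block` bars separated by a space."""
    full, rest = divmod(count, block)
    groups = ["|" * block] * full
    if rest:
        groups.append("|" * rest)
    return " ".join(groups)


# Steps 3 and 4: count how many times each value repeats itself.
frequency = Counter(replies)

# Steps 1 and 2: three columns, values in ascending order.
print(f"{'Appliances':<12}{'Tally bars':<22}{'Frequency (f)'}")
for value in sorted(frequency):
    print(f"{value:<12}{tally(frequency[value]):<22}{frequency[value]}")
print(f"{'Total':<34}{sum(frequency.values())}")
```

The counts produced this way can be compared line by line with the frequency array in the solution.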

Construction of Continuous Frequency Distribution

Observations are divided into groups having class intervals. There are two methods of classifying the data according to class intervals.

(a) Inclusive Method, and

(b) Exclusive Method.

(a) Inclusive Method : Under this method upper class limits of classes are included in the respective classes. For example, if the marks obtained by students are grouped as 5-9, 10-14, 15-19, 20-24, 25-29 etc., we include in the first group, 5-9, the students whose marks are between 5 and 9. If the marks of a student are 10, he is included in the next class, i.e., 10 to 14. If the values are not whole numbers, the classes can be made 5-9.9, 10-14.9, 15-19.9 and so on.

(b) Exclusive Method : Under this method upper limits are excluded. The upper limit of a class interval is the lower limit of the next class. For example, if the marks obtained by the students are grouped as 5-10, 10-15, 15-20, 20-25, 25-30 etc., we include in the first group the students whose marks are 5 or more but under 10. If the marks of a student are 10, he is not included in the first group but in the second, i.e., 10 to 15.
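The difference between the two methods can be seen by asking in which class a given whole-number mark falls. The sketch below assumes classes of width 5 starting at 5, as in the examples above; the function names `inclusive_class` and `exclusive_class` are hypothetical and chosen only for illustration.

```python
def inclusive_class(mark, lowest=5, width=5):
    """Inclusive method: classes 5-9, 10-14, ...; both limits belong to the class."""
    lower = lowest + ((mark - lowest) // width) * width
    return (lower, lower + width - 1)


def exclusive_class(mark, lowest=5, width=5):
    """Exclusive method: classes 5-10, 10-15, ...; the upper limit is excluded."""
    lower = lowest + ((mark - lowest) // width) * width
    return (lower, lower + width)


for mark in (5, 9, 10, 14, 15):
    print(mark, inclusive_class(mark), exclusive_class(mark))
```

A student with 10 marks falls in 10-14 under the inclusive method and in 10-15 under the exclusive method; in neither case is he placed in the first group.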


Sometimes lower limits are excluded from their respective classes. For example, if the marks obtained by students are grouped as 5-10, 10-15, 15-20, 20-25, 25-30 etc., we include in the first group the students whose marks are above 5 and up to 10. If the marks of a student are 10, he is included in the first group. But if a student gets 5 marks, we will have to prepare a group 0-5 to include him.

There are various methods by which class intervals can be designated. They are :

(a) By Inclusive method :

Marks : 5-9, 10-14, 15-19, 20-24, 25-29

or Prices (in Rs) : 5-9.99, 10-14.99, 15-19.99, 20-24.99, 25-29.99

(b) By Exclusive method :

(i) Lower limit excluded :

Marks : 5-10, 10-15, 15-20, 20-25, 25-30

Lower limits 5, 10, 15, 20, 25 of their respective groups are excluded.

(ii) Upper limit excluded :

Marks : 5-10, 10-15, 15-20, 20-25, 25-30

Upper limits 10, 15, 20, 25, 30 of their respective groups are excluded. However, if the class intervals are given as 5-10, 10-15, 15-20, 20-25 etc., it is always presumed that upper limits are excluded in the absence of any specific instructions.

(c) By mentioning lower limits (followed by a dash) :

Marks : 5-, 10-, 15-, 20-, 25-

read as 5-10, 10-15, 15-20, 20-25 and 25-30.

(d) By mentioning upper limits (preceded by a dash) :

Marks : -10, -15, -20, -25, -30

read as 5-10, 10-15, 15-20, 20-25 and 25-30.

(e) By mid-points of class interval :

Marks : 7.5, 12.5, 17.5, 22.5, 27.5

These mid-points are required to be converted into class intervals. Take the difference between the first two mid-points (12.5 - 7.5 = 5) and divide it by 2, i.e., 5/2 = 2.5. Subtracting this quotient from the first mid-point and adding it to the first mid-point, we get 7.5 - 2.5 = 5 and 7.5 + 2.5 = 10. We thus get the class interval 5-10. In the same way intervals of all the mid-points can be obtained, i.e., 10-15, 15-20, 20-25, 25-30.
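The conversion of mid-points into class intervals can be sketched as below. The sketch assumes, as in the text, that the mid-points are equally spaced; the function name `intervals_from_midpoints` is an assumption made for this example.

```python
def intervals_from_midpoints(midpoints):
    """Convert equally spaced mid-points into (lower, upper) class limits."""
    # Half of the difference between the first two mid-points.
    half_width = (midpoints[1] - midpoints[0]) / 2
    # Subtract it from, and add it to, every mid-point.
    return [(m - half_width, m + half_width) for m in midpoints]


marks_midpoints = [7.5, 12.5, 17.5, 22.5, 27.5]
for lower, upper in intervals_from_midpoints(marks_midpoints):
    print(f"{lower:g}-{upper:g}")
```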

(f) 'Open-end' class intervals

In certain frequency distributions 'open-end' class intervals are given as we find in the example given below :

Marks            Below 10   10-15   15-20   20-25   25-30   30-35   35 and above   Total
Frequency (f)    7          10      13      18      8       5       3              64

In such cases, values are put on the basis of construction of the series. In the above series '5' may be put in place of 'below' and '40' in place of 'above', thus making the classes :

Marks : 5-10, 10-15, 15-20, 20-25, 25-30, 30-35, 35-40

Principles of Grouping

There is no hard and fast rule for grouping the data, but the following general principles may be kept in mind for satisfactory and meaningful classification of data :

(a) It is advisable to have total number of classes between 5 and 15. The preference for the total number of classes depends on the numbers and figures to be grouped, the magnitude of the figures and the possibility of simplified calculations in further statistical studies.

(b) Odd figures, for example 3, 7, 9, 11, 27, 33 etc., should be avoided for class intervals. The choice for the class intervals should be either 5 or a multiple of 5. It simplifies our further statistical calculations.

(c) Lower limit of the class, as far as possible, should be 0 or a multiple of 5.

(d) For maintaining continuity and correct classes, the exclusive method of preparing classes is adopted.

(e) The class interval should be equal for all classes.

(f) As far as possible open-end classes should be avoided. For example,

Marks : Below 5, 5-10, 10-15, 15-20, Above 20

The first and the last classes are open-end classes; the first is open at the lower end and the last at the upper end. For statistical calculations the open ends should be closed. Maintaining the regularity of the class intervals we can close these groups as 0-5 and 20-25.

(g) For frequency distribution, we prepare a table having three columns: first for variables, second for 'Tally bars' and the third for the total representing corresponding frequency to each class.

Simple Series and Cumulative Series : We have seen in the above illustrations the patterns of simple series of discrete type and continuous type (using inclusive and exclusive methods of class intervals). In simple series the frequency is shown against each value or class; in cumulative series the frequencies are progressively totalled. See the following illustration :

Simple Series

Discrete Type                  Continuous Type
Marks    No. of Students       Marks    No. of Students
10       4                     0-10     4
20       8                     10-20    8
30       15                    20-30    15
40       20                    30-40    20
50       13                    40-50    13

Cumulative Series

Less than

Marks            No. of Students (c.f.)
Less than 10     4
Less than 20     12 (4 + 8)
Less than 30     27 (4 + 8 + 15)
Less than 40     47 (4 + 8 + 15 + 20)
Less than 50     60 (4 + 8 + 15 + 20 + 13)

More than

Marks            No. of Students (c.f.)
More than 0      60
More than 10     56 (60 - 4)
More than 20     48 (60 - 12)
More than 30     33 (60 - 27)
More than 40     13 (60 - 47)

Now, we can read students getting less than 10 marks are 4, less than 20 marks 12, less than 30 marks are 27 and so on.

In the same way the students getting more than 0 marks are 60, more than 10 marks are 56, more than 20 marks are 48 and so on.
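The progressive totalling of the two cumulative series can be sketched in Python. The frequencies are those of the continuous series 0-10 to 40-50 above; the variable names are assumptions made for this illustration.

```python
from itertools import accumulate

upper_limits = [10, 20, 30, 40, 50]
lower_limits = [0, 10, 20, 30, 40]
frequencies = [4, 8, 15, 20, 13]

# 'Less than' series: running total of the frequencies.
less_than = list(accumulate(frequencies))
total = less_than[-1]

# 'More than' series: total minus the frequencies of all earlier classes.
more_than = [total - earlier for earlier in [0] + less_than[:-1]]

for limit, cf in zip(upper_limits, less_than):
    print(f"Less than {limit}: {cf}")
for limit, cf in zip(lower_limits, more_than):
    print(f"More than {limit}: {cf}")
```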

Illustration 3. From the table given below of monthly household expenditure (in Rs) on food of 50 households :

(a) Obtain the range of monthly household expenditure on food.

(b) Divide the range into appropriate number of class intervals and obtain the frequency distribution of expenditure.

(c) Find the number of households whose monthly expenditure on food is (i) less than Rs 2000 (ii) more than Rs 3000

Monthly Household Expenditure in (Rupees) on Food of 50 Households


1904 1559 3473 1735 2760

2041 1612 1753 1855 4439

5090 1085 1823 2346 1523

1211 1360 1110 2152 1183

1218 1315 1105 2628 2712

4228 1812 1264 1183 1171

1007 1180 1953 1137 2048

2025 1583 1324 2621 3676

1397 1832 1962 2177 2575

1293 1365 1146 3222 1396


Solution.

(a) Finding the highest and lowest expenditure on food of 50 households to get range by the following formula.

Range = L - S


where, L = Largest value, and S = Smallest value

Here, L = 5090 and S = 1007

Range = 5090 - 1007 = Rs 4083

(b) Dividing the range by a class interval of Rs 500, we get

4083 / 500 = 8.166

Now, we take 9 classes to include all the given values, preparing a continuous frequency distribution by the exclusive method (excluding upper limit).

Frequency Distribution (Excluding upper limit)

Household Expenditure (Rs)    Tally bars                  Frequency (f)
1000-1500                     ||||| ||||| ||||| |||||     20
1500-2000                     ||||| ||||| |||             13
2000-2500                     ||||| |                     6
2500-3000                     |||||                       5
3000-3500                     ||                          2
3500-4000                     |                           1
4000-4500                     ||                          2
4500-5000                                                 0
5000-5500                     |                           1
Total                                                     50
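The working of Illustration 3 can be checked with a short Python sketch. It uses the 50 expenditure figures given in the illustration and the class interval of Rs 500 chosen in the solution; starting the first class at the multiple of 500 just below the smallest value is an assumption that matches the printed table, and the variable names are chosen only for this example.

```python
expenditure = [
    1904, 1559, 3473, 1735, 2760, 2041, 1612, 1753, 1855, 4439,
    5090, 1085, 1823, 2346, 1523, 1211, 1360, 1110, 2152, 1183,
    1218, 1315, 1105, 2628, 2712, 4228, 1812, 1264, 1183, 1171,
    1007, 1180, 1953, 1137, 2048, 2025, 1583, 1324, 2621, 3676,
    1397, 1832, 1962, 2177, 2575, 1293, 1365, 1146, 3222, 1396,
]

# (a) Range = L - S
value_range = max(expenditure) - min(expenditure)

# (b) Classes of width 500, upper limit excluded.
width = 500
start = (min(expenditure) // width) * width              # 1000
n_classes = (max(expenditure) - start) // width + 1      # 9
frequency = [0] * n_classes
for amount in expenditure:
    frequency[(amount - start) // width] += 1

for k, f in enumerate(frequency):
    lower = start + k * width
    print(f"{lower}-{lower + width}: {f}")

# (c) Households in the classes below 2000 and in the classes from 3000 upwards.
less_than_2000 = sum(frequency[: (2000 - start) // width])
from_3000_up = sum(frequency[(3000 - start) // width:])
print(value_range, less_than_2000, from_3000_up)
```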

(c) (i) Number of households whose monthly expenditure is less than Rs 2000 (i.e., 1000 - 2000)

= 20 + 13 = 33 Households

(ii) Number of households whose monthly expenditure is more than Rs 3000 (i.e., 3000 - 5500)

= 2 + 1 + 2 + 0 + 1 = 6 Households

Illustration 4. Form a frequency distribution from the following data by inclusive method taking 4 as the magnitude of class intervals and taking the lowest class as (10 - 13). Also obtain class boundaries and mid-values.

31 23 19 29 22 20 16 10 13 34

38 33 28 21 15 18 36 24 18 15

12 30 27 23 20 17 14 32 26 25

18 29 24 19 16 11 22 15 17 10


Solution.

Frequency Distribution

Class interval    Tally bars      Frequency (f)
10-13             |||||           5
14-17             ||||| |||       8
18-21             ||||| |||       8
22-25             ||||| ||        7
26-29             |||||           5
30-33             ||||            4
34-37             ||              2
38-41             |               1
Total                             40

Class Boundaries

In the above illustration 10-13, 14-17, 18-21, 22-25, 26-29 and so on are class limits of the inclusive method of construction of continuous frequency distribution. We find 'gaps' or discontinuity between the upper limit of a class and the lower limit of the next class. For example, between the upper limit of the first class, 13, and the lower limit of the next class, which is 14, we find a 'gap' of 1. The continuity of the variable in classified data is obtained by adjustment in the class interval.

Steps

1. Find the difference between the lower limit of the second class and the upper limit of the first class.

14 - 13 = 1

2. Divide the difference by 2

1 / 2 = 0.5

3. Subtract the value obtained from lower limits of all the classes (- 0.5)

4. Add the value obtained to upper limits of all classes (+ 0.5)

Mid-point = (Upper class limit + Lower class limit) / 2

m.v. = (l1 + l2) / 2
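The adjustment steps and the mid-value formula can be sketched as follows for the inclusive classes of Illustration 4. The variable names are assumptions made for this example.

```python
# Inclusive class limits 10-13, 14-17, ..., 38-41.
limits = [(lower, lower + 3) for lower in range(10, 39, 4)]

# Step 1: gap between the lower limit of the second class
# and the upper limit of the first class.
gap = limits[1][0] - limits[0][1]        # 14 - 13 = 1

# Step 2: divide the difference by 2.
adjustment = gap / 2                     # 0.5

# Steps 3 and 4: subtract from lower limits, add to upper limits.
boundaries = [(lo - adjustment, hi + adjustment) for lo, hi in limits]

# Mid-value = (upper class limit + lower class limit) / 2
mid_values = [(lo + hi) / 2 for lo, hi in limits]

for (lo, hi), m in zip(boundaries, mid_values):
    print(f"{lo}-{hi}  m.v. = {m}")
```

The mid-values are the same whether they are computed from the class limits or from the class boundaries, since the adjustment is subtracted at one end and added at the other.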

Now we get

Class interval    Frequency (f)    Mid-values (m.v.)
9.5-13.5          5                11.5
13.5-17.5         8                15.5
17.5-21.5         8                19.5
21.5-25.5         7                23.5
25.5-29.5         5                27.5
29.5-33.5         4                31.5
33.5-37.5         2                35.5
37.5-41.5         1                39.5
Total             40

Illustration 5. Prepare a frequency distribution by inclusive method taking class interval of 7 from the following data :

28 17 15 22 29 21 23 27 18 12 7 2

9 4 6 1 8 3 10 5 20 16 12 8

4 33 27 21 15 9 3 36 27 18 9 2

4 6 32 31 29 18 14 13 15 11 9 7

1 5 37 32 28 26 24 20 19 25 19 20

Solution.

Frequency Distribution (Inclusive Method)

Class     Tally bars               Frequency (f)
0-7       ||||| ||||| |||||        15
8-15      ||||| ||||| |||||        15
16-23     ||||| ||||| ||||         14
24-31     ||||| ||||| |            11
32-39     |||||                    5
Total                              60

Relative Frequency Distribution

It is sometimes required to show the relative frequency of occurrences rather than the actual number of occurrences in each class of a frequency distribution. If actual frequencies are expressed as per cent of the total number of observations, relative frequencies are obtained.



Illustration 6. In a hypothetical sample of 20 individuals the amounts of money with them were found to be :

Individual   Money (Rs)   Individual   Money (Rs)   Individual   Money (Rs)   Individual   Money (Rs)
1            114          6            109          11           131          16           182
2            108          7            117          12           136          17           195
3            100          8            119          13           143          18           207
4            98           9            121          14           156          19           219
5            106          10           126          15           169          20           235

Prepare a frequency distribution with relative frequencies.

Solution.

Frequency Distribution (Excluding upper limit)
(Assuming the class interval of Rs 25)

Money (Rs)    Tally bars     Frequency (f)    Relative frequency (%)
75-100        |              1                5
100-125       ||||| |||      8                40
125-150       ||||           4                20
150-175       ||             2                10
175-200       ||             2                10
200-225       ||             2                10
225-250       |              1                5
Total                        20               100

Frequency Distribution of Money

Money (Rs)    Tally bars         Frequency (f)    Relative frequency (%)
50-100        |                  1                5
100-150       ||||| ||||| ||     12               60
150-200       ||||               4                20
200-250       |||                3                15
Total                            20               100
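Relative frequencies are simply the class frequencies expressed as a percentage of the total. A minimal sketch, using the 20 amounts of Illustration 6 grouped into classes of Rs 50 (upper limit excluded), is given below; the variable names are assumptions made for this example.

```python
money = [114, 108, 100, 98, 106, 109, 117, 119, 121, 126,
         131, 136, 143, 156, 169, 182, 195, 207, 219, 235]

width, start = 50, 50   # classes 50-100, 100-150, 150-200, 200-250

# Count the observations falling in each class (upper limit excluded).
counts = {}
for amount in money:
    lower = start + ((amount - start) // width) * width
    counts[(lower, lower + width)] = counts.get((lower, lower + width), 0) + 1

# Relative frequency = frequency as per cent of the total number of observations.
total = len(money)
relative = {cls: 100 * f / total for cls, f in counts.items()}

for cls in sorted(counts):
    print(f"{cls[0]}-{cls[1]}: f = {counts[cls]}, relative = {relative[cls]:g}%")
```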

Frequency Distribution with Unequal Classes

Data are sometimes given in unequal class intervals. Such series are used when there is great fluctuation in data. For example :

Set I                 Set II                Set III
Class    Frequency    Class    Frequency    Class    Frequency
0-5      X            2        X            2-4      X
5-10     Y            5        Y            2-6      X + Y
10-20    Z            7        Z            2-8      X + Y + Z
20-30    A            7-20     A            8-10     A
30-50    B            20-40    B            10-12    B
50-75    C            40-60    C            12-14    C

Loss of Information

Raw data is grouped by making equal or unequal class frequency distribution, say 0-5, 5-10, 10-15 or 0-5, 5-7, 7-12, 12-20 and so on. By making such classes there is loss of information of individual observations. Further, the statistical analysis is based on the mid-points of these classes without giving any importance to individual observations. As such, the significance of individual observations is lost.

Bivariate Frequency Distribution

We have so far studied frequency distributions involving a single variable only. Such frequency distributions are called univariate frequency distributions. Often we come across data composed of measurements made on two variables for each individual item. For example, we may study the weights and heights of a group of individuals, the marks obtained by a group of students in two different subjects, ages of husbands and wives for a group of couples, etc. A frequency table where two variables have been measured in the same set of items through cross classification is known as 'bivariate frequency distribution' or 'two-way frequency distribution'. Various values of each variable are grouped into various classes (not necessarily the same for each variable).

Illustration 7. Following figures give the ages of 20 newly married couples in years. Represent the data by a frequency distribution.

Age of husband : 24 26 27 25 28 24 27 28 25 26

Age of wife    : 17 18 19 17 20 18 18 19 18 19

Age of husband : 25 26 27 25 27 26 25 26 26 26

Age of wife    : 17 18 19 19 20 19 17 20 17 18

Solution. We are given two variables : (i) age of husbands, and (ii) age of wives. We should represent the data in the form of a two-way frequency distribution so that we are able to show the ages of husbands and wives simultaneously. This is also called bivariate frequency distribution.

Bivariate Frequency Distribution

Age of husband            Age of wife (years)                    Total (f)
(years)           17         18         19         20
24                | (1)      | (1)                               2
25                ||| (3)    | (1)      | (1)                    5
26                | (1)      ||| (3)    || (2)     | (1)         7
27                           | (1)      || (2)     | (1)         4
28                                      | (1)      | (1)         2
Total (f)         5          6          6          3             20
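The cross classification of Illustration 7 can be sketched in Python by counting each (husband's age, wife's age) pair. The two lists reproduce the figures given in the illustration, couple by couple; the variable names are assumptions made for this example.

```python
from collections import Counter

husband = [24, 26, 27, 25, 28, 24, 27, 28, 25, 26,
           25, 26, 27, 25, 27, 26, 25, 26, 26, 26]
wife = [17, 18, 19, 17, 20, 18, 18, 19, 18, 19,
        17, 18, 19, 19, 20, 19, 17, 20, 17, 18]

# Each cell of the two-way table counts couples with a given pair of ages.
cells = Counter(zip(husband, wife))
row_totals = Counter(husband)    # totals for each age of husband
col_totals = Counter(wife)       # totals for each age of wife

wife_ages = sorted(col_totals)
print("Husband  " + "  ".join(f"{w:>3}" for w in wife_ages) + "  Total")
for h in sorted(row_totals):
    row = "  ".join(f"{cells[(h, w)]:>3}" for w in wife_ages)
    print(f"{h:<9}{row}  {row_totals[h]:>5}")
print("Total    " + "  ".join(f"{col_totals[w]:>3}" for w in wife_ages)
      + f"  {len(husband):>5}")
```

A cell that prints 0 corresponds to a blank cell in the printed table.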

Illustration 8. The data given below relate to the heights and weights of 20 persons. You are required to form a two-way frequency table with class intervals 62"-64", 64"-66" and so on and 115 to 125 lbs., 125 to 135 lbs. and so on.

S.N.   Weight   Height   S.N.   Weight   Height
1      170      70       11     163      70
2      135      65       12     139      67
3      136      65       13     122      63
4      137      64       14     134      68
5      148      69       15     140      67
6      124      63       16     132      69
7      117      65       17     120      66
8      128      70       18     148      68
9      143      71       19     129      67
10     129      62       20     152      67

Solution.

Bivariate Frequency Distribution

Weight                       Height (inches)                            Total (f)
(lbs.)       62-64      64-66      66-68      68-70      70-72
115-125      || (2)     | (1)      | (1)                                4
125-135      | (1)                 | (1)      || (2)     | (1)          5
135-145                 ||| (3)    || (2)                | (1)          6
145-155                            | (1)      || (2)                    3
155-165                                                  | (1)          1
165-175                                                  | (1)          1
Total (f)    3          4          5          4          4              20


EXERCISES

Questions :

1. Distinguish between variable and attribute. Explain with examples.

2. Define classification. Explain the objects and characteristics of classification.

3. What do you understand by classification? Explain the methods of classification of data giving suitable examples.

4. Is there any use in classifying things? Explain with illustrations.

5. Explain discrete and continuous variables with examples.

6. Define series and explain the different types of series.

7. Define Frequency Distribution. State the principles required to be observed in its formation.

8. Explain with illustration the 'inclusive' and 'exclusive' methods used in classification of data.

9. Distinguish between univariate and bivariate frequency distribution.

10. Distinguish between discrete and continuous variable.

11. What is loss of information in classified data?

12. Do you agree that classified data is better than raw data?

13. What is a relative frequency distribution? Illustrate.

14. Write short notes on the following :

(a) Classification and series.

(b) Geographical and chronological classification.

(c) Exclusive and inclusive class-intervals.

(d) Discrete and continuous series.

(e) Simple and cumulative frequency.

(f) Equal and unequal class frequency.

Problems :

1. Prepare a statistical table from the following data taking the class width as 7 by inclusive method :

28 17 15 22 29 21 23 27 18 12

7 2 9 4 6 1 8 3 10 5

20 16 12 8 4 33 27 21 15 9

3 36 27 18 9 2 4 6 32 31

29 18 14 13 15 11 9 7 1 5

37 32 28 26 24



50 57 58 51 53 62 64 60 61

51 64 55 55 52 60 65 58 60

52 63 56 56 58 64 63 62 60

54 62 54 54 60 65 60 62 59

56 63 52 53 62 53 61 61 59

the following marks in a frequency table taking the lowest class interval

69 33 91 53 63 69

70 36 80 78 52 51

73 73 92 64 55 49

74 57 95 70 64 57

75 80 42 85 43 29

77 65 73 95 76 53

86 73 40 83 43 76

84 72 75 57 58 59

62 65 67 87 81 84

61 75 85 81 58 81

47 69 78 62 72 43 87 61 84 23

4. Change the following into continuous series and convert the series into 'less than' and 'more than' cumulative series :

Marks (mid-values) : 5 15 25 35 45 55

No. of students : 8 12 15 9 4 2

5. Marks obtained by 24 students in English and Statistics in a class are given below :

S.No. Marks in English Marks in Statistics S.No. Marks in English Marks in Statistics

1 22 16 13 23 16

2 23 16 14 25 17

3 23 18 15 23 17

4 23 16 16 22 17

5 23 16 17 27 15

6 24 17 18 27 16

7 23 16 19 26 18

8 25 19 20 28 19

9 22 16 21 25 19

10 23 18 22 24 16

11 24 18 23 23 17

12 24 17 24 25 19

6. In a survey it was found that 64 families bought milk in the following quantities in a particular month. Quantity of milk (in litres) bought by 64 families in a month :

.O 99 9 22 12 39 19 14 23 6 24 16 18 7

Convert the above data into a frequency distribution making classes of 5-9, 10-14 and so on.

7. The marks obtained by 20 students in Statistics and Economics are given below.

Marks in Statistics   Marks in Economics   Marks in Statistics   Marks in Economics

10   20   13   24

11   21   12   23

10   22   11   22

11   21   12   23

11   23   10   22

14   23   14   22

12   22   14   24

12   21   12   20

13   24   13   24

10   25   10   23

8. Prepare 'less than' and 'more than' cumulative frequency distributions of the

140-150 150-160 160-170 170-180 180-190 190-200

No. of workers : 5 10 20

Find out the frequency distribution and 'more than' cumulative frequency table . below : 10- 30 40 50 60

Quantity (kg) : 17 22

If class mid-points in a frequency distribution of a group of persons are : 125, 132, 139, 146, 153, 160, 167, 174, 181 pounds, find (a) size of the class intervals, and (b) the class boundaries.

PRESENTATION OF DATA

Chapter 5

TABULAR PRESENTATION

1. Introduction

2. Definition and Objectives of Tabulation

3. Essentials of a Satisfactory Table

4. Parts of a Table

5. Types of Table


There are four methods of presentation. They are :

(i) Text presentation, (ii) Semi-tabular presentation, (iii) Tabular presentation, and (iv) Pictorial presentation.

(i) Text Presentation

increased from an extremely low figure of less than 2 lakhs in 1950-51 to over 46 lakhs in 1990-91. There was around ten-fold increase in this sphere between 1991 and 2004-05 as the number of landline connections increased to 4.42 crore besides 4.5 crore mobile phones. Thus the number of telephones stood at 9.7 crore in March 2005. With the manifold increase in telephone connections, the teledensity (viz., the number of telephone connections per hundred persons) has increased from 3.6 in 2001 to 6.7 in 2005.

(ii) Semi-Tabular Presentation

Semi-tabular presentation is both through tables and paragraphs. This method is not often used, but is useful when figures are required to be compared along with one or two sentences of explanation.

(iii) Tabular Presentation

Tabular presentation is a systematic presentation of numerical data in columns and rows in accordance with some important features or characteristics.

(iv) Pictorial Presentation

Pictorial presentation is the visual form of statistical data in diagrams and graphs.

Definition and Objectives of Tabulation

Systematic presentation of data is one of the most important considerations in statistical work and it is done through the use of tables. A statistical table is a systematic arrangement of data in columns and rows. Tabulation is the process of presenting data in tables; statistical tables are the outcome of this process. In brief, tabulation is a scientific process involving the presentation of classified data in an orderly manner so as to bring out their essential features and chief characteristics.

According to H. Secrist, "Tables are a means of recording in permanent form the analysis that is made through classification and of placing in juxtaposition things that are similar and should be compared".

According to Tuttle, "A statistical table is the logical listing of related quantitative data in vertical columns and horizontal rows of numbers, with sufficient explanatory and qualifying words, phrases and statements in the form of titles, headings and notes to make clear the full meaning of the data and their origin."

Objectives of Tabulation

Statistical data arranged in a tabulated form have the following important objectives :

1. They simplify complex data, so that the data presented are easily understood.

2. They facilitate comparison due to proper systematic arrangement of statistical data in different columns.

3. They leave a lasting impression without any confusion.

4. They facilitate computation of different statistical measures, namely averages, dispersion, correlation etc.

5. They present facts in minimum space; unnecessary repetition and explanations are avoided and required figures can be located more quickly.

6. Tabulated data make summation of various items easy, and errors and omissions can easily be detected.

7. Tabulated data are good for reference and they make it easy to present information on graphs and diagrams.

Essentials of a Satisfactory Table

The following are the essentials or characteristics of a satisfactory table :

1. Attractive : A table should be attractive to draw the attention of readers. To make it so, care should be taken in determining its size, proportion of columns and rows, writing of figures, etc.

2. Manageable size : The size of the table should be neither too big nor too small. Too much detail should not be given in a table. If the table is too large, it becomes confusing to the eyes and there is great difficulty in following the lines and columns at a glance. If more details are to be given, then a number of small tables should be preferred to one big table. So, it should be simple and compact.

3. Comparable : The facts should be arranged in a table so as to make comparison between them easy, because comparison is one of the chief objectives of tabulation. Whenever it is necessary, average, percentage, proportion, etc., should be given in the table to facilitate comparison.

4. According to objective : A table should be according to the objective of statistical investigation.

5. Easily understandable : A table should be complete within itself, containing all the explanations necessary to make clear the meaning of its items. Units of measurement must be clearly stated, such as "price in rupees" or "weight in kilograms". Columns and rows should be numbered when it is desired to facilitate reference to specific parts of a table.

6. Scientifically prepared : A table should be scientifically prepared. All the required rules of tabulation should be carefully observed. A table should have a suitable title, proper captions and stubs, source, footnotes etc. Certain figures which are to be emphasised should be in distinctive type or in a 'circle' or a 'box' or between thick lines. A table should have miscellaneous columns for the data which cannot be grouped in the classification made. Large numbers are hard to read and difficult to compare; therefore, they should be approximated, e.g., up to the nearest million tons.

Parts of a Table

Preparing a good table is an art.

1. Table number : A table should be numbered for reference in the future. A table must be codified (say A, B, C etc.) or numbered (say 1, 2, 3, 4 etc.) whenever more than one table is prepared. The table number is given at the centre on the top of the table. Sometimes table numbers like 1.2 and 2.4 are also used. In that case the first digit refers to the chapter or section and the second digit to its order. Table 1.2 would mean the second table in the first chapter or section and Table 2.4 would mean the fourth table in the second chapter or section.

2. Title : A table may have a catch title written in few words. The title given above the table should be brief, clear and self-explanatory. The lettering of the title should be the most prominent of all lettering used in the table. A complete title explains the field to which the data are related and the basis of classification of data.

3. Captions and stubs (column headings and row headings) : The headings given to columns are called captions and those given to rows are called stubs. Columns and rows may also be numbered.

4. Body of the table : It contains the data and forms the main part of the table. The arrangement of items in columns and rows should be made keeping in view the purpose of the table. Items in a table may be arranged : (i) alphabetically, (ii) geographically, (iii) chronologically, ... progressively, and (vi) ...

6. Footnote : It is placed at the bottom of the table. It is a phrase or a statement. Footnotes can be identified by various systems and keys like putting star(s) or signs (say £ etc.) or letters (say a, b, c, d etc.). Footnotes are also necessary to specify limitations of data, if there are any.

7. Source : A source note should be given, stating details such as the name of the publisher. It is useful to the reader who wants to find and gather additional information.

Structure of Table

Table Number

Title

(Head note, if any)

Footnote :

Source :

TABLE 1

Literacy Rates in India

(Per cent)

            Males                  Females                Persons
Year    Rural  Urban  Total    Rural  Urban  Total    Rural  Urban  Total
1951    19.02  45.60  27.16     4.87  22.33   8.86    12.10  34.59  18.33
1961    34.30  66.00  40.40    10.10  40.50  15.35    22.50  25.40  28.30
1971    48.60  69.80  45.96    15.50  48.80  21.97    27.90  60.20  34.45
1981    49.60  76.70  56.38    21.70  56.30  29.76    36.00  67.20  43.57
1991    57.90  81.10  64.13    30.60  64.00  39.29    44.70  73.10  52.21
2001    71.40  86.70  75.85    46.70  73.20  54.16    59.40  80.30  65.38

Source : Economic

The table clearly shows that literacy among males stood at about 76 per cent in 2001, while it was only about 54 per cent among females. This shows that there is a general bias against female education; in our conservative society, girls are still discriminated against in matters like health, nutrition, education, etc.

(iii) The literacy rate in urban areas was higher, at 80 per cent in 2001, than in rural areas, where it was less than 60 per cent. This clearly speaks of inadequate facilities of education available in the rural areas as well as comparatively lower willingness of the conservative rural folk to go to schools for education.

Illustration 1. In a sample study about coffee drinking habits in two towns, the following information was received :

Town A : Females were 40%, total coffee drinkers were 45% and male non-coffee drinkers were 20%.

Town B : Males were 55%, male non-coffee drinkers were 30% and female coffee drinkers were 15%.

Represent the above data in a tabular form.

Solution. Let us calculate the missing percentages of the above information before representing the data in a tabular form.

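Town A (100) : Males 60 (coffee drinkers 40, non-coffee drinkers 20); Females 40 (coffee drinkers 5, non-coffee drinkers 35).

Town B (100) : Males 55 (coffee drinkers 25, non-coffee drinkers 30); Females 45 (coffee drinkers 15, non-coffee drinkers 30).

TABLE 2

Coffee Drinking Habits in Towns A and B

(in percentages)

                          Town A                    Town B
                      Males Females Total       Males Females Total
Coffee Drinkers         40     5     45           25    15     40
Non-Coffee Drinkers     20    35     55           30    30     60
Total                   60    40    100           55    45    100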

Alternative Solution


TABLE 3

Coffee Drinking Habits in Towns A and B

(in percentages)

                    Town A                           Town B
           Coffee    Non-Coffee   Total     Coffee    Non-Coffee   Total
           Drinkers  Drinkers               Drinkers  Drinkers
Males         40        20         60          25        30         55
Females        5        35         40          15        30         45
Total         45        55        100          40        60        100
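The missing percentages can also be worked out mechanically. The sketch below is only an illustration: it assumes the given figures are females 40, coffee drinkers 45 and male non-coffee drinkers 20 for Town A, and males 55, male non-coffee drinkers 30 and female coffee drinkers 15 for Town B. The function name `build_table` is a hypothetical helper, not something from the text.

```python
def build_table(males, male_coffee, female_coffee, total=100):
    """Complete a sex x habit table from three independent cells.

    Each row is (coffee drinkers, non-coffee drinkers, row total).
    """
    females = total - males
    coffee = male_coffee + female_coffee
    return {
        "Males": (male_coffee, males - male_coffee, males),
        "Females": (female_coffee, females - female_coffee, females),
        "Total": (coffee, total - coffee, total),
    }

# Town A (assumed givens): females 40, coffee drinkers 45, male non-coffee 20.
a_males = 100 - 40
a_male_coffee = a_males - 20
town_a = build_table(a_males, a_male_coffee, 45 - a_male_coffee)

# Town B (assumed givens): males 55, male non-coffee 30, female coffee 15.
town_b = build_table(55, 55 - 30, 15)

for name, table in (("Town A", town_a), ("Town B", town_b)):
    print(name)
    for row, cells in table.items():
        print(f"  {row:8s}", *cells)
```

Every remaining cell follows by subtraction from a row or column total, which is why three independent figures per town are enough.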

Illustration 2. Of the 1,125 students studying in a school during 2005-2006, 720 are Hindus, 628 are boys and 440 are science students. The number of Hindu boys is 392, that of boys studying science 205 and that of Hindu students studying science 262; finally, the number of science students among the Hindu boys was 148. Enter these frequencies in a table and complete the table by obtaining the frequencies of the remaining cells.

Solution.

TABLE 4

                    Boys                        Girls                       Total
Faculty    Hindus  Non-Hindus  Total    Hindus  Non-Hindus  Total    Hindus  Non-Hindus  Total
Science      148       57       205       114      121       235       262      178       440
Arts         244      179       423       214       48       262       458      227       685
Total        392      236       628       328      169       497       720      405      1125
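The remaining cells of such a table follow from the eight given frequencies by repeated subtraction. The sketch below shows one way to do this; the variable names are illustrative, and "Arts" is used for all non-science students, as in the table.

```python
# Frequencies given in Illustration 2.
N, H, B, S = 1125, 720, 628, 440      # all students, Hindus, boys, science
HB, BS, HS, HBS = 392, 205, 262, 148  # Hindu boys, science boys, Hindu science, all three

cells = {
    ("Boys", "Science", "Hindus"): HBS,
    ("Boys", "Science", "Non-Hindus"): BS - HBS,
    ("Boys", "Arts", "Hindus"): HB - HBS,
    ("Boys", "Arts", "Non-Hindus"): B - HB - BS + HBS,
    ("Girls", "Science", "Hindus"): HS - HBS,
    ("Girls", "Science", "Non-Hindus"): S - BS - HS + HBS,
    ("Girls", "Arts", "Hindus"): H - HB - HS + HBS,
    ("Girls", "Arts", "Non-Hindus"): N - H - B - S + HB + BS + HS - HBS,
}

# The eight cells must add up to the total number of students.
grand_total = sum(cells.values())

for key, value in cells.items():
    print(*key, value)
print("Total", grand_total)
```

The check on `grand_total` is a quick safeguard against an arithmetic slip in any single cell.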

Illustration 3. Census of India 2001 reported that Indian population had risen to 102 crore, of which only 49 crore were females against 53 crore males. 74 crore people resided in rural India and only 28 crore lived in towns or cities. While there were 62 crore non-workers against 40 crore workers in the entire country, the urban population had an even higher share of non-workers (19 crore) against the workers (9 crore) as compared to the rural population, where there were 31 crore workers out of 74 crore population. Represent the above information in a tabular form.

Solution.

TABLE 5

Growth of Population in India

(figures in crores)

Location    Workers    Non-workers    Total
Rural          31          43           74
Urban           9          19           28
Total          40          62          102

Males : 53 crore    Females : 49 crore

Source : Census of India 2001.

Types of Tables

Tables can broadly be classified as under :

A. From the point of view of purpose :

(i) General purpose tables.

(ii) Special purpose tables.

B. From the point of view of originality :

(i) Original tables.

(ii) Derivative tables.

C. From the point of view of construction :

(i) Simple or single tables.

(ii) Complex tables.

Special purpose tables make it easy to make comparisons and bring out clear relationships. Original tables contain information in the same form in which it was originally collected.

Simple and Complex Tables

TABLE 6

Marks of Students in a School

Marks      No. of Students
0-10            15
10-20           12
20-30           28
30-40            5
Total           60

Double or Two-way Table (Double Tabulation)

TABLE 7

Treble Table (Treble Tabulation)

TABLE 8

According to Marks, Sex and Section

The above table can even be called a manifold table or higher order table, in which we can increase the number of characteristics, more sections, ... Such a table is needed when a number of characteristics are to be simultaneously shown. But as more characteristics are included, the table becomes more complex, and may be confusing to the reader.

If the field of investigation is not big, the data have not too many ... future use and thirdly when the table requirements are varying. ... tabulation will be more accurate than the manual process.

EXERCISES

Questions :

1. State the advantages of Tabular Presentation of data.

2. Describe the major functional parts of statistical tables. Draw a structure of a table.

3. Explain briefly the main characteristics of a good statistical table.

4. What are the points to be taken into account while preparing a table?

5. Explain and discuss the various types of tables used in a survey after the data have been collected.

6. Distinguish between tabulation and classification. Explain the objects of tabulation.

7. Discuss briefly the importance of tabulation.

8. What is a statistical table? Discuss briefly the essentials of a good table.

9. What are the objects of tabulation?

Problems :

... following industries :

Fishing, coal mining, iron-ore mining, cloth and wool industries.

Prepare a blank table to show the distribution of population according to sex and four religions in three age groups in Delhi and Mumbai.

The following data were received for two towns :

            Males in Total Population    Smokers    Male Smokers
Town A               51%                   16%          18%
Town B               54%                   28%          20%

Tabulate the above data.

Present the following information in a suitable table :

oi ^ ——

Wong ro a trade un,™ 20M T "" ''' """ ""i

of which 1290 were men Sn Z h ^ u •<> 1^80

Prepare a blank table showing information regarding the college students according to :

(a) Faculty : Social Sciences, Commercial Sciences.

(b) Class : Under-graduate and Post-graduate classes.

(c) Sex : Male and Female.

(d) Years : 2005 and 2006.

Tabulate the following :

... of the total sales during the year respectively. Textiles accounted for 30%

Town A : 60% people were males, 40% were coffee drinkers, and 26% were male coffee drinkers.

Town B : 55% people were males, 30% were coffee drinkers, and 20% were male coffee drinkers.

Chapter 6

DIAGRAMMATIC PRESENTATION

1. Introduction

2. Importance and Uses of Graphs and Diagrams

3. General Rules for Constructing Diagrams

4. Types of Diagrams

A. One-dimensional Diagrams

B. Pie Diagrams

5. Limitations of Diagrammatic Presentation

Other methods of presentation.

PICTORIAL PRESENTATION

Diagrammatic Presentation

ONE-DIMENSIONAL DIAGRAMS : (i) Simple Bar Diagram (ii) Sub-divided Bar Diagram (iii) Multiple Bar Diagram (iv) Percentage Bar Diagram (v) Broken Bar Diagram (vi) Deviation Bar Diagram

TWO-DIMENSIONAL DIAGRAMS : (i) Rectangles (ii) Squares (iii) Circles and Pie-diagrams

THREE-DIMENSIONAL DIAGRAMS : (i) Cubes (ii) Cylinders (iii) Blocks etc.

PICTOGRAM

CARTOGRAMS OR MAPS

Graphic Presentation

GRAPHS OF FREQUENCY DISTRIBUTION : (i) Line Frequency Graph (ii) Histogram (iii) Frequency Polygon (iv) Smoothed Frequency Curve (Frequency Curve) (v) 'Ogive' or Cumulative Frequency Curve

GRAPHS OF TIME SERIES : (i) One Variable Graphs (ii) Two or more than two Variable Graphs (iii) Graphs of Different Units

Introduction

Diagrammatic presentation is the most popular and common method of representing statistical data. Diagrams appeal to the mind through the eyes. This chapter deals with diagrams that are commonly used in presenting statistical information.

Importance and Uses of Graphs and Diagrams

1. They are interesting, attractive and impressive : They bring out the fluctuations of the statistical values even for a reader who is not interested in going through figures. Diagrams are used for publicity.

2. They save time and energy : They convey what they want to say without any strain on the mind and without knowledge of mathematics, as they make the data simple and intelligible.

3. They make comparison easy : With diagrams comparison becomes easy. Thus, diagrams can be used for quick comparisons.

4. They have universal utility : Graphs and diagrams are used in journals, newspapers, board meetings, exhibitions, fairs, etc., particularly to give information to the common man. They are widely used in campaigns and other fields. Diagrams play an important role in modern advertising.

5. They help in locating statistical measures like median, quartiles, mode etc. This is discussed in the chapter on Measures of Central Tendency of this book.

General Rules for Constructing Diagrams

Diagrammatic presentation is advantageous if the following general rules are observed :

1. Title : Every diagram should have a title that conveys the main facts depicted by the diagram. It must be brief, self-explanatory and clear; sub-headings can also be given. Diagrams should be numbered for identification and for the purpose of reference.

2. Size : The size of the diagram should be neither too big nor too small for the paper. It should be attractive, neat and appealing to the eyes, so that people's attention is automatically drawn towards it.

3. Axes : For example, 'population' or 'production' is shown on Y-axis and 'years' or 'months' on X-axis.

4. Scale : A diagram should be drawn with the help of geometric instruments. The scale should be selected to suit the paper and should, as far as possible, be in even numbers or multiples of 5, 10, 20, 25, 100, etc.

5. The scale of measurement on both X-axis and Y-axis should be clearly stated.

6. Index : When different items are shown through different colours, shades, dotting, crossing, etc., an index must be given for identifying and understanding the diagram.

7. Source : The source from which data have been obtained should be mentioned.

8. Simplicity : A simple diagram is more effective than a complex one.

Types of Diagrams

There are various types of geometric forms of diagrams used in practice. We shall study the following two geometric forms of diagrams :

A. One-dimensional Diagrams

B. Pie Diagram

A. ONE-DIMENSIONAL DIAGRAMS

One-dimensional diagrams are also called bar diagrams and are commonly used in practice. They are called one-dimensional because only the height of the bar is of significance and not the width of the bar. Following are the types of bar diagrams :

(a) Simple bar diagram

(b) Sub-divided bar diagram

(c) Multiple bar diagram

(d) Percentage bar diagram

(e) Broken bar diagram

(f) Deviation bar diagram

(a) Simple Bar Diagram : Only one variable can be presented in a simple bar diagram. A simple bar diagram can be drawn on either a horizontal or a vertical base. It is used for visual comparison of figures such as production, population, sales, profit, expenditure, etc., relating to one category, either in years, months, weeks, etc., or in groups. All the bars can be given a colour or shading to make them more attractive. In a simple bar diagram the scale is determined on the basis of the highest value in the series.

Illustration 1. Draw a bar diagram of the following data relating to export of computer software.

Year        Export (Rs crore)
1997-98          6,500
1998-99         10,940
1999-00         17,150
2000-01         28,350
2001-02         36,500

(Source : Economic Survey, 2002-03, p. 144)

Solution.

EXPORT OF COMPUTER SOFTWARE (1997-2002)

Scale : 1 cm = Rs 7,000 crore

[Fig. 1 : Vertical bar diagram. X-axis : Years; Y-axis : Rupees (in crores), 0 to 42,000 in steps of 7,000. Bars : 6,500; 10,940; 17,150; 28,350; 36,500.]

The same data can be drawn on a vertical base showing horizontal bars as under :

Alternative solution - Vertical base.

Fig. 1 : Years on X-axis; Value (Rupees in crores) on Y-axis.

Fig. 2 : Years on Y-axis; Value (Rupees in crores) on X-axis.

EXPORT OF COMPUTER SOFTWARE (1997-2002)

Scale : 1 cm = Rs 7,000 crores

[Fig. 2 : Horizontal bar diagram. Y-axis : Years (1997-98 to 2001-02); X-axis : Rupees (in crores), 0 to 42,000 in steps of 7,000. Bars : 6,500; 10,940; 17,150; 28,350; 36,500.]
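Before drawing, each value has to be converted into a bar length using the chosen scale. The following sketch shows that arithmetic for the software export data, assuming the scale of 1 cm = Rs 7,000 crore; the variable names are illustrative only.

```python
import math

# Export of computer software in Rs crore (figures from Illustration 1).
exports = {
    "1997-98": 6500,
    "1998-99": 10940,
    "1999-00": 17150,
    "2000-01": 28350,
    "2001-02": 36500,
}

SCALE = 7000  # Rs crore represented by 1 cm

# Length of each bar in centimetres, rounded to two decimals.
bar_cm = {year: round(value / SCALE, 2) for year, value in exports.items()}

# The axis must cover the highest value: round it up to a whole number of cm.
axis_top = math.ceil(max(exports.values()) / SCALE) * SCALE

for year, cm in bar_cm.items():
    print(f"{year}: {cm} cm")
print("Axis runs from 0 to", axis_top)
```

Rounding the axis up to the next full step of the scale gives 42,000, which covers the highest value in the series.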

The above two diagrams show the increase in the export of computer software from Rs 6,500 crores in 1997-98 to Rs 36,500 crores in 2001-02.

Two bar diagrams from the Economic Survey are given below :

[Fig. 3 : Two bar diagrams, Foodgrains Production (in million tons) and Wholesale Price Changes (52-week average inflation), for the years 1999-00 to 2005-06. Notes : (Provisional) average up to Jan. 14, 2006; first advance estimates (Kharif only).]

(b) Sub-divided Bar Diagram : These diagrams are also called Component Bar Diagrams. In general, sub-divided or component bar diagrams are to be used if the total value of the given data is to be divided into various components. First of all, a bar representing the total is drawn. Then it is divided into various parts in proportion to the values given in the data. Different shades, colours, dotting or designs can be used to distinguish the components. It should be remembered that the various components should be kept in the same order in each bar. An 'index' is to be given along with the diagram to explain these differences.

Illustration 2. Draw a suitable diagram to represent the following information :

Crime in Running Passenger Trains

Year    Murder    Robbery    Loot    Total
2001      108        82       321     511
2002      131       115       386     632
2003       97       144       352     593
2004      102        70       285     457
2005       75        68       245     388

Solution.

CRIME IN RUNNING PASSENGER TRAINS (2001-2005)

Scale : 1 cm = 200 crimes

[Fig. 4 : Sub-divided bar diagram. X-axis : Years (2001 to 2005); Y-axis : number of crimes, 0 to 800 in steps of 200. Index : Loot, Robbery, Murder.]
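To draw a sub-divided bar, the components are piled one over the other, so what is needed is the running total at which each part ends. A minimal sketch for the crime data follows; it assumes murder is drawn at the bottom, then robbery, then loot, which is a choice made here for illustration.

```python
from itertools import accumulate

# (murder, robbery, loot) for each year, from Illustration 2.
crimes = {
    2001: (108, 82, 321),
    2002: (131, 115, 386),
    2003: (97, 144, 352),
    2004: (102, 70, 285),
    2005: (75, 68, 245),
}

SCALE = 200  # crimes represented by 1 cm

# Height at which each component ends when the parts are stacked.
segment_tops = {year: list(accumulate(parts)) for year, parts in crimes.items()}

# Total height of each bar in centimetres.
bar_cm = {year: tops[-1] / SCALE for year, tops in segment_tops.items()}

for year, tops in segment_tops.items():
    print(year, tops, f"{bar_cm[year]:.2f} cm")
```

The last running total of each year equals the Total column of the table, which is a useful check before drawing.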

(c) Multiple Bar Diagram : A multiple bar diagram shows two or more inter-related sets of data as groups of adjacent bars, and an index should be given to identify the bars in a set.

Illustration 3. The data are classified by region : S.E. Asia, West Asia, Africa, Other Regions and Total.

Draw a suitable diagram to represent the above data.

Solution. Sub-divided bar diagram is suitable to the above data.

[Fig. 5 : Sub-divided bar diagram for the years 2003-04 and 2004-05. Y-axis up to 100. Index : Other Regions, Africa, West Asia, S.E. Asia.]

Illustration 4. Draw a suitable diagram of the following data :

Statement of Crime in Running Passenger Trains

Year    Murder    Robbery    Loot
2001      108        82       321
2002      131       115       386
2003       97       144       352
2004      102        70       285
2005       75        68       245

Solution.

CRIME IN RUNNING PASSENGER TRAINS (2001-2005)

(Scale : 1 cm = 100)

[Figure : Multiple bar diagram. X-axis : Years; Y-axis : number of crimes, 0 to 500 in steps of 100. Index : Murder, Robbery, Loot.]

Illustration 5. Sugar production in India during the first fortnight of December 2001 was about 3,87,000 tons, as against 3,78,000 tons during the same fortnight last year (2000). The off-take of sugar from factories during the first fortnight of December 2001 was 2,83,000 tons for internal consumption and 41,000 tons for exports, as against 1,54,000 tons for internal consumption and nil for exports during the same fortnight last season.

(i) Present the data in a tabular form.

(iii) Present these data diagrammatically.

Solution.

(i) Presentation of data in a tabular form.

Fortnight Sugar Production, Off-take for Internal Consumption, Export and Stock in Sugar Mills in India

(figures in thousand tons)

                          December, 2000        December, 2001
                          (First fortnight)     (First fortnight)
Production                      378                   387
Off-take from Mills             154                   283
Export                          Nil                    41
Stock                           224                    63

Source : Indian Sugar Mills Association Report.

Note : Stock is the figure which we have calculated, as production less off-take for internal consumption and export.

(iii) Diagrammatic presentation of above data by

(a) Sub-divided bar diagram

(b) Multiple bar diagram

INDIAN SUGAR MILLS ASSOCIATION REPORT

(Fortnight sugar production, off-take for internal consumption, export and stock in sugar mills in India. Scale : 1 cm = 50,000 tons.)

[Fig. 13 : Multiple bar diagram and sub-divided bar diagram for Dec. 2000 (first fortnight) and Dec. 2001 (first fortnight). Vertical axis : 0 to 400 thousand tons in steps of 50.]
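The stock figures in the table are derived rather than given. A short sketch of that calculation, under the assumption that stock equals production less off-take for internal consumption and export, is shown below (figures in thousand tons; the dictionary keys are illustrative names).

```python
# Figures in thousand tons; "Nil" export is entered as 0.
fortnights = {
    "December 2000": {"production": 378, "off_take": 154, "export": 0},
    "December 2001": {"production": 387, "off_take": 283, "export": 41},
}

# Assumption: whatever is produced but neither consumed nor exported stays in stock.
stock = {
    period: f["production"] - f["off_take"] - f["export"]
    for period, f in fortnights.items()
}

for period, value in stock.items():
    print(period, "stock:", value, "thousand tons")
```

Because off-take, export and stock add up to production, the three parts can be drawn as the components of one sub-divided bar for each fortnight.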

Proceeds per Chair      Factory A (Rs)    Factory B (Rs)
Wages                        160               200
Material                     120               300
Other Expenses                80               150
Total                        360               650
Selling Price                400               600
Profit or Loss (±)         (+) 40            (-) 50

The percentages are calculated as under :

Percentages (For Percentage Bar Diagram)

Proceeds per Chair      Factory A (%)     Factory B (%)
Wages                         40               33.3
Material                      30               50
Other Expenses                20               25
Total                         90              108.3
Selling Price                100              100
Profit or Loss (±)         (+) 10            (-) 8.3
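The percentages above take the selling price of each factory as 100. The sketch below reproduces that working; the helper `as_percent` and the dictionary layout are illustrative choices, not part of the text.

```python
# Cost items per chair in rupees.
costs = {
    "Factory A": {"Wages": 160, "Material": 120, "Other Expenses": 80},
    "Factory B": {"Wages": 200, "Material": 300, "Other Expenses": 150},
}
selling_price = {"Factory A": 400, "Factory B": 600}


def as_percent(value, base):
    """Express value as a percentage of base, rounded to one decimal."""
    return round(100 * value / base, 1)


percent = {}
for factory, items in costs.items():
    price = selling_price[factory]
    total_cost = sum(items.values())
    row = {item: as_percent(value, price) for item, value in items.items()}
    row["Total"] = as_percent(total_cost, price)
    # A negative figure here means a loss.
    row["Profit or Loss"] = as_percent(price - total_cost, price)
    percent[factory] = row

for factory, row in percent.items():
    print(factory, row)
```

For Factory B the total cost exceeds 100 per cent of the selling price, so the loss of 8.3 per cent has to be drawn below the base line.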

[Fig. 8 : Percentage bar diagram of cost, proceeds and profit or loss per chair for Factory A and Factory B. Scale : 1 cm = 20%. Y-axis : percentage, -20 to 100. Index : Other Expenses, Material, Wages, Profit or Loss.]

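(e) Broken Bar Diagram : Sometimes we may come across a series in which some values are very large in comparison with the others. In such cases the bars for the large values are broken so that the diagram keeps a reasonable shape, and the value of each bar is written on the top of the bar.

Illustration 7. Represent the following data by a suitable diagram.

Year                  2001    2002    2003    2004
Number of students      25      48     375     125

Solution. The largest value (375) is far greater than the others, so it is necessary to break its bar.

NO. OF STUDENTS IN SCIENCE (2001-2005)

Scale : 1 cm = 25 students

[Fig. 9 : Broken bar diagram. X-axis : Years; Y-axis : No. of students, 0 to 200 in steps of 25.]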

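(f) Deviation Bar Diagram : Deviation bars are used for representing net quantities such as net profit, net export, etc., which have both positive and negative values. The plus and minus values are plotted on either side of a base line : positive values above the base line and negative values below the base line.

Illustration 8.

(Rs in Lacs)

Year    Export    Import    Balance of Trade
1998       47        30           17
1999      125       115           10
2000       20        39          -19
2001       94       110          -16
2002      120       125           -5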
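The last column of the table is simply exports less imports, and its sign decides on which side of the base line the bar is drawn. A small sketch of that step follows; the names used are illustrative.

```python
# (export, import) in Rs lacs for each year, from the table above.
trade = {
    1998: (47, 30),
    1999: (125, 115),
    2000: (20, 39),
    2001: (94, 110),
    2002: (120, 125),
}

# Balance of trade = export - import.
balance = {year: exp - imp for year, (exp, imp) in trade.items()}

# A surplus is drawn above the base line, a deficit below it.
position = {
    year: "above (surplus)" if value >= 0 else "below (deficit)"
    for year, value in balance.items()
}

for year in trade:
    print(year, balance[year], position[year])
```

Only the net figure is plotted, so a deviation bar diagram hides the size of exports and imports themselves.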

Solution.

BALANCE OF TRADE (1998-2002)

Scale : 1 cm = 5 lacs

[Fig. 10 : Deviation bar diagram. X-axis : Years (1998 to 2002); Y-axis : Rupees (in lacs), -25 to 25 in steps of 5. Index : Surplus, Deficit.]

B. PIE DIAGRAMS

A pie diagram or circular diagram is another form of diagram. These diagrams are very useful in emphasising the component parts of a total and are comparatively easier to draw. With circles and sectors, totals as well as component parts can be exhibited. The angle at the centre of a circle is 360° or 2π, and the circle is divided into sectors; the diagram so drawn is called a pie diagram.

Pie diagrams are very popularly used to show percentage breakdowns by portioning a circle into various parts. For example, a circle may represent the government expenditure, and the different portions will indicate the expenditure over different heads like Transport, Education, etc. Similarly, the expenditure of a family can be shown under different heads, namely food, clothing, rent, education, etc. If the series has many components, or the difference among the components is very small, then pie diagrams are less effective than bar diagrams.

Steps for Construction of Pie Diagram

1. Each component value is expressed as a percentage of the respective total. Percentages are helpful for comparison.

2. The total is taken to be equal to 360°. It is necessary to express each part proportionately in degrees. Since 1 per cent of the total value is equal to 360/100 = 3.6°, the percentages of the component parts are converted to degrees by multiplying each of them by 3.6.

Degree of any component part = (Component value / Total value) × 360°

3. When two or more circles are drawn simultaneously for comparison, their radii are proportional to the square roots of the totals they represent.

4. It is common to start from the 12 o'clock position on the circle. With this line as base, an angle equal to the degree of the first component is drawn at the centre with the help of a protractor; the new line drawn from the centre meets the circumference. The sector so obtained represents the first component. From this second line as base, another angle is drawn at the centre equal to the degree represented by the second component, giving the sector representing the portion of the second component. Similarly, sectors representing the different component parts can be constructed.

5. The different sectors should be distinguished from one another, for example by different shades or colours.

Illustration 9. Construct a pie diagram to represent the following break-up of cost :

Item            Expenditure
Labour             25 %
Bricks             15 %
Cement             20 %
Steel              15 %
Timber             10 %
Supervision        15 %

Solution. Before drawing the pie diagram, we convert each percentage into degrees by multiplying it by 3.6°.

[Fig. 11 : Pie diagram with sectors for Labour, Bricks, Cement, Steel, Timber and Supervision.]
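The conversion of percentages into degrees, and the position of each sector when one starts from the 12 o'clock line, can be sketched as follows. This is only an illustration of the rule of multiplying each percentage by 3.6; the names `angles` and `sectors` are chosen here.

```python
# Percentage break-up of cost, from Illustration 9.
shares = {
    "Labour": 25,
    "Bricks": 15,
    "Cement": 20,
    "Steel": 15,
    "Timber": 10,
    "Supervision": 15,
}

# 1 per cent of the total corresponds to 360 / 100 = 3.6 degrees.
angles = {item: pct * 360 / 100 for item, pct in shares.items()}

# Start and end of each sector, measured from the 12 o'clock line.
sectors = {}
start = 0.0
for item, angle in angles.items():
    sectors[item] = (start, start + angle)
    start += angle

for item in shares:
    print(f"{item:12s} {angles[item]:5.1f} degrees, sector {sectors[item]}")
```

The angles add up to 360°, so the last sector closes the circle exactly at the starting line.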

Illustration 10. Represent the following data on the export of three textile items (in percentage) by pie diagrams.

Items                  2003-04    2004-05
Readymade Garments       52.2       41.7
Cotton Textiles          19.1       23.3
Woollen Textiles         28.7       35.0
Total                   100.0      100.0

Solution. (Degrees of angle are rounded off)

                            2003-04                    2004-05
Items                   %      Degree of angle     %      Degree of angle
Readymade Garments     52.2         188           41.7         150
Cotton Textiles        19.1          69           23.3          84
Woollen Textiles       28.7         103           35.0         126
Total                 100.0         360          100.0         360

EXPORT OF TEXTILE ITEMS

[Fig. 12 : Two pie diagrams, one for 2003-04 and one for 2004-05.]

Illustration 11. Represent the following data by a pie diagram.

Items of Expenditure                     Family X (Rs)    Family Y (Rs)
1. Food                                       400              640
2. Clothing                                   250              480
3. Rent                                       150              320
4. Education                                   40              100
5. Miscellaneous (Including Saving)           160               60
Total                                        1000             1600

Solution. The degrees are calculated on the basis of 360° taken as equal to the total of the values.

                                              Family X                          Family Y
Items of Expenditure                     Rs     Degrees                    Rs     Degrees
1. Food                                  400    400/1000 × 360 = 144       640    640/1600 × 360 = 144
2. Clothing                              250    250/1000 × 360 = 90        480    480/1600 × 360 = 108
3. Rent                                  150    150/1000 × 360 = 54        320    320/1600 × 360 = 72
4. Education                              40    40/1000 × 360 = 14.4       100    100/1600 × 360 = 22.50
5. Miscellaneous (Including Saving)      160    160/1000 × 360 = 57.6       60    60/1600 × 360 = 13.50
Total                                   1000    360                       1600    360
Square root                             31.6                               40

Radii of circles are determined in proportion 3.2 : 4 (31.6 : 40). Therefore the radii of the circles, according to availability of space, are :

Family X : Radius 3.2/2 = 1.6 cm

Family Y : Radius 4/2 = 2 cm

EXPENDITURE OF FAMILY X AND Y

[Fig. 24 : Two pie diagrams, Family X and Family Y. Index : Food, Clothing, Rent, Education, Miscellaneous.]
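When two circles are compared, the radii follow the square roots of the totals, so that the areas are in proportion to the totals. The sketch below works this out for the two families; fixing the larger radius at 2 cm is an assumption standing in for "availability of space".

```python
import math

family_x = {"Food": 400, "Clothing": 250, "Rent": 150, "Education": 40, "Miscellaneous": 160}
family_y = {"Food": 640, "Clothing": 480, "Rent": 320, "Education": 100, "Miscellaneous": 60}

total_x = sum(family_x.values())
total_y = sum(family_y.values())

# Radii are proportional to the square roots of the totals.
LARGER_RADIUS_CM = 2.0  # assumed radius for the larger circle (Family Y)
radius_y = LARGER_RADIUS_CM
radius_x = LARGER_RADIUS_CM * math.sqrt(total_x) / math.sqrt(total_y)

# Angle of each sector = component value / total value x 360.
angles_x = {item: value * 360 / total_x for item, value in family_x.items()}
angles_y = {item: value * 360 / total_y for item, value in family_y.items()}

print("Radius X:", round(radius_x, 1), "cm; Radius Y:", radius_y, "cm")
print("Angles X:", angles_x)
print("Angles Y:", angles_y)
```

Rounded to one decimal place the smaller radius comes to 1.6 cm, in line with the proportion 3.2 : 4 used in the solution.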

Limitations of Diagrammatic Presentation

While using diagrams and in the interpretation of diagrams, the following points must be remembered :

1. Diagrams are limited in their capacity to give information; their basic function is presentation.

2. Diagrams can show only a limited set of facts; all facts are not possible to show in diagrams.

3. Unlike tables etc., they can misrepresent facts.

The selection of a particular diagram for visual presentation depends on the nature of the data and the object of presentation. Therefore, it should be made with care and caution. A well constructed, simple and attractive diagram makes information easier to understand at a glance; such presentations can be seen in financial reports in newspapers, magazines and journals.

Questions :

Explain the various rules of drawing a diagram.

... their utility. Explain (a) bar diagram, and (b) pie diagram.

"Diagrams are less accurate but more effective than tables in presenting the data."

... the following :

(a) Composition of the population of Delhi by religion.

(b) Agriculture production of five states of India.

What are the merits and limitations of diagrammatic representation of statistical data?

Explain the following with illustration : (a) Sub-divided bar diagrams, and (b) Multiple bar diagrams.

Write short notes on the following :

(a) Percentage bar diagram

(b) Broken bar diagram

(c) Deviation bar diagram

(d) Sub-divided bar diagram

(e) Multiple bar diagram.

Present the following data by simple bar diagram.

PRODUCTION OF COAL (Million Tons)

Production pillion Tons)

bar diagrams

DEMAND AND AVAILABILITY OF STEEL (Thousand Tons)

Exports (Rs in crores)

4. Present the following data by sub-divided bar diagram.

total import

Food Fertilizers Mineral oil Others

Total

2001-02

2002-03 2003-04

474 795

125 298

341 1,113

1,789 1,951

2,729 4,167

(Rupees in crores)

2004-05

988 323 1,042 2,394

4,744 /

Represent the foJlowine • l . for Economics-x\

«e diagram. 'i-e He,p Sunp.e Bar diagram. J

Yi^f ■ ■■ ■ .. ■ . ■:--—---

' 7.

Export (Rs in crores) Import (Rs in crores)

2002 ~~2m _2004

73 80 85

70 72 74

ZOOS

8.

Food Clothing Rent

Education Miscellaneous

Farntly A (Rs)

Family B (Rs)

P'Xpettdiiure

9 TU --——i—^ 1440

' ^'iree year's result of XTT ri T _

• ^ - ".e fCowmg rab...

107

Ipigrammatic Presentation

ntmaiic 11 -----

B (Rs)

3 1

75 100

175 150

30 25

20 25

Price per Unit Quantity Sold Value of Raw Material

Other Expenses

________

Show the following data by percentage bar diagram^^

L^^ of a product . bar

chart :

COST P^OTRPDS AND fROm AND LCTJ^

Cost per table :

(a) Wages

(b) Other Costs

(c) Polishing

Total Cost

Proceeds per table Profit (+) Loss (-)

Chapter 7

GRAPHIC PRESENTATION

3.

sw

Construction of Graphs Graphs of Frequency Distribution Line Frequency Graph • Histogram

Frequency Polygon Frequency Curve

:

asiiiiiifa*

Graphic presentation gives i rr -

as a tool of analysis.

" (Line Graphs,.

^ ^ ... ------- of graphs;

109

Graphic Presentation . „ j of origin 'O' which represent^

axes. (See Fig. D- . ^ j equal parts called qaadrants

Quadrants : G-ph pa^ys ^mded " ^^^ ^^ ^ Y are posttrve.

Fig. 1

.ea. - - - -

^^^ -V-Tdata. d.«e„„t

.e^lu-'rS^^Str;: S^rrXs . ol t„ue, re—. ''Ttôl senes graph X-axts ^^^^^^ ^ ^^ ^

110

■ J J Statistics for Economics-XI

OF FREQOEIICr

scale wid, d,e difference of lO^wS™ T ' " "" wasting too much of space of â7h pSr «« ^ S'^Ph

require a lot of space so that Xîs is T",T'" "" "" axis

portion of the scL may be t

use of 'kinke, fe' in ^phtc pr^tTê Figi'r"" " '

frequency graph

Scale : 0.75 cm = 10 Rs on X-axis

0.75 cm = 10 Workers on V-axis

S'graphs..

(b) Histogram

(c) Frequency Polygon

(d) Frequency Curve or Smoothed Frequency Curve

(e) Cumulative Frequency Curve or 'Ogive'

(a) Line Frequency Graph

fluency array, on graph by which the line is drawn. represents the frequency of that variable on

Height (in inches)    No. of Students (f)
60"                   90
61"                   80
62"                   120
63"                   140
64"                   132
65"                   70
66"                   40

Method

1. X-axis for the variable under study (heights in inches).

2. Y-axis for frequencies (number of students).

3. Draw a vertical line on each value, with its length equal to the frequency of that value.

4. Both the axes must be clearly labelled and the scale of measurement clearly shown.

The starting point of the X-axis can conveniently be determined according to the need of the problem. We can have three varieties of X-axis. Taking the above illustration, they are:

(a) Use of kinked line

(b) Starting from 59"

(c) Starting from 60" (use a thick line to read the data properly).

See the graphs given below.

Solution. HEIGHTS OF STUDENTS

[Figure: Heights of Students, line frequency graph (a) using kinked line. X-axis: height in inches; Y-axis: no. of students. Scale: 1 cm = 1" on X-axis, 1 cm = 20 students on Y-axis.]

Fig. 13

[Figure: Heights of Students, line frequency graph (b) X-axis starting from 59". Scale: 1 cm = 1" on X-axis, 1 cm = 20 students on Y-axis.]

[Figure: Heights of Students, line frequency graph (c) X-axis starting from 60". Scale: 1 cm = 1" on X-axis, 1 cm = 20 students on Y-axis.]

(b) Histogram

' which each

and also called a frequency histo^m : '' " ' ^^-dimensional diagram

Cases of Constructing Histogram

U) Histogram of Equal Class Intervals {« Histogram when Mid-points are given Histogram of Unequal Class intervals

Method

1. X-axis for variables under study (Marks) .y-axis for frequencies

are

freq

Thus is pn

(«)K

n

obtai

113

\ Graphic Presentation class with frequency.

3. oe. reW. —

4. Both the axes must oe ucdny clearly shown.

Solution. histogram

Scale : 1 cm = 10 Marks on X-^is

1 cm = frequency 4 on Y-axis

In the above Illustration 2, the class interval is 10 for all the classes, so the area of the rectangle for each class can be calculated as class interval × frequency.

Class (Marks)    Class frequency (f)    Area of rectangle
0-10             4                      10 × 4  = 40
10-20            10                     10 × 10 = 100
20-30            16                     10 × 16 = 160
30-40            22                     10 × 22 = 220
40-50            18                     10 × 18 = 180
50-60            2                      10 × 2  = 20

Total Area = 720
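The area column can be checked with a few lines of code. This is only a minimal sketch: the function name `rectangle_areas` is made up for illustration, and the classes and frequencies are the ones from the marks table above.

```python
# Classes (marks) and frequencies taken from the table above.
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freqs = [4, 10, 16, 22, 18, 2]

def rectangle_areas(classes, freqs):
    """Area of each histogram rectangle = class interval x frequency."""
    return [(upper - lower) * f for (lower, upper), f in zip(classes, freqs)]

areas = rectangle_areas(classes, freqs)
total_area = sum(areas)
print(areas, total_area)
```

Because every class has the same width here, the areas are proportional to the frequencies, which is why the heights alone can stand for the frequencies.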

Total frequency^^;^^______^ - ^ interval

Histogram = ^en ^ff^^^Tora the following aisnr,bu,ion of total marks

ojrr^-"'a'^Sjra Boara H—.

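Marks (Mid-points)    No. of Students (f)
150                   8
160                   10
170                   25
180                   12
190                   7
200                   3

Method

1. Obtain the class limits of the different classes from the given mid-points.

2. X-axis for the variable under study (marks).

3. Y-axis for frequencies (number of students).

4. Draw a rectangle on each class, with its frequency as the height.

5. Both the axes must be clearly labelled and the scale of measurement should be clearly shown.

To obtain the class limits, get the difference between the second and the first mid-point, i.e., 160 - 150 = 10. Divide the difference by 2, i.e., 10/2 = 5. Subtracting 5 from the mid-point and adding 5 to it, we get the lower and upper limits: 150 - 5 = 145 (lower limit) and 150 + 5 = 155 (upper limit). Thus, the first class is 145 to 155. Using the same adjustment for the other mid-points, we get the classes as under:

Marks           : 145-155  155-165  165-175  175-185  185-195  195-205
No. of Students : 8        10       25       12       7        3

The histogram can be drawn in three ways: (i) using a kinked line, (ii) starting from 145 marks, and (iii) starting from 135 marks. See the figures below.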
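The adjustment from mid-points to class limits can be sketched in code. The helper name `class_limits_from_midpoints` is hypothetical, and the sketch assumes the mid-points are equally spaced, as they are in this illustration.

```python
def class_limits_from_midpoints(midpoints):
    """Recover (lower, upper) class limits from equally spaced mid-points.

    Half the gap between the first two mid-points is subtracted from and
    added to every mid-point.
    """
    half_width = (midpoints[1] - midpoints[0]) / 2
    return [(m - half_width, m + half_width) for m in midpoints]

midpoints = [150, 160, 170, 180, 190, 200]
limits = class_limits_from_midpoints(midpoints)
print(limits)
```

If the mid-points were not equally spaced, a single half-width would no longer apply and each class would need its own adjustment.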

Solution.

[Figure: Histogram, kinked line method. X-axis: marks; Y-axis: no. of students. Scale: 1 cm = 10 marks on X-axis, 1 cm = 5 students on Y-axis.]

[Fig. 7: Histogram, X-axis starting from 145 marks. Scale: 1 cm = 10 marks on X-axis, 1 cm = 5 students on Y-axis.]

[Fig. 9: Histogram, X-axis starting from 135 marks. Scale: 1 cm = 10 marks on X-axis, 1 cm = 5 students on Y-axis.]

(iii) Histogram of Unequal Class Intervals

Illustration 4. Represent the following data by means of a histogram.

of Workers

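Since the class intervals are unequal, the frequencies must be adjusted; otherwise the histogram would give a misleading picture.

Method

1. Take the class which has the lowest class interval.

2. Do not adjust the frequencies of the classes having this lowest interval.

3. Adjust the frequencies of the other classes in proportion to their class intervals. The adjusted frequency gives the height of each rectangle of the histogram, but the widths will be according to the class limits.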

116

Thus, the adjusted frequencies are :

histogram

Scale : 0.5 cm = Rs 5 on X-axis

1 cm = 5 Workers on V-axis

daily wages in rs

Fig. 10

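Histogram : When Class Intervals are given by Inclusive Method

Illustration 5. Construct a histogram for the following data:

Marks    Students (f)
5-9      4
10-14    17
15-19    25
20-24    32
25-29    13
30-34    6

Solution.

Note : Since the class intervals are given by the inclusive method (where both the lower and the upper limits are included in the class), the class limits must be adjusted before the histogram is drawn.

Adjustment : Find the difference between the lower limit of the second class and the upper limit of the first class (10 - 9 = 1) and divide it by 2 (1/2 = 0.5). Subtract 0.5 from each lower limit and add 0.5 to each upper limit, so that the first class becomes 4.5-9.5, the second 9.5-14.5, and so on.

Adjusted Class Limits

Marks        Students (f)
4.5-9.5      5
9.5-14.5     17
14.5-19.5    25
19.5-24.5    32
24.5-29.5    13
29.5-34.5    6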
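The adjustment of inclusive class limits can be sketched as follows. The function name `to_exclusive` is an illustrative choice, and the sketch assumes every pair of neighbouring classes is separated by the same gap as the first two.

```python
def to_exclusive(classes):
    """Convert inclusive classes such as 5-9, 10-14 into continuous classes.

    Half of the gap between the upper limit of the first class and the lower
    limit of the second class is subtracted from every lower limit and added
    to every upper limit.
    """
    half_gap = (classes[1][0] - classes[0][1]) / 2
    return [(lower - half_gap, upper + half_gap) for lower, upper in classes]

inclusive = [(5, 9), (10, 14), (15, 19), (20, 24), (25, 29), (30, 34)]
adjusted = to_exclusive(inclusive)
print(adjusted)
```

After the adjustment each upper limit coincides with the next lower limit, so the rectangles of the histogram touch one another.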

[Figure: Histogram. X-axis: marks; Y-axis: no. of students. Scale: 1 cm = 5 marks on X-axis, 1 cm = 10 students on Y-axis.]

Fig. 24

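(c) Frequency Polygon

(i) With histogram
(ii) Without histogram

(i) Frequency Polygon : With Histogram

Marks    No. of Students (f)
10-20    5
20-30    12
30-40    15
40-50    22
50-60    14
60-70    4

Method

1. Draw a suitable histogram, keeping in mind all the points given earlier.

2. Get the mid-points of the upper horizontal side of each rectangle.

3. Join these mid-points of the adjacent rectangles by straight lines.

4. Both the axes must be clearly labelled and the scale of measurement should be clearly shown.

Solution.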

[Figure: Histogram and frequency polygon. X-axis: marks; Y-axis: no. of students. Scale: 1 cm = 10 marks on X-axis, 1 cm = 5 students on Y-axis. Legend: Histogram, Frequency Polygon.]

Fig. 24

While drawing the frequency polygon, we observe that some area which was under the histogram has been excluded, and some area which was not under the histogram has been included under the frequency polygon. The dotted area was under the histogram but is excluded from the area of the frequency polygon. The shaded area was not under the histogram but has been included under the polygon. Thus there is always some area included under the frequency polygon in place of the area excluded from the histogram. Therefore, the total area excluded from the histogram is equal to the area included under the frequency polygon.

(ii) Frequency Polygon : Without Histogram

Using the same illustration, we can get the frequency polygon without a histogram as under:

Method

1. Take the mid-point of each class interval.

2. Scale of X-axis can either be decided on the basis of class interval or mid-points

3. Join the points plotted for the mid-points corresponding to their frequencies by straight lines. We will get the same figure as obtained by the first method (i.e., with histogram).

Mid-points    No. of Students (f)
15            5
25            12
35            15
45            22
55            14
65            4
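The points to be plotted for a frequency polygon without a histogram can be listed in code. This is a sketch under one stated assumption: the polygon is closed on the X-axis at the mid-points of an empty class on either side, a common convention that the text above does not spell out. The function name `polygon_points` is illustrative.

```python
# Classes and frequencies from the marks distribution above.
classes = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]
freqs = [5, 12, 15, 22, 14, 4]

def polygon_points(classes, freqs):
    """Return (mid-point, frequency) pairs, closed with zero-frequency ends."""
    mids = [(lower + upper) / 2 for lower, upper in classes]
    width = classes[0][1] - classes[0][0]
    start = (mids[0] - width, 0)   # assumed empty class before the first one
    end = (mids[-1] + width, 0)    # assumed empty class after the last one
    return [start] + list(zip(mids, freqs)) + [end]

points = polygon_points(classes, freqs)
print(points)
```

Joining these points in order by straight lines gives the same figure as the polygon drawn over the histogram.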

Solution.

frequency polygon

Y Scale : 1 cm = 10 Marks on X-axis 1 cm = 4 Students on V-axis

Fig. 13

120

Illustration 7 V St'^t'stics for Economics-XI

exa JSr -ks secured by 25 s,ude„,s i„ an

2.1 in -

- - " " - - - - « . « ,, Solution.

Frequency Distribution of Marks

Marks    No. of Students (f)
20-29    2
30-39    5
40-49    8
50-59    6
60-69    4
Total    25

Before preparing the graph, the classes are converted to the exclusive method, i.e.,

Marks        Students (f)
19.5-29.5    2
29.5-39.5    5
39.5-49.5    8
49.5-59.5    6
59.5-69.5    4

[Fig. 14: Graph of the above distribution. X-axis: marks; Y-axis: no. of students. Scale: 1 cm = 10 marks on X-axis, 1 cm = 1 student on Y-axis.]

Illustration 8. We have the following data on the daily expenditure on food (in rupees) for 30 households in a locality:

- - s r/o r/s r/o .s

(a) Obtain a frequency distribution using class intervals : 100-150, 150-200, 200-250, 250-300 and 300-350

(b) Draw a frequency polygon.

(c) What per cent of the households spend less than Rs 250 per day, and what per cent spend more than Rs 200 per day?

Solution. (a)

Daily Expenditure on Food (Rs)    Tally Bars         No. of households (f)
100-150                           ||||               4
150-200                           ||||| |            6
200-250                           ||||| ||||| |||    13
250-300                           |||||              5
300-350                           ||                 2
Total                                                30

(b)

[Fig. 15: Frequency polygon. X-axis: expenditure in rupees (100 to 400); Y-axis: no. of households. Scale: 1 cm = Rs 50 on X-axis, 1 cm = 2 households on Y-axis.]

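(c) Out of 30 households, 23 (4 + 6 + 13) spend less than Rs 250. Hence 23/30 × 100 = 76.67 per cent of the households spend less than Rs 250 per day. Similarly, 20 households (13 + 5 + 2) out of 30, i.e., 66.67 per cent, spend more than Rs 200 per day.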
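The percentages in part (c) can be verified from the frequency table. The helper names `percent_below` and `percent_above` are hypothetical; the sketch only counts whole classes, so the limit passed in must coincide with a class boundary.

```python
# Frequency distribution of daily food expenditure from part (a).
classes = [(100, 150), (150, 200), (200, 250), (250, 300), (300, 350)]
freqs = [4, 6, 13, 5, 2]
total = sum(freqs)

def percent_below(limit):
    """Per cent of households in classes whose upper limit is at most `limit`."""
    count = sum(f for (lower, upper), f in zip(classes, freqs) if upper <= limit)
    return 100 * count / total

def percent_above(limit):
    """Per cent of households in classes whose lower limit is at least `limit`."""
    count = sum(f for (lower, upper), f in zip(classes, freqs) if lower >= limit)
    return 100 * count / total

below_250 = round(percent_below(250), 2)
above_200 = round(percent_above(200), 2)
print(below_250, above_200)
```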

(d) Frequency Curve or Smoothed Frequency Curve

generalVby^'el^^^^^^^^^ frequency curve. It is drawn

area mcluded ,s ,ust the same as^I Tthf poL^^^^"u ^ ^^^^ ^^^"he required to be done carefully to ge^ co rect "eS Smoothing the frequency polygon

shows neither more nor less area of the rectanLs of .h v drawn with care

frequency polygon to get a smoothed freq^ ett "

histogram for the data given in IllustraZ 8 fo^ U"^^ constructing

[Fig. 16: Frequency curve. X-axis: expenditure in rupees; Y-axis: no. of households. Scale: 1 cm = Rs 50 on X-axis, 1 cm = 2 households on Y-axis.]

We observe that :

123

Graphic Presentation

statistics. It is a uni-modal distribution curve.

histogram, frequency polygon and frequency curve

_____1—------ I _____.ri^ol

124


ie) y-Shaped Curve (Curve E) : In this case, maximum frequency'is at the ends of rh. (e) Cumulative Frequency Curve (Ogive)

I- O" the graph paper by rwo

(a) 'Less than' method

(b) 'More than' method

Illustration. Draw 'less than' and 'more than' ogives for the following data:

Marks    No. of Students
0-10     4
10-20    4
20-30    7
30-40    10
40-50    12
50-60    8
60-70    5

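In the 'less than' method, the frequencies are added up from the first class, so each cumulative frequency counts the students obtaining marks less than the upper limit of that class. In the 'more than' method, each cumulative frequency counts the students obtaining marks more than the lower limit of each class; e.g., in the above illustration, the number of students obtaining marks more than 0 is 50; more than 10 is 46; more than 20 is 42; and so on.

Cumulative Frequency Distribution

Marks           No. of Students (c.f.)    Marks           No. of Students (c.f.)
Less than 10    4                         More than 0     50
Less than 20    8                         More than 10    46
Less than 30    15                        More than 20    42
Less than 40    25                        More than 30    35
Less than 50    37                        More than 40    25
Less than 60    45                        More than 50    13
Less than 70    50                        More than 60    5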
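The two cumulative columns can be reproduced with running totals. This is a minimal sketch using `itertools.accumulate` from the Python standard library; the class frequencies 4, 4, 7, 10, 12, 8, 5 are the successive differences of the 'less than' column above.

```python
from itertools import accumulate

# Class frequencies for 0-10, 10-20, ..., 60-70.
freqs = [4, 4, 7, 10, 12, 8, 5]

# 'Less than' method: running total from the first class downwards.
less_than = list(accumulate(freqs))

# 'More than' method: running total from the last class upwards.
more_than = list(accumulate(reversed(freqs)))[::-1]

print(less_than)
print(more_than)
```

The 'less than' totals rise to the total frequency and the 'more than' totals fall from it, which is why the two ogives slope in opposite directions.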

We get a rising curve in the case of the 'less than' method and a declining curve in the case of the 'more than' method, if the above cumulative frequencies are plotted on graph paper.

Method

1. Get the cumulative frequencies of the given frequencies either by the 'less than' method or by the 'more than' method.

2. X-axis: the variable under study.

3. Y-axis: the calculated cumulative frequencies.

4. Plot the various points and join them to get a curve (i.e., the Ogive or Cumulative Frequency Curve).

5. Both the axes must be clearly labelled and the scale of measurement should be clearly shown.

Ogive on Graph Paper (Cumulative Frequency Curve)

[Fig. 18: Ogive by 'less than' method. X-axis: marks; Y-axis: no. of students. Scale: 1 cm = 10 marks on X-axis, 1 cm = 10 students on Y-axis.]

[Fig. 19: Ogive by 'more than' method. X-axis: marks; Y-axis: no. of students. Scale: 1 cm = 10 marks on X-axis, 1 cm = 10 students on Y-axis.]

[Fig. 20: Ogive by 'less than' method. Scale: 1 cm = 10 marks on X-axis, 1 cm = 10 students on Y-axis.]

Draw a 'less than' ogive curve for the following distribution of weekly wages:

Weekly Wages (Rs)    Workers (f)
100-109              7
110-119              13
120-129              15
130-139              32
140-149              20
150-159              8

Method

1. Adjust the lower and upper limits of the classes.

2. Get the cumulative frequencies.

3. Both the axes should be clearly labelled and the scale of measurement should be clearly shown.

Solution.

Adjustment of class limits and calculation of cumulative frequencies by the 'less than' method.

Weekly Wages (Rs)    Workers (f)    Cumulative Frequency (c.f.)
99.5-109.5           7              7
109.5-119.5          13             20
119.5-129.5          15             35
129.5-139.5          32             67
139.5-149.5          20             87
149.5-159.5          8              95

[Fig. 21: Ogive ('less than' method). X-axis: weekly wages in rupees; Y-axis: no. of workers. Scale: 1 cm = Rs 10 on X-axis, 1 cm = 20 workers on Y-axis.]

and indicate the value o^Ae^i^ -----------

128

Solution.


Marks    Number of Students    Cumulative Frequency    Cumulative Frequency
                               (Less than) c.f.        (More than) c.f.
0-5      7                     7                       95
5-10     10                    17                      88
10-15    20                    37                      78
15-20    13                    50                      58
20-25    12                    62                      45
25-30    10                    72                      33
30-35    14                    86                      23
35-40    9                     95                      9

[Fig. 22: Ogive. X-axis: marks; Y-axis: no. of students. Scale: 1 cm = 5 marks on X-axis, 1 cm = 20 students on Y-axis.]


GRAPHS OF TIME SERIES

A time series can be shown on graph paper. Information arranged over a period of time (e.g., years, months, weeks, days, etc.) is termed a time series. Presentation of this type of information by a line or curve on graph paper is of great use in economic statistics. These graphs are known as line graphs, historigrams, or arithmetic line graphs.

(a) General Rules to Construct a Line Graph

1. As the time (year, month, week) is never in negative (i.e., in minus figures), there is no need of using Quadrant II and III.

2. Year, month or week according to the problem, is taken on X-axis. Give titles to X-axis and Y-axis.

3. Start the Y-axis with zero and decide the scales for both the axes. For example, every 1 cm on the Y-axis may represent an equal gap of 50 students, and 1 cm on the X-axis the gap between 2000 and 2001. The X-axis can start either from 1999 or from 2000 (See Fig. 23).

4. The pair values will give different dots on the graph paper. For example, values corresponding to time factor are :

Years    Students
2000     50
2001     150
2002     100
2003     150
2004     200
2005     225
2006     200

The dots obtained from these pairs of values are joined by straight lines; the resulting figure is called a line graph or historigram (See Fig. 23).

[Fig. 23: Students (2000-06), line graph. X-axis: years (2000 to 2006); Y-axis: no. of students (0 to 300). Scale: 1 cm = 50 students on Y-axis.]

5. It is not advisable to ■

by a straight iine and not by a curte " J™" Ae dots

-is r ^^^^ of un.

m One Variable Graph ^^e given below •

Students of Kendriya Vidyalaya

Year       No. of Students
1998-99    120
1999-00    400
2000-01    567
2001-02    490
2002-03    760
2003-04    834
2004-05    750

Method

1. Select the X-axis for the time factor (years).

2. Select the Y-axis for the variable under study (students).

3. Give suitable titles and scales to the X-axis and the Y-axis.

4. Plot a point for each year according to its value and join the points by straight lines.

[Fig. 24: Students, Kendriya Vidyalaya (1998-05), line graph. X-axis: years; Y-axis: no. of students. Scale: 1 cm = 200 students on Y-axis.]

J Graphic Presentation

■ What is a False Base Line?

131

't rlvldTuse faUe base U„e according to ne^ of tbe ptoblent. Keeping

doln^—^ .o out tequiretnents by using False ^se Line) mustrafon No. n-bet of stude^ ts ^ one t^usaud .n e.b

wmmmmM

P" : . . C T in crranh r nresentation See Fig.

Dortion of the scale may De omiueu wm^n ^ciix ---- ,

£ that is the use of False Base Line in graphic presentation (See Fig. 25).

Illustration 13. Present the following information on the graph paper.

Year    Students
2000    1120
2001    1380
2002    1587
2003    1490
2004    1760
2005    1734
2006    1675

[Fig. 25: Students, Govt. Higher Sec. School (2000-06), line graph with false base line. X-axis: years (2000 to 2006); Y-axis: no. of students. Scale: 1 cm = 200 students on Y-axis.]


Year       Agriculture and     Industry    Services
(1)        allied sectors (2)  (3)         (4)
1994-95    5.0                 9.1         7.0
1995-96    -0.9                11.8        10.3
1996-97    9.6                 6.0         7.1
1997-98    -1.9                5.9         9.0
1998-99    7.2                 4.0         8.3
1999-00    0.8                 6.9         8.2

[Fig. 26: Estimated sectoral growth rate in GDP at factor cost, time series graph with three lines: Agriculture and allied sectors, Industry, Services. X-axis: years; Y-axis: growth rate (per cent). Scale: 1 cm = 2 per cent growth rate.]

1(d) Graphs of Different Units different units, we will have two different scales.

When two values are given into two ^^"erent unn , ^^ ^^^

Year       Quantity (in '000 tons)    Value (Rs in crores)
1997-98    9                          300
1998-99    10                         596
1999-00    12                         782
2000-01    11                         900
2001-02    14                         762
2002-03    15                         640

Solution.

Average of Quantity: 12 Approximately Average of Value : 695 Approximately

trade of tea in quantity and value (1997-2003)

qcale • 1 cm = Quantity 3 thousand Tons

or7-axis : 1 cm = value in RS. 150 crore

Quantity

-S-Rupees

1997-98

134

Two figures of graphic presentation

10000

exports (Provisional)

(US $ IMillicn)

Statistics for Economics-Xl\

are shown below to understand time series graph. ^ IMPORTS (Provisional)

(US $ Million)

5000

Fig. 28

m

Questions :

■ TSL^ Name ehe ,«e.e„.

^ - - .apb . prepared. 7. berween -Bar dra^ar^'irH™

JO- Wha, « a 'Cumulat,; ^^ Give iUusrrarion.

! r Tr' '"O'-ency curve

fop rXt"iara presented .„ rbe «*en an the class intervals h^trtheTa.t^X''""'"

pS - f the ess than type- „„ve and

s.gn,f,cance of Rs 715 f^r the gtven « of da™ " '"-e

13.

Sm Vim

14. What is a false base line? Under what conditions would its use be desirable?

15. What is meant by (a) Histogram, and (b) Ogive? Explain their construction with the help of sketches.

16. Distinguish Histogram and Historigram clearly with illustrations.

17. What is a smoothed frequency curve? Discuss briefly various types of frequency curves.

18. Explain the importance of graphic presentation of data.

19. Describe the procedure of drawing histogram when class intervals are (i) equal, and (ii) unequal.

Probl^^s :

^ The frequency distribution of marks obtained by students in a class test is given

40-50 3

below:

. Marks : 0-10 10-20 20-30 30^0

No. of Students : 3 10 14 10

Dtaw a histogram to represent the frequency distribution of marks. Comment on the shape of the histogram.

What is histogram? Present the data given in the table below in the form of a Histogram:

Mid-points : 115 125 135 145 155 165 175 Frequency : 6 25 48 72 116 60 38 3/ Make a frequency Polygon and Histogram using the given data / Marks Obtained : 10-20 20-30 30-40 40-50

/Number of Students : 5 12

4. Draw Histogram from the following data : r Marks Obtained : 10-20 20-30

Number of Students : 6 10

In a certain colony a'sample of 40 households was selected. The data on daily income for this sample are given as follows :

200 120 350 550 400 140 350 85

200

15

30-40 15

22

40-50 10

185 22

50-60 14

50-70 6

195

3

70-80

4

70-100 3

5.

180 170 210 430

110 90 185 140

110 170 250 200

600 800 120 400

350 190 180 200

500 700 350 400

450 630 110, 210

170 250 300

(a) Construct a Histogram and a frequency polygon.

(b) Show that the area under the polygon is equal to the area under the histogram. (Hint. Get a frequency distribution table to obtain a continuous series).

136


Frequency - f s '''''

'st: it: S...

15-19, 20-24 TQ

(b) What percent of th^ hr. u ij '

Size of classes

t'::; -- -- ao-z^ . 30-3.35-.„

Students ^ 10 15

"'Z; ^

Workers : 9 12 15

Weekly Wages of B cia .

; 'tr '"^r ".^r

cZ''^""

Frequency : 4 ^ ^^ 1^-24 24-30 30-36

u. I

. 0-. 3.30 3..0-...0 .0.0 .O-.O

Companies : 2 3 j

^ —

a-OOO.o^, , 35 3. 3. .0 « ^ ^

iwi

■ 137

Waphic Presentation

I P^ .e foUo™. -a —^^^^^^^^^ tr r " -

\ Profit (Rs in ^^ 65 80 95

f i graph from the

Ps " --- --------------------------Im-hnrtv t

Year

'T99"o-91

1991-92

1992-93

1993-94

1994-95

1995-96

1996-97

,, 12 » 25 31 29 27 35

U. 'p:;:::^ — co. a„a .ot. proauct.o„ of a scooter ma—

; company.

Year ^ Production f? (in units) 1 Total Cost I (Rs in lakh)

2001 8500 24

2001 2003 2004 2005

9990 11700 13300 15600

29 34 45 49

<53t

»TICAL TOOLS AND INTIRPRETATIOlf ii

■ Average p^^

••—ye, of Correla«p„

l-rtion to lBde» lV«™be«»

-

Chapter 8

nusmcs of centum, tendency

Weaning and Importance -----

Objects and Functions Of Averages

Characteristics of a RepresentatL Averaoe

Three Orders Of Measurement ®

Arithmetic Average or Mean J-'st of Formulae and Abbreviations

MCAMte AND IMPORTANCE

the whole group is ealleH 7 observations. An average is a fi. J generally

briefly average Th^ j / <^entral tendency or ^easur^T, ^^P^^^^^ts

marks of students -n a , '' ^^ everyday

income of hctnr , represents the marks of aM . !i' the average

As satisfy

fl

According to Croxton and Cowden : "An average value is a single value within the range of the data that is used to represent all of the values in the series. Since the average is somewhere within the range of the data, it is sometimes called a measure of central value."

According to Kellog and Smith : "An average is sometimes called a measure of central tendency because individual values of the variable usually cluster around it."

OBJECTS AND FUNCTIONS OF AVERAGES

1. To represent the salient features of a mass of complex data : An average determines a single figure for the whole series. It is a tool to represent the salient features of a mass of complex data, and it helps in reducing the mass of information into a single value for drawing general conclusions. It is difficult to generalise anything from the ages of crores of Indian people. But if it is said that the average age of an Indian is 55 years, one can draw conclusions about the health conditions of the people. Thus the purpose of an average is to represent a group of individual values in a simple manner, so that the mind can get a quick understanding of the general size of the individuals in the group.

2. To facilitate comparison : Averages are useful for comparison. The average of one group can be compared with the averages of other groups. For example, the average marks of students in section A can be compared at a glance with the average marks of students in section B, or the average monthly sales of Department A can be compared with the average monthly sales of Department B.

3. To know about the universe from a sample : Averages also help to obtain a picture of the complete group by means of sample data. In statistical enquiries, the sample method is very often used. The mean of a sample gives a good idea about the mean of the population.

4. To help in decision making : Averages are helpful for making decisions in planning in various fields. For example, a sales manager may need to know the average number of calls made per day by a salesman in the field. A railway officer will require information regarding the average number of passengers carried by rail on the various passenger runs. Averages are valuable in setting standards, in estimating and planning, and in other managerial decision areas.

5. To trace mathematical relationships : When it is desired to trace the mathematical relationship between different groups or classes, an average becomes essential. Definiteness can only come when they are expressed in averages.

CHARACTERISTICS OF A REPRESENTATIVE AVERAGE

As the average represents statistical information and is used for comparison, it must satisfy the following conditions :

1. It should be simple to calculate and easy to understand : An average should be calculable with reasonable ease and rapidity; only then can it be widely used. It should not involve heavy arithmetical calculations. If the calculation of the average

Statistics for Economics-XI u„de«,„d by pLons of ordln^rSltn 'e. """

are not separated the aver^cotton cloth per mill, if big and small mills cotton mill industry fnTdfa s^pUt'u ^^ ^^^^ ^

female workers, adillt worts ^^^ ^^

OF MEASUREMENT

There are three orders of measurement.

1. Measures of first order.

2. Measures of second order.

3. Measures of third order.

141


students).

kfNDS OF STATISTICAI- AVERAGE

♦►Moving Average

.....^.........^

Arithmetic Mean] tSeometric Mean or Mean X , (GM)

! Harmonic Mean (HM)

Progressive Average Composite Average -^Quadratic Mean

Specialised Average (Index Numbers)

Of the above mentioned average, mean, median, mode are mo^

'"M^lirt cl of qualitative data which caunot be measured quantitatively for rdatiou to all the values, naturally, median should be the choice.

which has highest demand, most fashionable garment, etc.

142


I

1. Meaning

2. Calculation of Arithmetic Mean

3. Mathematical Properties of Arithmetic Mean

4. Miscellaneous Problems

^ Merits and Demerits of Arithmetic Mean

(B) Weighted arithmetic average or Weighted Mean

(b) Short Cut Method (Assumed Mean Method) ic) Step Deviation Method

Let us see the calculations m the following senes.

(A) Series of Individual Observations.

(B) Discrete Series.

(C) Continuous Series.

A. Series of Individual Observations

Suppose the wages of three workers are Rs 1010, Rs 1020 and Rs 1030; we get

$$\frac{1010 + 1020 + 1030}{3} = \frac{3060}{3} = \text{Rs } 1020$$

i.e., the average wage taken by the workers is Rs 1020.

Direct Method : Symbolically,

Worker    Wages (Rs) X
A         1010    X_1
B         1020    X_2
C         1030    X_3
N = 3     ΣX = 3060

Steps :

1. Obtain ΣX by adding all the values of the variable.

2. Divide the total by the number of observations (N). Symbolically,

$$\bar{X} = \frac{X_1 + X_2 + X_3 + \dots + X_n}{N} = \frac{\Sigma X}{N} = \frac{3060}{3} = \text{Rs } 1020$$

Therefore, the average wage of the workers is Rs 1020.

where, $\bar{X}$ = Arithmetic mean

ΣX = sum of all the values of observations, i.e., $X_1 + X_2 + X_3 + \dots + X_n$

N = Number of observations

Alternative equation

$$\bar{X} = \frac{1}{n}\Sigma x$$

where the symbol Σ is the Greek letter called sigma and is used in mathematics to denote the sum of values,

n = total number of observations

Σx = the sum of n values

$$\bar{X} = \frac{1}{3} \times (1010 + 1020 + 1030) = \frac{3060}{3} = \text{Rs } 1020$$
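The direct method translates into a one-line computation. A minimal sketch, using the three wages from the example above; the function name `arithmetic_mean` is chosen only for illustration.

```python
def arithmetic_mean(values):
    """Direct method: sum of the observations divided by their number."""
    return sum(values) / len(values)

wages = [1010, 1020, 1030]
mean_wage = arithmetic_mean(wages)
print(mean_wage)
```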

Special Features of Arithmetic Mean

1. If we replace each item of observation by the calculated mean, then the total of these replaced values will be equal to the sum of the given observations.

Workers    Wages (Rs) X    Mean $\bar{X}$
A          1010            1020
B          1020            1020
C          1030            1020
N = 3      ΣX = 3060       3060

$$N\bar{X} = \Sigma X, \quad 3 \times 1020 = 3060$$

2. The sum of the deviations of the items from the arithmetic mean is always zero.

Worker    Wages (Rs) X    $X - \bar{X}$
A         1010    X_1     -10
B         1020    X_2     0
C         1030    X_3     +10
N = 3     ΣX = 3060       $\Sigma(X - \bar{X}) = 0$

Symbolically, $\Sigma(X - \bar{X}) = 0$

$$(1010 - 1020) + (1020 - 1020) + (1030 - 1020) = 0$$

Alternatively,

$$(X_1 + X_2 + X_3 + \dots + X_n) - n\bar{X} = (1010 + 1020 + 1030) - 3 \times 1020 = 3060 - 3060 = 0$$
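Both special features can be checked numerically for the three wages. This is a sketch only; `math.isclose` is used because, for data whose mean is not a whole number, floating-point rounding may leave a tiny non-zero remainder.

```python
import math

wages = [1010, 1020, 1030]
n = len(wages)
mean_wage = sum(wages) / n

# Feature 1: replacing every item by the mean leaves the total unchanged.
replaced_total = n * mean_wage

# Feature 2: deviations from the mean add up to zero.
deviations = [x - mean_wage for x in wages]
deviation_sum = sum(deviations)

print(replaced_total, deviations, deviation_sum)
```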

Short-Cut Method (Assumed Mean Method)

In this method, any value is taken as the assumed mean and the deviations of the items from it are calculated. To get the arithmetic mean, the total of the deviations is calculated. This total is divided by the number of observations, and finally the quotient is added to the assumed mean.

Worker    Wages (Rs) X    X - A (d)
A         1010            0
B         1020            10
C         1030            20
N = 3                     Σd = 30

Steps :

1. Decide the assumed mean (A), i.e., Rs 1010.

2. Calculate the deviations of the items from the assumed mean (d).

3. Get the sum of the deviations (Σd), i.e., 30.

4. Use the following formula:

$$\bar{X} = A + \frac{\Sigma d}{N} = 1010 + \frac{30}{3} = 1010 + 10 = \text{Rs } 1020$$

i.e., the average wage taken by the workers is Rs 1020.

where, A = assumed mean

d = X - A, deviations of the X variables from the assumed mean

Σd = Σ(X - A), i.e., sum of the deviations of the X variables taken from the assumed mean

N = number of observations
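The short-cut formula can be sketched as a small function. The name `mean_by_assumed_mean` is illustrative; the second call uses a different assumed mean only to show that the choice of A does not change the answer.

```python
def mean_by_assumed_mean(values, assumed_mean):
    """Short-cut method: A + (sum of deviations from A) / N."""
    deviations = [x - assumed_mean for x in values]
    return assumed_mean + sum(deviations) / len(values)

wages = [1010, 1020, 1030]
mean_a = mean_by_assumed_mean(wages, 1010)   # A = 1010, as in the text
mean_b = mean_by_assumed_mean(wages, 1030)   # a different assumed mean
print(mean_a, mean_b)
```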

Step Deviation Method

We can further simplify the short-cut or assumed mean method. All deviations taken from the assumed mean are divided by a common factor.

Symbolically (Assumed mean = Rs 1010)

Worker    Wages (Rs) X    X - A (d)    (X - A)/C (d')
A         1010            0            0
B         1020            10           1
C         1030            20           2
N = 3                                  Σd' = 3

Steps :

1. Decide the assumed mean, i.e., Rs 1010.

2. Calculate the deviations of the items from the assumed mean (d).

3. Divide these deviations by the common factor, i.e., C = 10.

4. Get the sum of the step deviations (Σd'), i.e., 3.

5. Use the following formula :

$$\bar{X} = A + \frac{\Sigma d'}{N} \times C = 1010 + \frac{3}{3} \times 10 = 1010 + 10 = \text{Rs } 1020$$

i.e., the average wage taken by the workers is Rs 1020.

where, $\bar{X}$ = Arithmetic mean

A = Assumed mean

d = X - A, deviations of the X variables from the assumed mean

$d' = \frac{X - A}{C}$, i.e., the deviations divided by the common factor

Σd' = sum of the step deviations

C = Common factor

N = Number of observations
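The step deviation formula can be sketched in the same way. The function name `mean_by_step_deviation` is hypothetical; the assumed mean and common factor are the ones used in the worked example.

```python
def mean_by_step_deviation(values, assumed_mean, common_factor):
    """Step deviation method: A + (sum of d' / N) * C, where d' = (X - A) / C."""
    step_devs = [(x - assumed_mean) / common_factor for x in values]
    return assumed_mean + (sum(step_devs) / len(values)) * common_factor

wages = [1010, 1020, 1030]
mean_wage = mean_by_step_deviation(wages, assumed_mean=1010, common_factor=10)
print(mean_wage)
```

Dividing by C and multiplying back by C cancel out, so the method only saves arithmetic when the deviations share a convenient common factor.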

- - fo^m. .„us„a,ons, done

(a) Direct method Direct Method

Illustration 1. Calculate the arithmetic mean from the following data showing the daily income of ten workers in a factory.

Workers              : A    B    C    D    E    F    G    H    I    J
Daily Income (in Rs) : 120  150  180  200  250  300  220  350  370  260

Solution.

ΣX = 2400

$$\bar{X} = \frac{\Sigma X}{N} = \frac{2400}{10} = 240 \text{ rupees}$$

The average daily income of workers is Rs 240.

Illustration 2. Calculate the arithmetic mean of the daily income given in Illustration 1 by the short-cut method (Assumed Mean Method).

Worker    Daily Income (Rs) X    X - A (d)
A         120                    -80
B         150                    -50
C         180                    -20
D         200                    0
E         250                    +50
F         300                    +100
G         220                    +20
H         350                    +150
I         370                    +170
J         260                    +60
N = 10                           Σd = +400

Steps :

1. Decide the assumed mean, suppose A = 200.

2. Calculate the deviations from the assumed mean, i.e., X - A = d.

3. Get the total of the deviations calculated from the assumed mean (Σd).

4. Use the following formula :

$$\bar{X} = A + \frac{\Sigma d}{N} = 200 + \frac{400}{10} = 200 + 40 = 240 \text{ rupees}$$

The average daily wage of workers is Rs 240.

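Illustration 3. Calculate the arithmetic mean of the daily income given in Illustration 1 by the step deviation method.

Worker    Daily Income (Rs) X    X - A (d)    (X - A)/C (d')
A         120                    -80          -8
B         150                    -50          -5
C         180                    -20          -2
D         200                    0            0
E         250                    +50          +5
F         300                    +100         +10
G         220                    +20          +2
H         350                    +150         +15
I         370                    +170         +17
J         260                    +60          +6
N = 10                                        Σd' = 40

Steps :

1. Decide the assumed mean, suppose A = 200.

2. Calculate the deviations from the assumed mean, i.e., X - A = d.

3. Divide the deviations by the common factor, C = 10, to get the step deviations d'.

4. Get the sum of the step deviations (Σd'), i.e., 40.

5. Use the following formula :

$$\bar{X} = A + \frac{\Sigma d'}{N} \times C = 200 + \frac{40}{10} \times 10 = 200 + 40 = 240 \text{ rupees}$$

The average daily wage of workers is Rs 240.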

B. Discrete Series

Students : A    B    C    D    E    F    G    H    I    J
Marks    : 50   100  50   150  100  50   150  100  50   100

Solution.

Marks (X)    No. of Students (f)    fX
50           4                      200
100          4                      400
150          2                      300
             Σf = N = 10            ΣfX = 900

Here 4 students got 50 marks each, 4 got 100 marks each and 2 got 150 marks each.

Steps :

1. Multiply the frequencies with the variables (fX).

2. Get the sum of the products (ΣfX).

3. Divide the total by the number of observations (Σf or N).

$$\bar{X} = \frac{f_1X_1 + f_2X_2 + f_3X_3 + \dots + f_nX_n}{f_1 + f_2 + f_3 + \dots + f_n} = \frac{\Sigma fX}{\Sigma f} \text{ or } \frac{\Sigma fX}{N} = \frac{900}{10} = 90 \text{ marks}$$

where,

ΣfX = sum of the products of variables and their frequencies

f = Frequency

N = Σf, i.e., Number of observations

fX = Product of variables with their respective frequencies
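Grouping the raw marks into a discrete series and applying the formula can be sketched with `collections.Counter` from the Python standard library. The marks are those of the ten students above; the variable names are illustrative.

```python
from collections import Counter

marks = [50, 100, 50, 150, 100, 50, 150, 100, 50, 100]

# Build the discrete series: each distinct mark with its frequency.
freq = Counter(marks)

sum_fx = sum(x * f for x, f in freq.items())   # sum of the products fX
n = sum(freq.values())                         # N = sum of the frequencies
mean_marks = sum_fx / n
print(dict(freq), sum_fx, n, mean_marks)
```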

Direct method :

Illustration 5. The following table gives the marks obtained by 100 students in a class. Calculate the arithmetic mean.

Marks           : 10   20   30   40   50
No. of Students : 5    10   40   20   25

Solution.

Marks (X)    No. of Students (f)    fX
10           5                      50
20           10                     200
30           40                     1200
40           20                     800
50           25                     1250
             Σf = N = 100           ΣfX = 3500

Steps :

1. Multiply the frequency with the variable X.

2. Get the sum of the products (ΣfX).

3. Divide ΣfX by the total number of observations, i.e., Σf or N.

$$\bar{X} = \frac{\Sigma fX}{N} \text{ or } \frac{\Sigma fX}{\Sigma f} = \frac{3500}{100} = 35$$

Average marks of students is 35.

Alternative equation :

$$\bar{X} = \frac{1}{n}\Sigma fx = \frac{1}{100}[50 + 200 + 1200 + 800 + 1250] = \frac{3500}{100} = 35 \text{ marks}$$

As in the case of individual observations, $N\bar{X} = \Sigma fX$, i.e., 100 × 35 = 3500, and the sum of the deviations from the mean, weighted by their frequencies, is zero: $\Sigma f(X - \bar{X}) = -475 + 475 = 0$.

Short-Cut Method (Assumed Mean Method) : We can use this ^method to calculate arithmetic mean in order to simplify arithmetic calculations. The followmg formula is

used :

Ifd

X = A +

N

Here,

A = Assumed Mean N = Number of observations f - frequency

X - A, i.e., deviations of variables taken from assumed mean •Lfd = Sum of the product of frequencies and their respective deviations Dlustration 6. Calculate the average marks of students given in Illustration 5 by short cut method.

Short-Cut Method (Assumed Mean Method)

Marks (X)    No. of Students (f)    d = X - 30    fd
10           5                      -20           -100
20           10                     -10           -100
30           40                     0             0
40           20                     +10           +200
50           25                     +20           +500
             N = 100                              Σfd = 500

Steps :

1. Decide assumed mean, suppose A = 30.

2. Calculate the deviations from assumed mean, i.e., X - A = d.

3. Multiply deviations by frequency and get fd.

4. Add the products of deviations and frequency (Σfd).

5. Use the following formula :

X̄ = A + Σfd / N

= 30 + 500 / 100

= 30 + 5 = 35

Average marks of students is 35.
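A small Python sketch of the short-cut method is given below, using the same marks and frequencies. The function name `mean_short_cut` is a hypothetical helper, not something defined in the text. The sketch also tries two other assumed means (10 and 50, chosen only for illustration) to show that the choice of A changes the working figures but not the answer.

```python
def mean_short_cut(values, freqs, assumed_mean):
    """Mean = A + (sum of fd) / N, where d = X - A."""
    n = sum(freqs)
    sum_fd = sum(f * (x - assumed_mean) for x, f in zip(values, freqs))
    return assumed_mean + sum_fd / n

marks = [10, 20, 30, 40, 50]
students = [5, 10, 40, 20, 25]

# A = 30 is the assumed mean used in the worked solution;
# 10 and 50 are extra trial values added for comparison.
results = [mean_short_cut(marks, students, a) for a in (30, 10, 50)]
print(results)
```

In hand calculation an assumed mean near the middle of the series is convenient only because it keeps the deviations small.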

Step Deviation Method : We can further simplify the short-cut method. All deviations taken from the assumed mean are divided by a common factor. The following formula is used to calculate the arithmetic mean by the step deviation method :

X̄ = A + (Σfd' / N) × C

Here,

A = Assumed mean

N = Number of observations

C = Common factor

f = Frequency

d' = (X - A) / C, i.e., step deviation

Σfd' = Sum of the products of frequencies and their respective step deviations

Illustration 7. Calculate the average marks of students given in Illustration 5 by the step deviation method.

Solution.

Marks (X)    No. of Students (f)    d = X - 30    d' = (X - 30) / 10    fd'
10           5                      -20           -2                    -10
20           10                     -10           -1                    -10
30           40                     0             0                     0
40           20                     +10           +1                    +20
50           25                     +20           +2                    +50
             N = 100                                                    Σfd' = 50

Steps :

1. Decide assumed mean, suppose A = 30.

2. Calculate deviations from assumed mean, i.e., X - A = d.

3. Divide deviations by the common factor, i.e., (X - A) / C = d' (common factor C = 10).

4. Multiply step deviations by frequency (fd').

5. Add the products of step deviations and frequency (Σfd').

6. Use the following formula :

X̄ = A + (Σfd' / N) × C

= 30 + (50 / 100) × 10 = 30 + 5 = 35

Average marks of students is 35.

C. Continuous Series

In continuous series, the method of calculation of arithmetic mean is the same as in the case of discrete series. The only difference is that in continuous series the mid-points of the class intervals are required to be obtained. The following equation can be used to get the mid-points :

Mid-point (m) = (l1 + l2) / 2

Here l1 represents the lower limit and l2 represents the upper limit, e.g., the mid-point for a class 5-10 can be obtained as (5 + 10) / 2 = 7.5.

After obtaining the mid-points, we can use all the three methods of calculation of arithmetic mean in the same way as we used in discrete series. These methods are :

(i) Direct method, (ii) Short-cut method, and (iii) Step deviation method.

Direct Method

Illustration 8. Calculate the arithmetic mean of the following data :

Marks           : 0-4   4-8   8-12   12-16
No. of Students : 4     8     2      1

Solution.

Marks (X)    No. of Students (f)    Mid-points (m)    fm
0-4          4                      2                 8
4-8          8                      6                 48
8-12         2                      10                20
12-16        1                      14                14
             Σf = N = 15                              Σfm = 90

Steps :

1. Obtain mid-points (m) of the classes, i.e., m = (l1 + l2) / 2.

Here, l1 = lower limit and l2 = upper limit.

2. Multiply the frequency with the mid-point (fm).

3. Get the sum of the products (Σfm).

4. Divide Σfm by the total number of observations (N).

Symbolically,

X̄ = (f1m1 + f2m2 + f3m3 + ... + fnmn) / (f1 + f2 + f3 + ... + fn) = Σfm / N = 90 / 15 = 6

Mean marks are 6.

Alternative Equation

X̄ = (1/n) Σfm

where, n = total of frequencies (Σf)

Σfm = Sum of the products of mid-points of classes and their respective frequencies

X̄ = (1/15) × [(4 × 2) + (8 × 6) + (2 × 10) + (1 × 14)]

= (1/15) × [8 + 48 + 20 + 14]

= (1/15) × 90 = 90 / 15

= 6 Marks.

Special Features of Arithmetic Mean

1. The total of frequencies multiplied by the arithmetic mean is always equal to the sum of the products of mid-points of various classes and their respective frequencies.

N X̄ = Σfm

15 × 6 = 90

2. The sum of the deviations of mid-points from the arithmetic mean, multiplied by their frequencies, is always equal to ZERO.

Σf(m - X̄) = 0

= 4(2 - 6) + 8(6 - 6) + 2(10 - 6) + 1(14 - 6) = -16 + 0 + 8 + 8 = 0
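For a continuous series the only extra step is finding the mid-points. The following Python sketch works through the classes 0-4, 4-8, 8-12 and 12-16 with frequencies 4, 8, 2 and 1, and then checks the two special features stated above. The variable names are our own, and exact fractions are used so that the zero in the second feature is exact rather than approximate.

```python
from fractions import Fraction

classes = [(0, 4), (4, 8), (8, 12), (12, 16)]
freqs = [4, 8, 2, 1]

# Mid-point of each class: (lower limit + upper limit) / 2
mids = [Fraction(l1 + l2, 2) for l1, l2 in classes]

n = sum(freqs)                                       # N
sum_fm = sum(f * m for f, m in zip(freqs, mids))     # sum of fm
mean = sum_fm / n

# Feature 1: N times the mean equals the sum of fm.
feature_1 = (n * mean == sum_fm)
# Feature 2: frequency-weighted deviations of mid-points from the mean sum to zero.
feature_2 = sum(f * (m - mean) for f, m in zip(freqs, mids))
print(mean, feature_1, feature_2)
```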


Short-Cut Method (Assumed Mean Method)

Illustration 9. Calculate arithmetic mean of Illustration 8 by the short-cut method.

Solution.

Marks (X)    No. of Students (f)    Mid-point (m)    d = m - 2    fd
0-4          4                      2                0            0
4-8          8                      6                +4           32
8-12         2                      10               +8           16
12-16        1                      14               +12          12
             N = 15                                               Σfd = 60

Steps :

1. Obtain mid-points.

2. Decide assumed mean (A = 2).

3. Calculate the deviation from assumed mean, i.e., m - A = d.

4. Multiply deviation by frequency and get fd.

5. Add the products of deviation and frequency (Σfd).

6. Use the formula :

X̄ = A + Σfd / N

= 2 + 60 / 15

= 2 + 4 = 6

Mean Marks are 6.

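Illustration 10. Calculate arithmetic mean of Illustration 8 by the step deviation method.

Solution.

Marks (X)    No. of Students (f)    Mid-point (m)    d = m - 2    d' = (m - 2) / 4    fd'
0-4          4                      2                0            0                   0
4-8          8                      6                +4           +1                  8
8-12         2                      10               +8           +2                  4
12-16        1                      14               +12          +3                  3
             N = 15                                                                   Σfd' = 15

Steps :

1. Obtain mid-points.

2. Decide assumed mean (A = 2).

3. Calculate the deviations from assumed mean, i.e., m - A = d.

4. Divide deviations by the common factor, i.e., (m - A) / C = d' (common factor C = 4).

5. Multiply step deviations by frequency.

6. Add the products of step deviations and frequency (Σfd').

7. Use the formula :

X̄ = A + (Σfd' / N) × C

= 2 + (15 / 15) × 4

= 2 + 1 × 4 = 6

Mean Marks are 6.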

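Illustration 11. Calculate the mean of the following distribution of daily wages of workers in a factory.

Daily Wages (Rs) : Below 120   120-140   140-160   160-180   Above 180
No. of Workers   : 10          20        30        15        5           (Total 80)

Solution. The first and last classes are open-ended. Since the class intervals of the other classes are uniform (20), we close the ends as 100-120 and 180-200.

Calculation of mean.

Daily Wages (Rs) (X)    No. of Workers (f)    Mid-points (m)    fm
100-120                 10                    110               1100
120-140                 20                    130               2600
140-160                 30                    150               4500
160-180                 15                    170               2550
180-200                 5                     190               950
                        N = 80                                  Σfm = 11700

Applying formula, we get

X̄ = Σfm / Σf or Σfm / N

= 11700 / 80

= 146.25

Mean wage of workers is Rs 146.25.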

Illustration 12. The following information pertains to the daily income of 150 families. Calculate the arithmetic mean.

Income (Rs)      No. of families
More than 75     150
More than 85     140
More than 95     115
More than 105    95
More than 115    70
More than 125    60
More than 135    40
More than 145    25

Solution. First, get the class frequencies from the given 'more than' cumulative frequencies.

Income (Rs) (X)    (f)    Mid-points (m)    m - 100    d' = (m - 100) / 10    fd'
75-85              10     80                -20        -2                     -20
85-95              25     90                -10        -1                     -25
95-105             20     100               0          0                      0
105-115            25     110               +10        +1                     +25
115-125            10     120               +20        +2                     +20
125-135            20     130               +30        +3                     +60
135-145            15     140               +40        +4                     +60
145-155            25     150               +50        +5                     +125
                   N = 150                                                    Σfd' = +245

Applying formula,

X̄ = A + (Σfd' / N) × C

= 100 + (245 / 150) × 10

= 100 + 16.33 = 116.33

Arithmetic Mean is Rs 116.33.
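The conversion from 'more than' cumulative frequencies to class frequencies, followed by the step deviation formula, can be sketched in Python as below. The helper name `class_frequencies_from_more_than` is hypothetical, and the class width of 10 for the last class (145-155) is an assumption carried over from the solution table.

```python
def class_frequencies_from_more_than(cum_freqs):
    """Each class frequency is the drop from one 'more than' total to the next."""
    following = cum_freqs[1:] + [0]
    return [c - nxt for c, nxt in zip(cum_freqs, following)]

lower_limits = [75, 85, 95, 105, 115, 125, 135, 145]
more_than = [150, 140, 115, 95, 70, 60, 40, 25]
width = 10  # assumed uniform class width; closes the last class as 145-155

freqs = class_frequencies_from_more_than(more_than)
mids = [lower + width / 2 for lower in lower_limits]

A, C = 100, 10  # assumed mean and common factor used in the solution
sum_fd_step = sum(f * (m - A) / C for f, m in zip(freqs, mids))
mean = A + sum_fd_step / sum(freqs) * C
print(freqs, sum_fd_step, round(mean, 2))
```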



Charlier's Accuracy Check

The accuracy of calculations can be checked by a method given by Charlier while computing the arithmetic mean by the short-cut method and the step deviation method (in a frequency distribution in discrete and continuous series). The formula is as follows :

Σf(d' + 1) = Σfd' + Σf

Equal values on both sides of the above formula are a proof of correct calculations. We add one more column, f(d' + 1), to the table of calculations prepared in discrete and continuous series.

Illustration 13. Calculate the mean for the following marks obtained in Statistics by 50 students. Also apply Charlier's accuracy check for verifying calculations.

Marks    : 0-10   10-20   20-30   30-40   40-50   50-60
Students : 4      6       20      10      7       3

Solution.

Marks (X)    Students (f)    m     m - 15    d' = (m - 15) / 10    fd'    f(d' + 1)
0-10         4               5     -10       -1                    -4     0
10-20        6               15    0         0                     0      6
20-30        20              25    +10       +1                    +20    40
30-40        10              35    +20       +2                    +20    30
40-50        7               45    +30       +3                    +21    28
50-60        3               55    +40       +4                    +12    15
             N = 50                                                Σfd' = 69    Σf(d' + 1) = 119

Arithmetic Mean, X̄ = A + (Σfd' / N) × C

= 15 + (69 / 50) × 10

= 15 + 13.8 = 28.8

Hence mean marks are 28.8 or 29 approx.

Applying Charlier's test :

Σf(d' + 1) = Σfd' + Σf

119 = 69 + 50

119 = 119

Hence, the calculation is correct.
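Charlier's check is meant for hand-worked tables, so the Python sketch below treats the columns as figures copied from the worked table and tests whether they agree with one another. The function name `charlier_check` and the deliberately wrong column `fd_with_slip` are illustrative assumptions, added only to show what a copying slip looks like to the check.

```python
# Columns as they appear in the worked table.
f = [4, 6, 20, 10, 7, 3]
d_step = [-1, 0, 1, 2, 3, 4]
fd = [-4, 0, 20, 20, 21, 12]
f_d_plus_1 = [0, 6, 40, 30, 28, 15]

def charlier_check(freqs, fd_col, f_d1_col):
    """True when the sum of f(d'+1) equals the sum of fd' plus the sum of f."""
    return sum(f_d1_col) == sum(fd_col) + sum(freqs)

ok = charlier_check(f, fd, f_d_plus_1)

# Hypothetical slip: one entry of the fd' column copied as 2 instead of 20.
fd_with_slip = [-4, 0, 2, 20, 21, 12]
caught = not charlier_check(f, fd_with_slip, f_d_plus_1)

# Recomputing the fd' column from f and d' confirms the table entries.
recomputed_fd = [fi * di for fi, di in zip(f, d_step)]
print(ok, caught, recomputed_fd == fd)
```

The check confirms only that the two columns are consistent; it cannot detect a wrong frequency or a wrong mid-point that was used in both columns.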


Mathematical Properties of Arithmetic Mean

The arithmetic mean has the following important mathematical properties :

1. The sum of the deviations of the items from the arithmetic mean is always equal to zero. The mean is a point of balance, and the sum of the positive deviations is equal to the sum of the negative deviations.

Marks (X)    X - X̄ (x)
5            -10
10           -5
15           0
20           +5
25           +10
ΣX = 75      Σ(X - X̄) = 0

X̄ = ΣX / N = 75 / 5 = 15

Σ(X - X̄) = 0, i.e., Σx = 0

Here, Σx or Σ(X - X̄) = Total of the deviations from the arithmetic mean. In case of discrete and continuous series, Σfx or Σf(X - X̄) = 0.

2. We can calculate the combined arithmetic mean from the means and the number of observations of two or more related groups. The combined mean formula is as under :

(i) Two related groups :

X̄1,2 = (N1X̄1 + N2X̄2) / (N1 + N2)

Here,

X̄1,2 = Combined mean of two groups

X̄1 = Mean of first group

X̄2 = Mean of second group

N1 = Number of observations in the first group

N2 = Number of observations in the second group

(ii) Three related groups :

X̄1,2,3 = (N1X̄1 + N2X̄2 + N3X̄3) / (N1 + N2 + N3)

Here,

X̄1,2,3 = Combined mean of three groups

X̄1 = Mean of first group

X̄2 = Mean of second group

X̄3 = Mean of third group

N1 = No. of observations in first group

N2 = No. of observations in second group

N3 = No. of observations in third group

The formula can be extended for more groups. The general formula of the combined mean of more groups is as under :

Combined Mean X̄1,2,3,...,n = (N1X̄1 + N2X̄2 + N3X̄3 + ... + NnX̄n) / (N1 + N2 + N3 + ... + Nn)

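Illustration 14. The mean marks of 60 students in section A are 40 and the mean marks of 40 students in section B are 35. Calculate the combined mean of all the students of sections A and B.

Solution.

Section    Mean Marks    No. of Students
A          40 (X̄1)       60 (N1)
B          35 (X̄2)       40 (N2)

Here, N1 = 60, N2 = 40, X̄1 = 40 and X̄2 = 35

X̄1,2 = (N1X̄1 + N2X̄2) / (N1 + N2)

= [(60 × 40) + (40 × 35)] / (60 + 40)

= (2400 + 1400) / 100

= 3800 / 100

= 38 marks.

The combined mean marks of the students of sections A and B are 38 marks.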

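Illustration 15. The mean marks of 100 students of combined sections A and B are 38 marks. There are 60 students in section A and 40 students in section B. If the mean marks of section A are 40, find out the mean marks of section B.

Solution.

Section    Mean Marks    No. of Students
A          40 (X̄1)       60 (N1)
B          ? (X̄2)        40 (N2)

Combined mean (X̄1,2) = 38

X̄1,2 = (N1X̄1 + N2X̄2) / (N1 + N2)

where, X̄1,2 = 38, N1 = 60, N2 = 40 and X̄1 = 40

38 = [(60 × 40) + (40 × X̄2)] / (60 + 40)

38 = (2400 + 40X̄2) / 100

3800 = 2400 + 40X̄2

40X̄2 = 3800 - 2400

40X̄2 = 1400

X̄2 = 35 Marks

Hence, mean of the students of section B is 35 marks.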

Illustration 16. The mean marks of 100 students of combined sections A and B are 38 marks. If the mean marks of section A are 40 and that of section B are 35, find out the number of students in sections A and B.

Solution.

Section    Mean Marks    No. of Students
A          40 (X̄1)       ? (N1)
B          35 (X̄2)       ? (N2)
                         100

Combined Mean (X̄1,2) = 38

X̄1,2 = (N1X̄1 + N2X̄2) / (N1 + N2)

Here X̄1,2 = 38, X̄1 = 40, X̄2 = 35 and N1 + N2 = 100, so that N2 = (100 - N1).

38 = [(N1 × 40) + (100 - N1) × 35] / 100

3800 = 40 N1 + 3500 - 35 N1

40 N1 - 35 N1 = 3800 - 3500

5 N1 = 300

N1 = 60

Hence, the students in section A are 60 and in section B are (100 - 60) = 40.
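The three combined-mean problems above use one relation in three directions: finding the combined mean, finding a missing group mean, and finding a missing group size. A minimal Python sketch follows; the function names `combined_mean`, `missing_mean` and `first_group_size` are our own and simply rearrange the combined mean formula.

```python
from fractions import Fraction

def combined_mean(sizes, means):
    """Sum of (group size x group mean) divided by the total size."""
    total = sum(n * m for n, m in zip(sizes, means))
    return Fraction(total, sum(sizes))

def missing_mean(combined, n1, mean1, n2):
    """Solve the two-group formula for the second group's mean."""
    return Fraction(combined * (n1 + n2) - n1 * mean1, n2)

def first_group_size(combined, total_n, mean1, mean2):
    """Solve N1*mean1 + (N - N1)*mean2 = N*combined for N1."""
    return Fraction(total_n * (combined - mean2), mean1 - mean2)

combined = combined_mean([60, 40], [40, 35])   # both sections together
mean_b = missing_mean(38, 60, 40, 40)          # mean of section B
n_a = first_group_size(38, 100, 40, 35)        # students in section A
print(combined, mean_b, n_a)
```

`combined_mean` accepts any number of groups, so it also covers the three-group and general formulas.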

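3. The sum of the squares of the deviations of the items from the arithmetic mean is the minimum, i.e., it is less than the sum of the squares of the deviations taken from any other value.

Set I (deviations taken from the mean, 3)

X          X - 3 (x)    (X - 3)² (x²)
1          -2           4
2          -1           1
3          0            0
4          +1           1
5          +2           4
ΣX = 15                 Σx² = 10

Set II (deviations taken from another value, 2)

X          X - 2 (x')   (X - 2)² (x'²)
1          -1           1
2          0            0
3          +1           1
4          +2           4
5          +3           9
                        Σx'² = 15

X̄ = ΣX / N = 15 / 5 = 3

Σx² = 10 is the sum of the squares of the deviations taken from the mean.

Σx'² = 15 is the sum of the squares of the deviations taken from another value (2).

Thus Σx² is less than Σx'² (10 < 15).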
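The comparison of Set I and Set II can be repeated for other starting values with a short Python sketch. The function name `sum_squared_deviations` and the list of extra trial values are assumptions made for illustration; the property itself is the one stated above.

```python
def sum_squared_deviations(items, origin):
    """Sum of (X - origin) squared over all items."""
    return sum((x - origin) ** 2 for x in items)

items = [1, 2, 3, 4, 5]
mean = sum(items) / len(items)

from_mean = sum_squared_deviations(items, mean)   # Set I: deviations from 3
from_two = sum_squared_deviations(items, 2)       # Set II: deviations from 2

# A few other trial values; each should give a larger total than the mean does.
candidates = [0, 1, 2, 2.5, 3.5, 4, 5]
all_larger = all(sum_squared_deviations(items, c) > from_mean for c in candidates)
print(from_mean, from_two, all_larger)
```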

4. Arithmetic mean is calculated by a simple formula, i.e., X̄ = ΣX / N. If any two of the three values of the formula are known, the third can be calculated.

X̄ = ΣX / N or ΣX = N X̄

X            X̄
10           30
20           30
30           30
40           30
50           30
ΣX = 150     150

X̄ = ΣX / N = 150 / 5 = 30

N X̄ = ΣX

5 × 30 = 150

150 = 150

This property has great utility in the calculation of wage bills, e.g., average wage Rs 120, number of workers 2000.

∴ Total wage bill = N X̄ = 2000 × 120 = Rs 2,40,000.

The relation N X̄ = ΣX can be easily used for correcting the value of mean, which is explained in the following illustration.

Illustration 17. The arithmetic mean of a series of 40 items was calculated as Rs 265. But while calculating it an item Rs 115 was misread as Rs 150. Find the correct arithmetic mean.

Solution.

Since X̄ = ΣX / N, ΣX = N X̄

Here, X̄ = 265, N = 40

ΣX = 40 × 265 = 10600

The calculated ΣX, i.e., 10600, is wrong as the item Rs 115 was misread as Rs 150. Let us get the correct ΣX by subtracting the incorrect item and adding the correct item.

Incorrect ΣX             = 10600
Less : Incorrect item    =   150
                           10450
Add : Correct item       =   115
Correct ΣX               = 10565

Hence, corrected arithmetic mean = Correct ΣX / N = 10565 / 40 = Rs 264.12.
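The correction procedure in the illustration above (rebuild the total, remove the wrong item, add the right one, divide again) can be sketched in Python as follows. The function name `corrected_mean` is a hypothetical helper; it accepts lists so that more than one misread item can be corrected at once.

```python
def corrected_mean(n, wrong_mean, wrong_items, right_items):
    """Correct a mean that was computed with some items misread."""
    total = n * wrong_mean                                # incorrect total
    total = total - sum(wrong_items) + sum(right_items)   # corrected total
    return total / n

# 40 items, calculated mean Rs 265, item 115 misread as 150.
result = corrected_mean(40, 265, [150], [115])
print(result)
```

The exact quotient is 264.125, which the worked solution reports as Rs 264.12.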

Illustration 18. The arithmetic mean of a given set of data

(i) in terms of rupees, and (ii) in terms of paise.

Solution. Since

X̄ = ΣX / N, ΣX = N X̄

Here, N = number of observations.

The calculated ΣX, i.e., 500, is wrong as the 5 observations are Rs 5 less than their actual values. Let us correct ΣX (5 observations × Rs 5 = Rs 25).

Incorrect ΣX                                           = 500
Add : Corrected balance of 5 observations (5 × 5)     =  25
Correct ΣX                                             = 525

(i) Corrected mean in terms of rupees = Corrected ΣX / N = 525 / 5 = Rs 105

(ii) Corrected mean expenditure in terms of paise = Rs 105 × 100 = 10500 paise.

The mean of 5 items (1, 2, 3, 4, 5) is 3, i.e.,

X̄ = (1 + 2 + 3 + 4 + 5) / 5 = ΣX / N = 15 / 5 = 3

If we add, subtract or multiply each item by a constant value, say 2, we get the following :

X          X + 2      X - 2      X × 2
1          3          -1         2
2          4          0          4
3          5          +1         6
4          6          +2         8
5          7          +3         10
ΣX = 15    25         5          30
X̄ = 3      5          1          6

Column 1 : X̄ = 3

(+) Column 2 : X̄ = 5, added 2 = (3 + 2) = 5

(-) Column 3 : X̄ = 1, subtracted 2 = (3 - 2) = 1

(×) Column 4 : X̄ = 6, multiplied by 2 = (3 × 2) = 6

Thus, after addition, subtraction and multiplication of the items by a constant, the new means are obtained by the same addition, subtraction and multiplication by the same constant to their means.
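The effect of a constant on the mean can be confirmed with a short Python sketch using the standard library `statistics.mean` function. The items 1 to 5 and the constant 2 are those of the example; the variable names are our own.

```python
from statistics import mean

items = [1, 2, 3, 4, 5]
k = 2  # the constant applied to every item

base = mean(items)
added = mean([x + k for x in items])
subtracted = mean([x - k for x in items])
multiplied = mean([x * k for x in items])

# Each new mean is the old mean changed by the same operation.
print(base, added, subtracted, multiplied)
```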

Miscellaneous Problems

These problems relate to the calculation of the arithmetic mean.

1. In the case of open-end classes

Example

Marks            : Below 10   10-15   15-20   20-25   Above 25
No. of Students  : 5          8       3       4       5

The lower limit of the first class and the upper limit of the last class are required to be defined by making an assumption about the class intervals of the first and last classes. In the above example, the same class interval is decided (i.e., 5). Thus, the first class would be 5-10 and the last class 25-30.

2. In the case of cumulative frequency distribution

Example

Marks            No. of Students
Less than 10     5
Less than 15     13
Less than 20     16
Less than 25     20
Less than 30     25

We are given,

Marks    : 5-10   5-15   5-20   5-25   5-30
Students : 5      13     16     20     25

Cumulative frequencies are required to be converted into class frequencies before applying the formula. Thus we get,

Marks    : 5-10   10-15          15-20           20-25           25-30
Students : 5      (13 - 5) = 8   (16 - 13) = 3   (20 - 16) = 4   (25 - 20) = 5

Example :

Marks            No. of Students
More than 5      25
More than 10     20
More than 15     12
More than 20     9
More than 25     5

We are given,

Marks    : 5-30   10-30   15-30   20-30   25-30
Students : 25     20      12      9       5

Class frequencies are obtained from the above descending cumulative frequencies. They are :

Marks    : 5-10            10-15           15-20          20-25         25-30
Students : (25 - 20) = 5   (20 - 12) = 8   (12 - 9) = 3   (9 - 5) = 4   5

Merits and Demerits of Arithmetic Mean

Merits : Arithmetic mean is the most popularly used average because of the following merits :

1. It is simple to understand and easy to calculate.

2. It is based on all the observations of the series. Therefore, it is the most representative measure.

3. Its value is always definite. It is rigidly defined and not affected by personal bias.

4. The calculation of arithmetic mean does not require any specific arrangement of data.

5. It is capable of further mathematical treatment.

6. It is least affected by fluctuations of sampling and ensures stability in calculations.

7. It is a good base for comparison.

Demerits : Though arithmetic mean is a popular average, it should be used with caution.

It is affected by extreme values, e.g., the General Manager's salary in a firm is Rs 20,000, an employee's Rs 5,500, a clerk's Rs 4,500 and a typist's Rs 2,000.

The average salary will be (Rs 20,000 + Rs 5,500 + Rs 4,500 + Rs 2,000) / 4 = Rs 8,000 per month. The average so calculated is not representative. It is affected by the extreme value of Rs 20,000 paid to the General Manager.

4. Arithmetic mean can be a value that does not exist in the series at all, e.g., the average of 4, 8 and 9 is (4 + 8 + 9) / 3 = 7, which is not an item of the series.

5. Arithmetic mean gives more importance to the bigger items and less importance to the smaller items.

6. It cannot be decided just by observation. It needs mathematical calculations.

(B) Weighted Arithmetic Average or Weighted Mean

1. Meaning

2. Uses of Weighted Mean

3. Calculation of Weighted Mean

(a) Equal Weights

(b) Unequal Weights

Meaning

Simple arithmetic mean gives equal importance to all the items of a series. In fact, there are a number of cases where the relative importance of the items is not equal. One item may be more important than another. The different degrees of importance of the items are expressed by numbers known as weights. In other words, weights are figures to indicate the relative importance of items.

Uses of Weighted Mean

1. It is used to give unequal weights to different categories of employees in a factory.

2. Weighted mean is used for comparison of the results of two or more universities or boards.

3. It is used to calculate standardised birth rate and death rate.

4. It is used in the construction of Index Numbers.

Calculation of Weighted Mean

The formula for calculating weighted arithmetic mean is as under :

X̄w = ΣWX / ΣW

where,

X̄w = Weighted Arithmetic Mean

W = Weights

X = The variables

Steps. (i) Multiply weights by X and obtain WX.

(ii) Divide the total (ΣWX) by total weights (ΣW).

Illustration 19. Calculate the average payment of wage per hour by three ways.

Solution.

Simple Arithmetic Mean (X̄) :

Workers    Wages per hour (Rs) (X)
Man        8
Woman      6
Child      4
           ΣX = 18

X̄ = ΣX / N = (8 + 6 + 4) / 3 = 18 / 3 = Rs 6 per hour.

(a) Weighted Mean (Equal Weights) : (X̄w)

Suppose men, women and child workers are 50 each.

Type       Wages (Rs) (X)    Workers (W)    WX
Man        8  (X1)           50 (W1)        400 (X1W1)
Woman      6  (X2)           50 (W2)        300 (X2W2)
Child      4  (X3)           50 (W3)        200 (X3W3)
                             ΣW = 150       ΣWX = 900

X̄w = (X1W1 + X2W2 + X3W3) / ΣW = ΣWX / ΣW = 900 / 150 = Rs 6

Weighted Mean is Rs 6.

Thus, the weighted arithmetic mean will be equal to the simple arithmetic mean when all the items are given equal weights.

X̄w = X̄

Rs 6 = Rs 6.

(b) Weighted Mean (Unequal Weights) : (X̄w)

Suppose men, women and child workers are 10, 20 and 50 respectively, then our answer would be different.

Type       Wages (Rs) (X)    Workers (W)    WX
Man        8                 10             80
Woman      6                 20             120
Child      4                 50             200
                             ΣW = 80        ΣWX = 400

X̄w = ΣWX / ΣW = 400 / 80 = Rs 5

Thus, the weighted arithmetic mean will be less than the simple arithmetic mean when items of small values are given greater weights and items of big values are given less weights.

X̄w < X̄

Rs 5 < Rs 6

However, in the absence of given weights, assumed weights can be assigned to the items on the basis of their relative importance.

But, normally they are not equal. Suppose men, women and child workers are 50, 20 and 10 respectively, then our answer would be different.

Type       Wages (Rs) (X)    Workers (W)    WX
Man        8                 50             400
Woman      6                 20             120
Child      4                 10             40
                             ΣW = 80        ΣWX = 560

X̄w = ΣWX / ΣW = 560 / 80 = 7

Weighted Mean is Rs 7.

Thus, the weighted arithmetic mean will be greater than the simple arithmetic mean when items of small values are given less weights and items of big values are given more weights.

X̄w > X̄

Rs 7 > Rs 6
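The three weighting cases for the hourly wages of Rs 8, Rs 6 and Rs 4 can be compared side by side in a short Python sketch. The function name `weighted_mean` is our own; the weights are the numbers of workers used in the three cases above.

```python
from fractions import Fraction

def weighted_mean(values, weights):
    """Sum of WX divided by sum of W."""
    total_wx = sum(w * x for x, w in zip(values, weights))
    return Fraction(total_wx, sum(weights))

wages = [8, 6, 4]  # man, woman, child (Rs per hour)

simple = Fraction(sum(wages), len(wages))
equal_weights = weighted_mean(wages, [50, 50, 50])
more_children = weighted_mean(wages, [10, 20, 50])   # small values weighted more
more_men = weighted_mean(wages, [50, 20, 10])        # big values weighted more
print(simple, equal_weights, more_children, more_men)
```

The weighted mean is pulled towards whichever wage has the largest number of workers attached to it.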

Illustration 20. Calculate Weighted Mean by weighting each price by the quantity consumed.

Articles of Food    Quantity Consumed (kg)    Price in Rs (per kg)
Flour               11.50                     5.8
Ghee                5.60                      58.4
Sugar               .28                       8.2
Potato              .16                       2.5
Oil                 .35                       20.0

Solution.

Food Articles    Price in Rs per kg (X)    Qty. Consumed in kg (W)    WX
Flour            5.8                       11.50                      66.700
Ghee             58.4                      5.60                       327.040
Sugar            8.2                       .28                        2.296
Potato           2.5                       .16                        0.400
Oil              20.0                      .35                        7.000
Total                                      ΣW = 17.89                 ΣWX = 403.436

X̄w = ΣWX / ΣW = 403.436 / 17.89 = 22.55

Weighted Mean Price is Rs 22.55.

Illustration 21. From the results of the two schools A and B given below, state which of them is better.

             School A                School B
Class        Appeared    Passed      Appeared    Passed
IX           30          25          100         80
X            50          45          120         95
XI           200         150         100         70
XII          120         75          80          50
Total        400         295         400         295

Solution. Use the Weighted Arithmetic Mean after obtaining homogeneous figures by converting the results into percentages.

School A

Class    Appeared (W)    Passed    Pass % (X)    WX
IX       30              25        83.33         2499
X        50              45        90            4500
XI       200             150       75            15000
XII      120             75        62.50         7500
         ΣW = 400                                ΣWX = 29499

School B

Class    Appeared (W)    Passed    Pass % (X)    WX
IX       100             80        80            8000
X        120             95        79.2          9504
XI       100             70        70            7000
XII      80              50        62.5          5000
         ΣW = 400                                ΣWX = 29504

School A : X̄w = ΣWX / ΣW = 29499 / 400 = 73.75

School B : X̄w = ΣWX / ΣW = 29504 / 400 = 73.76

School B is better.

Illustration 22. An examination was held to decide the award of a scholarship.

Subject             Weight    Marks of A    Marks of B    Marks of C
Statistics          4         63            60            65
Accountancy         3         65            64            70
Economics           2         58            56            63
Business Studies    1         70            80            52

If the candidate getting the highest marks is to be awarded the scholarship, who should get it?

Solution.

Subject             Weight    Marks of A         Marks of B         Marks of C
                    W         X1      WX1        X2      WX2        X3      WX3
Statistics          4         63      252        60      240        65      260
Accountancy         3         65      195        64      192        70      210
Economics           2         58      116        56      112        63      126
Business Studies    1         70      70         80      80         52      52
Total               ΣW = 10           ΣWX1 = 633         ΣWX2 = 624         ΣWX3 = 648

Weighted Mean of A, X̄w1 = ΣWX1 / ΣW = 633 / 10 = 63.3 Marks.

Weighted Mean of B, X̄w2 = ΣWX2 / ΣW = 624 / 10 = 62.4 Marks.

Weighted Mean of C, X̄w3 = ΣWX3 / ΣW = 648 / 10 = 64.8 Marks.

The weighted mean of C is the highest, hence he is entitled to the scholarship.

LIST OF FORMULAE AND ABBREVIATIONS

[Arithmetic Mean, Properties and Weighted Mean]

Type of Series                                   Direct Method     Short-cut Method      Step Deviation Method
1. Individual Observations (Ungrouped data)      X̄ = ΣX / N        X̄ = A + Σd / N        X̄ = A + (Σd' / N) × C
2. Discrete Series (Grouped data)                X̄ = ΣfX / N       X̄ = A + Σfd / N       X̄ = A + (Σfd' / N) × C
3. Continuous Series (Grouped data)              X̄ = Σfm / N       X̄ = A + Σfd / N       X̄ = A + (Σfd' / N) × C

Mathematical Properties of Arithmetic Mean

Here, X - X̄ = x

(1) Σ(X - X̄) = 0, i.e., Σx = 0; Σf(X - X̄) = 0, i.e., Σfx = 0

(2) Combined mean : X̄1,2 = (N1X̄1 + N2X̄2) / (N1 + N2)

Similarly, X̄1,2,3 = (N1X̄1 + N2X̄2 + N3X̄3) / (N1 + N2 + N3)

(3) Σ(X - X̄)² is the least, i.e., Σx² is minimum

(4) N X̄ = ΣX

Weighted Mean : X̄w = ΣWX / ΣW

Abbreviations

X̄ = Arithmetic Mean.
X = The variables.
ΣX = Sum of all the items of the variable X.
N = Number of observations or (Σf).
f = Frequency.
ΣfX = Sum of the products of variable (X) and the frequencies (f).
m = Mid-values.
Σfm = Sum of the products of mid-points and the frequencies.
A = Assumed mean.
d = X - A, i.e., deviations of X variable from an assumed mean.
Σd = Sum of the deviations of X variable taken from an assumed mean.
C = Common factor.
d' = (X - A) / C, i.e., step deviations of X variable from assumed mean and divided by common factor.
Σd' = Sum of step deviations.
Σfd = Sum of the products of frequencies and their respective deviations.
Σfd' = Sum of the products of frequencies and their respective step deviations.
X - X̄ = x, i.e., deviations of X variable from the mean.
(X - X̄)² = x², i.e., square of the deviations of X variable from mean.
X̄1,2 = Combined mean of two groups.
X̄1 = Arithmetic Mean of the first group.
X̄2 = Arithmetic Mean of the second group.
N1 = Number of observations in the first group.
N2 = Number of observations in the second group.
X̄w = Weighted Arithmetic Mean.
W = Weights.
ΣWX = Sum of the products of variable X and weight.

EXERCISES

Questions :

1. What is a statistical average? Mention different types of averages.

2. What are the functions of an average? Discuss the characteristics of good average. Which of the average possesses most of these characteristics?

3. What is meant by 'Central Tendency'? Discuss the essentials of a measure of central tendency.

4. Name the commonly used measure of central tendency.

5. Define the mean. Also explain properties of mean.

6. Why is arithmetic mean the most commonly used measure of central tendency?

Llutionr "" """ ^^ ^ - fr^q-ncy

-- - ^ -sure

' ^of the va^es of the vanable from

in ^ " unweighted mean?

10. What are the uses of weighted mean?

11. Write notes on (a) Central Tendency, and (b) Weighted Mean.

Problems :

1. Calculate arithmetic averages of the following information :

(a) Marks obtained by 10 students : 30, 62, 47, 25, 52, 39, 56, 66, 12, 24

(b) Income of 7 families (in Rs) : 550, 490, 670, 890, 435, 590, 575. Also show Σ(X - X̄) = 0.

(c) Height of 8 students (in cm) : 140, 145, 147, 152, 148, 144, 150, 151

[X̄ : (a) = 41.3 marks, (b) = Rs 600, (c) = 147.12 cm]

2. Calculate the arithmetic mean of the scores of each batsman :

Name of       Match I                   Match II                  Match III
batsman       I Inning    II Inning     I Inning    II Inning     I Inning    II Inning
A             60          20            26          10            100         40
B             40          50            60          36            70          80
C             100         10            8           18            100         140
D             20          40            46          84            42          52

[X̄ : A = 42.67, B = 56, C = 62.67 and D = 47.33]

3. Calculate mean of the following series :

Size      : 4   5    6    7    8    9    10
Frequency : 6   12   15   28   20   14   5

[X̄ = 7.06]

4. Calculate average sales and expenses of the following 10 firms :

Firms    Sales (Rs in '000)    Expenses (Rs in '000)
1        50                    11
2        50                    13
3        55                    14
4        60                    16
5        65                    16
6        65                    15
7        65                    15
8        60                    14
9        60                    13
10       50                    13

[X̄ : Sales = Rs 58,000, Expenses = Rs 14,000]

5. Calculate mean of the following frequency distribution :

Values    : 60   62   64    67    70    73    77    81   85   89
Frequency : 54   82   103   176   212   180   115   78   50   21

[X̄ = 70.94]

6. Calculate arithmetic mean of the following data :

Profit (in Rs) : 0-10   10-20   20-30   30-40   40-50   50-60
No. of shops   : 12     18      27      20      17      16

[X̄ = Rs 30.45]

7. Compare the average age of males in the two countries :

Age Group    Population of India (in lakhs)    Population of U.K. (in lakhs)
0-5          214                               18
5-10         258                               19
10-15        222                               20
15-20        157                               18
20-25        145                               16
25-30        161                               14
30-40        267                               27
40-50        184                               25
50-60        120                               19
60-65        100                               17

[Average Age : India = 25.25 years and U.K. = 29.404 years]

8. Calculate simple and weighted arithmetic averages of the following items :

Items    Weights    Items    Weights
68       1          124      9
85       46         128      14
101      31         143      2
102      1          146      4
108      11         151      6
110      7          153      5
112      23         172      2
113      17

[Simple Mean = 121.07 and Weighted Mean = 108.71]

Marks           : Less than 10   Less than 20   Less than 30   Less than 40   Less than 50
No. of Students : 5              15             55             75             100

[X̄ = 30 Marks]

Also get Σf(X - X̄) = 0

11. There are two branches of an establishment employing 100 and 80 persons respectively. If the arithmetic means of the monthly salaries by the two branches are Rs 275 and Rs 225 respectively, find out the arithmetic mean of the salaries of the employees of the establishment as a whole.

[Combined Mean = X̄1,2 = Rs 252.8]

to be"49 Tth' by a group of 100 students were found

If Too ; H obtained in the same examination by another group

of 200 students were 52.32. Fmd out the mean of marks obtained by both th g^ups of students taken together. ^ ^^ 3,

The mean marks of 1 >0 students were found to be 40. Later on it was discovere

— Ll ' ^^ —^^ -- correspondingTot

-ru • , [Correct X = 39.7 Marksl

14. The mean weight of 25 boys in group A of a class is 61 kg and the mean weight of 35 boys in group B of the same class is 58 kg. Find the mean weight of 60 boys.

[X̄1,2 = 59.25 kg]

15. Calculate mean of the following data :

Marks Below     : 10   20   30   40   50   60   70   80   90   100
No. of Students : 5    9    17   29   45   60   70   78   83   85

[X̄ = 48.41 Marks]

16. Calculate Combined Mean

Section    Mean Marks    No. of Students
A          75            50
B          60            60
C          55            50

[X̄1,2,3 = 63.125 Marks]

17.

average of 31 marks. What were the average marks of the other students?

[X̄2 = 57.25 Marks]

18. The mean wage of 1000 workers of a factory was found to be Rs 180.4. Later on it was discovered that the wages of two workers were wrongly taken as 297 and 165 instead of 197 and 185. Find the correct mean.

[Corrected Mean : Rs 180.32]

19. Find the average wage of a worker from the following data :

Wages (in Rs) Above : 300   310   320   330   340   350   360   370
No. of workers      : 650   500   425   375   300   275   250   100

[X̄ = Rs 339.23]


20.

Temperature (°C)    No. of days
-40 to -30          10
-30 to -20          28
-20 to -10          30
-10 to 0            42
0 to 10             65
10 to 20            180
20 to 30            10

[X̄ = 4.29 °C]

21. A candidate obtains the following percentage of marks : Sanskrit 75, Mathematics 84, Economics 56, English 78, Politics 57, History 54, Geography 47. It is agreed to give double weights to marks in English, Mathematics and Sanskrit. What is the weighted and simple arithmetic mean?

[X̄w = 68.8, X̄ = 64.43 Marks]

22. Calculate weighted mean by weighting each price by the quantity consumed :

Food items    Quantity Consumed    Price in Rupees (per kg)
Flour         500 kg               1.25
Ghee          200 kg               20.00
Sugar         30 kg                4.50
Potato        15 kg                0.50
Oil           40 kg                5.50

[X̄w = Rs 6.35]

23. Comment on the performance of the students of three universities given below using weighted mean :

No. of Students are in hundreds

Courses of study    Mumbai    Kolkata    Chennai

% pass    No. of students    % pass    No. of students    % pass    No. of students

M.A. M.Com. B.A. B.Com. B.Sc. M.Sc. T-l 83 73 74 65 66 4 5 2 3 3 S2 76 73 76 65 601 3 6 7 3 7 81 76 74 58 70 73 2 3.5 4.5 2 7 2

[Weighted Mean : Mumbai = 72.55, Kolkata = 70.6 and Chennai = 72.0; Mumbai is better]

24. A distribution consists of three components with total frequencies of 200, 250 and 300 having means of 25, 10 and 15 respectively. Find out the mean of the combined distribution.

[X̄1,2,3 = 16]

Chapter 9

POSITIONAL AVERAGE AND PARTITION VALUES

(a) Median,

(b) Partition Values (Quartiles), and

(c) Mode.

MEDIAN

1. Definition

2. Calculation of Median

3. Mathematical Properties of Median

4. Merits and Demerits of Median

Definition

^ev^r shit t s^r z"-- --- - ---e

sxss

According to A.L. Bowley, "If the numbers of the group are ranked in order according to the measurement under consideration, then the measurement of the number most nearly one half is the median."

According to Secrist, "Median of a series is the value of the item, actual or estimated, when a series is arranged in order of magnitude, which divides the distribution into two parts."

Consider the heights of 7 students in a class.

Name of Student    Height (cm)
Anurag             151
Deven              140
Suresh             149
Mayoor             142
Atul               147
Satish             144
Himankar           145

The first and most important rule for obtaining the median is that the data should be arranged in an ascending (increasing) or descending (decreasing) order. This arrangement facilitates locating the central position so that the series may be divided into two parts, one less than the central value and the other more than the central value.

So, we arrange our data in ascending order as follows :

Name of students : Deven   Mayoor   Satish   Himankar   Atul   Suresh   Anurag
Height (cm)      : 140     142      144      145        147    149      151

If we arrange the above data in descending order we get :

Name of students : Anurag   Suresh   Atul   Himankar   Satish   Mayoor   Deven
Height (cm)      : 151      149      147    145        144      142      140

From this ordering also we observe that 145 cm or value of the 4th item is the median.

Calculation of median

(a) Individual observations.

(b) Discrete series.

(c) Continuous series.

Median is the central positional average of given data. That is, median has a position more or less at the centre of the values and it divides the series roughly into equal parts.

(a) Individual Observations

Illustration 1. The heights of 7 students are given below. Compute the median height.

Name of Students    Height (cm)
Anurag              151
Deven               140
Suresh              149
Mayoor              142
Atul                147
Satish              144
Himankar            145

Solution.

Name of Students    Height (cm)
Deven               140
Mayoor              142
Satish              144
Himankar            145
Atul                147
Suresh              149
Anurag              151

Steps :

1. The above data must be arranged either in ascending or descending order to get the value of median. Arrange the data in ascending order.

2. Locate the median by finding the size of the ((N + 1) / 2)th item.

3. Applying the formula, we get

Me = Size of ((N + 1) / 2)th item

= Size of ((7 + 1) / 2)th item

= Size of 4th item

Median is Himankar's height, i.e., 145 cm.

Illustration 2. Suppose one more student, Rajesh, joins the group with a height of 152 cm, which will be the 8th item in the list. Calculate the median height.

Solution. When the number of items in an individual series is 3, 5, 7, 9, 11 etc., that is, when it is an odd number, the central item, i.e., the ((N + 1) / 2)th item, will be a whole number.

But when the number of items in a series is even, 2, 4, 6, 8, 10 etc., the central item, i.e., the ((N + 1) / 2)th item, will be in fraction.

Arranging the data in ascending order including the height of Rajesh, we get

Name of the Students    Height (cm)
Deven                   140
Mayoor                  142
Satish                  144
Himankar                145
Atul                    147
Suresh                  149
Anurag                  151
Rajesh                  152

Me = Size of ((N + 1) / 2)th item

= Size of ((8 + 1) / 2)th item

= Size of 4.5th item

Median is estimated by finding the arithmetic mean of the two middle values, i.e., adding the heights of Himankar and Atul and dividing by two.

Size of 4.5th item = (4th item + 5th item) / 2

= (145 + 147) / 2 = 292 / 2 = 146

Median height = 146 cm.
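The rule for individual observations (take the middle item when N is odd, average the two middle items when N is even) can be sketched in Python as below. The function name `median_individual` is our own; the result is cross-checked against the standard library `statistics.median`, which follows the same rule.

```python
from statistics import median

def median_individual(observations):
    """Median of ungrouped data using the ((N + 1) / 2)th item rule."""
    ordered = sorted(observations)        # ascending order is required first
    n = len(ordered)
    if n % 2 == 1:
        # Odd N: (N + 1) / 2 is a whole number; subtract 1 for 0-based indexing.
        return ordered[(n + 1) // 2 - 1]
    # Even N: (N + 1) / 2 ends in .5, so average the two middle items.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

heights = [151, 140, 149, 142, 147, 144, 145]
median_7 = median_individual(heights)            # seven students
median_8 = median_individual(heights + [152])    # after adding Rajesh
print(median_7, median_8, median(heights), median(heights + [152]))
```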

Illustration 3. Calculate the median of the following data :

Serial No.    Marks    Serial No.    Marks    Serial No.    Marks
1             17       7             41       13            11
2             32       8             32       14            15
3             35       9             11       15            35
4             33       10            18       16            23
5             15       11            20       17            38
6             21       12            22       18            12

Solution. The marks are arranged in an ascending order in the following table :

Serial No.    Marks    Serial No.    Marks    Serial No.    Marks
1             11       7             18       13            32
2             11       8             20       14            33
3             12       9             21       15            35
4             15       10            22       16            35
5             15       11            23       17            38
6             17       12            32       18            41

Median = Size of the ((N + 1) / 2)th item = Size of the ((18 + 1) / 2)th item

= 9.5th item

The value of 9.5th item = (Value of the 9th item + Value of the 10th item) / 2

= (21 + 22) / 2 = 21.5

Hence Median = 21.5

(b) Discrete Series

Illustration 4. Calculate median of the following distribution :

Marks           : 10   20   30   40   50   60   70   80
No. of Students : 1    9    16   26   20   16   7    4

Solution.

Marks    No. of Students (f)    Cumulative frequencies (c.f.)
10       1                      1
20       9                      1 + 9 = 10
30       16                     10 + 16 = 26
40       26                     26 + 26 = 52
50       20                     52 + 20 = 72
60       16                     72 + 16 = 88
70       7                      88 + 7 = 95
80       4                      95 + 4 = 99
         N = 99

Steps :

1. Arrange the data in ascending or descending order.

2. Compute cumulative frequencies.

3. Apply the formula :

Me = Size of ((N + 1) / 2)th item.

4. Median is located at the size of the item in whose cumulative frequency the value of the ((N + 1) / 2)th item falls.

Median = Size of ((N + 1) / 2)th item

= Size of ((99 + 1) / 2)th item

= 50th item

The 50th item falls in the cumulative frequency 52, against which the value is 40.

Median Marks = 40 Marks.
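Locating the median of a discrete series through cumulative frequencies can be sketched in Python as follows. The function name `median_discrete` is hypothetical. The frequency for 20 marks is taken as 9, which is what the cumulative column (1, 10, 26, ...) and N = 99 imply. The sketch follows the location rule of the text: it returns the first value whose cumulative frequency reaches the ((N + 1) / 2)th item.

```python
from itertools import accumulate

def median_discrete(values, freqs):
    """Value whose cumulative frequency first covers the ((N + 1) / 2)th item."""
    pairs = sorted(zip(values, freqs))              # ascending order of values
    cum = list(accumulate(f for _, f in pairs))     # cumulative frequencies
    position = (cum[-1] + 1) / 2                    # ((N + 1) / 2)th item
    for (value, _), cf in zip(pairs, cum):
        if cf >= position:
            return value

marks = [10, 20, 30, 40, 50, 60, 70, 80]
students = [1, 9, 16, 26, 20, 16, 7, 4]   # second frequency assumed to be 9
median_marks = median_discrete(marks, students)

# Unsorted input is handled too: daily wages with their numbers of workers.
median_wage = median_discrete([100, 50, 70, 110, 80], [15, 20, 15, 18, 12])
print(median_marks, median_wage)
```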

Illustration 5. Find out the value of median from the following data :

Daily wages (in Rs) : 100   50   70   110   80
Number of Workers   : 15    20   15   18    12

Solution.

Wages in Ascending Order (Rs)    Number of Workers (f)    Cumulative Frequencies (c.f.)
50                               20                       20
70                               15                       35
80                               12                       47
100                              15                       62
110                              18                       80

Median is the value of the ((N + 1) / 2)th or ((80 + 1) / 2)th or 40.5th item. All items from the 36th up to the 47th have a value of 80. Thus the median value would be Rs 80.

(c) Continuous Series

Illustration 6. The size of land holdings of 380 families in a village is given below. Find the median size of land holdings.

Size of Land Holdings (in acres)    No. of Families
Less than 100                        40
100-200                              89
200-300                             148
300-400                              64
400 and above                        39

Solution.

Size of Land Holdings (in acres)    No. of families (f)    Less than cumulative frequencies
0-100                                40                     40
100-200                              89                    129
200-300                             148                    277
300-400                              64                    341
400-500                              39                    380

Steps :

1. Compute less than cumulative frequencies.

2. Median item is located by finding out the size of the $\left(\frac{N}{2}\right)^{th}$ item. Do not use the $\left(\frac{N+1}{2}\right)^{th}$ item in continuous series.

3. Locate the median group in the cumulative frequency column where the size of the $\left(\frac{N}{2}\right)^{th}$ item falls.

4. Apply the following formula to calculate the median from the located group :

$Me = l_1 + \frac{\frac{N}{2} - c.f.}{f} \times i$

where, $l_1$ = Lower limit of median group.

c.f. = Cumulative frequency of the class preceding the median class.

f = Frequency of the median group.

i = The class interval of the median group.

Calculation of Median

$Me = \text{size of } \left(\frac{N}{2}\right)^{th} \text{ item} = \text{size of } \left(\frac{380}{2}\right)^{th} \text{ item} = 190^{th} \text{ item}$

Median lies in the group 200-300. Applying the formula, we get

$Me = l_1 + \frac{\frac{N}{2} - c.f.}{f} \times i$

where, $l_1 = 200$, $\frac{N}{2} = 190$, c.f. = 129, f = 148, i = 100

$Me = 200 + \frac{190 - 129}{148} \times 100 = 200 + \frac{61 \times 100}{148} = 200 + 41.216 = 241.216$

∴ Median size of land holding = 241.22 acres (i.e., 50% of the families have land holdings of less than or equal to 241.22 acres and 50% of the families have land holdings of more than or equal to 241.22 acres).
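The interpolation formula for a continuous series can be sketched in Python. The function name `median_grouped` and the representation of classes as (lower, upper) pairs are assumptions made for this sketch only. The open-ended classes are closed as 0-100 and 400-500, as in the solution table.

```python
def median_grouped(classes, freqs):
    """Median of a continuous series: l1 + (N/2 - c.f.) / f * i.

    classes: (lower, upper) limits in ascending order
    freqs:   class frequencies in the same order
    """
    target = sum(freqs) / 2             # the (N/2)th item
    cum_before = 0                      # c.f. of the classes before the current one
    for (lower, upper), f in zip(classes, freqs):
        if f > 0 and cum_before + f >= target:      # median class found
            return lower + (target - cum_before) / f * (upper - lower)
        cum_before += f


holdings = [(0, 100), (100, 200), (200, 300), (300, 400), (400, 500)]
families = [40, 89, 148, 64, 39]
print(round(median_grouped(holdings, families), 2))     # 241.22
```

As in the text, the sketch assumes that the items are spread uniformly within the median class.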

Illustration 7. Calculate median from the following data :

Age (in years)    Number of Persons (f)    Age (in years)    Number of Persons (f)
55-60              7                       35-40             30
50-55             13                       30-35             33
45-50             15                       25-30             28
40-45             20                       20-25             14
                                           Total            160

Note : If the given question is in descending order of values, then before solving the question the data are required to be arranged in ascending order to calculate less than cumulative frequencies.

Solution. This question has been solved below after arranging the series in ascending order.

Age in years (Ascending order)    No. of persons (f)    Cumulative frequency (c.f.)
20-25                             14                     14
25-30                             28                     42
30-35                             33                     75
35-40                             30                    105
40-45                             20                    125
45-50                             15                    140
50-55                             13                    153
55-60                              7                    160

In the above example median is the value of the $\left(\frac{N}{2}\right)^{th}$ or $\left(\frac{160}{2}\right)^{th}$ or 80th item, which lies in the 35-40 class interval.

$Me = l_1 + \frac{\frac{N}{2} - c.f.}{f} \times i = 35 + \frac{80 - 75}{30} \times 5 = 35 + \frac{5 \times 5}{30} = 35 + 0.83 = 35.83$

∴ Median Age = 35.83 years

Illustration 8. Calculate the median from the following data :

Value           Frequency (f)    Value           Frequency (f)
Less than 10      4              Less than 50     96
Less than 20     16              Less than 60    112
Less than 30     40              Less than 70    120
Less than 40     76              Less than 80    125

Solution. If the data are given in the form of a cumulative series, they have to be converted into a simple series in order to find out the frequency of the median class, which is needed in the calculation of median. Once that is done, the rest of the procedure is the same as in any other continuous series.

Value    Frequency (f)    Cumulative frequency (c.f.)
0-10       4                4
10-20     12               16
20-30     24               40
30-40     36               76
40-50     20               96
50-60     16              112
60-70      8              120
70-80      5              125

Middle item is the $\left(\frac{125}{2}\right)^{th}$ or 62.5th item, which lies in the 30-40 group.

$Me = l_1 + \frac{\frac{N}{2} - c.f.}{f} \times i = 30 + \frac{62.5 - 40}{36} \times 10 = 30 + \frac{22.5 \times 10}{36} = 30 + 6.25$

Median = 36.25

Illustration 9. Calculate the median from the following data :

Size            Frequency (f)
More than 50      0
More than 40     40
More than 30     98
More than 20    123
More than 10    165

Solution. The cumulative frequency table is of the 'more than' type. In such cases the data have to be converted into a simple continuous series, and the median is calculated from the series arranged in ascending order.

Size     Frequency (f)    Cumulative frequency (c.f.)
10-20    42                42
20-30    25                67
30-40    58               125
40-50    40               165

Middle item is the $\left(\frac{165}{2}\right)^{th}$ or 82.5th item, which lies in the 30-40 group.

$Me = l_1 + \frac{\frac{N}{2} - c.f.}{f} \times i = 30 + \frac{82.5 - 67}{58} \times 10 = 30 + \frac{15.5 \times 10}{58} = 30 + 2.67$

Median = 32.67
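The conversion of a 'more than' cumulative table into a simple continuous series can be sketched as below. The helper name `more_than_to_classes` is hypothetical. The sketch assumes each class runs from one stated lower limit to the next, so the highest limit only closes the last class.

```python
def more_than_to_classes(lower_limits, more_than_cf):
    """Turn 'more than' cumulative frequencies into (class, frequency) pairs."""
    pairs = sorted(zip(lower_limits, more_than_cf))     # ascending lower limits
    series = []
    for (low, cf), (high, next_cf) in zip(pairs, pairs[1:]):
        # items above `low` minus items above `high` lie inside low-high
        series.append(((low, high), cf - next_cf))
    return series


limits = [50, 40, 30, 20, 10]
cumulative = [0, 40, 98, 123, 165]
print(more_than_to_classes(limits, cumulative))
# [((10, 20), 42), ((20, 30), 25), ((30, 40), 58), ((40, 50), 40)]
```

The resulting frequencies 42, 25, 58 and 40 are the ones used in the solution table above.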

Illustration 10. Compute median from the following data :

Mid-values : 115   125   135   145   155   165   175   185   195
Frequency  :   6    25    48    72   116    60    38    22     3

Solution. Here the mid-values of the class-intervals of a continuous frequency distribution are given. The difference between two mid-values is 10, hence 10/2 = 5 is deducted from each mid-value to get the lower limit and added to it to get the upper limit of a class. The classes are thus 110-120, 120-130, ... and so on up to 190-200.

Class-intervals    Frequency    Cumulative frequency (c.f.)
110-120              6            6
120-130             25           31
130-140             48           79
140-150             72          151
150-160            116          267
160-170             60          327
170-180             38          365
180-190             22          387
190-200              3          390
Total              390

The middle item is the $\left(\frac{390}{2}\right)^{th}$ or 195th item, which lies in the 150-160 group.

$Me = l_1 + \frac{\frac{N}{2} - c.f.}{f} \times i = 150 + \frac{195 - 151}{116} \times 10 = 150 + \frac{44 \times 10}{116} = 150 + 3.79$

Median = 153.79

Illustration 11. If the arithmetic mean of the data given below is 28, find (a) the missing frequency, and (b) the median of the series.

Profit per Retail shop (in Rs) : 0-10   10-20   20-30   30-40   40-50   50-60
Number of Retail shops         :  12     18      27      ?       17       6

Solution.

(a) Calculation of missing frequency. Let the missing frequency of group 30-40 be X.

Profit per Retail shop (X)    Number of retail shops (f)    Mid-point (m)    fm
0-10                          12                              5               60
10-20                         18                             15              270
20-30                         27                             25              675
30-40                          X                             35              35X
40-50                         17                             45              765
50-60                          6                             55              330
                              N = 80 + X                                     Σfm = 2100 + 35X

Applying formula, we get

$\bar{X} = \frac{\Sigma fm}{\Sigma f} = \frac{\Sigma fm}{N}$

or $28 = \frac{2100 + 35X}{80 + X}$

28 × (80 + X) = 2100 + 35X
2240 + 28X = 2100 + 35X
2240 - 2100 = 35X - 28X
7X = 140

$X = \frac{140}{7} = 20$

Therefore, the missing frequency is 20.

(b) Calculation of median :

Profit (Rs) (X)    Frequency (f)    (c.f.)
0-10               12                12
10-20              18                30
20-30              27                57
30-40              20                77
40-50              17                94
50-60               6               100
                   N = 100

The middle item is the $\left(\frac{100}{2}\right)^{th}$ or 50th item, which lies in the 20-30 group.

$Me = l_1 + \frac{\frac{N}{2} - c.f.}{f} \times i = 20 + \frac{50 - 30}{27} \times 10 = 20 + \frac{20 \times 10}{27} = 20 + 7.407$

Median = 27.41.
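Part (a) amounts to solving one linear equation, which can be sketched as follows. The function name `missing_frequency` and the use of `None` to mark the unknown frequency are assumptions of this sketch. It fails if the mid-point of the unknown class equals the mean, because the equation then has no unique solution.

```python
def missing_frequency(midpoints, freqs, mean):
    """Solve mean = (sum of fm) / (sum of f) for the single unknown frequency.

    freqs holds exactly one None, marking the class with the unknown frequency.
    """
    k = freqs.index(None)
    known_n = sum(f for f in freqs if f is not None)                        # 80
    known_fm = sum(m * f for m, f in zip(midpoints, freqs) if f is not None)  # 2100
    # mean * (known_n + x) = known_fm + m_k * x, solved for x
    return (mean * known_n - known_fm) / (midpoints[k] - mean)


midpoints = [5, 15, 25, 35, 45, 55]
shops = [12, 18, 27, None, 17, 6]
print(missing_frequency(midpoints, shops, 28))      # 20.0
```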

Illustration 12. In the frequency distribution of 100 families given below, the number of families corresponding to expenditure groups 20-40 and 60-80 are missing from the table. However, the median is known to be 50. Find the missing frequencies.

Expenditure     : 0-20   20-40   40-60   60-80   80-100
No. of families :  14      ?      27       ?       15

Solution. Let the missing frequency of the group 20-40 be X and the missing frequency of the 60-80 group be Y.

Now Σf (total frequency) = 100

i.e., 100 = 14 + X + 27 + Y + 15

or X + Y = 100 - 14 - 27 - 15 or X + Y = 44

Expenditure    No. of Families (f)    Cumulative frequency (c.f.)
0-20           14                     14
20-40           X                     14 + X
40-60          27                     41 + X
60-80           Y                     41 + X + Y
80-100         15                     100

Median is given in this problem as 50.

Middle item of the series is the $\left(\frac{100}{2}\right)^{th}$ or 50th item, which lies in the class-interval 40-60 (given that the median is 50).

Now,

$Me = l_1 + \frac{\frac{N}{2} - c.f.}{f} \times i$

$50 = 40 + \frac{50 - [14 + X]}{27} \times 20$

$50 - 40 = \frac{50 - [14 + X]}{27} \times 20$

$\frac{10 \times 27}{20} = 50 - [14 + X]$

10 × 1.35 = 50 - [14 + X]
13.5 = 50 - 14 - X
X = 50 - 14 - 13.5 = 22.5

Since the frequency in this problem cannot be a fraction, X, i.e., $f_1$, would be taken as 23.

X + Y = 44 or $f_1 + f_2 = 44$

$f_2 = 44 - f_1$ or 44 - 23 or 21

Thus the missing frequencies in the question are 23 and 21.

Mathematical Properties of Median

1. Median is an average of position and therefore is influenced by the position of items in arrangement and not by the size of items.

2. The sum of the deviations of the items about the median, ignoring ± signs, will be less than the sum of the deviations taken about any other point. For example :

X                                  : 10   11   12   13   14
Deviations from Median (12)        :  2    1    0    1    2   = 6
Deviations from any point (say 10) :  0    1    2    3    4   = 10

The sum of the deviations taken from the median (12) is 6, which is less than the sum of the deviations taken from any other point (10 in this example).
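This property can be checked numerically with a short sketch. The helper name `sum_abs_dev` is illustrative, and the range of trial points is an arbitrary choice made here.

```python
def sum_abs_dev(values, point):
    """Sum of deviations from `point`, ignoring plus and minus signs."""
    return sum(abs(x - point) for x in values)


x = [10, 11, 12, 13, 14]
print(sum_abs_dev(x, 12))       # 6  (deviations taken from the median)
print(sum_abs_dev(x, 10))       # 10 (deviations taken from another point)

# No trial point between 5 and 19 gives a smaller sum than the median does.
print(min(range(5, 20), key=lambda p: sum_abs_dev(x, p)))   # 12
```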

Merits and Demerits of Median

Merits

1. It is easy to calculate and understand.

2. It is well defined, as an ideal average should be, and it indicates the value of the middle item in the distribution.

3. It can be determined graphically, whereas the mean cannot be graphically determined.

4. It is proper average for qualitative data where items are not converted or measured but are scored.

5. It is not affected by extreme values.

6. In the case of open-end distribution it is specially useful since only the position is to be known. It is useful in a distribution of unequal classes.

Demerits

1. For median data need to be arranged in ascending or descending order.

2. It is not based on all the observations of the series.

3. It cannot be given further algebraic treatment.

4. It is affected by fluctuations of sampling.

5. It is not accurate when the data is not large.

6. Interpolation by a formula is required to calculate median in continuous series. This requires the assumption that all the frequencies of the class interval are uniformly spread, which is not always true.

Partition Values (Quartiles)

1. Definition

2. Characteristics of Partition Values

3. Calculation of Partition Values

Definition

When we are required to divide a series into more than two parts, the dividing places are known as partition values. Suppose we have a piece of cloth 100 metres long and we have to cut it into 4 equal pieces; we will have to cut it at three places. Quartiles are those values which divide the series into four equal parts. For getting partition values, the most important rule is that the values must be arranged in ascending order only. In the case of finding out the median, we can arrange the data either in ascending or in descending order, but here there is no choice: only ascending order is possible for calculating partition values (Quartiles).

For example, we have the following data of heights of 7 students in a class :

Name of students : Anurag   Deven   Suresh   Mayoor   Atul   Satish   Himankar
Height (cm)      : 151      140     149      142      147    144      145

Therefore, for getting correct results, the data must be arranged in ascending order in all the cases.

Characteristic of Partition Values

The difference between averages and partition values is as follows :

While an average is representative of the whole series, quartiles are averages of parts of the series. For example, the first quartile is the average of the first half of the series and the third quartile is the average of the second half of the series.

Thus, quartiles are not averages like mean and median. They help us in understanding how various items are spread around the median. Therefore, the special use of partition values is to study the dispersion of items in relation to the median, that is, in understanding the composition of a series.

Calculation of Partition Values

(a) Individual Series.

(b) Discrete Series.

(c) Continuous Series.


Now we arrange the data (in Illustration 1) in ascending order :

Name of Students    Height (cm)
Deven               140
Mayoor              142   = Q1 = First quartile or lower quartile
Satish              144
Himankar            145   = Me = Q2 = Second quartile or middle quartile
Atul                147
Suresh              149   = Q3 = Third quartile or upper quartile
Anurag              151

As we know, the median is the height of the fourth student, i.e., 145 cm. Now, suppose we have to calculate quartiles. By definition quartiles will divide a series into four equal parts and so the number of quartiles will be three. They are known as lower quartile, middle quartile and upper quartile. These are also called first, second and third quartiles.

The middle or second quartile ($Q_2$) is the central positional value of the data, i.e., the median. The first or lower quartile ($Q_1$) is the central positional value of the lower half, and the third or upper quartile ($Q_3$) is the central positional value of the upper half of the data. In the above data, $Q_1 = 142$, $Q_2 = 145$ and $Q_3 = 149$.

It must be remembered that $Q_1$ is always less than $Q_2$ and $Q_3$ ($Q_1 < Q_2 < Q_3$), and the median falls between $Q_1$ and $Q_3$.

(a) Individual Series

Illustration 13. From the following information of wages of 30 workers in a factory calculate median, lower and upper quartile.

S. No. Wages (in Rs) S. No. Wages (in Rs)

1 330 16 240

2 320 17 330

3 550 18 420

4 470 19 380

5 210 20 450

6 500 21 260

7 270 22 330

8 120 23 440

9 680 24 480

10 490 25 520

11 400 26 300

12 170 27 580

13 440 28 370

14 480 29 380

15 620 30 350

Solution.

1. Arrange the data in ascending order.

2. Locate the items by finding out the $\left(\frac{N+1}{2}\right)^{th}$, $\left(\frac{N+1}{4}\right)^{th}$ and $3\left(\frac{N+1}{4}\right)^{th}$ items.

S. No.    Wages (in Rs)    S. No.    Wages (in Rs)
1         120              16        400
2         170              17        420
3         210              18        440
4         240              19        440
5         260              20        450
6         270              21        470
7         300              22        480
8         320              23        480
9         330              24        490
10        330              25        500
11        330              26        520
12        350              27        550
13        370              28        580
14        380              29        620
15        380              30        680

Median

$Me = \text{size of } \left(\frac{N+1}{2}\right)^{th} \text{ item} = \text{size of } \left(\frac{30+1}{2}\right)^{th} \text{ item} = 15.5^{th} \text{ item}$

$= \frac{\text{size of 15th item} + \text{size of 16th item}}{2} = \frac{380 + 400}{2} = 390$

Median is Rs 390.

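Lower Quartile

$Q_1 = \text{size of } \left(\frac{N+1}{4}\right)^{th} \text{ item} = \text{size of } \left(\frac{30+1}{4}\right)^{th} \text{ item} = 7.75^{th} \text{ item}$

$= \text{size of 7th item} + \frac{3}{4}(\text{size of 8th item} - \text{size of 7th item})$

= 300 + .75 (320 - 300) = 300 + 15 = 315

∴ Lower Quartile is Rs 315.

Upper Quartile

$Q_3 = \text{size of } 3\left(\frac{N+1}{4}\right)^{th} \text{ item} = \text{size of } 3\left(\frac{30+1}{4}\right)^{th} \text{ item} = \text{size of } 23.25^{th} \text{ item}$

$= \text{size of 23rd item} + \frac{1}{4}(\text{size of 24th item} - \text{size of 23rd item})$

$= 480 + \frac{1}{4}(490 - 480)$

= 480 + .25(10) = 480 + 2.50 = 482.50

Upper Quartile is Rs 482.50.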
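The quartile positions and the interpolation between neighbouring items can be sketched in Python. The names `item_value` and `quartiles_individual` are assumptions of this sketch, which also assumes a series of at least three items so that every position lies within the data.

```python
def item_value(ordered, position):
    """Size of a (possibly fractional) 1-based position, by interpolation."""
    whole = int(position)
    fraction = position - whole
    if fraction == 0:
        return ordered[whole - 1]
    # move `fraction` of the way from the whole-th item to the next item
    return ordered[whole - 1] + fraction * (ordered[whole] - ordered[whole - 1])


def quartiles_individual(values):
    """Return (Q1, Me, Q3) located at the k(N + 1)/4 th items, k = 1, 2, 3."""
    ordered = sorted(values)            # ascending order is essential
    n = len(ordered)
    return tuple(item_value(ordered, k * (n + 1) / 4) for k in (1, 2, 3))


wages = [330, 320, 550, 470, 210, 500, 270, 120, 680, 490,
         400, 170, 440, 480, 620, 240, 330, 420, 380, 450,
         260, 330, 440, 480, 520, 300, 580, 370, 380, 350]
print(quartiles_individual(wages))      # (315.0, 390.0, 482.5)
```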

(b) Discrete Series

Illustration 14. Following are the different sizes and number of shoes in a shoe shop. Calculate median, first quartile and third quartile.

Size of Shoes    : 4.5   5   5.5   6   6.5   7   7.5   8   8.5   9   9.5   10   10.5   11
No. of Shoes (f) :  4    8   12   15   20   35   50   40   20   15   24    12    5      3

Solution.

Size of shoes    No. of shoes (f)    Cumulative frequency (c.f.)
4.5               4                    4
5                 8                   12
5.5              12                   24
6                15                   39
6.5              20                   59
7                35                   94
7.5              50                  144
8                40                  184
8.5              20                  204
9                15                  219
9.5              24                  243
10               12                  255
10.5              5                  260
11                3                  263

Steps

1. Arrangement of the data in ascending order is necessary.

2. Calculate less than cumulative frequencies.

3. Locate the items by finding out the $\left(\frac{N+1}{2}\right)^{th}$, $\left(\frac{N+1}{4}\right)^{th}$ and $3\left(\frac{N+1}{4}\right)^{th}$ items.

4. Median and quartiles are located at the sizes in whose cumulative frequency the values of the respective items fall.

Median

$Me = \text{size of } \left(\frac{N+1}{2}\right)^{th} \text{ item} = \text{size of } \left(\frac{263+1}{2}\right)^{th} \text{ item} = 132^{nd} \text{ item}$

= size of 132nd pair of shoes = 7.5 size of shoes.

First Quartile

$Q_1 = \text{size of } \left(\frac{N+1}{4}\right)^{th} \text{ item} = \text{size of } \left(\frac{263+1}{4}\right)^{th} \text{ item} = \text{size of } 66^{th} \text{ item}$

= size of 66th pair of shoes

First Quartile = 7 size of shoes.

Third Quartile

$Q_3 = \text{size of } 3\left(\frac{N+1}{4}\right)^{th} \text{ item} = \text{size of } 3\left(\frac{263+1}{4}\right)^{th} \text{ item} = \text{size of } 198^{th} \text{ item}$

Third Quartile = 8.5 size of shoes.

Illustration 15. Calculate Median, First Quartile and Third Quartile from the following data :

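Income (in Rupees)    No. of persons
800                   16
1000                  24
1200                  26
1400                  30
1600                  20
1800                   5

Solution.

Income (in Rs)    No. of persons (f)    Cumulative frequency (c.f.)
800               16                     16
1000              24                     40
1200              26                     66
1400              30                     96
1600              20                    116
1800               5                    121

Median :

$Me = \text{size of } \left(\frac{N+1}{2}\right)^{th} \text{ item} = \text{size of } \left(\frac{121+1}{2}\right)^{th} \text{ item} = 61^{st} \text{ item}$

= income of 61st person

Median = Rs 1,200

First Quartile

$Q_1 = \text{size of } \left(\frac{N+1}{4}\right)^{th} \text{ item} = \text{size of } \left(\frac{121+1}{4}\right)^{th} \text{ item} = 30.5^{th} \text{ item}$

= income of 30.5th person

∴ $Q_1$ = Rs 1,000

Third Quartile

$Q_3 = \text{size of } 3\left(\frac{N+1}{4}\right)^{th} \text{ item} = \text{size of } 3\left(\frac{121+1}{4}\right)^{th} \text{ item} = 91.5^{th} \text{ item}$

= income of 91.5th person

$Q_3$ = Rs 1,400

Thus, Me = Rs 1,200, $Q_1$ = Rs 1,000, $Q_3$ = Rs 1,400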

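(c) Continuous Series

Illustration 16. Calculate median, first quartile and third quartile from the following data :

Marks    Students
30-35    14
35-40    16
40-45    18
45-50    23
50-55    18
55-60     8
60-65     3

Solution.

Marks    No. of students (f)    c.f.
30-35    14                      14
35-40    16                      30
40-45    18                      48
45-50    23                      71
50-55    18                      89
55-60     8                      97
60-65     3                     100

Steps :

1. Calculate less than cumulative frequencies.

2. Median, first quartile and third quartile items are located by finding out the $\left(\frac{N}{2}\right)^{th}$, $\left(\frac{N}{4}\right)^{th}$ and $3\left(\frac{N}{4}\right)^{th}$ items in continuous series.

3. Locate the median group, first quartile group and third quartile group in the cumulative frequency column where the sizes of the respective $\left(\frac{N}{2}\right)^{th}$, $\left(\frac{N}{4}\right)^{th}$ and $3\left(\frac{N}{4}\right)^{th}$ items fall.

4. Apply the suitable formula to get the value :

$Me = l_1 + \frac{\frac{N}{2} - c.f.}{f} \times i$

$Q_1 = l_1 + \frac{\frac{N}{4} - c.f.}{f} \times i$

$Q_3 = l_1 + \frac{3\left(\frac{N}{4}\right) - c.f.}{f} \times i$

Median

$\text{Median} = \text{size of } \left(\frac{N}{2}\right)^{th} \text{ item} = \left(\frac{100}{2}\right)^{th} = 50^{th} \text{ item}$

Hence, median lies in class 45-50

$Me = l_1 + \frac{\frac{N}{2} - c.f.}{f} \times i$

where, $l_1 = 45$, $\frac{N}{2} = 50$, c.f. = 48, f = 23, i = 5

$Me = 45 + \frac{50 - 48}{23} \times 5 = 45 + \frac{2 \times 5}{23} = 45.43$

Hence, median is 45.43 marks.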

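First Quartile

$Q_1 = \text{size of } \left(\frac{N}{4}\right)^{th} \text{ item} = \left(\frac{100}{4}\right)^{th} = 25^{th} \text{ item}$

Hence, $Q_1$ lies in class 35-40

$Q_1 = l_1 + \frac{\frac{N}{4} - c.f.}{f} \times i$

where, $l_1 = 35$, $\frac{N}{4} = 25$, c.f. = 14, f = 16, i = 5

$Q_1 = 35 + \frac{25 - 14}{16} \times 5 = 35 + \frac{11 \times 5}{16} = 35 + 3.44 = 38.44$

Hence, first quartile is 38.44 marks.

Third Quartile

$Q_3 = \text{size of } 3\left(\frac{N}{4}\right)^{th} \text{ item} = 3\left(\frac{100}{4}\right)^{th} = 75^{th} \text{ item}$

Hence, $Q_3$ lies in class 50-55

$Q_3 = l_1 + \frac{3\left(\frac{N}{4}\right) - c.f.}{f} \times i$

where, $l_1 = 50$, $3\left(\frac{N}{4}\right) = 75$, c.f. = 71, f = 18, i = 5

$Q_3 = 50 + \frac{75 - 71}{18} \times 5 = 50 + \frac{4 \times 5}{18} = 51.11$

Hence, third quartile is 51.11 marks.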
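The three formulas differ only in the item that is located (N/2, N/4 or 3N/4), so one function can cover them. The sketch below is illustrative; the name `partition_value` and its `fraction` parameter are assumptions made here, not notation from the text.

```python
def partition_value(classes, freqs, fraction):
    """l1 + (fraction * N - c.f.) / f * i for a continuous series.

    fraction: 0.25 for Q1, 0.5 for the median, 0.75 for Q3
    classes:  (lower, upper) limits in ascending order
    """
    target = fraction * sum(freqs)      # the item to be located
    cum_before = 0                      # c.f. of the preceding classes
    for (lower, upper), f in zip(classes, freqs):
        if f > 0 and cum_before + f >= target:
            return lower + (target - cum_before) / f * (upper - lower)
        cum_before += f


marks = [(30, 35), (35, 40), (40, 45), (45, 50), (50, 55), (55, 60), (60, 65)]
students = [14, 16, 18, 23, 18, 8, 3]
q1, me, q3 = (partition_value(marks, students, p) for p in (0.25, 0.5, 0.75))
print(q1, me, q3)       # Q1 = 38.4375, Me is about 45.43, Q3 is about 51.11
```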

Illustration 17. Calculate the Median and $Q_3$ using the following data :

Mid-points (marks) : 5   15   25   35   45   55   65   75
No. of students    : 3   10   17    7    6    4    2    1

Solution. Given mid-points are required to be converted into class intervals.

Marks    No. of students (f)    c.f.
0-10      3                      3
10-20    10                     13
20-30    17                     30
30-40     7                     37
40-50     6                     43
50-60     4                     47
60-70     2                     49
70-80     1                     50

Calculation of Median and $Q_3$.

Median

$\text{Median} = \text{size of } \left(\frac{N}{2}\right)^{th} \text{ item} = \left(\frac{50}{2}\right)^{th} = 25^{th} \text{ item}$

Hence, Median lies in class 20-30

Applying suitable formula to get the median value

$Me = l_1 + \frac{\frac{N}{2} - c.f.}{f} \times i$

where $l_1 = 20$, $\frac{N}{2} = 25$, c.f. = 13, f = 17, i = 10

$Me = 20 + \frac{25 - 13}{17} \times 10 = 20 + \frac{12 \times 10}{17} = 27.06$

Hence, Median is 27.06 marks.

Third Quartile

$Q_3 = \text{size of } 3\left(\frac{N}{4}\right)^{th} \text{ item} = 3\left(\frac{50}{4}\right)^{th} = 37.5^{th} \text{ item}$

Hence, $Q_3$ lies in class 40-50

Applying suitable formula, we get

$Q_3 = l_1 + \frac{3\left(\frac{N}{4}\right) - c.f.}{f} \times i$

where, $l_1 = 40$, $3\left(\frac{N}{4}\right) = 37.5$, c.f. = 37, f = 6, i = 10

$Q_3 = 40 + \frac{37.5 - 37}{6} \times 10 = 40 + \frac{0.5 \times 10}{6} = 40.83$

Hence, third quartile is 40.83 marks.

Illustration 18. Calculate the Median and Quartiles for the following :

Marks (below)   : 10   20   30   40   50   60    70    80
No. of Students : 15   35   60   84   96   127   198   250

Solution. Before calculating Median and Quartiles, first we convert the given cumulative frequencies into class frequencies :

Marks    No. of students (f)    c.f.
0-10     15                      15
10-20    20                      35
20-30    25                      60
30-40    24                      84
40-50    12                      96
50-60    31                     127
60-70    71                     198
70-80    52                     250
Total    250

Median. Applying formula, we get

$\text{Median} = \text{size of } \left(\frac{N}{2}\right)^{th} \text{ item} = \frac{250}{2} = 125^{th} \text{ item}$

Hence, median lies in class 50-60. Applying suitable formula to get median :

$Me = l_1 + \frac{\frac{N}{2} - c.f.}{f} \times i$

where $l_1 = 50$, $\frac{N}{2} = 125$, c.f. = 96, f = 31, i = 10

$Me = 50 + \frac{125 - 96}{31} \times 10 = 50 + \frac{29 \times 10}{31} = 59.35$

Hence, median is 59.35 marks.

First Quartile

$Q_1 = \text{size of } \left(\frac{N}{4}\right)^{th} \text{ item} = \left(\frac{250}{4}\right)^{th} = 62.5^{th} \text{ item}$

Hence, $Q_1$ lies in class 30-40. Applying suitable formula, we get

$Q_1 = l_1 + \frac{\frac{N}{4} - c.f.}{f} \times i = 30 + \frac{62.5 - 60}{24} \times 10 = 30 + \frac{2.5 \times 10}{24} = 31.04$

Hence, $Q_1$ is 31.04 marks.

Third Quartile

$Q_3 = \text{size of } 3\left(\frac{N}{4}\right)^{th} \text{ item} = 3\left(\frac{250}{4}\right)^{th} = 187.5^{th} \text{ item}$

Hence, $Q_3$ lies in class 60-70. Applying suitable formula, we get

$Q_3 = l_1 + \frac{3\left(\frac{N}{4}\right) - c.f.}{f} \times i = 60 + \frac{187.5 - 127}{71} \times 10 = 60 + \frac{60.5 \times 10}{71} = 68.52$

Hence, $Q_3$ is 68.52 marks.

Thus, $Q_1$ = 31.04, $Q_2$ = Median = 59.35 and $Q_3$ = 68.52 marks.

Illustration 19. The following series relates to the daily income of workers employed in a firm. Compute

(a) highest income of lowest 50% workers,

(b) minimum income earned by the top 25% workers, and

(c) maximum income earned by lowest 25% workers.

Daily Income (in Rs) : 10-14   15-19   20-24   25-29   30-34   35-39
Number of workers    :   5      10      15      20      10       5

Before solving it let us understand the question.

1. As the data are of inclusive class intervals, we are required to convert the classes into class boundaries.

2. The data is arranged in ascending order, where

(a) the worker at the centre earns the highest daily income of the lowest 50% of workers (i.e., the median value);

(b) the worker at the point above which the top 25% of workers lie earns the minimum daily income of the top 25% of workers (i.e., the upper quartile value, $Q_3$);

(c) the worker at the point below which the lowest 25% of workers lie earns the maximum daily income of the lowest 25% of workers (i.e., the lower quartile value, $Q_1$).

Solution.

Daily Income (in Rs) (X)    No. of workers (f)    c.f.
9.5-14.5                     5                     5
14.5-19.5                   10                    15
19.5-24.5                   15                    30
24.5-29.5                   20                    50
29.5-34.5                   10                    60
34.5-39.5                    5                    65

(a) Computation of highest daily income in lowest 50% of workers (Median)

Median is the value of the $\left(\frac{N}{2}\right)^{th}$ item or $\left(\frac{65}{2}\right)^{th}$ or 32.5th item, which lies in the 24.5-29.5 class interval.

Applying suitable formula to get median value,

$Me = l_1 + \frac{\frac{N}{2} - c.f.}{f} \times i = 24.5 + \frac{32.5 - 30}{20} \times 5 = 24.5 + \frac{2.5 \times 5}{20} = 24.5 + 0.625 = 25.125$

∴ Highest daily income of lowest 50% workers is Rs 25.13.

(b) Computation of minimum daily income earned by top 25% workers ($Q_3$)

$Q_3 = \text{Value of } 3\left(\frac{N}{4}\right)^{th} \text{ item} = \frac{3 \times 65}{4} = 48.75^{th} \text{ value}$

Hence, $Q_3$ lies in class 24.5-29.5. Applying suitable formula, we get

$Q_3 = l_1 + \frac{3\left(\frac{N}{4}\right) - c.f.}{f} \times i = 24.5 + \frac{48.75 - 30}{20} \times 5 = 24.5 + \frac{18.75 \times 5}{20} = 24.5 + 4.687 = 29.187$

Minimum daily income earned by top 25% workers is Rs 29.19.

(c) Computation of maximum daily income earned by lowest 25% workers ($Q_1$)

$Q_1 = \text{Value of } \left(\frac{N}{4}\right)^{th} \text{ item} = \frac{65}{4} = 16.25^{th} \text{ value}$

Hence, $Q_1$ lies in class 19.5-24.5. Applying suitable formula, we get

$Q_1 = l_1 + \frac{\frac{N}{4} - c.f.}{f} \times i = 19.5 + \frac{16.25 - 15}{15} \times 5 = 19.5 + \frac{1.25 \times 5}{15} = 19.5 + 0.416 = 19.916$

Maximum daily income earned by lowest 25% workers is Rs 19.92.
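The first step of this illustration, converting inclusive classes into class boundaries, can be sketched as below. The function name `to_class_boundaries` is hypothetical. The sketch assumes the gap between the upper limit of one class and the lower limit of the next is the same throughout, and takes half of that gap as the adjustment.

```python
def to_class_boundaries(inclusive_classes):
    """Convert inclusive classes such as 10-14, 15-19 into class boundaries."""
    # gap between the upper limit of the first class and the lower limit of the second
    gap = inclusive_classes[1][0] - inclusive_classes[0][1]     # 15 - 14 = 1
    half = gap / 2
    # lower limits move down and upper limits move up by half the gap
    return [(low - half, high + half) for low, high in inclusive_classes]


income = [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34), (35, 39)]
print(to_class_boundaries(income))
# [(9.5, 14.5), (14.5, 19.5), (19.5, 24.5), (24.5, 29.5), (29.5, 34.5), (34.5, 39.5)]
```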

Graphical Determination of Median and Quartiles

Illustration 20. Determine median and quartiles graphically from the following data :

Marks    : 0-5   5-10   10-15   15-20   20-25   25-30   30-35   35-40
Students :  7    10     20      13      17      10      14       9

Solution.

Marks    f     Marks less than    Less than cumulative    Marks more than    More than cumulative
0-5       7     5                   7                      0                 100
5-10     10    10                  17                      5                  93
10-15    20    15                  37                     10                  83
15-20    13    20                  50                     15                  63
20-25    17    25                  67                     20                  50
25-30    10    30                  77                     25                  33
30-35    14    35                  91                     30                  23
35-40     9    40                 100                     35                   9
         N = 100

First Method (only for median). Steps

1. Calculate ascending cumulative frequencies (less than) and descending cumulative frequencies (more than).

2. Draw two ogives, one by the 'less than' and the other by the 'more than' method.

3. From the intersecting point of the two ogives, draw a perpendicular on X-axis.

4. The point where the perpendicular touches X-axis determines the median value.

Second Method (For Median and Quartiles). Steps

1. Calculate ascending cumulative frequency (less than).

2. Determine the value by the following formulae :

$Me = \text{size of } \left(\frac{N}{2}\right)^{th} \text{ item, i.e., } \frac{100}{2} = 50^{th} \text{ item}$

$Q_1 = \text{size of } \left(\frac{N}{4}\right)^{th} \text{ item, i.e., } \frac{100}{4} = 25^{th} \text{ item}$

$Q_3 = \text{size of } 3\left(\frac{N}{4}\right)^{th} \text{ item, i.e., } 3\left(\frac{100}{4}\right) = 75^{th} \text{ item}$

3. Locate 50, 25, 75 values on Y-axis and from them draw perpendiculars on the cumulative frequency curve (ogive).

4. From the points where they meet the ogive, draw another perpendicular touching X-axis.

5. The points where the perpendiculars touch X-axis locate $Q_1$, Me and $Q_3$.

Verification

Median Group 15-20

$Me = l_1 + \frac{\frac{N}{2} - c.f.}{f} \times i = 15 + \frac{50 - 37}{13} \times 5 = 15 + \frac{13 \times 5}{13} = 20$

Median = 20 Marks

Lower Quartile Group 10-15

$Q_1 = l_1 + \frac{\frac{N}{4} - c.f.}{f} \times i = 10 + \frac{25 - 17}{20} \times 5 = 10 + \frac{8 \times 5}{20} = 12$

$Q_1$ = 12 Marks.

Upper Quartile Group 25-30

$Q_3 = l_1 + \frac{3\left(\frac{N}{4}\right) - c.f.}{f} \times i = 25 + \frac{75 - 67}{10} \times 5 = 25 + \frac{8 \times 5}{10} = 29$

$Q_3$ = 29 Marks

The 'less than' cumulative frequency curve is a reminder of the rule that, at the first step of calculation of quartiles, the data is arranged in ascending order. However, the median can be located on a graph even by the 'more than' ogive, or calculated by arranging the data in descending order.



1. Definition

2. Determination of Mode

3. Merits and Demerits of Mode

1. Definition

According to Croxton and Cowden, "the mode of distribution is the value at the point around which the items tend to be most heavily concentrated. It may be regarded as the most typical of a series of values."

The word mode comes from French la mode, which means the fashion. Mode in statistical language is that value which occurs most often in a series, that is, the value which is most typical. If garment manufacturers say that short collars are now in fashion, the statement implies that the maximum number of people now-a-days wear short collar shirts. If we say the mode is size No. 7 shoe, it means that in the given data the maximum number of people wear size No. 7. Thus, mode is that value of observations which occurs the greatest number of times or with the greatest frequency.

For a better understanding of mode let us look at the following information about frequency of students in relation to marks obtained.

Marks           : 5   10   15   20   25   30   35   40   45   50
No. of Students : 2    3   25    2    1   18   20   24   14   10

According to the explanation of mode given above, the modal marks will be 15 because the maximum number of students (25) have obtained 15 marks each. Although 15 has the highest frequency, a more careful examination of the information shows that the highest concentration of the frequency is around 40 marks. That is, in the neighbourhood of 40 marks there are more frequencies (18, 20, 14, 10) as compared to the neighbourhood of 15 marks (2, 3, 2, 1). Thus 15 marks are not typical of the series of values. For the reasons given above, 40 marks is the mode and not 15. Therefore, to define accurately, mode is that value of observations around which items are most densely or heavily concentrated.

The mode is defined as the most frequently occurring value. If each observation occurs the same number of times, then there is no mode in that distribution. If two or more observations occur the same number of times (and more frequently than any other observation) then there is more than one mode and the distribution is multi-modal, as against uni-modal, where there is one mode. If two values occur most frequently then the series is bi-modal; if three values occur most frequently then the series is called tri-modal. The mode as a measure of central tendency has little significance for a bi- or multi-modal distribution.

"Mode is that value of the graded quantity at which the instances are most numerous." -A.L. Bowley

"The value occurring most frequently in a series (or group) of items and around which the other items are distributed most densely."

2. Determination of Mode

(a) Series of Individual Observations and Discrete Series

(b) Continuous Series

(c) Graphic Location of Mode

(d) Mode from Mean and Median.

(a) Series of Individual Observations and Discrete Series

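In a series of individual observations, the mode can be located in two ways : (i) by arranging the data in the form of an array, and (ii) by converting the data into a discrete series.

Illustration 21. Calculate mode from the following data of the marks obtained by 15 students in a class :

Marks : 4   6   5   7   9   8   10   4   7   6   5   8   7   7   9

Solution.

(i) Array : 4  4  5  5  6  6  7  7  7  7  8  8  9  9  10

Modal value : Mode = 7 Marks

(ii) Discrete Series. Converting the above data into discrete series, we get

Marks     : 4   5   6   7   8   9   10
Frequency : 2   2   2   4   2   2    1

Mode = 7 Marks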
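Locating the mode of individual observations, including the no-mode and multi-modal cases described earlier, can be sketched with the standard library's `collections.Counter`. The function name `modes` is an illustrative choice made here.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s); an empty list means there is no mode."""
    counts = Counter(values)                    # the discrete series: value -> frequency
    top = max(counts.values())
    if len(counts) > 1 and all(c == top for c in counts.values()):
        return []                               # every value occurs equally often
    return sorted(v for v, c in counts.items() if c == top)


marks = [4, 4, 5, 5, 6, 6, 7, 7, 7, 7, 8, 8, 9, 9, 10]
print(modes(marks))                 # [7]
print(modes([1, 1, 2, 2, 3]))       # [1, 2]  (bi-modal)
print(modes([1, 2, 3]))             # []      (no mode)
```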


(b) Discrete Series. In discrete series the mode can be located by two ways :

(i) By Inspection.

(ii) By Grouping.

(i) By Inspection. The mode can be determined just by inspection in discrete series; the size around which the items are most heavily concentrated will be decided as mode.

Illustration 22. Find out mode from the following data :

Wages (in Rs)    No. of Persons
125               3
175               8
225              21
275               6
325               4
375               2

Solution. By inspection, we can determine that the modal wage is Rs 225 because this value occurred the maximum number of times, i.e., 21 times.

(ii) By Grouping. In discrete and continuous series, if the items are concentrated at more than one value, an attempt is made to find out the item of concentration with the help of the grouping method. In such situations it is desirable to prepare a grouping table and an analysis table for ascertaining the modal class.

In the grouping method, values are first arranged in ascending order and the frequencies against each item are properly written. A grouping table normally consists of six columns. Frequencies are added in twos and threes and the totals are written between the values. If necessary, they can be added in fours and fives also.

Column 1. The maximum frequency is observed by putting a mark or a circle.

Column 2. Frequencies are grouped in twos.

Column 3. Leaving the first frequency, other frequencies are grouped in twos.

Column 4. Frequencies are grouped in threes.

Column 5. Leaving the first frequency, other frequencies are grouped in threes.

Column 6. Leaving the first two frequencies, other frequencies are grouped in threes.

After observing maximum total in each of these cases, put a mark or circle on every total. An analysis table is prepared after completing grouping table in order to find out the item which is repeated the highest number of times. If the same procedure is adopted in continuous series, we shall be in a position to determine the modal class.

We shall now see how mode is determined by grouping method in a discrete series.

Illustration 23. Find out mode of the data given in Illustration 22 by grouping.

Grouping Table

Each total is written against the last item of the group it covers.

Wages (in Rs)    No. of persons
                 (1)    (2)    (3)    (4)    (5)    (6)
125               3
175               8     11
225              21            29     32
275               6     27                   35
325               4            10                   31
375               2      6            12

Analysis Table

Column No.    125    175    225    275    325
1                           1
2                           1      1
3                    1      1
4             1      1      1
5                    1      1      1
6                           1      1      1
Total         1      3      6      3      1

Since the value 225 has come the largest number of times, i.e., 6 times, the modal wage is Rs 225.

Illustration 24. Compute the mode from the following :

Size of the item : 2   3   4    5    6    7    8    9   10   11   12   13
Frequency        : 3   8   10   12   16   14   10   8   17    5    4    1

Solution. The grouping of the data can be done as shown below :

Grouping Table

Each total is written against the last item of the group it covers.

Size    Frequency
        (1)    (2)    (3)    (4)    (5)    (6)
2        3
3        8     11
4       10            18     21
5       12     22                   30
6       16            28                   38
7       14     30            42
8       10            24            40
9        8     18                          32
10      17            25     35
11       5     22                   30
12       4             9                   26
13       1      5            10

The analysis can be done separately also as shown below :

Analysis Table

Column No.    4    5    6    7    8    10
1                                      1
2                       1    1
3                  1    1
4                  1    1    1
5                       1    1    1
6             1    1    1
Total         1    3    5    3    1    1

The value 6 has come the largest number of times (5), hence mode is 6.
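The grouping table and the analysis table can be sketched together in Python. This is an illustrative sketch under stated assumptions: the function name `mode_by_grouping` is chosen here, the six columns follow the description given earlier, and when two groups in a column tie for the maximum total, both are marked.

```python
def mode_by_grouping(values, freqs):
    """Return (value(s) marked most often, analysis-table totals)."""
    n = len(freqs)
    counts = [0] * n                                # analysis-table totals
    # (group size, leading frequencies left out) for columns 1 to 6
    columns = [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2)]
    for size, skip in columns:
        starts = range(skip, n - size + 1, size)
        groups = [range(s, s + size) for s in starts]
        totals = [sum(freqs[j] for j in g) for g in groups]
        best = max(totals)
        for group, total in zip(groups, totals):
            if total == best:                       # mark every item of a maximal group
                for j in group:
                    counts[j] += 1
    top = max(counts)
    return [v for v, c in zip(values, counts) if c == top], counts


sizes = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
freqs = [3, 8, 10, 12, 16, 14, 10, 8, 17, 5, 4, 1]
mode, totals = mode_by_grouping(sizes, freqs)
print(mode)         # [6]
print(totals)       # [0, 0, 1, 3, 5, 3, 1, 0, 1, 0, 0, 0]
```

Size 10 has the highest single frequency (17), but size 6 is marked most often, which agrees with the mode found by grouping.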


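(b) Continuous Series

In continuous series, the value of mode is determined by applying the following formula :

$Mo = l_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times i$

or

$Mo = l_1 + \frac{f_1 - f_0}{(f_1 - f_0) + (f_1 - f_2)} \times i$

where, Mo = Mode

$l_1$ = lower limit of modal class

$f_1$ = frequency of the modal class

$f_0$ = frequency of the class preceding the modal class

$f_2$ = frequency of the class succeeding the modal class

i = class interval of the modal class

The above formula can also be expressed in the following way :

$Mo = l_1 + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times i$

or

$Mo = l_1 + \frac{|f_1 - f_0|}{|f_1 - f_0| + |f_1 - f_2|} \times i$

where, Mo = Mode

$l_1$ = lower limit of the modal class

$\Delta_1$ = (Read delta 1), i.e., $|f_1 - f_0|$, the difference between the frequency of the modal class and the frequency of the preceding class (ignoring signs)

$\Delta_2$ = (Read delta 2), i.e., $|f_1 - f_2|$, the difference between the frequency of the modal class and the frequency of the succeeding class (ignoring signs)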

Illustration 25. Find out the mode from the following frequency distribution :

Central sizes : 1   2   3    4    5    6    7   8   9   10
Frequency     : 8   6   10   12   20   12   5   3   2   …

Solution. Since the central sizes are given, we must convert them into class intervals.

Grouping Table

Each total is written against the last item of the group it covers.

Class Interval    Frequency
                  (1)    (2)    (3)    (4)    (5)    (6)
0.5-1.5            8
1.5-2.5            6     14
2.5-3.5           10            16     24
3.5-4.5           12     22                   28
4.5-5.5           20            32                   42
5.5-6.5           12     32            44
6.5-7.5            5            17            37
7.5-8.5            3      8                          20
8.5-9.5            2             5     10
9.5-10.5           …      …                    …

Analysis Table

Column No.    2.5-3.5    3.5-4.5    4.5-5.5    5.5-6.5    6.5-7.5
1                                   1
2                                   1          1
3                        1          1
4                        1          1          1
5                                   1          1          1
6             1          1          1
Total         1          3          6          3          1

By inspection, the mode lies in the group 4.5-5.5. To determine the value of Mode, we should apply the following formula.

$Mo = l_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times i$

where, $l_1$ = lower limit of the modal class (4.5)

$f_1$ = frequency of the modal class (20)

$f_0$ = frequency of the class preceding the modal class (12)

$f_2$ = frequency of the class succeeding the modal class (12)

i = class interval of modal group (1)

$Mo = 4.5 + \frac{20 - 12}{2 \times 20 - 12 - 12} \times 1 = 4.5 + \frac{8}{40 - 12 - 12} \times 1 = 4.5 + \frac{8}{16} \times 1$

Mode = 4.5 + 0.5 = 5.

Illustration 26. Find the mode of the distribution from the following data :

Below 15 . ........

20 10

" 25 26

" 30 38

" 35 47

40 52

" 45 55

Solution. For calculation, mode of the given distribution first convert the given data into class intervals.

Grouping Table

(Each total is entered against the first class of the group it sums.)

Class    (1)   (2)   (3)   (4)   (5)   (6)
10-15      3    10          26
15-20      7          23          35
20-25     16    28                      37
25-30     12          21    26
30-35      9    14                17
35-40      5           8
40-45      3

Analysis Table

Column No.   10-15   15-20   20-25   25-30   30-35   35-40
1                              1
2                              1       1
3                      1       1
4              1       1       1       1       1       1
5                      1       1       1
6                              1       1       1
Total          1       3       6       4       2       1
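The grouping and analysis tables above can be reproduced mechanically. The sketch below is one possible way of doing so in Python, not a prescribed method: the function names are illustrative, and it assumes that when two groups in a column tie for the highest total, the classes of both groups receive a mark (as in column 4 of the table above).

```python
def grouping_columns(freqs):
    """Build the six grouping columns as lists of (start_index, size, total)."""
    # (offset, group size): singles, pairs, pairs leaving the first class,
    # triples, triples leaving the first class, triples leaving the first two
    specs = [(0, 1), (0, 2), (1, 2), (0, 3), (1, 3), (2, 3)]
    columns = []
    for offset, size in specs:
        groups = []
        for start in range(offset, len(freqs) - size + 1, size):
            groups.append((start, size, sum(freqs[start:start + size])))
        columns.append(groups)
    return columns


def analysis_totals(freqs):
    """Count, for every class, how often it falls in a column's highest group."""
    tally = [0] * len(freqs)
    for groups in grouping_columns(freqs):
        best = max(total for _, _, total in groups)
        for start, size, total in groups:
            if total == best:  # tied groups are all marked (assumption)
                for k in range(start, start + size):
                    tally[k] += 1
    return tally


classes = ["10-15", "15-20", "20-25", "25-30", "30-35", "35-40", "40-45"]
freqs = [3, 7, 16, 12, 9, 5, 3]
tally = analysis_totals(freqs)
for name, marks in zip(classes, tally):
    print(name, marks)
print("Modal class:", classes[tally.index(max(tally))])
```

The class with the largest tally is taken as the modal class, which the mode formula then uses.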

The mode lies in the class 20-25. Applying the formula, we get

Mo = l1 + (f1 - f0) / (2f1 - f0 - f2) × i

where, l1 = 20, f1 = 16, f0 = 7, f2 = 12, i = 5

Mo = 20 + (16 - 7) / (2 × 16 - 7 - 12) × 5

= 20 + 9/13 × 5

= 20 + 3.46

Mode = 23.46

Mode when Class Intervals are Unequal

The formula to calculate the mode from the modal class, discussed above, is applicable to a series with equal class intervals. When the class intervals are not equal, we must make them equal before calculating the value of the mode, and the given frequencies should be adjusted on the presumption that they are equally distributed throughout the class.

Illustration 27. Compute the mode from the following data :

Class    Frequency      Class    Frequency
0-3          4          18-20       24
3-6          8          20-24       14
6-10        10          24-25       16
10-12       14          25-28       11
12-15       16          28-30       10
15-18       20          30-36        6


Solution. The class intervals are not equal. They are made equal by combining two or more classes.

Grouping Table

(Each total is entered against the first class of the group it sums.)

Class    Frequency (1)           (2)   (3)   (4)   (5)   (6)
0-6      4 + 8 = 12               36          72
6-12     10 + 14 = 24                   60          98
12-18    16 + 20 = 36             74                      111
18-24    24 + 14 = 38                   75    81
24-30    16 + 11 + 10 = 37        43
30-36    6

Analysis Table

Column No.   6-12   12-18   18-24   24-30   30-36
1                             1
2                     1       1
3                             1       1
4                             1       1       1
5              1      1       1
6                     1       1       1
Total          1      3       6       3       1

The mode lies in the class 18-24. Applying the formula, we get

Mo = l1 + (f1 - f0) / (2f1 - f0 - f2) × i

where, l1 = 18, f1 = 38, f0 = 36, f2 = 37, i = 6

Mo = 18 + (38 - 36) / (2 × 38 - 36 - 37) × 6

= 18 + 2/3 × 6 = 18 + 4 = 22.

Mode is 22.
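The combining of unequal classes in Illustration 27 can also be written as a small routine. This is only a sketch under stated assumptions: the helper name `regroup` is hypothetical, the merged classes are assumed to start at 0 and to be 6 wide, and every original class is assumed to lie wholly inside one merged class.

```python
def regroup(classes, freqs, width):
    """Merge (lower, upper) classes into equal classes of the given width."""
    merged = {}
    for (lower, upper), f in zip(classes, freqs):
        start = (lower // width) * width  # merged class that contains `lower`
        if upper > start + width:
            raise ValueError(f"class {lower}-{upper} straddles two merged classes")
        key = (start, start + width)
        merged[key] = merged.get(key, 0) + f
    return merged


classes = [(0, 3), (3, 6), (6, 10), (10, 12), (12, 15), (15, 18),
           (18, 20), (20, 24), (24, 25), (25, 28), (28, 30), (30, 36)]
freqs = [4, 8, 10, 14, 16, 20, 24, 14, 16, 11, 10, 6]

equal = regroup(classes, freqs, 6)
for (lower, upper), f in equal.items():
    print(f"{lower}-{upper}", f)

# Mode formula applied to the modal class 18-24 found by grouping
l1, f1, f0, f2, i = 18, equal[(18, 24)], equal[(12, 18)], equal[(24, 30)], 6
print("Mode:", l1 + (f1 - f0) / (2 * f1 - f0 - f2) * i)
```

The merged frequencies are the same six values that appear in column (1) of the grouping table.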


(c) Graphic Location of Mode

The value of mode can be determined graphically in a frequency distribution. Following are the steps of locating mode on graph.

1. Prepare a histogram of the given data.

2. The highest rectangle will be the modal class.

3. Draw two lines diagonally inside the modal class rectangle, joining each upper corner of the modal rectangle to the nearer upper corner of the adjacent bar on the opposite side.

4. From the point of intersection of these lines, draw a perpendicular on the X-axis, which gives the modal value.

Illustration 28. Determine the value of mode of the following distribution graphically and verify the results.

Marks           : 0-10  10-20  20-30  30-40  40-50  50-60
No. of Students :   5     12     14     10      8      6

Solution.

[Figure: GRAPHIC LOCATION OF MODE. Histogram with Marks on the X-axis and No. of Students on the Y-axis. Scale: 2 cm = 10 Marks on X-axis, 1 cm = 2 Students on Y-axis.]

Verification :

Mode lies in the class 20-30.

Mo = l1 + (f1 - f0) / (2f1 - f0 - f2) × i, where l1 = 20, f1 = 14, f0 = 12, f2 = 10, i = 10

Mo = 20 + (14 - 12) / (2 × 14 - 12 - 10) × 10

= 20 + 2/6 × 10 = 20 + 3.33

Mode = 23.33 Marks.
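The graphic method and the formula agree because the two diagonals are straight lines whose crossing point can be worked out directly. The sketch below is an illustration rather than part of the original solution: the function name is hypothetical, and the corner coordinates are read from the histogram of Illustration 28 (modal class 20-30 with height 14, neighbouring bars of heights 12 and 10).

```python
def line_intersection_x(p1, p2, p3, p4):
    """x-coordinate of the point where line p1-p2 crosses line p3-p4."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if denom == 0:
        raise ValueError("the lines are parallel and do not intersect")
    det12 = x1 * y2 - y1 * x2
    det34 = x3 * y4 - y3 * x4
    return (det12 * (x3 - x4) - (x1 - x2) * det34) / denom


l1, i = 20, 10            # modal class 20-30
f0, f1, f2 = 12, 14, 10   # preceding, modal and succeeding frequencies

# Diagonal 1: upper-left corner of the modal bar to upper-left corner of the next bar
# Diagonal 2: upper-right corner of the previous bar to upper-right corner of the modal bar
graphic = line_intersection_x((l1, f1), (l1 + i, f2), (l1, f0), (l1 + i, f1))
formula = l1 + (f1 - f0) / (2 * f1 - f0 - f2) * i

print(round(graphic, 2), round(formula, 2))
```

Both routes give the same modal value, since the crossing point divides the class interval in the ratio (f1 - f0) : (f1 - f2).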


The point where the perpendicular touches the X-axis gives the modal value. Mode cannot be determined graphically if two

(d) Mode from Mean and Median

We have studied the types of distribution curves in Chapter 7. In a symmetrical distribution mean, median and mode are identical, because there is an equal distribution of frequencies on either side of the maximum; in other words, the number of cases above the mean value and below the mean value are equal. This relationship does not exist in a moderately asymmetrical distribution, where the three values will pull apart.

[Figure: three distribution curves. Asymmetrical Distribution (Negatively Skewed Curve): X < Me < Mo. Symmetrical Distribution (Bell-Shaped Curve): X = Me = Mo. Asymmetrical Distribution (Positively Skewed Curve): Mo < Me < X.]

In case of an asymmetrical distribution, if the distribution tails off towards higher values of the data (i.e., positively skewed) and has greater concentration in lower values, mean and median will be more than the mode (X and Me > Mo). In other words, mode is lowest, i.e., X > Me > Mo.

In case of an asymmetrical distribution, if the distribution tails off towards lower values of the data and has greater concentration in higher values (i.e., negatively skewed), mean and median are less than mode (X and Me < Mo). In other words, mode is highest, i.e., X < Me < Mo.

The relationship between mean, median and mode reveals that in a moderately asymmetrical (skewed) distribution the median lies between the mode and the arithmetic mean, approximately 2/3rd of the distance from the mode and 1/3rd from the mean. This relationship, which is given by Karl Pearson, is expressed as follows :

Mode = Mean - 3(Mean - Median) = Mean - 3 Mean + 3 Median = 3 Median - 2 Mean

Mo = 3 Me - 2X

In most cases, if the distribution is moderately asymmetrical, the value of mode calculated from mean and median would not differ significantly from the value calculated by other methods. There may be two values in a series which occur with equal frequency; this is called a bi-modal series. In case of a bi-modal distribution, or when the mode is ill-defined, its value may be determined by the above formula, which is based upon the relationship of mean, median and mode. If we know any two of the three values, we can calculate the third value from the above relationship.

Illustration 29. (a) In an asymmetrical distribution mean is 58 and the median is 61. Calculate mode.

(b) If mode in a tolerably asymmetrical distribution is 12 and median is 16, what would be the most probable mean?

Solution.

(a) Mode = 3 Median - 2 Mean = (3 × 61) - (2 × 58) = 183 - 116 = 67

Mode = 67.

(b) Mode = 3 Median - 2 Mean

12 = (3 × 16) - 2 Mean
12 = 48 - 2 Mean
2 Mean = 48 - 12 = 36

Mean = 36/2 = 18

Mean = 18.
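Since the relationship Mode = 3 Median - 2 Mean links three quantities, any one of them can be obtained from the other two. A minimal sketch follows; the function names are illustrative only, and the figures are those of Illustration 29.

```python
def mode_from(mean, median):
    """Mode = 3 Median - 2 Mean."""
    return 3 * median - 2 * mean


def mean_from(mode, median):
    """Rearranged: Mean = (3 Median - Mode) / 2."""
    return (3 * median - mode) / 2


def median_from(mean, mode):
    """Rearranged: Median = (Mode + 2 Mean) / 3."""
    return (mode + 2 * mean) / 3


print(mode_from(mean=58, median=61))   # part (a)
print(mean_from(mode=12, median=16))   # part (b)
```

The relationship is only approximate and holds for moderately asymmetrical distributions, so values obtained in this way should be read as estimates.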

Illustration 30. The following table gives production yield in kg per hectare of wheat of 150 farms in a village. Calculate the mean, median and mode production yield.

Production (in kg) : 50-53 53-56 56-59 59-62 62-65 65-68 68-71 71-74 74-77

No. of farms : 3 8 14 30 36 28 16 10 5

Solution.

Production   No. of     Mid-points   d = m - 63.5   d' = (m - 63.5)/3   fd'    c.f.
yield (X)    farms (f)     (m)
50-53            3         51.5          -12              -4            -12       3
53-56            8         54.5           -9              -3            -24      11
56-59           14         57.5           -6              -2            -28      25
59-62           30         60.5           -3              -1            -30      55
62-65           36         63.5            0               0              0      91
65-68           28         66.5           +3              +1            +28     119
68-71           16         69.5           +6              +2            +32     135
71-74           10         72.5           +9              +3            +30     145
74-77            5         75.5          +12              +4            +20     150

N = 150, Σfd' = 16

Mean :

X = A + (Σfd' / N) × i = 63.5 + (16 / 150) × 3

= 63.5 + 0.32 = 63.82

Mean = 63.82 kg per hectare

Median :

Median = the size of (N/2)th item

= the size of (150/2)th item

= the size of 75th item

Median lies in group 62-65.

To interpolate median, we use the following formula :

Me = l1 + (N/2 - c.f.) / f × i

where, l1 = 62, N/2 = 75, c.f. = 55, f = 36, i = 3

Me = 62 + (75 - 55) / 36 × 3

= 62 + 1.666 = 63.67

Median = 63.67 kg per hectare

Mode :

Grouping Table

(Each total is entered against the first class of the group it sums.)

Production    No. of farms
(in kg) X         (1)        (2)   (3)   (4)   (5)   (6)
50-53               3         11          25
53-56               8               22          52
56-59              14         44                      80
59-62              30               66    94
62-65              36         64                80
65-68              28               44                54
68-71              16         26          31
71-74              10               15
74-77               5

Analysis Table

Column No.   56-59   59-62   62-65   65-68   68-71
1                              1
2                              1       1
3                      1       1
4                      1       1       1
5                              1       1       1
6              1       1       1
Total          1       3       6       3       1


By inspection the mode lies in the group 62-65. Applying the following formula, we get

Mo = l1 + (f1 - f0) / (2f1 - f0 - f2) × i

Here, l1 = 62, f1 = 36, f0 = 30, f2 = 28, i = 3

Mo = 62 + (36 - 30) / (2 × 36 - 30 - 28) × 3

= 62 + 6/14 × 3

= 62 + 1.285

Mode = 63.29 kg per hectare.
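The three averages of Illustration 30 can be recomputed together as a check on the arithmetic. The following sketch is illustrative: the function names are not from the text, and the mode routine simply takes the class with the highest frequency as the modal class, which is an assumption that happens to agree with the grouping method for these data.

```python
lower_limits = [50, 53, 56, 59, 62, 65, 68, 71, 74]
freqs = [3, 8, 14, 30, 36, 28, 16, 10, 5]
width = 3


def step_deviation_mean(lower_limits, freqs, width, assumed_mean):
    """Mean = A + (sum of f*d' / N) * i, with d' = (mid-point - A) / i."""
    n = sum(freqs)
    sum_fd = sum(f * ((lo + width / 2) - assumed_mean) / width
                 for lo, f in zip(lower_limits, freqs))
    return assumed_mean + sum_fd / n * width


def grouped_median(lower_limits, freqs, width):
    """Me = l1 + (N/2 - c.f.) / f * i, interpolated inside the median class."""
    half = sum(freqs) / 2
    cf = 0  # cumulative frequency of the classes before the current one
    for lo, f in zip(lower_limits, freqs):
        if f > 0 and cf + f >= half:
            return lo + (half - cf) / f * width
        cf += f
    raise ValueError("empty distribution")


def grouped_mode(lower_limits, freqs, width):
    """Mo = l1 + (f1 - f0) / (2*f1 - f0 - f2) * i for the highest-frequency class."""
    k = freqs.index(max(freqs))
    f1 = freqs[k]
    f0 = freqs[k - 1] if k > 0 else 0
    f2 = freqs[k + 1] if k + 1 < len(freqs) else 0
    return lower_limits[k] + (f1 - f0) / (2 * f1 - f0 - f2) * width


print(round(step_deviation_mean(lower_limits, freqs, width, 63.5), 2))
print(round(grouped_median(lower_limits, freqs, width), 2))
print(round(grouped_mode(lower_limits, freqs, width), 2))
```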

3. Merits and Demerits of Mode

Merits

1. As compared to mean and median, mode is the most typical value of a series, e.g., the size of shoes, etc.

2. Mode is not affected by the values of extreme items; it can be determined even if the extreme values are not known.

3. Mode can be determined in open-end distribution.

4. ... arithmetic mean cannot be ascertained.

5. Mode is helpful in describing the qualitative character of the product.

Demerits

1. Mode cannot be decided in bi-modal and multi-modal distributions.

2. Mode is not suitable when the relative importance of items has to be considered.

3. Mode is not capable of algebraic treatment.

4. Mode is not based on every item of the series.

5. The value of mode may differ from one ... the size of the class interval decided.

Comparison of Mode with Mean and Median

We find that as compared to mean and median, mode is less suitable. Mean is simple to calculate, its value is definite, it can be given algebraic treatment and is not affected by fluctuations of sampling. Median is even more simple to calculate and is almost as stable as mean, although it is influenced by fluctuations and cannot be given algebraic treatment. Mode is the most popular item of a series and is also easy to calculate and simple to understand. But it is not suitable for most elementary studies because it is not based on all the observations of the series and is unrepresentative. Mode has its own uses and advantages as we have seen, but as compared to mean and median, it is not so precise and accurate.

LIST OF FORMULAE

Individual Series and Discrete Series

1. Median : Me = Size of ((N + 1)/2)th item

2. Lower Quartile : Q1 = Size of ((N + 1)/4)th item

3. Upper Quartile : Q3 = Size of 3((N + 1)/4)th item

Continuous Series

1. Median : Me = Size of (N/2)th item

Me = l1 + (N/2 - c.f.) / f × i

2. Lower Quartile : Q1 = Size of (N/4)th item

Q1 = l1 + (N/4 - c.f.) / f × i

3. Upper Quartile : Q3 = Size of 3(N/4)th item

Q3 = l1 + (3(N/4) - c.f.) / f × i

Mode :

1. Grouping method for discrete series.

2. After grouping, decide the Modal Group and use the formula to find modal value in continuous series :

Mo = l1 + (f1 - f0) / (2f1 - f0 - f2) × i

EXERCISES

Questions :

1. Define median. Discuss its merits and demerits.

3 wTT """" -1-s.

•5. Write short notes on :

4 De&rr"'?' of a dismbution.

■ " -- can be read W ,be

5. What is meant by the following ?

(a) Mode (b) Median, and (c) Arithmetic mean

7. Define mode. Explain how mode can be read on graph paper.

tendency

■ "raTr^ndelf ^ -d,a„ as measures of

and mode of a frequency

" rtrsLr ^^ -- - ---

(b) Average intelligence of students in a class, and (c) Average production per shift in a factory.

Median W Arithmetic Mean]

Problems :

1. Calculate median of the following data :

145 130 200 210 198 234 159 160 178 257 260 300 345 360 390

[Me = 210]

2. Find out median of the following information :

Marks : 10, 70, 50, 20, 95, 55, 42, 60, 48, 80

[Me = 52.5 marks]

3. We have the following frequency distribution of the size of 51 households. Calculate the arithmetic mean and the median.

; Size Number of households :

4./Find out median (a) Serial No.

4

9

21

11

7 5

Total

51

1 2 3 4 5

2 4 10 8 15

5-- 10 - 15 20 25

2 • 4 6 8 10

[X = 5, Me = 5]

6 20

7 12

25

9 30

..........[Me (a) = 12, (b) = 20]

5. Find out median, first quartile and third quartile of the following series :

Height (in inches) : 58 59 60 61 62 63 64 65 66

No. of Persons     :  2  3  6 15 10  5  4  3  1

[Me = 61, Q1 = 61 and Q3 = 63]

6. The percentage of marks obtained by 68 students in an examination are given below. Compute the median.

Marks (%)       : Below 20  20-40  40-60  60-80  Above 80
No. of Students :     0       5     22     25       16

[Me = 65.6]

7. Calculate the mean of the following distribution of daily wages of workers in a factory :

Daily Wages (in Rs) : 100-120  120-140  140-160  160-180  180-200  Total

No. of Workers      :    10       20       30       15        5      80

Also, calculate the median for the distribution of wages given above.

[X = 146.25, Me = 146.67]

8. The following table gives the marks obtained by 65 students in statistics in a certain examination. Calculate the median.

More than 70%     8
    "     60%    18
    "     50%    40
    "     40%    45
    "     30%    50
    "     20%    63
    "     10%    65

[Me = 53.4 Marks]


■ 26 8 ^ 2 50

j 16-19 20-29 30-39 40-49 50-59 60-64

1 ^^ 46 49 32 28 14

■ — —iiicuian or tne above data,

(ft) Draw a histogram and indicate mean and mode therein.

[Get Mode on Histogram. Mean cannot be obtained on Histogram.]


30-35  35-40  40-45  45-50  50-55  55-60  60-65
  14     16     18     23     18      8      3

[Me = 45.43, Q1 = 38.4, Q3 = 51.1]

12. Compute mode from the following series :

Size of items   Frequency   Size of items   Frequency
      1             3             8             10
      3             8             9              8
      4            10            10             17
      5            12            11              5
      6            16            12              4
      7            14            13              1

[Mo = 6]

13. Calculate mode for the following data :

1 26

2 113

3 120

4 95

5 60

6 42

7 21

8 14

9 5

10 4

[Mo = 3]

14. Find out the Mode from any of the following two distributions :

X : 30-40 40-50 50-60 60-70 70-80 80-90 90-100

f : 6 10 16 14 10 5 2

And

Marks : 0-9 10-19 20-29 30-39 40-49

No. of Candidates : 6 29 87 181 247

Marks : 50-59 60-69 70-79 80-89 90-99

No. of Candidates : 263 113 49 9 2

15. Life of electric lamps is given in the following table. Calculate the median and the mode.

Below 400 4

400-800 12

800-1200 40

1200-1600 41

1600-2000 27

2000-2400 13

2400-2800 9

Above 2800 4

[Mo

16. Calculate Mean, Median and Mode from the following data :

59 1

61 2

63 9

65 48

67 131

69 102

71 40

73 17

Total 350

[X= 67.9, Me = 67.75, Mo = 67.48]

17. Following is the distribution of marks of 50 students in a class :

Marks (more than) :  0  10  20  30  40  50
No. of students   : 50  46  40  20  10  ...

Calculate the Median Marks. If 60% of students pass this examination, find out the minimum marks obtained by a pass candidate. [Me = 27.5, 25.5%]



32 20 43 11

61 31 47 .15

52 56 64 20 35 21 50

22 10 43 42

49 62

75 77

persons is given below :

' 97 35 30 30 95

60 27 53 31 9

45 22 36 13 46

73 81 40 40 55

67 54 23

42 25 51

modal age.

19. Determine the value of mode for the foil ^ = ^^

Mode = 3, Median = 2 M^n :

21.

[Modal age = 42 years]

Marks           No. of Students
Less than 10          5
Less than 20         15
Less than 30         98
Less than 40        242
Less than 50        367
Less than 60        405
Less than 70        425
Less than 80        438
Less than 90        439

20.

c 1 , LA = jy.j:). Me = SX 44

For the data given below find graphically the following :

(a) The two quartiles.

(b) The central 50% limit of the age.

(c) The number of workers falling in the age group of 25 to 57 years.

Mo = 36.22]

Age in years   : 20-24  25-29  30-34  35-39  40-44  45-49
No. of workers :   5     10     15     25     65     40

Age in years   : 50-54  55-59  60-64  65-69
No. of workers :  23     10      5      2

Draw a 'less than' ogive from the following data and hence find out the value of

Class     : 20-25  25-30  30-35  35-40  40-45  45-50  50-55  55-60
Frequency :   6      9     13     23     19     15      9      6

22. The following table gives the distribution of the wages of 65 employees in a factory.

Wages (in Rs)               : 50  60  70  80  90  100  110  120
(Equal to or more than)

Number of employees         : 65  57  47  31  17    7    2    0

23. Draw the histogram and estimate the value of mode from the following data :

Marks No. of students

0-10 0

10-20 2

20-30 3

30-40 7

40-50 13

50-60 11

60-70 9

70-80 2

80-90 1

24. Represent the following data by means of a histogram and find out mode.

Weekly wages   : 10-15  15-20  20-25  25-30  30-35  35-40  40-45
No. of workers :   7     19     27     15     12     12      8

W ind^idual incomes, the 'less than type' Ogive and the

;:. 7i?'r ti!^™ ^sir ^^ --

Chapter 10

MEASURES OF DISPERSION

Introduction

Objectives of Measuring Dispersion

Methods of Measuring Dispersion

(A) Dispersion from Spread of Values

(B) Dispersion from Average

(C) Graphic Method: Lorenz Curve

Absolute and Relative Measures of Dispersion

(A) Absolute Measures

(B) Relative Measures

Graphic Method

Comparison of Measures of Dispersion

List of Formulae

Measures of central tendency summarise a frequency distribution in a single figure, in the form of an average. These averages tell us about the general level of magnitude of the distribution, but they fail to show how the individual values differ from one another. Measures of central tendency are sometimes unrepresentative. This happens when the extent of variation of individual values in relation to the other values is large. In any such case it is necessary not only to know the average but also to know about the measures of dispersion.

A statistician knew that the average depth of the river was ... and that the average height of the members of his family was 130 cm. He, therefore, decided that they could cross the river on foot. But it so happened that ... the height of the youngest child of the family ... What must have happened to the statistician's family ...

... there may be great ... a majority of the people may be below the poverty line. There is a need to ...

Definitions

According to D.C. Brooks and W.F.L. Dick. "Dispersion or spread is the degree of the scatter or variation of variables about a central value."

According to Prof. L.R. Connor. "Dispersion is a measure of the extent to which the individual items vary."

According to A.L. Bowley. "Dispersion is a measure of the variation of the items."

Now, look at the following data about salaries paid to employees of three different departments of an organisation.

            Dept. A                  Dept. B                  Dept. C
      Salary    Deviation      Salary    Deviation      Salary    Deviation
       (Rs)                     (Rs)                     (Rs)
       5000         0           4500       - 500        10000      + 5000
       5000         0           5500       + 500         2000      - 3000
       5000         0           6000      + 1000         4000      - 1000
       5000         0           5000          0          4500       - 500
       5000         0           4000      - 1000         4500       - 500

Total :   25000                 25000                    25000
Mean X :  Rs 5000               Rs 5000                  Rs 5000

We find from the above table that the average salary paid to employees in each department is the same, i.e., Rs 5000. In department 'A' the salary paid to each employee is the same, i.e., Rs 5000, hence mean is fully representative of the values of the items in the series. In department 'B', though the mean is Rs 5000, the constitution of the series is quite different. In this case the lowest value is Rs 4000 and the highest value is Rs 6000, the difference between the highest and the lowest value is Rs 2000, and the highest deviations from the mean are -1000 and +1000. The mean, in this case, does not adequately represent the values of the items in the series of department 'B'. In department 'C', though the mean is the same, there is a wide gap between the values of items. The lowest value is Rs 2000 and the highest value is Rs 10,000, which deviate from the mean by -3000 and +5000 respectively. The difference between the highest and the lowest value is Rs 8000. Not a single item in the series is represented by its mean.

From the above illustration we observe that some deviations are positive and some are negative. Similarly, some deviations are large and others are small. Therefore, we are required to make an overall summary of these differences (scatteredness) in all values about the central value. This summary is called the measures of dispersion or measures of variation. It is clear that we must not only know the composition of a series but also observe how the composition of a series differs from another. For such a study we have a statistical tool called measures of dispersion or measures of variation.
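The comparison of the three departments can be repeated with a few lines of Python. This is a small illustrative sketch; the function name `summarise` is hypothetical, and the salaries are those of the table above.

```python
departments = {
    "A": [5000, 5000, 5000, 5000, 5000],
    "B": [4500, 5500, 6000, 5000, 4000],
    "C": [10000, 2000, 4000, 4500, 4500],
}


def summarise(salaries):
    """Return the mean, the deviations from the mean and the highest-lowest gap."""
    mean = sum(salaries) / len(salaries)
    deviations = [s - mean for s in salaries]
    spread = max(salaries) - min(salaries)
    return mean, deviations, spread


for name, salaries in departments.items():
    mean, deviations, spread = summarise(salaries)
    print(f"Dept. {name}: mean = {mean}, deviations = {deviations}, gap = {spread}")
```

All three departments share the same mean, and only the deviations and the gap between the highest and the lowest salary tell them apart.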

OBJECTIVES OF MEASURING DISPERSION

Before we go on to describe the specific methods of studying variability, we must clearly define the objectives.

(a) To Test the Reliability of an Average : Measures of dispersion enable us to know whether an average is really representative of the series. If the dispersion or variability in the values of various items in a series is large, the average may be unrepresentative of the series. If, on the other hand, the variability is small, the average would be a representative value. This point has already been made clear in the above illustration, wherein the mean was a common value for the series of three departments while the variations differed.

(b) To Serve as Basis for Control of Variability : The study of variation is done also for the purpose of analysing why large variations happen or occur and this may help to control the variation itself.

For example, in some major human health problems the blood pressure, the heart beat and the pulse are recorded and an attempt is made by the doctors to control these through provision of medicines. Similarly, in industrial production, the causes of variation in the product are identified through inspection and quality control programmes in order to control the quality of the product. In social sciences, where we have to study problems relating to inequality in income and wealth, measures of dispersion are of great help.

(c) To Make a Comparative Study of Two or More Series : Measures of variability are also useful in comparing two or more series with regard to disparities or differences. A greater degree of dispersion or variability would mean lack of uniformity or consistency or homogeneity of the data, while a low degree of variability would indicate high uniformity or consistency or stability. Comparative studies of variability are very useful in many fields like profit of companies, share values, performance of individuals and studies relating to demand, supply and prices, etc.

(d) To Serve as a Basis for Further Statistical Analysis : Measure of variability, which is a measure of second order, is very useful in the use of higher measures such as skewness, kurtosis, correlation, regression, etc.

Note: Characteristics of a representative average are explained on Pages 139 and 140 of this Book. The same points hold for the characteristics of a good measure of dispersion.

Following are the important methods of measuring dispersion :

METHODS OF MEASURING DISPERSION

(A) Dispersion from Spread of Values
    Range
    Interquartile Range and Quartile Deviation

(B) Dispersion from Average
    Mean Deviation or Average Deviation
    Standard Deviation

(C) Graphic Method
    Lorenz Curve

The first two measures, viz., Range and Quartile Deviation, are measures from spread of values, termed positional measures. They are calculated from the values of the variable at a particular position of the distribution. They are not based on deviations from any particular value. Mean deviation and standard deviation, on the other hand, are measures from an average, defined in terms of deviations from a central value. Lorenz curve is a graphic method of studying dispersion/variability.

Measures of dispersion in terms of spread and position are as under :

(A) Dispersion from Spread of Values

(a) Range

(b) Interquartile Range and Quartile Deviation

(a) Range

1. Meaning

2. Calculation of Range

3. Merits and Demerits of Range

4. Uses of Range

1. Meaning

Range is the simplest measure of dispersion. Range is the difference between the largest and the smallest value in the distribution. It is determined by two extreme values of observations. In case of the grouped frequency distribution, range is defined as the difference between the upper limit of the highest class and the lower limit of the smallest class. In case of a frequency distribution, the frequencies of the various classes are immaterial since range depends only on the two extreme observations. Range as defined is an absolute measure of dispersion and is expressed in the units of measurement of the given data. Thus, if we want to compare the variability of two or more distributions with the same units of measurement, we may use the absolute measure. Symbolically, range is located by the following formula :

Range = L - S

where, L = Largest item

S = Smallest item

Relative Measure

To compare the variability of two or more distributions given in different units of measurement, we cannot use absolute measure but we need a relative measure which is independent of the units of measurement. This relative measure is called coefficient of Range. It is common practice to use coefficient of range even for the comparison of variability of the distributions given in the same units of measurement. It is obtained by applying the following formula :

Coefficient of Range = (L - S) / (L + S)
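Both measures need only the two extreme values, which the short sketch below makes explicit. The function name is illustrative, and the sample extremes (largest 30, smallest 5) are used purely as an example.

```python
def range_and_coefficient(largest, smallest):
    """Return (Range, Coefficient of Range) = (L - S, (L - S) / (L + S))."""
    spread = largest - smallest
    if largest + smallest == 0:
        raise ValueError("L + S is zero; the coefficient of range is undefined")
    return spread, spread / (largest + smallest)


spread, coefficient = range_and_coefficient(largest=30, smallest=5)
print(spread, round(coefficient, 3))
```

The range carries the unit of the data, while the coefficient is a pure number and can therefore be compared across series measured in different units.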


Range = L - S
Here, L = 30, S = 5
Range = 30 - 5 = 25
Range is 25 Marks.

Coefficient of Range = (L - S) / (L + S) = (30 - 5) / (30 + 5) = 25/35 = 0.714

Range = L - S
Here, L = 60, S = 0
Range = 60 - 0 = 60
Range is 60 Marks.

Coefficient of Range = (L - S) / (L + S) = (60 - 0) / (60 + 0) = 1

Age (in years) : 16-20  21-25  26-30  31-35

No. of Persons :  10     15     17    ...

Calculate range and the coefficient of range.

Solution. The class intervals are inclusive, so they are first converted into exclusive form: the first class becomes 15.5-20.5 and the last class will become 30.5-35.5.

Absolute Measure of Range

Range = L - S
Here, L = 35.5, S = 15.5
Range = 35.5 - 15.5 = 20 years

Alternatively (from mid-values)
Mid-value of highest class = 33
Mid-value of lowest class = 18
Range = 33 - 18 = 15 years.

Relative Measure of Range

Coefficient of range = (L - S) / (L + S) = (35.5 - 15.5) / (35.5 + 15.5) = 20/51 = 0.39

Alternatively,

Coefficient of range = (33 - 18) / (33 + 18) = 15/51 = 0.29

Illustration 3. The following are the marks obtained by 50 students in Statistics. Calculate the range of marks obtained by middle 50% of the students.

Marks            No. of Students
Less than 10            4
Less than 20           10
Less than 30           30
Less than 40           40
Less than 50           47
Less than 60           50

Solution. We arrange the data in continuous series.

Marks      No. of Students (f)    c.f.
0-10               4                4
10-20              6               10
20-30             20               30
30-40             10               40
40-50              7               47
50-60              3               50

To get marks of middle 50% students, we find out the marks of the 12.5th and 37.5th student (i.e., 1st and 3rd Quartiles), Q1 and Q3. Marks of 12.5th student lie in class 20-30 and marks of 37.5th student in class 30-40.

Marks of 12.5th student = l1 + (12.5th student - c.f.) / f × i

= 20 + (12.5 - 10) / 20 × 10

= 20 + 2.5/20 × 10 = 21.25 Marks

Marks of 37.5th student = l1 + (37.5th student - c.f.) / f × i

= 30 + (37.5 - 30) / 10 × 10

= 30 + 7.5/10 × 10 = 37.5 Marks

Largest Value = 37.5 Marks
Smallest Value = 21.25 Marks
Range = L - S = 37.5 - 21.25 = 16.25 Marks

Thus, range of marks obtained by middle 50% students is 16.25 marks.
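The working of Illustration 3 has two parts: turning the 'less than' figures into class frequencies, and interpolating the marks of the 12.5th and 37.5th students. A minimal sketch of both parts is given below; the function name `value_at_position` is hypothetical.

```python
def value_at_position(lower_limits, freqs, width, position):
    """Interpolate the value of the item at `position`: l1 + (position - c.f.) / f * i."""
    cf = 0  # cumulative frequency of the classes before the current one
    for lo, f in zip(lower_limits, freqs):
        if f > 0 and cf + f >= position:
            return lo + (position - cf) / f * width
        cf += f
    raise ValueError("position lies beyond the last class")


less_than = [4, 10, 30, 40, 47, 50]  # cumulative 'less than' frequencies
lower_limits = [0, 10, 20, 30, 40, 50]
width = 10

# Each class frequency is the difference between successive cumulative figures
freqs = [less_than[0]] + [b - a for a, b in zip(less_than, less_than[1:])]

n = less_than[-1]
q1 = value_at_position(lower_limits, freqs, width, n / 4)
q3 = value_at_position(lower_limits, freqs, width, 3 * n / 4)
print(freqs)
print(q1, q3, q3 - q1)
```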

Solution. We arrange the data in ascending order :

61, 64, 65, 66, 67, 67, 68, 68, 69, 70, 72

Range of height = L - S = 72 - 61 = 11 inches.

When the shortest man (61 inches) is omitted, the range will be = L - S = 72 - 64 = 8 inches.

Change in the range = 11 - 8 = 3 inches

Percentage change in the range = 3/11 × 100 = 27.27%

3. Merits and Demerits of Range

Merits

1. Range is simple to calculate and easy to understand. To get a quick rather than a very accurate picture of variability, one may compute range.

2. It gives broad picture of the data quickly.

3. It is rigidly defined.

4. It depends on unit of measurement of the variable.

Demerits

... 160 to 180 centimetres, if a dwarf (shortest) student whose height is 100 centimetres is admitted in our data, the range would shoot up from 20 to 80 centimetres. Thus, a single variation in the value of an extreme item affects the value of the range.

3. It is influenced very much by fluctuations of sampling. Range is subject to fluctuations of values from sample to sample. However, in small samples, it is useful in certain circumstances.

4. It cannot be calculated in case of open-end distributions because extreme values of the distribution are not known.

5. It does not tell anything about distribution of items in the series relative to a measure of central tendency.

Thus, the range is a very unsatisfactory measure of dispersion and should be used with great care and caution.

4. Uses of Range

Despite various limitations, the range is useful in the following areas :

(a) Quality control : Range is used to study the variation in the quality of the items produced by a manufacturing concern. Range has a great significance in quality control measures.

(b) Measure of fluctuations : It is a very useful measure to study fluctuations of series. Variations in the prices of shares and other commodities, money rates and rate of exchange can easily be studied with the help of range.

(c) Use in day-to-day life : Range is by far the most widely used measure of variability in our day-to-day life. For example, the answer to problems like 'daily sales in a departmental store', 'monthly wages of workers in a factory' or 'the expected return of fruits from an orchard' is usually provided by the probable limits in the form of range.

(d) Use in meteorological department : Range is also used as a very convenient measure by the meteorological department for weather forecast, since the general public is interested to know the limits within which the temperature is likely to vary on a particular day.

(b) Interquartile Range and Quartile Deviation

1. Meaning

2. Calculation of Quartile Deviation

3. Merits and Demerits of Quartile Deviation

1. Meaning

Just as in case of range the difference of extreme items is obtained, similarly if the difference in the two values of quartiles is calculated, it would give us what is called the 'Interquartile Range'. It is also a measure of dispersion. It has an advantage over range in as much as it is not affected by the values of the extreme items. In fact, 50% of the values of a variable lie between the two quartiles (i.e., Q1 and Q3) and as such the interquartile range gives a fair measure of variability.


Semi-interquartile Range or Quartile Deviation

I.Q.R. = Interquartile Range = Q3 - Q1

where, Q3 = Third Quartile, Q1 = First Quartile

Semi-Interquartile Range or Quartile Deviation

Q.D. = (Q3 - Q1) / 2

Symbolically,

Coefficient of Quartile Deviation = ((Q3 - Q1) / 2) / ((Q3 + Q1) / 2)

or

Coeff. of Q.D. = (Q3 - Q1) / (Q3 + Q1)

1. It is simple to calculate and easy to understand.

2. It is rigidly defined.

3. It does not depend on all the values of the data.

The units of the quartile deviation are the same as those of the variable.

2. Calculation of Quartile Deviation

(a) Series of Individual observations

(b) Discrete Series

(c) Continuous Series

(a) Series of Individual Observations

the -ijtr ^^^^ -

145 130 200 210 , 198

234 159 160 178 257

260 300 345 360 390

S. No.   Income (Rs)      S. No.   Income (Rs)
  1         130              9        234
  2         145             10        257
  3         159             11        260
  4         160             12        300
  5         178             13        345
  6         198             14        360
  7         200             15        390
  8         210

Steps

1. Arrange the data in ascending order to get the value of lower and upper quartiles.

2. Locate the value by finding out Q1 = size of ((N + 1)/4)th item and Q3 = size of 3((N + 1)/4)th item.

3. Apply the formulae to get interquartile range, quartile deviation and coefficient of quartile deviation.

Thus, we get

Q1 = Size of ((15 + 1)/4)th item = 4th item = Rs 160

Q3 = Size of 3((15 + 1)/4)th item = 12th item = Rs 300

Interquartile Range = Q3 - Q1
Here, Q3 = 300 and Q1 = 160
= 300 - 160 = Rs 140

Quartile Deviation = (Q3 - Q1) / 2

Q.D. = (300 - 160) / 2 = Rs 70

Coefficient of Q.D. = (Q3 - Q1) / (Q3 + Q1) = (300 - 160) / (300 + 160) = 140/460 = 0.304
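The steps for a series of individual observations can be sketched as follows. The helper `item_at` is hypothetical; it also interpolates between neighbouring items when the position is fractional, which is an assumption that is not needed for these data because both positions are whole numbers.

```python
def item_at(sorted_values, position):
    """Value of the item at a 1-based position in an ascending series."""
    if position < 1 or position > len(sorted_values):
        raise ValueError("position lies outside the series")
    whole = int(position)
    fraction = position - whole
    value = sorted_values[whole - 1]
    if fraction and whole < len(sorted_values):
        # Fractional position: move part of the way towards the next item (assumption)
        value += fraction * (sorted_values[whole] - value)
    return value


incomes = [145, 130, 200, 210, 198, 234, 159, 160, 178, 257, 260, 300, 345, 360, 390]
ordered = sorted(incomes)
n = len(ordered)

q1 = item_at(ordered, (n + 1) / 4)       # 4th item
q3 = item_at(ordered, 3 * (n + 1) / 4)   # 12th item

print("Interquartile range:", q3 - q1)
print("Quartile deviation:", (q3 - q1) / 2)
print("Coefficient of Q.D.:", round((q3 - q1) / (q3 + q1), 3))
```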


Solution. We are given

Quartile deviation (Q.D.) = (Q3 - Q1) / 2 = 15 Marks

∴ Q3 - Q1 = 30   ...(1)

Coefficient of Quartile deviation = (Q3 - Q1) / (Q3 + Q1) = 0.6

30 / (Q3 + Q1) = 0.6

∴ Q3 + Q1 = 30/0.6 = 50   ...(2)

Now equations (1) and (2) are solved by adding them :

Q3 - Q1 = 30
Q3 + Q1 = 50
2 Q3 = 80

Q3 = 80/2 = 40 Marks

Putting the value of Q3 in equation (1) :

Q3 - Q1 = 30
40 - Q1 = 30
Q1 = 40 - 30 = 10 Marks

Thus, Upper Quartile = 40 Marks and Lower Quartile = 10 Marks
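The same pair of simultaneous equations can be solved in general terms: the quartile deviation gives the difference of the quartiles and the coefficient gives their sum. A minimal sketch, with an illustrative function name:

```python
def quartiles_from(qd, coefficient):
    """Recover (Q1, Q3) from the quartile deviation and its coefficient."""
    if coefficient == 0:
        raise ValueError("a zero coefficient does not determine Q3 + Q1")
    difference = 2 * qd                # Q3 - Q1, equation (1)
    total = difference / coefficient   # Q3 + Q1, equation (2)
    q3 = (total + difference) / 2      # adding (1) and (2)
    q1 = (total - difference) / 2      # subtracting (1) from (2)
    return q1, q3


q1, q3 = quartiles_from(qd=15, coefficient=0.6)
print(round(q1, 2), round(q3, 2))
```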

(b) Discrete Series

Illustration 7. Calculate range, quartile deviation and coefficient of quartile deviation from the following series :

Heights (in inches) : 58 59 60 61 62 63 64 65 66

No. of Persons      :  2  3  6 15 10  5  4  3  1

Solution.

Range = L - S = 66 - 58 = 8 inches

Quartile Deviation

Height in inches   No. of persons (f)   c.f.
      58                   2              2
      59                   3              5
      60                   6             11
      61                  15             26
      62                  10             36
      63                   5             41
      64                   4             45
      65                   3             48
      66                   1             49

Steps :

1. Arrangement of items in ascending order is necessary.

2. Calculate cumulative frequencies.

3. Locate First Quartile and Third Quartile by ((N + 1)/4)th and 3((N + 1)/4)th items.

4. Values are located at the size of item in whose cumulative frequency the value of item falls.

5. Apply the formula to find quartile deviation.

Q1 = size of ((N + 1)/4)th item = ((49 + 1)/4)th item

= 12.5th item = 61 inches

Q3 = size of 3((N + 1)/4)th item = (3 × 50/4)th item = 37.5th item

= 63 inches

Q.D. = (Q3 - Q1) / 2 = (63 - 61) / 2 = 1 inch

Coefficient of Quartile Deviation

Coefficient of Q.D. = (Q3 - Q1) / (Q3 + Q1) = (63 - 61) / (63 + 61) = 2/124 = 0.016

Thus, Range = 8 inches, Q.D. = 1 inch and Coeff. Q.D. = 0.016
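In a discrete series the quartile is simply the first value whose cumulative frequency reaches the required position. The following sketch follows the steps above; the function name is illustrative.

```python
def discrete_item(values, freqs, position):
    """First value whose cumulative frequency is at least `position`."""
    cf = 0
    for value, f in zip(values, freqs):
        cf += f
        if cf >= position:
            return value
    raise ValueError("position lies beyond the last item")


heights = [58, 59, 60, 61, 62, 63, 64, 65, 66]
persons = [2, 3, 6, 15, 10, 5, 4, 3, 1]
n = sum(persons)

q1 = discrete_item(heights, persons, (n + 1) / 4)       # 12.5th item
q3 = discrete_item(heights, persons, 3 * (n + 1) / 4)   # 37.5th item

print("Range:", max(heights) - min(heights))
print("Q.D.:", (q3 - q1) / 2)
print("Coefficient of Q.D.:", round((q3 - q1) / (q3 + q1), 3))
```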

(c) Continuous Series

Illustration 8. Calculate range and quartile deviation and compare them. Also calculate coefficient of quartile deviation of the following data.

Age (years)    : 20-30 30-40 40-50 50-60 60-70 70-80 80-90
No. of members :   3    61   132   154   140    51     3

Solution.

Range

Range = L - S = 90 - 20 = 70 years

Quartile Deviation

Age (years)   No. of members (f)   c.f.
20-30                 3               3
30-40                61              64
40-50               132             196
50-60               154             350
60-70               140             490
70-80                51             541
80-90                 3             544

Steps :

1. Calculate cumulative frequencies.

2. First quartile and third quartile items are located by finding out (N/4)th and 3(N/4)th item in continuous series.

3. Locate the first quartile and third quartile group in cumulative frequency column where the size of respective (N/4)th and 3(N/4)th item falls.

4. Apply the following formula :

Q1 = l1 + (N/4 - c.f.) / f × i

Q3 = l1 + (3(N/4) - c.f.) / f × i

Thus, we get

First Quartile

Q1 = Size of (N/4)th item

= size of (544/4)th item = 136th item

Hence, Q1 lies in the group 40-50.

Q1 = l1 + (N/4 - c.f.) / f × i

where, l1 = 40, N/4 = 136, c.f. = 64, f = 132, i = 10

Q1 = 40 + (136 - 64) / 132 × 10

= 40 + (72 × 10) / 132 = 45.45 years

Hence first quartile is 45.45 years.

Third Quartile

Q3 = size of 3(N/4)th item = size of (3 × 544/4)th item = 408th item

Hence, Q3 lies in the group 60-70.

Q3 = l1 + ((3(N/4) - c.f.)/f) × i

where, l1 = 60, 3(N/4) = 408, c.f. = 350, f = 140, i = 10

Q3 = 60 + ((408 - 350)/140) × 10 = 60 + (58 × 10)/140 = 60 + 4.14 = 64.14 years

Quartile Deviation (Q.D.) = (Q3 - Q1)/2 = (64.14 - 45.45)/2 = 9.345 years

Coefficient of Quartile Deviation

Coefficient of Q.D. = (Q3 - Q1)/(Q3 + Q1) = (64.14 - 45.45)/(64.14 + 45.45) = 18.69/109.59 = 0.17

Thus, Range = 70 years, Q.D. = 9.345 years and Coeff. of Q.D. = 0.17
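The interpolation used for quartiles in a continuous series can be sketched in Python as below. The function name quartile is our own, and the sketch assumes equal class intervals of 10 years, as in Illustration 8. Because it does not round Q1 and Q3 before subtracting, the quartile deviation comes out as about 9.344 rather than 9.345.

```python
# Lower class limits and frequencies from Illustration 8 (age groups 20-30 ... 80-90).
lower_limits = [20, 30, 40, 50, 60, 70, 80]
freqs = [3, 61, 132, 154, 140, 51, 3]
width = 10  # class interval i, assumed equal for every class


def quartile(position):
    """Interpolate inside the class whose cumulative frequency reaches `position`."""
    cum_before = 0  # c.f. of the classes preceding the quartile class
    for lower, f in zip(lower_limits, freqs):
        if cum_before + f >= position:
            return lower + (position - cum_before) / f * width
        cum_before += f
    raise ValueError("position exceeds total frequency")


n = sum(freqs)            # N = 544
q1 = quartile(n / 4)      # 136th item, class 40-50
q3 = quartile(3 * n / 4)  # 408th item, class 60-70
qd = (q3 - q1) / 2
coeff_qd = (q3 - q1) / (q3 + q1)

print(round(q1, 2), round(q3, 2), round(qd, 3), round(coeff_qd, 2))
```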


Absolute value of dispersion

Quartile Deviation (Q.D.) = (Q3 - Q1)/2 = (45,000 - 18,000)/2 = Rs 13,500

Relative value of dispersion

Coefficient of Q.D. = (Q3 - Q1)/(Q3 + Q1) = (45,000 - 18,000)/(45,000 + 18,000) = 27,000/63,000 = 0.428

The absolute measure of dispersion (Rs 13,500) is expressed in the unit of the data (Rupees), whereas the relative measure is the ratio, percentage or coefficient of the absolute measure of dispersion (0.428).

3. Merits and Demerits of Quartile Deviation

Merits :

2. It is easy to calculate and simple to understand.

3. It is also useful where extreme values are likely to affect the results.

4. This measure is useful when it is desired to know variability in the central part of the series.

Demerits :

1. It ignores half the items: the first 25% and the last 25%.

2. It is also not possible to give it further algebraic treatment.

3. It is affected by fluctuations of sampling.

4. As an absolute measure it is not sufficient for comparison.

It is therefore suitable only for a rough study of variability.


(B) Dispersion from Average

The range, the interquartile range and the quartile deviation suffer from a common defect. They are calculated from only two values of a series: either the extreme values, as in the case of range, or the values of the quartiles, as in the case of quartile deviation. This method of studying dispersion by location of limits is also called the 'Method of Limits'.

It is, therefore, always better to have a measure of dispersion which is based on all the observations of a series and is calculated in relation to a central value. Range and quartile deviation are not calculated in relation to any average. If the variations of items are calculated from an average, such a measure of dispersion throws light on the formation of the series and the scatteredness of items around a central value. This method of calculating dispersion is called the 'method of averaging deviations'.

Let us examine the following illustration about the salaries paid to employees of a departmental store :

Monthly Salaries of Employees

Employee                  A        B        C        D        E       Total (Rs)
Salary (in Rs)          10,000    2,000    4,000    6,500    4,500    ΣX = 27,000
Deviations from Mean    +4,600   -3,400   -1,400   +1,100     -900    Σ(X - X̄) = 0

Mean (X̄) = ΣX/N = 27,000/5 = Rs 5,400

We observe that the salary of A (Rs 10,000) is much more than the arithmetic mean (Rs 5,400) and the salary of B (Rs 2,000) is much less than the arithmetic mean. In general, some deviations are positive and some are negative. Similarly, some are large and some are small.

If we consider an average of these deviations calculated from the arithmetic mean, we can get an idea of a measure of dispersion. As we know, the sum of the deviations calculated from the arithmetic mean is always zero. Here, positive deviations and negative deviations cancel each other out. Therefore, adding these deviations directly does not help us. Alternatively, we may consider either the 'absolute deviations' or the 'squared deviations'. Thus, the measures of dispersion in terms of deviations from a central value (average) are as under :

(A) Mean deviation or Average deviation

Where absolute deviations are obtained from average (ignoring plus and minus signs).

(B) Standard deviation

Where deviations obtained from arithmetic mean are squared.
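The cancelling of signed deviations, and the two ways round it, can be seen in a small Python sketch using the salary table above. The variable names are our own; the mean deviation and standard deviation it computes for the salaries are not quoted in the text and are shown only to illustrate the two definitions.

```python
from math import sqrt

# Monthly salaries (Rs) of employees A to E from the table above.
salaries = [10000, 2000, 4000, 6500, 4500]
n = len(salaries)

mean = sum(salaries) / n                    # Rs 5,400
deviations = [x - mean for x in salaries]   # signed deviations from the mean

# Signed deviations cancel out, so their sum is of no use as a measure of spread.
sum_of_deviations = sum(deviations)

# (A) Mean deviation: average of the absolute deviations.
mean_deviation = sum(abs(d) for d in deviations) / n

# (B) Standard deviation: square root of the average squared deviation.
standard_deviation = sqrt(sum(d * d for d in deviations) / n)

print(mean, sum_of_deviations, mean_deviation, round(standard_deviation, 2))
```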



(A) Mean Deviation

1. Meaning

2. Calculation of Mean Deviation

3. Merits, Demerits and Uses of Mean Deviation

1. Meaning

Mean deviation is the average of the deviations of the items taken from the arithmetic mean or the median, ignoring plus (+) and minus (-) signs.

M.D. = Σ|D|/N (or Σf|D|/N for a frequency distribution)

where,

Σ|D| (read sigma D modulus) = sum of the deviations taken from mean or median ignoring ± signs
N = Number of observations
f = frequency
X̄ = Mean
Me = Median

Relative Measure of Mean Deviation

Coefficient of M.D. = M.D./(X̄ or Me)

2. Calculation of Mean Deviation

(a) Series of individual observations

(b) Discrete series

(c) Continuous series

(a) Series of Individual Observations

Illustration 10. Calculate mean deviation and its coefficient from median and from mean for the following yield of rice per acre for 10 districts of a state :

Districts             :  1   2   3   4   5   6   7   8   9  10
Rice Yields (in tons) : 22  29  12  23  18  15  12  34  18  12

Solution.

Calculation of Mean Deviation

(i) Mean Deviation from Median

Rice yield (X)    |D| = |X - 18|
12                6
12                6
12                6
15                3
18                0
18                0
22                4
23                5
29                11
34                16
N = 10            Σ|D| = 57

Steps :

1. Arrange the data in ascending order.

2. Calculate the median of the series: Me = size of ((N+1)/2)th item.

3. Take deviations of items from the median ignoring ± signs and denote the column as |D|.

4. Calculate the sum of these deviations, Σ|D|.

5. Divide the total obtained by the number of items.

Formula : M.D. = Σ|D|/N and Coefficient of M.D. = M.D./Median

Now, we get

Median = size of ((N+1)/2)th item = size of ((10+1)/2)th item = 5.5th item
       = Value of 5th item + 0.5 (Value of 6th item - Value of 5th item)
       = 18 + 0.5 (18 - 18) = 18 + 0.5 (0) = 18 tons

Absolute Measure : M.D. = Σ|D|/N = 57/10 = 5.7 tons

Relative Measure : Coefficient of M.D. = M.D./Median = 5.7/18 = 0.316

(ii) Mean Deviation from Mean

Rice yield (X)    |D| = |X - 19.5|
22                2.5
29                9.5
12                7.5
23                3.5
18                1.5
15                4.5
12                7.5
34                14.5
18                1.5
12                7.5
ΣX = 195          Σ|D| = 60

Steps :

1. Calculate the total of the items for finding the arithmetic mean.

2. Take deviations of items from the mean ignoring ± signs and denote the column by |D|.

3. Calculate the sum of these deviations, Σ|D|.

4. Divide the total obtained by the number of items.

Formula : M.D. = Σ|D|/N and Coefficient of M.D. = M.D./Mean

Now, we get

Mean = ΣX/N = 195/10 = 19.5 tons

Absolute Measure : M.D. = Σ|D|/N = 60/10 = 6 tons

Relative Measure : Coefficient of M.D. = M.D./Mean = 6/19.5 = 0.307


Note. It is better to calculate M.D. from the median than from the mean, because the sum of the deviations taken from the median, ignoring ± signs, is never greater than the sum of the deviations taken from the mean.
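The point of the note can be checked on the rice-yield data of Illustration 10 with a short Python sketch. The helper name abs_dev_sum is our own; statistics.mean and statistics.median come from the Python standard library.

```python
from statistics import mean, median

# Rice yields (tons) of the 10 districts in Illustration 10.
yields = [22, 29, 12, 23, 18, 15, 12, 34, 18, 12]


def abs_dev_sum(data, centre):
    """Sum of deviations from `centre`, ignoring signs."""
    return sum(abs(x - centre) for x in data)


about_median = abs_dev_sum(yields, median(yields))  # deviations from 18
about_mean = abs_dev_sum(yields, mean(yields))      # deviations from 19.5

md_median = about_median / len(yields)
md_mean = about_mean / len(yields)

print(about_median, about_mean, md_median, md_mean)
```

The sum about the median (57) is smaller than the sum about the mean (60), which is why the mean deviation from median (5.7 tons) is smaller than that from mean (6 tons).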

Illustration 11. The yield of wheat per acre for 10 districts of a state is as under :

District                  :  1   2   3   4   5   6   7   8   9  10
Yield of wheat (in tons)  : 12  10  15  19  21  16  18   9  25  10

Calculate :

(i) Range and coefficient of range. (ii) Quartile Deviation and its coefficient. (iii) Mean Deviation about Mean and its coefficient. (iv) Mean Deviation about Median and its coefficient.

Solution. In order to calculate the quartiles and the median, we arrange the yield of wheat in ascending order of magnitude :

9  10  10  12  15  16  18  19  21  25

(i) Calculation of Range

Absolute Measure :

Range = L - S

L = 25, S = 9

Range = 25 - 9 = 16 tons

Relative Measure :

Coefficient of Range = (L - S)/(L + S) = (25 - 9)/(25 + 9) = 16/34 = 0.47

(ii) Calculation of Quartile Deviation

Q1 = size of ((N+1)/4)th item = size of ((10+1)/4)th item = size of 2.75th item
   = Value of 2nd item + 0.75 (Value of 3rd item - Value of 2nd item)
   = 10 + 0.75 (10 - 10) = 10 + 0 = 10 tons

Q3 = size of (3(N+1)/4)th item = size of (3(10+1)/4)th item = size of 8.25th item
   = Value of 8th item + 0.25 (Value of 9th item - Value of 8th item)
   = 19 + 0.25 (21 - 19) = 19 + 0.25 (2) = 19 + 0.5 = 19.5 tons

Absolute Measure :

Quartile Deviation, Q.D. = (Q3 - Q1)/2 = (19.5 - 10)/2 = 4.75 tons

Relative Measure :

Coeff. of Q.D. = (Q3 - Q1)/(Q3 + Q1) = (19.5 - 10)/(19.5 + 10) = 9.5/29.5 = 0.322

(iii) Calculation of Mean Deviation from Mean

Arithmetic Mean, X̄ = ΣX/N = 155/10 = 15.5 tons

Absolute Measure :

M.D. = Σ|D|/N = 43/10 = 4.3 tons

Relative Measure :

Coeff. of M.D. = M.D./Mean = 4.3/15.5 = 0.277

(iv) Calculation of Mean Deviation from Median

Me = size of ((N+1)/2)th item = size of ((10+1)/2)th item = size of 5.5th item
   = Value of 5th item + 0.5 (Value of 6th item - Value of 5th item)
   = 15 + 0.5 (16 - 15) = 15 + 0.5 = 15.5 tons

Absolute Measure :

M.D. = Σ|D|/N = 43/10 = 4.3 tons

Relative Measure :

Coeff. of M.D. = M.D./Median = 4.3/15.5 = 0.277

Thus,

Range = 16 tons, Q.D. = 4.75 tons and M.D. = 4.3 tons (from median)

Range is based on only the two extreme values of the data, which is why we are getting such a high value (16 tons). Quartile deviation, as a measure of dispersion, does not take into account all the observations of the series; it leaves out the first 25% and the last 25% of the items, giving us the value 4.75 tons. Mean deviation is based on all the observations. The averaging of absolute deviations takes out the irregularities in the distribution and thus provides an accurate and true measure of dispersion.

Relative Calculations of Illustrations 10 and 11

Refer to Illustrations 10 and 11.

                   Rice Yield    Wheat Yield
M.D.               5.7 tons      4.3 tons
Coeff. of M.D.     0.316         0.277

From the above calculations we conclude that the rice yield has greater variation. The crop which has lesser variation is more reliable. Therefore, the yield of wheat is more reliable than the yield of rice.

(b) Discrete Series

Illustration 12. Calculate the mean deviation from median of the following distribution. Also calculate its coefficient.

Solution.

No. of accidents (X)   Frequency (f)   c.f.   |D| = |X - 2|   f|D|
0                      15              15     2               30
1                      16              31     1               16
2                      21              52     0               0
3                      10              62     1               10
4                      16              78     2               32
5                      8               86     3               24
6                      4               90     4               16
7                      2               92     5               10
8                      1               93     6               6
9                      2               95     7               14
10                     2               97     8               16
11                     0               97     9               0
12                     2               99     10              20
Total                  N = 99                                 Σf|D| = 194

Steps :

1. Calculate cumulative frequencies.

2. Locate the median item by finding out the ((N+1)/2)th item.

3. The value is located at the size of the item in whose cumulative frequency the median item falls.

4. Find out the median.

5. Take deviations of items from the median ignoring ± signs and denote the column as |D|.

6. Multiply frequencies with deviations and get f|D|.

7. After getting the total of the f|D| column, apply the following formula :

M.D. = Σf|D|/N

Median

Me = size of ((N+1)/2)th item = size of ((99+1)/2)th item = 50th item

Median = 2 Accidents

Mean Deviation :

M.D. = Σf|D|/N = 194/99 = 1.96, i.e., approximately 2 Accidents

Coefficient of Mean Deviation :

Coefficient of M.D. = M.D./Median = 1.96/2 = 0.98

(c) Continuous Series

Illustration 13. Calculate Mean Deviation from mean and its coefficient of the following data :

Marks           : 0-10  10-20  20-30  30-40  40-50
No. of Students :   5      8     15     16      6

Solution.

Calculation of Mean Deviation from Mean

Marks    No. of Students (f)   Mid-points (m)   d' = (m - 25)/10   fd'    |D| = |m - 27|   f|D|
0-10     5                     5                -2                 -10    22               110
10-20    8                     15               -1                 -8     12               96
20-30    15                    25               0                  0      2                30
30-40    16                    35               +1                 +16    8                128
40-50    6                     45               +2                 +12    18               108
         N = 50                                                    Σfd' = 10               Σf|D| = 472

Steps :

1. Calculate the Arithmetic Mean by step deviation method.

2. Take the deviations of mid-points from the mean ignoring ± signs and denote them by |D|.

3. Multiply these deviations by the respective frequencies and find out f|D|.

4. After getting the total of the f|D| column, apply the following formula :

M.D. = Σf|D|/N

Now, we get

Arithmetic Mean :

X̄ = A + (Σfd'/N) × C

where, A = 25, Σfd' = 10, N = 50, C = 10

X̄ = 25 + (10/50) × 10 = 25 + 2 = 27 Marks

Mean Deviation :

M.D. = Σf|D|/N

where, Σf|D| = 472, N = 50

M.D. = 472/50 = 9.44 Marks

Coefficient of Mean Deviation = M.D./X̄

Here, M.D. = 9.44 and X̄ = 27

Coefficient of M.D. = 9.44/27 = 0.349

Illustration 14. Calculate the range, quartile deviation and mean deviation from median, along with their coefficients, for the following data :

Marks           : 0-10  10-20  20-30  30-40  40-50
No. of Students :   5     10     20      5     10

Solution.

(i) Calculation of Range

Absolute Measure :

Range = L - S

where, L = 50, S = 0

Range = 50 - 0 = 50 Marks

Relative Measure :

Coefficient of Range = (L - S)/(L + S) = (50 - 0)/(50 + 0) = 1

(ii) Calculation of Quartile Deviation

First, we will calculate Q1 and Q3.

Q1 = size of (N/4)th item = size of (50/4)th item = 12.5th item

Q1 lies in the class 10-20. Applying the following formula :

Q1 = l1 + ((N/4 - c.f.)/f) × i

Here, l1 = 10, N/4 = 12.5, c.f. = 5, f = 10 and i = 10

Q1 = 10 + ((12.5 - 5)/10) × 10 = 10 + (7.5 × 10)/10 = 17.5 Marks

Q3 = size of 3(N/4)th item = size of 3(50/4)th item = 37.5th item

Q3 lies in the class 30-40.

Here, l1 = 30, 3(N/4) = 37.5, c.f. = 35, f = 5 and i = 10

Q3 = 30 + ((37.5 - 35)/5) × 10 = 30 + (2.5 × 10)/5 = 35 Marks

Absolute Measure :

Quartile Deviation, Q.D. = (Q3 - Q1)/2 = (35 - 17.5)/2 = 8.75 Marks

Relative Measure :

Coefficient of Q.D. = (Q3 - Q1)/(Q3 + Q1) = (35 - 17.5)/(35 + 17.5) = 17.5/52.5 = 0.333

(iii) Calculation of Mean Deviation from Median

Marks    Students (f)   c.f.   Mid-points (m)   |D| = |m - 25|   f|D|
0-10     5              5      5                20               100
10-20    10             15     15               10               100
20-30    20             35     25               0                0
30-40    5              40     35               10               50
40-50    10             50     45               20               200
         N = 50                                                  Σf|D| = 450

Steps :

1. Calculate the median of the given data.

2. Take the deviations of mid-points from the median ignoring ± signs and denote them by |D|.

3. Multiply the deviations by the respective frequencies and find out Σf|D|.

4. After obtaining the total of the f|D| column, apply the following formula :

M.D. = Σf|D|/N

Now we get,

Median :

Median = size of (N/2)th item = size of (50/2)th item = 25th item

Median lies in the class 20-30. Apply the following formula :

Me = l1 + ((N/2 - c.f.)/f) × i

Here, l1 = 20, N/2 = 25, c.f. = 15, f = 20 and i = 10

Me = 20 + ((25 - 15)/20) × 10 = 20 + (10 × 10)/20 = 25 Marks

Absolute Measure :

Mean Deviation, M.D. = Σf|D|/N

Here, Σf|D| = 450 and N = 50

M.D. = 450/50 = 9 Marks

Relative Measure :

Coefficient of M.D. = M.D./Median = 9/25 = 0.36

Illustration 15. Calculate the mean deviation from mean for the following marks obtained by 10 students.

Marks   : 2-4  4-6  6-8  8-10
Student :  3    4    2    1

Solution.

Marks    f        m    d = m - 5   fd        |D| = |m - 5.2|   f|D|            f|d|
2-4      3        3    -2          -6        2.2               6.6             6
4-6      4        5    0           0         0.2               0.8             0
6-8      2        7    +2          +4        1.8               3.6             4
8-10     1        9    +4          +4        3.8               3.8             4
         N = 10                    Σfd = 2                     Σf|D| = 14.8    Σf|d| = 14

Steps :

1. Calculate the Arithmetic Mean by assumed mean method.

2. Take deviations of mid-points from the mean ignoring ± signs and denote them by |D|.

3. Multiply these deviations by the respective frequencies and find out f|D|.

4. After getting the total of the f|D| column, apply the following formula :

M.D. = Σf|D|/N

Now, we get

Arithmetic Mean :

X̄ = A + Σfd/N

where, A = 5, Σfd = 2 and N = 10

X̄ = 5 + 2/10 = 5.2

Mean Deviation :

M.D. = Σf|D|/N

where, Σf|D| = 14.8, N = 10

M.D. = 14.8/10 = 1.48 Marks

Alternatively (Short-cut Method) : after obtaining Σf|d| from the assumed mean, apply the following formula :

M.D. = [Σf|d| + (X̄ - A)(ΣfB - ΣfA)]/N

where, Σf|d| = 14, A = 5, N = 10, X̄ = 5.2

ΣfB = Sum of all class frequencies before and including the mean class (the class in which the mean lies), i.e., 3 + 4 = 7

ΣfA = Sum of all class frequencies after the mean class, i.e., 2 + 1 = 3

Now, we get

M.D. = [14 + (5.2 - 5)(7 - 3)]/10 = [14 + (0.2)(4)]/10 = (14 + 0.8)/10 = 14.8/10

M.D. = 1.48 Marks

Note : Take care that assumed mean is close to the true mean.
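The agreement between the direct and the short-cut calculation can be checked with a small Python sketch on the data of Illustration 15. The variable names are our own. The sketch assumes that the assumed mean A is the mid-point of the class in which the true mean lies and that the true mean is not below A; otherwise the split into "before" and "after" frequencies used here would not hold.

```python
from math import isclose

# Mid-points and frequencies of the classes 2-4, 4-6, 6-8, 8-10 (Illustration 15).
mids = [3, 5, 7, 9]
freqs = [3, 4, 2, 1]
A = 5  # assumed mean: mid-point of the class 4-6, in which the true mean lies

n = sum(freqs)
mean = sum(f * m for f, m in zip(freqs, mids)) / n   # 5.2

# Direct method: deviations of mid-points from the true mean, signs ignored.
md_direct = sum(f * abs(m - mean) for f, m in zip(freqs, mids)) / n

# Short-cut method: deviations from the assumed mean plus a correction term.
sum_f_abs_d = sum(f * abs(m - A) for f, m in zip(freqs, mids))   # 14
f_before = sum(f for f, m in zip(freqs, mids) if m <= A)         # 3 + 4 = 7
f_after = n - f_before                                           # 2 + 1 = 3
md_shortcut = (sum_f_abs_d + (mean - A) * (f_before - f_after)) / n

print(round(md_direct, 2), round(md_shortcut, 2))
```

Both routes give 1.48 marks; the correction term (X̄ - A)(ΣfB - ΣfA) accounts for the gap between the assumed and the true mean.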

3. Merits, Demerits and Uses of Mean Deviation

Merits :

... and its value is precise and ...



3. Based on all items. It is based on all the items of the series, hence it is affected by every value of the distribution. Thus mean deviation is a better measure of dispersion than range and quartile deviation.

4. Less affected by extreme values. It is not affected very much by the value of extreme items.

5. Absolute measure. The averaging of absolute deviations from an average takes out the irregularities in the distribution and thus mean deviation provides an accurate and true measure of dispersion.

6. Calculated value. Mean deviation is not based on limits like range and quartile deviation. It is a calculated value based on the deviations about an average. It provides a better measure for comparison about the formation of different distributions.

Demerits :

1. Ignoring the signs. The strongest objection against mean deviation is that while calculating its value we take the absolute value of the deviations about an average and ignore the ± signs of the deviations. The step of ignoring the signs of the deviations is mathematically unsound and illogical. Therefore this method is non-algebraic; for this reason it is not used in further statistical calculations.

2. Not well defined. Mean deviation is not a well-defined measure since it is calculated from different averages (mean, median and mode). Mean deviation calculated from various averages will not be the same.

3. Harder calculations. Mean deviation involves harder calculation than the range and quartile deviation. Its calculation by an arbitrary origin makes the calculation tedious.

4. Cannot be calculated. Mean deviation cannot be computed for distributions with open-end classes.

Uses. Despite so many demerits, mean deviation is not a totally useless measure. In spite of its mathematical drawbacks, it has found favour with economists and business statisticians because of its simplicity, accuracy and also on account of the fact that standard deviation gives greater importance to deviations of extreme values. For forecasting business cycles, this measure has been found more useful than others. It is also good for small sample studies where elaborate statistical analysis is not required.

(B) Standard Deviation

1. Meaning

2. Calculation of Standard Deviation

3. Other Measures from Standard Deviation

4. Mathematical Properties of Standard Deviation

5. Relation between Measures of Dispersion

6. Merits and Demerits of Standard Deviation

1. Meaning

Standard deviation was introduced by Karl Pearson in the year 1893. It is the most commonly used measure of dispersion. It satisfies most of the properties laid down for an ideal measure of dispersion.

Mean deviation is mathematically illogical, as in its calculation signs are ignored and absolute deviations are taken. This drawback is removed in the calculation of standard deviation. One of the easiest ways of doing away with algebraic signs is to square the deviations.

Standard deviation is also known as root mean square deviation because it is the square root of the mean of squared deviations from the arithmetic mean.

In calculating standard deviation, first the arithmetic average is calculated and the deviations of the various items from the arithmetic average are squared. The squared deviations are totalled and the sum is divided by the number of items. The square root of the resulting figure is the standard deviation. Symbolically,

σ = √(Σx²/N)

where x = X - X̄

2. Calculation of Standard Deviation

(A) Series of Individual Observations

(B) Discrete Series

(C) Continuous Series

(A) Series of Individual Observations

Standard deviation may be calculated by any of the following methods :

(a) Actual mean method (b) Direct method (c) Assumed mean method

(a) Actual Mean Method

Illustration 16. Calculate Standard Deviation of the following data : 25, 50, 45, 30, 70, 42, 36, 48, 34, 60

Solution.

Calculation of Standard Deviation

Values (X)    x = X - X̄    x²
25            -19           361
50            +6            36
45            +1            1
30            -14           196
70            +26           676
42            -2            4
36            -8            64
48            +4            16
34            -10           100
60            +16           256
ΣX = 440                    Σx² = 1710

Steps :

1. Calculate the actual mean of the observations.

2. Obtain deviations of the values from the mean, i.e., calculate (X - X̄). Denote these deviations by x.

3. Square the deviations and obtain the total Σx².

4. Divide Σx² by the number of observations and find out the square root.

σ = √(Σx²/N)

Here, x = (X - X̄)

Now we get,

X̄ = ΣX/N = 440/10 = 44

Here, Σx² = 1710, N = 10

σ = √(1710/10) = √171 = 13.076

(b) Direct Method

Illustration 17. Calculate the standard deviation of the data given in Illustration 16 by direct method.

Solution.

Calculation of Standard Deviation

Values (X)    X²
25            625
50            2500
45            2025
30            900
70            4900
42            1764
36            1296
48            2304
34            1156
60            3600
ΣX = 440      ΣX² = 21070

Steps :

1. Calculate the actual mean of the observations.

2. Obtain the sum of squares of the values.

3. Apply the following formula :

σ = √(ΣX²/N - (X̄)²) = √(ΣX²/N - (ΣX/N)²)

Now we get,

X̄ = ΣX/N = 440/10 = 44

Here, ΣX² = 21070, N = 10 and X̄ = 44

σ = √(21070/10 - (44)²) = √(2107 - 1936) = √171 = 13.076

(c) Assumed Mean Method

Illustration 18. Calculate the standard deviation of the data given in Illustration 16 by assumed mean method.

Solution.

Calculation of Standard Deviation

Values (X)    d = X - 45    d²
25            -20           400
50            +5            25
45            0             0
30            -15           225
70            +25           625
42            -3            9
36            -9            81
48            +3            9
34            -11           121
60            +15           225
N = 10        Σd = -10      Σd² = 1720

Steps :

1. Calculate the deviations of the observations from an assumed mean (X - A). Denote these deviations by d and make the total of the deviations, Σd.

2. Square the deviations and obtain the total Σd².

3. Apply the following formula :

σ = √(Σd²/N - (Σd/N)²)

Here, d = X - A

When the mean is in fraction, this method is used to simplify the calculations.

Now we get,

Here, Σd² = 1720, N = 10, Σd = -10

σ = √(1720/10 - (-10/10)²) = √(172 - (-1)²) = √171 = 13.076
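That the actual mean, direct and assumed mean methods must agree can be confirmed with a brief Python sketch on the data of Illustration 16. The variable names are our own. Note that √171 is 13.0767 to four places, so the program prints 13.077 where the text, truncating, writes 13.076.

```python
from math import sqrt, isclose

# Data of Illustration 16.
data = [25, 50, 45, 30, 70, 42, 36, 48, 34, 60]
n = len(data)
mean = sum(data) / n  # 44

# (a) Actual mean method: root of the mean squared deviation from the mean.
sd_actual = sqrt(sum((x - mean) ** 2 for x in data) / n)

# (b) Direct method: mean of squares minus square of the mean.
sd_direct = sqrt(sum(x * x for x in data) / n - mean ** 2)

# (c) Assumed mean method with A = 45, as in Illustration 18.
A = 45
d = [x - A for x in data]
sd_assumed = sqrt(sum(v * v for v in d) / n - (sum(d) / n) ** 2)

print(round(sd_actual, 3), round(sd_direct, 3), round(sd_assumed, 3))
```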


Illustration 19. From the following information, find the standard deviation of x and y variables :

Σx = 235, Σy = 250

Σx² = 6750, Σy² = 6840

N = 10

Solution.

σx = √(Σx²/N - (Σx/N)²) = √(6750/10 - (235/10)²) = √(675 - (23.5)²) = √(675 - 552.25) = √122.75 = 11.079

σy = √(Σy²/N - (Σy/N)²) = √(6840/10 - (250/10)²) = √(684 - (25)²) = √(684 - 625) = √59 = 7.68

(B) Discrete Series

Standard deviation can be calculated by any of the following methods :

(a) Actual mean method (b) Direct method

(c) Assumed mean method (d) Step deviation method


(a) Actual Mean Method

Illustration 20. Calculate Standard deviation of the following data :

Size      :  4   5   6   7   8   9  10
Frequency :  6  12  15  28  20  14   5

Solution.

Size (X)    Frequency (f)    fX           x = X - X̄    x²        fx²
4           6                24           -3.06         9.3636    56.1816
5           12               60           -2.06         4.2436    50.9232
6           15               90           -1.06         1.1236    16.8540
7           28               196          -0.06         0.0036    0.1008
8           20               160          +0.94         0.8836    17.6720
9           14               126          +1.94         3.7636    52.6904
10          5                50           +2.94         8.6436    43.2180
            N = 100          ΣfX = 706                            Σfx² = 237.6400

Steps :

1. Calculate the actual mean of the series (X̄).

2. Take the deviations of the items from the mean (X - X̄) and denote these deviations by x.

3. Square the deviations, multiply them by the respective frequencies and make the total Σfx².

4. Divide the total by N and find out the square root :

σ = √(Σfx²/N)

Mean :

X̄ = ΣfX/N

Here, ΣfX = 706 and N = 100

X̄ = 706/100 = 7.06

Standard Deviation :

σ = √(Σfx²/N)

Here, Σfx² = 237.64 and N = 100

σ = √(237.64/100) = √2.3764 = 1.541

(b) Direct Method

Illustration 21. Calculate the standard deviation of the data given in Illustration 20 by direct method.

Solution.

X     f          fX           X²     fX²
4     6          24           16     96
5     12         60           25     300
6     15         90           36     540
7     28         196          49     1372
8     20         160          64     1280
9     14         126          81     1134
10    5          50           100    500
      N = 100    ΣfX = 706           ΣfX² = 5222

Steps :

1. Calculate the mean of the series, i.e., X̄.

2. Obtain the sum after multiplying f and X (frequency and size), i.e., ΣfX.

3. Calculate the square of the values (X²).

4. Multiply frequency (f) by X² and get the total, i.e., ΣfX².

5. Apply the following formula :

σ = √(ΣfX²/N - (X̄)²) = √(ΣfX²/N - (ΣfX/N)²)

Now we get,

X̄ = ΣfX/N = 706/100 = 7.06

Here, ΣfX² = 5222, N = 100

Substituting the values,

σ = √(5222/100 - (7.06)²) = √(52.22 - 49.8436) = √2.3764 = 1.541

Standard Deviation = 1.541

(c) Assumed Mean Method

Illustration 22. Calculate the standard deviation of the data given in Illustration 20 by assumed mean method.

Solution.

Size (X)    Frequency (f)    d = X - 7    fd         fd²
4           6                -3           -18        54
5           12               -2           -24        48
6           15               -1           -15        15
7           28               0            0          0
8           20               +1           +20        20
9           14               +2           +28        56
10          5                +3           +15        45
            N = 100                       Σfd = 6    Σfd² = 238

Steps :

1. Take the deviations of size from an assumed mean and denote these deviations by d.

2. Multiply these deviations by the respective frequencies and calculate the total Σfd.

3. Calculate the squares of the deviations, multiply them by the respective frequencies and obtain the total Σfd².

4. Apply the following formula :

σ = √(Σfd²/N - (Σfd/N)²)

where, d = (X - A)

Now, we get

Here, Σfd² = 238, Σfd = 6 and N = 100

Substituting the values,

σ = √(238/100 - (6/100)²) = √(2.38 - (0.06)²) = √(2.38 - 0.0036) = √2.3764

σ = 1.541

(d) Step Deviation Method

Illustration 23. Calculate Arithmetic Mean and Standard Deviation for the following data :

Value     : 140  145  150  155  160  165  170  175
Frequency :   1    4   15   30   36   24    8    2

Solution.

Calculation of Arithmetic Mean and Standard Deviation

Values (X)    Frequency (f)    d = X - 155    d' = (X - 155)/5    fd'          fd'²
140           1                -15            -3                  -3           9
145           4                -10            -2                  -8           16
150           15               -5             -1                  -15          15
155           30               0              0                   0            0
160           36               +5             +1                  +36          36
165           24               +10            +2                  +48          96
170           8                +15            +3                  +24          72
175           2                +20            +4                  +8           32
              N = 120                                             Σfd' = 90    Σfd'² = 276

Steps :

1. Take the deviations of values from an assumed mean and denote these deviations by d.

2. Divide these deviations by the common factor and obtain step deviations, i.e., d'.

3. Multiply step deviations by the respective frequencies and calculate the total Σfd'.

4. Calculate the squares of the step deviations (d'²), multiply these squared deviations by the respective frequencies (in other words fd' × d' = fd'²) and obtain the total Σfd'².

5. Apply the following formula :

σ = √(Σfd'²/N - (Σfd'/N)²) × C

where, d' = (X - A)/C and C = Common factor

Now, we get

Arithmetic Mean

X̄ = A + (Σfd'/N) × C

Here, A = Assumed mean = 155, Σfd' = 90, C = 5, N = 120

X̄ = 155 + (90/120) × 5 = 155 + 0.75 × 5 = 155 + 3.75 = 158.75

Standard Deviation

σ = √(Σfd'²/N - (Σfd'/N)²) × C

Here, Σfd'² = 276, Σfd' = 90, N = 120 and C = 5

σ = √(276/120 - (90/120)²) × 5 = √(2.3 - (0.75)²) × 5 = √(2.3 - 0.5625) × 5

  = √1.7375 × 5 = 1.318 × 5 = 6.59
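The step deviation method can be sketched in Python as below, using the data of Illustration 23. The variable names are our own. The frequency 8 for the value 170 is taken from the working (fd' = +24 with d' = +3, and N = 120). The last lines compare the result with the standard deviation computed directly from the mean, which should be the same.

```python
from math import sqrt, isclose

# Values and frequencies of Illustration 23.
values = [140, 145, 150, 155, 160, 165, 170, 175]
freqs = [1, 4, 15, 30, 36, 24, 8, 2]
A = 155  # assumed mean
C = 5    # common factor

n = sum(freqs)                                   # 120
steps = [(x - A) // C for x in values]           # step deviations d'
sum_fd = sum(f * s for f, s in zip(freqs, steps))        # Σfd'  = 90
sum_fd2 = sum(f * s * s for f, s in zip(freqs, steps))   # Σfd'² = 276

mean = A + sum_fd / n * C
sd = sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * C

# Cross-check: standard deviation from the actual mean, without any short cut.
sd_check = sqrt(sum(f * (x - mean) ** 2 for f, x in zip(freqs, values)) / n)

print(mean, round(sd, 2), round(sd_check, 2))
```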

(C) Continuous series

For calculating standard deviation in continuous series any of the following methods may be applied :

{a) Actual mean method (b) Direct method

(c) Assumed mean method (d) Step deviation method

(a) Actual Mean Method

In this method, the following formula is used to calculate standard deviation :

σ = √(Σfx²/N)

where, x = (m - X̄)

Illustration 24. Find the Mean and Standard deviation from the following distribution :

Marks           : 0-4  4-8  8-12  12-16
No. of Students :  4    8    2     1

Solution.

Calculation of Mean and Standard Deviation

Marks (X)    No. of Students (f)    Mid-points (m)    fm          x = (m - X̄)    x²    fx²
0-4          4                      2                 8           -4              16    64
4-8          8                      6                 48          0               0     0
8-12         2                      10                20          +4              16    32
12-16        1                      14                14          +8              64    64
             N = 15                                   Σfm = 90                          Σfx² = 160

Steps :

1. Calculate the actual mean of the series, X̄.

2. Take the deviations of mid-points from the mean (m - X̄). Denote these deviations by x.

3. Square the deviations, multiply them by the respective frequencies and obtain the total Σfx².

4. Divide the total by the number of items and find out the square root to calculate the standard deviation.

Apply the following formula :

σ = √(Σfx²/N)

Mean

X̄ = Σfm/N

Here, Σfm = 90, N = 15

X̄ = 90/15 = 6 marks

Standard Deviation

σ = √(Σfx²/N)

Here, Σfx² = 160, N = 15

σ = √(160/15) = √10.666 = 3.265

Standard Deviation = 3.265 marks.

Note. This method is rarely applied in practice because, in case the actual mean is in fraction, the calculations become complicated and take a lot of time.

(b) Direct Method

Illustration 25. Calculate the standard deviation of the data given in Illustration 24 by direct method.

Solution.

Calculation of Standard deviation

Marks (X)    No. of Students (f)    Mid-points (m)    fm          m²     fm²
0-4          4                      2                 8           4      16
4-8          8                      6                 48          36     288
8-12         2                      10                20          100    200
12-16        1                      14                14          196    196
             N = 15                                   Σfm = 90           Σfm² = 700

Steps :

1. Calculate the actual mean of the series, X̄.

2. Obtain the total after multiplying f and m, i.e., Σfm.

3. Calculate the square of mid-points, i.e., m².

4. Multiply frequency by m² and get the sum, i.e., Σfm².

5. Apply the following formula :

σ = √(Σfm²/N - (X̄)²) = √(Σfm²/N - (Σfm/N)²)

We get, N = 15, Σfm = 90, Σfm² = 700

Substituting the values,

σ = √(700/15 - (90/15)²) = √(46.666 - 36) = √10.666 = 3.265

Standard Deviation = 3.265 marks.


(c) Assumed Mean Method

Illustration 26. Calculate the Standard Deviation of the following frequency distribution :

Age in Years     : 50-55  45-50  40-45  35-40  30-35  25-30
No. of Labourers :   22     29     31     47     51     70

Solution.

Age in years (X)    No. of labourers (f)    Mid-points (m)    d = m - 42.5    fd             fd²
50-55               22                      52.5              +10             220            2200
45-50               29                      47.5              +5              145            725
40-45               31                      42.5              0               0              0
35-40               47                      37.5              -5              -235           1175
30-35               51                      32.5              -10             -510           5100
25-30               70                      27.5              -15             -1050          15750
                    N = 250                                                   Σfd = -1430    Σfd² = 24950

Steps :

1. Take the deviations of mid-points from an assumed mean and denote these deviations by d.

2. Multiply these deviations by the respective frequencies and calculate the total, Σfd.

3. Calculate the squares of the deviations, multiply them by the respective frequencies and obtain the total Σfd².

4. Apply the following formula :

σ = √(Σfd²/N - (Σfd/N)²)

where, d = (m - A)

Now we get,

Here, Σfd² = 24950, Σfd = -1430, N = 250

Substituting the values,

σ = √(24950/250 - (-1430/250)²) = √(99.8 - (-5.72)²) = √(99.8 - 32.718) = √67.082 = 8.19

Standard Deviation = 8.19 years

(d) Step Deviation Method

This method is mostly used in practice.

Illustration 27. Find out the Standard Deviation of the frequency distribution given in Illustration 26 by step deviation method.

Solution.

Age in years (X)    No. of labourers (f)    Mid-points (m)    d = m - 42.5    d' = (m - 42.5)/5    fd'            fd'²
50-55               22                      52.5              +10             +2                   +44            88
45-50               29                      47.5              +5              +1                   +29            29
40-45               31                      42.5              0               0                    0              0
35-40               47                      37.5              -5              -1                   -47            47
30-35               51                      32.5              -10             -2                   -102           204
25-30               70                      27.5              -15             -3                   -210           630
                    N = 250                                                                        Σfd' = -286    Σfd'² = 998

Steps :

1. Take the deviations of mid-points from an assumed mean and denote these deviations by d.

2. Divide these deviations by the common factor and obtain step deviations, i.e., d'.

3. Multiply step deviations by the respective frequencies and calculate the total Σfd'.

4. Calculate the squares of the step deviations (d'²), multiply these squared deviations by the respective frequencies (in other words fd' × d' = fd'²) and obtain the total Σfd'².

5. Apply the following formula :

σ = √(Σfd'²/N - (Σfd'/N)²) × C

where, d' = (m - A)/C and C = Common factor

Now we get,

where, Σfd'² = 998, Σfd' = -286, N = 250 and C = 5

Substituting the values,

σ = √(998/250 - (-286/250)²) × 5 = √(3.992 - (-1.144)²) × 5

  = √(3.992 - 1.309) × 5 = √2.683 × 5 = 1.638 × 5 = 8.19

Standard Deviation = 8.19 years.

Illustration 28. Find the Standard Deviation of the height of 100 students.

Height (in inches) : Less than 62.5  Less than 65.5  Less than 68.5  Less than 71.5  Less than 74.5
No. of students    :       5              23              65              92             100

Solution. Convert the cumulative frequencies into class intervals.


Height (in inches) X    Frequency (f)      Mid-points (m)    d' = (m - 67)/3    fd'            fd'²
59.5-62.5               5                  61                -2                 -10            20
62.5-65.5               23 - 5 = 18        64                -1                 -18            18
65.5-68.5               65 - 23 = 42       67                0                  0              0
68.5-71.5               92 - 65 = 27       70                +1                 +27            27
71.5-74.5               100 - 92 = 8       73                +2                 +16            32
                        N = 100                                                 Σfd' = +15     Σfd'² = 97

Applying the formula, we get

σ = √(Σfd'²/N - (Σfd'/N)²) × C

where, Σfd'² = 97, Σfd' = 15, N = 100, C = 3

Substituting the values,

σ = √(97/100 - (15/100)²) × 3 = √(0.97 - 0.0225) × 3

  = √0.9475 × 3 = 0.9733 × 3 = 2.92

Standard Deviation = 2.92 inches.

Illustration 29. Calculate Mean, Standard Deviation and mean deviation about mean.

Marks            Students
More than 20     50
More than 40     47
More than 80     41
More than 100    21
More than 120    9

Solution. (Convert the cumulative frequencies into class intervals.)

Calculation of Mean and Standard Deviation

Marks (X)    Frequency (f)      Mid-points (m)    d' = (m - 90)/10    fd'          fd'²
20-40        50 - 47 = 3        30                -6                  -18          108
40-80        47 - 41 = 6        60                -3                  -18          54
80-100       41 - 21 = 20       90                0                   0            0
100-120      21 - 9 = 12        110               +2                  +24          48
120-140      9 - 0 = 9          130               +4                  +36          144
             N = 50                                                   Σfd' = 24    Σfd'² = 354

Mean :

X̄ = A + (Σfd'/N) × C

where, A = 90, Σfd' = 24, N = 50, C = 10

X̄ = 90 + (24/50) × 10 = 90 + 4.8 = 94.8

Mean = 94.8 Marks

Standard Deviation :

σ = √(Σfd'²/N - (Σfd'/N)²) × C

where, Σfd'² = 354, Σfd' = 24, N = 50, C = 10

σ = √(354/50 - (24/50)²) × 10 = √(7.08 - 0.2304) × 10

  = √6.8496 × 10 = 2.617 × 10 = 26.17

Standard Deviation = 26.17 Marks

Calculation of Mean Deviation from Mean

Marks (X)    Frequency (f)    Mid-points (m)    |D| = |m - 94.8|    f|D|
20-40        3                30                64.8                194.4
40-80        6                60                34.8                208.8
80-100       20               90                4.8                 96
100-120      12               110               15.2                182.4
120-140      9                130               35.2                316.8
             N = 50                                                 Σf|D| = 998.4

Mean Deviation

M.D. = Σf|D|/N

where, Σf|D| = 998.4 and N = 50

M.D. = 998.4/50 = 19.968

Mean Deviation = 19.968

Let us try the same question by assumed mean method (Assumed Mean = 90).

Marks        f     m      |d| = |m - 90|    f|d|
20-40        3     30     60                180
40-80        6     60     30                180
80-100       20    90     0                 0
100-120      12    110    20                240
120-140      9     130    40                360
                                            Σf|d| = 960

M.D. = [Σf|d| + (X̄ - A)(ΣfB - ΣfA)]/N

where, Σf|d| = 960, X̄ = 94.8, A = 90, ΣfB = 3 + 6 + 20 = 29 and ΣfA = 12 + 9 = 21

M.D. = [960 + (94.8 - 90)(29 - 21)]/50 = [960 + (4.8)(8)]/50 = (960 + 38.4)/50 = 998.4/50 = 19.968

Mean Deviation = 19.968
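The conversion of "more than" cumulative frequencies into class frequencies, followed by the three calculations of Illustration 29, can be sketched in Python as below. The variable names are our own, and the upper limit of 140 for the last class is an assumption carried over from the solution table.

```python
from math import sqrt

# "More than" cumulative frequencies of Illustration 29.
lower_limits = [20, 40, 80, 100, 120]
more_than = [50, 47, 41, 21, 9]
upper_limits = lower_limits[1:] + [140]  # last upper limit assumed, as in the solution

# Class frequency = students above this limit minus students above the next limit.
freqs = [c - nxt for c, nxt in zip(more_than, more_than[1:] + [0])]
mids = [(lo + hi) / 2 for lo, hi in zip(lower_limits, upper_limits)]

n = sum(freqs)
mean = sum(f * m for f, m in zip(freqs, mids)) / n
sd = sqrt(sum(f * (m - mean) ** 2 for f, m in zip(freqs, mids)) / n)
md = sum(f * abs(m - mean) for f, m in zip(freqs, mids)) / n

print(freqs, round(mean, 1), round(sd, 2), round(md, 3))
```

Because the classes here are of unequal width, the sketch works from the mid-points directly rather than from step deviations; the results agree with the worked solution.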

3. Other Measures from Standard Deviation

Various measures are calculated from standard deviation. Some of the important measures are as under :

(a) Coefficient of Standard Deviation : A relative measure of standard deviation is calculated to compare the variability in two or more series; it is called the 'coefficient of standard deviation'. This relative measure is calculated by dividing the standard deviation by the arithmetic mean of the data.

Symbolically,

Coefficient of S.D. = σ/X̄

Here, σ = Standard deviation and X̄ = Arithmetic mean

(b) Coefficient of Variation : This relative measure was developed by Karl Pearson and is most popularly used to measure the relative variation of two or more series. It shows the relationship between the standard deviation and the arithmetic mean expressed in terms of percentage. This measure is used to compare uniformity, consistency and variability in two different series. The series having the greater coefficient of variation is said to be less uniform, less homogeneous, less consistent or less stable (in other words, it has a higher degree of variability). In the same way, the series having the lesser coefficient of variation is said to be more uniform, more homogeneous, more consistent or more stable (in other words, it has a lesser degree of variability).

Symbolically,

C.V. = (σ/X̄) × 100

Here, C.V. = Coefficient of Variation, σ = Standard deviation and X̄ = Arithmetic mean.

(c) Variance : Variance is the square of standard deviation. Standard deviation and variance are measures of variability and they are closely related. The only difference between the two measurements is that the variance is the average squared deviation from mean and standard deviation is the square root of variance. Symbolically,

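Variance = $\sigma^2$ and Standard Deviation = $\sqrt{\text{Variance}}$

Calculation of Variance

In Series of Individual Observations :

$$\text{Variance } (\sigma^2) = \frac{\Sigma (X - \bar{X})^2}{N} = \frac{\Sigma x^2}{N}$$

Here, $x = X - \bar{X}$

In Frequency Distribution :

$$\text{Variance } (\sigma^2) = \left[\frac{\Sigma f d'^2}{N} - \left(\frac{\Sigma f d'}{N}\right)^2\right] \times C^2$$

Here, $d' = \dfrac{X - A}{C}$, and

C = Common factor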
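The three measures defined above can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `dispersion_summary` is an assumption made for illustration, and the data are the five scores of batsmen X and Y used in Illustration 30. The variance is divided by N, as in the formulae of this chapter.

```python
from math import sqrt

def dispersion_summary(values):
    """Return mean, variance, S.D., coefficient of S.D. and C.V. (in %)."""
    n = len(values)
    mean = sum(values) / n
    # Variance = sum of squared deviations from the mean, divided by N
    variance = sum((x - mean) ** 2 for x in values) / n
    sd = sqrt(variance)
    return {
        "mean": mean,
        "variance": variance,
        "sd": sd,
        "coeff_sd": sd / mean,     # coefficient of standard deviation
        "cv": sd / mean * 100,     # coefficient of variation, in per cent
    }

scores_x = [25, 85, 40, 80, 120]   # batsman X
scores_y = [50, 70, 65, 45, 80]    # batsman Y
summary_x = dispersion_summary(scores_x)
summary_y = dispersion_summary(scores_y)
print(summary_x)
print(summary_y)
```

The series with the smaller `cv` value is the more consistent one, which is how the coefficient of variation is used for comparison.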

Individual Observations

Illustration 30. A batsman is to be selected for a cricket team. The choice is between X and Y on the basis of their five previous scores, which are :

X : 25 85 40 80 120

Y : 50 70 65 45 80

(a) Calculate coefficient of standard deviation, variance and coefficient of variation.

(b) Which batsman should be selected if we want

(i) a higher run scorer (ii) a more reliable batsman in the team.

Solution. (a)

Batsman X                                        Batsman Y
Scores X   $X - \bar{X}$   $(X - \bar{X})^2$      Scores Y   $Y - \bar{Y}$   $(Y - \bar{Y})^2$
25         -45             2025                   50         -12             144
85         +15             225                    70         +8              64
40         -30             900                    65         +3              9
80         +10             100                    45         -17             289
120        +50             2500                   80         +18             324
$\Sigma X = 350$           $\Sigma x^2 = 5750$    $\Sigma Y = 310$           $\Sigma y^2 = 830$

Here, $x = X - \bar{X}$ and $y = Y - \bar{Y}$

Batsman X

Arithmetic Mean

$\bar{X} = \dfrac{\Sigma X}{N} = \dfrac{350}{5} = 70$. Average score = 70 Runs

Standard Deviation

$\sigma_x = \sqrt{\dfrac{\Sigma x^2}{N}} = \sqrt{\dfrac{5750}{5}} = \sqrt{1150} = 33.91$ Runs

Coefficient of Standard Deviation

Coefficient of S.D. = $\dfrac{\sigma_x}{\bar{X}} = \dfrac{33.91}{70} = 0.484$

Variance

$\sigma_x^2 = \dfrac{\Sigma (X - \bar{X})^2}{N} = \dfrac{\Sigma x^2}{N} = \dfrac{5750}{5} = 1150$

Coefficient of Variation

C.V. = $\dfrac{\sigma_x}{\bar{X}} \times 100 = \dfrac{33.91}{70} \times 100 = 48.44\%$

Batsman Y

Arithmetic Mean

$\bar{Y} = \dfrac{\Sigma Y}{N} = \dfrac{310}{5} = 62$. Average score = 62 Runs

Standard Deviation

$\sigma_y = \sqrt{\dfrac{\Sigma y^2}{N}} = \sqrt{\dfrac{830}{5}} = \sqrt{166} = 12.88$ Runs

Coefficient of Standard Deviation

Coefficient of S.D. = $\dfrac{\sigma_y}{\bar{Y}} = \dfrac{12.88}{62} = 0.208$

Variance

$\sigma_y^2 = \dfrac{\Sigma (Y - \bar{Y})^2}{N} = \dfrac{\Sigma y^2}{N} = \dfrac{830}{5} = 166$

Coefficient of Variation

C.V. = $\dfrac{\sigma_y}{\bar{Y}} \times 100 = \dfrac{12.88}{62} \times 100 = 20.77\%$

(b) (i) Batsman X should be selected as a higher run scorer, because his average score (70 runs) is higher than that of batsman Y (62 runs).

(ii) Batsman Y is the more reliable batsman, because his C.V. (20.77%) is less than that of batsman X (48.44%).

Discrete Series

Illustration 31. Calculate variance and coefficient of variation from the following data :

Size :       4   5   6    7    8    9    10
Frequency :  3   7   22   60   85   32   8

Solution.

Calculation of Variance and Coefficient of Variation

Size (X)   Frequency (f)   d = X - 7   fd     $fd^2$
4          3               -3          -9     27
5          7               -2          -14    28
6          22              -1          -22    22
7          60              0           0      0
8          85              +1          +85    85
9          32              +2          +64    128
10         8               +3          +24    72
           N = 217                     $\Sigma fd = 128$   $\Sigma fd^2 = 362$

Variance

$$\text{Variance } (\sigma^2) = \frac{\Sigma fd^2}{N} - \left(\frac{\Sigma fd}{N}\right)^2$$

Here, $\Sigma fd^2 = 362$, $\Sigma fd = 128$ and N = 217

$$\sigma^2 = \frac{362}{217} - \left(\frac{128}{217}\right)^2 = 1.668 - (0.589)^2 = 1.668 - 0.347 = 1.32$$

Coefficient of Variation

C.V. = $\dfrac{\sigma}{\bar{X}} \times 100$

Let us calculate $\sigma$ and $\bar{X}$ :

$\sigma = \sqrt{\text{Variance}} = \sqrt{1.32} = 1.15$

$\bar{X} = A + \dfrac{\Sigma fd}{N} = 7 + \dfrac{128}{217} = 7 + 0.59 = 7.59$

Here, $\sigma = 1.15$ and $\bar{X} = 7.59$

C.V. = $\dfrac{1.15}{7.59} \times 100 = 0.1515 \times 100 = 15.15\%$
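The frequency-distribution formula can be sketched in code as below. This is an illustrative sketch only: the function name `grouped_stats` and its parameter names are assumptions, not notation from the text. With `common_factor=1` it reproduces the discrete-series working of Illustration 31; with class mid-points and the class width as common factor it follows the step-deviation working used for continuous series.

```python
from math import sqrt

def grouped_stats(values, freqs, assumed_mean, common_factor=1):
    """Mean, variance, S.D. and C.V. (%) by the step-deviation method.

    values: sizes (discrete series) or class mid-points (continuous series)
    """
    n = sum(freqs)
    # d' = (X - A) / C
    d = [(x - assumed_mean) / common_factor for x in values]
    sum_fd = sum(f * di for f, di in zip(freqs, d))
    sum_fd2 = sum(f * di * di for f, di in zip(freqs, d))
    variance = (sum_fd2 / n - (sum_fd / n) ** 2) * common_factor ** 2
    mean = assumed_mean + (sum_fd / n) * common_factor
    sd = sqrt(variance)
    return mean, variance, sd, sd / mean * 100

# Illustration 31: sizes 4..10, assumed mean A = 7
sizes = [4, 5, 6, 7, 8, 9, 10]
freqs = [3, 7, 22, 60, 85, 32, 8]
mean, variance, sd, cv = grouped_stats(sizes, freqs, assumed_mean=7)
print(round(mean, 2), round(variance, 2), round(sd, 2), round(cv, 2))
```

Small differences in the last decimal place against the hand calculation come from rounding the intermediate values.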

Continuous Series

Illustration 32. To check the quality of two brands of bulbs, their life in burning hours was estimated as under :

Life (in hrs.)   No. of bulbs
                 Brand A   Brand B
0-50             15        2
50-100           20        8
100-150          18        60
150-200          25        25
200-250          22        5
Total            100       100

(i) Which brand gives higher life? (ii) Which brand is more dependable?

Solution.

Brand A

Coefficient of Variation (Brand A)

C.V. = $\dfrac{\sigma}{\bar{X}} \times 100$

Let us calculate first $\sigma$ and $\bar{X}$.

Standard Deviation

$$\sigma = \sqrt{\frac{\Sigma fd'^2}{N} - \left(\frac{\Sigma fd'}{N}\right)^2} \times C$$

Here, $\Sigma fd'^2 = 193$, $\Sigma fd' = 19$, N = 100 and C = 50

$$\sigma = \sqrt{\frac{193}{100} - \left(\frac{19}{100}\right)^2} \times 50 = \sqrt{1.93 - (0.19)^2} \times 50 = \sqrt{1.93 - 0.0361} \times 50$$

$= \sqrt{1.8939} \times 50 = 1.376 \times 50 = 68.8$ hrs.

Arithmetic Mean

$$\bar{X} = A + \frac{\Sigma fd'}{N} \times C$$

Here, A = 125, $\Sigma fd' = 19$, N = 100 and C = 50

$\bar{X} = 125 + \dfrac{19}{100} \times 50 = 125 + (0.19) \times 50 = 125 + 9.5 = 134.5$ hrs.

Applying the formula, now we get

C.V. = $\dfrac{\sigma}{\bar{X}} \times 100$, where $\sigma = 68.8$ and $\bar{X} = 134.5$

C.V. = $\dfrac{68.8}{134.5} \times 100 = 51.15\%$

Brand B

Coefficient of Variation (Brand B)

C.V. = $\dfrac{\sigma}{\bar{X}} \times 100$

Let us calculate first $\sigma$ and $\bar{X}$.

Standard Deviation

$$\sigma = \sqrt{\frac{\Sigma fd'^2}{N} - \left(\frac{\Sigma fd'}{N}\right)^2} \times C$$

Here, $\Sigma fd'^2 = 61$, $\Sigma fd' = 23$, N = 100 and C = 50

$$\sigma = \sqrt{\frac{61}{100} - \left(\frac{23}{100}\right)^2} \times 50 = \sqrt{0.61 - (0.23)^2} \times 50 = \sqrt{0.61 - 0.0529} \times 50$$

$= \sqrt{0.5571} \times 50 = 0.746 \times 50 = 37.32$ hrs.

Arithmetic Mean

$$\bar{X} = A + \frac{\Sigma fd'}{N} \times C$$

Here, A = 125, $\Sigma fd' = 23$, N = 100 and C = 50

$\bar{X} = 125 + \dfrac{23}{100} \times 50 = 125 + (0.23) \times 50 = 125 + 11.5 = 136.5$ hrs.

Applying the formula, now we get

C.V. = $\dfrac{\sigma}{\bar{X}} \times 100$, where $\sigma = 37.32$ and $\bar{X} = 136.5$

C.V. = $\dfrac{37.32}{136.5} \times 100 = 27.34\%$

(i) Since the average life of bulbs of brand B (136.5 hrs) is greater than that of brand A (134.5 hrs), the bulbs of brand B give a higher life.

(ii) Since the C.V. of bulbs of brand B (27.34%) is less than that of brand A (51.15%), the bulbs of brand B are more dependable.

Illustration 33. The number of employees, wages per employee and the variance of wages per employee for two factories are given below :

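                                                Factory A   Factory B
No. of Employees                                50          100
Average wage per employee per day (Rs)          120         85
Variance of wages per employee per day (Rs)     9           16

(a) In which factory is there greater variation in the distribution of wages per employee?

(b) Suppose in factory B, the wages of an employee are wrongly noted as Rs 120 instead of Rs 100. What would be the corrected variance for factory B?

Solution.

(a) Calculation of Coefficient of Variation :

Factory A

C.V. = $\dfrac{\sigma}{\bar{X}} \times 100$

Here, $\bar{X} = 120$ and $\sigma = \sqrt{9} = 3$

C.V. = $\dfrac{3}{120} \times 100 = 2.5\%$

Factory B

C.V. = $\dfrac{\sigma}{\bar{X}} \times 100$

Here, $\bar{X} = 85$ and $\sigma = \sqrt{16} = 4$

C.V. = $\dfrac{4}{85} \times 100 = 4.7\%$

Since the C.V. of factory B is higher, there is greater variation in the distribution of wages in factory B.

(b) Correcting Mean and Variance :

For Factory B : $\bar{X} = \dfrac{\Sigma X}{N}$, so $\Sigma X = N\bar{X}$

N = 100 and $\bar{X} = 85$

$\Sigma X = 100 \times 85 = 8500$

It is not the correct $\Sigma X$.

Corrected $\Sigma X = 8500 - 120 + 100$ = Rs 8480

Corrected $\bar{X} = \dfrac{8480}{100} = 84.8$

Variance = $\sigma^2$

$$\sigma^2 = \frac{\Sigma X^2}{N} - (\bar{X})^2$$

Here, $\sigma^2 = 16$, $\bar{X} = 85$ and N = 100

$$16 = \frac{\Sigma X^2}{100} - (85)^2$$

$\Sigma X^2 = 1600 + 722500 = 724100$

It is not the correct $\Sigma X^2$.

Corrected $\Sigma X^2 = 724100 - (120)^2 + (100)^2 = 724100 - 14400 + 10000 = 719700$

Corrected Variance = $\dfrac{\text{Corrected } \Sigma X^2}{N} - (\text{Corrected } \bar{X})^2 = \dfrac{719700}{100} - (84.8)^2$

= 7197 - 7191.04 = 5.96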
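The correction procedure used in part (b) can be written as a short routine. This is a hedged sketch: the function name `corrected_mean_variance` and its arguments are assumptions for illustration. It recovers the totals from the reported mean and variance, swaps the wrong value for the right one, and recomputes.

```python
def corrected_mean_variance(n, mean, variance, wrong, right):
    """Correct a mean and variance after one observation was misrecorded."""
    sum_x = n * mean                      # reported sum of X
    sum_x2 = n * (variance + mean ** 2)   # reported sum of X squared
    sum_x = sum_x - wrong + right
    sum_x2 = sum_x2 - wrong ** 2 + right ** 2
    new_mean = sum_x / n
    new_variance = sum_x2 / n - new_mean ** 2
    return new_mean, new_variance

# Factory B: N = 100, mean = 85, variance = 16; Rs 120 was noted instead of Rs 100
new_mean, new_variance = corrected_mean_variance(100, 85, 16, wrong=120, right=100)
print(new_mean, round(new_variance, 2))
```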


Illustration 34. The sum of 10 values is 100 and the sum of their squares is 1090. Find the coefficient of variation.

Solution. We are given,

N = 10, $\Sigma X = 100$, and $\Sigma X^2 = 1090$

Coefficient of variation (C.V.) = $\dfrac{\sigma}{\bar{X}} \times 100$

Apply the following formula to get Mean ($\bar{X}$) :

$\bar{X} = \dfrac{\Sigma X}{N} = \dfrac{100}{10} = 10$

Apply the following formula to obtain standard deviation ($\sigma$) by direct method :

$$\sigma^2 = \frac{\Sigma X^2}{N} - (\bar{X})^2 = \frac{1090}{10} - (10)^2 = 109 - 100 = 9$$

$\sigma = \sqrt{9} = 3$

Therefore,

C.V. = $\dfrac{\sigma}{\bar{X}} \times 100 = \dfrac{3}{10} \times 100 = 30$

Thus, $\bar{X} = 10$ and C.V. = 30

Illustration 35. The means and standard deviations of two brands of bulbs are given below :

                     Brand I      Brand II
Mean                 800 hours    770 hours
Standard deviation   100 hours    60 hours

Calculate a measure of relative dispersion for the two brands and interpret the result.

Solution.

Brand I

C.V. = $\dfrac{\sigma}{\bar{X}} \times 100$

Given : $\bar{X} = 800$, $\sigma = 100$

C.V. = $\dfrac{100}{800} \times 100 = 12.5\%$

Brand II

C.V. = $\dfrac{\sigma}{\bar{X}} \times 100$

Given : $\bar{X} = 770$, $\sigma = 60$

C.V. = $\dfrac{60}{770} \times 100 = 7.79\%$

Hence, the bulbs of brand II are more consistent as compared to brand I.

4. Mathematical Properties of Standard Deviation

Standard deviation has the following important mathematical properties :

(a) The sum of the squares of deviations of items taken from their arithmetic mean is the minimum, i.e., it is less than the sum of the squares of deviations calculated from any other value. This is why the arithmetic mean is used to calculate standard deviation.

Symbolically,

$$\Sigma x^2 < \Sigma d^2$$

Here, $x = X - \bar{X}$, i.e., deviations taken from Mean, and $d = X - A$, i.e., deviations taken from any other value.

But the sum of deviations calculated from Median (ignoring ± signs) is always less than the sum of deviations calculated from mean (ignoring ± signs), which is why the median is used to calculate mean deviation.

Symbolically,

$$\Sigma |X - Me| < \Sigma |X - \bar{X}|$$
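Both minimum properties can be checked on a small data set. The sketch below is illustrative only; the helper names `sum_sq` and `sum_abs` are assumptions, and the data are the scores of batsman X from Illustration 30.

```python
from statistics import mean, median

data = [25, 85, 40, 80, 120]   # scores of batsman X (Illustration 30)

def sum_sq(values, a):
    """Sum of squared deviations of the values from a."""
    return sum((x - a) ** 2 for x in values)

def sum_abs(values, a):
    """Sum of absolute deviations of the values from a (signs ignored)."""
    return sum(abs(x - a) for x in values)

m, me = mean(data), median(data)
# Squared deviations are smallest about the mean
print(sum_sq(data, m), sum_sq(data, me))
# Absolute deviations are smallest about the median
print(sum_abs(data, me), sum_abs(data, m))
```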

(b) Standard Deviation and Normal Curve : In a normal or symmetrical distribution, apart from the mean, median and mode being identical, a large proportion of the items is concentrated around the mean. The following relationships (i.e., range of spread of items) can be determined on the basis of mean and standard deviation :

Mean ± 1 $\sigma$ covers 68.27% of the total items.

Mean ± 2 $\sigma$ covers 95.45% of the total items.

Mean ± 3 $\sigma$ covers 99.73% of the total items.

This can be observed in the following frequency curve :

PERCENTAGE OF ITEMS INCLUDED UNDER NORMAL CURVE

Illustration 36. Calculate the percentage of cases lying within $\bar{X} \pm 1\sigma$, $\bar{X} \pm 2\sigma$, $\bar{X} \pm 3\sigma$ from the following data :

Size :       1   2    3    4    5    6    7    8   9   10

Frequency :  8   12   10   28   16   12   10   2   0   2

Solution.

Calculation of Mean and Standard Deviation :

Mean : $\bar{X} = A + \dfrac{\Sigma fd}{N}$

Here, A = 5, $\Sigma fd = -68$ and N = 100

$\bar{X} = 5 + \dfrac{-68}{100} = 4.32$

Standard Deviation

$$\sigma = \sqrt{\frac{\Sigma fd^2}{N} - \left(\frac{\Sigma fd}{N}\right)^2}$$

Here, $\Sigma fd^2 = 424$, $\Sigma fd = -68$ and N = 100

$$\sigma = \sqrt{\frac{424}{100} - \left(\frac{-68}{100}\right)^2} = \sqrt{4.24 - (0.68)^2} = \sqrt{4.24 - 0.4624} = \sqrt{3.7776} = 1.943$$

Calculation of percentage of cases :

(a) $\bar{X} \pm 1\sigma = 4.32 \pm 1.94$ = 6.26 and 2.38

Cases lying between 3 and 6 are (10 + 28 + 16 + 12) = 66 out of 100, i.e., 66%

(b) $\bar{X} \pm 2\sigma = 4.32 \pm 2 \times 1.94 = 4.32 \pm 3.88$ = 8.20 and 0.44

Cases lying between 1 and 8 are (100 - 2) = 98 out of 100, i.e., 98%

(c) $\bar{X} \pm 3\sigma = 4.32 \pm 3 \times 1.94 = 4.32 \pm 5.82$ = -1.5 and 10.14

There is no negative value. All the cases lie between 0 and 10, i.e., 100%.
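The counting done in Illustration 36 can be reproduced with a few lines of code. This is a sketch under the assumption that a case counts as "within" the range when its size lies between the two limits inclusive; the function name `percent_within` is hypothetical.

```python
from math import sqrt

def percent_within(values, freqs, k):
    """Percentage of cases lying within mean ± k standard deviations."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    variance = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n
    sd = sqrt(variance)
    low, high = mean - k * sd, mean + k * sd
    inside = sum(f for x, f in zip(values, freqs) if low <= x <= high)
    return inside * 100 / n

sizes = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
freqs = [8, 12, 10, 28, 16, 12, 10, 2, 0, 2]
for k in (1, 2, 3):
    print(k, percent_within(sizes, freqs, k))
```

The observed shares (66%, 98%, 100%) are close to, but not the same as, the 68.27%, 95.45% and 99.73% of a normal curve, because the data are only approximately normal.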


(c) Combined Standard Deviation : Just as a combined arithmetic mean can be calculated if the means and number of items in different groups are given, a combined standard deviation can be calculated if the standard deviations, means and number of items in different groups are given. Combined standard deviation is obtained as follows :

(a) Two related groups :

$$\sigma_{1,2} = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_1 d_1^2 + N_2 d_2^2}{N_1 + N_2}}$$

Here,

$\sigma_{1,2}$ = Combined standard deviation of two groups

$\sigma_1$ = Standard deviation of first group

$\sigma_2$ = Standard deviation of second group

$d_1 = (\bar{X}_1 - \bar{X}_{1,2})$ and $d_2 = (\bar{X}_2 - \bar{X}_{1,2})$

$\bar{X}_{1,2}$ = Combined arithmetic Mean of two groups

(b) Three related groups :

$$\sigma_{1,2,3} = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_3\sigma_3^2 + N_1 d_1^2 + N_2 d_2^2 + N_3 d_3^2}{N_1 + N_2 + N_3}}$$

Here, $d_1 = (\bar{X}_1 - \bar{X}_{1,2,3})$, $d_2 = (\bar{X}_2 - \bar{X}_{1,2,3})$ and $d_3 = (\bar{X}_3 - \bar{X}_{1,2,3})$

The above formula can be extended to calculate the combined standard deviation of even more groups.

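Illustration 37. In sample A, N = 150, $\bar{X}$ = 120 and S.D. = 20; in sample B, N = 75, $\bar{X}$ = 126 and S.D. = 22. Calculate Combined Mean and Combined Standard Deviation.

Solution. Combined Mean :

$$\bar{X}_{1,2} = \frac{N_1\bar{X}_1 + N_2\bar{X}_2}{N_1 + N_2} = \frac{150 \times 120 + 75 \times 126}{150 + 75} = \frac{18000 + 9450}{225} = \frac{27450}{225} = 122$$

Combined Standard Deviation :

$$\sigma_{1,2} = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_1 d_1^2 + N_2 d_2^2}{N_1 + N_2}}$$

$d_1 = (\bar{X}_1 - \bar{X}_{1,2}) = 120 - 122 = -2$

$d_2 = (\bar{X}_2 - \bar{X}_{1,2}) = 126 - 122 = 4$

$$\sigma_{1,2} = \sqrt{\frac{150(20)^2 + 75(22)^2 + 150(-2)^2 + 75(4)^2}{150 + 75}} = \sqrt{\frac{60000 + 36300 + 600 + 1200}{225}} = \sqrt{\frac{98100}{225}} = \sqrt{436} = 20.88$$

Thus, Combined Mean = 122 and Combined Standard Deviation = 20.88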

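Illustration 38. Find the combined mean and combined standard deviation obtained by combining the following distributions :

Distributions   N     Mean   S.D.
A               20    60     8
B               120   50     20
C               60    40     12

Solution. Combined Arithmetic Mean :

$$\bar{X}_{1,2,3} = \frac{N_1\bar{X}_1 + N_2\bar{X}_2 + N_3\bar{X}_3}{N_1 + N_2 + N_3} = \frac{20 \times 60 + 120 \times 50 + 60 \times 40}{20 + 120 + 60} = \frac{1200 + 6000 + 2400}{200} = \frac{9600}{200} = 48$$

Combined Standard Deviation :

$$\sigma_{1,2,3} = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_3\sigma_3^2 + N_1 d_1^2 + N_2 d_2^2 + N_3 d_3^2}{N_1 + N_2 + N_3}}$$

Here,

$d_1 = (\bar{X}_1 - \bar{X}_{1,2,3}) = 60 - 48 = 12$

$d_2 = (\bar{X}_2 - \bar{X}_{1,2,3}) = 50 - 48 = 2$

and $d_3 = (\bar{X}_3 - \bar{X}_{1,2,3}) = 40 - 48 = -8$

$$\sigma_{1,2,3} = \sqrt{\frac{20(8)^2 + 120(20)^2 + 60(12)^2 + 20(12)^2 + 120(2)^2 + 60(-8)^2}{20 + 120 + 60}} = \sqrt{\frac{65120}{200}} = \sqrt{325.6} = 18.04$$

Thus, Combined Mean = 48

Combined Standard Deviation = 18.04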
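The combined formula extends to any number of groups, which a short function makes explicit. The sketch below is illustrative; the name `combined_mean_sd` and its list arguments are assumptions, and the figures checked are those of Illustrations 37 and 38.

```python
from math import sqrt

def combined_mean_sd(sizes, means, sds):
    """Combined mean and combined S.D. of any number of groups."""
    total = sum(sizes)
    combined_mean = sum(n * m for n, m in zip(sizes, means)) / total
    # Each group contributes N * (sigma^2 + d^2), where d = group mean - combined mean
    combined_var = sum(
        n * (s ** 2 + (m - combined_mean) ** 2)
        for n, m, s in zip(sizes, means, sds)
    ) / total
    return combined_mean, sqrt(combined_var)

mean_37, sd_37 = combined_mean_sd([150, 75], [120, 126], [20, 22])
mean_38, sd_38 = combined_mean_sd([20, 120, 60], [60, 50, 40], [8, 20, 12])
print(mean_37, round(sd_37, 2))
print(mean_38, round(sd_38, 2))
```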

(d) Change of origin and change of scale : If any constant is added to or subtracted from every item (change of origin), the standard deviation of the changed data remains the same as that of the original data, but the mean of the new data changes.

If every item is multiplied or divided by a constant (change of scale), then the mean, standard deviation and variance of the new data all change.
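The effect of the two kinds of change can be seen on a small series. The wages below are hypothetical values chosen only for illustration; `pstdev` from Python's standard `statistics` module divides by N, as the formulae in this chapter do.

```python
from statistics import mean, pstdev

wages = [160, 180, 200, 220, 240]      # hypothetical daily wages (Rs)

shifted = [w + 20 for w in wages]      # change of origin: add Rs 20 to each
scaled = [w * 1.1 for w in wages]      # change of scale: raise each by 10%

print(mean(wages), pstdev(wages))      # original mean and S.D.
print(mean(shifted), pstdev(shifted))  # mean rises by 20, S.D. unchanged
print(mean(scaled), pstdev(scaled))    # mean and S.D. both multiplied by 1.1
```

Because a change of scale multiplies the mean and the standard deviation by the same factor, the coefficient of variation is unaffected by it; a change of origin alters the mean alone and therefore changes the coefficient of variation.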

Illustration 39. Average daily wage of 50 workers of a factory was Rs 200 with standard deviation of Rs 40. Each worker is given a rise of Rs 20.

(i) What is the new average daily wage and standard deviation?

(ii) Have the wages become more or less uniform?

(iii) If each worker is given a hike of 10% in wages, how are the Mean and Standard Deviation values affected?

Solution. We are given

N = 50, $\bar{X}$ = 200, $\sigma$ = 40

(i) Change of Origin

Old Series

Since $\bar{X} = \dfrac{\Sigma X}{N}$, we have $\Sigma X = N\bar{X} = 50 \times 200 = 10{,}000$

New Series

Rise of Rs 20 to each worker : 20 × 50 workers = Rs 1,000

New $\Sigma X$ = 10,000 + 1,000 = Rs 11,000

New Mean : $\bar{X} = \dfrac{\Sigma X}{N} = \dfrac{11{,}000}{50}$ = Rs 220

Standard Deviation : each wage and the mean both rise by Rs 20, so every deviation $(X - \bar{X})$ is unchanged. The new standard deviation therefore remains the same as that of the old series, i.e., Rs 40.

(ii) We are required to calculate the coefficient of variation to decide the uniformity.

Old Series

C.V. = $\dfrac{\sigma}{\bar{X}} \times 100 = \dfrac{40}{200} \times 100 = 20\%$

New Series

C.V. = $\dfrac{\sigma}{\bar{X}} \times 100 = \dfrac{40}{220} \times 100 = 18.18\%$

Since the coefficient of variation of the new series is lower, the wages have become more uniform.

(iii) If each worker is given a 10% hike (change of scale), every wage is multiplied by 1.10.

Mean affected : Old mean + Increase in wages = Rs 200 + Rs 20 = Rs 220

New Mean = Rs 220

Standard Deviation affected : New $\sigma$ = 40 × 1.10 = Rs 44

The coefficient of variation, $\dfrac{44}{220} \times 100 = 20\%$, remains the same as that of the old series.

5. Relation between Measures of Dispersion

In a normal distribution, there is a fixed relationship between Quartile Deviation, Mean Deviation and Standard Deviation :

(a) Q.D. = $\frac{2}{3}$ S.D. (more precisely 0.6745 S.D.)

(b) M.D. = $\frac{4}{5}$ S.D. (more precisely 0.7979 S.D.)

(c) Q.D. = $\frac{5}{6}$ M.D. (more precisely 0.8453 M.D.)

(d) 6 S.D. = 9 Q.D. = 7.5 M.D.

Further, in such distributions :

(i) Arithmetic Mean ± Q.D. covers 50% of the items.

(ii) Arithmetic Mean ± M.D. covers 57.51% of the items.

The above relationships can also be applied to moderately asymmetrical distributions.
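The constants quoted in (a) to (c) follow from the normal distribution and can be reproduced with Python's standard library. This is a check written for illustration, not part of the original text: for a standard normal curve the quartile deviation is the 75th-percentile point, and the mean deviation is $\sqrt{2/\pi}$ times the standard deviation.

```python
from math import sqrt, pi
from statistics import NormalDist

z = NormalDist()                  # standard normal: mean 0, S.D. 1

qd_over_sd = z.inv_cdf(0.75)      # upper quartile = Q.D. when S.D. = 1
md_over_sd = sqrt(2 / pi)         # mean deviation when S.D. = 1
qd_over_md = qd_over_sd / md_over_sd

print(round(qd_over_sd, 4))       # ratio in (a)
print(round(md_over_sd, 4))       # ratio in (b)
print(round(qd_over_md, 4))       # ratio in (c)
```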


Merits and Demerits and Uses of Standard Deviation

Standard deviation is the most satisfactory and widely used measure of dispersion because of the following merits :

Merits

1. Based on every item. Unlike the range and location based measures of dispersion, the standard deviation makes use of all the observations in the set of series. That is, it includes every item of the distribution.

2. Correct mathematical process. The standard deviation is the easiest measure of dispersion to handle algebraically and it is the result of a correct mathematical process. The deviations are calculated from the arithmetic mean, which is an ideal average. The deviations are squared, so that they automatically become positive. Being based on a correct mathematical process, it is amenable to further statistical analysis.

3. Rigidly defined. Standard deviation is a well-defined and definite measure of dispersion. It is rigidly defined and its value is always definite and based on all the observations and the actual signs of deviations are used.

4. Sampling fluctuations. Standard deviation is less affected by the fluctuations of sampling than most other measures of dispersion.

5. Mathematical Properties. It is amenable to algebraic treatment and possesses many mathematical properties. It is the only measure for calculating combined standard deviation of two or more groups. It is on account of the properties that standard deviation is used in many advanced studies.

Demerits

1. Complex in calculation. Standard deviation is not easy to calculate, nor is it easily understood. In many cases it is more cumbersome to calculate than either quartile deviation or mean deviation.

2. More weight to extreme items. It gives more weight to extreme items and less weight to those which are near the mean, because the squares of large deviations are proportionately greater than the squares of comparatively small deviations. Thus, deviations of 2 and 8 are in the ratio of 1 : 4, but their squares, i.e., 4 and 64, are in the ratio of 1 : 16. Since standard deviation gives greater weight to extreme items, it does not find much favour with economists and businessmen, who are more interested in the results of the modal class.

Uses

Despite the drawbacks, the standard deviation is the best measure of dispersion and should be used whenever possible. It is widely used in statistics because it possesses most of the characteristics of an ideal measure of dispersion. It is a significant measure for making comparison between variability of two sets of observations, to test the significance of various statistical measures of random samples, correlation and regression analysis, etc. We may regard standard deviation as the best and the most powerful measure of dispersion.

Absolute and Relative Measures of Dispersion or Variation

Types of Measures of Dispersion/Variation

Absolute Measure of Variation : Range; Inter-Quartile Range and Quartile Deviation; Mean Deviation; Standard Deviation

Relative Measure of Variation : Coefficient of Range; Coefficient of Quartile Deviation; Coefficient of Mean Deviation; Coefficient of Standard Deviation

(a) Absolute measure : Absolute measures of dispersion are expressed in the same units as the data. If the data are in kg, the measure of dispersion is also in kg. Absolute measures of dispersion cannot be used to compare the scatter of one distribution with that of another when the two are expressed in different units.

(b) Relative measure : A relative measure of dispersion is the ratio, the percentage or the coefficient of the absolute measure. It is called the coefficient of dispersion or coefficient of variation. The relative measures are coefficient of range, coefficient of quartile deviation, coefficient of mean deviation and coefficient of standard deviation. The relative measure based on standard deviation is called the coefficient of variation.

Thus, C.V. = $\dfrac{\sigma}{\bar{X}} \times 100$

Relative measures are used to compare the variability of two or more series when the units are not the same, because they are obtained as ratios or percentages.

Graphic Method (Lorenz Curve)

Lorenz Curve

The graphic method of studying dispersion is known as the Lorenz Curve Method. It is named after Dr Max O. Lorenz, who used it for the first time to measure the distribution of wealth and income. Now it is also used for the study of the distribution of profits, wages, turnover, etc. In this method, the values and the frequencies are cumulated and their percentages are calculated. These values are plotted on the graph, and the curve that is obtained is called the Lorenz Curve. The greatest defect of this curve is that it does not give a quantitative measure of dispersion. Let us look at the following illustration.

Illustration 40. Draw a Lorenz Curve from the following data :

Income (in thousand Rs)   No. of Persons in thousands
                          Group A   Group B
20                        10        16
40                        20        14
60                        40        10
100                       50        6
180                       80        4

Solution.

Income            Cumulative   Cumulative    Group A : No. of      Cumulative   Cumulative    Group B : No. of      Cumulative   Cumulative
(in thousand Rs)  Income       Percentage    Persons (thousands)   Number       Percentage    Persons (thousands)   Number       Percentage
20                20           5             10                    10           5             16                    16           32
40                60           15            20                    30           15            14                    30           60
60                120          30            40                    70           35            10                    40           80
100               220          55            50                    120          60            6                     46           92
180               400          100           80                    200          100           4                     50           100

Steps

1. The sizes of items (or, if classes are given, the mid-points) are made cumulative. Considering the last cumulative total as equal to 100, the different cumulative totals are converted into percentages.

2. In the same way, frequencies are made cumulative. Considering the last cumulative frequency as equal to 100, all the different cumulative frequencies are converted into percentages.

3. The cumulative percentages are plotted on a graph on which both axes are scaled from 0 to 100.

4. The diagonal joining 0 with 100 is drawn; it is known as the line of equal distribution.

[Figure: Lorenz Curve; axis label : Percentage of Income]

Illustration 41. In the following table, companies in two areas A and B are classified according to the amount of profits earned. Draw a Lorenz Curve for the two areas.

Profits earned in Rs '000 :   6    25   60   84   105   150   170   400
Area A (No. of companies) :   6    11   13   14   15    17    10    14
Area B (No. of companies) :   2    38   52   28   38    26    12    4

Solution. To obtain the Lorenz Curve, we calculate percentage values as under :

Profits (in thousand)                      Area A                                   Area B
Profit   Cumulative   Cumulative           No. of      Cumulative   Cumulative      No. of      Cumulative   Cumulative
         Profit       Profit Percentage    Companies   Number       Percentage      Companies   Number       Percentage
6        6            0.6                  6           6            6               2           2            1
25       31           3.1                  11          17           17              38          40           20
60       91           9.1                  13          30           30              52          92           46
84       175          17.5                 14          44           44              28          120          60
105      280          28.0                 15          59           59              38          158          79
150      430          43.0                 17          76           76              26          184          92
170      600          60.0                 10          86           86              12          196          98
400      1000         100.0                14          100          100             4           200          100

Since curve B is farthest from the line of equal distribution, it represents greater inequality in area B as compared to area A.
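The cumulative percentages needed for a Lorenz Curve are mechanical to compute. The sketch below is illustrative; the function name `cumulative_percentages` is an assumption, and the data are the profits and company counts of Illustration 41. Plotting each list of company percentages against the profit percentages would give the two curves.

```python
def cumulative_percentages(values):
    """Cumulate the values and express each running total as a % of the grand total."""
    total = sum(values)
    running = 0
    result = []
    for v in values:
        running += v
        result.append(round(running * 100 / total, 2))
    return result

profits = [6, 25, 60, 84, 105, 150, 170, 400]   # Rs '000
area_a = [6, 11, 13, 14, 15, 17, 10, 14]        # number of companies, Area A
area_b = [2, 38, 52, 28, 38, 26, 12, 4]         # number of companies, Area B

profit_pct = cumulative_percentages(profits)
a_pct = cumulative_percentages(area_a)
b_pct = cumulative_percentages(area_b)
print(profit_pct)
print(a_pct)
print(b_pct)
```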

Illustration 42. 9400 Indian households are classified according to their after-tax income as follows :

After-tax income :    0-1000   1000-5000   5000-10000   10000-20000   20000-40000

No. of households :   1348     4210        1892         1460          490

Draw the Lorenz Curve.

Solution.

Calculation of Values

After-tax        Cumulative     Cumulative Percentage   No. of       Cumulative Number   Cumulative %
Income           Income         of Income               households   of households       of households
0 - 1000         Below 1000     2.5                     1348         1348                14.34
1000 - 5000      Below 5000     12.5                    4210         5558                59.13
5000 - 10000     Below 10000    25.0                    1892         7450                79.25
10000 - 20000    Below 20000    50.0                    1460         8910                94.79
20000 - 40000    Below 40000    100.0                   490          9400                100.00

Comparison of Measures of Dispersion

We have studied the various measures of dispersion and the Lorenz Curve. We are now in a position to make a comparative study of these measures. It would help us in the selection of an appropriate measure of dispersion, which depends on (a) nature of data, (b) the purpose, and (c) object of an investigation.

1. Definite value point of view : All the four methods of dispersion are rigidly defined and their values are definite.

2. Calculation point of view : Range is the easiest and simplest measure because it is the difference between two extreme items. Quartile deviation is superior to the range, as it is not affected too much by the value of extreme items; instead, it is calculated from the lower and upper quartiles. This is also an easy method. However, mean deviation and standard deviation require more calculations, which are based on the deviations from an average. Lorenz Curve is a visual aid method but it does not give a quantitative measure.

3. Based on every item point of view : Range and quartile deviation are not based on all the items of a series, while mean deviation and standard deviation make use of all the items. They are based on every item of the distribution. Range is highly affected by the extreme items.

4. Interpretation and application point of view : All the four measures of dispersion are easy to interpret. Range and quartile deviation are useful for general study of variability. Range is useful for quality control, weather forecasting, etc. Quartile deviation is useful when the influence of extreme items is to be minimised, as in the study of social problems. Mean deviation is used by economists and business statisticians. It is useful in forecasting business cycles and in small sample studies. Standard deviation possesses most of the good characteristics of a measure of dispersion. Therefore, in sampling and other areas of statistical analysis, it is the most favoured and indispensable measure.

5. Algebraic treatment point of view : Standard deviation is the best measure of dispersion because of correct mathematical processes as compared to range, quartile deviation and mean deviation. It is widely used in statistics, i.e., in making comparison between variability of two or more sets of data, in testing the significance of random samples, in correlation and regression analysis, etc.

Thus, standard deviation satisfies most of the essentials of a good measure of dispersion.

These essentials are as under :

(i) It should be simple to calculate and easy to understand.

(ii) It should be rigidly defined, i.e., it has precise value.

(iii) It should be based on all the observations of a series.

(iv) It should not be greatly influenced by extreme items.

(v) It should not be affected by fluctuations in sampling.

(vi) It should be usable for statistical calculations for further or higher order analysis.

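List of Formulae

Range

Absolute Measure

Range = L - S

Relative Measure

Coefficient of Range = $\dfrac{L - S}{L + S}$

L = largest item, S = smallest item

Quartile Deviation

Absolute Measure

Inter-quartile range = $Q_3 - Q_1$

Semi-interquartile range or Quartile Deviation : Q.D. = $\dfrac{Q_3 - Q_1}{2}$

Relative Measure

Coefficient of Quartile Deviation = $\dfrac{Q_3 - Q_1}{Q_3 + Q_1}$

$Q_1$ = lower quartile, $Q_3$ = upper quartile

Mean Deviation

Absolute Measure

Individual Observations : M.D. = $\dfrac{\Sigma |D|}{N}$

$|D|$ = Deviations from median or mean ignoring ± signs

Discrete and Continuous Series : M.D. = $\dfrac{\Sigma f|D|}{N}$

Short-cut (Assumed mean) Method : M.D. = $\dfrac{\Sigma f|d| + (\bar{X} - A)(\Sigma f_B - \Sigma f_A)}{N}$

Relative Measure

Coefficient of Mean Deviation = $\dfrac{\text{M.D.}}{\text{Mean or Median}}$ = $\dfrac{\text{M.D.}}{\bar{X} \text{ or } Me}$

Standard Deviation

Absolute Measure

Individual Observations

Actual Mean Method : $\sigma = \sqrt{\dfrac{\Sigma x^2}{N}}$, where $x = X - \bar{X}$

Direct Method : $\sigma = \sqrt{\dfrac{\Sigma X^2}{N} - (\bar{X})^2}$

Assumed Mean Method : $\sigma = \sqrt{\dfrac{\Sigma d^2}{N} - \left(\dfrac{\Sigma d}{N}\right)^2}$, where $d = X - A$

Step Deviation Method : $\sigma = \sqrt{\dfrac{\Sigma d'^2}{N} - \left(\dfrac{\Sigma d'}{N}\right)^2} \times C$, where $d' = \dfrac{X - A}{C}$ and C = Common factor

Discrete and Continuous Series

Actual Mean Method : $\sigma = \sqrt{\dfrac{\Sigma fx^2}{N}}$

Direct Method : $\sigma = \sqrt{\dfrac{\Sigma fX^2}{N} - (\bar{X})^2} = \sqrt{\dfrac{\Sigma fX^2}{N} - \left(\dfrac{\Sigma fX}{N}\right)^2}$

Assumed Mean Method : $\sigma = \sqrt{\dfrac{\Sigma fd^2}{N} - \left(\dfrac{\Sigma fd}{N}\right)^2}$

Step Deviation Method : $\sigma = \sqrt{\dfrac{\Sigma fd'^2}{N} - \left(\dfrac{\Sigma fd'}{N}\right)^2} \times C$

Relative Measures

Coefficient of Standard Deviation : Coeff. S.D. = $\dfrac{\sigma}{\bar{X}}$

Coefficient of Variation : C.V. = $\dfrac{\sigma}{\bar{X}} \times 100$

Variance

$\sigma^2 = \dfrac{\Sigma (X - \bar{X})^2}{N} = \dfrac{\Sigma x^2}{N}$

$\sigma^2 = \left[\dfrac{\Sigma fd'^2}{N} - \left(\dfrac{\Sigma fd'}{N}\right)^2\right] \times C^2$, where $d' = \dfrac{X - A}{C}$ and C = Common factor

Combined Standard Deviation

(a) Two related groups :

$$\sigma_{1,2} = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_1 d_1^2 + N_2 d_2^2}{N_1 + N_2}}$$

Here, $d_1 = \bar{X}_1 - \bar{X}_{1,2}$ and $d_2 = \bar{X}_2 - \bar{X}_{1,2}$

(b) Three related groups :

$$\sigma_{1,2,3} = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_3\sigma_3^2 + N_1 d_1^2 + N_2 d_2^2 + N_3 d_3^2}{N_1 + N_2 + N_3}}$$

Here, $d_1 = \bar{X}_1 - \bar{X}_{1,2,3}$, $d_2 = \bar{X}_2 - \bar{X}_{1,2,3}$ and $d_3 = \bar{X}_3 - \bar{X}_{1,2,3}$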

EXERCISES

Questions :

1. Illustrate the meaning of dispersion with examples.

2. What are the essentials of a good measure of dispersion?

3. (a) Name four commonly used measures of absolute dispersion.

(b) Name the most commonly used measure of relative dispersion. Give formula for calculating it.

4. Why should we measure dispersion? Do the range and quartile deviation measure dispersion about the same value?

5. A measure of dispersion is a good supplement to the central value in understanding a frequency distribution. Comment.

6. Which measure of dispersion is the best and how?

7. Some measures of dispersion depend upon the spread of values whereas some calculate the variation of values from a central value. Do you agree?

8. Define the first and third quartiles. Explain how the quartiles are used to calculate dispersion values.

9. (a) What do you understand by mean deviation?

10. 'Coefficient of variation is a relative measure of dispersion'. Explain.

11. ... variability? What is the need of calculating a measure ...

12. In what way is standard deviation a better measure of dispersion than mean deviation?

13. What is standard deviation? Explain the uses of standard deviation.

14. Why is standard deviation considered to be the most popular measure of dispersion? Explain.

15. What is coefficient of variation? What purpose does it serve? Also distinguish between 'variance' and 'coefficient of variation'.

16. Define coefficient of variation. In what situation would you prefer this as a measure of dispersion?

17. Make a comparative study of various measures.

18. "The standard deviation of heights measured in inches will be larger than the standard deviation of heights measured in feet for the same group of individuals." Comment on the validity of the above statement. Otherwise give appropriate explanation of the statement given above.

19. What is meant by absolute and relative measure of dispersion?

20. Briefly explain the concept of 'Lorenz Curve'.

21. Define Lorenz Curve.

22. Write short notes on :

(a) Coefficient of Dispersion (b) Coefficient of Variation

(c) Variance (d) Standard Deviation

(e) Quartile Deviation (f) Lorenz Curve

Problems :

Range

1. The daily wages of ten workers are given below. Find out range and its coefficient.

Workers :        A     B    C    D    E     F    G     H     I    J
Wages (in Rs) :  175   50   50   55   100   90   125   145   70   60

[Range = 125; Coefficient of Range = 0.55]

2. Following are the marks obtained by students in Sec. A and Sec. B. Compare the range of marks of students in two sections.

Marks (Section A) : 20 25 28 45 15 30

Marks (Section B) : 45 52 36 42 28 25

[Section A : Range = 30, Coefficient of Range = 0.5

Section B : Range = 27, Coefficient of Range = 0.351]

3. Find range and coefficient of range of the following :

(a) Per day earning of seven agricultural labourers in Rs :

60, 72, 36, 85, 35, 52, 72

(b) ... from normal (2002).

Jan.    Feb.    Mar.    Apr.    May     June
+1.5    +2.4    +3.1    -1.5    -0.4    +3.3

July    Aug.    Sep.    Oct.    Nov.    Dec.
-0.1    -0.6    -1.5    -0.6    -1.9    -6.1

[(a) Range = 50, Coeff. of Range = 0.42

4.

No. of Workers :

50 2

70 8

80 12

90

7

100 4

120 3

130 8

150 6

5.

If o .

X f

10 4

15 12

6.

Fir.^ M , i^vduge = ^u; L.oel

Find the range and coefficient of range of the following • Age m years : cm . & •

Frequency :

20 30 40 50

7 3 5 2

[Range = 40; Coefficient of Range = 0.67]

5-10 10

10-15 15

7.

15-20 20-25

20 5 [Range = 20, Coefficient of Range = 0.671

Frequency :

1-5 2

6-10 8

11-15 15

16-20 21-25 26-30

20 10 [Range = 30; Coefficient of Range = 0.97]

8. The following table gives the heights of 100 persons. Calculate dispersion by Range Method.

Height (in centimetres)   No. of persons
Below 162                 1
Below 163                 8
Below 164                 19
Below 165                 32
Below 166                 45
Below 167                 58
Below 168                 85
Below 169                 93
Below 170                 100

[Range = 9 cm, Coefficient of Range = 0.03]

Quartile Deviation

9. Calculate Quartile Deviation and its Coefficient of Rajesh's daily income.

Months :       1     2     3     4     5     6     7     8     9     10    11    12

Income (Rs) :  239   250   251   251   257   258   260   261   262   262   273   275

[Q.D. = Rs 5.5, Coeff. of Q.D. = 0.021]

10. Find the Quartile Deviation and its Coefficient from the following data relating to the daily wages of seven workers :

Daily Wages (in Rs) : 50 90 70 40 80 65 60

[Q.D. = Rs 15, Coefficient of Q.D. = 0.23]

11. Find out Quartile Deviation and Coefficient of Quartile Deviation of the following items :

145 130 200 210 198

234 159 160 178 257

260 300 345 360 390

[Q.D. = 70; Coefficient of Q.D. = 0.304]

12. Find out Quartile Deviation, Interquartile Range and Coefficient of Quartile Deviation of the following series :

Height (in inches) :  58   59   60   61   62   63   64   65   66

No. of Persons :      2    3    6    15   10   5    4    3    1

[Q.D. = 1, Interquartile Range = 2, Coefficient of Q.D. = 0.016]

13. Find out Coefficient of Quartile Deviation from the following data :

X : 10 15 20 25 30 35 40 45

f : 6 17 29 38 25 14 9 1

[Coefficient of Q.D. = 0.2]

14. Calculate Quartiles and Coefficient of Quartile Deviation of the following data :

Marks :     5-9   10-14   15-19   20-24   25-29   30-34   35-39

Students :  1     3       8       5       4       2       2

[$Q_1$ = 15.906, $Q_2$ = 20, $Q_3$ = 26.687 and Coefficient of Q.D. = 0.25]

15. Calculate lower and upper Quartiles, Quartile Deviation and Coefficient of Quartile Deviation of the following series :

Values :     5-6   6-7   7-8   8-9   9-10   10-11

Frequency :  5     8     12    15    6      2

[$Q_1$ = 6.875, $Q_3$ = 8.733, Coefficient of Q.D. = 0.119]

16. Calculate the Semi-interquartile Range and its Coefficient of the following data :

Marks :            0-10   10-20   20-30   30-40   40-50   50-60   60-70

No. of Students :  4      8       11      15      12      6       3

[Q.D. = 11.55; Coefficient of Q.D. = 0.337]

17. Find out the $Q_3$, $Q_1$, Quartile Deviation and Coefficient of Quartile Deviation in the following :

Age :             20-   30-   40-   50-   60-   70-   80-

No. of Members :  3     61    132   154   140   51    2

[$Q_3$ = 64.08, $Q_1$ = 45.43, Q.D. = 9.33, Coefficient of Q.D. = 0.17]

' i

1

306


68

M.D. = 12.77] 50 50

18. Calculate the InterquartUe Range for the data given below •

Zency : T T T ^^ ^

6 5 4

[I.Q.R. = 12.9]

Mean Deviation

Find Coefflcent of Mean dev,a„on f„n, median • ^

P''-^ ms) : 25 28 32 32 36 48 44 45

... Ca,cn,a. Mean Oe™..on from mean and med.an of rS^laTa" ^

^^ 52 49 45 72 57 47

Ca.n,a.e Mean Oe™„o„ f.om « ml"; Llf ; - ^ "

= ^ 4 10 9 15 12 7 9 7

.. o„ .e a™.e devia.on from mel^'X f^ Xtn:

n ■ 12 18 24 30 36 42

Frequency : 4 7 9 18 15 jo 5

24. Calculate Mean Deviation from median : = =

No. of tomatoes per plant : 0 1 9 cj 4 ^

No. of plants . -y . ^ , , ^ ^ ^ 7 8 9 10

• 2 ^ 7 11 18 24 12 8 6 4 3

Size of Item : 4 ^ „

Frequency =2 4 I ^^ ^^ 16

_ 3 2 14

[M^D from X = 3.32 and Coefficient of M.D. = 0 342

26 Th. • u, "" ^^ " and Coeff,ciem of M D - 0 4051

dev,a,L. ""<1 ""dian and coeffiden, mean

s- ^^Mrr r ? r

[M.D. from X = 28.56, Coeffidem of M D = 0 228 M.D. from Me = 28, Coefficient of M.D. = 0.233]

Measures of Dispersion ^^^

127 Calculate Mean Deviation from mean of the following data :

Class : 3-4 4-5 5-6 6-7 7-8 8-9 9-10 Frequency : 37 22 60 85 32 8

[M.D. = 0.915]

28. Calculate Mean and Mean Deviation and coefficient of M.D. for the following distribution :

Weekly tvages : 20-40 40-60 60-80 80-100

Workers : 20 40 30 10

[Mean = 56, M.D. = 15.2 and Coefficient of M.D. = 0.27]

29. Age distribution of hundred life insurance policy holders is as follows :

mlrTstVtrthday : 17-19 20-25 26-35 36-40 41-50 51-55 56-60 61-70 Number : 9 16 12 26 14 12^ _ ^

Calculate Mean deviation from the median age. [M.D. = 10.73]

30. Find Mean Deviation from median of the marks secured by 100 students in a class-test as given below:

Marks : 60-63 63-66 66-69 69-72 72-75

No. of students : 5 18 42 27 8

[M.D. = 2.26]

31. Using Mean Deviation from median of the income group of 5 and 7 members given below, compare which of the group has more variability?

Group A : 4000 4200 4400 4600 4800
Group B : 3000 4000 4200 4400 4600 4800 5800

[Group A : M.D. = 240, Coeff. M.D. = 0.054; Group B : M.D. = 571.4, Coeff. M.D. = 0.13. Group B has greater variation]

Standard Deviation

32. Calculate the Standard Deviation of wage earner's daily earnings :

Week : 1 2 3 4 5 6 7 8 9 10
Earnings (in Rs) : 54 62 63 65 68 71 73 78 82 84

[X = 70, S.D. = 9.01]

33. Calculate Standard Deviation of the following two series. Which series has more variability:

A : 58 59 60 65 66 52 75 31 46 48
B : 56 87 89 46 93 65 44 54 78 68

[A : X = 56, S.D. = 11.7, C.V. = 20.89%; B : X = 68, S.D. = 17.1, C.V. = 25.14%. Series B has more variability]


Income (Rs) ; iqo

120 ISO ^ 140 120 150

[X = 133.75, S.D. = 22.88]

35.

36.

(B) Following are given the values of the X variable :

55, 49, 67, 89, 44, 59, 57

(a) Calculate (i) the arithmetic mean, (ii) the standard deviation, (iii) the mean deviation.
Also calculate (i) Σ(X - 55)², (ii) Σ|X - Median|.
(c) Examine if Σ|X - X̄| > Σ|X - Median|.

No. of families

166 552 580 433 268 148

7 8

37.

9 10 11 12 77 41 20 8 6 1

Calculate Mean and Standard Deviation fron. . n ^^ " = 1-76]

Marks (Above) ■ Q jq the followmg data :

No. : 150 HO 100 80 80 70 30 ^

Calculate Mean and vana.ce f.r the ~

2

X

2 5

4 16

5 21

6 18

7 13

8 10

9 4

38.

10

3

[X = 5.5; σ² = 3.99]

25-30 16

30-35 8

39.


35^0 3

"^^^-o Vanance ..

Frequency 2 , ^^-25

■ 7 13 21

Calculate the Coefficient of V^w r . ^^ " = 7.95, 02 ^

200 workers in a flc.™ lowing distribution of the wagt 2

^OT : 40-49 en

r T

[X = 74.45, S.D. = 15.921, C.V. = 21.38%]


40. The following are the scores made by two batsmen A and B in a series of innings:

A : 12 115 6 73 7 19 119 36 84 29
B : 47 12 76 42 4 51 37 48 13 0

Who is better as a run-getter? Who is more consistent?

[A : X = 50, S.D. = 41.83, C.V. = 83.66%; B : X = 33, S.D. = 23.37, C.V. = 70.82%]

(A is better as run-getter and B is more consistent.)
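The comparison asked for in Question 40 can be checked with a short Python sketch. It assumes the population standard deviation (divisor N) used throughout these exercises; the function names are illustrative only.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def std_dev(values):
    """Population standard deviation (divisor N)."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / len(values))

def coeff_variation(values):
    """Coefficient of variation in per cent: S.D. / mean x 100."""
    return std_dev(values) / mean(values) * 100

batsman_a = [12, 115, 6, 73, 7, 19, 119, 36, 84, 29]
batsman_b = [47, 12, 76, 42, 4, 51, 37, 48, 13, 0]

for name, scores in (("A", batsman_a), ("B", batsman_b)):
    # Higher mean = better run-getter; lower C.V. = more consistent.
    print(name, round(mean(scores), 2), round(std_dev(scores), 2),
          round(coeff_variation(scores), 2))
```

The batsman with the higher mean is the better run-getter, while the one with the lower coefficient of variation is the more consistent.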

41. The index number of prices of cotton and coal shares in 1998 were as under:

Month : Jan. Feb. Mar. Apr. May June July Aug. Sept. Oct. Nov. Dec.

Index Number of Prices :
Cotton : 188 178 173 164 172 183 184 185 211 217 232 240
Coal : 131 130 130 129 129 129 127 127 130 137 140 142

Which of these two shares do you consider more variable in prices?

[Cotton : X = 193.9, S.D. = 23.80, C.V. = 12.27%; Coal : X = 131.75, S.D. = 4.815, C.V. = 3.65%. Cotton shares are more variable in prices]

42. Calculate the arithmetic mean, standard deviation and variance from the following distribution :

Class : 0-5 5-10 10-15 15-20 20-25 25-30 30-35 35-40

Frequency : 2 5 7 13 21 16 8 3

[X = 21.9, S.D. = 7.99, σ² = 63.97]

43. Calculate arithmetic mean, standard deviation and variance from the following series :

Marks : 70-80 60-70 50-60 40-50 30-40 20-30

No. of Students : 7 11 22 0 15 5

[X = 51.67, S.D. = 15.13, σ² = 228.92]

44. The following table gives the age distribution of students in a school in 2001 and 2002. Calculate Coefficient of Variation for both the groups.

Age : 17- 18- 19- 20- 21- 22- 23- 24- 25-

2001 : 1 3 8 12 14 14 5 3 2

2002 : 6 22 34 40 32 20 16 9 3

[2001 : X = 21.5, S.D. = 1.703, C.V = 7.92%

2002 : X = 20.91, S.D. = 1.846, C.V. = 8.82%]

45. You are given the following data about height of boys and girls :

                                       Boys   Girls
Number                                  72     38
Average height (in inches)              68     61
Variance of distribution (in inches)     9      4


46.


(a) Calculate Coefficient of Variation.

(b) Decide whose height is more variable.

[{a) Boys : C.V. = 4.41%, Girls • C V - 1 77°/1 r i

Ufe No of Years : 0-2 2-4 4-6 6-8 8-10

No of Refrigerators Model A Model B

10-12

^ 16 13 7 s 4

of Wh,chide, has

[Model A : X = 5.12, C.V. = 54.9%; Model B : X = 6.16, C.V. = 36.2%]

Neck circumference ■. 12 0 12'^ i^n

(in inches) ^^.O 14.5 15.0 15.5 16.0

No of Students : 5 20 30 43 60 56 37 16 .

crkenon Mean .3 standard deviation. ' ^^^ ^^e

[X = 14.01", σ = 0.87". The largest size of collar = 16.62", smallest size of collar = 11.4"]

48.

I ^^^ --Its : ^

Life ( 000 miles) : 20-25 25-30 30-35 35^0

orand X o c

» 15 12 IS

Brand Y 6 in

20 32 30

Which brand has a greater variation.?

[C.V. (X) = 21.82%; C.V. (Y) = 16 101°/ y U

.o. the data he.ow state lih se'r.s'.rcon'.^l'^

40-45 13 12

45-50 9 0

Variable Series A Series B

10-20 20-30 30-^0 40-50 50-60 60-70 1832 40 22 18

22 40_ 32 18 10

[Series A : X = 42.14, S.D. = 14.06, C.V. = 33.36%; Series B : X = 37.86, S.D. = 14.06, C.V. = 37.14%]

50. The following table gives the distribution of wages in the two branches of a factory :

: : — 300-350

" 93 157 105 82 500


Find the mean and standard deviation for the two branches for the wages separately.

(a) Which branch pays higher average wages?

(b) Which branch has greater variability in wages in relation to the average wages?

(c) What is the average monthly wage for the factory as a whole?

(d) What is the variance of wages of all the workers in the two branches, A and B, taken together?

[Branch A : Mean = Rs 225, S.D. = Rs 66.20, C.V. = 29.42%; Branch B : Mean = Rs 230, S.D. = Rs 62.15, C.V. = 27.02%

(a) Branch B pays higher average monthly wages.

(b) Branch A has greater variability.

(c) Combined Mean = Rs 226.67

(d) Combined Variance = Rs 4215]

51. What percentage of frequencies are there in the range of X ± 3 S.D.

X : 60.5 70.5 80.5 90.5 100.5 110.5 120.5 130.5 140.5
f : 3 21 78 182 305 209 81 21 5

[X = 100.95, S.D. = 13; Range = 99.12%]

52. Goals scored by two teams A and B in football matches were as follows :

No. of Goals in a match : 0 1 2 3 4
No. of matches : A : 17 9 8 5 4
                 B : 17 9 6 5 3

Find the team which is more consistent in its performance.

[Coeff. of Variation Team A = 123.6%; Coeff. of Variation Team B = 109.0%. Thus, Team B is more consistent]

53. Find mean and the standard deviation of the following two groups taken together:

Group    Number    Mean    S.D.
A         113      159     22.4
B         121      149     20.0

[Combined : X1,2 = 153.83, σ1,2 = 21.8]

54. The number examined, the mean weight and standard deviation in each group of examination by two medical examiners are given below. Calculate mean and standard deviation of both the groups taken together.

Medical Examiner    Number Examined    Mean Weight    Standard Deviation
A                        50               113              6.5
B                        60               120              8.2

[Combined : X1,2 = 116.82, σ1,2 = 8.25]
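The combined mean and standard deviation asked for in Questions 53 and 54 can be sketched in Python as below. The function name and the `(n, mean, sd)` tuple layout are illustrative assumptions; the formula is the usual one in which each group contributes its own variance plus the squared distance of its mean from the combined mean.

```python
from math import sqrt

def combined_mean_sd(groups):
    """groups: list of (n, mean, sd) tuples -> (combined mean, combined sd)."""
    total_n = sum(n for n, _, _ in groups)
    c_mean = sum(n * m for n, m, _ in groups) / total_n
    # Combined variance = weighted average of (sd^2 + d^2),
    # where d is the gap between a group mean and the combined mean.
    c_var = sum(n * (sd ** 2 + (m - c_mean) ** 2) for n, m, sd in groups) / total_n
    return c_mean, sqrt(c_var)

q53 = combined_mean_sd([(113, 159, 22.4), (121, 149, 20.0)])
q54 = combined_mean_sd([(50, 113, 6.5), (60, 120, 8.2)])
print([round(v, 2) for v in q53])
print([round(v, 2) for v in q54])
```

The same function accepts three or more groups, since it only sums over whatever tuples are supplied.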


55. The following data gives arithmetic averages and standard deviations of three sub-groups :

Sub-group                      A       B       C
No. of Men                     50      100     120
Average (in Rs)                61.0    70.0    80.5
Standard Deviation (in Rs)     8.0     9.0     10.0

c^ A fCombined : X, , , - 7:5 „

56. A sample of 35 value, h.c o ' ~ ' 2.3 = 11.9]

<=0. a group of 50 male worths .he „ '' """ ' =

wages are Rs 63 and Rs 9 respSively F?" of their weeidy

R^ 54 and Rs 6 tespeaively Fi"d If" "0 female workers, th^fare

group of 90 workers. ' '•"> ^^dard deviation for a comLed

Coefficient ofvariation of two series are 5S«/ ' = ",, = 91

and What are their ml, TJ" ""'"'on

f the coefficient of variation of X series is J4 6»/ 'f f" * = 22.608]

nteans are fOf.a and respe':,!:^^^.^- l^dTZ:^--

No. Of Persons COOO) ^^ 60 iqO i^q

Class A

Class B ]' 20 40 50 80

^ - li It ll - « « .5 85 „ ' ^ - - « 40

61.

xi/St------y^PC-

58.

59.

60.

- [rniu-vaiues) No. of students (Eco.) No. of students (Statis.)

Chapter 11

MEASURES OF CORRELATION

1. Introduction
2. Correlation and Causation
3. Kinds of Correlation
4. Degree of Correlation
5. Methods of Studying Correlation
   (a) Scatter Diagram
   (b) Karl Pearson's Coefficient of Correlation
   (c) Spearman's Rank Correlation
6. List of Formulae

1. Introduction

In the previous chapters we have discussed measures of central tendency (Mean, Median and Mode), partition values (Quartiles) and measures of dispersion (Range, Quartile Deviation, Mean Deviation, Standard Deviation and Lorenz Curve). These all relate to the description and analysis of a single variable only. This type of statistical analysis is called 'univariate analysis'. Now, we will deal with problems involving association between two variables. We find that in social as well as natural sciences, where more than one inter-dependent variable is involved, a change in one variable brings a change in the others. For instance, in Biology we know that the weight of a person increases with height; in Geometry we know that the circumference of a circle depends on the radius; in Economics prices vary with supply, the cost of industrial production varies with the cost of raw materials, agricultural production depends on the rainfall, etc. The relationship between variables is measured by correlation analysis. Thus, 'the term correlation (or covariation) indicates the relationship between two such variables in which, with a change in the values of one variable, the values of the other variable also change.' The statistical analysis of such data is called bivariate analysis.

Other Definitions

According to Croxton and Cowden, "When the relationship is of a quantitative nature, the appropriate statistical tool for developing and measuring the relationship and expressing it in a brief formula is known as correlation."

According to L.R. Connor, "If two or more quantities vary in sympathy so that movements in one tend to be accompanied by corresponding movements in other(s) then they are said to be correlated."


2. Correlation and Causation


1. Cause and effect : There is a cause and effect relationship between two variables shon w,ves and many .h„„ starred husbands may havt K' ^ r;


be correlation between price and demand so that, in general, whenever there is an increase in price the demand falls, and vice-versa. But this does not mean that whenever there is a rise in price the demand must fall. It is possible that with the rise in price the demand may also go up. This is on account of the fact that in economic and social sciences various factors affect the data simultaneously, and it is difficult, almost impossible, to study the effects of these factors separately. Thus, correlation measures co-variation, not causation. It measures the direction and intensity of relationship among variables.

3. Kinds of Correlation

On the basis of nature of relationship between the variables correlation may be :

(1) Positive and Negative Correlation

(2) Linear and Curvilinear Correlation

(3) Simple, Multiple and Partial Correlation

1. Positive and Negative Correlation

When both the variables change in one direction, that is, when both increase or both decrease, the relationship between the two variables is called positive or direct. But when the change is in opposite directions, that is, one is increasing and the other is decreasing, the correlation is negative or inverse. For determining the direction of change, average values are taken. For example :

(I) Positive Correlation                      (II) Negative Correlation

(a) Both variables   (b) Both variables       (a) One variable        (b) One variable
    increasing           decreasing               increasing, the         decreasing, the
                                                  other decreasing        other increasing

    X      Y             X      Y                 X      Y                X      Y
    10    100            70    147                15    125               75    110
    20    150            60    140                30    110               60    180
    30    160            40    135                35     90               40    190
    40    190            30    130                40     80               30    200
    50    200            15    120                45     75               20    240
    60    255            10     90                50     60               10    250

We find that in I (a) the values of the X series are increasing and so are those of the Y series. In I (b) the values of X and Y are both decreasing. Thus, they are both instances of positive correlation. On the other hand, in II (a) the values of X are increasing and the values of Y are decreasing; similarly, in II (b) the values of X are decreasing and the values of Y are increasing. Thus, they are both examples of negative correlation.

Examples : (Positive Correlation)

1- Age of husband and age of wife.

--easeinheatlllT^^^nofr,.

1- Demand of a commodity mav ^

Increase in the number tfTe ^ "": " 3- Sale of woollen garments ant ly ir ""

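Yield of crops and price.

2. Linear and Curvilinear Correlation

If the ratio of change between two variables is uniform, the correlation between them is linear, and when the values of such variables are plotted on a graph their relationship is described by a straight line. In case of non-linear (curvilinear) correlation, the amount of change in one variable does not bear a constant ratio to the amount of change in the other variable.

3. Simple, Multiple and Partial Correlation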

relationship betwln • T "^"Me or p '' ^^r-ables are

IS a study of relaf,V,„,l.- 7 ""her variables from ,1. u , " ™riables

influencing valbles b '' variables Sfllcit ^ '

of tainfalf""stant. For example, 3;,':'' "f other

correlation. " a certain consrant^Jm::


4. Degree of Correlation

The relationship between two variables can be determined by the quantitative value of the coefficient of correlation, which is obtained by calculation.

Perfect Correlation : Perfect correlation is that where changes in two related variables are exactly proportional. If equal proportional changes are in the same direction, there is perfect positive correlation between the two variables, described as +1; and if equal proportional changes are in the reverse direction, there is perfect negative correlation, described as -1. For example, the circumference of a circle increases in exact proportion to the length of its diameter; the amount of an electricity bill increases in a perfectly definite ratio with an increase in the number of units consumed; the volume of a gas varies inversely with the pressure at constant temperature, etc.

[Figure : Degree of Correlation]

Zero Correlation : The value of the coefficient of correlation may be zero. This means that there is no linear relationship between the two variables, and they are said to be uncorrelated. It does not mean the absence of any type of relation between them; some other type of relation may still be there.

Limited Degree of Correlation : In social science, the variables may be correlated, but an increase in one variable need not always be accompanied by a corresponding or equal increase (or decrease) in the other variable. Correlation is said to be limited positive when there are unequal changes in the two variables in the same direction; and correlation is limited negative when there are unequal changes in the reverse direction. The limited degree of correlation can be high (± .75 to ± 1), moderate (± .25 to ± .75) or low.

The degree of correlation according to Karl Pearson's formula is as follows :

Degree of Correlation                      Positive             Negative

Perfect Correlation                        + 1                  - 1
Very high degree of Correlation            + .9 or more         - .9 or more
Sufficiently high degree of Correlation    from + .75 to + .9   from - .75 to - .9
Moderate degree of Correlation             from + .6 to + .75   from - .6 to - .75
Only the possibility of a Correlation      from + .3 to + .6    from - .3 to - .6
Possibly no Correlation                    Less than + .3       Less than - .3
Zero Correlation (Uncorrelated)            0                    0
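The bands of the table above can be turned into a small lookup, sketched here in Python. The function name is illustrative, and the choice to assign a boundary value (for example exactly .75) to the higher band is an assumption, since the table does not say which band owns the boundary.

```python
def degree_of_correlation(r):
    """Describe r using the bands of the table (boundaries go to the higher band)."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)
    if size == 0:
        return "Zero correlation (uncorrelated)"
    sign = "positive" if r > 0 else "negative"
    if size == 1:
        band = "Perfect"
    elif size >= 0.9:
        band = "Very high degree of"
    elif size >= 0.75:
        band = "Sufficiently high degree of"
    elif size >= 0.6:
        band = "Moderate degree of"
    elif size >= 0.3:
        band = "Only the possibility of"
    else:
        band = "Possibly no"
    return f"{band} {sign} correlation"

print(degree_of_correlation(0.98))
print(degree_of_correlation(-0.949))
```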

5. Methods of Studying Correlation

Some of the methods of studying correlation are :

(a) Scatter Diagram

(b) Karl Pearson's Coefficient of Correlation

(c) Spearman's Rank Correlation

Scatter Diagram

A scatter diagram gives an idea about the presence of correlation between two variables. The chart is prepared by measuring the X-variable on the horizontal axis and the Y-variable on the vertical axis of a graph paper, and plotting one point for each pair of observations of X and Y. In this way the whole data are plotted in the shape of points. The cluster of points so plotted on the graph paper is called the scatter diagram. When the plotted points show some trend, upward or downward, we know that there is some correlation between the two variables. When the trend is upward the correlation is positive; when it is downward the correlation is negative. Let us study the scatter diagrams given below :

[Fig. 1 : Scatter diagrams. (a) Perfect Positive Correlation (r = +1); (b) High Degree of Positive Correlation; (c) Low Degree of Positive Correlation; (d) Perfect Negative Correlation; (e) High Degree of Negative Correlation; (f) Low Degree of Negative Correlation; (g) No Correlation (r = 0)]

Figures (a), (b) and (c) show an upward trend: they show positive correlation. Figures (d), (e) and (f) show a downward trend: they show negative correlation. However, there are differences among (a), (b) and (c) and similar differences among (d), (e) and (f).

We find from the plottings on the scatter diagrams that there is a certain similarity between (a) and (d), (b) and (e), and (c) and (f). In (a) and (d) the plotted points lie on a straight line; this indicates perfect correlation. In (b) and (e) the plotted points are not in a straight line, but if we draw a straight line through the middle of the points (regression line) we will find that the points are close to the line. This kind of scatter diagram shows a high degree of correlation. In (c) and (f), if we draw a similar line (regression line), we will find that the plotted points are very much scattered around the line, not as near as in the case of (b) and (e). This kind of scatter diagram shows a low degree of correlation. Finally, diagram (g) shows such a vast scatter of points that it is impossible to see any trend; this shows no correlation or zero correlation.

Illustration 1. From the following pairs of value of variables X and Y draw a scatter diagram and interpret the result.

X : 4 5 6 7 8 9 10 11 12 13 14 15
Y : 78 72 66 60 54 48 42 36 30 24 18 12

Solution.

We note that X = 4 and Y = 78 are given as the first X and Y values. We may plot this as point (X, Y) on graph paper, where X = 4 and Y = 78. We measure 4 on the X-axis and 78 on the Y-axis to get the coordinates of the first point. Similarly, we measure 5 along the X-axis and 72 along the Y-axis, and so on for all the given X and Y values.

[Scatter diagram. Scale : 0.5 cm = 2 on X-axis]

From the above scatter diagram we can decide that the variables X and Y are correlated. The points take the shape of a line, and it goes down from left top to right bottom; thus there is perfect negative correlation between X and Y.

Rate of Change

It is the slope of the straight line, which depends on the angle that the straight line makes with the X-axis and is equal to ΔY/ΔX (the change in Y per unit change in X).

[Fig. 3 : Rate of change. Panels : showing almost equal change; showing more than proportionate change; showing less than proportionate change; showing no change; non-linear relationship]

We know when the plotted points show some upward trend, the correlation is positive and when there is downward trend, the correlation is negative.

(i) If the straight line makes an angle of 45° with the X-axis, the change is exactly in the same proportion as the change in the value of X [Fig. (a) and (b)].

(ii) If the angle that the straight line makes with the X-axis is greater than 45°, the change in the value of Y is more than proportionate to the change in the value of X [Fig. (c) and (d)].

(iii) If the angle that the straight line makes with the X-axis is less than 45°, the change in the value of Y is less than proportionate to the change in the value of X [Fig. (e) and (f)].

(iv) If there is no angle and it is a straight line parallel to the X-axis, it shows that the value of Y does not change at all [Fig. (g)].

(v) Linear correlation exists when the ratio of change between two variables is uniform. The relationship is described by a straight line. In case of a non-linear (curvilinear) relationship, the amount of change in one variable does not bear a constant ratio to the amount of change in the other variable. Such a relationship will form a curve on the graph [Fig. (h)].
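The cases (i) to (iv) above can be sketched in Python as below. This is only an illustration: the function name is hypothetical, and it assumes that both axes are drawn to the same scale, so that a slope of 1 corresponds to an angle of 45° with the X-axis.

```python
from math import atan, degrees

def describe_rate_of_change(slope):
    """Return (angle with X-axis in degrees, description) for a straight line.

    Assumes equal scales on both axes, so slope 1 <-> 45 degrees.
    """
    angle = degrees(atan(abs(slope)))
    if slope == 0:
        kind = "no change in Y"
    elif abs(slope) == 1:
        kind = "Y changes in the same proportion as X"
    elif abs(slope) > 1:
        kind = "Y changes more than proportionately"
    else:
        kind = "Y changes less than proportionately"
    return angle, kind

for s in (1, 2.5, 0.4, 0, -1):
    print(s, describe_rate_of_change(s))
```

A negative slope gives the same angle and description; only the direction of the trend (downward) differs.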


Merits and Demerits of Scatter Diagram

Merits :

1. It is very easy to draw a scatter diagram.

5. fa case of linear relationship between x lni y lT T ^ donate change in th^ .a,„e t Jcha^^r,: t'^nTT" "

t?™ r^atX™ — ■'"own ,„ n„„er,ca,

Whether YLTes XorTcau^ —^^ ^oes not tell,

^hen ,t . not possible to draw a scatter

Karl Pearson's Coefficient of Correlation

This measure was given by the statistician Karl Pearson (1857-1936). It is the most widely used method of measuring the coefficient of correlation. It is also called the product moment correlation and is represented by r. It is based on arithmetic mean and standard deviation. According to Karl Pearson's formula, the coefficient of correlation (r) of two variables is obtained by dividing the sum of the products of the corresponding deviations of the various items of the two series from their respective means by the product of their standard deviations and the number of pairs of observations.

Symbolically,

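$$r = \frac{\Sigma xy}{N \times \sigma_x \times \sigma_y}$$

Here, $x = (X - \bar{X})$; $y = (Y - \bar{Y})$

$\sigma_x$ = Standard deviation of X series, i.e. $\sqrt{\dfrac{\Sigma x^2}{N}}$

$\sigma_y$ = Standard deviation of Y series, i.e. $\sqrt{\dfrac{\Sigma y^2}{N}}$

N = Number of pairs of observations

r = Coefficient of correlation

The above formula is based on the study of covariance between two series. The covariance between two series is written as follows :

$$\text{Covariance of X and Y} = \frac{\Sigma xy}{N} = \frac{\Sigma (X - \bar{X})(Y - \bar{Y})}{N}$$

The formula can therefore be rewritten as under :

$$r = \frac{\Sigma xy}{N} \times \frac{1}{\sigma_x} \times \frac{1}{\sigma_y} = \frac{\Sigma xy}{N} \times \frac{1}{\sqrt{\dfrac{\Sigma x^2}{N}}} \times \frac{1}{\sqrt{\dfrac{\Sigma y^2}{N}}}$$

$$r = \frac{\Sigma xy}{\sqrt{\Sigma x^2 \times \Sigma y^2}}$$

or

$$r = \frac{\Sigma (X - \bar{X})(Y - \bar{Y})}{\sqrt{\Sigma (X - \bar{X})^2} \times \sqrt{\Sigma (Y - \bar{Y})^2}}$$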
Applying Karl Pearson's formula, the coefficient of correlation is calculated by the following methods :

(a) Actual Mean Method

(b) Direct Method

(c) Assumed Mean Method

(d) Step Deviation Method


Illustration 2. Calculate Product moment of correlation from the following data and interpret the result.

Serial No. of Students : 1 2 3 4 5 6 7 8 9 10

Marks in Mathematics : 15 18 21 24 27 30 36 39 42 48

Marks in Statistics : 25 25 27 27 31 33 35 41 41 45

Solution. Karl Pearson's coefficient of correlation is also called Product Moment of Correlation.

Calculation of Coefficient of Correlation

   X     x = X - 30     x²       Y     y = Y - 33     y²      xy
  15        -15        225      25        -8          64     120
  18        -12        144      25        -8          64      96
  21         -9         81      27        -6          36      54
  24         -6         36      27        -6          36      36
  27         -3          9      31        -2           4       6
  30          0          0      33         0           0       0
  36         +6         36      35        +2           4      12
  39         +9         81      41        +8          64      72
  42        +12        144      41        +8          64      96
  48        +18        324      45       +12         144     216

ΣX = 300    Σx = 0    Σx² = 1080    ΣY = 330    Σy = 0    Σy² = 480    Σxy = 708

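Steps :

1. Calculate arithmetic means of X and Y series.

7. Apply the following formula :

$$r = \frac{\Sigma xy}{\sqrt{\Sigma x^2 \times \Sigma y^2}}$$

Here, $x = (X - \bar{X})$ and $y = (Y - \bar{Y})$

Let us calculate arithmetic means of X and Y series :

$$\bar{X} = \frac{\Sigma X}{N} = \frac{300}{10} = 30; \qquad \bar{Y} = \frac{\Sigma Y}{N} = \frac{330}{10} = 33$$

Here, $\Sigma xy = 708$, $\Sigma x^2 = 1080$ and $\Sigma y^2 = 480$. Now we get,

$$r = \frac{708}{\sqrt{1080 \times 480}} = \frac{708}{\sqrt{518400}} = \frac{708}{720} = 0.98$$

Hence, there is high degree of positive correlation.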
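The actual mean method of Illustration 2 can be sketched in Python as follows. The function and variable names are illustrative; the calculation is the formula r = Σxy / √(Σx² × Σy²), with x and y taken as deviations from the actual means.

```python
from math import sqrt

def pearson_actual_mean(xs, ys):
    """Karl Pearson's r using deviations from the actual means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    x = [v - mean_x for v in xs]          # x = X - mean of X
    y = [v - mean_y for v in ys]          # y = Y - mean of Y
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    return sum_xy / sqrt(sum_x2 * sum_y2)

maths = [15, 18, 21, 24, 27, 30, 36, 39, 42, 48]
stats = [25, 25, 27, 27, 31, 33, 35, 41, 41, 45]
print(round(pearson_actual_mean(maths, stats), 2))  # 0.98
```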

Illustration 3. Calculate Karl Pearson's coefficient of correlation between birth rate and death rate from the following data :

Year Birth rate Death rate

1931 24 15

1941 26 20

1951 32 22

1961 33 24

1971 35 27

1981 30 24

Solution.

Calculation of Coefficient of Correlation

Birth rate   Death rate
    X            Y         x = X - 30     x²     y = Y - 22     y²     xy
   24           15             -6         36         -7         49     42
   26           20             -4         16         -2          4      8
   32           22             +2          4          0          0      0
   33           24             +3          9         +2          4      6
   35           27             +5         25         +5         25     25
   30           24              0          0         +2          4      0

ΣX = 180    ΣY = 132    Σx = 0    Σx² = 90    Σy = 0    Σy² = 86    Σxy = 81

Actual Mean Method :

$$\bar{X} = \frac{\Sigma X}{N} = \frac{180}{6} = 30; \qquad \bar{Y} = \frac{\Sigma Y}{N} = \frac{132}{6} = 22$$

$$r = \frac{\Sigma xy}{\sqrt{\Sigma x^2 \times \Sigma y^2}}$$

Here, $\Sigma xy = 81$, $\Sigma x^2 = 90$ and $\Sigma y^2 = 86$

$$r = \frac{81}{\sqrt{90 \times 86}} = \frac{81}{\sqrt{7740}} = \frac{81}{87.9772} = 0.921$$

Hence, there is high degree of positive correlation.


Illustration 4. From the following data compute product moment correlation between X and Y :

                                      X series    Y series
No. of items                             15          15
Arithmetic Mean                          25          18
Square of deviation from Mean           136         138

Summation of product of deviations of X and Y series from their respective means = 122.

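Solution. Regarding deviations of the values in X and Y series from their respective means, we are given the following information :

$\Sigma x^2 = 136$, $\Sigma y^2 = 138$ and $\Sigma xy = 122$

Applying formula,

$$r = \frac{\Sigma xy}{\sqrt{\Sigma x^2 \times \Sigma y^2}}$$

Now, we get

$$r = \frac{122}{\sqrt{136 \times 138}} = \frac{122}{\sqrt{18768}} = \frac{122}{136.996} = 0.891$$

Hence, there is high degree of positive correlation between X and Y.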

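Illustration 5. The number of pairs of observations is 50, the standard deviations of the X and Y series are 4.5 and 3.5 respectively, and the summation of products of deviations of X and Y series from their arithmetic means is 420. Find the coefficient of correlation between X and Y.

Solution. Given N = 50, $\sigma_x = 4.5$, $\sigma_y = 3.5$ and $\Sigma xy = 420$

Applying formula,

$$r = \frac{\Sigma xy}{N \times \sigma_x \times \sigma_y}$$

Now, we get

$$r = \frac{420}{50 \times 4.5 \times 3.5} = \frac{420}{787.5} = 0.533$$

Hence, there is positive correlation between X and Y.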

Illustration 6. If the covariance between X and Y variables is +12.3 and the variances of X and Y are respectively 13.8 and 16.4, find Karl Pearson's coefficient of correlation between them.

Solution. We are given,

Covariance of X and Y $= \dfrac{\Sigma xy}{N} = 12.3$

Variance of X $(\sigma_x^2) = 13.8$, so $\sigma_x = \sqrt{13.8} = 3.71$

Variance of Y $(\sigma_y^2) = 16.4$, so $\sigma_y = \sqrt{16.4} = 4.05$

Applying formula,

$$r = \frac{\Sigma xy}{N \cdot \sigma_x \cdot \sigma_y} = \frac{\Sigma xy}{N} \times \frac{1}{\sigma_x} \times \frac{1}{\sigma_y}$$

Now, we get

$$r = 12.3 \times \frac{1}{3.71} \times \frac{1}{4.05} = \frac{12.3}{15.03} = 0.82$$

Hence, there is high degree of positive correlation between X and Y.

Illustration 7. Find the standard deviation of X series if the coefficient of correlation between two series X and Y is 0.28, their covariance is 7.6 and the variance of Y series is 81.90.

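Solution. Given, coefficient of correlation (r) = 0.28

Covariance of X and Y $= \dfrac{\Sigma xy}{N} = 7.6$

Variance of Y $(\sigma_y^2) = 81.90$, so $\sigma_y = \sqrt{81.90} = 9.05$

Applying formula,

$$r = \frac{\Sigma xy}{N \cdot \sigma_x \cdot \sigma_y} = \frac{\Sigma xy}{N} \times \frac{1}{\sigma_x} \times \frac{1}{\sigma_y}$$

Now, we get

$$0.28 = 7.6 \times \frac{1}{\sigma_x} \times \frac{1}{9.05}$$

or $0.28 \times 9.05\,\sigma_x = 7.6$, or $2.534\,\sigma_x = 7.6$

$$\sigma_x = \frac{7.6}{2.534} = 2.99 \approx 3$$

Therefore, variance of X $(\sigma_x^2) = (3)^2 = 9$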
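Illustrations 6 and 7 both rest on r = covariance / (σx × σy). A minimal Python sketch, with illustrative function names, shows the formula used in both directions. Working without rounding the intermediate square roots gives r of about 0.82 for Illustration 6 and σx of about 3 for Illustration 7.

```python
from math import sqrt

def r_from_covariance(cov_xy, var_x, var_y):
    """r = covariance / (sd of X * sd of Y)."""
    return cov_xy / (sqrt(var_x) * sqrt(var_y))

def sd_x_from_r(r, cov_xy, var_y):
    """Rearranged formula: sd of X = covariance / (r * sd of Y)."""
    return cov_xy / (r * sqrt(var_y))

print(round(r_from_covariance(12.3, 13.8, 16.4), 2))   # Illustration 6
print(round(sd_x_from_r(0.28, 7.6, 81.90), 2))         # Illustration 7
```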

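Illustration 8. Calculate the number of items for which r = +0.8, $\Sigma xy = 200$, standard deviation of Y = 5, and $\Sigma x^2 = 100$, where x and y denote deviations of items from actual mean.

Solution. Applying formula,

$$r = \frac{\Sigma xy}{\sqrt{\Sigma x^2 \times \Sigma y^2}} \quad \text{or} \quad 0.8 = \frac{200}{\sqrt{100 \times \Sigma y^2}}$$

Squaring both sides, we get

$$(0.8)^2 = \frac{(200)^2}{100 \times \Sigma y^2} \quad \text{or} \quad 0.64 = \frac{40000}{100 \times \Sigma y^2}$$

or $0.64 \times 100 \times \Sigma y^2 = 40000$, or $64\,\Sigma y^2 = 40000$

$$\Sigma y^2 = \frac{40000}{64} = 625$$

Now, $\sigma_y = \sqrt{\dfrac{\Sigma y^2}{N}}$, or $5 = \sqrt{\dfrac{625}{N}}$

or $(5)^2 = \dfrac{625}{N}$, or $25 = \dfrac{625}{N}$, or $25\,N = 625$

$$N = \frac{625}{25} = 25$$

Hence, number of items is 25.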

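(b) Direct Method

$$r = \frac{\dfrac{\Sigma XY}{N} - \dfrac{\Sigma X}{N} \cdot \dfrac{\Sigma Y}{N}}{\sqrt{\dfrac{\Sigma X^2}{N} - \left(\dfrac{\Sigma X}{N}\right)^2} \times \sqrt{\dfrac{\Sigma Y^2}{N} - \left(\dfrac{\Sigma Y}{N}\right)^2}}$$

or

$$r = \frac{N\,\Sigma XY - \Sigma X \cdot \Sigma Y}{\sqrt{N\,\Sigma X^2 - (\Sigma X)^2} \times \sqrt{N\,\Sigma Y^2 - (\Sigma Y)^2}}$$

The coefficient of correlation can also be calculated from the following formula :

$$r = \frac{\Sigma XY - N \cdot \bar{X} \cdot \bar{Y}}{\sqrt{\Sigma X^2 - N(\bar{X})^2} \times \sqrt{\Sigma Y^2 - N(\bar{Y})^2}}$$

where $\bar{X} = \dfrac{\Sigma X}{N}$ and $\bar{Y} = \dfrac{\Sigma Y}{N}$.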


Illustration 9. The data of price and quantity purchased relating to a commodity for 5 months are given below. Calculate the product moment correlation (Karl Pearson's coefficient of correlation) between price and quantity and comment on its sign and magnitude.

Months    Price (in Rs)    Quantity (in kg)
  1           10                 5
  2           10                 6
  3           11                 4
  4           12                 3
  5           12                 2

Solution. Calculation of coefficient of correlation.

   X      Y      X²      Y²     XY
  10      5     100      25     50
  10      6     100      36     60
  11      4     121      16     44
  12      3     144       9     36
  12      2     144       4     24

ΣX = 55    ΣY = 20    ΣX² = 609    ΣY² = 90    ΣXY = 214

Steps :

1. Calculate arithmetic means of X and Y series.

2. Square the values of X series and obtain the total, i.e., ΣX².

3. Square the values of Y series and obtain the total, i.e., ΣY².

4. Multiply X and Y values and find out the total, i.e., ΣXY.

Applying formula,

$$r = \frac{\Sigma XY - N \cdot \bar{X} \cdot \bar{Y}}{\sqrt{\Sigma X^2 - N(\bar{X})^2} \times \sqrt{\Sigma Y^2 - N(\bar{Y})^2}}$$

Let us calculate arithmetic means of X and Y series :

$$\bar{X} = \frac{\Sigma X}{N} = \frac{55}{5} = 11; \qquad \bar{Y} = \frac{\Sigma Y}{N} = \frac{20}{5} = 4$$

Here, ΣXY = 214, ΣX² = 609, ΣY² = 90, $\bar{X} = 11$, $\bar{Y} = 4$ and N = 5. Now, we get

$$r = \frac{214 - 5 \times 11 \times 4}{\sqrt{609 - 5(11)^2} \times \sqrt{90 - 5(4)^2}} = \frac{214 - 220}{\sqrt{609 - 605} \times \sqrt{90 - 80}}$$

$$= \frac{-6}{\sqrt{4} \times \sqrt{10}} = \frac{-6}{2 \times 3.162} = \frac{-6}{6.324} = -0.949$$

Hence, there is high degree of negative correlation between price and quantity purchased relating to a commodity for 5 months.

In other words, purchase (demand) decreased due to increase in the price of commodity.
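The direct method of Illustration 9 can be sketched in Python as follows, using the form r = (NΣXY - ΣX·ΣY) / (√(NΣX² - (ΣX)²) × √(NΣY² - (ΣY)²)), which needs no deviations at all. The function and variable names are illustrative.

```python
from math import sqrt

def pearson_direct(xs, ys):
    """Karl Pearson's r from raw totals (direct method)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(a * b for a, b in zip(xs, ys))
    sum_x2 = sum(a * a for a in xs)
    sum_y2 = sum(b * b for b in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator

price = [10, 10, 11, 12, 12]
quantity = [5, 6, 4, 3, 2]
print(round(pearson_direct(price, quantity), 3))  # -0.949
```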

Illustration 10. Draw a scatter diagram and calculate Karl Pearson's coefficient of correlation between X and Y. Interpret the result and comment on their relationship.

X : 1 3 4 5 7 8
Y : 2 6 8 10 14 16

Solution.

[Fig. 4 : Scatter diagram. Scale : 1 cm = 1 on X-axis, 1 cm = 2 on Y-axis]

From the above scatter diagram we can decide that variables X and Y are correlated. The points take the shape of a line, and it goes up from left bottom to right top; thus there is perfect positive correlation between X and Y.

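   X      Y      X²      Y²      XY
   1      2       1       4       2
   3      6       9      36      18
   4      8      16      64      32
   5     10      25     100      50
   7     14      49     196      98
   8     16      64     256     128

ΣX = 28    ΣY = 56    ΣX² = 164    ΣY² = 656    ΣXY = 328

$$r = \frac{\Sigma XY - \dfrac{(\Sigma X)(\Sigma Y)}{N}}{\sqrt{\Sigma X^2 - \dfrac{(\Sigma X)^2}{N}} \times \sqrt{\Sigma Y^2 - \dfrac{(\Sigma Y)^2}{N}}}$$

Here, ΣXY = 328, ΣX = 28, ΣY = 56, ΣX² = 164, ΣY² = 656 and N = 6

$$r = \frac{328 - \dfrac{28 \times 56}{6}}{\sqrt{164 - \dfrac{(28)^2}{6}} \times \sqrt{656 - \dfrac{(56)^2}{6}}} = \frac{328 - 261.33}{\sqrt{164 - 130.666} \times \sqrt{656 - 522.666}}$$

$$= \frac{66.67}{\sqrt{33.334} \times \sqrt{133.334}} = \frac{66.67}{5.774 \times 11.547} = \frac{66.67}{66.67} = +1$$

There is perfect positive correlation by scatter diagram and even by Karl Pearson's formula, resulting in r = +1.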

We observe from the illustration that the changes in the two values X and Y are exactly in equal proportion. The Y values are exactly double the corresponding values of X, moving in the same direction (upward). In such a situation, the result is perfect positive correlation. If equal proportional changes are in the reverse direction, there is perfect negative correlation (r = -1).


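Illustration 11. Calculate coefficient of correlation between X and Y and comment on their relationship.

X : -3 -2 -1 1 2 3
Y : 9 4 1 1 4 9

Solution.

Calculation of Coefficient of Correlation

   X      Y      X²      Y²      XY
  -3      9       9      81     -27
  -2      4       4      16      -8
  -1      1       1       1      -1
   1      1       1       1       1
   2      4       4      16       8
   3      9       9      81      27

ΣX = 0    ΣY = 28    ΣX² = 28    ΣY² = 196    ΣXY = 0

$$r = \frac{\Sigma XY - \dfrac{(\Sigma X)(\Sigma Y)}{N}}{\sqrt{\Sigma X^2 - \dfrac{(\Sigma X)^2}{N}} \times \sqrt{\Sigma Y^2 - \dfrac{(\Sigma Y)^2}{N}}} = \frac{0 - \dfrac{0 \times 28}{6}}{\sqrt{28 - 0} \times \sqrt{196 - \dfrac{(28)^2}{6}}}$$

$$= \frac{0}{\sqrt{28} \times \sqrt{65.334}} = \frac{0}{5.291 \times 8.083} = 0$$

Thus r = 0, so X and Y are uncorrelated in the linear sense. They are nevertheless related: each value of Y is the square of the corresponding value of X.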
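A zero coefficient of this kind can be reproduced with a few lines of Python. The sketch below (function name illustrative) builds Y as the square of X over values placed symmetrically around zero, which is the situation of the data above, and shows that r comes out as 0 even though Y is completely determined by X.

```python
from math import sqrt

def pearson_direct(xs, ys):
    """Karl Pearson's r from raw totals (direct method)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(a * b for a, b in zip(xs, ys))
    sum_x2 = sum(a * a for a in xs)
    sum_y2 = sum(b * b for b in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator

xs = [-3, -2, -1, 1, 2, 3]
ys = [x * x for x in xs]       # Y = X squared: a non-linear relationship
print(ys)                      # [9, 4, 1, 1, 4, 9]
print(pearson_direct(xs, ys))  # 0.0
```

This is why a zero value of r should be read as "no linear relationship" rather than "no relationship".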



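(c) Assumed Mean Method

When the actual mean is not a whole number but a fraction, or when the series is large, the calculation by the actual mean method and the direct method will involve a lot of calculations and time. To avoid such tedious calculations, we can use the assumed mean method. The correlation coefficient can be obtained by the following formula :

$$r = \frac{\Sigma dxdy - \dfrac{(\Sigma dx)(\Sigma dy)}{N}}{\sqrt{\Sigma dx^2 - \dfrac{(\Sigma dx)^2}{N}} \times \sqrt{\Sigma dy^2 - \dfrac{(\Sigma dy)^2}{N}}}$$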

Illustration 12. Calculate Karl Pearson's coefficient of correlation of the following data of height of fathers in inches (X) and their sons (Y). Interpret the result.

Height of fathers (in inches) : 65 66 57 67 68 69 70 72
Height of sons (in inches) : 67 56 65 68 72 72 69 71

Solution.


Height of                          Height of
fathers X   dx = X - 68   dx²      sons Y    dy = Y - 65   dy²    dxdy
   65           -3          9        67          +2          4     -6
   66           -2          4        56          -9         81    +18
   57          -11        121        65           0          0      0
   67           -1          1        68          +3          9     -3
   68            0          0        72          +7         49      0
   69           +1          1        72          +7         49     +7
   70           +2          4        69          +4         16     +8
   72           +4         16        71          +6         36    +24

Σdx = -10    Σdx² = 156    Σdy = 20    Σdy² = 244    Σdxdy = 48

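Steps :

1. Calculate the deviations of X series from an assumed mean (68), denote them by dx and find out the total, i.e., Σdx.

2. Calculate the deviations of Y series from an assumed mean (65), denote them by dy and find out the total, i.e., Σdy.

3. Square the deviations of X series and obtain the total, i.e., Σdx².

4. Square the deviations of Y series and obtain the total, i.e., Σdy².

5. Multiply dx and dy and find out the total, i.e., Σdxdy.

6. Applying the following formula, we get

$$r = \frac{\Sigma dxdy - \dfrac{(\Sigma dx)(\Sigma dy)}{N}}{\sqrt{\Sigma dx^2 - \dfrac{(\Sigma dx)^2}{N}} \times \sqrt{\Sigma dy^2 - \dfrac{(\Sigma dy)^2}{N}}}$$

Here, Σdxdy = 48, Σdx = -10, Σdy = 20, N = 8, Σdx² = 156 and Σdy² = 244. Now, we get

$$r = \frac{48 - \dfrac{(-10)(20)}{8}}{\sqrt{156 - \dfrac{(-10)^2}{8}} \times \sqrt{244 - \dfrac{(20)^2}{8}}} = \frac{48 + 25}{\sqrt{156 - 12.5} \times \sqrt{244 - 50}}$$

$$= \frac{73}{\sqrt{143.5} \times \sqrt{194}} = \frac{73}{11.979 \times 13.928} = \frac{73}{166.85} = 0.438$$

Hence, there is positive correlation between the height of fathers and their sons.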

We can simplify the above calculations by using log tables. Taking logarithms of

$$r = \frac{73}{\sqrt{143.5} \times \sqrt{194}}$$

we have

$$\log r = \log 73 - \tfrac{1}{2}\,[\log 143.5 + \log 194]$$

$$= 1.8633 - \tfrac{1}{2}\,[2.1563 + 2.2878] = 1.8633 - \tfrac{1}{2}\,[4.4441]$$

$$= 1.8633 - 2.2220 = -0.3587 = -1 + (1 - 0.3587) = \bar{1}.6413$$

Hence, $r = \text{Antilog } \bar{1}.6413 = 0.4378 = 0.438$
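The assumed mean method of Illustration 12 can be sketched in Python as below. The function and argument names are illustrative. The sketch also makes a point that is easy to miss in hand calculation: the answer does not depend on which assumed means are chosen, because the correction terms (Σdx)(Σdy)/N, (Σdx)²/N and (Σdy)²/N remove their effect.

```python
from math import sqrt

def pearson_assumed_mean(xs, ys, a_x, a_y):
    """Karl Pearson's r using deviations from assumed means a_x and a_y."""
    n = len(xs)
    dx = [x - a_x for x in xs]
    dy = [y - a_y for y in ys]
    sum_dx, sum_dy = sum(dx), sum(dy)
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    numerator = sum_dxdy - sum_dx * sum_dy / n
    denominator = (sqrt(sum_dx2 - sum_dx ** 2 / n)
                   * sqrt(sum_dy2 - sum_dy ** 2 / n))
    return numerator / denominator

fathers = [65, 66, 57, 67, 68, 69, 70, 72]
sons = [67, 56, 65, 68, 72, 72, 69, 71]
print(round(pearson_assumed_mean(fathers, sons, 68, 65), 3))  # 0.438
print(round(pearson_assumed_mean(fathers, sons, 60, 70), 3))  # same value
```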

(d) Step Deviation Method

In this method the deviations taken from the assumed mean are divided by a convenient common factor to reduce the calculations. The coefficient of correlation is unaffected by the change of origin and change of scale of X and Y. After obtaining these step deviations, we apply the same formula as in the assumed mean method.

Illustration 13. The data on price and supply relating to a commodity for 7 months are given below :

Price (in Rs) : 20 40 60 80 100 120 140
Supply (in kg) : 400 200 500 1000 400 1100 1200

Calculate product moment of correlation between price and quantity and comment on its sign and magnitude.

Solution.


Price (Rs)           dx =                Supply (kg)            dy =
    X       X - 60   (X - 60)/20   dx²        Y       Y - 700   (Y - 700)/100   dy²    dxdy
   20        -40        -2          4        400       -300         -3           9       6
   40        -20        -1          1        200       -500         -5          25       5
   60          0         0          0        500       -200         -2           4       0
   80        +20        +1          1       1000       +300         +3           9       3
  100        +40        +2          4        400       -300         -3           9      -6
  120        +60        +3          9       1100       +400         +4          16      12
  140        +80        +4         16       1200       +500         +5          25      20

ΣX = 560    Σdx = 7    Σdx² = 35    ΣY = 4800    Σdy = -1    Σdy² = 97    Σdxdy = 40

Steps :

1. Calculate the deviations of X series from an assumed mean and divide them by common factor. Denote them by dx and find out the total, i.e., Idx.

2. Calculate the deviations" of Y series from an assumed mean and divide them by common factor. Denote them by dy and find out the total, i.e., Uy.

3. Square the step deviations of X series and obtain the total, i.e., Idx\

4. Square the step deviations of Y series and obtain the total, i.e., Zd-f.

5. Multiply dx and dy and find out the total, i.e., Idxdy.

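6. Applying formula, we get

$$r = \frac{\sum dx\,dy - \dfrac{(\sum dx)(\sum dy)}{N}}{\sqrt{\sum dx^2 - \dfrac{(\sum dx)^2}{N}}\;\sqrt{\sum dy^2 - \dfrac{(\sum dy)^2}{N}}}$$

Here, $\sum dx\,dy = 40$, $\sum dx = 7$, $\sum dy = -1$, $N = 7$, $\sum dx^2 = 35$ and $\sum dy^2 = 97$.

$$r = \frac{40 - \dfrac{(7)(-1)}{7}}{\sqrt{35 - \dfrac{(7)^2}{7}}\;\sqrt{97 - \dfrac{(-1)^2}{7}}} = \frac{40 + 1}{\sqrt{28}\times\sqrt{96.857}} = \frac{41}{5.29 \times 9.84} = \frac{41}{52.05} = +0.787$$

Hence, there is a high degree of positive correlation between price and supply. In other words, supply increases due to increase in the price of the commodity.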
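The step deviation working of Illustration 13 can be reproduced with a short Python sketch. This is an illustrative check rather than the book's own procedure; the helper names `step_deviations` and `pearson_from_deviations` are assumptions introduced here. The second call uses the raw values (assumed mean 0, common factor 1) to show that the change of origin and scale leaves r unchanged.

```python
import math

price = [20, 40, 60, 80, 100, 120, 140]
supply = [400, 200, 500, 1000, 400, 1100, 1200]

def step_deviations(values, assumed_mean, common_factor):
    """Deviations from an assumed mean, divided by a common factor."""
    return [(v - assumed_mean) / common_factor for v in values]

def pearson_from_deviations(dx, dy):
    """Assumed mean formula for r, applied to two lists of deviations."""
    n = len(dx)
    sum_dx, sum_dy = sum(dx), sum(dy)
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    numerator = sum_dxdy - sum_dx * sum_dy / n
    denominator = (math.sqrt(sum_dx2 - sum_dx ** 2 / n)
                   * math.sqrt(sum_dy2 - sum_dy ** 2 / n))
    return numerator / denominator

dx = step_deviations(price, 60, 20)      # assumed mean 60, common factor 20
dy = step_deviations(supply, 700, 100)   # assumed mean 700, common factor 100

r_step = pearson_from_deviations(dx, dy)
r_raw = pearson_from_deviations(price, supply)  # no shift, no scaling

print(round(r_step, 3), round(r_raw, 3))  # 0.787 0.787
```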

Change of Scale in the Calculation of r

valu" of ~ ff ^^ ^^^

we take the values as T 2 ani s thrva L T Z ^^ ^ ^00 and

values of Y-series are 1 i 2.4 and 3 7 lev ci\ 1 T J'l^^^^

these values to he 12,24 ^n,^'' - —

The following example would illustrate the point :

Illustration 14.^ Calculate^coeffic^^^^^

SolL^W^rmultSTLes^lOa^r."^^^ ^^^^^

values of X and Y would L ^ ^ ^y ^o that the

X : 1 ■

^ = 5 6 t ' ' '

; 10 11 J3

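Calculation of Coefficient of Correlation

Applying formula

$$r = \frac{\sum dx\,dy - \dfrac{(\sum dx)(\sum dy)}{N}}{\sqrt{\sum dx^2 - \dfrac{(\sum dx)^2}{N}}\;\sqrt{\sum dy^2 - \dfrac{(\sum dy)^2}{N}}}$$

Here, $\sum dx\,dy = 46$, $\sum dx = 0$, $\sum dy = 0$, $\sum dx^2 = 28$, $\sum dy^2 = 76$, $N = 7$

Now, we get

$$r = \frac{46 - \dfrac{(0)(0)}{7}}{\sqrt{28 - \dfrac{(0)^2}{7}}\;\sqrt{76 - \dfrac{(0)^2}{7}}} = \frac{46}{\sqrt{28 \times 76}} = \frac{46}{46.13} = 0.997$$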

Hence, there is high degree of positive correlation.

If the original values of X and Y were used the result would still be the same and r would be +0.997.
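The claim that a change of scale leaves r unaffected can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the data below are made up for illustration (they are not the figures of Illustration 14), and `pearson_r` is a hypothetical helper that uses the actual mean method.

```python
import math

def pearson_r(xs, ys):
    """Karl Pearson's r using deviations from the actual means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_x2 = sum((x - mean_x) ** 2 for x in xs)
    sum_y2 = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_x2 * sum_y2)

# Hypothetical data, chosen only to illustrate the property
x = [0.01, 0.02, 0.03, 0.04, 0.05]
y = [1.2, 2.4, 3.7, 4.1, 6.0]

# Change of scale: multiply X by 100 and Y by 10
x_scaled = [v * 100 for v in x]
y_scaled = [v * 10 for v in y]

r_original = pearson_r(x, y)
r_scaled = pearson_r(x_scaled, y_scaled)
print(math.isclose(r_original, r_scaled))  # True
```

Multiplying by a negative constant would flip the sign of r, so the sketch uses positive factors only.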

Assumptions of Karl Pearson's Coefficient of Correlation

Pearsonian coefficient is based on the following assumptions :

1. Linear relationship : If two variables are plotted on a scatter diagram, it is assumed that the plotted points will form a straight line. So there is a linear relationship between the variables.

2. Normality : The correlated variables are affected by a large number of independent causes, which form a normal distribution. Variables like indices of price and supply, ages of husbands and wives, heights of fathers and sons, price and demand are affected by such forces that a normal distribution is formed.

3. Causal relationship : Correlation is only meaningful if there is a cause and effect relationship between the forces affecting the distribution of items in the two series. It is meaningless if there is no such relationship. There is no relationship between rice and wheat, because the factors that affect these variables are not common. Similarly, the weight of an individual during the last ten years may show an upward trend and his income during this period may also show a similar tendency, but there cannot be any correlation between the two series because the forces affecting the two series are entirely unconnected with each other. The calculated coefficient of correlation of such series is usually termed 'non-sense' or 'spurious' correlation.

4. Proper grouping : It will be a better correlation analysis if there is an equal number of pairs.

5. Error of measurement : If the error of measurement is reduced to the minimum, the coefficient of correlation is more reliable.

Mathematical Properties of the Coefficient of Correlation

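2. The coefficient of correlation is independent of the change of origin and scale : its value is unaffected if the values of the X and Y series are multiplied or divided by some constant, or if a constant is subtracted from or added to the values of the X and Y series. (See Illustrations 13 and 14).

3. The converse of the theorem is not true, i.e., r = 0 does not mean that the variables are independent : uncorrelated variables need not necessarily be independent. Uncorrelation between the variables X and Y simply implies the absence of a linear relationship between them. They may, however, be related in some other form.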

MerU^^d Demerits of Karl Pearson's Coeffident Demerits

3 process is time consuming.

• relaZ'Sp'™ between the variables, whether such

4. Correlation lies between ± 1 Thi ^ i

otherwise it may be misinterpreted. ' interpretation.

^Rww^ ____________

under consideration

Measures of Correlation ^^^

Charles Edward Spearman, a British psychologist, developed a formula in 1904 for obtaining the correlation coefficient between the ranks of N individuals in the two attributes under study. It is called the coefficient of correlation by rank differences. It is the Product Moment Correlation between the ranks.

This method is applicable only to individual observations rather than frequency distributions. The result we get from this method is only an approximate one, because under the ranking method the original values are not taken into account.

After assigning ranks to the various items, the differences of corresponding rank values are calculated and the following formula is used :

$$r_k = 1 - \frac{6\sum D^2}{N^3 - N}$$

where,

$r_k$ = Coefficient of rank correlation
$\sum D^2$ = the total of squares of the differences of corresponding ranks
N = the number of pairs of observations

Like Karl Pearson's coefficient, the value of $r_k$ lies between +1 and -1. If $r_k$ = +1, then there is complete agreement in the order of ranks and the direction of the ranks is also the same. When $r_k$ = -1, then there is complete disagreement in the order of ranks and they are in opposite directions. Let us examine this by the following example :

R1    R2    D    D²
 1     1    0    0
 2     2    0    0
 3     3    0    0
          ΣD² = 0

$$r_k = 1 - \frac{6\sum D^2}{N^3 - N} = 1 - \frac{6 \times 0}{3^3 - 3} = 1 - 0 = 1$$

Perfect positive correlation, i.e., there is a complete agreement.

R1    R2    D     D²
 1     3    -2    4
 2     2     0    0
 3     1    +2    4
           ΣD² = 8

$$r_k = 1 - \frac{6\sum D^2}{N^3 - N} = 1 - \frac{6 \times 8}{3^3 - 3} = 1 - 2 = -1$$

Perfect negative correlation, i.e., there is complete disagreement.
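The two small examples above can be reproduced with a short Python sketch of the rank difference formula. The function name `rank_correlation` is a hypothetical helper introduced for illustration; it assumes the ranks are already given and contain no ties.

```python
def rank_correlation(ranks_1, ranks_2):
    """Spearman's rank coefficient: 1 - 6*sum(D^2) / (N^3 - N), no tied ranks."""
    n = len(ranks_1)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_1, ranks_2))
    return 1 - 6 * sum_d2 / (n ** 3 - n)

agree = rank_correlation([1, 2, 3], [1, 2, 3])     # identical order of ranks
disagree = rank_correlation([1, 2, 3], [3, 2, 1])  # exactly reversed order

print(agree)     # 1.0
print(disagree)  # -1.0
```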

The problems of calculation of rank correlation are of three types :

(a) When ranks are given, (b) When ranks are not given, (c) When ranks are equal or repeated.

(a) When ranks are given :

Illustration 15. In a baby competition, two judges accorded the following ranks to 11 entries :

Entry    : A  B  C  D  E  F  G  H  I   J   K
Judge I  : 1  2  3  4  5  6  7  8  9  10  11
Judge II : 2  3  1  6  4  5  8  7  10 11   9

Calculate the rank correlation coefficient.

Solution.

Calculation of Rank Coefficient of Correlation

Entry    R1    R2    D = R1 - R2    D²
  A       1     2        -1          1
  B       2     3        -1          1
  C       3     1        +2          4
  D       4     6        -2          4
  E       5     4        +1          1
  F       6     5        +1          1
  G       7     8        -1          1
  H       8     7        +1          1
  I       9    10        -1          1
  J      10    11        -1          1
  K      11     9        +2          4

N = 11                          ΣD² = 20

Steps :

1. Calculate the difference of two ranks, i.e., R1 - R2, and denote it by D.

2. Square these differences and find out the total, i.e., ΣD².

3. Apply the following formula :

$$r_k = 1 - \frac{6\sum D^2}{N^3 - N}$$

where, $\sum D^2 = 20$ and N = 11

Now we get,

$$r_k = 1 - \frac{6 \times 20}{11^3 - 11} = 1 - \frac{120}{1320} = 1 - 0.091 = 0.909$$

Hence, there is a high degree of positive correlation, i.e., the two judges agree to the degree of 0.909. It indicates that the judges have fairly strong likes and dislikes so far as the ranking of the babies is concerned.

(b) When ranks are not given :

Illustration 16. From the following marks obtained by 10 students in Statistics and Economics, calculate Spearman's coefficient of rank correlation.

Statistics : 36  56  20  65  42  33  44  53  15  60
Economics  : 50  35  70  25  58  75  60  45  89  38

Solution.

Calculation of Rank Coefficient of Correlation

Statistics (X)    R1    Economics (Y)    R2    D = R1 - R2    D²
      36           4          50          5        -1          1
      56           8          35          2        +6         36
      20           2          70          8        -6         36
      65          10          25          1        +9         81
      42           5          58          6        -1          1
      33           3          75          9        -6         36
      44           6          60          7        -1          1
      53           7          45          4        +3          9
      15           1          89         10        -9         81
      60           9          38          3        +6         36

N = 10                                                   ΣD² = 318

Steps

1. Assign ranks to the given data. Ranks can be given by allotting the first rank to the biggest item, the second rank to the next biggest and so on, or by allotting the first rank to the smallest item, the second rank to the next smallest and so on. Whichever method of ranking is chosen must be followed for both the variables.

2. Find the difference of the two ranks (i.e., R1 - R2) and denote these differences by D.

3. Square these differences and find out the total ΣD².

4. Apply the formula.

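$$r_k = 1 - \frac{6\sum D^2}{N^3 - N}$$

Here, $\sum D^2 = 318$ and N = 10

$$r_k = 1 - \frac{6 \times 318}{10^3 - 10} = 1 - \frac{1908}{990} = 1 - 1.927 = -0.927$$

Applying Karl Pearson's formula to the ranks, we get

$$r = \frac{N\sum R_1R_2 - (\sum R_1)(\sum R_2)}{\sqrt{N\sum R_1^2 - (\sum R_1)^2}\;\sqrt{N\sum R_2^2 - (\sum R_2)^2}} = \frac{10 \times 226 - (55)(55)}{\sqrt{10 \times 385 - (55)^2}\;\sqrt{10 \times 385 - (55)^2}}$$

$$= \frac{2260 - 3025}{\sqrt{3850 - 3025} \times \sqrt{3850 - 3025}} = \frac{-765}{\sqrt{825} \times \sqrt{825}} = -0.927, \text{ which is same as before.}$$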
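The working of Illustration 16 can be sketched in Python: first rank the raw marks, then apply the rank difference formula, and finally confirm that the product moment formula applied to the ranks gives the same value. The helper `ranks_ascending` is a hypothetical name and assumes there are no repeated values.

```python
import math

statistics_marks = [36, 56, 20, 65, 42, 33, 44, 53, 15, 60]
economics_marks = [50, 35, 70, 25, 58, 75, 60, 45, 89, 38]

def ranks_ascending(values):
    """Rank 1 for the smallest value; assumes no repeated values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

r1 = ranks_ascending(statistics_marks)
r2 = ranks_ascending(economics_marks)
n = len(r1)

# Rank difference formula
sum_d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
rk = 1 - 6 * sum_d2 / (n ** 3 - n)

# Product moment formula applied to the ranks
numerator = n * sum(a * b for a, b in zip(r1, r2)) - sum(r1) * sum(r2)
denominator = (math.sqrt(n * sum(a * a for a in r1) - sum(r1) ** 2)
               * math.sqrt(n * sum(b * b for b in r2) - sum(r2) ** 2))
r_on_ranks = numerator / denominator

print(sum_d2, round(rk, 3), round(r_on_ranks, 3))  # 318 -0.927 -0.927
```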

Illustration 17. Ten entries are submitted for a competition. Three judges study each entry and rank them as follows :

Ranks given by :

Entry No. : 1  2   3  4  5  6  7  8   9  10
Judge A   : 9  3   7  5  1  6  2  4  10   8
Judge B   : 9  1  10  4  3  8  5  2   7   6
Judge C   : 6  3   8  7  2  4  1  5   9  10

Calculate the appropriate rank correlations to help you to answer the following questions :

(a) Which pair of judges agree the most?

(b) Which pair of judges disagree the most?

Solution.

Calculation of Rank Coefficient of Correlation

Entry   Judge A   Judge B   Judge C    RA - RB        RA - RC        RB - RC
 No.      RA        RB        RC       D     D²       D     D²       D     D²
  1        9         9         6       0     0       +3     9       +3     9
  2        3         1         3      +2     4        0     0       -2     4
  3        7        10         8      -3     9       -1     1       +2     4
  4        5         4         7      +1     1       -2     4       -3     9
  5        1         3         2      -2     4       -1     1       +1     1
  6        6         8         4      -2     4       +2     4       +4    16
  7        2         5         1      -3     9       +1     1       +4    16
  8        4         2         5      +2     4       -1     1       -3     9
  9       10         7         9      +3     9       +1     1       -2     4
 10        8         6        10      +2     4       -2     4       -4    16

N = 10                             ΣD² = 48       ΣD² = 26       ΣD² = 88

Applying the following formula, we get

$$r_k = 1 - \frac{6\sum D^2}{N^3 - N}$$

$r_k$ (between Judges A and B)

$$r_k = 1 - \frac{6 \times 48}{10^3 - 10} = 1 - \frac{288}{990} = 1 - 0.29 = +0.71$$

$r_k$ (between Judges A and C)

$$r_k = 1 - \frac{6 \times 26}{10^3 - 10} = 1 - \frac{156}{990} = 1 - 0.1576 = +0.8424$$

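$r_k$ (between Judges B and C)

$$r_k = 1 - \frac{6 \times 88}{10^3 - 10} = 1 - \frac{528}{990} = 1 - 0.5333 = +0.4667$$

(a) Since $r_k$ is maximum for the pair of judges A and C, they have the nearest approach, i.e., they agree the most.

(b) Since $r_k$ is minimum for the pair of judges B and C, they disagree the most.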
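The pairwise comparison of the three judges can be sketched in Python as follows. This is an illustrative check, not the book's procedure; `rank_correlation` is a hypothetical helper and `itertools.combinations` from the standard library is used to enumerate the pairs.

```python
from itertools import combinations

ranks = {
    "A": [9, 3, 7, 5, 1, 6, 2, 4, 10, 8],
    "B": [9, 1, 10, 4, 3, 8, 5, 2, 7, 6],
    "C": [6, 3, 8, 7, 2, 4, 1, 5, 9, 10],
}

def rank_correlation(ranks_1, ranks_2):
    """Spearman's rank coefficient for untied ranks."""
    n = len(ranks_1)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_1, ranks_2))
    return 1 - 6 * sum_d2 / (n ** 3 - n)

# One coefficient per pair of judges: (A, B), (A, C), (B, C)
results = {
    pair: rank_correlation(ranks[pair[0]], ranks[pair[1]])
    for pair in combinations(ranks, 2)
}

most_agreement = max(results, key=results.get)
most_disagreement = min(results, key=results.get)

for pair, value in results.items():
    print(pair, round(value, 4))
print("agree the most:", most_agreement)        # ('A', 'C')
print("disagree the most:", most_disagreement)  # ('B', 'C')
```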

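Illustration 18. The coefficient of rank correlation of marks obtained by 10 students in two subjects was found to be 0.5. It was later discovered that the difference in ranks in the two subjects obtained by one of the students was wrongly taken as 3 instead of 7. Find the correct coefficient of rank correlation.

Solution.

$$r_k = 1 - \frac{6\sum D^2}{N^3 - N}$$

Substituting the values in above formula, we get

$$0.5 = 1 - \frac{6\sum D^2}{(10)^3 - 10}$$

$$\frac{6\sum D^2}{990} = 1 - 0.5 = 0.5$$

$$6\sum D^2 = 0.5 \times 990$$

$$\sum D^2 = \frac{0.5 \times 990}{6} = 82.5$$

Corrected $\sum D^2 = 82.5 - (3)^2 + (7)^2 = 82.5 - 9 + 49 = 122.5$

$$\text{Corrected } r_k = 1 - \frac{6 \times 122.5}{(10)^3 - 10} = 1 - \frac{735}{990} = 1 - 0.7424$$

$r_k$ = + 0.2576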
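The correction above amounts to inverting the rank formula to recover ΣD², swapping the wrong squared difference for the right one, and applying the formula again. A minimal Python sketch follows; the function names are hypothetical helpers introduced for illustration.

```python
def sum_d2_from_rk(rk, n):
    """Invert rk = 1 - 6*sum(D^2)/(N^3 - N) to recover sum(D^2)."""
    return (1 - rk) * (n ** 3 - n) / 6

def rk_from_sum_d2(sum_d2, n):
    """Spearman's rank coefficient from sum(D^2) and N."""
    return 1 - 6 * sum_d2 / (n ** 3 - n)

n = 10
reported_sum_d2 = sum_d2_from_rk(0.5, n)           # 82.5
corrected_sum_d2 = reported_sum_d2 - 3**2 + 7**2   # remove wrong D², add right D²
corrected_rk = rk_from_sum_d2(corrected_sum_d2, n)

print(reported_sum_d2, corrected_sum_d2, round(corrected_rk, 4))  # 82.5 122.5 0.2576
```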

(c) When ranks are equal or repeated

Calculate the coefficient of rank correlation from the following data :

X : 48  33  40  9  16  16  65  25  15  57
Y : 13  13  24  6  15   4  20   9   6  19

Solution.

Calculation of Rank Coefficient of Correlation

 X     Y     R1     R2     D = R1 - R2     D²
48    13     8      5.5       +2.5        6.25
33    13     6      5.5       +0.5        0.25
40    24     7     10         -3.0        9.00
 9     6     1      2.5       -1.5        2.25
16    15     3.5    7         -3.5       12.25
16     4     3.5    1         +2.5        6.25
65    20    10      9         +1.0        1.00
25     9     5      4         +1.0        1.00
15     6     2      2.5       -0.5        0.25
57    19     9      8         +1.0        1.00

N = 10                               ΣD² = 39.5

Steps :

1. Assign the ranks to the given data. When two or more items are of equal value, they are assigned average ranks. For example, in X series the value 16 is repeated twice, so each is ranked (3 + 4)/2 = 3.5, and in Y series the two values of 13 are each given the rank (5 + 6)/2 = 5.5, and so on.

2. After obtaining ΣD², apply the formula. When equal ranks are assigned to some of the entries, an adjustment is made in the formula of rank correlation, i.e., $\frac{1}{12}(m^3 - m)$ is added to the value of ΣD². Here, m stands for the number of items whose ranks are repeated. In case there is more than one such group of values with a common rank, $\frac{1}{12}(m^3 - m)$ is added as many times as the number of such groups. The adjusted formula is as under :

$$r_k = 1 - \frac{6\left[\sum D^2 + \frac{1}{12}(m^3 - m) + \frac{1}{12}(m^3 - m) + \dots\right]}{N^3 - N}$$

Now we get,

$$r_k = 1 - \frac{6\left\{39.5 + \frac{1}{12}(2^3 - 2) + \frac{1}{12}(2^3 - 2) + \frac{1}{12}(2^3 - 2)\right\}}{10^3 - 10}$$

$$= 1 - \frac{6(39.5 + 1.5)}{990} = 1 - \frac{6 \times 41.0}{990} = 1 - \frac{246}{990} = 1 - 0.2485 = 0.7515$$

$r_k$ = 0.7515

Hence, there is high degree of positive correlation.
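The treatment of repeated values (average ranks plus the $\frac{1}{12}(m^3 - m)$ adjustment for each tied group) can be sketched in Python. The helper names `average_ranks` and `tie_correction` are assumptions introduced here, and the data are the ten pairs from the table above. Note that 246/990 works out to about 0.2485, so the sketch gives 0.7515.

```python
from collections import Counter

x = [48, 33, 40, 9, 16, 16, 65, 25, 15, 57]
y = [13, 13, 24, 6, 15, 4, 20, 9, 6, 19]

def average_ranks(values):
    """Rank 1 for the smallest value; equal values share the average rank."""
    positions = {}
    for position, value in enumerate(sorted(values), start=1):
        positions.setdefault(value, []).append(position)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def tie_correction(values):
    """Sum of (m^3 - m)/12 over every group of m repeated values."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)

rx, ry = average_ranks(x), average_ranks(y)
n = len(x)
sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
correction = tie_correction(x) + tie_correction(y)   # three groups with m = 2
rk = 1 - 6 * (sum_d2 + correction) / (n ** 3 - n)

print(sum_d2, correction, round(rk, 4))  # 39.5 1.5 0.7515
```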

346


Merits and Demerits of the Rank Method Merits :

M^hT " » - compared ,o Karl Pearson.

De^r;' - degree of correlation,

grouped frequency dis^ibnrion (bivtiaSs^l^n' " '

LIST OF FORMULAE

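1. Karl Pearson's Coefficient of Correlation

(a) Actual Mean Method

$$r = \frac{\sum xy}{N\,\sigma_x\,\sigma_y} = \frac{\sum xy}{N} \times \frac{1}{\sigma_x} \times \frac{1}{\sigma_y}$$

(b) Direct Method

$$r = \frac{\dfrac{\sum XY}{N} - \left(\dfrac{\sum X}{N}\right)\left(\dfrac{\sum Y}{N}\right)}{\sqrt{\dfrac{\sum X^2}{N} - \left(\dfrac{\sum X}{N}\right)^2}\;\sqrt{\dfrac{\sum Y^2}{N} - \left(\dfrac{\sum Y}{N}\right)^2}}$$

or

$$r = \frac{N\sum XY - \sum X \cdot \sum Y}{\sqrt{N\sum X^2 - (\sum X)^2}\;\sqrt{N\sum Y^2 - (\sum Y)^2}}$$

or

$$r = \frac{\sum XY - N\,\bar{X}\,\bar{Y}}{\sqrt{\sum X^2 - N\bar{X}^2}\;\sqrt{\sum Y^2 - N\bar{Y}^2}}$$

(c) Assumed Mean Method and Step Deviation Method

$$r = \frac{\sum dx\,dy - \dfrac{(\sum dx)(\sum dy)}{N}}{\sqrt{\sum dx^2 - \dfrac{(\sum dx)^2}{N}}\;\sqrt{\sum dy^2 - \dfrac{(\sum dy)^2}{N}}}$$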

Explanation of Symbols

r = Karl Pearson's Coefficient of Correlation.
x = (X - X̄), deviations taken from actual mean of X series.
y = (Y - Ȳ), deviations taken from actual mean of Y series.
σx = Standard deviation of X series.
σy = Standard deviation of Y series.
ΣX = Sum of the values of X series
ΣY = Sum of the values of Y series
ΣX² = Sum of squares of the values of X series
ΣY² = Sum of squares of the values of Y series
ΣXY = Multiplying X and Y values and obtaining the total
N = No. of pairs of observations
dx = (X - A), deviations taken from assumed mean of X series.
dy = (Y - A), deviations taken from assumed mean of Y series.
dxdy = Multiplying the deviations taken from assumed mean of X series with the deviations taken from assumed mean of Y series.

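2. Spearman's Rank Coefficient of Correlation

$$r_k = 1 - \frac{6\sum D^2}{N^3 - N}$$

When ranks are repeated :

$$r_k = 1 - \frac{6\left[\sum D^2 + \frac{1}{12}(m^3 - m) + \dots\right]}{N^3 - N}$$

$r_k$ = Spearman's rank correlation.
D = Difference of ranks.
N = Number of pairs of observations.
m = Number of times the value is repeated.

EXERCISES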


4.

5.

6. 7.

10.

11

12.

13.

14.

Questions ; ~ "

fea What is meant by correlaHon? wi,^^ t » Distrnguish betvê;^^;!^^::.^^ T™ ^^ —

Distinguish between : 'ô-elation from coeffidem of variation.

(a) Positive and negatives correlation

ib) Lmear and non-li„ear correlation,' and

ic) Simple, partial and multiple correlation Give three examples of perfect correlation. '

ê^pl^r between X and y is .ro, does it mean that variable

d^r Smir born and export over last

correlation): correlations (positive, negaL or no J

(/) Sale of woollen garments and the day temperature-

dCflllrrtfou^^^^^^^^^^^ /J" 'he po,„« .He scaler

Distinguish between covariance and variance.

ia) Define Spearman's rank correlation.

ib) What are the limits of rk

! ' âSl.^/^- we .legate product moment

r ..correlation be equal to the val^^^? ^^ ^ -d Y, will this

8.

9.

2 6

3 5

5 7

6 8

8 12

9 11

15. What are the advantages of Spearman's rank correlation over Karl Pearson's correlation coefficient? Explain the method of calculating Spearman's rank correlation coefficient.

16. (a) How is Karl Pearson's coefficient of correlation defined?

(b) What are the limits of the correlation r?

(c) If r = +1 or r = -1. What kind of relationship exists between X and Y?

17. Write short notes on :

(a) Spurious correlation, and

(b) Positive and negative correlation.

Problems :

1. Give the following pairs of value of variables of capital em^ployed and^profit^. Capital employed (in crores of Rs) (X) Profit (in lacs of Rs) (Y)

(a) Make a scatter diagram.

(b) Do you think that there is any correlation between profit and capital employed? Is it positive or negative? Is it high or low?

(c) By graphic inspection, draw an estimating line.

2. Plot the following data as a scatter diagram and comment over the result :

X : 11  10  15  13  10  16  13  8  17  14
Y :  6   7   9   9   7  11   9  6  12  11

3. Following are the heights and weights of 10 students in a class. Draw a scatter diagram and indicate whether the correlation is positive or negative.

Height (in inches) : 72  60  63  66  70  75  58  78  72  62
Weight (in kg)     : 65  54  55  61  60  54  50  63  65  50

4. Construct the scatter diagram of the data given below and interpret it.

Average value (in Lakhs Rupees)

Year            : 1990  1991  1992  1993  1994  1995  1996
Cotton (import) :   47    64   100    97   126   203   171
Cloth (export)  :   70    85   100   103   111   139   133

5. Draw a scatter diagram for the data given below and interpret it.

X : 10  20  30  40  50  60  70  80
Y : 32  20  24  36  40  28  48  44

6. Draw a scatter diagram of the following data :

X : 15  18  30  27  25  23  30
Y :  7  10  17  16  12  13   9

f!

350 •

- , Statistics for Economics-XI

X Series

Y Series

Arithmetic Mean Square of deviations from Arithmetic Mean

■ -------J__

Summation of products of deviatioZl^lTZTV'^----

means = 122 ^ ^nd Y series from their respective

Number of points of values = 15

15 25

18 25

39 41

24 27 30 36 27 31 33 35

Calculate product moment of correlation between X and Y ■

- ^^ - S 2 2 - -

42 41

48 45

[r = 0.98]

91 95 49 40

SliL" Z r ''' whea, and ^

; - « .^o 550

16 17 20 19 19 20 25 27

11. Calculate Karl Pearson's coefficient nf i l ^ '' = 0-1621]

the following 10 firms : ^^ correlation between the sales and expenses of

Firms 12 3

■ ™ - - - J J

Expenses . n 13 14 ..

(/« Rs '000) ^ 15 14 13 j3

12. Ten students got the following percentage of b ■ c • f'" = Ser,al No. ; 1 ^ 'J' ^^ Statistics and Mathematics -

Statistics : gO 60 51 76 58 J J ' ^ 10

Mathematics 45 yj ^^ 62 64 72 56 58

. Calculate product moment correlation ^^ ^^ ^^ ^^ ^^ 60

13. Calculate correlation coeffident between X the nnn,K . • ^^ = " 3ri Y, the number of rain coats solrl in

ot rainy days per month

Interpret the results. ^ ^ certain shop for 12 months.

^ =Vl4 8 18 10 22 9 3 ,

^ 11 20 12 15 73'' ^^

^ 3 4 7 10 11 29

[r = - 0.67]


351

^ f The deviations from their means of two senes (X and Y) are given below . Ky . 4 -3 -2 -10+1+2+3+4

: j -3 -4 0+4+1 .2 -2 -1

< Calculate Karl Pearson's coefficient of correlation and interpret the resu.t. ^^ ^ ^^

15 Find the product moment correlation of the following data : ' X • 1 2 3 4 56 7

Y ; 9 8 10 12 11 13 14

8 16

16.

9 15

[r = +0.931

Calculate the correlation coefficient of the marks obtained by 12 students m Mathematics and Statistics and interpret it.

Students

Marks (in Maths) Marks (in Statis.)

A 50 22

B 54 25

C 56 34

D 59 28

£ 60 26

F 62 30

G 61 32

H 65 30

7 7 K ^ 67 71 71 74 28 34 36 40

[r = + 0.783]

67

68

68 72

69 VO

71 73 69 70 [r = 0.47]

17. The height of fathers and sons are given below : Height of fathers (in inches) : 65 66 67 Height of sons (in inches) : 67 68 64

Calculate Karl Pearson's coefficient of correlation. , A

18 Find Karl Pearson's coefficient of correlation from the following index numbers and

'f;'" «''«- -''

Costofltvtng : 98 99 [r = +0.85]

19. Find the product moment correlation between sales and expenses of the following 10 firms.

Firms    :  1   2   3   4   5   6   7   8   9  10
Sales    : 50  50  55  60  65  65  65  60  60  50
Expenses : 11  13  14  16  16  15  15  14  13  13

[r = +0.797]

20. Calculate the coefficient of correlation for the following ages of husbands and wives in years at the time of their marriage.

Age of husbands : 23  27  28  28  29  30  31  33  35  36
Age of wives    : 18  20  22  27  21  29  27  29  28  29

[r = +0.82]

21 Find suitable coefficient of correlation for the following data : .^iFertUizers used ^ tons) : 15 18 20 24 30^ 35^

Productivity (in tons) ■ ^^

40 50 150 160 (r = +0.99]

352 22.

23.

P , Statistics for Economics-XI

« age of ca„ and annua, matoenance

Age of cars (years) ; 2 4

Annual maintenance ■. I6OO 1 Ton lonn ^ ^ 10 12 Co^MmRs) 1800 1900 1700 2100 2000

Calculate coefficient of correlation K,. p , ^^ = +0.836]

population and death rate ^ ^^^^^^ between the density of

C 400 14

Citites Density Death rate

A 200 10

B 500 16

D

700 20

E 600 17

24.

F 300 13

[r = +0.988]

Total of the deviation of X = -170 Total of the deviation of Y = -20 Total of squares of deviation of X = 2264 lotal of the squares of deviation of Y - 8288

series : Arithmetic Average = 65

•• Standard deviation = 23 33 r series : Arithmetic Average = 66 Standard deviation =14 9

Rauk Correlation ir = +0.78]

26. Calculate Spearman' r , „on from the following data :

^ •• 15 10 I '' 20 25 40

^ 25 16 12 8

27. The following are the marks obtained fout nf mm u t'" = +0-143] emp oyment interview held by two Ldependem - ^ '

coefficient of correlation. ^n^ependem judges separately Calculate the rank

Candidates ■■ A n n r^

'' - " n

Judge X Judge Y

20 22

20 15

14

10

8

11

12

13 9 [r = 0.721]

353


28 Two judges in a beauty competition rank the 12 entries as follows

X : 1 2 3 4 5 6 7 8 9 10

y. 12 96 10 354782

Calculate rank coefficient of correlation.

29 Calculate rank coefficient of correlation of the followmg data : ' X : 80 78 75 75 68 67 60

Y : 12 13 14 14 14 16 15 ^^ _ ^^^^

30. Twelve entries were submitted in a flower show competition. They were ranked by

two judges as under : ^ , 7 8 9 10 11

^ 7 8 2 1 9 3 12 11 4 10 6

futeB : ^ : 5 3 11 2 12 10 5 9 7 Calculate Spearman's rank correlation. 31 Calculate the coefficient of rank correlation from the following data X ■ 48 33 40 9 16 16 65 25 15 y . 13 13 24 6 15 4 20 9 6

11 12 11 1 [r = -0.454]

59 17

12 5 8

[r = +0.86]

57 19

[r = +0.73]

32. Calculate rank coefficient of correlation between years of service and efficiency rating.

Persons           :  A   B   C   D   E   F   G   H   I   J
Years of Service  : 24  30  12  25  29  19  16  10  11   7
Efficiency rating : 66  51  84  66  45  81  72  97  92  70

[r = -0.78]

33. From the following data calculate coefficient of correlation by the method of rank

95 70 60 80 81 150 115 110 140 142

X Y

75 120

68 134

50 100

[r = +0.93]

Chapter 12

INTRODUCTION TO INDEX NUMBERS

1. Introduction
2. Definition
3. Types of Index Numbers
4. Problems in Construction of Index Numbers
5. Methods of Constructing Index Numbers
6. Consumer Price Index (CPI)
7. Index of Industrial Production (IIP)
8. General Uses of Index Numbers
9. Inflation and Index Numbers
10. Limitations of Index Numbers
11. List of Formulae

INTRODUCTION

■ have r'r' we

Quantities; in measures of rp r ion-ab j^^^^^ Po-'ional values :

Deviation, Mean Deviation and tondat Devia,l"t""

two variables. From the above tookAe s™ ' association of

chapter we will learn how to obtS sum jrr"^ ""T? "-is

variables. "" """"nary measures of change in a group of related

ail .^l^tZSieThlt:rtTp^t I P^.ces of

price of some commodities may'alat » 'T -™°''it,es,

illustration : ^ ^^^ ^^^^ "s examine from the following

Prices of vegetable oil and tea

Commodity                          2000    2005
Vegetable oil (per litre in Rs)      40      80
Tea (per kg in Rs)                  100     150

We can measure the change in the prices of vegetable oil and tea in two ways :

(a) Actual Difference

(b) Relative Change (Price Relative)

(a) Actual difference. The actual difference in price is the difference between the current year price and the base year price.

Actual difference = Current year price - Base year price

Current year : 2005    Base year : 2000

Difference in :
Vegetable oil (per litre) : 80 - 40 = Rs 40
Tea (per kg)              : 150 - 100 = Rs 50

We find that the price of vegetable oil increased by Rs 40 and that of tea by Rs 50 from the year 2000 to 2005. From this, it appears that the increase in the price of tea is more than the increase in the price of vegetable oil.

(b) Relative change (price relative). The relative change in prices is the actual difference in prices relative to the original price. From the above example :

$$\text{Relative change} = \frac{\text{Actual difference}}{\text{Base year price}} \quad\text{or}\quad \frac{\text{Current year price}}{\text{Base year price}} - 1$$

For vegetable oil : $\dfrac{80 - 40}{40} = 1$ or $\dfrac{80}{40} - 1 = 1$

For tea : $\dfrac{150 - 100}{100} = 0.5$ or $\dfrac{150}{100} - 1 = 0.5$

This change can also be expressed in percentage :
For vegetable oil : 1 x 100 = 100%
And for tea : 0.5 x 100 = 50%

The ratio of prices in two years is called price relative which is a pure number and this price relative for a single commodity even may be called an index number of that commodity.

However, if we calculate the rise in percentage taking 2000 as the base year, we find that the rise is 100% in case of vegetable oil and 50% in case of tea.

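Symbolically,

$$\text{Price Relative} = \frac{\text{Current Year Price}}{\text{Base Year Price}} \times 100 = \frac{p_1}{p_0} \times 100$$

$p_1$ = price of the current year (2005)
$p_0$ = price of the base year (2000)

Vegetable oil : $\dfrac{80}{40} \times 100 = 200$

Tea : $\dfrac{150}{100} \times 100 = 150$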
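The three measures discussed above (actual difference, relative change and price relative) can be put side by side in a short Python sketch. The function name `price_changes` is a hypothetical helper; the prices are those of the vegetable oil and tea example.

```python
# (base year 2000 price, current year 2005 price) in Rs
prices = {"Vegetable oil": (40, 80), "Tea": (100, 150)}

def price_changes(base_price, current_price):
    """Return actual difference, relative change and price relative."""
    actual_difference = current_price - base_price
    relative_change = actual_difference / base_price
    price_relative = current_price / base_price * 100
    return actual_difference, relative_change, price_relative

results = {name: price_changes(p0, p1) for name, (p0, p1) in prices.items()}

for name, (difference, change, relative) in results.items():
    print(name, difference, change, relative)
# Vegetable oil 40 1.0 200.0
# Tea 50 0.5 150.0
```

Tea shows the larger actual difference (Rs 50 against Rs 40), while vegetable oil shows the larger relative change (100% against 50%).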

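(a) The increase in the price of vegetable oil is 100% and that of tea is 50%, taking 2000 = 100.

(b) The relative comparison shows that the increase in the price of vegetable oil is more than that of tea.

Thus, change in price relatives is more important than just the actual difference in prices.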

As vegetable oil and tea are measured in different units of measurement, their absolute differences cannot be combined. But relative changes or price relatives are pure numbers, namely $\dfrac{\text{Rs } 80}{\text{Rs } 40}$ for vegetable oil and $\dfrac{\text{Rs } 150}{\text{Rs } 100}$ for tea, and can be combined to obtain the arithmetic mean of price relatives, i.e.,

$$\frac{\dfrac{80}{40} + \dfrac{150}{100}}{2} = \frac{2 + 1.5}{2} = 1.75$$
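The averaging of price relatives can be sketched in Python as below. This is only an illustration of the arithmetic; multiplying the mean by 100 to express it on the base = 100 convention is an assumption made here for the printout.

```python
# Price relatives as pure ratios: current year price / base year price
price_relatives = {"Vegetable oil": 80 / 40, "Tea": 150 / 100}

mean_relative = sum(price_relatives.values()) / len(price_relatives)
combined_index = mean_relative * 100   # expressed with base year = 100

print(mean_relative)   # 1.75
print(combined_index)  # 175.0
```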

it c!:- - "seeher ,s ,.,5 . ,00 = 1.5. Now,

mcreasedby27%.,nthe!ame„ay cltsi'XL''"" "f - i

hat Dearness Allowance of Governlm eij^ »

".dex goes up. Similarly, we come acroTs rdex ^b '"T"''-production, sales, export, prices, wages «c Thev ar7 f , "Sncultural and industrial economy. Index numbers are the bafo^rlX^^VrrnX:'™''''''"-

DEFINITION

orhif Je-chats^—trd r Tllnr IrT " "r »

commodtty to another. Tley are usually IXJIZI^Z^


According to Spiegel, "An index number is a statistical measure designed to show changes in variables or a group of related variables with respect to time, geographic location or other characteristics."

According to Croxton and Cowden, "Index numbers are devices for measuring differences in the magnitude of a group of related variables".

Thus, the characteristics of index numbers can be highlighted as under :

1. Index numbers are expressed in terms of percentages so as to show the extent of relative change. However, the percentage sign (%) is never used.

2. Index numbers are relative or comparative measurements of a group of items. They compare changes taking place over time or between places or like categories such as schools, persons, hospitals etc.

3. Index numbers are called a specialised type of average in the sense that they help us in comparing changes in series which are in different units. Averages like mean, median and mode can be used to compare only those series which are expressed in the same unit.

4. The technique of index numbers is utilised in measuring changes in magnitudes which are not capable of direct measurement due to the composite and complex character of the phenomenon. Examples of such phenomena or magnitudes are price level, 'cost of living', prices of a specified list of commodities, volume of production in different sectors of an industry, production of various agricultural crops, 'business or economic activity' etc. Changes in business activity in a country are not capable of direct measurement, but it is possible to study relative changes in business activity by studying the variation in the values of some such factors which affect business activity and which are capable of direct measurement.

TYPES OF INDEX NUMBERS

The usefulness of any index number lies in the types of questions it can answer. Each index number is designed for a particular purpose, and it is the purpose that determines its method of construction.

There are various kinds of index numbers. In economics and business, they can be broadly classified as under :

1. Wholesale Price Index (WPI)

2. Consumer Price Index (CPI) or Cost of Living Index

3. Index number of Industrial Production (IIP)

4. Index number of Agricultural Production (IAP)

5. Sensex

1. Wholesale price index (WPI) is used to measure the general price level where we are required to obtain the wholesale prices of industrial, agricultural and other products from wholesale market. It does not include the items pertaining to services like repairing charges, barber charges etc. WPI is used to eliminate the effect of

358


the Priva" str " "he public and

4. Index number of Agricultural Production (IAP) is used to study the rise and fall of the yield of principal crops from one period to another.

5. Sensex is a useful guide for the investors in the stock marter If rh. • ■

appropnate time for mvestment. The rise in sensex at the highest level reflects the base''??S'valut'oft°hT'°™ Bombay Stock Exchange Sensitive todex with 1978-79 as

iiuinoer will replace wholesale price index. Producers Price Index /PPT^

PROBLEMS IN CONSTRUCTION OF INDEX NUMBERS

Following are the important problems which must be well defined for the construction of index numbers :

1. Purpose : Every index number has its own particular uses and limitations. The first and foremost problem in the construction of index numbers is in regard to the objective or the purpose for which they are required. It is important to know what is to be measured and how these measures are to be used. If the purpose is to measure the general price level, then the wholesale price index number is used. If the purpose is to measure the cost of living of middle class families, working class (labour) or agricultural workers in a particular region or city, then the consumer price index number is used. If the object is to measure relative change in industrial production, then the index number of industrial production is to be used.

2. Selection of base period : When comparison is to be made between different time periods or different places, some point of reference is to be decided. This is called the base. In the above illustration about prices of vegetable oil and tea, we have taken the year 2000 as the base year and 2005 as the current year for our calculation of index numbers. The base is assigned the value of 100%.

For making the comparison over a period of time it must be remembered :

(a) The base period should not be either too short or too long : It should be neither less than a month nor more than a year from calculations' point of view.

(b) The base period should not be too near or too far : This is because people usually prefer to compare present conditions with conditions in a base or reference period that is not too far back in time. If the base period is too far, the comparison becomes meaningless. Due to the introduction of new commodities and changes in habits, taste and fashion in the economy, many commodities may go out of use. In such a situation it becomes necessary to shift the base period.

(c) The base period should be a normal and representative period : The base period should be free from all sorts of abnormalities and random or irregular fluctuations like earthquakes, wars, floods, famines, labour strikes, lockouts, economic boom and depression.

(d) It should be a period for which actual data is available.

Fixed base and chain base : If the period of comparison is kept fixed for all current years, it is called fixed base period. However sometimes chain base method is used, in which the changes in the prices for any given year are compared with prices in the preceding year and not with the fixed year. Naturally, the chain base method gives a better picture than what is obtained by fixed base method. However, much would depend upon the purpose of constructing the index.

3. Selection of items : Collection of data is a special problem in constructing index numbers, since there is a large variety of goods and prices. Care also must be taken that data from unrelated commodities or periods are not grouped together for the calculation of price index. If the number of the commodities is too large, a choice

360


of some representative items has to be made. On the other hand, inclusion of too few Items would make the index number unrepresentative of he ™

mcrdfal, '' ^^^^ calculations, it is nof^"s bk «

include all commodities in construction of index numbers

be considered :

(«) Commodities selected should be relevant and representative of the group according to the purpose of mdex number. For example, in const uction of wholesale price mdex number to know the general price level, we sZw nclude wholesale prices of some major mdustrial and agricultural ^olmoS and Other goods and services. In the same way for coLtruction ofZsle

^^ — which-sm^nt toT W ^^^ ^^^be neither too

ot data. Ihere is always a chance of getting spurious or misleading results if the

w ^^ r - ^be changes m mdustrfalToI

we must collect the prices relating to production of various goods of factories For

nlrh^'. ' ^^ ^^^ consideration, w! can re y onT^^^^^^^

published market reports by business houses. Chamber of Commerce aS

L utilised " Pl-- also

I


361

6. Choice of an average : For constructing an index number any average such as mean, median, mode, geometric mean and harmonic mean can be used. From the practical point of view, median and mode are unsuitable because of their being erratic. The geometric mean and harmonic mean are difficult to calculate; hence, the arithmetic mean is used. However, with the development of the use of electronic computers, the use of the geometric mean is also becoming popular.

7. System of weighting : In order to allow each commodity to have a reasonable influence on the index, it is advisable to use a suitable weighting system. Unweighted index numbers are those where all commodities are given equal importance. But in most cases different commodities are given different degrees of importance; therefore, weights are assigned to the various items.

The method of weighting used would depend on the purpose of the index. Weighting may be done according to : (a) Value or quantity produced, (b) Value or quantity consumed, and (c) Value or quantity sold. When the quantity is the basis of weight it is called quantity weighting and when the value is the basis, it is called value weighting. Weights may be either implicit (arbitrary) or explicit (actual).

8. Choice of method : There are various methods of calculating index numbers such as the aggregative method or the price relative method. Various methods have been proposed for calculation of weighted index numbers such as Laspeyre's method, Paasche's method, Dorbish and Bowley's method, Marshall Edgeworth's method, Kelley's method and Fisher's method. Fisher's method is considered as ideal for constructing index numbers. No single formula can be said to be appropriate for all types of index numbers and as such the choice of a formula will have to be made taking into account the object of the index number, the data available and the resources at the disposal of the person or organisation constructing the index number.

METHODS OF CONSTRUCTING INDEX NUMBERS

The methods of construction of index numbers are given below :

[Chart : Methods of constructing index numbers]


Unweighted Index Numbers

I. Simple Aggregative of Actual Price Method

This is the simplest method of calculating index numbers. In this method, total of the current year prices for the various commodities is divided by the total of base year prices and the quotient is multiplied by 100.

Symbolically,

Price Index: $P_{01} = \dfrac{\sum p_1}{\sum p_0} \times 100$

Quantity Index: $Q_{01} = \dfrac{\sum q_1}{\sum q_0} \times 100$

where,

$P_{01}$ = Current year price index number
$Q_{01}$ = Current year quantity index number
$\sum p_1$ = Total of current year prices for various commodities
$\sum p_0$ = Total of base year prices for various commodities
$\sum q_1$ = Total of current year quantities for various commodities
$\sum q_0$ = Total of base year quantities for various commodities

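Illustration 1. Calculate price index number for 2005 taking 1995 as the base year from the following data by simple aggregative method.

| Commodities | A | B | C | D | E |
|---|---|---|---|---|---|
| Prices in 1995 (in Rs) | 100 | 80 | 160 | 220 | 40 |
| Prices in 2005 (in Rs) | 140 | 120 | 180 | 240 | 40 |

Solution.

Construction of Price Index Number

| Commodities | Price in 1995 (Rs) $p_0$ | Price in 2005 (Rs) $p_1$ |
|---|---|---|
| A | 100 | 140 |
| B | 80 | 120 |
| C | 160 | 180 |
| D | 220 | 240 |
| E | 40 | 40 |
| Total | $\sum p_0 = 600$ | $\sum p_1 = 720$ |

Steps

1. Aggregate the current year prices of various commodities ($\sum p_1$).

2. Aggregate the base year prices of various commodities ($\sum p_0$).

3. Apply the following formula :

$P_{01} = \dfrac{\sum p_1}{\sum p_0} \times 100$

Here,

$P_{01}$ = Price index number of the current year (2005)
$\sum p_1$ = Total of current year prices for various commodities
$\sum p_0$ = Total of base year prices for various commodities

Now, we get

$P_{01} = \dfrac{720}{600} \times 100 = 120$

Thus, the price index number of 2005 is 120. In other words, there is net increase in the prices of commodities in the year 2005 to the extent of 20% as compared to 1995.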

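Illustration 2. Calculate the Quantity Index Number for 2004 with base 1995 from the following data :

| Commodities | A | B | C | D | E |
|---|---|---|---|---|---|
| 1995 Quantity (kg) | 50 | 40 | 10 | 5 | 2 |
| 2004 Quantity (kg) | 80 | 60 | 20 | 10 | 6 |

Solution.

Calculation of Quantity Index Number

| Commodities | Quantity in 1995 (in kg) ($q_0$) | Quantity in 2004 (in kg) ($q_1$) |
|---|---|---|
| A | 50 | 80 |
| B | 40 | 60 |
| C | 10 | 20 |
| D | 5 | 10 |
| E | 2 | 6 |
| Total | $\sum q_0 = 107$ | $\sum q_1 = 176$ |

$Q_{01} = \dfrac{\sum q_1}{\sum q_0} \times 100$

Here,

$Q_{01}$ = Quantity index no. of 2004, $\sum q_0 = 107$, and $\sum q_1 = 176$

$Q_{01} = \dfrac{176}{107} \times 100 = 164.48$

Thus, the quantity index number of 2004 is 164.48. In other words, there is net increase in the quantity of commodities in the year 2004 to the extent of 64.48% as compared to 1995.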
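The simple aggregative calculations of Illustrations 1 and 2 can be checked with a short Python sketch. This is only an illustration and not part of the original text: the function name `simple_aggregative_index` is hypothetical, and the first 1995 quantity (50) is an assumption inferred from the stated total of 107.

```python
def simple_aggregative_index(current, base):
    """Simple aggregative index: (sum of current values / sum of base values) x 100."""
    return sum(current) * 100 / sum(base)


# Illustration 1: prices (Rs) of commodities A to E
prices_1995 = [100, 80, 160, 220, 40]   # base year, total 600
prices_2005 = [140, 120, 180, 240, 40]  # current year, total 720
price_index = simple_aggregative_index(prices_2005, prices_1995)

# Illustration 2: quantities (kg) of commodities A to E
# Assumption: the first 1995 quantity (50) is inferred from the total of 107.
qty_1995 = [50, 40, 10, 5, 2]           # base year, total 107
qty_2004 = [80, 60, 20, 10, 6]          # current year, total 176
quantity_index = simple_aggregative_index(qty_2004, qty_1995)

print(f"Price index for 2005:    {price_index:.2f}")
print(f"Quantity index for 2004: {quantity_index:.2f}")
```

The quantity index comes out as 164.49 when rounded to two places; the figure 164.48 in the worked solution is the same quotient truncated rather than rounded.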



Illustration 3. Compute index numbers for the years 1996 to 2000 from the following data (Base Year 1995).

| Year | 1995 | 1996 | 1997 | 1998 | 1999 | 2000 |
|---|---|---|---|---|---|---|
| Price | 10 | 14 | 16 | 20 | 22 | 24 |

Solution.

Calculation of Index Numbers (Base year 1995)

| Year | Price | Price Index Number $\frac{p_1}{p_0} \times 100$ |
|---|---|---|
| 1995 | 10 | 100 |
| 1996 | 14 | $\frac{14}{10} \times 100 = 140$ |
| 1997 | 16 | $\frac{16}{10} \times 100 = 160$ |
| 1998 | 20 | $\frac{20}{10} \times 100 = 200$ |
| 1999 | 22 | $\frac{22}{10} \times 100 = 220$ |
| 2000 | 24 | $\frac{24}{10} \times 100 = 240$ |

Here, $p_1$ = Price of the current year

$p_0$ = Price of the base year
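The index series of Illustration 3 can be reproduced with a few lines of Python. This is a minimal sketch for checking the arithmetic; the function name `index_series` is hypothetical and does not come from the text.

```python
def index_series(prices_by_year, base_year):
    """Express each year's price as a percentage of the base-year price."""
    base_price = prices_by_year[base_year]
    return {year: price * 100 / base_price
            for year, price in prices_by_year.items()}


# Prices from Illustration 3 (base year 1995)
prices = {1995: 10, 1996: 14, 1997: 16, 1998: 20, 1999: 22, 2000: 24}
series = index_series(prices, base_year=1995)

for year, value in series.items():
    print(year, value)
```

Changing `base_year` shifts the whole series to a different base without touching the price data.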

Limitations

1. No weight is given to the relative importance of items

2. Index is influenced by the items with large unit prices.

II. Simple Average of Price Relatives Method

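In this method, the price relative of each item is first calculated as $\frac{p_1}{p_0} \times 100$. The index number by this method is the arithmetic mean, median or geometric mean of the price relatives.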


Illustration 4. Construct price index number for 2005 taking 2000 as the base year from the following data by simple average of price relatives method.

| Commodities | A | B | C | D | E |
|---|---|---|---|---|---|
| Price in 2000 (in Rs) | 100 | 80 | 160 | 220 | 40 |
| Price in 2005 (in Rs) | 140 | 120 | 180 | 240 | 40 |

Solution. 'A price relative is the price of the current period expressed as a percentage of the price at the base period.'

Construction of Price Index Number

| Commodities | Prices in 2000 ($p_0$) | Prices in 2005 ($p_1$) | Price Relatives $\frac{p_1}{p_0} \times 100$ |
|---|---|---|---|
| A | 100 | 140 | $\frac{140}{100} \times 100 = 140$ |
| B | 80 | 120 | $\frac{120}{80} \times 100 = 150$ |
| C | 160 | 180 | $\frac{180}{160} \times 100 = 112.5$ |
| D | 220 | 240 | $\frac{240}{220} \times 100 = 109.1$ |
| E | 40 | 40 | $\frac{40}{40} \times 100 = 100$ |
| Total | | | $\sum \frac{p_1}{p_0} \times 100 = 611.6$ |

Steps :

1. Calculate the price relatives of current year ($\frac{p_1}{p_0} \times 100$).

2. Aggregate the calculated price relatives.

3. Divide the total of price relatives by number of commodities.

$P_{01} = \dfrac{\sum \left(\frac{p_1}{p_0} \times 100\right)}{N}$

Here, $P_{01}$ = Price index number of current year (2005)

$\frac{p_1}{p_0} \times 100$ = Price relatives of current year

N = Number of commodities

Now we get, $P_{01} = \dfrac{611.6}{5} = 122.32$

Thus, the price index number of 2005 is 122.32. In other words, there is net increase in the prices of commodities in the year 2005 to the extent of 22.32% as compared to 2000.
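The simple average of price relatives can be sketched in Python as below, using the arithmetic mean and the data of Illustration 4. The function names `price_relatives` and `simple_average_of_relatives` are hypothetical, chosen only for this illustration.

```python
def price_relatives(current, base):
    """Price relative of each commodity: (p1 / p0) x 100."""
    return [p1 * 100 / p0 for p1, p0 in zip(current, base)]


def simple_average_of_relatives(current, base):
    """Arithmetic mean of the price relatives."""
    relatives = price_relatives(current, base)
    return sum(relatives) / len(relatives)


# Illustration 4: prices (Rs) of commodities A to E
prices_2000 = [100, 80, 160, 220, 40]   # base year
prices_2005 = [140, 120, 180, 240, 40]  # current year

relatives = price_relatives(prices_2005, prices_2000)
index_2005 = simple_average_of_relatives(prices_2005, prices_2000)

print([round(r, 1) for r in relatives])
print(round(index_2005, 2))
```

Because every relative enters the mean with the same weight, a commodity with a high unit price (such as D) has no more influence than a cheap one (such as E).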

Merits

1. Index number is not influenced by extreme items. Equal importance is given to all the items.

Limitations

1. The price relatives are assumed to have equal importance. This assumption may not always be correct.

2. There is a problem of selecting an appropriate average.

Weighted Index Numbers

In this method, appropriate weights are assigned to various commodities to reflect their relative importance.

I. Weighted Aggregative Method

Weights are assigned to the various items. There are various methods of assigning weights, such as Laspeyre's method, Paasche's method, Dorbish and Bowley's method, Marshall Edgeworth's method, Kelley's method and Fisher's method. Fisher's method is considered ideal for constructing index numbers. According to the syllabus of Class XI, we are discussing here Laspeyre's and Paasche's methods of constructing index numbers.

| Method | Price Index | Quantity Index |
|---|---|---|
| Laspeyre's Method | $P_{01} = \dfrac{\sum p_1 q_0}{\sum p_0 q_0} \times 100$ | $Q_{01} = \dfrac{\sum q_1 p_0}{\sum q_0 p_0} \times 100$ |
| Paasche's Method | $P_{01} = \dfrac{\sum p_1 q_1}{\sum p_0 q_1} \times 100$ | $Q_{01} = \dfrac{\sum q_1 p_1}{\sum q_0 p_1} \times 100$ |

Laspeyre's method is very widely used, since it uses base year weights, which are not changed from one year to the next.

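Value Index Number = $\dfrac{\text{Current year values}}{\text{Base year values}} \times 100$

$V_{01} = \dfrac{\sum p_1 q_1}{\sum p_0 q_0} \times 100$

or

$V_{01} = \dfrac{\sum V_1}{\sum V_0} \times 100$

where

$P_{01}$ = Price index number
$Q_{01}$ = Quantity index number
$V_{01}$ = Value index number
$p_1$ = Current year price
$p_0$ = Base year price
$q_1$ = Current year quantity
$q_0$ = Base year quantity
$V_1$ = Current year value ($\sum p_1 q_1$)
$V_0$ = Base year value ($\sum p_0 q_0$)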

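Illustration 5. Construct price, quantity and value index numbers for 2005 with 1996 as the base year from the following data, using Laspeyre's and Paasche's methods.

| Commodities | 1996 Base Year Price | 1996 Base Year Quantity | 2005 Current Year Price | 2005 Current Year Quantity |
|---|---|---|---|---|
| A | 10 | 30 | 12 | 50 |
| B | 8 | 15 | 10 | 25 |
| C | 6 | 20 | 6 | 30 |
| D | 4 | 10 | 6 | 20 |

Solution.

Construction of Price Index Numbers

| Commodities | $p_0$ | $q_0$ | $p_1$ | $q_1$ | $p_0 q_0$ | $p_0 q_1$ | $p_1 q_0$ | $p_1 q_1$ |
|---|---|---|---|---|---|---|---|---|
| A | 10 | 30 | 12 | 50 | 300 | 500 | 360 | 600 |
| B | 8 | 15 | 10 | 25 | 120 | 200 | 150 | 250 |
| C | 6 | 20 | 6 | 30 | 120 | 180 | 120 | 180 |
| D | 4 | 10 | 6 | 20 | 40 | 80 | 60 | 120 |
| Total | | | | | $\sum p_0 q_0 = 580$ | $\sum p_0 q_1 = 960$ | $\sum p_1 q_0 = 690$ | $\sum p_1 q_1 = 1150$ |

where

$p_1$ = price of current year (2005)
$q_1$ = quantity of current year (2005)
$p_0$ = price of base year (1996)
$q_0$ = quantity of base year (1996)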

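(A) Laspeyre's Method : In this method base year quantities are taken as weights.

Steps. (Price Index Number)

1. Multiply current year prices of various commodities with base year weights and obtain $\sum p_1 q_0$.

2. Multiply base year prices of various commodities with base year weights and obtain $\sum p_0 q_0$.

3. Divide $\sum p_1 q_0$ by $\sum p_0 q_0$ and multiply the quotient by 100. The above steps give us the price index number.

Laspeyre's Price Index Number :

Symbolically,

$P_{01} = \dfrac{\sum p_1 q_0}{\sum p_0 q_0} \times 100 = \dfrac{690}{580} \times 100 = 118.96$

Thus, the price index number of 2005 is 118.96. In other words, there is net increase in the prices of commodities in the year 2005 to the extent of 18.96% as compared to 1996.

Laspeyre's Quantity Index :

$Q_{01} = \dfrac{\sum q_1 p_0}{\sum q_0 p_0} \times 100 = \dfrac{960}{580} \times 100 = 165.52$

Thus, the quantity index number of 2005 is 165.52. In other words, there is net increase in quantity of commodities in the year 2005 to the extent of 65.52% as compared to 1996.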


(B) Paasche's Method : In this method current year quantities are taken as weights.

Steps. (Price Index Number)

1. Multiply current year prices of various commodities with current year weights and obtain $\sum p_1 q_1$.

2. Multiply the base year prices of various commodities with the current year weights and obtain $\sum p_0 q_1$.

3. Divide $\sum p_1 q_1$ by $\sum p_0 q_1$ and multiply the quotient by 100.

Symbolically,

Paasche's Price Index :

$P_{01} = \dfrac{\sum p_1 q_1}{\sum p_0 q_1} \times 100 = \dfrac{1150}{960} \times 100 = 119.79$

Thus, the price index number of 2005 is 119.79. In other words, there is net increase in prices of commodities in the year 2005 to the extent of 19.79% as compared to 1996.

Paasche's Quantity Index :

$Q_{01} = \dfrac{\sum q_1 p_1}{\sum q_0 p_1} \times 100 = \dfrac{1150}{690} \times 100 = 166.67$

Thus, the quantity index number of 2005 is 166.67. In other words, there is net increase in quantity of commodities in the year 2005 to the extent of 66.67% as compared to 1996.

Value Index Number :

$V_{01} = \dfrac{\sum p_1 q_1}{\sum p_0 q_0} \times 100 = \dfrac{1150}{580} \times 100 = 198.28$

Thus, the value index number of 2005 is 198.28. In other words, there is net increase in value of commodities in the year 2005 to the extent of 98.28% as compared to 1996.
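The Laspeyre's, Paasche's and value index calculations above can be verified with the following Python sketch. It is a minimal illustration under the data of the worked example (base year 1996, current year 2005); the helper name `aggregate` and the variable names are hypothetical.

```python
def aggregate(prices, quantities):
    """Sum of price x quantity over all commodities."""
    return sum(p * q for p, q in zip(prices, quantities))


# Commodities A to D
p0 = [10, 8, 6, 4]      # base year (1996) prices
q0 = [30, 15, 20, 10]   # base year (1996) quantities
p1 = [12, 10, 6, 6]     # current year (2005) prices
q1 = [50, 25, 30, 20]   # current year (2005) quantities

p0q0 = aggregate(p0, q0)
p0q1 = aggregate(p0, q1)
p1q0 = aggregate(p1, q0)
p1q1 = aggregate(p1, q1)

laspeyres_price = p1q0 * 100 / p0q0      # base year quantities as weights
laspeyres_quantity = p0q1 * 100 / p0q0   # base year prices as weights
paasche_price = p1q1 * 100 / p0q1        # current year quantities as weights
paasche_quantity = p1q1 * 100 / p1q0     # current year prices as weights
value_index = p1q1 * 100 / p0q0

for name, value in [("Laspeyre price", laspeyres_price),
                    ("Laspeyre quantity", laspeyres_quantity),
                    ("Paasche price", paasche_price),
                    ("Paasche quantity", paasche_quantity),
                    ("Value", value_index)]:
    print(f"{name}: {value:.2f}")
```

The four aggregates are computed once and reused, which mirrors the layout of the solution table: each index is simply a ratio of two column totals.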

II. Weighted Average of Price Relatives Method

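Illustration 6. Calculate weighted average of price relatives index number of prices for 2005 on the basis of 2004 from the following data :

| Commodities | A | B | C | D | E |
|---|---|---|---|---|---|
| Weights | 20 | 12 | 8 | 4 | 6 |
| Price 2004 | 20 | 15 | 10 | 5 | 4 |
| Price 2005 | 35 | 18 | 11 | 5 | 5 |

Solution.

Construction of Price Index Numbers

| Commodities | Weights (W) | Price 2004 ($p_0$) | Price 2005 ($p_1$) | Value weights ($p_0 q_0$) [V] | $\frac{p_1}{p_0} \times 100$ [P] | [PV] |
|---|---|---|---|---|---|---|
| A | 20 | 20 | 35 | 400 | 175 | 70000 |
| B | 12 | 15 | 18 | 180 | 120 | 21600 |
| C | 8 | 10 | 11 | 80 | 110 | 8800 |
| D | 4 | 5 | 5 | 20 | 100 | 2000 |
| E | 6 | 4 | 5 | 24 | 125 | 3000 |
| Total | | | | $\sum V = 704$ | | $\sum PV = 105400$ |

Here the given weights serve as the base year quantities $q_0$, so that each value weight is the base year price multiplied by its weight.

Steps :

1. Calculate the price relatives of the current year ($\frac{p_1}{p_0} \times 100$), i.e., express each item of the period for which the index number is being calculated as a percentage of the same item in the base period. These are denoted by [P].

2. Multiply the calculated percentages for each item by the value weights to obtain [PV].

3. Divide the total of [PV] by the total of the value weights [V].

$P_{01} = \dfrac{\sum \left(\frac{p_1}{p_0} \times 100\right) p_0 q_0}{\sum p_0 q_0}$

or

$P_{01} = \dfrac{\sum PV}{\sum V} = \dfrac{105400}{704} = 149.71$

Thus, there is a net increase of 49.71% in prices in 2005 as compared to 2004.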
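The weighted average of price relatives in Illustration 6 can be reproduced with the sketch below. It assumes, as the value-weight column suggests, that each value weight is the 2004 price multiplied by the given weight; the function name `weighted_average_of_relatives` is hypothetical.

```python
def weighted_average_of_relatives(base_prices, current_prices, weights):
    """Sum of (price relative x value weight) divided by sum of value weights."""
    # Assumption: value weight V = base price x given weight
    value_weights = [p0 * w for p0, w in zip(base_prices, weights)]
    # Price relative P = (p1 / p0) x 100
    relatives = [p1 * 100 / p0 for p1, p0 in zip(current_prices, base_prices)]
    weighted = [p * v for p, v in zip(relatives, value_weights)]
    return sum(weighted) / sum(value_weights)


# Illustration 6: commodities A to E
weights = [20, 12, 8, 4, 6]
price_2004 = [20, 15, 10, 5, 4]   # base year
price_2005 = [35, 18, 11, 5, 5]   # current year

index_2005 = weighted_average_of_relatives(price_2004, price_2005, weights)
print(f"{index_2005:.2f}")
```

When the value weights are base year values, the base price cancels in each term and the result equals the weighted aggregative index with base year weights, which is why the two approaches agree on the same data.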


CONSUMER PRICE INDEX (CPI)

The wholesale price index numbers measure the changes in the general level of prices, and they fail to reflect the effect of the increase or decrease of prices on the cost of living of different classes or groups of people in a society. Consumer price index numbers are also called (i) Cost of living index numbers, or (ii) Retail price index numbers, or (iii) Price of living index numbers. Consumer price index numbers are designed to measure the average change over time in the price paid by the ultimate consumer for a specified quantity of goods and services. They measure the change in the cost of living of a particular section of society due to change in the retail price. A change in the price level affects the cost of living of different classes of people differently. The general index number fails to reveal this. So there is the need to construct a consumer price index. People consume different types of commodities. Consumption habits also differ from person to person, place to place and class to class, i.e., richer class, middle class and poor class.

Construction of Consumer Price Index

The following are the steps in construction of consumer price index :

(1) Determination of the class of people : Consumer price index numbers are constructed separately for different classes of people or groups or sections of the society, e.g., urban wage earners, agricultural labourers, industrial workers, government employees etc., and also for different geographical areas like town, city, rural area, urban area, hilly area and so on. The group has to be clearly defined. When we talk of government employees, we have to decide between low paid and high paid government employees, as their consumption patterns differ. The class for which the index number is to be constructed must be, as far as possible, homogeneous from the point of view of income and habits.

The major groups of consumers for whom the consumer price index numbers have been constructed in India are : (i) the industrial workers, (ii) the urban non-manual workers, and (iii) the agricultural labourers. In India, the consumer price index for industrial workers is by far the most popular index. It is constructed on a monthly basis with a lag of one month. This CPI measures changes in retail prices of goods and services covering 260 items of consumption from 70 centres. The base year of CPI (IW) is 1982. The CPI for industrial workers is increasingly considered the appropriate indicator of general inflation, as it shows most accurately the impact of price rise on the cost of living of common people.

(2) Conducting family budget enquiry : A family budget enquiry is held with a view to finding out how much an average family of this group spends on different items of consumption. The quantities of the commodities consumed, as also the prices at which they are purchased, are noted down. The enquiry is done on a random sample basis. Some families are selected from the total number by lottery method, and their family budgets are scrutinised. The items of expenditure are classified into the following groups :

(i) Food
(ii) Clothing
(iii) Fuel and Lighting
(iv) House Rent
(v) Miscellaneous

(3) The collection of retail prices is very important in the construction of a consumer price index, because retail prices vary from place to place and from shop to shop. The principles to be followed in collecting retail prices are :

1. Price must relate to a fixed list of items of a fixed quality.

2. Retail price must be the price which is paid by the consumer.

3. If discount is given to all customers, it can be taken into account.

Price quotations may be collected through agents or mail questionnaires. The price collectors must be instructed properly and given the list of items. Consumer price index numbers are constructed separately for different classes of people.

Methods of Construction

(i) Family Budget Method or Weighted Relatives :

Consumer Price Index = $\dfrac{\sum WR}{\sum W}$

Here,

R = $\dfrac{p_1}{p_0} \times 100$ for each item

W = Weights

(ii) Aggregative Expenditure Method or Aggregative Method :

Consumer Price Index = $\dfrac{\sum p_1 q_0}{\sum p_0 q_0} \times 100$

This is based upon Laspeyre's method. According to this method, the various items are given weights on the basis of quantity consumed in the base year.

If the calculated cost of living index number is more than 100, it means a higher cost of living, necessitating an upward adjustment in the wages and salaries of employees. The rise of wages or salaries is equal to the percentage by which the index exceeds 100. If the calculated index number is less than 100, it means the cost of living has declined by the difference between 100 and the calculated index number.

Illustration 7. An enquiry into the budgets of the middle class families in a certain city gave the following information. What is the cost of living index of 2004 as compared with 1995? Calculate by :

(i) Family budget method, and

(ii) Aggregative expenditure method.

| Expenses on items | Food | Fuel | Clothing | Rent | Misc. |
|---|---|---|---|---|---|
| Weights | 35% | 10% | 20% | 15% | 20% |
| Price (in Rs) 2004 | 1500 | 250 | 750 | 300 | 400 |
| Price (in Rs) 1995 | 1400 | 200 | 500 | 200 | 250 |

Solution.

(i) Constructing Cost of Living Index Number (Family Budget Method)

| Expenses on item | Weights (%) (W) | Price (in Rs) 1995 ($p_0$) | Price (in Rs) 2004 ($p_1$) | Price Relative $\frac{p_1}{p_0} \times 100$ (R) | Weighted Relatives (WR) |
|---|---|---|---|---|---|
| Food | 35 | 1400 | 1500 | 107.14 | 3749.9 |
| Fuel | 10 | 200 | 250 | 125 | 1250 |
| Clothing | 20 | 500 | 750 | 150 | 3000 |
| Rent | 15 | 200 | 300 | 150 | 2250 |
| Misc. | 20 | 250 | 400 | 160 | 3200 |
| Total | $\sum W = 100$ | | | | $\sum WR = 13449.9$ |

Cost of living Index for 2004

CPI = $\dfrac{\sum WR}{\sum W} = \dfrac{13449.9}{100} = 134.499$

Thus, there is an increase of 34.5% in prices in 2004 as compared with 1995.
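The family budget method used in part (i) can be sketched in Python as follows. This is only an illustrative check of the arithmetic; the function name `family_budget_cpi` is hypothetical.

```python
def family_budget_cpi(weights, base_prices, current_prices):
    """Family budget method: sum(W x R) / sum(W), with R = (p1 / p0) x 100."""
    weighted_relatives = [
        w * p1 * 100 / p0
        for w, p0, p1 in zip(weights, base_prices, current_prices)
    ]
    return sum(weighted_relatives) / sum(weights)


# Illustration 7: Food, Fuel, Clothing, Rent, Misc.
weights = [35, 10, 20, 15, 20]
prices_1995 = [1400, 200, 500, 200, 250]   # base year
prices_2004 = [1500, 250, 750, 300, 400]   # current year

cpi_2004 = family_budget_cpi(weights, prices_1995, prices_2004)
print(round(cpi_2004, 2))
```

Computed without intermediate rounding the index is 134.5; the figure 134.499 in the table arises because the price relative for food is rounded to 107.14 before weighting.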

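(ii) Aggregate Expenditure Method

| Expenses on items | Weights ($q_0$) | Price (in Rs) 1995 ($p_0$) | Price (in Rs) 2004 ($p_1$) | $p_0 q_0$ | $p_1 q_0$ |
|---|---|---|---|---|---|
| Food | 35 | 1400 | 1500 | 49000 | 52500 |
| Fuel | 10 | 200 | 250 | 2000 | 2500 |
| Clothing | 20 | 500 | 750 | 10000 | 15000 |
| Rent | 15 | 200 | 300 | 3000 | 4500 |
| Misc. | 20 | 250 | 400 | 5000 | 8000 |
| Total | | | | $\sum p_0 q_0 = 69000$ | $\sum p_1 q_0 = 82500$ |

Consumer Price Index for 2004

CPI = $\dfrac{\sum p_1 q_0}{\sum p_0 q_0} \times 100 = \dfrac{82500}{69000} \times 100 = 119.565$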

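Illustration 8. The following table gives the prices of items of consumption in 1980 and 2005. The weights of these items are 75, 10, 5, 6 and 4 respectively. Prepare the index number for cost of living for 2005 with 1980 as the base.

| Items | Price in 1980 | Price in 2005 |
|---|---|---|
| Food | 100 | 200 |
| Clothing | 20 | 25 |
| Fuel and lighting | 15 | 20 |
| House rent | 30 | 40 |
| Misc. | 35 | 65 |

Solution.

Constructing Cost of Living Index Number

| Items | Weights (W) | Price 1980 ($p_0$) | Price 2005 ($p_1$) | Price Relative R = $\frac{p_1}{p_0} \times 100$ | Weighted Relatives (WR) |
|---|---|---|---|---|---|
| Food | 75 | 100 | 200 | 200 | 15000 |
| Clothing | 10 | 20 | 25 | 125 | 1250 |
| Fuel and lighting | 5 | 15 | 20 | 133.33 | 666.65 |
| House rent | 6 | 30 | 40 | 133.33 | 799.98 |
| Misc. | 4 | 35 | 65 | 185.71 | 742.84 |
| Total | $\sum W = 100$ | | | | $\sum WR = 18459.47$ |

Cost of living Index for 2005

CPI = $\dfrac{\sum WR}{\sum W} = \dfrac{18459.47}{100} = 184.59$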

Thus, there is an increase of 84.6% in prices in 2005 as compared with 1980.

Illustration 9. The consumer price index for June 2005 was 125. The food index was 120 and that of other items 135. What is the percentage of the total weight given to food?

Solution.

| Items | Index (I) | Weights (W) | WI |
|---|---|---|---|
| Food | 120 | $W_1$ | $120 W_1$ |
| Other items | 135 | $W_2$ | $135 W_2$ |
| Total | | $\sum W = 100$ | $\sum WI = 120 W_1 + 135 W_2$ |

Let the total weight = 100, $W_1$ = weight of food and $W_2$ = weight of other items.

Hence, $100 = W_1 + W_2$ ... (1)

We are given consumer price index = 125.

CPI = $\dfrac{\sum WI}{\sum W}$

$125 = \dfrac{120 W_1 + 135 W_2}{100}$

or $12500 = 120 W_1 + 135 W_2$ ... (2)

Now, solving equations (1) and (2) : multiplying equation (1) by 135 we get

$13500 = 135 W_1 + 135 W_2$
$12500 = 120 W_1 + 135 W_2$

Subtracting, $1000 = 15 W_1$, so $W_1 = \dfrac{1000}{15} = 66.67$

Now using equation (1), we get $100 = 66.67 + W_2$

$W_2 = 100 - 66.67 = 33.33$

Hence, percentage of total weight given to food = 66.67% and for other items = 33.33%.
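The elimination carried out in Illustration 9 has a closed form, sketched below in Python. The function name `split_weights` is hypothetical; the formula follows from substituting $W_2 = 100 - W_1$ into equation (2).

```python
def split_weights(overall_index, index_a, index_b, total_weight=100):
    """Weights of two groups whose weighted average index equals overall_index.

    From overall x T = Ia x Wa + Ib x (T - Wa):
        Wa = T x (Ib - overall) / (Ib - Ia)
    """
    weight_a = total_weight * (index_b - overall_index) / (index_b - index_a)
    return weight_a, total_weight - weight_a


# Illustration 9: CPI = 125, food index = 120, other items index = 135
food_weight, other_weight = split_weights(125, 120, 135)

# Verification: recombine the sub-indices with the weights found
check = (120 * food_weight + 135 * other_weight) / 100

print(round(food_weight, 2), round(other_weight, 2), round(check, 2))
```

The formula only works when the two sub-indices differ; if they were equal, any split of the weights would give the same overall index.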

Verification

| Items | Index (I) | Weights (W) | WI |
|---|---|---|---|
| Food | 120 | 66.67 | 8000 |
| Other items | 135 | 33.33 | 4500 |
| Total | | $\sum W = 100$ | $\sum WI = 12500$ |

CPI = $\dfrac{\sum WI}{\sum W} = \dfrac{12500}{100} = 125$

Consumer Price Index No. is 125 as given in question.

Uses of Consumer Price Index

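1. The consumer price index is used to calculate the purchasing power of money and real wages :

(a) Purchasing Power of Money = $\dfrac{1}{\text{Consumer Price Index}}$

(b) Real wages = $\dfrac{\text{Money wages}}{\text{Consumer Price Index}} \times 100$

Suppose the consumer price index was 400 in 2004-05 and 100 in 2000-01. Then a rupee in 2004-05 would be equal to

$\dfrac{100}{400} = 0.25$

i.e., the purchasing power of a rupee in 2004-05 was only 25 paise as compared to 2000-01.

Suppose the money wage of a worker was Rs 3250 per month in 2000-01. The consumer price index for 2000-01 was 100 and for 2004-05 was 400. Whether the worker is compensated by a rise in his wages can be judged from his real wage :

Real wage = $\dfrac{\text{Money Wages}}{\text{Consumer Price Index}} \times 100$

For 2000-01 : $\dfrac{3250}{100} \times 100$ = Rs 3250

For 2004-05 : $\dfrac{5000}{400} \times 100$ = Rs 1250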

However, the monthly money wage was raised from Rs 3250 to Rs 5000 in 2004-05. The worker has not gained; in fact, his real wage has gone down. The real wage of the worker is Rs 1250 in 2004-05 as compared to Rs 3250 in 2000-01.
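The purchasing power and real wage formulas can be checked with a brief Python sketch. The function names `purchasing_power_of_rupee` and `real_wage` are hypothetical, and the index is assumed to be expressed with the base period equal to 100, as in the example above.

```python
def purchasing_power_of_rupee(cpi, base_cpi=100):
    """Value of one rupee relative to the base period (index base = 100)."""
    return base_cpi / cpi


def real_wage(money_wage, cpi):
    """Real wage = (money wage / consumer price index) x 100."""
    return money_wage * 100 / cpi


print(purchasing_power_of_rupee(400))   # rupee of 2004-05 in terms of 2000-01
print(real_wage(3250, 100))             # 2000-01
print(real_wage(5000, 400))             # 2004-05
```

Comparing the two real wages, rather than the two money wages, shows directly that the worker is worse off despite the higher pay.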

Example 12. If the salary of a person in the base year is Rs 4000 per annum and the current year salary is Rs 6000, by how much should his salary rise to maintain the same standard of living if the CPI is 400?

Solution. We are given :

| Year | Salary | CPI |
|---|---|---|
| Base | Rs 4000 | 100 |
| Current | Rs 6000 | 400 |

When 100 is the CPI of the base year, his salary is Rs 4000.

When 400 is the CPI of the current year, his salary should be

$\dfrac{400 \times 4000}{100}$ = Rs 16,000

Hence, his salary rise should be of Rs 10,000 (16,000 - 6000 = Rs 10,000) in current year to maintain the same standard of living.
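The proportion used in Example 12 can be written as a small Python function. This is a sketch for checking the figures; the name `salary_to_maintain_standard` is hypothetical.

```python
def salary_to_maintain_standard(base_salary, current_cpi, base_cpi=100):
    """Salary needed now to buy what base_salary bought in the base year."""
    return base_salary * current_cpi / base_cpi


required_salary = salary_to_maintain_standard(4000, 400)
rise_needed = required_salary - 6000   # current salary is Rs 6000

print(required_salary, rise_needed)
```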

2. The government (central or state) and many big industrial and business units use consumer price index numbers to regulate the Dearness Allowance (D.A.) or grant of bonus to employees. This compensates them for increased cost of living due to price rise. They are used by the government for the formulation of price policy, wage policy and general economic policies.

3. If the prices of some important essential commodities (like wheat, rice, sugar, cloth, etc.) increase, due to shortages, the government may decide to provide them through fair price shops or rationing.

4. Costs of living index numbers are used for deflating value series in national accounts.

5. Consumer price index numbers are used widely in wage negotiations and wage contracts. They are used for automatic adjustment (increase) of wages corresponding to a unit increase in the consumer price index.

INDEX OF INDUSTRIAL PRODUCTION (IIP)

Index numbers of industrial production are fairly common these days. They tell about the relative increase or decrease in the level of industrial production in a country in relation to the level of production in the base year. They are the best measures of economic progress in any country. These indices can be constructed by studying variations in the level of industrial output. As such, the first step in the construction of such index numbers is to find the level of output of various industries of the country. It should be remembered that these index numbers throw light on changes in the quantum of production, not in values. If the variations in the value of output are to be studied, data about the value of industrial output have to be used for the purpose of constructing such index numbers. Thus, indices of industrial production are constructed either by studying changes in the quantum of production or its value.

Index of Industrial Production in India

A number of index numbers of industrial production are compiled in India by official and non-official agencies. The general index of industrial production is the most popular among these. In India, the Index of Industrial Production is published by the Central Statistical Organisation (CSO), Industrial Statistics Wing. The old series of the Index of Industrial Production has been replaced by a new series published with the base year 1993-94. An abstract from Economic Survey 2005-06 is as under :

General Index of Industrial Production (Base 1993-94 = 100)

| Year | 1993-94 | 1998-99 | 1999-00 | 2000-01 | 2001-02 | 2002-03 | 2003-04 | 2004-05 |
|---|---|---|---|---|---|---|---|---|
| Index | 100 | 145.2 | 154.9 | 162.6 | 167.0 | 176.6 | 189.0 | 204.8 |

Usually important data about production are collected under following major heads:

I. Mining Industries : Coal (inc. lignite), petroleum, crude (off-shore and on-shore), iron ore.

II. Metallurgical Industries : Hot metal (inc. pig iron), crude steel, semi-finished steel, steel castings, aluminium, blister copper.

III. Mechanical Engineering Industries : Machine tools, cotton textile machinery, cement machinery, railway wagons, automobiles (commercial vehicles, cars, jeeps, land rovers), power driven pumps, diesel engines, earth moving equipment, bicycles, sewing machines, agricultural tractors.

IV. Electrical Engineering Industries : Power transformers, electric motors, electric fans, electrical lamps, radio receivers, aluminium conductors.

V. Chemical and Allied Industries : Nitrogenous fertilizer (N), phosphatic fertilizer ($P_2O_5$), soda ash, caustic soda, paper and paper board, automobile tyres, bicycle tyres, cement, petroleum refinery products, penicillin, streptomycin, chloramphenicol powder, vitamin A.

VI. Textile Industries : Jute textiles, cloth (cotton cloth), mixed/blended cloth, spun and filament yarn, staple fibre etc.

VII. Food Industries : Sugar, tea, coffee, vanaspati, salt.

VIII. Electricity Generated : Related to utility.

IX. Miscellaneous : Glass, soap etc.

Usually important data about production are collected under above major heads.


The data relating to the production of the above mentioned industries are collected either monthly, quarterly or yearly. The production of the base year is taken as 100 and the current year's production is expressed as a percentage of the base year's production. These percentages are multiplied by the relative weights assigned to various industries. Weights are usually assigned on the basis of the relative importance of different industries. The relative importance of industries is usually decided on the basis of capital invested, the gross value of production, turnover, net output etc. Many other criteria of relative importance can also be laid down. Usually weights in an index number of industrial production are based on the values of net output of different industries. The weighted arithmetic average or geometric mean of the relatives gives the index number of industrial production. Such index numbers can be constructed both for gross output as well as net output.

The following table shows broad industrial groupings and their weights.

| Broad groupings | Weight in % | Index No. in May 2005 |
|---|---|---|
| Mining and quarrying | 10.47 | 155.2 |
| Manufacturing | 79.36 | 222.7 |
| Electricity | 10.17 | |
| General Index | | 213.0 |

From the above table, we find that the growth performances of broad industrial categories differ.

Method of Constructing Index of Industrial Production

Formula (using arithmetic mean) :

Index No. of Industrial Production (IIP) = $\dfrac{\sum \left(\frac{q_1}{q_0} \times 100\right) W}{\sum W}$

where,

$q_1$ = Current year quantity produced
$q_0$ = Base year quantity produced
W = Relative importance of different outputs.

Illustration 13. Construct Index of Industrial Production for 2004 from the following information.

| Industry | Output (Units) 1996 | Output (Units) 2004 | Weights |
|---|---|---|---|
| 1. Mining | 120 | 160 | 20 |
| 2. Textile | 80 | 110 | 25 |
| 3. Mechanical Engineering | 70 | 90 | 15 |
| 4. Chemical | 80 | 70 | 25 |
| 5. Electrical | 90 | 120 | 15 |

Solution.

Construction of Index Number of Industrial Production

| Industry | 1996 ($q_0$) | 2004 ($q_1$) | W | $\frac{q_1}{q_0} \times 100$ | $\left(\frac{q_1}{q_0} \times 100\right) W$ |
|---|---|---|---|---|---|
| 1. Mining | 120 | 160 | 20 | 133.33 | 2666.66 |
| 2. Textile | 80 | 110 | 25 | 137.50 | 3437.50 |
| 3. Mechanical Engineering | 70 | 90 | 15 | 128.57 | 1928.55 |
| 4. Chemical | 80 | 70 | 25 | 87.50 | 2187.50 |
| 5. Electrical | 90 | 120 | 15 | 133.33 | 1999.99 |
| Total | | | $\sum W = 100$ | | 12220.20 |

Industrial Production Index = $\dfrac{\sum \left(\frac{q_1}{q_0} \times 100\right) W}{\sum W} = \dfrac{12220.20}{100} = 122.20$

Thus, there is a 22.20% increase in industrial production in 2004 as compared to 1996.
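The index of industrial production in Illustration 13 can be recomputed with the Python sketch below. The function name `industrial_production_index` is hypothetical; the calculation is the weighted arithmetic mean of quantity relatives described above.

```python
def industrial_production_index(base_output, current_output, weights):
    """Weighted arithmetic mean of quantity relatives (q1 / q0) x 100."""
    weighted_relatives = [
        w * q1 * 100 / q0
        for q0, q1, w in zip(base_output, current_output, weights)
    ]
    return sum(weighted_relatives) / sum(weights)


# Illustration 13: Mining, Textile, Mechanical Engineering, Chemical, Electrical
output_1996 = [120, 80, 70, 80, 90]    # base year
output_2004 = [160, 110, 90, 70, 120]  # current year
weights = [20, 25, 15, 25, 15]

iip_2004 = industrial_production_index(output_1996, output_2004, weights)
print(round(iip_2004, 2))
```

The chemical industry, whose output fell, pulls the index down in proportion to its weight of 25, while the other four industries raise it.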

GENERAL USES OF INDEX NUMBERS

The uses of index numbers can be summarised as follows :

1. They act as economic barometers : Index numbers act as barometers to find the variations in the general economic condition of a country. Like barometers which are used in physics, index numbers measure the level of economic and business activities and are, therefore, termed 'economic barometers' or 'barometers of economic activity'. Index numbers of prices, output, foreign exchange, bank deposits etc. throw light on the nature of the economic and business activities of a country, and these indices can be combined into a composite index which could act as an economic barometer.

2. To measure comparative changes : The main purpose of index numbers is to measure relative temporal or cross-sectional changes in a variable or a set of variables compared with some base figure. Index numbers help in the comparison of changes from time to time and among different places. Changes in phenomena like price level, cost of living etc. can be measured with the help of index numbers. They are especially helpful in measuring changes in a complex variable through time or space.

3. They help in framing suitable policies : Index numbers are indispensable tools for the management of any government organisation or individual business concern for efficient planning and formulation of business policies. For example, index numbers of wholesale and retail prices and of output (volume of trade, industrial and agricultural production etc.) help in economic and business policy making.

It is not only in the field of business and economics that index numbers are used as a basis for policy framing; even in disciplines like Sociology and Psychology their utility is immense. For example, sociologists may speak of population indices, and psychologists measure intelligence quotients, which are essentially index numbers comparing a person's intelligence score with that of an average for his or her age. Health authorities prepare indices to display changes in the adequacy of hospital facilities, and educational research organisations have devised formulae to measure changes in the effectiveness of school systems.

4. To measure the purchasing power of money : Index numbers are helpful in finding out the intrinsic worth of money as contrasted with its nominal worth. Very often statements are made that the purchasing power of the Indian rupee in 2000 is only 20 paise as compared to its purchasing power in 1990. It means that a person who was having an income of Rs 1000 per month in 1990 should have an income of Rs 5000 to maintain the same standard which he was maintaining in 1990. This helps in determining the wage policy of a country.

5. To help in study of trend : Index numbers are very useful in studying the trend or tendency of a series over a period of time. It is easy to find out the trend of exports, imports, balance of payments, industrial production, prices, national income and a variety of other phenomena. They are also useful in forecasting future trends. With the help of index numbers of prices, demand, wages, income etc., a business executive is in a better position to take decisions about whether a new product should be launched, whether there is scope for exploring new markets, or whether the existing pricing and production policies need a change.

6. For adjusting National Income : Index numbers are very helpful in deflating (adjusting) national income on the basis of constant prices to enable us to find out whether there is any change in the real income of the people. They are used to adjust the original data for price changes, or to adjust wages for cost of living changes; thus nominal income can be transformed into real income and nominal sales into real sales through appropriate index numbers.

INFLATION AND INDEX NUMBERS

According to Samuelson, "By inflation we mean a time of generally rising prices for goods and factors of production: rising prices of bread, cakes, haircuts, rising wages, rents, etc."

According to Ackley, "Inflation can be defined as a persistent and appreciable rise in the general level or average of prices".


Inflation is a persistent rise in the general price level. To be categorised as inflation, a price rise has to be perceptible and persistent over time. The essence of inflation is that the price level keeps rising over time. Inflation is measured with the help of a price index and is expressed as the percentage rise in the price index per year, month or week. The variations in the general price level are measured by two types of index numbers. They are :

(a) The Wholesale Price Index (WPI), and

(b) Consumer Price Index (CPI)

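The wholesale price is the price at which a commodity is sold in wholesale trade. The index numbers which measure the general price level in the economy on this basis are called Wholesale Price Index numbers (WPI). The WPI captures price movements in a comprehensive way : it is an indicator of the movement in prices of commodities in all trade and transactions. Information on the movement of prices is very important in modern economic activity, and WPI is the most commonly used indicator of inflation.

Wholesale Index Number in India

The index number of wholesale prices is published every week by the Office of the Economic Adviser. The first such index number was constructed in the 1940s. A later series used 1952-53 as the base year, and the Government of India subsequently shifted the base to 1981-82. The current series is published with the base year 1993-94. The series has been revised so as to cover a large number of commodities, as shown below :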


| Category | Weights % | No. of items |
|---|---|---|
| (a) Primary goods | 22.0 | 98 |
| (b) Fuel, power, light and lubricants | 14.2 | 19 |
| (c) Manufactured goods | 63.8 | 318 |

Source : Economic Survey 2005-06

Inflation and Wholesale Price Index Number

WPI is an indicator of the movement in prices of commodities in all trade and transactions. WPI is the only price index in India which is available on a weekly basis with a lag of two weeks.

Table 1

Index Number of Wholesale Prices

| Year | Primary Articles | Fuel, Power, Light and Lubricants | Manufactured Goods | All Commodities |
|---|---|---|---|---|
| Last week of (Base 1981-82 = 100) | | | | |
| 1989-90 | 167 | 165 | 175 | 171.1 |
| 1990-91 | 196 | 189 | 190 | 191.8 |
| 1991-92 | 225 | 214 | 214 | 217.8 |
| 1992-93 | 232 | 246 | 231 | 233.1 |
| 1993-94 | 259 | 278 | 254 | 258.3 |
| Last week of (Base 1993-94 = 100) | | | | |
| 1994-95 | 121 | 109 | 117 | 116.9 |
| 1995-96 | 125 | 115 | 123 | 122.2 |
| 1996-97 | 136 | 130 | 126 | 128.8 |
| 1997-98 | 142 | 148 | 129 | 134.6 |
| 1998-99 | 153 | 153 | 135 | 141.7 |
| 1999-00 | 159 | 193 | 139 | 150.9 |
| 2000-01 | 162 | 223 | 144 | 159.2 |
| 2001-02 | 168 | 231 | 144 | 161.8 |
| 2002-03 | 178 | 256 | 152 | 172.3 |
| 2003-04 | 181 | 263 | 162 | 180.3 |
| 2004-05 | 183 | 290 | 169 | 189.5 |
| Average of weeks (Base 1993-94 = 100) | | | | |
| 2005-06 October | 198 | 313 | 172 | 197.8 |
| 2005-06 November (Provisional) | 199 | 312 | 173 | 198.3 |


Uses of Wholesale Price Index Numbers

From the above Table 1 of Wholesale Price Index Number obtained from Economic Survey, 2005-2006, let us understand the uses of WPI under the following heads:

1. Price trends in India

2. Measuring rate of inflation

3. Forecasting future prices

4. Estimation of demand and supply

5. Determining real changes in aggregates

6. Uses in planning

1. Price trends in India : Ever since independence the price trends in India have varied between sharp and moderate increases. With the exception of some years of the First Five-Year Plan, viz., 1952-53 and 1954-55, when prices showed a moderate decline, almost the entire period of over five decades since 1950-51 has shown a persistent rise in prices.

Wholesale Prices. The rising trend in wholesale prices, as shown by the Wholesale Price Index, has continued ever since 1960-61, but it assumed alarming dimensions after 1972-73, following the first oil shock of 1973 when OPEC nations effected a manifold rise in oil prices. OPEC again increased petroleum prices in 1978, which adversely affected our economy. The index number of wholesale prices (with 1970-71 as base 100) increased to 175 in 1974-75 and further to 256 in 1980-81, thus showing a two and a half fold increase in prices in just one decade. The base year for WPI was changed to 1981-82 = 100 under the new index, which rose to 258.3 in 1993-94, showing another two and a half fold rise in prices in a little over one decade. The base year was again changed to 1993-94 = 100 under the current series of the Wholesale Price Index, which shows a further rise in the price level between 1993-94 and 2004-05. The Wholesale Price Index stood at 189.5 in 2004-05. Table 1 shows the movement of wholesale prices of various commodity groups since 1989-90.

2. Measuring rate of inflation : WPI is used to measure the rate of inflation. The rate of inflation is useful to know the real value of income, savings and wealth etc. Using WPI of 2003-04 and 2004-05 for all the commodities from the table given above, the rate of inflation can be calculated as under :

Rate of inflation = (WPI of current year / WPI of previous year × 100) − 100

= (189.5 / 180.3 × 100) − 100 = 5.1%

or Rate of inflation = (WPI of current year − WPI of previous year) / WPI of previous year × 100

= (189.5 − 180.3) / 180.3 × 100

= 5.1%



Thus, the annual inflation rate during 2004-05 was 5.1% in case of all commodities. One can also calculate inflation rates for different commodities or commodity groups as required for policy purposes.
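The calculation above can be sketched in a few lines of Python. This is only an illustration: the function name `inflation_rate` is a hypothetical helper, and the two index values are the all-commodities WPI figures for 2003-04 and 2004-05 quoted in the text.

```python
def inflation_rate(wpi_current: float, wpi_previous: float) -> float:
    """Percentage change in the WPI from the previous year to the current year."""
    return (wpi_current - wpi_previous) / wpi_previous * 100


# All-commodities WPI (Base 1993-94 = 100) as quoted in the text.
wpi_2003_04 = 180.3
wpi_2004_05 = 189.5

rate = inflation_rate(wpi_2004_05, wpi_2003_04)
print(f"Rate of inflation in 2004-05: {rate:.1f}%")  # prints 5.1%
```

The same function can be applied to the index of any commodity group, since only the two index values change.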

Economic Survey 2005-06 : Annual point-to-point inflation rate in terms of the Wholesale Price Index (WPI) increased from 4.6 per cent at end March 2004 to 5.1 per cent at end March 2005. The year 2005-06 started with an inflation rate of 5.7 per cent on April 2, 2005, which was followed by a softening trend until August 27, 2005 when it reached a trough of 3.3 per cent. While the rate rose steadily thereafter, it remained below 5 per cent. At 4.5 per cent on January 21, 2006 it was significantly lower than 5.4 per cent recorded a year ago. Average WPI inflation decelerated from 10.6 per cent in the first half of 1990s to 4.7 per cent during 2001-02 to 2004-05.

3. Forecasting future prices : From the above time series data of WPI we understand that, compared with the base year 1993-94, the wholesale price level in 2004-05 had increased by 83% for primary articles, by 191% for fuel, power, light and lubricants, by 69% for manufactured products and by 89.5% for all commodities. Thus, WPI can be used to forecast the increase in future prices.

4. Estimation of demand and supply : One can use an appropriate model to estimate the future demand and supply as the prices affect both the demand and supply. WPI therefore is useful for analysing and forecasting trade situations by interpreting the present trend in supply and demand conditions.

5. Determining real changes in aggregates : WPI is useful to determine the real changes in aggregates like national income, national expenditure, capital formation etc. National income is defined as the value of goods and services produced in a certain year. National income at current prices is obtained by valuing these goods and services at the prices prevailing in the same year.

The real change in national income can be calculated as given below :

Real Change of National Income = (WPI of base year / WPI of current year) × National income at current prices

For example, suppose the national income of the country in 2001 on the basis of current year prices amounts to Rs 700 crore, which increases to Rs 780 crore in 2002. Suppose the WPI increased to 150 in the year 2002 as compared to 140 in 2001. The real national income of 2002 can be calculated as :

= (140 / 150) × 780

= Rs 728 crore

Here, the real increase in national income is Rs 28 crore (728 - 700), while the actual monetary increase is Rs 80 crore (780 - 700).

An increase in the national income at the current prices may be due to :

(a) an increase in the general price level, or

(b) an increase in the real output.
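The deflation of national income by the WPI can be sketched as follows. The function name `real_national_income` is a hypothetical helper introduced for illustration; the figures are the ones used in the example above (Rs 700 crore in 2001, Rs 780 crore in 2002, WPI 140 and 150).

```python
def real_national_income(income_current_prices: float,
                         wpi_base: float,
                         wpi_current: float) -> float:
    """National income of the current year expressed at base-year prices."""
    return income_current_prices * wpi_base / wpi_current


income_2001 = 700.0   # Rs crore, at 2001 prices
income_2002 = 780.0   # Rs crore, at 2002 prices

real_2002 = real_national_income(income_2002, wpi_base=140, wpi_current=150)

monetary_increase = income_2002 - income_2001   # due to prices and output together
real_increase = real_2002 - income_2001         # due to real output only

print(f"Real national income of 2002 : Rs {real_2002:.0f} crore")      # 728
print(f"Monetary increase            : Rs {monetary_increase:.0f} crore")  # 80
print(f"Real increase                : Rs {real_increase:.0f} crore")      # 28
```

The difference between the monetary increase and the real increase (Rs 52 crore here) is the part of the rise that is due to the higher price level alone.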

6. Uses in planning : The government launches a number of projects which require huge amounts of expenditure, and it makes provision for these projects in its annual budget. This provision cannot be the same every year due to the rising costs of projects. The rise in prices indicated by WPI is considered in order to work out the real cost of such projects and to revise their annual cost. On this basis the government decides the allocation of funds for various schemes.

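Inflation and Consumer Price Index in India (CPI)

The Consumer Price Index measures changes in the retail prices of goods as well as services consumed by a particular group of consumers, such as industrial workers, agricultural labourers or non-manual employees. The Consumer Price Index for Industrial Workers (CPI-IW, Base 1982 = 100) measures changes in the cost of living of industrial workers. Changes in the cost of living in rural areas are measured by the CPI for agricultural labourers (CPI-AL, Base 1986-87 = 100), while the CPI for Urban Non-Manual Employees does it for urban non-manual workers. The CPI is considered a good measure of price rise or inflation and is used for determining the dearness allowance (DA) of government employees as well as others. It is compiled on a monthly basis and is available after a time lag. The CPI numbers for industrial workers and agricultural labourers are published by the Labour Bureau, while the Central Statistical Organisation publishes the CPI number of urban non-manual employees. Separate indices are needed because the typical consumption baskets of these groups differ.

Table 2 gives the weights of the major commodity groups in the CPI for industrial workers. As food has the largest weight, a rise in food prices will have a great impact on the CPI. This also explains why the Government gives the statement that an oil price increase will not be inflationary.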

Table 2

Major Group                           Base 1982        Base 2001
                                      (Weight in %)    (Weight in %)
1. Food                               57.00            46.19
2. Pan, Supari, tobacco etc.          3.15             2.27
3. Fuel and lighting                  6.28             6.43
4. Housing                            8.67             15.27
5. Clothing, bedding and foot wear    8.54             6.58
6. Misc. group                        16.36            23.26
Grand Total                           100.00           100.00

Source : Economic Survey, 2005-06 (p. 87)

The CPI thus measures the average change in the retail prices that the consumers pay. The Consumer Price Index for industrial workers is based on the retail prices of selected commodities and is regarded as an index of changes in the cost of living of industrial workers. This index shows that there has been a five-fold rise in retail prices and in the cost of living of industrial workers since 1982. This index has gone up to 548 in Oct. 2006 as against 100 in the base year 1982. The General Index for Urban Non-Manual Employees showed an over fourfold rise between 1982 and Oct. 2006. Thus, the impact of rising prices on urban non-manual employees was a little less than that on the industrial workers. Prices in rural areas also showed a rising trend, but the extent of the rise in prices in this case was lower than that in urban areas, as is indicated by the general index for agricultural labourers. Table 3 shows the movement of prices in India as shown by the All India Consumer Price Index.

Table 3

All India Consumer Price Index Numbers

Columns :
(1) Last Month of
(2) Industrial Workers (Base 1982 = 100) : Food Index
(3) Industrial Workers (Base 1982 = 100) : General Index
(4) Urban Non-Manual Employees (Base 1984-85 = 100) : General Index
(5) Agricultural Labourers (up to 1994-95, Base 1960-61 = 100; 1995-96 onwards, Base 1986-87 = 100) : General Index

(1)        (2)    (3)    (4)    (5)
1986-87    141    137    115    572
1987-88    154    149    126    629
1988-89    168    163    136    708
1989-90    177    173    145    746
1990-91    199    193    161    803
1991-92    230    219    183    958
1992-93    254    240    202    1076
1993-94    272    258    216    1114
1994-95    304    284    237    1204
1995-96    337    313    259    234
1996-97    369    342    283    256
1997-98    388    366    302    264
1998-99    445    414    337    293
1999-00    446    428    352    306
2000-01    453    444    371    305
2001-02    466    463    390    309
2002-03    477    482    405    319
2003-04    495    500    420    331
2004-05    506    520    436    339
Oct. 06    538    548    460    356

Source : Economic Survey, 2005-2006 (p. S-63).

High inflation hurts the poor with their incomes not indexed to prices. It also puts pressure on interest rates, and adversely affects both savings and investment. Because of its implications for the poor and its possible destabilizing effects on macro economic stability, containment of inflation is high on the Government agenda.



Causes of Rising Prices (Inflation)

To understand the various causes of rising prices or inflation in India, we have to look into the demand side and the supply side factors.

Demand Side Factors

1. Increase in Money Supply

2. Faster Growth in Money Supply than the Growth Rate of National Income

3. Massive Increase of Government Expenditure

4. Deficit Financing

5. Growth of Black Money

6. Increased Wages and Salaries

Supply Side Factors

1. Slow Growth of Agriculture

2. Slow Pace of Industrialisation

3. Increase in Petroleum Prices

4. Changes in Administered Prices

5. Wage Price Spiral (i.e., workers demanding higher wages)

limitations of index numbers



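List of Formulae

1. Unweighted Simple Aggregative Method

Price Index : $P_{01} = \frac{\sum p_1}{\sum p_0} \times 100$

Quantity Index : $Q_{01} = \frac{\sum q_1}{\sum q_0} \times 100$

Value Index : $V_{01} = \frac{\sum p_1 q_1}{\sum p_0 q_0} \times 100$

2. Unweighted Simple Average of Price Relative Method

$P_{01} = \frac{\sum \left( \frac{p_1}{p_0} \times 100 \right)}{N}$

3. Weighted Aggregative Method

Price Index

A. Laspeyre's Method : $P_{01} = \frac{\sum p_1 q_0}{\sum p_0 q_0} \times 100$

B. Paasche's Method : $P_{01} = \frac{\sum p_1 q_1}{\sum p_0 q_1} \times 100$

Quantity Index

A. Laspeyre's Method : $Q_{01} = \frac{\sum q_1 p_0}{\sum q_0 p_0} \times 100$

B. Paasche's Method : $Q_{01} = \frac{\sum q_1 p_1}{\sum q_0 p_1} \times 100$

4. Weighted Average of Price Relative Method

$P_{01} = \frac{\sum \left[ \left( \frac{p_1}{p_0} \times 100 \right) \times (p_0 q_0) \right]}{\sum p_0 q_0}$

5. Consumer Price Index

A. Aggregative Expenditure Method or Aggregative Method : $CPI = \frac{\sum p_1 q_0}{\sum p_0 q_0} \times 100$

B. Family Budget Method : $CPI = \frac{\sum WR}{\sum W}$, where $R = \frac{p_1}{p_0} \times 100$ and $W$ is the weight

6. Index of Industrial Production

Industrial Production Index No. $= \frac{\sum \left( \frac{q_1}{q_0} \times W \right)}{\sum W} \times 100$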
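The formulae listed above can be tried out with a short Python sketch. The function names are hypothetical helpers chosen for this illustration; the data are taken from the problems in the exercises of this chapter whose answers are printed there (Problems 1, 6, 8 and 11), so the results can be compared with those answers.

```python
def simple_aggregative(p0, p1):
    """Unweighted simple aggregative price index."""
    return sum(p1) / sum(p0) * 100


def average_of_price_relatives(p0, p1):
    """Unweighted simple average of price relatives."""
    relatives = [new / old * 100 for old, new in zip(p0, p1)]
    return sum(relatives) / len(relatives)


def laspeyres_price(p0, q0, p1):
    """Weighted aggregative price index with base-year quantities as weights."""
    return (sum(p * q for p, q in zip(p1, q0))
            / sum(p * q for p, q in zip(p0, q0)) * 100)


def paasche_price(p0, p1, q1):
    """Weighted aggregative price index with current-year quantities as weights."""
    return (sum(p * q for p, q in zip(p1, q1))
            / sum(p * q for p, q in zip(p0, q1)) * 100)


def weighted_index(relatives, weights):
    """Weighted average of relatives or group indices (Family Budget form)."""
    return sum(r * w for r, w in zip(relatives, weights)) / sum(weights)


# Problem 1: prices of commodities A to E in 2001 and 2002.
print(simple_aggregative([50, 40, 10, 5, 2], [80, 60, 20, 10, 6]))      # about 164.49

# Problem 6: prices of commodities A to D in 1990 and 2000.
print(average_of_price_relatives([10, 20, 30, 40], [13, 17, 60, 70]))   # 147.5

# Problem 11: bread, meat and tea (prices in paise, quantities in lbs).
p0, q0 = [40, 45, 90], [6, 4, 0.5]
p1, q1 = [30, 50, 40], [7, 5, 1.5]
print(laspeyres_price(p0, q0, p1))   # about 86.02
print(paasche_price(p0, p1, q1))     # 81.25

# Problem 8: group indices of wholesale prices and their weights.
print(weighted_index([473.6, 390.2, 510.2, 403.3, 624.4], [31, 30, 18, 17, 4]))  # about 449.25
```

Only the choice of weights separates Laspeyre's and Paasche's methods, which is why the two functions differ in a single argument.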



exercises

Questions :

1. Distinguish between actual difference and relative difference in prices.

2. Define index numbers. Why do we need an index number?

3. What are the problems in construction of index numbers?

4. Briefly explain the importance of index numbers in the study of Economics.

5. What is index number? Discuss briefly the uses of index numbers.

6. What is the difference between a price index and a quantity index?

7. State the general uses of index numbers.

8. What are the desirable properties of the base period?

9. Distinguish between 'weighted' and 'unweighted' index of prices.

10. Is change in any price reflected in a price index?

11. Distinguish between Laspeyre's method and Paasche's method of constructing index number.

12. What does consumer price index for industrial workers measure?

13. Define Consumer Price Index number. Explain the uses of consumer price index numbers.

14. What are the uses of Wholesale Price Index numbers?

15. Explain Index Numbers of Industrial Production.

16. Distinguish between 'Wholesale Price Index' and 'Consumer Price Index'.

17. Why is it essential to have different CPI for different categories of consumers?

18. Discuss the limitations of index numbers.

19. Can CPI number for urban non-manual employees represent the changes in cost of living of President of India?

20. What do you mean by inflation? How are the wholesale price index numbers useful for measuring the rate of inflation?

21. Try to list the important items of consumption in your family.

22. Write short notes on :

(a) Base year (b) Index of Industrial Production

(c) Value Index (d) Consumer Price Index

(e) Wholesale Price Index.

Problems :

1. Construct the Index Number for 2002 with 2001 as base from the following prices of commodities by simple (Unweighted) aggregative method.

Commodities :         A     B     C     D     E
Prices in Rs 2001 :   50    40    10    5     2
Prices in Rs 2002 :   80    60    20    10    6

[Index Number = 164.48]

2. Using the following data and 2000 as the base period, compute simple aggregative price indices for the two fuels.

Item               Producers Price
                   2000    2001    2002
Coal (Rs)          5       3       4
Crude oil (Rs)     2       3       4

[Index Number : 2001 = 85.71, 2002 = 114.28]

3. Calculate the index number for 2002 with 2001 as base from the following prices of the commodities by simple (unweighted) aggregative method.

Commodity and unit    Price (2001)    Price (2002)
Butter per kg         20.00           22.00
Milk per Litre        3.00            4.50
Cheese per Tin        18.00           19.80
Bread per kg          2.00            3.80
Eggs per Dozen        4.00            4.50


4. Calculate Quantity Index Numbers from the following data by simple aggregative method taking quantity of 1998 as base.

Commodity    Quantity (in tons)
             1998    1999    2000    2001    2002
A            0.30    0.33    0.36    0.36    0.39
B            0.25    0.24    0.30    0.32    0.30
C            0.20    0.25    0.28    0.32    0.30
D            2.00    2.40    2.50    2.50    2.60

(Quantity Index No. = 117.1, 125.1, 127.3, 130.5)

5. Calculate index number for 2002 on the base prices for 1991 from the following by average of price relative method.

Items            Prices (1991)    Prices (2002)
Bricks           10               16
Timber           20               21
Plaster Board    5                6
Sand             2                3
Cement           7                14

[Index No. = 147]

6. Construct the index number for 2000 taking 1990 as base by price relative method using arithmetic mean.

Commodities :     A     B     C     D
Price (1990) :    10    20    30    40
Price (2000) :    13    17    60    70

[Index No. = 147.5]

7.

Year :                        1993    1994    1995    1996    1997
Wholesale Prices (in Rs) :    75      50      65      60      72

Year :                        1998    1999    2000    2001    2002
Wholesale Prices (in Rs) :    7«      69      75      84      80

8. The following table gives the group index numbers of wholesale prices in India for the second week of Sept. 2002 along with the weights. Calculate the index number of wholesale prices.

Group                      Weights    Index
Food Article               31         473.6
Manufactures               30         390.2
Industrial Raw Material    18         510.2
Semi-Manufactures          17         403.3
Miscellaneous              4          624.4

[WPI = 449.249]

9. Calculate the price index number from the following data by weighted aggregative method using : (a) Laspeyre's method (b) Paasche's method.

Commodity    Price (2001)    Quantity (2001)    Price (2002)    Quantity (2002)
A            4               20                 6               10
B            3               15                 5               23
C            2               25                 3               15
D            5               10                 4               40

[Laspeyre's : 137.77, Paasche's : 158.99]

10. Calculate Laspeyre's and Paasche's price index and quantity index numbers with base 2001 and interpret.

2001

Commodity

A B C

Price

_

4

3

Quantity

2002

Price

s

2 5 2

_

6 2 4

Quantity

3 1

6

[Laspeyre's : Price Index = 76.92, Quantity Index = 143.18; Paasche's : Price Index = 69.84, Quantity Index = 130]


11. Calculate weighted aggregative of actual price index number and quantity index number from the following data using (i) Laspeyre's Method, and (ii) Paasche's Method. Also calculate value index number and interpret them.

Commodity    Base Year                         Current Year
             Quantity lbs.    Price per lb.    Quantity lbs.    Price per lb.
Bread        6                40 paise         7                30 paise
Meat         4                45 paise         5                50 paise
Tea          0.5              90 paise         1.5              40 paise

[Index Number = (i) 86.02, (ii) 81.25]

Commodity Price Base Year (in Rs) Price Current Year Quantity Base Year (in kg)

A 6.0 8.0 40

B 3.0 3.2 80

C 2.0 3.0 20


13. Prepare consumer price index numbers from the following data for 2000 and 1999 taking 1998 as base.

Group 1998 1999 (Price in Rs) 2000

A 20.00 24.00 21.00

B 1.25 1.50 1.00

C 5.00 8.00 8.00

D 2.00 2.25 2.12

[Index numbers, 1999 = 127.25, 2000 = 107.43]

14. From the data given below construct the consumer price index number.

Commodity Price Relatives Weights

Food 250 45

Rent 150 15

Clothing 320 20

Fuel and Lighting 190 5

Miscellaneous 300 15


UNIT 4

DEVELOPING PROJECTS IN ECONOMICS

Chapter 13

PREPARATION OF A PROJECT REPORT

1. Introduction

2. Uses of Project Report

3. Consumer Awareness

4. Questionnaire for Dealers

5. Productivity Awareness

In Unit 1, we studied the meaning of Economics and the scope and importance of Statistics in Economics; in Unit 2, the collection and organisation of data; and in Unit 3, the various statistical tools. These tools are very important in our daily life to analyse different economic activities such as consumption, production, distribution, transport, inland and foreign trade and different business activities. In this chapter we will learn the method of developing a project report, which will help us in understanding the application of statistical tools to analyse the various types of business activities.

Reports are prepared to give information about the development of institution, business, product, government activities etc. For example,

1. Consumers may be interested in knowing the quality, price and uses of a product in a changing environment and technology, e.g., preference for landline phone or mobile phone, detergent powder or detergent cake, fully automatic or semi-automatic washing machine, etc. Such surveys are conducted by manufacturing organisations.

2. Shareholders may be interested to know about the earning of organisation and possibility of getting dividend while holding the shares of the company. Such surveys are conducted by non-government organisations, societies, etc.

3. Central/state governments prepare reports for future development in priority areas such as road, power, telecommunication, education, health, etc. For example, for this purpose the government conducts surveys to know about the likely requirement of primary health centres and schools for basic education. Similarly, the government decides the requirement of power (Mega Watts) and of roads to construct in the light of the changing population of a respective area.

4. Reserve Bank of India plans the opening of new branches of commercial banks, cooperative banks or agricultural banks in the light of the increasing credit requirement of the population on the basis of survey reports. Chambers of Commerce, namely, Federation of Indian Chambers of Commerce and Industry (FICCI) and Confederation of Indian Industry (CII), conduct surveys abroad to know the business opportunities arising out of the economic development of the respective nation.

5. In the international context, United Nations Organisation (UNO) plans humanitarian help (food, life saving drugs, etc.) in war, drought, earthquakes and such other natural calamities based on survey reports.

In the light of the above examples it is very clear that project reports help in understanding the requirements of shareholders, consumers, Central and State Governments, Reserve Bank of India and financial institutions, and national and international bodies, so that they can plan their activities for future operations. Those organisations that ignore the changing requirements of the consumers or population may fail in achieving their goals and objectives.

Uses of Project Report

Uses of project report can be highlighted as under :

1. To make individuals and groups aware of the present environment and conditions of business/government, etc.

2. To help in policy formation about the economic and social development of the country.

3. To direct the efforts of the organisation towards given objectives based on opportunities provided in the changing environment.

4. To pin-point the weaknesses of the organisation so as to overcome such weaknesses.

5. To enable the consumer to pay competitive prices for required goods and get real value for the price paid to sellers.

6. To invest in those securities that provide a higher rate of interest/dividend to shareholders.

7. To exploit opportunities in the national and international markets by trade associations.

8. To provide food and medical help to areas badly affected by any natural calamities, by national, international, social and non-government organisations.

9. It helps in conducting research on various issues such as political, social, economic and technological aspects of national and international significance.

Consumer Awareness

Consumers may be exploited by manufacturers, government agencies, boards of directors and national and international agencies, e.g., manufacturers charge a higher price, or provide poor quality, lesser weight, a defective product, etc., to the consumers. The Indian Consumer Protection Act, 1986 has provided various rights to the consumers, such as the right to basic needs, safety, choice, information, education, redressal, representation and a healthy environment. Any consumer who is exploited on these grounds can approach the appropriate authorities to seek compensation or replacement of goods. For this, consumers should be made aware of their rights and informed about the proper agencies which they can approach with grievances.

There are five steps in preparing a project report for consumer awareness :

1. Identification of Problem

2. Preparation of Questionnaire

3. Collection of Data

4. Analysis and Interpretation

5. Conclusion

Identification of Problem

We want to know about consumers'/dealers' knowledge of the products of a manufacturing company, namely, colour TV, air-conditioner, washing machine, refrigerator, car, scooter, computer etc. Let us take the example of air-conditioners, where we are interested to know from dealers about the performance of an air-conditioner with respect to price, cooling technology, quality, availability, warranty, after sales service etc., keeping in view the competing products of other air-conditioner manufacturers available in the market.

Preparation of Questionnaire

To know more about various aspects of air-conditioner in a more systematic manner, we must design a questionnaire covering all the aspects discussed above.

questionnaire for dealers

Name : ...............

Address : ............... Phone No. ...............

Q. 1. Please recall some air-conditioner brand names :

(i) ............... (ii) ...............

(iii) ............... (iv) ...............

Q. 2. Which brands of ACs do you currently deal in :

(i) Videocon (ii) Carrier (iii) Amtrex (iv) National

(v) Samsung (vi) LG (vii) Voltas (viii) Others

Q. 3. Which brand of AC would you recommend to the customer? (rank them) (1-Best, 8-Worst)

(i) Videocon (ii) Carrier (iii) Amtrex (iv) National

(v) Samsung (vi) LG (vii) Voltas (viii) Others

Q. 4. Reason behind above recommendation (rank the factors) : (1-Best, 8-Worst)

(i) Price (ii) Quality (iii) Performance (iv) Availability

(v) After sale service (vi) Authentic (vii) Technology (viii) Warranty

Q. 5. Which is the most demanded brand of AC (mention the name)?

Name : ...............

Q. 6. Rank the customer preferences while buying an AC. (1-Best, 8-Worst)

(i) Price (ii) Quality (iii) Performance (iv) Availability

(v) After sale service (vi) Authentic (vii) Technology (viii) Warranty

Q. 7. Does the brand name influence the customer?

Yes ............... No ............... Some time ...............

If No, then what else influences him (specify) : ...............

Q. 8. Which brand of AC has least customer complaints (mention the name)?

Name : ...............

Q. 9. Rank the companies in regularity of supply (1-Best, 8-Worst) :

(i) Videocon (ii) Carrier (iii) Amtrex (iv) National

(v) Samsung (vi) LG (vii) Voltas (viii) Others

Q. 10. Which AC company provides the highest margins to their dealer?

Name of the company : ...............

Q. 11. Which AC company do you feel is the most aggressive in giving discounts and schemes (please specify)?

Name of the company : ............... Specification : ...............

Q. 12. Do you agree that huge advertisement campaigns are the most responsible factors for the changing market scenario and increasing demand?

(i) Agree very strongly (iii) Agree (v) Disagree (vii) Don't know

Q. 13. Generally what sort of problem do you face while doing a sale?

Specify : ...............

Q. 14. Which brand of AC in your opinion is having the most advanced technology?

Name the brand : ...............

Q. 15. Warranty period offered by the companies (kindly tick) :

Brand name        6 month-1 yr.    1-2 yrs.    2-3 yrs.    3 yrs. and above
(i) Videocon
(ii) Carrier
(iii) Amtrex
(iv) National
(v) LG
(vi) Samsung
(vii) Voltas
(viii) Others

Q. 16. What is the market size of the area you are dealing in?

(i) 0-1000 Machines (ii) 1000-2000 Machines (iii) 2000-3000 Machines

(iv) 3000-4000 Machines (v) 4000 and above

Q. 17. Average No. of units sold per month from your counter.

(Please specify brandwise)

(i) Videocon (ii) Carrier (iii) Amtrex (iv) National

(v) LG (vi) Samsung (vii) Voltas (viii) Others

Q. 18.

Brand name        Window                   Split
                  Last yr.    Projected    Last yr.    Projected
(i) Videocon
(ii) Carrier
(iii) Amtrex
(iv) National
(v) LG
(vi) Samsung
(vii) Voltas
(viii) Others

Q. 19. Please give the sales break-up for the month of April, May and June in last three years :

(For the brand which you deal in)

Company name      No. of sales (Monthwise)
                  2003                  2004                  2005
                  Apr.   May   June     Apr.   May   June     Apr.   May   June
(i) Videocon
(ii) Carrier
(iii) Amtrex
(iv) National
(v) LG
(vi) Samsung
(vii) Voltas
(viii) Others

Q. 20. To get a substantial growth in your present sale which of the following would you prefer?

(i) Having a better brand name

If so, name the brand ...............

(ii) Enhancing your infrastructure and sales persons' team ...............

Collection of Data

Collection of Data

The above questionnaire with the help of investigators using sampling method will be filled in by the dealers. The number and geographical areas depend upon our requirement, where we want to position our product, namely, Delhi, Kolkata, Chennai and other capital cities of states.

We can also collect the information from government and industrial publications to know about the growth of air-conditioner industry and future government policy in this respect.

Analysis and Interpretation

Data collected through the questionnaire will be classified and presented in the form of tables, graphs and diagrams, viz., bar diagrams, pie-diagram etc. For rigorous analysis, statistical tools such as mean, standard deviation and coefficient of variation can be used. One can also project future demand on this basis.

Illustration : Table and diagram (based on hypothetical data) are given below :

Table 1

Consumer Awareness about Air-conditioners

Awareness : Brand, Present Availability, Price, After Sales Service, Technology

[Bar diagram : Consumer awareness about air-conditioners. Scale : 0.5 cm = 10 percentage on Y-axis. Bars are grouped by Brand, Present Availability, Price, After Sales Service and Technology, with one bar each for Videocon, Carrier, Amtrex, National, Samsung, LG, Voltas and Others.]

Observation

If we observe the bar diagram, we find that customers prefer to buy an air-conditioner from the point of view of :

1. Brand : Either Videocon or Samsung

2. Present Availability : Voltas or other brand

3. Price : LG or Videocon

4. After Sales Service : LG or Videocon

5. Technology : Videocon or LG

Thus, the air-conditioner company will come to know about the brand, present availability, price, after sales service, technology etc. Through this observation the company will be in a position to decide the regularity of supply and the number of units to be produced, and to improve after sales service as per the requirements of future consumers.

Analysis

Let us analyse the given data by applying different statistical tools (Mean, Standard deviation and Coefficient of Variation) using the following formulae :

1. Mean : $\bar{X} = \frac{\sum X}{N}$

2. Standard Deviation : $\sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$

3. Coefficient of Variation (C.V.) : $C.V. = \frac{\sigma}{\bar{X}} \times 100$

Table 2

Consumer Awareness about Air-conditioners

(Figures in percentages)

Awareness               Name of Companies
                        Videocon  Carrier  Amtrex  National  Samsung  LG     Voltas  Others
Brand                   24        3        8       10        20       18     8       9
Present Availability    12        9        7       11        14       13     17      17
Price                   25        10       6       5         12       28     9       5
After Sales Service     18        12       6       6         17       21     12      8
Technology              22        10       7       8         12       17     14      10
ΣX                      101       44       34      40        75       97     60      49
Mean : X̄ %              20.2      8.8      6.8     8         15       19.4   12      9.8

Observation

Considering brand, present availability, price, after sales service and technology, the average percentage of customers is the highest for Videocon air-conditioners at 20.2%; hence customers will prefer to buy a Videocon air-conditioner.

Calculation of Standard Deviation and Coefficient of Variation

(σ = √(Σ(X − X̄)²/N) with N = 5; rows follow the order of Table 2)

Name of Companies       Videocon            Carrier             Amtrex              National
Awareness               X − X̄    (X − X̄)²   X − X̄    (X − X̄)²   X − X̄    (X − X̄)²   X − X̄    (X − X̄)²
Brand                   +3.8     14.44      -5.8     33.64      +1.2     1.44       +2       4
Present Availability    -8.2     67.24      +0.2     0.04       +0.2     0.04       +3       9
Price                   +4.8     23.04      +1.2     1.44       -0.8     0.64       -3       9
After Sales Service     -2.2     4.84       +3.2     10.24      -0.8     0.64       -2       4
Technology              +1.8     3.24       +1.2     1.44       +0.2     0.04       0        0
Σ(X − X̄)²                        112.8               46.8                2.8                 26
σ                                4.75                3.06                0.75                2.28
C.V.                             23.51               34.77               11.00               28.50

Name of Companies       Samsung             LG                  Voltas              Others
Awareness               X − X̄    (X − X̄)²   X − X̄    (X − X̄)²   X − X̄    (X − X̄)²   X − X̄    (X − X̄)²
Brand                   +5       25         -1.4     1.96       -4       16         -0.8     0.64
Present Availability    -1       1          -6.4     40.96      +5       25         +7.2     51.84
Price                   -3       9          +8.6     73.96      -3       9          -4.8     23.04
After Sales Service     +2       4          +1.6     2.56       0        0          -1.8     3.24
Technology              -3       9          -2.4     5.76       +2       4          +0.2     0.04
Σ(X − X̄)²                        48                  125.2               54                  78.8
σ                                3.10                5.00                3.29                3.97
C.V.                             20.66               25.79               27.39               40.51

Thus, we get

Name of Companies    Mean (X̄)    Standard Deviation (σ)    Coefficient of Variation
Videocon             20.2        4.75                      23.51
Carrier              8.8         3.06                      34.77
Amtrex               6.8         0.75                      11.00
National             8           2.28                      28.50
Samsung              15          3.10                      20.66
LG                   19.4        5.00                      25.79
Voltas               12          3.29                      27.39
Others               9.8         3.97                      40.51

Observations

The coefficient of variation is the highest for other brands at 40.51%, hence the customers will not prefer to buy other branded air-conditioners.
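The mean, standard deviation and coefficient of variation of each company can also be worked out with a short Python sketch. The data are the hypothetical percentages of Table 2; the names `awareness` and `summarise` are introduced only for this illustration, and the standard deviation divides by N, as in the formula for σ given earlier.

```python
from math import sqrt

# Percentages from Table 2, in the order: Brand, Present Availability,
# Price, After Sales Service, Technology.
awareness = {
    "Videocon": [24, 12, 25, 18, 22],
    "Carrier":  [3, 9, 10, 12, 10],
    "Amtrex":   [8, 7, 6, 6, 7],
    "National": [10, 11, 5, 6, 8],
    "Samsung":  [20, 14, 12, 17, 12],
    "LG":       [18, 13, 28, 21, 17],
    "Voltas":   [8, 17, 9, 12, 14],
    "Others":   [9, 17, 5, 8, 10],
}


def summarise(values):
    """Return (mean, standard deviation, coefficient of variation in %)."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((x - mean) ** 2 for x in values) / n)  # divides by N
    cv = sd / mean * 100
    return mean, sd, cv


for company, values in awareness.items():
    mean, sd, cv = summarise(values)
    print(f"{company:<9} mean = {mean:5.1f}   s.d. = {sd:5.2f}   C.V. = {cv:6.2f}")
```

A lower coefficient of variation means that the company's percentages are more uniform across the five criteria, so the C.V. column can be read as a measure of consistency.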

Requirement

1. Students should approach dealers/consumers to fill in the questionnaire.

2. They should compile the information obtained.

3. On the basis of the information from potential dealers/customers, they may be asked to prepare the required tables, graphs and diagrams etc.

4. Further, students should be asked to analyse and interpret the data collected by them.

5. They may also suggest the future course of action for the company.

Productivity Awareness

Productivity is the ratio of output to input of an organisation. Productivity varies from company to company. For example, X company manufactures a colour T.V. for Rs 10,000, while Y company manufactures a similar T.V. for Rs 11,000. In this case X company is more productive than Y company because X company's manufacturing cost of a colour T.V. is less by Rs 1,000. In addition to this overall productivity, we can also calculate the productivity of different factors of production such as labour, capital etc. For example, the cost of labour of X company to manufacture a colour T.V. is Rs 3,000 and that of Y company is Rs 2,500. In this case the labour productivity of Y company is better, although the overall productivity of X company is better as compared to company Y.

Productivity is determined with internal and external factors. Internal factors are technology, organisation structure, managerial ability, ability of the firm to substitute different inputs, etc. External factors include growth of agriculture and industrial production, price, growth of bank deposits and credit, composition and growth of GDP, structure of foreign trade, savings and capital formation etc.

Identification of Problem

We want to know about productivity awareness amongst enterprises with respect to economic problems. We can identify problems like :

(a) Industrial production

(b) National budget

(c) Population growth

(d) Gross national product

(e) Financial assistance by All India Financial Institutions

Collection of Data

Different ministries and departments of Central and State Governments regularly publish current information along with statistical data on a number of subjects. This information is quite reliable for related studies. We can collect data about identified problems from Newspapers/Economic Surveys/RBI Bulletin/Government Budget of the State or the Nation/Census Reports/NSS Reports/Annual Survey of Industries/Labour Gazettes/Agriculture Statistics of India/Indian Trade Journals etc.

Statistical Tables and Analysis

Following are a few illustrations for analysis.

Illustration 1.

Table 1

Annual Growth Rates of Industrial Production in Major Sectors of Industry

(Base : 1993-94 = 100)                                            (In per cent)

Period                   Mining and     Manufacturing    Electricity    Overall
                         Quarrying
Weights                  10.47          79.36            10.17          100.0
1995-96                  9.7            14.1             8.1            13.0
1996-97                  -1.9           7.3              4.0            6.1
1997-98                  6.9            6.7              6.6            6.7
1998-99                  -0.8           4.4              6.5            4.1
1999-00                  1.0            7.1              7.3            6.7
2000-01                  2.8            5.3              4.0            5.0
2001-02                  1.2            2.9              3.1            2.7
2002-03                  5.8            6.0              3.2            5.7
2003-04                  5.2            7.4              5.1            7.0
2004-05                  4.4            9.2              5.2            8.4
2004-05 (April-Dec.)     5.1            9.2              6.4            8.6
2005-06 (April-Dec.)     0.4            8.9              4.8            7.8

Source : Economic Survey, 2005-2006 (p. 132).

Analysis : The industrial recovery that commenced from the second quarter of 2002-03 ... Overall industrial growth during April-December 2005-06 was 7.8 per cent compared to a growth of 8.6 per cent in the corresponding period of 2004-05. The impressive performance of the manufacturing sector, which grew at 8.9 per cent, ... Lack of investment in capacity additions contributed to this shortage.

The factors responsible during the current year include :

(i) normal business and investment cycles,

(ii) lack of domestic and external demand,

(iii) lack of reforms in land and labour markets,

(iv) high oil prices,

(v) existence of excess capacity in some sectors,

(vi) business cycle,

(vii) infrastructure bottlenecks particularly power, roads and transport,

(viii) continuing high real interest rates.

Illustration 2.

Table 2

Trends in Deficit of Central Government

(As per cent of GDP)

Year              Revenue    Primary    Fiscal
                  Deficit    Deficit    Deficit
1990-91           3.3        2.8        6.6
1991-92           2.5        0.7        4.7
1992-93           2.5        0.6        4.8
1993-94           3.8        2.2        6.4
1994-95           3.1        0.4        4.7
1995-96           2.5        0.0        4.2
1996-97           2.4        -0.2       4.1
1997-98           3.1        0.5        4.8
1998-99           3.8        0.7        5.1
1999-00           3.5        0.7        5.3
2000-01           4.0        0.9        5.6
2001-02           4.4        1.5        6.2
2002-03           4.4        1.1        5.9
2003-04           3.6        0.0        4.5
2004-05*          2.5        0.6        4.1
(Provisional)
2005-06 (BE)      2.7        0.5        4.3

* Provisional and unaudited as reported by Controller General of Accounts, Department of Expenditure, Ministry of Finance.

Notes :

1. The ratios to GDP for 2005-06 (BE) are based on CSO's Advance Estimates. GDP at current market prices prior to 1999-2000 is based on the 1993-94 series and from 1999-2000 on the new 1999-2000 series.

2. The fiscal deficit excludes the transfer of States' share in the small savings collections.

Source : Budget Document, Economic Survey, 2005-06 (page 24).

Analysis : The Fiscal Responsibility and Budget Management Act (FRBMA), 2003 continued to provide a strong institutional mechanism for making sustained progress at ... The fiscal deficit, as a proportion of GDP, declined from 6.6 per cent ... leading to a marked improvement in the quality of deficit. ... The Budget for 2005-06 ...

Requirement

2. They can also be asked to make a presentation of such problems.

3. Students should give an interpretation of the data of the table they have prepared.