data collection and presentation

Data Collection & Presentation
Presented by:
Nasif Hassan Khan Abir, ID # 61531-24-007
Md. Ferdaus Alam, ID # 61531-24-010
Zakir Husain, ID # 61325-18-058
Md. Faruqul Islam, ID # 61325-18-029

Upload: ferdaus44

Post on 20-Jan-2017


Page 1: Data collection and presentation


Page 2: Data collection and presentation

Data

Data Collection
The collection, organization, and presentation of data are basic background material for learning descriptive and inferential statistics and their applications.

Method of Collecting Data
On the basis of the source of collection, data may be classified as:
Primary data
Secondary data

Types of Data
There are two types of data:
Numerical data
Categorical data

Page 3: Data collection and presentation

Collection of Data

Primary Data
Data that are collected for the first time, for the purpose of the survey, are called primary data. For example, facts collected by an investigator about the habit of taking tea or coffee in a village.

Method of Collecting Primary Data
There are several methods for collecting primary data. Some of them are:
Direct personal investigation
Indirect investigations
Through correspondents
By mailed questionnaire
Through schedules

Page 4: Data collection and presentation

Collection of Data (cont'd)

Secondary Data
When we use data that have already been collected by others, the data are called secondary data. Such data are primary for the agency that first collects them and become secondary for all other users.

Method of Collecting Secondary Data
Published reports of newspapers, RBI and periodicals
Publications from trade associations
Financial data reported in annual reports
Information from official publications
Publications of international bodies such as UNO, World Bank, etc.
Internal reports of government departments
Records maintained by institutions
Research reports prepared by students in universities

Page 5: Data collection and presentation

Types of Data

Categorical Data
Categorical data is the statistical data type consisting of categorical variables, or of data that have been converted into that form, for example as grouped data. Examples: marital status, political party, eye color.

Numerical Data
Numerical data are values or observations that can be measured, and these numbers can be placed in ascending or descending order. Numerical data can be divided into two groups:
Discrete (counted items, such as number of children or defects per hour)
Continuous (measured characteristics, such as weight or voltage)

Page 6: Data collection and presentation

Types of Data (cont'd)

Level of Measurement / Measurement Scale

Nominal Data: categories (no ordering or direction)
Examples: marital status, type of car owned

Ordinal Data: ordered categories (rankings, order, or scaling)
Examples: service quality rating, Standard & Poor's bond rating, student letter grades

Interval Data: differences between measurements but no true zero
Examples: temperature in Fahrenheit, standardized exam score

Ratio Data: differences between measurements, true zero exists
Examples: height, age, weekly food spending

Page 7: Data collection and presentation

Data Presentation

Presentation of Data
Data collected in the form of schedules and questionnaires are not self-explanatory. They are raw data, and they must be organized and presented to become meaningful.

Presentation of Categorical Data
Categorical data can be presented in two ways:
Tabulating data (summary table)
Graphing data (bar chart, pie chart, Pareto diagram)

Page 8: Data collection and presentation

The Summary Table

The summary table is a visualization that summarizes statistical information about data in table form.

Example: Current Investment Portfolio

Investment Type   Amount (in thousands $)   Percentage (%)
Stocks                     46.5                  42.27
Bonds                      32.0                  29.09
CD                         15.5                  14.09
Savings                    16.0                  14.55
Total                     110.0                 100.0
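The percentage column of the summary table can be reproduced from the amounts alone. The following is a minimal Python sketch; the function name `percentage_shares` is hypothetical and only illustrates the arithmetic (each amount divided by the total, times 100).

```python
# Amounts (in thousands of dollars) copied from the Current Investment Portfolio table.
amounts = {"Stocks": 46.5, "Bonds": 32.0, "CD": 15.5, "Savings": 16.0}


def percentage_shares(values):
    """Return each category's share of the total, in percent, rounded to 2 places."""
    total = sum(values.values())
    return {name: round(100 * amount / total, 2) for name, amount in values.items()}


shares = percentage_shares(amounts)
for name, pct in shares.items():
    # One row per investment type: name, amount, percentage.
    print(f"{name:<8}{amounts[name]:>7.1f}{pct:>8.2f}")
```

The rounded shares agree with the percentages listed in the table.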

Page 9: Data collection and presentation

Bar Chart

Bar charts are often used for qualitative data (categories or nominal scale). The height of each bar shows the frequency or percentage for that category. The bar chart for the previous summary table:

[Figure: bar chart "Investor's Portfolio" with bars for Stocks, Bonds, CD, and Savings; value axis "Amount in $1000's", 0 to 50]

Page 10: Data collection and presentation

Pie Chart

Pie charts are often used for qualitative data (categories or nominal scale). The size of each pie slice shows the frequency or percentage for that category. The pie chart for the previous summary table:

[Figure: pie chart of the Current Investment Portfolio]

Page 11: Data collection and presentation

Pareto Diagram

Used to portray categorical data
A bar chart in which categories are shown in descending order of frequency
A cumulative polygon is often shown in the same graph
Used to separate the "vital few" from the "trivial many"

[Figure: Pareto diagram "Current Investment Portfolio"; categories in order Stocks, Bonds, Savings, CD; bars: % invested in each category (axis 0% to 45%); line: cumulative % invested (axis 0% to 100%)]
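The two series of a Pareto diagram (the sorted bars and the cumulative line) can be computed as follows. This is a sketch in Python using the amounts from the Current Investment Portfolio example; the function name `pareto_table` is hypothetical.

```python
# Amounts (in thousands of dollars) from the Current Investment Portfolio example.
amounts = {"Stocks": 46.5, "Bonds": 32.0, "CD": 15.5, "Savings": 16.0}


def pareto_table(values):
    """Sort categories by descending value and attach cumulative percentages."""
    total = sum(values.values())
    ordered = sorted(values.items(), key=lambda item: item[1], reverse=True)
    rows, running = [], 0.0
    for name, amount in ordered:
        running += amount
        # (category, % of total for the bar, cumulative % for the line)
        rows.append((name, round(100 * amount / total, 2), round(100 * running / total, 2)))
    return rows


rows = pareto_table(amounts)
for name, pct, cumulative in rows:
    print(f"{name:<8}{pct:>7.2f}%{cumulative:>8.2f}%")
```

Sorting puts Savings ahead of CD, which is the category order used in the diagram.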

Page 12: Data collection and presentation

Presentation of Numerical Data
Numerical data can be presented in two ways:
Ordered array (stem-and-leaf display)
Frequency/cumulative distributions (histogram, polygon, ogive)

Ordered Array
A sequence of data in rank order:
Shows range (min to max)
Provides some signals about variability within the range
May help identify outliers (unusual observations)
If the data set is large, the ordered array is less useful

Example:
Data in raw form (as collected): 24, 26, 24, 21, 27, 27, 30, 41, 32, 38
Data in ordered array from smallest to largest: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41
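As a small illustration, the ordered array and its range can be obtained in Python with the built-in `sorted` function. The variable names are illustrative only.

```python
# Raw data as collected, from the ordered-array example.
raw = [24, 26, 24, 21, 27, 27, 30, 41, 32, 38]

ordered = sorted(raw)                    # smallest to largest
data_range = ordered[-1] - ordered[0]    # max minus min

print(ordered)
print("min:", ordered[0], "max:", ordered[-1], "range:", data_range)
```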

Page 13: Data collection and presentation

Stem-and-Leaf Diagram
A simple way to see distribution details in a data set. To make this diagram, first separate the sorted data series into leading digits (the stems) and trailing digits (the leaves).

Stems and leaves of 21, 38, and 41:

Stem   Leaf
2      1
3      8
4      1
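The splitting into stems and leaves can be sketched in Python as below. It assumes non-negative two-digit integers, so the tens digit is the stem and the units digit is the leaf; the function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group values by tens digit (stem), keeping units digits as sorted leaves."""
    stems = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)   # e.g. 38 -> stem 3, leaf 8
        stems[stem].append(leaf)
    return dict(stems)


# Data from the ordered-array example.
display = stem_and_leaf([24, 26, 24, 21, 27, 27, 30, 41, 32, 38])
for stem in sorted(display):
    print(stem, "|", " ".join(str(leaf) for leaf in display[stem]))
```

Applied to the full ten-value data set, each stem collects all of its leaves on one row instead of a single leaf per stem.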

Page 14: Data collection and presentation

Frequency/Cumulative Distributions

What is a Frequency Distribution?
A frequency distribution is a list or a table containing:
Class groupings (ranges within which the data fall)
The corresponding frequencies with which data fall within each grouping or category

The reasons for using frequency distributions are:
It is a way to summarize numerical data
It condenses the raw data into a more useful form
It allows for a quick visual interpretation of the data

Page 15: Data collection and presentation

Frequency/Cumulative Distributions(cont’d)

Class Intervals and Class Boundaries
Each class grouping has the same width
Determine the width of each interval by:
    Width of interval = range / number of desired class groupings
Usually at least 5 but no more than 15 groupings
Class boundaries never overlap
Round up the interval width to get desirable endpoints

Page 16: Data collection and presentation

Frequency Distributions Example

A manufacturer of insulation randomly selects 20 winter days and records the daily high temperature:
24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13, 12, 38, 41, 43, 44, 27, 53, 27

To build the frequency distribution, follow these steps:
Sort raw data in ascending order: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Find range: 58 - 12 = 46
Select number of classes: 5 (usually between 5 and 15)
Compute class interval (width): 10 (46/5 = 9.2, then round up)
Determine class boundaries (limits): 10, 20, 30, 40, 50, 60
Compute class midpoints: 15, 25, 35, 45, 55
Count observations and assign to classes
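The steps above can be sketched in Python. Two choices here are assumptions, not stated on the slide: each class includes its lower boundary and excludes its upper one (10 to under 20, and so on), and the first boundary is the minimum rounded down to a multiple of the width. The function name `frequency_distribution` is hypothetical.

```python
import math

temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]


def frequency_distribution(data, num_classes):
    """Return (width, boundaries, midpoints, counts) for equal-width classes."""
    data_range = max(data) - min(data)               # 58 - 12 = 46
    width = math.ceil(data_range / num_classes)      # 46 / 5 = 9.2, rounded up to 10
    start = (min(data) // width) * width             # assumed first boundary: 10
    boundaries = [start + i * width for i in range(num_classes + 1)]
    midpoints = [(lo + hi) / 2 for lo, hi in zip(boundaries, boundaries[1:])]
    counts = [0] * num_classes
    for x in data:
        index = (x - start) // width                 # assumed convention: lower <= x < upper
        if not 0 <= index < num_classes:
            raise ValueError(f"{x} falls outside the class boundaries")
        counts[index] += 1
    return width, boundaries, midpoints, counts


width, boundaries, midpoints, counts = frequency_distribution(temps, 5)
for lo, hi, mid, freq in zip(boundaries, boundaries[1:], midpoints, counts):
    print(f"{lo} but less than {hi}  midpoint {mid:g}  frequency {freq}")
```

Plain `math.ceil` happens to give the convenient width of 10 for this data set; other data may need rounding to a different "desirable" endpoint.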

Page 17: Data collection and presentation

Frequency Distributions Example(cont’d)

Page 18: Data collection and presentation

The Histogram

A graph of the data in a frequency distribution is called a histogram.

The class boundaries (or class midpoints) are shown on the horizontal axis.

The vertical axis is either frequency, relative frequency, or percentage.

Bars of the appropriate heights are used to represent the number of observations within each class.

Example: the histogram for the previous data. There are no gaps between bars.

[Figure: "Histogram: Daily High Temperature"; horizontal axis: Class Midpoints (5 to 65); vertical axis: Frequency (0 to 7)]
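A rough text rendering of the histogram is sketched below in Python. The frequencies are not given in the transcript; they were counted from the 20 example temperatures under the assumption that each class includes its lower boundary (10 to under 20, and so on). The function name `text_histogram` is hypothetical.

```python
# Class midpoints from the frequency distribution example.
midpoints = [15, 25, 35, 45, 55]
# Counted from the example data, assuming classes of the form lower <= x < upper.
frequencies = [3, 6, 5, 4, 2]


def text_histogram(mids, freqs, symbol="#"):
    """Return one line per class: midpoint, a bar of `freq` symbols, and the count."""
    return [f"{mid:>3} | {symbol * freq} ({freq})" for mid, freq in zip(mids, freqs)]


lines = text_histogram(midpoints, frequencies)
print("\n".join(lines))
```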

Page 19: Data collection and presentation

The Frequency Polygon

In a percentage polygon the vertical axis would be defined to show the percentage of observations per class.

Example: the frequency polygon for the previous data.

[Figure: "Frequency Polygon: Daily High Temperature"; horizontal axis: Class Midpoints (5 to 65); vertical axis: Frequency (0 to 7)]
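The points of a percentage polygon can be sketched in Python as follows. The frequencies are counted from the example temperatures under the assumed lower-inclusive class convention, and closing the polygon at zero one class width beyond each end (midpoints 5 and 65) is an assumption suggested by the axis range of the figure. The function name `percentage_polygon` is hypothetical.

```python
midpoints = [15, 25, 35, 45, 55]
# Counted from the example data, assuming classes of the form lower <= x < upper.
frequencies = [3, 6, 5, 4, 2]


def percentage_polygon(mids, freqs):
    """Return (midpoint, percent) points, anchored at zero on both ends."""
    width = mids[1] - mids[0]
    total = sum(freqs)
    points = [(mids[0] - width, 0.0)]                        # assumed left anchor
    points += [(m, 100 * f / total) for m, f in zip(mids, freqs)]
    points.append((mids[-1] + width, 0.0))                   # assumed right anchor
    return points


points = percentage_polygon(midpoints, frequencies)
for midpoint, percent in points:
    print(f"{midpoint:>3}: {percent:.0f}%")
```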

Page 20: Data collection and presentation

The Ogive

It is also known as the cumulative percent polygon.

Example: the ogive (cumulative percent polygon) for the previous data.

[Figure: "Ogive: Daily High Temperature"; horizontal axis: Class Boundaries (not midpoints), 10 to 60; vertical axis: Cumulative Percentage (0 to 100)]
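The ogive points can be computed directly from the example temperatures. This Python sketch takes the cumulative percentage at each class boundary to be the share of observations strictly below that boundary, which is an assumed convention; the function name `ogive_points` is hypothetical.

```python
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]
boundaries = [10, 20, 30, 40, 50, 60]


def ogive_points(data, bounds):
    """Return (boundary, cumulative percent of observations below the boundary)."""
    n = len(data)
    return [(b, 100 * sum(1 for x in data if x < b) / n) for b in bounds]


points = ogive_points(temps, boundaries)
for boundary, cumulative in points:
    print(f"less than {boundary}: {cumulative:.0f}%")
```

The curve starts at 0% at the first boundary and reaches 100% at the last one, which is why it is plotted against class boundaries rather than midpoints.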

Page 21: Data collection and presentation

Guidelines for good data presentation

Do not distort the data
Avoid unnecessary adornments (no "chart junk")
Use a scale for each axis on a two-dimensional graph
The vertical axis scale should begin at zero
Label all axes properly
The graph should contain a title
Use the simplest graph for a given set of data

Page 22: Data collection and presentation

THANK YOU !!!