Class interval, class boundaries, class mark (xi), frequency (fi), relative frequency (%)
Probability and Statistics
Course Requirements
1. Quizzes – 25%
2. First Long Exam – 25%
3. Second Long Exam – 25%
4. Third Long Exam – 25%
Total – 100%
Passing – 60%
Statistics
A branch of mathematics that deals with the collection, organization, and analysis of numerical data, and with such problems as experiment design and decision making.
3 important features of statistics:
1. Data gathering
2. Data analysis
3. Decision making
Definition of terms
1. Raw data
   Data collected in original form
2. Variable
   A characteristic or attribute that can assume different values
3. Population
   All subjects possessing a common characteristic that is being studied
4. Sample
   A subgroup or subset of a population
5. Parameter
   A characteristic or measure obtained from a population
6. Qualitative variables
   Variables that assume non-numerical values
7. Quantitative variables
   Variables that assume numerical values
8. Discrete variables
   Variables that assume a finite or countable number of possible values, usually obtained by counting
9. Continuous variables
   Variables that can assume any of the infinitely many values within an interval, usually obtained by measurement
Everyone involved in the experiment must have a clear idea of what is to be studied, how the data are to be collected, and at least a qualitative understanding of how these data are to be analyzed.
Guidelines for designing experiments:
1. Statement of the problem / recognition of the problem
   Develop all the ideas about the objectives of the experiment
2. Choice of factors and levels
   Choose the factors to be varied in the experiment
   Choose the ranges over which these factors will be varied
   Identify the specific levels at which runs will be made
3. Selection of the response variable
   The experimenter should be certain that this variable really provides useful information about the process under study
4. Choice of experimental design
   Involves the consideration of sample size (number of replicates/trials), the selection of a suitable run order for the experimental trials, and the determination of whether or not blocking or other randomization restrictions are involved
5. Performing the experiment
   Monitor the process carefully to ensure that everything is being done according to plan
6. Data analysis
   Analyze the data collected during the experiment by statistical methods
7. Conclusions
   Make decisions based on the statistical results
Methods of Sampling
1. Random sampling
   Sampling in which the data are collected using chance methods or random numbers
2. Systematic sampling
   Sampling in which the data are collected by selecting every kth object
3. Stratified sampling
   Sampling in which the population is divided into groups (strata) according to some characteristic. Each stratum is then sampled either randomly or systematically
4. Cluster sampling
   Sampling in which the population is divided into groups (usually geographically). Some of these groups are randomly selected, and then all of the elements in those groups are selected
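The four sampling methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the numbered population of 100 subjects, and the grouping rules are hypothetical and do not come from the slides.

```python
import random

def random_sample(population, n, rng):
    """Random sampling: draw n items by chance, without replacement."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Systematic sampling: random starting point, then every k-th item."""
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(population, stratum_of, n_per_stratum, rng):
    """Stratified sampling: group by a characteristic, then sample each stratum at random."""
    strata = {}
    for item in population:
        strata.setdefault(stratum_of(item), []).append(item)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, min(n_per_stratum, len(members))))
    return sample

def cluster_sample(population, cluster_of, n_clusters, rng):
    """Cluster sampling: choose whole groups at random and keep every element in them."""
    clusters = {}
    for item in population:
        clusters.setdefault(cluster_of(item), []).append(item)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [item for c in chosen for item in clusters[c]]

rng = random.Random(0)             # fixed seed so a run can be repeated
population = list(range(100))      # hypothetical population: subjects numbered 0..99

print(random_sample(population, 5, rng))
print(systematic_sample(population, 10, rng))
print(stratified_sample(population, lambda x: x // 25, 2, rng))   # 4 strata of 25
print(cluster_sample(population, lambda x: x // 10, 2, rng))      # 10 clusters of 10
```

Note the difference between the last two: stratified sampling takes a few elements from every group, while cluster sampling takes every element from a few groups.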
Methods of Summarizing/Characterizing Data
1. Tabular Methods
   a. Frequency Distribution
   b. Cumulative Frequency
   c. Stem and Leaf Table
2. Graphical Methods
   a. Frequency Histogram
   b. Frequency Polygon
   c. Ogive
   d. Pie chart
3. Numerical Methods
   a. Measures of Central Tendencies
      Mean/Average, Median, Mode
   b. Measures of Dispersion
      Range, Variance, Standard Deviation
   c. Measures of Shape
      Skewness, Kurtosis
   d. Measures of Data Locations
      Percentiles, Deciles, Quartiles
Tabular Methods
1. Frequency Distribution
   The organization of raw data in tabular form with classes and frequencies
Steps in Constructing a Frequency Distribution Table:
1. Determine the number of class intervals, k, needed to summarize the data
   [Formula: No. of class intervals, k, in terms of No. of samples]
2. Find the range of observations
   Range = Maximum value − Minimum value
3. Determine the width of the class intervals
   Class width = Range / No. of class intervals
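Steps 1 to 3 can be sketched as follows. Two assumptions are made here that the transcript does not confirm: the number of classes is taken from Sturges' rule, k = 1 + 3.322 log10(n), and the class width is rounded up to the precision of the data. The function names are hypothetical. The inputs (40 samples, minimum 0.72, maximum 2.55) are those of the nicotine illustration in these notes.

```python
import math

def sturges_classes(n):
    """Number of class intervals by Sturges' rule (an assumed rule, not confirmed by the slides)."""
    return round(1 + 3.322 * math.log10(n))

def class_width(minimum, maximum, k, decimals):
    """Range divided by k, rounded up to the number of decimals used in the data."""
    raw = (maximum - minimum) / k
    units = round(raw * 10**decimals, 9)   # guard against floating-point noise
    return math.ceil(units) / 10**decimals

k = sturges_classes(40)
width = class_width(0.72, 2.55, k, decimals=2)
print(k, width)   # 6 0.31
```

Rounding the width up rather than down ensures that k classes of that width cover the whole range: 6 × 0.31 = 1.86, which exceeds the range of 1.83.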
4. Form the frequency table
   Table columns: Class Interval; Class Boundaries; Class Mark, xi; Frequency, fi; Relative Frequency, %
   Class interval
      Separates one class in a grouped frequency distribution from the other
      The interval limits could actually appear in the raw data, and the first interval begins with the lowest value
   Class boundary
      Separates one class in a grouped frequency distribution from the other
      It has one more decimal place than the raw data and therefore does not appear in the data
   Class mark (midpoint), xi
      The number in the middle of the class, i.e., the average of the lower and upper limits of the class interval
   Frequency, fi
      The number of times a certain value or class of values occurs
   Relative frequency, %
      Frequency divided by the total number of data values, multiplied by 100
      This gives the percent of values falling in that class
Illustration: The nicotine contents, in milligrams, for 40 cigarettes of a certain brand were recorded as follows:
1.09 1.92 2.31 1.79 2.28
1.74 1.47 1.97 0.85 1.24
1.58 2.03 1.70 2.17 2.55
2.11 1.86 1.90 1.68 1.51
1.64 0.72 1.69 1.85 1.82
1.79 2.46 1.88 2.08 1.67
1.37 1.93 1.40 1.64 2.09
1.75 1.63 2.37 1.75 1.69
Class Interval
0.72 – 1.02
1.03 – 1.33
1.34 – 1.64
1.65 – 1.95
1.96 – 2.26
2.27 – 2.57
Class Interval   Class Boundaries   Class Mark, xi
0.72 – 1.02      0.715-1.025        0.87
1.03 – 1.33      1.025-1.335        1.18
1.34 – 1.64      1.335-1.645        1.49
1.65 – 1.95      1.645-1.955        1.80
1.96 – 2.26      1.955-2.265        2.11
2.27 – 2.57      2.265-2.575        2.42
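The intervals, boundaries, and class marks in this table can be generated from the lowest value (0.72), the class width (0.31), and the precision of the data (2 decimals). The sketch below assumes that the boundaries sit half a unit of precision outside the interval limits, which matches the table. The function name `build_classes` is hypothetical.

```python
def build_classes(lowest, width, k, decimals):
    """Return (interval, boundaries, class mark) for k classes starting at the lowest value."""
    unit = 10 ** -decimals             # smallest step in the raw data, e.g. 0.01
    classes = []
    for i in range(k):
        lower = round(lowest + i * width, decimals)
        upper = round(lower + width - unit, decimals)
        # Boundaries carry one more decimal place than the data.
        boundaries = (round(lower - unit / 2, decimals + 1),
                      round(upper + unit / 2, decimals + 1))
        mark = round((lower + upper) / 2, decimals + 1)
        classes.append(((lower, upper), boundaries, mark))
    return classes

for interval, boundaries, mark in build_classes(0.72, 0.31, 6, decimals=2):
    print(interval, boundaries, mark)
```

Because each boundary has three decimals while the data have two, no observation can fall exactly on a boundary, so every value belongs to exactly one class.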
Class Boundaries   Frequency, fi
0.715-1.025         2
1.025-1.335         2
1.335-1.645         8
1.645-1.955        17
1.955-2.265         6
2.265-2.575         5
Class Interval   Class Boundaries   Class Mark, xi   Frequency, fi   Relative Frequency, %
0.72 – 1.02      0.715-1.025        0.87              2                5.00
1.03 – 1.33      1.025-1.335        1.18              2                5.00
1.34 – 1.64      1.335-1.645        1.49              8               20.00
1.65 – 1.95      1.645-1.955        1.80             17               42.50
1.96 – 2.26      1.955-2.265        2.11              6               15.00
2.27 – 2.57      2.265-2.575        2.42              5               12.50
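The frequency and relative frequency columns can be checked by tallying the 40 raw values against the class boundaries. This is a minimal sketch; the function name `frequency_table` is hypothetical.

```python
data = [
    1.09, 1.92, 2.31, 1.79, 2.28, 1.74, 1.47, 1.97, 0.85, 1.24,
    1.58, 2.03, 1.70, 2.17, 2.55, 2.11, 1.86, 1.90, 1.68, 1.51,
    1.64, 0.72, 1.69, 1.85, 1.82, 1.79, 2.46, 1.88, 2.08, 1.67,
    1.37, 1.93, 1.40, 1.64, 2.09, 1.75, 1.63, 2.37, 1.75, 1.69,
]
boundaries = [0.715, 1.025, 1.335, 1.645, 1.955, 2.265, 2.575]

def frequency_table(values, bounds):
    """Count the values in each class and express each count as a percent of the total."""
    counts = [0] * (len(bounds) - 1)
    for v in values:
        for i in range(len(counts)):
            if bounds[i] < v < bounds[i + 1]:
                counts[i] += 1
                break
    total = len(values)
    return [(c, 100 * c / total) for c in counts]

for (low, high), (f, rel) in zip(zip(boundaries, boundaries[1:]),
                                 frequency_table(data, boundaries)):
    print(f"{low:.3f}-{high:.3f}  {f:2d}  {rel:6.2f}")
```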
Cumulative Frequency Distribution Table:
Cumulative frequency, cfi
   Gives the running total of the frequencies
   The number of observations in the sample whose values are less than or equal to the upper boundary of the class interval
Relative cumulative frequency
   (cfi / total number of samples) × 100
   Percent of the values that are less than the upper boundary
Class Interval   Class Boundaries   Class Mark, xi   Frequency, fi   Cumulative Frequency, cfi   Relative Cumulative Frequency, %
0.72 – 1.02      0.715-1.025        0.87              2               2                            5.00
1.03 – 1.33      1.025-1.335        1.18              2               4                           10.00
1.34 – 1.64      1.335-1.645        1.49              8              12                           30.00
1.65 – 1.95      1.645-1.955        1.80             17              29                           72.50
1.96 – 2.26      1.955-2.265        2.11              6              35                           87.50
2.27 – 2.57      2.265-2.575        2.42              5              40                          100.00
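The two cumulative columns follow directly from the frequencies. A minimal sketch using the standard library's running-total helper; the variable names are chosen for illustration:

```python
from itertools import accumulate

frequencies = [2, 2, 8, 17, 6, 5]

# Running total of the frequencies.
cumulative = list(accumulate(frequencies))
total = cumulative[-1]

# Each running total as a percent of all samples.
relative_cumulative = [100 * cf / total for cf in cumulative]

print(cumulative)            # [2, 4, 12, 29, 35, 40]
print(relative_cumulative)   # [5.0, 10.0, 30.0, 72.5, 87.5, 100.0]
```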
Graphical Methods
Frequency Histogram
   A graph that displays the data by using vertical bars of various heights to represent frequencies
   The horizontal axis can show class intervals, class boundaries, or class marks
[Figure: frequency histogram; horizontal axis: class mark (0.87, 1.18, 1.49, 1.80, 2.11, 2.42); vertical axis: frequency (0 to 18)]
Frequency Polygon
   A line graph of frequency against class mark
[Figure: frequency polygon; horizontal axis: class mark (0.87, 1.18, 1.49, 1.80, 2.11, 2.42); vertical axis: frequency (0 to 18)]
Ogive
   A frequency polygon of relative cumulative frequency against upper class boundaries
[Figure: ogive; horizontal axis: upper class boundary (1.025, 1.335, 1.645, 1.955, 2.265, 2.575); vertical axis: relative cumulative frequency (0 to 120)]
Pie chart
   The degree (central angle) of each slice is based on the relative frequency
[Figure: pie chart with slices labeled 5, 5, 20, 42.5, 15, 12.5 (relative frequency, %)]
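As a small worked example, the central angle of each slice can be computed from the frequencies, assuming the usual convention that the full circle of 360 degrees represents 100% of the samples. The variable names are chosen for illustration.

```python
frequencies = [2, 2, 8, 17, 6, 5]
total = sum(frequencies)

# Central angle of each slice: the class's share of the total, scaled to 360 degrees.
angles = [360 * f / total for f in frequencies]

print(angles)        # [18.0, 18.0, 72.0, 153.0, 54.0, 45.0]
print(sum(angles))   # 360.0
```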
Numerical Methods
Measures of Central Tendencies
1. Mean / Average
   The sum of the products of each class mark and its corresponding frequency, divided by the total number of samples
   Mean = Σ(xi · fi) / (total number of samples)
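Applied to the nicotine table, the definition above can be sketched as follows. The function name `grouped_mean` is hypothetical.

```python
class_marks = [0.87, 1.18, 1.49, 1.80, 2.11, 2.42]
frequencies = [2, 2, 8, 17, 6, 5]

def grouped_mean(marks, freqs):
    """Sum of (class mark x frequency) divided by the total number of samples."""
    return sum(x * f for x, f in zip(marks, freqs)) / sum(freqs)

print(round(grouped_mean(class_marks, frequencies), 4))   # 1.7845
```

This is an approximation of the mean of the raw data, since every value in a class is treated as if it were equal to the class mark.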
2. Median
   The value that divides the samples into two equal halves when the samples are arranged from lowest to highest
   Quantities used for grouped data:
      Lower class boundary of the median class
      Total frequencies of all class intervals before the median class
      Frequency of the median class
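The formula itself did not survive in the transcript. The sketch below assumes the standard interpolation formula for grouped data, median = L + ((n/2 − cf) / f) × c, where L is the lower class boundary of the median class, cf the total frequency before it, f its frequency, and c the class width. The function name is hypothetical.

```python
lower_boundaries = [0.715, 1.025, 1.335, 1.645, 1.955, 2.265]
frequencies = [2, 2, 8, 17, 6, 5]
width = 0.31

def grouped_median(lowers, freqs, c):
    """Interpolate inside the first class whose cumulative frequency reaches n/2."""
    n = sum(freqs)
    cum_before = 0
    for lower, f in zip(lowers, freqs):
        if cum_before + f >= n / 2:
            return lower + (n / 2 - cum_before) / f * c
        cum_before += f

print(round(grouped_median(lower_boundaries, frequencies, width), 4))   # 1.7909
```

For the nicotine data, n/2 = 20 falls in the class 1.645-1.955, which has 12 observations before it and a frequency of 17, so the median is 1.645 + (8/17) × 0.31.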
3. Mode
   The most frequent number
   Quantities used for grouped data:
      Lower class boundary of the modal class
      Frequency difference between the modal class and the preceding class
      Frequency difference between the modal class and the succeeding class
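The formula is missing from the transcript. This sketch assumes the common grouped-data formula, mode = L + (d1 / (d1 + d2)) × c, where L is the lower class boundary of the modal class, d1 and d2 are the frequency differences with the preceding and succeeding classes, and c is the class width. The function name is hypothetical.

```python
lower_boundaries = [0.715, 1.025, 1.335, 1.645, 1.955, 2.265]
frequencies = [2, 2, 8, 17, 6, 5]
width = 0.31

def grouped_mode(lowers, freqs, c):
    """Locate the class with the highest frequency and interpolate inside it."""
    i = max(range(len(freqs)), key=freqs.__getitem__)       # index of the modal class
    before = freqs[i - 1] if i > 0 else 0
    after = freqs[i + 1] if i < len(freqs) - 1 else 0
    d1 = freqs[i] - before     # difference with the preceding class
    d2 = freqs[i] - after      # difference with the succeeding class
    return lowers[i] + d1 / (d1 + d2) * c

print(round(grouped_mode(lower_boundaries, frequencies, width), 4))   # 1.7845
```

Here d1 = 17 − 8 = 9 and d2 = 17 − 6 = 11, so the mode is 1.645 + (9/20) × 0.31.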
Measures of Variability / Dispersion
1. Range
   Measures how tightly the samples are clustered
   It is the difference between the highest and the lowest values of the raw data
   Range = Maximum value − Minimum value
2. Variance
   Measures how the samples are dispersed about the mean
3. Standard deviation, s
   The positive square root of the variance
   Coefficient of variation, Cv
   If Cv < 10, the data are considered clustered; otherwise the data are considered dispersed
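The variance and Cv formulas are not preserved in the transcript, so the sketch below rests on assumptions: the sample variance for grouped data with divisor n − 1, and Cv expressed as a percent, Cv = 100 × s / mean. A divisor of n would give slightly smaller values. The function name is hypothetical.

```python
import math

class_marks = [0.87, 1.18, 1.49, 1.80, 2.11, 2.42]
frequencies = [2, 2, 8, 17, 6, 5]

def grouped_dispersion(marks, freqs):
    """Return (variance, standard deviation, coefficient of variation in %)."""
    n = sum(freqs)
    mean = sum(x * f for x, f in zip(marks, freqs)) / n
    # Assumed: sample variance, i.e. divisor n - 1.
    variance = sum(f * (x - mean) ** 2 for x, f in zip(marks, freqs)) / (n - 1)
    s = math.sqrt(variance)
    cv = 100 * s / mean
    return variance, s, cv

variance, s, cv = grouped_dispersion(class_marks, frequencies)
print(round(variance, 4), round(s, 4), round(cv, 2))
print("clustered" if cv < 10 else "dispersed")
```

Under these assumptions Cv comes out near 21.5 for the nicotine data, so by the rule above the data would be classed as dispersed.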
Measures of Shape
1. Skewness, Sk
   A measure of the symmetry of the distribution of the sample
   If Sk < 0, the distribution is skewed to the left (i.e., the left tail is longer than the right tail)
   If Sk = 0, the distribution is symmetric with respect to the mean, i.e., the right and left tails are of equal length (as in the normal or Gaussian distribution)
   If Sk > 0, the distribution is skewed to the right (i.e., the right tail is longer than the left tail)
2. Kurtosis
   A measure of the height (peakedness) of the distribution relative to a normal distribution
   If kurtosis < 0, the distribution has a short height or is almost flat
   If kurtosis = 0, the distribution has the height of a normal distribution
   If kurtosis > 0, the distribution has a high peak
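The slides do not show which formulas were used for Sk and kurtosis. The sketch below assumes the moment coefficients computed from grouped data: Sk = m3 / m2^1.5 and kurtosis = m4 / m2^2 − 3 (excess kurtosis, so that 0 corresponds to a normal distribution), where mk is the k-th central moment with divisor n. Other definitions, such as Pearson's skewness, would give different numbers. The function names are hypothetical.

```python
class_marks = [0.87, 1.18, 1.49, 1.80, 2.11, 2.42]
frequencies = [2, 2, 8, 17, 6, 5]

def central_moment(marks, freqs, order):
    """k-th central moment of grouped data (divisor n)."""
    n = sum(freqs)
    mean = sum(x * f for x, f in zip(marks, freqs)) / n
    return sum(f * (x - mean) ** order for x, f in zip(marks, freqs)) / n

def shape_measures(marks, freqs):
    """Return (moment skewness, excess kurtosis) under the assumptions in the lead-in."""
    m2 = central_moment(marks, freqs, 2)
    m3 = central_moment(marks, freqs, 3)
    m4 = central_moment(marks, freqs, 4)
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3

sk, kurt = shape_measures(class_marks, frequencies)
print(round(sk, 3), round(kurt, 3))
```

With these definitions the nicotine data give a negative Sk (a longer left tail) and a slightly positive kurtosis.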
Measures of Data Location
1. Quartiles: Q1, Q2, Q3
   The values below which 25%, 50%, and 75% of the data fall, respectively
2. Deciles: D1, D2, D3, … D9
   The values below which 10%, 20%, 30%, … 90% of the data fall, respectively
3. Percentiles: P1, P2, P3, … P99
   The values below which 1%, 2%, 3%, … 99% of the data fall, respectively
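For grouped data, all three location measures can be computed with one function, since Q1 = P25, Q2 = P50, Q3 = P75, and D9 = P90. The sketch assumes the same interpolation idea as the grouped median: P = L + ((p·n/100 − cf) / f) × c, with L the lower boundary of the class containing the percentile, cf the total frequency before it, f its frequency, and c the class width. The function name is hypothetical.

```python
lower_boundaries = [0.715, 1.025, 1.335, 1.645, 1.955, 2.265]
frequencies = [2, 2, 8, 17, 6, 5]
width = 0.31

def grouped_percentile(p, lowers, freqs, c):
    """Interpolated p-th percentile (0 < p < 100) of grouped data."""
    n = sum(freqs)
    target = p / 100 * n       # number of observations that must lie below the percentile
    cum_before = 0
    for lower, f in zip(lowers, freqs):
        if f > 0 and cum_before + f >= target:
            return lower + (target - cum_before) / f * c
        cum_before += f

for name, p in [("Q1", 25), ("Q2", 50), ("Q3", 75), ("D9", 90)]:
    print(name, round(grouped_percentile(p, lower_boundaries, frequencies, width), 4))
```

Q2 coincides with the grouped median, which is a quick consistency check on the implementation.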
Quiz
The diameters of 36 rivet heads, in 1/100 of an inch, are given below:
6.72 6.77 6.82 6.70 6.78 6.70 6.62 6.75 6.66
6.66 6.64 6.76 6.73 6.80 6.72 6.76 6.76 6.68
6.66 6.62 6.72 6.76 6.70 6.78 6.76 6.67 6.70
6.72 6.74 6.81 6.79 6.78 6.66 6.76 6.76 6.72
1. Construct a cumulative frequency table
2. Determine the mean, median, and mode
3. Determine the variance, standard deviation, and coefficient of variation
4. Determine the skewness and kurtosis of the distribution and draw a conclusion about the shape of the distribution