engineering probability and statistics intro

Engineering Probability and Statistics: Fundamentals of Statistics

Upload: benjomanimtim

Post on 16-Dec-2015


DESCRIPTION

From PLM College of Engineering

TRANSCRIPT

Introduction

Statistics

Both singular and plural in meaning.

In the plural sense, it refers to numerical or quantitative data such as age, weight, and height.

In the singular sense, it is a scientific method of collection, presentation, analysis, and interpretation of data for the purpose of drawing valid conclusions and making reasonable decisions.

Fields of Statistics

Descriptive Statistics: methods concerned with the collection and description of a set of data to yield meaningful information. It provides information only about the collected data and does not draw inferences or conclusions about a larger set of data.

Inferential Statistics: methods concerned with the analysis of a smaller group of data, known as the sample, leading to predictions or inferences about the larger set of data, the population from which the sample is drawn.

The Language of Summation

The symbol $\sum_{i=1}^{n} x_i$ is read as the sum (or summation) of $x_1, x_2, x_3, \ldots, x_n$.

Technically, it is read as the sum of the $x_i$ terms where $i$ ranges from 1 to $n$. The symbol $\Sigma$, the Greek capital letter sigma, is used to denote the sum.

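Laws of Notation

a.) $\sum_{i=1}^{n} (x_i + y_i) = \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} y_i$

b.) $\sum_{i=1}^{n} c\,x_i = c \sum_{i=1}^{n} x_i$

c.) $\sum_{i=1}^{n} c = nc$

Double summation:

$\sum_{i=1}^{n} \sum_{j=1}^{m} x_{ij} = (x_{11} + x_{12} + \cdots + x_{1m}) + (x_{21} + x_{22} + \cdots + x_{2m}) + \cdots + (x_{n1} + x_{n2} + \cdots + x_{nm})$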
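As a quick numerical check, the laws above can be tried on small lists. The following Python sketch uses made-up values for x, y, c, and a two-row table; they are illustrative assumptions, not data from the lecture.

```python
# Numerical check of the summation laws on small hypothetical lists.
x = [2, 4, 6, 8]          # hypothetical x_1..x_n
y = [1, 3, 5, 7]          # hypothetical y_1..y_n
c = 3                     # hypothetical constant
n = len(x)

# Law (a): the sum of a sum is the sum of the sums.
law_a = sum(xi + yi for xi, yi in zip(x, y)) == sum(x) + sum(y)

# Law (b): a constant factor can be moved outside the summation.
law_b = sum(c * xi for xi in x) == c * sum(x)

# Law (c): summing a constant n times gives n * c.
law_c = sum(c for _ in range(n)) == n * c

# Double summation: add every entry of an n-by-m table, row by row.
table = [[1, 2, 3],
         [4, 5, 6]]
double_sum = sum(sum(row) for row in table)

print(law_a, law_b, law_c, double_sum)   # True True True 21
```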

Parameters and Statistics

Parameter: any numerical value describing a characteristic of a population.

Statistic: any numerical value describing a characteristic of a sample.

Data Collection and Presentation

Data Collection

1. Data from primary sources: acquired through experiments, data-gathering devices, interviews, and surveys.

2. Data from secondary sources: obtained from printed materials such as books, magazines, and journals.

Slovin's Formula

The sufficiency of the sample size in surveys can be obtained using the formula

$n = \dfrac{N}{1 + Ne^2}$

where
n = sample size
N = population size
e = estimated error

Example: Slovin's Formula

Problem: How large a sample should be chosen if we expect 5% error from a population of 3000?

Given: N = 3000, e = 0.05

Required: n

Solution:

$n = \dfrac{3000}{1 + 3000(0.05)^2} = 352.94$

Hence, we choose the next higher whole value, which is 353, as the minimum sample size sufficient for the experiment.

Sampling Techniques

1. Non-Probability Sampling: type of sampling in which some subjects are certain to be chosen as part of the sample while others have no chance of being chosen.

a. Convenience Sampling: sampling technique based primarily on the availability of the respondents.

b. Quota Sampling: sampling technique in which there is a desired number of samples and respondents are taken as they volunteer to be part of the experiment.

c. Purposive Sampling: sampling technique in which the sample is obtained based on a certain premise.

2. Probability Sampling: eliminates bias against elements that would otherwise have no chance of being selected, by listing all possible elements.

a. Simple Random Sampling: performed by arranging the population according to a certain rule, numbering each element, and taking a sample by a randomizing procedure.

b. Systematic Sampling: done by arranging the population in a certain order, dividing it into equal groups, and taking the kth element in each group.

c. Stratified Sampling: done by grouping the population into strata, subpopulations with generally homogeneous or similar characteristics, and performing random sampling in each stratum in proportion to the size of the stratum relative to the population.

d. Cluster Sampling: done by identifying groups called clusters, subpopulations whose elements are as heterogeneous or diverse in characteristics as possible. Clusters must be similar to each other with respect to the parameter being examined. One or more clusters are selected for the sampling.
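A minimal Python sketch of Slovin's formula and three of the probability sampling techniques above. The function names, the fixed seed, the numbered population of 3000 element IDs, and the two strata are hypothetical choices made for illustration; rounding the result of Slovin's formula up follows the worked example.

```python
import math
import random


def slovin_sample_size(population_size, error):
    """Slovin's formula n = N / (1 + N e^2), rounded up to a whole number."""
    return math.ceil(population_size / (1 + population_size * error ** 2))


def simple_random_sample(population, n, rng):
    """Draw n distinct elements, each with the same chance of selection."""
    return rng.sample(population, n)


def systematic_sample(population, n, rng):
    """Split the ordered population into n equal groups and take the
    element at the same (randomly chosen) position in each group."""
    k = len(population) // n          # group size (sampling interval)
    start = rng.randrange(k)          # position inside each group
    return [population[start + i * k] for i in range(n)]


def stratified_sample(strata, n, rng):
    """Randomly sample each stratum in proportion to its size.
    Rounding may make the total differ slightly from n."""
    total = sum(len(members) for members in strata.values())
    sample = []
    for members in strata.values():
        share = round(n * len(members) / total)
        sample.extend(rng.sample(members, share))
    return sample


rng = random.Random(0)                # fixed seed, only for repeatability
population = list(range(1, 3001))     # hypothetical element IDs 1..3000
n = slovin_sample_size(3000, 0.05)    # 353, as in the worked example

srs = simple_random_sample(population, n, rng)
sys_sample = systematic_sample(population, n, rng)
strata = {"A": population[:1000], "B": population[1000:]}  # hypothetical strata
strat = stratified_sample(strata, n, rng)

print(n, len(srs), len(sys_sample), len(strat))
```

Cluster sampling is left out of the sketch because the text does not say how the clusters are chosen.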

Types of Data

Qualitative: descriptions used to portray the attributes of data.

Quantitative: measurable quantities such as scores, weights, and grades.

Grouped data: categorized data.

Ungrouped data: raw, unorganized data.

Classification of Data

Categorical data: data such as gender, color, civil status, and location, commonly recorded in non-numeric (qualitative) form.

Numerical data: information and observations that are countable or measurable quantities such as scores, weights, and grades (quantitative).

Levels of Measurement of Numerical Data

1. Nominal data: categorical data assigned to numbers. An example is assigning 1 for males and 2 for females.

2. Ordinal data: quantities where the numbers designate the rank order of the data. An example is the result of a contest or race, where ranking is measured.

3. Interval data: data type where the interval between consecutive numeric values is constant. In this type of data, addition and subtraction can be performed but not multiplication and division. Examples are the year, measured temperature, and final grade.

4. Ratio data: widely used in science and engineering. Almost all basic operations can be performed on this data type. One significant characteristic is the presence of a non-arbitrary zero point. Examples are length, mass, angle, charge, and energy.

Presentation of Data

1. Textual Form: way of presenting data in terms of statements, sentences, and paragraphs.

2. Tabular Form: using tables to present data in a way that is direct to the point and easily understood. Examples are the reference table and the summary table.

Ungrouped Data

Measures of Central Tendency

1. Mean: average of all the data or values

Population: $\mu = \dfrac{\sum_{i=1}^{N} x_i}{N}$

Sample: $\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n}$

2. Median ($\tilde{x}$): middle data or score

Procedure:
a. Arrange the data in an array (in order of magnitude).
b. Locate the middle value. If the number of values is odd, the middle value is the median; if even, take the average of the two middle values.

3. Mode ($\hat{x}$): most frequent value. If two values occur the most at the same frequency, the data is bi-modal; if three values occur the most at the same frequency, it is tri-modal.

Measures of Variation of Data

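How the data is spread out from the mean.

1. Range: difference of the highest and lowest values

RANGE = HV - LV

2. Standard Deviation

Population: $\sigma = \sqrt{\dfrac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$

Sample: $s = \sqrt{\dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$

3. Variance

Population: $\sigma^2 = \dfrac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$

Sample: $s^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

4. Coefficient of Variation: the percentage of the ratio of the standard deviation to the mean

Population: $V_p = \dfrac{\sigma}{\mu} \times 100\%$

Sample: $V_s = \dfrac{s}{\bar{x}} \times 100\%$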
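The measures defined above can be sketched in Python as follows, using the seven tuna-impurity readings from the example that follows. Variable names are illustrative; `statistics.median` and `statistics.multimode` come from the Python standard library (3.8 or later).

```python
import math
import statistics

# Percentage of foreign impurities in the 7 sampled tuna cans.
data = [1.8, 2.1, 1.7, 1.6, 0.9, 2.7, 1.8]
n = len(data)

mean = sum(data) / n
median = statistics.median(data)      # middle value of the sorted data
modes = statistics.multimode(data)    # a list, so bi-/tri-modal data is visible

data_range = max(data) - min(data)    # RANGE = HV - LV

# Sample variance and standard deviation divide by n - 1.
sample_var = sum((x - mean) ** 2 for x in data) / (n - 1)
sample_sd = math.sqrt(sample_var)

# Population versions divide by N; shown only for comparison, since
# these 7 cans are a sample rather than a whole population.
pop_var = sum((x - mean) ** 2 for x in data) / n
pop_sd = math.sqrt(pop_var)

cv_sample = sample_sd / mean * 100    # coefficient of variation, in percent

print(f"mean={mean:.2f} median={median} modes={modes}")
print(f"range={data_range:.2f} s^2={sample_var:.4f} s={sample_sd:.4f}")
print(f"CV={cv_sample:.2f}%")
```

Floating-point sums are not exact, so the results are rounded only when printed.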

Example

A food inspector examined a random sample of 7 cans of a certain brand of tuna to determine the percentage of foreign impurities. The following data were recorded:

1.8, 2.1, 1.7, 1.6, 0.9, 2.7, 1.8

Calculations

Mean

$\bar{x} = \dfrac{\sum_{i=1}^{7} x_i}{7} = \dfrac{1.8 + 2.1 + 1.7 + 1.6 + 0.9 + 2.7 + 1.8}{7} = 1.8$

Median

0.9, 1.6, 1.7, 1.8, 1.8, 2.1, 2.7

$\tilde{x} = 1.8$

Mode

$\hat{x} = 1.8$

Grouped Data

1. Stem-Leaf Plot: one way of summarizing ungrouped data. This table has two columns, one for the stem and the other for the leaves.

2. Frequency Distribution Table (FDT): numerous data can be analyzed by grouping the data into different classes with equal class intervals and determining the number of observations that fall within each class.

Example: Stem-Leaf Plot

Express the following data as a stem-and-leaf plot with the tens digit as the stem and the ones digit as the leaf.

12, 23, 12, 11, 10, 25, 29, 39, 31, 43, 42, 54, 53, 53, 56, 57, 56, 67, 54, 65, 76, 76, 75, 74
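One possible way to build such a plot programmatically is sketched below in Python. The variable names and the printed layout are illustrative assumptions.

```python
from collections import defaultdict

data = [12, 23, 12, 11, 10, 25, 29, 39, 31, 43, 42, 54,
        53, 53, 56, 57, 56, 67, 54, 65, 76, 76, 75, 74]

# Group the ones digits (leaves) under their tens digit (stem).
plot = defaultdict(list)
for value in sorted(data):
    stem, leaf = divmod(value, 10)
    plot[stem].append(leaf)

print("STEM | LEAF | FREQUENCY")
for stem in sorted(plot):
    leaf_text = ",".join(str(leaf) for leaf in plot[stem])
    print(f"{stem} | {leaf_text} | {len(plot[stem])}")
print("TOTAL |", sum(len(leaves) for leaves in plot.values()))
```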

Hint: treat the tens digit as the stem and the ones digit as the leaf.

STEM-LEAF PLOT

STEM    LEAF             FREQUENCY
1       0,1,2,2          4
2       3,5,9            3
3       1,9              2
4       2,3              2
5       3,3,4,4,6,6,7    7
6       5,7              2
7       4,5,6,6          4
TOTAL                    24

Frequency Distribution Table Terms

Class limits: smallest and largest values that fall within the class interval

Class boundaries: more precise limits of the class interval (by the next significant digit)

Frequency: number of observations

Class width: numerical difference between the upper and lower class boundaries

Class mark: midpoint of the class interval

Less than cumulative frequency: start to add from the lowest class interval

Greater than cumulative frequency: start to add from the highest class interval

Grouped Data

Steps in constructing a frequency distribution

1. Decide on the number of class intervals required, using either

Square-Root Principle: $k = \sqrt{N}$

or

Sturges' Formula: $k = 1 + 3.322 \log_{10} N$

(round up the result).

2. Determine the range.

3. Determine the size of the class interval by dividing the range by the desired number of class intervals (round off).

4. Determine the lower and upper class limits of each class interval. Start with the lowest data or score.

5. Determine the number of observations falling under each class interval by tallying.

Example

Suppose that the following data are the average speeds of vehicles running in SLEX (in kph):

27 56 38 43 48 38 43 30
34 40 50 43 57 52 25 43
35 29 49 36 29 52 46 49
46 47 31 52 41 31 55 50
42 41 52 25 46 36 41 36

Calculations

1. Number of class intervals

Square-Root Formula: $k = \sqrt{40} = 6.32$, so $k \approx 7$

or

Sturges' Formula: $k = 1 + 3.322 \log(40) = 6.32$, so $k \approx 7$

2. Range = 57 - 25 = 32

3. Class size = $\dfrac{32}{7} = 4.57 \approx 5$

Frequency Distribution

Class    f    x (Class Mark)    Class Boundaries    Less-than CF    Greater-than CF
55-59    3    57                54.5-59.5           40              3
50-54    6    52                49.5-54.5           37              9
45-49    7    47                44.5-49.5           31              16
40-44    9    42                39.5-44.5           24              25
35-39    6    37                34.5-39.5           15              31
30-34    4    32                29.5-34.5           9               35
25-29    5    27                24.5-29.5           5               40

Total N = 40
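The construction steps can be sketched in Python as follows. This is only an illustration under a few assumptions: the square-root result is used for the number of classes, the first class starts at the lowest score, and class boundaries are taken half a unit beyond the class limits because the speeds are whole numbers.

```python
import math

speeds = [27, 56, 38, 43, 48, 38, 43, 30,
          34, 40, 50, 43, 57, 52, 25, 43,
          35, 29, 49, 36, 29, 52, 46, 49,
          46, 47, 31, 52, 41, 31, 55, 50,
          42, 41, 52, 25, 46, 36, 41, 36]
N = len(speeds)

k_sqrt = math.ceil(math.sqrt(N))                   # square-root principle
k_sturges = math.ceil(1 + 3.322 * math.log10(N))   # Sturges' formula
data_range = max(speeds) - min(speeds)
width = round(data_range / k_sqrt)                 # class size, rounded off

# Build class limits upward from the lowest score, then tally.
rows = []
lower = min(speeds)
while lower <= max(speeds):
    upper = lower + width - 1
    f = sum(lower <= s <= upper for s in speeds)
    rows.append({"limits": (lower, upper), "f": f,
                 "mark": (lower + upper) / 2,
                 "boundaries": (lower - 0.5, upper + 0.5)})
    lower += width

# Less-than CF accumulates from the lowest class,
# greater-than CF accumulates from the highest class.
running = 0
for row in rows:
    running += row["f"]
    row["less_cf"] = running
    row["greater_cf"] = N - running + row["f"]

for row in reversed(rows):                         # highest class first
    lo, hi = row["limits"]
    print(f"{lo}-{hi}", row["f"], row["mark"], row["boundaries"],
          row["less_cf"], row["greater_cf"])
```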

Grouped Data

1. Measures of Central Location

a. Mean

a.1. Long Method

$\bar{x} = \dfrac{\sum fx}{N}$

where
f = frequency
x = class mark
N = number of observations

Example Calculations

Class    f    x     fx     d      fd     u     fu
55-59    3    57    171    15     45     3     9
50-54    6    52    312    10     60     2     12
45-49    7    47    329    5      35     1     7
40-44    9    42    378    0      0      0     0
35-39    6    37    222    -5     -30    -1    -6
30-34    4    32    128    -10    -40    -2    -8
25-29    5    27    135    -15    -75    -3    -15

Total: N = 40, Σfx = 1675, Σfd = -5, Σfu = -1

a.1. Long Method

$\bar{x} = \dfrac{\sum fx}{N} = \dfrac{1675}{40} = 41.875$
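The long method amounts to a frequency-weighted average of the class marks. A short Python sketch, with illustrative variable names and the class marks and frequencies copied from the table above:

```python
# Class marks and frequencies from the frequency distribution of speeds.
class_marks = [57, 52, 47, 42, 37, 32, 27]
frequencies = [3, 6, 7, 9, 6, 4, 5]

N = sum(frequencies)
sum_fx = sum(f * x for f, x in zip(frequencies, class_marks))
grouped_mean = sum_fx / N

print(N, sum_fx, grouped_mean)   # 40 1675 41.875
```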

a.2. Coding Method

$\bar{x} = A + C\left(\dfrac{\sum fu}{N}\right)$

where
A = class mark of the mean class (class interval of the assumed mean)
C = class width
f = frequency
u = code (deviation of the class mark from A in units of the class width, u = (x - A)/C)
N = number of observations

Example Calculations

With A = 42, C = 5, N = 40, and Σfu = -1 from the table of f, x, fx, d, fd, u, fu for the speed data:

$\bar{x} = 42 + 5\left(\dfrac{-1}{40}\right) = 41.875$

a.3. Short/Deviation Method

$\bar{x} = A + \dfrac{\sum fd}{N}$

where
A = class mark of the mean class (class interval of the assumed mean)
f = frequency
d = deviation of the class mark from A (d = x - A)
N = number of observations

Example Calculations

With A = 42, N = 40, and Σfd = -5 from the table of f, x, fx, d, fd, u, fu for the speed data:

$\bar{x} = 42 + \dfrac{-5}{40} = 41.875$
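Both shortcut methods can be checked against the long method with a short Python sketch. Variable names are illustrative; A = 42 and C = 5 are the values used in the worked calculations.

```python
class_marks = [57, 52, 47, 42, 37, 32, 27]
frequencies = [3, 6, 7, 9, 6, 4, 5]
A = 42   # assumed mean: class mark of the 40-44 class
C = 5    # class width

N = sum(frequencies)

# Deviation method: d = x - A
sum_fd = sum(f * (x - A) for f, x in zip(frequencies, class_marks))
mean_deviation_method = A + sum_fd / N

# Coding method: u = (x - A) / C, a whole number for equal-width classes
sum_fu = sum(f * ((x - A) // C) for f, x in zip(frequencies, class_marks))
mean_coding_method = A + C * sum_fu / N

print(sum_fd, sum_fu, mean_deviation_method, mean_coding_method)
# -5 -1 41.875 41.875
```

Choosing A near the center of the distribution keeps the products fd and fu small, which is the point of the two shortcuts when computing by hand.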

b. Median

$\tilde{x} = L_{Md} + C\left(\dfrac{\frac{N}{2} - f_{<}}{f_{Md}}\right)$

where
$L_{Md}$ = lower class boundary of the median class
$f_{<}$ = the cumulative frequency of the class preceding the median class
$f_{Md}$ = frequency of the median class
C = class size

Sample Calculations

N = 40, N/2 = 20

Class f f