the nature of probability and statistics

The Nature of Probability and Statistics Section 1.1


Post on 18-Dec-2021


Page 1: The Nature of Probability and Statistics

The Nature of Probability and Statistics

Section 1.1


What is statistics? Statistics is the science of conducting studies to collect, organize, summarize, analyze, and draw conclusions from data.


• What is a variable? A variable is a characteristic or attribute that can assume different values.

• What are data? Data are the values (measurements or observations) that the variables can assume.

• Variables whose values are determined by chance are called random variables.

• A data set is a collection of data values. Each value in a data set is called a data value or datum.


• In statistics we need to distinguish between a sample and a population.

• A population consists of all subjects (humans or otherwise) that are being studied.

• When data are collected from every subject in a population, the study is called a census.

• A sample is a group of subjects selected from a population.


• Types of statistics:

• (1) Descriptive statistics

• (2) Inferential statistics

• What is descriptive statistics? Descriptive statistics consists of the collection, organization, summarization, and presentation of data.

• What is inferential statistics? Inferential statistics consists of generalizing from samples to populations, performing estimations and hypothesis tests, determining relationships among variables and making predictions.


Section 1.2 : Variables and Types of Data


• Variables can be classified as qualitative or quantitative.

• Qualitative variables are the variables that have distinct categories according to some characteristic or attribute.

• Examples: male/female, yes/no, etc.

• Quantitative variables are variables that can be counted or measured.

• Examples: age, height, weight, temperature.

• Quantitative variables can be further subdivided into discrete and continuous variables.


• Discrete variables assume values that can be counted.

• Continuous variables assume an infinite number of values between any two specific values. They are obtained by measuring. They often include fractions and decimals.


Variables

• Qualitative

• Quantitative: Discrete or Continuous


• Measurement Scales.

• Nominal level of measurement is one in which no order or ranking is placed on the data.

• Examples: subjects taught by various professors, religions, political parties.

• Ordinal level of measurement is one in which data are divided into categories that can be ranked.

• Examples: letter grades; 1st, 2nd, 3rd place; rating scales (poor, good, excellent).


• The interval level of measurement ranks data, and precise differences between units of measure do exist; however, there is no meaningful zero.

• Examples: IQ scores, temperature (in degrees Fahrenheit or Celsius), SAT scores.

• The ratio level of measurement is similar to the interval level, except that there is a meaningful zero. In addition, true ratios exist when the same variable is measured on two different members of the population.

• Examples: height, weight, salary, age.
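
The difference between the interval and ratio levels can be checked with a little arithmetic. The sketch below is only an illustration: the temperatures, weights, and function names are made up for the example. It shows that a ratio of two Fahrenheit readings changes when the same readings are expressed in Celsius, while a ratio of two weights stays the same in pounds or kilograms.

```python
def fahrenheit_to_celsius(f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (f - 32) * 5 / 9


def pounds_to_kilograms(lb):
    """Convert a weight from pounds to kilograms."""
    return lb * 0.45359237


# Interval level: the zero point is arbitrary, so the ratio depends on the unit.
ratio_f = 80 / 40
ratio_c = fahrenheit_to_celsius(80) / fahrenheit_to_celsius(40)

# Ratio level: zero means "none", so the ratio is the same in any unit.
ratio_lb = 200 / 100
ratio_kg = pounds_to_kilograms(200) / pounds_to_kilograms(100)

print("temperature ratios:", ratio_f, round(ratio_c, 2))
print("weight ratios:", ratio_lb, round(ratio_kg, 2))
```

Because the two temperature ratios disagree, a statement such as "80 degrees is twice as hot as 40 degrees" has no fixed meaning, whereas "200 pounds is twice as heavy as 100 pounds" does.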


Measurement Scales

• Nominal: classify (dichotomy, categories, individuals)

• Ordinal: rank

• Interval: quantity

• Ratio: quantity


Section 1.3 : Data Collection and Sampling Techniques


• Data can be collected in a variety of ways, one of which is through surveys, such as telephone surveys, mailed questionnaires, and personal interviews.

• Each method has its pros and cons. For telephone interviews, for instance, the pros include that they are financially feasible and that people tend to be more candid.

• The cons: some people will not have a phone, some numbers will be unlisted, and some people will not accept phone calls.


• Research requires data, so researchers use samples to collect data about a large population.

• Why use samples rather than the population?

• Samples save time and money.

• Question: How do we obtain these samples?

• A sample may be either biased or unbiased; the goal is to obtain a sample that is unbiased.


Sampling Techniques

• Random Sampling

• Systematic Sampling

• Stratified Sampling

• Cluster Sampling

• Other Sampling Methods


Random Sampling

• A random sample is one in which all members of the population have an equal chance of being selected.

• To carry out this random selection, a random number generator is typically used.
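
A minimal sketch of random sampling with Python's standard `random` module. The population of 3000 numbered subjects, the sample size of 100, and the seed are assumptions chosen for illustration only.

```python
import random

# Hypothetical population: subjects numbered 1 through 3000.
population = list(range(1, 3001))

# A seeded generator makes the draw repeatable; the seed value is arbitrary.
rng = random.Random(42)

# Draw 100 distinct subjects; every subject is equally likely to be chosen.
sample = rng.sample(population, k=100)

print(len(sample), "subjects selected, for example:", sorted(sample)[:5])
```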


Systematic Sampling

• A sample is obtained by selecting every kth member of the population where k is a counting number.

• For instance, if we had 3000 subjects and wanted a sample of 100 subjects, we would use k = 3000/100 = 30. We would pick a starting subject at random from the first 30 and then select every 30th subject from there on.
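
The 3000-subject example can be sketched in Python as follows. The function name `systematic_sample` is hypothetical, and the random starting point within the first k subjects is an assumption of this sketch rather than something the slide states.

```python
import random


def systematic_sample(subjects, sample_size, rng):
    """Select every k-th subject, where k = population size // sample size."""
    k = len(subjects) // sample_size
    start = rng.randrange(k)  # assumed: random start among the first k subjects
    return subjects[start::k][:sample_size]


subjects = list(range(1, 3001))  # hypothetical subjects numbered 1..3000
sample = systematic_sample(subjects, 100, random.Random(0))

print("first selections:", sample[:3])
print("sample size:", len(sample))
```

With 3000 subjects and a sample of 100, k works out to 30, so consecutive selections are always 30 positions apart.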


Stratified Sampling

• A sample is obtained by dividing the population into subgroups or strata according to some characteristic relevant to the study. Then subjects are selected from the subgroups.
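
A rough sketch of stratified sampling, assuming that the same number of subjects is drawn at random from each stratum. The student records, the class-rank strata, and the function name `stratified_sample` are all made up for the example.

```python
import random
from collections import defaultdict


def stratified_sample(subjects, stratum_of, per_stratum, rng):
    """Group subjects into strata, then draw per_stratum subjects from each."""
    strata = defaultdict(list)
    for subject in subjects:
        strata[stratum_of(subject)].append(subject)

    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, per_stratum))
    return sample


# Hypothetical population: 200 students, half freshmen and half sophomores.
students = [(i, "freshman" if i % 2 == 0 else "sophomore") for i in range(200)]

sample = stratified_sample(students, lambda s: s[1], 10, random.Random(1))
print(len(sample), "students selected")
```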


Cluster Sampling

• A sample is obtained by dividing the population into sections or clusters and then selecting one or more clusters and using all members in the cluster(s) as the members of the sample.
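
Cluster sampling can be sketched the same way. Here the clusters are hypothetical apartment buildings, two of them are chosen at random, and every resident of a chosen building joins the sample. The names `cluster_sample` and `building_...` are illustrative only.

```python
import random


def cluster_sample(clusters, n_clusters, rng):
    """Pick n_clusters clusters at random and keep all of their members."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = [member for name in chosen for member in clusters[name]]
    return chosen, sample


# Hypothetical population: 10 buildings with 5 residents each.
clusters = {
    f"building_{b}": [f"b{b}_resident_{r}" for r in range(5)]
    for b in range(10)
}

chosen, sample = cluster_sample(clusters, 2, random.Random(7))
print("chosen clusters:", chosen)
print("sample size:", len(sample))
```

Unlike stratified sampling, which takes some subjects from every subgroup, this takes every subject from only some of the groups.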


Other Sampling Methods

• Convenience sampling: the researcher uses subjects that are convenient.

• Volunteer (self-selected) sampling: the subjects themselves decide whether to take part.


Types of Errors

• Sampling error is the difference between the results obtained from a sample and the results obtained from the population from which the sample was selected.

• Non-sampling error occurs when the data are obtained erroneously or the sample is biased, i.e., nonrepresentative.
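
Sampling error can be illustrated with simulated data. In the sketch below, the population of 10,000 heights (in centimeters), the sample size of 50, and the seed are all invented for the example; the "result" being compared is the mean.

```python
import random
import statistics

rng = random.Random(3)

# Hypothetical population: 10,000 simulated heights in centimeters.
population = [rng.gauss(170, 10) for _ in range(10_000)]

# An unbiased random sample of 50 subjects from that population.
sample = rng.sample(population, 50)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

# Sampling error: sample result minus population result.
sampling_error = sample_mean - population_mean
print("sampling error (cm):", round(sampling_error, 2))
```

Even though the sample is drawn at random, its mean will generally differ somewhat from the population mean; that gap is the sampling error, and it is not a sign of a flawed study.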


Section 1.4 : Experimental Design


Types of Studies

• Observational Studies

• Experimental Studies

• In an observational study the researcher observes what is happening or what has happened in the past and tries to draw conclusions based on these observations.


Types of Observational Study

• Cross-sectional

• Retrospective

• Longitudinal


• In a cross-sectional study, all the data are collected at one time.

• In a retrospective study, data are collected from records from the past.

• In a longitudinal study, data are collected over a period of time.


Advantages and Disadvantages of Observational Studies

Advantages

• Occurs in a natural setting.

• Can be done in situations where conducting an experiment would be dangerous or unethical.

Disadvantages

• Variables are not controlled by the researcher.

• Cause and effect cannot be shown.

• Expensive.

• Time consuming.


Experimental Study

• In an experimental study the researcher manipulates one of the variables and tries to determine how the manipulation influences other variables.

• The variable being manipulated by the researcher is called the independent variable or the explanatory variable, and the resultant variable is called the dependent variable or the outcome variable.

• There are two groups: the treatment group and the control group.

• The treatment group is the group that receives the treatment (for example, special instructions).

• The control group is the group that receives a placebo or no treatment.


Advantages and Disadvantages of an Experimental Study

Advantages

• Researcher decides how to select the subjects.

• Researcher decides how to assign the subjects.

• Researcher controls or manipulates the independent variable.

Disadvantages

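• May occur in unnatural settings, such as laboratories.

• Hawthorne effect: subjects who know they are taking part in an experiment may change their behavior.

• Confounding variables

• (Note: a confounding variable is one that influences the dependent or outcome variable but was not separated from the independent variable.)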


Factors that can influence statistical studies

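• Placebo effect: subjects may respond to the belief that they are being treated rather than to the treatment itself. To minimize this effect, researchers use a technique called blinding.

• In blinding, the subjects do not know whether they are getting the treatment or not.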


Types of Blinding

• Single

• Double

• Triple


Blocking

• Blocking is used to minimize variability.

• To reduce variability, an experiment can use a completely randomized design, in which subjects are assigned to the treatments at random, or a matched-pair design, in which subjects are paired and, within each pair, one subject is assigned to the treatment group and the other to the control group.
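
A minimal sketch of assignment in a completely randomized design, assuming two equal-sized groups. The subject labels, the group sizes, and the function name `randomize_assignment` are hypothetical.

```python
import random


def randomize_assignment(subjects, rng):
    """Shuffle the subjects and split them into treatment and control halves."""
    shuffled = list(subjects)  # copy so the original order is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


subjects = [f"subject_{i}" for i in range(20)]
treatment, control = randomize_assignment(subjects, random.Random(11))

print("treatment group:", treatment)
print("control group:", control)
```

Because chance alone decides who lands in which group, characteristics the researcher did not measure tend to balance out between the groups.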


Guidelines for Statistical Studies

• 1. Formulate the purpose of the study.

• 2. Identify the variables for the study.

• 3. Define the population.

• 4. Decide what sampling method you will use to collect the data.

• 5. Collect the data.

• 6. Summarize the data and perform any statistical calculations needed.

• 7. Interpret the results.