the nature of probability and statistics

Post on 18-Dec-2021


The Nature of Probability and Statistics

Section 1.1

• What is statistics? Statistics is the science of conducting studies to collect, organize, summarize, analyze, and draw conclusions from data.

• What is a variable? A variable is a characteristic or attribute that can assume different values.

• What is data? Data are the values (measurements or observations) that the variables can assume.

• Variables whose values are determined by chance are called random variables.

• A data set is a collection of data values. Each value in a data set is called a data value or datum.

• In statistics we need to distinguish between a sample and a population.

• A population consists of all subjects (humans or otherwise) that are being studied.

• When data is collected from every subject in a population, it is called a census.

• A sample is a group of subjects selected from a population.

• Types of statistics.

• (1) Descriptive Statistics

• (2) Inferential Statistics.

• What is descriptive statistics? Descriptive statistics consists of the collection, organization, summarization, and presentation of data.

• What is inferential statistics? Inferential statistics consists of generalizing from samples to populations, performing estimations and hypothesis tests, determining relationships among variables and making predictions.
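As a small illustration of the descriptive side, the sketch below summarizes a made-up data set with Python's standard `statistics` module. The ages are hypothetical values chosen only for the example; they do not come from any study.

```python
import statistics

# Hypothetical data set: ages of eight students (invented for illustration).
ages = [19, 21, 20, 23, 21, 22, 21, 25]

# Descriptive statistics organize and summarize the data that were collected.
mean_age = statistics.mean(ages)      # arithmetic average
median_age = statistics.median(ages)  # middle value of the sorted data
mode_age = statistics.mode(ages)      # most frequent value
age_range = max(ages) - min(ages)     # spread between the extremes

print(mean_age, median_age, mode_age, age_range)
```

Inferential statistics would go one step further and use a summary such as `mean_age` to say something about the population the students were drawn from.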

Section 1.2 : Variables and Types of Data

• Variables can be classified as qualitative or quantitative.

• Qualitative variables are the variables that have distinct categories according to some characteristic or attribute.

• Example: male/female, yes/no, etc.

• Quantitative variables are variables that can be counted or measured.

• Example: age, height, weight, temperature.

• Quantitative variables can be further subdivided into discrete and continuous variables.

• Discrete variables assume values that can be counted.

• Continuous variables assume an infinite number of values between any two specific values. They are obtained by measuring. They often include fractions and decimals.

Variables

• Qualitative

• Quantitative: Discrete or Continuous

• Measurement Scales.

• Nominal level of measurement is one in which no order or ranking is placed on the data.

• Example: subjects taught by various professors, religions, political parties.

• Ordinal level of measurement is one in which data are divided into categories that can be ranked.

• Example: grades; 1st, 2nd, 3rd place; rating scales (poor, good, excellent).

• The interval level of measurement ranks data, and precise differences between units of measure do exist; however, there is no meaningful zero.

• Example: IQ scores, temperature (in degrees Fahrenheit or Celsius), SAT scores, etc.

• The ratio level of measurement is similar to the interval level, except that there is a meaningful zero. In addition, true ratios exist when the same variable is measured on two different members of the population.

• Example: height, weight, salary, age, etc.
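The difference between an interval scale and a ratio scale can be checked with a little arithmetic. The sketch below assumes two hypothetical temperature readings, 20 and 10 degrees Celsius, and compares their ratio on the Celsius scale (no meaningful zero) with their ratio on the Kelvin scale (true zero).

```python
def celsius_to_kelvin(celsius):
    """Convert degrees Celsius to kelvins (a scale with a true zero)."""
    return celsius + 273.15

# Hypothetical readings, chosen only for illustration.
warm_c, cool_c = 20.0, 10.0

# On the interval scale the quotient is 2.0, but it has no physical meaning:
# 0 degrees Celsius does not mean "no temperature".
ratio_celsius = warm_c / cool_c

# On the ratio scale the quotient is meaningful, and it is close to 1.035.
ratio_kelvin = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print(ratio_celsius, round(ratio_kelvin, 3))
```

So 20 degrees Celsius is not "twice as hot" as 10 degrees Celsius, whereas a 20 kg weight really is twice as heavy as a 10 kg weight.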

Measurement Scales

• Nominal: classify (dichotomy, categories, individuals)

• Ordinal: rank

• Interval: quantity

• Ratio: quantity

Section 1.3 : Data Collection and Sampling Techniques

• Data can be collected in a variety of ways. One of the most common is the survey, conducted by telephone, mailed questionnaire, or personal interview.

• Each survey method has its pros and cons. For telephone surveys, the pros include: they are financially feasible, and people tend to be more candid.

• The cons include: some people do not have phones, some numbers are unlisted, and some people will not accept the calls.

• Research requires data. When the population is large, researchers collect the data from a sample rather than from every subject.

• Why use samples rather than the population?

• Samples save time and money.

• Question: How do we obtain these samples?

• A sample may be either biased or unbiased; the goal is to obtain a sample that is unbiased.

Sampling Techniques

• Random Sampling

• Systematic Sampling

• Stratified Sampling

• Cluster Sampling

• Other Sampling Methods

Random Sampling

• A random sample is one in which all members of the population have an equal chance of being selected.

• A random number generator is used to carry out the random selection.
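A minimal sketch of random sampling with Python's standard `random` module follows. The population of 3000 numbered subjects, the sample size of 100, and the fixed seed are assumptions made only so the example is concrete and repeatable.

```python
import random

# Hypothetical population: subject IDs 1 through 3000.
population = list(range(1, 3001))

# A seeded generator keeps the sketch repeatable; a real study would not
# normally fix the seed.
rng = random.Random(42)

# random.sample draws without replacement, and every subject has the same
# chance of being selected.
simple_random_sample = rng.sample(population, k=100)

print(len(simple_random_sample))
print(sorted(simple_random_sample)[:5])
```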

Systematic Sampling

• A sample is obtained by selecting every kth member of the population where k is a counting number.

• For instance, if we had 3000 subjects and wanted a sample of 100, we would compute k = 3000/100 = 30. The first subject is chosen at random from subjects 1 through 30, and every 30th subject after that is selected.
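The selection of every kth subject can be sketched as below, using the 3000-subject, 100-sample figures from the example. The random starting point among the first k subjects is an assumption (a common convention), and the function name `systematic_sample` is hypothetical.

```python
import random

def systematic_sample(subjects, sample_size, rng):
    """Pick every k-th subject after a random start within the first k."""
    k = len(subjects) // sample_size   # sampling interval, here 3000 // 100 = 30
    start = rng.randrange(k)           # random index among the first k subjects
    return subjects[start::k][:sample_size]

subjects = list(range(1, 3001))        # hypothetical subject IDs 1..3000
sample = systematic_sample(subjects, 100, random.Random(0))

print(len(sample))   # number of subjects selected
print(sample[:3])    # first three selected IDs, spaced 30 apart
```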

Stratified Sampling

• A sample is obtained by dividing the population into subgroups or strata according to some characteristic relevant to the study. Then subjects are selected from the subgroups.
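One way to sketch stratified sampling is shown below. The student population, the class-year strata, and the choice of drawing the same number of subjects from each stratum are all assumptions for illustration; other allocation rules (for example, proportional to stratum size) are also used.

```python
import random
from collections import defaultdict

def stratified_sample(subjects, stratum_of, per_stratum, rng):
    """Group subjects into strata, then draw randomly within each stratum."""
    strata = defaultdict(list)
    for subject in subjects:
        strata[stratum_of(subject)].append(subject)
    sample = []
    for label in sorted(strata):       # fixed order keeps the run repeatable
        sample.extend(rng.sample(strata[label], per_stratum))
    return sample

# Hypothetical population: 400 students tagged with a class year.
years = ["freshman", "sophomore", "junior", "senior"]
students = [(student_id, years[student_id % 4]) for student_id in range(400)]

sample = stratified_sample(students, lambda s: s[1], 5, random.Random(1))
print(len(sample))   # 5 students from each of the 4 strata
```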

Cluster Sampling

• A sample is obtained by dividing the population into sections or clusters and then selecting one or more clusters and using all members in the cluster(s) as the members of the sample.
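Cluster sampling can be sketched in the same spirit. The ten apartment buildings with twenty residents each are hypothetical; the point is that whole clusters are chosen at random and every member of a chosen cluster enters the sample.

```python
import random

# Hypothetical population already divided into clusters (buildings).
clusters = {
    f"building_{b}": [f"b{b}_resident_{r}" for r in range(20)]
    for b in range(10)
}

rng = random.Random(7)

# Randomly choose two whole clusters.
chosen = rng.sample(sorted(clusters), k=2)

# Use all members of the chosen clusters as the sample.
cluster_sample = [member for name in chosen for member in clusters[name]]

print(chosen)
print(len(cluster_sample))
```

In contrast with stratified sampling, only some of the groups are represented, but each of those is used in full.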

Other Sampling Methods

• Convenience sampling: the researcher uses subjects that are convenient.

• Volunteer (self-selected) sampling: subjects decide for themselves whether to take part.

Types of Errors

• Sampling error is the difference between the results obtained from a sample and the results obtained from the population from which the sample was selected.

• Non-sampling error occurs when the data are obtained erroneously or the sample is biased, i.e., nonrepresentative.
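Sampling error can be made concrete with a simulated population. In the sketch below the heights are generated values (an assumption, not real data), and the error is measured on the mean, which is only one of many results a study might report.

```python
import random
import statistics

rng = random.Random(3)

# Hypothetical population: 5000 simulated heights in centimeters.
population = [rng.gauss(170, 10) for _ in range(5000)]

# An unbiased random sample of 50 subjects from that population.
sample = rng.sample(population, 50)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

# Sampling error: the sample result minus the population result.
sampling_error = sample_mean - population_mean

print(round(population_mean, 2), round(sample_mean, 2), round(sampling_error, 2))
```

Even with a properly drawn sample the error is rarely exactly zero; it arises from chance, not from a mistake in collecting the data.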

Section 1.4 : Experimental Design

Types of Studies

• Observational Studies

• Experimental Studies

• In an observational study the researcher observes what is happening or what has happened in the past and tries to draw conclusions based on these observations.

Types of Observational Study

• Cross-sectional

• Retrospective

• Longitudinal

• In a cross-sectional study all the data is collected at one time.

• In a retrospective study data is collected from records from the past.

• In a longitudinal study data is collected over a period of time.

Advantages and Disadvantages of Observational Studies

Advantages

• Occurs in a natural setting.

• Can be done in situations where conducting an experiment would be dangerous or unethical.

Disadvantages

• Variables are not controlled by the researcher.

• Cause and effect cannot be shown.

• Expensive.

• Time consuming.

Experimental Study

• In an experimental study the researcher manipulates one of the variables and tries to determine how the manipulation influences other variables.

• The variable being manipulated by the researcher is called the independent variable or the explanatory variable and the resultant variable is called the dependent variable or the outcome variable.

• There are two groups: the treatment group and the control group.

• The treatment group is the group that receives the treatment (for example, special instructions).

• The control group is the group that receives a placebo or no treatment (no instructions).

Advantages and Disadvantages of an Experimental Study

Advantages

• Researcher decides how to select the subjects.

• Researcher decides how to assign the subjects.

• Researcher controls or manipulates the independent variable.

Disadvantages

• May occur in unnatural settings, such as laboratories.

• Hawthorne effect: subjects who know they are taking part in an experiment may change their behavior.

• Confounding variables

• (Note: a confounding variable is one that influences the dependent or outcome variable but was not separated from the independent variable.)

Factors that can influence statistical studies

• Placebo effect. To minimize this effect researchers use a technique called blinding.

• In blinding the subjects do not know whether they are getting the treatment or not.

Types of Blinding

• Single

• Double

• Triple

Blocking

• Blocking is used to minimize variability.

• To reduce variability, an experiment can use a completely randomized design, in which both the subjects and the treatments are assigned at random. It can also use a matched-pair design, in which subjects are paired and one member of each pair is assigned to the treatment group and the other to the control group.
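Random assignment in a completely randomized design can be sketched as follows. The twenty subject labels, the even split into two groups, and the function name `randomize` are assumptions for illustration.

```python
import random

def randomize(subjects, rng):
    """Shuffle the subjects and split them into treatment and control groups."""
    shuffled = subjects[:]             # copy so the original order is kept
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

subjects = [f"subject_{i}" for i in range(20)]   # hypothetical subjects
treatment, control = randomize(subjects, random.Random(11))

print(len(treatment), len(control))
```

Because chance alone decides who lands in each group, the groups should not differ systematically on characteristics the researcher did not measure. A matched-pair design would instead randomize within each pair.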

Guidelines for Statistical Studies

• 1. Formulate the purpose of the study.

• 2. Identify the variables for the study.

• 3. Define the population.

• 4. Decide what sampling method you will use to collect the data.

• 5. Collect the data.

• 6. Summarize the data and perform any statistical calculations needed.

• 7. Interpret the results.
