sqqs1013 ch1-a122


Upload: kim-rae-ki

Post on 15-Jul-2015


SQQS1013 Elementary Statistics

INTRODUCTION TO STATISTICS

1.1 WHAT IS STATISTICS?

• The word statistics derives from the classical Latin root status, which means state.

• Statistics has become the universal language of the sciences.

• As potential users of statistics, we need to master both the “science” and the “art” of using statistical methodology correctly.

• These methods include:

Carefully defining the situation

Gathering data

Accurately summarizing the data

Deriving and communicating meaningful conclusions

• Nowadays statistics is used in almost all fields of human endeavor, such as:

Education

Agriculture

Business

Health

Chapter 1: Introduction to Statistics 1

Specific definition:

Statistics is a collection of procedures and principles for gathering data and analyzing information to help people make decisions when faced with uncertainty.

Example applications of Statistics

• Sports

A statistician may keep records of the number of hits a baseball player gets in a season.

• Finance

A financial advisor uses statistical information to make reliable predictions about investments.

• Public Health

An administrator would be concerned with the number of residents who contract a new strain of flu virus during a certain year.

• Others

Any ideas?

1.2 TWO ASPECTS IN STATISTICS

Statistics has Two Aspects:

1. Theoretical / Mathematical Statistics

• Deals with the development, derivation and proof of statistical theorems, formulas, rules and laws.

2. Applied Statistics

• Involves the application of those theorems, formulas, rules and laws to solve real-world problems.

• Applied Statistics can be divided into two main areas, depending on how data are used: descriptive statistics and inferential statistics.

ASPECTS OF STATISTICS

Theoretical / Mathematical Statistics

Deals with the development, derivation and proof of statistical theorems, formulas, rules and laws.

Applied Statistics

Involves the application of those theorems, formulas, rules and laws to solve real-world problems.

Descriptive Statistics (a branch of Applied Statistics)

Consists of methods for collecting, organizing, displaying and summarizing data.

Inferential Statistics (a branch of Applied Statistics)

Consists of methods that use results obtained from a sample to make decisions or draw conclusions about a population.

Descriptive statistics vs. inferential statistics

Descriptive statistics

• What most people think of when they hear the word statistics.

• Includes the collection, presentation, and description of sample data.

• Uses graphs, charts and tables to show data.

Inferential statistics

• Refers to the technique of interpreting the values resulting from the descriptive techniques and making decisions and drawing conclusions about the population.

Example 1

Determine which of the following statements is descriptive in nature and which is inferential.

a. Of all U.S. kindergarten teachers, 32% say that “knowing the alphabet” is an essential skill.

b. Of the 800 U.S. kindergarten teachers polled, 32% say that “knowing the alphabet” is an essential skill.
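The difference between describing a sample and inferring about a population can be sketched in a few lines of Python. The poll figures below (540 "yes" answers out of 1,000 voters) are hypothetical, and the normal-approximation interval is one common inferential technique chosen for illustration; it is not a method taken from these notes.

```python
import math

# Hypothetical poll (not from the notes): 540 of 1,000 sampled voters say "yes".
n = 1000
yes = 540

# Descriptive step: summarize the sample itself.
p_hat = yes / n

# Inferential step: an approximate 95% confidence interval for the
# population proportion, using the normal approximation (an assumption
# that is reasonable here because the sample is large).
z = 1.96
margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
ci = (p_hat - margin, p_hat + margin)

print(f"Sample proportion (descriptive): {p_hat:.3f}")
print(f"Approximate 95% interval for the population (inferential): "
      f"{ci[0]:.3f} to {ci[1]:.3f}")
```

The first number only describes the 1,000 people who were asked; the interval is a statement about all voters, which is what makes it inferential.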

• Why do we have to study statistics?

To read and understand various statistical studies in related fields.

To communicate and explain the results of studies in related fields using our own words.

To become better consumers and citizens.

1.3 BASIC TERMS OF STATISTICS

• Population vs. Sample

Population

• A collection of all individuals about which information is desired.

• ‘Individuals’ are usually people but could also be schools, cities, pet dogs, agricultural fields, etc.

• There are two kinds of population:

Finite population: when the membership of a population can be (or could be) physically listed, e.g. the books in a library.

Infinite population: when the membership is unlimited, e.g. the population of all people who might use aspirin.

Sample

• A subset of the population.

[Figure: Population (Parameter), Sample (Statistic), Inference]

• Parameter vs. Statistic

Parameter

• A numerical value summarizing all the data of an entire population.

• Often a Greek letter is used to symbolize the name of a parameter.

Average/Mean: µ

Standard deviation: σ

e.g. The “average” age at time of admission for all students who have ever attended our college.

Statistic

• A numerical value summarizing the sample data.

• A letter of the English alphabet is used to symbolize the name of a statistic.

Average/Mean: x̄

Standard deviation: s

e.g. The “average” height, found by using the set of 25 heights.
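For reference, the usual formulas behind the mean and standard deviation symbols are sketched below. The notes do not state them at this point, so the notation is an assumption: N is the population size, n is the sample size, x_i are the individual data values, and \bar{x} denotes the sample mean.

```latex
% Population parameters (computed from all N values)
\[
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i ,
\qquad
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2}
\]
% Sample statistics (computed from the n sampled values)
\[
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i ,
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}
\]
```

The divisor n - 1 in the sample standard deviation is the conventional choice when s is used to estimate σ.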

• Variable

A characteristic of interest about each individual element of a population or sample. e.g. A student’s age at entrance into college, the color of a student’s hair.

• Data value

The value of the variable associated with one element of a population or sample. This value may be a number, a word, or a symbol. e.g. Farah entered college at age “23”, her hair is “brown”.

• Data

The set of values collected for the variable from each of the elements that belong to the sample. e.g. The set of 25 heights collected from 25 students.

• Census: a survey that includes every element in the population.

• Sample survey: a survey that includes only the elements in a selected sample.

Example 2

A statistics student is interested in finding out something about the average ringgit value of cars owned by the faculty members of our university. Each of the seven terms just described can be identified in this situation.

i) Population: the collection of all cars owned by all faculty members at our university.

ii) Sample: any subset of that population. For example, the cars owned by members of the statistics department.

iii) Variable: the “ringgit value” (RM) of each individual car.

iv) Data value: one data value is the ringgit value of a particular car. Ali’s car, for example, is valued at RM45,000.

v) Data: the set of values that correspond to the sample obtained (45,000; 55,000; 34,000; …).

vi) Parameter: the parameter about which we are seeking information is the “average” value of all cars in the population.

vii) Statistic: the statistic that will be found is the “average” value of the cars in the sample.
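The seven terms in this example can be mirrored in a short Python sketch. All car values below are made up for illustration (the notes give only three values), and the population is kept tiny so the parameter can be computed directly, which would not normally be possible in practice.

```python
import random
import statistics

# Hypothetical population: ringgit values of ALL cars owned by faculty
# members (12 invented values, for illustration only).
population = [45_000, 55_000, 34_000, 62_000, 28_000, 71_000,
              39_000, 48_000, 52_000, 30_000, 66_000, 41_000]

# Parameter: the mean of the entire population (usually unknown in practice).
mu = statistics.mean(population)

# Sample: a randomly chosen subset of the population.
random.seed(1)  # fixed seed so the sketch is repeatable
sample = random.sample(population, 4)

# Data: the set of values in the sample; a data value is one of them.
data_value = sample[0]

# Statistic: the mean of the sample, used to estimate the parameter.
x_bar = statistics.mean(sample)

print(f"Parameter (population mean): RM{mu:,.2f}")
print(f"Sample data: {sample}")
print(f"Statistic (sample mean): RM{x_bar:,.2f}")
```

A census would correspond to using every value in `population`; a sample survey uses only the values in `sample`.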

1.3.1 Types of Variables

• Quantitative (numerical) Variables

A variable that quantifies an element of a population.

e.g. the “total cost” of textbooks purchased by each student for this semester’s classes.

Arithmetic operations such as addition and averaging are meaningful for data that result from a quantitative variable.

Can be subdivided into two classifications: discrete variables and continuous variables.

Discrete Variables

A quantitative variable that can assume a countable number of values.

Can assume any values corresponding to isolated points along a line interval. That is, there is a gap between any two values.

e.g. Number of courses for which you are currently registered.

Continuous Variables

A quantitative variable that can assume an uncountable number of values.

Can assume any value along a line interval, including every possible value between any two values.

e.g. Weight of books and supplies you are carrying as you attend class today.

• Qualitative (attribute, categorical) variables

A variable that describes or categorizes an element of a population.

e.g.: A sample of four hair-salon customers was surveyed for their “hair color”, “hometown” and “level of satisfaction”.
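As a rough sketch of why the type of variable matters, the Python snippet below averages a quantitative variable but only tallies a qualitative one. The four customers' answers and the textbook costs are hypothetical values invented for illustration.

```python
from collections import Counter
import statistics

# Hypothetical data for four individuals (invented for illustration).
textbook_cost = [310.50, 275.00, 420.25, 198.75]    # quantitative (RM)
hair_color = ["brown", "black", "brown", "blonde"]  # qualitative

# Arithmetic such as averaging is meaningful for a quantitative variable.
mean_cost = statistics.mean(textbook_cost)

# For a qualitative variable we count categories instead of averaging.
color_counts = Counter(hair_color)

print(f"Mean textbook cost: RM{mean_cost:.3f}")
print(f"Hair color counts: {dict(color_counts)}")
```

Trying to average the hair colors would raise an error, which reflects the point that arithmetic is not meaningful for categories.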

EXERCISE 1

1. Of the adult U.S. population, 36% has an allergy. A sample of 1200 randomly selected adults resulted in 33.2% reporting an allergy.

a. Describe the population.

b. What is the sample?

c. Describe the variable.

d. Identify the statistic and give its value.

e. Identify the parameter and give its value.

2. The faculty members at Universiti Utara Malaysia were surveyed on the question “How satisfied were you with this semester’s schedule?” Their responses were to be categorized as “very satisfied,” “somewhat satisfied,” “neither satisfied nor dissatisfied,” “somewhat dissatisfied,” or “very dissatisfied.”

a. Name the variable of interest.

b. Identify the type of variable.

3. A study was conducted by Aventis Pharmaceuticals Inc. to measure the adverse side effects of Allegra, a drug used for treatment of seasonal allergies. A sample of 679 allergy sufferers in the United States was given 60 mg of the drug twice a day. The patients were to report whether they experienced relief from their allergies as well as any adverse side effects (viral infection, nausea, drowsiness, etc.).

a. What is the population being studied?

b. What is the sample?

c. What are the characteristics of interest about each element in the population?

d. Are the data being collected qualitative or quantitative?

4. Identify each of the following as an example of (1) attribute (qualitative) or (2) numerical (quantitative) variables.

a. The breaking strength of a given type of string.

b. The hair color of children auditioning for the musical Annie.

c. The number of stop signs in towns of fewer than 500 people.

d. Whether or not a faucet is defective.

e. The number of questions answered correctly on a standardized test.

f. The length of time required to answer a telephone call at a certain real estate office.

1.3.2 Types of Data

• Data is the set of values collected for the variable from each of the elements that belong to the sample.

• e.g. the set of 25 heights collected from 25 students.

• Data can be collected from a survey or an experiment.

Types of Data

Primary data

Necessary data obtained through a survey conducted by the researcher.

Primary Data Collection Techniques

Data is collected by the researcher and obtained from respondents.

1. Face-to-face interview

Two-way communication where the researcher(s) ask questions directly to the respondent(s).

Advantages:
• Precise answers.
• Appropriate for research that requires huge data collection.
• Increases the number of answered questions.

Disadvantages:
• Expensive.
• The interviewer might influence the respondent’s responses.
• Respondents may refuse to answer sensitive or personal questions.

2. Telephone interview

Advantages:
• Quick.
• Less costly.
• Wider respondent coverage.

Disadvantages:
• Limited interview duration.
• Demonstrations cannot be performed.
• The telephone may not be answered.

3. Postal questionnaire

A set of questions to obtain information related to the study being conducted. Questionnaires are posted to every respondent.

Advantages:
• Wider respondent coverage.
• Respondents have enough time to answer the questions.
• Interviewer influence can be avoided.
• Lower cost.

Disadvantages:
• One-way interaction.
• Low response rate.

Secondary data

Data obtained from material published by governmental, industrial or individual sources.

• Published records from governmental, industrial or individual sources.
• Historical data.
• Various resources.
• An experiment is not required.

Advantages:
• Lower cost.
• Saves time and energy.

Disadvantages:
• Obsolete information.
• Data accuracy is not confirmed.

Any ideas?

Another technique to collect primary data is observation. List the advantages and disadvantages of this technique.

1.3.2.1 Scale of Measurements

• Data also can be classified by how they are categorized, counted or measured.

• This type of classification uses measurement scales with 4 common types of scales: nominal, ordinal, interval and ratio.

Nominal Level of Measurement

• A qualitative variable that characterizes (or describes/names) an element of a population.

• Arithmetic operations are not meaningful for the data.

• Order cannot be assigned to the categories.

Example:
- Survey responses: yes, no, undecided
- Gender: male, female

Ordinal Level of Measurement

• A qualitative variable that incorporates an ordered position, or ranking.

• Differences between data values either cannot be determined or are meaningless.

Example:
- Level of satisfaction: “very satisfied”, “satisfied”, “somewhat satisfied”, etc.
- Course grades: A, B, C, D, or F

Interval Level of Measurement

• Involves a quantitative variable.

• A scale where distances between data are meaningful.

• Differences make sense, but ratios do not (e.g., 30° - 20° = 20° - 10°, but 20° is not twice as hot as 10°).

• No natural zero.

Example:
- Temperature scales such as Celsius are interval data: 25°C is warmer than 20°C, and a 5°C difference has some physical meaning. Note that 0°C is arbitrary, so it does not make sense to say that 20°C is twice as hot as 10°C.
- The year 0 is arbitrary, and it is not sensible to say that the year 2000 is twice as old as the year 1000.

Ratio Level of Measurement

• A scale in which both intervals between values and ratios of values are meaningful.

• A real zero point.

Example:
- Temperature measured in kelvin is a ratio scale because there is a meaningful zero point (absolute zero).
- Physical measurements of height, weight and length are typically ratio variables. It is meaningful to say that 10 m is twice as long as 5 m, because there is a natural zero.

Levels of Measurement

• Nominal - categories only

• Ordinal - categories with some order

• Interval - differences but no natural starting point

• Ratio - differences and a natural starting point
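The interval versus ratio distinction can be checked numerically with the temperature example from the notes. The sketch below is illustrative only; the helper name `celsius_to_kelvin` is made up here, and it uses the standard offset of 273.15 between the two scales.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert a Celsius temperature to kelvin (standard 273.15 offset)."""
    return celsius + 273.15

c_low, c_high = 10.0, 20.0
k_low, k_high = celsius_to_kelvin(c_low), celsius_to_kelvin(c_high)

# Differences are meaningful on an interval scale: the gap is the same
# size whether measured in Celsius or kelvin.
diff_c = c_high - c_low
diff_k = k_high - k_low

# Ratios are only meaningful on a ratio scale. The Celsius ratio is 2,
# but the kelvin ratio shows 20°C is nowhere near "twice as hot" as 10°C.
ratio_c = c_high / c_low
ratio_k = k_high / k_low

print(f"Difference: {diff_c:.2f} °C vs {diff_k:.2f} K")
print(f"Ratio on the Celsius scale: {ratio_c:.3f}")
print(f"Ratio on the kelvin scale:  {ratio_k:.3f}")
```

The Celsius ratio changes if the zero point is moved, which is why ratios on an interval scale carry no physical meaning.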

EXERCISE 2

1) Classify each as nominal-level, ordinal-level, interval-level or ratio-level.


a. Ratings of newscasts in Malaysia. (poor, fair, good, excellent)

b. Temperature of automatic popcorn poppers.

c. Marital status of respondents to a survey on savings accounts.

d. Age of students enrolled in a martial arts course.

e. Salaries of cashiers of C-Mart stores.

2) Data obtained from a nominal scale
a. must be alphabetic.
b. can be either numeric or nonnumeric.
c. must be numeric.
d. must rank order the data.

3) The set of measurements collected for a particular element is (are) called
a. variables.
b. observations.
c. samples.
d. none of the above answers is correct.

4) The scale of measurement that is simply a label for the purpose of identifying the attribute of an element is the
a. ratio scale.
b. nominal scale.
c. ordinal scale.
d. interval scale.

5) Some hotels ask their guests to rate the hotel’s services as excellent, very good, good, and poor. This is an example of the
a. ordinal scale.
b. ratio scale.
c. nominal scale.
d. interval scale.

6) The ratio scale of measurement has the properties of
a. only the ordinal scale.
b. only the nominal scale.
c. the rank scale.
d. the interval scale.

7) Arithmetic operations are inappropriate for
a. the ratio scale.
b. the interval scale.
c. both the ratio and interval scales.
d. the nominal scale.

8) A characteristic of interest for the elements is called a(n)
a. sample.
b. data set.
c. variable.
d. none of the above answers is correct.

9) In a questionnaire, respondents are asked to mark their gender as male or female. Gender is an example of a
a. qualitative variable.
b. quantitative variable.
c. qualitative or quantitative variable, depending on how the respondents answered the question.
d. none of the above answers is correct.

10) The summaries of data, which may be tabular, graphical, or numerical, are referred to as
a. inferential statistics.
b. descriptive statistics.
c. statistical inference.
d. report generation.

11) Statistical inference
a. refers to the process of drawing inferences about the sample based on the characteristics of the population.
b. is the same as descriptive statistics.
c. is the process of drawing inferences about the population based on the information taken from the sample.
d. is the same as a census.

EXERCISE 3

1. In each of these statements, tell whether descriptive or inferential statistics have been used.

a) The average life expectancy in New Zealand is 78.49 years.

b) A diet high in fruits and vegetables will lower blood pressure.

c) The total amount of estimated losses from the tsunami flood was RM4.2 billion.

d) Researchers stated that the shape of a person’s ears is related to the person’s aggression.

e) In 2013, the number of high school graduates will be 3.2 million students.

2. Classify each variable as discrete or continuous.

a) Ages of people working in a large factory

b) Number of cups of coffee served at a restaurant


c) The amount of a drug injected into a rat.

d) The time it takes a student to walk to school

e) The number of liters of milk sold each day at a grocery store

3. Classify each as nominal-level, ordinal level, interval-level, or ratio level.

a) Rating of movies as U, SX and LP.

b) Number of candy bars sold on a fund drive

c) Classification of automobiles as subcompact, compact, standard and luxury.

d) Temperatures of hair dryers.

e) Weights of suitcases on a commercial airline.

4. At Sintok Community College 150 students are randomly selected and asked the distance of their house to campus. From this group a mean of 5.2 km is computed.

a. What is the parameter?

b. What is the statistic?

c. What is the population?

d. What is the sample?

Matrix No: _______________________ Group:______

TUTORIAL CHAPTER 1

In the following multiple-choice questions, please circle the correct answer.

1. You asked five of your classmates about their height. On the basis of this information, you stated that the average height of all students in your university or college is 65 inches. This is an example of:
a. descriptive statistics
b. statistical inference
c. parameter
d. population

2. A company has developed a new computer sound card, but the average lifetime is unknown. In order to estimate this average, 200 sound cards are randomly selected from a large production line and tested, and the average lifetime is found to be 5 years. The 200 sound cards represent the:
a. parameter
b. statistic
c. sample
d. population

3. A summary measure that is computed from a sample to describe a characteristic of the population is called a
a. parameter
b. statistic
c. population
d. sample

4. A summary measure that is computed from a population is called a
a. parameter
b. statistic
c. population
d. sample

5. When data are collected in a statistical study for only a portion or subset of all elements of interest, we are using a:
a. sample
b. parameter
c. population
d. statistic

6. Which of the following is not the goal of descriptive statistics?
a. Summarizing data
b. Displaying aspects of the collected data
c. Reporting numerical findings
d. Estimating characteristics of the population

7. Which of the following statements is not true?
a. One form of descriptive statistics uses graphical techniques
b. One form of descriptive statistics uses numerical techniques
c. In the language of statistics, population refers to a group of people
d. Statistical inference is used to draw conclusions or inferences about characteristics of populations based on sample data

8. Descriptive statistics deals with methods of:
a. organizing data
b. summarizing data
c. presenting data in a convenient and informative way
d. All of the above

9. A politician who is running for the office of governor of a state with 4 million registered voters commissions a survey. In the survey, 54% of the 5,000 registered voters interviewed say they plan to vote for her. The population of interest is the:
a. 4 million registered voters in the state
b. 5,000 registered voters interviewed
c. 2,700 voters interviewed who plan to vote for her
d. 2,300 voters interviewed who plan not to vote for her

10. A company has developed a new battery, but the average lifetime is unknown. In order to estimate this average, a sample of 500 batteries is tested and the average lifetime of this sample is found to be 225 hours. The 225 hours is the value of a:
a. parameter
b. statistic
c. sample
d. population

11. The process of using sample statistics to draw conclusions about true population parameters is called
a. inferential statistics
b. the scientific method
c. sampling method
d. descriptive statistics

12. Which of the following is most likely a population as opposed to a sample?
a. Respondents to a magazine survey
b. The first 10 students completing a final exam
c. Every fifth student to arrive at the book store on your campus
d. Registered voters in the State of Michigan

13. Researchers suspect that the average number of credits earned per semester by college students is rising. A researcher at Michigan State University (MSU) wished to estimate the number of credits earned by students during the fall semester of 2003 at MSU. To do so, he randomly selects 500 student transcripts and records the number of credits each student earned in the fall term 2003. He found that the average number of semester credits completed was 14.85 credits per student. The population of interest to the researcher is
a. all MSU students
b. all college students in Michigan
c. all MSU students enrolled in the fall semester of 2003
d. all college students in Michigan enrolled in the fall semester of 2003

14. The collection and summarization of the graduate degrees and research areas of interest of the faculty at the University of Michigan is an example of
a. inferential statistics
b. descriptive statistics
c. a parameter
d. a statistic

15. Those methods involving the collection, presentation, and characterization of a set of data in order to properly describe the various features of that set of data are called
a. inferential statistics
b. the scientific method
c. sampling method
d. descriptive statistics

16. The estimation of the population average student expenditure on education based on the sample average expenditure of 1,000 students is an example of
a. inferential statistics
b. descriptive statistics
c. a parameter
d. a statistic

17. A study is under way in a national forest to determine the adult height of pine trees. Specifically, the study is attempting to determine what factors aid a tree in reaching heights greater than 50 feet tall. It is estimated that the forest contains 32,000 pine trees. The study involves collecting heights from 500 randomly selected adult pine trees and analyzing the results. The sample in the study is
a. the 500 randomly selected adult pine trees
b. the 32,000 adult pine trees in the forest
c. all the adult pine trees taller than 50 feet
d. all pine trees, of any age, in the forest

18. The classification of student major (accounting, economics, management, marketing, other) is an example of
a. a categorical random variable
b. a discrete random variable
c. a continuous random variable
d. a parameter

19. Most colleges admit students based on their achievements in a number of different areas. The grade obtained in a senior-level English course (A, B, C, D, or F) is an example of a ________________, or ________________ variable.

20. For each of the following examples, identify the data type as nominal, ordinal, or interval.

a. The letter grades received by students in a computer science class ________________

b. The number of students in a statistics course ________________

c. The starting salaries of new Ph.D. graduates from a statistics program ________________

d. The size of fries (small, medium, large) ordered by a sample of Burger King customers ________________

e. The college you are enrolled in (Arts and Science, Business, Education, etc.) ________________
