Statistics, Fall 2003


Upload: lacy

Post on 06-Jan-2016


Page 1: Statistics, Fall 2003

Statistics, Fall 2003

Instructor: 余清祥, Department of Statistics. Date: September 16, 2003. Week 1: What Is Statistics?

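Page 2: Statistics, Fall 2003

What Is Statistics?

Statistics is the science of making reasonable decisions under uncertainty by defining problems and applying scientific methods of data collection, organization, presentation, analysis, and inference.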

Page 3: Statistics, Fall 2003

Page 4: Statistics, Fall 2003

Page 5: Statistics, Fall 2003

Chapter 1: Data and Statistics

• Applications in Business and Economics
• Data
• Data Sources
• Descriptive Statistics
• Statistical Inference

Page 6: Statistics, Fall 2003

Applications in Business and Economics

Accounting

Public accounting firms use statistical sampling procedures when conducting audits for their clients.

Finance

Financial advisors use a variety of statistical information, including price-earnings ratios and dividend yields, to guide their investment recommendations.

Marketing

Electronic point-of-sale scanners at retail checkout counters are being used to collect data for a variety of marketing research applications.

Page 7: Statistics, Fall 2003

Applications in Business and Economics

Production

A variety of statistical quality control charts are used to monitor the output of a production process.

Economics

Economists use statistical information in making forecasts about the future of the economy or some aspect of it.

Page 8: Statistics, Fall 2003

Data

• Elements, Variables, and Observations
• Scales of Measurement
• Qualitative and Quantitative Data
• Cross-Sectional and Time Series Data

Page 9: Statistics, Fall 2003

Data and Data Sets

Data are the facts and figures that are collected, summarized, analyzed, and interpreted.

The data collected in a particular study are referred to as the data set.

Page 10: Statistics, Fall 2003

Elements, Variables, and Observations

The elements are the entities on which data are collected.

A variable is a characteristic of interest for the elements.

The set of measurements collected for a particular element is called an observation.

The total number of data values in a data set is the number of elements multiplied by the number of variables.

Page 11: Statistics, Fall 2003

Data, Data Sets, Elements, Variables, and Observations

Company        Stock Exchange   Annual Sales ($M)   Earn/Sh. ($)
Dataram        AMEX                   73.10             0.86
EnergySouth    OTC                    74.00             1.67
Keystone       NYSE                  365.70             0.86
LandCare       NYSE                  111.40             0.33
Psychemedics   AMEX                   17.60             0.13

Callouts on the slide: Elements (the companies), Variables (the columns), Observation (one row), Datum (a single value), Data Set (the whole table).
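The definitions of element, variable, observation, and datum can be checked against the five-company table with a short Python sketch. The variable names (`stock_exchange`, `annual_sales_musd`, `earnings_per_share_usd`) are illustrative choices, not names from the slides.

```python
# Variables: the characteristics recorded for every element.
variables = ("stock_exchange", "annual_sales_musd", "earnings_per_share_usd")

# Data set: one observation (tuple of measurements) per element (company).
data_set = {
    "Dataram":      ("AMEX", 73.10, 0.86),
    "EnergySouth":  ("OTC", 74.00, 1.67),
    "Keystone":     ("NYSE", 365.70, 0.86),
    "LandCare":     ("NYSE", 111.40, 0.33),
    "Psychemedics": ("AMEX", 17.60, 0.13),
}

n_elements = len(data_set)
n_variables = len(variables)
# Total data values = number of elements x number of variables.
n_values = n_elements * n_variables

# One observation: all measurements collected for a single element.
observation = dict(zip(variables, data_set["Keystone"]))
# One datum: a single value within that observation.
datum = observation["annual_sales_musd"]

print(n_elements, n_variables, n_values)  # 5 3 15
print(observation)
print(datum)                              # 365.7
```

With 5 elements and 3 variables the data set holds 15 data values, which matches the rule stated on the previous slide.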

Page 12: Statistics, Fall 2003

Scales of Measurement

Scales of measurement include:
• Nominal (名義) data are merely labels or assigned numbers
• Ordinal (順序) data can be arranged in order such as worst to best or best to worst
• Interval data can be arranged in order and the difference between numbers has meaning
• Ratio data differ from interval data in that there is a definite zero point

The scale determines the amount of information contained in the data.

The scale indicates the data summarization and statistical analyses that are most appropriate.

Page 13: Statistics, Fall 2003

Types of Data

[Figure: classification chart of data types]

Data Types              Qualitative          Quantitative (numerical data)
Levels of Measurement   Nominal, Ordinal     Interval, Ratio
                        Discrete             Discrete or continuous

Page 14: Statistics, Fall 2003

Scales of Measurement

Nominal

• Data are labels or names used to identify an attribute of the element.

• A nonnumeric label or a numeric code may be used.

Page 15: Statistics, Fall 2003

Scales of Measurement

Nominal

• Example:

Students of a university are classified by the school in which they are enrolled using a nonnumeric label such as Business, Humanities, Education, and so on.

Alternatively, a numeric code could be used for the school variable (e.g. 1 denotes Business, 2 denotes Humanities, 3 denotes Education, and so on).

Page 16: Statistics, Fall 2003

Scales of Measurement

Ordinal

• The data have the properties of nominal data and the order or rank of the data is meaningful.

• A nonnumeric label or a numeric code may be used.

Page 17: Statistics, Fall 2003

Scales of Measurement

Ordinal

• Example:

Students of a university are classified by their class standing using a nonnumeric label such as Freshman, Sophomore, Junior, or Senior.

Alternatively, a numeric code could be used for the class standing variable (e.g. 1 denotes Freshman, 2 denotes Sophomore, and so on).

Page 18: Statistics, Fall 2003

Scales of Measurement

Interval

• The data have the properties of ordinal data and the interval between observations is expressed in terms of a fixed unit of measure.

• Interval data are always numeric.

Page 19: Statistics, Fall 2003

Scales of Measurement

Interval

• Example:

Melissa has an SAT score of 1205, while Kevin has an SAT score of 1090. Melissa scored 115 points more than Kevin.

Page 20: Statistics, Fall 2003

Scales of Measurement

Ratio

• The data have all the properties of interval data and the ratio of two values is meaningful.

• Variables such as distance, height, weight, and time use the ratio scale.

• This scale must contain a zero value that indicates that nothing exists for the variable at the zero point.

Page 21: Statistics, Fall 2003

Scales of Measurement

Ratio

• Example:

Melissa’s college record shows 36 credit hours earned, while Kevin’s record shows 72 credit hours earned. Kevin has twice as many credit hours earned as Melissa.
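The four scales differ in which operations on the values are meaningful. The Python sketch below reuses the numbers from the slide examples. The codes 3 and 4 for Junior and Senior are an assumed continuation of the "and so on" in the class-standing example, and all variable names are illustrative.

```python
# Nominal: codes are labels only, so equality is the only meaningful comparison.
school_codes = {1: "Business", 2: "Humanities", 3: "Education"}
same_school = school_codes[1] == school_codes[1]

# Ordinal: codes carry a meaningful order, but differences between codes do not.
class_standing = {1: "Freshman", 2: "Sophomore", 3: "Junior", 4: "Senior"}
junior_code, sophomore_code = 3, 2
junior_ranks_higher = junior_code > sophomore_code

# Interval: differences are meaningful (fixed unit of measure).
sat_melissa, sat_kevin = 1205, 1090
sat_difference = sat_melissa - sat_kevin

# Ratio: a true zero point exists, so ratios are meaningful as well.
credits_melissa, credits_kevin = 36, 72
credit_ratio = credits_kevin / credits_melissa

print(same_school, junior_ranks_higher)  # True True
print(sat_difference)                    # 115
print(credit_ratio)                      # 2.0
```

Subtracting school codes or dividing class-standing codes would run without error in Python but would have no statistical meaning, which is why the scale has to be tracked by the analyst rather than by the data type.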

Page 22: Statistics, Fall 2003

Qualitative and Quantitative Data

Data can be further classified as being qualitative or quantitative.

The statistical analysis that is appropriate depends on whether the data for the variable are qualitative or quantitative.

In general, there are more alternatives for statistical analysis when the data are quantitative.

Page 23: Statistics, Fall 2003

Qualitative Data

Qualitative data are labels or names used to identify an attribute of each element.

Qualitative data use either the nominal or ordinal scale of measurement.

Qualitative data can be either numeric or nonnumeric.

The statistical analyses available for qualitative data are rather limited.

Page 24: Statistics, Fall 2003

Quantitative Data

Quantitative data indicate either how many or how much.

• Quantitative data that measure how many are discrete.

• Quantitative data that measure how much are continuous because there is no separation between the possible values for the data.

Quantitative data are always numeric.

Ordinary arithmetic operations are meaningful only with quantitative data.

Page 25: Statistics, Fall 2003

Cross-Sectional and Time Series Data

Cross-sectional data are collected at the same or approximately the same point in time.

• Example: data detailing the number of building permits issued in June 2000 in each of the counties of Texas

Time series data are collected over several time periods.

• Example: data detailing the number of building permits issued in Travis County, Texas in each of the last 36 months

Page 26: Statistics, Fall 2003

Data Sources

Existing Sources

• Data needed for a particular application might already exist within a firm. Detailed information is often kept on customers, suppliers, and employees, for example.

• Substantial amounts of business and economic data are available from organizations that specialize in collecting and maintaining data.

Page 27: Statistics, Fall 2003

Data Sources

Existing Sources

• Government agencies are another important source of data, and the data types include census (普查) and survey (抽樣) data.

• Data are also available from a variety of industry associations and special-interest organizations.

Page 28: Statistics, Fall 2003

Data Sources

Internet

• The Internet has become an important source of data.

• Most government agencies, like the Bureau of the Census (www.census.gov), make their data available through a web site.

• More and more companies are creating web sites and providing public access to them.

• A number of companies now specialize in making information available over the Internet.

Page 29: Statistics, Fall 2003

Data Sources

Statistical Studies

• Statistical studies can be classified as either experimental or observational.

• In experimental studies the variables of interest are first identified. Then one or more factors are controlled so that data can be obtained about how the factors influence the variables.

• In observational (nonexperimental) studies no attempt is made to control or influence the variables of interest.

• A survey is perhaps the most common type of observational study.

Page 30: Statistics, Fall 2003

Data Acquisition Considerations

Time Requirement

• Searching for information can be time consuming.

• Information might no longer be useful by the time it is available.

Cost of Acquisition

• Organizations often charge for information even when it is not their primary business activity.

Data Errors

• Using any data that happen to be available or that were acquired with little care can lead to poor and misleading information.

Page 31: Statistics, Fall 2003

Descriptive Statistics

Descriptive statistics are the tabular, graphical, and numerical methods used to summarize data.

Page 32: Statistics, Fall 2003

Example: Hudson Auto Repair

The manager of Hudson Auto would like to have a better understanding of the cost of parts used in the engine tune-ups performed in the shop. She examines 50 customer invoices for tune-ups. The costs of parts, rounded to the nearest dollar, are listed below.

 91   78   93   57   75   52   99   80   97   62
 71   69   72   89   66   75   79   75   72   76
104   74   62   68   97  105   77   65   80  109
 85   97   88   68   83   68   71   69   67   74
 62   82   98  101   79  105   79   69   62   73

Page 33: Statistics, Fall 2003

Example: Hudson Auto Repair

Tabular Summary (Frequencies and Percent Frequencies)

Parts Cost ($)   Frequency   Percent Frequency
50-59                2               4
60-69               13              26
70-79               16              32
80-89                7              14
90-99                7              14
100-109              5              10
Total               50             100
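The frequency table can be reproduced from the 50 invoice amounts of the Hudson Auto example. The following Python sketch is one possible way to do the tally, assuming $10-wide classes that start at 50. The name `class_counts` and the printed layout are illustrative.

```python
from collections import Counter

# Parts costs (in dollars) from the 50 tune-up invoices.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

# Map each cost to the lower limit of its class, e.g. 57 -> 50, 104 -> 100.
class_counts = Counter((cost // 10) * 10 for cost in costs)

n = len(costs)
print(f"{'Parts cost ($)':<16}{'Frequency':>10}{'Percent':>10}")
for lower in sorted(class_counts):
    label = f"{lower}-{lower + 9}"
    frequency = class_counts[lower]
    percent = 100 * frequency / n  # percent frequency of the class
    print(f"{label:<16}{frequency:>10}{percent:>10.0f}")
print(f"{'Total':<16}{n:>10}{100:>10}")
```

The counts come out as 2, 13, 16, 7, 7, and 5 for the six classes, agreeing with the table on the slide.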

Page 34: Statistics, Fall 2003

Example: Hudson Auto Repair

Graphical Summary (Histogram)

[Figure: histogram of Frequency (vertical axis, 2 to 18 in steps of 2) against Parts Cost ($) (horizontal axis, 50 to 110 in steps of 10)]

Page 35: Statistics, Fall 2003

Example: Hudson Auto Repair

Numerical Descriptive Statistics

• The most common numerical descriptive statistic is the average (or mean).

• Hudson’s average cost of parts, based on the 50 tune-ups studied, is $79 (found by summing the 50 cost values and then dividing by 50).
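The $79 figure can be verified directly from the 50 invoice amounts in the Hudson Auto example. This short Python sketch sums the values and divides by the count, as the slide describes. The variable names are illustrative.

```python
# Parts costs (in dollars) from the 50 tune-up invoices.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

total_cost = sum(costs)              # sum of the 50 cost values
average_cost = total_cost / len(costs)  # sample mean

print(total_cost)           # 3949
print(average_cost)         # 78.98
print(round(average_cost))  # 79
```

The exact mean is $78.98, which the slide reports rounded to the nearest dollar.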

Page 36: Statistics, Fall 2003

Statistical Inference

Statistical inference is the process of using data obtained from a small group of elements (the sample) to make estimates and test hypotheses about the characteristics of a larger group of elements (the population).

Page 37: Statistics, Fall 2003

Example: Hudson Auto Repair

Process of Statistical Inference

1. Population consists of all tune-ups. Average cost of parts is unknown.

2. A sample of 50 engine tune-ups is examined.

3. The sample data provide a sample average cost of $79 per tune-up.

4. The value of the sample average is used to make an estimate of the population average.

Page 38: Statistics, Fall 2003

Population Versus a Sample

[Figure: Population (all votes cast) and Sample (selected votes for observation)]

Page 39: Statistics, Fall 2003

Basic Definitions

Descriptive Statistics (敘述性統計量): the collection and description of data

Inferential Statistics (推論性統計量): analysis, decision making, or estimation based on the data

Population (母體): the set of all possible measurements that is of interest

Sample (樣本): the portion of the population from which information is gathered

Page 40: Statistics, Fall 2003

End of Chapter 1