Welcome to the Quantitative Analysis (Statistics/EXCEL) Module
John Gates
Oxford Centre for Water Research
School of Geography and the Environment
What is statistics?
“…the collection and analysis of numerical data in large quantities.” – Oxford English Dictionary
“The mathematics of the collection, organization, and interpretation of numerical data, especially the analysis of population characteristics by inference from sampling.” – American Heritage Dictionary
“Statistics: the mathematical theory of ignorance.” – Morris Kline
“It has long been recognized by public men of all kinds ... that statistics come under the head of lying, and that no lie is so false or inconclusive as that which is based on statistics.” – H. Belloc
“There are three kinds of lies - lies, damned lies and statistics.” – Benjamin Disraeli
“Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write.” – H.G. Wells
Why statistics?
• make quantified statements about a phenomenon we are interested in
• frequently this phenomenon is too large to go out and measure exhaustively…
• …so we collect samples as proxies of the greater population of individuals or items that make up the phenomenon we are interested in
Aims of the course
• Introduction to basic statistics
• Demonstrate geographical context
• Learn to use analysis tools in EXCEL
• Make you an intelligent user of data
• Make you an intelligent user of statistics
We will bypass much of the underlying maths and instead emphasize understanding of the underlying principles.
How the course works
1. Cover the statistical principles in lecture
• course lecture notes
2. Go through lecture notes in your own time before the practical
• use textbooks to supplement lecture notes
3. Attend practical
• work through practical handouts
• ask demonstrators for help
4. Take online assessments
• theory – any time after lecture
• practical – any time after finishing the practical
Course Structure
• Lectures on Mondays
– in OUCE Lecture Theatre
• Practicals on Tuesday afternoons (except next week)
– in Medical Sciences Teaching Centre’s computing laboratory
Practicals
2.00-4.00pm: Harris Manchester, Brasenose, Christ Church, Hertford, Jesus, St. Edmund Hall, St. Hilda’s, Worcester
4.00-6.00pm: Keble, Mansfield, Merton, Regent’s Park, St. Anne’s, St. Catz, St. Peter’s, Wadham, St. John’s
Course Information
ALL INFORMATION IS ON THE WEB
http://techniques.geog.ox.ac.uk
– Lecture notes and glossary
– Practical notes
– Excel files
– Internet resources
– Recommended textbooks
– Tests
Week 1 - Central Tendency
1. Types of statistics
2. Types of data
3. Samples
4. Frequency distribution
5. Measures of central tendency
a) mode
b) median
c) arithmetic mean
6. Precision and accuracy
1a. Descriptive Statistics
• Definition: Quantitative methods of organizing, summarizing, and presenting numerical data in an informative way.
• Describe the overall characteristics of a sample (and hence the population?)
• Transform raw data into more easily understood forms
• Central tendency – “average” character of the data.
1b. Inferential (analytical) Statistics
• Definition: The branch of statistics used to make inferences or judgments about a larger population based on the data collected from a smaller sample drawn from the population
2. Types of Data
• Interval
• Ordinal
• Nominal
2. Types of Data
• Interval
-- Can tell exactly how far any measurement is from any other
-- Examples: height, age, size
2. Types of Data
• Ordinal
-- A set of observations ordered according to some criterion, i.e. a ranking
-- Cannot tell how far one measurement is from the next
-- Examples: horses’ positions in race, the ten highest mountains in the world
-- Note that interval data can be converted into ordinal form
2. Types of Data
• Nominal
-- Also referred to as categorical data
-- Data are grouped into categories
-- Examples: land use type, ethnicity, rock type
-- Note that interval data can be converted into nominal form
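The two conversion notes can be illustrated with a minimal Python sketch. The heights, the category labels and the thresholds are all hypothetical values chosen for illustration; none of them come from the course material.

```python
# Hypothetical interval data: heights of five mountains in metres (illustrative only).
heights_m = [1200.0, 3400.0, 2750.0, 980.0, 4100.0]

# Ordinal form: rank each value, 1 = highest.
# The ranking keeps the order but loses how far apart the values are.
order = sorted(heights_m, reverse=True)
ranks = [order.index(h) + 1 for h in heights_m]

# Nominal form: group each value into a named category.
# The thresholds (1500 m, 3000 m) are assumptions made for this sketch.
def height_class(h):
    if h < 1500:
        return "low"
    if h < 3000:
        return "medium"
    return "high"

classes = [height_class(h) for h in heights_m]

print(ranks)    # [4, 2, 3, 5, 1]
print(classes)  # ['low', 'high', 'medium', 'low', 'high']
```

The conversion only works in this direction: the original heights cannot be recovered from the ranks or from the categories.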
3. Samples
• Definition: A subset of the target population
Random:
– the individuals in the sample are randomly selected
– each member of the population has a known, but possibly non-equal, chance of being included in the sample
Independent:
– a sample should have no effect on, and not be affected by, other samples selected from the same population or from different populations
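As a rough sketch of random selection, the Python snippet below draws a simple random sample, which is the special case where every member has the same known chance of inclusion. The population of 100 numbered items, the sample size and the seed are assumptions for illustration.

```python
import random

# Hypothetical target population: 100 numbered items (e.g. survey plots).
population = list(range(1, 101))

# Fixed seed so the draw is repeatable; the value 42 is arbitrary.
rng = random.Random(42)

# Simple random sample without replacement:
# each item has the same known chance (10 in 100) of being included.
sample = rng.sample(population, k=10)

print(sorted(sample))
```

Sampling schemes with unequal but known inclusion chances exist as well; this sketch does not cover them.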
4. Frequency Distribution
• The spread of data along its range
– either mathematical description
– or (and) visual description…
• …a frequency histogram
– define categories or intervals or classes
– count the number of measurements that fall into each class
– plot classes along x-axis
– plot counts (frequencies) on y-axis
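The histogram steps above can be sketched in a few lines of Python. The grades, the class width and the lower edge of the first class are hypothetical choices for illustration. For simplicity the sketch prints the classes down the page rather than along an x-axis.

```python
# Hypothetical grades in percent (illustrative values only).
grades = [42, 55, 58, 61, 63, 64, 67, 68, 71, 74, 79, 83]

class_width = 10  # assumed class width
lowest = 40       # assumed lower edge of the first class

# Count how many measurements fall into each class [lower, lower + width).
counts = {}
for g in grades:
    lower = lowest + ((g - lowest) // class_width) * class_width
    counts[lower] = counts.get(lower, 0) + 1

# Text histogram: classes in order, one '#' per measurement in the class.
for lower in sorted(counts):
    print(f"{lower}-{lower + class_width - 1}: {'#' * counts[lower]}")
```

Changing `class_width` changes the shape of the histogram, so the choice of classes is part of the analysis rather than a neutral step.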
4. Frequency Distribution
[Figure: frequency histogram, “Grades for 1st Stats Practical (1991-2002)”. x-axis: Grade (in percent), classes from 25 to 85; y-axis: Frequency, 0 to 200.]
5a. Mode
• Definition: The most commonly occurring value
• for nominal data we refer to the modal class
• not appropriate for ordinal or (usually) interval data
[Figure: frequency histogram with the modal class labelled. x-axis: Variable X, classes from 25 to 85; y-axis: Frequency, 0 to 200.]
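A minimal Python sketch of finding the modal class for nominal data, using the standard library's `Counter`. The land use observations are hypothetical.

```python
from collections import Counter

# Hypothetical nominal data: land use type recorded at eight sites.
land_use = ["forest", "urban", "forest", "arable",
            "forest", "urban", "arable", "forest"]

# Count the occurrences of each category and take the most common one.
counts = Counter(land_use)
modal_class, frequency = counts.most_common(1)[0]

print(modal_class, frequency)  # forest 4
```

If two categories tie for the highest count, `most_common(1)` returns only one of them, so ties are worth checking by inspecting `counts` directly.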
5b. Median
• Definition: The central value in an ordered set of data
Raw data: 4, 2, 5, 1, 7, 10, 6
Sorted data: 1, 2, 4, 5, 6, 7, 10
Median = 5 (the 4th of the 7 sorted values)
5b. Median
• even number of values
Raw data: 4, 2, 5, 1, 7, 10
Sorted data: 1, 2, 4, 5, 7, 10
Median = (4 + 5) / 2 = 4.5
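Both median cases can be checked with a short Python sketch. The function name `median` is just a label chosen here, and the two data sets are the raw data from the median slides.

```python
def median(values):
    """Return the central value of the data (mean of the two central values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of values: the single central value.
        return ordered[mid]
    # Even number of values: average the two central values.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([4, 2, 5, 1, 7, 10, 6]))  # 5
print(median([4, 2, 5, 1, 7, 10]))     # 4.5
```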
5c. Arithmetic Mean
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
5c. Arithmetic Mean
data: 4, 2, 5, 1, 7, 10, 6
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{4 + 2 + 5 + 1 + 7 + 10 + 6}{7} = \frac{35}{7} = 5$$
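The same arithmetic as a small Python sketch, using the data from this slide. The variable names are illustrative.

```python
data = [4, 2, 5, 1, 7, 10, 6]

total = sum(data)  # sum of all values
n = len(data)      # number of values
mean = total / n   # arithmetic mean

print(total, n, mean)  # 35 7 5.0
```

In EXCEL the equivalent calculation uses the built-in `AVERAGE` function on the range holding the data.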
The “average”
• average = central tendency
• the mean, mode and median are all measures of “average”
• average ≠ mean
6. Precision and accuracy
• Precision:
– The degree of refinement with which an operation is performed or a measurement stated
• Accuracy:
– Freedom from mistake or error
Week 1 - Central Tendency
1. Types of statistics
2. Types of data
3. Samples
4. Frequency distribution
5. Measures of central tendency
a) mode
b) median
c) arithmetic mean
6. Precision and accuracy
Excel skills in Practical 1
• Entering and sorting data
• Calculating mean, median and mode
• Creating frequency histograms
• Introduction to formulas and functions