Spatial Statistics: Topic 3, Descriptive Statistics. Assoc. Prof. Dr. Abdul Hamid b. Hj. Mar Iman, Director, Centre for Real Estate Studies, Faculty of Engineering and Geoinformation Science, Universiti Teknologi Malaysia, Skudai, Johor. Spatial Statistics (SGG 2413)

Post on 18-Dec-2015


Page 1: Spatial Statistics: Topic 3 – Descriptive Statistics

Assoc. Prof. Dr. Abdul Hamid b. Hj. Mar Iman
Director, Centre for Real Estate Studies
Faculty of Engineering and Geoinformation Science
Universiti Teknologi Malaysia
Skudai, Johor

Spatial Statistics (SGG 2413)

Page 2: Learning Objectives

Overall: To give students a basic understanding of descriptive statistics

Specific: Students will be able to:
* understand the basic concept of descriptive statistics
* understand the concept of distribution
* calculate measures of central tendency and dispersion
* calculate measures of kurtosis and skewness

Page 3: Contents

* What is descriptive statistics
* Central tendency, dispersion, kurtosis, skewness
* Distribution

Page 4: Descriptive Statistics

* Use sample information to explain/make abstraction of population “phenomena”.
* Common “phenomena”:
  * Association (e.g. σ1,2.3 = 0.75)
  * Tendency (left-skew, right-skew)
  * Trend, pattern, location, dispersion, range
  * Causal relationship (e.g. if X then Y)
* Emphasis on meaningful characterisation of data (e.g. central tendency, variability), graphics, and description
* Use non-parametric analysis (e.g. χ², t-test, 2-way ANOVA)

Page 5: E.g. of Abstraction of phenomena

[Chart] Trends in property loan, shop house demand & supply, 1990 - 1997:

Year   Loan to property sector (RM million)   Demand for shop houses (units)   Supply of shop houses (units)
1990   32635.8                                71719                            85534
1991   38100.6                                73892                            85821
1992   42468.1                                85843                            90366
1993   47684.7                                95916                            101508
1994   48408.2                                101107                           111952
1995   61433.6                                117857                           125334
1996   77255.7                                134864                           143530
1997   97810.1                                86323                            154179

[Chart] No. of houses by district (Batu Pahat, Johor Bahru, Kluang, Kota Tinggi, Mersing, Muar, Pontian, Segamat), 1991 and 2000.

[Chart] Proportion (%) by age category (years old), 0-4 to 70-74.

[Chart] Price (RM/sq. ft. built area) against demand (% sales success).

Page 6: Inferential Statistics

* Using sample statistics to infer some “phenomena” of population parameters
* Common “phenomena”: cause-and-effect
  * One-way relationship: Y = f(X)
  * Feedback relationship: Y1 = f(Y2, X, e1); Y2 = f(Y1, Z, e2)
  * Recursive: Y1 = f(X, e1); Y2 = f(Y1, Z, e2)
* Use parametric analysis (e.g. α and β) through regression analysis
* Emphasis on hypothesis testing

Page 7: Parametric statistics

* Statistical analysis that attempts to explain the population parameter using a sample
* E.g. of statistical parameters: mean, variance, std. dev., R², t-value, F-ratio, xy, etc.
* It assumes that the distributions of the variables being assessed belong to known parameterised families of probability distributions

Page 8: Examples of parametric relationship

Coefficients (a)

Model 1      Unstandardized Coefficients   Standardized Coefficients   t        Sig.
             B           Std. Error        Beta
(Constant)   1993.108    239.632                                       8.317    .000
Tanah        -4.472      1.199             -.190                       -3.728   .000
Bangunan     6.938       .619              .705                        11.209   .000
Ansilari     4.393       1.807             .139                        2.431    .017
Umur         -27.893     6.108             -.241                       -4.567   .000
Flo_go       34.895      89.440            .020                        .390     .697

a. Dependent Variable: Nilaisma.

[Chart] Fitted lines: Dep = 9t – 215.8; Dep = 7t – 192.6

Page 9: Non-parametric statistics

* First used by Wolfowitz (1942)
* Statistical analysis that attempts to explain the population parameter using a sample without making assumptions about the frequency distribution of the assessed variable
* In other words, the variable being assessed is distribution-free
* E.g. of non-parametric statistics: histogram, stochastic kernel, non-parametric regression

Page 10: Descriptive & Inferential Statistics (DS & IS)

* DS gathers information about a population characteristic (e.g. income) and describes it with a parameter of interest (e.g. mean)
* IS uses the parameter to test a hypothesis pertaining to that characteristic. E.g.
  Ho: mean income = RM 4,000
  H1: mean income < RM 4,000
* The result of hypothesis testing is used to make inference about the characteristic of interest (e.g. Malaysian upper middle income)

Page 11: Sample Statistics: Central Tendency

Mean (sum of all values ÷ no. of values)
Advantages:
* Best known average
* Exactly calculable
* Makes use of all data
* Useful for statistical analysis
Disadvantages:
* Affected by extreme values
* Can be absurd for discrete data (e.g. family size = 4.5 persons)
* Cannot be obtained graphically

Median (middle value)
Advantages:
* Not influenced by extreme values
* Obtainable even if data distribution unknown (e.g. group/aggregate data)
* Unaffected by irregular class width
* Unaffected by open-ended class
Disadvantages:
* Needs interpolation for group/aggregate data (cumulative frequency curve)
* May not be characteristic of group when: (1) items are only few; (2) distribution irregular
* Very limited statistical use

Mode (most frequent value)
Advantages:
* Unaffected by extreme values
* Easy to obtain from histogram
* Determinable from only values near the modal class
Disadvantages:
* Cannot be determined exactly in group data
* Very limited statistical use

Page 12: Central Tendency – Mean

For individual observations, $\bar{x} = \frac{\sum x}{n}$. E.g.

X = {3,5,7,7,8,8,8,9,9,10,10,12}

$\sum x$ = 96; n = 12

Thus, $\bar{x}$ = 96/12 = 8

The above observations can be organised into a frequency table and the mean calculated on the basis of frequencies:

x    3   5   7    8    9    10   12
f    1   1   2    3    2    2    1
fx   3   5   14   24   18   20   12

$\sum fx$ = 96; $\sum f$ = 12

Thus, $\bar{x} = \frac{\sum fx}{\sum f}$ = 96/12 = 8
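The two routes to the mean on this slide can be checked with a short sketch. It uses only the observations listed above; the variable names are illustrative.

```python
from collections import Counter

x = [3, 5, 7, 7, 8, 8, 8, 9, 9, 10, 10, 12]

# Mean from individual observations: sum of values / number of values
mean_raw = sum(x) / len(x)

# Mean from a frequency table: sum(f * x) / sum(f)
freq = Counter(x)  # maps each distinct value to its frequency f
mean_freq = sum(f * value for value, f in freq.items()) / sum(freq.values())

print(mean_raw, mean_freq)
```

Both routes give the same value because the frequency table is only a regrouping of the same twelve observations.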

Page 13: Central Tendency - Mean and Mid-point

Let's say we have data like this:

Price (RM ‘000/unit) of Shop Houses in Skudai

Location   Min   Max
Town A     228   450
Town B     320   430

Can you calculate the mean?

Page 14: Central Tendency - Mean and Mid-point (contd.)

Let’s calculate, using M = ½(Min + Max):

Town A: (228+450)/2 = 339

Town B: (320+430)/2 = 375

Are these figures means?

Page 15: Central Tendency - Mean and Mid-point (contd.)

Let’s say we have price data as follows:

Town A: 228, 295, 310, 420, 450
Town B: 320, 295, 310, 400, 430

Calculate the means:

Town A: (228+295+310+420+450)/5 = 340.6
Town B: (320+295+310+400+430)/5 = 351.0

Are the results the same as previously?

Be careful about mean and “mid-point”!

Page 16: Central Tendency – Mean of Grouped Data

House rentals or prices in the PMR are frequently tabulated as a range of values. E.g.

Rental (RM/month)     135-140   140-145   145-150   150-155   155-160
Mid-point value (x)   137.5     142.5     147.5     152.5     157.5
Number of Taman (f)   5         9         6         2         1
fx                    687.5     1282.5    885.0     305.0     157.5

What is the mean rental across the areas?

$\sum f$ = 23; $\sum fx$ = 3317.5

Thus, $\bar{x} = \frac{\sum fx}{\sum f}$ = 3317.5/23 = 144.24
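As a minimal sketch of the grouped-data calculation above, each class can be represented by its mid-point and weighted by its frequency. The tuple layout (lower bound, upper bound, frequency) is an illustrative choice, not part of the slides.

```python
# (lower bound, upper bound, number of Taman) for each rental class
classes = [(135, 140, 5), (140, 145, 9), (145, 150, 6), (150, 155, 2), (155, 160, 1)]

# Mid-point of each class stands in for every observation in that class
midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]

total_f = sum(f for _, _, f in classes)
total_fx = sum(m * f for m, (_, _, f) in zip(midpoints, classes))
grouped_mean = total_fx / total_f

print(total_f, total_fx, round(grouped_mean, 2))
```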

Page 17: Central Tendency – Median

Let's say house rentals in a particular town are tabulated:

Rental (RM/month)      130-135   135-140   140-145   145-150   150-155
Number of Taman (f)    3         5         9         6         2

Rental (RM/month)      < 135     < 140     < 145     < 150     < 155
Cumulative frequency   3         8         17        23        25

Calculation of the “median” rental needs a graphical aid (the ogive, i.e. cumulative frequency curve):

1. Median position = (n+1)/2 = (25+1)/2 = 13th Taman
2. (i.e. between 10 and 15 points on the vertical axis of the ogive).
3. This corresponds to RM 140-145/month on the horizontal axis
4. There are (17-8) = 9 Taman in the range of RM 140-145/month
5. The 13th Taman is the 5th out of the 9 Taman
6. The rental interval width is 5
7. Therefore, the median rental can be calculated as: 140 + (5/9 x 5) = RM 142.8
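The seven steps above amount to linear interpolation inside the class that contains the 13th Taman. A sketch under that reading follows; it assumes the fourth class is RM 145-150, as the cumulative frequencies imply.

```python
# (lower bound, upper bound, number of Taman); fourth class assumed to be 145-150
classes = [(130, 135, 3), (135, 140, 5), (140, 145, 9), (145, 150, 6), (150, 155, 2)]

n = sum(f for _, _, f in classes)
position = (n + 1) / 2  # rank of the median observation

cumulative = 0  # number of Taman below the current class
for lower, upper, f in classes:
    if cumulative + f >= position:
        # Interpolate within the median class
        median = lower + (position - cumulative) / f * (upper - lower)
        break
    cumulative += f

print(round(median, 1))
```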

Page 18: Central Tendency – Median (contd.)

Page 19: Central Tendency – Quartiles (contd.)

Following the same process as in calculating “median”:

* Upper quartile = ¾(n+1) = 19.5th Taman
  UQ = 145 + (2.5/6 x 5) = RM 147.1/month
* Lower quartile = (n+1)/4 = 26/4 = 6.5th Taman
  LQ = 135 + (3.5/5 x 5) = RM 138.5/month
* Inter-quartile range = UQ – LQ = 147.1 – 138.5 = RM 8.6/month
  IQ = 138.5 + (4/5 x 5) = RM 142.5/month
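The interpolation used for the median generalises to any rank. The sketch below uses a hypothetical helper, `grouped_quantile`, and assumes the rental classes of the median example (130-135 to 150-155 with frequencies 3, 5, 9, 6, 2). It interpolates from the cumulative count below the quartile class.

```python
def grouped_quantile(classes, position):
    """Interpolate the value at a given rank in grouped data.

    classes: list of (lower bound, upper bound, frequency) in ascending order.
    position: rank of the observation, e.g. (n + 1) / 4 for the lower quartile.
    """
    cumulative = 0
    for lower, upper, f in classes:
        if cumulative + f >= position:
            return lower + (position - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("position exceeds total frequency")


classes = [(130, 135, 3), (135, 140, 5), (140, 145, 9), (145, 150, 6), (150, 155, 2)]
n = sum(f for _, _, f in classes)

lq = grouped_quantile(classes, (n + 1) / 4)      # 6.5th Taman
uq = grouped_quantile(classes, 3 * (n + 1) / 4)  # 19.5th Taman
iqr = uq - lq                                    # inter-quartile range in RM/month

print(round(lq, 1), round(uq, 1), round(iqr, 1))
```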

Page 20: Variability

Indicates dispersion, spread, variation, deviation.

For single population or sample data:

$\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}$ and $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

where σ² and s² = population and sample variance respectively, xi = individual observations, μ = population mean, $\bar{x}$ = sample mean, and n = total number of individual observations.

The square roots are:

$\sigma = \sqrt{\sigma^2}$ (population standard deviation) and $s = \sqrt{s^2}$ (sample standard deviation)

Page 21: Variability (contd.)

Why is a “measure of dispersion” important? Consider yields of two plant species:
* Plant A (ton) = {1.8, 1.9, 2.0, 2.1, 3.6}
* Plant B (ton) = {1.0, 1.5, 2.0, 3.0, 3.9}

Mean A = mean B = 2.28 ton

But, different variability! Var(A) = 0.557, Var(B) = 1.367

* Would you choose to grow plant A or B?
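The figures for the two plants can be reproduced with Python's standard `statistics` module. Its `variance` function uses the sample formula (divisor n - 1), which is the one that matches the values quoted above.

```python
from statistics import mean, variance

plant_a = [1.8, 1.9, 2.0, 2.1, 3.6]
plant_b = [1.0, 1.5, 2.0, 3.0, 3.9]

# Same mean, very different sample variance (divisor n - 1)
for name, yields in (("A", plant_a), ("B", plant_b)):
    print(name, round(mean(yields), 2), round(variance(yields), 3))
```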

Page 22: Variability (contd.)

Coefficient of variation – CV – std. deviation as % of the mean:

$CV = \frac{s}{\bar{x}} \times 100$

A better measure compared to std. dev. in cases where samples have different means. E.g.

* Plant X (ton/ha) = {1.2, 1.4, 2.6, 2.7, 3.9}
* Plant Y (ton/ha) = {1.4, 1.5, 2.1, 3.2, 3.9}

Page 23: Variability (cont.)

Farm No.    Yield (ton/ha)
            Species X   Species Y
1           1.2         1.4
2           1.4         1.5
3           2.6         2.1
4           2.7         3.2
5           3.9         3.9
Mean        2.36        2.42
Var.        1.20        1.20
Std. dev.   1.10        1.09

Calculate CV for both species. Since CV is the std. deviation (the square root of the variance) as % of the mean:

CVx = (1.097/2.36) x 100 = 46.5%

CVy = (1.094/2.42) x 100 = 45.2%

Species X is a little more variable than species Y
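A short sketch of the CV comparison follows, taking CV as the sample standard deviation expressed as a percentage of the mean. The helper name `cv_percent` is illustrative.

```python
from statistics import mean, stdev

species_x = [1.2, 1.4, 2.6, 2.7, 3.9]
species_y = [1.4, 1.5, 2.1, 3.2, 3.9]


def cv_percent(values):
    """Coefficient of variation: sample std. deviation as % of the mean."""
    return stdev(values) / mean(values) * 100


cv_x = cv_percent(species_x)
cv_y = cv_percent(species_y)
print(round(cv_x, 1), round(cv_y, 1))
```

On this definition the ranking of the two species is unchanged: species X remains slightly more variable than species Y.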

Page 24: Variability (cont.)

Std. dev. of a frequency distribution

E.g. age distribution of second-home buyers (SHB):

Page 25: Probability distribution

* Logical probability: If there are 20 lecturers, the probability that A becomes a professor is p = 1/20 = 0.05
* Experiential probability: Out of 100 births, half of them were girls (p = 0.5); as the number increased to 1,000, two-thirds were girls (p = 0.67); but from a record of 10,000 new-born babies, three-quarters were girls (p = 0.75)
* Subjective probability: The probability of a drug addict recovering from addiction is 50:50

General rule:

Pr (event X) = (No. of times event X occurs) / (Total number of occurrences)

The probability of a certain event X occurring has a specific form of distribution.

Page 26: Probability Distribution

Classical example of tossing two dice (each cell is the sum of the two faces):

          Dice 1
Dice 2    1    2    3    4    5    6
1         2    3    4    5    6    7
2         3    4    5    6    7    8
3         4    5    6    7    8    9
4         5    6    7    8    9    10
5         6    7    8    9    10   11
6         7    8    9    10   11   12

What is the distribution of the sum of tosses?
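One way to answer the question is to enumerate all 36 equally likely outcomes and count how often each sum occurs. The sketch assumes two fair six-sided dice.

```python
from collections import Counter
from fractions import Fraction

# Count each possible sum over the 36 equally likely (dice 1, dice 2) outcomes
counts = Counter(d1 + d2 for d1 in range(1, 7) for d2 in range(1, 7))

# Probability of each sum, kept as exact fractions
dist = {total: Fraction(count, 36) for total, count in sorted(counts.items())}

for total, p in dist.items():
    print(total, p)
```

The probabilities rise from 1/36 at a sum of 2 to 1/6 at a sum of 7 and fall back to 1/36 at 12, which gives the triangular shape of the discrete distribution.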

Page 27: Probability Distribution (contd.)

Discrete variable

* Values of x are discrete (discontinuous)
* Sum of lengths of vertical bars: $\sum_{\text{all } x} p(X = x) = 1$

Page 28: Probability Distribution (cont.)

Age distribution of second-home buyers in probability histogram:

Age     Freq   Prob.
36      3      0.02
37      14     0.07
38      10     0.04
39      36     0.18
40      73     0.36
41      27     0.14
42      20     0.10
43      17     0.09
Total   200    1.00

Mean = 39.5; Std. dev = 2.45

Continuous variable: Pr (Area under curve) = 1

Page 29: Probability Distribution (cont.)

Pr (Age ≤ 36) = 0.02
Pr (Age ≤ 37) = Pr (Age ≤ 36) + Pr (Age = 37) = 0.02 + 0.07 = 0.09
Pr (Age ≤ 38) = Pr (Age ≤ 37) + Pr (Age = 38) = 0.09 + 0.04 = 0.13
Pr (Age ≤ 39) = Pr (Age ≤ 38) + Pr (Age = 39) = 0.13 + 0.18 = 0.31
Pr (Age ≤ 40) = Pr (Age ≤ 39) + Pr (Age = 40) = 0.31 + 0.36 = 0.67
Pr (Age ≤ 41) = Pr (Age ≤ 40) + Pr (Age = 41) = 0.67 + 0.14 = 0.81
Pr (Age ≤ 42) = Pr (Age ≤ 41) + Pr (Age = 42) = 0.81 + 0.10 = 0.91
Pr (Age ≤ 43) = Pr (Age ≤ 42) + Pr (Age = 43) = 0.91 + 0.09 = 1.00

Cumulative probability corresponds to the left tail of a distribution
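The running totals above are a cumulative sum of the individual probabilities. A minimal sketch, using the probabilities quoted on the slides:

```python
from itertools import accumulate

ages = [36, 37, 38, 39, 40, 41, 42, 43]
probs = [0.02, 0.07, 0.04, 0.18, 0.36, 0.14, 0.10, 0.09]

# Pr(Age <= a) is the running total of Pr(Age = a); round away float noise
cumulative = [round(c, 2) for c in accumulate(probs)]
cdf = dict(zip(ages, cumulative))

for age, p in cdf.items():
    print(f"Pr(Age <= {age}) = {p:.2f}")
```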

Page 30: Probability Distribution (cont.)

* As larger and larger samples are drawn, the probability distribution gets smoother
* Tens of different types of probability distribution: Z, t, F, gamma, etc.
* Most important: normal distribution

[Figure] Probability histograms for a larger sample and a very large sample.

Page 31: Normal Distribution - ND

Salient features of ND:

* Bell-shaped, symmetrical
* Total area under curve = 1
* Area under curve between any two points = prob. of values in that range (shaded area)
* Prob. of any exact value = 0
* Has a function of:

$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$

μ = mean of variable x; σ = std. dev. of x; π = ratio of circumference of a circle to its diameter ≈ 3.14159; e = base of natural log ≈ 2.71828.
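The standard normal density formula, written with the symbols defined here, is sketched below together with a crude numerical check that the total area under the curve is close to 1. The mean and standard deviation used (39.3 and 2.42) are those of the home-buyers example elsewhere in these slides; the step size is an arbitrary choice.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std. dev. sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


mu, sigma = 39.3, 2.42

# Crude Riemann sum over mu +/- 8 sigma to check that the area is about 1
step = 0.01
n_steps = int(16 * sigma / step)
area = sum(normal_pdf(mu - 8 * sigma + i * step, mu, sigma) * step for i in range(n_steps))

print(round(normal_pdf(mu, mu, sigma), 4), round(area, 4))
```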

Page 32: Normal Distribution - ND

[Figure] Normal curves for Population 1 and Population 2, with means μ1, μ2 and standard deviations σ1, σ2.

* A larger population has narrower base (smaller variance)
* μ determines location while σ determines shape of ND

Page 33: Normal Distribution (cont.)

* Has a mean μ and a variance σ², i.e. X ~ N(μ, σ²)
* Has the following distribution of observation:

“Home-buyers example…”

Mean age = 39.3

Std. dev = 2.42

Page 34: Standard Normal Distribution (SND)

* Since different populations have different μ and σ (thus, locations and shapes of distribution), they have to be standardised.
* Most common standardisation: standard normal distribution (SND), also called Z-distribution
* Pr (X=x) is given by area under curve
* Has no standard algebraic method of integration → Z ~ N(0,1)
* To transform f(x) into f(z):

$Z = \frac{x - \mu}{\sigma} \sim N(0, 1)$

Page 35: Z-Distribution

Probability is distributed in such a way that:
* Approx. 68% lies within -1 < z < 1
* Approx. 95% lies within -1.96 < z < 1.96
* Approx. 99% lies within -2.58 < z < 2.58
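These three coverage figures can be checked numerically. The sketch relies on the identity P(-z < Z < z) = erf(z/√2) for a standard normal variable; `central_prob` is an illustrative helper name.

```python
import math


def central_prob(z):
    """P(-z < Z < z) for a standard normal variable Z."""
    return math.erf(z / math.sqrt(2))


for z in (1.0, 1.96, 2.58):
    print(z, round(central_prob(z), 4))
```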

Page 36: Z-distribution (cont.)

* When X = μ, Z = 0, i.e. $Z = \frac{\mu - \mu}{\sigma} = 0$
* When X = μ + σ, Z = 1
* When X = μ + 2σ, Z = 2
* When X = μ + 3σ, Z = 3, and so on.
* It can be proven that P(X1 < X < Xk) = P(Z1 < Z < Zk)
* SND shows the probability to the right of any particular value of Z.

Page 37: Normal distribution…Questions

A study found that the mean age, A, of second-home buyers in Johor Bahru is 39.3 years old with a standard deviation of 2.45 years (taken as 2.4 in the answers). Assuming normality, how sure are you that the age is: (a) ≥ 40 years old; (b) 39 to 42 years old?

Answer:

(a) P(A ≥ 40) = P[Z ≥ (40 – 39.3)/2.4] = P(Z ≥ 0.2917 ≈ 0.3000) = 0.3821

(b) P(39 ≤ A ≤ 42) = P(A ≥ 39) – P(A ≥ 42)
= P[Z ≥ (39 – 39.3)/2.4] – P[Z ≥ (42 – 39.3)/2.4]
= P(Z ≥ –0.125) – P(Z ≥ 1.125)
= (1 – 0.45224) – 0.12924
= 0.54776 – 0.12924 = 0.4185

Always remember: to convert to SND, subtract the mean and divide by the std. dev.

Use Z-table!
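As a cross-check on table look-ups, the two probabilities can be computed directly with `statistics.NormalDist` from the Python standard library. The sketch assumes a mean of 39.3 and a standard deviation of 2.4, the divisor used in the answer; exact values differ slightly from table values because no rounding of Z is involved.

```python
from statistics import NormalDist

# Assumed parameters: mean age 39.3, std. dev. 2.4 (the divisor used in the answer)
age = NormalDist(mu=39.3, sigma=2.4)

p_a = 1 - age.cdf(40)            # (a) P(A >= 40)
p_b = age.cdf(42) - age.cdf(39)  # (b) P(39 <= A <= 42)

print(round(p_a, 4), round(p_b, 4))
```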

Page 38: “Student’s t-Distribution”

* Similar to Z-distribution (bell-shaped, symmetrical)
* Has a function of:

$f(t) = \frac{\Gamma\left(\frac{v+1}{2}\right)}{\sqrt{v\pi}\,\Gamma\left(\frac{v}{2}\right)} \left(1 + \frac{t^2}{v}\right)^{-\frac{v+1}{2}}$

where Γ = gamma function; v = n-1 = d.o.f; π ≈ 3.1416

* Flatter with thicker tails
* Distributed with t(0,σ) and -∞ < t < +∞
* As n→∞, t(0,σ) → N(0,1)
* Probability calculation requires information on d.o.f.

Page 39: How Are t-dist. and Z-dist. Related?

* Using central limit theorem, N(μ, σ²/n) will become z ~ N(0, 1) as n→∞
* For a large sample, t-dist. of a variable or a parameter is given by:
* The interval of critical values for variable, x is:

Page 40: Skewness, m3 & Kurtosis, m4

* Skewness, m3 measures degree of symmetry of distribution
* Kurtosis, m4 measures its degree of peakedness
* Both are useful when comparing sample distributions with different shapes
* Useful in data analysis

$m_3 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^3}{n\sigma^3}$ and $m_4 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^4}{n\sigma^4}$

Xi = individual sample observation; $\bar{X}$ = sample mean; σ = std. deviation; n = sample size
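A sketch of moment-based skewness and kurtosis follows. It assumes the common standardised-moment definitions (third and fourth central moments divided by σ³ and σ⁴, with divisor n throughout), under which a normal distribution has kurtosis 3. The function name and the sample data are illustrative.

```python
def shape_moments(values):
    """Return (skewness m3, kurtosis m4) as standardised central moments.

    Assumption: divisor n is used for the std. deviation and for both moments.
    """
    n = len(values)
    mean = sum(values) / n
    sigma = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    m3 = sum((v - mean) ** 3 for v in values) / (n * sigma ** 3)
    m4 = sum((v - mean) ** 4 for v in values) / (n * sigma ** 4)
    return m3, m4


symmetric = [1, 2, 3, 4, 5]      # symmetric sample: skewness should be zero
right_tail = [1, 1, 1, 2, 10]    # one large value: positive (right) skew

print(shape_moments(symmetric))
print(shape_moments(right_tail))
```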

Page 41: Skewness

[Figure] Distribution shapes:
* Perfectly normal (zero skew)
* Right (+ve) skew
* Left (-ve) skew
* Bimodal
* Uniform
* J-shaped

Page 42: Kurtosis

* Mesokurtic (normal) (zero excess kurtosis)
* Leptokurtic (high peak) (+ve excess kurtosis)
* Platykurtic (low peak) (-ve excess kurtosis)

Mesokurtic distribution…kurtosis = 3

Leptokurtic distribution…kurtosis > 3

Platykurtic distribution…kurtosis < 3

Page 43: Occurrence of ganoderma

X-coord. (000)   Y-coord. (000)   Trees with ganoderma
535.60           104.80           8
536.70           107.30           12
536.80           106.80           11
537.30           107.31           12
537.15           105.40           13
537.40           105.37           13
538.48           107.82           9
542.22           106.10           8
540.35           105.91           7
540.10           104.95           7
540.30           104.75           6
538.75           102.80           5
545.10           105.90           4
546.30           105.90           3
547.15           105.90           2
547.75           106.08           5
547.10           105.25           8
547.80           101.05           7
548.18           105.92           8
548.80           105.90           12
548.95           104.85           15
548.94           104.50           13
548.75           103.73           7
548.94           102.80           4

Page 44: Aluminium residues in the soil

E.g. Al2++ + H2++O-- → Al2O + H2

Al p.p.m.   Freq.
0           0
250         7
500         13
750         25
1000        18
1250        13
1500        9
1750        7
2000        3
2250        4
2500        3

sum (freq.)     102.00
mean            1073.53
std. dev.       553.05
variance        305867.94
(std. dev.)³    169161266.28
(std. dev.)⁴    93555193911.64
skew            0.77
kurtosis        13.44

Page 45: Measures of spatial separation

Weighted mean centre (X coord.) = $\bar{X}_w = \frac{\sum f x}{\sum f}$

Weighted mean centre (Y coord.) = $\bar{Y}_w = \frac{\sum f y}{\sum f}$

Standard distance = $\sqrt{\frac{\sum f (x - \bar{X}_w)^2 + \sum f (y - \bar{Y}_w)^2}{\sum f}}$

Distance between (x1,y1) and (x2,y2) = $\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$

E.g. WCM = ((545.10-542.86)² + (105.90-105.48)²)^0.5
= (5.0176 + 0.1764)^0.5
= 2.28 (i.e. 2,280 m)
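The measures of spatial separation can be sketched as three small functions. The formulas assumed here are the frequency-weighted mean centre, the frequency-weighted standard distance about that centre, and the Euclidean distance; the function names are illustrative. The three sample points are only a subset of the ganoderma table, so their centre is not the one reported for the full data set.

```python
import math


def weighted_mean_centre(points):
    """points: list of (x, y, f); returns the f-weighted mean (x, y)."""
    total = sum(f for _, _, f in points)
    xw = sum(x * f for x, _, f in points) / total
    yw = sum(y * f for _, y, f in points) / total
    return xw, yw


def standard_distance(points):
    """Square root of the f-weighted mean squared distance from the weighted centre."""
    xw, yw = weighted_mean_centre(points)
    total = sum(f for _, _, f in points)
    squared = sum(f * ((x - xw) ** 2 + (y - yw) ** 2) for x, y, f in points)
    return math.sqrt(squared / total)


def distance(p, q):
    """Euclidean distance between points p = (x1, y1) and q = (x2, y2)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


# Subset of the ganoderma records, (X-coord., Y-coord., trees), for illustration only
sample = [(535.60, 104.80, 8), (536.70, 107.30, 12), (536.80, 106.80, 11)]
print(weighted_mean_centre(sample), standard_distance(sample))

# Worked example from the slide: point (545.10, 105.90) to centre (542.86, 105.48)
wcm = distance((545.10, 105.90), (542.86, 105.48))
print(round(wcm, 2))
```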

Page 46: Spatial distribution – Occurrence of ganoderma

Sum f = 191.00; ΣXw = 103687.00; ΣYw = 20147.40; Σ(Xw - X̄w)² = 588.46; Σ(Yw - Ȳw)² = 55.50

Weighted mean centre: (542.86, 105.48)

Standard distance: 1.84

Point to point distance (e.g.):
x-dist.         5.00
y-dist.         0.17
Distance Wc-M   2.27

Page 47: Spatial distribution – point data

[Figure] Ethnic distribution of residence

Page 48: Ethnic distribution of residence

x       f     fx    (x - x̄)²
0       81    0     -0.49
1       50    50    0.51
2       9     18    1.51
Total   140   68    1.54

X = no. of observations per quadrat; f = frequency of quadrats; x̄ = Σ(fx)/Σf; σ² = Σ(x - x̄)²/(Σ(fx) - 1); CV = σ²/x̄; σCV = (2/(k-1))½; k = Σ(fx) - 1

Ho: σ² = x̄ (pattern is random)

H1: σ² > x̄ (pattern is clustered) or σ² < x̄ (pattern is scattered)

Test statistics:

x̄     0.49
σ²    0.012
CV    0.02
σCV   0.12
tc    -8.15

Reject Ho…residence pattern is scattered