
Post on 19-Dec-2015


Page 1: Lecture 3: Chi-Square, correlation and your dissertation proposal Non-parametric data: the Chi-Square test Statistical correlation and regression: parametric

Lecture 3: Chi-Square, correlation and your dissertation proposal

• Non-parametric data: the Chi-Square test

• Statistical correlation and regression: parametric and non-parametric tests

• Break

• Regression in SPSS

• Writing a dissertation proposal when you plan to use statistics

• Exercises, assessment and assistance


Non-parametric statistics

• Non-parametric statistics in human geography

• Different types of non-parametric test:
– 1 sample
– 2 independent samples
– 2 tied samples
– 3 or more samples


The Chi-Square test

• One of the most versatile tests in social science

• Can be used to examine nominal data, ordinal data and interval/ratio data in groups

• Unlike the t-test, it has no separate versions for independent and paired samples, although each case should be counted in only one cell


Theory of Chi-Square

• The test examines the difference between observed counts and expected values

• Suppose we wanted to compare the age groups in our sample with those in the UK population, or to compare the age groups of two or three samples

• Chi-Square can examine these differences


The Chi-Square Equation

\[ \chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}} \]


One way Chi-Square test

• Examines whether there is a difference between one sample and a population

• We can assume either that the expected counts will be equal between categories or that we know the proportions

• But, before we do the test, we have to cross-tabulate the data


The Cross-tabulation

Age      18-30   31-50   51-65   Over 65   Total
North       30      20      35        55     140
Total       30      20      35        55     140


The expected counts

• Expected counts relate to either equal proportions or previously known proportions (e.g. from a population)

• These are then compared to observed counts and the difference is calculated

• A significance level is selected and the null hypothesis is accepted or rejected
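As a rough illustration of where expected counts come from, the Python sketch below derives them for a sample of 140, first from equal proportions and then from known proportions. The function name `expected_counts` and the population proportions are invented for this example and do not come from the lecture.

```python
def expected_counts(total, proportions=None, n_categories=None):
    """Return expected counts for a one-way Chi-Square test.

    If `proportions` is None, the total is split equally across
    `n_categories`; otherwise each proportion is scaled by the total.
    """
    if proportions is None:
        return [total / n_categories] * n_categories
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    return [total * p for p in proportions]


# Equal proportions across four age groups in a sample of 140
equal = expected_counts(140, n_categories=4)

# Hypothetical population proportions (illustrative only)
known = expected_counts(140, proportions=[0.20, 0.35, 0.25, 0.20])

print(equal)
print([round(k, 1) for k in known])
```

With equal proportions every age group is expected to hold 35 of the 140 cases; with the made-up population proportions the expected counts differ by group.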


The Contingency Table

Age        18-30   31-50   51-65   Over 65   Total
North         30      20      35        55     140
Expected      35      35      35        35     140
Total         30      20      35        55     140


The test result

• Chi-Square is calculated by summing (observed - expected)² / expected across every cell

• Assessed as for other statistical tests

• χ² = 18.6 (df = 3, p < 0.05) for the observed and expected counts in the contingency table
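The calculation can be sketched in a few lines of Python using the observed counts (30, 20, 35, 55) and the expected count of 35 per age group from the contingency table. This is a minimal illustration rather than the SPSS procedure: the function name is made up, and the critical value 7.815 is the standard tabulated Chi-Square value for 3 degrees of freedom at the 0.05 level. For these counts the statistic works out at about 18.57.

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [30, 20, 35, 55]          # North sample, by age group
expected = [35, 35, 35, 35]          # equal proportions of 140

statistic = chi_square(observed, expected)
df = len(observed) - 1               # degrees of freedom: categories - 1

# Tabulated critical value of Chi-Square for df = 3 at the 0.05 level
CRITICAL_0_05_DF3 = 7.815

print(f"chi-square = {statistic:.2f}, df = {df}")
print("reject null hypothesis:", statistic > CRITICAL_0_05_DF3)
```

Because the statistic exceeds the critical value, the null hypothesis of equal proportions would be rejected at the 0.05 level.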


Two way Chi-Square tests

• Very often, we want to do more than compare one sample with a population: we want to compare it with another sample, or to compare three or more samples

• Two way Chi-Square allows us to do this easily

• Again, we cross-tabulate the data


The Contingency table

Age      18-30   31-50   51-65   Over 65   Total
North       25      25      35        55     140
South       40      35      25        20     120
Total       65      60      60        75     260


Two-way analysis

• Chi-Square calculates the expected value for each cell by multiplying its row total by its column total and dividing by the grand total

• Expected values represent the number in each category which, given the sample sizes and distribution, we would expect to see in each cell
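The rule for expected values can be sketched in Python using the North/South contingency table. This is an illustrative hand calculation with made-up variable names, not the way SPSS reports the test.

```python
observed = [
    [25, 25, 35, 55],   # North
    [40, 35, 25, 20],   # South
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell: row total * column total / grand total
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1)
df = (len(observed) - 1) * (len(col_totals) - 1)

for name, exp_row in zip(["North", "South"], expected):
    print(name, [round(e, 1) for e in exp_row])
print(f"chi-square = {chi_square:.1f}, df = {df}")
```

For example, the expected count for North, 18-30 is 140 × 65 / 260 = 35, against an observed count of 25.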


The Chi-Square result

• Chi-Square gives the result and we evaluate it against our chosen significance level

• χ² = 21.7 (p < 0.05)

• But, we can only state that there is a difference - not what the difference is. For example, does our sample from the north have more older people in it?

• We must examine the relative proportions of the contingency table to find this out
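One way to examine the relative proportions is to convert each row of the contingency table to percentages of its row total. A minimal Python sketch, using the North/South counts from the table and variable names chosen only for illustration:

```python
age_groups = ["18-30", "31-50", "51-65", "Over 65"]
observed = {
    "North": [25, 25, 35, 55],
    "South": [40, 35, 25, 20],
}

# Convert each row of counts to percentages of that row's total
row_percentages = {
    region: [100 * count / sum(counts) for count in counts]
    for region, counts in observed.items()
}

for region, percentages in row_percentages.items():
    cells = ", ".join(
        f"{group}: {pct:.1f}%" for group, pct in zip(age_groups, percentages)
    )
    print(f"{region}: {cells}")
```

Here the over-65 group makes up about 39% of the North sample but only about 17% of the South sample, which suggests where the difference lies.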


The expected counts problem

• Chi-Square stipulates that no more than 20% of the expected counts in an analysis may be under 5. If more than 20% are under 5, the test is invalid

• So, how can we get over this problem?


Recoding variables

• We can aggregate suitable variables to make the number of groups smaller

• Aggregating works best with ordinal data, where adjacent categories can be merged meaningfully

• This reduces the number of groups and so lowers the likelihood of obtaining expected counts below 5

• We can also use this to make interval/ratio data into groups
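Both uses of recoding can be sketched in Python. The age bands, counts, cut-off points and respondent ages below are all hypothetical, chosen only to show the idea; in practice the groupings should make sense for the research question.

```python
# Hypothetical observed counts for six ordered age bands
bands = ["18-24", "25-30", "31-40", "41-50", "51-65", "Over 65"]
counts = [3, 5, 4, 6, 7, 5]

# Merge adjacent bands into three broader groups
merged = {
    "18-30": counts[0] + counts[1],
    "31-50": counts[2] + counts[3],
    "Over 50": counts[4] + counts[5],
}


def age_group(age):
    """Recode an age in years (interval/ratio data) into an ordinal group."""
    if age <= 30:
        return "18-30"
    if age <= 50:
        return "31-50"
    return "Over 50"


ages = [19, 34, 52, 71, 28, 45]           # hypothetical respondents
recoded = [age_group(a) for a in ages]

print(merged)
print(recoded)
```

Merging halves the number of groups while keeping the total number of cases the same, so each remaining group holds more cases.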


Chi-Square: Qualifications

• You should have no fewer than 20 cases

• As stated above, not more than 20% of cells should have expected values under 5

• You should not necessarily ignore a contingency table, even if the Chi-Square test is invalid

• Remember, above all, that Chi-Square is a test of difference, not correlation
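The two numeric qualifications above (at least 20 cases, and no more than 20% of expected counts under 5) can be checked before running the test. The Python sketch below is one possible way to do so; the function name, its parameters and the survey counts are assumptions made for the example.

```python
def check_chi_square_validity(observed, expected, min_cases=20,
                              min_expected=5, max_share=0.20):
    """Check the two numeric rules of thumb for a Chi-Square test.

    `observed` and `expected` are flat lists of cell counts.
    Returns (enough_cases, share_of_small_cells, is_valid).
    """
    enough_cases = sum(observed) >= min_cases
    small_cells = sum(1 for e in expected if e < min_expected)
    share_small = small_cells / len(expected)
    is_valid = enough_cases and share_small <= max_share
    return enough_cases, share_small, is_valid


# Hypothetical small survey: 24 cases spread over six categories
observed = [9, 6, 4, 3, 1, 1]
expected = [4.0] * 6            # equal proportions: 24 / 6

result = check_chi_square_validity(observed, expected)
print(result)
```

In this made-up case there are enough cases, but every expected count is below 5, so the test would be treated as invalid and the categories would need to be aggregated.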


Statistical correlation: relationships among variables

• Relationships are concerned with the extent to which variable A is related to B

• This is termed correlation

• Correlation does not necessarily imply causation, but merely a possible relationship

• There are parametric and non-parametric tests of correlation


Types of correlation

• Perfect positive correlation: +1

• Perfect negative correlation: -1

• Linear relationship

• No correlation: 0

• Non-linear relationship

[Figure: scatter plot, both axes running from 0 to 20]


Parametric correlation: Pearson’s r

• Assumes your data are on interval/ratio scales AND are normally distributed

• Measured on a scale from -1 to +1

• This result shows the strength and direction of the relationship

• The test must be judged by its significance (as for other parametric tests: is p below or above 0.05?)
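To show what the coefficient measures, here is a minimal Python sketch that computes Pearson's r by hand. The variables and data are hypothetical, and the sketch covers only the coefficient itself; the significance value would still come from a statistics package such as SPSS.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical interval/ratio data
distance_km = [1, 2, 3, 4, 5]
trips_per_week = [9, 7, 6, 4, 2]

r = pearson_r(distance_km, trips_per_week)
print(f"r = {r:.3f}")
```

For these made-up figures r is close to -1: a strong negative relationship, with trips falling as distance rises.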


Non-parametric correlation: Spearman’s rs

• Assumes ordinal data, or interval/ratio data that are not normally distributed

• Data are ranked for the test

• Measured as for Pearson’s

• Significance as for Pearson’s
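The ranking step can be sketched in Python: the data are replaced by their ranks (tied values share the mean of their ranks) and Pearson's formula is then applied to the ranks. The function names and the attitude and behaviour scores are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 (smallest), giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson's correlation, used here on ranks rather than raw values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) *
                      sum((b - my) ** 2 for b in y))


def spearman_rs(x, y):
    """Spearman's rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical ordinal scores (1 = strongly disagree ... 5 = strongly agree)
attitude = [1, 2, 2, 4, 5]
behaviour = [1, 3, 2, 5, 4]

rs = spearman_rs(attitude, behaviour)
print(f"rs = {rs:.3f}")
```

Because only the ranks enter the calculation, the size of the gaps between scores does not matter, which is why the test suits ordinal data.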


From correlation to explanation: regression analysis

• Regression seeks to examine the nature of the relationship between one or more independent variables and a dependent variable

• It is concerned with prediction, not just correlation

• To predict, there is an equation which describes the ‘line of best fit’ between variables


The Line of best fit

• Line of best fit ‘fits’ a straight line through the data points you observe

• Can be expressed by:

Y = mx + c

Where:

Y = dependent variable

c = constant (intercept)

m = slope (gradient)

x = independent variable

[Figure: scatter plot with fitted line y = 0.9677x + 0.5895; both axes running from 0 to 20]


Predicting using the regression equation

• You can use the equation to predict levels of Y for given levels of X

• This is often of use when looking at different outcome situations
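Fitting and prediction can be sketched in Python as follows. The least-squares formulas are the standard ones for a single independent variable; the function names and the observations are invented for the example. The last lines apply the same prediction step to the fitted equation from the lecture's chart, y = 0.9677x + 0.5895.

```python
def fit_line(x, y):
    """Least-squares slope m and intercept c for y = m*x + c."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    m = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
         / sum((a - mean_x) ** 2 for a in x))
    c = mean_y - m * mean_x
    return m, c


def predict(x_value, m, c):
    """Predicted level of Y for a given level of X."""
    return m * x_value + c


# Hypothetical observations
x = [2, 4, 6, 8, 10]
y = [3, 4, 7, 8, 11]

m, c = fit_line(x, y)
print(f"y = {m:.2f}x + {c:.2f}")
print("predicted y at x = 12:", round(predict(12, m, c), 2))

# Prediction from the equation shown on the lecture's chart
lecture_prediction = predict(10, 0.9677, 0.5895)
print("lecture equation at x = 10:", round(lecture_prediction, 4))
```

Predictions are most trustworthy within the range of X values actually observed; extending the line far beyond the data is a judgement call.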


Interpreting regression results

• R²: the ‘goodness of fit’ that the model offers, i.e. the share of variance in the dependent variable that it explains, often expressed in per cent

• F: the significance of the model as a whole

• The regression coefficients and associated p values
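As a sketch of what R² represents, the Python below compares the variation left over around a fitted line with the total variation in the dependent variable. The data are hypothetical, and the line y = 1.0x + 0.6 is their least-squares fit; F and the p values are not computed here and would normally be read from the SPSS output.

```python
def r_squared(x, y, m, c):
    """Proportion of the variance in y explained by the line y = m*x + c."""
    mean_y = sum(y) / len(y)
    ss_total = sum((b - mean_y) ** 2 for b in y)
    ss_residual = sum((b - (m * a + c)) ** 2 for a, b in zip(x, y))
    return 1 - ss_residual / ss_total


# Hypothetical data and its least-squares line y = 1.0x + 0.6
x = [2, 4, 6, 8, 10]
y = [3, 4, 7, 8, 11]

r2 = r_squared(x, y, m=1.0, c=0.6)
print(f"R squared = {r2:.3f} ({r2:.1%} of variance explained)")
```

For these made-up figures the line explains about 97% of the variation in y.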


Regression: assumptions

• Your data:
– Are measured on interval/ratio scales;
– Are normally distributed;
– And are therefore Parametric; and...
– Have a linear relationship

• You can use other techniques for non-linear regression and regression with nominal/ordinal variables


Is any of this relevant to me?

• YES - you have to write a dissertation proposal

• Saying you will ‘analyse’ the data using appropriate methods is not enough

• You will get a far higher mark if you follow these simple steps in the next two months when preparing your proposal:


Writing your Dissertation Proposal: key points

• Do you need to use a questionnaire/other quantitative instrument?

• If yes, what key questions are you posing?

• ALWAYS relate these questions to your plans for analysis

• How will you analyse these collected data to meet your aims and objectives?


Writing your proposal

• Methodology: Quantitative/qualitative?

• Questionnaire: Type: closed/open/both?

• Questions: Yes/no; frequency; categorical; multiple response?

• Data this will yield: Parametric/non-parametric?

• Analysis types: Description, differences, relationships?

• Analysis tools: Parametric/non-parametric?


Example of this process

Section of proposal    Abstract example    Specific example

Methodology            Quantitative        A questionnaire

Questionnaire          Type                Closed, with one open question

Questions              Dichotomous         Yes/no, M/F, etc.
                       Categorical         Family type, etc.
                       Agreement           Attitude questions
                       Frequency           Behaviour questions
                       Write-in answers    Age

Data this will yield   Parametric          Age
                       Non-parametric      All other variables

Analysis types         Description         Describe attitudes
                       Differences         Difference in attitudes (e.g. M/F)
                       Relationships       Correlation of attitudes/behaviour

Analysis tools         Visual              Bar and pie graphs, Pareto charts
                       Parametric          t-tests, ANOVA, Pearson’s r
                       Non-parametric      Chi-Square, Spearman’s rs


A final word

• Think carefully about your questionnaire - can you meet the objectives you have set yourself?

• Do you need to use every statistical test?

• Assessments (all 3) due in on 6 May

• Where can you get help?
– Friday 14th March, 9-11am;
– Monday 28th April, 11am-1pm

• E-mail: [email protected]