Making Social Work Count Lecture 9

An ESRC Curriculum Innovation and Researcher Development Initiative

Testing for statistical significance

Is there a real difference?

Learning outcomes

• At the end of this session, you should be able to:
– Define research question, research hypothesis, null hypothesis and statistically significant;
– Discuss the basic requirements for testing the difference between two means;
– Define and describe the difference between the alpha value and P value, and Type I and Type II errors;
• Advanced Study:
– Calculate the difference between the means (t-ratio) using example data through advanced study

1. theorise and hypothesise
2. collect sample data
3. test
4. determine the likelihood the hypothesis is true

1. theorise and hypothesise
2. test
3. decide and predict

Taking decision-making a step further

Research Questions

What kinds of research questions might we want to ask in social work?

Research questions

1. Do children exposed to domestic violence experience more mental ill health than children who have not been exposed?
2. Is smacking an effective behavioural management tool?
3. Are young offenders on non-custodial sentences less likely to offend than young offenders on custodial sentences?
4. Is parenting capacity reduced when parents misuse drugs or alcohol?
5. Do children in kinship care have better outcomes than children with unrelated foster carers?

Hypotheses: Research and Null

Research hypotheses

1. Children exposed to domestic violence experience more mental ill health than children who have not been exposed
2. Smacking is an effective behavioural management tool
3. Young offenders on non-custodial sentences are less likely to offend than young offenders on custodial sentences
4. Parenting capacity is reduced when parents misuse drugs or alcohol
5. Children in kinship care have better outcomes than children with unrelated foster carers

Making statements for testing

Research hypothesis
• A proposed explanation for a phenomenon that can be tested
• There is a relationship between two measured variables
• A particular intervention makes a difference/has an effect
Example: Smacking is an effective behavioural management tool

Null hypothesis
• The opposite position of the hypothesis
• There is no relationship between two measured variables
• The particular intervention does not make a difference/has no effect
Example: Smacking is not an effective behavioural management tool

Accepting the research hypothesis

• Accepting the research hypothesis says that the difference between sample means is too large to be accounted for by sampling error, and, therefore, reflects differences within or between populations.

Research questions and hypotheses in Social Work

• Does a particular intervention work?
– Have negative symptoms reduced after the intervention? (e.g. depression; anxiety; number of hours of care)
– Have positive attributes of service users increased after the intervention? (e.g. self-esteem; optimism; hope; fewer hours of care)
• Are there differences in treatment outcomes based on gender, ethnicity, age?

Singing and Mental Health: An example of hypothesis testing

http://www.youtube.com/watch?v=XDrmH0uM5xM

Does singing have an effect on mental health?
• Study looking at the effect of choral singing on mental health (N=218)
• Research hypothesis – Participation in choral singing will result in changes in the participant’s emotional state.
• Null hypothesis – Participation in choral singing will result in no changes in the participant’s emotional state.
• Measured aspects of one’s emotional state before and after the choral singing intervention
• Found that positive emotions increased (excited, alert, hopeful, relaxed, harmonic, to be present) and negative emotions decreased (pain, headache). Also found that participants felt more together and less alone.

Does singing have an effect on mental health?
• Are these findings true or did the differences in emotional states just happen by chance?
• How confident are we that there was a statistically significant difference in emotional state before and after the choral singing?

Statistically Significant

• The researchers would have employed statistical tests to determine the extent to which they are confident that the results accurately reflect what occurs in the population (a real population difference) versus merely occurring by chance (or sampling error):
– t-test
• Based on the outcome of the test, the researchers can determine whether the results are statistically significant, that is, whether the outcome is unlikely to have occurred by chance.

The basic requirements for testing the difference between two means

Basic requirements in hypothesis testing/answering research questions
1. Develop a research hypothesis to test or a research question to answer
2. Select or determine the main variable of interest to measure (data)
• This could involve using a standardised measure (e.g. Strengths and Difficulties Questionnaire for children; parenting capacity measure) or any other numerical value (e.g. number of units of alcohol or drugs used; number of weeks in foster care)
3. Select your sample(s), gather the data, and calculate the descriptive statistics of the sample(s)
4. Conduct a statistical test of difference between the means (or other descriptive statistics, such as percentages) (t-test; chi-square; ANOVA; Fisher’s exact test)
• This involves pre-determining the alpha value
5. Interpret whether the results are statistically significant

An example of determining differences in hope amongst social work students

A “hopeful” example of how to test for differences

1. Develop a research question to answer or a research hypothesis to test:
• Research Question – Is there a difference in level of hope between first year and final year social work students?
• Research Hypothesis – First year social work students will have higher levels of hope than final year social work students.
• Null Hypothesis – There is no difference in level of hope between first year and final year social work students.


2. Select an instrument or measure of hope:
• The Trait Hope Scale (Snyder et al., 1991)

Each question is rated: 1 = Definitely False; 2 = Mostly False; 3 = Mostly True; 4 = Definitely True

1. I can think of many ways to get out of a difficult situation.
2. I energetically pursue my goals.
3. I feel tired most of the time.
4. There are lots of ways around any problem.
5. I am easily downed in an argument.
6. I can think of many ways to get the things in life that are most important to me.
7. I worry about my health.
8. Even when others get discouraged, I know I can find a way to solve the problem.
9. My past experiences have prepared me well for my future.
10. I’ve been pretty successful in life.
11. I usually find myself worrying about something.
12. I meet the goals that I set for myself.

The Trait Hope Scale

• Sums the scores of items 1, 2, 4, 6, 8, 9, 10 and 12 (the other items are fillers and do not contribute to the overall score).
• Scores can range from 8 to 32, with higher scores indicating higher levels of hope.
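The scoring rule above can be sketched in a few lines of Python. This is only an illustration: the function name `trait_hope_score`, the dictionary layout of the responses and the example respondent are assumptions made here, not part of the published scale.

```python
# Items that count towards the Trait Hope Scale total, as listed on the slide.
SCORED_ITEMS = (1, 2, 4, 6, 8, 9, 10, 12)


def trait_hope_score(responses):
    """Sum the scored items.

    `responses` maps an item number (1-12) to a rating from 1 to 4.
    Filler items (3, 5, 7, 11) are ignored.
    """
    for item in SCORED_ITEMS:
        rating = responses[item]
        if rating not in (1, 2, 3, 4):
            raise ValueError(f"Item {item} must be rated 1-4, got {rating!r}")
    return sum(responses[item] for item in SCORED_ITEMS)


# Hypothetical respondent who answers "Mostly True" (3) to every item.
example_responses = {item: 3 for item in range(1, 13)}
example_score = trait_hope_score(example_responses)
print(example_score)  # 8 scored items x 3 = 24
```

A respondent who rates every item 1 scores the minimum of 8, and one who rates every item 4 scores the maximum of 32, which matches the range stated above.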


3. Select a random sample of students from the first year cohort and final year cohort (N = 30 for each cohort):
• Administer the Trait Hope Scale to the sample of first and final year students.
• Calculate the mean scores for first and final year students.

First Year Social Work Students (N=30)    Final Year Social Work Students (N=30)
30    25                                  18    20
25    31                                  20    28
31    30                                  25    24
24    32                                  19    23
30    29                                  23    22
25    26                                  24    28
24    29                                  25    30
22    30                                  26    25
18    27                                  30    23
20    28                                  18    24
31    23                                  15    27
32    32                                  16    18
28    29                                  14    19
30    30                                  18    20
29    31                                  30    15

Is there a difference in level of hope between first and final year students?
• Calculate the mean scores for the two groups of students:
– First Years = 27.7
– Final Years = 22.2
• There is a mean difference of 5.5
• Can we conclude that first years are actually more hopeful than final years?
• Could the difference be a result of sampling error and, therefore, due to chance and chance alone?

Descriptive Statistics

                                              N     Min.   Max.   Mean     Std. Deviation
Overall hope score for first year students    30    18     32     27.70    3.706
Overall hope score for final year students    30    14     30     22.23    4.688
Valid N (listwise)                            30

Output #1: The mean hope score for the first and final year students

[Figure: The distribution of hope scores for first and final year students, with the mean of each group marked]

Standard Deviation: A measure of variability

• The standard deviation (sd) reflects the typical deviation from the mean
• Describes the variability of the scores
– How close together or how far apart are the scores?
• The sds for the hope scores were as follows:
– sd (first years) = 3.706
– sd (final years) = 4.688

Standard Deviation: An example of great variability and no variability

No variability: N = 30; Min = 30; Max = 30; sd = .000
Great variability: N = 30; Min = 15; Max = 30; sd = 6.226
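As a rough check on the descriptive statistics, the means and standard deviations can be recomputed from the raw scores with Python's standard library. The sketch assumes that the first two columns of the data table belong to the first year students and the last two to the final year students; the `describe` helper is a name introduced here for illustration.

```python
import statistics

# Raw hope scores, read column by column from the data table
# (assumption: columns 1-2 are first years, columns 3-4 are final years).
first_year = [
    30, 25, 31, 24, 30, 25, 24, 22, 18, 20, 31, 32, 28, 30, 29,
    25, 31, 30, 32, 29, 26, 29, 30, 27, 28, 23, 32, 29, 30, 31,
]
final_year = [
    18, 20, 25, 19, 23, 24, 25, 26, 30, 18, 15, 16, 14, 18, 30,
    20, 28, 24, 23, 22, 28, 30, 25, 23, 24, 27, 18, 19, 20, 15,
]


def describe(scores):
    """Return N, minimum, maximum, mean and sample standard deviation."""
    return {
        "n": len(scores),
        "min": min(scores),
        "max": max(scores),
        "mean": statistics.mean(scores),
        "sd": statistics.stdev(scores),  # sample sd (divides by n - 1)
    }


first_stats = describe(first_year)
final_stats = describe(final_year)
print(first_stats)
print(final_stats)

# A group in which everyone scores the same has no variability at all.
no_variability_sd = statistics.stdev([30] * 30)
print(no_variability_sd)
```

Under that column assumption, the rounded results agree with Output #1 (means of 27.70 and 22.23; standard deviations of 3.706 and 4.688), and the constant group has a standard deviation of zero.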


4. Conduct a statistical test of significance (t-test) between the first year hope score mean and the final year hope score mean.
– Determine the level of significance or α (alpha value)

What level of significance?

• Statistically significant difference – The differences observed in a sample reflect a real population difference and are not a result of sampling error.

• In order to determine if something is statistically significant, you must establish a level of significance (represented by the Greek letter α (alpha)).

Probability (P) and alpha value (α)

• α = the level of probability at which the null hypothesis can be rejected with confidence and the research hypothesis accepted with confidence
• We symbolise the probability as P < .05
• It is convention to use the α = .05 level of significance, but significance levels can be set at any degree of probability
– α = .01 level of significance; P < .01
• We reject the null hypothesis if the P value is less than the alpha value and otherwise retain it.
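The decision rule in the last bullet can be written as a small Python function. This is a minimal sketch: the function name `decide` and the P values passed to it are hypothetical and are not taken from the lecture data.

```python
def decide(p_value, alpha=0.05):
    """Apply the rule: reject the null hypothesis only if P < alpha."""
    if not 0 <= p_value <= 1:
        raise ValueError("A P value must lie between 0 and 1")
    if p_value < alpha:
        return "reject the null hypothesis"
    return "retain the null hypothesis"


# Hypothetical P values, chosen only to illustrate the rule.
print(decide(0.03))              # below the conventional .05 level
print(decide(0.03, alpha=0.01))  # not below the stricter .01 level
print(decide(0.20))
```

The same P value can lead to different decisions depending on the alpha value, which is why alpha is fixed before the test is run.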

Output #2: The t-test for differences between two means (Independent samples test)

Hope Score                     Levene’s Test for         t-test for Equality of Means
                               Equality of Variances
                               F        Sig.             t        df    Sig. (2-tailed)   Mean Difference   Std. Error Difference   95% Confidence Interval of the Difference
                                                                                                                                    Lower     Upper
Equal variances assumed        2.464    .122             5.010    58    .000              5.467             1.091                   3.283     7.651
Equal variances not assumed                              5.010    55    .000              5.467             1.091                   3.280     7.653


5. Interpret whether the results are statistically significant
– Is there a statistically significant difference between the two means?
– Can we reject the null hypothesis of no difference?

The results

• ‘a t of 5.010 with 58 degrees of freedom indicates that the difference between first and final year students in their mean hope scores is statistically significant at the .001 level.’

99.9% confident (1 chance out of 1,000 of a Type I error)
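For readers who want to see where the t of 5.010 comes from, the following sketch recomputes the "Equal variances assumed" row of Output #2 by hand using only the Python standard library. It assumes the first two columns of the data table are the first year students and the last two the final year students. The critical value 2.0017 (two-tailed, df = 58, α = .05) is taken from a standard t table and rounded; it is an assumption of this sketch, not something printed in the output.

```python
import math
import statistics

first_year = [
    30, 25, 31, 24, 30, 25, 24, 22, 18, 20, 31, 32, 28, 30, 29,
    25, 31, 30, 32, 29, 26, 29, 30, 27, 28, 23, 32, 29, 30, 31,
]
final_year = [
    18, 20, 25, 19, 23, 24, 25, 26, 30, 18, 15, 16, 14, 18, 30,
    20, 28, 24, 23, 22, 28, 30, 25, 23, 24, 27, 18, 19, 20, 15,
]


def independent_t(a, b):
    """Pooled-variance t-ratio for two independent samples."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    mean_diff = statistics.mean(a) - statistics.mean(b)
    # Weighted average of the two sample variances.
    pooled_var = (
        (n_a - 1) * statistics.variance(a) + (n_b - 1) * statistics.variance(b)
    ) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return mean_diff / std_error, df, mean_diff, std_error


t_ratio, df, mean_diff, std_error = independent_t(first_year, final_year)

# Two-tailed critical t for df = 58 at alpha = .05 (from a t table, rounded).
T_CRITICAL_05 = 2.0017
ci_lower = mean_diff - T_CRITICAL_05 * std_error
ci_upper = mean_diff + T_CRITICAL_05 * std_error

print(f"t({df}) = {t_ratio:.3f}")
print(f"Mean difference = {mean_diff:.3f}, standard error = {std_error:.3f}")
print(f"95% CI of the difference: {ci_lower:.3f} to {ci_upper:.3f}")
```

Because the observed t is far larger than the critical value, the null hypothesis of no difference is rejected, in line with the reported result. An exact P value would need the t distribution, which the standard library does not provide, so the sketch compares against a tabled critical value instead.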

Have we made the right decision: Type I and Type II errors

Type I and Type II errors

• Setting a level of significance does not mean we will always be able to say with 100% confidence that we are correct in retaining or rejecting the null hypothesis.
– With the α = .05 level of significance there is a 5 in 100 chance (P = .05) of wrongly rejecting a null hypothesis that is true; with the α = .01 level of significance there is a 1 in 100 chance (P = .01).


• Type I error (α)
– Rejecting the null hypothesis (stating there IS a difference between means) when we should have retained it (stating there IS NOT a difference between means).
– The more stringent our level of significance, the less likely we are to make a Type I error
• Type II error (β)
– Retaining the null hypothesis (stating there IS NOT a difference between means) when we should have rejected it (stating there IS a difference between means)
– Increasing the size of the samples reduces the likelihood of a Type II error


                             DECISION
REALITY                      Retain null hypothesis                    Reject null hypothesis
Null hypothesis is true      Correct decision                          Type I error; P (Type I error) = α
Null hypothesis is false     Type II error; P (Type II error) = β      Correct decision

* Levin, J., & Fox, J.A. (2003). Elementary statistics in social research (9th ed.). Boston: Allyn and Bacon.
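The meaning of P (Type I error) = α can be illustrated with a small simulation, sketched here under stated assumptions. Both samples are drawn from the same hypothetical population (mean 25, sd 4, values invented for this example), so the null hypothesis is true by construction and every rejection is a Type I error. The critical value 2.0017 applies to two groups of 30 (df = 58) at α = .05 and is taken from a standard t table.

```python
import math
import random

# Two-tailed critical t for df = 58 at alpha = .05 (from a t table, rounded).
# It is only valid for the default of 30 scores per group used below.
T_CRITICAL_05 = 2.0017


def t_ratio(a, b):
    """Pooled-variance t-ratio for two independent samples."""
    n_a, n_b = len(a), len(b)
    mean_a, mean_b = sum(a) / n_a, sum(b) / n_b
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((x - mean_b) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))


def type_i_error_rate(n_trials=2000, n_per_group=30, seed=1):
    """Share of trials that wrongly reject a null hypothesis that is true."""
    rng = random.Random(seed)
    false_rejections = 0
    for _ in range(n_trials):
        # Same population for both groups, so there is no real difference.
        a = [rng.gauss(25, 4) for _ in range(n_per_group)]
        b = [rng.gauss(25, 4) for _ in range(n_per_group)]
        if abs(t_ratio(a, b)) > T_CRITICAL_05:
            false_rejections += 1
    return false_rejections / n_trials


rate = type_i_error_rate()
print(f"Proportion of false rejections: {rate:.3f}")
```

The printed proportion should land close to .05, although it will not be exactly .05 in any single run of the simulation. That is the sense in which α is the probability of a Type I error.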

Testing the difference between means

Type of Test / When to use (examples)

Independent samples t-test
To test whether the means of two unrelated samples differ
• Testing whether hope scores differ by year in the social work programme
• Testing whether IQ scores differ by sex

Paired-samples t-test
Used with matched-pair data where the research question calls for the repeated measurement of responses from the same individual
• Testing whether hope scores increase for individuals from before a motivation course to after the course has taken place
• Testing whether depression scores decrease for individuals from before to after therapy
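To show how the paired-samples test differs from the independent samples test, here is a minimal sketch that works on the difference within each individual rather than on two separate group means. The before and after scores are hypothetical values invented for this example, and `paired_t` is a name introduced here.

```python
import math
import statistics

# Hypothetical hope scores for the same eight individuals before and after
# a motivation course (invented for illustration, not data from the lecture).
before = [12, 15, 11, 14, 13, 16, 12, 15]
after = [14, 18, 12, 17, 15, 19, 13, 18]


def paired_t(first, second):
    """t-ratio for matched pairs: mean difference / its standard error."""
    if len(first) != len(second):
        raise ValueError("Paired data need the same number of scores")
    diffs = [s - f for f, s in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    std_error = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / std_error, n - 1  # df = number of pairs - 1


t_ratio, df = paired_t(before, after)
print(f"t({df}) = {t_ratio:.2f}")
```

Because each person acts as their own comparison, the degrees of freedom are the number of pairs minus one, not the total number of scores minus two.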

Example 2: How do placements in kinship care compare with those in non-kin foster care? (Farmer, 2009)

1. Develop a research hypothesis (or question) to test (or answer)

• Are there differences in parent-related adversities and child-related adversities of children placed with kin versus children placed with unrelated foster carers?

(Farmer, 2009)

2. Select or determine the main variable of interest to measure (data)

• Parent-related adversities (independent variable) – death of a parent; drugs misuse; mental health problems

• Child-related adversities (independent variable) – neglected; sexual abuse; domestic violence

• Placement of child (either with kin or with non-related foster carers)

3. Select your sample, gather the data and calculate descriptive statistics of the sample
• List of 2240 children
– A sample of 270 children was selected, just over half of whom (53%) were placed with family or friends and just under half (47%) with unrelated foster carers.

• Reviewed case files, interviews using semi-structured interview format, standardized measures

4. Conduct a statistical test

5. Interpret the results of the study

Learning outcomes

• Are you able to:
– Define research question, research hypothesis, null hypothesis and statistically significant;
– Discuss the basic requirements for testing the difference between two means;
– Define and describe the difference between the alpha value and P value, and Type I and Type II errors;
• Advanced Study:
– Calculate the difference between the means (t-ratio) using example data through advanced study

Activity

Ask the students to read the following case scenario and tasks:

Andrew (7 years old) was referred by his social worker to a domestic abuse intervention programme run by a local organisation. The social worker referred Andrew to the programme because he had witnessed domestic abuse between his mother and father and was presenting with behavioural and emotional difficulties. Andrew was described as “withdrawn, easily tearful, and reluctant to engage with his fellow classmates at school”. The intervention programme consisted of 10 weeks of arts-based group work where 5–10 children (ages 6–11) met for 1½ hours per week. The programme uses the Strengths & Difficulties Questionnaire (SDQ) to assess the participants on behavioural and emotional aspects. The social worker wishes to evaluate whether the programme is effective in order to decide whether to continue using this service in the future.

Based on the above scenario, you are tasked with designing an evaluation of the domestic abuse intervention programme. Start by exploring the Strengths & Difficulties Questionnaire by accessing the forms and scoring sheets on the website: http://www.sdqinfo.com/ and then answer the following questions:

• What is your hypothesis and null hypothesis?
• Which form(s) will you use for the evaluation and why?
• Who will fill out the form(s) and when?
• How will you determine whether the intervention was successful? In particular, consider the type of statistical test(s) you will use and how you will interpret the results.

References

• Carpenter, J., Shardlow, S.M., Patsios, D., & Wood, M. (2013). Developing the confidence and competence of Newly Qualified Child and Family Social Workers in England: Outcomes of a National Programme, British Journal of Social Work, 1-24. – Lecturers could use this article as an example of hypothesis testing and comparing the mean between two samples.

• Farmer, E. (2009). How do placements in kinship care compare with those in non-kin foster care: Placement patterns, progress and outcomes? Child & Family Social Work, 14, 331-342.

• Strengths and Difficulties Questionnaire (SDQ) webpage. (nd). Information for researchers and professionals about the Strengths & Difficulties Questionnaires. Retrieved 11 January 2014, from http://www.sdqinfo.com/.

• Webber, M. (2011). Evidence-Based Policy and Practice in Mental Health Social Work, 2nd edn. Exeter: Learning Matters. – Lecturers should refer to Chapter 11: Demystifying p-values: a user-friendly introduction to statistics used in mental health research.
