Making Social Work Count Lecture 9
An ESRC Curriculum Innovation and Researcher Development Initiative
Testing for statistical significance
Is there a real difference?
Learning outcomes
• At the end of this session, you should be able to:
– Define research question, research hypothesis, null hypothesis and statistically significant;
– Discuss the basic requirements for testing the difference between two means;
– Define and describe the difference between the alpha value and P value, and Type I and Type II errors.
• Advanced Study:
– Calculate the difference between the means (t-ratio) using example data through advanced study.
1. theorise and hypothesise
2. collect sample data
3. test
4. determine the likelihood the hypothesis is true
1. theorise and hypothesise
2. test
3. decide and predict
Taking decision-making a step further
Research Questions
What kinds of research questions might we want to ask in social work?
Research questions
1. Do children exposed to domestic violence experience more mental ill health than children who have not been exposed?
2. Is smacking an effective behavioural management tool?
3. Are young offenders on non custodial sentences less likely to offend than young offenders on custodial sentences?
4. Is parenting capacity reduced when parents misuse drugs or alcohol?
5. Do children in kinship care have better outcomes than children with unrelated foster carers?
Hypotheses: Research and Null
Research hypotheses
1. Children exposed to domestic violence experience more mental ill health than children who have not been exposed
2. Smacking is an effective behavioural management tool
3. Young offenders on non custodial sentences are less likely to offend than young offenders on custodial sentences
4. Parenting capacity is reduced when parents misuse drugs or alcohol
5. Children in kinship care have better outcomes than children with unrelated foster carers
Making statements for testing
Research hypothesis
• A proposed explanation for a phenomenon that can be tested
• There is a relationship between two measured variables
• A particular intervention makes a difference/has an effect
Example: Smacking is an effective behavioural management tool
Null hypothesis
• The opposite position of the hypothesis
• There is no relationship between two measured variables
• The particular intervention does not make a difference/has no effect
Example: Smacking is not an effective behavioural management tool
Accepting the research hypothesis
• Accepting the research hypothesis says that the difference between sample means is too large to be accounted for by sampling error, and, therefore, reflects differences within or between populations.
Research questions and hypotheses in Social Work
• Does a particular intervention work?
– Have negative symptoms reduced after the intervention? (e.g. depression; anxiety; number of hours of care)
– Have positive attributes of service users increased after the intervention? (e.g. self-esteem; optimism; hope; fewer hours of care)
• Are there differences in treatment outcomes based on gender, ethnicity, age?
Singing and Mental Health: An example of hypothesis testing
Does singing have an effect on mental health?
• Study looking at the effect of choral singing on mental health (N=218)
• Research hypothesis – Participation in choral singing will result in changes in the participant’s emotional state.
• Null hypothesis – Participation in choral singing will result in no changes in the participant’s emotional state.
• Measured aspects of one’s emotional state before and after the choral singing intervention
• Found that positive emotions increased (excited, alert, hopeful, relaxed, harmonic, to be present) and negative emotions decreased (pain, headache). Also found that participants felt more together and less alone.
Does singing have an effect on mental health?
• Are these findings true or did the differences in emotional states just happen by chance?
• How confident are we that there was a statistically significant difference in emotional state pre and post the choral singing?
Statistically Significant
• The researchers would have employed statistical tests to determine the extent to which they are confident that the results accurately reflect what occurs in the population (a real population difference) versus merely occurring by chance (or sampling error):
– t-test
• Based on the outcome of the test, the researchers can determine whether the results are statistically significant. That is, the outcome is unlikely to have occurred by chance.
The basic requirements for testing the difference between two means
Basic requirements in hypothesis testing/answering research questions
1. Develop a research hypothesis to test or a research question to answer
2. Select or determine the main variable of interest to measure (data)
• This could involve using a standardised measure (e.g. Strengths and Difficulties questionnaire for children; parenting capacity measure) or any other numerical value (e.g. number of units of alcohol or drugs used; number of weeks in foster care)
3. Select your sample(s), gather the data, and calculate the descriptive statistics of the sample(s)
4. Conduct a statistical test of difference between the means (or other descriptive statistics, such as percentages) (t-test; chi-square; ANOVA; Fisher’s exact test)
• This involves pre-determining the alpha value
5. Interpret whether the results are statistically significant
An example of determining differences between hope amongst social work students
A “hopeful” example of how to test for differences
1. Develop a research question to answer or a research hypothesis to test:
• Research Question – Is there a difference in level of hope between first year and final year social work students?
• Research Hypothesis – First year social work students will have higher levels of hope than final year social work students.
• Null Hypothesis – There is no difference in level of hope between first year and final year social work students.
A “hopeful” example of how to test for differences
2. Select an instrument or measure of hope:
• The Trait Hope Scale (Snyder et al., 1991)
Each question is rated: 1 = Definitely False; 2 = Mostly False; 3 = Mostly True; 4 = Definitely True
1. I can think of many ways to get out of a difficult situation.
2. I energetically pursue my goals.
3. I feel tired most of the time.
4. There are lots of ways around any problem.
5. I am easily downed in an argument.
6. I can think of many ways to get the things in life that are most important to me.
7. I worry about my health.
8. Even when others get discouraged, I know I can find a way to solve the problem.
9. My past experiences have prepared me well for my future.
10. I’ve been pretty successful in life.
11. I usually find myself worrying about something.
12. I meet the goals that I set for myself.
The Trait Hope Scale
• Sums the scores of items 1, 2, 4, 6, 8, 9, 10 and 12 (the other items are fillers and do not contribute to the overall score).
• Scores can range from 8 to 32, with higher scores indicating higher levels of hope.
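The scoring rule can be sketched in a few lines of Python. This is only an illustration of the rule as stated on the slide; the function name `trait_hope_score` and the example respondent are invented for this sketch and do not come from the scale's authors.

```python
# Items that count toward the total, as listed on the slide (1-based numbering).
SCORED_ITEMS = (1, 2, 4, 6, 8, 9, 10, 12)


def trait_hope_score(responses):
    """Return the total hope score from 12 item responses, each coded 1-4."""
    if len(responses) != 12:
        raise ValueError("expected 12 item responses")
    if any(r not in (1, 2, 3, 4) for r in responses):
        raise ValueError("each response must be 1, 2, 3 or 4")
    # Filler items (3, 5, 7 and 11) are skipped.
    return sum(responses[i - 1] for i in SCORED_ITEMS)


# Hypothetical respondent, invented for illustration only.
example_responses = [4, 3, 2, 4, 1, 3, 2, 4, 3, 3, 2, 4]
print(trait_hope_score(example_responses))
```

Answering "Definitely False" to every item gives the minimum of 8, and "Definitely True" to every item gives the maximum of 32.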
A “hopeful” example of how to test for differences
3. Select a random sample of students from the first year cohort and final year cohort (N = 30 for each cohort):
• Administer the Trait Hope Scale to the sample of first and final year students.
• Calculate the mean scores for first and final year students.
First Year Social Work Students (N=30)    Final Year Social Work Students (N=30)
30  25                                    18  20
25  31                                    20  28
31  30                                    25  24
24  32                                    19  23
30  29                                    23  22
25  26                                    24  28
24  29                                    25  30
22  30                                    26  25
18  27                                    30  23
20  28                                    18  24
31  23                                    15  27
32  32                                    16  18
28  29                                    14  19
30  30                                    18  20
29  31                                    30  15
Is there a difference in level of hope between first and final year students?
• Calculate the mean scores for the two groups of students:
– First Years = 27.7
– Final Years = 22.2
• There is a mean difference of 5.5
• Can we conclude that first years are actually more hopeful than final years?
• Could the difference be a result of sampling error and, therefore, due to chance and chance alone?
Descriptive Statistics
                                              N    Min.   Max.   Mean    Std. Deviation
Overall hope score for first year students    30   18     32     27.70   3.706
Overall hope score for final year students    30   14     30     22.23   4.688
Valid N (listwise)                            30
Output #1: The mean hope score for the first and final year students
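The figures in Output #1 can be reproduced from the raw scores with Python's standard library. This is a minimal sketch: it assumes the first two columns of the data slide belong to the first years and the last two to the final years, and the helper name `describe` is invented here.

```python
import statistics

# Scores from the data slide (assumed split: first two columns are first
# years, last two columns are final years).
first_year = [30, 25, 31, 24, 30, 25, 24, 22, 18, 20, 31, 32, 28, 30, 29,
              25, 31, 30, 32, 29, 26, 29, 30, 27, 28, 23, 32, 29, 30, 31]
final_year = [18, 20, 25, 19, 23, 24, 25, 26, 30, 18, 15, 16, 14, 18, 30,
              20, 28, 24, 23, 22, 28, 30, 25, 23, 24, 27, 18, 19, 20, 15]


def describe(scores):
    """Return N, minimum, maximum, mean and sample standard deviation."""
    return {
        "n": len(scores),
        "min": min(scores),
        "max": max(scores),
        "mean": statistics.mean(scores),
        "sd": statistics.stdev(scores),  # sample sd (divides by N - 1)
    }


for label, scores in (("First years", first_year), ("Final years", final_year)):
    d = describe(scores)
    print(f"{label}: N={d['n']} Min={d['min']} Max={d['max']} "
          f"Mean={d['mean']:.2f} SD={d['sd']:.3f}")
```

Under that column assumption the computed values agree with the table, which suggests the split is the intended one.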
[Figure: The distribution of hope scores for first and final year students, with the mean of each group marked]
Standard Deviation: A measure of variability
• The standard deviation (sd) reflects the typical deviation from the mean
• Describes the variability of the scores
– how close are the scores or how far apart are the scores?
• The sds for the hope scores were as follows:
– sd (first years) = 3.706
– sd (final years) = 4.688
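For reference, the sample standard deviation can be written as below. The slides do not give the formula; the version shown is the usual N − 1 form, which is assumed here because it reproduces the reported value for the first years.

```latex
s = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}}
\qquad\text{e.g. first years: } s = \sqrt{\frac{398.3}{29}} \approx 3.706
```

Here 398.3 is the sum of squared deviations of the 30 first-year scores from their mean of 27.7.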
Standard Deviation: An example of great variability and no variability
No variability: N = 30; Min = 30; Max = 30; sd = .000
Great variability: N = 30; Min = 15; Max = 30; sd = 6.226
A “hopeful” example of how to test for differences
4. Conduct a statistical test of significance (t-test) between the first year hope score mean and the final year hope score mean.
– Determine the level of significance or α (alpha value)
What level of significance?
• Statistically significant difference – The differences observed in a sample reflect a real population difference and are not a result of sampling error.
• In order to determine if something is statistically significant, you must establish a level of significance (represented by the Greek letter α (alpha)).
Probability (P) and alpha value (α)
• α = the level of probability at which the null hypothesis can be rejected with confidence and the research hypothesis accepted with confidence
• We symbolise the probability as P < .05
• It is convention to use the α = .05 level of significance, but significance levels can be set up for any degree of probability
– α = .01 level of significance; P < .01
• We reject the null hypothesis if the P value is less than the alpha value and otherwise retain it.
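The decision rule in the last bullet can be written out directly. This is an illustrative sketch only; the function name `decide` and the example P values are invented for this purpose.

```python
def decide(p_value, alpha=0.05):
    """Apply the rule: reject the null hypothesis only if P < alpha."""
    if not 0 <= p_value <= 1:
        raise ValueError("p_value must lie between 0 and 1")
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return "reject null" if p_value < alpha else "retain null"


# Hypothetical P values, for illustration only.
print(decide(0.03))              # judged against the conventional alpha = .05
print(decide(0.03, alpha=0.01))  # same P value, stricter alpha = .01
```

The same P value can lead to different decisions depending on the alpha chosen, which is why alpha is set before the test is run.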
Output #2: The t-test for differences between two means (Independent samples test)
Hope Score
Levene’s Test for Equality of Variances: F = 2.464; Sig. = .122
t-test for Equality of Means:
                               t       df   Sig. (2-tailed)   Mean Difference   Std. Error Difference   95% Confidence Interval of the Difference (Lower, Upper)
Equal variances assumed        5.010   58   .000              5.467             1.091                   3.283, 7.651
Equal variances not assumed    5.010   55   .000              5.467             1.091                   3.280, 7.653
A “hopeful” example of how to test for differences
5. Interpret whether the results are statistically significant
– Is there a statistically significant difference between the two means?
– Can we reject the null hypothesis of no difference?
The results
• ‘a t of 5.010 with 58 degrees of freedom indicates that the difference between first and final year students in their mean hope scores is statistically significant at the .001 level.’
[Figure: the two distributions with their means marked, annotated “99% Confident (1 chance out of 100 of a Type I error)”]
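The "equal variances assumed" figures in Output #2 can be checked by hand from the raw scores. The sketch below uses only the standard library and assumes the same column split as the data slide (first two columns first years, last two final years). The critical value 2.0017 for 58 degrees of freedom is taken from standard t tables and is not stated in the slides; the function name `independent_t` is invented here.

```python
import math
import statistics

first_year = [30, 25, 31, 24, 30, 25, 24, 22, 18, 20, 31, 32, 28, 30, 29,
              25, 31, 30, 32, 29, 26, 29, 30, 27, 28, 23, 32, 29, 30, 31]
final_year = [18, 20, 25, 19, 23, 24, 25, 26, 30, 18, 15, 16, 14, 18, 30,
              20, 28, 24, 23, 22, 28, 30, 25, 23, 24, 27, 18, 19, 20, 15]


def independent_t(a, b):
    """Pooled-variance (equal variances assumed) independent samples t-test.

    Returns (t, df, mean difference, standard error of the difference).
    """
    n1, n2 = len(a), len(b)
    mean_diff = statistics.mean(a) - statistics.mean(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * statistics.variance(a)
                  + (n2 - 1) * statistics.variance(b)) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return mean_diff / se, df, mean_diff, se


t, df, mean_diff, se = independent_t(first_year, final_year)

# Assumed two-tailed critical value of t for alpha = .05 and df = 58.
T_CRIT_05 = 2.0017
ci = (mean_diff - T_CRIT_05 * se, mean_diff + T_CRIT_05 * se)

print(f"t = {t:.3f}, df = {df}, mean difference = {mean_diff:.3f}, SE = {se:.3f}")
print(f"95% CI of the difference: {ci[0]:.3f} to {ci[1]:.3f}")
```

Because the observed t is well beyond the critical value, the null hypothesis of no difference is rejected at the .05 level. The exact P value requires the t distribution, which statistical packages such as SPSS compute for you.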
Have we made the right decision: Type I and Type II errors
Type I and Type II errors
• Setting a level of significance does not mean we will always be able to say with 100% confidence that we are correct in accepting or rejecting the null hypothesis.
– There is a 5 in 100 chance (P = .05) of being wrong with the α = .05 level of significance, and a 1 in 100 chance (P = .01) of being wrong with the α = .01 level of significance.
Type I and Type II errors
• Type I error (α)
– Rejecting the null hypothesis (stating there IS a difference between means) when we should have retained it (there IS NOT a difference between means).
– The more stringent our level of significance, the less likely we are to make a Type I error
• Type II error (β)
– Retaining the null hypothesis (stating there IS NOT a difference between means) when we should have rejected it (there IS a difference between means)
– Increasing the size of samples reduces the chance of a Type II error
Type I and Type II errors
                            DECISION
REALITY                     Retain null hypothesis                 Reject null hypothesis
Null hypothesis is true     Correct Decision                       Type I error, P (Type I error) = α
Null hypothesis is false    Type II error, P (Type II error) = β   Correct Decision
* Levin, J., & Fox, J.A. (2003). Elementary statistics in social research, (9th ed.). Boston: Allyn and Bacon.
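A small simulation can make the two error types concrete. This is a sketch under stated assumptions, not part of the lecture: scores are drawn from normal populations with a standard deviation of 1, the "true difference" of 0.5 is arbitrary, the critical values come from standard t tables, and the names `t_statistic` and `rejection_rate` are invented here.

```python
import math
import random
import statistics

# Two-tailed critical values of t at alpha = .05, from standard t tables,
# keyed by degrees of freedom (df = 2n - 2 for two groups of size n).
T_CRIT = {18: 2.101, 58: 2.002}


def t_statistic(a, b):
    """Pooled-variance t for two independent samples."""
    n1, n2 = len(a), len(b)
    pooled = ((n1 - 1) * statistics.variance(a)
              + (n2 - 1) * statistics.variance(b)) / (n1 + n2 - 2)
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (statistics.mean(a) - statistics.mean(b)) / se


def rejection_rate(true_diff, n, trials, rng):
    """Share of simulated studies in which the null hypothesis is rejected."""
    crit = T_CRIT[2 * n - 2]
    rejections = 0
    for _ in range(trials):
        a = [rng.gauss(true_diff, 1.0) for _ in range(n)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if abs(t_statistic(a, b)) > crit:
            rejections += 1
    return rejections / trials


rng = random.Random(2014)  # fixed seed so the run is repeatable

# Null hypothesis true (no real difference): every rejection is a Type I error.
type_i_rate = rejection_rate(0.0, 30, 2000, rng)

# Null hypothesis false (real difference of 0.5): every retention is a Type II error.
type_ii_small = 1 - rejection_rate(0.5, 10, 2000, rng)
type_ii_large = 1 - rejection_rate(0.5, 30, 2000, rng)

print(f"Type I error rate (n = 30 per group): {type_i_rate:.3f}")
print(f"Type II error rate (n = 10 per group): {type_ii_small:.3f}")
print(f"Type II error rate (n = 30 per group): {type_ii_large:.3f}")
```

The Type I error rate should land close to the chosen α of .05, and the Type II error rate should fall as the sample size grows, which is the point made on the slide about increasing the size of samples.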
Testing the difference between means
Type of Test: When to use (examples)
Independent samples t-test: To test whether the means of two unrelated samples differ
• Testing whether hope scores differ by year in the social work programme
• Testing whether IQ scores differ by sex
Paired-samples t-test: Used with matched-pair data where the research question calls for the repeated measurement of responses from the same individual
• Testing whether hope scores increase for individuals from before a motivation course to after the course has taken place
• Testing whether depression scores decrease for individuals from before to after therapy
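To show how the paired-samples test differs from the independent samples test, here is a minimal sketch that works on the difference scores of the same individuals. The before and after scores are entirely hypothetical, invented for illustration, and the function name `paired_t` is not from the lecture.

```python
import math
import statistics

# Hypothetical hope scores for the same eight people before and after a
# motivation course. These numbers are made up for illustration only.
before = [20, 22, 19, 24, 18, 21, 23, 20]
after = [23, 24, 19, 27, 20, 22, 26, 23]


def paired_t(before_scores, after_scores):
    """Paired-samples t-test on the per-person differences (after - before).

    Returns (t, degrees of freedom).
    """
    if len(before_scores) != len(after_scores):
        raise ValueError("each person needs a before and an after score")
    diffs = [a - b for b, a in zip(before_scores, after_scores)]
    n = len(diffs)
    # Standard error of the mean difference.
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1


t, df = paired_t(before, after)
print(f"t = {t:.3f}, df = {df}")
```

The key design point is that each person acts as their own comparison, so the test uses one column of differences and N − 1 degrees of freedom rather than two separate groups.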
Example 2: How do placements in kinship care compare with those in non-kin foster care? (Farmer, 2009)
1. Develop a research hypothesis (or question) to test (or answer)
• Are there differences in parent-related adversities and child-related adversities of children placed with kin versus children placed with unrelated foster carers?
(Farmer, 2009)
2. Select or determine the main variable of interest to measure (data)
• Parent-related adversities (independent variable) – death of a parent; drugs misuse; mental health problems
• Child-related adversities (independent variable) – neglected; sexual abuse; domestic violence
• Placement of child (either with kin or with non-related foster carers)
3. Select your sample, gather the data and calculate descriptive statistics of the sample
• List of 2240 children
– A sample of 270 children were selected, just over half of whom (53%) were placed with family or friends and just under half (47%) with unrelated foster carers.
• Reviewed case files, interviews using semi-structured interview format, standardized measures
4. Conduct a statistical test
5. Interpret the results of the study
Learning outcomes
• Are you able to:
– Define research question, research hypothesis, null hypothesis and statistically significant;
– Discuss the basic requirements for testing the difference between two means;
– Define and describe the difference between the alpha value and P value, and Type I and Type II errors.
• Advanced Study:
– Calculate the difference between the means (t-ratio) using example data through advanced study
Activity
Ask the students to read the following case scenario and tasks:
Andrew (7 years old) was referred by his social worker to a domestic abuse intervention programme run by a local organisation. The social worker referred Andrew to the programme because he had witnessed domestic abuse between his mother and father and was presenting with behavioural and emotional difficulties. Andrew was described as “withdrawn, easily tearful, and reluctant to engage with his fellow classmates at school”. The intervention programme consisted of 10 weeks of arts-based group work where 5-10 children (ages 6 – 11) met for 1½ hours per week. The programme uses the Strengths & Difficulties Questionnaire (SDQ) to assess the participants on behavioural and emotional aspects. The social worker wishes to evaluate whether the programme is effective to determine whether to continue to use this service in the future.
…Continued
Based on the above scenario, you are tasked with designing an evaluation of the domestic abuse intervention programme. Start by exploring the Strengths & Difficulties Questionnaire by accessing the forms and scoring sheets on the website: http://www.sdqinfo.com/ and then answer the following questions:
• What is your hypothesis and null hypothesis?
• Which form(s) will you use for the evaluation and why?
• Who will fill out the form(s) and when?
• How will you determine whether the intervention was successful? In particular, consider the type of statistical test(s) you will use and how you will interpret the results.
References
• Carpenter, J., Shardlow, S.M., Patsios, D., & Wood, M. (2013). Developing the confidence and competence of Newly Qualified Child and Family Social Workers in England: Outcomes of a National Programme, British Journal of Social Work, 1-24.
– Lecturers could use this article as an example of hypothesis testing and comparing the mean between two samples.
• Farmer, E. (2009). How do placements in kinship care compare with those in non-kin foster care: Placement patterns, progress and outcomes? Child & Family Social Work, 14, 331-342.
• Strengths and Difficulties Questionnaire (SDQ) webpage. (nd). Information for researchers and professionals about the Strengths & Difficulties Questionnaires. Retrieved 11 January 2014, from http://www.sdqinfo.com/.
• Webber, M. (2011). Evidence-Based Policy and Practice in Mental Health Social Work, 2nd edn. Exeter: Learning Matters.
– Lecturers should refer to Chapter 11: Demystifying p-values: a user-friendly introduction to statistics used in mental health research.