Author’s response to reviews
Title: A tailored intervention to implement guideline recommendations for elderly patients with
depression in primary care: a pragmatic cluster randomised trial
Authors:
Eivind Aakhus
Ingeborg Granlund
Jan Odgaard-Jensen
Andrew Oxman
Signe Flottorp
Version: 1 Date: 23 Feb 2016
Author’s response to reviews:
Response to reviewers
We want to thank the reviewers for their thorough review and excellent suggestions to improve
our paper. For improved readability please refer to the attached
• Answers written in italic are direct responses to the reviewer’s questions for
clarification, and may, but need not, have led to changes in the manuscript
• We provide two versions of the reviewed paper, one with track changes and one without
Reviewer #1:
Abstract
• Clearly written.
o Thank you
• From the background it is not clear which group of stakeholders the interventions/study is
targeting.
o We changed the background section: “In Norway we tested this approach to improve
adherence to six recommendations for elderly patients with depression, targeting healthcare
professionals, patients and administrators”.
• I suggest broadening the discussion with relating the findings to the recommendations and care
for elderly with depressive disorder.
o We added in the discussion: It remains uncertain how best to improve adherence to evidence-
based recommendations and thereby improve the quality of care for these patients.
• There are some abbreviations that should be explained (CME, GP etc).
o We have written continuous medical education in full. We have omitted the abbreviation GP,
and written “general practitioner” throughout the paper
Main text
Background
There is no explicit record of the main research question(s) and hypothesis. Especially when
investigating and reporting the effectiveness of an intervention, it is very important as a reader to
understand what is exactly the research aim and how this relates to the existing knowledge. This
is important for the reader to be able to review the methods for its appropriateness and
subsequent reporting of results and drawing conclusions. As this is not described in the protocol
either, or not in standardised terms (e.g. PICO), I suggest to elaborate on this and advise to refer
to the CONSORT statement for guidelines.
o We have added the following sentence at the end of the background: “The objective of this
study was to evaluate the effectiveness of tailored interventions to implement those six
recommendations.” Our main research question was: Does a tailored implementation strategy
increase the extent to which general practitioners adhere to evidence-based recommendations for
managing elderly patients with depression compared to no intervention.
Methods - design
• It would benefit from clarity to explain the rationale for doing a pragmatic RCT and what in
this study 'pragmatic' entails.
o We have added the following: The trial was pragmatic in that it attempted to answer a practical
management question in normal practice, including all general practices in the targeted
municipalities and all patients with the targeted condition, the intervention was delivered
flexibly, and the primary outcome measure was clinically meaningful [20, 21].
20. Zwarenstein M, Treweek S, Gagnier JJ, Altman DG, Tunis S, Haynes B et al. Improving the
reporting of pragmatic trials: an extension of the CONSORT statement. BMJ. 2008;337:a2390.
doi:10.1136/bmj.a2390.
21. Thorpe KE, Zwarenstein M, Oxman AD, Treweek S, Furberg CD, Altman DG, et al. A
pragmatic-explanatory continuum indicator summary (PRECIS): a tool to help trial designers. J
Clin Epidemiol 2009; 62:464-75.
• As power is an issue in this study, the manuscript would gain in strength if the authors explain
why less than 20% of the municipalities were randomised. This also relates to a comment
concerning the power calculations where 50% GPs responding seems to be overestimated. I
suggest not only referring to an annex to the protocol paper, but including the main arguments in
this manuscript as well. This also relates to a suggestion for the analyses and presentation of
results.
o We have added the following under “design”: We included 80 municipalities due to the
required number of general practitioners that should be included in the study. The municipalities
were selected from seven of 19 counties in Norway for practical reasons (geographical access by
the research team) and because they represented both urban and non-urban and large and small
municipalities. Under “Sample size” we added: The assumed 50% participation in data collection
was based on two previous studies in Norwegian primary care [33, 34].
33. Fretheim A, Oxman AD, Havelsrud K, Treweek S, Kristoffersen DT, Bjorndal A. Rational
prescribing in primary care (RaPP): a cluster randomized trial of a tailored intervention. PLoS
Med. 2006;3(6):e134. doi:10.1371/journal.pmed.0030134.
34. Flottorp S, Oxman AD, Havelsrud K, Treweek S, Herrin J. Cluster randomised controlled
trial of tailored interventions to improve the management of urinary tract infections in women
and sore throat. BMJ. 2002;325(7360):367.
Participants, inclusion and eligibility criteria
• I do not understand why patients assumed to have a depressive disorder diagnosed using the
ICD10 are included whereas those who do not have an ICD10 diagnosis are excluded. Could you
please elaborate.
o This was mentioned by both reviewers. We agree that this sentence is not clear and we have
omitted “or assumed to have a diagnosis of depression”. We used diagnostic criteria from ICD10
to assess type of depression (single episode, recurrent or dysthymia) and severity of episode
(mild, moderate, severe).
• It might be helpful to explain to what extent the patient identification software did not function
and why as identification of patients manually by GPs might be an important factor in
participation rates. See also below.
o This is also mentioned by both reviewers and we have elaborated in the text: “If the software
did not work (e.g. installation of software was denied due to security systems, such as a firewall,
or a patient list was not produced), we asked the general practitioners to think of elderly patients
with depression from their list. Approximately one out of five general practitioners experienced
some problems while installing the software. Nearly all of these general practitioners whom we
interviewed were able, nonetheless, to identify elderly patients with depression from their
practice.” We have also given a more detailed description of the software: We developed
software for the five electronic journal systems used by general practitioners in Norway. The
software was either available from the website and could be downloaded and installed by the
practitioner prior to the interview, or was sent on a memory stick beforehand.
Interventions
• One question I have is to what extent the recommendations as part of the intervention are
conceived by GPs as evidence-based and more importantly, as clinical guidelines in treating
elderly with depressive disorder in Norway? Is there a possibility in the data to answer this
question?
o This question will be addressed in the process evaluation which will be submitted separately
later this year. In this evaluation we interviewed 20 GPs from the intervention group. Although
this small group can hardly be claimed to be representative of all the GPs in the intervention
group, there was little disagreement with the recommendations or the evidence base. What
disagreement there was mostly concerned the recommendation for a depression case manager.
Prior to the study, the recommendations were discussed with various stakeholders in the
reference group, which included a representative from the Norwegian College of General
Practice (NFA). We presented the recommendations at outreach visits as part of the intervention,
and our clear impression was that the general practitioners accepted the recommendations, with
the same exception regarding depression case managers.
Outcome measures
• The first sentence (lines 14-17) is very difficult to read.
o We have rewritten the description of the primary outcome: The primary outcome was the
proportion of the six recommendations adhered to by the general practitioners. We measured this
outcome by calculating a single measurement for each general practitioner based on the mean
adherence across the six recommendations.
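As an illustration only, the primary outcome described above could be computed along the following lines. This is a minimal sketch, assuming adherence is recorded as 0/1 for each assessed patient and recommendation, and that patients are averaged within each recommendation before averaging across recommendations. The function name, the data layout and the order of averaging are illustrative assumptions and are not taken from the manuscript or the protocol.

```python
from statistics import mean


def gp_adherence(adherence_by_recommendation):
    """Return one adherence value (0..1) for a single general practitioner.

    adherence_by_recommendation: dict mapping a recommendation label to a
    list of 0/1 values, one per assessed patient (hypothetical layout).
    Recommendations without any assessed patient are skipped.
    """
    per_recommendation = [
        mean(values)
        for values in adherence_by_recommendation.values()
        if values
    ]
    # No applicable recommendation means no outcome value for this GP.
    return mean(per_recommendation) if per_recommendation else None


# Hypothetical GP: adherence assessed for three of the recommendations.
example = {"rec1": [1, 0], "rec2": [1], "rec3": [0, 0], "rec4": []}
print(gp_adherence(example))
```

Other averaging orders (for example pooling all patient-level assessments first) would give different values when the number of patients differs between recommendations, so the choice would need to follow the protocol.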
Randomisation
• Please explain the cut-off values for inhabitants?
o The cut-off value for large and small municipalities was practical, based on data from Statistics
Norway to ensure that large municipalities (with the most dominant urban characteristics) were
fairly distributed across the two groups. We elaborated in the “Recruitment …” section: There
are few large municipalities in Norway. There is generally more access to cognitive behavioural
therapy and other health and social services in large municipalities. Because we believed that
there might be an association between access to health and social services and the primary
outcome measure, we stratified the randomisation based on the size of the municipality to ensure
that large municipalities were evenly distributed in the intervention and comparison arms of the
trial. Furthermore, we stratified the randomisation on the variable “Proportion of inhabitants 80
years or older”, acknowledging the increased prevalence of depression among the oldest, which
may influence the general practitioners’ experience with and skills of managing this patient
group. The choice of the cut-off (> 25,000 inhabitants or designated city status by Statistics
Norway) was based on data from Statistics Norway, reported in Appendix C of the protocol.
• Why was it not possible to blind the researchers?
o Throughout data collection the researchers were constantly in contact with the GPs that were
invited to participate and coordinated schedules for interviews, making appointments and
establishing contact between interviewer and interviewee and assisting them when the
installation of software failed. To ensure blinding, all this work would have had to be done by others,
which was outside the financial scope of our project. On the other hand, as stated in the text, the
statistician who performed the analyses was blinded to the randomisation when analysing
primary and secondary outcomes.
Statistical methods
• In light of the ITT analyses, could you please include a description on how possible missing
data is accounted for?
o We did not attempt to impute values for missing data. We have changed the sentence
accordingly: We performed the analysis as an intention-to-treat analysis based on the available
data; all general practitioners were analysed in the group to which they were assigned, regardless
of whether they used the interventions we offered or not. We did not impute values for missing
data.
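To make the available-case, intention-to-treat principle concrete, a minimal sketch follows. It only shows the grouping logic (analyse by assigned arm, skip missing outcomes, impute nothing). It deliberately ignores clustering by municipality, which a real analysis of a cluster randomised trial has to account for, and it is not the statistical model used in the trial. All field names are hypothetical.

```python
from statistics import mean


def available_case_itt(gps):
    """Mean adherence per assigned arm, using only GPs with outcome data.

    gps: list of dicts with keys 'assigned' (arm label), 'adherence'
    (float or None when missing) and 'used_intervention' (bool).
    'used_intervention' is intentionally ignored: GPs stay in the arm
    to which they were assigned.
    """
    by_arm = {}
    for gp in gps:
        if gp["adherence"] is None:
            continue  # missing outcome: excluded, not imputed
        by_arm.setdefault(gp["assigned"], []).append(gp["adherence"])
    return {arm: mean(values) for arm, values in by_arm.items()}
```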
Data collection
• Please explain why the number of patients discussed with the GP is lower than described in the
protocol (resp. 4 versus 6 in the protocol).
o Due to the extensive interview that we planned for each patient, we calculated that
interviewing the GP about six of his/her patients would take approximately two hours. Feedback
from GPs that we met during outreach visits prior to data collection, after writing the protocol,
convinced us that it was unrealistic to expect GPs to offer so much of their time to participate in
the interview. By narrowing the extent of the interview and the number of patients, we planned to
conduct each interview within a time frame of one hour, which we expected to be more realistic.
Results
• To identify the patients, specialized software is used to extract patient files from the GPs
administrations. The authors note that if the software did not work, the research team asked the
GPs to identify potential participants at patient level. This is a demanding task for GPs and hence
information about the successful attempts using the software might give insight in what GPs
were lost on follow-up or not responding. I would advise the researchers to interrogate the data
on this issue and find out if the process of identification of patients might be an important
alternating factor in GPs participation rates.
o Our impression during data collection was that the general practitioners were easily able to
identify some patients from their practice, but often fewer than if the list was generated from the
software.
o We have added this to the strengths and limitations section in the Discussion: We used two
different methods, interviews and questionnaires, to obtain data from general practitioners
regarding diagnosis and management of elderly patients with depression from their patient list. It
is not possible to determine whether the two methods gave comparable results, due to the small
sample size.
Discussion
Although the authors reflect on reasons that might have caused the low response rates of the
GPs, I would expect further debate on the following issues to learn and prevent for similar
outcomes in similar studies:
• How do the GPs regard the intervention and especially the recommendations? Do they agree
with them or do they feel unsure?
o This will be addressed in the process evaluation.
• Could you elaborate on the recruitment strategy? What has been done to try to overcome the
financial, administrative and organizational burdens?
o We have elaborated: However, general practitioners do not routinely grade the severity of
depression or the type of depression using diagnostic codes. Thus, collecting data on adherence
to recommendations that apply to severity or type of depression requires an interview or
completion of a questionnaire. We have also added: Alternatively, a fee for participating, either
as part of national health authorities’ support of research in primary care or as part of the
research project funding, might have improved recruitment of participants for data collection
[48]. We included a reference to Foy and colleagues for the support of fees for participating in
research, although the evidence for these strategies is weak.
• What is the basis for the 50% response rate and thus including 'only' 80 municipalities, (besides
practical reasons)?
o Our anticipated 50% participation rate was based on two Norwegian studies in primary care,
which both are referred to in the protocol. We have now included these two references in the
text.
o Including 80 municipalities is actually quite a large sample, representing approximately one
fifth of the Norwegian population and healthcare professionals. On average there are 3-4 practices
per municipality and 3-4 GPs per practice.
Conclusions
• I suggest including more 'lessons learned' in the conclusions (and abstract) so the research
community can more easily grasp the bottlenecks in this kind of research.
o We have added the following key messages to the Conclusions:
Pragmatic trials of implementation strategies are needed to answer real world questions about
how to improve the quality of care. A key message from this trial for implementation researchers
is that access to outcome data is essential to their success. This trial included all general
practitioners in 80 municipalities representing 20% of the Norwegian population with close to
1000 general practitioners. Randomising jurisdictions or large numbers of practices without
consent, is a highly pragmatic approach to answering real world questions about how to improve
the quality of care, provided access to outcome data is ensured, for example via routinely
collected data. However, this approach proved to be fatally flawed in this trial, because
collecting outcome data required active participation of general practitioners.
A second key message is that a randomized trial is not the best study design for answering a
pragmatic question about how to improve practice when random allocation is not feasible. A
major limitation of this trial was that we were not able to include what might have been
important, effective components of our tailored implementation strategy, such as integrating our
resources in widely used electronic information sources, because we could not randomly allocate
these.
A third key message is that future research evaluating methods for tailoring implementation
strategies should directly compare tailored implementation strategies that use different methods
to tailor the interventions. It remains logical that implementation strategies should address
important barriers to implementing evidence-based recommendations. However, little is known
about how best to identify important barriers and how to select interventions to address identified
barriers [12, 13]. The TICD project conducted some ground-breaking research comparing
different methods for identifying determinants of practice [16] and for linking interventions to
those determinants [14]. However, our trials were limited to comparisons of tailored strategies to
no intervention. This research and process evaluations linked to our trials [ref] can shed some
light on why our tailored implementation strategies appeared to have, at best, modest effects.
However, there is a need for explanatory trials designed to answer questions about how best to
tailor implementation strategies, as well as pragmatic trials to answer real world questions about
how to improve practice.
A key message for general practitioners and policymakers is that to answer important questions
about how to improve practice, general practitioners need to have time, resources and structures,
such as research networks, fees for participating in prioritised research or learning health systems
[ref].
Reviewer #2
Reviewer’s report
General comments
The study is of interest but underpowered and as a consequence there is no answer to the
question what the effectiveness of tailored interventions are to improve care for elderly patients
with depression. I doubt if the lessons learnt are new compared to the literature and add value to
new research. However, the design and the tailored interventions are very interesting.
I do not feel adequately qualified to assess the statistics. My advice is to ask a statistician to carry
out the statistical review.
Major compulsory revisions
Methods
Participants, inclusion and eligibility criteria, p.5
The study was part of the international research project Tailored Implementation for Chronic
Diseases (TICD). The aim of the TICD project was to develop valid and efficient methods of
tailoring implementation interventions to determinants of practice for knowledge implementation
in chronic illness care. In the study regarding depression, included were patients with a diagnosis
of mild, moderate, severe or recurrent depression or assumed to have a diagnosis of depression
according to standardised ICD 10 criteria.
• Just wondering, why did the researchers not choose for patients who were diagnosed with a
persistent (chronic) depression? The categories of mild, moderate and severe depressive episodes
are used only for a single (first) depressive episode. As the authors mention on page 10:
“Learning more about effective strategies to improve healthcare for patients with chronic
diseases is important”.
o We did include patients with dysthymia and recurrent depression in addition to patients with
a depressive episode. This is clarified now, thank you! The term chronic depression is
frequently used in clinical discussions and serves as a specifier (a code that gives
additional information to the main diagnosis) to the diagnostic code of Major depression in
DSM-IV-TR, but it is not included in the ICD10. We have therefore omitted the term chronic
depression throughout the text for clarity. In DSM-IV terminology the term implies any
degree of depression severity lasting more than 2 years; the ICD10 term dysthymia, which
represents a long-standing sub-syndromal depressive state, resembles the DSM term.
• In addition, can you clarify what is meant by ‘assumed to have a diagnosis of depression’? Can
you describe how this is done?
o This was mentioned by both reviewers. We agree that this sentence is not clear and we have
omitted “or assumed to have a diagnosis of depression”. We used diagnostic criteria from ICD10
to assess type of depression (single episode, recurrent or dysthymia) and severity of episode
(mild, moderate, severe) and included only patients who fulfilled the diagnostic criteria.
Patients were included according to standardised ICD-10 criteria. Eligible patients were
identified from the GPs' patient lists using software that extracted information from the GPs’
electronic medical records, based on an algorithm of ICPC-2 diagnostic codes, ICPC-2
diagnostic text, free text, prescription of antidepressants and billing codes.
• Can you clarify what the relation is between the ICD-10 and the ICPC-2? Do GPs record both
ICD-10 and ICPC-2? Or was the ICPC-2 used as a first step in inclusion, and the ICD-10 as
second step?
o The reviewer’s assumption is correct; the ICPC-2 classification system is used by Norwegian
GPs. The GPs very rarely use ICPC-2 to specify depression severity and depression type. Thus
we used the ICPC-2 code as one of the items in a checklist to identify eligible patients from the
patient list. To check whether practice was in accordance with the recommendations
(which addressed different types and severities of depression), we used ICD10 criteria to (1) ascertain
that the patient could be included and (2) establish the nature of the patient’s depression.
• Which terms were used from the free text?
o This is reported in the additional file in the protocol (see text box). These terms are Norwegian
terms that might indicate an assessment of depression, but not necessarily. As we described in
the protocol, various assessments, including ICPC2-codes, text excerpts from the electronic
medical journal, prescription of antidepressants, and additional fees each yielded a score.
Patients with the highest score (maximum 6, minimum 0; the P76 Depression code in ICPC2
was automatically given the highest score) were presented first on the list that was shown to the
GP on the screen after installing the software at the start of the interview. The search terms were:
Deprimert, depres* (depresjon, depressiv, depresjonsskjema), deppa, nedfor, nedstemt*
(nedstemt, nedstemthet) trist* (trist, tristhet, tristesse), sorgtung, svartsyn, håpløs* (håpløs,
håpløst, håpløshet), tungsinn, mismot, ulykkelig, MADRS
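The ranking described above can be sketched roughly as follows. Only three things are taken from the text: the score runs from 0 to 6, a P76 code automatically gives the maximum, and patients are listed in descending order of score. The search terms are those listed above, with starred terms treated as prefixes. The points per indicator (2 each), the field names and the function names are placeholders invented for this example; the actual weights are reported in the protocol's additional file and are not reproduced here.

```python
import re

# Starred terms in the list above are matched as prefixes, the rest exactly.
PREFIX_TERMS = ("depres", "nedstemt", "trist", "håpløs")
EXACT_TERMS = {"deprimert", "deppa", "nedfor", "sorgtung", "svartsyn",
               "tungsinn", "mismot", "ulykkelig", "madrs"}
MAX_SCORE = 6


def text_hit(note):
    """True if any word in the free text matches one of the search terms."""
    words = re.findall(r"\w+", note.lower())
    return any(w in EXACT_TERMS or w.startswith(PREFIX_TERMS) for w in words)


def score(patient):
    """Score one patient record (hypothetical weights, capped at MAX_SCORE)."""
    if "P76" in patient["icpc2_codes"]:
        return MAX_SCORE  # P76 Depression automatically gets the top score
    points = 0
    points += 2 if text_hit(patient["notes"]) else 0   # placeholder weight
    points += 2 if patient["antidepressant"] else 0    # placeholder weight
    points += 2 if patient["extra_fee"] else 0         # placeholder weight
    return min(points, MAX_SCORE)


def ranked(patients):
    """Patients ordered as shown to the GP: highest score first."""
    return sorted(patients, key=score, reverse=True)
```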
If the software did not work, GPs were asked to identify elderly patients with
depression to discuss with the authors.
• What do the authors mean by ‘If the software did not work’?
o This is mentioned by both reviewers and we have elaborated in the text: “If the software did
not work (e.g. installation of software was denied due to security systems, such as a firewall, or a
patient list was not produced), we asked the general practitioners to think of elderly patients with
depression from their list. Approximately one out of five general practitioners experienced some
problems while installing the software. Nearly all of these general practitioners whom we
interviewed were able, nonetheless, to identify elderly patients with depression from their
practice.”
• What was discussed between the authors and the GP? Whether the patients had a diagnosis of
mild, moderate, severe or recurrent depression?
o The discussion was an interview following the same structure as the interview we carried out
for GPs with a software-selected patient list, that is initially assessing the eligibility of each
patient, using criteria-based questions based on the ICD10 depression criteria, ruling out
exclusion criteria and then asking the GP questions about his/her practice in accordance with the
type and severity of the depression.
• And did the authors assume on the results of the discussion that a patient had a depression or
not? If so was it possible to use the ICD 10 criteria?
o Please see our response to the question above. If a diagnosis of depression was not established
for any of the patients that the GP could think of, the interview was terminated.
In the final stage of the data collection, the authors asked GPs who had not responded to the
invitation, to complete a questionnaire regarding their diagnosis and management of one elderly
patient with depression from their patient list.
• Can the authors mention which questions were asked?
o In the questionnaire the GP was asked about practice regarding one patient. The patient was
selected by the GP himself. An initial diagnostic procedure to establish the diagnosis (type and
severity of depression) was followed by the same questions that were asked to GPs in the
interviews.
• Did you check whether the ‘discuss’ method and the method using a self-report questionnaire
gave comparable results?
o We did not check this difference due to the small sample size, and we have inserted a sentence
about this in the “strengths and limitations” section: We used two different methods, interviews
and questionnaires, to obtain data from general practitioners regarding diagnosis and
management of elderly patients with depression from their patient list. It is not possible to
determine whether the two methods gave comparable results, due to the small sample size.
GPs who participated in the interview received credit for the CME course, but did not receive
any financial compensation.
• What was the amount they received? Or do you mean discount for the CME course because
you mention that GPs did not receive financial compensation?
o GPs did not receive any money, but credit in the form of points/hours. The interview
counted as 3 hours of activity. Together with the e-learning course (8 hours), some specified
activity that should help the GP to adhere to the recommendations (4 hours) and participation in
the outreach visits (3 hours), this should give the GP sufficient hours/points to qualify for
the CME course, which counted towards the speciality.
Based on group interviews interventions were identified that addressed determinants to
adherence to the guideline recommendations.
• What was the experience with the offered interventions by the GPs, healthcare professionals,
administrators and patients?
o We did not conduct group interviews after the intervention, only prior to the intervention in the
process of planning the interventions. The reviewer’s question will to some extent be answered
in the process evaluation which will be published later this year. However, in that study we only
interviewed GPs in the intervention group, not other healthcare professionals, administrators or
patients.
• Is there more to say about: Which interventions were effective or helpful? What was the
perceived usefulness of the tailored interventions offered in response to the determinants?
Whether the intervention really fits to the determinant? On p.12, I read that you have conducted a
process evaluation to investigate reasons for the observed effects of the tailored implementation
strategies, including the extent to which you were able to identify and address the most important
determinants of practice. Probably this is the answer to the above mentioned questions?
o We were not able to measure the effectiveness of each of the 52 interventions (small and large)
that were implemented. The reviewer is correct; in the process evaluation we will be able to
present feedback from GPs in the intervention group regarding their assessments of the
importance of the determinants and the feasibility of each of the interventions that we planned.
• Why did you not evaluate the provided intervention by embedding a continuous feedback loop
with GPs to optimise the tailoring process?
o That is an excellent idea. Alas, we did not maintain contact with the GPs between the time of
the outreach visits and when we started to invite them to participate in the data collection. We
believe that the research group would have had to be larger, or the intervention period longer, for
us to maintain contact. Possibly such contact could have increased the impact of the
intervention and the resources we offered on each GP in the intervention group, but we do not
know this. Furthermore, this type of feedback loop was not suggested as one of the interventions
when we planned and tailored the interventions.
Recruitment, randomisation and blinding, p.7
The municipalities were the unit of randomisation. The primary outcome was the proportion of
recommendations that the GPs implemented, measured as the mean adherence for each GP
across the six recommendations, patients and depression severity. As a consequence it is more
appropriate to randomize the GPs?
• Why did you choose for the municipalities?
o Although the primary outcome was at the level of individual GPs, the interventions targeted all
levels of healthcare in the municipalities, including GP practices, other healthcare professionals,
healthcare administrators in the municipalities, patients and their relatives. Thus, there would
have been contamination if we had randomized GPs.
The authors divided the municipalities into four strata based on two factors: (1) municipalities
with city status or a large population (>25 000 inhabitants) versus ones with smaller populations
(≤25 000 inhabitants) and (2) municipalities with a high proportion (>5%) versus a low proportion
(≤5%) of inhabitants 80 years or older.
• What was the reason you based the stratification on these factors? What is known from the
literature?
o We elaborated in the text: There are few large municipalities in Norway. There is generally
more access to cognitive behavioural therapy and other health and social services in large
municipalities. Because we believed that there might be an association between access to health
and social services and the primary outcome measure, we stratified the randomisation based on
the size of the municipality to ensure that large municipalities were evenly distributed in the
intervention and comparison arms of the trial. Furthermore, we stratified the randomisation on
the variable “Proportion of inhabitants 80 years or older”, acknowledging the increased
prevalence of depression among the oldest, which may influence the general practitioners’
experience with and skills of managing this patient group. The choice of the cut-off (> 25,000
inhabitants or designated city status by Statistics Norway) was based on data from Statistics
Norway, reported in Appendix C of the protocol.
Data collection, p.7
The planned interview schedule (six patients per GP) was too extensive and you reduced the
number of patients that we wanted to discuss with each GP to four. We encouraged the GPs to
provide diagnoses for two additional patients.
• It is not clear to me whether the authors used the interview also for inclusion. Is the discussion
with the GP a part of the interview?
o That is correct, the initial step in the interview was to decide whether the patient could be
included (ICD10 depression criteria, F32, 33 and 34.1) and if any of the exclusion criteria
applied (not home dwelling, short life expectancy, a diagnosis of dementia or bipolar
disorder). We have elaborated: The initial part of the interview served to decide whether the
patients could be included and to assess depression severity and type and then to assess the
practitioners’ management of the patient’s depression.
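The inclusion step at the start of the interview can be expressed as a simple check. This is a sketch under stated assumptions: the ICD-10 categories (F32, F33, F34.1) and the four exclusion criteria are taken from the answer above, whereas the function name, the flag labels and the idea of passing a single ICD-10 code string are illustrative choices, not the interview instrument itself.

```python
INCLUDED_ICD10 = ("F32", "F33", "F34.1")  # depressive episode, recurrent, dysthymia
EXCLUSIONS = {
    "not_home_dwelling",
    "short_life_expectancy",
    "dementia",
    "bipolar_disorder",
}


def eligible(icd10_code, flags):
    """Return True if the patient can be included.

    icd10_code: ICD-10 code established in the interview, or None if the
    depression criteria were not fulfilled.
    flags: iterable of exclusion labels that apply to the patient.
    """
    meets_criteria = icd10_code is not None and icd10_code.startswith(INCLUDED_ICD10)
    excluded = bool(set(flags) & EXCLUSIONS)
    return meets_criteria and not excluded


print(eligible("F32.1", []))            # moderate depressive episode, no exclusions
print(eligible("F33.0", ["dementia"]))  # recurrent depression, but excluded
```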
The authors did not collect baseline data on adherence to the recommendations, assuming that
both groups were comparable following the stratification and randomisation process.
• Can you clarify how population size and age is of influence on the 6 (guideline)
recommendations?
o Please see our answer following the reviewer’s question about stratification
• A baseline measure gives insight into the differences between groups and may be of interest in
the interpretation of the results. Unfortunately the baseline measure was not performed.
o We agree. This decision was made for practical reasons, in particular the very short
period allotted to the intervention in TICD. Collecting baseline data would have
delayed the intervention substantially, and we did not have the resources to collect baseline data.
We added to the limitations section: We did not collect baseline data due to the very short period
of the TICD project that was planned for the intervention; collecting baseline data would have
delayed the intervention substantially. Furthermore, we did not have the resources to collect such
data.
Results
Figure 1. Flow chart of participants. In total 444 GPs were allocated to the intervention and 293
GPs were provided the intervention.
• The authors mention that 393 GPs are lost to follow-up. Can you explain this?
o We considered all GPs eligible for inclusion in the data collection, regardless of whether
they participated in the outreach visits. Thus, all 444 GPs in the intervention group were
eligible. Only 59
consented to participate, of whom 8 did not participate in the interview after all (as explained in
the results section). We have corrected the numbers in the flow chart in figure 1. The rest
were therefore considered lost to follow-up in data collection (the specific number of GPs not responding
and of those who consented but did not participate is clarified in the figure).
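The arithmetic behind the figure the reviewer asks about can be sketched as follows. This is a minimal check that takes the numbers quoted in this answer at face value and assumes that non-response and non-participation after consent were the only sources of loss; the variable names are illustrative.

```python
# Intervention arm, numbers as quoted in the answer above.
allocated = 444                  # GPs allocated to the intervention
consented = 59                   # GPs who consented to data collection
consented_not_interviewed = 8    # consented but did not take part in the interview

interviewed = consented - consented_not_interviewed
non_responders = allocated - consented
lost_to_follow_up = non_responders + consented_not_interviewed

print("Interviewed:", interviewed)
print("Lost to follow-up:", lost_to_follow_up)
```

Under these assumptions, 51 GPs in the intervention arm were interviewed and 393 were lost to follow-up, which matches the number the reviewer refers to.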
The authors mention that in total, 124 GPs (response rate 14%) participated in the data collection
(p.8).
• This is not in accordance with the figure; data from the 124 GPs were analysed. My advice is to use the
CONSORT 2010 Flow Diagram.
o We have adjusted the flow chart in accordance with the 2010 CONSORT flow chart. We have
corrected a few inaccuracies in the “Lost to follow up” boxes (for both intervention and control
groups) hopefully making it easier to follow.
The authors mention a response rate of 14% (p.8).
• Is this correct? 124 GPs from which sample?
o 124 GPs are 14% of the total sample of 900 GPs in the control and intervention groups combined.
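For transparency, the calculation can be written out as a short sketch (the variable names are illustrative; the two numbers are those given above):

```python
participating_gps = 124   # GPs who took part in the data collection
total_gps = 900           # all GPs in the intervention and control groups

response_rate = participating_gps / total_gps
print(f"Response rate: {response_rate:.1%}")
```

The exact value is 13.8%, which rounds to the 14% reported.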
Discussion
Strengths and limitations
The authors mention: “The major limitation of our study is that we were not able to recruit a
sufficient number of general practitioners to participate in the data collection within the time and
resource limits that applied to our project”.
For future research it is important to learn from this study, and it is therefore important that the
authors give insight into the recruitment strategy they used.
• The authors mention that the GPs’ secretaries often guarded the GPs, so that the researchers were not able to
get in touch with them. It surprises me that the researchers did not devise a strategy for this
in advance, because it is a common way for GPs to organize their work.
o We agree – we could have planned for this.
Comparison with existing literature
Although it is interesting to compare with former tailored studies, I think it is valuable to give
some attention to other studies about obstacles to performing clinical trials, in relation to the trial
you carried out. For example, Richter-Sundberg et al., 2014: Improving treatment of
depression in primary health care: a case study of obstacles to perform a clinical trial designed to
implement practice guidelines. And also to literature about how to overcome the obstacles.
o Thank you, we agree that this is a highly relevant article, and have included it in the discussion:
Richter-Sundberg and colleagues conducted a post-RCT qualitative study to identify barriers to
the implementation of a clinical practice guideline for depression in Sweden. In spite of fees for
performance and consent to participate collected prior to the implementation process, the project
was not able to recruit the required number of patients to reach sufficient statistical power (after
18 months only 30 patients were included) [45]. Based on the framework described by Grol and
Wensing [46], the authors identified and sorted numerous barriers that hindered participation in
the study. The excessive workload associated with the research design was one major barrier.
Introducing new psychological therapies that challenged established professional role identity
was another.
o Regarding literature about how to overcome the obstacles, this is still a field with much
uncertainty and little consensus, making future research projects testing the effectiveness of
tailoring important.
The authors mention that in their study adherence to the recommendations was higher than in
previous studies [9], 58% and 52% in intervention and control groups respectively. (p. 11)
• Can you describe what the similarities and differences are between the study of Smolders and
your study? In my opinion the studies are quite different in design and data collection. In
addition, your study was underpowered and I wonder if a comparison can be made.
o We agree that Smolders’ study is not directly comparable to ours in terms of design and data
collection. We have addressed this issue extensively in a more elaborated introduction to the
“Comparison with existing literature” section: There are few studies that measure adherence to
depression guidelines and they are not directly comparable to our study. A review found that
general practitioners’ adherence to mental health clinical practice guidelines is low [38].
Fernandez and colleagues [ref] found in a large epidemiological study, based on interviews with
21 425 home-dwelling persons in six European countries, that “treatment adequacy” for
depression as defined by the research group was particularly low (23%) in the patient group that
received management in “general medical care” (which included general practitioners and
specialists other than psychiatrists and psychologists). Duhoux and colleagues [40] found that
elderly patients (65+) received less guideline-concordant management than younger
adults, regardless of which definition of concordance was selected. Smolders and
colleagues [9] combined information from a patient questionnaire that measured depression and
anxiety symptoms with general practitioners’ performance as recorded in the electronic medical
patient records, and found that only 42% of the depressed patients received management in accordance with
evidence-based management of depression as defined by an expert panel. In our study adherence
to the recommendations tended to be higher than in previous studies that have reported
adherence to depression guidelines, 58% and 52% in the intervention and control groups
respectively. The weak recruitment of general practitioners to participate in data collection might
indicate that we were only able to collect data from general practitioners that were particularly
interested in this patient group, both in the intervention and in the control group, which may
explain the relatively high adherence rate.
Conclusions
• Is it not premature to mention that: “Our tailored implementation strategy is unlikely to have
had more than a modest effect, despite our having put substantial effort into using multiple
methods to identify determinants of practice and to design a multifaceted implementation
strategy to address those determinants” ?
o The confidence interval for the primary outcome indicates a possible 9 percent improvement in
adherence, which is the basis for our conclusion. We agree with the reviewer that the finding is
weak. Thus we find it reasonable to conclude that it is “unlikely” to have more than a modest
effect.
Minor essential revisions
Title and abstract
• The recommendations are derived from guidelines. If this is mentioned in the title and abstract
(‘guideline recommendations’), it is clearer what is meant by recommendations.
o We have changed accordingly
Abstract
Methods/design
• The authors mention that ‘The interventions targeted healthcare professionals, administrators
and patients,…’. Afterwards they write: ‘We offered the intervention to all GPs in the intervention
municipalities’. It is not clear to whom the interventions are offered.
o We agree, this is not clearly described. We have re-written: We offered outreach visits to all
general practitioners and practice staff in the intervention municipalities.
Table 2
• Hospital Anxiety and Depression Scale (HADS) instead of Hospital and Anxiety Scale.
o Of course. Thank you!
Post-hoc analyses
• In table 4 you mention as a separate category ‘recurrent depression/dysthymia’. For
clarification add dysthymia also in the text (p.9).
o Done. We have also omitted the term “chronic depression” as this is not an established
diagnosis in ICD-10.
Results
Among the 385 patients, 221 patients (58%) suffered from recurrent or chronic depression
according to ICD-10 criteria. Of the remaining 164 patients with a first depressive episode,
almost 40% suffered from a severe episode (p.9).
• What do you mean by chronic depression according to the ICD-10: Persistent mood disorders or,
more specifically, Dysthymia?
o We have removed the term “chronic depression”, see explanation above
Strengths and limitations
“Getting GPs to contribute to the data collection once we contacted them was less difficult than
getting in contact with them (p. 10)”.
• The sentence is not clear to me. What do you mean?
o We have elaborated: The biggest challenge was to get in contact with the general practitioners.
When we had been able to talk to them, it was less difficult to get the general practitioner’s
consent to participate in the data collection.