Author’s response to reviews

Title: A tailored intervention to implement guideline recommendations for elderly patients with

depression in primary care: a pragmatic cluster randomised trial

Authors:

Eivind Aakhus

Ingeborg Granlund

Jan Odgaard-Jensen

Andrew Oxman

Signe Flottorp

Version: 1 Date: 23 Feb 2016

Response to reviewers

We want to thank the reviewers for their thorough review and excellent suggestions to improve

our paper. For improved readability please refer to the attached

• Answers written in italics respond directly to a reviewer’s request for clarification and may,

but need not, have led to changes in the manuscript

• We provide two versions of the reviewed paper, one with track changes and one without

Reviewer #1:

Abstract

• Clearly written.

o Thank you

• From the background it is not clear at what group of stakeholders the interventions/study is

targeting at.

o We changed the background section: “In Norway we tested this approach to improve

adherence to six recommendations for elderly patients with depression, targeting healthcare

professionals, patients and administrators”.

• I suggest broadening the discussion with relating the findings to the recommendations and care

for elderly with depressive disorder.

o We added in the discussion: It remains uncertain how best to improve adherence to evidence-

based recommendations and thereby improve the quality of care for these patients.

• There are some abbreviations that should be explained (CME, GP etc).

o We have written continuous medical education in full. We have omitted the abbreviation GP,

and written “general practitioner” throughout the paper

Main text

Background

There is no explicit record of the main research question(s) and hypothesis. Especially when

investigating and reporting the effectiveness of an intervention, it is very important as a reader to

understand what is exactly the research aim and how this relates to the existing knowledge. This

is important for the reader to be able to review the methods for its appropriateness and

subsequent reporting of results and drawing conclusions. As this is not described in the protocol

either, or not in standardised terms (e.g. PICO), I suggest to elaborate on this and advise to refer

to the CONSORT statement for guidelines.

o We have added the following sentence at the end of the background: “The objective of this

study was to evaluate the effectiveness of tailored interventions to implement those six

recommendations.” Our main research question was: Does a tailored implementation strategy

increase the extent to which general practitioners adhere to evidence-based recommendations for

managing elderly patients with depression compared to no intervention?

Methods - design

• It would benefit from clarity to explain the rationale for doing a pragmatic RCT and what in

this study 'pragmatic' entails.

o We have added the following: The trial was pragmatic in that it attempted to answer a practical

management question in normal practice, including all general practices in the targeted

municipalities and all patients with the targeted condition, the intervention was delivered

flexibly, and the primary outcome measure was clinically meaningful [20, 21].

20. Zwarenstein M, Treweek S, Gagnier JJ, Altman DG, Tunis S, Haynes B et al. Improving the

reporting of pragmatic trials: an extension of the CONSORT statement. BMJ. 2008;337:a2390.

doi:10.1136/bmj.a2390.

21. Thorpe KE, Zwarenstein M, Oxman AD, Treweek S, Furberg CD, Altman DG, et al. A

pragmatic-explanatory continuum indicator summary (PRECIS): a tool to help trial designers. J

Clin Epidemiol 2009; 62:464-75.

• As power is an issue in this study, the manuscript would gain in strength if the authors explain

why less than 20% of the municipalities were randomised. This also relates to a comment

concerning the power calculations where 50% GPs responding seems to be overestimated. I

suggest not only referring to an annex to the protocol paper, but including the main arguments in

this manuscript as well. This also relates to a suggestion for the analyses and presentation of

results.

o We have added the following under “design”: We included 80 municipalities due to the

required number of general practitioners that should be included in the study. The municipalities

were selected from seven of 19 counties in Norway for practical reasons (geographical access by

the research team) and because they represented both urban and non-urban and large and small

municipalities. Under “Sample size” we added: The assumed 50% participation in data collection

was based on two previous studies in Norwegian primary care [33, 34].

33. Fretheim A, Oxman AD, Havelsrud K, Treweek S, Kristoffersen DT, Bjorndal A. Rational

prescribing in primary care (RaPP): a cluster randomized trial of a tailored intervention. PLoS

Med. 2006;3(6):e134. doi:10.1371/journal.pmed.0030134.

34. Flottorp S, Oxman AD, Havelsrud K, Treweek S, Herrin J. Cluster randomised controlled

trial of tailored interventions to improve the management of urinary tract infections in women

and sore throat. BMJ. 2002;325(7360):367.

Participants, inclusion and eligibility criteria

• I do not understand why patients assumed to have a depressive disorder diagnosed using the

ICD10 are included whereas those who do not have an ICD10 diagnosis are excluded. Could you

please elaborate.

o This was mentioned by both reviewers. We agree that this sentence is not clear and we have

omitted “or assumed to have a diagnosis of depression”. We used diagnostic criteria from ICD10

to assess type of depression (single episode, recurrent or dysthymia) and severity of episode

(mild, moderate, severe).

• It might be helpful to explain to what extent the patient identification software did not function

and why as identification of patients manually by GPs might be an important factor in

participation rates. See also below.

o This is also mentioned by both reviewers and we have elaborated in the text: “If the software

did not work (e.g. installation of software was denied due to security systems, such as a firewall,

or a patient list was not produced), we asked the general practitioners to think of elderly patients

with depression from their list. Approximately one out of five general practitioners experienced

some problems while installing the software. Nearly all of these general practitioners whom we

interviewed were able, nonetheless, to identify elderly patients with depression from their

practice.” We have also given a more detailed description of the software: We developed

software for the five electronic journal systems used by general practitioners in Norway. The

software was either available from the website and could be downloaded and installed by the

practitioner prior to the interview, or was sent on a memory stick beforehand.

Interventions

• One question I have is to what extent the recommendations as part of the intervention are

conceived by GPs as evidence-based and more importantly, as clinical guidelines in treating

elderly with depressive disorder in Norway? Is there a possibility in the data to answer this

question?

o This question will be addressed in the process evaluation which will be submitted separately

later this year. In this evaluation we interviewed 20 GPs from the intervention group. Although

this small group can hardly be claimed to be representative of all the GPs in the intervention

group, there was little disagreement with the recommendations or the evidence base. The

disagreement that did arise mostly concerned the recommendation for a depression case manager.

Prior to the study, the recommendations were discussed with various stakeholders in the

reference group, which included a representative from the Norwegian College of General

Practice (NFA). We presented the recommendations at outreach visits as part of the intervention,

and our clear impression was that the general practitioners accepted the recommendations, with

the same exception regarding depression case managers.

Outcome measures

• The first sentence (lines 14-17) reads very difficult.

o We have rewritten the description of the primary outcome: The primary outcome was the

proportion of the six recommendations adhered to by the general practitioners. We measured this

outcome by calculating a single measurement for each general practitioner based on the mean

adherence across the six recommendations.
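
As an illustration only, the calculation described above could be sketched as follows. The record layout, the function name `gp_adherence` and the choice to average within each recommendation before averaging across recommendations are assumptions made for this sketch; they are not taken from the trial's analysis code.

```python
from statistics import mean


def gp_adherence(patient_records):
    """Return one summary adherence value (0.0 to 1.0) for a single GP.

    patient_records is a hypothetical layout: one dict per patient, mapping
    a recommendation id to True/False for each recommendation that applied
    to that patient.
    """
    per_recommendation = {}
    for record in patient_records:
        for rec_id, adhered in record.items():
            per_recommendation.setdefault(rec_id, []).append(1.0 if adhered else 0.0)
    if not per_recommendation:
        return None  # no applicable recommendation, so no value for this GP
    # Assumed order of averaging: within each recommendation, then across them.
    return mean([mean(values) for values in per_recommendation.values()])
```

A GP whose records contain no applicable recommendation gets no value rather than zero, which matches an analysis based on available data without imputation.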

Randomisation

• Please explain the cut-off values for inhabitants?

o The cut-off value for large and small municipalities was practical, based on data from Statistics

Norway to ensure that large municipalities (with the most dominant urban characteristics) were

fairly distributed across the two groups. We elaborated in the “Recruitment …” section: There

are few large municipalities in Norway. There is generally more access to cognitive behavioural

therapy and other health and social services in large municipalities. Because we believed that

there might be an association between access to health and social services and the primary

outcome measure, we stratified the randomisation based on the size of the municipality to ensure

that large municipalities were evenly distributed in the intervention and comparison arms of the

trial. Furthermore, we stratified the randomisation on the variable “Proportion of inhabitants 80

years or older”, acknowledging the increased prevalence of depression among the oldest, which

may influence the general practitioners’ experience with and skills of managing this patient

group. The choice of the cut-off (> 25,000 inhabitants or designated city status by Statistics

Norway) was based on data from Statistics Norway, reported in Appendix C of the protocol.

• Why was it not possible to blind the researchers?

o Throughout data collection the researchers were in constant contact with the GPs who were

invited to participate: they coordinated interview schedules, made appointments, established

contact between interviewer and interviewee, and assisted the GPs when the installation of

software failed. To ensure blinding, all this work would have had to be done by others,

which was outside the financial scope of our project. On the other hand, as stated in the text, the

statistician who performed the analyses was blinded to the randomisation when analysing

primary and secondary outcomes.

Statistical methods

• In light of the ITT analyses, could you please include a description on how possible missing

data is accounted for?

o We did not attempt to impute values for missing data. We have changed the sentence

accordingly: We performed the analysis as an intention-to-treat analysis based on the available

data; all general practitioners were analysed in the group to which they were assigned, regardless

of whether they used the interventions we offered or not. We did not impute values for missing

data.

Data collection

• Please explain why the number of patients discussed with the GP is lower than described in the

protocol (resp. 4 versus 6 in the protocol).

o Due to the extensive interview that we planned for each patient, we calculated that

interviewing the GP about six of his/her patients would take approximately two hours. Feedback

from GPs that we met during outreach visits prior to data collection, after writing the protocol,

convinced us that it was unrealistic to expect GPs to offer so much of their time to participate in

the interview. By narrowing the extent of the interview and reducing the number of patients, we

planned to conduct each interview within one hour, which we expected to be more realistic.

Results

• To identify the patients, specialized software is used to extract patient files from the GPs

administrations. The authors note that if the software did not work, the research team asked the

GPs to identify potential participants at patient level. This is a demanding task for GPs and hence

information about the successful attempts using the software might give insight in what GPs

were lost on follow-up or not responding. I would advise the researchers to interrogate the data

on this issue and find out if the process of identification of patients might be an important

alternating factor in GPs participation rates.

o Our impression during data collection was that the general practitioners were easily able to

identify some patients from their practice, but often fewer than if the list was generated from the

software.

o We have added this to the strengths and limitations section in the Discussion: We used two

different methods, interviews and questionnaires, to obtain data from general practitioners

regarding diagnosis and management of elderly patients with depression from their patient list. It

is not possible to determine whether the two methods gave comparable results, due to the small

sample size.

Discussion

Although the authors reflect on reasons that might have caused the low response rates of the

GPs, I would expect further debate on the following issues to learn and prevent for similar

outcomes in similar studies:

• How do the GPs regard the intervention and especially the recommendations? Do they agree

with them or do they feel unsure?

o This will be addressed in the process evaluation.

• Could you elaborate on the recruitment strategy? What has been done to try to overcome the

financial, administrative and organizational burdens?

o We have elaborated: However, general practitioners do not routinely grade the severity of

depression or the type of depression using diagnostic codes. Thus, collecting data on adherence

to recommendations that apply to severity or type of depression requires an interview or

completion of a questionnaire. We have also added: Alternatively, a fee for participating, either

as part of national health authorities’ support of research in primary care or as part of the

research project funding, might have improved recruitment of participants for data collection

[48]. We included a reference to Foy and colleagues for the support of fees for participating in

research, although the evidence for these strategies is weak.

• What is the basis for the 50% response rate and thus including 'only' 80 municipalities, (besides

practical reasons)?

o Our anticipated 50% participation rate was based on two Norwegian studies in primary care,

which both are referred to in the protocol. We have now included these two references in the

text.

o Including 80 municipalities is actually quite a large sample, representing approximately one

fifth of the Norwegian population and healthcare professionals. On average there are 3-4

practices per municipality and 3-4 GPs per practice.

Conclusions

• I suggest including more 'lessons learned' in the conclusions (and abstract) so research

community can more easily grasp the bottlenecks in this kind of research.

o We have added the following key messages to the Conclusions:

Pragmatic trials of implementation strategies are needed to answer real world questions about

how to improve the quality of care. A key message from this trial for implementation researchers

is that access to outcome data is essential to their success. This trial included all general

practitioners in 80 municipalities representing 20% of the Norwegian population with close to

1000 general practitioners. Randomising jurisdictions or large numbers of practices without

consent is a highly pragmatic approach to answering real world questions about how to improve

the quality of care, provided access to outcome data is ensured, for example via routinely

collected data. However, this approach proved to be fatally flawed in this trial, because

collecting outcome data required active participation of general practitioners.

A second key message is that a randomized trial is not the best study design for answering a

pragmatic question about how to improve practice when random allocation is not feasible. A

major limitation of this trial was that we were not able to include what might have been

important, effective components of our tailored implementation strategy, such as integrating our

resources in widely used electronic information sources, because we could not randomly allocate

these.

A third key message is that future research evaluating methods for tailoring implementation

strategies should directly compare tailored implementation strategies that use different methods

to tailor the interventions. It remains logical that implementation strategies should address

important barriers to implementing evidence-based recommendations. However, little is known

about how best to identify important barriers and how to select interventions to address identified

barriers [12, 13]. The TICD project conducted some ground-breaking research comparing

different methods for identifying determinants of practice [16] and for linking interventions to

those determinants [14]. However, our trials were limited to comparisons of tailored strategies to

no intervention. This research and process evaluations linked to our trials [ref] can shed some

light on why our tailored implementation strategies appeared to have, at best, modest effects.

However, there is a need for explanatory trials designed to answer questions about how best to

tailor implementation strategies, as well as pragmatic trials to answer real world questions about

how to improve practice.

A key message for general practitioners and policymakers is that to answer important questions

about how to improve practice, general practitioners need to have time, resources and structures,

such as research networks, fees for participating in prioritised research or learning health systems

[ref].

Reviewer #2

Reviewer’s report

General comments

The study is of interest but underpowered and as a consequence there is no answer to the

question what the effectiveness of tailored interventions are to improve care for elderly patients

with depression. I doubt if the lessons learnt are new compared to the literature and add value to

new research. However, the design and the tailored interventions are very interesting.

I do not feel adequately qualified to assess the statistics. My advice is to ask a statistician to carry

out the statistical review.

Major compulsory revisions

Methods

Participants, inclusion and eligibility criteria, p.5

The study was part of the international research project Tailored Implementation for Chronic

Diseases (TICD). The aim of the TICD project was to develop valid and efficient methods of

tailoring implementation interventions to determinants of practice for knowledge implementation

in chronic illness care. In the study regarding depression, included were patients with a diagnosis

of mild, moderate, severe or recurrent depression or assumed to have a diagnosis of depression

according to standardised ICD 10 criteria.

• Just wondering, why did the researchers not choose for patients who were diagnosed with a

persistent (chronic) depression? The categories of mild, moderate and severe depressive episodes

are used only for a single (first) depressive episode. As the authors mention on page 10:

“Learning more about effective strategies to improve healthcare for patients with chronic

diseases is important”.

o We did include patients with dysthymia and recurrent depression in addition to patients with

a depressive episode. This is now clarified, thank you! The term chronic depression is

frequently used in clinical discussions and serves as a specifier (a code that gives

additional information to the main diagnosis) to the diagnostic code of major depression in

DSM-IV-TR, but it is not included in the ICD10. We have now omitted the term chronic depression

throughout the text for clarity. In DSM-IV terminology the term implies any degree of

depression severity lasting more than 2 years; the ICD10 term dysthymia, which represents a

long-standing sub-syndromal depressive state, resembles the DSM term.

• In addition, can you clarify what is meant by ‘assumed to have a diagnosis of depression’? Can

you describe how this is done?

o This was mentioned by both reviewers. We agree that this sentence is not clear and we have

omitted “or assumed to have a diagnosis of depression”. We used diagnostic criteria from ICD10

to assess type of depression (single episode, recurrent or dysthymia) and severity of episode

(mild, moderate, severe) and included only patients who fulfilled the diagnostic criteria.

Patients were included according to standardised ICD-10 criteria. Eligible patients were

identified from the GPs' patient lists using software that extracted information from the GPs’

electronic medical records, based on an algorithm of ICPC-2 diagnostic codes, ICPC-2

diagnostic text, free text, prescription of antidepressants and billing codes.

• Can you clarify what the relation is between the ICD-10 and the ICPC-2? Do GPs record both

ICD-10 and ICPC-2? Or was the ICPC-2 used as a first step in inclusion, and the ICD-10 as

second step?

o The reviewer’s assumption is correct; the ICPC-2 classification system is used by Norwegian

GPs. The GPs very rarely use ICPC-2 to specify depression severity and depression type. Thus

we used the ICPC-2 code as one of the items in a checklist to identify eligible patients from the

patient list. In order to check whether practice was in accordance with the recommendations

(which addressed different types and severities of depression) we used ICD10 criteria to (1)

ascertain that the patient could be included and (2) establish the nature of the patient’s depression.

• Which terms were used from the free text?

o This is reported in the additional file in the protocol (see text box). These are Norwegian

terms that might, but do not necessarily, indicate an assessment of depression. As we described in

the protocol, various assessments, including ICPC2 codes, text excerpts from the electronic

medical journal, prescription of antidepressants, and additional fees, each yielded a score.

Patients with the highest score (maximum 6, minimum 0; the P76 Depression code in ICPC2

was automatically given the highest score) were presented first on the list that was shown to the

GP on the screen after installing the software when starting the interview.

Deprimert, depres* (depresjon, depressiv, depresjonsskjema), deppa, nedfor, nedstemt*

(nedstemt, nedstemthet) trist* (trist, tristhet, tristesse), sorgtung, svartsyn, håpløs* (håpløs,

håpløst, håpløshet), tungsinn, mismot, ulykkelig, MADRS
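
The ordering described above can be illustrated with a minimal sketch. Only the score range (0 to 6), the rule that the P76 code receives the highest score, and the search terms in the text box come from this response; the field names and the individual weights below are hypothetical, and the actual values are those reported in the protocol's additional file.

```python
import re

MAX_SCORE = 6

# Search terms from the text box; a trailing * stands for any word ending.
TERMS = ["deprimert", "depres*", "deppa", "nedfor", "nedstemt*", "trist*",
         "sorgtung", "svartsyn", "håpløs*", "tungsinn", "mismot",
         "ulykkelig", "madrs"]


def _to_regex(term):
    """Turn one search term into a whole-word regular expression."""
    stem = re.escape(term.rstrip("*"))
    suffix = r"\w*" if term.endswith("*") else ""
    return rf"\b{stem}{suffix}\b"


TERM_PATTERN = re.compile("|".join(_to_regex(t) for t in TERMS), re.IGNORECASE)


def score_patient(patient):
    """Score one patient record; the weights are assumptions, not the protocol's."""
    if "P76" in patient.get("icpc2_codes", []):
        return MAX_SCORE  # P76 Depression is automatically given the highest score
    score = 0
    if TERM_PATTERN.search(patient.get("free_text", "")):
        score += 2  # assumed weight for a free-text hit
    if patient.get("antidepressant_prescribed"):
        score += 2  # assumed weight for an antidepressant prescription
    if patient.get("billing_code_match"):
        score += 1  # assumed weight for an additional fee
    return min(score, MAX_SCORE)


def rank_patients(patients):
    """Return the patients with the highest score first."""
    return sorted(patients, key=score_patient, reverse=True)
```

Sorting by score places patients coded P76 at the top of the list, followed by patients for whom several weaker indicators coincide.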

If the software did not work, GPs were asked to identify elderly patients with

depression to discuss with the authors.

• What do the authors mean by ‘If the software did not work’?

o This is mentioned by both reviewers and we have elaborated in the text: “If the software did

not work (e.g. installation of software was denied due to security systems, such as a firewall, or a

patient list was not produced), we asked the general practitioners to think of elderly patients with

depression from their list. Approximately one out of five general practitioners experienced some

problems while installing the software. Nearly all of these general practitioners whom we

interviewed were able, nonetheless, to identify elderly patients with depression from their

practice.”

• What was discussed between the authors and the GP? Whether the patients had a diagnosis of

mild, moderate, severe or recurrent depression?

o The discussion was an interview following the same structure as the interview we carried out

for GPs with a software-selected patient list, that is initially assessing the eligibility of each

patient, using criteria-based questions based on the ICD10 depression criteria, ruling out

exclusion criteria and then asking the GP questions about his/her practice in accordance with the

type and severity of the depression.

• And did the authors assume on the results of the discussion that a patient had a depression or

not? If so was it possible to use the ICD 10 criteria?

o Please see our response to the question above. If a diagnosis of depression was not established

for any of the patients that the GP could think of, the interview was terminated.

In the final stage of the data collection, the authors asked GPs who had not responded to the

invitation, to complete a questionnaire regarding their diagnosis and management of one elderly

patient with depression from their patient list.

• Can the authors mention which questions were asked?

o In the questionnaire the GP was asked about practice regarding one patient. The patient was

selected by the GP himself. An initial diagnostic procedure to establish the diagnosis (type and

severity of depression) was followed by the same questions that were asked to GPs in the

interviews.

• Did you check whether the ‘discuss’ method and the method using a self-report questionnaire

gave comparable results?

o We did not check this difference due to the small sample size, and we have inserted a sentence

about this in the “strengths and limitations” section: We used two different methods, interviews

and questionnaires, to obtain data from general practitioners regarding diagnosis and

management of elderly patients with depression from their patient list. It is not possible to

determine whether the two methods gave comparable results, due to the small sample size.

GPs who participated in the interview received credit for the CME course, but did not receive

any financial compensation.

• What was the amount they received? Or do you mean discount for the CME course because

you mention that GPs’ did not receive financial compensation?

o GPs did not receive any money, but were credited with points/hours. The interview

counted as 3 hours of activity. Together with the e-learning course (8 hours), specified

activity intended to help the GP adhere to the recommendations (4 hours) and participation in

the outreach visits (3 hours), this should give the GP sufficient hours/points to qualify for

the CME course, which counted towards the speciality.

Based on group interviews interventions were identified that addressed determinants to

adherence to the guideline recommendations.

• What was the experience with the offered interventions by the GPs, healthcare professionals,

administrators and patients?

o We did not conduct group interviews after the intervention, only prior to the intervention in the

process of planning the interventions. The reviewer’s question will to some extent be answered

in the process evaluation which will be published later this year. However, in that study we only

interviewed GPs in the intervention group, not other healthcare professionals, administrators or

patients.

• Is there more to say about: Which interventions were effective or helpful? What was the

perceived usefulness of the tailored interventions offered in response to the determinants?

Whether the intervention really fits to the determinant? On p.12, I read that you have conducted a

process evaluation to investigate reasons for the observed effects of the tailored implementation

strategies, including the extent to which you were able to identify and address the most important

determinants of practice. Probably this is the answer to the above mentioned questions?

o We were not able to measure the effectiveness of each of the 52 interventions (small and large)

that were implemented. The reviewer is correct; in the process evaluation we will be able to

present feedback from GPs in the intervention group regarding their assessments of the importance

of the determinants and the feasibility of each of the interventions that we planned.

• Why did you not evaluate the provided intervention by embedding a continuous feedback loop

with GPs to optimise the tailoring process?

o That is an excellent idea. Alas, we did not maintain contact with the GPs between the time of

the outreach visits and when we started to invite them to participate in the data collection. We

believe that the research group would have had to be larger, or the intervention period longer, for

us to maintain contact. Such contact might have increased the impact of the

intervention and the resources we offered on each GP in the intervention group, but we do not

know. Furthermore, this type of feedback loop was not suggested as one of the interventions

when we planned and tailored the interventions.

Recruitment, randomisation and blinding, p.7

The municipalities were the unit of randomisation. The primary outcome was the proportion of

recommendations that the GPs implemented, measured as the mean adherence for each GP

across the six recommendations, patients and depression severity. As a consequence it is more

appropriate to randomize the GPs?

• Why did you choose for the municipalities?

o Although the primary outcome was at the level of individual GPs, the interventions targeted all

levels of healthcare in the municipalities, including GP practices, other healthcare professionals,

healthcare administrators in the municipalities, patients and their relatives. Thus, there would

have been contamination if we had randomized GPs.

The authors divided the municipalities into four strata based on two factors: (1) municipalities

with city status or a large population (>25 000 inhabitants) versus ones with smaller populations

(≤25 000 inhabitants) and (2) municipalities with a high proportion (>5%), versus a low proportion

(≤5%) of inhabitants 80 years or older.

• What was the reason you based the stratification on these factors? What is known from the

literature?

o We elaborated in the text: There are few large municipalities in Norway. There is generally

more access to cognitive behavioural therapy and other health and social services in large

municipalities. Because we believed that there might be an association between access to health

and social services and the primary outcome measure, we stratified the randomisation based on

the size of the municipality to ensure that large municipalities were evenly distributed in the

intervention and comparison arms of the trial. Furthermore, we stratified the randomisation on

the variable “Proportion of inhabitants 80 years or older”, acknowledging the increased

prevalence of depression among the oldest, which may influence the general practitioners’

experience with and skills of managing this patient group. The choice of the cut-off (> 25,000

inhabitants or designated city status by Statistics Norway) was based on data from Statistics

Norway, reported in Appendix C of the protocol.
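
The four strata and an allocation within them could be sketched as below. The thresholds (more than 25,000 inhabitants or city status; more than 5% of inhabitants aged 80 or older) are those quoted in this response. The field names, the fixed seed and the shuffle-and-split procedure are assumptions made for illustration and do not describe the procedure actually used in the trial.

```python
import random


def stratum(municipality):
    """Return the (large, high share of 80+) stratum of one municipality."""
    large = municipality["city_status"] or municipality["inhabitants"] > 25_000
    high_share_80_plus = municipality["share_80_plus"] > 0.05
    return (large, high_share_80_plus)


def randomise(municipalities, seed=2016):
    """Allocate municipalities (clusters) to two arms within each stratum.

    The seed and the splitting rule are hypothetical. With an odd number of
    municipalities in a stratum, the extra one goes to the comparison arm.
    """
    rng = random.Random(seed)
    strata = {}
    for municipality in municipalities:
        strata.setdefault(stratum(municipality), []).append(municipality["name"])
    allocation = {}
    for names in strata.values():
        rng.shuffle(names)
        half = len(names) // 2
        for index, name in enumerate(names):
            allocation[name] = "intervention" if index < half else "comparison"
    return allocation
```

Because the split is made separately within each stratum, the few large municipalities cannot all end up in the same arm by chance.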

Data collection, p.7

The planned interview schedule (six patients per GP) was too extensive and you reduced the

number of patients that we wanted to discuss with each GP to four. We encouraged the GPs to

provide diagnoses for two additional patients.

• It is not clear to me whether the authors also used the interview for inclusion. Is the discussion

with the GP a part of the interview?

o That is correct; the initial step in the interview was to decide whether the patient could be

included (ICD10 depression criteria, F32, F33 and F34.1) and whether any of the exclusion criteria

applied (not home dwelling, short life expectancy, a diagnosis of dementia or bipolar

disorder). We have elaborated: The initial part of the interview served to decide whether the

patients could be included and to assess depression severity and type and then to assess the

practitioners’ management of the patient’s depression.

The authors did not collect baseline data on adherence to the recommendations, assuming that

both groups were comparable following the stratification and randomisation process.

• Can you clarify how population size and age influence the 6 (guideline)

recommendations?

o Please see our answer following the reviewer’s question about stratification

• A baseline measure gives insight into the differences between groups and may be of interest in

the interpretation of the results. Unfortunately the baseline measure was not performed.

o We agree. Indeed, this decision was made for practical reasons, in particular the very short

period allotted to the intervention in TICD. Collecting baseline data would have

delayed the intervention substantially, and we did not have the resources to collect baseline data.

We added to the limitations section: We did not collect baseline data due to the very short period

of the TICD project that was planned for the intervention; collecting baseline data would have

delayed the intervention substantially. Furthermore, we did not have the resources to collect such

data.

Results

Figure 1. Flow chart of participants. In total 444 GPs were allocated to the intervention and 293

GPs were provided the intervention.

• The authors mention that 393 GPs are lost to follow-up. Can you explain this?

o We considered all GPs eligible for inclusion in the data collection, regardless of whether they

participated in the outreach visits. Thus, all 444 GPs in the intervention group were eligible. Only 59

consented to participate, of whom 8 did not participate in the interview after all (as explained in

the results section). The rest were therefore considered lost to follow-up in data collection (the

specific number of GPs not responding and of those who consented but did not participate is

clarified in the figure). We have corrected the numbers in the flow chart in figure 1.
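The arithmetic behind this answer can be checked with a short sketch. It uses only the counts quoted in this response for the intervention arm; the variable names are illustrative, and the corrected flow chart in figure 1 remains the authoritative source for the final numbers.

```python
# Counts quoted in this response for the intervention arm.
allocated = 444        # GPs allocated to the intervention
consented = 59         # GPs who consented to take part in data collection
not_interviewed = 8    # GPs who consented but did not take part in the interview

# GPs who actually contributed data, and those counted as lost to follow-up.
interviewed = consented - not_interviewed
lost_to_follow_up = allocated - interviewed

print(interviewed, lost_to_follow_up)
```

Under these counts, 51 GPs were interviewed and 393 were counted as lost to follow-up, which is the number the reviewer asks about.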

The authors mention that in total, 124 GPs (response rate 14%) participated in the data collection

(p.8).

• This is not according to the figure, data from the 124 GPs was analysed. My advice is to use the

CONSORT 2010 Flow Diagram.

o We have adjusted the flow chart in accordance with the 2010 CONSORT flow chart. We have

corrected a few inaccuracies in the “Lost to follow up” boxes (for both intervention and control

groups), hopefully making it easier to follow.

The authors mention a response rate of 14% (p.8).

• Is this correct? 124 GPs from which sample?

o 124 GPs are 14% of the total sample of GPs (900) in both control and intervention groups.
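As a quick check of this figure, the following sketch recomputes the response rate from the two numbers given in the answer above. The variable names are illustrative only.

```python
# Numbers quoted in the answer above.
participating_gps = 124   # GPs who took part in the data collection
total_gps = 900           # all GPs in the intervention and control groups

# Response rate as a proportion of the total sample.
response_rate = participating_gps / total_gps

print(f"{response_rate:.1%}")
```

The unrounded value is about 13.8%, which rounds to the 14% reported in the paper.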

Discussion

Strengths and limitations

The authors mention: “The major limitation of our study is that we were not able to recruit a

sufficient number of general practitioners to participate in the data collection within the time and

resource limits that applied to our project”.

For future research it is important to learn from this study, and it is therefore important that the

authors give insight into the recruitment strategy they used.

• The authors mention that the GPs’ secretaries often guarded the GPs, so that the researchers

were not able to get in touch with them. It surprises me that the researchers did not devise a

strategy for this beforehand, because it is a common way for GPs to organize their work.

o We agree – we could have planned for this.

Comparison with existing literature

Although it is interesting to compare with former tailored studies, I think it is valuable to give

some attention to other studies about obstacles to performing clinical trials in relation to the trial

you carried out, for example Richter-Sundberg et al., 2014: Improving treatment of

depression in primary health care: a case study of obstacles to perform a clinical trial designed to

implement practice guidelines, and also to literature about how to overcome the obstacles.

o Thank you, we agree that this is a highly relevant article, and have included it in the discussion:

Richter-Sundberg and colleagues conducted a post-RCT qualitative study to identify barriers to

the implementation of a clinical practice guideline for depression in Sweden. In spite of fees for

performance and consent to participate collected prior to the implementation process, the project

was not able to recruit the required number of patients to reach sufficient statistical power (after

18 months only 30 patients were included) [45]. Based on the framework described by Grol and

Wensing [46], the authors identified and sorted numerous barriers that hindered participation in

the study. The excessive workload associated with the research design was one major barrier.

Introducing new psychological therapies that challenged established professional role identity

was another.

o Regarding literature about how to overcome the obstacles, this is still a field with much

uncertainty and little consensus, making future research projects testing the effectiveness of

tailoring important.

The authors mention that in their study adherence to the recommendations was higher than in

previous studies [9], 58% and 52% in intervention and control groups respectively. (p. 11)

• Can you describe what the similarities and differences are between the study of Smolders and

your study? In my opinion the studies are quite different in design and data collection. In

addition, your study was underpowered and I wonder if a comparison can be made.

o We agree that Smolders’ study is not directly comparable to ours in terms of design and data

collection. We have addressed this issue extensively in a more elaborated introduction to the

“Comparison with existing literature” section: There are few studies that measure adherence to

depression guidelines and they are not directly comparable to our study. A review found that

general practitioners’ adherence to mental health clinical practice guidelines is low [38].

Fernandez and colleagues [ref] found in a large epidemiological study based on interviews with

21 425 home-dwelling persons in six European countries, that “treatment adequacy” for

depression as defined by the research group was particularly low (23%) in the patient group that

received management in “general medical care” (which included general practitioners and

specialists other than psychiatrists and psychologists). Duhoux and colleagues [40] found that

elderly patients (65+) received less guideline-concordant management than younger

adults, regardless of which definition of concordance was selected. Smolders and

colleagues [9] combined information from a patient questionnaire that measured depression and

anxiety symptoms with general practitioners’ performance as recorded in the electronic medical

patient records, and found that only 42% of the depressed patients received management in accordance with

evidence-based management of depression as defined by an expert panel. In our study adherence

to the recommendations tended to be higher than in previous studies that have reported

adherence to depression guidelines, 58% and 52% in the intervention and control groups

respectively. The weak recruitment of general practitioners to participate in data collection might

indicate that we were only able to collect data from general practitioners that were particularly

interested in this patient group, both in the intervention and in the control group, which may

explain the relatively high adherence rate.

Conclusions

• Is it not premature to mention that: “Our tailored implementation strategy is unlikely to have

had more than a modest effect, despite our having put substantial effort into using multiple

methods to identify determinants of practice and to design a multifaceted implementation

strategy to address those determinants” ?

o The confidence interval for the primary outcome indicates a possible 9 percent improvement in

adherence, which is the basis for our conclusion. We agree with the reviewer that the finding is

weak. Thus we find it reasonable to conclude that it is “unlikely” to have more than a modest

effect.

Minor essential revisions

Title and abstract

• The recommendations are derived from guidelines. Please mention this in the title and abstract:

with ‘guideline recommendations’ it is clearer what is meant by recommendations.

o We have changed accordingly

Abstract

Methods/design

• The authors mention that ‘The interventions targeted healthcare professionals, administrators

and patients,…’. Afterwards they write: ‘We offered the intervention to all GPs in the intervention

municipalities’. It is not clear to whom the interventions are offered.

o We agree, this is not clearly described. We have re-written: We offered outreach visits to all

general practitioners and practice staff in the intervention municipalities.

Table 2

• Hospital Anxiety and Depression Scale (HADS) instead of Hospital and Anxiety Scale.

o Of course. Thank you!

Post-hoc analyses

• In table 4 you mention as a separate category ‘recurrent depression/dysthymia’. For

clarification add dysthymia also in the text (p.9).

o Done. We have also omitted the term “chronic depression” as this is not an established

diagnosis in ICD-10.

Results

Among the 385 patients, 221 patients (58%) suffered from recurrent or chronic depression

according to ICD-10 criteria. Of the remaining 164 patients with a first depressive episode,

almost 40% suffered from a severe episode (p.9).

• What do you mean by chronic depression according to the ICD-10: persistent mood disorders or,

more specifically, dysthymia?

o We have removed the term “chronic depression”, see explanation above

Strengths and limitations

“Getting GPs to contribute to the data collection once we contacted them was less difficult than

getting in contact with them (p. 10)”.

• The sentence is not clear to me. What do you mean?

o We have elaborated: The biggest challenge was to get in contact with the general practitioners.

When we had been able to talk to them, it was less difficult to get the general practitioner’s

consent to participate in the data collection.

Discretionary revisions

Abstract

Background

• In two sentences ‘tested’ is mentioned. In the second sentence, you can use, for example,

‘examined’ instead of ‘tested’.

o Done
