Brief Cognitive Behavioural Therapy Compared to Optimised General Practitioners’ Care for Depression: A Randomised Trial

Background: How to treat Major Depressive Disorder (MDD) in primary care? Studies that compared (brief) Cognitive Behavioural Therapy (CBT) with care as usual by the General Practitioner (GP) found the former to be more effective. However, to make a fair comparison, GP care should be optimised and protocolised according to current evidence-based guidelines for depression. So far this has not been the case. We studied whether a protocolised 8-session CBT is more effective than optimised and protocolised GP care (GPC). Methods: 121 patients with MDD, aged 18-70 years, from 40 Dutch general practices, were randomised to either brief CBT or GPC. Assessments were at baseline (t0), 12 weeks (t1) and 52 weeks (t2). Main outcomes: decrease in depressive symptoms, response and remission on the Hamilton Depression Rating Scale-17 (HDRS-17) and the Patient Health Questionnaire-9 (PHQ-9). (Trial registration: ISRCTN65811640). Results: Both continuous and dichotomous HDRS-17 and PHQ-9 outcome scores favoured the brief CBT group. Number of treatment contacts and external referrals were not different between groups. GPs prescribed antidepressants (AD) to 48% of GPC patients and to 11% of CBT patients. Conclusions: Brief CBT by psychologists seems more effective than optimised GPC. Effect sizes comparable to the (statistically significant) results from meta-analyses, together with lower AD prescriptions, are both in favour of brief CBT, which might make it a first choice treatment for patients with MDD in general practice.


Introduction
Major Depressive Disorder (MDD) is a heterogeneous illness with quite different levels of severity and duration, varying numbers of recurrences over the lifetime and a ten percent risk of a chronic course. The one-year prevalence rates range from 4% to 8% in the general population [1][2][3] and rise to 12% to 25% among primary care (PC) populations across the world [4]. Related to this heterogeneity, treatment of MDD differs widely in terms of method, intensity, duration, setting and professional(s). Patients with severe, enduring or highly recurrent MDD and those with complicating psychiatric and/or somatic co-morbidity are mostly treated in specialized mental health care settings. However, the majority of patients suffer from mild to moderate MDD and are primarily treated in PC by their General Practitioner (GP) or by mental health professionals, such as psychologists [5,6].
For mild to moderately severe MDD, treatment guidelines from the National Institute for Health and Clinical Excellence [7] and the American Psychiatric Association [8] recommend low intensity psychological therapy and only in some cases antidepressants (AD; e.g. past history of severe MDD or inadequate response to initial interventions). For moderate to severe MDD these guidelines recommend AD, psychotherapy, or a combination of both. Of the psychotherapeutic interventions for MDD, Cognitive Behavioural Therapy (CBT) so far has the most robust evidence [9][10][11].
CBT for primary care settings is mostly delivered in brief forms, which include six to eight sessions [12][13][14], about ten sessions fewer than 'classic' CBT. In this form CBT is more suitable for PC settings and more acceptable for PC patients. From an explanatory trial perspective, brief psychological interventions are also better suited for a comparison with nonpsychological treatments than the classical longer ones: with longer treatments, differences in the total amount of treatment and attention make it difficult to interpret differences in outcome in favour of the psychological treatment [15].
Effectiveness studies and a recent meta-analysis showed that for MDD patients brief CBT is more effective than treatment as usual by GPs, especially during the first four months [16][17][18][19][20]. A limitation of most comparisons of treatment as usual with an experimental treatment is that the former is not standardised according to recent clinical guideline recommendations with regard to what 'optimal' usual care should include. This unfair competition may result in a biased outcome: an overestimation of the effect of the experimental treatment (CBT).
Protocolising treatment as usual and improving adherence to this protocol are possible solutions to this methodological issue [21][22][23][24]. In this study we did so by standardising the Dutch GP Practice Guideline for MDD [25] and by giving GPs a short training in order to optimise adherence to this protocol. We will refer to this intervention as optimised general practitioners' care (GPC). We hypothesised that brief CBT is more effective than GPC for primary care patients with MDD.

Setting
The study was conducted in patients of 40 general practices in two parts of the Netherlands: West (Amsterdam region) and East (Nijmegen region). Inclusion took place from January 2007 through August 2010 and follow-up assessments continued until August 2011. The study protocol was approved by the institutional ethics review committees of the Academic Medical Center in Amsterdam and the Radboud University Medical Center in Nijmegen. More details of the study design, including the content of the two treatments, can be found in Baas et al. [26].

Participants and procedure
GPs referred patients with a possible MDD to the study centre. Patients who gave informed consent for the diagnostic phase of the study were assessed by telephone on demographics and clinical characteristics within three days with the Structured Clinical Interview for DSM-IV Axis I Disorders (SCID-I). Inclusion criteria were: MDD and age between 18 and 70 years. Exclusion criteria were: mental retardation, schizophrenia, bipolar disorder, severe suicidal thoughts, terminal somatic illness, receiving active MDD treatment (a. use of anti-depressive medication, b. psychotherapy, or c. supportive consultations by the GP or social worker) and insufficient comprehension of the Dutch or English language. Eligible patients were next assessed with the Hamilton Depression Rating Scale-17 (HDRS-17) and the Patient Health Questionnaire-9 to establish the severity of MDD. After that they were asked for participation in the trial by their own GP.

Randomisation
After informed consent, the GP performed an internet-based randomisation procedure. A centralised computer program generated a block randomisation of four blocks stratified by gender and location (Nijmegen/Amsterdam), in which the patient was randomised over the two treatment strategies: optimised general practitioners' care (GPC) or brief cognitive behavioural therapy (CBT). The computer then generated an outcome on the GP's computer screen and simultaneously sent an email of this outcome to the researcher. In this way, patients and investigators were not able to foresee the allocation.
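The procedure described above can be illustrated with a minimal sketch of stratified permuted-block randomisation. This is not the program used in the trial: the block size of four with a 1:1 ratio is an assumption, and the class and function names are hypothetical.

```python
import random

ARMS = ("GPC", "CBT")


def make_block(rng, block_size=4):
    """Return one shuffled block containing each arm equally often."""
    if block_size % len(ARMS) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    block = list(ARMS) * (block_size // len(ARMS))
    rng.shuffle(block)
    return block


class StratifiedBlockRandomiser:
    """Hypothetical allocator: one independent block sequence per stratum."""

    def __init__(self, seed=None, block_size=4):
        self._rng = random.Random(seed)
        self._block_size = block_size  # assumed block size, not from the source
        self._queues = {}  # stratum -> allocations left in the current block

    def allocate(self, gender, location):
        """Return the next allocation for the (gender, location) stratum."""
        stratum = (gender, location)
        queue = self._queues.get(stratum)
        if not queue:  # no block yet, or the current block is used up
            queue = make_block(self._rng, self._block_size)
            self._queues[stratum] = queue
        return queue.pop()
```

Because every block is balanced, the two arms can differ by at most half a block within any stratum, whatever the order in which patients arrive.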

Interventions

General practitioners' care (GPC)
The GPC protocol (developed by HvW and AS; available on request) was based on the Dutch College of General Practitioners Practice Guideline (NHG-Standard) for depressive disorder [25]. GPC had a duration of 12 weeks and consisted of monitoring of symptoms, psycho-education, discussion of possible causes of or reasons for MDD, problem solving techniques, lifestyle advice and supportive contacts. In accordance with the guidelines, GPs were free to decide whether they prescribed an antidepressant (AD) and, if so, which one. Minimum frequency was one contact (10-15 minutes) every two weeks during the first six weeks, and one telephone contact and one face-to-face evaluation contact during the next six weeks. This number could be increased in case of severity of symptoms and/or complaints and/or lack of social support [26].

Brief cognitive behavioural therapy (brief CBT)
The brief CBT protocol (developed by CLHB; available on request) consisted of eight sessions (50 minutes) within the 12-week period. Treatment included behavioural activation and cognitive interventions, including identification and challenging of negative thoughts and underlying attitudes and schemas [26]. Therapists were skilled psychologists connected to one of the participating general practices. They all were specifically trained in this brief CBT, had a Master in Clinical Psychology and a four-year post-academic education in behavioural therapy. They were all members of the Association of Behavioural and Cognitive Therapy. CBT sessions were delivered at the GPs' practices.
All brief CBT sessions were audio-taped and treatment integrity was assessed by checking, in a random sample of 10% of each therapist's tapes, whether the essential ingredients of the intervention were present (behavioural activation, i.e. identifying and expanding potentially pleasant activities, and identifying and challenging negative thoughts/formulating rational thoughts). Therapists had intervision sessions (an organized meeting between colleagues in which performed sessions and related problems are discussed) and supervision sessions (an organized meeting between the therapist and a supervisor, a fully trained cognitive behavioural therapist with a Master in Clinical Psychology and a four-year post-academic education in Behavioural Therapy and Clinical Psychology). During the 12-week treatment period therapists were asked to refrain from referring the patient to the GP for medication.
All SCID-I interviewers participated in ongoing training sessions and monthly consensus meetings supervised by an expert psychiatrist (JH) to maximize accuracy and consistency in the administration.

Outcome measures
As primary outcome we used a clinician-rated scale, the Hamilton Depression Rating Scale-17 and a patient-rated scale, the Patient Health Questionnaire-9. For both scales we measured continuous outcomes and dichotomous outcomes (response, remission).
The Hamilton Depression Rating Scale-17 (HDRS-17; [29]) is a well-known clinician rated, semi-structured clinical interview for assessing the severity of depressive disorder which has good psychometric properties [30]. In this study the HDRS was administered by telephone [31] by three independent interviewers, trained by an expert psychiatrist (JH). To maximize accuracy and consistency in the administration we used the same training and supervision procedure as with the SCID-I. The HDRS-17 total score ranges from 0 to 52 points and is categorised as: 8-13, mild; 14-18, moderate; 19-52, severe [32].
The Patient Health Questionnaire-9 (PHQ-9; [33,34]) is the 9-item patient-rated subscale of the Primary Care Evaluation of Mental Disorders (PRIME-MD), which has good psychometric characteristics [35] and connects well to primary care [36,37]. In this study the PHQ-9 was sent by mail. It evaluates the presence of the nine DSM-IV criteria for a depressive episode. The sum score ranges from 0 to 27 and can be categorised as: 5-9, mild; 10-14, moderate; 15-19, moderately severe; 20-27, severe. The PHQ-9 also offers a diagnostic algorithm. A positive outcome on this algorithm requires that five or more of the nine depressive symptom criteria of the DSM-IV are present more than half the days in the past two weeks (suicidal thoughts count if present at all), and at least one of these five or more symptoms has to be depressed mood or anhedonia.
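The PHQ-9 sum score categories and the diagnostic algorithm described above can be sketched as follows. The sketch assumes the standard PHQ-9 item coding (0 = not at all, 1 = several days, 2 = more than half the days, 3 = nearly every day) and item order (item 1 anhedonia, item 2 depressed mood, item 9 suicidal thoughts); the function names are hypothetical, and the label "minimal" for scores of 0-4 follows the validation study cited in the text.

```python
def phq9_total(items):
    """Validate nine item scores (each 0-3) and return the sum score (0-27)."""
    if len(items) != 9 or any(score not in (0, 1, 2, 3) for score in items):
        raise ValueError("expected nine item scores, each between 0 and 3")
    return sum(items)


def phq9_category(total):
    """Map a sum score to the severity categories used in the text."""
    if not 0 <= total <= 27:
        raise ValueError("PHQ-9 sum score must be between 0 and 27")
    if total <= 4:
        return "minimal"
    if total <= 9:
        return "mild"
    if total <= 14:
        return "moderate"
    if total <= 19:
        return "moderately severe"
    return "severe"


def phq9_algorithm_positive(items):
    """Apply the diagnostic algorithm to nine item scores."""
    phq9_total(items)  # reuse the validation
    # Items 1-8 count when present more than half the days (score >= 2).
    present = [score >= 2 for score in items[:8]]
    # Item 9 (suicidal thoughts) counts if present at all (score >= 1).
    present.append(items[8] >= 1)
    core_symptom = present[0] or present[1]  # anhedonia or depressed mood
    return sum(present) >= 5 and core_symptom
```

Note that the sum score and the algorithm can disagree: a patient can be algorithm-positive with a sum score in the mild range, which is one reason to report both.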

Dichotomous outcomes
Response: defined as a total score reduction of >50% on the HDRS-17 or the PHQ-9 [38][39][40]. Response was assessed only at the end of treatment (12 weeks).
Remission: Using the HDRS-17, remission was defined as a total score of 7 or less. For the PHQ-9 there is no consensus on the definition of remission. To make comparisons with other studies possible we applied two definitions. First, the original validation study of the PHQ-9 [34] places scores of 0-4 in the minimal depression range and scores of 5-9 in the mild depression range. We therefore defined remission as a score of <5. Second, the cut-off value more widely used to identify a positive case for depressive disorder is a total score of 10 or higher. We therefore also used a more lenient remission definition: a PHQ-9 total score <10 [39][40][41]. Remission was assessed both at 12 weeks (t1) and 52 weeks (t2).
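The dichotomous definitions above translate directly into code. The following is a minimal sketch with hypothetical function names; integer arithmetic is used for the response criterion so that a reduction of exactly 50% is not counted as a response.

```python
def is_responder(baseline_total, week12_total):
    """Response: reduction of more than 50% from the baseline total score."""
    if baseline_total <= 0:
        raise ValueError("baseline total score must be positive")
    # Equivalent to (baseline - week12) / baseline > 0.5, without rounding issues.
    return 2 * (baseline_total - week12_total) > baseline_total


def hdrs17_remission(total):
    """Remission on the HDRS-17: total score of 7 or less."""
    return total <= 7


def phq9_remission_strict(total):
    """Remission on the PHQ-9, strict definition: total score below 5."""
    return total < 5


def phq9_remission_lenient(total):
    """Remission on the PHQ-9, lenient definition: total score below 10."""
    return total < 10
```

A PHQ-9 score of 7, for example, counts as remission under the lenient definition but not under the strict one, which is why the two definitions yield different remission rates.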

Blinding
Interviewers were kept blind to the group assignment. They only received the telephone number and name of the patient to be interviewed. Before the interview patients were asked "not to reveal their treatment condition". In the monthly consensus meetings interviewers were asked in how many cases the blinding had been broken. They concluded that in 85% to 95% of the cases patients had been able to retain the concealment.

Treatment
In order to optimise GPC each participating GP received training before the start of the study. During this one hour training we educated the GP in the treatment protocol and discussed the content of each of the contacts. After the training we provided a ring binder with the treatment protocol (containing the content of each contact) and informed the GP about the possibility of consulting an independent physician for questions about the treatment protocol during the intervention period.
To monitor the actual content of the provided GPC and potential supplementary treatment next to GPC and brief CBT, GPs received a questionnaire at the end of the 12 week treatment period (t1) in which we asked how many appointments they had had with the patient, whether they prescribed an antidepressant (+ type and dosage) and whether they had referred CBT or GPC patients for supplementary treatment to another health care professional.

Statistical Methods
To evaluate potential differences in baseline characteristics between the two groups we used the t-test for independent samples and the chi-square test for categorical variables. When expected cell counts were too small for the chi-square test, Fisher's exact test was used.
To take potentially biased outcomes caused by selective loss to follow-up into account we used multiple imputation (MI) which, assuming missing at random (MAR) for missing values, gives unbiased results with correct standard errors [42]. The results of the multiple imputation analyses (5 imputed datasets) were combined using Rubin's rules [43]. F-test values and degrees of freedom were calculated by the method proposed by Marshall et al. [44]. Since MI-based pooled estimates are considered less biased, all analyses are based on the MI data. In the tables we show the estimates based on the imputed data; in the legend of the tables the estimates based on the actual observed data are presented. Furthermore, all analyses were intention to treat. Analyses were carried out with SPSS Statistics 18.0. To guard against an increased family-wise error due to multiple testing we corrected the significance threshold from .05 to .01. We used this somewhat more lenient correction rather than the Bonferroni correction because the 7 tests in our article were not independent (e.g. HDRS score is correlated with PHQ score), which would make a Bonferroni correction too strict and would result in an unnecessary loss of power.
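The choice between the chi-square test and Fisher's exact test for a 2x2 table can be sketched as follows. This is an illustration, not the SPSS procedure used in the study: the threshold of an expected count below 5 is a common rule of thumb and an assumption here (the text only says "too small"), and the function names are hypothetical. The two-sided p-value sums the probabilities of all tables that are no more likely than the observed one.

```python
from math import comb


def expected_counts(table):
    """Expected cell counts of a 2x2 table under independence."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    return [[row * col / n for col in cols] for row in rows]


def use_fisher(table, min_expected=5):
    """True when any expected count is below the (assumed) threshold."""
    return any(e < min_expected for row in expected_counts(table) for e in row)


def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact p-value for a 2x2 table of counts."""
    (a, b), (c, d) = table
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    lowest, highest = max(0, c1 - r2), min(r1, c1)
    p_observed = prob(a)
    # Small relative tolerance so that ties with the observed table are included.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
```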
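As an illustration of how estimates from several imputed datasets are combined, the sketch below pools point estimates and their squared standard errors with Rubin's rules, and computes the Bonferroni threshold that the text compares against. It is not the SPSS routine used in the study; the function names are hypothetical and any numbers passed to it are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, squared_ses):
    """Pool m imputed-data estimates; return (pooled estimate, pooled SE)."""
    m = len(estimates)
    if m < 2 or m != len(squared_ses):
        raise ValueError("need at least two estimates and matching variances")
    pooled_estimate = mean(estimates)
    within = mean(squared_ses)      # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled_estimate, sqrt(total)


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold under a full Bonferroni correction."""
    return alpha / n_tests
```

For the 7 tests mentioned in the text, a full Bonferroni correction would give a threshold of .05/7, about .007, which is stricter than the .01 that was used.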

Treatment effect
Continuous outcomes in HDRS-17 and PHQ-9 scores were analyzed with a linear mixed model regression analysis with change between t0 and t1 and change between t0 and t2 as dependent variables, treatment group, time and the treatment group by time interaction as independent variables, and t0 severity (HDRS total score) as covariate. In this model the main effect of treatment group indicates the treatment effect and the treatment by time interaction indicates whether this effect is sustained. When the treatment by time interaction was significant, we performed planned contrasts to assess whether changes from t0 differed between both groups at t1 and at t2. When the treatment by time interaction was not statistically significant, we fitted a regression model without the interaction term and present the effect estimate based on this model.
[...] variable and treatment group as independent variable, and baseline severity as covariate. Percentage remission was analyzed with longitudinal logistic regression analysis, using the SPSS Generalized Estimating Equations programme (GEE) with a logit link function and a binomial error distribution. The dependent variable was remission in terms of the HDRS-17 or the PHQ-9. Independent variables were treatment group, time and the treatment group by time interaction, with t0 severity as covariate.
Figure 1 shows the patient flow through the study [45]. During the recruitment phase, 175 patients were referred; five patients (2.9%) improved during recruitment and therefore declined participation; 170 could be assessed for eligibility. Of these 170 patients, 34 (20%) did not meet the inclusion criteria, eight (4.7%) refused randomisation due to treatment preference, six (3.5%) said they had improved and therefore declined randomisation, and one (0.6%) did not accept the diagnosis of MDD; 121 (71.2%) met the inclusion criteria and agreed to participate and to be randomised.
Data on outcome were obtained for 94 (78%) patients at 12 weeks and for 77 (64%) patients at 52 weeks.
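The flow counts and follow-up rates reported above can be checked with a few lines of arithmetic. All counts are taken from the text; the variable and function names are only illustrative.

```python
referred = 175
improved_during_recruitment = 5
assessed = referred - improved_during_recruitment

# Reasons for not being randomised among the assessed patients.
not_randomised = {
    "did not meet criteria": 34,
    "treatment preference": 8,
    "improved": 6,
    "did not accept diagnosis": 1,
}
randomised = assessed - sum(not_randomised.values())


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Outcome data available at 12 and 52 weeks, as a share of randomised patients.
followup_12_weeks = pct(94, randomised)
followup_52_weeks = pct(77, randomised)
```

The counts are internally consistent: the exclusions sum to 49, leaving 121 of the 170 assessed patients, and the follow-up percentages round to the 78% and 64% reported.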

Dichotomous outcomes
Response: Table 3 shows that after 12 weeks of treatment on the HDRS-17 a response was shown by 34.1% of the GPC group and by 48

Adherence and Antidepressant Use
Patients in the brief CBT group received a mean of 6.1 sessions with the psychologist (range 0-8; s.d. 2.7). Three patients did not attend any treatment session. The most common reason for termination of therapy was lack of motivation to attend the sessions. Patients in the GPC group received a mean of 4.9 appointments with their GP (range 1-12; s.d. 2.2). All patients attended at least one GP appointment.
Treatment integrity assessment of CBT showed that the essential ingredients of the intervention (i.e. behavioural activation and challenging negative thoughts/ formulating rational thoughts) were present. No interference in the work procedures of the psychologists was necessary.
GPs prescribed antidepressants to 48% of the GPC patients and to 11% of the CBT patients. In total one in seven patients received an outside referral for additional mental health treatment (11% in brief CBT and 18% in GPC).

Discussion
We investigated whether brief cognitive behavioural therapy (CBT) applied by psychologists was more effective than optimised and protocolised general practitioners' care (GPC), based on a GP treatment guideline for primary care patients with MDD. Over the full 52-week follow-up the depression severity in both groups reduced to about half of the baseline value. Both on the HDRS-17 interview and the PHQ-9 self-rating about 70% of this improvement occurred between baseline (t0) and the end of treatment at 12 weeks (t1). Our study shows that CBT is at least as effective as GPC (the treatments did not differ on the dichotomous outcomes: response at t1 on HDRS and PHQ-9, and remission at t1 and t2 on HDRS and PHQ-9) and gives an indication that CBT is perhaps more effective (improvement on the mean PHQ-9 score differed significantly between both groups in favour of brief CBT [effect size: .51] and the combined overall trend appears to consistently favour brief CBT). Until now effectiveness studies and a recent meta-analysis showed that brief CBT is more effective than treatment as usual by GPs. We studied an optimised form of GP care and conclude that brief CBT is just as good and perhaps better.
It cannot be ruled out that due to the sample size our tests had insufficient power to detect effects of the observed magnitude. This is supported by the fact that our effect sizes (most between .30 and .50) are comparable in magnitude to the statistically significant pooled effect sizes of three meta-analyses of CBT versus usual care [20,46,47], which were all in favour of CBT and were in the range of 0.33 to 0.42. Given the effect sizes it is not unreasonable to expect that we might have found more statistically significant effects of treatment type with a larger sample size, although such a larger sample size might reduce effect sizes [11].
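The power argument can be made concrete with a rough sample size sketch. It uses the normal approximation for a two-sided comparison of two group means with equal group sizes, n per group = 2(z_{1-alpha/2} + z_{power})^2 / d^2. The significance level of .01 comes from the text; the target power of 80% is an assumption, the function name is hypothetical, and the calculation ignores the covariate adjustment and repeated measures of the actual analysis.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.01, power=0.80):
    """Approximate patients needed per group to detect a standardised effect."""
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = standard_normal.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)
```

Under these assumptions an effect size of 0.4 requires about 146 patients per group, and even an effect size of 0.5 requires about 94 per group, both well above the 121 patients randomised in total.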
For the PHQ-9 we used two remission rates and found that those based on the definition of a PHQ-9 cut off <10 were more comparable to the HDRS-17 remission rates than those based on the definition of a PHQ-9 cut off <5. Compared to HDRS-17 the first one (<10) seems to overestimate remission by some 10%, while the second one (<5) underestimates remission by some 20%.
Our remission data at the 52-week follow-up need some explanation. First, it is known that 50% of depressive episodes will be in remission at three months, but another 25% will last longer than 12 months [48]. It is also known that during the first year the proportion of patients relapsing/recurring is 25% [49]. To consider a new period of depression as a real recurrence the DSM-IV states that between episodes 'there must be an interval of at least 2 consecutive months in which criteria are not met for a Major Depressive Episode'. Unfortunately we were not able to assess relapses or recurrences of depression in that detail over the t1-t2 period. We only know that non-remission (considering PHQ-9 cut-off <10 and HDRS-17) at 52 weeks was in the 28-44% range for CBT and in the 45-55% range for GPC. Whether this non-remission was related to 'real' non-remission, relapse or recurrence is not known.
This issue of remission/recurrence is of course interesting because one important advantage of CBT in comparison with AD is its prophylactic effect on depression recurrence after acute CBT/AD treatment has been stopped [49][50][51]. However, in almost all studies on this prophylactic aspect, CBT was of the 16-18 session type, as was common for acute-phase CBT, and so it remains unclear whether the brief CBT we studied also has an enduring effect.
Patients in the brief CBT group received six therapy sessions (sometimes called ultra-brief CBT; [52]), about two sessions fewer than the intended and protocolised eight sessions, which is in line with other studies on brief CBT in primary care [13,20]. The five (face-to-face or telephone) appointments of the GPC patients correspond to the recommended number in the protocol. During the 12-week treatment phase patients in the GPC group received one session fewer than the brief CBT patients, which suggests a comparable number of professional contacts in both groups. Furthermore, the referral rate to other mental health treatments was also comparable: one in seven patients in both groups.
Antidepressants (AD) were one of the treatment options in the GPC. The GPs were educated in the treatment protocol and they received feedback on the SCID-I and HDRS interview results (i.e. diagnosis and severity of MDD). Through this procedure they were in an optimal position to decide whether to prescribe ADs in accordance with the primary care guidelines. This finally resulted in AD prescriptions for 48% of the GPC group. We think this percentage is in accordance with the depression severity at baseline, but rather low compared to other GP studies where 49-96% of the patients received ADs [17][53][54][55].
Interestingly, we have to conclude that the somewhat better depression outcome was accomplished with a small percentage of AD prescriptions (11%) in the brief CBT group. Since studies showed that patients generally prefer psychological therapy to pharmacological therapy [56][57][58][59], this low AD prescription rate may be another advantage of brief CBT. However, we have no assessment of patients' preference or satisfaction, and so this advantage is not proven in our sample.
Earlier studies on brief CBT in depressed PC patients are scarce and showed brief CBT to be more effective than treatment as usual, especially in the short term, but the advantages were small [16][17][18][47]. Our study differs from these earlier studies in several ways. First, we optimised treatment as usual by using the treatment protocol and the GP training. We think that, as a result, our study design gives a more valid comparison and consequently a more valid estimate of the differences in treatment effects between brief CBT and GPC. To our knowledge only one other study also educated GPs; Conradi et al. [19] invited GPs to attend a 2-hour booster session about guidelines for the treatment of depression. Second, the number of offered sessions of the brief CBT differed. In two of the earlier studies brief CBT consisted of ten or more sessions (range 10-12 sessions; [16,19]), while we finally used the six sessions found in clinical practice. Furthermore, the content of the brief CBT differed. Scott et al. [17], for instance, used mainly cognitive techniques instead of a mixture of cognitive and behavioural aspects as we did. Finally, the way the severity of depression was assessed differed. Only one study used both a patient-rated and a clinician-rated instrument [17]. The other studies used either a clinician-rated instrument (HDRS-17; [16]) or a patient-rated instrument (Beck Depression Inventory [60]). A recent meta-analysis showed that clinician-rated and patient-reported measures of improvement in depression studies are not equivalent and the authors recommended using both [14], as we did.

Strengths and limitations
Our study has limitations. First, one could argue that a control or treatment as usual group should have been included. We think that CBT had already been proven effective as compared to care as usual [20] and so we considered a placebo or control group not necessary and even unethical. Second, we obtained no information about harms (adverse effects) of the GPC while, for example, side effects of medication can influence whether an intervention is acceptable. Third, although we tried hard, we did not succeed in measuring adherence to the GPC protocol because this measure was too time-consuming for busy GPs. However, we monitored the actual content of the provided GPC by sending each GP a short questionnaire at the end of the intervention period in which we asked how many appointments they had had with the patient, whether they prescribed an antidepressant and whether they referred patients to another health care professional. This provided a global estimate of adherence to the protocol. Fourth, we did not measure the total amount of treatment patients in both conditions received between 12 weeks and follow-up at 52 weeks. Finally, we used multiple imputation (MI), which requires the missing data mechanism to be missing at random (MAR). Unfortunately, the MAR assumption is not testable [62]. Serious violations of MAR are, however, unlikely and lesser violations are only problematic in specific situations that are also rare [63][64][65]. Alternatives to MI like complete cases analysis or mixed model regression analysis on the observed data either are much more likely to lead to biased results (complete cases analysis) or also require MAR (mixed model regression). Even though we used the best method we cannot completely rule out that violations of the MAR assumption induced some bias. However, even in this case the results after MI are most likely less biased than the results from the observed data.
Strengths of this study are the adequate randomisation procedure, the follow-up of 52 weeks, the assessment of AD prescriptions and the detailed analyses of treatment effects. We have tried to improve generalizability by including a great number of general practices from two geographical areas and by using lenient inclusion criteria, permitting co-morbidity [61,62]. Finally the optimisation and standardisation of the GP treatment and the limitations for external referrals allowed a fairer comparison.

Conclusion
We conclude that compared to optimised GPC the brief CBT we developed is probably more successful on depression outcome. This finding, together with a substantial reduction of AD prescriptions in the brief CBT group and the similarity of our effect sizes to the (statistically significant) pooled effect sizes of three meta-analyses of CBT versus usual care, makes brief CBT a good first choice treatment for PC patients with MDD. For the majority of patients a primary care intervention seemed sufficient. One in seven patients was referred for additional therapy. However, further trials are needed to support the probable superiority of brief CBT in comparison to optimised GP care and to examine the relative cost-effectiveness.

Funding
This study was funded by a grant from the Netherlands Organisation for Health Research and Development (ZonMw), program Mental Health (# 100.003.005 and # 100.002.021) and the Academic Medical Center/University of Amsterdam. The funder had no role in the study design; the collection, analysis, and interpretation of the data; the writing of the article; or the decision to submit it for publication. All authors were independent from the funder and had access to all the data.

Declaration of interest:
None.