Published Online: https://doi.org/10.1176/foc.2.3.462

Abstract

The goal of the study was to provide a quantitative analysis of the relative efficacy of all five currently available serotonin reuptake inhibitors (SRIs) and behavior therapy [exposure and response prevention (ERP)] for obsessive compulsive disorder. The relationship between effect size and methodological characteristics was also empirically examined. A search was conducted of several computerized databases covering the dates from 1973 to 1997. Seventy-seven studies were identified, yielding 106 treatment comparisons involving 4,641 patients. Effect sizes were analyzed between individual interventions and between intervention class [SRI, ERP or the combined treatment of an SRI with ERP (ERP/SRI)]. Data were analyzed both before and after controlling for methodological variables. The effect size for clomipramine (CMI) was significantly greater than the other SRIs, with the exception of fluoxetine (FLX). CMI was not significantly greater than ERP or ERP/SRI. As a class, ERP was significantly greater than SRIs as a whole. Effect sizes were larger for studies without a control group or random assignment, for self-reported outcome measures, and varied significantly by method of effect size calculation. Year of publication was significantly related to effect size. When controlling for these methodological variables, CMI was not significantly greater than FLX or fluvoxamine (FLV), and ERP was no longer significantly greater than the SRIs as a whole. No significant difference was found between CMI and the other SRIs as a group in head to head trials. No differences in drop-out rates were found. CMI stands out from the other SRIs. This difference is probably not clinically significant enough to warrant first choice treatment, given CMI’s greater lethality in overdose. The choice between an SRI or ERP is dominated primarily by the infrequent availability of ERP and to a lesser degree by personal preference. Methodological differences significantly impact effect size.

Introduction

Obsessive-compulsive disorder (OCD) was once thought to be a rare condition that was refractory to treatment and carried a poor prognosis. The development of exposure and response prevention (ERP) in the early 1970s resulted in the first empirically validated treatment for OCD. Earlier treatments such as psychodynamic therapy, which focused on the meaning of obsessions and compulsions, had not been successful in the treatment of OCD (Malan 1979). By contrast, the efficacy of ERP, which focuses on compulsive behaviors as treatment targets in and of themselves, has been well documented in over two decades of controlled clinical trials (Foa et al. 1985; Baer and Minichiello 1990; Steketee 1993).

Until recently, there were no medications available with an FDA-approved indication for the treatment of OCD. The first medication to receive such an indication, clomipramine (Anafranil) (CMI), was initially introduced in Europe as an antidepressant in 1966. Early reports of the efficacy of CMI in OCD and accumulating evidence in support of the serotonergic hypothesis on the etiology of OCD (Flament et al. 1987; Goodman et al. 1989a) led to several other serotonin reuptake inhibitors (SRIs) being tried as a treatment for this disorder. There is now empirical evidence from multi-center, double-blind, placebo-controlled trials supporting the efficacy of CMI (Clomipramine Collaborative Study Group 1991), fluoxetine (Prozac) (FLX) (Tollefson et al. 1994), sertraline (Zoloft) (SER) (Greist et al. 1995a), paroxetine (Paxil) (PAR) (Wheadon et al. 1993), and fluvoxamine (Luvox) (FLV) (Rasmussen et al. 1998) in the treatment of OCD. The FDA granted an indication for the treatment of OCD for CMI in 1989, FLX in 1990, FLV in 1994, and SER and PAR in 1996.

Treatment of choice: drugs, behavior therapy, or both?

In spite of the success of SRIs, a substantial proportion of OCD patients (roughly 30%) remain clinically unchanged after an adequate trial (Rasmussen et al. 1993). Still others (8–15%) discontinue treatment due to side effects (Greist et al. 1995b). Many treatment responders still have significant levels of symptomatology after treatment. For example, in the multi-center CMI study, mean scores on the Yale-Brown Obsessive Compulsive Scale (YBOCS) (Goodman et al. 1989b) dropped from 26 to 15 after 10 weeks of treatment, which is in the mild, but still clinical, range of severity (Clomipramine Collaborative Study Group 1991). In addition, relapse rates following discontinuation of SRIs are high: in one study, 89% of patients treated with CMI relapsed within 7 weeks after drug discontinuation even after 1 year of therapy at adequate dosages (Pato et al. 1988); in another, patients treated successfully with PAR relapsed an average of 62.9 days following placebo substitution (Steiner et al. 1995).

For behavior therapy, there are also problems to consider. Exposure therapy may involve considerable discomfort as patients expose themselves to anxiety-provoking situations while at the same time refraining from performing the rituals that would reduce their anxiety. Estimates on the percent of patients who complete ERP and are helped by it range from 67% to 90% (Hafner et al. 1981; Foa et al. 1992), and estimates on drop-out rates range from 20% to 25% (Rachman and Hodgson 1980; Hafner et al. 1981; McDonald et al. 1988). Thus, this method may help only 50% of patients with OCD (Hafner et al. 1981). Exposure therapy may be less successful for patients with obsessions alone (Marks 1981), although more recent results based on specific identification of mental rituals are better (Salkovskis and Westbrook 1989). In addition, patients with high levels of co-morbid depression do not appear to habituate during treatment (Marks 1981). Finally, approximately 25% of patients refuse behavioral treatment (Foa et al. 1983; McDonald et al. 1988), making the overall percentage of patients helped even lower (Baer and Minichiello 1990).

Given the advantages and disadvantages of each treatment, information is needed on their relative efficacy in order to guide treatment decisions. Several meta-analyses have been published comparing the relative efficacy of behavior therapy with pharmacologic interventions in OCD. Christensen et al. (1987) conducted a meta-analysis comparing several treatment approaches, including several types of psychosurgery. Using combined observer and self-ratings, the effect size for ERP was 2.34, compared with 1.41 for tricyclic medication, 1.12 for psychosurgery, and 0.24 for placebo. Results provide some estimate of the relative efficacy of SRIs and ERP, although interpretations are tentative, due to the combining of the effect size for CMI, the only available SRI at the time, with the effect sizes of other non-SRI tricyclics, and the omission of data on the other four currently available SRIs (SER, FLX, FLV, PAR).

A second meta-analysis was published by van Balkom and colleagues (1994). Effect size was reported separately for self-rated and assessor-rated outcome measures. On self-ratings, behavior therapy was significantly more effective than serotonergic antidepressants, and combined treatment was significantly more effective than antidepressants alone. However, on assessor ratings, no significant differences were found between the three treatment conditions.

While the van Balkom et al. study includes more recent information than the Christensen et al. study, it contains comparisons for only three of the five currently available SRIs (SER was not considered an SRI and thus was not included in the comparative analysis). Abramowitz (1997) criticized the study on methodological grounds, i.e., all effect sizes were computed using pre- to post-test data, even for studies where a control group was used. Effect sizes of pre-to-post designs have been found to be significantly larger than other designs, due to the smaller variability that results from pooling the variance from the same patients.

Three meta-analyses examined the relative efficacy between the SRIs without comparison to ERP. Stein and colleagues (1995) examined effect sizes separately for placebo-controlled (n=12), all controlled (placebo-controlled and medication-controlled, n=22), and all trials (controlled and uncontrolled, n=28). Using ANOVA techniques, they found CMI had a significantly greater effect size (1.64) than FLX (0.51) in placebo-controlled trials, and significantly greater effect size (1.71) than FLX (1.08) and FLV (1.14) in all controlled trials. In all trials (controlled and uncontrolled), no significant difference was found between the SRIs; however, when this sample was restricted to studies with 50 or more subjects (CMI, FLX, and SER studies only), CMI had a significantly greater effect size than FLX (effect sizes not reported). While the study provides information on the relative efficacy of the SRIs, the use of ANOVA techniques to compare differences in effect sizes between classes has been criticized due to the violation of the assumption of homogeneity of variance (Hedges and Olkin 1985; Stevens 1986). Hedges and Olkin (1985) note that due to differences in sample size, the error variance between studies can vary by a factor of 10 or 20, and that ANOVA does not provide information on whether the effect size estimates within each class are homogeneous.

Piccinelli and colleagues (1995) found CMI had a larger effect size (1.41) than FLX (0.57), FLV (0.57) or SER (0.52), which did not differ from each other. Their analyses were restricted to randomized, double-blind, medication- or placebo-controlled trials. Their comparisons between the SRIs were also restricted to studies using the YBOCS or the National Institute of Mental Health Obsessive-Compulsive Rating Scale (NIMH-OC). A total of 12 studies were included in this analysis.

Greist and colleagues (1995b) conducted a meta-analysis of the four multi-center, placebo-controlled trials for CMI (n=520), FLX (n=355), FLV (n=320), and SER (n=325). Effect sizes were calculated by subtracting the endpoint drug change scores from the endpoint placebo change scores and dividing by the pooled change standard deviations. CMI had a significantly larger effect size (1.48) than FLX (0.69), FLV (0.50) or SER (0.35), which did not differ from each other. In addition to the large sample sizes, this study had the advantage of similar treatment designs and outcome measures, with several of the investigators involved in more than one trial.

Finally, Abramowitz (1997) examined the effect sizes of the SRIs versus placebo (no statistical comparisons between individual SRIs were reported). He also examined the effect sizes of ERP compared to relaxation, cognitive therapy, and to its component parts (i.e., exposure alone and response prevention alone). ERP was not statistically compared to the SRIs. Studies were limited to randomized trials with multiple treatments or control groups. He also examined the relationship between effect size and the degree to which the study was unblinded, measured by the difference in the proportion of patients reporting side effects in drug versus placebo groups (“side effect contrast”). On clinician ratings, CMI had the largest effect size (1.31), followed by FLV (1.28), FLX (0.68) and SER (0.37). Interestingly, side effect contrast was significantly correlated with effect size [r(21)=0.62, P<0.01], with CMI having the largest side effect contrast [the correlation between effect size and side effect contrast for CMI was r(10)=0.92, P<0.001]. He speculates that part of the superiority of CMI may be attributable to the unblinding of the CMI patients due to side effects, an idea previously discussed by Fisher and Greenberg (1993) in the context of the problems maintaining the blind in double-blind pharmaceutical trials. ERP was significantly better than relaxation, but not better than cognitive therapy or ERP’s individual components.

The purpose of the current study is to provide a comprehensive quantitative summary of the treatment literature on the relative efficacy of SRIs and ERP in the treatment of OCD. It differs from the previous studies in the following ways: 1) it provides the first comparative quantitative analysis of all the currently available SRIs and ERP for the treatment of OCD; 2) it is comprehensive, in that all studies (in both the published and unpublished literature) were included regardless of methodological features, which enabled an empirical examination of the relationship between methodological qualities and effect size; and 3) statistical procedures were employed to control for the differences in variability resulting from different sample sizes between studies, allowing for statistical comparison between treatment groups, and control of extraneous sources of variance (such as methodological differences between studies).

The study is designed to address the following questions:

1. What is the relative efficacy between ERP and individual serotonergic antidepressants?

2. What is the relative efficacy between ERP, serotonergic antidepressants as a whole, and the combination of ERP and serotonergic antidepressants (ERP/SRI)?

3. To what extent do methodological variables (i.e., method of effect size calculation, random versus non-randomized assignment, use of control group, or publication source) impact effect size?

Materials and methods

Literature search

Studies were identified through a computerized search of the MEDLINE, PsycINFO, and Dissertations Abstracts International databases covering the dates from 1973 (the date of the first controlled trial of ERP) through May 1997. In addition, a similar search was conducted of the Obsessive Compulsive Information Center’s database, a comprehensive computer-based service with over 10 000 references on OCD and related topics. Other sources were bibliographies of other reviews, reference lists of published articles, unpublished manuscripts, and abstracts or titles of papers presented at professional meetings and conferences.

Inclusion criteria

Studies were included that met the following criteria:

1. Subjects had a primary diagnosis of OCD according to DSM-II, DSM-III or DSM-III-R criteria. Concurrent diagnoses were acceptable as long as OCD was considered the primary diagnosis;

2. Duplicate publications of the same study were omitted; similarly, reports from a single site in a multi-site clinical trial were not included if the results of the entire multi-center trial were available;

3. Case studies were not included, as integration with group studies is problematic (Strube et al. 1985);

4. Trials that did not contain enough information to estimate effect size were not included;

5. In crossover designs, only pre-crossover data were included, due to the problem of carry-over effects. Several writers have argued that the low statistical power typically found in crossover designs due to small sample size makes statistical tests to detect carry-over effects inaccurate (Cleophas 1993);

6. As suggested by Glass et al. (1981), studies were not excluded because of poor methodological quality. Instead, all studies were included, and methodological quality was examined empirically as to its relationship to effect size;

7. Studies of children and adolescents were included for the following reasons: (a) the diagnostic criteria for OCD are the same for children and adults (American Psychiatric Association 1980, 1994); (b) the disorder typically starts during childhood or adolescence (Ingram 1961; Lo 1967); (c) several studies have demonstrated treatment efficacy for SRIs in this population (Leonard et al. 1988; Liebowitz et al. 1990; March et al. 1990), and CMI, FLV and SER have FDA-approved indications for treating OCD in children as young as 10, 8, and 6 years of age, respectively; and (d) exposure and response prevention has also been shown to be an effective treatment for OCD in minors (Bolton et al. 1983; March et al. 1994);

8. Studies utilizing combined treatments (other than those which are the focus of this study, i.e., CMI, FLX, FLV, SER, PAR, ERP, or a combination of a specific SRI and ERP) were excluded. For example, a treatment arm consisting of an SRI combined with a benzodiazepine was not included.

Quantification of studies

In addition to treatment outcome, characteristics of studies can be measured and their relationship to treatment outcome can be quantified (Glass et al. 1981). The following substantive characteristics of each study were coded:

1. Class of intervention (i.e., SRI, ERP, or SRI/ERP);

2. Specific intervention (i.e., CMI, FLX, FLV, SER, PAR, ERP, or SRI/ERP);

3. Length of treatment;

4. Dropout rate;

5. Year of publication;

6. Type of outcome measure used (self ratings, observer ratings, or pooled self and observer ratings).

The following methodological characteristics of each study were coded:

1. Method used to calculate effect size;

2. Presence or absence of a control group;

3. Whether subjects were randomly assigned to treatment.

In addition, each study was coded as to its publication source as follows: refereed journal, unrefereed journal, book, unpublished manuscript, presentation at conference, or dissertation. In cases where duplicate publication of the same study was encountered, the highest prestige source was coded (Holloway and Wampold 1986).

A total of 295 studies were identified and reviewed. Of these, 218 were rejected for not meeting inclusion criteria, leaving 77 studies (106 treatment comparisons) that were included for evaluation, including at least one large multi-center trial for each of the SRIs, and seven head-to-head comparisons between individual SRIs.

Effect size

Effect sizes for individual studies were calculated by one of three methods:

a. For studies using a control group, effect size was calculated by subtracting the endpoint drug treatment mean change from the endpoint placebo treatment mean change, and dividing by the endpoint pooled change standard deviation. The pooled standard deviation was used as it is a more precise estimator of the population standard deviation, assuming equal population variances and two groups per experiment (Hedges and Olkin 1985).

b. For studies using a control group, but where change scores were unavailable, effect size was calculated by subtracting the mean post-treatment control group score from the mean post-treatment experimental group score and dividing by the pooled post-treatment standard deviation.

c. For studies without a control group, effect size was calculated by subtracting the mean post-treatment score from the mean pre-treatment score and dividing by the pooled standard deviation.
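The three calculations above can be sketched in Python as follows. This is an illustration only, not the authors' code: the function names, the sign convention (positive values mean greater improvement, with lower scores meaning fewer symptoms), and the equal-weight pooling of pre- and post-treatment standard deviations in method c are all assumptions.

```python
import math

def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))

def g_change_scores(change_trt, sd_trt, n_trt, change_ctl, sd_ctl, n_ctl):
    """Method a: difference in mean change scores over the pooled change SD.
    Changes are entered as improvement (baseline minus endpoint); assumed convention."""
    return (change_trt - change_ctl) / pooled_sd(sd_trt, n_trt, sd_ctl, n_ctl)

def g_post_scores(post_trt, sd_trt, n_trt, post_ctl, sd_ctl, n_ctl):
    """Method b: difference in post-treatment means over the pooled post-treatment SD.
    Control minus treatment, so a lower treated score gives a positive value (assumed)."""
    return (post_ctl - post_trt) / pooled_sd(sd_trt, n_trt, sd_ctl, n_ctl)

def g_pre_post(pre_mean, sd_pre, post_mean, sd_post):
    """Method c: pre minus post over a pooled SD.
    Pre and post come from the same patients, so equal weighting is assumed."""
    return (pre_mean - post_mean) / math.sqrt((sd_pre ** 2 + sd_post ** 2) / 2)

# Invented numbers, for illustration only.
print(g_change_scores(10.0, 8.0, 50, 3.0, 8.0, 50))
print(g_post_scores(15.0, 6.0, 40, 21.0, 6.0, 40))
print(g_pre_post(26.0, 6.0, 15.0, 6.0))
```

Because method c divides by a standard deviation taken from the same patients before and after treatment, and has no placebo change to subtract, it will generally yield larger values than methods a and b for the same data.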

Since the uncorrected standardized mean difference, g, is a biased estimator of the true population effect size (Hedges and Olkin 1985; Coleman et al. 1995), a corrected estimator, d, was computed, using methods provided by Hedges and Olkin (1985) for between-group studies, and methods described by Coleman et al. (1995) for within-subjects designs. Similarly, the variance for the unbiased estimator d for between-group designs was calculated with the equation provided by Hedges and Olkin (1985), and according to methods described by Coleman et al. (1995) for the one-sample case.
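For the between-group case, the correction can be sketched with the commonly used approximations associated with Hedges and Olkin (1985). The function names are hypothetical, and it is an assumption that the study used these approximate forms rather than the exact ones. The within-subjects procedure of Coleman et al. (1995) is not reproduced here.

```python
def hedges_d(g, n1, n2):
    """Approximately unbiased effect size: shrink g by the small-sample factor."""
    m = n1 + n2 - 2                      # degrees of freedom of the pooled SD
    return g * (1 - 3 / (4 * m - 1))

def variance_d(d, n1, n2):
    """Large-sample approximation to the variance of d for two independent groups."""
    return (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))

# Invented numbers, for illustration only.
d = hedges_d(0.875, 50, 50)
print(d, variance_d(d, 50, 50))
```

The correction factor approaches 1 as sample size grows, so it matters mainly for the small trials in the sample.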

When pooling effect sizes from a series of independent studies, the effect sizes from studies with larger sample sizes will have smaller variability and are thus more precise estimators of the population effect size (Hedges and Olkin 1985). Thus, effect sizes were weighted so that the more precise estimates will have greater weights, using the formula given by Hedges and Olkin (1985).

To test whether the true effect size of a pooled group of studies differs from zero, 95% confidence intervals were calculated for all pooled effect sizes (Hedges and Olkin 1985). If the interval did not include zero, the hypothesis that the true effect size equals zero was rejected at the 0.05 significance level.
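A minimal sketch of the weighting and confidence-interval steps, assuming the standard inverse-variance (fixed-effect) form, is given below. The function names and the input values are hypothetical.

```python
import math

def pooled_effect(ds, variances):
    """Inverse-variance weighted mean effect size and its variance."""
    weights = [1.0 / v for v in variances]   # more precise studies get larger weights
    total = sum(weights)
    d_plus = sum(w * d for w, d in zip(weights, ds)) / total
    return d_plus, 1.0 / total

def confidence_interval_95(d_plus, var_plus):
    """Normal-theory 95% confidence interval for the pooled effect size."""
    half_width = 1.96 * math.sqrt(var_plus)
    return d_plus - half_width, d_plus + half_width

# Two invented studies: the second is four times as precise as the first.
d_plus, var_plus = pooled_effect([1.0, 0.5], [0.04, 0.01])
low, high = confidence_interval_95(d_plus, var_plus)
print(d_plus, low, high)
significant = low > 0 or high < 0            # the interval excludes zero
```

In this invented example the pooled estimate lies closer to the more precise study, which is the intended effect of the weighting.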

Only one effect size was computed per treatment group. In studies where a primary outcome measure was identified a priori, it was used to calculate effect size. When more than one outcome measure was used and a primary outcome measure was not identified, the outcome measures were pooled for analyses.

Data analyses

To test whether effect sizes differed between groups, a chi-square test based on the between-groups goodness-of-fit statistic was calculated, as described by Hedges and Olkin (1985). A significant omnibus test was followed by all possible pair-wise comparisons among the groups of effect sizes, using a procedure based on the Scheffe method (Hedges and Olkin 1985). To test whether effect sizes within each group were homogeneous, Cramer’s (1974) modified minimum chi square was used as described by Hedges and Olkin (1985).
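The between-groups and within-group statistics can be sketched as weighted sums of squared deviations, assuming the usual fixed-effect formulation. The function names and data are hypothetical. Each statistic is compared with a chi-square critical value, with degrees of freedom equal to the number of groups minus one for the between-groups test and the number of studies in the group minus one for the homogeneity test.

```python
def weighted_mean(ds, variances):
    """Inverse-variance weighted mean and its variance."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * d for w, d in zip(weights, ds)) / total
    return mean, 1.0 / total

def q_within(ds, variances):
    """Homogeneity statistic for the studies inside one group."""
    mean, _ = weighted_mean(ds, variances)
    return sum((d - mean) ** 2 / v for d, v in zip(ds, variances))

def q_between(groups):
    """Between-groups statistic: weighted deviation of group means from the grand mean."""
    stats = [weighted_mean(ds, vs) for ds, vs in groups.values()]
    grand, _ = weighted_mean([m for m, _ in stats], [v for _, v in stats])
    return sum((m - grand) ** 2 / v for m, v in stats)

# Two invented treatment groups with two studies each.
groups = {
    "A": ([1.0, 1.2], [0.04, 0.04]),
    "B": ([0.4, 0.6], [0.04, 0.04]),
}
print(q_between(groups))            # compare with chi-square, df = 2 groups - 1
print(q_within(*groups["A"]))       # compare with chi-square, df = 2 studies - 1
```

A large within-group value signals that the studies pooled under one treatment do not share a common effect size, which complicates interpretation of the between-groups result.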

A comparison of effect sizes was performed between individual treatments, between classes of treatments, and between methodological variables. The relationships of length of treatment and year of publication to effect size were examined using procedures described by Mullen (1989). These procedures estimate the probability that a continuous predictor variable co-varies significantly with effect size. Mullen suggests the use of this procedure over the standard regression analysis because of the probability of the violation of the assumption of homogeneity of variance.

It was postulated that studies not using control groups would have larger effect sizes because (a) of regression to the mean, i.e., untreated subjects will improve with time, and (b) in placebo-controlled studies, the placebo effect is subtracted out when the effect size is calculated.

Between group comparisons of dropout rates and the percentage of patients rated “much” or “very much” improved on the NIMH Global Improvement Scale (Guy 1976) were made using chi-square procedures.

Finally, in order to address the “file drawer problem” (i.e., the sampling bias resulting from studies not having significant results being less likely to be published and therefore less available for inclusion), the Fail-safe N (Rosenthal 1979) was calculated for the effect size of each treatment group. The Fail-safe N answers the question “How many studies with a null hypothesis confirmation (i.e., effect size=0) would have to be added to the results in order to change the conclusion that the combined effect sizes are significantly different than zero?” Rosenthal (1991) refers to this as the tolerance of future null results. If the Fail-safe N is small, one can conclude that the meta-analysis is not resistant to the “file drawer threat” (Rosenthal 1979).
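Rosenthal's formula is usually stated in terms of the standard normal deviates (Z scores) of the individual studies. The sketch below assumes that form and a one-tailed 0.05 criterion (Z = 1.645). The function name and the Z values are hypothetical, and the text does not state how Z scores were derived from the effect sizes.

```python
def fail_safe_n(z_scores, z_crit=1.645):
    """Number of additional null-result studies needed to bring the
    combined result down to the one-tailed significance threshold."""
    k = len(z_scores)
    return sum(z_scores) ** 2 / z_crit ** 2 - k

# Three invented studies.
print(fail_safe_n([2.0, 2.5, 3.0]))
```

For these invented values, between 17 and 18 unpublished null studies would be needed to overturn the combined result. A group with only two small studies would tolerate far fewer.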

Results

Individual interventions

Effect sizes for the individual interventions, as well as their 95% confidence intervals, are presented in Table 1. None of the confidence intervals included zero, indicating that the effect sizes were significantly greater than zero for all of the interventions. To test whether effect sizes differed between groups, an omnibus chi-square was computed as previously described. Results were significant [χ2(6)=76.93, P<0.001], indicating a significant overall difference between the groups. Pairwise comparisons were then conducted, with each contrast tested by rejecting the null hypothesis if the obtained χ2 value exceeded 12.59 (the 0.05 critical value of the chi-square distribution with 6 degrees of freedom; degrees of freedom was the smaller of the number of contrasts or the number of classes minus one; Hedges and Olkin 1985). Results found the effect size for CMI was significantly greater than for each of the other SRIs except FLX [χ2(6)=7.54 versus FLX; χ2(6)=17.46 versus FLV; χ2(6)=34.62 versus SER; and χ2(6)=43.83 versus PAR]. CMI was not significantly different from ERP, χ2(6)=2.08, or ERP/SRI, χ2(6)=0.01. The effect sizes for FLX, ERP, and ERP/SRI were significantly greater than the effect size for SER [χ2(6)=16.63, 26.14, and 15.49, respectively] and also significantly greater than the effect size for PAR [χ2(6)=19.91, 32.95, 16.08, respectively]. No other pair-wise comparisons were significant.
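A pairwise contrast of this kind can be sketched as the squared difference between two pooled effect sizes divided by the sum of their variances, then compared with the critical value of 12.59 quoted above. The function name and the input values are hypothetical. They are not the values in Table 1.

```python
CRITICAL_VALUE = 12.59   # 0.05 critical value, chi-square with 6 df (from the text)

def contrast_chi_square(d_a, var_a, d_b, var_b):
    """Squared contrast between two pooled effect sizes over its variance."""
    return (d_a - d_b) ** 2 / (var_a + var_b)

# Invented pooled effect sizes and variances.
large_gap = contrast_chi_square(1.0, 0.005, 0.6, 0.005)
small_gap = contrast_chi_square(1.0, 0.005, 0.8, 0.005)
print(large_gap, large_gap > CRITICAL_VALUE)
print(small_gap, small_gap > CRITICAL_VALUE)
```

Using the 6-df critical value for every pair, rather than the 1-df value of 3.84, is what makes the procedure conservative across all possible pairwise comparisons.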

Chi-square tests for the homogeneity of effect sizes were also conducted. As shown in Table 1, the resulting Q statistics were sufficiently large to reject the null hypothesis of homogeneity of effect sizes for individual studies within treatment groups for all the treatment conditions, with the exception of SER (two studies coded), PAR (two studies coded), and the combined treatment of ERP and SRI (six studies coded). Heterogeneity of effect sizes within treatment conditions makes the interpretation of between group results more difficult (Hedges and Olkin 1985).

The Fail-safe N was calculated for the effect sizes for the seven treatment conditions (see Table 1). As can be seen, the effect sizes for CMI, FLX, and ERP were well protected from the “file drawer” problem. Although they contained large, multi-center trials, the SER and PAR classes contained only two studies each, making them more vulnerable to the “file drawer” problem.

Intervention classes

In order to compare the SRIs as a group to ERP and the combined treatment of ERP and SRI, the weighted mean effect size for all of the SRIs as a single group was computed (see Table 1). The 95% confidence interval for the effect size of the SRIs as a group did not contain zero, indicating the effect size for the SRIs as a group was significantly greater than zero. Differences in effect size between the three treatment classes were statistically significant [χ2(2)=10.38, P=0.0056], with the effect size for ERP significantly greater than the effect size for the SRIs as a whole [χ2(2)=8.28, P=0.004].

Percent much or very much improved

A significantly greater percentage of CMI patients was rated “much” or “very much” improved (63%) than FLX patients [43%; χ2(1)=23.29, P<0.001], FLV patients [44%; χ2(1)=24.78, P<0.001], SER patients [37%; χ2(1)=44.22, P<0.001] and PAR patients [45%; χ2(1)=24.49, P<0.001]. The only other significant pair-wise comparison found patients in the PAR treatment group had a significantly greater percentage of patients rated “much” or “very much” improved than SER patients [χ2(1)=5.50, P=0.019].
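A comparison of two improvement rates with a chi-square test on a 2×2 table can be sketched as follows. The counts are hypothetical: they reuse the 63% and 43% rates with an invented 100 patients per group, so the resulting statistic will not match the values reported above, which reflect the actual sample sizes.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def p_value_df1(chi2):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

# Rows: treatment groups; columns: improved, not improved (invented counts).
chi2 = chi_square_2x2(63, 37, 43, 57)
print(chi2, p_value_df1(chi2))
```

With these invented counts the statistic is about 8.03. The same 20-point difference in rates produces a much larger statistic when the groups contain several hundred patients each.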

Dropout rates

A comparison of drop-out rates is presented in Table 2. No significant difference was found between any of the individual interventions [χ2(5)=8.51, P=0.130]. Similarly, no significant difference in drop-out rate was found between the three intervention classes [χ2(2)=0.26, P=0.614].

Differences in drop-out rates due to drug side effects only were also examined for the five SRIs. No significant difference was found between the SRIs on drop-outs due to side effects [χ2(4)=5.83, P=0.212]. This finding was somewhat surprising, considering the more favorable side-effect profile of the newer, more selective SRIs.

Methodological variables

Effect sizes were compared along the dimensions of each methodological variable (see Table 3). This method of examining the effects of a categorical predictor variable was described by Mullen (1989) as an alternative to the point-biserial correlation, which is less informative.

Significant differences in effect size were found along all five methodological variables: (a) studies calculating effect size using method 1 (change scores) had significantly lower effect sizes than studies calculating effect size using method 3 (pre- to post-treatment (within subject) difference scores) [χ2(1)=38.21, P<0.001]; (b) studies with a control group had significantly lower effect sizes than studies without a control group [χ2(1)=46.00, P<0.001]; (c) studies that randomly assigned subjects to treatment had significantly lower effect sizes than studies that did not [χ2(1)=8.55, P=0.004]; (d) studies from refereed journals had significantly larger effect sizes than studies from other sources (data collapsed due to small sample size of other categories) [χ2(1)=8.50, P=0.004]; and (e) studies using observer rated outcome measures had significantly lower effect sizes than studies using self-rated outcome measures [χ2(1)=6.62, P=0.04].

The relationships of length of treatment and year of publication to effect size were examined using procedures developed by Mullen (1989). Length of treatment was not significantly related to effect size [Z=0.58, P=0.281]; however, year of publication was [Z=5.74, P<0.001].

Several researchers have postulated that studies conducted after SRIs became available on the market may have enrolled more treatment-resistant patients and had higher dropout rates, resulting in lower effect sizes given an intent-to-treat analysis (Greist et al. 1995b; Stein et al. 1995). We examined this by dichotomizing year of study into pre- and post-1990 (the year that CMI became available in the United States for OCD). Results found studies prior to 1990 had a significantly higher effect size (1.08) than those conducted from 1990 onward (0.78) [χ2(1)=22.15, P<0.001]. However, studies conducted prior to 1990 also had a significantly greater percentage of studies using self-rated outcome measures (75%), which, as previously discussed, have significantly higher effect sizes than observer rated measures (see Table 3). The effect size for the CMI studies conducted prior to 1990 (n=14; d=1.14) did not differ significantly from studies conducted from 1990 onward (n=12; d=0.95) [χ2(1)=2.23, P=0.136].

Given the significant impact of several methodological variables on effect size, we recalculated effect sizes for the individual interventions and for each of the three treatment classes using procedures described by Hedges and Olkin (1985) to control for methodological variables. Effect size was recalculated controlling for year of publication, method of effect size calculation, random versus non-random assignment to treatment, type of outcome measure used, and control versus no control group. Results are presented in Table 4. The omnibus chi-square was still significant, indicating a significant overall difference was still found between individual interventions [χ2(6)=36.89, P<0.001]. As previously described, pairwise comparisons were tested using a critical value of 12.59 (the 0.05 critical value with 6 degrees of freedom) (Hedges and Olkin 1985). When controlling for methodological variables, the effect size for CMI was significantly greater than for SER [χ2(6)=20.85] and PAR [χ2(6)=23.78], but not significantly different from FLX [χ2(6)=7.93], FLV [χ2(6)=12.26], ERP [χ2(6)=5.62], or ERP/SRI [χ2(6)=0.52]. No other pair-wise comparisons were significant. The omnibus chi-square for comparisons of effect size between treatment category (SRI, ERP, SRI/ERP) controlling for these methodological variables was no longer significant [χ2(2)=0.82, P=0.664]; thus no pairwise comparisons were conducted.

Direct head-to-head comparisons between CMI and other SRIs have been suggested as the optimal method of determining the relative efficacy of these treatments (Greist et al. 1995b). Seven head-to-head trials (n=2, FLX; n=4, FLV; n=1, PAR) were identified. All compared CMI to another SRI; none had a placebo control except the PAR study. We performed a separate analysis of these studies, combining FLX, FLV, and PAR into an “other SRI” category and comparing it to CMI. The effect size for CMI (0.85) was larger than, but not significantly different from, that of the “other SRI” category (0.76) [χ2(1)=0.50, P=0.478].

Discussion

Comparative efficacy of the SRIs

While all the treatments included in this meta-analysis have previously demonstrated their clinical efficacy for the treatment of OCD, one of the goals of the current study was to examine the relative efficacy of each intervention. The results of the current study are consistent with previous meta-analyses, in that CMI stands out from the rest of the SRIs in terms of clinical efficacy (i.e., has the largest effect size), although it did not separate statistically from FLX, or (when methodological variables were controlled) from FLV. In the analysis of head to head comparisons, CMI retained its relative position, but its difference from the other SRIs as a whole was not statistically significant. The confidence interval for CMI overlapped slightly with the confidence interval of its nearest competitor, FLX (and with FLV when methodological variables were controlled), but did not overlap with any of the other SRIs, providing some evidence for the differentiation of CMI from the other SRIs.
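The overlap pattern described above can be checked directly against the confidence bounds reported in Table 1 (uncontrolled) and Table 4 (controlling for methodological variables). The sketch below only compares interval endpoints; the names `overlaps` and `overlapping_with_cmi` are illustrative, and overlap of confidence intervals is an informal guide rather than a formal significance test.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# 95% confidence bounds for d+, copied from Table 1 (uncontrolled).
TABLE_1: Dict[str, Interval] = {
    "CMI": (0.9862, 1.1978),
    "FLX": (0.7647, 0.9882),
    "FLV": (0.6042, 0.8646),
    "SER": (0.1559, 0.5869),
    "PAR": (0.2361, 0.5810),
}

# 95% confidence bounds copied from Table 4 (methodological variables controlled).
TABLE_4: Dict[str, Interval] = {
    "CMI": (0.8452, 1.1926),
    "FLX": (0.6020, 0.9492),
    "FLV": (0.5252, 0.8860),
    "SER": (0.1616, 0.7054),
    "PAR": (0.2415, 0.7176),
}


def overlapping_with_cmi(table: Dict[str, Interval]) -> List[str]:
    """List the SRIs whose interval overlaps the CMI interval."""
    cmi = table["CMI"]
    return [name for name, ci in table.items() if name != "CMI" and overlaps(cmi, ci)]


if __name__ == "__main__":
    print(overlapping_with_cmi(TABLE_1))
    print(overlapping_with_cmi(TABLE_4))
```

With the tabled values, only FLX overlaps CMI before adjustment (by 0.002), while both FLX and FLV overlap CMI after adjustment. SER and PAR overlap in neither case.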

What is also striking is the relative consistency in the order of effect sizes for the SRIs found in the various meta-analyses published to date. This point is illustrated in Table 5. The order of effect size magnitude found was CMI first followed, in most studies, by FLX, FLV, and SER (PAR had not been examined prior to this study). Thus, different meta-analyses using different methodologies and different criteria for inclusion of studies have all converged on the same finding regarding the relative efficacy of the SRIs. The inclusion of both large multi-center trials and trials from a variety of independent settings makes this result more generalizable.

CMI, the SRI with the largest effect size, is also the least selective of the SRIs with regard to blocking serotonin reuptake over norepinephrine (Jenike et al. 1990; Richelson 1994). The fact that CMI, whose mechanism of action affects the transmission of both serotonin and norepinephrine, appears to have a larger treatment effect than the other SRIs, whose mechanisms of action are more selective for serotonin, leads to speculation that more than a single neurotransmitter system is involved in the pathophysiology of OCD (Jenike et al. 1990; Greist et al. 1995b). Whether agents such as duloxetine, milnacipran, and venlafaxine, which inhibit uptake of both serotonin and norepinephrine, will work well in treating OCD remains to be seen (Greist et al. 1995b). One open study of venlafaxine with ten patients who had failed previous SRI treatment found a statistically significant decrease in YBOCS scores, with an effect size of 1.27 (Rauch et al. 1996). However, in a small (n=30) placebo-controlled trial, venlafaxine failed to separate from placebo, although a significant pre- to post-treatment effect was found (Yaryura-Tobias et al. 1994).

In spite of the less favorable side effect profile, the dropout rate for CMI due to side effects was not significantly different from the other SRIs. One possible explanation for this is that most of the CMI trials included in this study were conducted before the availability of any other effective medication for the treatment of OCD. Thus, patients may have remained in these trials in spite of unwanted side effects, while patients in later trials of the other SRIs may have been more likely to drop out if side effects were encountered and seek alternative treatment. It has been postulated that this differential dropout rate could affect endpoint scores in intent-to-treat studies, where the last available evaluation is carried forward and used as the endpoint score. Patients who discontinue due to side effects early in treatment may have less improvement than patients with the same side effects who remain in treatment for the full duration of the trial. When examined, however, total drop-out rates did not differ between the treatments, whereas year of publication was significantly related to effect size. Thus, the smaller effect size found in the later studies is probably related to some factor other than drop-out rate, such as the possibility of the later trials containing more treatment-resistant patients (Greist et al. 1995b) or the confound of earlier studies having a greater percentage of studies using self-rated outcome measures.

Another possible result of the less favorable side effects profile with CMI is the potential for unblinding the study (Fisher and Greenberg 1993; Abramowitz 1997). Compared to the other more recent SRIs (which have a more favorable side effect profile), CMI may have given more clues to clinicians (and patients) that patients were on active drug as opposed to placebo. Evidence presented by Abramowitz (1997) suggests that this may have biased clinician ratings of improvement [although placebo side effects are also common, and have been found to vary by disorder (Johnston et al. 1990)]. Ratings by a third party observer who is blind to both treatment condition and side effects would be helpful in this regard. Another solution is the use of computer-administered rating scales. Computers have the advantage of being blind to both side effects and study visit, and the standardization of administration provides a more objective and reliable measure of symptom severity. A desktop computer-administered version of the YBOCS has been developed by Rosenfeld and colleagues (1992), and an interactive voice response (IVR) version (i.e., administered by computer over the telephone) has been developed by Baer et al. (1993).

Behavior therapy and the SRIs

The effect sizes for ERP and ERP/SRI were not significantly different from the effect size for CMI. This finding held for the percent of patients rated “much” or “very much” improved as well. The effect size for ERP was significantly larger than the effect size for the SRIs as a whole, but this difference became non-significant when methodological variables were controlled. The effect size for ERP/SRI was not significantly larger than the effect size for ERP alone. The total drop-out rates were also comparable; there was no significant difference between ERP and any individual SRIs, or between ERP and the SRIs as a whole. One of the points that proponents of ERP stress is that, unlike pharmacological treatments, ERP has no “drug side effects” (Marks 1990). However, patients undergoing exposure therapy often suffer situational anxiety approaching panic (Katz et al. 1990). Differences in drop-out rates due to adverse events could result in differences in the effectiveness between two treatments (i.e., how well a treatment does in the real world in terms of number of patients staying in treatment long enough to receive benefit versus how well patients do in controlled clinical trials). The results of the current study indicate that the total drop-out rates for the two treatments were roughly equivalent (19% versus 17% for CMI and ERP, respectively).

Another comparison that has been made between pharmacological and behavioral treatments is the difference in relapse rates following treatment discontinuation. In the current study, no studies were found in which a no-treatment follow-up period followed discontinuation of the active treatment as part of the study design. Such a design would have enabled a fairer comparison of medication to ERP, which cannot be placebo-blinded after active treatment as easily as can drug treatment.

The most appropriate method of evaluating relapse rates is through placebo substitution studies. The two placebo-substitution studies that have been published bear out what many clinicians have found to be true in clinical practice: discontinuation of anti-obsessive medications leads to rapid relapse. Pato and colleagues found 89% of the patients relapsed within 7 weeks after drug discontinuation, even after 1 year of CMI therapy at adequate dosages (Pato et al. 1988). Treatment duration prior to placebo discontinuation was not related to frequency or severity of relapse. More recently, Steiner and colleagues (1995) found the mean time to relapse during placebo substitution for patients responding to treatment with PAR was 62.9 days, compared to 28.5 days for patients who had previously responded to placebo.

Similar results were found by Leonard and colleagues (1991) in a desipramine substitution study in children and adolescents receiving long-term CMI treatment for OCD. Eight (89%) of the nine desipramine substituted patients relapsed during the 2-month comparison period, compared with two (18%) of the 11 patients who continued to receive CMI.

Methodological characteristics

One of the goals of the present study was to examine empirically the relationship between certain methodological variables and effect size. In general, effect sizes were larger under the less rigorous conditions, i.e., in studies without control groups, in studies without random assignment, and when effect size was based on within-subjects pre- to post-test scores. Differences between studies on methodological variables may obfuscate true differences (or lack of differences) between treatment groups, resulting in type I or II errors. Given the significant impact methodological differences have on effect size, future meta-analyses should examine and control for these potential confounds.
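Why a within-subjects effect size tends to exceed a between-subjects one can be seen in a small worked example. The sketch below is illustrative only: the trial summary statistics are hypothetical, the function names are introduced here, and the formulas are standard standardized mean differences rather than the exact computations used in this study. The three calculations correspond to the three effect size methods listed in Table 3.

```python
import math


def pooled_sd(sd_a: float, n_a: int, sd_b: float, n_b: int) -> float:
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2))


def d_between(mean_a: float, mean_b: float,
              sd_a: float, n_a: int, sd_b: float, n_b: int) -> float:
    """Between-subjects standardized difference (mean_a minus mean_b)."""
    return (mean_a - mean_b) / pooled_sd(sd_a, n_a, sd_b, n_b)


def d_within(mean_pre: float, mean_post: float, sd_pre: float) -> float:
    """Within-subjects pre-to-post standardized change (positive = improvement)."""
    return (mean_pre - mean_post) / sd_pre


# Hypothetical trial: symptom scores where lower is better, 30 patients per arm,
# and every standard deviation (including that of change scores) assumed to be 6.
N = 30
SD = 6.0
treat_pre, treat_post = 24.0, 18.0
ctrl_pre, ctrl_post = 24.0, 22.0

# Method 1: between-subjects comparison of change scores.
d_change = d_between(treat_pre - treat_post, ctrl_pre - ctrl_post, SD, N, SD, N)
# Method 2: between-subjects comparison of endpoint scores.
d_endpoint = d_between(ctrl_post, treat_post, SD, N, SD, N)
# Method 3: within-subjects pre-to-post change in the treated group only.
d_prepost = d_within(treat_pre, treat_post, SD)

if __name__ == "__main__":
    print(round(d_change, 2), round(d_endpoint, 2), round(d_prepost, 2))
```

In this hypothetical case the two between-subjects methods give about 0.67, while the within-subjects method gives 1.0, because the improvement seen in the control arm is not subtracted out.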

The vast majority of studies (92%) were from refereed journals. Publication bias, in which only studies with significant results are submitted to or accepted by refereed journals, may have affected the results. However, the “Fail-safe N” shows that, with the exception of SER and PAR (where only two studies each were included), the other treatment conditions were well protected from the “file drawer threat” (Rosenthal 1979).
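The idea behind a fail-safe N can be sketched with Rosenthal's (1979) formula, which estimates how many unpublished null-result studies would be needed to bring a combined result down to the one-tailed 0.05 threshold. This section does not state which variant of the statistic was computed here, so the sketch should not be read as reproducing the values in Table 1. The Z scores are hypothetical and the function name is introduced for illustration.

```python
from typing import Sequence


def rosenthal_fail_safe_n(z_scores: Sequence[float], z_alpha: float = 1.645) -> float:
    """Rosenthal's fail-safe N: (sum of Z)**2 / z_alpha**2 - k.

    z_alpha = 1.645 is the one-tailed 0.05 cutoff, so z_alpha**2 is about 2.706.
    """
    k = len(z_scores)
    if k == 0:
        raise ValueError("at least one study is required")
    return (sum(z_scores) ** 2) / (z_alpha ** 2) - k


# Hypothetical standard normal deviates for five studies of one treatment.
hypothetical_z = [2.1, 1.8, 2.5, 1.2, 3.0]
n_fs = rosenthal_fail_safe_n(hypothetical_z)

if __name__ == "__main__":
    print(round(n_fs, 1))
```

For these made-up inputs, roughly 36 null studies sitting in file drawers would be needed to render the combined result non-significant. A value near 1, as reported for SER and PAR, offers almost no such protection.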

The use of control groups in ERP is more challenging than in medication studies, where drug and placebo can be given in identical looking capsules. Only two “code-able” ERP studies were found that used a psychotherapy control group (Fals-Stewart and Schafer 1992; Lindsay et al. 1997).

The average effect size in this study was 0.8736. The implication for clinical practice is that the average score in the treatment group is 0.8736 of a standard deviation above the mean score of the control group (or pre-treatment score, depending on method of effect size calculation). Using the YBOCS as an illustration, this would be roughly the equivalent of a change from 24.25 to 18.24. This represents a change from a “moderate-to-severe” to a “mild-to-moderate” level of symptomatology, i.e., still within the clinical range. Thus, while treatment in general helps reduce symptomatology, patients on average do not become asymptomatic. Information on differences between treatments in the percent of patients who become asymptomatic (or subclinical) following treatment was unavailable, and would have provided an additional perspective on the relative efficacy of these treatments.
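The YBOCS illustration can be unpacked with simple arithmetic. The standard deviation behind the example is not stated in this section, so the sketch below back-calculates the value implied by the three reported numbers. That implied value is an inference, not a figure from the study, and the function name is illustrative.

```python
def implied_sd(pre: float, post: float, d: float) -> float:
    """Standard deviation implied by a raw change and a standardized effect size."""
    if d == 0:
        raise ValueError("effect size must be non-zero")
    return (pre - post) / d


# Values quoted in the text: mean effect size and the illustrative YBOCS change.
MEAN_D = 0.8736
YBOCS_PRE = 24.25
YBOCS_POST = 18.24

sd = implied_sd(YBOCS_PRE, YBOCS_POST, MEAN_D)
raw_change = MEAN_D * sd  # converts the effect size back into YBOCS points

if __name__ == "__main__":
    print(round(sd, 2), round(raw_change, 2))
```

The quoted change of about 6 YBOCS points therefore corresponds to an assumed standard deviation of roughly 6.9 points.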

Treatment of choice: drugs, behavior therapy, or both?

The question remains as to the extent to which the current study findings help clinicians in making treatment decisions. The findings seem to provide some continued support for the relative superiority of CMI over the other SRIs. Whether the difference is clinically (as opposed to statistically) significant has not been resolved. This difference is probably not clinically significant enough to warrant first choice treatment given CMI’s greater lethality in overdose. Treatment guidelines developed by the International Psychopharmacology Algorithm Project (Jefferson et al. 1995) did not recommend CMI over the other SRIs, stating “. . . clinicians must weigh the possibility of greater efficacy with clomipramine against the likelihood of fewer side effects with SSRIs” (Jefferson et al. 1995, p. 489). When choosing between individual SRIs, the final decision may depend on personal preferences. For some, particularly the more severe cases, the additional benefit that CMI offers over the other SRIs might outweigh the disadvantages in terms of side effects [one study found OCD patients complained of side effects associated with placebo significantly less than patients with other psychiatric disorders, suggesting that this patient population may be more willing to tolerate side effects than patients with other disorders, or at least complains to their physicians about them less (Johnston et al. 1990)]. In terms of a second choice SRI, the findings suggest that the less selective SRIs (i.e., in terms of 5HT/NE selectivity) appear to be more effective than the more selective ones. Thus, FLX and FLV appear to be better choices than SER or PAR. On the other hand, there will be individuals who will respond to one SRI and not another (Greist et al. 1995b). Thus failure on FLX or FLV might warrant a trial with SER or PAR, given that none of the non-SRI medications have demonstrated efficacy for treating OCD [with the exception of single trials of phenelzine and clonazepam (Hewlett et al. 1992; Vallejo et al. 1992)]. The International Psychopharmacology Algorithm Project recommends switching to another SRI rather than augmentation if the initial SRI has shown no efficacy, although no studies have directly compared augmentation to switching (Jefferson et al. 1995).

When methodological differences were controlled, no significant difference was found between ERP, the SRIs as a class, and combined ERP/SRI treatment. Unfortunately, limited availability of ERP more often determines choice of treatment than any personal preference. For some patients, taking medication may be aversive and ERP may be preferable; others may find ERP too frightening or difficult, and may prefer an SRI. One advantage of the SRIs is that they can concurrently treat OCD and comorbid depression. In contrast, high levels of comorbid depression were associated with poorer treatment response to ERP in some studies (Foa et al. 1983; Buchanan et al. 1996), but not others (see Buchanan et al. 1996, for a review). The combined treatment with both ERP and an SRI had a larger effect size than treatment with ERP or an SRI alone, although the differences were not statistically significant. Unfortunately, no studies were found that would allow a methodologically sound comparison of relapse rates among the three classes. Overall, these findings suggest that a combined treatment approach using both ERP and an SRI might be the best course to follow, although the limited availability of trained behavior therapists makes this option more difficult (Greist 1989). Recently, computer-administered behavior therapy for OCD has been shown to be effective in a pilot study (Greist et al. 1996), which may lead to this treatment option becoming more widely available.

Table 1. Weighted Mean Estimator of Effect Size (d), Lower and Upper Bounds for the 95% Confidence Intervals for d+, Homogeneity Statistics, and Fail-Safe N for Groups of Studies

| Group | d | 95% CI Lower Bound | 95% CI Upper Bound | P for d+ | Homogeneity Q | Homogeneity df | Homogeneity P | Fail-Safe N |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| All studies | 0.8736 | 0.8239 | 0.9233 | 0.0000 | 366.38 | 105 | 0.0000 | 5606 |
| All SRIs | 0.8210 | 0.7598 | 0.8822 | 0.0000 | 248.42 | 63 | 0.0000 | 1946 |
| CMI | 1.0920 | 0.9862 | 1.1978 | 0.0000 | 64.31 | 25 | 0.0000 | 362 |
| FLX | 0.8764 | 0.7647 | 0.9882 | 0.0000 | 78.43 | 19 | 0.0000 | 167 |
| FLV | 0.7344 | 0.6042 | 0.8646 | 0.0000 | 38.79 | 13 | 0.0002 | 82 |
| SER | 0.3714 | 0.1559 | 0.5869 | 0.0001 | 0.13 | 1 | 0.7218 | 1 |
| PAR | 0.4085 | 0.2361 | 0.5810 | 0.0000 | 0.20 | 1 | 0.6588 | 1 |
| ERP | 0.9869 | 0.8908 | 1.0830 | 0.0000 | 105.95 | 35 | 0.0000 | 714 |
| ERP + SRI | 1.0750 | 0.7986 | 1.3514 | 0.0000 | 1.64 | 5 | 0.8969 | 10 |

Table 2. Drop-out Rates (Percent) by Individual Intervention and Intervention Class

| Group | Total Drop-out Rate | Drop-out Rates for Side Effects Only |
| --- | --- | --- |
| All SRIs | 20.48 | 10.58 |
| CMI | 18.49 | 10.55 |
| FLX | 19.20 | 9.84 |
| FLV | 21.95 | 13.26 |
| SER | 23.94 | 9.25 |
| PAR | 21.73 | 11.16 |
| ERP | 16.71 | |

Table 3. N, Percentage of Total N, Effect Size (d) and Chi-Square Statistics for the Methodological Variables

| Methodological variable | n | (%) | d |
| --- | --- | --- | --- |
| Effect size method | | | |
| 1 (between-subjects change scores) | 17 | 16% | 0.7001 |
| 2 (between-subjects endpoint scores) | 10 | 9% | 0.7911 |
| 3 (within-subjects pre-to-post treatment) | 79 | 75% | 1.0206 |
| χ²(2) = 38.6706*** | | | |
| Random versus non-randomized assignment | | | |
| Yes | 74 | 70% | 0.8303 |
| No | 32 | 30% | 1.0007 |
| χ²(1) = 8.5479** | | | |
| Control versus no control group | | | |
| Yes | 24 | 23% | 0.6735 |
| No | 82 | 77% | 1.0214 |
| χ²(1) = 46.0025*** | | | |
| Publication form | | | |
| Refereed journal | 97 | 92% | 0.9007 |
| Other sources | 9 | 8% | 0.6719 |
| χ²(3) = 8.4964** | | | |
| Type of outcome measure | | | |
| Observer rated | 62 | 58% | 0.8378 |
| Self-rated | 18 | 17% | 1.0577 |
| Pooled self- and observer-rated | 26 | 25% | 0.9206 |
| χ²(2) = 7.2823* | | | |

*P<0.05; **P<0.01; ***P<0.001. Statistics refer to d

Table 4. Weighted Mean Estimator of Effect Size (d) Controlling for Random Assignment, Control Group, Year of Study, and Method of Effect Size Calculation, and Lower and Upper Bounds for the 95% Confidence Intervals for d+

| Group | d | 95% CI Lower Bound | 95% CI Upper Bound |
| --- | --- | --- | --- |
| All SRIs | 0.8051 | 0.6627 | 0.9475 |
| CMI | 1.0189 | 0.8452 | 1.1926 |
| FLX | 0.7756 | 0.6020 | 0.9492 |
| FLV | 0.7057 | 0.5252 | 0.8860 |
| SER | 0.4335 | 0.1616 | 0.7054 |
| PAR | 0.4795 | 0.2415 | 0.7176 |
| ERP | 0.8113 | 0.6513 | 0.9713 |
| ERP + SRI | 0.9050 | 0.5963 | 1.2137 |

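Table 5. Mean Effect Sizes for Pharmacotherapy of OCD in Published Meta-Analyses

| Drug | Kobak et al. (1998)^a | Abramowitz (1997)^b | Greist et al. (1995b) | van Balkom et al. (1994)^c | Stein et al. (1995)^d | Piccinelli et al. (1995)^e |
| --- | --- | --- | --- | --- | --- | --- |
| CMI | 1.11 | 1.31 | 1.48 | 1.46 | 1.71 | 1.41 |
| FLX | 0.86 | 0.68 | 0.83 | 1.39 | 1.39 | 0.57 |
| FLV | 0.79 | 1.28 | 0.50 | 1.18 | 1.14 | 0.57 |
| SER | 0.52 | 0.37 | 0.45 | 0.45 | 0.77 | 0.52 |
| PAR | 0.56 | n/a | n/a | n/a | n/a | n/a |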

a Controlled for methodological differences

b Clinician ratings only (no FLX self-ratings reported)

c Pooled self and assessor ratings

d Studies conducted post-1980

e Placebo-controlled studies using YBOCS or NIMH only


(Reprinted with permission from Psychopharmacology 1998; 136:205–216)

References

Abramowitz J (1997) Effectiveness of psychological and pharmacological treatments for obsessive-compulsive disorder: a quantitative review. J Consult Clin Psychol 65:44–52

American Psychiatric Association (1980) Diagnostic and statistical manual of mental disorders, 3rd edn. APA, Washington, DC

American Psychiatric Association (1994) Diagnostic and statistical manual of mental disorders, 4th edn. APA, Washington, DC

Baer L, Minichiello WE (1990) Behavioral treatment for obsessive-compulsive disorder. In: Noyes R, Roth M, Burrows GD (eds) Handbook of anxiety, vol 4: the treatment of anxiety. Elsevier, Amsterdam, NY

Baer L, Brown-Beasley MW, Sorce J, Henriquess AI (1993) Computer-assisted telephone administration of a structured interview for obsessive-compulsive disorder. Am J Psychiatry 150:1737–1738

Bolton D, Collins S, Steinberg D (1983) The treatment of obsessive-compulsive disorder in adolescence: a report of fifteen cases. Br J Psychiatry 142:456–464

Buchanan AW, Meng KS, Marks IM (1996) What predicts improvement and compliance during the behavioral treatment of obsessive compulsive disorder? Anxiety 2:22–27

Christensen H, Hadzi-Pavlovic D, Andrews G, Mattick R (1987) Behavior therapy and tricyclic medication in the treatment of obsessive compulsive disorder: a quantitative review. J Consult Clin Psychol 55:701–711

Cleophas TJM (1993) Crossover studies: a modified analysis with more power. Clin Pharmacol Ther 53:515–520

Clomipramine Collaborative Study Group (1991) Clomipramine in the treatment of patients with obsessive-compulsive disorder. Arch Gen Psychiatry 48:730–738

Coleman HLK, Wampold BE, Casali SL (1995) Ethnic minorities’ ratings of ethnically similar and European American counselors: a meta-analysis. J Counsel Psychol 42:55–64

Cramer H (1974) Mathematical models of statistics. Princeton University Press, Princeton, N.J.

Fals-Stewart W, Schafer J (1992) The treatment of substance abusers diagnosed with obsessive-compulsive disorder: an outcome study. J Subst Abuse Treat 9:365–370

Fisher S, Greenberg RP (1993) How sound is the double-blind design for evaluating psychotropic drugs? J Nerv Ment Dis 181:345–350

Flament MF, Rapoport JL, Murphy DL, Berg CJ, Lake CR (1987) Biochemical changes during clomipramine treatment of childhood obsessive-compulsive disorder. Arch Gen Psychiatry 44:219–225

Foa EB, Steketee GS, Grayson JB, Doppelt HG (1983) Treatment of obsessive-compulsives: when do we fail? In: Foa E, Emmelkamp PMG (eds) Failures in behavior therapy. Wiley, New York

Foa EB, Steketee GS, Ozarow BJ (1985) Behavior therapy with obsessive-compulsives: from theory to treatment. In: Mavissakalian M, Turner SM, Michelson L (eds) Obsessive-compulsive disorder: psychological and pharmacological treatment. Plenum Press, New York, pp 49–129

Foa EB, Kozak MJ, Steketee GS, McCarthy PR (1992) Treatment of depressive and obsessive-compulsive symptoms in OCD by imipramine and behavior therapy. Br J Clin Psychol 31:279–292

Glass GV, McGaw B, Smith ML (1981) Meta-analysis in social research. Sage, Beverly Hills, Calif.

Goodman WK, Price LH, Delgado PL, Palumbo J, Krystal JH, Nagy LM, Rasmussen SA, Heniger GR, Charney DS (1989a) Specificity of serotonin reuptake inhibitors in the treatment of obsessive compulsive disorder: comparisons of fluvoxamine and desipramine. Arch Gen Psychiatry 47:577–585

Goodman WK, Price LH, Rasmussen SA (1989b) The Yale-Brown obsessive compulsive scale, I: development, use, and reliability. II. Validity. Arch Gen Psychiatry 46:1006–1011, 1012–1016

Greist JH (1989) Computer-administered behavior therapies. Int Rev Psychiatry 1:267–294

Greist JH, Chouinard G, DuBoff E, Halaris A, Kim SW, Koran L, Liebowitz M, Lydiard RB, Rasmussen SA, White K, Sikes C (1995a) Double-blind parallel comparison of three dosages of sertraline and placebo in outpatients with obsessive-compulsive disorder. Arch Gen Psychiatry 52:289–295

Greist JH, Jefferson JW, Kobak KA, Katzelnick DJ, Serlin RC (1995b) Efficacy and tolerability of serotonin transport inhibitors in obsessive compulsive disorder: a meta-analysis. Arch Gen Psychiatry 52:53–60

Greist JH, Baer L, Marks I, Kobak KA, Wenzel KW, Dottl SL (1996) Computer-Assisted behavior therapy for OCD. American Psychiatric Association 149th Annual Meeting, New York, N.Y.

Guy W (1976) ECDEU Assessment Manual for Psychopharmacology. DHEW Publication No. (ABM) 76–338. US Government Printing Office, Washington, D.C.

Hafner RJ, Gilchrist P, Bowling J, Kalucy R (1981) The treatment of obsessional neurosis in a family setting. Aust NZ J Psychiatry 15:145–151

Hedges LV, Olkin I (1985) Statistical methods for meta-analysis. Academic Press, Orlando, Florida

Hewlett WA, Vinogradov S, Agras WS (1992) Clomipramine, clonazepam, and clonidine treatment of obsessive-compulsive disorder. J Clin Psychopharmacol 12:420–430

Holloway EL, Wampold BE (1986) Relation between conceptual level and counseling-related tasks: a meta-analysis. J Counsel Psychol 33:310–319

Ingram IM (1961) Obsessional illness in mental hospital patients. J Ment Sci 107:382–402

Jefferson JW, Altemus M, Jenike MA, Pigott TA, Stein DJ, Greist JH (1995) Algorithm for the treatment of obsessive-compulsive disorder (OCD). Psychopharmacol Bull 31:487–490

Jenike MA, Hyman S, Baer L, Holland A, Minichiello WE, Buttolph L, Summergrad P, Seymour R, Ricciardi J (1990) A controlled trial of fluvoxamine in obsessive-compulsive disorder: implications for a serotonergic theory. Am J Psychiatry 147:1209–1215

Johnston H, Kobak KA, Greist JH (1990) Placebo side effects in depression and obsessive-compulsive disorder. New Research Program and Abstracts, American Psychiatric Association, 143rd Annual Meeting, New York, N.Y.

Katz RJ, Landau P, De Veaugh-Geiss J (1990) Response: drug versus behavioral treatment of obsessive-compulsive disorder. Biol Psychiatry 28:1073–1074

Leonard H, Swedo SE, Lenane MC, Rettew DC, Cheslow DL, Hamburger SD, Rapoport JL (1991) A double-blind desipramine substitution during long-term clomipramine treatment in children and adolescents with obsessive-compulsive disorder. Arch Gen Psychiatry 48:922–927

Leonard H, Swedo SE, Rapoport JL, Coffey M, Cheslow D (1988) Treatment of childhood obsessive compulsive disorder with clomipramine and desmethylimipramine: a double-blind crossover comparison. Psychopharmacol Bull 24:93–95

Liebowitz MR, Hollander E, Fairbanks J, Campeas R (1990) Fluoxetine for adolescents with obsessive-compulsive disorder. Am J Psychiatry 147:370–371

Lindsay M, Crino R, Andrews G (1997) A controlled study of exposure and response prevention in the treatment of obsessive-compulsive disorder. Br J Psychiatry 171:135–139

Lo WH (1967) A follow-up study of obsessional neurotics in Hong Kong Chinese. Br J Psychiatry 113:823–832

Malan D (1979) Individual psychotherapy and the science of psychodynamics. Butterworths, London

March JS, Johnston H, Jefferson JW, Kobak KA, Greist JH (1990) Do subtle neurological impairments predict treatment resistance to clomipramine in children and adolescents with obsessive-compulsive disorder? J Child Adolesc Psychopharmacol 1:133–140

March JS, Mulle K, Herbel B (1994) Behavioral psychotherapy for children and adolescents with obsessive-compulsive disorder: an open trial of a new protocol-driven treatment package. J Am Acad Child Adolesc Psychiatry 33:333–341

Marks IM (1981) Review of behavioral psychotherapy, I: Obsessive-compulsive disorders. Am J Psychiatry 138:584–592

Marks IM (1990) Drug versus behavioral treatment of obsessive-compulsive disorder. Biol Psychiatry 28:1065–1080

McDonald R, Marks IM, Blizard R (1988) Quality assurance of outcome in mental health care: a model for routine use in clinical settings. Health Trends 20:111–114

Mullen B (1989) Advanced basic meta-analysis. Lawrence Erlbaum Associates, Hillsdale, N.J.

Pato M, Zohar-Kadouch R, Zohar J (1988) Return of symptoms after discontinuation of clomipramine in patients with obsessive compulsive disorder. Am J Psychiatry 145:1521–1525

Piccinelli M, Pini S, Bellantuono C, Wilkinson G (1995) Efficacy of drug treatment in obsessive-compulsive disorder. A meta-analytic review. Brit J Psychiatry 166:424–443

Rachman SJ, Hodgson RJ (1980) Obsessions and compulsions. Prentice Hall, Englewood Cliffs, N.J.

Rasmussen SA, Eisen JL, Pato MT (1993) Current issues in the pharmacologic management of obsessive compulsive disorder. J Clin Psychiatry 54[suppl 6]:4–9

Rasmussen SA, Goodman WK, Greist JH, Jenike MA, Kozak MJ, Liebowitz M, Robinson DG, White K (1998) Fluvoxamine in the treatment of obsessive-compulsive disorder: a multicenter double-blind placebo controlled study in outpatients. Am J Psychiatry (in press)

Rauch SL, O’Sullivan RL, Jenike MA (1996) Open treatment of obsessive-compulsive disorder with venlafaxine: a series of ten cases. J Clin Psychopharmacol 16:81–84

Richelson E (1994) Pharmacology of antidepressants-characteristics of the ideal drug. Mayo Clin Proc 69:1069–1081

Rosenfeld R, Dar R, Anderson D, Kobak KA, Greist JH (1992) A computer administered version of the Yale-Brown Obsessive Compulsive Scale. Psychol Assess 4:329–332

Rosenthal R (1979) The “file-drawer” problem and tolerance for null results. Psychol Bull 86:638–641

Rosenthal R (1991) Meta-analytic procedures for social research (revised ed). Sage, Newbury Park, Calif.

Salkovskis PM, Westbrook D (1989) Behaviour therapy and obsessional ruminations: can failure be turned into success? Behav Res Ther 27:149–160

Stein DJ, Spadaccini E, Hollander E (1995) Meta-analysis of pharmacotherapy trials of obsessive-compulsive disorder. Int Clin Psychopharmacol 10:11–18

Steiner M, Bushnell MS, Gergel IP (1995) Long-term treatment and prevention of relapse of OCD with paroxetine. Paper presented at the 148th Annual Meeting of the American Psychological Association, New York

Steketee GS (1993) Treatment of obsessive compulsive disorder. Guilford, New York

Stevens J (1986) Applied multivariate statistics for the social sciences. Erlbaum, Hillsdale, N.J.

Strube MJ, Gardner W, Hartmann DP (1985) Limitations, liabilities, and obstacles in reviews of the literature: the current status of meta-analysis. Clin Psychol Rev 5:63–78

Tollefson GD, Rampey AH, Potvin JH, Jenike MA, Rush AJ, Dominquez RA, Koran LM, Shear MK, Goodman WK, Genduso LA (1994) A multicenter investigation of fixed-dose fluoxetine in the treatment of obsessive-compulsive disorder. Arch Gen Psychiatry 51:559–567

Vallejo J, Olivares J, Marcos T, Bulbena A, Menchon JM (1992) Clomipramine versus phenelzine in obsessive-compulsive disorder: a controlled clinical trial. Br J Psychiatry 161:665–670

van Balkom AJLM, van Oppen P, Vermeulen AWA, van Dyck R, Nauta MCE, Vorst HCM (1994) A meta-analysis of the treatment of obsessive compulsive disorder: a comparison of antidepressants, behavior, and cognitive therapy. Clin Psychol Rev 14:359–381

Wheadon DE, Bushnell WD, Steiner M (1993) A fixed dose comparison of 20, 40, or 60 mg paroxetine to placebo in the treatment of obsessive compulsive disorder. Paper presented at the Annual Meeting of the American College of Neuropsychopharmacology, Honolulu, Hawaii

Yaryura-Tobias JA, Neziroglu FA, McKay DR (1994) The action of venlafaxine on obsessive-compulsive disorder. Biol Psychiatry 35:737