The effectiveness of psychodynamic psychotherapy for ameliorating symptoms of mental illness has been demonstrated (1—3), but there have been few studies demonstrating dynamic changes over the course of long-term psychotherapy (1, p. 14). A subsequent question is whether dynamic therapy helps individuals by ameliorating underlying variables derived from the psychodynamic theoretical model, e.g., defense mechanisms and defense styles for which there are now validated measures (4—10). One such measure, the Defense Style Questionnaire (4), is a self-report measure that is easily administered and has been widely used to determine levels of maturity or adaptiveness of defenses. Akkerman et al. (9) found that as depressed patients recovered with treatment, their use of mature defenses on the Defense Style Questionnaire improved from low to normal levels. Over the course of 2 years, their use of immature defenses gradually decreased.
To our knowledge, only one study of a single case (10) demonstrated whether psychodynamic psychotherapy was associated with sustained rather than transient change in defenses. Since dynamic psychotherapy specifically addresses defenses and conflicts, it is logical to examine the empirical evidence that they change. Theoretical writers have proposed that exploration and interpretation of wishes, fears, and defenses lead to long-term changes in defenses (11). Previous research has demonstrated that more adaptive defenses are associated significantly with better mental and physical health (12, 13) and that adaptive defenses predict subsequent health in the face of a deprived childhood (14) and successful aging (15). If it were demonstrated that therapy helped patients develop more reliance on adaptive and less on maladaptive defenses, which then protected against relapse and recurrence of symptoms, then a strong case could be made for such therapy. The issue of causation is always problematic, and we do not know if shifts toward greater maturity of defenses simply accompany improvement in overall functioning or if there is a somewhat independent process that leads to changes in defense use. However, first we must demonstrate whether changes in defenses occur and, second, whether these correlate with changes in symptoms and psychosocial functioning. Therapeutic alliance in the early phase is a robust predictor of the outcome of psychotherapy (16), and thus the relationship between alliance and defenses also must be examined.
Another limitation of most research on psychotherapy is the short duration of the treatment and follow-up for problems that tend to be chronic and recurrent. Eight-week to 16-week trials of brief psychotherapy, even with 1-year follow-up, do not reveal whether patients develop significant sustained improvement. In such studies, outcome measurement tends to evaluate a particular moment in time rather than an ongoing experience (2).
Patient selection is another potential bias. Studies focusing on homogeneous samples may use exclusion criteria that eliminate many average "real-life" patients with multiple problems from participation (17). Patients with chronic and recurrent anxiety, depressive, or personality disorders require several years of follow-up to ascertain if meaningful change has taken place (18). Effectiveness research is concerned with how an already established treatment works in the hands of a diverse sample of practitioners treating a specified but essentially uncontrolled sample of patients under real-world conditions (19). This contrasts with efficacy studies that examine manualized treatments for specific conditions in homogeneous samples of patients who have passed the exclusion criteria, which can mean that they have only one axis I disorder and fewer complications than the typical clinic patient. Taking many of these issues into account, we designed our study to examine psychotherapy as practiced on a broad sample of patients.
We conducted a naturalistic study in which patients with both axis I and II disorders were offered long-term dynamic psychotherapy with or without concomitant medications. We conducted periodic follow-along interviews by using both symptomatic and dynamic measures. The aim of the study was to evaluate changes in both types of measures and to determine whether these changes are sustained over time by using the follow-along method.
The questions we address in this report are
Subjects were referred from the outpatient department of the Department of Psychiatry at Sir Mortimer B. Davis–Jewish General Hospital to the Long-Term Dynamic Psychotherapy Research Project. The project offered a minimum of 3 years of dynamic psychotherapy to subjects who met the following selection criteria: having a depressive disorder, an anxiety disorder, and/or a personality disorder; expressing a desire for psychotherapy; and agreeing to participate in the research component of the project. Subjects were told that if they declined to participate in the research, alternate referrals for therapy would be made. Exclusion criteria were psychosis, organic brain disorders, and significant current substance abuse that might interfere with learning. After all aspects of the research were explained to the subjects, written informed consent was obtained.
Fifty-three subjects entered the study; 41 (77%) were women. The mean age was 30.9 years (SD=29.9, range=17–53). When we used the Hollingshead Two-Factor Index, median social class was III (middle level), and the mode was V (the lowest level). Ten subjects (19%) were receiving welfare or disability.
Subject diagnoses included 40 with personality disorders plus eight with significant personality disorder traits, 39 with depressive disorders, and 36 with anxiety disorders. Of those with personality disorders, 20 had borderline personality disorder, while 16 had borderline traits. Self-defeating (13), avoidant (9), and narcissistic (7) were other relatively frequent personality disorders. The mean number of different lifetime axis I disorders was 3.8 (SD=2.4). This did not include recurrences. The mean current rating on the Global Assessment of Functioning Scale (GAF) was 53.4 (SD=4.05), and the highest score in the past year was a mean of 60.5 (SD=6.6). Psychotropic medications were used at intake by 23 subjects.
The 22 therapists were all experienced, with a mean of 13.1 years of posttraining experience: eight psychiatrists, 10 psychologists, three social workers, and one nurse; 20 were psychoanalysts. The therapy was free. Patients were offered a minimum of 3 years of therapy but could terminate treatment at will or try other therapies (e.g., pharmacotherapy) concurrently or sequentially, reflecting real-life practice conditions. Patients agreed to participate in the follow-along assessments for a minimum of 3 years but with the option of longer treatment, by mutual agreement.
The subjects completed about 10 hours of research interviews and questionnaires before starting their therapy. An experienced clinician administered the Guided Clinical Interview to gather information to make DSM-IV diagnoses (20). After psychotherapy sessions 3, 5, 7, 9, and 11, the patients completed the patient’s version of the California Psychotherapy Alliance Scale (21). Research assistants met with the subjects every 6 months to administer the Longitudinal Interval Follow-Up Evaluation (22) and other measures to be described. Follow-up lasted up to 5 years.
The Defense Style Questionnaire (4) is a self-report questionnaire with 88 items designed to measure defense mechanisms. Previous factor analysis yielded four factors of presumed defense mechanisms, which we called defense styles (detailed in reference 9). The styles are ranked on a continuum of adaptiveness from 1) maladaptive, 2) image distorting, and 3) self-sacrificing to 4) adaptive. An overall defensive functioning score can be calculated, with a higher score indicating greater adaptiveness or maturity. Based on previous work (4), a subject is deemed to score high on styles 1, 2, and 3 if his or her score is 0.5 standard deviation above the mean for a normative nonpatient group and low on style 4 if it is 0.5 standard deviation below the mean on that style. This cutting point approximated the median, yielding the best distribution. The means of the normative sample for nonpatients (23) and a sample of those with borderline personality (24), respectively, were style 1 (mean=3.6, SD=1.2, and mean=5.2, SD=1.1), style 2 (mean=2.5, SD=1.0, and mean=3.5, SD=1.1), style 3 (mean=2.9, SD=0.6, and mean=4.1, SD=1.2), and style 4 (mean=4.7, SD=1.0, and mean=4.4, SD=1.2).
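The cutoff rule described above can be illustrated with a short sketch. It uses only the nonpatient means and standard deviations quoted in this paragraph; the function names and the choice of strict inequalities at the cutoff are assumptions made for illustration, not the scoring procedure of the original instrument.

```python
# Nonpatient normative (mean, SD) per defense style, as quoted in the text.
NORMS = {
    1: (3.6, 1.2),  # maladaptive
    2: (2.5, 1.0),  # image distorting
    3: (2.9, 0.6),  # self-sacrificing
    4: (4.7, 1.0),  # adaptive
}


def cutoff(style):
    """Cutoff score: 0.5 SD above the norm for styles 1-3, 0.5 SD below for style 4."""
    mean, sd = NORMS[style]
    return mean - 0.5 * sd if style == 4 else mean + 0.5 * sd


def in_patient_range(style, score):
    """True if the score lies on the clinical side of the cutoff.

    Assumption: strict inequality at the cutoff itself; the text does not
    say how scores exactly at the cutoff were handled.
    """
    c = cutoff(style)
    return score < c if style == 4 else score > c


if __name__ == "__main__":
    for style in NORMS:
        print(style, round(cutoff(style), 2))
```

Under these assumptions, the borderline personality reference mean for style 1 (5.2) falls in the high range, whereas the reference mean for style 4 (4.4) does not fall in the low range.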
The California Psychotherapy Alliance Scale—Patient Version (21) is a questionnaire designed to measure the patient’s ratings of the alliance with the therapist. The 21-item Hamilton Depression Rating Scale (25) is a rating scale for observer-rated depressive symptoms. Subjects were considered depressed if their initial Hamilton depression scale score was ≥ 7 at time of entry. The SCL-90-R (26) is a 90-item self-report questionnaire designed to assess current distress. The global severity index is a mean of all the items. A score >0.5 is considered in the clinical range.
All analyses were performed by using SAS version 8.2 (27). Correlational analyses were performed by using two-tailed Spearman rank-order statistics (rs). Because subjects were not selected to be homogeneous with regard to each outcome variable, subjects were dichotomized into low and high subgroups, whenever appropriate, based on intake values that were above or below predefined cutoff scores that were representative of patient status. This was not done for the GAF, since all subjects scored below 71 and hierarchical linear regression was used. We used likelihood-based mixed-effects linear models to analyze longitudinal data, using SAS Proc Mixed (27). "Subject" was treated as a random effect, while the continuous variable "elapsed time" was treated as both a fixed and random effect. As a result, a randomly distributed regression intercept and slope (over time) was calculated for each subject in addition to a "fixed" mean overall time effect. This is sometimes referred to as a random-slope, random-intercept model and is often helpful in accounting for extra subject-specific variability when the primary interest is in mean longitudinal effects within a population (28). Significant effects were examined further by taking the slope and intercept for each subject from these models and calculating predicted values for the minimum and maximum elapsed times of actual observation for each subject. Because several subjects with two observations yielded improbable outlying scores, we limited these models to subjects with at least three observations. This decision generally yielded more conservative overall effects. Predicted change from these models was then transformed to within-condition effect sizes to allow for ready comparison of the relative size of effects across all outcome variables in this study group. We believe that these predicted values, based on the models, represent more stable estimates of long-term change than do the actual first and last raw scores. 
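For readers less familiar with this class of model, a random-slope, random-intercept model is commonly written in the following general form. The notation is ours and is offered only as a sketch: y_ij is the score of subject i at observation j, t_ij is the elapsed time, and the last line gives the predicted change between a subject's first and last observation times. The exact Proc Mixed specification, including the subgroup terms, may have differed.

```latex
\begin{align}
y_{ij} &= (\beta_0 + b_{0i}) + (\beta_1 + b_{1i})\, t_{ij} + \varepsilon_{ij}, \\
(b_{0i},\, b_{1i})^{\top} &\sim \mathcal{N}(\mathbf{0}, \mathbf{G}), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^{2}), \\
\hat{\Delta}_i &= (\hat{\beta}_1 + \hat{b}_{1i})\,\bigl(t_{i,\max} - t_{i,\min}\bigr).
\end{align}
```

Here β0 and β1 are the fixed mean intercept and time slope, and b_0i and b_1i are the subject-specific random deviations from them.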
Effect size was calculated by dividing raw change predicted from the model (the numerator) by the standard deviation of the initial value of the measure (the denominator). Finally, we examined the relative contributions of change in Defense Style Questionnaire variables to change in three outcome variables by using hierarchical multiple linear regression.
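The effect size calculation can be sketched as follows. The per-subject intercepts, slopes, observation times, and baseline scores are hypothetical values chosen for illustration, and the use of the sample (n−1) standard deviation and of the mean predicted change across subjects are assumptions; the text specifies only the numerator and denominator in general terms. The sketch keeps the raw sign of change, whereas the results are reported so that positive values indicate improvement.

```python
from statistics import mean, stdev


def predicted_change(intercept, slope, t_min, t_max):
    """Model-predicted change between a subject's first and last observation times."""
    start = intercept + slope * t_min
    end = intercept + slope * t_max
    return end - start


def effect_size(changes, baseline_scores):
    """Mean predicted change divided by the SD of the initial scores."""
    return mean(changes) / stdev(baseline_scores)


if __name__ == "__main__":
    # Hypothetical subjects: (intercept, slope per year, first time, last time).
    subjects = [
        (5.0, -0.30, 0.0, 4.0),
        (4.0, -0.10, 0.0, 3.0),
        (6.0, -0.20, 0.5, 4.5),
    ]
    baseline = [5.2, 4.1, 5.9]  # hypothetical observed intake scores
    changes = [predicted_change(*s) for s in subjects]
    print(round(effect_size(changes, baseline), 2))
```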
Of the 53 subjects, 29 (24 women) completed therapy according to therapist agreement or were still in therapy at the time of analysis, 14 (nine women) had moved or stopped therapy for external reasons, and 10 (eight women) had dropped out without therapist agreement. Neither major demographic and diagnostic variables nor initial scores on the Defense Style Questionnaire defense styles and overall defensive functioning, global severity index, 17-item Hamilton depression scale, or the GAF predicted dropout status. Subjects had a mean of 3.0 years (SD=2.1) of treatment and provided a mean of 4.2 years (SD=2.0) of follow-up.
As anticipated, the Proc Mixed analyses generally found significant interaction effects for time by subgroup (i.e., rate of change by low versus high initial score). The following data were found for Defense Style Questionnaire styles 1, 3, and 4 and overall defensive functioning (all df=1, 199): style 1—F=5.89, p=0.02; style 2—F=1.28, p=0.20; style 3—F=4.10, p<0.0001; style 4—F=2.07, p=0.04; overall defensive functioning—F=2.72, p=0.007; for the global severity index—F=2.47, df=1, 199, p=0.01; and for the 17-item Hamilton depression scale—F=2.81, df=1, 154, p=0.006. This justified examining the subgroups above and below the clinical cutoff score separately. Because the focus of this report is on the relationship of change in defense style to change in other variables, the full results from these analyses will be presented elsewhere.
Changes in defense styles over time
Table 1 displays the initial and final means of the subgroups dichotomized above (high) and below (low) the clinical cutoff score for the Defense Style Questionnaire, including only subjects with at least three observations. For the subgroups with scores in the patient range (high-scoring subgroups for styles 1, 2, and 3 but the low-scoring subgroup for style 4), significant improvement was noted for styles 1 (maladaptive) and 3 (self-sacrificing) with respective effect sizes of 0.80 (p<0.01) and 0.67 (p<0.001). Styles 2 (image distorting) and 4 (adaptive) were not significant, with effect sizes of 0.42 and 0.68 (p=0.08), respectively. The mean scores of the low-scoring subgroups did not change appreciably, except in the case of style 3, in which the mean score of the low-scoring subgroup rose significantly (effect size=−0.51, p<0.05). For the whole group (N=41), styles 1 (maladaptive) and 2 (image distorting) improved (effect size=0.37, p<0.05, and effect size=0.29, p<0.05), as did overall defensive functioning, with an effect size of 0.43 (p<0.05).
Symptoms and defense style changes
The initial symptom scores of the global severity index, Hamilton depression scale, and GAF correlated significantly with the initial overall Defense Style Questionnaire defensive functioning score in the expected direction (−0.54, p<0.001; −0.48, p<0.002; and 0.40, p<0.006, respectively): higher defensive functioning was associated with fewer symptoms and better adjustment. Over the follow-up, the mean GAF of our sample changed from 53.4 (range=42–65) to 56.9 (SD=5.5), yielding an effect size of 0.82 (p<0.0001). The global severity index of the SCL-90-R improved, with an effect size of 0.59 (p=0.001) (Table 1).
In order to determine if there was improvement in depression, we looked specifically at the subgroup with depressive symptoms at intake in the clinical range (21-item Hamilton depression scale score ≥ 7). These depressed subjects improved over the follow-up period, with an effect size of 0.56 (N=29, p<0.05).
Next, we examined change in overall defensive functioning on the Defense Style Questionnaire as a predictor of change in the three outcome variables. Among those with initial 21-item Hamilton depression scale scores in the clinical range, we used two hierarchical linear regressions. In the first model (Table 2, model 1), we entered initial Hamilton depression scale score followed by overall defensive functioning effect size from the Defense Style Questionnaire. In this model, the overall defensive functioning effect size from the Defense Style Questionnaire accounted for a significant percentage of the variance (18.6%) of the Hamilton depression scale effect size. In a follow-up model (not shown), we entered initial Hamilton depression scale score followed by change in global severity index (global severity index effect size), thereby controlling for change in distress. When overall defensive functioning effect size on the Defense Style Questionnaire was added to that model, global severity index effect size became nonsignificant, while overall defensive functioning effect size on the Defense Style Questionnaire remained significant, as in model 1. This indicates that change in self-report distress was less predictive of change in observed depressive symptoms (21-item Hamilton depression scale) than was change in Defense Style Questionnaire overall defensive functioning.
Next, we examined change in the global severity index (global severity index effect size) among subjects initially scoring in a clinical range (Table 2, model 2). Using hierarchical regression, we entered initial global severity index score followed by overall defensive functioning effect size on the Defense Style Questionnaire. The Defense Style Questionnaire accounted for 21.8% of the variance in change in distress, a larger figure than predicted by the initial global severity index (8.1%).
Finally, we examined change in GAF (Table 2, model 3). We entered the initial GAF into the model, followed by Defense Style Questionnaire overall defensive functioning effect size. Initial GAF accounted for 12.1% of the variance, and overall defensive functioning effect size on the Defense Style Questionnaire explained an additional 9.4%.
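The logic of these hierarchical models, in which the initial score is entered first and the additional variance explained by a second predictor is then examined, can be sketched in a few lines. This is a minimal illustration for two predictors only, with hypothetical function and variable names; it relies on the standard identity that the increase in R² at step 2 equals the squared semipartial correlation of the second predictor. It is not the SAS procedure used for the analyses reported here.

```python
from statistics import mean


def _center(values):
    m = mean(values)
    return [v - m for v in values]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def r_squared_steps(y, x1, x2):
    """Return (R-squared after step 1, additional R-squared from step 2).

    Step 1 enters x1 (e.g., the initial score); step 2 adds x2
    (e.g., change in overall defensive functioning).
    Assumes x1 is not constant and x2 is not perfectly collinear with x1.
    """
    yc, x1c, x2c = _center(y), _center(x1), _center(x2)
    ss_total = _dot(yc, yc)
    # Step 1: variance in y explained by x1 alone.
    r2_step1 = _dot(yc, x1c) ** 2 / (_dot(x1c, x1c) * ss_total)
    # Step 2: remove from x2 the part predictable from x1.
    b = _dot(x2c, x1c) / _dot(x1c, x1c)
    x2_resid = [v - b * u for v, u in zip(x2c, x1c)]
    # The R-squared increment is the squared semipartial correlation.
    delta_r2 = _dot(yc, x2_resid) ** 2 / (_dot(x2_resid, x2_resid) * ss_total)
    return r2_step1, delta_r2


if __name__ == "__main__":
    # Hypothetical data in which y = 2*x1 + x2 and the predictors are uncorrelated.
    x1 = [1.0, -1.0, 1.0, -1.0]
    x2 = [1.0, 1.0, -1.0, -1.0]
    y = [3.0, -1.0, 1.0, -3.0]
    print(r_squared_steps(y, x1, x2))
```

In this constructed example the first predictor explains 80% of the variance and the second an additional 20%, which parallels how the percentages in models 1 to 3 are partitioned between the initial score and defense change.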
Defense Style Questionnaire and therapeutic alliance
Initial defense style scores were significantly associated with lower mean California Psychotherapy Alliance Scale alliance ratings (all N=41) for style 1 (rs=−0.53, p=0.0004), style 2 (rs=−0.47, p=0.002), and style 3 (rs=−0.34, p=0.03). The adaptive style 4 did not correlate significantly with a higher mean alliance (rs=0.26, p=0.10). Consequently, a higher overall defensive functioning score (Defense Style Questionnaire) was associated with a better self-report therapeutic alliance early in therapy (rs=0.44, p=0.004). Of the four California Psychotherapy Alliance Scale subscales, the magnitude of the correlations was largest for patient working capacity (e.g., correlations with styles 1 and 2, both rs=−0.58, p=0.0001).
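As an illustration of the rank-order statistic reported here, the following sketch computes Spearman's rs as the Pearson correlation of average ranks, with ties receiving the mean of the positions they occupy. The function names and the example data are hypothetical, and the sketch does not compute the accompanying two-tailed p values.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Hypothetical style 1 scores and mean alliance ratings for five patients.
    style1 = [5.4, 3.9, 4.8, 6.1, 3.2]
    alliance = [4.1, 5.6, 5.0, 3.8, 6.0]
    print(round(spearman_rs(style1, alliance), 2))
```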
This study had several limitations. The sample was not diagnostically homogeneous, and the therapists did not follow a standard treatment manual. Some subjects received other treatments in addition to long-term dynamic psychotherapy, primarily medications. Some subjects gave few interviews before dropping out. While we found no major demographic or diagnostic predictors of those who gave few observations, nonetheless, the possibility remains that data from these individuals might alter our findings had they been available. Over the study, life events may have influenced outcome. As a naturalistic study, without a comparison group, causation of change could not be determined. However, the study does reflect real-life patient selection and treatment and has the advantage of multiple assessments over time.
The use of within-condition effect sizes allows ready comparison of the magnitude of change across measures within a study. Our sample’s heterogeneity on all measures had the additional effect of increasing the size of the standard deviation above that of a more homogeneous sample and, consequently, reducing the size of the effect sizes. Reliance on effect sizes also does not directly address whether subjects recovered or normalized. This important question will be addressed when we explore the predictors of improvement and recovery.
In our study, on the maladaptive style, subjects who initially scored high showed a significant decrease over time, while those initially scoring low showed no significant change. Similarly, subjects who scored high on the self-sacrificing style had a decrease in use of that style over time, but those who initially scored low increased their use. In previous studies (4, 5), the self-sacrificing style (e.g., reaction formation, pseudo-altruism) correlated more with lower mental health, and so it makes sense that with clinical improvement, there would be less use of this style. However, regression to the mean might also contribute. This style is least strongly correlated with poor mental health, and it is possible that some subjects’ increased use was a move up from the maladaptive and image-distorting styles. The whole group of subjects also decreased their use of style 2 image-distorting defenses. Subjects who initially scored low on the adaptive style moved toward greater use of this style over time, but the change fell just short of significance. Overall, defensive functioning improved significantly. Since these subjects were assessed a median of six times over a mean of about 4 years, we believe that the data reflect a sustained trend over time.
These findings are congruent with those of Akkerman et al. (9), who found that as depressed patients recovered with treatment, their use of mature defenses improved from low to normal over the course of 2 years of treatment, while their use of immature defenses gradually decreased to levels below those of nonpatients.
In a previous study (5), we found that after 6 months of naturalistic treatment, defense style scores were stable upon test-retest (median r=0.70). However, the sample means at 6 months indicated that the subjects used less maladaptive and image-distorting styles (p<0.005 and p<0.003, respectively) and more of the adaptive style (p<0.01). Thus, there was greater adaptiveness as a group, although individuals retained their relative defensive profiles. This might reflect spontaneous improvement, treatment effect, or regression toward the mean, that is, the tendency for more extreme scores to moderate to a more usual level over repeated testing. In our current study, we did not find a clear pattern of change in the first 2 years for most subjects because of high levels of variability. This reinforces the value of following subjects for more than 2 years with multiple assessments so that the signal of longitudinal improvement can emerge above the noise of state-dependent changes.
Are defenses a state or trait phenomenon? The data of Akkerman et al. (9) indicated that intermediate-level defenses (in a three-factor version, neither adaptive nor maladaptive) did not change with recovery from depression and might be more trait-like. (Traits are more enduring while states change, e.g., depressed or not depressed.) However, they may change after more than 2 years, which was the follow-up period in that study. The maladaptive, image-distorting, and adaptive defenses might be more state-dependent phenomena while still reflecting some trait aspects. When subjects fall ill, their capacity to use mature adaptive defenses may diminish. As they regress, their least adaptive defenses emerge, i.e., they start to more frequently employ maladaptive defenses, which they may have used less often while being well compensated.
In our study, initial overall Defense Style Questionnaire scores were significantly correlated with initial scores on self-rated distress (global severity index), observer-rated depression (21-item Hamilton depression scale), and observer-rated functioning (GAF). As a group, subjects improved on all of these variables.
Improvement in overall defensive functioning predicted improvement in observer-rated depression, even after we controlled for level of depression at intake and improvement in self-report distress. Also, change in overall defensive functioning on the Defense Style Questionnaire was a better predictor of improvement in general level of distress than initial level of distress, as well as a significant predictor of change in score on the GAF after initial GAF assessment. Thus, although we cannot determine whether defense change causes symptom change or vice versa or whether both change as a function of some third factor, change in overall defensive functioning was a potent predictor of change in symptoms and functioning.
From a clinical perspective, therapy should help a patient use more adaptive and fewer maladaptive defenses. A crucial question is whether dynamic therapy, which explicitly addresses a patient’s defenses and conflicts, has a greater effect on change in defense use than therapy with different theoretical aims. Psychoanalytic theory suggests that addressing defenses and conflicts over time should lead the patient to become better adapted to conflictual issues, more aware of maladaptive defenses, and more flexible and adaptive in handling stress.
Two studies of in-session changes offer some evidence of immediate change, albeit without proving any causal link. In studying sequential consequences of therapists’ interventions, Milbrath et al. (29) found that patients’ emotional elaboration was followed by therapists’ defense interpretation, which in turn was followed by more patient emotional elaboration, implying a decrease in defensiveness. Bond et al. (30) found that defense interpretations mixed with supportive interventions were followed by enhanced therapeutic work without increasing defensiveness. However, in-session changes alone do not offer evidence of long-term sustained changes in defense maturity.
In the current study, subjects who initially scored high on the maladaptive defense style had a mean score in the range of a reference sample of borderline personality disorder (24). This is consistent with the fact that 75% of the present sample had personality disorders, half of whom had borderline personality disorder. The significant improvement in maladaptive defenses over a sustained period of time that was evident in this group indicates that people with personality disorders, including borderline personality disorder, can experience dynamic improvement with psychodynamic therapy and follow-up of 3—5 years. Dynamic improvement moved in concert with symptomatic changes in GAF, depression, and distress. The fact that subjects with chronic or recurrent conditions and axis II disorders showed sustained improvement on both dynamic and symptomatic measures is important. This is consistent with the findings reported by Perry et al. (2, p. 1319), viz., psychotherapy was associated with about a sevenfold faster rate of recovery in personality disorders (defined as no longer meeting full personality disorder criteria) than was found in the natural history studies of borderline personality disorder (25.8% per year versus 3.7% per year). Our use of regression models to compare predicted initial and final scores also would minimize the effect of regression toward the mean on final change because the model uses all observations to detect the trend underlying the variation of the actual scores. This is more conservative than using the actual initial and final scores, which would be more susceptible to state effects. Furthermore, by omitting subjects with only two observations, we eliminated some exaggerated changes, which were unlikely to reflect long-term changes. Also, long-term changes are smaller than short-term changes reported in some studies because of the recurrent nature of chronic illness.
While initial defense style did not predict attrition or continuation in therapy, it did predict the early self-reported therapeutic alliance. It makes sense that a more adaptive style would correlate positively with a more positive experience of therapy and the therapist. The type of items that a subject would endorse on the California Psychotherapy Alliance Scale would reflect having propensities for affiliation, taking responsibility for oneself, and agreeing with the tasks of therapy. The magnitude of these relationships was in fact highest for items reflecting the patient’s working capacity, which is consistent with a higher level of defensive functioning. Therapeutic alliance may be a partial mediating factor that we will explore in future research examining the therapeutic process. On the other hand, staying in therapy may depend on other characteristics, such as fear of abandonment (31). Thus, some maladaptive defensive patterns can be associated with a need for attachment and an emotional availability and may keep the subject in therapy.
In summary, in this naturalistic study, patients with chronic and recurrent depressive, anxiety, and/or personality disorders who were treated with open-ended dynamic psychotherapy demonstrated improvement in both defensive functioning and symptoms as a group over a 3- to 5-year period. We were not able to address the causation of change. Because this was a naturalistic study, we did not control for variations in therapist technique and medication use. While naturalistic studies must be complemented by efficacy studies, they nonetheless provide valuable estimates that suggest that the relationship between dynamic and symptomatic change is ripe for further study.