Ann Thorac Surg 1998;65:1320-1325
© 1998 The Society of Thoracic Surgeons
a Department of Health Psychology, Flinders Medical Centre, Adelaide, South Australia, Australia
b Department of Surgery, Flinders Medical Centre, Adelaide, South Australia, Australia
Accepted for publication December 19, 1997.
Address reprint requests to Dr Baker, Cardiac Surgical Research Group, Department of Surgery, Flinders Medical Centre, Bedford Park, SA 5042 Australia
| Abstract |
|---|
Methods. Neuropsychologic assessment was performed on 50 patients before and at 7 days after either hypothermic or normothermic coronary artery bypass grafting. From a matched control group of 24 normal subjects who were examined twice over a similar interval, reliable change indices that controlled for measurement error and practice effects were calculated for each neuropsychologic measure. With the use of these indices, the incidence of postoperative decline among the study patients was determined. For comparison, the incidence of decline using the "one standard deviation" criterion also was calculated.
Results. Comparing the reliable change and standard deviation methods, statistically significant differences in the incidence of decline were observed in 5 of 11 neuropsychologic measures. The reliable change method identified more patients with neuropsychologic deficits on most measures.
Conclusions. The control of measurement error and practice effects can alter significantly the calculated incidence of neuropsychologic impairment after coronary artery bypass grafting.
| Introduction |
|---|
The variability in the incidence of neuropsychologic impairment also has been shown to depend on the statistical criteria used to infer that meaningful change has occurred. At present, there is no agreement as to what degree of change is indicative of neuropsychologic dysfunction. In a single sample of patients undergoing coronary artery bypass grafting (CABG), Mahanna and colleagues [6] compared the incidence of postoperative neuropsychologic dysfunction with the use of four commonly used criteria of impairment: (1) a decline of 1 standard deviation (SD) from the preoperative test score on 20% of measures, (2) a decline in test scores by at least 20% from baseline on at least 20% of measures, (3) impairment indices adjusted for age, and (4) impairment indices unadjusted for age. Depending on the criteria used, the incidence of postoperative decline ranged from 15% to 66% before hospital discharge and from 3% to 19% at 6 months follow-up.
Within the published literature, of more concern than the lack of uniformity in defining neuropsychologic change is the fact that the methods that have been used are based on arbitrary statistical decisions that have no theoretic underpinning. One of the most commonly used approaches to defining postoperative neuropsychologic dysfunction is the SD method, in which a patient is considered to have deteriorated if his or her change score equals or exceeds 1 SD of the group mean preoperative score on that measure. Consistent with the arbitrariness of this approach, there is little agreement as to how the criteria should be applied. Whereas some studies have used the SD method in isolation [2], others have added the condition that the decline of 1 SD must occur on two or more tests [7] or on 20% of tests [6].
Despite the lack of a theoretic foundation for the SD criteria of change, and the recommendation of the Statement of Consensus on Assessment of Neurobehavioural Outcomes After Cardiac Surgery [8] that practice effects [9] be considered in any analyses of change data, the SD method is still widely used [10, 11]. However, there is emerging recognition of the importance of defining "real" change in test-retest scores as opposed to "artifactual" change resulting from low test reliability and susceptibility to practice effects [12, 13]. Failure to take into account these factors, as in the arbitrary cutoff methods, has the potential to lead to significant underestimation or overestimation of the incidence of neuropsychologic dysfunction after CABG.
To define more accurately "real" change by controlling for the reliability of a test measure, Jacobson and Truax [12] proposed the use of a reliable change (RC) index. The RC index defines the range in which an individual score is likely to fluctuate because of the imprecision of the measuring instrument. Although this index controls for test-retest reliability, it makes no correction for practice effects, which are an independent source of error [9]. Subsequently, Chelune and associates [13] used the RC index method, with the addition of a correction for observed practice effects on each measure. Chelune and associates proposed that the RC method, corrected for practice effects, addresses the limitations of the traditional arbitrary cutoff techniques by providing clear-cut, statistically sound criteria that define whether an individual's posttreatment change score exceeds the variation attributable to test reliability and practice effects.
The aim of the present study was to contrast the incidence of immediate postoperative neuropsychologic impairment obtained with the RC and SD methods to illustrate the variation produced by failure to account for measurement error and practice effects.
| Material and methods |
|---|
Control subjects
Twenty-four unpaid volunteers were recruited from bowling clubs, senior citizens groups, and the Flinders Medical Centre Volunteers Service. Volunteers were ineligible for inclusion in the study if they had a past history of cardiac operation, if they had a neurologic injury or disease, or if they did not speak English as their first language. The demographic and baseline neuropsychologic data for both the patients and the control subjects are presented in Table 1.
Neuropsychologic examination
Patients were tested individually by qualified examiners (A.C.K. and M.J.A.) on the day before operation and then before hospital discharge. All control subjects were tested on two occasions, 7 days apart. The same examiner always conducted both testing sessions of a given patient or control subject. All patients and control subjects were tested individually in the examiner's office, free of distractions. Neuropsychologic test selection was based substantially on the Statement of Consensus [8]. The test battery in order of administration was as follows: California Verbal Learning Test (CVLT) [14], Purdue Pegboard (Peg R, Peg L, Peg RL) [15], Controlled Oral Word Association Test (COWAT, initial letter verbal fluency) [16], Trail Making Test (TMT A and TMT B) [17], Wechsler Adult Intelligence Scale-Revised (WAIS-R) Digit Symbol subtest (Dig Symb) [18], Boston Naming Test (BNT) [19], and National Adult Reading Test-Revised (NART-R, administered only at baseline) [20]. All tests were administered and scored in a standardized manner.
The CVLT is able to generate numerous measures of various aspects of learning and memory. However, in the present study, we elected to analyze three measures comprising learning (Tot: the sum of the number of words recalled on trials 1 through 5) and recall (Long Free: the number of words recalled after a 20-minute delay, and Long Cued: the number of words recalled after a 20-minute delay with category cues provided). To minimize practice effects on the CVLT, alternate test forms [21] were administered at the preoperative and postoperative examinations. Every second patient or control subject was administered the alternate form first.
Methods of defining postoperative neuropsychologic change
Two statistical methods for defining postoperative changes on neuropsychologic measures were used and compared directly with one another.
Reliable change indices
Using the methodology outlined by Jacobson and Truax [12], an RC index was calculated for each neuropsychologic measure using the baseline and follow-up data of the control subjects. First, the test-retest reliability coefficient ($r_{xx}$) was computed for each measure (Pearson correlation between preoperative and postoperative scores), from which the standard error of measurement ($SE_m$) was calculated using the formula
$$SE_m = SD_1 \sqrt{1 - r_{xx}},$$
where $SD_1$ is the SD of the baseline score. The standard error of the difference ($SE_{diff}$) then was calculated using the formula
$$SE_{diff} = \sqrt{2\,(SE_m)^2}.$$
The standard error of the difference describes the spread of distribution of change scores that would be expected if no actual change had occurred [12]. To establish a 90% RC confidence interval (two-tailed prediction) in which only 5% of cases would be expected to fall above and 5% to fall below the cutoff, the $SE_{diff}$ was multiplied by ±1.64. A correction representing the practice effect then was added to the two-tailed cutoff points [13]. The practice effect was calculated for each variable as the mean difference between the follow-up and baseline scores. Thus, an RC 90% confidence interval was calculated from this formula for each variable:
$$RC_{90\%} = (SE_{diff} \times \pm 1.64) + \text{practice effect}$$
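As an illustration only, the calculation described above can be sketched in Python. The sketch assumes the standard Jacobson and Truax definitions (SEm = SD1 × sqrt(1 − rxx), SEdiff = sqrt(2 × SEm²)), a sample (n − 1) SD, and measures on which a higher score is better. The function names and the control scores are hypothetical and are not data from this study.

```python
import math
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rc_interval(baseline, retest, z=1.64):
    """Practice-corrected 90% reliable change interval from control data."""
    r_xx = pearson_r(baseline, retest)              # test-retest reliability
    se_m = stdev(baseline) * math.sqrt(1.0 - r_xx)  # standard error of measurement
    se_diff = math.sqrt(2.0 * se_m ** 2)            # standard error of the difference
    practice = mean(retest) - mean(baseline)        # mean practice effect
    return practice - z * se_diff, practice + z * se_diff


def declined(change, lower_cutoff):
    """A change score below the lower cutoff is classified as a decline
    (assumes higher scores mean better performance)."""
    return change < lower_cutoff


# Hypothetical control scores on one measure (not study data).
baseline = [48, 52, 55, 60, 61, 45, 58, 50]
retest = [50, 55, 56, 63, 65, 47, 60, 53]

low, high = rc_interval(baseline, retest)
print(f"RC 90% interval for change scores: {low:.2f} to {high:.2f}")
```

Because the interval is centered on the practice effect rather than on zero, a patient whose score merely stays the same can fall below the lower cutoff when control subjects reliably improve on retesting. For timed measures on which a higher score is worse, the direction of the comparison would need to be reversed.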
Standard deviation method
Using the traditional SD method [5], a postoperative change on a neuropsychologic measure was considered to have occurred if a patient's change score equaled or exceeded 1 SD of the group mean preoperative score on that measure.
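For comparison, a minimal sketch of the SD criterion follows. It assumes that the cutoff is the sample SD of the patient group's preoperative scores and that a decline is a change of at least 1 SD in the unfavorable direction. The function name, the `higher_is_better` flag, and the scores are hypothetical.

```python
from statistics import stdev


def sd_decline(pre_scores, post_scores, higher_is_better=True):
    """Flag each patient whose score worsened by at least 1 SD of the
    group's preoperative scores on this measure."""
    sd_pre = stdev(pre_scores)
    flags = []
    for pre, post in zip(pre_scores, post_scores):
        change = post - pre
        if not higher_is_better:
            change = -change  # e.g. timed tests, where a larger value is worse
        flags.append(change <= -sd_pre)
    return flags


# Hypothetical patient scores (not study data).
pre = [50, 55, 60, 45, 40]
post = [49, 40, 61, 44, 30]
print(sd_decline(pre, post))
```

Unlike the RC interval, this cutoff depends only on the spread of the patients' baseline scores. It uses neither the test-retest reliability of the measure nor the expected practice effect.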
Data analysis
Statistical analyses were performed using the SPSS statistical software package (SPSS Inc, Chicago, IL), with an alpha of 0.05 considered statistically significant. When comparing quantitative data, one-way analysis of variance was used, with the Bonferroni correction (alpha/number of contrasts) applied to control for multiple comparisons [22]. Categorical data were analyzed with the χ² statistic. Fisher's two-tailed exact test was used when the expected cell sizes were small. To examine practice effects, differences between each control subject's preoperative and postoperative scores were analyzed with paired t tests. Sign tests were used to compare the number of patients who showed postoperative deficits as classified according to the RC and SD methods [23]. This study was approved by the Clinical Investigation Committee of the Flinders Medical Centre (approval number 136/94).
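The comparison of the two classification methods can be illustrated with an exact two-sided sign test on the discordant pairs, that is, the patients classified as declined by one method but not by the other. This is a sketch of the general technique under that assumption, not a reproduction of the SPSS procedure used in the study. The function name and the classification flags are hypothetical.

```python
from math import comb


def sign_test(rc_flags, sd_flags):
    """Exact two-sided sign test on paired decline classifications.

    Returns (rc_only, sd_only, p_value), where rc_only counts patients
    flagged by the RC method alone and sd_only those flagged by the SD
    method alone. Concordant pairs carry no information and are ignored.
    """
    rc_only = sum(1 for r, s in zip(rc_flags, sd_flags) if r and not s)
    sd_only = sum(1 for r, s in zip(rc_flags, sd_flags) if s and not r)
    n = rc_only + sd_only
    if n == 0:
        return rc_only, sd_only, 1.0
    k = min(rc_only, sd_only)
    # Binomial(n, 0.5) probability of k or fewer, doubled for two tails.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return rc_only, sd_only, min(1.0, 2 * tail)


# Hypothetical classifications for 12 patients on one measure.
rc_flags = [True] * 8 + [False] * 4
sd_flags = [False] * 12
rc_only, sd_only, p = sign_test(rc_flags, sd_flags)
print(rc_only, sd_only, p)
```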
| Results |
|---|
Reliable change indices
The data obtained from control subjects that were used in calculating the RC indices are summarized in Table 2. All the measures showed acceptable test-retest reliability, with a range from 0.67 (TMT A) to 0.94 (Dig Symb). Statistically significant practice effects were observed on the CVLT Long Free, TMT A, TMT B, Dig Symb, and BNT measures. It is of interest that not all practice effects demonstrated an improvement on retesting. All three CVLT measures showed a decline from baseline to retesting, and the decline was statistically significant on the Long Free measure. To control for this learning bias, the RC indices were corrected by adding the practice effect and were rounded to the nearest whole number outside the 90% interval to obtain the RC intervals (Table 2). The practice effect on some measures was negligible; thus, the RC interval effectively did not differ from the uncorrected RC index.
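The effect of the sign of the practice effect on the corrected interval can be sketched as follows. The sketch assumes that "rounded to the nearest whole number outside the 90% interval" means rounding the lower limit down and the upper limit up, which is one reading of the text. The function name and the input values are hypothetical and are not taken from Table 2.

```python
import math


def rounded_rc_interval(se_diff, practice_effect, z=1.64):
    """Practice-corrected RC interval, rounded outward to whole scores."""
    lower = practice_effect - z * se_diff
    upper = practice_effect + z * se_diff
    # Round away from the interval so the whole-number limits enclose it.
    return math.floor(lower), math.ceil(upper)


# Same measurement error, opposite practice effects (hypothetical values).
print(rounded_rc_interval(se_diff=2.0, practice_effect=3.0))
print(rounded_rc_interval(se_diff=2.0, practice_effect=-1.2))
```

A positive practice effect shifts both cutoffs upward, so a smaller drop in score counts as a decline. A negative practice effect, as seen on the CVLT measures, shifts them downward, so a larger drop is required.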
| Comment |
|---|
Both the RC method and fixed cutoff methods (ie, the SD method) serve as statistical conventions for defining change [5, 12]. However, the latter approaches fail to take into account a test instrument's susceptibility to practice effects and its test-retest reliability. The potential for serious misclassification brought about by failure to control for these factors is well illustrated on the Dig Symb subtest of the WAIS-R. Using the SD method, no patient was classified as showing a postoperative decline on this measure; using the RC method with correction for practice, 36% of the patients were so classified. If the RC method were used with no correction for practice effects, the percentage of patients classified as impaired would fall to 8%. Because the Dig Symb subtest was highly reliable over a 6-day test-retest interval (r = 0.94), the range of measurement error was narrow. If the reliability of the test were more modest (eg, r = 0.75), the range of measurement error would be broadened, and no patients would have exceeded the cutoff for impairment.
The preceding example supports concerns raised by several investigators [8, 12, 13] that meaningful change can be obscured by any data analysis methods that fail to control appropriately for practice effects and test reliability. Although a recent study attempted to revise the traditional SD method to incorporate practice effects [11], and others have raised the issue of the influence of test-retest reliability [2], the RC method is unique in that it considers the influence of practice effects and test-retest reliability for each measure in an approach that is both psychometrically sound and statistically valid [12, 13]. It is important that both test-retest reliability and practice effects be considered because they have independent effects on the analysis of change data. Test-retest reliability coefficients are based on relative rank ordering on the two administrations of the test. Therefore, a high reliability coefficient could be obtained if all subjects systematically made a mild, moderate, or substantial improvement (or decline) at retesting compared with baseline testing, but maintained their same relative rank ordering at both testing times.
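The independence of the two factors can be shown with a small sketch: adding the same constant to every retest score changes the mean practice effect but leaves the test-retest correlation unchanged, because each subject's standing relative to the others is preserved. The scores, the noise values, and the function names below are hypothetical.

```python
import math
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def reliability_and_practice(baseline, retest):
    """Return (test-retest correlation, mean practice effect)."""
    return pearson_r(baseline, retest), mean(retest) - mean(baseline)


# Hypothetical baseline scores with small retest noise (not study data).
baseline = [40, 45, 50, 55, 60, 65]
noise = [1, -1, 0, 1, -1, 0]

results = {}
for shift in (0, 3, 10):  # uniform improvement applied to every subject
    retest = [b + n + shift for b, n in zip(baseline, noise)]
    results[shift] = reliability_and_practice(baseline, retest)
    r, practice = results[shift]
    print(f"shift={shift:>2}  r_xx={r:.3f}  practice effect={practice:+.1f}")
```

In this sketch the reliability coefficient is the same for all three shifts, whereas the practice effect ranges from 0 to 10 points. A method that uses reliability alone therefore cannot detect a systematic shift, which is why the correction for practice is applied separately.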
With the use of the RC method, the highest incidences of decline were observed on timed tests of visuomotor speed and attention (TMT B, Dig Symb) and manual dexterity (Peg R). Deficits in these domains are consistent with the findings of most studies that have investigated the incidence of neuropsychologic impairment after cardiac operations [7, 8].
In the present study, one finding that requires further discussion is the negative practice effects displayed by control subjects on the CVLT. We elected to use the CVLT as a list learning task instead of the Rey Auditory Verbal Learning Test as recommended by the Consensus [8]. The CVLT has equal or superior alternate form test-retest reliability on key measures [21, 24, 25] and it measures many more parameters of learning and memory that can be analyzed to define specific subtypes of impairment [14]. The three CVLT measures used in this study were selected on the basis of their robust test-retest reliability coefficients [21]. Although the negative practice effects in the control subjects were not consistent with the positive practice effects displayed on the other neuropsychologic measures, the alternate form of the CVLT was designed specifically such that practice effects would be minimized [21]. Therefore, because the control subjects were not expected to show any improvement on their retest scores, the small decline experienced by the group falls within the overall expectations of the test. Another finding unique to the CVLT measures was the trend for the SD method to classify more patients as declined than the RC method. This tendency is related in part to the negative practice effects displayed by the control subjects.
One aspect of our study limits its comparison with other studies [14, 6]: we have not proposed a single overall incidence value for neuropsychologic impairment after CABG. Although such single figures superficially provide a convenient summation of the extent of acquired impairment, it must be recognized that they are calculated by imposing another arbitrary statistical decision on individual test measures. As such, overall incidence data will vary according to which statistical criteria are used [6], as well as with the sensitivity of the tests, the number of tests used, and the range of cognitive domains they assess [8]. Notwithstanding concerns about the accuracy of overall impairment data, any approach that essentially dichotomizes patients as "impaired" or "unimpaired" promotes a one-dimensional view of brain dysfunction [26].
The main concern regarding the use and application of the RC method in this clinical population is the requirement that the indices be derived from an appropriate control group. There is much debate regarding the most appropriate control group for comparison with patients undergoing CABG [5]. The use of a nonsurgical control group enabled us to assess practice effects and the reliability of the measures. Nonsurgical control subjects have been reported previously in the literature [27, 28] and represent an improvement over the single-group incidence study protocol. However, this strategy does not account for the potentially dramatic short-term effects of major operations. The difficulty in using a surgical control group lies with matching the control group to the patients undergoing extracorporeal circulation.
In conclusion, we believe the RC method is a more valid and theoretically sound statistical approach to defining neuropsychologic change after CABG than the traditionally used fixed, arbitrary cutoff methods, which fail to control for test reliability and practice effects. We propose the use of the RC method as a standardized approach to defining acquired neuropsychologic deficits after cardiac operations.