Ann Thorac Surg 2009;88:151-156. doi:10.1016/j.athoracsur.2009.03.080
© 2009 The Society of Thoracic Surgeons
Original Articles: Pediatric Cardiac
Interinstitutional Comparison of Risk-Adjusted Mortality and Length of Stay in Congenital Heart Surgery
William M. DeCampli, MD, PhD (a, b, *),
Redmond P. Burke, MD (a)
(a) The Congenital Heart Institute, Arnold Palmer Hospital for Children, Miami Children's Hospital, Orlando, Florida
(b) Department of Medical Education, The University of Central Florida College of Medicine, Orlando, Florida
Accepted for publication March 25, 2009.
* Address correspondence to Dr DeCampli, The Congenital Heart Institute, Arnold Palmer Hospital for Children, 50 W Sturtevant St, Orlando, FL 32813 (Email: wdecampl{at}mail.ucf.edu).
Abstract
Background: Risk Adjustment for Congenital Heart Surgery (RACHS) and Aristotle basic complexity scores (BCS) have been shown to correlate with mortality and length of stay (LOS) after congenital heart surgery. Interinstitutional comparisons using these scores, as well as the Aristotle comprehensive complexity score (CCS), have not been demonstrated.
Methods: We recorded age, weight, RACHS, BCS, CCS, mortality, and LOS for 1,103 patients undergoing cardiac surgery between September 1, 2004, and June 1, 2007, at two institutions. We used binary logistic and multiple linear regressions to evaluate determinants of mortality and LOS, respectively, the C statistic to compare the predictive power of the three scoring systems for mortality, the odds ratio to compare the two institutions, and regression coefficients to compare scoring systems and institutions for LOS.
Results: Raw mortality was 2.9% at both institutions. Final logistic regression models contained only CCS. Odds ratios for death at institutions 1 and 2 were 1.25 and 1.26, respectively (not significant). C statistics for RACHS, BCS, and CCS were 0.73, 0.63, and 0.81, respectively (p = 0.01 for CCS versus BCS; p = 0.02 for CCS versus RACHS). Final regression model for LOS retained age, RACHS, and CCS (R² = 0.44). The RACHS regression coefficient was greater for institution 2.
Conclusions: The CCS tends to have more predictive power than RACHS and BCS for mortality. The LOS is moderately correlated with CCS, RACHS, and age together, but the model is a poor predictor of individual LOS. The LOS for RACHS category 6 cases differed between the institutions. This study suggests methods that can be used to compare institutions in a risk-adjusted manner.
Introduction
Interinstitutional comparison of morbidity and mortality in congenital cardiac surgery is challenging owing to the large number of different operations performed and the relatively low annual case volumes per institution. Valid risk adjustment must also take into account a sizable number of a priori comorbidities. The RACHS (Risk Adjustment in Congenital Heart Surgery) and Aristotle basic complexity (BCS) scores are consensus-based systems that attempt to couch risk into a single score for each operation [1–4]. The Aristotle comprehensive complexity score (CCS) adds patient-specific comorbidity factors to the BCS score. Boethig and colleagues [5] previously found that RACHS score was reflective of both mortality and length of stay (LOS) in pediatric cardiac surgery. Recently O'Brien and associates [6], analyzing 3 years of data from the European Association of Cardiothoracic Surgery (EACTS) and The Society of Thoracic Surgeons congenital database, found a significant correlation between the BCS score of an operation and its observed procedure-specific risk of mortality and prolonged LOS (>21 days). The C statistic was 0.70 for mortality and 0.67 for prolonged LOS. Al-Radi and coworkers [7], using data from a single institution, found that RACHS score was more predictive of mortality than was BCS, the C statistics being 0.76 and 0.74, respectively, when adjusted for year of operation. Adding age to the regression model increased the predictive power of BCS. The RACHS score was also better than the BCS score in predicting LOS. The results of Al-Radi and associates [7] were consistent with those of Kang and colleagues [8], who found RACHS to be strongly, but BCS weakly, predictive of mortality, the former being strengthened further by the addition of age as a risk factor. 
These results suggest that (1) the Aristotle score could be used to make interinstitutional comparisons of morbidity and mortality and (2) CCS might be better than BCS and RACHS in predicting mortality and LOS. In this paper, we present a preliminary comparison of morbidity (LOS) and mortality between two institutions using the RACHS, BCS, and CCS scoring systems.
Material and Methods
Institutions
The institutions (1 and 2) both carry out comprehensive pediatric cardiac care of arbitrary complexity. They are contracted so that a single physician group carries out pediatric cardiac services (surgery, cardiology, anesthesia, and critical care) at both institutions ("joint program"). In practice, each institution has its own hospital-based physician staff, but cross-coverage by the other institution's staff is not uncommon. Although 230 miles separate the institutions, they share clinical data by means of three or more video teleconferences per week. Institution 1 performs about 240 cardiopulmonary bypass (CPB) cases and 110 non-CPB cardiovascular cases per year. Institution 2 performs about 120 CPB cases and 50 non-CPB cardiovascular cases per year. Although they share data and opinions regularly, the staff at each institution is not held to strictly common practice protocols; the protocols nonetheless probably overlap considerably more than they would in the absence of a joint program.
Data Sources and Patient Population
We obtained approval for this study and for waiver of parental consent from our institutional review boards. We recorded the RACHS score, BCS, CCS, mortality, and LOS for 1,103 patients who underwent congenital cardiac surgery between September 1, 2004, and June 1, 2007, at the two institutions. We used The Society of Thoracic Surgeons congenital cardiac database (CardioAccess Inc, Miami, FL) to derive operation data, and our electronic medical records program (iRounds; Teges Inc, Coral Gables, FL) to extract comorbidities and LOS. We used either CardioAccess or the Aristotle Web site (www.aristotleinstitute.org) to calculate BCS and CCS. Operations included those coded as type "CPB" or "no CPB cardiovascular" as defined by The Society of Thoracic Surgeons database project, as long as they also had an RACHS and a BCS score. We excluded patent ductus arteriosus ligations in infants weighing 2,500 g or less. We excluded 7 patients having major cardiovascular procedures whose procedure did not appear in the published list of RACHS and BCS scores. Only the first operation for each admission was included. Operations consisting of more than one procedure were assigned to the procedure with the highest BCS score (no ties were encountered).
Outcome Variables
Mortality was defined as death during the same hospitalization as the operation regardless of cause, or out of hospital within 30 days postoperatively if the cause was related to the patient's cardiac disease. Length of stay was defined as the duration of hospitalization in which the operation occurred. If a patient was transferred to another inpatient acute care facility, the case was excluded.
Statistical Analysis
We used Student's t tests and one-way analysis of variance to compare individual variables between the institutions. We used binary logistic regression to evaluate determinants of mortality. The initial model included age, weight, location (institution), RACHS, and CCS as independent variables, and we used stepwise elimination to determine the final model. In the final model, we tested the possibility of nonlinearity by expanding the final variables out to second order (quadratic form) and comparing the result with the linear solution. We used the Hosmer-Lemeshow statistic to test goodness of fit. We used the area under the receiver operating characteristic curve (C statistic) to compare the ability of RACHS, BCS, and CCS to discriminate mortality. We made comparisons using 95% confidence limits determined under the nonparametric assumption. Given the relatively low sample size, we also used pairwise backward stepwise elimination based on the probability of the likelihood-ratio statistic calculated using the maximum partial likelihood estimates to determine significant differences in predictive value of the RACHS, BCS, and CCS scores. Removal criterion was set at a probability of less than 0.1. To further test the effect of institution on the mortality curve, we determined the C statistic for the regression of mortality on a solution that contained CCS and institution as independent variables. We then compared this C statistic with that of the regression of mortality on CCS alone.
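As a minimal sketch of what the C statistic measures, the snippet below computes it as the fraction of (death, survivor) pairs in which the patient who died had the higher score, counting ties as one half. This is the standard pairwise (Mann-Whitney) form of the area under the receiver operating characteristic curve, not the SPSS routine used in the study; the function name `c_statistic` and the toy data are illustrative assumptions.

```python
def c_statistic(scores, died):
    """Area under the ROC curve via pairwise comparison.

    scores: risk scores (for example, CCS); died: 1 for death, 0 for survival.
    """
    deaths = [s for s, d in zip(scores, died) if d == 1]
    survivors = [s for s, d in zip(scores, died) if d == 0]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")
    concordant = 0.0
    for d_score in deaths:
        for s_score in survivors:
            if d_score > s_score:
                concordant += 1.0   # death ranked above survivor
            elif d_score == s_score:
                concordant += 0.5   # ties count as half
    return concordant / (len(deaths) * len(survivors))


# Hypothetical toy data, not study data.
toy_scores = [4.0, 6.5, 8.0, 11.0, 15.5, 18.0]
toy_died = [0, 0, 0, 1, 0, 1]
print(c_statistic(toy_scores, toy_died))  # 0.875
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect separation of deaths from survivors.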
We used multiple linear regression with backward stepwise elimination to evaluate determinants of LOS, with age, weight, BCS, CCS, RACHS, and institution as the initial variables. Mortalities were excluded from the LOS analysis. The sample size was too small, for most operations, to compare operation-specific observed and expected outcomes. We examined several models, including linear, quadratic, log-linear, and log-log (power function). We used the Bland-Altman method to test several models for accuracy and bias by plotting the difference between the observed and predicted dependent variable against the observed value, then computed the mean difference and 95% confidence limits, ie, "limits of agreement" [9]. We used SPSS 14.0 (SPSS, Inc, Chicago, IL) for statistical analysis.
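The Bland-Altman quantities described above can be sketched as follows: take the difference between each observed and predicted value, then report the mean difference (bias) and the mean plus or minus 1.96 standard deviations as the 95% limits of agreement. The function name `limits_of_agreement` and the sample values are hypothetical, and the use of the sample standard deviation is an assumption; the plot itself is omitted.

```python
from statistics import mean, stdev


def limits_of_agreement(observed, predicted):
    """Return (bias, lower limit, upper limit) for observed minus predicted."""
    diffs = [o - p for o, p in zip(observed, predicted)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample standard deviation (assumption)
    return bias, bias - spread, bias + spread


# Hypothetical log10(LOS) values, not study data.
obs = [0.60, 0.85, 0.70, 1.30, 0.95]
pred = [0.70, 0.80, 0.75, 1.00, 0.90]
bias, lower, upper = limits_of_agreement(obs, pred)
print(round(bias, 3), round(lower, 3), round(upper, 3))
```

Wide limits, or differences that trend with the observed value, indicate that the model predicts individual values poorly even when the overall fit looks acceptable.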
Results
A total of 1,103 patients met the criteria for mortality analysis, and 1,070 patients met the criteria for LOS analysis. Mean values for risk factors are shown in Table 1. There was no significant difference in any of these risk factors between the institutions. The distribution of cases by RACHS score is shown in Table 2. Although we treated CCS as a continuous variable, for purposes of display we defined "CCS category" (1 through 5) according to the following intervals: 1 (0.1–5.9), 2 (6.0–9.9), 3 (10.0–13.9), 4 (14.0–17.9), 5 (>17.9). The distribution of cases by CCS category is shown in Figure 1. The distributions were similar between the institutions, with institution 1 having a somewhat greater fraction of RACHS 3 cases relative to RACHS 2 cases than institution 2. The Pearson correlations between CCS and RACHS, CCS and age, and RACHS and age were 0.73, –0.34, and –0.32, respectively.
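The display categories defined above can be expressed as a small lookup. This is a sketch only: the function name `ccs_category` is hypothetical, and it assumes CCS values are recorded to one decimal place, so that no score falls between the published interval boundaries (for example, between 5.9 and 6.0).

```python
def ccs_category(ccs):
    """Map a CCS value to the display category (1-5) defined in the text."""
    if ccs < 0.1:
        raise ValueError("CCS below the lowest defined interval")
    # Upper bounds of categories 1 through 4, as listed in the text.
    for category, upper in enumerate((5.9, 9.9, 13.9, 17.9), start=1):
        if ccs <= upper:
            return category
    return 5  # CCS > 17.9


print([ccs_category(x) for x in (3.0, 6.0, 13.9, 14.0, 20.5)])  # [1, 2, 3, 4, 5]
```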
Table 2. Distribution of Raw Mortality and Risk Adjustment in Congenital Heart Surgery Scores for Each Institution
Fig 1. Distribution of cases by Aristotle comprehensive complexity score (CCS) category for each institution (Inst). Aristotle comprehensive complexity score categories are defined in the text and in Table 5.
The raw mortality at both institutions was 2.9%. The distribution of mortality by RACHS score is shown in Table 2. The final logistic regression model for each institution and for the combined institutions contained only CCS as a significant independent variable (Hosmer-Lemeshow χ² = 10.2; p = 0.25). Inclusion of a quadratic term (CCS)² did not significantly improve the model coefficients or goodness of fit. The odds ratio for death (for an increment of 1.0 in CCS) was 1.25 (p = 0.0005) for institution 1 and 1.26 (p = 0.003) for institution 2, but the difference in the unstandardized β coefficients between the institutions was not significant (institution 1: 0.22 ± 0.05; institution 2: 0.26 ± 0.09). To examine institutional differences in mortality further, we calculated the regression equation for mortality including "institution" as a variable. This equation was y = –6.44 + 0.49 × CCS + 0.18 × Institution, where y is the log odds of mortality (the logit of the predicted probability) and Institution is a categorical variable for institution (1 or 2). We then generated a receiver operating characteristic curve for the regression of y against CCS. We compared its C statistic with that of the regression of mortality on CCS alone. The C statistic was 0.811 ± 0.028 for the former and 0.810 ± 0.029 for the latter, a difference that is not statistically significant. In further support of this finding, we stratified cases by RACHS and compared mortality for each RACHS category between the institutions, and found no significant differences.
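In logistic regression, a coefficient β and the odds ratio for a one-unit increase in the predictor are related by OR = exp(β). The sketch below applies that identity to the institution 1 coefficient quoted above; the function name `odds_ratio` and its `increment` parameter are illustrative assumptions, not part of the study's analysis.

```python
import math


def odds_ratio(beta, increment=1.0):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(beta * increment)


# Institution 1 coefficient reported in the text: 0.22 per CCS point.
print(round(odds_ratio(0.22), 2))  # 1.25
# Odds multiplier for a hypothetical 5-point rise in CCS.
print(round(odds_ratio(0.22, 5.0), 2))
```

Because the relationship is multiplicative, a modest per-point odds ratio compounds quickly across the range of CCS values.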
The receiver operating characteristic curves for institutions 1 and 2 are shown in Figure 2, and the C statistics for the discriminative ability of RACHS, BCS, and CCS are shown in Table 3. The discriminative ability of CCS, based on the C statistic, was significantly better than that of BCS but not that of RACHS. Using the likelihood ratio (LR) statistic (LR = change in –2 x log likelihood), RACHS was a better predictor of mortality than BCS (RACHS LR = 10.5; p = 0.001; BCS LR = 1.2; p = 0.26), and CCS was a better predictor of mortality than was RACHS (CCS LR = 15.4; p = 0.0005; RACHS LR = 0.0; p = 0.998).
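As a rough check on how a likelihood ratio statistic maps to a p value, the sketch below evaluates the upper tail of a chi-square distribution. It assumes 1 degree of freedom (one variable removed per step), which the text does not state explicitly; the function name `lr_p_value_1df` is hypothetical.

```python
import math


def lr_p_value_1df(lr):
    """Upper-tail chi-square probability for 1 degree of freedom.

    For 1 df, P(X > lr) equals erfc(sqrt(lr / 2)).
    """
    if lr < 0:
        raise ValueError("LR statistic must be non-negative")
    return math.erfc(math.sqrt(lr / 2.0))


# RACHS versus BCS comparison reported in the text: LR = 10.5.
print(round(lr_p_value_1df(10.5), 3))  # 0.001
```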

View larger version (15K):
[in this window]
[in a new window]
|
Fig 2. Receiver operating characteristic curves for the logistic regression of mortality, for institution 1 (A) and institution 2 (B). The corresponding C statistics are displayed in Table 3. (BCS = Aristotle basic complexity score; CCS = Aristotle comprehensive complexity score; RACHS = Risk Adjustment in Congenital Heart Surgery.)
|
|
Raw LOS data for each institution are shown in Figure 3. The median and range of LOS for the two institutions, stratified by RACHS score and by CCS category, are shown in Tables 4 and 5, respectively. The LOS values for RACHS 6 and CCS category 5 cases were significantly greater at institution 2 than at institution 1 (p = 0.007 and 0.026, respectively). The final multiple regression model (for both institutions combined) for LOS retained age, RACHS, and CCS as risk factors. The model is shown in Table 6. The adjusted R² for the model was 0.44, a moderate correlation. When we ran the regression for each institution separately, the same variables appeared in the final models, with adjusted R² = 0.39 for institution 1 and 0.55 for institution 2. Only the RACHS β coefficient differed significantly between the two institutions (2 > 1), owing mainly to differences in the LOS of RACHS 6 cases. For the combined data, backward elimination of the "Institution" variable resulted in a change in adjusted R² from 0.435 to 0.436, an insignificant change (p = 0.431). For graphical purposes, the regression lines of log(LOS) on CCS alone for institutions 1 and 2 are shown in Figure 4. Models that included other functional relationships, such as quadratic terms, inverse terms, or a power law, did not yield significantly different results. "Limits of agreement" by Bland-Altman analysis were ±0.61 for the model shown in Table 6. The corresponding antilogs are 0.25 to 4.0.
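The antilogs quoted above can be reproduced with a short calculation. This sketch assumes the limits of agreement are expressed in log10(LOS) units, consistent with the log10 scale named in the Figure 4 caption; on that assumption, the antilogs are multiplicative factors relating observed LOS to predicted LOS.

```python
# Bland-Altman limit of agreement from the text, assumed to be in log10 units.
limit = 0.61

low_ratio = 10 ** (-limit)   # observed LOS as a fraction of predicted LOS
high_ratio = 10 ** limit     # observed LOS as a multiple of predicted LOS

print(round(low_ratio, 2), round(high_ratio, 2))
```

Read this way, an individual patient's observed LOS may plausibly be anywhere from about one quarter of the predicted value to about four times it, which is why the model is of limited use for individual prediction.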

View larger version (16K):
[in this window]
[in a new window]
|
Fig 3. Raw data for hospital length of stay (Hosp LOS; in days) versus Aristotle comprehensive complexity score (CCS) score for institution 1 (A) and institution 2 (B).
|
|

View larger version (22K):
[in this window]
[in a new window]
|
Fig 4. Regression lines and lines of 95% confidence limits for the regression of log10 of hospital length of stay (HLOS) versus Aristotle comprehensive complexity score (CCS) for institution 1 (A) and institution 2 (B).
|
|
Comment
All of the aforementioned studies have used or compared the RACHS and BCS scores in analyzing mortality and LOS. Although two additional studies used CCS to examine outcomes for the Norwood operation, the CCS score has not been applied to analyzing overall results in congenital heart surgery [10, 11].
This is the first study to use the CCS score to risk-adjust overall mortality and LOS in congenital cardiac surgery and to attempt a preliminary comparison of the performance of two institutions using all three systems (RACHS, BCS, CCS). In examining the combined institutions' data and comparing the predictive values of these systems, we found that CCS was more predictive of mortality than both BCS and RACHS. The C statistic for mortality versus CCS was 0.81—higher than the values in the range of 0.63 to 0.74 found for RACHS and BCS in our study as well as those of O'Brien and colleagues [6] and Al-Radi and associates [7]. As the latter authors pointed out, both RACHS and BCS only partially account for patient age at operation, which is a known independent risk factor for mortality. The fact that CCS does account for patient age, as well as other comorbidities, probably accounts for its increased predictive power.
On the other hand, we determined using linear regression that age, RACHS, and CCS were significant predictors of LOS. It makes sense that no single variable was dominantly predictive of LOS. Length of stay is determined by a host of preoperative as well as postoperative risk factors. The probability of a postoperative complication may or may not be significantly correlated with one of the preoperatively determined scores. Despite the best attempt to account for all preoperative factors (for example, in the CCS score), the occurrence of any of a number of significant postoperative complications will importantly affect the resulting LOS, essentially adding intrinsic randomness (variability) to the data. In a study of infants younger than 6 months of age, Gillespie and coworkers [12] found several intraoperative and postoperative variables associated with LOS, including CPB time, ventilatory support time, need for catheterization, necrotizing enterocolitis, and nasogastric feeding. All of these factors may be surrogates for intraoperative and postoperative complications not necessarily reflective of the preoperative complexity scores. This conundrum is manifested, in part, by the only moderate model correlation coefficients in the range of R² = 0.38 to 0.55 (0.44 for the combined data). Thus, only 44% of the variance in observed LOS can be accounted for by factors represented by the model containing RACHS, CCS, and age as risk factors. The problem with using a preoperatively determined scoring system to predict LOS is easily seen by examination of the raw data (Fig 3). These data are characterized by a cluster about a fairly flat curve out to about CCS = 15, but with a significant number of much larger values of LOS for CCS as low as 7.0. These larger values cannot be treated as outliers, as there are too many of them. The data are suggestive of an "ideal" relationship between LOS and CCS along the line of clustered data, but with other factors (not sufficiently accounted for by the CCS score) contributing to the significant number of nonclustered data points. This is why no simple "tight" relationship between LOS and CCS could be found. The problem is further revealed by the Bland-Altman analysis. For every model we examined, the Bland-Altman curves showed a nonrandom distribution of residuals and large limits of agreement between the observed and predicted LOS. This analysis demonstrates that CCS alone and the model of Table 6 are poor predictors of LOS for an individual patient. Such a conclusion may be relatively independent of sample size, and more sensitive to the nature of the model used.
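The reading of R² as a share of explained variability follows from its definition, R² = 1 − SS_residual / SS_total. The sketch below computes the unadjusted value and the usual adjustment for the number of predictors; the function names and toy numbers are hypothetical, and the study itself reports adjusted values from SPSS.

```python
def r_squared(observed, predicted):
    """Fraction of the variance in `observed` accounted for by `predicted`."""
    mean_obs = sum(observed) / len(observed)
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    ss_resid = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return 1.0 - ss_resid / ss_total


def adjusted_r_squared(r2, n_cases, n_predictors):
    """Standard adjustment of R² for the number of predictors in the model."""
    return 1.0 - (1.0 - r2) * (n_cases - 1) / (n_cases - n_predictors - 1)


# Hypothetical toy values, not study data.
toy_obs = [1.0, 2.0, 3.0, 4.0]
toy_pred = [1.5, 1.5, 3.5, 3.5]
print(r_squared(toy_obs, toy_pred))
```

With roughly a thousand cases and three predictors, as in the LOS analysis, the adjusted and unadjusted values differ only slightly.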
In comparing the two institutions using the methods described, we found that they were remarkably similar in performance. This finding is possibly a result of their cooperative relationship as a joint program. In particular, we found no difference in risk-adjusted mortality. This conclusion was reached with a model using CCS as the principal risk factor. Our analysis suggests that this method of interinstitutional comparison is superior to that performed using RACHS, BCS, or no risk stratification at all. Nonetheless, when differences are demonstrated and one wants to know where those differences arise, comparison of the coefficients of regression on CCS may not be sufficient, and a full multiple-variable logistic regression or some sort of group stratification scheme may be needed. This may, in turn, require a much larger sample size.
The LOS models differed significantly between the two institutions. In particular, the β coefficient for RACHS was significantly greater for institution 2 than for institution 1, owing mainly to differences between the institutions in the LOS of RACHS 6 patients. Variability in mortality and LOS of these most complex patients can arise in several ways. Differences could be related to differences in preoperative, operative, or early postoperative management, risk factors of the patient population not accounted for by the standard scoring systems, or protocols for later postoperative management, such as feeding protocols and criteria for hospital discharge. In 2005, institution 2, for example, undertook an effort to minimize both hospital and interstage mortality after the Norwood procedure. Having achieved a hospital mortality of less than 3% for the Norwood operation, it sought to minimize interstage mortality by instituting an interim in-hospital assessment of and aggressive intervention on factors such as aortic arch status, aortopulmonary shunt status, and feeding and nutritional status. Patients with significant tricuspid regurgitation were kept in the hospital until their second-stage operation. This protocol resulted in an interstage mortality of only 2.5% but at the expense of longer duration of hospital stay and may account for some of the findings of this study.
One limitation of this study is the relatively small sample size. This limitation reduced the goodness of fit of our regression models and limited the number of independent variables we could simultaneously examine. In comparing two institutions, one is bounded by the need to examine a contemporary population, because our approaches to congenital cardiac management change so rapidly. One is also limited by the volume of each program. These two factors limit sample size. Indeed, Al-Radi and colleagues [7] found better predictability when they incorporated year of operation (era) into their regression models, a consequence of their data being spread over 22 years. Some improvement in statistical power can be achieved by comparing individual institutions with a larger population such as that of The Society of Thoracic Surgeons congenital database. This study awaits the inclusion of CCS scores in that database, as well as verification that institutions determine LOS consistently.
A second limitation of our study is that despite our examination of several nonlinear models, we did not attempt more complicated modeling schemes, such as spline-fitting and/or proportional hazards techniques in place of linear regression. The raw data (Fig 3) do suggest that such techniques might be better suited to the present analysis. Although such techniques might improve the predictive power of a model, however, they may or may not shed light on the underlying clinical interpretation of the model. We also did not determine whether the differences among CCS, RACHS, and BCS were of clinical importance using methods suggested, for example, by Al-Radi and coworkers [7]. We will perform this analysis as we accumulate more data.
In conclusion, we have used regression methods to evaluate the power of RACHS, BCS, and CCS to predict mortality and LOS. We find that CCS is better than RACHS and BCS in predicting mortality. We find a moderate correlation of LOS with a model that includes CCS, RACHS, and age together (R² = 0.44), although the model is a poor predictor of LOS for individual patients. Regression methods can be used to elucidate differences in performance among institutions. In this two-institution analysis, we found no significant difference in mortality between our institutions. Institution 2 had greater LOS for RACHS 6 cases compared with institution 1, which may, in part, be related to an evolving protocol at institution 2 to minimize interstage risk.
Acknowledgments
This research was supported in part by the OrlandoHealth Foundation, Orlando, FL. We thank Patty Sieffert, ARNP, and Michael O'Brien, PA, for assisting with the data acquisition.
References
1. Jenkins KJ, Gauvreau K, Newburger JW, et al. Consensus-based method for risk adjustment for surgery for congenital heart disease. J Thorac Cardiovasc Surg 2002;123:110-118.
2. Lacour-Gayet F, Clarke D, Jacobs J, et al. The Aristotle score: a complexity-adjusted method to evaluate surgical results. Eur J Cardiothorac Surg 2004;25:911-924.
3. Jenkins KJ, Gauvreau K. Center-specific differences in mortality: preliminary analyses using the Risk Adjustment in Congenital Heart Surgery (RACHS-1) method. J Thorac Cardiovasc Surg 2002;124:97-104.
4. Lacour-Gayet F, Jacobs JP, Clarke D, et al. Evaluation of the quality of care in congenital heart surgery: contribution of the Aristotle complexity score. Adv Pediatr 2007;54:67-83.
5. Boethig D, Jenkins KJ, Hecker H, et al. The RACHS-1 risk categories reflect mortality and length of hospital stay in a large German pediatric cardiac surgery population. Eur J Cardiothorac Surg 2004;26:12-17.
6. O'Brien SM, Jacobs JP, Clarke DR, et al. Accuracy of the Aristotle basic complexity score for classifying the mortality and morbidity potential of congenital heart surgery operations. Ann Thorac Surg 2007;84:2027-2037.
7. Al-Radi OO, Harrell FE, Caldarone CA, et al. Case complexity scores in congenital heart surgery: a comparative study of the Aristotle Basic complexity score and the Risk Adjustment in Congenital Heart Surgery (RACHS-1) system. J Thorac Cardiovasc Surg 2007;133:865-874.
8. Kang N, Tsang VT, Elliott MJ, et al. Does the Aristotle score predict outcome in congenital heart surgery? Eur J Cardiothorac Surg 2006;29:986-988.
9. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986;1:307-310.
10. Sinzobahamvya N, Photiadis J, Kumpikaite D, et al. Comprehensive Aristotle score: implications for the Norwood procedure. Ann Thorac Surg 2006;81:1794-1801.
11. Li J, Zhang G, Holtby H, et al. Significant correlation of comprehensive Aristotle score with total cardiac output during the early postoperative period after the Norwood procedure. J Thorac Cardiovasc Surg 2008;136:123-128.
12. Gillespie M, Kuijpers M, Van Rossem M, et al. Determinants of intensive care unit length of stay for infants undergoing cardiac surgery. Congenital Heart Dis 2006;1:152-160.