Ann Thorac Surg 2005;79:1555-1562
© 2005 The Society of Thoracic Surgeons
a Department of Cardiothoracic Surgery, University of Cologne, Cologne, Germany
b Institute of Medical Statistics, Informatics and Epidemiology, University of Cologne, Cologne, Germany
c Department of Cardiology, University of Cologne, Cologne, Germany
d Department of Anesthesiology, University of Cologne, Cologne, Germany
Accepted for publication October 14, 2004.
* Address reprint requests to Dr Hekmat, Department of Cardiothoracic Surgery, University of Cologne, Kerpener Str 62, 50924 Cologne, Germany; (E-mail: khosro.hekmat{at}uk-koeln.de).
| Abstract |
|---|
METHODS: This prospective study consisted of all consecutive adult patients admitted after cardiac surgery to our ICU over a period of 3 years. Evaluation of variables was performed using the first year patients who stayed in the ICU for at least 24 hours. The reproducibility was then tested in two validation sets using all patients. Performance was assessed with the Hosmer-Lemeshow (χ² statistic) goodness-of-fit test and receiver operating characteristic (ROC) curves and compared with the Acute Physiology and Chronic Health Evaluation (APACHE II) and Multiple Organ Dysfunction Score (MODS).
RESULTS: A total of 3,230 patients were admitted to the ICU after cardiac surgery. Mean χ² values for the new score were 5.8 (APACHE II, 11.3; MODS, 9.7) for the construction set, 7.2 (APACHE II, 8.0; MODS, 4.5) for the validation set I, and 5.9 for the validation set II. The mean area under the ROC curve was 0.91 (APACHE II, 0.86; MODS, 0.84) for the new score in the construction set, 0.88 (APACHE II, 0.84; MODS, 0.84) in the validation set I, and 0.92 in the validation set II.
CONCLUSIONS: Our new 10-variable risk index performs very well, with calibration and discrimination very high, better than general severity systems; and it is an appropriate tool for daily risk stratification in ICU cardiac surgery patients. Thus, it may serve as an "expert system" for diagnosing organ failure, decision making, resource evaluation, and predicting mortality among ICU cardiac surgical patients.
| Introduction |
|---|
Most of the general scoring systems require extensive descriptor data collection, which limits their use in routine postoperative cardiac surgical patients [4, 5]. Predictive accuracy, specificity, and simplicity are the factors that govern the daily use of any scoring system in the cardiac surgical ICU.
The aim of this study was to develop a new specific and simple postoperative score to produce daily prognostic estimates for critically ill adult ICU cardiac surgical patients. These daily risk estimates could be useful for measuring the benefit of therapy during the ICU stay, and for evaluating new therapies. Just as repeated tests of white blood count and measurements of temperature assess progress or the lack of benefit in treating infection, repeated risk score measures may have the potential of predicting and defining the benefit of ongoing ICU treatment [6].
| Patients and Methods |
|---|
We studied all consecutive adult patients (18 years or older) who were admitted to our ICU after cardiac surgery with cardiopulmonary bypass over a period of 3 years. Variables were evaluated using the first-year patients who stayed for at least 24 hours. In this construction set (April 1999 to May 2000), we evaluated daily 109 candidate variables known to be important after cardiac surgery, including all variables required for the Acute Physiology and Chronic Health Evaluation (APACHE II), the Therapeutic Intervention Scoring System (TISS-76), and the Multiple Organ Dysfunction Score (MODS) scoring systems (see Appendix).
Reproducibility was then tested in two validation sets using the data of all adult patients. In validation set I (May 2000 to May 2001), we evaluated daily 57 variables, including all variables of the APACHE II and MODS scores.
While all physiologic, demographic, and therapeutic data for the construction set and the validation set I were entered into a Microsoft Access database, we used our Intensive Care Information System (Eclipsys; Sunrise Critical Care, Boca Raton, Florida) for the validation set II (February 2002 to February 2003). The intensive care information system is interfaced with the patient monitor (Hewlett-Packard, Palo Alto, CA), different types of ventilators (Evita II and Evita IV; Draeger, Luebeck, Germany; and Servo 300 and 900; Siemens-Elema, Solna, Sweden), blood gas analysis in the ICU (Radiometer, Copenhagen, Denmark), and the central blood test laboratory. The nursing staff has to trigger at least one value for each variable per hour. Obvious laboratory errors or abnormal results associated with technical problems or clinical interventions such as tracheal suctioning or sedation were omitted by the nurse. In the validation set II, we evaluated daily only the 10 variables of our new scoring system (Table 1).
Continuous data significantly associated with 30-day mortality on ICU day 1 to ICU day 6 based on a two-tailed Student's t test were included in a multivariate logistic regression analysis. The same procedure was applied for all candidate categorical variables based on Fisher's exact test.
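As an illustration of the univariate screening step for categorical variables, the following minimal sketch computes a two-sided Fisher's exact p value for a 2x2 table using only the Python standard library. The function name `fisher_exact_two_sided` and the example counts are hypothetical and are not taken from the study; the two-sided rule used here (summing all tables no more probable than the observed one) is one common convention and may differ from the software the authors used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p value for the 2x2 table [[a, b], [c, d]].

    Rows could be, for example, "device used" / "device not used" and
    columns "died within 30 days" / "survived" (illustrative labels only).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is no more probable than the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    p_value = fisher_exact_two_sided(12, 38, 20, 430)
    print(f"two-sided p = {p_value:.4g}")
```

A variable would pass this screen when its p value falls below the chosen significance threshold, after which it becomes a candidate for the multivariate model.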
The performance of the different scoring systems was assessed by evaluation of calibration and discrimination. Calibration compares the observed mortality with that predicted by the model within severity strata. The most accepted method for measuring calibration is the goodness-of-fit statistic [7], which uses a χ²-like statistic (Hosmer-Lemeshow χ² statistic). Small χ² values and high corresponding p values indicate a good calibration.
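The calibration measure described above can be sketched as follows. This is a minimal illustration, assuming that each patient already has a predicted probability of death (the text does not specify how score values were mapped to predicted mortality) and that severity strata are formed as equal-sized groups of patients ordered by predicted risk. The function name `hosmer_lemeshow` and the default of 10 groups are assumptions.

```python
def hosmer_lemeshow(predicted, died, groups=10):
    """Return the Hosmer-Lemeshow chi-square statistic.

    predicted: predicted probabilities of death (0..1), one per patient
    died:      observed outcomes, 1 = died, 0 = survived
    groups:    number of severity strata (assumed equal-sized risk groups)
    """
    pairs = sorted(zip(predicted, died))
    n = len(pairs)
    chi2 = 0.0
    for g in range(groups):
        # Slice the risk-ordered patients into consecutive strata.
        stratum = pairs[g * n // groups:(g + 1) * n // groups]
        if not stratum:
            continue
        size = len(stratum)
        expected = sum(p for p, _ in stratum)   # expected deaths in stratum
        observed = sum(y for _, y in stratum)   # observed deaths in stratum
        mean_risk = expected / size
        variance = size * mean_risk * (1.0 - mean_risk)
        if variance > 0:
            chi2 += (observed - expected) ** 2 / variance
    return chi2
```

The statistic is then compared with a χ² distribution (commonly with the number of groups minus 2 degrees of freedom) to obtain the p value; a small statistic means observed and expected deaths agree closely within each stratum.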
Discrimination, or the ability of a scoring system to distinguish between a patient who will live and one who will die, was measured by the area under the receiver operating characteristic (ROC) curve according to Hanley [8]. The ROC curve shows the relation between the true-positive rate (sensitivity) and the false-positive rate (100% − specificity). An area under the curve of 1.0 implies perfect discrimination, whereas an area of 0.5 indicates results that are not better than chance. The overall predictive ability of the new score index, the Cardiac Surgery Score (CASUS), was assessed by calculating the area under the ROC curve daily, and compared with the APACHE II, TISS-76, and MODS scoring systems.
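To make the meaning of the area under the ROC curve concrete, the sketch below uses the equivalent pairwise (Mann-Whitney) formulation: the area equals the probability that a randomly chosen non-survivor has a higher score than a randomly chosen survivor, with ties counted as one half. The function name `roc_auc` is hypothetical, and this is not necessarily the computation the authors performed.

```python
def roc_auc(scores, died):
    """Area under the ROC curve for a risk score.

    scores: score value per patient (higher = sicker)
    died:   observed outcomes, 1 = died, 0 = survived
    """
    non_survivors = [s for s, y in zip(scores, died) if y == 1]
    survivors = [s for s, y in zip(scores, died) if y == 0]
    if not non_survivors or not survivors:
        raise ValueError("need at least one death and one survivor")
    wins = 0.0
    for s_dead in non_survivors:
        for s_alive in survivors:
            if s_dead > s_alive:
                wins += 1.0      # correctly ranked pair
            elif s_dead == s_alive:
                wins += 0.5      # tie counts as half
    return wins / (len(non_survivors) * len(survivors))
```

Applied to the scores of one ICU day at a time, this yields the daily areas that are compared between scoring systems. The quadratic pairwise loop is chosen for clarity; a rank-based version would be preferable for large data sets.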
| Results |
|---|
In summary, a total of 2,545 patients aged 65.4 ± 10.4 years (range, 18 to 89) were scored daily. There were 1,793 men (70.5%) and 752 women (29.5%).
The operations performed were 1,580 (62.1%) isolated CABG, 421 (16.5%) isolated valve surgery, 292 (11.5%) combined CABG with valve surgery, 113 (4.4%) surgery of the thoracic aorta, 46 (1.8%) cardiac transplantation and 93 other procedures including congenital heart disease in adults, pulmonary thromboembolism, cardiac neoplasm, penetrating trauma of the heart, pericardial diseases, and pacemaker lead extraction.
There were no missing data in any of the three sets. Tables 2, 3 and 4 summarize the statistical results of the construction set and validation sets I and II. Low values of the Hosmer-Lemeshow χ² statistic indicate that there is no statistical evidence against the assumed fit of the scoring system. The MODS and the CASUS scores showed the best calibration. The APACHE II and the TISS-76 did not calibrate as well. In addition, the APACHE II showed a significant difference between the observed and expected numbers of deaths on postoperative days 1 and 2 of the construction set. The same was true for the MODS on postoperative day 3 (construction set).
The results of the ROC curve indicate a very high discrimination of the CASUS, followed by the TISS-76, the APACHE II, and the MODS scores.
In summary, the new scoring system CASUS performed better in terms of overall correct classification values and discrimination when compared with the APACHE II, MODS, and TISS-76 scores. Calibration was equal for the MODS and the CASUS scoring systems. The TISS-76 was the most time-consuming scoring system in this study.
| Comment |
|---|
Postoperative cardiac surgical patients are unique in several ways. First, owing to cardiopulmonary bypass, these patients may have pathophysiologic changes that have no impact on outcome [11]. Serum potassium, sodium, and glucose are variables of the APACHE II scoring system that revert to normal spontaneously or are corrected readily by the ICU staff. Second, many physiologic changes may be masked by multiple system support devices, such as intraaortic balloon pumps, ventricular assist devices, hemofiltration, and mechanical ventilation [4]. Third, most cardiac surgery patients with cardiopulmonary bypass are weaned from ventilator support in the ICU several hours after surgery [12, 13]. The Glasgow Coma Scale, a variable of APACHE II and MODS, is affected by therapy in the form of sedation, anesthesia, and paralysis. Additionally, its calculation requires clinical evaluation, which may be biased by subjective expectation [2, 4, 9].
In an attempt to simplify daily ICU scoring in cardiac surgery, we reviewed the literature to identify descriptors of mortality and multiorgan dysfunction in postoperative cardiac surgical patients [3, 4, 11–23]. An ideal score consists of a minimum of ideal descriptors. The newly developed CASUS is a compact score index and consists of only 10 readily available descriptors (Table 1). Three descriptors (PO2/FiO2, lactate, pressure-adjusted heart rate) are volatile and may change significantly from one hour to the next. As in the APACHE II, we chose the worst measure per day for all variables.
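The "worst measure per day" rule for the three volatile descriptors can be sketched as below. The direction of "worst" for each variable (lowest PO2/FiO2, highest lactate, highest pressure-adjusted heart rate) is an assumption made for illustration, as are the variable keys and the function name `worst_daily_values`; the actual point thresholds are defined in Table 1 and are not reproduced here.

```python
# Assumed direction of "worst" for each volatile descriptor (illustrative):
# a low PO2/FiO2 ratio is worse; high lactate and a high
# pressure-adjusted heart rate are worse.
WORST_IS = {"po2_fio2": min, "lactate": max, "par": max}


def worst_daily_values(readings):
    """Reduce hourly readings to the worst value per ICU day and variable.

    readings: iterable of (icu_day, variable, value) tuples
    returns:  dict mapping (icu_day, variable) -> worst value of that day
    """
    grouped = {}
    for day, variable, value in readings:
        grouped.setdefault((day, variable), []).append(value)
    return {
        key: WORST_IS[key[1]](values) for key, values in grouped.items()
    }


if __name__ == "__main__":
    # Hypothetical validated hourly readings for one patient.
    hourly = [
        (1, "lactate", 2.1),
        (1, "lactate", 4.8),
        (1, "po2_fio2", 310.0),
        (1, "po2_fio2", 180.0),
        (2, "lactate", 1.9),
    ]
    print(worst_daily_values(hourly))
```

Only values that have already been validated by the staff should enter such a reduction, since a single artifact would otherwise become the day's "worst" value.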
Descriptors of the CASUS
Widely used variables of individual organ system function that met our requirements for construct validity, reproducibility and responsiveness [2, 24, 25] were identified and included in the score. Univariate analysis revealed a significant association with 30-day mortality (ICU day 1 through ICU day 6) for all variables of the CASUS for all three data sets.
Descriptors of the cardiovascular system are the pressure-adjusted heart rate (adopted from Marshall and coworkers [2]), lactate, intraaortic balloon pump, and ventricular assist device. The modified MODS [24] replaced the pressure-adjusted heart rate (PAR) with a combination of heart rate, inotropic agents, and lactate values greater than 5 mmol/L. In our study, the mean heart rate was higher in patients who survived (3 days each in construction set and validation set I), indicating that the heart rate is not a good predictor in cardiac surgery patients. Inotropic agents (epinephrine, norepinephrine, dopamine, and dobutamine) were highly significant for mortality in the validation set I, and are a variable of the Sequential Organ Failure Assessment (SOFA) scoring system [26]. We chose not to include the inotropic agents, because the combination regimens of inotropic agents differ between ICUs. Moreover, the dosages of inotropic agents may depend on volume replacement.
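For readers unfamiliar with the pressure-adjusted heart rate, the sketch below computes it as it is commonly defined in the MODS of Marshall and coworkers: heart rate multiplied by the ratio of central venous (right atrial) pressure to mean arterial pressure. The formula is not stated in this article and is given here as an assumption based on that definition; the function and parameter names are hypothetical.

```python
def pressure_adjusted_heart_rate(heart_rate, cvp, mean_arterial_pressure):
    """Pressure-adjusted heart rate (PAR), assuming the MODS definition.

    heart_rate:             beats per minute
    cvp:                    central venous (right atrial) pressure, mmHg
    mean_arterial_pressure: mean arterial pressure, mmHg (same unit as cvp)
    """
    if mean_arterial_pressure <= 0:
        raise ValueError("mean arterial pressure must be positive")
    return heart_rate * cvp / mean_arterial_pressure


if __name__ == "__main__":
    # Hypothetical values: 100 beats/min, CVP 10 mmHg, MAP 80 mmHg.
    print(pressure_adjusted_heart_rate(100, 10, 80))
```

Because the heart rate is scaled by the pressure ratio, a tachycardic patient with a high filling pressure and a low arterial pressure receives a much higher value than a patient with the same heart rate and normal pressures.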
Cardiac index measures were available in fewer than 30% of all patients in the construction set and in the validation set I, and the cardiac index is therefore not an ideal variable. Mixed venous saturation showed no significant association with 30-day mortality in either the construction set or the validation set I.
Lactate proved to be one of the best predictors in postoperative cardiac patients. Most of the scoring systems were constructed at a time when lactate was not readily available. Today lactate is available in every ICU with its own blood gas analysis. Davies and associates [21] were able to demonstrate that lactate was the best predictor for intraaortic balloon pumping failure after cardiac surgery. Many physiologic changes of the cardiovascular system may be masked by intraaortic balloon pumps and ventricular assist devices, which are commonly used in cardiac surgical patients [4]. Including therapeutic interventions in a score has the potential disadvantage that the organ dysfunction will be described differently between centers because of different practice patterns [2]. Higgins and colleagues [27] reported an odds ratio of 7.11 in their morbidity model and an odds ratio of 4.46 in their mortality model for postoperative intraaortic balloon pump usage. Ventricular assist devices may have an even greater impact on outcome and resource allocation than intraaortic balloon pump insertion.
The respiratory system is adequately represented by the arterial PO2/FiO2 ratio. Both PO2 and FiO2 as single variables were less reliable in the construction set and in the validation set I. As in the MODS, the hepatic system is best represented by serum bilirubin, and the hematologic system by the platelet count.
The renal system is reflected by serum creatinine and any kind of renal replacement therapy. Continuous venovenous hemofiltration or dialysis is often used in postoperative cardiac patients and may mask a dysfunction of the renal system.
Many scoring systems have included the Glasgow Coma Scale (GCS) [1, 2, 9, 26] to describe changes of the central nervous system. Although it was significant, we did not include the GCS in our new scoring system, because its calculation requires several minutes per day and patient. This limits the practicability of a scoring system and the compliance of the staff. Therefore, we constructed a new variable called "neurologic state" (see Table 1), which is readily available and as significant for mortality as the GCS. Diffuse neuropathy includes signs and symptoms of stroke or cerebral hemorrhage.
To simplify this scoring system, we chose not to use a higher acuity for patients on intraaortic balloon pump, ventricular assist device, or renal replacement therapy. Higher acuity may result in even better calibration and discrimination for our patients in this study, but changes over time and different practice patterns in individual institutions may lead to less conformity for the CASUS scoring system.
Age was found to be a significant risk factor for morbidity and mortality in several studies dealing with preoperative variables in cardiac surgical patients such as the Cleveland score [22] and the European System for Cardiac Operative Risk Evaluation (EuroSCORE) [28]. It is also a variable of the APACHE II [1] and the ICU admission scores [27]. Although the risk of open heart surgery is known to be increased above the age of 65 [22], we were not able to find a significant association with 30-day mortality. We believe the reason for this was our policy in patient selection.
Semiautomated Scoring of the CASUS
As described in Patients and Methods, we used our Intensive Care Information System for the validation set II. Since this electronic flow chart records an enormous amount of physiologic and therapeutic data, it was only logical to determine the CASUS semiautomatically. It is of great importance that all data are validated by the ICU staff. Spurious heart rates caused by loose leads or incorrect blood pressure recordings caused by kinked or dampened arterial or venous lines are common examples of incorrectly stored data variables. To overcome these difficulties, every single datum that is stored in our Intensive Care Information System has to be validated by the attending nurse. We integrated a program into this system that automatically acquires the most abnormal physiologic values for severity scoring. The research clerks were advised to validate all data before storing. The calculation of the CASUS takes less than 1 minute a day for 1 patient, a prerequisite for its acceptance in other ICUs. The use of an Intensive Care Information System resulted in a very high mortality prediction (see Table 4) with excellent calibration and discrimination.
Conclusion
The new CASUS correlates in a graded fashion with the ICU mortality rate, both when applied on the first day of ICU admission as a prognostic indicator and when calculated over the ICU stay as an outcome measure. The variables of the CASUS are simple and are measured routinely and reproducibly; when combined with a semiautomated intensive care information system, the score is completed in less than 1 minute per patient per day. Thus, the CASUS may serve as an "expert system" for diagnosing organ failure, decision making, resource evaluation, and predicting mortality in ICU cardiac surgical patients. We recommend its widespread use in combination with a preoperative risk stratification model like the EuroSCORE [28].
| Appendix |
|---|
| Acknowledgments |
|---|
| References |
|---|