Ann Thorac Surg 2000;70:162-168
© 2000 The Society of Thoracic Surgeons
a Division of Cardiovascular Surgery, the Toronto General Hospital and University of Toronto, Toronto, Ontario, Canada
Address reprint requests to Ms Ivanov, Division of Cardiovascular Surgery, Toronto General Hospital, CCRW 4-803, 200 Elizabeth St, Toronto, ON, Canada M5G 2C4
e-mail: joan.ivanov{at}uhn.on.ca
| Abstract |
|---|
Methods. Nine clinicians estimated the predicted probability of OM and ICU stay greater than 48 hours from an abstract of information for each of 100 patients selected from the 1996 to 1997 database of 1,904 patients who underwent isolated CABG. Logistic regression models were used to calculate the predicted probability of OM and ICU stay greater than 48 hours for each patient. The study sample was split into two parts; clinicians were randomly given access to a predictive rule to guide their judgements for one part of the study.
Results. Clinicians' estimates were similar with or without access to the rule, and both parts of the study were therefore pooled. Clinicians significantly overestimated the probability of OM (model 6.3% ± 1%, clinicians 7.6% ± 3%, p = 0.0001) and ICU stay greater than 48 hours (model 25% ± 2%, clinicians 28% ± 1%, p = 0.0012). Clinicians' estimates of OM were not significantly higher than the model's for nonsurvivors (0.8% ± 0.7%, p = 0.2), but were significantly higher for survivors (1.4% ± 0.3%, p = 0.039).
Conclusions. Clinicians trusted their own empiric estimates rather than a predictive rule and overestimated the probability of OM and ICU stay greater than 48 hours.
| Introduction |
|---|
| Patients and methods |
|---|
A half-page abstract was prepared for each patient. Again, to support clinical judgements, the abstract contained a table of information covering many more variables than the statistical models that were derived and validated by our group [1, 12, 16]. The variables drawn from our clinical cardiac surgery database included: age, gender, left ventricular ejection fraction, New York Heart Association functional class, anginal symptoms (ie, stable, unstable, acute coronary insufficiency), previous CABG, timing of surgery (ie, elective, same-hospitalization, urgent, emergent), left main disease, positive exercise stress test, recent myocardial infarction within the month before surgery, diabetes, peripheral vascular disease (including carotid stenoses), history of hypertension, renal insufficiency, height, and weight. Results of coronary angiography were also shown in detail as a diagram along with information regarding the quality of the vessels (ie, degree of stenosis, vessel size, quality of distal vessels, and collateral flow), when available. A brief narrative was prepared that described each patient's relevant history, presenting scenario, and other comorbidity.
Predictive rule
A predictive rule, developed from the combined CABG database of two Toronto hospitals for the years 1993 to 1996, was used as the format for the contemporary, site-specific guidelines. Risk weights for this rule were recalibrated in the 1993 to 1997 database for the Toronto General Hospital only. Risk weights and cutoff points of the total risk score, which defined relative risk groups (low, medium, high), can be found in Table 1. Details regarding the development of this rule have been published [12]. By presenting a simple and additive scoring system, we hoped to maximize the system's acceptance by clinicians.
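As an illustration only, an additive scoring system of this kind might be sketched as follows. The risk factors, weights, and cutoff points below are hypothetical placeholders chosen for the example; they are not the values published in Table 1.

```python
# Hypothetical weights and cutoffs: placeholders for illustration,
# NOT the published Table 1 values.
HYPOTHETICAL_WEIGHTS = {
    "previous_cabg": 2,
    "emergent_surgery": 3,
    "lvef_under_20": 3,
    "diabetes": 1,
    "renal_insufficiency": 2,
}

# (upper bound of total score, risk group); scores above the last bound are "high".
HYPOTHETICAL_CUTOFFS = [(3, "low"), (6, "medium")]


def total_risk_score(patient):
    """Add up the weights of the risk factors that are present in the patient."""
    return sum(
        weight
        for factor, weight in HYPOTHETICAL_WEIGHTS.items()
        if patient.get(factor, False)
    )


def risk_group(score):
    """Map a total risk score to a relative risk group using the cutoff points."""
    for upper_bound, label in HYPOTHETICAL_CUTOFFS:
        if score <= upper_bound:
            return label
    return "high"


if __name__ == "__main__":
    patient = {"previous_cabg": True, "diabetes": True}
    score = total_risk_score(patient)
    print(score, risk_group(score))
```

Because the score is a plain sum of integer weights, a clinician can compute it by hand at the bedside, which is the kind of simplicity the rule was meant to offer.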
Clinicians were randomized to receive the predictive rule as an aid for Part I or Part II of the study. After a 4-month "washout" period, clinicians who received the predictive rule for Part I did not receive it for Part II of the study, and vice versa.
Outcomes
Clinicians were asked to estimate, for each patient, the probability of: (1) operative mortality (OM), defined as any postoperative, in-hospital death, and (2) prolonged ICU length of stay, defined as an ICU length of stay greater than 48 hours, which corresponded to the 83rd percentile of ICU length of stay for 1995 to 1997.
Statistical model
The model was essentially similar to the original predictive rule used to assist clinicians. Risk factors from the original rule, in addition to other important prognostic variables, were submitted to a logistic regression analysis to recalibrate regression coefficients and optimize precision [12].
In 1995, the pattern of practice changed regarding discharge from the ICU: patients were extubated earlier and discharged sooner than previous patients because of "fast-tracking" [20, 21]. Because of this changing pattern, the logistic regression model for the prolonged ICU stay outcome was rederived for only the 1995 to 1997 database of isolated CABG patients. An ICU stay of greater than 48 hours corresponded to the 83rd percentile of total ICU length of stay.
Regression coefficients from the newly derived models were used to calculate the patient-specific predicted probability of each outcome using methods previously described [12, 16, 22, 23].
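A minimal sketch of how a patient-specific probability follows from logistic regression coefficients is given below, using the standard logistic transformation of the linear predictor. The function name, variable names, and coefficient values are assumptions made for the example; they are not the coefficients reported in Table 2.

```python
import math


def predicted_probability(intercept, coefficients, patient):
    """Return 1 / (1 + exp(-(b0 + sum(b_i * x_i)))) for one patient.

    `coefficients` maps variable names to regression coefficients and
    `patient` maps the same names to values (1/0 for categorical factors).
    Variables missing from `patient` are treated as 0 (assumption).
    """
    linear_predictor = intercept + sum(
        beta * patient.get(name, 0.0) for name, beta in coefficients.items()
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


if __name__ == "__main__":
    # Hypothetical coefficients, for illustration only.
    coefficients = {"previous_cabg": 0.9, "emergent_surgery": 1.4}
    patient = {"previous_cabg": 1, "emergent_surgery": 0}
    print(predicted_probability(-3.5, coefficients, patient))
```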
Statistical analysis
Data were managed in dBASE IV datasets and analyzed using SAS for Windows Version 6.12 (SAS, Cary, NC) [24]. Results are presented as means ± standard errors. A two-tailed p value less than 0.05 indicates statistical significance unless otherwise noted.
The estimates of probability for OM and ICU stay greater than 48 hours with and without access to a predictive rule were evaluated by unpaired t test. The difference between the clinician's estimate of the probability of OM and ICU stay greater than 48 hours and the model's estimate was calculated for each patient (clinician minus model) and evaluated by paired t test. The null hypothesis for this analysis was that the difference in the predicted probability of an event equaled zero.
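The paired comparison described above can be sketched as follows. This is an illustrative calculation of the paired t statistic only, with hypothetical function and variable names; obtaining the p value additionally requires the t distribution with the returned degrees of freedom (for example, from a statistics package).

```python
import math
import statistics


def paired_t(clinician, model):
    """Paired t statistic for per-patient differences (clinician minus model).

    Returns (t, degrees_of_freedom). The null hypothesis is that the mean
    difference equals zero. Assumes the differences are not all identical,
    since a zero standard deviation would make t undefined.
    """
    if len(clinician) != len(model) or len(clinician) < 2:
        raise ValueError("need two equally long lists with at least 2 patients")
    differences = [c - m for c, m in zip(clinician, model)]
    mean_difference = statistics.mean(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(len(differences))
    return mean_difference / standard_error, len(differences) - 1


if __name__ == "__main__":
    # Made-up probabilities for three patients, for illustration only.
    t, df = paired_t([0.10, 0.20, 0.30], [0.09, 0.18, 0.27])
    print(t, df)
```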
Discrimination (or predictive accuracy) was assessed by calculation of the area under the receiver-operator characteristic curve (ROC) for each set of estimates [25, 26]. The area under the ROC curve reflects the proportion of randomly paired sets of patients for which the patient experiencing the event has a higher predicted probability of having the event, compared with the patient who does not experience the event. An area under the ROC curve of 50% indicates no discriminatory ability, ROC areas above 70% represent fair discriminatory ability, and ROC areas above 80% represent good discriminatory ability.
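The pairwise interpretation of the ROC area can be made concrete with a short sketch. The function name is hypothetical, and counting tied pairs as one half is a common convention assumed here rather than something stated in the text.

```python
def roc_area(probabilities, events):
    """Area under the ROC curve computed as a proportion of concordant pairs.

    Every patient who had the event is paired with every patient who did not.
    A pair is concordant when the patient with the event has the higher
    predicted probability; ties count as one half (assumed convention).
    """
    with_event = [p for p, e in zip(probabilities, events) if e]
    without_event = [p for p, e in zip(probabilities, events) if not e]
    if not with_event or not without_event:
        raise ValueError("need at least one patient with and one without the event")
    concordant = 0.0
    for p_event in with_event:
        for p_no_event in without_event:
            if p_event > p_no_event:
                concordant += 1.0
            elif p_event == p_no_event:
                concordant += 0.5
    return concordant / (len(with_event) * len(without_event))


if __name__ == "__main__":
    # Made-up estimates for four patients; 1 = event occurred.
    print(roc_area([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```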
Calibration, or precision at the group level, was evaluated by the Hosmer-Lemeshow goodness-of-fit χ² statistic (HLGOF) [22]. Hosmer-Lemeshow goodness-of-fit p values less than 0.05 indicate a significantly imprecise model (ie, the model did not fit the data). Calibration is also important because, in contrast to the match between probabilities and categorical events reflected in the ROC curve area, it captures the ability of the model (or clinicians) to predict the absolute event rates on average.
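For readers unfamiliar with the statistic, a rough sketch of one common form of the Hosmer-Lemeshow calculation is shown below. The grouping into ten equal-sized risk groups and the function name are assumptions for the example; the grouping actually used in the analysis is not described here. The p value would be obtained by referring the statistic to a χ² distribution with the returned degrees of freedom.

```python
def hosmer_lemeshow(probabilities, events, groups=10):
    """Hosmer-Lemeshow goodness-of-fit statistic.

    Patients are sorted by predicted probability and split into `groups`
    roughly equal-sized groups (assumption). For each group, the observed
    number of events is compared with the expected number (the sum of the
    predicted probabilities). Returns (statistic, degrees_of_freedom), with
    degrees of freedom taken as groups - 2. Assumes the mean predicted
    probability in each group is strictly between 0 and 1.
    """
    pairs = sorted(zip(probabilities, events))
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)
        observed = sum(e for _, e in chunk)
        mean_probability = expected / size
        statistic += (observed - expected) ** 2 / (
            size * mean_probability * (1.0 - mean_probability)
        )
    return statistic, groups - 2


if __name__ == "__main__":
    # Made-up data: predicted 20% and 80% risk, matching observed event rates.
    probabilities = [0.2] * 10 + [0.8] * 10
    events = [1, 1] + [0] * 8 + [1] * 8 + [0, 0]
    print(hosmer_lemeshow(probabilities, events, groups=2))
```

A statistic near zero means that observed and expected event counts agree closely within each risk group, which is what good calibration looks like.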
| Results |
|---|
Logistic regression coefficients for both the OM model and the prolonged ICU stay model are found in Table 2. Both models demonstrated excellent discrimination and precision.
Sample characteristics
Table 3 depicts the demographics of the study population and the 100-patient study sample. The deliberate skew of the study sample is evident in both the prognostic variables and the prevalence of outcomes. There was no difference between Part I and Part II for any prognostic or outcome variable.
Difference between clinicians and the model
Predicted probabilities of outcomes
Figure 1 shows the difference by paired t test between each clinician's predicted probability and the model's calculations for OM (top) and ICU stay greater than 48 hours (bottom). The differences for OM were significant for the experienced clinicians (Nos. 1, 2, 3, 5, 6), whereas the less experienced clinicians (Nos. 7 to 9) demonstrated no significant difference from the model's predictions. The junior clinicians' (Nos. 7 to 9) estimates were significantly different from the senior clinicians' (Nos. 1 to 5) by analysis of variance (p < 0.05).
Table 4 shows the results of the pooled clinician sample compared with the statistical model. The clinicians significantly overestimated both total OM and ICU stay greater than 48 hours. However, when we evaluated probabilities for patients who died versus survivors, the differences for predictions of OM were significant only for survivors (clinicians 7.0% ± 9%, model 5.6% ± 7%, p = 0.0001); there was no significant difference in OM predictions for the nonsurvivors (clinicians 9.0% ± 10%, model 8.1% ± 9%, p = 0.23). Clinicians' estimates of prolonged ICU time were significantly higher than the model's for both those with short ICU stays (clinicians 22% ± 22%, model 20% ± 15%, p = 0.03) and those who did have an ICU length of stay greater than 48 hours (clinicians 39% ± 28%, model 34% ± 23%, p = 0.01).
| Comment |
|---|
In this study, we examined the comparative performance of clinicians and a statistical model in predicting adverse outcomes after CABG. We presented a group of clinicians with case abstracts for 100 patients, and deliberately skewed the sample by including 29 patients who died postoperatively. The additional information contained in the narrative section of the patient histories, coupled with the skew towards higher risk patients who experienced adverse events, was designed to give clinicians an advantage over the statistical model's estimates of probability. We postulated that clinicians' conceptual flexibility and intuition might allow them to identify patients at special risk more accurately than a statistical model.
The predictive rule as depicted in Table 1 was provided to each clinician for one phase of the study. The 4-month "washout" period between the two parts of the study was designed to prevent contamination. We were not particularly surprised that the majority of clinicians elected not to use the rule consistently to calculate patient-specific probabilities of each outcome. Indeed, we found no differences in predictive accuracy whether clinicians were exposed to the rule or not, and whether they claimed to use it consistently or not. These findings highlight the tendency of clinicians to rely on their own intuitive judgements rather than a statistical model.
Given the lack of impact of access to the predictive rule, we analyzed the results for both phases of the study together. Unlike previous studies, which have shown that clinicians outperformed quantitative models [27, 30, 31], a robust analysis of the paired difference of each clinician's estimates of probability versus those made by the statistical model revealed that senior surgeons significantly overestimated the probability of operative mortality. One possible explanation for these results is that senior surgeons typically make broad predictions based on their recent experience [15], whereas the junior clinicians, lacking in experience, may rely more closely on published results.
The relatively poor performance of the statistical model in the 100-patient study sample was understandable given the deliberately skewed prevalence of outcomes in the study sample. Indeed, as noted, we expected that the approach towards a study sample with an increased prevalence of high-risk patients having outcomes would give clinicians an advantage. It did not.
One limitation of this study was that the clinicians were making estimates of mortality and prolonged ICU stay from a written synopsis of patient information rather than direct examination. However, clinicians were given an advantage by being provided with additional relevant clinical information in the narrative portion of each patient abstract. Also, the diagram of the cardiac catheterization results included distal vessel quality if it was noted to be poor. Despite these advantages, the clinicians did not outperform the statistical models, which relied solely on a limited number of variables. We doubt, therefore, that our findings would have been different if the study had taken place prospectively in practice as opposed to relying on written case abstracts.
As noted, the limited impact of access to a predictive rule was not surprising. Clinicians are trained to rely heavily on their own experience and intuitive judgements. Moreover, the literature on changing physician behavior amply illustrates that clinical decision-making is not readily influenced unless there is acceptance of a new practice norm by "opinion leaders" and a concerted local effort to address barriers to change (eg, by automating decision support systems) [32, 33]. However, surgeons and anesthetists at our center are already using an implied form of prior probability estimates to fast-track low-risk cardiac surgical patients through the ICU [20, 21]. In this regard, resources have been streamlined for those patients deemed to be at a low risk of poor outcome and, conversely, there is a greater concentration of resources for higher risk patients. We believe that with appropriate implementation strategies, clinicians will use statistical tools that enable them to make more accurate judgements of patients' risks for adverse outcomes after CABG.
Summary
Predictive rules have the advantage of being able to integrate more information by relating both continuous and categorical variables to an outcome of interest. An accurate and precise model can be used as a concise summary of the relationships between prognostic variables and the outcome [34, 35].
Cardiac surgery clinicians, when given the option, preferred not to use a predictive rule but rather trusted their own judgements to estimate the probability of operative mortality or prolonged ICU length of stay after coronary bypass surgery. Experienced surgeons significantly overestimated the risk of operative mortality compared with their junior colleagues. Clinicians' predictive accuracy was only fair for operative mortality, and only slightly better for estimates of prolonged ICU length of stay. Although no model can predict the specific individual who will have an adverse event, statistical models do permit reasonably accurate estimates of event rates for subgroups of patients. If methods were devised to ensure that clinicians can and do make use of predictive rules, they would be able to make more accurate judgements of the probability of adverse outcomes after CABG surgery.
Study participants
Surgeons: Tirone E. David, MD (chief), Lynda L. Mickleborough, MD, Charles M. Peniston, MD, Robert J. Cusimano, MD, Anthony C. Ralph-Edwards, MD, and Terrence M. Yau, MD. Residents: Gideon Cohen, MD, and Michael A. Borger, MD. Nurse clinician: Nancy Walton, BSc.
| Acknowledgments |
|---|
| References |
|---|