Ann Thorac Surg 2009;87:361-364. doi:10.1016/j.athoracsur.2008.10.053
© 2009 The Society of Thoracic Surgeons



The Statistician's Page

Cumulative Sum Curves and Their Prediction Limits

Gary L. Grunkemeier, PhD, Ruyun Jin, MD, YingXing Wu, MD*

Medical Data Research Center, Providence Health & Services, Portland, Oregon

* Address correspondence to Dr Wu, 9205 SW Barnes Rd, Suite 33, Portland, OR 97225 (Email: yingxing.wu{at}providence.org).


    Introduction
 Top
 Introduction
 Risk Adjustment
 Examples
 CUSUM Prediction Limits
 Comment
 Footnotes
 Acknowledgments
 References
 
In this issue of The Annals of Thoracic Surgery, Dr Brevig and colleagues [1] describe their program to reduce blood transfusions in cardiac surgery. Their analysis of the changes in operative mortality during this process uses the risk-adjusted cumulative sum (CUSUM) of observed minus expected deaths. We previously discussed this method [2], but the "bullet-shaped" prediction limits that we used, which give rise to an expanding interval over time, raised some concern. Recently a statistically sophisticated cardiac surgeon asked, "Why do these intervals get wider, rather than narrower, as the number of patients increase?" The December 2004 issue of The Journal of Thoracic and Cardiovascular Surgery contained five important and enlightening articles on CUSUM methods, including a tutorial [3], two commentaries [4, 5], a clinical article [6], and an editorial [7]; the editorial called the bullet-shaped prediction limits "nonintuitive." The present study was undertaken primarily to convince us that these bullet-shaped prediction limits are indeed reasonable and appropriate. Our secondary purpose is to provide a basic introduction to the CUSUM method of risk adjustment, and to illustrate the concepts underlying this important analytical and graphical methodology.


    Risk Adjustment
Measuring the results of a surgical intervention is essential to determining its effectiveness. Public reporting and comparisons are proceeding, with the ultimate goal of identifying the best providers, whose permissions and reimbursements might be determined by their results. For this task, risk adjustment is imperative: because patient profiles differ, no two patients have exactly the same intrinsic risk of death or other complications. Among medical procedures, cardiac surgery has led the way in this area, contributing many risk models for mortality and other serious complications. For each patient, these risk models provide a predicted probability of the occurrence of the event.

How do we then accomplish "risk adjustment," using these predicted results to adjust the observed results? For an in-hospital or operative event (eg, death), the observed outcome for each patient is either 0 (lived) or 1 (died). The predicted probability is a decimal value, strictly between 0 and 1 (usually expressed as a percentage). Comparing the observed (0 or 1) value with the expected (predicted) decimal value is therefore not straightforward for a single patient. For a group of patients, however, the comparison is natural: their expected number of deaths is simply the sum of their individual predicted (decimal) probabilities of death. For example, if you operate on 10 identical patients, each with a 0.10 (10%) probability of death, then you would expect that one of those 10 would die. Thus, the expected number of deaths is 10 times 0.10, or 1.0 death, that is, the sum of the predicted probabilities of death. (Ten real patients would all have different risks, but the sum of their predicted risks will still equal the expected number of deaths in the group.)
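The arithmetic in this paragraph can be sketched in a few lines. This is only an illustration: the function name `expected_deaths` and the second list of varied risks are hypothetical and do not come from the article.

```python
# Ten identical patients, each with a 10% predicted risk (the example in the text).
identical_risks = [0.10] * 10

# Ten patients with different risks; these values are hypothetical,
# chosen only so that they also add up to 1.0.
varied_risks = [0.02, 0.05, 0.05, 0.08, 0.10, 0.10, 0.12, 0.13, 0.15, 0.20]


def expected_deaths(risks):
    """Expected number of deaths: the sum of the individual predicted probabilities."""
    return sum(risks)


print(round(expected_deaths(identical_risks), 2))
print(round(expected_deaths(varied_risks), 2))
```

Both groups have the same expected number of deaths even though the individual risks differ, which is the point of the parenthetical remark above.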

O/E Ratios and Odds Ratios
A common method of risk-adjustment is to divide the sum of the observed (O) deaths for a group of patients by the sum of their expected (E) mortality risks to produce the O/E ratio. A similar, but technically more suitable [8], statistic is the odds ratio (OR). If a provider is operating as expected/predicted, these ratios should equal about 1. For the example just given, the O/E ratio is exactly 1 (= 1/1).

O-E Differences and Cumulative Differences
Another method of risk adjustment is to subtract, rather than divide, to produce the O-E difference. If a provider is operating as expected, this difference should equal about zero. A great advantage of the O-E difference, compared with the O/E ratio and the OR, is that when it is computed on a patient-by-patient basis, it describes an evolving process that can be plotted over patient number (or surgery date) to produce an informative graph of the trends over caseload (or time) that can be monitored continuously. Plus, the vertical axis in this graph has the intuitive interpretation of "excess deaths," (ie, the number of observed deaths that were not expected). Also, negative values on this vertical axis are "lives saved" (ie, the number of expected deaths that did not occur).


    Examples
We will begin with a simple example using dummy patients. Suppose you operate on 10 relatively high-risk patients with expected mortality ranging from 10% to 35% (Fig 1). In this example, the sum of the expected mortality risks (in decimal notation) is E = 2.0, meaning that 2.0 deaths are expected. In fact, two patients actually died (O = 2), so the O/E ratio is 1.0, and you are operating as expected. A more detailed analysis of this limited experience can be obtained by plotting the cumulative sum (CUSUM) of these O-E differences. When a patient dies the CUSUM goes up by 1 minus that patient's expected risk, and when a patient lives the CUSUM goes down by that patient's expected risk (Fig 1). In this contrived example, the CUSUM curve ends at zero, indicating performance exactly as expected.


Fig 1. Ten hypothetical patients with varying risks (Exp = expected mortality), for a total risk of 2.0. The surgeon is operating exactly as expected, and 2 patients died (Obs = observed mortality). Thus, the observed/expected (O/E) ratio is 1, and the cumulative sum (CUSUM) of the observed mortality minus expected mortality equals zero at the end.
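A CUSUM of this kind can be computed as a running total of the per-patient differences. The sketch below is in the spirit of Figure 1, but the individual risks and the positions of the two deaths are assumptions made for illustration (the article gives only the range, 10% to 35%, and the total, 2.0); the function name `cusum_o_minus_e` is likewise hypothetical.

```python
from itertools import accumulate

# Hypothetical risks between 10% and 35% that sum to 2.0 (not the Figure 1 values).
risks = [0.10, 0.15, 0.20, 0.25, 0.35, 0.10, 0.15, 0.20, 0.30, 0.20]
# Hypothetical outcomes: 1 = died, 0 = lived; patients 5 and 9 die.
outcomes = [0, 0, 0, 0, 1, 0, 0, 0, 1, 0]


def cusum_o_minus_e(outcomes, risks):
    """Running sum of (observed - expected), one value per patient."""
    return list(accumulate(o - e for o, e in zip(outcomes, risks)))


curve = cusum_o_minus_e(outcomes, risks)
observed_total = sum(outcomes)
expected_total = sum(risks)

print(f"O = {observed_total}, E = {expected_total:.2f}, "
      f"O/E = {observed_total / expected_total:.2f}")
for patient, value in enumerate(curve, start=1):
    print(patient, f"{value:+.2f}")
```

Each survivor lowers the curve by that patient's risk, each death raises it by 1 minus the risk, and with O equal to E the curve returns to zero (up to floating-point rounding) at the last patient.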

 
Real Patients
The Providence Health and Services (PHS) hospitals research and data sharing collaborative developed a risk model for valve surgery operative mortality, based on 4,920 valve operations from 1997 to 2004 [9]. Subsequently, the PHS cardiac programs have performed 3,217 additional heart valve surgeries in which a predicted risk score could be obtained, allowing us to compare the observed mortality in this validation subset of subsequent patients to that expected from the PHS risk model. The CUSUM for these patients, constructed just as in the example in Figure 1, using the PHS risk model for expected mortality, is shown in Figure 2. The horizontal axis is scaled by surgery number, and the operative years are given by vertical grid lines. (An alternative presentation would be to scale the horizontal axis by calendar time and depict the number of cases by vertical grid lines.)


Fig 2. The cumulative sum of observed minus expected operative deaths for 3,217 heart valve surgeries, with 95% point-wise prediction limits. The horizontal axis is scaled by patient number, and the operative years are given by vertical grid lines. The odds ratio (OR), with 95% confidence interval (CI), gives an overall assessment of performance.

 

    CUSUM Prediction Limits
The O-E difference will almost never equal exactly zero, even when performance is as expected. Random variation must be accounted for, before any clinical difference is attributed to performance, by constructing an interval estimate that contains the values that are consistent with the observed (point) estimate. A simple 95% confidence interval can be constructed using the normal ("bell-shaped") approximation to the binomial distribution (Appendix). To produce prediction intervals for the CUSUM plot in Figure 2, we computed these 95% limits at each point (patient). As previously mentioned, it is not intuitively apparent why these prediction intervals should be expanding, or bullet-shaped, as the number of patients increases. To resolve this concern, and to convince any skeptics, we will enlist the help of a "virtual" surgeon, or rather, a large number of them, to operate on these real patients and help us confirm the appropriateness of these prediction intervals.
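The point-wise limits described here can be sketched as follows, under stated assumptions: the limits are centered on zero (performance as expected), the multiplier is 1.96 (the Appendix rounds this to 2), and the constant 5% risk and the function name `pointwise_limits` are hypothetical choices for illustration.

```python
import math
from itertools import accumulate


def pointwise_limits(risks, z=1.96):
    """Normal-approximation limits for the O-E CUSUM at each patient.

    The variance after n patients is the running sum of e * (1 - e),
    so the half-width is z times the square root of that running sum.
    """
    cumulative_variance = accumulate(e * (1.0 - e) for e in risks)
    upper = [z * math.sqrt(v) for v in cumulative_variance]
    lower = [-u for u in upper]
    return lower, upper


# Hypothetical series: 1,000 patients, each with a 5% predicted risk.
risks = [0.05] * 1000
lower, upper = pointwise_limits(risks)

for n in (100, 1000):
    half_width = upper[n - 1]
    print(f"n = {n}: limits on O-E = +/-{half_width:.2f} deaths, "
          f"per patient = +/-{half_width / n:.4f}")
```

On the scale of deaths the limits widen as patients accumulate, while the same limits divided by the number of patients narrow. The bullet shape belongs to the count scale on which the CUSUM is plotted.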

Virtual Surgeons
Simulation, made possible by now-ubiquitous computing power, offers a new paradigm in statistics. We no longer have to theorize about what will happen "in the long run," as we can simply reproduce an observed outcome (eg, a simulated experience for a group of real patients) as often as desired, all within a few seconds of computer time. To apply this technology to the present situation, we take the predicted risk for each patient, but ignore the actual fate of the patient. Instead, we let a random process determine the survival for each patient, based on his or her expected risk. This produces an entire series of "pseudo outcomes," completely consistent with the expected values, and a resulting CUSUM path. The computer acts as a "virtual surgeon" operating exactly as expected, giving a patient with a p% expected risk exactly a p% chance of death. The actual death, or not, depends on chance as incorporated into the (binomial) distribution. Repeating this procedure produces another pseudo series, and another CUSUM path, different from the first, but equally consistent with the expected risks. When a large number of these CUSUMs are plotted, the appropriate, expected random variability becomes evident.

To see how extreme these "expected" paths may be at each point, we recruited 5,000 virtual surgeons, each of whom operated exactly as expected, with only random variability separating their results. The first 100 of these results are drawn in Figure 3 as light gray lines, and the quantiles for the middle 95% at each point (patient) for all 5,000 are drawn as thicker black lines. Note that they correspond with the normal-approximation limits, shown by the smooth, thicker gray lines. The probability that any given surgeon will exceed these prediction limits is somewhat greater than 5% because of the multiple comparisons at each time point [10]. It can be seen from Figure 3 that at each point approximately 5% (5 of 100) of the curves are outside these limits, but more than 5% are outside these limits at some point. Thus these point-wise confidence limits do not constitute a 95% confidence band, nor can this method provide a strict hypothesis test. (Variations of the CUSUM technique provide exact hypothesis tests, but do not produce such readily intuitive graphs [2].)


Fig 3. Visual justification of the prediction limits for the cumulative sum curves, using 3,217 real valve surgery patients and 5,000 simulated surgeons, with each "surgeon" operating on each patient with expected mortality given by the risk model, and with random deviations from the horizontal zero line due only to chance. Only the first 100 simulations are plotted (light gray jagged lines), but the quantiles for the middle 95% from all 5,000 simulations are also shown (black jagged lines). These point-wise 95% limits agree with the prediction limits based on a normal approximation (thicker, smooth gray lines).
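The virtual-surgeon experiment can be sketched on a small scale. Everything specific here is an assumption for illustration: the 300 patients and their risks are randomly generated rather than the PHS data, only 1,000 surgeons are simulated instead of 5,000, and the function names are hypothetical.

```python
import math
import random
from itertools import accumulate


def simulate_cusum(risks, rng):
    """One virtual surgeon: each patient dies with probability equal to the predicted risk."""
    return list(accumulate((1 if rng.random() < e else 0) - e for e in risks))


def pointwise_quantile_band(paths, lower=0.025, upper=0.975):
    """Empirical lower and upper quantiles across all paths, at each patient."""
    n_paths = len(paths)
    lo_index = int(lower * (n_paths - 1))
    hi_index = int(round(upper * (n_paths - 1)))
    lo, hi = [], []
    for values_at_patient in zip(*paths):
        ordered = sorted(values_at_patient)
        lo.append(ordered[lo_index])
        hi.append(ordered[hi_index])
    return lo, hi


rng = random.Random(2009)  # fixed seed so the sketch is repeatable
risks = [rng.uniform(0.01, 0.15) for _ in range(300)]  # hypothetical patient risks
paths = [simulate_cusum(risks, rng) for _ in range(1000)]
lo, hi = pointwise_quantile_band(paths)

# Normal-approximation half-width at the last patient, for comparison.
normal_half_width = 1.96 * math.sqrt(sum(e * (1.0 - e) for e in risks))
print(f"simulated middle 95% at last patient: {lo[-1]:.2f} to {hi[-1]:.2f}")
print(f"normal approximation: +/-{normal_half_width:.2f}")
```

The simulated quantiles are jagged because deaths are counted in whole numbers, but they should track the smooth normal-approximation limits, as the figure shows for the real patient series.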

 
Confidence Limits
Both the O/E ratio and the OR combine all of the data into a single statistic, and thus transcend the multiple comparison problem. Similarly, using the prediction limits at the very last CUSUM point, to which all the data contribute, avoids the multiple comparisons. The confidence interval for this [11] is based on the same formula as that of the O/E ratio [12], which is in turn very similar to the OR confidence interval. So, to assign a numeric value to the whole CUSUM experience, we compute the OR and its 95% confidence interval, using an intercept-only logistic regression with the logit of the risk as an offset term [8]. We, as do others [13], prefer to emphasize estimation rather than hypothesis testing, and thus provide the point and interval estimates of the OR, as well as the path of the CUSUM in relation to the expected prediction limits (Fig 2). In Figure 2, the OR confidence interval agrees with the CUSUM prediction limits: the overall performance is better than expected, but not significantly so.
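An intercept-only logistic regression with the logit of the risk as an offset has a single unknown, so it can be fitted with a few Newton steps and no statistics library. The sketch below is one way to do that, not necessarily how the authors computed their OR. It assumes a Wald interval with multiplier 1.96 and at least one death and one survivor, and the data and the function name `risk_adjusted_odds_ratio` are hypothetical.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def risk_adjusted_odds_ratio(outcomes, risks, z=1.96, tol=1e-10, max_iter=50):
    """Fit logit(P(death)) = beta + logit(risk) and return (OR, lower, upper).

    beta is the log odds ratio of observed versus predicted mortality.
    Requires at least one death and one survivor, otherwise beta is unbounded.
    """
    offsets = [logit(e) for e in risks]
    beta = 0.0
    for _ in range(max_iter):
        probs = [expit(beta + off) for off in offsets]
        score = sum(o - p for o, p in zip(outcomes, probs))  # gradient
        information = sum(p * (1.0 - p) for p in probs)      # negative Hessian
        step = score / information
        beta += step
        if abs(step) < tol:
            break
    probs = [expit(beta + off) for off in offsets]
    se = 1.0 / math.sqrt(sum(p * (1.0 - p) for p in probs))
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical ten-patient series: risks sum to 2.0 and two patients die.
risks = [0.10, 0.15, 0.20, 0.25, 0.35, 0.10, 0.15, 0.20, 0.30, 0.20]
outcomes = [0, 0, 0, 0, 1, 0, 0, 0, 1, 0]

odds_ratio, ci_low, ci_high = risk_adjusted_odds_ratio(outcomes, risks)
print(f"OR = {odds_ratio:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

When observed deaths equal expected deaths the fitted OR is 1, and with only ten patients the interval is very wide, which is why a single summary number should be read together with its confidence interval.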


    Comment
Steiner and colleagues [14] note that the limitation of the O-E CUSUM plots is that "... they do not specify how much variation in the plot is expected under good surgical performance, and hence how large a deviation from the expected should be a cause for concern" [14]. Our major purpose was to address this "limitation." Prediction limits can easily be put around a CUSUM using the familiar normal approximation to the binomial distribution. Also, but technically more difficult, exact prediction limits can be produced based directly on the binomial distribution [15, 16]. Note that prediction limits are necessary to provide a scale with which to interpret deviations from the horizontal line (O = E), since, for example, a difference of 10 deaths might be of great concern if the total number of surgeries was small and the expected death rate was low, whereas this might not be the case if the number of surgeries was very large or the expected death rate was very high, or both.

Alternative CUSUM Methods
The CUSUM originally referred to a statistical method used in industrial applications, to detect when a process goes out of control [17]. But patients, unlike industrial units, are unique, so risk-adjusted versions of this hypothesis-testing setup have been produced for clinical use [10, 14]. However, to gain the advantage of being a strict hypothesis test, these cumulative sums are plotted in units of likelihood ratio logarithms, and thus have no easily appreciated meaning. We prefer the plots of observed minus expected mortality, which have an intuitive meaning. In fact, the conclusions from the simple O-E CUSUM and from the more formal but less graphically helpful hypothesis-testing setups are similar [2, 3, 6, 10, 16, 18]. Biau and colleagues [19] use the term CUSUM graph to refer to the cumulative O-E curve and CUSUM test to refer to the hypothesis-testing setup. A recent comparison of seven methods of detecting a high death rate, including the more descriptive and the more formal hypothesis-testing methods, recommended one of the former [20].

Up or Down?
Previously we [2], like others [21, 22], used the cumulative sum of E-O, but we now prefer the reverse, O-E, as do many others [3, 6, 10, 23, 24], because "observed minus expected" is the usual direction in which residuals (differences between observations and model-based predictions) are formed [25]. The original CUSUM by Page [17] used this approach (ie, up is bad, down is good).

Conservative Nature of Prediction Limits
These prediction limits are conservative (ie, they are point-wise rather than curve-wise). Although for each point, 95% of CUSUM paths should be inside, the probability that an entire CUSUM path will be within these limits at every point is less than 95%, and decreases with the number of points plotted [16]. In fact, the conservativeness of these CUSUM prediction intervals (meaning that some "normal" providers' CUSUM paths may stray outside the prediction limits and draw attention to themselves, when only chance is operating) may be desirable. In discussing the related hypothesis testing methods, Treasure and colleagues [5] conclude that, "It is inherent in the problem that an alert ... must signal before the conventional level of scientific proof required to test a scientific hypothesis; if we allow events to run their course until a conventional level of significance is reached (such as p = 0.05), many lives will have been lost."
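The difference between point-wise and curve-wise coverage can be checked by simulation. In this sketch the constant 5% risk, the 1,000 simulated paths, and the choice to start checking at patient 20 (the normal approximation is rough for the first few patients) are all assumptions for illustration, as are the function names.

```python
import math
import random


def first_exit(risks, rng, z=1.96, first_checked=20):
    """Simulate one as-expected CUSUM path.

    Returns the patient number at which the path first lies outside the
    point-wise limits, or None if it stays inside for the whole series.
    """
    cusum = 0.0
    variance = 0.0
    for n, e in enumerate(risks, start=1):
        died = 1 if rng.random() < e else 0
        cusum += died - e
        variance += e * (1.0 - e)
        if n >= first_checked and abs(cusum) > z * math.sqrt(variance):
            return n
    return None


def fraction_always_inside(exits, length):
    """Share of paths that stay within the limits through patient `length`."""
    inside = sum(1 for n in exits if n is None or n > length)
    return inside / len(exits)


rng = random.Random(87)  # fixed seed so the sketch is repeatable
risks = [0.05] * 1000    # hypothetical constant 5% risk
exits = [first_exit(risks, rng) for _ in range(1000)]

lengths = [50, 200, 1000]
fractions = [fraction_always_inside(exits, n) for n in lengths]
for n, fraction in zip(lengths, fractions):
    print(f"first {n} patients: {fraction:.1%} of paths never left the limits")
```

Every simulated surgeon here performs exactly as expected, yet the share of paths that never leave the limits falls as the series lengthens, which is the behavior described in this paragraph.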


    Appendix
 
Confidence Interval for a Single O–E Point
A 95% confidence interval for O–E can be constructed by computing the standard error (SE) of the point estimate, and using the normal distribution approximation of plus and minus 2 SE [11]. If "o" is the observed mortality for an individual patient (equals 1 if that patient died and 0 if not), then O = sum(o) is the total number of observed deaths. If "e" is the expected mortality (0 < e < 1) for an individual patient, then E = sum(e) is the number of expected deaths. Since "e" is an estimate from a risk model, it has some variability because of the error involved in estimating the coefficients of the model. But this variability in "e" is relatively small compared with the error or variability in "o" [12], so it is usually ignored and "e" is considered to be a fixed value, and SE(O–E) = SE(O). Since "o" follows the Bernoulli distribution with parameter "e," then:


$$\mathrm{SE}(O) = \sqrt{\mathrm{Var}(O)} = \sqrt{\sum e\,(1 - e)}$$

An approximate 95% confidence interval for the O–E difference based on the normal distribution is (O-E – 2*SE(O), O-E + 2*SE(O)).
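A short worked example may help show why such limits widen with the number of patients. It assumes, purely for illustration, that every patient has the same risk e = 0.05 (real patients differ); the subscripts n and i and the numbers are not from the article. The variance of a sum of independent Bernoulli outcomes is the sum of the individual variances e(1 - e), so:

```latex
\begin{gather*}
\mathrm{SE}(O_n) = \sqrt{\sum_{i=1}^{n} e_i\,(1 - e_i)} = \sqrt{n\,e\,(1 - e)}
  \quad \text{when every } e_i = e, \\
n = 100:\quad \mathrm{SE} = \sqrt{100 \times 0.05 \times 0.95} = \sqrt{4.75} \approx 2.18,
  \qquad 2\,\mathrm{SE} \approx 4.36 \text{ deaths}, \\
n = 400:\quad \mathrm{SE} = \sqrt{400 \times 0.05 \times 0.95} = \sqrt{19} \approx 4.36,
  \qquad 2\,\mathrm{SE} \approx 8.72 \text{ deaths}.
\end{gather*}
```

Quadrupling the number of patients doubles the width of the limits on the number of excess deaths (about 4.36 versus 8.72), yet halves the width per patient (about 0.044 versus 0.022). The limits on a count grow like the square root of n, whereas the limits on a rate shrink like 1 over the square root of n.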


    Acknowledgments
The following listed facilities contribute to the PHS Cardiovascular Study Group database. Alaska: Providence Alaska Medical Center (Anchorage). Washington: Providence Regional Medical Center Everett; Providence St Peter Hospital (Olympia); Sacred Heart Medical Center (Spokane). Oregon: Providence Portland Medical Center; Providence St. Vincent Medical Center (Portland); Providence Medford Medical Center. California: Providence St Joseph Medical Center (Burbank); Providence Holy Cross Medical Center (Mission Hills); Little Company of Mary Hospital (Torrance). Montana: St. Patrick Hospital and Health Sciences Center (Missoula).


    Footnotes
For related article, see page 532


    References

  1. Brevig J, McDonald J, Zelinka ES, Gallagher T, Jin R, Grunkemeier GL. Blood transfusion reduction in cardiac surgery: multidisciplinary approach at a community hospital. Ann Thorac Surg 2009;87:532-539.
  2. Grunkemeier GL, Wu YX, Furnary AP. Cumulative sum techniques for assessing surgical results. Ann Thorac Surg 2003;76:663-667.
  3. Rogers CA, Reeves BC, Caputo M, Ganesh JS, Bonser RS, Angelini GD. Control chart methods for monitoring cardiac surgical performance and their interpretation. J Thorac Cardiovasc Surg 2004;128:811-819.
  4. Spiegelhalter DJ. Monitoring clinical performance: a commentary. J Thorac Cardiovasc Surg 2004;128:820-822.
  5. Treasure T, Gallivan S, Sherlaw-Johnson C. Monitoring cardiac surgical performance: a commentary. J Thorac Cardiovasc Surg 2004;128:823-825.
  6. Caputo M, Reeves BC, Rogers CA, Ascione R, Angelini GD. Monitoring the performance of residents during training in off-pump coronary surgery. J Thorac Cardiovasc Surg 2004;128:907-915.
  7. Blackstone EH. Monitoring surgical performance. J Thorac Cardiovasc Surg 2004;128:807-810.
  8. Grunkemeier GL, Wu Y. What are the odds? Ann Thorac Surg 2007;83:1240-1244.
  9. Jin R, Grunkemeier GL, Starr A. Validation and refinement of mortality risk models for heart valve surgery. Ann Thorac Surg 2005;80:471-479.
  10. Spiegelhalter D, Grigg O, Kinsman R, Treasure T. Risk-adjusted sequential probability ratio tests: applications to Bristol, Shipman and adult cardiac surgery. Int J Qual Health Care 2003;15:7-13.
  11. Sherlaw-Johnson C, Lovegrove J, Treasure T, Gallivan S. Likely variations in perioperative mortality associated with cardiac surgery: when does high mortality reflect bad practice? Heart 2000;84:79-82.
  12. Hosmer DW, Lemeshow S. Confidence interval estimates of an index of quality performance based on logistic regression models. Stat Med 1995;14:2161-2172.
  13. Gardner MJ, Altman DG. Confidence intervals rather than P values: estimation rather than hypothesis testing. Br Med J (Clin Res Ed) 1986;292:746-750.
  14. Steiner SH, Cook RJ, Farewell VT, Treasure T. Monitoring surgical performance using risk-adjusted cumulative sum charts. Biostatistics 2000;1:441-452.
  15. Sherlaw-Johnson C, Gallivan S, Treasure T, Nashef SA. Computer tools to assist the monitoring of outcomes in surgery. Eur J Cardiothorac Surg 2004;26:1032-1036.
  16. Sherlaw-Johnson C, Morton A, Robinson MB, Hall A. Real-time monitoring of coronary care mortality: a comparison and combination of two monitoring tools. Int J Cardiol 2005;100:301-307.
  17. Page ES. Continuous inspection schemes. Biometrika 1954;41:100-115.
  18. Novick RJ, Fox SA, Stitt LW, Forbes TL, Steiner S. Direct comparison of risk-adjusted and non-risk-adjusted CUSUM analyses of coronary artery bypass surgery outcomes. J Thorac Cardiovasc Surg 2006;132:386-391.
  19. Biau DJ, Resche-Rigon M, Godiris-Petit G, Nizard RS, Porcher R. Quality control of surgical and interventional procedures: a review of the CUSUM. Qual Saf Health Care 2007;16:203-207.
  20. Poloniecki J, Sismanidis C, Bland M, Jones P. Retrospective cohort study of false alarm rates associated with a series of heart operations: the case for hospital mortality monitoring groups. BMJ 2004;328:375.
  21. Lovegrove J, Valencia O, Treasure T, Sherlaw-Johnson C, Gallivan S. Monitoring the results of cardiac surgery by variable life-adjusted display. Lancet 1997;350:1128-1130.
  22. Poloniecki J, Valencia O, Littlejohns P. Cumulative risk adjusted mortality chart for detecting changes in death rate: observational study of heart surgery. BMJ 1998;316:1697-1700.
  23. Grigg OA, Farewell VT, Spiegelhalter DJ. Use of risk-adjusted CUSUM and RSPRT charts for monitoring in medical contexts. Stat Methods Med Res 2003;12:147-170.
  24. Sibanda T, Sibanda N. The CUSUM chart method as a tool for continuous monitoring of clinical outcomes using routinely collected data. BMC Med Res Methodol 2007;7:46.
  25. Royston P. The use of cusums and other techniques in modeling continuous covariates in logistic regression. Stat Med 1992;11:1115-1129.
