a Department of Surgery, University of Virginia Health System, Charlottesville, Virginia
b Department of Public Health Sciences, University of Virginia Health System, Charlottesville, Virginia
Accepted for publication January 31, 2012.
* Address correspondence to Dr Kozower, University of Virginia Health System, General Thoracic Surgery, PO Box 800679, Charlottesville, VA (Email: bdk8g{at}virginia.edu).
Presented at the Fifty-eighth Annual Meeting of the Southern Thoracic Surgical Association, San Antonio, TX, Nov 9–12, 2011.
| Abstract |
|---|
Methods: Esophageal cancer resection patients were identified in the 2007 Nationwide Inpatient Sample. Hospital volume was measured using a continuous linear function, a nonlinear function using restricted cubic splines, and using quintiles of volume. The statistical significance of the relationship between hospital volume and mortality risk was assessed, and adjusted for patient age, for comorbid disease, and for correlated events within hospitals.
Results: A total of 6,248 esophageal cancer resection patients from 217 hospitals were identified. All 3 models demonstrated excellent performance characteristics (C index = 0.94, Nagelkerke R² = 0.62). However, no significant association was demonstrated between hospital procedure volume and in-hospital mortality in any model. Important predictors of mortality included age, hypertension, weight loss, and peripheral vascular disease (p < 0.001).
Conclusions: Esophageal cancer resection volume is not a significant predictor of mortality and should not be used as a proxy measure for surgical quality.
| Introduction |
|---|
Careful methodological reviews of volume-outcome studies describe serious concerns with the validity of this association [12, 13]. Most volume-outcome studies place procedure volumes into arbitrarily defined categories, rather than treating volume as a continuous variable. Importantly, the manner in which categories are defined determines the magnitude of the effect of volume on mortality risk when measured by odds ratios [14]. The majority of studies have also used administrative data sets containing very large procedure numbers. With very large sample sizes, statistically significant differences between parameters are almost guaranteed, so effect sizes are needed to distinguish clinical from statistical significance. In addition, the quality and performance of the statistical models used and the contribution of procedure volume to explained variance were not rigorously evaluated in the majority of the volume-outcome studies examined [13–15]. The present study addresses these important issues to evaluate the relationship between hospital procedure volume and mortality after esophagectomy for esophageal cancer.
| Material and Methods |
|---|
Eligibility Criteria
The study population included all discharge records for patients with an International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) procedure code for esophagectomy (42.4, 42.40, 42.41, 42.42, and 43.99) and an accompanying ICD-9-CM diagnosis code for esophageal cancer (150.X). Discharge records for patients with missing discharge status were excluded.
Volume Measurement
The weighted total of discharge records per hospital was calculated for each hospital in the data set. Hospital volume was represented in 3 different ways in subsequent statistical analyses of the relationship between volume and mortality outcomes: as a continuous variable, as a spline function, and as a categoric variable using quintiles of volume [18]. Representing volume as a continuous variable assumes that the relationship between volume and mortality is linear. Spline regression allows more accurate characterization of the relationship between volume and mortality by incorporating nonlinearity in the volume-outcome relationship [11, 19, 20]. Spline regression creates a functional representation of the shape of the relationship between volume and the outcome of mortality using piecewise polynomial functions [21]. Restricted cubic splines force the tails of the function to be linear, which simplifies the representation. In the second model, a restricted cubic spline function was developed using an inflection point placed at the median value of weighted procedure volume and at the fifth and 95th percentiles of the distribution to define the tails of the function. In the third model, volume was represented using categories of volume, which also allows potential nonlinearity in the relationship between volume and mortality to be addressed. The weighted volume of procedures was calculated for each hospital to identify thresholds for defining quintiles of volume for the overall distribution. Hospitals were then stratified into quintiles. The majority of prior studies have used this technique, partitioning volume into quartiles, quintiles, or into other equivalently sized groups.
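The second and third representations of volume can be sketched in a few lines of Python. This is a minimal illustration, not the SAS code used in the study: the function names, the interpolation rule for percentiles, the scaling of the spline term, and the example list of hospital volumes are all assumptions, and survey weights are ignored for simplicity. With 3 knots, a restricted cubic spline contributes one nonlinear term in addition to the linear volume term.

```python
import bisect


def percentile(values, pct):
    """Percentile (0-100) by linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = pct * (len(ordered) - 1) / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def spline_knots(volumes):
    """Knots at the 5th percentile, the median, and the 95th percentile."""
    return tuple(percentile(volumes, p) for p in (5, 50, 95))


def rcs_nonlinear_term(x, knots):
    """Nonlinear basis term of a 3-knot restricted cubic spline.

    The term is zero below the first knot and linear above the last knot,
    so the fitted curve has linear tails.
    """
    t1, t2, t3 = knots

    def cube(u):
        return max(u, 0.0) ** 3

    term = (cube(x - t1)
            - cube(x - t2) * (t3 - t1) / (t3 - t2)
            + cube(x - t3) * (t2 - t1) / (t3 - t2))
    return term / (t3 - t1) ** 2  # conventional scaling, assumed here


def volume_quintile(x, volumes):
    """Quintile (1 = lowest, 5 = highest) of volume x among all hospitals."""
    cuts = [percentile(volumes, p) for p in (20, 40, 60, 80)]
    return bisect.bisect_left(cuts, x) + 1


# Hypothetical hospital volumes (1 to 101 procedures), for illustration only.
hospital_volumes = list(range(1, 102))
knots = spline_knots(hospital_volumes)
```

In a regression model, the spline representation would enter as two covariates (volume itself and `rcs_nonlinear_term`), whereas the quintile representation would enter as four indicator variables.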
The confidence limits for the weighted estimates were calculated using standard Wald "linear" confidence limits for proportions. The confidence limits use variance estimates that are based on the weighted sample design. The weighted frequencies and their confidence limits were calculated using the SAS 9.2 procedure SURVEYFREQ (SAS Institute, Cary, NC).
Hierarchical Generalized Linear Models
Hierarchical generalized linear models were used to estimate the statistical significance of the relationship between inpatient mortality and hospital volume, adjusted for differences in patient age, gender, and comorbid disease. Twenty-seven categories of comorbid disease were measured using the method described by Elixhauser and colleagues [22]. The Elixhauser method has been demonstrated to provide effective adjustments for surgical mortality risk in prior research [23–25]. In the absence of present on admission coding, risk adjustment methods using reported diagnoses are subject to the problem of including some postoperative complications in adjustments ostensibly made for baseline risk [26]. In this study, the Elixhauser method categories of anemia, electrolyte disorders, and coagulopathy were excluded from the risk adjustment models because these conditions are more likely to be postoperative complications than preoperative comorbidities [18].
Three hierarchical generalized linear models were developed using each of the 3 alternative representations of volume, with otherwise identical adjustments for age, gender, and comorbid disease. Hierarchical generalized linear models were used to account for potentially overdispersed variance estimates associated with correlated outcomes within hospitals. In SAS 9.2, these models were produced by using the generalized linear mixed model procedure, "Proc GLIMMIX." Hospitals were included in the models as random effects, allowing the relationship between volume and in-hospital death to be different across hospitals, by using the command "random intercept/subject = HOSPID" where HOSPID is the variable for hospital in the NIS [11, 27].
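One way to write a random-intercept logistic model of this kind is shown below. The notation is an assumption introduced here for illustration and does not appear in the original text: i indexes patients, j indexes hospitals, v_j is the weighted hospital volume, f is one of the 3 volume representations, x_ij holds age, gender, and the comorbidity indicators, and u_j is the hospital-level random intercept.

```latex
\operatorname{logit}\,\Pr(y_{ij} = 1)
  = \beta_0 + f(v_j) + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j,
\qquad u_j \sim N(0, \sigma_u^2)
```

Under this formulation, f(v_j) contributes 1 parameter for the linear representation, 2 for a 3-knot restricted cubic spline, and 4 for quintiles with one reference category.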
Tests of Covariate Significance
The statistical significance of the effect of volume on mortality was measured using 2 different tests. The overall statistical significance of the effect of volume in each model was measured using the nested model log-likelihood test, which compares the total log-likelihood obtained by the model to that obtained by an otherwise identical model excluding the volume measure [28, 29]. The difference in model log-likelihood yields a test statistic with a χ² distribution with degrees of freedom equal to the number of parameters associated with the volume measure. The statistical significance of each volume measure and all other parameters for each of the fixed effects included in the models was also assessed using the F test statistic.
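The nested model test can be illustrated with a short Python sketch. It assumes the standard form of the likelihood ratio statistic, twice the difference in log-likelihood between the full and reduced models; the function names are hypothetical, and the log-likelihood values in the usage note are invented for illustration rather than taken from the study.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution (integer df >= 1)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        k, q = 2, math.exp(-half)              # closed form for df = 2
    else:
        k, q = 1, math.erfc(math.sqrt(half))   # closed form for df = 1
    # Recurrence: sf(df + 2) = sf(df) + (x/2)^(df/2) * exp(-x/2) / Gamma(df/2 + 1)
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


def likelihood_ratio_test(ll_full, ll_reduced, df):
    """Return the test statistic and p value for a nested model comparison.

    df is the number of parameters associated with the volume measure.
    """
    stat = 2.0 * (ll_full - ll_reduced)
    return stat, chi2_sf(stat, df)
```

For example, `likelihood_ratio_test(-100.0, -101.0, 1)` compares a hypothetical model containing a single linear volume term with the same model without it.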
Model Performance
The predictive accuracy of the models was quantified using the C statistic and the Nagelkerke R² statistic. The C statistic is equivalent to the area under the receiver operating characteristic curve for models with a dichotomous response variable, and it provides an estimate of the model's ability to discriminate between observed instances of inpatient death and survival. A value of 0.5 indicates that the model provides no predictive discrimination, while a value of 1.0 indicates perfect separation. We also calculated the Nagelkerke R² statistic. This is a log-likelihood ratio χ²-based measure that is analogous to the R² statistic in ordinary multiple regression. The Nagelkerke R² statistic ranges from 0 for models that provide no predictive information to 1 for models that predict perfectly. Model covariates with p values less than 0.05 were identified as statistically significant. All statistical analyses were conducted using SAS 9.2 (SAS Institute).
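Both performance measures can be computed directly from observed outcomes and predicted probabilities. The Python sketch below is illustrative only: the function names are hypothetical, the null model is assumed to be an intercept-only model that predicts the overall death rate for every patient, and the calculation ignores the survey weights and hierarchical structure used in the study.

```python
import math


def c_statistic(outcomes, predicted):
    """Share of (death, survivor) pairs in which the death has the higher
    predicted probability; tied pairs count as one half."""
    deaths = [p for y, p in zip(outcomes, predicted) if y == 1]
    survivors = [p for y, p in zip(outcomes, predicted) if y == 0]
    concordant = 0.0
    for d in deaths:
        for s in survivors:
            if d > s:
                concordant += 1.0
            elif d == s:
                concordant += 0.5
    return concordant / (len(deaths) * len(survivors))


def log_likelihood(outcomes, predicted):
    """Bernoulli log-likelihood of the outcomes under the predictions."""
    return sum(math.log(p) if y == 1 else math.log(1.0 - p)
               for y, p in zip(outcomes, predicted))


def nagelkerke_r2(outcomes, predicted):
    """Cox-Snell R-squared rescaled so that its maximum is 1."""
    n = len(outcomes)
    base_rate = sum(outcomes) / n
    ll_null = log_likelihood(outcomes, [base_rate] * n)
    ll_model = log_likelihood(outcomes, predicted)
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell
```

A model that predicts the overall death rate for every patient yields a Nagelkerke R² of 0 and a C statistic of 0.5 under these definitions.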
Model Validation
The 3 hierarchical generalized linear models were externally validated using an independent data set, the 2008 NIS. Development of the patient population and analyses were conducted as otherwise described above.
| Results |
|---|
Odds ratios and 95% confidence intervals for the effects of volume estimated in each of the 3 models are presented in Table 4. In model 3, the lowest volume quintile (quintile 1) is associated with an over 1,200% increase in mortality compared with the highest volume quintile (quintile 5). However, the confidence intervals widely cross 1, demonstrating that this is not statistically significant. Odds ratios were not calculated for the volume measure from the spline regression model (model 2), which represents volume as a combination of linear and nonlinear components.
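The relationship between a large point estimate and a confidence interval that crosses 1 can be illustrated with a short Python sketch. The coefficient and standard error in the usage note are hypothetical values chosen for illustration and are not taken from Table 4; the function names are also assumptions. An odds ratio of 13 corresponds to a 1,200% increase in the odds of death.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval


def odds_ratio_with_ci(beta, se):
    """Odds ratio and Wald 95% confidence limits from a logistic
    regression coefficient and its standard error."""
    return (math.exp(beta),
            math.exp(beta - Z_95 * se),
            math.exp(beta + Z_95 * se))


def percent_change_in_odds(odds_ratio):
    """Percent change in the odds implied by an odds ratio."""
    return (odds_ratio - 1.0) * 100.0
```

For instance, `odds_ratio_with_ci(math.log(13.0), 1.5)` returns an odds ratio of 13 whose lower confidence limit falls below 1 and whose upper limit is far above it, so a large estimate of this kind can still fail to reach statistical significance when its standard error is large.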
| Comment |
|---|
The method used to measure the effect of volume can have an important effect on the observed association. A common practice in volume-outcome studies is to divide volume into a series of percentile categories (quartiles, quintiles, etc) [20]. The conversion of continuous data into categoric data results in the loss of information, and thereby reduces the model's power to detect differences. Most of the research on the volume-outcome relationship in esophageal cancer resection has used this approach in an effort to address nonlinearity and to identify potential thresholds for changes in mortality risk [3, 5–7].
This study used hierarchical generalized linear models to test the significance of the relationship between hospital volume and inpatient mortality. This modeling technique accounts for correlated outcomes within hospitals and adjusts for potentially overdispersed variance estimates. It is imperative that volume-outcome studies use hierarchical modeling, including hospitals as random effects in the models, to allow the relationship between volume and inpatient death to be different across hospitals [11, 27]. Other recent studies using hierarchical modeling have shown no significant volume-outcome relationship for bariatric surgery and lung cancer resection [18, 31].
The volume-outcome relationship in esophageal cancer resection was recently studied using a detailed clinical database from the Society of Thoracic Surgeons [32]. Esophagectomy volume ranged from 1 to 83, with the majority of hospitals performing fewer than 10 procedures per year. This study found no association between esophagectomy volume and a composite outcome of morbidity and mortality. It concluded that volume is an inadequate proxy for quality assessment after esophagectomy. The Society of Thoracic Surgeons Database was also used to evaluate hospital performance variation in lung cancer resection [33]. This study reports on over 18,000 lung cancer resections performed at 111 hospitals and found that there were statistically significant differences between the best and worst performing hospitals. However, the predictors of improved hospital performance were poorly understood. Another recent volume-outcome analysis in thoracic surgery reported on survival differences after lung transplantation [29]. This study identified a weak volume-outcome relationship but significant variability in hospital performance remained after controlling for procedure volume. The authors concluded that further exploration of the causes of hospital variation is warranted because presumed predictors of outcome do not explain large amounts of the variability in hospital performance.
All studies using administrative databases are limited by the potential for imprecision, inaccuracy, and potential bias in how diagnoses are recorded [26, 34]. These limitations apply to our findings using the HCUP NIS. However, the HCUP NIS is the most comprehensive and inclusive collection of hospitalization data available for the US population. Our models demonstrate that patient age and comorbidities are the most important predictors of outcome. Another potential limitation of this study is that we investigate hospital procedure volume rather than surgeon volume [35]. Unfortunately, the HCUP NIS database does not include uniformly collected identifiers of individual surgeons. However, even though analyses of surgeon volume might be preferable, the study of hospital volume effects is relevant to current policy as the Leapfrog Group criteria for selective referral to high-volume centers refers only to hospital and not surgeon volume [36]. Although this study demonstrates no significant relationship between hospital esophageal cancer resection volume and in-hospital mortality, it does not address other important clinical outcomes such as 30-day mortality, inpatient resource allocation, hospital readmission, or the impact of transferring sick postoperative patients from one hospital to another.
Esophageal cancer resection is a fairly high-risk procedure, with an in-hospital mortality of 4.29% in this dataset. Many surgeons, patients, and policy makers believe that the risk of mortality with complicated surgical procedures decreases with increased hospital and surgeon volume. However, commonly referenced studies examining the volume-outcome relationship have measured the effects of volume using arbitrary categorization and without accounting for correlated outcomes within hospitals [6, 35, 37]. Hospital surgical volume is only a proxy for the multidimensional concept of institutional experience [38]. There are unmeasured, unknown factors at a hospital level that have a greater influence on mortality than volume [29].
This study demonstrates that the most important predictors of mortality after esophageal cancer resection are a patient's age and comorbid disease, and that there is not a significant association between hospital procedure volume and in-hospital mortality. Importantly, this study measures volume in 3 different ways to compare what is commonly done in the literature with more accurate statistical techniques such as spline regression. Our findings demonstrate that the use of volume as a proxy measure for quality is problematic. It is imperative that further, well-designed research be conducted to determine the strength and validity of the volume-outcome relationship in esophagectomy and other reported high-risk procedures.
| Discussion |
|---|
And secondly, if hospital volume does not influence mortality after esophagectomy, then what, aside from individual patient comorbidities, does? How do we investigate the effect of other variables such as hospital size, teaching status, surgeon specialty and surgeon volume to identify significant predictors of morbidity and mortality after esophageal resection?
Thank you for the opportunity to review this paper.
DR KOZOWER: Thank you. We modeled in-hospital mortality after esophagectomy and used three separate models to evaluate three alternative techniques for measuring the impact of hospital procedure volume. Volume was not a predictor of mortality in any of them and in fact, model performance was identical whether hospital procedure volume was included or not. This demonstrates that hospital procedure volume is not an important independent predictor of in-hospital mortality and should not be used as a proxy measure of hospital quality. This finding, using sound statistical methodology, is contrary to the majority of volume-outcome studies examining predictors of mortality after esophagectomy.
Your second question is much more complicated and I wish I knew the answer. If procedure volume isn't the answer, then what variables are truly important in understanding hospital performance variation? We know from The Society of Thoracic Surgeons Database and from administrative databases, that there is variation in hospital performance. However, our current models only explain roughly half of this variation. Future research will likely combine quantitative and qualitative methodology to help explain what factors are truly important in understanding quality.
DR RISHINDRA REDDY (Ann Arbor, MI): Dr Kozower, I would like to thank you for your talk. It was very interesting. Do you have any concerns with regards to published articles on other frequently performed surgeries that comment on volume, such as Whipple procedures in general surgery or lobectomies in thoracic surgery?
DR KOZOWER: We have studied the volume-outcome relationship after lung cancer resection and the results are similar. Hospital procedure volume is not an independent predictor of mortality. Currently, we are investigating other procedures, including pancreatectomy, where the Agency for Healthcare Research and Quality recommends a minimum volume threshold.
DR ARJUN PENNATHUR (Pittsburgh, PA): One of the issues regarding the analysis of the relationship between the hospital volume and mortality is the risk adjustment of the patient. So, for example, a peripheral low volume hospital might refer a complex high-risk patient who requires esophageal resection to a high volume hospital; and because the risks of the patients are higher, in that particular high volume hospital, you may not be able to see the difference of what the impact of higher volume is on outcome. So is there any way to risk adjust this population using the database which you used and did you do this risk adjustment?
DR KOZOWER: Yes, our analyses are risk adjusted for age, sex and comorbidity. The NIS database includes the Elixhauser risk adjustment technique, which uses 30 different comorbidity variables. Research has shown that it is the preferred method for risk adjusting administrative data. In this study, the major predictors of mortality after esophagectomy were a patient's age and comorbidities.
DR SHANDA HALEY BLACKMON (Houston, TX): That was a great talk. One of the weaknesses of this study might be looking at mortality as an end point. In my particular practice, a good bit of the esophageal volume we see is patients who had their initial esophageal cancer resection done at a low volume center, and then when they have a complication, they are transferred into our facility. Did your study look at that, because I think that might be a better marker for the overall outcomes. Those patients who have had a leak for several days, are septic, get transferred into our facility, and if they die in our facility, is that counted as hospital mortality? Who is that death attributed to? The tertiary hospital managing complications?
DR KOZOWER: Your point is quite valid, and mortality is only one outcome that can be examined. It is a difficult thing to look at in the NIS database which is compiled from hospital discharge abstracts and does not include 30-day mortality, readmission data or other utilization data such as ICU details. However, this study uses in-hospital mortality as the primary outcome. Therefore, if a patient was transferred from one hospital, where they had their esophagectomy, to your hospital, where they had their complications managed, you would not be "penalized" if the patient died because you did not perform the procedure and there would not be a procedure code for esophagectomy at your hospital.
DR BLACKMON: So it might be interesting to try to find a way to look at that, because your study misses what I think is a very important piece of the puzzle.
DR KOZOWER: Understanding these types of scenarios would require a longitudinal database such as MEDPAR or the linked SEER-Medicare database. I agree that these are important issues and we are currently trying to understand these more complicated outcomes.
DR BILL PUTNAM (Nashville, TN): From your presentation, it appeared that one institution performed 120 operations. Is that correct?
DR KOZOWER: No, one hospital performed 120 procedures, but it's not even close to half of the procedures performed which were over 6,000.
DR PUTNAM: That seems to be larger than most practices. Was that excluded as an outlier and the remaining centers examined for the effects of volume on outcome?
DR KOZOWER: No, you wouldn't want to exclude the highest volume center; you would want to include all of the data. The spline regression evaluates whether or not a threshold value exists for the relationship between hospital procedure volume and mortality.
In terms of the insurance agencies such as Leapfrog that are trying to establish volume based referral strategies, I think that is exactly what they want. They would prefer to have a few centers performing huge procedure volumes, and there is no data that that would actually improve outcomes. In addition, there is no data that such a policy would lower cost. There is data that it would increase the distance patients need to travel for their care.
DR M. BLAIR MARSHALL (Washington, DC): Benjamin, I really enjoyed that study. Could you help me understand the limitations of eight to 120 esophagectomies as high volume? Are eight esophagectomies actually representative of a high volume center? Would there be any benefit evaluating multiple years of accumulated data to provide additional clarity?
DR KOZOWER: Those are two good questions. First, I agree that eight to 120 esophagectomies aren't the same and that eight really isn't high volume. However, it is completely arbitrary. We can divide the volume of procedures into deciles, quintiles, quartiles, etc, and each of these arbitrary categorizations would give us a different "high volume number." One of the points I want to make is that the categorization of a continuous variable such as volume into arbitrary groups is an error in itself. We only included the categorization of volume into quintiles as one of the models in our study because that's what has been done for the last 30 years. We tried to identify a threshold value for hospital procedure volume but one does not exist for this dataset.
Your second question asks about combining multiple years of the NIS. The analyses were performed using data only from 2007. We did not combine data from different years because of how the weights are estimated in the NIS dataset. The same facilities are not used in every year of data. Therefore, the reliability estimates for the weighted volume estimates would be different for hospitals that are represented in every year compared with hospitals that are used in several years or only one year. This is a meaningful problem because there are some hospitals with a very limited number of lung cancer cases, and volume is the key analysis variable.
DR MARSHALL: Thank you.
DR STEPHEN C. YANG (Baltimore, MD): Ben, again, a great talk. We heard some great presentations yesterday about minimally invasive esophagectomy, how it is safe, probably better than open, anyone can do it. So are you saying now that MIEs can be done in any hospital irrespective of the volume?
DR KOZOWER: What I am saying is that hospital procedure volume should not dictate patient referrals. If I have to have an esophagectomy tomorrow, am I going to a hospital that does one per year? I am not. But I am not saying that we should have a policy that uses this as a proxy for quality, because the data don't support it. We need to focus our effort on what factors really explain hospital performance variation. We also need to focus on better ways of looking at our surgical outcomes. The government and insurance carriers are going to help us do this.
DR BLACKMON: So just on that same line, you would go to a hospital that does eight a year?
DR KOZOWER: I'm sorry?
DR BLACKMON: So you would go to a hospital that does eight a year?
DR KOZOWER: I would go to a hospital that I felt had the best outcomes after esophagectomy. I wouldn't use an arbitrary volume threshold.
DR FRANK A. BACIEWICZ, JR (Detroit, MI): I enjoyed your presentation. I may have missed this with all the data, but even at our institution, we have general surgeons, we have thoracic surgeons doing esophagectomies with multiple different techniques from robot to open. My question was, did you look at whether thoracic surgeons are doing the procedure or whether general surgeons are doing it, because you may think that a place doing one or two a year, the general surgeons may be doing it there.
DR KOZOWER: Specialty training is not included in the NIS so that isn't something we examined.
DR BACIEWICZ: Because it has been looked at for lung resections and it is very significant there and I would predict it would be very significant here.
DR ROBERT J. CERFOLIO (Birmingham, AL): My point again is similar to what I said earlier. When you start telling me that your conclusions are things that don't make any sense you have to question the methods of the study first and the reliability of the data. I am all for questioning dogma. I have written a lot of studies that do that but the data is clean and reliable. Why should we treat my patients different than you treat yourself? You said yourself you wouldn't go to a low volume center. So I am going to pretend my patients are as important to me as you are to you. We know that by doing something more and more you get better, plain and simple. It makes sense. We know that if one surgeon does something 100 times a year with his team, they are going to be better than somebody who does it once a year. It doesn't make sense to come up here and tell me it is not true.
So if that's the case, you have got to question the data and the database. You have told me it doesn't allow you to tell who the surgeon is or how many the surgeon did, and thus there is the flaw in the methods and thus the conclusion. It comes back to the reliability of the data. And so I invite you, when we are going to talk about things, we need strong, reliable data. And it's good to question things.
DR KOZOWER: I agree that it's always good to question the data but I don't agree with your conclusion that it makes no sense. The NIS is the largest all payer database in the US and provides the best database to study what is actually happening in the country. In addition, our conclusion that esophagectomy volume is not a predictor of mortality after esophagectomy agrees with the recent paper from The Society of Thoracic Surgeons Database, the largest and most comprehensive clinical thoracic surgery database in the US. They also concluded that volume should not be used as a proxy for quality. Currently, both the clinical and administrative data don't support an esophagectomy volume threshold.
DR CERFOLIO: But that is institution, not surgeon. And so when you start talking about institution and you don't know how many the surgeon did and if it is the same team taking care of him, it is the reliability of the data. If you are able to say it was that surgeon at one hospital with the same team, you know, it makes sense to you, they are going to get better the more they do.
DR FREDERICK GROVER (Denver, CO): This has been a great discussion, what we like to see in the Southern Thoracic, and I can't resist jumping in. I am not taking a stand on the volume issue except to say we have always advocated that it is the outcomes that are important. Measuring the outcome, whatever outcomes you want to measure, is the gold standard, not volume. If a low volume center has excellent outcomes, in my opinion this is acceptable. However, if a high volume center has poor outcomes, that is not acceptable. One reason that volume is advocated by some is that it is a cheap way to compare programs or to direct referrals and funds, but in the absence of outcomes data, in my opinion is inferior. Although I think volume is important to record, it does not necessarily reflect outcome and quality and should not be used for quality assessment without outcomes.