Ann Thorac Surg 2006;82:1140-1146
© 2006 The Society of Thoracic Surgeons
a International Center for Health Outcomes and Innovation Research, Columbia University, New York, New York
b Department of Surgery, Columbia University, New York, New York
Accepted for publication May 22, 2006.
* Address correspondence to Dr Gelijns, Columbia University, InCHOIR, 600 W 168 St, New York, NY 10032 (Email: acp10{at}columbia.edu).
| Abstract |
|---|
|
|
|---|
| Introduction |
|---|
|
|
|---|
Left ventricular assist devices (LVAD) were approved as a bridge to transplantation therapy in 1998, and in 2001, they received FDA approval for long-term implantation in patients with advanced heart failure who were ineligible for cardiac transplantation (destination therapy). The trials supporting approval demonstrated the value, and potential, of LVAD therapy, but also highlighted the need for improved devices that would address the significant side effects such as bleeding, sepsis, and neurologic events associated with this therapy [1, 2]. Several newer-generation LVADs, which may address these shortcomings, have entered, or are poised to enter, clinical trials in this country. An effective development process, however, requires an adequate translational research infrastructure of patients, investigators, and financial resources. Yet, herein lies the problem. Bridge to transplantation is an orphan indication, and the patient population that is currently being referred to LVAD therapy for long-term implantation is severely limited, despite the significant number of end-stage heart failure patients who could, in principle, be eligible for treatment. As a result, the increasing numbers of devices entering clinical trials have to compete for limited patient numbers. This situation severely hampers the timely completion of clinical trials in this field, and more seriously, decreases the incentives to innovate altogether, in an area where innovation is sorely needed.
In this paper, we will explore the pros and cons of a range of clinical trial designs, particularly in terms of their sample sizes and the strength of the resulting evidence. Solutions that contend with small patient populations, however, are not only statistical in nature. We will, therefore, also review the policy options for encouraging alternative trial designs. A unique policy initiative has been the creation of a registry for collecting long-term effectiveness and safety data on LVAD therapy (Interagency Registry of Mechanically Assisted Circulatory Support [INTERMACS]) by the National Institutes of Health (NIH), the FDA, and the Centers for Medicare and Medicaid Services (CMS). As such, we will discuss what opportunities exist for shifting the balance between premarketing and postmarketing studies in providing evidence about the value of different LVADs. First, however, we will set the stage by clarifying the clinical need for improved assist devices and the size of the heart failure patient populations available for clinical trials.
| Need for Improved LVAD Therapy and Realities of Patient Recruitment |
|---|
|
|
|---|
In destination therapy, the need for improvement in LVAD therapy is even more salient as the intervention is intended to be long term. The Randomized Evaluation of Mechanical Assistance in the Treatment of Congestive Heart Failure (REMATCH) trial showed a statistically significant survival benefit (48% risk reduction in all-cause mortality; p = 0.001) and quality of life improvement of LVADs over optimal medical management [2]. Recently, all patients completed their 2-year follow-up, and the survival benefit persisted: 30% for LVADs versus 13% for optimal medical management at 2 years (p < 0.05). Despite the impressive mortality difference between arms, survival on LVAD therapy still needs to improve, and the same holds for morbidity. Left ventricular assist device patients experienced more than twice as many serious adverse events per year as optimal medical management patients (ratio 2.31 [1.87; 2.85]). Beyond the most common problems encountered in the bridge population, device reliability and the need for LVAD replacement were found to be major issues. These events were costly in economic terms as well. The median cost of the implant hospitalization for LVAD patients in the REMATCH trial was $138,000 [4]. If patients experienced bleeding, internal pump infections, and sepsis during their implant hospitalization, the expected cost of their hospital stay would increase to nearly $900,000 [4].
Newer generation devices, which use continuous flow technology and, therefore, are smaller in size, may address these shortcomings. Their size allows for a simpler implantation procedure, which could reduce surgical morbidity, and may facilitate use in smaller patients who would otherwise not be candidates for implantable LVAD support. In addition, these devices may be more reliable, which would avert morbidity and mortality from device replacement.
The available population for evaluating these newer devices is limited, however. Bridge to transplant is an orphan indication, with, until recently, a relatively constant implantation rate of 500 patients annually, although newer estimates suggest an increase in the number of implantations. Whereas destination therapy is potentially a much larger indication, with candidacy estimates ranging from 20,000 to 60,000 patients a year, fewer than 300 patients were implanted over the last 2 years [5]. These patient numbers raise fundamental concerns about how to best design LVAD trials, especially for destination therapy patients. As experimental LVADs typically start testing in a bridge population, and then in a destination therapy population, we will review the current situation for BTT trials first. There are distinct advantages in starting with a BTT trial in that this offers clinical centers an opportunity to gain experience with a new device while there is still a safety net for patients (ie, transplantation).
| Bridge to Transplantation Trials: The Case for Single-Arm Trials |
|---|
|
|
|---|
) of 70% and that the experimental device improves mortality by an absolute margin of 5%. Designing such a trial would require more than 2,500 patients to ensure that a two-sided test has 80% power to reject the null hypothesis. (This is a test of H0: survival to transplantation = 70% [null hypothesis] versus H1: survival to transplantation = 75% [alternative hypothesis].) Table 1 depicts sample size calculations under various assumptions about the treatment benefit of the experimental device. Only when a new device has an unrealistically high expected survival to transplantation of 90% will the sample size come into a feasible range of about 125 patients (2-year anticipated accrual time).
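The order of magnitude of these sample sizes can be checked with a textbook normal-approximation formula for comparing two proportions. The sketch below is illustrative only: the source does not state which formula or software was used, so the pooled-variance formula, the 1:1 allocation, and the function name `two_arm_total_n` are assumptions. Under those assumptions the totals land close to the figures quoted in the text.

```python
from math import ceil, sqrt
from statistics import NormalDist


def two_arm_total_n(p_control, p_experimental, alpha=0.05, power=0.80):
    """Total patients (two arms, 1:1) for a two-sided test of two proportions.

    Assumed formula: normal approximation with pooled variance under the
    null hypothesis and unpooled variance under the alternative.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_experimental) / 2
    sd_null = sqrt(2 * p_bar * (1 - p_bar))
    sd_alt = sqrt(p_control * (1 - p_control)
                  + p_experimental * (1 - p_experimental))
    delta = abs(p_experimental - p_control)
    n_per_arm = ((z_alpha * sd_null + z_beta * sd_alt) / delta) ** 2
    return 2 * ceil(n_per_arm)


# Control survival to transplantation fixed at 70%, as in the text.
for p_new in (0.75, 0.80, 0.85, 0.90):
    total = two_arm_total_n(0.70, p_new)
    print(f"70% vs {p_new:.0%}: about {total} patients in total")
```

The required total falls steeply as the assumed benefit grows, which is why only an implausibly large improvement brings a randomized BTT trial within reach.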
While this approach deals with the small size of the BTT population, the absence of a concurrent control group poses problems for data interpretation. That becomes particularly problematic in an area with active therapeutic innovation, ongoing changes in patient selection, and evolution of trial conduct. One major challenge, as discussed, is that the transplantation population has changed since the original premarketing trials; current patients may not be comparable in their risk profile to the patients from the original studies. Moreover, clinical management of LVAD patients has evolved, affecting the survival rates and adverse event profiles of patients receiving the current devices as compared with historical controls. Another issue is the changing waiting times for donor hearts. In terms of trial conduct, adverse event definitions have evolved, trials are now independently adjudicated (which was not the case for the pivotal trials of the currently marketed devices), and the newer devices are, therefore, held to stricter standards. Finally, while this design may offer precision in its estimates of treatment effect (eg, narrow confidence intervals), those estimates are potentially biased by an unquantifiable amount due to unmeasured differences in patient comparison groups.
Many of these problems can be addressed through the implementation of the INTERMACS registry, which for commercially available devices will provide concurrent benchmarks of the baseline characteristics of BTT patients, their survival to transplant rates and waiting times, and adverse event rates using standardized definitions. This dataset could also provide a comparison group, which through modeling, could be appropriately adjusted for risk factors.
| Long-Term LVAD Therapy Trials: Challenges and Design Options |
|---|
|
|
|---|
Superiority Trials
These considerations suggest that traditional superiority trials are not feasible because of sample size issues, as in BTT trials. Assume, for example, that 2-year survival in the control arm (the marketed device) is 45%. This is not an unrealistic assumption, as survival was nearly 40% for those patients enrolled in the second half of the REMATCH trial, and experienced LVAD centers are likely to have improved their outcomes with ongoing learning in the postmarketing setting [6]. If we then assume that a new device decreases mortality by a relative 10%, namely, the hazard ratio (or instantaneous relative risk of death) = 0.90, a sample size of nearly 4,500 patients is required to ensure that a two-sided test has 80% power to reject the null hypothesis. (This will test H0: hazard ratio = 1.0 [the null hypothesis] versus H1: hazard ratio = 0.90 [the alternative hypothesis].) Table 2 shows sample size requirements under various assumptions for the mortality benefit of the experimental device, assuming 30 months of accrual and 18 months of follow-up. Even if a new device decreases mortality by a relative 30%, which would bring the 2-year survival of its recipients close to 60%, the sample size of a well-powered trial would still need to be more than 400 patients.
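A rough check of these numbers is possible with Schoenfeld's approximation for the number of deaths needed in a log-rank comparison, combined with an estimate of how many enrolled patients will have died by the end of the study. The sketch below is not the authors' calculation: exponential survival, uniform accrual, 1:1 allocation, and the function names are all assumptions made for illustration. Under them, the totals come out near the figures quoted above.

```python
from math import ceil, exp, log
from statistics import NormalDist


def prob_event(hazard, accrual, follow_up):
    """Probability that an enrolled patient dies before the study ends.

    Assumes exponential survival and uniform accrual, so individual
    follow-up ranges from `follow_up` to `accrual + follow_up` months.
    """
    mean_survival = (exp(-hazard * follow_up)
                     - exp(-hazard * (accrual + follow_up))) / (hazard * accrual)
    return 1 - mean_survival


def superiority_total_n(hazard_ratio, control_survival_2y=0.45,
                        accrual=30, follow_up=18, alpha=0.05, power=0.80):
    """Total patients for a two-sided superiority test on the hazard ratio."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    deaths_needed = 4 * z ** 2 / log(hazard_ratio) ** 2  # Schoenfeld, 1:1
    hazard_control = -log(control_survival_2y) / 24      # per month
    hazard_new = hazard_ratio * hazard_control
    p_death = (prob_event(hazard_control, accrual, follow_up)
               + prob_event(hazard_new, accrual, follow_up)) / 2
    return ceil(deaths_needed / p_death)


for hr in (0.90, 0.80, 0.70):
    survival_new = 0.45 ** hr  # implied 2-year survival on the new device
    print(f"HR {hr:.2f}: 2-year survival {survival_new:.0%}, "
          f"about {superiority_total_n(hr)} patients")
```

Because the required number of deaths scales with the inverse square of the log hazard ratio, modest benefits are what drive the sample size into the thousands.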
Composite Endpoints
Another option is to use a composite endpoint, which may highlight the differences between comparison devices by combining survival with important adverse events for patients, such as stroke or device replacements [7]. Such an endpoint may increase the difference in event rates, and, consequently, statistical power. A case in point is device-replacement-free survival. In the REMATCH trial, for example, LVAD patients had a 65% 2-year probability of replacement [8]. If a new device improved device reliability by 30% to 50%, one would have a good chance of showing superiority on the basis of device-replacement-free survival with about 300 or 200 patients, respectively.
However, there are disadvantages to composite endpoints. First, with newly emerging devices, there may be considerable uncertainty about their long-term impact on particular adverse events, such as device reliability. Thus, this strategy is risky. Moreover, trial results can be difficult to interpret when the various individual component endpoints are not consistently superior. For example, if the device with the best replacement-free survival offered a better device failure rate but worse overall survival, we might be hard-pressed to call it a winner. In such circumstances, the relative importance of the component events would need to be factored in, which is a difficult process.
Realistic sample size estimates of trials using noninferiority designs or composite endpoints still require the enrollment of roughly 200 to 300 patients. Currently, the available destination therapy clinical trials population is 200 patients per year, and if we assume five competing devices in trials, then enrollment alone would take between 5 and 7.5 years. If we include a 2-year follow-up time, these trials could run up to 10 years. The length of time involved in completing such trials raises serious concerns about the relevance of the results, and economic feasibility.
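The enrollment arithmetic in this paragraph can be laid out explicitly. The sketch below assumes, as the text implies, that the available patients are split evenly across the competing trials; the function name `enrollment_years` is hypothetical.

```python
def enrollment_years(sample_size, patients_per_year=200, competing_trials=5):
    """Years needed to enroll one trial when patients are shared evenly."""
    per_trial_rate = patients_per_year / competing_trials
    return sample_size / per_trial_rate


FOLLOW_UP_YEARS = 2

for n in (200, 300):
    accrual = enrollment_years(n)
    print(f"{n} patients: {accrual} years of enrollment, "
          f"{accrual + FOLLOW_UP_YEARS} years including follow-up")
```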
Performance Goal Design or Postmarketing Studies Only
One option is to eliminate the randomized design and opt for a performance goal-type study, as used in BTT trials. In destination therapy, however, there is limited clinical experience on which to base a performance goal. There is only one premarketing trial (REMATCH) and the postmarketing experience with the HeartMate VE is based on some 250 patients. Over time, with the implementation of the INTERMACS registry, the destination therapy experience should broaden, and this approach may become feasible. Moreover, if the registry captures longer BTT implantation times, the data may provide greater insight into the long-term adverse events seen in destination therapy patients. The other option is to allow a new device on the market after it has satisfactorily completed the evaluation for bridging to transplantation by providing conditional approval, and requiring data on the destination therapy indication to be collected in the postmarketing setting using the registry (with mandatory interim analyses to be submitted to the FDA). The disadvantage of this option is a limited observation period in the BTT trial (only short-term data on survival and adverse events for the experimental device) being the basis for approval for long-term use. Moreover, the acceptability of this strategy will depend on being able to show that the bridge and destination therapy populations are similar in their response to treatment.
In short, traditional randomized trials using customary levels of statistical precision still require too many patients, and the nonrandomized options require a leap of faith to allow widespread use of a new device in a destination population with very limited data on which to base the decision.
| Small Randomized Destination Therapy Trials Augmented by BTT Data: A Possible Solution? |
|---|
|
|
|---|
= 5%) for a test of noninferiority are possible if the experimental device offers a greater degree of survival improvement. For example, a noninferiority trial with 152 patients would be possible if the experimental device improves 2-year survival to 50% (noninferiority margin 15%; power 80%; α = 5%). Smaller noninferiority margins are also an option, if the experimental device improves survival by a small amount compared with the predicate device (Table 4).
The level of imprecision of the estimated treatment difference in these types of trials is larger than is customary in pivotal trials. However, this approach provides an advantage over a performance goal design in that we obtain an unbiased estimate of treatment effect owing to randomization, and can quantify the remaining random variation. If one remains concerned about an increased type I error rate, one could require evidence of additional endpoints (eg, device replacement, stroke) demonstrating device effectiveness. For example, one might require demonstrating noninferiority with regard to mortality as described above and also require demonstrating superiority on one or more other endpoints with a customary type I error rate. According to Capizzi and Zhang [10], this approach could maintain the overall type I error rate at 5%, as long as at least one endpoint is significant at the 5% level and the others trend in the same direction and are significant at the 20% level at most. Neuhauser and colleagues [11] suggested a modification of this approach, essentially arguing for a minor downward adjustment of the alpha level to account for the multiplicity of tests performed.
In addition, there are two ways that the data of destination therapy trials can be augmented if the results appear to be promising. The first is to combine the destination therapy data with results from the BTT trials. This scenario is consistent with the growing awareness that the distinction between populations defined at the time of implant as either BTT or destination therapy is artificially imposed, and is becoming clinically obsolete [12]. The data then could be pooled using an appropriate multilevel model or estimates could be synthesized at discrete time points (eg, 1 year) as in a meta-analysis. The second approach is to rely on postmarketing studies to substantiate data from the premarketing trials.
| Policy Options for LVAD Trials: Perfect is the Enemy of Good |
|---|
|
|
|---|
One could, of course, counter this argument by suggesting that the solution to the problem is not one of clinical trials design, but rather one of having better devices emerge from the research and development process. This point of view neglects the reality of the innovation process: much innovation does not occur in the laboratory or with the device alone, but rather at the bedside. Substantial learning occurs in clinical practice as a result of better selection of patients and better management of their care, as well as feedback to manufacturers that may result in device improvements. Many of these changes are incremental, but over time may lead to major advances in outcomes. This learning process was already demonstrated in the REMATCH trial, where in the second half of the trial survival and adverse events improved significantly over the first half [6]. This means that if we are going to make progress in this area we are going to need to allow novel devices, which have met a premarketing threshold of efficacy and safety, into more widespread clinical practice within a realistic time period (eg, 3 to 4 years) so that further improvements can occur.
The crucial questions then are: What kinds of trial designs provide acceptable evidence that a new device meets the premarketing efficacy and safety threshold? And what are the policy implications for the stakeholders involved? In this paper, we discuss a range of premarketing clinical trial options. We conclude by opting for a smaller randomized trial in the destination therapy area, which would preserve the advantages of randomization, but would allow for a shorter, and more feasible, enrollment period. At the same time, we advocate more rigorous in-vitro device reliability testing [13] and a shift toward a heavier reliance on postmarketing studies to provide long-term data on LVADs.
The FDA will play a critical role in determining the acceptability of this approach. The FDA always has to balance the risks of letting new therapies on the market earlier versus the benefits of giving patients more rapid access. These benefits are obvious in a life-threatening condition with few treatment alternatives. Moreover, this approach could enhance innovation in an industry that is characterized by small start-up firms. The risk in permitting smaller trials as a basis for premarketing approval is that it increases the probability of a type I error; namely, accepting a new device that may be inferior to the predicate device. However, in comparison with the way that patients are usually treated, the proposed noninferiority trial design does not expose patients to much risk. If, as discussed, the observed 2-year survival in the control arm is 45%, then the experimental device survival must be at least 38% in order to claim noninferiority, with a 15% noninferiority margin. That is substantially higher than the 13% point estimate for 2-year survival of optimal medical management patients in REMATCH [6], a survival estimate that still holds because no major changes in medical therapy have occurred lately. This trial design, then, protects against approving a device that would not offer a survival benefit compared with medical therapy, which is what most patients receive.
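The 38% threshold is consistent with reading the 15% noninferiority margin as relative to control-arm survival; that reading is an inference from the numbers, not something the text states. The sketch below, with a hypothetical helper `noninferiority_floor`, computes the threshold under both a relative and an absolute reading and compares each with the 13% REMATCH estimate for optimal medical management.

```python
def noninferiority_floor(control_survival, margin, relative=True):
    """Survival threshold implied by a noninferiority margin.

    relative=True treats the margin as a fraction of control survival;
    relative=False treats it as an absolute difference in survival.
    """
    if relative:
        return control_survival * (1 - margin)
    return control_survival - margin


CONTROL_SURVIVAL = 0.45    # assumed 2-year survival on the predicate device
MEDICAL_MANAGEMENT = 0.13  # REMATCH 2-year point estimate quoted in the text

floor_relative = noninferiority_floor(CONTROL_SURVIVAL, 0.15, relative=True)
floor_absolute = noninferiority_floor(CONTROL_SURVIVAL, 0.15, relative=False)

print(f"relative margin: threshold {floor_relative:.1%}")
print(f"absolute margin: threshold {floor_absolute:.1%}")
print("above medical management in both readings:",
      min(floor_relative, floor_absolute) > MEDICAL_MANAGEMENT)
```

Under either reading the threshold sits well above the medical-management estimate, which is the point the paragraph relies on.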
The acceleration of premarketing studies, however, needs to be balanced by more rigorous postmarketing studies. The new NIH-CMS-FDA supported INTERMACS registry provides a robust infrastructure to monitor the long-term performance of LVADs. In comparison with other registry initiatives where compliance can be an issue, the LVAD registry is likely to capture most patients as participation in data collection will be a prerequisite for CMS reimbursement to the clinical centers. Currently, as the resulting information is in many respects a public good, the registry is supported by federal agencies. In the future, the support structure should evolve into a public-private partnership, including industry (who benefit from accelerated premarketing trials) as well as federal agencies, such as CMS, and private payers, who can monitor quality for their beneficiaries.
In conclusion, the need to obtain better knowledge about the clinical outcomes of device therapies is a general one. The LVAD case, however, offers unique challenges in that currently small patient populations complicate the clinical trials process, and may decrease the incentives to innovate. The case is, therefore, especially strong for the creation of partnerships between clinical investigators, the public sector, and industry to experiment with novel trial designs that would facilitate rigorous and timely premarketing and postmarketing trials. These partnerships would also allow stakeholders to experiment with new institutional models to create an infrastructure for conducting LVAD trials, and with new financial models to help support these studies.
| Acknowledgments |
|---|
|
|
|---|
| References |
|---|
|
|
|---|