Real world evidence is everywhere these days, bolstered primarily by the FDA-motivated discussion of using real world evidence to supplement or completely replace control arms in clinical trials (e.g. synthetic control arms). Inspired by some Twitter discussion, I’m going to talk about potential savings and biases with multiple kinds of controls, methods for limiting biases, and, at the end of the blog, a hope for where we go from here.

Please note I have nothing against real world evidence with proper mitigation of the risks of bias (yes, that’s hard…) but I think we have better long-term solutions which achieve the same aims with less risk. My long-term solution takes a while to implement, so we will need to continue to utilize real world evidence and historical randomized trials in the meantime.

Pros and Cons to historical/external/real world/synthetic controls

Control arms are costly. Removing controls from our current two arm trials with equal allocation can cut patient costs roughly in half. Our concern is the potential for biases that include, but aren’t limited to:

  1. Placebo or assessor biases – everyone knows the treatment assignments in a one arm trial.
  2. Time biases – A control arm today may perform differently than a control arm from the past, and if we don’t see today’s control arm we will never know. Today’s control arm might be better (improving standard of care) or perhaps even worse (increasing antibiotic resistance, for example).
  3. Selection biases – Historical data usually comes from different sites with different inclusion/exclusion criteria. Some of these biases are explicit, others unknown.
  4. Clinical Trial bias – Really this is a selection bias, but I’m going to separate it here. Are patients enrolling in a clinical trial the same as patients that don’t enroll, whether by their choice or a doctor’s choice in suggesting a trial?

Mitigation strategies

We have lots of options to mitigate the risk of biases. All of these methods may be combined.

  1. Make trials more pragmatic – If we worry about the biases associated with the distance between real world data and clinical trials, we could make trials more pragmatic, with inclusion/exclusion criteria reflecting real world patients. If we worry about real world data not being applicable to clinical trials, we should worry about clinical trials not being applicable to the real world.
  2. Propensity scores and similar methods – Some biases result from different covariate makeup in the historical/real world data and the population enrolled in the clinical trial (the clinical trial population is younger, or has more or fewer comorbidities, etc.). Propensity scores and related methods attempt to correct for these imbalances. Synthetic control arms aim to create a “comparable” selection of real-world data to compare against the clinical trial treatment arm. A weakness here is that it is hard to correct for unknown unknowns.
  3. Enrolling some concurrent controls – We will never be able to assess time bias unless we see some concurrent controls in our studies. Randomizing 2:1 or 3:1 or more (active to control) allows us to see some concurrent randomized controls and compare them to any historical data, giving us some hope of detecting time biases and any unknown unknowns we missed in the propensity scores. Usually these concurrent controls are employed with dynamic borrowing, which makes heavier use of the historical data when it agrees with the concurrent controls and down-weights the historical data when it disagrees. Statistically, the risk of bias climbs steeply as the number of concurrent controls approaches zero. Even a small fraction of concurrent controls substantially reduces this risk.
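To make the idea of dynamic borrowing concrete, here is a minimal sketch for a binary endpoint. It is not a recommended method and not taken from any specific trial: the weighting rule (a Gaussian-shaped function of the z-statistic between the two control rates), the Beta(1, 1) prior, and the function names `borrowing_weight` and `control_posterior` are all assumptions chosen for illustration. Real designs typically use hierarchical models or formally calibrated power priors.

```python
import math

def borrowing_weight(x_cc, n_cc, x_hist, n_hist):
    """Illustrative weight in [0, 1]: 1 when the concurrent and historical
    control rates match, shrinking toward 0 as they diverge.
    The exp(-z^2 / 2) form is an assumption made for this sketch."""
    p_cc = x_cc / n_cc
    p_hist = x_hist / n_hist
    p_pool = (x_cc + x_hist) / (n_cc + n_hist)
    # Standard error of the difference in rates under a pooled rate
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_cc + 1 / n_hist))
    if se == 0:
        return 1.0 if p_cc == p_hist else 0.0
    z = (p_cc - p_hist) / se
    return math.exp(-0.5 * z * z)

def control_posterior(x_cc, n_cc, x_hist, n_hist, a0=1.0, b0=1.0):
    """Beta posterior for the control response rate, with the historical
    data discounted by the weight above (a power-prior style discount)."""
    w = borrowing_weight(x_cc, n_cc, x_hist, n_hist)
    a = a0 + x_cc + w * x_hist
    b = b0 + (n_cc - x_cc) + w * (n_hist - x_hist)
    return w, a, b

# Historical data agree with concurrent controls (both 20% response)
w_agree, a_agree, b_agree = control_posterior(10, 50, 40, 200)
# Historical data conflict (50% concurrent vs 20% historical)
w_conflict, a_conflict, b_conflict = control_posterior(25, 50, 40, 200)
print(f"agree:    weight={w_agree:.3f}, posterior mean={a_agree / (a_agree + b_agree):.3f}")
print(f"conflict: weight={w_conflict:.5f}, posterior mean={a_conflict / (a_conflict + b_conflict):.3f}")
```

When the two sources agree, nearly all 200 historical patients count toward the control estimate. When they conflict, the estimate falls back almost entirely on the 50 concurrent controls, which is the behavior the concurrent controls are there to enable.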

My continuum of controls

Different types of controls are more or less susceptible to these biases, and for each of these biases there are potential corrections. Here’s my continuum of controls.

  1. Randomized concurrent controls – The status quo, with no cost savings, but the most straightforward way to avoid biases. We have randomization to avoid placebo and assessor biases, time biases are avoided because they are concurrent, and of course being randomized from the same population we do not have selection biases. 

  2. Contemporary and historical controls in a platform trial – A platform trial continually enrolls patients in multiple experimental arms with a control, randomizing 4:1 or more to active. The larger the platform trial, the more imbalanced the randomization can be while still maintaining a large sample of control patients.

    Suppose arm A is in a platform trial. Controls allocated during the same time frame that A is enrolling are concurrent controls, with all the nice properties of (1) above. Controls allocated similarly in time to A, but not overlapping, are contemporary. Clearly the control patient allocated the day before Arm A started is not that different from the control allocated the day after Arm A started. Historical controls in this context would be controls allocated distant in time from Arm A (and yes, it’s vague where the dividing line is).

    Contemporary and historical controls in platform trials have some important advantages. They potentially suffer from time biases, but we have a crucial tool in that we see the entire time course of the control arm. We can directly model evolution in the control arm performance over time and down-weight the history of the control arm as appropriate. Our mitigation strategies are therefore more powerful. Contemporary controls are also far less susceptible to selection and clinical trial biases, since they are clinical trial patients enrolled in the same network of sites under the same inclusion criteria.

    Second to concurrent controls, contemporary controls present the least biased alternative with powerful mitigation strategies for detecting and reacting to biases should they occur. 

  3. Historical controls from randomized trials – Here I mean controls from past randomized trials. These potentially suffer from all the time and selection biases mentioned above, but they are clinical trial patients. I note with some irony we haven’t really settled the use of these patients in clinical trials, but we have partially leapfrogged this issue to discuss real world evidence, which must be more complex and riskier.

  4. Real world evidence – This is at the bottom not because I think it lacks value, but because I think it can have all the biases of the historical controls above, plus any biases associated with patients outside clinical trials. Again, I think part of the solution here is to make clinical trials look more like the real world.
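The platform trial distinction between concurrent, contemporary, and historical controls in item 2 can be sketched as a weight that decays with distance in time from an arm's enrollment window. This is only an illustration: the exponential decay, the 180-day half-life, and the names `time_gap`, `decay_weight`, and `weighted_control_rate` are assumptions of mine, and a real analysis would more likely fit an explicit model of how the control arm drifts over time.

```python
def time_gap(control_day, arm_start, arm_end):
    """Days between a control's allocation and the arm's enrollment window.
    Returns 0 for concurrent controls (allocated inside the window)."""
    if control_day < arm_start:
        return arm_start - control_day
    if control_day > arm_end:
        return control_day - arm_end
    return 0

def decay_weight(gap_days, half_life_days=180.0):
    """Weight halves for every half_life_days of separation (assumed form)."""
    return 0.5 ** (gap_days / half_life_days)

def weighted_control_rate(controls, arm_start, arm_end, half_life_days=180.0):
    """controls: list of (allocation_day, outcome) pairs, outcome 0 or 1.
    Returns the weighted response rate and the sum of weights, a rough
    count of how many controls' worth of information is being used."""
    num = 0.0
    den = 0.0
    for day, outcome in controls:
        w = decay_weight(time_gap(day, arm_start, arm_end), half_life_days)
        num += w * outcome
        den += w
    return num / den, den

# Arm A enrolls on days 100 to 200; two concurrent controls and one
# control allocated 180 days before the arm opened.
controls = [(150, 1), (150, 0), (-80, 1)]
rate, effective_n = weighted_control_rate(controls, arm_start=100, arm_end=200)
print(f"weighted control rate={rate:.2f}, effective controls={effective_n:.1f}")
```

Under this scheme a control allocated the day before the arm opens is treated almost identically to one allocated the day after, while controls from years earlier contribute very little.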

Hope for the future

I’d like to see us move to a paradigm dominated by platform trials. Imagine in 2050 we had large-scale platform trials that had been running for 25 years over dozens of arms. We might be randomizing 10:1 active to control, which removes 90% of the cost of the control arm relative to 1:1 randomization while significantly reducing the potential for bias and providing us with a toolbox for addressing bias when it occurs. These contemporary controls could easily be supplemented by real world evidence, appropriately adjusted with propensity scores and employing dynamic borrowing. With most of the cost of control eliminated, there is less incentive to risk other potential biases by completely eliminating controls.
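The arithmetic behind the 90% figure is simple enough to check directly. The sketch below assumes a fixed number of patients on active treatment and compares control enrollment at a given active-to-control ratio against 1:1. The helper names `controls_needed` and `control_cost_saved` are hypothetical, and the calculation treats cost as proportional to the number of control patients.

```python
from fractions import Fraction

def controls_needed(n_active, ratio):
    """Controls enrolled alongside n_active patients at ratio:1 (active:control)."""
    return Fraction(n_active, ratio)

def control_cost_saved(ratio):
    """Fraction of control enrollment avoided versus 1:1, for the same n_active."""
    return 1 - Fraction(1, ratio)

for ratio in (1, 2, 4, 10):
    saved = control_cost_saved(ratio)
    print(f"{ratio}:1 -> {float(saved):.0%} of control enrollment avoided, "
          f"{float(controls_needed(1000, ratio)):.0f} controls per 1000 active")
```

This only counts controls against active patients overall. It says nothing about how much information each experimental arm retains, which depends on how many of those controls are concurrent or contemporary to that arm.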