By Shantanu Phatakwala
As artificial intelligence has seeped into nearly every corner of our lives, we’re often amused when the technology falls way off the mark—the recommended show on Netflix you'd never watch, the suggested product in your Facebook feed that you wouldn't buy.
But there are times when these misses are no laughing matter. Such was the case recently, when a team of researchers reported that they had discovered unintentional racial bias in a widely used population health model. Owned by a company with millions of attributed lives, the model consistently underestimated black patients' needs for care management, making them less likely than white patients with equivalent health problems to be offered these resources to improve their health.
These findings should serve as a wake-up call to the health care industry, because the flaw that produced the bias isn't limited to one company. Across the country, health plans and providers use predictive models that rely heavily on past health expenditures to project future spending, rate each patient's level of risk for future health problems and provide additional resources based on those risk scores. However, this cost data doesn't account for the fact that, overall, black patients utilize health services at lower rates than white patients. What results is a double whammy for many patients who desperately need attention—not only do they face greater barriers to care, but those barriers often hide their true level of clinical need and deprive them of care management resources.
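To make the mechanism concrete, here is a minimal, hypothetical sketch in Python. The three patients, the `access` factor and the assumed cost per condition are invented for illustration and do not represent any real model or data. The sketch shows how ranking by past cost can push a patient down the list when barriers to care suppress spending, even though that patient's clinical burden matches another patient's.

```python
from dataclasses import dataclass

@dataclass
class Patient:
    name: str
    chronic_conditions: int  # crude stand-in for true clinical need
    access: float            # 1.0 = no barriers to care; lower = more barriers (assumed)

COST_PER_CONDITION = 5_000   # assumed annual spend per condition with full access

def past_cost(p: Patient) -> float:
    """Observed spending: clinical need scaled down by barriers to care."""
    return p.chronic_conditions * COST_PER_CONDITION * p.access

def rank_by(patients, key):
    """Return patient names ordered from highest to lowest score."""
    return [p.name for p in sorted(patients, key=key, reverse=True)]

patients = [
    Patient("A", chronic_conditions=4, access=1.0),
    Patient("B", chronic_conditions=4, access=0.6),  # same need as A, more barriers
    Patient("C", chronic_conditions=3, access=1.0),
]

cost_ranking = rank_by(patients, past_cost)
need_ranking = rank_by(patients, lambda p: p.chronic_conditions)

print("Ranked by past cost:", cost_ranking)
print("Ranked by need:     ", need_ranking)
```

In this toy setup, patient B has the same number of conditions as patient A but falls below the less-sick patient C once past cost is used as the yardstick.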
So why does the industry keep using these flawed, cost-based models? They have been popular for 30 years because claims data showing past spending are convenient to access and relatively easy to use, and most care management professionals are familiar with them. The output from the models is tailored so that care managers can easily understand it and apply it to patient interactions. They can quickly see how a patient's past costs and diagnosis codes contributed to that patient's overall risk score.
Claims data have well-documented limitations, however. Claims were created for billing, not for measuring or improving outcomes. They contain very limited clinical information, and that information is three to four months old because of the lag involved in submitting, adjudicating and reporting claims. Today, with the adoption of electronic medical records and the digitization of other valuable data, there is a wealth of better information to help make more accurate predictions about patients' health. That said, these data are more difficult to use because they are spread across multiple sources and don't follow a consistent format.
Will the revelations of bias finally spark companies to tackle the shortcomings of cost-based predictive models? When the study was published, the company issued a statement advising that its algorithms should not be taken as a substitute for a physician's expertise and knowledge of their patients. However, it's unrealistic to depend entirely on doctors' or care managers' judgment to overcome the model's flaws. We need to address the issue structurally, by building better population health models, rather than by making disclaimers.
Driving Out Bias, By Design
Fortunately, biases in algorithms are not as stubborn as they are in people. They can be mitigated by designing models well and carefully selecting the data they use.
Choosing the right outcome is the single most important decision in constructing a model to predict patients' risk of future medical needs. That outcome should focus on an event we want to avoid—not cost—and it must be very specific. For example, Evolent's models, which I helped develop and which my health plan uses, predict the likelihood of an "impactable" hospital admission within the next six months.
By design, this approach avoids the structural bias inherent in using medical expenses as the primary outcome. In contrast to models that rely on past health care costs to predict future need, Evolent's models recognize that the lack of certain encounters can sometimes be a strong predictor of future hospitalization. For example, if you have diabetes and hypertension but don't see your primary care physician, that fact might actually make an admission more likely for you than for someone who sees their primary care provider regularly, and you would receive a higher risk score in Evolent's model.
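As a rough illustration of these two ideas, an event-based outcome and a "missing encounter" signal, consider the Python sketch below. The function names, the 183-day window and the 12-month lookback are assumptions made for this example. They do not describe Evolent's actual features or code, and the sketch ignores the "impactable" qualifier, which would need its own clinical definition.

```python
from datetime import date, timedelta

SIX_MONTHS = timedelta(days=183)  # assumed length of the prediction window
LOOKBACK = timedelta(days=365)    # assumed lookback for primary care visits

def admitted_within_window(index_date, admission_dates, window=SIX_MONTHS):
    """Outcome label: True if any admission falls after index_date and within the window."""
    return any(index_date < d <= index_date + window for d in admission_dates)

def primary_care_gap(index_date, chronic_conditions, pcp_visit_dates, lookback=LOOKBACK):
    """Feature: True if the patient has a chronic condition but no primary care visit in the lookback."""
    if not chronic_conditions:
        return False
    recent = [d for d in pcp_visit_dates if index_date - lookback <= d <= index_date]
    return len(recent) == 0

index = date(2020, 1, 1)

# Hypothetical patient admitted in April 2020: the outcome label is positive.
label = admitted_within_window(index, [date(2020, 4, 15)])

# Diabetes and hypertension, last primary care visit in mid-2018: a gap in care.
gap = primary_care_gap(index, {"diabetes", "hypertension"}, [date(2018, 6, 1)])

# Diabetes with a primary care visit in September 2019: no gap.
no_gap = primary_care_gap(index, {"diabetes"}, [date(2019, 9, 10)])

print(label, gap, no_gap)
```

A model trained on a label like this can learn from the gap flag directly, whereas a cost-based target would treat the same absence of visits as low spending and therefore low risk.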
Gathering the right data for the AI model to "learn" from is also vital, and this needs to be a focus even though the data collection is not easy. Many combinations of factors can signal that a patient is at high risk of being hospitalized in the coming months, so we need to build those factors into our models. Patients' clinical conditions, including their severity and progression, as well as hospital and emergency department visits, should be components of that model. But the model should also give serious weight to lab results and prescriptions, the use of equipment such as home oxygen, and demographic data.
Incorporating social determinants of health data into population health models can also protect against bias. When a model learns from these data, such as transportation barriers or lack of social support, it can pick up on correlations between health barriers and adverse outcomes. In Evolent's models, socioeconomic status and living conditions are often stronger predictors of adverse events than clinical or utilization indicators.
But revamping these population health models won't be as simple as upgrading software and pushing out updates to users. Instead, models should be tailored to different types of patients and different interventions. For example, patients with behavioral health issues need to be targeted using a model and interventions that are specific to behavioral health. Training and resources need to be revamped for care managers so that they can understand the different predictive models. Workflows and interventions will need to be tailored to these groups as well. Moving from a one-size-fits-all approach to a more specific approach will increase complexity and the demands for retraining staff.
This work isn't convenient. It's very hard and requires a rapid test-and-learn approach that can be difficult for large payers and providers. I'm proud to have helped Evolent build a new generation of predictive models that not only seek to avoid bias but also drive breakthrough cost and quality outcomes.
Regaining Trust in Population Health Models
It's well past time for population health models to shift their focus from medical expenditures to actual health outcomes. Given that more than 150 million American lives are managed under some form of risk arrangement and value-based payment structure, such as Medicare Advantage plans, these algorithms have enormous impact by determining which patients will be offered services that could help them better manage their diseases. How many more patients in dire need of these services would get them if we could consistently identify these people? Are services being deployed for patients who don't need them? Are we unintentionally perpetuating systemic racial bias in the health care industry, compounded by time and the scope of old models still in use?
Beyond helping patients, it's critical that providers trust these algorithms. When a patient is referred to care management, we want physicians to welcome the opportunity and tap into the resources that are being offered. We don't want them to harbor suspicions about possible bias or think twice about engaging the care management team.
Minimizing racial bias in population health models won't happen overnight. Redesigning algorithms and redeploying them at scale could take years. In the interim, the field needs to make clear that it takes the problem seriously, is addressing it and is constantly guarding against it.