Asking the right questions: AI in Healthcare
Community post from EI Network member Alexandra Crew
One of the core components of the EI Network’s mission is to bring together professionals in Responsible AI to share expertise, experiences, and best practices in the pursuit of innovative solutions and critical thought. One way we create the space to enable this collaboration is to give a platform to the voices within our community by publishing a monthly post from an EI Network member.
This month we have the opportunity to hear from Alexandra Crew about her experience working in Responsible AI & Ethics in Healthcare!
If you are interested in writing a community post for the EI Network, please email helena@ethicalintelligence.co for further information.
Practical Tips for AI in Healthcare
Ethics Questions to Engage in Measuring, Validating, and Monitoring the Performance of AI-enabled Diagnostics
Community post by Alexandra Crew
Imagine this…
The Chief Product Officer of a MedTech company, Marisa, is thrilled that her team’s new AI-enabled lung cancer diagnostic is nearly ready to launch. She knows well that one of the strengths of AI/ML is image analysis, which can support disease classification for conditions whose diagnostic processes rely on imaging – like lung cancer and diabetic retinopathy. She hopes her team’s tool will help catch lung cancer earlier for more people, potentially saving lives. At the same time, she knows this technology comes with risks – and the stakes are particularly high given the potential impact on patient outcomes and the sensitivity of health information. However, she doesn’t know where to begin in exploring these risks – it’s hard to know what you don’t know.
How can Marisa embrace the promise of this new technology while still proactively engaging with the potential risks of AI-enabled diagnostics?
While the list of potential considerations is extensive, let’s dive into the specifics of measuring, validating, and monitoring the performance of AI-enabled diagnostics, exploring the potential issues and the important questions someone like Marisa should be asking.
Measuring Performance
Evaluation often focuses on safety and efficacy as measured by sensitivity and specificity. While these are crucial metrics that merit consideration, accounting for only this narrow idea of performance could mean missing broader patient impacts of using an AI-enabled diagnostic, such as patients receiving care sooner thanks to shorter wait times for results. Additionally, relying exclusively on blanket evaluation metrics, such as overall sensitivity and specificity alone, can hide differences in performance across patient sub-groups. For example, if the diagnostic performs well for one racial group but not another, this could go unnoticed without metrics that surface sub-group differences.
To push her thinking on how to assess her company’s tool, someone like Marisa can ask:
What performance metrics are used to evaluate this technology?
Have we included measures to reflect the influence on patient outcomes?
Have we incorporated metrics that indicate potential performance differentials across groups?
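To make the sub-group point concrete for readers who work closer to the data, here is a minimal sketch of what reporting sensitivity and specificity per sub-group might look like. It is illustrative only: the field names, data, and grouping variable are hypothetical, and this is not a prescribed method or any particular vendor’s implementation.

```python
# Illustrative sketch: sensitivity and specificity overall and per sub-group.
# Field names ("actual", "predicted", "group") and the data are hypothetical.
from collections import defaultdict

def sensitivity_specificity(records):
    """Compute (sensitivity, specificity) from dicts with boolean 'actual'/'predicted' keys."""
    tp = fp = tn = fn = 0
    for r in records:
        if r["actual"] and r["predicted"]:
            tp += 1
        elif r["actual"]:
            fn += 1
        elif r["predicted"]:
            fp += 1
        else:
            tn += 1
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec

def subgroup_report(records, group_key):
    """Report the same metrics separately for each value of group_key."""
    groups = defaultdict(list)
    for r in records:
        groups[r[group_key]].append(r)
    return {g: sensitivity_specificity(rs) for g, rs in groups.items()}

# A toy example: a gap between groups that the overall numbers would mask.
records = [
    {"group": "A", "actual": True,  "predicted": True},
    {"group": "A", "actual": False, "predicted": False},
    {"group": "B", "actual": True,  "predicted": False},
    {"group": "B", "actual": False, "predicted": False},
]
print("Overall:", sensitivity_specificity(records))    # overall sensitivity 0.5
print("By group:", subgroup_report(records, "group"))  # sensitivity: A = 1.0, B = 0.0
```

Disaggregated reporting like this is only a starting point; deciding which sub-groups to examine, and acting on the differences you find, are the harder questions.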
Validating Performance
As for validating these AI-enabled diagnostics, understanding their performance in real-world settings can offer valuable information. While retrospective studies can provide insight into a tool’s capabilities, they do not always reflect how the tool will perform in a real-world environment, as demonstrated in a study from Beede et al. As a result, validating these tools in real-world clinical workflows can offer important insight into the impact a tool will have when used in such environments.
Someone like Marisa can prompt further thinking on this issue by asking:
How might we validate this technology in a real-world setting?
Does this tool have the desired influence when used in a real-world clinical workflow?
Is it as accurate as intended?
Monitoring Performance
Monitoring continuously learning algorithms is important because it offers insight into whether and how an algorithm’s performance is evolving. Some algorithms continuously learn as they receive new inputs, and in many cases this can offer benefits, as the algorithm’s performance might improve with new data. However, there are also risks that performance could deteriorate or change in unexpected ways.
Continuously learning AI-enabled diagnostics are not permitted in all regulatory environments, but if their use is allowed, or becomes allowed in your region, someone like Marisa would need to consider:
How will this algorithm be monitored?
How can we monitor multiple facets of performance beyond traditional performance measures?
How might we ensure it is continuing to perform at a similar or improved level?
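For teams that do operate where continuous learning is permitted, one possible shape of an ongoing check is sketched below: comparing recent performance on confirmed cases against a baseline locked at validation time and flagging deterioration for human review. This is a simplified illustration; the window size, tolerance, and what counts as a confirmed outcome are hypothetical choices that would need clinical and regulatory input.

```python
# Illustrative sketch: flag deterioration in a continuously learning model by
# comparing recent sensitivity against a baseline fixed at validation time.
# Window size, tolerance, and the alerting step are hypothetical choices.
from collections import deque

class PerformanceMonitor:
    def __init__(self, baseline_sensitivity, window=500, tolerance=0.05):
        self.baseline = baseline_sensitivity
        self.recent = deque(maxlen=window)  # 1 = caught, 0 = missed, for confirmed positives
        self.tolerance = tolerance

    def record(self, predicted_positive, confirmed_positive):
        """Call once a confirmed outcome (e.g. a biopsy result) becomes available."""
        if confirmed_positive:  # sensitivity only concerns truly positive cases
            self.recent.append(1 if predicted_positive else 0)

    def check(self):
        """Return (current_sensitivity, alert) over the recent window."""
        if not self.recent:
            return None, False
        current = sum(self.recent) / len(self.recent)
        return current, current < (self.baseline - self.tolerance)

# Toy usage: baseline sensitivity of 0.92 established during validation.
monitor = PerformanceMonitor(baseline_sensitivity=0.92)
monitor.record(predicted_positive=True, confirmed_positive=True)
monitor.record(predicted_positive=False, confirmed_positive=True)
current, alert = monitor.check()
print(current, alert)  # 0.5 True -- would trigger review in this toy example
```

In practice, a monitor like this would track several facets of performance (including the sub-group views discussed above), not sensitivity alone.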
What next?
Given that these tools carry potential risks, we would do well to engage with those risks proactively and seek solutions. To begin, you can include metrics that reflect the technology’s impact on patient outcomes and that indicate its performance across patient groups. For real-world validation, you can test these technologies in real-world clinical workflows to understand whether they maintain their desired influence. And for monitoring continuously learning algorithms, you can craft methods to take this on when needed. Policy-makers can also create or adapt relevant regulations to surface these considerations and guide technologists through them.
Beyond Healthcare
Each of these insights can also be applied to AI tools beyond healthcare. Across industries, we would be wise to consider the influence an AI solution has and how to measure it, rather than relying exclusively on traditional performance metrics. If we focus myopically on traditional metrics, we could miss important impacts, both harmful and beneficial.
You can’t always predict how humans will interact with a technology, nor the influence it will have, so testing your tool in real-world settings before launching broadly is valuable for many AI tools, particularly those that will be used at scale or by a large number of people.
We’d do well to monitor continuously learning algorithms to ensure their performance or intended impact does not deteriorate or drift toward unwanted risks. Monitoring could even shed light on valuable improvements. Asking these questions and thoughtfully exploring them to find solutions is the first step for you and Marisa in creating a technology with its risks minimized – so that you can focus on the benefits it will bring.
About the author
Alexandra Crew is a professional and graduate student focused on the intersection of healthcare and technology. She believes that as AI transforms elements of healthcare, engaging with ethical considerations is paramount to proactively ensuring a positive impact. Her studies and research at Oxford examine ethical considerations for AI healthcare applications, with the intent of informing the ethical translation of AI solutions. This academic focus pairs well with her professional experience, which allows her to root ethical considerations in business principles and the realities of healthcare ecosystems. Her professional experience ranges from informing the launch strategy of an AI-enabled lung cancer diagnostic for a Fortune 50 company to advising the COVID-19 vaccine rollout for a US state. Alexandra holds a BASH from Stanford in both Human Biology and Comparative Studies in Race & Ethnicity, and she is pursuing an MSc in Translational Health Sciences at the University of Oxford.
Join the community
This newsletter is a feature of The EI Network.
A global community designed to cultivate the space for interdisciplinary collaboration in Responsible AI and Ethics, the EI Network brings together members to share expertise, experiences, and best practices in the pursuit of innovative solutions and critical thought.
With both virtual and in-person functions, EI Network members have the opportunity to gain insights from an international online community while also benefitting from the support of local chapters.
In addition to receiving this newsletter at a 30% discount, EI members have access to a private Slack community, local chapter meet-ups, education sessions on Responsible AI, and the opportunity to exchange ideas with some of the world’s leading minds in AI Ethics.
Membership to the EI Network is on an application basis.
Hello from the community
EI Network members Dave Barnes, Jillian Powers, Olivia Gambelin, and Paige Lord at this year’s Leaders in Responsible AI Summit in Cambridge.