Fair Play in AI Drug Discovery: Navigating Bias in DRL

Deep reinforcement learning (DRL) is carving out a niche in molecular design and helping drive the development of new drugs. But the excitement is tempered by uneven performance across disease areas and chemotypes, which traces back to biases in data, rewards, and evaluation.

The Fairness Challenge

Despite the buzz, there's a glaring gap in how fairness is defined and tested in DRL-based drug discovery. As we push the boundaries of AI in healthcare, understanding fairness becomes a critical component of trustworthy medicine. The core question: How do we ensure that AI-driven molecule generation doesn't perpetuate bias?

The review dives into three main areas, with a notable focus on cancer targets, a field where the stakes are incredibly high. First, how dataset composition and splitting strategies, particularly scaffold versus random splits, affect evaluation and distribution shift. Second, how reward designs built on QED, docking scores, toxicity, and synthetic accessibility can either create or mitigate bias. Third, which metrics actually capture fairness, looking at parity across cancer and non-cancer indications and at distributional balance in key descriptors.
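To make the scaffold-versus-random distinction concrete, here is a minimal sketch in Python. It is not the review's protocol. It assumes each molecule already comes with a precomputed scaffold string (in practice this might come from a cheminformatics toolkit), and the function names and the `(molecule_id, scaffold)` record layout are hypothetical.

```python
import random
from collections import defaultdict


def scaffold_split(records, test_fraction=0.2):
    """Split so that no scaffold appears in both train and test.

    `records` is a list of (molecule_id, scaffold) pairs. The scaffold
    strings are assumed to be precomputed elsewhere (an assumption of
    this sketch, not something the article specifies).
    """
    groups = defaultdict(list)
    for mol_id, scaffold in records:
        groups[scaffold].append(mol_id)

    # Place the largest scaffold groups first; groups that no longer fit
    # the training budget go to the test set as whole units.
    ordered = sorted(groups.values(), key=len, reverse=True)
    train_budget = len(records) * (1 - test_fraction)
    train, test = [], []
    for members in ordered:
        if len(train) + len(members) <= train_budget:
            train.extend(members)
        else:
            test.extend(members)
    return train, test


def random_split(records, test_fraction=0.2, seed=0):
    """Shuffle molecule ids and cut; scaffolds may leak across the split."""
    ids = [mol_id for mol_id, _ in records]
    random.Random(seed).shuffle(ids)
    n_test = int(round(len(ids) * test_fraction))
    return ids[n_test:], ids[:n_test]
```

With a random split, close analogues of a training molecule can land in the test set, so scores tend to look better than they would on genuinely new chemotypes. A scaffold split holds out whole structural families, which makes the distribution shift visible.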

Data: The Double-Edged Sword

The review searched major biomedical and engineering literature databases, alongside arXiv, for work published from 2017 onward. The findings underscore that dataset choices significantly influence parity outcomes. Yet the industry often overlooks this essential link.

How can we trust DRL models if they can't guarantee unbiased outcomes? The review provides a concise set of definitions and metrics to help address this challenge. It also offers practical guidance for ensuring distribution and outcome parity.
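As an illustration of what an outcome-parity check could look like, here is a small Python sketch. It is one simple reading of "outcome parity" (the gap between per-group success rates), not necessarily the definition the review uses. The group names and the 0/1 outcome values are made up for the example.

```python
def success_rate(outcomes):
    """Fraction of generated molecules that passed some success criterion."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


def parity_gap(outcomes_by_group):
    """Return (gap, rates): the spread between best and worst group rates.

    `outcomes_by_group` maps a group label (e.g. an indication) to a list
    of 0/1 outcomes. A gap of 0.0 means every group succeeds equally often.
    """
    rates = {group: success_rate(o) for group, o in outcomes_by_group.items()}
    gap = max(rates.values()) - min(rates.values())
    return gap, rates


# Hypothetical outcomes: 1 = generated molecule met the criterion, 0 = did not.
example = {
    "cancer": [1, 1, 1, 0],
    "non_cancer": [1, 0, 0, 0],
}
gap, rates = parity_gap(example)
print(rates)  # {'cancer': 0.75, 'non_cancer': 0.25}
print(gap)    # 0.5
```

Reporting the per-group rates alongside the gap keeps the number interpretable: the same gap can come from one group doing unusually well or another doing unusually badly.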

The Bias in Rewards

Reward systems in DRL are another contentious topic. While they aim to guide model learning, they can inadvertently introduce or amplify biases. For instance, prioritizing certain chemical properties might skew results towards specific cancer targets, leaving other indications behind.
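The effect of reward weighting can be sketched with a toy composite reward in Python. Everything here is an assumption for illustration: the weighted-average form, the component names, and the scores themselves, which are invented and taken to be already normalized to [0, 1] with higher meaning better (so toxicity is expressed as `safety` and synthetic accessibility as `synth_ease`).

```python
def composite_reward(scores, weights):
    """Weighted average of normalized component scores.

    Both arguments are dicts keyed by component name. Scores are assumed
    to lie in [0, 1] with higher being better (an assumption of this sketch).
    """
    total_weight = sum(weights.values())
    return sum(weights[k] * scores[k] for k in weights) / total_weight


# Two hypothetical candidates with invented component scores.
mol_a = {"qed": 0.9, "docking": 0.4, "safety": 0.8, "synth_ease": 0.7}
mol_b = {"qed": 0.5, "docking": 0.9, "safety": 0.6, "synth_ease": 0.5}

equal = {"qed": 1, "docking": 1, "safety": 1, "synth_ease": 1}
docking_heavy = {"qed": 1, "docking": 5, "safety": 1, "synth_ease": 1}

for name, weights in [("equal", equal), ("docking_heavy", docking_heavy)]:
    a = composite_reward(mol_a, weights)
    b = composite_reward(mol_b, weights)
    print(name, "prefers", "mol_a" if a > b else "mol_b")
```

Under equal weights the drug-like, safer candidate scores higher; once docking against a single target dominates, the ranking flips. An agent trained on the second reward is pushed toward whatever chemistry that one target favors, which is the kind of skew described above.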

There's also a need to standardize how we report on these models. Uneven outcomes hint at gaps that must be filled to make DRL generation trustworthy, especially in cancer research, where precision is essential.

Ultimately, the intersection of AI and drug discovery is fraught with complexities. Yet ignoring fairness metrics isn't just an oversight; it's a disservice to the potential of AI in healthcare.
