Guest Column | April 8, 2026

AI Drug Discovery Is Revealing The Preclinical Bottleneck

By Sean Murphy, Ph.D., CTO, Bexorg


Consider the numbers: roughly 95% of therapeutic candidates that enter clinical trials fail. For central nervous system (CNS) diseases such as Alzheimer's, Parkinson's, amyotrophic lateral sclerosis (ALS), and treatment-resistant depression, the failure rate approaches 99%. These are not early-stage molecules falling out of a funnel. These are candidates that have already survived years of optimization, consumed hundreds of millions of dollars, and cleared every preclinical hurdle the industry places in front of them. They arrive at the human trial looking like winners. Then they meet actual human physiology, and almost all of them collapse.

In software engineering, there's an old joke, usually printed on a meme featuring the "Most Interesting Man in the World": "I don't always test my code, but when I do, I do it in production." The joke has worked its way into drug discovery because it describes, uncomfortably well, how we develop therapeutics for human beings. We are, in the most literal sense, testing in production, with people as the production environment.

The Translational Gap Is A Chasm

The root cause is well understood: there’s a translational gap between animal models and human biology. Mouse physiology diverges from human physiology in ways that are sometimes subtle, sometimes profound, and almost always consequential for predicting whether a molecule will be safe and effective in a human body. The mouse brain is not merely a smaller human brain; it differs in cell-type composition, receptor distribution, blood–brain barrier properties, neuroinflammatory signaling, and metabolic pathways. A molecule that engages its target in a mouse cortex may be irrelevant, ineffective, or dangerous in a human one.

The industry has known this for decades. We've written thousands of papers about it and convened hundreds of panels. And we have continued, with remarkable persistence, to feed candidate after candidate into a pipeline whose core validation step relies on a model organism that diverges from the target organism in the ways that matter most.

Generative AI: Expanding The Throughput Of The Bottleneck

The generative AI revolution in molecular design is going to turn the translational gap in drug discovery from a chronic inefficiency into an acute crisis.

We are entering an era in which computational platforms can design novel therapeutic molecules at extraordinary speed and scale. Diffusion models, large language models trained on molecular representations, and reinforcement learning agents navigating chemical space all produce candidate molecules at a pace the rest of the pipeline was never built to absorb. The throughput of the design phase is increasing by orders of magnitude.
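
To make that concrete, here is a minimal sketch of the design-phase loop in Python. `sample_smiles` is a hypothetical stand-in for whatever generative backend is in use, and its toy output is illustrative; the validity filtering via the open-source RDKit toolkit is real:

```python
# Minimal sketch of a generative design loop: sample raw model output, keep
# only chemically valid, previously unseen structures. `sample_smiles` is a
# hypothetical stand-in for a trained generative backend.
from rdkit import Chem

def sample_smiles(n: int) -> list[str]:
    # Toy batch in place of a real sampler, which would emit far larger batches.
    return ["CCO", "c1ccccc1O", "not_a_molecule", "CC(=O)Nc1ccc(O)cc1"][:n]

def novel_valid_candidates(n_samples: int, seen: set[str]) -> list[str]:
    kept = []
    for smi in sample_smiles(n_samples):
        mol = Chem.MolFromSmiles(smi)      # returns None for invalid SMILES
        if mol is None:
            continue
        canonical = Chem.MolToSmiles(mol)  # canonical form, for deduplication
        if canonical not in seen:
            seen.add(canonical)
            kept.append(canonical)
    return kept

print(novel_valid_candidates(4, seen=set()))  # three valid, deduplicated SMILES
```

A loop like this runs in milliseconds. The validation step it feeds into runs in years, and that asymmetry is the coming crisis.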

This throughput at the front of the pipeline will be useless, and even destructive, if the bottleneck at the middle of the pipeline remains in place. Today, the rate-limiting step for getting a molecule from design to patient is not ideation. It is validation. And our primary validation tool is still the mouse.

Think about what happens when you increase the flow rate into a pipe without increasing the pipe's diameter. Pressure builds, costs escalate, and timelines extend. Right now, generative AI systems may already be producing molecules that would succeed in humans, but we have no reliable way to identify which ones they are. Instead, those candidates enter a preclinical gauntlet that evaluates them in the wrong species, and the few that emerge are sent into human trials costing $50 million to $100 million each, where the vast majority fail. The molecules are not necessarily bad; the selection process that promotes them was never equipped to predict human outcomes in the first place.
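
The arithmetic behind that claim is worth making explicit. A back-of-the-envelope sketch, using only the failure rates and per-trial costs already cited, and simplifying by charging every entrant the full trial cost even though many fail in earlier, cheaper phases:

```python
# Back-of-the-envelope sketch using the figures cited in this article:
# ~95% of trial entrants fail overall, ~99% in CNS, and a human trial
# program runs roughly $50M-$100M per candidate. Charging every entrant
# the full cost is a simplification; many fail in earlier, cheaper phases.
def trial_spend_per_success(failure_rate: float, cost_low: float, cost_high: float):
    entrants = 1 / (1 - failure_rate)  # expected entrants per approved drug
    return entrants, entrants * cost_low, entrants * cost_high

for label, rate in [("all indications", 0.95), ("CNS", 0.99)]:
    n, low, high = trial_spend_per_success(rate, 50e6, 100e6)
    print(f"{label}: ~{n:.0f} entrants per success, "
          f"${low / 1e9:.0f}B-${high / 1e9:.0f}B in trial spend alone")
```

Twenty entrants and $1 billion to $2 billion of trial spend per success overall; one hundred entrants and $5 billion to $10 billion in CNS. Widening the front of the pipe multiplies the number of entrants without improving either ratio. Only a more predictive gate changes the economics.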

Human Tissue Is The Test Environment We've Been Missing

Software engineering figured out decades ago that you need development, test, and production environments that share the same fundamental architecture. You write code in dev, you validate it in a test environment that faithfully mirrors production, and you deploy to production only after it has passed rigorous evaluation in that test environment. Yet the pharmaceutical industry ships the equivalent of untested, translated code straight to production every time it extrapolates mouse data to a human clinical trial.

Drug discovery has no test environment. It has a development environment — generative AI, computational chemistry, virtual screening — where molecules are conceived and iterated. But the preclinical validation step that should function as the test environment instead relies on animal models, which run a different operating system (mouse physiology) from the production environment (human trials). The result is a 95% to 99% crash rate on deployment.
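
A toy simulation makes the point sharper. Every number below is an illustrative assumption, not a measured value: candidates work in humans at a 5% base rate, a human-relevant test gate reads that truth with 90% accuracy, and a mouse-style gate passes candidates at a strict but statistically independent 30% rate:

```python
# Toy simulation of the dev/test/prod analogy. Each candidate either works
# in humans or not (5% base rate). The "test" gate is either a noisy
# human-relevant assay (agrees with the human truth 90% of the time) or a
# mouse-style assay whose outcome is independent of the human truth.
# Every number here is an illustrative assumption, not a measured value.
import random

def success_rate_in_trials(n: int, test_mirrors_production: bool) -> float:
    deployed = succeeded = 0
    for _ in range(n):
        works_in_humans = random.random() < 0.05
        if test_mirrors_production:
            # Human-relevant gate: a noisy readout of the human truth.
            passes = works_in_humans if random.random() < 0.9 else not works_in_humans
        else:
            # Mouse-style gate: strict, but independent of the human truth.
            passes = random.random() < 0.3
        if passes:
            deployed += 1
            succeeded += works_in_humans
    return succeeded / max(deployed, 1)

random.seed(0)
print(f"mouse-style gate:  {success_rate_in_trials(100_000, False):.0%}")  # ~5%
print(f"human-tissue gate: {success_rate_in_trials(100_000, True):.0%}")   # ~32%
```

Under these assumptions, no amount of strictness rescues the independent gate: deployed candidates succeed at the 5% base rate, the same signature we see in real trials, while even a noisy human-relevant gate lifts the deployed success rate several-fold.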

The good news is that the missing test environment exists and is available now. Tissue-engineered organ-on-a-chip platforms represent a promising and increasingly mature first layer of human-relevant validation for simpler tissues like bone marrow and the liver. These systems can recapitulate key features of human tissue architecture and function with impressive fidelity, and they are beginning to scale in ways that make them practical for routine use upstream in the drug development pipeline.

For the most complex tissues, however (with the CNS as the paradigmatic example), a different path forward is needed. Organoids have made remarkable strides, but they remain simplified approximations. They lack the full cellular diversity, the mature circuit architecture, the vascular complexity, and the microenvironmental context of actual human brain tissue. For a field where 99% of candidates fail in human trials, "approximate" is not good enough.

CNS drug discovery is therefore turning to a complementary pathway: the study of ethically sourced, donated postmortem human brain tissue. Within an ex vivo perfusion window of up to 24 hours, researchers can investigate tissue biodistribution, target engagement, pharmacokinetics, pharmacodynamics, and protein network perturbation across the full multi-omics stack: proteomics, metabolomics, bulk and single-nucleus transcriptomics, and histology. Mapping translational biomarkers between tissue and cerebrospinal fluid then provides a direct bridge between preclinical data and clinical readouts.

This is not a mouse-to-human extrapolation. This is direct measurement in the target species, the target organ, with the target complexity. It is, at last, a test environment that runs the same architecture as production.

Importantly, the regulatory landscape is already aligned with this direction. The FDA Modernization Act 2.0 amended the Federal Food, Drug, and Cosmetic Act to replace the long-standing mandate for "animal testing" with the broader term "nonclinical tests," formally opening the door for human tissue-based assays, organ-on-a-chip systems, and computational models to serve as accepted preclinical evidence.

Complementing this, the FDA's new approach methodologies (NAMs) initiative is building the qualification pathways and scientific frameworks needed to evaluate these alternatives rigorously across the agency's centers. The statutory change is in place, and the operational infrastructure is being built. For those developing human tissue platforms, the path from preclinical validation to regulatory acceptance now has institutional support behind it.

Redesigning For Human-Relevant Validation At Scale

For the generative AI revolution in drug discovery to deliver on its promise, the industry needs more than better molecules; it needs a better pipeline architecture, one modeled on modern software systems design.

The development environment — generative AI, computational chemistry, virtual screening — is where molecules are conceived and iterated.

The test environment — organ-on-a-chip for simpler tissues, donated tissue for complex organs like the brain — is where human tissue platforms provide high-fidelity, human-relevant validation before any molecule reaches a living patient.

The production environment — clinical trials — remains the final deployment. Clinical trials will always be necessary, but they should not be the first real encounter with human biology. Instead, they should be the final confirmation of a molecule that has been thoroughly validated in human tissue.

That means pivoting away from animal models as the primary workhorse of preclinical validation, not only for the sake of animal welfare but because the approach is archaic, economically inefficient, and scientifically inadequate for the challenge ahead. It means investing in tissue-engineered platforms for tissues that can be faithfully recapitulated in vitro, and in donated human tissue platforms for organs too complex to replicate accurately. The generative AI revolution has given us an extraordinary engine for molecular design. Now we need to build testing infrastructure worthy of it. The production environment is made up of our patients, and they deserve nothing less.

About The Author

Sean Murphy is chief technology officer at Bexorg, where he leads the development and scaling of the company’s human-first discovery platform, integrating BrainEx experimental rigs, high-throughput assays, and a human data-driven AI stack to accelerate CNS drug discovery and advance in-house neurodegeneration assets and pipeline-building efforts. He and his team architect Bexorg’s end-to-end technology, from on-prem instrument control and data capture to secure, multi-account cloud infrastructure, so that large-scale human brain experimentation translates into reliable, analysis-ready datasets and production AI models. Sean holds a Ph.D. in neuroscience from Yale University, and undergraduate degrees in biology and electrical engineering from MIT.