News | February 10, 2026

LIGAND-AI Aims To Transform Early Drug Discovery Through Machine Learning

A global research initiative that includes researchers from the Leslie Dan Faculty of Pharmacy is undertaking a large-scale project to advance the use of artificial intelligence (AI) and machine learning to identify small molecules that bind to proteins – an essential first step in early drug discovery. A key goal of the initiative, called LIGAND-AI, is to generate the experimental data needed to develop essential AI tools.

“Early drug discovery doesn’t have an AI problem – it has a data problem,” says Rachel Harding, assistant professor at the Leslie Dan Faculty of Pharmacy and an investigator with the Structural Genomics Consortium (SGC), one of the initiative’s key partners.

“AI has potential to give us insights or generate new hypotheses of compounds to test in the lab, but it needs to be trained on high-quality, consistent data to make accurate predictions. Currently, we don’t have the kind of data we need to use AI for this difficult problem.”

Much of early drug discovery depends on identifying small molecules called ligands, which might be developed into chemical probes – chemical tools that bind to and affect the activity of proteins. A ligand that binds well to a protein could serve as the foundation of a new drug, so identifying ligands is the very first step in drug discovery.

Traditionally, researchers have screened libraries of compounds against a specific protein and looked for compounds that bind to it, but these experimental methods are time-consuming and expensive, often taking years and millions of dollars to identify a single ligand for a protein. In addition, these datasets are rarely shared in the public domain, so the training data available for identifying ligands is a limited patchwork of disconnected sources.

As an investigator with the SGC, Harding is also part of Target 2035, an SGC initiative that aims to identify a ligand for every protein in the human proteome by 2035. But to meet this goal, researchers have to move beyond traditional ligand discovery methods.

“Over the next several years, we need to accelerate the pace at which we’re discovering chemical probes for all these different human proteins. If we continue to do it experimentally, it’s not going to be fast enough, it’s going to be too expensive, and our goal will be unattainable,” says Harding. “So, we need to use machine learning and other tools that will help us along the way.”

LIGAND-AI brings industry and academics together for public good
LIGAND-AI is a five-year, multimillion-dollar initiative that will address these challenges by developing computer models and generating the training data needed to accurately predict ligands for every human protein. Research teams from around the world will screen different types of proteins against standardized compound libraries to generate billions of data points to train the computer models.

“This project brings together scientists and companies from across disciplines within an open science ecosystem. It is heartening to see these diverse scientific communities coalesce around a common vision to generate and share valuable chemical data openly with the world,” said Aled Edwards, CEO of the Structural Genomics Consortium and project coordinator.

With her expertise in structural biology, Harding is leading one work package of LIGAND-AI focused on hit validation, which assesses how well a potential hit binds to its target protein. This work will help determine if the ligand is a suitable option for further research and development as a potential drug.

LIGAND-AI depends on open science and collaboration to generate the experimental data required to build the AI models. Harding says she sees the project as an exciting way to collaborate with local researchers working on specific proteins, generating data that can have global impact and contribute to drug discovery.

“The idea behind LIGAND-AI is that we’ll create a consistent, open data set, and along the way, we also get to discover lots of really cool science about some of our favourite proteins,” says Harding. “It’s really exciting, and I see it as making a difference for the public good.”

Source: University of Toronto