From The Editor | March 20, 2026

5 Takeaways From The Boston Open Science & Innovation Forum

Ray Dogum 450 450 Headshot

By Ray Dogum, Chief Editor, Drug Discovery Online

GettyImages-1304908386

Open software is fundamentally a public good, deserving of recognition and support across all industries. Nowhere is this more critical than in drug and disease research, where open tools can help save lives and ease suffering.

On March 18, 2026, leaders of this social and technological frontier gathered at the Boston Open Science & Innovation Forum to share updates with the community. The event was co-hosted by Ginkgo Datapoints and the Open Molecular Software Foundation (OMSF).

Speakers from left to right: Woody Sherman, Charlles Abreu, Jonathan Gilbert, John Androsavich, Jackson Burns, Beth Cimini, and Devany West.

Here are 5 takeaways from the event I found worth sharing.

1. Science Community Adoption Takes Time

When OpenFold training datasets, AI model source code, inferences, and weights are all shared publicly for anyone to freely access without commercial restrictions, it’s a huge win for the world. It’s a win for patients, for researchers, and for corporations—but it doesn’t come without its own unique challenges.

Getting people to adopt open-source software is a grand challenge and many projects do not gain enough steam to establish a solid community foundation or to build a product that is user-friendly enough for wider adoption. Oftentimes their original developers cannot continue to run a project with barebones funding, so the project fizzles out over time and the creators find more traditional industry jobs.

However, some projects are unique and valuable enough for the community to catch on and support its growth organically. For example, the Linux project, which started in 1991, has blossomed into the most popular operating system for global servers and devices. It only took a few decades to reach that level of maturity.

Cell Profiler (introduced in 2003 by Anne Carpenter) and ChemProp (introduced in 2019 by Kevin Yang et al) are two examples of open drug discovery tools that have been successful in terms of adoption and product maturity.

Slide on CellProfiler.

At this event, Beth Cimini highlighted how CellProfiler has evolved into one of the most widely used and consistently cited image‑analysis tools in drug discovery. Cimini leads her lab within the Broad Institute’s Imaging Platform at MIT and Harvard, where CellProfiler continues to anchor high‑content screening workflows across academia and industry.

MIT’s green group research scientist, Charlles Abreu, emphasized the impact of ChemProp’s 2D‑based modeling, noting how frequently researchers rely on it, especially in blind challenge competitions where variations of ChemProp regularly appear among top‑performing models.

MIT Fellow, Jackson Burns, underscored the importance of clear communication around model limitations and the practical realities of deploying machine‑learning tools in real‑world discovery pipelines. He also discussed CheMeleon, a Chemprop-based model, which he characterized as his group’s first big step into foundational models for molecular property prediction.

Slide on the CheMeleon project.

2. Sharing Data Is Still A Key Issue

Data is gold in drug discovery. All early‑pipeline biopharma companies need to protect their research data from being released into the wild, where competitors could inadvertently gain access through insecure AI models.

Jonathan Gilbert, leading Lilly TuneLab’s ecosystem growth, shared his excitement for their continuously improving federated learning model and infrastructure (hosted by third party Rhino Federated Computing) that gives smaller companies access to Lilly’s internally-trained inferencing models.

He explained, “whether you are inferencing or predicting characteristics of a small molecule or from an antibody sequence, that information is never leaving your private environment. It’s never seen by us. When you're training the model, the data does not leave your private environment. The model goes to the data, then trained on the data, and then any changes in the weights or gradients are brought back in and aggregated. That's the global model that's in service to everyone in the community.”

3. New Research Fellowship Announced with Nobel Laureate David Baker’s Lab

Woody Sherman announced the OMSF has launched a new research fellowship between the OpenFold Consortium and the University of Washington’s Institute for Protein Design (IPD), led by Nobel laureate David Baker.

The program will fund graduate students and postdocs in the Baker Lab to develop next‑generation open‑source AI models for antibody–antigen design and protein structure prediction. All resulting software will be released under permissive licenses, reflecting the shared commitment of OpenFold and IPD to openness in biomolecular AI.

OpenFold engineers will also provide packaging, documentation, and maintenance support to ensure broad usability. The initiative aims to build community‑owned AI infrastructure that accelerates scientific progress and advances drug discovery.

4. Ginkgo Offers Free High-Throughput Transcriptomics And Publicly Releases The Data

Slide on Ginkgo’s Virtual Cell Pharmacology Initiative (VCPI).

The Virtual Cell Pharmacology Initiative (VCPI) recently released their first open-source dataset (DRUG-seq data, featuring 2,280 small molecules in a 6 point-dose response screened in THP-1 cells). The open initiative is creating transcriptomic profiles of compounds in defined cell types with the aim of building virtual representations of cellular responses to chemical perturbations.

I was encouraged by Ginkgo Datapoints’s general manager, John Androsavich, who explained, “if we’re going to build a virtual cell that’s going to predict chemical perturbations, we need diverse chemistries, we need diverse input, and we also need help modeling.”

This strategy seems to reflect Ginkgo’s culture of enabling open science. Ginkgo Bioworks (via Ginkgo Datapoints) has become one of the largest producers of Drug‑seq datasets in the world. During our visit, the Ginkgo team also gave us a tour of their facilities, allowing us to see firsthand the automation, assay engineering, and high‑throughput sequencing infrastructure that make this scale possible.

View of Ginkgo Bioworks’ lab facilities.

From our tour of Ginkgo Bioworks’ lab facilities.

5. OpenADMET Hosting Blind Competitions in 2026

Slide on OpenADMET’s upcoming challenges, partners, and contact info.

Devany West, scientist at OpenADMET, provided the audience with an update from the consortium which aims to build open predictive safety and toxicity models for small molecules. West promoted their upcoming blind challenges happening this year, including the following key areas:

  • PXR Activation + Structural Data in April,
  • CYP Inhibition, Reactivity and TDI in August,
  • and an unconfirmed drug discovery dataset in December.

I suggest reaching out to their team if you’d like to participate as predictive tox and safety is one of the most exciting areas in drug discovery.

The Boston Open Science & Innovation Forum couldn’t have happened at a better time. We seem to be entering a new phase of open science that involves greater collaboration, corporate commitment, and accelerated learning.

Thankfully, scientists have options when it comes to what tools they want to use.

The OMSF is one of many groups navigating the open science software deployment. If you’re interested in learning more about this evolving space, I strongly recommend you register for an upcoming Drug Discovery Online Virtual Live Panel, Why Open-Source Models Matter for Computational Drug Discovery happening on March 26, 2026. Hope to see you there!

Ray Dogum at the Boston Open Science & Innovation Forum