NVIDIA at JP Morgan: Generative AI and the Computer Aided Drug Design Tipping Point
By Allison Proffitt
January 8, 2024 | Drug design is at a tipping point, Kimberly Powell, vice president of Healthcare for NVIDIA, argued in her Monday morning presentation at the JP Morgan Healthcare Conference, being held this week in San Francisco. She compared today’s computer-aided drug design (CADD) to the computer-aided design (CAD) that revolutionized the chip industry.
“What CAD [computer-aided design] and EDA [electronic design automation] did for chip design, the CADD industry—computer-aided drug design—will do for drug design,” she said. “Two necessary conditions have arrived for drugs: digitizing biology and being able to represent it in a computer. We have the perfect conditions to see a massive expansion of this computer-aided drug discovery industry to serve the $250 billion spent every year in R&D.”
“NVIDIA has been preparing for this moment for over a decade,” she added.
Generative AI is the class of tools that will facilitate this codifying of the industry into scalable methods, Powell said. “Generative AI presents a new class of tools that will get codified into applications and new methods of discovery,” she said. “In fact, [it will] go beyond discovery and evolve into design, helping to create the conditions to no longer be a hit-or-miss industry. This new class of CADD will synthesize and systematically endeavor the process and it will help it be more consistent and more efficient in finding drugs for specific diseases, and specific people one day.”
Powell outlined several of NVIDIA’s latest steps toward this generative AI drug discovery future, including two new foundation models for BioNeMo: a target discovery model built by Recursion for phenomics and an NVIDIA-built model for generating small molecules. In other NVIDIA news, Powell shared that Amgen subsidiary deCODE genetics has deployed an NVIDIA DGX H100 supercomputer for its genomics foundation models.
GenAI for Biology
Most of Powell’s announcements centered on BioNeMo. Announced in September 2022 at NVIDIA’s GTC event, BioNeMo was originally billed as a large language model for biology. Now Powell leans into the “generative AI” language, calling it “a generative AI platform that provides services to develop, customize and deploy foundation models for drug discovery.”
“BioNeMo features a suite of pretrained biomolecular AI models for protein structure prediction, protein sequence generation, molecular optimization, generative chemistry, docking prediction and more,” Powell wrote in the NVIDIA blog today. “It also enables computer-aided drug discovery companies to make their models available to a broad audience through easy-to-access APIs for inference and customization.”
BioNeMo was originally announced as an outgrowth of NVIDIA’s partnership with the Broad Institute, but since then NVIDIA has furthered BioNeMo with investments in innovative techbio companies—the company named 19 in a late-December blog post.
BioNeMo Service is now in beta, Powell announced, and is being adopted by computer-aided drug design companies across the industry. BioNeMo aggregates leading generative AI methods into a single service. “CADD makers have access to optimized, scalable, stable APIs and enterprise-grade services that they can rely on to build out all the methods that they deeply understand to deliver to the market and enable drug discovery and design,” Powell said. NVIDIA delivers cost savings by optimizing and accelerating the models.
Powell mentioned several proprietary models running on BioNeMo Service, including the Samson molecular design platform from OneAngstrom; Deloitte’s Quartz Atlas AI data harmonization platform; Recursion OS; Terray’s T Compute and T Array; Innophore’s Catalophore enzyme identification platform; and Insilico Medicine’s Pharma.AI platform for novel target discovery.
But there are also other models on BioNeMo, some created by NVIDIA, some open-source models pioneered by global research teams and optimized by NVIDIA, and, now, models developed by NVIDIA partners available for non-commercial use. At JP Morgan, Powell introduced both a new model by NVIDIA and the first model developed by an NVIDIA partner.
MolMIM is the first NVIDIA-generated foundation model to be deployed on BioNeMo, Powell said. MolMIM uses self-supervised learning and a learning process called Mutual Information Machine for controlled generation of small molecules with desired properties, “while constraining to the original starting molecule,” Powell explained. Researchers can define their own constraints (synthetic accessibility, drug-likeness, solubility, binding affinity) to steer the generative process. “Controlled generation [increases the] likelihood of success by generating molecules that satisfy multiple objectives simultaneously,” she added.
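To make the idea of satisfying several objectives while staying close to a starting molecule more concrete, here is a minimal sketch in plain Python. It is not MolMIM and does not use the BioNeMo API. The candidate names, property scores, weights, and the `select_candidates` helper are all hypothetical and invented for illustration. The sketch only shows the selection logic: discard candidates that drift too far from the seed, then rank the rest by a weighted sum of user-defined objectives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A generated molecule with made-up property scores in [0, 1]."""
    name: str
    similarity_to_seed: float       # closeness to the starting molecule
    drug_likeness: float            # higher is better (toy score)
    solubility: float               # higher is better (toy score)
    synthetic_accessibility: float  # higher means easier to make (toy score)


def weighted_score(c: Candidate, weights: dict) -> float:
    """Combine several objectives into one number via a weighted sum."""
    return (
        weights["drug_likeness"] * c.drug_likeness
        + weights["solubility"] * c.solubility
        + weights["synthetic_accessibility"] * c.synthetic_accessibility
    )


def select_candidates(candidates, weights, min_similarity, top_k):
    """Keep candidates near the seed, then return the top_k by weighted score."""
    near_seed = [c for c in candidates if c.similarity_to_seed >= min_similarity]
    ranked = sorted(near_seed, key=lambda c: weighted_score(c, weights), reverse=True)
    return ranked[:top_k]


# Hypothetical candidates; every number here is invented for the example.
CANDIDATES = [
    Candidate("mol_A", 0.90, 0.5, 0.5, 0.5),
    Candidate("mol_B", 0.30, 0.9, 0.9, 0.9),  # scores well but is far from the seed
    Candidate("mol_C", 0.70, 0.8, 0.6, 0.7),
    Candidate("mol_D", 0.65, 0.4, 0.9, 0.3),
]

# Equal weights are an arbitrary choice for this sketch.
WEIGHTS = {"drug_likeness": 1.0, "solubility": 1.0, "synthetic_accessibility": 1.0}

if __name__ == "__main__":
    for c in select_candidates(CANDIDATES, WEIGHTS, min_similarity=0.6, top_k=2):
        print(c.name, round(weighted_score(c, WEIGHTS), 2))
```

In this toy setup, the highest-scoring candidate overall is excluded because it falls below the similarity threshold, which is the trade-off that constraining generation to the starting molecule implies. A real system would steer generation itself rather than filter a fixed list afterward.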
Recursion’s Phenom-Beta is the first third-party model on BioNeMo. Phenom-Beta is a foundation model for turning cellular microscopy images into mathematical representations of biology, said Recursion CEO Chris Gibson in a pre-briefing. Phenom-Beta was trained on Recursion’s publicly available RxRx3 dataset of biological images using the company’s BioHive-1 supercomputer, based on the NVIDIA DGX SuperPOD reference architecture. Now Recursion is making a portion of that dataset available as Phenom-Beta for other groups to explore.
“We’ve been partnered with NVIDIA now for several years and it’s been such a fruitful collaboration for us,” explained Gibson during the briefing. “Because Recursion has been so far ahead of the rest of the field in our use of images of human biology and in particular images of human cells to build a new kind of ‘omics…we felt confident it made sense for us to share some of what we’ve learned with the broader industry.”
Recursion sees “phenomics” as the next logical area of ‘omics research and has built “hundreds of millions of experiments worth of data to train models” to mathematically assess cell microscopy images. This isn’t digital pathology, Gibson points out; the analysis happens at the biological level.
“We think that by sharing this foundation model, we’ll actually accelerate the sharing of other models that other companies and other groups are using. And we think that’ll move all of us forward faster,” Gibson said.
deCODE’s Freyja
Finally, Powell announced that Amgen plans to build AI models trained to analyze one of the world’s largest human datasets on an NVIDIA DGX SuperPOD, a full-stack data center platform that will be installed at the Reykjavik, Iceland, headquarters of Amgen’s deCODE genetics.
“The system will be named Freyja in honor of the powerful, life-giving Norse goddess associated with the ability to predict the future,” according to an NVIDIA blog post that went live with Powell’s presentation. “Freyja will be used to build a human diversity atlas for drug target and disease-specific biomarker discovery, providing vital diagnostics for monitoring disease progression and regression. The system will also help develop AI-driven precision medicine models, potentially enabling individualized therapies for patients with serious diseases.”