Digital Twin of Infant Microbiome Reliably Predicts Cognitive Deficits
By Deborah Borfitz
May 8, 2024 | Researchers at the University of Chicago have succeeded in creating a digital twin of the gut microbiome of premature infants that reliably models the many interactions taking place between quickly changing bacterial inhabitants. The so-called “Q-net” model, created with generative artificial intelligence (AI), was used to make predictions about which infants were going to have neurodevelopmental deficits using head circumference as the proxy, reports Ishanu Chattopadhyay, Ph.D., assistant professor of medicine.
Unlike other types of health-related measures, such as blood pressure and heart rate, there is no baseline for normal when it comes to microbes in the gut, he says. Digital twins are what will now enable the discovery of what constitutes a healthy microbiome while also considering the wide variability from one individual to the next.
Admittedly, it’s a tall order. The typical gut has more than 1,000 bacterial species that may be interacting via competition, cooperation, or cross-feeding, and “we do not know a priori what those dynamic relationships are,” says Chattopadhyay. The microbiome also tends to resist change, often rejecting bacteria in probiotics meant to correct deficiencies.
Since that “complexity cannot vanish,” it follows that a predictive model of the microbiome at the individual level will be multifaceted with many parameters, he adds. Generative AI, which deals with the complexities of human language, “fits like a glove for this type of ecosystem.”
As Chattopadhyay sees it, “a difference is emerging in the scientific method itself” now that generative AI can be used to systematically explore hypotheses and come up with solutions or provide actionable models. It’s possible that scientific laws have always been very simple because that’s how nature works, he adds, but the other real possibility is that only simple laws are discoverable without AI in the mix.
That case was made by a newly published study in Science Advances (DOI: 10.1126/sciadv.adj0400) in which Chattopadhyay and his team used Q-net to make predictions about which preterm babies were at risk for cognitive deficits. The model’s performance (an area under the curve of 76%, with 80% specificity and 60% sensitivity) was good enough to demonstrate its utility in pre-screening infants who could then undergo further testing and monitoring, he says.
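For readers less familiar with these screening metrics, the short Python sketch below shows how sensitivity and specificity are computed from a table of predictions. The counts are hypothetical, chosen only so the arithmetic reproduces the reported 60% and 80%; they are not the study's data, and the function names are illustrative.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of infants who developed a deficit that the model flagged."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of infants without a deficit that the model correctly cleared."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for illustration only (NOT taken from the study).
tp, fn = 30, 20  # infants with a deficit: flagged vs. missed
tn, fp = 40, 10  # infants without a deficit: cleared vs. falsely flagged

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity={sens:.0%}, specificity={spec:.0%}")
```

At those rates a pre-screening tool would miss roughly four in ten affected infants while falsely flagging two in ten unaffected ones, which is why it is framed as a first pass ahead of further testing and monitoring.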
The digital twin was constructed from relative abundance profiles observed in a cohort of 58 preterm infants from UChicago’s Comer Children’s Hospital and validated on a separate, out-of-sample cohort of 30 preterm infants from Beth Israel Deaconess Medical Center in Boston. In addition to its forecasting function, Q-net was also used to determine patient-specific risk measures that could be assessed early enough to design targeted clinical interventions.
For 45% of the Boston cohort, Q-net predicted that deviations in the microbiome could be mitigated to reduce the risk of neurodevelopmental problems—meaning, more than half the time the model was unable to find the right intervention at the right time, says Chattopadhyay. And choosing the incorrect supplementation could raise the risk of problems.
“This is not a one-size-fits-all problem,” he says. As has long been suspected, interventions need to be personalized to a specific microbial environment. Next steps include validating intervention results on real patients.
Multiple Applications
Chattopadhyay’s group is interested in big-data problems wherever they may be found and is currently exploring how Q-net can be applied to solve problems in medicine as well as in basic biology and the social sciences. Among the use cases are modeling the opinions of people and how pathogens evolve in the wild and jump from animals to humans, which are topics of other papers now under review.
Several applications have been developed in parallel and the one on the infant microbiome just happened to come out first, Chattopadhyay says. The microbiome has been a hot topic of study for the past decade, but the focus has traditionally been on cataloguing the species or class of gut microbes and coupling that with metabolomic and proteomic data to reveal more patterns for predictive modeling, he says.
The problem with that approach, in his opinion, is that the research community has “hit a wall” due to the methodological challenges of dealing with noisy, highly complex data. Predictive models have pretty much stagnated on correlations between over- and under-abundances of different bacterial species and a particular disorder, which, while important, are all population-level results, Chattopadhyay says.
It is inherently difficult to bring population-level results down to individual level, which is what would be useful for treatment purposes and the design of personalized interventions and therapeutics, he continues. This gets back to his motivation for building a digital twin to model out all the interactivity between microbial species.
Q-net was trained to make predictions about the trajectory of relative abundances of different microbial classes in the infant microbiome, a computational exercise complicated by the fact that bacterial inhabitants and interactions are in flux because of normal human development. Correlations between bacterial species and health disorders are therefore impossible to make, in addition to being irrelevant to individuals.
The study focused on how patient-specific perturbations of the relative abundances of specific microbial classes modulated the estimated risk of suboptimal developmental outcomes. Using machine learning models, researchers showed that increasing the relative abundances of Bacteroidia and Actinobacteria in the early life of preterm babies can help reduce future risk, suggesting these classes are under-abundant, while increasing Clostridia and Gammaproteobacteria can heighten risk, suggesting these classes are over-abundant. Other microbial classes also modulated risk, but in a more complex time-dependent and patient-specific manner, he notes.
Role in NICU
Preterm infants are at elevated risk of suboptimal developmental outcomes, and picking out which ones will be affected, and in what way, is a nontrivial matter, says Chattopadhyay. But it would be “highly possible” to deploy the Q-net tool in neonatal intensive care units (NICUs) because microbiome profiles can be determined using high-throughput gene sequencing, and the clinical rationale is the known connection between gut microbiome health and neurological development, the so-called gut-brain axis.
The data is the starting point for finding the baseline for normal so doctors can know whether and in what ways deviations are occurring, but those answers won’t come easily, he stresses. The microbiome is impacted not only by the genetics of individuals but also by a long list of environmental variables influencing risk either negatively (e.g., being male and total amount of enteral feeds) or positively (e.g., vaginal delivery and increasing the proportion of human milk in diet).
Given the limited amount of data they were working with, Chattopadhyay says, Q-net did a reasonably good job of assessing the interactions happening in the microbiome of the infants. With a larger dataset and more longitudinal observations per participant, investigators expect to further characterize uncertainty and robustness properties of the inferred models as well as better understand the influence of clinical factors and the impact of what happens in the home following NICU discharge.
Researchers plan to seek funding for clinical trials using digital twins, says Chattopadhyay, but as an intermediate step will try to replicate the model’s predictions in a bioreactor that imitates, to some extent, conditions in the human gut (for example, the same type of metabolic profiles and nutrient streams). Probiotics targeting bacterial classes identified as under- or over-represented in preterm infants will be tested. These include Bacteroidia, where deficits have previously been linked with autism spectrum disorder and attention deficit/hyperactivity disorder, and Actinobacteria, whose relative abundance has been associated with improved temperamental traits and fine motor skills.
Caution is required here, he adds, because they’re working with a complex, interactive system with many checks and balances and potential unwanted effects. The microbial communities being maintained in the bioreactor are also derived directly from human fecal samples, the supply of which is not unlimited. “It’s like a whole different ballgame when you’re analyzing this kind of data.”
Inferring Relationships
Development of the digital twins takes a substantial amount of time, but not when compared to the alternative of doing wet lab experiments testing the interactions of bacteria. Exploring just the two-way communications of a typical colony with 1,000 species would take more than 1,000 years, and more complex interactions involving multiple species are common.
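The scale of that wet-lab problem can be checked with simple counting. The sketch below is an illustration only: it assumes each unordered pair of species is one experiment and that "more than 1,000 years" refers to working through the pairs one after another. The article does not state either assumption.

```python
from math import comb

N_SPECIES = 1000  # typical gut species count cited in the article

# Number of distinct two-species combinations: C(1000, 2).
pairs = comb(N_SPECIES, 2)

# Three-way interactions grow far faster: C(1000, 3).
triples = comb(N_SPECIES, 3)

# Assumption: pairs are tested sequentially. For the work to exceed
# 1,000 years, throughput must be below this many pairs per year.
max_pairs_per_year = pairs / 1000

print(pairs)               # 499500
print(triples)             # 166167000
print(max_pairs_per_year)  # 499.5
```

Under these assumptions, the pairwise survey alone covers nearly half a million combinations, and the 1,000-year figure corresponds to fewer than about 500 pair experiments per year. Three-way interactions push the count past 166 million.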
Others have approached the longitudinal analysis of microbiome profiles using classical statistical methods and dynamical systems theory, including probabilistic graphical methods like dynamic Bayesian networks (DBNs), says Chattopadhyay. But, as suggested by comparative results reported in the Science Advances paper, existing methods aren’t predictive when applied to noisy, sparse, high-dimensional microbiome data.
DBNs are in a sense a generative model in that they can produce synthetic data, but with Q-net the creation of digital twins is “more central” to the approach, he explains. “Our objective was not to create [a] classifier... but purely to understand the system.”
That means the model must use existing data to produce new data about individual patients who could have existed but did not, Chattopadhyay continues. The microbiome is constrained by a lot of “hidden laws,” so randomly perturbing the data would serve only to generate an invalid dataset.
For individual-level predictions, “five equations aren’t going to explain what the dynamics are,” he says. Models will, of necessity, be complex. “You need a digital twin.”