A Computational Approach to Modeling COVID-19
By Allison Proffitt
October 1, 2020 | In the months since March, the biotech community has rallied to identify how our unique perspectives and skills can help with the COVID-19 pandemic. The computational biologists at QuartzBio were no different.
“There's plenty of things that are necessary that I cannot do—like be a physician in the ER,” Renée Deehan-Kenney told Bio-IT World, but while doctors treated patients, researchers have been busy generating new data. “Science can sometimes be a very closed competitive world, [but] the speed and the volume with which people were really publishing data and papers or preprints around all of this was fantastic!” she said.
Deehan-Kenney is vice president of computational biology at Precision Medicine Group’s QuartzBio team. QuartzBio is the biomarker data science element of Precision Medicine Group, she explains. Datasets are the raw material at QuartzBio, so although the research team wasn’t generating their own COVID-19 data, the collaborative culture was generating and sharing raw data to play with.
“At the time when we started doing this, there were probably about 10 or so different studies that had been published by academic groups and made to be publicly available,” Deehan-Kenney explains. The team chose a small cohort of only two COVID-19 patients, both of whom died, published in Cell by a team at the Icahn School of Medicine at Mount Sinai (DOI: 10.1016/j.cell.2020.04.026). The published dataset included RNA-Seq data from lung biopsies, and the team applied prior knowledge about cause-and-effect relationships in biology to get an understanding of what is going on in the disease.
“RNA-Seq data is very amenable to this prior-knowledge-based approach because lots of people have published other studies using RNA-Seq or microarray data. We wanted to look for that and there was a great bolus of data that came out of the Icahn Institute by D. Melo et al. They had published RNA-Seq and other data on lung biopsies from COVID-19 patients, but also a couple of different cell line models, data from a ferret model of SARS-CoV-2 infection and made that available.”
With these data in hand, Deehan-Kenney and her team began outlining the questions they could ask.
“We knew that there were a lot of mysteries in terms of what is the actual biological molecular impact of SARS-CoV-2 infection on human beings. And then also, we rely a lot on research models. How good were research models at recapitulating a real COVID-19 disease state? Could we provide any value there in terms of steering translational folks who were working on repurposing strategies within their own organizations toward models that might be more reflective of real human biology?”
Deehan-Kenney and her team built the initial model based on the roughly 19,000 RNA-Seq measurements they were able to map to different coding regions from those two patients. Since then, they’ve added more patient data, “in the tens, not yet hundreds,” of individuals, Deehan-Kenney said. “Right now, any RNA-Seq or other molecular data on clinical samples is a premium, especially lung biopsies, which is really a main site of disease activity, and obviously those are not easy to get. It's going to be very hard to get them from infected individuals that are asymptomatic,” she said. “We're just really focused on whatever is available with the absolute hope that the individuals that published the data were able to include even really simple clinical annotations, like age, or race, or major underlying conditions, because it would be great to know if somebody had underlying cardiovascular conditions or are they reasonably healthy.”
The model is also populated with data on drugs and drug mechanisms of action. Repurposing drugs is a significant focus in the search for COVID-19 therapies, especially given that those therapies have already been approved and shown to be safe, and the QuartzBio team wanted to include as much known information about drugs as possible.
“We're not necessarily doing anything to build a classifier that would turn into a diagnostic or to repurpose an actual drug ourselves,” Deehan-Kenney clarifies. “It was more, can we take the data that's available to us, learn something new about COVID biology and then share that with the world so that folks, either in academia or in industry who are doing their own COVID-19 research, can learn something about the mechanisms that actually prosecute SARS-CoV-2 infections, or what are some good models like preclinical models that we could use to study in our own translational or repurposing programs.”
A third goal, of course, is to improve on QuartzBio’s own computational prowess. “Our day job is to think about different ways to analyze high throughput molecular data,” Deehan-Kenney said. “So if we can use COVID-19 as an example, [and] show folks some ways in which they might be able to also utilize public data or work with us on different types of methods that we would use to apply to the problem, then great.”
New Problems, Old Approaches
While COVID-19 is the most pressing problem of the day, modeling this viral disease is, in most ways, similar to many other disease biology problems before it.
“No matter what, I think it's critical if you're developing a drug or a therapy or a vaccine, that you have to really understand the underlying biology of what it is you are trying to treat,” Deehan-Kenney said. “If you understand the biology, then you're not just throwing darts in the dark; you have areas of focus. From that perspective, I would say that looking at and trying to model the biology of COVID-19 is identical to how you would think about a lupus program or a pancreatic cancer program.”
Equally essential when considering biology is to plan for and model the diversity of patients. With only two patients, QuartzBio was not able to ask questions about age or underlying conditions. Instead, the initial focus was on understanding what happens downstream of infection: how does infection impact patients with severe disease?
“Next step would be to get into those other questions,” Deehan-Kenney said. “Are the same pathways activated in a healthy 12-year-old as an individual who's 70 and had a lot of underlying conditions? And what is it about the 12-year-old that allowed them to overcome the infection?”
The QuartzBio model isn’t a simple predictive model. “The question we were asking was, ‘What is the complexity of biology that happens downstream with infection?’ That's more of a large-scale characterization rather than, ‘Can I predict who is going to be infected or not?’ Or, ‘Who is going to progress to need a ventilator or not?’” Deehan-Kenney explained. “Really what you're looking to try to do is to find a therapeutic that will restore balance and push the individual in that dysregulated state back to a healthy state.”