AI In Real-World Drug Discovery: The Experts Weigh In

By Deborah Borfitz

May 8, 2019 | While artificial intelligence (AI) still tends to be underfunded by big pharma, machine learning (ML) is creating efficiencies in the drug development process and collaboration between biologists and ML experts is becoming more commonplace. Companies are still formulating their overall AI strategy, but hopes are high for a future with "killer apps" for predicting toxicity and drug response.

Those were among the themes that emerged from an "AI in Practice" keynote panel session at the recent Bio-IT World Congress & Expo in Boston. Participating panelists were Anne E. Carpenter, PhD, senior director of the Imaging Platform at Broad Institute of Harvard and MIT; Iya Khalil, PhD, chief commercial officer and co-founder of GNS Healthcare; Mariana Nacht, PhD, chief scientific officer, Vivid Biosciences; and Susie Stephens, PhD, senior director of Oncology & Vaccine R&D Information Technology at Pfizer.

What's New

Increasingly, more user-friendly AI tools are getting into the hands of people with domain knowledge who understand the problems that most need solving, says Carpenter. Using images as a data source, her lab at the Broad Institute is currently using ML for phenotypic profiling in drug discovery.

Both the quantity and quality of data have improved, says Khalil, and AI and ML are no longer cost-prohibitive. She spends a lot of her time searching for big datasets that can virtually represent patients, allowing her to run simulations to predict optimal treatments. "It's time for organizations to start figuring out how to adopt the technology," she adds, describing it as essentially a set of mathematical equations for improving clinical care.

Over the next decade, biological proofs of concept will demonstrate which algorithms are the valuable ones, says Nacht. "It's not the data that's interesting but the application of the data." Vivid, a startup focused on precision medicine, does high-throughput functional screening against hundreds of agents trying to identify the right therapy for the right patient.

Broadly defined, AI (outside of automation) is "still maturing," says Stephens. "The life sciences have more data than in past, and it's making a difference, but not enough for all the potential of AI."

Managing 'Boatloads' of Data

How high the quality of the data going into AI algorithms must be depends largely on the use case, says Khalil. An endpoint of cancer survival, for example, will be easier to measure from a dataset than treatment response in heterogeneous diseases like arthritis and diabetes. "Any model needs accurate data," she adds, "but if you're trying to predict if I have a cold… 50% accuracy is good enough for me."

Quality data also needs to be put in a place where it can be reused, adds Stephens, which for Pfizer is a data lake.

To guard against bias, Khalil recommends "strict guidelines around out-of-sample [forecast evaluation period] and in-sample [model selection period] performance." Since algorithms are trained on existing data, Nacht adds, they are biased initially but will grow less so over time with additional data and iterations. As Stephens points out, it's also a good reason to involve subject matter experts in AI projects.
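The in-sample versus out-of-sample distinction Khalil raises can be illustrated with a toy sketch. Everything here is an assumption made for illustration and not anything the panelists described: the synthetic one-feature dataset, the 20% label noise, and the nearest-neighbor "model." In-sample accuracy is measured on the same records the model was fit to, while out-of-sample accuracy is measured on records it has never seen.

```python
import random

def make_data(n, rng, noise=0.2):
    """Synthetic records (hypothetical): feature x in [0, 1); the true label
    is x > 0.5, flipped with probability `noise` to mimic messy data."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = x > 0.5
        if rng.random() < noise:
            label = not label
        data.append((x, label))
    return data

def predict_1nn(train, x):
    """Predict the label of the training record whose feature is closest to x."""
    nearest = min(train, key=lambda rec: abs(rec[0] - x))
    return nearest[1]

def accuracy(train, evaluation):
    """Fraction of evaluation records whose label the model predicts correctly."""
    hits = sum(predict_1nn(train, x) == y for x, y in evaluation)
    return hits / len(evaluation)

rng = random.Random(0)            # fixed seed so the run is repeatable
train = make_data(200, rng)       # records the model is fit to
held_out = make_data(200, rng)    # records the model never sees during fitting

in_sample = accuracy(train, train)         # scored on the fitting data
out_of_sample = accuracy(train, held_out)  # scored on unseen data

print(f"in-sample accuracy:     {in_sample:.2f}")
print(f"out-of-sample accuracy: {out_of_sample:.2f}")
```

A nearest-neighbor model memorizes its training records, noise included, so it looks perfect in-sample and noticeably worse on the held-out set. That gap is why in-sample performance alone says little about how a model will behave on new patients.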

Carpenter cites "privacy-preserving" data sharing as an exciting development, and Khalil says learning from all trials by aggregating study data, and not just within one organization, is "the dream."

"We need a consistent set of metrics," says Nacht, and "exact conditions matter, for data acquisition as well. One universe of data is impractical."

Pfizer's data strategy approach, in addition to the data lake, is to keep patient-focused data separate from data capture tools and technologies, says Stephens. "There's a lot of resistance to standards across the industry and progress in that area would be helpful," she adds.

Technology and Collaboration

For anything other than push-button insights, data scientists are needed to work collaboratively with clinicians and biologists to ensure AI algorithms are robust, says Khalil. "The definition of biologist will change."

"Sequencing will become commoditized," predicts Nacht. It is getting easy to access "omics" data of all types, she adds. The question remains how to monetize it.

AI technologies are deployed "quite broadly" at Pfizer, says Stephens. In the image space, the company's push toward automation is being facilitated through a collaboration with Atomwise to identify potential drug candidates for target proteins. More recently, Pfizer began working with Concerto HealthAI to apply AI technology to precision oncology, with the aim of identifying the best treatments and designing clinical studies that require fewer patients.

Elsewhere, AI is starting to drive efficiencies in multiple arenas, including using microscopy images to triage patients, says Carpenter. Panelists agree that it is also becoming a tool to prospectively identify targeted therapies for specific types of populations. But in oncology, where fewer than 10% of cancers have genomic markers that are actionable, researchers need to identify other ways to predict efficacy and get good outcomes, says Nacht.

The advantages of AI are that it can do what humans under the best of circumstances cannot—e.g., automate the process of image analysis and model the longitudinal response of cancer patients to different treatments to create virtual patient cohorts for clinical trials. Given that only about 20% of oncology drugs succeed in the clinic, using AI to bump up response rates to even 50% or 60% would be "amazing," says Nacht.

Assembling the right domain experts around AI could one day eliminate animal models and make clinical trial results fully predictable from cell growth in a petri dish, says Carpenter. Pfizer is purposefully spreading the gospel of AI across the organization, facilitated by an AI Center of Excellence and a newly hired chief digital officer at the C-suite level.

Staffing and Training

In addition to in-house training programs, the bioinformatics department of GNS Healthcare is hiring AI expertise, says Khalil. She adds that a new generation of scientists is being simultaneously trained in biology and healthcare.

Cross-training will be critical, says Nacht. Experts will all have their own turf but start communicating more with one another. "We need teams that are diverse."

And they need to be aware of the ethical hurdles, Carpenter adds, citing evidence that current clinical protocols may have negative results for black women. "If you commit to this, it will be hard and you need to put patients first," echoes Khalil. In addition to being cognizant of their privacy needs, she advises, be sure patients retain autonomy over their treatment decisions.

Expertise is needed to curate the metadata, and at Pfizer a metadata capture tool helps automate that process as well, says Stephens.

If bio-samples are treated ethically (e.g., deidentified with no link back to metadata), "most people are very willing [to] be involved [in clinical studies] for the greater good of learning more, to help themselves or society as a whole," says Nacht. This is hopefully not because they do not fully understand what they're agreeing to, she quickly adds.

Pfizer certainly wants patients to fully understand the informed consent documents they're signing rather than mindlessly clicking "I accept," says Stephens. Willingness to participate when a DNA sample is involved can look very different for a cancer patient with no available therapies than for those with more treatable conditions and plenty of other options.

FAIR Data Principles

Discussions are taking place internally at Pfizer about FAIR—Findable, Accessible, Interoperable and Reusable—data principles, Stephens says. "We work hard to make sure data is managed robustly and is available to mine."

Companies making significant investments in data monitoring and storage will need revenue to offset those costs, notes Carpenter. "Having an organization to host data is potentially useful."

Khalil points to the availability of open-access databases, such as the UK Biobank, which researchers can run their datasets through to make predictions on smaller populations. This, she adds, is distinct from testing data used to assess the performance of the model itself.

It simply makes sense to produce some training datasets in-house. "[Vivid Biosciences] uses human patient samples vs. cell lines, so we have to develop everything from scratch," says Nacht. Pfizer's data lake approach allows metadata to be associated with the data and multiple databases to be housed within a single repository.

What's Needed

Panelists couldn't point to a single killer AI app, but they'd welcome one that could forecast toxic effects of chemicals on biological systems given that 40% of trials fail due to toxicology. It's a huge, complex issue because people not only metabolize drugs differently but are often taking more than one medication and they potentially interact, says Nacht. Existing, alert-based apps are too imprecise.

The most useful AI-powered apps Stephens has seen are the wearable devices Pfizer now uses in clinical studies so patients are less often inconvenienced by a trip to an investigative site.

While people within Pfizer are for the most part actively embracing AI, the acceptance level across big pharma in general is relatively low. AI tends to be "under-resourced," says Carpenter, with far more enthusiasm than use cases.

Few in the industry doubt the potential of AI to make a difference in the world of drug development, especially when it comes to creating efficiencies—including clinical trials that enroll fewer patients on whom a higher proportion of readouts are done, says Carpenter. Stephens agrees, adding "the automation of tasks leaves a lot more time for value-added activities."

AI can also help clinical studies become more adaptive and optimized sooner, says Khalil, streamlining recruitment and operations.