AI Taking on the Holy Grail of Diagnostics: Predicting Drug Response
By Deborah Borfitz
July 30, 2024 | Machine learning models can be superior to radiologists when it comes to processing complex images and putting a predictive label on them, particularly at scale, according to Peter Sorger, Ph.D., professor of systems pharmacology at Harvard Medical School. The problems currently being studied in biomedicine with machine learning, such as identifying the subtype of cancer a patient has, are ones that have been largely ignored for a long time, he says.
It’s not just that the computational tools now exist to help understand the tumor microenvironment and predict tumor behavior, Sorger adds. Ideas about what constitutes an “interesting question” have also evolved.
The disturbing reality today is that most patients, even in an oncology setting, are “getting drugs without any indication of whether they will or will not respond,” he says. His own research indicates most cancer patients are treated with medications that “almost definitively will not matter and... carry a pretty substantial burden of therapy.”
Only about 6% of all diagnoses are impacted by a genetic marker, says Sorger, highlighting the “incredible mismatch” between what is thought of as precision medicine and what happens in the real-world practice of medicine. The diagnostics of greatest interest to industry, and the only type it will ever invest in, are “the ones that in principle predict a drug response—and in general will be introduced only in the circumstance where a clinical trial will fail in the absence of that biomarker.”
In numerous situations, biomarkers backed by scientific evidence of improved patient outcomes have not been advanced by a drug company simply because there wasn’t enough economic incentive to do so. It is simply far more profitable to develop a drug, says Sorger, pointing to data from IQVIA suggesting investment in the development and validation of biomarkers is about 100 times smaller than that devoted to new medicines.
Sorger and his team witnessed this firsthand after discovering a new predictive biomarker for colorectal cancer identifying patients who will respond to immunotherapy—notably, a much larger population than is currently being treated with checkpoint inhibitors because they have tested positive for specific gene changes (i.e., high level of microsatellite instability or changes in one of the mismatch repair genes). Their technique employs artificial intelligence (AI) to learn how many immune cells have infiltrated a tumor to predict the likelihood of progression-free survival in that “hot tumor” population.
Speeding Translation
Sorger says he ran a basic research lab at Massachusetts Institute of Technology (MIT) before moving to Harvard Medical School 12 years ago to start the Harvard Program in Therapeutic Science. Its objective is to accelerate the rate at which drug discovery projects transition to the regulatory review process.
As part of that work, he has been looking at the role of machine learning and AI in image recognition—specifically, “at the intersection between translational and basic research and the development of modern diagnostics,” Sorger says. He draws some of his inspiration from the work of Regina Barzilay, professor of AI and health at MIT. After she was diagnosed with breast cancer, she developed a set of algorithms for determining when and how often women should have a mammogram based adaptively on their circumstances to minimize unnecessary radiation exposure.
Sorger shares her quest for better imaging-based biomarkers for improving patient diagnosis and treatment outcomes. Disease biomarkers of one kind or another have been pivotal in at least a dozen practice-changing clinical trials, he says, but in the bigger picture have done little to fulfil the promise of precision medicine. Nearly all solid cancers are still diagnosed via visual inspection of a piece of tissue by a pathologist under an optical microscope.
The two most important biomarkers currently are a BRAF mutation in cutaneous melanoma and the image-based HercepTest for a subset of breast cancers, Sorger notes. In the latter case, pathologists score patients on a scale from 0 to 4 based on the expression of the HER2 oncogene. “Women with HER2-positive breast cancer can live almost a full lifetime as a consequence of this kind of effective diagnosis.”
For both melanoma and colorectal cancer, incidence rates in the U.S. are rising rapidly in the young for reasons that are not well understood, says Sorger. Treatment guidelines put out by the National Comprehensive Cancer Network cover disease staging and grading, and almost all of this is being done manually by physicians.
Despite the interest in treating colon cancers based on a biomarker measurement, only a “small minority” of patients are treated with any immunotherapy, he continues. Treatment with checkpoint inhibitor immunotherapy is generally reserved for individuals with advanced disease caused by a dysfunctional DNA repair system. Almost everyone else is going to get cytotoxic chemotherapies, including patients presenting with early-stage disease.
Roughly 65% of all solid cancers get surgery, meaning a lot of tissue is being recovered, Sorger adds. The pathology department at Brigham and Women’s Hospital probably runs about one million tissue specimens per year and, “it is still being done in a way that is fundamentally similar to what a physician would have done in 1890.” This is what AI is poised to complement.
Role of Immune Cells
Existing image-based diagnostics for colon cancer rely on an immune-based classification for determining if a tumor is hot (quick response to immune checkpoint inhibitors) or cold (low response rate). Pathologists generally use a brown stain, a technique using an antibody connected to a chemical that makes immune cells visible, to determine how many of them have infiltrated a tumor, Sorger explains.
Immune checkpoint inhibitors are given to individuals whose cancer triggers a strong immune response and are therefore expected to respond well to immunotherapy. But the treatment is often unsuccessful, says Sorger.
People who have reached their early 60s will have about 100 oncogenic mutations per square centimeter of sun-exposed skin and, “in principle, those mutations could go on to cause cancer except for [their] immune system,” he says. After the colon, the skin is the most infiltrated organ in the body when it comes to immune cells, which are continuously on the lookout for oncogenic transformation and mounting counterattacks.
This surveillance process is what’s being observed with hot tumors, and capturing it is an incredibly complex task that modern machine learning models lack the sophistication to do on their own, says Sorger. Pathologists do it by toggling between the super-high magnification of single cells and the low-power context of the tissue, a process that involves a lot of data integration as well as fiddling to find focus.
Sorger and his team partnered with RareCyte, a company providing spatial biology and liquid biopsy solutions, to develop a new instrument (Orion) for mid-plex multimodal tissue imaging. The process combines immunostaining, which accounts for much of what is done in the pathology field, with molecular imaging on the same tissue section (Nature Cancer, DOI: 10.1038/s43018-023-00576-1).
The technique uses antibody-based imaging of up to 18 proteins all at once and up to 60 different colors, each representing a different molecule, he explains. “This allows us to... [obtain] underlying molecular information, something we couldn’t work on 20 years ago but has now sort of dawned on us to do.”
The project was funded by the National Cancer Institute under a Small Business Innovation Research grant and is now a commercially available instrument. About 50 of the platforms have already sold, reports Sorger, who serves on the company’s scientific advisory board.
Biomarker Stratification
Brown staining and digital imaging powered by machine learning were recently used in a study by Sorger and his colleagues to automate ImmunoScore (the standardized immune assay) consistent with current understanding of hot and cold cancers. Their intent is to predict progression-free survival based on a tumor’s T cell infiltration.
Participants in the single-center, retrospective study included 40 treatment-naïve patients with stage 2/3 colorectal cancer. The researchers then mapped out survival, as measured radiologically, for the high and low scorers over time.
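The comparison described here, survival over time for high versus low scorers, is commonly drawn as a pair of Kaplan-Meier curves. The article does not say which estimator the team used, so the following is only a minimal sketch under that assumption. The function names, the 0.5 score threshold, and the six-patient data set are all hypothetical and are not taken from the study.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each time an event occurred.

    durations: follow-up time per patient (any consistent unit, e.g. months)
    events: 1 if progression was observed at that time, 0 if censored
    """
    event_counts = Counter(t for t, e in zip(durations, events) if e)
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        d = event_counts.get(t, 0)
        if d:
            # Multiply by the fraction of at-risk patients who did not progress at t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Everyone whose follow-up ended at t (event or censored) leaves the risk set.
        at_risk -= sum(1 for x in durations if x == t)
    return curve


def split_by_score(scores, threshold):
    """Hypothetical stratification: indices of high vs. low immune-score patients."""
    high = [i for i, s in enumerate(scores) if s >= threshold]
    low = [i for i, s in enumerate(scores) if s < threshold]
    return high, low


# Made-up illustration data, NOT from the study.
scores = [0.9, 0.8, 0.7, 0.2, 0.1, 0.3]   # automated immune score per patient
months = [30, 36, 24, 6, 12, 12]          # follow-up time
progressed = [0, 0, 1, 1, 1, 0]           # 1 = progression seen, 0 = censored

high, low = split_by_score(scores, threshold=0.5)
km_high = kaplan_meier([months[i] for i in high], [progressed[i] for i in high])
km_low = kaplan_meier([months[i] for i in low], [progressed[i] for i in low])

for label, curve in (("high scorers", km_high), ("low scorers", km_low)):
    for t, s in curve:
        print(f"{label}: month {t}, progression-free fraction {s:.2f}")
```

Censored patients (those who had not progressed when follow-up ended) still count toward the at-risk total until they drop out, which is why the estimator is preferred over a simple fraction of patients who progressed.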
Most of the patients in fact died on roughly the same time scale, in part because they were already quite sick upon presentation at the hospital, says Sorger. In another cohort where only a minority of the patients had undergone progression the situation was quite different.
In this group are some individuals with more infiltrated hot tumors who might be treated in a personalized fashion by being put on a “watchful waiting” protocol because they have better outcomes independent of therapy, he says. The side effects of chemotherapy treatment can include peripheral neuropathy where patients lose feeling in their hands and feet—sometimes permanently.
In the U.S., where “defensive medicine... is the standard of care,” says Sorger, “every single patient is put on the maximum dose of cytotoxic chemotherapy under all circumstances.” The use of biomarker stratification could spare some of them the ravaging side effects.
England has instead been optimizing the patient experience of chemotherapy treatment, in part to spare the medical system the cost of the drugs, he adds. “There is [also] a lot of evidentiary basis in medicine that never makes it into practice. Moreover, if patients move from Mass General Hospital to Dana-Farber [Cancer Institute] they would undergo changes in therapy due to protocol.”
The Outliers
When a tumor is highly infiltrated, which describes about 15% of colorectal cancer cases, immunotherapy remains a good idea because these patients experience “extremely good outcomes,” says Sorger. For the other 85% of patients, the challenge is to find accurate predictive biomarkers for colon cancer immunotherapy. And that will require personalized treatment approaches tailored to individuals rather than populations, he stresses. Consider what he and his team recently discovered by examining specimens from patients with Lynch syndrome, which is a type of hereditary colon cancer caused by mutations in one of four DNA mismatch repair genes and characterized by heightened T cell responses in the tumor.
The surprise discovery (yet-unpublished research), as measured with just a clinical specimen and fancy microscope, is that “the correlation [between mismatch repair status and T cell infiltration] breaks down for a whole series of highly infiltrated patients who do not have any of the established genetic criteria for infiltration,” reports Sorger. Excluding those with zero progression already targeted for immunotherapy, there remained others treated with chemotherapy who might have done better on either immunotherapy or nothing at all.
The reason some tumors are hot while others are cold is an “incredibly active area of research,” says Sorger, since genetic features explain less than half the variance. To that end, his group developed technology that generates three-dimensional moving images of single T cells interacting with tumor cells inside human tissue—and with a precision that a few years ago could be done only in a tissue culture dish in a preclinical lab.
Capturing individual molecular interactions in this way has allowed diagnostic features discovered in the clinical setting to be brought back to the lab to search for the underlying mechanism on the same piece of tissue, both in patients whose cancer has progressed and in those whose cancer won’t, he says.
Current Trends
Machine learning is ubiquitous in biomedical research, says Sorger, noting that it has been deployed in one form or another by “essentially every laboratory on [the Harvard] campus.” Much of it is “interesting but really sort of practical stuff,” such as algorithms that look at slides alongside human pathologists.
There is also “a lot of back and forth between what we think of as patient care and what we think of as scientific discovery... and more sophisticated machine learning models have made it possible to do this kind of research much more efficiently,” Sorger says. “This is oddly a paradigm from medical schools in the 1950s,” he adds, although his lab now studies disease processes directly in humans—via deidentified patients under universal consent for biological samples—rather than yeast or mice as in prior years.
The relationship between academic and industry research as a drug moves from the basic science research stage to marketing approval is not as linear as many people assume, he continues. In fact, much of his work is focused on the “backflow” of drugs from clinical practice into the research setting to optimize their use.
The future of biomedical research in the public interest is deeply uncertain, says Sorger, with the agenda increasingly driven by billionaires. Almost every aspect of modern medicine has been privatized as federal investment has dwindled; in the case of cancer, it peaked in 1988 and has been falling steadily since the early 2000s.
While the size of the U.S. population grew from 245 million to 350 million between 1988 and 2022, he says, “we’re investing about 40% less per cancer patient with public resources.” This has resulted in machine learning models which, despite being built on public assets, are more expensive and less equitably accessible than they otherwise would be.
The new add-on to a standard mammogram, for instance, costs between $40 and $100 out of pocket. Even AI products that have gone through the FDA authorization process have no billing codes radiologists can use for reimbursement purposes.
In Sorger’s view, privatization is a “rent-collecting activity on what should have been a public resource.” For the most part, companies developing AI are bringing their solutions straight to market and skipping the time and expense of clinical trials.
This could all change with the FDA’s final rule signaling the agency’s intent to regulate laboratory developed tests (LDTs) as it does other in vitro diagnostics. In the U.S., 95% of all molecular diagnostics are LDTs, says Sorger. This includes a growing number of multi-cancer early diagnostics.
“The expectation is that this will result in a reduction of available diagnostics by at least five-fold ... unless the economics change,” he says. Foundation Medicine (a division of Roche), whose genome sequencing LDT is done on almost all cancer patients cared for by top hospitals in the U.S. and which had $3 billion in sales last year, could presumably afford the cost of running clinical trials but, notes Sorger, has yet to do so.