Open Science Needs A Do-Over
By Deborah Borfitz
May 6, 2019 | The business of research has traditionally focused on producing academic papers, limiting the growth of scientific knowledge to "massive incrementalism" around a succession of magic-bullet solutions: microassays, the Human Genome Project, robotics and artificial intelligence and, most recently, open science. Instead of research being translated by the market and compatible communities coming together to solve pressing problems, we have monopoly networks like Facebook, Apple, and Google taking the reins of the precision health movement.
That was the central message of John Wilbanks, chief commons officer of Merck spinoff Sage Bionetworks, in his keynote presentation during opening ceremonies of the 18th Annual Bio-IT World Conference & Expo held recently in Boston. "We don't need more science but more just science," he argues, where transparency, replication and trial registration are the norm. "Open is not something that can be built on broken science."
Currently, fragments of knowledge are being locked in PDFs: "between 0.6 and five published papers per $100K funding" to be exact, Wilbanks says. Researchers have succeeded in creating pathway maps for Alzheimer's disease, he cites as an example, but not in discovering a way to treat it or determining if beta-amyloid deposits are even the best therapeutic targets.
Open science has been a trending topic for several years now, as evidenced by boycotts of publishing giant Elsevier over journal costs and open access as well as the public access and data-sharing policies surrounding the National Cancer Institute's Cancer Moonshot initiative, says Wilbanks. Sage Bionetworks itself has petitioned for public access to results of publicly funded research. How people collectively organize themselves around the open science movement has a lot to do with their assumptions about what the future holds.
Sage Bionetworks "lives in that future" and has lessons to share from the work it has done, Wilbanks says. "Federation dreams are possible; the barriers are human."
Building High-Trust Communities
In the first-ever effort to generate consensus cancer subtypes, Sage Bionetworks convened an international group of colorectal cancer (CRC) researchers, data scientists, and clinicians to create the Colorectal Cancer Subtyping Consortium. The group collectively examined all the published data, in a single format, around identified CRC subtypes and eventually produced a consensus paper identifying four with unique clinical and molecular markers, Wilbanks says.
In an ideal world, collaboration like that would happen pre-competitively, before publication, he continues. If beta-amyloid plaque is the wrong therapeutic target, pharma companies and the National Institutes of Health (NIH) will want to figure out where to redirect their efforts. What needs to scale is the number, not the size, of high-trust research communities organized around discrete scientific questions and methodologies.
"The value is in the community created," Wilbanks says, noting that Sage now has three technology platforms supporting research collaborations—one each for computational algorithms (Challenge Platform), data sharing and analysis (Synapse Platform), and conducting biomedical research studies using ResearchKit (iOS) and ResearchStack (Android) frameworks (Bridge Platform). "If the same discovery is made at two different labs, the odds that it's noise will be much lower."
The Bridge Platform provides a way to consent patients and do sensor-based tasks (e.g., track tremors in Parkinson's disease patients via a finger tapping test) in studies that rely on mobile phones. Enrollees may still drop out of studies after 30 days, Wilbanks says, but by then a lot of data has already been collected. "Invisible impacts are made visible." A study management portal has a dashboard where researchers can track variables such as number of taps, intervals between them, and the effect of medications over time.
Questions remain about how to connect the dots between individuals being tracked, how much to tell them and the regulatory triggers, Wilbanks says. There are additionally issues of privacy, equity and fairness to resolve. The impact most important to patients—the number of drugs coming to market—also hasn't changed.
New Approach to Open Science
Perhaps we can "triangulate to better bets," Wilbanks proposes, by moving toward "open science as a suite of methods." The NIH and its National Institute on Aging have a grant out to establish multi-component Alzheimer Centers for Discovery of New Medicines to move beyond the amyloid hypothesis by looking at lead targets in the public domain. The Parkinson's Disease Digital Biomarker DREAM Challenge had contestants around the world working on methods for processing mobile sensor data to distinguish gait and motor differences between patients and controls.
Sage Bionetworks also openly released its mPower study and companion smartphone app to more than 150 qualified researchers with access to donated patient data through a public portal. "You can't open everything, but you can send people to data," says Wilbanks. The mPower study is piloting new approaches to monitoring key indicators of Parkinson's disease progression and diagnosis by supplementing traditional behavioral symptom measurements with novel metrics gleaned from sensor-rich mobile devices.
The National Center for Data to Health (aka CD2H), launched in the fall of 2017, is also endeavoring to coordinate informatics to accelerate innovation and improve patient care—in this case across the NIH's Clinical & Translational Science Awards Program.
To get the desired "network effect," institutions need to think in terms of projects rather than mandates, says Wilbanks.
Justice Concerns
Issues of bias and fairness are of growing concern, says Wilbanks. Mobile studies may increase enrollment, but how often does that come at the expense of informed consent because of deceptive practices at the digital user interface level? "We need to slow people down and make them pay attention. They're not really reading legal documents; they're just clicking 'OK,' and we don't want data owners to do that."
With what Wilbanks terms "participant-centered consent," individuals literally get a visual red-hand alert about privacy risk before being asked to choose from multiple data-sharing options. It's important to leverage the developer culture, he says, noting that Sage Bionetworks has an entire toolkit of design documents and templates focused on informed consent. "All it takes is one crappy piece of code and data leaking… and trust is gone."
"Just-ice" in science also helps prevent data problems, continues Wilbanks. An ML algorithm trained on photos of people who are predominantly white will have a hard time recognizing nonwhite faces. Amazon learned the hard way that algorithms can be inadvertently racist when it rolled out its same-day delivery service without considering the economic disparities and segregation between white and black communities.
Faulty algorithms also have the potential to limit the benefits of precision medicine to the wealthy and white. Precision medicine, by definition, permits surveillance and "real harms… to real people."
A justice issue arises when anyone donates their DNA "with a click," giving away their full genetic code—and unintentionally putting their entire family at risk, says Wilbanks.
AllofUs Research Program
To advance precision medicine at scale, the NIH's AllofUs Research Program will collect data on one million or more people and follow them for a decade. "It's like the Framingham study of this decade," says Wilbanks. Potential activities asked of participants in the current protocol include authorizing the sharing of their full electronic health record (EHR), answering surveys, providing physical measurements and blood samples, and, coming soon, sharing data from wearable devices and genomic sequencing.
A single institutional review board is charged with reviewing the protocol, informed consent, and other participant-facing materials. "The protocol is where the rubber meets the road," says Wilbanks, and it is managed by software with version control. Sage Bionetworks is implementing a universal informed consent process, which requires video voiceovers and fifth-grade readability to accommodate the diversity of the enrollment pool, including traditionally under-represented populations.
Biomedical researchers will start playing with data from the AllofUs initiative at the end of the year, Wilbanks says. "The data is in a private cloud and not downloadable," so it can neither be copied nor redistributed.
Disturbingly, patient surveys suggest many enrollees don't fully understand what they've signed up for, Wilbanks says. Three out of five respondents, across races, ethnicities and educational levels, incorrectly identified both the purpose of the AllofUs study and whether participation was voluntary or required. Wide variation in understanding about privacy risk prompted one question to be rewritten at a lower literacy level, part of an overarching mission at Sage Bionetworks to "move the needle on the 'informed' part of informed consent."