How Large Language Model-Based AI Will Transform Drug Safety & Regulatory Processes

July 12, 2024

Commentary Contributed by Ramesh Ramani and RaviKanth Valigari, ArisGlobal 

Training and validating intelligent automation systems for life sciences R&D purposes can undermine the business case. But the large language models (LLMs) used to power Generative AI (GenAI) are changing that, bolstering compliance and enabling on-the-fly data discovery, “in context” learning, and narrative extrapolation.  

The trouble with applying AI to labor-intensive Safety and Regulatory processes within life sciences research and development is that the technology has to be extensively “trained” in what to look for. Its deductions also need to be “explainable” to regulators, for the sake of compliance, credibility, and trust. And it is here that the latest advances in AI and machine learning (ML) offer substantial process transformation potential.  

How New-Generation AI Differs 

While process automation solutions have existed for some time to lighten the R&D regulatory and safety workload and enhance efficiency, there have been two main sticking points up to now: how to swiftly train modern AI algorithms so that they pick up on only what’s significant, and how to satisfy the authorities’ need for accuracy and transparency. 

GenAI technology, using LLMs, quickly understands what to look out for and can reliably summarize key findings for the user, without the need for painstaking "training" by overstretched teams, or validation of each configuration. 

LLMs (the vast models that power GenAI tools), combined with advanced natural language processing (NLP) techniques like retrieval-augmented generation (RAG), make advanced automation a safe and reliable reality in key life sciences R&D processes, without the need for continuous, painstaking oversight. Essentially, RAG reduces the need to fine-tune AI models by allowing LLMs to draw on proprietary data alongside the publicly available information they were trained on, giving them a bigger pool of knowledge, and context, to work from. 
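
To make the RAG pattern concrete, the simplified Python sketch below shows the basic retrieve-then-generate loop: rank a small set of proprietary documents against a question, then pass the best matches to a generative model as context. The documents, the bag-of-words "embedding," and the call_llm stub are all illustrative placeholders, not any particular vendor's implementation.

```python
# Minimal retrieval-augmented generation (RAG) sketch, for illustration only.
# The "embedding" and LLM calls are placeholders; a production system would use
# a validated embedding model and generative model endpoint instead.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Placeholder "embedding": a bag-of-words term-frequency vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def call_llm(prompt: str) -> str:
    # Placeholder for a call to a generative model.
    return "[summary grounded in the retrieved context]"

# Hypothetical proprietary documents the base model has never seen.
internal_docs = [
    "Case 2024-0117: patient reported severe headache two days after the dose was increased.",
    "SOP-SAF-014: all suspected adverse events must be triaged within 24 hours of receipt.",
]

def answer(question: str, k: int = 1) -> str:
    q_vec = embed(question)
    ranked = sorted(internal_docs, key=lambda d: cosine(q_vec, embed(d)), reverse=True)
    context = "\n".join(ranked[:k])                                   # retrieval step
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer using only the context."
    return call_llm(prompt)                                           # generation step

print(answer("Which cases mention a headache after a dose change?"))
```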

Applying GenAI-Type Techniques to New Data  

The biggest breakthrough in all of this is that specialized applications can now be developed that apply GenAI-type techniques, contextually, to data they haven’t seen before, learning from and processing the contents on the fly.  

For drug developers, this has the potential to transform everything from dynamic data extraction associated with adverse event (AE) intake, to safety case narrative generation, narrative theme analysis in safety signal detection, and the drafting of safety reports. 

Importantly, carefully combined LLM and RAG capabilities are sufficiently transparent and explainable to regulators for the technology to be acceptable as safe and reliable. Responsible AI and AI compliance are particularly critical in life sciences use cases, so it is essential that companies deploy solutions that are proven and transparent. The LLM/RAG approach addresses potential concerns about data security and privacy, too, as it does not require the use of potentially sensitive patient data for algorithm training/machine learning. It also stands up to validation by way of periodic sampling by human team members—sampling that can be calibrated as confidence grows in the technology’s performance, ensuring that efforts to monitor its integrity do not undermine the significant efficiency gains that are possible.  

Simplified System Validation 

Because LLMs make it possible to bypass the need to train AI models or algorithms, a single technology solution can handle all variations of incoming data, simplifying the system validation burden. RAG patterns can play an important role here, explaining a standard operating procedure to an LLM in natural language so that the system knows what to do with each of many thousands of forms, without the need for special configuration for each individual format.  
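
As an illustration of the "SOP as prompt" idea, the sketch below wraps a natural-language standard operating procedure around the raw text of an incoming form, so the same code path can handle any layout. The SOP wording, field names, and example form are hypothetical rather than a real intake schema.

```python
# Sketch of the "SOP as prompt" idea: one natural-language procedure drives data
# extraction from any incoming form layout, instead of per-format configuration.
# The field names, SOP wording, and example form are hypothetical.

SOP = """You are processing adverse event intake forms.
1. Identify the suspect product, the reported reaction, and the onset date.
2. Return the result as JSON with the keys: product, reaction, onset_date.
3. If a field cannot be found, return null for that key."""

def build_intake_prompt(raw_form_text: str) -> str:
    # The same template is reused for every form variant; only the raw text changes.
    return f"{SOP}\n\nForm content:\n{raw_form_text}\n\nJSON:"

example_form = """Reporter: Dr. A. Smith
Medicinal product: ExampleDrug 10 mg tablets
Event described: persistent headache, onset 02-May-2024"""

# The resulting prompt would be sent to the LLM; the structured JSON it returns
# can then be sampled and checked by human reviewers as part of validation.
print(build_intake_prompt(example_form))
```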

The potential impact is impressive. In a user survey we conducted, applying LLM-RAG technology to AE case intake was shown to deliver efficiency gains upwards of 65%, with data extraction accuracy and quality above 90% in early pilots. For safety case narrative generation, the same technology is already demonstrating 80-85% consistency in the summaries it creates. And that’s from a standing start, without prior exposure.  

Contextual Data Retrieval 

The ability to retrieve data in context, rather than via a “Ctrl+F”-style keyword search (e.g. finding everything in a content set that literally mentions headaches), could transform a range of processes linked to safety/adverse event discovery and reporting. 
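
The toy example below illustrates the difference, with the caveat that a production system would score relevance using an embedding model rather than the hard-coded list of related terms used here: a literal keyword search misses a narrative describing "throbbing pain behind the eyes," while a contextual search surfaces it alongside the explicit mention of a headache.

```python
# Toy contrast between a keyword search ("Ctrl+F") and contextual retrieval.
# The hard-coded related_terms set stands in for a real embedding-based ranker.

narratives = [
    "Patient describes throbbing pain behind the eyes since starting treatment.",
    "Subject reported a mild headache that resolved within hours.",
    "No neurological complaints were recorded at follow-up.",
]

def keyword_search(term: str, docs: list[str]) -> list[str]:
    # Finds only literal mentions, exactly like a find-all command.
    return [d for d in docs if term.lower() in d.lower()]

def contextual_search(query: str, docs: list[str]) -> list[str]:
    # Placeholder: a real system would rank documents by semantic similarity to the
    # query, so related wording is captured even when the exact term never appears.
    related_terms = {"headache", "throbbing pain", "migraine"}  # illustrative only
    return [d for d in docs if any(t in d.lower() for t in related_terms)]

print(keyword_search("headache", narratives))     # misses the first narrative
print(contextual_search("headache", narratives))  # surfaces both relevant cases
```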

Going forward, equivalent solutions will help streamline the drafting of hefty regulatory safety reports, with advanced automation generating the preliminary narrative and performing narrative theme analysis in safety signal detection. The technology could have a significant impact in distilling trends not captured in the structured data (e.g. a history of drug abuse, or of living with obesity, across 500 patient narratives of potential interest).  
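
A minimal sketch of that kind of theme analysis appears below. It simply counts how many (hypothetical) narratives flag a theme of interest; in a real pipeline the per-narrative check would be an LLM classification call whose output can be sampled and audited, not the keyword test used here for illustration.

```python
# Sketch of narrative theme analysis: flag themes of interest across many case
# narratives, then aggregate the counts. The narratives and keyword checks are
# illustrative; a real pipeline would ask the LLM whether each theme is present
# and retain its reasoning so the output remains auditable.

narratives = [
    "45-year-old male with a prior history of drug abuse presented with dizziness.",
    "Female patient living with obesity (BMI 41) reported nausea after dosing.",
    "No relevant medical history reported; the event resolved without intervention.",
]

def flags_theme(narrative: str, theme_keywords: set[str]) -> bool:
    # Placeholder classifier based on keywords.
    return any(k in narrative.lower() for k in theme_keywords)

def theme_counts(docs: list[str], themes: dict[str, set[str]]) -> dict[str, int]:
    return {name: sum(flags_theme(d, kw) for d in docs) for name, kw in themes.items()}

themes = {"history of drug abuse": {"drug abuse"}, "obesity": {"obesity", "bmi"}}
print(theme_counts(narratives, themes))  # {'history of drug abuse': 1, 'obesity': 1}
```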

It is this broader potential that is now being discussed at meetings of the life sciences industry’s new global GenAI Council. Any resistance to smarter automation linked to concerns about reliability or compliance has now been superseded by a hunger to embrace next-generation forms of the technology that quash those concerns and supercharge process efficiency.  

 

Ramesh Ramani is VP of Technology at ArisGlobal, with a special interest in applying AI and machine learning to transform Life Sciences R&D regulatory and safety use cases. He is based in Bengaluru, Karnataka, India, and can be reached via LinkedIn.  

RaviKanth Valigari is VP of Product Development at ArisGlobal and a specialist in digital transformation in Life Sciences R&D. He is based in Charlotte, North Carolina, in the US, and can be reached via LinkedIn.  

Both Ramesh and Ravi have deep expertise in applying AI, Natural Language Processing, and other smart automation technologies to Life Sciences R&D regulatory and safety use cases, via powerful, targeted, and quick-to-deploy cloud/multi-tenant SaaS applications.