Cross-Department Collaboration Reveals Business Intelligence Already On Hand
By Allison Proffitt
September 14, 2023 | INNOVATIVE PRACTICES AWARDS—There are questions every research group could ask: “Are there opportunities to better use existing scientific instruments?” “Could we plan better and budget more effectively if we better understood instrument usage?” At Regeneron’s Research and Preclinical department, tackling these questions meant accounting for more than 2,000 expensive and space-constrained instruments and seeking insight into how they are used.
But Regeneron is like a 35-year-old startup, explained Vinay Desai, senior director of Regeneron IT, in his award address at the Bio-IT World Conference & Expo last May. “The culture permeates throughout the laboratories, and it drives our IT teams.”
The Regeneron team didn’t expect a great deal of funding for the query. “We were unlikely to get the resources to discover unique usage data for each instrument type, nor would we have access to any engineers to structure and combine these various usage data sources together in a custom database,” they wrote in their entry. “We were going to have to think creatively to solve the problem.”
Their solution—both pragmatic and creative—earned the team one of the six 2023 Bio-IT World Innovative Practices Awards.
For an organization that operates and maintains thousands of scientific instruments in hundreds of distinct models from more than 50 different vendors, an off-the-shelf tool would have been extremely attractive. Ideally, the solution would acquire usage data from all of those instruments, store the data in a searchable, queryable database, and deliver intelligence via interactive dashboard visualizations. But the team found no centralized, vendor-agnostic, scalable source of instrument usage data across the lab. So they set out to build their own scalable, supportable solution for measuring the usage of the department’s instrumentation.
“The biggest driver was that we were expanding our facility footprint,” Desai said. “Every year we invest in more and more instruments, and… space was a big constraint. Where do you find space to put these things? And leadership kept asking, ‘Are you using these things?’”
The team found a tool in an unlikely place: Regeneron’s InfoSec department. Splunk, a software tool that Regeneron had been using for several years already for systems security and to detect malware and intrusions, had some intriguing capabilities. There was a need, the authors wrote, to “collaborate across nontraditional IT lines. After all, under what non-security circumstances would a team work so closely with InfoSec on a solution for R&D?”
Inference Problem
Splunk helps capture, index, and correlate real-time data in a searchable repository from which it can generate graphs, reports, alerts, and dashboards, identifying patterns in the data, providing metrics, and diagnosing problems. The Regeneron team turned those capabilities toward a massive dataset of Windows Management Instrumentation (WMI) and Windows Event Log data collected from the PCs connected to the instruments.
The ingestion step itself was daunting. The team pulled users’ login/logout data, CPU use, RAM use, and software process activity from WMI. Pulling data from all of the instrument-connected PCs created a massive dataset (on a typical day, Splunk ingests over 680 million WMI events from the lab computers at Regeneron, Desai noted), but that scale was the point. “By ingesting WMI data on a mass scale for all lab computers, we were ingesting a standard set of data that we felt could be used to derive usage insights for most, if not all, instruments,” the authors wrote.
The next step was inferring instrument usage from those computer data. “When we started out, we weren’t sure this was going to work out,” the authors wrote. “How can you take just the CPU usage and RAM usage of a software process and use that to infer instrument usage? People were skeptical. The more we tested the solution, the more confidence we gained that it could work for most cases.”
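One way to picture the inference step is a simple threshold rule over per-process samples. The sketch below is illustrative only and is not Regeneron’s method: the sample layout, the `infer_in_use` helper, and the CPU and RAM thresholds are all assumptions, and the article does not describe the team’s actual logic or Splunk queries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessSample:
    """One hypothetical WMI-style reading for an instrument's control software."""
    minute: int         # minutes since the start of the day
    cpu_percent: float  # CPU share used by the process
    ram_mb: float       # memory used by the process, in MB


def infer_in_use(samples, cpu_threshold=5.0, ram_threshold_mb=200.0, min_run=2):
    """Return (first_minute, last_minute) runs where the process looks busy.

    A sample counts as busy when CPU and RAM both reach the (assumed)
    thresholds. Runs shorter than min_run samples are dropped as noise.
    """
    runs = []
    current = []
    for s in sorted(samples, key=lambda s: s.minute):
        busy = s.cpu_percent >= cpu_threshold and s.ram_mb >= ram_threshold_mb
        if busy:
            current.append(s.minute)
        else:
            if len(current) >= min_run:
                runs.append((current[0], current[-1]))
            current = []
    if len(current) >= min_run:
        runs.append((current[0], current[-1]))
    return runs


# Made-up readings: idle, a three-minute burst, idle, a one-minute blip, idle.
samples = [
    ProcessSample(0, 0.5, 150.0),
    ProcessSample(1, 12.0, 420.0),
    ProcessSample(2, 30.0, 450.0),
    ProcessSample(3, 25.0, 440.0),
    ProcessSample(4, 0.8, 160.0),
    ProcessSample(5, 9.0, 300.0),
    ProcessSample(6, 0.4, 150.0),
]
print(infer_in_use(samples))
```

In practice the thresholds would likely differ per instrument type, which may be part of why the team gained confidence only through repeated testing.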
The team found that Splunk could use the data to infer real-time and longitudinal instrument utilization, identifying patterns for instrument use across the Research organization. Data are presented in a homegrown web-based dashboard, allowing lab managers and others to make real-time business decisions.
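A longitudinal view of this kind can be reduced to a small aggregation. The following is a rough sketch under stated assumptions, not the team’s dashboard logic: the record layout, the instrument names, the 480-minute working day, and the 20% cutoff are all hypothetical.

```python
from collections import defaultdict


def mean_utilization(records, minutes_available=480):
    """Average daily utilization per instrument.

    records: iterable of (instrument, day, busy_minutes) tuples.
    Returns {instrument: mean fraction of the available minutes in use}.
    """
    per_instrument = defaultdict(list)
    for instrument, _day, busy_minutes in records:
        # Cap at the available minutes so one long day cannot exceed 100%.
        fraction = min(busy_minutes, minutes_available) / minutes_available
        per_instrument[instrument].append(fraction)
    return {name: sum(vals) / len(vals) for name, vals in per_instrument.items()}


# Made-up records for two hypothetical instruments over two days.
records = [
    ("cytometer-01", "2023-05-01", 240),
    ("cytometer-01", "2023-05-02", 360),
    ("sequencer-02", "2023-05-01", 48),
    ("sequencer-02", "2023-05-02", 0),
]

utilization = mean_utilization(records)
# Flag instruments below an assumed 20% utilization cutoff.
underused = sorted(name for name, u in utilization.items() if u < 0.20)
print(utilization)
print(underused)
```

A summary like this is the sort of figure a lab manager could weigh before buying another unit or reallocating bench space.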
Among the decisions the tool supports: proactively review instrument-use statistics and schedule work to reduce idle time and gain greater throughput; make data-driven purchasing decisions for expanding an instrument fleet; and support IT in planning software upgrades, patch installation, and other maintenance. While the experiment has been very successful so far, the team sees further opportunity as well.
“We found significant value being driven back by consuming 7 TB of data,” Desai said. “Our science colleagues can plan what they’re purchasing; they can justify it. The folks supporting the instruments know what service contracts to maintain, the timing, the vendor’s scheduling, patching; everything is very streamlined. Allocation of space: as the facility expands, folks can use real-time data to figure out how to best optimize the lab footprints. And the big win with the scientists was that there was a single point to look at all of this,” he added. “It was extremely useful.”