Sentieon's 'Faster/Cheaper' Genomics Tools
By Benjamin Ross
June 25, 2019 | "Our mission is to enable precision data for precision medicine," Sentieon's Business Development Director, Brendan Gallagher, told Bio-IT World. "The data coming out of our toolsets are something people can have confidence in for their projects so that we can push the ball forward on improving healthcare."
The toolsets, Sentieon's DNAseq and TNseq software, are making waves in the life sciences as a "drop in replacement": a faster and cheaper alternative to the industry-standard tools for secondary analysis in next-generation sequencing (NGS) data processing that still produces industry-standard results. Recently, the software earned Sentieon a Bio-IT World Innovative Practices award.
Sentieon's toolsets address three key needs for users of NGS data, primarily when it comes to "secondary analysis" data processing, Gallagher wrote in the company's entry form for the award. "The output of secondary analysis needs to be accurate, fit downstream tools and data format standards, and needs to be done cost effectively."
DNAseq and TNseq accomplish this, Gallagher says, by building on a foundation established by existing tools such as the Broad Institute's Genome Analysis Toolkit (GATK), MuTect and MuTect2, and the Burrows-Wheeler Aligner (BWA).
With such a rock-solid foundation, Sentieon felt confident enough to improve the tools where needed. Gallagher says bringing the industry-trusted methods into their tools was a unique challenge.
"We're essentially a math and computer science company," he said. "If we have a defined mathematical problem with a truth that you can prove that's also computationally complex, then that's a good problem for the Sentieon team to work on."
The results speak for themselves, according to Gallagher: the tools use three to ten times fewer core-hours, never downsample, and scale better for rapid turnaround time and large-scale joint calling.
This distinction was key for Sentieon if they wanted to avoid offering "yet another variant caller," says Gallagher. "If everyone chooses their own unique pipeline, then anytime they share that data, the other people have to reprocess it with their own pipeline. This is true even if they use the same GATK tool, because there are different 'flavors' of pipelines. For instance, if I made a pipeline using a software tool and told the tool to do something a little different, now my pipeline is essentially different from anyone else's, which makes comparing the results of those different pipelines difficult."
Standardizing on a functionally equivalent pipeline allows data processed with the same pipeline by different groups to be shared and integrated without requiring the data to be reprocessed, says Gallagher.
"Our tools can match those same tweaks that a workgroup chose, so now anyone who processes their data in the standardized way with our tool can take advantage of the public data that's becoming available and that has been processed the same way while enjoying the more efficient and easy to use tools."
Sentieon's tools are easier to use because they are self-contained applications that automatically scale to a server of any size and can be easily distributed across multi-server clusters for parallel processing.
In Sentieon's entry form, Gallagher wrote, "Sentieon's users can deploy the Sentieon-powered CCDG [Centers for Common Disease Genomics] pipelines, enabling them to pool their own internal data with other data processed by CCDG pipelines of other groups during downstream analysis."
The functionally equivalent use case for these tools contributes to solutions for data sharing challenges within genomics, says Gallagher. "When different groups use the same tools, and eventually share their data, it removes the burden of reprocessing other people's publicly available data since it's all been processed with the industry standard functionally equivalent pipeline."
Price is also a point of emphasis, Gallagher says. "The Sentieon software is licensed at a cost so that the users have quantifiable and significant savings as compared to using free GATK tools, achieving a positive ROI," he wrote in the entry form. "Effectively, the Sentieon tools are 'cheaper than free' after considering the savings in the computing cost."
In one case, Sentieon processed 10,000 whole-genome sequencing (WGS) samples for Autism Speaks using the functionally equivalent pipeline, and the total of Sentieon's license fee and compute costs was less than the compute cost of using the BWA/GATK toolsets.
The software tools have been in development since 2014, Gallagher says, with updates added continuously.