OmniTier’s CompStor Brings De Novo Analytics To Genomics
By Bio-IT World Staff
August 14, 2018 | OmniTier announced that in a joint study with researchers at the Mayo Clinic’s Center for Individualized Medicine, the company successfully demonstrated human DNA variant calling using de novo global assembly techniques with breakthrough performance and cost.
Using the well-characterized NA12878 short-read sequenced genome dataset, the CompStor compute cluster completed variant-caller-ready assembly of a 50x coverage genome in less than two hours. The solution scales to 800x sequence coverage, a dataset size previously considered too large for de novo techniques but necessary to reliably identify new and infrequent variants.
The joint study builds upon OmniTier’s previously announced result of assembling a human genome in 8 minutes. It shows that de novo global assembly can be routinely applied to whole genome sequencing, even for high-coverage sequenced genomes where accuracy is paramount. The joint study also shows that de novo analytics no longer requires supercomputing-level resources, but can instead be completed on application-specific bioinformatics platforms built from standard, low-cost servers with next-generation software and memory – opening the door to hundreds of institutions globally that are eager to leverage next-generation sequencing techniques.
“De novo sequence assembly for better variant discovery and characterization has remained elusive due to the exceedingly long assembly times and resources requirement of existing assemblers. CompStor holds the promise to change that paradigm,” Alexej Abyzov, computational genomicist and biologist, senior associate consultant and assistant professor of biomedical informatics at the Mayo Clinic in Rochester, MN, said in a press release.
“In benchmarking, CompStor has enabled us with fast, robust, accurate, and unbiased analysis of individualized high coverage whole genome sequencing data. We expect to apply CompStor's unique capabilities to analyzing point nucleotide substitutions as well as larger structural variants and indels in several future studies,” Abyzov continued. Technical observations of the CompStor system will be published in a forthcoming peer-reviewed paper, jointly produced by Abyzov’s laboratory and OmniTier scientists.
Whole genome sequencing (WGS) has emerged as the central approach to characterizing human variation and disease states on a population scale, but it demands new computational bioinformatics solutions. Current variant-calling methods are based on aligning the sequenced reads against a known reference genome – a single-reference approach that introduces biases and leads to missed variants. This inherent bias limits the quality of whole genome sequencing and other next-generation sequencing (NGS) applications. The de novo global assembly methodology avoids the use of a reference and, therefore, enables complex variant calling with haplotype (phased) resolution. However, current de novo implementations are too slow and expensive for mass deployment.
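To make the idea of reference-free assembly concrete, here is a toy sketch in Python that rebuilds a short sequence from overlapping reads using a de Bruijn graph, a common textbook technique. It is only an illustration: the article does not describe CompStor's algorithm, and the function names, reads, and k-mer length below are all made up for the example. Real assemblers must also handle sequencing errors, repeats, and both DNA strands, none of which this sketch attempts.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def assemble_contig(graph):
    """Walk from the single start node while each node has exactly one successor."""
    successors = {nxt for nexts in graph.values() for nxt in nexts}
    starts = [node for node in graph if node not in successors]
    if len(starts) != 1:
        raise ValueError("toy assembler expects exactly one start node")
    node = starts[0]
    contig = node
    seen = {node}
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if nxt in seen:  # stop instead of looping forever on a cycle
            break
        contig += nxt[-1]
        seen.add(nxt)
        node = nxt
    return contig


# Hypothetical error-free reads covering one short sequence; no reference is used.
reads = ["ATGGCG", "GGCGTG", "CGTGCA"]
print(assemble_contig(build_de_bruijn(reads, k=4)))  # prints ATGGCGTGCA
```

The reads are stitched together purely from their mutual overlaps, which is why this style of analysis does not inherit the biases of a reference genome. It is also why cost grows quickly with coverage: every read contributes k-mers to a graph that must be held in memory.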
CompStor’s implementation addresses the demand for higher quality in WGS by delivering high computational efficiency at low cost. Its tiered-memory architecture uses DRAM and flash memory resources in a novel cluster configuration to overcome the limitations of existing de novo assembly implementations. By running software designed around expansive computational memory on standard, low-cost x86 (Intel processor) servers, CompStor can deliver supercomputer-like results. CompStor clusters have optimized data ingress and egress – for example, ingesting raw genome sequence datasets at up to 10 GB/s from suitable external sources. The solution integrates with existing genomics workflows and provides command-line, API, and web-based job control mechanisms. Users can choose from industry-standard or built-in variant callers.
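For a rough sense of scale, the back-of-the-envelope calculation below estimates how much raw read data 50x and 800x coverage imply and how long ingest would take at the quoted peak rate. Only the coverage figures and the 10 GB/s rate come from the article. The genome length, the two bytes per sequenced base (one for the base call, one for its quality score, uncompressed), and the reading of "GB" as decimal gigabytes are assumptions, so the results are order-of-magnitude estimates rather than CompStor measurements.

```python
# Back-of-the-envelope sizing. Every constant is an assumption except the
# 10 GB/s ingest rate and the 50x / 800x coverage figures from the article.
GENOME_BASES = 3.1e9       # assumed human genome length, in bases
BYTES_PER_BASE = 2.0       # assumed: 1 byte base call + 1 byte quality score
INGEST_BYTES_PER_S = 10e9  # "up to 10 GB/s", taken as decimal gigabytes


def dataset_bytes(coverage: float) -> float:
    """Approximate raw read data volume for a given sequencing coverage."""
    return coverage * GENOME_BASES * BYTES_PER_BASE


def ingest_seconds(coverage: float) -> float:
    """Lower bound on ingest time, assuming the peak rate is sustained."""
    return dataset_bytes(coverage) / INGEST_BYTES_PER_S


for cov in (50, 800):
    size_tb = dataset_bytes(cov) / 1e12
    print(f"{cov}x: ~{size_tb:.2f} TB, at least {ingest_seconds(cov):.0f} s to ingest")
```

Under these assumptions, a 50x genome is roughly 0.31 TB of raw reads and an 800x genome roughly 5 TB. Ingest at the peak rate would therefore take on the order of seconds to minutes, a small fraction of the sub-two-hour assembly time reported for the 50x dataset.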
“CompStor is bringing the de novo analytics era to genomics,” said Hemant Thapar, founder and CEO of OmniTier, in an official statement. “Existing solutions are severely limiting in the use of de novo assembly, especially for high coverage genomes – yet in CompStor we have an approach that can be deployed pervasively. Our current results in the joint study with the Mayo Clinic highlight the potential to integrate de novo assembly methodology in genomic medicine and achieve higher accuracy and shorter assembly times on affordable cloud or on-premise infrastructure. Our CompStor platform side-steps fundamental data transfer problems to remote compute facilities and applies tiered memory innovations that can expedite the path to personalized medicine.”