GENALICE Launches Population Calling Analysis Module
By Bio-IT World Staff
October 8, 2015 | In a live webinar today, GENALICE launched the Population Calling analysis module for its GENALICE MAP Next-Generation Sequencing (NGS) Data Analysis Suite. During the webinar, in partnership with Mount Sinai Hospital, Amazon Web Services, and Intel, GENALICE processed the whole genomes of 800 patients from the Alzheimer’s Disease Neuroimaging Initiative (ADNI), an Alzheimer’s disease cohort.
“NGS developments have brought us to a point that we can use the aggregated information of fast growing patient cohorts. Using alternative solutions has significant drawbacks. The smart solution GENALICE introduces today helps us tremendously to speed up genomics developments,” said Eric Schadt, director of the Icahn Institute for Genomics and Multi-scale Biology and chair of the Department of Genetics and Genomic Sciences at Mount Sinai Hospital (NYC), in a statement before the webinar.
The Population Calling module is the newest addition to the GENALICE MAP NGS data analysis suite. GENALICE sells its analysis suite either through Amazon Web Services or as an appliance with commodity hardware. “People have the choice to work with us in the cloud or on site,” explained Jos Lunenberg, Chief Business Officer for GENALICE. The pipeline takes standard inputs (FASTQ files from Illumina) and gives standard outputs (VCF), but GENALICE uses proprietary algorithms to speed up the steps in between. On one node, a single whole genome analysis would take 80 hours using BWA and GATK, but GENALICE can complete it in 30 minutes: 25 minutes for alignment and 5 minutes for variant calling.
Lunenberg compared the Population Calling module to GATK joint calling. By sequencing large cohorts and analyzing the findings together, researchers can learn much more, Lunenberg said. “Based on the context of the group, you can improve the individual variants. You can qualify or disqualify the variants found on an individual level based on the context of the group. We call that consensus based call enhancement.” And GENALICE can do that sort of comparison much faster than ever before, Lunenberg said.
“With joint calling from Broad that’s also possible,” he acknowledged. “The thing is, though, that this is a tool that you need a huge data center to use it if you have a larger sized cohort… In a comparison, for one whole genome patient we take six minutes on one node. And it takes 34 hours to do the same thing with GATK.”
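As a rough check on the speed claims quoted above, the ratios can be worked out directly from the figures GENALICE gave. The short Python sketch below does only that arithmetic. The function name `speedup` is illustrative, and the timings are the company's own single-node figures as reported here, not independent benchmarks.

```python
def speedup(baseline_minutes: float, new_minutes: float) -> float:
    """Return how many times faster the new run is than the baseline."""
    return baseline_minutes / new_minutes


# Single whole genome, one node: 80 hours with BWA and GATK
# versus 25 minutes of alignment plus 5 minutes of variant calling.
single_genome = speedup(80 * 60, 25 + 5)

# Population calling per whole genome, one node: 34 hours with GATK
# versus six minutes with GENALICE.
population_calling = speedup(34 * 60, 6)

print(f"Single genome pipeline: {single_genome:.0f}x")     # 160x
print(f"Population calling: {population_calling:.0f}x")    # 340x
```

On these reported numbers, the claimed gain is about 160-fold for the single-genome pipeline and about 340-fold per genome for population calling.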
Lunenberg stressed that accuracy is not sacrificed for speed. He said GENALICE has been validated with several of its customers and with data sets including Illumina’s platinum genome data, and the accuracy is “top notch.” It’s “at least on par with BWA/GATK, especially in areas where the mapping is hard,” like repetitive areas, he said. The GENALICE files are also smaller, which reduces storage requirements.
Population Calling is also flexible, Lunenberg said, and can re-analyze cohorts as data points are added. The output of the Population Calling module is a GENALICE Variant Map file, a new format that Lunenberg said is easy to query, multidimensional, and has a very low footprint. From the GENALICE Variant Map, variants of interest can be extracted as VCFs.
“The beauty is when you have a cohort like the 800 Alzheimer’s patients, and you have one additional patient come in, you can easily do incremental analysis with our tool,” Lunenberg said. “You just add one single patient to the existing GENALICE Variant Map… you don’t have to do a re-analysis with 801 patients.”
Lunenberg wouldn’t share pricing specifics, saying that they vary based on the configuration a customer chooses. But he said that GENALICE is “extremely cost effective.” Considering storage gains and decreased hardware needs, “We save our customers much more than the actual cost of the appliance,” he said.