Chinese Researchers Publish Comparisons Of BGISEQ-500, HiSeq2500

April 5, 2017

By Bio-IT World Staff 

Researchers from China’s National Institutes for Food and Drug Control have published reference data from the BGISEQ-500 sequencing platform and compared it with data from Illumina’s HiSeq2500 platform. Their results are published in GigaScience. The authors say their findings demonstrate a favorable price/performance ratio and the potential to further democratize access to, and applications of, sequencing technologies. The paper was independently and transparently peer-reviewed by experts, including reviewers from the National Institute of Standards and Technology; the peer reviews, sequencing data, and supporting imaging files are all available for download and reference by potential users.

The data was provided by BGI, the National Institutes for Food and Drug Control (NIFDC), and the State Food and Drug Administration Hubei Center for Medical Equipment Quality Supervision and Testing. Compared with public HiSeq2500 PE150 human resequencing data, data from the new platform shows similar performance in alignment and variant calling, but at a potentially lower cost (one-third the cost of the HiSeq2500).

In press materials, the study’s first author, NIFDC Associate Professor Jie Huang, said of these comparisons: “This reference dataset gives a suitable insight of the current capabilities of the BGISEQ-500 platform,” adding, “the performance of this first mass-produced ‘Made in China’ sequencer already has the ability and strength to generate solid data and compete with the other sequencer manufacturers.”

The BGISEQ-500 sequencer was first announced by BGI in October 2015. It was developed from Complete Genomics’ DNA nanoball (DNB) and combinatorial probe-anchor synthesis (cPAS) sequencing technologies. It is intended to provide effective, stable, high-throughput, low-cost sequencing to help improve genomics and resequencing analysis.

The dataset was generated by sequencing the widely used human cell line HG001 (NA12878) in two sequencing runs using PE50 reads and two sequencing runs using PE100 reads. In addition to the FASTQ sequencing files, example raw images from the sequencer have also been released for transparency’s sake. Finally, the researchers used this dataset to identify genomic SNP and indel variants, estimated the accuracy of those variant calls, and compared it with the accuracy of variants identified from a similar amount of publicly available HiSeq2500 data.
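The article does not describe how the accuracy of the variant calls was estimated. As a rough illustration only, one common approach is to compare each platform's calls against a trusted reference call set for the same sample and report precision and sensitivity. The sketch below assumes that approach; the function name, the exact-match comparison, and the toy variants are all hypothetical and are not taken from the study. Real benchmarking tools also normalize how variants are represented before matching, which this sketch skips.

```python
from typing import NamedTuple, Set, Tuple

# A variant is keyed by (chromosome, 1-based position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


class Accuracy(NamedTuple):
    true_positives: int
    false_positives: int
    false_negatives: int
    precision: float    # fraction of calls that are in the truth set
    sensitivity: float  # fraction of truth-set variants that were called


def compare_to_truth(called: Set[Variant], truth: Set[Variant]) -> Accuracy:
    """Score a call set against a truth set by exact variant match (an assumption)."""
    tp = len(called & truth)   # called and present in the truth set
    fp = len(called - truth)   # called but absent from the truth set
    fn = len(truth - called)   # in the truth set but not called
    precision = tp / (tp + fp) if called else 0.0
    sensitivity = tp / (tp + fn) if truth else 0.0
    return Accuracy(tp, fp, fn, precision, sensitivity)


# Toy, made-up variants for illustration only (not data from either platform).
truth = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 40, "G", "GA"),   # a one-base insertion
    ("chr2", 90, "T", "C"),
}
calls_a = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 40, "G", "GA"),
    ("chr3", 7, "G", "A"),     # not in the truth set: a false positive
}
calls_b = set(truth)           # a call set that matches the truth set exactly

result_a = compare_to_truth(calls_a, truth)
result_b = compare_to_truth(calls_b, truth)
print(result_a)
print(result_b)
```

Scoring two call sets against the same truth set in this way yields directly comparable numbers, which is the kind of side-by-side comparison the paragraph above describes.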

The variant-calling results show no noteworthy difference between the BGISEQ-500 PE100 data and the HiSeq2500 data, further suggesting that the sequencer can be used in a range of research applications. With the rapid development of sequencing technology, performance can be improved further through future gains in data quality, read length, and insert-size optimization for paired reads, as well as improvements in software and bioinformatics tools. In the meantime, the quality of the whole genome sequencing data also indicates the feasibility of applying this sequencing platform to other scientific research applications (e.g., transcriptome, epigenome, and metagenomics) and to clinical applications.

With this first peer-reviewed reference dataset of human genome resequencing data from the BGISEQ-500 sequencer, the research team provides an overview of the new sequencing platform and some basic metrics for it, and anticipates that the dataset will help stimulate further technical improvement and the development of novel tools for accurately analyzing this data.