First Peer-Reviewed BGISEQ-500 Reference Genomes
Publish Date: 2017-04-05

On April 5, 2017, the open-access journal GigaScience published the first reference data from the new BGISEQ-500 sequencing platform. In comparisons with existing data, the platform showed a favorable price/performance ratio and the potential to further democratize access to, and applications of, sequencing technologies.

Driven by rapid technological advances in sequencing platforms, genomics is moving quickly from fundamental research to the clinic. Several second-generation sequencing platforms are now commercially available, each with different performance and data characteristics. New work published on April 5, 2017 in the open-access journal GigaScience describes, reviews and presents the first human whole genome sequencing data from the new BGISEQ-500 platform. The work was independently and transparently peer-reviewed by experts, including reviewers from the National Institute of Standards and Technology, and the peer reviews, sequencing data and supporting imaging files are all available for download and reference by potential users.

The data were provided by BGI, the National Institutes for Food and Drug Control (NIFDC) and the State Food and Drug Administration Hubei Center for Medical Equipment Quality Supervision and Testing. Compared with public HiSeq2500 PE150 human resequencing data, data from the new platform show similar performance in alignment and variant calling, at a potentially lower cost (one third of the cost of the HiSeq2500). This indicates that the BGISEQ-500 offers a better price/performance ratio, helping to further democratize access to, and applications of, sequencing technologies.

First author of the study, NIFDC Associate Professor Jie Huang, says of these comparisons: “This reference dataset gives a suitable insight of the current capabilities of the BGISEQ-500 platform”, adding “the performance of this first mass-produced ‘Made in China’ sequencer already has the ability and strength to generate solid data and compete with the other sequencer manufacturers”.

The BGISEQ-500 sequencer was first announced by BGI in October 2015. It was developed from Complete Genomics™ technology and applies DNA NanoBalls (DNBs) and combinatorial probe-anchor synthesis (cPAS) sequencing. It is designed to be effective, stable, high-throughput and low-cost, to help improve genomics and resequencing analysis. Here, the research team presents the first human whole genome sequencing dataset from the BGISEQ-500. The dataset was generated by sequencing the widely used human cell line HG001 (NA12878) in two sequencing runs using PE50 reads and two sequencing runs using PE100 reads. In addition to the FASTQ sequencing files, examples of the raw images from the sequencer are also released for transparency's sake. Finally, the researchers identified genomic SNP and indel variants using this dataset, estimated the accuracy of the variant calls, and compared it with the accuracy of variants identified from a similar amount of publicly available HiSeq2500 data.
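
For readers less familiar with how variant-calling accuracy is typically estimated, the comparison described above can be sketched roughly as follows. This is a minimal illustration, not the pipeline used in the study: it assumes each call set has already been reduced to a set of (chromosome, position, reference allele, alternate allele) records and that a trusted benchmark call set for the same sample is available. The names `Variant` and `accuracy_metrics` are hypothetical, and the toy records are invented for illustration only.

```python
from typing import Dict, NamedTuple, Set


class Variant(NamedTuple):
    """A simplified variant record (hypothetical structure for this sketch)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def accuracy_metrics(called: Set[Variant], truth: Set[Variant]) -> Dict[str, float]:
    """Compare a call set with a benchmark set by exact record matching."""
    tp = len(called & truth)   # calls confirmed by the benchmark
    fp = len(called - truth)   # calls absent from the benchmark
    fn = len(truth - called)   # benchmark variants that were missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"precision": precision, "sensitivity": sensitivity, "f1": f1}


# Invented toy data, NOT taken from the published dataset.
truth = {
    Variant("chr1", 100, "A", "G"),    # SNP
    Variant("chr1", 250, "C", "T"),    # SNP
    Variant("chr2", 75, "G", "GA"),    # insertion
    Variant("chr2", 900, "TTC", "T"),  # deletion
}
platform_a_calls = {
    Variant("chr1", 100, "A", "G"),
    Variant("chr1", 250, "C", "T"),
    Variant("chr2", 75, "G", "GA"),
    Variant("chr3", 40, "T", "C"),     # not in the benchmark: false positive
}
platform_b_calls = set(truth)          # a call set that matches the benchmark

metrics_a = accuracy_metrics(platform_a_calls, truth)
metrics_b = accuracy_metrics(platform_b_calls, truth)
print("platform A:", metrics_a)
print("platform B:", metrics_b)
```

In practice, SNPs and indels are usually scored separately and variant representations are normalized before matching, so exact tuple matching as shown here is a simplification.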

The variant-calling results show no noteworthy difference between the BGISEQ-500 PE100 data and the HiSeq2500 data, suggesting that the sequencer can be used across a range of research applications. As sequencing technology continues to develop rapidly, performance can be improved further through better data quality, longer reads, optimized insert sizes for paired reads, and improved software and bioinformatics tools. In the meantime, the quality of the whole genome sequencing data also indicates that it is feasible to apply this platform to other scientific research applications (e.g. transcriptomics, epigenomics and metagenomics) and to clinical applications.

With this first peer-reviewed reference dataset of human genome resequencing data from the BGISEQ-500 sequencer, the research team provides an overview and some basic metrics for the new sequencing platform, and anticipates that it will help stimulate further technical improvement and the development of novel tools for accurately analyzing these data.

Further Reading

A reference human genome dataset of the BGISEQ-500 sequencer