Here’s a paper worth checking out: “Synthetic Spike-in Standards Improve Run-Specific Systematic Error Analysis for DNA and RNA Sequencing” from lead author Justin Zook at the National Institute of Standards and Technology. Published in PLoS One last month, the paper describes a study of ways to better manage systematic errors in DNA or RNA sequencing.
Most sequencing work currently relies on algorithms that recalibrate base quality scores after calculating a correction factor from either a subset of the sequenced data set or a separate data set, the paper’s authors write. They propose using synthetic spike-in standards instead; in this demonstration, RNA spike-ins were added when sequencing human RNA. Base quality scores are then recalibrated with the Genome Analysis Toolkit (GATK, from the Broad Institute), which adjusts them more accurately when it works from the spike-in reads rather than from reads mapped to the genome.
“Compared to conventional GATK recalibration that uses reads mapped to the genome, spike-ins improve the accuracy of Illumina base quality scores by a mean of 5 Phred-scaled quality score units, and by as much as 13 units at CpG sites,” the authors write. “In addition, since the spike-in data used for recalibration are independent of the genome being sequenced, our method allows run-specific recalibration even for the many species without a comprehensive and accurate SNP database.”
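For readers less familiar with Phred-scaled scores, here is a minimal sketch of the arithmetic behind a statement like “5 quality score units.” A Phred score Q corresponds to an error probability of 10^(-Q/10), and an empirical score can be estimated by counting mismatches against a reference whose true sequence is known, as with a synthetic spike-in. The function names, the add-one smoothing, and the counts below are our own illustrative assumptions; this is not the authors’ pipeline or GATK’s actual implementation.

```python
import math


def phred_to_error_prob(q):
    """Convert a Phred-scaled quality score to an error probability."""
    return 10 ** (-q / 10)


def empirical_phred(mismatches, total_bases):
    """Estimate a Phred score from observed mismatches against a known reference.

    Add-one smoothing (an assumption of this sketch) avoids taking log10(0)
    when no mismatches are observed.
    """
    p_error = (mismatches + 1) / (total_bases + 2)
    return -10 * math.log10(p_error)


# Hypothetical counts for bases that the sequencer reported as Q30,
# tallied over reads aligned to a spike-in of known sequence.
reported_q = 30
mismatches, total_bases = 3161, 1_000_000

observed_q = empirical_phred(mismatches, total_bases)
print(f"reported Q{reported_q}: expected error rate {phred_to_error_prob(reported_q):.4f}")
print(f"empirical quality from spike-in reads: Q{observed_q:.1f}")
print(f"gap to correct: {reported_q - observed_q:.1f} Phred units")
```

Because the scale is logarithmic, a gap of 5 units corresponds to roughly a threefold difference in error probability, and a gap of 10 units to a tenfold difference.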
Since the paper focuses on improving quality and uniformity, we were delighted to see that our Pippin platform was used for the cDNA size selection step in the Illumina sequencing workflow.
Congratulations to authors Justin Zook, Daniel Samarov, Jennifer McDaniel, Shurjo Sen, and Marc Salit!