As we’ve seen throughout this blog series, Sage customers are conducting all sorts of great experiments pairing their Pippin size selection instruments with Illumina sequencers. Today we look at the final topic in this thread: boosting assembly accuracy with precise DNA size selection.
In the years since we first launched the Pippin Prep and its big brother, the BluePippin, we’ve found that the scientists who demand these tools the most are bioinformaticians. Why? Because they see the downstream impact of high-precision sizing and know that it can make an assembly far better than manual gel extraction or other less accurate sizing methods.
Andrew Sharpe, a Research Officer and Group Leader in the DNA Technologies Laboratory at the National Research Council of Canada, told us that he uses several Pippin instruments to build multiple paired-end libraries for the same sample. He might construct three libraries with 200-base, 300-base, and 400-base inserts, for instance, and then assemble them together. “If you assemble one of the libraries, then you’ll end up with an assembly. But if you assemble all three together using three different lengths, you get quite a bit better product,” Sharpe said.
Another approach is to construct a mate-pair library or a long-read library and assemble it with the shorter-insert paired-end libraries. That’s a method used by Matthew Clark’s sequencing technology development lab at The Genome Analysis Centre in Norwich, UK. Adding that large-insert information “has a massive effect on the quality of the output,” Clark told us. “The bigger-insert library gives you a 5x or 10x jump in quality, maybe even bigger, in terms of the sizes of the assembly that you’re able to generate.” He said that the TGAC bioinformatics team prefers Pippin-aided sequencing libraries because the tight size selection helps them determine how far apart certain reads should be and put together a more accurate assembly.
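To see why a tight insert-size distribution matters for working out how far apart reads should be, here is a minimal Python sketch of a gap estimate between two contigs joined by read pairs. It is an illustration only, not the TGAC team's method or any particular assembler's algorithm. The function name `estimate_gap` and its parameters are hypothetical, and the sketch assumes each pair's true insert length is an independent draw from a library with a known mean and standard deviation.

```python
import math
import statistics


def estimate_gap(mean_insert, insert_sd, spans_in_contigs):
    """Estimate the gap between two contigs from read pairs that link them.

    mean_insert      -- mean insert size of the library, in bases (assumed known)
    insert_sd        -- standard deviation of the insert size, in bases
    spans_in_contigs -- for each linking pair, the number of bases of the
                        insert that fall inside the two contigs combined
    Returns (gap_estimate, standard_error), both in bases.
    """
    if not spans_in_contigs:
        raise ValueError("need at least one linking read pair")

    # Each pair implies a gap of (insert length - bases inside the contigs).
    # The true insert length of a pair is unknown, so the library mean is used.
    per_pair_gaps = [mean_insert - span for span in spans_in_contigs]
    gap = statistics.mean(per_pair_gaps)

    # Under the independence assumption, the uncertainty of the averaged
    # estimate shrinks with the square root of the number of linking pairs.
    stderr = insert_sd / math.sqrt(len(spans_in_contigs))
    return gap, stderr


if __name__ == "__main__":
    spans = [300, 310, 290, 300]  # made-up example values
    print(estimate_gap(400, 20, spans))  # tightly sized library
    print(estimate_gap(400, 80, spans))  # loosely sized library
```

With these made-up numbers, both libraries give the same gap estimate of 100 bases, but the uncertainty is about 10 bases for the tightly sized library and about 40 bases for the loosely sized one. Under this simple model, the wider the spread of insert sizes, the more linking pairs are needed to place two contigs with the same confidence.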
The newest tool in the Sage portfolio will be particularly useful for this application as well. SageELF is a whole-sample fractionation tool that generates 12 contiguous fractions from a DNA sample, making it very simple for scientists to construct libraries of various insert sizes from the same sample.
We hope these blog posts detailing some of the most popular techniques used with the Sage + Illumina combo have been helpful to you. Thanks for reading!