Before a new technology is deployed at The Genome Analysis Centre in Norwich, UK, it must run the gauntlet of Matthew Clark’s lab.
Clark is the sequencing technology development leader at TGAC, where part of his role involves testing new platforms, deciding whether they would be a good fit for the institute, and establishing best-practice workflows for the ones that are chosen.
The BluePippin automated size selection platform from Sage Science is one of the tools that succeeded in Clark’s technology proving ground and is now used more broadly throughout TGAC, where major projects include sequencing the wheat genome and studying honeybees. Clark, who joined the institute in 2010 after spending seven years as a scientist at the Wellcome Trust Sanger Institute, says that TGAC researchers rely on Pippin for building long-insert libraries for sequencing projects.
These projects include de novo sequencing, for which precise size selection in mate-pair libraries significantly improves the quality of the genome assembly. The mate-pair sequence might represent only a small fraction of the total data generated for the assembly (most will come from the shorter-insert paired-end libraries), “but it has a massive effect on the quality of the output,” Clark says. “The bigger-insert library gives you a 5x or 10x jump in quality, maybe even bigger, in terms of the sizes of the assembly that you’re able to generate.” These libraries, which are sequenced on Illumina or Pacific Biosciences instruments, offer longer-range information that can fill gaps or jump over repeats and other problematic structures that would otherwise break up an assembly. Indeed, Clark says the TGAC bioinformatics team prefers Pippin-aided sequencing libraries because the tight size selection helps them determine how far apart paired reads should be and so put together a more accurate assembly.
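To see why a tight insert-size distribution matters for assembly, consider a rough sketch of how a scaffolder might estimate the gap between two contigs from mate pairs that bridge them. This is an illustration only, not TGAC's pipeline or any particular assembler's method. The function name `estimate_gap`, its parameters, and all of the numbers are hypothetical, and the model assumes every pair is drawn independently from a library with a known mean insert size and standard deviation.

```python
from math import sqrt
from statistics import mean


def estimate_gap(insert_mean, insert_sd, spans_on_contigs):
    """Estimate the gap between two contigs from bridging mate pairs.

    insert_mean, insert_sd: library insert size and its spread, in bases
        (assumed known, e.g. from the size selection step).
    spans_on_contigs: for each bridging pair, the number of insert bases
        that fall on the two contigs (read start to the end of contig A,
        plus the start of contig B to the mate's end).

    Returns (gap_estimate, standard_error), both in bases.
    """
    if not spans_on_contigs:
        raise ValueError("need at least one bridging mate pair")
    # Whatever part of the insert is not on either contig must lie in the gap.
    gaps = [insert_mean - span for span in spans_on_contigs]
    gap_estimate = mean(gaps)
    # Uncertainty in the estimate scales with the library's size spread.
    standard_error = insert_sd / sqrt(len(gaps))
    return gap_estimate, standard_error


# Hypothetical example: three pairs bridging the same two contigs.
spans = [3000, 3200, 2800]
tight = estimate_gap(5000, 250, spans)    # tightly size-selected library
broad = estimate_gap(5000, 1500, spans)   # broadly size-selected library
print(tight, broad)
```

With the same number of bridging pairs, the uncertainty in the estimated gap is proportional to the spread of the library, so under these assumptions a library that is six times tighter gives a gap estimate that is six times more precise.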
Clark’s team tested other automated size selection options, but found Pippin to be the best platform for generating these valuable large-insert libraries. (Pippin Prep can yield libraries with inserts up to 8 kb, while BluePippin works for even longer inserts.) It also allows more material to be loaded than the alternatives, making it a good fit for the no-PCR paired-end libraries that some TGAC scientists like to run, Clark says. “If you have enough DNA, you can just skip the PCR, and you get much better coverage across the genome and a better assembly. That’s easier to do on a Pippin.”