At MIT, Pippin Enables Splice Variant Detection and MicroRNA Analysis
At the BioMicro Center at MIT, Director Stuart Levine, PhD, recently introduced the Pippin Prep from Sage Science to enable key applications — including splice variant analysis with RNA-seq and microRNA analysis — that were not possible on other platforms.
Levine joined the BioMicro Center about four years ago; in that time, he has transformed it from a two-person lab to a major genomics and bioinformatics core facility with a dozen full-time employees serving more than 80 MIT faculty members. “My goal was to create an integrated core which would work with labs on everything from experimental design through to experimental analysis and anything they need help on anywhere along that chain,” Levine says.
His team of technology experts supports faculty members from a number of departments and institutes, covering scientific areas as diverse as cancer, environmental health sciences, biological engineering, and more. Levine’s core lab colleagues are professional scientists who focus on tools and methods, so they can translate that expertise across scientific areas and tailor experiments to each customer. “Ultimately what comes into our lab is nucleic acid, and what we do with nucleic acid is relatively constant,” Levine says. “The technology improvements that are useful for, say, understanding the evolution of ocean ecosystems can also apply to cancer research.”
In the last few years, Levine added sequencing sample preparation steps, and he says he realized early on that he would have to find alternatives to manual gels. Because the BioMicro Center is a chargeback facility, he has to base prices on the fully loaded costs of any service. When it came to setting prices for manual gel procedures, “I had to budget how long it takes to pour a gel, run out the sample (one per gel to avoid contamination), cut out the band, isolate the band. And when I put a price tag on it, I know that price tag is high enough that absolutely no one will pay for it. Once you start adding in the labor costs, the economics don’t make any sense at all,” he says.
Levine has been using the SPRIworks System from Beckman Coulter Genomics, which has been very successful as a gel alternative. However, certain applications require tighter sizing than was possible with that system, which is why he decided to look at the Pippin platform.
“What Pippin let us do was get into areas that we hadn’t been able to before,” Levine says. Two of those areas were RNA-seq — particularly identifying splice variants — and miRNAs, he adds. On the miRNA front, his team would otherwise have had to use manual gels and cut out bands, but “we wouldn’t be able to do that with any kind of economy of scale,” he says. “The high percentage gels on the Pippin allow us to cut out bands of the right size for microRNAs.”
When it comes to splice variant analysis, Levine says that his team recommends very tight sizing. “Some of the RNA-seq methodologies, when you’re doing de novo sequencing of transcriptomes and want to do assemblies, tend to perform better when the size distribution of the library inserts is very tight,” he says.
Tight sizing helps splice detection because it adds another dimension of data to the alignment step, allowing the structure between known nucleotides to be inferred. “If you know where the left read is, and where the right read is, and if you know the size of the fragments, then you can infer based on known exons the entire pattern in between,” Levine says. “You can calculate the likelihood that the exon is included or not based on where pairs of reads are.” The same holds for de novo assembly in general, he notes. With very tight size distribution, “you have a much more constrained situation when you’re assembling,” he adds.
Levine also believes that the Pippin platform is well suited for planning ahead as sequencing technologies evolve to offer longer reads. “As we need longer inserts, we’re going to be limited in terms of what we can get off the SPRIworks machine,” he says. Pippin’s ability to work with longer ranges means that “it’s extremely useful both for the ability to do sample preparation now and for the ability to work with these future technologies.”
Pippin Platform Recommended for Nextera Mate Pair Size Selection
Illumina released new sample prep protocol guidelines for generating mate pair libraries with its Nextera kit, and we’re pleased to report that the Pippin platform is the recommended choice for automated size selection.
You can check out the Nextera Mate Pair Sample Preparation Guide here. (We’re under Size Selection in Chapter 3, beginning on page 40 of the Guide.)
Illumina says that using an extra size selection step offers “more stringent” sizing than AMPure alone and lets users make libraries with larger fragments and more precise distribution than a gel-free approach. While the company has validated a manual approach in addition to the Pippin platform, Illumina’s guidelines note that “in our experience running a standard agarose gel does not provide as robust and reproducible results as the Sage Pippin Prep.”
In the user document, Illumina recommends the Pippin Prep with the 0.75% cassette and “eluting fragments with a broad range of sizes, of 3 to 6 kb in width, increasing in width with increasing fragment length (e.g. 2–5 kb, 4–8 kb or 6–12 kb).”
For current Pippin users, we would like to add that you can also use the 0.75% agarose dye-free cassette (BLF7510) with the BluePippin for equivalent results.
Poster: Exome Sequencing with Ion Torrent and Pippin Prep
In a poster from Ion Torrent (Life Technologies) prepared for the 2012 ASHG meeting, scientists looked at exome sequencing by studying a familial trio on both the PGM™ and the Proton™ instruments. Size selection for both sequencers was performed on the Pippin Prep from Sage Science.
Using an enrichment process targeting protein-coding exons from various genetic databases (including GenCode, RefSeq, Ensembl, and others), the scientists report “an on-target read mapping rate of 80%.” The sequencing, which took about four hours on either instrument, generated more than 5 Gb of aligned sequence, with average depth greater than 50x.
The Life Technologies authors note that they used Pippin size selection for both instruments, selecting an average length of 200 bp for the Proton and 300 bp for the PGM. (Check out figure 3 of the poster to see the nice clean peak they generated with Pippin.) This step was followed by exome enrichment and amplification prior to sequencing. The results indicate that sequencing exomes on the Proton is more than five times as efficient as on the PGM, and the authors say that figure is expected to improve even more.
Overall, the authors say, this study demonstrates that “the combination of focused exome enrichment and Ion Torrent Systems-based sequencing and analysis provides an efficient, accurate, and rapid means to detect genetic variation in the well-annotated portion of the genome for state-of-the-art genetic disease research.”
Check out the poster here.
Tag Team: At DFCI, Pippin and Nextera Make Better Libraries
Over at the Dana-Farber Cancer Institute, scientists have been doing some really interesting work pairing the Pippin Prep with Illumina’s Nextera kit for library preparation.
Zach Herbert, associate director of the Molecular Biology Core Facilities at DFCI, says that he incorporated Pippin size selection when it became clear that the genomic libraries generated by Nextera could use some additional optimization. “It’s really beneficial to have a narrower size range than the Nextera kit generates on its own,” he says.
Herbert uses the efficient Nextera for much of the small genome work and some of the larger amplicon projects that are sent to his core lab. In order to optimize reproducibility, flow cell clustering on the MiSeq, and downstream analysis, Herbert added a Pippin step to generate very tightly sized libraries after the Nextera tagmentation protocol.
The Pippin/Nextera tag team also shows value beyond de novo assemblies. “Having a narrow and known size distribution makes calculating the molarity a lot easier so you can get a better cluster density and maximize the number of reads,” Herbert says. It’s also a boon for pooling samples. Attempting to pool samples with a broad size range in equimolar amounts is very tricky — “but if all those libraries are the same size, then we’re much more likely to get an even distribution of that pool.”
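The molarity arithmetic Herbert mentions can be sketched in a few lines of Python. This is an illustrative sketch, not the DFCI core's actual procedure: it uses the common approximation of 660 g/mol per base pair of double-stranded DNA, and the function names and sample values are hypothetical.

```python
def library_molarity_nM(conc_ng_per_ul, mean_len_bp):
    """Convert a dsDNA library concentration to nanomolar.

    Assumes an average mass of 660 g/mol per base pair, a standard
    approximation for double-stranded DNA.
    """
    return conc_ng_per_ul * 1e6 / (660.0 * mean_len_bp)

def equimolar_volumes(molarities_nM, target_fmol_each):
    """Volume (in µL) of each library that contributes the same number of molecules.

    Uses the identity 1 nM = 1 fmol/µL.
    """
    return [target_fmol_each / m for m in molarities_nM]

# Two hypothetical libraries at the same mass concentration but different sizes.
short_lib = library_molarity_nM(3.3, 500)    # about 10 nM
long_lib = library_molarity_nM(3.3, 1000)    # about 5 nM
print(equimolar_volumes([short_lib, long_lib], target_fmol_each=20.0))
```

The conversion depends directly on the mean fragment length, so a library with a broad or poorly known size distribution gives an unreliable molarity. Every volume in the pool inherits that error.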
To learn more about Herbert’s work pairing Nextera and Pippin, read our case study here.
DNA for Species Identification: A Study of Rodents
Carl Linnaeus would be proud: A recent paper in PLoS One demonstrates the use of next-gen sequencing with genetic barcodes to accurately identify more than 100 different species from the Rodentia order. In the study, amplicons were run on the Pippin Prep from Sage Science to remove non-specific PCR products.
“Next-Generation Sequencing for Rodent Barcoding: Species Identification from Fresh, Degraded and Environmental Samples,” the paper from Maxime Galan, Marie Pagès, and Jean-François Cosson at the Center for Biology and Management of Populations at INRA, uses 454 GS-FLX sequencing. The authors note that correct species assignment in the diverse Rodentia order is quite challenging with morphological data alone.
In this work, the authors selected a 136 bp fragment from cytochrome b as a mini-barcode and then used it on more than 900 samples to determine its utility in accurately identifying species. Following a validation step, hundreds of samples of unknown identity were analyzed and the mini-barcode worked about 85 percent of the time, the scientists report. They also successfully tested degraded rodent DNA samples, including museum specimens and feces from rodent-eating predators.
The authors conclude, “This study demonstrates how this molecular identification method combined with high-throughput sequencing can open new realms of possibilities in achieving fast, accurate and inexpensive species identification.”