Have you been noticing the mate-pair trend? We certainly have. A technique that was once only used by a handful of labs has really come into its own as one of the preferred ways of gathering longer-range genomic data.
Here at Sage, what we like most about mate-pair sequencing is that it’s a clever way of gleaning long-range information that pays off downstream, making assemblies significantly better. These long-range glimpses of an organism’s genome enable more contiguous, more accurate views of its biology.
We’ve heard a lot about mate-pair sequencing at conferences we’ve attended recently. For instance, the new Revolocity system from Complete Genomics that was on display at ASHG uses mate-pair assembly to improve results. A recent publication from Lucigen scientists used mate-pair sequencing to improve assembly quality for an anaerobic bacterium cultured from a boiling spring.
Sage customers have been doing a lot of great work with mate-pair sequencing as well. At The Genome Analysis Centre, scientists developed a mate-pair method using SageELF that allows for lower-input samples while saving time and money. That method was essential for the new, highly contiguous wheat genome assembly the centre just released, where it helped boost contig N50 by a factor of 10. At RIKEN, researchers modified a Nextera mate-pair protocol to reduce costs. By using BluePippin earlier in the protocol, they were able to increase yield; they also reduced enzyme volumes for other steps and got high-quality results. In this app note, scientists from the University of Miyazaki also use BluePippin with Nextera for a mate-pair pipeline.
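For readers less familiar with the metric: contig N50 is the length L such that contigs of length L or longer cover at least half of the total assembly, so a tenfold jump in N50 means dramatically longer contiguous stretches of sequence. A minimal sketch of how N50 is computed (illustrative only — not part of the centre’s pipeline):

```python
def n50(contig_lengths):
    """Return the N50: the contig length L such that contigs of
    length >= L account for at least half the total assembly size."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return 0  # empty input

# Toy assembly: total length 290, half is 145.
# Walking down from the longest contig, 80 + 70 = 150 >= 145,
# so the N50 is 70.
print(n50([80, 70, 50, 40, 30, 20]))  # → 70
```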
To learn more about how Sage customers are using and improving mate-pair sequencing, check out this page.
This week we’re heading to the Festival of Genomics in San Mateo, Calif. We had a blast at the first festival in Boston, so we’re excited about this next edition (and we packed our running shoes, just in case).
The meeting has an impressive slate of speakers — it’s a virtual Who’s Who of the genomics community. If you happened to miss Ting Wu’s mind-blowing plenary talk at the Boston festival, we’re thrilled to see that she’s speaking again and heartily recommend checking it out. The concurrent sessions are chock-full of great talks too, and it’ll be tough to choose among them.
The Sage Science team will be in the exhibit hall, central to all of the presentation stages, and we hope you’ll stop by to say hello. At the June festival, we presented a sneak peek of our HLS system, currently under development. If you missed it, we’ll be happy to get you up to speed on a new product we think will be incredibly useful for the NGS community.
Hope to see you in San Mateo!
The genomics field is a funny place: at times it feels like we’re not making progress quickly enough, and at other times things seem to be moving so fast we’re just holding on for dear life. We were reminded of this paradox recently thanks to some conference discussions with our fellow genomic veterans, so we came here to think out loud about it.
On one hand, we hear about consumers who can’t get access to their DNA data because of FDA regulations, or patients who would likely benefit from genomic testing but whose doctors are too traditional to give it a shot. On the other hand, it seems like just yesterday there were thousands of scientists around the world working hard to assemble the first human genome sequence, and now it’s almost routine for organizations to launch studies including tens of thousands of whole genomes.
Some recent indications suggest that we’ll soon leave behind those lingering doubts about the speed of progress. If you didn’t see this summer’s PLoS Biology publication about the growth of genomic data, it’s well worth a read. Scientists from a number of institutions came together to write the commentary, which extrapolates from data-generation trends of recent years to argue that in as little as a decade, genomics may lap fields such as astronomy and social media to become the most prolific producer of big data. How’s that for a field that until recently relied on shipping hard drives back and forth?
At the ASHG conference in Baltimore this month, we saw more evidence for the rapidly increasing pace of genomics. NIH Director Francis Collins spoke about the Precision Medicine Initiative and its effort to enroll 1 million participants, whose whole genomes would feed a national database. Several speakers mentioned the 100,000 Genomes Project that’s underway in the UK, or the efforts at Geisinger Health System in Pennsylvania to sequence 250,000 patients. The days of congratulating ourselves for sequencing a single genome seem to exist only in the rear-view mirror.
Our internal data supports the same trend. Last November we launched the high-throughput version of our Pippin automated DNA sizing platform — an instrument designed in response to customer demand — and already the PippinHT is galloping along. So far, customers have ordered enough HT cassettes to process more than 30,000 samples. Whew! The PippinHT is best-suited for large-scale genomic or transcriptomic studies; even a few years ago, we would never have predicted this level of demand for the instrument. It’s a sure sign that genomics is scaling faster than any of us could have anticipated.
[Chart: Samples Run on the PippinHT]
We hope this pace in genome science translates into exciting new approaches to healthcare, the most obvious beneficiary of so many sequence databases and massive-scale discovery projects. For our part, we’ll keep churning out those high-capacity cassettes to help our customers increase their throughput for larger and larger studies.
Many thanks to the ASHG attendees who visited us at the Sage Science booth or stopped by our poster! ASHG 2015 was terrific, and the Sage team had a blast reconnecting with scientists and customers.
Naturally, in a meeting of 8,000 people, there was no single theme covering everything. But clinical genomics had a large role, and will be a nice fit for the technology we’re developing (we described it in our poster — check it out here). We believe that the ability to isolate extremely long DNA fragments will be increasingly important as sequencing technologies shift to single-molecule approaches delivering very long reads. That instinct was reinforced by the enthusiasm we saw among scientists at PacBio’s launch party for its new Sequel sequencer.
Sample prep didn’t get a lot of attention at ASHG this year — perhaps a sign that this part of the sequencing equation is more robust and routine than it has been in the past. If so, that’s good news for the sequencing community, and we’re glad to have played a part in helping scientists achieve it. Many companies like ours are bringing automation to various parts of the sequencing workflow, and collectively we are helping to make this process as bullet-proof as possible for maximum effectiveness in the clinic.
Our next conference presence will be the Festival of Genomics in San Mateo. If we didn’t catch up with you in Baltimore, we hope to see you in California next month!
In a highly accessed paper in BMC Medical Genomics, scientists from McGill University and EMBL tested several steps to find the most robust pipeline for discovering small non-coding RNAs (sncRNAs) that might be useful as biomarkers. As part of this effort, they evaluated several sizing options for microRNAs.
“Biomarker discovery: quantification of microRNAs and other small non-coding RNAs using next generation sequencing” comes from lead author Juan Pablo Lopez, senior author Carl Ernst, and several collaborators. The team sequenced 45 samples with Illumina platforms and validated the sequence data with qRT-PCR. “Our results show that good quality sequencing libraries can be prepared from small amounts of total RNA and that varying degradation levels in the samples do not have a significant effect on the overall quantification of sncRNAs via NGS,” the authors report.
Size selection of small RNAs has long been a challenge in workflows like these, as microRNAs and other sncRNAs tend to be very close in size to the adapters and other artifacts that must be removed to get the best results. In this study, scientists compared the Pippin Prep from Sage Science to Novex TBE PAGE gels and AMPure XP beads. Noting that their goal was to evaluate pros and cons of each technique, rather than choosing the best, they report, “We were able to obtain good quality sequencing libraries for all samples, but nonetheless, we found significant differences across purification methods.”
One of those differences was library yield. “The four libraries purified using [Pippin] also showed single peaks corresponding to miRNAs, but these libraries contained more than 50 times more product after purification, as compared to the Novex gel method,” the team writes.
After sequencing, the team assessed reads produced from each sizing method, as well as from a control library with no purification step. “Libraries prepared using [Pippin] gave the highest number of total reads with an average of 11.8 M reads per sample, while with the others we obtained only an average of 8.8 M (Novex), 9.1 M (AMPure) and 8.5 M (no purification),” Lopez et al. write. They add that Pippin sizing also identified more distinct miRNAs than any other protocol, and had the highest specificity to miRNAs.
Based on these results, the team recommends Pippin Prep for medium-size projects. Our PippinHT was released after this project was completed and is a good option for scientists interested in the same high-quality, automated approach with significantly higher throughput.