In a BioTechniques paper this month, scientists from The Genome Analysis Centre describe a new method for mate-pair sequencing that saves time and money while decreasing the amount of input DNA required. The method is based on SageELF, which automatically generates 12 contiguous fractions of DNA from a single sample.
Led by Darren Heavens, the authors report that length and quantity of input DNA have been problematic factors in the preparation of long mate-pair (LMP) libraries for next-gen sequencing. To address that issue, they adjusted the sample prep protocol to use SageELF instead of conventional gel-based sizing, and then chose the fraction that best met their target fragment length.
“Using the SageELF streamlines the library construction process, allowing LMP libraries >10 kb to be constructed in under 2 days with <10 µg input material,” the scientists write. “For many genome projects, multiple insert size LMP libraries are required, and the ability to construct up to 12 discretely sized libraries for a combined reagent cost of $1270 compared with the reagent cost of $715 for a single insert size LMP library highlights the potential cost savings.”
The protocol was developed to optimize the Nextera-based long mate-pair kit for library construction. In addition to the initial round of size selection with SageELF, the scientists conduct another sizing step on the BluePippin prior to Illumina sequencing to ensure selection of DNA fragments best suited for the platform.
The protocol pays off by saving time and money in library prep, as well as by reducing the need for larger volumes of input DNA. It also leads to better sequencing results. “Accurately determining the size and span of the inserts for mate pair libraries simplifies the scaffolding problem, enabling the assembly of longer, more precise sequences with fewer non-determined bases (runs of N bases), empowering all subsequent downstream analysis,” the scientists report.
Check out the full paper: “A method to simultaneously construct up to 12 differently sized Illumina Nextera long mate pair libraries with reduced DNA input, time, and cost.”
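To put the quoted reagent costs in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the two figures quoted from the paper ($1270 for up to 12 libraries, $715 for a single library). The function names are ours, and the comparison assumes that all 12 fractions are turned into libraries and that making them one at a time would cost $715 each; neither assumption comes from the paper.

```python
# Reagent costs quoted from the paper, in US dollars.
COMBINED_COST = 1270       # up to 12 discretely sized LMP libraries
SINGLE_LIBRARY_COST = 715  # one single-insert-size LMP library

def cost_per_library(n_libraries: int = 12) -> float:
    """Average reagent cost per library when the combined cost is shared.

    Assumption (ours): the combined cost stays at $1270 for n_libraries.
    """
    if not 1 <= n_libraries <= 12:
        raise ValueError("the method yields between 1 and 12 libraries")
    return COMBINED_COST / n_libraries

def savings_vs_separate_preps(n_libraries: int = 12) -> int:
    """Reagent savings versus n_libraries separate single-library preps.

    Assumption (ours): each separate prep costs the quoted $715.
    """
    return n_libraries * SINGLE_LIBRARY_COST - COMBINED_COST

if __name__ == "__main__":
    print(f"Per library: ${cost_per_library():.2f}")
    print(f"Savings over 12 separate preps: ${savings_vs_separate_preps()}")
```

Under those assumptions, the average works out to roughly $106 per library, compared with $715 for a library made on its own.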
And for more on the TGAC team, check out this brief profile.
A team of scientists from the Icahn School of Medicine at Mount Sinai, Weill Cornell Medical College, Cold Spring Harbor Laboratory, European Molecular Biology Laboratory, and other institutions published the first analysis of a diploid human genome produced by combining single-molecule technologies.
Lead authors Matthew Pendleton, Robert Sebra, Andy Pang, and Ajay Ummat, along with their colleagues, report that integrating results from different technology platforms led to significant improvements in contiguity, with scaffold N50 values of nearly 30 Mb. The high-quality assembly also allowed the team to find complex structural variants that can’t be detected in assemblies produced from short-read data.
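For readers less familiar with the N50 statistic mentioned above: it is the length of the scaffold at which, counting from the longest scaffold down, at least half of the total assembled bases have been covered. Here is a minimal Python sketch of that standard definition. The function name and the toy lengths are ours, for illustration only; they are not data from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Scaffolds are sorted from longest to shortest and their lengths are
    accumulated; the N50 is the length of the scaffold at which the running
    total first reaches at least half of the total assembled length.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length

if __name__ == "__main__":
    # Toy lengths (made up for illustration), total = 54.
    toy_scaffolds = [2, 3, 4, 5, 6, 7, 8, 9, 10]
    print(n50(toy_scaffolds))
```

In the toy example, the three longest scaffolds (10, 9, and 8) add up to 27, which is half of the total of 54, so the N50 is 8. A higher N50 means more of the assembly sits in long, contiguous pieces.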
The scientists used SMRT® Sequencing from Pacific Biosciences as well as genome maps from BioNano Genomics. “Our work shows that it is now possible to integrate single-molecule and high-throughput sequence data to generate de novo assembled genomes that approach reference quality,” they report.
In the study, which sequenced the well-characterized NA12878 genome, scientists used BluePippin to perform size selection prior to SMRT Sequencing. By removing DNA fragments smaller than 7 kb, the team generated extraordinarily long reads with the PacBio platform. “Without selection, smaller 2000 – 7000 bp molecules dominate the zero-mode waveguide loading distribution, decreasing the sub-readlength” that can be achieved with the sequencer, the authors write in the supplementary materials.
For more, check out the full paper here: “Assembly and diploid architecture of an individual human genome via single-molecule technologies.”
Last week we attended the first-ever Festival of Genomics, which kicks off a new series of meetings taking place in Boston, San Mateo, and London. The conference was held in Boston’s biggest convention center and took a music-festival approach, with four stages of concurrent sessions in addition to plenty of other activities.
The Sage Science team was out in force, and we participated in many of those activities. Our CSO Chris Boles gave a talk in the Tech Forum, sharing details of a new product in development we’re calling the SageHLS. Built to help scientists generate ultra long DNA fragments for the new breed of technologies that need them — from optical mapping to single-molecule sequencing — the SageHLS will also help streamline the library prep process. More details will be available later this year.
Another element of the circus-like atmosphere was Race the Helix, a fundraising event for the Greenwood Genetic Center in which teams have 20 minutes on a treadmill to run as far as they can. Our own Alex Vira suited up and ran with the PacBio team, which took an impressive second place among the competing teams. We’re proud to have helped raise money for a good cause!
Some 1,200 people registered for the conference, and the plenary talks were frequently standing room only. Great presentations came from Ting Wu, Craig Venter, Heidi Rehm, Diana Bianchi, and a host of others. We really enjoyed the concurrent session focused on long-read sequencing that included Mike Snyder, Chad Nusbaum, Dick McCombie, and a few other terrific speakers. One of the unique things about the event was an evening play about clinical genomics featuring a number of brave scientists, including Eric Green, Andy Faucett, and others. Who stole the show? Naturally, it was George Church in the role of God.
The festival heads west to San Mateo this fall, with a winter performance in London. We look forward to seeing how the organizers from Front Line Genomics continue to innovate at this fun meeting!
If you haven’t listened yet to the Mendelspod interview with Bobby Sebra from the Icahn Institute for Genomics and Multiscale Biology at Mount Sinai, we can’t recommend it highly enough. And that’s not because we happen to be sponsoring this podcast series on DNA sequencing — it’s because Sebra offers up some really interesting perspectives on a range of topics.
For example, he talks with Mendelspod’s Theral Timpson about the institute’s Resilience Project, which is just now kicking into high gear. Sebra outlines efforts to scale up the sequencing facility to meet the needs of this massive project, which aims to scan the DNA of healthy people to find naturally occurring biological mechanisms that might help them escape the effects of disease-causing variants.
Sebra is the institute’s director of technology development, so of course the interview includes great information about his view of the different sequencing platforms and how he chooses which platform to use for which project (for example, short reads for resequencing, and long reads for reference-quality genomes). His take is that scientists get the best results by using multiple platforms to generate complementary data.
Our favorite part was the discussion of sample prep, which Sebra notes is becoming a bigger challenge for genomic scientists with the growing need for larger DNA fragments for long-read and single-molecule platforms. “The quality of your input material needs to be better,” Sebra says, calling for novel methods in DNA extraction and processing. While his team can currently make a 20 kb to 50 kb library with enough input material, he says the dream is being able to make these extremely large-fragment libraries from vanishingly small input.
Sebra covers several other compelling topics in the 27-minute podcast, such as his response to the accusation that the genomics revolution has fallen flat, what’s exciting in clinical genomics, the need for single-cell sequencing, and his experience with data from BioNano Genomics, 10X Genomics, and Oxford Nanopore. Be sure to check it out.
And if you missed the first installment in the series, here’s the podcast with Rod Wing at the Arizona Genomics Institute.
This week we’re traveling to Baltimore for the annual east coast user group meeting for Pacific Biosciences customers. We’re a sponsor of the event and look forward to the great scientific presentations these meetings have become known for. Click here to check out the latest resources and protocols for size-selecting PacBio libraries using Sage instruments.
This year for the first time there will be a half-day sample prep workshop, including talks on handling ultra-long DNA fragments, among others. We’re eager to see how PacBio users have made the most of BluePippin and SageELF, our automated DNA size selection platforms for long DNA fragments, in their research pipelines.
Held at the University of Maryland’s campus on June 17, the meeting will feature speakers from the National Institute of Standards and Technology, the United States Army Medical Research Institute of Infectious Diseases, Baylor College of Medicine, Cold Spring Harbor Laboratory, and many others. We’re particularly looking forward to talks on fusion isoforms in breast cancer, de novo metagenomics, and diagnostic assays.
PacBio customers tend to be intrepid when it comes to trying new protocols and coming up with new methods, especially regarding sample prep. Each year at this meeting we’ve gotten a new glimpse into low-input sequencing or other technical achievements, so we anticipate great presentations from people who continue to push the envelope with long-read sequencing.
If you’ll be at the event, we hope you stop by our table and check out BluePippin and SageELF. If not, we’ll be tweeting and blogging, so stay tuned!