Preprint Highlights Utility of Size Selection for Nanopore Studies
A recently shared preprint demonstrates the effectiveness of size selection for nanopore sequencing, relying on the PippinHT automated DNA sizing platform for high-throughput pipelines.
“Mapping And Phasing Of Structural Variation In Patient Genomes Using Nanopore Sequencing” comes from lead author Mircea Cretu Stancu and collaborators at University Medical Center Utrecht, the University of Torino, and other institutions. In it, the scientists report results from using an Oxford Nanopore MinION to sequence the genomes of two patients with congenital abnormalities, with a focus on structural variant (SV) detection. “Long-read sequencing is breaking ground for the discovery of SVs at an unprecedented scale and depth,” they write. The team used the PippinHT system to size-select DNA libraries for the second patient prior to sequencing.
The effort, which produced the first known whole human diploid genome assemblies using the MinION, was a success. “We were able to extract all known de novo breakpoint junctions for Patient1, even at relatively low coverage,” the scientists report. For the second patient, the sequence data revealed more complexity for many breakpoint junctions. “We observed that 33.3% of the high confidence set of SVs observed in the Nanopore data could not be found in matching Illumina sequencing data, despite the use of six different variant calling methods,” they add.
The authors note that “these results highlight the feasibility to sequence clinical human samples in real-time on a low-cost device.”
Creighton Lab Boosts Yield, Sequencing Efficiency with BluePippin
At Creighton University in Omaha, Neb., Dr. Anna Selmecki’s lab explores various fungal species to understand genome instability, pathogenesis, and the acquisition of drug resistance. For these investigations, her team relies heavily on whole genome sequencing, using both the Illumina MiSeq platform and Oxford Nanopore sequencers.
However, Selmecki and her team encountered two major obstacles with their library preparation pipeline. A bead-based size-selection step was decreasing their yield, and even with size selection, the MiSeq was still generating very short reads. Using AMPure magnetic beads for sizing, “we always found that we lost a huge percentage of the library,” Selmecki recalls. Even when a Bioanalyzer reported that the library fragment size was in the desired range, sequencing reads were consistently shorter than expected.
While both problems stemmed from the sizing step, switching to commonly used manual gel excision was not an option. “From previous experience, I knew that cutting bands out of a gel is horrible and you still lose a lot of your library that way,” Selmecki says. She remembered from her days at the Dana-Farber Cancer Institute that colleagues had raved about an automated size selection instrument from Sage Science.
So Selmecki brought in the BluePippin sizing platform and solved both problems. Recovery is significantly better, and more precise size selection removes the small fragments that had been leading to shorter-than-anticipated MiSeq reads. “The Pippin cleaned that up a lot, ensuring that we’re only amplifying pieces that are much larger,” she says. Using BluePippin for size selection followed by bead-based purification, Selmecki and her team can easily select for insert sizes of 600 bp to 1.2 Kb for their paired-end sequencing pipeline. “We found we got better coverage across the genome,” she adds.
Selmecki’s team is already planning to expand the use of its BluePippin instrument to other molecular biology techniques, such as molecular cloning and library preparations for Oxford Nanopore sequencing. “We’re just doing everything on the Pippin,” she says.
“If people are noticing really uneven coverage across their genomes or they’re having trouble with yield during their library prep, I would recommend considering the Pippin,” Selmecki says.
How to CATCH Your Gene of Interest
If you haven’t heard about CATCH by now, it’s time to catch up. Short for Cas9-assisted targeting of chromosome segments, CATCH comes from the lab of Yuval Ebenstein at Tel Aviv University and was first reported in a Nature Communications paper.
Like so many scientists, Ebenstein found himself routinely having to sequence whole genomes in order to study a single region that was too large to amplify easily with PCR. “You end up paying for all this data and eventually using a very small fraction of it,” he recalls. While there are several target-capture and enrichment methods, they all require knowledge of the sequence of interest. For Ebenstein, who was interested in highly repetitive DNA, those methods didn’t work.
He cast about for a new approach, and found inspiration in the burgeoning CRISPR field. “We came up with this idea that you can cut the flanking region with Cas9 and then use gel electrophoresis to extract only the fragment that you’re looking for,” he says. The method uses RNA-guided Cas9 to make two cuts, one on each side of the region of interest, followed by a size-separation step to remove off-target fragments. It’s geared toward genomic regions that are 50 Kb or larger. Together with Ting Zhu and Chunbo Lou from Tsinghua University, the team began generating custom BACs by combining CATCH with Gibson assembly to cut the desired piece of DNA and clone it into a vector in a streamlined process.
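To make the logic of the two steps concrete, here is a minimal Python sketch of the arithmetic behind them: two flanking cut sites define an expected fragment length, and size separation keeps only fragments close to that length. This is an illustration under stated assumptions, not the published protocol or any instrument's software. The names `CutSite`, `excised_fragment_length`, and `keep_target_sized`, the coordinates, and the 10% tolerance are all hypothetical; only the 50 Kb guideline comes from the description above.

```python
from dataclasses import dataclass

# Guideline from the text: CATCH is geared toward regions of 50 Kb or larger.
MIN_CATCH_BP = 50_000


@dataclass(frozen=True)
class CutSite:
    """A hypothetical Cas9 cut position on a chromosome (illustrative only)."""
    chrom: str
    position: int  # base-pair coordinate of the cut


def excised_fragment_length(upstream: CutSite, downstream: CutSite) -> int:
    """Length in bp of the fragment released by two flanking cuts."""
    if upstream.chrom != downstream.chrom:
        raise ValueError("both cuts must be on the same chromosome")
    if downstream.position <= upstream.position:
        raise ValueError("downstream cut must lie after the upstream cut")
    return downstream.position - upstream.position


def keep_target_sized(fragment_lengths, target_length, tolerance=0.10):
    """Keep fragments within +/- tolerance of the expected target length.

    The tolerance is an assumed value standing in for the width of a
    size-selection window; real windows depend on the separation method.
    """
    low = target_length * (1 - tolerance)
    high = target_length * (1 + tolerance)
    return [n for n in fragment_lengths if low <= n <= high]


# Made-up coordinates for a 200 Kb target region.
up = CutSite("chr1", 1_000_000)
down = CutSite("chr1", 1_200_000)
target = excised_fragment_length(up, down)

# Made-up mixture: the target plus off-target fragments of other sizes.
mixture = [5_000, 48_000, 195_000, 200_000, 230_000, 1_500_000]
selected = keep_target_sized(mixture, target)

print(target, target >= MIN_CATCH_BP, selected)
```

In this toy mixture, only the fragments near the expected 200 Kb length survive the window, which is the role the size-separation step plays after the Cas9 cuts.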
Since then, Ebenstein and many other labs using CATCH have been broadening the base of applications. It’s particularly attractive for third-gen sequencing platforms; because they typically have lower throughput, “it’s especially beneficial to only probe what you’re interested in and not waste your sequencing depth on regions that are not of interest,” he says. “This is the power of CATCH: no matter how complex the region or what structural variations are in it, if you know the flanking region, you can fish it out and analyze it.”
An early drawback with the CATCH protocol was its use of gel electrophoresis, which Ebenstein refers to as “a prehistoric technology.” Size selection is essential for the method, but users had to perform the very cumbersome pulsed-field gel electrophoresis technique. That’s where the SageHLS instrument came in. “Sage basically eliminates all of that,” Ebenstein says. The automated platform handles everything inside the gel, and collects size fractions without needing a visible band. “The recovery is phenomenal,” he adds. “You can use a very low amount of starting material and you still get a meaningful amount of DNA for further analysis.”
The protocol for using the SageHLS instrument with CATCH (something we refer to as HLS-CATCH) is still undergoing optimization, with Ebenstein’s team putting the new platform through its paces.
In the meantime, the community continues to push ahead with CATCH. It is already in development in several labs for studies of plants, which have highly repetitive DNA. Ebenstein and others are working to make the protocol robust for use in human genetics as well, targeting important genes such as BRCA1 and BRCA2. He says that the SageHLS instrument will likely be an important factor in those efforts.
How can you tell if CATCH is right for you? Ebenstein has a simple rule: “If you can PCR it, PCR it,” he says. “If you can’t, then you probably need CATCH if you don’t want to go bankrupt.”
On DNA Day, Let’s Stop the Slicing and Dicing
Today is DNA Day, and we’re taking the opportunity to support the humane treatment of DNA. After all these years of harshly shearing these molecules and fragmenting them down to just a few hundred bases, can’t we agree that there are nicer ways to treat them? (Yeah, we know that for some applications, you really do need teeny tiny pieces of DNA. We get it.)
For many applications — particularly long-read sequencing and long-range technologies such as optical mapping — it’s actually better to leave DNA as intact as possible. Just a little gentle cleaving, and you wind up with extremely long DNA fragments that produce optimal results. By preserving these molecules as much as possible, we can detect large structural variants, phase distant SNPs, accurately count copy numbers, and much more.
Large input DNA is responsible for major advances in genomics, such as the most contiguous assemblies yet for humans and other mammals. These reference-grade assemblies have been tremendously useful for filling in blanks left by previous sequencing attempts using short reads, allowing scientists to discover new genomic elements — including entire genes — that had been missed with other approaches.
High molecular weight libraries are also being used for newer interrogations of the regions of DNA that touch when the molecule is folded in the nucleus. After decades of only studying DNA in linear order, we’re getting amazing new insights from approaches like proximity ligation mapping. Discoveries like these tell us that DNA is probably harboring even more fascinating secrets, and we just need to find the right ways of asking questions.
And perhaps it all begins with better treatment of your DNA molecules! We hope you’re celebrating DNA Day today. From all of us at Sage, happy HMW DNA to you!
At AACR, High-Profile Speakers and NGS Error Correction in the Spotlight
Last week’s annual meeting of the American Association for Cancer Research offered some great perspectives on innovation in oncology, both in the clinic and in academic labs.
We were particularly impressed by former Vice President Joe Biden’s update on the Cancer Moonshot initiative, which he characterized as a bright spot in bringing people together and pushing research forward. For example, Amazon offered to host some extremely large cancer databases, and in less than a year the information has been accessed 80 million times. Biden’s optimism about the effort turned to frustration with the new administration’s proposed cuts to research funding. He said the “Draconian cuts” would be a massive setback, though he expressed doubt that the proposed budget would pass Congress.
The association also gave out some prestigious awards, such as the AACR Award for Lifetime Achievement in Cancer Research to Mina Bissell. The Lawrence Berkeley National Laboratory scientist has been a pioneer in breast cancer research, delivering some of the earliest findings that cells lose their native expression patterns when cultured in different conditions (she also discovered that cells can “remember” their original profile when native microenvironment conditions are restored). As strong supporters of improved sample prep, we see Bissell as a champion for the kind of reproducible research practices that are essential to life science.
One of the most exciting technical advances came from a team at Johns Hopkins University, where scientists developed an error-correction method for NGS results from liquid biopsies targeting cell-free DNA. The approach boosts accuracy with ultra high-coverage sequencing. We’re excited about this work because it dovetails nicely with our new SageHLS instrument for purification of extremely large DNA molecules, such as entire genes associated with cancer. In beta tests, scientists have successfully purified the BRCA1 and BRCA2 genes using the platform with the CATCH method (Cas9-assisted targeting of chromosome segments).
We congratulate all the scientists who presented at AACR and made such a strong showing for the terrific recent advances in cancer research!