We’re learning as we go: that’s the message from Winston Timp, assistant professor at Johns Hopkins University, about how labs are handling the new demands placed on sample prep techniques by ever-changing sequencing technologies. Timp’s impressive results, particularly with handling DNA from difficult organisms like trees, make his advice relevant to anyone interested in working with high molecular weight DNA. We chatted with him about his approach.
Q: How has nanopore technology changed what’s possible in genomics?
A: Nanopore sequencing offers us a unique opportunity because the read length is limited only by the length of DNA that you can prepare and then the length of DNA you can deliver to the pore. People have generated megabase-scale DNA reads. That’s incredible because that means we’re going to be able to sequence through large sections of chromosomes that were heretofore impossible to reach. It’s going to make things like genome assembly trivial because you can assemble an E. coli genome from, say, five or six reads.
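As a back-of-the-envelope check on the "five or six reads" figure, the sketch below counts how many equal-length reads are needed to tile a genome end to end. It rests on assumptions that are not from the interview: an E. coli genome of roughly 4.6 megabases, reads of exactly one megabase, a hypothetical helper named `min_reads_to_span`, and a linear layout (the real E. coli chromosome is circular, and real reads vary in length).

```python
def min_reads_to_span(genome_bp: int, read_bp: int, overlap_bp: int = 0) -> int:
    """Fewest equal-length reads that tile a linear genome end to end.

    Each consecutive pair of reads shares overlap_bp bases, so every read
    after the first extends coverage by (read_bp - overlap_bp) bases.
    """
    if read_bp <= overlap_bp:
        raise ValueError("read length must exceed the required overlap")
    if genome_bp <= read_bp:
        return 1
    step = read_bp - overlap_bp
    remaining = genome_bp - read_bp
    # Integer ceiling division avoids floating-point rounding surprises.
    return 1 + -(-remaining // step)


# Assumed values for illustration only.
ECOLI_GENOME_BP = 4_600_000   # approximate E. coli genome size
READ_BP = 1_000_000           # a "megabase-scale" read

print(min_reads_to_span(ECOLI_GENOME_BP, READ_BP))                      # 5
print(min_reads_to_span(ECOLI_GENOME_BP, READ_BP, overlap_bp=200_000))  # 6
```

With no overlap the count is five; requiring 200 kilobases of overlap between neighboring reads raises it to six, which matches the range quoted above.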
Q: What new demands are being placed on sample prep by long-read technologies?
A: Part of the problem is getting the reads to the sequencing instrument, whether that’s a 10x Genomics instrument, or PacBio, or a nanopore sequencing instrument. The other part of the problem is extracting these long molecules without too much trouble and then characterizing and size selecting them, which is what Sage excels at. These issues are coming to the forefront because of the further development of sequencing technologies and the fact that the yields of some of these sequencing technologies have increased recently. Nanopore and PacBio sequencing yields have increased substantially in the past year or two, while Illumina prices continue to drop — which allows 10x to leverage its methodology to generate long sequencing reads. In all these cases, you need to start with high molecular weight DNA.
Q: That challenge is even worse for plant genomes. Why?
A: When you’re dealing with plant specimens, they often have all these polyphenolic and polysaccharide compounds, so it’s hard to get a nice clean prep of DNA. Using native DNA for nanopore sequencing (DNA that hasn’t been PCR amplified) requires that your DNA be really clean, or else it could easily poison the sequencer and you’ll get lower yields.
Q: How have you found methods that address these challenges?
A: We’re learning to do it as we go. For doing high molecular weight DNA extractions, some of the tools and technologies, like pulsed-field gels, are old and some are new. It’s a mix to get at questions we couldn’t access before. It’s a great time to be doing science.
Q: What approach is your lab using for these tree projects?
A: We paired with a group here in Baltimore called Circulomics. They spun out of a lab at Johns Hopkins and developed a material called Nanobind, which can purify high molecular weight DNA relatively easily. We are trying to generate genomes for the giant sequoia and for the coastal redwood, but it is difficult to extract DNA from their leaves. We’re cracking open the plant cells and extracting the nuclei, and then taking those nuclei and cleaning up what’s left using Nanobind to really enrich for nice high molecular weight DNA. We consistently get DNA that looks like it’s at least 100 kilobases long. We can run this on the nanopore sequencer and get yields on the order of 8 gigabases.
Q: What’s your advice for other scientists who want to work with HMW DNA?
A: It’s always useful to collaborate. We wouldn’t be able to do this without our collaborations with the bioinformaticists who do the assembly work, the plant biologists with deep biological knowledge, and the materials scientists at Circulomics. Also, you should always think about what you actually need. Sure, you might be able to try for megabase-scale sequencing reads using even older-school technologies like spooling up DNA on a glass rod. But for sequencing the sequoia we’re satisfied with reads on the order of tens of kilobases because that’s still in excess of what was previously possible. You have to define the parameters of what it is you’re going after and not get too greedy. You’re always going to be sacrificing something. Either you need to use more material to get the high molecular weight, or you might have more contaminants, or you might have less yield, but you’re going to get longer reads. There’s always a trade-off.