In the genetics department at Albert Einstein College of Medicine, Research Assistant Professor Alex Maslov is working to understand structural variants associated with aging and cancer. Using human and mouse cells, his lab applies whole-genome sequencing to trace those associations. In one current project, the lab is investigating whether chemotherapy causes somatic mutations in non-tumor tissue. For these studies, his team relies on Pippin automated DNA sizing instruments from Sage Science.
Maslov began with Pippin Prep, which he uses primarily for library preparation before Ion Torrent sequencing. “We were extremely happy with it because it’s very precise and reproducible, and doesn’t take much effort,” he says. But between his lab and the core facility led by Shahina Maqbool, demand quickly surpassed the Pippin Prep’s capacity.
That’s when Maslov got his PippinHT. In addition to solving the capacity issue, he says, the PippinHT delivers results more quickly, taking just 20 minutes per run. Reproducibility of sizing is very important to Maslov, who uses split reads to detect structural variants in Ion Torrent data. Any variability between samples changes the sensitivity of structural variant detection and makes results less reliable. “What we like about PippinHT is that it’s extremely reproducible. All 12 samples come out as identical,” he says. “When you do size selection on a gel, you can never do it precisely from one sample to another.”
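The split-read idea Maslov mentions can be illustrated with a minimal sketch (not his actual pipeline): a read whose aligner reports a supplementary alignment via the SAM `SA` tag maps in two pieces, which can flag a structural-variant breakpoint. Consistent insert sizing keeps the expected-span model tight, which is why reproducibility matters here. The records below are made up for illustration.

```python
# Illustrative sketch: flag split reads in SAM records via the SA tag.
# A read aligned in two pieces (primary + supplementary alignment) can
# span a structural-variant breakpoint.

def find_split_reads(sam_lines):
    """Return (read_name, chrom, pos, sa_value) for reads carrying an SA tag."""
    hits = []
    for line in sam_lines:
        if line.startswith("@"):          # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        name, chrom, pos = fields[0], fields[2], int(fields[3])
        for tag in fields[11:]:           # optional tags start at column 12
            if tag.startswith("SA:Z:"):   # supplementary alignment -> split read
                hits.append((name, chrom, pos, tag[5:]))
    return hits

# Hypothetical SAM records: read1 is split across two loci, read2 is not.
records = [
    "@HD\tVN:1.6",
    "read1\t0\tchr1\t1000\t60\t50M50S\t*\t0\t0\tACGT\tFFFF\tSA:Z:chr1,5000,+,50S50M,60,0;",
    "read2\t0\tchr1\t2000\t60\t100M\t*\t0\t0\tACGT\tFFFF",
]
splits = find_split_reads(records)
```

Only `read1` is reported, since it alone carries a supplementary alignment.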
The PippinHT was installed at the core lab, where it’s used by other scientists for Illumina sequencing, both for DNA and RNA projects. “For RNA library preparation, it’s even more critical,” says Maslov. “They need to distinguish library fragments from adapter-dimers, and in the case of microRNAs, the difference might only be 20 base pairs.”
Bringing in either Pippin instrument is an investment, but Maslov says that ultimately the tools help scientists save money. “With Ion Torrent, if you use fragments that are too small, you’re not getting the full output of sequencing. If you use fragments that are too long, you can lose whole runs,” he says. These kinds of mistakes in size selection can be quite costly, but they can be avoided with precise, automated sizing. “My advice to other scientists is: do not hesitate,” Maslov says. “Pippin works.”
This year’s DNA Day arrives at a heady time for advances with the world’s most important molecule: scientists have edited DNA in a human zygote for the first time, we’re closer to a fully finished human reference genome than ever before, and the community is making major strides in using DNA to store data.
It’s humbling to be part of a field that is transforming so quickly. What’s being accomplished today is truly amazing, especially when we consider that June 2000 saw the White House announcement of the first drafts of the human genome sequence from the Human Genome Project and Celera. Fifteen years ago, telling our friends and family that we worked in genomics was the ultimate conversation-stopper; today, we feel like rock stars when people learn that we’re part of this exciting industry.
DNA Day celebrates both the completion of the Human Genome Project’s sequence in April 2003 and Watson and Crick’s seminal 1953 paper on the structure of DNA. When we think about how much has been learned about DNA since those first studies, it’s staggering: from epigenetics to CRISPR, from transposable elements to folding properties, we have come so far in such a short time. Now biology is entering the realm of big data, and DNA sequencing has led the way.
Of course, there’s still a long way to go. We believe that public education is particularly important; in a recent survey of consumers, the vast majority of respondents said that “any food containing DNA” should be labeled as such. It’s sad that even as we’re making incredible leaps forward in our understanding of DNA, so many people still have little or no education about this molecule and its function in the world. We hope that the community finds new and innovative ways to inform the public as it continues this unprecedented pace of biological discovery.
We wish you and yours a happy DNA Day!
We always love a great protocol video, and this one from scientists at Weill Cornell Medical College, published through the Journal of Visualized Experiments, is a keeper. Check it out here: “Enhanced Reduced Representation Bisulfite Sequencing for Assessment of DNA Methylation at Base Pair Resolution.”
The protocol, which can also be viewed the old-fashioned way here, is an NGS-based approach to map DNA methylation patterns across the genome and was developed as an alternative to microarrays. The Cornell scientists and their collaborator at the University of Michigan present a step-by-step recipe for using a restriction enzyme in combination with bisulfite conversion to achieve base-pair resolution of methylation data. The entire method spans four days.
“Reduced representation of whole genome bisulfite sequencing was developed to detect quantitative base pair resolution cytosine methylation patterns at GC-rich genomic loci,” the scientists report. The data generated “can be easily integrated with a variety of genome-wide platforms.”
In the protocol, the scientists call for automated DNA size selection with Pippin Prep, assuming there’s enough input material to make it possible (25 ng or more). You can watch the process (just past the 4-minute mark in the video) or read about it in section 5.1 of the paper.
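The core readout of any bisulfite protocol can be sketched in a few lines: unmethylated cytosines convert and sequence as T, methylated cytosines remain C, so the methylation level at a site is C / (C + T) across covering reads. The per-site counts below are hypothetical stand-ins for real pileup data, not output from this protocol.

```python
# Minimal sketch of the bisulfite-sequencing readout: after conversion,
# unmethylated C reads out as T and methylated C stays C, so methylation
# level at a cytosine site = C / (C + T) over the reads covering it.

def methylation_level(base_counts):
    """base_counts: dict like {'C': 7, 'T': 3} at one cytosine position."""
    c = base_counts.get("C", 0)
    t = base_counts.get("T", 0)
    if c + t == 0:
        return None                      # site not covered
    return c / (c + t)

# Hypothetical per-site base counts (illustrative only).
pileup = {"chr1:1500": {"C": 9, "T": 1},
          "chr1:1520": {"C": 2, "T": 8}}
levels = {site: methylation_level(counts) for site, counts in pileup.items()}
```

Here the first site comes out ~90% methylated and the second ~20%, which is the base-pair-resolution quantitation the paper describes.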
At the Whitehead Institute’s Genome Technology Core, scientists handle a lot of ChIP-seq and RNA-seq projects. To boost capacity in library prep, they recently upgraded from a small fleet of Pippin Prep instruments to the new PippinHT for high-throughput, automated DNA size selection.
Technical assistant Amanda Chilaka uses the PippinHT very often — primarily for ChIP-seq library prep — and says that in just three or four months it has become the preferred size selection instrument in the core lab. Compared to the Pippin Prep, Chilaka says, the speed and capacity of the PippinHT are a big improvement. “The PippinHT size selection is a little bit more precise and it’s faster as well,” she adds.
One benefit Chilaka finds particularly useful is the ability to cut different libraries at different size ranges in a single run. She also notes that sizing from one sample to another is very consistent.
“I can’t imagine us ever going back to cutting gels with a scalpel at this point,” Chilaka says. “I definitely recommend the PippinHT for anyone doing library prep or working at a high-throughput lab. It’s incredibly helpful.”
Darren Heavens has witnessed a fascinating transition at The Genome Analysis Centre as the Norwich, UK-based institute shifted from data-generation mode to data-analysis mode. When the center launched more than five years ago, there was a fairly even split between laboratory-based scientists and bioinformaticians, Heavens says; today, there are about 15 laboratory scientists and nearly 70 bioinformaticians. The focus is on generating great data that lets the bioinformatics experts perform the highest-quality analyses.
Heavens, a team leader in the Platforms and Pipelines group, spends a lot of his time figuring out how to make the data produced at TGAC more amenable to bioinformatics crunching. One of the newest weapons in his arsenal is the SageELF, an automated system that produces 12 contiguous fractions from a DNA sample.
His prior experience with Sage Science instruments came from the BluePippin, which he began using for size selection of NGS libraries after a TGAC bioinformatician presented data on the variability of insert sizes in libraries he was trying to assemble. “He did the data analysis and found that BluePippin sizing improved his outputs no end,” Heavens recalls.
So it was a no-brainer for Heavens to try out the new SageELF, which he’s been using for a few months now. “It’s great because it gives us the chance to make multiple libraries from one sample,” he says, noting that this helps keep reagent and other costs in check. For experiments requiring a very specific insert size, Heavens likes to run a sample on the SageELF and map the fractions to assembly data to determine which best meets the criteria before going ahead with the rest of the experiment.
His team uses the instrument for long mate-pair NGS projects, restriction-digest sequencing, and sequencing projects focused on copy number variation. For CNVs, Heavens and his colleagues came up with a protocol using SageELF to separate PCR products; they then sequence the largest fraction to get an accurate view of the highest copy numbers present in the sample. “That gives us the true copy number,” he says. “The duplicated genes themselves are so similar that if you don’t have the full-length fragment, they just collapse down in the assembly.” The protocol, which they developed for a project for one client, was so successful that several other clients have now come to TGAC asking for the same method for their samples, Heavens says.
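The depth-ratio idea behind copy-number estimation can be sketched as follows (a hedged illustration, not TGAC’s method or data): compare read depth over a duplicated region to depth over known single-copy regions, scaled to a diploid baseline. Sequencing full-length fragments keeps near-identical copies from collapsing in assembly, so the depth ratio reflects the true copy number.

```python
# Hedged sketch: estimate copy number as the read-depth ratio against a
# single-copy baseline, scaled so that a single-copy diploid region = 2.
# All depth values below are illustrative, not real TGAC data.

def estimate_copy_number(region_depth, single_copy_depths):
    """Round the depth ratio against the mean single-copy depth."""
    baseline = sum(single_copy_depths) / len(single_copy_depths)
    return round(2 * region_depth / baseline)

cn = estimate_copy_number(region_depth=151.0,
                          single_copy_depths=[29.5, 30.2, 30.3])
```

With a ~30x single-copy baseline and ~151x depth over the repeat, the estimate comes out to 10 copies.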
The biggest advantage of SageELF compared to other fractionation methods is its recovery, according to Heavens. His team gets 40% to 45% recovery from input material with the platform, while “with a manual approach you’d be lucky to get 10% to 15% recovery,” he says. “For us that’s a big plus.” He notes that scientists working with precious samples might find SageELF particularly useful for making the most of input DNA.
Heavens says setup and training were simple and straightforward, and that his team is now running the SageELF at or near capacity, which equates to two runs per day of two cassettes each. Since each cassette yields 12 fractions, that’s 48 fractions each day that the TGAC team could potentially use for sequencing. “It has opened up so many avenues for us,” Heavens says.