Sage Blog

Illumina Workflow: Pippin for Improved Assembly Accuracy

As we’ve seen throughout this blog series, Sage customers are conducting all sorts of great experiments pairing their Pippin size selection instruments with Illumina sequencers. Today we look at the final topic in this thread: boosting assembly accuracy with precise DNA size selection.

In the years since we first launched the Pippin Prep and its big brother, the BluePippin, we’ve found that the scientists who demand these tools the most are bioinformaticians. Why? Because they see the downstream impact of high-precision sizing and know that it can make an assembly far better than manual gel extraction or other less accurate sizing methods.

Andrew Sharpe, a Research Officer and Group Leader in the DNA Technologies Laboratory at the National Research Council of Canada, told us that he uses several Pippin instruments to build multiple paired-end libraries for the same sample. He might construct three libraries with 200-base, 300-base, and 400-base inserts, for instance, and then assemble them together. “If you assemble one of the libraries, then you’ll end up with an assembly. But if you assemble all three together using three different lengths, you get quite a bit better product,” Sharpe said.

Another approach is to construct a mate-pair library or a long-read library and assemble it with the shorter-insert paired-end libraries. That’s a method used by Matthew Clark’s sequencing technology development lab at The Genome Analysis Centre in Norwich, UK. Adding that large-insert information “has a massive effect on the quality of the output,” Clark told us. “The bigger-insert library gives you a 5x or 10x jump in quality, maybe even bigger, in terms of the sizes of the assembly that you’re able to generate.” He said that the TGAC bioinformatics team prefers Pippin-aided sequencing libraries because the tight size selection helps them determine how far apart certain reads should be and put together a more accurate assembly.
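To make the point about read spacing concrete, here is a minimal sketch of how a scaffolder might estimate the gap between two contigs from large-insert read pairs. It is an illustration only, not the TGAC team's method: the function names, the coordinates, and the insert-size figures are all hypothetical, and the interval assumes the pairs are independent.

```python
from statistics import mean

def estimate_gap(insert_size, len_a, pairs):
    """Estimate the gap (bp) between contig A and the contig B that follows it.

    insert_size: expected outer distance between the two reads of a pair
    len_a:       length of contig A
    pairs:       (start_on_a, end_on_b) tuples; start_on_a is where the read on A
                 begins, end_on_b is where its mate on B ends (outer coordinates)
    """
    # Whatever part of the insert is not accounted for on A or B must lie in the gap.
    estimates = [insert_size - (len_a - start_a) - end_b for start_a, end_b in pairs]
    return mean(estimates)

def gap_interval(gap, insert_sd, n_pairs, z=2.0):
    """Rough interval for the gap: z standard errors either side of the estimate."""
    half_width = z * insert_sd / n_pairs ** 0.5
    return (gap - half_width, gap + half_width)

# Hypothetical 8 kb mate-pair library linking a 50 kb contig to its neighbor.
pairs = [(46000, 3000), (47000, 4000), (45500, 2400), (46500, 3600)]
gap = estimate_gap(8000, 50000, pairs)

tight = gap_interval(gap, insert_sd=100, n_pairs=len(pairs))  # tight size selection
loose = gap_interval(gap, insert_sd=800, n_pairs=len(pairs))  # broad size selection
print(gap, tight, loose)
```

With the same four read pairs, the broader insert-size distribution leaves a much wider range of plausible gap sizes, which is why a tightly sized library makes it easier to place contigs relative to one another.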

The newest tool in the Sage portfolio will be particularly useful for this application as well. SageELF is a whole-sample fractionation tool that generates 12 contiguous fractions from a DNA sample, making it very simple for scientists to construct libraries of various insert sizes from the same sample.

We hope these blog posts detailing some of the most popular techniques used with the Sage + Illumina combo have been helpful to you. Thanks for reading!


Cancer Genomics Crowd in Boston for Beyond the Genome Meeting

We’re gearing up for this week’s Beyond the Genome conference, to be held at Harvard Medical School here in Boston. This year’s event, hosted by Genome Medicine and Genome Biology, will focus on cancer genomics, therapies, and bioinformatics. A timely topic during Breast Cancer Awareness Month!

The Sage Science team has attended this meeting before, and that’s one of the reasons we’re so excited to be there this year: we know how great the science and speakers will be. The agenda is full of interesting sessions and presentations, including an opening talk from Gaddy Getz on cancer genomics and evolution; Mark Gerstein’s talk on human genome analysis; Andrea Califano speaking about regulatory networks; a talk from Sarah Highlander about the link between cancer and the human microbiome; and Peter Park’s talk on structural variation analysis.

We look forward to hearing about the latest advances in applying genomics, particularly next-gen sequencing, to find new ways to understand and defeat cancer. We are proud that so many of our users are deploying Sage products in these projects. From finding indels in paired-end sequencing data and tracking structural rearrangements in long-read sequence data to detecting full gene transcripts and conducting ChIP-seq experiments, Sage customers are truly driving advances in the cancer genomics community.

If you’re attending Beyond the Genome this week, please stop by our table. The Sage team would love to know more about your work and talk about how our products can make your life a little easier.


New Resources: App Notes for Mate-Pair and Long-Read Sequencing with BluePippin

We’ve got some new application notes to share that will be particularly handy for BluePippin customers running mate-pair libraries or sequencing with the Pacific Biosciences platform. Many thanks to our distribution partner, Nippon Genetics, for making this great information available to the community.

In one app note, data provided by Dr. Yoshitoshi Ogura and Dr. Yasuhiro Gotoh from the University of Miyazaki in Japan demonstrate the use of BluePippin in a mate-pair library workflow with Nextera tagmentation. They prepared libraries for six strains of bacteria and used BluePippin to extract 8 Kb fragments. The scientists had previously used manual gel extraction, but found it to be time-consuming and troublesome. They report that BluePippin significantly reduces the amount of time required while delivering high-quality sizing results. Illumina’s mate-pair guidelines already suggest using Pippin Prep for size selection, and we’re glad to see this work validating the use of BluePippin as well.

The other two app notes cover studies conducted to assess the value of BluePippin size selection for achieving longer subreads with the PacBio RS II sequencer. BluePippin has been quite popular with PacBio customers because it can remove short fragments from libraries, focusing sequencing efforts on the longest fragments. This process not only increases average read length, but also boosts instrument throughput.
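The effect of removing short fragments is easy to see in a toy model. The sketch below is a simplification under stated assumptions: the fragment lengths are made up, the `high_pass` helper is hypothetical, and real subread lengths also depend on sequencing chemistry and run conditions, which are not modeled here.

```python
from statistics import mean

def high_pass(fragment_lengths, cutoff_bp):
    """Keep only fragments at or above the cutoff, as a high-pass size selection would."""
    return [length for length in fragment_lengths if length >= cutoff_bp]

# Hypothetical fragment lengths (bp) in an unsized library.
library = [800, 1500, 2500, 4000, 6000, 9000, 12000, 15000]

# Apply a 7 kb high-pass cutoff.
selected = high_pass(library, 7000)

print("mean before selection:", mean(library))
print("mean after selection: ", mean(selected))
```

Dropping everything below the cutoff raises the mean fragment length, so the sequencer spends its reads on long molecules rather than short ones.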

In one project, Dr. Yasuhito Arai at Japan’s National Cancer Center Research Institute used BluePippin’s high-pass mode to remove fragments smaller than 7 Kb from a library of human genomic DNA. Results were assessed with the Pippin Pulse, our pulsed-field gel electrophoresis product that quickly checks the size of long DNA fragments. According to the study, BluePippin selection offered a real improvement: libraries built without sizing resulted in an average subread length of 2,675 bp; with BluePippin, that average increased to 4,714 bp, an improvement of 76 percent.

For the other project, a scientist from the Okinawa Institute of Advanced Sciences in Japan built three libraries of bacterial DNA: one with no size selection; one selected for fragments 4 Kb and larger; and one selected for fragments 7 Kb and larger. Sequencing was performed using PacBio’s P5-C3 chemistry. Results were checked on both the Pippin Pulse and a Fragment Analyzer from Advanced Analytical. Both evaluations showed that the library made without size selection included a number of short fragments, while the 4 Kb selection reduced those short fragments and the 7 Kb selection removed them. Compared to the library with no size selection, the 7 Kb library yielded a 3.3-fold increase in average subread length (from 2,060 bp to 6,671 bp); the amount of data per cell also increased 1.9-fold. According to the scientist, BluePippin is effective and essential for obtaining long reads.
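The improvements quoted for the two PacBio studies can be recomputed from the subread lengths given above. This is a small sketch with helper names of our own choosing; it uses only the rounded numbers quoted in this post, not the underlying app note data.

```python
def percent_increase(before, after):
    """Relative increase from before to after, in percent."""
    return (after - before) / before * 100

def fold_change(before, after):
    """Ratio of after to before."""
    return after / before

# Human genomic DNA library: average subread length without and with sizing (bp).
human = percent_increase(2675, 4714)

# Bacterial libraries: no size selection vs. the 7 Kb selection (bp).
bacterial = fold_change(2060, 6671)

print(f"human library: {human:.0f}% longer average subreads")
print(f"bacterial library: {bacterial:.2f}-fold longer average subreads")
```

The 76 percent figure reproduces exactly. The rounded lengths quoted for the bacterial study give a ratio of about 3.24, slightly under the reported 3.3-fold; we assume the difference comes from rounding in the reported values.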


Illumina Workflow: Pippin for Massively Parallel Genotyping

With so many Sage customers using their Pippin instruments in an Illumina sequencer pipeline, we’re taking a look at various applications enabled by the Sage + Illumina combination. Today we check out double-digest RADseq, which could not work without precise and reproducible size selection.

The approach was first nailed down by scientists in Hopi Hoekstra’s lab at Harvard University, which focuses on population genetics, development, speciation, and behavioral genetics. Their innovation, a new version of the popular reduced-representation genome sequencing method (commonly called RADseq), introduced a second restriction enzyme step as well as Pippin Prep size selection. The result: a validated protocol for massively parallel genotyping that allows researchers to study hundreds or thousands of genetic loci across hundreds of thousands of samples, all without any prior knowledge of the organism’s genome.

Essentially, scientists use ddRADseq to study a sliver of the genome in each sample; with Pippin sizing and the double restriction enzymes, they ensure that they’re looking at the same sliver across all samples. Then they can assess genetic variation within those regions for various applications, such as evolutionary development, population studies, and QTL mapping.
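The logic can be sketched as an in silico digest. This is a simplified illustration, not the Hoekstra lab's protocol: the EcoRI and MspI recognition sites are used only as an example enzyme pair, the sequence is a toy, the function names are hypothetical, and only the top strand is modeled.

```python
import re

# Example enzyme pair: recognition site and cut offset within the site.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),  # G^AATTC
    "MspI": ("CCGG", 1),     # C^CGG
}

def cut_positions(seq, enzymes):
    """Return sorted (position, enzyme_name) for every cut on the top strand."""
    cuts = []
    for name, (site, offset) in enzymes.items():
        # Lookahead so that overlapping sites are all found.
        for match in re.finditer(f"(?={site})", seq):
            cuts.append((match.start() + offset, name))
    return sorted(cuts)

def ddrad_fragments(seq, enzymes, min_len, max_len):
    """Keep fragments flanked by two different enzymes and inside the size window."""
    cuts = cut_positions(seq, enzymes)
    kept = []
    for (start, left), (end, right) in zip(cuts, cuts[1:]):
        if left != right and min_len <= end - start <= max_len:
            kept.append((start, end))
    return kept

# Toy sequence with three EcoRI sites and one MspI site.
seq = ("A" * 10 + "GAATTC" + "T" * 30 + "CCGG" + "A" * 80
       + "GAATTC" + "T" * 20 + "GAATTC" + "A" * 10)

narrow = ddrad_fragments(seq, ENZYMES, 30, 50)   # tight size window
wide = ddrad_fragments(seq, ENZYMES, 30, 100)    # broader size window
print(narrow)
print(wide)
```

Here the narrow window keeps one fragment and the wider window keeps two, while the fragment flanked by EcoRI on both sides is never kept. Because the window decides which fragments survive, a sizing step that drifts from run to run would sample a different set of loci in each library.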

We talked to Brant Peterson, PhD, a postdoctoral fellow in the Hoekstra lab and lead author on the ddRADseq paper, to learn more about the work. He told us that the team’s usual method of size selection — manual gel extraction — was simply not reproducible enough to make the ddRADseq results meaningful. After switching to Pippin Prep, Peterson told us, “There’s very little difference from one sizing reaction to the next, which is the key to this approach working.”

In the time since the original paper came out, other labs have adopted the ddRADseq approach. One is GenCore, the genomics sequencing core at New York University’s Center for Genomics and Systems Biology. GenCore Manager Paul Scheid learned the method and offers it as a service for core clients. “We use the Pippin when constructing those ddRAD libraries to control the amount of loci that we hit from a given library,” he told us. “It’s very nice for fine-tuning that parameter.”

Next we’ll have the final post in our blog series. Check back to learn about how Pippin products are being used with Illumina sequencers to generate higher-accuracy assemblies.


NIH Scientists Report New Findings in Battle Against Antibiotic Resistance

Antibiotic resistance is a scary concept, but at least there’s comfort in seeing so many great minds trying to solve the problem. Last week’s announcement that President Obama had issued an executive order for the development of a national plan to battle antibiotic resistance dovetailed nicely with a paper just published in Science Translational Medicine from NIH scientists.

The publication, “Single-molecule sequencing to track plasmid diversity of hospital-associated carbapenemase-producing Enterobacteriaceae,” reports the sequencing of 20 isolates of Enterobacteriaceae resistant to carbapenems, a powerful class of antibiotics used as a last resort in hospitals. Lead author Sean Conlan from NHGRI and his collaborators used the sequence data to understand the transmission path of a Klebsiella pneumoniae outbreak at the NIH Clinical Center in 2011, as well as isolates collected after the outbreak ended.

It’s impressive work, and we’re happy to report that our BluePippin automated DNA size selection platform was used in the project. Sequencing was performed with the PacBio RS II DNA Sequencing System; the team used BluePippin to remove fragments smaller than 5 Kb from the library prior to loading on the sequencer.

Long reads were necessary for the project, the authors note, because short-read sequence data as well as strain-typing technologies were unable to clearly distinguish between the organisms or to fully assemble the genomes.

Conlan et al. report finding less horizontal gene transfer than expected, but having the full sequence — including the drug-resistance-encoding plasmids associated with each genome — enabled them to get a sense of the remarkable diversity of the network of plasmids available to these bacteria.

The team also discovered that most of the cases suspected to represent hospital-acquired infections were in fact acquired earlier and missed in routine screening. This information helped them to focus their infection-prevention efforts on better screening at admission and increasing the frequency of surveillance cultures.

The authors suggest that real-time, whole-genome sequencing is already cost-effective for monitoring drug-resistant bacteria in clinical environments. “The cost of whole-genome sequencing is dwarfed by … costs associated with outbreaks and their investigations, including the human and financial toll and the loss of patient confidence in the health care facility,” they write.
