Sage Blog

Scientists Compare PacBio, Oxford Nanopore Transcriptome Results

We enjoy a good technology evaluation as much as the next scientist, particularly when it comes to sequencing. So we were quite interested in a recent F1000Research publication about long-read sequencing platforms from researchers at the University of Iowa, the University of Oxford, and other institutions.

From senior author Kin Fai Au and collaborators, “Comprehensive comparison of Pacific Biosciences and Oxford Nanopore Technologies and their applications to transcriptome analysis” presents a nice assessment of the pros and cons of long-read sequencing tools. The authors note that PacBio libraries were prepared using SageELF size selection, while the Oxford Nanopore libraries were not size-selected. Some of the study results can be explained by the difference in sample prep.

To compare the technologies, scientists sequenced the transcriptomes of human embryonic stem cells with PacBio, ONT, and Illumina (short reads were used for comparison purposes as well as for building hybrid assemblies). They note that long reads have been especially informative for transcriptome studies, including gene isoform identification.

In this analysis, both platforms were able “to provide precise and complete isoform identification” for a small library of known spike-in variants and for more complex transcriptomes as well. “PacBio has a slightly better overall performance, such as discovery of transcriptome complexity and sensitive identification of isoforms,” the team reports.

Delving into details, the long-read platforms performed similarly in read length, based on a comparison of mappable length. The team found that a higher proportion of PacBio reads could be aligned to reference genomes compared to ONT reads. Throughput was also noticeably different: “the yield per flow cell of ONT is much higher than PacBio, because each nanopore can sequence multiple molecules, while the wells of PacBio SMRT cells are not reusable,” the authors note. Error rate was another area of divergent results. PacBio CCS reads had an error rate “as low as 1.72%,” the scientists report, giving data from that platform “higher base quality than corresponding ONT data.”

“PacBio can generate extremely-low-error-rate data for high-resolution studies, which is not feasible for ONT,” the scientists add, noting that ONT advantages include high throughput and lower expense. “The cost for our ONT data generation was 1,000–2,000 USD,” they report.

In addition, the scientists assessed hybrid approaches for both platforms, adding short Illumina reads for error correction. It was the first known pairing of ONT and Illumina reads for this purpose. “As this first use of ONT reads in a Hybrid-Seq analysis has shown, both PacBio and ONT can benefit from a combined Illumina strategy,” the team writes. The authors note that both long-read sequencing tools have improved significantly with recent models and predict that future enhancements will be a boon to transcriptome studies as well.


Podcast: Nanopore Expert Mark Akeson on Challenges and Opportunities in Sequencing

Mark Akeson knows a thing or two about perseverance. After spending years as a soil biologist in Guatemala, where he endured a series of parasite infections, he went on to become a pioneer in nanopore sequencing technology, which he has been developing for more than 20 years even though highly regarded scientists insisted it would never work.

His story is the subject of an interview with Mendelspod host Theral Timpson, and the conversation — which took place earlier this year — provides fascinating insight into the world of nanopore sequencing.

Now co-director of the biophysics laboratory at the University of California, Santa Cruz, and a consultant to Oxford Nanopore, Akeson told Timpson that current nanopore technology faces a number of challenges but leaves plenty of room for significant improvement. Right now, accuracy of single-pass reads is still lower than that of short-read sequencing platforms, and sample prep issues limit the length of fragments that can be fed through the pore. But having already overcome major obstacles — finding the right size sensor, improving sensitivity, and moving DNA reliably one nucleotide at a time — Akeson is confident the technology will get even better.

As he sees it, the major advantage of nanopore sequencing is the ability to interpret DNA or RNA directly from the cell. “You’re reading what nature put there, not only the bases but modifications,” he said. In addition, nanopores allow for very long reads (Akeson said his lab is routinely generating 200 kb sequences) and are ideally suited for sequencing in the field.

This work might not have been possible without the commitment of program officers at NHGRI, whom Akeson praised as “visionaries on this research.” Now that the approach has finally been proven to work, “the whole nanopore sequencing endeavor is going exponentially faster,” he added.

Looking ahead, Akeson predicted that 20 years from now genome sequencing will be so cheap it’ll just be given away to consumers, with profits coming from follow-on interpretation services. One day, he said, we’ll look back on the $1,000 genome or even the $100 genome as very expensive. Sequencing technology will continue to improve over time because it’s “too important a technology to say ‘We’re done,’” he added, noting that accuracy, read length, and throughput will be areas of focus in the coming years.


Nabsys Returns: CEO Barrett Bready on the Importance of Structural Variation

A new podcast from Mendelspod features an interesting interview with Barrett Bready, CEO of electronic mapping firm Nabsys, who emphasizes the growing need to incorporate structural variation data into genome studies.

In the discussion, Bready describes his company’s platform, which relies on voltage-powered, solid-state nanodetectors to generate map-level information. Each nanodetector can cover 1 million bases per second, Bready said, and can be multiplexed for a highly scalable system. It’s a “really high-speed, highly scalable way of getting structural information,” he added.

Bready noted that the genomics community has realized the need for long-range information, estimating that known structural variants now make up about 60 Mb of the human genome, a number that has increased rapidly in the last few years even as the amount of sequence attributed to single-nucleotide variants has stayed the same. Nabsys aims to democratize access to structural information by producing a cost-effective mapping tool for routine analysis of these large variants.

Bready said this information will complement short-read data, which necessarily sacrifices assembly contiguity because the DNA must be cut into small fragments prior to sequencing. The Nabsys platform works with high molecular weight DNA to capture extremely long-range information. He also said that electronic mapping data offers more value than optical mapping technologies.

Beta testing for the new platform is expected to begin early next year.


ddRAD-seq Study Explores Behavioral Roles in Speciation

A new preprint from the Hoekstra lab at Harvard makes great use of the double digest RAD-seq protocol to better understand reproductive barriers and speciation in closely related species of mice. Since it was the Hoekstra lab that gave us the ddRAD-seq method, we took notice when this preprint became available.

The paper comes from Hopi Hoekstra and Emily Delaney, a Harvard grad student who is now a postdoctoral fellow at the University of California, Davis. In “Sexual imprinting and speciation in two Peromyscus species,” the scientists describe how sexual imprinting, typically a learned trait, contributes to sexual isolation of Peromyscus leucopus, the white-footed mouse, and P. gossypinus, the cotton mouse.

One area of interest at the start of this project was determining the genetic or learned mechanisms underlying sexual isolation. The scientists “used genomic data to first assess hybridization in the wild and conclusively found that the two species remain genetically distinct in sympatry despite rare hybridization events,” they report. “We find that these mating preferences are learned in one species but may be genetic in the other: P. gossypinus sexually imprints on its parents, but innate biases or social learning affects mating preferences in P. leucopus.”

The study used ddRAD-seq to analyze 376 mice. In that workflow, the team used Pippin Prep to select fragments ranging from 265 bp to 335 bp, and the resulting libraries were sequenced on the Illumina platform.

“Our study supports an emerging view that sexual imprinting could be vital to the generation and maintenance of sexual reproductive barriers,” the authors conclude. “Examining the role of sexual imprinting in similar cases of speciation driven by sexual reproductive barriers will continue to expand our understanding of the role of behavior in speciation.”


For 10x Genomics Workflow, Broad Institute Uses PippinHT Size Selection

At the Broad Institute, scientist Michelle Cipicchio is part of the technology development team responsible for optimizing new methods or sample types before they’re implemented on the organization’s industrial-scale exome and whole-genome sequencing pipeline. Recently, she’s been working with the Chromium platform from 10x Genomics, and part of getting it ready for production involved implementing the PippinHT for automated DNA size selection.

The technology development team is focusing on whole genome analysis with the Chromium platform. To put the workflow through its paces, they’re running a pilot project on 450 whole blood samples for scientists conducting a large schizophrenia study.

Cipicchio began working with automated DNA size selection from Sage Science at the recommendation of 10x Genomics. “The first step in the 10x process requires the longest DNA molecules that you can acquire,” she says. Since the Broad often uses legacy samples that have gone through multiple freeze/thaw cycles, her team doesn’t have the luxury of expecting high-quality, intact DNA. “For 10x, these long molecules are really necessary and most of our samples don’t have a ton of that kind of material,” Cipicchio adds. She began using BluePippin to remove smaller fragments prior to library construction. The team evaluated four samples with and without Pippin size selection and found that they were consistently able to get longer phasing data with automated size selection. To ramp up capacity so all 450 samples can be run with size selection prior to Chromium processing, the team upgraded to the higher-throughput PippinHT platform.

Optimization work for the workflow is still underway. Cipicchio and the team have run about 100 of the 450 samples so far, so they have lots more opportunities to polish and perfect the protocol before it’s ready for production mode.
