2015: The Year of Long Reads?
We couldn’t help noticing that “long reads” kept popping up in presentations and posters at AGBT, and we certainly weren’t alone. Aside from longtime long-read provider Pacific Biosciences and synthetic long-read service Moleculo, acquired by Illumina in early 2013, new companies such as 10X Genomics and Dovetail Genomics were touting the value of this kind of information at AGBT.
We’re already seeing sessions on long-read sequencing on the agendas of other upcoming conferences, leading to our theory that 2015 will go down in sequencing history as the Year of Long Reads. It’s no wonder demand for this kind of data is soaring: after years of using short-read sequencers to analyze genomes, scientists are just now realizing how much information about structural variants, haplotype phasing, and other long-range, clinically relevant elements is inaccessible with short reads alone.
There are a couple of different approaches to long-read data. Single-molecule sequencing platforms, like those available through PacBio and Oxford Nanopore Technologies, generate truly long reads on their own. Users of both platforms have presented individual reads running well into tens of kilobases, a far cry from the few hundred bases we’re used to from Illumina and Ion Torrent sequencers. Assembling those long reads can lead to megabase-plus contigs.
But since the vast majority of sequencing data currently available has been produced with short-read technologies, there’s also a huge appetite for bolt-on products that can pull long-range information out of short-read data. Like their older sibling Moleculo, upstarts 10X Genomics and Dovetail Genomics focus on altering library prep in a short-read workflow to allow analytical tools to connect the sequence data into much longer blocks. These synthetic long reads have been shown to elucidate larger elements like structural variants without switching sequencing platforms.
Both approaches suggest an exciting trend that will let us get more out of each genome we sequence. Here at Sage Science, we’re pleased to report that our BluePippin automated DNA size selection platform can be used with either of these approaches to maximize the length of reads generated or synthesized. For an example of how BluePippin works with synthetic reads, check out this blog post; learn more about BluePippin with long-read sequencing in these app notes. And check back soon for new info on how the PippinHT can be used with long-read workflows too!
At AGBT, Stellar Science and Rapid Progress for Genomics Community
The Sage team has attended AGBT for years, and the 2015 meeting reminded us just how lucky we are to be part of this amazing community. For those of us who remember the first Marco conference in 2000, it is truly awe-inspiring to see just 15 years later that genomics is being used to treat, and even cure, patients around the world. We were humbled by the rapid and remarkable advances this community has enabled.
Some of our favorite talks this year focused on the human microbiome. Michael Fischbach from the University of California, San Francisco, spoke about naturally occurring molecules produced by the microbes that live in and on us. So many of these natural products are antibiotics that Fischbach joked the organisms had made an end-run around the FDA, finding a way to get these molecules into our systems without regulatory approval or a physician’s prescription. He noted that there’s still a lot to learn about the molecules that our microbes are synthesizing — it seems certain that discovering this information could have a major impact on how we view human health.
Rob Knight from the University of California, San Diego, presented work showing changes in microbiome from infancy onward; the profile evolves until age 2.5, at which point it has matured into the same profile seen in adults. He told attendees that despite the inability of genome-wide association studies to turn up reliably predictive genetic markers of obesity, analyzing the microbiome can reveal whether a person is lean or obese with 90 percent accuracy. Clearly, there’s a lot of uncharted territory in how our microbes are contributing to — or in some cases completely defining — various phenotypes.
There was also strong clinical content at AGBT, with impressive presentations describing how sequencing was used to diagnose patients or to suggest treatment options that are not the standard of care for a given condition. Steve McCarroll from Harvard Medical School gave a talk about how a collection of blood samples for a schizophrenia study led to the unexpected discovery of markers indicating early stages of blood cancer, long before the cancer could be diagnosed with traditional methods.
We can’t review all of the amazing talks and posters here, but suffice it to say, it was really great to witness the innovation, intelligence, and ingenuity driving the genomics community. Many thanks to the scientists who stopped by our suite to learn more about Sage Science, and we’re already looking forward to next year’s AGBT.
AGBT 2015: You Bring Last Year’s Backpack, We’ll Bring the Popcorn and PippinHT
Next week is the biggest party of the year for the genomics community: the Advances in Genome Biology and Technology meeting. The Sage Science team can’t wait to emerge from our Boston igloos to soak up some much-needed warmth in Marco Island (while we’re soaking in the great science, of course). Like so many attendees, we’re already dusting off last year’s backpack as we prep for the journey south.
Now in its 16th year, AGBT always manages to deliver a mix of stellar technology talks, brand-new scientific results, and topical information. This year we’re especially looking forward to talks on precision medicine and promising new methods using NGS. We’re also quite intrigued by presentations about genomics in space and city-scale metagenomics.
During the meeting, we’ll be showing off the newest member of our Pippin family, the PippinHT, which is a great fit for the large-scale projects conducted by scientists at this event. PippinHT features everything you love about our platform — fully automated DNA size selection with best-in-class results and reproducibility, without any risk of cross-contamination — now at scale, running up to 24 samples at a time. PippinHT increases throughput while reducing run times and cost per sample.
You can find us in lanai #179, where we’ll be serving up popcorn as well as technical tips on how more accurate DNA sizing can help you generate better results from your NGS pipeline. We hope to see you there!
FAQs: SageELF automated 1D protein fractionation
Q: Does the SageELF use PAGE?
A: No. Cassettes are pre-cast agarose containing 0.1% SDS.
Q: What is the maximum sample amount that can be fractionated?
A: 350 ug. Samples are prepared in a 22 ul volume, to which TCEP and loading solution are added; the total volume is 40 ul.
Q: What is the volume of the collected fractions?
A: 28 ul per well.
Q: What is the buffer?
A: Tris-TAPS, pH 8.7.
Q: What do you recommend for SDS removal?
A: We recommend the Pierce HiPPR detergent removal spin columns.
Q: What are the fractionation ranges?
A: At present, we offer 3% agarose cassettes (ELP3010) for ranges between 10-300 kDa and 5% agarose cassettes (ELP5010) for ranges between 10-150 kDa.
Q: How long can you store a gel cassette?
A: 18 months.
Q: How does the software calibration work?
A: An optional labelled DNA oligo is added to the sample. The oligo runs ahead of the protein sample; when it is detected, a run-time threshold is set. The software then uses run-time-based calibration data to estimate the size of the fraction collected in each of the 12 wells.
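To make the calibration idea concrete, here is a minimal sketch in Python. Everything in it — the reference time, the simple linear scaling model, and all function and variable names — is an illustrative assumption on our part, not Sage Science's actual software.

```python
# Hypothetical sketch of run-time-based fraction estimation: the instrument
# detects a labelled DNA oligo marker, records when it arrives, and scales a
# stored calibration to estimate the protein size in each of the 12 wells.
# The reference time and linear model are purely illustrative assumptions.

REFERENCE_TIME_MIN = 60.0  # assumed marker detection time for the stored calibration

def estimate_fraction_sizes(marker_time_min, calibration_kda):
    """Scale a per-well calibration (well number -> kDa expected at the
    reference marker time) by the observed marker detection time."""
    scale = marker_time_min / REFERENCE_TIME_MIN
    return {well: kda * scale for well, kda in calibration_kda.items()}

# Made-up calibration points for three wells of a 3% (10-300 kDa) cassette
calibration = {1: 10.0, 6: 75.0, 12: 300.0}
estimates = estimate_fraction_sizes(66.0, calibration)
```

The point of the sketch is only the shape of the logic: one observed marker time adjusts the size estimates for all wells at once, which is why the oligo spike-in is optional but useful.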
Q: How long is a run?
A: Typically 2-3 hours.
Q: Is the sample reduced and alkylated?
A: Our sample prep requires reducing with TCEP (DTT may also be used). We are evaluating alkylation at this time.
Q: What extraction method works best?
A: Extraction methods that are designed for SDS PAGE analysis are preferred. Extraction methods for IEF using urea/thiourea should be avoided.
Still Buzzing about the Coffee Talk at PAG
One of our favorite sessions at last week’s PAG meeting focused on a major sequencing effort to understand the coffee genome. The presentation, from scientists at Cenicafé (Colombia’s National Coffee Research Center), highlighted a new project designed to characterize elements of the coffee genome that might help breeders create strains better suited to a changing climate.
Coffee production has been hard hit in recent years: a coffee leaf rust epidemic in Latin America, for instance, has cost more than $1 billion. The scientists noted that the plant has generally become more susceptible to insects and diseases in recent years as a result of climate change.
They teamed up with PacBio for long-read sequencing of the genome of Coffea arabica cv. Caturra, an allotetraploid clocking in at about 1.3 Gb. We were delighted to see that they used our BluePippin automated DNA sizing platform to generate the longest possible PacBio reads. They produced about 60x coverage and built the first assembly of this genome.
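For a back-of-the-envelope sense of the scale of that dataset, the quoted genome size and coverage imply roughly 78 Gb of raw sequence; the average read length below is our own assumption for illustration, not a figure from the talk.

```python
# Back-of-the-envelope arithmetic for the coverage figures quoted above.
# Genome size and fold coverage come from the presentation; the average
# read length is an assumed value for illustration only.
genome_size_bp = 1.3e9   # ~1.3 Gb C. arabica cv. Caturra genome
coverage = 60            # reported fold coverage

total_bases = genome_size_bp * coverage        # ~7.8e10 bases sequenced
avg_read_len_bp = 10_000                       # assumed ~10 kb long reads
approx_reads = total_bases / avg_read_len_bp   # ~7.8 million reads

print(f"~{total_bases / 1e9:.0f} Gb of raw sequence")
print(f"~{approx_reads / 1e6:.1f} million reads at ~10 kb average length")
```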
The scientists plan to validate the assembly using a high-quality assembly of C. eugenioides, the diploid maternal ancestor of C. arabica. That genome assembly consists of sequence data from Roche 454 platforms as well as Illumina’s Moleculo technology.
The team is hopeful this work will make a difference in plant breeding to yield a hardier, healthier coffee plant. As Alvaro Gaitan and his colleagues wrote in their session abstract, the work “should dramatically improve our understanding of coffee genetics and genomics providing direct applications to breeders for climate change adaptation.”