A few weeks ago we asked people to participate in a quick survey to help us make sure that our product development efforts are focused on areas of greatest need to the community. We were blown away by the response — many thanks to all of you who took time out of your busy days to offer your perspective.
We wanted to share a little bit about what we heard. Most respondents work in research, though a fair number came from the clinical realm. The most common genomics applications they use are RNA-seq, targeted sequencing, genome resequencing, and de novo genome sequencing. The single biggest sample prep challenge they reported was dealing with low-input or precious samples. While the vast majority of respondents use short-read sequencers, they expressed a lot of interest in adding long-read sequencing or synthetic long-read data to their pipelines.
We also offered participants the chance to win an Apple Watch. We printed out the list of respondents who asked to enter the drawing, and then we chose Tanja, the cheeriest member of the Sage team, to select our winner. Congratulations to Davinder Sandhu at Weill Cornell Medical College for winning the watch!
A pair of user group meetings last week offered some intriguing glimpses into the future of long-read sequencing. Oxford Nanopore customers got together in New York, while PacBio users assembled in Palo Alto, Calif. The Sage Science team attended and sponsored the PacBio event, and we followed the ONT meeting on Twitter and through this great post from Keith Robison at his Omics! Omics! blog.
One of the first things we learned about long-read sequencing when it became available a few years ago is that size selection is perhaps even more important for this technology than it is for short-read sequencers. Removing shorter fragments during library construction allows these sequencers to focus on the longest fragments, maximizing the read lengths generated during sequencing. “ONT has started using the Sage BluePippin instrument to enrich libraries for long reads,” Robison reported, noting that some groups have demonstrated MinION library prep workflows that enrich for reads of 20 Kb and greater. Meanwhile, at the PacBio meeting, CSO Jonas Korlach told attendees about a protocol for building 30 Kb+ libraries using BluePippin for size selection and Diagenode for shearing.
Naturally, the sequencing community is most interested in what’s next for these technologies. According to Robison, Oxford Nanopore told users that they can expect to see direct RNA-seq, amplification-free barcoding, and a higher number of barcodes to allow for increased pooling of PCR amplicons and other samples. Its next instrument, the PromethION, is slated for release to early-access sites in early 2016; it may generate terabases of data in a single day.
At the PacBio meeting, Korlach spoke about several advances coming to SMRT Sequencing users, who are by all accounts champing at the bit for access to the company’s new Sequel instrument. Attendees were particularly excited about non-amplification-based target enrichment with Cas9, new protocols for HLA and 16S, and the ability to work with low-input samples.
Long-read sequencing is proving to be transformative for the genomics field, where it is chewing through genomes that make other sequencers choke. We enjoy these user meetings for the great science presented, including any number of people reporting the first-ever glimpses of novel architecture, genes, and other previously intractable genomic regions. It was a thrill to hear that PacBio users have now published more than 1,000 papers — truly a milestone for long-read sequencing.
Check out the tweet history for both meetings using #PBUGM and #MCMNewYork15.
Have you been noticing the mate-pair trend? We certainly have. A technique that was once only used by a handful of labs has really come into its own as one of the preferred ways of gathering longer-range genomic data.
Here at Sage, what we like most about mate-pair sequencing is that it’s a clever way of gleaning new information that provides incredible downstream value, making assemblies significantly better. These long-range glimpses of an organism’s genome are enabling more accurate views of biology.
We’ve heard a lot about mate-pair sequencing at conferences we’ve attended recently. For instance, the new Revolocity system from Complete Genomics that was on display at ASHG uses mate-pair assembly to improve results. A recent publication from Lucigen scientists used mate-pair sequencing to improve assembly quality for an anaerobic bacterium cultured from a boiling spring.
Sage customers have been doing a lot of great work with mate-pair sequencing as well. At The Genome Analysis Centre, scientists developed a mate-pair method using SageELF that allows for lower-input samples while saving time and money. That method was essential for the new, highly contiguous wheat genome assembly the centre just released, where it helped boost contig N50 by a factor of 10. At RIKEN, researchers modified a Nextera mate-pair protocol to reduce costs. By using BluePippin earlier in the protocol, they were able to increase yield; they also reduced enzyme volumes for other steps and got high-quality results. In this app note, scientists from the University of Miyazaki also use BluePippin with Nextera for a mate-pair pipeline.
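For readers who don’t work with assemblies every day, the contig N50 metric mentioned above is simple to compute: it’s the contig length at which contigs of that size or longer account for at least half of the total assembly length. Here’s a minimal illustrative sketch (the function name and example lengths are our own, not from any of the pipelines cited above):

```python
def n50(contig_lengths):
    """Return the contig N50: the length L such that contigs of
    length >= L cover at least half of the total assembly length."""
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating length
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length

# Toy assembly of five contigs totaling 100 kb:
# cumulative sums are 40, then 65 (past the 50 halfway mark)
print(n50([40, 25, 15, 10, 10]))  # -> 25
```

A tenfold jump in this number, as reported for the wheat assembly, means half the genome now sits in contigs roughly an order of magnitude longer.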
To learn more about how Sage customers are using and improving mate-pair sequencing, check out this page.
This week we’re heading to the Festival of Genomics in San Mateo, Calif. We had a blast at the first festival in Boston, so we’re excited about this next edition (and we packed our running shoes, just in case).
The meeting has an impressive slate of speakers — it’s a virtual Who’s Who of the genomics community. If you happened to miss Ting Wu’s mind-blowing plenary talk at the Boston festival, we’re thrilled to see that she’s speaking again and heartily recommend checking it out. The concurrent sessions are chock-full of great talks too, and it’ll be tough to choose among them.
The Sage Science team will be in the exhibit hall, central to all of the presentation stages, and we hope you’ll stop by to say hello. At the June festival, we presented a sneak peek of our HLS system, currently under development. If you missed it, we’ll be happy to get you up to speed on a new product we think will be incredibly useful for the NGS community.
Hope to see you in San Mateo!
The genomics field is a funny place: at times it feels like we’re not making progress quickly enough, and at other times things seem to be moving so fast we’re just holding on for dear life. We were reminded of this paradox recently thanks to some conference discussions with our fellow genomics veterans, so we decided to think out loud about it here.
On one hand, we hear about consumers who can’t get access to their DNA data because of FDA regulations, or patients who would likely benefit from genomic testing but whose doctors are too traditional to give it a shot. On the other hand, it seems like just yesterday there were thousands of scientists around the world working hard to assemble the first human genome sequence, and now it’s almost routine for organizations to launch studies including tens of thousands of whole genomes.
Some recent indications suggest that we’ll soon leave behind those lingering doubts about the speed of progress. If you didn’t see this summer’s PLoS Biology publication about the growth of genomic data, it’s well worth a read. Scientists from a number of institutions came together to write the commentary, which builds on data metrics from recent years to make the argument that in as little as a decade, genomics may lap fields such as astronomy and social media to become the most prolific producer of big data. How’s that for a field that until recently relied on shipping hard drives back and forth?
At the ASHG conference in Baltimore this month, we saw more evidence for the rapidly increasing pace of genomics. NIH Director Francis Collins spoke about the Precision Medicine Initiative and its efforts to assemble 1 million whole genomes for a national database. Several speakers mentioned the 100,000 Genomes Project that’s underway in the UK, or the efforts at Geisinger Health System in Pennsylvania to sequence 250,000 patients. The days of congratulating ourselves for sequencing a single genome seem to exist only in the rear-view mirror.
Our internal data supports the same trend. Last November we launched the high-throughput version of our Pippin automated DNA sizing platform — an instrument designed in response to customer demand — and already the PippinHT is galloping along. So far, customers have ordered enough HT cassettes to process more than 30,000 samples. Whew! The PippinHT is best suited for large-scale genomic or transcriptomic studies; even a few years ago, we would never have predicted this level of demand for the instrument. It’s a sure sign that genomics is scaling faster than any of us could have anticipated.
Samples Run on the PippinHT
We hope this pace in genome science translates into exciting new approaches to healthcare, the most obvious beneficiary of so many sequence databases and massive-scale discovery projects. For our part, we’ll keep churning out those high-capacity cassettes to help our customers increase their throughput for larger and larger studies.