Sage Blog

Kick Up Your Heels for DNA Day

It’s that time of year again — the time our kids look at us, shake their heads, and ask, “There’s a day to celebrate DNA? Seriously?”

But for those of us in the industry, DNA Day is a big deal. April 25th was chosen to honor major milestones in our understanding of DNA (Watson and Crick’s publication on the double helix structure and the completion of the Human Genome Project), and for the community it’s a great day to reflect on the remarkable advances in this field. Here, we consider a few areas where progress is particularly impressive.

Diversity of DNA: Never in history have we had such a clear view of the genetics of organisms from extremophiles to extinct species, and everything in between. Cheap sequencing allows scientists to go far beyond model organisms, exploring genomes all across the tree of life. RAD-based sequencing approaches have made it more affordable to do massive-scale genotyping of non-model organisms as well. In addition, we’re engaged in the largest-scale studies the field has ever seen, with multiple efforts aiming to recruit 1 million people in cohorts that were until recently inconceivable.

How DNA functions: At last year’s inaugural Festival of Genomics, we listened with great interest as Harvard’s Ting Wu described compelling work to understand the function of DNA based on its folding patterns. Conventional wisdom had long suggested that unwieldy DNA strands scrunch themselves up however they can, but Wu and other scientists have shown that the folding pattern is instead precisely selected, with a significant impact on the downstream functions of that DNA. Findings like this remind us that we’re still at the beginning of the story of DNA, with many more chapters to go before we can truly say we understand it. In a recent paper we really enjoyed, scientists demonstrated that they could encode, encrypt, and extract short messages inserted into synthetic DNA.

How we treat DNA: Today, we think of treating DNA as a component of the NGS pipeline, with lots of effort to improve sample prep for everything from FFPE DNA samples to museum samples or precious clinical samples. But down the road, we may literally treat our DNA, using tools like CRISPR to edit out genetic problems from living people as a standard clinical treatment.

We hope you’ll be doing something fun to celebrate DNA Day this year. Follow along on Twitter with #DNADay16 to see how the community’s making it a special event.


Scientists Report Method for Secure Communication Using DNA

It’s a study that would make John le Carré proud. DARPA-funded MIT scientists published results of a new method for encrypting messages in synthetic DNA for highly secure communication. It popped up on our radar because our BluePippin automated size selection instrument was used during sequence analysis.

In the PLoS One paper, “Multiplexed Sequence Encoding: A Framework for DNA Communication,” authors Bijan Zakeri, Peter Carr, and Timothy Lu describe new approaches to encoding, encrypting, and fragmenting messages across multiple plasmids. “With synthesis and sequencing speeds rising, and costs rapidly declining, DNA is an intriguing option for the transfer and storage of digital information,” they write.

The team designed QWERTY-style keyboards to easily convert English words into nucleic acids, being careful to assign codons in a way that would minimize homopolymers in the resulting DNA sequence, though they note that users would be able to shuffle codon assignments for their own preference or to increase security of the message.
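To make the codon idea concrete, here is a minimal sketch of a text-to-DNA mapping that keeps homopolymer runs short. It is not the keyboard from the paper: the `ALPHABET`, the codon assignments, and the function names are all hypothetical choices made for illustration. The sketch only uses codons with no two adjacent identical bases, so a run of the same base can never be longer than two, even across codon boundaries.

```python
from itertools import product
import string

BASES = "ACGT"

# Keep only codons with no two adjacent identical bases (36 of the 64 possible).
CODONS = ["".join(c) for c in product(BASES, repeat=3)
          if c[0] != c[1] and c[1] != c[2]]

# Hypothetical 31-symbol alphabet; the paper's keyboards use their own assignments.
ALPHABET = string.ascii_uppercase + " .,?!"

# Pair each symbol with one codon; zip stops once the alphabet is exhausted.
ENCODE = dict(zip(ALPHABET, CODONS))
DECODE = {codon: ch for ch, codon in ENCODE.items()}


def encode(message):
    """Convert text to a DNA string, one codon per character."""
    return "".join(ENCODE[ch] for ch in message.upper())


def decode(dna):
    """Convert a DNA string back to text by reading three bases at a time."""
    return "".join(DECODE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def longest_homopolymer(dna):
    """Return the length of the longest run of a single repeated base."""
    longest = run = 0
    prev = None
    for base in dna:
        run = run + 1 if base == prev else 1
        longest = max(longest, run)
        prev = base
    return longest
```

Shuffling `CODONS` before building `ENCODE` would give a different private assignment, which is the kind of user-level customization the authors mention.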

Next, they created what they call a “secret-sharing system” that encrypts the message and splits it across several DNA molecules, requiring the recipient to use a combination key to reveal the message. “This approach can add an additional layer of protection for a communication and also provide opportunities to explore introducing tiers of complexity within a communication that is afforded by the unique makeup of DNA as a chemical polymer for information storage,” the scientists write. (In a step we really enjoyed, the team also took the opportunity to encode decoy messages into the DNA.)
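The general idea of splitting a message so that no single piece reveals it can be sketched with a simple XOR scheme. This is a generic textbook construction, swapped in here for illustration only; it is not the secret-sharing system or combination key described in the paper, and the function names are hypothetical. Every share is required to recover the message.

```python
import secrets
from functools import reduce


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split(message, n):
    """Split message into n shares; all n are needed to rebuild it."""
    if n < 2:
        raise ValueError("need at least two shares")
    # n - 1 shares are pure random bytes of the same length as the message.
    shares = [secrets.token_bytes(len(message)) for _ in range(n - 1)]
    # The last share is the message XORed with every random share.
    shares.append(reduce(xor_bytes, shares, message))
    return shares


def combine(shares):
    """XOR all shares together to recover the original message."""
    return reduce(xor_bytes, shares)
```

In a DNA setting, each share could in principle be encoded onto a separate molecule, so that a recipient holding only some of the molecules learns nothing about the message.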

For the final part of the process, the team came up with a new approach to extract the original message. “We investigated a new method that allows for the multiplexed sequencing of multiple DNA molecules with a common primer, where regions within distinct DNA molecules that have matching information can be identified from a single sequencing reaction via chromatogram patterning,” they report. They validated the whole process by encoding watermarks, messages, and a combination key into six synthetic DNA strands, honoring the cryptography field by using an important World War II communication.

The scientists note that this work demonstrates proof of concept, and that they plan to follow up with additional innovations in future efforts.


New hyRAD Method Expands Use of RAD-seq to Degraded Samples

Since RAD-seq was first developed, we’ve seen a number of new versions and approaches from an enthusiastic scientific community. The latest was recently published in PLoS One and demonstrates a RAD-based method suitable for analyzing degraded DNA, an essential step for studying samples stored in museum and other collections.

“Hybridization Capture Using RAD Probes (hyRAD), a New Tool for Performing Genomic Analyses on Collection Specimens” comes from lead authors Tomasz Suchan and Camille Pitteloud at the University of Lausanne and their collaborators in Russia, Poland, and the UK. The project was launched to overcome the challenges of using traditional RAD-seq methods, which require longer DNA fragments than are typically available in museum samples. “Museum collections … have not necessarily ensured optimal conditions for DNA preservation,” the authors write. “As a result, many museum specimens yield highly fragmented DNA — even for relatively recently collected samples, limiting their use for molecular ecology, conservation genetics, phylogeographic and phylogenetic studies.”

Their solution is a method called hyRAD, for hybridization RAD, which starts by using double-digest RAD-seq to produce DNA fragments from fresh samples of the species of interest. Those fragments then become capture probes for use with the degraded DNA samples. “Our method thus combines the simplicity and relatively low cost of developing RAD-sequencing libraries with the power and accuracy of hybridization-capture methods,” the team reports. “This enables the effective use of low quality DNA and limits the problems caused by sequence polymorphisms at the restriction site.”

The scientists tested this protocol on eight samples of Lycaena helle butterflies, followed by a validation project on 49 samples of the Palearctic grasshopper Oedaleus decorus. As in other RAD workflows, they used Pippin Prep for size selection prior to sequencing.

“Not relying on the presence of restriction site, the method presented here should be also useful for broader phylogenetic scales, allowing sequencing homologous loci from more divergent taxa, which would not be possible to retrieve using classical RAD-seq approaches,” the scientists conclude.


AGBT in Review: Long-Range Data for Better Genome Insight

AGBT is behind us, which means the Sage Science team is officially back to the land of fleece and flannel. We had a great time at the conference and especially enjoyed seeing all the attendees making the most of our selfie sticks on the dance floor at the closing party!

The final AGBT sessions were every bit as interesting as the rest of the meeting. Nick Loman’s talk describing the use of Oxford Nanopore MinIONs during the recent Ebola epidemic in West Africa was an amazing glimpse of the kind of field-based sequencing we’ve dreamed of for a long time. His observation that the weak link in the system was the need for a constant Internet connection (required for the sequencer’s base-calling software) underscores the basic logistical challenges we face in achieving our ultimate goal of being able to sequence anything, anywhere, anytime.

The presentation from HudsonAlpha’s Shawn Levy continued the trend of 10x Genomics data, one of the major themes of this year’s AGBT. His emphasis on the importance of phasing and of finding complex events and structural variants mirrors a growing recognition in the community that short-read data will have to be supplemented by other data sources — be it long-read sequencing, Hi-C data, synthetic long reads, genome maps, or something else — for maximum benefit. Dovetail Genomics, which uses a Hi-C approach, was mentioned in several talks at the conference and really seems to be gathering steam in the field.

Normally we’d have a whole year to rest up for the next AGBT, but this September the organization will host its inaugural precision health meeting. We’re eager to see the speaker list and agenda, and maybe even get to experience the Scottsdale, Ariz., meeting in person!


AGBT: The All-Nighter Parties Haven’t Broken Us Yet

We may be a little bleary-eyed, but so far we’re surviving the Super Bowl of genomics, better known as AGBT. Last night we had a blast co-sponsoring a party out on the golf course with PacBio, and we’re glad so many people came out to eat, drink, and mingle.

The quality of scientific talks alone is enough to distinguish AGBT from other genomics conferences; this year’s slate of presentations has been no exception. We enjoyed Matt Sullivan’s talk in the opening plenary session and were delighted to see that he’s still using that carefully honed NGS pipeline for low-input samples in his new lab at Ohio State.

In a talk from the Joint Genome Institute’s Ji Lee, attendees got a nice glimpse of Oxford Nanopore’s sequencing technology in a mini-metagenomics experiment. Lee said her team regularly gets about 500 Mb of 2D sequence per flow cell and that low-input samples have provided sufficient yield for sequencing. They modified the library prep method, adding size selection with BluePippin or PippinHT after shearing to generate 20 Kb libraries — a significant boost from the median read lengths they were achieving before.

We also enjoyed the PacBio workshop yesterday, where scientists shared a number of great projects for which SMRT Sequencing had made a considerable difference in assembly quality. The event focused on human biomedical sequencing applications, so we heard some really nice examples of how long-read sequencing has provided insights for infectious disease, cancer, drug metabolism profiles, and even induced pluripotent stem cells. PacBio users have been deploying many of our automated DNA size selection tools to achieve higher average read lengths for years now, and it’s great to see these ultra-long reads making such a difference.

And now it’s back to the conference. One more day of sessions, and then we can go home and sleep for three straight days!
