We’re packing our bags for Baltimore, land of crab cakes and this year’s annual meeting of the American Society of Human Genetics. With some 6,500 scientists expected to attend, the conference is one of the largest in the field — and with that comes a remarkable array of talks, posters, educational workshops, and much more.
The meeting will kick off with a splash: a presidential symposium featuring Francis Collins, David Hunter, Naomi Wray, and Marylyn Ritchie. The speakers will talk about precision medicine, large-scale genomic studies, and integration of electronic health records for clinical impact. ASHG always does a great job honoring the field’s best and brightest; this year, awards will be given to Emmanuelle Charpentier, Kay Davies, Jennifer Doudna, Leonid Kruglyak, and Hunt Willard, among many others.
If you’re attending the meeting, we hope you’ll have a chance to check out poster #1936, “An integrated method for extraction of high-molecular-weight DNA and preparation of genomic sequencing libraries using agarose gels” (Wed, Oct. 7th, 5 pm – 7 pm, clinical genetic testing section). Prepared by our R&D team, the poster presents information on a tool under development that we think will be particularly helpful for scientists and clinical researchers in genomics. The HLS enables fully automated, rapid purification of high-molecular-weight (HMW) DNA directly from blood or cells. HMW DNA (>50 Kb) is increasingly important for long-read sequencing and other applications. The poster shows how we accomplish this and reports some milestones, such as the recovery of DNA fragments as large as 800 Kb.
The Sage Science team will be exhibiting in booth #1016, where we’d be happy to talk to you about how automated DNA size selection can help you produce better results for your genome sequencing projects.
If you’re looking for out-there ideas in genomics, there’s no better place to start than with Chris Mason’s lab at Weill Cornell Medical College. We were delighted that Mason was featured in the latest podcast from Mendelspod and its host Theral Timpson. From swabbing subway stations to tracking gene expression in astronauts, this podcast is truly riveting.
Mason’s lab achieved celebrity status in its hometown of New York City when the staff kicked off Pathomap, an effort to survey the microbes in places like subway stations. “We tried to get a complete molecular map of the city,” Mason tells Timpson. It was a “big discovery effort to build a baseline microbiome” — and one that produced “an interesting, inspiring, and … in some cases controversial bit of research.” After sequencing hundreds of samples, the team found that half of the DNA collected didn’t match any known organism.
Another lab project involves a longitudinal study of identical twins: one on land and one spending a year in space. Mason’s team collected data for six months before the astronaut’s flight, is in the middle of collecting 12 months of data from space, and will continue for six months after the twin returns. RNA analysis has already proven interesting. “The expression changes dramatically as soon as you get into space,” says Mason, who is particularly intrigued by the epitranscriptomic changes that his team is tracking.
Speaking of space, Mason is working to launch an Oxford Nanopore sequencer; currently, his collaborators are testing it in zero-gravity simulators here on Earth. He tells Timpson that data from the MinION with the latest chemistry is promising, showing lower error rates and less GC bias than earlier versions. “It’s pretty compelling,” he says.
But Mason is not one to choose a single technology: he encourages scientists to validate data with orthogonal platforms whenever possible. “Every technology has a little bit of a blind spot,” he says.
The wide-ranging interview with Mason also covers synthetic biology, a theory that biologists are where physicists were in the late 18th century, commentary on long-read solutions such as PacBio and 10X, and his goal of engineering microbiomes to make it possible for humans to travel in space or colonize other planets.
PacBio users have been regularly serving up new microbial genome assemblies, and we’re glad to see that they’re using our BluePippin automated DNA size selection instrument to get the best results.
These are just some of the genome announcements published in the last few months:
A pathogen affecting economically important crops, such as melons and gourds, which had not previously been sequenced. Scientists present a draft sequence containing seven contigs and many phage or prophage elements.
Clostridium sporogenes DSM 795T
Researchers published the first whole-genome sequence of this bacterium, a nontoxigenic relative of Clostridium botulinum. The genome was finished into a single contig of about 4 Mb and contains dozens of identical sequence copies greater than 1,000 bases.
A member of a group of sulfur-oxidizing bacteria, Sedimenticola thiotaurini strain SIP-G1 was sequenced and presented as a closed genome assembly. Scientists identified pathways not found in other members of this genus.
Scientists sequenced and annotated Microcystis aeruginosa NIES-2549, a freshwater cyanobacterium. The genome is almost 4.3 Mb and was sequenced to help understand the species’ ability to produce hepatotoxic cyanotoxins, which cause major environmental damage.
Escherichia coli O96:H19
This E. coli strain was responsible for a foodborne outbreak in Milan last year in which the organism’s pathogenicity was far more severe than usual. The published genome sequence is fully closed and allows scientists to study its acquired virulence.
In a BioTechniques paper this month, scientists from The Genome Analysis Centre describe a new method for mate-pair sequencing that saves time and money while decreasing the amount of input DNA required. The method is based on SageELF, which automatically generates 12 contiguous fractions of DNA from a single sample.
Led by Darren Heavens, the authors report that length and quantity of input DNA have been problematic factors in the preparation of long mate-pair (LMP) libraries for next-gen sequencing. To address that issue, they adjusted the sample prep protocol to use SageELF instead of conventional gel-based sizing, and then chose the fraction that best met their target fragment length.
“Using the SageELF streamlines the library construction process, allowing LMP libraries >10 kb to be constructed in under 2 days with <10 µg input material,” the scientists write. “For many genome projects, multiple insert size LMP libraries are required, and the ability to construct up to 12 discretely sized libraries for a combined reagent cost of $1270 compared with the reagent cost of $715 for a single insert size LMP library highlights the potential cost savings.”
The protocol was developed to optimize the Nextera-based long mate-pair kit for library construction. In addition to the initial round of size selection with SageELF, the scientists conduct another sizing step on the BluePippin prior to Illumina sequencing to ensure selection of DNA fragments best suited for the platform.
The protocol pays off by saving time and money in library prep, as well as by reducing the need for larger volumes of input DNA. It also leads to better sequencing results. “Accurately determining the size and span of the inserts for mate pair libraries simplifies the scaffolding problem, enabling the assembly of longer, more precise sequences with fewer non-determined bases (runs of N bases), empowering all subsequent downstream analysis,” the scientists report.
Check out the full paper: “A method to simultaneously construct up to 12 differently sized Illumina Nextera long mate pair libraries with reduced DNA input, time, and cost.”
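To put the quoted reagent costs in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the two figures from the quote ($1270 for up to 12 libraries, $715 for one). The assumption that 12 separately built libraries would cost 12 × $715 is ours, not the authors’, and the variable names are purely illustrative.

```python
# Figures quoted from the paper.
COMBINED_COST_USD = 1270        # up to 12 libraries built together
SINGLE_LIBRARY_COST_USD = 715   # one single-insert-size library
NUM_LIBRARIES = 12

# ASSUMPTION (ours): building the 12 libraries one at a time would
# simply cost 12 times the single-library reagent price.
separate_total_usd = SINGLE_LIBRARY_COST_USD * NUM_LIBRARIES

# Effective reagent cost per library when all 12 are built together.
cost_per_library_usd = COMBINED_COST_USD / NUM_LIBRARIES

# Estimated savings under the assumption above.
estimated_savings_usd = separate_total_usd - COMBINED_COST_USD

print(f"Separate builds: ${separate_total_usd}")
print(f"Combined build:  ${COMBINED_COST_USD}")
print(f"Per library:     ${cost_per_library_usd:.2f}")
print(f"Est. savings:    ${estimated_savings_usd}")
```

Under that assumption, the per-library reagent cost falls from $715 to roughly $106 when all 12 fractions are used.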
And for more on the TGAC team, check out this brief profile.
A team of scientists from the Icahn School of Medicine at Mount Sinai, Weill Cornell Medical College, Cold Spring Harbor Laboratory, European Molecular Biology Laboratory, and other institutions published the first analysis of a diploid human genome produced by combining single-molecule technologies.
Lead authors Matthew Pendleton, Robert Sebra, Andy Pang, and Ajay Ummat, along with their colleagues, report that integrating results from different technology platforms led to significant improvements in contiguity, with scaffold N50 values of nearly 30 Mb. The high-quality assembly also allowed the team to find complex structural variants that can’t be detected in assemblies produced from short-read data.
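For readers less familiar with the N50 statistic mentioned above: it is the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly. A minimal Python sketch follows. The function name and the toy scaffold lengths are made up for illustration and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down, stopping once the running
    # sum reaches at least half of the total assembly length.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Toy scaffold lengths in Mb (hypothetical values).
toy_scaffolds_mb = [40, 30, 20, 10]
toy_n50 = n50(toy_scaffolds_mb)
print(toy_n50)
```

In the toy example, the two longest scaffolds (40 and 30) cover 70 of 100 Mb, so the N50 is 30.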
The scientists used SMRT® Sequencing from Pacific Biosciences as well as genome maps from BioNano Genomics. “Our work shows that it is now possible to integrate single-molecule and high-throughput sequence data to generate de novo assembled genomes that approach reference quality,” they report.
In the study, which sequenced the well-characterized NA12878 genome, scientists used BluePippin to perform size selection prior to SMRT Sequencing. By removing DNA fragments smaller than 7 Kb, the team generated extraordinarily long reads with the PacBio platform. “Without selection, smaller 2000 – 7000 bp molecules dominate the zero-mode waveguide loading distribution, decreasing the sub-readlength” that can be achieved with the sequencer, the authors write in the supplementary materials.
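As a simplified illustration of why removing short fragments matters, the Python sketch below applies a hard 7 kb cutoff to a made-up list of fragment lengths and compares the mean length before and after. Real size selection is a physical separation and does not behave like a perfect threshold. The function name and fragment values are hypothetical, and only the 7 kb cutoff comes from the study.

```python
from statistics import mean


def size_select(fragment_lengths_bp, min_length_bp=7000):
    """Keep only fragments at or above the cutoff (idealized hard threshold)."""
    return [n for n in fragment_lengths_bp if n >= min_length_bp]


# Hypothetical fragment lengths in base pairs, for illustration only.
fragments_bp = [2000, 3500, 5000, 6500, 9000, 15000, 22000]
selected_bp = size_select(fragments_bp)

mean_before_bp = mean(fragments_bp)
mean_after_bp = mean(selected_bp)

print(f"Fragments kept: {len(selected_bp)} of {len(fragments_bp)}")
print(f"Mean length before: {mean_before_bp:.0f} bp")
print(f"Mean length after:  {mean_after_bp:.0f} bp")
```

With the short fragments gone, the remaining pool is dominated by long molecules, which is the effect the authors describe for loading the sequencer.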
For more, check out the full paper here: “Assembly and diploid architecture of an individual human genome via single-molecule technologies.”