Targeted Phasing of 2-200 Kilobase DNA Fragments with a Short-Read Sequencer and a Single-Tube Linked-Read Library Method

March 2023
Veronika Mikhaylova, Madison Rzepka, Tetsuya Kawamura, Yu Xia, Peter L. Chang, Shiguo Zhou, Long Pham, Naisarg Modi, Likun Yao, Adrian Perez-Agustin, Sara Pagans, T. Christian Boles, Ming Lei, Yong Wang, Ivan Garcia-Bassets, and Zhoutao Chen

“In the human genome, heterozygous sites are genomic positions with different alleles inherited from each parent. On average, there is a heterozygous site every 1-2 kilobases (kb). Resolving whether two alleles in neighboring heterozygous positions are physically linked (that is, phased) is possible with a short-read sequencer if the sequencing library captures long-range information. TELL-Seq is a library preparation method based on millions of barcoded micro-sized beads that enables instrument-free phasing of a whole human genome in a single PCR tube. TELL-Seq incorporates a unique molecular identifier (barcode) to the short reads generated from the same high-molecular-weight (HMW) DNA fragment (known as ‘linked reads’). However, genome-scale TELL-Seq is not cost-effective for applications focusing on a single locus or a few loci. Here, we present an optimized TELL-Seq protocol that enables the cost-effective phasing of enriched loci (targets) of varying sizes, purity levels, and heterozygosity. Targeted TELL-Seq maximizes linked-read efficiency and library yield while minimizing input requirements, fragment collisions on microbeads, and sequencing burden. To validate the targeted protocol, we phased seven 180-200 kb loci enriched by CRISPR/Cas9-mediated excision coupled with pulsed-field electrophoresis, four 20 kb loci enriched by CRISPR/Cas9-mediated protection from exonuclease digestion, and six 2-13 kb loci amplified by PCR. The selected targets have clinical and research relevance (BRCA1, BRCA2, MLH1, MSH2, MSH6, APC, PMS2, SCN5A-SCN10A, and PIK3CA). These analyses reveal that targeted TELL-Seq provides a reliable way of phasing allelic variants within targets (2-200 kb in length) with the low cost and high accuracy of short-read sequencing. Lynch syndrome (LS), caused by heterozygous pathogenic variants affecting one of the mismatch repair (MMR) genes (MSH2, MLH1, MSH6, PMS2), confers moderate to high risks for colorectal, endometrial, and other cancers.
We describe a four-generation, 13-branched pedigree in which multiple LS branches carry the MSH2 pathogenic variant c.2006G>T (p.Gly669Val), one branch has this and an additional novel MSH6 variant c.3936_4001+8dup (intronic), and other non-LS branches carry variants within other cancer-relevant genes (NBN, MC1R, PTPRJ). Both MSH2 c.2006G>T and MSH6 c.3936_4001+8dup caused aberrant RNA splicing in carriers, including out-of-frame exon-skipping, providing functional evidence of their pathogenicity. MSH2 and MSH6 are co-located on Chr2p21, but the two variants segregated independently (mapped in trans) within the digenic branch, with carriers of either or both variants. Thus, MSH2 c.2006G>T and MSH6 c.3936_4001+8dup independently confer LS with differing cancer risks among family members in the same branch. Carriers of both variants have near 100% risk of transmitting either one to offspring. Nevertheless, a female carrier of both variants did not transmit either to one son, due to a germline recombination within the intervening region. Genetic diagnosis, risk stratification, and counseling for cancer and inheritance were highly individualized in this family. The finding of multiple cancer-associated variants in this pedigree illustrates a need to consider offering multicancer gene panel testing, as opposed to targeted cascade testing, as additional cancer variants may be uncovered in relatives.”

Sage Science Products:
The HLS-CATCH process (SageHLS system) was used to purify the larger (180-200 kb) targets.

Author Affiliations:

Universal Sequencing Technology Corp., Carlsbad, CA
Sage Science Inc., Beverly, MA
Department of Medicine, University of California, San Diego, La Jolla, CA
Department of Medical Sciences, School of Medicine, University of Girona, Girona, Spain
Universal Sequencing Technology Corp., Canton, MA

bioRxiv preprint
DOI: 10.1101/2023.03.05.531179


Innovations in double digest restriction-site associated DNA sequencing (ddRAD-Seq) method for more efficient SNP identification

February 2023

Zenaida V. Magbanua, Chuan-Yu Hsu, Olga Pechanova, Mark Arick II, Corrinne E. Grover, Daniel G. Peterson

“We present an improved ddRAD-Seq protocol for identifying single nucleotide polymorphisms (SNPs). It utilizes selected restriction enzyme digestion fragments, quick acting ligases that are neutral with the restriction enzyme buffer eliminating buffer exchange steps, and adapters designed to be compatible with Illumina index primers. Library amplification and barcoding are completed in one PCR step, and magnetic beads are used to purify the genomic fragments from the ligation and library generation steps. Our protocol increases the efficiency and decreases the time to complete a ddRAD-Seq experiment. To demonstrate its utility, we compared SNPs from our protocol with those from whole genome resequencing data from Gossypium herbaceum and Gossypium arboreum. Principal component analysis demonstrated that the variability of the combined data was explained by the genotype (PC1) and methodology applied (PC2). Phylogenetic analysis showed that the SNPs from our method clustered with SNPs from the resequencing data of the corresponding genotype. Sequence alignments illustrated that for homozygous loci, more than 90% of the SNPs from the resequencing data were discovered by our method. Our analyses suggest that our ddRAD-Seq method is reliable in identifying SNPs suitable for phylogenetic and association genetic studies while reducing cost and time over known methods.”

Sage Science Products:
The SageELF was used to size-select library fractions to 295 and 614 bp.

Author Affiliations:
Institute for Genomics, Biocomputing and Biotechnology, Mississippi State University, MS
Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA

Analytical Biochemistry
DOI: 10.1016/j.ab.2022.115001


Utility of long-read sequencing for All of Us

January 2023

M. Mahmoud, Y. Huang, K. Garimella, P. A. Audano, W. Wan, N. Prasad, R. E. Handsaker, S. Hall, A. Pionzio, M. C. Schatz, M. E. Talkowski, E. E. Eichler, S. E. Levy, F. J. Sedlazeck

“The All of Us (AoU) initiative aims to sequence the genomes of over one million Americans from diverse ethnic backgrounds to improve personalized medical care. In a recent technical pilot, we compared the performance of traditional short-read sequencing with long-read sequencing in a small cohort of samples from the HapMap project and two AoU control samples representing eight datasets. Our analysis revealed substantial differences in the ability of these technologies to accurately sequence complex medically relevant genes, particularly in terms of gene coverage and pathogenic variant identification. We also considered the advantages and challenges of using low coverage sequencing to increase sample numbers in large cohort analysis. Our results show that HiFi reads produced the most accurate results for both small and large variants. Further, we present a cloud-based pipeline to optimize SNV, indel and SV calling at scale for long-reads analysis. These results will lead to widespread improvements across AoU.”

Sage Science Products:
The PippinHT was used to size-select PacBio HiFi libraries with a target range of 15-22 kb.

Author Affiliations:
Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX
Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX
Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA
The Jackson Laboratory for Genomic Medicine, Farmington, CT
Discovery Life Sciences, Huntsville, AL
Department of Genetics, Harvard Medical School, Boston, MA
Department of Computer Science, Johns Hopkins University, Baltimore, MD
Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
Department of Genome Sciences, University of Washington, Seattle, WA
Howard Hughes Medical Institute, University of Washington, Seattle, WA
HudsonAlpha Institute for Biotechnology, Huntsville, AL
Department of Computer Science, Rice University, Houston, TX

bioRxiv preprint
DOI: 10.1101/2023.01.23.525236


Significance of HLA in the development of Graves’ orbitopathy

January 2023

Magdalena Stasiak, Katarzyna Zawadzka-Starczewska, Bogusław Tymoniuk, Bartłomiej Stasiak, Andrzej Lewiński

“Graves’ disease (GD), similarly to most autoimmune diseases, is triggered by environmental factors in genetically predisposed individuals. Particular HLA alleles increase or decrease GD risk. No such correlation was demonstrated for Graves’ orbitopathy (GO) in the Caucasian population. HLA-A, -B, -C, -DQB1 and -DRB1 genotyping was performed using a high-resolution method in a total of 2378 persons, including 70 patients with GO, 91 patients with non-GO GD and 2217 healthy controls, to compare allele frequencies between GO, non-GO and controls. Significant associations between GO and HLA profile were demonstrated, with HLA-A*01:01, -A*32:01, -B*37:01, -B*39:01, -B*42:01, -C*08:02, -C*03:02, -DRB1*03:01, -DRB1*14:01 and -DQB1*02:01 being genetic markers of increased risk of GO, and HLA-C*04:01, -C*03:04, -C*07:02 and -DRB1*15:02 being protective alleles. Moreover, correlations between HLA alleles and increased or decreased risk of non-GO GD, but with no impact on risk of GO development, were revealed. Identification of these groups of GO-related and GO-protective alleles, as well as the alleles strongly related to non-GO GD, constitutes an important step in the development of personalized medicine, with individual risk assessment and patient-tailored treatment.”

Sage Science Products:
The Pippin Prep was used in conjunction with the MIA FORA NGS FLEX HT HLA Typing Kit from Immucor per the manufacturer’s directions.

Author Affiliations:
Polish Mother’s Memorial Hospital – Research Institute, Department of Endocrinology and Metabolic Diseases, Lodz, Poland
Medical University of Lodz, Department of Immunology, Rheumatology and Allergy, Lodz, Poland
Lodz University of Technology, Institute of Information Technology, Lodz, Poland
Medical University of Lodz, Department of Endocrinology and Metabolic Diseases, Lodz, Poland

BMC Research Notes; Genes and Immunity
DOI: 10.1038/s41435-023-00193-z


Proteomic analysis of sialoliths from calcified, lipid and mixed groups as a source of potential biomarkers of deposit formation in the salivary glands

January 2023

Natalia Musiał, Aleksandra Bogucka, Dmitry Tretiakow, Andrzej Skorek, Jacek Ryl, Paulina Czaplewska

“Salivary stones, also known as sialoliths, are formed in a pathological situation in the salivary glands. So far, neither the mechanism of their formation nor the factors predisposing to their formation are known despite several hypotheses. While they do not directly threaten human life, they significantly deteriorate the patient’s quality of life. Although this is not a typical research material, attempts are made to apply various analytical tools to characterise sialoliths and search for the biomarkers in their proteomes. In this work, we used mass spectrometry and SWATH-MS qualitative and quantitative analysis to investigate the composition and select proteins that may contribute to solid deposits in the salivary glands. Twenty sialoliths, previously characterized spectroscopically and divided into the following groups: calcified (CAL), lipid (LIP) and mixed (MIX), were used for the study. Proteins unique for each of the groups were found, including: for the CAL group among them, e.g. proteins from the S100 group (S100 A8/A12 and P), mucin 7 (MUC7), keratins (KRT1/2/4/5/13), elastase (ELANE) or stomatin (STOM); proteins for the LIP group – transthyretin (TTR), lactotransferrin (LTF), matrix Gla protein (MPG), submandibular gland androgen-regulated protein 3 (SMR3A); mixed stones had the fewest unique proteins. Bacterial proteins present in sialoliths have also been identified. The analysis of the results indicates the possible role of bacterial infections, disturbances in calcium metabolism and neutrophil extracellular traps (NETs) in the formation of sialoliths.”

Sage Science Products:
The SageELF (3% SDS-agarose gels) was used for protein separation (10-300 kDa), upstream of mass spectrometry.

Author Affiliations:
University of Gdańsk, Poland
Medical University of Gdańsk, Poland
Gdańsk University of Technology, Gdańsk, Poland

Research Square preprint (under review, BMC Clinical Proteomics)
DOI: 10.21203/
