Citations

Enriching for Answers in Rare Diseases

October 2025

Authors:
Yilei Fu, Adam C. English, Luis F. Paulin, Shalini N Jhangiani, George Weissenberger, Vanessa Vee, Yi Han, Heer H. Mehta, Donna M. Muzny, Richard A. Gibbs, Jennifer E. Posey, Daniel G. Calame, Fritz J. Sedlazeck

Abstract:
We present Trio-barcoded ONT Adaptive Sampling (TBAS), a cost-efficient long-read sequencing strategy combining sample barcoding and adaptive enrichment to sequence rare disease trios on a single PromethION flow cell. TBAS achieved near-complete variant phasing and detection of small variants, structural variants, and tandem repeats with high accuracy and 77% potential solve rate. This scalable approach retains methylation data and enables clinically relevant, phenotype-guided long-read diagnostics at a fraction of current costs.

Sage Science Products:
PippinHT High-Pass size selection (>6kb) for ONT library prep.

Methods Excerpt:
“Genomic DNA was diluted to 30 ng/µl. Starting with 1500 ng for each sample, DNA was sheared using g-tubes (Covaris 520079) to achieve an average size of ~10 kb. The sheared DNA was size-selected on the PippinHT instrument (Sage Science) using the 6-10 kb High-Pass definition with a minimum size selection threshold of 6 kb. Libraries for ONT were prepared using the Native Barcoding Kit 96 V14 (SQK-NBD114.96) wherein each trio was barcoded and pooled into 1 library. Libraries were sequenced on the ONT PromethION 24 device using R10.4.1 flow cells with the adaptive sampling option enabled.”

Author Affiliations:
Human Genome Sequencing Center, Baylor College of Medicine

Department of Molecular and Human Genetics, Baylor College of Medicine

Departments of Pediatrics and Medicine, Columbia University Vagelos College of Physicians and Surgeons

Section of Pediatric Neurology and Developmental Neurosciences, Department of Pediatrics, Baylor College of Medicine

Texas Children’s Hospital

Department of Computer Science, Rice University

medRxiv preprint
DOI: 10.1101/2025.10.21.25338483


Phased genome assemblies and pangenome graphs of human populations of Japan and Saudi Arabia

August 2025

Authors:
Maxat Kulmanov, Saeideh Ashouri, Yang Liu, Marwa Abdelhakim, Ebtehal Alsolme, Masao Nagasaki, Yasuyuki Ohkawa, Yutaka Suzuki, Rund Tawfiq, Katsushi Tokunaga, Toshiaki Katayama, Malak S. Abedalthagafi, Robert Hoehndorf & Yosuke Kawai

Abstract:
The selection of a reference sequence in genome analysis is critical, as it serves as the foundation for all downstream analyses. Recently, the pangenome graph has been proposed as a data model that incorporates haplotypes from multiple individuals. Here we present JaSaPaGe, a pangenome graph reference for Saudi Arabian and Japanese populations, both of which have been significantly underrepresented in previous genomic studies. We constructed JaSaPaGe from high-quality phased diploid assemblies which were made utilizing PacBio high-fidelity long reads, Nanopore long reads, and Hi-C short reads of 9 Saudi and 10 Japanese individuals. Quality evaluation of the pangenome graph by variant calling showed that our pangenome outperformed earlier linear reference genomes (GRCh38 and T2T-CHM13) and showed comparable performance to the pangenome graph provided by the Human Pangenome Reference Consortium (HPRC), with more variants found in Japanese and Saudi samples using their population-specific pangenomes. This pangenome reference will serve as a valuable resource for both the research and clinical communities in Japan and Saudi Arabia.

Sage Science Products:
PippinHT size selection was used for PacBio HiFi library prep.

Methods Excerpt:
“Library preparation was performed using the SMRTbell prep kit 3.0, following the manufacturer’s protocol. For final library size selection, a PippinHT System with 0.75% agarose gel cassettes and marker S1 was used. The cut-off size range was set to 10–50 Kbp as recommended by the manufacturer. Subsequently, library QC was performed using FEMTO Pulse and Qubit 1x dsDNA HS kit. The prepared sequencing templates were loaded onto the Sequel IIe system using Binding kit 3.2 and cleanup beads (>3 Kbp). A 30-hour movie run was performed, generating high-fidelity (Hi-Fi) reads.”

Author Affiliations:
Computer, Electrical and Mathematical Sciences & Engineering (CEMSE) Division, King Abdullah University of Science and Technology

KAUST Center of Excellence for Smart Health (KCSH), King Abdullah University of Science and Technology

KAUST Center of Excellence for Generative AI, King Abdullah University of Science and Technology

SDAIA–KAUST Center of Excellence in Data Science and Artificial Intelligence, King Abdullah University of Science and Technology

Genome Medical Science Project, National Institute of Global Health and Medicine, Japan Institute for Health Security

Biological and Environmental Sciences & Engineering (BESE) Division, King Abdullah University of Science and Technology

Genomics and Precision Medicine Department, King Fahad Medical City, Saudi Arabia

Division of Biomedical Information Analysis, Medical Research Center for High Depth Omics, Medical Institute of Bioregulation, Kyushu University

Center for Genomic Medicine, Graduate School of Medicine, Kyoto University

Division of Transcriptomics, Medical Institute of Bioregulation, Kyushu University

Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo

Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems

Department of Pathology and Laboratory Medicine, Tufts Medical Center and Tufts University School of Medicine

Department of Neurosurgery, Tufts Medical Center and Tufts University School of Medicine

King Salman Center for Disability Research, Riyadh, Saudi Arabia

Bioinformation and DDBJ Center, National Institute of Genetics, Research Organization of Information and Systems, Japan

Scientific Data
DOI: 10.1038/s41597-025-05652-y


Genome-wide comparative analysis of variability and population structure between autochthonous Turkish chicken breeds and commercial hybrid lines

July 2025

Authors:
Eymen Demir, Bahar Argun Karsli, Demir Özdemir, Umit Bilginer, Huriye Doğru, Sarp Kaya, Veli Atmaca, Nimet Tufan, Ebru Demir, Taki Karsli

Abstract:
Next-generation sequencing (NGS) technologies have revolutionized livestock genomics by enabling rapid, high-resolution genotyping of local populations with thousands of single nucleotide polymorphisms (SNPs), offering unprecedented accuracy and cost efficiency. This study presents the first comprehensive genomic assessment of the Denizli (DNZ) and Gerze (GRZ) chicken breeds, comparing them to commercial broiler and layer hybrid lines using the double digest restriction-site associated DNA sequencing (ddRADseq) technique. A total of 94,208 bi-allelic SNPs were common between DNZ and GRZ, while 33,284 SNPs were retained among all populations after the quality filtering process. Genetic diversity parameters were higher in native Turkish chicken breeds compared to hybrid lines in which minor allele frequency (MAF) was higher than 0.3 in DNZ and GRZ while it was lower than this value in commercial hybrid lines. Notably, DNZ displayed the highest observed (0.386) and expected (0.375) heterozygosity, whereas the broiler hybrid line showed the lowest heterozygosity (0.254), suggesting inbreeding depression (FIS = 0.241). The negative inbreeding coefficient values occurring due to random mating were observed in DNZ and GRZ chicken breeds, while this value was estimated at 0.118 in the layer hybrid line. Population structure analyses such as principal component analyses (PCA), genetic distance-based neighbor-joining (NJ) tree, ADMIXTURE, and TreeMix algorithm revealed that DNZ and GRZ were genetically distinct from both each other and commercial hybrid lines. The results of this study confirm that comprehensive conservation strategies are efficient approaches to keeping genetic variability at an optimal level without inbreeding. Moreover, this study demonstrates the efficacy of ddRADseq in generating high-throughput genotypic data, providing a cost-effective framework for genomic diversity and population structure studies in indigenous chicken breeds.
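The negative inbreeding coefficient reported for DNZ follows from its heterozygosity values. As a rough check, the sketch below applies Wright's standard definition, F_IS = 1 - Ho/He, to the observed (0.386) and expected (0.375) heterozygosity quoted in the abstract. The function name is hypothetical, and the authors may have computed F_IS differently (for example, averaged per locus), so this is an illustration rather than their method.

```python
def inbreeding_coefficient(h_obs: float, h_exp: float) -> float:
    """Wright's F_IS = 1 - Ho/He (textbook definition, assumed here)."""
    if h_exp <= 0:
        raise ValueError("expected heterozygosity must be positive")
    return 1.0 - h_obs / h_exp


# DNZ values as quoted in the abstract: Ho = 0.386, He = 0.375
dnz_fis = inbreeding_coefficient(0.386, 0.375)
print(f"DNZ F_IS = {dnz_fis:.3f}")
```

Because observed heterozygosity exceeds expected heterozygosity, the result is slightly below zero, which agrees with the abstract's statement that DNZ shows a negative inbreeding coefficient.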

Sage Science Products:
Pippin Prep size selection was used for ddRADseq libraries.

Methods Excerpt:
“After the ligation step, barcoded samples belonging to each sub-library were pooled in a single tube and cleaned with microbeads. For size selection, 30 µl (3000 ng DNA) from each sublibrary + 10 µl loading dye were loaded into each sample well of the 2 % agarose gel cassette. The size selection step was performed in the Pippin prep (Sage Science) instrument with a fragment range of 300-500 bp. After running for 79 min, size-selected sub-library samples were obtained in the elution modules of the cassettes. After the pippin prep step, PCR was performed to enrich each sublibrary and to attach 5 indexes suitable for the Illumina platform to each sublibrary. In this study, we pooled 120 samples into 3 different genomic libraries (each containing 40 samples) to increase depth coverage in the sequencing process. The enriched and cleaned DNA libraries were sequenced by the Illumina NovaSeq 6000 platform to obtain raw paired-end reads (2 × 150 bp)”

Author Affiliations:
Department of Animal Science, Faculty of Agriculture, Akdeniz University, Antalya, Republic of Türkiye

Department of Agricultural Biotechnology, Faculty of Agriculture, Eskişehir Osmangazi University

Department of Agricultural Biotechnology, Faculty of Agriculture, Akdeniz University, Republic of Türkiye

Department of Animal Science, Michigan State University

Department of Medical Services and Techniques, Vocational School of Burdur Health Services, Burdur Mehmet Akif Ersoy University, Republic of Türkiye

Department of Animal Science, Faculty of Agriculture, Eskişehir Osmangazi University

Science Direct
DOI: 10.1016/j.psj.2025.105193


Improved circulating tumor DNA profiling by simultaneous extraction of DNA methylation and copy number information from Methylated DNA Sequencing data (MeD-seq)

May 2025

Authors:
Daan M Hazelaar, Ruben G Boers, Joachim B Boers, Vanja de Weerd, Jean Helmijr, Maurice PHM Jansen, Henk MW Verheul, Cornelis Verhoef, Joost Gribnau, John WM Martens, Stavros Makrodimitris, Saskia M Wilting

Abstract:
Cell-free DNA (cfDNA) analysis offers a powerful, minimally invasive approach to improve cancer care by measuring tumor-specific genomic and epigenetic alterations. Here, we demonstrate the versatility of MeD-seq, a methylation-dependent sequencing assay, for comprehensive cfDNA analysis, including DNA methylation profiling, chromosomal copy number (CN) alterations, and tumor fraction (TF) estimation. MeD-seq-derived CN profiles and TF estimates from 38 colorectal cancer with liver metastases (CRLM) and 5 ovarian cancer patients were highly comparable to shallow whole-genome sequencing (sWGS), validating our approach. For 120 CRLM patients we used MeD-seq CN and TF information in an improved Differential Methylation Model which detected additional significantly Differentially Methylated Regions (DMRs) correlating with TF estimates. Using the identified DMR sets we were subsequently able to distinguish healthy blood donors from CRLM patients with low amounts of circulating tumor DNA (ctDNA) as well. These findings show MeD-seq as an affordable platform for detecting cancer-specific signals directly from plasma without prior tissue-based information. Future work could expand its application to other cancer types, solidifying MeD-seq as a versatile tool for cfDNA profiling.

Sage Science Products:
PippinHT was used for isolating MeD-seq libraries.

Methods Excerpt:
“Samples were prepared for sequencing using the ThruPLEX DNA-seq 96D kit and the ThruPLEX DNA-Seq HV kit (Rubicon Genomics, Takara Bio Europe), for CRLM and ovarian cancer samples respectively, and purified on a Pippin HT system with 3% agarose gel cassettes (Sage Science, Beverly, MA) enriching for fragments ranging from 148 to 192bp (including sequencing adapters), which was shown to enrich for tumor-derived DNA fragments (Mouliere et al., 2018).”

Author Affiliations:
Department of Medical Oncology, Erasmus MC Cancer Institute, University Hospital Rotterdam, The Netherlands

Department of Developmental Biology, Rotterdam, The Netherlands

Department of Oncological and Gastrointestinal Surgery, Erasmus MC Cancer Institute, University Hospital Rotterdam, The Netherlands

BioRxiv preprint
DOI: 10.1101/2025.01.21.633371


Chasing non-existent “microRNAs” in cancer

April 2025

Authors:
Ayla Orang, Nicholas I. Warnock, Melodie Migault, B. Kate Dredge, Andrew G. Bert, Julie M. Bracken, Philip A. Gregory, Katherine A. Pillman, Gregory J. Goodall & Cameron P. Bracken

Abstract:
“MicroRNAs (miRNAs) are important regulators of gene expression whose dysregulation is widely linked to tumourigenesis, tumour progression and Epithelial-Mesenchymal Transition (EMT), a developmental process that promotes metastasis when inappropriately activated. However, controversy has emerged regarding how many functional miRNAs are encoded in the genome, and to what extent non-regulatory products of RNA degradation have been mis-identified as miRNAs. Central to miRNA function is their capacity to associate with an Argonaute (AGO) protein and form an RNA-Induced Silencing Complex (RISC), which mediates target mRNA suppression. We report that numerous “miRNAs” previously reported in EMT and cancer contexts, are not incorporated into RISC and are not capable of endogenously silencing target genes, despite the fact that hundreds of publications in the cancer field describe their roles. Apparent function can be driven through the expression of artificial miRNA mimics which is not necessarily reflective of any endogenous gene regulatory function. We present biochemical and bioinformatic criteria that can be used to distinguish functional miRNAs from mistakenly annotated RNA fragments.”

Sage Science Products:
Pippin Prep was used to size select RNA libraries.

Methods Excerpt:
“Amplified barcoded libraries were then size selected using auto gel-purification Pippin prep 3% agarose (SAGE science) which targets ranges from 100-250 bp. Libraries (size between 180-190 bp) were then confirmed by Qubit HS DNA and Bioanalyzer HS DNA assay for size and concentration. The libraries were then pooled together in equimolar amounts and sequenced using an Illumina Nextseq 500 using a 1 × 75 cycle high output kit.”

Author Affiliations:
Centre for Cancer Biology, an alliance of SA Pathology and University of South Australia, Adelaide, South Australia, Australia

ACRF Cancer Genomics Facility, Centre for Cancer Biology, SA Pathology, Adelaide, South Australia, Australia

Adelaide Centre for Epigenetics, School of Biomedicine, Faculty of Health and Medical Sciences, University of Adelaide, Adelaide, South Australia, Australia

School of Medicine, Discipline of Medicine, University of Adelaide, Adelaide, South Australia, Australia

School of Biological Sciences, Faculty of Sciences, University of Adelaide, Adelaide, South Australia, Australia

Nature Oncogenesis
DOI: 10.1038/s41389-025-00550-9
