January 2023
Authors:
M. Mahmoud, Y. Huang, K. Garimella, P. A. Audano, W. Wan, N. Prasad, R. E. Handsaker, S. Hall, A. Pionzio, M. C. Schatz, M. E. Talkowski, E. E. Eichler, S. E. Levy, F. J. Sedlazeck
Abstract:
“The All of Us (AoU) initiative aims to sequence the genomes of over one million Americans from diverse ethnic backgrounds to improve personalized medical care. In a recent technical pilot, we compared the performance of traditional short-read sequencing with long-read sequencing in a small cohort of samples from the HapMap project and two AoU control samples representing eight datasets. Our analysis revealed substantial differences in the ability of these technologies to accurately sequence complex medically relevant genes, particularly in terms of gene coverage and pathogenic variant identification. We also considered the advantages and challenges of using low coverage sequencing to increase sample numbers in large cohort analysis. Our results show that HiFi reads produced the most accurate results for both small and large variants. Further, we present a cloud-based pipeline to optimize SNV, indel and SV calling at scale for long-reads analysis. These results will lead to widespread improvements across AoU.”
Sage Science Products:
The PippinHT was used to size-select PacBio HiFi libraries, with a target range of 15-22 kb.
Author Affiliations:
Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX
Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX
Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA
The Jackson Laboratory for Genomic Medicine, Farmington, CT
Discovery Life Sciences, Huntsville, AL
Department of Genetics, Harvard Medical School, Boston, MA
Department of Computer Science, Johns Hopkins University, Baltimore, MD
Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
Department of Genome Sciences, University of Washington, Seattle, WA
Howard Hughes Medical Institute, University of Washington, Seattle, WA
HudsonAlpha Institute for Biotechnology, Huntsville, AL
Department of Computer Science, Rice University, Houston, TX
bioRxiv preprint
DOI: 10.1101/2023.01.23.525236