A properly sized library improves the performance of short-read sequencers such as Illumina and Ion Torrent. Optimal results (more useful reads per flow cell, higher-quality assemblies) are achieved when uniformly sized fragments are used as templates for cluster generation or amplification. For long-read approaches such as PacBio, automated size selection can be used to increase read length. These NGS libraries contribute to structural bioinformatics in key ways:
Indels – In paired-end sequencing, a narrow and uniform library size allows more accurate alignment and facilitates identification of structural variants. Automated size selection enables reproducible production of high-quality paired-end libraries.
Structural and chromosomal rearrangement – Complex rearrangements are best studied with long-read techniques to cover large regions of repeating sequence. For mate-pair and synthetic long-read sequencing, accurate sizing and narrow/uniform size distributions also improve bioinformatic analysis. For long-read platforms such as PacBio, automated size selection can be used to filter the smaller fragments from a sheared sample, and ensure that only the larger fragments are sequenced. This can significantly increase the average and maximum read lengths that can be achieved by these systems.
miRNA – Automated size selection is routinely used to separate miRNA libraries from unligated adapters and larger artifacts, saving time and effort and eliminating unwanted reads.
Isoform detection – The Iso-Seq method from PacBio benefits from collecting multiple size fractions from total RNA libraries to reduce sample complexity for rare isoform/variant detection.
ChIP – ChIP studies benefit from the flexibility provided by automated size selection. Users can accurately select narrow or broad ranges of fragments, depending on the factor being analyzed.
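As a rough illustration of the long-read case described above, the sketch below shows why removing the smaller fragments from a sheared sample raises the average length of what remains. The function name `size_select`, the 5,000 bp cutoff, and the fragment lengths are all hypothetical values chosen for the example; they do not describe any instrument's software or a real sample.

```python
from statistics import mean


def size_select(fragment_lengths, min_bp, max_bp=None):
    """Keep fragments whose length (in bp) lies within [min_bp, max_bp].

    max_bp=None means no upper limit (a high-pass selection).
    """
    return [
        length
        for length in fragment_lengths
        if length >= min_bp and (max_bp is None or length <= max_bp)
    ]


# Hypothetical fragment lengths (bp) from a sheared sample.
sheared = [800, 1500, 3000, 6000, 9000, 12000, 20000]

# High-pass selection: discard everything shorter than 5,000 bp (assumed cutoff).
selected = size_select(sheared, min_bp=5000)

print("mean before selection:", mean(sheared))
print("mean after selection: ", mean(selected))
```

The same helper with both bounds set, for example `size_select(sheared, 1000, 5000)`, models the narrow-range selection used for short-read libraries.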