Assembly Arena: Benchmarking RNA isoform reconstruction algorithms for nanopore sequencing

Mélanie Sagniez; Anshul Budhraja; Bastien Paré; Shawn M. Simpson; Clément Vinet-Ouellette; Marieke Rozendaal; Martin A. Smith

doi:10.1101/2024.03.21.586080

Abstract

Resolving the transcriptomes of higher eukaryotes has become more tractable with the advent of long-read sequencing, which greatly facilitates the identification of new transcripts and their splicing isoforms. However, the computational analysis of long-read RNA sequencing data remains challenging because it is difficult to disentangle technical artifacts from bona fide biological information. To address this, we evaluated how accurately multiple leading transcriptome assembly algorithms reconstruct RNA transcript isoforms. We focused specifically on deep nanopore sequencing of synthetic RNA spike-in controls (Sequins™ and SIRVs) across different chemistries, including cDNA and direct RNA protocols. Our systematic comparative benchmarking exposes the strengths and limitations of the surveyed strategies. We also highlight conceptual and technical challenges in annotating transcriptomes and in formalizing assembly quality metrics. Our results complement similar recent endeavors, helping to forge a path towards a gold-standard analytical pipeline for long-read transcriptome assembly.

Competing Interest Statement

Mélanie Sagniez, Anshul Budhraja, and Martin A. Smith have received financial support for travel to conferences from Oxford Nanopore Technologies. Martin A. Smith has received free research consumables from Oxford Nanopore Technologies, which was not involved in the study design or the interpretation of results.

The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.