Abstract
Resolving the transcriptomes of higher eukaryotes has become more tractable with the advent of long-read sequencing, which greatly facilitates the identification of new transcripts and their splicing isoforms. However, the computational analysis of long-read RNA sequencing data remains challenging, as it is difficult to disentangle technical artifacts from bona fide biological information. To address this, we evaluated the ability of multiple leading transcriptome assembly algorithms to accurately reconstruct RNA transcript isoforms. We focused specifically on deep nanopore sequencing of synthetic RNA spike-in controls (Sequins™ and SIRVs) across different chemistries, including cDNA and direct RNA protocols. Our systematic comparative benchmarking exposes the strengths and limitations of the surveyed strategies. We also highlight conceptual and technical challenges in annotating transcriptomes and formalizing assembly quality metrics. Our results complement similar recent endeavors, helping forge a path towards a gold-standard analytical pipeline for long-read transcriptome assembly.
Competing Interest Statement
Melanie Sagniez, Anshul Budhraja & Martin A Smith have received financial support for travel to conferences from Oxford Nanopore Technologies. Martin A Smith has received free research consumables from Oxford Nanopore Technologies, who were not involved in the study design or the interpretation of results.