ARCHE: AN ADVANCED FLEXIBLE TOOL FOR HIGH-THROUGHPUT ANNOTATION OF FUNCTIONS ON MICROBIAL CONTIGS

Daniel Gonzalo Alonso-Reyes; Virginia Helena Albarracín

doi:10.1101/2022.11.28.518280

SUMMARY

The growing amount of genomic data has created a need for functional annotators that are less demanding and more user-friendly. At present, it is hard to find a single pipeline that annotates multiple kinds of functional data, such as Enzyme Commission (EC) numbers, orthology identifiers (KO and eggNOG), protein names, gene names, alternative names, and descriptions. In this work, we provide a new solution that combines different algorithms (BLAST, DIAMOND, HMMER3) and databases (UniProtKB, KOfam, NCBIfams, TIGRFAMs, and Pfam) and also overcomes data download challenges. The resulting software framework, Arche, produced competitive results on the Escherichia coli K12 genome when compared to other annotators. Finally, Arche provides an analysis pipeline that can accommodate advanced tools in a unique order, which gives it several advantages over other commonly used annotators.

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

Updated software
https://github.com/gundizalv/Arche

The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.