Abstract
In this work, we extend vcfdist to be the first variant call benchmarking tool to jointly evaluate phased single-nucleotide polymorphisms (SNPs), small insertions/deletions (INDELs), and structural variants (SVs) for the whole genome. First, we find that a joint evaluation of small and structural variants uniformly reduces measured errors for SNPs (−28.9%), INDELs (−19.3%), and SVs (−52.4%) across three datasets. Next, we correct a common flaw in phasing evaluations, reducing measured flip errors by over 50%. Lastly, we show that vcfdist is more accurate than previously published works and on par with the newest approaches, but with improved result interpretability.
Competing Interest Statement
J. M. H. is employed by and holds stock in PacBio.
Footnotes
1) Updated comparisons: now with the most recent versions of vcfdist and Truvari, and taking Adam English's suggestions into account
2) Streamlined Results section: simplified some Figures, and moved a few less important ones to "Supplementary Information"
3) Added HLA-DQB1 gene example: added a complex variant example which highlights key differences between variant comparison tools
4) Shortened abstract: condensed the original abstract to meet a 100 word limit