In _sv_taxonomy.csv_, what do the underscores in _want_rank_ mean? For example, lines 6-8 are

```
combinedsv___25372,family,1300,Streptococcaceae,family,1.0
combinedsv___25372,family_,1300,Streptococcaceae,family,1.0
combinedsv___25372,family__,1300,Streptococcaceae,family,1.0
```

Is there a difference between family, family_, and family__, or are these simply to make each row unique?

Created by ThomasJi
Hi Thomas,

These come from a quirk of the NCBI taxonomy, which is now simpler and more regular than it was before 2021. Before 2021, the NCBI taxonomy had multiple intermediate ranks between the traditional taxonomic ranks (e.g. "Family Group" between Order and Family), and these used irregular terminology. MaLiAmPi partially normalizes them with trailing underscores (e.g. 'family_' is between family and genus). These rows can largely be ignored and are provided for completeness: one can simply select rows whose 'rank' or 'want_rank' is a standard taxonomic rank such as 'family' or 'genus'.

Kind regards,
Jim, on behalf of the Challenge organizers
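For anyone who wants to drop the intermediate ranks in practice, here is a minimal sketch in Python. It assumes that the underscore-suffixed values appear in the 'want_rank' column, as in the rows quoted in the question. The header names other than 'want_rank' and 'rank' are placeholders invented for this illustration, and the helper `standard_rank_rows` is hypothetical, so check both against the real header of sv_taxonomy.csv.

```python
import csv
import io

# Rows copied from the question. The header line is an assumption:
# only 'want_rank' and 'rank' are named in this thread; the other
# column names are placeholders for illustration.
SAMPLE = """sv,want_rank,tax_id,tax_name,rank,likelihood
combinedsv___25372,family,1300,Streptococcaceae,family,1.0
combinedsv___25372,family_,1300,Streptococcaceae,family,1.0
combinedsv___25372,family__,1300,Streptococcaceae,family,1.0
"""


def standard_rank_rows(handle, rank_column="want_rank"):
    """Return only the rows whose rank has no trailing underscore.

    Trailing underscores mark the normalized intermediate ranks
    (e.g. 'family_'), so rows with them are skipped.
    """
    reader = csv.DictReader(handle)
    return [row for row in reader if not row[rank_column].endswith("_")]


rows = standard_rank_rows(io.StringIO(SAMPLE))
for row in rows:
    print(row["sv"], row["want_rank"], row["tax_name"])
```

To run this on the real file, pass an open file handle (for example `open("sv_taxonomy.csv", newline="")`) in place of the `io.StringIO` sample.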

