This is our official 3.0 release. It coincides with a complete re-curation of the database, which updates all entries to the NCBI taxonomy as of May 2021, and introduces a completely new format for sequence IDs.
The re-curation led to the removal of 202 sequences. Many of these were duplicate strain-level entries from single-cell sequencing projects in the genera Prochlorococcus, Synechococcus, and Microcystis. The Sericytochromatia in the sibling clade group were also substantially reduced, owing to a change in the taxonomic definition of the group.
47 new sequences were added.
From Cydrasil v3 onward, sequence IDs can be parsed easily by automated methods. Each sequence header consists of the source database and the corresponding database ID, followed by the current NCBI taxonomy.
The format is as follows:
CY-sourceName-sourceDatabaseID#g__generaName.s__speciesName.str__strainName
Examples:
-----------------------------------------------------------------------
CY-IMG-641250527#g__Acaryochloris.s__marina.str__MBIC11017
CY-NCBI-AB112435.1#g__Acaryochloris.s__sp.str__Awaji-1
-----------------------------------------------------------------------
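As an illustration of how the new headers can be parsed automatically, here is a minimal Python sketch. It is not part of Cydrasil itself: the names `CydrasilHeader` and `parse_header` are hypothetical, and the regular expression assumes that the source name contains no hyphen and that the database ID contains no `#`, which holds for the two examples above but is not guaranteed by the release notes.

```python
import re
from typing import NamedTuple


class CydrasilHeader(NamedTuple):
    """Fields of a Cydrasil v3 sequence header (hypothetical helper type)."""
    source: str      # source database name, e.g. "IMG" or "NCBI"
    source_id: str   # ID within the source database
    genus: str
    species: str
    strain: str


# Assumed pattern: CY-<source>-<id>#g__<genus>.s__<species>.str__<strain>
# The strain is matched greedily so that it may contain dots or hyphens.
_HEADER_RE = re.compile(
    r"^CY-(?P<source>[^-#]+)-(?P<source_id>[^#]+)"
    r"#g__(?P<genus>.+?)\.s__(?P<species>.+?)\.str__(?P<strain>.+)$"
)


def parse_header(header: str) -> CydrasilHeader:
    """Split a header into its fields; a leading FASTA '>' is tolerated."""
    match = _HEADER_RE.match(header.strip().lstrip(">"))
    if match is None:
        raise ValueError(f"not a Cydrasil v3 header: {header!r}")
    return CydrasilHeader(**match.groupdict())


if __name__ == "__main__":
    for line in (
        "CY-IMG-641250527#g__Acaryochloris.s__marina.str__MBIC11017",
        "CY-NCBI-AB112435.1#g__Acaryochloris.s__sp.str__Awaji-1",
    ):
        print(parse_header(line))
```

Anchoring the taxonomy on the literal `g__`, `.s__`, and `.str__` markers, rather than splitting on every dot or hyphen, keeps IDs such as `AB112435.1` and strain names such as `Awaji-1` intact.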
Finally, phylogenetic reconstruction has moved to RAxML-NG, both to support the new sequence header format and to adopt the current standard.