ncbiTAXONparser is a command-line tool written in Python 3 to translate between names, IDs, and hierarchical ranks of the NCBI taxonomy. The package provides tools to wrangle the taxonomy, which is a common task in comparative genomics and phylogenetics. For example, it can retrieve the most recent common ancestor (MRCA) according to NCBI for a list of identifiers, or get all taxonomic identifiers that are part of a particular tribe.
It provides the same operations as most currently available tools but connects them in a different way to improve usability. New functionality includes: getting all lower taxon IDs for a provided taxon ID, getting the most recent common ancestor (MRCA) of a set of taxon IDs, and checking whether a given taxon ID is part of a provided MRCA. Nine main functions are provided to get taxonomic information; some internal functions are adapted from ncbitax2lin to provide new functionality.
taxid_is_valid
: Checks if the input taxon ID is known by NCBI.
get_name_from_id
: Finds the scientific name for a given ID.
get_id_from_name
: Gets the taxon ID for a given name.
get_id_from_synonym
: Checks if the provided name is a synonym and returns the valid taxon ID. Also used internally by \texttt{get_id_from_name}.
get_rank
: Provides the rank of the given taxon ID.
get_mrca
: Finds the most recent common ancestor of a set of taxon IDs.
get_downtorank_id
: Provides the taxon ID of a higher rank, based on the input taxon ID and rank name.
get_lower_from_id
: Finds all lower taxon IDs for a given taxon ID. This can take a long time, as it needs to go through a large part of the taxonomy files.
match_id_to_mrca
: Checks if a given taxon ID belongs to a given MRCA.
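To make the lineage-based operations above more concrete, here is a minimal, self-contained sketch of the underlying ideas on a toy taxonomy. It is not the package's implementation and does not use its API: the helper names (`lineage`, `mrca`, `lower_ids`, `belongs_to`, `up_to_rank`), the `TOY_NODES` table, and all taxon IDs in it are invented for illustration. The only detail borrowed from NCBI is that the root node lists itself as its parent.

```python
# Toy taxonomy: taxon ID -> (parent ID, rank).
# All IDs are invented for illustration; they are NOT real NCBI taxon IDs.
TOY_NODES = {
    1: (1, "no rank"),   # the root is its own parent, as in the NCBI node table
    10: (1, "family"),
    20: (10, "genus"),
    21: (10, "genus"),
    30: (20, "species"),
    31: (20, "species"),
    32: (21, "species"),
}


def lineage(taxid, nodes=TOY_NODES):
    """Return the path from taxid up to the root, starting with taxid itself."""
    path = [taxid]
    while nodes[path[-1]][0] != path[-1]:
        path.append(nodes[path[-1]][0])
    return path


def mrca(taxids, nodes=TOY_NODES):
    """Return the deepest node shared by the lineages of all given IDs."""
    taxids = list(taxids)
    first = lineage(taxids[0], nodes)
    shared = set(first)
    for taxid in taxids[1:]:
        shared &= set(lineage(taxid, nodes))
    # 'first' runs from leaf to root, so its first shared entry is the deepest.
    return next(node for node in first if node in shared)


def lower_ids(taxid, nodes=TOY_NODES):
    """Return all IDs below taxid. Scans every node, hence slow on real data."""
    return sorted(t for t in nodes if t != taxid and taxid in lineage(t, nodes))


def belongs_to(taxid, ancestor, nodes=TOY_NODES):
    """Check whether taxid lies at or below the given ancestor."""
    return ancestor in lineage(taxid, nodes)


def up_to_rank(taxid, rank, nodes=TOY_NODES):
    """Walk up from taxid and return the first node with the requested rank."""
    for node in lineage(taxid, nodes):
        if nodes[node][1] == rank:
            return node
    return None


if __name__ == "__main__":
    print(mrca([30, 31]))           # two species of the same toy genus
    print(mrca([30, 32]))           # species from two different toy genera
    print(lower_ids(10))            # everything below the toy family
    print(belongs_to(32, 20))       # is species 32 part of genus 20?
    print(up_to_rank(30, "family")) # family-level ID for species 30
```

The full scan in `lower_ids` hints at why collecting all lower taxon IDs is the slow operation: every node has to be checked against the queried ID.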
To install, run from within the main folder:
python setup.py install
pip install -r requirements.txt
There is an example.py file in the main folder, which can easily be adapted for other purposes.
See the PhylUp package for an example of how to use it.
To check whether the installation was successful and to initialize the download of the taxonomy files from NCBI, a US government website, please run:
python3 ./tests/tests_setup.py
# this will download the taxonomy files from ncbi, a US government website.
pytest tests/test_*
Thanks to Emily Jane McTavish, who hired me as a postdoc during part of the development, and to NSF ABI grant #1759846 for funding.