mkandziora/ncbiTAXONparser

ncbiTAXONparser

ncbiTAXONparser is a command-line tool written in Python 3 that translates between names, IDs, and hierarchical ranks in the NCBI taxonomy. The package provides tools for wrangling the taxonomy, which is a common task in comparative genomics and phylogenetics. For example, it can retrieve the most recent common ancestor (MRCA) according to NCBI for a list of identifiers, or get all taxonomic identifiers that belong to a particular tribe.

It provides the same operations as most currently available tools but connects them in a different way to improve usability. New functionality includes getting all lower taxon IDs for a provided taxon ID, getting the most recent common ancestor of a set of taxon IDs, and checking whether a given taxon ID is part of a provided MRCA. Nine main functions are provided to get taxonomic information; some internal functions are adapted from ncbitax2lin to provide new functionality.

  • taxid_is_valid: Checks if input taxon ID is known by ncbi.
  • get_name_from_id: Find the scientific name for a given ID.
  • get_id_from_name: Get the taxon ID for a given name.
  • get_id_from_synonym: Checks if the provided name is a synonym and returns the valid taxon ID. Also used internally by get_id_from_name.
  • get_rank: Provides the rank of the given taxon ID.
  • get_mrca: Finds the most recent common ancestor of a set of taxon IDs.
  • get_downtorank_id: Provides the taxon ID of a higher rank, based on the input taxon ID and rank name.
  • get_lower_from_id: Finds all lower taxon IDs for a given taxon ID. This can take a long time, as it needs to go through a large part of the taxonomy files.
  • match_id_to_mrca: Checks if a given taxon ID belongs to a given MRCA.
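
To make the tree operations behind several of these functions concrete, here is a minimal, self-contained sketch on a hand-written toy taxonomy. It does not call ncbiTAXONparser and is not the package's implementation: the table TOY_TAXONOMY and the helpers lineage, mrca, id_at_rank, lower_ids and belongs_to are hypothetical names introduced only for illustration. The taxon IDs are chosen to resemble NCBI entries, but the lineage is heavily simplified, and details such as whether a taxon counts as its own lower ID are assumptions.

```python
# Toy taxonomy (hypothetical, simplified): taxon ID -> (parent ID, rank, name).
# The root is its own parent, which marks the top of every lineage.
TOY_TAXONOMY = {
    1: (1, "no rank", "root"),
    9604: (1, "family", "Hominidae"),
    9605: (9604, "genus", "Homo"),
    9606: (9605, "species", "Homo sapiens"),
    9596: (9604, "genus", "Pan"),
    9597: (9596, "species", "Pan paniscus"),
    9598: (9596, "species", "Pan troglodytes"),
}


def lineage(taxid):
    """Return the path from taxid up to the root, both ends included."""
    path = [taxid]
    while TOY_TAXONOMY[taxid][0] != taxid:
        taxid = TOY_TAXONOMY[taxid][0]
        path.append(taxid)
    return path


def mrca(taxids):
    """Most recent common ancestor: the lowest node shared by all lineages."""
    taxids = list(taxids)
    shared = set(lineage(taxids[0]))
    for taxid in taxids[1:]:
        shared &= set(lineage(taxid))
    # Walking upwards from the first input, the first shared node is the MRCA.
    for node in lineage(taxids[0]):
        if node in shared:
            return node


def id_at_rank(taxid, rank):
    """Walk up from taxid and return the first ancestor with the given rank."""
    for node in lineage(taxid):
        if TOY_TAXONOMY[node][1] == rank:
            return node
    return None


def lower_ids(taxid):
    """All taxon IDs below taxid (assumption: taxid itself is excluded)."""
    return sorted(t for t in TOY_TAXONOMY if t != taxid and taxid in lineage(t))


def belongs_to(taxid, ancestor):
    """True if ancestor lies on the lineage of taxid."""
    return ancestor in lineage(taxid)


if __name__ == "__main__":
    print(mrca([9606, 9598]))         # shared ancestor of Homo sapiens and Pan troglodytes
    print(id_at_rank(9606, "genus"))  # genus-level ancestor of Homo sapiens
    print(lower_ids(9596))            # everything below Pan in the toy table
    print(belongs_to(9606, 9596))     # is Homo sapiens inside Pan?
```

The descendant search rebuilds a lineage for every entry in the table, which is fine for a toy example but hints at why collecting all lower taxon IDs is slow on the full taxonomy.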

Set up a run

Get started:

Run the following from within the main folder:

  • python setup.py install
  • pip install -r requirements.txt

Translate ncbi names and IDs:

There is an example.py file in the main folder, which can easily be adapted for other purposes. See the PhylUp package for an example of how to use it.

Make sure it runs

To check that the installation was successful and to initialize the download of the taxonomy files from NCBI, a US government website, please run:

python3 ./tests/tests_setup.py # downloads the taxonomy files from NCBI

pytest tests/test_*

Acknowledgement

Thanks to Emily Jane McTavish, who hired me as a postdoc during part of the development, and to the NSF for funding through ABI grant #1759846.
