add fallback when sequences cannot be downloaded #24

Merged 7 commits on Mar 15, 2023

Conversation

rvhonorato
Owner

Sometimes the efetch endpoint cannot be reached, for unknown reasons. Previously this raised an error and no output was produced. This PR adds a fallback: if the efetch endpoint raises an exception, the sequence identifiers are written to a file and a warning message is displayed with further instructions on how to proceed:

 [2023-03-15 12:03:06,867 117 WARNING] Could not fetch the fasta sequences, dumping the sequence IDs instead.
 [2023-03-15 12:03:06,867 120 WARNING] This is probably due to the NCBI server being inaccessible. Please try again later or manually download the sequences from NCBI
 [2023-03-15 12:03:06,867 123 WARNING] Please upload GH43_1_15032023.txt to `https://www.ncbi.nlm.nih.gov/sites/batchentrez` to download the sequences
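
The fallback described above can be sketched roughly as follows. This is a minimal illustration and not the code merged in this PR: the function name `fetch_or_dump_ids`, its parameters, and the injected `fetch` callable are all hypothetical, and the real efetch call is deliberately left out so the sketch runs without network access.

```python
import logging
from pathlib import Path
from typing import Callable, Iterable, List

log = logging.getLogger("fallback_sketch")

BATCH_ENTREZ_URL = "https://www.ncbi.nlm.nih.gov/sites/batchentrez"


def fetch_or_dump_ids(
    ids: Iterable[str],
    fetch: Callable[[List[str]], str],  # hypothetical stand-in for the efetch call
    fasta_path: Path,
    ids_path: Path,
) -> Path:
    """Write the fasta file if fetching works, otherwise dump the IDs.

    Returns the path of whichever file was written.
    """
    id_list = list(ids)
    try:
        fasta = fetch(id_list)
    except Exception:
        # Assumption: any failure of the fetcher triggers the fallback.
        log.warning(
            "Could not fetch the fasta sequences, dumping the sequence IDs instead."
        )
        # One identifier per line, so the file can be uploaded as-is.
        ids_path.write_text("\n".join(id_list) + "\n")
        log.warning(
            "Please upload %s to `%s` to download the sequences",
            ids_path.name,
            BATCH_ENTREZ_URL,
        )
        return ids_path

    fasta_path.write_text(fasta)
    return fasta_path
```

Passing the fetcher in as an argument keeps the fallback path easy to exercise in a test by supplying a callable that always raises.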

rvhonorato self-assigned this on Mar 15, 2023
rvhonorato added the bug label on Mar 15, 2023
rvhonorato linked an issue on Mar 15, 2023 that may be closed by this pull request
rvhonorato merged commit e0bc86d into main on Mar 15, 2023
rvhonorato deleted the 404 branch on March 15, 2023 14:58
Development

Successfully merging this pull request may close these issues.

Skip fasta fetching if entrez endpoint is not accessible
1 participant