doco: GTDB R207 database inconsistency #3118
Comments
There are some instructions above that need to be run - see "Let’s index the taxonomy database using SQLite, for faster access later on:".
That having been said, you can use the taxonomy CSV too! It'll just take longer to load each time.
Right! It needs to match the content of the database you're searching, which (in this case) is all of the GTDB genomes, not just the species-level representatives. We'll fix the tutorial to make this clear! The download link is in the tutorial, under "We also want to download the accompanying taxonomy spreadsheet:"
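One way to check for this kind of mismatch before running the tax commands is to compare the identifiers in the gather results against the identifiers in the taxonomy spreadsheet. The snippet below is a minimal sketch, not part of sourmash: the helper names are hypothetical, and it assumes the gather CSV has a `name` column whose first word is the genome identifier, that the taxonomy CSV has an `ident` column, and that trailing version suffixes such as `.2` should be ignored when matching. The rows in the example are made-up illustrative data.

```python
import csv
import io


def normalize_ident(name, keep_version=False):
    """Return the first whitespace-delimited token of `name`.

    Assumption: the identifier is the first word of the match name,
    and a trailing '.N' version suffix can be dropped for matching.
    """
    tokens = name.split()
    if not tokens:
        return ""
    ident = tokens[0]
    if not keep_version:
        ident = ident.split(".")[0]
    return ident


def missing_idents(gather_rows, taxonomy_rows, gather_col="name", tax_col="ident"):
    """List gather identifiers that have no entry in the taxonomy rows."""
    tax_ids = {normalize_ident(row[tax_col]) for row in taxonomy_rows}
    missing = []
    for row in gather_rows:
        ident = normalize_ident(row[gather_col])
        if ident not in tax_ids and ident not in missing:
            missing.append(ident)
    return missing


# Illustrative, made-up rows: a genome-level gather match checked against
# a species-style taxonomy whose idents look like "s__Escherichia_coli".
gather_csv = io.StringIO("name\nGCF_000005845.2 Escherichia coli\n")
taxonomy_csv = io.StringIO("ident\ns__Escherichia_coli\n")

unmatched = missing_idents(csv.DictReader(gather_csv), csv.DictReader(taxonomy_csv))
print(unmatched)
```

A non-empty result suggests the taxonomy spreadsheet does not correspond to the database that was searched, which is the situation described in this issue.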
Well, and our error message certainly needs some help... we'll fix, thanks!
Oh dear, that does look incorrect to me - I wonder why we did that... I'll see if I can fix. Thank you very much for reporting all of this!
Fixing link to species database here: #3119
Thanks for the quick response @ctb - makes sense - fine by me to close this issue.
Per #3118, we linked the wrong taxonomy spreadsheet! The one in there is an experimental pangenome one. This PR fixes the links and adds better language.
Hi there,
I've been having some trouble getting R207 databases to work with `sourmash tax metagenome`. I'm using 4.8.8 from conda.
After running sketch, the instructions at https://sourmash.readthedocs.io/en/latest/tutorial-lemonade.html#id7 say
There doesn't appear to be any *.sqldb available, now we should just use the taxonomy CSV?
OK, so
"I only need the species reps", I think, so I'll just download the first one. But that fails:
The genome one worked, so I got there in the end.
I'm a bit confused why the species one has ident entries along the lines of `s__Escherichia_coli` when `sketch` doesn't generate IDs of this type. Maybe I'm missing something.
Anyway, HTH,
ben