consider adding save_pathlist_of_signatures method?
#1365
Comments
right, I think this is another situation where manifests or collections of signatures would be great! unfortunately, right now the challenge is that we have no way to refer to signatures in indices / collections. so we could easily do this for signatures sitting in individual files, but it would be hard to do signatures in .sbt.zip files or LCA.json files. Probably need something like
With picklists and now manifests on the horizon, we should be able to update collections: read the manifest, and only sketch signatures if they do not already exist. I think this would involve building the manifest row for a signature that would be generated, checking the existing manifest, and then sketching + adding to the collection if needed. I know this doesn't save a lot of time for a couple of signatures, but it could save a lot for large collections. Would love to hear thoughts about the downsides of enabling this, though! My thinking: now that we can extract specific signatures using picklists, folks can keep all their query signatures together in a single zipfile collection (or several), if they want! If this is the case, you can imagine that sometimes folks (me) want to add an additional sample or ksize without needing to regenerate the entire collection.
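The "check the manifest, then sketch only what is missing" flow could look roughly like the sketch below. This is plain Python and not sourmash's API: the manifest is modeled as a list of dicts, and `manifest_key`, `missing_sketches`, and the column names `name`, `ksize`, and `moltype` are assumptions made for illustration.

```python
def manifest_key(row):
    """Identify a sketch by the fields that decide whether it already exists.

    The choice of (name, ksize, moltype) is an assumption for this example.
    """
    return (row["name"], int(row["ksize"]), row["moltype"])


def missing_sketches(existing_rows, wanted_rows):
    """Return the wanted rows that have no match in the existing manifest."""
    have = {manifest_key(row) for row in existing_rows}
    return [row for row in wanted_rows if manifest_key(row) not in have]


# Hypothetical manifest rows for a collection that already exists.
existing = [
    {"name": "sampleA", "ksize": 31, "moltype": "DNA"},
    {"name": "sampleB", "ksize": 31, "moltype": "DNA"},
]

# Rows for the sketches we would like the collection to contain.
wanted = [
    {"name": "sampleA", "ksize": 31, "moltype": "DNA"},  # already present
    {"name": "sampleA", "ksize": 51, "moltype": "DNA"},  # new ksize
    {"name": "sampleC", "ksize": 31, "moltype": "DNA"},  # new sample
]

# Only these rows would need to be sketched and added to the collection.
todo = missing_sketches(existing, wanted)
```

Only the rows in `todo` would then be sketched and appended, so the cost scales with what is new rather than with the size of the collection.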
this is an interesting idea, and actually a pretty good use case for IPFS and Redis storage backends, too - people wouldn't need to track filenames if their personal collection of signatures was just floating around in a database namespace somewhere. However, I'm curious about the connection to this issue in particular - I don't immediately see a strong connection, and the functionality seems mostly distinct. Do we want to make a new issue on this so the idea doesn't get buried here? (Or is there a strong connection that I'm missing?) Minor note - only a subset of the signature metadata would apply - identifier, moltype, ksize, and filename, I think.
Hah, it's the logical 'next step' of what I want for saving signatures, but you're right - not so connected here. Will make new one!
hey, look, the core functionality for the original request is provided by #1891 in combination with sourmash sig manifest! Between picklists and manifests we have so much nice functionality here that I think mostly what we need is a somewhat better way of outputting summary or merged manifests that include references to the input. I wrote that up over in #1902.
I'm constantly using --from-file file lists these days, so a python api function for saving a file list of sigs (save_list_of_signature_files or save_file_list_of_signatures) might be handy.

At the CLI level, this would enable optionally emitting a list of signature files from sourmash sketch, for use in downstream applications. I could also see it being useful as output of search or prefetch, etc.

related to #1350, #1352
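For illustration, such a function might be as small as the sketch below. It borrows the proposed name `save_file_list_of_signatures`, but the signature, the return value, and the `load_file_list_of_signatures` helper are hypothetical, and it assumes the file list is plain text with one path per line.

```python
import os
import tempfile


def save_file_list_of_signatures(sig_paths, pathlist_file):
    """Write one signature path per line, dropping duplicates, keeping order.

    Returns the number of distinct paths written.
    """
    seen = set()
    with open(pathlist_file, "w", encoding="utf-8") as fp:
        for path in sig_paths:
            if path not in seen:
                seen.add(path)
                fp.write(f"{path}\n")
    return len(seen)


def load_file_list_of_signatures(pathlist_file):
    """Read the list back, ignoring blank lines."""
    with open(pathlist_file, encoding="utf-8") as fp:
        return [line.strip() for line in fp if line.strip()]


# Round trip through a temporary file; the .sig names are placeholders.
with tempfile.TemporaryDirectory() as tmpdir:
    pathlist = os.path.join(tmpdir, "sigs.txt")
    n_written = save_file_list_of_signatures(
        ["a.sig", "b.sig", "a.sig"], pathlist
    )
    roundtrip = load_file_list_of_signatures(pathlist)
```

A file written this way could then be handed to a downstream command that accepts a file list, which is the use case described above.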