
Question: clarify incremental submissions procedure for neuroimaging data #16

Open
yarikoptic opened this issue Jun 26, 2019 · 2 comments

Comments

@yarikoptic
Contributor

AFAIK (from @amanelis17 and myself), neuroimaging data requires incremental uploads. Looking forward to the next submission, we wondered how we will need to proceed when more subjects/sessions are collected.
It would be great if README.md gave a clear example of how to proceed in those cases. Ideally, the provided image03.csv file would be analyzed (against NDA server-side data) for older, already uploaded entries, and only the new entries would be uploaded.

Thank you in advance!

@obenshaindw
Contributor

@yarikoptic and @amanelis17, we can look at updating the README.md on this particular issue.

Normally, data are submitted twice a year (Dec-Jan and Jun-Jul). During these submission cycles, a new metadata csv file is created with the new subjects/sessions, and these data are submitted to the same NDA collection. When users search for data, they generally find all data matching a particular query, or all data from a particular collection/study, and package the data that way. So their package will contain multiple submissions, which together encompass all the currently available data.

@yarikoptic
Contributor Author

Thank you @obenshaindw for the explanation!

... a new metadata csv file is created with the new subjects/sessions ...

In light of BIDS2NDA and possibly other workflows, creating such a new csv file would entail generating one with all the entries for the dataset, running some diff to select only the new entries (while retaining the header), and then providing it to nda-tools. I would say this is a common use case for other workflows as well, since I hope people do not organize the data they analyze locally just to cater to NDA's "incremental" submission requirement. Like any "automation by human actions", it is boring and bug-prone (forgotten or duplicate entries), and it would be great to automate it...

Suggestion: I am not sure yet whether it should be part of vtcmd or some independent helper (e.g. nda-diff, ref #7). E.g., nda-diff could take two csv files (e.g. submission-201901.csv and submission-201907.csv) and produce a new submission-201907-incremental.csv. If coded in a modular fashion, the same internal nda_diff function could also be triggered by adding a --since FILE option to vtcmd, avoiding the creation of the -incremental.csv file altogether and just doing the diff before generating the package. That would simplify incremental submission and eliminate possible human-caused bugs.
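For illustration, a minimal sketch of what such an nda_diff helper could do. Everything here is an assumption (function name, signature, choice of key columns), not existing nda-tools functionality; it also treats the file as a plain csv with a single header row, so if real image03.csv templates carry an extra leading structure-name/version row, that row would need to be skipped and carried over as well:

```python
import csv


def nda_diff(previous_csv, full_csv, output_csv,
             key_columns=("src_subject_id", "interview_date")):
    """Write the rows of full_csv that are absent from previous_csv to output_csv.

    key_columns is only an illustrative guess at what uniquely identifies an entry.
    """

    def row_key(row):
        return tuple(row[col] for col in key_columns)

    # Collect the keys of everything already submitted in the previous cycle.
    with open(previous_csv, newline="") as f:
        already_submitted = {row_key(row) for row in csv.DictReader(f)}

    # Copy only the new rows, retaining the original header.
    with open(full_csv, newline="") as src, open(output_csv, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            if row_key(row) not in already_submitted:
                writer.writerow(row)


if __name__ == "__main__":
    # Hypothetical usage with the example file names from above.
    nda_diff("submission-201901.csv", "submission-201907.csv",
             "submission-201907-incremental.csv")
```

The same function could back either a standalone nda-diff command or a --since FILE option on vtcmd, where the diff would happen in memory before the package is generated.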
