[FIX] Move `rawdata/` into `sourcedata/raw` in alternative structure example, clarify on naming of datasets themselves #1741
Force-pushed from be9bdbf to 191c32c
…dataset to be BIDS dataset

This is my take on an extended discussion about the ambiguity of the `rawdata/` example: https://github.com/bids-standard/bids-specification/pull/1734/files#r1534475631
The prior wording bundled the naming aspect under the same MUST. I split it into separate sentences and added an explicit statement that BIDS does not prescribe a particular naming scheme for source data. I also added an explicit RECOMMENDED on how to organize and name files in the example.
…mpliant'

=== Do not change lines below ===
{
 "chain": [],
 "cmd": "sed -i -e s,rawdata,noncompliant,g tools/schemacode/bidsschematools/validator.py tools/schemacode/bidsschematools/tests/test_validator.py tools/schemacode/bidsschematools/tests/data/expected_bids_validator_xs_write.log",
 "exit": 0,
 "extra_inputs": [],
 "inputs": [],
 "outputs": [],
 "pwd": "."
}
^^^ Do not change lines above ^^^
Force-pushed from 191c32c to 7cae13d
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files
@@ Coverage Diff @@
## master #1741 +/- ##
==========================================
- Coverage 87.93% 87.79% -0.15%
==========================================
Files 16 16
Lines 1351 1360 +9
==========================================
+ Hits 1188 1194 +6
- Misses 163 166 +3
I am good with this approach. Hopefully this should prevent confusion.
Pedantic me notices now that we apparently do not have full consistency on where `dataset_description.json` is listed, although we mostly list it AFTER the folders. But then logically an example might need `...` before it if there are `sub-` folders, and after it to signal that more files are possible:
src/appendices/qmri.md- {
src/appendices/qmri.md- "ds-example": {
src/appendices/qmri.md- "derivatives": {
src/appendices/qmri.md- "qMRLab": {
src/appendices/qmri.md: "dataset_description.json": "",
src/appendices/qmri.md- "sub-01": {
src/appendices/qmri.md- "anat": {
src/appendices/qmri.md- "sub-01_T1map.nii.gz": "",
src/appendices/qmri.md- "sub-01_T1map.json": "",
--
src/common-principles.md- "my_dataset-1": {
src/common-principles.md- "sourcedata": {
src/common-principles.md- "dicoms": {},
src/common-principles.md- "raw": {
src/common-principles.md: "dataset_description.json": "",
src/common-principles.md- "sub-01": {},
src/common-principles.md- "sub-02": {},
src/common-principles.md- "...": "",
src/common-principles.md- },
--
src/common-principles.md- "pipeline_1": {},
src/common-principles.md- "pipeline_2": {},
src/common-principles.md- "...": "",
src/common-principles.md- },
src/common-principles.md: "dataset_description.json": "",
src/common-principles.md- "...": "",
src/common-principles.md- }
src/common-principles.md- }
src/common-principles.md-) }}
--
src/common-principles.md- },
src/common-principles.md- "sub-01": {},
src/common-principles.md- "sub-02": {},
src/common-principles.md- "...": "",
src/common-principles.md: "dataset_description.json": "",
src/common-principles.md- }
src/common-principles.md- }
src/common-principles.md- ) }}
src/common-principles.md-
--
src/common-principles.md- },
src/common-principles.md- "derivatives": {},
src/common-principles.md- "README": "",
src/common-principles.md- "participants.tsv": "",
src/common-principles.md: "dataset_description.json": "",
src/common-principles.md- "CHANGES": "",
src/common-principles.md- }
src/common-principles.md-) }}
src/common-principles.md-
--
src/derivatives/common-data-types.md- "raw/": {
src/derivatives/common-data-types.md- "CHANGES": "",
src/derivatives/common-data-types.md- "README": "",
src/derivatives/common-data-types.md- "channels.tsv": "",
src/derivatives/common-data-types.md: "dataset_description.json": "",
src/derivatives/common-data-types.md- "participants.tsv": "",
src/derivatives/common-data-types.md- "sub-001": {
src/derivatives/common-data-types.md- "eeg": {
src/derivatives/common-data-types.md- "sub-001_task-listening_events.tsv": "",
--
src/longitudinal-and-multi-site-studies.md- },
src/longitudinal-and-multi-site-studies.md- "sub-control01_sessions.tsv": "",
src/longitudinal-and-multi-site-studies.md- },
src/longitudinal-and-multi-site-studies.md- "participants.tsv": "",
src/longitudinal-and-multi-site-studies.md: "dataset_description.json": "",
src/longitudinal-and-multi-site-studies.md- "README": "",
src/longitudinal-and-multi-site-studies.md- "CHANGES": "",
src/longitudinal-and-multi-site-studies.md- }
src/longitudinal-and-multi-site-studies.md-) }}
I will at least unify that now in a separate commit (easy to drop if we decide it is not worth it).
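To make the ordering question above concrete, here is a minimal Python sketch (not part of `bidsschematools`) that reports whether `dataset_description.json` is listed before or after the subdirectories at each level of a file-tree mapping. The helper name `description_placement` and the abbreviated `example` tree are hypothetical; the tree only mimics the `src/common-principles.md` excerpt from the grep output, assuming a dict value means a directory and a string value means a file.

```python
from typing import Iterator


def description_placement(
    tree: dict, filename: str = "dataset_description.json", path: str = "."
) -> Iterator[tuple[str, str]]:
    """Yield (path, placement) for every directory level that lists `filename`.

    placement is "before", "after", or "mixed" relative to the subdirectories
    at that level, or "no-dirs" when the level has no subdirectories.
    """
    keys = list(tree)  # dicts preserve the order in which entries were written
    if filename in tree:
        here = keys.index(filename)
        dirs = [i for i, key in enumerate(keys) if isinstance(tree[key], dict)]
        if not dirs:
            placement = "no-dirs"
        elif all(here < i for i in dirs):
            placement = "before"
        elif all(here > i for i in dirs):
            placement = "after"
        else:
            placement = "mixed"
        yield path, placement
    # Recurse into subdirectories (dict values); string values are files.
    for key, value in tree.items():
        if isinstance(value, dict):
            yield from description_placement(value, filename, f"{path}/{key}")


# Abbreviated, hypothetical tree in the style of the common-principles example.
example = {
    "my_dataset-1": {
        "sourcedata": {
            "dicoms": {},
            "raw": {
                "dataset_description.json": "",
                "sub-01": {},
                "sub-02": {},
                "...": "",
            },
        },
        "dataset_description.json": "",
        "...": "",
    }
}

for level, placement in description_placement(example):
    print(level, placement)
```

Running such a check over every example tree would list the levels that deviate from whichever ordering gets picked.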
Note: the coverage upload failed, so I restarted that pipeline to hopefully get it green.
This feels more confusing to me.
Why is there now a [...]
I still don't understand why what is present is confusing, but maybe it would be better just to remove it. I would like to move to a Diataxis approach, where we clearly separate out the helpful hints and how-tos from the reference material. Maybe it could be revived in a how-to guide in a way that causes fewer problems, if there's any value to it.
Some nit-picks, if the consensus is that this is an improvement and gets cleared to merge. I don't think we want to imply validation of these new bits.
IMHO in BIDS specification we SHOULD only talk about BIDS datasets. We SHOULD NOT talk about arbitrary conventions which could exist outside of BIDS specification. Hence IMHO this example SHOULD be a BIDS dataset as well -- AFAIK nothing in BIDS specification would state that it is an illegitimate BIDS dataset, or am I wrong?
IMHO it is useful to give users guidance. My problem with the current formulation is that it is misleading. This PR IMHO clarifies it, unless the example is indeed "not BIDS compliant".
I believe the validator would currently complain that there are no validatable files. You can argue that's wrong, but from an OpenNeuro perspective, I would want to reject such a dataset as it's effectively a blanket [...]
I find it unhelpful at best, as it drops everything but a [...]
wrong. There is [...]
as we discussed -- this sounds like an archive-specific desire/behavior.
ok, let's meet half-way: what if I remove [...] Then we stay with "convention"
Sure, if we drop [...]
To expedite and ease re-reviewing, I committed the final form of the suggestions.
Co-authored-by: Chris Markiewicz <effigies@gmail.com>
This is what the example now looks like: [...]
I personally find it confusing that the BIDS raw data is now listed under sourcedata, because the way "people" usually think about the directories is: [...]
Nesting "2" under "1" is technically possible, but I find it unhelpful/weird. That said, @Remi-Gau and @effigies seem to have discussed this at length and thought about it, so I will defer to them.
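For readers without the rendered example at hand, the nesting under discussion can be sketched in Python as below. This is a reconstruction from the grep excerpt earlier in the thread, so it may not match the final merged form. The function names `build_example` and `find_datasets` are hypothetical, and the `BIDSVersion` value is only a placeholder.

```python
import json
import tempfile
from pathlib import Path


def build_example(root: Path) -> Path:
    """Create the nested layout: a BIDS raw dataset under sourcedata/raw."""
    top = root / "my_dataset-1"
    raw = top / "sourcedata" / "raw"
    for directory in (top / "sourcedata" / "dicoms", raw / "sub-01", raw / "sub-02"):
        directory.mkdir(parents=True)
    # Both the enclosing dataset and the nested raw dataset carry their own
    # dataset_description.json (placeholder content, for illustration only).
    for dataset in (top, raw):
        description = {"Name": dataset.name, "BIDSVersion": "1.9.0"}
        (dataset / "dataset_description.json").write_text(json.dumps(description))
    return top


def find_datasets(top: Path) -> list[str]:
    """List directories under `top` (inclusive) that hold a dataset_description.json."""
    return sorted(
        path.parent.relative_to(top).as_posix()
        for path in top.rglob("dataset_description.json")
    )


with tempfile.TemporaryDirectory() as tmp:
    print(find_datasets(build_example(Path(tmp))))
```

The two entries it finds correspond to the enclosing dataset and the raw dataset nested under `sourcedata/`, which is the arrangement the comment above calls technically possible but unusual.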
There is no 'ideal' setup per se, but if DataLad is used, then 2. ( [...]
@effigies @sappelhoff so are we a go on this one, or does it need more tune-ups?
Title changed from "Move `rawdata/` into `sourcedata/raw`, ensure example is a BIDS-compliant dataset, clarify on naming of datasets themselves" to "Move `rawdata/` into `sourcedata/raw` in alternative structure example, clarify on naming of datasets themselves"
This reverts commit a3c12f8, where I tried to introduce it in bids-standard#1741, but it required a little more detailing.
Individual commits have more rationale.
I can split this into multiple PRs, but let's see; maybe this could give us closure.
Reflecting my thoughts on:
`rawdata/` is described in text but not part of the schema #1736
`rawdata` directory is not a prescription #1519