Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] BEP 003: Common Derivatives #265

Merged
merged 236 commits into from
Jun 10, 2020
Merged

[ENH] BEP 003: Common Derivatives #265

merged 236 commits into from
Jun 10, 2020

Conversation

effigies
Copy link
Collaborator

@effigies effigies commented Jul 9, 2019

Edit: Following the merge, the common-derivatives branch is no longer being rendered. Derivatives can be found at https://bids-specification.readthedocs.io/en/stable/05-derivatives/01-introduction.html


This PR is the first to replace #207, as discussed in #254.

In this PR, we will focus on the (mostly) modality-agnostic components of the derivatives specification, and the necessary changes to the broader spec. This should hopefully keep the scope to a level that several people can read it all the way through, with a particular lookout for:

  1. Backwards-incompatible changes
  2. Excessive restriction that will make it difficult for apps to produce valid datasets with as-yet-unspecified data types (related to discussion in [FIX] Clarify paragraph about custom data types #264)
  3. Inconsistencies with how the modality-specific sub-BEPs have been treating an issue

There are a lot of conversations on #207. If any of them are relevant to the common derivatives, we should link to the old conversation, ideally with a summary of the state of discussion.

The branch is being rendered on https://bids-specification.readthedocs.io/en/common-derivatives/.


9/18 State of the BEP (by: @effigies)

I consider the following PRs essential to the completion of this BEP:

The following PR into master is a pre-condition for merging this BEP:

The following issues are worth considering in this context:

8/22 State of the BEP (by: @franklin-feingold)

Remaining PRs to meet at a consensus. This will enable the finalization of merging the common derivatives extensions.
#300
#301 (offshoot discussion in #309)
#306
#307
#310

chrisgorgo and others added 30 commits December 16, 2018 11:41
Co-Authored-By: chrisfilo <krzysztof.gorgolewski@gmail.com>
Co-Authored-By: chrisfilo <krzysztof.gorgolewski@gmail.com>
Co-Authored-By: chrisfilo <krzysztof.gorgolewski@gmail.com>
Co-Authored-By: chrisfilo <krzysztof.gorgolewski@gmail.com>
Co-Authored-By: chrisfilo <krzysztof.gorgolewski@gmail.com>
Co-Authored-By: chrisfilo <krzysztof.gorgolewski@gmail.com>
Co-Authored-By: chrisfilo <krzysztof.gorgolewski@gmail.com>
Co-Authored-By: chrisfilo <krzysztof.gorgolewski@gmail.com>
Co-Authored-By: chrisfilo <krzysztof.gorgolewski@gmail.com>
Co-Authored-By: chrisfilo <krzysztof.gorgolewski@gmail.com>
Co-Authored-By: chrisfilo <krzysztof.gorgolewski@gmail.com>
Co-Authored-By: chrisfilo <krzysztof.gorgolewski@gmail.com>
Co-Authored-By: chrisfilo <krzysztof.gorgolewski@gmail.com>
Co-Authored-By: chrisfilo <krzysztof.gorgolewski@gmail.com>
Co-Authored-By: chrisfilo <krzysztof.gorgolewski@gmail.com>
Co-Authored-By: chrisfilo <krzysztof.gorgolewski@gmail.com>
Co-Authored-By: chrisfilo <krzysztof.gorgolewski@gmail.com>
Co-authored-by: Chris Markiewicz <markiewicz@stanford.edu>
@effigies
Copy link
Collaborator Author

effigies commented Jun 4, 2020

While I've got some eyes here, I just want to re-raise #265 (comment):

One question that does come up is the possibility of #369, with GeneratedBy becoming a valid entry in raw datasets. If so, how does a validator or tool automatically detect BIDS Derivatives?

Should we add a RECOMMENDED Class keyword to datasets_description.json with values "raw" or "derived" (or "derivative" or "derivatives"), and its absence to be interpreted as "raw"

Is there something else that we can use?

I think #369 is probably going to happen, and we should in any case not depend on anything implicit ro determine whether a dataset is raw or derivative, so I will go ahead and make this change later today. If someone has an alternative suggestion, I would love to hear it. If someone has the energy to pose it as a PR or suggested code, I would be even more pleased.

@tashrifbillah
Copy link

@effigies , do we still have time to comment?

@effigies
Copy link
Collaborator Author

effigies commented Jun 4, 2020

@tashrifbillah Yes, I believe the window closes at 11:59pm UTC tonight.

@effigies
Copy link
Collaborator Author

effigies commented Jun 4, 2020

Note to any reviewers: I have proposed #494 and intend to merge this evening if there are no comments. If you have any objections or suggestions, please make them known ASAP.

Edit: #494 has been included in 7bfc0e4. Please make any comments on this PR.

@franklin-feingold franklin-feingold mentioned this pull request Jun 5, 2020
@effigies
Copy link
Collaborator Author

effigies commented Jun 5, 2020

@franklin-feingold @sappelhoff This is now frozen. Should we merge? Or is there a last round of reviews to go?

@sappelhoff
Copy link
Member

Given that our definition of this freeze is loosely "public comments are concluded and there is one week where only critical issues will be considered" I think it's better to keep this a PR open until maybe one or two days prior the next release to make spotting potentially critical issues easier.

(the diff here is easier to read than digging into commits)

Unless you have a reason to merge it already now?

@effigies
Copy link
Collaborator Author

effigies commented Jun 5, 2020

Nope.

@franklin-feingold
Copy link
Collaborator

I agree with waiting to merge until a day or two before our release

@sappelhoff
Copy link
Member

as discussed in yesterday's maintainers meeting I am merging this now. Huge thanks to all contributors, reviewers and @effigies for steering this huge PR!

There are still a couple of To Dos, most notably:

  1. BIDS-validator implementation
  2. Examples

We see neither of these points as blockers for this PR or for the 1.4 release, yet we should keep busy to solve the two points soon.

re: BIDS-validator, we have created a dedicated project on the validator repository: https://github.com/bids-standard/bids-validator/projects/1 ... the idea is to add discrete and small issues to that project so that the whole effort can be overcome in a step wise and iterative manner (not a single big PR)

re: Examples, we are still looking for somebody to prepare one or several for https://github.com/bids-standard/bids-examples. In my opinion it'd be nice to not add a completely new example, but instead simply add derivatives to one of the existing examples in the repo.

@sappelhoff sappelhoff merged commit 3a11391 into master Jun 10, 2020
@poldrack
Copy link

Congrats to all of you on making this happen!

@tiborauer
Copy link

I have noticed that the rendering is not as mentioned in the description.
The link https://bids-specification.readthedocs.io/en/common-derivatives points to an empty page. Shall is be replaced with this one, perhaps: https://bids-specification.readthedocs.io/en/stable/05-derivatives/01-introduction.html?

@effigies
Copy link
Collaborator Author

Thanks @tiborauer. I updated the description.

@tiborauer
Copy link

tiborauer commented Jun 16, 2020

I have a few comments. Sorry for the late feedback:

1. Metadata conventions

- When chaining derivative pipelines, any JSON fields that were specified as
mandatory in the input files SHOULD be propagated forward in the output
file’s JSON provided they remain valid. Non-required JSON fields MAY be
propagated, and are highly useful, but it is the pipeline’s responsibility
to ensure that the values are still relevant and appropriate to the type of
output data.

I can see "valid" and "relevant" used interchangeably here and elsewhere in the BEP; however, I think they are not the same. The TR of a 3D (i.e. static) image can be irrelevant but still valid. A sampling rate after resampling can be invalid but still relevant.

2. Sources and RawSources

| Sources | OPTIONAL. A list of files with the paths specified relative to dataset root; these files were directly used in the creation of this derivative data file. For example, if a derivative A is used in the creation of another derivative B, which is in turn used to generate C in a chain of A->B->C, C should only list B in `Sources`, and B should only list A in `Sources`. However, in case both X and Y are directly used in the creation of Z, then Z should list X and Y in `Sources`, regardless of whether X was used to generate Y. |
| RawSources | OPTIONAL. A list of paths relative to dataset root pointing to the BIDS-Raw file(s) that were used in the creation of this derivative. |

Does the "paths specified relative to dataset root" means relative to the source or to the derived (i.e. this) dataset?

3. Examples

Preprocessed `bold` NIfTI file in the original coordinate space of the original run.
The location of the file in the original datasets is encoded in the `RawSources`
metadata, and `desc-<label>` is used to prevent clashing with the original file name.

Does "the location of the file in the original datasets" mean that the derivative is located in the original (i.e. (raw)source) dataset?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.