[ENH] BEP 003: Common Derivatives #265
Copying @Lestropie's comment from #488 (review):
I would be okay with an alternate name, but the `sourcedata/` keyword is already there, means the data that were transformed to produce the current dataset, and constitutes a boundary between datasets.
Yes. I expect you'll end up with chains of derivatives or chains of sources, but not generally both within the same dataset. Extrapolating purely from my own preferences, I would suggest that the derivatives-as-subdirectory is a pre-derivatives hack that will largely go away in the future. Keeping a versioned record of the precise inputs of a derivative is a much more useful way to link datasets. (See: https://doi.org/10.7490/f1000research.1116363.1)
I had been thinking that we might need something in the `dataset_description.json` to unambiguously declare a dataset either derivative or raw. At the moment, `GeneratedBy` is a distinguishing feature, but if #369 is accepted, that won't work. I will make a comment to the main thread.
I think in practice, these are unenforceable, but if they were, the subdirectory can be `chmod`ded `-w`.
`derivatives/` and `sourcedata/` have to constitute boundaries between datasets, or you really can't coherently parse a dataset. What happens on the other side of those boundaries is out of our control. I have seen in practice people nesting input datasets inside output datasets, so this is just trying to give guidance on a consistent, BIDS-y way to do that.
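To make the boundary idea concrete, here is a minimal sketch of how a tool might enumerate the files of one dataset without crossing into nested datasets. It is only an illustration under stated assumptions: `walk_dataset` and `BOUNDARY_DIRS` are hypothetical names, not part of the BEP or of any existing validator, and the sketch assumes that only a top-level `derivatives/` or `sourcedata/` directory counts as a boundary.

```python
import os
from pathlib import Path

# Assumption: these top-level directory names mark boundaries between datasets.
BOUNDARY_DIRS = {"derivatives", "sourcedata"}


def walk_dataset(root):
    """List the files of the dataset at `root`, stopping at dataset boundaries.

    Returns (files, nested): `files` holds sorted POSIX-style paths relative to
    `root`; `nested` holds the top-level boundary directories left unentered.
    """
    root = Path(root)
    files, nested = [], []
    for dirpath, dirnames, filenames in os.walk(root):
        rel = Path(dirpath).relative_to(root)
        if rel == Path("."):
            # Record boundary directories, then prune them in place so that
            # os.walk never descends into the neighbouring datasets.
            nested.extend(d for d in dirnames if d in BOUNDARY_DIRS)
            dirnames[:] = [d for d in dirnames if d not in BOUNDARY_DIRS]
        for name in filenames:
            files.append((rel / name).as_posix())
    return sorted(files), sorted(nested)
```

Whatever lives under the directories returned in `nested` would be parsed, if at all, as a separate dataset with its own `dataset_description.json`.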
No particular contentions with any of the above from me.
Only other possibility to float in this context would be whether we expect a priori / have learned from experience that a particular inter-dataset filesystem arrangement is likely to lead to fewer problems / less ambiguity, and would consequently warrant being the recommended approach, but without restriction on the alternatives. If anything this might actually make things easier for beginners, as they would no longer be obliged to naively make such a decision up-front.
I'm not sure there currently exists a consensus on the best way to do it. I obviously have my opinion, but I'm not sure that we should really recommend one, as I think it's going to depend on the use case. For example, some approaches may be more conducive to doing processing while others are better for packaging up and distributing.
Hopefully this makes clear that datasets should be independently distributable. If that's the case, then their relative locations should be irrelevant, and moving to an alternative (or no) nesting should be more-or-less trivial.
Happy to consider any wording you might suggest that will get these ideas across. I think I'm running out of generative capacity at the moment.