Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DOC] Update docstring for fixed_seed to ref None option #104

Merged
merged 3 commits into from
Aug 4, 2018

Conversation

emdupre
Copy link
Member

@emdupre emdupre commented Aug 1, 2018

Closes #87 .

Changes proposed in this pull request:

  • Updates the docstring for the fixed_seed argument in eigendecomp.py and tedana.py

@codecov
Copy link

codecov bot commented Aug 1, 2018

Codecov Report

Merging #104 into master will decrease coverage by 0.05%.
The diff coverage is 25%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #104      +/-   ##
==========================================
- Coverage   45.42%   45.36%   -0.06%     
==========================================
  Files          28       28              
  Lines        1585     1587       +2     
==========================================
  Hits          720      720              
- Misses        865      867       +2
Impacted Files Coverage Δ
tedana/workflows/tedana.py 11.86% <0%> (ø) ⬆️
tedana/decomposition/eigendecomp.py 10.92% <0%> (-0.19%) ⬇️
tedana/utils/io.py 13.76% <66.66%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0c5f93d...aec1915. Read the comment docs.

Copy link
Member

@tsalo tsalo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are just a couple of typos, but mostly looks good.

@@ -149,7 +149,9 @@ def _get_parser():
parser.add_argument('--seed',
dest='fixed_seed',
type=int,
help='Seeded value for ICA, for reproducibility.',
help=('Value passed to repr(mdp.numx_rand.seed()) '
'Set to an integer value for reproducibile ICA results; '
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reproducibile --> reproducible

@@ -149,7 +149,9 @@ def _get_parser():
parser.add_argument('--seed',
dest='fixed_seed',
type=int,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will break with None. Also (and this might not matter to anyone), but typing --fixed_seed None is a little awkward. I think I've seen something like --fixed_seed -1 which would then be coerced to None later.

@@ -228,7 +230,9 @@ def tedana_workflow(data, tes, mixm=None, ctab=None, manacc=None, strict=False,
Other Parameters
----------------
fixed_seed : :obj:`int`, optional
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:obj:int or :obj:None, I think.

@@ -228,7 +230,9 @@ def tedana_workflow(data, tes, mixm=None, ctab=None, manacc=None, strict=False,
Other Parameters
----------------
fixed_seed : :obj:`int`, optional
Seeded value for ICA, for reproducibility.
Value passed to ``mdp.numx_rand.seed()``.
Set to an integer value for reproducibile ICA results;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same typo as above.

@@ -260,7 +260,9 @@ def tedica(n_components, dd, conv, fixed_seed, cost, final_cost,
conv : :obj:`float`
Convergence limit for ICA
fixed_seed : :obj:`int`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:obj:int or :obj:None

@@ -260,7 +260,9 @@ def tedica(n_components, dd, conv, fixed_seed, cost, final_cost,
conv : :obj:`float`
Convergence limit for ICA
fixed_seed : :obj:`int`
Seed for ensuring reproducibility of ICA results
Value passed to ``mdp.numx_rand.seed()``.
Set to an integer value for reproducibile ICA results;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reproducibile --> reproducible

@emdupre
Copy link
Member Author

emdupre commented Aug 1, 2018

I cheated and fixed the typos with an git commit --amend 😆

The awkwardness of fixed_seed=None is a good point, though. Do you think it's too much ? We could do a couple of things, in my mind:

  • Have None be the default and specify that users should set a seed
  • Go ahead with the dual :obj:int or :obj:None identity, as you suggested
  • Try for a special value like -1, but I am worried that could actually be a valid seed

WDYT ? Do you have a preference here ?

@tsalo
Copy link
Member

tsalo commented Aug 1, 2018

Smart!

Regarding the fixed_seed values- I know that -1 breaks numpy.random.seed, so I don't expect it to be a valid value. I do think that None should be the default, although that will involve changing the order of the arguments in tedica, but that will still involve changing the docstrings. I don't know if setting the default value in the argument parser to None will conflict with the type being int, but my intuition is that it won't be a problem (unlike feeding in --seed None directly).

@emdupre
Copy link
Member Author

emdupre commented Aug 1, 2018

I just checked with scipy (what this is really a wrapper for), and -1 also breaks, so we could coerce the value. I think the choice then comes down to whether or not None should be the default.

I'm not enthusiastic about it, since I'd like to have results be reproducible out of the box unless a user knows and wants to specifically request non-replicable results. Happy to hear other thoughts, though ! Could you explain why you think None should be the default ?

@emdupre
Copy link
Member Author

emdupre commented Aug 1, 2018

@KirstieJane might have an opinion here, too, since her group also implemented a fixed_seed argument !

@KirstieJane
Copy link
Member

To be clear, the question is which should be the default, right? None or an integer value.

I can see both sides. There's a danger that groups won't mix up the random seed and therefore every run will be weirdly similar.....but I think the bigger danger is that people don't fix the random seed and every time they re-run the code they get (slightly) different answers.

So, my not-very-strong vote is for the default to be an integer rather than None....but with clearly documentation to say - go ahead and change this value as you choose!

@handwerkerd
Copy link
Member

The changes look fine to me. I'm fine merging, as is, but I'll make some comments in response to other's concerns regarding the default option and the awkwardness of --fixed_seed=None

Earlier, I agreed that the default should be a fixed seed so that the default is replicable results, but understand why default randomness could be useful. If that's the general consensus, for replicability, we'd just need to save the random seed somewhere in the output. The easiest way to do this would be to use a random number generator to assign a value to fixed_seed and then make sure to save that number in the output.
For example, replace

mdp.numx_rand.seed(fixed_seed)

with

if fixed_seed == None:
    fixed_seed=np.random_integers(low=1)
mdp.numx_rand.seed(fixed_seed)

and then make sure that fixed_seed is actually part of the output of the code. I'd recommend this in general because, even if people use None, they might want to replicate a specific iteration at some point in the future.

@KirstieJane
Copy link
Member

Endorse @handwerkerd's suggestion, even if we do keep a fixed seed as the default! 💯

@emdupre
Copy link
Member Author

emdupre commented Aug 2, 2018

Ok, I've edited this to take -1 as the value for non-reproducible calls, but following @handwerkerd's suggestion I'm also writing out the seed that was actually used !

Happy to hear any other suggestions, here. And I'd always appreciate another review, @tsalo :)

@emdupre
Copy link
Member Author

emdupre commented Aug 4, 2018

Merging !

@emdupre emdupre merged commit e0ee1cc into ME-ICA:master Aug 4, 2018
@emdupre emdupre deleted the ica-seed branch August 20, 2018 19:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

seed for ICA is never random
4 participants