-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extended multiple correction for group sequential, added doc for multiple correction. #179
Conversation
daryadedik
commented
Jan 9, 2018
•
edited
Loading
edited
- completed section for multiple correction
- multiple correction for group sequential
… it to false by default, added docs for multiple correction
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Thanks a lot!
I just have a few small suggestions for wordings.
docs/tutorial.rst
Outdated
@@ -98,26 +98,28 @@ You may also find the description in our :ref:`API <modindex>` page. | |||
* ``min_observations=20``: Minimum number of observations needed. | |||
* ``nruns=10000``: Only used if assume normal is false. | |||
* ``relative=False``: If relative==True, then the values will be returned as distances below and above the mean, respectively, rather than the absolute values. | |||
* ``multi_test_correction=True``: Initiate multiple correction (Bonferroni correction is supported). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
multi_test_correction=False
by default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, sure! Will correct that!
docs/tutorial.rst
Outdated
|
||
*group_sequential* is a frequentist approach for early stopping: | ||
|
||
* ``spending_function='obrien_fleming'``: Currently we support only Obrient-Fleming alpha spending function for the frequentist early stopping decision. | ||
* ``estimated_sample_size=None``: Sample size to be achieved towards the end of experiment. In other words, the actual size of data should be always smaller than estimated_sample_size. | ||
* ``alpha=0.05``: Type-I error rate. | ||
* ``cap=8``: Upper bound of the adapted z-score. | ||
* ``multi_test_correction=True``: Initiate multiple correction (Bonferroni correction is supported). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here. multi_test_correction=False
by default?
expan/core/statistics.py
Outdated
@@ -57,7 +58,8 @@ def delta(x, y, assume_normal=True, percentiles=[2.5, 97.5], | |||
the weighted mean and confidence intervals, which is equivalent | |||
to the overall metric. This weighted approach is only relevant | |||
for ratios. | |||
num_tests: number of tests or reported kpis | |||
multi_test_correction (boolean): correct the confidence intervals (multiple correction problem) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about "flag of whether the correction for multiple testing is needed" to be consistent with the docstring below?
expan/core/statistics.py
Outdated
@@ -429,6 +430,8 @@ def normal_sample_difference(x, y, percentiles=[2.5, 97.5], relative=False): | |||
absolute values. In this case, the interval is mean-ret_val[0] to | |||
mean+ret_val[1]. This is more useful in many situations because it | |||
corresponds with the sem() and std() functions. | |||
multi_test_correction (boolean): True or False whether the correction for multiple testing is needed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about "flag of whether the correction for multiple testing is needed"?
expan/core/statistics.py
Outdated
@@ -473,6 +477,8 @@ def normal_difference(mean1, std1, n1, mean2, std2, n2, percentiles=[2.5, 97.5], | |||
absolute values. In this case, the interval is mean-ret_val[0] to | |||
mean+ret_val[1]. This is more useful in many situations because it | |||
corresponds with the sem() and std() functions. | |||
multi_test_correction (boolean): True or False whether the correction for multiple testing is needed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about "flag of whether the correction for multiple testing is needed"?
expan/core/early_stopping.py
Outdated
@@ -56,7 +59,9 @@ def group_sequential(x, | |||
the end of experiment | |||
alpha: type-I error rate | |||
cap: upper bound of the adapted z-score | |||
|
|||
multi_test_correction: multiple correction flag | |||
num_tests: number of tests or reported kpis used for multiple correction (default: 1, no correction is done) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about "num_tests (integer): number of tests or reported kpis used for multiple correction. This value is only used if multi_test_correction=True"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, if multi_test_correction is True I add worker_args['num_tests'] = len(self.report_kpi_names) in _delta. I made that because I don't want to make num_tests as a parameter of the method, because it's basically the number of reported kpis, which also added in parameters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. 👍
expan/core/statistics.py
Outdated
@@ -57,7 +58,8 @@ def delta(x, y, assume_normal=True, percentiles=[2.5, 97.5], | |||
the weighted mean and confidence intervals, which is equivalent | |||
to the overall metric. This weighted approach is only relevant | |||
for ratios. | |||
num_tests: number of tests or reported kpis | |||
multi_test_correction (boolean): correct the confidence intervals (multiple correction problem) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about "flag of whether the correction for multiple testing is needed" to be consistent?
expan/core/early_stopping.py
Outdated
@@ -56,7 +59,9 @@ def group_sequential(x, | |||
the end of experiment | |||
alpha: type-I error rate | |||
cap: upper bound of the adapted z-score | |||
|
|||
multi_test_correction: multiple correction flag |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about "flag of whether the correction for multiple testing is needed" to be consistent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could more clear, thanks!
expan/core/experiment.py
Outdated
""" | ||
Perform subgroup analysis on date partitioning each day from start day till end date. Produces non-cumulative | ||
delta and CIs for each subgroup. | ||
Args: | ||
multi_test_correction (boolean): True or False whether the correction for multiple testing is needed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about "flag of whether the correction for multiple testing is needed"?
expan/core/experiment.py
Outdated
""" | ||
Perform subgroup analysis. | ||
Args: | ||
feature_name_to_bins (dict): a dict of feature name (key) to list of Bin objects (value). | ||
This dict defines how and on which column to perform the subgroup split. | ||
multi_test_correction (boolean): True or False whether the correction for multiple testing is needed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about "flag of whether the correction for multiple testing is needed"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, will make everywhere the same.
docs/glossary.rst
Outdated
@@ -83,11 +83,26 @@ You can find links to our detailed documentations for | |||
|
|||
Subgroup analysis | |||
------------------------------------ | |||
Subgroup analysis in ExaAn will select subgroup (which is a segment of data) based on the input argument, and then perform a regular delta analysis per subgroup as described before. | |||
Subgroup analysis in ExaAn will select subgroup (which is a segment of data) based on the input argument, and then perform a regular delta analysis per subgroup as described before. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry there's just a typo here (ExpAn)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, ok, will fix =)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks great
added multiple correction for bootstrap (also False by default) since it was there before and it also make sense to have it for bootstrap. |