Statistical tests on a test set #23

Open
dinga92 opened this issue Jul 4, 2018 · 17 comments

dinga92 commented Jul 4, 2018

I would like to add functionality to easily run statistical tests (against the null, and against other classifiers) on an independent test set. Since the test set is independent, this should be straightforward (no need to deal with dependencies between folds).

IMHO the main task will be designing a usable API.
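For the test-against-null case on an independent test set, one simple option is an exact binomial test of accuracy against the chance level. A minimal sketch; the function name is illustrative, not part of neuropredict, and scipy.stats.binomtest needs SciPy >= 1.7:

import numpy as np
from scipy.stats import binomtest

def accuracy_vs_chance(y_true, y_pred, chance_level=0.5):
    """Exact binomial test of test-set accuracy against a fixed chance level."""
    n_correct = int(np.sum(np.asarray(y_true) == np.asarray(y_pred)))
    n_total = len(y_true)
    result = binomtest(n_correct, n_total, p=chance_level, alternative='greater')
    return n_correct / n_total, result.pvalue

# accuracy, p_value = accuracy_vs_chance(y_test, clf.predict(X_test))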


raamana commented Jul 19, 2018

Certainly.

Make something assuming the ideal version you need, and we will work backwards to have it in neuropredict.


raamana commented Aug 13, 2018

Hi Dinga, did you get a chance to work on this yet?


raamana commented Sep 19, 2018

Hi Richard, did you get a chance to think about this? Take a look at related discussion here: maximtrp/scikit-posthocs#8


dinga92 commented Sep 19, 2018

Sorry for the delay; I was still in vacation mode, and before that I had to finish other papers.

I am working on this now. I have been looking at the theory for the tests and also at how sklearn does things, so we can be consistent; many useful things are already implemented there and in statsmodels.

What kind of tests are you looking for in scikit-posthocs?


raamana commented Sep 19, 2018

I don't think sklearn has anything in this regard - let me know if you see something.

I am particularly interested in the Friedman test and the Nemenyi post-hoc test, but I am open to learning, trying and testing all the others too.
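A rough sketch of the Friedman + Nemenyi workflow mentioned above, using scipy and the scikit-posthocs package (the data layout of rows = datasets and columns = classifiers, the made-up numbers, and the assumption that scikit-posthocs exposes posthoc_nemenyi_friedman are all illustrative):

import numpy as np
from scipy.stats import friedmanchisquare
import scikit_posthocs as sp

# accuracies of 3 classifiers (columns) on 5 datasets (rows); made-up numbers
scores = np.array([[0.72, 0.78, 0.75],
                   [0.65, 0.70, 0.66],
                   [0.80, 0.84, 0.79],
                   [0.58, 0.63, 0.60],
                   [0.77, 0.81, 0.76]])

# Friedman test: do the classifiers differ at all across datasets?
stat, p = friedmanchisquare(*scores.T)
print(stat, p)

# if they do, the Nemenyi post-hoc test gives pairwise p-values
print(sp.posthoc_nemenyi_friedman(scores))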


dinga92 commented Sep 19, 2018

They have a permutation test. This might be of interest to you: https://arxiv.org/abs/1606.04316, together with the code at https://github.com/BayesianTestsML/tutorial/

Comparing multiple models on multiple datasets is not as important to me at the moment; also, I think it is quite a niche feature in general.

I will focus now on getting valid external validation and some reporting for one model, and add something more complex later, probably for comparing competing models on the same test set. What do you say?
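For reference, the sklearn permutation test mentioned above is presumably sklearn.model_selection.permutation_test_score; note that it permutes labels inside a cross-validation loop rather than on a fixed external test set. A minimal, self-contained sketch:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import permutation_test_score

# toy data; in practice this would be the real features and labels
X, y = make_classification(n_samples=100, n_features=5, n_informative=2)

# score of the real model vs. the distribution of scores under shuffled labels
score, perm_scores, p_value = permutation_test_score(
    LogisticRegression(), X, y, scoring='accuracy', cv=5, n_permutations=1000)
print(score, p_value)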


dinga92 commented Sep 19, 2018

I am doing lots of power comparisons and model comparisons now, so I will try to make what I am doing usable and put it here.


raamana commented Sep 19, 2018

Sure, we can start with something small.

Yeah, do it only if it helps your research and is something you will use in the short to medium term.


dinga92 commented Sep 20, 2018

Any hints on how to write tests?


raamana commented Sep 20, 2018

Funny you ask, I was just informing folks about this: https://twitter.com/raamana_/status/1039150311842164737


dinga92 commented Sep 21, 2018

Sounds good, but which one are you using here? (sorry for a noob question)


raamana commented Sep 21, 2018

NP, I use pytest. It's easy to learn.
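For what it's worth, a minimal example of the pytest conventions (functions named test_* in files named test_*.py, plain assert statements), here using an sklearn metric as a stand-in for the code under test:

# test_metrics.py -- run with `pytest` from the project root
import numpy as np
from sklearn.metrics import roc_auc_score

def test_auc_is_perfect_for_separable_scores():
    y_true = np.array([0, 0, 1, 1])
    scores = np.array([0.1, 0.2, 0.8, 0.9])
    assert roc_auc_score(y_true, scores) == 1.0

def test_auc_is_chance_for_constant_scores():
    y_true = np.array([0, 0, 1, 1])
    scores = np.array([0.5, 0.5, 0.5, 0.5])
    assert roc_auc_score(y_true, scores) == 0.5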


dinga92 commented Sep 21, 2018

So this is a little demo of what I have now:

import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# simulate a small, weakly separable binary classification problem
dataset_size = 50
X, y = datasets.make_classification(n_samples=dataset_size,
                                    n_features=5,
                                    n_informative=2,
                                    flip_y=0,
                                    class_sep=0.5)
# hold out half of the data as an independent test set
X_train, X_test, y_train, y_test = train_test_split(X,
                                                    y,
                                                    test_size=0.5,
                                                    stratify=y)
# L1-penalised logistic regression (the liblinear solver supports the L1 penalty)
fit = LogisticRegression(C=1, penalty='l1', solver='liblinear').fit(X_train, y_train)
predicted_probabilities = fit.predict_proba(X_test)

results = validate_out_of_sample_predictions(y_test, predicted_probabilities)
print(np.array(results))

Out:

              estimate  p-value  CI lower  CI upper
Accuracy         0.760    0.007     0.593     0.927
AUC              0.821    0.004     0.649     0.992
Logscore         0.532    0.010     0.000     0.000
Brierscore       0.559    0.005     0.000     0.000

validate_out_of_sample_predictions takes (probabilistic) predictions as scikit-learn outputs them and computes accuracy, AUC, log score and Brier score, together with their p-values and confidence intervals. At the moment I am using a permutation test to get p-values for the log score and Brier score, and I don't yet have a way to compute CIs for those, but I think I will do it with the bootstrap. I have these measures there because that's what I am using in my paper at the moment, but I would like to add different ones that are interpretable and in line with best practices.

Is this functionality something you would like here?
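For concreteness, a rough sketch of the two ingredients described above, a label-permutation p-value and a percentile-bootstrap CI for an arbitrary metric on a fixed test set; this is not the actual validate_out_of_sample_predictions implementation, and the function names and defaults are illustrative:

import numpy as np

def permutation_pvalue(y_true, y_prob, metric, n_perm=1000, seed=None):
    # p-value of metric(y_true, y_prob) under random permutations of the labels;
    # assumes higher metric values are better (flip the comparison for loss-like
    # scores such as the Brier score or log loss)
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    observed = metric(y_true, y_prob)
    null = [metric(rng.permutation(y_true), y_prob) for _ in range(n_perm)]
    return (1 + np.sum(np.asarray(null) >= observed)) / (n_perm + 1)

def bootstrap_ci(y_true, y_prob, metric, n_boot=1000, alpha=0.05, seed=None):
    # percentile bootstrap CI: resample test cases with replacement
    rng = np.random.default_rng(seed)
    y_true, y_prob = np.asarray(y_true), np.asarray(y_prob)
    n = len(y_true)
    stats = [metric(y_true[idx], y_prob[idx])
             for idx in (rng.integers(0, n, n) for _ in range(n_boot))]
    return np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])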


raamana commented Sep 21, 2018

Can you push the code for validate_out_of_sample_predictions to your repo and point that to me?

Also, please do take a look at the scikit-posthocs repo and play with some examples.

I think you and I are on slightly different pages.


dinga92 commented Sep 21, 2018

This is what I have now: dinga92@8e7a445. It's more at a script stage to run my own stuff and not really at a merging stage yet.

Now I need to compare models against the null; later I will also compare two models against each other. As far as I understand, the post-hoc tests you are referring to are for comparing multiple models against each other, am I right?
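For the two-models-on-the-same-test-set case, one standard option (mentioned here only as an illustration, not necessarily what will end up in neuropredict) is McNemar's test on the paired correct/incorrect outcomes, available in statsmodels:

import numpy as np
from statsmodels.stats.contingency_tables import mcnemar

def compare_two_models(y_true, pred_a, pred_b):
    correct_a = (np.asarray(pred_a) == np.asarray(y_true))
    correct_b = (np.asarray(pred_b) == np.asarray(y_true))
    # 2x2 table of (dis)agreement in correctness between the two models
    table = np.array([[np.sum(correct_a & correct_b),  np.sum(correct_a & ~correct_b)],
                      [np.sum(~correct_a & correct_b), np.sum(~correct_a & ~correct_b)]])
    return mcnemar(table, exact=True)

# result = compare_two_models(y_test, model_a.predict(X_test), model_b.predict(X_test))
# print(result.pvalue)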


raamana commented May 21, 2019

Yes.

Also, will you be at OHBM next month?


dinga92 commented May 23, 2019 via email
