
Analyze ml prediction results and possibly lower confidence threshold for bugbug prediction #3702

Closed
ksy36 opened this issue Jun 6, 2022 · 1 comment

Comments

@ksy36
Contributor

ksy36 commented Jun 6, 2022

Current confidence threshold is 97%:

Confidence threshold > 0.97 - 5270 classified

                          pre       rec       spe        f1       geo       iba       sup

                 1       0.99      0.53      0.92      0.69      0.70      0.47      9659
                 0       1.00      0.07      1.00      0.12      0.26      0.06       798
__NOT_CLASSIFIED__       0.00      0.00      0.50      0.00      0.00      0.00         0

With this confidence the model is able to find 53% of the invalid issues and is wrong only 1% of the time (precision 0.99).
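To make the precision/recall numbers above concrete, here is a minimal sketch (not the actual bugbug evaluation code; the data is made up) of how those two metrics come out of a confidence threshold — only predictions above the threshold are acted on, and everything below it stays unclassified:

```python
# Hypothetical sketch: how precision and recall follow from a confidence
# threshold. `probs`/`labels` are invented example data, not real bugbug output.

def classify(probs, labels, threshold):
    """Count outcomes when only predictions above `threshold` are acted on."""
    tp = fp = fn = 0
    for p, y in zip(probs, labels):
        if p > threshold:      # confident enough to auto-close as invalid
            if y == 1:
                tp += 1        # correctly flagged invalid issue
            else:
                fp += 1        # valid issue wrongly closed
        elif y == 1:
            fn += 1            # invalid issue left for human triage
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Two confident correct calls, two invalid issues below the threshold:
prec, rec = classify([0.99, 0.96, 0.98, 0.80, 0.96], [1, 0, 1, 1, 1], 0.97)
```

Lowering the threshold moves borderline predictions from the "not classified" bucket into the acted-on bucket, which raises recall but lets more false positives through.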

Let's analyze the stats that were added in #3701 and decide whether we would benefit from lowering the confidence threshold to 95%.

With 95% confidence the model would find a higher percentage of issues (69%) but would have slightly lower precision (a 2% error rate instead of 1%).

Confidence threshold > 0.95 - 6863 classified

                          pre       rec       spe        f1       geo       iba       sup

                 1       0.98      0.69      0.84      0.81      0.76      0.57      9659
                 0       0.97      0.09      1.00      0.16      0.30      0.08       798
__NOT_CLASSIFIED__       0.00      0.00      0.66      0.00      0.00      0.00         0
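A rough back-of-envelope comparison of the two thresholds, using only the rounded recall/precision figures and the class-1 support (9659) quoted in the tables above (this is an illustration of the trade-off, not the actual evaluation):

```python
# Back-of-envelope comparison from the rounded report figures above.
support = 9659  # class-1 (invalid) issues in the evaluation set

def caught_and_errors(recall, precision):
    caught = recall * support           # invalid issues the model detects
    total_flagged = caught / precision  # everything the model would flag
    errors = total_flagged - caught     # valid issues wrongly flagged
    return round(caught), round(errors)

at_97 = caught_and_errors(0.53, 0.99)  # current threshold
at_95 = caught_and_errors(0.69, 0.98)  # proposed threshold
```

So lowering the threshold would catch roughly 1,500 more invalid issues over the evaluation set, at the cost of wrongly flagging roughly 2.6x as many valid ones.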
@ksy36 ksy36 changed the title Gather stats and experiment with lowering confidence threshold for bugbug prediction Analyze ml prediction result and possibly lower confidence threshold for bugbug prediction Jun 6, 2022
@ksy36 ksy36 changed the title Analyze ml prediction result and possibly lower confidence threshold for bugbug prediction Analyze ml prediction results and possibly lower confidence threshold for bugbug prediction Jun 6, 2022
@ksy36
Contributor Author

ksy36 commented Jun 22, 2022

To analyze whether it makes sense to lower the confidence threshold from 97% to 95%, I've searched for issues that would be considered false positives at a confidence of 95% but not at 97%. Those are issues that were predicted as invalid with a confidence between 0.95 and 0.97, but that ended up being valid (had a needsdiagnosis or moved milestone).

There are 13 issues that meet these criteria since we started recording stats 3 weeks ago:

webcompat/web-bugs#106167, confidence:0.95981133,
webcompat/web-bugs#106162, confidence:0.962785,
webcompat/web-bugs#106156, confidence:0.96612686,
webcompat/web-bugs#106148, confidence:0.96816695,
webcompat/web-bugs#106064, confidence:0.96615344,
webcompat/web-bugs#106038, confidence:0.9681239,
webcompat/web-bugs#105932, confidence:0.96473366,
webcompat/web-bugs#105905, confidence:0.9613183,
webcompat/web-bugs#105776, confidence:0.96952885,
webcompat/web-bugs#105571, confidence:0.95864516,
webcompat/web-bugs#105512, confidence:0.96135634,
webcompat/web-bugs#105390, confidence:0.9563874,
webcompat/web-bugs#105307, confidence:0.9617645
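The selection described above can be sketched as a simple filter over recorded predictions (a hypothetical illustration — the issue rows below are made up and this is not the actual stats-query code):

```python
# Hypothetical filter: issues whose invalid-prediction confidence fell in
# [0.95, 0.97) but that turned out to be valid. Rows are invented samples.
issues = [
    # (issue_number, confidence, ended_up_valid)
    (106167, 0.95981133, True),   # in the band and valid -> false positive
    (106200, 0.98200000, False),  # above the band -> already auto-closed
    (106201, 0.94100000, True),   # below the band -> never auto-closed
]

would_be_false_positives = [
    number for number, conf, valid in issues
    if 0.95 <= conf < 0.97 and valid
]
```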

The total number of anonymous issues filed within this time frame is 906. Around 450 of those were classified as invalid and closed (with the current confidence threshold of > 0.97).

It likely doesn't make sense to lower the confidence threshold at the moment. Even though the number of false positives is low relative to the total number of anonymous reports, every such report is valuable, and 13 missed reports within 3 weeks would be undesirable for us (in addition to possible false positives at the current confidence threshold, see mozilla/webcompat-team-okrs#256 (comment)).
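The arithmetic behind that conclusion, restated with the numbers from this comment (a sketch, using only the figures reported above):

```python
# Figures from the 3-week observation window reported above.
anonymous_reports = 906       # anonymous issues filed in the window
auto_closed = 450             # closed at the current > 0.97 threshold
false_positives_95_97 = 13    # valid issues in the 0.95-0.97 band

# Share of all anonymous reports that would be wrongly auto-closed
# by lowering the threshold to 0.95:
fp_share = false_positives_95_97 / anonymous_reports  # about 1.4%
```

About 1.4% of anonymous reports would be lost; small in relative terms, but each one is a valid report closed without human review.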

@ksy36 ksy36 closed this as completed Jun 22, 2022