add normalisation to confidence scores #4902

akelad · 2019-12-04T14:45:36Z

Proposed changes:

fixes https://github.com/RasaHQ/research/issues/57
normalize softmax confidence scores

Status (please check what you already did):

added some tests for the functionality
updated the documentation
updated the changelog (please check changelog for instructions)
reformat files using black (please check Readme for instructions)

akelad · 2019-12-04T14:48:43Z

@Ghostvv I started a PR on this. A few questions before I polish it up and request a review:

I would lean towards making the default True, what do you think? Though then I don't think we can put it in a minor :D
Should we use a lower number than 10 for the scores? I know we discussed 5 as well.

Ghostvv · 2019-12-04T15:26:35Z

I would use LABEL_RANKING_LENGTH as the normalization length; otherwise it is not clear how to report the other scores.
True by default: yes. I'm not sure about the version. It does not even break models, but it risks breaking the whole bot's functionality.
Did you test whether the scores become more intuitive after normalization?

akelad · 2019-12-04T15:30:05Z

"I would use LABEL_RANKING_LENGTH as normalization length, otherwise it is not clear how to report other scores in there."

as zero 😂

No, I haven't tested yet; I still need to do that. I just wanted to get what I had done onto a PR first.
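
The idea discussed so far (keep the top scores, report the rest as zero, rescale what is kept) can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name `normalize_top_k` and its pure-Python list handling are assumptions made for the example, and returning the input unchanged for `k < 1` is a choice of this sketch.

```python
from typing import List


def normalize_top_k(confidences: List[float], k: int) -> List[float]:
    """Keep the k largest scores, zero the rest, rescale the kept ones to sum to 1.

    Hypothetical helper for illustration only; not the function added in this PR.
    """
    if k < 1 or not confidences:
        # Assumption of this sketch: a non-positive k means "leave scores alone".
        return list(confidences)

    # Indices of the k largest scores.
    order = sorted(range(len(confidences)), key=lambda i: confidences[i], reverse=True)
    kept = set(order[:k])

    total = sum(confidences[i] for i in kept)
    if total <= 0:
        # Nothing sensible to rescale by, so return a copy of the input.
        return list(confidences)

    return [c / total if i in kept else 0.0 for i, c in enumerate(confidences)]


scores = [0.4, 0.3, 0.2, 0.1]  # made-up softmax output
print(normalize_top_k(scores, 2))
```

With `k` set to the ranking length, every score outside the reported ranking becomes zero, which matches the "as zero" remark above.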

tmbo · 2019-12-10T10:48:11Z

I don't think this should go into a patch release; please change the base to master. The next minor is next week, so that would be the last chance this year.

Ghostvv · 2019-12-11T12:16:17Z

@akelad could you please add normalization to EmbeddingPolicy as well?

erohmensing · 2019-12-13T03:33:38Z

Histograms for Sara on a held-out test set with no normalization, normalization of the top 10 intents, and normalization of the top 5 intents, in that order.

erohmensing · 2019-12-13T03:36:36Z

@Ghostvv I think it is just as easy to make the number configurable as to make it a boolean. I've implemented it that way here (same for the embedding policy), where 0 means no normalization. With respect to how to report the other intents: do you think it makes sense to always report only the number of intents that we have normalized, if we normalized them (essentially doing "I would use LABEL_RANKING_LENGTH as normalization length", but in reverse)? That's also how I went about it here.

I missed that you wanted it to default to true, but figured I can wait to change that until we decide what the default number should be. I included histograms for Sara; scores definitely become more intuitive with some level of normalization, I just don't know how far to go with it.
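
A rough sketch of the reporting behaviour described in this comment: the number of normalized intents is configurable, 0 switches normalization off, and only the normalized intents are reported. The function name `build_intent_ranking`, the dictionary input, and the intent names are all hypothetical and chosen only for this example.

```python
from typing import Dict, List, Tuple


def build_intent_ranking(
    scores: Dict[str, float], ranking_length: int
) -> List[Tuple[str, float]]:
    """Return (intent, confidence) pairs sorted by confidence.

    Hypothetical helper: with ranking_length < 1 the raw scores are reported
    for all intents; otherwise only the top ranking_length intents are
    reported, rescaled so that they sum to 1.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if ranking_length < 1:
        return ranked  # 0 means no normalization and no truncation

    top = ranked[:ranking_length]
    total = sum(conf for _, conf in top)
    if total <= 0:
        return top
    return [(name, conf / total) for name, conf in top]


raw = {"greet": 0.5, "bye": 0.3, "thanks": 0.2}  # made-up intent scores
print(build_intent_ranking(raw, 2))  # two intents, rescaled
print(build_intent_ranking(raw, 0))  # all three intents, untouched
```

Tying the report length to the normalization length avoids having to decide what to show for the intents that were left out.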

rasa/core/policies/embedding_policy.py

wochinge · 2019-12-16T08:43:25Z

@Ghostvv @erohmensing What's the state with this? Are you still planning to get this into Rasa 1.6?

Ghostvv

We need to introduce it to the response selector as well. Please add docs.

rasa/nlu/classifiers/__init__.py

rasa/core/policies/embedding_policy.py

rasa/nlu/classifiers/embedding_intent_classifier.py

erohmensing · 2019-12-17T03:57:55Z

Everything should be implemented. It could use a few tests; let me know if you have any ideas on how to go about those (we could count the number of non-zero actions and the length of the intent output?). Also feel free to change how I've explained it because 🤷‍♂

Would really like to get this on the train 🚋

Ghostvv

Also, please add tests: we could create a couple of classifiers/policies with different ranking_length values (including 0) and check that the output is what we expect.
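
One possible shape for such a test is sketched below. `StubClassifier` is a made-up stand-in, not a Rasa class, and the raw scores are invented; the sketch only illustrates checking several `ranking_length` values (including 0) by counting non-zero confidences and checking that the kept scores sum to 1.

```python
from typing import List

RAW = [0.35, 0.25, 0.2, 0.15, 0.05]  # made-up softmax output, sorted descending


class StubClassifier:
    """Hypothetical stand-in for a classifier or policy with a ranking_length option."""

    def __init__(self, ranking_length: int) -> None:
        self.ranking_length = ranking_length

    def predict(self) -> List[float]:
        if self.ranking_length < 1:
            return list(RAW)  # no normalization requested
        order = sorted(range(len(RAW)), key=lambda i: RAW[i], reverse=True)
        kept = set(order[: self.ranking_length])
        total = sum(RAW[i] for i in kept)
        return [c / total if i in kept else 0.0 for i, c in enumerate(RAW)]


def check_output(ranking_length: int) -> None:
    out = StubClassifier(ranking_length).predict()
    if ranking_length < 1:
        # ranking_length 0: scores must come back untouched.
        assert out == RAW
    else:
        # At most ranking_length scores stay non-zero, and they sum to 1.
        assert sum(1 for c in out if c > 0) == min(ranking_length, len(RAW))
        assert abs(sum(out) - 1.0) < 1e-9


for n in (0, 1, 3, 10):
    check_output(n)
```

In a real test suite the loop would more likely be a parametrized test case run against the actual classifiers and policies.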

rasa/nlu/classifiers/embedding_intent_classifier.py

rasa/core/policies/embedding_policy.py

rasa/nlu/classifiers/embedding_intent_classifier.py

Ghostvv · 2019-12-17T09:41:36Z

Sorry, I accidentally reviewed an old commit, but the comments are still valid.

wochinge · 2019-12-18T12:51:31Z

@erohmensing @Ghostvv I'm sorry, but this missed the release train.

rasa/core/policies/embedding_policy.py

rasa/nlu/classifiers/embedding_intent_classifier.py

rasa/utils/train_utils.py

tests/core/test_policies.py

tmbo added this to the Rasa 1.6 milestone Dec 10, 2019

akelad and others added 5 commits December 12, 2019 22:28

add normalisation to confidence scores

9f05641

add configuration parameter for normalization

d0a50b7

fix label confidence not being updated

5d7df4d

make number to normalize configurable, report that num

4466052

add normalization for embedding policy

606e578

erohmensing force-pushed the add_softmax_normalisation branch from a171132 to 606e578 Compare December 13, 2019 03:28

erohmensing changed the base branch from 1.5.x to master December 13, 2019 03:28

erohmensing requested a review from Ghostvv December 13, 2019 03:36

remove debug logging

792eeb1

Ghostvv reviewed Dec 13, 2019

rasa/core/policies/embedding_policy.py (outdated)

Ghostvv suggested changes Dec 16, 2019

erohmensing added 6 commits December 16, 2019 21:13

Merge branch 'master' into add_softmax_normalisation

13cb22a

persist policy loss type

4996ca2

add response selector, update default, rename parameter

efa399f

clean up code

93ac058

changelog and docs

0d13ae4

fix policy tests

0965009

erohmensing requested a review from Ghostvv December 17, 2019 03:58

Ghostvv suggested changes Dec 17, 2019

erohmensing added 2 commits December 18, 2019 07:44

truncate labels

2087fc8

Merge branch 'master' into add_softmax_normalisation

3528a1c

tmbo removed this from the Rasa 1.6 milestone Dec 19, 2019

erohmensing added 11 commits January 15, 2020 03:41

Merge branch 'master' into add_softmax_normalisation

36e6857

move migration content

1c35df9

move normalization to calculate_message_sim

5da58cb

make normalization method more general

7d1911a

add test for margin loss type not undergoing normalization

d6395cd

update changelog

1ac4bea

use hardcoded default ranking length in test

90d6192

use old method for truncating

4271286

handle some edge cases

a0fc186

explicitly do not call normalization if ranking length <1

fecb858

use correct attribute

bb9cd0c

erohmensing requested a review from Ghostvv January 15, 2020 12:07

Ghostvv previously requested changes Jan 15, 2020

rasa/core/policies/embedding_policy.py (outdated)

rasa/nlu/classifiers/embedding_intent_classifier.py (outdated)

rasa/utils/train_utils.py (outdated)

tests/core/test_policies.py

akelad added this to the Rasa 1.7 milestone Jan 23, 2020

Ghostvv and others added 6 commits January 23, 2020 10:46

Merge branch 'master' into add_softmax_normalisation

1103fb0

Update rasa/nlu/classifiers/embedding_intent_classifier.py

7d933ee

Update rasa/core/policies/embedding_policy.py

ad2beab

Update rasa/utils/train_utils.py

d76438f

black

1741307

don't mutate an argument

f57624a

Ghostvv approved these changes Jan 23, 2020

add norm test

97662f1

Ghostvv merged commit 13d98a9 into master Jan 23, 2020

Ghostvv deleted the add_softmax_normalisation branch January 23, 2020 14:47

Ghostvv mentioned this pull request Feb 3, 2020

Confidence level down after upgrading the Rasa version from 1.4.3 to 1.6.2 #5163

Closed
