Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dense dimension of a sparse input shouldn't be set to a dimension of a dense input #6555

Closed
Ghostvv opened this issue Sep 3, 2020 · 8 comments · Fixed by #6563
Closed

Dense dimension of a sparse input shouldn't be set to a dimension of a dense input #6555

Ghostvv opened this issue Sep 3, 2020 · 8 comments · Fixed by #6563
Assignees
Labels
area:rasa-oss 🎡 Anything related to the open source Rasa framework type:enhancement ✨ Additions of new features or changes to existing ones, should be doable in a single PR type:experiments 🔬 To verify if the feature is working as expected we need to run some experiments.

Comments

@Ghostvv
Copy link
Contributor

Ghostvv commented Sep 3, 2020

for historical reasons, we set dense dimension of a sparse input to a dimension of a dense input.

since we concatenate instead of sum, there is no reason to do it, I think it should be removed and dense dim should be reduced to smth like a 100 if doesn't hinder performance, but it should reduce train time

Some experiments are required to test performance

@Ghostvv Ghostvv added type:enhancement ✨ Additions of new features or changes to existing ones, should be doable in a single PR area:rasa-oss 🎡 Anything related to the open source Rasa framework type:experiments 🔬 To verify if the feature is working as expected we need to run some experiments. labels Sep 3, 2020
@Ghostvv Ghostvv added this to the 2.0rc1 Rasa Open Source milestone Sep 3, 2020
@tabergma tabergma self-assigned this Sep 3, 2020
@tabergma
Copy link
Contributor

tabergma commented Sep 3, 2020

@Ghostvv Just to be clear, we should remove this line https://github.com/RasaHQ/rasa/blob/master/rasa/nlu/classifiers/diet_classifier.py#L1349 and the output dimension of the sparse_to_dense should be set to 100? Do we really want to fix this dimension? If we have a sparse feature dimension < 100, it might not be the best choice. Can't we set it to half of the sparse feature dimension size (or something similar)? Will just try it and see what the experiments say.

@Ghostvv
Copy link
Contributor Author

Ghostvv commented Sep 3, 2020

yes, I think we should simply remove this line. I think we should set 128 or smth here:

DENSE_DIMENSION: {TEXT: 512, LABEL: 20},

or 256, but 512 is definitely too much.
Same with concat

@Ghostvv
Copy link
Contributor Author

Ghostvv commented Sep 3, 2020

half of sparse is huge. With all our sparse features, like char, it can be 10000

@b-quachtran
Copy link
Contributor

@Ghostvv Just a note, the change to the default dense & concat dimensions from 512 to 128 ended up causing issues with a customer's NLU model when they upgraded to Rasa 2.0.

Specifically, they were seeing problems where messages that should've triggered nlu_fallback were classified as an existing unrelated intent with relatively high confidence.

For example, the message "feed the cat" was classified as "affirm" with confidence 0.8:

{
  "text": "feed the cat",
  "intent": {
    "id": 3201025162187566927,
    "name": "affirm",
    "confidence": 0.8004463315010071
  },

@Ghostvv
Copy link
Contributor Author

Ghostvv commented Nov 18, 2020

@b-quachtran what was the intent of it in previous model? Are you sure it doesn't fluctuate from training to training with 512?

@b-quachtran
Copy link
Contributor

b-quachtran commented Nov 18, 2020

It gets classified as a "talk_to_human" intent with low confidence with their 1.10.14 model:

feed the cat
{
  "intent": {
    "name": "talk_to_human",
    "confidence": 0.45141956210136414
  },

They're setting a random seed for the classifier

@Ghostvv
Copy link
Contributor Author

Ghostvv commented Nov 18, 2020

but since we changed the default architecture, the effect of a random seed also changed. How's overall performance of the model, did it drop?

@b-quachtran
Copy link
Contributor

They're running evals on the models at the moment. I'll get back to you once they have real performance metrics to share

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:rasa-oss 🎡 Anything related to the open source Rasa framework type:enhancement ✨ Additions of new features or changes to existing ones, should be doable in a single PR type:experiments 🔬 To verify if the feature is working as expected we need to run some experiments.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants