Dense dimension of a sparse input shouldn't be set to a dimension of a dense input #6555
@Ghostvv Just to be clear, we should remove this line https://github.com/RasaHQ/rasa/blob/master/rasa/nlu/classifiers/diet_classifier.py#L1349 and the output dimension of the …
yes, I think we should simply remove this line. I think we should set the default dense dimension (rasa/rasa/nlu/classifiers/diet_classifier.py, line 176 in 0cbd883) to 128 or 256, but 512 is definitely too much. Same with concat.
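A minimal sketch of the coupling being discussed, assuming hypothetical names: `configure_dense_dimension` and the config key below are illustrative only, not the actual diet_classifier.py code. The point is that the sparse features' projection width gets overridden to match the dense features' width, and removing that override lets the smaller configured default apply.

```python
# Hypothetical illustration of the override proposed for removal.

DEFAULTS = {"dense_dimension": 128}  # the smaller default being proposed

def configure_dense_dimension(config: dict, dense_feature_dim: int) -> dict:
    # The line proposed for removal does, in effect, this: override the
    # sparse input's projection width with the dense input's width (e.g. 512),
    # regardless of the configured default.
    config["dense_dimension"] = dense_feature_dim
    return config

# With the override removed, the sparse-to-dense projection keeps the
# configured default (e.g. 128 or 256) no matter how wide the dense features are.
```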
Half of sparse is huge. With all our sparse features, like char-level features, the sparse dimension can be 10000.
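Back-of-the-envelope numbers behind that concern (the 10000 figure is from the comment above; the rest is illustrative):

```python
# Parameter count of a single sparse-to-dense projection layer, ignoring biases.
sparse_dim = 10_000                 # e.g. with char-level count-vector features
half_of_sparse = sparse_dim // 2    # "half of sparse" -> 5000 units
reduced_default = 128               # the proposed smaller default

print(sparse_dim * half_of_sparse)    # 50,000,000 weights
print(sparse_dim * reduced_default)   # 1,280,000 weights, ~39x fewer
```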
@Ghostvv Just a note, the change to the default dense & concat dimensions from 512 to 128 ended up causing issues with a customer's NLU model when they upgraded to Rasa 2.0. Specifically, they were seeing problems where messages that should have triggered other intents were being misclassified with high confidence. For example, the message "feed the cat" was classified as "affirm" with confidence 0.8.
@b-quachtran what intent did it get in the previous model? Are you sure it doesn't fluctuate from training to training with 512?
It gets classified as the "talk_to_human" intent with low confidence under their 1.10.14 model. They're also setting a random seed for the classifier.
But since we changed the default architecture, the effect of the random seed also changed. How's the overall performance of the model, did it drop?
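A small illustration of why a fixed seed doesn't carry across architectures (numpy stands in for the actual TensorFlow initialization; the shapes are illustrative):

```python
import numpy as np

def init_weights(dense_dim: int, seed: int = 42) -> np.ndarray:
    # Same seed every time, but the draw order depends on the requested shape.
    rng = np.random.default_rng(seed)
    return rng.normal(size=(10_000, dense_dim))

w_old = init_weights(512)  # old default width
w_new = init_weights(128)  # new default width

# Only the first row of draws lines up; everything after diverges,
# so identical seeds no longer produce comparable models.
print(np.allclose(w_old[:, :128], w_new))  # False
```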
They're running evals on the models at the moment. I'll get back to you once they have real performance metrics to share.
For historical reasons, we set the dense dimension of a sparse input to the dimension of a dense input. Since we concatenate instead of sum, there is no reason to do this: the line should be removed and the dense dimension reduced to something like 100, if that doesn't hinder performance, since it should reduce training time. Some experiments are required to test performance.
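A toy numpy sketch of the sum-vs-concatenate point above (the shapes are illustrative, not the model's real ones):

```python
import numpy as np

batch = 32
sparse_projected = np.zeros((batch, 128))  # sparse features after a small projection
dense_features = np.zeros((batch, 512))    # e.g. pretrained dense embeddings

# Summing requires matching widths -- the historical reason for the coupling:
# sparse_projected + dense_features        # ValueError: shape mismatch

# Concatenation imposes no such constraint, so the projection can stay small:
combined = np.concatenate([sparse_projected, dense_features], axis=-1)
print(combined.shape)  # (32, 640)
```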