softmax activation in GRU #2

fgvbrt · 2017-03-08T22:40:22Z

Hi, I noticed that you use a softmax activation inside the GRU cell. As I understand it, in this case the activations at each timestep will not sum to 1. Here is a link to the GRU cell (the terminal GRU has the same issue): https://github.com/HIPS/molecule-autoencoder/blob/master/autoencoder/train_autoencoder.py#L225

I also checked with your version of Keras that the output does not sum to 1; here is a link to the gist: https://gist.github.com/fgvbrt/1f2e1828c6d8c0eb88614f14c60874ad

Was this done on purpose, or was it a mistake?
Thanks in advance.

fgvbrt · 2017-03-16T17:46:45Z

I also want to add that this softmax in the GRU would be valid if the initial state represented a probability distribution (i.e. the initial state sums to one), but in the code the state is initialized with zeros.
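The zero-initialization point can be illustrated with a minimal pure-Python sketch of a single GRU step. This is not the repository's code: the update rule `h = z * h_prev + (1 - z) * candidate` is assumed to match the Keras GRU convention, biases are omitted, and the weight names, sizes, and random values are hypothetical. With `h_prev` all zeros, the first output is `(1 - z) * softmax(...)`, and since every gate value lies strictly between 0 and 1, the sum falls below 1.

```python
import math
import random


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def gru_step(x, h_prev, params):
    # One GRU step with softmax as the candidate activation.
    # Assumed update rule: h = z * h_prev + (1 - z) * candidate (biases omitted).
    Wz, Uz, Wr, Ur, Wh, Uh = params
    z = [sigmoid(a + b) for a, b in zip(matvec(Wz, x), matvec(Uz, h_prev))]
    r = [sigmoid(a + b) for a, b in zip(matvec(Wr, x), matvec(Ur, h_prev))]
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    cand = softmax([a + b for a, b in zip(matvec(Wh, x), matvec(Uh, gated))])
    return [zi * hp + (1.0 - zi) * c for zi, hp, c in zip(z, h_prev, cand)]


def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_in, n_hidden = 3, 4  # hypothetical sizes, for illustration only
params = (
    rand_matrix(rng, n_hidden, n_in), rand_matrix(rng, n_hidden, n_hidden),  # Wz, Uz
    rand_matrix(rng, n_hidden, n_in), rand_matrix(rng, n_hidden, n_hidden),  # Wr, Ur
    rand_matrix(rng, n_hidden, n_in), rand_matrix(rng, n_hidden, n_hidden),  # Wh, Uh
)
x = [rng.uniform(-1.0, 1.0) for _ in range(n_in)]

h0 = [0.0] * n_hidden           # zero initial state, as described above
h1 = gru_step(x, h0, params)    # equals (1 - z) * softmax(...) elementwise
print("sum of first-step output:", sum(h1))
```

The sketch only checks the zero-initialized case. Whether a normalized initial state keeps every later step normalized also depends on how the update gate `z` is applied across units, which this example does not examine.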

duvenaud · 2017-03-16T17:54:12Z

Thanks for catching this! I'll bring it up with Rafa G-B next time I talk to him.
