Convolutional Sequence to Sequence Learning #1

Open · flrngel opened this issue Jan 29, 2018 · 0 comments
flrngel commented Jan 29, 2018

Convolutional Sequence to Sequence Learning

aka Fairseq

https://arxiv.org/pdf/1705.03122.pdf

3. A Convolutional Architecture

3.1. Position Embeddings

p = (p_1, ..., p_m): absolute position embeddings

e = (w_1 + p_1, ..., w_m + p_m): input element representations, the sum of the word embeddings w and the position embeddings p

See also

"Positional encoding" from Attention is all you need

3.2. Convolutional Block Structure

(figure from https://norman3.github.io/papers/docs/fairseq.html)

In the figure above, the kernel width is 3 and the convolutional block stack size is 1. A sketch of one block follows below.
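
A minimal PyTorch sketch of one convolutional block (names and the padding choice are illustrative, not fairseq's code): a 1-D convolution with kernel width 3, a GLU non-linearity, and a residual connection scaled by sqrt(0.5) (the scaling is from Section 3.4). Same-padding keeps the sequence length fixed; the decoder would instead pad causally.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvBlock(nn.Module):
    """Sketch of one block from Section 3.2."""
    def __init__(self, dim, kernel_width=3):
        super().__init__()
        # The convolution outputs 2*dim channels, which GLU splits
        # into candidate values and gates.
        self.conv = nn.Conv1d(dim, 2 * dim, kernel_width,
                              padding=kernel_width // 2)

    def forward(self, x):                       # x: (batch, seq_len, dim)
        residual = x
        h = self.conv(x.transpose(1, 2))        # (batch, 2*dim, seq_len)
        h = F.glu(h, dim=1).transpose(1, 2)     # (batch, seq_len, dim)
        return (h + residual) * math.sqrt(0.5)  # halve variance of the sum
```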

3.3. Multi-step Attention

The attention query for decoder state i combines the current decoder state h_i with a residual connection from the previous target embedding g_i: d_i = W_d h_i + b_d + g_i.

Attention weights are the softmax-normalized dot products between d_i and the encoder outputs z_j; the resulting context sums z_j + e_j (encoder output plus input embedding) and is added back to the decoder state.
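
A minimal PyTorch sketch of the attention step for a single decoder layer (tensor names follow the paper's notation; the function itself is my own, not fairseq's):

```python
import torch
import torch.nn.functional as F

def multistep_attention(h, g, z, e, W_d, b_d):
    """Sketch of Section 3.3 for one decoder layer.

    h: (batch, tgt_len, dim)  decoder states of this layer
    g: (batch, tgt_len, dim)  previous target element embeddings
    z: (batch, src_len, dim)  encoder outputs (last encoder block)
    e: (batch, src_len, dim)  encoder input embeddings
    """
    d = h @ W_d + b_d + g            # d_i = W_d h_i + b_d + g_i
    scores = d @ z.transpose(1, 2)   # dot products d_i . z_j
    a = F.softmax(scores, dim=-1)    # attention weights a_ij
    c = a @ (z + e)                  # context c_i = sum_j a_ij (z_j + e_j)
    return c                         # added back to h downstream
```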

3.4. Normalization Strategy

Parts of the network are scaled so that the variance of activations is preserved (e.g., sums of residual connections are multiplied by sqrt(0.5), and the attention context, a sum over m source vectors, is scaled by m * sqrt(1/m)); this stabilizes learning. A sketch of the scaling rules follows below.
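
A minimal sketch of those two scaling rules (the helper names are hypothetical, for illustration only):

```python
import math

def scaled_residual(x, y):
    # Sum of two terms with equal variance, scaled to halve the variance.
    return (x + y) * math.sqrt(0.5)

def scaled_attention_context(c, num_source_vectors):
    # The context is a sum over m source vectors; scale by m * sqrt(1/m)
    # to restore the original scale of the inputs.
    m = num_source_vectors
    return c * (m * math.sqrt(1.0 / m))
```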
