Representation Learning with Contrastive Predictive Coding #24

Open
flrngel opened this issue Sep 19, 2018 · 2 comments

flrngel commented Sep 19, 2018

https://arxiv.org/abs/1807.03748
big fan of Aaron van den Oord

Abstract

  • proposes a universal unsupervised learning approach to extract useful representations from high-dimensional data: CPC (Contrastive Predictive Coding)
  • uses a probabilistic contrastive loss which induces the latent space to capture information that is useful for predicting future samples; the loss is made tractable with negative sampling

1. Introduction

  • unsupervised learning is an important stepping stone towards robust and generic representation learning
  • the idea of predictive coding comes from neuroscience; related successes include Word2Vec, image colorization, etc.
  • paper proposes
    • compress high-dimensional data into a compact latent embedding space in which conditional predictions are easier to model
    1. predict multiple future steps with a powerful autoregressive model
    2. use an NCE-based loss
    3. train the resulting model (CPC) end-to-end
  • the CPC model achieves strong results across a variety of domains

2. Contrastive Predictive Coding

(Figure 1 from the paper: overview of Contrastive Predictive Coding)

2.1. Motivation and Intuitions

  • in time-series and high-dimensional data modeling, next-step prediction exploits the local smoothness of the signal
    • when predicting further into the future, the amount of shared information becomes much lower and the model needs to infer more global structure; such slowly varying aspects are the 'slow features'
  • a high-level latent variable such as a class label carries far less information than the high-dimensional data itself, so directly modeling p(x|c) can be wasteful; the paper instead maximizes the mutual information between x and the context c (see the equation below)
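
For reference, the quantity the paper maximizes (its Eq. (1)) is the mutual information between the signal x and the context c:

$$I(x; c) = \sum_{x, c} p(x, c) \log \frac{p(x \mid c)}{p(x)}$$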

2.2. Contrastive Predictive Coding

  • g_enc maps the input sequence of observations x_t to latent representations z_t = g_enc(x_t)
  • an autoregressive model g_ar summarizes all z_<=t in the latent space and produces a context latent representation c_t = g_ar(z_<=t) (see the code sketch after this subsection's bullets)

it means the model does not predict x_{t+k} directly with a generative model; instead it models a density ratio that preserves the mutual information between x_{t+k} and c_t (reconstructed below)
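
As I read the paper, the equation behind this note is its density ratio (Eq. (2)), modeled in practice with a simple log-bilinear score (Eq. (3)):

$$f_k(x_{t+k}, c_t) \propto \frac{p(x_{t+k} \mid c_t)}{p(x_{t+k})}, \qquad f_k(x_{t+k}, c_t) = \exp\left(z_{t+k}^{\top} W_k\, c_t\right)$$

where W_k is a separate linear prediction matrix for each future step k.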

  • in the proposed model, either z_t or c_t can be used as the representation for downstream tasks
  • c_t is useful when extra context from the past helps
  • z_t alone might not contain enough information to capture, e.g., phonetic content
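
A minimal sketch of the encoder/context pair, assuming the audio setup described in the paper (strided 1-D convolutions for g_enc, a GRU for g_ar); the layer sizes below are illustrative, not the authors' code:

```python
import torch
import torch.nn as nn

class CPCEncoder(nn.Module):
    """g_enc: maps a raw input sequence to latent vectors z_t."""
    def __init__(self, z_dim=512):
        super().__init__()
        # Strided convolutions downsample the signal; each output step is one z_t.
        self.conv = nn.Sequential(
            nn.Conv1d(1, z_dim, kernel_size=10, stride=5, padding=3), nn.ReLU(),
            nn.Conv1d(z_dim, z_dim, kernel_size=8, stride=4, padding=2), nn.ReLU(),
            nn.Conv1d(z_dim, z_dim, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.Conv1d(z_dim, z_dim, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.Conv1d(z_dim, z_dim, kernel_size=4, stride=2, padding=1), nn.ReLU(),
        )

    def forward(self, x):                     # x: (batch, 1, samples)
        return self.conv(x).transpose(1, 2)   # -> (batch, T, z_dim)

class CPCContext(nn.Module):
    """g_ar: summarizes z_<=t into a context vector c_t."""
    def __init__(self, z_dim=512, c_dim=256):
        super().__init__()
        self.gru = nn.GRU(z_dim, c_dim, batch_first=True)

    def forward(self, z):                     # z: (batch, T, z_dim)
        c, _ = self.gru(z)
        return c                              # c[:, t] is c_t
```

Either z (per-step) or c (context) can then be fed to a downstream linear classifier, which is how the paper evaluates the learned representations.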

2.3. Noise Contrastive Estimation Loss

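As I read the paper, the loss in this section is the InfoNCE loss (its Eq. (4)), together with the mutual-information bound it induces:

$$\mathcal{L}_N = -\,\mathbb{E}_{X}\left[\log \frac{f_k(x_{t+k}, c_t)}{\sum_{x_j \in X} f_k(x_j, c_t)}\right], \qquad I(x_{t+k}; c_t) \ge \log N - \mathcal{L}_N$$

where X = {x_1, ..., x_N} contains one positive sample drawn from p(x_{t+k} | c_t) and N-1 negatives drawn from the proposal distribution p(x_{t+k}).

A hedged sketch of this loss (one possible implementation, not the authors' code), using the other items in the mini-batch as negatives:

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z_future, c, W_k):
    """InfoNCE-style loss for one prediction step k.

    z_future: (B, z_dim) true latents z_{t+k}
    c:        (B, c_dim) context vectors c_t
    W_k:      (z_dim, c_dim) linear prediction matrix for step k
    """
    pred = c @ W_k.t()                    # (B, z_dim): W_k c_t
    logits = pred @ z_future.t()          # (B, B): logits[i, j] = z_j^T W_k c_i
    labels = torch.arange(len(c), device=c.device)  # positive for c_i is z_i
    return F.cross_entropy(logits, labels)
```

Negatives here are simply the other futures in the mini-batch; the choice of negative sampling strategy is a design knob.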

2.4. Related work

  • triplet losses that use a max-margin to separate positive from negative examples
  • time-contrastive learning, which minimizes the distance between embeddings of multiple viewpoints of the same scene and maximizes the distance between embeddings extracted from different timesteps
  • in Word2Vec, neighbouring words are predicted using a contrastive loss

3. Experiments

3.1. Audio

(audio results from the paper)

3.2. Vision

model
(model architecture figure from the paper)

experiment result
(results table from the paper)

3.3. Reinforcement Learning

(reinforcement learning results from the paper)

4. Conclusions

  • CPC combines autoregressive modeling and noise-contrastive estimation with intuitions from predictive coding to learn abstract representations in an unsupervised fashion

Notes

  • best paper of this week
  • I will implement this paper in code as soon as possible
@paganpasta

Hey @flrngel! Thanks for the notes and the corresponding repository. I was trying to understand the statement made by the authors which you cite in Section 2.1 (last paragraph). The statement reads "... by maximising the MI between the encoded signals (which is bounded by the MI between the input signals) ...".
Could you please elaborate on it? Does it mean I(g(x); g(x+t)) < I(x; x+t), and if so, why?
Sorry for the naive question.


flrngel commented Jan 29, 2020

@MacroMayhem Equation (2) and Appendix A will help you understand, I guess.
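
A quick way to see it: since g(x) and g(x+t) are (deterministic) functions of x and x+t respectively, applying the data processing inequality twice gives I(g(x); g(x+t)) <= I(x; g(x+t)) <= I(x; x+t), so maximizing the MI between the encodings pushes up a lower bound on the MI between the inputs.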

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants