PyTorch Recurrent Conditional Adversarial Autoencoder (GAN): Generate Eminem lyrics from a continuous space

Based on ideas from Samuel Bowman's "Generating Sentences from a Continuous Space", with the following changes:

  • A discriminator is used instead of the KL-divergence term
  • The decoder is conditioned on text style: a sample from the continuous space can be decoded in the style of Eminem lyrics or plain text
  • The encoder is a simple BiLSTM without highway layers or attention
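
The first two changes can be illustrated with a minimal sketch. It is not taken from this repository: the names `LatentDiscriminator`, `adversarial_losses`, `condition_on_style`, and `STYLES`, the layer sizes, and the standard normal prior are all assumptions made for illustration. It shows a discriminator that pushes encoder codes toward the prior (in place of a KL term) and a decoder input built by concatenating a style embedding to the latent code.

```python
import torch
import torch.nn as nn

STYLES = {"eminem": 0, "plain": 1}  # hypothetical style ids


class LatentDiscriminator(nn.Module):
    """Hypothetical MLP that scores whether a latent code came from the prior."""

    def __init__(self, latent_size: int, hidden_size: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, 1),  # raw logit; the sigmoid lives in the loss
        )

    def forward(self, z):
        return self.net(z).squeeze(-1)


def adversarial_losses(disc, z_encoded):
    """Return (discriminator_loss, encoder_loss) for one batch of encoder codes."""
    bce = nn.BCEWithLogitsLoss()
    z_prior = torch.randn_like(z_encoded)  # assumed N(0, I) prior
    ones = torch.ones(z_encoded.size(0), device=z_encoded.device)
    zeros = torch.zeros(z_encoded.size(0), device=z_encoded.device)
    # Discriminator: prior samples are labeled 1, encoder codes 0.
    # detach() keeps this loss from sending gradients into the encoder.
    d_loss = bce(disc(z_prior), ones) + bce(disc(z_encoded.detach()), zeros)
    # Encoder: tries to make its codes indistinguishable from prior samples.
    e_loss = bce(disc(z_encoded), ones)
    return d_loss, e_loss


def condition_on_style(z, style_ids, style_embedding):
    """Concatenate a learned style vector to each latent code for the decoder."""
    return torch.cat([z, style_embedding(style_ids)], dim=-1)


torch.manual_seed(42)
disc = LatentDiscriminator(latent_size=16)
z_encoded = torch.randn(8, 16, requires_grad=True)  # stand-in for encoder output
d_loss, e_loss = adversarial_losses(disc, z_encoded)

style_embedding = nn.Embedding(len(STYLES), 4)
style_ids = torch.full((8,), STYLES["eminem"], dtype=torch.long)
decoder_input = condition_on_style(z_encoded, style_ids, style_embedding)
```

In a training loop, the two losses would typically be stepped with separate optimizers, one for the discriminator and one for the encoder. Swapping `STYLES["eminem"]` for `STYLES["plain"]` while keeping the same latent code is what would let one sample be decoded in either style.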

The models were trained on a custom dataset with text samples in two different styles: short couplets from Eminem lyrics and short sentences from the Multi30K dataset.

Sampling examples


Decoded w. style `Eminem lyrics`:
	the morning rain clouds up my window
	and i ca n't see at all
	and even if i could it 'd all be gray
	but your picture on my wall <eos>

Decoded w. style `Plain text`:
	two men are playing professional hockey .
	a man in a blue shirt is fixing a yellow and white speed train .
	a man is standing on a ladder painting bricks . <eos>

Decoded w. style `Eminem lyrics`:
	and i do n't even know you slim ,
	i 'm not a little skeptical who i hang up this
	when i 'm gone , i 'm going back on the mall
	i wanna leave the show to

Decoded w. style `Plain text`:
	two men are playing hockey , one is singing karaoke .
	a man in a blue shirt is playing a keyboard and singing into a microphone .
	a man in a black shirt is playing a trumpet . <eos>

Usage

To train the model, use the Jupyter notebook RCAAE.ipynb or run:

python main.py

Parameters

  • --num-epochs default: 100
  • --batch-size default: 64
  • --learning-rate default: 0.0001
  • --dropout default: 0.3
  • --hidden-size LSTM hidden size, default: 500
  • --seed default: 42
  • --embeddings-size default: 300
  • --vectors pretrained word vectors, default: 'fasttext.en.300d' (vectors are loaded automatically by the torchtext library)
  • --cuda CUDA device number, default: 0
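
As a rough illustration of how these flags and defaults fit together, here is an `argparse` sketch that mirrors the list above. The actual parser in `main.py` is not shown in this README, so the function name `build_parser`, the argument types, and the help strings are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags and defaults listed above."""
    parser = argparse.ArgumentParser(description="RCAAE training options (sketch)")
    parser.add_argument("--num-epochs", type=int, default=100)
    parser.add_argument("--batch-size", type=int, default=64)
    parser.add_argument("--learning-rate", type=float, default=0.0001)
    parser.add_argument("--dropout", type=float, default=0.3)
    parser.add_argument("--hidden-size", type=int, default=500,
                        help="LSTM hidden size")
    parser.add_argument("--seed", type=int, default=42)
    parser.add_argument("--embeddings-size", type=int, default=300)
    parser.add_argument("--vectors", type=str, default="fasttext.en.300d",
                        help="pretrained word vectors")
    parser.add_argument("--cuda", type=int, default=0,
                        help="CUDA device number")
    return parser


# Equivalent to: python main.py --batch-size 32 --dropout 0.5
args = build_parser().parse_args(["--batch-size", "32", "--dropout", "0.5"])
```

Flags that are not passed keep their defaults, so here `args.num_epochs` stays at 100 while `args.batch_size` becomes 32.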