Rappers

Automatic rap lyric generation tool

Requirements

invoke
tensorflow
juman
kytea
chainer

Usage

Obtain lyrics

# pip install beautifulsoup
python getlyrics.py -v > output.tsv

Construct corpus

Extract the lyrics archive, then run the following command to produce data/juman_input.txt:

python preprocess.py -crawl data/lyrics_shonan_s27_raw.tsv

Feed the cleaned crawled corpus to juman:

juman < data/juman_input.txt > data/juman_out.txt

Process the juman output file:

python preprocess.py -juman data/juman_out.txt
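The internals of preprocess.py are not described here, so the following is only a minimal sketch of what reading juman output can look like. It assumes the standard JUMAN line layout (surface form first, space-separated fields, an optional quoted field containing 代表表記:..., lines starting with "@" for alternative analyses, and EOS at the end of each sentence). The function name parse_juman is hypothetical and is not part of this repository.

```python
import re

# Assumption: the representative form appears as 代表表記:<form>/<reading>
# inside the quoted semantic field at the end of a JUMAN output line.
DAIHYOU_RE = re.compile(r"代表表記:([^\s\"]+)")


def parse_juman(lines):
    """Return a list of sentences, each a list of (surface, daihyou) pairs.

    Hypothetical helper for illustration; not the repository's own parser.
    """
    sentences, current = [], []
    for raw in lines:
        line = raw.rstrip("\n")
        if line == "EOS":  # sentence boundary
            sentences.append(current)
            current = []
            continue
        # Skip blank lines and alternative analyses (lines starting with "@").
        if not line or line.startswith("@"):
            continue
        surface = line.split(" ", 1)[0]
        match = DAIHYOU_RE.search(line)
        # Fall back to the surface form when no 代表表記 is present.
        daihyou = match.group(1) if match else surface
        current.append((surface, daihyou))
    return sentences
```

Usage would be along the lines of parse_juman(open("data/juman_out.txt", encoding="utf-8")), assuming the file is UTF-8 encoded.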

Preprocessing is now finished. The data/ folder will contain three files:

string_corpus.txt: string corpus for LSTM training (one sentence per line; each song is separated from the previous one by one line)
hiragana_corpus.txt: hiragana corpus for FFNN training (one sentence per line; each song is separated from the previous one by one line)
daihyou_vocab.p: vocabulary file mapping surface forms (keys) to 代表表記, the representative notation (values); used to look up embeddings during LSTM training
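As a rough illustration of how these files could be consumed, the sketch below reads a corpus file into a list of songs and loads the vocabulary. It rests on two assumptions that the text above does not confirm: that the separator between songs is a blank line, and that daihyou_vocab.p is a pickled Python dict. The function names read_songs and load_vocab are hypothetical.

```python
import pickle


def read_songs(path):
    """Split a corpus file into songs (lists of lines).

    Assumption: a blank line separates consecutive songs.
    """
    songs, current = [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                current.append(line)
            elif current:  # blank line closes the current song
                songs.append(current)
                current = []
    if current:  # last song may not be followed by a blank line
        songs.append(current)
    return songs


def load_vocab(path):
    """Load the surface form -> 代表表記 mapping.

    Assumption: the .p file is a pickled dict.
    """
    with open(path, "rb") as f:
        return pickle.load(f)
```

If the vocabulary was pickled under Python 2, pickle.load in Python 3 may need an explicit encoding argument to decode its strings. Only unpickle files you trust, since pickle can execute arbitrary code on load.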

Neural Network Language Model

Training

inv train model

Testing

inv test model

Chainer LSTM LM

Training

Run the command below from the chainer_model directory:

python train_lstm_lm.py [--gpu 0]

Training on a GPU is strongly recommended; this code is very slow on CPU.

Generating lines

python generate_seq.py --model trained_model -O output_file N 10000

Rhyme

Build the term-rhyme table from data/string_corpus.txt and data/hiragana_corpus.txt:

python features/make_term_vowel_table.py -v --unknown-terms <path-to-unknown-terms:optional> > <path-to-output-table>

data/term_vowel_table.csv: term-to-vowel table (each row is term,vowels)
data/unknown_terms.txt: terms that have no hiragana form in data/hiragana_corpus.txt. These are currently filtered out of the table above.
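To make the idea of a term-to-vowel row concrete, here is a minimal sketch of reducing a hiragana reading to its vowel sequence. It is an illustration only and is not taken from make_term_vowel_table.py; the names VOWEL_ROWS and vowels are hypothetical, and the handling of small kana, ん, っ and ー reflects assumptions made for this example.

```python
# Hiragana grouped by the vowel they end in (small kana included).
VOWEL_ROWS = {
    "a": "あかさたなはまやらわがざだばぱぁゃ",
    "i": "いきしちにひみりぎじぢびぴぃ",
    "u": "うくすつぬふむゆるぐずづぶぷぅゅ",
    "e": "えけせてねへめれげぜでべぺぇ",
    "o": "おこそとのほもよろをごぞどぼぽぉょ",
}
KANA_TO_VOWEL = {kana: v for v, row in VOWEL_ROWS.items() for kana in row}
SMALL_KANA = set("ぁぃぅぇぉゃゅょ")


def vowels(hiragana):
    """Return the vowel sequence of a hiragana string, e.g. for rhyme matching."""
    out = []
    for ch in hiragana:
        if ch == "ー" and out:
            # Long vowel mark repeats the previous vowel.
            out.append(out[-1])
        elif ch in KANA_TO_VOWEL:
            if ch in SMALL_KANA and out:
                # Combined sounds such as きゃ count as one vowel (a, not i+a).
                out[-1] = KANA_TO_VOWEL[ch]
            else:
                out.append(KANA_TO_VOWEL[ch])
        # Assumption: ん, っ and non-hiragana characters contribute no vowel.
    return "".join(out)


print(vowels("とうきょう"))  # ouou
print(vowels("らっぷ"))      # au
```

Two terms whose vowel sequences share a long common suffix are natural rhyme candidates, which is one reason a table in this form is convenient.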

Next line prediction

python NextLine.py -f data/sample_nextline_prediction_candidates.txt

When processing finishes, the result is written to test_lyrics.txt.

Note: you may need to comment out the following lines in NextLine.py:

if __name__ == "__main__":
    ...
            temp.pop(0)
            temp.pop(-1)
