Welcome to the Radio Murakami repository!
Here, we develop a bot that spouts out quotes as good as the writings of Haruki Murakami himself (okay, to a reasonable approximation). It is based on his writing style as well as quotes and interviews he has given in the past. We write about our work at Daily Inspirational Quotes from Your Favorite Author using Deep Learning.
We source our data from all of Murakami's novels, as well as the various sources listed below:
- Jazz Messenger
- Haruki Murakami Says He Doesn’t Dream. He Writes.
- What’s Needed is Magic: Writing Advice from Haruki Murakami
- Haruki Murakami, The Art of Fiction No. 182
- An Unrealistic Dreamer
- Haruki Murakami: The Moment I Became a Novelist
- Always on the Side of the Egg
- The Best of Haruki Murakami’s Advice Column
- A Conversation with Murakami about Sputnik Sweetheart
- Questions for Murakami about Kafka on the Shore
- The novelist in wartime
- Surreal often more real for author Haruki Murakami
- The Salon: Haruki Murakami
- An Interview with Haruki Murakami
- When I Run I Am in a Peaceful Place
- Haruki Murakami on Parallel Realities
- Free Haruki Murakami Short Stories, Essays, Interviews, Speeches
- All works by Haruki Murakami on The New Yorker
- The Underground Worlds of Haruki Murakami
We have a simple utility here that allows you to scrape tweets of a Twitter account.
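As a rough sketch of that idea using Tweepy (the credentials and account handle below are placeholders, and this is not the repository's actual utility):

```python
import tweepy

# Placeholder credentials -- substitute your own Twitter API keys.
auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")
api = tweepy.API(auth)

# Walk an account's timeline and collect the full text of each tweet.
for tweet in tweepy.Cursor(
    api.user_timeline, screen_name="some_account", tweet_mode="extended"
).items(200):
    print(tweet.full_text)
```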
We fine-tuned a GPT-2 model [1] on the datasets above. You can reproduce this by fine-tuning a pre-trained GPT-2 model on the data from here.
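As a rough sketch of what that fine-tuning looks like with the Hugging Face `transformers` Trainer (the corpus path and hyperparameters are illustrative, and the repository's own training script may differ):

```python
from transformers import (
    DataCollatorForLanguageModeling,
    GPT2LMHeadModel,
    GPT2TokenizerFast,
    TextDataset,
    Trainer,
    TrainingArguments,
)

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Plain-text corpus assembled from the sources above (path is illustrative).
train_dataset = TextDataset(
    tokenizer=tokenizer, file_path="data/murakami.txt", block_size=128
)

# Standard causal language modeling (mlm=False), since GPT-2 is autoregressive.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="./murakami_bot/", num_train_epochs=3),
    data_collator=collator,
    train_dataset=train_dataset,
)
trainer.train()
trainer.save_model("./murakami_bot/")
```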
In this repository, we use `pre-commit` to ensure consistency of formatting. To install it on macOS, run

```
brew install pre-commit
```

Once installed, run the following from the command line in the repository:

```
pre-commit install
```

This installs `pre-commit` as a Git hook, so that it runs and fixes the files covered in its config before each commit.
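The hooks that run are defined in the repository's `.pre-commit-config.yaml`. For reference, a minimal config of this shape looks like the following (the hook shown is illustrative, not necessarily the one this repository uses):

```yaml
# Each entry points at a repository of hooks pinned to a specific revision.
repos:
  - repo: https://github.com/psf/black
    rev: 22.3.0
    hooks:
      - id: black  # run the Black formatter on staged Python files
```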
Run the Docker image by running the shell script `run.sh`. You can also run the trained model directly with

```
python docker/src/main.py <seed phrase> --num-samples <number of samples> --max-length <maximum token length> --model-dir <model weights path>
```

By default, `<number of samples>` is 50, `<maximum token length>` is 100, and `<model weights path>` is `./murakami_bot/`.
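For example, to generate ten samples of at most 80 tokens each from an arbitrary seed phrase (the phrase here is just an illustration):

```
python docker/src/main.py "The well was deep" --num-samples 10 --max-length 80 --model-dir ./murakami_bot/
```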
You can download our model here.
[1] Radford A., Wu J., Child R., Luan D., Amodei D., and Sutskever I., "Language Models are Unsupervised Multitask Learners", 2019. (link)
[2] Devlin J., Chang M-W., Lee K., and Toutanova K., "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding", Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 4171–4186, June 2019. (link)
[3] Vaswani A., Shazeer N., Parmar N., Uszkoreit J., Jones L., Gomez A. N., Kaiser Ł., and Polosukhin I., "Attention Is All You Need", Advances in Neural Information Processing Systems, pp. 5998–6008, 2017. (link)