This repository contains JavaScript and Rust implementations of two things: building a Markov chain from raw text, and using that chain to generate quotes. It also uses these implementations to procedurally generate fake Instagram post captions.
The fake Instagram posts are in the style of John Danaher and are generated from a Markov chain built from his real Instagram posts. Danaher is a (relatively) famous Brazilian jiu-jitsu coach with a posting habit, and the huge body of long, elaborate, distinctively written work he's posted over the years is good raw material for procedural quote generation.
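For readers new to the technique, here is a minimal sketch of a word-level Markov chain in JavaScript. It is not the code from this repository: the function names `buildChain` and `generate`, the whitespace tokenization, and the `order` parameter (how many preceding words make up a state) are all assumptions made for illustration.

```javascript
// Illustrative sketch only; names and design are hypothetical,
// not taken from this repository's implementations.

// Build a chain: map each run of `order` consecutive words to the list of
// words seen immediately after it. Duplicates are kept on purpose, so a
// more frequent successor is proportionally more likely to be picked.
function buildChain(text, order = 2) {
  const words = text.split(/\s+/).filter(Boolean);
  const chain = new Map();
  for (let i = 0; i + order < words.length; i++) {
    const key = words.slice(i, i + order).join(" ");
    if (!chain.has(key)) chain.set(key, []);
    chain.get(key).push(words[i + order]);
  }
  return chain;
}

// Generate at most `maxWords` words by repeatedly picking a random successor
// of the last `order` words. `random` is injectable so a run can be made
// deterministic.
function generate(chain, order = 2, maxWords = 30, random = Math.random) {
  const keys = [...chain.keys()];
  if (keys.length === 0) return "";
  const output = keys[Math.floor(random() * keys.length)].split(" ");
  while (output.length < maxWords) {
    const candidates = chain.get(output.slice(-order).join(" "));
    if (!candidates) break; // dead end: no recorded successor for this state
    output.push(candidates[Math.floor(random() * candidates.length)]);
  }
  return output.slice(0, maxWords).join(" ");
}

const sample =
  "the grip is the start of the game and the grip is the end of the game";
const chain = buildChain(sample, 2);
console.log(generate(chain, 2, 12));
```

A higher `order` tends to produce text that stays closer to the source, while a lower one produces more surprising (and less coherent) output. With a corpus as large as years of long posts, there is room to experiment with that trade-off.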
You can see the Danaher quote generator in action on this repo's GitHub Pages site here. GitHub is serving the static assets, but the posts are generated on demand by an AWS Lambda function written in Rust. I ain't necessarily proud of the HTML and CSS for the site; they're from a version of this project I did quite a while ago now.
The JavaScript implementation can run on any reasonably recent version of Node.js.
The Rust implementation is a little trickier to build and run, because `quote-lambda` depends on a file that is created by `generate-markov`. A pre-generated chain is included statically in `quote-lambda` to speed it up at runtime. You can run `cargo run -r --bin generate-markov 3 ../input.txt` to build and run just `generate-markov` and create the file. After that, any compilation/running should work normally.
To get `quote-lambda` running in an AWS Lambda function, see the information here (for source code reference) and here (for build/deploy reference). I compiled it with Cargo Lambda (`cargo lambda build --bin quote-lambda --release --output-format zip --arm64`) and uploaded the zip through the AWS console. It is currently running in a Lambda function with runtime `Amazon Linux 2023` and architecture `arm64`.
Every few months, I'd like to scrape all the new posts off social media and regenerate the Markov chain with more data. The chain will get better at mimicking him the more he writes and the more data I can feed it, so maybe in ten years or so the chain will be able to write his posts for him and no one'll know the difference.
It would also be nice to fix up the sample site, especially so it looks less awful on mobile and relies less on literal screenshots of assets from Instagram.
I've done the basic idea of a quote generator twice now (JavaScript and Rust implementations), but it would be cool to do one with a radically different technology, like generative AI.