- N-gram https://en.wikipedia.org/wiki/N-gram
- Markov Chain
- Created for NaNoGenMo 2016 (National Novel Generation Month)
- See NaNoGenMo2017 for this year's version
- Install Java 8+ JDK
- Install maven
- Install your favorite LaTeX distribution (either MiKTeX or TeX Live should work on Windows)
- `mvn test`
- steps...
- Generate LaTeX output with `mvn latex:latex`
- requires "story.txt" in top directory (change path in
\\input{}
to change location) - pdf file is in target/latex/book/book.pdf
- (find an mvn command line that runs all of the above steps at once)
- cmdline args (a parsing sketch follows this list):
  - Word count
  - Source filename
  - Destination file (and/or directory?)
  - `IMarkov` implementation (default to `MarkovBigram`)
  - Formatted
    - LaTeX does its own line formatting, so maybe just always format?
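A minimal sketch of what parsing these options might look like, assuming hypothetical `--key=value` flag names, defaults, and an `Options` holder class; none of these names come from the project, and a library such as Apache Commons CLI could replace the hand-rolled loop once the option set settles.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical option holder; flag names and defaults are assumptions.
public class Options {
    int wordCount = 50000;              // NaNoGenMo target length (placeholder default)
    String source = "corpus.txt";       // hypothetical default source filename
    String destination = "story.txt";   // hypothetical default destination
    String markovImpl = "MarkovBigram"; // default IMarkov implementation
    boolean formatted = true;           // LaTeX does its own formatting, so default on

    static Options parse(String[] args) {
        Options opts = new Options();
        Map<String, String> kv = new HashMap<>();
        for (String arg : args) {
            // expect "--key=value" pairs, e.g. --words=50000
            if (arg.startsWith("--") && arg.contains("=")) {
                int eq = arg.indexOf('=');
                kv.put(arg.substring(2, eq), arg.substring(eq + 1));
            }
        }
        if (kv.containsKey("words"))     opts.wordCount = Integer.parseInt(kv.get("words"));
        if (kv.containsKey("source"))    opts.source = kv.get("source");
        if (kv.containsKey("dest"))      opts.destination = kv.get("dest");
        if (kv.containsKey("impl"))      opts.markovImpl = kv.get("impl");
        if (kv.containsKey("formatted")) opts.formatted = Boolean.parseBoolean(kv.get("formatted"));
        return opts;
    }
}
```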
- Reader reads from Internet
- implement 3-gram to get better sentences (see the sketch below)
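A rough sketch of the 3-gram idea, not the project's actual `IMarkov` API: key the frequency table on the two preceding words so each choice is conditioned on more context. Class and method names here are illustrative only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Trigram (3-gram) sketch: the next word is chosen based on the previous two
// words rather than just one, which tends to produce more grammatical sentences.
public class TrigramSketch {
    private final Map<String, List<String>> freqs = new HashMap<>();
    private final Random random = new Random();

    public void store(List<String> words) {
        for (int i = 0; i + 2 < words.size(); i++) {
            String key = words.get(i) + " " + words.get(i + 1);
            freqs.computeIfAbsent(key, k -> new ArrayList<>()).add(words.get(i + 2));
        }
    }

    // Pick a successor for the pair (w1, w2); duplicates in the list act as weights.
    public String next(String w1, String w2) {
        List<String> candidates = freqs.get(w1 + " " + w2);
        if (candidates == null || candidates.isEmpty()) {
            return null; // no observed continuation for this pair
        }
        return candidates.get(random.nextInt(candidates.size()));
    }
}
```

The trade-off against the bigram version is sparsity: in a small corpus most word pairs occur only once, so the generator tends to reproduce longer runs of the source text.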
- More unit tests
- Create `main()` that implements the calls in `testGenerate()`
  - half-done, methods moved out of `MarkovBigramTest`
- implement parsing of cmdline args
- sometimes a period appears after other punctuation
  - Create a unit test demonstrating it (see the sketch below)
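One possible shape for such a test, assuming a hypothetical `generate(int words)` method on `MarkovBigram`; the real method name and signature in the project may differ.

```java
import static org.junit.Assert.assertFalse;
import org.junit.Test;

public class PunctuationTest {
    // Sketch only: assumes MarkovBigram exposes a generate(int) method that
    // returns the generated text; adjust to the real API.
    @Test
    public void periodShouldNotFollowOtherPunctuation() {
        MarkovBigram markov = new MarkovBigram();
        String text = markov.generate(1000);
        // e.g. "word!." or "word?." would indicate the bug
        assertFalse("period appears after other punctuation",
                text.matches("(?s).*[!?,;:]\\..*"));
    }
}
```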
- html: on hover, display the words that could have been chosen and the probability of each
- variable-length sentences (would require Bayesian-type methods?)
- should `freqs` be an instance field, or be passed in and returned, making the class "pure functional" with no stored state? (see the sketch below)
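For comparison, the stateless alternative would look roughly like this; the signature is an assumption, not the project's actual `store()` method.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// "Pure functional" variant: the frequency table is built from the input and
// returned to the caller rather than stored as instance state.
public final class PureStoreSketch {
    public static Map<String, List<String>> store(List<String> words) {
        Map<String, List<String>> freqs = new HashMap<>();
        for (int i = 0; i + 1 < words.size(); i++) {
            // record each observed successor; duplicates act as frequency weights
            freqs.computeIfAbsent(words.get(i), k -> new ArrayList<>())
                 .add(words.get(i + 1));
        }
        return freqs;
    }
}
```

Keeping `freqs` as an instance field avoids threading the map through every call; the stateless version trades that convenience for easier testing and reuse across corpora.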
- better determination of proper nouns
- move docs to separate directory
- rename `First` to `MarkovBigram`
- have `store()` return the `freqs` object instead of requiring it to be passed in
  - Made `freqs` an instance field so it's not passed around
- find a lib to do word-wrap
  - Good old Apache Commons (see the sketch below)
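For example, `WordUtils.wrap` from Apache Commons Text (the successor to the old commons-lang text utilities) handles the basic case; the 72-column width below is just an illustrative value.

```java
import org.apache.commons.text.WordUtils;

public class WrapExample {
    public static void main(String[] args) {
        String paragraph = "It was a dark and stormy night; the rain fell in torrents...";
        // Wrap to 72 columns; WordUtils.wrap breaks on whitespace and inserts newlines.
        String wrapped = WordUtils.wrap(paragraph, 72);
        System.out.println(wrapped);
    }
}
```

If LaTeX output becomes the only target, wrapping may be unnecessary, since LaTeX reflows paragraphs itself (as noted above).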
- LaTeX output
  - generates the PDF file programmatically; easier than using some PDF library, but the downside is that it requires installing a LaTeX distribution