Epsilon-Greedy reinforcement learning code - alternative to standard OOP version
Combines exploitation with exploration to give dynamic feedback on how your campaign is performing, e.g. so you can adjust fees, prices and content in real time in response to user interactions.
It also shows how many trials you need to converge on the final statistic, e.g. in drug testing, where adequate testing is required but not so much that people who need the medicine are waiting unnecessarily for trials to conclude.
Helps protect against p-hacking, since you can determine whether the number of trials is sufficient.
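As a rough illustration of the "how many trials" idea, here is a minimal sketch that tracks the running mean of a stream of rewards and reports when it stops moving. The function names, the `tol` and `window` parameters and the stability rule itself are my own assumptions for illustration; this is a crude heuristic, not a formal statistical stopping rule.

```python
def running_means(rewards):
    """Return the running mean after each trial, using the incremental update."""
    means, mean = [], 0.0
    for n, reward in enumerate(rewards, start=1):
        mean += (reward - mean) / n  # same result as sum(rewards[:n]) / n
        means.append(mean)
    return means


def trials_until_stable(rewards, tol=0.01, window=50):
    """Hypothetical helper: number of trials after which the running mean has
    changed by less than `tol` for `window` consecutive trials, or None."""
    means = running_means(rewards)
    calm = 0  # consecutive trials with a small change in the running mean
    for n in range(1, len(means)):
        calm = calm + 1 if abs(means[n] - means[n - 1]) < tol else 0
        if calm >= window:
            return n + 1  # n is a zero-based index, so n + 1 trials were used
    return None
```

If `trials_until_stable` returns None, the estimate has not settled under these (assumed) thresholds, which suggests that more trials are needed before drawing a conclusion.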
Code is generally available online, e.g. https://www.geeksforgeeks.org/epsilon-greedy-algorithm-in-reinforcement-learning/ However, all instances appear to be the same Object-Oriented Programming code, whereas I've written this in a way that's easier for me (and maybe you) to follow. Note that, unlike the online examples, this will work with both Normal and Bernoulli distributions (plus any others you care to add).
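To give a flavour of the function-based approach, here is a minimal sketch of an epsilon-greedy loop without classes. It is not the repository code itself: the names `normal_arm`, `bernoulli_arm` and `run_epsilon_greedy`, and the default values, are assumptions made for illustration. Each arm is simply a function that draws a reward, so adding another distribution only means writing another small sampler.

```python
import random


def normal_arm(mu, sigma=1.0):
    """Hypothetical sampler: reward drawn from a Normal(mu, sigma) distribution."""
    return lambda rng: rng.gauss(mu, sigma)


def bernoulli_arm(p):
    """Hypothetical sampler: reward is 1.0 with probability p, otherwise 0.0."""
    return lambda rng: 1.0 if rng.random() < p else 0.0


def run_epsilon_greedy(arms, n_trials=5000, epsilon=0.1, seed=0):
    """Play `n_trials` rounds; return (estimated mean reward per arm, pulls per arm)."""
    rng = random.Random(seed)
    counts = [0] * len(arms)
    estimates = [0.0] * len(arms)
    for _ in range(n_trials):
        if rng.random() < epsilon:
            # Explore: pick any arm uniformly at random.
            arm = rng.randrange(len(arms))
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(len(arms)), key=estimates.__getitem__)
        reward = arms[arm](rng)
        counts[arm] += 1
        # Incremental mean update, so no reward history needs to be stored.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts


if __name__ == "__main__":
    print(run_epsilon_greedy([bernoulli_arm(0.2), bernoulli_arm(0.8)]))
    print(run_epsilon_greedy([normal_arm(1.0), normal_arm(2.0), normal_arm(3.0)]))
```

With a fixed `epsilon`, a constant share of trials keeps going to random arms, so the pull counts show how strongly the loop has settled on one option while the estimates show what it has learned about each.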