monte-carlo
q-learning
dqn
epsilon-greedy
policy-gradient
dynamic-programming
transfer-learning
policy-iteration
value-iteration
model-based-rl
behavioral-economics
sarsa-learning
n-armed-bandit-problem
double-q-learning
model-learning
n-step-expected-sarsa
n-step-tree-backup
ucb-algorithm
cognitive-fallacies
Updated Sep 27, 2021 - Jupyter Notebook