Compares RL algorithms (bandit, Q-learning, etc.) with similar algorithms that use inference. Includes Psi-Auto, an algorithm that automatically tunes the inverse temperature.
Updated May 13, 2023 - Python
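
For readers unfamiliar with the term, the inverse temperature is the parameter (often written beta) that scales action values inside a softmax policy: beta = 0 gives uniform random choices, and a large beta approaches greedy selection. The sketch below is not the repository's code and does not implement Psi-Auto; it is a minimal illustration with a fixed, hand-chosen beta on a Gaussian bandit, and the names `softmax_policy` and `run_bandit` are hypothetical.

```python
import numpy as np


def softmax_policy(q_values, beta):
    """Return action probabilities proportional to exp(beta * Q)."""
    q = np.asarray(q_values, dtype=float)
    z = beta * (q - q.max())  # subtract the max for numerical stability
    p = np.exp(z)
    return p / p.sum()


def run_bandit(true_means, beta, steps=1000, seed=0):
    """Play a Gaussian bandit with a softmax policy and a fixed beta.

    Assumption: unit-variance rewards and sample-average value estimates;
    beta is set by hand here rather than tuned automatically.
    """
    rng = np.random.default_rng(seed)
    n = len(true_means)
    q = np.zeros(n)                  # running value estimate per arm
    counts = np.zeros(n, dtype=int)  # number of pulls per arm
    for _ in range(steps):
        a = rng.choice(n, p=softmax_policy(q, beta))
        r = rng.normal(true_means[a], 1.0)
        counts[a] += 1
        q[a] += (r - q[a]) / counts[a]  # incremental mean update
    return q, counts


if __name__ == "__main__":
    for beta in (0.0, 1.0, 5.0):
        q, counts = run_bandit([0.0, 0.5, 1.0], beta)
        print(f"beta={beta}: pulls per arm = {counts}")
```

Running it with several values of beta shows how the choice shifts the balance between exploration and exploitation, which is the quantity an automatic tuning scheme would have to set.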