Mirror Descent Policy Optimization
reinforcement-learning deep-learning deep-reinforcement-learning deep-learning-algorithms sac trpo deep-rl ppo deep-learning-ai policy-optimization stable-baselines model-free-rl mirror-descent mdpo
Updated Oct 31, 2020 - Python