| 11/9 | Trust Region Policy Optimization, J. Schulman et al, 2015. | Chris Ohk | [paper] [review] |
| 11/16 | Model-based Reinforcement Learning for Atari, L. Kaiser et al, 2019. | Sungdong Yoo | [paper] [review] |
| 11/16 | High-dimensional Continuous Control using Generalized Advantage Estimation, J. Schulman et al, 2015. | Junyeob Baek | [paper] [review] |
| 11/23 | The Option-critic Architecture, PL. Bacon et al, 2017. | Seonghyeon Moon | [paper] [review] |
| 11/23 | Rainbow: Combining Improvements in Deep Reinforcement Learning, M. Hessel et al, 2017. | Sungkwon On | [paper] [review] |
| 11/30 | Prioritized Experience Replay, T. Schaul et al, 2015. | Donggu Kang | [paper] [review] |
| 12/7 | Distributed Prioritized Experience Replay, D. Horgan et al, 2018. | Jungyeon Lee | [paper] [review] |
| 12/7 | IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-learner Architectures, L. Espeholt et al, 2018. | Junhyung Kang | [paper] [review] |
| 12/14 | A Distributional Perspective on Reinforcement Learning, MG. Bellemare et al, 2017. | Wonwoo Choi | [paper] [review] |
| 12/14 | Addressing Function Approximation Error in Actor-critic Methods, S. Fujimoto et al, 2018. | Sooyoung Lee | [paper] [review] |
| 12/21 | Action-gap Phenomenon in Reinforcement Learning, A. Farahmand et al, 2011. | Handong Im | [paper] [review] |
| 12/21 | Soft Actor-critic: Off-policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, T. Haarnoja et al, 2018. | Hyo Jeon | [paper] [review] |
| 12/28 | Agent57: Outperforming the Atari Human Benchmark, A. P. Badia et al, 2020. | Chris Ohk | [paper] [review] |