Weekly Reading Group on Offline RL at GRAIL
Organizers: Mathieu Godbout and Ulysse Côté-Allard
Everyone reads the weekly paper, and a discussion guide is (pseudo-)randomly drawn at the beginning of each meeting. The discussion guide's role is only to drive the conversation and make sure we cover the entire paper within the scheduled hour. Starting on 1st November, 2022, meetings take place every Tuesday from 9 AM to 10 AM (EST) on Google Meet.
This reading group is open to anyone interested. If you wish to join, simply send us an email so we can add you to our discussion channel.
For this semester, we will look at various state-of-the-art reinforcement learning approaches, with no constraint on their particular domain. The reading group will take the form of a discussion, so no one will be considered the main presenter.
Date | Paper |
---|---|
1st November, 2022 | Discovering faster matrix multiplication algorithms with reinforcement learning |
8th November, 2022 | The Primacy Bias in Deep Reinforcement Learning |
15th November, 2022 | Deep Reinforcement Learning at the Edge of the Statistical Precipice |
22nd November, 2022 | Explainable Reinforcement Learning: A Survey |
Date | Paper |
---|---|
8th April, 2022 | Why is Posterior Sampling Better than Optimism for Reinforcement Learning |
15th April, 2022 | Collaborating with Humans without Human Data |
22nd April, 2022 | Mastering the game of Go without human knowledge |
29th April, 2022 | AlphaStar: Grandmaster level in StarCraft II using multi-agent reinforcement learning |
6th May, 2022 | GFlowNet |
13th May, 2022 | Asymmetric self-play for automatic goal discovery in robotic manipulation |
20th May, 2022 | Continuous multi-task bayesian optimisation with correlation |
27th May, 2022 | Outracing champion Gran Turismo drivers with deep reinforcement learning |
3rd June, 2022 | Planning with Diffusion for Flexible Behavior Synthesis |
Summer Break | The reading group will start again at the end of July. More details regarding the next paper will be communicated closer to the start date. |
For this semester, we will follow the reinforcement learning class by Emma Brunskill (https://www.youtube.com/playlist?list=PLoROMvodv4rOSOPzutgyCTapiGlY2Nd8u) at a rhythm of one class per week.
Date | Class | Topic |
---|---|---|
3rd December, 2021 | Class 1-2 | Introduction & Given a Model of the World |
10th December, 2021 | Class 3 | Model-Free Policy Evaluation |
17th December, 2021 | Class 4 | Model Free Control |
7th January, 2022 | Class 5 | Value Function Approximation |
14th January, 2022 | Class 6 | CNNs and Deep Q Learning |
21st January, 2022 | Class 6 | CNNs and Deep Q Learning |
28th January, 2022 | Class 7 | Value Function Approximation |
4th February, 2022 | Class 8 | Imitation Learning |
11th February, 2022 | Class 9 | Policy Gradient I |
18th February, 2022 | Class 10 | Policy Gradient II |
25th February, 2022 | Class 11 | Policy Gradient III & Review |
4th March, 2022 | Class 12 | Fast Reinforcement Learning |
11th March, 2022 | Class 13 | Fast Reinforcement Learning II |
18th March, 2022 | Class 14 | Fast Reinforcement Learning III |
25th March, 2022 | Class 15 | Batch Reinforcement Learning |
1st April, 2022 | Class 16 | Monte Carlo Tree Search |
For this portion of the fall session, we will allow papers outside the usual offline RL scope. Presenters for submitted papers that are not related to offline RL will no longer be randomly sampled; instead, each such paper is automatically assigned to the person who submitted it.
Date | Paper | Presenter |
---|---|---|
29th October, 2021 | Pareto Front Identification from Stochastic Bandit Feedback | Alexandre Larouche |
5th November, 2021 | Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning | Peng Cheng (Frank) |
12th November, 2021 | COMBO: Conservative Offline Model-Based Policy Optimization | Random |
19th November, 2021 | Logistic Q-Learning. A 15-minute author presentation is also available | Mathieu Godbout |
26th November, 2021 | Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning. Code for the work's implementation available here | Ulysse Côté-Allard |
Date | Paper |
---|---|
2nd September, 2021 | Skipped to attend the DEEL workshop on adversarial attack (free registration) |
10th September, 2021 | Causality and Batch Reinforcement Learning: Complementary Approaches To Planning In Unknown Domains |
17th September, 2021 | Causal Reinforcement Learning (ICML Tutorial) Part 1&2 |
1st October, 2021 | Efficient Counterfactual Learning from Bandit Feedback |
8th October, 2021 | Improving Long-Term Metrics in Recommendation Systems using Short-Horizon Offline RL |
15th October, 2021 | A Workflow for Offline Model-Free Robotic Reinforcement Learning |
22nd October, 2021 | Offline Reinforcement Learning with Implicit Q-Learning |
Date | Paper |
---|---|
11th March, 2021 | Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems (1/4) |
18th March, 2021 | Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems (2/4) |
24th March, 2021 | Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems (3/4) |
31st March, 2021 | Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems (4/4) |
7th April, 2021 | MOPO: Model-based Offline Policy Optimization |