This repository contains an implementation of the Deep Q-Network (DQN) algorithm for playing Atari games. The DQN algorithm, introduced by Mnih et al. in the paper Playing Atari with Deep Reinforcement Learning, combines Q-learning with deep neural networks to achieve impressive results in a variety of Atari 2600 games.
The Deep Q-Network is a deep reinforcement learning algorithm that extends Q-learning to handle high-dimensional state spaces. It employs a neural network to approximate the Q-function, which represents the expected cumulative future rewards for taking a specific action in a given state. This allows DQN to learn directly from raw sensory inputs, making it applicable to a wide range of tasks.
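Conceptually, the Q-network maps a stack of preprocessed frames to one Q-value per discrete action. Below is a minimal sketch of such a network in PyTorch, following the architecture popularized by the DQN papers; it assumes 4 stacked 84x84 grayscale frames, and the network defined in this repository may differ in its exact layers.

```python
# Hypothetical DQN-style Q-network sketch; the repository's actual model may differ.
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    def __init__(self, num_actions: int):
        super().__init__()
        # Convolutional encoder over 4 stacked 84x84 grayscale frames,
        # followed by a fully connected head with one Q-value per action.
        self.network = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=8, stride=4),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2),
            nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1),
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512),
            nn.ReLU(),
            nn.Linear(512, num_actions),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 4, 84, 84) uint8 frames; scale to [0, 1] before the conv stack.
        return self.network(x / 255.0)
```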
The Atari 2600, a popular home video game console in the late 1970s and early 1980s, featured a diverse collection of games. These games serve as a benchmark for testing the capabilities of reinforcement learning algorithms. Each game in the Atari 2600 suite provides a unique environment with different challenges, making them an ideal testbed for training agents to generalize across a variety of tasks.
To run this project, you will need the following:
- Python 3.x
- PyTorch
- Gym (OpenAI)
- Clone the repository:
git clone https://github.com/adhiiisetiawan/atari-dqn.git
- Install the required dependencies:
pip install -r requirements.txt
To train and evaluate the DQN agent, follow the steps outlined below:
- Set up the required dependencies as described in the Installation section.
- Train the DQN agent:
sh train.sh
To train on a different game, edit the game environment name in the train.sh file.
- Evaluate the trained agent:
Evaluation runs automatically at the end of training, but if you want to run it separately, just run dqn_eval.py and change the game environment there.
The training process involves the following steps (illustrative sketches of the frame preprocessing, the replay buffer, and the Q-learning update follow the list):
- Preprocess raw game frames to reduce dimensionality.
- Initialize a deep neural network to approximate the Q-function.
- Initialize a replay buffer to store experiences.
- For each episode, perform the following steps:
- Select an action using an epsilon-greedy policy.
- Execute the action in the environment and observe the next state, reward, and terminal flag.
- Store the experience in the replay buffer.
- Sample a batch of experiences from the replay buffer and perform a Q-learning update step.
- Update the target Q-network periodically.
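Frame preprocessing is typically handled with Gym's built-in Atari wrappers. The sketch below is an assumption about a reasonable setup (the environment id and wrapper choices here are illustrative, not necessarily what train.sh uses):

```python
# Illustrative Atari preprocessing; the repository's wrappers and env id may differ.
import gym

def make_env(env_id: str = "BreakoutNoFrameskip-v4"):
    env = gym.make(env_id)  # requires the Atari ROMs / ale-py to be installed
    # Grayscale, downscale to 84x84, and repeat each action for 4 frames.
    env = gym.wrappers.AtariPreprocessing(env, frame_skip=4, screen_size=84, grayscale_obs=True)
    # Stack the last 4 frames so the agent can infer motion.
    env = gym.wrappers.FrameStack(env, 4)
    return env
```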
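The replay buffer simply stores past transitions and hands back random mini-batches, which breaks the correlation between consecutive samples. A minimal sketch (names are assumptions, not the identifiers used in this repository):

```python
# Minimal replay buffer sketch; the repository may use a different implementation.
import random
from collections import deque

class ReplayBuffer:
    def __init__(self, capacity: int):
        self.buffer = deque(maxlen=capacity)  # oldest experiences are dropped first

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size: int):
        # Uniformly sample a mini-batch of stored transitions.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```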
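The inner loop ties together the epsilon-greedy policy, replay sampling, and the Q-learning target. A condensed, hypothetical sketch of that update in PyTorch (q_network, target_network, and the batch layout are assumptions, not the repository's actual names):

```python
# Hypothetical sketch of epsilon-greedy action selection and the DQN update step.
import random
import torch
import torch.nn.functional as F

def select_action(q_network, state, epsilon, num_actions, device):
    # Explore with probability epsilon, otherwise act greedily on the Q-values.
    if random.random() < epsilon:
        return random.randrange(num_actions)
    with torch.no_grad():
        q_values = q_network(state.unsqueeze(0).to(device))
    return int(q_values.argmax(dim=1).item())

def dqn_update(q_network, target_network, optimizer, batch, gamma=0.99):
    # batch: torch tensors built from a sampled mini-batch of transitions.
    states, actions, rewards, next_states, dones = batch
    # Q(s, a) for the actions actually taken.
    q_pred = q_network(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    # Bootstrapped TD target computed with the (periodically synced) target network.
    with torch.no_grad():
        next_q = target_network(next_states).max(dim=1).values
        td_target = rewards + gamma * next_q * (1.0 - dones)
    loss = F.mse_loss(q_pred, td_target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Periodically (e.g. every few thousand steps) sync the target network:
# target_network.load_state_dict(q_network.state_dict())
```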
The evaluation process involves testing the trained DQN agent on a specific game. The agent's performance is measured in terms of the average score achieved over a specified number of episodes.
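A typical evaluation loop, sketched below, runs the greedy (epsilon = 0) policy for a fixed number of episodes and averages the episode returns; dqn_eval.py may organize this differently, and the classic Gym step API is assumed here.

```python
# Illustrative evaluation loop (hypothetical; dqn_eval.py may differ).
# Assumes the classic Gym API where reset() returns the observation and
# step() returns (obs, reward, done, info); adjust for Gymnasium if needed.
import numpy as np
import torch

def evaluate(env, q_network, num_episodes=10, device="cpu"):
    returns = []
    for _ in range(num_episodes):
        state = env.reset()
        done, episode_return = False, 0.0
        while not done:
            obs = torch.as_tensor(np.array(state), dtype=torch.float32, device=device)
            with torch.no_grad():
                action = int(q_network(obs.unsqueeze(0)).argmax(dim=1).item())
            state, reward, done, _ = env.step(action)
            episode_return += reward
        returns.append(episode_return)
    # Average score over the evaluated episodes.
    return sum(returns) / len(returns)
```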
Here's a GIF of the agent playing Q-Bert:
Here's a GIF of the agent playing MS PacMan:
This project is licensed under the MIT License - see the LICENSE file for details.
This repository is inspired by CleanRL:
@article{huang2022cleanrl,
author = {Shengyi Huang and Rousslan Fernand Julien Dossa and Chang Ye and Jeff Braga and Dipam Chakraborty and Kinal Mehta and João G.M. Araújo},
title = {CleanRL: High-quality Single-file Implementations of Deep Reinforcement Learning Algorithms},
journal = {Journal of Machine Learning Research},
year = {2022},
volume = {23},
number = {274},
pages = {1--18},
url = {http://jmlr.org/papers/v23/21-1342.html}
}