Official implementation of the Informed Dreamer algorithm, based on DreamerV3

glambrechts/informed-dreamer


Informed POMDP: Leveraging Additional Information in Model-Based RL

Official implementation of the Informed Dreamer, an adaptation of Dreamer to Informed POMDPs.

Informed POMDP

If you find this code useful, please cite the following in your paper:

@article{lambrechts2024informed,
    title={Informed {POMDP}: {L}everaging Additional Information in Model-Based {RL}},
    author={Lambrechts, Gaspard and Bolland, Adrien and Ernst, Damien},
    journal={Reinforcement Learning Journal},
    volume={1},
    issue={1},
    year={2024}
}

@article{hafner2023dreamerv3,
  title={Mastering Diverse Domains through World Models},
  author={Hafner, Danijar and Pasukonis, Jurgis and Ba, Jimmy and Lillicrap, Timothy},
  journal={arXiv preprint arXiv:2301.04104},
  year={2023}
}

To learn more:

Instructions

For installation, examples and tips, see the original Dreamer repository.

This repository implements the following Informed POMDPs:

  • Varying Mountain Hike (state informed)
  • Flickering Atari (annotated-RAM informed)
  • Velocity DeepMind Control (state informed)
  • Flickering DeepMind Control (state informed)

By convention, the observation keys starting with info_ are considered as part of the information $i$, while the other observation keys are considered as part of the observation $o$.
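
As an illustration of this naming convention, the following minimal sketch splits a flat observation dictionary into its information and observation parts. The function name `split_observation` and the keys `image`, `position` and `info_state` are hypothetical and chosen only for this example; this is not necessarily how the repository performs the split internally.

```python
def split_observation(obs, prefix="info_"):
    """Split a flat observation dict into (information, observation).

    Keys starting with `prefix` form the information i,
    all remaining keys form the observation o.
    """
    information = {k: v for k, v in obs.items() if k.startswith(prefix)}
    observation = {k: v for k, v in obs.items() if not k.startswith(prefix)}
    return information, observation


# Hypothetical keys, for illustration only.
step = {
    "image": [[0, 0], [0, 0]],
    "position": [1.0, 2.0],
    "info_state": [0.1, -0.3],
}

information, observation = split_observation(step)
print("information keys:", sorted(information))
print("observation keys:", sorted(observation))
```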

The Informed Dreamer and the Uninformed Dreamer agents can be trained as follows:

  • For the informed POMDP training, use --decoder.outputs 'info_.*', which selects only the information.
  • For the classical POMDP training, use --decoder.outputs '^(?!info_).*', which selects only the observation.

For both the information and the observation, use --decoder.outputs '.*' (untested).
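
To check which keys each of the three patterns above selects, the sketch below applies them to a few example keys. It assumes that the pattern is matched from the first character of each key, as Python's `re.match` does; the key names and the helper `select_keys` are hypothetical.

```python
import re

# The three patterns given above for --decoder.outputs.
PATTERNS = {
    "informed": r"info_.*",
    "uninformed": r"^(?!info_).*",
    "both": r".*",
}


def select_keys(pattern, keys):
    """Return the keys that the pattern matches from their first character."""
    return [k for k in keys if re.match(pattern, k)]


# Hypothetical observation keys, for illustration only.
keys = ["image", "vector", "info_state"]

selected = {name: select_keys(p, keys) for name, p in PATTERNS.items()}
for name, matched in selected.items():
    print(f"{name}: {matched}")
```

The negative lookahead `(?!info_)` in the second pattern rejects any key that begins with `info_`, which is what makes the informed and uninformed selections complementary.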

Experiments

Varying Mountain Hike

python dreamerv3/train.py --logdir logs/$(date '+%Y-%m-%d_%H.%M.%S') \
    --configs hike --task 'hike_foo' --env.hike.discrete True \
    --configs hike --env.hike.altitude False --env.hike.rotations True \
    --decoder.outputs 'info_.*'

Flickering Atari

python dreamerv3/train.py --logdir logs/$(date '+%Y-%m-%d_%H.%M.%S') \
    --configs atari100k --task 'atari_pong' --env.atari.flickering 0.5 \
    --decoder.outputs 'info_.*'

Velocity Control

python dreamerv3/train.py --logdir logs/$(date '+%Y-%m-%d_%H.%M.%S') \
    --configs dmc_velocity --task 'dmc_hopper_stand' \
    --decoder.outputs 'info_.*'

Flickering Control

python dreamerv3/train.py --logdir logs/$(date '+%Y-%m-%d_%H.%M.%S') \
    --configs dmc_vision --task 'dmc_hopper_stand' --env.dmc.flickering 0.5 \
    --decoder.outputs 'info_.*'
