rddlgym

A toolkit for working with RDDL domains in Python3. Its main purpose is to wrap an RDDL domain/instance planning problem as an OpenAI Gym environment.

Quickstart

$ pip3 install -U rddlgym

Features

rddlgym implements the OpenAI Gym API for RDDL problems. It uses pyrddl to parse RDDL files. To simulate RDDL domains, it uses rddl2tf to compile RDDL operations and expressions into TensorFlow 1.X ops.

For further details, please refer to the documentation of the following packages:

  • pyrddl: RDDL lexer/parser in Python3.
  • rddl2tf: RDDL2TensorFlow compiler.

NOTE

Please note that rddl2tf (and consequently rddlgym) has been developed mainly to support continuous state-action domains. It may not currently work for discrete MDPs.

If you try to use rddlgym with your own RDDL files and encounter errors, most likely caused by the RDDL-to-TensorFlow compilation, please do not hesitate to open an issue or contact me.


Usage

rddlgym can be used either as a standalone CLI app or integrated into your own code to implement customized agent-environment interaction loops.

CLI

$ rddlgym --help
Usage: rddlgym [OPTIONS] COMMAND [ARGS]...

  rddlgym: A toolkit for working with RDDL domains in Python3.

Options:
  --help  Show this message and exit.

Commands:
  info   Print metadata for a `rddl` domain/instance.
  ls     List all RDDL domains and instances available.
  parse  Check RDDL file parsing.
  run    Run random policy in `rddl` domain/instance.
  show   Print `rddl` file.

API

import rddlgym

# create RDDLGYM environment
rddl_id = "Navigation-v3" # see available RDDL domains/instances with `rddlgym ls` command
env = rddlgym.make(rddl_id, mode=rddlgym.GYM)

# you can also wrap your own RDDL files (domain + instance)
# env = rddlgym.make("/path/to/your/domain_instance.rddl", mode=rddlgym.GYM)

# define random policy
policy = lambda state, t: env.action_space.sample()

# initialize environment
state, t = env.reset()
done = False

# create a trajectory container
trajectory = rddlgym.Trajectory(env)

# sample an episode and store trajectory
while not done:

    action = policy(state, t)
    next_state, reward, done, info = env.step(action)

    trajectory.add_transition(t, state, action, reward, next_state, info, done)

    state = next_state
    t = env.timestep

print(f"Total Reward = {trajectory.total_reward}")
print(f"Episode length = {len(trajectory)}")

filepath = f"/tmp/rddlgym/{rddl_id}/data.csv"
df = trajectory.save(filepath) # dump episode data as csv file
print(df) # display dataframe
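
The random policy above can be swapped for any callable that takes `(state, t)` and returns an action. The sketch below illustrates this with a deterministic policy. So that it runs without rddlgym or TensorFlow installed, it uses a hypothetical `StubEnv` that mimics only the calls shown in the loop above (`reset()` returning `(state, t)`, `step()` returning a 4-tuple, and a `timestep` attribute). `StubEnv`, `rollout`, and `move_toward_goal` are illustrative names, not part of rddlgym, and the state and action types of a real rddlgym environment will differ from the plain floats used here.

```python
class StubEnv:
    """Hypothetical stand-in that mimics only the calls used in the loop above."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.timestep = 0
        self._position = 0.0

    def reset(self):
        # Same return shape as the README example: (state, t).
        self.timestep = 0
        self._position = 0.0
        return self._position, self.timestep

    def step(self, action):
        self._position += action
        self.timestep += 1
        # Reward is the negative distance to a goal placed at 1.0.
        reward = -abs(1.0 - self._position)
        done = self.timestep >= self.horizon
        return self._position, reward, done, {}


def rollout(env, policy):
    """Run one episode; return the list of transitions and the total reward."""
    state, t = env.reset()
    done = False
    transitions = []
    total_reward = 0.0
    while not done:
        action = policy(state, t)
        next_state, reward, done, info = env.step(action)
        transitions.append((t, state, action, reward, next_state, done))
        total_reward += reward
        state, t = next_state, env.timestep
    return transitions, total_reward


def move_toward_goal(state, t):
    """Deterministic policy: move halfway toward the goal at 1.0."""
    return 0.5 * (1.0 - state)


transitions, total_reward = rollout(StubEnv(horizon=5), move_toward_goal)
print(f"Total Reward = {total_reward}")
print(f"Episode length = {len(transitions)}")
```

Keeping the loop in a function like `rollout` makes it straightforward to compare several policies on the same environment by calling it once per policy.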

License

Copyright (c) 2018-2020 Thiago Pereira Bueno. All Rights Reserved.

rddlgym is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

rddlgym is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with rddlgym. If not, see http://www.gnu.org/licenses/.
