
Robust Manipulation Primitive Learning via Domain Contraction (CoRL 2024)

Paper    Project

We propose a bi-level approach for robust primitive learning:

Level-1: Parameter-augmented policy learning using multiple models. We augment the state space with the physical parameters of multiple models and use Tensor Train (TT) to approximate the state-value function and the advantage function.
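
As a rough illustration (not the repository's implementation), the sketch below builds a toy advantage grid over a state, an action, and one physical parameter, then compresses it into TT cores with a plain TT-SVD; the grid, the toy advantage function, and the tt_svd helper are all assumptions for exposition.

    import numpy as np

    def tt_svd(tensor, max_rank=8):
        """Compress a dense tensor into Tensor Train cores by sequential SVD.
        Illustrative helper only; the repository uses its own TT machinery."""
        dims = tensor.shape
        cores, r_prev, mat = [], 1, tensor.reshape(1, -1)
        for k in range(len(dims) - 1):
            mat = mat.reshape(r_prev * dims[k], -1)
            u, s, vt = np.linalg.svd(mat, full_matrices=False)
            r = min(max_rank, len(s))
            cores.append(u[:, :r].reshape(r_prev, dims[k], r))  # (r_{k-1}, n_k, r_k)
            mat = s[:r, None] * vt[:r]
            r_prev = r
        cores.append(mat.reshape(r_prev, dims[-1], 1))
        return cores

    # Toy advantage grid A(x, u, m): state x, action u, and one physical
    # parameter m (e.g., object mass) appended as an extra tensor mode.
    xs = np.linspace(-1.0, 1.0, 32)
    us = np.linspace(-1.0, 1.0, 32)
    ms = np.linspace(0.5, 2.0, 16)
    X, U, M = np.meshgrid(xs, us, ms, indexing="ij")
    A = -(M * U + X) ** 2                  # smooth, hence low TT-rank
    cores = tt_svd(A, max_rank=6)
    print([c.shape for c in cores])        # [(1, 32, 6), (6, 32, 6), (6, 16, 1)]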

Level-2: Parameter-conditioned policy retrieval through domain contraction. At execution time, we can obtain a rough estimate of the physical parameters of the manipulation domain. This instance-specific information can be used to retrieve a parameter-conditioned policy that is expected to be much closer to instance-optimal.

Our algorithm is based on Generalized Policy Iteration using Tensor Train (TTPI) for policy learning. In this work, we further enable robust policy learning through domain contraction, by taking the product of tensor cores between a rough parameter distribution and the parameter-augmented advantage function.
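
Conceptually, this contraction reduces the parameter-augmented advantage function to a parameter-conditioned one by contracting the parameter mode of the TT with a discretized parameter distribution. Below is a minimal sketch reusing the toy cores and parameter grid ms from the example above; the helper name and the Gaussian estimate are illustrative assumptions, not the repository's API.

    def contract_parameter_mode(cores, weights):
        """Contract the last TT core (the parameter mode) with a discretized
        parameter distribution, leaving a TT over states and actions only."""
        w = np.asarray(weights, dtype=float)
        w = w / w.sum()                              # normalize the rough estimate
        boundary = np.einsum("rpk,p->rk", cores[-1], w)
        # Absorb the resulting boundary matrix into the neighboring core.
        new_last = np.einsum("qnr,rk->qnk", cores[-2], boundary)
        return cores[:-2] + [new_last]

    # Rough instance-specific estimate: a Gaussian over the parameter grid ms.
    w = np.exp(-0.5 * ((ms - 1.2) / 0.1) ** 2)
    conditioned = contract_parameter_mode(cores, w)  # TT advantage over (x, u)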

Dependencies

  • Python version: 3.7 (Tested)

  • Install necessary packages:

    pip install -r requirements.txt

Parameter-augmented Policy Learning

We utilize TTPI for parameter-augmented policy learning. To learn more about TTPI, please run the following tutorial notebook:

       examples/TTPI_example_pushing.ipynb
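
If you want the underlying idea before opening the notebook: TTPI follows the classic evaluate-then-improve loop of generalized policy iteration, with the value and advantage functions stored as TT approximations rather than tables. The tabular toy below (an assumption for exposition, not TTPI itself) shows that loop.

    import numpy as np

    # Toy tabular MDP, assumed only to illustrate the evaluate/improve loop
    # that TTPI scales up by storing V and the advantage function in TT format.
    rng = np.random.default_rng(0)
    n_s, n_a, gamma = 20, 4, 0.95
    P = rng.random((n_a, n_s, n_s))
    P /= P.sum(axis=2, keepdims=True)      # P[a, s, s']: transition probabilities
    R = rng.random((n_s, n_a))             # R[s, a]: immediate reward

    V = np.zeros(n_s)
    for _ in range(500):
        Q = R + gamma * np.einsum("ast,t->sa", P, V)  # one-step evaluation
        V_new = Q.max(axis=1)                         # greedy improvement
        if np.max(np.abs(V_new - V)) < 1e-8:
            break
        V = V_new

    A = Q - V[:, None]                     # advantage function
    policy = A.argmax(axis=1)              # greedy policy recovered from A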

Based on TTPI, we augment the state space with parameters to learn parameter-augmented policies for Push and Reorientation.

The notebooks for training these policies are listed below, in the examples folder. You can skip training by directly loading the pretrained models stored in the tt_models folder, as sketched after the list.

        PushingTask_policy_training.ipynb

        ReOrientation_policy_training.ipynb
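
The exact file names and serialization format of the pretrained models are defined inside the notebooks; the snippet below is only a hypothetical sketch of loading one of them (the path and the pickle format are assumptions).

    import pickle

    # Hypothetical sketch: actual file names/formats are set in the notebooks.
    with open("tt_models/pushing_policy.pickle", "rb") as f:
        model = pickle.load(f)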

Note: The parameter-augmented policy of Hit can be obtained analytically, as shown in Hit_policy_training_retrieval.ipynb.

Parameter-conditioned Policy Retrieval through Domain Contraction

In the examples folder:

Hit:

        Hit_policy_training_retrieval.ipynb

Push:

        PushingTask_policy_retrieval.ipynb

Reorientation:

        ReOrientation_policy_retrieval.ipynb

This repository is maintained by Teng Xue.

Contact: teng.xue@idiap.ch
