State Representations as Incentives for Reinforcement Learning Agents: A Sim2Real Analysis on Robotic Grasping

Choosing an appropriate representation of the environment for the underlying decision-making process of the reinforcement learning agent is not always straightforward. The state representation should be inclusive enough to allow the agent to informatively decide on its actions and disentangled enough to simplify policy training and the corresponding sim2real transfer.

Given this outlook, this work examines the effect of various representations in incentivizing the agent to solve a specific robotic task: antipodal and planar object grasping. A continuum of state representations is defined, starting from hand-crafted numerical states to encoded image-based representations, with decreasing levels of induced task-specific knowledge. The effects of each representation on the ability of the agent to solve the task in simulation and the transferability of the learned policy to the real robot are examined and compared against a model-based approach with complete system knowledge.

The results show that reinforcement learning agents using numerical states can perform on par with non-learning baselines. Furthermore, we find that agents using image-based representations from pre-trained environment embedding vectors perform better than end-to-end trained agents, and hypothesize that separation of representation learning from reinforcement learning can benefit sim2real transfer. Finally, we conclude that incentivizing the state representation with task-specific knowledge facilitates faster convergence for agent training and increases success rates in sim2real robot control.
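
As a loose illustration of the decoupling hypothesized above, the sketch below shows the general pattern of separating representation learning from reinforcement learning: a pre-trained autoencoder's encoder is frozen and its latent vector serves as the state input to the policy network. This is not the repository's actual code; `FrozenEncoderPolicy` and the dummy encoder are hypothetical names used only for illustration.

```python
# Hypothetical sketch of decoupled representation and policy learning:
# a pre-trained encoder is frozen, and only the policy head is trained by RL.
import torch
import torch.nn as nn

class FrozenEncoderPolicy(nn.Module):
    def __init__(self, encoder: nn.Module, latent_dim: int, action_dim: int):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():
            p.requires_grad = False          # gradients never reach the encoder
        self.policy = nn.Sequential(
            nn.Linear(latent_dim, 64), nn.ReLU(),
            nn.Linear(64, action_dim), nn.Tanh(),
        )

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():                # encoder used purely for inference
            z = self.encoder(image)
        return self.policy(z)

# Dummy stand-in encoder, just to make the sketch runnable.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 32))
agent = FrozenEncoderPolicy(encoder, latent_dim=32, action_dim=4)
action = agent(torch.rand(1, 3, 64, 64))     # -> tensor of shape (1, 4)
```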


Results

  • The Unity simulator and RL agent implementations can be found in the VTPRL repository.

Table I: Mean success rate across the different state representation strategies

| Strategy | Average ± Std. (Idl. Sim.) | Best Model (Idl. Sim.) | Best Model (Rnd. Sim.) | Best Model (Real) |
| --- | --- | --- | --- | --- |
| Ruckig | 100% | 100% | N/A | 100% |
| St. | 100% | 100% | N/A | 100% |
| St. (rnd.) | 100% | 100% | N/A | 100% |
| VtS | 91.6% ± 2.2 | 94% | 70% | 52% |
| IGAE | 96.0% ± 2.8 | 100% | 78% | 84% |
| AE | 82.4% ± 7.1 | 92% | 70% | 60% |
| EtE | 54.8% ± 11.0 | 78% | 44% | 24% |

Table II: Evaluation of autoencoder-based vision models using the KNN-MSE criterion

| Strategy | Mean | Std. | Max | Min |
| --- | --- | --- | --- | --- |
| IGAE | 0.0393 | 0.1679 | 1.8883 | 1.4122 × 10⁻⁶ |
| AE | 0.0459 | 0.1839 | 1.6960 | 1.4122 × 10⁻⁶ |
| VtS | 0.0488 | 0.1946 | 1.6857 | 1.8881 × 10⁻⁶ |
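
For reference, KNN-MSE is typically computed per sample by retrieving the k nearest neighbours in the learned latent space and measuring the mean squared error against the corresponding ground-truth states. The sketch below is one plausible formulation under that assumption, not the repository's evaluation script; `knn_mse` and its arguments are illustrative.

```python
# One plausible KNN-MSE formulation (an assumption, not the repo's script):
# neighbours are found in the learned latent space, and the error is the
# MSE between the ground-truth states of a sample and of its neighbours.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_mse(latents: np.ndarray, states: np.ndarray, k: int = 5) -> np.ndarray:
    """Per-sample KNN-MSE: latents (N, d_z), states (N, d_s)."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(latents)
    _, idx = nn.kneighbors(latents)           # each point's nearest match is itself
    neighbours = states[idx[:, 1:]]           # (N, k, d_s): drop the self-match
    diffs = neighbours - states[:, None, :]   # broadcast sample against neighbours
    return (diffs ** 2).mean(axis=(1, 2))     # one score per sample

rng = np.random.default_rng(0)
scores = knn_mse(rng.random((200, 32)), rng.random((200, 6)))
print(scores.mean(), scores.std(), scores.max(), scores.min())
```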

Citing the Project

To cite this repository in publications:

```bibtex
@article{petropoulakis2024staterepresentationsincentives,
      title={State Representations as Incentives for Reinforcement Learning Agents: A Sim2Real Analysis on Robotic Grasping},
      author={Panagiotis Petropoulakis and Ludwig Gräf and Mohammadhossein Malmir and Josip Josifovski and Alois Knoll},
      year={2024},
      eprint={2309.11984},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2309.11984},
}
```

Acknowledgments

This work has been financially supported by the A-IQ READY project, which has received funding within the Chips Joint Undertaking (Chips JU), the Public-Private Partnership for research, development and innovation under Horizon Europe, and from National Authorities under grant agreement No. 101096658.