Chainer implementation of Adversarial Inverse Reinforcement Learning (AIRL) and Generative Adversarial Imitation Learning (GAIL). The code depends heavily on the reinforcement learning package ChainerRL.
Train an expert and sample expert trajectories
python train_gym.py ppo --gpu $gpu_id --env CartPole-v0 --arch FFSoftmax --steps 50000
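Before passing a demonstration file to `--load_demo`, it can help to check what it contains. The following is a minimal sketch, assuming the file is a standard NumPy `.npz` archive (as the `PathOfDemonstrationNpzFile` placeholder suggests). The helper `summarize_npz` is hypothetical and not part of this repository, and the array names stored in the archive are not documented here, so the sketch simply lists whatever it finds.

```python
import numpy as np


def summarize_npz(path):
    """Return {array_name: (shape, dtype_string)} for every array in an .npz archive.

    Hypothetical helper for illustration; it makes no assumption about
    which array names the demonstration file actually uses.
    """
    summary = {}
    # allow_pickle is left at NumPy's default (False), so only plain arrays are read.
    with np.load(path) as archive:
        for name in archive.files:
            array = archive[name]
            summary[name] = (array.shape, str(array.dtype))
    return summary
```

For example, `print(summarize_npz("demo.npz"))`, where `demo.npz` stands in for the actual demonstration file, prints the shape and dtype of each stored array.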
Run GAIL
python train_gym.py gail --gpu $gpu_id --env CartPole-v0 --arch FFSoftmax --steps 100000 \
--load_demo ${PathOfDemonstrationNpzFile} --update-interval 128 --entropy-coef 0.01
Run AIRL
python train_gym.py airl --gpu $gpu_id --env CartPole-v0 --arch FFSoftmax --steps 100000 \
--load_demo ${PathOfDemonstrationNpzFile} --update-interval 128 --entropy-coef 0.01
License: MIT