Pytorch implementation of LOLA (https://arxiv.org/abs/1709.04326) using DiCE (https://arxiv.org/abs/1802.05098)
[lr_in=0.3, lr_out=0.2, lr_v=0.1, batch_size=128, len_rollout=150, use_baseline=True]
[lr_in=0.3, lr_out=0.2, lr_v=0.1, batch_size=128, len_rollout=150, use_baseline=True] (It seems that 2 lookaheads is the most stable model with this set of hyperparameters)
[lr_in=0.3, lr_out=0.2, batch_size=128, len_rollout=150]
[lr_in=0.3, lr_out=0.2, batch_size=128, len_rollout=150]
The authors of the paper have their own version (Tensorflow) available here: https://github.com/alshedivat/lola