Releases: google-research/batch-ppo

TensorFlow Agents 1.4.0

16 Apr 20:14
2b6f509

Features:

  • Split episodes into chunks for training. This reduces memory requirements when training from pixels and in some cases increases data efficiency.
  • Use lambda variable initializers everywhere to support embedding the simulation into a larger graph.
  • Upgrade to newest Gym version, including new environment names and dtypes for spaces.
  • Support regularization losses returned by the network.
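As an illustration of the episode chunking mentioned above, here is a minimal sketch in plain Python. The function name `split_into_chunks` and the parameter `chunk_length` are hypothetical, not the library's API, and the release notes do not say how the library handles a short final chunk (for example by padding); this sketch simply keeps it shorter.

```python
def split_into_chunks(episode, chunk_length):
    """Split a list of per-step records into consecutive chunks.

    Hypothetical helper for illustration only. The final chunk may be
    shorter than chunk_length.
    """
    if chunk_length <= 0:
        raise ValueError("chunk_length must be positive")
    return [
        episode[start:start + chunk_length]
        for start in range(0, len(episode), chunk_length)
    ]


# Stand-in for an episode of 10 (observation, action, reward) steps.
episode = list(range(10))
chunks = split_into_chunks(episode, chunk_length=4)
print(chunks)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Training on chunks means that only `chunk_length` steps need to be held per training sequence, which is likely where the memory saving for pixel observations comes from.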

Improvements:

  • Remove MuJoCo dependency from tests.
  • Speed up smoke tests for faster iteration times.
  • Enable continuous integration.

Bugs:

  • Fix off-by-one bug in FrameHistory environment wrapper.

TensorFlow Agents 1.3.0

18 Jan 13:33

Features:

  • Represent policies as tf.distributions objects, so that the algorithms are independent of the action distribution.
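To show why a distribution-object interface makes an algorithm independent of the action distribution, here is a small sketch in plain Python rather than TensorFlow. The classes `Normal` and `Categorical` and the function `probability_ratio` are hypothetical stand-ins written for this example; the only assumption is that each distribution exposes a `log_prob()` method.

```python
import math


class Normal:
    """Hypothetical one-dimensional continuous action distribution."""

    def __init__(self, mean, stddev):
        self.mean = mean
        self.stddev = stddev

    def log_prob(self, action):
        z = (action - self.mean) / self.stddev
        return -0.5 * z * z - math.log(self.stddev) - 0.5 * math.log(2 * math.pi)


class Categorical:
    """Hypothetical discrete action distribution."""

    def __init__(self, probs):
        self.probs = probs

    def log_prob(self, action):
        return math.log(self.probs[action])


def probability_ratio(new_policy, old_policy, action):
    """Compute pi_new(a) / pi_old(a) using only the log_prob() interface."""
    return math.exp(new_policy.log_prob(action) - old_policy.log_prob(action))


# The same function serves discrete and continuous policies.
discrete = probability_ratio(Categorical([0.5, 0.5]), Categorical([0.25, 0.75]), 0)
continuous = probability_ratio(Normal(0.0, 1.0), Normal(0.0, 1.0), 0.3)
```

Because `probability_ratio` never inspects the distribution type, swapping the action space requires no change to the algorithm code.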

Improvements:

  • Move reusable components into agents.parts package.
  • Add nesting tools to handle nested tuples, lists, and dicts.
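A nesting tool of this kind can be sketched as a recursive map over the structure. The function `nested_map` below is a hypothetical illustration in plain Python, not the library's implementation or signature.

```python
def nested_map(function, structure):
    """Apply function to every leaf of nested tuples, lists, and dicts.

    Hypothetical helper for illustration; the container types are
    preserved and anything else is treated as a leaf.
    """
    if isinstance(structure, tuple):
        return tuple(nested_map(function, element) for element in structure)
    if isinstance(structure, list):
        return [nested_map(function, element) for element in structure]
    if isinstance(structure, dict):
        return {
            key: nested_map(function, value)
            for key, value in structure.items()
        }
    return function(structure)


doubled = nested_map(lambda x: x * 2, {'a': (1, [2, 3]), 'b': 4})
print(doubled)  # {'a': (2, [4, 6]), 'b': 8}
```

Helpers like this let code treat a single value and a nested collection of values uniformly.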

Bugs:

  • Fix PPO not learning on GPU by placing the optimizer on the GPU.

TensorFlow Agents 1.2.0

13 Nov 20:51

Features:

  • Use a single optimizer for PPO to better train shared feature layers.
  • Allow calling methods of the process environment.

Improvements:

  • Improve default and MuJoCo configs.
  • Report both training and evaluation scores.

Bugs:

  • Fix likelihood calculation that halved the gradients for the action standard deviation.

TensorFlow Agents 1.1.0

04 Oct 15:49

Features:

  • Policy networks are now defined as functions mapping sequences of observations to sequences of actions. As a result, feed-forward policies are faster, and memory-based agents are easier to implement. Previously, networks had to be defined as RNNCells.
  • All functions of the agent interface now receive a tensor of agent indices. This adds the flexibility to process observations in smaller batches. Previously, perform() and experience() were defined on data from all the environments.
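The agent-indices idea can be sketched with a toy agent in plain Python. The class `CountingAgent`, its placeholder policy, and the use of Python lists instead of tensors are all assumptions made for this example; only the method name `perform()` comes from the notes above, and its real signature may differ.

```python
class CountingAgent:
    """Toy agent that tracks per-environment step counts (hypothetical)."""

    def __init__(self, num_envs):
        self.steps = [0] * num_envs

    def perform(self, agent_indices, observations):
        """Return one action per given index; only those agents are updated."""
        actions = []
        for index, observation in zip(agent_indices, observations):
            self.steps[index] += 1
            actions.append(-observation)  # Placeholder policy.
        return actions


agent = CountingAgent(num_envs=4)
# Process only environments 1 and 3 in this call.
actions = agent.perform(agent_indices=[1, 3], observations=[0.5, -2.0])
print(actions)      # [-0.5, 2.0]
print(agent.steps)  # [0, 1, 0, 1]
```

Passing the indices explicitly lets a caller step a subset of environments at a time while the agent keeps its per-environment state aligned.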

TensorFlow Agents 1.0.0

08 Sep 23:25

Initial release.