Features:
- Split episodes into chunks for training. This reduces memory requirements when training from pixels and in some cases increases data efficiency.
- Use lambda variable initializers everywhere to support embedding the simulation into a larger graph.
- Upgrade to newest Gym version, including new environment names and dtypes for spaces.
- Support regularization losses returned by the network.
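
As a rough illustration of the episode chunking mentioned in the first feature, here is a minimal sketch in plain Python. The names `split_into_chunks` and `chunk_length` are hypothetical and chosen for this example only; they are not the library's API, and how the library batches or pads chunks is not shown here.

```python
def split_into_chunks(episode, chunk_length):
    """Split a sequence of time steps into consecutive chunks.

    The final chunk may be shorter than chunk_length when the episode
    length is not an exact multiple of it.
    """
    if chunk_length <= 0:
        raise ValueError('chunk_length must be positive')
    return [
        episode[start:start + chunk_length]
        for start in range(0, len(episode), chunk_length)]


# Stand-in for an episode of 10 time steps (hypothetical data).
episode = list(range(10))
chunks = split_into_chunks(episode, chunk_length=4)
print([len(chunk) for chunk in chunks])  # prints [4, 4, 2]
```

Training on one chunk at a time means only `chunk_length` steps need to be held in memory at once, which is presumably where the memory savings for pixel observations come from.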
Improvements:
- Remove MuJoCo dependency from tests.
- Speed up smoke tests for faster iteration times.
- Enable continuous integration.
Bugs:
- Fix off-by-one bug in FrameHistory environment wrapper.