TRPO agent #204

Merged on Mar 15, 2018 (32 commits)
Commits
c9fbaa4
Add chainerrl.misc.conjugate_gradient
muupan Dec 14, 2017
c99d554
Add tests for conjugate_gradient
muupan Dec 14, 2017
fea10ec
Add TRPO agent
muupan Dec 16, 2017
89ccaac
Check return type of conjugate_gradient
muupan Dec 16, 2017
b215013
Improve docstring of envs.ABC
muupan Dec 16, 2017
630ff06
Add a TRPO example for gym
muupan Dec 16, 2017
1b4a68c
Use policies.FCGaussianPolicyWithStateIndependentCovariance for tests
muupan Dec 17, 2017
3bbb159
Simplify code
muupan Dec 17, 2017
f8d4f74
Check if the computation graph contains old-style functions
muupan Dec 17, 2017
3b1513a
Set entropy_coef=0
muupan Dec 17, 2017
114fefc
It doesn't work with 3.0.0 because of insufficient support of
muupan Dec 17, 2017
fd4ce71
Parameterize variance as log std
muupan Dec 17, 2017
96647cd
Allow saved attributes to be None
muupan Dec 18, 2017
6bfe868
Add obs_normalizer and conjugate_gradient_max_iter
muupan Dec 18, 2017
2795155
Use settings of http://arxiv.org/abs/1709.06560
muupan Dec 18, 2017
77a2871
Update on stop_episode_and_train as well as act_and_train
muupan Dec 18, 2017
7e8f3ba
Add --trpo-update-interval
muupan Dec 18, 2017
1fa5a5e
Add train_trpo_gym.py to test_examples.sh
muupan Dec 18, 2017
90f39d2
Merge branch 'master' into trpo
muupan Dec 28, 2017
fde05d8
Use different seeds for train and test envs
muupan Dec 28, 2017
396b947
Merge branch 'master' into trpo
muupan Feb 13, 2018
65f3524
Use exp(2*x) instead of exp(x)**2
muupan Feb 13, 2018
514c0f3
Test with different dtypes
muupan Feb 13, 2018
4cdc3a7
Remove unnecessary transpose
muupan Feb 13, 2018
0534d73
Remove unnecessary dataset_iter.reset()
muupan Feb 13, 2018
77b3198
Compute CG answer without inv_mat
muupan Feb 13, 2018
41b6f64
Use chainer.grad and raise an error for None grads
muupan Mar 14, 2018
c564ca3
Fix style of long string literals
muupan Mar 14, 2018
c65278a
Check xp consistency
muupan Mar 14, 2018
06ad59e
Use pkg_resources.parse_version to handle rc and b
muupan Mar 14, 2018
ed600de
Fix a flake8 error
muupan Mar 15, 2018
4d4c1cc
Merge branch 'master' into trpo
muupan Mar 15, 2018
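
Several of the commits above concern `chainerrl.misc.conjugate_gradient`, which TRPO needs in order to solve a linear system using only matrix-vector products. The following is a minimal, dependency-free sketch of the standard conjugate gradient method, not the code from this PR. The function name, the argument names, and the defaults (`tol`, `max_iter`) are assumptions made for illustration, and plain Python lists stand in for arrays.

```python
def dot(u, v):
    """Inner product of two equal-length sequences of floats."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient_sketch(a_product, b, tol=1e-10, max_iter=10):
    """Approximately solve A x = b for symmetric positive definite A.

    Hypothetical helper for illustration: a_product maps v to A v, so the
    matrix A never has to be built or inverted explicitly.
    """
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x, which equals b for x = 0
    p = list(b)  # first search direction is the residual
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old < tol:
            break  # squared residual norm is small enough
        ap = a_product(p)
        alpha = rs_old / dot(p, ap)  # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        # New direction is conjugate to the previous ones with respect to A.
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


def a_product(v):
    """Product with the example matrix A = [[4, 1], [1, 3]]."""
    return [4.0 * v[0] + 1.0 * v[1], 1.0 * v[0] + 3.0 * v[1]]


solution = conjugate_gradient_sketch(a_product, [1.0, 2.0])
```

For this 2x2 system the exact solution is `[1/11, 7/11]`, and in exact arithmetic conjugate gradient reaches it in at most two iterations. In TRPO the product function would typically be a Fisher-vector product rather than an explicit matrix.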
4 changes: 4 additions & 0 deletions chainerrl/agent.py
@@ -118,6 +118,8 @@ def __save(self, dirname, ancestors):
         for attr in self.saved_attributes:
             assert hasattr(self, attr)
             attr_value = getattr(self, attr)
+            if attr_value is None:
+                continue
             if isinstance(attr_value, AttributeSavingMixin):
                 assert not any(
                     attr_value is ancestor
@@ -139,6 +141,8 @@ def __load(self, dirname, ancestors):
         for attr in self.saved_attributes:
             assert hasattr(self, attr)
             attr_value = getattr(self, attr)
+            if attr_value is None:
+                continue
             if isinstance(attr_value, AttributeSavingMixin):
                 assert not any(
                     attr_value is ancestor
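
The effect of the two hunks above is that an attribute listed in `saved_attributes` is skipped on both save and load when it is currently `None`. The sketch below illustrates that behaviour with a stand-in class. It is not `AttributeSavingMixin` itself: the class name, the use of `pickle`, and the `.pkl` file naming are assumptions made for illustration. Treating `obs_normalizer` as the optional attribute is likewise a guess based on the commit list.

```python
import os
import pickle
import tempfile


class SavingSketch(object):
    """Hypothetical stand-in for an agent with optional saved attributes."""

    saved_attributes = ('model', 'obs_normalizer')

    def __init__(self, model, obs_normalizer=None):
        self.model = model
        self.obs_normalizer = obs_normalizer

    def save(self, dirname):
        os.makedirs(dirname, exist_ok=True)
        for attr in self.saved_attributes:
            attr_value = getattr(self, attr)
            if attr_value is None:
                continue  # nothing to write for an unset attribute
            with open(os.path.join(dirname, attr + '.pkl'), 'wb') as f:
                pickle.dump(attr_value, f)

    def load(self, dirname):
        for attr in self.saved_attributes:
            if getattr(self, attr) is None:
                continue  # mirror save(): no file is expected
            with open(os.path.join(dirname, attr + '.pkl'), 'rb') as f:
                setattr(self, attr, pickle.load(f))


tmp_dir = tempfile.mkdtemp()
agent = SavingSketch(model={'w': 1.0})  # obs_normalizer stays None
agent.save(tmp_dir)
saved_files = sorted(os.listdir(tmp_dir))
```

Because the check runs on the current value, the saving and loading objects need to agree on which attributes are `None`. An object whose attribute is unset will not pick up a value for it even if a file exists.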
1 change: 1 addition & 0 deletions chainerrl/agents/__init__.py
@@ -14,3 +14,4 @@
 from chainerrl.agents.reinforce import REINFORCE # NOQA
 from chainerrl.agents.residual_dqn import ResidualDQN # NOQA
 from chainerrl.agents.sarsa import SARSA # NOQA
+from chainerrl.agents.trpo import TRPO # NOQA