Code to collect demonstrations from an agent. #468

Merged
merged 22 commits into chainer:master on Jul 15, 2019

prabhatnagarajan
Contributor

No description provided.

@ummavi ummavi mentioned this pull request Jun 13, 2019
help='Random seed [0, 2 ** 31)')
parser.add_argument('--gpu', type=int, default=0,
help='GPU to use, set to -1 if no GPU.')
parser.add_argument('--demo', action='store_true', default=False)
Member

The --demo argument is not used (the script always behaves as if it were true).


opt.setup(q_func)

rbuf = replay_buffer.ReplayBuffer(10 ** 6)
Member

It depends on the situation, but I think we should make clear that opt and rbuf are dummies (they exist only because the agent requires them).

@keisuke-nakata
Member

CI fails 😢

@prabhatnagarajan prabhatnagarajan added this to the v0.8 milestone Jul 9, 2019
clip_rewards=False)
env.seed(int(args.seed))
# Randomize actions like epsilon-greedy
env = chainerrl.wrappers.RandomizeAction(env, 0.01)
Member

Is this randomization intentional?

Contributor Author

Yes. We wouldn't want to collect demonstrations that are identical in a deterministic env.
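
To illustrate the point about deterministic environments, here is a minimal, self-contained sketch. It is not ChainerRL's implementation of RandomizeAction: CountingEnv, RandomizeActionSketch and collect_episode are hypothetical names made up for this example, and the toy environment stands in for Atari. It only shows that a fixed policy in a deterministic environment yields the same episode every time, and that occasionally swapping in a random action (like epsilon-greedy) breaks that.

```python
import random


class CountingEnv:
    """Hypothetical deterministic toy environment (not part of ChainerRL)."""

    def __init__(self, horizon=20):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        obs = self.t
        reward = float(action)  # reward depends only on the executed action
        done = self.t >= self.horizon
        return obs, reward, done, {}


class RandomizeActionSketch:
    """With probability random_fraction, execute a uniformly random action
    instead of the one the agent chose. Assumed behaviour for illustration."""

    def __init__(self, env, random_fraction, n_actions, seed=None):
        self.env = env
        self.random_fraction = random_fraction
        self.n_actions = n_actions
        self.rng = random.Random(seed)

    def reset(self):
        return self.env.reset()

    def step(self, action):
        if self.rng.random() < self.random_fraction:
            action = self.rng.randrange(self.n_actions)
        return self.env.step(action)


def collect_episode(env, policy):
    """Run one episode; record (observation, chosen action, reward) tuples."""
    obs = env.reset()
    done = False
    trajectory = []
    while not done:
        action = policy(obs)
        next_obs, reward, done, _ = env.step(action)
        trajectory.append((obs, action, reward))
        obs = next_obs
    return trajectory
```

With a constant policy, two calls to collect_episode on a bare CountingEnv return identical trajectories. Wrapping the environment in RandomizeActionSketch with a small random_fraction is what makes repeated episodes differ.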

gpu="$1"

# Chainer 4 does not support open_pickle_dataset_writer, which is used by the demonstration collection example.
pickle_writer_support=$(python -c "import chainer; from distutils.version import StrictVersion; print(1 if StrictVersion(chainer.__version__) >= StrictVersion('5.0.0') else 0)")
Member

In some environments this line may not work, because users might have python3 rather than python on their path.
I think this line can be removed now, because we no longer support Chainer v4.

Contributor Author

I don't think the Chainer v4 support has been dropped yet. That will be done separately in a different PR.

Regarding your other comment, I'm following this: https://github.com/chainer/chainerrl/pull/204/files#diff-875175ed9e296e37b03562cf27435d32R11.

And wouldn't your python2 vs. python3 comment apply to all example tests? E.g. https://github.com/chainer/chainerrl/blob/master/examples_tests/atari/test_a2c.sh

Member

Ah, you're completely right.
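
As a side note on the version gate discussed in this thread: the shell line relies on distutils.version.StrictVersion, and distutils was deprecated in Python 3.10 and removed in 3.12. A possible alternative, sketched below under the assumption that comparing the leading numeric components is enough, avoids that import. parse_version and supports_pickle_writer are hypothetical helper names, not part of Chainer or ChainerRL.

```python
def parse_version(version):
    """Turn '5.4.0' or '7.0.0b1' into a tuple of leading integers.

    Pre-release suffixes such as 'b1' are dropped, so '5.0.0b1' compares
    equal to '5.0.0' here (StrictVersion would order it before '5.0.0').
    """
    parts = []
    for piece in version.split('.'):
        digits = ''
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad so that '5.0' and '5.0.0' compare equal.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def supports_pickle_writer(chainer_version, minimum='5.0.0'):
    """Return True if chainer_version is at least the minimum version."""
    return parse_version(chainer_version) >= parse_version(minimum)
```

In the test script this could be called as supports_pickle_writer(chainer.__version__), printing 1 or 0 as the existing line does.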

Member

@keisuke-nakata keisuke-nakata left a comment

LGTM

@prabhatnagarajan prabhatnagarajan merged commit b73156d into chainer:master Jul 15, 2019
@prabhatnagarajan prabhatnagarajan deleted the demo_collection branch July 15, 2019 15:42
4 participants