Added Policy Gradient Tutorial #82
base: tutorials
Conversation
tutorials/Policy_Gradient_Vanilla.jl
Outdated
    return mean(-logpi .* A_t)
end

opt = ADAM(params(policy),η)
opt = ADAM(params(policy), η)
could be
opt = ADAM(η)
in accordance with the new optimizer API.
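For reference, a minimal sketch of the suggested pattern, assuming the post-0.10 Flux optimiser API (`η` and `policy` are the tutorial's names):

```julia
# Sketch only: under the newer API the optimiser holds just its
# hyperparameters; the parameters are supplied later, at update time.
opt = ADAM(η)
ps  = Flux.params(policy)   # tracked separately from the optimiser
```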
tutorials/Policy_Gradient_Vanilla.jl
Outdated
G_t = γ*G_t + r

l = l .+ loss(state,act,G_t)
Broadcasting is not required when adding a scalar value.
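For instance (a sketch; `loss` returns a scalar here):

```julia
l = l + loss(state, act, G_t)   # plain + suffices for a scalar; .+ is for arrays
```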
tutorials/Policy_Gradient_Vanilla.jl
Outdated
l = l .+ loss(state,act,G_t)
Flux.back!(loss(state,act,G_t))
opt()
WRT the new optimizer API, this will become update!(opt, params(model)), but even this throws up errors ("does not find a matching candidate").
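For reference, a hedged sketch of one training step under the newer optimiser API (assuming Flux ≥ 0.10; `loss`, `state`, `act`, `G_t`, and `policy` are the tutorial's names):

```julia
using Flux

# A sketch, not the tutorial's final code: compute explicit gradients,
# then apply them with update!.
ps = Flux.params(policy)
gs = gradient(() -> loss(state, act, G_t), ps)
Flux.Optimise.update!(opt, ps, gs)
```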
Please add a Project.toml and Manifest.toml as well, so it is easier to standardize the environment.
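For reference, a sketch of the standard Pkg workflow that generates these files (the package names and folder are assumptions; the tutorial's actual dependencies may differ):

```julia
using Pkg
Pkg.activate("tutorials")        # creates/uses Project.toml in this folder
Pkg.add(["Flux", "OpenAIGym"])   # records deps and writes Manifest.toml
```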
@dhairyagandhi96 Added the files.
vision/mnist/DCGAN/dcgan.md
Outdated
@@ -0,0 +1,195 @@
# ***Generative Adversarial Network Tutorial***
Can we just use normal headings for these rather than the extra formatting / html tags?
@tejank10 are you happy with the changes made here, or is there more to do?
@MikeInnes Thanks for the reply. I apologize for making a few errors before. The DCGAN code should not have been included in this PR; there is a separate PR for that, so I have corrected it by removing the GAN code. The changes you mentioned for the GAN part will be updated in the respective PR.
@tejank10 I have made the changes requested. Sorry for having delayed this for so long; I got into other work and did not fix the errors that were coming up. The changes have been completed now. I have also added functions to normalize the discounted rewards, which should aid in training the network.
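A hedged sketch of what the discounted-return computation and normalization might look like (the helper names are hypothetical; the recurrence matches the G_t = γ*G_t + r line in the diff above):

```julia
using Statistics

# Accumulate discounted returns backwards over an episode's rewards.
function discounted_returns(rewards, γ)
    G = similar(rewards)
    G_t = zero(eltype(rewards))
    for t in length(rewards):-1:1
        G_t = γ * G_t + rewards[t]   # same recurrence as in the tutorial
        G[t] = G_t
    end
    return G
end

# Standardize to zero mean and unit variance; eps guards against division by zero.
normalize_returns(G) = (G .- mean(G)) ./ (std(G) + eps(eltype(G)))
```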
It will also need to be in its own folder, and have a simple README. Otherwise this is looking good I think, but it'd be good to hear from @tejank10.
@MikeInnes Sorry for the delayed response. I have made the changes. Is the README sufficient for now, or is there something more to be added?
Description
Implementation of Vanilla Monte Carlo Policy Gradients on the CartPole-v0 environment, added as a tutorial.
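The core objective, reconstructed as a hedged sketch from the diff above (the indexing of the taken action's probability is an assumption; `policy` is the tutorial's network):

```julia
# Vanilla policy gradient (REINFORCE): minimizing -log π(a|s) · G_t
# performs gradient ascent on the expected return.
function loss(state, act, A_t)
    p = policy(state)            # action probabilities from the network
    logpi = log.(p[act, :])      # log-probability of the action(s) taken
    return mean(-logpi .* A_t)   # matches the loss returned in the diff
end
```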
Tests
Run the script.