Learn & Apply reinforcement learning techniques on complex continuous control domain to achieve maximum rewards. In the continuous control domain, where actions are continuous and often high-dimensional such as OpenAI-Gym environment Humanoid-V2. The Humanoid environment has 377 Observation dimensions and 17 action dimensions. This problem requires temporal difference learning compared to supervised learning since it has so many moving parts that are hard to debug, and they require substantial efforts in tuning in order to get good results. Also, in supervised learning problems, progress has been driven by large labeled datasets like ImageNet. In Reinforcement Learning, the closest equivalent would be a large and diverse collection of environments.
-
Notifications
You must be signed in to change notification settings - Fork 1
rajkumargithub/MSDS696
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
About
Humanoid V2
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published