Continuous Proximal Policy Optimization Tutorial with OpenAI gym environment



In this tutorial, we’ll learn more about continuous Reinforcement Learning agents and how to teach BipedalWalker-v3 to walk!

Reinforcement Learning in the real world is still an ill-defined problem. The agent has to be greedy, but not too greedy… One might conjecture that an optimal agent should behave in a Bayesian way, which again is not always what we want, nor the design goal of our brain. We want the agent to be curious so it can exploit the environment whenever possible, but not so curious that it stops doing the work we give it.

If you were the head of a company, you could compare it to training an employee: you want them to be exceptionally efficient at their job, while at the same time you want them to keep working for you. Which is hard, if not impossible (unless you're Google, of course). For more information, watch my tutorial.

Text version tutorial: https://pylessons.com/BipedalWalker-v3-PPO/
Full video playlist: https://www.youtube.com/watch?v=D795oNqa-Vk&list=PLbMO9c_jUD47r9QZKpLn5CY_Mt-NFY8cC
GitHub code: https://github.com/pythonlessons/Reinforcement_Learning

Support My Channel Through Patreon:
https://www.patreon.com/PyLessons

One-Time Contribution Through PayPal:
https://www.paypal.com/paypalme/PyLessons
