The third part of this nanodegree program covers policy-based methods in deep reinforcement learning. You can find all of the coding exercises from the lessons in this GitHub repository.
In this lesson, you will learn about methods such as hill climbing, simulated annealing, and adaptive noise scaling. You'll also learn about cross-entropy methods and evolution strategies.
In this lesson, you'll study REINFORCE, along with improvements we can make to lower the variance of policy gradient algorithms.
In this lesson, you'll learn about Proximal Policy Optimization (PPO), a policy gradient method that limits how far each update can move the policy away from the one that collected the data.
In this lesson, you'll learn how to combine value-based and policy-based methods into actor-critic methods, bringing together the best of both worlds, to solve challenging reinforcement learning problems.
In this optional lesson, you'll learn how to apply deep reinforcement learning techniques for optimal execution of portfolio transactions.
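As a preview of the first lesson's material, here is a minimal sketch of hill climbing with adaptive noise scaling. It is an illustration under stated assumptions, not the course's implementation: the function names, the noise bounds, and `toy_return` (a hypothetical stand-in for the episode return you would normally measure by running the policy in an environment) are all made up for this example.

```python
import random


def hill_climb(score_fn, theta, noise_scale=0.5, iterations=200, seed=0):
    """Stochastic hill climbing with adaptive noise scaling (illustrative sketch)."""
    rng = random.Random(seed)
    best_score = score_fn(theta)
    for _ in range(iterations):
        # Propose a candidate by adding Gaussian noise to every parameter.
        candidate = [t + noise_scale * rng.gauss(0.0, 1.0) for t in theta]
        candidate_score = score_fn(candidate)
        if candidate_score > best_score:
            # Improvement: keep the candidate and narrow the search radius.
            theta, best_score = candidate, candidate_score
            noise_scale = max(1e-3, noise_scale / 2)
        else:
            # No improvement: keep the old parameters and widen the search radius.
            noise_scale = min(2.0, noise_scale * 2)
    return theta, best_score


def toy_return(theta):
    # Hypothetical objective with its maximum (0.0) at theta = (1, -2).
    return -((theta[0] - 1.0) ** 2 + (theta[1] + 2.0) ** 2)


best_theta, best_score = hill_climb(toy_return, [0.0, 0.0])
print(best_theta, best_score)
```

Because a candidate is accepted only when it scores strictly higher, the returned score can never be worse than the starting point. In a reinforcement learning setting, `theta` would be the policy's weights and `score_fn` would run one or more episodes and return the total reward.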