Deep Reinforcement Learning in Python
Offered By: DataCamp
Course Description
Overview
Learn and use powerful Deep Reinforcement Learning algorithms, including refinement and optimization techniques.
Embark on a journey to empower machines through Deep Reinforcement Learning (DRL). This course offers hands-on experience with powerful algorithms using PyTorch and Gymnasium.
Start with DRL foundations and traditional Reinforcement Learning, then implement Deep Q-Networks (DQN) with advanced refinements like Prioritized Experience Replay.
Advance your skills with policy-based methods and explore industry-standard algorithms like Proximal Policy Optimization (PPO) before optimizing your models using Optuna.
Syllabus
- Introduction to Deep Reinforcement Learning
- Discover how deep reinforcement learning improves upon traditional Reinforcement Learning while studying and implementing your first Deep Q-learning algorithm.
- Deep Q-learning
- Dive into Deep Q-learning by implementing the original DQN algorithm, featuring Experience Replay, epsilon-greediness, and fixed Q-targets. Beyond DQN, you will then explore two fascinating extensions that improve the performance and stability of Deep Q-learning: Double DQN and Prioritized Experience Replay.
- Introduction to Policy Gradient Methods
- Learn about the foundational concepts of policy gradient methods found in DRL. You will begin with the policy gradient theorem, which forms the basis for these methods. Then, you will implement the REINFORCE algorithm, a powerful approach to learning policies. The chapter will then guide you through Actor-Critic methods, focusing on the Advantage Actor-Critic (A2C) algorithm, which combines the strengths of both policy gradient and value-based methods to enhance learning efficiency and stability.
- Proximal Policy Optimization and DRL Tips
- Explore Proximal Policy Optimization (PPO) for robust DRL performance. Next, you will examine the use of an entropy bonus in PPO, which encourages exploration by preventing premature convergence to deterministic policies. You'll also learn about batch updates in policy gradient methods. Finally, you will tune hyperparameters with Optuna, a powerful framework for optimizing the performance of your DRL models.
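The DQN chapter above centers on epsilon-greedy action selection and Experience Replay. As a minimal, framework-free sketch of those two components (plain Python rather than the course's PyTorch code; all names are illustrative):

```python
import random
from collections import deque

def epsilon_greedy(q_values, epsilon):
    """With probability epsilon pick a random action; otherwise act greedily."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) tuples."""
    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest transitions drop out

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Uniform sampling breaks the temporal correlation between
        # consecutive environment steps, stabilizing Q-network training.
        return random.sample(self.buffer, batch_size)
```

Prioritized Experience Replay, covered later in the chapter, replaces the uniform `sample` with one weighted by each transition's TD error.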
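The policy gradient theorem named in the syllabus says the gradient of expected return is E[G · ∇ log π(a|s)], which REINFORCE estimates from sampled episodes. A toy illustration on a two-armed bandit with a softmax policy (a sketch in plain Python, not the course's implementation; names and hyperparameters are illustrative):

```python
import math
import random

def softmax(prefs):
    """Convert action preferences into a probability distribution."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]

def reinforce_bandit(reward_probs, episodes=5000, lr=0.1, seed=0):
    """REINFORCE on a bandit: for softmax, grad log pi(a) = 1[i == a] - pi[i]."""
    rng = random.Random(seed)
    prefs = [0.0] * len(reward_probs)
    for _ in range(episodes):
        pi = softmax(prefs)
        a = rng.choices(range(len(pi)), weights=pi)[0]   # sample an action
        r = 1.0 if rng.random() < reward_probs[a] else 0.0  # bandit reward
        for i in range(len(prefs)):
            grad_log_pi = (1.0 if i == a else 0.0) - pi[i]
            prefs[i] += lr * r * grad_log_pi  # ascend the policy gradient
    return softmax(prefs)
```

The policy shifts probability mass toward the higher-paying arm; A2C improves on this scheme by replacing the raw return with a critic-estimated advantage to reduce variance.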
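PPO's key idea, per the chapter above, is a clipped surrogate objective that keeps each policy update close to the old policy, optionally plus an entropy bonus to sustain exploration. A single-sample sketch of both terms (illustrative helper names, not the course's code):

```python
import math

def ppo_clip_objective(ratio, advantage, clip_eps=0.2):
    """PPO-Clip surrogate: min(r * A, clip(r, 1 - eps, 1 + eps) * A),
    where r = pi_new(a|s) / pi_old(a|s) and A is the advantage estimate."""
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    return min(ratio * advantage, clipped_ratio * advantage)

def entropy(probs):
    """Entropy of a categorical policy; added (scaled by a coefficient)
    to the objective to discourage premature determinism."""
    return -sum(p * math.log(p) for p in probs if p > 0)
```

Clipping caps how much a single update can profit from moving the ratio away from 1, which is what makes PPO's updates conservative and stable.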
Taught by
Timothée Carayol
Related Courses
- Build your first Self Driving Car using AWS DeepRacer - Coursera Project Network via Coursera
- Fundamentals of Deep Reinforcement Learning - Learn Ventures via edX
- Natural Language Processing (NLP) - Microsoft via edX
- Reinforcement Learning Course: Intro to Advanced Actor Critic Methods - freeCodeCamp
- 6.S094: Deep Learning for Self-Driving Cars - Massachusetts Institute of Technology via Independent