DRL university course lecture notes & exercises
| Chapter | Topics covered |
|---|---|
| Hello world | Basic terminology and definitions (based on OpenAI's Spinning Up in Deep RL) |
| RL Basics | MDPs, Policy/Value Iteration, Monte Carlo methods, SARSA & Q-Learning |
| DQN & its derivatives | Deep Q-Network (DQN), Double DQN, Dueling DQN |
| Policy Gradients | REINFORCE, REINFORCE with Baseline, Actor-Critic methods |
| Imitation Learning | Apprenticeship learning, supervised and forward learning, DAgger, DAgger with coaching |
| Multi-Armed Bandit | Bandit algorithms, gradient-based algorithms, contextual bandits, Thompson sampling |
| RL use-case: AlphaGo | Monte Carlo Tree Search, AlphaGo, AlphaZero |
| Meta and Transfer Learning | Concepts in Meta learning and Transfer learning in the context of RL |
| Large action spaces | A survey of papers on handling large action spaces |
| Advanced model learning & exploration | Learning in latent space, next states predictions, exploration schemes |
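The tabular methods in the RL Basics chapter all build on the same temporal-difference update that ex1 implements from scratch. As a taste of that exercise, here is a minimal tabular Q-learning sketch; the 5-state chain environment and all hyperparameters (`alpha`, `gamma`, `eps`) are illustrative assumptions, not course material.

```python
import random

random.seed(0)  # reproducibility of this toy run

# Toy deterministic chain MDP (illustrative assumption): states 0..4,
# actions 0 = left, 1 = right; reaching state 4 ends the episode with reward 1.
N_STATES, N_ACTIONS, GOAL = 5, 2, 4

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
    reward = 1.0 if s2 == GOAL else 0.0
    return s2, reward, s2 == GOAL  # next state, reward, done

# Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
alpha, gamma, eps = 0.1, 0.99, 0.2

for episode in range(500):
    s, done = 0, False
    while not done:
        # epsilon-greedy action selection
        if random.random() < eps:
            a = random.randrange(N_ACTIONS)
        else:
            a = max(range(N_ACTIONS), key=lambda i: Q[s][i])
        s2, r, done = step(s, a)
        target = r + (0.0 if done else gamma * max(Q[s2]))
        Q[s][a] += alpha * (target - Q[s][a])
        s = s2

# Greedy policy after training: move right from every non-terminal state.
policy = [max(range(N_ACTIONS), key=lambda i: Q[s][i]) for s in range(GOAL)]
```

Ex1's DQN part replaces the table `Q` with a neural network trained on the same target, plus a replay buffer and a target network.
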

| Exercise | Description |
|---|---|
| ex1 | Q-Learning and Deep-Q-Learning (DQN) implementations from scratch |
| ex2 | REINFORCE (with and without baseline) and Monte Carlo Actor-Critic implementations from scratch |
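For ex2, the core of REINFORCE is the policy-gradient update theta += alpha * G * grad log pi(a|theta). A minimal sketch on a one-step toy problem with a softmax policy is below; the two-action setup and its rewards are assumptions for illustration, not part of the exercise spec.

```python
import math
import random

random.seed(0)  # reproducibility of this toy run

# One-step toy problem (illustrative assumption): two actions,
# action 1 has a higher reward. Policy is a softmax over preferences theta.
theta = [0.0, 0.0]
alpha = 0.1

def softmax(prefs):
    m = max(prefs)  # subtract max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]

def sample(probs):
    r, c = random.random(), 0.0
    for i, p in enumerate(probs):
        c += p
        if r < c:
            return i
    return len(probs) - 1

for episode in range(2000):
    probs = softmax(theta)
    a = sample(probs)
    reward = 1.0 if a == 1 else 0.2  # assumed toy rewards; action 1 is better
    # REINFORCE: theta_i += alpha * G * d/dtheta_i log pi(a)
    # For a softmax policy: d log pi(a) / d theta_i = 1{i == a} - pi(i)
    for i in range(2):
        theta[i] += alpha * reward * ((1.0 if i == a else 0.0) - probs[i])

probs = softmax(theta)  # should now strongly prefer action 1
```

The baseline variant in ex2 reduces gradient variance by replacing the raw return with `reward - b`, where `b` is e.g. a running average of past returns; the actor-critic part learns `b` as a state-value estimate instead.
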