Course on Reinforcement Learning


Lecture 6: Sample Complexity of ADP Algorithms

Lecture 5: Approximate Dynamic Programming

Lecture 3: Reinforcement Learning Algorithms

Lecture 2: Markov Decision Processes and Dynamic Programming

Lecture 0: Introduction to the Course

Lecture 1: Introduction to Reinforcement Learning


This course introduces the models and mathematical tools used to formalize the problem of learning and decision-making under uncertainty. In particular, we will focus on the frameworks of reinforcement learning and multi-armed bandits. The main topics studied during the course are:

-Historical multi-disciplinary basis of reinforcement learning

-Markov decision processes and dynamic programming

-Stochastic approximation and Monte-Carlo methods

-Function approximation and statistical learning theory

-Approximate dynamic programming

-Introduction to stochastic and adversarial multi-armed bandits

-Learning rates and finite-sample analysis

Where and When

The course on “Reinforcement Learning” will be held at the Department of Mathematics at ENS Cachan, every Tuesday from September 30th to December 16th, from 11:00 to 13:00, in C103 (C109 for practical sessions).


  1. 30/09 -- Markov Decision Processes

  2. 07/10 -- Dynamic Programming

  3. 14/10 -- Reinforcement Learning

  4. 21/10 -- Practical session on Dynamic Programming and Reinforcement Learning

  5. 28/10 -- Multi-armed Bandit (1)

  6. 04/11 -- Practical session on Multi-armed Bandit

  7. 18/11 -- Multi-armed Bandit (2) [announcement and assignment of projects]

  8. 25/11 -- Practical session on Multi-armed Bandit

  9. 02/12 -- Approximate Dynamic Programming

  10. 09/12 -- Practical session on ADP

  11. 16/12 -- Sample Complexity of Approximate Dynamic Programming


Students will be evaluated on the points collected in the practical sessions and on a final project. Project proposals, internships, and PhD positions will be announced towards mid-November.


Note: lecture notes and slides will be updated at each class (the current versions are from last year).


  1. Schedule of the presentation day: schedule.pdf

  2. List of project assignments: assignments.pdf

  3. Two new internship proposals on recommendation systems are available.

  4. The slides on multi-armed bandits have been updated; the new version includes the discussion of linear bandits.

  5. The (preliminary) list of projects is available here: mini-projects.pdf (the list will be updated until the end of the week).

  6. Practical sessions and homework assignments:

  7. Link to the simulator of the grid world domain:

  8. Link to a website explaining some of the concepts covered during the course and applied to Tetris (in French):

Lecture 4: The Multi-arm Bandit Problem