Rémi Munos

 Currently at Google DeepMind

Senior Researcher (DR1)
INRIA Lille - Nord Europe, SequeL team (Sequential Learning)

In 2013-2014 I was at Microsoft Research New England.


 Teaching (Master MVA "Mathématiques, Vision, Apprentissage", ENS Cachan)

Research interests:

Bandit theory

Optimistic algorithms (KL-UCB, UCB-V), Thompson sampling, many-armed bandits

Foundations of Monte-Carlo Tree Search

Optimistic optimization (HOO, SOO, StoSOO), optimistic planning (OP-MDP, OLOP)

Bandits in graphs and other structured spaces

Reinforcement Learning (RL)

Analysis of Reinforcement Learning and Dynamic Programming (DP) with function approximation

Finite-sample analysis of RL and DP (Lasso-TD, LSTD, AVI, API, BRM, compressed-LSTD)

Policy gradient and sensitivity analysis

Sampling methods for MDPs, Bayesian RL, POMDPs

Optimal control in continuous time

Numerical solutions to HJB equations

Stability analysis via viscosity solutions

Variable resolution discretizations

Statistical learning and randomization

Random projections for least squares regression

Adaptive sampling for Monte-Carlo integration

Active learning and sparse bandits


From bandits to Monte-Carlo Tree Search: The optimistic principle applied to optimization and planning, 2014. See tech report.
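To illustrate the optimistic principle underlying this line of work, here is a minimal sketch of the classical UCB1 index policy (Auer et al.), a simpler relative of the KL-UCB and UCB-V variants listed above. The function name, the Bernoulli arm model, and the arm means are illustrative assumptions, not taken from the publications themselves.

```python
import math
import random

def ucb1(arm_means, horizon, seed=0):
    """Sketch of UCB1: at each round, pull the arm maximizing
    empirical mean + sqrt(2 ln t / n_i), i.e. be optimistic
    in the face of uncertainty. Arms are assumed Bernoulli."""
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k    # number of pulls per arm
    sums = [0.0] * k    # cumulative reward per arm
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # initialize: pull each arm once
        else:
            # optimistic index: empirical mean plus exploration bonus
            arm = max(range(k),
                      key=lambda i: sums[i] / counts[i]
                      + math.sqrt(2.0 * math.log(t) / counts[i]))
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts

# As the horizon grows, the pull counts concentrate on the best arm.
pulls = ucb1([0.2, 0.5, 0.8], horizon=5000)
```

The same index idea, applied recursively to the nodes of a search tree, is what connects bandit algorithms to Monte-Carlo Tree Search.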

Projects and activities:

Scientific events

PhD Students:


Rémi Munos, SEQUEL project, INRIA Lille - Nord Europe,
40 avenue Halley, 59650 Villeneuve d'Ascq, FRANCE

Email: remi (dot) munos (at) inria (dot) fr
Tel: +33 (0)3 59 57 79 06
Fax: +33 (0)3 59 57 78 50