
Senior Researcher (DR2)
INRIA Lille - Nord Europe, SequeL team (Sequential Learning)
Bandit theory
KL-UCB, UCB-V, Thompson sampling, many-armed bandits
Foundations of Monte-Carlo Tree Search and hierarchical bandits
Optimistic optimization, optimistic planning
Random projections
For Least Squares regression and Reinforcement Learning
Reinforcement Learning (RL) and approximate dynamic programming (DP)
Finite-time analysis of RL and DP (Lasso-TD, LSTD, AVI, API, BRM, compressed-LSTD)
RL and DP with function approximation (Lp analysis)
Reinforcement Learning and optimal control in continuous time
Numerical solutions to HJB equations
Stability analysis via viscosity solutions
Variable resolution discretizations
Policy gradient in RL and control
Sensitivity analysis in continuous time
Sensitivity analysis in POMDPs via particle filters
PASCAL2 site INRIA Lille, since October 2009.
European project COMPLACS (Composing Learning for Artificial Cognitive Systems) 2011-2015
ANR EXPLO-RA (EXPLOration - EXPLOitation for efficient Resource Allocation. Applications to optimization, control, learning, and games) 2009-2012
ANR CO-ADAPT (Brain computer co-adaptation for better interfaces), 2010 - 2013.
PASCAL 2 Pump Priming Programme Sparse Reinforcement Learning in High Dimensions, 2010 - 2011
Associated Team with RLAI, University of Alberta, 2009 - 2010, 2011, 2012
ARC CODA: Contrôle Optimal d'un Digesteur Anaérobie (Optimal Control of an Anaerobic Digester), 2007 - 2008
Associated researcher with CREA (Centre de Recherche en Epistémologie Appliquée), Ecole Polytechnique, from 2007.
Co-chair of ALT 2013 (with Sanjay Jain) in Singapore, October 6-9, 2013.
Program committee chair of JFPDA 2013 in Lille. History of the JFPDA
ICML 2012 Workshop new Challenges for Exploration & Exploitation 3
INRIA Workshop on Statistical Learning. December 5, 6, 2011
Machine Learning Summer School 2011 in Bordeaux. Slides of my Introduction to Reinforcement Learning: Part1, Part2, Part3
ICML 2011 Tutorial on bandits: Introduction to Bandits: Algorithms and Theory (with Jean-Yves Audibert). Slides: Part1, Part2
ICML 2009 workshop On-line Learning with Limited Feedback (Sponsored by PASCAL 2). See Videolectures
European Workshop on Reinforcement Learning, 2008. A post-selection of 21 papers has been published by Springer in this LNCS volume.
Co-chair of ADPRL 2007 (IEEE Symposium on Approximate Dynamic Programming and Reinforcement Learning), celebrating the 50th anniversary of Richard Bellman's pioneering work on Dynamic Programming in 1957. April 1-5, 2007, Hawaii, USA.
ICML/COLT 2006 Workshop Kernel Machines and Reinforcement Learning, June 29, 2006, Pittsburgh, USA.
Sébastien Bubeck, now assistant professor at Princeton. Prix Gilles Kahn 2010, Prix Jacques Neveu 2010, Prix AFIA 2011
Odalric-Ambrym Maillard (co-supervised with Philippe Berthet), now postdoc with Peter Auer, Prix de thèse AFIA 2012
Jean-François Hren, now postdoc in 2XS team of LIFL
Alexandra Carpentier, now postdoc at University of Cambridge with Richard Nickl
Mohammad Gheshlaghi Azar (co-supervised with Bert Kappen), now postdoc at CMU with Emma Brunskill
Pierre-Arnaud Coquelin, now CEO of Vekia
Emilie Kaufmann (co-supervised with Aurélien Garivier and Olivier Cappé)
Amir Sani (co-supervised with Alessandro Lazaric)
Adrien Hoarau
Marta Soare (co-supervised with Alessandro Lazaric)
Address:
Rémi Munos, SEQUEL project, INRIA Lille - Nord Europe,
40 avenue Halley, 59650 Villeneuve d'Ascq, FRANCE
Email:
remi (dot) munos (at) inria (dot) fr
Tel:
(0 or 33)3 59 57 79 06
Fax:
(0 or 33)3 59 57 78 50