Senior Researcher (DR1)
INRIA Lille - Nord Europe, SequeL team (Sequential Learning)
In 2013-2014 I was at Microsoft Research New England.
Bandit theory
Optimistic algorithms (KL-UCB, UCB-V), Thompson sampling, many-armed bandits
Foundations of Monte-Carlo Tree Search
Optimistic optimization (HOO, SOO, StoSOO), optimistic planning (OP-MDP, OLOP)
Bandits in graphs and other structured spaces
Reinforcement Learning (RL)
Analysis of Reinforcement Learning and Dynamic Programming (DP) with function approximation
Finite-sample analysis of RL and DP (Lasso-TD, LSTD, AVI, API, BRM, compressed-LSTD)
Policy gradient and sensitivity analysis
Publications
Teaching (Master Maths Vision Apprentissage ENS Cachan)
Research interests:
Sampling methods for MDPs, Bayesian RL, POMDPs
Optimal control in continuous time
Numerical solutions to HJB equations
Stability analysis via viscosity solutions
Variable resolution discretizations
Statistical learning and randomization
Random projections for least squares regression
Adaptive sampling for Monte-Carlo integration
Active learning and sparse bandits
From bandits to Monte-Carlo Tree Search: The optimistic principle applied to optimization and planning, 2014. See tech report.
PASCAL2 site INRIA Lille, 2009-2013.
European project COMPLACS (Composing Learning for Artificial Cognitive Systems) 2011-2015
Associated Team with McGill University, 2013
ANR EXPLO-RA (EXPLOration - EXPLOitation for efficient Resource Allocation. Applications to optimization, control, learning, and games) 2009-2012
ANR CO-ADAPT (Brain computer co-adaptation for better interfaces), 2010 - 2013.
PASCAL 2 Pump Priming Programme Sparse Reinforcement Learning in High Dimensions, 2010 - 2011
Associated Team with RLAI University of Alberta, 2009 - 2010, 2011, 2012
ARC CODA: Contrôle Optimal d'un Digesteur Anaérobie, 2007 - 2008
Associated researcher with CREA (Centre de Recherche en Epistémologie Appliquée), Ecole Polytechnique, from 2007.
Tutorial AAAI 2013 From Bandits to Monte Carlo Tree Search: The optimistic principle applied to Optimization and Planning.
Summer School Netadis in Hillerod, Denmark, September 8-22, 2013. Slides: Part1, Part2, Part3
Co-chair of ALT 2013 (with Sanjay Jain) in Singapore, October 6-9, 2013.
Program committee chair of JFPDA 2013 in Lille
ICML 2012 Workshop: New Challenges for Exploration & Exploitation 3
INRIA Workshop on Statistical Learning. December 5, 6, 2011
Machine Learning Summer School 2011 in Bordeaux. Slides of my Introduction to Reinforcement Learning: Part1, Part2, Part3
ICML 2011 Tutorial on bandits: Introduction to Bandits: Algorithms and Theory (with Jean-Yves Audibert). Slides: Part1, Part2
ICML 2009 workshop On-line Learning with Limited Feedback (Sponsored by PASCAL 2). See Videolectures
European Workshop on Reinforcement Learning, 2008. A selection of 21 papers was subsequently published by Springer in an LNCS volume.
Co-chair of ADPRL 2007 (IEEE Symposium on Approximate Dynamic Programming and Reinforcement Learning), celebrating the 50th anniversary of Richard Bellman's pioneering work on Dynamic Programming in 1957. April 1-5, 2007, Hawaii, USA.
ICML/COLT 2006 Workshop Kernel Machines and Reinforcement Learning, June 29, 2006, Pittsburgh, USA.
Sébastien Bubeck, 2008-2010, now assistant professor at Princeton. Prix Gilles Kahn 2010, Prix Jacques Neveu 2010, Prix AFIA 2011
Odalric-Ambrym Maillard (co-supervised with Philippe Berthet), 2009-2011, postdoc with Peter Auer and Shie Mannor, Prix de thèse AFIA 2012
Jean-François Hren, 2008-2012, now postdoc in 2XS team of LIFL
Alexandra Carpentier, 2010-2012, now postdoc at University of Cambridge with Richard Nickl. Prix de thèse AFIA 2013
Mohammad Gheshlaghi Azar (co-supervised with Bert Kappen), 2008-2012, now postdoc at CMU with Emma Brunskill
Pierre-Arnaud Coquelin, 2005-, now CEO of Vekia
Emilie Kaufmann (co-supervised with Aurélien Garivier and Olivier Cappé), 2011-
Amir Sani (co-supervised with Alessandro Lazaric), 2011-
Adrien Hoarau, 2012-
Marta Soare (co-supervised with Alessandro Lazaric), 2012-
Tomáš Kocák (co-supervised with Michal Valko), 2013-
Address: Rémi Munos, SEQUEL project, INRIA Lille - Nord Europe, 40 avenue Halley, 59650 Villeneuve d'Ascq, FRANCE
Email: remi (dot) munos (at) inria (dot) fr
Tel: (0 or 33)3 59 57 79 06
Fax: (0 or 33)3 59 57 78 50