Alessandro Lazaric
2012
Next April I will teach a course on “Advanced Topics of Machine Learning Theory and Online Learning”.
Where and When
I’m organizing a course on “Advanced Topics of Machine Learning Theory and Online Learning” at the Department of Electronics and Information (DEI), Politecnico di Milano. The course will be held in Sala Seminari on April 2–4 and 11–13, 2012, from 9:00 to 13:00.
Abstract
Machine learning is a scientific field at the intersection of computer science, statistics, and optimization, and it focuses on the design and development of algorithms able to improve their performance based on empirical data drawn from an unknown environment. An artificial learner observes examples from the environment (e.g., a text labeled with its topic by an expert) and automatically learns a prediction rule which can be used to predict the outcome of the environment in the future (e.g., on previously unseen texts). Machine learning is applied in a wide variety of domains such as natural language processing, computer vision, medical diagnosis, bioinformatics, game playing, computational finance, brain-machine interfaces, online advertising, and many more.
In this course we will cover a number of advanced topics in machine learning: (i) theoretical foundations of machine learning (statistical learning theory), (ii) online learning.
From a theoretical point of view, machine learning algorithms are complex statistical processes whose behavior is intrinsically related to the way data are generated and the way prediction rules are built. In the last few decades, statistical learning theory (SLT) has provided a very effective theoretical framework for the analysis and prediction of the performance of machine learning algorithms. In this course we will review the basic statistical tools used in SLT and we will give an overview of the main theoretical results in supervised learning (mostly binary classification).
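To give a flavor of the kind of tool involved, a standard form of the Chernoff-Hoeffding inequality, for i.i.d. random variables $X_1, \dots, X_n$ taking values in $[0,1]$, bounds the deviation of the empirical mean from its expectation:

```latex
\mathbb{P}\left( \left| \frac{1}{n}\sum_{i=1}^{n} X_i \;-\; \mathbb{E}[X_1] \right| \geq \epsilon \right) \;\leq\; 2\, e^{-2 n \epsilon^2}.
```

Bounds of this kind are the basic building block behind most generalization guarantees in supervised learning.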
In classical machine learning, data are often assumed to be available in batches, and the learning phase is sharply separated from the use of the learned prediction rule. Nonetheless, in many practical applications data arrive in a stream (e.g., market prediction) and the learner has to return a prediction at each time instant. In this scenario, usually referred to as online learning, it is crucial that the learner continuously adapt its predictions, and this opens a number of interesting issues: the stochastic process generating the data might change over time (non-stationary process), and only a partial observation of the environment might be available. Here we identify two main problems: sequential prediction and multi-arm bandit.
In sequential prediction, we completely drop the standard assumption that data are generated by an underlying stochastic process and instead consider the data as the product of some unknown and unspecified mechanism (which could be deterministic, stochastic, or even adversarially adaptive to the learner). This new point of view on the sequential prediction problem has led to the development of a large number of new algorithms which observe data in a stream and predict the future behavior of the environment at each time step. These algorithms have proved effective in a number of applications (e.g., market prediction, weather forecasting) and come with very strong performance guarantees (in a worst-case analysis).
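As a minimal illustration of this family of algorithms, here is a sketch of the exponentially weighted average forecaster: each expert keeps a weight that decays exponentially with its cumulative loss, and the forecaster's loss is the weighted average of the experts' losses. All names, the learning rate `eta`, and the interface are illustrative choices, not a definitive implementation.

```python
import math

def exponentially_weighted_forecaster(expert_losses, eta=0.5):
    """Run the exponentially weighted average forecaster on a sequence of
    loss vectors, one per round; expert_losses[t][i] is the loss in [0, 1]
    of expert i at round t. Returns the forecaster's cumulative (expected)
    loss and the final weights."""
    n_experts = len(expert_losses[0])
    weights = [1.0] * n_experts
    total_loss = 0.0
    for losses in expert_losses:
        # Normalize weights into a probability distribution over experts.
        z = sum(weights)
        probs = [w / z for w in weights]
        # The forecaster's loss is the weighted average of expert losses.
        total_loss += sum(p * l for p, l in zip(probs, losses))
        # Exponential update: experts with larger loss lose weight faster.
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    return total_loss, weights
```

With two experts, one always incurring loss 0 and one always incurring loss 1, the weights quickly concentrate on the good expert and the forecaster's cumulative loss stays bounded regardless of the horizon, which is the worst-case flavor of guarantee mentioned above.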
Besides the question of how data are generated, online learning also introduces the possibility that the learner might receive only partial feedback on the quality of its prediction. For instance, in clinical trials, where different treatments are tested, doctors only observe the response to the specific treatment they have chosen for a patient, but they do not know what the response to an alternative treatment would have been. The effects of this partial feedback, and the development of algorithms able to learn effectively in this setting, are the main focus of multi-arm bandit theory. Although the problem was first formalized in the 1950s, recent major theoretical and algorithmic developments have made the multi-arm bandit problem a very active field of machine learning, with applications in a number of domains such as online advertising, wireless network management, adaptive routing, and so on.
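A minimal sketch of how a bandit algorithm trades off exploration and exploitation is the classical UCB1 rule: pull the arm maximizing the empirical mean reward plus a confidence bonus that shrinks as the arm is pulled more often. The function below simulates Bernoulli arms only to generate rewards; the learner itself never sees the true means. Names and parameters are illustrative.

```python
import math
import random

def ucb1(arm_means, horizon, seed=0):
    """Simulate UCB1 on Bernoulli arms with the given (hidden) means.
    Returns how many times each arm was pulled over the horizon."""
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms   # pulls per arm
    sums = [0.0] * n_arms   # cumulative reward per arm
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1  # pull each arm once to initialize estimates
        else:
            # Empirical mean plus confidence bonus; rarely pulled arms
            # get a larger bonus and are thus explored more.
            arm = max(range(n_arms),
                      key=lambda i: sums[i] / counts[i]
                      + math.sqrt(2 * math.log(t) / counts[i]))
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts
```

Run on two arms with means 0.9 and 0.1, the pull counts concentrate on the better arm while the worse arm is still sampled occasionally, which is exactly the partial-feedback dilemma described above.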
Objective and structure of the course
The objective of this course is to provide the students with a brief introduction to the theoretical foundations of classical machine learning and an in-depth overview of the main algorithms and results available in online learning and multi-arm bandit. The course is designed so as to keep a suitable balance between theoretical frameworks and tools, algorithms, and applications. From a theoretical point of view, the course will introduce the basic statistical tools used in statistical learning theory (i.e., the Chernoff-Hoeffding inequality, covering numbers, VC theory) and the main notions used to analyze online learning algorithms (i.e., the definition of regret, worst-case analysis). The course will cover the main algorithms for sequential prediction with expert advice (e.g., hedge, exponentially weighted forecaster, follow-the-leader) and for multi-arm bandit (e.g., UCB, Exp3, UCB-E, HOO). While toy examples will be used throughout the course to illustrate the settings and the behavior of some algorithms, one or two selected applications will be presented at the end of the course. The course will also draw connections to a number of other disciplines such as statistics, operations research, and game theory.
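For reference, the (external) regret used to analyze prediction with expert advice is the difference between the forecaster's cumulative loss and that of the best fixed expert in hindsight; in a standard notation (forecaster prediction $\hat{p}_t$, prediction $f_{i,t}$ of expert $i$, outcome $y_t$, loss function $\ell$):

```latex
R_T \;=\; \sum_{t=1}^{T} \ell(\hat{p}_t, y_t) \;-\; \min_{1 \leq i \leq N} \sum_{t=1}^{T} \ell(f_{i,t}, y_t).
```

A worst-case analysis seeks algorithms for which $R_T$ grows sublinearly in $T$ against any sequence of outcomes.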
The students will be asked to write a short report about one or more papers related to the topics of the course.
January 31, 2012
Course on “Advanced Topics of Machine Learning Theory and Online Learning”
DEI, Politecnico di Milano
Milan, Italy
April 2–4 and 11–13, 2012.