Graphs in Machine Learning  Fall 2019  MVA  ENS ParisSaclay
News
 07.10.2019: This year we move to 100% Python for the labs (and discountinue Matlab and VM).
 01.10.2019: Piazza fixed  you can use any school email address together with the class code.
 30.09.2019: Check piazza to get a bonus grade!
 30.09.2019: Signup and communication managed via piazza
 27.09.2019: The course will be presented on Friday, 27. 9. 2019.
 27.09.2019: First class will start on 01. 10. 2019 at 13h30.
Administrivia
 Time: Wednesday afternoon
 Place: ENS ParisSaclay (different lecture halls)
 7 or 8 lectures and 3 recitations (TD)
 Validation: grades from TD (40%) + class project (60%)
 Research: projects, internships (stages) and PhD. thesis at SequeL and elsewhere possible
 Piazza: Registration (with your school email) and online class discussion on piazza.
 TA: Omar Darwiche Dominques
 course description at MVA at ENS ParisSaclay
 MVA tags: content: #apprentissage, type: #méthodologique #théorique, validation: #projet #td
Main topics
 spectral graph theory, graph Laplacians
 semisupervised graphbased learning
 manifold learning
 graphs from flat data  graph as a nonparametric basis
 online learning with graphs
 real world graphs scalability and approximations
 graph neural networks (new in 2019)
 social networks and recommender systems applications
 large graph analysis, learning, and mining
 vision applications (e.g., face recognition)
Important: Don't take this class ...
 ... if you don't have time to do reports. While 510% students finish their 3 assignments during the 2hour long recitations, about 20% students find that they spend 3 times longer time doing homework reports than for other classes in the master program.
 ... if you expect your instructor to reply to your emails or not willing to read this webpage and other instructions. Piazza is the place for all communication.
 ... if you believe that extra extensions beyond the rules below would be granted or if you cannot deliver the project report on time.
Intro
The graphs come handy whenever we deal with relations between the objects. This course, focused on learning, will present methods involving two main sources of graphs in ML: 1) graphs coming from networks, e.g., social, biological, technology, etc. and 2) graphs coming from flat (often vision) data, where a graph serves as a useful nonparametric basis and is an effective data representation for such tasks as spectral clustering, manifold or semisupervised learning. We will also discuss online decisionmaking on graphs, suitable for recommender systems or online advertising. Finally, we will always discuss the scalability of all approaches and learn how to address huge graphs in practice. The lectures will show not only how but mostly why things work. The students will learn relevant topics from spectral graph theory, learning theory, bandit theory, graph neural networks, necessary mathematical concepts and the concrete graphbased approaches for typical machine learning problems. The practical sessions will provide handson experience on interesting applications (e.g., online face recognizer) and stateoftheart graphs processing tools.
Organization
The course will feature 11 sessions, 8 lectures and 3 recitations (TD), each of them 2 hours long. There may be a special session with guest lectures. There may be also an extra homework with extra credit. The evaluation is be based on reports from TD and from the projects. Several project topics will be proposed but the students will be able to come up with their own and they will be able to work in groups of 23 people. The best reference for this course are the slides from the lecture which are made to be comprehensive and there is no recommended textbook. The material we cover is mostly based on research papers, some of which very recent. The course will be in English.Recitations and homeworks (TDs)
Bring your own laptop to the practical sessions. Each of the 3 practical sessions are followed by a graded report. The assignments are posted on piazza. You are welcome to discuss with your peers (in which case indicate the people you have discussed with in your report), but the reports should be written individually to avoid a penalty.Class projects
The main part the of the grade comes from the projects. The students are encouraged to come up with the topic of their interest related to the course and start working on it as soon as possible. In this case, please email the lecturer with a short description of the project for the approval. Some project proposals will be given. Additional project proposals will be presented on 5.11.2019. Deadline for deciding on the project is 26.11.2019, but the recommended date for picking up the projects is on 19.11.2019. The deadline for submitting 510 page reports in NeurIPS format is 07.01.2020. The planned time for 15+5 minutes will be from 11. 1. 2020 over Skype/Hangout. Students can work in pairs of 2 and exceptionally 3. Very detailed instructions are given on the dedicated page for the class projects.Registration, Communication, and Questions
We will be using piazza for the enrollment and online class discussion. Use your full name and your school email when registering. The access code will be given out during the class. Piazza is the place of questions regarding lectures, homeworks, and logistics. Posting questions to piazza makes the whole class benefit from the answers and enables students to answer questions too. However, refrain from posting the solutions to the homeworks. Please use piazza also for public or private communication with the instructors of any kind. Emails that should be posted to Pizza to the instructors will not be answered or will be answered late by a canned response "please post this question to piazza".Late policy
You will have 4 late days without penalty to be used across the entire course. You can use them for any deadline (homework, project assignment, project report delivery). After those late days are used, you will be penalized according to the following policy: (1) full credit at the midnight on the due date, Paris time (2) half credit for the next 48 hours; (3) zero credit after that. We encourage the students not to use these late dates except in the exceptional circumstances. All the deadlines are strict and we ask students to avoid demanding the extensions. If you have serious reasons that prevent you meeting the deadlines, please use the formal procedures of your school.Prerequisites
linear algebra, basic statistics, others tools needed will be covered in the lecturesSyllabus
Session 0  Overview 27.09.2019  Amphi Curie
 Short presentation of the course
Session 1  Lecture 1 01.10.2019, Salle Condorcet 13h3015h30
 Introduction to graphs in Machine Learning
 Motivation and overview of the successful approaches
 Applications (Recommender Systems, SemiSupervised Learning, …)
 Submodularity for influence maximization
 Google Pagerank
 Graphs as datadependent approach
 Online SSL for Face Recognition: from raw pixels to analysis
 Erdős number project
 Random graphs models – realworld graph models
 Social network modeling, Small world phenomena, Advertising
 Graph theory refresher
 Datagraph constructions
Session 2 – Lecture 2 08.10.2019, Salle Condorcet 13h3015h30
 Available public graphs datasets
 Spectral algebra
 Graph Laplacian and its properties
 Random walks and the Laplacians
 Geometry of the data and connectivity
 Spectral clustering
Session 3 – TD 1 15.10.2019, Salle Condorcet 13h3015h30
 Graph construction and spectral clustering
 kNN and epsilon graphs, comparison
 Understand connectivity vs. compactness paradigm in clustering
 Spectral clustering on simple image segmentation problem
Session 4 – Lecture 3 22.10.2019, Salle Condorcet 13h3015h30
 Introduction to manifold learning
 Manifold learning with Laplacian Eigenmaps
 Recommendations with graph distances
 Resistance networks
 SemiSupervised Learning
 SSL why and when it helps
 SSL with selftraining
 SemiSupervised SVMs
 Graphbased semisupervised learning
 Gaussian random fields and harmonic solution
Session 5  Lecture 4 29.10.2019, Salle Condorcet 13h3015h30, Deadline: TD1 report
 Regularization of harmonic solution
 Softharmonic solution
 Inductive and transductive semisupervised learning
 Manifold regularization
 Maxmargin graph cuts
 Theory of Laplacianbased manifold methods
 Transductive learning stability based bounds
Session 6  Lecture 5 and class project proposals 05.11.2019, Salle Condorcet 13h3015h30,
 Online learning with graphs
 Online clustering
 Online semisupervised learning
 What to do when graphs grow?
 Online incremental kcenters
 Examples of applications of online SSL
 Analysis of online SSL
 Analysis of quantization error
 When does graphbased SSL provably help?
 Scalability
 Scaling harmonic functions to millions of samples
 Numerical eigenfunctions
Session 7 – TD 2 12.11.2019, Salle Condorcet 13h3015h30
 Semi Supervised Learning and Harmonic Function Solution
 Face Recognizer with HFS
 Report experiments of one's own face
 Suggest improvements
Session 8  Lecture 6 19.11.2019, Salle Condorcet 13h3015h30, Soft deadline: project assignment
 Largescale sparsification
 Combinatorial sparsification
 Spectral sparsification
 GraphLab abstraction
 Decisionmaking with graphs
 Online SemiSupervised Learning
 Big graph data learning and analysis
 Largescale data processing tools: GraphLab
 Largescale label propagation
 Largescale graph construction
 Graph bandits
 Online decisionmaking on graphs
 Smoothness of rewards (preferences) on a given graphs
 Online movie recommendation on graphs
 Observability graphs
 Improving learning rates using side information
 Online influence maximization
Session 9  Lecture 7 26.11.2019, 14h00  16h00, Salle Condorcet 13h3015h30, Deadlines: TD2 report, project assignment
 Invited lecture by Marc Lelarge
 Graph Nets
Session 10 – TD 3 06.12.2019, Salle Condorcet 12h0014h00
 Graph Nets
No class 10.12.2019, Michal and Omar will be at NeurIPS 2019.
You can check out the papers that your instructor(s) are presenting at NeurIPS 2019 during that week:

Planning in entropyregularized Markov decision processes and games

Multiagent evaluation under incomplete information

Exact sampling of determinantal point processes with sublinear time preprocessing

On two ways to use determinantal point processes for Monte Carlo integration
10.12.2019: Deadline: TD3 report
07.01.2020: Deadline: project report
from 13.01.2020 project presentations over Skype/Hangouts
Course page from previous years: