Game Theory and ML course -- IFT 6756 -- Théorie des Jeux pour l'Apprentissage Machine

Mandatory registration to Piazza. (if you registered through studium you should have been automatically added to Piazza, if you want to audit the course and you are not a UdeM register via )
The class will be taught in English, all slides and class notes are in English, you will have the possibility to do any assignement either in French or English. Question can be asked in French or English.
A background in Deep Learning is recommended.
A background in Machine Learning, Optimization, Reinforcement Learning and Algorithmic Game Theory may be a plus.

French Version

Inscription obligatoire au TEAMS du cours. (si vous vous êtes inscrit via Studium vous trouverez le lien pour accéder a TEAMS du cours, si vous souhaitez auditer le cours et que vous n'êtes pas étudiant à l'UdeM, envoyez-moi un email)
Le cours sera enseigné en français ou en anglais, selon la fréquentation (toutes les diapositives et notes de cours sont en anglais).
Des connaissances en apprentissage profond sont recommandées.
Des connaissances en apprentissage automatique, optimisation, apprentissage par renforcement et théorie des jeux algorithmiques peuvent être un plus.

Description

The number of Machine Learning applications related to game theory has been growing in the last couple of years. For example, two-player zero-sum games are important for generative modeling (GANs) and mastering games like Go or Poker via self-play. This course is at the interface between game theory, optimization, and machine learning. It tries to understand how to learn models to play games. It will start with some quick notions of game theory to eventually delve into machine learning problems with game formulations such as GANs or Multi-agent RL. This course will also cover the optimization (a.k.a, training) of such machine learning games.

Un nombre grandissant d'applications d'apprentissage automatique liées à la théorie des jeux à vu le jour ces dernières années. Par exemple, les jeux à deux joueurs et à somme nulle sont importants pour la modélisation générative (GAN) et la maîtrise de jeux comme Go ou Poker via l'appentissage autonome. Ce cours est à l'interface entre la théorie des jeux, l'optimisation et l'apprentissage automatique. Il essaie de comprendre comment apprendre des modèles pour jouer à des jeux. Il commencera par quelques notions rapides de théorie des jeux pour finalement se plonger dans les problèmes d'apprentissage automatique avec des formulations de jeux telles que les GAN ou l'apprentissage par renforcement avec plusieurs agents. Ce cours couvrira également l'optimisation (a.k.a, entrainement) de tels jeux d'apprentissage automatique.

Schedule - Plan de cours

Due to the sanitary situation the course was given online in 2021. The Lectures given in 2021 were the following:

Lecture 1: Motivations [slides]
Lecture 2: Game Theory Background [slides]
Lecture 3: ML Background [slides]
Lecture 4: Optimization Background [slides] [notes]
Lecture 5: Adversarial Examples [slides] [notes]
Lecture 6: Paper Presentation: Towards DL Models Resistant to Adversaries and Practical Black-Box Attacks against ML
Lecture 7: Generative Adversarial Networks (Part 1) [slides] [notes]
Lecture 8: Paper Presentation: f-GANs and Cycle-GANs.
Lecture 9: Generative Adversarial Networks (Part 2) [slides] [notes]
Lecture 10: Paper presentation: BigGAN, Unrolled GANs.
Lecture 11: Wasserstein Generative Adversarial Networks [slides] [notes]
Lecture 12: Paper Presentations: Climate Change project and Adversarial Example Games
Lecture 13: Differentiable Games [slides] [notes]
Lecture replaced by individual meeting to talk about the project.
Lecture 15: Extragradient [slides] [notes]
Lecture 16: Paper Presentation The Mecanics of n-player Games and Implicit competitive regularization in GANs
Lecture 17: Sprectal Analysis [slides] [notes]
Lecture 18: Paper Presentations The Hanabi Challenge and Evaluating generating models: A p(reality) check
Lecture 19: Stability and Equilibrium [slides] [notes]
Lecture 20: Paper presentations: Multi-Agent Actor Critic and Counterfactual Multi-Agent Policy Gradients
Lecture 21: Evaluation and Learning in MA systems [slides] [notes]
Guest Presentation: On Grandmaster level in StarCraft II using multi-agent reinforcement learning
Lecture 23: Evaluation of MA systems [slides] [notes]

Special thanks to the students who scribed the lectures this year: David Dobre, Arnaud L'Heureux, Ivan Puhachov, François Mercier, Sharath Chandra, Mathieu Godbout, William Neveu, Olivier Tessier-Larivière, Tianyu Zhang, Francisco Gallegos, Albert Orozco Camacho, Martin Dallaire, Bharath Govindaraju, Justine Pepin, François David, François Milot, Jonathan Tremblay, Carl Perreault-Lafleur, Mojtaba Faramarzi, Pascal Jutras-Dubé, Arnold (Zicong) Mo, Uros Petricevic, Olivier Ethier, Nehal Pandey, Harsh Kumar Roshan, Sree Rama Sanjeev, Rupali Bhati, Sharath Chandra and Kavin Patel.

Evaluation

Project 100% - Attending at least 80% of the classes is also mandatory. The project evaluation will be based on a mid-session extended abstract (30%), project report (40%), and a project presentation (30%).

Projet 100% - Assister à au moins 80% des cours est également obligatoire. L'évaluation du projet sera basée sur un rapport, une présentation et un pré-rapport à mi-session.

Timeline -- Dates Importantes

March 12th (11:59PM AoE): Extended Abstract submission -- Soumission du pré-rapport.
April 29th (11:59PM AoE): Project Report submission -- Soumission du rapport de projet.
End of April: Project Presentations (Optional if a paper presentation has already been done) -- Présentation des projets (facultative si une présentation de papier à déjà été faite)

Relevant references

GANs: https://arxiv.org/abs/1406.2661,
Big GAN: https://arxiv.org/abs/1809.11096
WGAN: https://arxiv.org/pdf/1701.07875.pdf
WGAN-GP: https://arxiv.org/pdf/1704.00028.pdf
Poker: https://www.cs.cmu.edu/~noamb/papers/19-Science-Superhuman.pdf, https://www.cs.cmu.edu/~noamb/papers/17-Science-Superhuman.pdf
Diplomacy: https://arxiv.org/pdf/2010.02923.pdf, https://arxiv.org/abs/2006.04635, https://arxiv.org/abs/1909.02128
Hanabi: https://arxiv.org/abs/1902.00506, https://arxiv.org/abs/1811.01458
StarCraft II: https://www.nature.com/articles/d41586-019-03343-4
AlphaGo: https://www.nature.com/articles/nature16961
AlphaGo zero: https://www.nature.com/articles/nature24270
Alpha zero: https://science.sciencemag.org/content/362/6419/1140
Open-ended learning: https://arxiv.org/abs/1901.08106
Spinning-top: https://arxiv.org/abs/2004.09468
Unified framework: https://arxiv.org/pdf/1711.00832.pdf
Re-evaluating Evaluation: https://papers.nips.cc/paper/2018/file/cdf1035c34ec380218a8cc9a43d438f9-Paper.pdf
Lola https://arxiv.org/abs/1709.04326
Numerics of GANs: https://arxiv.org/abs/1705.10461
Optimistic methods: https://arxiv.org/abs/1711.00141
Extragradient methods: https://arxiv.org/abs/1802.10551, https://arxiv.org/abs/1906.05945
Negative momentum: https://arxiv.org/abs/1807.04740
Optimal convergence rates in games: https://arxiv.org/abs/2001.00602
Noise in Games: https://arxiv.org/abs/1904.08598
A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning: https://arxiv.org/pdf/1711.00832.pdf
Mean Field Multi-Agent Reinforcement Learning https://arxiv.org/pdf/1802.05438.pdf
Adversarial Training: https://arxiv.org/abs/1706.06083

Potential Projects

Critic of a paper --- Critique d'un papier (Experimental)

This kind of project, is about reproducing a research paper, criticizing their experimental method/results, and proposing an ablation study or new experiments that has not been done in the paper. One of the paper in the list above may for instance good candiate for such a project.

Reproduction d'un papier, puis critique de la méthode experimentale proposée. Enfin le projet doit se conclure par la proposition d'une nouvelle experience pour completer le papier.

Projects List --- Liste de projets

Read a paper from the list and send me an email for more details on the project:
Theoretical

Improving the lower bounds from: https://arxiv.org/abs/2001.0060
Last iterate convergence for Stochastic extragradient method: https://arxiv.org/abs/1905.11373
Re-evaluating evaluation of Multi-player games: https://papers.nips.cc/paper/2018/file/cdf1035c34ec380218a8cc9a43d438f9-Paper.pdf
Adversarial example Games for the theory of adversarial examples: https://arxiv.org/abs/2007.00720
Law of robustness for Neural Networks: https://arxiv.org/abs/2009.14444
New convergence rates for Alternated Gradient Method (extending the ones in https://arxiv.org/abs/1807.04740 )
Primal-Dual optimization in RL: New alorithm to solve the minimax formulation proposed in http://proceedings.mlr.press/v120/serrano20a.html

Experimental

Adversarial Example Games for Adversarial training: https://arxiv.org/abs/2007.00720
Try some optimizers (ExtraAdam, Negative Momentum) on text data

Propose your Own Project (higher risk but it will be taken into account) --- Proposition d'un projet personnel (plus risqué. Cela sera pris en compte dans l'évaluation)

The stage of the extended abstract is very important because the proposal might be refused at mid-term if it is judged that it does not fit the standard of the course.