Adversarial learning course -- IFT 6164 (previously Game Theory and ML course -- IFT 6756)
Mandatory registration on Piazza and Gradescope (more details to come, and during the first lecture). Registration link for Piazza: the password is `ift6164`.
The class will be taught in English (all slides and class notes are in English), but you will have the option to do the homeworks and exams in French. Of course, you can always ask your questions in French.
IMPORTANT:
A background in Deep Learning and Machine Learning is necessary, and it is important to be comfortable with Optimization. To evaluate whether you have a sufficient background, you can try Homework 0 before the beginning of the course.
A background in Reinforcement Learning and Algorithmic Game Theory may be a plus.
Description
The number of Machine Learning applications related to game theory has been growing in the last couple of years.
For example, two-player zero-sum games are important for generative modeling (GANs) and mastering games like Go or Poker via self-play.
This course is at the interface between game theory, optimization, and machine learning. It studies how to learn models that play games.
It will start with a few quick notions of game theory and then delve into machine learning problems with game formulations, such as GANs or multi-agent RL.
This course will also cover the optimization (a.k.a. training) of such machine learning games.
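For instance, the original GAN objective (the first paper in the reference list below) is exactly such a two-player zero-sum game, in which a generator G and a discriminator D optimize opposite sides of the same value function:

```latex
\min_{G} \max_{D} \;
\mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
+ \mathbb{E}_{z \sim p_{z}}\big[\log\big(1 - D(G(z))\big)\big]
```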
Schedule
During the first week (January 11th and 12th), the classes will be remote.
The plan for the course will roughly be the following (the slides and lecture notes are from last year; they are only indicative of the topics covered and do not correspond to the slides that will be used):
Special thanks to the students who scribed the lectures in 2022: William St-Arnaud, Sobhan Mohammadpour, Motahareh Sohrabi, Tianwei Ni, Mansi Rankawat, Orson Mengara, Marco Jiralerspong, Rozhin Nobahari, and Peng Lu.
And in 2021: David Dobre, Arnaud L'Heureux, Ivan Puhachov, François Mercier, Sharath Chandra, Mathieu Godbout, William Neveu, Olivier Tessier-Larivière, Tianyu Zhang, Francisco Gallegos, Albert Orozco Camacho, Martin Dallaire, Bharath Govindaraju, Justine Pepin, François David, François Milot, Jonathan Tremblay, Carl Perreault-Lafleur, Mojtaba Faramarzi, Pascal Jutras-Dubé, Arnold (Zicong) Mo, Uros Petricevic, Olivier Ethier, Nehal Pandey, Harsh Kumar Roshan, Sree Rama Sanjeev, Rupali Bhati, Sharath Chandra and Kavin Patel.
Evaluation
Midterm HW (15%) + Final HW (15%) + Paper presentation (30%) + Project (40%).
Relevant references
- GANs: https://arxiv.org/abs/1406.2661
- Big GAN: https://arxiv.org/abs/1809.11096
- WGAN: https://arxiv.org/pdf/1701.07875.pdf
- WGAN-GP: https://arxiv.org/pdf/1704.00028.pdf
- Poker: https://www.cs.cmu.edu/~noamb/papers/19-Science-Superhuman.pdf, https://www.cs.cmu.edu/~noamb/papers/17-Science-Superhuman.pdf
- Diplomacy: https://arxiv.org/pdf/2010.02923.pdf, https://arxiv.org/abs/2006.04635, https://arxiv.org/abs/1909.02128
- Hanabi: https://arxiv.org/abs/1902.00506, https://arxiv.org/abs/1811.01458
- StarCraft II: https://www.nature.com/articles/d41586-019-03343-4
- AlphaGo: https://www.nature.com/articles/nature16961
- AlphaGo zero: https://www.nature.com/articles/nature24270
- Alpha zero: https://science.sciencemag.org/content/362/6419/1140
- Open-ended learning: https://arxiv.org/abs/1901.08106
- Spinning-top: https://arxiv.org/abs/2004.09468
- Unified framework: https://arxiv.org/pdf/1711.00832.pdf
- Re-evaluating Evaluation: https://papers.nips.cc/paper/2018/file/cdf1035c34ec380218a8cc9a43d438f9-Paper.pdf
- LOLA: https://arxiv.org/abs/1709.04326
- Numerics of GANs: https://arxiv.org/abs/1705.10461
- Optimistic methods: https://arxiv.org/abs/1711.00141
- Extragradient methods: https://arxiv.org/abs/1802.10551, https://arxiv.org/abs/1906.05945
- Negative momentum: https://arxiv.org/abs/1807.04740
- Optimal convergence rates in games: https://arxiv.org/abs/2001.00602
- Noise in Games: https://arxiv.org/abs/1904.08598
- A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning: https://arxiv.org/pdf/1711.00832.pdf
- Mean Field Multi-Agent Reinforcement Learning: https://arxiv.org/pdf/1802.05438.pdf
- Adversarial Training: https://arxiv.org/abs/1706.06083
Potential Projects (from 2021)
Critique of a Paper (Experimental)
This kind of project is about reproducing a research paper, criticizing its experimental method/results, and proposing an ablation study or new experiments that were not done in the paper. One of the papers in the list above may, for instance, be a good candidate for such a project.
Project List
Read a paper from the list and send me an email for more details on the project:
Theoretical
- Improving the lower bounds from: https://arxiv.org/abs/2001.00602
- Last iterate convergence for Stochastic extragradient method: https://arxiv.org/abs/1905.11373
- Re-evaluating evaluation of Multi-player games: https://papers.nips.cc/paper/2018/file/cdf1035c34ec380218a8cc9a43d438f9-Paper.pdf
- Adversarial example Games for the theory of adversarial examples: https://arxiv.org/abs/2007.00720
- Law of robustness for Neural Networks: https://arxiv.org/abs/2009.14444
- New convergence rates for the alternating gradient method (extending the ones in https://arxiv.org/abs/1807.04740)
- Primal-Dual optimization in RL: a new algorithm to solve the minimax formulation proposed in http://proceedings.mlr.press/v120/serrano20a.html
Experimental
- Adversarial Example Games for Adversarial training: https://arxiv.org/abs/2007.00720
- Try some optimizers (ExtraAdam, Negative Momentum) on text data (see the extragradient sketch below)
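As a point of reference for this project, here is a minimal, self-contained sketch (assuming only NumPy; not taken from the course materials) of the extragradient update on the toy bilinear game min_x max_y x·y, where plain simultaneous gradient descent-ascent diverges while extragradient converges to the equilibrium (0, 0):

```python
# Minimal sketch: extragradient vs. simultaneous gradient descent-ascent (GDA)
# on the bilinear game min_x max_y f(x, y) = x * y, whose unique equilibrium is (0, 0).
import numpy as np

def grad(x, y):
    # For f(x, y) = x * y: df/dx = y, df/dy = x.
    return y, x

def gda_step(x, y, lr=0.1):
    gx, gy = grad(x, y)
    return x - lr * gx, y + lr * gy            # descent on x, ascent on y

def extragradient_step(x, y, lr=0.1):
    gx, gy = grad(x, y)
    x_half, y_half = x - lr * gx, y + lr * gy  # lookahead (extrapolation) step
    gx, gy = grad(x_half, y_half)              # gradient at the lookahead point
    return x - lr * gx, y + lr * gy            # update from the original point

x_gda = y_gda = x_eg = y_eg = 1.0
for _ in range(200):
    x_gda, y_gda = gda_step(x_gda, y_gda)
    x_eg, y_eg = extragradient_step(x_eg, y_eg)

print("GDA distance to equilibrium:          ", np.hypot(x_gda, y_gda))  # grows
print("Extragradient distance to equilibrium:", np.hypot(x_eg, y_eg))    # shrinks
```

ExtraAdam (from the extragradient references above) combines this same extrapolation idea with Adam updates.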
Propose Your Own Project (higher risk, but this will be taken into account in the evaluation)
The extended abstract stage is very important because the proposal might be refused at mid-term if it is judged not to meet the standards of the course.