Adversarial learning course -- IFT 6164 (previously Game Theory and ML course -- IFT 6756)
Mandatory registration on Piazza and Gradescope (more details to come, and during the first lecture). Registration link for Piazza: the password is `ift6164`.
The class will be taught in English (all slides and class notes are in English), but you will have the option to do the homeworks and exams in French. Of course, you can always ask your questions in French.
IMPORTANT:
A background in Deep Learning and Machine Learning is necessary, and it is important to be comfortable with Optimization. To evaluate whether you have a sufficient background, you can try Homework 0 before the beginning of the course.
A background in Reinforcement Learning and Algorithmic Game Theory may be a plus.
Description
The number of Machine Learning applications related to game theory has been growing in the last couple of years.
For example, two-player zero-sum games are important for generative modeling (GANs) and mastering games like Go or Poker via self-play.
This course is at the interface between game theory, optimization, and machine learning. It studies how to learn models that play games.
It will start with a few quick notions of game theory and then delve into machine learning problems with game formulations, such as GANs or multi-agent RL.
This course will also cover the optimization (a.k.a. training) of such machine learning games.
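For instance, the original GAN objective (the first paper in the reference list below) is exactly such a two-player zero-sum game, in which a generator G and a discriminator D optimize opposite sides of the same value function:

```latex
\min_{G} \max_{D} \;
\mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
+ \mathbb{E}_{z \sim p_{z}}\big[\log\big(1 - D(G(z))\big)\big]
```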
Schedule
During the first week (January 11th and 12th), the classes will be remote.
The plan for the course will roughly be the following (the slides and lecture notes are from last year; they are only indicative of the topics covered and do not correspond to the slides that will be used):
Special thanks to the students who scribed the lectures in 2022: William St-Arnaud, Sobhan Mohammadpour, Motahareh Sohrabi, Tianwei Ni, Mansi Rankawat, Orson Mengara, Marco Jiralerspong, Rozhin Nobahari, and Peng Lu.
And in 2021: David Dobre, Arnaud L'Heureux, Ivan Puhachov, François Mercier, Sharath Chandra, Mathieu Godbout, William Neveu, Olivier Tessier-Larivière, Tianyu Zhang, Francisco Gallegos, Albert Orozco Camacho, Martin Dallaire, Bharath Govindaraju, Justine Pepin, François David, François Milot, Jonathan Tremblay, Carl Perreault-Lafleur, Mojtaba Faramarzi, Pascal Jutras-Dubé, Arnold (Zicong) Mo, Uros Petricevic, Olivier Ethier, Nehal Pandey, Harsh Kumar Roshan, Sree Rama Sanjeev, Rupali Bhati, Sharath Chandra and Kavin Patel.
Evaluation
Midterm HW (15%) + Final HW (15%) + Paper presentation (30%) + Project (40%).
Relevant references
- GANs: https://arxiv.org/abs/1406.2661
- Big GAN: https://arxiv.org/abs/1809.11096
- WGAN: https://arxiv.org/pdf/1701.07875.pdf
- WGAN-GP: https://arxiv.org/pdf/1704.00028.pdf
- Poker: https://www.cs.cmu.edu/~noamb/papers/19-Science-Superhuman.pdf, https://www.cs.cmu.edu/~noamb/papers/17-Science-Superhuman.pdf
- Diplomacy: https://arxiv.org/pdf/2010.02923.pdf, https://arxiv.org/abs/2006.04635, https://arxiv.org/abs/1909.02128
- Hanabi: https://arxiv.org/abs/1902.00506, https://arxiv.org/abs/1811.01458
- StarCraft II: https://www.nature.com/articles/d41586-019-03343-4
- AlphaGo: https://www.nature.com/articles/nature16961
- AlphaGo zero: https://www.nature.com/articles/nature24270
- Alpha zero: https://science.sciencemag.org/content/362/6419/1140
- Open-ended learning: https://arxiv.org/abs/1901.08106
- Spinning-top: https://arxiv.org/abs/2004.09468
- Unified framework: https://arxiv.org/pdf/1711.00832.pdf
- Re-evaluating Evaluation: https://papers.nips.cc/paper/2018/file/cdf1035c34ec380218a8cc9a43d438f9-Paper.pdf
- LOLA: https://arxiv.org/abs/1709.04326
- Numerics of GANs: https://arxiv.org/abs/1705.10461
- Optimistic methods: https://arxiv.org/abs/1711.00141
- Extragradient methods: https://arxiv.org/abs/1802.10551, https://arxiv.org/abs/1906.05945
- Negative momentum: https://arxiv.org/abs/1807.04740
- Optimal convergence rates in games: https://arxiv.org/abs/2001.00602
- Noise in Games: https://arxiv.org/abs/1904.08598
- A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning: https://arxiv.org/pdf/1711.00832.pdf
- Mean Field Multi-Agent Reinforcement Learning: https://arxiv.org/pdf/1802.05438.pdf
- Adversarial Training: https://arxiv.org/abs/1706.06083
Potential Projects (from 2021)
Critique of a Paper (Experimental)
This kind of project is about reproducing a research paper, criticizing its experimental method/results, and proposing an ablation study or new experiments that were not done in the paper. One of the papers in the list above may, for instance, be a good candidate for such a project.
Project List
Read a paper from the list and send me an email for more details on the project:
Theoretical
- Improving the lower bounds from: https://arxiv.org/abs/2001.00602
- Last iterate convergence for Stochastic extragradient method: https://arxiv.org/abs/1905.11373
- Re-evaluating evaluation of Multi-player games: https://papers.nips.cc/paper/2018/file/cdf1035c34ec380218a8cc9a43d438f9-Paper.pdf
- Adversarial example Games for the theory of adversarial examples: https://arxiv.org/abs/2007.00720
- Law of robustness for Neural Networks: https://arxiv.org/abs/2009.14444
- New convergence rates for the alternating gradient method (extending the ones in https://arxiv.org/abs/1807.04740)
- Primal-Dual optimization in RL: a new algorithm to solve the minimax formulation proposed in http://proceedings.mlr.press/v120/serrano20a.html
Experimental
- Adversarial Example Games for Adversarial training: https://arxiv.org/abs/2007.00720
- Try some optimizers (ExtraAdam, Negative Momentum) on text data (see the extragradient sketch below)
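As a point of reference for this project, here is a minimal, self-contained sketch (assuming only NumPy; not taken from the course materials) of the extragradient update on the toy bilinear game min_x max_y x·y, where plain simultaneous gradient descent-ascent diverges while extragradient converges to the equilibrium (0, 0):

```python
# Minimal sketch: extragradient vs. simultaneous gradient descent-ascent (GDA)
# on the bilinear game min_x max_y f(x, y) = x * y, whose unique equilibrium is (0, 0).
import numpy as np

def grad(x, y):
    # For f(x, y) = x * y: df/dx = y, df/dy = x.
    return y, x

def gda_step(x, y, lr=0.1):
    gx, gy = grad(x, y)
    return x - lr * gx, y + lr * gy            # descent on x, ascent on y

def extragradient_step(x, y, lr=0.1):
    gx, gy = grad(x, y)
    x_half, y_half = x - lr * gx, y + lr * gy  # lookahead (extrapolation) step
    gx, gy = grad(x_half, y_half)              # gradient at the lookahead point
    return x - lr * gx, y + lr * gy            # update from the original point

x_gda = y_gda = x_eg = y_eg = 1.0
for _ in range(200):
    x_gda, y_gda = gda_step(x_gda, y_gda)
    x_eg, y_eg = extragradient_step(x_eg, y_eg)

print("GDA distance to equilibrium:          ", np.hypot(x_gda, y_gda))  # grows
print("Extragradient distance to equilibrium:", np.hypot(x_eg, y_eg))    # shrinks
```

ExtraAdam (from the extragradient references above) combines this same extrapolation idea with Adam updates.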
Propose Your Own Project (higher risk, but this will be taken into account in the evaluation)
The extended abstract stage is very important because the proposal might be refused at mid-term if it is judged not to meet the standards of the course.