A day on Random Matrix Theory and Deep Learning
ATTENTION! Registration is free but mandatory for organizational purposes: register here.
The workshop is organized by the Rome Center on Mathematics for Modeling and Data ScienceS (RoMaDS), University of Rome Tor Vergata, and funded by the excellence program MatMod@TOV. The aim of the workshop is to bring together experts with interests at the intersection of Random Matrix Theory and Deep Learning.
Speakers
Nadav Cohen (Tel Aviv University)
Do Neural Nets Need Gradient Descent to Generalize? Matrix Factorization as a Theoretical Testbed
Abstract
Conventional wisdom attributes the mysterious generalization abilities of overparameterized neural networks to gradient descent (and its variants). The recent volume hypothesis challenges this view: it posits that these generalization abilities persist even when gradient descent is replaced by Guess & Check (G&C), i.e., by drawing weight settings until one that fits the training data is found. The validity of the volume hypothesis for wide and deep neural networks remains an open question. In this talk, I will theoretically investigate this question for matrix factorization---a common testbed in neural network theory. I will first show that generalization under G&C deteriorates with increasing width, establishing what is, to my knowledge, the first case where G&C is provably inferior to gradient descent. Conversely, I will show that generalization under G&C improves with increasing depth, revealing a stark contrast between wide and deep networks, which is validated empirically. These findings suggest that even in simple settings, there may not be a simple answer to the question of whether neural networks need gradient descent to generalize well.
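As a point of reference (this is not material from the talk), here is a minimal Python sketch of the Guess & Check procedure described in the abstract, run on a toy matrix-completion instance: random factor pairs are drawn until their product fits a few observed entries of a low-rank ground truth, and generalization is then measured on the held-out entries. All sizes, the tolerance, the 1/sqrt(width) scaling, and the use of observed entries as "training data" are illustrative assumptions.

```python
# Illustrative sketch of Guess & Check (G&C) on a toy matrix-factorization problem.
# Not taken from the talk; all parameters below are assumptions for the example.
import numpy as np

rng = np.random.default_rng(0)
d, r, n_obs, tol = 3, 1, 3, 0.5   # matrix size, true rank, observed entries, fit tolerance

# Low-rank ground truth and a random mask of "training" entries.
ground_truth = 0.5 * rng.standard_normal((d, r)) @ rng.standard_normal((r, d))
mask = np.zeros((d, d), dtype=bool)
mask.flat[rng.choice(d * d, size=n_obs, replace=False)] = True

def guess_and_check(width, max_draws=500_000):
    """Draw random factor pairs until the observed entries are fit; return held-out MSE."""
    for _ in range(max_draws):
        W1 = rng.standard_normal((width, d))
        W2 = rng.standard_normal((d, width))
        X = W2 @ W1 / np.sqrt(width)                          # candidate factorization
        if np.max(np.abs((X - ground_truth)[mask])) < tol:    # fits the training entries
            return np.mean((X - ground_truth)[~mask] ** 2)    # generalization error
    return None                                               # no fitting guess found

print("held-out MSE of the first fitting guess:", guess_and_check(width=2))
```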
Jon Keating (University of Oxford)
Some connections between random matrix theory and machine learning
Abstract
I will discuss some connections between random matrix theory and machine learning, focusing on the spectrum of the Hessian of the loss surface and touching on issues relating to neural collapse.
Florent Krzakala (École polytechnique fédérale de Lausanne)
TBA
Abstract
TBA
Zhenyu Liao (Huazhong University of Science & Technology)
A Random Matrix Approach to Neural Networks: From Linear to Nonlinear, and from Shallow to Deep
Abstract
Deep neural networks have become the cornerstone of modern machine learning, yet their multi-layer structure, nonlinearities, and intricate optimization processes pose considerable theoretical challenges. In this talk, I will review recent advances in random matrix analysis that shed new light on these complex ML models. Starting with the foundational case of linear regression, I will demonstrate how the proposed analysis extends naturally to shallow nonlinear and ultimately deep nonlinear network models. I will also discuss practical implications (e.g., compressing and/or designing "equivalent" NN models) that arise from these theoretical insights. The talk is based on a recent review paper (https://arxiv.org/abs/2506.13139), joint work with Michael W. Mahoney.
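To make the objects concrete (again, not material from the talk), here is a minimal Python sketch of the kind of random matrix studied in this line of work: the Gram matrix of a one-hidden-layer random-feature map, whose eigenvalue spectrum can be compared between a linear and a nonlinear activation. The dimensions, the ReLU choice, and the Gaussian data model are assumptions made for the example.

```python
# Illustrative sketch: eigenvalue spectrum of the Gram matrix of a shallow
# random-feature map, for a linear vs. a nonlinear activation.
# Not taken from the talk; sizes and the data model are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, d, p = 1000, 500, 800            # samples, input dimension, hidden width

X = rng.standard_normal((d, n)) / np.sqrt(d)   # isotropic Gaussian data
W = rng.standard_normal((p, d)) / np.sqrt(d)   # random first-layer weights

def gram_spectrum(activation):
    """Eigenvalues of (1/n) * F^T F for the features F = activation(W X)."""
    F = activation(W @ X)
    return np.linalg.eigvalsh(F.T @ F / n)

linear_spec = gram_spectrum(lambda z: z)                    # shallow linear model
relu_spec = gram_spectrum(lambda z: np.maximum(z, 0.0))     # shallow nonlinear model

print("linear spectrum edges:", linear_spec.min(), linear_spec.max())
print("ReLU   spectrum edges:", relu_spec.min(), relu_spec.max())
```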
Roland Speicher (Saarland University)
TBA
Abstract
TBA
Schedule
| Time | Speaker |
|---|---|
| 09h30 - 10h30 | Jon Keating |
| 10h30 - 11h00 | Coffee break |
| 11h00 - 12h00 | Roland Speicher |
| 12h00 - 13h00 | Zhenyu Liao |
| 13h00 - 14h30 | Lunch |
| 14h30 - 15h30 | Florent Krzakala |
| 15h30 - 16h00 | Coffee break |
| 16h00 - 17h00 | Nadav Cohen |
Venue
The conference will take place in Aula Gismondi (also known as Aula Magna) at the University of Rome Tor Vergata, Via della Ricerca Scientifica 1, 00133 Roma.
When you arrive and are facing the main building, head to your left. At the far left end of the building, just before entering, take the stairs on your left. Walk a few meters under the overhead 'bridges' and you will find the entrance.
Organizers
Domenico Marinucci
Stefano Vigogna
Michele Salvi
Contact
Should you have further questions, please write to [last name of second organizer] [at] mat [dot] uniroma2 [dot] it.