A day on Random Matrix Theory and Deep Learning
ATTENTION! Registration is free but mandatory for organizational purposes: register here.
The workshop is organized by the Rome Center on Mathematics for Modeling and Data ScienceS (RoMaDS), University of Rome Tor Vergata, and funded by the excellence program MatMod@TOV. The aim of the workshop is to bring together experts with interests at the intersection of Random Matrix Theory and Deep Learning.
Speakers
Nadav Cohen (Tel Aviv University)
Do Neural Nets Need Gradient Descent to Generalize? Matrix Factorization as a Theoretical Testbed
Abstract
Conventional wisdom attributes the mysterious generalization abilities of overparameterized neural networks to gradient descent (and its variants). The recent volume hypothesis challenges this view: it posits that these generalization abilities persist even when gradient descent is replaced by Guess & Check (G&C), i.e., by drawing weight settings until one that fits the training data is found. The validity of the volume hypothesis for wide and deep neural networks remains an open question. In this talk, I will theoretically investigate this question for matrix factorization---a common testbed in neural network theory. I will first show that generalization under G&C deteriorates with increasing width, establishing what is, to my knowledge, the first case where G&C is provably inferior to gradient descent. Conversely, I will show that generalization under G&C improves with increasing depth, revealing a stark contrast between wide and deep networks, which is validated empirically. These findings suggest that even in simple settings, there may not be a simple answer to the question of whether neural networks need gradient descent to generalize well.
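As a point of reference (this is not material from the talk), here is a minimal Python sketch of the Guess & Check procedure described in the abstract, run on a toy matrix-completion instance: random factor pairs are drawn until their product fits a few observed entries of a low-rank ground truth, and generalization is then measured on the held-out entries. All sizes, the tolerance, the 1/sqrt(width) scaling, and the use of observed entries as "training data" are illustrative assumptions.

```python
# Illustrative sketch of Guess & Check (G&C) on a toy matrix-factorization problem.
# Not taken from the talk; all parameters below are assumptions for the example.
import numpy as np

rng = np.random.default_rng(0)
d, r, n_obs, tol = 3, 1, 3, 0.5   # matrix size, true rank, observed entries, fit tolerance

# Low-rank ground truth and a random mask of "training" entries.
ground_truth = 0.5 * rng.standard_normal((d, r)) @ rng.standard_normal((r, d))
mask = np.zeros((d, d), dtype=bool)
mask.flat[rng.choice(d * d, size=n_obs, replace=False)] = True

def guess_and_check(width, max_draws=500_000):
    """Draw random factor pairs until the observed entries are fit; return held-out MSE."""
    for _ in range(max_draws):
        W1 = rng.standard_normal((width, d))
        W2 = rng.standard_normal((d, width))
        X = W2 @ W1 / np.sqrt(width)                          # candidate factorization
        if np.max(np.abs((X - ground_truth)[mask])) < tol:    # fits the training entries
            return np.mean((X - ground_truth)[~mask] ** 2)    # generalization error
    return None                                               # no fitting guess found

print("held-out MSE of the first fitting guess:", guess_and_check(width=2))
```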
Jon Keating (University of Oxford)
Some connections between random matrix theory and machine learning
Abstract
I will discuss some connections between random matrix theory and machine learning, focusing on the spectrum of the Hessian of the loss surface and touching on issues relating to neural collapse.
Florent Krzakala (École polytechnique fédérale de Lausanne)
TBA
Abstract
TBA
Zhenyu Liao (Huazhong University of Science & Technology)
A Random Matrix Approach to Neural Networks: From Linear to Nonlinear, and from Shallow to Deep
Abstract
Deep neural networks have become the cornerstone of modern machine learning, yet their multi-layer structure, nonlinearities, and intricate optimization processes pose considerable theoretical challenges. In this talk, I will review recent advances in random matrix analysis that shed new light on these complex ML models. Starting with the foundational case of linear regression, I will demonstrate how the proposed analysis extends naturally to shallow nonlinear and ultimately deep nonlinear network models. I will also discuss practical implications (e.g., compressing and/or designing "equivalent" NN models) that arise from these theoretical insights. The talk is based on a recent review paper (https://arxiv.org/abs/2506.13139), joint work with Michael W. Mahoney.
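To make the objects concrete (again, not material from the talk), here is a minimal Python sketch of the kind of random matrix studied in this line of work: the Gram matrix of a one-hidden-layer random-feature map, whose eigenvalue spectrum can be compared between a linear and a nonlinear activation. The dimensions, the ReLU choice, and the Gaussian data model are assumptions made for the example.

```python
# Illustrative sketch: eigenvalue spectrum of the Gram matrix of a shallow
# random-feature map, for a linear vs. a nonlinear activation.
# Not taken from the talk; sizes and the data model are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, d, p = 1000, 500, 800            # samples, input dimension, hidden width

X = rng.standard_normal((d, n)) / np.sqrt(d)   # isotropic Gaussian data
W = rng.standard_normal((p, d)) / np.sqrt(d)   # random first-layer weights

def gram_spectrum(activation):
    """Eigenvalues of (1/n) * F^T F for the features F = activation(W X)."""
    F = activation(W @ X)
    return np.linalg.eigvalsh(F.T @ F / n)

linear_spec = gram_spectrum(lambda z: z)                    # shallow linear model
relu_spec = gram_spectrum(lambda z: np.maximum(z, 0.0))     # shallow nonlinear model

print("linear spectrum edges:", linear_spec.min(), linear_spec.max())
print("ReLU   spectrum edges:", relu_spec.min(), relu_spec.max())
```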
Roland Speicher (Saarland University)
TBA
Abstract
TBA
Schedule
| Time | Speaker |
|---|---|
| 09h30 - 10h30 | Jon Keating |
| 10h30 - 11h00 | Coffee break |
| 11h00 - 12h00 | Roland Speicher |
| 12h00 - 13h00 | Zhenyu Liao |
| 13h00 - 14h30 | Lunch |
| 14h30 - 15h30 | Florent Krzakala |
| 15h30 - 16h00 | Coffee break |
| 16h00 - 17h00 | Nadav Cohen |
Venue
The conference will take place in Aula Gismondi (also known as Aula Magna) at the University of Rome Tor Vergata, Via della Ricerca Scientifica 1, 00133 Roma.
When you arrive and are facing the main building, head to your left. At the far left end of the building, just before entering, take the stairs on your left. Walk a few meters under the overhead 'bridges' and you will find the entrance.
Organizers
Domenico Marinucci
Stefano Vigogna
Michele Salvi
Contact
Should you have further questions, please write to [last name of second organizer] [at] mat [dot] uniroma2 [dot] it.