# Plenary talks

Monday, June 12, 9:30 ~ 10:30

## Geometric Knot Theory

### John Sullivan

#### TU Berlin, Germany - This email address is being protected from spambots. You need JavaScript enabled to view it.

Geometric knot theory studies how geometric properties of space curves relate to their topological knot type. We will give a survey of the field, starting with one of the earliest results, the Fáry/Milnor theorem relating total curvature to bridge number. Work in recent decades has been partly motivated by applications to the shapes of knotted polymers like DNA molecules. One interesting problem with some surprising answers asks for the shapes of knots and links tied tight in rope of fixed thickness. We will consider this ropelength problem, describing for instance an explicit shape for the tight Borromean rings, as well as analogous problems for periodic links in space and for knot diagrams in the plane. Finally, we will also mention elastic knots, Gromov's notion of distortion, and the second hull of a knot or link.

Monday, June 12, 11:00 ~ 12:00

## Computational Group Theory and the Classification of Finite Groups

### Bettina Eick

#### Technische Universität Braunschweig, Germany - This email address is being protected from spambots. You need JavaScript enabled to view it.

The talk gives a broad survey on the history of computational group theory and then focus on its application in the classification of finite groups. In particular, the talk shows how the theory of groups is used in this part of computational group theory to design effective algorithms and it discusses its results and open problems.

Tuesday, June 13, 9:30 ~ 10:30

## On the computational complexity of high-dimensional MCMC in non-linear inverse problems

### Richard Nickl

#### University of Cambridge, UK - This email address is being protected from spambots. You need JavaScript enabled to view it.

We discuss recent results, both positive and negative, about the run-time of Markov chain Monte Carlo (MCMC) algorithms that target posterior distributions arising from high-dimensional non-linear statistical regression models with Gaussian process priors. Prototypical applications include inverse problems with partial differential equations (PDEs). We show that cold-start local MCMC may not work (have `exponential in dimension’ runtime) even for target measures that are radially strictly decreasing away from their unique mode, but that warm-start Langevin type MCMC can achieve polynomial runtime for posterior computation under certain `gradient stability’ conditions that can be verified in a large class of relevant PDE models.

Joint work with Sven Wang (MIT), and for the lower bounds further with Afonso Bandeira (ETH Zürich), Antoine Maillard (ETH Zürich)..

Tuesday, June 13, 11:10 ~ 12:10

### Shayan Oveis Gharan - TBA

#### University of Washington, This email address is being protected from spambots. You need JavaScript enabled to view it.

Wednesday, June 14, 9:30 ~ 10:30

## Least-squares solvers for PDEs

### Rob Stevenson

#### University of Amsterdam, Netherlands - This email address is being protected from spambots. You need JavaScript enabled to view it.

Least squares solvers regain popularity because of their suitability for approximation with neural networks. After recalling the least-squares approach to numerically approximate the solution of a PDE, we will present techniques to obtain quasi-optimal approximations also in the case that the residual is measured in a norm that cannot be evaluated, as a dual norm or fractional Sobolev norm. The latter norms typically arise when the PDE is appended with inhomogeneous boundary conditions. Dual norms are unavoidable with so-called ultra-weak formulations of PDEs. By employing with such formulations the so-called optimal test norm, one is able to approximate the best approximation from the trial space within any tolerance.

As an application of the latter approach, we present details about an ultra-weak first order system system discretization of the Helmholtz equation that is pollution-free.

Wednesday, June 14, 11:00 ~ 12:00

## Rigorous Approximations for Interacting Particle Systems on Large Sparse Graphs

### Kavita Ramanan

#### Brown University, USA - This email address is being protected from spambots. You need JavaScript enabled to view it.

Many physical phenomena are modelled by large collections of randomly evolving particles in which the instantaneous evolution of the state of each particle depends only on the states of particles in its neighborhood with respect to an underlying interaction graph. While classical work, falling under the rubric of mean-field approximations, has focused on the case when this interaction graph is dense, most real-world networks are sparse and often random. We describe recent developments that provide asymptotically exact approximations for such interacting particle systems in the complementary case when the graph is sparse.

Thursday, June 15, 9:30 ~ 10:30

## The multi armed bandit problem

### Angelika Steger

#### ETH Zurich, Switzerland - This email address is being protected from spambots. You need JavaScript enabled to view it.

In this talk we cover two well studied problems in machine learning resp cognitive neuroscience: the multi armed bandit problem resp. reversal learning setups. The problem is the following: consider a machine with K arms, where each arm provides a random reward from an unknown probability distribution that potentially can change over time. The objective of the player is to maximize the sum of expected rewards by using an appropriate strategy. The crucial tradeoff that the player faces at each step is the tradeoff between „exploitation" of the arm that she believes has the highest expected payoff and "exploration" to get more information on the underlying probability distributions of all arms. In this talk we provide new optimal algorithms for several versions of the problem.

Joint work with Maxime Larcher (ETH Zurich) and Robert Meier (ETH Zurich).

Thursday, June 15, 11:00 ~ 12:00

## Resonances as a computational tool

### Katharina Schratz

#### Sorbonne University, France - This email address is being protected from spambots. You need JavaScript enabled to view it.

A large toolbox of numerical schemes for dispersive equations has been established, based on different discretization techniques such as discretizing the variation-of-constants formula (e.g., exponential integrators) or splitting the full equation into a series of simpler subproblems (e.g., splitting methods). In many situations these classical schemes allow a precise and efficient approximation. This, however, drastically changes whenever non-smooth phenomena enter the scene such as for problems at low regularity and high oscillations. Classical schemes fail to capture the oscillatory nature of the solution, and this may lead to severe instabilities and loss of convergence. In this talk I present a new class of resonance based schemes. The key idea in the construction of the new schemes is to tackle and deeply embed the underlying nonlinear structure of resonances into the numerical discretization. As in the continuous case, these terms are central to structure preservation and offer the new schemes strong geometric properties at low regularity.

I will present the key idea behind resonances as a computational tool, their high order counterpart (via tree series inspired by singular SPDEs), their error estimates in low regularity spaces (via discrete Bourgain spaces) and numerical experiments on nonlinear dispersive quantization effects. I also want to address the fundamental question of resonance based schemes and structure preservation, i.e., central symmetries and even more so symplecticity.

Friday, June 16, 9:30 ~ 10:30

## Threshold-linear networks, attractors, and oriented matroids

### Carina Curto

#### Pennsylvania State University, USA - This email address is being protected from spambots. You need JavaScript enabled to view it.

Threshold-linear networks (TLNs) are common firing rate models in theoretical neuroscience that are useful for modeling neural activity and computation in the brain. They are simple, recurrently-connected networks with a rich repertoire of nonlinear dynamics including multistability, limit cycles, quasiperiodic attractors, and chaos. Over the past few years, we have developed a mathematical theory relating stable and unstable fixed points of TLNs to combinatorial properties of an underlying graph. The resulting "graph rules" and "gluing rules" provide a direct link between network architecture and key features of the dynamics. Many open questions remain, however, such as: How do changes in network parameters, such as connectivity, transition a TLN from one dynamic regime to another? In this talk, I present some new results in the bifurcation theory of TLNs, with an eye towards understanding how these networks may evolve via training (or learning). It turns out that a certain family of determinants related to the underlying hyperplane arrangement of a TLN is key. In particuar, the theory of oriented matroids and Grassmann-Plucker relations provide valuable insights into the allowed fixed point configurations of TLNs, as well as the allowed bifurcations.

Friday, June 16, 11:10 ~ 12:10

## Universality, the new trend in development of Optimization Schemes

### Yurii Nesterov

#### CORE/INMA, UCLouvain, Belgium - This email address is being protected from spambots. You need JavaScript enabled to view it.

In the early years of Optimization, the first classical schemes were derived from an abstract concept of approximation (e.g. Gradient method, Newton’s methods, etc.). However, since the development of Complexity Theory for Convex Optimization (Nemirovsky, Yudin 1970’s), the most powerful approaches for constructing efficient (optimal) methods are based on the model of the objective function. This model incorporates the characteristic properties of the corresponding problem class and provides us with a comprehensive information on the behavior of the objective. At the same time, it helps in deriving theoretically unimprovable complexity bounds for the target class.

However, this framework completely neglects the fact that every objective function belongs, at the same time, to many different problem classes. Hence, it should be treated by a method developed for the most appropriate class of problems. However, for the real-life problems, such a choice is hardly feasible, at least in advance.

In this talk, we discuss several ideas for constructing universal methods, which automatically ensure the best possible convergence rate among appropriate problem classes. The simplest methods of this type adjust to the best power in Holder condition for the target derivative. Our most promising super-universal Regularized Newton’s Method works properly for a wide range of problems, starting from the functions with bounded variation of Hessian up to the functions with Lipschitz continuous third derivative. Thus, being a second-order scheme, it covers all diversity of problems, from the problems traditionally treated by the first-order methods, up to the problems, which are usually attributed to the third-order schemes. For its proper work, no preliminary information on the objective function is needed.

Joint work with Nikita Doikov (EPFL, Switzerland), Geovani Grapiglia (UCLouvain, Belgium) and Konstantin Mishchenko (Samsung Research, Cambridge, UK).

Saturday, June 17, 9:30 ~ 10:30

## Stochastic gradient descent with adaptive step-sizes: from practice to theory

### Rachel Ward

#### University of Texas at Austin, USA - This email address is being protected from spambots. You need JavaScript enabled to view it.

Stochastic gradient descent (SGD) is the foundational optimization algorithm used to train neural networks, and variations of SGD where the step-sizes are updated adaptively based on past stochastic gradient information are a crucial part of this success. Recently, a theoretical understanding of the behavior of stochastic gradient descent with adaptive step-sizes has emerged, shedding light on why adaptivity is so powerful in practice.

We focus on the simplest adaptive gradient algorithm called Adagrad-norm, and recall how Adagrad-norm can match the optimal convergence rates of carefully-tuned SGD. We then discuss recent results which show that even when the optimization landscape is not globally smooth — as arises even in simple neural networks — Adagrad-norm can adapt and converge at an optimal rate without additional hyperparameter tuning.

Saturday, June 17, 11:00 ~ 12:00

## Optimization, Complexity and Math (or, can we prove P!=NP by gradient descent?).

### Avi Wigderson

#### Institute for Advanced Study, Princeton, United States - This email address is being protected from spambots. You need JavaScript enabled to view it.

This talk aims to summarize a project I was involved in during the past decade, with the hope of explaining our most complete understanding so far, as well as challenges and open problems. The main messages of this project are summarized below; I plan to describe, through examples, many of the concepts they refer to, and the evolution of ideas leading to them. No special background is assumed.

(1) The most basic tools of convex optimization in Euclidean space extend to a far more general setting of Riemannian manifolds that arise from the symmetries of noncommutative groups. We develop first-order and second-order algorithms, and analyze their performance in general. While proving convergence bounds requires heavy algebraic and analytic tools, convergence itself depends in an elegant way on natural ``smoothness’’ parameters, in analogy with the Euclidean (commutative) case.

(2) These algorithms give exponential (or better) improvements in run-time for solving algorithmic many problems across CS, Math and Physics. In particular, these include problems in algebra (e.g. testing rational identities in non-commutative variables), in analysis (testing the feasibility and tightness of Brascamp-Lieb inequalities), in quantum information theory (to the quantum marginals problem), in algebraic geometry (to computing Kronecker coefficients), in computational complexity (to derandomizing new special cases of the PIT problem) and in optimization (to testing membership in large, implicitly described polytopes).

(3) The focus on symmetries exposes old and reveals new relations between the problems above. Essentially, they are all membership problems in null cones and moment polytopes of natural group actions on natural spaces. Invariant theory, which studies such group actions, plays an essential role in this development. In particular, a beautiful non-commutative duality theory (expending linear programming duality in the commutative case), and notions of geodesic convexity (extending the Euclidean one) and moment maps (extending the Euclidean gradient) are central to the algorithms and their analysis. Interestingly, most algorithms in invariant theory are symbolic/algebraic, and these new numeric/analytic algorithms proposed here often significantly improve on them.

Based on joint works with Zeyuan Allen-Zhu, Peter Burgisser, Cole Franks, Ankit Garg, Leonid Gurvits, Pavel Hrubes, Yuanzhi Li, Visu Makam, Rafael Oliveira and Michael Walter.

Monday, June 19, 9:30 ~ 10:30

## Randomization for solving difficult linear algebra problems

### Daniel Kressner

#### EPFL, Switzerland - This email address is being protected from spambots. You need JavaScript enabled to view it.

Randomized algorithms are becoming increasingly popular in matrix computations. Recent software efforts, such as RandLAPACK, testify that randomization is on the brink of replacing existing deterministic techniques for several large-scale linear algebra tasks. The poster child of these developments, the randomized singular value decomposition is nowadays one of the state-the-of-art approaches for performing low-rank approximation. In this talk, we will discuss numerous further examples for the potential of randomization to facilitate the solution of notoriously difficult linear algebra tasks. This includes a simple numerical algorithm for jointly diagonalizing a family of nearly commuting matrices, the solution of challenging singular and nonlinear eigenvalue problems, as well as the low-rank approximation of matrix functions and matrix-valued functions. A common theme of all these developments is that randomization helps turn linear algebra results that only hold generically into robust and reliable numerical algorithms.

Monday, June 19, 11:10 ~ 12:10

## High-dimensional Stochastic Gradient Descent: effective dynamics and critical scaling

### Gérard Ben Arous

#### Courant Institute of Mathematical Sciences, New York University, U.S.A - This email address is being protected from spambots. You need JavaScript enabled to view it.

I will survey recent joint work with Reza Gheissari (Northwestern) and Aukosh Jagannath (Waterloo) and upcoming work with the same authors and with Jiaoyang Huang (Wharton, University of Pennsylvania).

The trajectories of SGD (properly rescaled) typically converge to the gradient flow of the population loss, in finite dimensions. How can one efficiently state such a theorem when the dimension is diverging?

We prove limit theorems for the trajectories of summary statistics (i.e., finite-dimensional functions) of SGD as the dimension goes to infinity. Our approach allows one to choose the summary statistics that are tracked, the initialization, and the step-size. It yields both ballistic (ODE) and diffusive (SDE) limits, with the limit depending dramatically on the former choices.

Interestingly, we find a critical scaling regime for the step-size below which the effective ballistic dynamics matches gradient flow for the population loss, but at which, a new correction term appears which changes the phase diagram.

About the fixed points of this effective dynamics, the corresponding diffusive limits can be quite complex and even degenerate. We demonstrate our approach on popular examples including estimation for spiked matrix and tensor models and classification via two-layer networks for binary and XOR-type Gaussian mixture models.

These examples exhibit surprising phenomena for these limiting dynamical systems, including multimodal timescales to convergence as well as convergence to sub-optimal solutions with probability bounded away from zero from random (e.g.,Gaussian) initializations.

The next open question are then: how does one find relevant summary statistics for less explicit models than these examples? And how are we sure that the induced effective dynamics perform well?

Joint work with Aukosh Jagannath (University of Waterloo, Canada), Reza Gheissari (Northwestern University, U.S.A) and Jiaoyang Huang (Wharton School, University of Pennsylvania, U.S.A).

Tuesday, June 20, 9:30 ~ 10:30

## Multiple orthogonal polynomials and special functions

### Walter Van Assche

#### KU Leuven, Belgium - This email address is being protected from spambots. You need JavaScript enabled to view it.

Multiple orthogonal polynomials are polynomials in one variable that satisfy orthogonality conditions with respect to $r$ measures. They appear naturally in Hermite-Pad\'e approximation to $r$ functions. The case $r=1$ corresponds to the usual orthogonal polynomials. Several systems of multiple orthogonal polynomials have been constructed using classical weight functions (multiple Hermite, multiple Laguerre, multiple Jacobi polynomials). In this talk I will use weight functions given by special functions satisfying a differential equation. The $r$ weights then appear by writing the differential equation as a system of first order equations, which then generalizes the Pearson equation for classical orthogonal polynomials. The weights are in terms of modified Bessel functions $K_\nu(2\sqrt{x})$, modified Bessel functions $I_\nu(2\sqrt{x})$, hypergeometric functions, confluent hypergeometric functions, and the exponential integral. We give some applications where these multiple orthogonal polynomials appear, such as eigenvalues and singular values of products of random matrices, non-intersecting Brownian motions, and rational approximations for real numbers.

Tuesday, June 20, 11:00 ~ 12:00

## Towards a Fluid computer

### Eva Miranda

#### Universitat Politècnica de Catalunya and Centre de Recerca Matemàtica, Spain - This email address is being protected from spambots. You need JavaScript enabled to view it.

In 1991, Moore [4] raised a question about whether hydrodynamics is capable of performing computations. Similarly, in 2016, Tao [6] asked whether a mechanical system, including a fluid flow, can simulate a universal Turing machine. Etnyre and Ghrist showed in [3] that contact geometry and fluid dynamics are related through a mirror that reflects Reeb vector fields as Beltrami vector fields, allowing us to answer these questions.

In this talk, we will present the construction in [1] of a “Fluid computer” in dimension 3 that uses this "mirror" combining techniques developed by Alan Turing with symbolic dynamics and modern Geometry (contact geometry). A completely different construction for the Euclidean metric is given in [2]. These constructions reveal the existence of undecidable fluid paths. Tao's question was motivated by a research program to address the Navier–Stokes existence and smoothness problem ([5] and [6]). Could such a Fluid computer be used to address this Millennium prize problem?

Time permitting, we will end up the talk with some speculative ideas of a Fluid computer construction à la Feynman.

[1] R. Cardona, E. Miranda, D. Peralta-Salas and F. Presas, Constructing Turing complete Euler flows in dimension 3. Proc. Natl. Acad. Sci. USA 118 (2021), no. 19, Paper No. e2026818118, 9 pp.

[2] R. Cardona, E. Miranda, and D. Peralta-Salas, Computability and Beltrami fields in Euclidean space. J. Math. Pures Appl. (9) 169 (2023), 50–81.

[3] J. Etnyre, R. Ghrist, Contact topology and hydrodynamics: I. Beltrami fields and the Seifert conjecture. Nonlinearity 13, 441 (2000).

[4] C. Moore, Generalized shifts: Unpredictability and undecidability in dynamical systems. Nonlinearity 4, 199 (1991).

[5] T. Tao, Searching for singularities in the Navier–Stokes equations. Nat. Rev. Phys. 1, 418–419 (2019).

[6] T. Tao, Finite time blowup for an averaged three-dimensional Navier–Stokes equation. J. Am. Math. Soc. 29, 601–674 (2016).

Wednesday, June 21, 9:30 ~ 10:30

## Fourier interpolation pairs and modular forms

### Maryna Viazovska

#### EPFL, Switzerland - This email address is being protected from spambots. You need JavaScript enabled to view it.

This lecture is about Fourier uniqueness and Fourier interpolation pairs, some explicit constructions of such pairs and their applications. Suppose that we have two subsets X and Y of the Euclidean space. Can we reconstruct a function f from its restriction to the set X and the restriction of its Fourier transform to the set Y? We are interested in the pairs (X,Y) such that the answer to the question above is affirmative. In this talk I will give an overview of recent progress on explicit constructions and existence results for Fourier interpolation pairs and corresponding interpolation formulas. Also I will try to convince you that explicit Fourier interpolation is a useful gadget in solving optimization problem and analyzing differential equations.

Wednesday, June 21, 11:00 ~ 12:00

## Optimal sampling

### Ben Adcock

#### Simon Fraser University, Canada - This email address is being protected from spambots. You need JavaScript enabled to view it.

Approximating an unknown function from a limited number of pointwise samples is a fundamental problem in applied mathematics, computer science and machine learning. This talk concerns the following question: given a fixed (linear or nonlinear) approximation space, how should one choose the sample points to obtain as good an approximation from as few samples as possible? In this talk, I will first describe the case of linear approximation spaces. Here a near-complete answer to this question has emerged in recent years. I will survey this recent progress, aiming to highlight connections to approximation theory, information-based complexity and randomized numerical linear algebra. Next, I will explore the significantly more challenging setting of nonlinear approximation spaces. Here I will present several new theoretical results and discuss recent progress towards optimal sampling. I will also discuss an important generalization of the problem to non-pointwise, and potentially non-scalar valued, samples. Finally, and time permitting, I will present a series of novel applications, including learning with sparse models, compressive imaging using generative models, and the numerical solution of PDEs using Physics-Informed Neural Networks (PINNs).

Joint work with Juan M. Cardenas (Simon Fraser University, Canada) and Nick Dexter (Florida State University, USA).