On the Geometry of Deep Learning

by Randall Balestriero, Ahmed Imtiaz Humayun, and Richard G. Baraniuk
Communicated by Reza Malek-Madani
Introduction
Machine learning has significantly advanced our ability to address a wide range of difficult computational problems and is the engine driving progress in modern artificial intelligence (AI). Today's machine learning landscape is dominated by deep (neural) networks, which are compositions of a large number of simple parametrized linear and nonlinear operators. An all-too-familiar story of the past decade is that of plugging a deep network into an engineering or scientific application as a black box, learning its parameter values using copious training data, and then significantly improving performance over classical task-specific approaches based on erudite practitioner expertise or mathematical elegance.
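To make the phrase "compositions of a large number of simple parametrized linear and nonlinear operators" concrete, the sketch below builds a small feedforward network as an alternation of affine maps and elementwise ReLU nonlinearities. The layer widths, random parameters, and function names are illustrative choices for this sketch, not anything prescribed by the article.

```python
import numpy as np

rng = np.random.default_rng(0)

def affine(x, W, b):
    # Simple parametrized linear (affine) operator: x -> Wx + b.
    return W @ x + b

def relu(x):
    # Simple nonlinear operator, applied elementwise.
    return np.maximum(x, 0.0)

# Illustrative layer widths; each (W, b) pair parametrizes one affine operator.
widths = [4, 8, 8, 2]
params = [(rng.standard_normal((m, n)), rng.standard_normal(m))
          for n, m in zip(widths[:-1], widths[1:])]

def deep_net(x):
    # A deep network is the composition of these operators, layer by layer.
    for W, b in params[:-1]:
        x = relu(affine(x, W, b))   # affine map followed by a nonlinearity
    W, b = params[-1]
    return affine(x, W, b)          # final affine layer, no nonlinearity

y = deep_net(rng.standard_normal(widths[0]))
print(y.shape)  # (2,)
```

Each layer is locally simple (an affine map plus a fixed elementwise nonlinearity), yet the composition of many such layers is what makes the overall map globally complicated, as discussed next.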
Despite this exciting empirical progress, however, the precise mechanisms by which deep learning works so well remain relatively poorly understood, adding an air of mystery to the entire field. Ongoing attempts to build a rigorous mathematical framework have been stymied by the fact that, while deep networks are locally simple, they are globally complicated. Hence, they have been studied primarily as "black boxes" and mainly empirically, an approach that greatly complicates efforts to understand both their success and failure modes.