Mathematics of Computation

Published by the American Mathematical Society since 1960 (published as Mathematical Tables and other Aids to Computation 1943-1959), Mathematics of Computation is devoted to research articles of the highest quality in computational mathematics.

ISSN 1088-6842 (online) ISSN 0025-5718 (print)


Contractivity of neural ODEs: An eigenvalue optimization problem

by Nicola Guglielmi, Arturo De Marinis, Anton Savostianov and Francesco Tudisco
Math. Comp.
DOI: https://doi.org/10.1090/mcom/4059
Published electronically: February 10, 2025

Abstract:

We propose a novel methodology to solve a key eigenvalue optimization problem which arises in the contractivity analysis of neural ordinary differential equations (ODEs). When looking at contractivity properties of a one-layer weight-tied neural ODE $\dot{u}(t) = \sigma(Au(t) + b)$ (where $u, b \in \mathbb{R}^n$, $A$ is a given $n \times n$ matrix, $\sigma : \mathbb{R} \to \mathbb{R}$ denotes an activation function and, for a vector $z \in \mathbb{R}^n$, $\sigma(z) \in \mathbb{R}^n$ is interpreted entry-wise), we are led to study the logarithmic norm of a set of products of type $DA$, where $D$ is a diagonal matrix such that $\mathrm{diag}(D) \in \sigma'(\mathbb{R}^n)$. Specifically, given a real number $c$ (usually $c = 0$), the problem consists in finding the largest positive interval $\mathrm{I} \subseteq [0, \infty)$ such that the logarithmic norm $\mu(DA) \le c$ for all diagonal matrices $D$ with $D_{ii} \in \mathrm{I}$. We propose a two-level nested methodology: an inner level where, for a given $\mathrm{I}$, we compute an optimizer $D^\star(\mathrm{I})$ by a gradient system approach, and an outer level where we tune $\mathrm{I}$ so that the value $c$ is reached by $\mu(D^\star(\mathrm{I})A)$. We extend the proposed two-level approach to the general multilayer, and possibly time-dependent, case $\dot{u}(t) = \sigma(A_k(t) \ldots \sigma(A_1(t) u(t) + b_1(t)) \ldots + b_k(t))$, and we present several numerical examples to illustrate its behaviour, including its stabilizing performance on a one-layer neural ODE applied to the classification of the MNIST handwritten digits dataset.
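The two-level structure described above can be illustrated with a minimal Python sketch. It is not the authors' gradient system method, and it rests on several assumptions that the abstract does not state: the logarithmic norm is taken to be the logarithmic 2-norm, $\mu_2(M) = \lambda_{\max}((M + M^\top)/2)$; the interval is taken to be $\mathrm{I} = [m, 1]$ with a fixed upper endpoint; and the inner optimizer is replaced by brute-force enumeration of the $2^n$ corners of the box $[m, 1]^n$. The enumeration is exact for the 2-norm because $D \mapsto \mu_2(DA)$ is convex, so its maximum over a box is attained at a corner, but it is only feasible for very small $n$. The function names and the test matrix are hypothetical.

```python
import itertools
import numpy as np


def log_norm_2(M):
    """Logarithmic 2-norm: largest eigenvalue of the symmetric part of M."""
    sym = 0.5 * (M + M.T)
    return float(np.linalg.eigvalsh(sym)[-1])  # eigvalsh sorts ascending


def worst_log_norm(A, m, upper=1.0):
    """Inner level (brute-force stand-in for the gradient system approach).

    Returns the maximum of mu_2(D A) over diagonal D with entries in
    [m, upper], found by enumerating the 2^n corners of the box.
    """
    n = A.shape[0]
    best = -np.inf
    for corner in itertools.product((m, upper), repeat=n):
        best = max(best, log_norm_2(np.diag(corner) @ A))
    return best


def smallest_lower_endpoint(A, c=0.0, upper=1.0, tol=1e-10):
    """Outer level: bisection on m so that the worst case equals c.

    The worst-case value is non-increasing in m (a larger m means a
    smaller box), so bisection on [0, upper] is valid.
    """
    if worst_log_norm(A, upper, upper) > c:
        raise ValueError("mu_2(upper * A) > c: no interval [m, upper] works")
    if worst_log_norm(A, 0.0, upper) <= c:
        return 0.0  # the whole interval [0, upper] is admissible
    lo, hi = 0.0, upper  # invariant: worst(lo) > c and worst(hi) <= c
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if worst_log_norm(A, mid, upper) <= c:
            hi = mid
        else:
            lo = mid
    return hi


# Hypothetical 2 x 2 example, chosen only for illustration.
A = np.array([[-2.0, 1.0], [1.0, -2.0]])
m_star = smallest_lower_endpoint(A)
print(m_star, worst_log_norm(A, m_star))
```

For this small test matrix the threshold can be checked by hand: the worst corner is $D = \mathrm{diag}(1, m)$, and solving $\mu_2(DA) = 0$ gives $m^\star = 7 - 4\sqrt{3} \approx 0.0718$. For realistic dimensions the corner enumeration is intractable, which is where an iterative inner solver such as the gradient system approach of the paper becomes necessary.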
Bibliographic Information
  • Nicola Guglielmi
  • Affiliation: Division of Mathematics, Gran Sasso Science Institute, viale Rendina 26-28, 67100, L’Aquila, Italy
  • MR Author ID: 603494
  • Email: nicola.guglielmi@gssi.it
  • Arturo De Marinis
  • Affiliation: Division of Mathematics, Gran Sasso Science Institute, viale Rendina 26-28, 67100, L’Aquila, Italy
  • MR Author ID: 1496141
  • ORCID: 0009-0004-4250-1054
  • Email: arturo.demarinis@gssi.it
  • Anton Savostianov
  • Affiliation: Division of Mathematics, Gran Sasso Science Institute, viale Rendina 26-28, 67100, L’Aquila, Italy; Computational Network Science, RWTH Aachen University, Aachen, Germany
  • MR Author ID: 1350668
  • ORCID: 0000-0003-0126-3059
  • Email: anton.savostianov@gssi.it, savostianov@cs.rwth-aachen.de
  • Francesco Tudisco
  • Affiliation: School of Mathematics, University of Edinburgh, James Clerk Maxwell Building, Peter Guthrie Tait Road, Edinburgh, EH9 3FD, UK; and Division of Mathematics, Gran Sasso Science Institute, viale Rendina 26-28, 67100, L’Aquila, Italy
  • MR Author ID: 941446
  • ORCID: 0000-0002-8150-4475
  • Email: f.tudisco@ed.ac.uk, francesco.tudisco@gssi.it
  • Received by editor(s): January 15, 2024
  • Received by editor(s) in revised form: September 8, 2024, and November 25, 2024
  • Additional Notes: The research of the first author was supported by funds from the Italian MUR (Ministero dell’Università e della Ricerca) within the PRIN 2022 Project “Advanced numerical methods for time dependent parametric partial differential equations with applications” and the PRO3 joint project entitled “Calcolo scientifico per le scienze naturali, sociali e applicazioni: sviluppo metodologico e tecnologico”. The first and fourth authors were supported by the MUR-PRO3 grant STANDS and the PRIN-PNRR grant FIN4GEO.
  • © Copyright 2025 American Mathematical Society
  • Journal: Math. Comp.
  • MSC (2020): Primary 15A18, 65F15, 93D40
  • DOI: https://doi.org/10.1090/mcom/4059