Mathematics of Computation

Published by the American Mathematical Society since 1960 (published as Mathematical Tables and other Aids to Computation 1943-1959), Mathematics of Computation is devoted to research articles of the highest quality in computational mathematics.

ISSN 1088-6842 (online) ISSN 0025-5718 (print)


Convergence and stability results for the particle system in the Stein gradient descent method

by José A. Carrillo and Jakub Skrzeczkowski
Math. Comp. 94 (2025), 1793-1814
DOI: https://doi.org/10.1090/mcom/4000
Published electronically: February 26, 2025

Abstract:

There has recently been a lot of interest in the analysis of the Stein gradient descent method, a deterministic sampling algorithm. It is based on a particle system moving along the gradient flow of the Kullback-Leibler divergence towards the asymptotic state corresponding to the desired distribution. Mathematically, the method can be formulated as a joint limit of time $t$ and number of particles $N$ going to infinity. We first observe that the recent work of Lu, Lu and Nolen [SIAM J. Math. Anal. 51 (2019), pp. 648–671] implies that if $t=O(\log (\log N))$, then the joint limit can be rigorously justified in the Wasserstein distance. Not satisfied with this time scale, we explore what happens for larger times by investigating the stability of the method: if the particles are initially close to the asymptotic state, with distance $O(1/N)$, how long will they remain close? We prove that they remain close on algebraic time scales $t=O(\sqrt {N})$, which is significantly better. The method we exploit, developed by Caglioti and Rousset [J. Stat. Phys. 129 (2007), pp. 241–263; Arch. Ration. Mech. Anal. 190 (2008), pp. 517–547] for the Vlasov equation, is based on finding a functional invariant for the linearized equation. This allows one to eliminate linear terms and arrive at an improved Grönwall-type estimate.
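
For orientation, the particle system described in the abstract can be sketched numerically as follows. This is a minimal illustration only, not the setting analyzed in the paper: the Gaussian (RBF) kernel, the bandwidth `h`, the explicit Euler time step `step`, the one-dimensional standard Gaussian target, and the function names `svgd_step` and `run_svgd` are all assumptions made for this example. The update follows the standard Stein variational gradient descent rule of Liu and Wang, in which each particle is driven by a kernel-weighted average of the target score plus a repulsive kernel-gradient term.

```python
import numpy as np

def svgd_step(x, grad_log_pi, step=0.1, h=1.0):
    """One explicit Euler step of the SVGD particle system (illustrative sketch).

    x           : (N, d) array of particle positions
    grad_log_pi : callable returning the (N, d) score of the target at x
    step, h     : time step and RBF bandwidth (assumed values, not from the paper)
    """
    n = x.shape[0]
    diff = x[:, None, :] - x[None, :, :]                  # diff[i, j] = x_i - x_j
    k = np.exp(-np.sum(diff**2, axis=-1) / (2 * h**2))    # k[i, j] = k(x_i, x_j)
    drive = k @ grad_log_pi(x)                            # sum_j k_ij * score(x_j)
    repulse = np.sum(k[:, :, None] * diff, axis=1) / h**2 # sum_j grad_{x_j} k(x_j, x_i)
    return x + step * (drive + repulse) / n

def run_svgd(n_particles=50, n_steps=500, seed=0):
    """Move particles started near 3 towards a standard Gaussian target."""
    rng = np.random.default_rng(seed)
    x = rng.normal(loc=3.0, scale=0.5, size=(n_particles, 1))
    grad_log_pi = lambda y: -y   # score of N(0, 1); assumed target for the demo
    for _ in range(n_steps):
        x = svgd_step(x, grad_log_pi)
    return x

if __name__ == "__main__":
    particles = run_svgd()
    print("empirical mean:", particles.mean(), "empirical std:", particles.std())
```

In this sketch the number of particles `n_particles` plays the role of $N$ and `n_steps * step` the role of the time $t$; the results quoted in the abstract concern how large $t$ may be taken relative to $N$ while the particle system stays close to its limit.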
References
  • N. Bou-Rabee and M. Hairer, Nonasymptotic mixing of the MALA algorithm, IMA J. Numer. Anal. 33 (2013), no. 1, 80–110. MR 3020951, DOI 10.1093/imanum/drs003
  • W. Braun and K. Hepp, The Vlasov dynamics and its fluctuations in the $1/N$ limit of interacting classical particles, Comm. Math. Phys. 56 (1977), no. 2, 101–113. MR 475547, DOI 10.1007/BF01611497
  • E. Caglioti and F. Rousset, Quasi-stationary states for particle systems in the mean-field limit, J. Stat. Phys. 129 (2007), no. 2, 241–263. MR 2358804, DOI 10.1007/s10955-007-9390-1
  • E. Caglioti and F. Rousset, Long time estimates in the mean field limit, Arch. Ration. Mech. Anal. 190 (2008), no. 3, 517–547. MR 2448326, DOI 10.1007/s00205-008-0157-x
  • J. A. Carrillo, S. Fagioli, F. Santambrogio, and M. Schmidtchen, Splitting schemes and segregation in reaction cross-diffusion systems, SIAM J. Math. Anal. 50 (2018), no. 5, 5695–5718. MR 3870087, DOI 10.1137/17M1158379
  • P. Chen and O. Ghattas, Projected Stein variational gradient descent, Adv. Neural Inform. Process. Syst. 33 (2020), 1947–1958.
  • Y. Chen, D. Zhengyu Huang, J. Huang, S. Reich, and A. M. Stuart, Gradient flows for sampling: mean-field models, Gaussian approximations and affine invariance, Preprint, arXiv:2302.11024, 2023.
  • A. Das and D. Nagaraj, Provably fast finite particle variants of SVGD via virtual particle stochastic approximation, Adv. Neural Inform. Process. Syst. 36 (2024).
  • R. L. Dobrušin, Vlasov equations, Funktsional. Anal. i Prilozhen. 13 (1979), no. 2, 48–58, 96 (Russian). MR 541637
  • Christian Düll, Piotr Gwiazda, Anna Marciniak-Czochra, and Jakub Skrzeczkowski, Spaces of measures and their applications to structured population models, Cambridge Monographs on Applied and Computational Mathematics, vol. 36, Cambridge University Press, Cambridge, 2022. MR 4309603
  • A. Duncan, N. Nüsken, and L. Szpruch, On the geometry of Stein variational gradient descent, J. Mach. Learn. Res. 24 (2023), Paper No. [56], 39. MR 4582478
  • Nicolas Fournier and Benoît Perthame, A nonexpanding transport distance for some structured equations, SIAM J. Math. Anal. 53 (2021), no. 6, 6847–6872. MR 4347325, DOI 10.1137/21M1397313
  • Piotr Gwiazda, Błażej Miasojedow, Jakub Skrzeczkowski, and Zuzanna Szymańska, Convergence of the EBT method for a non-local model of cell proliferation with discontinuous interaction kernel, IMA J. Numer. Anal. 43 (2023), no. 1, 590–626. MR 4565590, DOI 10.1093/imanum/drab102
  • Seung-Yeal Ha and Jian-Guo Liu, A simple proof of the Cucker-Smale flocking dynamics and mean-field limit, Commun. Math. Sci. 7 (2009), no. 2, 297–325. MR 2536440, DOI 10.4310/cms.2009.v7.n2.a2
  • J. Han and Q. Liu, Stein variational gradient descent without gradient, International Conference on Machine Learning, PMLR, 2018, pp. 1900–1908.
  • Daniel Han-Kwan and Toan T. Nguyen, Instabilities in the mean field limit, J. Stat. Phys. 162 (2016), no. 6, 1639–1653. MR 3463791, DOI 10.1007/s10955-016-1455-6
  • W. K. Hastings, Monte Carlo sampling methods using Markov chains and their applications, Biometrika 57 (1970), no. 1, 97–109. MR 3363437, DOI 10.1093/biomet/57.1.97
  • Maxime Hauray and Pierre-Emmanuel Jabin, $N$-particles approximation of the Vlasov equations with singular potential, Arch. Ration. Mech. Anal. 183 (2007), no. 3, 489–524. MR 2278413, DOI 10.1007/s00205-006-0021-9
  • Maxime Hauray and Pierre-Emmanuel Jabin, Particle approximation of Vlasov equations with singular forces: propagation of chaos, Ann. Sci. Éc. Norm. Supér. (4) 48 (2015), no. 4, 891–940 (English, with English and French summaries). MR 3377068, DOI 10.24033/asens.2261
  • Y. He, K. Balasubramanian, B. K. Sriperumbudur, and J. Lu, Regularized Stein variational gradient flow, Preprint, arXiv:2211.07861, 2022.
  • Pierre-Emmanuel Jabin and Zhenfu Wang, Mean field limit and propagation of chaos for Vlasov systems with bounded forces, J. Funct. Anal. 271 (2016), no. 12, 3588–3627. MR 3558251, DOI 10.1016/j.jfa.2016.09.014
  • A. Korba, A. Salim, M. Arbel, G. Luise, and A. Gretton, A non-asymptotic analysis for Stein variational gradient descent, Adv. Neural Inform. Process. Syst. 33 (2020), 4672–4682.
  • Giovanni Leoni and Massimiliano Morini, Necessary and sufficient conditions for the chain rule in $W^{1,1}_\textrm {loc}(\Bbb R^N;\Bbb R^d)$ and $\textrm {BV}_\textrm {loc}(\Bbb R^N;\Bbb R^d)$, J. Eur. Math. Soc. (JEMS) 9 (2007), no. 2, 219–252. MR 2293955, DOI 10.4171/JEMS/78
  • Q. Liu, Stein variational gradient descent as gradient flow, Adv. Neural Inform. Process. Syst. 30 (2017).
  • Q. Liu and D. Wang, Stein variational gradient descent: a general purpose Bayesian inference algorithm, Adv. Neural Inform. Process. Syst. 29 (2016).
  • Q. Liu and D. Wang, Stein variational gradient descent as moment matching, Adv. Neural Inform. Process. Syst. 31 (2018).
  • T. Liu, P. Ghosal, K. Balasubramanian, and N. Pillai, Towards understanding the dynamics of Gaussian-Stein variational gradient descent, Adv. Neural Inform. Process. Syst. 36 (2024).
  • Jianfeng Lu, Yulong Lu, and James Nolen, Scaling limit of the Stein variational gradient descent: the mean field regime, SIAM J. Math. Anal. 51 (2019), no. 2, 648–671. MR 3919409, DOI 10.1137/18M1187611
  • N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, Equation of state calculations by fast computing machines, J. Chem. Phys. 21 (1953), no. 6, 1087–1092.
  • Nikolas Nüsken and D. R. Michiel Renger, Stein variational gradient descent: many-particle and long-time asymptotics, Found. Data Sci. 5 (2023), no. 3, 286–320. MR 4622922, DOI 10.3934/fods.2022023
  • G. O. Roberts and R. L. Tweedie, Geometric convergence and central limit theorems for multidimensional Hastings and Metropolis algorithms, Biometrika 83 (1996), no. 1, 95–110. MR 1399158, DOI 10.1093/biomet/83.1.95
  • Gareth O. Roberts and Richard L. Tweedie, Exponential convergence of Langevin distributions and their discrete approximations, Bernoulli 2 (1996), no. 4, 341–363. MR 1440273, DOI 10.2307/3318418
  • A. Salim, L. Sun, and P. Richtarik, A convergence theory for SVGD in the population limit under Talagrand’s inequality T1, International Conference on Machine Learning, PMLR, 2022, pp. 19139–19152.
  • Filippo Santambrogio, Optimal transport for applied mathematicians, Progress in Nonlinear Differential Equations and their Applications, vol. 87, Birkhäuser/Springer, Cham, 2015. Calculus of variations, PDEs, and modeling. MR 3409718, DOI 10.1007/978-3-319-20828-2
  • Sylvia Serfaty, Mean field limit for Coulomb-type flows, Duke Math. J. 169 (2020), no. 15, 2887–2935. With an appendix by Mitia Duerinckx and Serfaty. MR 4158670, DOI 10.1215/00127094-2020-0019
  • J. Shi and L. Mackey, A finite-particle convergence rate for Stein variational gradient descent, Adv. Neural Inform. Process. Syst. 36 (2024).
  • A. M. Stuart, Inverse problems: a Bayesian perspective, Acta Numer. 19 (2010), 451–559. MR 2652785, DOI 10.1017/S0962492910000061
  • L. Sun, A. Karagulyan, and P. Richtarik, Convergence of Stein variational gradient descent under a weaker smoothness condition, International Conference on Artificial Intelligence and Statistics, PMLR, 2023, pp. 3693–3717.
  • Z. Szymańska, J. Skrzeczkowski, B. Miasojedow, and P. Gwiazda, Bayesian inference of a non-local proliferation model, Roy. Soc. Open Sci. 8 (2021), no. 11, 211279.
  • D. Wang, Z. Tang, C. Bajaj, and Q. Liu, Stein variational gradient descent with matrix-valued kernels, Adv. Neural Inform. Process. Syst. 32 (2019).
  • L. Xu, A. Korba, and D. Slepcev, Accurate quantization of measures via interacting particle-based optimization, International Conference on Machine Learning, PMLR, 2022, pp. 24576–24595.
  • J. Zhuo, C. Liu, J. Shi, J. Zhu, N. Chen, and B. Zhang, Message passing Stein variational gradient descent, International Conference on Machine Learning, PMLR, 2018, pp. 6018–6027.
Bibliographic Information
  • José A. Carrillo
  • Affiliation: Mathematical Institute, University of Oxford, Woodstock Road, Oxford OX2 6GG, United Kingdom
  • ORCID: 0000-0001-8819-4660
  • Email: carrillo@maths.ox.ac.uk
  • Jakub Skrzeczkowski
  • Affiliation: Mathematical Institute, University of Oxford, Woodstock Road, Oxford OX2 6GG, United Kingdom
  • MR Author ID: 1356459
  • ORCID: 0000-0003-3328-4428
  • Email: jakub.skrzeczkowski@maths.ox.ac.uk
  • Received by editor(s): December 26, 2023
  • Received by editor(s) in revised form: May 8, 2024, June 16, 2024, and June 17, 2024
  • Published electronically: February 26, 2025
  • Additional Notes: The authors were supported by the Advanced Grant Nonlocal-CPD (Nonlocal PDEs for Complex Particle Dynamics: Phase Transitions, Patterns and Synchronization) of the European Research Council Executive Agency (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No. 883363). The first author was also partially supported by the EPSRC grant numbers EP/T022132/1 and EP/V051121/1. The first author was also partially supported by the “Maria de Maeztu” Excellence Unit IMAG, reference CEX2020-001105-M, funded by MCIN/AEI/10.13039/501100011033/.
  • © Copyright 2025 American Mathematical Society
  • Journal: Math. Comp. 94 (2025), 1793-1814
  • MSC (2020): Primary 35Q62, 35B35, 35Q68, 62-08, 65K10
  • DOI: https://doi.org/10.1090/mcom/4000