Convergence acceleration of ensemble Kalman inversion in nonlinear settings

Chada, Neil; Tong, Xin

doi:10.1090/mcom/3709

Convergence acceleration of ensemble Kalman inversion in nonlinear settings
HTML articles powered by AMS MathViewer

by Neil K. Chada and Xin T. Tong HTML | PDF

Math. Comp. 91 (2022), 1247-1280 Request permission

Abstract:

Many data-science problems can be formulated as an inverse problem, where the parameters are estimated by minimizing a proper loss function. When complicated black-box models are involved, derivative-free optimization tools are often needed. The ensemble Kalman filter (EnKF) is a particle-based derivative-free Bayesian algorithm originally designed for data assimilation. Recently, it has been applied to inverse problems for computational efficiency. The resulting algorithm, known as ensemble Kalman inversion (EKI), involves running an ensemble of particles with EnKF update rules so they can converge to a minimizer. In this article, we investigate EKI convergence in general nonlinear settings. To improve convergence speed and stability, we consider applying EKI with non-constant step-sizes and covariance inflation. We prove that EKI can hit critical points with finite steps in non-convex settings. We further prove that EKI converges to the global minimizer polynomially fast if the loss function is strongly convex. We verify the analysis presented with numerical experiments on two inverse problems.

References

J. L. Anderson, An adaptive covariance inflation error correction algorithms for ensemble filters, Tellus A 59 (2007), 210–224.
J. L. Anderson, Spatially and temporally varying adaptive covariance inflation for ensemble filters, Tellus A 61 (2009), no. 1, 72–83.
Frank Bauer, Thorsten Hohage, and Axel Munk, Iteratively regularized Gauss-Newton method for nonlinear inverse problems with random noise, SIAM J. Numer. Anal. 47 (2009), no. 3, 1827–1846. MR 2505875, DOI 10.1137/080721789
Bradley M. Bell and Frederick W. Cathey, The iterated Kalman filter update as a Gauss-Newton method, IEEE Trans. Automat. Control 38 (1993), no. 2, 294–297. MR 1206815, DOI 10.1109/9.250476
Dennis S. Bernstein, Matrix mathematics, Princeton University Press, Princeton, NJ, 2005. Theory, facts, and formulas with application to linear systems theory. MR 2123424
Dimitri P. Bertsekas, Incremental least squares methods and the extended Kalman filter, SIAM J. Optim. 6 (1996), no. 3, 807–822. MR 1402206, DOI 10.1137/S1052623494268522
Martin Benning and Martin Burger, Modern regularization methods for inverse problems, Acta Numer. 27 (2018), 1–111. MR 3826506, DOI 10.1017/s0962492918000016
Dimitris Bertsimas and John Tsitsiklis, Simulated annealing, Probability and algorithms, Nat. Acad. Press, Washington, DC, 1992, pp. 17–29. MR 1194437
Alexandros Beskos, Dan Crisan, and Ajay Jasra, On the stability of sequential Monte Carlo methods in high dimensions, Ann. Appl. Probab. 24 (2014), no. 4, 1396–1445. MR 3211000, DOI 10.1214/13-AAP951
C. Bishop, B. Etherton, and S. Majumdar, Adaptive sampling with the ensemble transform kalman filter part i: the theoretical aspects, Monthly Weather Rev. 129 (2001), 420–436.
Dirk Blömker, Claudia Schillings, Philipp Wacker, and Simon Weissmann, Well posedness and convergence analysis of the ensemble Kalman inversion, Inverse Problems 35 (2019), no. 8, 085007, 32. MR 3988266, DOI 10.1088/1361-6420/ab149c
T. Bui-Thanh and M. Girolami, Solving large-scale PDE-constrained Bayesian inverse problems with Riemann manifold Hamiltonian Monte Carlo, Inverse Problems 30 (2014), no. 11, 114014, 23. MR 3274598, DOI 10.1088/0266-5611/30/11/114014
Martin Burger and Stanley Osher, Convergence rates of convex variational regularization, Inverse Problems 20 (2004), no. 5, 1411–1421. MR 2109126, DOI 10.1088/0266-5611/20/5/005
N. K. Chada, Analysis of hierarchical ensemble Kalman inversion, preprint arXiv:1801.00847, 2018.
Neil K. Chada, Marco A. Iglesias, Lassi Roininen, and Andrew M. Stuart, Parameterizations for ensemble Kalman inversion, Inverse Problems 34 (2018), no. 5, 055009, 31. MR 3788159, DOI 10.1088/1361-6420/aab6d9
Neil K. Chada, Andrew M. Stuart, and Xin T. Tong, Tikhonov regularization within ensemble Kalman inversion, SIAM J. Numer. Anal. 58 (2020), no. 2, 1263–1294. MR 4089506, DOI 10.1137/19M1242331
Xi Chen, Jason D. Lee, Xin T. Tong, and Yichen Zhang, Statistical inference for model parameters in stochastic gradient descent, Ann. Statist. 48 (2020), no. 1, 251–273. MR 4065161, DOI 10.1214/18-AOS1801
Sebastian Reich and Colin J. Cotter, Ensemble filter techniques for intermittent data assimilation, Large scale inverse problems, Radon Ser. Comput. Appl. Math., vol. 13, De Gruyter, Berlin, 2013, pp. 91–134. MR 3185328
M. Dashti, K. J. H. Law, A. M. Stuart, and J. Voss, MAP estimators and their consistency in Bayesian nonparametric inverse problems, Inverse Problems 29 (2013), no. 9, 095017, 27. MR 3104933, DOI 10.1088/0266-5611/29/9/095017
Pierre Del Moral, Arnaud Doucet, and Ajay Jasra, Sequential Monte Carlo samplers, J. R. Stat. Soc. Ser. B Stat. Methodol. 68 (2006), no. 3, 411–436. MR 2278333, DOI 10.1111/j.1467-9868.2006.00553.x
M. M. Dunlop, Multiplicative noise in Bayesian inverse problems: Well-posedness and consistency of MAP estimators, preprint arXiv:1910.14632, 2019.
John Duchi, Elad Hazan, and Yoram Singer, Adaptive subgradient methods for online learning and stochastic optimization, J. Mach. Learn. Res. 12 (2011), 2121–2159. MR 2825422
Heinz W. Engl, Martin Hanke, and Andreas Neubauer, Regularization of inverse problems, Mathematics and its Applications, vol. 375, Kluwer Academic Publishers Group, Dordrecht, 1996. MR 1408680, DOI 10.1007/978-94-009-1740-8
Geir Evensen, Data assimilation, 2nd ed., Springer-Verlag, Berlin, 2009. The ensemble Kalman filter. MR 2555209, DOI 10.1007/978-3-642-03711-5
G. Evensen, The ensemble Kalman filter: Theoretical formulation and practical implementation, Ocean Dyn. 53 (2003), no. 4, 343–367.
Alfredo Garbuno-Inigo, Franca Hoffmann, Wuchen Li, and Andrew M. Stuart, Interacting Langevin diffusions: gradient structure and ensemble Kalman sampler, SIAM J. Appl. Dyn. Syst. 19 (2020), no. 1, 412–441. MR 4059375, DOI 10.1137/19M1251655
N. J. Gordon, D.J. Salmond, and A. F. M. Smith, Novel approach to nonlinear/non-Gaussian Bayesian state estimation, IEE Proc. F - Radar Signal Process. 140 (1993), no. 2.
S. Gratton, A. S. Lawless, and N. K. Nichols, Approximate Gauss-Newton methods for nonlinear least squares problems, SIAM J. Optim. 18 (2007), no. 1, 106–132. MR 2299676, DOI 10.1137/050624935
E. Haber, F. Lucka, and L. Ruthotto, Never look back - A modified EnKF method and its application to the training of neural networks without back propagation, preprint arXiv:1805.08034, 2018.
Martin Hanke, A regularizing Levenberg-Marquardt scheme, with applications to inverse groundwater filtration problems, Inverse Problems 13 (1997), no. 1, 79–95. MR 1435869, DOI 10.1088/0266-5611/13/1/007
T. M. Hamill and J. S. Whitaker, Accounting for the error due to unresolved scales in ensemble data assimilation: a comparison of different approaches, Monthly Weather Rev. 133 (2005), no. 11, 3132–3147.
P. L. Houtekamer and H. L. Mitchell, A sequential ensemble kalman filter for atmospheric data assimilation, Monthly Weather Rev. 129 (2001), no. 1, 123–137.
Marco A. Iglesias, A regularizing iterative ensemble Kalman method for PDE-constrained inverse problems, Inverse Problems 32 (2016), no. 2, 025002, 45. MR 3476487, DOI 10.1088/0266-5611/32/2/025002
Marco A. Iglesias, Kody J. H. Law, and Andrew M. Stuart, Ensemble Kalman methods for inverse problems, Inverse Problems 29 (2013), no. 4, 045001, 20. MR 3041539, DOI 10.1088/0266-5611/29/4/045001
Marco Iglesias, Minho Park, and M. V. Tretyakov, Bayesian inversion in resin transfer molding, Inverse Problems 34 (2018), no. 10, 105002, 46. MR 3841732, DOI 10.1088/1361-6420/aad1cc
S. J. Julier and J. K. Uhlmann, New extension of the Kalman filter to nonlinear systems, Signal processing, sensor fusion, and target recognition VI, International Society for Optics and Photonics vol. 3068, 1997.
R. E. Kalman, A new approach to linear filtering and prediction problems, Trans. ASME Ser. D. J. Basic Engrg. 82 (1960), no. 1, 35–45. MR 3931993, DOI 10.1115/1.3662552
Nikolas Kantas, Alexandros Beskos, and Ajay Jasra, Sequential Monte Carlo methods for high-dimensional inverse problems: a case study for the Navier-Stokes equations, SIAM/ASA J. Uncertain. Quantif. 2 (2014), no. 1, 464–489. MR 3283917, DOI 10.1137/130930364
C. T. Kelley, Iterative methods for optimization, Frontiers in Applied Mathematics, vol. 18, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1999. MR 1678201, DOI 10.1137/1.9781611970920
D. T. B. Kelly, K. J. H. Law, and A. M. Stuart, Well-posedness and accuracy of the ensemble Kalman filter in discrete and continuous time, Nonlinearity 27 (2014), no. 10, 2579–2604. MR 3265724, DOI 10.1088/0951-7715/27/10/2579
D. T. Kelly, A. J. Majda, and X. T. Tong, Concrete ensemble Kalman filters with rigorous catastrophic filter divergence, Proc. Natl. Acad. Sci., 112 (2016), no. 34, 10589–10594, 2016.
S. Kirkpatrick, C. D. Gelatt Jr., and M. P. Vecchi, Optimization by simulated annealing, Science 220 (1983), no. 4598, 671–680. MR 702485, DOI 10.1126/science.220.4598.671
Nikola B. Kovachki and Andrew M. Stuart, Ensemble Kalman inversion: a derivative-free technique for machine learning tasks, Inverse Problems 35 (2019), no. 9, 095005, 35. MR 3998631, DOI 10.1088/1361-6420/ab1c3a
Evan Kwiatkowski and Jan Mandel, Convergence of the square root ensemble Kalman filter in the large ensemble limit, SIAM/ASA J. Uncertain. Quantif. 3 (2015), no. 1, 1–17. MR 3295689, DOI 10.1137/140965363
Theresa Lange and Wilhelm Stannat, On the continuous time limit of the ensemble Kalman filter, Math. Comp. 90 (2021), no. 327, 233–265. MR 4166460, DOI 10.1090/mcom/3588
Theresa Lange and Wilhelm Stannat, On the continuous time limit of ensemble square root filters, Commun. Math. Sci. 19 (2021), no. 7, 1855–1880. MR 4312747, DOI 10.4310/CMS.2021.v19.n7.a5
Kody Law, Andrew Stuart, and Konstantinos Zygalakis, Data assimilation, Texts in Applied Mathematics, vol. 62, Springer, Cham, 2015. A mathematical introduction. MR 3363508, DOI 10.1007/978-3-319-20325-6
G. Li and A. C. Reynolds, Iterative ensemble Kalman filters for data assimilation, SPE J. 14 (2009), 496–505.
F. Le Gland, V. Monbet, and V.-D. Tran, Large sample asymptotics for the ensemble Kalman filter, The Oxford handbook of nonlinear filtering, Oxford Univ. Press, Oxford, 2011, pp. 598–631. MR 2884610
Qiang Liu and Xin T. Tong, Accelerating Metropolis-within-Gibbs sampler with localized computations of differential equations, Stat. Comput. 30 (2020), no. 4, 1037–1056. MR 4108690, DOI 10.1007/s11222-020-09934-w
David M. Livings, Sarah L. Dance, and Nancy K. Nichols, Unbiased ensemble square root filters, Phys. D 237 (2008), no. 8, 1021–1028. MR 2450491, DOI 10.1016/j.physd.2008.01.005
Gabriel J. Lord, Catherine E. Powell, and Tony Shardlow, An introduction to computational stochastic PDEs, Cambridge Texts in Applied Mathematics, Cambridge University Press, New York, 2014. MR 3308418, DOI 10.1017/CBO9781139017329
E. N. Lorenz, Predictability: A problem partly solved, Proc. ECMWF Semin. predictability, 1 (1986), 1–18.
Andrew J. Majda and John Harlim, Filtering complex turbulent systems, Cambridge University Press, Cambridge, 2012. MR 2934167, DOI 10.1017/CBO9781139061308
Andrew J. Majda and Xin T. Tong, Performance of ensemble Kalman filters in large dimensions, Comm. Pure Appl. Math. 71 (2018), no. 5, 892–937. MR 3794517, DOI 10.1002/cpa.21722
Jan Mandel, Loren Cobb, and Jonathan D. Beezley, On the convergence of the ensemble Kalman filter, Appl. Math. 56 (2011), no. 6, 533–541. MR 2886236, DOI 10.1007/s10492-011-0031-2
Hiroyuki Moriyama, Nobuo Yamashita, and Masao Fukushima, The incremental Gauss-Newton algorithm with adaptive stepsize rule, Comput. Optim. Appl. 26 (2003), no. 2, 107–141. MR 2013510, DOI 10.1023/A:1025703629626
M. Morzfeld, X. T. Tong, and Y. M. Marzouk, Localization for MCMC: sampling high-dimensional posterior distributions with local structure, J. Comput. Phys. 380 (2019), 1–28. MR 3901708, DOI 10.1016/j.jcp.2018.12.008
Yurii Nesterov, Introductory lectures on convex optimization, Applied Optimization, vol. 87, Kluwer Academic Publishers, Boston, MA, 2004. A basic course. MR 2142598, DOI 10.1007/978-1-4419-8853-9
Jorge Nocedal and Stephen J. Wright, Numerical optimization, 2nd ed., Springer Series in Operations Research and Financial Engineering, Springer, New York, 2006. MR 2244940
D. Oliver, A. C. Reynolds, and N. Liu, Inverse theory for petroleum reservoir characterization and history matching, Cambridge University Press, 2008.
Vivak Patel, Kalman-based stochastic gradient method with stop condition and insensitivity to conditioning, SIAM J. Optim. 26 (2016), no. 4, 2620–2648. MR 3576578, DOI 10.1137/15M1048239
V. P. Plagianakos, G. D. Magoulas, and M. N. Vrahatis, Learning rate adaptation in stochastic gradient descent, Advances in convex analysis and global optimization (Pythagorion, 2000) Nonconvex Optim. Appl., vol. 54, Kluwer Acad. Publ., Dordrecht, 2001, pp. 433–444. MR 1846169, DOI 10.1007/978-1-4613-0279-7_{2}7
B. T. Polyak and A. B. Juditsky, Acceleration of stochastic approximation by averaging, SIAM J. Control Optim. 30 (1992), no. 4, 838–855. MR 1167814, DOI 10.1137/0330046
P. N. Raanes, A. Carrassi, and L. Bertino, Extending the square root method to account for additive forecast noise in ensemble methods, Monthly Weather Rev. 143 (2015), no. 10, 3857–3873.
Sebastian Reich, Data assimilation: the Schrödinger perspective, Acta Numer. 28 (2019), 635–711. MR 3963510, DOI 10.1017/s0962492919000011
Sebastian Reich and Colin Cotter, Probabilistic forecasting and Bayesian data assimilation, Cambridge University Press, New York, 2015. MR 3242790, DOI 10.1017/CBO9781107706804
Konrad Reif, Stefan Günther, Engin Yaz, and Rolf Unbehauen, Stochastic stability of the discrete-time extended Kalman filter, IEEE Trans. Automat. Control 44 (1999), no. 4, 714–728. MR 1684426, DOI 10.1109/9.754809
Luis Miguel Rios and Nikolaos V. Sahinidis, Derivative-free optimization: a review of algorithms and comparison of software implementations, J. Global Optim. 56 (2013), no. 3, 1247–1293. MR 3070154, DOI 10.1007/s10898-012-9951-y
Herbert Robbins and Sutton Monro, A stochastic approximation method, Ann. Math. Statistics 22 (1951), 400–407. MR 42668, DOI 10.1214/aoms/1177729586
Claudia Schillings and Andrew M. Stuart, Analysis of the ensemble Kalman filter for inverse problems, SIAM J. Numer. Anal. 55 (2017), no. 3, 1264–1290. MR 3654885, DOI 10.1137/16M105959X
M. Schweiger, S. R. Arridge, and I. Nissila, Gauss-Newton method for image reconstruction in diffuse optical tomography, Phys. Med. Biol. 50 (2005), 2365–2386.
A. M. Stuart, Inverse problems: a Bayesian perspective, Acta Numer. 19 (2010), 451–559. MR 2652785, DOI 10.1017/S0962492910000061
T. J. Sullivan, Introduction to uncertainty quantification, Texts in Applied Mathematics, vol. 63, Springer, Cham, 2015. MR 3364576, DOI 10.1007/978-3-319-23395-6
Albert Tarantola, Inverse problem theory, Elsevier Science Publishers, B.V., Amsterdam, 1987. Methods for data fitting and model parameter estimation. MR 930881
M. K. Tippett, J. L. Anderson, C. H. Bishop, T. M. Hamill, and C. Whitaker, Ensemble square root filters, Mon. Wea. Rev. 131 (2003), 1485–1490.
Xin T. Tong, Performance analysis of local ensemble Kalman filter, J. Nonlinear Sci. 28 (2018), no. 4, 1397–1442. MR 3817786, DOI 10.1007/s00332-018-9453-2
Xin T. Tong, Andrew J. Majda, and David Kelly, Nonlinear stability of the ensemble Kalman filter with adaptive covariance inflation, Commun. Math. Sci. 14 (2016), no. 5, 1283–1313. MR 3506803, DOI 10.4310/CMS.2016.v14.n5.a5
E. A. Wan and R. Van Der Merwe, The unscented Kalman filter for nonlinear estimation, Proceedings of the IEEE 2000 Adaptive Systems for Signal Processing, Communications, and Control Symposium, IEEE, 2000.