Randomized dynamic programming principle and Feynman-Kac representation for optimal control of McKean-Vlasov dynamics
- by Erhan Bayraktar, Andrea Cosso and Huyên Pham
- Trans. Amer. Math. Soc. 370 (2018), 2115-2160
Abstract:
We analyze a stochastic optimal control problem, where the state process follows a McKean-Vlasov dynamics and the diffusion coefficient can be degenerate. We prove that its value function $V$ admits a nonlinear Feynman-Kac representation in terms of a class of forward-backward stochastic differential equations, with an autonomous forward process. We exploit this probabilistic representation to rigorously prove the dynamic programming principle (DPP) for $V$. The Feynman-Kac representation we obtain has an important role beyond its intermediary role in obtaining our main result: in fact it would be useful in developing probabilistic numerical schemes for $V$. The DPP is important in obtaining a characterization of the value function as a solution of a nonlinear partial differential equation (the so-called Hamilton-Jacobi-Bellman equation), in this case on the Wasserstein space of measures. We should note that the usual way of solving these equations is through the Pontryagin maximum principle, which requires some convexity assumptions. There were earlier attempts to use the dynamic programming approach, but these works assumed a priori that the controls were of Markovian feedback type, which helps write the problem only in terms of the distribution of the state process (and the control problem becomes a deterministic problem). In this paper, we will consider open-loop controls and derive the dynamic programming principle in this most general case. In order to obtain the Feynman-Kac representation and the randomized dynamic programming principle, we implement the so-called randomization method, which consists of formulating a new McKean-Vlasov control problem, expressed in weak form taking the supremum over a family of equivalent probability measures. One of the main results of the paper is the proof that this latter control problem has the same value function $V$ as the original control problem.
References
- Daniel Andersson and Boualem Djehiche, A maximum principle for SDEs of mean-field type, Appl. Math. Optim. 63 (2011), no. 3, 341–356. MR 2784835, DOI 10.1007/s00245-010-9123-8
- N. Aronszajn and P. Panitchpakdi, Extension of uniformly continuous transformations and hyperconvex metric spaces, Pacific J. Math. 6 (1956), 405–439. MR 84762
- Alan Bain and Dan Crisan, Fundamentals of stochastic filtering, Stochastic Modelling and Applied Probability, vol. 60, Springer, New York, 2009. MR 2454694, DOI 10.1007/978-0-387-76896-0
- E. Bandini, A. Cosso, M. Fuhrman, and H. Pham, Randomization method and backward SDEs for optimal control of partially observed path-dependent stochastic systems, preprint, arXiv:1511.09274v1 (2015).
- A. Bensoussan, J. Frehse, and S. C. P. Yam, On the interpretation of the Master Equation, Stochastic Process. Appl. 127 (2017), no. 7, 2093–2137. MR 3652408, DOI 10.1016/j.spa.2016.10.004
- Alain Bensoussan, Jens Frehse, and Phillip Yam, Mean field games and mean field type control theory, SpringerBriefs in Mathematics, Springer, New York, 2013. MR 3134900, DOI 10.1007/978-1-4614-8508-7
- Dimitri P. Bertsekas and Steven E. Shreve, Stochastic optimal control, Mathematics in Science and Engineering, vol. 139, Academic Press, Inc. [Harcourt Brace Jovanovich, Publishers], New York-London, 1978. The discrete time case. MR 511544
- Rainer Buckdahn, Boualem Djehiche, and Juan Li, A general stochastic maximum principle for SDEs of mean-field type, Appl. Math. Optim. 64 (2011), no. 2, 197–216. MR 2822408, DOI 10.1007/s00245-011-9136-y
- Rainer Buckdahn, Juan Li, Shige Peng, and Catherine Rainer, Mean-field stochastic differential equations and associated PDEs, Ann. Probab. 45 (2017), no. 2, 824–878. MR 3630288, DOI 10.1214/15-AOP1076
- P. Cardaliaguet, Notes on mean field games, https://www.ceremade.dauphine.fr/cardalia/MFG100629.pdf (2012).
- René Carmona, Lectures on BSDEs, stochastic control, and stochastic differential games with financial applications, Financial Mathematics, vol. 1, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2016. MR 3629171, DOI 10.1137/1.9781611974249
- René Carmona and François Delarue, The master equation for large population equilibriums, Stochastic analysis and applications 2014, Springer Proc. Math. Stat., vol. 100, Springer, Cham, 2014, pp. 77–128. MR 3332710, DOI 10.1007/978-3-319-11292-3_4
- René Carmona and François Delarue, Forward-backward stochastic differential equations and controlled McKean-Vlasov dynamics, Ann. Probab. 43 (2015), no. 5, 2647–2700. MR 3395471, DOI 10.1214/14-AOP946
- René Carmona, François Delarue, and Aimé Lachapelle, Control of McKean-Vlasov dynamics versus mean field games, Math. Financ. Econ. 7 (2013), no. 2, 131–166. MR 3045029, DOI 10.1007/s11579-012-0089-y
- René Carmona, Jean-Pierre Fouque, and Li-Hsien Sun, Mean field games and systemic risk, Commun. Math. Sci. 13 (2015), no. 4, 911–933. MR 3325083, DOI 10.4310/CMS.2015.v13.n4.a4
- Sébastien Choukroun and Andrea Cosso, Backward SDE representation for stochastic control problems with nondominated controlled intensity, Ann. Appl. Probab. 26 (2016), no. 2, 1208–1259. MR 3476636, DOI 10.1214/15-AAP1115
- Marco Fuhrman and Huyên Pham, Randomized and backward SDE representation for optimal control of non-Markovian SDEs, Ann. Appl. Probab. 25 (2015), no. 4, 2134–2167. MR 3349004, DOI 10.1214/14-AAP1045
- Wilfrid Gangbo, Hwa Kil Kim, and Tommaso Pacini, Differential forms on Wasserstein space and infinite-dimensional Hamiltonian systems, Mem. Amer. Math. Soc. 211 (2011), no. 993, vi+77. MR 2808856, DOI 10.1090/S0065-9266-2010-00610-0
- Olav Kallenberg, Foundations of modern probability, 2nd ed., Probability and its Applications (New York), Springer-Verlag, New York, 2002. MR 1876169, DOI 10.1007/978-1-4757-4015-8
- Idris Kharroubi, Nicolas Langrené, and Huyên Pham, Discrete time approximation of fully nonlinear HJB equations via BSDEs with nonpositive jumps, Ann. Appl. Probab. 25 (2015), no. 4, 2301–2338. MR 3349008, DOI 10.1214/14-AAP1049
- Idris Kharroubi and Huyên Pham, Feynman-Kac representation for Hamilton-Jacobi-Bellman IPDE, Ann. Probab. 43 (2015), no. 4, 1823–1865. MR 3353816, DOI 10.1214/14-AOP920
- N. V. Krylov, Controlled diffusion processes, Stochastic Modelling and Applied Probability, vol. 14, Springer-Verlag, Berlin, 2009. Translated from the 1977 Russian original by A. B. Aries; Reprint of the 1980 edition. MR 2723141
- Daniel Lacker, Limit theory for controlled McKean-Vlasov dynamics, SIAM J. Control Optim. 55 (2017), no. 3, 1641–1672. MR 3654119, DOI 10.1137/16M1095895
- Mathieu Laurière and Olivier Pironneau, Dynamic programming for mean-field type control, C. R. Math. Acad. Sci. Paris 352 (2014), no. 9, 707–713 (English, with English and French summaries). MR 3258261, DOI 10.1016/j.crma.2014.07.008
- P.L. Lions, Cours au Collège de France: Théorie des jeux à champ moyens (audio conference, 2006–2012).
- Huyên Pham and Xiaoli Wei, Dynamic programming for optimal control of stochastic McKean-Vlasov dynamics, SIAM J. Control Optim. 55 (2017), no. 2, 1069–1101. MR 3631380, DOI 10.1137/16M1071390
- Daniel Revuz and Marc Yor, Continuous martingales and Brownian motion, 3rd ed., Grundlehren der mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 293, Springer-Verlag, Berlin, 1999. MR 1725357, DOI 10.1007/978-3-662-06400-9
- A. N. Shiryaev, Probability, 2nd ed., Graduate Texts in Mathematics, vol. 95, Springer-Verlag, New York, 1996. Translated from the first (1980) Russian edition by R. P. Boas. MR 1368405, DOI 10.1007/978-1-4757-2539-1
- C. Stricker and M. Yor, Calcul stochastique dépendant d’un paramètre, Z. Wahrsch. Verw. Gebiete 45 (1978), no. 2, 109–133 (French). MR 510530, DOI 10.1007/BF00715187
- Shan Jian Tang and Xun Jing Li, Necessary conditions for optimal control of stochastic systems with random jumps, SIAM J. Control Optim. 32 (1994), no. 5, 1447–1475. MR 1288257, DOI 10.1137/S0363012992233858
- Cédric Villani, Optimal transport, Grundlehren der mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 338, Springer-Verlag, Berlin, 2009. Old and new. MR 2459454, DOI 10.1007/978-3-540-71050-9
- J. Zabczyk, Chance and decision, Scuola Normale Superiore di Pisa. Quaderni. [Publications of the Scuola Normale Superiore of Pisa], Scuola Normale Superiore, Pisa, 1996. Stochastic control in discrete time. MR 1678432
Additional Information
- Erhan Bayraktar
- Affiliation: Department of Mathematics, University of Michigan, 530 Church Street, Ann Arbor, Michigan 48109
- MR Author ID: 743030
- ORCID: 0000-0002-1926-4570
- Email: erhan@umich.edu
- Andrea Cosso
- Affiliation: Politecnico di Milano, Dipartimento di Matematica, via Bonardi 9, 20133 Milano, Italy
- Address at time of publication: Dipartimento di Matematica, Università di Bologna, Piazza di Porta S. Donato, 5, 40126 Bologna, Italy
- MR Author ID: 1024819
- Email: andrea.cosso@unibo.it
- Huyên Pham
- Affiliation: Laboratoire de Probabilités et Modèles Aléatoires, CNRS, UMR 7599, Université Paris Diderot, 75205 Paris Cedex 13, France, and CREST-ENSAE
- MR Author ID: 363068
- Email: pham@math.univ-paris-diderot.fr
- Received by editor(s): June 26, 2016
- Received by editor(s) in revised form: October 25, 2016
- Published electronically: November 15, 2017
- Additional Notes: The first author was supported in part by the National Science Foundation under grant DMS-1613170 and the Susan M. Smith Professorship. The third author was supported in part by the ANR project CAESARS (ANR-15-CE05-0024).
- © Copyright 2017 American Mathematical Society
- Journal: Trans. Amer. Math. Soc. 370 (2018), 2115-2160
- MSC (2010): Primary 49L20, 93E20, 60K35, 60H10, 60H30
- DOI: https://doi.org/10.1090/tran/7118
- MathSciNet review: 3739204