Penalty methods with stochastic approximation for stochastic nonlinear programming
- by Xiao Wang, Shiqian Ma and Ya-xiang Yuan
- Math. Comp. 86 (2017), 1793-1820
- DOI: https://doi.org/10.1090/mcom/3178
- Published electronically: October 12, 2016
Abstract:
In this paper, we propose a class of penalty methods with stochastic approximation for solving stochastic nonlinear programming problems. We assume that only noisy gradients or function values of the objective function are available via calls to a stochastic first-order or zeroth-order oracle. In each iteration of the proposed methods, we minimize an exact penalty function which is nonsmooth and nonconvex with only stochastic first-order or zeroth-order information available. Stochastic approximation algorithms are presented for solving this particular subproblem. The worst-case complexity of calls to the stochastic first-order (or zeroth-order) oracle for the proposed penalty methods for obtaining an $\epsilon$-stochastic critical point is analyzed.
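To make the idea in the abstract concrete, the following is a minimal sketch of stochastic approximation applied to an exact penalty function. It is not the authors' algorithm: the toy objective, the single linear constraint, the fixed penalty parameter `rho`, the step-size rule, and all function names are assumptions chosen for illustration. The sketch minimizes $f(x) + \rho\,|c(x)|$ using only noisy gradients of $f$, as a stochastic first-order oracle would provide.

```python
import math
import random


def noisy_gradient(x, rng, sigma=0.1):
    """Hypothetical stochastic first-order oracle.

    Returns the gradient of the toy objective f(x) = (x0 - 1)^2 + (x1 - 2)^2
    corrupted by zero-mean Gaussian noise (an assumption for this sketch).
    """
    target = (1.0, 2.0)
    return [2.0 * (xi - ti) + rng.gauss(0.0, sigma) for xi, ti in zip(x, target)]


def constraint(x):
    """Toy deterministic equality constraint c(x) = x0 + x1 - 1 = 0."""
    return x[0] + x[1] - 1.0


def penalty_subgradient(x, rho, rng):
    """Stochastic subgradient of the exact penalty f(x) + rho * |c(x)|."""
    g = noisy_gradient(x, rng)
    c = constraint(x)
    sign = (c > 0) - (c < 0)  # a subgradient of |.|; 0 is valid at c == 0
    # The gradient of c is (1, 1) for this toy constraint.
    return [gi + rho * sign for gi in g]


def stochastic_penalty_descent(x0, rho=5.0, iters=20000, seed=0):
    """Plain stochastic subgradient iteration with diminishing steps."""
    rng = random.Random(seed)
    x = list(x0)
    for k in range(1, iters + 1):
        step = 0.5 / math.sqrt(k)  # assumed step-size rule
        g = penalty_subgradient(x, rho, rng)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


if __name__ == "__main__":
    x = stochastic_penalty_descent([0.0, 0.0])
    print("approximate solution:", x)
    print("constraint violation:", abs(constraint(x)))
```

For this toy problem the constrained minimizer is $(0, 1)$ with multiplier $2$, so any `rho` above $2$ makes the penalty exact, and the iterates settle near the constraint. The problems treated in the paper are nonconvex with general nonlinear constraints, where a plain subgradient iteration like this one carries no such guarantee.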
Bibliographic Information
- Xiao Wang
- Affiliation: School of Mathematical Sciences, University of Chinese Academy of Sciences; Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, People’s Republic of China
- Email: wangxiao@ucas.ac.cn
- Shiqian Ma
- Affiliation: Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, N. T., Hong Kong
- MR Author ID: 826033
- Email: sqma@se.cuhk.edu.hk
- Ya-xiang Yuan
- Affiliation: State Key Laboratory of Scientific and Engineering Computing, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, People’s Republic of China
- Email: yyx@lsec.cc.ac.cn
- Received by editor(s): April 8, 2015
- Received by editor(s) in revised form: December 1, 2015
- Additional Notes: The research of the first author was supported in part by Postdoc Grant 119103S175, UCAS President Grant Y35101AY00 and NSFC Grant 11301505. The research of the second author was supported in part by a Direct Grant of the Chinese University of Hong Kong (Project ID: 4055016) and the Hong Kong Research Grants Council General Research Fund Early Career Scheme (Project ID: CUHK 439513). The research of the third author was supported in part by NSFC Grants 11331012, 11321061 and 11461161005.
- © Copyright 2016 American Mathematical Society
- Journal: Math. Comp. 86 (2017), 1793-1820
- MSC (2010): Primary 90C15, 90C30, 62L20, 90C60
- MathSciNet review: 3626537