Remote Access Mathematics of Computation
Green Open Access

Mathematics of Computation

ISSN 1088-6842(online) ISSN 0025-5718(print)



Analyzing random permutations for cyclic coordinate descent

Authors: Stephen J. Wright and Ching-pei Lee
Journal: Math. Comp. 89 (2020), 2217-2248
MSC (2010): Primary 65F10; Secondary 90C25, 68W20
Published electronically: March 27, 2020
MathSciNet review: 4109565
Full-text PDF
View in AMS MathViewer New

Abstract | References | Similar Articles | Additional Information

Abstract: We consider coordinate descent methods for minimization of convex quadratic functions, in which exact line searches are performed at each iteration. (This algorithm is identical to Gauss-Seidel on the equivalent symmetric positive definite linear system.) We describe a class of convex quadratic functions for which the random permutations version of cyclic coordinate descent (RPCD) is observed to outperform the standard cyclic coordinate descent (CCD) approach on computational tests, yielding convergence behavior similar to the fully random variant (RCD). A convergence analysis is developed to explain the empirical observations.

References [Enhancements On Off] (What's this?)

  • [1] Amir Beck and Luba Tetruashvili, On the convergence of block coordinate descent type methods, SIAM J. Optim. 23 (2013), no. 4, 2037–2060. MR 3116649,
  • [2] I. Gelfand, Normierte Ringe, Rec. Math. [Mat. Sbornik] N. S. 9 (51) (1941), 3–24 (German, with Russian summary). MR 0004726
  • [3] Ching-pei Lee and Stephen J. Wright, Random permutations fix a worst case for cyclic coordinate descent, IMA J. Numer. Anal. 39 (2019), no. 3, 1246–1275. MR 4023751,
  • [4] Xingguo Li, Tuo Zhao, Raman Arora, Han Liu, and Mingyi Hong, On faster convergence of cyclic block coordinate descent-type methods for strongly convex minimization, J. Mach. Learn. Res. 18 (2017), Paper No. 184, 24. MR 3827072
  • [5] Yu. Nesterov, Efficiency of coordinate descent methods on huge-scale optimization problems, SIAM J. Optim. 22 (2012), no. 2, 341–362. MR 2968857,
  • [6] B. Recht and C. Ré, Toward a noncommutative arithmetic-geometric mean inequality: Conjectures, case-studies, and consequences, Proceedings of the 25th Annual Conference on Learning Theory, vol. 23, 2012, pp. 11.1-11.24.
  • [7] Shai Shalev-Shwartz and Tong Zhang, Stochastic dual coordinate ascent methods for regularized loss minimization, J. Mach. Learn. Res. 14 (2013), 567–599. MR 3033340
  • [8] R. Sun and M. Hong, Improved iteration complexity bounds of cyclic block coordinate descent for convex problems, Advances in Neural Information Processing Systems, 2015, pp. 1306-1314.
  • [9] R. Sun, Z.-Q. Luo, and Y. Ye, On the efficiency of random permutation for admm and coordinate descent, Tech. Report arXiv:1503.06387 (2019). To appear in Math. Oper. Res.
  • [10] R. Sun and Y. Ye, Worst-case complexity of cyclic coordinate descent: $ {O}(n^2)$ gap with randomized version, Mathematical Programming (2019), 1-34, Online first.
  • [11] S. J. Wright, Computations with coordinate descent methods, Presentation at Workshop on Challenges in Optimization for Data Science,, July 2015.
  • [12] S. J. Wright, Coordinate descent methods, Colloquium, Courant Institute of Mathematical Sciences, December 2015.

Similar Articles

Retrieve articles in Mathematics of Computation with MSC (2010): 65F10, 90C25, 68W20

Retrieve articles in all journals with MSC (2010): 65F10, 90C25, 68W20

Additional Information

Stephen J. Wright
Affiliation: Computer Sciences Department, University of Wisconsin-Madison, Madison, Wisconsin

Ching-pei Lee
Affiliation: Computer Sciences Department, University of Wisconsin-Madison, Madison, Wisconsin
Address at time of publication: Department of Mathematics, National University of Singapore

Keywords: Coordinate descent, Gauss-Seidel, randomization, permutations
Received by editor(s): May 27, 2017
Received by editor(s) in revised form: February 12, 2019, November 25, 2019, and January 5, 2020
Published electronically: March 27, 2020
Additional Notes: This work was supported by NSF Awards IIS-1447449, 1628384, 1634597, and 1740707; ONR Award N00014-13-1-0129; AFOSR Award FA9550-13-1-0138, and Subcontracts 3F-30222 and 8F-30039 from Argonne National Laboratory; and Award N660011824020 from the DARPA Lagrange Program.
Article copyright: © Copyright 2020 American Mathematical Society