## Equivalence of approximation by convolutional neural networks and fully-connected networks

- by Philipp Petersen and Felix Voigtlaender
- Proc. Amer. Math. Soc. **148** (2020), 1567–1581
- DOI: https://doi.org/10.1090/proc/14789
- Published electronically: December 6, 2019

## Abstract:

Convolutional neural networks are the most widely used type of neural networks in applications. In mathematical analysis, however, mostly fully-connected networks are studied. In this paper, we establish a connection between both network architectures. Using this connection, we show that all upper and lower bounds concerning approximation rates of fully-connected neural networks for functions $f \in \mathcal {C}$—for an arbitrary function class $\mathcal {C}$—translate to essentially the same bounds concerning approximation rates of *convolutional* neural networks for functions $f \in \mathcal {C}^{\mathrm {equi}}$, with the class $\mathcal {C}^{\mathrm {equi}}$ consisting of all translation equivariant functions whose first coordinate belongs to $\mathcal {C}$. All presented results consider exclusively the case of convolutional neural networks without any pooling operation and with circular convolutions, i.e., not based on zero-padding.
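The structural property the abstract relies on is that circular convolutions commute with cyclic translations. The following minimal NumPy sketch (not taken from the paper; `circular_conv` is an illustrative helper) verifies this equivariance numerically for a single convolutional layer without zero-padding:

```python
import numpy as np

def circular_conv(x, w):
    """Circular (cyclic) convolution of a length-n signal x with a
    length-n filter w; indices wrap around, so no zero-padding occurs."""
    n = len(x)
    return np.array([sum(w[j] * x[(i - j) % n] for j in range(n))
                     for i in range(n)])

rng = np.random.default_rng(0)
x = rng.standard_normal(8)   # input signal
w = rng.standard_normal(8)   # convolution filter
shift = 3

# Translation equivariance: cyclically shifting the input and then
# convolving equals convolving first and shifting the output.
lhs = circular_conv(np.roll(x, shift), w)
rhs = np.roll(circular_conv(x, w), shift)
assert np.allclose(lhs, rhs)
```

Zero-padded convolutions break this identity at the boundary, which is why the paper's results are stated for circular convolutions only.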

## References

- Andrew R. Barron, *Universal approximation bounds for superpositions of a sigmoidal function*, IEEE Trans. Inform. Theory **39** (1993), no. 3, 930–945. MR 1237720, DOI 10.1109/18.256500
- Helmut Bölcskei, Philipp Grohs, Gitta Kutyniok, and Philipp Petersen, *Optimal approximation with sparsely connected deep neural networks*, SIAM J. Math. Data Sci. **1** (2019), no. 1, 8–45. MR 3949699, DOI 10.1137/18M118709X
- N. Cohen, O. Sharir, and A. Shashua, *On the expressive power of deep learning: a tensor analysis*, COLT, 2016, pp. 698–728.
- G. Cybenko, *Approximation by superpositions of a sigmoidal function*, Math. Control Signals Systems **2** (1989), no. 4, 303–314. MR 1015670, DOI 10.1007/BF02551274
- Ian Goodfellow, Yoshua Bengio, and Aaron Courville, *Deep learning*, Adaptive Computation and Machine Learning, MIT Press, Cambridge, MA, 2016. MR 3617773
- K. Hornik, M. Stinchcombe, and H. White, *Multilayer feedforward networks are universal approximators*, Neural Netw. **2** (1989), no. 5, 359–366.
- Y. LeCun, Y. Bengio, and G. Hinton, *Deep learning*, Nature **521** (2015), no. 7553, 436–444.
- Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, *Gradient-based learning applied to document recognition*, Proc. IEEE **86** (1998), no. 11, 2278–2324.
- M. Leshno, V. Y. Lin, A. Pinkus, and S. Schocken, *Multilayer feedforward networks with a nonpolynomial activation function can approximate any function*, Neural Netw. **6** (1993), no. 6, 861–867.
- V. Maiorov and A. Pinkus, *Lower bounds for approximation by MLP neural networks*, Neurocomputing **25** (1999), no. 1-3, 81–91.
- S. Mallat, *Understanding deep convolutional networks*, Philos. Trans. Royal Soc. A **374** (2016), no. 2065, 20150203.
- H. Mhaskar, *Neural networks for optimal approximation of smooth and analytic functions*, Neural Comput. **8** (1996), no. 1, 164–177.
- H. N. Mhaskar, *Approximation properties of a multilayered feedforward artificial neural network*, Adv. Comput. Math. **1** (1993), no. 1, 61–80. MR 1230251, DOI 10.1007/BF02070821
- H. N. Mhaskar and Charles A. Micchelli, *Approximation by superposition of sigmoidal and radial basis functions*, Adv. in Appl. Math. **13** (1992), no. 3, 350–373. MR 1176581, DOI 10.1016/0196-8858(92)90016-P
- P. Petersen and F. Voigtlaender, *Optimal approximation of piecewise smooth functions using deep ReLU neural networks*, Neural Netw. **108** (2018), 296–330. DOI 10.1016/j.neunet.2018.08.019
- Allan Pinkus, *Approximation theory of the MLP model in neural networks*, Acta Numer. **8** (1999), Cambridge Univ. Press, Cambridge, pp. 143–195. MR 1819645, DOI 10.1017/S0962492900002919
- J. Schmidhuber, *Deep learning in neural networks: an overview*, Neural Netw. **61** (2015), 85–117.
- Uri Shaham, Alexander Cloninger, and Ronald R. Coifman, *Provable approximation properties for deep neural networks*, Appl. Comput. Harmon. Anal. **44** (2018), no. 3, 537–557. MR 3768851, DOI 10.1016/j.acha.2016.04.003
- D. Yarotsky, *Error bounds for approximations with deep ReLU networks*, Neural Netw. **94** (2017), 103–114.
- D. Yarotsky, *Universal approximations of invariant maps by neural networks*, arXiv:1804.10306v1, 2018.
- Ding-Xuan Zhou, *Deep distributed convolutional neural networks: universality*, Anal. Appl. (Singap.) **16** (2018), no. 6, 895–919. MR 3871718, DOI 10.1142/S0219530518500124
- D.-X. Zhou, *Universality of deep convolutional neural networks*, arXiv:1805.10769, 2018.

## Bibliographic Information

**Philipp Petersen**
- Affiliation: Mathematical Institute, University of Oxford, Andrew Wiles Building, Oxford, United Kingdom
- MR Author ID: 1101536
- Email: Philipp.Petersen@maths.ox.ac.uk

**Felix Voigtlaender**
- Affiliation: Lehrstuhl Wissenschaftliches Rechnen, Katholische Universität Eichstätt–Ingolstadt, Ostenstraße 26, 85072 Eichstätt, Germany
- MR Author ID: 1107453
- Email: felix@voigtlaender.xyz
- Received by editor(s): September 4, 2018
- Received by editor(s) in revised form: March 5, 2019, and August 5, 2019
- Published electronically: December 6, 2019
- Additional Notes: The first author was supported by a DFG research fellowship. Both authors contributed equally to this work.
- Communicated by: Yuan Xu
- © Copyright 2019 American Mathematical Society
- Journal: Proc. Amer. Math. Soc. **148** (2020), 1567–1581
- MSC (2010): Primary 41A25; Secondary 44A35, 41A46
- DOI: https://doi.org/10.1090/proc/14789
- MathSciNet review: 4069195