## Equivalence of approximation by convolutional neural networks and fully-connected networks

- by Philipp Petersen and Felix Voigtlaender
- Proc. Amer. Math. Soc. **148** (2020), 1567–1581
- DOI: https://doi.org/10.1090/proc/14789
- Published electronically: December 6, 2019

## Abstract:

Convolutional neural networks are the most widely used type of neural networks in applications. In mathematical analysis, however, mostly fully-connected networks are studied. In this paper, we establish a connection between both network architectures. Using this connection, we show that all upper and lower bounds concerning approximation rates of fully-connected neural networks for functions $f \in \mathcal {C}$—for an arbitrary function class $\mathcal {C}$—translate to essentially the same bounds concerning approximation rates of *convolutional* neural networks for functions $f \in \mathcal {C}^{\mathrm {equi}}$, with the class $\mathcal {C}^{\mathrm {equi}}$ consisting of all translation equivariant functions whose first coordinate belongs to $\mathcal {C}$. All presented results consider exclusively the case of convolutional neural networks without any pooling operation and with circular convolutions, i.e., not based on zero-padding.
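The structural property the abstract relies on is that circular convolutions commute with cyclic translations. The following minimal NumPy sketch (not taken from the paper; `circular_conv` is an illustrative helper) verifies this equivariance numerically for a single convolutional layer without zero-padding:

```python
import numpy as np

def circular_conv(x, w):
    """Circular (cyclic) convolution of a length-n signal x with a
    length-n filter w; indices wrap around, so no zero-padding occurs."""
    n = len(x)
    return np.array([sum(w[j] * x[(i - j) % n] for j in range(n))
                     for i in range(n)])

rng = np.random.default_rng(0)
x = rng.standard_normal(8)   # input signal
w = rng.standard_normal(8)   # convolution filter
shift = 3

# Translation equivariance: cyclically shifting the input and then
# convolving equals convolving first and shifting the output.
lhs = circular_conv(np.roll(x, shift), w)
rhs = np.roll(circular_conv(x, w), shift)
assert np.allclose(lhs, rhs)
```

Zero-padded convolutions break this identity at the boundary, which is why the paper's results are stated for circular convolutions only.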

## References

- Andrew R. Barron, *Universal approximation bounds for superpositions of a sigmoidal function*, IEEE Trans. Inform. Theory **39** (1993), no. 3, 930–945. MR 1237720, DOI 10.1109/18.256500
- Helmut Bölcskei, Philipp Grohs, Gitta Kutyniok, and Philipp Petersen, *Optimal approximation with sparsely connected deep neural networks*, SIAM J. Math. Data Sci. **1** (2019), no. 1, 8–45. MR 3949699, DOI 10.1137/18M118709X
- N. Cohen, O. Sharir, and A. Shashua, *On the expressive power of deep learning: a tensor analysis*, COLT, 2016, pp. 698–728.
- G. Cybenko, *Approximation by superpositions of a sigmoidal function*, Math. Control Signals Systems **2** (1989), no. 4, 303–314. MR 1015670, DOI 10.1007/BF02551274
- Ian Goodfellow, Yoshua Bengio, and Aaron Courville, *Deep learning*, Adaptive Computation and Machine Learning, MIT Press, Cambridge, MA, 2016. MR 3617773
- K. Hornik, M. Stinchcombe, and H. White, *Multilayer feedforward networks are universal approximators*, Neural Netw. **2** (1989), no. 5, 359–366.
- Y. LeCun, Y. Bengio, and G. Hinton, *Deep learning*, Nature **521** (2015), no. 7553, 436–444.
- Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, *Gradient-based learning applied to document recognition*, Proc. IEEE **86** (1998), no. 11, 2278–2324.
- M. Leshno, V. Y. Lin, A. Pinkus, and S. Schocken, *Multilayer feedforward networks with a nonpolynomial activation function can approximate any function*, Neural Netw. **6** (1993), no. 6, 861–867.
- V. Maiorov and A. Pinkus, *Lower bounds for approximation by MLP neural networks*, Neurocomputing **25** (1999), no. 1-3, 81–91.
- S. Mallat, *Understanding deep convolutional networks*, Philos. Trans. Royal Soc. A **374** (2016), no. 2065, 20150203.
- H. Mhaskar, *Neural networks for optimal approximation of smooth and analytic functions*, Neural Comput. **8** (1996), no. 1, 164–177.
- H. N. Mhaskar, *Approximation properties of a multilayered feedforward artificial neural network*, Adv. Comput. Math. **1** (1993), no. 1, 61–80. MR 1230251, DOI 10.1007/BF02070821
- H. N. Mhaskar and Charles A. Micchelli, *Approximation by superposition of sigmoidal and radial basis functions*, Adv. in Appl. Math. **13** (1992), no. 3, 350–373. MR 1176581, DOI 10.1016/0196-8858(92)90016-P
- P. Petersen and F. Voigtlaender, *Optimal approximation of piecewise smooth functions using deep ReLU neural networks*, Neural Netw. **108** (2018), 296–330. DOI 10.1016/j.neunet.2018.08.019
- Allan Pinkus, *Approximation theory of the MLP model in neural networks*, Acta Numer. **8** (1999), Cambridge Univ. Press, Cambridge, pp. 143–195. MR 1819645, DOI 10.1017/S0962492900002919
- J. Schmidhuber, *Deep learning in neural networks: an overview*, Neural Netw. **61** (2015), 85–117.
- Uri Shaham, Alexander Cloninger, and Ronald R. Coifman, *Provable approximation properties for deep neural networks*, Appl. Comput. Harmon. Anal. **44** (2018), no. 3, 537–557. MR 3768851, DOI 10.1016/j.acha.2016.04.003
- D. Yarotsky, *Error bounds for approximations with deep ReLU networks*, Neural Netw. **94** (2017), 103–114.
- D. Yarotsky, *Universal approximations of invariant maps by neural networks*, arXiv:1804.10306v1, 2018.
- Ding-Xuan Zhou, *Deep distributed convolutional neural networks: universality*, Anal. Appl. (Singap.) **16** (2018), no. 6, 895–919. MR 3871718, DOI 10.1142/S0219530518500124
- D.-X. Zhou, *Universality of deep convolutional neural networks*, arXiv:1805.10769, 2018.

## Bibliographic Information

**Philipp Petersen**
- Affiliation: Mathematical Institute, University of Oxford, Andrew Wiles Building, Oxford, United Kingdom
- MR Author ID: 1101536
- Email: Philipp.Petersen@maths.ox.ac.uk

**Felix Voigtlaender**
- Affiliation: Lehrstuhl Wissenschaftliches Rechnen, Katholische Universität Eichstätt–Ingolstadt, Ostenstraße 26, 85072 Eichstätt, Germany
- MR Author ID: 1107453
- Email: felix@voigtlaender.xyz
- Received by editor(s): September 4, 2018
- Received by editor(s) in revised form: March 5, 2019, and August 5, 2019
- Published electronically: December 6, 2019
- Additional Notes: The first author was supported by a DFG research fellowship. Both authors contributed equally to this work.
- Communicated by: Yuan Xu
- © Copyright 2019 American Mathematical Society
- Journal: Proc. Amer. Math. Soc. **148** (2020), 1567–1581
- MSC (2010): Primary 41A25; Secondary 44A35, 41A46
- DOI: https://doi.org/10.1090/proc/14789
- MathSciNet review: 4069195