Ridge Functions and Applications in Neural Networks
About this Title
Vugar E. Ismailov, Azerbaijan National Academy of Sciences, Baku, Azerbaijan
Publication: Mathematical Surveys and Monographs
Publication Year: 2021; Volume 263
ISBNs: 978-1-4704-6765-4 (print); 978-1-4704-6800-2 (online)
DOI: https://doi.org/10.1090/surv/263
Table of Contents
Front/Back Matter
Chapters
- Introduction
- Properties of linear combinations of ridge functions
- The smoothness problem in ridge function representation
- Approximation of multivariate functions by sums of univariate functions
- Generalized ridge functions and linear superpositions
- Applications to neural networks
References
- J. Aczél, Lectures on functional equations and their applications, Mathematics in Science and Engineering, Vol. 19, Academic Press, New York-London, 1966. Translated by Scripta Technica, Inc. Supplemented by the author. Edited by Hansjorg Oser. MR 0208210
- Rashid A. Aliev, Aysel A. Asgarova, and Vugar E. Ismailov, A note on continuous sums of ridge functions, J. Approx. Theory 237 (2019), 210–221. MR 3868633, DOI 10.1016/j.jat.2018.09.006
- Rashid A. Aliev and Vugar E. Ismailov, On a smoothness problem in ridge function representation, Adv. in Appl. Math. 73 (2016), 154–169. MR 3433504, DOI 10.1016/j.aam.2015.11.002
- Aliev R.A., Ismailov V.E., On the representation by bivariate ridge functions, arXiv:1606.07940.
- Rashid A. Aliev and Vugar E. Ismailov, A representation problem for smooth sums of ridge functions, J. Approx. Theory 257 (2020), 105448, 13. MR 4109084, DOI 10.1016/j.jat.2020.105448
- J. M. Almira, P. E. Lopez-de-Teruel, D. J. Romero-López, and F. Voigtlaender, Negative results for approximation using single layer and multilayer feedforward neural networks, J. Math. Anal. Appl. 494 (2021), no. 1, Paper No. 124584, 11. MR 4151572, DOI 10.1016/j.jmaa.2020.124584
- George A. Anastassiou, Intelligent systems: approximation by artificial neural networks, Intelligent Systems Reference Library, vol. 19, Springer-Verlag, Berlin, 2011. MR 2895407, DOI 10.1007/978-3-642-21431-8
- V. I. Arnol′d, On functions of three variables, Dokl. Akad. Nauk SSSR 114 (1957), 679–681 (Russian). MR 0111808
- Aida Kh. Asgarova and Vugar E. Ismailov, Diliberto-Straus algorithm for the uniform approximation by a sum of two algebras, Proc. Indian Acad. Sci. Math. Sci. 127 (2017), no. 2, 361–374. MR 3647158, DOI 10.1007/s12044-017-0337-4
- Georg Aumann, Über approximative Nomographie. II, Bayer. Akad. Wiss. Math.-Nat. Kl. S.-B. 1959 (1959), 103–109 (1960) (German). MR 0116173
- M.-B. A. Babaev, The approximation of polynomials of two variables by functions of the form $\varphi (x)+\psi (y)$, Dokl. Akad. Nauk SSSR 193 (1970), 967–969 (Russian). MR 0280915
- M.-B. A. Babaev, Sharp estimates for the approximation of functions of several variables by sums of functions of a lesser number of variables, Mat. Zametki 12 (1972), 105–114 (Russian). MR 326243
- M.-B. A. Babaev, Extremal elements and the value of best approximation of a monotone function in $\textbf {R}^{n}$ by sums of functions of the least number of variables, Dokl. Akad. Nauk SSSR 265 (1982), no. 1, 11–13 (Russian). MR 671630
- M.-B. A. Babaev and V. E. Ismailov, Two-sided estimates for the best approximation in domains different from the parallelepiped, Funct. Approx. Comment. Math. 25 (1997), 121–128. Dedicated to Roman Taberski on the occasion of his 70th birthday. MR 1602374
- Randolph E. Bank, An automatic scaling procedure for a D′jakanov-Gunn iteration scheme, Linear Algebra Appl. 28 (1979), 17–33. MR 549415, DOI 10.1016/0024-3795(79)90114-9
- Andrew R. Barron, Universal approximation bounds for superpositions of a sigmoidal function, IEEE Trans. Inform. Theory 39 (1993), no. 3, 930–945. MR 1237720, DOI 10.1109/18.256500
- Helmut Bölcskei, Philipp Grohs, Gitta Kutyniok, and Philipp Petersen, Optimal approximation with sparsely connected deep neural networks, SIAM J. Math. Data Sci. 1 (2019), no. 1, 8–45. MR 3949699, DOI 10.1137/18M118709X
- Dietrich Braess and Allan Pinkus, Interpolation by ridge functions, J. Approx. Theory 73 (1993), no. 2, 218–236. MR 1216487, DOI 10.1006/jath.1993.1039
- R. C. Buck, On approximation theory and functional equations, J. Approximation Theory 5 (1972), 228–237. MR 377363, DOI 10.1016/0021-9045(72)90016-0
- Martin D. Buhmann and Allan Pinkus, Identifying linear combinations of ridge functions, Adv. in Appl. Math. 22 (1999), no. 1, 103–118. MR 1657745, DOI 10.1006/aama.1998.0623
- N. G. de Bruijn, Functions whose differences belong to a given class, Nieuw Arch. Wiskunde (2) 23 (1951), 194–218. MR 0043870
- N. G. de Bruijn, A difference property for Riemann integrable functions and for some similar classes of functions, Nederl. Akad. Wetensch. Proc. Ser. A. 55 = Indagationes Math. 14 (1952), 145–151. MR 0047120
- Neil Calkin and Herbert S. Wilf, Recounting the rationals, Amer. Math. Monthly 107 (2000), no. 4, 360–363. MR 1763062, DOI 10.2307/2589182
- Emmanuel J. Candès, Ridgelets: estimating with ridge functions, Ann. Statist. 31 (2003), no. 5, 1561–1599. MR 2012826, DOI 10.1214/aos/1065705119
- Candès E.J., Ridgelets: theory and applications. Ph.D. Thesis, Technical Report, Department of Statistics, Stanford University.
- Cao F., Lin S., Xu Z., Approximation capability of interpolation neural networks, Neurocomputing 74 (2010), 457-460.
- Cao F., Xie T., The construction and approximation for feedforword neural networks with fixed weights, Proceedings of the ninth international conference on machine learning and cybernetics, Qingdao, 2010, pp. 3164-3168.
- Carroll S.M., Dickinson B.W., Construction of neural nets using the Radon transform, in Proceedings of the IEEE 1989 International Joint Conference on Neural Networks, 1989, Vol. 1, IEEE, New York, 607-611.
- Chen T., Chen H., Approximation of continuous functionals by neural networks with application to dynamic systems, IEEE Trans. Neural Networks 4 (1993), 910-918.
- Chen T., Chen H., and Liu R., A constructive proof of Cybenko’s approximation theorem and its extensions, in Computing Science and Statistics, Springer-Verlag, 1992, 163-168.
- Cheridito P., Jentzen A., Rossmannek F., Efficient approximation of high-dimensional functions with deep neural networks, arXiv:1912.04310.
- Charles K. Chui and Xin Li, Approximation by ridge functions and neural networks with one hidden layer, J. Approx. Theory 70 (1992), no. 2, 131–141. MR 1172015, DOI 10.1016/0021-9045(92)90081-X
- C. K. Chui, Xin Li, and H. N. Mhaskar, Limitations of the approximation capabilities of neural networks with one hidden layer, Adv. Comput. Math. 5 (1996), no. 2-3, 233–243. MR 1399382, DOI 10.1007/BF02124745
- Z. Ciesielski, Some properties of convex functions of higher orders, Ann. Polon. Math. 7 (1959), 1–7. MR 109202, DOI 10.4064/ap-7-1-1-7
- Albert Cohen, Ingrid Daubechies, Ronald DeVore, Gerard Kerkyacharian, and Dominique Picard, Capturing ridge functions in high dimensions from point queries, Constr. Approx. 35 (2012), no. 2, 225–243. MR 2891227, DOI 10.1007/s00365-011-9147-6
- Constantine P.G., del Rosario Z., Iaccarino G., Many physical laws are ridge functions, arXiv:1605.07974.
- Constantine P.G., del Rosario Z., Iaccarino G., Data-driven dimensional analysis: algorithms for unique and relevant dimensionless groups, arXiv:1708.04303.
- Danilo Costarelli and Renato Spigler, Constructive approximation by superposition of sigmoidal functions, Anal. Theory Appl. 29 (2013), no. 2, 169–196. MR 3109891, DOI 10.4208/ata.2013.v29.n2.8
- Costarelli D., Spigler R., Approximation results for neural network operators activated by sigmoidal functions, Neural Networks 44 (2013), 101-106.
- Neil E. Cotter, The Stone-Weierstrass theorem and its application to neural networks, IEEE Trans. Neural Networks 1 (1990), no. 4, 290–295. MR 1083640, DOI 10.1109/72.80265
- G. Cybenko, Approximation by superpositions of a sigmoidal function, Math. Control Signals Systems 2 (1989), no. 4, 303–314. MR 1015670, DOI 10.1007/BF02551274
- Dahmen W., Micchelli C.A., Some remarks on ridge functions, Approx. Theory Appl. 3 (1987), 139-143.
- Stephen Demko, A superposition theorem for bounded continuous functions, Proc. Amer. Math. Soc. 66 (1977), no. 1, 75–78. MR 457651, DOI 10.1090/S0002-9939-1977-0457651-5
- Frank Deutsch, The alternating method of von Neumann, Multivariate approximation theory (Proc. Conf., Math. Res. Inst., Oberwolfach, 1979) Internat. Ser. Numer. Math., vol. 51, Birkhäuser, Basel-Boston, Mass., 1979, pp. 83–96. MR 560665
- Ronald A. DeVore, Konstantin I. Oskolkov, and Pencho P. Petrushev, Approximation by feed-forward neural networks, Ann. Numer. Math. 4 (1997), no. 1-4, 261–287. The heritage of P. L. Chebyshev: a Festschrift in honor of the 70th birthday of T. J. Rivlin. MR 1422683
- Persi Diaconis and Mehrdad Shahshahani, On nonlinear functions of linear combinations, SIAM J. Sci. Statist. Comput. 5 (1984), no. 1, 175–191. MR 731890, DOI 10.1137/0905013
- S. P. Diliberto and E. G. Straus, On the approximation of a function of several variables by the sum of functions of fewer variables, Pacific J. Math. 1 (1951), 195–210. MR 43882
- D. Ž. Djoković, A representation theorem for $(X_{1}-1)(X_{2} -1)\cdots (X_{n}-1)$ and its applications, Ann. Polon. Math. 22 (1969/70), 189–198. MR 265798, DOI 10.4064/ap-22-2-189-198
- Benjamin Doerr and Sebastian Mayer, The recovery of ridge functions on the hypercube suffers from the curse of dimensionality, J. Complexity 63 (2021), Paper No. 101521, 29. MR 4204470, DOI 10.1016/j.jco.2020.101521
- David L. Donoho and Iain M. Johnstone, Projection-based approximation and a duality with kernel methods, Ann. Statist. 17 (1989), no. 1, 58–106. MR 981438, DOI 10.1214/aos/1176347004
- Nira Dyn, W. A. Light, and E. W. Cheney, Interpolation by piecewise-linear radial basis functions. I, J. Approx. Theory 59 (1989), no. 2, 202–223. MR 1022117, DOI 10.1016/0021-9045(89)90152-4
- Leopold Flatto, The approximation of certain functions of several variables by sums of functions of fewer variables, Amer. Math. Monthly 73 (1966), no. 4, 131–132. MR 201888, DOI 10.2307/2313764
- Massimo Fornasier, Karin Schnass, and Jan Vybíral, Learning functions of few arbitrary linear parameters in high dimensions, Found. Comput. Math. 12 (2012), no. 2, 229–262. MR 2898783, DOI 10.1007/s10208-012-9115-y
- B. L. Fridman, An improvement in the smoothness of the functions in A. N. Kolmogorov’s theorem on superpositions, Dokl. Akad. Nauk SSSR 177 (1967), 1019–1022 (Russian). MR 0225066
- Friedman J.H., Tukey J.W., A Projection Pursuit Algorithm for Exploratory Data Analysis, IEEE Transactions on Computers C-23 (1974), 881-890.
- Jerome H. Friedman and Werner Stuetzle, Projection pursuit regression, J. Amer. Statist. Assoc. 76 (1981), no. 376, 817–823. MR 650892
- Funahashi K., On the approximate realization of continuous mapping by neural networks, Neural Networks 2 (1989), 183-192.
- Zbigniew Gajda, Difference properties of higher orders for continuity and Riemann integrability, Colloq. Math. 53 (1987), no. 2, 275–288. MR 924073, DOI 10.4064/cm-53-2-275-288
- Gallant A.R., White H., There exists a neural network that does not make avoidable mistakes, in Proceedings of the IEEE 1988 International Conference on Neural Networks, 1988, vol. 1, IEEE, New York, pp. 657-664.
- A. L. Garkavi, V. A. Medvedev, and S. Ya. Khavinson, On the existence of a best uniform approximation of a function in two variables by sums $\phi (x)+\psi (y)$, Sibirsk. Mat. Zh. 36 (1995), no. 4, 819–827, ii (Russian, with Russian summary); English transl., Siberian Math. J. 36 (1995), no. 4, 707–713. MR 1367249, DOI 10.1007/BF02107327
- A. L. Garkavi, V. A. Medvedev, and S. Ya. Khavinson, On the existence of a best uniform approximation of a function of several variables by the sum of functions of fewer variables, Mat. Sb. 187 (1996), no. 5, 3–14 (Russian, with Russian summary); English transl., Sb. Math. 187 (1996), no. 5, 623–634. MR 1400350, DOI 10.1070/SM1996v187n05ABEH000125
- Andrew Glaws, Paul G. Constantine, and R. Dennis Cook, Inverse regression for ridge recovery: a data-driven approach for parameter reduction in computer experiments, Stat. Comput. 30 (2020), no. 2, 237–253. MR 4064620, DOI 10.1007/s11222-019-09876-y
- M. von Golitschek and W. A. Light, Approximation by solutions of the planar wave equation, SIAM J. Numer. Anal. 29 (1992), no. 3, 816–830. MR 1163358, DOI 10.1137/0729050
- Michael Golomb, Approximation by functions of fewer variables, On numerical approximation. Proceedings of a Symposium, Madison, April 21-23, 1958, Publication of the Mathematics Research Center, U.S. Army, the University of Wisconsin, no. 1, University of Wisconsin Press, Madison, Wis., 1959, pp. 275–327. Edited by R. E. Langer. MR 0102168
- Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep learning, Adaptive Computation and Machine Learning, MIT Press, Cambridge, MA, 2016. MR 3617773
- Y. Gordon, V. Maiorov, M. Meyer, and S. Reisner, On the best approximation by ridge functions in the uniform norm, Constr. Approx. 18 (2002), no. 1, 61–85. MR 1866380, DOI 10.1007/s00365-001-0009-5
- Namig J. Guliyev and Vugar E. Ismailov, A single hidden layer feedforward network with only one neuron in the hidden layer can approximate any univariate function, Neural Comput. 28 (2016), no. 7, 1289–1304. MR 3867104, DOI 10.1162/neco_a_00849
- Guliyev N.J., Ismailov V.E., On the approximation by single hidden layer feedforward neural networks with fixed weights, Neural Networks 98 (2018), 296-304.
- Guliyev N.J., Ismailov V.E., Approximation capability of two hidden layer feedforward neural networks with fixed weights, Neurocomputing 316 (2018), 262-269.
- Nahmwoo Hahm and Bum Il Hong, An approximation by neural networks with a fixed weight, Comput. Math. Appl. 47 (2004), no. 12, 1897–1903. MR 2086107, DOI 10.1016/j.camwa.2003.06.008
- Israel Halperin, The product of projection operators, Acta Sci. Math. (Szeged) 23 (1962), 96–99. MR 141978
- Hecht-Nielsen R., Kolmogorov’s mapping neural network existence theorem, in Proc. 1987 IEEE Int. Conf. on Neural Networks, IEEE Press, New York, 1987, vol. 3, 11-14.
- Hornik K., Approximation capabilities of multilayer feedforward networks, Neural Networks 4 (1991), 251-257.
- Hornik K., Stinchcombe M., White H., Multilayer feedforward networks are universal approximators, Neural Networks 2 (1989), 359-366.
- Peter J. Huber, Projection pursuit, Ann. Statist. 13 (1985), no. 2, 435–525. With discussion. MR 790553, DOI 10.1214/aos/1176349519
- V. E. Ismailov, Theorem on lightning bolts for elementary domains, Proc. Inst. Math. Mech. Natl. Acad. Sci. Azerb. 17 (2002), 78–85, 205 (English, with English and Azerbaijani summaries). Applied problems of mathematics and mechanics (Baku, 2002). MR 2027426
- Vugar E. Ismailov, On some classes of bivariate functions characterized by formulas for the best approximation, Rad. Mat. 13 (2004), no. 1, 53–62 (English, with English and Serbo-Croatian summaries). MR 2111677
- Vugar E. Ismailov, On error formulas for approximation by sums of univariate functions, Int. J. Math. Math. Sci. (2006), Art. ID 65620, 11. MR 2219166, DOI 10.1155/IJMMS/2006/65620
- V. È. Ismailov, On methods for computing the exact value of the best approximation by sums of functions of one variable, Sibirsk. Mat. Zh. 47 (2006), no. 5, 1076–1082 (Russian, with Russian summary); English transl., Siberian Math. J. 47 (2006), no. 5, 883–888. MR 2266517, DOI 10.1007/s11202-006-0097-3
- Vugar E. Ismailov, On the approximation by compositions of fixed multivariate functions with univariate functions, Studia Math. 183 (2007), no. 2, 117–126. MR 2353881, DOI 10.4064/sm183-2-2
- Vugar E. Ismailov, A note on the best $L_2$ approximation by ridge functions, Appl. Math. E-Notes 7 (2007), 71–76. MR 2295689
- Vugar E. Ismailov, Representation of multivariate functions by sums of ridge functions, J. Math. Anal. Appl. 331 (2007), no. 1, 184–190. MR 2305997, DOI 10.1016/j.jmaa.2006.08.076
- Vugar E. Ismailov, Characterization of an extremal sum of ridge functions, J. Comput. Appl. Math. 205 (2007), no. 1, 105–115. MR 2324828, DOI 10.1016/j.cam.2006.04.043
- Vugar E. Ismailov, On the representation by linear superpositions, J. Approx. Theory 151 (2008), no. 2, 113–125. MR 2407861, DOI 10.1016/j.jat.2007.09.003
- Vugar E. Ismailov, On the approximation by weighted ridge functions, An. Univ. Vest Timiş. Ser. Mat.-Inform. 46 (2008), no. 1, 75–83. MR 2791467
- Vugar E. Ismailov, On the proximinality of ridge functions, Sarajevo J. Math. 5(17) (2009), no. 1, 109–118. MR 2527153
- Vugar E. Ismailov, On the theorem of M Golomb, Proc. Indian Acad. Sci. Math. Sci. 119 (2009), no. 1, 45–52. MR 2508488, DOI 10.1007/s12044-009-0005-4
- V. E. Ismailov, Approximation capabilities of neural networks with weights from two directions, Azerb. J. Math. 1 (2011), no. 1, 122–128. MR 2776103
- Vugar E. Ismailov, Approximation by neural networks with weights varying on a finite set of directions, J. Math. Anal. Appl. 389 (2012), no. 1, 72–83. MR 2876482, DOI 10.1016/j.jmaa.2011.11.037
- Vugar E. Ismailov, A note on the representation of continuous functions by linear superpositions, Expo. Math. 30 (2012), no. 1, 96–101. MR 2899658, DOI 10.1016/j.exmath.2011.07.005
- Vugar E. Ismailov, A review of some results on ridge function approximation, Azerb. J. Math. 3 (2013), no. 1, 3–51. MR 3032796
- Vugar E. Ismailov, On the approximation by neural networks with bounded number of neurons in hidden layers, J. Math. Anal. Appl. 417 (2014), no. 2, 963–969. MR 3194523, DOI 10.1016/j.jmaa.2014.03.092
- Vugar E. Ismailov, Alternating algorithm for the approximation by sums of two compositions and ridge functions, Proc. Inst. Math. Mech. Natl. Acad. Sci. Azerb. 41 (2015), no. 1, 146–152. MR 3465724
- V. E. Ismailov, On the uniqueness of representation by linear superpositions, Ukraïn. Mat. Zh. 68 (2016), no. 12, 1620–1628 (English, with English and Ukrainian summaries); English transl., Ukrainian Math. J. 68 (2017), no. 12, 1874–1883. MR 3592387, DOI 10.1007/s11253-017-1335-5
- V. È. Ismailov, Approximation by sums of ridge functions with fixed directions, Algebra i Analiz 28 (2016), no. 6, 20–69 (Russian, with Russian summary); English transl., St. Petersburg Math. J. 28 (2017), no. 6, 741–772. MR 3637575, DOI 10.1090/spmj/1471
- Vugar E. Ismailov, A note on the equioscillation theorem for best ridge function approximation, Expo. Math. 35 (2017), no. 3, 343–349. MR 3689906, DOI 10.1016/j.exmath.2017.05.003
- Vugar E. Ismailov, Computing the approximation error for neural networks with weights varying on fixed directions, Numer. Funct. Anal. Optim. 40 (2019), no. 12, 1395–1409. MR 3953211, DOI 10.1080/01630563.2019.1605523
- V. E. Ismailov and A. Pinkus, Interpolation on lines by ridge functions, J. Approx. Theory 175 (2013), 91–113. MR 3101062, DOI 10.1016/j.jat.2013.07.010
- Vugar E. Ismailov and Ekrem Savas, Measure theoretic results for approximation by neural networks with limited weights, Numer. Funct. Anal. Optim. 38 (2017), no. 7, 819–830. MR 3654363, DOI 10.1080/01630563.2016.1254654
- Irie B., Miyake S., Capability of three-layered perceptrons, in Proceedings of the IEEE 1988 Int. Conf. on Neural Networks, 1988, Vol. 1, IEEE, New York, 641-648.
- Ito Y., Representation of functions by superpositions of a step or sigmoid function and their applications to neural network theory, Neural Networks 4 (1991), 385-394.
- Ito Y., Approximation of functions on a compact set by finite sums of a sigmoid function without scaling, Neural Networks 4 (1991), no. 6, 817-826.
- Ito Y., Approximation of continuous functions on $\mathbb {R}^{d}$ by linear combinations of shifted rotations of a sigmoid function with and without scaling, Neural Networks 5 (1992), 105-115.
- Jones L.K., Constructive approximations for neural networks by sigmoidal functions, Proc. IEEE 78 (1990), 1586-1589. Correction and addition, Proc. IEEE 79 (1991), 243.
- Fritz John, Plane waves and spherical means applied to partial differential equations, Interscience Publishers, New York-London, 1955. MR 0075429
- Palle E. T. Jorgensen and James F. Tian, Superposition, reduction of multivariable problems, and approximation, Anal. Appl. (Singap.) 18 (2020), no. 5, 771–801. MR 4131038, DOI 10.1142/S021953051941001X
- Paul C. Kainen and Věra Kůrková, An integral upper bound for neural network approximation, Neural Comput. 21 (2009), no. 10, 2970–2989. MR 2567918, DOI 10.1162/neco.2009.04-08-745
- Kainen P.C., Kůrková V., Vogt A., Best approximation by Heaviside perceptron networks, Neural Networks 13 (2000), no. 7, 695-697.
- I. G. Kazantsev, Tomographic reconstruction from arbitrary directions using ridge functions, Inverse Problems 14 (1998), no. 3, 635–645. MR 1630007, DOI 10.1088/0266-5611/14/3/014
- Kazantsev I., Tomographic reconstruction using ridge functions, Proceedings of 1st World Congress on Industrial Process Tomography, Buxton, Derbyshire, UK, April 14-17, 1999, pp. 433-437.
- Kazantsev I., Lemahieu I., Reconstruction of elongated structures using ridge functions and natural pixels, Inverse Problems 16 (2000), 505-517.
- Sandra Keiper, Approximation of generalized ridge functions in high dimensions, J. Approx. Theory 245 (2019), 101–129. MR 3953358, DOI 10.1016/j.jat.2019.04.006
- C. T. Kelley, A note on the approximation of functions of several variables by sums of functions of one variable, J. Approx. Theory 33 (1981), no. 3, 179–189. MR 647845, DOI 10.1016/0021-9045(81)90068-X
- S. Ja. Havinson, A Čebyšev theorem for the approximation of a function of two variables by sums $\phi (x)+\psi (y)$, Izv. Akad. Nauk SSSR Ser. Mat. 33 (1969), 650–666 (Russian). MR 0262746
- S. Ya. Khavinson, Representation of functions of two variables by the sums $\phi (x)+\psi (y)$, Izv. Vyssh. Uchebn. Zaved. Mat. 2 (1985), 66–73, 87 (Russian). MR 788620
- S. Ya. Khavinson, Some approximation properties of linear superpositions, Izv. Vyssh. Uchebn. Zaved. Mat. 8 (1995), 63–73 (Russian); English transl., Russian Math. (Iz. VUZ) 39 (1995), no. 8, 60–70. MR 1391343
- S. Ya. Khavinson, Best approximation by linear superpositions (approximate nomography), Translations of Mathematical Monographs, vol. 159, American Mathematical Society, Providence, RI, 1997. Translated from the Russian manuscript by D. Khavinson. MR 1421322, DOI 10.1090/mmono/159
- A. Kłopotowski and M. G. Nadkarni, Shift invariant measures and simple spectrum, Colloq. Math. 84/85 (2000), part 2, 385–394. Dedicated to the memory of Anzelm Iwanik. MR 1784203, DOI 10.4064/cm-84/85-2-385-394
- A. Kłopotowski, M. G. Nadkarni, and K. P. S. Bhaskara Rao, When is $f(x_1,x_2,\dots ,x_n)=u_1(x_1)+u_2(x_2)+\dots +u_n(x_n)$?, Proc. Indian Acad. Sci. Math. Sci. 113 (2003), no. 1, 77–86. Functional analysis (Kolkata, 2001). MR 1971557, DOI 10.1007/BF02829681
- A. Kłopotowski, M. G. Nadkarni, and K. P. S. Bhaskara Rao, Geometry of good sets in $n$-fold Cartesian product, Proc. Indian Acad. Sci. Math. Sci. 114 (2004), no. 2, 181–197. MR 2062398, DOI 10.1007/BF02829852
- A. N. Kolmogorov, On certain asymptotic characteristics of completely bounded metric spaces, Dokl. Akad. Nauk SSSR (N.S.) 108 (1956), 385–388 (Russian). MR 0080904
- A. N. Kolmogorov, On the representation of continuous functions of many variables by superposition of continuous functions of one variable and addition, Amer. Math. Soc. Transl. (2) 28 (1963), 55–59. MR 0153799, DOI 10.1090/trans2/028/04
- A. N. Kolmogorov and V. M. Tihomirov, $\varepsilon$-entropy and $\varepsilon$-capacity of sets in functional space, Amer. Math. Soc. Transl. (2) 17 (1961), 277–364. MR 0124720
- I. M. Kolodiĭ and F. Hil′debrand, Certain properties of the modulus of continuity, Mat. Zametki 9 (1971), 495–500 (Russian). MR 281850
- S. V. Konyagin and A. A. Kuleshov, On the continuity of finite sums of ridge functions, Mat. Zametki 98 (2015), no. 2, 308–309 (Russian); English transl., Math. Notes 98 (2015), no. 1-2, 336–338. MR 3438485, DOI 10.4213/mzm10787
- S. V. Konyagin and A. A. Kuleshov, On some properties of finite sums of ridge functions defined on convex subsets of $\mathbb {R}^n$, Tr. Mat. Inst. Steklova 293 (2016), no. Funktsional. Prostranstva, Teor. Priblizh., Smezhnye Razdely Mat. Anal., 193–200 (Russian, with Russian summary); English transl., Proc. Steklov Inst. Math. 293 (2016), no. 1, 186–193. MR 3628479, DOI 10.1134/S0371968516020138
- S. V. Konyagin, A. A. Kuleshov, and V. E. Maĭorov, Some problems in the theory of ridge functions, Tr. Mat. Inst. Steklova 301 (2018), no. Kompleksnyĭ Analiz, Matematicheskaya Fizika i Prilozheniya, 155–181 (Russian, with Russian summary); English transl., Proc. Steklov Inst. Math. 301 (2018), no. 1, 144–169. MR 3841666, DOI 10.1134/S0371968518020127
- A. Kroó, On approximation by ridge functions, Constr. Approx. 13 (1997), no. 4, 447–460. MR 1466061, DOI 10.1007/s003659900053
- Marek Kuczma, An introduction to the theory of functional equations and inequalities, 2nd ed., Birkhäuser Verlag, Basel, 2009. Cauchy’s equation and Jensen’s inequality; Edited and with a preface by Attila Gilányi. MR 2467621, DOI 10.1007/978-3-7643-8749-5
- A. A. Kuleshov, On some properties of smooth sums of ridge functions, Tr. Mat. Inst. Steklova 294 (2016), no. Sovremennye Problemy Matematiki, Mekhaniki i Matematicheskoĭ Fiziki. II, 99–104 (Russian, with Russian summary); English transl., Proc. Steklov Inst. Math. 294 (2016), no. 1, 89–94. MR 3628495, DOI 10.1134/S0371968516030067
- A. A. Kuleshov, Continuous sums of ridge functions on a convex body and the class VMO, Mat. Zametki 102 (2017), no. 6, 866–873 (Russian, with Russian summary); English transl., Math. Notes 102 (2017), no. 5-6, 799–805. MR 3733330, DOI 10.4213/mzm11568
- Svetozar Kurepa, A property of a set of positive measure and its application, J. Math. Soc. Japan 13 (1961), 13–19. MR 132141, DOI 10.2969/jmsj/01310013
- Kůrková V., Kolmogorov’s theorem is relevant, Neural Comput. 3 (1991), 617-622.
- Kůrková V., Kolmogorov’s theorem and multilayer neural networks, Neural Networks 5 (1992), 501-506.
- Leshno M., Lin V.Ya., Pinkus A., Schocken S., Multilayer feedforward networks with a non-polynomial activation function can approximate any function, Neural Networks 6 (1993), 861-867.
- Will Light, Ridge functions, sigmoidal functions and neural networks, Approximation theory VII (Austin, TX, 1992), Academic Press, Boston, MA, 1993, pp. 163–206. MR 1212573
- W. A. Light and E. W. Cheney, On the approximation of a bivariate function by the sum of univariate functions, J. Approx. Theory 29 (1980), no. 4, 305–322. MR 598725, DOI 10.1016/0021-9045(80)90119-7
- W. A. Light and E. W. Cheney, Approximation theory in tensor product spaces, Lecture Notes in Mathematics, vol. 1169, Springer-Verlag, Berlin, 1985. MR 817984, DOI 10.1007/BFb0075391
- Lin S., Limitations of shallow nets approximation, Neural Networks 94 (2017), 96-102.
- Vladimir Ya. Lin and Allan Pinkus, Fundamentality of ridge functions, J. Approx. Theory 75 (1993), no. 3, 295–311. MR 1250542, DOI 10.1006/jath.1993.1104
- B. F. Logan and L. A. Shepp, Optimal reconstruction of a function from its projections, Duke Math. J. 42 (1975), no. 4, 645–659. MR 397240
- G. G. Lorentz, Metric entropy, widths, and superpositions of functions, Amer. Math. Monthly 69 (1962), 469–485. MR 141926, DOI 10.2307/2311185
- V. E. Maiorov, On best approximation by ridge functions, J. Approx. Theory 99 (1999), no. 1, 68–94. MR 1696577, DOI 10.1006/jath.1998.3304
- Vitaly Maiorov, Geometric properties of the ridge function manifold, Adv. Comput. Math. 32 (2010), no. 2, 239–253. MR 2581237, DOI 10.1007/s10444-008-9106-3
- V. E. Maiorov and R. Meir, On the near optimality of the stochastic approximation of smooth functions by neural networks, Adv. Comput. Math. 13 (2000), no. 1, 79–103. MR 1759189, DOI 10.1023/A:1018993908478
- Vitaly Maiorov, Ron Meir, and Joel Ratsaby, On the approximation of functional classes equipped with a uniform measure using ridge functions, J. Approx. Theory 99 (1999), no. 1, 95–111. MR 1696573, DOI 10.1006/jath.1998.3305
- Maiorov V., Pinkus A., Lower bounds for approximation by MLP neural networks, Neurocomputing 25 (1999), 81-91.
- Y. Makovoz, Uniform approximation by neural networks, J. Approx. Theory 95 (1998), no. 2, 215–228. MR 1652888, DOI 10.1006/jath.1997.3217
- Robert B. Marr, On the reconstruction of a function on a circular domain from a sampling of its line integrals, J. Math. Anal. Appl. 45 (1974), 357–374. MR 336156, DOI 10.1016/0022-247X(74)90078-X
- Donald E. Marshall and Anthony G. O’Farrell, Uniform approximation by real functions, Fund. Math. 104 (1979), no. 3, 203–211. MR 559174, DOI 10.4064/fm-104-3-203-211
- Donald E. Marshall and Anthony G. O’Farrell, Approximation by a sum of two algebras. The lightning bolt principle, J. Funct. Anal. 52 (1983), no. 3, 353–368. MR 712586, DOI 10.1016/0022-1236(83)90074-5
- Sebastian Mayer, Tino Ullrich, and Jan Vybíral, Entropy and sampling numbers of classes of ridge functions, Constr. Approx. 42 (2015), no. 2, 231–264. MR 3392489, DOI 10.1007/s00365-014-9267-x
- Mazur S., Orlicz W., Grundlegende Eigenschaften der polynomischen Operationen I., II., Studia Math. 5 (1934), 50-68, 179-189.
- M. A. McKiernan, On vanishing $n\textrm {th}$ ordered differences and Hamel bases, Ann. Polon. Math. 19 (1967), 331–336. MR 221131, DOI 10.4064/ap-19-3-331-336
- V. A. Medvedev, Refutation of a theorem of Diliberto and Straus, Mat. Zametki 51 (1992), no. 4, 78–80, 142 (Russian); English transl., Math. Notes 51 (1992), no. 3-4, 380–381. MR 1172469, DOI 10.1007/BF01250549
- V. A. Medvedev, On the sum of two closed algebras of continuous functions on a compact space, Funktsional. Anal. i Prilozhen. 27 (1993), no. 1, 33–36 (Russian); English transl., Funct. Anal. Appl. 27 (1993), no. 1, 28–30. MR 1225908, DOI 10.1007/BF01768665
- H. N. Mhaskar, On the tractability of multivariate integration and approximation by neural networks, J. Complexity 20 (2004), no. 4, 561–590. MR 2068159, DOI 10.1016/j.jco.2003.11.004
- H. N. Mhaskar and Charles A. Micchelli, Approximation by superposition of sigmoidal and radial basis functions, Adv. in Appl. Math. 13 (1992), no. 3, 350–373. MR 1176581, DOI 10.1016/0196-8858(92)90016-P
- Montanelli H., Yang H., Error bounds for deep ReLU networks using the Kolmogorov–Arnold superposition theorem, Neural Networks 129 (2020), 1-6.
- F. Natterer, The mathematics of computerized tomography, B. G. Teubner, Stuttgart; John Wiley & Sons, Ltd., Chichester, 1986. MR 856916
- K. Gowri Navada, Some remarks on good sets, Proc. Indian Acad. Sci. Math. Sci. 114 (2004), no. 4, 389–397. MR 2067701, DOI 10.1007/BF02829443
- John von Neumann, Functional Operators. II. The Geometry of Orthogonal Spaces, Annals of Mathematics Studies, No. 22, Princeton University Press, Princeton, N. J., 1950. MR 0034514
- Erich Novak and Henryk Woźniakowski, Approximation of infinitely differentiable multivariate functions is intractable, J. Complexity 25 (2009), no. 4, 398–404. MR 2542039, DOI 10.1016/j.jco.2008.11.002
- Erich Novak and Henryk Woźniakowski, Tractability of multivariate problems for standard and linear information in the worst case setting: Part I, J. Approx. Theory 207 (2016), 177–192. MR 3494228, DOI 10.1016/j.jat.2016.02.017
- Erich Novak and Henryk Woźniakowski, Tractability of multivariate problems for standard and linear information in the worst case setting: Part II, Contemporary computational mathematics—a celebration of the 80th birthday of Ian Sloan. Vol. 1, 2, Springer, Cham, 2018, pp. 963–977. MR 3822267
- Ju. P. Ofman, On the best approximation of functions of two variables by functions of the form $\varphi (x)+\psi (y)$, Izv. Akad. Nauk SSSR Ser. Mat. 25 (1961), 239–252 (Russian). MR 0125381
- K. I. Oskolkov, Ridge approximation, Fourier-Chebyshev analysis, and optimal quadrature formulas, Tr. Mat. Inst. Steklova 219 (1997), no. Teor. Priblizh. Garmon. Anal., 269–285 (Russian); English transl., Proc. Steklov Inst. Math. 4(219) (1997), 265–280. MR 1642280
- Phillip A. Ostrand, Dimension of metric spaces and Hilbert’s problem $13$, Bull. Amer. Math. Soc. 71 (1965), 619–622. MR 177391, DOI 10.1090/S0002-9904-1965-11363-5
- P. Petersen and F. Voigtlaender, Optimal approximation of piecewise smooth functions using deep ReLU neural networks, Neural Networks 108 (2018), 296–330.
- Pencho P. Petrushev, Approximation by ridge functions and neural networks, SIAM J. Math. Anal. 30 (1999), no. 1, 155–189. MR 1646689, DOI 10.1137/S0036141097322959
- Allan Pinkus, Approximating by ridge functions, Surface fitting and multiresolution methods (Chamonix–Mont-Blanc, 1996) Vanderbilt Univ. Press, Nashville, TN, 1997, pp. 279–292. MR 1660030
- Allan Pinkus, Approximation theory of the MLP model in neural networks, Acta numerica, 1999, Acta Numer., vol. 8, Cambridge Univ. Press, Cambridge, 1999, pp. 143–195. MR 1819645, DOI 10.1017/S0962492900002919
- A. Pinkus, Smoothness and uniqueness in ridge function representation, Indag. Math. (N.S.) 24 (2013), no. 4, 725–738. MR 3124803, DOI 10.1016/j.indag.2012.10.004
- Allan Pinkus, The alternating algorithm in a uniformly convex and uniformly smooth Banach space, J. Math. Anal. Appl. 421 (2015), no. 1, 747–753. MR 3250506, DOI 10.1016/j.jmaa.2014.06.076
- Allan Pinkus, Ridge functions, Cambridge Tracts in Mathematics, vol. 205, Cambridge University Press, Cambridge, 2015. MR 3559591, DOI 10.1017/CBO9781316408124
- T. J. Rivlin and R. J. Sibner, The degree of approximation of certain functions of two variables by a sum of functions of one variable, Amer. Math. Monthly 72 (1965), 1101–1103. MR 186995, DOI 10.2307/2315959
- Walter Rudin, Functional analysis, 2nd ed., International Series in Pure and Applied Mathematics, McGraw-Hill, Inc., New York, 1991. MR 1157815
- Marcello Sanguineti, Universal approximation by ridge computational models and neural networks: a survey, Open Appl. Math. J. 2 (2008), 31–58. MR 2399691, DOI 10.2174/1874114200802010031
- J. Schmidt-Hieber, The Kolmogorov–Arnold representation theorem revisited, Neural Networks 137 (2021), 119–126.
- Laurent Schwartz, Théorie générale des fonctions moyenne-périodiques, Ann. of Math. (2) 48 (1947), 857–929 (French). MR 23948, DOI 10.2307/1969386
- Z. Shen, H. Yang, and S. Zhang, Neural network approximation: Three hidden layers are enough, Neural Networks 141 (2021), 160–173.
- Ivan Singer, The theory of best approximation and functional analysis, Conference Board of the Mathematical Sciences Regional Conference Series in Applied Mathematics, No. 13, Society for Industrial and Applied Mathematics, Philadelphia, Pa., 1974. MR 0374771
- David Sprecher, A representation theorem for continuous functions of several variables, Proc. Amer. Math. Soc. 16 (1965), 200–203. MR 174666, DOI 10.1090/S0002-9939-1965-0174666-7
- David A. Sprecher, On the existence of best approximations and representations in several variables, J. Reine Angew. Math. 234 (1969), 152–162. MR 257622, DOI 10.1515/crll.1969.234.152
- David A. Sprecher, An improvement in the superposition theorem of Kolmogorov, J. Math. Anal. Appl. 38 (1972), 208–213. MR 302838, DOI 10.1016/0022-247X(72)90129-1
- D. A. Sprecher, A universal mapping for Kolmogorov’s superposition theorem, Neural Networks 6 (1993), 1089–1094.
- J. P. Sproston and D. Strauss, Sums of subalgebras of $C(X)$, J. London Math. Soc. (2) 45 (1992), no. 2, 265–278. MR 1171554, DOI 10.1112/jlms/s2-45.2.265
- W. A. Stein et al., Sage Mathematics Software (Version 7.6), The Sage Developers, 2017, http://www.sagemath.org.
- J. Sternfeld, Dimension theory and superpositions of continuous functions, Israel J. Math. 20 (1975), no. 3-4, 300–320. MR 374351, DOI 10.1007/BF02760335
- Y. Sternfeld, Uniformly separating families of functions, Israel J. Math. 29 (1978), no. 1, 61–91. MR 487991, DOI 10.1007/BF02760402
- Y. Sternfeld, Superpositions of continuous functions, J. Approx. Theory 25 (1979), no. 4, 360–368. MR 535937, DOI 10.1016/0021-9045(79)90022-4
- Y. Sternfeld, Dimension, superposition of functions and separation of points, in compact metric spaces, Israel J. Math. 50 (1985), no. 1-2, 13–53. MR 788068, DOI 10.1007/BF02761117
- Yaki Sternfeld, Uniform separation of points and measures and representation by sums of algebras, Israel J. Math. 55 (1986), no. 3, 350–362. MR 876401, DOI 10.1007/BF02765032
- M. Stinchcombe and H. White, Approximating and learning unknown mappings using multilayer feedforward networks with bounded weights, Proceedings of the IEEE 1990 International Joint Conference on Neural Networks, Vol. 3, IEEE, New York, 1990, pp. 7–16.
- Bruno H. Strulovici and Thomas A. Weber, Additive envelopes of continuous functions, Oper. Res. Lett. 38 (2010), no. 3, 165–168. MR 2608851, DOI 10.1016/j.orl.2010.01.004
- Xingping Sun and E. W. Cheney, The fundamentality of sets of ridge functions, Aequationes Math. 44 (1992), no. 2-3, 226–235. MR 1181270, DOI 10.1007/BF01830981
- V. N. Temlyakov, On approximation by ridge functions, preprint, Department of Mathematics, University of South Carolina, 1996.
- V. M. Tihomirov, The works of A. N. Kolmogorov on $\varepsilon$-entropy of function classes and superpositions of functions, Uspehi Mat. Nauk 18 (1963), no. 5 (113), 55–92 (Russian). MR 0162910
- J. F. Traub, G. W. Wasilkowski, and H. Woźniakowski, Information-based complexity, Computer Science and Scientific Computing, Academic Press, Inc., Boston, MA, 1988. With contributions by A. G. Werschulz and T. Boult. MR 958691
- V. N. Trofimov and L. R. Hariton, On the error of uniform approximation of functions of two variables by a sum of functions of one variable, Izv. Vyssh. Uchebn. Zaved. Mat. 8 (1979), 70–73 (Russian). MR 554506
- Hemant Tyagi and Volkan Cevher, Learning non-parametric basis independent models from point queries via low-rank methods, Appl. Comput. Harmon. Anal. 37 (2014), no. 3, 389–412. MR 3256780, DOI 10.1016/j.acha.2014.01.002
- A. G. Vituškin and G. M. Henkin, Linear superpositions of functions, Uspehi Mat. Nauk 22 (1967), no. 1 (133), 77–124 (Russian). MR 0237729
- B. A. Vostrecov and M. A. Kreĭnes, Approximation of continuous functions by superpositions of plane waves, Dokl. Akad. Nauk SSSR 140 (1961), 1237–1240 (Russian). MR 0131106
- D. Yarotsky, Error bounds for approximations with deep ReLU networks, Neural Networks 94 (2017), 103–114.