Testing the manifold hypothesis

Fefferman, Charles; Mitter, Sanjoy; Narayanan, Hariharan

doi:10.1090/jams/852

Testing the manifold hypothesis
HTML articles powered by AMS MathViewer

by Charles Fefferman, Sanjoy Mitter and Hariharan Narayanan

J. Amer. Math. Soc. 29 (2016), 983-1049

DOI: https://doi.org/10.1090/jams/852

Published electronically: February 9, 2016

PDF

Abstract:

The hypothesis that high dimensional data tend to lie in the vicinity of a low dimensional manifold is the basis of manifold learning. The goal of this paper is to develop an algorithm (with accompanying complexity guarantees) for testing the existence of a manifold that fits a probability distribution supported in a separable Hilbert space, only using i.i.d. samples from that distribution. More precisely, our setting is the following. Suppose that data are drawn independently at random from a probability distribution $\mathcal {P}$ supported on the unit ball of a separable Hilbert space $\mathcal {H}$. Let $\mathcal {G}(d, V, \tau )$ be the set of submanifolds of the unit ball of $\mathcal {H}$ whose volume is at most $V$ and reach (which is the supremum of all $r$ such that any point at a distance less than $r$ has a unique nearest point on the manifold) is at least $\tau$. Let $\mathcal {L}(\mathcal {M}, \mathcal {P})$ denote the mean-squared distance of a random point from the probability distribution $\mathcal {P}$ to $\mathcal {M}$. We obtain an algorithm that tests the manifold hypothesis in the following sense.

The algorithm takes i.i.d. random samples from $\mathcal {P}$ as input and determines which of the following two is true (at least one must be):

There exists $\mathcal {M} \in \mathcal {G}(d, CV, \frac {\tau }{C})$ such that $\mathcal {L}(\mathcal {M}, \mathcal {P}) \leq C {\epsilon }.$
There exists no $\mathcal {M} \in \mathcal {G}(d, V/C, C\tau )$ such that $\mathcal {L}(\mathcal {M}, \mathcal {P}) \leq \frac {\epsilon }{C}.$

The answer is correct with probability at least $1-\delta$.

References

Shun-ichi Amari and Hiroshi Nagaoka, Methods of information geometry, Translations of Mathematical Monographs, vol. 191, American Mathematical Society, Providence, RI; Oxford University Press, Oxford, 2000. Translated from the 1993 Japanese original by Daishi Harada. MR 1800071, DOI 10.1090/mmono/191
Richard G. Baraniuk and Michael B. Wakin, Random projections of smooth manifolds, Found. Comput. Math. 9 (2009), no. 1, 51–77. MR 2472287, DOI 10.1007/s10208-007-9011-z
Mikhail Belkin and Partha Niyogi, Laplacian eigenmaps for dimensionality reduction and data representation, Neural Comput. 15 (2003), no. 6, 1373–1396.
Jean-Daniel Boissonnat, Leonidas J. Guibas, and Steve Y. Oudot, Manifold reconstruction in arbitrary dimensions using witness complexes, Discrete Comput. Geom. 42 (2009), no. 1, 37–70. MR 2506737, DOI 10.1007/s00454-009-9175-1
Olivier Bousquet, Stéphane Boucheron, and Gábor Lugosi, Introduction to statistical learning theory, Advanced Lectures on Machine Learning, 2004, pp. 169–207.
P. S. Bradley and O. L. Mangasarian, $k$-plane clustering, J. Global Optim. 16 (2000), no. 1, 23–32. MR 1770524, DOI 10.1023/A:1008324625522
M. Brand, Charting a manifold, NIPS 15 (2002), 985–992.
Guillermo D. Cañas, Tomaso Poggio, and Lorenzo Rosasco, Learning manifolds with k-means and k-flats, 26th Annual Conference on Neural Information Processing Systems, December 3–6, 2012, Lake Tahoe, Nevada, Proceedings of Advances in Neural Information Processing Systems 25, Curran Associates, Red Hook, NY, 2013, pp. 2474–2482.
Gunnar Carlsson, Topology and data, Bull. Amer. Math. Soc. (N.S.) 46 (2009), no. 2, 255–308. MR 2476414, DOI 10.1090/S0273-0979-09-01249-X
Siu-Wing Cheng, Tamal K. Dey, and Edgar A. Ramos, Manifold reconstruction from point samples, Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms, ACM, New York, 2005, pp. 1018–1027. MR 2298361
Christopher R. Genovese, Marco Perone-Pacifico, Isabella Verdinelli, and Larry Wasserman, Manifold estimation and singular deconvolution under Hausdorff loss, Ann. Statist. 40 (2012), no. 2, 941–963. MR 2985939, DOI 10.1214/12-AOS994
Christopher R. Genovese, Marco Perone-Pacifico, Isabella Verdinelli, and Larry Wasserman, Minimax manifold estimation, J. Mach. Learn. Res. 13 (2012), no. 1, 1263–1291.
Kenneth L. Clarkson, Tighter bounds for random projections of manifolds, Computational geometry (SCG’08), ACM, New York, 2008, pp. 39–48. MR 2504269, DOI 10.1145/1377676.1377685
Ronald R. Coifman, Stephane Lafon, Ann Lee, Mauro Maggioni, Boaz Nadler, Frederick Warner, and Steven Zucker, Geometric diffusions as a tool for harmonic analysis and structure definition of data. Part II: Multiscale methods, Proc. Natl. Acad. Sci. USA 102 (2005), no. 21, 7432–7438.
Trevor F. Cox and Michael A. A. Cox, Multidimensional scaling, Monographs on Statistics and Applied Probability, vol. 59, Chapman & Hall, London, 1994. With 1 IBM-PC floppy disk (3.5 inch, HD). MR 1335449
Sanjoy Dasgupta and Yoav Freund, Random projection trees and low dimensional manifolds, STOC’08, ACM, New York, 2008, pp. 537–546. MR 2582911, DOI 10.1145/1374376.1374452
Guy David and Stephen Semmes, Analysis of and on uniformly rectifiable sets, Mathematical Surveys and Monographs, vol. 38, American Mathematical Society, Providence, RI, 1993. MR 1251061, DOI 10.1090/surv/038
Luc Devroye, Lászlo Györfi, and Gábor Lugosi, A probabilistic theory of pattern recognition, Springer, New York, 1996.
David L. Donoho, Aide-memoire. High-dimensional data analysis: The curses and blessings of dimensionality, http://statweb.stanford.edu/$\sim$donoho/Lectures/CBMS/Curses.pdf (2000).
David L. Donoho and Carrie Grimes, Hessian eigenmaps: locally linear embedding techniques for high-dimensional data, Proc. Natl. Acad. Sci. USA 100 (2003), no. 10, 5591–5596. MR 1981019, DOI 10.1073/pnas.1031596100
R. M. Dudley, Uniform central limit theorems, Cambridge Studies in Advanced Mathematics, vol. 63, Cambridge University Press, Cambridge, 1999. MR 1720712, DOI 10.1017/CBO9780511665622
Herbert Federer, Curvature measures, Trans. Amer. Math. Soc. 93 (1959), 418–491. MR 110078, DOI 10.1090/S0002-9947-1959-0110078-1
Charles Fefferman, Interpolation and extrapolation of smooth functions by linear operators, Rev. Mat. Iberoamericana 21 (2005), no. 1, 313–348. MR 2155023, DOI 10.4171/RMI/424
Charles Fefferman, The $C^m$ norm of a function with prescribed jets. II, Rev. Mat. Iberoam. 25 (2009), no. 1, 275–421. MR 2514339, DOI 10.4171/RMI/570
Christopher R. Genovese, Marco Perone-Pacifico, Isabella Verdinelli, and Larry Wasserman, Nonparametric ridge estimation, Ann. Statist. 42 (2014), no. 4, 1511–1545. MR 3262459, DOI 10.1214/14-AOS1218
Trevor Hastie and Werner Stuetzle, Principal curves, J. Amer. Statist. Assoc. 84 (1989), no. 406, 502–516. MR 1010339
H. Hotelling, Analysis of a complex of statistical variables into principal components, J. Educational Psychology 24 (1933), 417–441, 498–520.
Mark A. Iwen and Mauro Maggioni, Approximation of points on low-dimensional manifolds via random linear projections, Inf. Inference 2 (2013), no. 1, 1–31. MR 3311442, DOI 10.1093/imaiai/iat001
William B. Johnson and Joram Lindenstrauss, Extensions of Lipschitz mappings into a Hilbert space, Conference in modern analysis and probability (New Haven, Conn., 1982) Contemp. Math., vol. 26, Amer. Math. Soc., Providence, RI, 1984, pp. 189–206. MR 737400, DOI 10.1090/conm/026/737400
Peter W. Jones, Mauro Maggioni, and Raanan Schul, Universal local parametrizations via heat kernels and eigenfunctions of the Laplacian, Ann. Acad. Sci. Fenn. Math. 35 (2010), no. 1, 131–174. MR 2643401, DOI 10.5186/aasfm.2010.3508
A. Kambhatla and Todd K. Leen, Fast non-linear dimension reduction, IEEE International Conference on Neural Networks, IEEE, New York, 1993, pp. 1213–1218.
Balázs Kégl, Adam Krzyzak, Tamás Linder, and Kenneth Zeger, Learning and design of principal curves, IEEE Trans. Pattern Analysis and Machine Intelligence 22 (2000), 281–297.
Michel Ledoux, The concentration of measure phenomenon, Mathematical Surveys and Monographs, vol. 89, American Mathematical Society, Providence, RI, 2001. MR 1849347, DOI 10.1090/surv/089
Gilad Lerman and Teng Zhang, Robust recovery of multiple subspaces by geometric $l_p$ minimization, Ann. Statist. 39 (2011), no. 5, 2686–2715. MR 2906883, DOI 10.1214/11-AOS914
Yunqian Ma and Yun Fu (eds.), Manifold learning theory and applications, CRC Press, Boca Raton, FL, 2012. MR 2964193
Thomas Martinetz and Klaus Schulten, Topology representing networks, Neural Netw. 7 (1994), no. 3, 507–522.
Pascal Massart, Some applications of concentration inequalities to statistics, Ann. Fac. Sci. Toulouse Math. (6) 9 (2000), no. 2, 245–303 (English, with English and French summaries). Probability theory. MR 1813803
Andreas Maurer and Massimiliano Pontil, $K$-dimensional coding schemes in Hilbert spaces, IEEE Trans. Inform. Theory 56 (2010), no. 11, 5839–5846. MR 2808936, DOI 10.1109/TIT.2010.2069250
Raghavan Narasimhan, Lectures on topics in analysis, Tata Institute of Fundamental Research Lectures on Mathematics, No. 34, Tata Institute of Fundamental Research, Bombay, 1965. Notes by M. S. Rajwade. MR 0212837
Hariharan Narayanan and Sanjoy Mitter, On the sample complexity of testing the manifold hypothesis, NIPS (2010), 1786–1794.
Hariharan Narayanan and Partha Niyogi, On the sample complexity of learning smooth cuts on a manifol, Proc. of the 22nd Annual Conference on Learning Theory (COLT), June 2009, pp. 197–204.
Nigel J. Newton, Infinite-dimensional manifolds of finite-entropy probability measures, Geometric science of information, Lecture Notes in Comput. Sci., vol. 8085, Springer, Heidelberg, 2013, pp. 713–720. MR 3126105, DOI 10.1007/978-3-642-40020-9_{7}9
Partha Niyogi, Stephen Smale, and Shmuel Weinberger, Finding the homology of submanifolds with high confidence from random samples, Discrete Comput. Geom. 39 (2008), no. 1-3, 419–441. MR 2383768, DOI 10.1007/s00454-008-9053-2
Umut Ozertem and Deniz Erdogmus, Locally defined principal curves and surfaces, J. Mach. Learn. Res. 12 (2011), 1249–1286. MR 2804600
K. Pearson, On lines and planes of closest fit to systems of points in space, Philos. Mag. 2 (1901), 559–572.
S. T. Roweis, L. Saul, and G. Hinton, Global coordination of local linear models, Adv. Neural Inf. Process. Syst. 14 (2002), 889–896.
Sam T. Roweis and Lawrence K. Saul, Nonlinear dimensionality reduction by locally linear embedding, Science 290 (2000), 2323–2326.
M. Rudelson and R. Vershynin, Combinatorics of random processes and sections of convex bodies, Ann. of Math. (2) 164 (2006), no. 2, 603–648. MR 2247969, DOI 10.4007/annals.2006.164.603
J. Shawe-Taylor and N. Christianini, Kernel methods for pattern analysis, Cambridge University Press, Cambridge, UK, 2004.
Alex J. Smola, Robert C. Williamson, Sebastian Mika, and Bernhard Schölkopf, Regularized principal manifolds, Computational learning theory (Nordkirchen, 1999) Lecture Notes in Comput. Sci., vol. 1572, Springer, Berlin, 1999, pp. 214–229. MR 1724991, DOI 10.1007/3-540-49097-3_{1}7
Karthik Sridharan and Nathan Srebro, Note on refined Dudley integral covering number bound. Unpublished.
J. B. Tenenbaum, V. Silva, and J. C. Langford, A Global Geometric Framework for Nonlinear Dimensionality Reduction, Science 290 (2000), no. 2000, 2319–2323.
Pravin M. Vaidya, A new algorithm for minimizing convex functions over convex sets, Math. Programming 73 (1996), no. 3, Ser. A, 291–341. MR 1393123, DOI 10.1016/0025-5610(92)00021-S
Vladimir Vapnik, Estimation of dependences based on empirical data, Springer Series in Statistics, Springer-Verlag, New York-Berlin, 1982. Translated from the Russian by Samuel Kotz. MR 672244
Kilian Q. Weinberger and Lawrence K. Saul, Unsupervised learning of image manifolds by semidefinite programming, Int. J. Comput. Vision 70 (2006), no. 1, 77–90.
Zhenyue Zhang and Hongyuan Zha, Principal manifolds and nonlinear dimensionality reduction via tangent space alignment, SIAM J. Sci. Comput. 26 (2004), no. 1, 313–338. MR 2114346, DOI 10.1137/S1064827502419154