Skip to Main Content
Quarterly of Applied Mathematics

Quarterly of Applied Mathematics

Online ISSN 1552-4485; Print ISSN 0033-569X

   
 
 

 

Data clustering based on Langevin annealing with a self-consistent potential


Authors: Kyle Lafata, Zhennan Zhou, Jian-Guo Liu and Fang-Fang Yin
Journal: Quart. Appl. Math. 77 (2019), 591-613
MSC (2010): Primary 62H30, 82C31, 60H10, 37M25
DOI: https://doi.org/10.1090/qam/1521
Published electronically: October 11, 2018
MathSciNet review: 3962584
Full-text PDF

Abstract | References | Similar Articles | Additional Information

Abstract: This paper introduces a novel data clustering algorithm based on Langevin dynamics, where the associated potential is constructed directly from the data. To introduce a self-consistent potential, we adopt the potential model from the established Quantum Clustering method. The first step is to use a radial basis function to construct a density distribution from the data. A potential function is then constructed such that this density distribution is the ground state solution to the time-independent Schrödinger equation. The second step is to use this potential function with the Langevin dynamics at subcritical temperature to avoid ergodicity. The Langevin equations take a classical Gibbs distribution as the invariant measure, where the peaks of the distribution coincide with minima of the potential surface. The time dynamics of individual data points lead to different metastable states, which are interpreted as cluster centers. Clustering is therefore achieved when subsets of the data aggregate—as a result of the Langevin dynamics for a moderate period of time—in the neighborhood of a particular potential minimum. While the data points are pushed towards potential minima by the potential gradient, Brownian motion allows them to effectively tunnel through local potential barriers and escape saddle points into locations of the potential surface otherwise forbidden. The algorithm’s feasibility is first established based on several illustrating examples and theoretical analyses, followed by a stricter evaluation using a standard benchmark dataset.


References [Enhancements On Off] (What's this?)

References
  • Dario Bambusi, Sandro Graffi, and Thierry Paul, Long time semiclassical approximation of quantum flows: a proof of the Ehrenfest time, Asymptot. Anal. 21 (1999), no. 2, 149–160. MR 1723551
  • Andrea L. Bertozzi and Arjuna Flenner, Diffuse interface models on graphs for classification of high dimensional data [reprint of MR3022033], SIAM Rev. 58 (2016), no. 2, 293–328. MR 3493948, DOI https://doi.org/10.1137/16M1070426
  • A. Bewley and B. Upcroft, Advantages of exploiting projection structure for segmenting dense 3d point clouds, Proceedings of Australasian Conference on Robotics and Automation (2013).
  • L. V. Bijuraj, Clustering and its applications, Proceedings of National Conference on New Horizona in IT (2013).
  • K. Blekas and I. E. Lagaris, Newtonian clustering: An approach based on molecular dynamics and global optimization, Pattern Recognition 40 (2007), 1734–1744.
  • T. Buhler and M. Hein, Spectral clustering based on the graph p-laplacian, Proceedings of the 26th International Conference on Machine Learning (2009), 81–88.
  • W. T. Coffey, Yu. P. Kalmykov, and J. T. Waldron, The Langevin equation, 2nd ed., World Scientific Series in Contemporary Chemical Physics, vol. 14, World Scientific Publishing Co., Inc., River Edge, NJ, 2004. With applications to stochastic problems in physics, chemistry and electrical engineering. MR 2053912
  • R. R. Coifman, S. Lafon, A. B. Lee, M. Maggioni, B. Nadler, F. Warner, and S. W. Zucker, Geometric diffusions as a tool for harmonic analysis and structure definition of data: Diffusion maps, Proceedings of the National Academy of Sciences 102 (2005), no. 21, 1788–1794.
  • Y. Dauphin, R. Pascanu, C. Gulcehre, K. Cho, S. Ganguli, and Y. Bengio, Identifying and attacking the saddle point problem in high-dimensional non-convex optimization, Proceedings of the 27th International Conference on Neural Information Processing Systems (2014), 2933–2941.
  • V. Estivill-Castro, Why so many clustering algorithms — a position paper, ACM SIGKDD Explorations Newsletter 4 (2002), no. 1, 65–75.
  • Denis J. Evans, Debra J. Searles, and Stephen R. Williams, Fundamentals of classical statistical thermodynamics, WILEY-VCH Verlag Berlin GmbH, Weinheim, 2016. Dissipation, relaxation and fluctuation theorems. MR 3497107
  • R. Filipovych, S. M. Resnick, and C. Davatzikos, Semi-supervised cluster analysis of imaging data, Neuroimage 54 (2011), no. 3, 2185–2197.
  • C. Garcia-Cardona, A. Flenner, and A. G. Percus, Diffuse interface models on graphs for classification of high dimensional data, Proceedings of the 2nd International Conference on Pattern Recognition Applications and Methods (2013), 78–86.
  • R. Ge, F. Huang, C. Jin, and Y. Yuan, Escaping from saddle points – online stochastic gradient for tensor decomposition, JMLR: Workshop and Conference Proceedings 40 (2015), 1–46.
  • Patrick Gérard, Peter A. Markowich, Norbert J. Mauser, and Frédéric Poupaud, Homogenization limits and Wigner transforms, Comm. Pure Appl. Math. 50 (1997), no. 4, 323–379. MR 1438151, DOI https://doi.org/10.1002/%28SICI%291097-0312%28199704%2950%3A4%3C323%3A%3AAID-CPA4%3E3.3.CO%3B2-Q
  • George A. Hagedorn and Alain Joye, Exponentially accurate semiclassical dynamics: propagation, localization, Ehrenfest times, scattering, and more general states, Ann. Henri Poincaré 1 (2000), no. 5, 837–883. MR 1806980, DOI https://doi.org/10.1007/PL00001017
  • David Horn, Clustering via Hilbert space, Phys. A 302 (2001), no. 1-4, 70–79. MR 1876430, DOI https://doi.org/10.1016/S0378-4371%2801%2900442-3
  • D. Horn and I. Axel, Novel clustering algorithm for microarray expression data in a truncated svd space, Bioinformatics 19 (2003), no. 15, 1110.
  • D. Horn and A. Gottlieb, Algorithm for data clustering in pattern recognition problems based on quantum mechanics, Physical Review Letters 88 (2002), no. 1, 018702.
  • A. K. Jain, Data clustering: 50 years beyond k-means, Pattern Recognition Letters 31 (2010), no. 8, 651–666.
  • A. K. Jain, M. N. Murty, and P. J. Flynn, Data clustering: A review, ACM Computing Surveys 31 (1999), no. 3, 651–666.
  • I. T. Jolliffe, Principal component analysis, 2nd ed., Springer Series in Statistics, Springer-Verlag, New York, 2002. MR 2036084
  • Shoshichi Kobayashi and Katsumi Nomizu, Foundations of differential geometry. Vol. II, Interscience Tracts in Pure and Applied Mathematics, No. 15 Vol. II, Interscience Publishers John Wiley & Sons, Inc., New York-London-Sydney, 1969. MR 0238225
  • R. Kubo, The fluctuation-dissipation theorem, Reports on Progress in Physics 29 (1966), no. 1, 255–284.
  • S. Lafon and A. B. Lee, Diffusion maps and coarse-graining: A unified framework for dimensionality reduction, graph partitioning, and data set parameterization, IEEE Transactions on Pattern Analysis and Machine Intelligence 28 (2006), no. 9, 1393–1403.
  • B. Leimkuhler and C. Matthews, Robust and efficient configurational molecular sampling via Langevin dynamics, Journal of Chemical Physics 138 (2013), no. 17.
  • C. Li, C. Chen, D. Carlson, and L. Carin, Preconditioned stochastic gradient Langevin dynamics for deep neural networks, Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (2016), 1788–1794.
  • T. W. Liao, Clustering of time series data — a survey, Pattern Recognition 38 (2005), no. 11, 1857–1874.
  • Pierre-Louis Lions and Thierry Paul, Sur les mesures de Wigner, Rev. Mat. Iberoamericana 9 (1993), no. 3, 553–618 (French, with English and French summaries). MR 1251718, DOI https://doi.org/10.4171/RMI/143
  • J. Lu and Z. Zhou, Path integral molecular dynamics with surface hopping for thermal equilibrium sampling of nonadiabatic systems, Journal of Chemical Physics 146 (2017), no. 15.
  • Boaz Nadler, Stéphane Lafon, Ronald R. Coifman, and Ioannis G. Kevrekidis, Difusion maps, spectral clustering and reaction coordinates of dynamical systems, Appl. Comput. Harmon. Anal. 21 (2006), no. 1, 113–127. MR 2238669, DOI https://doi.org/10.1016/j.acha.2005.07.004
  • D. Pfitzner, R. Leibbrandt, and D. Powers, Characterization and evaluation of similarity measures for pairs of clusterings, Knowledge and Information Systems, 16 (2009), 361–394.
  • E. Ramasso, V. Placet, and M. L. Boubakar, Unsupervised consensus clustering of acoustic emission time-series for robust damage sequence estimation in composites, IEEE Transactions on Instrumentation and Measurement 64 (2015), no. 12, 3297–3307.
  • B. D. Ripley, Pattern recognition and neural networks, Cambridge University Press, Cambridge, 2007. Reprint of the 1996 original. MR 2451352
  • S. J. Roberts, Parametric and non-parametric unsupervised cluster analysis, Pattern Recognition 30 (1997), no. 2, 261–272.
  • A. M. Saxe, J. L. McClelland, and S. Ganguli, Exact solutions to the nonlinear dynamics of learning in deep linear neural networks, International Conference on Learning Representations (2014).
  • Rouslan L. Stratonovich, Nonlinear nonequilibrium thermodynamics. I, Springer Series in Synergetics, vol. 57, Springer-Verlag, Berlin, 1992. Linear and nonlinear fluctuation-dissipation theorems; Translated from the Russian by V. V. Stratonovich and A. P. Repjev. MR 1217972
  • Arthur D. Szlam, Mauro Maggioni, and Ronald R. Coifman, Regularization on graphs with function-adapted diffusion processes, J. Mach. Learn. Res. 9 (2008), 1711–1739. MR 2438821
  • P.-N. Tan, M. Steinback, A. Karpatne, and V. Kumar, Introduction to data mining, Pearson, 2005.
  • U. von Luxburg, A tutorial on spectral clustering, Technical report TR-149, Max Planck Institute for Biological Cybernetics, Tubingen, Germany (2006).
  • M. Weinstein and D. Horn, Dynamic quantum clustering: a method for visual exploration of structures in data, Physical Review E 80 (2009), 066117.
  • Qi He and Jack Xin, Hybrid deterministic-stochastic gradient Langevin dynamics for Bayesian learning, Commun. Inf. Syst. 12 (2012), no. 3, 221–232. MR 3157685, DOI https://doi.org/10.4310/CIS.2012.v12.n3.a3
  • P. Wittek, High-performance dynamic quantum clustering on graphics processors, Journal of Computational Physics 233 (2013), 262–271.
  • Lingsong Zhang, J. S. Marron, Haipeng Shen, and Zhengyuan Zhu, Singular value decomposition and its visualization, J. Comput. Graph. Statist. 16 (2007), no. 4, 833–854. MR 2412485, DOI https://doi.org/10.1198/106186007X256080

Similar Articles

Retrieve articles in Quarterly of Applied Mathematics with MSC (2010): 62H30, 82C31, 60H10, 37M25

Retrieve articles in all journals with MSC (2010): 62H30, 82C31, 60H10, 37M25


Additional Information

Kyle Lafata
Affiliation: Department of Radiation Oncology, Physics Division, Department of Physics, and Medical Physics Graduate Program, Duke University, Durham, North Carolina 27705
Email: kyle.lafata@duke.edu

Zhennan Zhou
Affiliation: Beijing International Center for Mathematical Research, Peking University, Beijing, People’s Republic of China 100871
MR Author ID: 1067205
Email: zhennan@bicmr.pku.edu.cn

Jian-Guo Liu
Affiliation: Departments of Mathematics and Physics, Duke University, Durham, North Carolina 27705
MR Author ID: 233036
ORCID: 0000-0002-9911-4045
Email: jliu@math.duke.edu

Fang-Fang Yin
Affiliation: Department of Radiation Oncology, Physics Division and Medical Physics Graduate Program, Duke University, Durham, North Carolina 27705
MR Author ID: 843641
Email: fangfang.yin@duke.edu

Received by editor(s): July 16, 2018
Published electronically: October 11, 2018
Additional Notes: The first author was partially supported by a grant from Varian Medical Systems.
The second author was partially supported by RNMS11-07444 (KI-Net) and the startup grant from Peking University.
The third author was partially supported by KI-Net NSF RNMS 11-07444, NSF DMS-1514826, and NSF DMS-1812573.
The fourth author was partially supported by NIH 2R21CA218940, NIH 1R01CA184173, NIH 1R21CA165384, and a grant from Varian Medical Systems.
Article copyright: © Copyright 2018 Brown University