Techniques of Cluster Algorithms in Data Mining

Data Mining and Knowledge Discovery

Abstract

An overview of cluster analysis techniques from a data mining point of view is given. It strictly separates the questions of similarity and distance measures, and of the related optimization criteria for clusterings, from the methods used to create and modify the clusterings themselves. Beyond this general setting and overview, the second focus is a discussion of the essential ingredients of the demographic cluster algorithm of IBM's Intelligent Miner, which is based on Condorcet's criterion.

Cite this article

Grabmeier, J., Rudolph, A. Techniques of Cluster Algorithms in Data Mining. Data Mining and Knowledge Discovery 6, 303–360 (2002). https://doi.org/10.1023/A:1016308404627

  • DOI: https://doi.org/10.1023/A:1016308404627
