Cluster analysis is a class of techniques used to classify objects or cases into relatively homogeneous groups called clusters. It is also called classification analysis or numerical taxonomy. In cluster analysis, there is no prior information about the group or cluster membership of any of the objects.
Cluster analysis has been used in marketing for various purposes, including the following:
Cluster analysis can be used to segment consumers on the basis of the benefits they seek from purchasing a product. It can also be used to identify homogeneous groups of buyers.
Cluster analysis involves formulating the problem, selecting a distance measure, selecting a clustering procedure, deciding on the number of clusters, interpreting and profiling the clusters, and finally assessing the validity of the clustering.
The variables on which the cluster analysis is based should be selected with past research in mind. The selection should also be guided by theory, the hypotheses being tested, and the judgment of the researcher. An appropriate measure of distance or similarity should then be chosen. The most commonly used measure is the Euclidean distance or its square.
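As a minimal sketch of the distance measure named above, the following Python functions compute the Euclidean distance and its square between two cases. The function names and the two sample points are illustrative assumptions, not part of any particular statistics package.

```python
import math

def squared_euclidean(a, b):
    """Squared Euclidean distance: the sum of squared differences per variable."""
    if len(a) != len(b):
        raise ValueError("both cases must be measured on the same variables")
    return sum((x - y) ** 2 for x, y in zip(a, b))

def euclidean(a, b):
    """Euclidean distance: the square root of the squared Euclidean distance."""
    return math.sqrt(squared_euclidean(a, b))

# Two hypothetical cases measured on two variables.
p, q = (1.0, 2.0), (4.0, 6.0)
print(euclidean(p, q))          # 5.0
print(squared_euclidean(p, q))  # 25.0
```

Because the Euclidean distance sums raw differences, variables measured on larger scales contribute more to it, which is one reason variables are often standardized before clustering.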
Clustering procedures may be hierarchical, nonhierarchical, or two-step. A hierarchical procedure is characterized by the development of a tree-like structure and can be agglomerative or divisive. Agglomerative methods consist of linkage methods, variance methods, and centroid methods. Linkage methods comprise single linkage, complete linkage, and average linkage.
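The three linkage methods differ only in how the distance between two clusters is defined: single linkage uses the smallest distance between any pair of cases drawn from the two clusters, complete linkage uses the largest, and average linkage uses the mean of all such pairwise distances. The sketch below illustrates this with Euclidean distance; the function names and the two small clusters are assumptions made for the example.

```python
import math
from itertools import product

def pairwise_distances(cluster_a, cluster_b):
    """Euclidean distances between every case in cluster_a and every case in cluster_b."""
    return [math.dist(p, q) for p, q in product(cluster_a, cluster_b)]

def single_linkage(cluster_a, cluster_b):
    """Distance between the two closest cases (nearest neighbor)."""
    return min(pairwise_distances(cluster_a, cluster_b))

def complete_linkage(cluster_a, cluster_b):
    """Distance between the two most distant cases (furthest neighbor)."""
    return max(pairwise_distances(cluster_a, cluster_b))

def average_linkage(cluster_a, cluster_b):
    """Mean of all between-cluster pairwise distances."""
    d = pairwise_distances(cluster_a, cluster_b)
    return sum(d) / len(d)

# Two hypothetical clusters of two cases each.
a = [(0.0, 0.0), (0.0, 1.0)]
b = [(3.0, 0.0), (4.0, 0.0)]
print(single_linkage(a, b))    # 3.0
print(complete_linkage(a, b))  # sqrt(17), about 4.12
print(average_linkage(a, b))   # lies between the two values above
```

For any pair of clusters the average-linkage distance lies between the single-linkage and complete-linkage distances, so the three methods can merge clusters in a different order on the same data.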
Nonhierarchical methods are frequently referred to as k-means clustering. The two-step procedure can automatically determine the optimal number of clusters by comparing the values of model choice criteria across different clustering solutions. The choice of clustering procedure and the choice of distance measure are interrelated. The relative sizes of the clusters should be meaningful, and the clusters should be interpreted in terms of their centroids.
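A minimal sketch of k-means, assuming the common Lloyd-style iteration with Euclidean distance and caller-supplied starting centroids, might look like the following. The function name, the sample data, and the choice of starting points are all illustrative; statistical packages typically add refinements such as multiple random starts.

```python
import math

def k_means(points, initial_centroids, max_iter=100):
    """Alternate between assigning cases to the nearest centroid and updating centroids."""
    centroids = [tuple(c) for c in initial_centroids]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each case joins the cluster with the nearest centroid.
        new_assignment = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no case changed cluster, so the solution is stable
        assignment = new_assignment
        # Update step: each centroid becomes the mean of its member cases.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the previous centroid if a cluster is empty
                centroids[j] = tuple(sum(v) / len(members) for v in zip(*members))
    return assignment, centroids

# Six hypothetical cases forming two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centers = k_means(points, initial_centroids=[points[0], points[3]])
print(labels)   # [0, 0, 0, 1, 1, 1]
print(centers)  # one centroid per cluster
```

The number of clusters is fixed in advance by the number of starting centroids, which is the main practical difference from the hierarchical and two-step procedures described above.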
There are certain concepts and statistics associated with cluster analysis:
The agglomeration schedule gives information on the objects or cases being combined at each stage of the hierarchical clustering process.
The cluster centroid is the set of mean values of the variables for all the cases or objects in a particular cluster.
A dendrogram is a graphical device for displaying clustering results.
Distances between cluster centers indicate how separated the individual pairs of clusters are. Clusters that are widely separated are distinct and therefore desirable.
The similarity/distance coefficient matrix is a lower-triangle matrix containing pairwise distances between objects or cases.
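The statistics listed above can be illustrated on a tiny data set. The sketch below builds a lower-triangle distance matrix, runs an agglomerative procedure while recording the agglomeration schedule, and then computes the centroids of a two-cluster solution and the distance between them. The four cases, their names, and the use of single linkage with Euclidean distance are assumptions chosen only to keep the example short.

```python
import math

# Four hypothetical cases measured on two variables.
cases = {"A": (1.0, 1.0), "B": (1.0, 2.0), "C": (5.0, 5.0), "D": (6.5, 5.0)}
names = list(cases)

# Lower-triangle distance matrix: row i holds the distances from case i to cases 0..i-1.
lower = [
    [math.dist(cases[names[i]], cases[names[j]]) for j in range(i)]
    for i in range(len(names))
]

def case_distance(i, j):
    """Look up the distance between cases i and j (i != j) in the lower triangle."""
    return lower[max(i, j)][min(i, j)]

# Single-linkage agglomeration, recording which clusters merge at each stage.
clusters = [[i] for i in range(len(names))]
schedule = []
while len(clusters) > 1:
    best = None
    for x in range(len(clusters)):
        for y in range(x + 1, len(clusters)):
            d = min(case_distance(i, j) for i in clusters[x] for j in clusters[y])
            if best is None or d < best[0]:
                best = (d, x, y)
    d, x, y = best
    schedule.append(([names[i] for i in clusters[x]],
                     [names[i] for i in clusters[y]], d))
    clusters[x] = clusters[x] + clusters[y]
    del clusters[y]

for stage, (left, right, d) in enumerate(schedule, start=1):
    print(stage, left, right, round(d, 3))

def centroid(members):
    """Mean of each variable over the cases in one cluster."""
    pts = [cases[m] for m in members]
    return tuple(sum(v) / len(pts) for v in zip(*pts))

# Centroids of the two clusters joined at the final stage, and their separation.
center_1, center_2 = centroid(schedule[-1][0]), centroid(schedule[-1][1])
print(center_1, center_2, round(math.dist(center_1, center_2), 3))
```

In this example the large jump in merging distance at the last stage, compared with the earlier stages, is the kind of pattern a dendrogram makes visible and that suggests a two-cluster solution.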

