“Where is the Life we have lost in living?
Where is the wisdom we have lost in knowledge?
Where is the knowledge we have lost in information?”
- T. S. Eliot,
Choruses from the Rock
(1888 – 1965)
Today we were taught cluster analysis. I had not realized how far-reaching the applications of a topic like ‘cluster analysis’, taught in a b-school, could be.
This is my understanding of cluster analysis:
Cluster analysis is a collection of statistical methods that identify groups of samples that behave similarly or show similar characteristics. It is used to reduce the complexity of data by generating groups that are homogeneous internally and as heterogeneous as possible relative to other groups. The data usually consist of objects or persons, and the segmentation is based on more than two variables.
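To make "homogeneous within, heterogeneous between" concrete, here is a minimal Python sketch. The two customer groups and their (age, income in thousands) values are made up purely for illustration, the function names are my own, and plain Euclidean distance is an assumption; a real analysis would normally rescale the variables first.

```python
import math
from itertools import combinations, product

def mean_within_distance(group):
    """Average distance between all pairs inside one group (needs >= 2 points)."""
    pairs = list(combinations(group, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)

def mean_between_distance(group_a, group_b):
    """Average distance between every point of one group and every point of the other."""
    pairs = list(product(group_a, group_b))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)

# Hypothetical customers described by (age, income in thousands).
young_low_income = [(25, 30), (27, 32), (26, 31)]
older_high_income = [(55, 90), (57, 95), (54, 92)]

print("within group A :", mean_within_distance(young_low_income))
print("within group B :", mean_within_distance(older_high_income))
print("between A and B:", mean_between_distance(young_low_income, older_high_income))
```

A good segmentation is one where the two "within" numbers are small compared with the "between" number.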
Examples for datasets used for cluster analysis:
- Socio-economic criteria: income, education, profession, age, number of children, size of city of residence
- Psychographic criteria: interest, lifestyle, motivation, values, involvement
- Criteria linked to the buying behaviour: price range, type of media used, intensity of use, choice of retail outlet, fidelity, buyer/non-buyer, buying intensity
Interesting Applications: Mapping Crime
A hot spot is a condition indicating some form of clustering in a spatial distribution. However, not all clusters are hot spots because the environments that help generate crime—the places where people are—also tend to be clusters. Hot spots are small places in which the occurrence of crime is so frequent that it is highly predictable, at least over a 1-year period.
Cluster analysis methods depend on the proximity of incident points. Typically, an arbitrary starting point ("seed") is established. This seed point could be the center of the map. The program then finds the data point statistically farthest from there and makes that point the second seed, thus dividing the data points into two groups. Then distances from each seed to other points are repeatedly calculated, and clusters based on new seeds are developed so that the sums of within-cluster distances are minimized. The figure shown below illustrates hot spots derived from the Spatial and Temporal Analysis of Crime (STAC) method, which performs the functions of radial search and identification of events concentrated in a given area (Levine, 1996).
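The seed procedure described above can be sketched roughly in Python. This is only my simplified two-seed reading of the description, not the STAC method itself: the function names are hypothetical, the incident coordinates are invented, and I assume Euclidean distance and take each new seed to be the mean position of its group.

```python
import math

def centroid(points):
    """Mean position of a non-empty list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def two_seed_clusters(points, start, max_iter=100):
    """Split incident points into two groups around two seeds.

    start is the arbitrary first seed (for example the map center).
    """
    # Second seed: the incident farthest from the starting seed.
    second = max(points, key=lambda p: math.dist(p, start))
    seeds = [start, second]
    groups = [[], []]
    for _ in range(max_iter):
        # Assign every incident to its nearest seed.
        groups = [[], []]
        for p in points:
            nearest = 0 if math.dist(p, seeds[0]) <= math.dist(p, seeds[1]) else 1
            groups[nearest].append(p)
        # Move each seed to the center of its group (keep it if the group is empty).
        new_seeds = [centroid(g) if g else s for g, s in zip(groups, seeds)]
        if new_seeds == seeds:  # seeds stopped moving, so the groups are stable
            break
        seeds = new_seeds
    return seeds, groups

# Invented incident locations on a 10 x 10 map, with the map center as first seed.
incidents = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
seeds, groups = two_seed_clusters(incidents, start=(5, 5))
print(seeds)
print(groups)
```

Each pass reassigns points to the nearest seed and then recenters the seeds, which is what drives the within-cluster distances down; the loop stops once the seeds no longer move.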
I never thought that cluster analysis could have such unexpected and practical applications, even in fields as unlikely as crime mapping. I hope we will be able to apply these techniques and methods efficiently in various tasks.
Posted By –
Roll No: 13084