## Monday, 29 August 2011

Since the day we started Business Analytics, a lot of stress has been put on the clustering technique. I went through a number of websites to get an understanding of it. The simplest definition that I came across is:

“Organizing data into classes such that there is

• High intra-class similarity
• Low inter-class similarity”

It is also known as classification by statisticians and segmentation by marketers.
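To make the two bullet points concrete, here is a minimal sketch in Python. The points and the two group labels are made up for illustration (they are not from the class material), and Euclidean distance is assumed as the measure of dissimilarity, so a small distance means high similarity.

```python
from itertools import combinations, product
from math import dist  # Euclidean distance, available in Python 3.8+

# Hypothetical toy data: two hand-labelled groups of 2-D points.
clusters = {
    "a": [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5)],
    "b": [(8.0, 8.0), (8.5, 9.0), (9.0, 8.5)],
}

def mean_intra_distance(points):
    """Average distance between all pairs of points inside one group."""
    pairs = list(combinations(points, 2))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)

def mean_inter_distance(points_a, points_b):
    """Average distance between points taken from two different groups."""
    pairs = list(product(points_a, points_b))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)

intra_a = mean_intra_distance(clusters["a"])
intra_b = mean_intra_distance(clusters["b"])
inter_ab = mean_inter_distance(clusters["a"], clusters["b"])

print(f"mean distance inside group a: {intra_a:.2f}")
print(f"mean distance inside group b: {intra_b:.2f}")
print(f"mean distance between groups: {inter_ab:.2f}")
```

For a reasonable grouping, the two "inside" numbers should be clearly smaller than the "between" number, which is just another way of saying high intra-class similarity and low inter-class similarity.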

Let’s try to understand this with the help of the following illustrations.

What is a natural grouping among these objects?


Clustering is Subjective

While studying clustering we also came across the concept of distance measurement.

Definition: Let O1 and O2 be two objects from the universe of possible objects. The distance (dissimilarity) between O1 and O2 is a real number denoted by D(O1, O2).
E.g., let’s try to figure out the dissimilarity between the following objects.

The black box above contains a function which has the following properties:
• D(A,B) = D(B,A)         Symmetry
• D(A,A) = 0         Constancy of Self-Similarity
• D(A,B) = 0 if and only if A = B         Positivity (Separation)
• D(A,B) ≤ D(A,C) + D(B,C)         Triangular Inequality
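The four properties above can be checked numerically on a handful of objects. The sketch below is only an illustration: it assumes Euclidean distance as the function inside the black box, and the sample points, the helper `satisfies_metric_properties`, and the counterexample `squared_gap` are all made-up names, not part of the class material.

```python
from itertools import product
from math import dist, isclose  # dist needs Python 3.8+

def D(a, b):
    """Assumed distance function: plain Euclidean distance."""
    return dist(a, b)

def satisfies_metric_properties(d, objects, tol=1e-9):
    """Check the four listed properties on every combination of the objects."""
    for a, b in product(objects, repeat=2):
        # Symmetry: D(A,B) = D(B,A)
        if not isclose(d(a, b), d(b, a), abs_tol=tol):
            return False
        # Self-similarity and positivity: D(A,B) = 0 exactly when A = B
        if (d(a, b) <= tol) != (a == b):
            return False
    for a, b, c in product(objects, repeat=3):
        # Triangular inequality: D(A,B) <= D(A,C) + D(B,C)
        if d(a, b) > d(a, c) + d(b, c) + tol:
            return False
    return True

def squared_gap(a, b):
    """Squared difference of two numbers; looks like a distance but is not one."""
    return (a - b) ** 2

# Hypothetical sample objects.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (1.0, -2.0)]

print(satisfies_metric_properties(D, points))
print(satisfies_metric_properties(squared_gap, [0.0, 1.0, 2.0]))
```

The Euclidean distance passes all four checks on these points. The squared difference is symmetric and is zero only for equal inputs, but it fails the triangular inequality (for 0, 1 and 2 it gives 4 on the left and 1 + 1 on the right), so it does not qualify as a distance in the sense defined above. A passing check on a few samples is of course only evidence, not a proof.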

Group : Marketing Group 5
Author of the Article: Navdeep Kumar
Roll no : 13150