Monday, 29 August 2011

CLUSTERING MADE EASY


Ever since we started with Business Analytics, a lot of emphasis has been placed on the clustering technique. I went through a number of websites to understand it, and the simplest definition I came across is:


“Organizing data into classes such that there is

  • High intra-class similarity
  • Low inter-class similarity”

It is sometimes called classification by statisticians and segmentation by marketers.
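To make the definition concrete, here is a minimal sketch in Python. The points and the two classes are hypothetical values made up for illustration, and "similarity" is assumed to mean small Euclidean distance. It compares the average distance inside each class with the average distance between the two classes.

```python
from itertools import combinations, product
from math import dist

# Hypothetical 2-D points, already split into two classes for illustration.
class_a = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5)]
class_b = [(8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]


def mean_intra_distance(points):
    """Average distance between every pair of points within one class."""
    pairs = list(combinations(points, 2))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


def mean_inter_distance(first, second):
    """Average distance between points taken from two different classes."""
    pairs = list(product(first, second))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


intra_a = mean_intra_distance(class_a)
intra_b = mean_intra_distance(class_b)
inter_ab = mean_inter_distance(class_a, class_b)

print(f"mean distance inside class A: {intra_a:.2f}")
print(f"mean distance inside class B: {intra_b:.2f}")
print(f"mean distance between A and B: {inter_ab:.2f}")
```

If the grouping is good, the two "inside" numbers should be much smaller than the "between" number, which is just another way of saying high intra-class similarity and low inter-class similarity.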


Let’s try to understand this with the help of the following illustrations.


What is a natural grouping among these objects?




Clustering is Subjective 
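One way to see why clustering is subjective is to group the same objects by two different attributes. The sketch below uses hypothetical customer records (the names, cities and occupations are invented for illustration) and a small helper, `group_by`, that is my own assumption rather than part of any library.

```python
from collections import defaultdict

# Hypothetical records; every value here is made up for illustration.
customers = [
    {"name": "Asha", "city": "Delhi", "occupation": "student"},
    {"name": "Ravi", "city": "Delhi", "occupation": "engineer"},
    {"name": "Meera", "city": "Mumbai", "occupation": "student"},
    {"name": "Karan", "city": "Mumbai", "occupation": "engineer"},
]


def group_by(records, key):
    """Group record names by the value of one chosen attribute."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["name"])
    return dict(groups)


by_city = group_by(customers, "city")
by_occupation = group_by(customers, "occupation")

print("grouped by city:", by_city)
print("grouped by occupation:", by_occupation)
```

Both groupings are reasonable, yet they put different people together. Which one counts as "natural" depends on what we choose to compare.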



While studying clustering, we also came across the concept of distance measurement.

Definition: Let O1 and O2 be two objects from the universe of possible objects. The distance (dissimilarity) between O1 and O2 is a real number denoted by D(O1,O2).
E.g. let’s try to figure out the dissimilarity between the following objects.



The black box above is a function with the following properties:
  • D(A,B) = D(B,A)         Symmetry
  • D(A,A) = 0         Constancy of Self-Similarity
  • D(A,B) = 0 if and only if A = B         Positivity (Separation)
  • D(A,B) ≤ D(A,C) + D(B,C)         Triangle Inequality
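The four properties above can be checked numerically for any candidate distance function. The following is a rough sketch, assuming the objects are 2-D points and D is the ordinary Euclidean distance. The points, the helper name `satisfies_distance_properties` and the tolerance are my own choices for illustration.

```python
from itertools import product
from math import dist, isclose

# Hypothetical objects, represented as 2-D points for illustration.
objects = {"A": (0.0, 0.0), "B": (3.0, 4.0), "C": (6.0, 0.0)}


def D(x, y):
    """Assumed distance function: Euclidean distance between two points."""
    return dist(x, y)


def satisfies_distance_properties(distance, points, tol=1e-9):
    """Check the four listed properties on every combination of the given points."""
    for a, b in product(points, repeat=2):
        # Symmetry: D(A,B) = D(B,A)
        if not isclose(distance(a, b), distance(b, a), abs_tol=tol):
            return False
        # Self-similarity and positivity: D(A,B) = 0 exactly when A = B
        if (distance(a, b) <= tol) != (a == b):
            return False
    for a, b, c in product(points, repeat=3):
        # Triangle inequality: D(A,B) <= D(A,C) + D(B,C)
        if distance(a, b) > distance(a, c) + distance(b, c) + tol:
            return False
    return True


print(satisfies_distance_properties(D, list(objects.values())))


def squared_difference(x, y):
    """A dissimilarity on plain numbers that breaks the triangle inequality."""
    return (x - y) ** 2


print(satisfies_distance_properties(squared_difference, [0.0, 1.0, 2.0]))
```

The squared difference is included as a contrast: it is symmetric and zero only for identical values, but going from 0 to 2 directly costs 4 while going through 1 costs only 1 + 1 = 2, so the last property fails.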

Group : Marketing Group 5
Author of the Article: Navdeep Kumar
Roll no : 13150

