## Monday, 29 August 2011

### Cluster Analysis Helps Link Diapers with Alcohol!! (Cluster Analysis + Data Mining + Retail)

After going through 6 marathon lectures on business analytics, most of which focussed on cluster analysis and its applications, I was reminded of one of the subjects I studied back in my engineering days: data mining and warehousing. I studied loads of other concepts under the same topic, but ROLAP and MOLAP were the ones I found most closely related to cluster analysis. Both of them work on queries to generate the desired results as per the requirements of the user. The user may be an engineer or someone working for a retail company.

Where I found data mining to be very closely linked to cluster analysis was in the way it works. A highly specialised field with extremely complex algorithms, it normally deals with large amounts of data, none of which makes any sense at first glance. There is also rarely any fixed agenda when, say, the data of a retail outlet is studied via data mining. What it aims to achieve is to identify relationships between attributes which can be taken advantage of to boost sales and improve the operational efficiency of the retail outlets. As an example, in a study conducted in a retail store chain in the USA, it was reportedly found that on Saturdays married couples purchased beer and diapers together in a lot of cases. The relationship was not known beforehand and not understood at first, but when further analysed, it was found that these young married couples drink a lot on the weekend and know they would forget to attend to their babies, as a result of which they buy diapers as a precautionary measure. Now how would any algorithm identify such a pattern from a raw set of data? The answer is cluster analysis, which I understood in these past few lectures. Data is fed into processing systems, wherein the algorithm (which works on cluster analysis techniques) groups together various attributes (via dendrograms). It does so across the possible combinations of the attributes until it identifies clusters which show a relationship or which depend strongly on each other.
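To make the grouping step a little more concrete, here is a minimal sketch in Python of how products might be clustered by how often they are bought together. Everything in it is an assumption for illustration: the baskets are invented toy data (not from the study mentioned above), and the names `BASKETS`, `jaccard_distance` and `agglomerate` are hypothetical. It uses single-linkage agglomerative clustering with a Jaccard distance, which is just one of several ways to build the merge order that a dendrogram draws.

```python
from itertools import combinations

# Hypothetical baskets, invented purely for illustration (not real store data).
BASKETS = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"beer", "diapers", "milk"},
    {"bread", "eggs"},
]


def jaccard_distance(a, b, baskets):
    """1 - (baskets containing both items) / (baskets containing either item)."""
    both = sum(1 for basket in baskets if a in basket and b in basket)
    either = sum(1 for basket in baskets if a in basket or b in basket)
    return 1.0 - both / either if either else 1.0


def agglomerate(baskets):
    """Single-linkage agglomerative clustering of products.

    Returns the merge history as (left_cluster, right_cluster, distance)
    tuples, which is the information a dendrogram displays.
    """
    items = sorted(set().union(*baskets))
    clusters = [frozenset([item]) for item in items]
    merges = []
    while len(clusters) > 1:
        # Find the closest pair of clusters: under single linkage, the
        # distance between two clusters is that of their closest members.
        best = None
        for c1, c2 in combinations(clusters, 2):
            d = min(jaccard_distance(a, b, baskets) for a in c1 for b in c2)
            if best is None or d < best[0]:
                best = (d, c1, c2)
        d, c1, c2 = best
        # Replace the two clusters with their union and record the merge.
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
        merges.append((sorted(c1), sorted(c2), round(d, 3)))
    return merges


merges = agglomerate(BASKETS)
for left, right, dist in merges:
    print(left, "+", right, "at distance", dist)
```

In this toy data beer and diapers appear in exactly the same baskets, so they are the first pair to be merged, at distance 0. On real sales data the distances would be far less clean, and the choice of distance measure and linkage rule would need some thought.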

Once such relations are identified for a store chain in the retail sector, they can be used to plan customer schemes that boost sales for the same set of products. They also help in re-laying out the store so that products which sell together, and sell a lot, are kept together. They also help identify non-complementary products which have a strong tendency to be sold together for whatever reason. Thus there are numerous applications of cluster analysis, be it for the retail industry or for medical purposes, where genes and DNA are studied. What matters in the end is how an analyst interprets the results and implements them commercially for the benefit of all. Merely finding clusters is not the end of the job. The most important part is to give the results shape and form so that they can be understood by all.

Thanks !!

Submitted by:-

Kartik Arora

Roll No. 13140

Marketing

Group 4