## Monday, 29 August 2011

### Football & Clusters !

First day of Business Analytics, not my first day with SPSS, but it was a refreshing way to refresh some of the knowledge imparted to us in the last semester. Heavy usage of the word refresh. Anyway, the moment I opened the file, it came as a small shock. For someone used to working with four or five variables at most in Microsoft Excel, this was a rude awakening. A very rude awakening.

Anyway, let me first brief anyone new to the world of analytics on what clustering is all about. Clustering is all about dividing data, or a set of observations, into homogeneous groups: groups whose members are similar to each other. Not in every respect, but in most respects. For example, all the females of the world can form one cluster and all the males could form another. Valid examples, these two. Thought provoking too.

Let's proceed. So a cluster is basically a set of observations that can be bundled together based on their similarity. Now, while there are many ways of clustering data, there are only a couple of methods that one would most probably reach for.

Hierarchical Clustering: As the name suggests, the clusters are arranged in a hierarchy, with small clusters nested inside bigger ones, like a family tree. There are two different ways of building this hierarchy:

· Agglomerative Clustering for dummies:

§ Obtain your set of data

§ Treat every single observation as its own tiny cluster

§ Merge the two clusters that are most similar (closest) to each other

§ Repeat this until you have one final cluster, the parent of all the other clusters

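· Divisive Clustering for dummies:

§ Obtain your set of data

§ Do the exact opposite of agglomerative clustering: start with everything in one big cluster and keep splitting it into smaller, more similar groups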

§ Congratulate yourself on a job well done !
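For anyone who likes to see it in code, here is a minimal sketch of the agglomerative (bottom-up) idea in plain Python. It is only an illustration under my own assumptions: the data is one-dimensional, "similarity" is taken to be the distance between the closest members of two clusters (single linkage), and the ages are made up. SPSS offers other distance and linkage choices and does not necessarily work like this.

```python
from itertools import combinations

def single_linkage(a, b):
    """Distance between two clusters = distance between their closest members."""
    return min(abs(x - y) for x in a for y in b)

def agglomerate(points):
    """Merge the two closest clusters until one remains; return every merge made."""
    clusters = [[p] for p in points]  # every observation starts as its own cluster
    history = []
    while len(clusters) > 1:
        # find the pair of clusters that are closest to each other
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: single_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = sorted(clusters[i] + clusters[j])
        history.append(merged)
        # replace the two old clusters with the merged one
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return history

ages = [18, 21, 22, 30, 34, 36]  # made-up player ages, purely illustrative
merges = agglomerate(ages)
for step, cluster in enumerate(merges, start=1):
    print(step, cluster)
```

The last merge printed is the "parent of all the other clusters", and reading the merges from bottom to top gives the divisive view of the same hierarchy.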

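K-Means Clustering: An often used way of partitioning data into a chosen number (k) of clusters. It is an iterative process: k cluster centres (centroids) are initially picked at random, every observation is assigned to its nearest centroid, and each centroid is then recomputed as the mean of the observations assigned to it. This is repeated till the assignments stop changing.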
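The k-means loop described above can be sketched in a few lines of Python. Again, this is a rough sketch under my own assumptions, not what SPSS does internally: the data is one-dimensional, the starting centroids are simply k randomly chosen observations, and the function name, the `seed` parameter and the numbers are all made up for illustration.

```python
import random

def k_means(points, k, seed=0, max_iter=100):
    """Plain 1-D k-means: assign to nearest centroid, recompute means, repeat."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random starting centroids
    clusters = []
    for _ in range(max_iter):
        # assignment step: each point joins the cluster of its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: abs(p - centroids[c]))
            clusters[nearest].append(p)
        # update step: each centroid moves to the mean of its cluster
        # (an empty cluster keeps its old centroid)
        new_centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:  # nothing moved, so we have converged
            break
        centroids = new_centroids
    return centroids, clusters

goals = [1, 2, 3, 10, 11, 12]  # made-up goals per season for six strikers
centroids, clusters = k_means(goals, k=2)
print(centroids, clusters)
```

With data as cleanly separated as this, the low scorers and the high scorers end up in separate clusters; on messier data the result can depend on the random starting centroids.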

Now, how do I relate this to football? Well, while I was watching the football match this evening (my passion), my friends and I wondered how clustering takes place in football. (After-effects of BA class I guess. Or maybe because there were girls watching the football match for a change.) Anyhow, here is how I perceive it. Suppose you were to step into the shoes of a manager of a professional football team. How would clustering help you? Here's how.

What exactly do you need:

• Defenders
• Midfielders
• Attackers
• And a Goalkeeper of course !

Well, there are tons of football players out there. How would you choose the best? By Divisive Clustering of course. You put all the defenders of the world into one cluster. You then keep splitting it into smaller groups based on your requirements and your criteria, till you are left with the group you want for your dream team.
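Here is a toy sketch of that manager's whittling-down in Python. Everything in it is an assumption made for illustration: the ratings are invented, there is only one criterion (a single rating number), and a cluster is split at the widest gap between neighbouring ratings, which is just one simple way of doing a divisive split.

```python
def split_at_largest_gap(cluster):
    """Split a sorted cluster in two at the widest gap between neighbours."""
    gaps = [cluster[i + 1] - cluster[i] for i in range(len(cluster) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return cluster[:cut], cluster[cut:]

def divide(values, n_clusters):
    """Top-down clustering: start with one cluster and keep splitting."""
    clusters = [sorted(values)]  # everyone starts in one big cluster
    while len(clusters) < n_clusters:
        splittable = [c for c in clusters if len(c) > 1]  # singletons cannot be split
        if not splittable:
            break
        # split the cluster that contains the widest internal gap
        target = max(splittable, key=lambda c: max(b - a for a, b in zip(c, c[1:])))
        clusters.remove(target)
        clusters.extend(split_at_largest_gap(target))
    return sorted(clusters)

ratings = [62, 64, 65, 78, 80, 91, 93]  # made-up defender ratings
tiers = divide(ratings, 3)
print(tiers)
```

The top tier is then the shortlist for the dream team, and the same splitting could be repeated for midfielders, attackers and goalkeepers.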

It's too late for me to think of Agglomerative Clustering and its examples so I'll just leave you to figure that out for yourself.

Enough from me for now. More gyaan later !

Group Name: Finance 2

Author: Kshitij Sharma