Sunday, 28 August 2011

Business Analytics at a Glance

Business analytics (BA) refers to the skills, technologies, applications and practices for continuous iterative exploration and investigation of past business performance to gain insight and drive business planning. Business analytics focuses on developing new insights and understanding of business performance based on data and statistical methods.

Business analytics makes extensive use of data, statistical and quantitative analysis, explanatory and predictive modeling, and fact-based management to drive decision making. Analytics may be used as input for human decisions or may drive fully automated decisions. Business intelligence, by contrast, consists of querying, reporting, OLAP, and alerts. These tools answer questions such as what happened, how many, how often, where the problem is, and what actions are needed. Business analytics answers questions such as why this is happening, what happens if these trends continue, what will happen next (that is, predict), and what is the best that can happen (that is, optimize).

Clustering in a Nutshell

The term cluster analysis does not identify a particular statistical method or model, as do discriminant analysis, factor analysis, and regression. We often don’t have to make any assumptions about the underlying distribution of the data. Using cluster analysis, we can also form groups of related variables, similar to what we do in factor analysis.

There are numerous ways to sort cases into groups. The choice of method depends on, among other things, the size of the data file: methods commonly used for small data sets are impractical for data files with thousands of cases. SPSS has three procedures that can be used to cluster data: hierarchical cluster analysis, k-means cluster, and two-step cluster.

If we have a large data file (even 1,000 cases is large for clustering) or a mixture of continuous and categorical variables, we should use the SPSS two-step procedure. If we have a small data set and want to easily examine solutions with increasing numbers of clusters, we may use hierarchical clustering. If we know how many clusters we want and we have a moderately sized data set, we can use k-means clustering.

We will cluster three different sets of data using the three SPSS procedures. We will use a hierarchical algorithm to cluster figure-skating judges in the 2002 Olympic Games. We will use k-means clustering to study the metal composition of Roman pottery. Finally, we will cluster the participants in the 2002 General Social Survey using the two-step clustering algorithm, looking for homogeneous clusters based on education, age, income, gender, and region of the country.
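The rules of thumb above for choosing among the three SPSS procedures can be written down as a small decision function. This is only an illustrative sketch in Python, not part of SPSS: the function name `suggest_procedure`, its parameters, and the use of 1,000 cases as a hard cutoff are assumptions made for the example (the text only says that even 1,000 cases is large for clustering).

```python
def suggest_procedure(n_cases, has_categorical=False, known_k=False,
                      large_threshold=1000):
    """Return the SPSS procedure that the rules of thumb point to.

    All names and the default threshold are hypothetical, chosen for
    illustration; treat the result as a starting point, not a rule.
    """
    # Large files, or a mix of continuous and categorical variables,
    # point to the two-step procedure.
    if n_cases >= large_threshold or has_categorical:
        return "two-step cluster"
    # A moderately sized file with a known number of clusters
    # points to k-means.
    if known_k:
        return "k-means cluster"
    # Otherwise: a small file where we want to compare solutions
    # with increasing numbers of clusters.
    return "hierarchical cluster analysis"


print(suggest_procedure(5000))               # large data file
print(suggest_procedure(500, known_k=True))  # number of clusters known
print(suggest_procedure(20))                 # small, exploratory
```

In practice the boundaries between "small", "moderate", and "large" are a matter of judgment, so the threshold is exposed as a parameter rather than fixed.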

Hierarchical Clustering

There are numerous ways in which clusters can be formed. Hierarchical clustering is one of the most straightforward methods. It can be either agglomerative or divisive.

Agglomerative hierarchical clustering begins with every case being a cluster unto itself. At successive steps, similar clusters are merged. The algorithm ends with everybody in one jolly, but useless, cluster. Divisive clustering starts with everybody in one cluster and ends up with everyone in individual clusters. Obviously, neither the first step nor the last step is a worthwhile solution with either method.

In agglomerative clustering, once a cluster is formed, it cannot be split; it can only be combined with other clusters. Cases cannot separate from clusters that they’ve joined: once in a cluster, always in that cluster. To form clusters using a hierarchical cluster analysis, we must select:

- A criterion for determining similarity or distance between cases
- A criterion for determining which clusters are merged at successive steps
- The number of clusters we need to represent the data
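The three choices above map directly onto the parameters of a minimal agglomerative clustering routine. The sketch below is plain Python, not the SPSS implementation. It assumes Euclidean distance as the distance criterion and offers single or complete linkage as the merging criterion; the function names and the sample points are made up for illustration.

```python
import math


def euclidean(a, b):
    # Choice 1: the distance criterion between two cases.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_distance(c1, c2, points, linkage):
    # Choice 2: the merging criterion. Single linkage uses the closest
    # pair of cases, complete linkage the farthest pair.
    dists = [euclidean(points[i], points[j]) for i in c1 for j in c2]
    return min(dists) if linkage == "single" else max(dists)


def agglomerate(points, n_clusters, linkage="single"):
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    # Every case starts as a cluster unto itself.
    clusters = [[i] for i in range(len(points))]
    # Choice 3: stop once the requested number of clusters remains.
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = cluster_distance(clusters[a], clusters[b], points, linkage)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        # Merge the two closest clusters; merged clusters are never split.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Made-up sample data: two tight pairs and one isolated case.
points = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.1), (5.2, 4.9), (9.0, 1.0)]
result = agglomerate(points, n_clusters=3)
print(result)  # [[0, 1], [2, 3], [4]]
```

The search for the closest pair is recomputed at every step, which is fine for a handful of cases but is exactly why hierarchical methods become impractical for large data files.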

References :

Group – Marketing 3

Author of the Article – Ritesh Chaddha (13094)
