## Sunday, 28 August 2011

### The devil in the data...

An introduction to data extraction, interpretation and analysis marks the beginning of our business analytics course.

The Story untold

There used to be a store in a city where people went to buy their daily needs. Over the years the shop became a prominent place for customer satisfaction, meeting all the needs of the customers in the locality. During these years the shopkeeper's responsibilities increased, and the business was no longer generating enough to meet his daily needs. As I am a relative of the shopkeeper, he asked me to help him improve his business. I looked at his records and could easily make out that his fixed costs, as a proportion of total assets, were high, so the sales he was generating were not enough to cover his interest payments, daily expenses and so on. The business case therefore became: increase the sales of the shop.

In line with the case, I began defining the variables that effectively capture the dependent variable (sales). The variables I collected for the case are: the number of bills generated by the store every day (the estimated frequency of bills generated for different products), the age groups of customers, the time of day when maximum sales happen, the product assortment, and the payment terms or credit period (the estimated frequency of credit being given: every other day).

Null Hypothesis: There is no relationship between the sales made in a day and the number of days of credit given to customers.

Sales (measured in rupees) are continuous, interval-scale data with a fixed interval of 1 paisa, and the credit period is the length of time between the sale and the payment.

To verify the hypothesis I cross-tabulated the amount of sales against the credit period and found a positive correlation between the amount of sales and the credit period given. The Pearson chi-square test, which is used on two-way tables to test goodness of fit and independence, gave a statistic of 67 and a significance of 0.08 with 20 degrees of freedom (6 rows and 5 columns, so (6 - 1) × (5 - 1) = 20).
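To make the chi-square calculation concrete, here is a minimal sketch in plain Python of the Pearson statistic and its degrees of freedom for a two-way table. The function name `chi_square_independence` and the counts in `observed` are made up for illustration; they are not the shop's data.

```python
def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed_count in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed_count - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts: rows = sales bands, columns = credit-period bands.
observed = [
    [10, 20],
    [20, 10],
]
stat, dof = chi_square_independence(observed)
print(round(stat, 3), dof)
```

The same function applied to any table with 6 rows and 5 columns returns 20 degrees of freedom, which matches the figure quoted above.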

Since the test duration and the number of observations were small, I instead calculated Fisher's exact test, which is used in the analysis of contingency tables when sample sizes are small.
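As an illustration of how Fisher's exact test works, the sketch below computes the two-sided p-value for the simplest case, a 2 x 2 table, using only the standard library. This is an assumption made for brevity: the table in this post has 6 rows and 5 columns, which needs the generalised version of the test found in statistical packages. The function name `fisher_exact_2x2` and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denominator = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)

    # Add up every table that is no more likely than the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )


# Hypothetical counts: rows = credit given / not given,
# columns = high-sales day / low-sales day.
p_value = fisher_exact_2x2(3, 1, 1, 3)
print(round(p_value, 4))
```

Because the p-value is computed exactly from the hypergeometric distribution rather than from a large-sample approximation, it stays valid when cell counts are small.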

Once the cross-tabulation was done, I refined the data analysis by dividing products into fast-moving, medium-moving and slow-moving categories, examined the correlation of each category with the credit period, and formed clusters of SKUs. In forming the clusters of the product assortment I took the mode as the measure of distance between clusters. This helped me suggest a better arrangement in the store, i.e. fast-moving goods on top and slow-moving goods at the bottom, as well as faster procurement and better inventory management. Overall, this gave a better idea of how to optimize the store's operations and credit policy and maximize the sales of the shop.
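The fast, medium and slow split can be sketched very simply by ranking SKUs on units sold and cutting the ranking into thirds. This is only an illustrative stand-in, not the mode-based clustering described above; the function name `classify_by_movement`, the equal-thirds rule and the SKU figures are all assumptions.

```python
def classify_by_movement(units_sold):
    """Split SKUs into fast, medium and slow movers by sales rank (equal thirds)."""
    # Highest sellers first.
    ranked = sorted(units_sold, key=units_sold.get, reverse=True)
    third = -(-len(ranked) // 3)  # ceiling division
    return {
        "fast": ranked[:third],
        "medium": ranked[third:2 * third],
        "slow": ranked[2 * third:],
    }


# Hypothetical weekly unit sales per SKU.
weekly_units = {
    "rice": 120,
    "soap": 45,
    "milk": 200,
    "batteries": 8,
    "biscuits": 60,
    "candles": 5,
}
for movement, skus in classify_by_movement(weekly_units).items():
    print(movement, skus)
```

A grouping like this is enough to decide shelf placement, with the "fast" group going on the top shelves.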

Posted By: Avinash Thatikonda

Group 2_Operations