## Sunday, 28 August 2011

### An afterthought about Crosstabs ....

Reducing the complexity of data is of paramount importance. Data is available in abundance, but it has to be sorted and brought into a form in which it can be used.

Analyzing data from online surveys is the toughest part of the game.

Frequency analysis provides a broad answer to the questions put across in the survey. Some basic questions include:

What % of the people who gave responses are below 18 years old?

What is the average height of children who will test the product sample?

What is the average age of the people using public transport?
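Questions of this kind reduce to counting and averaging over one variable at a time. The following is a minimal sketch in Python, assuming each response is stored as a dictionary; the field names `age` and `uses_public_transport` and all the values are hypothetical and used only for illustration.

```python
from statistics import mean

# Hypothetical survey responses: one dict per respondent (illustrative data only).
responses = [
    {"age": 15, "uses_public_transport": True},
    {"age": 17, "uses_public_transport": False},
    {"age": 22, "uses_public_transport": True},
    {"age": 34, "uses_public_transport": True},
    {"age": 41, "uses_public_transport": False},
]

# Share of respondents below 18 years old.
under_18 = sum(1 for r in responses if r["age"] < 18)
pct_under_18 = 100 * under_18 / len(responses)

# Average age of the respondents who use public transport.
avg_age_transport = mean(r["age"] for r in responses if r["uses_public_transport"])

print(f"Below 18: {pct_under_18:.1f}%")
print(f"Average age of public transport users: {avg_age_transport:.1f}")
```

Each figure here describes a single variable on its own, which is what separates frequency analysis from the cross tabulation discussed next.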

Cross tabulation analysis or crosstab gives greater meaning to the data. Some basic questions include:

What % of males took the test in the last year?

Are school children more likely to go to the new ice cream parlour than grown-ups?

What % of cricketers tested positive for the banned drug in the last series?

What is a Cross Tab Analysis?
- A cross tabulation analysis is useful for showing how respondents answered on two or more questions at the same time.
- Shows a distribution between two variables
- Usually presented as a matrix in the form of a table

Why Would You Use It?
- Easy to understand and draw quick conclusions
- Tables can provide greater insight than single statistics.
- They are simple to create with Checkbox Survey.

As a basic rule, the control group or independent variable (such as age, gender, education, etc.) is placed on the X-axis, while the dependent variable or group under study is placed on the Y-axis.
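The layout rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paired answers are made up, and the helper `column_percent` is a hypothetical name, not part of any survey tool. The independent variable (age group) runs across the columns and each cell is expressed as a percentage of its column.

```python
from collections import Counter

# Hypothetical paired answers: (age group, reads publications online?).
answers = [
    ("20-29", "yes"), ("20-29", "yes"), ("20-29", "no"),
    ("30-39", "yes"), ("30-39", "no"),
    ("40-49", "no"), ("40-49", "no"), ("40-49", "yes"),
]

# One count per cell: how many respondents gave each combination of answers.
cell_counts = Counter(answers)
# Number of respondents in each age group: the column totals.
column_totals = Counter(group for group, _ in answers)

columns = sorted(column_totals)                  # independent variable across the top
rows = sorted({reply for _, reply in answers})   # dependent variable down the side


def column_percent(group, reply):
    """Percentage of one age group that gave a particular reply."""
    return 100 * cell_counts[(group, reply)] / column_totals[group]


# Print the table: header row first, then one row per reply.
print(f"{'':<6}" + "".join(f"{group:>8}" for group in columns))
for reply in rows:
    cells = "".join(f"{column_percent(group, reply):>7.0f}%" for group in columns)
    print(f"{reply:<6}{cells}")
```

Percentaging within each column keeps age groups of different sizes comparable, which is why the independent variable is usually the one that defines the columns.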

Example Cross Tab.

| | Ages 20-29 | Ages 30-39 | Ages 40-49 |
|---|---|---|---|
| Read publications online | 84% | 61% | 36% |
| Pay bills online | 65% | 42% | 28% |
| 4+ hrs daily on Internet | 87% | 68% | 47% |

To make a cross tab within CHECKBOX, you need a few completed survey responses. Navigate to the Reports Manager and auto-generate the report to start. Then add a "Cross Tab" item to that report and pick the two questions that you want to cross tabulate. Run the report and you're done. Add a style sheet if you want.

Cross-tabs or cross tabulation is a quantitative research method appropriate for analyzing the relationship between two or more variables. Data about the variables is recorded in a table or matrix. A sample is used to gather information about the variables. The most common type of data collected in cross tabulation is a count of occurrences of the variables. This count or number is referred to as the frequency. The matrix used to show the frequency of the occurrences of the variables being studied is called a frequency distribution. A matrix is used to show and analyze frequencies for a particular group or designation.

Cross Tabulation Provides Structure for Quantitative Data.

Raw data is easier to manage and understand when it has structure. Tables permit data about variables to be organized. These tables are often called contingency tables. Contingency refers to the possibility that a relationship exists. Variables describe an attribute of a person, group, place, thing, or idea. Variables can be either categorical (qualitative) or quantitative. Categorical variables are descriptive, often indicating something about the group from which the data is derived. Examples of categorical variables are attribute labels or names.

Study One Variable with Cross-Tabs

Researchers refer to frequency tables by names that indicate number or arrangement of variables that are being studied. A univariate frequency table shows data about one variable. Often the data in a univariate table is put into groups that consist of a range of values or designations that have been given a value or rank. The ranks are then put into order. An example of univariate data would be the frequency at which students earn grade points and fall into the A, B, C or 4.0, 3.5, 3.0 categories for a college course.
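As a rough illustration of the grade-point example, a univariate frequency table can be built by counting how often each value occurs and then listing the categories in rank order. The grade points below are invented for the sketch.

```python
from collections import Counter

# Hypothetical grade points earned by students in one college course.
grade_points = [4.0, 3.5, 3.0, 4.0, 3.5, 3.5, 3.0, 4.0, 3.5]

# Univariate frequency table: how often each value of the single variable occurs.
frequency = Counter(grade_points)

# List the categories in rank order, highest grade first.
for grade in sorted(frequency, reverse=True):
    share = 100 * frequency[grade] / len(grade_points)
    print(f"{grade:.1f}  count={frequency[grade]}  share={share:.1f}%")
```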

Study Multiple Variables with Cross-Tabs

When a frequency table shows data for more than one variable, it is called a joint or bivariate contingency table. Bivariate frequency tables often show data in a two-way arrangement. An example of bivariate data would be the frequency with which people from different regions of the country (north, south, east or west) select crunchy snack bars or chewy snack bars.
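The snack bar example can be sketched as a two-way table with totals in the margins. The responses are hypothetical, and the variable names are chosen only for this illustration.

```python
from collections import Counter

# Hypothetical responses: (region, preferred snack bar).
choices = [
    ("north", "crunchy"), ("north", "chewy"), ("north", "crunchy"),
    ("south", "chewy"), ("south", "chewy"),
    ("east", "crunchy"), ("east", "chewy"),
    ("west", "crunchy"), ("west", "crunchy"), ("west", "chewy"),
]

regions = ["north", "south", "east", "west"]
snacks = ["crunchy", "chewy"]

cells = Counter(choices)                                # joint frequencies, one per cell
row_totals = Counter(region for region, _ in choices)   # marginal totals per region
col_totals = Counter(snack for _, snack in choices)     # marginal totals per snack type

# Header, one row per region, then the column totals.
print(f"{'region':<8}" + "".join(f"{s:>9}" for s in snacks) + f"{'total':>9}")
for region in regions:
    counts = "".join(f"{cells[(region, s)]:>9}" for s in snacks)
    print(f"{region:<8}{counts}{row_totals[region]:>9}")
print(f"{'total':<8}" + "".join(f"{col_totals[s]:>9}" for s in snacks) + f"{len(choices):>9}")
```

A combination that nobody chose, such as crunchy bars in the south in this made-up data, simply shows up as a zero cell.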

Quantitative variables fall into one of two types: discrete or continuous. Discrete variables take only separate, countable values, typically whole numbers such as 0, 1, 2 and so on. Continuous variables can take any value between the permitted or agreed-upon minimum and maximum of a range. As a general rule, the two variable types are not used together in the same frequency distribution.
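A continuous variable is usually grouped into ranges before it is counted in a frequency table. The sketch below shows one way to do that; the heights, the interval boundaries and the helper `bin_label` are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical continuous measurements: children's heights in centimetres.
heights_cm = [101.5, 118.2, 124.9, 125.0, 133.7, 139.9, 142.3]

# Class intervals as (lower bound inclusive, upper bound exclusive).
bins = [(100, 115), (115, 130), (130, 145)]


def bin_label(value):
    """Return the label of the interval containing value, or None if it fits no interval."""
    for low, high in bins:
        if low <= value < high:
            return f"{low}-{high}"
    return None


# Count how many measurements fall into each interval.
binned = Counter(bin_label(h) for h in heights_cm)
for low, high in bins:
    label = f"{low}-{high}"
    print(f"{label:>8} cm: {binned[label]}")
```

Once binned, the continuous variable behaves like a categorical one and can be cross tabulated in the same way as any other question.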

Cross-Tabs Permit Comparisons of Frequency Distributions

Data from frequency distributions can also be shown in a visual way, as in a graph. Distributions are compared by looking at four features of the data: centre, spread, shape, and irregularities. Centre refers to the point that splits the data so that half of it falls on either side. Spread refers to the variability of the data, with a wide spread indicating greater variability than a narrow spread. Shape refers to the symmetry, skewness, or peaks and valleys of the distribution. Irregularities refer to gaps or outliers in the data pattern.
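The four features can be approximated numerically with Python's `statistics` module. This is a sketch under assumptions: the ages are invented, the comparison of mean and median is only a rough indicator of skew, and the 1.5 x interquartile-range rule is one common convention for flagging outliers rather than the only one.

```python
from statistics import mean, median, pstdev, quantiles

# Hypothetical ages from a survey; 83 is included as a deliberate outlier.
ages = [21, 23, 24, 25, 26, 27, 29, 30, 32, 83]

# Centre: half of the values fall on either side of the median.
centre = median(ages)

# Spread: population standard deviation as a measure of variability.
spread = pstdev(ages)

# Shape: a mean above the median hints at a longer tail towards high values.
right_skewed = mean(ages) > centre

# Irregularities: flag values more than 1.5 interquartile ranges beyond the quartiles.
q1, _, q3 = quantiles(ages, n=4)
iqr = q3 - q1
outliers = [a for a in ages if a < q1 - 1.5 * iqr or a > q3 + 1.5 * iqr]

print(f"centre={centre}, spread={spread:.1f}, right_skewed={right_skewed}, outliers={outliers}")
```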

A crosstab differs from a frequency distribution because the latter shows the distribution of one variable only. In a cross table, each cell shows the number of respondents who gave a particular combination of replies. An example of cross tabulation would be a 3 x 2 contingency table. One variable would be age group, with three age ranges: 10-30, 31-50, and 51-up. The other variable would be the choice between Ray Ban and Fast Track shades. With a crosstab, it would be easy for a company to see which shades each of the three age groups chooses. For instance, the table might show that 45% of those aged 31-50 prefer Ray Ban, while only 10% of those aged 51-up prefer Fast Track. With this information, either company can plan moves that benefit its business.

Cross tabulations are popular for statistical reporting because they are simple and laid out in a clear format. They can be used with any level of data, whether nominal, ordinal, interval or ratio, because a crosstab treats all of them as nominal data. Crosstab tables give more detailed insight than a single statistic in a simple way, and grouping values into broader categories can ease the problem of empty or sparse cells.

Cross tabulations, or cross tabs, are a good way to compare two subgroups of information. Cross tabs allow you to compare data from two questions to determine if there is a relationship between them. Like frequency tables, cross tabs appear as a table of data showing answers to one question as a series of rows and answers to another question as a series of columns.

| Base Question | Female | Male |
|---|---|---|
| Product Manager | 57.2% | 53.4% |
| Director | 12.6% | 14.2% |
| Product Marketing Manager | 24.7% | 23.1% |
| Program Manager | 2.8% | 1.5% |
| Technical Product Manager | 2.8% | 7.7% |
| Total Counts | 215 | 337 |

Cross tabs are used most frequently to look at answers to a question among various demographic groups. The intersections of the columns and rows, commonly called cells, hold the percentage of people in each group who gave each response. In the example above, females and males had a relatively similar distribution among the various job titles, with the exception of the title of "Technical Product Manager", where the share of males holding the title (7.7%) was more than 2.5 times that of females (2.8%). For analysis purposes, cross tabs are a great way to do comparisons.
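The comparison can be checked directly from the job-title table. In the sketch below, the percentages and total counts are copied from that table; the head counts are only approximate reconstructions, since the published percentages are rounded.

```python
# Column percentages and total counts copied from the job-title table above.
table = {
    "Product Manager": {"Female": 57.2, "Male": 53.4},
    "Director": {"Female": 12.6, "Male": 14.2},
    "Product Marketing Manager": {"Female": 24.7, "Male": 23.1},
    "Program Manager": {"Female": 2.8, "Male": 1.5},
    "Technical Product Manager": {"Female": 2.8, "Male": 7.7},
}
total_counts = {"Female": 215, "Male": 337}

# Ratio of the male share to the female share for each title.
ratios = {title: row["Male"] / row["Female"] for title, row in table.items()}

# Approximate head counts: percentage of the column total, rounded to whole people.
approx_counts = {
    title: {sex: round(pct / 100 * total_counts[sex]) for sex, pct in row.items()}
    for title, row in table.items()
}

for title in table:
    print(f"{title:<28} ratio={ratios[title]:.2f}  counts={approx_counts[title]}")
```

The ratio compares shares within each group, not head counts. Because more males than females responded, the gap in approximate head counts for "Technical Product Manager" is wider than the gap in percentages.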

Since cross tabulation is widely used in statistics, there are many statistical processes and terms closely associated with it. Most of these processes are methods for testing the strength of the relationships shown in a crosstab. Such tests are needed to maintain consistency and arrive at accurate conclusions, because the data laid out in a crosstab may come from a wide variety of sources.

Companies find the services of a data warehouse indispensable. A data warehouse, however, can hold billions of records, most of them unrelated to one another. Without the aid of tools, this data might not be of any use to the company. The data is not homogeneous: it may come from various sources, often from other data suppliers and from warehouses at branches in other geographical locations. Software applications such as relational database management systems have cross tabulation functionality that allows end users to correlate and compare any piece of data. Crosstab analysis engines can examine dozens of tables quickly and efficiently, and can even produce full statistical output with very few clicks of the mouse or keystrokes.
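The text does not name a particular test. One common choice for a crosstab is the chi-square test of independence, sketched here on a hypothetical 2 x 2 table of counts. Only the test statistic and the degrees of freedom are computed; the counts and row labels are assumptions made for the illustration.

```python
# Hypothetical 2 x 2 table of observed counts: rows are groups, columns are replies.
observed = [
    [30, 10],   # e.g. school children: would visit / would not visit
    [20, 20],   # e.g. grown-ups: would visit / would not visit
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count in each cell if the two variables were unrelated.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected over all cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

print(f"chi-square = {chi_square:.3f}, degrees of freedom = {degrees_of_freedom}")
```

The statistic is then compared with a chi-square critical value for the given degrees of freedom. With one degree of freedom, the usual 5% critical value is about 3.84, so a larger statistic suggests that the two variables are related.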

Name : Siddhartha Singh

Roll : 13165

Marketing Group 6