## Sunday, 4 September 2011

### Factor Analysis – What & Why?

Selecting Variables/Items for the Analysis. Ideally the researcher will select items which are reliable and will have good communalities. Include enough variables so that each common factor will be represented by at least three or four variables.

Selecting Subjects for the Analysis. Don't make the mistake of sampling from a population of subjects for which there is little variance in the factors you wish to estimate. You might even want to sample in such a way that your subjects vary widely on the factors you wish to estimate but little on other attributes.

Principal Components Analysis or Factor Analysis? If your purpose is to reduce the information in many variables into a set of weighted linear combinations of those variables, use Principal Components Analysis (PCA), which does not differentiate between common and unique variance. If your purpose is to identify the latent variables which are contributing to the common variance in a set of measured variables, use Factor Analysis (FA), which will attempt to exclude unique variance from the analysis.
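One way to see the difference between the two approaches is to compare the eigenvalues of the full correlation matrix (what PCA decomposes) with those of a "reduced" correlation matrix whose diagonal holds communality estimates instead of 1s. The sketch below is only an illustration: it assumes NumPy is available, uses squared multiple correlations (SMCs) as the initial communality estimates, and the function name `pca_and_reduced_eigenvalues` is made up for this example.

```python
import numpy as np

def pca_and_reduced_eigenvalues(corr):
    """Eigenvalues of the full and the reduced correlation matrix.

    Assumption: squared multiple correlations (SMCs) serve as the
    initial communality estimates for the reduced matrix.
    """
    corr = np.asarray(corr, dtype=float)

    # PCA: decompose the full matrix (1s on the diagonal), so common
    # and unique variance are analyzed together.
    pca_eigs = np.sort(np.linalg.eigvalsh(corr))[::-1]

    # SMC of each variable with all the others: 1 - 1 / diag(R^-1).
    smc = 1.0 - 1.0 / np.diag(np.linalg.inv(corr))

    # Factor analysis (principal-factor style): replace the diagonal
    # with the communality estimates so unique variance is left out.
    reduced = corr.copy()
    np.fill_diagonal(reduced, smc)
    fa_eigs = np.sort(np.linalg.eigvalsh(reduced))[::-1]

    return pca_eigs, fa_eigs, smc

# Three variables that all correlate 0.5 with one another.
R = np.array([[1.0, 0.5, 0.5],
              [0.5, 1.0, 0.5],
              [0.5, 0.5, 1.0]])
pca_eigs, fa_eigs, smc = pca_and_reduced_eigenvalues(R)
```

The PCA eigenvalues sum to the number of variables (total variance), while the eigenvalues of the reduced matrix sum to the total estimated common variance.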

Exploratory or Confirmatory Factor Analysis? If you wish to restrict the number of factors extracted to a particular number and specify particular patterns of relationship between measured variables and common factors, and this is done a priori (before seeing the data), then the confirmatory procedure is for you. If you have no such well specified a priori restrictions, then use the exploratory procedure.

Which Factor Extraction Procedure? Maximum Likelihood (ML) extraction allows computation of assorted indices of goodness-of-fit (of data to the model) and the testing of the significance of loadings and correlations between factors, but requires the assumption of multivariate normality. Principal Factors (PF) methods have no distributional assumptions. Fabrigar et al. suggest that, before going with ML extraction, one first examine the distributions of the measured variables for normality. Unless there are severe problems (|skew| > 2, kurtosis > 7), they recommend ML. If there are severe problems, consider trying to correct them (by transforming variables, for example) rather than using PF methods.
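The normality screen described above can be sketched in a few lines of plain Python. This is an illustration, not a prescribed procedure: the function names are made up, the estimators are the simple moment-based ones, and "kurtosis" is assumed to mean excess kurtosis (0 for a normal distribution), a convention the text above does not spell out.

```python
def skew_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # assumption: normal = 0
    return skew, excess_kurtosis

def severely_nonnormal(values, max_abs_skew=2.0, max_kurtosis=7.0):
    """True if a variable breaks the |skew| > 2 or kurtosis > 7 guideline."""
    skew, kurt = skew_and_excess_kurtosis(values)
    return abs(skew) > max_abs_skew or kurt > max_kurtosis

# A symmetric variable passes; a variable with one extreme value does not.
symmetric = [1, 2, 3, 4, 5]
lopsided = [0] * 99 + [100]
flags = (severely_nonnormal(symmetric), severely_nonnormal(lopsided))
```

A variable that is flagged would be a candidate for transformation before ML extraction is attempted.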

How Many Factors to Extract? Prefer overfactoring (too many factors) to underfactoring (too few factors). Overfactoring is likely to lead to a solution where the major factors are well estimated by the obtained loadings but where there are also additional poorly defined factors (with few, if any, variables loading well on them). Underfactoring is likely to lead to factors that are poorly estimated (poor correspondence between the structure of the true factors and that of the estimated factors), a more serious problem.

The authors spoke kindly of "parallel analysis," in which the obtained eigenvalues are compared to those one would expect to obtain from random data. If the first m eigenvalues are the ones that exceed what would be expected from random data, then one adopts a solution with m factors. Regrettably, this method is not available in the major statistical programs.
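Parallel analysis is simple enough to sketch by hand. The version below is a minimal illustration under several assumptions that are not from the text above: it uses NumPy, works with eigenvalues of the full correlation matrix, simulates normally distributed random data of the same size as the real data, and uses the 95th percentile of the simulated eigenvalues as the cutoff. The function name and its parameters are hypothetical.

```python
import numpy as np

def parallel_analysis(data, n_sims=200, percentile=95, seed=0):
    """Return (m, observed eigenvalues, random-data thresholds).

    data is an (observations x variables) array. m counts the leading
    eigenvalues that exceed the chosen percentile of eigenvalues
    obtained from random normal data of the same shape (an assumption).
    """
    rng = np.random.default_rng(seed)
    n_obs, n_vars = data.shape

    def sorted_eigs(x):
        # Eigenvalues of the correlation matrix, largest first.
        return np.sort(np.linalg.eigvalsh(np.corrcoef(x, rowvar=False)))[::-1]

    observed = sorted_eigs(data)

    simulated = np.empty((n_sims, n_vars))
    for i in range(n_sims):
        simulated[i] = sorted_eigs(rng.standard_normal((n_obs, n_vars)))
    thresholds = np.percentile(simulated, percentile, axis=0)

    # Keep only the leading run of eigenvalues that beat random data.
    exceeds = observed > thresholds
    m = n_vars if exceeds.all() else int(np.argmin(exceeds))
    return m, observed, thresholds

# Made-up data: six variables driven by two uncorrelated factors.
rng = np.random.default_rng(1)
factors = rng.standard_normal((500, 2))
loadings = np.array([[0.8, 0.0], [0.8, 0.0], [0.8, 0.0],
                     [0.0, 0.8], [0.0, 0.8], [0.0, 0.8]])
x = factors @ loadings.T + 0.6 * rng.standard_normal((500, 6))
m, observed, thresholds = parallel_analysis(x, n_sims=50)
```

Comparing `observed` with `thresholds` side by side also shows how close the call is for each eigenvalue.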

The goodness-of-fit statistics available from ML factor analysis may be helpful in determining the number of factors to retain. The analyst first decides how many factors, at most, they would be willing to retain. Then they fit models with 0, 1, 2, 3, ... up to that number of factors and compare them with respect to goodness-of-fit.

The authors also note that "a model that fails to produce a rotated solution that is interpretable and theoretically sensible has little value." This sounds like what I call the "meaningfulness criterion." I typically examine, in addition to the solution with what seems at first to have the correct number of factors, solutions with one or two more or fewer factors. I then adopt the solution which makes the most sense to me.

What Type of Rotation? The authors make a strong argument in favor of oblique rotations rather than orthogonal solutions. They note that dimensions of interest to psychologists are not often dimensions we would expect to be orthogonal. If the latent variables are, in fact, correlated, then an oblique rotation will produce a better estimate of the true factors and a better simple structure than will an orthogonal rotation -- and if the oblique rotation indicates that the factors have close to zero correlations between one another, then the analyst can go ahead and conduct an orthogonal rotation (which should then give about the same solution as the oblique rotation).

What Do Researchers Actually Do? Based on articles published between 1991 and 1995 in the Journal of Personality and Social Psychology and the Journal of Applied Psychology, about half of researchers use PCA even though their primary goal is to identify latent variables, in which case FA should have been employed. They often report the reliabilities of their variables, but not the communalities (which are more informative). Frequently they do not explain the method they used to decide how many factors to retain, and when they do report the method it is most likely to be the eigenvalue-greater-than-one method. They use varimax rotation. When asked to provide a copy of their data so that Fabrigar et al. could determine whether a better solution would be obtained by making decisions other than those made by the researchers, most researchers failed to provide the data. For those that did provide the data, Fabrigar et al. found that an oblique rotation often produced a slightly better simple structure than did a varimax rotation, but the pattern of loadings was almost always the same with varimax as with oblique rotation.

Why Do Researchers Make These Decisions? That is, why do they elect to do a PCA, retain as many factors as have eigenvalues greater than 1, and use varimax rotation? Well, maybe it is just because these are the defaults for factor analysis in SPSS. You know, one does not have to understand anything about factor analysis to be able to point and click.

Author: Ankit Gupta (13066)

Marketing Group 1