## Sunday, 4 September 2011

### Exploratory factor analysis vs. Confirmatory factor analysis

Exploratory factor analysis (EFA) can be described as the orderly simplification of interrelated measures. Traditionally, EFA has been used to explore the possible underlying factor structure of a set of observed variables without imposing a preconceived structure on the outcome. Performing EFA identifies the underlying factor structure.

Confirmatory factor analysis (CFA) is a statistical technique used to verify the factor structure of a set of observed variables. CFA allows the researcher to test the hypothesis that a relationship exists between the observed variables and their underlying latent constructs. Drawing on theory, empirical research, or both, the researcher postulates the relationship pattern a priori and then tests the hypothesis statistically.

CFA and EFA are powerful statistical techniques. One application of both is the development of measurement instruments, e.g., a satisfaction scale, an attitudes-toward-health scale, or a customer service questionnaire.

A blueprint is first developed, questions are written, a scale is determined, the instrument is pilot tested, data are collected, and CFA is applied. The blueprint identifies the factor structure, or what we think it is. However, some questions may not measure what we thought they should. If the factor structure is not confirmed, EFA is the next step. EFA helps us determine what the factor structure looks like according to how participants respond. Exploratory factor analysis is essential for determining the underlying constructs of a set of measured variables.

The use of CFA could be affected by:

- the research hypothesis being tested
- the requirement of sufficient sample size (e.g., 5-20 cases per parameter estimate)
- measurement instruments
- multivariate normality
- parameter identification
- outliers
- missing data
- interpretation of model fit indices

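
The sample-size guideline in the list above (5-20 cases per parameter estimate) is simple arithmetic. Here is a minimal sketch, assuming a hypothetical helper named `cases_needed` and an illustrative model with 13 freely estimated parameters (the number is made up for the example, not taken from a real study):

```python
def cases_needed(n_free_parameters, low_ratio=5, high_ratio=20):
    """Return the (lower, upper) sample sizes implied by a
    cases-per-parameter rule of thumb."""
    if n_free_parameters <= 0:
        raise ValueError("n_free_parameters must be positive")
    return n_free_parameters * low_ratio, n_free_parameters * high_ratio


# Illustrative only: a model with 13 freely estimated parameters.
low, high = cases_needed(13)
print(f"Plan for roughly {low} to {high} cases")
```

The ratios are kept as arguments because the guideline is a range, not a single fixed number.
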
A suggested approach to CFA proceeds through the following steps:

- review the relevant theory and research literature to support model specification
- specify a model (e.g., diagram, equations)
- determine model identification (e.g., whether unique values can be found for the parameter estimates, and whether the number of degrees of freedom for model testing is positive)
- collect data
- conduct preliminary descriptive statistical analysis (e.g., scaling, missing data, collinearity issues, outlier detection)
- estimate parameters in the model
- assess model fit
- present and interpret the results

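
The identification step can be checked with a quick count of degrees of freedom. The sketch below rests on assumptions the post does not state: each indicator loads on exactly one factor, factor variances are fixed to 1, all factor covariances are free, and measurement errors are uncorrelated. The function name `cfa_degrees_of_freedom` is hypothetical. A positive result is a necessary condition for a testable model, not a guarantee that the model is identified.

```python
def cfa_degrees_of_freedom(n_indicators, n_factors):
    """Degrees of freedom for a simple-structure CFA under the
    assumptions listed in the text above."""
    # Unique variances and covariances among the observed variables.
    n_observed_moments = n_indicators * (n_indicators + 1) // 2

    # Free parameters: one loading and one error variance per indicator,
    # plus one covariance for every pair of factors.
    n_loadings = n_indicators
    n_error_variances = n_indicators
    n_factor_covariances = n_factors * (n_factors - 1) // 2
    n_free = n_loadings + n_error_variances + n_factor_covariances

    return n_observed_moments - n_free


# Two factors with three indicators each: 21 moments, 13 free parameters.
print(cfa_degrees_of_freedom(6, 2))  # 8
# One factor with three indicators: no degrees of freedom left for testing.
print(cfa_degrees_of_freedom(3, 1))  # 0
```
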
Characteristics of EFA:

- is a variable reduction technique that identifies the number of latent constructs and the underlying factor structure of a set of variables
- hypothesizes an underlying construct, a variable not measured directly
- estimates factors that influence responses on observed variables
- allows you to describe and identify the number of latent constructs (factors)
- includes unique factors, error due to unreliability in measurement
- traditionally has been used to explore the possible underlying factor structure of a set of measured variables without imposing any preconceived structure on the outcome

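
These characteristics are usually written down as the common factor model. The notation below is the standard textbook form and is an assumption here, since the post does not define any symbols: $x_i$ is an observed variable (centered), $f_j$ a common factor, $\lambda_{ij}$ a factor loading, and $u_i$ the unique factor for variable $i$.

$$
x_i = \lambda_{i1} f_1 + \lambda_{i2} f_2 + \dots + \lambda_{im} f_m + u_i, \qquad i = 1, \dots, p
$$

If the unique factors are uncorrelated with each other and with the common factors, the covariance matrix of the observed variables decomposes as

$$
\Sigma = \Lambda \Phi \Lambda^{\top} + \Psi
$$

where $\Lambda$ is the $p \times m$ loading matrix, $\Phi$ is the covariance matrix of the common factors, and $\Psi$ is the diagonal matrix of unique variances.
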
Assumptions underlying EFA are:

- interval or ratio level of measurement
- random sampling
- a linear relationship between observed variables
- a normal distribution (each observed variable)
- a bivariate normal distribution (each pair of observed variables)
- multivariate normality

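
The normality assumption for each observed variable can be screened roughly with sample skewness and excess kurtosis, both of which are close to 0 for normally distributed data. This is only a sketch of a univariate screen. The function name `skewness_and_excess_kurtosis` is hypothetical, the data are toy values, and the check says nothing about bivariate or multivariate normality.

```python
def skewness_and_excess_kurtosis(values):
    """Moment-based sample skewness and excess kurtosis (population formulas)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mean = sum(values) / n
    # Second, third, and fourth central moments.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("variable has zero variance")
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skewness, excess_kurtosis


# Toy item responses: one symmetric item, one item with a long right tail.
symmetric_item = [1, 2, 3, 4, 5]
skewed_item = [1, 1, 1, 2, 10]
print(skewness_and_excess_kurtosis(symmetric_item))
print(skewness_and_excess_kurtosis(skewed_item))
```
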
Limitations of EFA are:

- The correlations, the basis of factor analysis, describe relationships. No causal inferences can be made from correlations alone.
- the reliability of the measurement instrument (avoid an instrument with low reliability)
- sample size (larger samples yield more stable correlation estimates)
  - The minimum number of cases for reliable results is more than 100 observations and at least 5 times the number of items.
  - Since some subjects may not answer every item, a larger sample is desirable. For example, 30 items would require at least 150 cases (5 × 30); a sample of 200 subjects would allow for missing data.
- sample selection
  - Should be representative of the population
  - Do not pool populations
- variables could be sample specific, e.g., a unique quality possessed by a group does not generalize to the population
- results can be unreliable when the data are not normally distributed
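
The sample-size rule in the list above (more than 100 cases and at least 5 cases per item, with room for missing data) can be checked before running an EFA. The sketch below assumes that respondents who skipped any item are dropped (listwise deletion), which the post does not specify. The helper names and the toy data are hypothetical.

```python
def complete_cases(responses):
    """Count respondents who answered every item (None marks a skipped item)."""
    return sum(
        1 for row in responses if all(answer is not None for answer in row)
    )


def meets_efa_minimum(n_cases, n_items, floor=100, cases_per_item=5):
    """Rule of thumb: more than `floor` cases and at least
    `cases_per_item` cases for every item."""
    return n_cases > floor and n_cases >= cases_per_item * n_items


# Toy data: 200 respondents, 30 items, 40 respondents skipped one item.
n_items = 30
full_row = [3] * n_items
partial_row = [3] * (n_items - 1) + [None]
responses = [full_row] * 160 + [partial_row] * 40

usable = complete_cases(responses)
print(usable, meets_efa_minimum(usable, n_items))  # 160 True
```

With 30 items the rule requires 150 cases, so collecting 200 leaves a margin of 50 incomplete responses.
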

Written By Nirupam Mandal (13025)
OPS Group 1