Factor analysis is a technique used to reduce a large number of variables to a smaller number of factors. It extracts the maximum common variance from all variables and puts it into a common score, which can be used as an index of all the variables in further analysis. Factor analysis is part of the general linear model (GLM), and it rests on several assumptions: the relationships are linear, there is no multicollinearity, the relevant variables are included in the analysis, and there is a true correlation between the variables and the factors. Several extraction methods are available, but principal component analysis is used most commonly.
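As a minimal sketch of the idea above, the following uses scikit-learn's `FactorAnalysis` to reduce several observed variables to a few factor scores. The data here are simulated purely for illustration; the variable counts and noise level are arbitrary choices, not part of any standard example.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
# Simulate 200 cases on 6 observed variables driven by 2 latent factors.
latent = rng.normal(size=(200, 2))
loadings = np.array([[0.9, 0.0], [0.8, 0.1], [0.7, 0.2],
                     [0.1, 0.9], [0.2, 0.8], [0.0, 0.7]])
X = latent @ loadings.T + 0.3 * rng.normal(size=(200, 6))

fa = FactorAnalysis(n_components=2, random_state=0)
scores = fa.fit_transform(X)   # one score per case on each factor
print(scores.shape)            # 6 variables reduced to 2 factor scores
```

The resulting score matrix has one row per case and one column per factor, and can stand in for the original variables in later analyses.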
Types of factoring:
There are several methods for extracting factors from a data set:
1. Principal component analysis: This is the most common method used by researchers. PCA starts by extracting the maximum variance and putting it into the first factor. It then removes the variance explained by the first factor and extracts the maximum remaining variance for the second factor. This process continues until the last factor.
2. Common factor analysis: The second most preferred method among researchers, it extracts the common variance and puts it into factors. This method excludes the unique variance of the variables and is the approach used in SEM.
3. Image factoring: This method is based on the correlation matrix; OLS regression is used to predict the factors.
4. Maximum likelihood method: This method also works on the correlation matrix, but it uses maximum likelihood estimation to extract the factors.
5. Other methods of factor analysis: These include alpha factoring and unweighted least squares. Weighted least squares is another regression-based method used for factoring.
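The sequential extraction that PCA performs (item 1 above) can be sketched directly with an eigendecomposition of the correlation matrix. The data below are synthetic; the point is only to show that removing the first factor's variance ("deflation") leaves the second factor's variance as the new maximum.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))
X[:, 1] += X[:, 0]                      # induce correlation between two variables
R = np.corrcoef(X, rowvar=False)

# PCA extracts factors from the eigendecomposition of the correlation matrix.
vals, vecs = np.linalg.eigh(R)          # eigh returns ascending order
order = np.argsort(vals)[::-1]
vals, vecs = vals[order], vecs[:, order]

# Deflation: remove the variance explained by the first factor; the largest
# remaining eigenvalue is then exactly the second factor's variance.
R_deflated = R - vals[0] * np.outer(vecs[:, 0], vecs[:, 0])
top_remaining = np.linalg.eigh(R_deflated)[0].max()
print(np.isclose(top_remaining, vals[1]))   # True
```

Each successive factor therefore explains the maximum of whatever variance the earlier factors left behind.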
Factor loading is the correlation coefficient between a variable and a factor, and it shows how much of the variable's variance is explained by that particular factor. In the SEM approach, as a rule of thumb, a factor loading of 0.7 or higher indicates that the factor extracts sufficient variance from that variable.
Eigenvalues: Eigenvalues, also called characteristic roots, show the variance explained by a particular factor out of the total variance. (The communality column, by contrast, shows how much of each individual variable's variance is explained by the extracted factors.) For example, if our first factor explains 68% of the total variance, the remaining 32% must be explained by the other factors.
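The eigenvalue arithmetic can be checked in a few lines. For standardized variables the eigenvalues of the correlation matrix sum to the number of variables, so each eigenvalue divided by that total is the share of variance the factor explains. The data are simulated for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
f = rng.normal(size=500)
X = np.column_stack([f + 0.5 * rng.normal(size=500) for _ in range(5)])
R = np.corrcoef(X, rowvar=False)

eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]
# Eigenvalues sum to the number of variables (the total standardized
# variance), so eigenvalue / total = share of variance per factor.
share = eigenvalues / eigenvalues.sum()
print(np.round(share, 2))          # shares sum to 1: if the first factor
print(np.isclose(share.sum(), 1))  # explains 68%, the rest explain 32%
```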
Factor score: The factor score is also called the component score. It is a score for every case (row) on every factor (column), which can be used as an index of all the variables and carried into further analysis. We can standardize the scores by multiplying by a common term. In any analysis we then run on the factor scores, we assume that the variables behave as their factor scores do.
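One common way to obtain component scores, sketched here on synthetic data: project the standardized variables onto the eigenvectors, then divide each column by the square root of its eigenvalue (the "common term") so every factor score has unit variance.

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(250, 4))
X[:, 2] += X[:, 0]                              # induce some correlation
Z = (X - X.mean(axis=0)) / X.std(axis=0)        # standardize the variables
R = np.corrcoef(X, rowvar=False)

vals, vecs = np.linalg.eigh(R)
order = np.argsort(vals)[::-1]
vals, vecs = vals[order], vecs[:, order]

# Component scores: one score per case (row) on each factor (column).
scores = Z @ vecs
# Standardize by a common term (1 / sqrt(eigenvalue)) so each factor
# score has unit variance and can serve as an index of the variables.
scores_std = scores / np.sqrt(vals)
print(np.round(scores_std.std(axis=0), 2))      # ~1.0 for every factor
```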
Criteria for determining the number of factors: According to the Kaiser criterion, the eigenvalue is a good index for determining the number of factors: if a factor's eigenvalue is greater than one, we should retain it; if its eigenvalue is less than one, we should not. According to the variance extraction rule, the variance extracted by a factor should be more than 0.7; if it is less than 0.7, we should not retain that factor.
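The Kaiser criterion is easy to apply in code. The simulated data below are built with two strong latent factors, so exactly two eigenvalues exceed one; the construction is illustrative, not a standard data set.

```python
import numpy as np

rng = np.random.default_rng(5)
f1 = rng.normal(size=(300, 1))
f2 = rng.normal(size=(300, 1))
# Six variables: three load on each of two latent factors.
X = np.hstack([f1 + 0.4 * rng.normal(size=(300, 3)),
               f2 + 0.4 * rng.normal(size=(300, 3))])
R = np.corrcoef(X, rowvar=False)

eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]
# Kaiser criterion: retain only factors whose eigenvalue exceeds one.
n_factors = int(np.sum(eigenvalues > 1.0))
print(eigenvalues.round(2))
print(n_factors)   # 2
```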
Rotation method: Rotation makes the output easier to interpret. The eigenvalues do not affect the choice of rotation method, but the rotation method does affect the eigenvalues and the percentage of variance extracted. There are a number of rotation methods available: (1) no rotation, (2) varimax rotation, (3) quartimax rotation, (4) direct oblimin rotation, and (5) promax rotation. Each of these can be easily selected in SPSS, and we can compare the variance explained under each method.
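Outside SPSS, the same comparison can be sketched with scikit-learn, whose `FactorAnalysis` accepts a `rotation="varimax"` option (available in scikit-learn 0.24 and later). The data are simulated; rotation typically pushes the loadings toward "simple structure", with each variable loading strongly on one factor.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(6)
latent = rng.normal(size=(300, 2))
L = np.array([[0.9, 0.0], [0.8, 0.1], [0.1, 0.8], [0.0, 0.9]])
X = latent @ L.T + 0.3 * rng.normal(size=(300, 4))

# Fit the same model with and without varimax rotation, then compare loadings.
fa_none = FactorAnalysis(n_components=2, rotation=None, random_state=0).fit(X)
fa_vmax = FactorAnalysis(n_components=2, rotation="varimax",
                         random_state=0).fit(X)

print(np.round(fa_none.components_, 2))   # unrotated loadings
print(np.round(fa_vmax.components_, 2))   # rotated: closer to simple structure
```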
Assumptions:
- No outliers: Factor analysis assumes that there are no outliers in the data.
- Adequate sample size: The number of cases must be greater than the number of factors.
- No perfect multicollinearity: Factor analysis is an interdependency technique. There should not be perfect multicollinearity between the variables.
- Homoscedasticity: Since factor analysis is a linear function of measured variables, it does not require homoscedasticity between the variables.
- Linearity: Factor analysis is also based on the linearity assumption. Non-linear variables can be used; after transformation, however, they become linear.
- Interval Data: Interval data are assumed.
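A few of these assumptions can be screened quickly in code. The sketch below, on simulated data, checks sample size, looks for (near-)perfect multicollinearity via the determinant of the correlation matrix, and runs a crude outlier screen; the |z| > 4 cutoff is an illustrative choice, not a fixed standard.

```python
import numpy as np

rng = np.random.default_rng(7)
X = rng.normal(size=(200, 5))
X[:, 1] += 0.5 * X[:, 0]

n_cases, n_vars = X.shape
# Adequate sample size: more cases than variables (and than factors).
print(n_cases > n_vars)

# No perfect multicollinearity: the correlation matrix must not be singular.
R = np.corrcoef(X, rowvar=False)
det = np.linalg.det(R)
print(det > 0)   # a determinant near zero signals (near-)perfect collinearity

# No outliers: a crude screen on standardized variables (cutoff |z| > 4).
z = (X - X.mean(axis=0)) / X.std(axis=0)
print(int((np.abs(z) > 4).sum()))
```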
Key concepts and terms:
Exploratory factor analysis (EFA): Assumes that any indicator or variable may be associated with any factor. This is the most common form of factor analysis used by researchers, and it is not based on any prior theory.
Confirmatory factor analysis (CFA): Used to determine the factors and factor loadings of measured variables, and to confirm what is expected on the basis of pre-established theory. CFA assumes that each factor is associated with a specified subset of measured variables. It commonly uses two approaches:
- The traditional method: This is based on the principal factor analysis method rather than common factor analysis, and it allows the researcher to gain more insight into the factor loadings.
- The SEM approach: CFA can also be carried out in SEM. In the SEM diagram, straight arrows run only from each latent variable to the observed variables it is hypothesized to explain, curved arrows represent the covariance between every pair of latent variables, and each observed variable keeps its own error (disturbance) term. If a standardized residual in SEM is less than an absolute value of two, the factor is considered good; if it is more than two, there is still unexplained variance that a factor could explain. Chi-square and a number of other goodness-of-fit indexes are used to test how well the model fits.
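The residual idea behind CFA fit can be sketched without an SEM package: a one-factor model implies a correlation matrix of the form LL' + diag(1 - L^2), and the residuals against the observed correlations should be near zero when the model fits. The loadings and data below are hypothetical, chosen only to make the model true by construction.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 500
factor = rng.normal(size=n)
L = np.array([0.8, 0.7, 0.6])                      # hypothesized loadings
# Unique parts scaled so each observed variable has unit variance.
E = rng.normal(size=(n, 3)) * np.sqrt(1 - L**2)
X = factor[:, None] * L + E

R_obs = np.corrcoef(X, rowvar=False)
# Model-implied correlations for a one-factor CFA: R = L L' + diag(1 - L^2).
R_model = np.outer(L, L)
np.fill_diagonal(R_model, 1.0)

residuals = R_obs - R_model
print(np.round(residuals, 2))        # near zero when the model fits
print(np.abs(residuals).max())
```

A full CFA would additionally standardize these residuals and compute chi-square and other fit indexes, which dedicated SEM software handles.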