During the proposal phase, students conducting quantitative research need to determine the sample size required for their analysis. G*Power is a popular power analysis tool that allows students to calculate required sample sizes for many basic analyses (e.g., correlations, t-tests, ANOVAs, and linear regression) using only a few parameters (e.g., the alpha level, desired power level, and effect size). However, it cannot calculate sample sizes for complex analyses such as factor analysis, structural equation modeling (SEM), and other advanced modeling techniques. So how can you determine your sample size if you are doing a more complicated analysis? Through simulation.
Simulation (sometimes referred to as Monte Carlo simulation) is a method of power analysis for complex models. It is gaining popularity and becoming more accessible as new software tools are developed and existing ones improve. In this blog, we will discuss what simulation methods are, the advantages and disadvantages of simulation methods, and software options for performing power analysis via simulation.
To understand how simulation methods work, you should first understand the concept of statistical power. Statistical power refers to the probability of finding a statistically significant result in a sample when the effect truly exists in the population. For example, let’s say you’re trying to determine if there is a correlation between self-efficacy and job satisfaction among nurses. Your statistical power would be the probability of finding a significant correlation if you were to collect and analyze data from a random sample of nurses, assuming the two variables really are correlated in the population. Many factors affect power, such as the strength of the correlation you are hoping to observe (i.e., effect size) and the size of your sample.
Sample size is the factor we are most interested in for the purposes of this discussion. Generally, a larger sample size will mean a higher level of power. It’s important to have a large enough sample size because we want our statistical power to be high. In the social sciences, the desired level of power is typically .80, or an 80% chance of finding a significant result when the effect truly exists. This means that if we were to conduct our study 100 times, we would hope to observe a significant result in at least 80 of those studies.
This brings us to the concept of using simulations to determine statistical power. Instead of collecting real data and conducting a full study 100 times, we generate hypothetical data (i.e., simulated datasets) and see how often we observe statistical significance in the simulated data.
For example, suppose we wanted to know what our statistical power would be for a study with a sample size of 50. We would generate 100 simulated datasets, each with a sample size of 50 and each drawn from a hypothetical population in which the correlation equals the effect size we expect, and run our correlation analysis on each of those datasets to see how often the result comes out as significant. If the correlation is significant in 80 (or more) out of the 100 simulations, then we may conclude that a sample size of 50 is enough for our real study. If the correlation is significant in fewer than 80 out of the 100 simulations, then we may conclude that a sample size of 50 is not enough.
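To make this concrete, here is a minimal sketch of the procedure in Python using NumPy and SciPy. It is only an illustration, not a recommended workflow: the function name simulate_power, the hypothesized population correlation of .30, and the choice to draw both variables from a standardized bivariate normal distribution are all assumptions made for the example. It also runs 1,000 simulated datasets rather than 100, on the assumption that more replications give a more stable estimate.

```python
import numpy as np
from scipy import stats


def simulate_power(n, rho, n_sims=1000, alpha=0.05, seed=0):
    """Estimate power to detect a population correlation `rho` with sample size `n`.

    Hypothetical helper for illustration; assumes bivariate normal data.
    """
    rng = np.random.default_rng(seed)
    # Population covariance matrix for two standardized variables
    cov = [[1.0, rho], [rho, 1.0]]
    n_significant = 0
    for _ in range(n_sims):
        # Generate one simulated dataset of n cases
        data = rng.multivariate_normal([0.0, 0.0], cov, size=n)
        # Run the planned analysis (a Pearson correlation) on that dataset
        r, p_value = stats.pearsonr(data[:, 0], data[:, 1])
        if p_value < alpha:
            n_significant += 1
    # Power estimate = proportion of simulated datasets with a significant result
    return n_significant / n_sims


# Assumed effect size of r = .30 (illustrative value only)
power_at_50 = simulate_power(n=50, rho=0.30)
print(f"Estimated power at n = 50: {power_at_50:.2f}")
```

If the printed proportion falls below .80, a sample of 50 would not be considered large enough under these assumptions.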
We could then try our simulation tests again with a larger sample size parameter (say, 60 or 70) and see if we get a higher number of significant results. This process can be repeated until the sample size that yields the desired power level is discovered.
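The repeat-until-adequate process can be sketched as a simple loop. As before, this is a hedged illustration: the helper names, the hypothesized correlation of .30, the starting sample size of 50, and the step of 10 are all assumptions chosen for the example, and the data are assumed to be bivariate normal.

```python
import numpy as np
from scipy import stats


def simulate_power(n, rho, n_sims=1000, alpha=0.05, seed=0):
    """Proportion of simulated datasets (size n) with a significant correlation."""
    rng = np.random.default_rng(seed)
    cov = [[1.0, rho], [rho, 1.0]]
    hits = 0
    for _ in range(n_sims):
        data = rng.multivariate_normal([0.0, 0.0], cov, size=n)
        _, p_value = stats.pearsonr(data[:, 0], data[:, 1])
        hits += p_value < alpha
    return hits / n_sims


def find_sample_size(rho, target_power=0.80, start=50, step=10, max_n=500):
    """Increase n in fixed steps until the simulated power reaches the target.

    Returns (n, estimated_power), or None if max_n is reached first.
    """
    for n in range(start, max_n + 1, step):
        power = simulate_power(n, rho, seed=n)  # a different seed for each n
        print(f"n = {n}: estimated power = {power:.2f}")
        if power >= target_power:
            return n, power
    return None


# Assumed effect size of r = .30 (illustrative value only)
result = find_sample_size(rho=0.30)
```

Because each power estimate carries some simulation error, the sample size returned can shift by a step from run to run. Increasing n_sims, or using a smaller step near the target, narrows that uncertainty.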
The main advantages of power analysis via simulation methods are that power can be estimated for very complicated models and that your hypothesized model can be specified very precisely. For instance, if you need to test a very complicated structural equation model, you can build your exact model and set the hypothesized value of each effect or parameter in your model. Thus, if you specify your model correctly, you can obtain a sample size estimate that is essentially “custom made” for your specific study. Therefore, correctly performed simulation studies may give better results than other methods of sample size estimation (such as “rules of thumb”).
A disadvantage of simulation methods, however, is that you need to provide more precise and complicated parameters than other methods require. In G*Power, for instance, you often only need to specify the power level, alpha level, effect size, and perhaps one or two other basic parameters of your analysis (such as the number of predictors for a regression). Simulation methods require you to provide estimates for parameters such as covariances, residual variances, and regression weights for every indicator in your model. If you do not know how to estimate these values, it will be difficult to perform the procedure correctly. Another disadvantage of simulation methods is that the software packages available for simulation are not as user-friendly as G*Power. They will generally be difficult for novices to use.
Multiple software tools are available for power analysis via simulation. Mplus allows users to perform Monte Carlo simulations on many types of complex models. Another option is pwrSEM, which was developed by Wang and Rhemtulla (2021). It is a free tool that provides step-by-step instructions for conducting power analysis for structural equation models.
Wang, Y. A., & Rhemtulla, M. (2021). Power analysis for parameter estimation in structural equation modeling: A discussion and tutorial. Advances in Methods and Practices in Psychological Science, 4(1), 1-17.