Confirmatory Factor Analysis Sample Size: How Small Is Risky
- 01. Confirmatory factor analysis sample size
- 02. Historical and methodological context
- 03. Rules of thumb and practical benchmarks
- 04. Model complexity and data quality considerations
- 05. Power and precision in CFA
- 06. Best practices for researchers in the field
- 07. Illustrative data table
- 08. Methodological appendix
- 09. Frequently asked questions
Confirmatory factor analysis sample size
The minimum acceptable sample size for CFA depends on model complexity, communalities, and distributional properties, but in practice a safe baseline is around 200 participants for standard models with moderate communalities. This recommendation reflects the convergence of multiple guidelines and empirical studies that emphasize stability of parameter estimates and model fit indices when the sample is near or above this threshold. Practical takeaway: a common rule of thumb across the literature is to use at least 200 observations or 5-10 observations per item, with higher requirements for complex models or non-normal data.
Historical and methodological context
Early guidance proposed modest minimums (e.g., 100-200 participants) depending on factors like the number of items and factor loadings. Over time, the literature has grown into nuanced recommendations that emphasize data quality (communalities, cross-loadings, factor loadings strength) over a single universal N. The consensus now leans toward larger samples for more complex models or when data exhibit non-normality, missingness, or multigroup comparisons. Key anchor point: the larger and cleaner the data, the more confidently CFA results reflect the underlying construct structure.
Rules of thumb and practical benchmarks
Common thumb rules used in practice include:
- At least 5-10 observations per estimated parameter (loadings, residual variances, and factor variances and covariances).
- A baseline of N ≈ 200 for models with a moderate number of factors and items.
- For models with high communalities and strong loadings, some researchers accept smaller samples, though this trades off precision and generalizability.
- When comparing multiple groups (multi-group CFA), aim for roughly 100 participants per group as a practical minimum to maintain adequate power across groups.
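The per-parameter rule above can be made concrete by counting free parameters. The sketch below is a minimal illustration, not a substitute for a power analysis. It assumes a simple-structure CFA (each indicator loads on exactly one factor, no correlated residuals, no mean structure) with marker-variable identification; the helper names `cfa_free_parameters` and `rule_of_thumb_n` are hypothetical and introduced here only for illustration.

```python
def cfa_free_parameters(n_factors: int, indicators_per_factor: int) -> int:
    """Count free parameters in a simple-structure CFA with correlated factors.

    Assumptions (not from the article): one loading per factor is fixed to 1
    for identification, residuals are uncorrelated, and means are not modeled.
    """
    n_items = n_factors * indicators_per_factor
    free_loadings = n_items - n_factors            # one marker loading per factor is fixed
    residual_variances = n_items
    factor_variances = n_factors
    factor_covariances = n_factors * (n_factors - 1) // 2
    return free_loadings + residual_variances + factor_variances + factor_covariances


def rule_of_thumb_n(n_factors: int, indicators_per_factor: int, per_parameter=(5, 10)) -> dict:
    """Translate the 5-10 observations-per-parameter heuristic into an N range."""
    n_items = n_factors * indicators_per_factor
    q = cfa_free_parameters(n_factors, indicators_per_factor)
    unique_moments = n_items * (n_items + 1) // 2  # variances and covariances of the items
    return {
        "free_parameters": q,
        "degrees_of_freedom": unique_moments - q,
        "n_range": (per_parameter[0] * q, per_parameter[1] * q),
    }


if __name__ == "__main__":
    print(rule_of_thumb_n(n_factors=3, indicators_per_factor=4))
```

For a 3-factor model with 4 indicators per factor this gives 27 free parameters, so the 5-10 ratio spans 135 to 270 observations, and the N ≈ 200 baseline falls inside that range.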
Model complexity and data quality considerations
Three core drivers shape the required sample size: (1) number of observed indicators per factor, (2) number of factors and their intercorrelations, and (3) the degree of shared variance (communalities). Higher communalities and clearer factor loadings allow smaller samples to some extent, but this is rarely sufficient in practice for reliable inference. In contrast, models with many indicators or low communalities demand substantially larger samples to achieve stable estimates and acceptable fit indices. Operational note: when planning a CFA, simulate or conduct a Monte Carlo power analysis if possible to tailor N to your specific model and data properties.
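A Monte Carlo check of this kind can be sketched with the standard library alone. The example below is deliberately simplified and rests on assumptions that are not in the text: a single factor with three indicators, standardized loadings of 0.7, and normal data. It uses the fact that this just-identified model has a closed-form loading estimate, which avoids the need for an SEM package. A real study would simulate its own model in dedicated software.

```python
import math
import random
import statistics


def simulate_sample(n, loadings, rng):
    """Draw n observations from a standardized one-factor model with normal data."""
    rows = []
    for _ in range(n):
        eta = rng.gauss(0.0, 1.0)  # latent factor score
        rows.append([lam * eta + math.sqrt(1.0 - lam ** 2) * rng.gauss(0.0, 1.0)
                     for lam in loadings])
    return rows


def covariance(x, y):
    mx, my = statistics.fmean(x), statistics.fmean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def first_loading(rows):
    """Loading of indicator 1 with factor variance fixed to 1.

    For three indicators, lambda_1 = sqrt(s12 * s13 / s23).
    Returns None when the sample covariances give an inadmissible value.
    """
    x1, x2, x3 = (list(col) for col in zip(*rows))
    s12, s13, s23 = covariance(x1, x2), covariance(x1, x3), covariance(x2, x3)
    if s23 == 0 or s12 * s13 / s23 <= 0:
        return None
    return math.sqrt(s12 * s13 / s23)


def loading_sd_by_n(sample_sizes, loadings=(0.7, 0.7, 0.7), reps=300, seed=1):
    """Empirical mean and SD of the loading estimate at each candidate N."""
    rng = random.Random(seed)
    results = {}
    for n in sample_sizes:
        estimates = []
        for _ in range(reps):
            est = first_loading(simulate_sample(n, loadings, rng))
            if est is not None:
                estimates.append(est)
        results[n] = {
            "mean": statistics.fmean(estimates),
            "sd": statistics.stdev(estimates),
            "admissible_rate": len(estimates) / reps,
        }
    return results


if __name__ == "__main__":
    for n, summary in loading_sd_by_n([50, 100, 200, 400]).items():
        print(n, summary)
```

The spread of the estimates across replications is the quantity to watch: N is adequate for a given purpose when that spread, and the rate of inadmissible solutions, fall to a level the researcher can accept.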
Power and precision in CFA
Power analyses for CFA/SEM consider the ability to reject incorrect models and to detect nonzero factor correlations. Studies show that power can be adequate with smaller samples if the model has strong signal (high loadings, high communalities) but may require larger samples when data are noisy or distributions are non-normal. For typical psychological constructs, power-focused guidelines often converge on N around 200-300 as a practical compromise between feasibility and robustness. Actionable reminder: report power considerations alongside CFA results to justify the chosen sample size.
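One common way to formalize "the ability to reject incorrect models" is the RMSEA-based power framework of MacCallum, Browne, and Sugawara (1996). The text above does not name a method, so treating this as the intended one is an assumption. In that framework, with model degrees of freedom $d$, a null RMSEA $\varepsilon_0$ (for example 0.05), and an alternative RMSEA $\varepsilon_a$ (for example 0.08), the test of close fit works as follows:

```latex
\lambda_0 = (N - 1)\, d\, \varepsilon_0^{2}, \qquad
\lambda_a = (N - 1)\, d\, \varepsilon_a^{2}

\Pr\!\left[\chi^{2}_{d}(\lambda_0) > c\right] = \alpha, \qquad
\text{power} = \Pr\!\left[\chi^{2}_{d}(\lambda_a) > c\right]
```

Here $\chi^{2}_{d}(\lambda)$ is a noncentral chi-square variable and $c$ is the critical value under the null. Because both noncentrality parameters grow with $N$ and $d$, power rises with sample size and with model degrees of freedom, which is one reason a single universal N cannot serve every model.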
Best practices for researchers in the field
To maximize CFA reliability with constrained samples, researchers should:
- Pre-register a plan for model specification and fit criteria to avoid post-hoc overinterpretation.
- Assess and report communalities and cross-loadings to contextualize sample size decisions.
- Prefer higher-quality data with minimal missingness or apply robust estimation methods if missing data are present.
- Consider alternative estimation techniques (e.g., robust maximum likelihood, diagonally weighted least squares) when data depart from normality and sample size is limited.
Illustrative data table
The table presents fabricated but plausible benchmarks to illustrate how sample size interacts with model attributes. These figures are for demonstration and should be tailored to your actual study context.
| Model Type | Number of Factors | Indicators per Factor | Average Communality | Estimated Required N | Notes |
|---|---|---|---|---|---|
| CFA | 2 | 4 | 0.60 | 180-220 | Small model; adequate if data are clean |
| CFA | 3 | 5 | 0.65 | 230-280 | Higher complexity; higher N advisable |
| CFA | 4 | 6 | 0.55 | 280-340 | Lower communalities demand more data |
| Multi-group CFA | 2 | 4 | 0.70 | 100 per group | Per-group power matters; plan measurement invariance tests |
Methodological appendix
When designing a CFA study, a prudent workflow is to specify the measurement model, simulate expected fit indices under plausible distributions, and assess how varying N affects parameter recovery. This approach helps researchers avoid underpowered analyses that could mislead interpretation of the factor structure. Guidance anchor: plan for N that accommodates potential non-normality and missing data, and be explicit about the estimation method chosen (e.g., robust maximum likelihood) to align with sample characteristics.
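Accommodating missing data at the planning stage can be as simple as inflating the recruitment target. The sketch below assumes a complete-case analysis, where incomplete responses are dropped, and an expected missingness rate supplied by the researcher; both are assumptions for illustration, and `recruitment_target` is a hypothetical helper. Full-information estimators retain partial cases, so this figure would then be conservative.

```python
import math


def recruitment_target(analysis_n: int, expected_missing_rate: float) -> int:
    """Number of participants to recruit so that about `analysis_n` complete cases remain.

    Assumes complete-case analysis; `expected_missing_rate` is the anticipated
    proportion of participants with unusable (incomplete) data.
    """
    if not 0.0 <= expected_missing_rate < 1.0:
        raise ValueError("expected_missing_rate must be in [0, 1)")
    return math.ceil(analysis_n / (1.0 - expected_missing_rate))


if __name__ == "__main__":
    # Target of 200 analyzable cases with 15% expected loss.
    print(recruitment_target(200, 0.15))  # prints 236
```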
Frequently asked questions
"The data drive the size; the theory drives the model." This maxim underscores that while rules of thumb are helpful, actual data characteristics should direct CFA planning and reporting.
In sum, CFA sample size is not a single universal number; it is a function of model complexity, data quality, and the analytic goals. For most conventional CFA applications in the social and behavioral sciences, aiming for a minimum of 200 participants, with larger samples for complex models or when data are imperfect, provides a solid foundation for credible inference. Researchers should document their assumptions, report diagnostic information about communalities and loadings, and consider supplementary analyses to validate the robustness of their findings. Bottom line: plan, simulate, and report with transparency to maximize the reliability and usefulness of CFA results.
Why does sample size matter in CFA?
In CFA, researchers estimate many parameters (factor loadings, variances, covariances), and small samples can yield unstable estimates and biased fit statistics. Larger samples reduce standard errors, improve power to detect misspecified models, and stabilize estimates of factors with lower communalities. This is especially critical when evaluating measurement invariance or multi-group CFA where each group adds parameters to estimate. Stability concerns rise when sample sizes dip below 100, increasing the risk of spurious good fit or missed misfit signals.
What is the minimum sample size for CFA in a model with 3 factors and 4 indicators per factor?
Is 150 participants enough for CFA?
How should researchers report CFA sample size decisions?
What estimation method works best with smaller samples?
What is CFA sample size?
In CFA, sample size refers to the number of participants or observations used to estimate the factor structure. The commonly cited target is around 200 participants for typical models, with higher N recommended for complex or non-normal data. Context note: sample size interacts with factor loadings, communalities, and the number of indicators per factor to determine estimation precision and model stability.
Can CFA be reliable with small samples?
Yes, if the model is simple, communalities are high, and data are well-behaved, but the stability of estimates and their generalizability usually suffer as sample size decreases. Researchers should report these limitations and consider alternative analyses if N cannot be increased. Practical cue: use bootstrap or robust standard errors to gauge precision in small samples when feasible.
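A nonparametric bootstrap for a single loading can be sketched as follows. This is an illustration under stated assumptions, not a recommended pipeline: the data are simulated stand-ins (80 cases, one factor, three indicators, loadings of 0.7), the loading uses the closed-form estimate available for a three-indicator model, and all function names are hypothetical. In practice the resampling would wrap whatever CFA software fits the actual model.

```python
import math
import random
import statistics


def covariance(x, y):
    mx, my = statistics.fmean(x), statistics.fmean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def first_loading(rows):
    """Closed-form loading of indicator 1 in a one-factor, three-indicator model."""
    x1, x2, x3 = (list(col) for col in zip(*rows))
    s12, s13, s23 = covariance(x1, x2), covariance(x1, x3), covariance(x2, x3)
    if s23 == 0 or s12 * s13 / s23 <= 0:
        return None  # inadmissible solution in this (re)sample
    return math.sqrt(s12 * s13 / s23)


def make_demo_rows(n, seed, loading=0.7):
    """Simulated stand-in for real data (hypothetical values)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        eta = rng.gauss(0.0, 1.0)
        rows.append([loading * eta + math.sqrt(1.0 - loading ** 2) * rng.gauss(0.0, 1.0)
                     for _ in range(3)])
    return rows


def bootstrap_loading(rows, n_boot=500, seed=1):
    """Bootstrap SE and percentile 95% interval for the first loading."""
    rng = random.Random(seed)
    n = len(rows)
    estimates = []
    for _ in range(n_boot):
        resample = [rows[rng.randrange(n)] for _ in range(n)]  # draw cases with replacement
        est = first_loading(resample)
        if est is not None:
            estimates.append(est)
    estimates.sort()
    k = len(estimates)
    lower = estimates[int(0.025 * k)]
    upper = estimates[min(k - 1, int(0.975 * k))]
    return {"se": statistics.stdev(estimates), "ci95": (lower, upper), "usable_resamples": k}


if __name__ == "__main__":
    data = make_demo_rows(n=80, seed=7)
    print(first_loading(data), bootstrap_loading(data))
```

A wide interval, or many resamples that fail to yield an admissible estimate, is itself a signal that the sample is too small for the model.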
How to determine N for CFA?
A practical approach combines literature-guided rules of thumb with data-driven checks: (1) choose a baseline target (e.g., N ≈ 200), (2) compute post-hoc power for the model given the observed loadings and communalities, (3) evaluate stability of parameter estimates via re-sampling or bootstrapping, and (4) report effect sizes and confidence intervals for key parameters to convey precision. Note: pre-study power analysis for CFA is nuanced and benefits from Monte Carlo simulations tailored to the specific model and data characteristics.
How does multi-group CFA affect sample size?
Multi-group CFA typically increases the required N because you estimate parameters in each group and test invariance across groups. A common heuristic is to target roughly 100 participants per group as a practical floor, with larger samples needed for more groups, non-normal data, or complex invariance testing. Takeaway: plan for higher overall N or fewer groups when data collection is constrained.
Where can I find more authoritative guidelines on CFA sample size?