Description
Notebook proposal
Title: Confirmatory Factor Analysis and Structural Equation Models
Why should this notebook be added to pymc-examples?
This fills a gap in our coverage of CFA and SEM models, highlighting in particular their role in the analysis of psychometric survey data. It's super interesting and ties in with Judea Pearl-style causal inference on DAGs.
DATA

Basic CFA example in PyMC
import pymc as pm
import pytensor.tensor as pt

# df_p: survey DataFrame (see DATA); columns assumed to be ordered as in 'indicators'
coords = {
    'obs': list(range(len(df_p))),
    'indicators': ['PI', 'AD', 'IGC', 'FI', 'FC'],
    'indicators_1': ['PI', 'AD', 'IGC'],
    'indicators_2': ['FI', 'FC'],
    'latent': ['Student', 'Faculty'],
}
obs_idx = list(range(len(df_p)))

with pm.Model(coords=coords) as model:
    # Residual standard deviation of each indicator
    Psi = pm.InverseGamma('Psi', 5, 10, dims='indicators')

    # Factor loadings; the first loading of each factor is fixed to 1 to set its scale
    lambdas_ = pm.Normal('lambdas_1', 1, 10, dims='indicators_1')
    lambdas_1 = pm.Deterministic('lambdas1', pt.set_subtensor(lambdas_[0], 1), dims='indicators_1')
    lambdas_ = pm.Normal('lambdas_2', 1, 10, dims='indicators_2')
    lambdas_2 = pm.Deterministic('lambdas2', pt.set_subtensor(lambdas_[0], 1), dims='indicators_2')

    # Indicator intercepts
    tau = pm.Normal('tau', 3, 10, dims='indicators')

    # Correlated latent factors with zero mean and LKJ prior on their covariance
    kappa = 0
    sd_dist = pm.Exponential.dist(1.0, shape=2)
    chol, _, _ = pm.LKJCholeskyCov('chol_cov', n=2, eta=2,
                                   sd_dist=sd_dist, compute_corr=True)
    ksi = pm.MvNormal('ksi', kappa, chol=chol, dims=('obs', 'latent'))

    # Measurement model: each indicator is a linear function of its own factor
    m1 = tau[0] + ksi[obs_idx, 0] * lambdas_1[0]
    m2 = tau[1] + ksi[obs_idx, 0] * lambdas_1[1]
    m3 = tau[2] + ksi[obs_idx, 0] * lambdas_1[2]
    m4 = tau[3] + ksi[obs_idx, 1] * lambdas_2[0]
    m5 = tau[4] + ksi[obs_idx, 1] * lambdas_2[1]
    mu = pm.Deterministic('mu', pm.math.stack([m1, m2, m3, m4, m5]).T)
    _ = pm.Normal('likelihood', mu, Psi, observed=df_p.values)

    # nuts_sampler='numpyro' requires numpyro to be installed
    idata = pm.sample(nuts_sampler='numpyro', target_accept=.95,
                      idata_kwargs={"log_likelihood": True})
    idata.extend(pm.sample_posterior_predictive(idata))
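Since the data set itself is not shown here, it may help to see the generative process the model above encodes. The following is only a minimal sketch, using the Python standard library, that simulates rows from a two-factor measurement model with the same indicator layout. All parameter values (`TAU`, `LAMBDA`, `PSI`, `FACTOR_SD`, `FACTOR_CORR`) and the helper `simulate_cfa` are hypothetical and chosen purely for illustration; they are not estimates from the survey data.

```python
import math
import random

# Hypothetical values for illustration only; NOT estimates from the survey data.
INDICATORS = ["PI", "AD", "IGC", "FI", "FC"]
FACTOR_OF = [0, 0, 0, 1, 1]          # PI, AD, IGC -> Student; FI, FC -> Faculty
TAU = [3.0, 3.2, 2.9, 3.5, 3.1]      # indicator intercepts
LAMBDA = [1.0, 0.8, 1.2, 1.0, 0.9]   # loadings; first loading per factor fixed to 1
PSI = [0.5, 0.6, 0.4, 0.5, 0.7]      # residual standard deviations
FACTOR_SD = [1.0, 0.8]               # latent standard deviations
FACTOR_CORR = 0.4                    # correlation between the two factors


def simulate_cfa(n_obs, seed=0):
    """Draw n_obs rows from a two-factor CFA measurement model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_obs):
        # Correlated latent scores via a hand-written 2x2 Cholesky factor.
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        ksi = [
            FACTOR_SD[0] * z1,
            FACTOR_SD[1] * (FACTOR_CORR * z1 + math.sqrt(1 - FACTOR_CORR**2) * z2),
        ]
        # y_j = tau_j + lambda_j * ksi[factor(j)] + Normal(0, psi_j) noise
        row = [
            TAU[j] + LAMBDA[j] * ksi[FACTOR_OF[j]] + rng.gauss(0, PSI[j])
            for j in range(len(INDICATORS))
        ]
        rows.append(row)
    return rows


data = simulate_cfa(500)
print(len(data), len(data[0]))
```

Wrapping the result in a DataFrame, e.g. `pd.DataFrame(data, columns=INDICATORS)`, would give a stand-in for `df_p` that can be used to check whether the model recovers the loadings and intercepts used in the simulation.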
Suggested categories:
- Level: Intermediate.
Related notebooks
Perhaps this one: https://www.pymc.io/projects/examples/en/latest/case_studies/factor_analysis.html
But it seems to treat factor analysis more as a machine-learning feature-reduction technique than as a means of analysis, as in the psychometrics use case.
References
Will likely adapt a blog post (WIP) I'm working on here: https://nathanielf.github.io/posts/post-with-code/CFA_AND_SEM/CFA_AND_SEM.html
The original work references the book Bayesian Psychometric Modeling by Levy and Mislevy.