Skip to content

Confirmatory Factor Analysis and Structural Equation Models #695

Open
@NathanielF

Description

@NathanielF

Notebook proposal

Title: Confirmatory Factor Analysis and Structural Equation Models

Why should this notebook be added to pymc-examples?

This fills a gap in the coverage we have of CFA and SEM models highlighting in particular their role in the analysis of psychometric survey data. It's super interesting and tied to Judea Pearl style causal inference on DAGs.

image

DATA

image

Basic CFA example in PyMC

coords = {'obs': list(range(len(df_p))), 
          'indicators': ['PI', 'AD',    'IGC', 'FI', 'FC'],
          'indicators_1': ['PI', 'AD',  'IGC'],
          'indicators_2': ['FI', 'FC'],
          'latent': ['Student', 'Faculty']
          }


obs_idx = list(range(len(df_p)))
with pm.Model(coords=coords) as model:
  
  Psi = pm.InverseGamma('Psi', 5, 10, dims='indicators')
  lambdas_ = pm.Normal('lambdas_1', 1, 10, dims=('indicators_1'))
  lambdas_1 = pm.Deterministic('lambdas1', pt.set_subtensor(lambdas_[0], 1), dims=('indicators_1'))
  lambdas_ = pm.Normal('lambdas_2', 1, 10, dims=('indicators_2'))
  lambdas_2 = pm.Deterministic('lambdas2', pt.set_subtensor(lambdas_[0], 1), dims=('indicators_2'))
  tau = pm.Normal('tau', 3, 10, dims='indicators')
  kappa = 0
  sd_dist = pm.Exponential.dist(1.0, shape=2)
  chol, _, _ = pm.LKJCholeskyCov('chol_cov', n=2, eta=2,
    sd_dist=sd_dist, compute_corr=True)
  ksi = pm.MvNormal('ksi', kappa, chol=chol, dims=('obs', 'latent'))

  m1 = tau[0] + ksi[obs_idx, 0]*lambdas_1[0]
  m2 = tau[1] + ksi[obs_idx, 0]*lambdas_1[1]
  m3 = tau[2] + ksi[obs_idx, 0]*lambdas_1[2]
  m4 = tau[3] + ksi[obs_idx, 1]*lambdas_2[0]
  m5 = tau[4] + ksi[obs_idx, 1]*lambdas_2[1]
  
  mu = pm.Deterministic('mu', pm.math.stack([m1, m2, m3, m4, m5]).T)
  _  = pm.Normal('likelihood', mu, Psi, observed=df_p.values)

  idata = pm.sample(nuts_sampler='numpyro', target_accept=.95, 
                    idata_kwargs={"log_likelihood": True})
  idata.extend(pm.sample_posterior_predictive(idata))

Suggested categories:

  • Level: Intermediate.

Related notebooks

Perhaps this one: https://www.pymc.io/projects/examples/en/latest/case_studies/factor_analysis.html
But it seems to be recount factor analysis more as a machine learning feature reduction technique than as a means of analysis as per the psychometrics use-case.

References

Will likely adapt (WIP) a blog post i'm working on here: https://nathanielf.github.io/posts/post-with-code/CFA_AND_SEM/CFA_AND_SEM.html

The original work references the book Bayesian Psychometric Modeling by Mislevey and Levy

Metadata

Metadata

Assignees

No one assigned

    Labels

    proposalNew notebook proposal still up for discussion

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions