Skip to content

Posterior predictive samples not drawn from correct conditional #3012

Closed
@ckrapu

Description

@ckrapu

This issue concerns something that was raised on Discourse. See that page for more details.

Suppose x and y are the predictor and response for a simple linear regression. I fit the model and draw samples using sample_ppc as follows:

with pm.Model() as model1:
    x     = pm.Uniform('x',lower = 0.0, upper = 10.0, observed = observed_x,shape = n)
    sigma = pm.HalfCauchy('sigma',beta = 1.0)
    beta  = pm.Normal('beta',sd = 5.0)
    alpha = pm.Normal('alpha',sd = 5.0)
    y_hat = beta * x + alpha
    y = pm.Normal('y',mu = y_hat,sd = sigma, shape = n,observed = observed_y)
    trace = pm.sample()
    ppc = pm.sample_ppc(trace, samples=1)

Then, I plot the samples of Y versus the samples of X
plt.scatter(ppc['x'],ppc['y']) and I find that they are uncorrelated:
image1

Now, I was expecting it to instead look like this:

plt.scatter(observed_x,ppc['y'])
image2

@junpenglao pointed out that sample_ppc is not drawing samples of y conditional on the new values x but rather from the distribution conditional on the observed data observed_x which is not what we might expect. As a consequence, we get samples of x and y from their joint distribution which do not have the right correlation.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions