Skip to content

predictive sampling fails with multivariate posterior  #5002

Closed
@falkmielke

Description

@falkmielke

description

  • goal: pm.sample_posterior_predictive() from a model+trace
  • but: draw a different number of samples than originally observed
  • this fails when using a multivariate likelihood/posterior/observed with LKJ/Cholesky prior.

Thanks in advance for looking into this!

additional information
cf. https://discourse.pymc.io/t/prediction-setting-data-fails-with-multivariate-observed/8011

I suspect some data-shape-dependent theano object is initialized with the LKJ prior.
This is unexpected; I don't see how it depends on the data (but I'm uninformed).
A quick solution would be to convert that internal to a shared "pm.Data" which can be accessed ex post.

example
(note the "KILLSWITCH")

import os as os
import numpy as np
import pandas as pd
import pymc3 as pm
import scipy.stats as stats
import theano as th
import theano.tensor as tt
import matplotlib.pyplot as plt


### simulated data
print ('#### Simulation ####')
n_observations = 99
n_observables = 2

mean = [-0.5, 0.2]
cov = [[0.8, -0.6], [-0.6, 1.4]]
obs0, obs1 = np.random.multivariate_normal(mean, cov, n_observations).T

print ('obs corr:', stats.pearsonr(obs0, obs1))


# combining the data
data = pd.DataFrame.from_dict({ \
                                  'index': np.arange(n_observations) \
                                , 'obs0': obs0 \
                                , 'obs1': obs1 \
                              })

### Model Construction and sampling
print ('#### Modeling ####')
# shared theano objects (pm.Data)
shared = {}

with pm.Model() as model:

    # using pm.DAta; yet the error also occurs when using th.shared directly
    shared['interecept'] = pm.Data(f'intercept_data', np.ones((n_observations, 1)))

    ## intercept
    intercept = pm.Normal( 'intercept' \
                         , mu = 0., sd = 1. \
                         , shape = (1, n_observables) \
                        )
    estimator = tt.dot(shared['interecept'], intercept)


    ## observable
    if False: # <- this is the "KILLSWITCH"
        # individual observables
        # with this, prediction WORKS

        # residual
        residual = pm.HalfCauchy('residual', 1., shape = (1,n_observables))

        posterior = pm.Normal(  'posterior' \
                                , mu = estimator \
                                , sd = residual \
                                , observed = data.loc[:, ['obs0', 'obs1']].values \
                              )
    else:
        # multivariate observables
        # with this, prediction FAILS

        # LKJ Prior
        sd_dist = pm.HalfNormal.dist(1.)
        packed_cholesky = pm.LKJCholeskyCov(  'packed_cholesky' \
                                              , n = n_observables \
                                              , eta = 1. \
                                              , sd_dist = sd_dist \
                                            )

        # compute the observables covariance and correlation matrix
        cholesky_matrix = pm.expand_packed_triangular(n_observables, packed_cholesky, lower = True)

        # posterior|likelihood|observables
        posterior = pm.MvNormal(  'posterior' \
                                , mu = estimator \
                                , chol = cholesky_matrix \
                                , observed = data.loc[:, ['obs0', 'obs1']].values \
                               )

    # inference
    trace = pm.sample(  draws = 2**10, tune = 2**10 \
                      # , cores = 4, chains = 4 \
                      , return_inferencedata = True \
                      )

# show outcome
print (pm.summary(trace))
pm.plot_trace(trace)
plt.show()

### predictive sampling
print ('#### Prediction ####')

# CRITICAL: the desired number of predictions is different from the number of observations
n_predictions = 57

# adjust intercept shape
probe_data = {}
probe_data['intercept_data'] = np.ones((n_predictions, 1))

# how often to repeat the n_predictions draws
n_repeats = 10

with model:
    pm.set_data(probe_data)
    prediction = pm.sample_posterior_predictive(trace, samples = n_repeats)

print (prediction)

traceback:

Complete error traceback
Traceback (most recent call last):--------------------------------------------------------------------| 0.00% [0/10 00:00<00:00]
  File "/usr/lib/python3.9/site-packages/pymc3/distributions/shape_utils.py", line 174, in broadcast_dist_samples_shape
    broadcast_shape = shapes_broadcasting(*sp_shapes, raise_exception=True)
  File "/usr/lib/python3.9/site-packages/pymc3/distributions/shape_utils.py", line 101, in shapes_broadcasting
    raise ValueError(
ValueError: Supplied shapes (57, 2), (99, 2) do not broadcast together

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/pymc3/distributions/distribution.py", line 971, in _draw_value
    return dist_tmp.random(point=point, size=size)
  File "/usr/lib/python3.9/site-packages/pymc3/distributions/multivariate.py", line 275, in random
    mu = broadcast_dist_samples_to(to_shape=output_shape, samples=[mu], size=size)[0]
  File "/usr/lib/python3.9/site-packages/pymc3/distributions/shape_utils.py", line 396, in broadcast_dist_samples_to
    samples, to_shape = get_broadcastable_dist_samples(
  File "/usr/lib/python3.9/site-packages/pymc3/distributions/shape_utils.py", line 271, in get_broadcastable_dist_samples
    out_shape = broadcast_dist_samples_shape(p_shapes, size=size)
  File "/usr/lib/python3.9/site-packages/pymc3/distributions/shape_utils.py", line 176, in broadcast_dist_samples_shape
    raise ValueError(
ValueError: Cannot broadcast provided shapes (57, 2), (99, 2) given size: ()

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/01_kinematics/14_papio_fcas/archive/MinimalExample_Prediction.py", line 127, in <module>
    prediction = pm.sample_posterior_predictive(trace, samples = n_repeats)
  File "/usr/lib/python3.9/site-packages/pymc3/sampling.py", line 1733, in sample_posterior_predictive
    values = draw_values(vars_, point=param, size=size)
  File "/usr/lib/python3.9/site-packages/pymc3/distributions/distribution.py", line 791, in draw_values
    value = _draw_value(next_, point=point, givens=temp_givens, size=size)
  File "/usr/lib/python3.9/site-packages/pymc3/distributions/distribution.py", line 979, in _draw_value
    val = np.atleast_1d(dist_tmp.random(point=point, size=None))
  File "/usr/lib/python3.9/site-packages/pymc3/distributions/multivariate.py", line 276, in random
    param = np.broadcast_to(param, shape=output_shape + dist_shape[-1:])
  File "<__array_function__ internals>", line 5, in broadcast_to
  File "/usr/lib/python3.9/site-packages/numpy/lib/stride_tricks.py", line 411, in broadcast_to
    return _broadcast_to(array, shape, subok=subok, readonly=True)
  File "/usr/lib/python3.9/site-packages/numpy/lib/stride_tricks.py", line 343, in _broadcast_to
    raise ValueError('cannot broadcast a non-scalar to a scalar array')
ValueError: cannot broadcast a non-scalar to a scalar array

Versions and main components

  • PyMC3 Version: 3.11.4
  • Theano Version: 1.1.2
  • Python Version: 3.9.7 (default, Aug 31 2021, 13:28:12)
  • Operating system: arch linux
  • How did you install PyMC3: pip

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions