BUG: input a ndarray type data but still raise an error: setting an array element with a sequence. #7597

Open
@JamGengGeng

Description

Describe the issue:

pm.gp.Marginal.conditional expects an array-like Xnew. I passed an ndarray, but it still raises a ValueError (setting an array element with a sequence). Should Xnew be a tensor variable instead?

Reproducible code example:

import numpy as np
import pandas as pd
import pymc as pm
from sklearn.neighbors import KernelDensity

data = pd.read_csv('QDN_AC.csv')
X = data['Lon'].astype(float)
Y = data['Lat'].astype(float)
Z = data['AC production'].astype(float)

coords = np.vstack([X, Y]).T
kde = KernelDensity(kernel='gaussian', bandwidth=1.0).fit(coords)
density = kde.score_samples(coords)

with pm.Model() as model_kde_gp:
    sigma = pm.HalfNormal('sigma', sigma=1)
    length_scale = pm.Gamma('length_scale', alpha=2, beta=1)

    mean_func = pm.gp.mean.Zero()
    cov_func = pm.gp.cov.ExpQuad(input_dim=1, ls=length_scale)
    gp = pm.gp.Marginal(mean_func=mean_func, cov_func=cov_func)

    Y_obs = gp.marginal_likelihood('Y_obs', X=density[:, None], y=Z, noise=sigma)

    trace_kde_gp = pm.sample(10, return_inferencedata=True)

grid_x, grid_y = np.meshgrid(
    np.linspace(X.min(), X.max(), 100),
    np.linspace(Y.min(), Y.max(), 100),
)
grid_coords = np.vstack([grid_x.ravel(), grid_y.ravel()]).T
grid_density = kde.score_samples(grid_coords)
grid_coords = np.array(grid_coords)

with model_kde_gp:
    p = 0.1
    gp_pred = gp.conditional('gp_pred', Xnew=grid_coords) 
    pred_samples = pm.sample_posterior_predictive(trace_kde_gp, var_names=['gp_pred'], random_seed=42)

pred_mean = pred_samples['gp_pred'].mean(axis=0)
pred_lower = np.percentile(pred_samples['gp_pred'], 2.5, axis=0)
pred_upper = np.percentile(pred_samples['gp_pred'], 97.5, axis=0)

grid_density = np.exp(grid_density).reshape(grid_x.shape)
pred_mean = pred_mean.reshape(grid_x.shape)
pred_lower = pred_lower.reshape(grid_x.shape)
pred_upper = pred_upper.reshape(grid_x.shape)
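
A side note on the summary lines above, separate from the reported error: pm.sample_posterior_predictive returns an InferenceData object by default, so the draws would normally be read from its posterior_predictive group and carry both a chain and a draw axis. The sketch below uses a synthetic NumPy array as a stand-in for those draws (the shape, the random values, and the 2 x 3 grid are assumptions for illustration only) and shows one way to summarise over both sampling axes before reshaping.

```python
import numpy as np

# Stand-in for pred_samples.posterior_predictive['gp_pred'].values,
# assumed to have shape (chain, draw, n_points). The values are synthetic.
rng = np.random.default_rng(0)
samples = rng.normal(size=(2, 10, 6))

# Summarise over the chain and draw axes together, not only axis 0.
pred_mean = samples.mean(axis=(0, 1))
pred_lower = np.percentile(samples, 2.5, axis=(0, 1))
pred_upper = np.percentile(samples, 97.5, axis=(0, 1))

# Reshape onto a hypothetical 2 x 3 grid, mirroring grid_x.shape in the report.
pred_mean_grid = pred_mean.reshape(2, 3)
pred_lower_grid = pred_lower.reshape(2, 3)
pred_upper_grid = pred_upper.reshape(2, 3)
```

With the real data, the reshape target would be grid_x.shape, as in the original code.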

Error message:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
TypeError: float() argument must be a string or a real number, not 'TensorVariable'

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
Cell In[9], line 37
     35 with model_kde_gp:
     36     p = 0.1
---> 37     gp_pred = gp.conditional('gp_pred', Xnew=grid_coords) 
     38     pred_samples = pm.sample_posterior_predictive(trace_kde_gp, var_names=['gp_pred'], random_seed=42)
     40 pred_mean = pred_samples['gp_pred'].mean(axis=0)

File ~/miniconda3/envs/ai/lib/python3.10/site-packages/pymc/gp/gp.py:624, in Marginal.conditional(self, name, Xnew, pred_noise, given, jitter, **kwargs)
    591 R"""
    592 Return the conditional distribution evaluated over new input locations `Xnew`.
    593 
   (...)
    621     constructor.
    622 """
    623 givens = self._get_given_vals(given)
--> 624 mu, cov = self._build_conditional(Xnew, pred_noise, False, *givens, jitter)
    625 return pm.MvNormal(name, mu=mu, cov=cov, **kwargs)
...
--> 820     subarr = np.asarray(arr, dtype=dtype)
    821 else:
    822     subarr = np.array(arr, dtype=dtype, copy=copy)

ValueError: setting an array element with a sequence.
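
A possible cause, offered as a guess and not a confirmed diagnosis: the reproduction passes y=Z as a pandas Series, and Xnew=grid_coords has two columns while the GP was fit on the single density column with input_dim=1. The first TypeError in the traceback mentions a TensorVariable being cast to float, which would be consistent with a non-NumPy container taking part in the arithmetic inside the conditional. The sketch below coerces inputs to plain float ndarrays with the expected shape before they reach the GP. The helper names to_gp_inputs and to_gp_targets are hypothetical and are not part of PyMC.

```python
import numpy as np


def to_gp_inputs(values, n_columns=1):
    """Hypothetical helper: coerce array-like data to a float ndarray of shape (n, n_columns)."""
    arr = np.asarray(values, dtype=float)
    if arr.ndim == 1:
        # Treat a flat vector as a single input column.
        arr = arr[:, None]
    if arr.ndim != 2 or arr.shape[1] != n_columns:
        raise ValueError(f"expected shape (n, {n_columns}), got {arr.shape}")
    return arr


def to_gp_targets(values):
    """Hypothetical helper: coerce array-like targets to a 1-D float ndarray."""
    arr = np.asarray(values, dtype=float)
    if arr.ndim != 1:
        raise ValueError(f"expected a 1-D array, got shape {arr.shape}")
    return arr


# Intended use with the names from the report (left as comments because it
# needs the model and data defined above):
#   gp.marginal_likelihood('Y_obs', X=to_gp_inputs(density), y=to_gp_targets(Z), noise=sigma)
#   gp.conditional('gp_pred', Xnew=to_gp_inputs(grid_density))

# Small self-contained check with made-up numbers.
X_train = to_gp_inputs([0.1, 0.2, 0.3])
y_train = to_gp_targets([1.0, 2.0, 3.0])
```

Under this assumption, passing the two-column grid_coords through to_gp_inputs would fail early with a shape message, pointing at grid_density as the matching input for Xnew.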

PyMC version information:

PyMC version: 5.18.2

Context for the issue:

No response
