Mention likely multi-threading issues on BVAR notebook

### Describe the issue:

When running the [Bayesian Vector Autoregressive Models — PyMC example gallery 1](https://www.pymc.io/projects/examples/en/latest/time_series/bayesian_var_model.html) on x86_64 (tested on Ubuntu 20.04 + fresh install of PyMC via official [installation instructions](https://www.pymc.io/projects/docs/en/latest/installation.html)), the sampling time explodes to tens of hours instead of the expected few minutes. I can reproduce the issue on multiple machines.

Differently, on arm64 (M1 Macbook) the issue does not occur, and the sampling time is a few minutes as expected.

Below is a minimal example, extracted from the notebook above, that consistently shows the issue on x86_64.

The critical line is l.113, where `pm.sample()` has default values and the NUTS sampler is auto-assigned.

By manually specifying `pm.sample(cores=1, ...` or using a non-default NUTS sampler (`nuts_sampler="blackjax"`, or `nuts_sampler="numpyro"`), the issue disappears on x86_64.

Related discussion on Discord [here](https://discourse.pymc.io/t/bayesian-var-example-notebook-extremely-low-sampling-rate/12178/6).

Screenshot of the issue:
<img width="1453" alt="Screen Shot 2023-05-22 at 2 10 44 PM" src="https://github.com/pymc-devs/pymc/assets/69366/f3d5e8bd-741f-450a-9810-c2125811440a">


### Reproduceable code example:

```python
import numpy as np
import pandas as pd
import pymc as pm


RANDOM_SEED = 8927
rng = np.random.default_rng(RANDOM_SEED)

def simulate_var(
    intercepts, coefs_yy, coefs_xy, coefs_xx, coefs_yx, noises=(1, 1), *, warmup=100, steps=200
):
    draws_y = np.zeros(warmup + steps)
    draws_x = np.zeros(warmup + steps)
    draws_y[:2] = intercepts[0]
    draws_x[:2] = intercepts[1]
    for step in range(2, warmup + steps):
        draws_y[step] = (
            intercepts[0]
            + coefs_yy[0] * draws_y[step - 1]
            + coefs_yy[1] * draws_y[step - 2]
            + coefs_xy[0] * draws_x[step - 1]
            + coefs_xy[1] * draws_x[step - 2]
            + rng.normal(0, noises[0])
        )
        draws_x[step] = (
            intercepts[1]
            + coefs_xx[0] * draws_x[step - 1]
            + coefs_xx[1] * draws_x[step - 2]
            + coefs_yx[0] * draws_y[step - 1]
            + coefs_yx[1] * draws_y[step - 2]
            + rng.normal(0, noises[1])
        )
    return draws_y[warmup:], draws_x[warmup:]


### Define a helper function that will construct our autoregressive step for the marginal contribution of each lagged
### term in each of the respective time series equations
def calc_ar_step(lag_coefs, n_eqs, n_lags, df):
    ars = []
    for j in range(n_eqs):
        ar = pm.math.sum(
            [
                pm.math.sum(lag_coefs[j, i] * df.values[n_lags - (i + 1) : -(i + 1)], axis=-1)
                for i in range(n_lags)
            ],
            axis=0,
        )
        ars.append(ar)
    beta = pm.math.stack(ars, axis=-1)

    return beta


### Make the model in such a way that it can handle different specifications of the likelihood term
### and can be run for simple prior predictive checks. This latter functionality is important for debugging of
### shape handling issues. Building a VAR model involves quite a few moving parts and it is handy to
### inspect the shape implied in the prior predictive checks.
def make_model(n_lags, n_eqs, df, priors, mv_norm=True, prior_checks=True):
    coords = {
        "lags": np.arange(n_lags) + 1,
        "equations": df.columns.tolist(),
        "cross_vars": df.columns.tolist(),
        "time": [x for x in df.index[n_lags:]],
    }

    with pm.Model(coords=coords) as model:
        lag_coefs = pm.Normal(
            "lag_coefs",
            mu=priors["lag_coefs"]["mu"],
            sigma=priors["lag_coefs"]["sigma"],
            dims=["equations", "lags", "cross_vars"],
        )
        alpha = pm.Normal(
            "alpha", mu=priors["alpha"]["mu"], sigma=priors["alpha"]["sigma"], dims=("equations",)
        )
        data_obs = pm.Data("data_obs", df.values[n_lags:], dims=["time", "equations"], mutable=True)

        betaX = calc_ar_step(lag_coefs, n_eqs, n_lags, df)
        betaX = pm.Deterministic(
            "betaX",
            betaX,
            dims=[
                "time",
            ],
        )
        mean = alpha + betaX

        if mv_norm:
            n = df.shape[1]
            ## Under the hood the LKJ prior will retain the correlation matrix too.
            noise_chol, _, _ = pm.LKJCholeskyCov(
                "noise_chol",
                eta=priors["noise_chol"]["eta"],
                n=n,
                sd_dist=pm.HalfNormal.dist(sigma=priors["noise_chol"]["sigma"]),
            )
            obs = pm.MvNormal(
                "obs", mu=mean, chol=noise_chol, observed=data_obs, dims=["time", "equations"]
            )
        else:
            ## This is an alternative likelihood that can recover sensible estimates of the coefficients
            ## But lacks the multivariate correlation between the timeseries.
            sigma = pm.HalfNormal("noise", sigma=priors["noise"]["sigma"], dims=["equations"])
            obs = pm.Normal(
                "obs", mu=mean, sigma=sigma, observed=data_obs, dims=["time", "equations"]
            )

        if prior_checks:
            idata = pm.sample_prior_predictive()
            return model, idata
        else:
            idata = pm.sample_prior_predictive()
            idata.extend(pm.sample(draws=2000, random_seed=130))
            pm.sample_posterior_predictive(idata, extend_inferencedata=True, random_seed=rng)
    return model, idata


if __name__=="__main__":

    
    var_y, var_x = simulate_var(
        intercepts=(18, 8),
        coefs_yy=(-0.8, 0),
        coefs_xy=(0.9, 0),
        coefs_xx=(1.3, -0.7),
        coefs_yx=(-0.1, 0.3),
    )
    df = pd.DataFrame({"x": var_x, "y": var_y})
    df.head()
    
    n_lags = 2
    n_eqs = 2
    priors = {
        "lag_coefs": {"mu": 0.3, "sigma": 1},
        "alpha": {"mu": 15, "sigma": 5},
        "noise_chol": {"eta": 1, "sigma": 1},
        "noise": {"sigma": 1},
    }
    model, idata_fake_data = make_model(n_lags, n_eqs, df, priors, prior_checks=False)
```


### Error message:

```shell
No error message, just the code hanging (almost) forever.
```


### PyMC version information:

Fresh install of PyMC via official [installation instructions](https://www.pymc.io/projects/docs/en/latest/installation.html).

```
%load_ext watermark
%watermark -n -u -v -iv -w -p pytensor,aeppl,xarray
----
Last updated: Tue May 23 2023

Python implementation: CPython
Python version       : 3.11.3
IPython version      : 8.13.2

pytensor: 2.11.2
aeppl   : not installed
xarray  : 2023.5.0

pymc     : 5.3.0
watermark: 2.3.1
pandas   : 2.0.1
numpy    : 1.24.3

Watermark: 2.3.1
```

### Context for the issue:

1. This [example](https://www.pymc.io/projects/examples/en/latest/time_series/bayesian_var_model.html) in the PyMC documentation becomes unusable for a fair share of users.
2. The reason behind the issue may affect PyMC users in general, irrespective of the example.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Mention likely multi-threading issues on BVAR notebook #550

Describe the issue:

Reproduceable code example:

Error message:

PyMC version information:

Context for the issue:

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Mention likely multi-threading issues on BVAR notebook #550

Description

Describe the issue:

Reproduceable code example:

Error message:

PyMC version information:

Context for the issue:

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions