Description
Hey 👋! It's me (again 🙈). I was reading the paper "Bayesian additive regression trees for probabilistic programming" and I was pleasantly surprised by the model presented in Section 4.4 (Heteroscedasticity), in particular Code Block 6, where you model the mean and variance with a `BART` model of size 2. I think this example should be in the documentation.
The most interesting part would be to clarify the connection between the `Y` parameter (the response vector) of the `BART` model and the likelihood. Note that in the code

    with pm.Model() as model_marketing_full:
        w = pmb.BART("w", X, Y, m=200, size=2)
        y = pm.Normal("y", w[0], np.abs(w[1]), observed=Y)
        idata_marketing_full = pm.sample()

`w[1]` (i.e. the variance estimate, which enters `pm.Normal` as the scale after `np.abs`) is estimated using `Y`. I am sometimes confused about the relationship between the `Y` parameter and the `BART` random variable (see for example #31). I think adding this example would benefit new users a lot.
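
To make the likelihood side of my question concrete, here is a minimal pure-Python sketch of what I understand the `pm.Normal("y", w[0], np.abs(w[1]), observed=Y)` term to evaluate, assuming `w[0]` and `w[1]` each hold one value per observation. The function name and the toy numbers are made up for illustration and are not from the paper; the sketch says nothing about how `BART` itself uses `Y` internally, which is the part I would like the docs to explain.

```python
import math


def heteroscedastic_normal_logp(y, mu, scale):
    """Sum of Normal log-densities with a per-observation mean and scale."""
    total = 0.0
    for y_i, mu_i, s_i in zip(y, mu, scale):
        sigma_i = abs(s_i)  # mirrors np.abs(w[1]) in the model above
        total += (
            -0.5 * math.log(2 * math.pi * sigma_i**2)
            - (y_i - mu_i) ** 2 / (2 * sigma_i**2)
        )
    return total


# Toy values standing in for Y, w[0] and w[1] (hypothetical, for illustration).
Y = [1.0, 2.5, 4.0]
w0 = [1.1, 2.0, 4.5]   # per-observation means
w1 = [0.5, -1.0, 2.0]  # per-observation scales; the sign is dropped by abs()

logp = heteroscedastic_normal_logp(Y, w0, w1)
# Flipping the sign of every scale leaves the log-likelihood unchanged.
logp_flipped = heteroscedastic_normal_logp(Y, w0, [-s for s in w1])
print(logp, logp_flipped)
```

So, if I read it right, `Y` enters this term together with both components of `w`, on top of being passed to `pmb.BART` directly.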
I could draft a PR. Should it be added to the same notebook as in pymc-devs/pymc-examples#507?
Thanks 🙂