Skip to content

Updates to getting started, GLM and model comparison notebooks #5269

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
Jan 7, 2022

Conversation

fonnesbeck
Copy link
Member

This updates the GLM and model comparison notebooks to use v4.0 beta.

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@codecov
Copy link

codecov bot commented Dec 17, 2021

Codecov Report

Merging #5269 (59c9981) into main (cce1613) will increase coverage by 0.98%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #5269      +/-   ##
==========================================
+ Coverage   79.17%   80.16%   +0.98%     
==========================================
  Files          88       88              
  Lines       14346    14734     +388     
==========================================
+ Hits        11359    11812     +453     
+ Misses       2987     2922      -65     
Impacted Files Coverage Δ
pymc/backends/ndarray.py 79.64% <0.00%> (-9.74%) ⬇️
pymc/bart/utils.py 82.60% <0.00%> (-5.92%) ⬇️
pymc/tests/sampler_fixtures.py 84.89% <0.00%> (-5.88%) ⬇️
pymc/bart/bart.py 94.33% <0.00%> (-1.58%) ⬇️
pymc/tests/backend_fixtures.py 76.04% <0.00%> (-1.52%) ⬇️
pymc/util.py 74.34% <0.00%> (-0.66%) ⬇️
pymc/distributions/simulator.py 86.11% <0.00%> (-0.66%) ⬇️
pymc/tests/models.py 87.14% <0.00%> (-0.63%) ⬇️
pymc/sampling.py 86.27% <0.00%> (-0.54%) ⬇️
pymc/smc/smc.py 97.90% <0.00%> (-0.40%) ⬇️
... and 42 more

@review-notebook-app
Copy link

review-notebook-app bot commented Dec 18, 2021

View / edit / reply to this conversation on ReviewNB

OriolAbril commented on 2021-12-18T11:11:59Z
----------------------------------------------------------------

There are mentions to Thomas being the author both at the top and bottom of the notebook, I would keep it at the bottom with the link to the original blogpost. There are also links to other blogposts in the glm series that point to Thomas' blog, I don't know if those have been converted to examples too or if @twiecki is planning to update the blogposts to v4.

IMO the ideal scenario would be converting those links to links to pymc examples (updated to use v4 sometime soon) and having those notebooks point to the original blogpost. Pointing to the blogposts would be fine if they are updated to use v4 but I would not keep links to the blogposts if they still use v3. This notebook is now part of v4 documentation and should only link to other v4 doc pages and use v4 code (except maybe in the migration guide to compare).


fonnesbeck commented on 2021-12-21T16:40:59Z
----------------------------------------------------------------

Done

@review-notebook-app
Copy link

review-notebook-app bot commented Dec 18, 2021

View / edit / reply to this conversation on ReviewNB

OriolAbril commented on 2021-12-18T11:12:00Z
----------------------------------------------------------------

the mention to glm module needs to be updated.


fonnesbeck commented on 2021-12-21T16:29:38Z
----------------------------------------------------------------

Removed

@review-notebook-app
Copy link

review-notebook-app bot commented Dec 18, 2021

View / edit / reply to this conversation on ReviewNB

OriolAbril commented on 2021-12-18T11:12:00Z
----------------------------------------------------------------

I assume this is due to bambi still being v3 only. @tomicapretto @aloctavodia is development bambi version able to run on v4?

@fonnesbeck proposed not using bambi to circumvent this in slack which could be helpful, but then we'd need to modify the notebook to remove all mentions to easier formula syntax specification for glms which is one of the messages.


tomicapretto commented on 2021-12-18T14:51:44Z
----------------------------------------------------------------

You're right. Bambi still relies on v3 and it uses Theano.

The development version is not able to run on v4, but I think it is something we could implement without much difficulty. There's only one thing that may be problematic and it is related to the LKJCholeskyCov distribution. I don't know if it is already working on v4 already.

I plan to make a new release in the first half of January. We could work on porting Bambi development version to v4 after that. 

fonnesbeck commented on 2021-12-18T16:00:46Z
----------------------------------------------------------------

I had theano-pymc installed in the env, which caused this. Goes away when I remove it.

twiecki commented on 2021-12-21T08:46:28Z
----------------------------------------------------------------

I'd either update the cell or just manually type out that this was on version 4.0.0b1. We update these NB so infrequently that we can expect these warnings to be here for a long time.

OriolAbril commented on 2021-12-21T08:50:52Z
----------------------------------------------------------------

going forward, this might still be true for pymc-examples but won't and can't be the case with these notebooks. Moving them back here and doing so with only a handful is to make them part of the versioned docs. Will try to set up CI to autoexecute and everything, but these notebooks need to always be executed with the current version of the documentation. If that can't be for some reason (i.e. too long to run on ci) then they have to be moved back to pymc-examples.

fonnesbeck commented on 2021-12-21T16:38:51Z
----------------------------------------------------------------

Should be fixed now. bambi installs theano-pymc as a dependency, which is not ideal, but I assume that goes away in the near future when it is updated to PyMC4/Aesara.

@review-notebook-app
Copy link

review-notebook-app bot commented Dec 18, 2021

View / edit / reply to this conversation on ReviewNB

OriolAbril commented on 2021-12-18T11:12:01Z
----------------------------------------------------------------

Use

import sys

!{sys.executable} -m pip install bambi

instead. There are generally no problems with the !pip but it can install the package on a different kernel than the one used by the notebook.



fonnesbeck commented on 2021-12-21T16:30:05Z
----------------------------------------------------------------

Done

@review-notebook-app
Copy link

review-notebook-app bot commented Dec 18, 2021

View / edit / reply to this conversation on ReviewNB

OriolAbril commented on 2021-12-18T11:12:02Z
----------------------------------------------------------------

this function is not needed, we can use plot_lm from arviz, see below for what I think would work.


fonnesbeck commented on 2021-12-21T16:41:27Z
----------------------------------------------------------------

Fixed

@review-notebook-app
Copy link

review-notebook-app bot commented Dec 18, 2021

View / edit / reply to this conversation on ReviewNB

OriolAbril commented on 2021-12-18T11:12:03Z
----------------------------------------------------------------

_, ax = plt.subplots(figsize=(7, 7))
y_model = trace.posterior["Intercept"] + trace.posterior["x"] * xr.DataArray(x)
az.plot_lm(
    idata=trace, 
    num_samples=100,
    x=x,
    y="y",
    y_model=y_model,
    y_kwargs=dict(marker="x", color="C0"),
    y_model_plot_kwargs=dict(lw=0.2, c="k"),
    axes=ax,
)
ax.plot(x, true_regression_line, label="true regression line", lw=3.0, c="y")

ax.set_title("Posterior predictive regression lines")
ax.legend(loc=0)
ax.set_xlabel("x")
ax.set_ylabel("y");

@review-notebook-app
Copy link

review-notebook-app bot commented Dec 18, 2021

View / edit / reply to this conversation on ReviewNB

OriolAbril commented on 2021-12-18T11:12:04Z
----------------------------------------------------------------

while we are here, I would use PyMC instead of pymc


Copy link
Contributor

You're right. Bambi still relies on v3 and it uses Theano.

The development version is not able to run on v4, but I think it is something we could implement without much difficulty. There's only one thing that may be problematic and it is related to the LKJCholeskyCov distribution. I don't know if it is already working on v4 already.

I plan to make a new release in the first half of January. We could work on porting Bambi development version to v4 after that. 


View entire conversation on ReviewNB

Copy link
Member Author

I had theano-pymc installed in the env, which caused this. Goes away when I remove it.


View entire conversation on ReviewNB

@review-notebook-app
Copy link

review-notebook-app bot commented Dec 19, 2021

View / edit / reply to this conversation on ReviewNB

michaelosthege commented on 2021-12-19T18:43:21Z
----------------------------------------------------------------

The introduction is rather outdated:

  • "preview of PyMC" ??
  • "its new GLM submodule" was just nuked 😅
  • That DARPA grant was in 2013 and with all the innovation since I don't think the usability argument is still relevant

fonnesbeck commented on 2021-12-21T16:30:33Z
----------------------------------------------------------------

I've revised all this now.

@review-notebook-app
Copy link

review-notebook-app bot commented Dec 19, 2021

View / edit / reply to this conversation on ReviewNB

michaelosthege commented on 2021-12-19T18:43:22Z
----------------------------------------------------------------

the 2000 is not aligned with the cell below. Just drop it?


fonnesbeck commented on 2021-12-21T16:30:49Z
----------------------------------------------------------------

Fixed.

Copy link
Member

@OriolAbril OriolAbril left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

marking with request changes to make sure we update the text to reflect what the code does and the current situation.

I also think it should not use first person "I" anymore as it's no longer a blogpost but a collaborative page in the docs.

@review-notebook-app
Copy link

review-notebook-app bot commented Dec 20, 2021

View / edit / reply to this conversation on ReviewNB

OriolAbril commented on 2021-12-20T00:04:47Z
----------------------------------------------------------------

the bambi link is broken


fonnesbeck commented on 2021-12-21T16:30:55Z
----------------------------------------------------------------

Fixed.

@fonnesbeck fonnesbeck changed the title Updates to GLM and model comparison notebooks Updates to getting started, GLM and model comparison notebooks Dec 20, 2021
@fonnesbeck
Copy link
Member Author

Draft of getting started notebook added. Will finish it up in the next few hours.

@OriolAbril
Copy link
Member

OriolAbril commented Dec 20, 2021

could we rename that notebook to intro_for_ppl_practitioners or something along these lines?

I think it is a great notebook for this audience (which imo is what it was aiming for, after all this is based on the paper whose audience is clearly ppl practitioners who keep up with the literature), but it's not really a getting started notebook. New users writing their first bayesian model with a ppl should not need arbitrary logprob terms, distributions nor using aesara explicitly and therefore none of that should be mentioned when writing a tutorial. I think having a "pressing void" on a getting started for pymc newcomers might make it a bit more of a priority.

On a related note, maybe we could base the getting started notebook on @AustinRochford webinar for the upcoming pymc-data umbrella series which could then be notebook with a talk.

@twiecki
Copy link
Member

twiecki commented Dec 20, 2021

Completely agree @OriolAbril. I think we want a "Getting started if you know Bayesian modeling" and one if you don't.

@fonnesbeck
Copy link
Member Author

fonnesbeck commented Dec 20, 2021

I'm all for that, but we should probably not let that be a blocker for getting out of beta.

How about we rename this one pymc_overview.ipynb which is a more accurate description? We can then add the intro for non-Bayesians notebook later. Its not clear whether that would belong here or with pymc-examples.

@OriolAbril
Copy link
Member

OriolAbril commented Dec 21, 2021

I'm all for that, but we should probably not let that be a blocker for getting out of beta.

agreed, the blockers should be having everything in the api following the template in #5266 and addressing the TODOs currently visible in the website. @martinacantaro knows better and we'll try to organize when she is back after new year.

How about we rename this one pymc_overview.ipynb which is a more accurate description?

that is perfect, we do need to update the toctrees now though so the notebook appears in the website, relevant sidebars and so on, not sure if I'll have time to do this sometime soon.

We can then add the intro for non-Bayesians notebook later. Its not clear whether that would belong here or with pymc-examples.

It should definitely be here I think. The idea was to have a handful of notebooks included in the versioned docs (and therefore always executed with the version of the docs, I'll try to set up CI for this) covering the core functionalities of pymc.

@review-notebook-app
Copy link

review-notebook-app bot commented Dec 21, 2021

View / edit / reply to this conversation on ReviewNB

OriolAbril commented on 2021-12-21T08:44:35Z
----------------------------------------------------------------

the target also needs to be updated


fonnesbeck commented on 2021-12-21T16:45:53Z
----------------------------------------------------------------

Done

@review-notebook-app
Copy link

review-notebook-app bot commented Dec 22, 2021

View / edit / reply to this conversation on ReviewNB

twiecki commented on 2021-12-22T12:44:26Z
----------------------------------------------------------------

Why not use the defaults here and generate 1000 samples.


@review-notebook-app
Copy link

review-notebook-app bot commented Dec 22, 2021

View / edit / reply to this conversation on ReviewNB

twiecki commented on 2021-12-22T12:44:27Z
----------------------------------------------------------------

Maybe this can be removed as well, no one uses these samplers, and no one should.


fonnesbeck commented on 2021-12-24T21:50:18Z
----------------------------------------------------------------

I think its important to show that other samplers can be used. There's nothing wrong with the slice sampler.

@review-notebook-app
Copy link

review-notebook-app bot commented Dec 22, 2021

View / edit / reply to this conversation on ReviewNB

twiecki commented on 2021-12-22T12:44:28Z
----------------------------------------------------------------

Maybe good opportunity to show how increasing target_accept would help here (if it does)?


fonnesbeck commented on 2021-12-24T21:50:37Z
----------------------------------------------------------------

Good idea.

@twiecki
Copy link
Member

twiecki commented Dec 22, 2021

I think these could be great getting started guides already. one for if you do not know Bayesian modeling, one if you do.

Copy link
Member Author

Its a horseshoe prior. If we chose normal priors we would not be doing sparse regression.


View entire conversation on ReviewNB

Copy link
Member Author

I can remove it.


View entire conversation on ReviewNB

Copy link
Member Author

I think its important to show that other samplers can be used. There's nothing wrong with the slice sampler.


View entire conversation on ReviewNB

Copy link
Member Author

Good idea.


View entire conversation on ReviewNB

Copy link
Member Author

No, that strips away all the x-axes labels except for the bottom one.


View entire conversation on ReviewNB

Copy link
Member Author

The plot looks better but it does a poor job with the labels (see newly-pushed version)


View entire conversation on ReviewNB

Copy link
Member Author

I can't seem to get any of these working here. I will leave them as markdown for now.


View entire conversation on ReviewNB

Copy link
Member

Oh yeah, it looks like ArviZ sets the grid and ticklables to match the centers of the bins. Changing the order should look better but we should probably add a way to turn this off in ArviZ and have regular grid and ticklabels not tied to any histogram (which doesn't make much sense when comparing)


View entire conversation on ReviewNB

Copy link
Member

wasn't sure it'd work, let's forget about this


View entire conversation on ReviewNB

Copy link
Member Author

The problem here is that I want the text to say "platform-specific installation guides" and not simply "installation"


View entire conversation on ReviewNB

Copy link
Member Author

Similar to above, I don't want the text to say "api" I want it to say "API documentation".


View entire conversation on ReviewNB

@OriolAbril
Copy link
Member

There are some examples of custom text in https://sphinx-primer.readthedocs.io/en/latest/core/roles.html

Copy link
Member Author

I've updated these passages


View entire conversation on ReviewNB

Copy link
Member Author

Done


View entire conversation on ReviewNB

Copy link
Member Author

done


View entire conversation on ReviewNB

Copy link
Member Author

Unfortunately swapping the order creates xticklabels that overlap to the point of unreadability. I have left it as is for the time being.


View entire conversation on ReviewNB

@fonnesbeck
Copy link
Member Author

I think I've addressed all of the comments as best as possible. Let me know if there is anything else that needs attention.

@OriolAbril
Copy link
Member

{ref}`API documentation <api>`
{ref}`platform-specific installation guides <installation>`

Will point to the right place and show the custom text. There shouldn't be links hardcoded to v3 docs unless they aim to always point there.

@OriolAbril OriolAbril merged commit 5626a04 into main Jan 7, 2022
@OriolAbril OriolAbril deleted the example_notebooks_update branch January 7, 2022 16:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants