Skip to content

Update GLM hierarchical & Robust Outlier Detection to best practices #147

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

CloudChaoszero
Copy link
Contributor

@CloudChaoszero CloudChaoszero commented Apr 25, 2021

This PR is the start of #14

Addresses #87 and #80 and updates the notebooks to "Best Practices"

Summary

  • Have trace objects saved as InferenceData objects via return_inferencedata=True.

I'm open to interpretation on if this is what should be the goal for linked issue #14.

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@CloudChaoszero
Copy link
Contributor Author

There are some sns plot-related errors, due to the data type change.
So, I am open to figuring out how to resolve those 😢

@review-notebook-app
Copy link

review-notebook-app bot commented Apr 25, 2021

View / edit / reply to this conversation on ReviewNB

OriolAbril commented on 2021-04-25T14:16:42Z
----------------------------------------------------------------

I would use az.plot_pair here


CloudChaoszero commented on 2021-05-14T08:16:56Z
----------------------------------------------------------------

Thanks! I wasn't able to get the contour colors though. :/

@review-notebook-app
Copy link

review-notebook-app bot commented Apr 25, 2021

View / edit / reply to this conversation on ReviewNB

OriolAbril commented on 2021-04-25T14:16:43Z
----------------------------------------------------------------

Here I'd also use plot_pair and have marginals share color with 2d kde


CloudChaoszero commented on 2021-05-02T09:55:00Z
----------------------------------------------------------------

Hmm, how would I go about plotting these plot_pair visualizations on the same graph?

I tried doing the following, but it created "two graphs" that are not combined into one figure.

marginalKwards = {'kind': 'kde', 'color':sns.color_palette()[1]}
az.plot_pair(trc_ols, var_names = fts,
       kind='scatter',
       contour= True,
       fill_last=True,
       divergences=True,
       figsize=[16,16],
       marginals=True,
       marginal_kwargs= marginalKwards,
       group='posterior'
)

marginalKwards = {'kind': 'kde', 'color':sns.color_palette()[2]}
az.plot_pair(trc_studentt, var_names = fts,
       kind='scatter',
       contour= True,
       fill_last=True,

       divergences=True,
       figsize=[16,16],
       marginals=True,
       marginal_kwargs= marginalKwards,
       group='posterior'
)

OriolAbril commented on 2021-05-04T11:52:21Z
----------------------------------------------------------------

You need to do:

ax = az.plot_pair(...)
az.plot_pair(..., ax=ax);

CloudChaoszero commented on 2021-05-12T11:47:25Z
----------------------------------------------------------------

Oh great, thanks for the tip!

@OriolAbril
Copy link
Member

As a general starting point, https://docs.pymc.io/notebooks/multilevel_modeling.html should be a good starting point to get plots to work.

Also, can you add links to all 3 addressed issues? It's more important for me to keep track of the updates and their state to have links to the tracker issues than to #14 (which I may end up closing to reduce confusion)

Copy link
Contributor Author

Hmm, how would I go about plotting these plot_pair visualizations on the same graph?

I tried doing the following, but it created "two graphs" that are not combined into one figure.

marginalKwards = {'kind': 'kde', 'color':sns.color_palette()[1]}
az.plot_pair(trc_ols, var_names = fts,
       kind='scatter',
       contour= True,
       fill_last=True,
       divergences=True,
       figsize=[16,16],
       marginals=True,
       marginal_kwargs= marginalKwards,
       group='posterior'
)

marginalKwards = {'kind': 'kde', 'color':sns.color_palette()[2]}
az.plot_pair(trc_studentt, var_names = fts,
       kind='scatter',
       contour= True,
       fill_last=True,

       divergences=True,
       figsize=[16,16],
       marginals=True,
       marginal_kwargs= marginalKwards,
       group='posterior'
)


View entire conversation on ReviewNB

Copy link
Contributor Author

Oh a tracker for the examples/generalized_linear_models/GLM-robust-with-outlier-detection.ipynbnotebook right?

@OriolAbril
Copy link
Member

@CloudChaoszero I have updated the description to link to both relevant issues, I noticed that the PR is also modifying the hierarchcal binomial notebook but there are no changes it is only reexecuted. In fact you already updated it back in #45 so it's already in "Best Practices", I think the notebook should be excluded from this PR and work only on the other two.

Copy link
Member

You need to do:

ax = az.plot_pair(...)
az.plot_pair(..., ax=ax);

View entire conversation on ReviewNB

@review-notebook-app
Copy link

review-notebook-app bot commented May 4, 2021

View / edit / reply to this conversation on ReviewNB

OriolAbril commented on 2021-05-04T11:53:03Z
----------------------------------------------------------------

Add a ; to avoid the extra output, also in other plots.


@review-notebook-app
Copy link

review-notebook-app bot commented May 4, 2021

View / edit / reply to this conversation on ReviewNB

OriolAbril commented on 2021-05-04T11:53:03Z
----------------------------------------------------------------

Take a look at https://docs.pymc.io/notebooks/multilevel_modeling.html as a guide on how to perform the computations. Note that the current ones are actually wrong as they average first then operate when it should be the other way around.


CloudChaoszero commented on 2021-05-14T08:17:26Z
----------------------------------------------------------------

Thanks! I tried resolving this, and did such. However, some parts of the visualization are lost due to the change in data type.

@CloudChaoszero CloudChaoszero force-pushed the coords-inferencedata-update-2 branch from 793ad90 to 37aeb6a Compare May 12, 2021 11:45
Copy link
Contributor Author

Oh great, thanks for the tip!


View entire conversation on ReviewNB

Copy link
Contributor Author

I've updated examples/generalized_linear_models/GLM-hierarchical.ipynb

Will work onexamples/generalized_linear_models/GLM-hierarchical.ipynb sometime soon :D

Copy link
Contributor Author

Thanks! I wasn't able to get the contour colors though. :/


View entire conversation on ReviewNB

Copy link
Contributor Author

Thanks! I tried resolving this, and did such. However, some parts of the visualization are lost due to the change in data type.


View entire conversation on ReviewNB

@review-notebook-app
Copy link

View / edit / reply to this conversation on ReviewNB

CloudChaoszero commented on 2021-05-22T09:04:17Z
----------------------------------------------------------------

@OriolAbril would it be okay if this not contain the arrows & additional data points. Oddly, I'm stuck here and one other visualization... haha


@review-notebook-app
Copy link

View / edit / reply to this conversation on ReviewNB

OriolAbril commented on 2021-05-22T11:26:56Z
----------------------------------------------------------------

I would use county as dimension and have the coords contain the county names like we did in the https://docs.pymc.io/notebooks/multilevel_modeling.html (I have trouble seeing the differences between the two notebooks though).


@OriolAbril
Copy link
Member

@CloudChaoszero it is hard to give better feedback without running the notebook myself and playing around a bit. Do you mind if I do that and push the changes afterwards? I can also try to live code and post a link here (I can't promise it will be anything worth watching though). Let me know how this sounds and if you like the idea we'll coordinate to avoid git conflicts and so on.

@CloudChaoszero
Copy link
Contributor Author

@CloudChaoszero it is hard to give better feedback without running the notebook myself and playing around a bit. Do you mind if I do that and push the changes afterwards? I can also try to live code and post a link here (I can't promise it will be anything worth watching though). Let me know how this sounds and if you like the idea we'll coordinate to avoid git conflicts and so on.

@OriolAbril That works! Thanks a bunch, I really appreciate the help 🙏

@CloudChaoszero
Copy link
Contributor Author

Thank you @OriolAbril this works now. 🙏 ❤️ ( Seriously)
Should we proceed with confirming merge, or should there be an additional review?

@fonnesbeck fonnesbeck merged commit 7b259f1 into pymc-devs:main Jul 3, 2021
@fonnesbeck
Copy link
Member

Thanks @CloudChaoszero . Apologies for the delay in merging.

@OriolAbril OriolAbril deleted the coords-inferencedata-update-2 branch November 17, 2021 01:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants