Adds quadratic approximation #4847

Closed

Conversation

@rasmusbergpalm commented Jul 8, 2021

The quadratic approximation is an extremely fast way to obtain an estimate of the posterior. It only works if the posterior is unimodal and approximately symmetric.

The implementation finds the quadratic approximation and then samples from it to return an arviz.InferenceData object. Finding the exact (approximate) posterior and then sampling from it might seem counterintuitive, but it's done to be compatible with the rest of the codebase and ArviZ functionality. The exact (approximate) posterior is also returned as a scipy.stats.multivariate_normal distribution.
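For readers unfamiliar with the technique, here is a rough sketch of the idea described above. It is not the PR's actual implementation; the use of pm.find_MAP / pm.find_hessian, SciPy, and ArviZ here is illustrative only:

```python
import numpy as np
import pymc3 as pm
import arviz as az
from scipy import stats

with pm.Model():
    # Toy model: estimate the mean of some data.
    mu = pm.Normal("mu", 0.0, 10.0)
    pm.Normal("obs", mu=mu, sigma=1.0, observed=np.random.randn(50))

    # 1. Find the posterior mode (MAP) and the curvature (Hessian) there.
    map_point = pm.find_MAP(vars=[mu])
    hessian = pm.find_hessian(map_point, vars=[mu])

    # 2. The quadratic (Laplace) approximation is a Gaussian centred at the
    #    MAP, with covariance given by the inverse Hessian at the mode.
    posterior_approx = stats.multivariate_normal(
        mean=np.atleast_1d(map_point["mu"]), cov=np.linalg.inv(hessian)
    )

    # 3. Sample from that Gaussian and wrap the draws as InferenceData
    #    (leading axis = 1 chain) so the usual ArviZ tooling works.
    draws = posterior_approx.rvs(10_000)[np.newaxis, ...]
    idata = az.convert_to_inference_data({"mu": draws})
```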

Docstring snippet under review:

    ----------
    vars: list
        List of variables to approximate the posterior for.
    n_chains: int

Member:
I don't think we need this kwarg and should fix it to 1.

Author:
Unfortunately, Arviz will complain if there aren't at least 2 chains.

Member:
can you share an example? I think you'll get trouble for multivariate distributions, where arviz will interpret the leading dimension as the chains, but I would solve that as follows:

```python
data = {'a': np.ones(10), 'b': np.ones((100, 3))}

az.convert_to_inference_data(data)  # warning, misinterprets `b` as having 100 chains

az.convert_to_inference_data({k: v[np.newaxis, ...] for k, v in data.items()})  # Good, explicitly sets number of chains to 1.
```

But maybe I'm misunderstanding the issue!

I'm against including chains since it gives the impression that you might want to run diagnostics on the quality of the returned samples.

Author:
[screenshot: the ArviZ warning discussed below]

I guess that's not too bad. All the plots seem to work; I just saw that warning and figured nothing would work. I'll remove the chains. I fully agree: the less we give the impression that running diagnostics on these samples makes sense, the better. A little warning and a NaN might actually be good :)


Docstring snippet under review:

    Returns
    -------
    (arviz.InferenceData, scipy.stats.multivariate_normal):

Member:
Also list these items individually; you can look at other docstrings for examples of multiple return elements.

Author:
I couldn't find any examples. I tried a new format. Please provide a reference if it's not correct :)
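For reference, a common numpydoc layout for multiple named return values looks roughly like this; the names below are illustrative, not necessarily what the PR ended up using:

```
Returns
-------
idata : arviz.InferenceData
    Samples drawn from the quadratic approximation to the posterior.
posterior : scipy.stats.multivariate_normal
    The multivariate normal (quadratic) approximation itself.
```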

@twiecki (Member) commented Jul 9, 2021

This looks pretty good. Needs a line in the release notes. I wonder if we should make this available through sample?

Really though this probably fits into the VI submodule, but that one is a bit more complex. Curious what @ferrine thinks.

@codecov (bot) commented Jul 9, 2021

Codecov Report

Merging #4847 (c65147c) into main (7e464e5) will decrease coverage by 5.42%.
The diff coverage is 38.09%.


@@            Coverage Diff             @@
##             main    #4847      +/-   ##
==========================================
- Coverage   72.32%   66.90%   -5.43%     
==========================================
  Files          85       86       +1     
  Lines       13884    13905      +21     
==========================================
- Hits        10042     9303     -739     
- Misses       3842     4602     +760     
| Impacted Files | Coverage Δ |
| --- | --- |
| `pymc3/quadratic_approximation.py` | 35.00% <35.00%> (ø) |
| `pymc3/__init__.py` | 100.00% <100.00%> (ø) |
| `pymc3/smc/sample_smc.py` | 12.80% <0.00%> (-79.21%) ⬇️ |
| `pymc3/tests/backend_fixtures.py` | 0.00% <0.00%> (-78.98%) ⬇️ |
| `pymc3/smc/smc.py` | 15.06% <0.00%> (-57.54%) ⬇️ |
| `pymc3/exceptions.py` | 58.62% <0.00%> (-41.38%) ⬇️ |
| `pymc3/backends/tracetab.py` | 70.00% <0.00%> (-30.00%) ⬇️ |
| `pymc3/backends/ndarray.py` | 61.57% <0.00%> (-25.79%) ⬇️ |
| `pymc3/backends/base.py` | 66.15% <0.00%> (-20.00%) ⬇️ |
| `pymc3/distributions/logprob.py` | 76.02% <0.00%> (-19.87%) ⬇️ |

... and 9 more

@ricardoV94 (Member) commented Jul 10, 2021

I suggest adding a test with multivariate variables and / or variables with different dimensions (larger than 1 dimension). Not sure how well that mean concatenation will work in those cases.

@ricardoV94 (Member) commented Jul 10, 2021

> This looks pretty good. Needs a line in the release notes. I wonder if we should make this available through sample?

I don't think that would be the best.

But we could perhaps add a general approx module where we put stuff like this and MAP (which doesn't make much sense as "tuning" anymore).

I am planning to add a "grid approximation" function that works a lot like this quadratic one in spirit (deterministic evaluation + samples to return an InferenceData), and would place it there as well.

Code snippet under review:

    InferenceData with samples from the approximate posterior, multivariate normal posterior approximation
    """
    map = find_MAP(vars=vars)

Member:
Are there any arguments to find_MAP that we might want to allow the user to pass? I.e. map = find_MAP(vars=vars, **map_kwargs)? Same for the Hessian. If not, that's fine (even better)
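For concreteness, the suggested pass-through might look something like the sketch below; map_kwargs and hessian_kwargs are hypothetical names, not part of the PR:

```python
from pymc3 import find_MAP, find_hessian


# Hypothetical sketch (not the PR's code): forward optimizer/Hessian options.
def quadratic_approximation(vars, map_kwargs=None, hessian_kwargs=None):
    map_kwargs = map_kwargs or {}
    hessian_kwargs = hessian_kwargs or {}
    map_point = find_MAP(vars=vars, **map_kwargs)  # e.g. map_kwargs={"method": "L-BFGS-B"}
    hessian = find_hessian(map_point, vars=vars, **hessian_kwargs)
    ...  # build the multivariate normal and InferenceData as before
```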

Author:
Hmm. I think I'd say no. It's a leaky abstraction. If someone raises an issue where they need to pass args to find_MAP we can think about how to best do it at that time. That's my 2 cents.

@rasmusbergpalm (Author) commented Jul 10, 2021

> I suggest adding a test with multivariate variables and / or variables with different dimensions (larger than 1 dimension). Not sure how well that mean concatenation will work in those cases.

That's why I made test_hdi_contains_parameters_in_linear_regression which has weights with shape=2 :)
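A rough sketch of what such a test could look like; this is not the PR's actual test code, and the `quadratic_approximation` import and signature are assumed from the PR description and diff:

```python
import numpy as np
import pymc3 as pm
import arviz as az
from pymc3 import quadratic_approximation  # name assumed from this PR's diff


def test_hdi_contains_parameters_in_linear_regression():
    np.random.seed(0)
    true_w = np.array([1.5, -2.0])
    X = np.random.randn(200, 2)
    y = X @ true_w + 0.1 * np.random.randn(200)

    with pm.Model():
        w = pm.Normal("w", 0.0, 10.0, shape=2)
        pm.Normal("y", mu=pm.math.dot(X, w), sigma=0.1, observed=y)
        idata, _ = quadratic_approximation([w])

    # The default 94% HDI of each weight should contain the true value.
    hdi = az.hdi(idata)["w"]
    assert np.all(hdi.sel(hdi="lower").values < true_w)
    assert np.all(true_w < hdi.sel(hdi="higher").values)
```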

rasmusbergpalm and others added 4 commits July 10, 2021 09:43 (co-authored by Thomas Wiecki)
@rasmusbergpalm (Author)
I've addressed the PR comments as best I could, ran linting, and added a line in the release notes. Please take another look @twiecki

I will add an example notebook once merged.

@rasmusbergpalm (Author) commented Jul 10, 2021

> This looks pretty good. Needs a line in the release notes. I wonder if we should make this available through sample?
>
> I don't think that would be the best.
>
> But we could perhaps add a general approx module where we put stuff like this and MAP (which doesn't make much sense as "tuning" anymore).
>
> I am planning to add a "grid approximation" function that works a lot like this quadratic one in spirit (deterministic evaluation + samples to return an InferenceData), and would place it there as well.

Agreed it doesn't fit well in sample. A humble suggestion might be to make an inference module and place the various types of posterior inference there, e.g. MCMC, VI, this, grid approx., etc.

@fonnesbeck (Member)
I like the approx submodule idea best. If we ever get around to adding INLA it could live there as well.

@ColCarroll (Member)
(sorry, submitted a review late, but excited to see this!)

@rasmusbergpalm (Author)
@ColCarroll I removed the chains. PTAL :)

@rasmusbergpalm (Author)
I've spent the entire day trying to reproduce the test errors and have failed. I simply cannot create a working development environment.

First I spent countless hours fixing C `stdio.h` import errors, upgrading macOS, installing Xcode, etc.
Then I spent a lot of time with "Couldn't find Library System" when compiling Aesara, following the contributing.MD installation guide (using conda).
Then I tried using a venv and pip, which is the closest I've got; that gave me:

```
>           _lbfgsb.setulb(m, x, low_bnd, upper_bnd, nbd, f, g, factr,
                           pgtol, wa, iwa, task, iprint, csave, lsave,
                           isave, dsave, maxls)
E           ValueError: unexpected array size: new_size=2, got array with arr_size=0
```

Some bug deep in the FORTRAN code of LBFGS, which is not the error in CI.
Then I tried the Docker approach, but running `. ./../../scripts/test.sh` gives "unknown command pytest".
Installing pytest and re-running gives `E ModuleNotFoundError: No module named 'aesara'`.

At this point I've given up. The code works, as you can see here https://colab.research.google.com/drive/1DTe7QchyW-wpbUmulzY27lBTRFQXzhm0?usp=sharing (at least in PyMC3 3.11.2)

@twiecki (Member) commented Dec 21, 2021

Closing due to inactivity; feel free to reopen.
