Adds quadratic approximation #4847

Closed

Conversation

@rasmusbergpalm commented Jul 8, 2021

The quadratic approximation is an extremely fast way to obtain an estimate of the posterior. It only works if the posterior is unimodal and approximately symmetric.

The implementation finds the quadratic approximation and then samples from it to return an arviz.InferenceData object. Finding the exact (approximate) posterior and then sampling from it might seem counterintuitive, but it's done to be compatible with the rest of the codebase and ArviZ functionality. The exact (approximate) posterior is also returned as a scipy.stats.multivariate_normal distribution.
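For readers unfamiliar with the technique, here is a rough sketch of the idea described above. It is not the PR's actual implementation; the use of pm.find_MAP / pm.find_hessian, SciPy, and ArviZ here is illustrative only:

```python
import numpy as np
import pymc3 as pm
import arviz as az
from scipy import stats

with pm.Model():
    # Toy model: estimate the mean of some data.
    mu = pm.Normal("mu", 0.0, 10.0)
    pm.Normal("obs", mu=mu, sigma=1.0, observed=np.random.randn(50))

    # 1. Find the posterior mode (MAP) and the curvature (Hessian) there.
    map_point = pm.find_MAP(vars=[mu])
    hessian = pm.find_hessian(map_point, vars=[mu])

    # 2. The quadratic (Laplace) approximation is a Gaussian centred at the
    #    MAP, with covariance given by the inverse Hessian at the mode.
    posterior_approx = stats.multivariate_normal(
        mean=np.atleast_1d(map_point["mu"]), cov=np.linalg.inv(hessian)
    )

    # 3. Sample from that Gaussian and wrap the draws as InferenceData
    #    (leading axis = 1 chain) so the usual ArviZ tooling works.
    draws = posterior_approx.rvs(10_000)[np.newaxis, ...]
    idata = az.convert_to_inference_data({"mu": draws})
```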

Docstring snippet under review:

    ----------
    vars: list
        List of variables to approximate the posterior for.
    n_chains: int

Member:
I don't think we need this kwarg and should fix it to 1.

Author:
Unfortunately, Arviz will complain if there aren't at least 2 chains.

Member:
can you share an example? I think you'll get trouble for multivariate distributions, where arviz will interpret the leading dimension as the chains, but I would solve that as follows:

```python
data = {'a': np.ones(10), 'b': np.ones((100, 3))}

az.convert_to_inference_data(data)  # warning, misinterprets `b` as having 100 chains

az.convert_to_inference_data({k: v[np.newaxis, ...] for k, v in data.items()})  # Good, explicitly sets number of chains to 1.
```

But maybe I'm misunderstanding the issue!

I'm against including chains since it gives the impression that you might want to run diagnostics on the quality of the returned samples.

Author:
[screenshot: the ArviZ warning discussed below]

I guess that's not too bad. All the plots seem to work; I just saw that warning and figured nothing would work. I'll remove the chains. I fully agree: the less we give the impression that running diagnostics on these samples makes sense, the better. A little warning and a NaN might actually be good :)


Docstring snippet under review:

    Returns
    -------
    (arviz.InferenceData, scipy.stats.multivariate_normal):

Member:
Also list these items individually; you can look at other docstrings for examples of multiple return elements.

Author:
I couldn't find any examples. I tried a new format. Please provide a reference if it's not correct :)
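For reference, a common numpydoc layout for multiple named return values looks roughly like this; the names below are illustrative, not necessarily what the PR ended up using:

```
Returns
-------
idata : arviz.InferenceData
    Samples drawn from the quadratic approximation to the posterior.
posterior : scipy.stats.multivariate_normal
    The multivariate normal (quadratic) approximation itself.
```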

@twiecki (Member) commented Jul 9, 2021

This looks pretty good. Needs a line in the release notes. I wonder if we should make this available through sample?

Really though this probably fits into the VI submodule, but that one is a bit more complex. Curious what @ferrine thinks.

@codecov (bot) commented Jul 9, 2021

Codecov Report

Merging #4847 (c65147c) into main (7e464e5) will decrease coverage by 5.42%.
The diff coverage is 38.09%.


@@            Coverage Diff             @@
##             main    #4847      +/-   ##
==========================================
- Coverage   72.32%   66.90%   -5.43%     
==========================================
  Files          85       86       +1     
  Lines       13884    13905      +21     
==========================================
- Hits        10042     9303     -739     
- Misses       3842     4602     +760     
| Impacted Files | Coverage Δ |
| --- | --- |
| `pymc3/quadratic_approximation.py` | 35.00% <35.00%> (ø) |
| `pymc3/__init__.py` | 100.00% <100.00%> (ø) |
| `pymc3/smc/sample_smc.py` | 12.80% <0.00%> (-79.21%) ⬇️ |
| `pymc3/tests/backend_fixtures.py` | 0.00% <0.00%> (-78.98%) ⬇️ |
| `pymc3/smc/smc.py` | 15.06% <0.00%> (-57.54%) ⬇️ |
| `pymc3/exceptions.py` | 58.62% <0.00%> (-41.38%) ⬇️ |
| `pymc3/backends/tracetab.py` | 70.00% <0.00%> (-30.00%) ⬇️ |
| `pymc3/backends/ndarray.py` | 61.57% <0.00%> (-25.79%) ⬇️ |
| `pymc3/backends/base.py` | 66.15% <0.00%> (-20.00%) ⬇️ |
| `pymc3/distributions/logprob.py` | 76.02% <0.00%> (-19.87%) ⬇️ |

... and 9 more

@ricardoV94 (Member) commented Jul 10, 2021

I suggest adding a test with multivariate variables and / or variables with different dimensions (larger than 1 dimension). Not sure how well that mean concatenation will work in those cases.

@ricardoV94 (Member) commented Jul 10, 2021

> This looks pretty good. Needs a line in the release notes. I wonder if we should make this available through sample?

I don't think that would be the best.

But we could perhaps add a general approx module where we put stuff like this and MAP (which doesn't make much sense as "tuning" anymore).

I am planning to add a "grid approximation" function that works a lot like this quadratic one in spirit (deterministic evaluation + samples to return an InferenceData), and would place it there as well.

Code snippet under review:

    InferenceData with samples from the approximate posterior, multivariate normal posterior approximation
    """
    map = find_MAP(vars=vars)

Member:
Are there any arguments to find_MAP that we might want to allow the user to pass? I.e. map = find_MAP(vars=vars, **map_kwargs)? Same for the Hessian. If not, that's fine (even better)
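For concreteness, the suggested pass-through might look something like the sketch below; map_kwargs and hessian_kwargs are hypothetical names, not part of the PR:

```python
from pymc3 import find_MAP, find_hessian


# Hypothetical sketch (not the PR's code): forward optimizer/Hessian options.
def quadratic_approximation(vars, map_kwargs=None, hessian_kwargs=None):
    map_kwargs = map_kwargs or {}
    hessian_kwargs = hessian_kwargs or {}
    map_point = find_MAP(vars=vars, **map_kwargs)  # e.g. map_kwargs={"method": "L-BFGS-B"}
    hessian = find_hessian(map_point, vars=vars, **hessian_kwargs)
    ...  # build the multivariate normal and InferenceData as before
```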

Author:
Hmm. I think I'd say no. It's a leaky abstraction. If someone raises an issue where they need to pass args to find_MAP we can think about how to best do it at that time. That's my 2 cents.

@rasmusbergpalm (Author) commented Jul 10, 2021

> I suggest adding a test with multivariate variables and / or variables with different dimensions (larger than 1 dimension). Not sure how well that mean concatenation will work in those cases.

That's why I made test_hdi_contains_parameters_in_linear_regression which has weights with shape=2 :)
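A rough sketch of what such a test could look like; this is not the PR's actual test code, and the `quadratic_approximation` import and signature are assumed from the PR description and diff:

```python
import numpy as np
import pymc3 as pm
import arviz as az
from pymc3 import quadratic_approximation  # name assumed from this PR's diff


def test_hdi_contains_parameters_in_linear_regression():
    np.random.seed(0)
    true_w = np.array([1.5, -2.0])
    X = np.random.randn(200, 2)
    y = X @ true_w + 0.1 * np.random.randn(200)

    with pm.Model():
        w = pm.Normal("w", 0.0, 10.0, shape=2)
        pm.Normal("y", mu=pm.math.dot(X, w), sigma=0.1, observed=y)
        idata, _ = quadratic_approximation([w])

    # The default 94% HDI of each weight should contain the true value.
    hdi = az.hdi(idata)["w"]
    assert np.all(hdi.sel(hdi="lower").values < true_w)
    assert np.all(true_w < hdi.sel(hdi="higher").values)
```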

rasmusbergpalm and others added 4 commits July 10, 2021 09:43 (co-authored by Thomas Wiecki)
@rasmusbergpalm (Author)
I've addressed the PR comments as best I could, ran linting, and added a line in the release notes. Please take another look @twiecki

I will add an example notebook once merged.

@rasmusbergpalm (Author) commented Jul 10, 2021

> This looks pretty good. Needs a line in the release notes. I wonder if we should make this available through sample?
>
> I don't think that would be the best.
>
> But we could perhaps add a general approx module where we put stuff like this and MAP (which doesn't make much sense as "tuning" anymore).
>
> I am planning to add a "grid approximation" function that works a lot like this quadratic one in spirit (deterministic evaluation + samples to return an InferenceData), and would place it there as well.

Agreed it doesn't fit well in sample. A humble suggestion might be to make an inference module and place the various types of posterior inference there, e.g. MCMC, VI, this, grid approx., etc.

@fonnesbeck (Member)
I like the approx submodule idea best. If we ever get around to adding INLA it could live there as well.

@ColCarroll (Member)
(sorry, submitted a review late, but excited to see this!)

@rasmusbergpalm (Author)
@ColCarroll I removed the chains. PTAL :)

@rasmusbergpalm (Author)
I've spent the entire day trying to reproduce the test errors and have failed. I simply cannot create a working development environment.

First I spent countless hours fixing C `stdio.h` import errors, upgrading macOS, installing Xcode, etc.
Then I spent a lot of time with "Couldn't find Library System" when compiling Aesara, following the contributing.MD installation guide (using conda).
Then I tried using a venv and pip, which is the closest I've got; that gave me:

```
>           _lbfgsb.setulb(m, x, low_bnd, upper_bnd, nbd, f, g, factr,
                           pgtol, wa, iwa, task, iprint, csave, lsave,
                           isave, dsave, maxls)
E           ValueError: unexpected array size: new_size=2, got array with arr_size=0
```

Some bug deep in the FORTRAN code of LBFGS, which is not the error in CI.
Then I tried the Docker approach, but running `. ./../../scripts/test.sh` gives "unknown command pytest".
Installing pytest and re-running gives `E ModuleNotFoundError: No module named 'aesara'`.

At this point I've given up. The code works, as you can see here https://colab.research.google.com/drive/1DTe7QchyW-wpbUmulzY27lBTRFQXzhm0?usp=sharing (at least in PyMC3 3.11.2)

@twiecki (Member) commented Dec 21, 2021

Closing due to inactivity; feel free to reopen.
