From 57b6e338a59d7d705482a93cf251eec799a784d3 Mon Sep 17 00:00:00 2001
From: Michael Osthege
Date: Thu, 26 May 2022 20:58:39 +0200
Subject: [PATCH 1/3] Write instructions on running the test suite

---
 .../contributing/running_the_test_suite.md | 46 ++++++++++++++++---
 1 file changed, 39 insertions(+), 7 deletions(-)

diff --git a/docs/source/contributing/running_the_test_suite.md b/docs/source/contributing/running_the_test_suite.md
index 60aeb86082..e415b1d714 100644
--- a/docs/source/contributing/running_the_test_suite.md
+++ b/docs/source/contributing/running_the_test_suite.md
@@ -1,15 +1,47 @@
 (running_the_test_suite)=
 # Running the test suite
+The first step is to install the additional dependencies that are needed for testing:
-TODO: explain steps to run test suite. This is a how to guide, so assume readers know how to install
-things and so on, at most mention what dependencies are needed.
+```bash
+pip install -r requirements-dev.txt
+```
+
+The PyMC test suite uses `pytest` as the testing framework.
+If you are unfamiliar with `pytest`, check out [this short video series](https://calmcode.io/pytest/).
+
+With the optional dependencies installed, you can start running tests.
+Below are some examples of how you might want to run certain parts of the test suite.
+
+```{attention}
+Running the entire test suite will take hours.
+Therefore, we recommend running just the specific tests that target the parts of the codebase you're working on.
+```
+
+To run all tests from a single file:
+```bash
+pytest -v pymc/tests/test_model.py
+```
+
+```{tip}
+The `-v` flag is short-hand for `--verbose` and prints the names of the test cases that are currently running.
+```
+
+Often, you'll want to focus on just a few test cases first.
+By using the `-k` flag, you can filter for test cases that match a certain pattern.
+For example, the following command runs all test cases from `test_model.py` that have "coord" in their name:
 ```bash
-pip install pytest pytest-cov coverage
+pytest -v pymc/tests/test_model.py -k coord
+```
+
-# To run a subset of tests
-pytest --verbose pymc/tests/.py
+To get a coverage report, you can pass `--cov=pymc`, optionally with `--cov-report term-missing` to get a printout of the line numbers that were visited by the invoked tests.
+Note that because you are not running the entire test suite, the coverage will be terrible.
+But you can still watch for specific line numbers of the code that you're working on.

-# To get a coverage report
-pytest --verbose --cov=pymc --cov-report term-missing pymc/tests/.py
+```bash
+pytest -v --cov=pymc --cov-report term-missing pymc/tests/.py
 ```
+
+When you are reasonably confident about the changes you made, you can push the changes and open a pull request.
+Our GitHub Actions pipeline will run the entire test suite, and if there are failures, you can go back and run these tests on your local machine.

From 2d0f94364a2a8fbedbf5c1cdd1ecd81bd77fc3f3 Mon Sep 17 00:00:00 2001
From: Michael Osthege
Date: Thu, 26 May 2022 21:23:06 +0200
Subject: [PATCH 2/3] Add troubleshooting tips on pre-commit

---
 docs/source/contributing/python_style.md    | 28 +++++++++++++++++++
 .../contributing/review_pr_pymc_examples.md |  2 +-
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/docs/source/contributing/python_style.md b/docs/source/contributing/python_style.md
index b707d07a52..bf8a9b289f 100644
--- a/docs/source/contributing/python_style.md
+++ b/docs/source/contributing/python_style.md
@@ -47,3 +47,31 @@ or, if you just want to manually run them on a subset of files,
 ```bash
 pre-commit run --files ...
 ```
+
+## Gotchas & Troubleshooting
+__Pre-commit runs on staged files__
+
+If you have some `git` changes staged and others unstaged, `pre-commit` will only run on the staged files.
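To see for yourself which version of a file the hooks operate on, you can inspect the staged (index) copy. Here is a minimal sketch, assuming `git` is available on your `PATH`; the file name is made up for illustration:

```python
# Sketch: pre-commit checks the *staged* copy of a file, i.e. the version in
# the git index, which `git show :<path>` prints. Unstaged edits are invisible
# to the hooks until you `git add` them.
import os
import subprocess
import tempfile

repo = tempfile.mkdtemp()
subprocess.run(["git", "init", "-q", repo], check=True)

path = os.path.join(repo, "example.py")
with open(path, "w") as f:
    f.write("staged content\n")
subprocess.run(["git", "add", "example.py"], cwd=repo, check=True)

with open(path, "a") as f:
    f.write("unstaged edit\n")  # this later edit is NOT staged

# The index copy is what the hooks would see:
staged = subprocess.run(
    ["git", "show", ":example.py"],
    cwd=repo, check=True, capture_output=True, text=True,
).stdout
assert staged == "staged content\n"
```

If `pre-commit` keeps reporting problems in lines you believe you already fixed, this staged/unstaged mismatch is the first thing to check.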
+
+__Pre-commit repeatedly complains about the same formatting changes__
+
+Check the unstaged changes (see previous point).
+
+__Whitespace changes in the `environment-dev-*.yml` files__
+
+On Windows, there are some bugs in pre-commit hooks that can lead to changes in some environment YAML files.
+Until this is fixed upstream, you should __ignore these changes__.
+To actually make the commit, deactivate the automated `pre-commit` with `pre-commit uninstall` and make sure to run it manually with `pre-commit run --all-files`.
+
+__Failures in the `mypy` step__
+
+We are running static type checks with `mypy` to continuously improve the reliability and type safety of the PyMC codebase.
+However, there are many files with unresolved type problems, which is why we are allowing some files to fail the `mypy` check.
+
+If you are seeing the `mypy` step complain, chances are that you are in one of the following two situations:
+* 😕 Your changes introduced type problems in a file that was previously free of type problems.
+* 🥳 Your changes fixed type problems.
+
+In any case, __read the logging output of the `mypy` hook__, because it contains instructions on how to proceed.
+
+You can also run the `mypy` check manually with `python scripts/run_mypy.py [--verbose]`.
diff --git a/docs/source/contributing/review_pr_pymc_examples.md b/docs/source/contributing/review_pr_pymc_examples.md
index 215b4c2dc8..caf1cb5577 100644
--- a/docs/source/contributing/review_pr_pymc_examples.md
+++ b/docs/source/contributing/review_pr_pymc_examples.md
@@ -14,7 +14,7 @@ The most important guideline is the following:
 **When you aren't _completely_ sure about something communicate it, ask around and (re)read the documentation**.
 :::

-pymc-examples is a huge collection of notebooks, covering many fields
+[pymc-examples](https://github.com/pymc-devs/pymc-examples/) is a huge collection of notebooks, covering many fields
 and applications of PyMC and Bayesian statistics.
In addition, its HTML generation infrastructure has been improved by specialized people who were also paid so they could dedicate themselves to this task.

From 82b65e0c0551518bc326fbb57123f0fd546dfac5 Mon Sep 17 00:00:00 2001
From: Michael Osthege
Date: Thu, 26 May 2022 22:00:27 +0200
Subject: [PATCH 3/3] Update guide on implementing new distributions

Co-authored-by: Oriol Abril-Pla
---
 ...veloper_guide_implementing_distribution.md | 83 ++++++++++---------
 1 file changed, 42 insertions(+), 41 deletions(-)

diff --git a/docs/source/contributing/developer_guide_implementing_distribution.md b/docs/source/contributing/developer_guide_implementing_distribution.md
index 39920a389c..32fa236400 100644
--- a/docs/source/contributing/developer_guide_implementing_distribution.md
+++ b/docs/source/contributing/developer_guide_implementing_distribution.md
@@ -1,10 +1,10 @@
 # Implementing a Distribution
-This guide provides an overview on how to implement a distribution for version 4 of PyMC.
+This guide provides an overview of how to implement a distribution for PyMC version `>=4.0.0`.
 It is designed for developers who wish to add a new distribution to the library.
-Users will not be aware of all this complexity and should instead make use of helper methods such as `~pymc.distributions.DensityDist`.
+Users will not be aware of all this complexity and should instead make use of helper methods such as `~pymc.DensityDist`.

-PyMC {class}`~pymc.distributions.Distribution` builds on top of Aesara's {class}`~aesara.tensor.random.op.RandomVariable`, and implements `logp`, `logcdf` and `moment` methods as well as other initialization and validation helpers.
+PyMC {class}`~pymc.Distribution` builds on top of Aesara's {class}`~aesara.tensor.random.op.RandomVariable`, and implements `logp`, `logcdf` and `moment` methods as well as other initialization and validation helpers.
 Most notably `shape/dims` kwargs, alternative parametrizations, and default `transforms`.
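For a rough numerical intuition of what `logp` and `logcdf` represent (in PyMC they are built as symbolic Aesara graphs, not NumPy code), here is a sketch using SciPy; the normal distribution is just an illustrative stand-in:

```python
import numpy as np
from scipy import stats

# Numerical stand-ins for the symbolic `logp`/`logcdf` of a distribution.
value, mu, sigma = 1.0, 0.0, 2.0
logp = stats.norm.logpdf(value, loc=mu, scale=sigma)    # elementwise log-density
logcdf = stats.norm.logcdf(value, loc=mu, scale=sigma)  # log of P(X <= value)

# The log-density matches the closed-form expression for the normal pdf:
expected = -0.5 * ((value - mu) / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)
assert np.isclose(logp, expected)
```

A PyMC `Distribution` returns the same quantities, but as expressions over tensor variables, so they can be evaluated at arbitrary values and differentiated by the samplers.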
Here is a summary check-list of the steps needed to implement a new distribution. @@ -23,8 +23,8 @@ This guide does not attempt to explain the rationale behind the `Distributions` {class}`~aesara.tensor.random.op.RandomVariable` are responsible for implementing the random sampling methods, which in version 3 of PyMC used to be one of the standard `Distribution` methods, alongside `logp` and `logcdf`. The `RandomVariable` is also responsible for parameter broadcasting and shape inference. -Before creating a new `RandomVariable` make sure that it is not offered in the {mod}`Numpy library `. -If it is, it should be added to the {doc}`Aesara library ` first and then imported into the PyMC library. +Before creating a new `RandomVariable` make sure that it is not already offered in the {mod}`NumPy library `. +If it is, it should be added to the {doc}`Aesara library ` first and then imported into the PyMC library. In addition, it might not always be necessary to implement a new `RandomVariable`. For example if the new `Distribution` is just a special parametrization of an existing `Distribution`. @@ -61,14 +61,14 @@ class BlahRV(RandomVariable): # If you want to add a custom signature and default values for the # parameters, do it like this. Otherwise this can be left out. - def __call__(self, loc=0.0, scale=1.0, size=None, **kwargs) -> TensorVariable: - return super().__call__(loc, scale, size=size, **kwargs) + def __call__(self, loc=0.0, scale=1.0, **kwargs) -> TensorVariable: + return super().__call__(loc, scale, **kwargs) # This is the Python code that produces samples. Its signature will always # start with a NumPy `RandomState` object, then the distribution # parameters, and, finally, the size. # - # This is effectively the PyMC v4.x replacement for `Distribution.random`. + # This is effectively the PyMC >=4.0 replacement for `Distribution.random`. @classmethod def rng_fn( cls, @@ -86,12 +86,12 @@ blah = BlahRV() Some important things to keep in mind: -1. 
Everything inside the `rng_fn` method is pure Python code (as are the inputs) and should not make use of other `Aesara` symbolic ops. The random method should make use of the `rng` which is a Numpy {class}`~numpy.random.RandomState`, so that samples are reproducible. -1. The `size` argument (together with the inputs shape) are the only way for the user to specify non-default `RandomVariable` dimensions. The `rng_fn` will have to take this into consideration for correct output. `size` is the specification used by `Numpy` and `Scipy` and works like PyMC `shape` for univariate distributions, but is different for multivariate distributions. Unfortunately there is no general reference documenting how `size` ought to work for multivariate distributions. This [discussion](https://github.com/numpy/numpy/issues/17669) may be helpful to get more context. -1. `Aesara` tries to infer the output shape of the `RandomVariable` (given a user-specified size) by introspection of the `ndim_supp` and `ndim_params` attributes. However, the default method may not work for more complex distributions. In that case, custom `_supp_shape_from_params` (and less probably, `_infer_shape`) should also be implemented in the new `RandomVariable` class. One simple example is seen in the {class}`~pymc.distributions.multivariate.DirichletMultinomialRV` where it was necessary to specify the `rep_param_idx` so that the `default_supp_shape_from_params` helper method could do its job. In more complex cases, it may not suffice to use this default helper. This could hapen for instance if the argument values determined the support shape of the distribution, as happens in the `~pymc.distribution.multivarite._LKJCholeskyCovRV`. +1. Everything inside the `rng_fn` method is pure Python code (as are the inputs) and should not make use of other `Aesara` symbolic ops. The random method should make use of the `rng` which is a NumPy {class}`~numpy.random.RandomState`, so that samples are reproducible. +1. 
Non-default `RandomVariable` dimensions will end up in the `rng_fn` via the `size` kwarg. The `rng_fn` will have to take this into consideration for correct output. `size` is the specification used by NumPy and SciPy and works like PyMC `shape` for univariate distributions, but is different for multivariate distributions. For multivariate distributions, the __`size` excludes the `ndim_supp` support dimensions__, whereas the __`shape` of the resulting `TensorVariable` or `ndarray` includes the support dimensions__. This [discussion](https://github.com/numpy/numpy/issues/17669) may be helpful to get more context.
1. `Aesara` tries to infer the output shape of the `RandomVariable` (given a user-specified size) by introspection of the `ndim_supp` and `ndim_params` attributes. However, the default method may not work for more complex distributions. In that case, custom `_supp_shape_from_params` (and less probably, `_infer_shape`) should also be implemented in the new `RandomVariable` class. One simple example is seen in the {class}`~pymc.DirichletMultinomialRV` where it was necessary to specify the `rep_param_idx` so that the `default_supp_shape_from_params` helper method can do its job. In more complex cases, it may not suffice to use this default helper. This could happen for instance if the argument values determined the support shape of the distribution, as happens in the `~pymc.distributions.multivariate._LKJCholeskyCovRV`.
1. It's okay to use the `rng_fn` `classmethods` of other Aesara and PyMC `RandomVariables` inside the new `rng_fn`. For example if you are implementing a negative HalfNormal `RandomVariable`, your `rng_fn` can simply return `- halfnormal.rng_fn(rng, scale, size)`.
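Because the size-excludes-support-dimensions rule trips up many contributors, here is a small sketch of it using NumPy's own multivariate generator (plain NumPy, not the Aesara `RandomVariable` machinery):

```python
import numpy as np

rng = np.random.default_rng(123)
mean, cov = np.zeros(2), np.eye(2)  # a bivariate normal; ndim_supp would be 1

# `size` specifies only the batch dimensions ...
draws = rng.multivariate_normal(mean, cov, size=(4, 3))

# ... while the shape of the result appends the support dimension (length 2).
assert draws.shape == (4, 3, 2)

# For a univariate distribution, size and shape coincide:
assert rng.normal(size=(4, 3)).shape == (4, 3)
```

A custom `rng_fn` for a multivariate distribution has to reproduce exactly this convention, or shape inference and `size` handling will silently disagree.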
-*Note: In addition to `size`, the PyMC API also provides `shape` and `dims` as alternatives to define a distribution dimensionality, but this is taken care of by {class}`~pymc.distributions.Distribution`, and should not require any extra changes.* +*Note: In addition to `size`, the PyMC API also provides `shape` and `dims` as alternatives to define a distribution dimensionality, but this is taken care of by {class}`~pymc.Distribution`, and should not require any extra changes.* For a quick test that your new `RandomVariable` `Op` is working, you can call the `Op` with the necessary parameters and then call `eval()` on the returned object: @@ -115,13 +115,13 @@ blah([0, 0], [1, 2], size=(10, 2)).eval() ## 2. Inheriting from a PyMC base `Distribution` class -After implementing the new `RandomVariable` `Op`, it's time to make use of it in a new PyMC {class}`pymc.distributions.Distribution`. -PyMC 4.x works in a very {term}`functional ` way, and the `distribution` classes are there mostly to facilitate porting the `PyMC3` v3.x code to the new `PyMC` v4.x version, add PyMC API features and keep related methods organized together. +After implementing the new `RandomVariable` `Op`, it's time to make use of it in a new PyMC {class}`~pymc.Distribution`. +PyMC `>=4.0.0` works in a very {term}`functional ` way, and the `distribution` classes are there mostly to facilitate porting the `PyMC3` v3.x code to PyMC `>=4.0.0`, add PyMC API features and keep related methods organized together. In practice, they take care of: -1. Linking ({term}`Dispatching`) a rv_op class with the corresponding `moment`, `logp` and `logcdf` methods. +1. Linking ({term}`Dispatching`) an rv_op class with the corresponding `moment`, `logp` and `logcdf` methods. 1. Defining a standard transformation (for continuous distributions) that converts a bounded variable domain (e.g., positive line) to an unbounded domain (i.e., the real line), which many samplers prefer. -1. 
Validating the parametrization of a distribution and converting non-symbolic inputs (i.e., numeric literals or numpy arrays) to symbolic variables. +1. Validating the parametrization of a distribution and converting non-symbolic inputs (i.e., numeric literals or NumPy arrays) to symbolic variables. 1. Converting multiple alternative parametrizations to the standard parametrization that the `RandomVariable` is defined in terms of. Here is how the example continues: @@ -136,16 +136,16 @@ from pymc.distributions.dist_math import check_parameters # Subclassing `PositiveContinuous` will dispatch a default `log` transformation class Blah(PositiveContinuous): # This will be used by the metaclass `DistributionMeta` to dispatch the - # class `logp` and `logcdf` methods to the `blah` `op` + # class `logp` and `logcdf` methods to the `blah` `Op` defined in the last line of the code above. rv_op = blah - # dist() is responsible for returning an instance of the rv_op. We pass - # the standard parametrizations to super().dist + # dist() is responsible for returning an instance of the rv_op. + # We pass the standard parametrizations to super().dist @classmethod def dist(cls, param1, param2=None, alt_param2=None, **kwargs): param1 = at.as_tensor_variable(intX(param1)) if param2 is not None and alt_param2 is not None: - raise ValueError('Only one of param2 and alt_param2 is allowed') + raise ValueError("Only one of param2 and alt_param2 is allowed.") if alt_param2 is not None: param2 = 1 / alt_param2 param2 = at.as_tensor_variable(floatX(param2)) @@ -155,15 +155,16 @@ class Blah(PositiveContinuous): return super().dist([param1, param2], **kwargs) # moment returns a symbolic expression for the stable moment from which to start sampling - # the variable, given the implicit `rv`, `size` and `param1` ... `paramN` + # the variable, given the implicit `rv`, `size` and `param1` ... `paramN`. + # This is typically a "representative" point such as the the mean or mode. 
def moment(rv, size, param1, param2): moment, _ = at.broadcast_arrays(param1, param2) if not rv_size_is_none(size): moment = at.full(size, moment) return moment - # Logp returns a symbolic expression for the logp evaluation of the variable - # given the `value` of the variable and the parameters `param1` ... `paramN` + # Logp returns a symbolic expression for the elementwise log-pdf or log-pmf evaluation + # of the variable given the `value` of the variable and the parameters `param1` ... `paramN`. def logp(value, param1, param2): logp_expression = value * (param1 + at.log(param2)) @@ -175,14 +176,14 @@ class Blah(PositiveContinuous): ) # We use `check_parameters` for parameter validation. After the default expression, - # multiple comma-separated symbolic conditions can be added. Whenever - # a bound is invalidated, the returned expression raises an error with the message - # defined in the optional `msg` keyword argument. + # multiple comma-separated symbolic conditions can be added. + # Whenever a bound is invalidated, the returned expression raises an error + # with the message defined in the optional `msg` keyword argument. return check_parameters( logp_expression, param2 >= 0, msg="param2 >= 0", - ) + ) # logcdf works the same way as logp. For bounded variables, it is expected to return # `-inf` for values below the domain start and `0` for values above the domain end. @@ -193,12 +194,12 @@ class Blah(PositiveContinuous): Some notes: -1. A distribution should at the very least inherit from {class}`~pymc.distributions.Discrete` or {class}`~pymc.distributions.Continuous`. For the latter, more specific subclasses exist: `PositiveContinuous`, `UnitContinuous`, `BoundedContinuous`, `CircularContinuous`, `SimplexContinuous`, which specify default transformations for the variables. 
If you need to specify a one-time custom transform you can also create a `_default_transform` dispatch function as is done for the {class}`~pymc.distributions.multivariate.LKJCholeskyCov`. -1. If a distribution does not have a corresponding `random` implementation, a `RandomVariable` should still be created that raises a `NotImplementedError`. This is the case for the {class}`~pymc.distributions.continuous.Flat`. In this case it will be necessary to provide a `moment` method. -1. As mentioned above, `PyMC` v4.x works in a very {term}`functional ` way, and all the information that is needed in the `logp` and `logcdf` methods is expected to be "carried" via the `RandomVariable` inputs. You may pass numerical arguments that are not strictly needed for the `rng_fn` method but are used in the `logp` and `logcdf` methods. Just keep in mind whether this affects the correct shape inference behavior of the `RandomVariable`. If specialized non-numeric information is needed you might need to define your custom`_logp` and `_logcdf` {term}`Dispatching` functions, but this should be done as a last resort. +1. A distribution should at the very least inherit from {class}`~pymc.Discrete` or {class}`~pymc.Continuous`. For the latter, more specific subclasses exist: `PositiveContinuous`, `UnitContinuous`, `BoundedContinuous`, `CircularContinuous`, `SimplexContinuous`, which specify default transformations for the variables. If you need to specify a one-time custom transform you can also create a `_default_transform` dispatch function as is done for the {class}`~pymc.distributions.multivariate.LKJCholeskyCov`. +1. If a distribution does not have a corresponding `rng_fn` implementation, a `RandomVariable` should still be created to raise a `NotImplementedError`. This is, for example, the case in {class}`~pymc.distributions.continuous.Flat`. 
In this case, it will be necessary to provide a `moment` method, because without a `rng_fn`, PyMC can't fall back to a random draw to use as an initial point for MCMC.
1. As mentioned above, PyMC `>=4.0.0` works in a very {term}`functional ` way, and all the information that is needed in the `logp` and `logcdf` methods is expected to be "carried" via the `RandomVariable` inputs. You may pass numerical arguments that are not strictly needed for the `rng_fn` method but are used in the `logp` and `logcdf` methods. Just keep in mind whether this affects the correct shape inference behavior of the `RandomVariable`. If specialized non-numeric information is needed, you might need to define your custom `_logp` and `_logcdf` {term}`Dispatching` functions, but this should be done as a last resort.
1. The `logcdf` method is not a requirement, but it's a nice plus!
1. Currently, only one moment is supported in the `moment` method, and probably the "higher-order" one is the most useful (that is `mean` > `median` > `mode`)... You might need to truncate the moment if you are dealing with a discrete distribution.
1. When creating the `moment` method, be careful with `size != None` and broadcast properly also based on parameters that are not necessarily used to calculate the moment. For example, the `sigma` in `pm.Normal.dist(mu=0, sigma=np.arange(1, 6))` is irrelevant for the moment, but may nevertheless inform about the shape.
In this case, the `moment` should return `[mu, mu, mu, mu, mu]`. For a quick check that things are working you can try the following: @@ -292,11 +293,11 @@ You can see an example in {class}`~pymc.test.test_random.TestDirichlet`. ### Notes on `check_pymcs_draws_match_reference` test The `check_pymcs_draws_match_reference` is a very simple test for the equality of draws from the `RandomVariable` and the exact same python function, given the same inputs and random seed. -A small number (`size=15`) is checked. This is not supposed to be a test for the correctness of the random generator. -The latter kind of test (if warranted) can be performed with the aid of `pymc_random` and `pymc_random_discrete` methods in the same test file, which will perform an expensive statistical comparison between the RandomVariable `rng_fn` and a reference Python function. -This kind of test only makes sense if there is a good independent generator reference (i.e., not just the same composition of numpy / scipy python calls that is done inside `rng_fn`). +A small number (`size=15`) is checked. This is not supposed to be a test for the correctness of the random number generator. +The latter kind of test (if warranted) can be performed with the aid of `pymc_random` and `pymc_random_discrete` methods in the same test file, which will perform an expensive statistical comparison between the `RandomVariable.rng_fn` and a reference Python function. +This kind of test only makes sense if there is a good independent generator reference (i.e., not just the same composition of NumPy / SciPy calls that is done inside `rng_fn`). -Finally, when your `rng_fn` is doing something more than just calling a `numpy` or `scipy` method, you will need to setup an equivalent seeded function with which to compare for the exact draws (instead of relying on `seeded_[scipy|numpy]_distribution_builder`). 
+Finally, when your `rng_fn` is doing something more than just calling a NumPy or SciPy method, you will need to set up an equivalent seeded function with which to compare for the exact draws (instead of relying on `seeded_[scipy|numpy]_distribution_builder`). You can find an example in {class}`~pymc.tests.test_distributions_random.TestWeibull`, whose `rng_fn` returns `beta * np.random.weibull(alpha, size=size)`. @@ -342,13 +343,13 @@ def test_blah(self): ``` -These methods will perform a grid evaluation on the combinations of domain and paramdomains values, and check that the pymc methods and the reference functions match. +These methods will perform a grid evaluation on the combinations of `domain` and `paramdomains` values, and check that the PyMC methods and the reference functions match. There are a couple of details worth keeping in mind: -1. By default the first and last values (edges) of the `Domain` are not compared (they are used for other things). If it is important to test the edge of the `Domain`, the edge values can be repeated. This is done by the `Bool`: `Bool = Domain([0, 0, 1, 1], "int64")` -1. There are some default domains (such as `R` and `Rplus`) that you can use for testing your new distribution, but it's also perfectly fine to create your own domains inside the test function if there is a good reason for it (e.g., when the default values lead too many extreme unlikely combinations that are not very informative about the correctness of the implementation). -1. By default, a random subset of 100 `param` x `paramdomain` combinations is tested, in order to keep the test runtime under control. When testing your shiny new distribution, you can temporarily set `n_samples=-1` to force all combinations to be tested. This is important to avoid the your `PR` leading to surprising failures in future runs whenever some bad combinations of parameters are randomly tested. -1. 
On Github all tests are run twice, under the `aesara.config.floatX` flags of `"float64"` and `"float32"`. However, the reference Python functions will run in a pure "float64" environment, which means the reference and the PyMC results can diverge quite a lot (e.g., underflowing to `-np.inf` for extreme parameters). You should therefore make sure you test locally in both regimes. A quick and dirty way of doing this is to temporariliy add `aesara.config.floatX = "float32"` at the very top of file, immediately after `import aesara`. Remember to set `n_samples=-1` as well to test all combinations. The test output will show what exact parameter values lead to a failure. If you are confident that your implementation is correct, you may opt to tweak the decimal precision with `select_by_precision`, or adjust the tested `Domain` values. In extreme cases, you can mark the test with a conditional `xfail` (if only one of the sub-methods is failing, they should be separated, so that the `xfail` is as narrow as possible):
1. By default, the first and last values (edges) of the `Domain` are not compared (they are used for other things). If it is important to test the edge of the `Domain`, the edge values can be repeated. This is done by the `Bool`: `Bool = Domain([0, 0, 1, 1], "int64")`
1. There are some default domains (such as `R` and `Rplus`) that you can use for testing your new distribution, but it's also perfectly fine to create your own domains inside the test function if there is a good reason for it (e.g., when the default values lead to too many extreme, unlikely combinations that are not very informative about the correctness of the implementation).
1. By default, a random subset of 100 `param` x `paramdomain` combinations is tested, to keep the test runtime under control. When testing your shiny new distribution, you can temporarily set `n_samples=-1` to force all combinations to be tested.
This is important to avoid your `PR` leading to surprising failures in future runs whenever some bad combinations of parameters are randomly tested.
1. On GitHub, some tests run twice, under the `aesara.config.floatX` flags of `"float64"` and `"float32"`. However, the reference Python functions will run in a pure "float64" environment, which means the reference and the PyMC results can diverge quite a lot (e.g., underflowing to `-np.inf` for extreme parameters). You should therefore make sure you test locally in both regimes. A quick and dirty way of doing this is to temporarily add `aesara.config.floatX = "float32"` at the very top of the file, immediately after `import aesara`. Remember to set `n_samples=-1` as well to test all combinations. The test output will show what exact parameter values lead to a failure. If you are confident that your implementation is correct, you may opt to tweak the decimal precision with `select_by_precision`, or adjust the tested `Domain` values. In extreme cases, you can mark the test with a conditional `xfail` (if only one of the sub-methods is failing, they should be separated, so that the `xfail` is as narrow as possible):

```python