BUG: ICDF implementation for the discrete geometric distribution fails some tests. #6670

Closed
@gokuld

Describe the issue:

The ICDF of the discrete geometric distribution applies the ceiling function to a ratio of two floating-point values. When this ratio is close to an integer, even a rounding error in the last digits of precision can push it across the integer boundary, so the ceiling function returns the wrong ICDF value.

The example below uses a Geometric distribution with p = 0.99 and a CDF value of 0.9999, for which the expected ICDF is 2. Inside the ICDF function, the ratio is defined as a = pt.log1p(-value) / pt.log1p(-p), which gives:

    Ratio a before ceil: 2.000000000000025
    Ratio a after ceil: 3.0
    Expected value after ceil: 2

The tiny numerical error in the trailing digits (about 2.5e-14) pushes the ratio just above 2, so ceil returns 3 instead of 2.

I also have a fix for this: truncate the floating-point ratio to a fixed number of digits before applying the ceiling function. This removes the sensitivity to such small numerical noise. I will push a commit with this fix shortly.
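To make the idea concrete, here is a minimal standalone sketch of the proposed approach. It is not the actual PyMC patch. It uses plain Python `math` rather than PyTensor, and it rounds the ratio as a stand-in for the truncation described above. The function names and the `decimals=9` tolerance are assumptions chosen for illustration.

```python
import math


def geometric_icdf_naive(q, p):
    """ICDF of the geometric distribution via ceil of the raw log ratio."""
    return math.ceil(math.log1p(-q) / math.log1p(-p))


def geometric_icdf_rounded(q, p, decimals=9):
    """Same formula, but the ratio is rounded before ceil is applied.

    `decimals` is a hypothetical tolerance: noise smaller than
    10**-decimals can no longer push the ratio past an integer.
    """
    ratio = math.log1p(-q) / math.log1p(-p)
    return math.ceil(round(ratio, decimals))


value = 0.9999  # CDF value for n=2
p = 0.99        # parameter of the geometric distribution

print(geometric_icdf_naive(value, p))    # affected by the floating-point noise
print(geometric_icdf_rounded(value, p))  # noise removed before ceil
```

The trade-off is that any ratio that genuinely lies above an integer by less than the chosen tolerance is also rounded down, so the number of digits needs to be small enough to absorb floating-point noise but large enough not to mask real differences.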

Reproducible code example:

import pytensor as pt
import pymc as pm
from pymc.logprob.basic import icdf
from pymc.pytensorf import inputvars
import numpy as np

value = 0.9999 # CDF value for n=2
p = 0.99 # Parameter p of the discrete Geometric distribution.

# The below ratio 'a' is used inside the ICDF function:
a = np.log1p(-value) / np.log1p(-p)

print(
    f"""
    Ratio before ceil: {a}
    Ratio after ceil: {np.ceil(a)}
    Expected value after ceil: 2
    """
)

dist = pm.Geometric.dist(p=p)

dist_icdf = icdf(dist, pt.tensor.type.TensorType(dtype='float64', shape=[])('value'))
dist_icdf_fn = pt.function(list(inputvars(dist_icdf)), dist_icdf)

assert dist_icdf_fn(value) == 2

Error message:

AssertionError

PyMC version information:

pytensor: 2.10.1
Python: 3.10.10
PyMC: latest commit 2a324bc

Context for the issue:

No response
