Description
Describe the issue:
The ICDF of the discrete Geometric distribution applies the ceiling function to a ratio of two floating-point values. If this ratio is close to an integer, a rounding error in even the last digit of precision can push it across that integer, so the ceiling function returns the wrong ICDF value.
The example code below uses a Geometric distribution with p=0.99 and CDF value 0.9999, whose ICDF is expected to be 2. Within the ICDF function, the ratio a = pt.log1p(-value) / pt.log1p(-p) evaluates as follows:
Ratio a before ceil: 2.000000000000025
Ratio a after ceil: 3.0
Expected value after ceil: 2
The tiny numerical error of about 2.5e-14 (the trailing digits 000000000000025) pushes the ratio just above 2, so ceil returns 3 instead of 2.
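For reference, the expected value of 2 can be checked by hand. This is a short worked derivation, assuming the standard parameterization of the Geometric distribution with support k = 1, 2, ...; the symbols F, q, and k are introduced here only for illustration.

```latex
F(k) = 1 - (1 - p)^k
\quad\Longrightarrow\quad
F^{-1}(q) = \left\lceil \frac{\log(1 - q)}{\log(1 - p)} \right\rceil

% With p = 0.99 and q = 0.9999:
\frac{\log(1 - 0.9999)}{\log(1 - 0.99)}
  = \frac{\log(10^{-4})}{\log(10^{-2})}
  = 2
```

In exact arithmetic the ratio is exactly 2, so the ceiling should leave it unchanged; the value 3 comes purely from floating-point error.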
I also have a solution to this: truncating the floating-point value to a fixed number of digits before applying the ceiling function. This removes the sensitivity to tiny floating-point noise in the last digits. I will push a commit with this fix shortly.
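A minimal NumPy sketch of the idea, not the actual PyMC change: the helper name `geometric_icdf_rounded` and the `decimals=12` default are hypothetical, and `np.round` is used here as a stand-in for the truncation described above.

```python
import numpy as np


def geometric_icdf_rounded(value, p, decimals=12):
    """Hypothetical helper: Geometric ICDF with the ratio rounded before ceil."""
    # Same ratio that the report identifies inside the ICDF.
    ratio = np.log1p(-value) / np.log1p(-p)
    # Assumption: 12 decimal places is an arbitrary illustrative choice.
    # It discards noise in the last few digits of a float64 value.
    ratio = np.round(ratio, decimals)
    return np.ceil(ratio)


# The failing case from this report: p=0.99, CDF value 0.9999.
print(geometric_icdf_rounded(0.9999, 0.99))
```

One trade-off of this approach: a ratio that genuinely lies within the rounding tolerance above an integer would also be snapped down, so the number of digits kept has to be chosen with the dtype's precision in mind.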
Reproducible code example:
import pytensor as pt
import pymc as pm
from pymc.logprob.basic import icdf
from pymc.pytensorf import inputvars
import numpy as np
value = 0.9999 # CDF value for n=2
p = 0.99 # Parameter p of the discrete Geometric distribution.
# The below ratio 'a' is used inside the ICDF function:
a = np.log1p(-value) / np.log1p(-p)
print(
f"""
Ratio before ceil: {a}
Ratio after ceil: {np.ceil(a)}
Expected value after ceil: 2
"""
)
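# Build the ICDF graph for the distribution and compile it into a callable.
dist = pm.Geometric.dist(p=p)
dist_icdf = icdf(dist, pt.tensor.type.TensorType(dtype='float64', shape=[])('value'))
dist_icdf_fn = pt.function(list(inputvars(dist_icdf)), dist_icdf)
# Fails: the compiled ICDF returns 3 instead of the expected 2.
assert dist_icdf_fn(value) == 2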
Error message:
AssertionError
PyMC version information:
pytensor: 2.10.1
Python: 3.10.10
PyMC: latest commit 2a324bc
Context for the issue:
No response