Description
Minimal example to reproduce the warning:
>>> import stumpy
>>> import numpy as np
>>> stumpy.stump(np.array([0, 1, 2, 3, np.nan, np.nan, np.nan]), m=3)
/Users/user0/code/venv/lib/python3.10/site-packages/stumpy/core.py:2257: RuntimeWarning: divide by zero encountered in divide
Σ_T_inverse = 1.0 / Σ_T
mparray([[inf, -1, -1, -1],
[inf, -1, -1, -1],
[inf, -1, -1, -1],
[inf, -1, -1, -1],
[inf, -1, -1, -1]], dtype=object)
This happens when a subsequence of length at least m is all NaN. The output appears to handle this case without issues, but I'm curious whether we can handle the all-NaN subsequence case without raising a warning that may be confusing, or raise a more useful warning instead.
As indicated by the warning, part of Σ_T is 0 in stumpy/core/preprocess_diagonal, even though the preceding line attempts to avoid the divide-by-zero warning by setting constant sections of the rolling std to 1.0 instead of 0.0.
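To illustrate the mechanism described above, here is a minimal NumPy sketch. The `sigma` values and the `is_constant` mask are hypothetical stand-ins, not taken from stumpy; they only assume that one zero-std window is not flagged as constant, so the guard leaves it at 0.0 and the division warns.

```python
import warnings

import numpy as np

# Hypothetical rolling std values; the zero at index 2 stands in for a
# window that the constant check failed to flag (assumption for illustration)
sigma = np.array([0.8, 0.0, 0.0, 1.2])
is_constant = np.array([False, True, False, False])

# Guard: windows flagged as constant get a std of 1.0 instead of 0.0
sigma[is_constant] = 1.0

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    sigma_inverse = 1.0 / sigma  # index 2 is still 0.0, so this warns

warned = any(issubclass(w.category, RuntimeWarning) for w in caught)
print(warned, sigma_inverse)
```

The flagged window divides cleanly, while the unflagged zero produces inf and a RuntimeWarning.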
It looks like preprocess_diagonal calls process_isconstant --> rolling_isconstant --> _rolling_isconstant to check whether the subsequence is constant. Link to _rolling_isconstant.
Currently _rolling_isconstant says the sequence is not constant if any value is NaN. A possible fix would be to call a subsequence constant if all of its values are NaN.
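A small sketch of why NaN windows are reported as not constant, assuming the check is based on `np.ptp(...) == 0` as in the snippets in this issue: `np.ptp` propagates NaN, and NaN never compares equal to 0.

```python
import numpy as np

w = 3
a = np.array([0.0, 1.0, 2.0, 3.0, np.nan, np.nan, np.nan])

# Peak-to-peak range of every length-w window
ptp = np.array([np.ptp(a[i : i + w]) for i in range(a.shape[0] - w + 1)])

# Any window containing a NaN yields ptp == NaN, and NaN == 0 is False
is_constant = ptp == 0
print(ptp)
print(is_constant)
```

Every window here is reported as not constant, including the last one, which is entirely NaN.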
This fix could perhaps be implemented as something like this in _rolling_isconstant:
out = np.empty(l)
all_nan = np.empty(l, dtype=np.bool_)
for i in prange(l):
    out[i] = np.ptp(a[i : i + w])
    all_nan[i] = np.all(np.isnan(a[i : i + w]))
return (out == 0) | all_nan  # element-wise OR; `or` is ambiguous for arrays
or
out = np.empty(l, dtype=np.bool_)
for i in prange(l):
    out[i] = (np.ptp(a[i : i + w]) == 0) or np.all(np.isnan(a[i : i + w]))
return out
Would this break other things or be otherwise undesirable?
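For reference, here is a NumPy-only sketch of the proposed behavior that can be run outside of stumpy. The function name `rolling_isconstant_nan_aware` is hypothetical, and it uses `sliding_window_view` rather than the numba `prange` loop, so it only shows the intended output, not a drop-in replacement.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_isconstant_nan_aware(a, w):
    """Flag windows that are constant OR entirely NaN (hypothetical helper)."""
    windows = sliding_window_view(np.asarray(a, dtype=np.float64), w)
    # Peak-to-peak of zero means all values in the window are equal;
    # windows containing NaN give NaN here, which compares unequal to 0
    ptp_zero = (windows.max(axis=1) - windows.min(axis=1)) == 0
    # Proposed addition: also treat all-NaN windows as constant
    all_nan = np.isnan(windows).all(axis=1)
    return ptp_zero | all_nan


T = np.array([0.0, 1.0, 2.0, 3.0, np.nan, np.nan, np.nan])
mask = rolling_isconstant_nan_aware(T, 3)
print(mask)
```

With the input from the example at the top, only the last window (all NaN) is flagged; windows that mix finite values and NaN are still reported as not constant.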