Description
Indexing with idx.nonzero() seems to be faster than indexing with the boolean mask idx directly, both in the C and Numba backends, and regardless of whether idx is constant or symbolic:
import numpy as np
import pytensor
import pytensor.tensor as pt
x = pt.vector("x", shape=(10_000,))
idx = np.random.default_rng(1).binomial(n=1, p=0.5, size=x.type.shape).astype(bool)
fn1 = pytensor.function([x], x[idx], trust_input=True)
fn1.dprint()
fn2 = pytensor.function([x], x[idx.nonzero()], trust_input=True)
fn2.dprint()
x_test = np.arange(x.type.shape[0]).astype(x.dtype)
%timeit fn1(x_test)
%timeit fn2(x_test)
# AdvancedSubtensor [id A] 0
# ├─ x [id B]
# └─ [ True Tr ... lse False] [id C]
# AdvancedSubtensor1 [id A] 0
# ├─ x [id B]
# └─ [ 0 1 ... 9994 9996] [id C]
# 52.5 μs ± 1.09 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
# 17.1 μs ± 494 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
The difference is also large in the Numba backend. This would allow us to simplify the codebase quite a lot by getting rid of boolean indices in our graph representation. There is only one case where a boolean index is not equivalent to indexing with its .nonzero(): when the boolean variable is a scalar. We don't support that case explicitly anyway.
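The equivalence claimed above, and the scalar exception, can be checked in plain NumPy. This is only a sketch of the indexing semantics, not of PyTensor's rewrite machinery. The variable names and the 3-D example array are made up for illustration, and it assumes PyTensor follows NumPy's advanced-indexing rules here.

```python
import numpy as np

rng = np.random.default_rng(1)

# 1-D case, mirroring the benchmark: mask and integer indices select the same elements.
x = np.arange(10_000, dtype="float64")
mask = rng.binomial(n=1, p=0.5, size=x.shape).astype(bool)
same_1d = np.array_equal(x[mask], x[mask.nonzero()])

# A 2-D mask applied to the leading dimensions of a 3-D array (illustrative shapes).
# mask_2d.nonzero() returns one integer array per mask dimension.
y = rng.normal(size=(3, 4, 5))
mask_2d = rng.binomial(n=1, p=0.5, size=(3, 4)).astype(bool)
same_nd = np.array_equal(y[mask_2d], y[mask_2d.nonzero()])

# The scalar exception: a 0-d boolean index adds a new leading axis
# instead of selecting along an existing one.
scalar_true_shape = x[True].shape    # (1, 10000)
scalar_false_shape = x[False].shape  # (0, 10000)

print(same_1d, same_nd, scalar_true_shape, scalar_false_shape)
```

The scalar case has no integer-index counterpart because there is no existing axis for nonzero() to index into, which is why it would need separate handling if it were ever supported.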