Convert boolean indices to integer with nonzero #1432

Open
@ricardoV94

Description

Indexing with idx.nonzero() seems to be faster than indexing with the boolean idx directly, in both the C and Numba backends, and regardless of whether idx is constant or symbolic:

import numpy as np
import pytensor
import pytensor.tensor as pt

x = pt.vector("x", shape=(10_000,))
idx = np.random.default_rng(1).binomial(n=1, p=0.5, size=x.type.shape).astype(bool)
fn1 = pytensor.function([x], x[idx], trust_input=True)
fn1.dprint()

fn2 = pytensor.function([x], x[idx.nonzero()], trust_input=True)
fn2.dprint()

x_test = np.arange(x.type.shape[0]).astype(x.dtype)
%timeit fn1(x_test)
%timeit fn2(x_test)

# AdvancedSubtensor [id A] 0
#  ├─ x [id B]
#  └─ [ True  Tr ... lse False] [id C]
# AdvancedSubtensor1 [id A] 0
#  ├─ x [id B]
#  └─ [   0    1 ... 9994 9996] [id C]
# 52.5 μs ± 1.09 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
# 17.1 μs ± 494 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

The difference is also large in the Numba backend. Rewriting boolean indices this way would let us simplify the codebase quite a lot by getting rid of boolean indices in our graph representation. The only case where a boolean index is not equivalent to .nonzero() is when the boolean variable is a scalar, but we don't explicitly support that anyway.
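For reference, here is a minimal sketch of the equivalence and of the scalar exception. It uses plain NumPy rather than PyTensor, on the assumption that PyTensor follows NumPy's indexing semantics here. The variable names are only illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# 1-D mask: boolean indexing and integer indexing via nonzero() agree.
x = np.arange(10)
mask = rng.random(x.shape) < 0.5
same_1d = np.array_equal(x[mask], x[mask.nonzero()])

# N-D mask: nonzero() returns one index array per dimension, in C order,
# which selects the same elements in the same order as the mask itself.
y = np.arange(12).reshape(3, 4)
mask2d = rng.random(y.shape) < 0.5
same_2d = np.array_equal(y[mask2d], y[mask2d.nonzero()])

# Scalar boolean index: this is the non-equivalent case. It does not select
# elements, it prepends a new axis of length 1 (True) or 0 (False).
true_shape = x[np.True_].shape
false_shape = x[np.False_].shape

print(same_1d, same_2d, true_shape, false_shape)
```

The scalar case acts more like an expand_dims followed by an empty or full slice than like element selection, so it would need separate handling (or remain unsupported) if boolean indices were always rewritten to nonzero().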
