
On the ambiguity of .shape behavior #891

Description

@ricardoV94

According to #97, a library can decide either to return a tuple[int | None, ...] or a custom tuple-like object:

The returned value should be a tuple; however, where warranted, an array library may choose to return a custom shape object. If an array library returns a custom shape object, the object must be immutable, must support indexing for dimension retrieval, and must behave similarly to a tuple.

This seems like a recipe for disaster? The second option allows operating on shape graphs, whereas the first fails as soon as you try to act on a None entry, say to find the size of some dimensions with prod(x.shape[1:]) (a contrived example so that .size wouldn't be applicable); see the sketch below.
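
A minimal sketch of that failure mode, assuming a library that takes the first option and reports the unknown dimension as None (the shape value here is made up):

import math

# Static shape as the first option would report it: the second dimension is unknown.
shape: tuple[int | None, ...] = (3, None)

# Computing the number of elements per row fails, because None does not support arithmetic.
math.prod(shape[1:])  # TypeError: unsupported operand type(s) for *: 'int' and 'NoneType'

A graph-aware shape object, by contrast, hands back entries that remain valid operands for the product.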

In PyTensor we have the distinction between variable.shape and variable.type.shape, which correspond to those two kinds of output (with the roles flipped, though: variable.shape is the symbolic one). It would seem odd to make variable.shape return a tuple with None; it doesn't make sense to build a computation on top of the static shape, because those None entries are not linked to anything.

import numpy as np

import pytensor
import pytensor.tensor as pt

x = pt.tensor("x", shape=(3, None,))
print(x.shape)  # Shape.0
print(x.type.shape)  # (3, None)

# Could not possibly work with x.type.shape
out = pt.broadcast_to(x, (2, x.shape[0], x.shape[1]))
print(out.type.shape)  # (2, 3, None)

assert out.eval({x: np.ones((3, 4))}).shape == (2, 3, 4)
assert out.eval({x: np.ones((3, 5))}).shape == (2, 3, 5)

Besides that, we sometimes also allow users to replace variables with different static shapes, although that's arguably a bit of undefined behavior. It seems to contradict the specification's requirement that the shape object be immutable, so I'm happy to say it's out of scope:

new_x = pt.tensor("x", shape=(4, 4))

# Even ignoring the issue of using None for unknown dimensions, the following could not work
# if the original graph had been built on top of the static dimension length of 3, as that is not "connected" to anything.
new_out = pytensor.graph.clone_replace(out, {x: new_x}, rebuild_strict=False)
print(new_out.type.shape)  # (2, 4, 4)

assert new_out.eval({new_x: np.ones((4, 4))}).shape == (2, 4, 4)

Proposal

Would it make sense to separate the two kinds of shape clearly? Perhaps as variable.shape and variable.static_shape. The first should be valid for building computations on top of variable shapes, statically known or not, while the second would allow libraries to reason as much as possible about what is known ahead of time (and choose to fail if the provided information is insufficient), without having to probe which kind of shape output a specific library returns. A rough sketch follows.
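
A rough sketch of how the split could map onto PyTensor's existing attributes; static_shape is the proposed (hypothetical) name and does not exist anywhere today:

import numpy as np
import pytensor.tensor as pt

x = pt.tensor("x", shape=(3, None))

# Proposed mapping (names illustrative):
#   x.shape        -> graph-friendly shape, always valid to compute with
#                     (what PyTensor's variable.shape already is)
#   x.static_shape -> plain tuple[int | None, ...] for ahead-of-time reasoning
#                     (what PyTensor exposes today as variable.type.shape)
print(x.type.shape)  # (3, None) -- reason statically about what is known

# Building on the symbolic shape stays valid even though one dimension is unknown
n_elem = pt.prod(x.shape)
print(n_elem.eval({x: np.ones((3, 4))}))  # 12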

This is somewhat related to #839, where a library may need as much information as possible to make a decision. Perhaps a static_value would also make sense, so that a library can return the entries that are known ahead of time; anyway, that should be discussed there.

If both options make sense, I would argue that .shape should behave the way PyTensor's does.

The standard should also specify whether library.shape(x) should match x.shape or x.static_shape. Again, I think it should match the first.

Labels: topic: Lazy/Graph (Lazy and graph-based array implementations)