Description
According to #97, a library can decide to return either a `tuple[int | None, ...]` or a tuple-like object:

> The returned value should be a tuple; however, where warranted, an array library may choose to return a custom shape object. If an array library returns a custom shape object, the object must be immutable, must support indexing for dimension retrieval, and must behave similarly to a tuple.
This seems like a recipe for disaster? The second option allows operating on shape graphs, whereas the first fails as soon as you try to act on the `None` entries, say to find the size of some dimensions by doing `prod(x.shape[1:])` (a forced example so that `.size` wouldn't be applicable).
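As a minimal illustration of that failure mode (plain Python, not tied to any particular array library): once a `None` shows up in the returned tuple, ordinary arithmetic on the shape entries breaks, whereas a shape-graph object could simply defer the computation.

```python
import math

# A library following the first option returns a plain tuple,
# with None marking dimensions that are unknown until runtime.
static_shape = (3, None)

# Fine while every entry is a concrete int...
print(math.prod((3, 4)[1:]))  # 4

# ...but any unknown dimension poisons the computation:
try:
    math.prod(static_shape[1:])
except TypeError as exc:
    print(exc)  # cannot multiply an int by None
```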
In PyTensor we have the distinction between `variable.shape` and `variable.type.shape`, which correspond to those two kinds of output. They are flipped, though, and it seems odd to make `variable.shape` return a tuple with `None` in it. It doesn't make sense to build a computation on top of the static shape, because those `None` entries are not linked to anything.
```python
import numpy as np
import pytensor
import pytensor.tensor as pt

x = pt.tensor("x", shape=(3, None))
print(x.shape)       # Shape.0
print(x.type.shape)  # (3, None)

# Could not possibly work with x.type.shape
out = pt.broadcast_to(x, (2, x.shape[0], x.shape[1]))
print(out.type.shape)  # (2, 3, None)

assert out.eval({x: np.ones((3, 4))}).shape == (2, 3, 4)
assert out.eval({x: np.ones((3, 5))}).shape == (2, 3, 5)
```
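For completeness, the entries of `x.shape` are themselves symbolic variables tied to `x`, so they resolve to concrete values once an input is provided (same setup as above, using the same `eval` method already shown there):

```python
# Each entry of x.shape is linked to x, unlike the None in x.type.shape,
# so it evaluates to a concrete value at runtime.
dim1 = x.shape[1]
print(dim1.eval({x: np.ones((3, 4))}))  # 4
print(dim1.eval({x: np.ones((3, 5))}))  # 5
```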
Besides that, we sometimes also allow users to replace variables with different static shapes, although that's arguably a bit of undefined behavior. It seems to contradict the specification that the shape object must be immutable, so I'm happy to say it's out of scope:
```python
new_x = pt.tensor("x", shape=(4, 4))

# Even ignoring the issue of using None for unknown dimensions, the following could not work
# if the original graph was built on top of the static dim length 3, as that's not "connected" to anything.
new_out = pytensor.graph.clone_replace(out, {x: new_x}, rebuild_strict=False)
print(new_out.type.shape)  # (2, 4, 4)

assert new_out.eval({new_x: np.ones((4, 4))}).shape == (2, 4, 4)
```
Proposal
Would it make sense to separate the two kinds of shape clearly? Perhaps as `variable.shape` and `variable.static_shape`. The first should be valid for building computations on top of variable shapes, statically known or not, while the second would allow libraries to reason as much as possible about what is known (and choose to fail if the provided information is insufficient), without having to probe which kind of shape output a specific library returns.
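To make the proposal concrete, here is a rough sketch of how a library could expose both attributes. Everything below is hypothetical (the `Array` class, `SymbolicDim`, and the `static_shape` name are only meant to illustrate the idea, not any existing API):

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SymbolicDim:
    """Hypothetical placeholder for a dimension only known at runtime."""

    source: "Array"  # the array this dimension belongs to
    axis: int


class Array:
    """Hypothetical array with a partially known static shape."""

    def __init__(self, static_shape: tuple[int | None, ...]):
        self._static_shape = static_shape

    @property
    def shape(self) -> tuple[int | SymbolicDim, ...]:
        # Safe to build further computations on: unknown dimensions are
        # represented by objects linked to this array, never by bare None.
        return tuple(
            dim if dim is not None else SymbolicDim(self, axis)
            for axis, dim in enumerate(self._static_shape)
        )

    @property
    def static_shape(self) -> tuple[int | None, ...]:
        # Purely informational: what is known ahead of time, None otherwise.
        return self._static_shape
```

A consumer that needs static information would look only at `static_shape` (and could fail loudly when it finds `None`), while code that merely propagates shapes through a graph would keep using `shape`.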
This is somewhat related to #839, where a library may need as much information as possible to make a decision. Perhaps a `static_value` would also make sense, for a library to return the entries that can be known ahead of time. Anyway, that should be discussed there.
If both options make sense, I would argue that `.shape` should behave like PyTensor's `variable.shape` does.
The standard should also specify whether `library.shape(x)` should match `x.shape` or `x.static_shape`. Again, I think it should match the former.
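If that direction is taken, the free function would just be a thin wrapper over the attribute (continuing the hypothetical sketch above):

```python
def shape(x: Array) -> tuple[int | SymbolicDim, ...]:
    # Mirrors x.shape, the graph-friendly variant; callers that only want
    # the statically known information query x.static_shape explicitly.
    return x.shape
```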