Skip to content

Add a better mutually_broadcastable_shapes strategy #73

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

asmeurer
Copy link
Member

This is based on the work from
Quansight-Labs/ndindex#126. The strategy that comes
with hypothesis has default parameters that makes it not generate interesting
examples by default. In particular, it will never generate a shape with more
than 2 dimensions and side length more than 3.

This new strategy generates shapes that are based on a base shape drawn from
shapes(), so more complex shapes are possible. This is preferable to doing so
by modifying the parameters to mutually_broadcastable_shapes() because it's
difficult to find parameters that give interesting examples but don't require
too much filtering for total size.

This commit also changes it so that mutually_broadcastable_shapes generates
shapes with minimum side length 0 by default.

For example, before this commit:

>>> from array_api_tests.hypothesis_helpers import mutually_broadcastable_shapes
>>> for i in range(10):
...     print(mutually_broadcastable_shapes(2).example())
((3,), (3, 1))
((), (3,))
((), (2,))
((1,), ())
((3,), ())
((), (1,))
((1,), (1, 1))
((2,), ())
((), (1,))
((3,), ())

After this commit:

>>> for i in range(10):
...     print(mutually_broadcastable_shapes(2).example())
((), (1, 1))
((0, 1), (1, 1, 1, 1))
((1, 3, 1, 1), (3, 1, 4))
((), ())
((1, 5, 1), (1, 5, 1))
((1, 1), (2, 2))
((1,), (2, 0, 1))
((), (1, 1, 1, 2))
((), ())
((), (1, 4, 1))

The generation with this new strategy is somewhat slower than the old one
(about 2x slower), but the generation speed is not slow enough that it should
matter too much.

The shrinking can also be imperfect with this strategy. For example, I've seen
it shrink to ((0, 1), (0, 2)) for the currently failing in-place tests where
it could shrink to just ((1,), (2,)). I also saw the pre-existing strategy
have the same problem, however. One should just be aware that test failures
involving shapes from this strategy might not actually be shrunk minimally.

This is based on the work from
Quansight-Labs/ndindex#126. The strategy that comes
with hypothesis has default parameters that makes it not generate interesting
examples by default. In particular, it will never generate a shape with more
than 2 dimensions and side length more than 3.

This new strategy generates shapes that are based on a base shape drawn from
shapes(), so more complex shapes are possible. This is preferable to doing so
by modifying the parameters to mutually_broadcastable_shapes() because it's
difficult to find parameters that give interesting examples but don't require
too much filtering for total size.

This commit also changes it so that mutually_broadcastable_shapes generates
shapes with minimum side length 0 by default.

For example, before this commit:

```py
>>> from array_api_tests.hypothesis_helpers import mutually_broadcastable_shapes
>>> for i in range(10):
...     print(mutually_broadcastable_shapes(2).example())
((3,), (3, 1))
((), (3,))
((), (2,))
((1,), ())
((3,), ())
((), (1,))
((1,), (1, 1))
((2,), ())
((), (1,))
((3,), ())
```

After this commit:

```py
>>> for i in range(10):
...     print(mutually_broadcastable_shapes(2).example())
((), (1, 1))
((0, 1), (1, 1, 1, 1))
((1, 3, 1, 1), (3, 1, 4))
((), ())
((1, 5, 1), (1, 5, 1))
((1, 1), (2, 2))
((1,), (2, 0, 1))
((), (1, 1, 1, 2))
((), ())
((), (1, 4, 1))
```

The generation with this new strategy is somewhat slower than the old one
(about 2x slower), but the generation speed is not slow enough that it should
matter too much.

The shrinking can also be imperfect with this strategy. For example, I've seen
it shrink to ((0, 1), (0, 2)) for the currently failing in-place tests where
it could shrink to just ((1,), (2,)). I also saw the pre-existing strategy
have the same problem, however. One should just be aware that test failures
involving shapes from this strategy might not actually be shrunk minimally.
Copy link
Member

@honno honno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generating base shapes is a nice idea. Shame about minimisation, would rather solve this now as opposed to later, but it is something I can mull over.

Carrying over kwargs to base_shape is blocking (see review comment), otherwise this LGTM.

Co-authored-by: Matthew Barber <quitesimplymatt@gmail.com>
@honno
Copy link
Member

honno commented Jan 14, 2022

We can merge this, but it's worth just checking if changing the defaulting does what you want—please try out #75. This would be preferable for minimisation behaviour, as well as be less of a maintenance burden.

Just printing the results of a typical run, it seems quite comparable. I don't have much practical experience with the strategy like you do though.

@asmeurer
Copy link
Member Author

The simpler change #75 should work for now. If we ever decide to modify shapes from the default strategy, this may end up being a better approach. Ultimately, I'd prefer if hypothesis allowed filtering both shapes and mutually_broadcastable_shapes by total size in a more direct way.

@asmeurer asmeurer closed this Jan 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants