
Usability improvements #53

Merged 4 commits on Dec 23, 2021
24 changes: 12 additions & 12 deletions .github/workflows/numpy.yml
@@ -23,30 +23,30 @@ jobs:
       python -m pip install -r requirements.txt
     - name: Run the test suite
       env:
-        ARRAY_API_TESTS_MODULE: numpy.array_api
+        XPTESTS_MODULE: numpy.array_api
       run: |
         # Mark some known issues as XFAIL
         cat << EOF >> xfails.txt

         # https://github.com/numpy/numpy/issues/18881
-        array_api_tests/test_creation_functions.py::test_linspace
+        xptests/test_creation_functions.py::test_linspace
         # einsum is not yet completed in the spec
-        array_api_tests/test_signatures.py::test_has_names[einsum]
+        xptests/test_signatures.py::test_has_names[einsum]
         # dlpack support is not yet implemented in NumPy
         # See https://github.com/numpy/numpy/pull/19083
-        array_api_tests/test_signatures.py::test_function_positional_args[__dlpack__]
-        array_api_tests/test_signatures.py::test_function_positional_args[__dlpack_device__]
-        array_api_tests/test_signatures.py::test_function_positional_args[from_dlpack]
-        array_api_tests/test_signatures.py::test_function_positional_args[to_device]
-        array_api_tests/test_signatures.py::test_function_keyword_only_args[__dlpack__]
+        xptests/test_signatures.py::test_function_positional_args[__dlpack__]
+        xptests/test_signatures.py::test_function_positional_args[__dlpack_device__]
+        xptests/test_signatures.py::test_function_positional_args[from_dlpack]
+        xptests/test_signatures.py::test_function_positional_args[to_device]
+        xptests/test_signatures.py::test_function_keyword_only_args[__dlpack__]
         # floor_divide has an issue related to https://github.com/data-apis/array-api/issues/264
-        array_api_tests/test_elementwise_functions.py::test_floor_divide
+        xptests/test_elementwise_functions.py::test_floor_divide
         # mesgrid doesn't return all arrays as the promoted dtype
-        array_api_tests/test_type_promotion.py::test_meshgrid
+        xptests/test_type_promotion.py::test_meshgrid
         # https://github.com/numpy/numpy/pull/20066#issuecomment-947056094
-        array_api_tests/test_type_promotion.py::test_where
+        xptests/test_type_promotion.py::test_where
         # shape mismatches are not handled
-        array_api_tests/test_type_promotion.py::test_tensordot
+        xptests/test_type_promotion.py::test_tensordot

         EOF

270 changes: 158 additions & 112 deletions README.md
@@ -1,166 +1,212 @@
-# Array API Standard Test Suite
+# Test Suite for Array API Compliance

-This is the test suite for the Python array API standard.
+This is the test suite for array libraries adopting the [Python Array API
+standard](https://data-apis.org/array-api/).

-**NOTE: This test suite is still a work in progress.**
+Note the suite is still a **work in progress**. Feedback and contributions are
+welcome!

-Feedback and contributions are welcome, but be aware that this suite is not
-yet completed. In particular, there are still many parts of the array API
-specification that are not yet tested here.

-## Running the tests
+## Quickstart

+### Setup

-To run the tests, first install the testing dependencies
+To run the tests, install the testing dependencies.

-    pip install pytest hypothesis

-or

-    conda install pytest hypothesis

-as well as the array libraries that you want to test.

+```bash
+$ pip install -r requirements.txt
+```

+Ensure you have the array library that you want to test installed.

+### Specifying the array module

+You need to specify the array library to test. It can be specified via the
+`XPTESTS_MODULE` environment variable, e.g.

+```bash
+$ export XPTESTS_MODULE=numpy.array_api
+```

-### Specifying the array module

-To run the tests, you need to set the array library that is to be tested. There
-are two ways to do this. One way is to set the `ARRAY_API_TESTS_MODULE`
-environment variable. For example you can set it when running `pytest`

-    ARRAY_API_TESTS_MODULE=numpy pytest

-Alternately, edit the `array_api_tests/_array_module.py` file and change the
-line

-```py
-array_module = None
-```

-to

-```py
-import numpy as array_module
-```

-(replacing `numpy` with the array module namespace to be tested).

+Alternately, change the `array_module` variable in `xptests/_array_module.py`,
+e.g.

+```diff
+- array_module = None
++ import numpy.array_api as array_module
+```

+### Run the suite

+Simply run `pytest` against the `xptests/` folder to run the full suite.

+```bash
+$ pytest xptests/
+```

+The suite tries to logically organise its tests. `pytest` allows you to only run
+a specific test case, which is useful when developing functions.

+```bash
+$ pytest xptests/test_creation_functions.py::test_zeros
+```
+## What the test suite covers

+We are interested in array libraries conforming to the
+[spec](https://data-apis.org/array-api/latest/API_specification/index.html).
+Ideally this means that if a library has fully adopted the Array API, the test
+suite passes. We take great care to _not_ test things which are out-of-scope, so
+as to not unexpectedly fail the suite.

+### Primary tests

+Every function—including array object methods—has a respective test method. We
+use [Hypothesis](https://hypothesis.readthedocs.io/en/latest/) to generate a
+diverse set of valid inputs. This means array inputs will cover different dtypes
+and shapes, as well as contain interesting elements. Examples are generated with
+interesting arrangements of non-array positional arguments and keyword
+arguments.

+Each test case will cover the following areas if relevant:

+* **Smoking**: We pass our generated examples to all functions. As these
+  examples solely consist of *valid* inputs, we are testing that functions can
+  be called using their documented inputs without raising errors.

-### Specifying test cases

-The test suite tries to logically organise its tests so you can find specific
-test cases whilst developing something in particular. So to avoid running the
-rather slow complete suite, you can specify particular test cases like any other
-test suite.

-    pytest array_api_tests/test_creation_functions.py::test_zeros

+* **Data type**: For functions returning/modifying arrays, we assert that output
+  arrays have the correct data types. Most functions
+  [type-promote](https://data-apis.org/array-api/latest/API_specification/type_promotion.html)
+  input arrays and some functions have bespoke rules—in both cases we simulate
+  the correct behaviour to find the expected data types.

+* **Shape**: For functions returning/modifying arrays, we assert that output
+  arrays have the correct shape. Most functions
+  [broadcast](https://data-apis.org/array-api/latest/API_specification/broadcasting.html)
+  input arrays and some functions have bespoke rules—in both cases we simulate
+  the correct behaviour to find the expected shapes.

+* **Values**: We assert that output values (including the elements of
+  returned/modified arrays) are as expected. Except for manipulation functions
+  or special cases, the spec allows floating-point inputs to have inexact
+  outputs, so with such examples we only assert values are roughly as expected.
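The shape simulation mentioned in the bullets above amounts to implementing the spec's broadcasting rules. As a minimal illustrative sketch (this is not the suite's actual helper, just the rule the suite has to model):

```python
from itertools import zip_longest

def broadcast_shapes(shape1, shape2):
    """Compute the broadcast result shape per the spec's rules: align
    shapes from the trailing dimensions, treating size-1 dimensions as
    stretchable."""
    result = []
    for d1, d2 in zip_longest(reversed(shape1), reversed(shape2), fillvalue=1):
        if d1 == 1:
            result.append(d2)
        elif d2 == 1 or d1 == d2:
            result.append(d1)
        else:
            raise ValueError(f"shapes {shape1} and {shape2} do not broadcast")
    return tuple(reversed(result))

print(broadcast_shapes((3, 1), (2,)))  # (3, 2)
```

A test can then assert `out.shape == broadcast_shapes(a.shape, b.shape)` for any binary function that broadcasts its inputs.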

-## Notes on Interpreting Errors

-- Some tests cannot be run unless other tests pass first. This is because very
-  basic APIs such as certain array creation APIs are required for a large
-  fraction of the tests to run. TODO: Write which tests are required to pass
-  first here.

-- If an error message involves `_UndefinedStub`, it means some name that is
-  required for the test to run is not defined in the array library.

-- Due to the nature of the array API spec, virtually every array library will
-  produce a large number of errors from nonconformance. It is still a work in
-  progress to enable reporting the errors in a way that makes them easy to
-  understand, even if there are a large number of them.

-- The spec documents are the ground source of truth. If the test suite appears
-  to be testing something that is different from the spec, or something that
-  isn't actually mentioned in the spec, this is a bug. [Please report
-  it](https://github.com/data-apis/array-api-tests/issues/new). Furthermore,
-  be aware that some aspects of the spec are either impossible or extremely
-  difficult to actually test, so they are not covered in the test suite (TODO:
-  list what these are).

+### Additional tests

+In addition to having one test case for each function, we test other properties
+of the functions and some miscellaneous things.

+* **Special cases**: For functions with special-case behaviour, we assert that
+  these functions return the correct values.

+* **Signatures**: We assert functions have the correct signatures.

+* **Constants**: We assert that
+  [constants](https://data-apis.org/array-api/latest/API_specification/constants.html)
+  behave as expected, are roughly the expected value, and that any related
+  functions interact with them correctly.

+Be aware that some aspects of the spec are impractical or impossible to actually
+test, so they are not covered in the suite. <!-- TODO: note what these are -->

-## Configuring Tests
+## Interpreting errors

+First and foremost, note that most tests have to assume that certain aspects of
+the Array API have been correctly adopted, as fundamental APIs such as array
+creation and equalities are hard requirements for many assertions. This means a
+test case for one function might fail because another function has bugs, or
+even no implementation at all.

+Adopting libraries will therefore see a vast number of cascading errors at
+first. Generally the nature of the spec means granular details such as type
+promotion are likely to fail even in nearly-conforming functions.

+We hope to improve the user experience in regards to "noisy" errors in
+[#51](https://github.com/data-apis/array-api-tests/issues/51). For now, if an
+error message involves `_UndefinedStub`, it means an attribute of the array
+library (including functions) or of its objects (e.g. the array) is missing.

+The spec is the suite's source of truth. If the suite appears to assume
+behaviour different from the spec, or test something that is not documented,
+this is a bug—please [report such
+issues](https://github.com/data-apis/array-api-tests/issues/) to us.

+## Configuration

+By default, tests for the optional Array API extensions such as
+[`linalg`](https://data-apis.org/array-api/latest/extensions/linear_algebra_functions.html)
+will be skipped if not present in the specified array module. You can purposely
+skip testing extension(s) via the `--disable-extension` option, and likewise
+purposely test them via the `--enable-extension` option.

-The tests make heavy use of the
-[Hypothesis](https://hypothesis.readthedocs.io/en/latest/) testing library.
-Hypothesis generates random input values for the tests. You can configure how
-many values are generated and run using the `--max-examples` flag. The default
-`--max-examples` is 100. For example, `--max-examples 50` will only generate
-half as many examples and as a result, the test suite will run in about half
-the time. Setting `--max-examples` to a lower value can be useful when you
-want to have a faster test run. It can also be useful to set `--max-examples`
-to a large value to do a longer, more rigorous run of the tests. For example,
-`--max-examples 10000` will do a very rigorous check of the tests, but may
-take a few hours to run.

-## Contributing

-### Adding Tests

+The tests make heavy use of
+[Hypothesis](https://hypothesis.readthedocs.io/en/latest/). You can configure
+how many examples are generated using the `--max-examples` flag, which defaults
+to 100. Lower values can be useful for quick checks, and larger values should
+result in more rigorous runs. For example, `--max-examples 10000` may find bugs
+where default runs don't, but will take a much longer time.
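Such a flag builds on Hypothesis's own example-count setting. As a rough sketch of the mechanism (the profile name here is made up, and this is not the suite's actual conftest code):

```python
from hypothesis import given, settings, strategies as st

# Register a settings profile with a custom example count, then activate it.
settings.register_profile("quick", max_examples=10)
settings.load_profile("quick")

@given(st.integers(), st.integers())
def test_add_commutes(x, y):
    # Under the "quick" profile, Hypothesis generates at most 10 examples.
    assert x + y == y + x

test_add_commutes()  # calling a @given-decorated function runs the engine
```

A conftest can wire a `--max-examples` pytest option into `settings.register_profile(..., max_examples=...)` in exactly this way.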

-It is important that every test in the test suite only uses APIs that are part
-of the standard. This means that, for instance, when creating test arrays, you
-should only use array creation functions that are part of the spec, such as
-`ones` or `full`. It also means that many array testing functions that are
-built in to libraries like numpy are reimplemented in the test suite (see
-`array_api_tests/pytest_helpers.py`, `array_api_tests/array_helpers.py`, and
-`array_api_tests/hypothesis_helpers.py`).

+<!-- TODO: howto on CI -->

-In order to enforce this, the `array_api_tests._array_module` should be used
-everywhere in place of the actual array module that is being tested.

+## Contributing

-### Hypothesis
+### Remain in-scope

-The test suite uses [Hypothesis](https://hypothesis.readthedocs.io/en/latest/)
-to generate random input data. Any test that should be applied over all
-possible array inputs should use Hypothesis tests. Custom Hypothesis
-strategies are in the `array_api_tests/hypothesis_helpers.py` file.

+It is important that every test only uses APIs that are part of the standard.
+For instance, when creating input arrays you should only use the [array creation
+functions](https://data-apis.org/array-api/latest/API_specification/creation_functions.html)
+that are documented in the spec. The same goes for testing arrays—you'll find
+many utilities that parallel NumPy's own test utils in the `*_helpers.py` files.

-### Parameterization
+### Tools

-Any test that applies over all functions in a module should use
-`pytest.mark.parametrize` to parameterize over them. For example,
+Hypothesis should always be used for the primary tests, and can be useful
+elsewhere. Effort should be made so drawn arguments are labeled with their
+respective names. For
+[`st.data()`](https://hypothesis.readthedocs.io/en/latest/data.html#hypothesis.strategies.data),
+draws should be accompanied with the `label` kwarg, i.e. `data.draw(<strategy>,
+label=<label>)`.

-```py
-from . import function_stubs
+[`pytest.mark.parametrize`](https://docs.pytest.org/en/latest/how-to/parametrize.html)
+should be used to run tests over multiple arguments. Parameterization should be
+preferred over using Hypothesis when there are a small number of possible
+inputs, as this allows better failure reporting. Note using both parametrize and
+Hypothesis for a single test method is possible and can be quite useful.

-@pytest.mark.parametrize('name', function_stubs.__all__)
-def test_whatever(name):
-    ...
-```
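Combining `pytest.mark.parametrize` with Hypothesis's `@given` can look like this — a minimal sketch, where the operators are stand-ins rather than functions from the suite:

```python
import operator

import pytest
from hypothesis import given, strategies as st

# pytest reports a separate result per parametrized op, while Hypothesis
# fills in the x and y arguments from the integer strategies.
@pytest.mark.parametrize("op", [operator.add, operator.mul])
@given(x=st.integers(), y=st.integers())
def test_commutative(op, x, y):
    assert op(x, y) == op(y, x)
```

Note the decorator order: `parametrize` sits outermost so pytest supplies `op`, and `@given` supplies only the strategy-backed arguments.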
+### Error messages

-will parameterize `test_whatever` over all the function stubs generated from
-the spec. Parameterization should be preferred over using Hypothesis whenever
-there are a finite number of input possibilities, as this will cause pytest to
-report failures for all input values separately, as opposed to Hypothesis
-which will only report one failure.

+Any assertion should be accompanied with a descriptive error message, including
+the relevant values. Error messages should be self-explanatory as to why a given
+test fails, as one should not need prior knowledge of how the test is
+implemented.
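For example, a dtype assertion with a self-explanatory message might read as follows (a hypothetical helper, not one from the suite):

```python
def assert_dtype(func_name, in_dtypes, out_dtype, expected):
    """Assert out_dtype matches expected, with a message that names the
    function, its input dtypes, and both actual and expected results."""
    f_in = ", ".join(str(d) for d in in_dtypes)
    msg = (
        f"out.dtype={out_dtype}, but should be {expected} "
        f"[{func_name}({f_in})]"
    )
    assert out_dtype == expected, msg

assert_dtype("add", ["int8", "int8"], "int8", "int8")  # passes silently
```

On failure the message alone tells the reader which function misbehaved and what the type promotion should have produced.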

-### Error Strings
+### Generated files

-Any assertion or exception should be accompanied with a useful error message.
-The test suite is designed to be run by people who are not familiar with the
-test suite code, so the error messages should be self-explanatory as to why
-the module fails a given test.

-### Meta-errors

-Any error that indicates a bug in the test suite itself, rather than in the
-array module not following the spec, should use `RuntimeError` whenever
-possible.

-(TODO: Update this policy to something better. See [#5](https://github.com/data-apis/array-api-tests/issues/5).)

+Some files in the suite are automatically generated from the spec, and should
+not be edited directly. To regenerate these files, run the script

+    ./generate_stubs.py path/to/array-api

+where `path/to/array-api` is the path to a local clone of the [`array-api`
+repo](https://github.com/data-apis/array-api/). Edit `generate_stubs.py` to make
+changes to the generated files.
+## Future plans

-### Automatically Generated Files

-Some files in the test suite are automatically generated from the API spec
-files. These files should not be edited directly. To regenerate these files,
-run the script

-    ./generate_stubs.py path/to/array-api

-where `path/to/array-api` is the path to the local clone of the `array-api`
-repo. To modify the automatically generated files, edit the code that
-generates them in the `generate_stubs.py` script.

+Keeping full coverage of the spec is an on-going priority as the Array API
+evolves.

+Additionally, we have features and general improvements planned. Work on such
+functionality is guided primarily by the concrete needs of developers
+implementing and using the Array API—be sure to [let us
+know](https://github.com/data-apis/array-api-tests/issues) of any limitations
+you come across.

+* A dependency graph for every test case, which could be used to modify pytest's
+  collection so that low-dependency tests are run first, and tests with faulty
+  dependencies would skip/xfail.

+* In some tests we've found it difficult to find appropriate assertion
+  parameters for output values (particularly epsilons for floating-point
+  outputs), so we need to review these and either implement assertions or
+  properly note the lack thereof.
6 changes: 3 additions & 3 deletions conftest.py
@@ -4,8 +4,8 @@
from pytest import mark
from hypothesis import settings

-from array_api_tests import _array_module as xp
-from array_api_tests._array_module import _UndefinedStub
+from xptests import _array_module as xp
+from xptests._array_module import _UndefinedStub


settings.register_profile('xp_default', deadline=800)
@@ -77,7 +77,7 @@ def xp_has_ext(ext: str) -> bool:
     if xfails_path.exists():
         with open(xfails_path) as f:
             for line in f:
-                if line.startswith('array_api_tests'):
+                if line.startswith('xptests'):
                     id_ = line.strip('\n')
                     xfail_ids.append(id_)