|
1 |
| -## Coverage of the Test Suite |
| 1 | +## What this suite actually tests |
2 | 2 |
|
3 |
| -This document outlines the coverage of the test suite over the |
4 |
| -[spec](https://data-apis.org/array-api/) at a high level. |
| 3 | +`array-api-tests` tests that an array library adopting the [standard](https://data-apis.org/array-api/) indeed covers everything that is in scope.
5 | 4 |
|
6 |
| -The following things are tested |
| 5 | +## Primary tests |
7 | 6 |
|
8 |
| -* **Smoke tested** means that the function has a basic test that calls the |
9 |
| - function with some inputs, but does not imply any testing of the output |
10 |
| - value. This includes calling keyword arguments to the function, and checking |
11 |
| - that it takes the correct number of positional arguments. A smoke test will |
12 |
| - fail if the function is not implemented with the correct signature or raises |
13 |
| - an exception, but will not check any other aspect of the spec. |
| 7 | +Every function (including array object methods) has a respective test method. We use [Hypothesis](https://hypothesis.readthedocs.io/en/latest/) to generate a diverse set of valid inputs. This means array inputs will cover different dtypes and shapes, as well as contain interesting elements. These examples are generated together with interesting arrangements of non-array positional arguments and keyword arguments.
14 | 8 |
|
15 |
| -* **All Inputs** means that the function is tested with all possible inputs |
16 |
| - required by the spec (using hypothesis). This means all possible array |
17 |
| - shapes, all possible dtypes (that are required for the given function), and |
18 |
| - all possible values for the given dtype (omitting those whose behavior is |
19 |
| - undefined). |
| 9 | +Each test case will cover the following areas if relevant: |
20 | 10 |
|
21 |
| -* **Output Shape** means that the result shape is tested. For functions that |
22 |
| - take more than one argument, this means the result shape should produced |
23 |
| - from |
24 |
| - [broadcasting](https://data-apis.org/array-api/latest/API_specification/broadcasting.html) |
25 |
| - the input shapes. For functions of a single argument, the result shape |
26 |
| - should be the same as the input shape. |
| 11 | +* **Smoking**: We pass our generated examples to all functions. As these examples solely consist of *valid* inputs, we are testing that functions can be called using their documented inputs without raising errors. |
27 | 12 |
|
28 |
| -* **Output Dtype** means that the result dtype is tested. For (most) functions |
29 |
| - with a single argument, the result dtype should be the same as the input. |
30 |
| - For functions with two arguments, there are different possibilities, such as |
31 |
| - performing [type |
32 |
| - promotion](https://data-apis.org/array-api/latest/API_specification/type_promotion.html) |
33 |
| - or always returning a specific dtype (e.g., `equals()` should always return |
34 |
| - a `bool` array). |
| 13 | +* **Data type**: For functions returning/modifying arrays, we assert that output arrays have the correct data types. Most functions [type-promote](https://data-apis.org/array-api/latest/API_specification/type_promotion.html) input arrays and some functions have bespoke rules—in both cases we simulate the correct behaviour to find the expected data types. |
35 | 14 |
|
36 |
| -* **Output Values** means that the exact output is tested in some way. For |
37 |
| - functions that operate on floating-point inputs, the spec does not require |
38 |
| - exact values, so a "Yes" in this case will mean only that the output value |
39 |
| - is checked to be "close" to the numerically correct result. The exception to |
40 |
| - this is special cases for elementwise functions, which are tested exactly. |
41 |
| - For functions that operate on non-floating-point inputs, or functions like |
42 |
| - manipulation functions or indexing that simply rearrange the same values of |
43 |
| - the input arrays, a "Yes" means that the exact values are tested. Note that |
44 |
| - in many cases, certain values of inputs are left unspecified, and are thus |
45 |
| - not tested (e.g., the behavior for division by integer 0 is unspecified). |
| 15 | +* **Shape**: For functions returning/modifying arrays, we assert that output arrays have the correct shape. Most functions [broadcast](https://data-apis.org/array-api/latest/API_specification/broadcasting.html) input arrays and some functions have bespoke rules—in both cases we simulate the correct behaviour to find the expected shapes. |
46 | 16 |
|
47 |
| -* **Stacking** means that functions that operate on "stacks" of smaller data |
48 |
| - are tested to produce the same result on a stack as on the individual |
49 |
| - components. For example, an elementwise function on an array |
50 |
| - should produce the same output values as the same function called on each |
51 |
| - value individually, or a linalg function on a stack of matrices should |
52 |
| - produce the same value when called on individual matrices. Here "same" may |
53 |
| - only mean "close" when the input values are floating-point. |
| 17 | +* **Values**: We assert output values (including the elements of returned/modified arrays) are as expected. Except for manipulation functions or special cases, the spec allows floating-point inputs to have inexact outputs, so with such examples we only assert values are roughly as expected. |
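
The **Shape** item above says the suite simulates broadcasting to find the expected output shape. As a rough illustration of that idea, here is a minimal pure-Python sketch. The helper names `broadcast_shapes` and `assert_result_shape` are hypothetical and chosen for this example; they are not claimed to match the suite's own helpers.

```python
from itertools import zip_longest


def broadcast_shapes(*shapes):
    """Return the shape obtained by broadcasting the given shapes together.

    Hypothetical helper for illustration only.
    """
    result = []
    # Align shapes on their trailing dimensions; a missing leading
    # dimension is treated as size 1.
    for dims in zip_longest(*(reversed(s) for s in shapes), fillvalue=1):
        non_unit = {d for d in dims if d != 1}
        if len(non_unit) > 1:
            raise ValueError(f"shapes {shapes} are not broadcastable")
        result.append(non_unit.pop() if non_unit else 1)
    return tuple(reversed(result))


def assert_result_shape(func_name, in_shapes, out_shape):
    """Compare a function's actual output shape with the simulated one."""
    expected = broadcast_shapes(*in_shapes)
    assert out_shape == expected, (
        f"{func_name}: out.shape={out_shape}, but should be {expected}"
    )


# Usage: the shapes an elementwise binary function would receive and return.
assert_result_shape("add", [(3, 1), (1, 4)], (3, 4))
```

Computing the expected shape independently of the library under test means a broadcasting bug in the library cannot hide itself in the expectation.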
54 | 18 |
|
55 |
| -## Statistical Functions |
| 19 | +## Additional tests |
56 | 20 |
|
57 |
| -| Function | Smoke Test | All Inputs | Output Shape | Result Dtype | Output Values | Stacking | |
58 |
| -|----------|------------|------------|--------------|--------------|---------------|----------| |
59 |
| -| max | Yes | Yes | Yes | Yes | | | |
60 |
| -| mean | Yes | Yes | Yes | Yes | | | |
61 |
| -| min | Yes | Yes | Yes | Yes | | | |
62 |
| -| prod | Yes | Yes | Yes | Yes [^1] | | | |
63 |
| -| std | Yes | Yes | Yes | Yes | | | |
64 |
| -| sum | Yes | Yes | Yes | Yes [^1] | | | |
65 |
| -| var | Yes | Yes | Yes | Yes | | | |
| 21 | +In addition to having one test case for each function, we test other properties of the functions and some miscellaneous things. |
66 | 22 |
|
67 |
| -[^1]: `sum` and `prod` have special type promotion rules. |
| 23 | +* **Special cases**: For functions with special case behaviour, we assert that these functions return the correct values. |
68 | 24 |
|
69 |
| -## Additional Planned Features |
| 25 | +* **Signatures**: We assert functions have the correct signatures. |
70 | 26 |
|
71 |
| -In addition to getting full coverage of the spec, there are some additional |
72 |
| -features and improvements for the test suite that are planned. Work on these features |
73 |
| -will be guided primarily by concrete needs from library implementers, so if |
74 |
| -you are someone using this test suite to test your library, please [let us |
75 |
| -know](https://github.com/data-apis/array-api-tests/issues) the limitations you |
76 |
| -come across. |
| 27 | +* **Constants**: We assert that [constants](https://data-apis.org/array-api/latest/API_specification/constants.html) behave as expected, are roughly the expected value, and that any related functions interact with them correctly.
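
To make the **Special cases** item above concrete, the sketch below checks a function against a small table of special cases, using the ones the standard lists for `abs` (NaN gives NaN, `-0` gives `+0`, `-infinity` gives `+infinity`). It is an assumption-laden simplification: it works on Python floats with the built-in `abs` as a stand-in for a library function, whereas the suite tests arrays. The names `ABS_SPECIAL_CASES` and `check_special_cases` are hypothetical.

```python
import math

# (input, predicate on the output) pairs for abs().
# Hypothetical table for illustration only.
ABS_SPECIAL_CASES = [
    (float("nan"), math.isnan),
    # copysign distinguishes +0 from -0, which compare equal.
    (-0.0, lambda out: out == 0 and math.copysign(1, out) == 1),
    (float("-inf"), lambda out: out == math.inf),
]


def check_special_cases(func, cases):
    """Return the inputs for which func violates its special-case rule."""
    return [x for x, is_expected in cases if not is_expected(func(x))]


# Usage: the built-in abs satisfies all three cases on Python floats.
failures = check_special_cases(abs, ABS_SPECIAL_CASES)
assert failures == [], f"special cases violated for inputs: {failures}"
```

Unlike the roughly-as-expected value checks, special cases are compared exactly, which is why the sign of zero is inspected explicitly.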
77 | 28 |
|
78 |
| -- Making the test suite more usable for partially conforming libraries. Many |
79 |
| - tests rely on various functions in the array library to function. This means |
80 |
| - that if certain functions aren't implemented, for example, `asarray()` or |
81 |
| - `equals()`, then many tests will not function at all. We want to improve |
82 |
| - this situation, so that tests that don't strictly require these functions can |
83 |
| - still be run. |
84 | 29 |
|
85 |
| -- Better reporting. The pytest output can be difficult to parse, especially |
86 |
| - when there are many failures. Additionally some error messages can be |
87 |
| - difficult to understand without prior knowledge of the test internals. |
88 |
| - Better reporting can also make it easier to compare different |
89 |
| - implementations by their conformance. |
90 |
| - |
91 |
| -- Better tests for numerical outputs. Right now numerical outputs are either |
92 |
| - not tested at all, or only tested against very rough epsilons. This is |
93 |
| - partly due to the fact that the spec does not mandate any level of precision |
94 |
| - for most functions. However, it may be useful to, for instance, give a |
95 |
| - report of how off a given function is from the "expected" exact output. |
| 30 | +TODO: future plans |