|
1 | | -## Coverage of the Test Suite |
| 1 | +## What this suite actually tests |
2 | 2 |
|
3 | | -This document outlines the coverage of the test suite over the |
4 | | -[spec](https://data-apis.org/array-api/) at a high level. |
| 3 | +`array-api-tests` checks that an array library adopting the [standard](https://data-apis.org/array-api/) implements everything that is in scope.
5 | 4 |
|
6 | | -The following things are tested |
| 5 | +## Primary tests |
7 | 6 |
|
8 | | -* **Smoke tested** means that the function has a basic test that calls the |
9 | | - function with some inputs, but does not imply any testing of the output |
10 | | - value. This includes calling keyword arguments to the function, and checking |
11 | | - that it takes the correct number of positional arguments. A smoke test will |
12 | | - fail if the function is not implemented with the correct signature or raises |
13 | | - an exception, but will not check any other aspect of the spec. |
| 7 | +Every function, including array object methods, has its own test method. We use [Hypothesis](https://hypothesis.readthedocs.io/en/latest/) to generate a diverse set of valid inputs. This means array inputs will cover different dtypes and shapes, as well as contain interesting elements. These examples are generated alongside interesting arrangements of non-array positional arguments and keyword arguments.
14 | 8 |
|
15 | | -* **All Inputs** means that the function is tested with all possible inputs |
16 | | - required by the spec (using hypothesis). This means all possible array |
17 | | - shapes, all possible dtypes (that are required for the given function), and |
18 | | - all possible values for the given dtype (omitting those whose behavior is |
19 | | - undefined). |
| 9 | +Each test case will cover the following areas if relevant: |
20 | 10 |
|
21 | | -* **Output Shape** means that the result shape is tested. For functions that |
22 | | - take more than one argument, this means the result shape should produced |
23 | | - from |
24 | | - [broadcasting](https://data-apis.org/array-api/latest/API_specification/broadcasting.html) |
25 | | - the input shapes. For functions of a single argument, the result shape |
26 | | - should be the same as the input shape. |
| 11 | +* **Smoke testing**: We pass our generated examples to all functions. As these examples solely consist of *valid* inputs, we are testing that functions can be called using their documented inputs without raising errors.
27 | 12 |
|
28 | | -* **Output Dtype** means that the result dtype is tested. For (most) functions |
29 | | - with a single argument, the result dtype should be the same as the input. |
30 | | - For functions with two arguments, there are different possibilities, such as |
31 | | - performing [type |
32 | | - promotion](https://data-apis.org/array-api/latest/API_specification/type_promotion.html) |
33 | | - or always returning a specific dtype (e.g., `equals()` should always return |
34 | | - a `bool` array). |
| 13 | +* **Data type**: For functions returning/modifying arrays, we assert that output arrays have the correct data types. Most functions [type-promote](https://data-apis.org/array-api/latest/API_specification/type_promotion.html) input arrays and some functions have bespoke rules—in both cases we simulate the correct behaviour to find the expected data types. |
35 | 14 |
|
36 | | -* **Output Values** means that the exact output is tested in some way. For |
37 | | - functions that operate on floating-point inputs, the spec does not require |
38 | | - exact values, so a "Yes" in this case will mean only that the output value |
39 | | - is checked to be "close" to the numerically correct result. The exception to |
40 | | - this is special cases for elementwise functions, which are tested exactly. |
41 | | - For functions that operate on non-floating-point inputs, or functions like |
42 | | - manipulation functions or indexing that simply rearrange the same values of |
43 | | - the input arrays, a "Yes" means that the exact values are tested. Note that |
44 | | - in many cases, certain values of inputs are left unspecified, and are thus |
45 | | - not tested (e.g., the behavior for division by integer 0 is unspecified). |
| 15 | +* **Shape**: For functions returning/modifying arrays, we assert that output arrays have the correct shape. Most functions [broadcast](https://data-apis.org/array-api/latest/API_specification/broadcasting.html) input arrays and some functions have bespoke rules—in both cases we simulate the correct behaviour to find the expected shapes. |
46 | 16 |
|
47 | | -* **Stacking** means that functions that operate on "stacks" of smaller data |
48 | | - are tested to produce the same result on a stack as on the individual |
49 | | - components. For example, an elementwise function on an array |
50 | | - should produce the same output values as the same function called on each |
51 | | - value individually, or a linalg function on a stack of matrices should |
52 | | - produce the same value when called on individual matrices. Here "same" may |
53 | | - only mean "close" when the input values are floating-point. |
| 17 | +* **Values**: We assert output values (including the elements of returned/modified arrays) are as expected. Except for manipulation functions or special cases, the spec allows floating-point inputs to have inexact outputs, so with such examples we only assert values are roughly as expected. |
54 | 18 |
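To illustrate what "simulating the correct behaviour" can mean for the **Shape** check, here is a minimal standard-library sketch of computing the expected broadcast shape from input shapes. The function name `expected_broadcast_shape` is hypothetical and is not taken from the test suite; it only follows the usual right-aligned broadcasting rule and makes no claim about how the suite implements this.

```python
from itertools import zip_longest


def expected_broadcast_shape(*shapes):
    """Return the shape that broadcasting the given shapes should produce.

    Hypothetical helper for illustration only. Dimensions are compared from
    the trailing axis backwards; a missing or size-1 dimension stretches to
    match the other one, and any other mismatch is an error.
    """
    result = []
    # Walk the axes right to left, padding shorter shapes with 1s.
    for dims in zip_longest(*(reversed(s) for s in shapes), fillvalue=1):
        sizes = {d for d in dims if d != 1}
        if len(sizes) > 1:
            raise ValueError(f"shapes {shapes} are not broadcastable")
        result.append(sizes.pop() if sizes else 1)
    return tuple(reversed(result))


# A test could then compare an output array's shape against this value.
print(expected_broadcast_shape((3, 1), (1, 4)))
print(expected_broadcast_shape((5,), (2, 1, 5)))
```

A shape assertion in a test would then reduce to comparing `out.shape` with the computed tuple for the generated input shapes.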
|
55 | | -## Statistical Functions |
| 19 | +## Additional tests |
56 | 20 |
|
57 | | -| Function | Smoke Test | All Inputs | Output Shape | Result Dtype | Output Values | Stacking | |
58 | | -|----------|------------|------------|--------------|--------------|---------------|----------| |
59 | | -| max | Yes | Yes | Yes | Yes | | | |
60 | | -| mean | Yes | Yes | Yes | Yes | | | |
61 | | -| min | Yes | Yes | Yes | Yes | | | |
62 | | -| prod | Yes | Yes | Yes | Yes [^1] | | | |
63 | | -| std | Yes | Yes | Yes | Yes | | | |
64 | | -| sum | Yes | Yes | Yes | Yes [^1] | | | |
65 | | -| var | Yes | Yes | Yes | Yes | | | |
| 21 | +In addition to having one test case for each function, we test other properties of the functions and some miscellaneous things. |
66 | 22 |
|
67 | | -[^1]: `sum` and `prod` have special type promotion rules. |
| 23 | +* **Special cases**: For functions with special case behaviour, we assert that these functions return the correct values. |
68 | 24 |
|
69 | | -## Additional Planned Features |
| 25 | +* **Signatures**: We assert functions have the correct signatures. |
70 | 26 |
|
71 | | -In addition to getting full coverage of the spec, there are some additional |
72 | | -features and improvements for the test suite that are planned. Work on these features |
73 | | -will be guided primarily by concrete needs from library implementers, so if |
74 | | -you are someone using this test suite to test your library, please [let us |
75 | | -know](https://github.com/data-apis/array-api-tests/issues) the limitations you |
76 | | -come across. |
| 27 | +* **Constants**: We assert that [constants](https://data-apis.org/array-api/latest/API_specification/constants.html) behave as expected, are approximately the expected value, and that any related functions interact with them correctly.
77 | 28 |
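As a rough illustration of the **Signatures** check, the sketch below compares a function against a stub using the standard-library `inspect` module. The names `signature_problems`, `stub_sum`, and `lib_sum` are hypothetical and exist only for this example; the stub mirrors the spec's `sum` signature, and the comparison is deliberately simplified (it checks parameter presence and kind only).

```python
import inspect


def stub_sum(x, /, *, axis=None, dtype=None, keepdims=False):
    """Hypothetical stub mirroring the spec's signature for sum()."""


def signature_problems(func, stub):
    """List the ways func's parameters differ from the stub's.

    Illustrative only: checks that each stub parameter exists on func
    and has the same kind (positional-only, keyword-only, ...).
    """
    problems = []
    actual = inspect.signature(func).parameters
    for name, expected in inspect.signature(stub).parameters.items():
        if name not in actual:
            problems.append(f"missing parameter {name!r}")
        elif actual[name].kind is not expected.kind:
            problems.append(
                f"{name!r} is {actual[name].kind.name}, "
                f"expected {expected.kind.name}"
            )
    return problems


# A made-up library function that does not follow the stub.
def lib_sum(x, axis=None):
    return x


for problem in signature_problems(lib_sum, stub_sum):
    print(problem)
```

A function that matches the stub yields an empty list, so a test can simply assert that no problems were reported.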
|
78 | | -- Making the test suite more usable for partially conforming libraries. Many |
79 | | - tests rely on various functions in the array library to function. This means |
80 | | - that if certain functions aren't implemented, for example, `asarray()` or |
81 | | - `equals()`, then many tests will not function at all. We want to improve |
82 | | - this situation, so that tests that don't strictly require these functions can |
83 | | - still be run. |
84 | 29 |
|
85 | | -- Better reporting. The pytest output can be difficult to parse, especially |
86 | | - when there are many failures. Additionally some error messages can be |
87 | | - difficult to understand without prior knowledge of the test internals. |
88 | | - Better reporting can also make it easier to compare different |
89 | | - implementations by their conformance. |
90 | | - |
91 | | -- Better tests for numerical outputs. Right now numerical outputs are either |
92 | | - not tested at all, or only tested against very rough epsilons. This is |
93 | | - partly due to the fact that the spec does not mandate any level of precision |
94 | | - for most functions. However, it may be useful to, for instance, give a |
95 | | - report of how off a given function is from the "expected" exact output. |
| 30 | +TODO: future plans |