Overview
The concept of strides has been raised several times since the beginning of the Consortium. This history was summarized in #641. In short, because several array libraries don't have an explicit concept of "strides" and strides can be considered an implementation detail, the Consortium has been intentional about avoiding any formal standardization of stride APIs.
Nevertheless, for a subset of array-consuming libraries (such as SciPy and scikit-learn), being able to leverage (and manipulate) strides can confer performance advantages. As a result, downstream libraries wanting to squeeze out additional performance are more likely to special-case the arrays of recognized libraries. This special casing is fine so long as there is only a small number of supported array libraries; however, it materially disadvantages stride-supporting libraries that are not among the privileged few.
Proposal
Now that the array API standard has an inspection API, this RFC proposes adding support for querying whether a conforming array library has explicit support for array strides. Specifically, the following field should be added to the object returned by info.capabilities():
{
    ...
    "max rank": Optional[int],
+   "strides": bool
}
With this API in place, downstream libraries could query whether a conforming array library supports strides and special case accordingly.
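As a rough sketch of how a downstream library might consume the proposed field: the example below assumes the array namespace exposes the inspection API through __array_namespace_info__() and that capabilities() returns a plain dictionary. The "strides" key is the one proposed in this RFC and is not part of the standard; supports_strides and make_stub_namespace are hypothetical names used only for illustration, and the stub stands in for a real array library.

```python
from types import SimpleNamespace


def supports_strides(xp) -> bool:
    """Return True only if `xp` explicitly reports stride support.

    Assumes the proposed (not yet standardized) "strides" capability key.
    Namespaces lacking the inspection API or the key are treated as
    not supporting strides.
    """
    get_info = getattr(xp, "__array_namespace_info__", None)
    if get_info is None:
        return False
    capabilities = get_info().capabilities()
    return bool(capabilities.get("strides", False))


def make_stub_namespace(capabilities: dict) -> SimpleNamespace:
    """Build a hypothetical stand-in for an array namespace (illustration only)."""
    info = SimpleNamespace(capabilities=lambda: dict(capabilities))
    return SimpleNamespace(__array_namespace_info__=lambda: info)


# One stub advertises the proposed key, the other predates it.
strided_xp = make_stub_namespace({"max rank": 64, "strides": True})
plain_xp = make_stub_namespace({"max rank": None})

print(supports_strides(strided_xp))  # True
print(supports_strides(plain_xp))    # False
```

Treating a missing key as False keeps the check conservative: a library that has not adopted the field simply falls through to the generic, stride-agnostic code path.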
Questions
- Does the proposed inspection API provide enough value to downstream libraries to justify inclusion? Notably, this proposal does not seek to standardize any keywords or stride-specific APIs (e.g., stride_tricks), and, thus, downstream libraries would still be left to check for specific API support. That said, this may be enough of a hook to foster de facto community standardization, which already seems to exist to some degree among stride-supporting libraries.
- As discussed in "Adding order argument to asarray" (#571), DLPack allows access to strides; however, this API is a bit cumbersome for downstream libraries to use just to access strides. Nevertheless, maybe this is enough?
- A related question: if we decide to move forward with this proposal, does it make sense to also add support for querying whether an array library supports an array "order" (e.g., "row-major" or "column-major")? This is another implementation detail which downstream array libraries hoping to adopt the standard have raised, with particular concern for the performance implications. Especially for CPU-bound dense array libraries which commonly interface with libraries written in other languages (C/C++/Fortran) and which assume a specific memory layout (row- or column-major), ensuring that created arrays have a particular layout can confer performance advantages.
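To make the "order" question concrete, the following minimal sketch computes the byte strides that a dense, contiguous array would have under each layout. It is plain Python and does not reflect any particular library's API; contiguous_strides is a hypothetical helper introduced only for illustration.

```python
def contiguous_strides(shape, itemsize, order="row-major"):
    """Byte strides of a dense contiguous array with the given layout."""
    if order not in ("row-major", "column-major"):
        raise ValueError(f"unknown order: {order!r}")
    # Row-major: the last axis varies fastest.
    # Column-major: the first axis varies fastest.
    axes = range(len(shape))
    fastest_first = reversed(axes) if order == "row-major" else axes
    strides = [0] * len(shape)
    step = itemsize
    for axis in fastest_first:
        strides[axis] = step
        step *= shape[axis]
    return tuple(strides)


# A 3x4 array of 8-byte elements under each layout.
print(contiguous_strides((3, 4), 8, "row-major"))     # (32, 8)
print(contiguous_strides((3, 4), 8, "column-major"))  # (8, 24)
```

The same logical 3x4 array therefore has two different memory layouts, and a routine written for one layout would generally need a copy or a transposed view to consume the other.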