Description
Description
This paper and this library describe and implement a number of linear algebra simplifications that we can implement as graph rewrites. This issue is both a tracker for implementing these rewrites, and a discussion for how to handle them.
Consider the following graph:
x = pt.dmatrix('x')
y = pt.diagonal(x)
z = pt.linalg.inv(y)
If we could promise at rewrite time that y
is diagonal, we could re-write the last operation as z = 1/y
, exploiting the structure of the diagonal matrix. Other non-trivial examples exist, for example:
x = pt.dmatrix('x')
y = pt.eye(3)
z = pt.kron(x, y)
z_inv = pt.linalg.inv(z)
If we could promise at rewrite time that z
is block diagonal, we could rewrite z_inv = pt.kron(pt.linalg.inv(x), y)
, which is a much faster operation (since x
is 3x smaller than z
).
The linked paper and library list a huge number of such shortcut computations. The following is a list version of Table 1 from the paper. Under each function are the types of matrices for which a rewrite rule exists to accelerate the function. It would be nice to collaborative update this list with links to the COLA library where the relevant rewrite is implemented:
Thanks to @tanish1729 for compiling a list of links to relevant rewrites. Missing links indicate no direct implementation
-
- Identity
- Diagonal
- Triangular
- Permutation
- Convolution
- Sum
- Product
- Kronecker Product
- Block Diagonal
- Concatenation
-
- Identity
- Diagonal
- Triangular
- Convolution
- Sum
- Kronecker Product
- Block Diagonal
-
- Identity
- Diagonal
- Sum
- Scalar Product
- Triangular
- Permutation
- Kronecker Product
- Block Diagonal
- Concatenation
-
Unary Operations (such as log, pow, exp, sqrt)
In addition, potential re-writes could also be applied to Topelitz and Circulant matrices, although these are not covered by COLA.
The wrinkle to all this is that we would need more information about matrices as they enter and exit Ops
. Right now, we're using tags to accomplish rewrites like this, see for example #303 and #459 . Some of these rewrites might be possible to do via inference. For example, a pt.linalg.block_diag
Op
always returns a block diagonal matrix, as does pt.kron(pt.eye, A)
. pt.diagonal
always returns a diagonal matrix, as does pt.eye
; pt.linalg.cholesky
always returns a triangular matrix, etc. Other potential types, like block
, positive
, definite
, psd
, topelitz
, circulant
, etc, would be less trivial to automatically detect.
The other issue is that as these type tags proliferate, we become more and more locked into a somewhat hack-y system for marking things. Perhaps putting some thought into how to handle this now will save some refactoring headaches down the road?