Skip to content

ENH: Implement COLA library rewrites for linear algebra functions #573

Open
@jessegrabowski

Description

@jessegrabowski

Description

This paper and this library describe and implement a number of linear algebra simplifications that we can implement as graph rewrites. This issue is both a tracker for implementing these rewrites, and a discussion for how to handle them.

Consider the following graph:

x = pt.dmatrix('x')
y = pt.diagonal(x)
z = pt.linalg.inv(y)

If we could promise at rewrite time that y is diagonal, we could re-write the last operation as z = 1/y, exploiting the structure of the diagonal matrix. Other non-trivial examples exist, for example:

x = pt.dmatrix('x')
y = pt.eye(3)
z = pt.kron(x, y)
z_inv = pt.linalg.inv(z)

If we could promise at rewrite time that z is block diagonal, we could rewrite z_inv = pt.kron(pt.linalg.inv(x), y), which is a much faster operation (since x is 3x smaller than z).

The linked paper and library list a huge number of such shortcut computations. The following is a list version of Table 1 from the paper. Under each function are the types of matrices for which a rewrite rule exists to accelerate the function. It would be nice to collaborative update this list with links to the COLA library where the relevant rewrite is implemented:

Thanks to @tanish1729 for compiling a list of links to relevant rewrites. Missing links indicate no direct implementation

In addition, potential re-writes could also be applied to Topelitz and Circulant matrices, although these are not covered by COLA.

The wrinkle to all this is that we would need more information about matrices as they enter and exit Ops. Right now, we're using tags to accomplish rewrites like this, see for example #303 and #459 . Some of these rewrites might be possible to do via inference. For example, a pt.linalg.block_diag Op always returns a block diagonal matrix, as does pt.kron(pt.eye, A). pt.diagonal always returns a diagonal matrix, as does pt.eye; pt.linalg.cholesky always returns a triangular matrix, etc. Other potential types, like block, positive, definite, psd, topelitz, circulant, etc, would be less trivial to automatically detect.

The other issue is that as these type tags proliferate, we become more and more locked into a somewhat hack-y system for marking things. Perhaps putting some thought into how to handle this now will save some refactoring headaches down the road?

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions