Sparsity
++++++++

Sparsity has been an area of rapid growth and importance within PyTorch; if any sparsity terms are confusing below,
please refer to the `sparsity tutorial <https://pytorch.org/docs/stable/sparse.html>`__ for additional details.

Sparse storage formats have proven powerful in a variety of ways. As a primer, the first use case
most practitioners think of is one where the majority of elements are equal to zero (a high degree of sparsity),
but even in cases of lower sparsity, certain formats (e.g. BSR) can take advantage of substructures within a matrix.

.. note::

    At the moment, MaskedTensor supports COO and CSR tensors with plans to support additional formats
    (e.g. BSR and CSC) in the future. If you have any requests for additional formats, please file a feature request!

Principles
----------

When creating a :class:`MaskedTensor` with sparse tensors, there are a few principles that must be observed
(a quick check is sketched after the list):

1. ``data`` and ``mask`` must have the same storage format, whether that's :attr:`torch.strided`, :attr:`torch.sparse_coo`, or :attr:`torch.sparse_csr`
2. ``data`` and ``mask`` must have the same size, indicated by :func:`size()`
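
As a quick check of both principles, consider this minimal sketch (the exact error raised on a violation may vary
between versions):

    >>> data = torch.tensor([[0, 1], [2, 3]]).to_sparse()
    >>> mask = torch.tensor([[False, True], [True, False]]).to_sparse()
    >>> data.layout == mask.layout and data.size() == mask.size()
    True
    >>> mt = masked_tensor(data, mask)  # both principles hold, so construction succeeds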

Sparse COO tensors
------------------

In accordance with Principle #1, a sparse COO MaskedTensor is created by passing in two sparse COO tensors,
which can be initialized by any of the sparse COO constructors, e.g. :func:`torch.sparse_coo_tensor`.

As a recap of `sparse COO tensors <https://pytorch.org/docs/stable/sparse.html#sparse-coo-tensors>`__, the COO format
stands for "coordinate format", where the specified elements are stored as tuples of their indices and the
corresponding values. That is, the following are provided:

* ``indices``: array of size ``(ndim, nse)`` and dtype ``torch.int64``
* ``values``: array of size ``(nse,)`` with any integer or floating point dtype

where ``ndim`` is the dimensionality of the tensor and ``nse`` is the number of specified elements.
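
For instance, the COO layout on its own can be sketched as follows (plain :func:`torch.sparse_coo_tensor`, with no
MaskedTensor involved yet):

    >>> indices = torch.tensor([[0, 1], [2, 0]])  # size (ndim, nse) = (2, 2)
    >>> coo_values = torch.tensor([3., 4.])       # size (nse,) = (2,)
    >>> torch.sparse_coo_tensor(indices, coo_values, (2, 3))
    tensor(indices=tensor([[0, 1],
                           [2, 0]]),
           values=tensor([3., 4.]),
           size=(2, 3), nnz=2, layout=torch.sparse_coo)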

For both sparse COO and CSR tensors, you can construct a :class:`MaskedTensor` by doing either:

1. ``masked_tensor(sparse_tensor_data, sparse_tensor_mask)``
2. ``dense_masked_tensor.to_sparse_coo()`` or ``dense_masked_tensor.to_sparse_csr()``

The second method is easier to illustrate, so we've shown it below; for more on the first and the nuances behind
that approach, please read the :ref:`sparse-coo-appendix`.

    >>> values = torch.tensor([[0, 0, 3], [4, 0, 5]])
    >>> mask = torch.tensor([[False, False, True], [False, False, True]])
    >>> mt = masked_tensor(values, mask)
    >>> sparse_coo_mt = mt.to_sparse_coo()
    >>> mt
    MaskedTensor(
      [
        [ --, --, 3],
        [ --, --, 5]
      ]
    )
    >>> sparse_coo_mt
    MaskedTensor(
      [
        [ --, --, 3],
        [ --, --, 5]
      ]
    )
    >>> sparse_coo_mt.get_data()
    tensor(indices=tensor([[0, 1],
                           [2, 2]]),
           values=tensor([3, 5]),
           size=(2, 3), nnz=2, layout=torch.sparse_coo)

Sparse CSR tensors
------------------

Similarly, :class:`MaskedTensor` also supports the
`CSR (Compressed Sparse Row) <https://pytorch.org/docs/stable/sparse.html#sparse-csr-tensor>`__
sparse tensor format. Instead of storing the tuples of the indices like sparse COO tensors, sparse CSR tensors
aim to decrease the memory requirements by storing compressed row indices.
In particular, a CSR sparse tensor consists of three 1-D tensors:

* ``crow_indices``: array of compressed row indices with size ``(size[0] + 1,)``. This array indicates which row
  a given entry in ``values`` lives in. The last element is the number of specified elements,
  while ``crow_indices[i + 1] - crow_indices[i]`` indicates the number of specified elements in row ``i``.
* ``col_indices``: array of size ``(nnz,)``. Indicates the column indices for each value.
* ``values``: array of size ``(nnz,)``. Contains the values of the CSR tensor.
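
To make the ``crow_indices`` arithmetic concrete, here is a small sketch using plain :func:`torch.sparse_csr_tensor`:

    >>> crow = torch.tensor([0, 1, 2])  # row 0 holds 1 - 0 = 1 element; row 1 holds 2 - 1 = 1
    >>> col = torch.tensor([2, 2])
    >>> val = torch.tensor([3, 5])
    >>> torch.sparse_csr_tensor(crow, col, val, size=(2, 3)).to_dense()
    tensor([[0, 0, 3],
            [0, 0, 5]])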

Of note, both sparse COO and CSR tensors are in a `beta <https://pytorch.org/docs/stable/index.html>`__ state.

By way of example:

    >>> mt_sparse_csr = mt.to_sparse_csr()
    >>> mt_sparse_csr
    MaskedTensor(
      [
        [ --, --, 3],
        [ --, --, 5]
      ]
    )
    >>> mt_sparse_csr.get_data()
    tensor(crow_indices=tensor([0, 1, 2]),
           col_indices=tensor([2, 2]),
           values=tensor([3, 5]), size=(2, 3), nnz=2, layout=torch.sparse_csr)

Appendix
++++++++

.. _sparse-coo-appendix:

Sparse COO construction
-----------------------

Recall that in our original example, we created a :class:`MaskedTensor` and then converted it to a sparse COO
MaskedTensor with :meth:`MaskedTensor.to_sparse_coo`.

Alternatively, we can also construct a sparse COO MaskedTensor directly by passing in two sparse COO tensors:

    >>> values = torch.tensor([[0, 0, 3], [4, 0, 5]]).to_sparse()
    >>> mask = torch.tensor([[False, False, True], [False, False, True]]).to_sparse()
    >>> mt = masked_tensor(values, mask)
    >>> values
    tensor(indices=tensor([[0, 1, 1],
                           [2, 0, 2]]),
           values=tensor([3, 4, 5]),
           size=(2, 3), nnz=3, layout=torch.sparse_coo)
    >>> mask
    tensor(indices=tensor([[0, 1],
                           [2, 2]]),
           values=tensor([True, True]),
           size=(2, 3), nnz=2, layout=torch.sparse_coo)
    >>> mt
    MaskedTensor(
      [
        [ --, --, 3],
        [ --, --, 5]
      ]
    )

Instead of using :meth:`torch.Tensor.to_sparse`, we can also create the sparse COO tensors directly, which brings us
to a warning:

.. warning::

    When using a function like :meth:`MaskedTensor.to_sparse_coo`, if the user does not explicitly specify the
    indices, as in the above example, then the 0 values will be "unspecified" by default.
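
For instance, this mirrors :meth:`torch.Tensor.to_sparse`, which keeps only the nonzero entries (a quick sketch):

    >>> torch.tensor([[0, 0, 3], [4, 0, 5]]).to_sparse()
    tensor(indices=tensor([[0, 1, 1],
                           [2, 0, 2]]),
           values=tensor([3, 4, 5]),
           size=(2, 3), nnz=3, layout=torch.sparse_coo)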

Below, we explicitly specify the 0's:

    >>> i = [[0, 1, 1], [2, 0, 2]]
    >>> v = [3, 4, 5]
    >>> m = torch.tensor([True, False, True])
    >>> values = torch.sparse_coo_tensor(i, v, (2, 3))
    >>> mask = torch.sparse_coo_tensor(i, m, (2, 3))
    >>> mt2 = masked_tensor(values, mask)
    >>> values
    tensor(indices=tensor([[0, 1, 1],
                           [2, 0, 2]]),
           values=tensor([3, 4, 5]),
           size=(2, 3), nnz=3, layout=torch.sparse_coo)
    >>> mask
    tensor(indices=tensor([[0, 1, 1],
                           [2, 0, 2]]),
           values=tensor([ True, False,  True]),
           size=(2, 3), nnz=3, layout=torch.sparse_coo)
    >>> mt2
    MaskedTensor(
      [
        [ --, --, 3],
        [ --, --, 5]
      ]
    )

Note that ``mt`` and ``mt2`` look identical on the surface, and in the vast majority of operations, will yield the same
result. But this brings us to a detail of the implementation:

``data`` and ``mask`` -- only for sparse MaskedTensors -- can have a different number of elements (:func:`nnz`)
**at creation**, but the indices of ``mask`` must then be a subset of the indices of ``data``. In this case,
``data`` will assume the shape of ``mask`` via ``data = data.sparse_mask(mask)``; in other words, any elements of
``data`` at indices that are not specified in ``mask`` will be thrown away.

Therefore, under the hood, the data looks slightly different: ``mt2`` has the "4" value masked out, while ``mt``
lacks it altogether. Their underlying data thus has different shapes, which would make operations like ``mt + mt2``
invalid.

    >>> mt.get_data()
    tensor(indices=tensor([[0, 1],
                           [2, 2]]),
           values=tensor([3, 5]),
           size=(2, 3), nnz=2, layout=torch.sparse_coo)
    >>> mt2.get_data()
    tensor(indices=tensor([[0, 1, 1],
                           [2, 0, 2]]),
           values=tensor([3, 4, 5]),
           size=(2, 3), nnz=3, layout=torch.sparse_coo)
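
The trimming described above can be reproduced directly (a hedged sketch using the strided-to-sparse form of
:meth:`torch.Tensor.sparse_mask`):

    >>> dense_data = torch.tensor([[0, 0, 3], [4, 0, 5]])
    >>> coo_mask = torch.tensor([[False, False, True], [False, False, True]]).to_sparse()
    >>> dense_data.sparse_mask(coo_mask)  # keeps only the entries at the mask's specified indices
    tensor(indices=tensor([[0, 1],
                           [2, 2]]),
           values=tensor([3, 5]),
           size=(2, 3), nnz=2, layout=torch.sparse_coo)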

.. _sparse-csr-appendix:

Sparse CSR construction
-----------------------

We can also construct a sparse CSR MaskedTensor using sparse CSR tensors,
and like the example above, this results in a similar treatment under the hood.

    >>> crow_indices = torch.tensor([0, 2, 4])
    >>> col_indices = torch.tensor([0, 1, 0, 1])
    >>> values = torch.tensor([1, 2, 3, 4])
    >>> mask_values = torch.tensor([True, False, False, True])
    >>>
    >>> csr = torch.sparse_csr_tensor(crow_indices, col_indices, values, dtype=torch.double)
    >>> mask = torch.sparse_csr_tensor(crow_indices, col_indices, mask_values, dtype=torch.bool)
    >>>
    >>> mt = masked_tensor(csr, mask)
    >>> mt
    MaskedTensor(
      [
        [ 1.0000, --],
        [ --, 4.0000]
      ]
    )
    >>> mt.get_data()
    tensor(crow_indices=tensor([0, 2, 4]),
           col_indices=tensor([0, 1, 0, 1]),
           values=tensor([1., 2., 3., 4.]), size=(2, 2), nnz=4,
           dtype=torch.float64, layout=torch.sparse_csr)
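
For completeness, the mask can be retrieved analogously with :meth:`MaskedTensor.get_mask` (a short sketch; the exact
repr formatting may vary by version):

    >>> mt.get_mask()
    tensor(crow_indices=tensor([0, 2, 4]),
           col_indices=tensor([0, 1, 0, 1]),
           values=tensor([ True, False, False,  True]), size=(2, 2), nnz=4,
           layout=torch.sparse_csr)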