Commit 4b10b89
[maskedtensor] Add sparsity tutorial
ghstack-source-id: 3584547 Pull Request resolved: #2043

Sparsity
++++++++

Sparsity has been an area of rapid growth and importance within PyTorch; if any sparsity terms below are confusing,
please refer to the `sparsity tutorial <https://pytorch.org/docs/stable/sparse.html>`__ for additional details.

Sparse storage formats have proven to be powerful in a variety of ways. As a primer, the first use case
most practitioners think of is when the majority of elements are equal to zero (a high degree of sparsity),
but even in cases of lower sparsity, certain formats (e.g. BSR) can take advantage of substructures within a matrix.

.. note::

    At the moment, MaskedTensor supports COO and CSR tensors, with plans to support additional formats
    (e.g. BSR and CSC) in the future. If you have any requests for additional formats, please file a feature request!

Principles
----------

When creating a :class:`MaskedTensor` with sparse tensors, there are a few principles that must be observed:

1. ``data`` and ``mask`` must have the same storage format, whether that's :attr:`torch.strided`, :attr:`torch.sparse_coo`, or :attr:`torch.sparse_csr`
2. ``data`` and ``mask`` must have the same size, as indicated by :func:`size()`
Sparse COO tensors
------------------

In accordance with Principle #1, a sparse COO MaskedTensor is created by passing in two sparse COO tensors,
which can be initialized by any of its constructors, e.g. :func:`torch.sparse_coo_tensor`.

As a recap of `sparse COO tensors <https://pytorch.org/docs/stable/sparse.html#sparse-coo-tensors>`__, the COO format
stands for "coordinate format", where the specified elements are stored as tuples of their indices and the
corresponding values. That is, the following are provided:

* ``indices``: array of size ``(ndim, nse)`` and dtype ``torch.int64``
* ``values``: array of size ``(nse,)`` with any integer or floating point dtype

where ``ndim`` is the dimensionality of the tensor and ``nse`` is the number of specified elements.
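For instance, the dense matrix used throughout this tutorial has ``ndim = 2`` and ``nse = 3``, so its COO form can be built directly with :func:`torch.sparse_coo_tensor` (a small sketch of the layout described above):

```python
import torch

# indices: shape (ndim, nse) = (2, 3), dtype torch.int64
indices = torch.tensor([[0, 1, 1],   # row coordinates
                        [2, 0, 2]])  # column coordinates
# values: shape (nse,) = (3,)
values = torch.tensor([3, 4, 5])

coo = torch.sparse_coo_tensor(indices, values, (2, 3))
# Converting back to dense recovers the original matrix
dense = coo.to_dense()  # tensor([[0, 0, 3], [4, 0, 5]])
```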

For both sparse COO and CSR tensors, you can construct a :class:`MaskedTensor` by doing either:

1. ``masked_tensor(sparse_tensor_data, sparse_tensor_mask)``
2. ``dense_masked_tensor.to_sparse_coo()`` or ``dense_masked_tensor.to_sparse_csr()``

The second method is easier to illustrate, so we've shown it below; for more on the first and the nuances behind
the approach, please read the :ref:`sparse-coo-appendix`.

>>> values = torch.tensor([[0, 0, 3], [4, 0, 5]])
>>> mask = torch.tensor([[False, False, True], [False, False, True]])
>>> mt = masked_tensor(values, mask)
>>> sparse_coo_mt = mt.to_sparse_coo()
>>> mt
MaskedTensor(
  [
    [ --, --, 3],
    [ --, --, 5]
  ]
)
>>> sparse_coo_mt
MaskedTensor(
  [
    [ --, --, 3],
    [ --, --, 5]
  ]
)
>>> sparse_coo_mt.get_data()
tensor(indices=tensor([[0, 1],
                       [2, 2]]),
       values=tensor([3, 5]),
       size=(2, 3), nnz=2, layout=torch.sparse_coo)

Sparse CSR tensors
------------------

Similarly, :class:`MaskedTensor` also supports the
`CSR (Compressed Sparse Row) <https://pytorch.org/docs/stable/sparse.html#sparse-csr-tensor>`__
sparse tensor format. Instead of storing the tuples of the indices like sparse COO tensors, sparse CSR tensors
aim to decrease the memory requirements by storing compressed row indices.
In particular, a CSR sparse tensor consists of three 1-D tensors:

* ``crow_indices``: array of compressed row indices with size ``(size[0] + 1,)``. This array indicates which row
  a given entry in ``values`` lives in. The last element is the number of specified elements,
  while ``crow_indices[i+1] - crow_indices[i]`` indicates the number of specified elements in row ``i``.
* ``col_indices``: array of size ``(nnz,)``. Indicates the column indices for each value.
* ``values``: array of size ``(nnz,)``. Contains the values of the CSR tensor.
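The relationship between ``crow_indices`` and the per-row element counts can be sketched with the same data as the running example:

```python
import torch

crow_indices = torch.tensor([0, 1, 2])   # size (size[0] + 1,) for a 2-row tensor
col_indices = torch.tensor([2, 2])       # size (nnz,)
values = torch.tensor([3, 5])            # size (nnz,)

csr = torch.sparse_csr_tensor(crow_indices, col_indices, values, size=(2, 3))

# crow_indices[i+1] - crow_indices[i] gives the number of specified elements in row i
per_row = crow_indices[1:] - crow_indices[:-1]  # tensor([1, 1])
# The last element of crow_indices is the total number of specified elements
total = crow_indices[-1].item()  # 2
```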

Of note, both sparse COO and CSR tensors are in a `beta <https://pytorch.org/docs/stable/index.html>`__ state.

By way of example:

>>> mt_sparse_csr = mt.to_sparse_csr()
>>> mt_sparse_csr
MaskedTensor(
  [
    [ --, --, 3],
    [ --, --, 5]
  ]
)
>>> mt_sparse_csr.get_data()
tensor(crow_indices=tensor([0, 1, 2]),
       col_indices=tensor([2, 2]),
       values=tensor([3, 5]), size=(2, 3), nnz=2, layout=torch.sparse_csr)

Appendix
++++++++

.. _sparse-coo-appendix:

Sparse COO construction
-----------------------

Recall that in our original example, we created a :class:`MaskedTensor` and then converted it to a sparse COO MaskedTensor
with :meth:`MaskedTensor.to_sparse_coo`.

Alternatively, we can also construct a sparse COO MaskedTensor directly by passing in two sparse COO tensors:

>>> values = torch.tensor([[0, 0, 3], [4, 0, 5]]).to_sparse()
>>> mask = torch.tensor([[False, False, True], [False, False, True]]).to_sparse()
>>> mt = masked_tensor(values, mask)
>>> values
tensor(indices=tensor([[0, 1, 1],
                       [2, 0, 2]]),
       values=tensor([3, 4, 5]),
       size=(2, 3), nnz=3, layout=torch.sparse_coo)
>>> mask
tensor(indices=tensor([[0, 1],
                       [2, 2]]),
       values=tensor([True, True]),
       size=(2, 3), nnz=2, layout=torch.sparse_coo)
>>> mt
MaskedTensor(
  [
    [ --, --, 3],
    [ --, --, 5]
  ]
)

Instead of using :meth:`torch.Tensor.to_sparse`, we can also create the sparse COO tensors directly, which brings us to a warning:

.. warning::

    When using a function like :meth:`MaskedTensor.to_sparse_coo`, if the user does not specify the indices as in the above
    example, then the 0 values will be "unspecified" by default.

Below, we explicitly specify the 0's:

>>> i = [[0, 1, 1],
...      [2, 0, 2]]
>>> v = [3, 4, 5]
>>> m = torch.tensor([True, False, True])
>>> values = torch.sparse_coo_tensor(i, v, (2, 3))
>>> mask = torch.sparse_coo_tensor(i, m, (2, 3))
>>> mt2 = masked_tensor(values, mask)
>>> values
tensor(indices=tensor([[0, 1, 1],
                       [2, 0, 2]]),
       values=tensor([3, 4, 5]),
       size=(2, 3), nnz=3, layout=torch.sparse_coo)
>>> mask
tensor(indices=tensor([[0, 1, 1],
                       [2, 0, 2]]),
       values=tensor([ True, False,  True]),
       size=(2, 3), nnz=3, layout=torch.sparse_coo)
>>> mt2
MaskedTensor(
  [
    [ --, --, 3],
    [ --, --, 5]
  ]
)

Note that ``mt`` and ``mt2`` look identical on the surface, and in the vast majority of operations, will yield the same
result. But this brings us to a detail of the implementation:

``data`` and ``mask`` -- only for sparse MaskedTensors -- can have a different number of elements (:func:`nnz`)
**at creation**, but the indices of ``mask`` must then be a subset of the indices of ``data``. In this case,
``data`` will assume the shape of ``mask`` via ``data = data.sparse_mask(mask)``; in other words, any of the elements
in ``data`` that are not ``True`` in ``mask`` (i.e. not specified) will be thrown away.

Therefore, under the hood, the data looks slightly different; ``mt2`` has the "4" value masked out and ``mt`` is completely
without it. Their underlying data has different shapes, which would make operations like ``mt + mt2`` invalid.

>>> mt.get_data()
tensor(indices=tensor([[0, 1],
                       [2, 2]]),
       values=tensor([3, 5]),
       size=(2, 3), nnz=2, layout=torch.sparse_coo)
>>> mt2.get_data()
tensor(indices=tensor([[0, 1, 1],
                       [2, 0, 2]]),
       values=tensor([3, 4, 5]),
       size=(2, 3), nnz=3, layout=torch.sparse_coo)
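The masking step can also be sketched with plain ``torch`` tensors: :meth:`torch.Tensor.sparse_mask` keeps only the entries of ``self`` at the indices specified by a sparse ``mask`` (a minimal sketch, using a dense ``self`` for clarity):

```python
import torch

dense = torch.tensor([[0., 0., 3.], [4., 0., 5.]])
# Sparse mask whose only specified indices are (0, 2) and (1, 2)
mask = torch.sparse_coo_tensor(
    torch.tensor([[0, 1], [2, 2]]),
    torch.tensor([True, True]),
    (2, 3),
).coalesce()

# Keep only the entries of `dense` at the mask's specified indices;
# everything else (including the 4) is thrown away
filtered = dense.sparse_mask(mask)
kept = filtered.coalesce().values()  # tensor([3., 5.])
```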

.. _sparse-csr-appendix:

Sparse CSR construction
-----------------------

We can also construct a sparse CSR MaskedTensor using sparse CSR tensors,
and like the example above, this results in a similar treatment under the hood.

>>> crow_indices = torch.tensor([0, 2, 4])
>>> col_indices = torch.tensor([0, 1, 0, 1])
>>> values = torch.tensor([1, 2, 3, 4])
>>> mask_values = torch.tensor([True, False, False, True])
>>>
>>> csr = torch.sparse_csr_tensor(crow_indices, col_indices, values, dtype=torch.double)
>>> mask = torch.sparse_csr_tensor(crow_indices, col_indices, mask_values, dtype=torch.bool)
>>>
>>> mt = masked_tensor(csr, mask)
>>> mt
MaskedTensor(
  [
    [ 1.0000, --],
    [ --, 4.0000]
  ]
)
>>> mt.get_data()
tensor(crow_indices=tensor([0, 2, 4]),
       col_indices=tensor([0, 1, 0, 1]),
       values=tensor([1., 2., 3., 4.]), size=(2, 2), nnz=4,
       dtype=torch.float64, layout=torch.sparse_csr)