Commit c0b92e0

[maskedtensor] Add overview tutorial
ghstack-source-id: dd255bc Pull Request resolved: #2042
1 parent 1125546 commit c0b92e0

File tree: 2 files changed (+253, -0 lines changed)

Lines changed: 244 additions & 0 deletions

@@ -0,0 +1,244 @@
MaskedTensor Overview
=====================

This tutorial is designed to serve as a starting point for using MaskedTensors
and to discuss their masking semantics.

Using MaskedTensor
++++++++++++++++++

Construction
------------

There are a few different ways to construct a MaskedTensor:

* The first way is to directly invoke the MaskedTensor class
* The second (and our recommended way) is to use the :func:`masked.masked_tensor` and :func:`masked.as_masked_tensor` factory functions,
  which are analogous to :func:`torch.tensor` and :func:`torch.as_tensor`

.. autosummary::
    :toctree: generated
    :nosignatures:

    masked.masked_tensor
    masked.as_masked_tensor
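
For example, a minimal sketch of both approaches (the import path is an assumption here, since the factory
functions are exposed under ``torch.masked`` in recent releases, and the displayed output is illustrative):

>>> import torch
>>> from torch.masked import masked_tensor, MaskedTensor
>>> data = torch.arange(6, dtype=torch.float).reshape(2, 3)
>>> mask = torch.tensor([[True, False, True], [False, True, False]])
>>> mt = masked_tensor(data, mask)        # recommended: factory function
>>> mt_direct = MaskedTensor(data, mask)  # direct class invocation
>>> mt
MaskedTensor(
  [
    [ 0.0000, --, 2.0000],
    [ --, 4.0000, --]
  ]
)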

Accessing the data and mask
---------------------------

The underlying fields in a MaskedTensor can be accessed through:

* the :meth:`MaskedTensor.get_data` function
* the :meth:`MaskedTensor.get_mask` function. Recall that ``True`` indicates
  "specified" or "valid" while ``False`` indicates "unspecified" or "invalid".

In general, the underlying data that is returned may not be valid in the unspecified entries, so we recommend that
when users require a Tensor without any masked entries, they use :meth:`MaskedTensor.to_tensor` (as shown below) to
return a Tensor with filled values.
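
As a brief sketch (assuming ``masked_tensor`` has been imported from ``torch.masked``; the displayed output is illustrative):

>>> data = torch.arange(6, dtype=torch.float).reshape(2, 3)
>>> mask = torch.tensor([[True, False, True], [False, True, False]])
>>> mt = masked_tensor(data, mask)
>>> mt.get_data()
tensor([[0., 1., 2.],
        [3., 4., 5.]])
>>> mt.get_mask()
tensor([[ True, False,  True],
        [False,  True, False]])
>>> mt.to_tensor(0)
tensor([[0., 0., 2.],
        [0., 4., 0.]])

Note that ``get_data()`` still returns values at the unspecified positions (``1.``, ``3.``, and ``5.`` above), which is
why :meth:`MaskedTensor.to_tensor` is the safer way to obtain a plain Tensor.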

Indexing and slicing
--------------------

:class:`MaskedTensor` is a Tensor subclass, which means that it inherits the same semantics for indexing and slicing
as :class:`torch.Tensor`. Below are some examples of common indexing and slicing patterns:

>>> data = torch.arange(60).reshape(3, 4, 5)
>>> mask = data % 2 == 0
>>> mt = masked_tensor(data.float(), mask)
>>> mt[0]
MaskedTensor(
  [
    [ 0.0000, --, 2.0000, --, 4.0000],
    [ --, 6.0000, --, 8.0000, --],
    [ 10.0000, --, 12.0000, --, 14.0000],
    [ --, 16.0000, --, 18.0000, --]
  ]
)
>>> mt[[0,2]]
MaskedTensor(
  [
    [
      [ 0.0000, --, 2.0000, --, 4.0000],
      [ --, 6.0000, --, 8.0000, --],
      [ 10.0000, --, 12.0000, --, 14.0000],
      [ --, 16.0000, --, 18.0000, --]
    ],
    [
      [ 40.0000, --, 42.0000, --, 44.0000],
      [ --, 46.0000, --, 48.0000, --],
      [ 50.0000, --, 52.0000, --, 54.0000],
      [ --, 56.0000, --, 58.0000, --]
    ]
  ]
)
>>> mt[:, :2]
MaskedTensor(
  [
    [
      [ 0.0000, --, 2.0000, --, 4.0000],
      [ --, 6.0000, --, 8.0000, --]
    ],
    [
      [ 20.0000, --, 22.0000, --, 24.0000],
      [ --, 26.0000, --, 28.0000, --]
    ],
    [
      [ 40.0000, --, 42.0000, --, 44.0000],
      [ --, 46.0000, --, 48.0000, --]
    ]
  ]
)

Semantics
+++++++++

MaskedTensor vs NumPy's MaskedArray
-----------------------------------

NumPy's ``MaskedArray`` has a few fundamental semantic differences from MaskedTensor.

1. Their factory function and basic definition invert the mask (similar to ``torch.nn.MHA``); that is, MaskedTensor
   uses ``True`` to denote "specified" and ``False`` to denote "unspecified", or "valid"/"invalid", whereas NumPy does the
   opposite.
2. Intersection semantics. In NumPy, if one of two elements is masked out, the resulting element will be
   masked out as well -- in practice, they
   `apply the logical_or operator <https://github.com/numpy/numpy/blob/68299575d8595d904aff6f28e12d21bf6428a4ba/numpy/ma/core.py#L1016-L1024>`__.

>>> data = torch.arange(5.)
>>> mask = torch.tensor([True, True, False, True, False])
>>> npm0 = np.ma.masked_array(data.numpy(), (~mask).numpy())
>>> npm1 = np.ma.masked_array(data.numpy(), (mask).numpy())
>>> npm0
masked_array(data=[0.0, 1.0, --, 3.0, --],
             mask=[False, False,  True, False,  True],
       fill_value=1e+20,
            dtype=float32)
>>> npm1
masked_array(data=[--, --, 2.0, --, 4.0],
             mask=[ True,  True, False,  True, False],
       fill_value=1e+20,
            dtype=float32)
>>> npm0 + npm1
masked_array(data=[--, --, --, --, --],
             mask=[ True,  True,  True,  True,  True],
       fill_value=1e+20,
            dtype=float32)

Meanwhile, MaskedTensor does not support addition or binary operators with masks that don't match -- to understand why,
please refer to the :ref:`reduction-semantics` section below.

>>> mt0 = masked_tensor(data, mask)
>>> mt1 = masked_tensor(data, ~mask)
>>> mt0
MaskedTensor(
  [ 0.0000, 1.0000, --, 3.0000, --]
)
>>> mt1
MaskedTensor(
  [ --, --, 2.0000, --, 4.0000]
)
>>> mt0 + mt1
ValueError: Input masks must match. If you need support for this, please open an issue on Github.

However, if this behavior is desired, MaskedTensor does support these semantics by giving access to the data and masks
and conveniently converting a MaskedTensor to a Tensor with masked values filled in using :func:`to_tensor`.

>>> t0 = mt0.to_tensor(0)
>>> t1 = mt1.to_tensor(0)
>>> mt2 = masked_tensor(t0 + t1, mt0.get_mask() & mt1.get_mask())
>>> t0
tensor([0., 1., 0., 3., 0.])
>>> t1
tensor([0., 0., 2., 0., 4.])
>>> mt2
MaskedTensor(
  [ --, --, --, --, --]
)

.. _reduction-semantics:

Reduction semantics
-------------------

The basis for reduction semantics `has been documented and discussed at length <https://github.com/pytorch/rfcs/pull/27>`__,
but again, by way of example:

>>> data = torch.arange(12, dtype=torch.float).reshape(3, 4)
>>> mask = torch.randint(2, (3, 4), dtype=torch.bool)
>>> mt = masked_tensor(data, mask)
>>> mt
MaskedTensor(
  [
    [ --, 1.0000, --, --],
    [ --, 5.0000, 6.0000, 7.0000],
    [ 8.0000, 9.0000, --, 11.0000]
  ]
)

>>> torch.sum(mt, 1)
MaskedTensor(
  [ 1.0000, 18.0000, 28.0000]
)
>>> torch.mean(mt, 1)
MaskedTensor(
  [ 1.0000, 6.0000, 9.3333]
)
>>> torch.prod(mt, 1)
MaskedTensor(
  [ 1.0000, 210.0000, 792.0000]
)
>>> torch.amin(mt, 1)
MaskedTensor(
  [ 1.0000, 5.0000, 8.0000]
)
>>> torch.amax(mt, 1)
MaskedTensor(
  [ 1.0000, 7.0000, 11.0000]
)

Now we can revisit the question: why do we enforce the invariant that masks must match for binary operators?
In other words, why don't we use the same semantics as ``np.ma.masked_array``? Consider the following example:

>>> data0 = torch.arange(10.).reshape(2, 5)
>>> data1 = torch.arange(10.).reshape(2, 5) + 10
>>> mask0 = torch.tensor([[True, True, False, False, False], [False, False, False, True, True]])
>>> mask1 = torch.tensor([[False, False, False, True, True], [True, True, False, False, False]])

>>> npm0 = np.ma.masked_array(data0.numpy(), (mask0).numpy())
>>> npm1 = np.ma.masked_array(data1.numpy(), (mask1).numpy())
>>> npm0
masked_array(
  data=[[--, --, 2.0, 3.0, 4.0],
        [5.0, 6.0, 7.0, --, --]],
  mask=[[ True,  True, False, False, False],
        [False, False, False,  True,  True]],
  fill_value=1e+20,
  dtype=float32)
>>> npm1
masked_array(
  data=[[10.0, 11.0, 12.0, --, --],
        [--, --, 17.0, 18.0, 19.0]],
  mask=[[False, False, False,  True,  True],
        [ True,  True, False, False, False]],
  fill_value=1e+20,
  dtype=float32)
>>> (npm0 + npm1).sum(0)
masked_array(data=[--, --, 38.0, --, --],
             mask=[ True,  True, False,  True,  True],
       fill_value=1e+20,
            dtype=float32)
>>> npm0.sum(0) + npm1.sum(0)
masked_array(data=[15.0, 17.0, 38.0, 21.0, 23.0],
             mask=[False, False, False, False, False],
       fill_value=1e+20,
            dtype=float32)

Sum and addition should clearly be associative, but with NumPy's semantics, they are allowed not to be,
which can certainly be confusing for the user. That being said, if the user wishes, there are ways around this
(e.g. filling in the MaskedTensor's undefined elements with 0 values using :func:`to_tensor` as shown in a previous
example), but the user must now be more explicit with their intentions.
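
For instance, a hedged sketch of that explicit workaround applied to the example above (the masks are inverted with
``~`` because MaskedTensor uses ``True`` for "valid", the opposite of the NumPy convention; the displayed output is illustrative):

>>> mt0 = masked_tensor(data0, ~mask0)
>>> mt1 = masked_tensor(data1, ~mask1)
>>> t0 = mt0.to_tensor(0)   # fill unspecified entries with 0
>>> t1 = mt1.to_tensor(0)
>>> (t0 + t1).sum(0)        # matches npm0.sum(0) + npm1.sum(0) above
tensor([15., 17., 38., 21., 23.])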

index.rst

Lines changed: 9 additions & 0 deletions

@@ -804,6 +804,15 @@ Additional Resources
     beginner/translation_transformer


+.. toctree::
+   :maxdepth: 2
+   :includehidden:
+   :hidden:
+   :caption: MaskedTensor
+
+   beginner/maskedtensor_overview
+
+
 .. toctree::
    :maxdepth: 2
    :includehidden: