_posts/2020-10-26-1.7-released.md (30 additions & 30 deletions)
@@ -8,11 +8,11 @@ Today, we’re announcing the availability of PyTorch 1.7, along with updated do
A few of the highlights include:
-* 1- CUDA 11 is now officially supported with binaries available at [PyTorch.org](http://pytorch.org/)
-* 2- Updates and additions to profiling and performance for RPC, TorchScript, Stack traces and Benchmark utilities
-* 3- (Beta) Support for NumPy compatible Fast Fourier transforms (FFT) via torch.fft
-* 4- (Prototype) Support for Nvidia A100 generation GPUs and native TF32 format
-* 5- (Prototype) Distributed training on Windows now supported
+1. CUDA 11 is now officially supported with binaries available at [PyTorch.org](http://pytorch.org/)
+2. Updates and additions to profiling and performance for RPC, TorchScript, Stack traces and Benchmark utilities
+3. (Beta) Support for NumPy compatible Fast Fourier transforms (FFT) via torch.fft
+4. (Prototype) Support for Nvidia A100 generation GPUs and native TF32 format
+5. (Prototype) Distributed training on Windows now supported
To reiterate, starting with [PyTorch 1.6](https://pytorch.org/blog/pytorch-feature-classification-changes/), features are now classified as stable, beta and prototype. You can see the detailed announcement [here](https://pytorch.org/blog/pytorch-feature-classification-changes/). Note that the prototype features listed in this blog are available as part of this release.
@@ -64,8 +64,8 @@ Note that this is necessary, **but not sufficient**, for determinism **within a
See the documentation for ```torch.set_deterministic(bool)``` for the list of affected operations.
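For example, a minimal sketch of opting in (CPU-only; the tensor shapes and ops are illustrative, not from the release notes):

```python
import torch

# Opt in to deterministic algorithms (PyTorch 1.7 API; renamed in later releases).
# Ops that only have non-deterministic implementations will raise a RuntimeError
# when called while this flag is set.
torch.set_deterministic(True)

x = torch.randn(8, 8)
y = x @ x.t()  # deterministic CPU matmul, runs as usual
```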
Users can now see not only the operator name/inputs in the profiler output table but also where the operator is called in the code. The workflow requires very little change to take advantage of this capability: users use the [autograd profiler](https://pytorch.org/docs/stable/autograd.html#profiler) as before, but with the optional new parameters ```with_stack``` and ```group_by_stack_n```.
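As a rough illustration of the new parameters (the model, input sizes, and sort key below are made up for the example):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 10))
inputs = torch.randn(32, 64)

# with_stack=True records where each operator was called from; the results can
# then be aggregated by the top-5 stack frames.
with torch.autograd.profiler.profile(with_stack=True) as prof:
    model(inputs)

print(prof.key_averages(group_by_stack_n=5).table(sort_by="self_cpu_time_total"))
```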
@@ -138,21 +138,21 @@ Torchelastic offers a strict superset of the current ```torch.distributed.launch
By bundling ```torchelastic``` in the same docker image as PyTorch, users can start experimenting with torchelastic right away without having to install ```torchelastic``` separately. In addition to convenience, this work is a nice-to-have when adding support for elastic parameters in Kubeflow’s existing distributed PyTorch operators.
-* Usage examples and how to get started | [Link](https://pytorch.org/elastic/0.2.0/examples.html)
+* Usage examples and how to get started ([Link](https://pytorch.org/elastic/0.2.0/examples.html))
## [Beta] Support for uneven dataset inputs in DDP
PyTorch 1.7 introduces a new context manager to be used in conjunction with models trained using ```torch.nn.parallel.DistributedDataParallel``` to enable training with uneven dataset sizes across different processes. This feature gives greater flexibility when using DDP and prevents the user from having to manually ensure dataset sizes are the same across processes. With this context manager, DDP will handle uneven dataset sizes automatically, which can prevent errors or hangs at the end of training.
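A minimal sketch of the context manager, assuming a two-process gloo group launched from a single script; the model, batch counts, and port are made up, with the batch counts deliberately uneven:

```python
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def run(rank, world_size):
    dist.init_process_group("gloo", rank=rank, world_size=world_size)
    model = DDP(nn.Linear(10, 10))
    opt = torch.optim.SGD(model.parameters(), lr=0.01)

    # Rank 0 sees fewer batches than rank 1; without join(), the rank that
    # finishes first would leave the other blocked on a pending allreduce.
    num_batches = 5 if rank == 0 else 8
    with model.join():
        for _ in range(num_batches):
            opt.zero_grad()
            model(torch.randn(20, 10)).sum().backward()
            opt.step()
    dist.destroy_process_group()

if __name__ == "__main__":
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    torch.multiprocessing.spawn(run, args=(2,), nprocs=2)
```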
In the past, NCCL training runs would hang indefinitely due to stuck collectives, leading to a very unpleasant experience for users. This feature will abort stuck collectives and throw an exception/crash the process if a potential hang is detected. When used with something like torchelastic (which can recover the training process from the last checkpoint), users gain much greater reliability for distributed training. The feature is completely opt-in and sits behind an environment variable that must be set explicitly (otherwise users will see the same behavior as before).
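The opt-in is just an environment variable set before the NCCL process group is created; the variable name below is the one introduced for this feature in the 1.7 release, so treat it as an assumption if you are on a different version:

```python
import os

# Must be set before torch.distributed.init_process_group(backend="nccl");
# "1" opts in to aborting stuck collectives and surfacing an exception.
os.environ["NCCL_ASYNC_ERROR_HANDLING"] = "1"
```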
## [Beta] Distributed optimizer with TorchScript support
@@ -199,9 +199,9 @@ In PyTorch 1.7, we are enabling the TorchScript support in distributed optimizer
Currently, the only optimizer that supports automatic conversion with TorchScript is ```Adagrad```; all other optimizers will still work as before, without TorchScript support. We are working on expanding the coverage to all PyTorch optimizers.
-* Design doc | Link **Missing Link**
-* Documentation | Link **Missing Link**
-* Usage examples | Link **Missing Link**
+* Design doc (Link)**Missing Link**
+* Documentation (Link)**Missing Link**
+* Usage examples (Link)**Missing Link**
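Pending the official links above, here is a rough, self-contained sketch of the distributed optimizer with ```Adagrad``` over RPC; the worker names, port, and tensor sizes are illustrative, and the RRefs wrap local tensors purely to keep the example on one machine. The TorchScript conversion happens under the hood, with no visible API change:

```python
import os
import torch
import torch.distributed.autograd as dist_autograd
import torch.distributed.rpc as rpc
import torch.multiprocessing as mp
from torch import optim
from torch.distributed.optim import DistributedOptimizer

def run(rank, world_size):
    rpc.init_rpc(f"worker{rank}", rank=rank, world_size=world_size)
    if rank == 0:
        # Parameters held behind RRefs; with Adagrad, the optimizer is
        # automatically converted to its TorchScript functional variant in 1.7.
        params = [rpc.RRef(torch.randn(4, requires_grad=True)) for _ in range(2)]
        dist_optim = DistributedOptimizer(optim.Adagrad, params, lr=0.05)
        with dist_autograd.context() as context_id:
            loss = (params[0].to_here() * params[1].to_here()).sum()
            dist_autograd.backward(context_id, [loss])
            dist_optim.step(context_id)
    rpc.shutdown()

if __name__ == "__main__":
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29501")
    mp.spawn(run, args=(2,), nprocs=2)
```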
## [Beta] Enhancements to RPC-based Profiling
@@ -213,15 +213,15 @@ Support for using the PyTorch profiler in conjunction with the RPC framework was
Users are now able to use familiar profiling tools such as ```with torch.autograd.profiler.profile()``` and ```with torch.autograd.profiler.record_function```, and these work transparently with the RPC framework with full feature support, including profiling of asynchronous functions and TorchScript functions.
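A rough sketch of profiling an RPC from the caller side (two local CPU workers; the worker names, port, and profiled op are made up for the example):

```python
import os
import torch
import torch.distributed.rpc as rpc
import torch.multiprocessing as mp

def run(rank, world_size):
    rpc.init_rpc(f"worker{rank}", rank=rank, world_size=world_size)
    if rank == 0:
        # The profiler wraps the RPC just like a local op, so the call shows up
        # in the caller's profile alongside everything else.
        with torch.autograd.profiler.profile() as prof:
            with torch.autograd.profiler.record_function("profiled_rpc"):
                fut = rpc.rpc_async("worker1", torch.add,
                                    args=(torch.ones(2), torch.ones(2)))
                fut.wait()
        print(prof.key_averages().table(sort_by="cpu_time_total"))
    rpc.shutdown()

if __name__ == "__main__":
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29502")
    mp.spawn(run, args=(2,), nprocs=2)
```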
As of PyTorch 1.6, DDP keeps an extra copy of gradient tensors in its communication buckets, which incurs additional memory overhead equal to the size of the gradients. In PyTorch 1.7, we added a ```gradient_as_bucket_view``` flag to the DDP constructor API. When this flag is set to ```True```, DDP overrides ```param.grad``` with views pointing into the communication buckets. This not only eliminates an extra in-memory copy of the gradients, but also avoids the additional read/write operations needed to synchronize the communication buckets and the ```param.grad``` values.
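The flag is just an extra constructor argument; a minimal sketch, assuming the process group has already been initialized as in the earlier examples and using a toy model:

```python
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

# With the flag enabled, param.grad becomes a view into DDP's communication
# buckets, saving one gradient-sized copy and the bucket/grad synchronization.
ddp_model = DDP(nn.Linear(10, 10), gradient_as_bucket_view=True)
```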