
Commit 3d04f5f

Merge branch 'main' into improve-quantization-recipe
2 parents: 592d369 + 630c2e2

File tree

4 files changed: +7 -6 lines changed


advanced_source/super_resolution_with_onnxruntime.py

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@
 .. note::
     As of PyTorch 2.1, there are two versions of ONNX Exporter.
 
-    * ``torch.onnx.dynamo_export`is the newest (still in beta) exporter based on the TorchDynamo technology released with PyTorch 2.0.
+    * ``torch.onnx.dynamo_export`` is the newest (still in beta) exporter based on the TorchDynamo technology released with PyTorch 2.0.
     * ``torch.onnx.export`` is based on TorchScript backend and has been available since PyTorch 1.2.0.
 
 In this tutorial, we describe how to convert a model defined
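For context, a minimal sketch (not part of this diff) of how the two exporters named in the note are invoked; the stand-in model, input shape, and file names are illustrative assumptions only.

    import torch

    # Stand-in model and input, purely for illustration
    model = torch.nn.Linear(4, 2).eval()
    dummy_input = torch.randn(1, 4)

    # TorchScript-based exporter, available since PyTorch 1.2.0
    torch.onnx.export(model, dummy_input, "model_torchscript.onnx")

    # TorchDynamo-based exporter (beta as of PyTorch 2.1); returns an object
    # that can then be saved to disk
    onnx_program = torch.onnx.dynamo_export(model, dummy_input)
    onnx_program.save("model_dynamo.onnx")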

advanced_source/usb_semisup_learn.py

Lines changed: 1 addition & 1 deletion
@@ -81,7 +81,7 @@
 # algorithm on dataset
 #
 # Note that a CUDA-enabled backend is required for training with the ``semilearn`` package.
-# See `Enabling CUDA in Google Colab <https://pytorch.org/tutorials/beginner/colab#using-cuda>`__ for instructions
+# See `Enabling CUDA in Google Colab <https://pytorch.org/tutorials/beginner/colab#enabling-cuda>`__ for instructions
 # on enabling CUDA in Google Colab.
 #
 import semilearn
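Since the note states that a CUDA-enabled backend is required, a small illustrative guard (an assumption, not part of this diff) can fail fast with a clear message when no GPU is visible:

    import torch

    # The semilearn tutorial expects a CUDA backend; stop early with a helpful
    # hint instead of failing later inside training.
    if not torch.cuda.is_available():
        raise RuntimeError(
            "This tutorial requires a CUDA-enabled backend; enable a GPU "
            "runtime (e.g. in Google Colab) before running the training step."
        )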

beginner_source/ddp_series_multigpu.rst

Lines changed: 4 additions & 3 deletions
@@ -78,15 +78,15 @@ Imports
 Constructing the process group
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
+- First, before initializing the group process, call `set_device <https://pytorch.org/docs/stable/generated/torch.cuda.set_device.html?highlight=set_device#torch.cuda.set_device>`__,
+  which sets the default GPU for each process. This is important to prevent hangs or excessive memory utilization on `GPU:0`
 - The process group can be initialized by TCP (default) or from a
   shared file-system. Read more on `process group
   initialization <https://pytorch.org/docs/stable/distributed.html#tcp-initialization>`__
 - `init_process_group <https://pytorch.org/docs/stable/distributed.html?highlight=init_process_group#torch.distributed.init_process_group>`__
   initializes the distributed process group.
 - Read more about `choosing a DDP
   backend <https://pytorch.org/docs/stable/distributed.html#which-backend-to-use>`__
-- `set_device <https://pytorch.org/docs/stable/generated/torch.cuda.set_device.html?highlight=set_device#torch.cuda.set_device>`__
-  sets the default GPU for each process. This is important to prevent hangs or excessive memory utilization on `GPU:0`
 
 .. code-block:: diff
 
@@ -98,8 +98,9 @@ Constructing the process group
     +   """
     +   os.environ["MASTER_ADDR"] = "localhost"
     +   os.environ["MASTER_PORT"] = "12355"
-    +   init_process_group(backend="nccl", rank=rank, world_size=world_size)
     +   torch.cuda.set_device(rank)
+    +   init_process_group(backend="nccl", rank=rank, world_size=world_size)
+
 
 
 Constructing the DDP model
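Put together, the setup function now reads roughly as below (a sketch assembled from the tutorial's snippet; the function signature and docstring are assumed context, not lines changed in this commit). The point of the reordering is that `torch.cuda.set_device(rank)` runs before `init_process_group`, so each process binds to its own GPU instead of defaulting to `GPU:0`:

    import os
    import torch
    from torch.distributed import init_process_group

    def ddp_setup(rank: int, world_size: int):
        """
        Args:
            rank: Unique identifier of each process
            world_size: Total number of processes
        """
        os.environ["MASTER_ADDR"] = "localhost"
        os.environ["MASTER_PORT"] = "12355"
        torch.cuda.set_device(rank)   # bind this process to its GPU first
        init_process_group(backend="nccl", rank=rank, world_size=world_size)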

beginner_source/knowledge_distillation_tutorial.py

Lines changed: 1 addition & 1 deletion
@@ -324,7 +324,7 @@ def train_knowledge_distillation(teacher, student, train_loader, epochs, learnin
         soft_prob = nn.functional.log_softmax(student_logits / T, dim=-1)
 
         # Calculate the soft targets loss. Scaled by T**2 as suggested by the authors of the paper "Distilling the knowledge in a neural network"
-        soft_targets_loss = -torch.sum(soft_targets * soft_prob) / soft_prob.size()[0] * (T**2)
+        soft_targets_loss = torch.sum(soft_targets * (soft_targets.log() - soft_prob)) / soft_prob.size()[0] * (T**2)
 
         # Calculate the true label loss
         label_loss = ce_loss(student_logits, labels)
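The new expression computes a true KL divergence between the teacher's soft targets and the student's log-probabilities, rather than only the cross-entropy term of the old line. As a sanity check (an illustrative sketch with made-up logits, not part of this commit), it should match PyTorch's built-in F.kl_div with batch-mean reduction:

    import torch
    import torch.nn.functional as F

    T = 2.0
    teacher_logits = torch.randn(8, 10)   # made-up logits for illustration
    student_logits = torch.randn(8, 10)

    soft_targets = F.softmax(teacher_logits / T, dim=-1)
    soft_prob = F.log_softmax(student_logits / T, dim=-1)

    # Expression used in the tutorial after this change
    manual = torch.sum(soft_targets * (soft_targets.log() - soft_prob)) / soft_prob.size()[0] * (T**2)

    # Equivalent built-in form
    builtin = F.kl_div(soft_prob, soft_targets, reduction="batchmean") * (T**2)

    print(torch.allclose(manual, builtin))  # expected: True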
