Commit 2d16581

gaoteng-gitmaxluk committed
Update intermediate_source/tensorboard_profiler_tutorial.py
Co-authored-by: maxluk <maxluk@microsoft.com>
1 parent cd8c987 commit 2d16581

File tree

1 file changed: +26 −26 lines

intermediate_source/tensorboard_profiler_tutorial.py

Lines changed: 26 additions & 26 deletions
@@ -1,19 +1,19 @@
 """
 PyTorch TensorBoard Profiler
 ====================================
-This recipe explains how to use PyTorch TensorBoard Profiler
-and measure the performance bottleneck of the model.
+This recipe demonstrates how to use PyTorch Profiler
+to detect performance bottlenecks of the model.

 .. note::
     PyTorch 1.8 introduces the new API that will replace the older profiler API
     in the future releases. Check the new API at `this page <https://pytorch.org/docs/master/profiler.html>`__.

 Introduction
 ------------
-PyTorch 1.8 includes an updated profiler API that could help user
-record both the operators running on CPU side and the CUDA kernels running on GPU side.
-Given the profiling information,
-we can use this TensorBoard Plugin to visualize it and analyze the performance bottleneck.
+PyTorch 1.8 includes an updated profiler API capable of
+recording the CPU side operations as well as the CUDA kernel launches on the GPU side.
+The profiler can visualize this information
+in TensorBoard Plugin and provide analysis of the performance bottlenecks.

 In this recipe, we will use a simple Resnet model to demonstrate how to
 use profiler to analyze model performance.
@@ -37,13 +37,13 @@
 # 1. Prepare the data and model
 # 2. Use profiler to record execution events
 # 3. Run the profiler
-# 4. Use TensorBoard to view and analyze performance
+# 4. Use TensorBoard to view results and analyze performance
 # 5. Improve performance with the help of profiler
 #
 # 1. Prepare the data and model
 # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 #
-# Firstly, let’s import all necessary libraries:
+# First, import all necessary libraries:
 #

 import torch
@@ -57,18 +57,18 @@

 ######################################################################
 # Then prepare the input data. For this tutorial, we use the CIFAR10 dataset.
-# We transform it to desired format and use DataLoader to load each batch.
+# Transform it to the desired format and use DataLoader to load each batch.

 transform = T.Compose(
     [T.Resize(224),
      T.ToTensor(),
      T.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
 train_set = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
-train_loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True) # num_workers=4
+train_loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

 ######################################################################
-# Let’s create an instance of a Resnet model, an instance of loss, and an instance of optimizer.
-# To run on GPU, we put model and loss to GPU device.
+# Next, create Resnet model, loss function, and optimizer objects.
+# To run on GPU, move model and loss to GPU device.

 device = torch.device("cuda:0")
 model = torchvision.models.resnet18(pretrained=True).cuda(device)
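As an aside, the ``T.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))`` call in this hunk applies ``(x - mean) / std`` per channel, which maps the ``[0, 1]`` output of ``ToTensor`` onto ``[-1, 1]``. A minimal framework-free sketch of that arithmetic (illustrative only; the function name is mine, not part of the tutorial):

```python
def normalize(pixel, mean=0.5, std=0.5):
    # Same per-channel arithmetic as T.Normalize: (x - mean) / std
    return (pixel - mean) / std

# ToTensor scales pixel values into [0, 1]; normalize then recenters them.
print(normalize(0.0), normalize(0.5), normalize(1.0))  # -1.0 0.0 1.0
```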
@@ -78,7 +78,7 @@


 ######################################################################
-# We define the training step for each batch of input data.
+# Define the training step for each batch of input data.

 def train(data):
     inputs, labels = data[0].to(device=device), data[1].to(device=device)
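The ``train`` function above runs the usual forward / loss / backward / optimizer-step cycle per batch. As an illustration of that cycle only (a toy one-parameter model with a hand-computed gradient, not the tutorial's code):

```python
def train_step(w, x, target, lr=0.1):
    """One forward/backward/step cycle for y = w * x with squared-error loss."""
    y = w * x                     # forward pass
    loss = (y - target) ** 2      # loss computation
    grad = 2 * (y - target) * x   # backward pass: d(loss)/dw
    return w - lr * grad, loss    # optimizer step (plain gradient descent)

w, losses = 0.0, []
for _ in range(20):
    w, loss = train_step(w, x=1.0, target=2.0)
    losses.append(loss)
print(round(w, 3))             # w converges toward the target slope 2.0
print(losses[0] > losses[-1])  # True: the loss shrinks across steps
```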
@@ -93,7 +93,7 @@ def train(data):
 # 2. Use profiler to record execution events
 # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 #
-# The profiler is enabled through the context manager and accepts a number of parameters,
+# The profiler is enabled through the context manager and accepts several parameters,
 # some of the most useful are:
 #
 # - ``schedule`` - callable that takes step (int) as a single parameter
@@ -111,8 +111,8 @@ def train(data):
 #   During ``active`` steps, the profiler works and record events.
 # - ``on_trace_ready`` - callable that is called at the end of each cycle;
 #   In this example we use ``torch.profiler.tensorboard_trace_handler`` to generate result files for TensorBoard.
-#   After profiling, result files can be generated in the ``./log/resnet18`` directory,
-#   which could be specified to open and analyzed in TensorBoard.
+#   After profiling, result files will be saved into the ``./log/resnet18`` directory.
+#   Specify this directory as a ``logdir`` parameter to analyze profile in TensorBoard.
 # - ``record_shapes`` - whether to record shapes of the operator inputs.

 with torch.profiler.profile(
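The wait/warmup/active cycle these hunks describe can be sketched as a plain function. This is illustrative only; the real API is ``torch.profiler.schedule``, and the function and phase names here are my own:

```python
def phase(step, wait=1, warmup=1, active=3):
    """Which phase of a wait/warmup/active profiler cycle a step falls in."""
    pos = step % (wait + warmup + active)  # cycles repeat back-to-back
    if pos < wait:
        return "wait"      # profiler is idle
    if pos < wait + warmup:
        return "warmup"    # tracing is on, but results are discarded
    return "active"        # events are recorded; the trace handler fires at cycle end

print([phase(s) for s in range(6)])
# ['wait', 'warmup', 'active', 'active', 'active', 'wait']
```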
@@ -135,18 +135,18 @@ def train(data):


 ######################################################################
-# 4. Use TensorBoard to view and analyze performance
+# 4. Use TensorBoard to view results and analyze performance
 # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 #
-# This requires the latest versions of PyTorch TensorBoard Profiler.
+# Install PyTorch Profiler TensorBoard Plugin.
 #
 # ::
 #
 #     pip install torch_tb_profiler
 #

 ######################################################################
-# Launch the TensorBoard Profiler.
+# Launch the TensorBoard.
 #
 # ::
 #
@@ -158,21 +158,21 @@ def train(data):
 #
 # ::
 #
-#     http://localhost:6006/#torch_profiler
+#     http://localhost:6006/#pytorch_profiler
 #

 ######################################################################
-# The profiler’s front page is as below.
+# You should see Profiler plugin page as shown below.
 #
 # .. image:: ../../_static/img/profiler_overview1.png
 #    :scale: 25 %
 #
-# This overview shows a high-level summary of performance.
+# The overview shows a high-level summary of model performance.
 #
-# The "Step Time Breakdown" break the time spent on each step into multiple categories.
-# In this example, you can see the ``DataLoader`` costs a lot of time.
+# The "Step Time Breakdown" shows distribution of time spent in each step over different categories of execution.
+# In this example, you can see the ``DataLoader`` overhead is significant.
 #
-# The bottom "Performance Recommendation" leverages the profiling result
+# The bottom "Performance Recommendation" uses the profiling data
 # to automatically highlight likely bottlenecks,
 # and gives you actionable optimization suggestions.
 #
@@ -187,7 +187,7 @@ def train(data):
 # The GPU kernel view shows all kernels’ time spent on GPU.
 #
 # The trace view shows timeline of profiled operators and GPU kernels.
-# You can select it to see detail as below.
+# You can select it to see details as below.
 #
 # .. image:: ../../_static/img/profiler_trace_view1.png
 #    :scale: 25 %
