Add a tutorial for Tensorboard profiler #1380

Merged
merged 6 commits into from
Apr 20, 2021

gaoteng-git
Contributor

@gaoteng-git gaoteng-git commented Mar 1, 2021

@brianjo @chauhang @ilia-cher @gdankel This tutorial introduces the Kineto-based profiler and its visualization in TensorBoard. Could you please help merge it?


netlify bot commented Mar 1, 2021

Deploy preview for pytorch-tutorials-preview ready!

Built with commit 939474e

https://deploy-preview-1380--pytorch-tutorials-preview.netlify.app

@gaoteng-git gaoteng-git force-pushed the tensorboard_profiler branch from 5efecaf to 86fcc26 Compare March 2, 2021 02:20
@gaoteng-git gaoteng-git marked this pull request as ready for review March 2, 2021 09:15
@gaoteng-git gaoteng-git force-pushed the tensorboard_profiler branch from 0dac772 to 4868a5e Compare March 2, 2021 14:12
@gaoteng-git gaoteng-git force-pushed the tensorboard_profiler branch from 137c71c to 4919e6d Compare March 10, 2021 12:27
@gaoteng-git gaoteng-git force-pushed the tensorboard_profiler branch 3 times, most recently from 4d8f9ef to c2d8223 Compare March 23, 2021 04:03
@gaoteng-git gaoteng-git force-pushed the tensorboard_profiler branch from c2d8223 to 32593d0 Compare April 8, 2021 13:19

lenisha commented Apr 12, 2021

@gaoteng-git would the recipe for deprecated profiler be updated too?
https://github.com/pytorch/tutorials/blob/master/recipes_source/recipes/profiler_recipe.py

@gaoteng-git
Contributor Author

@gaoteng-git would the recipe for deprecated profiler be updated too?
https://github.com/pytorch/tutorials/blob/master/recipes_source/recipes/profiler_recipe.py

I think it's better to add a deprecation comment to that recipe, with a link to this new tutorial. The link will only be available after this new tutorial is merged.

@ilia-cher ilia-cher self-requested a review April 16, 2021 17:58
Comment on lines 4 to 5
This recipe demonstrates how to use PyTorch Profiler
to detect performance bottlenecks of the model.
Contributor

Let's say that this tutorial demonstrates how to use the TensorBoard plugin with PyTorch Profiler to detect performance bottlenecks of the model.

This recipe demonstrates how to use PyTorch Profiler
to detect performance bottlenecks of the model.

.. note::
Contributor

I don't think we need this note. Since your tutorial already uses the new API, there is no need to highlight the older vs. newer API here; users will simply use the new API (as shown in this tutorial), and that's fine.

The profiler can visualize this information
in TensorBoard Plugin and provide analysis of the performance bottlenecks.

In this recipe, we will use a simple Resnet model to demonstrate how to
Contributor

@ilia-cher ilia-cher Apr 16, 2021

In this tutorial ... how to use TensorBoard plugin to analyze model performance

# During ``warmup`` steps, the profiler starts profiling as warmup but does not record any events.
# This is for reducing the profiling overhead.
# The overhead at the beginning of profiling is high and easy to bring skew to the profiling result.
# During ``active`` steps, the profiler works and record events.
Contributor

records events

# During ``active`` steps, the profiler works and record events.
# - ``on_trace_ready`` - callable that is called at the end of each cycle;
# In this example we use ``torch.profiler.tensorboard_trace_handler`` to generate result files for TensorBoard.
# After profiling, result files will be saved into the ``./log/resnet18`` directory.
Contributor

@ilia-cher ilia-cher Apr 16, 2021

into the directory specified by the logdir parameter

#
# - ``schedule`` - callable that takes step (int) as a single parameter
# and returns the profiler action to perform at each step;
# In this example with wait=1, warmup=1, active=5,
Contributor

Let's also put code quotes around wait=1, warmup=1, active=5.
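
For readers following this thread, here is a minimal pure-Python sketch of how the `wait=1, warmup=1, active=5` schedule discussed above maps step numbers to phases. `phase_for_step` is a hypothetical helper written only for illustration; it is not part of `torch.profiler`, and it assumes that cycles repeat back to back with no skipped initial steps.

```python
def phase_for_step(step, wait, warmup, active):
    """Return the phase name for a 0-based step index (illustrative only).

    Hypothetical helper, not a torch.profiler API. Assumes one cycle is
    wait + warmup + active steps and that cycles repeat back to back.
    """
    cycle = wait + warmup + active
    pos = step % cycle  # position of this step inside its cycle
    if pos < wait:
        return "wait"      # profiler idle
    if pos < wait + warmup:
        return "warmup"    # tracing started, events discarded
    return "active"        # events recorded


# One full cycle of the schedule from the tutorial: 1 + 1 + 5 = 7 steps.
phases = [phase_for_step(s, wait=1, warmup=1, active=5) for s in range(7)]
print(phases)
# ['wait', 'warmup', 'active', 'active', 'active', 'active', 'active']
```

Under these assumptions one cycle spans exactly seven steps, which matches the `if step >= 7: break` guard in the training loop reviewed below.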

if step >= 7:
    break
train(batch_data)
prof.step()
Contributor

Let's add a comment here saying that we call prof.step() after each step to notify the profiler.

# .. image:: ../../_static/img/profiler_trace_view2.png
#    :scale: 25 %
#
# From the above view, we can find the event of ``enumerate(DataLoader)`` is shortened,
Contributor

@ilia-cher ilia-cher Apr 16, 2021

From the view above, we can see that the runtime of enumerate(DataLoader) is reduced

@ilia-cher
Contributor

Looks really good, thank you!
Left a few minor comments.

@gaoteng-git gaoteng-git force-pushed the tensorboard_profiler branch from 80f8332 to 939474e Compare April 20, 2021 09:00
@brianjo brianjo merged commit 24946b2 into pytorch:master Apr 20, 2021
rodrigo-techera pushed a commit to Experience-Monks/tutorials that referenced this pull request Nov 29, 2021
* add tensorboard_profiler tutorial

* Update intermediate_source/tensorboard_profiler_tutorial.py

Co-authored-by: maxluk <maxluk@microsoft.com>

* update title

* remove testing on windows because kineto doesn't support windows now

* rename

* Update with the help of Ilia

Co-authored-by: Teng Gao <tegao@microsoft.com>
Co-authored-by: maxluk <maxluk@microsoft.com>